News Sentiment
Two types of news datasets have been developed: one is ticker-matched and the other is theme-matched.
Data is updated quarterly, as new data arrives after the US market close (Eastern Time).
Tutorials are the best documentation: News Sentiment Analysis Tutorial
This dataset provides comprehensive news sentiment analysis, offering ticker-matched and theme-matched data on various aspects of news coverage.
It includes metrics on sentiment, tone, polarity, and article count, enabling investors and analysts to gauge public perception and potential market impacts of news.
As you have done for sentiment above, you can do the same for news tone, polarity, activeness, etc.
df_sentiment_score = sov.data("news/sentiment_score")
Measures emotional tone of news articles. Positive scores: favorable news; Negative scores: unfavorable news.
df_polarity_score = sov.data("news/polarity_score")
Gauges opinion intensity in news. Higher scores: stronger opinions; Lower scores: more neutral reporting.
df_topic = sov.data("news/topic_probability")
Indicates topic prevalence in news. Higher values: more frequently discussed topics.
All three datasets report statistical measures (mean, median, etc.) across financial and economic topics over time.
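To make the idea of per-topic statistical measures over time concrete, here is a minimal sketch that groups scores by date and topic and computes the mean and median. It uses only the Python standard library; the rows, topic names, and the `summarize` helper are hypothetical and do not reflect the actual schema returned by `sov.data`.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical rows: (date, topic, sentiment score).
# The values are made up purely for illustration.
rows = [
    ("2024-01-02", "inflation", 0.10),
    ("2024-01-02", "inflation", -0.30),
    ("2024-01-02", "inflation", 0.05),
    ("2024-01-03", "inflation", 0.20),
    ("2024-01-03", "earnings", 0.40),
    ("2024-01-03", "earnings", 0.60),
]


def summarize(rows):
    """Group scores by (date, topic) and return mean, median, and count per group."""
    groups = defaultdict(list)
    for date, topic, score in rows:
        groups[(date, topic)].append(score)
    return {
        key: {"mean": mean(scores), "median": median(scores), "count": len(scores)}
        for key, scores in sorted(groups.items())
    }


summary = summarize(rows)
for (date, topic), stats in summary.items():
    print(date, topic, stats)
```

The same grouping could be expressed with a DataFrame `groupby` if the data is already loaded in pandas; the plain-Python version is shown only to keep the sketch dependency-free.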
This dataset provides a comprehensive analysis of various entities (such as companies and individuals) based on their media coverage and associated articles. It's designed to assist investors in understanding the market sentiment, media focus, and the overall perception of entities in which they might be interested. The data is extracted and processed from a wide range of articles, ensuring a broad and in-depth view of each entity.
This dataset is an invaluable resource for investors seeking to gauge public perception, media sentiment, and the prominence of entities in the news. It can be used for:
- Sentiment analysis to understand the market mood.
- Identifying trends in media coverage related to specific entities.
- Assessing the impact of news on stock performance.
- Conducting peer comparison based on media presence and sentiment.
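As an illustration of the peer-comparison use case, the sketch below compares a small group of entities by media presence (share of total articles) and by sentiment. The `EntityCoverage` class, the tickers, and all numbers are hypothetical; only the field names `article_count` and `sentiment` are taken from the feature list in this document.

```python
from dataclasses import dataclass


@dataclass
class EntityCoverage:
    ticker: str          # hypothetical identifier, not a real ticker
    article_count: int   # total articles associated with the entity
    sentiment: float     # overall sentiment score


# Made-up peer group for illustration only.
peers = [
    EntityCoverage("AAA", 1200, 0.05),
    EntityCoverage("BBB", 300, 0.20),
    EntityCoverage("CCC", 900, -0.10),
]


def share_of_voice(peers):
    """Return each entity's fraction of the peer group's total article count."""
    total = sum(p.article_count for p in peers)
    if total == 0:
        return {p.ticker: 0.0 for p in peers}
    return {p.ticker: p.article_count / total for p in peers}


def rank_by_sentiment(peers):
    """Return tickers ordered from most positive to most negative sentiment."""
    return [p.ticker for p in sorted(peers, key=lambda p: p.sentiment, reverse=True)]


shares = share_of_voice(peers)
ranking = rank_by_sentiment(peers)
print(shares)
print(ranking)
```

Keeping media presence and sentiment as separate views is a deliberate choice here: an entity with few articles can rank high on sentiment while contributing little to overall coverage.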
Input Datasets: News Scrapers, Public Event Data
Models Used: Fuzzy Matching
Model Outputs: Sentiment Scores

Feature Name | Description | Type | Example |
---|---|---|---|
match_quality | Quality score of the match between the article and the entity, indicating the relevance and accuracy of the match. | float | 99.75 |
within_article | Number of mentions of the entity within the article, indicating the focus on the entity in the article's content. | int | 2 |
relevance | The average salience of the entity across the articles, indicating the importance or prominence of the entity. | float | 0.022049 |
magnitude | A measure of the intensity or strength of the sentiment expressed in the article. | float | 18.203125 |
sentiment | A score representing the overall sentiment (positive or negative) of the article. | float | 0.054504 |
article_count | The total number of articles associated with the entity, indicating the level of media attention or coverage. | int | 1666 |
associated_people | Count of unique people mentioned in the context of the entity, reflecting its association with various individuals. | int | 143 |
associated_companies | Count of unique companies mentioned in relation to the entity, indicating its business connections. | int | 287 |
tone | The overall tone of the article, derived from a textual analysis of its content. | float | 0.237061 |
positive | The score quantifying the positive sentiments expressed in the article. | float | 2.828125 |
negative | The score quantifying the negative sentiments expressed in the article. | float | 2.591797 |
polarity | The degree of polarity in the sentiment, indicating the extent of opinionated content. | float | 5.421875 |
activeness | A measure of the dynamism in the language used, possibly indicating the urgency of the article. | float | 22.031250 |
pronouns | The count of pronouns used in the article, indicative of the narrative style or subject focus. | float | 0.995117 |
word_count | The total number of words in the article, giving an indication of its length or detail. | int | 1084 |
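One way to work with these features is to keep only the article matches you trust before aggregating. The sketch below models a subset of the features as a typed record and filters on `match_quality` and `within_article`. The `ArticleMatch` class, the `keep_confident` helper, and both thresholds are hypothetical; the 0 to 100 scale for `match_quality` is an assumption based on the example value of 99.75, not something this document states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArticleMatch:
    """A subset of the documented features for one article-entity match."""
    match_quality: float
    within_article: int
    sentiment: float
    tone: float
    polarity: float
    word_count: int


# Built from the example column of the feature table.
example = ArticleMatch(
    match_quality=99.75,
    within_article=2,
    sentiment=0.054504,
    tone=0.237061,
    polarity=5.421875,
    word_count=1084,
)

# A made-up low-quality match, for contrast only.
weak = ArticleMatch(
    match_quality=62.0,
    within_article=1,
    sentiment=-0.2,
    tone=-0.5,
    polarity=3.0,
    word_count=400,
)


def keep_confident(matches, min_quality=95.0, min_mentions=1):
    """Keep matches at or above the (hypothetical) quality and mention thresholds."""
    return [
        m for m in matches
        if m.match_quality >= min_quality and m.within_article >= min_mentions
    ]


confident = keep_confident([example, weak])
print(len(confident), confident[0].match_quality)
```

Suitable thresholds depend on how much noise you can tolerate from fuzzy matching, so treat the defaults above as placeholders to tune against your own data.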