News Sentiment
Two types of news datasets have been developed: one is ticker-matched and the other is theme-matched.
Data is updated quarterly, as new data arrives after the US market close (EST).
Tutorials
are the best documentation: News Sentiment Analysis Tutorial
Input Datasets
News Scrapers, Public Event Data
Models Used
Fuzzy Matching
Model Outputs
Sentiment Scores
Description
This dataset provides comprehensive news sentiment analysis, offering ticker-matched and theme-matched data on various aspects of news coverage.
It includes metrics on sentiment, tone, polarity, and article count, enabling investors and analysts to gauge public perception and potential market impacts of news.
Data Access
News Sentiment Data
The entire dataset is around 1 GB to download.
Filtered Dataset
Ticker Level Full-Files
Themed Sentiment
df_sentiment_score = sov.data("news/sentiment_score")
Measures emotional tone of news articles. Positive scores: favorable news; Negative scores: unfavorable news.
df_polarity_score = sov.data("news/polarity_score")
Gauges opinion intensity in news. Higher scores: stronger opinions; Lower scores: more neutral reporting.
df_topic = sov.data("news/topic_probability")
Indicates topic prevalence in news. Higher values: more frequently discussed topics.
All three datasets report statistical measures (mean, median, etc.) of their score across financial and economic topics over time.
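The three themed calls above differ only in their endpoint string, so they can be fetched in one pass. The sketch below is a minimal illustration, not part of the library: `load_themed_news` and `THEMED_ENDPOINTS` are hypothetical names, and the loader is passed in as an argument so the example runs without the package installed. In practice the loader would be `sov.data`.

```python
from typing import Any, Callable, Dict

# Endpoint strings copied from the calls shown above; the dict keys are
# illustrative labels chosen for this sketch.
THEMED_ENDPOINTS: Dict[str, str] = {
    "sentiment": "news/sentiment_score",
    "polarity": "news/polarity_score",
    "topic": "news/topic_probability",
}


def load_themed_news(loader: Callable[[str], Any]) -> Dict[str, Any]:
    """Call the supplied loader once per themed endpoint and collect the results."""
    return {name: loader(endpoint) for name, endpoint in THEMED_ENDPOINTS.items()}


# Usage, assuming the sovai package is installed and authenticated:
#   import sovai as sov
#   frames = load_themed_news(sov.data)
#   df_sentiment_score = frames["sentiment"]
```

Passing the loader in keeps the helper easy to test with a stub and avoids repeating the endpoint strings throughout a notebook.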
Visualizations
Strategy
Econometrics
Analysis
Data Dictionary
Feature Name | Description | Type | Example |
---|---|---|---|
match_quality | Quality score of the match between the article and the entity, indicating the relevance and accuracy of the match. | float | 99.75 |
within_article | Number of mentions of the entity within the article, indicating the focus on the entity in the article's content. | int | 2 |
relevance | The average salience of the entity across the articles, indicating the importance or prominence of the entity. | float | 0.022049 |
magnitude | A measure of the intensity or strength of the sentiment expressed in the article. | float | 18.203125 |
sentiment | A score representing the overall sentiment (positive or negative) of the article. | float | 0.054504 |
article_count | The total number of articles associated with the entity, indicating the level of media attention or coverage. | int | 1666 |
associated_people | Count of unique people mentioned in the context of the entity, reflecting its association with various individuals. | int | 143 |
associated_companies | Count of unique companies mentioned in relation to the entity, indicating its business connections. | int | 287 |
tone | The overall tone of the article, derived from a textual analysis of its content. | float | 0.237061 |
positive | The score quantifying the positive sentiments expressed in the article. | float | 2.828125 |
negative | The score quantifying the negative sentiments expressed in the article. | float | 2.591797 |
polarity | The degree of polarity in the sentiment, indicating the extent of opinionated content. | float | 5.421875 |
activeness | A measure of the dynamism in the language used, possibly indicating the urgency of the article. | float | 22.031250 |
pronouns | The density of pronouns used in the article, indicative of the narrative style or subject focus. | float | 0.995117 |
word_count | The total number of words in the article, giving an indication of its length or detail. | int | 1084 |
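The example values in the table hint at how some fields relate to each other: `tone` is close to `positive` minus `negative`, and `polarity` is close to `positive` plus `negative`. This is an observation from the sample row only, not a documented definition, and the small gaps may come from rounding or aggregation. A quick check in Python:

```python
# Example values copied from the data dictionary above.
row = {
    "positive": 2.828125,
    "negative": 2.591797,
    "tone": 0.237061,
    "polarity": 5.421875,
}


def implied_tone(r: dict) -> float:
    """Tone as it would be if it were exactly positive minus negative (an assumption)."""
    return r["positive"] - r["negative"]


def implied_polarity(r: dict) -> float:
    """Polarity as it would be if it were exactly positive plus negative (an assumption)."""
    return r["positive"] + r["negative"]


# Absolute gaps between the implied values and the values listed in the table.
tone_gap = abs(implied_tone(row) - row["tone"])
polarity_gap = abs(implied_polarity(row) - row["polarity"])

print(f"tone gap: {tone_gap:.6f}, polarity gap: {polarity_gap:.6f}")
```

If the gaps stay small across real rows, the four fields carry overlapping information, which is worth keeping in mind before using them together as model features.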
Use Case
This dataset provides a comprehensive analysis of various entities (such as companies and individuals) based on their media coverage and associated articles. It's designed to assist investors in understanding the market sentiment, media focus, and the overall perception of entities in which they might be interested. The data is extracted and processed from a wide range of articles, ensuring a broad and in-depth view of each entity.
Investors can use it to gauge public perception, media sentiment, and the prominence of entities in the news. It can be used for:
Sentiment analysis to understand the market mood.
Identifying trends in media coverage related to specific entities.
Assessing the impact of news on stock performance.
Conducting peer comparison based on media presence and sentiment.
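As one way to approach the peer comparison listed above, the sketch below ranks tickers by an article-count-weighted mean of `sentiment`. The rows, tickers, and the weighting scheme are all illustrative assumptions; only the field names `sentiment` and `article_count` come from the data dictionary.

```python
from typing import Dict, List

# Hypothetical rows: tickers and values are made up for illustration.
rows: List[dict] = [
    {"ticker": "AAA", "sentiment": 0.10, "article_count": 200},
    {"ticker": "AAA", "sentiment": -0.05, "article_count": 100},
    {"ticker": "BBB", "sentiment": 0.02, "article_count": 50},
    {"ticker": "BBB", "sentiment": 0.08, "article_count": 150},
]


def weighted_sentiment(records: List[dict]) -> Dict[str, float]:
    """Return the article-count-weighted mean sentiment for each ticker."""
    totals: Dict[str, tuple] = {}
    for r in records:
        weighted_sum, count = totals.get(r["ticker"], (0.0, 0))
        totals[r["ticker"]] = (
            weighted_sum + r["sentiment"] * r["article_count"],
            count + r["article_count"],
        )
    # Skip tickers with no articles to avoid dividing by zero.
    return {t: s / n for t, (s, n) in totals.items() if n > 0}


scores = weighted_sentiment(rows)
# Peers ordered from most to least favorable coverage.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, scores)
```

Weighting by `article_count` lets heavily covered periods count for more than thinly covered ones; an unweighted mean is a reasonable alternative when coverage volume should not influence the comparison.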