SOV.AI

Price Breakout

A dataset of daily-updated predictions of upward price breakouts for US equities.

Last updated 6 months ago

Daily predictions for 13,000+ stocks arrive between 11 pm and 4 am, before the US market opens.

Tutorials are the best documentation: see the Price Breakout Prediction Tutorial.

Description

This dataset provides daily predictions of upward price breakouts over the next 30-60 days for more than 13,000 US equities.

The model has an accuracy of around 65% and a ROC-AUC of about 68%, which makes it one of the most accurate breakout models on the market. It is retrained weekly.
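
For readers who want to see how the two quoted metrics are defined, the following is a minimal sketch in plain Python. It assumes binary labels (1 = a breakout occurred) and model scores between 0 and 1; the sample values are made up for illustration and are not taken from the dataset, and the function names are hypothetical.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], probs: Sequence[float], threshold: float = 0.5) -> float:
    """Share of rows where the thresholded score matches the label."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(labels, probs))
    return hits / len(labels)


def roc_auc(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Chance that a random positive is scored above a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and scores, for illustration only.
labels = [1, 0, 1, 0, 1, 0]
probs = [0.9, 0.4, 0.6, 0.7, 0.3, 0.2]

print(round(accuracy(labels, probs), 4))
print(round(roc_auc(labels, probs), 4))
```

Accuracy depends on the chosen threshold (0.5 here, an assumption), while ROC-AUC only depends on how the scores rank positives against negatives.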

Several machine learning models are trained using the prepared dataset:

  • Calibrated Classifier: A classification model trained on the engineered features to predict the binary target.

  • Proprietary Regressor: A proprietary regression model used to predict the probability of a price increase.

  • Conformal Regressor: Used to provide calibrated confidence intervals around the predictions, offering an additional measure of uncertainty.
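
The page does not describe how the conformal intervals are built. As a rough illustration of the general idea, here is a sketch of split conformal prediction, a common generic technique, written in plain Python. The function names, the calibration values, and the `alpha` level are all assumptions made for this example and do not reflect the provider's actual implementation.

```python
import math
from typing import Sequence, Tuple


def conformal_halfwidth(cal_true: Sequence[float], cal_pred: Sequence[float], alpha: float = 0.1) -> float:
    """Half-width q so that [pred - q, pred + q] targets roughly 1 - alpha coverage."""
    # Nonconformity score: absolute residual on a held-out calibration set.
    scores = sorted(abs(y - p) for y, p in zip(cal_true, cal_pred))
    n = len(scores)
    # Finite-sample rank ceil((n + 1) * (1 - alpha)); clipping to n is a
    # simplification (strictly, the interval is unbounded in that case).
    rank = min(n, math.ceil((n + 1) * (1 - alpha)))
    return scores[rank - 1]


def conformal_interval(pred: float, halfwidth: float) -> Tuple[float, float]:
    """Symmetric interval around a new point prediction."""
    return pred - halfwidth, pred + halfwidth


# Made-up calibration targets and predictions, for illustration only.
cal_true = [0.02, -0.01, 0.05, 0.00, 0.03, -0.04, 0.07, 0.01, -0.02, 0.04]
cal_pred = [0.01, 0.01, 0.02, 0.02, 0.00, -0.01, 0.03, 0.03, 0.00, 0.01]

q = conformal_halfwidth(cal_true, cal_pred, alpha=0.2)
low, high = conformal_interval(0.05, q)
print(round(q, 4), round(low, 4), round(high, 4))
```

The lower and upper values play a role analogous to the `bottom_conformal` and `top_conformal` columns in the data dictionary, although the exact construction used for those columns is not documented here.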

Data Access

Retrieving Data

Latest Data

import sovai as sov
df_breakout = sov.data("breakout")

Full History

import sovai as sov
df_breakout = sov.data("breakout", full_history=True)

Specific Ticker

import sovai as sov
df_msft = sov.data("breakout", tickers=["MSFT"])

Plots

Line Predictions

df_breakout.plot_line(tickers=["TSLA", "META", "NFLX"])

Breakout Predictions

Visualize breakout predictions using the SDK's plotting capabilities:

sov.plot("breakout", chart_type="predictions", df=df_msft)

Prediction Accuracy

Assess the accuracy of breakout predictions:

sov.plot("breakout", chart_type="accuracy", df=df_msft)

Data Dictionary

| Column | Description | Type | Example |
|---|---|---|---|
| ticker | Stock ticker symbol. | object | "AAPL" |
| date | Date when the data was recorded. | datetime64[ns] | 2023-09-30 |
| target | Target variable for predictions. | float64 | 0.05 |
| future_returns | Future returns of the stock. | float32 | 0.10 |
| prediction | Predicted probability from the model. | float64 | 1.25 |
| bottom_prediction | Lower bound of the prediction interval. | float64 | 1.20 |
| top_prediction | Upper bound of the prediction interval. | float64 | 1.30 |
| standard_deviation | Standard deviation of the predictions. | float64 | 0.02 |
| bottom_conformal | Lower bound of the conformal prediction interval. | float64 | 1.18 |
| top_conformal | Upper bound of the conformal prediction interval. | float64 | 1.32 |
| slope | Slope derived from the rolling regression of predictions over a window. | float64 | 0.003 |
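
The `slope` column is described as coming from a rolling regression of predictions over a window. The window length and exact method are not stated, so the sketch below only illustrates the general technique: an ordinary least squares slope of each trailing window against the time index 0..window-1. The function name, the window of 3, and the sample values are assumptions for this example.

```python
from typing import List, Optional, Sequence


def rolling_slope(values: Sequence[float], window: int) -> List[Optional[float]]:
    """OLS slope of each trailing window of values against 0..window-1."""
    if window < 2:
        raise ValueError("window must be at least 2")
    xs = range(window)
    x_mean = (window - 1) / 2
    sxx = sum((x - x_mean) ** 2 for x in xs)
    out: List[Optional[float]] = []
    for end in range(1, len(values) + 1):
        if end < window:
            out.append(None)  # not enough history for a full window yet
            continue
        chunk = values[end - window:end]
        y_mean = sum(chunk) / window
        sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, chunk))
        out.append(sxy / sxx)
    return out


# Made-up prediction series for a single ticker, for illustration only.
predictions = [0.50, 0.52, 0.54, 0.56, 0.58]
slopes = rolling_slope(predictions, window=3)
print(slopes)
```

A positive slope means the predictions have been rising over the window; a negative slope means they have been falling.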


Use Case

• Portfolio optimization:

  • Identify potential new additions to diversified stock portfolios

  • Rebalance existing holdings based on breakout predictions

• Risk management:

  • Use confidence intervals and standard deviations to assess potential downside risk

  • Implement more precise hedging strategies based on predicted price movements

• Sector and market analysis:

  • Identify trends across industry sectors or the broader market

  • Compare breakout potentials across different stock categories (e.g., large-cap vs. small-cap)

• Market timing:

  • Use aggregate predictions across multiple stocks to gauge overall market sentiment

  • Time entry and exit points for broader market positions
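
Two of the ideas above, screening for candidates and gauging aggregate sentiment, can be sketched as follows. This is a plain-Python illustration on made-up rows that reuse column names from the data dictionary. The tickers, values, the 0-to-1 score scale, and the `min_lower_bound` threshold are all assumptions for the example, not recommendations.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List

# Made-up rows; ISO date strings sort chronologically as plain text.
rows = [
    {"ticker": "AAA", "date": "2023-09-29", "prediction": 0.62, "bottom_conformal": 0.55, "slope": 0.004},
    {"ticker": "BBB", "date": "2023-09-29", "prediction": 0.48, "bottom_conformal": 0.40, "slope": -0.001},
    {"ticker": "AAA", "date": "2023-09-30", "prediction": 0.66, "bottom_conformal": 0.58, "slope": 0.005},
    {"ticker": "BBB", "date": "2023-09-30", "prediction": 0.58, "bottom_conformal": 0.45, "slope": 0.002},
]


def screen(rows: List[dict], min_lower_bound: float = 0.5) -> List[str]:
    """Tickers on the latest date whose lower bound clears the threshold with a rising slope."""
    latest = max(r["date"] for r in rows)
    return sorted(
        r["ticker"]
        for r in rows
        if r["date"] == latest and r["bottom_conformal"] >= min_lower_bound and r["slope"] > 0
    )


def daily_sentiment(rows: List[dict]) -> Dict[str, float]:
    """Average prediction across all tickers for each date."""
    by_date = defaultdict(list)
    for r in rows:
        by_date[r["date"]].append(r["prediction"])
    return {d: mean(v) for d, v in sorted(by_date.items())}


print(screen(rows))
print(daily_sentiment(rows))
```

Filtering on the lower bound rather than the point prediction is one conservative design choice: a ticker only passes when even the pessimistic end of its interval clears the threshold.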

Input Datasets

Historical Stock Prices, Trading Volumes, Technical Indicators, Order Book.

Models Used

Classification Algorithms, Regression Models, Conformal Predictors

Model Outputs

Price Movement Predictions, Probability Scores, Confidence Intervals

Price Breakout Prediction Tutorial