SOV.AI

Clustering Panels

Clustering designed for multivariate panel data, in particular financial and time-series data.


Last updated 6 months ago


Tutorials are the best documentation: Clustering Panels Tutorial

Introduction

The clustering module can be used to cluster any panel dataset. It is particularly useful for financial analysts, data scientists, and researchers working with time-series data across multiple entities (e.g., stocks, companies) and multiple variables.

Initialization

The CustomDataFrame can be initialized using the sov.data() function:

import sovai as sov

sov.token_auth(token="your_token_here")
df = sov.data("accounting/weekly")

Basic Clustering

Perform clustering on all features:

df_cluster = df.cluster()
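To make the idea of clustering a panel concrete, here is a minimal sketch that does not use sovai at all. It assumes a toy (ticker, date) panel, reshapes it so that each ticker becomes one row, and runs a plain k-means written in NumPy. The tickers, the data, the `kmeans` helper, and the choice of k-means itself are all illustrative assumptions; the algorithm behind `df.cluster()` may well differ.

```python
import numpy as np
import pandas as pd

# Hypothetical toy panel: 4 tickers x 6 weekly dates, 2 features.
rng = np.random.default_rng(0)
tickers = ["AAA", "BBB", "CCC", "DDD"]
dates = pd.date_range("2024-01-05", periods=6, freq="W-FRI")
index = pd.MultiIndex.from_product([tickers, dates], names=["ticker", "date"])
# AAA and BBB sit near a low level, CCC and DDD near a high level.
level = np.repeat([0.0, 0.0, 10.0, 10.0], len(dates))
panel = pd.DataFrame(
    {
        "ebit": level + rng.normal(0, 0.5, len(index)),
        "total_assets": 2 * level + rng.normal(0, 0.5, len(index)),
    },
    index=index,
)

# One row per ticker, one column per (feature, date) pair, standardized.
wide = panel.unstack("date")
wide = (wide - wide.mean()) / wide.std()


def kmeans(points, k, n_iter=20):
    """Plain Lloyd's k-means (illustrative helper, not part of sovai)."""
    # Deterministic seeding: rows spread evenly along the sorted first column.
    order = np.argsort(points[:, 0])
    seeds = order[np.linspace(0, len(points) - 1, k).astype(int)]
    centroids = points[seeds].copy()
    for _ in range(n_iter):
        # Distance from every point to every centroid: shape (n_points, k).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each non-empty centroid to the mean of its members.
        for j in range(k):
            if (labels == j).any():
                centroids[j] = points[labels == j].mean(axis=0)
    return labels, centroids


labels, centroids = kmeans(wide.to_numpy(), k=2)
clusters = pd.Series(labels, index=wide.index, name="labels")
print(clusters)
```

Flattening each ticker's history into a single row is only one way to treat a panel; clustering each (ticker, date) observation separately is another, and which one applies here is not stated in this page.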

Feature-Specific Clustering

Cluster based on specific features:

df_cluster_ebit = df.cluster(features=["ebit"])
df_cluster_multi = df.cluster(features=["total_assets", "total_debt", "ebit"])

Summary Clustering

Get a quick summary of the last six months of data:

df.cluster("summary")

Visualization Methods

Line Plot

Visualize cluster centroids and distances:

df.cluster("line_plot")

Scatter Plot

Create a scatter plot of clustered data:

df.cluster("scatter_plot")

Animation Plot

Generate an animated plot of cluster evolution:

df.cluster("animation_plot")

Advanced Analysis

Distance Calculation

Calculate distances between ticker-cluster combinations:

df_dist = df_cluster.drop(columns=["labels"]).distance(orient="time-series")
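For intuition about what a time-series distance between entities looks like, the following sketch computes a Euclidean distance matrix between the columns of a small wide frame (dates as rows, tickers as columns) and then ranks the other tickers by distance to one of them. The tickers and values are made up, and Euclidean distance is an assumption chosen for simplicity; the metric used by `.distance(orient="time-series")` is not specified on this page.

```python
import numpy as np
import pandas as pd

# Hypothetical wide frame: rows are dates, columns are tickers.
dates = pd.date_range("2024-01-01", periods=5, freq="D")
series = pd.DataFrame(
    {
        "AAA": [1.0, 2.0, 3.0, 4.0, 5.0],
        "BBB": [1.0, 2.0, 3.0, 4.0, 6.0],
        "CCC": [5.0, 4.0, 3.0, 2.0, 1.0],
    },
    index=dates,
)

# Pairwise Euclidean distance between tickers via broadcasting.
values = series.to_numpy().T                      # shape (n_tickers, n_dates)
diff = values[:, None, :] - values[None, :, :]    # shape (n, n, n_dates)
dist = pd.DataFrame(
    np.sqrt((diff ** 2).sum(axis=2)),
    index=series.columns,
    columns=series.columns,
)

# Rank the other tickers by closeness to AAA (smallest distance first).
nearest_to_aaa = dist["AAA"].drop("AAA").sort_values()
print(nearest_to_aaa)
```

Sorting one column of the distance frame in ascending order puts the most similar entities first, which is the same pattern as sorting `df_dist` by a ticker column.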

Examples

Basic Clustering and Visualization

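import sovai as sov

# Authenticate, then load the weekly accounting panel
sov.token_auth(token="your_token_here")
df_accounting = sov.data("accounting/weekly")
# Keep the "mega" stock selection, starting from 2018-01-01
df_mega = df_accounting.select_stocks("mega").date_range("2018-01-01")
# Cluster on all features, then draw the line plot for the same data
df_cluster = df_mega.cluster()
df_mega.cluster("line_plot")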

Feature-Specific Clustering and Distance Analysis

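# Cluster on EBIT only, then drop the labels column before computing distances
df_cluster_ebit = df_mega.cluster(features=["ebit"])
df_dist = df_cluster_ebit.drop(columns=["labels"]).distance(orient="time-series")
# Sort by distance to AMZN in ascending order, so the closest entries come first
similar_to_amzn = df_dist.sort_values(["AMZN"])[["AMZN"]].T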

Clustering Panels Tutorial