Clustering Panels
Clustering designed for multivariate panels of financial and time-series data.
Tutorials are the best documentation: Clustering Panels Tutorial
Introduction
Panel clustering can be applied to any panel dataset. It is particularly useful for financial analysts, data scientists, and researchers working with time-series data across multiple entities (e.g., stocks, companies) and variables.
Initialization
The CustomDataFrame can be initialized using the sov.data() function:
import sovai as sov
sov.token_auth(token="your_token_here")
df = sov.data("accounting/weekly")
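To make the idea of a panel (multiple entities, multiple variables, observed over time) concrete, here is a minimal sketch built with plain pandas. The tickers, column names, and the (ticker, date) index layout are assumptions for illustration only; the frame returned by sov.data() may be organized differently.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a panel dataset: rows indexed by (ticker, date),
# one column per variable. This is NOT the output of sov.data().
tickers = ["AAA", "BBB", "CCC"]
dates = pd.date_range("2023-01-06", periods=4, freq="W-FRI")
index = pd.MultiIndex.from_product([tickers, dates], names=["ticker", "date"])

rng = np.random.default_rng(0)
panel = pd.DataFrame(
    {
        "total_assets": rng.normal(100.0, 10.0, len(index)),
        "ebit": rng.normal(10.0, 2.0, len(index)),
    },
    index=index,
)

# One entity's time series across all variables (dates x variables)
one_ticker = panel.xs("AAA", level="ticker")

# One variable across all entities, reshaped to dates x tickers
ebit_wide = panel["ebit"].unstack("ticker")
```

Slicing by entity or by variable, as in the last two statements, is the kind of view that feature-specific clustering and time-series distances operate on.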
Basic Clustering
Perform clustering on all features:
df_cluster = df.cluster()

Feature-Specific Clustering
Cluster based on specific features:
df_cluster_ebit = df.cluster(features=["ebit"])
df_cluster_multi = df.cluster(features=["total_assets", "total_debt", "ebit"])
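As a rough illustration of what clustering on a chosen subset of features means, the sketch below runs a plain k-means on standardized per-ticker values. It is not the algorithm behind df.cluster(), which this page does not describe; the tickers, numbers, and the kmeans helper are all hypothetical.

```python
import numpy as np
import pandas as pd

def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means; the first k points seed the centroids (deterministic)."""
    centroids = points[:k].copy()
    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (n_points, k)
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid; keep the old one if its cluster is empty
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical per-ticker feature values (made-up numbers)
data = pd.DataFrame(
    {
        "total_assets": [100.0, 110.0, 900.0, 950.0],
        "total_debt": [40.0, 45.0, 300.0, 310.0],
        "ebit": [10.0, 12.0, 80.0, 85.0],
    },
    index=["AAA", "BBB", "CCC", "DDD"],
)

# Restrict to the chosen features, then standardize so that no single
# feature dominates the distance
selected = data[["total_assets", "ebit"]]
z = (selected - selected.mean()) / selected.std()

labels, _ = kmeans(z.to_numpy(), k=2)
clusters = pd.Series(labels, index=data.index, name="labels")
```

Changing the list of selected columns changes which tickers end up close together, which is why single-feature and multi-feature runs can produce different groupings.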
Summary Clustering
Get a quick summary of the last six months of data:
df.cluster("summary")

Visualization Methods
Line Plot
Visualize cluster centroids and distances:
df.cluster("line_plot")

Scatter Plot
Create a scatter plot of clustered data:
df.cluster("scatter_plot")

Animation Plot
Generate an animated plot of cluster evolution:
df.cluster("animation_plot")

Advanced Analysis
Distance Calculation
Calculate distances between ticker-cluster combinations:
df_dist = df_cluster.drop(columns=["labels"]).distance(orient="time-series")
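For intuition about a time-series distance matrix, the following sketch computes pairwise Euclidean distances between tickers from a dates x tickers frame. The choice of Euclidean distance, the tickers, and the values are assumptions; the metric used by .distance(orient="time-series") is not stated here.

```python
import numpy as np
import pandas as pd

# Hypothetical weekly values of one feature, dates x tickers (made-up numbers)
wide = pd.DataFrame(
    {
        "AAA": [1.0, 2.0, 3.0, 4.0],
        "BBB": [1.0, 2.0, 3.0, 5.0],
        "CCC": [4.0, 3.0, 2.0, 1.0],
    },
    index=pd.date_range("2023-01-06", periods=4, freq="W-FRI"),
)

# One row per ticker, then broadcast to get every pairwise difference
values = wide.to_numpy().T
diff = values[:, None, :] - values[None, :, :]

# Square matrix of Euclidean distances, tickers on both axes
dist = pd.DataFrame(
    np.sqrt((diff ** 2).sum(axis=2)),
    index=wide.columns,
    columns=wide.columns,
)

# Tickers ordered from most to least similar to AAA (AAA itself comes first)
nearest_to_aaa = dist["AAA"].sort_values()
```

Sorting one column of such a matrix in ascending order lists the entities whose histories are closest to the chosen ticker.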

Examples
Basic Clustering and Visualization
import sovai as sov
sov.token_auth(token="your_token_here")
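# Load the weekly accounting panel
df_accounting = sov.data("accounting/weekly")
# Select the "mega" stock group and apply the 2018-01-01 date range
df_mega = df_accounting.select_stocks("mega").date_range("2018-01-01")
# Cluster on all features, then plot cluster centroids and distances
df_cluster = df_mega.cluster()
df_mega.cluster("line_plot")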
Feature-Specific Clustering and Distance Analysis
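# Cluster on EBIT only (df_mega is defined in the previous example)
df_cluster_ebit = df_mega.cluster(features=["ebit"])
# Drop the cluster labels before computing time-series distances
df_dist = df_cluster_ebit.drop(columns=["labels"]).distance(orient="time-series")
# Sort by distance to AMZN in ascending order, closest first
similar_to_amzn = df_dist.sort_values(["AMZN"])[["AMZN"]].T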