Feature Importance
The feature importance module in the sovai library offers multiple unsupervised algorithms to quantify the significance of each feature in financial datasets.
Tutorials are the best documentation — Feature Importance Tutorial
Feature Importance Methods
The module supports several methods for calculating feature importance:
Random Projection
Reflects how much each feature contributes to the variance in the randomly projected space.
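The library's own call is not shown here; as a minimal sketch of the idea using scikit-learn (an illustration, not the sovai API), one plausible way to score a feature is to weight its variance by the squared random weights that carry it into the projected space:

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 0] *= 5.0  # give feature 0 a much larger variance

proj = GaussianRandomProjection(n_components=3, random_state=0).fit(X)
W = proj.components_  # shape (3, 5): one random direction per row

# Squared projection weights per feature, scaled by the feature's
# own variance: a proxy for its contribution to projected variance.
importance = (W ** 2).sum(axis=0) * X.var(axis=0)
importance /= importance.sum()  # normalise scores to sum to 1
```

With this scoring, the high-variance feature 0 tends to receive the largest share of the importance mass.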
Random Fourier Features
Indicates how strongly each feature influences the approximation of non-linear relationships in the Fourier-transformed space.
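As a hedged sketch of this idea (again with scikit-learn rather than the sovai API), the sampled Fourier frequencies in `RBFSampler.random_weights_` show how strongly each input feature drives the random phases, and scaling by each feature's variance gives a simple importance proxy:

```python
import numpy as np
from sklearn.kernel_approximation import RBFSampler

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
X[:, 2] *= 4.0  # feature 2 varies on a much larger scale

rff = RBFSampler(n_components=100, gamma=0.5, random_state=1).fit(X)
W = rff.random_weights_  # shape (4, 100): sampled Fourier frequencies

# How strongly each feature drives the random Fourier phases,
# scaled by its variance in the input data.
importance = (W ** 2).sum(axis=1) * X.var(axis=0)
importance /= importance.sum()
```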
Independent Component Analysis (ICA)
Based on the magnitude of each feature's contribution to the extracted independent components, representing underlying independent signals in the data.
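A minimal sketch of this scoring, assuming scikit-learn's `FastICA` (not the sovai implementation): the columns of the estimated mixing matrix hold each feature's loading on the recovered components, so the total loading magnitude per feature serves as its importance.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(2)
S = rng.laplace(size=(500, 3))   # non-Gaussian independent sources
A = rng.normal(size=(3, 4))      # mix them into four observed features
X = S @ A

ica = FastICA(n_components=3, random_state=2).fit(X)

# Each feature's total loading magnitude on the recovered
# independent components (rows of the mixing matrix).
importance = np.abs(ica.mixing_).sum(axis=1)
importance /= importance.sum()
```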
Truncated Singular Value Decomposition (SVD)
Determined by each feature's influence on the principal singular vectors, which represent directions of maximum variance in the data.
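To make this concrete, here is one way to compute such scores with scikit-learn's `TruncatedSVD` (an illustrative sketch, not the library's code): square each feature's loadings on the singular vectors and weight them by the variance each component explains.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

rng = np.random.default_rng(3)
X = rng.normal(size=(250, 6))
X[:, 1] *= 6.0  # feature 1 carries most of the variance

svd = TruncatedSVD(n_components=3, random_state=3).fit(X)

# Squared loadings on each singular vector, weighted by the
# share of variance that component explains.
loadings = svd.components_ ** 2                       # shape (3, 6)
importance = loadings.T @ svd.explained_variance_ratio_
importance /= importance.sum()
```

Here the dominant singular vector aligns with the high-variance feature, which therefore receives the top score.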
Sparse Random Projection
Based on how much each feature contributes to the variance in the sparsely projected space, similar to standard Random Projection but with improved computational efficiency.
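The same scoring idea carries over with a sparse projection matrix; a sketch using scikit-learn's `SparseRandomProjection` (illustrative only). Most projection weights are exactly zero, which is where the computational savings come from.

```python
import numpy as np
from sklearn.random_projection import SparseRandomProjection

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 5))
X[:, 3] *= 5.0  # feature 3 has the largest variance

proj = SparseRandomProjection(n_components=4, random_state=4).fit(X)
W = proj.components_.toarray()  # mostly zeros: cheap to store and apply

# Same variance-weighted scoring as the dense projection case.
importance = (W ** 2).sum(axis=0) * X.var(axis=0)
importance /= importance.sum()
```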
Clustered SHAP Ensemble
Iteratively applies clustering, trains an XGBoost classifier to predict cluster membership, computes SHAP values for that classifier, and averages the results across multiple runs, yielding each feature's importance in defining the data's natural cluster structure.
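The full pipeline depends on XGBoost and the shap package. The sketch below mirrors its structure with scikit-learn stand-ins (`GradientBoostingClassifier` in place of XGBoost, impurity-based importances in place of SHAP values); it is a simplification of the idea, not the library's implementation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(5)
# Two groups separated along feature 0; features 1 and 2 are noise
X = np.vstack([rng.normal(0.0, 1.0, (100, 3)),
               rng.normal(0.0, 1.0, (100, 3)) + np.array([6.0, 0.0, 0.0])])

runs = []
for seed in range(3):
    # 1) cluster the data to obtain pseudo-labels for this run
    labels = KMeans(n_clusters=2, n_init=10, random_state=seed).fit_predict(X)
    # 2) train a boosted classifier to predict cluster membership
    model = GradientBoostingClassifier(random_state=seed).fit(X, labels)
    # 3) record per-feature attributions (SHAP values in the real method)
    runs.append(model.feature_importances_)

# 4) average the attributions across runs
importance = np.mean(runs, axis=0)
```

Feature 0, which actually separates the two groups, ends up with nearly all of the averaged importance.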
Global Feature Importance
To calculate global feature importance across all methods:
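The original code sample is not reproduced on this page. As an illustration of the idea only (not the sovai API, whose function names may differ), per-method scores can be normalised to sum to one and then averaged with equal weights into a single global ranking:

```python
import numpy as np
from sklearn.decomposition import FastICA, TruncatedSVD
from sklearn.random_projection import GaussianRandomProjection

rng = np.random.default_rng(6)
X = rng.normal(size=(300, 4))
X[:, 0] *= 3.0  # feature 0 dominates the variance

def normalise(v):
    v = np.abs(np.asarray(v, dtype=float))
    return v / v.sum()

scores = {}

rp = GaussianRandomProjection(n_components=3, random_state=6).fit(X)
scores["random_projection"] = normalise(
    (rp.components_ ** 2).sum(axis=0) * X.var(axis=0))

svd = TruncatedSVD(n_components=3, random_state=6).fit(X)
scores["truncated_svd"] = normalise(
    (svd.components_ ** 2).T @ svd.explained_variance_ratio_)

ica = FastICA(n_components=3, random_state=6, max_iter=500).fit(X)
scores["ica"] = normalise(np.abs(ica.mixing_).sum(axis=1))

# Global importance: equal-weight average across methods
global_importance = normalise(sum(scores.values()))
```

Averaging across methods smooths out the idiosyncrasies of any single algorithm's scoring.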
Feature Selection
Example of selecting top features based on importance scores:
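The page's original example is not shown here; the sketch below illustrates the selection step with made-up scores and hypothetical feature names (`ret_1m`, `volatility`, etc. are placeholders, not sovai fields), keeping the top-k features by importance:

```python
import pandas as pd

# Hypothetical importance scores keyed by feature name
importance = pd.Series({
    "ret_1m": 0.31, "volatility": 0.27, "turnover": 0.18,
    "book_to_market": 0.14, "size": 0.10,
})

top_k = 3
selected = importance.nlargest(top_k).index.tolist()
# selected → ['ret_1m', 'volatility', 'turnover']
```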