FinBench is a collection of tools, datasets and example implementations to evaluate and experiment with models and algorithms in the financial domain (time-series forecasting, ranking, portfolio simulation, factor modeling, etc.). The repository aims to provide a reproducible foundation for research, benchmarking and rapid prototyping in quantitative finance and financial machine learning.
- Structured datasets and data loaders for common financial tasks.
- Preprocessing and feature engineering utilities (technical indicators, rolling statistics, factor calculation).
- Baseline model implementations across multiple tasks (classification, regression, ranking, portfolio optimization).
- Evaluation and backtesting tools for reproducible experiment comparison.
- Per-model example training scripts and requirements to reproduce results.
| Type | Model | Loss Function | Data Normalization |
|---|---|---|---|
| Classification | THGNN | Cross-Entropy Loss | - |
| Classification | MAN-SF | Cross-Entropy Loss | Relative Price Scaling (High/Low divided by previous Adjusted Close) |
| Classification | Adv-ALSTM | Hinge Loss | - |
| Classification | HGTAN | Cross-Entropy Loss | Per-Ticker Max Scaling (Max Normalization) |
| Classification | CNNPred2D / CNNPred3D | MSE Loss | Standardization (Z-score scaling via StandardScaler), Missing Value Imputation (fillna(0)) |
| Classification | DGDNN | Cross-Entropy Loss | - |
| Regression | D-Va | MSE + Regularization + Variational + Denoising Loss | - |
| Regression | ESTIMATE | RMSE Loss | Per-Ticker Max Scaling (Max Normalization) |
| Regression | StockMixer | MSE Loss + Pointwise Ranking Loss | Per-Ticker Max Scaling (Max Normalization) |
| Regression | MASTER | MSE Loss | Daily Cross-Sectional Z-score Normalization (label only), Drop-Last Strategy in Training |
| Regression | MATCC | MSE Loss | Robust Standardization (Robust Z-score using median & IQR), Drop-Last Strategy in Training |
| Regression | HIST | MSE Loss | Robust Standardization + Daily Cross-Sectional Z-score, Missing Value Handling (Drop NaN Labels + fillna(0)) |
| Regression | DiscoverPLF | Reconstruction + Prediction + KL Divergence Loss | Robust Standardization + Daily Cross-Sectional Z-score, Missing Value Handling (Drop NaN Labels + fillna(0)) |
| Regression | FactorVAE | Negative Log-Likelihood + KL Divergence Loss | Robust Standardization + Daily Cross-Sectional Z-score, Missing Value Handling (Drop NaN Labels + fillna(0)) |
| Regression | FinFormer | Concordance Correlation Coefficient (CCC) Loss | Robust Z-score Normalization + Missing Value Imputation + Label Filtering + Cross-Sectional Rank Normalization (CSRank) |
| Regression | SAMBA | MAE Loss | Min-Max Scaling |
| Ranking | STHAN-SR | MSE Loss + Pointwise Ranking Loss | Per-Ticker Max Scaling (Max Normalization) |
| Ranking | SVAT | MSE Loss + Pointwise Ranking Loss | Per-Ticker Max Scaling (Max Normalization) |
| Ranking | RT-GCN | MSE Loss + Pointwise Ranking Loss | Per-Ticker Max Scaling (Max Normalization) |
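
Three of the normalization schemes named in the table can be illustrated in a few lines. The following is a minimal sketch in plain Python, not the preprocessing code of any model in this repository: the function names are hypothetical, the quartile method (`inclusive`) and the use of the population standard deviation are assumptions, and each model's own scripts remain the reference for the exact variant it applies.

```python
from statistics import mean, median, pstdev, quantiles


def max_scale(prices):
    """Per-ticker max scaling: divide one ticker's series by its own maximum."""
    peak = max(prices)
    return [p / peak for p in prices]


def robust_zscore(values):
    """Robust standardization: center on the median, scale by the IQR.

    Assumes the interquartile range is non-zero.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    center = median(values)
    return [(v - center) / (q3 - q1) for v in values]


def cross_sectional_zscore(value_by_ticker):
    """Daily cross-sectional z-score: standardize across tickers for one day."""
    vals = list(value_by_ticker.values())
    mu = mean(vals)
    sigma = pstdev(vals)  # population std; a sample std is an equally valid choice
    return {ticker: (v - mu) / sigma for ticker, v in value_by_ticker.items()}


print(max_scale([10.0, 20.0, 15.0]))               # [0.5, 1.0, 0.75]
print(robust_zscore([1.0, 2.0, 3.0, 4.0, 5.0]))    # [-1.0, -0.5, 0.0, 0.5, 1.0]
```

Max scaling and the robust z-score operate along time for a single ticker, whereas the cross-sectional z-score operates across tickers on a single day, which is why the table lists them as separate steps that some models combine.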
- `Classification/`: Multiple classification model implementations and training scripts (e.g., Adv-ALSTM, CNNPred, DGDNN, HGTAN, MAN-SF, THGNN).
- `Ranking/`: Ranking models and related training pipelines.
- `Regression/`: Regression and forecasting models (FinFormer, FactorVAE, HIST and more).
- `Evaluation/`: Evaluation and backtesting utilities, evaluation scripts and configuration templates.
Note: Each model implementation includes its own `requirements.txt` and example training scripts.
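
Since every model folder is expected to carry its own `requirements.txt` and `train.py`, a short script can list which folders are present and what each one contains. The sketch below is not part of FinBench: `find_models` is a hypothetical helper, and it assumes only the `<Type>/<Model_Folder>` layout described in this README.

```python
from pathlib import Path

# Task types named in this README.
TYPES = ("Classification", "Ranking", "Regression")


def find_models(repo_root):
    """Return (type, model, has_requirements, has_train) for each model folder."""
    rows = []
    for task_type in TYPES:
        type_dir = Path(repo_root) / task_type
        if not type_dir.is_dir():
            continue  # this checkout has no folder for the task type
        for model_dir in sorted(p for p in type_dir.iterdir() if p.is_dir()):
            rows.append((
                task_type,
                model_dir.name,
                (model_dir / "requirements.txt").is_file(),
                (model_dir / "train.py").is_file(),
            ))
    return rows
```

Calling `find_models(".")` from the repository root returns one tuple per model folder, which makes it easy to see which environments still need to be created.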
- Clone the repository:

  ```bash
  git clone https://github.com/softlab-unimore/finbench.git
  cd finbench
  ```
- Create and activate a Python virtual environment for each model and for the Evaluation package.
- Install dependencies.
  - Global evaluation tools (used by `Evaluation/`):

    ```bash
    pip install -r Evaluation/requirements.txt
    ```

  - Per-model dependencies: each model folder (for example `Classification/Adv-ALSTM/`) contains a `requirements.txt` with the packages needed for training and evaluation of that model. Follow the instructions in each model folder.
- Data loading: `Evaluation/main.py` extracts data from the data sources and prepares it for training and evaluation. Starting from the repository root, run:

  ```bash
  cd Evaluation
  python3 main.py
  ```
- Model training: every model provides a `train.py` script inside its folder. Typical usage (adjust the arguments per model):

  ```bash
  cd ../<Type>/<Model_Folder>
  python3 train.py [<params>]
  ```

  Replace `<Type>` and `<Model_Folder>` with the appropriate values. `<Type>` must be one of: `Ranking`, `Classification`, `Regression`. Check the model folder for specific training instructions and required arguments.
- Extract task-level metrics: use the provided tool to collect the best validation runs and produce per-model CSV metric summaries.
  - Verify your results layout. Results must follow the pattern `<Type>/<Model>/results/<Universe>/<Config>/<Seed>/<Year>/*`.
  - Run the extractor. From the repository root, run:

    ```bash
    python extract_model_metrics.py --type <TYPE> --model <MODEL_NAME>
    ```

    Replace `TYPE` and `MODEL_NAME` with the appropriate type and model folder. `TYPE` must be one of: `Ranking`, `Classification`, `Regression`.
  - The script will create:
    - `<Type>/<Model>/best_results.json`: best test metrics selected by validation score.
    - `<Type>/<Model>/metrics.csv`: tab-separated table of metrics per (Year, Seed, Universe) for common sl/pl configurations.
- Evaluation:
  - `evaluation.py` computes portfolio metrics on model predictions.

    ```bash
    cd Evaluation
    python3 evaluation.py --type <TYPE> --model <MODEL_NAME> --universe <UNIVERSE> --sl <SL> --pl <PL> --initial_year <YEAR> --top_k <K> --short_k <SK>
    ```

  - `quintile_analysis.py` provides tools to compute quintile-based metrics and visualizations.

    ```bash
    cd Evaluation
    python3 quintile_analysis.py --type <TYPE> --model <MODEL_NAME> --universe <UNIVERSE> --initial_year <YEAR>
    ```

  Replace `TYPE`, `MODEL_NAME` and the other placeholders with the appropriate values. Check the `Evaluation/README.md` file for specific instructions and available arguments.
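
The results layout required by the metric extractor, `<Type>/<Model>/results/<Universe>/<Config>/<Seed>/<Year>/*`, can be checked before running it. The following sketch is not part of the repository: `check_results_layout` is a hypothetical helper that only inspects folder depth and makes no assumption about file names or about how `extract_model_metrics.py` reads the files.

```python
from pathlib import Path


def check_results_layout(model_dir):
    """Split files under <model_dir>/results into well-placed runs and strays.

    A file counts as well placed when it sits at least four folders deep:
    <Universe>/<Config>/<Seed>/<Year>/<file>.
    """
    results = Path(model_dir) / "results"
    runs, strays = set(), []
    for path in sorted(results.rglob("*")):
        if not path.is_file():
            continue
        parts = path.relative_to(results).parts
        if len(parts) >= 5:
            runs.add(parts[:4])  # (universe, config, seed, year)
        else:
            strays.append(path)  # file stored above the <Year> level
    return runs, strays
```

For a model folder `<Type>/<Model>`, `check_results_layout("<Type>/<Model>")` returns the set of `(universe, config, seed, year)` runs it found and a list of files stored at the wrong depth. An empty set of runs means either that no results exist yet or that the `results` folder is missing.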
Check the docs or the training script in the model folder for model-specific flags and data requirements.
Almost all models were tested with Python 3.10; a few exceptions (e.g., Adv-ALSTM) required a different Python version because of library compatibility issues.
Check the README.md in each model folder for specific Python version requirements and installation instructions.
All the results on the implemented models are available in the results.md file at the project root.
This repository includes a LICENSE file at the project root. Review it for terms and conditions before using the code in production.
