Ratio mean-reversion testing for asset pairs. Hurst exponent + ADF + range filters + walk-forward backtest. No lookahead. MIT license.
This is the open-source utility behind pairscan.io — a screener for pair trading on crypto and tokenized US equities. It does one thing: takes two price series, tells you whether their log-ratio shows mean-reversion, and if so, what a walk-forward backtest would have looked like.
✅ Does:
- Compute Hurst exponent via R/S analysis
- Run Augmented Dickey-Fuller test for stationarity
- Test range width and alternating boundary touches
- Combine all four into a single is_mean_reverting() predicate
- Walk-forward backtest one pair (rolling P5/P95, no lookahead)
❌ Doesn't:
- Fetch data from exchanges (use ccxt, yfinance, or your own pipeline)
- Screen multiple pairs (this is a single-pair tool)
- Validate tokenized asset pegs against oracles
- Run scheduled, multi-source data fallback
- Send alerts
If you need those, pairscan.io does them as a hosted product — that's our commercial offering. This package is the math, free and open.
pip install pairscan-rmr

import numpy as np
from pairscan_rmr import is_mean_reverting, walk_forward_backtest
# Your price series — daily closes for two assets
price_a = np.array([...]) # e.g. ETH daily closes
price_b = np.array([...]) # e.g. BTC daily closes
# Step 1: Does this pair mean-revert?
result = is_mean_reverting(price_a, price_b)
print(result)
# MeanReversionResult(passed=True, hurst=0.42, adf_pvalue=0.31,
# range_width=0.53, low_touches=3, high_touches=2)
# Step 2: If yes, run a walk-forward backtest
if result.passed:
    backtest = walk_forward_backtest(
        price_a, price_b,
        lookback_days=540,
        entry_low=0.2,
        entry_high=0.8,
        fee_pct=0.001,
    )
    print(f"Final A qty: {backtest.final_a_qty:.2f}")
    print(f"Final B qty: {backtest.final_b_qty:.2f}")
    print(f"Trades: {backtest.n_trades}")
    print(f"Max drawdown: {backtest.max_drawdown:.1%}")

This package is free because the math has been public since 1951. Hurst (1951), Dickey-Fuller (1979), Lo-MacKinlay (1988): none of this is proprietary. Anyone can reimplement it in an afternoon.
What's not in this repo is what makes pairscan.io worth $19/mo: 5-source data fallback, oracle peg-check on tokenized assets, 170-pair screening every 6 hours, cross-sector matching, Telegram alerts. That's operational engineering, and that's what we sell.
The math should be free. The pipeline costs money to run.
Brief intro below. Full walkthrough with derivations and academic references at pairscan.io/methodology.
The Hurst exponent measures the long-term memory of a time series:
- H < 0.5: anti-persistent / mean-reverting (we want this)
- H = 0.5: random walk
- H > 0.5: persistent / trending
We compute it on the log-ratio, not raw prices.
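To make the idea concrete, here is a minimal sketch of classical rescaled-range (R/S) estimation. It is not the package's implementation: the function name hurst_rs, the doubling chunk sizes, and the choice to apply R/S to the increments of the log-ratio (so that a random walk lands near 0.5) are all assumptions made for illustration. The package itself exposes a max_lag setting that this sketch does not model.

```python
import numpy as np

def hurst_rs(log_ratio, min_chunk=8):
    """Rough Hurst estimate via rescaled-range analysis (illustrative sketch)."""
    x = np.asarray(log_ratio, dtype=float)
    increments = np.diff(x)              # assumption: R/S is applied to increments
    n = len(increments)
    sizes, rs_means = [], []
    size = min_chunk
    while size <= n // 2:                # chunk sizes 8, 16, 32, ...
        rs_chunks = []
        for start in range(0, n - size + 1, size):
            chunk = increments[start:start + size]
            dev = np.cumsum(chunk - chunk.mean())   # cumulative deviation from the chunk mean
            r = dev.max() - dev.min()               # range of the cumulative deviation
            s = chunk.std()                         # scale of the chunk
            if s > 0:
                rs_chunks.append(r / s)
        if rs_chunks:
            sizes.append(size)
            rs_means.append(np.mean(rs_chunks))
        size *= 2
    # H is the slope of log(R/S) against log(chunk size)
    slope, _ = np.polyfit(np.log(sizes), np.log(rs_means), 1)
    return float(slope)

rng = np.random.default_rng(0)
random_walk = np.cumsum(rng.normal(size=2000))   # log-ratio with no memory
mean_reverting = rng.normal(size=2000)           # log-ratio that snaps back to 0 every step
h_rw = hurst_rs(random_walk)
h_mr = hurst_rs(mean_reverting)
print(f"random walk H ~ {h_rw:.2f}, mean-reverting H ~ {h_mr:.2f}")
```

Plain R/S is known to be biased upward on short samples, so a random walk will typically score somewhat above 0.5 here. Treat the number as a relative ranking between pairs rather than an exact measurement.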
The Augmented Dickey-Fuller (ADF) test checks for a unit root. A low p-value rejects the unit root, which points to a stationary series with a mean to revert to. We use a loose threshold (p < 0.7) combined with the other filters, because a strict p < 0.05 throws out genuinely mean-reverting crypto pairs: crypto data is noisier than equities.
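For intuition, the regression behind the test can be sketched in a few lines of NumPy. This is the plain Dickey-Fuller regression with a constant and no lagged-difference terms (so not the augmented version), and it returns only the t-statistic. The function name dickey_fuller_tstat is hypothetical. Turning the statistic into a p-value needs the non-standard Dickey-Fuller distribution, which statsmodels.tsa.stattools.adfuller handles; whether this package uses statsmodels internally is not stated here.

```python
import numpy as np

def dickey_fuller_tstat(series):
    """t-statistic of gamma in: diff(y)[t] = alpha + gamma * y[t-1] + error."""
    y = np.asarray(series, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])   # constant and lagged level
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)     # ordinary least squares
    resid = dy - X @ coef
    dof = len(dy) - X.shape[1]
    sigma2 = resid @ resid / dof                      # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)             # OLS coefficient covariance
    return float(coef[1] / np.sqrt(cov[1, 1]))

# Discretized Ornstein-Uhlenbeck series: each step keeps 90% of the previous level
rng = np.random.default_rng(1)
ou = np.zeros(1000)
for t in range(1, len(ou)):
    ou[t] = 0.9 * ou[t - 1] + rng.normal(scale=0.02)

t_stat = dickey_fuller_tstat(ou)
print(f"DF t-statistic: {t_stat:.2f} (5% critical value is about -2.86)")
```

A more negative statistic is stronger evidence against a unit root. The -2.86 figure is the asymptotic 5% critical value for the constant-only case.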
Operational filters: range must span ≥ 40% (so swap fees don't kill returns) and the series must touch both boundaries multiple times alternately (so it's genuinely oscillating, not just visiting an extreme once).
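One way to picture these two filters is the sketch below. Everything in it is an assumption for illustration: the function name range_and_touches, measuring width as the relative span of the ratio over the full sample, and counting a "touch" when the series enters the outer 10% of its range (the zone parameter). The package's actual definitions may differ.

```python
import numpy as np

def range_and_touches(log_ratio, zone=0.1):
    """Return (relative range width, low touches, high touches), alternating only."""
    x = np.asarray(log_ratio, dtype=float)
    lo, hi = x.min(), x.max()
    width = np.exp(hi - lo) - 1.0          # 0.4 means the ratio spans 40%
    low_edge = lo + zone * (hi - lo)       # bottom band of the range
    high_edge = hi - zone * (hi - lo)      # top band of the range
    low_touches = high_touches = 0
    last = None                            # which boundary was touched most recently
    for v in x:
        if v <= low_edge and last != "low":
            low_touches += 1               # counts only after a visit to the other side
            last = "low"
        elif v >= high_edge and last != "high":
            high_touches += 1
            last = "high"
    return width, low_touches, high_touches

# Five full oscillations with a log-ratio amplitude of 0.25
wave = 0.25 * np.sin(2 * np.pi * np.arange(500) / 100)
width, low, high = range_and_touches(wave)
print(f"width={width:.2f}, low touches={low}, high touches={high}")

# A one-way drift visits each boundary exactly once
trend = np.linspace(0.0, 0.5, 500)
print(range_and_touches(trend)[1:])
```

Because repeated visits to the same boundary are ignored until the opposite one is hit, a series that drifts to an extreme and stays there cannot accumulate touches.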
At each decision point t, only data up to t is used to set entry/exit thresholds. The percentile bounds are recomputed every day on a trailing 540-day window. This is the only way to honestly simulate "what would have happened if I'd been running this in real time".
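The trailing-window rule can be sketched as follows. The function name trailing_bounds is hypothetical, and the 5th/95th percentiles are taken from the "rolling P5/P95" description above; how the package turns these bounds into entries and exits is not covered here.

```python
import numpy as np

def trailing_bounds(log_ratio, lookback=540, low_pct=5, high_pct=95):
    """Percentile bounds at each day t, computed only from the window ending at t."""
    x = np.asarray(log_ratio, dtype=float)
    lower = np.full(len(x), np.nan)        # no bounds until a full window exists
    upper = np.full(len(x), np.nan)
    for t in range(lookback - 1, len(x)):
        window = x[t - lookback + 1 : t + 1]   # last `lookback` values, nothing after t
        lower[t], upper[t] = np.percentile(window, [low_pct, high_pct])
    return lower, upper

rng = np.random.default_rng(2)
series = np.cumsum(rng.normal(scale=0.01, size=700))
lower, upper = trailing_bounds(series)

# Overwriting the future must leave earlier bounds untouched
tampered = series.copy()
tampered[600:] = 1e6
lower2, upper2 = trailing_bounds(tampered)
print(np.array_equal(lower[539:600], lower2[539:600]))
```

The explicit loop is slow but makes the data boundary obvious: the slice ends at t + 1, so index t is the newest value any bound at t can see.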
tests/test_no_lookahead.py runs the same backtest twice — once with clean data, once with all data after a midpoint replaced with garbage — and asserts the two trade lists are byte-identical up to the midpoint. If a future-dependent statistic ever leaks in, the test fails immediately. Look at it before trusting the backtest output.
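The same corruption trick works on any signal function. The toy below is not the repo's test: honest_signals, leaky_signals and same_before_midpoint are hypothetical names, and the "strategy" is just a percentile rule. It shows the pattern catching a rule that computes its threshold from the whole sample.

```python
import numpy as np

def honest_signals(x, lookback=50):
    """Flag days where x is at or below the 5th percentile of its trailing window."""
    out = []
    for t in range(lookback - 1, len(x)):
        window = x[t - lookback + 1 : t + 1]
        if x[t] <= np.percentile(window, 5):
            out.append(t)
    return out

def leaky_signals(x, lookback=50):
    """Same rule, but the threshold comes from the full sample (peeks at the future)."""
    threshold = np.percentile(x, 5)
    return [t for t in range(lookback - 1, len(x)) if x[t] <= threshold]

def same_before_midpoint(signal_fn, x, midpoint):
    """Run signal_fn on clean and corrupted data, compare signals before the midpoint."""
    corrupted = x.copy()
    corrupted[midpoint:] = -1e9            # garbage after the midpoint
    clean = [t for t in signal_fn(x) if t < midpoint]
    dirty = [t for t in signal_fn(corrupted) if t < midpoint]
    return clean == dirty

series = np.sin(2 * np.pi * np.arange(400) / 40)   # deterministic oscillation
honest_ok = same_before_midpoint(honest_signals, series, midpoint=200)
leaky_ok = same_before_midpoint(leaky_signals, series, midpoint=200)
print(f"honest passes: {honest_ok}, leaky passes: {leaky_ok}")
```

The leaky version fails because the garbage drags the full-sample percentile down, which erases signals that fired before the midpoint on clean data.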
See examples/ for runnable scripts:
- 01_quick_start.py: 5 minutes, synthetic data
- 02_synthetic_series.py: Ornstein-Uhlenbeck (mean-reverting) and GBM (trending) as ground truth, to check that the filters classify them correctly
- 03_real_crypto_pair.py: ETH/BTC via ccxt, full pipeline
- 04_walk_forward_explained.py: visual comparison with a naive in-sample backtest
We're explicit about where this fails. See full discussion at pairscan.io/methodology:
- Hurst R/S estimates have high variance and are sensitive to the max_lag choice
- ADF assumes stationary residuals, so structural breaks mislead it
- Tests are descriptive, not predictive
- Sample size matters: < 200 days = noise, < 540 days = use with caution
- Real execution adds slippage, taxes, exchange downtime — none modeled
PRs welcome, especially:
- Performance improvements (vectorization, Numba)
- Additional tests (edge cases, numerical stability)
- Examples on different asset classes (FX, commodities, equities)
See CONTRIBUTING.md.
MIT — do whatever you want, attribution appreciated.
If you use this in research:
@software{pairscan_rmr,
author = {Pairscan},
title = {pairscan-rmr: Ratio mean-reversion testing for asset pairs},
url = {https://github.com/pairscan/ratio-mean-reversion},
year = {2026}
}

- Hurst, H.E. (1951). Long-term storage capacity of reservoirs. Transactions of the American Society of Civil Engineers, 116, 770–799.
- Dickey, D.A. & Fuller, W.A. (1979). Distribution of the Estimators for Autoregressive Time Series with a Unit Root. JASA, 74, 427–431.
- Gatev, E., Goetzmann, W.N. & Rouwenhorst, K.G. (2006). Pairs Trading: Performance of a Relative-Value Arbitrage Rule. Review of Financial Studies, 19(3), 797–827.