critband is the release name for v0.2.3 on GitHub and PyPI.
critband is a Python package for analyzing critical bandwidth and related multimodality structure in kernel density estimation (KDE).
It finds the smallest bandwidth where a KDE transitions from k-modal to fewer modes, and it also exposes related summaries such as bimodality_strength() and excess_mass().
Paper: Zhang, R., & Wang, Q. (2026). critband: Critical Bandwidth Analysis for Multimodal Distributions in Python. arXiv:2605.18686. https://arxiv.org/abs/2605.18686
pip install critbandimport numpy as np
from critband import critical_bandwidth
# Generate bimodal data
x = np.concatenate([np.random.normal(-2, 0.5, 200),
np.random.normal( 2, 0.5, 200)])
h_crit, success = critical_bandwidth(x)
print(f"Critical bandwidth: {h_crit:.4f}")
print(f"Converged: {success}")
# Output: Critical bandwidth: ~0.97
# Converged: True| Function | Description |
|---|---|
silverman_bandwidth(x) |
Silverman's rule-of-thumb bandwidth: 1.06 · min(σ, IQR/1.34) · n^(-1/5) |
gaussian_kde(x, grid, h, use_fft=None) |
KDE via direct O(n·g) or FFT O(g·log(g)) computation; auto-selects FFT for n > 5000 |
critical_bandwidth(x, k=2, return_ci=False) |
Critical bandwidth for mode-count detection; auto/binary/brent methods; optionally returns bootstrap CI |
find_modes(x, h) |
Detect all KDE modes at bandwidth h; returns ModeResult with per-mode position, height, width, prominence |
find_trough(x, h) |
Find valley between the two highest KDE peaks; returns x-coordinate or None |
detect_components(x) |
Decompose bimodal distribution into two Gaussian components; returns BimodalDecomposition |
bimodality_strength(x) |
Comprehensive bimodality assessment; returns interpretable strength label + score in BimodalityStrength |
silverman_test(x) |
Silverman's bootstrap test for bimodality; returns SilvermanTestResult dataclass |
excess_mass(x) |
Müller & Sawitzki excess mass test for multimodality; detects any number of modes, returns ExcessMassResult |
The critical_bandwidth function uses automatic method selection: Brent's method for well-separated data (converges in fewer iterations), binary search for weak/small-n cases. All three methods (auto, binary, brent) produce consistent results. Hartigan's dip test is included as a complementary unimodality check, not as a core differentiator of the package.
The critical bandwidth is the smallest bandwidth
Silverman (1986) proposed a reference bandwidth for Gaussian KDE:
This is used as the baseline for the critical bandwidth search: the solver searches for
The critical bandwidth is found via binary search on KDE mode counts (for any k). The solver:
- Automatically computes bounds from Silverman's rule:
$[h_{\text{silverman}}/20, 10 \cdot h_{\text{silverman}}]$ - Uses
method="auto"(default): Brent's method for large, well-separated data (fewer iterations); binary search for weak/small-n cases - For k > 2,
critical_bandwidth(x, k=3)finds the bandwidth where trimodality disappears, etc. - Optionally returns a bootstrap confidence interval via
return_ci=True
The dip_ratio (via _trough_ratio) is available as a descriptive bimodality strength measure but is not used as the optimizer objective. Hartigan's dip test is included as a complementary unimodality check, not as a core differentiator of the package.
The excess_mass function implements the Müller & Sawitzki (1991) test for multimodality. It measures the amount of probability mass above threshold levels in a KDE, and uses bootstrap calibration to estimate the number of modes — unlike the Silverman test, it can detect any number of modes in a single call.
- Silverman (1981) — The critical bandwidth test for multimodality in kernel density estimates
- Hartigan & Hartigan (1985) — The Dip Test of Unimodality, a complementary non-parametric approach
- Müller & Sawitzki (1991) — Excess mass estimates and tests for multimodality
- Hall & York (2001) — On the calibration of Silverman's test for multimodality, refinements to the original method
- Silverman, B.W. (1986). Density Estimation for Statistics and Data Analysis. Chapman and Hall.
- Hartigan, J.A. & Hartigan, P.M. (1985). The Dip Test of Unimodality. The Annals of Statistics, 13(1), 70-84.
- Müller, D.W. & Sawitzki, G. (1991). Excess Mass Estimates and Tests for Multimodality. JASA, 86(415), 738-746.
- Cheng, M.-Y. & Hall, P. (1998). Calibrating the excess mass and dip tests of modality. Journal of the Royal Statistical Society: Series B, 60(3), 579-589.
# From PyPI
pip install critband
# Or using uv
uv add critband
# Development
git clone git@github.com:ryZhangHason/critband.git
cd critband
uv sync
uv run python -m pytest tests/ -v- Python >= 3.10
- Runtime: numpy, scipy, openpyxl, xlrd, python-docx, pdfplumber
- Test only (not needed at runtime): pytest, pytest-cov, xlwt, reportlab
- Visualization (optional): matplotlib >= 3.5
WS1 calibration has been run on the fixed design grid and is kept as a validation artifact only. The repository now passes the strict full-suite acceptance run, but WS1 remains descriptive rather than threshold-setting.
- Critical bandwidth analysis: One function call computes the critical bandwidth for mode-count structure, and you can use
bimodality_strength()as a descriptive score orexcess_mass()to estimate how many modes your data has. - Mode-count detection:
critical_bandwidth(x, k=3)detects the bandwidth where trimodality disappears and generalizes the mode-count parameter. - Any-file input: 9 formats from a single API. CSV, Excel, PDF, Word, JSON, HTML, Markdown — just point
read_dataat the file and go. - One-command install:
pip install critbandinstalls everything. No system packages, no manual steps, no Tesseract OCR. - Pure Python dependencies: All 4 additional libraries (openpyxl, xlrd, python-docx, pdfplumber) are pure Python — no compiled extensions.
- Built for production and research: Works in CLI scripts and Jupyter notebooks equally well.
import numpy as np
from critband import critical_bandwidth, bimodality_strength, excess_mass
# Unimodal data — the function tells you it's already unimodal
unimodal = np.random.normal(0, 1, 500)
h_crit, success = critical_bandwidth(unimodal)
print(f"Unimodal: h_crit={h_crit:.4f}, converged={success}")
# → success may be False (already unimodal at minimum bandwidth)
# Bimodal data — full pipeline
bimodal = np.concatenate([
np.random.normal(-3, 0.5, 300),
np.random.normal( 3, 0.5, 300)
])
h_crit, success = critical_bandwidth(bimodal)
print(f"Bimodal: h_crit={h_crit:.4f}, converged={success}")
# Interpretable strength assessment
s = bimodality_strength(bimodal)
print(f"Strength: {s.strength} (score={s.strength_score:.2f})")
# Excess mass test — detects number of modes
e = excess_mass(bimodal, n_boot=199)
print(f"Estimated modes: {e.n_modes_estimated}")
# Trimodal data with mode-count detection
trimodal = np.concatenate([
np.random.normal(-4, 0.3, 150),
np.random.normal( 0, 0.3, 150),
np.random.normal( 4, 0.3, 150),
])
h_k3, ok = critical_bandwidth(trimodal, k=3)
print(f"Trimodal (k=3): h_crit={h_k3:.4f}")Generate 3 matplotlib figures that illustrate the core algorithms:
uv sync --group viz
uv run python examples/bandwidth_demo.pyThe demo uses well-separated bimodal data from the benchmark suite and produces:
kde_bandwidth_sweep.png— 2×2 subplots showing KDE at 4 bandwidths (too small, Silverman, critical, too large), with peaks and troughs marked.critical_bandwidth_transition.png— Three overlaid KDE curves: bimodal (h_crit × 0.85), critical (h_crit), and unimodal (h_crit × 1.15), with the trough position annotated.component_decomposition.png— Histogram, KDE, detected Gaussian components (scaled by weight), and the trough-based separation point.
matplotlib is an optional dependency — the core package does not require it.
An interactive Jupyter Notebook is available for hands-on exploration:
uv sync --group viz --group jupyter
uv run jupyter notebook examples/bandwidth_demo.ipynbThe notebook contains 10 cells that walk through the same three visualizations with explanatory markdown, and includes an extension exercise that compares dip ratios across all 7 benchmark cases.
Measure execution time of core functions across data sizes:
uv run python examples/benchmark_performance.pyCustomize the number of runs and data sizes:
uv run python examples/benchmark_performance.py --runs 10 --sizes 100 500 2000
uv run python examples/benchmark_performance.py --output results.csvThis section records validation against R's multimode package across 12 benchmark cases.
| Feature | critband | multimode |
|---|---|---|
| Critical bandwidth | ✅ (k-based mode-count test, k >= 2) | ✅ (mod0-based mode-count test, mod0 >= 1) |
| Mode-count detection | ✅ (threshold parameter k) | ✅ (threshold parameter mod0) |
| Bimodality strength | ✅ (interpretable) | ❌ |
| Excess mass test | ✅ | ✅ |
| Silverman's bootstrap test | ✅ | ✅ |
| Component decomposition | ✅ | ❌ |
| 9-format I/O | ✅ | ❌ |
| Dependencies | Pure Python | R + compiled |
| Case | n | critband h_crit | R modetest p |
Agreement |
|---|---|---|---|---|
| Well-separated | 400 | 1.8650 | 0.000 | ✅ Both detect bimodality |
| Moderate separation | 500 | 1.0964 | 0.000 | ✅ Both detect bimodality |
| Barely separated | 600 | 0.2791 | 0.251 | ✅ critband flags weak bimodality, R agrees (n.s.) |
| Unequal variance | 400 | 1.7849 | 0.000 | ✅ Both detect bimodality |
| Unequal weights | 500 | 1.2591 | 0.000 | ✅ critband correct; R yields spurious 772 modes |
| Extreme separation | 400 | 4.6987 | 0.000 | ✅ Both detect bimodality |
| Trimodal | 450 | 1.3824 | 0.000 | ✅ critband finds 3 modes; R detects multimodality |
| Skewed bimodal | 500 | 1.1417 | 0.000 | ✅ Both detect bimodality |
| Wide-component bimodal | 400 | 2.7109 | 0.000 | ✅ Both detect bimodality |
| Near unimodal | 600 | 0.4186 | 0.055 | ✅ critband flags weak; R agrees (n.s.) |
| Small sample bimodal | 60 | 1.8608 | 0.000 | ✅ Both detect bimodality (small n) |
| Overlapping variances | 500 | 0.4598 | 0.045 | ✅ critband flags weak; R agrees (n.s.) |
Across these benchmark cases, critband achieves <0.5% mean absolute relative error vs high-precision reference values.
| Operation | critband | R (multimode) |
Comparison |
|---|---|---|---|
critical_bandwidth |
0.04–0.79 s | 0.82–1.58 s | 3–10× difference |
find_modes (mode counting) |
<0.01 s | 0.03–0.05 s | Comparable |
The runtime difference reflects critband's pure-Python numerical stack (NumPy/SciPy) and adaptive grid sizing, compared to R's compiled package dispatch overhead with bootstrap calibration (modetest(B = 199)). These timing numbers were collected on Apple M1-class Apple Silicon hardware.
Read data from 9 file formats without learning separate tools:
from critband.io import read_data
# Auto-detect — just point at any supported file
x = read_data("measurements.csv")
x = read_data("survey.xlsx", sheet="Sheet1")
x = read_data("results.pdf", column="Score")
y = read_data("report.docx", column=0, return_all=False)| Format | Extension | Library |
|---|---|---|
| CSV | .csv |
Built-in csv (zero dependencies) |
| TSV / TXT | .tsv, .txt |
Built-in (zero dependencies) |
| JSON | .json |
Built-in json (zero dependencies) |
| Markdown (tables) | .md |
Built-in parser (zero dependencies) |
| HTML (tables) | .html, .htm |
Built-in html.parser (zero dependencies) |
| Excel | .xlsx |
openpyxl |
| Excel (legacy) | .xls |
xlrd |
| Word (tables) | .docx |
python-docx |
| PDF (tables) | .pdf |
pdfplumber |
All dependencies are declared in pyproject.toml — pip install critband installs everything automatically. No system-level tools required.
from critband.io import read_data, read_buffer
# By sheet name (Excel, PDF, Markdown multi-section)
x = read_data("data.xlsx", sheet="Measurements")
# By index
x = read_data("data.xlsx", sheet=0)
# By column name
x = read_data("data.csv", column="Value")
# By column index
x = read_data("data.csv", column=1)
# Get all numerical columns as a dict
cols = read_data("data.csv", return_all=True)
# cols = {"x": array([...]), "y": array([...])}Works with in-memory file buffers for pipeline and service integration:
from critband.io import read_buffer
# From a BytesIO object
buf = io.BytesIO(uploaded_file.read())
x = read_buffer(buf, filename="data.csv")Apache-2.0