FormalSLT is a compact Lean 4 library for the finite-sample statistical learning theory route from empirical risk minimization to VC-style generalization bounds. Recent extensions add contraction, linear predictors, finite sub-Gaussian chaining, algorithmic stability, finite PAC-Bayes confidence bounds, and a first total-bounded finite-net bridge for the Dudley lane.

45 Lean modules. Zero `sorry`. Zero `admit`. Zero custom axioms.

Axioms used by the public theorem spine: `[propext, Classical.choice, Quot.sound]`.
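This axiom claim can be reproduced locally with Lean's built-in axiom audit. The sketch below uses a placeholder theorem name (`FormalSLT.massart_bound` is illustrative, not necessarily the library's actual identifier):

```lean
-- `#print axioms` is a built-in Lean 4 command. Substitute any public
-- theorem name from the library; the placeholder below is hypothetical.
#print axioms FormalSLT.massart_bound
-- A clean result lists only: propext, Classical.choice, Quot.sound
```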
The core path runs from ERM through Rademacher symmetrization, high-probability Rademacher bounds, Massart, Sauer-Shelah, the binary VC bridge, finite contraction, linear predictors, finite sub-Gaussian chaining, finite Dudley entropy-budget wrappers, finite algorithmic stability, finite localized-Rademacher scaffolding, finite PAC-Bayes KL/DV/MGF and bounded-loss confidence bounds, and total-bounded finite-net adapters for the next Dudley steps.
- For ML readers: start with How to read the proofs, then Intuition.
- For proof structure: see Diagrams.
- For exact theorem names: use Theorem map.
- For scope and assumptions: read Assumptions and current boundaries.
- For contributors: read Contributing, then Good first issues.
- For related Lean projects: see Related work. FormalSLT is complementary to YuanheZ/lean-stat-learning-theory, which builds deeper empirical-process infrastructure; this repo focuses on a readable finite-class theorem spine.
Finite-sample proofs in statistical learning theory carry assumptions that are easy to blur in prose. FormalSLT turns a readable finite-sample route into machine-checked Lean statements, so each bound carries its hypotheses, constants, and finite-class scope in the theorem signature. The motivations:
- Reproducibility. Every constant (the `8B²` exponent, the `2 * E[Rad]` factor, the `(en/d)^d` Sauer-Shelah closed form) is a Lean term that the kernel re-checks on every build. There is no implicit "without loss of generality" or "up to constants".
- Curriculum. A graduate student reading the README can click through to the exact Lean statement for any bound they have seen on a chalkboard, with the assumptions made explicit in the type signature.
- Frontier ML. As learning theory increasingly feeds back into modern deep-learning analyses (PAC-Bayes generalization, stability, chaining), having a checked finite scaffold is the first step toward formal guarantees on the methods themselves.
| Family | Main modules | Representative result | Status |
|---|---|---|---|
| Finite-class ERM | `Risk`, `ERM`, `Rademacher.ERMGeneralization` | Excess risk controlled by uniform deviation | Verified |
| Rademacher symmetrization | `GhostSample`, `Rademacher.Symmetrization` | `E[genGap] ≤ 2 * E[Rad]` | Verified |
| High-probability Rademacher | `Azuma.*`, `Rademacher.HighProbability` | `P(genGap ≥ 2 * E[Rad] + ε) ≤ exp(-ε² n / (8B²))` | Verified |
| Massart finite-class bound | `Rademacher.Massart` | `Rad(H,S) ≤ B * sqrt(2 * log card(H) / n)` | Verified |
| Binary VC route | `VC.SauerShelah`, `VC.BinaryVCBridge`, `VC.SampleComplexity` | VC-style ERM excess-risk tail and closed sample-complexity form via effective classes | Verified |
| Finite contraction | `Rademacher.Contraction` | `Rad_S(φ ∘ F) ≤ L * Rad_S(F)` for scalar finite samples/classes | Verified |
| Linear predictors | `Rademacher.LinearPredictor` | `Rad ≤ R * n⁻¹ * sqrt(∑ k, ‖z k‖²)` and `Rad ≤ R * B / sqrt n` | Verified |
| Localized Rademacher scaffold | `Rademacher.Localized` | Bernstein excess-risk localization embeds into second-moment localized empirical Rademacher complexity | Verified finite scaffold |
| Finite covering and two-scale chaining | `Covering.Rademacher`, `Covering.DudleyChaining` | ε-net peeling and two-scale finite chaining | Verified |
| Finite sub-Gaussian chaining foundation | `Covering.FiniteSubGaussianChaining` | Finite-max entropy bounds and finite Dudley-style entropy-budget sums | Verified finite infrastructure |
| Total-bounded Dudley bridge | `Covering.TotalBoundedDudley` | Totally bounded metric spaces yield dyadic finite-net schedules, projected finite-net wrappers, truncated interval-integral entropy comparisons, and supplied-supremum / finite-skeleton / pathwise-modulus / epsilonized boundary adapters | Verified bridge |
| Algorithmic stability | `AlgorithmicStability`, `Stability.BousquetElisseeff` | Bounded-differences constants, finite and product-measure expected-gap wrappers with bound β, bounded-loss measurability adapters, and bounded-loss Azuma-constant concentration wrappers | Verified finite scaffold |
| PAC-Bayes finite confidence layer | `PACBayesKL`, `PACBayesMcAllester`, `PACBayesFiniteProductMGF`, `PACBayesBoundedLoss` | Finite KL/DV change-of-measure, bounded-loss Catoni-style bound, closed PAC-Bayes good-event payoff, fixed-budget McAllester corollary, and finite-grid McAllester peeling wrapper | Verified finite layer |
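To illustrate the "assumptions in the type signature" style of these results, here is a schematic of the Massart finite-class bound as a Lean statement shape. All identifiers (`empiricalRademacher` in particular) are placeholders, not the library's actual API:

```lean
-- Schematic statement shape only; names are illustrative, not real
-- FormalSLT identifiers. Note that finiteness of the class `H`, the
-- uniform bound `B`, and the sample size `n` all appear explicitly,
-- and the constants 2, log (card H), and n are literal in the bound.
example {H : Type} [Fintype H] {n : ℕ} (B : ℝ)
    (f : H → Fin n → ℝ) (hB : ∀ h i, |f h i| ≤ B) : Prop :=
  empiricalRademacher f ≤ B * Real.sqrt (2 * Real.log (Fintype.card H) / n)
```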
The main generalization theorems are intentionally finite and explicit.
| Scope item | Current state |
|---|---|
| Hypothesis classes | Finite index types unless a theorem states a separate finite net/family |
| Samples | Finite iid samples through product measures |
| Losses/processes | Scalar real-valued, with boundedness or finite sub-Gaussian MGF assumptions |
| Constants | High-probability Rademacher bounds use the Azuma 8B² exponent |
| Chaining | Finite nets/images, finite support/outcome spaces, finite entropy sums |
| Public axiom target | `[propext, Classical.choice, Quot.sound]` only |
Short version:
- The main Rademacher and VC results are finite-class / finite-sample theorems.
- High-probability Rademacher bounds use the Azuma `8B²` exponent; the sharper McDiarmid constant is future work.
- The chaining layer proves finite entropy-budget infrastructure and a first total-bounded finite-net extraction bridge, not the continuous Dudley integral.
- PAC-Bayes includes a finite `[0,1]` bounded-loss Catoni-style confidence bound, a closed high-confidence good-event theorem, a fixed-budget McAllester-style square-root corollary, and a finite-grid peeling wrapper for posterior-dependent penalties. Exact all-real-`λ`, infinite-hypothesis, and continuous-posterior variants remain future work.
- Algorithmic stability includes finite iid and measure-theoretic iid expected-gap wrappers, plus bounded-loss high-probability wrappers for finite measurable hypothesis interfaces.
For the full scope statement, see Assumptions and current boundaries.
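As a quick sanity check on the `8B²` exponent (a standard textbook derivation, not a restatement of the library's exact lemma): Azuma-Hoeffding for a martingale with increments bounded by $c_i$ gives

$$
\Pr\left(M_n - M_0 \ge \varepsilon\right) \;\le\; \exp\left(-\frac{\varepsilon^2}{2\sum_{i=1}^{n} c_i^2}\right),
$$

and exposing one sample of a loss bounded by $B$ in absolute value moves the empirical mean by at most $c_i = 2B/n$, so $2\sum_i c_i^2 = 8B^2/n$ and the tail is $\exp(-\varepsilon^2 n / (8B^2))$. The McDiarmid route with the same bounded differences would give $\exp(-2\varepsilon^2 / \sum_i c_i^2) = \exp(-\varepsilon^2 n / (2B^2))$, which is why the sharper constant is listed as future work.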
This project requires elan, the Lean toolchain manager. elan will read `lean-toolchain` and fetch the pinned Lean version automatically.

```
git clone https://github.com/Robby955/lean-statistical-learning.git
cd lean-statistical-learning
lake exe cache get    # download pre-built Mathlib oleans
lake build FormalSLT  # build the library
```

If `lake` is not on your shell path, use the elan binary directly:

```
~/.elan/bin/lake exe cache get
~/.elan/bin/lake build FormalSLT
```

The first build takes a few minutes (downloading the Mathlib cache); after that, builds are incremental.
Run these before treating a branch as a showcase candidate:

```
lake exe cache get
lake build FormalSLT
lake env lean examples/CheckShowcaseTheorems.lean
```

Use the release-candidate checks above, then run the proof-debt and whitespace audits:

```
rg -n --pcre2 '^\s*(?:by\s+)?(?:sorry|admit)\b|:=\s*(?:by\s+)?(?:sorry|admit)\b' FormalSLT examples
rg -n --pcre2 '^\s*(?:axiom|constant)\s+[A-Za-z_]' FormalSLT examples
git diff --check
```

The expected result is:

- `lake build FormalSLT` exits successfully;
- `examples/CheckShowcaseTheorems.lean` prints standard Lean/Mathlib axioms for selected public theorems;
- the `rg` commands find no executable `sorry`, no executable `admit`, and no custom axioms/constants in `FormalSLT` or `examples`;
- `git diff --check` reports no whitespace errors.
| Layer | Modules |
|---|---|
| Core definitions | `Risk`, `ERM`, `UniformConvergence`, `GhostSample` |
| Probability utilities | `Probability.Concentration`, `Probability.FiniteUnionBound`, `Probability.FiniteExpectation` |
| Rademacher route | `Rademacher.FiniteSample`, `Rademacher.FiniteSampleSymmetrization`, `Rademacher.ProbabilityBridge`, `Rademacher.Decoupling`, `Rademacher.Symmetrization`, `Rademacher.Massart`, `Rademacher.HighProbability`, `Rademacher.FiniteClassHighProb`, `Rademacher.UniformDeviation`, `Rademacher.ERMGeneralization`, `Rademacher.Contraction`, `Rademacher.LinearPredictor`, `Rademacher.Localized` |
| Azuma infrastructure | `Azuma.ExposureMartingale`, `Azuma.BoundedDifferences`, `Azuma.BoundedDiffMartingale`, `Azuma.BoundedDiffsAzumaInput`, `Azuma.BoundedIncrementBound`, `Azuma.HasBoundedDifferences`, `Azuma.ExposureIncrementHoeffding`, `Azuma.ExposureIncrementCondMGF`, `Azuma.GenGapTail` |
| VC route | `VC.Dimension`, `VC.PACBridge`, `VC.SauerShelah`, `VC.Rademacher`, `VC.SampleComplexity`, `VC.BinaryVCBridge` |
| Covering and chaining | `Covering.Rademacher`, `Covering.DudleyChaining`, `Covering.FiniteSubGaussianChaining`, `Covering.TotalBoundedDudley` |
| Stability and PAC-Bayes foundations | `AlgorithmicStability`, `Stability.BousquetElisseeff`, `PACBayesKL`, `PACBayesMcAllester`, `PACBayesFiniteProductMGF`, `PACBayesBoundedLoss` |
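Assuming the on-disk module layout mirrors the names above under a `FormalSLT` root (the import paths below are illustrative; check the repository's source tree for the exact layout), a downstream file would import only the layers it needs:

```lean
-- Import paths are assumptions about the directory layout, inferred
-- from the module table above; verify against the FormalSLT/ directory.
import FormalSLT.Rademacher.Massart
import FormalSLT.VC.SauerShelah

-- Downstream proofs can then invoke the finite-class bounds directly.
```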
- Finite-sample Rademacher definitions
- Rademacher symmetrization
- Massart finite-class bound
- Azuma-Hoeffding genGap tail
- High-probability Rademacher bound
- Sauer-Shelah polynomial bound
- VC-style pointwise Rademacher
- VC uniform deviation and ERM excess-risk tail
- VC closed-form ERM sample-complexity theorem
- Binary-class VC to effective loss-pattern bridge
- Finite-sample scalar contraction
- Finite-dimensional linear predictor Rademacher bound
- Finite localized Rademacher/Bernstein variance-localization scaffold
- Covering number peeling and two-scale chaining
- Finite sub-Gaussian max and finite chaining entropy budgets
- Algorithmic stability bounded-differences scaffold
- Finite algorithmic stability expected-gap adapter under coordinate-swap identity
- Finite iid coordinate-swap identity and literal expected-generalization-gap specialization
- Finite iid two-sided algorithmic stability expected-gap wrappers
- PAC-Bayes KL divergence and Donsker-Varadhan variational inequality
- PAC-Bayes finite iid product MGF bridge for empirical-risk deviations
- PAC-Bayes bounded-loss MGF, Markov confidence, and finite Catoni-style bound
- PAC-Bayes closed high-confidence generalization payoff theorem
- PAC-Bayes fixed-budget McAllester-style square-root corollary
- PAC-Bayes finite-grid McAllester peeling and optimized finite-grid wrapper
- Finite Dudley discrete entropy-bound refinements: annulus, integral-budget, and prefix-envelope wrappers
- Total-bounded finite-net extraction bridge for the continuous Dudley lane
- Projected-sup total-bounded dyadic Dudley wrapper
- Projected finite-net total-bounded dyadic Dudley wrapper without finite ambient index type
- Finite dyadic-budget to entropy-at-radius upper-sum comparison
- Shifted finite-annulus entropy budget collapsed to one truncated interval integral
- Supplied-supremum boundary adapter with explicit terminal approximation error
- Finite-skeleton/dense-net boundary adapter with explicit separability and terminal projection errors
- Pathwise-modulus and finite-skeleton witness lemmas that discharge the continuous-boundary adapter hypotheses
- Epsilonized finite-choice boundary adapter: every positive error budget can be discharged by a finite skeleton and terminal dyadic scale certificate
- Finite-cover/pathwise-modulus certificate for the epsilonized total-bounded Dudley boundary layer
- Dudley boundary epsilon elimination under a uniform global finite-budget hypothesis
- Separable-terminal Dudley boundary adapter under explicit finite-skeleton and terminal-projection hypotheses
- Pathwise terminal-modulus constructor for separable-terminal Dudley boundary certificates
- Finite-cover/pathwise-modulus bridge into the separable-terminal Dudley boundary interface
- Finite-terminal total-bounded dyadic Dudley wrapper
- Continuous Dudley entropy-integral theorem over total-bounded classes
- Measure-theoretic iid algorithmic stability expected bound, with explicit integrability assumptions
- Bounded-loss measurability adapters for the measure-theoretic stability expected-gap theorem
- Bounded-loss high-probability stability wrappers for finite measurable hypothesis interfaces
- PAC-Bayes all-real-`λ` or continuous-posterior extensions
- Sharp McDiarmid/product-kernel decomposition
- Continuous Dudley-style entropy integral
Contributions are welcome. Please read CONTRIBUTING.md
before opening a PR. The short version: one theorem per PR, no `sorry` / no
`admit`, only the standard `[propext, Classical.choice, Quot.sound]` axioms,
and assumptions stated in the type signature rather than buried inside the
proof.
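"Assumptions stated in the type signature" means the preferred contribution style looks like the schematic below (placeholder names, not library code): every hypothesis the bound needs is a named argument, so a reader can audit the theorem's scope without opening the proof term.

```lean
-- Schematic contributing style only (hypothetical names). The positivity
-- of `n`, the bound `B`, and the gap hypothesis are all visible in the
-- signature; nothing is introduced mid-proof.
theorem gap_le_budget {n : ℕ} (hn : 0 < n) (B gap : ℝ) (hB : 0 ≤ B)
    (h : gap ≤ 2 * B / Real.sqrt n) :
    gap ≤ 2 * B / Real.sqrt n :=
  h
```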
For ideas, see the open-formalization-problems list, good first issues, and the unchecked items in the roadmap above. Before a public release, maintainers can use the public release checklist.
If you use FormalSLT in academic work, please cite:
```bibtex
@software{formal_slt,
  title  = {FormalSLT: Formal Statistical Learning Theory in Lean 4},
  author = {Sneiderman, Robert},
  year   = {2026},
  url    = {https://github.com/Robby955/FormalSLT},
  note   = {Lean 4 formalization of finite-sample SLT bounds.}
}
```

This project is released under the MIT License.