MathlibLemma

Official repository for the paper
MathlibLemma: Folklore Lemma Generation and Benchmark for Formal Mathematics
Xinyu Liu, Zixuan Xie, Amir Moeini, Claire Chen, Shuze Daniel Liu, Yu Meng, Aidong Zhang, Shangtong Zhang

This repository contains MathlibLemma, a benchmark and proof library for folklore lemmas in Lean 4. The project studies automated folklore mining: discovering, filtering, formalizing, and proving missing intermediate results that are natural to mathematicians but not always available in Mathlib in reusable form.

At the paper level, MathlibLemma contributes:

  • a benchmark of 4028 non-trivial type-checked Lean statements
  • a screened proof library of 1506 Lean-checked proofs
  • a modular pipeline organized around Discovery, Judge, Formalizer, and Prover modules

Reference: arXiv preprint arXiv:2602.02561

Repository Layout

  • MathlibLemma/: the main released artifact view
  • MathlibLemma/Benchmark/: the 4028 benchmark statements, organized into Foundational, Applied, and Abstract
  • MathlibLemma/Proved/ProofLibrary/: the 1506 screened proofs used for the main proof-library count
  • MathlibLemma/Proved/ByModel/: model-specific solved outputs
  • main_pipeline/: the main pipeline code for the core benchmark and proof-library workflow
  • source_batches/: the three source batch trees underlying the main release artifacts
  • supplementary_experiments/: appendix-style follow-up materials, including the explicit-necessary-conditions ablation and the held-out residual provability check
Compared with the earlier public release, the main benchmark and proof-library artifacts now live under MathlibLemma/Benchmark/ and MathlibLemma/Proved/ProofLibrary/ rather than top-level benchmark/ and lemma/ folders.
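
To make the difference between a benchmark statement and a proof-library entry concrete, here is a minimal Lean sketch. It is an illustration only: the theorem names are hypothetical, the use of `sorry` as a placeholder is an assumption about how an unproved statement might be written, and the fact itself already exists in Mathlib as `Finset.card_filter_le`, so it is not one of the released lemmas.

```lean
import Mathlib

-- Hypothetical benchmark-style entry (not a file from this repository):
-- the statement type-checks, but the proof is left open.
theorem card_filter_le_sketch (s : Finset ℕ) (p : ℕ → Prop) [DecidablePred p] :
    (s.filter p).card ≤ s.card := by
  sorry

-- Hypothetical proof-library-style entry: the same statement, now with a
-- proof that Lean checks. `Finset.card_filter_le` is an existing Mathlib lemma.
theorem card_filter_le_sketch' (s : Finset ℕ) (p : ℕ → Prop) [DecidablePred p] :
    (s.filter p).card ≤ s.card :=
  Finset.card_filter_le s p
```

The actual files under MathlibLemma/Benchmark/ and MathlibLemma/Proved/ProofLibrary/ may be structured differently; consult them directly for the released format.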

Lean Usage

This repository includes Lean project metadata (lean-toolchain, lakefile.toml, and lake-manifest.json) so the released files can be checked against a consistent Mathlib environment.

Typical setup:

  1. Install Lean 4 via elan.
  2. Clone the repository:
    git clone https://github.com/Sequential-Intelligence-Lab/MathlibLemma.git
    cd MathlibLemma
  3. Fetch cached Mathlib artifacts:
    lake exe cache get

In practice, the numbered benchmark and proof artifacts are most naturally checked file-by-file. For example:

lake env lean MathlibLemma/Benchmark/Foundational/Data/Fintype/Basic/0.lean
lake env lean MathlibLemma/Proved/ProofLibrary/Foundational/Data/Fintype/Basic/0.lean
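
To check a whole directory rather than a single file, the per-file command above can be wrapped in a small loop. The following is a minimal sketch, not a script shipped with the repository: `check_dir` and the `LEAN_CHECK` override are hypothetical names introduced here, and the checker defaults to the `lake env lean` invocation shown above.

```shell
# Sketch: check every .lean file under a directory, one file at a time.
# LEAN_CHECK is a hypothetical override for the checker command; when unset,
# the function runs `lake env lean <file>` as in the examples above.
check_dir() {
  find "$1" -type f -name '*.lean' | sort | {
    passed=0
    failed=0
    while IFS= read -r file; do
      # Detach stdin so the checker cannot consume the file list.
      if ${LEAN_CHECK:-lake env lean} "$file" </dev/null >/dev/null 2>&1; then
        passed=$((passed + 1))
      else
        failed=$((failed + 1))
        echo "FAILED: $file"
      fi
    done
    echo "passed=$passed failed=$failed"
    # The function returns non-zero if any file failed to check.
    [ "$failed" -eq 0 ]
  }
}
```

For example, after running the setup steps, `check_dir MathlibLemma/Benchmark/Foundational/Data/Fintype/Basic` would check each numbered file in that directory and print a summary line. Checking is sequential, so large directories may take a while.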

The repository also includes .lake/ to make the packaged dependency state easier to reproduce.

Pipeline

At the paper level, the framework is modular rather than monolithic. The four modules are:

  1. Discovery Module: generates candidate folklore lemmas from Mathlib seed files
  2. Judge Module: filters candidates semantically using LLM-as-a-judge
  3. Formalizer Module: repairs syntax and type issues until statements become Lean-compilable
  4. Prover Module: attempts Lean-checked proofs and applies the proof-bypass filter

The main implementation for this workflow is under main_pipeline/. The appendix-style follow-up scripts live under supplementary_experiments/supplementary_pipeline/.

Merged Contributions to Mathlib

A key contribution of this project is the discovery of missing lemmas that have been accepted into Mathlib. Representative merged pull requests include:

  • Mathlib4 PR #32170 adds gronwallBound_mono: gronwallBound is monotone non-decreasing in time x given non-negative parameters
  • Mathlib4 PR #32167 adds Kernel.restrict_const: restricting a constant kernel to a measurable set commutes with restricting the underlying measure
  • Mathlib4 PR #31985 adds centralMoment_congr_ae: central moments agree for almost-everywhere-equal random variables

Contributed by Sequential Intelligence Lab (SIL), University of Virginia.

Citation

If you find this repository useful, please cite the paper:

@article{liu2026mathliblemma,
  title={MathlibLemma: Folklore Lemma Generation and Benchmark for Formal Mathematics},
  author={Xinyu Liu and Zixuan Xie and Amir Moeini and Claire Chen and Shuze Daniel Liu and Yu Meng and Aidong Zhang and Shangtong Zhang},
  year={2026},
  journal={arXiv preprint arXiv:2602.02561}
}

License

This project is licensed under the Apache License 2.0; see LICENSE.
