Mathematician (multidimensional symbolic dynamics) working on AI and formal methods for mathematics, with applications to blockchain and smart contracts.
I am currently operating a shift from mathematics to AI for mathematics which is a continuation of my initial interests in research. When I entered the École Normale Supérieure (ENS Ulm), I was obsessed — more than with mathematics itself — with how the mind works when doing mathematics, and more generally with how to mechanically interpret how the mind works, and how low-level mechanisms relate to high-level introspective interpretations. I explored this through adjacent, more general problems: the hard problem of consciousness, the mind–body problem. Artificial intelligence is a natural field in which to push these questions further. I chose my field of mathematics not because it is central to mathematics, but because in multidimensional dynamical systems there is a natural bridge between judgement structures inherently related to the mind and low-level causal structures deriving from the rules defining the system. This is also a natural angle from which to explore how AI learns to deal with causality. So AI for mathematics is, for me, a way to get back to mainstream research while also following my initial scientific interest.
You can find a list of my publications here. Philosophical essays are available there.
The work below mainly concerns AI for mathematics and formal methods. It is organized around a working assumption — that AI is doing, or will soon do, a large part of mathematics — and around the questions that follow from it:
- Signals for AI agents — where should an autonomous agent look (interestingness), and how do we tell whether it has actually learnt something?
- Modeling mathematical reasoning — treating mathematical practice as a complex state machine, refining the model and at the same time testing tools derived from it.
- Tooling for a world in which AI does most of the math — making sense of what AI produces, encoding mathematicians' mental processes into Lean, and managing formalized libraries at the domain level.
- Automation engines — systems that generate proofs or tactics directly: theorem provers and tactic-generation pipelines, plus evaluation of external provers.
- Applied Lean — applications of Lean and formal verification beyond mathematics, currently to smart contracts and zero-knowledge circuit verification.
- Domain research code — code from my mathematical field (multidimensional symbolic dynamics) that doubles as substrate for the AI-for-math experiments above.
There is also an Adjacent section at the bottom for work outside the math arc (philosophy tooling).
Two complementary kinds of signal for autonomous AI mathematical research: where to look (interestingness) and whether something has actually been learnt.
- topology-mathematical-knowledge — Experiments on the structure of formal mathematics, with a focus on automatically detecting bridges and potential bridges between domains. Bridges are particularly valued in mathematics because they are hard to find. The position of this work in AI for math is to design interestingness signals — telling AI agents where to look — under the assumption that they autonomously search for mathematical problems to tackle. Concretely: Leiden community detection recovers named mathlib modules without supervision, and a bridge-theorem detector recovers 6 / 8 historically-known cross-domain connections. Targeting AI for Math at ICML 2026.
- intepretability-operator-precedence — Falls into the same category, but on the other axis: how do we detect whether a model has actually learnt a rule, using mechanistic interpretability? This helps define proxies, and thus signals, for AI agents that they have actually learnt something — pretty much like for humans (how do we know we have learnt a rule?). Concretely: activation patching and direct logit attribution on operator precedence (BODMAS) in Qwen2.5-Math-7B localize the computation to MLP L22 (computation hub) and head L27H11 (
*→=routing), with the+token largely context-independent.
-
math-reasoning-tools — Based on the idea of modeling mathematical reasoning, with the working hypothesis that it behaves like a complex state machine. The repository simultaneously refines that model and tests tools derived from the modelisation — including tools interacting with the Lean theorem prover, but also tools mimicking other aspects of mathematical practice that are not purely formal.
Concretely, the tools are exposed as MCP servers to Claude Code:
math-compute(SymPy / Z3),math-viz(plots, diagrams),math-search(arXiv / OEIS / MathWorld / Loogle / zbMATH),commutative-diagrams(TikZ / Quiver),proof-explorer(Lean-aware introspection of proof state). Lean-specific tooling lives upstream inoOo0oOo/lean-lsp-mcp— the canonical Lean MCP, to which I contribute.math-reasoning-toolsalso indexes other MCP servers for mathematics, so the broader ecosystem is discoverable from a single entry point. -
graphical-reasoning — Downstream test bed for
math-viz: do graphical-representation tools (plots, diagrams) actually help Claude Code solve problems where visualization is known to help humans? Substrate is the Putnam Competition real-analysis problems (1985–2025), formalized at the statement level in Lean 4. Pilot solutions run under a deliberately restricted tool budget (lean-lsp + math-viz only) so the effect of visualization is measurable rather than confounded. Findings so far: on Putnam 2025 A-2 a single plot caught a false claim in the original Lean statement (the agent would otherwise have spent its budget proving a false theorem); on B-2 a plot confirmed the qualitative direction of a centroid inequality before formalization. -
combinatorial-games-experiments — A tool-granularity ablation, currently on Slitherlink. Three MCP tool sets — many primitives, named local deductions, or a single "suggest next move" — held against each other to ask how much of the reasoning the tools should do versus the model. Metrics: solve rate, tool-call count, tokens, dead-ends, wall time. Aim: an empirical handle on what "the right level of help" looks like; deliberately avoids handing the model a SAT solver, since the point is to remove bookkeeping, not the puzzle.
-
proof-hierarchies — A different angle on the same modeling theme: a Lean 4 proof can be viewed as a tree where each
haveintroduces an intermediate node, and the same theorem admits many such tree representations (split / merge / reorder steps). The research plan probes: (a) does fine-tuning a small theorem prover on proofs rewritten at the right granularity improve generalization? (b) does separating plan generation from gap filling help proof search? (c) does a specific hierarchical form make informalization (formal → NL) easier? Early stage — plan + extractor + rewriter scaffolding.
Once a large part of mathematics is being produced by AI, several supporting needs follow: making sense of what was produced, encoding mathematicians' mental processes into Lean so that more of practice is reachable formally, and managing the resulting formalized libraries at the domain level.
- informalizer — A tool which assumes that a large part of mathematics is done by AI. Once that is the case, how do we make sense of what the AI has done, how do we understand it? This tool serves that purpose: Lean 4 file → Markdown report. Builds a dependency DAG, categorizes declarations (Central Result, Key Lemma, …), and uses the Anthropic API to generate per-declaration explanations and per-file summaries. Tracks per-declaration understanding state (
unknown/learning/known). - formalized-symbolic-dynamics — Two purposes. First, formalizing articles in my field — multidimensional symbolic dynamics — currently centered on Hochman & Meyerovitch's entropy characterization of multidimensional SFTs. Second, a test case for domain-level library management, again under the assumption that AI is doing a lot of the mathematics-proving: automatic uniformization of definitions, reduction to minimal definition sets, standardization, and decomposition into chunks easily integrable into standardized libraries (mathlib).
Systems that produce proofs or tactics directly, as opposed to the tooling that wraps around it.
- tactics_generation — Experiments with tactics generation, also as a way to encode in Lean the mental processes of mathematicians that admit formalization. 2×2 factorial study on LLM tactic generation in Lean: planning vs no planning × live LSP feedback vs none. Finding: live compiler access dominates planning alone; the combined condition is strongest.
- theorem-prover — A toy experiment to see what modern theorem provers look like. RL pipeline (instrument mathlib → record steps → train → prove); built as a learning project rather than a research deliverable.
- theorem-proving-agent — Autonomous Lean 4 coding agent: a compile–repair loop driven by deterministic rewrite rules and (optionally) an LLM, orchestrated via LangGraph with step-wise checkpoints. Optional "innovation" cycles extend a file with thematically relevant new results, and a documentation pass auto-annotates definitions and proofs. Same family as
tactics_generation/theorem-proverabove — older, broader in scope, less of a controlled study. - aristotle_tests — Stress-test of an external autonomous prover (Aristotle) on hard conjectures from my own field — multidimensional symbolic dynamics: Nivat's conjecture, the Furstenberg ×2/×3 conjecture, the hard-square entropy, the Kari–Culik entropy construction, Weiss' conjecture, the odd shift. Useful both as a benchmark and as a way to surface how a prover handles statements that aren't already in mathlib's vicinity.
- sc-verification — Reflects my interest in applications of Lean (and formal verification more broadly: Certora CVL specifications, Hoare-logic style writeups) to Solidity smart contracts. I worked for some time in crypto and find this a particularly interesting application area.
- CompPoly — Fork of
Verified-zkEVM/CompPoly, a Lean 4 library of computable polynomial operations (univariate, multivariate, multilinear, bivariate) withRingEquivs to the Mathlib counterparts. Foundation for zero-knowledge circuit verification. I contribute on the optimization / performance side — the project's current Phase 2 focus: optimized field arithmetic, FFT/NTT multiplication, faster evaluation pathways, with proof-first correctness preserved.
Code from the mathematical field I came from. Not "AI for math" per se — but generates the substrate (random subshifts, simulated configurations) on which experiments above can be run.
- models_of_sft — Random nearest-neighbor 2D SFT generator plus a SAT-based sampler of locally admissible n×n patterns over a finite alphabet. Forbidden dominoes are sampled independently per direction; admissible configurations are produced via a SAT solver with blocking clauses to ensure distinct samples. Output is structured (
patterns/subshift_k/all_patterns.npy+ a centralsamples.json) so the data feeds cleanly into downstream pipelines. - simulations_homshifts — Live simulation of homshift evolution: a 2D array of vertices of a fixed undirected graph, with an update rule that flips a cell only to a vertex graph-adjacent to all four of its neighbors. Live heatmap visualisation makes the formation of frozen regions visible. Small, but a useful intuition-builder for graph-homomorphism shifts.
- co-philosopher — Off the math/AI-for-math thesis but in the same intellectual neighborhood (cf. the philosophy essays linked above). Terminal-based philosophy assistant driving the local Claude Code CLI by default (no API key required): ingests articles and notes (PDF / DOCX / LaTeX / Markdown), extracts concepts and open questions into a queryable SQLite store, proposes coherent article-shaped clusters of notes, drafts in LaTeX with a PhilArchive bibliography, and exposes a vector-searchable journals catalog as a local MCP server.
%%{init: {'theme':'base', 'themeVariables': {
'fontFamily': 'ui-sans-serif, -apple-system, BlinkMacSystemFont, system-ui, sans-serif',
'primaryColor': '#ffffff',
'primaryTextColor': '#1f2937',
'primaryBorderColor': '#cbd5e1',
'lineColor': '#a3a8b8',
'edgeLabelBackground': '#ffffff'
}}}%%
flowchart TB
subgraph SIGNALS["◐ 1 · Signals for AI agents"]
direction TB
TMK("topology-mathematical-knowledge<br/><i>where to look — interestingness</i>")
IOP("intepretability-operator-precedence<br/><i>has it learnt? — proxies</i>")
end
subgraph MODELING["◑ 2 · Modeling mathematical reasoning"]
direction TB
MRT("math-reasoning-tools<br/><i>state-machine model + MCP servers</i>")
GR("graphical-reasoning<br/><i>does visualization help?</i>")
CGE("combinatorial-games-experiments<br/><i>tool-granularity ablation</i>")
PH("proof-hierarchies<br/><i>proofs as `have`-trees</i>")
end
subgraph TOOLING["◒ 3 · Tooling for the AI-does-math regime"]
direction TB
INF("informalizer<br/><i>understand AI output</i>")
FSD("formalized-symbolic-dynamics<br/><i>domain + library management</i>")
end
subgraph ENGINES["◓ 4 · Automation engines"]
direction TB
TG("tactics_generation<br/><i>encode mental processes</i>")
TP("theorem-prover<br/><i>toy prover</i>")
TPA("theorem-proving-agent<br/><i>LangGraph compile–repair agent</i>")
AT("aristotle_tests<br/><i>external-prover eval</i>")
end
subgraph APPLIED["◍ 5 · Applied Lean"]
direction TB
SCV("sc-verification<br/><i>smart contracts</i>")
CP("CompPoly<br/><i>computable polys for ZK</i>")
end
subgraph DOMAIN["◉ 6 · Domain research code"]
direction TB
MOS("models_of_sft<br/><i>SFT sample generator</i>")
SH("simulations_homshifts<br/><i>homshift live sim</i>")
end
SIGNALS -.-> MODELING
MODELING --> TOOLING
MODELING --> ENGINES
DOMAIN -.-> MODELING
DOMAIN -.-> ENGINES
classDef signals fill:#f3edff,stroke:#b388ff,stroke-width:1.5px,color:#3d2466
classDef modeling fill:#e0f7f5,stroke:#4dd0c1,stroke-width:1.5px,color:#0f4a43
classDef tooling fill:#ebf9ec,stroke:#81c784,stroke-width:1.5px,color:#1b5e20
classDef engines fill:#fff1e6,stroke:#ffab40,stroke-width:1.5px,color:#6d3914
classDef applied fill:#ffe6ed,stroke:#ff8fa3,stroke-width:1.5px,color:#6e1b32
classDef domain fill:#e3f7fa,stroke:#4dd0e1,stroke-width:1.5px,color:#0d4a55
class TMK,IOP signals
class MRT,GR,CGE,PH modeling
class INF,FSD tooling
class TG,TP,TPA,AT engines
class SCV,CP applied
class MOS,SH domain
style SIGNALS fill:#faf7ff,stroke:#d4bff7,stroke-width:1px,color:#3d2466
style MODELING fill:#f3fbfa,stroke:#a5e3da,stroke-width:1px,color:#0f4a43
style TOOLING fill:#f5fcf6,stroke:#bce0bf,stroke-width:1px,color:#1b5e20
style ENGINES fill:#fff8f1,stroke:#f5cc94,stroke-width:1px,color:#6d3914
style APPLIED fill:#fff2f5,stroke:#f5b7c4,stroke-width:1px,color:#6e1b32
style DOMAIN fill:#f0fafc,stroke:#a3dde4,stroke-width:1px,color:#0d4a55
| Repo | Role | Lang | One-liner | Status |
|---|---|---|---|---|
| topology-mathematical-knowledge | 1 Signals — where to look | Python | Bridge detection on the mathlib dependency graph; interestingness signals for autonomous search | Paper track — ICML 2026 AI4Math |
| intepretability-operator-precedence | 1 Signals — has it learnt? | Jupyter | Mech-interp proxies for whether a model has learnt a rule (BODMAS in Qwen2.5-Math-7B) | Study complete |
| math-reasoning-tools | 2 Modeling reasoning | Python | State-machine model of mathematical reasoning + MCP servers (compute, viz, search, diagrams, proof-explorer) | Active |
| graphical-reasoning | 2 Modeling — visualization test bed | Lean 4 / Python | Putnam 1985–2025 real-analysis problems + pilot solutions under restricted (lean-lsp + math-viz) tool budget | Active |
| combinatorial-games-experiments | 2 Modeling — tool-granularity ablation | Python | Slitherlink MCP servers at fine/medium/coarse granularity | Active |
| proof-hierarchies | 2 Modeling — proofs as have-trees |
Python / Lean | Proof-tree extractor + rewriter; plan + scaffolding for prover-finetuning / planning / informalization experiments | Plan + scaffolding |
| informalizer | 3 Tooling — understand AI output | Python | Lean files → role-categorized Markdown reports via Claude API | Active |
| formalized-symbolic-dynamics | 3 Tooling — domain + library mgmt | Lean 4 | Symbolic-dynamics articles formalized; domain-level library management workbench | Active, long-running |
| tactics_generation | 4 Automation engines — encode mental processes | Lean / Python | 2×2 study + encoding mathematicians' moves in Lean | Study complete |
| theorem-prover | 4 Automation engines — toy prover | Python / Lean | Toy RL pipeline to see what modern theorem provers look like | Learning project |
| theorem-proving-agent | 4 Automation engines — autonomous Lean agent | Python / Lean | Compile–repair / innovation / documentation loop with LangGraph orchestration | Older, broader scope |
| aristotle_tests | 4 Automation engines — external-prover eval | Lean 4 / TeX | Aristotle on Nivat / Furstenberg ×2×3 / hard-square / Kari–Culik / Weiss / odd-shift | Benchmark dump |
| sc-verification | 5 Applied Lean — smart contracts | Solidity | Formal verification of smart contracts | Experimental |
| CompPoly | 5 Applied Lean — ZK (fork, contributor) | Lean 4 | Computable polynomials over finite fields/rings; foundation for ZK circuit verification | Active (upstream Phase 2 — perf) |
| models_of_sft | 6 Domain code | Python | Random 2D nearest-neighbor SFT + SAT-based pattern sampler | Done |
| simulations_homshifts | 6 Domain code | Python | Live simulation / heatmap of homshift evolution | Done |
| co-philosopher | Adjacent | Python | Terminal philosophy assistant (Claude Code CLI; PhilArchive bibliography; MCP-exposed journals memory) | Active |
oOo0oOo/lean-lsp-mcp— upstream contributions for Lean-related MCP tooling (the canonical Lean MCP). My ownmath-reasoning-toolsis positioned to be complementary, focused on non-Lean tools.google-deepmind/formal-conjectures— formalizing conjectures in Lean 4.leanprover-community/mathlib4— contributing basic library content for symbolic dynamics.Verified-zkEVM/CompPoly— contributing on the Lean optimization side (FFT/NTT-based polynomial arithmetic for ZK circuits). My fork lives atSfgangloff/CompPoly.






