Skip to content
View Sfgangloff's full-sized avatar
🏠
Working from home
🏠
Working from home

Block or report Sfgangloff

Block user

Prevent this user from interacting with your repositories and sending you notifications. Learn more about blocking users.

You must be logged in to block users.

Maximum 250 characters. Please don’t include any personal information such as legal names or email addresses. Markdown is supported. This note will only be visible to you.
Report abuse

Contact GitHub support about this user’s behavior. Learn more about reporting abuse.

Report abuse
sfgangloff/README.md

Mathematician (multidimensional symbolic dynamics) working on AI and formal methods for mathematics, with applications to blockchain and smart contracts.

I am currently operating a shift from mathematics to AI for mathematics which is a continuation of my initial interests in research. When I entered the École Normale Supérieure (ENS Ulm), I was obsessed — more than with mathematics itself — with how the mind works when doing mathematics, and more generally with how to mechanically interpret how the mind works, and how low-level mechanisms relate to high-level introspective interpretations. I explored this through adjacent, more general problems: the hard problem of consciousness, the mind–body problem. Artificial intelligence is a natural field in which to push these questions further. I chose my field of mathematics not because it is central to mathematics, but because in multidimensional dynamical systems there is a natural bridge between judgement structures inherently related to the mind and low-level causal structures deriving from the rules defining the system. This is also a natural angle from which to explore how AI learns to deal with causality. So AI for mathematics is, for me, a way to get back to mainstream research while also following my initial scientific interest.

You can find a list of my publications here. Philosophical essays are available there.

The work below mainly concerns AI for mathematics and formal methods. It is organized around a working assumption — that AI is doing, or will soon do, a large part of mathematics — and around the questions that follow from it:

  1. Signals for AI agents — where should an autonomous agent look (interestingness), and how do we tell whether it has actually learnt something?
  2. Modeling mathematical reasoning — treating mathematical practice as a complex state machine, refining the model and at the same time testing tools derived from it.
  3. Tooling for a world in which AI does most of the math — making sense of what AI produces, encoding mathematicians' mental processes into Lean, and managing formalized libraries at the domain level.
  4. Automation engines — systems that generate proofs or tactics directly: theorem provers and tactic-generation pipelines, plus evaluation of external provers.
  5. Applied Lean — applications of Lean and formal verification beyond mathematics, currently to smart contracts and zero-knowledge circuit verification.
  6. Domain research code — code from my mathematical field (multidimensional symbolic dynamics) that doubles as substrate for the AI-for-math experiments above.

There is also an Adjacent section at the bottom for work outside the math arc (philosophy tooling).


1 · Signals for AI agents — where to look, and has it learnt?

community graph of mathlib

Two complementary kinds of signal for autonomous AI mathematical research: where to look (interestingness) and whether something has actually been learnt.

  • topology-mathematical-knowledge — Experiments on the structure of formal mathematics, with a focus on automatically detecting bridges and potential bridges between domains. Bridges are particularly valued in mathematics because they are hard to find. The position of this work in AI for math is to design interestingness signals — telling AI agents where to look — under the assumption that they autonomously search for mathematical problems to tackle. Concretely: Leiden community detection recovers named mathlib modules without supervision, and a bridge-theorem detector recovers 6 / 8 historically-known cross-domain connections. Targeting AI for Math at ICML 2026.
  • intepretability-operator-precedence — Falls into the same category, but on the other axis: how do we detect whether a model has actually learnt a rule, using mechanistic interpretability? This helps define proxies, and thus signals, for AI agents that they have actually learnt something — pretty much like for humans (how do we know we have learnt a rule?). Concretely: activation patching and direct logit attribution on operator precedence (BODMAS) in Qwen2.5-Math-7B localize the computation to MLP L22 (computation hub) and head L27H11 (*= routing), with the + token largely context-independent.

2 · Modeling mathematical reasoning

  • math-reasoning-tools — Based on the idea of modeling mathematical reasoning, with the working hypothesis that it behaves like a complex state machine. The repository simultaneously refines that model and tests tools derived from the modelisation — including tools interacting with the Lean theorem prover, but also tools mimicking other aspects of mathematical practice that are not purely formal.

    Concretely, the tools are exposed as MCP servers to Claude Code: math-compute (SymPy / Z3), math-viz (plots, diagrams), math-search (arXiv / OEIS / MathWorld / Loogle / zbMATH), commutative-diagrams (TikZ / Quiver), proof-explorer (Lean-aware introspection of proof state). Lean-specific tooling lives upstream in oOo0oOo/lean-lsp-mcp — the canonical Lean MCP, to which I contribute. math-reasoning-tools also indexes other MCP servers for mathematics, so the broader ecosystem is discoverable from a single entry point.

  • graphical-reasoning — Downstream test bed for math-viz: do graphical-representation tools (plots, diagrams) actually help Claude Code solve problems where visualization is known to help humans? Substrate is the Putnam Competition real-analysis problems (1985–2025), formalized at the statement level in Lean 4. Pilot solutions run under a deliberately restricted tool budget (lean-lsp + math-viz only) so the effect of visualization is measurable rather than confounded. Findings so far: on Putnam 2025 A-2 a single plot caught a false claim in the original Lean statement (the agent would otherwise have spent its budget proving a false theorem); on B-2 a plot confirmed the qualitative direction of a centroid inequality before formalization.

  • combinatorial-games-experiments — A tool-granularity ablation, currently on Slitherlink. Three MCP tool sets — many primitives, named local deductions, or a single "suggest next move" — held against each other to ask how much of the reasoning the tools should do versus the model. Metrics: solve rate, tool-call count, tokens, dead-ends, wall time. Aim: an empirical handle on what "the right level of help" looks like; deliberately avoids handing the model a SAT solver, since the point is to remove bookkeeping, not the puzzle.

  • proof-hierarchies — A different angle on the same modeling theme: a Lean 4 proof can be viewed as a tree where each have introduces an intermediate node, and the same theorem admits many such tree representations (split / merge / reorder steps). The research plan probes: (a) does fine-tuning a small theorem prover on proofs rewritten at the right granularity improve generalization? (b) does separating plan generation from gap filling help proof search? (c) does a specific hierarchical form make informalization (formal → NL) easier? Early stage — plan + extractor + rewriter scaffolding.

3 · Tooling for a world in which mathematical proofs are mostly written by AI

Once a large part of mathematics is being produced by AI, several supporting needs follow: making sense of what was produced, encoding mathematicians' mental processes into Lean so that more of practice is reachable formally, and managing the resulting formalized libraries at the domain level.

  • informalizer — A tool which assumes that a large part of mathematics is done by AI. Once that is the case, how do we make sense of what the AI has done, how do we understand it? This tool serves that purpose: Lean 4 file → Markdown report. Builds a dependency DAG, categorizes declarations (Central Result, Key Lemma, …), and uses the Anthropic API to generate per-declaration explanations and per-file summaries. Tracks per-declaration understanding state (unknown / learning / known).
  • formalized-symbolic-dynamics — Two purposes. First, formalizing articles in my field — multidimensional symbolic dynamics — currently centered on Hochman & Meyerovitch's entropy characterization of multidimensional SFTs. Second, a test case for domain-level library management, again under the assumption that AI is doing a lot of the mathematics-proving: automatic uniformization of definitions, reduction to minimal definition sets, standardization, and decomposition into chunks easily integrable into standardized libraries (mathlib).

4 · Automation engines

Systems that produce proofs or tactics directly, as opposed to the tooling that wraps around it.

  • tactics_generation — Experiments with tactics generation, also as a way to encode in Lean the mental processes of mathematicians that admit formalization. 2×2 factorial study on LLM tactic generation in Lean: planning vs no planning × live LSP feedback vs none. Finding: live compiler access dominates planning alone; the combined condition is strongest.
  • theorem-prover — A toy experiment to see what modern theorem provers look like. RL pipeline (instrument mathlib → record steps → train → prove); built as a learning project rather than a research deliverable.
  • theorem-proving-agent — Autonomous Lean 4 coding agent: a compile–repair loop driven by deterministic rewrite rules and (optionally) an LLM, orchestrated via LangGraph with step-wise checkpoints. Optional "innovation" cycles extend a file with thematically relevant new results, and a documentation pass auto-annotates definitions and proofs. Same family as tactics_generation / theorem-prover above — older, broader in scope, less of a controlled study.
  • aristotle_tests — Stress-test of an external autonomous prover (Aristotle) on hard conjectures from my own field — multidimensional symbolic dynamics: Nivat's conjecture, the Furstenberg ×2/×3 conjecture, the hard-square entropy, the Kari–Culik entropy construction, Weiss' conjecture, the odd shift. Useful both as a benchmark and as a way to surface how a prover handles statements that aren't already in mathlib's vicinity.

5 · Applied Lean

  • sc-verification — Reflects my interest in applications of Lean (and formal verification more broadly: Certora CVL specifications, Hoare-logic style writeups) to Solidity smart contracts. I worked for some time in crypto and find this a particularly interesting application area.
  • CompPoly — Fork of Verified-zkEVM/CompPoly, a Lean 4 library of computable polynomial operations (univariate, multivariate, multilinear, bivariate) with RingEquivs to the Mathlib counterparts. Foundation for zero-knowledge circuit verification. I contribute on the optimization / performance side — the project's current Phase 2 focus: optimized field arithmetic, FFT/NTT multiplication, faster evaluation pathways, with proof-first correctness preserved.

6 · Domain research code (multidimensional symbolic dynamics)

homshift evolution heatmap

Code from the mathematical field I came from. Not "AI for math" per se — but generates the substrate (random subshifts, simulated configurations) on which experiments above can be run.

  • models_of_sft — Random nearest-neighbor 2D SFT generator plus a SAT-based sampler of locally admissible n×n patterns over a finite alphabet. Forbidden dominoes are sampled independently per direction; admissible configurations are produced via a SAT solver with blocking clauses to ensure distinct samples. Output is structured (patterns/subshift_k/all_patterns.npy + a central samples.json) so the data feeds cleanly into downstream pipelines.
  • simulations_homshifts — Live simulation of homshift evolution: a 2D array of vertices of a fixed undirected graph, with an update rule that flips a cell only to a vertex graph-adjacent to all four of its neighbors. Live heatmap visualisation makes the formation of frozen regions visible. Small, but a useful intuition-builder for graph-homomorphism shifts.

Adjacent — outside the math arc

  • co-philosopher — Off the math/AI-for-math thesis but in the same intellectual neighborhood (cf. the philosophy essays linked above). Terminal-based philosophy assistant driving the local Claude Code CLI by default (no API key required): ingests articles and notes (PDF / DOCX / LaTeX / Markdown), extracts concepts and open questions into a queryable SQLite store, proposes coherent article-shaped clusters of notes, drafts in LaTeX with a PhilArchive bibliography, and exposes a vector-searchable journals catalog as a local MCP server.

How the repos relate

%%{init: {'theme':'base', 'themeVariables': {
    'fontFamily': 'ui-sans-serif, -apple-system, BlinkMacSystemFont, system-ui, sans-serif',
    'primaryColor': '#ffffff',
    'primaryTextColor': '#1f2937',
    'primaryBorderColor': '#cbd5e1',
    'lineColor': '#a3a8b8',
    'edgeLabelBackground': '#ffffff'
}}}%%
flowchart TB
    subgraph SIGNALS["◐  1 · Signals for AI agents"]
        direction TB
        TMK("topology-mathematical-knowledge<br/><i>where to look — interestingness</i>")
        IOP("intepretability-operator-precedence<br/><i>has it learnt? — proxies</i>")
    end

    subgraph MODELING["◑  2 · Modeling mathematical reasoning"]
        direction TB
        MRT("math-reasoning-tools<br/><i>state-machine model + MCP servers</i>")
        GR("graphical-reasoning<br/><i>does visualization help?</i>")
        CGE("combinatorial-games-experiments<br/><i>tool-granularity ablation</i>")
        PH("proof-hierarchies<br/><i>proofs as `have`-trees</i>")
    end

    subgraph TOOLING["◒  3 · Tooling for the AI-does-math regime"]
        direction TB
        INF("informalizer<br/><i>understand AI output</i>")
        FSD("formalized-symbolic-dynamics<br/><i>domain + library management</i>")
    end

    subgraph ENGINES["◓  4 · Automation engines"]
        direction TB
        TG("tactics_generation<br/><i>encode mental processes</i>")
        TP("theorem-prover<br/><i>toy prover</i>")
        TPA("theorem-proving-agent<br/><i>LangGraph compile–repair agent</i>")
        AT("aristotle_tests<br/><i>external-prover eval</i>")
    end

    subgraph APPLIED["◍  5 · Applied Lean"]
        direction TB
        SCV("sc-verification<br/><i>smart contracts</i>")
        CP("CompPoly<br/><i>computable polys for ZK</i>")
    end

    subgraph DOMAIN["◉  6 · Domain research code"]
        direction TB
        MOS("models_of_sft<br/><i>SFT sample generator</i>")
        SH("simulations_homshifts<br/><i>homshift live sim</i>")
    end

    SIGNALS -.-> MODELING
    MODELING --> TOOLING
    MODELING --> ENGINES
    DOMAIN -.-> MODELING
    DOMAIN -.-> ENGINES

    classDef signals  fill:#f3edff,stroke:#b388ff,stroke-width:1.5px,color:#3d2466
    classDef modeling fill:#e0f7f5,stroke:#4dd0c1,stroke-width:1.5px,color:#0f4a43
    classDef tooling  fill:#ebf9ec,stroke:#81c784,stroke-width:1.5px,color:#1b5e20
    classDef engines  fill:#fff1e6,stroke:#ffab40,stroke-width:1.5px,color:#6d3914
    classDef applied  fill:#ffe6ed,stroke:#ff8fa3,stroke-width:1.5px,color:#6e1b32
    classDef domain   fill:#e3f7fa,stroke:#4dd0e1,stroke-width:1.5px,color:#0d4a55

    class TMK,IOP signals
    class MRT,GR,CGE,PH modeling
    class INF,FSD tooling
    class TG,TP,TPA,AT engines
    class SCV,CP applied
    class MOS,SH domain

    style SIGNALS  fill:#faf7ff,stroke:#d4bff7,stroke-width:1px,color:#3d2466
    style MODELING fill:#f3fbfa,stroke:#a5e3da,stroke-width:1px,color:#0f4a43
    style TOOLING  fill:#f5fcf6,stroke:#bce0bf,stroke-width:1px,color:#1b5e20
    style ENGINES  fill:#fff8f1,stroke:#f5cc94,stroke-width:1px,color:#6d3914
    style APPLIED  fill:#fff2f5,stroke:#f5b7c4,stroke-width:1px,color:#6e1b32
    style DOMAIN   fill:#f0fafc,stroke:#a3dde4,stroke-width:1px,color:#0d4a55
Loading

Repo index

Repo Role Lang One-liner Status
topology-mathematical-knowledge 1 Signals — where to look Python Bridge detection on the mathlib dependency graph; interestingness signals for autonomous search Paper track — ICML 2026 AI4Math
intepretability-operator-precedence 1 Signals — has it learnt? Jupyter Mech-interp proxies for whether a model has learnt a rule (BODMAS in Qwen2.5-Math-7B) Study complete
math-reasoning-tools 2 Modeling reasoning Python State-machine model of mathematical reasoning + MCP servers (compute, viz, search, diagrams, proof-explorer) Active
graphical-reasoning 2 Modeling — visualization test bed Lean 4 / Python Putnam 1985–2025 real-analysis problems + pilot solutions under restricted (lean-lsp + math-viz) tool budget Active
combinatorial-games-experiments 2 Modeling — tool-granularity ablation Python Slitherlink MCP servers at fine/medium/coarse granularity Active
proof-hierarchies 2 Modeling — proofs as have-trees Python / Lean Proof-tree extractor + rewriter; plan + scaffolding for prover-finetuning / planning / informalization experiments Plan + scaffolding
informalizer 3 Tooling — understand AI output Python Lean files → role-categorized Markdown reports via Claude API Active
formalized-symbolic-dynamics 3 Tooling — domain + library mgmt Lean 4 Symbolic-dynamics articles formalized; domain-level library management workbench Active, long-running
tactics_generation 4 Automation engines — encode mental processes Lean / Python 2×2 study + encoding mathematicians' moves in Lean Study complete
theorem-prover 4 Automation engines — toy prover Python / Lean Toy RL pipeline to see what modern theorem provers look like Learning project
theorem-proving-agent 4 Automation engines — autonomous Lean agent Python / Lean Compile–repair / innovation / documentation loop with LangGraph orchestration Older, broader scope
aristotle_tests 4 Automation engines — external-prover eval Lean 4 / TeX Aristotle on Nivat / Furstenberg ×2×3 / hard-square / Kari–Culik / Weiss / odd-shift Benchmark dump
sc-verification 5 Applied Lean — smart contracts Solidity Formal verification of smart contracts Experimental
CompPoly 5 Applied Lean — ZK (fork, contributor) Lean 4 Computable polynomials over finite fields/rings; foundation for ZK circuit verification Active (upstream Phase 2 — perf)
models_of_sft 6 Domain code Python Random 2D nearest-neighbor SFT + SAT-based pattern sampler Done
simulations_homshifts 6 Domain code Python Live simulation / heatmap of homshift evolution Done
co-philosopher Adjacent Python Terminal philosophy assistant (Claude Code CLI; PhilArchive bibliography; MCP-exposed journals memory) Active

External contributions

Popular repositories Loading

  1. theorem-prover theorem-prover Public

    Python 1

  2. informalizer informalizer Public

    Python 1

  3. simulations_homshifts simulations_homshifts Public

    Python

  4. mathlib4 mathlib4 Public

    Forked from leanprover-community/mathlib4

    The math library of Lean 4

    Lean

  5. models_of_sft models_of_sft Public

    Python

  6. theorem-proving-agent theorem-proving-agent Public

    Python