feat(examples): mnist_mlp — PyTorch + C MNIST dense-MLP parity demo #253
Merged
What
Adds `examples/mnist_mlp/`, a self-contained MNIST dense-MLP classifier demo with exact PyTorch ↔ C bit-parity, plus a shared `examples/_shared/mnist_data.py` loader. This is the first of two MNIST examples (the conv twin `mnist_cnn` follows in a separate PR) and replaces the deleted legacy `example/MnistExperiment` with the factory layer API.

Mirrors the canonical `har_classifier/` pattern: `prepare_data.py` → `train_pytorch.py` (reference + per-layer weight export) → `train_c.c` (factory-API trainer, two modes) → `compare.py` (informational) → `README.md` + `CMakeLists.txt`, wired into the `c-bit-parity` CI job.

Model: `Flatten[1,28,28]` → `Linear(784→64)` → `ReLU` → `Linear(64→10)` → `Softmax`, CrossEntropy (~51 K params). The framework is 1D-only (no Conv2d); the MLP's `flatten` consumes `[1,28,28]` directly (the channel-1 dimension acts as batch), so no loader reshape is needed, matching the legacy MLP.

Verification
- `BIT_PARITY=1` loads PyTorch weights via `StateDictApi` (`npyLoadFlat`, per #177, "v2 examples (HAR/ECG) forwards not parity-faithful: bit-parity diff fails after the crash fix") and runs inference only.
- `train_c.c` is `clang-format-21` clean.

Notes
- The training comparison (`compare.py`) is informational, not a gate: the two runs use independent random init, and the known C-vs-PyTorch training divergence is deferred. On full MNIST the C demo runs ~75 min (the framework trains one sample at a time over 54k samples); the README documents this and points at the fast bit-parity check instead.
- Spec: `docs/superpowers/specs/2026-06-26-mnist-examples-design.md`; plan: `docs/superpowers/plans/2026-06-26-mnist-examples.md` (both gitignored scratch).

🤖 Generated with Claude Code
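For reviewers who want to sanity-check the "~51 K params" figure and the flatten behaviour described under Model, here is a minimal sketch of the shape arithmetic. It is not code from this PR: the helper names `flatten_shape` and `linear_param_count` are hypothetical, and it assumes each `Linear` layer carries one bias per output unit.

```python
from math import prod

# Shapes taken from the PR description.
INPUT_SHAPE = (1, 28, 28)            # channel-1 dimension acts as batch
LAYER_SIZES = [(784, 64), (64, 10)]  # Linear(784→64), Linear(64→10)


def flatten_shape(shape):
    """Keep the leading dimension as batch and collapse the rest."""
    batch, *rest = shape
    return (batch, prod(rest))


def linear_param_count(in_features, out_features):
    """Weight matrix plus one bias per output unit (assumed)."""
    return in_features * out_features + out_features


flat = flatten_shape(INPUT_SHAPE)
total = sum(linear_param_count(i, o) for i, o in LAYER_SIZES)

print(flat)   # (1, 784)
print(total)  # 50890
```

Under these assumptions the flattened input already matches the 784 inputs of the first `Linear` layer, which is consistent with the note that no loader reshape is needed.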
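To illustrate what "bit-parity" means as opposed to a tolerance-based comparison, the sketch below compares two lists of outputs by their float32 bit patterns. It is only an illustration: the actual `c-bit-parity` CI job may work differently, the function names are hypothetical, and the sample values are made up for the example.

```python
import struct


def float32_bits(value):
    """Return the IEEE 754 binary32 bit pattern of value as an integer."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def first_bit_mismatch(reference, candidate):
    """Index of the first element whose float32 bits differ, or None."""
    if len(reference) != len(candidate):
        raise ValueError("length mismatch")
    for index, (a, b) in enumerate(zip(reference, candidate)):
        if float32_bits(a) != float32_bits(b):
            return index
    return None


# Made-up sample outputs, not results from the demo.
reference_probs = [0.1, 0.7, 0.2]
identical_probs = [0.1, 0.7, 0.2]
drifted_probs = [0.1, 0.7000001, 0.2]  # within 1e-6, but different bits

print(first_bit_mismatch(reference_probs, identical_probs))  # None
print(first_bit_mismatch(reference_probs, drifted_probs))    # 1
```

The drifted value would pass a typical `abs(a - b) < 1e-6` check, but it fails the bitwise one. This is why a bit-parity check can act as a strict gate while a tolerance-based training comparison stays informational.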