feat(examples): mnist_cnn — PyTorch + C MNIST 1D-CNN parity demo #254

Merged: LeoBuron merged 6 commits into develop from examples-mnist-cnn on Jun 26, 2026

@LeoBuron

Member

What

Adds examples/mnist_cnn/ — the 1D-CNN twin of the merged mnist_mlp example, on the same MNIST data and harness. Together they showcase the framework across two architectures (dense vs convolutional) on one canonical dataset, with everything else held constant.

The framework is 1D-only (no Conv2d), so the CNN treats each [1,28,28] image as a length-784 single-channel 1D signal. Since flatten only emits 2D and there is no view/reshape layer, the [1,28,28]→[1,1,784] reinterpretation is a loader-side shape_t surgery (reshapeItemsToConv1d) — documented in the README as a known framework gap.
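
The reinterpretation described above can be sketched in plain Python. This is an illustration only, not the framework's reshapeItemsToConv1d: it assumes the image buffer is stored row-major (C order), which the PR does not state, and `flat_index`, `image`, and `signal` are hypothetical names. Under that assumption, changing the shape from [1,28,28] to [1,1,784] touches only the shape metadata, and pixel (row, col) ends up at position row*28 + col of the 1D signal.

```python
# Illustrative sketch: plain Python lists stand in for the framework's tensor buffer.
# Assumption (not confirmed by the PR): the buffer is row-major (C order).
ROWS, COLS = 28, 28


def flat_index(row: int, col: int) -> int:
    """Offset of pixel (row, col) inside a row-major ROWS x COLS buffer."""
    return row * COLS + col


# A fake image whose pixel value encodes its own position (row * 100 + col).
image = [[r * 100 + c for c in range(COLS)] for r in range(ROWS)]

# Row-major storage of a [1, 28, 28] item: one contiguous run of 784 values.
buffer = [pixel for row in image for pixel in row]

# Reinterpreting as [1, 1, 784] changes only the shape; the data is untouched.
old_shape = (1, ROWS, COLS)
new_shape = (1, 1, ROWS * COLS)
signal = buffer  # same storage, viewed as a length-784 single-channel signal

# Every 2D pixel is reachable at its row-major offset in the 1D signal.
assert all(
    signal[flat_index(r, c)] == image[r][c]
    for r in range(ROWS)
    for c in range(COLS)
)
```

One consequence of this layout is that vertically adjacent pixels sit 28 positions apart in the signal, so a K=3 convolution only sees horizontal neighbours.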

Model (~600 params): reshape → Conv1d(1→8, K3, SAME) → ReLU → MaxPool(2) → Conv1d(8→16, K3, SAME) → ReLU → MaxPool(2) → global AvgPool1d(196) → Flatten → Linear(16→10) → Softmax, trained with CrossEntropy loss.
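
As a sanity check on the layer chain, the sequence lengths and the parameter count can be worked out with a few lines of arithmetic. This is a sketch under two assumptions the PR does not spell out: every Conv1d and Linear layer carries a bias, and MaxPool(2) uses a stride equal to its kernel size. The helper names are hypothetical.

```python
def conv1d_same_out_len(length: int, kernel: int) -> int:
    """Output length of a stride-1 convolution with SAME padding (odd kernel)."""
    pad = (kernel - 1) // 2          # K=3 gives pad=1 on each side
    return length + 2 * pad - kernel + 1


def conv1d_params(c_in: int, c_out: int, kernel: int) -> int:
    """Weights plus one bias per output channel (bias is an assumption)."""
    return c_out * c_in * kernel + c_out


def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus one bias per output unit (bias is an assumption)."""
    return n_out * n_in + n_out


# Track the sequence length through the stack.
length = 784                               # reshaped [1, 1, 784] input
length = conv1d_same_out_len(length, 3)    # SAME padding keeps 784
length //= 2                               # MaxPool(2): 392
length = conv1d_same_out_len(length, 3)    # SAME padding keeps 392
length //= 2                               # MaxPool(2): 196
pooled_length = length // 196              # global AvgPool1d(196): 1

total_params = (
    conv1d_params(1, 8, 3)      # 32
    + conv1d_params(8, 16, 3)   # 400
    + linear_params(16, 10)     # 170
)
print(length, pooled_length, total_params)
```

With those assumptions the total comes to 602, consistent with the "~600 params" figure, and the length of 196 after the second pool explains the AvgPool1d(196) window.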

Verification

  • Bit-parity gate (the CI gate): C predictions int32 bit-identical to PyTorch, 10000/10000. Confirms the reshape, Conv1d SAME padding (matches PyTorch padding=1 for K=3/stride=1), MaxPool, global AvgPool, Flatten, and Linear all reproduce PyTorch exactly.
  • C unit tests 62/62, pytest 22 passed, all examples build (rot guard), train_c.c clang-format-21 clean.
  • Final whole-branch review (Opus): READY TO MERGE, no Critical/Important — reshape memory lifecycle, buildModel indices, positional state-dict, and prediction-loop ownership all verified.
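
A bit-parity gate of the kind described in the first bullet can be sketched as follows. This is not the repository's actual check: the function names, the little-endian int32 encoding, and the sample predictions are assumptions made for illustration. The idea is to compare the two prediction lists as raw int32 bytes and pass only when every entry is identical.

```python
import struct


def pack_int32(preds: list[int]) -> bytes:
    """Serialize class predictions as little-endian int32 values."""
    return struct.pack(f"<{len(preds)}i", *preds)


def parity_report(torch_preds: list[int], c_preds: list[int]) -> tuple[int, int]:
    """Return (bit-identical predictions, total predictions)."""
    if len(torch_preds) != len(c_preds):
        raise ValueError("prediction lists differ in length")
    a, b = pack_int32(torch_preds), pack_int32(c_preds)
    # Compare one 4-byte int32 at a time.
    matches = sum(a[i:i + 4] == b[i:i + 4] for i in range(0, len(a), 4))
    return matches, len(torch_preds)


def parity_gate(torch_preds: list[int], c_preds: list[int]) -> bool:
    """Pass only when every prediction is bit-identical."""
    matches, total = parity_report(torch_preds, c_preds)
    return matches == total


# Made-up predictions, purely for demonstration.
reference = [7, 2, 1, 0, 4]
identical = [7, 2, 1, 0, 4]
one_off = [7, 2, 1, 0, 9]

print(parity_report(reference, identical), parity_gate(reference, identical))
print(parity_report(reference, one_off), parity_gate(reference, one_off))
```

A gate of this form is all-or-nothing: a single differing prediction fails it, which is why a 10000/10000 result is a stronger statement than an accuracy match.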

Notes

  • Test accuracy is intentionally low (~45%): the tiny global-average-pool head over a flattened pixel stream is a deliberately memory-light demo of the conv stack, not an accuracy showcase. A flatten head would reach ~97% (~31 K params) — a knob we can flip if a more impressive demo is wanted. Accuracy is not the gate; bit-parity is.
  • The train-from-scratch compare.py is informational (not a gate) and slow (~75 min on full MNIST, framework trains one sample at a time); the README points at the fast bit-parity check.
  • Spec/plan: docs/superpowers/{specs,plans}/2026-06-26-mnist-examples* (gitignored scratch).
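
The memory trade-off behind the first note can be made concrete by counting parameters for the two candidate heads. The sketch assumes biased layers and that the flatten head would feed all 16 channels × 196 positions into a single Linear layer; neither detail is confirmed by the PR, and the variable names are hypothetical.

```python
CHANNELS = 16    # output channels of the second Conv1d
LENGTH = 196     # sequence length after two MaxPool(2) stages
CLASSES = 10

# Conv stack shared by both variants: Conv1d(1→8, K3) + Conv1d(8→16, K3), with biases.
conv_params = (8 * 1 * 3 + 8) + (16 * 8 * 3 + 16)


def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer (bias is an assumption)."""
    return n_in * n_out + n_out


# Global-average-pool head: one value per channel reaches the classifier.
gap_head = linear_params(CHANNELS, CLASSES)

# Flatten head: every channel at every position reaches the classifier.
flatten_head = linear_params(CHANNELS * LENGTH, CLASSES)

gap_total = conv_params + gap_head
flatten_total = conv_params + flatten_head
print(gap_total, flatten_total)
```

Under these assumptions the flatten variant comes to 31,802 parameters against 602 for the pooled head, so nearly all of the extra capacity sits in the final Linear layer.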

🤖 Generated with Claude Code

@LeoBuron LeoBuron merged commit 39a2bd0 into develop Jun 26, 2026
7 checks passed
@LeoBuron LeoBuron deleted the examples-mnist-cnn branch June 26, 2026 13:42