Design spec: Apple Silicon port (simde + Accelerate) #2

Closed
christhechris wants to merge 2 commits into main from claude/wonderful-neumann-091991

@christhechris

Summary

  • Adds a design spec for porting digHolo to Apple Silicon (arm64 macOS) using simde (AVX2/FMA3 → NEON translation) and Apple Accelerate (BLAS/LAPACK replacement for MKL).
  • Scope A: local dev build only — no CI changes, no release artefacts, no Python wheels. These are explicit follow-ups.
  • All changes are gated behind `APPLE AND arm64` so Linux/Windows x86-64 builds are byte-unchanged.

Why

macOS is currently blocked at configure time (CMakeLists.txt:54-59) because the SIMD paths hard-require AVX2/FMA3 and MKL isn't redistributed for modern macOS. This PR lays out the minimal-surface path to a working build on Apple Silicon:

  • ~912 x86 intrinsic call sites in src/digHolo.cpp stay untouched — simde translates at compile time.
  • LAPACK/CBLAS call sites (`cgesvd`, `sgels`, `cgemv`, `cgemm`) stay untouched — Accelerate exposes standard symbols.
  • Two narrow source edits: a new `digholo_simd_compat.h` include shim and an `Accelerate`/`MKL`/generic-CBLAS dispatch branch.
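The two source edits described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the contents of the spec: the macros `DIGHOLO_USE_SIMDE` and `DIGHOLO_USE_MKL` and the helpers `simdBackend()` and `blasBackend()` are hypothetical, and `DIGHOLO_USE_ACCELERATE` is assumed here to be visible as a preprocessor macro. The simde headers, `SIMDE_ENABLE_NATIVE_ALIASES`, and `ACCELERATE_NEW_LAPACK` are the real names those libraries use.

```cpp
// Hypothetical sketch of digholo_simd_compat.h plus the BLAS/LAPACK dispatch.
#include <string>

// Assumption: on Apple Silicon both switches are turned on automatically.
#if defined(__APPLE__) && defined(__aarch64__)
  #define DIGHOLO_USE_SIMDE 1        // hypothetical macro name
  #define DIGHOLO_USE_ACCELERATE 1   // assumed to be exposed as a macro
#endif

// --- SIMD shim: existing _mm256_* call sites compile unchanged ---
#if defined(DIGHOLO_USE_SIMDE)
  #if __has_include(<simde/x86/avx2.h>)
    // Lets simde provide the native _mm256_* names, backed by NEON.
    #define SIMDE_ENABLE_NATIVE_ALIASES
    #include <simde/x86/avx2.h>
    #include <simde/x86/fma.h>
  #endif
  inline std::string simdBackend() { return "simde"; }
#else
  #if __has_include(<immintrin.h>)
    #include <immintrin.h>           // x86-64 path, as before
  #endif
  inline std::string simdBackend() { return "native"; }
#endif

// --- BLAS/LAPACK dispatch: Accelerate, MKL, or a generic CBLAS ---
#if defined(DIGHOLO_USE_ACCELERATE)
  // Must be defined before the include to select the modern interface.
  #define ACCELERATE_NEW_LAPACK
  #include <Accelerate/Accelerate.h>
  inline std::string blasBackend() { return "Accelerate"; }
#elif defined(DIGHOLO_USE_MKL)       // hypothetical macro name
  #include <mkl.h>
  inline std::string blasBackend() { return "MKL"; }
#else
  #if __has_include(<cblas.h>)
    #include <cblas.h>
  #endif
  inline std::string blasBackend() { return "generic CBLAS"; }
#endif
```

With this shape, the rest of the source includes only the shim and never names a backend directly, which keeps the x86-64 path identical to today's.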

Key design decisions

  • simde over sse2neon: digHolo is entirely 256-bit AVX2, which simde implements alongside FMA; sse2neon targets the 128-bit SSE family (through SSE4.2) and does not provide AVX2.
  • Accelerate over OpenBLAS — ships with macOS, tuned for Apple Silicon's AMX coprocessor, zero extra dependency. Uses the modern `ACCELERATE_NEW_LAPACK` interface (macOS 13.3+).
  • FetchContent over Homebrew for simde: pins simde to a specific commit so every build fetches identical sources, with no user-facing install step.
  • Reference-output diff test — new `tests/reference/` infrastructure + `test_reference.cpp` land now, skipped until reference binaries are committed later. Element-wise tolerance: relative 1e-4, absolute 1e-6.
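The element-wise tolerance in the last bullet could be checked along the following lines. This is a minimal sketch: the PR gives the two thresholds but not how they combine, so the rule `|actual - expected| <= atol + rtol * |expected|` is an assumption, and `withinTolerance` is a hypothetical helper name rather than anything from `test_reference.cpp`.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper. Assumed rule (not stated in the PR):
//   |actual - expected| <= atol + rtol * |expected|
bool withinTolerance(const std::vector<float>& actual,
                     const std::vector<float>& expected,
                     double rtol = 1e-4,   // relative tolerance from the PR
                     double atol = 1e-6) { // absolute tolerance from the PR
    if (actual.size() != expected.size()) {
        return false;  // shape mismatch is always a failure
    }
    for (std::size_t i = 0; i < actual.size(); ++i) {
        const double a = actual[i];
        const double e = expected[i];
        const double diff = std::fabs(a - e);
        // Written as !(diff <= bound) so that NaN values fail the check.
        if (!(diff <= atol + rtol * std::fabs(e))) {
            return false;
        }
    }
    return true;
}
```

The absolute term keeps the comparison meaningful for elements near zero, where a purely relative bound would reject any difference at all.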

What's in this PR

Design doc only: docs/superpowers/specs/2026-04-19-digholo-apple-silicon-port-design.md. No code changes yet — implementation will land in follow-up PRs driven by the writing-plans workflow.

Test plan

  • Reviewer reads the spec and confirms scope/approach before implementation starts
  • Confirm macOS 13.3 minimum is acceptable as the baseline for the arm64 build
  • Confirm scope A (no CI runner, no release artefacts) is the right first step

🤖 Generated with Claude Code

christhechris and others added 2 commits April 19, 2026 18:31
Scope A: local dev build only. Translates AVX2/FMA3 via simde, swaps
MKL for Accelerate. Gated behind APPLE AND arm64 so x86 builds are
byte-unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- Pin simde to latest stable release tag
- Use modern Accelerate LAPACK interface (ACCELERATE_NEW_LAPACK); macOS 13.3+
- DIGHOLO_USE_ACCELERATE is automatic on Apple Silicon, no opt-in toggle

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@christhechris

Author

Closing — spec will be kept local instead of in-repo.
