feat(self-packaging #66): fat / universal macOS binary support #70

jrosskopf merged 1 commit into main from feature/gh-66-fat-macho on May 25, 2026

@jrosskopf
Contributor

Summary

LocateFlapiSectionInBuffer rejected any input whose magic wasn't
MH_MAGIC_64, so universal binaries (FAT_MAGIC / FAT_MAGIC_64)
-- the format produced by lipo -create and consumed by
Homebrew-style formulas that ship both arches -- couldn't be packed
or self-inspected. flapi info reported "no bundle"; flapi pack
either failed outright or appended the bundle past EOF, producing an
ad-hoc-signed binary that fails notarisation.

A universal binary is a small big-endian fat header followed by N
thin Mach-O slices at distinct file offsets. The section_64.offset
field inside a slice is relative to the slice's base, not to the
start of the fat file. Both OverwriteFlapiSection (write side) and
LocateBundleInRange (read side) treat the returned offset as an
absolute file offset, so a naive accept-fat patch would have written
to the wrong byte address.
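
To make the offset pitfall concrete, here is a minimal sketch with made-up numbers. SliceView and ToAbsoluteOffset are hypothetical names used only for illustration; they are not identifiers from this PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration type: where one thin slice sits inside the fat file.
struct SliceView {
    std::uint64_t file_offset;  // absolute offset of the slice within the fat file
    std::uint64_t size;         // slice length in bytes
};

// Convert a slice-relative section offset (the value stored in
// section_64.offset) into an absolute offset within the fat file.
inline std::uint64_t ToAbsoluteOffset(const SliceView& slice,
                                      std::uint64_t section_offset) {
    assert(section_offset <= slice.size);  // the section must start inside the slice
    return slice.file_offset + section_offset;
}
```

With a slice placed at 0x4000 and a section 0x1200 bytes into it, the correct write address is 0x5200. Seeking to the raw 0x1200 instead would land in the gap before the slice or inside a different slice.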

This PR fixes both halves.

Changes (all in src/macho_bundle.cpp + src/include/macho_bundle.hpp)

  • Big-endian readers (ReadU32BE, ReadU64BE). Fat headers are
    big-endian on disk regardless of host endianness, so Linux/macOS
    little-endian hosts must byte-swap every multi-byte field.
  • ParseFatHeader (namespace-private). Walks fat_arch[] records
    (20 bytes for FAT_MAGIC/CIGAM, 32 bytes for the _64 variants).
    Selection rule: first slice whose cputype matches the host arch
    (compile-time, via __aarch64__ / __x86_64__), else the first
    slice. Rejects malformed nfat_arch (0 or > 64) and slice extents
    past EOF.
  • LocateFlapiSectionInBuffer split into an outer dispatch + an
    inner LocateFlapiSectionAt(buffer, base) overload. On fat input
    the outer call parses the fat header, picks a slice, and delegates
    to the inner overload with the slice's absolute offset as base.
    The inner overload adds base to whatever the load-cmd walker
    returns, so all callers see an absolute file offset.
  • IsMachOMagic extended to recognise FAT_MAGIC_64 / FAT_CIGAM_64
    in addition to the 32-bit fat variants it already accepted.
  • OverwriteFlapiSection is unchanged -- the new
    absolute-offset invariant makes its existing
    seekp(file_offset) land in the correct slice automatically.
    This is the whole reason for the split + base-addition design.
  • Header doc dropped the "fat / universal not supported" caveat and
    documents the slice-selection rule.
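
The big-endian readers and the slice-selection rule described above can be sketched roughly as follows. This is not the code from src/macho_bundle.cpp: Slice, HostCpuType and PickSlice are hypothetical names, the sketch assumes C++17, and it reads the magic in big-endian order so it does not need the CIGAM constants. The record layouts (20-byte fat_arch, 32-byte fat_arch_64) and the CPU-type values follow Apple's Mach-O headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

constexpr std::uint32_t kFatMagic   = 0xcafebabe;  // FAT_MAGIC
constexpr std::uint32_t kFatMagic64 = 0xcafebabf;  // FAT_MAGIC_64
constexpr std::uint32_t kCpuX86_64  = 0x01000007;  // CPU_TYPE_X86_64
constexpr std::uint32_t kCpuArm64   = 0x0100000c;  // CPU_TYPE_ARM64

// Assemble big-endian integers byte by byte, so host endianness never matters.
inline std::uint32_t ReadU32BE(const std::uint8_t* p) {
    return (std::uint32_t{p[0]} << 24) | (std::uint32_t{p[1]} << 16) |
           (std::uint32_t{p[2]} << 8) | std::uint32_t{p[3]};
}
inline std::uint64_t ReadU64BE(const std::uint8_t* p) {
    return (std::uint64_t{ReadU32BE(p)} << 32) | ReadU32BE(p + 4);
}

struct Slice {  // hypothetical result type
    std::uint32_t cputype;
    std::uint64_t offset;  // absolute offset of the slice in the fat file
    std::uint64_t size;
};

// Compile-time host architecture, 0 when neither macro is defined.
inline std::uint32_t HostCpuType() {
#if defined(__aarch64__)
    return kCpuArm64;
#elif defined(__x86_64__)
    return kCpuX86_64;
#else
    return 0;
#endif
}

// First slice whose cputype matches host_cpu, else the first slice.
// Returns nullopt for non-fat input or a malformed header.
inline std::optional<Slice> PickSlice(const std::vector<std::uint8_t>& buf,
                                      std::uint32_t host_cpu = HostCpuType()) {
    if (buf.size() < 8) return std::nullopt;
    const std::uint32_t magic = ReadU32BE(buf.data());
    if (magic != kFatMagic && magic != kFatMagic64) return std::nullopt;
    const bool is64 = (magic == kFatMagic64);
    const std::size_t rec = is64 ? 32 : 20;  // fat_arch_64 vs fat_arch
    const std::uint32_t n = ReadU32BE(buf.data() + 4);
    if (n == 0 || n > 64) return std::nullopt;             // malformed nfat_arch
    if (buf.size() - 8 < rec * n) return std::nullopt;     // truncated arch table

    std::optional<Slice> first, match;
    for (std::uint32_t i = 0; i < n; ++i) {
        const std::uint8_t* r = buf.data() + 8 + rec * i;
        Slice s;
        s.cputype = ReadU32BE(r);
        s.offset = is64 ? ReadU64BE(r + 8) : ReadU32BE(r + 8);
        s.size = is64 ? ReadU64BE(r + 16) : ReadU32BE(r + 12);
        // Reject slice extents that run past EOF.
        if (s.offset > buf.size() || s.size > buf.size() - s.offset) return std::nullopt;
        if (!first) first = s;
        if (!match && s.cputype == host_cpu) match = s;
    }
    return match ? match : first;
}
```

Taking the host CPU type as a defaulted parameter is a choice made for this sketch: it lets a test force either branch of the selection rule without depending on the machine that runs it.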

32-bit Mach-O (MH_MAGIC / MH_CIGAM) is still rejected with an
inline comment that we don't ship 32-bit artifacts.
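
As a rough picture of the accept/reject policy, the sketch below classifies the magic values involved. MagicKind and ClassifyMagic are hypothetical names, not the PR's IsMachOMagic, and how the real code treats a byte-swapped 64-bit thin file is not stated here, so the sketch simply files it under "unsupported". The constants are the standard values from Apple's Mach-O headers.

```cpp
#include <cassert>
#include <cstdint>

enum class MagicKind { kThin64, kFat, kThin32Unsupported, kUnsupported };

// `magic` is the first four bytes of the file loaded as a host-order u32.
// On a little-endian host a fat header therefore shows up as a *_CIGAM value.
inline MagicKind ClassifyMagic(std::uint32_t magic) {
    switch (magic) {
        case 0xfeedfacf:  // MH_MAGIC_64
            return MagicKind::kThin64;
        case 0xcafebabe:  // FAT_MAGIC
        case 0xbebafeca:  // FAT_CIGAM
        case 0xcafebabf:  // FAT_MAGIC_64
        case 0xbfbafeca:  // FAT_CIGAM_64
            return MagicKind::kFat;
        case 0xfeedface:  // MH_MAGIC (32-bit thin)
        case 0xcefaedfe:  // MH_CIGAM
            return MagicKind::kThin32Unsupported;
        default:          // assumption: everything else is out of scope
            return MagicKind::kUnsupported;
    }
}
```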

Test fixture & cases (test/cpp/macho_bundle_test.cpp)

New BuildFatMachO(slices) helper wraps any number of BuildMachO64
outputs with a big-endian fat header + 4 KiB-aligned slice placement.
The existing helpers (BuildMachO64, WriteU32LE, SectionSpec)
are reused unchanged.
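
For readers without the test file at hand, a fixture of this shape could look like the sketch below. BuildFatImage and AppendU32BE are hypothetical stand-ins, not the actual BuildFatMachO; the sketch emits only the 32-bit FAT_MAGIC layout and takes each slice as an opaque byte vector plus a cputype.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Append a 32-bit value in big-endian byte order.
inline void AppendU32BE(Bytes& out, std::uint32_t v) {
    out.push_back(static_cast<std::uint8_t>(v >> 24));
    out.push_back(static_cast<std::uint8_t>(v >> 16));
    out.push_back(static_cast<std::uint8_t>(v >> 8));
    out.push_back(static_cast<std::uint8_t>(v));
}

// Wrap {cputype, thin-slice bytes} pairs in a fat header, placing each
// slice at the next 4 KiB boundary and zero-filling the gaps.
inline Bytes BuildFatImage(const std::vector<std::pair<std::uint32_t, Bytes>>& slices) {
    constexpr std::size_t kAlign = 4096;  // 4 KiB, recorded as log2 = 12
    Bytes out;
    AppendU32BE(out, 0xcafebabe);  // FAT_MAGIC
    AppendU32BE(out, static_cast<std::uint32_t>(slices.size()));

    // First pass: assign aligned offsets and emit one 20-byte fat_arch each.
    std::size_t cursor = 8 + 20 * slices.size();
    std::vector<std::size_t> offsets;
    for (const auto& s : slices) {
        cursor = (cursor + kAlign - 1) / kAlign * kAlign;
        offsets.push_back(cursor);
        AppendU32BE(out, s.first);                                      // cputype
        AppendU32BE(out, 0);                                            // cpusubtype
        AppendU32BE(out, static_cast<std::uint32_t>(cursor));           // offset
        AppendU32BE(out, static_cast<std::uint32_t>(s.second.size()));  // size
        AppendU32BE(out, 12);                                           // align (log2)
        cursor += s.second.size();
    }

    // Second pass: pad with zeros up to each offset, then copy the slice.
    for (std::size_t i = 0; i < slices.size(); ++i) {
        out.resize(offsets[i], 0);
        out.insert(out.end(), slices[i].second.begin(), slices[i].second.end());
    }
    return out;
}
```

Two small slices of 10 and 5 bytes would land at offsets 4096 and 8192, which keeps the expected absolute offsets in the tests easy to compute by hand.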

Four new test cases:

  • single-slice fat: section located at the absolute offset
    (slice_offset + intra-slice section offset).
  • two-slice fat (arm64 + x86_64): parser picks the host-matching
    slice; deterministic per-host expected offset via #ifdef.
  • two PPC slices (host arch matches neither): parser falls back
    to the first slice and stays deterministic across exotic hosts.
  • OverwriteFlapiSection round-trip inside a slice: write a
    7-byte payload through the located section, verify the bytes land
    at the returned absolute offset, and confirm a re-locate finds the
    same section -- proves the read+write absolute-offset invariant.
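
The round-trip idea in the last case can be illustrated with an in-memory stream. OverwriteAt and ReadAt are hypothetical helpers that only mimic the seek-then-write pattern; they are not the PR's OverwriteFlapiSection, and the offsets are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <sstream>
#include <string>

// Seek to an absolute offset and overwrite bytes in place.
inline bool OverwriteAt(std::iostream& file, std::uint64_t absolute_offset,
                        const std::string& payload) {
    file.seekp(static_cast<std::streamoff>(absolute_offset), std::ios::beg);
    file.write(payload.data(), static_cast<std::streamsize>(payload.size()));
    return static_cast<bool>(file);
}

// Read n bytes back from an absolute offset; empty string on failure.
inline std::string ReadAt(std::iostream& file, std::uint64_t absolute_offset,
                          std::size_t n) {
    std::string out(n, '\0');
    file.seekg(static_cast<std::streamoff>(absolute_offset), std::ios::beg);
    file.read(&out[0], static_cast<std::streamsize>(n));
    return file ? out : std::string();
}
```

A test in this style writes a 7-byte payload at the located absolute offset (for example slice base 0x4000 plus section offset 0x1200), reads it back from the same offset, and checks that the total size of the image did not change.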

Test plan

  • cmake --build build/release --config Release (builds cleanly)
  • ctest -- 642 / 642 pass (637 previous + 5 new)
  • pytest test_self_packaging.py test_self_packaging_http.py -v
    -- 11 / 11 pass (no regression on the Linux EOCD tail-scan path,
    which is unaffected)
  • End-to-end on a Mac (not run; not possible from this Linux dev
    host): lipo -create an arm64 + x86_64 flapi, run
    ./flapi-universal pack --in examples --out fat-bundled, then
    ./fat-bundled info to confirm the bundle is found in the host
    slice. The macOS CI builder still exercises only the existing
    macOS pack-smoke against thin binaries; this PR doesn't add an
    automated Mac path for fat binaries.

Notes

Closes #66.

Before this change, `LocateFlapiSectionInBuffer` returned `nullopt`
for any input whose magic wasn't `MH_MAGIC_64`. Universal binaries
(`FAT_MAGIC` / `FAT_MAGIC_64`) -- the format produced by `lipo
-create` and consumed by Homebrew-style formulas that ship both
arches -- therefore couldn't be packed or self-inspected.

A universal binary is a small big-endian fat header followed by N
thin Mach-O slices at distinct file offsets. The `section_64.offset`
read inside a slice is relative to the slice's base, not to the
start of the fat file. Both `OverwriteFlapiSection` (write side) and
`LocateBundleInRange` (read side) treat the returned offset as an
absolute file offset, so a naive accept-fat patch would have written
to the wrong byte address.

This PR:

- Adds `ReadU32BE` / `ReadU64BE` -- fat headers are big-endian on
  disk regardless of host endianness.
- Adds `ParseFatHeader` (namespace-private). Walks `fat_arch[]`
  (20-byte records for `FAT_MAGIC/CIGAM`, 32-byte for the `_64`
  variants). Selection rule: first slice whose cputype matches the
  host arch (compile-time, via `__aarch64__` / `__x86_64__`), else
  the first slice. Rejects malformed `nfat_arch` (0 or > 64) and
  slice extents past EOF.
- Splits `LocateFlapiSectionInBuffer` into an outer dispatch + an
  inner `LocateFlapiSectionAt(buffer, base)` overload. On fat input
  the outer call parses the fat header, picks a slice, and delegates
  to the inner overload with the slice's absolute offset as base.
  The inner overload adds base to whatever the load-cmd walker
  returns, so callers see an absolute file offset.
- `IsMachOMagic` extended to recognise `FAT_MAGIC_64` / `FAT_CIGAM_64`
  in addition to the 32-bit fat variants it already accepted.
- `OverwriteFlapiSection` is unchanged -- the new absolute-offset
  invariant makes its existing `seekp(file_offset)` land in the
  correct slice automatically.
- Header doc updated to drop the "fat / universal not supported"
  caveat and document the slice-selection rule.

Test fixture: new `BuildFatMachO(slices)` helper wraps any number
of `BuildMachO64` outputs with a big-endian fat header + 4 KiB-
aligned slice placement. Four new test cases (issue #66 acceptance):

- single-slice fat: section located at the absolute offset
  (slice_offset + intra-slice section offset).
- two-slice fat (arm64 + x86_64): parser picks the host-matching
  slice; deterministic per-host expected offset via `#ifdef`.
- two PPC slices (host arch doesn't match either): parser falls back
  to the first slice and stays deterministic across exotic hosts.
- `OverwriteFlapiSection` round-trip inside a slice: write a 7-byte
  payload through the located section, verify the bytes land at the
  returned absolute offset, and confirm a re-locate finds the same
  section.

Test plan:
- ctest: 642 / 642 pass (637 previous + 5 new -- 4 fat cases +
  the round-trip).
- pytest test_self_packaging.py + test_self_packaging_http.py:
  11 / 11 pass (no regression on the Linux EOCD tail-scan path,
  which is unaffected by this change).

Closes #66.
@jrosskopf jrosskopf merged commit eae15c1 into main May 25, 2026
21 checks passed
@jrosskopf jrosskopf deleted the feature/gh-66-fat-macho branch May 25, 2026 13:14
Linked issue: Fat / universal macOS binary support in macho_bundle parser