Portable Linux libccl.so (glibc ≥ 2.17 baseline build) #3
Draft
matiwinnetou wants to merge 6 commits into
native-image cannot emit a static library (oracle/graal#3053, still open), and musl portability applies only to static executables, not shared libs, so a fully static, no-.so FFI distribution isn't possible without an IPC rewrite. Pursue the real goal (glibc/distro independence) instead by building the Linux .so against an old glibc baseline.
- docs/spikes/static-linking.md: findings, options, decision (Option A).
- static-linking-spike.yml: build libccl.so in manylinux_2_28 (glibc 2.28), assert max required GLIBC symbol <= 2.28 via objdump.
- TODO: mark the static-linking item spiked with the verdict.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
CI run 28175185458 green — the .so references no glibc symbol newer than 2.17, so it runs on glibc >= 2.17 (RHEL/CentOS 7+, Amazon Linux 2, Ubuntu 18.04+, Debian 9+). Better than the 2.28 target. Records the objdump output and notes the remaining follow-up: a run-on-old-distro smoke test before rollout. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
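A consumer could check the same floor on the host before loading the library. The following is a minimal sketch, not part of this PR: the helper names `parse_version` and `host_meets_floor` are hypothetical, and it assumes `platform.libc_ver()` reports the host libc accurately (it returns empty strings when it cannot tell, which the sketch treats as "not supported").

```python
import platform
import re
from typing import Tuple

# Floor measured in the commit above: no symbol newer than GLIBC_2.17.
GLIBC_FLOOR = (2, 17)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '2.17' into (2, 17) so versions compare numerically.

    Comparing raw strings would rank '2.9' above '2.17'.
    """
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"not a version string: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def host_meets_floor(libc_name: str, libc_version: str,
                     floor: Tuple[int, ...] = GLIBC_FLOOR) -> bool:
    """Return True only for glibc hosts at or above the floor."""
    if libc_name != "glibc" or not libc_version:
        # musl (Alpine) and unknown libcs are outside the supported set.
        return False
    return parse_version(libc_version) >= floor


# Report on the machine running this script.
name, version = platform.libc_ver()
print(f"libc={name or 'unknown'} {version or '?'} "
      f"meets floor: {host_meets_floor(name, version)}")
```

Comparing tuples rather than strings is the important part: glibc 2.9 is older than 2.17, which a plain string comparison gets wrong.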
Compile a minimal isolate+version+account smoke harness in the manylinux_2_28 builder, then execute the prebuilt binary inside centos:7 (glibc 2.17) with no package installs — confirming the lib loads and runs on the measured floor, not only that it links against an old baseline. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Spike proven: building libccl.so in manylinux_2_28 yields a lib requiring only GLIBC_2.17, which loads and runs on centos:7. Promote it from experiment to permanent infrastructure:
- portable-linux-lib.yml: permanent guard on PRs + develop/main. Builds in manylinux_2_28, asserts the glibc floor via objdump, and re-runs the centos:7 smoke as a strict run-on-2.17 regression check (renamed from the spike file).
- release.yml: split Linux out of the matrix into a manylinux_2_28 container job so every shipped Linux artifact is portable to glibc >= 2.17; macOS/Windows unchanged. Re-asserts the floor before packaging.
- README: document the glibc 2.17 floor (and the Alpine/musl caveat).
- docs/spikes/static-linking.md, TODO: record the run-proof and rollout.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per Satya's review: glibc-baseline solves OS-library portability, but the binary also defaults to the build machine's CPU instruction set and can SIGILL on older / datacenter CPUs lacking AVX2/AVX-512. -march=compatibility emits only the baseline instructions common to all CPUs of the target arch, closing the CPU-portability axis. Placed in native-image.properties (the canonical spot). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
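To illustrate the CPU-portability axis, here is a rough sketch of how one might check whether a host advertises the instruction-set extensions a non-baseline build could depend on. It is an assumption-laden illustration, not something this PR ships: the function names are hypothetical, the sample text is made up, and it relies on the x86 Linux `/proc/cpuinfo` layout, where features appear on a `flags` line (ARM kernels use `Features` instead).

```python
from typing import Iterable, Set


def cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(cpuinfo_text: str, required: Iterable[str]) -> Set[str]:
    """Features a binary would need that this CPU does not advertise."""
    return set(required) - cpu_flags(cpuinfo_text)


# Illustrative cpuinfo excerpt for an older CPU without AVX2 (not real data).
SAMPLE_CPUINFO = (
    "processor\t: 0\n"
    "model name\t: Example CPU\n"
    "flags\t\t: fpu sse sse2 ssse3 sse4_1 sse4_2 avx\n"
)

# A build tuned for the build machine might use these; a baseline build would not.
print(sorted(missing_features(SAMPLE_CPUINFO, {"avx2", "avx512f"})))
```

On a real host the text would come from reading `/proc/cpuinfo`. A non-empty result is the situation in which a build tuned to the build machine's CPU can hit SIGILL, and which `-march=compatibility` avoids.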
Add docs/adr/ (Architecture Decision Records) with a template + index, and record 10 decisions, some retrospective and some new:
- 0001 native shared library via GraalVM native-image + C FFI
- 0002 offline, stateless bridge; caller-supplied chain data; no provider in libccl
- 0003 one FFI, four language wrappers; uniform thin contract; explicit inputs
- 0004 Bun is the only supported JavaScript runtime
- 0005 standardize on Oracle GraalVM 25.0.3
- 0006 TxPlan (YAML) transaction format, replacing the bespoke JSON spec
- 0007 Plutus exec units are caller-supplied; evaluator-agnostic
- 0008 Linux portability: glibc-baseline build + -march=compatibility (not static)
- 0009 branch & release process: feature -> develop, one large develop -> main PR
- 0010 Go wrapper isolate thread-affinity
Replace the ephemeral docs/spikes/ with ADR-0008 (the durable rationale), move the CI-used harness docs/spikes/smoke.c -> native-test/src/smoke.c (where C smoke tests belong), and update all references (README, TODO, both workflows).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Makes the shipped Linux libccl.so distro-independent so consumers no longer need a specific Ubuntu / glibc. Result of the static-linking spike (full findings in docs/spikes/static-linking.md).
Why not actual static linking?
The original idea was to statically link (no .so). The spike found that's not possible: GraalVM native-image cannot emit a static library (oracle/graal#3053, still open), and musl's run-anywhere property only applies to static executables, not shared libraries. A truly static, no-.so distribution would require re-architecting to an IPC subprocess model, which was rejected as too invasive.
What we do instead: glibc baseline
Build the Linux .so inside an old-glibc container (manylinux_2_28). Because glibc is backward-compatible, a lib built against an old baseline runs on that baseline and everything newer, while the in-process FFI stays exactly as-is for all four wrappers.
Measured result (CI): the .so references no symbol newer than GLIBC_2.17, and it loads + runs a real key-derivation on centos:7 (glibc 2.17).
→ runs on RHEL/CentOS 7+, Amazon Linux 2, Ubuntu 18.04+, Debian 9+ and all newer distros. (Current ubuntu-latest build demands ~GLIBC_2.39 and runs on none of those.) Not Alpine/musl; a musl variant is a possible future follow-up.
Changes
- .github/workflows/portable-linux-lib.yml (new permanent guard, on PRs + develop/main): builds in manylinux_2_28, asserts the glibc floor via objdump, and re-runs the centos:7 smoke as a strict run-on-2.17 regression check.
- release.yml: Linux artifact split into a manylinux_2_28 container job so every shipped Linux .so is portable to glibc ≥ 2.17; re-asserts the floor before packaging. macOS/Windows unchanged.
- docs/spikes/static-linking.md: full spike findings, options, decision, and proof.
- docs/spikes/smoke.c: minimal isolate + version + account harness used for the run-proof.
- README.md, TODO.md: document the glibc-2.17 floor; mark the backlog item done.
Verification
The Portable Linux lib workflow on this PR builds the lib, asserts GLIBC ≤ 2.28, and runs it on centos:7 (glibc 2.17). Green = portability proven on this commit.
🤖 Generated with Claude Code
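The floor assertion described under Verification can be sketched roughly as follows. This is an illustration under stated assumptions rather than the workflow's actual script: the function names are hypothetical, the sample symbol lines are made up to resemble `objdump -T` output (they are not the recorded CI output), and the 2.28 ceiling mirrors the bound quoted above.

```python
import re
import subprocess
from typing import Tuple

# Matches version tags such as GLIBC_2.17 or GLIBC_2.2.5; GLIBC_PRIVATE is skipped.
GLIBC_TAG = re.compile(r"GLIBC_(\d+(?:\.\d+)+)")


def read_dynamic_symbols(lib_path: str) -> str:
    """Run `objdump -T` on a shared library and return its text output."""
    result = subprocess.run(["objdump", "-T", lib_path],
                            capture_output=True, text=True, check=True)
    return result.stdout


def max_glibc_version(objdump_output: str) -> Tuple[int, ...]:
    """Highest GLIBC_x.y version referenced, as a tuple of ints."""
    versions = [tuple(int(p) for p in m.group(1).split("."))
                for m in GLIBC_TAG.finditer(objdump_output)]
    if not versions:
        raise ValueError("no GLIBC version tags found")
    return max(versions)


def assert_glibc_floor(objdump_output: str,
                       ceiling: Tuple[int, ...] = (2, 28)) -> Tuple[int, ...]:
    """Fail if any referenced symbol is newer than the ceiling."""
    found = max_glibc_version(objdump_output)
    if found > ceiling:
        raise AssertionError(f"requires GLIBC {found}, above ceiling {ceiling}")
    return found


# Made-up lines in the general shape of `objdump -T` output.
SAMPLE_OBJDUMP = """\
0000000000000000      DF *UND*  0000000000000000 (GLIBC_2.2.5) malloc
0000000000000000      DF *UND*  0000000000000000 (GLIBC_2.14)  memcpy
0000000000000000      DF *UND*  0000000000000000 (GLIBC_2.17)  clock_gettime
"""

print(assert_glibc_floor(SAMPLE_OBJDUMP))
```

In CI the input would come from something like `read_dynamic_symbols("libccl.so")`, where the path is a placeholder. Tightening `ceiling` to `(2, 17)` would turn the measured floor into the enforced one.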