Fix @cfunction closure error on ARM64/Apple Silicon #99
Merged
On aarch64 (Apple Silicon), `@cfunction($closure, ...)` cannot
create thunk-based C function pointers at runtime because macOS
enforces strict W^X memory protection, preventing the JIT from
allocating writable+executable trampolines.
The four p4est iterator callbacks defined inside `generate_face_labeling`
(corner, edge, face, cell) captured local Julia variables as closures
and were registered via `@cfunction($f, ...)`. This caused the error:
cfunction: closures are not supported on this platform
Fix: bundle all captured state into a module-level mutable struct
`FaceLabelingCallbackData` and pass it through the `user_data` pointer
that p4est provides to every callback invocation. The four callbacks
are lifted to module-level named functions and registered with the
non-closure form `@cfunction(f, ...)`, which produces a static function
pointer compatible with ARM64.
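For illustration only, the non-closure form can be sketched with the standard `qsort` example from libc rather than p4est. The function name `mycompare` and the array `A` are placeholders, not part of this PR; the point is just that a module-level function registered without `$` needs no runtime trampoline.

```julia
# Module-level comparison function: captures nothing, so no closure thunk is needed.
function mycompare(a, b)::Cint
    return (a < b) ? -1 : ((a > b) ? +1 : 0)
end

# Non-closure form: no `$` in front of the function name.
mycompare_c = @cfunction(mycompare, Cint, (Ref{Cdouble}, Ref{Cdouble}))

A = [1.3, -2.7, 4.4, 3.1]

# Hand the static function pointer to libc's qsort, which sorts A in place.
ccall(:qsort, Cvoid, (Ptr{Cdouble}, Csize_t, Csize_t, Ptr{Cvoid}),
      A, length(A), sizeof(eltype(A)), mycompare_c)
```

The closure form would be written `@cfunction($mycompare, ...)`; it is deliberately not executed here.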
The `GC.@preserve` block around the p4est_iterate calls ensures the
struct is not collected or moved while C holds a raw pointer to it.
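A minimal sketch of the `user_data` pattern, assuming a hypothetical `CounterData` struct in place of `FaceLabelingCallbackData` and a direct `ccall` on the function pointer in place of `p4est_iterate`. None of these names come from the PR; the real callbacks take p4est-specific arguments.

```julia
# Hypothetical stand-in for FaceLabelingCallbackData: holds all state
# that the callback previously captured from the enclosing scope.
mutable struct CounterData
    hits::Int
    labels::Vector{Int}
end

# Module-level callback: recovers its state from the raw user_data pointer.
function _record_impl(id::Cint, user_data::Ptr{Cvoid})::Cvoid
    data = unsafe_pointer_to_objref(user_data)::CounterData
    data.hits += 1
    push!(data.labels, Int(id))
    return nothing
end

function run_demo()
    data = CounterData(0, Int[])
    # Non-closure form: a static function pointer, no trampoline required.
    cb = @cfunction(_record_impl, Cvoid, (Cint, Ptr{Cvoid}))
    # Keep `data` rooted for as long as the raw pointer is in use.
    GC.@preserve data begin
        ptr = pointer_from_objref(data)
        for id in Cint[7, 8, 9]
            # Stand-in for the C library invoking the callback.
            ccall(cb, Cvoid, (Cint, Ptr{Cvoid}), id, ptr)
        end
    end
    return data
end
```

`pointer_from_objref` only works on mutable objects, which is one reason the state is bundled into a `mutable struct`.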
Tested on Apple Silicon (arm64-apple-darwin, Julia 1.10) with 4 MPI
ranks on a 2D cubed-sphere mesh.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Problem
On ARM64 (Apple Silicon / aarch64), `@cfunction($closure, ...)` cannot create thunk-based C function pointers at runtime. macOS enforces strict W^X memory protection, so the JIT cannot allocate writable+executable trampolines for closure-capturing callbacks. Any attempt to build a distributed p4est mesh fails with:
cfunction: closures are not supported on this platform
This makes `OctreeDistributedDiscreteModel` (and anything that calls `generate_face_labeling`) completely unusable on Apple Silicon, regardless of Julia version.

Root cause
`generate_face_labeling` defined four p4est iterator callbacks (`jcorner_callback`, `jedge_callback`, `jface_callback`, `jcell_callback`) as local functions that captured variables from the enclosing scope. They were registered with `@cfunction($f, ...)` (dollar-sign = closure thunk), which fails on ARM64.

Fix
- Added a module-level `mutable struct FaceLabelingCallbackData` that bundles all captured state (topology face tables, entity arrays, coarse-mesh labeling data, iterator mode).
- Lifted the four callbacks to module-level functions (`_jcorner_callback_impl`, `_jedge_callback_impl`, `_jface_callback_impl`, `_jcell_callback_impl`) that recover the struct via `unsafe_pointer_to_objref(user_data)`.
- Registered them with `@cfunction(f, ...)` without `$`, which creates a static function pointer, fully supported on all platforms including ARM64.
- Wrapped the `p4est_iterate` calls in `GC.@preserve` to pin the struct while C holds a raw pointer to it.
- `iterator_mode`, previously passed as a separate `Ref` to `p4est_iterate`, is now a field of the struct, updated between calls.

No behaviour change on x86-64. Only `UniformlyRefinedForestOfOctreesDiscreteModels.jl` is modified.

Testing
Tested on Apple Silicon (`arm64-apple-darwin`, Julia 1.10) with 4 MPI ranks on a 2D cubed-sphere mesh: previously crashed at mesh construction, now completes successfully. Existing behaviour on x86-64 is unchanged (static cfunctions work identically on that platform).
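The `iterator_mode` change can be sketched roughly as follows. Everything here is hypothetical: `ModeData`, `_visit_impl`, `fake_iterate` and the meaning of the mode values are invented for illustration, and `fake_iterate` merely stands in for `p4est_iterate`. It only shows one struct serving two passes, with the mode field updated between them.

```julia
# Hypothetical state struct: the mode lives next to the data it controls.
mutable struct ModeData
    iterator_mode::Int   # assumed meaning: 0 = first pass, 1 = second pass
    first_pass_count::Int
    second_pass_count::Int
end

# Module-level callback that branches on the mode stored in the struct.
function _visit_impl(user_data::Ptr{Cvoid})::Cvoid
    data = unsafe_pointer_to_objref(user_data)::ModeData
    if data.iterator_mode == 0
        data.first_pass_count += 1
    else
        data.second_pass_count += 1
    end
    return nothing
end

# Stand-in for p4est_iterate: invokes the callback n times through its pointer.
function fake_iterate(cb::Ptr{Cvoid}, user_data::Ptr{Cvoid}, n::Int)
    for _ in 1:n
        ccall(cb, Cvoid, (Ptr{Cvoid},), user_data)
    end
end

function run_two_passes()
    data = ModeData(0, 0, 0)
    cb = @cfunction(_visit_impl, Cvoid, (Ptr{Cvoid},))
    GC.@preserve data begin
        ptr = pointer_from_objref(data)
        fake_iterate(cb, ptr, 4)   # first pass, mode 0
        data.iterator_mode = 1     # update the field between calls
        fake_iterate(cb, ptr, 6)   # second pass, same pointer, mode 1
    end
    return data
end
```

Because the callback reads the mode through the same pointer on every invocation, no second `Ref` needs to be threaded through the C call.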