[nanvix] E: Embed C/C++ runtime in python.elf for .so dlopen support #682
Open
esaurez wants to merge 1 commit into
Updates `Makefile.nanvix` so that `python.elf` correctly serves as the "main module" against which extension `.so`s (numpy, ssl, lxml, future pip-installed wheels, ...) resolve their C and C++ runtime symbols at dlopen() time.

This is the consumer-side companion to the Nanvix loader's STB_WEAK support (esaurez/nanvix#22) and is gated on the new libposix `pathconf` / `fpathconf` stubs (esaurez/nanvix#23) for the configure conftest to even produce an executable.

Three coordinated link-flag changes to the `CONFIGURE_ENV` block:

1. `LIBS` segment 1 -- new `--whole-archive ... --no-whole-archive` block ahead of the existing `--start-group`. Forces every object from libposix, libc, libm, libstdc++, and libgcc into python.elf so the runtime symbols extension `.so`s depend on are embedded (and re-exported via `-Wl,--export-dynamic`, already present). Without this, the static linker drops unreferenced objects (e.g. `fscanf`, `longjmp`, `strtold_l` for numpy; `operator new/delete[]`, `__cxa_*`, `_Unwind_*`, `std::type_info` vtables for any C++ extension) and subsequent dlopen() of those `.so`s fails with "symbol not found".
2. `LIBS` segment 2 -- the existing `--start-group` is trimmed to just the external add-on libraries (sqlite3, ssl, crypto, z, bz2, lzma, ffi). It no longer re-lists libposix / libc / libm: those archives are already fully included by segment 1, so the external libs can resolve their references against the already-embedded objects.
3. Two new top-level Makefile vars `LIBSTDCXX := -lstdc++` and `LIBGCC := -lgcc`. The GCC driver resolves them against its built-in search paths (libgcc lives under a versioned `lib/gcc/i686-nanvix/<gcc-version>/` directory, which would be fragile to hardcode). Defined once at top level because the `-l` form is identical between the docker and host build paths.

`LDFLAGS` is unchanged.
The existing `-Wl,--allow-multiple-definition` flag is kept, and the surrounding comment is expanded to enumerate the duplicate-symbol categories the flag is masking (newlib long-double math helpers, libposix/libc env+isatty overlaps, libc/libm math helper overlaps, libgcc internal `__x86.get_pc_thunk.*` duplicates, etc.). The set is large and toolchain-build-version-dependent, and the flag is the only practical workaround until the contributing upstreams are fixed.

`.nanvix/config.py::configure_env()`, an unused helper that mirrors `Makefile.nanvix`'s `CONFIGURE_ENV`, is kept in sync (same `--whole-archive` LIBS, same LDFLAGS) and gains a docstring calling out the dead-code status. A separate small cleanup PR can delete the helper entirely.

Validated end-to-end on the Nanvix microvm: CPython 3.12 + numpy 1.26.4 runs `import numpy`, `np.arange`, `np.dot`, `reshape`, `flatten`, and broadcasting, all producing `NUMPY_TEST_OK`. `hello.py` and the existing single-process / multi-process / standalone modes are unaffected by the change because the linker flags are not mode-conditional.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
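The two-segment `LIBS` layout described in the commit message can be sketched in Python, the language of the `configure_env()` helper. This is a minimal illustration and not the contents of `Makefile.nanvix` or `.nanvix/config.py`: the names `RUNTIME_LIBS`, `ADDON_LIBS`, and `build_libs` are hypothetical, and the `-lposix`, `-lc`, `-lm` spellings are assumptions (the real build may reference those archives by path).

```python
# Hypothetical sketch: names and -l spellings are illustrative assumptions,
# not copied from Makefile.nanvix or .nanvix/config.py.

# Segment 1: runtime archives pulled wholesale into python.elf.
RUNTIME_LIBS = ["-lposix", "-lc", "-lm", "-lstdc++", "-lgcc"]

# Segment 2: external add-on libraries, resolved as a group.
ADDON_LIBS = ["-lsqlite3", "-lssl", "-lcrypto", "-lz", "-lbz2", "-llzma", "-lffi"]


def build_libs(runtime=RUNTIME_LIBS, addons=ADDON_LIBS):
    """Return a LIBS string with the whole-archive block ahead of the group."""
    segment1 = ["-Wl,--whole-archive", *runtime, "-Wl,--no-whole-archive"]
    segment2 = ["-Wl,--start-group", *addons, "-Wl,--end-group"]
    return " ".join(segment1 + segment2)


if __name__ == "__main__":
    print(build_libs())
```

Keeping the runtime archives out of the second list reflects the point made in item 2: the add-on libraries resolve against objects that the first segment has already embedded.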
Pull request overview
This PR updates the Nanvix CPython build harness so python.elf exports the full C/C++ runtime surface needed for extension modules (.so) to resolve symbols against the main executable at dlopen() time, preventing load-time “symbol not found” failures caused by the static linker discarding unreferenced runtime objects.
Changes:
- Update `Makefile.nanvix` link flags to `--whole-archive` libposix/libc/libm plus `-lstdc++`/`-lgcc`, while keeping add-on libs in a separate `--start-group`.
- Add detailed in-file documentation explaining why `--export-dynamic`, `--whole-archive`, and `--allow-multiple-definition` are used.
- Mirror the new link flags in `.nanvix/config.py`’s (currently unused) `configure_env()` helper and document its status.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| Makefile.nanvix | Embeds C/C++ runtime archives into python.elf via --whole-archive and documents linker-flag rationale to support .so dlopen() symbol resolution. |
| .nanvix/config.py | Keeps the unused configure_env() helper’s link flags aligned with Makefile.nanvix and clarifies that Makefile.nanvix is authoritative. |
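The export behaviour summarized in this overview could be spot-checked by comparing the dynamic symbols of `python.elf` against a list of runtime symbols an extension is expected to need. The sketch below is an assumption-laden illustration: `missing_exports` and `dynamic_symbols` are hypothetical helper names, the symbol list is only an example drawn from the symbols named in this PR, and the `nm` binary for the Nanvix toolchain may carry a target prefix.

```python
import subprocess


def missing_exports(nm_output, required):
    """Return the required symbols that the nm listing does not define."""
    defined = set()
    for line in nm_output.splitlines():
        parts = line.split()
        # Defined-symbol lines have the shape "<address> <type> <name>".
        if len(parts) == 3:
            defined.add(parts[2])
    return sorted(set(required) - defined)


def dynamic_symbols(elf_path, nm="nm"):
    """Run GNU nm on the dynamic symbol table (nm name is an assumption)."""
    result = subprocess.run(
        [nm, "-D", "--defined-only", elf_path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Example symbols only; a real check would use the extension's own needs.
    wanted = ["fscanf", "longjmp", "strtold_l", "__cxa_throw", "_Unwind_Resume"]
    print(missing_exports(dynamic_symbols("python.elf"), wanted))
```

An empty result would suggest the listed symbols are present in `.dynsym`; it does not by itself prove that every extension will load.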
Summary
Updates `Makefile.nanvix` so `python.elf` correctly serves as the "main module" against which extension `.so`s (numpy, ssl, lxml, future pip-installed wheels) resolve their C and C++ runtime symbols at `dlopen()` time. Without this change, even pure-Python plus pre-compiled extension wheels fail because the static linker drops every libc/libstdc++/libgcc symbol that cpython itself doesn't reference, even though those `.so`s need them at runtime.

This is the consumer-side companion to two upstream Nanvix changes:

- The loader's STB_WEAK support. With this PR, `python.elf` actually presents the strong symbols `.so`s need, so the weak-undef fallback is only used for the genuinely optional cases.
- The `pathconf`/`fpathconf` ENOSYS stubs. Without these, the cpython `./configure` conftest fails ("C compiler cannot create executables") because libstdc++'s `std::filesystem::current_path` leaves `pathconf` as an unresolved strong UND.

What changed
- `Makefile.nanvix`: `CONFIGURE_ENV` link-flag changes (details below) plus a ~30-line comment block documenting the rationale.
- `.nanvix/config.py`: the `configure_env()` helper is updated to mirror the new `Makefile.nanvix` flags, with a docstring marking it as currently unused. (A separate small PR can delete the helper entirely.)

`LIBS` segment 1: `--whole-archive` block

Before:
After:
The first segment forces every object from libposix, libc, libm, libstdc++, and libgcc into `python.elf`. Combined with the already-present `-Wl,--export-dynamic` flag, this means all of those runtime symbols end up in `python.elf`'s `.dynsym` and are visible to extension `.so`s at `dlopen()` time. Without `--whole-archive`, the static linker drops unreferenced objects (e.g. `fscanf`, `longjmp`, `strtold_l` for numpy; `operator new/delete[]`, `__cxa_*`, `_Unwind_*`, `std::type_info` vtables for any C++ extension) and subsequent `dlopen()` fails with "symbol not found".

`LIBS` segment 2: trimmed `--start-group`

The trailing `--start-group ... --end-group` is now just the external add-on libraries (sqlite3, ssl, crypto, z, bz2, lzma, ffi). It no longer re-lists `libposix` / `libc` / `libm`: those archives are already fully embedded by segment 1, so the external libs can resolve their references against the already-included objects.

New top-level Makefile vars
`LIBSTDCXX := -lstdc++` and `LIBGCC := -lgcc` are defined once at the top level (the `-l` form works identically in both the docker and host build paths). The GCC driver resolves them against its built-in search paths; this avoids hardcoding the versioned `lib/gcc/i686-nanvix/<gcc-version>/libgcc.a` path, which would be fragile across toolchain upgrades.

`LDFLAGS`: kept identical

No change to `LDFLAGS`. The `-Wl,--allow-multiple-definition` flag is the only piece that looks workaround-shaped, and the surrounding comment block has been expanded to enumerate the duplicate-symbol categories it masks:

- `frexpl`, `llrintl`, `lrintl`, `rintl` are defined in three different newlib directories simultaneously (a newlib build-system bug we will file as a discussion issue at `nanvix/newlib`).
- `_start`, `copysign[f]`, `getenv`, `setenv`, `unsetenv`, `environ`, `isatty`.
- `frexp`, `ldexp`, `modf`, `isnan`, `isinf`, `scalbn`, ... `hypotf`.
- `__x86.get_pc_thunk.*` duplicates.
- `__eprintf`.

The set is large and toolchain-build-version-dependent; treating the link as multiple-definition-tolerant is the only practical workaround until each upstream is fixed. The comment explicitly marks the flag as temporary and lists the categories so a future reader can audit which contributing upstreams have been addressed.
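One way a future reader might audit the categories masked by `-Wl,--allow-multiple-definition` is to look for global symbols defined by more than one archive member. The following is a rough sketch under stated assumptions, not tooling from this PR: `find_duplicate_definitions` is a hypothetical name, the input is assumed to be the text printed by `nm --defined-only <archive>`, and the member names in the example are invented.

```python
from collections import defaultdict


def find_duplicate_definitions(nm_by_archive):
    """Map each global symbol defined more than once to its definers.

    nm_by_archive maps an archive label to the text printed by
    `nm --defined-only <archive>` for that archive.
    """
    definers = defaultdict(list)
    for archive, text in nm_by_archive.items():
        member = "?"
        for line in text.splitlines():
            line = line.strip()
            if line.endswith(":"):
                # nm prints "member.o:" before each archive member's symbols.
                member = line[:-1]
                continue
            parts = line.split()
            # Symbol lines are "<address> <type> <name>". An uppercase type
            # letter usually marks a global symbol; lowercase is usually local.
            if len(parts) == 3 and parts[1].isupper():
                definers[parts[2]].append(f"{archive}({member})")
    return {sym: where for sym, where in definers.items() if len(where) > 1}


if __name__ == "__main__":
    # Invented listing for illustration; real input comes from nm.
    example = {
        "libc.a": "lib_a-getenv.o:\n00000000 T getenv\n",
        "libposix.a": "env.o:\n00000000 T getenv\n00000010 t helper\n",
    }
    print(find_duplicate_definitions(example))
```

Common symbols (type `C`) would also be reported by this sketch even though the linker normally merges them, so its output is a starting point for review, not a list of link errors.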
Validation
- `local-nanvix/toolchain-python:from-prs` (a toolchain image built from the filed-PR branches of `nanvix/newlib` and `nanvix/gcc`, with no manual workarounds).
- `python.elf` runs `hello.py` on the Nanvix microvm without regression.
- `import numpy`, `np.arange(10).sum()`, `np.dot(matrix, vector)`, `reshape`, `flatten`, and broadcasting all work; the test harness prints `NUMPY_TEST_OK`. (... `python.elf`.)
- `make -f Makefile.nanvix clean` still works (cleanup logic doesn't reference the new variables).

Compatibility
- `python.elf` grows by ~3-5 MB because of the additional whole-archive objects (libstdc++ in particular is sizeable). This is the trade-off for `.so` extensions to work without per-extension RPATHs or static-link sharing.
- Without this change, the result is a `python.elf` that runs pure Python correctly but cannot `dlopen()` any extension `.so` that depends on libc/libstdc++/libgcc symbols cpython doesn't itself reference (i.e. essentially every real extension wheel).

Future cleanup (not in this PR)
- `nanvix/newlib` libm duplicates: file the discussion issue and land the build-system fix, then drop `-Wl,--allow-multiple-definition` here.
- `nanvix/cpython` `.nanvix/config.py`: delete the unused `configure_env()` helper entirely. Tracked locally as a follow-up small PR.
- `nanvix/toolchain-python`: drop `-Wl,--allow-shlib-undefined` from the `.so` link line now that the loader handles weak undefs at runtime (planned follow-up PR against `nanvix/toolchain-python`).