fix(libc): guard _jp2uc_l/_uc2jp_l calls in towctrans_l with _MB_CAPABLE #11
Open
esaurez wants to merge 1 commit into
When _MB_CAPABLE is undefined, jp2uc.c wraps the bodies of _jp2uc_l
and _uc2jp_l in `#ifdef _MB_CAPABLE`, so the symbols are not defined.
towctrans_l, however, called both helpers unconditionally:
    wint_t u = _jp2uc_l (c, locale);
    ...
    return _uc2jp_l (res, locale);
The result was that builds without _MB_CAPABLE linked successfully
only because nothing referenced towctrans_l; the moment something did
(for example `--whole-archive libc.a` when packaging a static
runtime), the link failed with undefined references to _jp2uc_l and
_uc2jp_l.
Every other caller of these helpers in newlib already guards them
behind `#ifdef _MB_CAPABLE` (iswalnum_l.c, iswalpha_l.c, iswlower_l.c,
iswupper_l.c, iswprint_l.c, iswpunct_l.c, iswspace_l.c, iswgraph_l.c,
iswblank_l.c, iswcntrl_l.c, wcwidth.c, wcswidth.c). towctrans_l was
the lone outlier.
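The guard pattern those callers follow can be sketched in a single file. This is a minimal illustration, not newlib source: `DEMO_MB_CAPABLE`, `demo_jp2uc`, and `demo_iswalpha` are hypothetical stand-ins for `_MB_CAPABLE`, `_jp2uc_l`, and an `isw*_l` caller, and the narrow-ctype fallback for values below 0x100 is an assumption made for the demo.

```c
#include <assert.h>
#include <ctype.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical stand-in for _MB_CAPABLE; left undefined here so the
   no-MB branch is the one that gets compiled. */
/* #define DEMO_MB_CAPABLE 1 */

#ifdef DEMO_MB_CAPABLE
/* Stand-in for _jp2uc_l: exists only in the multibyte build. */
static wint_t
demo_jp2uc (wint_t c)
{
  return c; /* identity mapping, purely illustrative */
}
#endif

/* Stand-in for an isw*_l caller. The call to the helper sits under the
   same guard as the helper's definition, so the no-MB build never
   references a symbol that was compiled out. */
int
demo_iswalpha (wint_t c)
{
#ifdef DEMO_MB_CAPABLE
  return iswalpha (demo_jp2uc (c));
#else
  /* Assumed fallback for the demo: narrow ctype for small values. */
  return c < (wint_t) 0x100 ? isalpha ((int) c) : 0;
#endif
}

/* Small self-check exercising whichever branch was compiled. */
void
demo_guard_self_check (void)
{
  assert (demo_iswalpha ((wint_t) L'a') != 0);
  assert (demo_iswalpha ((wint_t) L'1') == 0);
}
```

If the call in `demo_iswalpha` were moved outside the `#ifdef` while the helper stayed inside it, the no-MB build would reference a function that was never defined, which is the kind of mismatch described for towctrans_l.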
This patch wraps the Unicode case-conversion machinery (caseconv_table,
bisearch, toulower, touupper, and the original Unicode-table
towctrans_l body) in `#ifdef _MB_CAPABLE`, and provides a no-MB
implementation that matches what towlower_l.c / towupper_l.c already do
in the no-MB build: delegate to the narrow-ctype-backed towlower /
towupper. This keeps direct callers of towctrans / towctrans_l
consistent with the sibling locale wrappers in the same configuration
and avoids -Wunused-function on the otherwise-dead static helpers.
The _MB_CAPABLE-on path is functionally unchanged: the preprocessor
selects exactly the same code as before.
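The no-MB branch described above can be sketched as follows. This is a hedged illustration rather than the patched source: `demo_wctrans_t`, `DEMO_WCT_TOLOWER`, `DEMO_WCT_TOUPPER`, and `demo_towctrans` are hypothetical names standing in for newlib's `wctrans_t`, `WCT_TOLOWER`, `WCT_TOUPPER`, and `towctrans_l`, and the locale argument is omitted for brevity.

```c
#include <assert.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical descriptor values; newlib's real WCT_TOLOWER and
   WCT_TOUPPER live in its internal headers and are not used here. */
typedef enum
{
  DEMO_WCT_NONE = 0,
  DEMO_WCT_TOLOWER,
  DEMO_WCT_TOUPPER
} demo_wctrans_t;

/* Sketch of the no-MB shape: no Unicode table and no JIS conversion
   helpers, only delegation to the standard towlower/towupper. */
wint_t
demo_towctrans (wint_t c, demo_wctrans_t w)
{
  switch (w)
    {
    case DEMO_WCT_TOLOWER:
      return towlower (c);
    case DEMO_WCT_TOUPPER:
      return towupper (c);
    default:
      return c; /* unknown descriptor: leave the character unchanged */
    }
}

/* Self-check on ASCII input in the default "C" locale. */
void
demo_towctrans_self_check (void)
{
  assert (demo_towctrans ((wint_t) L'A', DEMO_WCT_TOLOWER) == (wint_t) L'a');
  assert (demo_towctrans ((wint_t) L'a', DEMO_WCT_TOUPPER) == (wint_t) L'A');
  assert (demo_towctrans ((wint_t) L'a', DEMO_WCT_NONE) == (wint_t) L'a');
}
```

Because the sketch references only `towlower` and `towupper`, compiling it introduces no dependency on the JIS conversion helpers, which is the property the patch relies on.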
The same bug exists in upstream sourceware newlib; this fix is being
landed in nanvix/newlib first to unblock dependent work, and will be
proposed upstream separately.
Verified by cross-compiling the patched file with i686-nanvix-gcc:
- without _MB_CAPABLE: towctrans_l defined, no undefined references
  to _jp2uc_l/_uc2jp_l (the bug fix); towlower/towupper are the
  only newly-referenced externals.
- with -D_MB_CAPABLE: towctrans_l defined, _jp2uc_l/_uc2jp_l
  referenced exactly as before; caseconv_table emitted.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
`newlib/libc/ctype/towctrans_l.c` called `_jp2uc_l` and `_uc2jp_l` unconditionally, but those helpers (defined in `jp2uc.c`) are only compiled when `_MB_CAPABLE` is defined. In any newlib configuration without `_MB_CAPABLE`, linking anything that pulls in `towctrans_l.o` fails with undefined references to `_jp2uc_l` and `_uc2jp_l`. This blocks `--whole-archive libc.a` packaging for the nanvix targets.

This patch guards the Unicode/JIS machinery in `towctrans_l.c` with `#ifdef _MB_CAPABLE` and provides a clean no-MB fallback that matches the existing sibling functions.

Why this shape
Every other caller of `_jp2uc_l` / `_uc2jp_l` in newlib already guards the call behind `#ifdef _MB_CAPABLE`:
- `iswalnum_l.c`, `iswalpha_l.c`, `iswlower_l.c`, `iswupper_l.c`, `iswprint_l.c`, `iswpunct_l.c`, `iswspace_l.c`, `iswgraph_l.c`, `iswblank_l.c`, `iswcntrl_l.c`
- `wcwidth.c`, `wcswidth.c`

`towctrans_l.c` was the lone outlier. The fix restores symmetry.

For the no-MB fallback, the patched code delegates to `towlower(c)` / `towupper(c)`. This matches what `towlower_l.c` and `towupper_l.c` already do in their own no-`_MB_CAPABLE` branch, so direct callers of `towctrans` / `towctrans_l` now see exactly the same result as `towlower_l` / `towupper_l` in the same configuration, instead of a Unicode-table-based answer that would diverge from every other no-MB wide-ctype function.

Wrapping the `caseconv_table`, `bisearch`, `toulower`, and `touupper` static helpers in the same `#ifdef _MB_CAPABLE` also avoids `-Wunused-function` on the otherwise-dead code in the no-MB build.

What changes
- `_MB_CAPABLE` defined (existing default): the preprocessor selects exactly the same code as before; `towctrans_l` is byte-for-byte the original implementation. The full `caseconv_table` and Unicode bisearch are still emitted.
- `_MB_CAPABLE` undefined (nanvix default): `towctrans_l` delegates to `towlower` / `towupper` for `WCT_TOLOWER` / `WCT_TOUPPER`, and returns `c` for any other `wctrans_t`. No reference to `_jp2uc_l` / `_uc2jp_l`.

Verification
Cross-compiled `towctrans_l.c` with `i686-nanvix-gcc` from the nanvix toolchain image in both configurations:
- without `_MB_CAPABLE`: `T towctrans_l`; references `towlower` / `towupper`; no undefined `_jp2uc_l` / `_uc2jp_l`
- with `-D_MB_CAPABLE`: `T towctrans_l`; `U _jp2uc_l`, `U _uc2jp_l`; `d caseconv_table` (original behavior)

Upstream note
The same bug exists in upstream sourceware newlib (`mirror/newlib-cygwin` master). This fix is being landed in `nanvix/newlib` first to unblock dependent work; a parallel proposal to upstream is planned as a follow-up.