fix(libc): guard _jp2uc_l/_uc2jp_l calls in towctrans_l with _MB_CAPABLE #11

Open
esaurez wants to merge 1 commit into dev from fix/libc-towctrans-l-guard-mb-capable

esaurez commented May 30, 2026

Summary

newlib/libc/ctype/towctrans_l.c called _jp2uc_l and _uc2jp_l unconditionally, but those helpers (defined in jp2uc.c) are only compiled when _MB_CAPABLE is defined. In any newlib configuration without _MB_CAPABLE, linking anything that pulls in towctrans_l.o fails with undefined references to _jp2uc_l and _uc2jp_l. This blocks --whole-archive libc.a packaging for the nanvix targets.

This patch guards the Unicode/JIS machinery in towctrans_l.c with #ifdef _MB_CAPABLE and provides a clean no-MB fallback that matches the existing sibling functions.

Why this shape

Every other caller of _jp2uc_l/_uc2jp_l in newlib already guards the call behind #ifdef _MB_CAPABLE:

  • iswalnum_l.c, iswalpha_l.c, iswlower_l.c, iswupper_l.c, iswprint_l.c, iswpunct_l.c, iswspace_l.c, iswgraph_l.c, iswblank_l.c, iswcntrl_l.c
  • wcwidth.c, wcswidth.c

towctrans_l.c was the lone outlier. The fix restores symmetry.

For the no-MB fallback, the patched code delegates to towlower(c) / towupper(c). This matches what towlower_l.c and towupper_l.c already do in their own no-_MB_CAPABLE branch:

// towlower_l.c
#ifdef _MB_CAPABLE
  return towctrans_l (c, WCT_TOLOWER, locale);
#else
  return towlower (c);
#endif

So direct callers of towctrans / towctrans_l now see exactly the same result as towlower_l / towupper_l in the same configuration, instead of a Unicode-table-based answer that would diverge from every other no-MB wide-ctype function.

Wrapping the caseconv_table, bisearch, toulower, and touupper static helpers in the same #ifdef _MB_CAPABLE also avoids -Wunused-function on the otherwise-dead code in the no-MB build.
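
The effect of guarding a static helper with the same macro as its only caller can be sketched in isolation. This is a minimal illustration, not the patched file: `SKETCH_MB_CAPABLE`, `sketch_table_lower`, and `sketch_lower` are hypothetical names, and the ASCII-only helper merely stands in for the real table lookup.

```c
#include <wchar.h>
#include <wctype.h>

/* SKETCH_MB_CAPABLE is a hypothetical stand-in for _MB_CAPABLE.
   Leave it undefined to model the no-MB build. */

#ifdef SKETCH_MB_CAPABLE
/* Hypothetical helper standing in for the table-driven lookup.
   Because it sits inside the same guard as its only caller, it is
   never compiled as dead code, so -Wunused-function has nothing to
   report in the no-MB build. */
static wint_t
sketch_table_lower (wint_t c)
{
  if (c >= (wint_t) L'A' && c <= (wint_t) L'Z')
    return c + ((wint_t) L'a' - (wint_t) L'A');
  return c;
}
#endif /* SKETCH_MB_CAPABLE */

wint_t
sketch_lower (wint_t c)
{
#ifdef SKETCH_MB_CAPABLE
  return sketch_table_lower (c);   /* guarded path uses the helper */
#else
  return towlower (c);             /* fallback needs no helper at all */
#endif
}
```

If the helper were left outside the guard while its call site stayed inside, a build with the macro undefined would compile an unreferenced static function, which is the case -Wunused-function reports.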

What changes

  • _MB_CAPABLE defined (existing default): the preprocessor selects exactly the same code as before — towctrans_l is byte-for-byte the original implementation. The full caseconv_table and Unicode bisearch are still emitted.
  • _MB_CAPABLE undefined (nanvix default): towctrans_l delegates to towlower / towupper for WCT_TOLOWER / WCT_TOUPPER, returns c for any other wctrans_t. No reference to _jp2uc_l / _uc2jp_l.
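
The no-MB behavior described in the second bullet can be sketched as a standalone function. This is an illustration under stated assumptions rather than the code in the patch: `sketch_towctrans_l_nomb`, the `SKETCH_WCT_*` selectors and their values, the `int` selector type, and the `void *` locale parameter are all hypothetical stand-ins for newlib's internal definitions.

```c
#include <wchar.h>
#include <wctype.h>

/* Hypothetical stand-ins for newlib's internal WCT_TOLOWER and
   WCT_TOUPPER selectors; the values are for illustration only. */
enum { SKETCH_WCT_TOLOWER = 1, SKETCH_WCT_TOUPPER = 2 };

/* Sketch of the no-MB fallback: `which` plays the role of the
   wctrans_t argument, and `locale` is an opaque placeholder that
   this path ignores. */
wint_t
sketch_towctrans_l_nomb (wint_t c, int which, void *locale)
{
  (void) locale;                 /* unused in the fallback */
  if (which == SKETCH_WCT_TOLOWER)
    return towlower (c);         /* same answer as the towlower_l fallback */
  if (which == SKETCH_WCT_TOUPPER)
    return towupper (c);
  return c;                      /* any other selector: return c unchanged */
}
```

Nothing in this path refers to `_jp2uc_l` or `_uc2jp_l`, so an object file built from it carries no undefined reference to either helper.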

Verification

Cross-compiled towctrans_l.c with i686-nanvix-gcc from the nanvix toolchain image in both configurations:

| Build | Result |
| --- | --- |
| Without _MB_CAPABLE | T towctrans_l; references towlower / towupper; no undefined _jp2uc_l / _uc2jp_l |
| With -D_MB_CAPABLE | T towctrans_l; U _jp2uc_l, U _uc2jp_l; d caseconv_table (original behavior) |

Upstream note

The same bug exists in upstream sourceware newlib (mirror/newlib-cygwin master). This fix is being landed in nanvix/newlib first to unblock dependent work; a parallel proposal to upstream is planned as a follow-up.

jp2uc.c wraps the definitions of _jp2uc_l and _uc2jp_l in
`#ifdef _MB_CAPABLE`, so when _MB_CAPABLE is undefined the symbols do
not exist. towctrans_l, however, called both helpers unconditionally:

    wint_t u = _jp2uc_l (c, locale);
    ...
    return _uc2jp_l (res, locale);

The result was that builds without _MB_CAPABLE linked successfully
only because nothing referenced towctrans_l; the moment something did
(for example `--whole-archive libc.a` when packaging a static
runtime), the link failed with undefined references to _jp2uc_l and
_uc2jp_l.

Every other caller of these helpers in newlib already guards them
behind `#ifdef _MB_CAPABLE` (iswalnum_l.c, iswalpha_l.c, iswlower_l.c,
iswupper_l.c, iswprint_l.c, iswpunct_l.c, iswspace_l.c, iswgraph_l.c,
iswblank_l.c, iswcntrl_l.c, wcwidth.c, wcswidth.c). towctrans_l was
the lone outlier.

This patch wraps the Unicode case-conversion machinery (caseconv_table,
bisearch, toulower, touupper, and the original Unicode-table
towctrans_l body) in `#ifdef _MB_CAPABLE`, and provides a no-MB
implementation that matches what towlower_l.c / towupper_l.c already do
in the no-MB build: delegate to the narrow-ctype-backed towlower /
towupper. This keeps direct callers of towctrans / towctrans_l
consistent with the sibling locale wrappers in the same configuration
and avoids -Wunused-function on the otherwise-dead static helpers.

The _MB_CAPABLE-on path is functionally unchanged: the preprocessor
selects exactly the same code as before.

The same bug exists in upstream sourceware newlib; this fix is being
landed in nanvix/newlib first to unblock dependent work, and will be
proposed upstream separately.

Verified by cross-compiling the patched file with i686-nanvix-gcc:
  - without _MB_CAPABLE: towctrans_l defined, no undefined references
    to _jp2uc_l/_uc2jp_l (the bug fix); towlower/towupper are the
    only newly-referenced externals.
  - with -D_MB_CAPABLE: towctrans_l defined, _jp2uc_l/_uc2jp_l
    referenced exactly as before; caseconv_table emitted.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>