Staticlib hide internal symbols#155338
Conversation
|
rustbot has assigned @petrochenkov. Use Why was this reviewer chosen?The reviewer was selected based on:
|
This comment has been minimized.
This comment has been minimized.
|
This would also need to rename symbols to avoid conflicts between two rust staticlibs ending up getting linked together, right? |
Why exactly is that the case? |
ff707ad to
7ac49d1
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
My primary goal right now is to reduce binary size, so I don't have immediate plans to implement symbol renaming. This means that linking multiple Rust staticlibs together can still result in multiple definition errors. Would you like me to address that in this PR as well? It seems feasible to implement — for example, by rehashing symbols and updating their references accordingly. |
I previously assumed this symbol needed to remain externally visible to support scenarios requiring cross-language exception propagation. Do you think we should also set rust_eh_personality as hidden? |
|
If it isn't too hard it would be nice to do symbol renaming too. I think doing in-place modification isn't going to work for that though. Adding a unique suffix would require growing the size of the string table. |
|
Got it. I will first fix the |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
5e1c3a1 to
c7d4e98
Compare
|
@bors delegate=try |
|
✌️ @cezarbbb, you can now perform try builds on this pull request! You can now post |
|
@bors try |
This comment has been minimized.
This comment has been minimized.
`-Zstaticlib-hide-internal-symbols`: Hide non-exported internal symbols from staticlibs
|
@bors try jobs=x86_64-* |
This comment has been minimized.
This comment has been minimized.
|
Preliminarily speaking, I think it is renaming user-defined symbols, causing undefined symbol errors when the output goes into the linking stage: (I am using cxx.rs btw) Simply toggling the rename flag off fixes this build error. Still investigating. |
|
I think the rename flag indeed modifies public symbols:
When linking against It also seems like the flag breaks mangling somehow as such it could demangle |
|
I see. The |
|
I think I can filter by filename and only process |
|
I have no idea, but let me test that already on my setup |
|
I think I'm like 99% sure renaming works flawlessly now. So my setup was 2 Rust libraries being ported to C++ via cxx.rs and linked independently into a mobile app. There were
For one of the Rust libraries I built it with the That was for Android though, for iOS I'll get that tested very soon too, maybe within the next hour. |
|
Took me long time to test for iOS cause I'm having other unrelated issues with the iOS build, but nonetheless judging by how far I am I think the patch just works on iOS too ✅ |
|
@cezarbbb asked me to take over the review since @bjorn3 didn't respond for some time. r? @petrochenkov |
What do the numbers in the tables mean?
|
|
|
| # `staticlib-hide-internal-symbols` | ||
|
|
||
| When building a `staticlib`, this option hides all non-exported Rust-internal | ||
| symbols by setting their ELF visibility to `STV_HIDDEN`. |
There was a problem hiding this comment.
What is the difference with -Z default-visibility=hidden here?
As far as I can see, that option will also set visibility to hidden for all SymbolExportLevel::Rust symbols.
There was a problem hiding this comment.
-Z default-visibility=hidden only affects the current crate's codegen. A staticlib also bundles .o files from upstream crates (std, core, etc.) that were compiled without it. -Zstaticlib-hide-internal-symbols post-processes the entire archive including those pre-compiled upstream objects.
There was a problem hiding this comment.
Could you add this comparison with -Z default-visibility=hidden to the documentation?
|
(I started reviewing, will continue on Monday.) |
Sorry for the confusing column names. Here's what the table shows: The test links the staticlib into a shared library using
The column names are bad — I'll rewrite the table with clearer labels when I update the PR description. |
Makes sense, I'll split it. |
352adb2 to
786373a
Compare
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
Working on splitting this pull request, and it seems some functions will be reused. Clarifying the logic. |
786373a to
40c254b
Compare
-Zstaticlib-hide-internal-symbols and Zstaticlib-rename-internal-symbols: hide/rename internal symbols in staticlibs
This comment has been minimized.
This comment has been minimized.
40c254b to
54a7c4c
Compare
Yes, could you put the legend from #155338 (comment) into the PR description before the tables as well. |
|
|
||
| use std::collections::HashSet; | ||
|
|
||
| use run_make_support::object::Endianness; |
There was a problem hiding this comment.
| use run_make_support::object::Endianness; | |
| use run_make_support::object::{elf, Endianness}; |
There are many uses of elf::something below, let's clean them up.
|
|
||
| for member in archive.members() { | ||
| let member = member.unwrap(); | ||
| let member_name = std::str::from_utf8(member.name()).unwrap(); |
There was a problem hiding this comment.
| let member_name = std::str::from_utf8(member.name()).unwrap(); | |
| let member_name = str::from_utf8(member.name()).unwrap(); |
| ) -> Option<&'data str> { | ||
| let bytes = strtab.get(symbol.st_name(endian)).ok()?; | ||
| let end = bytes.iter().position(|&b| b == 0).unwrap_or(bytes.len()); | ||
| std::str::from_utf8(&bytes[..end]).ok() |
There was a problem hiding this comment.
| std::str::from_utf8(&bytes[..end]).ok() | |
| str::from_utf8(&bytes[..end]).ok() |
|
|
||
| fn visibility_name(v: u8) -> &'static str { | ||
| match v { | ||
| v if v == object::elf::STV_DEFAULT => "STV_DEFAULT", |
There was a problem hiding this comment.
| v if v == object::elf::STV_DEFAULT => "STV_DEFAULT", | |
| elf::STV_DEFAULT => "STV_DEFAULT", |
Can't you match on these directly?
| for member in archive.members() { | ||
| let member = member.unwrap(); | ||
| let member_name = std::str::from_utf8(member.name()).unwrap(); | ||
| if !member_name.ends_with(".rcgu.o") { |
There was a problem hiding this comment.
| if !member_name.ends_with(".rcgu.o") { | |
| if !member.name().ends_with(b".rcgu.o") { |
will probably work too, slices support ends_with.
| let bind = symbol.st_bind(); | ||
| let shndx = symbol.st_shndx(endian); | ||
|
|
||
| if shndx == object::elf::SHN_UNDEF as u16 { |
There was a problem hiding this comment.
| if shndx == object::elf::SHN_UNDEF as u16 { | |
| if shndx == object::elf::SHN_UNDEF { |
| } | ||
| } | ||
|
|
||
| fn read_symbol_name<'data>( |
There was a problem hiding this comment.
Doesn't symbol.name() just work?
I see other tests using it.
| v if v == object::elf::STV_PROTECTED => "STV_PROTECTED", | ||
| _ => "UNKNOWN", | ||
| } | ||
| } |
There was a problem hiding this comment.
Most comments from this file basically apply to tests/run-make/staticlib-hide-internal-symbols-macho/rmake.rs too.
| } else if let Some(symbols) = crate_info.exported_symbols.get(&CrateType::StaticLib) { | ||
| use rustc_data_structures::fx::FxHashSet; | ||
| let keep: FxHashSet<String> = symbols.iter().map(|(s, _)| s.clone()).collect(); | ||
| ab.set_hide_symbols(keep); |
There was a problem hiding this comment.
| ab.set_hide_symbols(keep); | |
| ab.set_hide_symbols(symbols.iter().map(|(s, _)| s.clone()).collect()); |
View all comments
According to issue #104707, when building a staticlib, all Rust internal symbols — mangled symbols,
#[rustc_std_internal_symbol]items, allocator shims, etc. — leak out of the static archive. In contrast, cdylib correctly exports only#[no_mangle]symbols via a linker version script.-Zstaticlib-hide-internal-symbolsdirectly post-processes ELF object files in the archive: parsing theSHT_SYMTABsections and settingSTV_HIDDENvisibility on anyGLOBAL/WEAKdefined symbol that is not in the exported symbol set, without changing the binding. This is an in-place modification (only writing the st_other byte per matching entry), with zero overhead.Supported on ELF targets (Linux, BSD, etc.) and Apple targets (macOS, iOS, etc.). On unsupported targets (Windows), a warning is emitted and the flag has no effect.
Update: The rename counterpart (
-Zstaticlib-rename-internal-symbols) is in #156950.The test code are as follows:
1.a std rust staticlib:
1.b downstream c program:
The test results with different compiler flags(which might cause binary size reduction) are as follows:
1.c result with
-Zstaticlib-hide-internal-symbols1.d result with
-Zstaticlib-hide-internal-symbols + -Zstaticlib-rename-internal-symbols2.a no_std rust staticlib
2.b downstream c program
The test results with different compiler flags(which might cause binary size reduction) are as follows:
2.c result with
-Zstaticlib-hide-internal-symbols2.d result with
-Zstaticlib-hide-internal-symbols + -Zstaticlib-rename-internal-symbolsTest results show that this compiler option is beneficial for scenarios where LTO cannot be enabled.
r? @bjorn3 @petrochenkov