[pauthabielf64] Define R_AARCH64_AUTH_TLSDESC_CALL. #395
The TLSDESC sequence for accessing an authenticated pointer is similar to the traditional TLSDESC sequence. As pointer signing must be done at run-time, there are far fewer opportunities to relax a TLSDESC AUTH sequence. We introduce an R_AARCH64_AUTH_TLSDESC_CALL relocation that permits a static linker to transform the blraa to a nop when relaxation is possible.
A new relocation code has been introduced rather than reusing R_AARCH64_TLSDESC_CALL so that a static linker can assume that the destination symbol is signed, without having to derive it from other R_AARCH64_TLSDESC_* relocations.
TLSDESC sequence
adrp x0, :tlsdesc:v                    //R_AARCH64_TLSDESC_ADR_PAGE21
ldr x1, [x0, #:tlsdesc_lo12:v]         //R_AARCH64_TLSDESC_LD64_LO12
add x0, x0, #:tlsdesc_lo12:v           //R_AARCH64_TLSDESC_ADD_LO12
.tlsdesccall v                         //R_AARCH64_TLSDESC_CALL
blr x1
TLSDESC AUTH sequence
adrp x0, :tlsdesc_auth:v               //R_AARCH64_AUTH_TLSDESC_ADR_PAGE21
ldr x16, [x0, #:tlsdesc_auth_lo12:v]   //R_AARCH64_AUTH_TLSDESC_LD64_LO12
add x0, x0, #:tlsdesc_auth_lo12:v      //R_AARCH64_AUTH_TLSDESC_ADD_LO12
.tlsdescauthcall v                     //R_AARCH64_AUTH_TLSDESC_CALL
blraa x16, x0
fixes ARM-software#393
Thanks @smithp35! Just for transparency - PRs implementing support in llvm-project:
ldr x16, [x0, :tlsdesc_auth_lo12: undefined_weak // R_AARCH64_AUTH_TLSDESC_LD64_LO12
add x0, x0 :tlsdesc_auth_lo12: undefined_weak // R_AARCH64_AUTH_TLSDESC_ADD_LO12
.tlddescauthcall undefined_weak // R_AARCH64_AUTH_TLSDESC_CALL
autia x0, x8
#393 (comment) gives this as blraa x16, x0 instead?
Thanks for spotting. Will update.
autia x0, x8

// After relaxation, assuming undefined_weak is known to be 0 at static-link time.
mov x0, #0x0
If the resolver is returning an offset from TP (potentially between distinct allocations), wouldn't this cause a change in behaviour from giving "NULL" to giving "TP"? At least within FreeBSD, our normal AArch64 TLSDESC resolver for undefined weak symbols returns -TP(+A), so adding it to TP gives NULL(+A). I know undefined weak TLS objects are historically very cursed and break in all kinds of ways, but this would be a regression over non-PAuth TLSDESC, I think.
(This also loses the addend entirely)
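The arithmetic behind this concern can be sketched in C. This is only an illustrative model, not code from any resolver or linker: the function names and the thread pointer value are made up, and the resolver is assumed to return -TP(+A) for an undefined weak symbol, as described for FreeBSD above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a TLSDESC resolver for an undefined weak symbol that
 * returns -TP(+A). Unsigned arithmetic wraps modulo 2^64, like the registers. */
uint64_t resolver_undef_weak(uint64_t tp, uint64_t addend) {
    return addend - tp;
}

/* Models the relaxed sequence under review, which sets x0 to 0 (mov x0, #0x0). */
uint64_t relaxed_undef_weak(void) {
    return 0;
}

/* Models the code that follows the TLSDESC sequence and adds the result to TP. */
uint64_t tls_address(uint64_t tp, uint64_t offset) {
    return tp + offset;
}

/* Self-check using an arbitrary, made-up thread pointer value. */
void check_undef_weak_model(void) {
    const uint64_t tp = UINT64_C(0x7f12345000);

    /* Resolver path: TP + (-TP + A) == A, so a zero addend yields NULL. */
    assert(tls_address(tp, resolver_undef_weak(tp, 0)) == 0);
    assert(tls_address(tp, resolver_undef_weak(tp, 8)) == 8);

    /* Relaxed path: TP + 0 == TP, so the result is TP and any addend is lost. */
    assert(tls_address(tp, relaxed_undef_weak()) == tp);
}
```

In this model the resolver path produces NULL(+A), while the relaxed path produces TP regardless of the addend, which is the behaviour change being pointed out.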
This is a good point. @kovdan01 is your TLSDESC resolver function for an undefined weak capable of returning -TP? I believe AArch64 glibc does this too for undefined weak TLS symbols.
I think this would be preferable to relaxing the sequence to 0, as currently that 0 would get added to TP, which would not result in a value of 0 for an undefined weak.
There's always the possibility of altering the TLSDESC sequence to check for 0 before adding to TP, as 0 isn't a valid offset (the first offset is the TCB size plus alignment padding, which is a minimum of 16). However, this seems worse than the resolver.
As an aside, I've not been able to create a TLSDESC sequence with a non-zero addend from some simple compiled code. It always seems that the TLSDESC sequence calculates the address of the symbol, and the code then does an addition or a load with an immediate offset instead.
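For comparison, the alternative mentioned above of checking for 0 before adding to TP could look roughly like the following C model. It is a sketch under assumptions: the function names are hypothetical, and it relies on the statement above that 0 is never a valid offset, so that 0 can stand for "undefined weak".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical consumer of the TLSDESC result that treats an offset of 0 as
 * "undefined weak" and skips the addition to TP. This assumes 0 is never a
 * valid offset because real offsets start after the TCB. */
uint64_t tls_address_checked(uint64_t tp, uint64_t offset) {
    return offset == 0 ? 0 : tp + offset;
}

/* Self-check using an arbitrary, made-up thread pointer value. */
void check_zero_offset_model(void) {
    const uint64_t tp = UINT64_C(0x7f12345000);

    /* A relaxed undefined weak (offset 0) now yields NULL rather than TP. */
    assert(tls_address_checked(tp, 0) == 0);

    /* A defined symbol at the smallest offset quoted above still resolves. */
    assert(tls_address_checked(tp, 16) == tp + 16);
}
```

The extra comparison would sit in every access sequence, which may be one reason this option is described as worse than having the resolver return -TP.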
Oh, that may be true in practice, just like GOT entries typically don't actually have addends (except when turned to relative ones, of course) as it's annoying to have potentially multiple entries per symbol to track in the linker.
@kovdan01 is your TLSDESC resolver function for an undefined weak capable of returning -TP? I believe AArch64 glibc does this too for undefined weak TLS symbols.
@smithp35 Yes, glibc does this, and we match that behavior in our PoC reference musl implementation - see https://github.com/access-softek/musl/blob/v1.2.5-pauth-rev2025-11-21/src/ldso/aarch64/tlsdesc.S#L63-L70
The R_AARCH64_AUTH_TLSDESC_CALL is introduced to allow linker relaxation of AUTH TLSDESC call sequences for non-preemptible undefined weak symbols. The lld patch introducing the relaxation: #194636 Corresponding ARM docs PR: ARM-software/abi-aa#395
…reloc See specification ARM-software/abi-aa#395
* use blraa as in commit message.
* mention that a value of 0 when added to the thread pointer is invalid.
I've fixed the typos and added a note that when 0 is added to the TP it will point at the thread control block. Do we still need this? EDIT: yes, as on other platforms there may be other relaxations possible.
Reading through https://www.fsfla.org/~lxoliva/writeups/TLS/RFC-TLSDESC-ARM.txt again. There is a relaxable sequence that could be used when static linking (so we know that the weak reference won't be defined). In effect this is inlining the resolver function that returns -TP.
.. note::

   Relocation code ``R_AARCH64_AUTH_TLSDESC_CALL`` is needed to permit
I think I'll move the details about the relaxation to a separate document in the design-documents folder.
I think it is a platform-specific choice whether both fields of the TLS descriptor are signed. If only the resolver function address is signed then more relaxations are possible.
Likely to be next week before I can do that.
Well, it's not about whether the second word of the descriptor is signed. The generated code never uses that, either it passes a pointer to it as an opaque blob to the resolver or it relaxes the entire sequence so there is no descriptor to sign? The question is whether the TLS data is being signed like globals can be, and therefore whether &tls_var - TP is the same for all threads or differs in the high bits due to signing. If the data isn't signed, you can just relax to that constant (which will be the same as non-PAuth, and is either an LE immediate or an IE run-time constant), it's only if the data is signed that IE/LE are fundamentally broken?
Thanks for the comment. I'll probably just take the rationale/relaxation bits out of the main document for now until I've got the time to work through this slowly.
Reading through the initial issue again #393. I think I've put too much weight on the comment:
Broader relaxations (such as GD->IE or GD->LE with non-statically-known-NULL symbols) are not possible because pointer authentication requires signing at program start-up.
I missed the "non-statically-known-NULL symbols" part in my haste when reading. I had been trying to reconcile why other relaxations weren't possible with the code sequence the compiler uses for TLSDESC and a signed GOT.
As an aside:
Empirically using: clang --target=aarch64-linux -march=armv8.3-a -S -O2 tlsdesc.c -o - -fptrauth-elf-got -mabi=pauthtest with a trivial __thread int x; int val() { return x; }
I get:
pacibsp
stp x29, x30, [sp, #-16]! // 16-byte Folded Spill
mov x29, sp
adrp x0, :tlsdesc_auth:x
ldr x16, [x0, :tlsdesc_auth_lo12:x]
add x0, x0, :tlsdesc_auth_lo12:x
blraa x16, x0
mrs x8, TPIDR_EL0
ldr w0, [x8, x0]
ldp x29, x30, [sp], #16 // 16-byte Folded Reload
retab
If I'm reading that correctly, the return value from the TLS resolver function isn't signed. Nor is the value of x. It looks like the only thing that is signed is the descriptor in the GOT.
That looks relaxable in principle, although maybe not in practice, to initial exec. I'd expect that if it were initial exec, the (&tls_var - TP) would be signed in the GOT. There are enough spare instructions and registers to extract the unsigned (&tls_var - TP), but there aren't enough spare instructions to test whether the authentication failed, which I believe is a requirement for -fptrauth-traps (for systems without FEAT_FPAC).