
[pauthabielf64] Define R_AARCH64_AUTH_TLSDESC_CALL. #395

Open
smithp35 wants to merge 2 commits into ARM-software:main from smithp35:tlsauthdesccall

@smithp35
Contributor

The TLSDESC sequence for accessing an authenticated pointer is similar to the traditional TLSDESC sequence. Because pointer signing must be done at run time, there are far fewer opportunities to relax a TLSDESC AUTH sequence. We introduce an R_AARCH64_AUTH_TLSDESC_CALL relocation that permits a static linker to transform the blraa into a nop when relaxation is possible.

A new relocation code has been introduced rather than reusing R_AARCH64_TLSDESC_CALL so that a static linker can assume that the destination symbol is signed, without having to derive it from other R_AARCH64_TLSDESC_* relocations.

TLSDESC sequence

adrp x0, :tlsdesc:v             // R_AARCH64_TLSDESC_ADR_PAGE21
ldr  x1, [x0, #:tlsdesc_lo12:v] // R_AARCH64_TLSDESC_LD64_LO12
add  x0, x0, #:tlsdesc_lo12:v   // R_AARCH64_TLSDESC_ADD_LO12
.tlsdesccall v                  // R_AARCH64_TLSDESC_CALL
blr  x1

TLSDESC AUTH sequence

adrp  x0, :tlsdesc_auth:v              // R_AARCH64_AUTH_TLSDESC_ADR_PAGE21
ldr   x16, [x0, #:tlsdesc_auth_lo12:v] // R_AARCH64_AUTH_TLSDESC_LD64_LO12
add   x0, x0, #:tlsdesc_auth_lo12:v    // R_AARCH64_AUTH_TLSDESC_ADD_LO12
.tlsdescauthcall v                     // R_AARCH64_AUTH_TLSDESC_CALL
blraa x16, x0

fixes #393
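
As a rough illustration of the relaxation described above, a static linker's handling of the new relocation might look something like the sketch below. The function name `relax_auth_tlsdesc_call` and its byte-buffer interface are hypothetical and not taken from any linker. The two encodings are the A64 encodings of `blraa x16, x0` and `nop`. Whether the relaxation is legal for the symbol is assumed to have been decided by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

#define A64_BLRAA_X16_X0 0xD73F0A00u /* blraa x16, x0 */
#define A64_NOP          0xD503201Fu /* nop */

/* Read one little-endian A64 instruction word. */
static uint32_t read_insn(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write one little-endian A64 instruction word. */
static void write_insn(uint8_t *p, uint32_t insn) {
    p[0] = (uint8_t)insn;
    p[1] = (uint8_t)(insn >> 8);
    p[2] = (uint8_t)(insn >> 16);
    p[3] = (uint8_t)(insn >> 24);
}

/* Hypothetical helper: `loc` points at the instruction marked by
 * R_AARCH64_AUTH_TLSDESC_CALL. Returns false and leaves the code
 * untouched if the instruction is not the expected blraa. */
static bool relax_auth_tlsdesc_call(uint8_t *loc) {
    if (read_insn(loc) != A64_BLRAA_X16_X0)
        return false;
    write_insn(loc, A64_NOP);
    return true;
}
```

Because the relocation code is specific to the AUTH sequence, the helper does not need to inspect the neighbouring relocations to know that it is looking at a signed call.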

@kovdan01

Comment thread pauthabielf64/pauthabielf64.rst Outdated
ldr x16, [x0, :tlsdesc_auth_lo12: undefined_weak // R_AARCH64_AUTH_TLSDESC_LD64_LO12
add x0, x0 :tlsdesc_auth_lo12: undefined_weak // R_AARCH64_AUTH_TLSDESC_ADD_LO12
.tlddescauthcall undefined_weak // R_AARCH64_AUTH_TLSDESC_CALL
autia x0, x8

Contributor

#393 (comment) gives this as blraa x16, x0 instead?

Contributor Author

Thanks for spotting. Will update.

autia x0, x8

// After relaxation, assuming undefined_weak is known to be 0 at static-link time.
mov x0, #0x0

Contributor

If the resolver is returning a (potentially between distinct allocations) offset from TP, wouldn't this cause a change in behaviour from giving "NULL" to giving "TP"? At least within FreeBSD our normal AArch64 TLSDESC resolver for undefined weak symbols is to return -TP(+A) so adding it to TP gives NULL(+A). I know undefined weak TLS objects are historically very cursed and break in all kinds of ways, but this would be a regression over non-PAuth TLSDESC, I think.

Contributor

(This also loses the addend entirely)

Contributor Author

This is a good point. @kovdan01 is your TLSDESC resolver function for an undefined weak capable of returning -TP? I believe AArch64 glibc does this too for undefined weak TLS symbols.

I think this would be preferable to relaxing the sequence to 0 as currently that 0 would get added to TP, which would not result in a value of 0 for an undefined weak.

There's always the possibility of altering the TLSDESC sequence to check for 0 before adding to TP, as 0 isn't a valid offset (the first offset is the TCB size plus alignment padding, which is a minimum of 16). However, this seems worse than having the resolver return -TP.

As an aside, I've not been able to create a TLSDESC sequence with a non-zero addend from some simple compiled code. It always seems that the TLSDESC sequence calculates the address of the symbol, and the code then does an addition, or a load with an immediate offset, instead.
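
To make the arithmetic in this thread concrete, here is a small C model of what the code following the TLSDESC call computes. It is only a sketch: `tls_address`, `resolver_undef_weak` and `tls_address_zero_checked` are hypothetical names, the thread pointer is passed in as a plain integer rather than read from TPIDR_EL0, and the addend A is modelled as a byte offset.

```c
#include <stdint.h>

/* What the code after the TLSDESC sequence does: x0 holds the value
 * produced by the sequence and is added to the thread pointer.
 * Unsigned arithmetic wraps, as the hardware add does. */
static uintptr_t tls_address(uintptr_t tp, uintptr_t x0) {
    return tp + x0;
}

/* Model of a resolver for an undefined weak symbol that returns -TP(+A). */
static uintptr_t resolver_undef_weak(uintptr_t tp, uintptr_t addend) {
    return (uintptr_t)0 - tp + addend;
}

/* Model of the alternative mentioned above: treat 0 as "not a valid
 * offset" and skip the add, so the result is NULL. */
static uintptr_t tls_address_zero_checked(uintptr_t tp, uintptr_t x0) {
    return x0 == 0 ? 0 : tp + x0;
}
```

With the sequence relaxed to `mov x0, #0x0`, `tls_address(tp, 0)` is TP itself rather than NULL, whereas `tls_address(tp, resolver_undef_weak(tp, a))` is NULL(+A), matching the non-PAuth behaviour described above.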

Contributor

Oh, that may be true in practice, just like GOT entries typically don't actually have addends (except when turned to relative ones, of course) as it's annoying to have potentially multiple entries per symbol to track in the linker.

> @kovdan01 is your TLSDESC resolver function for an undefined weak capable of returning -TP? I believe AArch64 glibc does this too for undefined weak TLS symbols.

@smithp35 Yes, glibc does this, and we match that behavior in our PoC reference musl implementation - see https://github.com/access-softek/musl/blob/v1.2.5-pauth-rev2025-11-21/src/ldso/aarch64/tlsdesc.S#L63-L70

kovdan01 added a commit to llvm/llvm-project that referenced this pull request May 25, 2026
The R_AARCH64_AUTH_TLSDESC_CALL is introduced to allow linker relaxation of
AUTH TLSDESC call sequences for non-preemptible undefined weak symbols.

The lld patch introducing the relaxation: #194636

Corresponding ARM docs PR: ARM-software/abi-aa#395
jollaitbot pushed a commit to sailfishos-mirror/llvm-project that referenced this pull request May 26, 2026
* use blraa as in commit message.
* mention that a value of 0 when added to the thread pointer is invalid.
@smithp35
Contributor Author

smithp35 commented May 27, 2026

I've fixed the typos and added a note that when 0 is added to the TP it will point at the thread control block.

Do we still need this? EDIT: yes, as on other platforms there may be other relaxations possible.

@smithp35
Contributor Author

Reading through https://www.fsfla.org/~lxoliva/writeups/TLS/RFC-TLSDESC-ARM.txt again, there is a relaxable sequence that could be used when static linking (so we know that the weak reference won't be defined). In effect this inlines the resolver function that returns -TP:

mrs x0, TPIDR_EL0
neg x0, x0 // alias of sub x0, xzr, x0
nop
nop
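
A linker-side sketch of that rewrite could look like the following. It assumes that the four instructions of the TLSDESC AUTH sequence are contiguous at `seq` and that the caller has already established that the symbol is an undefined weak that cannot be defined at run time. `relax_seq_to_neg_tp` and `put_insn` are hypothetical names; the encodings are the A64 encodings of the instructions listed above.

```c
#include <stdint.h>

#define A64_MRS_X0_TPIDR_EL0 0xD53BD040u /* mrs x0, TPIDR_EL0 */
#define A64_NEG_X0_X0        0xCB0003E0u /* neg x0, x0 (sub x0, xzr, x0) */
#define A64_NOP              0xD503201Fu /* nop */

/* Write one little-endian A64 instruction word. */
static void put_insn(uint8_t *p, uint32_t insn) {
    p[0] = (uint8_t)insn;
    p[1] = (uint8_t)(insn >> 8);
    p[2] = (uint8_t)(insn >> 16);
    p[3] = (uint8_t)(insn >> 24);
}

/* Overwrite adrp / ldr / add / blraa (assumed contiguous, 16 bytes)
 * with the inlined "return -TP" sequence. */
static void relax_seq_to_neg_tp(uint8_t *seq) {
    static const uint32_t relaxed[4] = {
        A64_MRS_X0_TPIDR_EL0,
        A64_NEG_X0_X0,
        A64_NOP,
        A64_NOP,
    };
    for (int i = 0; i < 4; i++)
        put_insn(seq + 4 * i, relaxed[i]);
}
```

The replacement is the same length as the original sequence, so no other code needs to move.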


.. note::

Relocation code ``R_AARCH64_AUTH_TLSDESC_CALL`` is needed to permit

Contributor Author

I think I'll move the details about the relaxation to a separate document in the design-documents folder.

I think it is a platform-specific choice whether both fields of the TLS descriptor are signed. If only the resolver function address is signed, then more relaxations are possible.

Likely to be next week before I can do that.

Contributor

Well, it's not about whether the second word of the descriptor is signed. The generated code never uses that, either it passes a pointer to it as an opaque blob to the resolver or it relaxes the entire sequence so there is no descriptor to sign? The question is whether the TLS data is being signed like globals can be, and therefore whether &tls_var - TP is the same for all threads or differs in the high bits due to signing. If the data isn't signed, you can just relax to that constant (which will be the same as non-PAuth, and is either an LE immediate or an IE run-time constant), it's only if the data is signed that IE/LE are fundamentally broken?

Contributor Author

Thanks for the comment. I'll probably just take the rationale/relaxation bits out of the main document for now until I've got the time to work through this slowly.

Reading through the initial issue #393 again, I think I've put too much weight on the comment:

> Broader relaxations (such as GD->IE or GD->LE with non-statically-known-NULL symbols) are not possible because pointer authentication requires signing at program start-up.

I missed the non-statically-known-NULL symbols part in my haste when reading. I had been trying to reconcile why other relaxations weren't possible with the code sequence the compiler uses for TLSDESC and a signed GOT.

As an aside:

Empirically using: clang --target=aarch64-linux -march=armv8.3-a -S -O2 tlsdesc.c -o - -fptrauth-elf-got -mabi=pauthtest with a trivial __thread int x; int val() { return x; }

I get:

        pacibsp
        stp     x29, x30, [sp, #-16]!           // 16-byte Folded Spill
        mov     x29, sp
        adrp    x0, :tlsdesc_auth:x
        ldr     x16, [x0, :tlsdesc_auth_lo12:x]
        add     x0, x0, :tlsdesc_auth_lo12:x
        blraa   x16, x0
        mrs     x8, TPIDR_EL0
        ldr     w0, [x8, x0]
        ldp     x29, x30, [sp], #16             // 16-byte Folded Reload
        retab

If I'm reading that correctly, the return value from the TLS resolver function isn't signed. Nor is the value of x. It looks like the only thing that is signed is the descriptor in the GOT.

That looks relaxable to initial exec in principle, although maybe not in practice. If it were initial exec, I'd expect (&tls_var - TP) to be signed in the GOT. There are enough spare instructions and registers to extract the unsigned (&tls_var - TP), but not enough spare instructions to test whether the authentication failed, which I believe is a requirement for -fptrauth-traps (for systems without FEAT_FPAC).

Development

Successfully merging this pull request may close these issues.

[PAUTHABIELF64] Introduce R_AARCH64_AUTH_TLSDESC_CALL and .tlsdescauthcall

3 participants