gh-149202: Fix frame pointer unwinding on s390x and ARM #149409

Open
encukou wants to merge 7 commits into python:main from encukou:gh-149202-2

Conversation

@encukou
Member

@encukou encukou commented May 5, 2026

pablogsal and others added 5 commits May 4, 2026 20:14
-fno-omit-frame-pointer is not enough to make every target walkable by the simple manual frame pointer unwinder.

The helper used by test_frame_pointer_unwind used to assume the frame pointer named a two-word record where fp[0] was the previous frame pointer and fp[1] was the return address. That is only the generic layout used by some targets. This patch keeps that default, but moves the slots behind named offsets so architecture-specific layouts can describe where the backchain and return address really live.
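
A minimal sketch of the named-offset idea, run against a simulated stack rather than real frames. The names `frame_layout`, `GENERIC_LAYOUT`, and `walk_frames` are hypothetical and chosen for illustration; the identifiers in the actual helper may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of where a frame record keeps its two slots,
   expressed as word indexes relative to the frame pointer. */
typedef struct {
    ptrdiff_t backchain_slot;  /* saved previous frame pointer */
    ptrdiff_t return_slot;     /* saved return address */
} frame_layout;

/* The generic two-word record: fp[0] = previous fp, fp[1] = return address. */
static const frame_layout GENERIC_LAYOUT = {0, 1};

/* Collect up to max_frames return addresses, starting at frame pointer fp.
   The walk ends at a null frame pointer or when the chain stops moving
   toward higher addresses (the stack is assumed to grow downward). */
static size_t
walk_frames(uintptr_t fp, const frame_layout *layout,
            uintptr_t *out, size_t max_frames)
{
    size_t n = 0;
    assert(layout != NULL && out != NULL);
    while (fp != 0 && n < max_frames) {
        const uintptr_t *frame = (const uintptr_t *)fp;
        out[n++] = frame[layout->return_slot];
        uintptr_t next = frame[layout->backchain_slot];
        if (next <= fp) {
            break;  /* null or non-monotonic link: stop instead of looping */
        }
        fp = next;
    }
    return n;
}
```

With this shape, an architecture-specific layout only has to supply a different pair of slots; the loop itself stays unchanged.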

On s390x, GCC and Clang do not emit a usable backchain unless -mbackchain is also enabled. Without it, the unwinder stops at the current C frame and the test reports no Python frames. Once backchains are present, the helper must also stop at the current thread's known C stack bounds; otherwise it can follow the final backchain far enough to dereference an invalid frame and segfault. For Linux s390x backchain frames, the documented z/Architecture stack-frame layout saves r14, the return-address register, at byte offset 112 from the frame pointer, so read the return address from that named slot instead of fp[1].

The 112-byte offset comes from Linux's s390 debugging documentation: its Stack Frame Layout table shows z/Architecture backchain frames with the backchain at offset 0 and saved r14 of the caller function at offset 112: https://www.kernel.org/doc/html/v5.3/s390/debugging390.html#stack-frame-layout

This helper remains scoped to Linux s390x backchain frames. GNU SFrame's s390x notes state that the s390x ELF ABI does not generally mandate where RA and FP are saved, or whether they are saved at all: https://sourceware.org/binutils/docs/sframe-spec.html#s390x
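
The s390x rules above can be sketched as a bounded walk that checks each frame against the stack limits before dereferencing it, then reads the return address from byte offset 112. This is an illustration over a simulated buffer, not the helper's code: the macro and function names are hypothetical, and the caller is assumed to supply the stack bounds.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte offsets from the Linux s390 stack-frame layout table cited above. */
#define S390X_BACKCHAIN_OFFSET 0
#define S390X_SAVED_R14_OFFSET 112

/* Load one pointer-sized word from an address already known to be valid. */
static uintptr_t
load_word(uintptr_t addr)
{
    uintptr_t value;
    memcpy(&value, (const void *)addr, sizeof(value));
    return value;
}

/* True if `need` bytes starting at fp lie inside [low, high). */
static int
frame_in_bounds(uintptr_t fp, uintptr_t low, uintptr_t high, size_t need)
{
    return fp >= low && fp <= high && (high - fp) >= need;
}

/* Follow backchains from fp, recording saved r14 values. The walk stops
   before touching any frame that falls outside the given stack bounds. */
static size_t
walk_backchain(uintptr_t fp, uintptr_t low, uintptr_t high,
               uintptr_t *out, size_t max_frames)
{
    const size_t need = S390X_SAVED_R14_OFFSET + sizeof(uintptr_t);
    size_t n = 0;
    while (n < max_frames && frame_in_bounds(fp, low, high, need)) {
        out[n++] = load_word(fp + S390X_SAVED_R14_OFFSET);
        uintptr_t next = load_word(fp + S390X_BACKCHAIN_OFFSET);
        if (next <= fp) {
            break;  /* null or backwards link ends the chain */
        }
        fp = next;
    }
    return n;
}
```

The bounds check runs before every load, so a final backchain that points past the top of the stack ends the walk instead of being dereferenced.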

On 32-bit ARM, GCC defaults to Thumb mode on common armhf toolchains. The Thumb prologue keeps the saved frame pointer and link register at offsets that depend on the generated frame, which breaks the fp[0]/fp[1] walk used by the helper. Use -marm when it is supported for frame-pointer builds, and teach the helper the GCC ARM-mode slots where the previous frame pointer is at fp[-1] and the saved LR return address is at fp[0].
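
For illustration, the GCC ARM-mode slots described above can be written as two small accessors. The enum and function names are hypothetical; only the slot positions (previous frame pointer at fp[-1], saved LR at fp[0]) come from the description.

```c
#include <stddef.h>
#include <stdint.h>

/* Word indexes relative to the frame pointer in a GCC ARM-mode frame. */
enum {
    ARM_PREV_FP_SLOT = -1,  /* saved previous frame pointer, just below fp */
    ARM_RETURN_SLOT = 0     /* saved LR, i.e. the return address */
};

/* Read the caller's frame pointer from an ARM-mode frame record. */
static uintptr_t
arm_prev_fp(const uintptr_t *fp)
{
    return fp[ARM_PREV_FP_SLOT];
}

/* Read the return address from an ARM-mode frame record. */
static uintptr_t
arm_return_address(const uintptr_t *fp)
{
    return fp[ARM_RETURN_SLOT];
}
```

Note that the frame pointer here names the saved LR word itself, so the backchain sits at a negative index; a walker that only ever reads fp[0] and fp[1] would take the return address for the backchain.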
@read-the-docs-community

read-the-docs-community Bot commented May 5, 2026

@encukou encukou added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 5, 2026
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @encukou for commit efb829e 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F149409%2Fmerge

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 5, 2026
@encukou
Member Author

encukou commented May 5, 2026

!buildbot ARM

@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @encukou for commit 955261c 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F149409%2Fmerge

The command will test the builders whose names match the following regular expression: ARM

The builders matched are:

  • ARM64 macOS PR
  • ARM64 Raspbian Debug PR
  • ARM64 Windows PR
  • ARM64 Raspbian PR
  • ARM64 MacOS M1 Refleaks NoGIL PR
  • ARM Raspbian PR
  • iOS ARM64 Simulator PR
  • ARM64 MacOS M1 NoGIL PR
  • ARM64 Windows Non-Debug PR

@encukou
Member Author

encukou commented May 5, 2026

!buildbot S390x

@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @encukou for commit 955261c 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F149409%2Fmerge

The command will test the builders whose names match the following regular expression: S390x

The builders matched are:

  • s390x Fedora Rawhide NoGIL refleaks PR
  • s390x RHEL8 LTO + PGO PR
  • s390x Fedora Rawhide PR
  • s390x RHEL9 PR
  • s390x RHEL8 LTO PR
  • s390x RHEL9 Refleaks PR
  • s390x Fedora Rawhide NoGIL PR
  • s390x Fedora Stable PR
  • s390x Fedora Rawhide Clang PR
  • s390x Fedora Stable Clang Installed PR
  • s390x RHEL9 LTO PR
  • s390x Fedora Rawhide Refleaks PR
  • s390x Fedora Stable LTO PR
  • s390x Fedora Rawhide LTO + PGO PR
  • s390x RHEL8 PR
  • s390x Fedora Stable LTO + PGO PR
  • s390x RHEL8 Refleaks PR
  • s390x Fedora Stable Refleaks PR
  • s390x Fedora Stable Clang PR
  • s390x RHEL9 LTO + PGO PR
  • s390x Fedora Rawhide Clang Installed PR
  • s390x Fedora Rawhide LTO PR

@stratakis
Contributor

Just read the PEP today, so I'll gently nudge for an extra fix here, or maybe I could send another PR :)

ppc64le doesn't need any frame pointer compiler flags, so I'd exclude it entirely through configure.

The Power ABI specification requires compilers to maintain a back chain by default, so unwinding already works without a dedicated frame pointer. Adding -fno-omit-frame-pointer, however, forces the compiler to reserve r31 for the frame pointer, which adds overhead with no benefit.

Also, following up on Diego's comment: the current unwinding model here won't work for ppc64le (the stack frame layout is on page 34 of the ABI spec PDF). The LR is saved in the caller's frame (at SP+16), not in the current frame, so frame_pointer[RETURN_OFFSET] can't reach it.

I can provide access to a Power buildbot for testing, or try to come up with a solution myself.
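
To make the ppc64le point concrete, here is a rough sketch of the extra hop such a walker would need: follow the back chain to the caller's frame first, then read the LR save slot there. It runs over a simulated buffer, the names are hypothetical, and the offsets (back chain at SP+0, LR save at SP+16) are taken from the comment above rather than verified against a real build.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte offsets within a frame, as described in the comment above. */
#define PPC64_BACKCHAIN_OFFSET 0
#define PPC64_LR_SAVE_OFFSET 16

/* Load one pointer-sized word from an address assumed to be valid. */
static uintptr_t
ppc64_load(uintptr_t addr)
{
    uintptr_t value;
    memcpy(&value, (const void *)addr, sizeof(value));
    return value;
}

/* Return address of the function whose stack pointer is sp, or 0 if the
   back chain is null. The LR lives in the caller's frame, so the back
   chain must be followed before the save slot is read. */
static uintptr_t
ppc64_return_address(uintptr_t sp)
{
    uintptr_t caller_sp = ppc64_load(sp + PPC64_BACKCHAIN_OFFSET);
    if (caller_sp == 0) {
        return 0;
    }
    return ppc64_load(caller_sp + PPC64_LR_SAVE_OFFSET);
}
```

A single fixed index off the current frame pointer cannot express this lookup, which is why the layout would need more than a different pair of offsets.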
