runtime: preserve recover error values#1882

Open
cpunion wants to merge 55 commits into xgo-dev:main from cpunion:codex/goroot-panic-coverage

@cpunion

@cpunion cpunion commented May 22, 2026

Collaborator

Summary

  • Panic with runtime error values instead of plain strings so recover() values assert to error and runtime.Error.
  • Build TypeAssertionError values for compiler-generated failed type assertions, and fix slice-to-array conversion panic bounds order.
  • Preserve direct recover semantics through compiler-generated method wrappers when the wrapper is itself the deferred call, while keeping nested wrapper calls non-recovering.
  • Preserve Go 1.26 embedded method-wrapper function pointer semantics for fixedbugs/issue73917.go and fixedbugs/issue73920.go: a recover in a real helper call reached through deferred promoted wrappers remains indirect, and the outer deferred recover still sees the panic.
  • Preserve local values across fault-triggered longjmp recovery by emitting volatile loads/stores for local SSA allocs in functions with recover blocks.
  • Add focused test/go coverage for recovered runtime errors, type assertion panic values, method-wrapper recover semantics, embedded wrapper function-pointer recover semantics, and fault recovery preserving named results; remove the now-passing recover2.go, zerodivide.go, recover4.go, fixedbugs/issue73917.go, and fixedbugs/issue73920.go GOROOT xfails.
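As a rough illustration of the first bullet, the sketch below shows what "recover() values assert to error and runtime.Error" means in ordinary Go. It is not taken from this PR's tests; the helper names `recoverValue`, `divide`, and `isRuntimeError` are made up for the example.

```go
package main

import (
	"fmt"
	"runtime"
)

// recoverValue runs f and returns whatever recover() observed, or nil.
// (Hypothetical helper, not part of the PR.)
func recoverValue(f func()) (v any) {
	defer func() { v = recover() }()
	f()
	return nil
}

// divide triggers an integer divide-by-zero runtime panic when b == 0.
func divide(a, b int) int { return a / b }

// isRuntimeError reports whether v asserts to both error and runtime.Error.
func isRuntimeError(v any) bool {
	if _, ok := v.(error); !ok {
		return false
	}
	_, ok := v.(runtime.Error)
	return ok
}

func main() {
	// A runtime panic carries a runtime.Error value.
	fmt.Println(isRuntimeError(recoverValue(func() { _ = divide(1, 0) })))
	// A user panic with a plain string does not.
	fmt.Println(isRuntimeError(recoverValue(func() { panic("plain string") })))
}
```

With the standard Go toolchain the first check is true and the second is false; the summary above states the goal of matching that for runtime panics under llgo.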

Still XFail / Out of Scope

  • recover.go remains xfail in this PR. On this branch it still fails before the recover-10 path because of the interface type identity issue tracked in #1892 (ssa: panic with runtime type assertion errors): interface conversion: interface {} is func(*main.T1), not func(*main.T1) (types from different scopes).
  • This PR does not cherry-pick #1892 (ssa: panic with runtime type assertion errors). The recover-wrapper behavior exposed by the "missing recover 10" failure is covered independently by test/go.
  • The new embedded-wrapper test/go cases are Go 1.26+ semantic coverage. Go 1.24 has pre-Go 1.26 behavior for the upstream issue73917.go/issue73920.go programs, so those local tests skip under Go <1.26.
  • A full Go 1.26 standard-library run of test/go currently hits unrelated nil/range behavior in TestRangeOverNilArrayPointerCallIsEvaluated; this PR intentionally does not cover nil/range, chan, print, interface, reflect, GC, liveness, finalizer, or goroutine domains.

Testing

  • go test ./test/go -run 'TestRecoverThrough|TestRecoverAfterFault' -count=1
  • go run ./cmd/llgo test -run 'TestRecoverThrough|TestRecoverAfterFault' -count=1 ./test/go
  • go test ./test/goroot -run TestGoRootRunCases/recover4.go -count=1 -args -goroot "$(go env GOROOT)" -dirs . -case '^recover4\.go$' -xfail /tmp/llgo-empty-xfail.yaml
  • go test ./test/goroot -run TestGoRootRunCases/recover4.go -count=1 -args -goroot "$(go env GOROOT)" -dirs . -case '^recover4\.go$'
  • go test ./test/goroot -run TestGoRootRunCases/recover.go -count=1 -args -goroot "$(go env GOROOT)" -dirs . -case '^recover\.go$' (passes via retained xfail)
  • go test ./test/goroot -run TestGoRootRunCases/recover.go -count=1 -args -goroot "$(go env GOROOT)" -dirs . -case '^recover\.go$' -xfail /tmp/llgo-empty-xfail.yaml (expected failure: the interface type identity blocker tracked in #1892)
  • go test ./test/go -run 'TestDeferredEmbedded.*MethodWrapperKeepsIndirectRecoverNil|TestRecoverThrough' -count=1
  • /Users/lijie/sdk/go1.26.0/bin/go test ./test/go -run 'TestDeferredEmbedded.*MethodWrapperKeepsIndirectRecoverNil' -count=1 -v
  • /Users/lijie/sdk/go1.26.0/bin/go run ./cmd/llgo test -run 'TestDeferredEmbedded.*MethodWrapperKeepsIndirectRecoverNil|TestRecoverThrough' -count=1 ./test/go
  • go test ./test/goroot -run TestGoRootRunCases -count=1 -args -goroot /Users/lijie/sdk/go1.26.0 -dirs fixedbugs -case '^fixedbugs/issue739(17|20)\.go$'
  • go test ./test/go -count=1
  • (cd runtime && go test ./internal/runtime -count=1)
  • go test ./ssa -count=1
  • go test ./cl -count=1
  • /Users/lijie/sdk/go1.26.0/bin/go test ./test/go -count=1 (known unrelated failure: TestRangeOverNilArrayPointerCallIsEvaluated nil array pointer range SIGSEGV)

GOROOT CI

Full GOROOT CI is disabled/too slow for regular PR validation in this repo and the GOROOT workflow is manual-only. This PR keeps stable coverage in test/go and lists the exact targeted GOROOT commands above. I will monitor the available PR checks after pushing the fork head branch; I will not push branches to xgo-dev/llgo.

@cpunion

cpunion commented May 22, 2026

Collaborator Author

CI note: normal PR CI is running from the fork branch cpunion:codex/goroot-panic-coverage.

I attempted to dispatch the manual-only GOROOT workflow for the PR head branch, but GitHub rejected the upstream dispatch because the ref only exists on the fork:

HTTP 422: No ref found for: codex/goroot-panic-coverage

I also attempted to dispatch .github/workflows/goroot.yml on cpunion/llgo at the fork branch, but GitHub returned:

HTTP 404: workflow .github/workflows/goroot.yml not found on the default branch

Per branch policy, I did not push an upstream xgo-dev/llgo ref. Local targeted GOROOT commands are listed in the PR body.

@cpunion cpunion force-pushed the codex/goroot-panic-coverage branch from 699e9cf to 82b8a1a Compare May 22, 2026 08:59
@cpunion

cpunion commented May 22, 2026

Collaborator Author

Updated PR after rebasing onto current xgo-dev/main and adjusted the new genericembediface FileCheck expectation for the runtime.TypeAssertError path introduced by this branch.

Additional tests run locally:

  • go test ./cl ./ssa -run 'TestRunAndTestFromTestgo/genericembediface|TestFromTestgo/genericembediface' -count=1
  • go test ./test/go -run 'TestRecoveredRuntimePanicsAreErrors|TestRecoveredTypeAssertionPanicsAreRuntimeErrors' -count=1
  • go test ./test/goroot -run TestGoRootRunCases -count=1 -timeout=10m -args -goroot "$(go env GOROOT)" -dirs . -case '^(recover2|zerodivide)\.go$' -run-timeout=60s -build-timeout=3m
  • git diff --check
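For readers unfamiliar with the type-assertion case named in the test list above, here is a minimal sketch of the behavior in standard Go, where a failed single-value assertion panics with a `*runtime.TypeAssertionError`. The helper names are hypothetical, and the sketch uses the standard library type rather than llgo's internal runtime type.

```go
package main

import (
	"fmt"
	"runtime"
)

// failedAssertion performs a single-value type assertion to int and returns
// the recovered panic value (nil when the assertion succeeds).
func failedAssertion(x any) (v any) {
	defer func() { v = recover() }()
	_ = x.(int) // panics when x does not hold an int
	return nil
}

// asTypeAssertionError reports whether v is a *runtime.TypeAssertionError.
func asTypeAssertionError(v any) (*runtime.TypeAssertionError, bool) {
	e, ok := v.(*runtime.TypeAssertionError)
	return e, ok
}

func main() {
	v := failedAssertion("not an int")
	if e, ok := asTypeAssertionError(v); ok {
		// The concrete error type also satisfies runtime.Error.
		var re runtime.Error = e
		fmt.Println(re.Error())
	}
}
```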

The earlier Ubuntu gpg: no valid OpenPGP data found failure occurred during dependency setup before repository tests ran, so the new push should trigger regular PR CI again from the fork branch.

@codecov-commenter

codecov-commenter commented May 22, 2026


@cpunion

cpunion commented May 24, 2026

Collaborator Author

Update from commit 6b65c2c:

  • Fixed direct vs indirect recover scoping and nested panic stack handling; removed the recover1.go xfail entries covered by this PR.
  • Kept recover.go xfails: the remaining failure is reflect/interface type identity in test9reflect2, not the recover/defer panic root.
  • Kept recover4.go xfails: SIGBUS is now routed to the panic path, but the remaining mismatch is stale local/named-result state after fault recovery (memcopy returned 0 vs 131067), which I am not mixing into this recover/defer PR.
  • Kept deferprint.go and fixedbugs/bug409.go xfails: remaining mismatches are float print exponent formatting (+e+00 vs +e+000).

Local validation:

  • go test ./test/go -count=1
  • go test ./test/goroot -count=1 -run TestGoRootRunCases -args -goroot $(go env GOROOT) -dirs .,fixedbugs -case '^(recover.go|recover1.go|recover4.go|deferprint.go|fixedbugs/bug409.go)$' -run-timeout=30s
  • go test ./ssa ./cl -count=1
  • cd runtime && go test ./internal/runtime -count=1
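The named-result mismatch mentioned in this update (memcopy returned 0 vs 131067) comes down to whether progress stored in a named result before a fault is still visible after a deferred recover. A simplified sketch of that expectation, using a nil dereference as the fault and a hypothetical `countUntilFault` function rather than the upstream recover4.go program:

```go
package main

import "fmt"

// countUntilFault increments the named result n once per iteration and
// dereferences a nil pointer when i == faultAt. The deferred recover stops
// the panic, and n must still hold the progress made before the fault.
func countUntilFault(total, faultAt int) (n int) {
	defer func() { recover() }()
	var p *int
	for i := 0; i < total; i++ {
		if i == faultAt {
			n += *p // nil dereference: the load panics before n changes
		}
		n++
	}
	return n
}

func main() {
	fmt.Println(countUntilFault(10, 3))  // fault after three increments
	fmt.Println(countUntilFault(10, -1)) // no fault
}
```

A stale-state bug of the kind described would show up here as the first call returning 0 instead of 3.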

@cpunion

cpunion commented May 24, 2026

Collaborator Author

Update after 77ceb09:

  • Added focused recover/defer coverage in test/go for deferred method wrappers and direct-defer recover semantics.
  • Added fault recover coverage for preserving a named result after a protected-memory fault path.
  • Fixed recover-frame forwarding through recover-transparent wrappers and made recover-function local alloc loads/stores volatile so fault longjmp keeps observable locals.
  • Removed the recover4.go goroot xfails now that it passes independently on this branch.
  • Kept recover.go xfail: without #1892 (ssa: panic with runtime type assertion errors) it still stops first on the interface type-identity panic, so this PR does not cherry-pick the interface fix. The recover-wrapper behavior is covered independently in test/go.
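The direct-defer and method-wrapper semantics in the bullets above can be sketched in plain Go as follows. This is an illustration of the language rules, not the PR's test code; `helperRecover`, `indirectThenDirect`, `recoverer`, `T`, and `throughWrapper` are names invented for the example. The wrapper case assumes the usual gc behavior that calling a multi-word value-receiver method through an interface goes via a compiler-generated wrapper.

```go
package main

import "fmt"

// helperRecover calls recover one frame too deep: it is called by a deferred
// function instead of being the deferred function, so it returns nil.
func helperRecover() any { return recover() }

// indirectThenDirect reports what the indirect helper saw (inner) and what
// the direct deferred recover saw (outer) for the same panic.
func indirectThenDirect() (inner, outer any) {
	defer func() { outer = recover() }()       // runs last, direct
	defer func() { inner = helperRecover() }() // runs first, indirect
	panic("boom")
}

type recoverer interface{ M() }

// T has a multi-word value receiver, so i.M() below is assumed to be
// dispatched through a compiler-generated wrapper.
type T struct {
	out *any
	pad [2]int
}

// M is the real deferred method; its recover counts as a direct call.
func (t T) M() { *t.out = recover() }

// throughWrapper defers an interface method call and returns what M recovered.
func throughWrapper() (got any) {
	var i recoverer = T{out: &got}
	defer i.M()
	panic("wrapped")
}

func main() {
	inner, outer := indirectThenDirect()
	fmt.Println(inner, outer)
	fmt.Println(throughWrapper())
}
```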

Focused tests passed locally: test/go recover cases via Go and llgo, recover4.go goroot with and without xfail, recover.go goroot with retained xfail, go test ./test/go, go test ./ssa, go test ./cl, and runtime/internal/runtime inside the runtime module.

@cpunion

cpunion commented May 24, 2026

Collaborator Author

Update for commit 66a820936f3baab3164393d63f95d944fd9566fb:

  • Classified fixedbugs/issue73917.go and fixedbugs/issue73920.go as Go 1.26 embedded method-wrapper function-pointer defer/recover semantics. Both are in the same recover/defer wrapper scope as this PR.
  • Added test/go coverage for value and pointer promoted method wrappers reached through function pointers, keeping the helper recover indirect so the outer deferred recover still receives the panic.
  • Removed the now-passing Go 1.26 xfails for both cases on darwin/arm64 and linux/amd64.
  • Left nil/range, interface (#1892, ssa: panic with runtime type assertion errors), GC/liveness/finalizer/goroutine, chan, print, reflect, and related non-wrapper domains untouched.

Focused checks passed:

  • go test ./test/goroot -run TestGoRootRunCases -count=1 -args -goroot /Users/lijie/sdk/go1.26.0 -dirs fixedbugs -case '^fixedbugs/issue739(17|20)\.go$'
  • /Users/lijie/sdk/go1.26.0/bin/go test ./test/go -run 'TestDeferredEmbedded.*MethodWrapperKeepsIndirectRecoverNil' -count=1 -v
  • /Users/lijie/sdk/go1.26.0/bin/go run ./cmd/llgo test -run 'TestDeferredEmbedded.*MethodWrapperKeepsIndirectRecoverNil|TestRecoverThrough' -count=1 ./test/go
  • go test ./test/go -count=1
  • (cd runtime && go test ./internal/runtime -count=1)
  • go test ./ssa -count=1
  • go test ./cl -count=1

Known unrelated non-covered check: /Users/lijie/sdk/go1.26.0/bin/go test ./test/go -count=1 fails in TestRangeOverNilArrayPointerCallIsEvaluated with a nil array pointer range SIGSEGV, which is outside this recover/defer wrapper PR.

@cpunion

cpunion commented May 24, 2026

Collaborator Author

Update for fixedbugs/issue43835.go at 5fc53956f98a7718eab8399fd905756b7f058ece:

  • Classified as the same recover/defer/named-result-after-panic family: the nil deref is only the panic trigger; after deferred recover, partial true writes from an unfinished assignment/return must not be observed.
  • Implemented a compiler-side ordering fix for recover-capable functions: explicit nil-deref assertion before deref loads that could otherwise be eliminated/moved past result writes; existing volatile result-slot handling remains scoped to recover blocks.
  • Added stable test/go coverage for assignment, unnamed return, and named return expression forms.
  • Removed only the now-covered fixedbugs/issue43835.go xfails for darwin/arm64 and linux/amd64. No GC/liveness/finalizer/goroutine, chan, print, interface, or broader nil-domain changes are included.
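The three forms listed above (assignment, unnamed return, named return expression) can be sketched like this. The functions are modeled loosely on the shape of the upstream issue43835.go program but take the pointer as a parameter; the names are illustrative only.

```go
package main

import "fmt"

// assignForm: the right-hand side faults, so the write of true to the named
// result must not be observable after the deferred recover.
func assignForm(p *int) (bad bool) {
	defer func() { recover() }()
	bad, _ = true, *p
	return
}

// returnForm: unnamed results; the return never completes when *p faults.
func returnForm(p *int) (bool, int) {
	defer func() { recover() }()
	return true, *p
}

// namedReturnForm: blank named results, same expectation as returnForm.
func namedReturnForm(p *int) (_ bool, _ int) {
	defer func() { recover() }()
	return true, *p
}

func main() {
	fmt.Println(assignForm(nil))
	b1, _ := returnForm(nil)
	b2, _ := namedReturnForm(nil)
	fmt.Println(b1, b2)
}
```

With a nil pointer every form should report false; with a valid pointer the same functions return true.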

Local validation:

  • go test ./test/goroot -run TestGoRootRunCases -count=1 -timeout 30m -args -goroot /Users/lijie/sdk/go1.26.0 -dirs fixedbugs -case '^fixedbugs/issue43835\.go$' -directive-mode runlike -run-timeout 2m
  • go test ./test/go -count=1
  • go test ./ssa -count=1
  • go test ./cl -count=1

Runtime package tests were not rerun for this increment because no runtime code changed.

@cpunion cpunion marked this pull request as ready for review June 1, 2026 04:27
@cpunion

cpunion commented Jun 6, 2026

Collaborator Author

Resolved the latest merge conflict with xgo-dev/main (51d665e). Kept recover1.go/recover4.go out of xfail for this PR's recovered panic value fix, while preserving unrelated main xfails such as issue73917/issue73920.

Local verification:

  • go test -timeout 20m ./test/go -run 'Test(Recover|RuntimeError|Panic|GenericUnsafeSizeofArithmetic)' -count=1
  • go test -timeout 20m ./test/goroot -run TestGoRootRunCases -count=1 -args -goroot /Users/lijie/sdk/go1.26.0 -case '^(recover1|recover4).go$' -run-timeout 60s
  • git diff --check

@cpunion

cpunion commented Jun 6, 2026

Collaborator Author

Update for 259389962:

  • Added package-level coverage for the recover-frame helper paths that Codecov previously reported as uncovered: direct deferred recover, MaskRecoverCall, ForwardRecoverFrameCall, emitDoEx mask/forward modes, and transparent wrapper detection.
  • Kept the change scoped to panic/recover error-value preservation; no GC/liveness/finalizer/goroutine changes.

Local validation:

  • go test ./ssa -run 'Test(PlainDeferWithoutSavedArgsIR|DeferredRecoverBuiltinDoesNotStartNestedFrameIR|RecoverFrameCallHelpersIR)' -coverprofile=/tmp/llgo-ssa-recover.out -covermode=count -count=1
    • covered callRecoverScopedDefer, MaskRecoverCall, and ForwardRecoverFrameCall at 100% in the focused profile.
  • go test ./cl -run 'Test(EmitDoWithExplicitDeferStack|EmitDoWithoutExplicitDeferStack|EmitDoRecoverFrameModes|RecoverTransparentWrapperCall)' -coverprofile=/tmp/llgo-cl-recover.out -covermode=count -count=1
    • covered emitDoEx and recoverTransparentWrapperCall at 100% in the focused profile.
  • go test -timeout 20m ./test/go -run 'Test(Recover|RuntimeError|Panic|GenericUnsafeSizeofArithmetic)' -count=1
  • go test -timeout 20m ./test/goroot -run TestGoRootRunCases -count=1 -args -goroot /Users/lijie/sdk/go1.26.0 -case '^(recover1|recover4)\.go$' -run-timeout 60s
  • git diff --check

Also checked broader local package runs. They still hit pre-existing, unrelated failures that I did not change in this recover/panic PR: go test ./ssa -count=1 and go test ./cl -count=1 fail on cl/_testdata/vargs expecting AssertIndexRange while the current IR emits CheckIndexRange; go test ./test/go -count=1 fails on legacy print exponent width (+e+000 vs +e+00).

CI and Codecov are expected to rerun from the fork branch after this push.

@cpunion

cpunion commented Jun 6, 2026

Collaborator Author

Updated the stale FileCheck expectation in cl/_testdata/vargs/in.go from AssertIndexRange(i1) to the current CheckIndexRange(i1, i64, i1, i64) IR emitted by litgen.

Local verification passed:

  • go test ./cl -run TestRunAndTestFromTestdata
  • go test ./cl ./ssa

The remaining public test/go print failure is still tracked separately in #1945 and was not changed here.

@cpunion cpunion force-pushed the codex/goroot-panic-coverage branch 3 times, most recently from 5a10a71 to f7ea6d7 Compare June 10, 2026 11:31
@cpunion cpunion force-pushed the codex/goroot-panic-coverage branch from 5c0bf1f to f29c5d6 Compare June 20, 2026 03:12
@cpunion cpunion force-pushed the codex/goroot-panic-coverage branch 3 times, most recently from 17cfabb to 5bb0176 Compare June 29, 2026 02:26
Comment thread ssa/eh.go

// declare ptr @llvm.frameaddress.p0(i32 immarg)
func (b Builder) FrameAddress(level uint64) Expr {
fn := b.Pkg.cFunc("llvm.frameaddress.p0", b.Prog.tyFrameaddress())

Contributor


This could be changed to call through the newly added CreateIntrinsic wrapper; for reference, see

llgo/ssa/eh.go

Line 83 in 377e737

b.impl.CreateIntrinsic(b.Prog.VoidPtr().ll, llvm.LookupIntrinsicID("llvm.stacksave"), nil, ""),

cpunion and others added 27 commits July 2, 2026 16:16
Record the experiment results at the emitter: !associated only guides
linker GC and IR-level GlobalDCE deletes the records; llvm.compiler.used
pins dead functions through the records' address initializers; and
noduplicate blocks inlining. Section dedup is link-phase work.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Post-link table generation plan: parse the linked binary's metadata
sections, dedup LTO inline copies against the symbol table, sort with a
sentinel, build Go-layout findfunctab via internal/pclntab, and write
back into a reserved section with ASLR-safe anchor offsets. Runtime
adopts the prebuilt table when the header validates and keeps first-use
construction as fallback. Includes the list of platform facts
established in xgo-dev#2012 so implementation does not re-derive them.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The monotonic time source had two problems:

- On Linux, runtimeNano passed clite's CLOCK_MONOTONIC, whose value is
  Darwin's clock id (6). Linux interprets 6 as CLOCK_MONOTONIC_COARSE,
  a millisecond-granularity clock: consecutive time.Now() readings were
  identical 100% of the time and the smallest nonzero delta was 1ms.
- On Darwin, clock_gettime(CLOCK_MONOTONIC) itself only has microsecond
  granularity (96% identical consecutive readings, 1us minimum delta).

Mirror Go's runtime structure with a per-OS nanotime1 in the runtime
package itself, keeping the hot path free of clite indirection and clite
unchanged: Darwin reads CLOCK_UPTIME_RAW through clock_gettime_nsec_np
(the same clock Go's nanotime uses there), Linux uses clock_gettime with
the OS-correct CLOCK_MONOTONIC id as a local constant, and remaining
platforms keep the previous behavior.

Measured with consecutive time.Now() deltas (min nonzero / zero-frac):
- macOS arm64: 1us / 96.5%  ->  41ns / 26%  (Go 1.26: 41ns / 22%)
- Linux arm64: 1ms / 100%   ->  41ns / 21%

time.Sleep, Timer and Ticker behave identically before and after.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
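The "min nonzero / zero-frac" figures in the commit message above can be reproduced in spirit with a small probe like the one below. This is a sketch of one way to take such a measurement, not the harness the commit used; `deltaStats` is a hypothetical name.

```go
package main

import (
	"fmt"
	"time"
)

// deltaStats takes n consecutive time.Now() readings and returns the
// smallest nonzero delta between neighbors and the fraction of deltas that
// were zero (identical consecutive readings).
func deltaStats(n int) (minNonzero time.Duration, zeroFrac float64) {
	zeros := 0
	prev := time.Now()
	for i := 0; i < n; i++ {
		now := time.Now()
		d := now.Sub(prev) // uses the monotonic reading when present
		prev = now
		if d <= 0 {
			zeros++
			continue
		}
		if minNonzero == 0 || d < minNonzero {
			minNonzero = d
		}
	}
	if n > 0 {
		zeroFrac = float64(zeros) / float64(n)
	}
	return minNonzero, zeroFrac
}

func main() {
	minDelta, zeroFrac := deltaStats(100000)
	fmt.Printf("min nonzero delta: %v, zero fraction: %.1f%%\n", minDelta, zeroFrac*100)
}
```

A coarse clock shows up as a large minimum delta together with a zero fraction near 100%.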
The macOS CI LLDB step caught the funcinfo entry/stub site anchors
shifting instruction/scope layout: with the records emitted at function
entry, LLDB reported variables from an inner lexical block (ScopeIf's
b, c) as in scope before the block began. Debug builds carry full
DWARF, so the funcinfo tables are redundant there; gate the metadata
pipeline on !IsDbgEnabled(). Caller-frame instrumentation is
independent of this switch, so runtime.Caller keeps working in debug
builds. _lldb/runtest.sh: 194/194 pass.

This also covers Linux, where the same interference existed since the
sites were introduced but the LLDB suite only runs on the macOS jobs.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Refine the previous commit: instead of disabling the whole funcinfo
metadata pipeline under LLGO_DEBUG/LLGO_DEBUG_SYMBOLS, add a separate
Program.EnableFuncInfoSites switch and turn off just the body-embedded
site records (entry/stub anchors and pc-line labels) — they are what
shifts instruction/scope layout and confused LLDB. The funcinfo tables
are plain data globals and stay enabled, so runtime.FuncForPC keeps its
normalized name and Func.FileLine keeps file/line in debug builds (via
the dlsym fallback path); runtime.Caller/Callers were never affected
because caller-frame instrumentation is independent of both switches.

Debug builds lose only the section fast paths (first-use latency) and
statement-level pc-line granularity, both redundant next to full DWARF.
_lldb/runtest.sh: 194/194; cl and test/go suites pass.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
frameFuncForPC could cache a Func built from a pcline frame whose entry
resolution failed (entry == 0); a later FuncForPC on the same PC would
then observe Entry() == 0 where its own constructor falls back to pc.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
LLGO_FUNCINFO_SITES=0 keeps the funcinfo metadata tables but drops the
body-embedded entry/stub/pc-line inline-asm sites. This is the narrow
A/B needed to isolate codegen perturbation caused by the in-body asm
anchors: with sites off, plain-code benchmarks match the no-funcinfo
baseline within noise, while sites on shifts hot runtime-internal
loops by -30%..+6% through inline/layout decisions.

Semantics with sites off: FuncForPC(entry) and Func.FileLine(entry)
keep working through the dlsym fallback path; statement/call-site
granularity PC line lookup is disabled, and first-use table
construction loses the section fast path.

Tests assert the split: tables still materialize while entry/stub
section asm, boundary symbols, and pc-line site labels are all absent.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
First stage of doc/design/pclntab-linkphase.md: parse a linked binary's
funcinfo entry/stub sections (Mach-O and ELF), deduplicate LTO inline
copies against the symbol table's text ranges, sort with a Go-style
sentinel, and build findfunctab through internal/pclntab — the faithful
port that has been waiting for exactly this caller. Read-only: prints
what the P2 build integration would write back.

Measured on the 576-target multipkg binaries:
- non-LTO: 9319 records -> ftab 3161 + 207 buckets; lookup self-check
  3160/3160; site sections 149KB -> 29KB (5.1x)
- LTO: 15371 entry records -> 13857 inline copies dropped, 4144 kept;
  self-check 3045/3045; 299KB -> 28.5KB (10.5x)

Findings for P2: on-disk Mach-O pointer slots hold dyld chained-fixup
encodings (low 36 bits are the target; decoded here; the write-back
design stores anchor-relative offsets and avoids pointers entirely),
and some non-LTO stub symbols are absent from the symbol table
(records conservatively dropped; needs tightening).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…adoption

pclnpost -write rewrites the entry-site section in place with the
prebuilt table (header + ftab {entryOff,funcIndex} + runtime-layout
findfunctab buckets), resolving funcinfo indexes through the binary's
symbol-index section, and voids the stub section (its records are
merged into the table). ASLR is handled by anchoring on the section's
own link-time address; entries are normalized to true symbol starts,
which retires the entry-PC slack on this path. macOS re-signs with an
ad-hoc codesign after rewriting.

The runtime adopts the table zero-copy when the magic header validates:
lookups binary-search the on-disk ftab directly through the shared
bucket index, nothing is materialized on first use (the funcIndex ->
entry map is built lazily and only for the pcline initializer), and the
cold scan/dladdr path is skipped since adoption is cheap. First-use
construction remains the fallback whenever the header is absent.

Linux end-to-end: entries=prebuilt, FuncForPC/FileLine correct,
first-FuncForPC 110µs (materializing) -> 6-8µs (zero-copy); 13ms on the
original macOS baseline. Known gap: on macOS the on-disk rewrite is
corrupted at load time because dyld still walks the stale chained-fixup
chain over the section; fix (unlinking the section's nodes from the
page chains in LC_DYLD_CHAINED_FIXUPS) is identified and next.
Non-prebuilt paths verified regression-free: cl + test/go suites pass,
smoke behavior unchanged.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
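To make the lookup described in the commit message above more concrete, here is a simplified sketch of a binary search over a sorted ftab of {entryOff, funcIndex} pairs terminated by a sentinel. The struct layout, field widths, and the `lookup` function are assumptions for illustration; the sketch omits the findfunctab bucket index and does not reflect the actual on-disk format.

```go
package main

import (
	"fmt"
	"sort"
)

// ftabEntry is a hypothetical in-memory view of one table row: the function's
// start offset relative to an anchor address, and its funcinfo index.
type ftabEntry struct {
	entryOff  uint32
	funcIndex uint32
}

// lookup finds the function containing pc. ftab must be sorted by entryOff
// and end with a sentinel row whose entryOff marks the end of the text range.
func lookup(ftab []ftabEntry, base, pc uint64) (uint32, bool) {
	if len(ftab) < 2 || pc < base {
		return 0, false
	}
	off := pc - base
	// Index of the first row that starts beyond off.
	i := sort.Search(len(ftab), func(i int) bool {
		return uint64(ftab[i].entryOff) > off
	})
	if i == 0 || i == len(ftab) {
		return 0, false // before the first function, or at/after the sentinel
	}
	return ftab[i-1].funcIndex, true
}

func main() {
	ftab := []ftabEntry{{0x10, 0}, {0x40, 1}, {0x80, 0}} // last row is the sentinel
	idx, ok := lookup(ftab, 0x1000, 0x1044)
	fmt.Println(idx, ok)
}
```

Storing offsets relative to an anchor rather than absolute pointers is what keeps such a table valid regardless of where the image is loaded.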
Every llgo-linked executable (linux/darwin, sites enabled) now gets the
prebuilt ftab/findfunctab automatically: internal/build runs
internal/pclnpost.Rewrite after linkMainPkg, and any failure degrades
silently to the first-use construction fallback.

Moves the tool core into internal/pclnpost and hardens it:

- Canonical-record detection by FNV: a record survives when its anchor's
  owning symbol hashes to the record's symbolID (or is the __llgo_stub.
  wrapper of it). The previous one-per-symbolID rule wrongly collapsed a
  function with its stub — they share the target's symbolID by design —
  which broke exact-entry lookups (caught by TestRuntimeLineInfoAndStack
  on Linux). LTO inline copies are now identified exactly: 8.4k/9.5k
  copies removed in the LTO probes.
- Mach-O chained-fixups surgery: unlink the rewritten sections' pointer
  slots from the dyld page chains (repointing predecessors' next links
  and page_start entries) so dyld neither rebases slots inside the new
  table nor skips unrelated fixups after the zeroed stub section, then
  re-sign ad hoc. Without this the table was corrupted at load.
- LTO-safe metadata location: the entry section carries a meta record
  whose relocations hold the addresses of the symbol-index pointer and
  count globals; LTO internalization strips those names from the symbol
  table but relocations always resolve. Runtime skips the meta rows
  (pc==0 / symbolID==0).
- Idempotence guard (already-rewritten binaries are left alone).

Runtime fixes that surfaced during validation:

- materializePrebuiltEntries is now two-phase so concurrent losers wait
  for the winner's store instead of reading a nil entries slice.
- pcLineFrameForPC rejects nearest-below sites whose entry is
  unresolved when the caller knows the function entry, instead of
  leaking a neighboring function's file/line.

Validation: macOS cl (full) + test/go + LLDB 194/194; Linux test/go
TestRuntime suite; probes on both platforms report entries=prebuilt
with first-FuncForPC at 7-21µs (Linux) from 13ms on the original
baseline, and LTO builds drop 8-9.5k inline copies.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…table

On Mach-O, pointer slots that name exported functions — every
__llgo_stub.* wrapper and any exported Go function — are emitted as
chained-fixup BIND nodes, not rebases. The rewriter only decoded rebase
nodes, so all stub records (and some entry records) were dropped as
unowned and never reached the prebuilt ftab; FuncForPC on function
values silently fell back to dladdr (~6µs per fresh pc on darwin).

- Parse the LC_DYLD_CHAINED_FIXUPS imports table and resolve bind
  ordinals to their in-image definitions.
- Match canonical owners against the record symbolID with underscore
  normalization (debug/macho's suffix-shared string table can surface
  one mangling underscore more or less than the source-level name).
- Splice the prebuilt header's base slot back into the fixup chain as a
  live rebase node: dyld writes the slid text base at load, so the
  runtime reads a ready runtime PC with no slide arithmetic (non-PIE
  ELF link-time values already equal runtime addresses).
- LLGO_PCLNPOST=0 escape hatch keeps first-use construction.

Fresh-pc FuncForPC slow path: darwin 6-8µs -> 1.2-1.7µs, linux
6.8µs -> 0.5µs; first-in-process lookup: darwin ~32µs -> ~14µs,
linux ~6.8µs -> ~4µs.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Pure-compute probes (recursive fib, JSON round-trip, sort.Ints, map
churn) with no runtime introspection, so one harness run covers both
the introspection extremes and what the funcinfo machinery costs code
that never asks for it.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Go's pclntab pages are touched by its own runtime (traceback, GC) long
before user code queries it, so its first FuncForPC never pays page-in.
Mirror that: when the prebuilt table is present, init adopts it
(zero-copy, sub-µs), touches the pages the lookup path reads (blob,
funcinfo records, string offsets, strings), runs one synthetic lookup
to warm the code paths, and write-warms the FuncForPC cache pages.

First-in-process FuncForPC: darwin ~17µs -> ~2.8µs, linux ~6.6µs ->
~1.0µs. Startup cost is page-count-bound (tens of µs on stdlib-sized
tables, invisible next to ~3ms process startup; hello-world medians
unchanged). Non-prebuilt binaries stay fully lazy: first-use
construction allocates, which has no place in init, and programs that
never introspect pay nothing.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
-depths generates deep_<N> scenarios at configurable call depths;
-bigsizes generates bigfunc scenarios (funcs x statements) whose large
bodies stress statement-level pcline density, mid-function pc
symbolization, and ordinary performance of big method bodies.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@cpunion cpunion force-pushed the codex/goroot-panic-coverage branch from 5bb0176 to bd1d08c Compare July 2, 2026 15:03
@cpunion

cpunion commented Jul 2, 2026

Collaborator Author

Rebased onto #2016 (codex/pclntab-linkphase-p1, which includes #2012) per the review-order plan: #2012 → #2016 → this line of semantics fixes. Conflicts resolved were additive (context fields, the noinline condition set, and runtime.Panic now calls SavePanicCallerFrames() before the panic-node bookkeeping). Note the PR base is still main, so the diff shows #2012/#2016 commits until those merge.
