runtime: preserve recover error values #1882
|
CI note: normal PR CI is running from the fork branch I attempted to dispatch the manual-only I also attempted to dispatch Per branch policy, I did not push an upstream |
699e9cf to 82b8a1a
|
Updated PR after rebasing onto current Additional tests run locally:
The earlier Ubuntu |
|
Update from commit 6b65c2c:

- Fixed direct vs indirect recover scoping and nested panic stack handling; removed the recover1.go xfail entries covered by this PR.
- Kept recover.go xfails: the remaining failure is reflect/interface type identity in test9reflect2, not the recover/defer panic root.
- Kept recover4.go xfails: SIGBUS is now routed to the panic path, but the remaining mismatch is stale local/named-result state after fault recovery (memcopy returned 0 vs 131067), which I am not mixing into this recover/defer PR.
- Kept deferprint.go and fixedbugs/bug409.go xfails: remaining mismatches are float print exponent formatting (+e+00 vs +e+000).

Local validation:
- go test ./test/go -count=1
- go test ./test/goroot -count=1 -run TestGoRootRunCases -args -goroot
|
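For readers unfamiliar with the direct vs indirect recover distinction mentioned above, here is a minimal sketch of the Go language rule itself. The function names are hypothetical and this is not code from the PR: recover only stops a panic when the deferred function calls it directly, and a recover reached through a further helper call returns nil.

```go
package main

import "fmt"

// helper calls recover one frame below the deferred function, so it is an
// indirect recover: it returns nil and the panic keeps propagating.
func helper() any { return recover() }

// indirectThenDirect records what each kind of recover observed.
func indirectThenDirect() (inner, outer any) {
	defer func() {
		outer = recover() // direct: called by the deferred function itself
	}()
	defer func() {
		inner = helper() // indirect: recover is called by helper
	}()
	panic("boom")
}

func main() {
	inner, outer := indirectThenDirect()
	fmt.Println(inner, outer) // prints "<nil> boom"
}
```

Deferred calls run in last-in, first-out order, so the indirect attempt runs first and sees nil, and the outer deferred function then recovers the original value.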
|
Update after 77ceb09:
Focused tests passed locally: test/go recover cases via Go and llgo, recover4.go goroot with and without xfail, recover.go goroot with retained xfail, go test ./test/go, go test ./ssa, go test ./cl, and runtime/internal/runtime inside the runtime module. |
|
Update for commit
Focused checks passed:
Known unrelated non-covered check: |
|
Update for
Local validation:
|
|
Resolved the latest merge conflict with xgo-dev/main (51d665e). Kept recover1.go/recover4.go out of xfail for this PR's recovered panic value fix, while preserving unrelated main xfails such as issue73917/issue73920.

Local verification:
- go test -timeout 20m ./test/go -run 'Test(Recover|RuntimeError|Panic|GenericUnsafeSizeofArithmetic)' -count=1
- go test -timeout 20m ./test/goroot -run TestGoRootRunCases -count=1 -args -goroot /Users/lijie/sdk/go1.26.0 -case '^(recover1|recover4).go$' -run-timeout 60s
- git diff --check
|
|
Update for
Local validation:
Also checked broader local package runs. They still hit pre-existing, unrelated failures that I did not change in this recover/panic PR: CI and Codecov are expected to rerun from the fork branch after this push. |
|
Updated the stale FileCheck expectation in cl/_testdata/vargs/in.go from AssertIndexRange(i1) to the current CheckIndexRange(i1, i64, i1, i64) IR emitted by litgen. Local verification passed:
The remaining public test/go print failure is still tracked separately in #1945 and was not changed here. |
5a10a71 to f7ea6d7
5c0bf1f to f29c5d6
17cfabb to 5bb0176
Record the experiment results at the emitter: !associated only guides linker GC and IR-level GlobalDCE deletes the records; llvm.compiler.used pins dead functions through the records' address initializers; and noduplicate blocks inlining. Section dedup is link-phase work. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Post-link table generation plan: parse the linked binary's metadata sections, dedup LTO inline copies against the symbol table, sort with a sentinel, build Go-layout findfunctab via internal/pclntab, and write back into a reserved section with ASLR-safe anchor offsets. Runtime adopts the prebuilt table when the header validates and keeps first-use construction as fallback. Includes the list of platform facts established in xgo-dev#2012 so implementation does not re-derive them. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The monotonic time source had two problems:
- On Linux, runtimeNano passed clite's CLOCK_MONOTONIC, whose value is Darwin's clock id (6). Linux interprets 6 as CLOCK_MONOTONIC_COARSE, a millisecond-granularity clock: consecutive time.Now() readings were identical 100% of the time and the smallest nonzero delta was 1ms.
- On Darwin, clock_gettime(CLOCK_MONOTONIC) itself only has microsecond granularity (96% identical consecutive readings, 1us minimum delta).
Mirror Go's runtime structure with a per-OS nanotime1 in the runtime package itself, keeping the hot path free of clite indirection and clite unchanged: Darwin reads CLOCK_UPTIME_RAW through clock_gettime_nsec_np (the same clock Go's nanotime uses there), Linux uses clock_gettime with the OS-correct CLOCK_MONOTONIC id as a local constant, and remaining platforms keep the previous behavior.
Measured with consecutive time.Now() deltas (min nonzero / zero-frac):
- macOS arm64: 1us / 96.5% -> 41ns / 26% (Go 1.26: 41ns / 22%)
- Linux arm64: 1ms / 100% -> 41ns / 21%
time.Sleep, Timer and Ticker behave identically before and after.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The macOS CI LLDB step caught the funcinfo entry/stub site anchors shifting instruction/scope layout: with the records emitted at function entry, LLDB reported variables from an inner lexical block (ScopeIf's b, c) as in scope before the block began. Debug builds carry full DWARF, so the funcinfo tables are redundant there; gate the metadata pipeline on !IsDbgEnabled(). Caller-frame instrumentation is independent of this switch, so runtime.Caller keeps working in debug builds. _lldb/runtest.sh: 194/194 pass. This also covers Linux, where the same interference existed since the sites were introduced but the LLDB suite only runs on the macOS jobs. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Refine the previous commit: instead of disabling the whole funcinfo metadata pipeline under LLGO_DEBUG/LLGO_DEBUG_SYMBOLS, add a separate Program.EnableFuncInfoSites switch and turn off just the body-embedded site records (entry/stub anchors and pc-line labels) — they are what shifts instruction/scope layout and confused LLDB. The funcinfo tables are plain data globals and stay enabled, so runtime.FuncForPC keeps its normalized name and Func.FileLine keeps file/line in debug builds (via the dlsym fallback path); runtime.Caller/Callers were never affected because caller-frame instrumentation is independent of both switches. Debug builds lose only the section fast paths (first-use latency) and statement-level pc-line granularity, both redundant next to full DWARF. _lldb/runtest.sh: 194/194; cl and test/go suites pass. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
frameFuncForPC could cache a Func built from a pcline frame whose entry resolution failed (entry == 0); a later FuncForPC on the same PC would then observe Entry() == 0 where its own constructor falls back to pc. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
LLGO_FUNCINFO_SITES=0 keeps the funcinfo metadata tables but drops the body-embedded entry/stub/pc-line inline-asm sites. This is the narrow A/B needed to isolate codegen perturbation caused by the in-body asm anchors: with sites off, plain-code benchmarks match the no-funcinfo baseline within noise, while sites on shifts hot runtime-internal loops by -30%..+6% through inline/layout decisions. Semantics with sites off: FuncForPC(entry) and Func.FileLine(entry) keep working through the dlsym fallback path; statement/call-site granularity PC line lookup is disabled, and first-use table construction loses the section fast path. Tests assert the split: tables still materialize while entry/stub section asm, boundary symbols, and pc-line site labels are all absent. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
First stage of doc/design/pclntab-linkphase.md: parse a linked binary's funcinfo entry/stub sections (Mach-O and ELF), deduplicate LTO inline copies against the symbol table's text ranges, sort with a Go-style sentinel, and build findfunctab through internal/pclntab, the faithful port that has been waiting for exactly this caller. Read-only: prints what the P2 build integration would write back.
Measured on the 576-target multipkg binaries:
- non-LTO: 9319 records -> ftab 3161 + 207 buckets; lookup self-check 3160/3160; site sections 149KB -> 29KB (5.1x)
- LTO: 15371 entry records -> 13857 inline copies dropped, 4144 kept; self-check 3045/3045; 299KB -> 28.5KB (10.5x)
Findings for P2: on-disk Mach-O pointer slots hold dyld chained-fixup encodings (low 36 bits are the target; decoded here; the write-back design stores anchor-relative offsets and avoids pointers entirely), and some non-LTO stub symbols are absent from the symbol table (records conservatively dropped; needs tightening).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
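The "sort with a sentinel, then look up" step described in the commit above can be sketched as follows. The record layout, the names, and the dedup-by-offset rule are simplifying assumptions for illustration; the commit itself dedups against symbol-table text ranges and adds a bucketed findfunctab on top of the sorted table.

```go
package main

import (
	"fmt"
	"sort"
)

// ftabEntry is a hypothetical record: a function's entry offset from the
// text base plus an index into a separate function-info table.
type ftabEntry struct {
	entryOff  uint32
	funcIndex uint32
}

// noFunc marks the sentinel entry, which belongs to no function.
const noFunc = ^uint32(0)

// buildFtab sorts the entries, keeps one record per entry offset, and
// appends a sentinel at textEnd so every function has an upper bound.
func buildFtab(entries []ftabEntry, textEnd uint32) []ftabEntry {
	out := append([]ftabEntry(nil), entries...)
	sort.SliceStable(out, func(i, j int) bool { return out[i].entryOff < out[j].entryOff })
	dedup := out[:0]
	for _, e := range out {
		if len(dedup) > 0 && dedup[len(dedup)-1].entryOff == e.entryOff {
			continue // duplicate copy of the same entry offset
		}
		dedup = append(dedup, e)
	}
	return append(dedup, ftabEntry{entryOff: textEnd, funcIndex: noFunc})
}

// lookup returns the funcIndex of the function whose range contains off.
func lookup(ftab []ftabEntry, off uint32) (uint32, bool) {
	// Find the first entry that starts strictly after off.
	i := sort.Search(len(ftab), func(i int) bool { return ftab[i].entryOff > off })
	if i == 0 || i == len(ftab) {
		return 0, false // before the first function, or at/after the sentinel
	}
	return ftab[i-1].funcIndex, true
}

func main() {
	ftab := buildFtab([]ftabEntry{{0x40, 1}, {0x10, 0}, {0x40, 7}}, 0x100)
	for _, off := range []uint32{0x05, 0x10, 0x3f, 0x40, 0xff, 0x100} {
		idx, ok := lookup(ftab, off)
		fmt.Printf("off=%#x idx=%d ok=%v\n", off, idx, ok)
	}
}
```

The sentinel lets the lookup treat the last real function like any other: its range ends where the sentinel begins, and offsets at or past the sentinel resolve to nothing.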
…adoption
pclnpost -write rewrites the entry-site section in place with the
prebuilt table (header + ftab {entryOff,funcIndex} + runtime-layout
findfunctab buckets), resolving funcinfo indexes through the binary's
symbol-index section, and voids the stub section (its records are
merged into the table). ASLR is handled by anchoring on the section's
own link-time address; entries are normalized to true symbol starts,
which retires the entry-PC slack on this path. macOS re-signs with an
ad-hoc codesign after rewriting.
The runtime adopts the table zero-copy when the magic header validates:
lookups binary-search the on-disk ftab directly through the shared
bucket index, nothing is materialized on first use (the funcIndex ->
entry map is built lazily and only for the pcline initializer), and the
cold scan/dladdr path is skipped since adoption is cheap. First-use
construction remains the fallback whenever the header is absent.
Linux end-to-end: entries=prebuilt, FuncForPC/FileLine correct,
first-FuncForPC 110µs (materializing) -> 6-8µs (zero-copy); 13ms on the
original macOS baseline. Known gap: on macOS the on-disk rewrite is
corrupted at load time because dyld still walks the stale chained-fixup
chain over the section; fix (unlinking the section's nodes from the
page chains in LC_DYLD_CHAINED_FIXUPS) is identified and next.
Non-prebuilt paths verified regression-free: cl + test/go suites pass,
smoke behavior unchanged.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Every llgo-linked executable (linux/darwin, sites enabled) now gets the prebuilt ftab/findfunctab automatically: internal/build runs internal/pclnpost.Rewrite after linkMainPkg, and any failure degrades silently to the first-use construction fallback.
Moves the tool core into internal/pclnpost and hardens it:
- Canonical-record detection by FNV: a record survives when its anchor's owning symbol hashes to the record's symbolID (or is the __llgo_stub. wrapper of it). The previous one-per-symbolID rule wrongly collapsed a function with its stub (they share the target's symbolID by design), which broke exact-entry lookups (caught by TestRuntimeLineInfoAndStack on Linux). LTO inline copies are now identified exactly: 8.4k/9.5k copies removed in the LTO probes.
- Mach-O chained-fixups surgery: unlink the rewritten sections' pointer slots from the dyld page chains (repointing predecessors' next links and page_start entries) so dyld neither rebases slots inside the new table nor skips unrelated fixups after the zeroed stub section, then re-sign ad hoc. Without this the table was corrupted at load.
- LTO-safe metadata location: the entry section carries a meta record whose relocations hold the addresses of the symbol-index pointer and count globals; LTO internalization strips those names from the symbol table but relocations always resolve. Runtime skips the meta rows (pc==0 / symbolID==0).
- Idempotence guard (already-rewritten binaries are left alone).
Runtime fixes that surfaced during validation:
- materializePrebuiltEntries is now two-phase so concurrent losers wait for the winner's store instead of reading a nil entries slice.
- pcLineFrameForPC rejects nearest-below sites whose entry is unresolved when the caller knows the function entry, instead of leaking a neighboring function's file/line.
Validation: macOS cl (full) + test/go + LLDB 194/194; Linux test/go TestRuntime suite; probes on both platforms report entries=prebuilt with first-FuncForPC at 7-21µs (Linux) from 13ms on the original baseline, and LTO builds drop 8-9.5k inline copies.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
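The canonical-record rule from the commit above (a record survives when its owning symbol hashes to the record's symbolID, or is the __llgo_stub. wrapper of that symbol) might look roughly like the sketch below. The FNV variant and width, and the helper names, are assumptions; the commit only says "FNV".

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// stubPrefix is the wrapper prefix named in the commit message.
const stubPrefix = "__llgo_stub."

// symbolID hashes a symbol name. 64-bit FNV-1a is an assumption here.
func symbolID(name string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(name))
	return h.Sum64()
}

// isCanonical reports whether a record carrying id belongs to the symbol
// that owns the record's anchor address, either directly or as its stub.
func isCanonical(id uint64, owner string) bool {
	if symbolID(owner) == id {
		return true
	}
	target, ok := strings.CutPrefix(owner, stubPrefix)
	return ok && symbolID(target) == id
}

func main() {
	id := symbolID("main.f")
	fmt.Println(isCanonical(id, "main.f"))             // owner is the function itself
	fmt.Println(isCanonical(id, "__llgo_stub.main.f")) // owner is the function's stub
	fmt.Println(isCanonical(id, "main.g"))             // inline copy owned by another symbol
}
```

Under this rule a function and its stub both survive even though they share one symbolID, while a copy inlined into an unrelated symbol is dropped.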
…table
On Mach-O, pointer slots that name exported functions (every __llgo_stub.* wrapper and any exported Go function) are emitted as chained-fixup BIND nodes, not rebases. The rewriter only decoded rebase nodes, so all stub records (and some entry records) were dropped as unowned and never reached the prebuilt ftab; FuncForPC on function values silently fell back to dladdr (~6µs per fresh pc on darwin).
- Parse the LC_DYLD_CHAINED_FIXUPS imports table and resolve bind ordinals to their in-image definitions.
- Match canonical owners against the record symbolID with underscore normalization (debug/macho's suffix-shared string table can surface one mangling underscore more or less than the source-level name).
- Splice the prebuilt header's base slot back into the fixup chain as a live rebase node: dyld writes the slid text base at load, so the runtime reads a ready runtime PC with no slide arithmetic (non-PIE ELF link-time values already equal runtime addresses).
- LLGO_PCLNPOST=0 escape hatch keeps first-use construction.
Fresh-pc FuncForPC slow path: darwin 6-8µs -> 1.2-1.7µs, linux 6.8µs -> 0.5µs; first-in-process lookup: darwin ~32µs -> ~14µs, linux ~6.8µs -> ~4µs.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
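To make the rebase vs bind distinction above concrete, here is a small decoder for one on-disk pointer slot. It assumes the generic 64-bit chained-pointer layout (DYLD_CHAINED_PTR_64 from Apple's fixup-chains header); arm64e images use a different format, and the type and function names are hypothetical rather than taken from the rewriter.

```go
package main

import "fmt"

// chainedPtr is a decoded slot, assuming the DYLD_CHAINED_PTR_64 layout:
//   rebase: target:36 high8:8 reserved:7  next:12 bind:1 (bind == 0)
//   bind:   ordinal:24 addend:8 reserved:19 next:12 bind:1 (bind == 1)
type chainedPtr struct {
	bind    bool
	target  uint64 // rebase only: low 36 bits
	ordinal uint32 // bind only: index into the imports table
	next    uint16 // distance to the next slot in the chain, in 4-byte units
}

// decodeChainedPtr splits the raw 64-bit slot value into its fields.
func decodeChainedPtr(raw uint64) chainedPtr {
	p := chainedPtr{
		bind: raw>>63 == 1,
		next: uint16((raw >> 51) & 0xfff),
	}
	if p.bind {
		p.ordinal = uint32(raw & 0xffffff)
	} else {
		p.target = raw & (1<<36 - 1)
	}
	return p
}

func main() {
	rebase := uint64(0x1234) | uint64(3)<<51
	bind := uint64(1)<<63 | 7
	fmt.Printf("%+v\n", decodeChainedPtr(rebase))
	fmt.Printf("%+v\n", decodeChainedPtr(bind))
}
```

A decoder that only handles the rebase case would read a bind slot's ordinal as a bogus target, which is consistent with the dropped stub records described in the commit.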
Pure-compute probes (recursive fib, JSON round-trip, sort.Ints, map churn) with no runtime introspection, so one harness run covers both the introspection extremes and what the funcinfo machinery costs code that never asks for it. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Go's pclntab pages are touched by its own runtime (traceback, GC) long before user code queries it, so its first FuncForPC never pays page-in. Mirror that: when the prebuilt table is present, init adopts it (zero-copy, sub-µs), touches the pages the lookup path reads (blob, funcinfo records, string offsets, strings), runs one synthetic lookup to warm the code paths, and write-warms the FuncForPC cache pages. First-in-process FuncForPC: darwin ~17µs -> ~2.8µs, linux ~6.6µs -> ~1.0µs. Startup cost is page-count-bound (tens of µs on stdlib-sized tables, invisible next to ~3ms process startup; hello-world medians unchanged). Non-prebuilt binaries stay fully lazy: first-use construction allocates, which has no place in init, and programs that never introspect pay nothing. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
-depths generates deep_<N> scenarios at configurable call depths; -bigsizes generates bigfunc scenarios (funcs x statements) whose large bodies stress statement-level pcline density, mid-function pc symbolization, and ordinary performance of big method bodies. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
5bb0176 to bd1d08c
|
Rebased onto #2016 ( |
Summary
- recover() values assert to error and runtime.Error.
- TypeAssertionError values for compiler-generated failed type assertions, and fix slice-to-array conversion panic bounds order.
- recover semantics through compiler-generated method wrappers when the wrapper is itself the deferred call, while keeping nested wrapper calls non-recovering.
- fixedbugs/issue73917.go and fixedbugs/issue73920.go: a recover in a real helper call reached through deferred promoted wrappers remains indirect, and the outer deferred recover still sees the panic.
- longjmp recovery by emitting volatile loads/stores for local SSA allocs in functions with recover blocks.
- test/go coverage for recovered runtime errors, type assertion panic values, method-wrapper recover semantics, embedded wrapper function-pointer recover semantics, and fault recovery preserving named results; remove the now-passing recover2.go, zerodivide.go, recover4.go, fixedbugs/issue73917.go, and fixedbugs/issue73920.go GOROOT xfails.
Still XFail / Out of Scope
- recover.go remains xfail in this PR. On this branch it still fails before the recover-10 path with the ssa: panic with runtime type assertion errors #1892 interface type identity issue: interface conversion: interface {} is func(*main.T1), not func(*main.T1) (types from different scopes). missing recover 10 is covered independently by test/go.
- test/go cases are Go 1.26+ semantic coverage. Go 1.24 has pre-Go 1.26 behavior for the upstream issue73917.go/issue73920.go programs, so those local tests skip under Go <1.26.
- test/go currently hits unrelated nil/range behavior in TestRangeOverNilArrayPointerCallIsEvaluated; this PR intentionally does not cover nil/range, chan, print, interface, reflect, GC, liveness, finalizer, or goroutine domains.
Testing
- go test ./test/go -run 'TestRecoverThrough|TestRecoverAfterFault' -count=1
- go run ./cmd/llgo test -run 'TestRecoverThrough|TestRecoverAfterFault' -count=1 ./test/go
- go test ./test/goroot -run TestGoRootRunCases/recover4.go -count=1 -args -goroot "$(go env GOROOT)" -dirs . -case '^recover4\.go$' -xfail /tmp/llgo-empty-xfail.yaml
- go test ./test/goroot -run TestGoRootRunCases/recover4.go -count=1 -args -goroot "$(go env GOROOT)" -dirs . -case '^recover4\.go$'
- go test ./test/goroot -run TestGoRootRunCases/recover.go -count=1 -args -goroot "$(go env GOROOT)" -dirs . -case '^recover\.go$' (passes via retained xfail)
- go test ./test/goroot -run TestGoRootRunCases/recover.go -count=1 -args -goroot "$(go env GOROOT)" -dirs . -case '^recover\.go$' -xfail /tmp/llgo-empty-xfail.yaml (expected failure: ssa: panic with runtime type assertion errors #1892 interface type identity blocker)
- go test ./test/go -run 'TestDeferredEmbedded.*MethodWrapperKeepsIndirectRecoverNil|TestRecoverThrough' -count=1
- /Users/lijie/sdk/go1.26.0/bin/go test ./test/go -run 'TestDeferredEmbedded.*MethodWrapperKeepsIndirectRecoverNil' -count=1 -v
- /Users/lijie/sdk/go1.26.0/bin/go run ./cmd/llgo test -run 'TestDeferredEmbedded.*MethodWrapperKeepsIndirectRecoverNil|TestRecoverThrough' -count=1 ./test/go
- go test ./test/goroot -run TestGoRootRunCases -count=1 -args -goroot /Users/lijie/sdk/go1.26.0 -dirs fixedbugs -case '^fixedbugs/issue739(17|20)\.go$'
- go test ./test/go -count=1
- (cd runtime && go test ./internal/runtime -count=1)
- go test ./ssa -count=1
- go test ./cl -count=1
- /Users/lijie/sdk/go1.26.0/bin/go test ./test/go -count=1 (known unrelated failure: TestRangeOverNilArrayPointerCallIsEvaluated nil array pointer range SIGSEGV)
GOROOT CI
Full GOROOT CI is disabled/too slow for regular PR validation in this repo and the GOROOT workflow is manual-only. This PR keeps stable coverage in test/go and lists the exact targeted GOROOT commands above. I will monitor the available PR checks after pushing the fork head branch; I will not push branches to xgo-dev/llgo.