runtime: speed up funcinfo entry lookup #2009

Closed
cpunion wants to merge 20 commits into xgo-dev:main from cpunion:codex/runtime-funcentry-perf

@cpunion cpunion commented Jul 1, 2026

Collaborator

Depends on #2002.

This is a performance follow-up for the compact runtime funcinfo table. It should be reviewed after #2002; this PR is not intended to replace that semantic baseline.

Changes

  • Emit ELF llgo_funcinfo_entry records from function-body .L... anchors so non-LTO and full-LTO builds can materialize funcinfo PCs without per-function dlsym.
  • Keep the entry/stub records DCE-safe by associating them with function-body temporary labels instead of function symbols.
  • Build a one-time symbolID -> funcIndex map and reuse it for entry/stub-site records, removing the previous O(n*m) stub-site lookup.
  • Add an exact function-value PC fallback for runtime.FuncForPC(reflect.ValueOf(fn).Pointer()), because backend prologues can place the anchor after the real entry PC.
  • Use .L temporary anchors so full-LTO does not retain thousands of local anchor symbols in the final symbol table.
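
As a rough illustration of the one-time symbolID -> funcIndex map described above, the sketch below builds the map in a single pass and then resolves each record with one lookup, so the cost is roughly O(n + m) rather than a scan of the function table per record. The `entryRecord` type, its fields, and both helper names are hypothetical stand-ins and are not taken from the PR's code.

```go
package main

import "fmt"

// entryRecord is a hypothetical stand-in for one entry or stub-site record:
// it names a function by symbol ID and carries an address observed for it.
type entryRecord struct {
	SymbolID uint32
	PC       uintptr
}

// buildSymbolIndex makes a single pass over the per-function symbol IDs and
// returns a symbolID -> funcIndex map. If an ID repeats, the first index wins.
func buildSymbolIndex(symbolIDs []uint32) map[uint32]int {
	index := make(map[uint32]int, len(symbolIDs))
	for i, id := range symbolIDs {
		if _, seen := index[id]; !seen {
			index[id] = i
		}
	}
	return index
}

// resolveEntries fills one PC slot per function using the prebuilt map.
// Records with unknown symbol IDs are skipped, and the first record seen
// for a function wins. Each record costs one map lookup.
func resolveEntries(index map[uint32]int, nfuncs int, records []entryRecord) []uintptr {
	pcs := make([]uintptr, nfuncs)
	for _, r := range records {
		if i, ok := index[r.SymbolID]; ok && pcs[i] == 0 {
			pcs[i] = r.PC
		}
	}
	return pcs
}

func main() {
	// Made-up symbol IDs and addresses, for illustration only.
	symbolIDs := []uint32{30, 10, 20}
	records := []entryRecord{{10, 0x1000}, {20, 0x2000}, {99, 0x9000}}

	index := buildSymbolIndex(symbolIDs)
	fmt.Println(resolveEntries(index, len(symbolIDs), records))
}
```

The same map can be reused for both entry and stub-site records, which is what makes building it once worthwhile.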

Validation

  • macOS: go test ./internal/build ./cl ./ssa -run 'Test(FuncInfo|RuntimeCaller|CallerFrame|DevLTOGlobalDCE)' -count=1
  • macOS: go test ./test/go -run 'TestRuntime(LineInfoAndStack|StatementLineInfo|FuncInfoConcurrentFirstUse)$' -count=1 -timeout=40m
  • Linux container: same two test commands above
  • Linux full-LTO semantic probe: FuncForPC(function value), Func.FileLine, and runtime.Caller passed
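
For readers unfamiliar with the three checks named in the semantic probe, a minimal sketch of such a probe follows. It uses only the standard `reflect` and `runtime` APIs; the helper names `probeFuncValue`, `probeCaller`, and `target` are hypothetical, and the PR's actual probe program is not shown here.

```go
package main

import (
	"fmt"
	"reflect"
	"runtime"
)

// target is the function whose value is looked up. It is kept out of line
// so that it has its own entry PC.
//
//go:noinline
func target() {}

// probeFuncValue resolves a function value's code pointer to a name, file,
// and line via runtime.FuncForPC and Func.FileLine.
func probeFuncValue(fn any) (name, file string, line int, ok bool) {
	pc := reflect.ValueOf(fn).Pointer()
	f := runtime.FuncForPC(pc)
	if f == nil {
		return "", "", 0, false
	}
	file, line = f.FileLine(pc)
	return f.Name(), file, line, true
}

// probeCaller reports the file and line of its caller via runtime.Caller.
//
//go:noinline
func probeCaller() (file string, line int, ok bool) {
	_, file, line, ok = runtime.Caller(1)
	return file, line, ok
}

func main() {
	name, file, line, ok := probeFuncValue(target)
	fmt.Println("FuncForPC:", name, file, line, ok)

	cfile, cline, cok := probeCaller()
	fmt.Println("Caller:", cfile, cline, cok)
}
```

A nil result from `runtime.FuncForPC` in the first check corresponds to the failure mode the exact function-value PC fallback is meant to avoid.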

Benchmark Notes

Each executable was run 9 times. Each performance cell shows best/trimmed avg, where the trimmed average is the mean of the 7 runs left after dropping the minimum and maximum; values carry ns, us, or ms suffixes. Sizes are in MiB.
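
The best/trimmed avg statistic can be sketched as below. This assumes "best" means the minimum sample and that exactly one minimum and one maximum are dropped before averaging; the `summarize` helper is hypothetical, and the sample values are made up rather than taken from the tables.

```go
package main

import (
	"fmt"
	"sort"
)

// summarize returns the best (minimum) sample and the mean of the samples
// left after dropping one minimum and one maximum. It reports ok=false when
// there are fewer than three samples, since nothing would remain to average.
func summarize(samples []float64) (best, trimmedAvg float64, ok bool) {
	if len(samples) < 3 {
		return 0, 0, false
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)

	var sum float64
	for _, v := range sorted[1 : len(sorted)-1] {
		sum += v
	}
	return sorted[0], sum / float64(len(sorted)-2), true
}

func main() {
	// Nine made-up timings in nanoseconds.
	runs := []float64{9, 4, 5, 5, 5, 5, 5, 5, 5}
	if best, avg, ok := summarize(runs); ok {
		fmt.Printf("%.0f/%.1fns\n", best, avg)
	}
}
```

With 9 runs this averages 7 samples, which damps a single slow or fast outlier without hiding the best observed time.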

main lacks the new funcinfo support, so its entry and cold probes fail with panic: nil FuncForPC (shown as FAIL). In the hot probe, FuncForPC and FuncFileLine on main take a nil fast path; those cells are marked with * and are not functionally comparable to the other columns.

Linux Performance

| metric | main | go | 2002 | 2002+lto | current | current+lto |
| --- | --- | --- | --- | --- | --- | --- |
| entry.FuncForPCOnly | FAIL | 4/4ns | 7/7.3ns | 4/4.9ns | 6/6.9ns | 4/5ns |
| entry.FuncNameOnly | FAIL | 10/10ns | 7/7ns | 5/6ns | 6/7.3ns | 5/5.1ns |
| entry.FuncForPCName | FAIL | 13/13.6ns | 8/8ns | 5/6ns | 8/8ns | 5/6ns |
| entry.FuncFileLine | FAIL | 9/10.7ns | 8/8.1ns | 6/6.4ns | 8/8.1ns | 6/6ns |
| hot.Caller0 | 1.2/1.3us | 167/169.7ns | 50/62.9ns | 30/50ns | 50/60ns | 40/52.9ns |
| hot.Caller1 | 4.5/4.5us | 181/183.3ns | 80/92.9ns | 70/81.4ns | 90/95.7ns | 70/80ns |
| hot.CallersOnly | 8.4/8.6us | 119/127.1ns | 180/185.7ns | 140/162.9ns | 180/188.6ns | 140/174.3ns |
| hot.CallersFramesFirst | 8.8/8.9us | 277/284ns | 440/445.7ns | 380/408.6ns | 440/460ns | 380/402.9ns |
| hot.FuncForPC | 0/1.7ns* | 15/15.3ns | 54/54.9ns | 46/47.7ns | 54/55.4ns | 46/48.6ns |
| hot.FuncFileLine | 2/2ns* | 14/15.3ns | 56/56.3ns | 48/48.3ns | 56/57.4ns | 48/50.6ns |
| cold.FirstFuncForPCNs | FAIL | 500/767.7ns | 136/138.6ms | 6/6ms | 2/2.3ms | 2/3.1ms |
| cold.WarmFuncForPCOnly | FAIL | 14/14.7ns | 2/2.1ns | 0/0.7ns | 2/2ns | 0/0.6ns |
| cold.WarmFuncFileLine | FAIL | 10/10.6ns | 2/2.7ns | 126/129.6ns | 2/2.6ns | 1/1.9ns |

Linux Size (MiB)

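| probe | main | go | 2002 | 2002+lto | current | current+lto |
| --- | --- | --- | --- | --- | --- | --- |
| entry | 1.78 | 2.15 | 2.07 | 1.85 | 2.17 | 2.35 |
| hot | 1.78 | 2.15 | 2.07 | 1.86 | 2.18 | 2.36 |
| cold | 1.91 | 2.33 | 2.26 | 1.97 | 2.37 | 2.53 |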

macOS Performance

| metric | main | go | 2002 | 2002+lto | current | current+lto |
| --- | --- | --- | --- | --- | --- | --- |
| entry.FuncForPCOnly | FAIL | 4/4ns | 6/7ns | 4/4.4ns | 7/7ns | 4/4.3ns |
| entry.FuncNameOnly | FAIL | 10/10ns | 7/7ns | 4/4.4ns | 6/7ns | 4/4ns |
| entry.FuncForPCName | FAIL | 13/13.1ns | 7/7.7ns | 5/5ns | 7/7.9ns | 5/5ns |
| entry.FuncFileLine | FAIL | 9/10.6ns | 8/8ns | 5/5ns | 8/8ns | 5/5ns |
| hot.Caller0 | 9.3/9.4us | 154/155.9ns | 45/46.6ns | 36/36.6ns | 45/46.3ns | 35/37.1ns |
| hot.Caller1 | 5.9/5.9us | 169/173ns | 73/75.6ns | 62/62.7ns | 74/76ns | 61/62.4ns |
| hot.CallersOnly | 13.6/13.7us | 119/124ns | 145/150.7ns | 129/136.1ns | 145/148.1ns | 129/134.4ns |
| hot.CallersFramesFirst | 21.4/21.5us | 263/266.7ns | 354/358ns | 303/304.9ns | 350/357ns | 298/305.7ns |
| hot.FuncForPC | 1/1ns* | 13/13.6ns | 42/42.6ns | 33/34ns | 42/43.3ns | 33/34.3ns |
| hot.FuncFileLine | 1/1ns* | 13/14.4ns | 45/46ns | 35/36.1ns | 46/46.7ns | 35/36ns |
| cold.FirstFuncForPCNs | FAIL | 1.4/1.9us | 17.4/18.2ms | 34.3/35.4ms | 21/31.4us | 21/27.3us |
| cold.WarmFuncForPCOnly | FAIL | 17/17.1ns | 1/1ns | 0/0.9ns | 1/1ns | 1/1ns |
| cold.WarmFuncFileLine | FAIL | 9/10.3ns | 2/2ns | 1/1ns | 2/2ns | 1/1ns |

macOS Size (MiB)

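| probe | main | go | 2002 | 2002+lto | current | current+lto |
| --- | --- | --- | --- | --- | --- | --- |
| entry | 1.83 | 2.24 | 2.04 | 1.59 | 2.04 | 1.59 |
| hot | 1.83 | 2.26 | 2.04 | 1.60 | 2.06 | 1.60 |
| cold | 1.94 | 2.41 | 2.18 | 1.69 | 2.18 | 1.69 |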

codecov Bot commented Jul 1, 2026


Codecov Report

❌ Patch coverage is 99.05808% with 6 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| cl/instr.go | 98.71% | 3 Missing and 3 partials ⚠️ |


cpunion commented Jul 2, 2026

Collaborator Author

Superseded by #2012 and #2016: entry lookup is now handled by the link-phase prebuilt ftab plus zero-copy runtime adoption. Review continues on #2012 and #2016.

@cpunion cpunion closed this Jul 2, 2026