Add perf timestamps to top level functions. #258
Please rebase and add unit tests.
Hi @snehasish, thanks for the review! Does the revised patch look OK?
snehasish
LGTM. Could you add a reference to the GCC RFC to the commit message?
https://gcc.gnu.org/pipermail/gcc-patches/2025-October/696758.html
I have a few questions regarding the work described in the RFC.
- Did you get numbers for SPEC?
- What was the nature of the internal workload, e.g. HPC, server, mobile, etc.?
- Did the profiling session include the entire workload run, just the startup or some running snapshot?
- Did you experiment with code layout tools such as BOLT and Propeller?
Thanks!
      callsites(0),
-     pos_counts() {
+     pos_counts(),
+     timestamp(0) {
The formatting looks off here?
Ah, I think this was a tab stop instead of spaces; fixed.
To answer your questions:
- There was no meaningful difference for SPEC workloads.
- The workload is representative of a program with deep call chains.
- The profiling session included the entire workload run.
- The speedup is comparable to BOLT's code layout speedup. I haven't tried Propeller.
Thanks!
Hi @snehasish, thanks for the suggestions. I have fixed the formatting issue in the revised patch, and the link to GCC's RFC is now included in the commit message. Is this patch OK to merge?
Will merge once the workflows complete. Thanks!
This patch is the counterpart to GCC's patch that enables time-profile-based reordering from perf timestamps with AutoFDO: https://gcc.gnu.org/pipermail/gcc-patches/2025-October/696758.html
The patch adds a new timestamp field to Symbol and emits it to gcov for version >= 3. Currently, the patch only records timestamps in PMU mode (PerfDataSampleReader); I will post a follow-up patch shortly to record timestamps with SPE.
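For readers unfamiliar with the change, here is a minimal sketch of what a timestamp field on a symbol and a version-gated emit could look like. It is not the code from this patch: the `Symbol` struct below is a simplified stand-in, `RecordTimestamp` and `EmitSymbol` are hypothetical helpers, the "keep the earliest timestamp" policy is an assumption, and the flat word list is only a placeholder for the real gcov layout.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for AutoFDO's Symbol; only the fields
// needed for this illustration are shown.
struct Symbol {
  Symbol() : total_count(0), head_count(0), timestamp(0) {}
  uint64_t total_count;
  uint64_t head_count;
  // Assumption: earliest perf sample timestamp seen for this top-level
  // function, with 0 meaning "no timestamp recorded".
  uint64_t timestamp;
};

// Hypothetical helper: keep the earliest non-zero timestamp for a function.
inline void RecordTimestamp(Symbol& sym, uint64_t sample_time) {
  if (sample_time == 0) return;  // the sample carried no timestamp
  if (sym.timestamp == 0 || sample_time < sym.timestamp) {
    sym.timestamp = sample_time;
  }
}

// Hypothetical serializer: the timestamp is appended only for format
// version >= 3, so older readers still see the layout they expect.
inline std::vector<uint64_t> EmitSymbol(const Symbol& sym, int version) {
  std::vector<uint64_t> words = {sym.head_count, sym.total_count};
  if (version >= 3) words.push_back(sym.timestamp);
  return words;
}
```

Gating on the version keeps profiles written in the older format readable, at the cost of carrying no timestamps in them.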
Results:
On an internal large workload with GCC's AutoFDO and time profile reordering (enabled by this patch), we see an improvement of ~32%.
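As a rough illustration of the reordering idea, the sketch below orders functions by their first-seen timestamp and places functions without a timestamp last. This is not GCC's implementation: `TimedFunction` and `OrderByFirstUse` are hypothetical names, and treating 0 as "no timestamp" is an assumption.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical (function name, first-seen timestamp) pair; a timestamp of 0
// is assumed to mean that no timestamp was recorded.
using TimedFunction = std::pair<std::string, uint64_t>;

// Returns function names ordered by first use. Functions without a timestamp
// keep their relative order and are placed after all timestamped ones.
inline std::vector<std::string> OrderByFirstUse(
    std::vector<TimedFunction> funcs) {
  std::stable_sort(funcs.begin(), funcs.end(),
                   [](const TimedFunction& a, const TimedFunction& b) {
                     // Exactly one side lacks a timestamp: the timestamped
                     // function sorts first.
                     if ((a.second == 0) != (b.second == 0)) {
                       return b.second == 0;
                     }
                     return a.second < b.second;
                   });
  std::vector<std::string> names;
  names.reserve(funcs.size());
  for (const TimedFunction& f : funcs) names.push_back(f.first);
  return names;
}
```

Placing functions in the order they first run is intended to keep code that executes close together in time close together in the binary.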