Add perf timestamps to top level functions. #258

Merged
snehasish merged 1 commit into google:master from prathameshnv:p-742
May 13, 2026

Conversation

@prathameshnv
Contributor

This patch is the counterpart to GCC's patch that enables time-profile-based reordering using perf timestamps with AutoFDO: https://gcc.gnu.org/pipermail/gcc-patches/2025-October/696758.html

The patch adds a new timestamp field to Symbol and emits it to gcov for version >= 3. Currently, the patch only records timestamps in PMU mode (PerfDataSampleReader); I will post a follow-up patch shortly to record timestamps with SPE.

Results:
On an internal large workload with GCC's AutoFDO and time profile reordering (enabled by this patch), we see an improvement of ~32%.
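
For readers unfamiliar with the idea, the timestamp field and the version gating described above can be sketched roughly as follows. This is a minimal illustration, not the code in this patch: `SketchSymbol`, `SketchSample`, `RecordTimestamps`, and `EmitSymbol` are hypothetical names, the serialized layout is invented, and keeping the earliest non-zero sample time per function is an assumption about what the timestamp represents.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a profile symbol. This is NOT the
// real AutoFDO Symbol class; only the idea of a `timestamp` field comes
// from the PR description.
struct SketchSymbol {
  uint64_t total_count = 0;
  uint64_t timestamp = 0;  // Assumption: 0 means "no timestamp recorded".
};

// Hypothetical perf sample: the top-level function it hit and its time.
struct SketchSample {
  std::string function;
  uint64_t time;
};

// Assumption: keep the earliest non-zero sample time seen per function.
std::map<std::string, SketchSymbol> RecordTimestamps(
    const std::vector<SketchSample>& samples) {
  std::map<std::string, SketchSymbol> symbols;
  for (const SketchSample& s : samples) {
    SketchSymbol& sym = symbols[s.function];
    ++sym.total_count;
    if (s.time != 0 && (sym.timestamp == 0 || s.time < sym.timestamp)) {
      sym.timestamp = s.time;
    }
  }
  return symbols;
}

// Serialize one symbol as a flat list of words. The timestamp is appended
// only for format version >= 3, mirroring the gating in the description.
// The layout here is invented for illustration; it is not the gcov format.
std::vector<uint64_t> EmitSymbol(const SketchSymbol& sym, unsigned version) {
  std::vector<uint64_t> words = {sym.total_count};
  if (version >= 3) {
    words.push_back(sym.timestamp);
  }
  return words;
}
```

Gating the extra word on the version number means that, in this sketch, output for versions below 3 is unchanged, while version 3 and later carries one additional value per symbol.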

@snehasish
Collaborator

Please rebase and add unit tests.

@prathameshnv
Contributor Author

Hi @snehasish, thanks for the review! Does the revised patch look OK?

Collaborator

@snehasish snehasish left a comment

LGTM. Could you add a reference to the GCC RFC to the commit message?
https://gcc.gnu.org/pipermail/gcc-patches/2025-October/696758.html

I have a few questions regarding the work described in the RFC.

  • Did you get numbers for SPEC?
  • What was the nature of the internal workload e.g. HPC, Server, Mobile etc?
  • Did the profiling session include the entire workload run, just the startup or some running snapshot?
  • Did you experiment with code layout tools such as BOLT and Propeller?

Thanks!

Comment thread on symbol_map.h (outdated)
       callsites(0),
-      pos_counts() {
+      pos_counts(),
+  timestamp(0) {
Collaborator

The formatting looks off here?

Contributor Author

Ah, I think this was a tab instead of spaces; fixed.

To answer your questions:

  1. There was no meaningful difference for SPEC workloads.
  2. The workload is representative of a program with deep call chains.
  3. The profiling session included the entire workload run.
  4. The speedup is comparable to BOLT's code layout speedup. I haven't tried Propeller.

Thanks!

@prathameshnv
Contributor Author

Hi @snehasish, thanks for the suggestions. I have fixed the formatting issue in the revised patch, and the link to GCC's RFC is included in the commit message. Is this patch OK to merge?

@snehasish
Collaborator

Will merge once the workflows complete. Thanks!

@snehasish snehasish merged commit 158a27f into google:master May 13, 2026
3 checks passed