Fix clocktest false failure on high CPU count systems (bugfix)#2539
Open
mreed8855 wants to merge 1 commit into
HPE tested this.
Testing for clock jitter on 688 cpus
Testing clock direction for 5 minutes...
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@           Coverage Diff           @@
##             main    #2539   +/-   ##
=======================================
  Coverage   58.92%   58.92%
=======================================
  Files         476      476
  Lines       48039    48039
  Branches     8577     8577
=======================================
  Hits        28306    28306
  Misses      18838    18838
  Partials      895      895
On systems with many cores (e.g., 688), the cumulative time spent switching affinity and yielding exceeds MAX_JITTER (0.2s). This causes the test to misidentify loop overhead as clock skew. This patch brackets the sampling loop and normalizes each CPU's timestamp against its relative position in the sequence. This isolates actual hardware jitter from the predictable delay of the measurement loop itself.
afb8002 to 71b311b
Description
This patch brackets the sampling loop and normalizes each CPU's timestamp against its relative position in the sequence. This isolates actual hardware jitter from the predictable delay of the measurement loop itself.
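The normalization described above can be illustrated with a small simulation. This is a sketch of the arithmetic only, not the patch itself: the function names, the even-spacing model, and the 0.5 ms per-CPU loop cost are assumptions chosen for illustration; only MAX_JITTER (0.2s) and the CPU count (688) come from this PR.

```python
MAX_JITTER = 0.2  # seconds; threshold quoted in the PR description


def raw_spread(samples):
    """Spread of the raw timestamps, which includes loop overhead."""
    return max(samples) - min(samples)


def normalized_jitter(samples, loop_start, loop_end):
    """Spread after removing each sample's expected position in the loop.

    Assumes (hypothetically) that the loop cost is spread evenly, so the
    i-th sample is expected i * per_cpu seconds after the first one.
    """
    per_cpu = (loop_end - loop_start) / len(samples)
    residuals = [t - loop_start - i * per_cpu for i, t in enumerate(samples)]
    return max(residuals) - min(residuals)


def simulate(n_cpus, step, skew=None, loop_start=100.0):
    """Fake one pass over all CPUs; `skew` maps a CPU index to a clock error."""
    skew = skew or {}
    samples = [loop_start + (i + 1) * step + skew.get(i, 0.0)
               for i in range(n_cpus)]
    loop_end = loop_start + n_cpus * step
    return samples, loop_start, loop_end


# Healthy machine: 688 CPUs, assumed 0.5 ms of affinity/yield cost per CPU.
samples, start, end = simulate(688, 0.0005)
healthy_raw = raw_spread(samples)
healthy_norm = normalized_jitter(samples, start, end)

# Same machine with one CPU whose clock is off by 0.5 s.
skewed, start, end = simulate(688, 0.0005, skew={300: 0.5})
skewed_norm = normalized_jitter(skewed, start, end)

print(healthy_raw > MAX_JITTER)   # raw spread trips the threshold
print(healthy_norm > MAX_JITTER)  # normalized spread does not
print(skewed_norm > MAX_JITTER)   # a real skew is still caught
```

Under these assumptions the raw spread on a healthy machine exceeds the threshold purely because of loop overhead, while the normalized value stays near zero and a genuine offset on a single CPU is still reported.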
Resolved issues
https://bugs.launchpad.net/hp/+bug/2143999
This issue was seen on 26.04 and on 24.04, but only on systems with high core counts.
Documentation
Tests