[release-4.20] OCPBUGS-88705: T-BC CLOCK_REALTIME stuck at FREERUN due to dual-publisher conflict in ParseTBCLogs and extractRegularMetrics#705
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: edcdavid.
@edcdavid: Jira Issue OCPBUGS-88369 has been cloned as Jira Issue OCPBUGS-88708. Will retitle bug to link to clone.
ce0ceb8 to 02f75de
580acd2 to eb246fa
@edcdavid: This pull request references Jira Issue OCPBUGS-88705, which is invalid. The bug has been updated to refer to the pull request using the external bug tracker.
ParseTBCLogs was unconditionally forcing CLOCK_REALTIME to FREERUN (with sentinel offset -9999999999999999) whenever the T-BC FSM reported FREERUN. This overrides phc2sys, which may still be reporting LOCKED with a healthy offset, creating a dual-publisher conflict where CLOCK_REALTIME oscillates between LOCKED and FREERUN.

CLOCK_REALTIME state should be managed exclusively by phc2sys. When the T-BC chain loses its upstream source, the PHC will eventually drift, phc2sys will detect the growing offset, and CLOCK_REALTIME will transition to FREERUN naturally through the threshold-based mechanism in extractRegularMetrics.

Adds TestTBCFreerunDoesNotOverrideClockRealtime regression test.
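The single-publisher idea described in this commit message can be illustrated with a minimal, self-contained Go sketch. It is not the plugin's code: the `Tracker` type, the `T-BC` map key, and the ±100 ns threshold are hypothetical names and values chosen for illustration only. The only behavior it models is that a T-BC FSM report never writes CLOCK_REALTIME, while phc2sys offsets move it between LOCKED and FREERUN through a threshold check.

```go
package main

import "fmt"

// State is a hypothetical stand-in for the plugin's sync-state type.
type State string

const (
	Locked  State = "LOCKED"
	Freerun State = "FREERUN"
)

const (
	clockRealtime       = "CLOCK_REALTIME"
	tbcClock            = "T-BC" // hypothetical key for the T-BC FSM state
	maxOffsetNs   int64 = 100    // hypothetical threshold, not taken from the PR
)

// Tracker holds the last known state per clock.
type Tracker struct {
	states map[string]State
}

// NewTracker returns an empty tracker.
func NewTracker() *Tracker {
	return &Tracker{states: make(map[string]State)}
}

// OnTBCState records the T-BC FSM state. It deliberately does not
// touch CLOCK_REALTIME, so there is only one publisher for that clock.
func (t *Tracker) OnTBCState(s State) {
	t.states[tbcClock] = s
}

// OnPhc2sysOffset is the only writer of CLOCK_REALTIME: the state
// follows the reported offset through a simple threshold check.
func (t *Tracker) OnPhc2sysOffset(offsetNs int64) {
	if offsetNs >= -maxOffsetNs && offsetNs <= maxOffsetNs {
		t.states[clockRealtime] = Locked
		return
	}
	t.states[clockRealtime] = Freerun
}

// State returns the last recorded state for the named clock.
func (t *Tracker) State(name string) State {
	return t.states[name]
}

func main() {
	tr := NewTracker()
	tr.OnPhc2sysOffset(12)               // healthy offset
	tr.OnTBCState(Freerun)               // T-BC loses its upstream source
	fmt.Println(tr.State(clockRealtime)) // prints LOCKED
	tr.OnPhc2sysOffset(5000)             // PHC has drifted past the threshold
	fmt.Println(tr.State(clockRealtime)) // prints FREERUN
}
```

In this sketch the transition to FREERUN happens only once the offset reported by phc2sys leaves the threshold window, which mirrors the "naturally through the threshold-based mechanism" behavior the commit message describes.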
2841f92 to 58165da
Cherry-pick of #701 to release-4.20.

Manually adapted due to cherry-pick conflicts: the 4.20 branch has an older code structure where holdover logic is inlined rather than using the refactored startHoldoverTimer function.

Summary

Commit 1: Fix CLOCK_REALTIME dual-publisher conflict in ParseTBCLogs
- Removes the logic in ParseTBCLogs that was forcing FREERUN with a sentinel offset, overriding phc2sys which may still report LOCKED. CLOCK_REALTIME is now managed exclusively by phc2sys.
- Adds TestTBCFreerunDoesNotOverrideClockRealtime.

Commit 2: E1-aware E3 derivation per O-RAN O-Cloud API v04.00 Table 37
- ExtractMetrics: E3 = worst_of(phc2sys_state, E1_state). T-GM profiles skipped.
- maybePublishOSClockSyncStateChangeEvent and its call sites (adapted to 4.20 inlined holdover structure).
- GenPTPEvent to allow HOLDOVER-to-FREERUN transitions for E3.
- Adds TestE3DerivationORANTable37 covering all 5 rows of O-RAN Table 37.

E3 derivation (O-RAN Table 37)
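A rough Go sketch of the worst_of rule stated in the summary follows. It assumes a severity ordering of LOCKED < HOLDOVER < FREERUN, and it assumes that "T-GM profiles skipped" means the phc2sys state is passed through unchanged for T-GM. Both assumptions, along with the names `SyncState`, `worstOf`, and `deriveE3`, are illustrative and are not taken from the PR; the five rows of Table 37 are not reproduced here.

```go
package main

import "fmt"

// SyncState is a hypothetical enum; the numeric order encodes the
// assumed severity ranking LOCKED < HOLDOVER < FREERUN.
type SyncState int

const (
	Locked SyncState = iota
	Holdover
	Freerun
)

// String renders the state name for printing.
func (s SyncState) String() string {
	switch s {
	case Locked:
		return "LOCKED"
	case Holdover:
		return "HOLDOVER"
	case Freerun:
		return "FREERUN"
	}
	return "UNKNOWN"
}

// worstOf returns the more degraded of the two states under the
// assumed ordering.
func worstOf(a, b SyncState) SyncState {
	if a > b {
		return a
	}
	return b
}

// deriveE3 combines the phc2sys state with the E1 state. For T-GM
// profiles the derivation is skipped; returning the phc2sys state
// unchanged in that case is an assumption of this sketch.
func deriveE3(phc2sys, e1 SyncState, isTGM bool) SyncState {
	if isTGM {
		return phc2sys
	}
	return worstOf(phc2sys, e1)
}

func main() {
	fmt.Println(deriveE3(Locked, Holdover, false)) // prints HOLDOVER
	fmt.Println(deriveE3(Locked, Freerun, false))  // prints FREERUN
	fmt.Println(deriveE3(Locked, Freerun, true))   // prints LOCKED
}
```

Encoding severity in the enum order keeps worstOf to a single comparison; if the real state type is a string, an explicit rank lookup would be needed instead.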
Test plan
- go test ./plugins/ptp_operator/metrics/... passes

/assign edcdavid
Assisted-By: Cursor