[release-4.20] OCPBUGS-88705: T-BC CLOCK_REALTIME stuck at FREERUN due to dual-publisher conflict in ParseTBCLogs and extractRegularMetrics#705

Open
edcdavid wants to merge 3 commits into redhat-cne:release-4.20 from edcdavid:cherry-pick-OCPBUGS-88705-to-release-4.20

Conversation

@edcdavid edcdavid commented Jun 15, 2026

Contributor

Cherry-pick of #701 to release-4.20.

Manually adapted due to cherry-pick conflicts: the 4.20 branch has an older code structure in which the holdover logic is inlined rather than factored into the refactored startHoldoverTimer function.

Summary

Commit 1: Fix CLOCK_REALTIME dual-publisher conflict in ParseTBCLogs

  • Removes the CLOCK_REALTIME override from ParseTBCLogs, which forced FREERUN with a sentinel offset and overrode phc2sys even when phc2sys still reported LOCKED.
  • CLOCK_REALTIME is now managed exclusively by phc2sys.
  • Adds regression test TestTBCFreerunDoesNotOverrideClockRealtime.

Commit 2: E1-aware E3 derivation per O-RAN O-Cloud API v04.00 Table 37

  • Adds E1-aware syncState downgrade in ExtractMetrics: E3 = worst_of(phc2sys_state, E1_state). T-GM profiles are skipped.
  • Removes redundant maybePublishOSClockSyncStateChangeEvent and its call sites (adapted to 4.20 inlined holdover structure).
  • Fixes GenPTPEvent to allow HOLDOVER-to-FREERUN transitions for E3.
  • Adds TestE3DerivationORANTable37 covering all 5 rows of O-RAN Table 37.

E3 derivation (O-RAN Table 37)

| E1 (PTP State) | phc2sys status | E3 (OS Clock) |
|---|---|---|
| LOCKED | offset < threshold | LOCKED |
| HOLDOVER | offset < threshold | HOLDOVER |
| FREERUN | offset < threshold | FREERUN |
| any | offset > threshold | FREERUN |
| any | phc2sys down | FREERUN |

Test plan

  • Full go test ./plugins/ptp_operator/metrics/... passes

/assign edcdavid

Assisted-By: Cursor

coderabbitai Bot commented Jun 15, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


@openshift-ci openshift-ci Bot requested review from josephdrichard and jzding June 15, 2026 20:28
openshift-ci Bot commented Jun 15, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: edcdavid
Once this PR has been reviewed and has the lgtm label, please assign josephdrichard for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot


@edcdavid: Jira Issue OCPBUGS-88369 has been cloned as Jira Issue OCPBUGS-88708. Will retitle bug to link to clone.
/retitle OCPBUGS-88705: T-BC CLOCK_REALTIME stuck at FREERUN due to dual-publisher conflict in ParseTBCLogs and extractRegularMetrics

Details

In response to this:

This is an automated cherry-pick of #701

/assign edcdavid

Assisted-By: Cursor

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@edcdavid edcdavid changed the title from "OCPBUGS-88705: T-BC CLOCK_REALTIME stuck at FREERUN due to dual-publisher conflict in ParseTBCLogs and extractRegularMetrics" to "[release-4.20] OCPBUGS-88705: T-BC CLOCK_REALTIME stuck at FREERUN due to dual-publisher conflict in ParseTBCLogs and extractRegularMetrics" Jun 15, 2026
@edcdavid edcdavid force-pushed the cherry-pick-OCPBUGS-88705-to-release-4.20 branch 2 times, most recently from ce0ceb8 to 02f75de Compare June 15, 2026 20:50
@edcdavid edcdavid force-pushed the cherry-pick-OCPBUGS-88705-to-release-4.20 branch from 580acd2 to eb246fa Compare June 23, 2026 20:16
@openshift-ci-robot


@edcdavid: This pull request references Jira Issue OCPBUGS-88705, which is invalid:

  • expected dependent Jira Issue OCPBUGS-88704 to be in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA), but it is New instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

edcdavid added 3 commits June 25, 2026 11:42
ParseTBCLogs was unconditionally forcing CLOCK_REALTIME to FREERUN
(with sentinel offset -9999999999999999) whenever the T-BC FSM
reported FREERUN. This overrides phc2sys, which may still be
reporting LOCKED with a healthy offset, creating a dual-publisher
conflict where CLOCK_REALTIME oscillates between LOCKED and FREERUN.

CLOCK_REALTIME state should be managed exclusively by phc2sys. When
the T-BC chain loses its upstream source, the PHC will eventually
drift, phc2sys will detect the growing offset, and CLOCK_REALTIME
will transition to FREERUN naturally through the threshold-based
mechanism in extractRegularMetrics.

Adds TestTBCFreerunDoesNotOverrideClockRealtime regression test.
@edcdavid edcdavid force-pushed the cherry-pick-OCPBUGS-88705-to-release-4.20 branch from 2841f92 to 58165da Compare June 25, 2026 16:59