Add reporting, retry lifecycle, and dry-run functionality #231

Open
timkpaine wants to merge 2 commits into main from tkp/rep

@timkpaine

Member

Introduce a structured reporting layer that captures evaluation metadata,
timing, topology, retries, and failures without consuming result payloads,
mirroring the existing evaluator/policy architecture.

- ReportingPolicy shared core with span/run ContextVars for nested,
  thread/async-local isolation
- ReportEvent model plus NoOp/InMemory/Logging/Composite/UI reporters and
  a bounded UI polling buffer
- Tracing/metrics/alerts policies and OpenTelemetry tracing/metrics
  integration (exposed via otel/full/develop/test extras)
- Structural reporting models and a <Vendor><Signal>Reporting{Evaluator,Model}
  taxonomy with placeholder vendor classes
- Refactor LoggingEvaluator onto LoggingPolicy to share formatting and
  enable LoggingModel while preserving the existing import path and log output
- Retry lifecycle events now carry run_id and child depth via
  current_span_depth(); reporter failures are isolated on reporting/retry paths
- DryRunEvaluator with context-local planning guard; synthetic mode is
  non-transparent so results are not cached under real-run keys; node_key
  strips the dry-run evaluator layer while preserving non-evaluator options
  so it matches cache_key() for the logical node
- ReportingStateStore preserves terminal outcomes while allowing retry streams
  to progress
- Docs: reporting workflow, reporter options, OpenTelemetry install, reserved
  run/graph phases, extra payload keys, and dry-run synthetic-result warnings
- Tests across utils/evaluators/models covering success/error flows, dry-run
  override recursion, cache composition, concurrent dry-run reuse, node-key
  semantics, retry event nesting, reporter failure isolation, and state folding
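
Two of the mechanisms listed above, context-local span depth and reporter failure isolation, can be illustrated with a minimal standalone sketch. This is not the code in this PR: `current_span_depth()` is named in the description, but its real signature is not shown here, and `span()`, `InMemoryReporter`, `BrokenReporter`, `CompositeReporter.report()`, `run_nested()`, and `demo()` are hypothetical stand-ins written only for illustration.

```python
import contextvars
import logging
import threading
from contextlib import contextmanager

logger = logging.getLogger("reporting_sketch")

# Hypothetical stand-in for the span ContextVar described above (not ccflow code).
_span_depth = contextvars.ContextVar("span_depth", default=0)


def current_span_depth() -> int:
    """Return the nesting depth of the span active in the current context."""
    return _span_depth.get()


@contextmanager
def span():
    """Enter a nested span; the previous depth is restored on exit."""
    token = _span_depth.set(_span_depth.get() + 1)
    try:
        yield current_span_depth()
    finally:
        _span_depth.reset(token)


class InMemoryReporter:
    """Collects events in a list (illustrative only)."""

    def __init__(self):
        self.events = []

    def report(self, event):
        self.events.append(event)


class BrokenReporter:
    """Always raises, to exercise failure isolation."""

    def report(self, event):
        raise RuntimeError("reporter backend unavailable")


class CompositeReporter:
    """Fans an event out to several reporters, isolating each failure."""

    def __init__(self, *reporters):
        self.reporters = reporters

    def report(self, event):
        for reporter in self.reporters:
            try:
                reporter.report(event)
            except Exception:
                # A failing reporter must not break the caller or sibling reporters.
                logger.exception("reporter %r failed", reporter)


def run_nested(reporter, label):
    """Emit one event per nesting level, tagged with the current depth."""
    with span():
        reporter.report((label, "outer", current_span_depth()))
        with span():
            reporter.report((label, "inner", current_span_depth()))
    reporter.report((label, "done", current_span_depth()))


def demo():
    """Run the nested spans on the main thread, then on a worker thread."""
    memory = InMemoryReporter()
    reporter = CompositeReporter(BrokenReporter(), memory)
    run_nested(reporter, "main")
    worker = threading.Thread(target=run_nested, args=(reporter, "worker"))
    worker.start()
    worker.join()
    return memory.events


if __name__ == "__main__":
    for event in demo():
        print(event)
```

In this sketch the broken reporter's exception is logged and swallowed, so the in-memory reporter still receives every event, and each thread tracks its own depth because the counter lives in a `ContextVar` rather than in a module-level global.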
@github-actions

github-actions Bot commented Jun 8, 2026

Contributor

Test Results

Files: 1 (±0) · Suites: 1 (±0) · Duration: 3m 11s ⏱️ (-1s)

| | Total | ✅ | 💤 | ❌ |
|---|---|---|---|---|
| Tests | 1 264 (+83) | 1 262 (+83) | 2 (±0) | 0 (±0) |
| Runs | 1 270 (+83) | 1 268 (+83) | 2 (±0) | 0 (±0) |

Results for commit 8b4bf97. ± Comparison against base commit 30e109a.

♻️ This comment has been updated with latest results.

@codecov

codecov Bot commented Jun 8, 2026


Codecov Report

❌ Patch coverage is 96.99054% with 35 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.29%. Comparing base (30e109a) to head (8b4bf97).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ccflow/utils/reporting.py | 93.57% | 17 Missing and 8 partials ⚠️ |
| ccflow/evaluators/reporting.py | 95.12% | 2 Missing and 2 partials ⚠️ |
| ccflow/tests/evaluators/test_reporting.py | 98.07% | 4 Missing ⚠️ |
| ccflow/tests/models/test_reporting.py | 97.01% | 2 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #231      +/-   ##
==========================================
+ Coverage   93.06%   93.29%   +0.22%     
==========================================
  Files         163      169       +6     
  Lines       18007    19109    +1102     
  Branches     1168     1209      +41     
==========================================
+ Hits        16758    17827    +1069     
- Misses       1023     1046      +23     
- Partials      226      236      +10     

Signed-off-by: Tim Paine <3105306+timkpaine@users.noreply.github.com>
timkpaine marked this pull request as ready for review June 23, 2026 22:34