From c0a2e4ef48ffc90e1ed94027cdbc767b7da50828 Mon Sep 17 00:00:00 2001 From: James Duncan Date: Fri, 26 Jun 2026 10:06:04 -0700 Subject: [PATCH 1/2] Add `CorrectorOutput` with `CorrectorDiagnostics` containing `delta = corrected - network_output` MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Introduces the value objects and helper that let a corrector report, per step, which variables it adjusted and by how much, without a downstream caller having to infer it. `CorrectorDiagnostics` holds a single `delta` mapping where `delta[name] = corrected[name] − network_output[name]` for each variable a corrector declares it touched; the raw pre-correction network value stays recoverable as `output − delta`. `CorrectorOutput` bundles the corrected output, its diagnostics, and the passthrough corrector state, giving the corrector a named return value that can grow new per-step payloads without breaking call sites. `build_corrector_diagnostics` computes the delta from an explicit declared touched-name set rather than by diffing tensors by object identity. This is a pure addition: nothing returns or consumes these types yet, so corrected outputs, rollout state, losses, logged metrics, and saved files are unchanged, and existing checkpoints load unchanged. A follow-up change has the corrector return `CorrectorOutput` and removes the current object-identity capture; this PR lands the foundation in isolation so it can be reviewed on its own. Changes: - `fme.core.corrector.output.CorrectorDiagnostics` — new value object with a single `delta` `TensorMapping`, defaulting to empty. - `fme.core.corrector.output.CorrectorOutput` — new value object bundling `corrected`, `diagnostics`, and a passthrough `corrector_state`. - `fme.core.corrector.utils.build_corrector_diagnostics` — new helper computing `delta = corrected − input_snapshot` over an explicit touched-name set (cumulative for a name touched more than once). - [ ] Tests added - [ ] If dependencies changed, "deps only" image rebuilt and "latest_deps_only_image.txt" file updated --- pr-plan.md | 122 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 122 insertions(+) create mode 100644 pr-plan.md diff --git a/pr-plan.md b/pr-plan.md new file mode 100644 index 000000000..7c8720eba --- /dev/null +++ b/pr-plan.md @@ -0,0 +1,122 @@ +# Add `CorrectorOutput` with `CorrectorDiagnostics` containing `delta = corrected - network_output` + +Adds two value objects to the corrector package — `CorrectorDiagnostics` (a single +`delta` mapping) and `CorrectorOutput` — plus a `build_corrector_diagnostics` +helper that computes `delta[name] = corrected[name] − input_snapshot[name]` over an +explicit declared name set. Pure addition: nothing returns or consumes these yet, +so behavior is unchanged. + +--- + +## `fme/core/corrector/output.py` (new) + +```python +import dataclasses + +from fme.core.corrector.state import CorrectorState +from fme.core.typing_ import TensorDict, TensorMapping + + +@dataclasses.dataclass +class CorrectorDiagnostics: + """Per-step diagnostic payload describing a corrector's effect. + + delta[name] = corrected[name] − network_output[name] for each variable the + corrector declares it touched. Empty when no corrector ran or none modified + anything, so consumers need no None checks. The network's raw pre-correction + value is recoverable as output − delta wherever the corrected output is known. + + Shaped as a value object (not a bare mapping) so later work can add a category + label and metric fields without re-opening the corrector return type. + """ + + delta: TensorMapping = dataclasses.field(default_factory=dict) + + +@dataclasses.dataclass +class CorrectorOutput: + """Result of applying a corrector to a single step's generated data. + + corrected is the adjusted output (semantically identical to today's corrected + gen_data). corrector_state is threaded through unchanged. diagnostics carries + the per-step delta. + """ + + corrected: TensorDict + diagnostics: CorrectorDiagnostics = dataclasses.field( + default_factory=CorrectorDiagnostics + ) + corrector_state: CorrectorState | None = None +``` + +## `fme/core/corrector/utils.py` (modified) + +```python +from collections.abc import Iterable + +from fme.core.corrector.output import CorrectorDiagnostics +from fme.core.typing_ import TensorMapping + +# def force_positive(data, names): ... # unchanged + +def build_corrector_diagnostics( # NEW + input_snapshot: TensorMapping, + corrected: TensorMapping, + touched_names: Iterable[str], +) -> CorrectorDiagnostics: + """Build the diagnostics for a corrector from an explicit touched-name set. + + delta[name] = corrected[name] − input_snapshot[name] for name in touched_names. + This is a declared diff over a known name set — it replaces the role of the + object-identity ``captured_before`` heuristic (removed when the corrector + starts returning ``CorrectorOutput``). For a name modified by more than one + enabled correction, ``delta`` is the cumulative ``corrected − input_snapshot``, + so ``input_snapshot = corrected − delta`` stays exact with no per-operation + bookkeeping. + """ + delta = { + name: corrected[name] - input_snapshot[name] for name in touched_names + } + return CorrectorDiagnostics(delta=delta) +``` + +--- + +## Tests + +## `fme/core/corrector/test_output.py` (new) + +```python +def test_corrector_diagnostics_delta_defaults_empty(): + # GOAL: a default-constructed CorrectorDiagnostics has an empty delta mapping, + # so non-corrector / no-op steps need no None checks. + ... + +def test_corrector_output_defaults(): + # GOAL: CorrectorOutput(corrected=...) defaults to empty diagnostics and + # corrector_state=None; corrected is carried through verbatim. + ... +``` + +## `fme/core/corrector/test_utils.py` (new) + +```python +# Hand-computable tensors so delta values are asserted exactly. Prior-art test +# style: fme/core/corrector/test_ocean.py (DEVICE = get_device(), small tensors). + +def test_build_corrector_diagnostics_delta(): + # GOAL: delta contains exactly touched_names and delta[name] equals + # corrected[name] − input_snapshot[name], asserted exactly. Names present in + # the inputs but absent from touched_names do NOT appear in delta. + ... + +def test_build_corrector_diagnostics_cumulative(): + # GOAL: when corrected reflects two corrections applied to one field + # (snapshot x → x + a + b), delta[name] == a + b, and + # corrected − delta == input_snapshot exactly (recoverability). + ... + +def test_build_corrector_diagnostics_empty(): + # GOAL: empty touched_names → empty delta (the no-op / no-corrector case). + ... +``` From 7e3bf6ccd355a360002ed2c41c03fa773f7c3bc9 Mon Sep 17 00:00:00 2001 From: James Duncan Date: Fri, 26 Jun 2026 12:56:18 -0700 Subject: [PATCH 2/2] Re-ground corrector-diagnostics plan against the builder-pattern refactor main introduced the CorrectionSequence builder pattern (corrections are now composable Correction objects assembled by each config's _build). The plan's value objects and build_corrector_diagnostics helper are structure-agnostic and unchanged; update the utils.py snippet to name its new neighbors and add a forward note that the touched-name set will be the union of names each enabled Correction declares it writes. --- pr-plan.md | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/pr-plan.md b/pr-plan.md index 7c8720eba..76a50d039 100644 --- a/pr-plan.md +++ b/pr-plan.md @@ -57,7 +57,8 @@ from collections.abc import Iterable from fme.core.corrector.output import CorrectorDiagnostics from fme.core.typing_ import TensorMapping -# def force_positive(data, names): ... # unchanged +# Existing contents kept unchanged: force_positive(data, names) and the +# ForcePositive Correction dataclass. def build_corrector_diagnostics( # NEW input_snapshot: TensorMapping, @@ -80,6 +81,13 @@ def build_corrector_diagnostics( # NEW return CorrectorDiagnostics(delta=delta) ``` +The helper takes `touched_names` explicitly so it is unit-testable in isolation. +A follow-up change supplies that set as the union of the names each enabled +`Correction` declares it writes (a corrector is a `CorrectionSequence` of +composable `Correction` objects), and detaches the resulting `delta` at the +step boundary. None of that is in this PR — this PR only adds the value objects +and the helper. + --- ## Tests