Skip to content
Draft
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
130 changes: 130 additions & 0 deletions pr-plan.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,130 @@
# Add `CorrectorOutput` with `CorrectorDiagnostics` containing `delta = corrected - network_output`

Adds two value objects to the corrector package — `CorrectorDiagnostics` (a single
`delta` mapping) and `CorrectorOutput` — plus a `build_corrector_diagnostics`
helper that computes `delta[name] = corrected[name] − input_snapshot[name]` over an
explicit declared name set. Pure addition: nothing returns or consumes these yet,
so behavior is unchanged.

---

## `fme/core/corrector/output.py` (new)

```python
import dataclasses

from fme.core.corrector.state import CorrectorState
from fme.core.typing_ import TensorDict, TensorMapping


@dataclasses.dataclass
class CorrectorDiagnostics:
"""Per-step diagnostic payload describing a corrector's effect.

delta[name] = corrected[name] − network_output[name] for each variable the
corrector declares it touched. Empty when no corrector ran or none modified
Comment on lines +24 to +25

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: "replaced" values like the advected total column humidity in the case of columnwise moisture budget closure would not be included in "delta", at least not by default.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delta = corrected - network_output is a clean and well-defined definition for every network output, regardless of whether it is replaced. The "replaced" vs "delta" distinction adds unnecessary confusion to the output structure that is not necessary at this stage.

In a follow-on PR, intended default downstream usage can be added when it is needed; that's out of scope here.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick clarification on the mechanic, since I think this case is actually covered: the touched-name set is the union of every variable the enabled options write, which includes replaced fields like the advected total column humidity under moisture-budget closure. So those fields are in delta (= prescribed_value − network_value), with the raw network value recoverable as output − delta — nothing replaced is excluded. Whether a consumer later chooses to treat a replaced field differently (e.g. drop it from a correction-magnitude loss) is the labeling policy from the other thread, in the follow-on PR.

— 🤖 Drafted by Claude (Opus 4.8) via Claude Code, posted on @jpdunc23's behalf.

anything, so consumers need no None checks. The network's raw pre-correction
value is recoverable as output − delta wherever the corrected output is known.

Shaped as a value object (not a bare mapping) so later work can add a category
label and metric fields without re-opening the corrector return type.
"""

delta: TensorMapping = dataclasses.field(default_factory=dict)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: Our plan also included having something like a prescribed field that is deemed to be replacing a field, not applying a delta, and would e.g. be treated differently in corrector-based losses.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A follow-on PR will handle labeling for default behavior. For various reasons the "delta" vs "prescribed" distinction is not well-defined (e.g., the hfds "prescribed" correction is both replacement and delta) but every correction can be described cleanly as delta = corrected - network_output. So the current plan I have in mind is to simply include labels which signal default intentions for downstream uses.

Let's please save the debate for the follow on PR.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed the prescribed-vs-delta concept needs to exist — the plan keeps it, just as a label on CorrectorDiagnostics in the follow-on PR rather than a second mapping here. Two reasons the label is the better home:

  1. It's additive — a new field on this value object, no change to the corrector return type or this helper.
  2. The bucket isn't always clean at the data layer: as you note, hfds is genuinely both a replacement and a delta, so a corrector can't authoritatively pre-sort fields. A label set where the loss/consumer knows its intent can.

And nothing is dropped in the meantime: for a fully-prescribed field, delta = prescribed − network_output captures the change exactly, with the prescribed value recoverable as output and the raw network value as output − delta. The loss in the next PR has everything it needs to treat prescribed fields differently.

— 🤖 Drafted by Claude (Opus 4.8) via Claude Code, posted on @jpdunc23's behalf.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the current plan I have in mind is to simply include labels which signal default intentions for downstream uses.

Let's please save the debate for the follow on PR.

OK, we can discuss this in the context of the PR that implements these features. At least we're in agreement a class like this with delta defined should be added, and I think the other details are best discussed in that PR's context.

So the current plan I have in mind is to simply include labels which signal default intentions for downstream uses.

We had a long discussion where this was your plan going into the conversation, I posed my concerns with the plan, and we agreed on a different plan. That plan was basically to use multiple TensorMapping instead of labels, which would a single field to exist in multiple label buckets and be easier to work with than low-level one-hot-encodings of labels. Going back to the original plan I had concerns with will likely mean re-hashing that discussion, but it should be easier to have in the context of the next PR.

I think we need to talk more about HFDS specifically even separate from these PRs, because I'm kind of confused by some of it. Like the idea of corrector regularization and pre-corrector regularization, to my mind, are that we want to make it so that if the corrector were dropped, the network would give reasonable outputs, and that if we then use the corrector it's just being used for minor corrections. But that really doesn't seem to fit for HFDS.

I wonder if we might need a separate concept, outside of the corrector, for these "outputs that are derived from network outputs", and doing that could completely remove the need to have "prescribed" terms for the corrector (since they would live in the derived-output concept instead).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had a long discussion where this was your plan going into the conversation, I posed my concerns with the plan, and we agreed on a different plan. That plan was basically to use multiple TensorMapping instead of labels, which would a single field to exist in multiple label buckets and be easier to work with than low-level one-hot-encodings of labels. Going back to the original plan I had concerns with will likely mean re-hashing that discussion, but it should be easier to have in the context of the next PR.

Actually trying to implement the plan we discussed last Thursday led to too much friction, thus the change in tack.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually trying to implement the plan we discussed last Thursday led to too much friction, thus the change in tack.

Hm, having just looked at some of the corrector code I think I can imagine what you mean. For example, some values get derived only after certain corrections are applied.

These scientific issues are something we're going to have to address, but maybe let's move forward with the PR.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the idea of corrector regularization and pre-corrector regularization, to my mind, are that we want to make it so that if the corrector were dropped, the network would give reasonable outputs, and that if we then use the corrector it's just being used for minor corrections

I agree with this

But that really doesn't seem to fit for HFDS

I don't agree. For the "prescribed" method of HFDS correction (which I now realize is a terrible name), we still have a hybrid of ML + surface forcing.

  • Where sea ice fraction is 0 (fully ocean) the output is the "derived" surface flux from the forcing; this always has small irreducible error in practice because our AM4 fluxes are output on a grid that is not fully consistent with the OM4 surface flux target HFDS.
  • Where sea ice fraction is 1 (fully ice) the output is fully ML.

But the network still predicts something for HFDS over the entire ocean including open ocean grid cells that are entirely replaced by the derived flux from the AM4 flux forcing. Optimizing the error of that output (or penalizing the magnitude of the delta) regardless encourages HFDS to stay close to the target (which, as we saw, improved the interannual variability of the 3D sea water temperature), but "prescribing" the open ocean flux forces Samudra to use the AM4 reference flux forcing when correcting the ocean heat content.

I wonder if we might need a separate concept, outside of the corrector, for these "outputs that are derived from network outputs"

Don't we already have this concept via derive_func?



@dataclasses.dataclass
class CorrectorOutput:
"""Result of applying a corrector to a single step's generated data.

corrected is the adjusted output (semantically identical to today's corrected
gen_data). corrector_state is threaded through unchanged. diagnostics carries
the per-step delta.
"""

corrected: TensorDict
diagnostics: CorrectorDiagnostics = dataclasses.field(
default_factory=CorrectorDiagnostics
)
corrector_state: CorrectorState | None = None
```

## `fme/core/corrector/utils.py` (modified)

```python
from collections.abc import Iterable

from fme.core.corrector.output import CorrectorDiagnostics
from fme.core.typing_ import TensorMapping

# Existing contents kept unchanged: force_positive(data, names) and the
# ForcePositive Correction dataclass.

def build_corrector_diagnostics( # NEW
input_snapshot: TensorMapping,
corrected: TensorMapping,
touched_names: Iterable[str],
) -> CorrectorDiagnostics:
"""Build the diagnostics for a corrector from an explicit touched-name set.

delta[name] = corrected[name] − input_snapshot[name] for name in touched_names.
This is a declared diff over a known name set — it replaces the role of the
object-identity ``captured_before`` heuristic (removed when the corrector
starts returning ``CorrectorOutput``). For a name modified by more than one
enabled correction, ``delta`` is the cumulative ``corrected − input_snapshot``,
so ``input_snapshot = corrected − delta`` stays exact with no per-operation
bookkeeping.
"""
delta = {
name: corrected[name] - input_snapshot[name] for name in touched_names
}
return CorrectorDiagnostics(delta=delta)
Comment on lines +63 to +81

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: this doesn't allow us to control what corrector impacts are deemed to be deltas and which ones aren't, as we discussed.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All corrections are delta = corrected - network_output. Downstream handling for diagnostic and optim purposes is intentionally out of scope for this PR.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue this PR preserves that control rather than removing it. Because the corrected value (StepOutput.output) is always present alongside delta, every per-field view stays recoverable downstream:

  • correction magnitude → delta
  • raw network value → output − delta
  • multiplicative scale → output / (output − delta)

So the correction loss/metric in the next PR can decide, per field, whether to treat an impact as a delta, a replacement, or neither — driven by a label or config, where the intent actually lives. The control isn't lost; it moves from a corrector-time storage split to a consumer-time policy, which is strictly more flexible since the corrector no longer has to pre-commit each field to a bucket. Storing one uniform delta is what keeps all of those options open.

— 🤖 Drafted by Claude (Opus 4.8) via Claude Code, posted on @jpdunc23's behalf.

@mcgibbon mcgibbon Jun 26, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue isn't that delta is poorly defined, it's that the purpose is supposed to be to enable regularization strategies, but it doesn't fit that purpose.

For HDFS, with residual_prediction delta = ocean_fraction · net_flux. What's the purpose of regularizing this? When using prescribed, delta = ocean_fraction · (net_flux − gen_hfds) becomes a regularization term that encourages gen_hfds to become equal to net_flux. Is net_flux what the channel is being optimized on? If not, this is also an issue.

It's difficult to review this delta in a context where it's not used, can we move it to the next PR?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per #1321 (comment) we're in agreement that delta stays in this PR.

```

The helper takes `touched_names` explicitly so it is unit-testable in isolation.
A follow-up change supplies that set as the union of the names each enabled
`Correction` declares it writes (a corrector is a `CorrectionSequence` of
composable `Correction` objects), and detaches the resulting `delta` at the
step boundary. None of that is in this PR — this PR only adds the value objects
and the helper.

---

## Tests

## `fme/core/corrector/test_output.py` (new)

```python
def test_corrector_diagnostics_delta_defaults_empty():
# GOAL: a default-constructed CorrectorDiagnostics has an empty delta mapping,
# so non-corrector / no-op steps need no None checks.
...

def test_corrector_output_defaults():
# GOAL: CorrectorOutput(corrected=...) defaults to empty diagnostics and
# corrector_state=None; corrected is carried through verbatim.
...
```

## `fme/core/corrector/test_utils.py` (new)

```python
# Hand-computable tensors so delta values are asserted exactly. Prior-art test
# style: fme/core/corrector/test_ocean.py (DEVICE = get_device(), small tensors).

def test_build_corrector_diagnostics_delta():
# GOAL: delta contains exactly touched_names and delta[name] equals
# corrected[name] − input_snapshot[name], asserted exactly. Names present in
# the inputs but absent from touched_names do NOT appear in delta.
...

def test_build_corrector_diagnostics_cumulative():
# GOAL: when corrected reflects two corrections applied to one field
# (snapshot x → x + a + b), delta[name] == a + b, and
# corrected − delta == input_snapshot exactly (recoverability).
...

def test_build_corrector_diagnostics_empty():
# GOAL: empty touched_names → empty delta (the no-op / no-corrector case).
...
```
Loading