Add validation to _data_setter for non-standard data node names by drbenvincent · Pull Request #847 · pymc-labs/CausalPy

drbenvincent · 2026-04-15T19:33:03Z

Summary

PyMCModel._data_setter() hard-codes "X" and "y" as data node names. If a subclass uses different names, it silently misbehaves or throws a cryptic PyMC error. This PR adds validation that raises a clear ValueError with actionable guidance when expected data nodes are missing.

Fixes #663

Changes

causalpy/pymc_models.py:
- Added validation at the top of _data_setter() that checks "X" and "y" exist in self.named_vars before proceeding
- Raises ValueError with guidance to override _data_setter() when nodes are missing
- Updated class and method docstrings to document the override contract
- No behavioral changes to existing logic — the pm.set_data call is unchanged
causalpy/experiments/panel_regression.py:
- Fixed pre-existing mypy type: ignore codes (attr-defined → union-attr)
causalpy/tests/test_pymc_models.py:
- Added NonStandardDataModel test subclass using renamed data nodes
- Added TestDataSetterValidation with 2 tests:
  - test_mismatched_X_raises: verifies clear error when "X" is renamed
  - test_missing_y_raises: verifies clear error when only "y" is renamed

Design rationale

The issue proposed 5 possible directions. After examining all existing subclasses, I found that:

All standard models (LinearRegression, WeightedSumFitter, SoftmaxWeightedSumFitter) use "X" and "y" — they work as-is
Complex models (BayesianBasisExpansionTimeSeries) already override _data_setter entirely
Models with fundamentally different data (StateSpaceTimeSeries, InstrumentalVariableRegression, PropensityScore) override predict or fit entirely

So the override contract already works for all existing subclasses. The missing piece was validation: a subclass that forgets to override _data_setter previously got a cryptic error. Now it gets a clear ValueError pointing to the solution.

I deliberately kept this minimal — no new class attributes or API surface — because the name-remapping-only case doesn't arise in any existing subclass and would add complexity for a speculative benefit.

Testing

2 new validation tests pass
All 49 existing tests in test_pymc_models.py pass
All 30 PyMC integration tests pass
All 21 timeseries model coverage tests pass
prek run --all-files clean

Checklist

Backwards compatible (no behavioral changes)
No new API surface
Tests cover both X-missing and y-missing error paths
Docstrings updated

Add _predictor_data_name and _target_data_name class attributes to PyMCModel so subclasses using non-default data node names can customize without re-implementing _data_setter. Validate at predict time and raise a clear ValueError if expected nodes are missing. Also fix pre-existing mypy type: ignore codes in panel_regression.py (attr-defined -> union-attr). Made-with: Cursor

read-the-docs-community · 2026-04-15T19:39:23Z

Documentation build overview

📚 causalpy | 🛠️ Build #32279314 | 📁 Comparing 08ee2c3 against latest (1cf3387)

🔍 Preview build

4 files changed

± 404.html
± _modules/causalpy/pymc_models.html
± api/generated/causalpy.pymc_models.PyMCModel.html
± _modules/causalpy/experiments/panel_regression.html

codecov · 2026-04-15T19:40:13Z

Codecov Report

❌ Patch coverage is 93.87755% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.44%. Comparing base (1cf3387) to head (08ee2c3).

Files with missing lines	Patch %	Lines
causalpy/tests/test_pymc_models.py	93.18%	2 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #847      +/-   ##
==========================================
- Coverage   94.44%   94.44%   -0.01%     
==========================================
  Files          80       80              
  Lines       12707    12754      +47     
  Branches      770      773       +3     
==========================================
+ Hits        12001    12045      +44     
- Misses        498      500       +2     
- Partials      208      209       +1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Remove _predictor_data_name / _target_data_name class attributes (added complexity for a case no existing subclass needs). Keep the validation that raises a clear ValueError when expected data nodes are missing. Revert undeclared y_dtype behavioral change. Improve test coverage with separate X-missing and y-missing error paths. Made-with: Cursor

github-actions Bot added the refactor Refactor, clean up, or improvement with no visible changes to the user label Apr 15, 2026

drbenvincent requested review from NathanielF and juanitorduz April 15, 2026 19:38

drbenvincent changed the title ~~Make _data_setter customizable via class attributes~~ Add validation to _data_setter for non-standard data node names Apr 15, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add validation to _data_setter for non-standard data node names#847

Add validation to _data_setter for non-standard data node names#847
drbenvincent wants to merge 2 commits into
mainfrom
issue-663-data-setter-customization

drbenvincent commented Apr 15, 2026 •

edited

Loading

Uh oh!

read-the-docs-community Bot commented Apr 15, 2026 •

edited

Loading

Uh oh!

codecov Bot commented Apr 15, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

drbenvincent commented Apr 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Design rationale

Testing

Checklist

Uh oh!

read-the-docs-community Bot commented Apr 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Documentation build overview

Uh oh!

codecov Bot commented Apr 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

drbenvincent commented Apr 15, 2026 •

edited

Loading

read-the-docs-community Bot commented Apr 15, 2026 •

edited

Loading

codecov Bot commented Apr 15, 2026 •

edited

Loading