[XV][Animal] Compose CrossValidation Rules for Animal Data#2935

Open
allisterakun wants to merge 69 commits into dev from animal_xvalidation

Conversation

@allisterakun

@allisterakun allisterakun commented Apr 7, 2026

Collaborator

Adds 27 cross-validation rules for the Animal module, covering herd configuration, reproduction settings, body weight, culling, and methane mitigation. Also extends CrossValidator with two new relationship operators (is_not_null, is_in) and a field_to_save option for for_each blocks, and removes SynchED_CP/SynchED_2P from HeiferTAISubProtocol (those values belong only to HeiferSynchEDSubProtocol).

Context

Closes #2888 and #2976.

What

27 animal cross-validation rules added in input/metadata/cross_validation/animal_cross_validation.json:

Herd configuration

  1. Sum of all parity fractions (parity_fractions.1.5) must equal 1.0.
  2. avg_gestation_len must exceed days_in_preg_when_dry (dry-off must occur before calving).
  3. heifer_repro_cull_time must exceed breeding_start_day_h (heifer culling window opens after breeding starts).
  4. (When cow repro method is ED or ED-TAI) do_not_breed_time must exceed voluntary_waiting_period.
  5. breeding_start_day_h must exceed wean_day (weaning before heifer breeding begins).
  6. prefresh_day must be shorter than both avg_gestation_len and the dry period duration (avg_gestation_len - days_in_preg_when_dry).
  7. calving_interval must exceed voluntary_waiting_period + avg_gestation_len.
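
For readers who want to see the unconditional herd-timing rules (2, 3, 5, 6, 7) as plain comparisons, here is a minimal Python sketch. It is not the JSON rule format used in animal_cross_validation.json; it assumes a flat dict keyed by the field names above, and the function name `check_herd_timing` and all numeric values are made up for illustration.

```python
def check_herd_timing(cfg: dict) -> list[str]:
    """Return the violated herd-timing rules (empty list if all pass).

    Hypothetical helper: `cfg` is assumed to be a flat dict of animal inputs.
    """
    errors = []
    # Dry period duration as defined in rule 6.
    dry_period = cfg["avg_gestation_len"] - cfg["days_in_preg_when_dry"]

    if not cfg["avg_gestation_len"] > cfg["days_in_preg_when_dry"]:
        errors.append("rule 2: avg_gestation_len must exceed days_in_preg_when_dry")
    if not cfg["heifer_repro_cull_time"] > cfg["breeding_start_day_h"]:
        errors.append("rule 3: heifer_repro_cull_time must exceed breeding_start_day_h")
    if not cfg["breeding_start_day_h"] > cfg["wean_day"]:
        errors.append("rule 5: breeding_start_day_h must exceed wean_day")
    if not (cfg["prefresh_day"] < cfg["avg_gestation_len"]
            and cfg["prefresh_day"] < dry_period):
        errors.append("rule 6: prefresh_day must be shorter than gestation and dry period")
    if not cfg["calving_interval"] > (cfg["voluntary_waiting_period"]
                                      + cfg["avg_gestation_len"]):
        errors.append("rule 7: calving_interval must exceed VWP + avg_gestation_len")
    return errors


# Illustrative values only; they are not taken from any RuFaS input file.
example = {
    "avg_gestation_len": 276,
    "days_in_preg_when_dry": 218,
    "heifer_repro_cull_time": 500,
    "breeding_start_day_h": 380,
    "wean_day": 60,
    "prefresh_day": 21,
    "calving_interval": 400,
    "voluntary_waiting_period": 50,
}
violations = check_herd_timing(example)
```

Rule 4 is left out because it only applies when the cow repro method is ED or ED-TAI.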

Reproduction sub-protocol consistency
8. (When heifer repro method is ED) repro_sub_protocol must be "NA".
9. (When heifer repro method is TAI) repro_sub_protocol must be one of ["5dCG2P", "5dCGP"].
10. (When heifer repro method is SynchED) repro_sub_protocol must be one of ["2P", "CP"].
11. (When cow repro method is ED) voluntary_waiting_period must be positive; presynch_program, presynch_program_start_day, and ovsynch_program_start_day must be null; estrus_detection_rate and ED_conception_rate must be positive.
12. (When cow repro method is TAI) ovsynch_program must be set; ovsynch_program_start_day must be positive and exceed voluntary_waiting_period (DOUBLE CHECK: does ovsynch_program_start_day have to be > VWP?); ovsynch_program_conception_rate must be positive.
13. (When cow repro method is TAI and resynch_program is in ["PGFatPD", "none"]) presynch_program must be set and presynch_program_start_day must be positive (DOUBLE CHECK: do presynch_program and its start day have to be set?); estrus_detection_rate and ED_conception_rate must be positive.
14. (When cow repro method is ED-TAI) voluntary_waiting_period, estrus_detection_rate, ED_conception_rate, ovsynch_program_start_day, and ovsynch_program_conception_rate must be positive; ovsynch_program must be set; ovsynch_program_start_day must exceed voluntary_waiting_period (DOUBLE CHECK: does ovsynch_program_start_day have to be > VWP?).
15. (When presynch_program and ovsynch_program are both not null) ovsynch_program_start_day must exceed presynch_program_start_day.
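
Rules 8 to 10 amount to a lookup from heifer repro method to the allowed repro_sub_protocol values. A minimal sketch of that lookup follows; the names `ALLOWED_SUB_PROTOCOLS` and `sub_protocol_is_valid` are hypothetical, and the real rules express this with is_in inside apply_when blocks rather than in Python.

```python
# Allowed repro_sub_protocol values per heifer repro method (rules 8-10).
ALLOWED_SUB_PROTOCOLS = {
    "ED": {"NA"},
    "TAI": {"5dCG2P", "5dCGP"},
    "SynchED": {"2P", "CP"},
}


def sub_protocol_is_valid(heifer_repro_method: str, repro_sub_protocol: str) -> bool:
    """Return True if the sub-protocol is allowed for the given method."""
    allowed = ALLOWED_SUB_PROTOCOLS.get(heifer_repro_method)
    if allowed is None:
        # Unknown methods are rejected loudly in this sketch.
        raise ValueError(f"unknown heifer repro method: {heifer_repro_method!r}")
    return repro_sub_protocol in allowed


# "CP" belongs to SynchED only, which is the mismatch the enum cleanup addresses.
tai_with_cp = sub_protocol_is_valid("TAI", "CP")
synched_with_cp = sub_protocol_is_valid("SynchED", "CP")
```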

Body weight
16. birth_weight_std_ho must be less than 20% of birth_weight_avg_ho.
17. birth_weight_std_ho must be less than 20% of birth_weight_avg_ho (duplicate of rule 16 — flagged for review).
18. target_heifer_preg_day must exceed breeding_start_day_h.
19. birth_weight_avg_ho must be less than mature_body_weight_avg.

Pregnancy checks & culling
20. Pregnancy check days must be strictly ascending (preg_check_day_1 < preg_check_day_2 < preg_check_day_3), and preg_check_day_3 must precede both calving (avg_gestation_len) and dry-off (days_in_preg_when_dry).
21. Sum of all cull-reason probabilities (feet_leg, injury, mastitis, disease, udder, unknown) must equal 1.0.
22. All cull_day_prob arrays and death_day_prob must have the same length as cull_day_count.
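
The three checks in this group can be sketched in a few lines of Python. This is an illustration under assumptions, not the CrossValidator implementation: the function names are hypothetical, the sample values are made up, and the floating-point tolerance in the sum check is an assumption (the rule text only says the sum must equal 1.0).

```python
import math


def preg_checks_ok(day_1, day_2, day_3, avg_gestation_len, days_in_preg_when_dry):
    """Rule 20: strictly ascending check days, last check before dry-off and calving."""
    return (day_1 < day_2 < day_3
            and day_3 < avg_gestation_len
            and day_3 < days_in_preg_when_dry)


def cull_probs_sum_to_one(probs: dict, abs_tol: float = 1e-9) -> bool:
    """Rule 21: cull-reason probabilities sum to 1.0 (tolerance is an assumption)."""
    return math.isclose(sum(probs.values()), 1.0, abs_tol=abs_tol)


def day_arrays_match_count(cull_day_count: list, prob_arrays: list) -> bool:
    """Rule 22: every probability array has the same length as cull_day_count."""
    expected = len(cull_day_count)
    return all(len(arr) == expected for arr in prob_arrays)


# Illustrative values only.
cull_reasons = {
    "feet_leg": 0.1, "injury": 0.2, "mastitis": 0.3,
    "disease": 0.1, "udder": 0.2, "unknown": 0.1,
}
checks_pass = (
    preg_checks_ok(32, 60, 200, avg_gestation_len=276, days_in_preg_when_dry=218)
    and cull_probs_sum_to_one(cull_reasons)
    and day_arrays_match_count([0, 5, 15, 45], [[0.0, 0.2, 0.5, 1.0]] * 2)
)
```

Summing six floats rarely lands on exactly 1.0, which is why the sketch compares with `math.isclose` instead of `==`.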

Methane mitigation
23. (When method is None) methane_mitigation_additive_amount must be 0.
24. (When method is 3-NOP) methane_mitigation_additive_amount must equal 3-NOP_additive_amount.
25. (When method is monensin) methane_mitigation_additive_amount must equal monensin_additive_amount.
26. (When method is essential_oils) methane_mitigation_additive_amount must equal essential_oils_additive_amount.
27. (When method is seaweed_additive) methane_mitigation_additive_amount must equal seaweed_additive_amount.
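
Rules 23 to 27 share one shape: the method selects which per-additive field methane_mitigation_additive_amount must match. A minimal sketch, assuming the method is stored under a key named `methane_mitigation_method` (that key name is a guess) and that "None" is the literal string value:

```python
# Field that methane_mitigation_additive_amount must equal, per method (rules 24-27).
AMOUNT_FIELD_BY_METHOD = {
    "3-NOP": "3-NOP_additive_amount",
    "monensin": "monensin_additive_amount",
    "essential_oils": "essential_oils_additive_amount",
    "seaweed_additive": "seaweed_additive_amount",
}


def additive_amount_is_consistent(cfg: dict) -> bool:
    """Check methane_mitigation_additive_amount against the selected method."""
    method = cfg["methane_mitigation_method"]  # hypothetical key name
    amount = cfg["methane_mitigation_additive_amount"]
    if method == "None":
        # Rule 23: no mitigation method means no additive.
        return amount == 0
    return amount == cfg[AMOUNT_FIELD_BY_METHOD[method]]


# Illustrative values only.
monensin_cfg = {
    "methane_mitigation_method": "monensin",
    "methane_mitigation_additive_amount": 12.0,
    "monensin_additive_amount": 12.0,
}
consistent = additive_amount_is_consistent(monensin_cfg)
```

The explicit mapping matters for seaweed_additive, whose amount field is seaweed_additive_amount rather than the method name plus "_additive_amount".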

Deferred rules — not implemented due to input structure complexity:

The following rules were identified during review but could not be expressed with the current cross-validation schema. They are tracked here for future implementation.

  1. Pen manure processor type constraint (freestall/tiestall): When pen_type is "freestall" or "tiestall", the processor type of any first processor may only be "Handler" or "DailySpread".

  2. SingleStreamHandler requirement (freestall/tiestall): When pen_type is "freestall" or "tiestall", at least one first processor must be of type SingleStreamHandler.

  3. OpenLot processor requirement: When pen_type is "openlot", at least one first processor must be an OpenLot processor type.

  4. BeddedPack processor requirement: When pen_type is "bedded pack", at least one first processor must be a BeddedPack processor type.

  5. Lactating cow milking fields non-null: For LAC_COW pens, minutes_away_for_milking, first_parlor_processor, and parlor_stream_name must not be null.

  6. Stream proportion sum: The sum of all stream_proportion values within a pen must equal 1.0.

  7. Stall lower-bound checks (CLOSE_UP and LAC_COW): Extending the existing lower-bound stall pattern already applied to CALF and GROWING pens. For CLOSE_UP and LAC_COW pens, the total stall count multiplied by max_stocking_density must accommodate the computed pen population (dry cows + springers for CLOSE_UP; lactating cows for LAC_COW), with both an upper and lower stocking density bound enforced.

    Detailed Notes
    {
        "1. Too few stalls (lower bound) — extends existing pattern": "",
        "CALF and GROWING pens already have lower-bound rules.": "",
        "CLOSE_UP": [
            "Dry cow num = cow_num × (1 − milking_cow_fraction)",
            "Compute total_close_up = sum(heiferIII_num_springers, dry_cows) → save_as: total_close_up",
            "Actual check: sum(CLOSE_UP stalls) * 1.5 ≥ total_close_up / average(max_stocking_density)",
            "CLOSE_UP stalls * 0.5 <= total_close_up / average(max_stocking_density)"
        ],
        "LAC_COW": [
            "same process as CLOSE_UP"
        ]
    }
  8. Bedding config reverse lookup: All bedding names referenced in pen configs must have a corresponding entry in the bedding configuration.
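
To make the arithmetic in the Detailed Notes for deferred rule 7 concrete, here is one possible reading of the CLOSE_UP check in Python. The function name and argument names are hypothetical, the numbers are made up, and the 0.5 and 1.5 factors are copied from the notes as written; the final rule may be formulated differently.

```python
from statistics import mean


def close_up_stalls_in_bounds(cow_num, milking_cow_fraction, springers,
                              stalls_per_pen, max_stocking_densities):
    """Check CLOSE_UP stall counts against the bounds sketched in the notes."""
    # Dry cow num = cow_num x (1 - milking_cow_fraction)
    dry_cows = cow_num * (1 - milking_cow_fraction)
    # total_close_up = springers + dry cows
    total_close_up = springers + dry_cows
    # Stalls needed at the average allowed stocking density.
    required = total_close_up / mean(max_stocking_densities)
    total_stalls = sum(stalls_per_pen)
    # stalls * 0.5 <= required <= stalls * 1.5
    return 0.5 * total_stalls <= required <= 1.5 * total_stalls


# 100 cows, 85% milking -> about 15 dry cows; plus 5 springers -> about 20 animals.
in_bounds = close_up_stalls_in_bounds(
    cow_num=100, milking_cow_fraction=0.85, springers=5,
    stalls_per_pen=[10, 12], max_stocking_densities=[1.0, 1.0],
)
```

With 22 stalls the requirement of roughly 20 falls between 11 and 33, so the check passes; 5 stalls (too few) or 100 stalls (too many) would fail it.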

CrossValidator extensions (RUFAS/data_validator.py):

  • New is_not_null relationship operator.
  • New is_in relationship operator (checks left value is a member of the right-hand list) — used by conditional apply_when blocks in rules 4, 8–10, 13.
  • New field_to_save option in for_each blocks: when set, returns a flat list of a single field extracted from each matched entry instead of full dicts.
  • Removed the restriction that multi-operand aggregations cannot use no_op (was blocking valid single-operand patterns that share a block structure).

Enum cleanup (RUFAS/biophysical/animal/data_types/repro_protocol_enums.py):

  • Removed SynchED_CP and SynchED_2P from HeiferTAISubProtocol; these values belong exclusively to HeiferSynchEDSubProtocol.

Why

Closes #2888 — the Animal module lacked cross-validation rules to catch logically inconsistent input configurations before simulation begins.

How

  • Created animal_cross_validation.json with 27 rules covering all major animal input sections (herd info, management decisions, farm-level repro, body weight, culling, and methane mitigation).
  • Extended CrossValidator with is_not_null and is_in operators needed to express the conditional (apply_when) rules, and field_to_save to support rules that filter array entries and extract a single field.
  • Cleaned up HeiferTAISubProtocol to remove the two members that were misplaced there (SynchED_CP, SynchED_2P).

Test plan

  • All existing unit tests pass (584 passing in the animal test suite; 432 in test_data_validator).
  • Fixed 4 broken test_data_validator tests asserting the now-removed multi-operand no_op restriction.
  • Fixed 5 broken animal/reproduction tests referencing the removed HeiferTAISubProtocol members.
  • Added 7 new unit tests for the new CrossValidator functionality: test_evaluate_is_not_null (5 parametrized cases) and test_evaluate_condition_is_not_null_branch.

Input Changes

  • Added input/metadata/cross_validation/animal_cross_validation.json with 27 cross-validation rules for the Animal module.

Output Changes

  • N/A

@github-actions

github-actions Bot commented Apr 7, 2026

Contributor

Current Coverage: 99%

Mypy errors on animal_xvalidation branch: 1191
Mypy errors on dev branch: 1191
No difference in error counts

@github-actions

github-actions Bot commented Apr 7, 2026

Contributor

🚨 Please update the changelog. This PR cannot be merged until changelog.md is updated.
🚨 Unauthorized changes detected in protected files. Please remove these changes if they are not intended.

@allisterakun allisterakun changed the title from "wip" to "[XV][Animal] Compose CrossValidation Rules for Animal Data" Apr 7, 2026
@github-actions

github-actions Bot commented Apr 8, 2026

Contributor

Current Coverage: 99%

Mypy errors on animal_xvalidation branch: 1191
Mypy errors on dev branch: 1191
No difference in error counts

@github-actions

github-actions Bot commented Apr 8, 2026

Contributor

🚨 Please update the changelog. This PR cannot be merged until changelog.md is updated.
🚨 Unauthorized changes detected in protected files. Please remove these changes if they are not intended.

@github-actions

Contributor

Current Coverage: 99%

Mypy errors on animal_xvalidation branch: 1168
Mypy errors on dev branch: 1168
No difference in error counts

@github-actions

Contributor

🚨 Please update the changelog. This PR cannot be merged until changelog.md is updated.
🚨 Unauthorized changes detected in protected files. Please remove these changes if they are not intended.

@github-actions

Contributor

🚨 Unauthorized changes detected in protected files. Please remove these changes if they are not intended.

@github-actions

Contributor

Current Coverage: 99%

Mypy errors on animal_xvalidation branch: 1164
Mypy errors on dev branch: 1164
No difference in error counts

@ew3361zh ew3361zh left a comment

Collaborator

This looks good to me. I think the second approval should come from the SME side to ensure these rules are being enacted correctly according to the science. Code-wise, I think more defensive error protection is warranted, but it is not necessary for this PR at this point.

@allisterakun allisterakun requested a review from YijingGong June 2, 2026 21:02
Comment on lines +597 to +611
{
    "left_hand": {
        "aggregation": {
            "operation": "no_op",
            "operands": ["ovsynch_program_start_day"]
        }
    },
    "right_hand": {
        "aggregation": {
            "operation": "no_op",
            "operands": ["voluntary_waiting_period"]
        }
    },
    "relationship": "greater"
},

Collaborator Author

When cow_repro_method == 'TAI', DOUBLE CHECK ovsynch_program_start_day > voluntary_waiting_period?

Contributor

So, when cow_repro_method == 'TAI', the voluntary_waiting_period is not used. The VWP is used in ED and ED-TAI to indicate when the cow becomes eligible for breeding based on estrus detection. Since there is no estrus detection in TAI method, the timing is entirely driven by ovsynch_program_start_day

so I think we could put in a rule (if not already there): when cow_repro_method == 'TAI', VWP is set to 0... or technically, the rule as written here would also work (since 0 < ovsynch_program_start_day)...

sources --
reproduction.py --> execute_cow_tai_protocol
metadata properties: "Voluntary Waiting Period (days) -- When the cow's days in milk has reached this day, monitoring for estrus and subsequent breeding, if found, will begin. Used only in the ED and ED-TAI protocols. When TAI protocol is used, this value will be ignored, and it is recommended to set it to 0.",

Comment on lines +856 to +870
{
    "left_hand": {
        "aggregation": {
            "operation": "no_op",
            "operands": ["ovsynch_program_start_day"]
        }
    },
    "right_hand": {
        "aggregation": {
            "operation": "no_op",
            "operands": ["voluntary_waiting_period"]
        }
    },
    "relationship": "greater"
},

Collaborator Author

When cow_repro_method == 'ED-TAI', DOUBLE CHECK ovsynch_program_start_day > voluntary_waiting_period?

Contributor

no, this does not have to be the case. If ovsynch_program_start_day > voluntary_waiting_period, then it will have a period of estrus detection before beginning the hormone protocol. But if voluntary_waiting_period > ovsynch_program_start_day, then it will initiate the hormone protocol whenever the ovsynch_program_start_day is.
As I think about it, I'm not sure at that point how the protocol is actually different from TAI (since, if she doesn't conceive on OvSynch, she will go into a Resynch protocol).. Maybe it still allows estrus detection during the hormone protocol? I would have to dig deeper, and not the point right now...
As an example, Farm1 of the evaluation farms has ED-TAI, VWP =74, and ovsynch_program_start_day of 66 and runs without issue.
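
The Farm1 numbers above can be run against the rule as currently written to show the conflict. This is a minimal sketch: the function name is hypothetical, and only the three values quoted in this thread are used.

```python
def ovsynch_after_vwp(cfg: dict) -> bool:
    """The rule as written: ovsynch_program_start_day must exceed voluntary_waiting_period."""
    return cfg["ovsynch_program_start_day"] > cfg["voluntary_waiting_period"]


# Values quoted in the comment above for Farm1 of the evaluation farms.
farm1_like = {
    "cow_repro_method": "ED-TAI",
    "voluntary_waiting_period": 74,
    "ovsynch_program_start_day": 66,
}
rule_passes = ovsynch_after_vwp(farm1_like)
```

Since 66 is not greater than 74, the rule would reject a configuration that is reported to run without issue, which supports dropping this comparison for ED-TAI.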

Comment on lines +683 to +712
{
    "left_hand": {
        "aggregation": {
            "operation": "no_op",
            "operands": ["presynch_program"]
        }
    },
    "right_hand": {
        "aggregation": {
            "operation": "no_op",
            "operands": ["true_constant"]
        }
    },
    "relationship": "is_not_null"
},
{
    "left_hand": {
        "aggregation": {
            "operation": "no_op",
            "operands": ["presynch_program_start_day"]
        }
    },
    "right_hand": {
        "aggregation": {
            "operation": "no_op",
            "operands": ["zero"]
        }
    },
    "relationship": "greater"
},

Collaborator Author

When cow_repro_method == 'TAI' and resynch_program in ['PGFatPD', 'none'], DOUBLE CHECK if presynch_program and presynch_program_start_day both have to be set (not null, start day > 0)

Contributor

no, this is not the case. it is possible (and likely) to have both presynch and resynch programs, regardless of which resynch_program is being followed.

What IS relevant to those conditions:
When cow_repro_method == 'TAI' and resynch_program in ['PGFatPD', 'none'] --> "estrus_detection_rate" and "ED_conception_rate" must each be > 0 (because, even though the first breeding is using hormone protocol, the following breedings will be based on estrus detection under those resynch protocols)
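
The condition described above, with the presynch requirement dropped, could look like this in plain Python. It is a sketch of the suggested rule, not the JSON that would go into animal_cross_validation.json; the function name is hypothetical and the sample values are made up.

```python
def tai_resynch_ed_rates_ok(cfg: dict) -> bool:
    """For TAI with resynch_program in ['PGFatPD', 'none'], ED rates must be > 0."""
    applies = (cfg["cow_repro_method"] == "TAI"
               and cfg["resynch_program"] in ("PGFatPD", "none"))
    if not applies:
        # The rule only constrains configurations matching the condition.
        return True
    return cfg["estrus_detection_rate"] > 0 and cfg["ED_conception_rate"] > 0


# Illustrative values only.
tai_cfg = {
    "cow_repro_method": "TAI",
    "resynch_program": "PGFatPD",
    "estrus_detection_rate": 0.6,
    "ED_conception_rate": 0.0,
}
ok = tai_resynch_ed_rates_ok(tai_cfg)
```

Here the check fails because ED_conception_rate is 0 even though later breedings under this resynch protocol rely on estrus detection.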

@github-actions

github-actions Bot commented Jun 3, 2026

Contributor

Current Coverage: 99%

Mypy errors on animal_xvalidation branch: 1145
Mypy errors on dev branch: 1145
No difference in error counts

@github-actions

github-actions Bot commented Jun 3, 2026

Contributor

🚨 Unauthorized changes detected in protected files. Please remove these changes if they are not intended.

@allisterakun allisterakun linked an issue Jun 4, 2026 that may be closed by this pull request
Development

Successfully merging this pull request may close these issues.

mismatched heifer repro protocol enums
[XV] Compose CrossValidation Rules for Animal Data

4 participants