[XV][Animal] Compose CrossValidation Rules for Animal Data#2935
allisterakun wants to merge 69 commits into
|
Current Coverage: 99% Mypy errors on animal_xvalidation branch: 1191 |
|
🚨 Please update the changelog. This PR cannot be merged until |
|
Current Coverage: 99% Mypy errors on animal_xvalidation branch: 1168 |
|
🚨 Please update the changelog. This PR cannot be merged until |
|
🚨 Unauthorized changes detected in protected files. Please remove these changes if they are not intended. |
|
Current Coverage: 99% Mypy errors on animal_xvalidation branch: 1164 |
|
🚨 Unauthorized changes detected in protected files. Please remove these changes if they are not intended. |
ew3361zh
left a comment
This looks good to me. I think the second approval should come from the SME side to ensure these rules are enacted correctly according to the science. Code-wise, I think more defensive error protection is warranted, but it is not necessary for this PR at this point.
{
  "left_hand": {
    "aggregation": {
      "operation": "no_op",
      "operands": ["ovsynch_program_start_day"]
    }
  },
  "right_hand": {
    "aggregation": {
      "operation": "no_op",
      "operands": ["voluntary_waiting_period"]
    }
  },
  "relationship": "greater"
},
When cow_repro_method == 'TAI': double-check whether ovsynch_program_start_day > voluntary_waiting_period is actually required.
When cow_repro_method == 'TAI', the voluntary_waiting_period is not used. The VWP is used in ED and ED-TAI to indicate when the cow becomes eligible for breeding based on estrus detection. Since there is no estrus detection in the TAI method, the timing is driven entirely by ovsynch_program_start_day.
So I think we could add a rule (if not already there): when cow_repro_method == 'TAI', VWP is set to 0. Technically, the rule as written here would also work in that case (since 0 < ovsynch_program_start_day).
Sources:
reproduction.py --> execute_cow_tai_protocol
metadata properties: "Voluntary Waiting Period (days) -- When the cow's days in milk has reached this day, monitoring for estrus and subsequent breeding, if found, will begin. Used only in the ED and ED-TAI protocols. When TAI protocol is used, this value will be ignored, and it is recommended to set it to 0.",
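To make the point above concrete, here is a minimal sketch of how a rule shaped like the snippet under review could be evaluated. The `resolve_side` and `evaluate_rule` helpers and the flat `inputs` dict are assumptions for illustration only; this is not the CrossValidator code in RUFAS/data_validator.py, and it handles only the `no_op` aggregation and the `greater` relationship. The start day of 66 is a made-up value.

```python
from typing import Any

# Rule copied from the snippet under review.
RULE = {
    "left_hand": {
        "aggregation": {"operation": "no_op", "operands": ["ovsynch_program_start_day"]}
    },
    "right_hand": {
        "aggregation": {"operation": "no_op", "operands": ["voluntary_waiting_period"]}
    },
    "relationship": "greater",
}


def resolve_side(side: dict[str, Any], inputs: dict[str, Any]) -> Any:
    """Hypothetical helper: resolve one side of a rule to a single input value."""
    aggregation = side["aggregation"]
    if aggregation["operation"] != "no_op":
        raise ValueError(f"unsupported operation: {aggregation['operation']}")
    (operand,) = aggregation["operands"]  # this sketch expects exactly one operand
    return inputs[operand]


def evaluate_rule(rule: dict[str, Any], inputs: dict[str, Any]) -> bool:
    """Hypothetical helper: return True when the rule holds for the given inputs."""
    left = resolve_side(rule["left_hand"], inputs)
    right = resolve_side(rule["right_hand"], inputs)
    if rule["relationship"] == "greater":
        return left > right
    raise ValueError(f"unsupported relationship: {rule['relationship']}")


# TAI herd with VWP set to 0, as the metadata description recommends.
tai_inputs = {"ovsynch_program_start_day": 66, "voluntary_waiting_period": 0}
tai_rule_holds = evaluate_rule(RULE, tai_inputs)
```

With VWP at 0, any positive ovsynch_program_start_day satisfies the rule, which is why the rule as written is harmless for TAI as long as the recommended VWP is used.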
{
  "left_hand": {
    "aggregation": {
      "operation": "no_op",
      "operands": ["ovsynch_program_start_day"]
    }
  },
  "right_hand": {
    "aggregation": {
      "operation": "no_op",
      "operands": ["voluntary_waiting_period"]
    }
  },
  "relationship": "greater"
},
When cow_repro_method == 'ED-TAI': double-check whether ovsynch_program_start_day > voluntary_waiting_period is actually required.
No, this does not have to be the case. If ovsynch_program_start_day > voluntary_waiting_period, there will be a period of estrus detection before the hormone protocol begins. But if voluntary_waiting_period > ovsynch_program_start_day, the hormone protocol is initiated on ovsynch_program_start_day.
As I think about it, I'm not sure how the protocol then actually differs from TAI (since, if she doesn't conceive on OvSynch, she will go into a Resynch protocol). Maybe it still allows estrus detection during the hormone protocol? I would have to dig deeper, and that is not the point right now.
As an example, Farm1 of the evaluation farms has ED-TAI, VWP = 74, and an ovsynch_program_start_day of 66, and it runs without issue.
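The timing described above can be illustrated with a small sketch. The function name is hypothetical, and the logic is only a reading of this comment (an estrus-detection window exists only when the OvSynch start day falls after the VWP); it is not taken from RuFaS's reproduction code.

```python
def estrus_detection_days_before_ovsynch(
    voluntary_waiting_period: int, ovsynch_program_start_day: int
) -> int:
    """Days of estrus detection between the end of the VWP and the OvSynch start.

    Returns 0 when the hormone protocol starts on or before the end of the VWP.
    """
    return max(0, ovsynch_program_start_day - voluntary_waiting_period)


# Farm1 values quoted in the comment: VWP = 74, ovsynch_program_start_day = 66.
farm1_window = estrus_detection_days_before_ovsynch(74, 66)

# The rule as proposed (ovsynch_program_start_day > voluntary_waiting_period)
# would reject this farm even though it runs without issue.
farm1_passes_proposed_rule = 66 > 74
```

For Farm1 the window is zero days and the proposed "greater" rule would fail, which supports dropping that comparison for ED-TAI.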
{
  "left_hand": {
    "aggregation": {
      "operation": "no_op",
      "operands": ["presynch_program"]
    }
  },
  "right_hand": {
    "aggregation": {
      "operation": "no_op",
      "operands": ["true_constant"]
    }
  },
  "relationship": "is_not_null"
},
{
  "left_hand": {
    "aggregation": {
      "operation": "no_op",
      "operands": ["presynch_program_start_day"]
    }
  },
  "right_hand": {
    "aggregation": {
      "operation": "no_op",
      "operands": ["zero"]
    }
  },
  "relationship": "greater"
},
When cow_repro_method == 'TAI' and resynch_program in ['PGFatPD', 'none']: double-check whether presynch_program and presynch_program_start_day both have to be set (not null, start day > 0).
No, this is not the case. It is possible (and likely) to have both presynch and resynch programs, regardless of which resynch_program is being followed.
What IS relevant to those conditions:
When cow_repro_method == 'TAI' and resynch_program in ['PGFatPD', 'none'], estrus_detection_rate and ED_conception_rate must each be > 0 (because, even though the first breeding uses the hormone protocol, subsequent breedings are based on estrus detection under those resynch protocols).
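The condition described in this reply can be sketched as a plain Python check. The function name and the flat `repro` dict are assumptions for illustration; the real rule would be expressed in the cross-validation JSON with an apply_when block, whose exact shape is not shown in this thread. The rate values below are made up.

```python
from typing import Any

# Resynch programs under which follow-up breedings rely on estrus detection.
ED_BASED_RESYNCH_PROGRAMS = ("PGFatPD", "none")


def check_tai_resynch_rates(repro: dict[str, Any]) -> list[str]:
    """Hypothetical check: return violation messages, or an empty list if none."""
    if repro.get("cow_repro_method") != "TAI":
        return []  # the condition does not apply
    if repro.get("resynch_program") not in ED_BASED_RESYNCH_PROGRAMS:
        return []  # the condition does not apply
    violations = []
    for field in ("estrus_detection_rate", "ED_conception_rate"):
        value = repro.get(field)
        if value is None or value <= 0:
            violations.append(f"{field} must be > 0 (got {value!r})")
    return violations


valid = check_tai_resynch_rates({
    "cow_repro_method": "TAI",
    "resynch_program": "PGFatPD",
    "estrus_detection_rate": 0.6,
    "ED_conception_rate": 0.5,
})
invalid = check_tai_resynch_rates({
    "cow_repro_method": "TAI",
    "resynch_program": "none",
    "estrus_detection_rate": 0.0,
    "ED_conception_rate": 0.5,
})
```

Note that the check deliberately says nothing about presynch_program or presynch_program_start_day, since the reply states those do not have to be set under this condition.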
|
Current Coverage: 99% Mypy errors on animal_xvalidation branch: 1145 |
|
🚨 Unauthorized changes detected in protected files. Please remove these changes if they are not intended. |
Adds 27 cross-validation rules for the Animal module, covering herd configuration, reproduction settings, body weight, culling, and methane mitigation. Also extends CrossValidator with two new relationship operators (is_not_null, is_in) and a field_to_save option for for_each blocks, and removes SynchED_CP / SynchED_2P from HeiferTAISubProtocol (those values belong only to HeiferSynchEDSubProtocol).
Context
Closes #2888 and #2976.
What
27 animal cross-validation rules added in input/metadata/cross_validation/animal_cross_validation.json:
Herd configuration
1. Sum of parity fractions (parity_fractions.1–.5) must equal 1.0.
2. avg_gestation_len must exceed days_in_preg_when_dry (dry-off must occur before calving).
3. heifer_repro_cull_time must exceed breeding_start_day_h (heifer culling window opens after breeding starts).
4. do_not_breed_time must exceed voluntary_waiting_period.
5. breeding_start_day_h must exceed wean_day (weaning before heifer breeding begins).
6. prefresh_day must be shorter than both avg_gestation_len and the dry period duration (avg_gestation_len - days_in_preg_when_dry).
7. calving_interval must exceed voluntary_waiting_period + avg_gestation_len.
Reproduction sub-protocol consistency
8. (When heifer repro method is ED) repro_sub_protocol must be "NA".
9. (When heifer repro method is TAI) repro_sub_protocol must be one of ["5dCG2P", "5dCGP"].
10. (When heifer repro method is SynchED) repro_sub_protocol must be one of ["2P", "CP"].
11. (When cow repro method is ED) voluntary_waiting_period must be positive; presynch_program, presynch_program_start_day, and ovsynch_program_start_day must be null; estrus_detection_rate and ED_conception_rate must be positive.
12. (When cow repro method is TAI) ovsynch_program must be set; [DOUBLE CHECK IF OVSYNCH PROGRAM START DAY HAS TO BE > VWP] ovsynch_program_start_day must be positive and exceed voluntary_waiting_period; ovsynch_program_conception_rate must be positive.
13. (When cow repro method is TAI and resynch_program is in ["PGFatPD", "none"]) [DOUBLE CHECK IF PRESYNCH PROGRAM AND START DAY HAVE TO BE SET] presynch_program must be set; presynch_program_start_day must be positive; estrus_detection_rate and ED_conception_rate must be positive.
14. (When cow repro method is ED-TAI) voluntary_waiting_period, estrus_detection_rate, ED_conception_rate, ovsynch_program_start_day, and ovsynch_program_conception_rate must be positive; ovsynch_program must be set; [DOUBLE CHECK IF OVSYNCH_PROGRAM_START_DAY > VWP?] ovsynch_program_start_day must exceed voluntary_waiting_period.
15. (When presynch_program and ovsynch_program are both not null) ovsynch_program_start_day must exceed presynch_program_start_day.
Body weight
16. birth_weight_std_ho must be less than 20% of birth_weight_avg_ho.
17. birth_weight_std_ho must be less than 20% of birth_weight_avg_ho (duplicate of rule 16; flagged for review).
18. target_heifer_preg_day must exceed breeding_start_day_h.
19. birth_weight_avg_ho must be less than mature_body_weight_avg.
Pregnancy checks & culling
20. Pregnancy check days must be strictly ascending (preg_check_day_1 < preg_check_day_2 < preg_check_day_3), and preg_check_day_3 must precede both calving (avg_gestation_len) and dry-off (days_in_preg_when_dry).
21. Sum of all cull-reason probabilities (feet_leg, injury, mastitis, disease, udder, unknown) must equal 1.0.
22. All cull_day_prob arrays and death_day_prob must have the same length as cull_day_count.
Methane mitigation
23. (When method is None) methane_mitigation_additive_amount must be 0.
24. (When method is 3-NOP) methane_mitigation_additive_amount must equal 3-NOP_additive_amount.
25. (When method is monensin) methane_mitigation_additive_amount must equal monensin_additive_amount.
26. (When method is essential_oils) methane_mitigation_additive_amount must equal essential_oils_additive_amount.
27. (When method is seaweed_additive) methane_mitigation_additive_amount must equal seaweed_additive_amount.
Deferred rules (not implemented due to input structure complexity):
The following rules were identified during review but could not be expressed with the current cross-validation schema. They are tracked here for future implementation.
- Pen manure processor type constraint (freestall/tiestall): When pen_type is "freestall" or "tiestall", the processor type of any first processor may only be "Handler" or "DailySpread".
- SingleStreamHandler requirement (freestall/tiestall): When pen_type is "freestall" or "tiestall", at least one first processor must be of type SingleStreamHandler.
- OpenLot processor requirement: When pen_type is "openlot", at least one first processor must be an OpenLot processor type.
- BeddedPack processor requirement: When pen_type is "bedded pack", at least one first processor must be a BeddedPack processor type.
- Lactating cow milking fields non-null: For LAC_COW pens, minutes_away_for_milking, first_parlor_processor, and parlor_stream_name must not be null.
- Stream proportion sum: The sum of all stream_proportion values within a pen must equal 1.0.
- Stall lower-bound checks (CLOSE_UP and LAC_COW): Extending the existing lower-bound stall pattern already applied to CALF and GROWING pens. For CLOSE_UP and LAC_COW pens, the total stall count multiplied by max_stocking_density must accommodate the computed pen population (dry cows + springers for CLOSE_UP; lactating cows for LAC_COW), with both an upper and lower stocking density bound enforced.
  Detailed Notes
  {
    "1. Too few stalls (lower bound) — extends existing pattern": "",
    "CALF and GROWING pens already have lower-bound rules.": "",
    "CLOSE_UP": [
      "Dry cow num = cow_num × (1 − milking_cow_fraction)",
      "Compute total_close_up = sum(heiferIII_num_springers, dry_cows) → save_as: total_close_up",
      "Actual check: sum(CLOSE_UP stalls) * 1.5 ≥ total_close_up / average(max_stocking_density)",
      "CLOSE_UP stalls * 0.5 <= total_close_up / average(max_stocking_density)"
    ],
    "LAC_COW": [
      "same process as CLOSE_UP"
    ]
  }
- Bedding config reverse lookup: All bedding names referenced in pen configs must have a corresponding entry in the bedding configuration.
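The CLOSE_UP arithmetic from the detailed notes on the deferred stall check can be sketched as follows. This is only a transcription of those notes into Python under stated assumptions: the function name, the per-pen dict layout, and the "stalls" key are hypothetical, and the numbers in the example are made up. The 1.5 and 0.5 multipliers are the ones given in the notes.

```python
from statistics import mean


def close_up_stall_bounds(
    cow_num: int,
    milking_cow_fraction: float,
    heiferIII_num_springers: int,
    close_up_pens: list[dict[str, float]],
) -> tuple[bool, bool]:
    """Return (enough_stalls, not_too_many_stalls) for the CLOSE_UP pens."""
    # Dry cow num = cow_num x (1 - milking_cow_fraction)
    dry_cows = cow_num * (1 - milking_cow_fraction)
    # total_close_up = sum(heiferIII_num_springers, dry_cows)
    total_close_up = heiferIII_num_springers + dry_cows
    # "stalls" is a hypothetical key for the stall count of each pen.
    total_stalls = sum(pen["stalls"] for pen in close_up_pens)
    required = total_close_up / mean(pen["max_stocking_density"] for pen in close_up_pens)
    # sum(CLOSE_UP stalls) * 1.5 >= total_close_up / average(max_stocking_density)
    enough_stalls = total_stalls * 1.5 >= required
    # CLOSE_UP stalls * 0.5 <= total_close_up / average(max_stocking_density)
    not_too_many_stalls = total_stalls * 0.5 <= required
    return enough_stalls, not_too_many_stalls


# 100 cows, 85% milking, 5 springers: roughly 20 animals in close-up pens.
balanced = close_up_stall_bounds(100, 0.85, 5, [{"stalls": 20, "max_stocking_density": 1.0}])
too_few = close_up_stall_bounds(100, 0.85, 5, [{"stalls": 5, "max_stocking_density": 1.0}])
too_many = close_up_stall_bounds(100, 0.85, 5, [{"stalls": 100, "max_stocking_density": 1.0}])
```

According to the notes, LAC_COW would follow the same process with the lactating cow count in place of dry cows plus springers.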
CrossValidator extensions (RUFAS/data_validator.py):
- is_not_null relationship operator.
- is_in relationship operator (checks left value is a member of the right-hand list); used by conditional apply_when blocks in rules 4, 8–10, 13.
- field_to_save option in for_each blocks: when set, returns a flat list of a single field extracted from each matched entry instead of full dicts.
- Removed the multi-operand restriction on no_op (was blocking valid single-operand patterns that share a block structure).
Enum cleanup (RUFAS/biophysical/animal/data_types/repro_protocol_enums.py):
- Removed SynchED_CP and SynchED_2P from HeiferTAISubProtocol; these values belong exclusively to HeiferSynchEDSubProtocol.
Why
Closes #2888 — the Animal module lacked cross-validation rules to catch logically inconsistent input configurations before simulation begins.
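As an illustration of the kind of inconsistency these rules are meant to catch, here is a minimal sketch of three of the herd checks written as plain Python. It is not how the rules are implemented (they live in the cross-validation JSON and are evaluated by CrossValidator); the function name, the flat config dict, the assumption that parity_fractions is keyed by "1" through "5", and all values are hypothetical.

```python
import math
from typing import Any


def check_herd_config(cfg: dict[str, Any]) -> list[str]:
    """Hypothetical check covering three of the herd rules; returns error messages."""
    errors = []

    # Parity fractions 1 to 5 must sum to 1.0 (tolerance guards against float error).
    fractions = [cfg["parity_fractions"][str(i)] for i in range(1, 6)]
    if not math.isclose(sum(fractions), 1.0, abs_tol=1e-9):
        errors.append("parity fractions must sum to 1.0")

    # Dry-off must occur before calving.
    if not cfg["avg_gestation_len"] > cfg["days_in_preg_when_dry"]:
        errors.append("avg_gestation_len must exceed days_in_preg_when_dry")

    # Pregnancy check days must be strictly ascending.
    if not cfg["preg_check_day_1"] < cfg["preg_check_day_2"] < cfg["preg_check_day_3"]:
        errors.append("pregnancy check days must be strictly ascending")

    return errors


good_config = {
    "parity_fractions": {"1": 0.35, "2": 0.25, "3": 0.2, "4": 0.1, "5": 0.1},
    "avg_gestation_len": 276,
    "days_in_preg_when_dry": 218,
    "preg_check_day_1": 32,
    "preg_check_day_2": 60,
    "preg_check_day_3": 200,
}
# Same config, but dry-off is scheduled after the average gestation length.
bad_config = {**good_config, "days_in_preg_when_dry": 280}

good_errors = check_herd_config(good_config)
bad_errors = check_herd_config(bad_config)
```

Running checks like these before the simulation starts means a configuration such as `bad_config` is reported up front rather than producing a confusing failure mid-run.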
How
- Added animal_cross_validation.json with 27 rules covering all major animal input sections (herd info, management decisions, farm-level repro, body weight, culling, and methane mitigation).
- Extended CrossValidator with is_not_null and is_in operators needed to express the conditional (apply_when) rules, and field_to_save to support rules that filter array entries and extract a single field.
- Updated HeiferTAISubProtocol to remove the two members that were misplaced there (SynchED_CP, SynchED_2P).
Test plan
- [...] (test_data_validator).
- [...] test_data_validator tests asserting the now-removed multi-operand no_op restriction.
- [...] HeiferTAISubProtocol members.
- [...] CrossValidator functionality: test_evaluate_is_not_null (5 parametrized cases) and test_evaluate_condition_is_not_null_branch.
Input Changes
- Added input/metadata/cross_validation/animal_cross_validation.json with 27 cross-validation rules for the Animal module.
Output Changes