feat(prepro): make tests more readable#6790
Conversation
| input_fields, | ||
| args, | ||
| ) | ||
| _CONCATENATE_CASES = [ |
There was a problem hiding this comment.
| _CONCATENATE_CASES = [ | |
| concatenate_case_definitions = [ |
Maybe this? To be more consistent with test_case_definitions etc. in the same file
| assert res.datum == "DENV-1/Switzerland/mySample/2025" | ||
| assert res_insdc.datum == "DENV-1/Switzerland/mySample/2025" | ||
| assert res_prefix.datum == "hYF/Switzerland/mySample/2025" | ||
| _DISPLAY_NAME_CASES = [ |
There was a problem hiding this comment.
| _DISPLAY_NAME_CASES = [ | |
| display_name_case_definitions = [ |
| "submissionId", | ||
| "sampleCollectionDate", | ||
| ] | ||
| _DISPLAY_NAME_REGEX = r"^[^\/][^/]*/[^/]+/(?P<identifier>[^/]+)/\d{4}(?:-\d{2}){0,2}$" |
There was a problem hiding this comment.
Not blocking, but idk why claude likes this _UPPERCASE_WITH_LEADING_UNDERSCORE so much. I'm not the biggest fan and it's also inconsistent with other variable definitions in the same file e.g.,
Could we refactor these?
There was a problem hiding this comment.
tbf capitalising constants is actually pretty common and considered good practice I think, the leading underscore is weird though, I will remove it
There was a problem hiding this comment.
Yeah fair, I don't mind the capitalization either really it's mostly the leading underscore
maverbiest
left a comment
There was a problem hiding this comment.
Thanks for this, these tests had really turned into a mess.
I left some small comments but overall I like this approach much more!
| if identifier is None: | ||
| identifier = parse_identifier_string(submission_id, insdc_ingested, pattern) | ||
| if insdc_ingested: | ||
| # For INSDC ingested sequences the submissionId is the INSDC accession |
There was a problem hiding this comment.
@maverbiest I realized now that the logic of the PR should maybe be different? I thought we did not want to fallback to the INSDC accession for ingested sequences
There was a problem hiding this comment.
Oh hmm I don't remember exactly what we decided on this. I'll check on Slack to see what people's thoughts are
| None if has_display_name_separators(str(collector_id)) else str(collector_id) | ||
| ) | ||
| else: | ||
| identifier = parse_identifier_string(str(collector_id), str(regex_pattern)) |
There was a problem hiding this comment.
This is why the unit test is failing, I think: if regex_pattern is None, str(None) -> "None"
We need to narrow to str | None from ArgValue and then check for None before passing
There was a problem hiding this comment.
(might be wrong though, need to take a closer look)
maverbiest
left a comment
There was a problem hiding this comment.
Since we're now anyway doing a big refactor of the prepro code in #6537, could we maybe just keep the changes to tests in this PR?
We can then see how we want to update to logic after rebasing the display-name-fallback branch
|
@maverbiest, yes! you can cherry-pick the test changes to your PR if you would like :-) |
While reviewing #6774 I had a lot of trouble reading the tests, I think having clear test cases with descriptive names of what the test is checking makes this a lot easier to parse (or grok as @corneliusroemer would say). I got claude to reformat these tests like the other prepro test
Caseobjects.Screenshot
PR Checklist
🚀 Preview: Add
previewlabel to enable