chore: address CodeRabbit follow-up findings#107
Conversation
- docs/use-cases.md: replace non-portable `grep -oP` with a portable `sed -E` extraction so the per-chromosome ENSEMBL example works on systems without GNU grep (e.g. default macOS). - gencode_downloader: also catch `subprocess.CalledProcessError` when running gffread (matches ensembl_downloader and ncbi_downloader). - ensembl_downloader: when no genome matches the GTF prefix, emit a warning naming the missing prefix and the fallback assembly before using it — previously a GRCh38 GTF could silently pair with a GRCh37 genome (or vice versa) with no indication to the user. - mini_clinvar.vcf: document that rs00002 and rs00004 are intentionally at the same 1:69550 G>A position so the two filtering paths (CLNSIG-based and MC-based) are independently exercised. Skipped: CodeRabbit also flagged the `workers > 1` guard in cbioportal_to_proteindb. The click option defaults to 1 and the service also defaults to 1 when WORKERS is unset, so `--workers 1` and "no flag" produce identical behaviour; the guard is not a user-visible bug.
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more → On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary
Acts on the actionable findings from a CodeRabbit
--agentreview of the branch. Each item was verified against current code before fixing.docs/use-cases.md:262(major) — replace non-portablegrep -oP(PCRE lookbehind, GNU-only) withsed -Eso the per-chromosome ENSEMBL example works on default macOS / BSD grep. Also handles the.vcf.gzvariant in one pass.pgatk/commands/gencode_downloader.py:69(minor) — also catchsubprocess.CalledProcessErrorfrom gffread. Matches whatensembl_downloader.py:139andncbi_downloader.pyalready do; previously a gffread failure during--generate-transcriptswould bubble up as an uncaught traceback.pgatk/commands/ensembl_downloader.py:130(minor) — when no genome matches a GTF's prefix, the loop silently fell back tonext(iter(genome_map.values())). Now warns with the missing prefix and the fallback assembly name so users can spot a GRCh38-GTF / GRCh37-genome mismatch.pgatk/testdata/clinvar/mini_clinvar.vcf:8-10(minor) — add a header comment documenting that the two records at1:69550 G>Aare intentional: rs00002 (Benign, missense) exercises CLNSIG filtering and rs00004 (Pathogenic, synonymous) exercises consequence filtering. Both assertions live intest_clinvar_service.py/test_clinvar_integration.py.Skipped
pgatk/commands/cbioportal_to_proteindb.py:99— CodeRabbit suggested changingif workers > 1toif workers is not None. The click option defaults to1andCancerGenomesServicealso defaults to 1 whenWORKERSis unset (cgenomes_proteindb.py:411), so--workers 1and "no flag" produce identical behaviour. The guard is not a user-visible bug.Test plan
python -m pytest pgatk/tests/ -q— 315 passedpython -m unittest discover -s pgatk/tests -p "*_tests.py"— 20/20 passed