Correct docs/code inconsistencies and emit N_NO_COVERAGE in query output#32
Merged
Conversation
A consistency review against the source code surfaced several technical claims that no longer matched behaviour. The most significant: FILTER!=PASS samples were described as excluded from both AC and AN, when the code keeps them in the eligible set so they count toward AN (only AC excludes them). Also corrected: - genotype-accounting invariant now includes N_NO_COVERAGE everywhere - fail_bitmap definition (any FILTER!=PASS call, including ./. at failed sites), not "non-ref genotype AND FILTER!=PASS" - hemizygous males counted in N_HOM_ALT, not N_HET - Parquet schema lists all five bitmap columns; bucket files are bucket_N.parquet, not bucket_N/data.parquet - bucket-id SQL uses integer division (//), not float division - manifest.json field table; ingest temp store is Parquet, not SQLite - N_NO_COVERAGE added to annotation INFO field lists
The query command computed N_NO_COVERAGE but never printed it, so text, TSV and JSON output stopped at N_FAIL while annotate, dump and the Python API all exposed the field. Add it to all three formats and to the annotate help text.
Welcome to Codecov 🎉Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests. Thanks for integrating Codecov - We've got you covered ☂️ |
QueryResult.N_FAIL is a plain int defaulting to 0, never None, so the is-not-None guards in _print_results were always-true dead branches. N_FAIL now prints unconditionally and symmetrically with N_NO_COVERAGE across text/tsv/json. Adds a regression test pinning that both fields appear in all three formats and that TSV columns stay aligned.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
A consistency review of the documentation against the source code (treating the code as the source of truth) surfaced a set of technical claims that no longer matched behaviour, plus one genuine CLI gap.
Documentation corrections (21 files)
filter-pass-tracking.mdand echoed acrossglossary.md,faq.md,debugging-results.md,acmg-use-cases.md,create-database.md,understanding-output.md,python-api.md.N_NO_COVERAGEwas missing from the identity inploidy-and-sex-chroms.mdandtutorial.md.fail_bitmapdefinition — it tracks anyFILTER!=PASScall (including./.at failed sites), not "non-ref genotype AND FILTER!=PASS".N_HOM_ALT, notN_HET(sex-specific-af.mdhad this backwards, including example outputs).bucket_N.parquet(notbucket_N/data.parquet); bucket-id SQL uses integer division//; manifest.json field table corrected; ingest temp store is Parquet, not SQLite.N_NO_COVERAGEadded to annotation INFO field lists where it was missing.CLI fix
afquery querycomputedN_NO_COVERAGEbut never printed it — text, TSV and JSON stopped atN_FAIL, whileannotate,dumpand the Python API all exposed the field. Added to all three formats and to theannotatehelp text.Test plan
mkdocs buildpasses, all cross-links resolveN_NO_COVERAGEappears in text/TSV/JSON query output with matching TSV column counts