Remove confirmed-dead scripts, defaults, and config keys #1208
joverlee521
left a comment
Checked a couple of the files that I've seen referenced elsewhere, but should all be good to remove from this repo.
This file was originally added as a human/machine-readable file for reference in #817, but was never intended to be used in the workflow.
Would it be worth keeping as a reference? It's currently used in ncov-clades-schema, but that repo also has not been run/updated in a while.
Oh! Good catch. I thought this was stale, but we need a source of truth for which clades are parents to which clades. It's this file. I've restored it in a rebased cf49012.
Noting the example data is still referenced in the ncov-tutorial, but that folder also has a warning that it is just a public archive and not guaranteed to work.
Should be okay to delete 👍
Originally added in 9c48778 as an orphan script. A copy of it is referenced in a fork (from nextstrain/auspice#1036 (comment)), but should be good to delete from this repo.
First pass of a pandemic-era cruft cleanup.

Removes code and data that is referenced by nothing in the workflow, the active profiles, the docs, or the tests (verified by grep plus a dry-run of the CI profile showing an unchanged 37-job DAG). No behavior change to any build.

Removed:
- Orphan scripts: scripts/add_labels.py, scripts/generate-scientific-credits.py, and the explicitly-deprecated scripts/deprecated/ (calculate_delta_frequency.py, parse_mutational_fitness_tsv_into_distance_map.py).
- Unused defaults: defaults/distance_maps/VoC.json (only S1.json is used) and defaults/clades_who.tsv.
- Dead config key files.outgroup (defaults/parameters.yaml): it pointed at a file that does not exist in the repo and was read nowhere; its config-reference entry (already documented "No longer used") is removed too.
- Deprecated my_profiles/ directory (only a deprecation README) and its now-orphan .gitignore exception.
- Unused committed example data: data/example_*_worldwide.*, data/example_*_aus.*, data/example_multiple_inputs.tar.xz (the CI-used example_metadata.tsv / example_sequences.fasta.gz are kept).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
First in a series of small, independently-reviewable PRs cleaning up pandemic-era cruft (see the cleanup plan; scope: dead code + proximity removal + docs/schemes).
Motivation
The repo accumulated scripts, default data files, and config keys that nothing references anymore. This PR removes the verified-dead ones. The three live contracts — weekly OPEN build, occasional GISAID builds, and external users running their own builds — are unaffected.
What's removed
Each item below is referenced by nothing in the searched paths (workflow/, Snakefile, nextstrain_profiles/, docs/, tests/):
- Orphan scripts: scripts/add_labels.py, scripts/generate-scientific-credits.py, and the explicitly-deprecated scripts/deprecated/ (calculate_delta_frequency.py, parse_mutational_fitness_tsv_into_distance_map.py).
- Unused defaults: defaults/distance_maps/VoC.json (only S1.json is used by the distances rule), defaults/clade_hierarchy.tsv, defaults/clades_who.tsv.
- Dead config key: files.outgroup in defaults/parameters.yaml, which pointed at a file that doesn't exist in the repo and was read nowhere; its config-reference entry (already documented "No longer used") is removed too.
- Deprecated: my_profiles/ directory (only a deprecation README) and its now-orphan .gitignore exception.
- Unused committed example data: data/example_*_worldwide.*, data/example_*_aus.*, data/example_multiple_inputs.tar.xz. The CI-used example_metadata.tsv / example_sequences.fasta.gz are kept.

Verification
- Grep across workflow/, Snakefile, nextstrain_profiles/, docs/, and tests/ (accounting for config-key indirection, which hides references behind config["files"][...]).
- snakemake --profile nextstrain_profiles/nextstrain-ci -n builds an unchanged 37-job DAG with no errors.

Test plan
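For reviewers who want to repeat the reference check described under Verification, here is a minimal Python sketch. It is an illustration only: find_references is a hypothetical helper, not a script in this repo and not necessarily how the original grep was run. It assumes the five search locations named above, and it treats a bracketed lookup such as ["outgroup"] as a possible config-key reference.

```python
import re
from pathlib import Path

# Locations named in the PR description; adjust if the repo layout differs.
SEARCH_PATHS = ("workflow", "Snakefile", "nextstrain_profiles", "docs", "tests")


def find_references(root, candidate_path, config_key=None, search=SEARCH_PATHS):
    """Return sorted repo-relative paths of files that mention the candidate.

    Hypothetical helper for illustration. A file counts as a reference if it
    contains the candidate's full path, its basename, or (when config_key is
    given) a bracketed lookup like ["<config_key>"], which is how a path can
    hide behind config["files"][...].
    """
    root = Path(root)
    patterns = [re.escape(candidate_path), re.escape(Path(candidate_path).name)]
    if config_key:
        # Matches ["key"] or ['key'], with optional whitespace inside brackets.
        patterns.append(r"""\[\s*["']%s["']\s*\]""" % re.escape(config_key))
    regex = re.compile("|".join(patterns))

    hits = []
    for entry in search:
        target = root / entry
        if target.is_file():
            files = [target]
        elif target.is_dir():
            files = [p for p in target.rglob("*") if p.is_file()]
        else:
            files = []  # location absent in this checkout; nothing to scan
        for path in files:
            try:
                text = path.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files
            if regex.search(text):
                hits.append(path.relative_to(root).as_posix())
    return sorted(hits)
```

Calling find_references(".", "scripts/add_labels.py") from the repo root should return an empty list if the script is truly orphaned. Basename matching can produce false positives, which is the safer direction for a deletion check: every hit still needs a manual look.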
🤖 Generated with Claude Code