Remove confirmed-dead scripts, defaults, and config keys #1208

Merged
trvrb merged 2 commits into master from cleanup-dead-code on Jun 30, 2026

@trvrb trvrb commented Jun 27, 2026

Member

First in a series of small, independently-reviewable PRs cleaning up pandemic-era cruft (see the cleanup plan; scope: dead code + proximity removal + docs/schemes).

Motivation

The repo accumulated scripts, default data files, and config keys that nothing references anymore. This PR removes the verified-dead ones. The three live contracts — weekly OPEN build, occasional GISAID builds, and external users running their own builds — are unaffected.

What's removed

  • Orphan scripts (0 refs in workflow/ Snakefile nextstrain_profiles/ docs/ tests/): scripts/add_labels.py, scripts/generate-scientific-credits.py, and the explicitly-deprecated scripts/deprecated/ (calculate_delta_frequency.py, parse_mutational_fitness_tsv_into_distance_map.py).
  • Unused defaults: defaults/distance_maps/VoC.json (only S1.json is used by the distances rule), defaults/clade_hierarchy.tsv, defaults/clades_who.tsv.
  • Dead config key files.outgroup in defaults/parameters.yaml — pointed at a file that doesn't exist in the repo and was read nowhere; its config-reference entry (already documented "No longer used") is removed too.
  • Deprecated my_profiles/ directory (only a deprecation README) and its now-orphan .gitignore exception.
  • Unused committed example data: data/example_*_worldwide.*, data/example_*_aus.*, data/example_multiple_inputs.tar.xz. The CI-used example_metadata.tsv / example_sequences.fasta.gz are kept.

Verification

  • Every removed path was confirmed to have zero references across workflow/, Snakefile, nextstrain_profiles/, docs/, and tests/ (accounting for config-key indirection, which hides references behind config["files"][...]).
  • snakemake --profile nextstrain_profiles/nextstrain-ci -n builds an unchanged 37-job DAG with no errors.
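
The reference check described above can be sketched roughly as follows. This is a minimal illustration and not the script actually used for this PR: the function names (`iter_files`, `find_references`, `is_dead`) are hypothetical, and matching is a plain substring search on the file name plus, optionally, the config key that may hide a reference behind `config["files"][...]`.

```python
from pathlib import Path
from typing import Optional

# Search roots named in the PR description; a root may be a directory or a file.
SEARCH_ROOTS = ["workflow", "Snakefile", "nextstrain_profiles", "docs", "tests"]


def iter_files(repo: Path, roots=SEARCH_ROOTS):
    """Yield every regular file under the given roots, skipping missing roots."""
    for root in roots:
        path = repo / root
        if path.is_file():
            yield path
        elif path.is_dir():
            yield from (p for p in path.rglob("*") if p.is_file())


def find_references(repo: Path, needle: str, roots=SEARCH_ROOTS):
    """Return (relative path, line number) pairs for lines containing `needle`."""
    hits = []
    for path in iter_files(repo, roots):
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # not valid UTF-8 text, so treat as binary and skip
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((path.relative_to(repo).as_posix(), lineno))
    return hits


def is_dead(repo: Path, removed_path: str, config_key: Optional[str] = None,
            roots=SEARCH_ROOTS) -> bool:
    """Treat a path as dead when neither its file name nor its config key
    (hypothetical parameter, used to catch config-key indirection) is found."""
    needles = [Path(removed_path).name]
    if config_key:
        needles.append(config_key)
    return all(not find_references(repo, needle, roots) for needle in needles)
```

For example, `is_dead(Path("."), "scripts/add_labels.py")` run from the repository root would report whether the script's file name still appears anywhere in the search roots. A substring search can over-report (a short key name may match unrelated text), so any hit still needs a manual look before a file is kept or removed.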

Test plan

  • CI green.

🤖 Generated with Claude Code

@joverlee521 joverlee521 left a comment

Contributor

Checked a couple of the files that I've seen referenced elsewhere, but should all be good to remove from this repo.

Contributor

This file was originally added as a human/machine readable file for reference in #817, but was never intended to be used in the workflow.

Would it be worth keeping as a reference? It's currently used in ncov-clades-schema, but that repo also has not been run/updated in a while.

@trvrb trvrb Jun 30, 2026

Member Author

Oh! Good catch. I thought this was stale, but we need a source of truth for which clades are parents to which clades. It's this file. I've restored it in a rebased cf49012.

Contributor

Noting the example data is still referenced in the ncov-tutorial, but that folder also has a warning that it is just a public archive and not guaranteed to work.

Should be okay to delete 👍

Member Author

Great!

Comment thread scripts/add_labels.py

Contributor

Originally added in 9c48778 as an orphan script. A copy of it is referenced in a fork (from nextstrain/auspice#1036 (comment)), but should be good to delete from this repo.

Member Author

Thanks for investigating!

@trvrb trvrb force-pushed the cleanup-dead-code branch from ba8dd95 to 4bc5061 June 30, 2026 22:57
trvrb and others added 2 commits June 30, 2026 15:59
First pass of a pandemic-era cruft cleanup. Removes code and data that is
referenced by nothing in the workflow, the active profiles, the docs, or the
tests (verified by grep plus a dry-run of the CI profile showing an unchanged
37-job DAG). No behavior change to any build.

Removed:
- Orphan scripts: scripts/add_labels.py, scripts/generate-scientific-credits.py,
  and the explicitly-deprecated scripts/deprecated/ (calculate_delta_frequency.py,
  parse_mutational_fitness_tsv_into_distance_map.py).
- Unused defaults: defaults/distance_maps/VoC.json (only S1.json is used) and
  defaults/clades_who.tsv.
- Dead config key files.outgroup (defaults/parameters.yaml) — it pointed at a
  file that does not exist in the repo and was read nowhere; its config-reference
  entry (already documented "No longer used") is removed too.
- Deprecated my_profiles/ directory (only a deprecation README) and its now-orphan
  .gitignore exception.
- Unused committed example data: data/example_*_worldwide.*, data/example_*_aus.*,
  data/example_multiple_inputs.tar.xz (the CI-used example_metadata.tsv /
  example_sequences.fasta.gz are kept).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@trvrb trvrb force-pushed the cleanup-dead-code branch from 4bc5061 to 8d7cafd June 30, 2026 22:59
@trvrb trvrb merged commit a1ffbda into master Jun 30, 2026
6 checks passed
@trvrb trvrb deleted the cleanup-dead-code branch June 30, 2026 23:34
2 participants