Simplify configuration of accession as metadata id column#2021

Open
victorlin wants to merge 6 commits into master from victorlin/metadata-id-column
@victorlin victorlin commented Jul 1, 2026

Member

Description of proposed changes

This PR's commits revolve around a common goal of simplifying metadata id column handling in workflows. See #1780 (comment) for context.

Related issue(s)

Checklist

  • Test in zika repo
  • Automated checks pass (ignore unrelated docs warning)
  • Check if you need to add a changelog message
  • Check if you need to add tests
  • Check if you need to update docs

victorlin added 2 commits July 1, 2026 11:01
Mention that values must be unique, and that --metadata-id-columns can be used to select other columns.
Reflect the changes in "Prefer `strain` over `name` as sequence ID
field" (6064fa2).
@victorlin victorlin self-assigned this Jul 1, 2026
@victorlin victorlin marked this pull request as draft July 1, 2026 20:01
Prefer a new explicit 'id' column to represent unique identifier,
instead of 'strain' and 'name' which have historically been associated
with non-unique columns from some data sources.
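
To illustrate the preference described in this commit message, here is a minimal Python sketch of choosing an id column and checking it for uniqueness. It is not Augur's implementation: the function names, the candidate order `("id", "strain", "name")`, and the sample rows are assumptions made for this example only.

```python
from collections import Counter

# Assumed priority order for illustration; the commit message only says that
# 'id' is preferred over 'strain' and 'name'.
CANDIDATE_ID_COLUMNS = ("id", "strain", "name")


def pick_id_column(columns, candidates=CANDIDATE_ID_COLUMNS):
    """Return the first candidate that appears among the metadata columns."""
    columns = list(columns)
    for candidate in candidates:
        if candidate in columns:
            return candidate
    raise ValueError(f"None of {candidates!r} found in columns {columns!r}")


def find_duplicate_ids(rows, id_column):
    """Return the sorted values of id_column that occur more than once."""
    counts = Counter(row[id_column] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical metadata rows: 'strain' repeats, 'id' does not.
rows = [
    {"id": "A1", "strain": "Zika/2016", "country": "BR"},
    {"id": "A2", "strain": "Zika/2016", "country": "CO"},
]

id_column = pick_id_column(rows[0].keys())
duplicates_by_id = find_duplicate_ids(rows, id_column)
duplicates_by_strain = find_duplicate_ids(rows, "strain")
```

In this toy data, keying on `strain` would collide on the repeated value, while the explicit `id` column stays unique, which is the situation the commit message describes for some data sources.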
@victorlin victorlin force-pushed the victorlin/metadata-id-column branch from 270f076 to ca57842 Compare July 1, 2026 20:29
@codecov

codecov Bot commented Jul 1, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.96%. Comparing base (2a525b5) to head (54a41e9).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2021      +/-   ##
==========================================
- Coverage   72.96%   72.96%   -0.01%     
==========================================
  Files          85       85              
  Lines       10732    10730       -2     
  Branches     2102     2100       -2     
==========================================
- Hits         7831     7829       -2     
  Misses       2536     2536              
  Partials      365      365              

victorlin added 3 commits July 1, 2026 14:50
Allow explicitly naming the id column in merged metadata output, instead
of always using the first input's id column, so workflows can use a
standard id column name regardless of inputs.
Allow --metadata and --sequences to accept a single input instead of
requiring at least two, since augur merge is also used as a general
entry point for standardization even when no actual merge across inputs
is needed.
Follow-up to "Read FASTA files with SeqKit directly" (44b72fe) which is
really more of a workaround than a fix.
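
The two merge-related commit messages above (naming the output id column, and accepting a single input) can be pictured with a small pure-Python sketch. This is a toy model under stated assumptions, not how `augur merge` is implemented: the function `merge_metadata`, its `output_id_column` parameter, and the sample tables are hypothetical names chosen for this example.

```python
def merge_metadata(tables, output_id_column=None):
    """Merge one or more (id_column, rows) tables keyed by their id values.

    Later tables override earlier ones column by column. If output_id_column
    is not given, the first table's id column name is used for the output.
    """
    if not tables:
        raise ValueError("At least one metadata table is required")
    if output_id_column is None:
        output_id_column = tables[0][0]

    merged = {}
    for id_column, rows in tables:
        for row in rows:
            key = row[id_column]
            record = merged.setdefault(key, {output_id_column: key})
            for column, value in row.items():
                # The output id always comes from the table's own id column,
                # so skip both the input id column and any column that
                # happens to share the output id column's name.
                if column in (id_column, output_id_column):
                    continue
                record[column] = value
    return list(merged.values())


# Hypothetical inputs whose id columns have different names.
genbank = ("accession", [{"accession": "KX1", "country": "BR"}])
local = ("strain", [{"strain": "KX1", "date": "2016-02-01"}])

# A single input is only standardized: its id column is renamed to 'id'.
single = merge_metadata([genbank], output_id_column="id")

# Two inputs are joined on their id values and written with the same name.
both = merge_metadata([genbank, local], output_id_column="id")
```

With a single table the call acts purely as a standardization step, which mirrors the stated motivation for allowing one input; with several tables the output id column name no longer depends on which input comes first.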
@victorlin victorlin changed the title Prefer 'id' as identifier column Simplify configuration of accession as metadata id column Jul 1, 2026
@victorlin victorlin marked this pull request as ready for review July 1, 2026 23:12
Development

Successfully merging this pull request may close these issues.

merge: Support single inputs