Releases: AstrobioMike/bit
v2.7.0
Added
- to bit gen-metagenome
  - taxid and detection columns in the per-genome truth tsvs
- to bit gen-reads
  - per-genome output tsv with info like number of reads generated, coverage, and detection
Full Changelog: v2.6.0...v2.7.0
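To make the coverage and detection columns above more concrete, here is a minimal sketch of how such values can be computed from read placements. It assumes coverage means total read bases divided by genome length, and detection means the fraction of genome positions covered by at least one read. Those definitions and the function name `coverage_and_detection` are assumptions for illustration, not taken from bit's source.

```python
def coverage_and_detection(genome_length, read_intervals):
    """Compute (coverage, detection) for one genome.

    read_intervals is a list of (start, end) read placements on the genome,
    0-based and half-open (an assumed convention for this sketch).
    """
    # coverage: total sequenced bases relative to genome size
    total_bases = sum(end - start for start, end in read_intervals)
    coverage = total_bases / genome_length

    # detection: merge overlapping placements, then count covered positions
    covered = 0
    current_start, current_end = None, None
    for start, end in sorted(read_intervals):
        if current_end is None or start > current_end:
            # previous merged block is finished, so bank its length
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # overlapping or touching placement extends the current block
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start

    detection = covered / genome_length
    return coverage, detection
```

For example, two 50-base reads placed at 0-50 and 25-75 on a 100-base genome give a coverage of 1.0 but a detection of only 0.75, which is why the two values are worth reporting separately.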
v2.6.0
Added
- bit gen-metagenome
  - leverages bit get-accs-from-gtdb, bit dl-ncbi-assemblies, bit mutate-seqs, bit gen-reads, and bit lineage from-taxids to produce mock metagenomes with ground-truth tables per-genome, per-taxonomic-rank, and per-read with associated GTDB and NCBI taxonomy
- bit get-accs-from-ncbi
  - for searching ncbi for assembly accessions based on taxonomy (counterpart to bit get-accs-from-gtdb)
- bit gen-reads
  - has an optional --source-tsv option which will write out each read's source info (and when set, the headers no longer hold all that info and are smaller)
  - can now take gzipped input fastas
- added a github action to weekly rebuild the slimmed down ncbi assembly summary table
Changed
- bit dl-ncbi-assemblies
  - many robustness improvements
  - output fasta files have extension ".fasta.gz" now instead of ".fa.gz" for consistency with the rest of bit-produced nt fasta files (sorry for the change!)
- bit gen-reads
  - read headers have been improved with regard to provenance tracking
  - initial fragments/reads now have a 50/50 chance to be drawn from +/- strand (relative to the ref)
- bit data get gtdb-data and bit data get ncbi-assembly-data now both pull a prepared, slimmed version from github first if possible (saves on download and reading time)
Full Changelog: v2.5.0...v2.6.0
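The 50/50 strand behavior noted for bit gen-reads in this release can be pictured with a small sketch. This is not bit's actual code: the helper names `reverse_complement` and `draw_fragment`, the uniform choice of start position, and the returned tuple are all assumptions made for illustration.

```python
import random

# translation table for complementing unambiguous nucleotides
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide string."""
    return seq.translate(COMPLEMENT)[::-1]


def draw_fragment(ref_seq, fragment_length, rng=random):
    """Draw one fragment from ref_seq (hypothetical helper).

    Returns (fragment, strand, start), where start is the 0-based position
    on the reference. Requires fragment_length <= len(ref_seq).
    """
    # pick a start uniformly so the fragment fits inside the reference
    start = rng.randint(0, len(ref_seq) - fragment_length)
    fragment = ref_seq[start:start + fragment_length]

    # half of the time keep the reference orientation ("+"),
    # otherwise return the reverse complement ("-")
    if rng.random() < 0.5:
        return fragment, "+", start
    return reverse_complement(fragment), "-", start
```

Passing a seeded `random.Random` instance as `rng` makes the draws reproducible, which is handy when checking read provenance against a truth table.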
NCBI assembly-info (latest)
an auto-weekly-rebuilt slimmed down NCBI assembly summary table that bit uses
GTDB metadata (latest)
an auto-maintained slimmed down GTDB metadata table that bit uses (its stored date file is checked weekly, and the table is rebuilt when GTDB has issued a new release)
Metagenomics workflow v1.0.4
- removed f-strings and made shell blocks raw/r so Snakefile was compatible with bit env python 3.12
Amplicon workflow v1.0.1
- removed unnecessary f-strings to work with updated bit env python 3.12
v2.5.0
Base env python updated from 3.10 to 3.12
Added
- to bit ez-screen assembly
  - region-calls.tsv now includes a contig_length column
- to bit fasta extract-seqs-by-coords
  - can now take an inline entry for a single region to extract instead of requiring a bed file, e.g. -b contig-1 20 100
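As a rough picture of what an inline region such as contig-1 20 100 involves, the sketch below parses the three values and slices the matching sequence. It assumes BED-style coordinates (0-based start, end-exclusive), which the release notes do not state. The function names and the in-memory dictionary of sequences are hypothetical and not bit's implementation.

```python
def parse_region(tokens):
    """Turn ['contig-1', '20', '100'] into ('contig-1', 20, 100)."""
    if len(tokens) != 3:
        raise ValueError("expected three values: contig start end")
    contig, start, end = tokens[0], int(tokens[1]), int(tokens[2])
    if start < 0 or end <= start:
        raise ValueError("start must be >= 0 and end must be greater than start")
    return contig, start, end


def extract_region(seqs, region):
    """Slice one region out of a dict mapping contig names to sequences.

    Assumes BED-style coordinates: 0-based start, end-exclusive.
    """
    contig, start, end = region
    if contig not in seqs:
        raise KeyError(f"contig not found: {contig}")
    return seqs[contig][start:end]
```

With these assumptions, the region contig-1 20 100 yields an 80-base sequence.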
Changed
- bit fasta extract-seqs-by-headers
  - now ignores ">" characters if they are at the front of the specified headers
  - space-delimited list or file with one header per line can be provided to the -H parameter now
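The header handling described for bit fasta extract-seqs-by-headers could look something like the following sketch. It assumes a single value that names an existing file is read as one header per line, and anything else is treated as a list of headers. The function name `normalize_headers` and that file-detection rule are assumptions, not bit's actual logic.

```python
import os


def normalize_headers(values):
    """Return a clean list of target headers (hypothetical helper).

    values is the list of strings given on the command line: either
    several headers, or a single path to a file with one header per line.
    """
    if len(values) == 1 and os.path.isfile(values[0]):
        # file input: one header per line, skipping blank lines
        with open(values[0]) as handle:
            raw = [line.strip() for line in handle if line.strip()]
    else:
        raw = list(values)

    # drop any ">" at the front so ">seq1" and "seq1" match the same record
    return [header.lstrip(">") for header in raw]
```

Normalizing up front means the later lookup only ever compares bare header names, whichever way the headers were supplied.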
SRA download workflow v1.1.2
- slight changes to some f-strings so it works with python 3.12 (after bit base env migration from 3.10)
v2.4.1
Added
- additions to bit ez-screen assembly
  - --min-edge-perc-cov and --edge-tolerance, which enable lower-coverage targets that are likely clipped by a contig edge to be captured and reported
v2.4.0
Added
- additions to bit ez-screen assembly
  - can now take multiple nucleotide (as fasta or blast db) and/or amino-acid (as fasta or diamond db) inputs for targets
  - now by default extracts "islands" of densely clustered regions into their own fastas (disable with --no-island-extraction)
  - detailed help menu accessible with -H|--show-detailed-help
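One generic way to find "islands" of densely clustered regions is to group hits on the same contig that sit within some maximum gap of each other. The sketch below shows that idea only. The release notes do not describe how bit defines an island, so the function name and the `max_gap` and `min_regions` parameters are purely hypothetical.

```python
def group_into_islands(regions, max_gap=1000, min_regions=2):
    """Group nearby regions into islands (illustrative only).

    regions is a list of (contig, start, end) tuples. Returns a list of
    (contig, island_start, island_end, n_regions) tuples.
    """
    islands = []
    current = None  # [contig, start, end, n_regions] for the open island

    for contig, start, end in sorted(regions):
        same_contig = current is not None and contig == current[0]
        if same_contig and start - current[2] <= max_gap:
            # close enough to the open island, so extend it
            current[2] = max(current[2], end)
            current[3] += 1
        else:
            # close the previous island if it holds enough regions
            if current is not None and current[3] >= min_regions:
                islands.append(tuple(current))
            current = [contig, start, end, 1]

    if current is not None and current[3] >= min_regions:
        islands.append(tuple(current))
    return islands
```

Each resulting island span could then be sliced out of its contig and written to its own fasta file.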