Uniref Bottleneck #513

@blediro

Description of the bug

Hello,

We are running DRAM 2.0 on the Digital Research Alliance of Canada HPC cluster (Fir), with the UniRef database on high-performance scratch storage. The MMSEQS_SEARCH_UNIREF step shows no progress after 12 hours per genome, even for a small test MAG. We plan to annotate ~3000 MAGs, and at this rate the UniRef search alone would be impractical.
Is UniRef pre-indexed once at the database level, or is indexing happening per genome? Are there recommended MMseqs2 settings (e.g. --split-memory-limit, sensitivity, or prefilter settings) to speed up the UniRef search for large MAG datasets?
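
For reference, here is a minimal sketch of how the indexing question could be checked from the file system. It assumes that a precomputed MMseqs2 index, if one exists, sits next to the database as `<db>.idx` (the file written by `mmseqs createindex`). The helper name `has_mmseqs_index` and the database path are hypothetical placeholders, not part of DRAM.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether an MMseqs2 database has a precomputed index.
# Assumption: `mmseqs createindex` stores the index as "<db>.idx" beside the database.
has_mmseqs_index() {
    local db="$1"
    # -s is true only if the file exists and is non-empty
    if [ -s "${db}.idx" ]; then
        echo "indexed"
    else
        echo "not indexed"
    fi
}

# Placeholder path; replace with the real UniRef database prefix.
UNIREF_DB="${UNIREF_DB:-/path/to/uniref_db}"
has_mmseqs_index "$UNIREF_DB"
```

If this reports "not indexed", our understanding is that MMseqs2 builds the prefilter index in memory at search time, which might account for part of the per-genome cost. We have not confirmed how DRAM prepares the database.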

Thank you for your time and the great tool!

System information

DRAM v2.0.0-beta28
Nextflow version 25.10.2
HPC
Slurm
Apptainer

Command used and terminal output

nextflow run WrightonLabCSU/DRAM -r v2.0.0-beta28 \
 --input_fasta /home/jcsm2010/scratch/2026-02-19_MetaG_FarmAD_400L/genome_qc_results/test_bins/ \
 --outdir /home/jcsm2010/scratch/2026-02-19_MetaG_FarmAD_400L/DRAM_output \
 --annotate \
 --anno_dbs kofam,camper,canthyd,sulfur,fegenie,merops,dbcan,pfam,metals,antismash,tcdb,dram_db,vog,methyl,uniref \
 --summarize \
 --traits \
 --visualize \
 --slurm \
 -profile apptainer \
 -c /scratch/jcsm2010/DRAM/custom.config \
 -resume

Relevant files

custom config:

process {
    clusterOptions = '--account=rpp-shallam '
}

params {
    tiny_gb_mem_limit = 1
    small_gb_mem_limit = 8
    medium_gb_mem_limit = 24
    big_gb_mem_limit = 48
    huge_gb_mem_limit = 400
    huge_hr_time_limit = 48
}

Labels: bug