Summary
The two LIMS validation processes redirect all stdout/stderr into a local report file (> *_validation_report.txt 2>&1). That file is not a declared Nextflow output, so on the OSPool/HTCondor executor it is never staged back from the worker. When validation fails, the only artifact containing the real error is discarded, and the operator is left with a generic exit status 1 plus a misleading NoSuchFileException: .command.log.
Where
modules/labkey.nf -> LABKEY_VALIDATE_BLAST_HITS_LIST (the > blast_validation_report.txt 2>&1 redirect)
modules/labkey.nf -> LABKEY_VALIDATE_BLAST_FASTA_LIST (the > fasta_validation_report.txt 2>&1 redirect)
Both call validate_labkey.py, which on failure logs the cause and calls exit(1).
Observed failure
A run failed with a generic exit status 1 and NoSuchFileException: .command.log. The work dir had empty .command.out/.command.err, no .command.log, and no fasta_validation_report.txt (the report isn't a declared output, and OSPool only returns declared outputs plus the .command.* files).
Reproducing validate_labkey.py locally with output un-redirected revealed the hidden cause:
[FAIL] Insert/Delete failed: 401: The API key you provided is invalid.
That is, the LABKEY_API_KEY secret was expired or invalid -- a trivial fix that was nearly impossible to diagnose from the pipeline output.
Impact
Any validation failure (bad API key, schema mismatch, permissions, network) is silently swallowed on the OSPool executor. Operators must manually re-run the script off-pipeline to see what went wrong.
Proposed fix
Drop the > *_validation_report.txt 2>&1 redirect in both processes so output flows to stdout/stderr, which Nextflow captures in .command.out/.command.err and surfaces in its error report. (Alternatively, declare the report as a path output so it is staged back, but writing to stdout/stderr is simpler and more idiomatic.)
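For illustration, here is a minimal local sketch of the difference the redirect makes. It does not run Nextflow or HTCondor. FAKE_VALIDATOR is a hypothetical stand-in for validate_labkey.py, run_task is a made-up helper, the captured streams play the role of .command.out/.command.err, and the temporary directory plays the role of a work dir that is thrown away on the worker. It assumes a POSIX shell.

```python
import shlex
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for validate_labkey.py: print the cause to stderr, exit 1.
FAKE_VALIDATOR = (
    "import sys; "
    "print('[FAIL] Insert/Delete failed: 401', file=sys.stderr); "
    "sys.exit(1)"
)
REPORT_NAME = "fasta_validation_report.txt"


def run_task(redirect: bool) -> dict:
    """Run the stand-in in a scratch work dir, with or without the redirect."""
    with tempfile.TemporaryDirectory() as work_dir:
        cmd = f"{shlex.quote(sys.executable)} -c {shlex.quote(FAKE_VALIDATOR)}"
        if redirect:
            cmd += f" > {REPORT_NAME} 2>&1"
        # capture_output plays the role of .command.out/.command.err here.
        proc = subprocess.run(
            cmd, shell=True, cwd=work_dir, capture_output=True, text=True
        )
        report = Path(work_dir) / REPORT_NAME
        return {
            "exit": proc.returncode,
            "captured": proc.stdout + proc.stderr,
            # Only readable while the scratch work dir still exists.
            "report": report.read_text() if report.exists() else None,
        }


if __name__ == "__main__":
    for redirect in (True, False):
        print(f"redirect={redirect}:", run_task(redirect))
```

With the redirect, both runs exit 1 but the captured streams are empty and the message exists only in the report file, which disappears with the work dir. Without it, the same message arrives in the captured streams and no report file is created.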