Merged
21 changes: 19 additions & 2 deletions CHANGELOG.md
@@ -5,7 +5,24 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

-## [1.0.1] - 2026-03-03
+## [1.0.2] - 04-03-2026
+
+### Changed
+
+- **Rich progress bars replace log output during pipeline runs**: The CLI now shows clean Rich progress bars on stderr (step-level + per-junction detail for primer design and SNP check) instead of a wall of log messages. All detailed logs still go to the file. Warnings and errors are printed above the progress bar. Multi-panel parallel mode shows a panel-level progress bar. Progress bars are only active when stderr is a TTY; non-interactive runs behave as before.
+- **SNP and off-target filters now retain all tied least-affected pairs** (`snpcheck/checker.py`, `blast/specificity.py`): When all primer pairs for a junction overlap SNPs or have off-target products, the filters now keep every pair tied at the minimum count instead of arbitrarily picking one. This lets the downstream selector evaluate tied candidates on other properties (Tm, GC%, pair penalty, etc.).
+- **Normalise cross-dimer penalty in multiplex cost function** (`selector/cost.py`): The cross-dimer penalty was a raw sum over all C(2n, 2) pairwise primer interactions, scaling quadratically with multiplex size. This caused it to dominate the cost function at higher plexities, effectively drowning out off-target and SNP penalties during selection. The penalty is now divided by the number of interactions, making it a per-interaction average. Weights are now directly comparable regardless of multiplex size.
+
+- **Separate warnings from errors in pipeline output** (`pipeline.py`, `cli.py`): Off-target and SNP fallback messages (where all pairs had issues but the least-affected were kept) were incorrectly reported as errors, causing the CLI to display "Some panels had errors" for panels that completed successfully. These are now reported as warnings. Errors are reserved for actual failures (e.g. design exceptions, BLAST unavailable). The CLI now shows a distinct warnings section below the success summary.
+- **Update fallback message wording**: "least-affected pair kept" now reads "all least-affected pairs kept" to reflect the v1.0.2 change that retains all tied pairs.
+- **Remove stale `config/` directory**: Config presets and alignment parameters were moved to `src/plexus/data/` in v1.0.2 but the old `config/` directory was left behind. Removed it and moved `environment.yml` to the repo root. Updated references in `README.md`, `docs/USER_GUIDE.md`, `docs/IMPLEMENTATION.md`, and `docs/getting_started.ipynb`.
+
+### Fixed
+
+- **Swapped-orientation off-targets missed** (`blast/specificity.py`): When a forward primer binds the minus strand and the reverse primer binds the plus strand at an off-target locus, the `AmpliconFinder` stores the amplicon under `(reverse_id, forward_id)`. The mapping step only checked `(forward_id, reverse_id)`, silently missing these swapped-orientation off-targets. Now checks both key orders.
+- **Package data files not found in global installs**: Config presets and alignment parameter files were not included in the wheel because they lived outside the Python package at the project root (`config/`). Moved all data files (`designer_default_config.json`, `designer_lenient_config.json`, `alignment_parameters.json`, `nn_model/`) into `src/plexus/data/` and switched `config.py` and `aligner/align.py` from `ROOT_DIR` path concatenation to `importlib.resources.files()`. Removed the now-unused `utils/root_dir.py`. `uv tool install` and `pip install` now work correctly without an editable install.
+
+## [1.0.1] - 03-03-2026

 ### Changed

@@ -26,7 +43,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - **3'-end tolerance for BLAST annotation** (`annotator.py`, `config.py`, `specificity.py`, `pipeline.py`): New `three_prime_tolerance` parameter (default 3) relaxes the `from_3prime` check from `qend == qlen` to `qlen - qend <= tolerance`. BLAST's local alignment clips terminal mismatches, causing hits like the DCAF12L1 reverse primer (19/21bp aligned, 2bp clipped at 3' end) to be wrongly discarded as "not 3'-anchored" even though Primer-BLAST detects them via semi-global alignment.
 - Tests for `find_max_poly_gc`, `check_kmer` poly-GC integration, BLAST evalue/reward/penalty/word_size parameter forwarding, and specificity check threading.

-## [1.0.0] - 2026-03-03
+## [1.0.0] - 03-03-2026

 First stable release. All v1.0 roadmap items complete — see `docs/ROADMAP.md` for the full list.
 Includes correctness fixes (BLAST annotation, coordinate handling, off-target filtering),
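
The cross-dimer normalisation described in the 1.0.2 changelog entry can be illustrated with a small sketch. It assumes a hypothetical `pair_score(a, b)` callable and a flat list of 2n primers; these names are illustrative and are not taken from `selector/cost.py`. With a constant score per interaction, the raw sum grows as C(2n, 2) while the averaged value stays fixed.

```python
from itertools import combinations
from math import comb


def cross_dimer_penalty(primers, pair_score, normalise=True):
    """Sum pair_score over all unordered primer pairs; optionally average it."""
    pairs = list(combinations(primers, 2))
    total = sum(pair_score(a, b) for a, b in pairs)
    if normalise and pairs:
        # Per-interaction average: independent of how many primers are in the panel.
        return total / len(pairs)
    return total


def constant_score(a, b):
    # Toy stand-in for a dimer score: every interaction costs 1.0.
    return 1.0


for n_targets in (2, 10, 50):
    primers = [f"p{i}" for i in range(2 * n_targets)]
    raw = cross_dimer_penalty(primers, constant_score, normalise=False)
    avg = cross_dimer_penalty(primers, constant_score)
    assert raw == comb(2 * n_targets, 2)  # 6, 190, 4950 interactions
    print(n_targets, raw, avg)
```

Under these assumptions the raw penalty rises from 6 to 4950 between a 2-plex and a 50-plex, which is the quadratic growth the changelog says was drowning out the other terms; the averaged value is 1.0 in every case.
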
2 changes: 1 addition & 1 deletion README.md
@@ -60,7 +60,7 @@ You can also set up the environment using Conda:
 ```bash
 git clone https://github.com/sfilges/plexus
 cd plexus
-conda env create -f config/environment.yml
+conda env create -f environment.yml
 conda activate plexus-run
 pip install -e .
 ```
7 changes: 0 additions & 7 deletions config/alignment_parameters.json

This file was deleted.

5 changes: 0 additions & 5 deletions data/design_regions.csv

This file was deleted.

24 changes: 4 additions & 20 deletions data/junctions.csv
@@ -1,21 +1,5 @@
 Name,Chrom,Five_Prime_Coordinate,Three_Prime_Coordinate
-HOXA2_p.V274F,chr7,27101037,27101037
-BRAF_p.L485W ,chr7,140778054,140778054
-CLTCL1_p.R354H,chr22,19234615,19234615
-TTN_p.E8395K ,chr2,178715052,178715052
-FEM1B_,chr2,177216892,177216892
-PDCL3_p.E49D,chr2,100568944,100568944
-DCDC1_p.T1379fs*2,chr11,30905134,30905134
-KIF22_p.Q88K,chr16,29798664,29798664
-MAMDC4_p.G1071E,chr9,136859904,136859904
-WLS_p.I360N ,chr1,68144579,68144579
-RPA1_p.R31H,chr17,1843927,1843927
-ARL6_p.T181I,chr3,97791959,97791959
-NOLC1_p.I687V,chr10,102162228,102162228
-GOLGA2_p.R322W,chr9,128262652,128262652
-CRTC1_p.P334L ,chr19,18765518,18765518
-FEM1B_c.249-5T>C,chr15,68289602,68289602
-UNK_p.R77fs*103,chr17,75809883,75809883
-CLTCL1_p.R354H,chr22,19234615,19234615
-MBP_p.G56R,chr18,77017242,77017242
-ZNF729_p.C1134Y,chr19,22316818,22316818
+EGFR_T790M,chr7,55181378,55181378
+KRAS_G12D,chr12,25245350,25245350
+KRAS_G13R,chr12,25245348,25245349
+BRAF_V600E,chr7,140753336,140753336
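
As an illustration of the junction file layout, the sketch below reads rows with the four columns shown in the header using only the standard library. It is not the loader plexus uses; the function name `read_junctions` and the output keys are assumptions made for this example. It trims whitespace from names because several of the removed rows carried a trailing space before the comma.

```python
import csv
import io

SAMPLE = """Name,Chrom,Five_Prime_Coordinate,Three_Prime_Coordinate
EGFR_T790M,chr7,55181378,55181378
KRAS_G13R,chr12,25245348,25245349
BRAF_p.L485W ,chr7,140778054,140778054
"""


def read_junctions(handle):
    """Parse junction rows, trimming stray whitespace and casting coordinates to int."""
    rows = []
    for record in csv.DictReader(handle):
        rows.append(
            {
                "name": record["Name"].strip(),
                "chrom": record["Chrom"].strip(),
                "five_prime": int(record["Five_Prime_Coordinate"]),
                "three_prime": int(record["Three_Prime_Coordinate"]),
            }
        )
    return rows


junctions = read_junctions(io.StringIO(SAMPLE))
for junction in junctions:
    print(junction)
```

Most rows have identical five-prime and three-prime coordinates; `KRAS_G13R` is the one row in the new file where they differ by one base.
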
4 changes: 2 additions & 2 deletions docs/IMPLEMENTATION.md
@@ -235,8 +235,8 @@ Configuration is managed via Pydantic models in `src/plexus/config.py`. The top-

 Two built-in presets are bundled as JSON:

-* `default` (`config/designer_default_config.json`) — conservative thermodynamic thresholds
-* `lenient` (`config/designer_lenient_config.json`) — relaxed thresholds for difficult junctions
+* `default` (`src/plexus/data/designer_default_config.json`) — conservative thermodynamic thresholds
+* `lenient` (`src/plexus/data/designer_lenient_config.json`) — relaxed thresholds for difficult junctions

 Users can supply a custom JSON with `--config` / `-c`. Generate a template with
 `plexus template`.
2 changes: 1 addition & 1 deletion docs/USER_GUIDE.md
@@ -108,7 +108,7 @@ uv pip install -e .

 ```bash
 # Create conda environment
-conda env create -f config/environment.yml
+conda env create -f environment.yml
 conda activate plexus-run

 # Install plexus
56 changes: 1 addition & 55 deletions docs/getting_started.ipynb
@@ -328,61 +328,7 @@
 "cell_type": "markdown",
 "id": "c1d2e3f4",
 "metadata": {},
-"source": [
-"## 6. Next steps\n",
-"\n",
-"### CLI usage\n",
-"\n",
-"Everything in this notebook can also be done from the command line:\n",
-"\n",
-"```bash\n",
-"plexus run \\\n",
-" --input data/junctions.csv \\\n",
-" --fasta /path/to/hg38.fa \\\n",
-" --output results/ \\\n",
-" --name my_panel\n",
-"```\n",
-"\n",
-"Run `plexus --help` for all options.\n",
-"\n",
-"### Tuning the design parameters\n",
-"\n",
-"Pass a `config_file` (JSON) to `run_pipeline()` or `--config` on the CLI to override any parameter. \n",
-"A minimal example to widen the Tm window:\n",
-"\n",
-"```json\n",
-"{\n",
-" \"singleplex_design_parameters\": {\n",
-" \"PRIMER_MIN_TM\": 55.0,\n",
-" \"PRIMER_MAX_TM\": 66.0\n",
-" }\n",
-"}\n",
-"```\n",
-"\n",
-"See `config/designer_default_config.json` for all available parameters.\n",
-"\n",
-"### SNP checking\n",
-"\n",
-"Filter primers that overlap common germline variants — useful for liquid biopsy panels:\n",
-"\n",
-"```bash\n",
-"plexus init # downloads a bundled gnomAD VCF subset\n",
-"plexus run -i junctions.csv -f hg38.fa --snp-strict\n",
-"```\n",
-"\n",
-"### Multi-patient / multi-panel inputs\n",
-"\n",
-"Add a `Panel` column to your CSV to design independent panels for multiple patients in one run:\n",
-"\n",
-"```bash\n",
-"plexus run -i cohort.csv -f hg38.fa --parallel\n",
-"```\n",
-"\n",
-"### Docker / clinical deployment\n",
-"\n",
-"For containerised or regulated environments, plexus ships a compliance mode — \n",
-"see the [README](../README.md#compliance-mode-and-container-deployment) for the Docker workflow."
-]
+"source": "## 6. Next steps\n\n### CLI usage\n\nEverything in this notebook can also be done from the command line:\n\n```bash\nplexus run \\\n --input data/junctions.csv \\\n --fasta /path/to/hg38.fa \\\n --output results/ \\\n --name my_panel\n```\n\nRun `plexus --help` for all options.\n\n### Tuning the design parameters\n\nPass a `config_file` (JSON) to `run_pipeline()` or `--config` on the CLI to override any parameter. \nA minimal example to widen the Tm window:\n\n```json\n{\n \"singleplex_design_parameters\": {\n \"PRIMER_MIN_TM\": 55.0,\n \"PRIMER_MAX_TM\": 66.0\n }\n}\n```\n\nSee `src/plexus/data/designer_default_config.json` for all available parameters.\n\n### SNP checking\n\nFilter primers that overlap common germline variants — useful for liquid biopsy panels:\n\n```bash\nplexus init # downloads a bundled gnomAD VCF subset\nplexus run -i junctions.csv -f hg38.fa --snp-strict\n```\n\n### Multi-patient / multi-panel inputs\n\nAdd a `Panel` column to your CSV to design independent panels for multiple patients in one run:\n\n```bash\nplexus run -i cohort.csv -f hg38.fa --parallel\n```\n\n### Docker / clinical deployment\n\nFor containerised or regulated environments, plexus ships a compliance mode — \nsee the [README](../README.md#compliance-mode-and-container-deployment) for the Docker workflow."
 }
 ],
 "metadata": {
File renamed without changes.
93 changes: 65 additions & 28 deletions src/plexus/aligner/align.py
@@ -21,12 +21,12 @@

 import json
 from dataclasses import dataclass, field
+from importlib.resources import files
 from itertools import product
+from pathlib import Path

 from loguru import logger

-from plexus.utils.root_dir import ROOT_DIR
-
 # ================================================================================
 # Define an alignment between two primers
 # ================================================================================
@@ -91,8 +91,9 @@ def __init__(self, param_path: str | None = None) -> None:

         # Load parameters
         if param_path is None:
-            param_path = f"{ROOT_DIR}/config/alignment_parameters.json"
-        self.load_parameters(param_path)
+            self._load_parameters_from_package()
+        else:
+            self._load_parameters_from_file(param_path)

     def set_primers(
         self, primer1: str, primer2: str, primer1_name: str, primer2_name: str
@@ -104,12 +105,46 @@ def set_primers(
         self.primer2_name = primer2_name
         self.score = None

-    def load_parameters(self, param_path: str) -> None:
+    def _load_parameters_from_package(self) -> None:
+        """Load alignment parameters from the bundled plexus.data package."""
+        cache_key = "__package__"
+        if cache_key in PrimerDimerPredictor._param_cache:
+            logger.debug("Using cached alignment parameters from package data")
+            self.nn_scores, self.end_length, self.end_bonus = (
+                PrimerDimerPredictor._param_cache[cache_key]
+            )
+            return
+
+        logger.info("Loading alignment parameters from package data")
+        data_pkg = files("plexus.data")
+
+        params = json.loads(data_pkg.joinpath("alignment_parameters.json").read_text())
+
+        match_dt: dict[str, float] = json.loads(
+            data_pkg.joinpath(params["match_scores"]).read_text()
+        )
+        single_mismatch_dt: dict[str, float] = json.loads(
+            data_pkg.joinpath(params["single_mismatch_scores"]).read_text()
+        )
+
+        self.nn_scores = _build_nn_score_dt(
+            match_dt, single_mismatch_dt, params["double_mismatch_score"]
+        )
+        self.end_length = params["end_length"]
+        self.end_bonus = params["end_bonus"]
+
+        PrimerDimerPredictor._param_cache[cache_key] = (
+            self.nn_scores,
+            self.end_length,
+            self.end_bonus,
+        )
+
+    def _load_parameters_from_file(self, param_path: str) -> None:
         """
-        Load parameters necessary for Primer Dimer algorithm,
-        and set as attributes. Results are cached by path so that
-        repeated instantiations within the same process only read
-        the JSON files once.
+        Load parameters from a user-specified file path.
+
+        Results are cached by path so that repeated instantiations
+        within the same process only read the JSON files once.

         Parameters
         ----------
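
The package-data loading introduced in `_load_parameters_from_package` relies on `importlib.resources.files()`, which resolves resources relative to an importable package rather than the repository root. The sketch below shows the same pattern in isolation. So that it runs without plexus installed, it builds a throwaway package named `demo_pkg_data` in a temporary directory; that package name, the helper `load_packaged_json`, and the parameter values are all made up for the example.

```python
import importlib
import json
import sys
import tempfile
from importlib.resources import files
from pathlib import Path


def load_packaged_json(package: str, resource: str) -> dict:
    """Read a JSON resource that ships inside an importable package."""
    return json.loads(files(package).joinpath(resource).read_text(encoding="utf-8"))


# Build a throwaway package on disk (hypothetical name, made-up values).
tmp_root = Path(tempfile.mkdtemp())
pkg_dir = tmp_root / "demo_pkg_data"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "alignment_parameters.json").write_text(
    json.dumps({"end_length": 4, "end_bonus": -0.5})
)
sys.path.insert(0, str(tmp_root))
importlib.invalidate_caches()

params = load_packaged_json("demo_pkg_data", "alignment_parameters.json")
print(params["end_length"], params["end_bonus"])
```

Because the lookup goes through the import system, the same call works whether the package is installed from a wheel or in editable mode, which is the property the changelog entry about global installs depends on.
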
Expand All @@ -125,18 +160,16 @@ def load_parameters(self, param_path: str) -> None:

logger.info(f"Loading alignment parameters from: {param_path}")

# Load parameter JSON
with open(param_path) as f:
params = json.load(f)

# Load nearest neighbour model, these should all be paths
param_dir = str(Path(param_path).parent)
self.nn_scores = create_nn_score_dt(
match_json=f"{ROOT_DIR}/{params['match_scores']}",
single_mismatch_json=f"{ROOT_DIR}/{params['single_mismatch_scores']}",
match_json=f"{param_dir}/{params['match_scores']}",
single_mismatch_json=f"{param_dir}/{params['single_mismatch_scores']}",
double_mismatch_score=params["double_mismatch_score"],
)

# Load penalties
self.end_length = params["end_length"]
self.end_bonus = params["end_bonus"]
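
With this hunk, the score-table paths named inside a user-supplied parameter file are resolved against that file's own directory instead of the repository root. A minimal sketch of that resolution step is below. The helper name `resolve_referenced_files` and the file names `nn_model/match.json` and `nn_model/single_mismatch.json` are placeholders; only the keys `match_scores` and `single_mismatch_scores` come from the diff.

```python
import json
import tempfile
from pathlib import Path


def resolve_referenced_files(param_path: str) -> dict[str, Path]:
    """Resolve the score-table paths named in a parameter file against its directory."""
    param_file = Path(param_path)
    params = json.loads(param_file.read_text(encoding="utf-8"))
    base = param_file.parent
    return {
        key: base / params[key]
        for key in ("match_scores", "single_mismatch_scores")
    }


# Demo in a temporary directory; the referenced file names are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "alignment_parameters.json"
    cfg.write_text(
        json.dumps(
            {
                "match_scores": "nn_model/match.json",
                "single_mismatch_scores": "nn_model/single_mismatch.json",
            }
        )
    )
    resolved = resolve_referenced_files(str(cfg))
    assert resolved["match_scores"] == Path(tmp) / "nn_model" / "match.json"
```

One consequence of this design is that a custom parameter file and its score tables can be moved together to any directory without editing the paths inside the JSON.
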

@@ -366,6 +399,23 @@ def get_primer_alignment(self) -> PrimerAlignment:
         )


+def _build_nn_score_dt(
+    match_dt: dict[str, float],
+    single_mismatch_dt: dict[str, float],
+    double_mismatch_score: float = 0.2,
+) -> dict[str, float]:
+    """Build the nearest-neighbour scoring dict from pre-loaded dicts."""
+    nts = ["A", "T", "C", "G"]
+    nn_score_dt: dict[str, float] = {
+        "".join(watson) + "/" + "".join(crick): double_mismatch_score
+        for watson in product(nts, repeat=2)
+        for crick in product(nts, repeat=2)
+    }
+    nn_score_dt.update(match_dt)
+    nn_score_dt.update(single_mismatch_dt)
+    return nn_score_dt
+
+
 def create_nn_score_dt(
     match_json: str, single_mismatch_json: str, double_mismatch_score: float = 0.2
 ) -> dict[str, float]:
Expand All @@ -386,22 +436,9 @@ def create_nn_score_dt(
dict
Dictionary mapping dinucleotide pairs to their scores
"""
# Load match and single mismatch .jsons
with open(match_json) as f:
match_dt: dict[str, float] = json.load(f)
with open(single_mismatch_json) as f:
single_mismatch_dt: dict[str, float] = json.load(f)

# Set all as double mismatches; then update
nts = ["A", "T", "C", "G"]
nn_score_dt: dict[str, float] = {
"".join(watson) + "/" + "".join(crick): double_mismatch_score
for watson in product(nts, repeat=2)
for crick in product(nts, repeat=2)
}

# Update
nn_score_dt.update(match_dt)
nn_score_dt.update(single_mismatch_dt)

return nn_score_dt
return _build_nn_score_dt(match_dt, single_mismatch_dt, double_mismatch_score)
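
To make the layering in `_build_nn_score_dt` concrete, here is a standalone sketch of the same idea with toy inputs: every dinucleotide pairing starts at the double-mismatch score, then match and single-mismatch entries overwrite their keys. The function name `build_nn_scores` and the score values are illustrative and are not the bundled thermodynamic tables.

```python
from itertools import product


def build_nn_scores(match, single_mismatch, double_mismatch_score=0.2):
    """Default every dinucleotide pairing to the double-mismatch score, then overlay."""
    nts = "ATCG"
    scores = {
        "".join(watson) + "/" + "".join(crick): double_mismatch_score
        for watson in product(nts, repeat=2)
        for crick in product(nts, repeat=2)
    }
    scores.update(match)
    scores.update(single_mismatch)
    return scores


# Toy values only.
toy = build_nn_scores({"AT/TA": -0.9}, {"AT/TG": 0.1})
print(len(toy), toy["AT/TA"], toy["AT/TG"], toy["AA/AA"])
# prints: 256 -0.9 0.1 0.2
```

There are 16 possible dinucleotides on each strand, so the table always has 16 × 16 = 256 keys regardless of how many entries the two input dicts supply.
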
16 changes: 10 additions & 6 deletions src/plexus/blast/specificity.py
@@ -139,6 +139,7 @@ def run_specificity_check(
             n_checked += 1

             potential_products = amplicon_map.get((f_id, r_id), [])
+            potential_products += amplicon_map.get((r_id, f_id), [])

             off_targets = []
             on_targets = []
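
The both-key-orders lookup in this hunk can be sketched on its own. One detail that may be worth checking: when `(f_id, r_id)` is present, `dict.get` returns the list stored in the map, so `+=` extends that stored list in place. Whether that matters depends on whether the same pair is ever looked up more than once, which this hunk does not show. The variant below builds a new list instead; the helper name `products_for_pair` and the toy data are assumptions for the example.

```python
def products_for_pair(amplicon_map, f_id, r_id):
    """Collect amplicons stored under either key order without mutating the map."""
    return [
        *amplicon_map.get((f_id, r_id), []),
        *amplicon_map.get((r_id, f_id), []),
    ]


# Toy map: one amplicon in the expected orientation, one in the swapped one.
amplicon_map = {
    ("F1", "R1"): ["on_target"],
    ("R1", "F1"): ["swapped_off_target"],
}

first = products_for_pair(amplicon_map, "F1", "R1")
second = products_for_pair(amplicon_map, "F1", "R1")
assert first == second == ["on_target", "swapped_off_target"]
assert amplicon_map[("F1", "R1")] == ["on_target"]  # stored list is unchanged
```

Repeated calls return the same result here because the map entries are never modified.
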
@@ -212,15 +213,18 @@ def filter_offtarget_pairs(panel: MultiplexPanel) -> tuple[int, list[str]]:
                 f"with off-target products, {len(clean)} clean pair(s) remain"
             )
         else:
-            # All pairs have off-targets — keep the least affected one
-            best = min(junction.primer_pairs, key=lambda p: len(p.off_target_products))
-            removed = len(junction.primer_pairs) - 1
-            junction.primer_pairs = [best]
+            # All pairs have off-targets — keep all with the fewest
+            min_ot = min(len(p.off_target_products) for p in junction.primer_pairs)
+            best_pairs = [
+                p for p in junction.primer_pairs if len(p.off_target_products) == min_ot
+            ]
+            removed = len(junction.primer_pairs) - len(best_pairs)
+            junction.primer_pairs = best_pairs
             fallback_junctions.append(junction.name)
             logger.warning(
                 f"Junction {junction.name}: all pairs have off-target products; "
-                f"keeping pair {best.pair_id} with fewest "
-                f"off-targets={len(best.off_target_products)}"
+                f"keeping {len(best_pairs)} pair(s) with fewest "
+                f"off-targets={min_ot}"
             )

         total_removed += removed
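
The fallback above replaces a single `min(...)` pick with a "keep everything tied at the minimum" filter. A generic sketch of that pattern follows. The helper `keep_tied_minimum` and the tuple-based toy pairs are illustrative; the real code works on primer-pair objects with an `off_target_products` attribute.

```python
def keep_tied_minimum(items, key):
    """Return every item whose key equals the minimum, preserving input order."""
    if not items:
        return []
    best = min(key(item) for item in items)
    return [item for item in items if key(item) == best]


# Toy primer pairs as (pair_id, off_target_count) tuples.
pairs = [("p1", 3), ("p2", 1), ("p3", 1), ("p4", 2)]
kept = keep_tied_minimum(pairs, key=lambda pair: pair[1])
print(kept)
# prints: [('p2', 1), ('p3', 1)]
```

With `min(pairs, key=...)` only `p2` would survive, purely because it comes first; keeping both tied pairs leaves the choice between them to later criteria, as the changelog entry describes.
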