Digestion enzyme test az by cbielow · Pull Request #173 · cbielow/OpenMS

cbielow · 2026-05-27T14:22:58Z

Description

update of #172
after updating base branch.

However, there are still too many weird changes in here.
@TheClaervoyant
Please rebase your branch or move stuff manually if need be, such that changes here are limited to DigestionEnzyme. ;)

Checklist

Make sure that you are listed in the AUTHORS file
Add relevant changes and new features to the CHANGELOG file
I have commented my code, particularly in hard-to-understand areas
New and existing unit tests pass locally with my changes
Updated or added python bindings for changed or new classes (Tick if no updates were necessary.)

How can I get additional information on failed tests during CI

Click to expand

If your PR is failing you can check out

The details of the action statuses at the end of the PR or the "Checks" tab.
http://cdash.seqan.de/index.php?project=OpenMS and look for your PR. Use the "Show filters" capability on the top right to search for your PR number.
If you click in the column that lists the failed tests you will get detailed error messages.

Advanced commands (admins / reviewer only)

Click to expand

/reformat (experimental) applies the clang-format style changes as additional commit. Note: your branch must have a different name (e.g., yourrepo:feature/XYZ) than the receiving branch (e.g., OpenMS:develop). Otherwise, reformat fails to push.
setting the label "NoJenkins" will skip tests for this PR on jenkins (saves resources e.g., on edits that do not affect tests)
commenting with rebuild jenkins will retrigger Jenkins-based CI builds

⚠️ Note: Once you opened a PR try to minimize the number of pushes to it as every push will trigger CI (automated builds and test) and is rather heavy on our infrastructure (e.g., if several pushes per day are performed).

Lower the CLI floor on -bruker:dia_ms2_min_support from 1 to 0 so users can run frame-aggregated MS2 centroiding without the 3x3 neighbor filter. With centroid=true and min_support=0, every populated grid cell becomes a local-maxima candidate instead of being dropped when isolated. Also short-circuit the denoise step in DIAFrameAggregator when min_support<=0 to skip the now-useless 8-probe hashmap lookup per cell. Regression guard added: with the test fixture, min_support=0 retains ~6.7x more centroids (106.5M vs 15.9M peaks) than min_support=1, with spectra still emitted as IM_CENTROIDED. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

install_library() adds opentims_cpp to the install-tree export (OpenMSTargets via install(EXPORT)), but the build-tree export() call in export_macros.cmake uses _OPENMS_EXPORT_TARGETS, which is only populated by openms_register_export_target(). Without this call CMake errors at generate time: export called with target "OpenMS" which requires target "opentims_cpp" that is not in any export set. Matches the pattern used by Evergreen and IsoSpec in their own CMakeLists.txt files. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…tchContent path Two bugs in the FetchContent branch that only manifest on CI (where opentims is not installed system-wide): 1. BUILD_SHARED_LIBS=true is set in the top-level CMakeLists.txt (line 287) before cmake_findExternalLibs.cmake is included. Without an explicit override, opentims's CMakeLists.txt would produce libopentims_cpp.so instead of libopentims_cpp.a. The shared library lands in _deps/opentims-build/ rather than the system library path, so the build-time linker fails with "undefined reference to TimsDataHandle" when linking test binaries against libOpenMS.so (libopentims_cpp.so's directory is not in the test's -L search path). Fix: save/restore BUILD_SHARED_LIBS around FetchContent_MakeAvailable(opentims). 2. target_link_libraries(opentims_cpp PRIVATE sqlite3) links the bundled SQLite3 CMake target to the opentims_cpp STATIC library. For static library targets, CMake propagates PRIVATE deps into the link interface, so the export mechanism requires sqlite3 to be in the same export set. This conflicts with SQLiteCpp's own sqlite3 export and causes a CMake generate error. Fix: provide only the sqlite3 include directory (needed for compilation with OPENTIMS_LINK_SQLITE_STATICALLY); the sqlite3 symbols are resolved at final link time through OpenMS's SQLiteCpp dependency (SQLiteCpp PUBLIC-links sqlite3, so libsqlite3.a appears in libOpenMS.so's link command after libopentims_cpp.a). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…llback sentinel FindOpentims.cmake sets INTERFACE_COMPILE_DEFINITIONS to "" on the manually created UNKNOWN IMPORTED target to signal "found via header/library search, sqlite linkage unknown". The previous condition only checked MATCHES "NOTFOUND" (which CMake sets when get_target_property finds no property), so it never fired for the empty-string case — skipping the sqlite3 include/link injection for system installs that lack a CMake config package. Add OR _opentims_defs STREQUAL "" to catch this case. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The centroid-or-copy branch for MS1 frame loading was duplicated across four call sites (loadDIA_, loadDDA_, loadDIAStreaming, loadFrameMode_). Extract a static loadMS1Spectrum(frame, spec, config, centroider) that runs IM-centroiding when configured and falls back to frameToSpectrum() otherwise. Each call site drops the local do_centroid bool and the if/else branch. Pure refactor: no behavior change. Existing BrukerTimsFile_test suite (13 sections including all DIA/DDA integration paths) passes unchanged. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…hem (OpenMS#9154) merge because this path is not tested on ci

Add three missing entries for changes merged after the 2026-04-15 sync: - OpenMS#9154: BrukerTimsFile spectra marked as CENTROID - OpenMS#9148/OpenMS#9147: system opentims install fallback before FetchContent - OpenMS#9151: Bruker DIA MS2 centroiding without denoising (min_support=0) Co-authored-by: GitHub Copilot <copilot@github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: rename DIAFrameAggregator to FrameAggregator The aggregator is scope-agnostic (no SWATH assumptions inside; the MS2 scan-window filter lives in the outer loop, not the class). Renaming prepares for reuse from the MS1 aggregation path. Pure refactor: no behavior change. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: parameterize FrameAggregator m/z bin width Lift MZ_BIN_WIDTH from static constexpr to a constructor parameter. Default 0.02 Da preserves existing MS2 behavior; MS1 callers will pass a smaller value (0.01 Da) tuned for timsTOF MS1 mass accuracy. Pure refactor: no behavior change. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: guard FrameAggregator neighbor lookups against uint32 wrap at scan_id=0 finalize() and finalizeCentroided() compute scan_id + ds and mz_bin + dm with negative offsets. When either axis is 0 and the offset is negative, the unsigned arithmetic wraps to 0xFFFFFFFF and hits phantom cells. Latent in MS2 (edge scans rare), but exposed at MS1 density where peaks at scan_id == 0 are common. Fixed at all five negative-offset lookup sites via a shared neighborKey() helper that returns std::nullopt on wrap. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: use contributing_frames count for skip_denoise in aggregator loops Previously skip_denoise was based on the index span (hi - lo), which could run denoising when neighbor frames existed in the range but contributed no peaks after the SWATH-window filter. Now the outer loop counts frames that actually added peaks to the grid, and uses that as the denoise predicate. Closes the pre-existing TODO from OpenMS#9151 and prepares the same logic for reuse in the upcoming MS1 aggregation path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: add FrameCentroider::loadFrame(const double*) overload The upcoming MS1 aggregation path sums per-peak intensities across adjacent frames into FrameAggregator::Cell::intensity_sum (double). For MS1 with N=2 and a 10^9-intensity base peak, the sum reaches 5*10^9 and overflows uint32_t. Internal storage is already float (ImsPeak::intensity), so the new overload just casts double->float at the entry point. The existing uint32_t overload is untouched; MS2 callers see no change. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add ms1_n_neighbors, ms1_min_support, ms1_max_rt_distance_sec Plumb three new knobs for Bruker MS1 frame aggregation through the Config struct and the FileConverter CLI. The library side is wired up in follow-up commits; this commit is purely surface-level and has no runtime effect (defaults 0/0.0 preserve existing behavior). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add DIA MS1 default config regression section Locks in the "zero behavior change at defaults" guarantee for the upcoming MS1 frame aggregation feature. Snapshots MS1 spectrum count, peak count, and cumulative intensity with Config{} defaults, and independently reproduces with the new knobs explicitly set to 0 / 0.0 — asserts both paths produce identical numbers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add RED test for DIA MS1 frame aggregation Asserts that ms1_n_neighbors=1 produces an intensity-boosted MS1 output relative to the raw per-frame path. Fails until loadAggregatedMS1Spectrum is implemented (next task). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: implement MS1 frame aggregation in Bruker DIA loader Add loadAggregatedMS1Spectrum() helper that builds one aggregated MS1 spectrum per center frame from a sliding window of 2N+1 adjacent MS1 frames, using FrameAggregator with a 0.01 Da bin (finer than the MS2 default). Composes with FrameCentroider for within-frame IM centroiding when ms1_centroid_mz_ppm/pct are set. Wires the new helper into loadDIA_ via an explicit branch on config.ms1_n_neighbors > 0, preserving the per-frame code path when aggregation is disabled. Also adds FrameAggregator::reserve() to pre-size the grid bucket array (MS1 grids reach ~300k cells per center at 0.01 Da bins on dense diaPASEF MS1). DIA MS1 aggregation test bounds re-tuned to [2.5, 3.1] — MS1 ion populations are more stable across adjacent frames than MS2 fragments, so 3-frame aggregation gives close to the naive 3x sum (empirical 2.99) rather than MS2's ~1.28x. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: MS1 frame aggregation for Bruker DDA-PASEF loader Extend the ms1_n_neighbors dispatch to loadDDA_ and add a DDA-specific test section validating both the aggregation intensity boost and the ms1_max_rt_distance_sec cap's ability to truncate the neighbor window on irregular DDA frame cadences. The STATUS line reports cap_reduction so future test-runners can re-tune the cap if the fixture's MS1 cadence changes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: MS1 frame aggregation in Bruker DIA streaming loader Extend the ms1_n_neighbors dispatch to loadDIAStreaming so OpenSwath consumers see an aggregated MS1 survey when requested. Preserves the per-center-frame cadence invariant that SwathFileConsumer relies on. Adds a streaming-specific test section validating spectrum count equality with the raw MS1 frame count. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: MS1 aggregation composition, min_support, RT cap, FRAME mode Four new test sections exercising every leaf of the MS1 aggregation feature: - DIA MS1 aggregation with centroiding test — confirms composition with FrameCentroider. Centroider caps output at MAX_CENTROID_PEAKS=10000 per frame, so on dense MS1 surveys (~600k aggregated peaks/frame) the top 10k carry ≈0.32 of total intensity. Bounds [0.1, 1.0] accept this. - DIA MS1 aggregation min_support test — confirms denoise works: drop_frac ≈ 0.52, intensity still 1.8× raw (signal peaks survive). - DIA MS1 RT cap test — 0.5s cap at N=2 reduces intensity by 80% (DIA MS1 cadence ~1.5s; tight cap collapses the window to just the center frame, so intensity goes from 5x raw down to 1x raw). - FRAME mode ignores ms1_n_neighbors test — FRAME export emits one spectrum per raw MS1 frame regardless of the aggregation knob. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: warn-and-ignore + partial-config warnings for MS1 aggregation - loadFrames_: warn when ms1_n_neighbors > 0 in FRAME export mode, proceed per-frame (preserves FRAME's "raw frames" contract). - warnPartialMS1AggregationConfig(): one-shot warnings for three cases: - ms1_min_support > 0 with ms1_n_neighbors == 0 (knob inert) - ms1_max_rt_distance_sec > 0 with ms1_n_neighbors == 0 (knob inert) - 2*N+1 > ms1_frame_count (window clamped at run edges) Called from loadDIA_, loadDDA_, and loadDIAStreaming after ms1_frame_ids is populated. Mirrors the existing isCentroidingEnabled partial-config warn pattern. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: CHANGELOG entry for Bruker MS1 frame aggregation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: ms1_centroid_max_peaks knob + per-load cap-hit summary The FrameCentroider had a hardcoded MAX_CENTROID_PEAKS=10000 cap that silently dropped up to 68% of intensity on aggregated MS1 surveys (flagged by Codex review of PR OpenMS#9158). Exposing the cap as a knob and raising the default to 100000 makes the loss recoverable. Changes: - New Config::ms1_centroid_max_peaks field (default 100000). - New CLI option -bruker:ms1_centroid_max_peaks (default 100000, min 1). - Plumbed through FileConverter::getBrukerConfig_. - FrameCentroider::centroid() signature accepts max_peaks parameter; default constant renamed to DEFAULT_CENTROID_MAX_PEAKS (=10000). - FrameCentroider tracks cap-hit count + max dropped-peak intensity per load; reportCapSummary() emits ONE summary WARN per load() (not one per spectrum). Warning only fires when dropped peaks are above noise floor (>200 intensity counts). - reportCapSummary() called at the end of loadDIA_, loadDDA_, loadDIAStreaming, and loadFrames_ level-1 loops. On the DIA test fixture: at the 100k default the cap is hit on ~97% of spectra but only noise-floor peaks are dropped, so the summary warning stays silent. Intensity retention jumps from 32% (at 10k) to 70% (at 100k). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs,test: address CodeRabbit review comments on PR OpenMS#9158 All four comments were Minor but legitimate: 1. CHANGELOG used an ambiguous "ms1_centroid_mz_ppm/pct" shorthand. Spell out both identifiers explicitly, plus the new ms1_centroid_max_peaks knob. 2. FRAME mode test locked only spectrum count. A silent content corruption (e.g. aggregation partially running but writing to per-frame output) would pass. Now also asserts peak count and cumulative intensity equality between ref and agg runs. 3. DIA MS1 streaming aggregation test only asserted the cadence invariant (meta.nr_ms1_spectra), which is independent of whether ms1_n_neighbors is honored at all. Now also runs a non-aggregated streaming baseline and asserts total MS1 peaks(agg) > peaks(ref) — proof that aggregation actually ran. Also replaces swath_maps.front() with a proper m.ms1-flag lookup. 4. -bruker:ms1_min_support accepted any non-negative integer, but in a 3x3 grid with center excluded the max possible neighbors is 8. Added setMaxInt_(..., 8); values >8 are now rejected at the CLI boundary instead of silently dropping every peak. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

… (OpenMS#9153) LinearResamplerAlign is a strict superset of LinearResampler (adds explicit raster alignment, ppm-spaced rasters, chromatogram support and a linear-interpolation mode). Delete the narrower LinearResampler class and redirect its callers (Tutorial_SavitzkyGolayFilter, ImageCreator, DefaultParamHandlerDocumenter, pyOpenMS binding) to LinearResamplerAlign. Also add a runtime warning (verifySpacing_) when the configured spacing is finer than the minimum distance between adjacent input points, which can produce spurious peaks below the detector dead time. Co-authored-by: Timo Sachsenberg <timo.sachsenberg@uni-tuebingen.de> Co-authored-by: Luis Jacob Keller <noreply@github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…MS#9155) We want to use the submodule revision hash as the way to connect an OpenMS release to a version of contrib. By doing this we make releasing OpenMS easier because we don't have to tag/release contrib. Yes, I know this is slower for building on Windows. But contrib is going away in the next couple of months. I put the contrib build in its own job so that we don't hit run-time limits.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

…workaround (OpenMS#9160) * feat: Bruker DDA native ID + load_ms1 toggle Two related BrukerTimsFile changes: 1. DDA MS2 native IDs change from "scan=<precursor_id>" to "frame=<frame_id> scan=<scan_begin> precursor=<precursor_id>". Matches the DIA MS2 convention and carries all three physical identifiers (source frame, TIMS isolation-window start, Bruker Precursors.Id). Unblocks downstream cross-referencing and is the format mzParser/Comet require a workaround for (handled in CometAdapter). 2. New Config::load_ms1 bool (default true) + -bruker:load_ms1 CLI toggle. When false, MS1 spectra are not emitted to the MSExperiment or streaming consumer; applies to loadDIA_, loadDDA_, loadDIAStreaming, and loadFrames_ level==1. Partial-config warning fires if load_ms1=false combined with ms1_n_neighbors>0 (inconsistent). Two new test sections: - "DDA native ID format test" — asserts native ID carries frame=, scan=, precursor= components and starts with frame=. - "Bruker load_ms1=false test" — asserts zero MS1 and unchanged MS2 count for DDA, DIA, and FRAME export modes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(CometAdapter): workaround mzParser segfault on Bruker DDA mzML Comet's bundled mzParser (MSToolkit/saxmzmlhandler.cpp) grep-matches native IDs for "scan=" and uses atoi of the following digits as its spectrum-index sort key. When Bruker DDA MS2 IDs of the form "frame=F scan=S precursor=P" are written alongside MS1s that take the counter path ("frame=F"), scanNum values can collide — and cindex::compare() in MSToolkit/include/mzParser.h returns true for equal elements, violating std::sort's strict-weak-ordering invariant. The resulting UB manifests as a segfault during std::string shuffling in the sort. Workaround (upstream fix for cindex::compare should still be filed against UWPR/Comet separately): 1. Before writing the temp mzML fed to Comet, rewrite every native ID to "index=N" (monotonic counter). mzParser routes these through its counter path — no sort, no UB. Original ID is preserved as a per-spectrum MetaValue. 2. Let Comet run normally; it emits pep.xml with the rewritten IDs. 3. After IM/FAIMS annotation (which operates on the rewritten IDs in both pep.xml and exp), translate PSM spectrum_reference values back to the original Bruker IDs via a rewritten→original map. Also: default bruker_config.load_ms1=false when searching MS2. Comet only needs MS2; loading MS1 just wastes memory and time. Can be overridden via -bruker:load_ms1=true if the caller needs MS1 available in `exp` for custom post-processing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: CHANGELOG for Bruker DDA native ID + load_ms1 + Comet workaround Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: keep DDA native ID CV-compliant (MS:1002818) CodeRabbit review flagged that "frame=F scan=S precursor=P" violates the PSI-MS CV term MS:1002818 ("Bruker TDF nativeID format"), which mandates exactly "frame=<FRAME_ID> scan=<SCAN_ID>" — and BrukerTimsFile advertises that accession at BrukerTimsFile.cpp:1297-1298. Revert to the strict pattern and move the Bruker Precursors.Id SQL key out of the nativeID into a per-spectrum MetaValue "bruker_precursor_id". Downstream code that needs the SQL key reads the MetaValue; the native ID stays CV-compliant and mzParser-stable. Test updated to assert the CV pattern (frame= prefix, no precursor= in ID) and that bruker_precursor_id is present as a MetaValue. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address CodeRabbit review on PR OpenMS#9160 - BrukerTimsFile::transform(): respect config.load_ms1 when computing the consumer's expected spectrum count, so setExpectedSize matches what the underlying loadFrames_/loadDIA_/loadDDA_ paths actually emit - CometAdapter: drop the misleading claim that -bruker:load_ms1=true can override the forced MS1 skip (the option isn't registered here) - BrukerTimsFile_test: brace single-line if per OpenMS coding guidelines Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(bruker): correctness fixes from review of PR OpenMS#9160 Three confirmed correctness bugs surfaced during the PR review: 1. BrukerTimsFile::transform() computed the DIA expected-spectrum count as counts.ms2 * windows.size(), but each MS2 frame belongs to exactly ONE WindowGroup. Actual emit count is the per-group sum Σ (frames_in_group × windows_in_group), identical to loadDIA_'s total_work formula. With multiple WindowGroups the previous formula overcounted by a factor proportional to the number of groups. 2. DDA MS2 nativeID collisions. This reader aggregates all PasefFrameMsMsInfo rows sharing the same Precursors.Id (potentially across multiple frames) into ONE spectrum — a timsrust-style design pwiz does NOT use (pwiz emits per-mobility-scan, where (frame, scan) is inherently unique). Under our per-precursor aggregation, first_entry->(frame_id, scan_begin) is NOT guaranteed unique across distinct precursors (PASEF co-isolation with different IsolationMz, multi-cycle precursors colliding on their first entry). Restore the trailing "precursor=<P>" token to the nativeID. This matches pwiz's combined-mode convention ("merged=N frame=F scanStart=A scanEnd=B" under the same MS:1002818 accession) and OpenMS's own DIA path ("frame=F windowGroup=G scan=S"). The CV accession MS:1002818 stays; Comet/Sage dispatch on nativeID content, not the CV accession, so downstream compatibility is preserved. Precursors.Id remains duplicated as a typed MetaValue "bruker_precursor_id" for direct lookup. Test adds a uniqueness assertion on the full set of DDA MS2 native IDs. 3. SpectrumNativeIDParser did not recognize MS:1002818 (accession-based overload returned -1; content-based overload didn't recognize the "frame=" prefix). Downstream tools relying on scan-number extraction (SpectrumLookup, USI, PepXMLFile, PercolatorInfile, IDMapper, ConsensusMapArrowExport) silently produced -1 or used the bare-int fallback for Bruker .d output. Add MS:1002818 to the "scan=" accession bucket and "frame=" to the prefix map; both route to the existing scan=(\d+) regex which picks up the scan_begin integer. CometAdapter's existing comments at :725 and :853 already described the "frame=F scan=S precursor=P" format and are now correct again without modification. CHANGELOG updated to document the real format and the downstream parser work. rationale for the nativeID format is documented inline in BrukerTimsFile::loadDDA_. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ProSE): swap cal_lower/cal_upper to match signed-error convention ProSEAlgorithm::runCalibrationPass_ was reflecting the calibration window around zero. Given the (lower, upper) schema established at ProSEAlgorithm.cpp:1615-1617 (theoretical ∈ [observed - lower, observed + upper] ⇔ signed error e ∈ [-upper, +lower]), mapping the intended window [shift - spread, shift + spread] requires: cal_lower = spread + shift // was: spread - shift cal_upper = spread - shift // was: spread + shift The filter at lines 1624-1625 already uses the correct convention; only the write-back at 1667-1668 was inverted. Impact on the Bruker DDA fixture (PR OpenMS#9160's opentims integration test): on a precursor error distribution biased +3.57 ppm with 6.02 ppm spread, the bug made FragmentIndex accept e ∈ [-9.59, +2.44] (hard-capped, opposite-sign) instead of [-2.44, +9.59]. 1% FDR PSMs went from 1441 (buggy) to 4537 (fixed) — uncalibrated baseline is 4373, so calibration now correctly improves ID yield by ~4% instead of degrading it 3x. Bug introduced in commit 7445cd6 (PR OpenMS#9132, "asymmetric precursor tolerance window"). Pre-existing on develop; surfaced because PR OpenMS#9160 added a Bruker DDA integration test that exposes the signed precursor bias through which this swap becomes visible. Also adds floor-only PSM-count regression tripwires for the three DDA pipelines (SimpleSearchEngine, ProSE, ProSE calibrated). Floors sit ~5% below observed yields, catching real regressions while tolerating small algorithmic drift. The calibrated floor specifically guards the swap fix — a regression would drop PSMs back to ~1441 and trigger the floor. The check is a pure-CMake -P script (file(READ) + string(REGEX MATCHALL) since file(STRINGS) silently undercounts on large idXML files). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(bruker): tighten PSM-count floors to ~0.5% below current Original floors were ~5% below observed counts, too loose to catch a subtle regression (e.g., a partial nativeID or IM-annotation break that drops 2% of PSMs). Tighten to ~0.5%: SimpleSearchEngine DDA: 3900 -> 4170 (current 4187) ProSE DDA: 4100 -> 4350 (current 4373) ProSE DDA calibrated: 4300 -> 4510 (current 4537) If a legitimate change raises yield, the floor should be raised in the same commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(ProSE): align calibration-asymmetry assertion with fixed convention The calibration asymmetry check in the "calibration preserves asymmetric bias - normal case" section was matched to the previous buggy ordering (cal_upper > cal_lower for positive bias). After fixing the swap in runCalibrationPass_, the correct inequality is cal_lower > cal_upper: under (lower, upper) where signed error e ∈ [-upper, +lower], a positive bias widens +lower and tightens -upper. The order-invariant downstream uses (std::min(cal.cal_lower, cal.cal_upper) on line 1299) are unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(ProSE): strengthen calibration asymmetry-case assertions Previous version used tautological checks (cal_lower < 20.0 is trivially true because the code does std::min(raw, 20.0)) and a way-too-loose 5 ppm tolerance on the shift, so it couldn't actually catch a regression of the sign convention. Replace with: - Direction check: shift > 0 and spread > 0 for this positive-biased fixture - |shift| < spread precondition stated explicitly - Load-bearing functional identities binding the writeback mapping: cal_lower = spread + shift (upper endpoint of signed-e window) cal_upper = spread - shift (|lower endpoint| of signed-e window) These directly encode the (lower, upper) <-> [-upper, +lower] convention and would fail simultaneously if the swap bug were re-introduced. - Asymmetry check (cal_lower > cal_upper for positive bias) as a cross-check against mislabeled shift/spread producing consistent-but-wrong identities. - Keep the "< 20, < 30" checks only as a witness that the std::min cap is inactive (so the identities above are unconstrained), NOT as primary behavior assertions. Dropped the hardcoded shift=7.0 / spread=8.5 / cal_lower=15.5 / cal_upper=1.5 literals: the search pipeline doesn't include every synthetic shift in the calibration set, so those specific numbers depend on search-side filtering details. The functional identities are immune to that. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add entry for contrib directory converted to git submodule (OpenMS#9155): developers and packagers building from source now need 'git submodule update --init' to obtain vendored third-party libraries. Co-authored-by: GitHub Copilot <copilot@github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(ProSE): add -out_pin for Percolator input file output Registers -out_pin <files> (per-input, like -out_idxml) that writes Percolator .pin/.tsv files via PercolatorInfile::store(). Works independently of -percolator_executable — users can produce .pin files for external rescoring without running Percolator from within ProSE. The search algorithm already populates the extra_features metavalue on ProteinIdentification::SearchParameters with the correct feature set (score, fragment_error_median_ppm, matched_prefix/suffix_ions_ fraction). This commit just wires it through the TOPP tool's CLI and output loop. Closes item 4 in OpenMS#9108. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ProSE): mark -out_pin as advanced parameter Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ProSE): address CodeRabbit review on -out_pin (OpenMS#9166) Three bugs found by CodeRabbit, all in the .pin output block: 1. (Critical) min_charge parsed via String(sp.charges).toInt() on the full "2:5" string — throws ConversionError, silently preventing .pin output on every run. Fixed: split on ':' for both min and max. 2. (Major) Duplicate "score" column — hardcoded "score" + extra_features (which already contains "score") → malformed .pin header. Fixed: only insert "score" if not already present in extra_features. 3. (Minor) result.protein_ids.front() without empty guard. Fixed: check empty and log error before accessing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ProSE declared ions:add_a/b/c/x/y/z_ions parameters in its constructor but updateMembers_() never read them. FragmentIndex received and used them correctly (via setParameters in prepareContext), but TheoreticalSpectrumGenerator in the scoring loop always used its defaults (b/y only) — making the index and scorer inconsistent. Fix: read the 6 ion toggles in updateMembers_() and forward them to TheoreticalSpectrumGenerator alongside the existing add_first_prefix_ion and add_metainfo settings. Users can now configure ETD-style searches with -Search:ions:add_c_ions true -Search:ions:add_z_ions true -Search:ions:add_b_ions false -Search:ions:add_y_ions false and have both the index and scoring honor the setting. Closes item 12 in OpenMS#9108. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…y be A-Z, if an unknown character is recorded, it throws an Exception.

…EnzymeProtein. Created also Tests and whatnot.

…ngende Files geändert.

github-actions · 2026-05-27T14:23:53Z

@cbielow The following names might be missing from the AUTHORS file:

Alen Saric

Please consider applying the following patch:

--- AUTHORS	2026-05-27 14:23:47.953841885 +0000
+++ AUTHORS.new	2026-05-27 14:23:52.817029772 +0000
@@ -6,6 +6,7 @@
 authors contributing to a specific piece of code, please refer to
 the authors tag in the respective file header.
  - Achal Bajpai
+ - Adithya Kammati
  - Aditya Muzumdar
  - Ahmed Khalil
  - Alen Saric
@@ -59,6 +60,7 @@
  - Johannes Veit
  - Johannes von Kleist
  - Johan Teleman
+ - Jonas Scheid
  - Joshua Charkow
  - Juliane Schmachtenberg
  - Julianus Pfeuffer
@@ -73,6 +75,7 @@
  - Lenny Kovac
  - Leon Bichmann
  - Leon Kuchenbecker
+ - LuBox2K
  - Lucia Espona
  - Luis Jacob Keller
  - Lukas Mueller

timosachsenberg and others added 20 commits May 23, 2026 13:35

fix: mark Bruker TDF spectra as CENTROID so downstream tools accept t…

894caef

…hem (OpenMS#9154) merge because this path is not tested on ci

Fix outdated links in contributing guide

4e1703b

Update CONTRIBUTING.md

9a8925d

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Fix capitalization in documentation requirements

622398f

Implemented a checkup for the cleavge points and stops. It should onl…

69f08c3

…y be A-Z, if an unknown character is recorded, it throws an Exception.

Shifted the Constructor of DigestionEnzyme. Refactored into Digestion…

5cca9e9

…EnzymeProtein. Created also Tests and whatnot.

Letzter Stand der Dinge - Base Class Test geschrieben und zuzsammenhä…

892038b

…ngende Files geändert.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Digestion enzyme test az#173

Digestion enzyme test az#173
cbielow wants to merge 20 commits into
cbielow:developfrom
TheClaervoyant:DigestionEnzymeTestAZ

cbielow commented May 27, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants

Conversation

cbielow commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Checklist

How can I get additional information on failed tests during CI

Advanced commands (admins / reviewer only)

Uh oh!

github-actions Bot commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants

cbielow commented May 27, 2026 •

edited

Loading