Add nearest-RT FID/MS matching for split-GC (no retention-index step) by niklashoelter · Pull Request #3 · FelixKatz77/pyGecko

niklashoelter · 2026-06-02T08:27:25Z

Summary

For the new split GC (one injection split post-column to both the MS and the Polyarc-FID), the FID and MS traces come from a single run and share one retention-time axis, so retention-index (RI) bridging — and its alkane-standard requirement — is unnecessary. This branch adds nearest-retention-time matching as a sibling to the existing RI path and validates it end-to-end on real data.

RI matching is kept as the default for the legacy two-machine workflow; nothing in that path changes.

What changed

- Injection.match_rt(rt, func, tolerance): RT analogue of match_ri. func maps the source (MS) retention time to the expected (FID) retention time (identity by default, so a drift model can plug in later without touching the matcher). tolerance defaults to 1 s. The internal standard is excluded up front, and the real FID Peak is returned.
- Analysis gains matching='ri'|'rt' (default 'ri', fully back-compatible) on calc_plate_yield / calc_plate_conv, with rt_func / rt_tolerance. The height-ratio tie-break is reused for both modes, and the plate reshape is now layout-driven so non-8x12 plates (e.g. the A1..K3 split-GC sequence) work.
- Analysis.constant_offset(b=0.0) and Analysis.linear_drift(a, b): factories for rt_func (t_fid = t_ms + b and t_fid = a*t_ms + b respectively).
- Product_Array.get_product(pos), so the plate engine works with product-only layouts (it previously required Reaction_Array).
- New example examples/split_gc/plate_processing.py (+ splitgc_layout.csv) on the FBS-FB-021-ALL.rslt data, using matching='rt'.
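
For readers who want the idea without opening the diff, here is a minimal, self-contained sketch of nearest-retention-time matching with a pluggable rt_func. It is not the pyGecko implementation: the Peak dataclass, its is_standard flag, and the free-function form of match_rt are assumptions made for illustration. Only the behaviour described above is mirrored (map the MS time through func, exclude the internal standard, pick the nearest FID peak, reject it outside the tolerance). Retention times are assumed to be in minutes, so the 1 s default is 1/60.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Peak:
    """Hypothetical stand-in for a FID peak; rt is in minutes."""
    rt: float
    height: float
    is_standard: bool = False


def constant_offset(b: float = 0.0) -> Callable[[float], float]:
    """rt_func factory: t_fid = t_ms + b."""
    return lambda t_ms: t_ms + b


def linear_drift(a: float, b: float) -> Callable[[float], float]:
    """rt_func factory: t_fid = a * t_ms + b."""
    return lambda t_ms: a * t_ms + b


def match_rt(
    peaks: Sequence[Peak],
    rt: float,
    func: Optional[Callable[[float], float]] = None,
    tolerance: float = 1 / 60,
) -> Optional[Peak]:
    """Return the FID peak nearest to func(rt), or None if none is within tolerance."""
    func = func if func is not None else constant_offset(0.0)
    expected = func(rt)
    # The internal standard is excluded before searching, as in the PR description.
    candidates = [p for p in peaks if not p.is_standard]
    if not candidates:
        return None
    best = min(candidates, key=lambda p: abs(p.rt - expected))
    return best if abs(best.rt - expected) <= tolerance else None


# Illustrative peaks only; these are not values from the FBS-FB-021-ALL data.
fid_peaks = [
    Peak(rt=5.000, height=900.0, is_standard=True),
    Peak(rt=6.210, height=120.0),
    Peak(rt=6.400, height=80.0),
]

hit = match_rt(fid_peaks, 6.200, func=constant_offset(0.010))
miss = match_rt(fid_peaks, 5.000)  # only the excluded standard elutes near 5.0 min
```

Because the matcher only ever calls func(rt), swapping constant_offset for linear_drift (or any other callable) changes the expected FID time without touching the search itself. The height-ratio tie-break mentioned above is deliberately left out of this sketch.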

Validation (real FBS-FB-021-ALL data, 33 wells A1..K3)

- Dilution series recovered cleanly across the three replicate columns (col 1 ~100% / col 2 ~50% / col 3 ~17%).
- The measured FID-MS offset is consistent: +0.0094 min mean / +0.010 min median (FID elutes later), matching the expected ~0.01 min splitter dead volume. The example sets RT_FUNC = Analysis.constant_offset(0.010), which centres the ±1 s window and recovers two tolerance-edge wells (B3, F1), giving 29/33 matched.
- The 4 unmatched wells fail at MS identification, not RT matching: the E-row analyte (Decanenitrile) has a weak/absent EI molecular ion, so match_mol cannot find it, and I2 is a further single-injection MS miss.
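
The effect of the constant offset on the match window can be checked with a few lines of arithmetic. The sketch below assumes retention times in minutes and writes delta for the raw FID-MS difference (t_fid - t_ms), so a well matches when |delta - b| <= tolerance. The helper names are hypothetical, and EDGE_DELTA is an invented illustrative value, not a measurement from the data set.

```python
from typing import Tuple

TOLERANCE_MIN = 1 / 60   # the 1 s match window, expressed in minutes
OFFSET_MIN = 0.010       # value passed to constant_offset in the example script


def accepted_range(b: float, tol: float) -> Tuple[float, float]:
    """Range of raw FID-MS differences (min) that still match for offset b."""
    return (b - tol, b + tol)


def is_matched(delta: float, b: float, tol: float) -> bool:
    """True if a raw FID-MS difference delta falls inside the window centred on b."""
    return abs(delta - b) <= tol


uncorrected = accepted_range(0.0, TOLERANCE_MIN)
corrected = accepted_range(OFFSET_MIN, TOLERANCE_MIN)

for label, (low, high) in (("identity", uncorrected), ("offset 0.010", corrected)):
    print(f"{label:>12}: {low * 60:+.2f} s to {high * 60:+.2f} s")

# Hypothetical tolerance-edge well: 0.018 min is just outside the uncorrected window.
EDGE_DELTA = 0.018
edge_without_offset = is_matched(EDGE_DELTA, 0.0, TOLERANCE_MIN)
edge_with_offset = is_matched(EDGE_DELTA, OFFSET_MIN, TOLERANCE_MIN)
```

With the identity function the accepted range is -1.0 s to +1.0 s around the MS time; with the 0.010 min offset it shifts to -0.4 s to +1.6 s. A systematic lag of about 0.6 s therefore sits near the middle of the window instead of near its late edge.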

How to switch modes

- Plate API: Analysis.calc_plate_yield(ms_seq, fid_seq, layout, matching='rt', rt_func=Analysis.constant_offset(0.010), rt_tolerance=1/60), where 1/60 min is the 1 s window. For the legacy path, omit matching or pass matching='ri' (with ri_tolerance).
- Per-injection: fid_injection.match_rt(ms_peak.rt, func=..., tolerance=...) in place of match_ri.

Split-GC runs split one injection post-column to both MS and Polyarc-FID, so FID and MS peaks share a retention-time axis and no longer need retention-index (RI) matching via an alkane standard. Add an rt-matching mode alongside the existing ri-matching:

- Injection.match_rt(rt, func, tolerance): nearest-retention-time analogue of match_ri. func maps the source (MS) rt to the expected (FID) rt (identity by default); tolerance defaults to 1 s. Excludes the internal standard and returns the real FID peak.
- Analysis: matching='ri'|'rt' option (default 'ri', back-compat) wired through calc_plate_yield/calc_plate_conv, with rt_func/rt_tolerance. The height-ratio tie-break is reused for both modes. Plate reshape is now layout-driven so non-8x12 plates (split-GC A1..K3) work.
- Analysis.constant_offset(b)/linear_drift(a,b): rt_func factories (t_fid = t_ms + b; t_fid = a*t_ms + b). constant_offset(0) is the default.
- examples/split_gc/plate_processing.py (+ splitgc_layout.csv): new example on the same FBS-FB-021-ALL.rslt data using matching='rt'.
- Remove stray DEBUG print in RI_Calibration.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
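
As a rough illustration of what "layout-driven" plate reshaping can mean, the sketch below derives the grid shape from the well positions that are actually present instead of assuming 8x12. This is an assumption about the general approach, not the code in this branch: parse_well and reshape_plate are hypothetical helpers, and well names are assumed to be a single row letter followed by a column number.

```python
import re
from typing import Dict, List, Optional, Tuple


def parse_well(pos: str) -> Tuple[str, int]:
    """Split a well name such as 'K3' into its row letter and column number."""
    match = re.fullmatch(r"([A-Z])(\d+)", pos)
    if match is None:
        raise ValueError(f"unrecognised well position: {pos!r}")
    return match.group(1), int(match.group(2))


def reshape_plate(
    values: Dict[str, float],
) -> Tuple[List[str], List[int], List[List[Optional[float]]]]:
    """Arrange per-well values into a rows x columns grid sized by the layout itself."""
    parsed = {parse_well(pos): value for pos, value in values.items()}
    rows = sorted({row for row, _ in parsed})
    cols = sorted({col for _, col in parsed})
    # Wells absent from the layout are left as None rather than padded to 8x12.
    grid = [[parsed.get((row, col)) for col in cols] for row in rows]
    return rows, cols, grid


# Placeholder values for the A1..K3 layout (11 rows x 3 columns = 33 wells).
wells = {f"{row}{col}": 0.0 for row in "ABCDEFGHIJK" for col in (1, 2, 3)}
rows, cols, grid = reshape_plate(wells)
```

For the A1..K3 sequence this yields an 11 x 3 grid, and a standard 96-well layout would still come out as 8 x 12 from the same function.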

Validated examples/split_gc/plate_processing.py end-to-end on the real FBS-FB-021-ALL.rslt sequence (33 wells, A1..K3).

- pygecko/reaction/array.py: add Product_Array.get_product(pos). The plate engine (Analysis.__match_and_quantify) looks up the well analyte via layout.get_product(pos), which previously existed only on Reaction_Array. Product_Array stores product SMILES directly, so get_product returns the well entry; the analysis engine now works for product-only layouts.
- examples/split_gc/plate_processing.py: set RT_FUNC to Analysis.constant_offset(0.010). The measured FID-MS offset is a consistent +0.0094 min mean / +0.010 min median (FID elutes later, matching the ~0.01 min splitter dead volume). The offset centres the ±1 s match window and recovers two tolerance-edge wells (B3, F1), giving 29/33 matched. The 4 remaining misses are MS-side (E-row Decanenitrile has a weak/absent EI molecular ion; I2 is a single-injection MS miss), not nearest-RT matching.
- Add validated reference outputs (yields CSV + plate heatmap PNG).

niklashoelter and others added 2 commits May 31, 2026 21:15

niklashoelter wants to merge 2 commits into FelixKatz77:feature/split-gc from niklashoelter:feature/split-gc-no-ri
