Add nearest-RT FID/MS matching for split-GC (no retention-index step) #3

Open
niklashoelter wants to merge 2 commits into FelixKatz77:feature/split-gc from niklashoelter:feature/split-gc-no-ri


@niklashoelter


Summary

For the new split GC (one injection split post-column to both the MS and the Polyarc-FID), the FID and MS traces come from a single run and share one retention-time axis, so retention-index (RI) bridging — and its alkane-standard requirement — is unnecessary. This branch adds nearest-retention-time matching as a sibling to the existing RI path and validates it end-to-end on real data.

RI matching is kept as the default for the legacy two-machine workflow; nothing in that path changes.

What changed

  • Injection.match_rt(rt, func, tolerance): RT analogue of match_ri. func maps the source (MS) retention time to the expected (FID) retention time; it is the identity by default, so a drift model can plug in later without touching the matcher. tolerance defaults to 1 s. The internal standard is excluded up front, and the real FID Peak is returned.
  • Analysis gains matching='ri'|'rt' (default 'ri', fully back-compatible) on calc_plate_yield / calc_plate_conv, with rt_func / rt_tolerance. The height-ratio tie-break is reused for both modes, and the plate reshape is now layout-driven so non-8x12 plates (e.g. the A1..K3 split-GC sequence) work.
  • Analysis.constant_offset(b=0.0) and Analysis.linear_drift(a, b) factories for rt_func.
  • Product_Array.get_product(pos) so the plate engine works with product-only layouts (it previously required Reaction_Array).
  • New example examples/split_gc/plate_processing.py (+ splitgc_layout.csv) on the FBS-FB-021-ALL.rslt data, using matching='rt'.
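The nearest-RT behaviour described above can be sketched in isolation. This is a minimal illustration, not the code in this branch: the Peak dataclass, its is_standard flag, and the free-standing match_rt, constant_offset, and linear_drift functions are stand-ins written for this example, and retention times and the tolerance are assumed to be in minutes.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Peak:
    """Hypothetical stand-in for a FID peak (not the pygecko class)."""
    rt: float                  # retention time, assumed to be in minutes
    is_standard: bool = False  # True for the internal-standard peak


def constant_offset(b: float = 0.0) -> Callable[[float], float]:
    """rt_func factory: t_fid = t_ms + b."""
    return lambda t_ms: t_ms + b


def linear_drift(a: float, b: float) -> Callable[[float], float]:
    """rt_func factory: t_fid = a * t_ms + b."""
    return lambda t_ms: a * t_ms + b


def match_rt(
    peaks: Sequence[Peak],
    rt: float,
    func: Optional[Callable[[float], float]] = None,
    tolerance: float = 1 / 60,  # 1 s expressed in minutes
) -> Optional[Peak]:
    """Return the non-standard peak nearest to func(rt), or None if it
    lies outside the tolerance window."""
    func = func or constant_offset(0.0)  # identity mapping by default
    expected = func(rt)
    # Exclude the internal standard before searching.
    candidates = [p for p in peaks if not p.is_standard]
    if not candidates:
        return None
    best = min(candidates, key=lambda p: abs(p.rt - expected))
    return best if abs(best.rt - expected) <= tolerance else None


# Illustrative values only, not taken from the FBS-FB-021-ALL data.
fid_peaks = [Peak(5.000, is_standard=True), Peak(5.012), Peak(6.500)]
hit = match_rt(fid_peaks, 5.000, func=constant_offset(0.010))
miss = match_rt(fid_peaks, 5.000, tolerance=0.005)
```

Because the mapping is passed in as a plain callable, swapping the identity for an offset or a drift model changes only the expected retention time; the search and the tolerance check stay the same.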

Validation (real FBS-FB-021-ALL data, 33 wells A1..K3)

  • Dilution series recovered cleanly across the three replicate columns (col 1 ~100% / col 2 ~50% / col 3 ~17%).
  • Measured FID-MS offset is consistent: +0.0094 min mean / +0.010 min median (FID elutes later), matching the expected ~0.01 min splitter dead-volume. The example sets RT_FUNC = Analysis.constant_offset(0.010), which centres the ±1 s window and recovers two tolerance-edge wells (B3, F1), giving 29/33 matched.
  • The 4 unmatched wells fail at MS identification, not RT matching: the E-row analyte (Decanenitrile) has a weak or absent EI molecular ion, so match_mol cannot find it, and I2 is a further single-injection MS miss.
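The effect of the constant offset on the match window can be checked with a few lines of arithmetic. The sketch below assumes all times are in minutes (so the 1 s tolerance is 1/60 min) and uses the mean offset and correction quoted above; the 0.018 min edge-case offset is a made-up value for illustration, not a measured one.

```python
TOL_MIN = 1 / 60          # 1 s tolerance expressed in minutes
MEAN_OFFSET_MIN = 0.0094  # measured mean FID-MS offset (from the validation)
CORRECTION_MIN = 0.010    # value passed to constant_offset in the example


def residual(offset: float, correction: float = 0.0) -> float:
    """Distance between the observed FID rt and the expected FID rt."""
    return offset - correction


def within_window(offset: float, correction: float = 0.0,
                  tol: float = TOL_MIN) -> bool:
    """True if a peak with this FID-MS offset falls inside the window."""
    return abs(residual(offset, correction)) <= tol


def accepted_offsets(correction: float = 0.0, tol: float = TOL_MIN):
    """Range of true FID-MS offsets that the window accepts."""
    return (correction - tol, correction + tol)


# Mean residual in seconds without and with the correction.
uncorrected_s = residual(MEAN_OFFSET_MIN) * 60
corrected_s = residual(MEAN_OFFSET_MIN, CORRECTION_MIN) * 60

# Hypothetical edge-case well whose offset is slightly above 1 s.
EDGE_OFFSET_MIN = 0.018
edge_uncorrected = within_window(EDGE_OFFSET_MIN)
edge_corrected = within_window(EDGE_OFFSET_MIN, CORRECTION_MIN)
```

Without the correction the window accepts offsets of roughly -0.0167 to +0.0167 min, so the typical +0.0094 min offset already uses more than half of the positive side. With the correction the accepted range shifts to roughly -0.0067 to +0.0267 min, which is why wells sitting just past the uncorrected edge can be recovered.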

How to switch modes

  • Plate API: Analysis.calc_plate_yield(ms_seq, fid_seq, layout, matching='rt', rt_func=Analysis.constant_offset(0.010), rt_tolerance=1/60), where rt_tolerance=1/60 is the 1 s window expressed in minutes. For the legacy path, omit matching or pass matching='ri' (with ri_tolerance).
  • Per-injection: fid_injection.match_rt(ms_peak.rt, func=..., tolerance=...) in place of match_ri.

niklashoelter and others added 2 commits May 31, 2026 21:15
Split-GC runs split one injection post-column to both MS and Polyarc-FID,
so FID and MS peaks share a retention-time axis and no longer need
retention-index (RI) matching via an alkane standard. Add an rt-matching
mode alongside the existing ri-matching:

- Injection.match_rt(rt, func, tolerance): nearest-retention-time analogue
  of match_ri. func maps the source (MS) rt to the expected (FID) rt
  (identity by default); tolerance defaults to 1 s. Excludes the internal
  standard and returns the real FID peak.
- Analysis: matching='ri'|'rt' option (default 'ri', back-compat) wired
  through calc_plate_yield/calc_plate_conv, with rt_func/rt_tolerance. The
  height-ratio tie-break is reused for both modes. Plate reshape is now
  layout-driven so non-8x12 plates (split-GC A1..K3) work.
- Analysis.constant_offset(b)/linear_drift(a,b): rt_func factories
  (t_fid = t_ms + b; t_fid = a*t_ms + b). constant_offset(0) is the default.
- examples/split_gc/plate_processing.py (+ splitgc_layout.csv): new example
  on the same FBS-FB-021-ALL.rslt data using matching='rt'.
- Remove stray DEBUG print in RI_Calibration.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Validated examples/split_gc/plate_processing.py end-to-end on the real
FBS-FB-021-ALL.rslt sequence (33 wells, A1..K3).

- pygecko/reaction/array.py: add Product_Array.get_product(pos). The plate
  engine (Analysis.__match_and_quantify) looks up the well analyte via
  layout.get_product(pos), which previously existed only on Reaction_Array.
  Product_Array stores product SMILES directly, so get_product returns the
  well entry; the analysis engine now works for product-only layouts.

- examples/split_gc/plate_processing.py: set RT_FUNC to
  Analysis.constant_offset(0.010). The measured FID-MS offset is a consistent
  +0.0094 mean / +0.010 median min (FID elutes later, matching the ~0.01 min
  splitter dead-volume). The offset centres the +-1 s match window and recovers
  two tolerance-edge wells (B3, F1) -> 29/33 matched. The 4 remaining misses
  are MS-side (E-row Decanenitrile has a weak/absent EI molecular ion; I2 a
  single-injection MS miss), not nearest-RT matching.

- Add validated reference outputs (yields CSV + plate heatmap PNG).