
prairieview reader: real per-frame timestamps from .xml configs (+ v2/.pcf, cycles)#107

Merged: stevevanhooser merged 8 commits into main from claude/loving-wozniak-stmogc on Jun 10, 2026

@stevevanhooser

Summary

Follow-up to the merged imaging PR (#106), extending the native ndr.reader.prairieview reader to read real per-frame timestamps from Prairie View XML configs (not just legacy .pcf), validated against real lab recordings. Ported and revised from VH-Lab/vhlab-TwoPhoton-matlab.

If GitHub shows the earlier imaging files in the diff, that's because #106 was merged separately; the new work in this PR is the ndr.format.prairieview.* XML additions, the prairieview reader header, the cycles doc, and the TestPrairieView additions (see "Files new in this PR" below).

What's new

XML config reading (ndr.format.prairieview)

  • readxml — parse a Prairie XML into the same struct shape as the .pcf reader. Modern PVScan files take per-frame times from <Frame absoluteTime> (×1e6 → µs) and dims from <Key/PVStateValue key="linesPerFrame" … value=…>; legacy v2.2 .NET DataSet files take per-frame <Time> (×1e3 → µs) from <Dataset_x0020_N> rows and <Lines_Per_Frame> etc., skipping the embedded <xs:schema>.
  • keyvalue / elementvalue — reusable XML value parsers (ports of readprairie3keyvalue / getxmlval), kept in ndr.format.* per the project convention.
  • readconfig now delegates .xml configs to readxml, so ndr.reader.prairieview surfaces real per-frame timestamps for .pcf and .xml alike. Multi-channel grouping (from Cycle/Ch file names) composes with either.

The originals' file-position scanning is replaced with whole-file regex parsing for robustness.
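
To make the whole-file regex approach concrete, here is a minimal Python sketch of how per-frame times might be pulled from a modern PVScan XML. It is an illustration only, not the MATLAB code in readxml: the sample fragment and the function name `frame_times_us` are hypothetical, and the pattern mirrors the `'<Frame[^>]*absoluteTime="..."'` form described later in this PR.

```python
import re

# Hypothetical fragment shaped like a modern PVScan file (one <Frame> per
# <Sequence>); real files carry many more elements and attributes.
SAMPLE = '''<PVScan version="4.0.0.43">
  <Sequence cycle="1">
    <Frame relativeTime="0" absoluteTime="0.5" index="1"/>
  </Sequence>
  <Sequence cycle="2">
    <Frame relativeTime="0" absoluteTime="1.25" index="1"/>
  </Sequence>
</PVScan>'''


def frame_times_us(xml_text):
    """Return every <Frame absoluteTime> value in microseconds, in file order."""
    # [^>]* keeps the match inside a single <Frame ...> tag.
    values = re.findall(r'<Frame[^>]*absoluteTime="([^"]*)"', xml_text)
    # absoluteTime is in seconds; x1e6 converts to microseconds.
    return [float(v) * 1e6 for v in values]


print(frame_times_us(SAMPLE))  # [500000.0, 1250000.0]
```

Scanning the whole text at once avoids depending on byte offsets or line positions, which is the robustness argument made above.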

Cycles == one epoch (documented)

A Prairie run is divided into cycles; NDR reads a collection of cycles as a single epoch (frames ordered cycle-then-frame, with timestamps taken from the Main config's list, which spans all cycles). This is documented in the ndr.reader.prairieview help and in docs/notes/prairieview_cycles.md.
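
The cycle-then-frame ordering can be sketched as follows. This is a Python illustration under stated assumptions, not the reader's MATLAB code: the file names, the `Cycle<N>_Ch<N>_<frame>` naming pattern, and the helper `epoch_order` are hypothetical stand-ins for whatever the reader derives from real file names.

```python
import re

# Assumed naming scheme: ..._Cycle<cycle>_Ch<channel>_<frame>.tif
NAME_RE = re.compile(r'Cycle(\d+)_Ch(\d+)_(\d+)')


def epoch_order(filenames, channel=1):
    """Order one channel's frames cycle-then-frame, as a single epoch."""
    keyed = []
    for name in filenames:
        m = NAME_RE.search(name)
        if m and int(m.group(2)) == channel:
            keyed.append((int(m.group(1)), int(m.group(3)), name))
    # Tuples sort by cycle first, then by the per-cycle frame index.
    return [name for _, _, name in sorted(keyed)]


files = [
    "run_Cycle002_Ch1_000002.tif",
    "run_Cycle001_Ch1_000001.tif",
    "run_Cycle003_Ch1_000001.tif",
    "run_Cycle002_Ch1_000001.tif",
]
# One timestamp list spanning all cycles, as in the Main config.
timestamps_us = [0, 1000, 2000, 3000]

ordered = epoch_order(files)
for t, name in zip(timestamps_us, ordered):
    print(t, name)
```

Because the frame index resets in each cycle, sorting on the frame number alone would interleave cycles; the cycle number has to be the primary key.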

Validated against real recordings

| Format | Result |
| --- | --- |
| PVScan v4 XML (t00004-001.xml) | 596 per-frame absoluteTime, dims, 2 channels ✓ |
| v2.2 .NET DataSet XML (t00001-001.xml) | 847 <Time> values; schema-skip for dims ✓ |
| Legacy .pcf (t00012-001_Main.pcf) | 22 frames, 3 cycles [1,20,1], dims ✓ (no code change needed) |

Real files were used only to validate the parsers; they are not committed (they contain internal lab paths/hostnames). Each format got a synthetic, real-structure test fixture instead.

Files new in this PR

  • +ndr/+format/+prairieview/readxml.m, keyvalue.m, elementvalue.m
  • docs/notes/prairieview_cycles.md
  • changes to +ndr/+format/+prairieview/readconfig.m, +ndr/+reader/prairieview.m, tools/tests/+ndr/+unittest/+reader/TestPrairieView.m

Notes / not done

  • The old MM-era <Datasets> XML variant is implemented but not yet validated against a real file (modern v4 and v2 are the common cases).
  • One-epoch-per-cycle would be a future NDI file-navigator concern.

https://claude.ai/code/session_01M7f7wSBJeN4QJFvuCwdsSu


Generated by Claude Code

claude added 4 commits June 10, 2026 12:33
Implement Prairie View XML reading so the native reader extracts real
per-frame timestamps for .xml recordings too (not just legacy .pcf),
ported and revised from VH-Lab/vhlab-TwoPhoton-matlab
(readprairieviewxml.m / readprairieviewxml3.m).

- ndr.format.prairieview.readxml: parse a Prairie XML into the same
  struct shape as readconfig. Modern PVScan files take per-frame times
  from '<Frame absoluteTime>' (x1e6 -> us) and dims from
  '<Key/PVStateValue key="linesPerFrame"/...>'; legacy MM-era XML takes
  per-frame '<Time>' values (x1e3 -> us) and '<Lines_Per_Frame>' etc.
- ndr.format.prairieview.keyvalue / elementvalue: the reusable XML value
  parsers (ports of readprairie3keyvalue / getxmlval), kept in
  ndr.format.* per the project convention for format helpers.
- ndr.format.prairieview.readconfig now delegates '.xml' configs to
  readxml (instead of returning is_xml with no timestamps), so
  ndr.reader.prairieview surfaces real per-frame times for .pcf and .xml
  alike. Multi-channel grouping (from the Cycle/Ch file names) composes
  with either config.

The originals' file-position scanning is replaced with whole-file
regular-expression parsing for robustness. TestPrairieView gains a
modern-PVScan-XML 2-channel fixture covering config parsing, geometry,
multi-channel frame round-trip, and absoluteTime-derived timestamps.

https://claude.ai/code/session_01M7f7wSBJeN4QJFvuCwdsSu

Validated ndr.format.prairieview.readxml/keyvalue against a real Prairie
View v4 recording's .xml/.cfg: the regexes extract all 596 per-frame
absoluteTime values, the dimension Keys (skipping the 'permissions'
attribute that sits between key and value), and the channels correctly.
The reader's framelayout derives the same 596 timepoints from the
filenames, matching the timestamp count.

Real recordings put one <Frame> per <Sequence cycle="N"> (the cycle is
the timepoint, frame index fixed at 000001, 3-digit cycle). Rewrite the
synthetic test fixture to that layout (multi-Sequence, per-frame
<PVStateShard> with the permissions-bearing Keys, Cycle%03d_Ch%d
filenames) so the test guards the real-world path. Parser code is
unchanged; it already handled the real structure.

https://claude.ai/code/session_01M7f7wSBJeN4QJFvuCwdsSu

…ema)

Validated against a real PrairieView v2.2.0.7 recording: the per-frame
timestamps live in '<Time>' (milliseconds) inside '<Dataset_x0020_N>'
rows, and the file embeds an '<xs:schema>' that defines the field names
before the data. The timestamp extraction already matched all 847 real
<Time> values; the only problem was that the dimension lookups
(Lines_Per_Frame / Framerate) matched the schema's '<xs:element
name="...">' definitions instead of the data.

Fix: strip the '<xs:schema>...</xs:schema>' block in the legacy XML path
before reading element values, so dims come from the data rows. Frame
times (Time ms -> us) were already correct and unchanged.
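
A minimal Python sketch of the schema-strip fix, for illustration only (the actual change is in the MATLAB readxml). The sample document, the `<Header>` wrapper, and the helper names `strip_schema` and `element_value` are assumptions made for this example.

```python
import re

# Hypothetical v2-style '.NET DataSet' fragment: an embedded schema that
# names the fields, followed by the real data rows.
SAMPLE = '''<NewDataSet>
  <xs:schema id="NewDataSet" xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="Lines_Per_Frame" type="xs:string"/>
  </xs:schema>
  <Header>
    <Lines_Per_Frame>512</Lines_Per_Frame>
  </Header>
  <Dataset_x0020_1><Time>0</Time></Dataset_x0020_1>
  <Dataset_x0020_2><Time>33.5</Time></Dataset_x0020_2>
</NewDataSet>'''


def strip_schema(xml_text):
    """Remove the embedded <xs:schema>...</xs:schema> block."""
    # DOTALL lets '.' span newlines; '.*?' stops at the first closing tag.
    return re.sub(r'<xs:schema.*?</xs:schema>', '', xml_text, flags=re.DOTALL)


def element_value(xml_text, name):
    """Return the text of the first <name>...</name> element, or None."""
    tag = re.escape(name)
    m = re.search(r'<%s>([^<]*)</%s>' % (tag, tag), xml_text)
    return m.group(1) if m else None


data = strip_schema(SAMPLE)
lines_per_frame = element_value(data, "Lines_Per_Frame")
# <Time> is in milliseconds; x1e3 converts to microseconds.
times_us = [float(v) * 1e3 for v in re.findall(r'<Time>([^<]*)</Time>', data)]
print(lines_per_frame, times_us)  # 512 [0.0, 33500.0]
```

Stripping the schema first means any later lookup, however loosely it keys on a field name, can only see the data rows.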

Add a synthetic v2 '.NET DataSet' fixture (embedded schema + header dims
+ Dataset_x0020 rows with per-channel filenames and <Time>) and tests
for config parsing (dims read past the schema), multi-channel geometry,
frame round-trip, and Time-derived timestamps.

https://claude.ai/code/session_01M7f7wSBJeN4QJFvuCwdsSu

Validated ndr.format.prairieview.readconfig against a real PrairieView
v2.1 recording's *_Main.pcf: the [Main] section (Total images=22, dims
512x512, Total cycles=3), the per-[Cycle N] image counts (1, 20, 1), and
all 22 [Image TimeStamp (us)] values parse correctly.

Document that a Prairie run is divided into cycles and that NDR reads a
collection of cycles as a SINGLE epoch (ordered cycle-then-frame, with
the Main timestamp list spanning all cycles):
- add a "Cycles and epochs" section to the ndr.reader.prairieview help;
- add docs/notes/prairieview_cycles.md.

Add a real-style multi-cycle .pcf test fixture ([Main] + [Cycle N]
sections + a spanning [Image TimeStamp (us)] list, frames named across
cycles with a per-cycle resetting index) and tests that the epoch spans
all cycles in cycle-then-frame order with the right timestamps.

Fix the test .pcf/v2 writers to emit timestamps with %.15g instead of
%g, which was truncating large microsecond integers to 6 significant
figures (e.g. 1486848 -> 1.48685e+06).
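
The truncation is easy to reproduce. The Python snippet below uses printf-style formatting, which follows the same %g conventions as the example above; it is an illustration and not the test writer's MATLAB code.

```python
timestamp_us = 1486848

short = "%g" % timestamp_us      # default %g keeps 6 significant figures
full = "%.15g" % timestamp_us    # 15 significant figures keeps the integer

print(short)  # 1.48685e+06
print(full)   # 1486848

# Reading the short form back loses 2 microseconds.
print(float(short) - timestamp_us)  # 2.0
```

With 15 significant figures, any microsecond count that a double can hold exactly below 1e15 is written out in full.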

https://claude.ai/code/session_01M7f7wSBJeN4QJFvuCwdsSu
@github-actions

github-actions Bot commented Jun 10, 2026

Test Results

177 tests (+7): 177 passed ✅ (+7), 0 skipped 💤 (±0), 0 failed ❌ (±0)
25 suites (±0), 1 file (±0)
Duration: 8s ⏱️ (-1s)

Results for commit dd6e309. ± Comparison against base commit 0365b4a.

♻️ This comment has been updated with latest results.

claude and others added 4 commits June 10, 2026 13:06
CI showed testXmlConfigParsing/testXmlTimestamps failing: the modern
PVScan XML parsing returned all-empty (Lines_per_frame=[], timestamps
1x0) in MATLAB, while the .pcf and v2 '.NET DataSet' paths passed. The
difference was the modern patterns' use of \b (word boundary) and \s,
which did not match as intended under MATLAB regexp; the passing paths
use only [^>] / \d.

Rewrite the modern-XML patterns to use those same constructs:
- version detection: '<PVScan[^>]*version="..."'
- per-frame times: '<Frame[^>]*absoluteTime="..."'
- ndr.format.prairieview.keyvalue: 'key="<name>"[^>]*value="..."'
  (key and value confined to one tag since [^>]* cannot cross '>').

Re-validated against the real t00004-001.xml / Config.cfg: version
4.0.0.43, all 596 absoluteTime values, and linesPerFrame/pixelsPerLine/
framePeriod/dwellTime extract correctly. Semantics unchanged; only the
regex constructs changed.

https://claude.ai/code/session_01M7f7wSBJeN4QJFvuCwdsSu
@stevevanhooser stevevanhooser merged commit 68512e1 into main Jun 10, 2026
@stevevanhooser stevevanhooser deleted the claude/loving-wozniak-stmogc branch June 10, 2026 13:33