prairieview reader: real per-frame timestamps from .xml configs (+ v2/.pcf, cycles) #107
Merged
Implement Prairie View XML reading so the native reader extracts real per-frame timestamps for .xml recordings too (not just legacy .pcf), ported and revised from VH-Lab/vhlab-TwoPhoton-matlab (readprairieviewxml.m / readprairieviewxml3.m).

- ndr.format.prairieview.readxml: parse a Prairie XML into the same struct shape as readconfig. Modern PVScan files take per-frame times from '<Frame absoluteTime>' (x1e6 -> us) and dims from '<Key/PVStateValue key="linesPerFrame"/...>'; legacy MM-era XML takes per-frame '<Time>' values (x1e3 -> us) and '<Lines_Per_Frame>' etc.
- ndr.format.prairieview.keyvalue / elementvalue: the reusable XML value parsers (ports of readprairie3keyvalue / getxmlval), kept in ndr.format.* per the project convention for format helpers.
- ndr.format.prairieview.readconfig now delegates '.xml' configs to readxml (instead of returning is_xml with no timestamps), so ndr.reader.prairieview surfaces real per-frame times for .pcf and .xml alike. Multi-channel grouping (from the Cycle/Ch file names) composes with either config.

The originals' file-position scanning is replaced with whole-file regular-expression parsing for robustness.

TestPrairieView gains a modern-PVScan-XML 2-channel fixture covering config parsing, geometry, multi-channel frame round-trip, and absoluteTime-derived timestamps.

https://claude.ai/code/session_01M7f7wSBJeN4QJFvuCwdsSu
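The two timestamp sources described above can be illustrated with a small Python sketch. This is not the PR's MATLAB code: the function name `frame_times_us` and the rounding to integer microseconds are assumptions made for illustration; only the attribute and element names and the scale factors (x1e6 for absoluteTime, x1e3 for Time) come from the commit message.

```python
import re

def frame_times_us(xml_text):
    """Return per-frame timestamps in microseconds from a Prairie-style XML string.

    Hypothetical helper: tries the modern PVScan layout first, then the legacy one.
    """
    # Modern PVScan: <Frame ... absoluteTime="<seconds>" ...>; [^>]* keeps the
    # match inside a single tag.
    modern = re.findall(r'<Frame[^>]*absoluteTime="([^"]*)"', xml_text)
    if modern:
        return [round(float(t) * 1e6) for t in modern]  # seconds -> microseconds
    # Legacy XML: <Time><milliseconds></Time>
    legacy = re.findall(r'<Time>([^<]*)</Time>', xml_text)
    return [round(float(t) * 1e3) for t in legacy]      # milliseconds -> microseconds
```

Both branches return the same shape (a flat list of microsecond values), which mirrors the stated goal of readxml producing the same struct shape as readconfig.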
Validated ndr.format.prairieview.readxml/keyvalue against a real Prairie View v4 recording's .xml/.cfg: the regexes extract all 596 per-frame absoluteTime values, the dimension Keys (skipping the 'permissions' attribute that sits between key and value), and the channels correctly. The reader's framelayout derives the same 596 timepoints from the filenames, matching the timestamp count.

Real recordings put one <Frame> per <Sequence cycle="N"> (the cycle is the timepoint, frame index fixed at 000001, 3-digit cycle). Rewrite the synthetic test fixture to that layout (multi-Sequence, per-frame <PVStateShard> with the permissions-bearing Keys, Cycle%03d_Ch%d filenames) so the test guards the real-world path. Parser code is unchanged; it already handled the real structure.

https://claude.ai/code/session_01M7f7wSBJeN4QJFvuCwdsSu
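As a rough illustration of deriving timepoints from filenames, here is a Python sketch (not the reader's MATLAB framelayout). The exact file naming is an assumption: the pattern below expects `Cycle<digits>_Ch<digits>_<digits>` somewhere in the name, which matches the Cycle%03d_Ch%d layout mentioned above plus a hypothetical trailing frame index.

```python
import re
from collections import defaultdict

# Assumed naming, e.g. "t-001_Cycle001_Ch1_000001.tif" (hypothetical example name).
NAME_RE = re.compile(r'Cycle(\d+)_Ch(\d+)_(\d+)')

def frame_layout(filenames):
    """Group image files into timepoints ordered cycle-then-frame.

    Returns a list with one dict per timepoint, mapping channel number -> filename.
    """
    by_timepoint = defaultdict(dict)
    for name in filenames:
        m = NAME_RE.search(name)
        if m is None:
            continue  # skip files that are not images of this recording (e.g. the .xml)
        cycle, channel, frame = (int(g) for g in m.groups())
        by_timepoint[(cycle, frame)][channel] = name
    # Sorting the (cycle, frame) keys yields cycle-then-frame order.
    return [by_timepoint[key] for key in sorted(by_timepoint)]
```

With one frame per cycle, as in the real v4 recording described above, the number of timepoints equals the number of cycles, so it can be checked directly against the number of extracted timestamps.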
…ema) Validated against a real PrairieView v2.2.0.7 recording: the per-frame timestamps live in '<Time>' (milliseconds) inside '<Dataset_x0020_N>' rows, and the file embeds an '<xs:schema>' that defines the field names before the data. The timestamp extraction already matched all 847 real <Time> values; the only problem was that the dimension lookups (Lines_Per_Frame / Framerate) matched the schema's '<xs:element name="...">' definitions instead of the data.

Fix: strip the '<xs:schema>...</xs:schema>' block in the legacy XML path before reading element values, so dims come from the data rows. Frame times (Time ms -> us) were already correct and unchanged.

Add a synthetic v2 '.NET DataSet' fixture (embedded schema + header dims + Dataset_x0020 rows with per-channel filenames and <Time>) and tests for config parsing (dims read past the schema), multi-channel geometry, frame round-trip, and Time-derived timestamps.

https://claude.ai/code/session_01M7f7wSBJeN4QJFvuCwdsSu
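The schema problem can be reproduced in a few lines of Python. This is a sketch under stated assumptions, not the MATLAB fix itself: `first_value_after` is a deliberately naive, hypothetical lookup (first occurrence of the name, then the text after the next `>`), chosen only to show how a schema definition can shadow the data, and `strip_schema` is one possible way to remove the embedded block.

```python
import re

def strip_schema(xml_text):
    """Remove an embedded <xs:schema>...</xs:schema> block.

    The match is non-greedy and DOTALL so it spans newlines but stops at the
    first closing tag.
    """
    return re.sub(r'<xs:schema.*?</xs:schema>', '', xml_text, flags=re.DOTALL)

def first_value_after(xml_text, name):
    """Naive lookup (hypothetical): text following the first tag that mentions `name`."""
    start = xml_text.find(name)
    if start < 0:
        return None
    gt = xml_text.find('>', start)
    if gt < 0:
        return None
    lt = xml_text.find('<', gt)
    if lt < 0:
        lt = len(xml_text)
    return xml_text[gt + 1:lt].strip()
```

On a file with an embedded schema, the naive lookup lands on the `<xs:element name="Lines_Per_Frame" ...>` definition and returns an empty value; after `strip_schema` the same lookup reaches the data row.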
Validated ndr.format.prairieview.readconfig against a real PrairieView v2.1 recording's *_Main.pcf: the [Main] section (Total images=22, dims 512x512, Total cycles=3), the per-[Cycle N] image counts (1, 20, 1), and all 22 [Image TimeStamp (us)] values parse correctly.

Document that a Prairie run is divided into cycles and that NDR reads a collection of cycles as a SINGLE epoch (ordered cycle-then-frame, with the Main timestamp list spanning all cycles):

- add a "Cycles and epochs" section to the ndr.reader.prairieview help;
- add docs/notes/prairieview_cycles.md.

Add a real-style multi-cycle .pcf test fixture ([Main] + [Cycle N] sections + a spanning [Image TimeStamp (us)] list, frames named across cycles with a per-cycle resetting index) and tests that the epoch spans all cycles in cycle-then-frame order with the right timestamps.

Fix the test .pcf/v2 writers to emit timestamps with %.15g instead of %g, which was truncating large microsecond integers to 6 significant figures (e.g. 1486848 -> 1.48685e+06).

https://claude.ai/code/session_01M7f7wSBJeN4QJFvuCwdsSu
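Two points from this commit lend themselves to a short Python sketch: the cycle-then-frame ordering with a per-cycle resetting frame index, and the %g truncation. The function names are hypothetical; Python's `%g` follows the same C-style formatting rule (6 significant digits by default), so it reproduces the example above.

```python
def cycle_then_frame(images_per_cycle):
    """Expand per-cycle image counts into (cycle, frame) pairs.

    Cycles are numbered from 1 and the frame index restarts at 1 in each cycle,
    so the full list is the order of frames within the single epoch.
    """
    return [(cycle, frame)
            for cycle, count in enumerate(images_per_cycle, start=1)
            for frame in range(1, count + 1)]

def format_timestamps(values_us, fmt="%.15g"):
    """Format microsecond timestamps as text; %.15g keeps large integers exact."""
    return [fmt % v for v in values_us]

# The per-cycle counts reported for the real v2.1 recording: 1, 20, 1 (22 images).
order = cycle_then_frame([1, 20, 1])
print(len(order), order[0], order[1], order[-1])
print(format_timestamps([1486848], fmt="%g"), format_timestamps([1486848]))
```

Pairing `order` with the spanning timestamp list (for example via `zip`) gives one timestamp per frame across all cycles, which is the single-epoch view described above.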
CI showed testXmlConfigParsing/testXmlTimestamps failing: the modern PVScan XML parsing returned all-empty (Lines_per_frame=[], timestamps 1x0) in MATLAB, while the .pcf and v2 '.NET DataSet' paths passed. The difference was the modern patterns' use of \b (word boundary) and \s, which did not match as intended under MATLAB regexp; the passing paths use only [^>] / \d.

Rewrite the modern-XML patterns to use those same constructs:

- version detection: '<PVScan[^>]*version="..."'
- per-frame times: '<Frame[^>]*absoluteTime="..."'
- ndr.format.prairieview.keyvalue: 'key="<name>"[^>]*value="..."' (key and value confined to one tag since [^>]* cannot cross '>').

Re-validated against the real t00004-001.xml / Config.cfg: version 4.0.0.43, all 596 absoluteTime values, and linesPerFrame/pixelsPerLine/framePeriod/dwellTime extract correctly. Semantics unchanged; only the regex constructs changed.

https://claude.ai/code/session_01M7f7wSBJeN4QJFvuCwdsSu
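The rewritten patterns translate almost directly to other regex engines. The following Python sketch shows the same constructs; the function names `key_value` and `pvscan_version` are illustrative stand-ins rather than the MATLAB functions, and the sample XML in the usage note is synthetic.

```python
import re

def key_value(xml_text, key):
    """Return the value attribute of the tag carrying key="<key>", or None.

    [^>]* lets other attributes (such as permissions="...") sit between key
    and value, but cannot cross '>' into a different tag.
    """
    pattern = r'key="' + re.escape(key) + r'"[^>]*value="([^"]*)"'
    m = re.search(pattern, xml_text)
    return m.group(1) if m else None

def pvscan_version(xml_text):
    """Return the version attribute of the <PVScan ...> root tag, or None."""
    m = re.search(r'<PVScan[^>]*version="([^"]*)"', xml_text)
    return m.group(1) if m else None
```

For a tag such as `<Key key="linesPerFrame" permissions="Read, Write, Save" value="512" />`, `key_value(text, "linesPerFrame")` returns `'512'`; a key whose tag has no value attribute yields `None` instead of borrowing the value from a later tag.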
Summary
Follow-up to the merged imaging PR (#106), extending the native ndr.reader.prairieview reader to read real per-frame timestamps from Prairie View XML configs (not just legacy .pcf), validated against real lab recordings. Ported and revised from VH-Lab/vhlab-TwoPhoton-matlab.

If GitHub shows the earlier imaging files in the diff, that's because #106 was merged separately; the new work in this PR is the ndr.format.prairieview.* XML additions, the prairieview reader header, the cycles doc, and the TestPrairieView additions (see "Files new in this PR" below).

What's new
XML config reading (ndr.format.prairieview)

- readxml: parse a Prairie XML into the same struct shape as the .pcf reader. Modern PVScan files take per-frame times from <Frame absoluteTime> (×1e6 → µs) and dims from <Key/PVStateValue key="linesPerFrame" … value=…>; legacy v2.2 .NET DataSet files take per-frame <Time> (×1e3 → µs) from <Dataset_x0020_N> rows and <Lines_Per_Frame> etc., skipping the embedded <xs:schema>.
- keyvalue / elementvalue: reusable XML value parsers (ports of readprairie3keyvalue / getxmlval), kept in ndr.format.* per the project convention.
- readconfig now delegates .xml configs to readxml, so ndr.reader.prairieview surfaces real per-frame timestamps for .pcf and .xml alike. Multi-channel grouping (from Cycle/Ch file names) composes with either.

The originals' file-position scanning is replaced with whole-file regex parsing for robustness.
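The delegation in the last bullet amounts to a dispatch on the config file's extension. A minimal Python sketch follows, assuming hypothetical `read_pcf` / `read_xml` callables that return the same dictionary shape; the case-insensitive extension check is also an assumption, not something the PR states.

```python
import os

def read_config(path, read_pcf, read_xml):
    """Dispatch a Prairie config file to the matching parser by extension.

    read_pcf and read_xml are caller-supplied parsers (hypothetical here) that
    are expected to return the same dictionary shape, so downstream code does
    not need to know which format was read.
    """
    ext = os.path.splitext(path)[1].lower()
    if ext == ".xml":
        return read_xml(path)
    if ext == ".pcf":
        return read_pcf(path)
    raise ValueError("unsupported Prairie config extension: " + repr(ext))
```

Keeping one entry point means the multi-channel grouping can be layered on top of whichever parser was selected.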
Cycles == one epoch (documented)
A Prairie run is divided into cycles; NDR reads a collection of cycles as a single epoch (frames ordered cycle-then-frame, timestamps from the Main config's list spanning all cycles). Recorded in the ndr.reader.prairieview help and docs/notes/prairieview_cycles.md.

Validated against real recordings
- Modern PVScan v4 XML (t00004-001.xml): absoluteTime, dims, 2 channels ✓
- v2 .NET DataSet XML (t00001-001.xml): <Time> values; schema-skip for dims ✓
- .pcf (t00012-001_Main.pcf): [1,20,1], dims ✓ (no code change needed)

Real files were used only to validate the parsers; they are not committed (they contain internal lab paths/hostnames). Each format got a synthetic, real-structure test fixture instead.
Files new in this PR
- +ndr/+format/+prairieview/readxml.m, keyvalue.m, elementvalue.m
- docs/notes/prairieview_cycles.md
- +ndr/+format/+prairieview/readconfig.m, +ndr/+reader/prairieview.m, tools/tests/+ndr/+unittest/+reader/TestPrairieView.m

Notes / not done
- The <Datasets> XML variant is implemented but not yet validated against a real file (modern v4 and v2 are the common cases).

https://claude.ai/code/session_01M7f7wSBJeN4QJFvuCwdsSu
Generated by Claude Code