This is the practitioner summary. The full theory companion is
msnpip_methods_theory.md; locked parameters live in msnpip_refactor_spec.md §0.0.
Pipeline: subjects → per-subject MSN → node strength → GROUP CONTRAST (regional map x) → imaging transcriptomics (Layers A/B).

Node strength also feeds the demographic correlation (Layer 0).

Per subject, each of the 5 morphometric metrics is z-scored across regions, then inter-regional
similarity is the Pearson correlation between regions' standardized feature vectors (diagonal
NaN). The MSN is whole-cortex (both hemispheres). Node strength is the signed mean:
(mean of positive edges + mean of negative edges) / 2; positive-only and absolute variants are selectable.
The z-score ddof does not affect the result: changing it rescales every metric column by the same
constant factor, and a uniform rescale cancels in the row-wise Pearson.
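
The construction above can be sketched in a few lines of NumPy. This is a minimal illustration, not msnpip's own code: the function names are hypothetical, the input is assumed to be a `(n_regions, 5)` array for one subject, and treating a region with no positive (or no negative) edges as contributing a mean of 0 is an assumption that the text does not specify.

```python
import numpy as np

def build_msn(features, ddof=0):
    """Build one subject's MSN from a (n_regions, n_metrics) feature array."""
    features = np.asarray(features, dtype=float)
    # z-score each metric (column) across regions
    z = (features - features.mean(axis=0)) / features.std(axis=0, ddof=ddof)
    # np.corrcoef treats rows as variables, so this is the region-by-region Pearson
    msn = np.corrcoef(z)
    np.fill_diagonal(msn, np.nan)  # self-similarity is excluded
    return msn

def _masked_row_mean(msn, mask):
    # Mean of the selected edges per row; rows with no selected edge give 0.0 (assumption)
    counts = mask.sum(axis=1)
    sums = np.where(mask, msn, 0.0).sum(axis=1)
    return np.divide(sums, counts, out=np.zeros(len(counts)), where=counts > 0)

def signed_node_strength(msn):
    """Signed mean: (mean of positive edges + mean of negative edges) / 2."""
    pos = _masked_row_mean(msn, msn > 0)
    neg = _masked_row_mean(msn, msn < 0)
    return (pos + neg) / 2.0
```

Calling `build_msn(features, ddof=0)` and `build_msn(features, ddof=1)` on the same input should give the same matrix up to floating-point error, which is the ddof invariance described above.
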
Per region, OLS of `strength ~ group + covariates`; the exported regional statistic is the
group coefficient (`beta`, default), its `t`, or `cohen_d` (standardized mean difference on
covariate-residualized strength). Categorical covariates and site/scanner are one-hot encoded
(reference dropped). This is a subject-level test, so no spatial null applies here; spatial
autocorrelation only matters when correlating two regional maps (see Layer A).
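
As a rough sketch of the per-region contrast, the three statistics can be computed with plain least squares. The function name and return layout are hypothetical. The sketch assumes a binary 0/1 `group` vector and a numeric covariate matrix that is already one-hot encoded with the reference level dropped. Computing `cohen_d` by residualizing on intercept plus covariates (not group) and dividing by the pooled SD is one reading of "covariate-residualized", not a confirmed detail of msnpip.

```python
import numpy as np

def regional_group_contrast(strength, group, covariates):
    """strength: (n_subjects, n_regions); group: (n_subjects,) of 0/1;
    covariates: (n_subjects, n_cov), numeric and already one-hot encoded."""
    strength = np.asarray(strength, dtype=float)
    group = np.asarray(group)
    covariates = np.asarray(covariates, dtype=float)
    n = strength.shape[0]

    # Full model: intercept, group, covariates (fit for all regions at once)
    X = np.column_stack([np.ones(n), group, covariates])
    coef, *_ = np.linalg.lstsq(X, strength, rcond=None)
    resid = strength - X @ coef
    dof = n - X.shape[1]
    sigma2 = (resid ** 2).sum(axis=0) / dof
    se_group = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    beta = coef[1]
    t = beta / se_group

    # Cohen's d on strength residualized for covariates only (assumption)
    C = np.column_stack([np.ones(n), covariates])
    c_coef, *_ = np.linalg.lstsq(C, strength, rcond=None)
    r = strength - C @ c_coef
    g1, g0 = r[group == 1], r[group == 0]
    n1, n0 = len(g1), len(g0)
    pooled_sd = np.sqrt(((n1 - 1) * g1.var(axis=0, ddof=1)
                         + (n0 - 1) * g0.var(axis=0, ddof=1)) / (n1 + n0 - 2))
    cohen_d = (g1.mean(axis=0) - g0.mean(axis=0)) / pooled_sd

    return {"beta": beta, "t": t, "cohen_d": cohen_d}
```

Each returned array has one value per region, so any of the three can serve as the regional map x passed to the later layers.
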
Spearman (default) of node strength vs a continuous variable, globally or per region (per-region gets Benjamini–Hochberg FDR across regions), optionally within a single group. Ordinary correlation p-values; no spatial null.
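
The Benjamini–Hochberg step applied across regions can be sketched as follows. This is a generic BH adjustment written for illustration (the helper name `bh_fdr` is hypothetical), taking one p-value per region from the per-region correlations.

```python
import numpy as np

def bh_fdr(pvals):
    """Benjamini-Hochberg adjusted p-values (q-values) for a 1-D array."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # p_(i) * m / i for the i-th smallest p-value
    ranked = p[order] * m / np.arange(1, m + 1)
    # enforce monotonicity, working down from the largest p-value
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.clip(ranked, 0.0, 1.0)
    return q

# Four regional p-values, returned in the original region order
q = bh_fdr([0.01, 0.04, 0.03, 0.5])
```

Here the denominator `m` is the number of regions tested, so the adjustment depends on the atlas in use.
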
- Layer A (spatial): handled by the engine via the `vasa` surface spin. Answers: "is this gene/pathway association stronger than under spatially-autocorrelated random brain maps?" msnpip fixes the null to `vasa` and hard-fails (`MsnpipSurfaceNullError`) if the engine falls back to a grouped shuffle, so an invalid spin test can never reach a figure.
- Layer B (sampling): subject-level resampling of the contrast map's stability. Documented as a future option; not built in v2.
- Correlation: sign-aware empirical `p` with `+1` smoothing, BH `fdr`, FWE `maxT`.
- PLS: component `p` is on the cumulative variance through component k; gene columns are `weight, zscore, p, fdr, maxT`. `zscore` is a descriptive ranking aid, not significance.
- Enrichment: ensemble-GCEA (primary, phenotype-side null) and `gsea` (NES recalibrated against the engine's imaging-permutation null; its `fdr` is a NES q-value, not BH, and is not numerically comparable to ensemble `fdr`).
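
To make the correlation-level statistics concrete, here is a small sketch of an empirical p-value with `+1` smoothing and a maxT family-wise p-value. Both functions are hypothetical illustrations, not the engine's code. In particular, "sign-aware" is interpreted here as counting null values at least as extreme in the direction of the observed sign, which is an assumption.

```python
import numpy as np

def empirical_p(observed, null):
    """observed: (n_genes,); null: (B, n_genes) statistics under the null."""
    observed = np.asarray(observed, dtype=float)
    null = np.asarray(null, dtype=float)
    B = null.shape[0]
    upper = (null >= observed).sum(axis=0)  # used when the observed value is >= 0
    lower = (null <= observed).sum(axis=0)  # used when the observed value is < 0
    exceed = np.where(observed >= 0, upper, lower)
    return (exceed + 1) / (B + 1)           # +1 smoothing: p is never exactly 0

def maxt_p(observed, null):
    """FWE-adjusted p via the per-permutation maximum absolute statistic."""
    observed = np.asarray(observed, dtype=float)
    null = np.asarray(null, dtype=float)
    B = null.shape[0]
    max_abs = np.abs(null).max(axis=1)      # one maximum per permutation
    exceed = (max_abs[:, None] >= np.abs(observed)[None, :]).sum(axis=0)
    return (exceed + 1) / (B + 1)

# Toy example: 4 permutations, 2 genes
null = np.array([[0.1, -0.1], [0.2, -0.2], [-0.3, 0.3], [0.05, 0.0]])
observed = np.array([0.25, -0.15])
p = empirical_p(observed, null)     # -> [0.2, 0.4]
p_fwe = maxt_p(observed, null)      # -> [0.4, 0.6]
```

With `+1` smoothing the smallest attainable value is `1/(B+1)`, here 0.2 for four permutations.
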
- Empirical p-resolution is `1/(B+1)`; with ~15,677 genes (DK) even 10⁴ permutations may not yield small adjusted single-gene values, so report primarily at the component/category level.
- Cross-run multiplicity (multiple contrasts/components/genesets) is your responsibility; pre-specify the primary analysis and treat the rest as exploratory.
- Report the exact gene count for your atlas (DK = 15,677) — it is the FDR denominator.
- Hemisphere/region choices change the science; defaults are recorded in `manifest.json` and the report. The MSN uses both hemispheres; the engine input hemisphere (default `left`) is the selectable part.
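
The p-resolution caveat above can be checked with a little arithmetic. The sketch below assumes BH adjustment across all genes and uses hypothetical helper names. It asks how many genes would have to sit at the smallest attainable empirical p-value before any of them could reach a given BH threshold.

```python
import math

def min_empirical_p(n_perm):
    """Smallest attainable empirical p-value with +1 smoothing."""
    return 1 / (n_perm + 1)

def min_genes_at_floor_for_bh(n_perm, n_genes, q):
    """Smallest k such that k genes tied at the floor satisfy p_min * n_genes / k <= q."""
    return math.ceil(n_genes * min_empirical_p(n_perm) / q)

p_floor = min_empirical_p(10_000)                            # about 1e-4
single_gene_q = min(1.0, 15_677 * p_floor)                   # one gene alone: capped at 1.0
k_needed = min_genes_at_floor_for_bh(10_000, 15_677, 0.05)   # 32
```

Under these assumptions, a single gene at the floor cannot reach q = 0.05 with 10⁴ permutations and 15,677 genes; at least 32 genes would all need to hit the floor, which is why component- or category-level reporting is the safer primary result.
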