Add Cell Annotation plugin scaffolding, storage manifest, and provider wiring #68

Merged
yulewu merged 3 commits into cell_annotation from copilot/add-cell-annotation-plugin on Mar 17, 2026

Conversation

Copilot AI commented Mar 17, 2026

This starts the Cell Annotation workflow by adding the owning plugin scaffold, per-dataset .UELer storage, manifest primitives, and Heatmap/FlowSOM integration points behind a feature flag. It establishes the storage and interface layer needed for later checkpoint save/load/diff/merge work while preserving current behavior when disabled.

  • Cell Annotation plugin bootstrap

    • Adds a non-UI CellAnnotationPlugin registered from ImageMaskViewer only when ENABLE_CELL_ANNOTATION is enabled
    • Initializes per-dataset state on open and releases it on close
    • Keeps orchestration separate from existing PluginBase UI plugins
  • Per-dataset storage under .UELer

    • Adds DatasetStore to resolve:
      • .UELer/dataset_<id>/checkpoints
      • .UELer/dataset_<id>/thumbnails
      • .UELer/dataset_<id>/selections
    • Uses a stable hash derived from the dataset root path
    • Adds atomic write helpers for future checkpoint/manifest persistence
  • Manifest and selection scaffolding

    • Adds a thin Manifest wrapper around manifest.json
    • Adds a materialized SelectionSpec implementation with:
      • subset checks
      • union for future merge flows
      • serializable payload generation for checkpoint metadata
  • Cross-plugin contracts

    • Introduces HeatmapStateProvider, FlowsomParamsProvider, and SelectionSpec protocols in ueler.viewer.interfaces
    • Adds legacy shims so both viewer.* and ueler.viewer.* namespaces resolve cleanly
  • Heatmap / FlowSOM registration hooks

    • Heatmap now registers itself with Cell Annotation and exposes export/import stubs for checkpoint state
    • FlowSOM now registers itself with Cell Annotation and exposes parameter export/import plus subset context storage
    • This establishes the ownership boundary for later subset-only enforcement and checkpoint restore behavior
  • Targeted CI for the new plugin

    • Adds a dedicated GitHub Actions workflow for Cell Annotation unit/integration coverage
    • Locks workflow token permissions to read-only
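The storage bullets above can be illustrated with a minimal sketch. It is not the merged DatasetStore: the class name DatasetStoreSketch, the helper names, the choice of SHA-256 truncated to 12 hex characters, and the default location of .UELer inside the dataset root are all assumptions made for illustration. Only the three subdirectory names and the idea of a stable path-derived hash come from the description.

```python
import hashlib
import os
import tempfile
from pathlib import Path

SUBDIRS = ("checkpoints", "thumbnails", "selections")


def dataset_id(dataset_root) -> str:
    """Return a short, stable hash of the resolved dataset root path."""
    normalized = str(Path(dataset_root).resolve())
    # ASSUMPTION: SHA-256 truncated to 12 hex chars; the PR only says "stable hash".
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:12]


class DatasetStoreSketch:
    """Hypothetical stand-in for DatasetStore; not the merged implementation."""

    def __init__(self, dataset_root, storage_parent=None):
        self.dataset_root = Path(dataset_root)
        # ASSUMPTION: .UELer sits inside the dataset root unless told otherwise.
        parent = Path(storage_parent) if storage_parent else self.dataset_root
        self.base = parent / ".UELer" / f"dataset_{dataset_id(self.dataset_root)}"

    def ensure_layout(self) -> dict:
        """Create checkpoints/, thumbnails/, selections/ and return their paths."""
        paths = {}
        for name in SUBDIRS:
            path = self.base / name
            path.mkdir(parents=True, exist_ok=True)
            paths[name] = path
        return paths


def atomic_write_text(target, text: str) -> None:
    """Write to a sibling *.partial file, then rename it over the target."""
    target = Path(target)
    fd, tmp_name = tempfile.mkstemp(dir=str(target.parent), suffix=".partial")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        # The rename is atomic on POSIX when both paths share a filesystem.
        os.replace(tmp_name, target)
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
```

Writing to a temporary sibling and renaming means a reader sees either the old file or the complete new one, which is the property later checkpoint and manifest persistence would need.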

Example of the new plugin activation path:

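# Register the plugin only when the ENABLE_CELL_ANNOTATION flag is on.
if _flag_enabled():
    plugin = CellAnnotationPlugin(self)
    # Expose the plugin on the viewer under its registry key.
    setattr(self, CellAnnotationPlugin.REGISTRY_KEY, plugin)
    # Initialize per-dataset state for the dataset that is currently open.
    plugin.on_dataset_opened(self.base_folder)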
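
For readers without the repository at hand, the activation path can be reproduced in a self-contained sketch. Everything here is hypothetical except the names ENABLE_CELL_ANNOTATION, REGISTRY_KEY, on_dataset_opened, and base_folder, which appear in the snippet or description: the accepted truthy values, the registry key string, the on_dataset_closed method name, and the ViewerStub class are assumptions.

```python
import os


def flag_enabled(name="ENABLE_CELL_ANNOTATION", environ=None) -> bool:
    """Read a boolean feature flag from the environment (assumed truthy values)."""
    env = os.environ if environ is None else environ
    return env.get(name, "").strip().lower() in {"1", "true", "yes", "on"}


class CellAnnotationPluginSketch:
    """Hypothetical non-UI plugin that only tracks per-dataset state."""

    REGISTRY_KEY = "cell_annotation"  # ASSUMPTION: the real key is not shown in the PR

    def __init__(self, viewer):
        self.viewer = viewer
        self.dataset_root = None

    def on_dataset_opened(self, root):
        # Initialize per-dataset state when a dataset is opened.
        self.dataset_root = root

    def on_dataset_closed(self):
        # Release per-dataset state when the dataset is closed.
        self.dataset_root = None


class ViewerStub:
    """Stand-in for the viewer; only mirrors the flag-gated registration."""

    def __init__(self, base_folder, environ=None):
        self.base_folder = base_folder
        if flag_enabled(environ=environ):
            plugin = CellAnnotationPluginSketch(self)
            setattr(self, CellAnnotationPluginSketch.REGISTRY_KEY, plugin)
            plugin.on_dataset_opened(self.base_folder)
```

With the flag unset, ViewerStub gains no extra attribute, which matches the stated goal of preserving current behavior when the feature is disabled.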

This PR is intentionally limited to the scaffolding layer: plugin lifecycle, storage layout, compatibility shims, and provider contracts. Browser UI, artifact serialization, checkpoint restore, merge/recluster flows, and subset-only enforcement remain follow-on work on top of this foundation.
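
The provider contracts and the materialized selection mentioned in the description might look roughly like the sketch below. The protocol names come from the PR and the heatmap method names from the linked issue; the FlowSOM method names, the SelectionSpec method names, the MaterializedSelection class, and the payload layout are assumptions rather than the contents of ueler.viewer.interfaces.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, Iterable, Protocol, runtime_checkable


@runtime_checkable
class SelectionSpec(Protocol):
    def is_subset_of(self, other: "SelectionSpec") -> bool: ...
    def union(self, other: "SelectionSpec") -> "SelectionSpec": ...
    def to_payload(self) -> Dict[str, Any]: ...


@runtime_checkable
class HeatmapStateProvider(Protocol):
    def export_heatmap_state(self) -> Dict[str, Any]: ...
    def import_heatmap_state(self, state: Dict[str, Any]) -> None: ...


@runtime_checkable
class FlowsomParamsProvider(Protocol):
    # ASSUMPTION: method names are illustrative; the PR does not list them.
    def export_flowsom_params(self) -> Dict[str, Any]: ...
    def import_flowsom_params(self, params: Dict[str, Any]) -> None: ...


@dataclass(frozen=True)
class MaterializedSelection:
    """Selection stored as an explicit set of cell ids (hypothetical layout)."""

    cell_ids: FrozenSet[int]

    @classmethod
    def from_ids(cls, ids: Iterable[int]) -> "MaterializedSelection":
        return cls(frozenset(ids))

    def is_subset_of(self, other: "MaterializedSelection") -> bool:
        # Subset check, the basis for later subset-only enforcement.
        return self.cell_ids <= other.cell_ids

    def union(self, other: "MaterializedSelection") -> "MaterializedSelection":
        # Union is reserved for merge flows in the plan.
        return MaterializedSelection(self.cell_ids | other.cell_ids)

    def to_payload(self) -> Dict[str, Any]:
        # Sorted ids keep the serialized payload deterministic.
        return {
            "kind": "materialized",
            "count": len(self.cell_ids),
            "cell_ids": sorted(self.cell_ids),
        }
```

Structural protocols let Heatmap and FlowSOM satisfy the contract without inheriting from a shared base class, which fits the stated aim of keeping orchestration separate from the existing PluginBase UI plugins.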

Original prompt

This section details the original issue you should resolve.

<issue_title>Epic: Cell Annotation Workflow (checkpoints, merge, flexible markers)</issue_title>
<issue_description>Goal: Ship a dedicated Cell Annotation plugin that orchestrates Heatmap + FlowSOM to save/load/diff/merge checkpoints under .UELer/…, enforce subset-only downstream semantics (with merge as the only union exception), and support flexible marker sets (training vs display-extra, plus opt-in expanded training with imputation/projection).


Plan & scope

  • Owner plugin: Cell Annotation (new) — checkpoint lifecycle, DAG browser, selection semantics, storage under .UELer.
  • Heatmap: export/import view state; render training + extra markers; row linkage from training set only.
  • FlowSOM: run on declared training markers; compute medians for training ∪ extra; subset-only mode; support expanded training via imputation or complete-cases + projection.

Reference doc: Attach the Markdown plan file (cell-annotation-workflow-plan.md) to this issue or link it here.


Milestones

  • M1: Scaffolding (plugin skeleton, store, interfaces)
  • M2: Serialization & Manifest
  • M3: Checkpoint Browser UI
  • M4: Heatmap & FlowSOM upgrades + Marker Manager
  • M5: Merge workflow + Subset enforcement
  • M6: Tests • Docs • Release/Migration

Definition of Done (Epic)

  • End-to-end: save → manifest → browser → load → merge → recluster works across restarts.
  • Heatmap faithfully restores view; row order stable (linkage on training set).
  • FlowSOM honors subset context; expanded training behaves as specified (imputation/projection recorded).
  • Artifacts validate (checksums, invariants); performance budgets met.

Top-level checklist (convert each to sub-issue)

0) Project setup & guardrails

  • Create feature branch feature/cell-annotation-workflow
  • Add feature flag ENABLE_CELL_ANNOTATION=true (env/config)
  • Wire CI for unit + integration tests for the new plugin
  • Perf budgets:
    • DAG load ≤ 150 ms @ 500 nodes
    • Save checkpoint ≤ 2 s @ 2k clusters
    • Manifest rebuild ≤ 1 s @ 1k files

1) Core plugin & storage scaffolding

  • New plugin plugins/cell_annotation/ (plugin.py, store.py, manifest.py, selection_spec.py)
  • Hook into MainViewer lifecycle; create .UELer/dataset_<id>/{checkpoints,thumbnails,selections}

2) Cross-plugin interfaces

  • viewer/interfaces.py: SelectionSpec, HeatmapStateProvider, FlowsomParamsProvider
  • Heatmap & FlowSOM register providers on init

3) Serializer, schema & validator

  • serialize_heatmap_state(display, flowsom, meta) -> AnnData
  • Persist: canonical orientation; X float32; layers["median"]; uns: artifact/ui/palettes/zscore/filters/linkages
  • Marker sets: training, display_extra, available, linkage, expanded_training
  • Checkpoint: id (UUIDv7), parents, op, created_at, producer
  • FlowSOM snapshot (training_markers, imputation/projection, availability, params)
  • Atomic writer for .h5ad + checksums; validator

4) Manifest & thumbnails

  • Atomic manifest.json update & rebuild (ignore *.partial)
  • Generate 64–128 px thumbnails per checkpoint
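
As an illustration of the manifest item above, a rebuild that skips *.partial files and replaces manifest.json atomically could be sketched as follows. The function name, the manifest fields, and the assumption that manifest.json lives next to the checkpoint files are hypothetical; the issue only specifies an atomic update and that *.partial files are ignored.

```python
import json
import os
from pathlib import Path


def rebuild_manifest(checkpoint_dir) -> dict:
    """Rescan a directory and atomically rewrite manifest.json (illustrative)."""
    checkpoint_dir = Path(checkpoint_dir)
    entries = []
    for path in sorted(checkpoint_dir.iterdir()):
        # Skip in-progress writes, the manifest itself, and subdirectories.
        if not path.is_file():
            continue
        if path.name.endswith(".partial") or path.name == "manifest.json":
            continue
        entries.append({"file": path.name, "bytes": path.stat().st_size})

    # ASSUMPTION: this schema is invented for the example.
    manifest = {"version": 1, "checkpoints": entries}

    # Write to a .partial sibling first so readers never see a half-written file.
    tmp = checkpoint_dir / "manifest.json.partial"
    tmp.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    os.replace(tmp, checkpoint_dir / "manifest.json")
    return manifest
```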

5) Checkpoint Browser UI

  • DAG/tree + search/filter
  • Details pane (badges, marker sets)
  • Actions: Save, Load, Diff, Branch/Rebase, Merge
  • Save dialog shows live size estimate

6) Heatmap upgrades

  • export_heatmap_state & import_heatmap_state (include marker roles/availability/missing_rate, linkages, zscore, palettes)
  • Render training + extra sections; row linkage strictly on marker_sets.linkage

7) FlowSOM upgrades

  • run_flowsom(selection, params, training_markers, extra_markers, imputation, projection)
  • Subset-only mode under context; propose Merge when widening
  • Compute medians for training ∪ extra; snapshot records availability & strategies

8) Marker Manager (in Cell Annotation)

  • Pick training (default: intersection) with Include more markers option
  • Imputation (median/panel_median/kNN) or Complete cases + Project
  • Pick display-extra; choose linkage set (default intersection; warn if expanded)

9) Merge workflow

  • Multi-parent union; write uns["selection"] handle (Parquet)
  • Merge only and Merge & Recluster paths
  • Enforce subset-only semantics downstream of merge

10) Subset-only invariants

  • Validate child ⊆ parent (exact or count pre-check)
  • Clear UI errors (which predicate widened; % outside parent)
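
A rough sketch of the child ⊆ parent validation described above, including the count pre-check and the share of cells falling outside the parent. The function names and the shape of the returned report are assumptions, and selections are modeled as plain collections of cell ids.

```python
def count_precheck(child_count: int, parent_count: int) -> bool:
    """Cheap necessary condition: a subset cannot be larger than its parent."""
    return child_count <= parent_count


def check_subset(child_ids, parent_ids) -> dict:
    """Exact subset check that also reports how much of the child is outside."""
    child = set(child_ids)
    parent = set(parent_ids)
    if not count_precheck(len(child), len(parent)):
        # Fast rejection still computes the difference so the UI can report it.
        outside = child - parent
    else:
        outside = child - parent
    percent = 100.0 * len(outside) / len(child) if child else 0.0
    return {
        "ok": not outside,
        "outside_count": len(outside),
        "percent_outside": percent,
    }
```

The percent_outside value is the kind of figure a UI error could surface when a child selection widens beyond its parent.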

11) Save/Load wiring

  • Save: Heatmap export → (optional) FlowSOM snapshot → serialize → atomic write → manifest/thumbnail → event
  • Load: validate → Heatmap import → (optional) FlowSOM import → event

12) Testing & QA

  • Unit: serializer/validator, SelectionSpec ops, atomic wri...


Copilot AI and others added 2 commits March 17, 2026 22:51
Co-authored-by: yulewu <38241047+yulewu@users.noreply.github.com>
Co-authored-by: yulewu <38241047+yulewu@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add Cell Annotation plugin for checkpoint management" to "Add Cell Annotation plugin scaffolding, storage manifest, and provider wiring" Mar 17, 2026
Copilot AI requested a review from yulewu March 17, 2026 22:59
@yulewu yulewu marked this pull request as ready for review March 17, 2026 23:08
@yulewu yulewu merged commit 1dbc7a9 into cell_annotation Mar 17, 2026
4 checks passed