diff --git a/.gitignore b/.gitignore index cce868ec..8bdd129d 100644 --- a/.gitignore +++ b/.gitignore @@ -145,3 +145,7 @@ electron/ /stage_definitions_for_review.txt gently/ui/tui/node_modules/ gently/ui/tui/dist/ + +# Stray local storage: on Linux the Windows default GENTLY_STORAGE_PATH +# (D:\Gently3) is created literally as ./D:/ under the repo. Not data we track. +/D:/ diff --git a/docs/HEURISTICS-AUDIT.md b/docs/HEURISTICS-AUDIT.md new file mode 100644 index 00000000..249b27cc --- /dev/null +++ b/docs/HEURISTICS-AUDIT.md @@ -0,0 +1,112 @@ +# Heuristics audit — where to use the model (as a typed-output function) instead + +Codebase sweep (5 parallel scanners + synthesis) for heuristics that **fake +judgment** an LLM would do better — in the spirit of the genotype→channel +refactor (drop the lookup table, let the model infer, keep a typed provenance +record + confirm-when-unsure). The flip side — logic that **must stay +deterministic** (safety, math, calibration, transport) — is listed at the end so +we don't mistakenly LLM-ify it. + +The unifying move for every candidate: **LLM with a typed structured-output +schema + provenance + a confirm/UNCERTAIN escape**, never free-text-then-parse. + +## Model candidates (ranked) + +### High value + +1. **Hatching / time-to-stage prediction** — `organisms/celegans/developmental_tracker.py` + *(the closest twin of genotype→channel; medium effort)* + Three hardcoded 20 °C lookup tables (`STAGE_TIMING_20C`, `TIME_TO_HATCHING`, + `TIMING_VARIABILITY`) plus magic `{HIGH:1.0, MEDIUM:1.5, LOW:2.0}` uncertainty + fudge factors. Structurally **can't use the rig's actual temperature** (we run + a TEC), the strain, or the embryo's observed progression rate. Let the model + produce a calibrated, explained interval; **keep the literature table as a + deterministic sanity bracket** and flag when the estimate falls outside it. 
+ → `{ predicted_minutes_to_hatching, low, high, basis, assumptions{temperature_c,strain,used_observed_rate}, confidence, reasoning }` + +2. **Citation → PubMed query** — `harness/plan_mode/tools/research.py` (`_search_pmid`) + A regex that only handles "Surname et al YEAR …" + six hand-rolled query- + relaxation strategies + a stopword/word-position ladder that drops load-bearing + nouns. The model parses the sloppy citation and proposes relaxed queries; **code + keeps the deterministic esearch call and never fabricates a PMID.** + → `{ author_last, year, journal, topic_keywords[], organism, pubmed_query, alt_queries[], confidence }` + +3. **Lab-history retrieval** — `harness/plan_mode/tools/lab_context.py`, `harness/memory/interface.py` + Semantic recall faked by substring-OR over query tokens (matches "we"/"before", + misses every paraphrase). Feed the model the candidate records and have it + **rank/select from provided ids only** (no fabrication). Read-only, no + acquisition risk. + → `{ matches:[{kind,id,summary,relevance,why_relevant}], answer }` + +4. **Stage-label parse via 22-entry synonym dict** — `developmental_tracker.py` (`_parse_stage_name`) + *(small effort, pure robustness win)* The Vision call already classifies; the + brittleness is a plain-text `STAGE:/CONFIDENCE:` block scraped line-by-line, with + off-vocabulary phrasings silently collapsing to `UNKNOWN` (which kills the + downstream hatching prediction). Constrained-enum structured output deletes the + parser + synonym table. + → `{ stage: enum(...), confidence: enum(high|medium|low), is_transitional, reasoning }` + +### Medium value (mostly small — fix the output contract, not the judgment) + +5. **Calibration Vision calls** — `hardware/dispim/claude_client.py` + Four Vision calls return positional free text recovered by `'yes' in first_line` + / `re.search(r'\d+')` / first-valid-letter, with silent defaults (so "no, this is + not yes…" reads as *yes*). 
Typed output deletes the parse + silent-default layer. + +6. **ML architecture ranking** — `ml/architectures.py` (`get_suitable_architectures`) + Hard feasibility gates (VRAM / dataset) are correct **and stay**; the `+2/+1/+1` + point-score ranking that follows discards the per-arch prose. Let the model rank + the *pre-filtered feasible set* (ids constrained to that set). + +7. **Training label normalization** — `ml/data_loader.py` (`build_labels_from_store`) + Class space built by exact-string identity over free-text human annotations — + "1.5-fold" and "1.5 fold" become different classes. Model normalizes to the + canonical staging vocabulary, flags novel/ambiguous ones. + +### Lower value + +8. **"Plan has a control?"** — `plan_mode/tools/validation.py` — substring scan of a + 6-word keyword set; a scientific judgment over the whole plan. Non-blocking + warning → safe for the model. +9. **CGC HTML scraping** — `research.py` (`_cgc_search`) — positional multi-group + regex over fetched HTML; structured extraction the model does better (HTTP GET + stays code; **mark strain names low-confidence to avoid sending someone to order + a hallucinated strain**). + +### Cross-cutting batch (small each): typed output for the detector/verifier cluster +`harness/detection/verifier.py`, `app/detectors/hatching.py`, +`app/detectors/dopaminergic_signal.py`, `hardware/dispim/sam_detection.py` — all +already make the right model call but reconstruct the verdict via +`startswith`/regex-JSON-scraping with silent defaults. A batch move to native +structured output **strictly reduces parse-induced false negatives** without +touching the deterministic vote-tally/consensus/enum-dispatch downstream. + +**Reference implementations already in the repo (imitate, don't change):** +`dopaminergic_signal`'s perceiver→classifier rubric (typed enums, UNCERTAIN +escape, conservative-on-tie) and onboarding's `_extract_with_llm` (typed +extraction, degrade-to-verbatim fallback). 
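A minimal sketch of the "typed output + UNCERTAIN escape" contract that the candidates above share, assuming the model's structured output arrives as a plain dict (for example, the parsed input of a forced tool call). The names `Stage`, `Confidence`, `StageVerdict`, and `parse_stage_verdict` are hypothetical and do not come from the repo, and the enum lists only a subset of the stage names mentioned in this PR.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(str, Enum):
    # Hypothetical, partial vocabulary; the real staging enum lives in the repo.
    BEAN = "bean"
    COMMA = "comma"
    PRETZEL = "pretzel"
    HATCHED = "hatched"
    UNCERTAIN = "uncertain"


class Confidence(str, Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass(frozen=True)
class StageVerdict:
    stage: Stage
    confidence: Confidence
    reasoning: str
    needs_confirmation: bool


def parse_stage_verdict(payload: dict) -> StageVerdict:
    """Map a structured-output dict onto typed enums.

    Off-vocabulary values do not collapse silently: they become UNCERTAIN
    (or LOW confidence) and set needs_confirmation so a human is asked.
    """
    try:
        stage = Stage(str(payload.get("stage", "")).lower())
    except ValueError:
        stage = Stage.UNCERTAIN
    try:
        confidence = Confidence(str(payload.get("confidence", "")).lower())
    except ValueError:
        confidence = Confidence.LOW
    needs_confirmation = stage is Stage.UNCERTAIN or confidence is Confidence.LOW
    return StageVerdict(
        stage=stage,
        confidence=confidence,
        reasoning=str(payload.get("reasoning", "")),
        needs_confirmation=needs_confirmation,
    )
```

The design choice here is that the escape hatch is explicit: an unrecognized label is carried forward as `UNCERTAIN` with a confirmation flag, so downstream code can decide what to do instead of inheriting a silent default.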
+ +## Keep deterministic (do NOT LLM-ify) +Safety, math, calibration, and transport — where a hallucinated value is unsafe +or breaks reproducibility: +- Laser-power safety limits + wavelength→MM-property map (`hardware/dispim/devices/optical.py`) +- SPIM trigger-timing arithmetic, piezo–galvo calibration, MM framing (`dispim/config.py`) +- Calibration prior EMA + R²≥0.75 slope-lock gate (`dispim/calibration.py`) +- SwitchBot GATT byte commands / status decoding (`hardware/switchbot.py`) +- Temperature setpoint bound [0,99.9] °C + stabilization I/O (`hardware/temperature.py`) +- Autofocus signal-processing, curve fitting, adaptive-sweep stop rules (`analysis/core.py`, `analysis/focus.py`) +- Classical-CV ROI detection + pixel→stage coordinate transforms (`detection.py`, `sam_detection.py` geometry) +- Timelapse rule dispatch + `confirm_timepoints` debounce + monotonic power ramp (`app/orchestration/timelapse.py`) +- Volume→b64 dark/flat calibration + fixed brightness scaling (`dopaminergic_signal._volume_to_b64` — deliberately non-adaptive) +- Wake-router debounce/throttle/stage-transition gate (`app/wake_router.py`) +- Plan hardware limits, detector-preset membership, dependency-cycle DFS, stage-order normalization (`plan_mode/tools/validation.py`) +- Ensemble vote tally + 0.70 quorum / unanimity consensus (`detection/verifier.py`) +- ML metric/aggregation math: confusion matrix, F1, federated averaging (`ml/evaluation.py`, `federated.py`) +- Core imaging geometry (max-projection, crop bounds, Euler rotations) + UI event reduction/routing/security (`core/imaging.py`, `ui/web/*`) +- Device-state SSE watchdog/staleness timers (`app/device_state_monitor.py`) +- Reference-type dispatch (PMID/DOI/URL by canonical syntax), `os.path.isfile` checks (`research.py`) + +## Note +`gap_assessment.conversation_weight` (the 0.25/0.1/0.05 readiness scalar) is now +largely **vestigial** — it only returns 'heavy' (lab onboarding) or 'none' — so +it's not worth an API call. 
Left off the candidate list. diff --git a/docs/ux-v2-flow-audit.md b/docs/ux-v2-flow-audit.md new file mode 100644 index 00000000..00bba3e7 --- /dev/null +++ b/docs/ux-v2-flow-audit.md @@ -0,0 +1,106 @@ +# UX v2 — interaction-flow / IA audit + +**Branch:** `feature/ux-v2` (now includes the 3D optical-space view). +**Scope:** the *flow* of the agent-first UI — clicks, how each step renders, how the +workspace is unveiled, moving back/forth between views, resume — **not** the visual look +(the look is fine). Plus where the 3D optical-space view belongs in the new workspace IA. + +**Method:** live click-audit driven through a real browser as a *dev biologist* would use +it, with the agent **live** (Opus 4.8, `--offline` hardware, `GENTLY_NO_AUTH=1` single +controller), cross-checked against the code. Screenshots from the run are in `screenshots/audit-*.png`. + +> Correction to an earlier automated pass: the plan-wizard helpers +> (`buildAskCard`/`answerChoice`/`togglePanel`) are **not** missing — `agent-chat.js` +> exports them and the module loads; the plan wizard works. The real issues are below. + +--- + +## What works (keep it) + +- **The forward path is good.** Entry → one calm choice (Plan / Quick look / "just tell me") + → overlay dismisses to reveal the workspace → grouped rail (NOW / LIBRARY / SYSTEM) drives + everything through one chokepoint (`app.js switchTab`). The welcome→workspace unveil is genuinely nice. +- **The agent-driven plan wizard is strong.** Live, it asked a well-framed scientific question + ("What's the core scientific question this run should capture?") with real C. elegans options, + ran a `query_lab_history` tool with visible provenance, and **assembled THE PLAN panel as each + answer landed** (strain → wavelengths, etc.). The "plan builds as you answer" feel is excellent. +- **The dual-render** (ask shows in the plan stage *and* the chat transcript) is implemented. 
+ +--- + +## Findings (prioritized) + +| # | Pri | Symptom (felt) | Root cause / evidence | Fix | +|---|-----|----------------|------------------------|-----| +| 1 | **P0** | First plan step sat on "working through the next step…" for **~90s** with a static spinner — feels hung. | The wait is the model *thinking*. The streaming call requests **no thinking config** and the stream loop reads only text deltas. `conversation.py:272-275` (only `output_config.effort`), `conversation.py:654-657` (only `event.delta.text`). | Set `thinking={"type":"adaptive","display":"summarized"}` on the stream (`conversation.py:552`); handle `thinking_delta` in the loop (`:654`) and emit as a `thinking` activity; render it live + add an elapsed timer. See §1. | +| 2 | **P1** | Agent's first line renders as **"'d love to help…"** — leading "I" dropped. | Plan-feed streaming path drops the first character of the turn's first text block; the chat transcript renders it correctly (`12_41` vs `12_3` in the run). Plan feed: `landing.js applyActivity` `'text'` case (`:269`). | Most likely the first `AGENT_ACTIVITY`/`text` delta is missed by `landing.js`'s listener (subscribed after the first delta) or coalesced wrong. Confirm with a 1-line repro; the transcript path is the reference. | +| 3 | **P1** | Clicked the primary "Plan an experiment" → plan stage spun forever; the *real* blocker ("Viewing only — control is held by another client / sign in to control") was **hidden in the chat panel**. | Control/auth state isn't surfaced on the landing/plan surface — only in the chat dock. A viewer can enter the plan flow and dead-end. | Surface control/sign-in state on the landing **before** the primary CTA; gate or relabel "Plan an experiment" when `!hasControl`; show the wall on the plan stage, not just chat. | +| 4 | **P1** | (Structural) The same ask renders in **two** stage mounts plus the transcript. 
| `#v2-plan-ask` **and** `#ask-stage` both render the ask (the overlay covers the workspace copy, so only cosmetic/perf today). Two live regions seen in the run (`12_10` + `12_24`). | One stage mount at a time — suppress `#ask-stage` while the landing overlay owns the ask. | +| 5 | **P1** | Cross-surface clear can desync. | `ASK_CLEARED` is **listened for but emitted nowhere** (`landing.js:624`, `ask-stage.js:43` listen; no emit in repo). Answering works locally because `renderAsk.onPick` clears directly, but stage↔transcript sync relies on the missing signal. | Emit `ASK_CLEARED` the instant a `choice_response` is sent (per the migration plan's Phase-1 blocker), plus on cancel/control-loss/socket-close. | +| 6 | **P1** | **No way back.** Once the landing dismisses, there's no path back to welcome / "start a new plan" from the workspace — must reload. | `dismiss()` is one-way (`landing.js:42-54`); `V2Landing.show()` exists but is never called from the workspace. | Add a "New plan" / "Talk to Gently" entry in the rail or header that re-summons the welcome/plan surface. | +| 7 | **P2** | Browser **Back / refresh don't mean anything**; refresh mid-plan loses state and may re-show the landing. | Entry hash is consumed (`app.js` → `replaceState('/')`, ~`:650-662`); no deliberate URL/state sync; in-memory plan state (`planKickedOff`, feed pages) resets on reload. | Real routing: sync screen/tab to URL/History so Back/forward/refresh resolve; persist or re-hydrate plan progress. | +| 8 | **P2** | **Resume = full page reload** — jarring, re-shows landing, drops chat position. | `session_changed` → `window.location.href='/'` (`websocket.js:147`; `review.js resumeSession ~:101-116`). Flagged in the migration plan. | In-place re-hydration on `session_changed` instead of a hard reload. | +| 9 | **P1 (IA)** | The **3D optical-space view is buried**: SYSTEM → Devices → (Map / Details / **3D**) — a sub-sub-toggle. 
| It was integrated into the *legacy* Devices tab structure; the ux-v2 grouped rail doesn't surface it. | Promote "the scope in space" to a first-class run-time surface (NOW tier), reconciled with the grouped rail. See §2. | +| 10 | **P2** | Offline / agent-silent dead-ends the wizard at "working…". | `startPlan` campaign fetch falls through silently if offline (`landing.js ~:502-508`); no error path. | Timeout + inline error/retry on the plan stage. | + +--- + +## §1 — Make the loading state legible (P0, the one the user wants first) + +The 90s "working…" is the agent reasoning. The Claude streaming API exposes this on three +channels; gently currently surfaces none of the reasoning: + +- **Thinking** — `content_block_delta` → `thinking_delta`. **Opus 4.8 defaults to + `display:"omitted"` (empty thinking text)**, and gently doesn't set the thinking config at + all on the stream, so there's nothing to show. Unlock: `thinking={"type":"adaptive","display":"summarized"}`. +- **Tool activity** — `input_json_delta` + tool start/stop. **Already flowing** — the plan feed + renders tool cards (saw the `query_lab_history` card with input/result). +- **Text** — `text_delta`. Already flowing (this is the path with the bug #2 truncation). + +**Backend (`gently/harness/conversation.py`):** +1. `:552` `self.claude.messages.stream(...)` — add `thinking={"type":"adaptive","display":"summarized"}` + (keep `output_config.effort`). +2. `:654` event loop — currently only `if hasattr(event.delta, "text")`. Add a branch for + `event.delta.type == "thinking_delta"` → `yield {"type":"thinking","text": event.delta.thinking}`. + +**Frontend (`gently/ui/web/static/js/landing.js`):** `applyActivity` already has a `thinking` +case (`:266`) that only sets a static label — render the streamed thinking text instead, and add +an elapsed timer to `#v2-plan-thinking` so a long think reads as progress, not a hang. + +Net: the reasoning summary + current tool + a timer fill the wait. 
Only the backend `display` +flag is a new capability; the rest is surfacing data gently already receives. + +--- + +## §2 — Workspace organization & where the 3D view belongs (P1, IA) + +The ux-v2 workspace is organized differently from the old flat tab bar: a **grouped rail** +(NOW: Home/Experiment/Embryos · LIBRARY: Plans/Sessions · SYSTEM: Devices/Calibration/Logs), +a **session-context strip**, and the **AGENT'S VIEW** surface. The 3D optical-space view, +however, lives in the *legacy* Devices structure (`devices.js switchView`, VIEWS = +`['map','details','optical3d']`; `index.html` devices-content Map/Details/3D switcher). + +During an actual run, "where the scope is in space" + the live experiment + the agent's view are +**NOW-tier** concerns, not a System utility three clicks deep. Proposal (to design next): +- Promote the 3D optical-space + live experiment to a first-class run-time surface in the rail + (or make it the default workspace view while a run is active). +- Keep the Devices Map/Details as the System-tier hardware utility; the 3D "scope in space" + graduates out of that toggle. + +--- + +## Recommended sequencing + +1. **P0 loading state** (§1) — highest felt value, mostly surfacing existing data. +2. **P1 quick correctness**: #2 truncation, #3 control-wall surfacing, #4 single ask mount, #5 `ASK_CLEARED` emit. +3. **P1 reachability**: #6 "new plan"/back entry; then #9 the workspace-IA / 3D-placement redesign (its own design pass). +4. **P2 navigation**: #7 real routing, #8 resume re-hydration, #10 offline error path. + +--- + +## Notes / housekeeping + +- Findings 1–5, 10 verified live with the agent on; 6–9 verified from code + the live rail. +- `screenshots/audit-*.png` (live run) and `screenshots/uxv2-*.png` are local evidence (untracked). +- The earlier visual-design exploration (`docs/superpowers/mockups/`, `screenshots/dir-*.png`) is + superseded — the look is staying as-is — and can be deleted. 
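The backend half of §1 could look roughly like the generator below. This is a sketch under assumptions, not the code in `conversation.py`: `iter_activity` and `_delta` are hypothetical names, and the events are stand-in objects shaped like the streaming deltas the audit describes (`thinking_delta` carrying `.thinking`, `text_delta` carrying `.text`), so the branch logic can be exercised without an API call.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Iterator


def iter_activity(events: Iterable[Any]) -> Iterator[dict]:
    """Turn raw stream events into activity dicts for the plan feed.

    Dispatches on delta.type rather than hasattr(delta, "text"), so
    thinking deltas are surfaced instead of being skipped.
    """
    for event in events:
        if getattr(event, "type", None) != "content_block_delta":
            continue  # start/stop events carry no delta payload
        delta = event.delta
        kind = getattr(delta, "type", None)
        if kind == "thinking_delta":
            yield {"type": "thinking", "text": delta.thinking}
        elif kind == "text_delta":
            yield {"type": "text", "text": delta.text}


def _delta(kind: str, **fields: str) -> SimpleNamespace:
    # Hypothetical stand-in for an SDK stream event, for illustration only.
    return SimpleNamespace(
        type="content_block_delta",
        delta=SimpleNamespace(type=kind, **fields),
    )


demo_events = [
    _delta("thinking_delta", thinking="Checking lab history"),
    _delta("text_delta", text="I'd love to help"),
    SimpleNamespace(type="message_stop"),
]
activities = list(iter_activity(demo_events))
```

With a loop shaped like this, the frontend `thinking` case would receive incremental text to render, and the elapsed timer only has to cover the gaps between deltas.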
diff --git a/gently/app/agent.py b/gently/app/agent.py index 4a5f6602..f68d67a7 100644 --- a/gently/app/agent.py +++ b/gently/app/agent.py @@ -27,16 +27,13 @@ if TYPE_CHECKING: from ..ui.web.server import VisualizationServer + from gently_perception import Perceiver from ..core import EventType, emit, get_event_bus from ..core.file_store import FileStore from ..harness.conversation import ConversationManager -from ..harness.orchestration.plan_synthesis import ( - PlanLibrary, - PlanSynthesizer, - PlanValidator, -) +from ..harness.orchestration.plan_synthesis import PlanLibrary, PlanSynthesizer, PlanValidator from ..harness.prompts.manager import PromptManager from ..harness.session.interaction_logger import InteractionLogger from ..harness.session.manager import SessionManager @@ -112,12 +109,13 @@ def __init__( # the message entry points refuse to call Claude. self.api_enabled = not no_api - # API client with interleaved thinking support + # Shared API client. No interleaved-thinking beta header: it's GA on the + # 4.6+ models and obsolete on Fable 5 (always-on thinking); the header is + # dropped so it can't conflict with the new model family. self.claude = anthropic.Anthropic( api_key=api_key or os.getenv("ANTHROPIC_API_KEY") or ("no-api-mode" if no_api else None), - default_headers={"anthropic-beta": "interleaved-thinking-2025-05-14"}, ) self.model = model @@ -380,11 +378,7 @@ def enter_plan_mode(self) -> str: import gently.harness.plan_mode.tools # noqa: F401 self._update_system_prompt() - emit( - EventType.STATUS_CHANGED, - {"field": "agent_mode", "value": "plan"}, - source="agent", - ) + emit(EventType.STATUS_CHANGED, {"field": "agent_mode", "value": "plan"}, source="agent") logger.info("Entered plan mode") return "Switched to plan mode. I'm now your experimental design collaborator." 
@@ -480,11 +474,7 @@ def exit_plan_mode(self) -> str: self.prompts.invalidate_context_cache() self._update_system_prompt() - emit( - EventType.STATUS_CHANGED, - {"field": "agent_mode", "value": "run"}, - source="agent", - ) + emit(EventType.STATUS_CHANGED, {"field": "agent_mode", "value": "run"}, source="agent") logger.info("Exited plan mode") return result @@ -760,10 +750,7 @@ def on_perception(event): self.invalidate_context_cache() self._auto_save() logger.info( - "Perception: %s -> stage %s (t%s)", - embryo_id, - stage, - data.get("timepoint"), + "Perception: %s -> stage %s (t%s)", embryo_id, stage, data.get("timepoint") ) except Exception as e: logger.warning(f"Error handling perception event: {e}") @@ -1615,8 +1602,8 @@ async def check_blank_image( img.save(buffer, format="PNG") b64_image = base64.b64encode(buffer.getvalue()).decode() - prompt = """Look at this microscopy image. Is this a VALID microscopy image or a -BLANK/CORRUPTED image? + prompt = """\ +Look at this microscopy image. Is this a VALID microscopy image or a BLANK/CORRUPTED image? 
A BLANK or CORRUPTED image shows: - Mostly uniform gray/black with no structure diff --git a/gently/app/detectors/dopaminergic_signal.py b/gently/app/detectors/dopaminergic_signal.py index 7fb09408..9e9edbd9 100644 --- a/gently/app/detectors/dopaminergic_signal.py +++ b/gently/app/detectors/dopaminergic_signal.py @@ -273,7 +273,9 @@ async def _call_perceiver( } ], ) - raw = response.content[0].text if response.content else "" + if response.stop_reason == "refusal" or not response.content: + return "(perception model declined the request)", "" + raw = response.content[0].text return raw.strip(), raw async def _call_classifier( @@ -290,7 +292,9 @@ async def _call_classifier( max_tokens=300, messages=[{"role": "user", "content": prompt}], ) - raw = response.content[0].text if response.content else "" + if response.stop_reason == "refusal" or not response.content: + return dict(_DEFAULT_FINDINGS), "", "Safety refusal" + raw = response.content[0].text findings, parse_err = _parse_response(raw) return findings, raw, parse_err diff --git a/gently/app/detectors/hatching.py b/gently/app/detectors/hatching.py index 62dd31cb..6dc0705d 100644 --- a/gently/app/detectors/hatching.py +++ b/gently/app/detectors/hatching.py @@ -5,6 +5,10 @@ pipeline trains on. The dopaminergic-signal detector already returns ``has_hatched`` as part of its richer schema; this is a lighter-weight yes/no for use cases where structure / intensity assessment isn't needed. + +The verdict comes back as a forced tool call (``tool_choice`` pins the model +to ``record_hatching``), so the structured fields arrive already parsed as +``block.input`` — no JSON-from-prose scraping, no silent-default parse layer. """ import asyncio @@ -20,8 +24,9 @@ logger = logging.getLogger(__name__) -_HATCHING_PROMPT = """You are observing a C. elegans embryo on a microscope. Decide whether -the embryo has HATCHED. +_HATCHING_PROMPT = """\ +You are observing a C. elegans embryo on a microscope. 
Decide whether the embryo has HATCHED, +then record your decision with the record_hatching tool. A HATCHED embryo: - Has visibly broken out of the eggshell @@ -32,20 +37,37 @@ - Is still contained within an intact eggshell - May be at any pre-hatching stage (bean, comma, 1.5-fold, 2-fold, pretzel) -Respond with ONLY a JSON object exactly matching this schema: +Default to has_hatched=false unless you are confident. Don't over-call hatching. +""" -{ - "has_hatched": true|false, - "confidence": "LOW|MEDIUM|HIGH", - "reasoning": "..." -} -Default to false unless you are confident. Don't over-call hatching. -""" +# Forced tool schema — the model is pinned to this via tool_choice, so the +# fields come back as a validated dict on the tool_use block. The conservative +# "default to false" guidance lives in the prompt. We deliberately do NOT ask +# the model to self-rate confidence — that's a heuristics-era artifact; the +# has_hatched judgment is the signal. +_HATCHING_TOOL = { + "name": "record_hatching", + "description": "Record whether the C. 
elegans embryo has hatched, with brief reasoning.", + "input_schema": { + "type": "object", + "properties": { + "has_hatched": { + "type": "boolean", + "description": "True only if the embryo has visibly broken out of the eggshell.", + }, + "reasoning": { + "type": "string", + "description": "One short sentence citing the visual evidence for the call.", + }, + }, + "required": ["has_hatched", "reasoning"], + }, +} class HatchingDetector(Detector): - """Claude-vision hatching yes/no, with confidence.""" + """Claude-vision hatching yes/no.""" name = "hatching" @@ -59,7 +81,6 @@ async def run( context: dict[str, Any], ) -> DetectorResult: import json - import re import anthropic @@ -84,7 +105,7 @@ async def run( detector_name=self.name, embryo_id=embryo_id, timepoint=timepoint, - findings={"has_hatched": False, "confidence": "LOW"}, + findings={"has_hatched": False}, reasoning="Empty / unreadable volume", elapsed_ms=(time.time() - start) * 1000, ) @@ -93,7 +114,9 @@ async def run( response = await asyncio.to_thread( claude.messages.create, model=self._model or settings.models.fast, - max_tokens=200, + max_tokens=256, + tools=[_HATCHING_TOOL], + tool_choice={"type": "tool", "name": _HATCHING_TOOL["name"]}, messages=[ { "role": "user", @@ -111,23 +134,24 @@ async def run( } ], ) - raw = response.content[0].text if response.content else "" - findings = {"has_hatched": False, "confidence": "LOW"} + # Forced tool_choice guarantees a tool_use block; read its parsed + # input directly. No regex, no JSON-from-prose fallback. 
+ tool_input = next( + (b.input for b in response.content if getattr(b, "type", None) == "tool_use"), + None, + ) + + findings = {"has_hatched": False} reasoning = None err = None - try: - m = re.search(r"\{.*?\}", raw, re.DOTALL) - blob = m.group(0) if m else raw.strip() - parsed = json.loads(blob) - findings["has_hatched"] = bool(parsed.get("has_hatched", False)) - confidence = str(parsed.get("confidence", "LOW")).upper() - if confidence not in {"LOW", "MEDIUM", "HIGH"}: - confidence = "LOW" - findings["confidence"] = confidence - reasoning = parsed.get("reasoning") - except (json.JSONDecodeError, AttributeError) as e: - err = f"parse error: {e}" + if isinstance(tool_input, dict): + findings["has_hatched"] = bool(tool_input.get("has_hatched", False)) + reasoning = tool_input.get("reasoning") + else: + # Shouldn't happen with forced tool_choice — keep the + # conservative default and record why. + err = "no tool_use block in response" return DetectorResult( detector_name=self.name, @@ -135,7 +159,7 @@ async def run( timepoint=timepoint, findings=findings, reasoning=reasoning, - raw_response=raw, + raw_response=json.dumps(tool_input) if isinstance(tool_input, dict) else None, elapsed_ms=(time.time() - start) * 1000, error=err, ) diff --git a/gently/app/tools/acquisition_tools.py b/gently/app/tools/acquisition_tools.py index 8efcef79..79f69732 100644 --- a/gently/app/tools/acquisition_tools.py +++ b/gently/app/tools/acquisition_tools.py @@ -6,6 +6,8 @@ import asyncio import logging +import time +from typing import Any import numpy as np @@ -15,6 +17,62 @@ logger = logging.getLogger(__name__) +def _publish_scan_geometry( + agent: Any, + *, + embryo_id: str, + stage_position: dict | None, + num_slices: int, + exposure_ms: float, + galvo_amplitude: float, + galvo_center: float, + piezo_amplitude: float, + piezo_center: float, +) -> None: + """Emit SCAN_GEOMETRY_UPDATE describing the cuboid being acquired. 
+ + Drives the 3D optical-space view (the addressable volume + the scan cuboid + and light-sheet mode). Telemetry only — callers guard against exceptions so + this never interferes with an acquisition. The payload is also stashed on + the agent for REST bootstrap (``/api/devices/scan_geometry``). + """ + from gently.core import EventType, get_event_bus + + z_extent_um = 2.0 * piezo_amplitude + slice_spacing_um = z_extent_um / (num_slices - 1) if num_slices > 1 else 0.0 + sx = stage_position.get("x") if stage_position else None + sy = stage_position.get("y") if stage_position else None + + payload: dict[str, Any] = { + "embryo_id": embryo_id, + "stage_position_um": {"x": sx, "y": sy}, + "scan": { + "num_slices": num_slices, + "exposure_ms": exposure_ms, + "galvo_amplitude_deg": galvo_amplitude, + "galvo_center_deg": galvo_center, + "piezo_amplitude_um": piezo_amplitude, + "piezo_center_um": piezo_center, + }, + "derived": { + "z_extent_um": z_extent_um, + "slice_spacing_um": slice_spacing_um, + "z_min_um": piezo_center - piezo_amplitude, + "z_max_um": piezo_center + piezo_amplitude, + }, + # diSPIM here is scanned-light-sheet only; a future pencil/beam tool + # would emit "pencil". See the 3D optical-space view notes. + "mode": "sheet", + "ts": time.time(), + } + agent.last_scan_geometry = payload + get_event_bus().publish( + event_type=EventType.SCAN_GEOMETRY_UPDATE, + data=payload, + source="acquisition-tools", + ) + + @tool( name="acquire_volume", description="""Acquire a single 3D lightsheet volume for a specific embryo. Moves to embryo @@ -88,6 +146,23 @@ async def acquire_volume( piezo_amplitude = piezo_amplitude + (additional_buffer_um * abs(slope) / 100.0) z_buffer_applied = z_buffer_um + # Publish the resolved scan geometry for the 3D optical-space view. + # Telemetry only — must never break the acquisition. 
+ try: + _publish_scan_geometry( + agent, + embryo_id=embryo_id, + stage_position=pos, + num_slices=num_slices, + exposure_ms=exposure_ms, + galvo_amplitude=galvo_amplitude, + galvo_center=galvo_center, + piezo_amplitude=piezo_amplitude, + piezo_center=piezo_center, + ) + except Exception: + logger.debug("SCAN_GEOMETRY_UPDATE publish failed", exc_info=True) + result = await client.acquire_volume( num_slices=num_slices, exposure_ms=exposure_ms, diff --git a/gently/core/event_bus.py b/gently/core/event_bus.py index 3b98e616..d3ab0e7a 100644 --- a/gently/core/event_bus.py +++ b/gently/core/event_bus.py @@ -86,12 +86,17 @@ class EventType(Enum): DEVICE_STATE_UPDATE = auto() # Periodic device-state snapshot from device layer BOTTOM_CAMERA_FRAME = auto() # Live JPEG frame from the bottom camera stream EMBRYOS_UPDATE = auto() # Full embryo list snapshot from agent.experiment + SCAN_GEOMETRY_UPDATE = auto() # Scan cuboid + light-sheet mode for the 3D optical-space view # Python logging.LogRecord republished onto the bus so the Events page # surfaces what would otherwise only land in the terminal. See # gently/core/log_bridge.py — opt-in handler. LOG_RECORD = auto() + # Agent context/mind updates (expectations / watchpoints / questions) — + # drives the shared-visibility surface in the v2 UI. + CONTEXT_UPDATED = auto() + # Operator-action events. Distinct from EMBRYOS_UPDATE because they # carry intent ("a human did this") rather than just state delta. 
# Candidate orchestrators can subscribe and reason about what the diff --git a/gently/hardware/dispim/sam_detection.py b/gently/hardware/dispim/sam_detection.py index cea9990a..45355438 100644 --- a/gently/hardware/dispim/sam_detection.py +++ b/gently/hardware/dispim/sam_detection.py @@ -18,16 +18,15 @@ import numpy as np from PIL import Image -from gently.settings import settings - -logger = logging.getLogger(__name__) - -from gently.core.coordinates import ( # noqa: E402 +from gently.core.coordinates import ( DEFAULT_OBJECTIVE_MAG, DEFAULT_PIXEL_SIZE_UM, get_um_per_pixel, pixel_to_stage_position, ) +from gently.settings import settings + +logger = logging.getLogger(__name__) class SAMEmbryoDetector: @@ -756,8 +755,8 @@ async def _review_with_claude( image_base64 = self._encode_image_base64(annotated) - prompt = f"""You are a microscopy expert analyzing embryo detections from a bottom -camera view. + prompt = f"""\ +You are a microscopy expert analyzing embryo detections from a bottom camera view. CURRENT DETECTIONS: {len(embryos)} embryos labeled 0-{len(embryos) - 1} with colored bounding boxes. 
@@ -789,7 +788,9 @@ async def _review_with_claude( message = self.claude_client.messages.create( model=settings.models.perception, max_tokens=8000, - thinking={"type": "enabled", "budget_tokens": 5000}, + output_config={ + "effort": "high" + }, # was thinking budget_tokens (Opus 4.8 rejects it) messages=[ { "role": "user", @@ -859,7 +860,9 @@ async def _verify_with_claude( message = self.claude_client.messages.create( model=settings.models.perception, max_tokens=6000, - thinking={"type": "enabled", "budget_tokens": 4000}, + output_config={ + "effort": "high" + }, # was thinking budget_tokens (Opus 4.8 rejects it) messages=[ { "role": "user", diff --git a/gently/harness/bridge.py b/gently/harness/bridge.py index ddbfa9ca..fdd21040 100644 --- a/gently/harness/bridge.py +++ b/gently/harness/bridge.py @@ -251,9 +251,12 @@ def _candidate_to_option(self, item, spec, campaign) -> dict: spec_dict: dict[str, Any] = {} for field in ( "strain", + "genotype", + "reporter", "temperature_c", "num_slices", "exposure_ms", + "laser_wavelength_nm", "interval_s", "stop_condition", "success_criteria", @@ -261,6 +264,11 @@ def _candidate_to_option(self, item, spec, campaign) -> dict: val = getattr(spec, field, None) if val is not None: spec_dict[field] = val + # Carry per-field provenance so the UI can tag inferred values + # (e.g. "561 nm · inferred · medium") and show what to confirm. 
+ prov = getattr(spec, "provenance", None) + if prov: + spec_dict["provenance"] = prov if spec_dict: meta["spec"] = spec_dict diff --git a/gently/harness/conversation.py b/gently/harness/conversation.py index e154c724..a3955df0 100644 --- a/gently/harness/conversation.py +++ b/gently/harness/conversation.py @@ -12,6 +12,8 @@ import time from typing import Any +from ..settings import settings + logger = logging.getLogger(__name__) @@ -158,15 +160,13 @@ def should_use_thinking(self, message: str, mode: str) -> bool: if re.search(r"\b(plan|timelapse|time-lapse|acquisition)\b", msg_lower): return True if re.search( - r"\b(analy[sz]e|look at|check|inspect|review).*(image|volume|embryo)", - msg_lower, + r"\b(analy[sz]e|look at|check|inspect|review).*(image|volume|embryo)", msg_lower ): return True if re.search(r"\b(all|every|each)\s+(embryo|sample)", msg_lower): return True if re.search( - r"\b(first|then|after|next|finally)\b.*\b(first|then|after|next|finally)\b", - msg_lower, + r"\b(first|then|after|next|finally)\b.*\b(first|then|after|next|finally)\b", msg_lower ): return True if re.search(r"\b(why|problem|issue|error|wrong|fail|debug|troubleshoot)", msg_lower): @@ -176,6 +176,33 @@ def should_use_thinking(self, message: str, mode: str) -> bool: # ===== Non-Streaming API Call ===== + async def _create_with_refusal_fallback(self, api_kwargs): + """messages.create with main-tier resilience: if the model rejects the + request with a 400 (e.g. Fable 5 under <30-day org data retention, or + unavailable) OR declines it (stop_reason="refusal", empty content), retry + the SAME request once on the fallback model (Opus 4.8) — so gently keeps + working whether or not Fable 5 is currently serviceable. 
The moment the + org retention is fixed, Fable 5 serves with no code change.""" + from anthropic import BadRequestError + + fb = settings.models.refusal_fallback + model = api_kwargs.get("model") + try: + response = await self._call_api_with_retry(self.claude.messages.create, **api_kwargs) + except BadRequestError: + if not fb or fb == model: + raise + logger.warning("Model %s rejected the request (400); falling back to %s", model, fb) + return await self._call_api_with_retry( + self.claude.messages.create, **{**api_kwargs, "model": fb} + ) + if response.stop_reason == "refusal" and fb and fb != model: + logger.warning("Model %s declined the turn; retrying on %s", model, fb) + response = await self._call_api_with_retry( + self.claude.messages.create, **{**api_kwargs, "model": fb} + ) + return response + async def call_claude( self, user_message: str, system_prompt, tools, mode: str, auto_save_fn ) -> str: @@ -243,10 +270,11 @@ async def call_claude( "max_tokens": 16000 if use_thinking else 4096, } if use_thinking: - budget = 30000 if mode == "plan" else 10000 - api_kwargs["thinking"] = {"type": "enabled", "budget_tokens": budget} + # Fable 5 / Opus 4.8 reject thinking budget_tokens (400) — thinking + # is adaptive; control depth via effort instead of a token budget. 
+ api_kwargs["output_config"] = {"effort": "high" if mode == "plan" else "medium"} - response = await self._call_api_with_retry(self.claude.messages.create, **api_kwargs) + response = await self._create_with_refusal_fallback(api_kwargs) self._track_token_usage(response) _extend_tool_calls(tool_calls_collected, response.content) @@ -258,17 +286,21 @@ async def call_claude( self.conversation_history.append({"role": "user", "content": tool_results}) api_kwargs["messages"] = self.conversation_history - response = await self._call_api_with_retry( - self.claude.messages.create, **api_kwargs - ) + response = await self._create_with_refusal_fallback(api_kwargs) self._track_token_usage(response) _extend_tool_calls(tool_calls_collected, response.content) - # Extract text response - assistant_message = "" - for block in response.content: - if hasattr(block, "text"): - assistant_message += block.text + # Extract text response. Fable 5 may refuse (stop_reason="refusal") + # with empty content — surface it instead of returning blank. + if response.stop_reason == "refusal": + assistant_message = ( + "(The request was declined by the model's safety system. Try rephrasing.)" + ) + else: + assistant_message = "" + for block in response.content: + if hasattr(block, "text"): + assistant_message += block.text self.conversation_history.append({"role": "assistant", "content": response.content}) @@ -399,6 +431,9 @@ async def get_tool_call(self, user_message: str, system_prompt, tools) -> dict | input_tokens = getattr(response.usage, "input_tokens", 0) output_tokens = getattr(response.usage, "output_tokens", 0) + # A refusal returns empty content — treat as "no tool call". 
+ if response.stop_reason == "refusal" or not response.content: + return None for block in response.content: if block.type == "tool_use": return { @@ -509,71 +544,164 @@ async def call_claude_stream(self, system_prompt, tools, tool_label_fn, auto_sav dict Chunks as they arrive from Claude """ - from anthropic import APIStatusError - - def stream_and_collect(): - events = [] - final_message = None - - with self.claude.messages.stream( - model=self.model, - system=system_prompt, - messages=self.conversation_history, - tools=tools, - max_tokens=4096, - ) as stream: - for event in stream: - events.append(event) - final_message = stream.get_final_message() - - return events, final_message + from anthropic import APIStatusError, BadRequestError + + # Live streaming: a worker thread drains the SDK's (blocking) stream and + # pushes each event onto an asyncio queue as it arrives, so this coroutine + # can yield text/thinking deltas in real time instead of collecting the + # whole turn first (which left the UI on a blank spinner for the entire + # turn). thinking=summarized surfaces the model's reasoning during the + # wait. The full assistant content (incl. thinking blocks) is replayed from + # final_message below, so the tool-loop continuation stays valid. + _DONE = object() + + async def _stream_live(model, sink): + """Stream one attempt live: yield delta dicts as they arrive; record + events / final_message / error / full_text into `sink`.""" + loop = asyncio.get_running_loop() + queue: asyncio.Queue = asyncio.Queue() + state: dict = {} + + def worker(): + try: + with self.claude.messages.stream( + model=model, + system=system_prompt, + messages=self.conversation_history, + tools=tools, + max_tokens=16000, + # Adaptive thinking with a streamed, human-readable summary — + # this is what fills the "working…" wait. Opus 4.8 defaults to + # display="omitted" (empty thinking text), so it must be set. 
+ thinking={"type": "adaptive", "display": "summarized"}, + output_config={"effort": "medium"}, + ) as stream: + for event in stream: + loop.call_soon_threadsafe(queue.put_nowait, event) + state["final"] = stream.get_final_message() + except BaseException as exc: # noqa: BLE001 — re-raised to caller below + state["error"] = exc + finally: + loop.call_soon_threadsafe(queue.put_nowait, _DONE) + + task = asyncio.create_task(asyncio.to_thread(worker)) + events: list = [] + full_text: list = [] + while True: + item = await queue.get() + if item is _DONE: + break + events.append(item) + if item.type != "content_block_delta": + continue + delta = item.delta + dtype = getattr(delta, "type", None) + if dtype == "thinking_delta": + chunk = getattr(delta, "thinking", "") or "" + if chunk: + yield {"type": "thinking", "text": chunk} + elif dtype == "text_delta" or hasattr(delta, "text"): + chunk = getattr(delta, "text", "") or "" + if chunk: + full_text.append(chunk) + yield {"type": "text", "text": chunk} + await task + sink["events"] = events + sink["full_text"] = full_text + sink["final"] = state.get("final") + sink["error"] = state.get("error") - # Run streaming in thread with retry logic max_retries = 3 retry_delay = 1.0 + fb = settings.models.refusal_fallback + model_in_use = self.model + sink: dict = {} for attempt in range(max_retries): - try: - events, final_message = await asyncio.to_thread(stream_and_collect) - self._track_token_usage(final_message) + sink = {} + yielded_any = False + async for chunk in _stream_live(model_in_use, sink): + yielded_any = True + yield chunk + err = sink.get("error") + if err is None: break - except APIStatusError as e: - error_type = getattr(e, "body", {}) + # Fable 5 under <30-day data retention (or unavailable) rejects with a + # 400 — fall back to Opus 4.8. Only safe before any partial was streamed. 
+ if isinstance(err, BadRequestError) and fb and fb != model_in_use and not yielded_any: + logger.warning( + "Stream model %s rejected the request (400); falling back to %s", + model_in_use, + fb, + ) + model_in_use = fb + continue + if isinstance(err, APIStatusError): + error_type = getattr(err, "body", {}) if isinstance(error_type, dict): error_type = error_type.get("error", {}).get("type", "") - - if ( + overloaded = ( error_type in ("overloaded_error", "rate_limit_error") - or "overloaded" in str(e).lower() - ): - if attempt < max_retries - 1: - wait_time = retry_delay * (2**attempt) - logger.warning( - f"API overloaded, retrying in {wait_time:.1f}s" - f" (attempt {attempt + 1}/{max_retries})" - ) - yield { - "type": "text", - "text": f"\n*[API busy, retrying in {wait_time:.0f}s...]*\n", - } - await asyncio.sleep(wait_time) - continue - raise + or "overloaded" in str(err).lower() + ) + if overloaded and attempt < max_retries - 1 and not yielded_any: + wait_time = retry_delay * (2**attempt) + logger.warning( + "API overloaded, retrying in %.1fs (attempt %d/%d)", + wait_time, + attempt + 1, + max_retries, + ) + yield { + "type": "text", + "text": f"\n*[API busy, retrying in {wait_time:.0f}s...]*\n", + } + await asyncio.sleep(wait_time) + continue + raise err else: raise RuntimeError("API overloaded after multiple retries") - # Diagnostic: log stop_reason and tool block counts + final_message = sink["final"] + full_text = sink["full_text"] + self._track_token_usage(final_message) + + # Refusal → retry on the fallback model. Re-streaming live is only safe when + # the refusal came before any visible output (pre-output refusals carry empty + # content, so nothing was yielded); otherwise we keep the partial we showed. 
+ if final_message.stop_reason == "refusal" and fb and model_in_use != fb and not full_text: + logger.warning("Model %s declined the streamed turn; retrying on %s", model_in_use, fb) + sink = {} + async for chunk in _stream_live(fb, sink): + yield chunk + model_in_use = fb + final_message = sink["final"] + full_text = sink["full_text"] + self._track_token_usage(final_message) + + # Last resort: if even the fallback declined, surface it and stop. + if final_message.stop_reason == "refusal": + logger.warning("Claude declined the request (model=%s)", model_in_use) + yield { + "type": "text", + "text": "(The request was declined by the model's safety system. Try rephrasing.)", + } + return + + # Diagnostic: per-response counts. DEBUG, not WARNING — stop_reason=tool_use + # with matching tool blocks is normal; the genuine anomaly is the + # logger.error below (tool blocks present but stop_reason != tool_use). tool_block_count = sum( 1 for b in final_message.content if hasattr(b, "type") and b.type == "tool_use" ) - logger.warning( - "Claude response: stop_reason=%s, content_blocks=%d, tool_use_blocks=%d," - " tools_passed=%d, model=%s", + logger.debug( + "Claude response: stop_reason=%s, content_blocks=%d, " + "tool_use_blocks=%d, tools_passed=%d, model=%s", final_message.stop_reason, len(final_message.content), tool_block_count, len(tools), - self.model, + model_in_use, ) if tool_block_count > 0 and final_message.stop_reason != "tool_use": logger.error( @@ -582,14 +710,6 @@ def stream_and_collect(): final_message.stop_reason, ) - # Process events and yield text - full_text = [] - for event in events: - if event.type == "content_block_delta": - if hasattr(event.delta, "text"): - full_text.append(event.delta.text) - yield {"type": "text", "text": event.delta.text} - # Detect fake XML tool calls in text (Claude writing tool_use as text) joined_text = "".join(full_text) if "" in joined_text or "" in joined_text: @@ -610,7 +730,94 @@ def stream_and_collect(): await 
asyncio.sleep(0.05) tool_results = [] + + # Concurrency fast-path: run a turn's tool calls in parallel when ALL of + # them are non-hardware and non-interactive (e.g. several strain / paper / + # lab-history lookups). Any microscope action or ask_user_choice in the + # batch falls back to the serial path below, so we never race hardware or + # an interactive prompt, and ordering of stateful ops is preserved. + tool_blocks = [b for b in response_content if getattr(b, "type", None) == "tool_use"] + _interactive = {"ask_user_choice"} + # Only parallelize genuinely read-only tools (independent lookups). Mutating + # tools (create_/update_/delete_/set_…) must stay serial — they share state + # (e.g. a campaign's plan file) and are order-dependent, so concurrent runs + # could race or corrupt it. + _readonly_prefixes = ( + "search_", + "read_", + "query_", + "get_", + "list_", + "recall_", + "find_", + "fetch_", + "lookup_", + ) + + def _parallel_safe(b): + td = self._tool_registry.get(b.name) + return ( + td is not None + and not td.requires_microscope + and b.name not in _interactive + and b.name.startswith(_readonly_prefixes) + ) + + handled_parallel = False + if len(tool_blocks) > 1 and all(_parallel_safe(b) for b in tool_blocks): + handled_parallel = True + starts = {b.id: time.time() for b in tool_blocks} + for b in tool_blocks: + yield { + "type": "tool_start", + "tool_name": b.name, + "tool_input": b.input, + "tool_label": tool_label_fn(b.name, b.input), + } + gathered = await asyncio.gather( + *[self._execute_single_tool(b.name, b.input) for b in tool_blocks], + return_exceptions=True, + ) + for b, res in zip(tool_blocks, gathered, strict=True): + if isinstance(res, BaseException): + is_error_flag = True + result_text = f"Error: {res}" + tool_results.append( + { + "type": "tool_result", + "tool_use_id": b.id, + "content": result_text, + "is_error": True, + } + ) + else: + is_error_flag = False + result_text = res if isinstance(res, str) else str(res) + 
tool_results.append( + {"type": "tool_result", "tool_use_id": b.id, "content": res} + ) + result_summary = next( + (ln.strip() for ln in (result_text or "").splitlines() if ln.strip()), + "", + ) + if len(result_summary) > 140: + result_summary = result_summary[:139] + "…" + result_full = result_text or "" + if len(result_full) > 4000: + result_full = result_full[:4000] + "\n…(truncated)" + yield { + "type": "tool_call", + "tool_name": b.name, + "tool_input": b.input, + "duration": time.time() - starts[b.id], + "result_summary": result_summary, + "result_full": result_full, + "is_error": is_error_flag, + } + for block in response_content: + if handled_parallel: + break if hasattr(block, "type") and block.type == "tool_use": start_time = time.time() @@ -628,9 +835,7 @@ def stream_and_collect(): if isinstance(tool_result, str): try: - from gently.app.tools.interaction_tools import ( - CHOICE_RESPONSE_TYPE, - ) + from gently.app.tools.interaction_tools import CHOICE_RESPONSE_TYPE choice_data = json.loads(tool_result) if ( @@ -649,11 +854,7 @@ def stream_and_collect(): tool_result if isinstance(tool_result, str) else str(tool_result) ) tool_results.append( - { - "type": "tool_result", - "tool_use_id": block.id, - "content": tool_result, - } + {"type": "tool_result", "tool_use_id": block.id, "content": tool_result} ) except Exception as e: is_error_flag = True @@ -677,12 +878,21 @@ def stream_and_collect(): if len(result_summary) > 140: result_summary = result_summary[:139] + "…" + # Full result (bounded) so the UI's expandable tool card can + # show what the tool actually returned — not just the 140-char + # one-liner. The web client caps/scrolls this further; keep the + # streamed payload sane. 
+ result_full = result_text or "" + if len(result_full) > 4000: + result_full = result_full[:4000] + "\n…(truncated)" + yield { "type": "tool_call", "tool_name": block.name, "tool_input": block.input, "duration": time.time() - start_time, "result_summary": result_summary, + "result_full": result_full, "is_error": is_error_flag, } @@ -909,8 +1119,8 @@ async def _call_api_with_retry(self, api_func, *args, max_retries=3, **kwargs): if is_retryable and attempt < max_retries - 1: wait_time = retry_delay * (2**attempt) logger.warning( - f"API error ({error_type}), retrying in {wait_time:.1f}s" - f" (attempt {attempt + 1}/{max_retries})" + f"API error ({error_type}), retrying in {wait_time:.1f}s " + f"(attempt {attempt + 1}/{max_retries})" ) await asyncio.sleep(wait_time) continue diff --git a/gently/harness/detection/verifier.py b/gently/harness/detection/verifier.py index 0370abb7..1a714bcd 100644 --- a/gently/harness/detection/verifier.py +++ b/gently/harness/detection/verifier.py @@ -27,13 +27,123 @@ logger = logging.getLogger(__name__) +# Each verification strategy is pinned to its tool via tool_choice, so the +# verdict arrives as a validated dict on the tool_use block — no +# startswith()-scraping of a "FIELD: VALUE" plain-text format, no silent +# defaults from a missed line. Downstream vote-tally / consensus logic is +# untouched: these helpers still produce the same strategy dataclasses. +# +# We deliberately don't ask the model to self-rate confidence (a heuristics-era +# artifact) — the boolean verdict is the signal, and the only confidence-like +# measure we keep is the ensemble's agreement ratio, which is *derived* from +# many independent votes rather than introspected by one call. +_ADVERSARIAL_TOOL = { + "name": "record_adversarial_review", + "description": ( + "Record the critical review verdict: whether counter-evidence " + "against the detection was found." 
+ ), + "input_schema": { + "type": "object", + "properties": { + "found_counter_evidence": { + "type": "boolean", + "description": "True only if there is real evidence the detection is wrong.", + }, + "concerns": { + "type": "array", + "items": {"type": "string"}, + "description": "Specific doubts or alternative explanations; empty list if none.", + }, + }, + "required": ["found_counter_evidence", "concerns"], + }, +} + +_INDEPENDENT_TOOL = { + "name": "record_independent_assessment", + "description": ( + "Record an unbiased fresh assessment of whether the event occurred in this image." + ), + "input_schema": { + "type": "object", + "properties": { + "detected": { + "type": "boolean", + "description": "True if the event is observed in this image.", + }, + "key_evidence": { + "type": "string", + "description": "What specifically supports the conclusion.", + }, + }, + "required": ["detected", "key_evidence"], + }, +} + +_TEMPORAL_TOOL = { + "name": "record_temporal_comparison", + "description": ( + "Record whether a real change consistent with the event occurred " + "between the previous and current frames." + ), + "input_schema": { + "type": "object", + "properties": { + "change_detected": { + "type": "boolean", + "description": ( + "True if a clear change consistent with the event is visible across frames." + ), + }, + "description": { + "type": "string", + "description": "The specific change observed between previous and current frames.", + }, + }, + "required": ["change_detected", "description"], + }, +} + +_HARDWARE_CONTEXT_TOOL = { + "name": "record_hardware_context", + "description": "Record whether hardware errors could have caused a false-positive detection.", + "input_schema": { + "type": "object", + "properties": { + "suspicious": { + "type": "boolean", + "description": ( + "True if hardware errors could have affected image quality " + "or positioning for this embryo." 
+ ), + }, + "concerns": { + "type": "array", + "items": {"type": "string"}, + "description": "Specific hardware concerns; empty list if none.", + }, + "reasoning": {"type": "string", "description": "Brief explanation of the analysis."}, + }, + "required": ["suspicious", "concerns", "reasoning"], + }, +} + + +def _tool_input(response) -> dict[str, Any] | None: + """Return the parsed input of the first tool_use block, or None.""" + for block in getattr(response, "content", None) or []: + if getattr(block, "type", None) == "tool_use": + return block.input + return None + + @dataclass class AdversarialResult: """Result of adversarial verification strategy""" found_counter_evidence: bool concerns: list[str] - confidence_in_original: ConfidenceLevel | None raw_response: str @@ -42,7 +152,6 @@ class IndependentResult: """Result of independent verification strategy""" detected: bool - confidence: ConfidenceLevel | None key_evidence: str raw_response: str @@ -53,7 +162,6 @@ class TemporalResult: change_detected: bool description: str - confidence: ConfidenceLevel | None raw_response: str @@ -111,21 +219,14 @@ def to_dict(self) -> dict[str, Any]: "adversarial": { "found_counter_evidence": self.adversarial.found_counter_evidence, "concerns": self.adversarial.concerns, - "confidence_in_original": self.adversarial.confidence_in_original.value - if self.adversarial.confidence_in_original - else None, }, "independent": { "detected": self.independent.detected, - "confidence": self.independent.confidence.value - if self.independent.confidence - else None, "key_evidence": self.independent.key_evidence, }, "temporal": { "change_detected": self.temporal.change_detected, "description": self.temporal.description, - "confidence": self.temporal.confidence.value if self.temporal.confidence else None, }, "consensus": self.consensus, "consensus_reasoning": self.consensus_reasoning, @@ -327,19 +428,10 @@ async def verify_with_context( ensemble_result, hardware_result, ) = await 
asyncio.gather( - adversarial_task, - independent_task, - temporal_task, - ensemble_task, - hardware_task, + adversarial_task, independent_task, temporal_task, ensemble_task, hardware_task ) else: - ( - adversarial, - independent, - temporal, - ensemble_result, - ) = await asyncio.gather( + adversarial, independent, temporal, ensemble_result = await asyncio.gather( adversarial_task, independent_task, temporal_task, ensemble_task ) else: @@ -354,6 +446,11 @@ async def verify_with_context( # Adversarial result strategies_complete += 1 + adversarial_summary = ( + "YES - " + ", ".join(adversarial.concerns) + if adversarial.found_counter_evidence + else "None found" + ) self._emit_event( EventType.VERIFICATION_STRATEGY, { @@ -361,17 +458,7 @@ async def verify_with_context( "detector_name": detector.name, "strategy": "adversarial", "passed": not adversarial.found_counter_evidence, - "summary": ( - "Counter-evidence: " - + ( - "YES - " + ", ".join(adversarial.concerns) - if adversarial.found_counter_evidence - else "None found" - ) - ), - "confidence": adversarial.confidence_in_original.value - if adversarial.confidence_in_original - else None, + "summary": f"Counter-evidence: {adversarial_summary}", }, ) self._emit_event( @@ -396,7 +483,6 @@ async def verify_with_context( f"Independent detection: {'YES' if independent.detected else 'NO'}" f" - {independent.key_evidence}" ), - "confidence": independent.confidence.value if independent.confidence else None, }, ) self._emit_event( @@ -421,7 +507,6 @@ async def verify_with_context( f"Change detected: {'YES' if temporal.change_detected else 'NO'}" f" - {temporal.description}" ), - "confidence": temporal.confidence.value if temporal.confidence else None, }, ) self._emit_event( @@ -464,6 +549,11 @@ async def verify_with_context( # Hardware context result (if applicable) if hardware_result: strategies_complete += 1 + hardware_summary = ( + "YES - " + ", ".join(hardware_result.concerns) + if hardware_result.suspicious + else "No" 
+ ) self._emit_event( EventType.VERIFICATION_STRATEGY, { @@ -471,14 +561,7 @@ async def verify_with_context( "detector_name": detector.name, "strategy": "hardware_context", "passed": not hardware_result.suspicious, - "summary": ( - "Hardware errors suspicious: " - + ( - "YES - " + ", ".join(hardware_result.concerns) - if hardware_result.suspicious - else "No" - ) - ), + "summary": f"Hardware errors suspicious: {hardware_summary}", "reasoning": hardware_result.reasoning, }, ) @@ -493,12 +576,7 @@ async def verify_with_context( # Determine consensus (with hardware context) consensus, reasoning = self._evaluate_consensus_with_hardware( - original_result, - adversarial, - independent, - temporal, - ensemble_result, - hardware_result, + original_result, adversarial, independent, temporal, ensemble_result, hardware_result ) duration = (datetime.now() - start_time).total_seconds() @@ -577,8 +655,8 @@ async def _run_hardware_context_analysis( Analysis result """ try: - prompt = f"""You are analyzing hardware error context for a microscopy detection -verification. + prompt = f"""\ +You are analyzing hardware error context for a microscopy detection verification. GLOBAL ERROR LOG: {global_error_context} @@ -595,24 +673,31 @@ async def _run_hardware_context_analysis( (stage drift, hardware instability) - Multiple errors in quick succession suggests hardware problems -If ANY errors occurred that could have affected the image quality or positioning for -{embryo_id}, report as SUSPICIOUS. +If ANY errors occurred that could have affected the image quality or positioning +for {embryo_id}, mark it suspicious. -Respond in EXACTLY this format: -SUSPICIOUS: [YES/NO] -CONCERNS: [list specific concerns, separated by semicolons] -REASONING: [brief explanation of your analysis] +Record your analysis with the record_hardware_context tool. 
""" response = await asyncio.to_thread( self.claude.messages.create, model=self.ensemble_model, # Use Haiku for speed max_tokens=300, + tools=[_HARDWARE_CONTEXT_TOOL], + tool_choice={"type": "tool", "name": _HARDWARE_CONTEXT_TOOL["name"]}, messages=[{"role": "user", "content": prompt}], ) - response_text = response.content[0].text - return self._parse_hardware_context_response(response_text) + data = _tool_input(response) + if not isinstance(data, dict): + raise ValueError("no tool_use block in response") + concerns = data.get("concerns") or [] + return HardwareContextResult( + suspicious=bool(data.get("suspicious", True)), + concerns=[str(c) for c in concerns], + reasoning=str(data.get("reasoning", "")), + raw_response=str(data), + ) except Exception as e: logger.error(f"Hardware context analysis failed: {e}") @@ -623,30 +708,6 @@ async def _run_hardware_context_analysis( raw_response="", ) - def _parse_hardware_context_response(self, response: str) -> HardwareContextResult: - """Parse hardware context analysis response""" - suspicious = False - concerns = [] - reasoning = "" - - for line in response.split("\n"): - line = line.strip() - if line.startswith("SUSPICIOUS:"): - value = line.split(":", 1)[1].strip().upper() - suspicious = value == "YES" - elif line.startswith("CONCERNS:"): - concerns_str = line.split(":", 1)[1].strip() - concerns = [c.strip() for c in concerns_str.split(";") if c.strip()] - elif line.startswith("REASONING:"): - reasoning = line.split(":", 1)[1].strip() - - return HardwareContextResult( - suspicious=suspicious, - concerns=concerns, - reasoning=reasoning, - raw_response=response, - ) - def _evaluate_consensus_with_hardware( self, original: DetectionResult, @@ -753,7 +814,6 @@ async def _run_adversarial( return AdversarialResult( found_counter_evidence=False, concerns=["No images available for verification"], - confidence_in_original=None, raw_response="", ) @@ -771,11 +831,10 @@ async def _run_adversarial( else: specific_guidance = "" - 
prompt = f"""You are reviewing a detection result for a C. elegans embryo -(diSPIM max projection). + prompt = f"""\ +You are reviewing a detection result for a C. elegans embryo (diSPIM max projection). The system detected: {detector.name} -Original confidence: {original_result.confidence.value if original_result.confidence else "unknown"} Original reasoning: {original_result.reasoning or "not provided"} NOW ACT AS A CRITICAL REVIEWER. Your job is to find reasons why this detection might be INCORRECT: @@ -784,10 +843,7 @@ async def _run_adversarial( - Is the evidence actually conclusive, or could it be interpreted differently? - Are there alternative explanations for what is observed? {specific_guidance} -Analyze the image(s) carefully and respond in EXACTLY this format: -COUNTER_EVIDENCE_FOUND: [YES/NO] -CONCERNS: [list specific doubts or alternative explanations, separated by semicolons] -CONFIDENCE_IN_ORIGINAL: [HIGH/MEDIUM/LOW] +Analyze the image(s) carefully and record your review with the record_adversarial_review tool. 
""" content = [{"type": "text", "text": prompt}] + images @@ -796,18 +852,26 @@ async def _run_adversarial( self.claude.messages.create, model=self.model, max_tokens=500, + tools=[_ADVERSARIAL_TOOL], + tool_choice={"type": "tool", "name": _ADVERSARIAL_TOOL["name"]}, messages=[{"role": "user", "content": content}], ) - response_text = response.content[0].text - return self._parse_adversarial_response(response_text) + data = _tool_input(response) + if not isinstance(data, dict): + raise ValueError("no tool_use block in response") + concerns = data.get("concerns") or [] + return AdversarialResult( + found_counter_evidence=bool(data.get("found_counter_evidence", False)), + concerns=[str(c) for c in concerns], + raw_response=str(data), + ) except Exception as e: logger.error(f"Adversarial verification failed: {e}") return AdversarialResult( found_counter_evidence=False, concerns=[f"Verification error: {str(e)}"], - confidence_in_original=None, raw_response="", ) @@ -829,7 +893,6 @@ async def _run_independent( if not images: return IndependentResult( detected=False, - confidence=None, key_evidence="No images available", raw_response="", ) @@ -849,8 +912,8 @@ async def _run_independent( criteria = detector.description # Use a neutral prompt that doesn't reveal the previous detection - prompt = f"""Analyze this C. elegans embryo image (diSPIM max projection) at -timepoint {timepoint}. + prompt = f"""\ +Analyze this C. elegans embryo image (diSPIM max projection) at timepoint {timepoint}. Question: Has '{detector.name}' occurred in this embryo? @@ -859,10 +922,7 @@ async def _run_independent( Provide an independent assessment based SOLELY on what you observe in this image. Do not assume any prior state - analyze only what is visible now. -Respond in EXACTLY this format: -DETECTED: [YES/NO] -CONFIDENCE: [HIGH/MEDIUM/LOW] -KEY_EVIDENCE: [what specifically do you observe that supports your conclusion?] +Record your assessment with the record_independent_assessment tool. 
""" content = [{"type": "text", "text": prompt}] + images @@ -871,17 +931,24 @@ async def _run_independent( self.claude.messages.create, model=self.model, max_tokens=400, + tools=[_INDEPENDENT_TOOL], + tool_choice={"type": "tool", "name": _INDEPENDENT_TOOL["name"]}, messages=[{"role": "user", "content": content}], ) - response_text = response.content[0].text - return self._parse_independent_response(response_text) + data = _tool_input(response) + if not isinstance(data, dict): + raise ValueError("no tool_use block in response") + return IndependentResult( + detected=bool(data.get("detected", False)), + key_evidence=str(data.get("key_evidence", "")), + raw_response=str(data), + ) except Exception as e: logger.error(f"Independent verification failed: {e}") return IndependentResult( detected=False, - confidence=None, key_evidence=f"Verification error: {str(e)}", raw_response="", ) @@ -904,7 +971,6 @@ async def _run_temporal_check( return TemporalResult( change_detected=True, # Can't disprove without history description="Insufficient temporal history for comparison", - confidence=ConfidenceLevel.LOW, raw_response="", ) @@ -930,7 +996,6 @@ async def _run_temporal_check( return TemporalResult( change_detected=True, description="No previous images available", - confidence=ConfidenceLevel.LOW, raw_response="", ) @@ -947,8 +1012,8 @@ async def _run_temporal_check( - Not just a static state that could have existed before - Clear evidence of progression or event occurrence""" - prompt = f"""Compare these sequential timepoints of a C. elegans embryo -(diSPIM max projection). + prompt = f"""\ +Compare these sequential timepoints of a C. elegans embryo (diSPIM max projection). 
PREVIOUS TIMEPOINTS (shown first): These are from t={timepoint - 2} to t={timepoint - 1} @@ -960,10 +1025,7 @@ async def _run_temporal_check( {temporal_criteria} -Respond in EXACTLY this format: -CHANGE_DETECTED: [YES/NO] -DESCRIPTION: [what specific change do you see between the previous and current frames?] -CONFIDENCE: [HIGH/MEDIUM/LOW] +Record your comparison with the record_temporal_comparison tool. """ # Combine: previous images first, then prompt, then current @@ -973,18 +1035,25 @@ async def _run_temporal_check( self.claude.messages.create, model=self.model, max_tokens=400, + tools=[_TEMPORAL_TOOL], + tool_choice={"type": "tool", "name": _TEMPORAL_TOOL["name"]}, messages=[{"role": "user", "content": content}], ) - response_text = response.content[0].text - return self._parse_temporal_response(response_text) + data = _tool_input(response) + if not isinstance(data, dict): + raise ValueError("no tool_use block in response") + return TemporalResult( + change_detected=bool(data.get("change_detected", True)), + description=str(data.get("description", "")), + raw_response=str(data), + ) except Exception as e: logger.error(f"Temporal verification failed: {e}") return TemporalResult( change_detected=True, # Don't block on error description=f"Verification error: {str(e)}", - confidence=None, raw_response="", ) @@ -1038,10 +1107,10 @@ async def _run_ensemble_hatching(self, embryo_state: EmbryoState) -> EnsembleRes Answer ONE question: Has the embryo HATCHED? -HATCHED means: The worm body is OUTSIDE the eggshell (free-floating, elongated, or field is -empty because worm left). -NOT HATCHED means: The worm is still INSIDE the eggshell (coiled/pretzel-shaped, even if -shell looks expanded). +HATCHED means: The worm body is OUTSIDE the eggshell (free-floating, elongated, +or field is empty because worm left). +NOT HATCHED means: The worm is still INSIDE the eggshell (coiled/pretzel-shaped, +even if shell looks expanded). 
Respond with ONLY: YES or NO""" @@ -1062,8 +1131,8 @@ async def single_vote() -> str: # Run all votes in parallel logger.info( - f"[ENSEMBLE] Running {self.ensemble_size} parallel Haiku calls" - " for hatching verification" + f"[ENSEMBLE] Running {self.ensemble_size} parallel Haiku calls " + "for hatching verification" ) tasks = [single_vote() for _ in range(self.ensemble_size)] responses = await asyncio.gather(*tasks) @@ -1113,88 +1182,6 @@ async def single_vote() -> str: raw_responses=[f"Error: {str(e)}"], ) - def _parse_adversarial_response(self, response: str) -> AdversarialResult: - """Parse adversarial strategy response""" - found_counter = False - concerns = [] - confidence = None - - for line in response.split("\n"): - line = line.strip() - if line.startswith("COUNTER_EVIDENCE_FOUND:"): - value = line.split(":", 1)[1].strip().upper() - found_counter = value == "YES" - elif line.startswith("CONCERNS:"): - concerns_str = line.split(":", 1)[1].strip() - concerns = [c.strip() for c in concerns_str.split(";") if c.strip()] - elif line.startswith("CONFIDENCE_IN_ORIGINAL:"): - value = line.split(":", 1)[1].strip().upper() - try: - confidence = ConfidenceLevel(value) - except ValueError: - pass - - return AdversarialResult( - found_counter_evidence=found_counter, - concerns=concerns, - confidence_in_original=confidence, - raw_response=response, - ) - - def _parse_independent_response(self, response: str) -> IndependentResult: - """Parse independent strategy response""" - detected = False - confidence = None - evidence = "" - - for line in response.split("\n"): - line = line.strip() - if line.startswith("DETECTED:"): - value = line.split(":", 1)[1].strip().upper() - detected = value == "YES" - elif line.startswith("CONFIDENCE:"): - value = line.split(":", 1)[1].strip().upper() - try: - confidence = ConfidenceLevel(value) - except ValueError: - pass - elif line.startswith("KEY_EVIDENCE:"): - evidence = line.split(":", 1)[1].strip() - - return IndependentResult( - 
detected=detected, - confidence=confidence, - key_evidence=evidence, - raw_response=response, - ) - - def _parse_temporal_response(self, response: str) -> TemporalResult: - """Parse temporal strategy response""" - change_detected = False - description = "" - confidence = None - - for line in response.split("\n"): - line = line.strip() - if line.startswith("CHANGE_DETECTED:"): - value = line.split(":", 1)[1].strip().upper() - change_detected = value == "YES" - elif line.startswith("DESCRIPTION:"): - description = line.split(":", 1)[1].strip() - elif line.startswith("CONFIDENCE:"): - value = line.split(":", 1)[1].strip().upper() - try: - confidence = ConfidenceLevel(value) - except ValueError: - pass - - return TemporalResult( - change_detected=change_detected, - description=description, - confidence=confidence, - raw_response=response, - ) - def _evaluate_consensus( self, original: DetectionResult, @@ -1253,8 +1240,8 @@ def _evaluate_consensus( f"All verification strategies agree ({total_strategies}/{total_strategies}): " f"no counter-evidence found, independent analysis confirms detection, " f"temporal change observed, ensemble voting confirms " - f"({ensemble.votes_yes}/{ensemble.total_votes}" - f" = {ensemble.agreement_ratio:.0%} YES)." + f"({ensemble.votes_yes}/{ensemble.total_votes} = " + f"{ensemble.agreement_ratio:.0%} YES)." ) else: reasoning = ( diff --git a/gently/harness/memory/file_store.py b/gently/harness/memory/file_store.py index 0531dcce..c19f785b 100644 --- a/gently/harness/memory/file_store.py +++ b/gently/harness/memory/file_store.py @@ -582,6 +582,11 @@ def get_subcampaigns(self, campaign_id: str) -> list[Campaign]: return children def get_nth_subcampaign(self, parent_id: str, n: int) -> Campaign | None: + # Tolerate n arriving as a numeric string (tool args are often stringified). 
+ try: + n = int(n) + except (ValueError, TypeError): + return None phases = self.get_subcampaigns(parent_id) if 1 <= n <= len(phases): return phases[n - 1] @@ -1908,6 +1913,16 @@ def get_observations_for_embryo(self, embryo_id: str, limit: int = 20) -> list[O # Expectations # ================================================================== + def _notify_context_change(self, kind: str = "context") -> None: + """Emit CONTEXT_UPDATED on the global bus so the shared-visibility + surface refreshes live. Best-effort — a bus failure never breaks a write.""" + try: + from gently.core.event_bus import EventType, emit + + emit(EventType.CONTEXT_UPDATED, {"kind": kind}, source="context_store") + except Exception: + pass + def add_expectation(self, exp: Expectation): path = self.agent_dir / "active" / "expectations.yaml" items = self._read_yaml(path) or [] @@ -1925,6 +1940,7 @@ def add_expectation(self, exp: Expectation): } ) self._write_yaml(path, items) + self._notify_context_change("expectation") def get_pending_expectations(self) -> list[Expectation]: path = self.agent_dir / "active" / "expectations.yaml" @@ -1956,6 +1972,7 @@ def resolve_expectation(self, exp_id: str, status: ExpectationStatus): item["resolved_at"] = now break self._write_yaml(path, items) + self._notify_context_change("expectation") # ================================================================== # Watchpoints @@ -1975,6 +1992,7 @@ def add_watchpoint(self, wp: Watchpoint): } ) self._write_yaml(path, items) + self._notify_context_change("watchpoint") def get_active_watchpoints(self) -> list[Watchpoint]: path = self.agent_dir / "active" / "watchpoints.yaml" @@ -2002,6 +2020,7 @@ def resolve_watchpoint(self, wp_id: str): item["status"] = "resolved" break self._write_yaml(path, items) + self._notify_context_change("watchpoint") # ================================================================== # Questions @@ -2021,6 +2040,7 @@ def add_question(self, q: Question): } ) self._write_yaml(path, items) + 
self._notify_context_change("question") def get_open_questions(self) -> list[Question]: path = self.agent_dir / "active" / "questions.yaml" @@ -2042,6 +2062,7 @@ def resolve_question(self, q_id: str, resolution: str): item["resolved_at"] = now break self._write_yaml(path, items) + self._notify_context_change("question") # ================================================================== # Learnings @@ -2522,6 +2543,14 @@ def _dict_to_plan_item(d: dict) -> PlanItem: imaging_spec = None bench_spec = None + # Tolerate specs persisted as JSON strings (older tool calls that passed + # spec as a string instead of an object) so read-back never crashes. + if isinstance(spec_data, str): + try: + spec_data = json.loads(spec_data) + except (json.JSONDecodeError, TypeError): + spec_data = None + if spec_data: if item_type == PlanItemType.IMAGING: valid = {f.name for f in dataclasses.fields(ImagingSpec)} @@ -2531,6 +2560,11 @@ def _dict_to_plan_item(d: dict) -> PlanItem: bench_spec = BenchSpec(**{k: v for k, v in spec_data.items() if k in valid}) references = d.get("references") or [] + if isinstance(references, str): + try: + references = json.loads(references) or [] + except (json.JSONDecodeError, TypeError): + references = [] return PlanItem( id=d["id"], diff --git a/gently/harness/memory/model.py b/gently/harness/memory/model.py index 2a164176..3757a0fe 100644 --- a/gently/harness/memory/model.py +++ b/gently/harness/memory/model.py @@ -240,6 +240,11 @@ class ImagingSpec: success_criteria: str | None = None comparison_to: str | None = None # "Compare to WT session 1" + # Per-field provenance for INFERRED values — field name -> {source, confidence}. + # e.g. {"laser_wavelength_nm": {"source": "inferred:genotype", "confidence": "medium"}} + # Lets the UI tag each value with where it came from and what to confirm. 
+ provenance: dict[str, dict[str, str]] = field(default_factory=dict) + @dataclass class BenchSpec: diff --git a/gently/harness/plan_mode/prompt.py b/gently/harness/plan_mode/prompt.py index 4228249c..055d0df2 100644 --- a/gently/harness/plan_mode/prompt.py +++ b/gently/harness/plan_mode/prompt.py @@ -21,8 +21,17 @@ 6. Challenge assumptions — suggest controls the researcher might not have thought of 7. Suggest experiments outside of imaging where appropriate (bench assays, genetics, analysis) -DO NOT rush to a plan. Gather information first. Ask questions. Search the literature. -Understand the researcher's goals and constraints before proposing. +Work INFERENCE-FIRST: arrive with a draft, don't interrogate. Infer what you +reasonably can — read the reporters in the strain's genotype and set the +excitation wavelengths from your knowledge of fluorophore spectra (e.g. +TagRFP/mCherry ≈ 561 nm, GFP/GCaMP ≈ 488 nm), let the organism set sensible +defaults, and let lab/campaign context fill the rest. Record each inferred +value's source and confidence in the imaging spec's ``provenance``. State a +wavelength only when you're confident; if a reporter is unfamiliar or ambiguous, +mark it low-confidence and confirm via ask_user_choice rather than guessing a +number. Then surface the draft for review, asking ONLY for genuine gaps, +low-confidence guesses, or consequential choices. Search the literature to +confirm, not to stall. ## How to Design an Experimental Plan @@ -67,9 +76,38 @@ 3. Set dependencies between items 4. Present the full plan for review with propose_plan +After propose_plan, close with a short confirmation of what the plan contains +(item/phase count, the critical path, anything notable) and stop there. Do NOT +offer to export it, save it as a template, or ask "what would you like to do +next?" — exporting and opening the workspace are handled by the interface, not +this conversation. End on the summary, not an upsell. 
+ IMPORTANT: ALWAYS use ask_user_choice when asking the researcher questions. Never present options as text lists. +## Communication style — keep it light to read + +You're talking to a working biologist, not a software user. Optimize every +user-facing message for fast reading, not completeness: + +- **Lead with the ask or the finding.** The first sentence should be the question, + the decision, or what you found — supporting detail comes after, and only when it + changes what they'd do next. +- **Short questions, short options.** Keep an ask_user_choice question to one line, + and each option to a few-word label plus at most a one-line rationale — never a + paragraph. Trust the biologist to know the domain; don't re-explain standard + concepts (what a histone marker is, why controls matter). +- **Plain words, not process jargon.** Use the field's real terms (strain names, + stages, wavelengths) but drop software/workflow jargon and hedging. +- **Give the short "why", not the survey.** One clause of rationale beats an + exhaustive list of everything you weighed. Put the full reasoning in the spec's + provenance and references, not in the message. +- **One idea per message.** Don't stack caveats, alternatives, and next steps into + one dense block. If something is optional, say so briefly or leave it out. + +Readability and brevity are different — choose readability, but get there by +saying less, not by compressing into fragments or abbreviations. + ## Reading Papers Use read_paper to retrieve and read scientific papers. It accepts: @@ -115,8 +153,12 @@ PLAN_MODE_GUIDELINES = """\ # Behavior in Plan Mode -1. **Ask before assuming**: Don't assume the researcher's constraints. Ask about - available strains, timeline, equipment access, collaborators. +1. 
**Infer, then confirm — don't interrogate**: Fill what you can from the strain + genotype, organism defaults, and lab/campaign context, and record where each + value came from (database citation, or your own fluorophore/biology knowledge) + in the spec's ``provenance``. Ask — via ask_user_choice — only for genuine + gaps, low-confidence guesses, or consequential choices, not for things you can + derive or look up. 2. **Think about the full story**: What would reviewers want to see? What controls would strengthen the claims? 3. **Be realistic about timelines**: Genetic crosses take weeks. Behavioral assays @@ -140,6 +182,14 @@ items, search to confirm strain availability, check the literature for recent protocols, and attach references. Your built-in knowledge is a great starting point for brainstorming — the databases are where you confirm before finalizing. +11. **Batch independent lookups**: When you need several independent reads — multiple + strains, several papers, or a few lab-history queries — request them together in + one turn so they run in parallel. Don't fetch one, wait for it, then fetch the + next; that's slow. (The system runs same-turn read-only lookups concurrently.) +12. **Build the plan in few turns**: Each turn is a model round-trip, so creating one + item per turn makes plan construction crawl. When writing a phase's items, emit + several create_plan_item calls in a single turn (then set any dependencies in a + follow-up). Fewer turns = a much faster plan. 
""" diff --git a/gently/harness/plan_mode/tools/planning.py b/gently/harness/plan_mode/tools/planning.py index 34785e5f..87715acd 100644 --- a/gently/harness/plan_mode/tools/planning.py +++ b/gently/harness/plan_mode/tools/planning.py @@ -7,9 +7,34 @@ """ import dataclasses +import json from ...tools.registry import ToolCategory, ToolExample, tool + +def _coerce_plan_args(spec, references, estimated_days): + """The model often serializes nested args (spec/references) as JSON strings + instead of objects — accept either so plan-item creation doesn't store a raw + string (which later breaks ImagingSpec/BenchSpec hydration). Returns the + normalized (spec, references, estimated_days).""" + if isinstance(spec, str): + try: + spec = json.loads(spec) + except (json.JSONDecodeError, TypeError): + spec = None + if isinstance(references, str): + try: + references = json.loads(references) + except (json.JSONDecodeError, TypeError): + references = None + if isinstance(estimated_days, str): + try: + estimated_days = int(estimated_days) + except (ValueError, TypeError): + estimated_days = None + return spec, references, estimated_days + + # --------------------------------------------------------------------------- # Campaign / Phase Management # --------------------------------------------------------------------------- @@ -132,6 +157,17 @@ async def create_plan_item( return "Error: Context store not available" store = agent.context_store + spec, references, estimated_days = _coerce_plan_args(spec, references, estimated_days) + if isinstance(phase_number, str): + try: + phase_number = int(phase_number) + except (ValueError, TypeError): + phase_number = None + if isinstance(phase_order, str): + try: + phase_order = int(phase_order) + except (ValueError, TypeError): + phase_order = -1 # Resolve phase_number → subcampaign ID target_campaign_id = campaign_id @@ -226,6 +262,7 @@ async def update_plan_item( from gently.harness.memory.model import PlanItemStatus status_enum = 
PlanItemStatus(status) if status else None + spec, references, estimated_days = _coerce_plan_args(spec, references, estimated_days) store.update_plan_item( item_id=resolved_id, status=status_enum, diff --git a/gently/settings.py b/gently/settings.py index 68cebd38..9e7d2e68 100644 --- a/gently/settings.py +++ b/gently/settings.py @@ -56,14 +56,38 @@ class MeshSettings: @dataclass(frozen=True) class ModelSettings: - """Claude model identifiers.""" - - main: str = field(default_factory=lambda: _env("MODEL_MAIN", "claude-opus-4-6")) - perception: str = field( - default_factory=lambda: _env("MODEL_PERCEPTION", "claude-opus-4-5-20251101") + """Claude model identifiers — the single source of truth for every tier. + + Tiers are split by role; capability-first per the latest models: + - main: Opus 4.8 ($5/$25). Per-user-turn reasoning + tool + orchestration (plan mode) and the dopaminergic classifier + stage. (Fable 5 was tried here but declined benign planning + turns — stop_reason="refusal" — forcing a fallback on every + turn; set MODEL_MAIN=claude-fable-5 to retry it.) + - perception: Opus 4.8 (high-res vision, $5/$25). Highest-frequency tier + (per timepoint); Opus-tier vision for perception accuracy. + - medium: Opus 4.8. Onboarding / wizard summaries. + - fast: Sonnet 4.6 ($3/$15). The cheaper/faster tier — drives the + verifier's parallel ensemble (ensemble_size calls per + verification) and blank-image / summary checks. + + API note: Opus 4.8 rejects thinking budget_tokens and sampling params + (temperature/top_p/top_k) — adaptive thinking only, depth via effort. + Sonnet 4.6 supports adaptive thinking. No assistant prefills anywhere + (4.6+ family rejects them). 
+ """ + + main: str = field(default_factory=lambda: _env("MODEL_MAIN", "claude-opus-4-8")) + perception: str = field(default_factory=lambda: _env("MODEL_PERCEPTION", "claude-opus-4-8")) + fast: str = field(default_factory=lambda: _env("MODEL_FAST", "claude-sonnet-4-6")) + medium: str = field(default_factory=lambda: _env("MODEL_MEDIUM", "claude-opus-4-8")) + # If the main tier declines a turn (stop_reason="refusal"), retry it on this + # model instead of surfacing the refusal. Inert while main is Opus 4.8 (the + # guard skips it when fallback == main); relevant if main is set to Fable 5. + # Empty disables the fallback. + refusal_fallback: str = field( + default_factory=lambda: _env("MODEL_REFUSAL_FALLBACK", "claude-opus-4-8") ) - fast: str = field(default_factory=lambda: _env("MODEL_FAST", "claude-haiku-4-5-20251001")) - medium: str = field(default_factory=lambda: _env("MODEL_MEDIUM", "claude-sonnet-4-5-20250929")) @dataclass(frozen=True) @@ -120,6 +144,17 @@ class TransferSettings: ) +@dataclass(frozen=True) +class UISettings: + """Web UI feature flags.""" + + # New agent-first UX paradigm (welcome→shell unfold, dual-rendered agent + # asks, inference-first plan mode, shared-visibility surface). Now ON by + # default; the v1 dashboard remains available as a fallback via + # GENTLY_UX_V2=0 until the v1 markup is removed in a later cleanup step. 
+ ux_v2: bool = field(default_factory=lambda: _env("UX_V2", True)) + + @dataclass(frozen=True) class Settings: """Top-level settings container.""" @@ -132,6 +167,7 @@ class Settings: api: ApiSettings = field(default_factory=ApiSettings) ml: MlSettings = field(default_factory=MlSettings) transfer: TransferSettings = field(default_factory=TransferSettings) + ui: UISettings = field(default_factory=UISettings) # Singleton — import this everywhere diff --git a/gently/ui/web/connection_manager.py b/gently/ui/web/connection_manager.py index 11cc3fe4..1d1f2a23 100644 --- a/gently/ui/web/connection_manager.py +++ b/gently/ui/web/connection_manager.py @@ -158,7 +158,12 @@ async def broadcast(self, message: dict): try: await connection.send_text(message_json) except Exception as e: - logger.warning(f"Failed to send to websocket: {e}") + # Expected when a client disconnects/reloads mid-broadcast + # (send after websocket.close). The connection is dropped + # below, so this is debug-level, not a warning. 
+ logger.debug( + "Dropping a websocket that errored on send (client likely gone): %s", e + ) disconnected.append(connection) # Remove disconnected clients diff --git a/gently/ui/web/routes/__init__.py b/gently/ui/web/routes/__init__.py index ebd90770..bdbf3db7 100644 --- a/gently/ui/web/routes/__init__.py +++ b/gently/ui/web/routes/__init__.py @@ -10,6 +10,7 @@ from .auth_routes import create_router as create_auth_router from .campaigns import create_router as create_campaigns_router from .chat import create_router as create_chat_router +from .context import create_router as create_context_router from .data import create_router as create_data_router from .experiments import create_router as create_experiments_router from .images import create_router as create_images_router @@ -33,6 +34,7 @@ def register_all_routes(server): create_websocket_router, create_agent_ws_router, create_chat_router, + create_context_router, ): router = factory(server) server.app.include_router(router) diff --git a/gently/ui/web/routes/agent_ws.py b/gently/ui/web/routes/agent_ws.py index fdf2fc5f..887fdd25 100644 --- a/gently/ui/web/routes/agent_ws.py +++ b/gently/ui/web/routes/agent_ws.py @@ -14,6 +14,8 @@ from fastapi import APIRouter, WebSocket, WebSocketDisconnect +from gently.settings import settings + logger = logging.getLogger(__name__) @@ -744,9 +746,18 @@ async def _run_resolution_bootstrap(): pass if not wizard_ran: - if bridge.should_enter_resolution(): + enter_resolution = bridge.should_enter_resolution() + # Under ux_v2 the agent-first landing owns the session-entry + # decision ("Plan an experiment" / "Take a quick look"), so the + # legacy connect-time resolution picker would just duplicate it — + # and contradict it, by offering "Standalone" after the user has + # already chosen to plan. Stay quiet on connect for new sessions; + # the landing drives plan-mode (/plan) or standalone instead. 
+ if enter_resolution and not settings.ui.ux_v2: bootstrap_task = asyncio.create_task(_run_resolution_bootstrap()) - else: + elif not enter_resolution: + # Resume / already-resolved sessions still get their briefing + # (it sits behind the landing overlay until dismissed). briefing = bridge.get_session_briefing() if briefing: await send_fn({"type": "stream_start"}) diff --git a/gently/ui/web/routes/chat.py b/gently/ui/web/routes/chat.py index 8c66dd45..e07d4501 100644 --- a/gently/ui/web/routes/chat.py +++ b/gently/ui/web/routes/chat.py @@ -19,11 +19,13 @@ from fastapi.responses import StreamingResponse from pydantic import BaseModel +from gently.settings import settings from gently.ui.web.auth import require_control logger = logging.getLogger(__name__) -CHAT_MODEL = "claude-opus-4-7" +# Per-timepoint VLM chat → perception tier (Opus 4.8); centralized, not hardcoded. +CHAT_MODEL = settings.models.perception SYSTEM_PROMPT = ( "You are helping a biologist interpret a microscopy perception " "assessment of a C. elegans embryo at a specific timepoint. You can " diff --git a/gently/ui/web/routes/context.py b/gently/ui/web/routes/context.py new file mode 100644 index 00000000..66c7f25c --- /dev/null +++ b/gently/ui/web/routes/context.py @@ -0,0 +1,74 @@ +"""Context (shared-visibility) routes. + +Exposes the agent's "mind" — its open questions (uncertainty), active +watchpoints (attention), and pending expectations (beliefs) — read by anyone, +resolvable only by the control holder. Live updates ride the CONTEXT_UPDATED +event the FileContextStore emits on the global bus, which the server already +broadcasts to /ws; the client just re-fetches /api/context on it (no polling). +""" + +from fastapi import APIRouter, Body, Depends + +from gently.ui.web.auth import require_control + +from .campaigns import _serialize + + +def create_router(server) -> APIRouter: + router = APIRouter() + + def _store(): + # Defensive: the store is wired after construction; tolerate cold start. 
+ return getattr(server, "context_store", None) + + @router.get("/api/context") + async def get_context(): + cs = _store() + empty = {"available": False, "expectations": [], "watchpoints": [], "questions": []} + if cs is None: + return empty + try: + return { + "available": True, + "questions": [_serialize(q) for q in cs.get_open_questions()], + "watchpoints": [_serialize(w) for w in cs.get_active_watchpoints()], + "expectations": [_serialize(e) for e in cs.get_pending_expectations()], + } + except Exception: + return empty + + @router.post("/api/context/questions/{q_id}/resolve", dependencies=[Depends(require_control)]) + async def resolve_question(q_id: str, resolution: str = Body("", embed=True)): + cs = _store() + if cs is None: + return {"ok": False, "error": "context store unavailable"} + cs.resolve_question(q_id, resolution or "") + return {"ok": True} + + @router.post( + "/api/context/watchpoints/{wp_id}/resolve", dependencies=[Depends(require_control)] + ) + async def resolve_watchpoint(wp_id: str): + cs = _store() + if cs is None: + return {"ok": False, "error": "context store unavailable"} + cs.resolve_watchpoint(wp_id) + return {"ok": True} + + @router.post( + "/api/context/expectations/{exp_id}/resolve", dependencies=[Depends(require_control)] + ) + async def resolve_expectation(exp_id: str, status: str = Body("confirmed", embed=True)): + cs = _store() + if cs is None: + return {"ok": False, "error": "context store unavailable"} + from gently.harness.memory.model import ExpectationStatus + + try: + st = ExpectationStatus(status) + except ValueError: + st = ExpectationStatus.CONFIRMED + cs.resolve_expectation(exp_id, st) + return {"ok": True} + + return router diff --git a/gently/ui/web/routes/data.py b/gently/ui/web/routes/data.py index 93d49451..a609526a 100644 --- a/gently/ui/web/routes/data.py +++ b/gently/ui/web/routes/data.py @@ -214,6 +214,48 @@ async def get_coverslip(): } } + @router.get("/api/devices/scan_geometry") + async def 
get_scan_geometry(): + """Return the most recent scan geometry for the 3D optical-space view. + + SCAN_GEOMETRY_UPDATE is published only when a volume is acquired, so a + page opened before the first acquisition would have no cuboid to draw. + This serves the last emitted payload (stashed on the agent by + acquisition_tools._publish_scan_geometry), or nominal defaults so the + scene is never empty. + """ + bridge = getattr(server, "agent_bridge", None) + agent = bridge.agent if bridge is not None else None + last = getattr(agent, "last_scan_geometry", None) if agent else None + if isinstance(last, dict): + return last + # Nominal defaults (calibration defaults; no acquisition yet). + num_slices = 50 + piezo_amplitude = 25.0 + piezo_center = 50.0 + z_extent = 2.0 * piezo_amplitude + return { + "embryo_id": None, + "stage_position_um": {"x": None, "y": None}, + "scan": { + "num_slices": num_slices, + "exposure_ms": 10.0, + "galvo_amplitude_deg": 0.5, + "galvo_center_deg": 0.0, + "piezo_amplitude_um": piezo_amplitude, + "piezo_center_um": piezo_center, + }, + "derived": { + "z_extent_um": z_extent, + "slice_spacing_um": z_extent / (num_slices - 1), + "z_min_um": piezo_center - piezo_amplitude, + "z_max_um": piezo_center + piezo_amplitude, + }, + "mode": "sheet", + "ts": None, + "is_default": True, + } + @router.get("/api/devices/bottom_camera/status") async def get_bottom_camera_status(): """Return whether the bottom-camera stream bridge is running.""" diff --git a/gently/ui/web/routes/pages.py b/gently/ui/web/routes/pages.py index 3858f4e7..04a1a959 100644 --- a/gently/ui/web/routes/pages.py +++ b/gently/ui/web/routes/pages.py @@ -3,6 +3,8 @@ from fastapi import APIRouter, Request from fastapi.responses import HTMLResponse, RedirectResponse +from gently.settings import settings + def create_router(server) -> APIRouter: router = APIRouter() @@ -16,7 +18,9 @@ async def index(request: Request): chat window's "Sign in" affordance), not a gate on the page itself. 
""" return server.templates.TemplateResponse( - request, "index.html", {"active_section": "embryos", "is_live": True} + request, + "index.html", + {"active_section": "embryos", "is_live": True, "ux_v2": settings.ui.ux_v2}, ) # Standalone URLs redirect to SPA with hash fragment for tab routing diff --git a/gently/ui/web/server.py b/gently/ui/web/server.py index c16c548d..daf91dba 100644 --- a/gently/ui/web/server.py +++ b/gently/ui/web/server.py @@ -784,13 +784,22 @@ async def on_start(self): import socket sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) + # Match uvicorn's own bind semantics. uvicorn sets SO_REUSEADDR before it + # binds, so a bare preflight bind WITHOUT it is *stricter* than the real + # server: when a previous instance has just exited, its browser/websocket + # connections linger in TIME_WAIT holding this local port, and a plain + # bind() fails with EADDRINUSE even though uvicorn would bind fine. That + # false positive was the recurring "port in use" on quick restarts. With + # SO_REUSEADDR the preflight now fails only on a genuine live listener + # (a real second instance) — exactly when uvicorn would also fail. + sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1) try: sock.bind((self.host, self.port)) except OSError: raise OSError( - f"Port {self.port} is already in use. " - "Is another instance of the agent running? " - "Close it first and try again." + f"Port {self.port} is already in use — another instance may be running. " + f"Free it with: fuser -k {self.port}/tcp " + f"(or: lsof -ti:{self.port} | xargs -r kill), then try again." 
) from None finally: sock.close() diff --git a/gently/ui/web/static/css/agent-chat.css b/gently/ui/web/static/css/agent-chat.css index fdaaa8e3..29d64646 100644 --- a/gently/ui/web/static/css/agent-chat.css +++ b/gently/ui/web/static/css/agent-chat.css @@ -441,3 +441,42 @@ body.chat-docked .agent-chat:not(.open) { color: var(--text-muted); cursor: pointer; font-size: 12px; line-height: 1; } .ac-queue-remove:hover { color: var(--color-danger, #f87171); } + +/* ── Rendered markdown (mdToHtml output, ac-md-* classes) ────────────────── + Shared by the chat transcript and the ux_v2 plan-wizard activity feed — the + same renderer feeds both, so these styles cover headings, lists, tables, + code blocks, quotes and links the agent emits. */ +.ac-md { line-height: 1.55; } +.ac-md > :first-child { margin-top: 0; } +.ac-md > :last-child { margin-bottom: 0; } +.ac-md-h1, .ac-md-h2, .ac-md-h3, .ac-md-h4, .ac-md-h5, .ac-md-h6 { + margin: 14px 0 6px; font-weight: 650; line-height: 1.3; letter-spacing: -.01em; color: var(--text); +} +.ac-md-h1 { font-size: 1.25em; } +.ac-md-h2 { font-size: 1.15em; } +.ac-md-h3 { font-size: 1.05em; } +.ac-md-h4, .ac-md-h5, .ac-md-h6 { font-size: 1em; } +.ac-md-p { margin: 7px 0; } +.ac-md-ul, .ac-md-ol { margin: 7px 0; padding-left: 22px; } +.ac-md-li { margin: 3px 0; } +.ac-md-quote { + margin: 8px 0; padding: 4px 12px; border-left: 3px solid var(--border, #e4e9f0); + color: var(--text-muted); font-style: italic; +} +.ac-md-hr { border: 0; border-top: 1px solid var(--border, #e4e9f0); margin: 12px 0; } +.ac-md-link { color: var(--accent, #2f6df6); text-decoration: underline; text-underline-offset: 2px; } +.ac-md-pre { + margin: 8px 0; padding: 10px 12px; border-radius: 8px; overflow-x: auto; + background: var(--bg, #f6f8fb); border: 1px solid var(--border, #e4e9f0); +} +.ac-md-pre .ac-md-code-block, .ac-md-pre code { + font-family: 'JetBrains Mono', ui-monospace, monospace; font-size: 12px; + color: var(--text); background: none; padding: 0; 
white-space: pre; +} +/* GFM tables — wrapped so a wide table scrolls instead of blowing out the column */ +.ac-md-table-wrap { margin: 9px 0; overflow-x: auto; border: 1px solid var(--border, #e4e9f0); border-radius: 8px; } +.ac-md-table { border-collapse: collapse; width: 100%; font-size: 12.5px; } +.ac-md-table th, .ac-md-table td { padding: 6px 10px; text-align: left; border-bottom: 1px solid var(--border, #e4e9f0); border-right: 1px solid var(--border, #e4e9f0); } +.ac-md-table th:last-child, .ac-md-table td:last-child { border-right: 0; } +.ac-md-table tr:last-child td { border-bottom: 0; } +.ac-md-table thead th { background: var(--bg, #f6f8fb); font-weight: 650; color: var(--text); } diff --git a/gently/ui/web/static/css/ask-stage.css b/gently/ui/web/static/css/ask-stage.css new file mode 100644 index 00000000..dc15c623 --- /dev/null +++ b/gently/ui/web/static/css/ask-stage.css @@ -0,0 +1,56 @@ +/* Main-stage ask surface (ux_v2): the agent's current pending ask, rendered + prominently outside the chat transcript. Reuses the .ac-choice card markup + from agent-chat.css; this file frames the stage container and adds the + shared free-text ("Something else…") escape styling. Only #ask-stage is + gated behind the flag, so loading this CSS unconditionally is harmless. */ + +.ask-stage { margin: 14px 16px 0; } +.ask-stage.hidden { display: none; } + +.ask-stage .ac-choice { + border: 1px solid var(--border, #e4e9f0); + border-radius: 14px; + padding: 16px 18px; + background: var(--surface, #fff); + box-shadow: 0 8px 28px rgba(15, 23, 42, .08); +} +.ask-stage .ac-choice-q { + font-size: 1.02rem; + font-weight: 600; + margin-bottom: 12px; +} + +/* Free-text "Something else…" escape — present on ask cards in BOTH surfaces. 
*/ +.ac-choice-otherwrap { margin-top: 6px; } +.ac-choice-other.hidden, +.ac-choice-otherform.hidden { display: none; } +.ac-choice-otherform { display: flex; gap: 6px; align-items: center; margin-top: 4px; } +.ac-choice-otherinput { + flex: 1; min-width: 0; + padding: 8px 10px; + border: 1px solid var(--border, #cbd5e1); + border-radius: 8px; + font: inherit; + background: var(--surface, #fff); + color: inherit; +} +.ac-choice-otherinput:focus { outline: none; border-color: var(--accent, #2f6df6); } +.ac-choice-othergo { + border: 0; cursor: pointer; + background: var(--accent, #2f6df6); color: #fff; + border-radius: 8px; padding: 8px 12px; line-height: 1; +} + +/* Per-field provenance tag on imaging-spec rows (Phase 3b): shows where an + inferred value came from, e.g. "inferred · medium". */ +.ac-spec-src { + margin-left: 6px; + font-size: 10px; + letter-spacing: .02em; + color: var(--text-muted, #94a3b8); + background: var(--bg-hover, #f1f5f9); + border-radius: 999px; + padding: 1px 7px; + white-space: nowrap; + vertical-align: middle; +} diff --git a/gently/ui/web/static/css/landing.css b/gently/ui/web/static/css/landing.css new file mode 100644 index 00000000..046675de --- /dev/null +++ b/gently/ui/web/static/css/landing.css @@ -0,0 +1,462 @@ +/* ux_v2 landing — the agent-first welcome that the prototype sketched, ported + into production. A full-bleed overlay shown on first entry that recedes into + the workspace once the user picks a path. Everything is scoped under + body.ux-v2 and the #v2-landing node only renders when the flag is on, so v1 + is byte-for-byte untouched. Visual language mirrors ux-prototype/landing.html + but reuses production's CSS variables (with the prototype hexes as fallback) + so it tracks the app theme. */ + +/* ux_v2 landing fills the gaps in the production token set (main.css defines + --bg-dark/-card/-hover, --border, --text, --text-muted, --accent, --accent-green + but NOT a page-bg alias, a secondary-text, or accent tints). 
Scope to + body.ux-v2 so v1 is untouched; both themes resolved here so landing.css can + reference these like any real token. dark is the default theme (main.css :root). */ +body.ux-v2 { + --bg: var(--bg-dark); /* page background, theme-aware */ + --text-secondary: var(--text-muted); + --accent-soft: rgba(96,165,250,.15); /* tint of dark --accent #60a5fa */ + --accent-green-soft: rgba(74,222,128,.15); /* tint of dark --accent-green */ + /* one disciplined type scale for the landing/plan surface */ + --v2-fs-body: 14px; + --v2-fs-sm: 13px; + --v2-fs-cap: 12px; + --v2-fs-eyebrow: 11px; +} +body.ux-v2[data-theme="light"] { + --accent-soft: rgba(59,130,246,.10); /* tint of light --accent #3b82f6 */ + --accent-green-soft: rgba(34,197,94,.12); /* tint of light --accent-green */ +} +/* accent-keyed glows can't put var() inside rgba channels, so the dark defaults + live on the elements (re-keyed off the dead #2f6df6 onto the real #60a5fa) and + light overrides ride here next to the tokens. */ +body.ux-v2[data-theme="light"] .v2-landing-orb { box-shadow: 0 6px 22px rgba(59,130,246,.35), inset 0 0 12px rgba(255,255,255,.6); } +body.ux-v2[data-theme="light"] .v2-escape-field input:focus { box-shadow: 0 0 0 4px rgba(59,130,246,.12); } +body.ux-v2[data-theme="light"] .v2-escape-send { box-shadow: 0 6px 16px rgba(59,130,246,.35); } + +.v2-landing { + position: fixed; + inset: 0; + z-index: 200; + display: flex; + align-items: flex-start; /* BOTH screens top-anchored — no discrete switch on swap */ + justify-content: center; + padding: 24px; + overflow: hidden; + background: + radial-gradient(1100px 700px at 78% -8%, var(--accent-soft) 0%, transparent 55%), + radial-gradient(900px 600px at 8% 108%, var(--accent-green-soft) 0%, transparent 55%), + var(--bg); + transition: opacity .5s cubic-bezier(.22,1,.36,1), transform .5s cubic-bezier(.22,1,.36,1), visibility .5s; +} +/* The calm screen "unfolds" into the workspace: fade + slight scale-up, then + the node is pulled from the 
layout (display:none set by JS after the + transition) so it never traps clicks. */ +.v2-landing.dismissed { + opacity: 0; + visibility: hidden; + transform: scale(1.015); + pointer-events: none; +} +.v2-landing::before { + content: ""; + position: absolute; + inset: -20vmax; + background: radial-gradient(closest-side, var(--accent-soft), transparent 70%); + filter: blur(30px); + animation: v2land-drift 26s cubic-bezier(.22,1,.36,1) infinite alternate; + will-change: transform; + pointer-events: none; +} +@keyframes v2land-drift { + 0% { transform: translate(-6vw,-4vh) scale(1); } + 100% { transform: translate(8vw,6vh) scale(1.15); } +} + +.v2-landing-inner { + position: relative; + z-index: 1; + display: flex; + flex-direction: column; + align-items: center; + max-width: 760px; + width: 100%; + margin-top: 7vh; /* shared anchor for welcome AND plan — orb stays put on swap */ + margin-bottom: 5vh; +} +.v2-landing-rise { animation: v2land-rise .6s cubic-bezier(.22,1,.36,1) backwards; } +.v2-landing-rise[data-i="1"] { animation-delay: .07s; } +.v2-landing-rise[data-i="2"] { animation-delay: .14s; } +.v2-landing-rise[data-i="3"] { animation-delay: .21s; } +@keyframes v2land-rise { from { opacity: 0; transform: translateY(14px); } to { opacity: 1; transform: none; } } + +/* agent presence */ +.v2-landing-agent { display: flex; flex-direction: column; align-items: center; gap: 16px; } +.v2-landing-orb { + width: 52px; height: 52px; border-radius: 50%; + background: radial-gradient(closest-side at 38% 34%, #ffffff, #bcd3ff 40%, var(--accent, #2f6df6) 100%); + box-shadow: 0 6px 22px rgba(96,165,250,.45), inset 0 0 12px rgba(255,255,255,.6); + animation: v2land-breathe 4s ease-in-out infinite; +} +@keyframes v2land-breathe { 0%,100% { transform: scale(1); } 50% { transform: scale(1.06); } } +.v2-landing-say { + font-size: clamp(20px, 3vw, 28px); font-weight: 600; letter-spacing: -.02em; + text-align: center; max-width: 22ch; line-height: 1.25; color: var(--text, #0f172a); 
+} +.v2-landing-say .dim { color: var(--text-muted, #94a3b8); font-weight: 500; } + +/* choice cards */ +.v2-landing-choices { + display: grid; grid-template-columns: repeat(2, minmax(0,1fr)); gap: 16px; + margin-top: 32px; width: min(720px, 92vw); +} +@media (max-width: 620px) { .v2-landing-choices { grid-template-columns: 1fr; } } +.v2-choice { + text-align: left; cursor: pointer; position: relative; overflow: hidden; + border: 1px solid var(--border, #e4e9f0); background: var(--bg-card, #fff); + border-radius: 18px; padding: 20px; + box-shadow: 0 1px 2px rgba(15,23,42,.04), 0 8px 28px rgba(15,23,42,.06); + font: inherit; color: var(--text, #0f172a); + transition: transform .26s cubic-bezier(.22,1,.36,1), box-shadow .26s cubic-bezier(.22,1,.36,1), border-color .26s; +} +.v2-choice:hover { + transform: translateY(-4px); + border-color: color-mix(in srgb, var(--accent) 45%, var(--border)); + box-shadow: 0 2px 6px rgba(15,23,42,.06), + 0 18px 50px color-mix(in srgb, var(--accent) 16%, transparent); +} +.v2-choice:active { transform: translateY(-1px) scale(.995); } + +/* Visible keyboard focus for every landing/plan control (mouse clicks get no ring) */ +#v2-landing :focus-visible { + outline: 2px solid var(--accent); + outline-offset: 2px; + border-radius: 12px; /* hug the pill/card corners */ +} +#v2-landing .v2-choice:focus-visible { outline-offset: -2px; } /* inset: the card clips outside outlines */ +#v2-landing .v2-escape-field input:focus-visible { outline-offset: 0; } +.v2-choice-ic { + width: 40px; height: 40px; border-radius: 11px; display: grid; place-items: center; + background: var(--accent-soft, #eaf1ff); color: var(--accent, #2f6df6); margin-bottom: 14px; +} +.v2-choice.alt .v2-choice-ic { background: var(--accent-green-soft, #e7f6ec); color: var(--accent-green, #16a34a); } +.v2-choice h3 { margin: 0 0 6px; font-size: 17px; letter-spacing: -.01em; } +.v2-choice p { margin: 0; color: var(--text-secondary, #475569); font-size: var(--v2-fs-sm); 
line-height: 1.5; } +.v2-choice-tag { + position: absolute; top: 16px; right: 16px; + font-size: var(--v2-fs-eyebrow); letter-spacing: .08em; text-transform: uppercase; color: var(--text-muted, #94a3b8); + border: 1px solid var(--border, #e4e9f0); border-radius: 999px; padding: 3px 9px; +} +.v2-choice-go { + margin-top: 16px; display: flex; align-items: center; gap: 6px; + color: var(--accent, #2f6df6); font-size: 13px; font-weight: 600; + opacity: 0; transform: translateX(-4px); transition: .26s cubic-bezier(.22,1,.36,1); +} +.v2-choice.alt .v2-choice-go { color: var(--accent-green, #16a34a); } +.v2-choice:hover .v2-choice-go { opacity: 1; transform: none; } + +/* escape hatch — chat is the last resort, an obvious pill */ +.v2-escape { margin-top: 24px; display: flex; flex-direction: column; align-items: center; } +.v2-escape-toggle { + display: inline-flex; align-items: center; gap: 7px; cursor: pointer; font: inherit; font-size: 13px; + background: var(--bg-card, #fff); border: 1px solid var(--border, #e4e9f0); color: var(--text-secondary, #475569); + padding: 11px 16px; border-radius: 999px; + box-shadow: 0 1px 2px rgba(15,23,42,.04), 0 8px 28px rgba(15,23,42,.06); + transition: color .2s, border-color .2s, transform .2s cubic-bezier(.22,1,.36,1); +} +.v2-escape-toggle:hover { color: var(--text, #0f172a); border-color: var(--border-strong); transform: translateY(-1px); } +.v2-escape-toggle .v2-escape-caret { display: inline-block; transition: transform .3s cubic-bezier(.22,1,.36,1); opacity: .55; } +.v2-escape.open .v2-escape-toggle .v2-escape-caret { transform: rotate(180deg); } +.v2-escape-field { + display: flex; align-items: center; gap: 8px; width: min(520px, 90vw); + max-height: 0; opacity: 0; overflow: hidden; + transition: max-height .4s cubic-bezier(.22,1,.36,1), opacity .4s cubic-bezier(.22,1,.36,1), margin .4s cubic-bezier(.22,1,.36,1); +} +.v2-escape.open .v2-escape-field { max-height: 64px; opacity: 1; margin-top: 12px; } +.v2-escape-field input { 
+ flex: 1; min-width: 0; border: 1px solid var(--border, #e4e9f0); background: var(--bg-card, #fff); + border-radius: 12px; padding: 12px 14px; font: inherit; font-size: var(--v2-fs-body); color: var(--text, #0f172a); + outline: none; box-shadow: 0 1px 2px rgba(15,23,42,.04); + transition: border-color .2s, box-shadow .2s; +} +.v2-escape-field input:focus { border-color: var(--accent, #2f6df6); box-shadow: 0 0 0 4px rgba(96,165,250,.18); } +.v2-escape-send { + appearance: none; border: 0; cursor: pointer; flex: none; width: 42px; height: 42px; border-radius: 12px; + background: var(--accent, #2f6df6); color: #fff; display: grid; place-items: center; + box-shadow: 0 6px 16px rgba(96,165,250,.40); transition: transform .2s cubic-bezier(.22,1,.36,1), filter .2s; +} +.v2-escape-send:hover { transform: translateY(-1px); filter: brightness(1.05); } + +/* one-way skip into the workspace (e.g. a reload mid-session) */ +.v2-landing-skip { + margin-top: 24px; background: none; border: 0; cursor: pointer; font: inherit; font-size: var(--v2-fs-cap); + color: var(--text-muted, #94a3b8); padding: 10px 12px; border-radius: 8px; + transition: color .2s; +} +.v2-landing-skip:hover { color: var(--text-secondary, #475569); } + +/* Under ux_v2 the landing IS the welcome moment, so the legacy home hero + (static "Welcome to Gently" + start button) would be a duplicate behind it — + hide it. The recent-* cards and the context surface stay. 
*/ +body.ux-v2 .home-hero { display: none; } + +/* ── Two-screen system: welcome ↔ in-place plan wizard ───────── */ +.v2-landing-inner { max-width: 980px; } /* widen for the plan layout */ +.v2-landing .v2-screen { display: none; width: 100%; } +.v2-landing[data-screen="welcome"] .v2-screen-welcome { + display: flex; flex-direction: column; align-items: center; + max-width: 760px; margin: 0 auto; + animation: v2land-plan-in .42s cubic-bezier(.22,1,.36,1) backwards; +} +.v2-landing[data-screen="plan"] .v2-screen-plan { + display: flex; flex-direction: column; + animation: v2land-plan-in .42s cubic-bezier(.22,1,.36,1) backwards; +} +/* One swap motion shared by both screens: a soft opacity + rise + settle. The + scale .992→1 echoes the dismissed-state scale(1.015) so the surface feels like + one continuous fabric folding, not two slides swapping. */ +@keyframes v2land-plan-in { + from { opacity: 0; transform: translateY(10px) scale(.992); } + to { opacity: 1; transform: none; } +} + +/* plan header */ +.v2-plan-head { display: flex; align-items: center; gap: 12px; margin-bottom: 16px; } +.v2-plan-orb { width: 40px; height: 40px; transition: width .42s cubic-bezier(.22,1,.36,1), height .42s cubic-bezier(.22,1,.36,1); } +.v2-plan-who { font-size: var(--v2-fs-eyebrow); letter-spacing: .08em; text-transform: uppercase; color: var(--text-muted, #94a3b8); } +.v2-plan-title { font-size: 18px; font-weight: 600; letter-spacing: -.01em; color: var(--text, #0f172a); } +.v2-plan-back { + margin-left: auto; background: none; border: 0; cursor: pointer; font: inherit; font-size: 13px; + color: var(--text-muted, #94a3b8); padding: 9px 12px; border-radius: 8px; transition: color .2s, background .2s; +} +.v2-plan-back:hover { color: var(--text, #0f172a); background: rgba(15,23,42,.05); } + +/* plan body: ask stage + assembling plan */ +.v2-plan-wrap { display: grid; grid-template-columns: 1.35fr .9fr; gap: 20px; align-items: start; } +@media (max-width: 720px) { + .v2-plan-wrap { 
grid-template-columns: 1fr; } + /* single column: THE PLAN sits BELOW the feed — drop the internal scroll + and let the whole plan screen scroll as one document instead. The + descendant selector outranks the plain `.v2-plan-main { max-height }` + rule that appears later in the file (equal specificity → source order), + so the cap is genuinely lifted here, not silently re-applied. */ + .v2-landing[data-screen="plan"] .v2-plan-main { height: auto; max-height: none; overflow-y: visible; padding-right: 0; } + .v2-landing[data-screen="plan"] { overflow-y: auto; } +} +.v2-plan-main { min-height: 220px; } +.v2-plan-ask:empty { display: none; } +.v2-plan-thinking { display: flex; align-items: center; gap: 9px; color: var(--text-muted, #94a3b8); font-size: var(--v2-fs-sm); padding: 20px 4px; } +.v2-plan-thinking.hidden { display: none; } +.v2-typing { display: inline-flex; gap: 5px; align-items: center; } +.v2-typing i { width: 7px; height: 7px; border-radius: 50%; background: var(--accent, #2f6df6); opacity: .4; animation: v2-blink 1.1s infinite; } +.v2-typing i:nth-child(2) { animation-delay: .15s; } +.v2-typing i:nth-child(3) { animation-delay: .3s; } +@keyframes v2-blink { 0%,100% { opacity: .25; transform: translateY(0); } 50% { opacity: 1; transform: translateY(-3px); } } + +.v2-plan-side { + background: var(--bg-card, #fff); border: 1px solid var(--border, #e4e9f0); border-radius: 14px; padding: 14px 16px; + box-shadow: 0 1px 2px rgba(15,23,42,.04); +} +.v2-plan-side-h { font-size: var(--v2-fs-eyebrow); letter-spacing: .08em; text-transform: uppercase; color: var(--text-muted, #94a3b8); margin-bottom: 10px; } +.v2-plan-side-empty { color: var(--text-muted, #94a3b8); font-size: var(--v2-fs-sm); font-style: italic; } +.v2-plan-row { display: flex; flex-direction: column; gap: 2px; padding: 9px 0; border-top: 1px dashed var(--border, #e4e9f0); } +.v2-plan-row:first-child { border-top: 0; } +.v2-plan-row .k { font-size: var(--v2-fs-eyebrow); letter-spacing: .08em; 
text-transform: uppercase; color: var(--text-muted, #94a3b8); } +.v2-plan-row .v { font-size: var(--v2-fs-body); color: var(--text, #0f172a); font-weight: 600; } + +/* plan footer: "Open conversation" (quiet, left), a spacer, then "Continue in + workspace" demoted to a text link (right). The agent's recommended option in + the ask card is the real primary action now — the footer no longer competes. */ +.v2-plan-foot { display: flex; align-items: center; gap: 10px; margin-top: 20px; padding-top: 16px; border-top: 1px solid var(--border, #e4e9f0); } +.v2-plan-foot-spacer { flex: 1; } +.v2-plan-chat { + background: none; border: 1px solid var(--border, #e4e9f0); color: var(--text-secondary, #475569); + border-radius: 999px; padding: 11px 16px; font: inherit; font-size: var(--v2-fs-sm); cursor: pointer; transition: border-color .2s, color .2s; +} +.v2-plan-chat:hover { border-color: var(--border-strong); color: var(--text, #0f172a); } +.v2-plan-skip { + background: none; border: 0; cursor: pointer; font: inherit; font-size: var(--v2-fs-sm); + color: var(--text-muted, #94a3b8); padding: 11px 10px; border-radius: 8px; transition: color .2s; +} +.v2-plan-skip:hover { color: var(--text-secondary, #475569); } +.v2-plan-export { + background: none; border: 1px solid var(--border, #e4e9f0); color: var(--text-secondary, #475569); + border-radius: 999px; padding: 11px 16px; font: inherit; font-size: var(--v2-fs-sm); + font-weight: 600; cursor: pointer; transition: border-color .2s, color .2s, background .2s; +} +.v2-plan-export:hover { border-color: var(--border-strong); color: var(--text, #0f172a); background: var(--bg-hover); } +.v2-plan-export:disabled { opacity: .6; cursor: default; } +.v2-plan-export[hidden] { display: none; } + +/* ── Plan-ready state: the design is done, signpost the finish line ───────── */ +.v2-screen-plan.ready .v2-plan-orb { + background: radial-gradient(closest-side at 38% 34%, #ffffff, #c8f0d4 40%, var(--accent-green, #16a34a) 100%); +} 
+.v2-screen-plan.ready .v2-plan-who { color: var(--accent-green, #16a34a); } +.v2-screen-plan.ready .v2-plan-foot { border-top-color: color-mix(in srgb, var(--accent-green) 35%, var(--border)); } +/* promote "open the workspace" from a quiet skip link to the primary action */ +.v2-screen-plan.ready #v2-plan-continue { + background: var(--accent-green, #16a34a); color: #fff; + border-radius: 999px; padding: 11px 20px; font-weight: 600; +} +.v2-screen-plan.ready #v2-plan-continue:hover { + color: #fff; background: color-mix(in srgb, var(--accent-green) 88%, #000); +} + +/* ── Agent-activity feed: claude.ai-style collapsible tool cards ──────────── */ +/* Both screens share the .v2-landing-inner top anchor (no per-screen align flip — + that was the welcome→plan lurch). The feed is a fixed-height viewport (height + set above) so the streaming activity column scrolls on its own without ever + reflowing the anchored header/footer around it. Short feeds stay compact + (min-height above); long feeds cap at 66vh and scroll internally. The header + never moves because the inner is top-anchored — only the footer rides down as + the feed grows, up to the cap. */ +.v2-plan-main { max-height: 66vh; overflow-y: auto; padding-right: 4px; } + +.v2-plan-activity { display: flex; flex-direction: column; gap: 8px; margin-bottom: 12px; } +.v2-plan-activity:empty { display: none; margin: 0; } + +/* Paginated feed — one agent step (turn) per page, flipped with ‹ Prev / Next ›. + The pager bar / dots reuse .v2-plan-pager / .v2-plan-dots styling. 
*/ +.v2-feed-pages { display: flex; flex-direction: column; } +.v2-act-page { display: none; flex-direction: column; gap: 8px; } +.v2-act-page.active { display: flex; } +/* beat .v2-plan-pager/.v2-plan-dots { display:flex } so [hidden] actually hides */ +.v2-plan-pager[hidden], .v2-plan-dots[hidden] { display: none; } +.v2-feed-pager-bar { margin: 0 0 4px; } +.v2-feed-dots { margin-top: 10px; } +/* the current question, pinned below the paged feed, set off by a divider — + only once there are steps above it (no stray line on the first choice card) */ +#v2-plan-activity:has(.v2-act-page) + .v2-plan-ask:not(:empty) { + margin-top: 14px; padding-top: 14px; border-top: 1px solid var(--border, #e4e9f0); +} + +/* agent prose between tool calls */ +.v2-act-text { font-size: var(--v2-fs-sm); line-height: 1.55; color: var(--text-secondary, #475569); white-space: pre-wrap; } + +/* collapsed-by-default tool card; the header toggles .open to reveal the body */ +.v2-act-tool { border: 1px solid var(--border, #e4e9f0); border-radius: 11px; background: var(--bg-card, #fff); overflow: hidden; } +.v2-act-tool-head { + display: flex; align-items: center; gap: 8px; width: 100%; + background: none; border: 0; cursor: pointer; text-align: left; font: inherit; + padding: 11px 12px; color: var(--text, #0f172a); +} +.v2-act-tool-head:hover { background: var(--bg-hover, #f1f5f9); } +.v2-act-ic { width: 16px; flex: none; text-align: center; font-size: 12px; } +.v2-act-tool.done .v2-act-ic { color: var(--accent-green, #16a34a); } +.v2-act-tool.err .v2-act-ic { color: #ea580c; } +body.ux-v2[data-theme="dark"] .v2-act-tool.err .v2-act-ic { color: #fb923c; } +.v2-act-spin { + display: inline-block; width: 11px; height: 11px; border-radius: 50%; + border: 2px solid var(--border, #e4e9f0); border-top-color: var(--accent, #2f6df6); + animation: v2-act-spin .7s linear infinite; +} +@keyframes v2-act-spin { to { transform: rotate(360deg); } } +.v2-act-label { font-size: var(--v2-fs-sm); font-weight: 
600; flex: none; } +.v2-act-summary { font-size: var(--v2-fs-cap); color: var(--text-muted, #94a3b8); flex: 1; min-width: 0; overflow: hidden; text-overflow: ellipsis; white-space: nowrap; } +.v2-act-chev { flex: none; color: var(--text-muted, #94a3b8); transition: transform .2s; font-size: 13px; } +.v2-act-tool.open .v2-act-chev { transform: rotate(90deg); } +/* animatable collapse: grid-template-rows 0fr→1fr eases in step with the chevron + (display:none isn't animatable). Needs exactly ONE min-height:0 child — landing.js + wraps the blocks in a single inner div for this. */ +.v2-act-tool-body { + display: grid; grid-template-rows: 0fr; opacity: 0; + padding: 0 12px 0 37px; + transition: grid-template-rows .26s cubic-bezier(.22,1,.36,1), + opacity .26s cubic-bezier(.22,1,.36,1), + padding-bottom .26s cubic-bezier(.22,1,.36,1); +} +.v2-act-tool-body > * { min-height: 0; overflow: hidden; } +.v2-act-tool.open .v2-act-tool-body { grid-template-rows: 1fr; opacity: 1; padding-bottom: 11px; } +.v2-act-tool.open .v2-act-summary { white-space: normal; } +.v2-act-block-label { font-size: var(--v2-fs-eyebrow); letter-spacing: .08em; text-transform: uppercase; color: var(--text-muted, #94a3b8); margin-top: 8px; } +.v2-act-block { + font: 12px/1.5 ui-monospace, SFMono-Regular, Menlo, monospace; + background: var(--bg-hover); border: 1px solid var(--border, #e4e9f0); border-radius: 8px; + padding: 8px 10px; margin-top: 4px; white-space: pre-wrap; word-break: break-word; + color: var(--text-secondary, #475569); max-height: 220px; overflow: auto; +} + +/* error + fallback states */ +.v2-plan-error { + font-size: var(--v2-fs-sm); color: #b91c1c; + background: rgba(239,68,68,.10); border: 1px solid rgba(239,68,68,.35); + border-radius: 11px; padding: 11px 13px; +} +body.ux-v2[data-theme="dark"] .v2-plan-error { + color: #fca5a5; background: rgba(239,68,68,.14); border-color: rgba(239,68,68,.40); +} +.v2-plan-error.hidden { display: none; } +.v2-plan-fallback { font-size: 13px; 
color: var(--text-muted, #94a3b8); padding: 8px 2px; } +.v2-plan-fallback a { color: var(--accent, #2f6df6); cursor: pointer; } + +/* plan-panel: phases + tasks (real plan), and a free-text-answer row variant */ +.v2-plan-phase { margin-top: 12px; } +.v2-plan-phase:first-child { margin-top: 0; } +.v2-plan-phase-h { + font-size: var(--v2-fs-eyebrow); font-weight: 700; letter-spacing: .06em; + text-transform: uppercase; color: var(--text-secondary, #475569); margin-bottom: 6px; +} +/* a plan item: "P.I" number · type-color dot · title · optional duration. + the type dot encodes the item kind (imaging/genetics/analysis/…) at a glance. */ +.v2-plan-task { + display: grid; grid-template-columns: auto 8px 1fr auto; align-items: baseline; + gap: 9px; font-size: var(--v2-fs-cap); color: var(--text-secondary, #475569); + padding: 6px 0; border-top: 1px solid color-mix(in srgb, var(--border, #e4e9f0) 55%, transparent); +} +.v2-plan-phase .v2-plan-task:first-child { border-top: 0; } +.v2-task-num { + font: 600 11px/1.5 ui-monospace, SFMono-Regular, Menlo, monospace; + color: var(--text-muted, #94a3b8); font-variant-numeric: tabular-nums; +} +.v2-task-dot { width: 8px; height: 8px; border-radius: 50%; align-self: center; background: var(--text-muted, #94a3b8); } +.v2-task-ttl { min-width: 0; color: var(--text, #0f172a); line-height: 1.45; } +.v2-task-days { + font-size: 10.5px; font-weight: 600; color: var(--text-muted, #94a3b8); + font-variant-numeric: tabular-nums; white-space: nowrap; +} +.v2-plan-task.type-imaging .v2-task-dot { background: var(--accent, #2f6df6); } +.v2-plan-task.type-genetics .v2-task-dot { background: #8b5cf6; } +.v2-plan-task.type-analysis .v2-task-dot { background: var(--accent-green, #16a34a); } +.v2-plan-task.type-bench .v2-task-dot { background: #d97706; } +/* decision points read as gates — a rotated square, not a round bead */ +.v2-plan-task.type-decision_point .v2-task-dot { background: #e11d48; border-radius: 1px; transform: rotate(45deg); } 
+.v2-plan-title-row { font-size: var(--v2-fs-sm); font-weight: 600; letter-spacing: -.01em; color: var(--text, #0f172a); margin-bottom: 8px; } +.v2-plan-row.v2-plan-row-freetext .v { font-style: italic; } +.v2-plan-task-empty { grid-column: 1 / -1; color: var(--text-muted, #94a3b8); font-style: italic; } + +/* THE PLAN pager: ‹ Prev · "Phase · i of N" · Next › + dots, one phase per page */ +.v2-plan-pager { display: flex; align-items: center; gap: 8px; margin: 2px 0 12px; } +.v2-plan-pager-btn { + flex: none; background: none; border: 0; cursor: pointer; font: inherit; + font-size: var(--v2-fs-cap); font-weight: 600; color: var(--accent, #2f6df6); + padding: 5px 7px; border-radius: 7px; transition: background .15s, color .15s, opacity .15s; +} +.v2-plan-pager-btn:hover:not(:disabled) { background: var(--accent-soft); } +.v2-plan-pager-btn:disabled { color: var(--text-muted, #94a3b8); opacity: .45; cursor: default; } +.v2-plan-pager-pos { + flex: 1; min-width: 0; text-align: center; font-size: var(--v2-fs-cap); font-weight: 600; + color: var(--text, #0f172a); overflow: hidden; text-overflow: ellipsis; white-space: nowrap; +} +.v2-plan-dots { display: flex; gap: 6px; justify-content: center; margin-top: 12px; } +.v2-plan-dot { + width: 7px; height: 7px; padding: 0; border: 0; border-radius: 50%; cursor: pointer; + background: var(--border, #e4e9f0); transition: background .2s, transform .2s; +} +.v2-plan-dot:hover { transform: scale(1.25); } +.v2-plan-dot.active { background: var(--accent, #2f6df6); } + +@media (prefers-reduced-motion: reduce) { + .v2-landing, .v2-landing::before, .v2-landing-rise, .v2-landing-orb, .v2-plan-orb, + .v2-landing[data-screen="plan"] .v2-screen-plan, + .v2-landing[data-screen="welcome"] .v2-screen-welcome, + .v2-typing i, .v2-act-spin, .v2-act-chev, + .v2-act-tool-body, .v2-act-tool-head, + .v2-choice, .v2-escape-field, .v2-escape-toggle, + .v2-plan-pager-btn, .v2-plan-dot { + animation: none !important; + transition-duration: .12s 
!important; + } + /* keep the collapsible usable without the height tween */ + .v2-act-tool-body { transition: none !important; } + .v2-act-tool.open .v2-act-tool-body { grid-template-rows: 1fr; opacity: 1; } +} diff --git a/gently/ui/web/static/css/shell.css b/gently/ui/web/static/css/shell.css new file mode 100644 index 00000000..c38ae89a --- /dev/null +++ b/gently/ui/web/static/css/shell.css @@ -0,0 +1,105 @@ +/* ux_v2 shell chrome: grouped left-rail nav + session-context strip. + Everything is scoped under body.ux-v2 so the v1 dashboard is byte-for-byte + untouched — no consolidation of the existing duplicate .tab rulesets here + (that cleanup is deferred to the final phase). */ + +/* Replace the flat 8-tab bar with the rail. */ +body.ux-v2 .tabs { display: none; } + +/* ── Left rail ─────────────────────────────────────────────── */ +body.ux-v2 .v2-rail { + flex: 0 0 212px; + display: flex; + flex-direction: column; + gap: 2px; + padding: 14px 10px; + border-right: 1px solid var(--border, #e4e9f0); + background: var(--bg-card, #fff); + overflow-y: auto; + animation: v2-rise .45s ease backwards; +} +body.ux-v2 .v2-nav-group { margin-bottom: 6px; } +body.ux-v2 .v2-nav-label { + font-size: 10px; letter-spacing: .1em; text-transform: uppercase; + color: var(--text-muted, #94a3b8); padding: 10px 10px 4px; +} +body.ux-v2 .v2-nav-item { + display: flex; align-items: center; gap: 8px; width: 100%; + background: none; border: 0; cursor: pointer; text-align: left; + padding: 8px 10px; border-radius: 8px; + font: inherit; font-size: 13.5px; + color: var(--text-secondary, #475569); + transition: background .15s, color .15s; +} +body.ux-v2 .v2-nav-item:hover { background: var(--bg-hover, #f1f5f9); color: var(--text, #0f172a); } +body.ux-v2 .v2-nav-item.active { + background: var(--accent-soft, #eaf1ff); + color: var(--accent, #2f6df6); + font-weight: 600; +} +body.ux-v2 .v2-rail-chat { + margin-top: auto; + display: flex; align-items: center; gap: 9px; + background: none; 
border: 1px solid var(--border, #e4e9f0); border-radius: 10px; + padding: 9px 12px; cursor: pointer; + font: inherit; font-size: 13px; + color: var(--text-secondary, #475569); + transition: border-color .15s, color .15s; +} +body.ux-v2 .v2-rail-chat:hover { border-color: var(--accent, #2f6df6); color: var(--accent, #2f6df6); } +body.ux-v2 .v2-rail-orb { + width: 18px; height: 18px; border-radius: 50%; flex: none; + background: radial-gradient(closest-side at 38% 34%, #fff, #bcd3ff 42%, var(--accent, #2f6df6) 100%); +} + +/* ── Session-context strip (top of main) ───────────────────── */ +body.ux-v2 .v2-strip { + flex: none; + display: flex; align-items: center; gap: 12px; + padding: 9px 16px; + border-bottom: 1px solid var(--border, #e4e9f0); + background: var(--bg-card, #fff); + font-size: 12.5px; color: var(--text-muted, #94a3b8); + animation: v2-rise .45s ease backwards .05s; +} +body.ux-v2 .v2-strip-live { + display: inline-flex; align-items: center; gap: 6px; + font-size: 10.5px; font-weight: 700; letter-spacing: .08em; color: #ef4444; +} +body.ux-v2 .v2-strip-dot { + width: 8px; height: 8px; border-radius: 50%; background: #ef4444; +} +body.ux-v2 .v2-strip-status { margin-left: auto; font-variant-numeric: tabular-nums; } + +body.ux-v2 .app-main { animation: v2-rise .5s ease backwards .1s; } + +@keyframes v2-rise { from { opacity: 0; transform: translateY(8px); } to { opacity: 1; transform: none; } } +@media (prefers-reduced-motion: reduce) { + body.ux-v2 .v2-rail, body.ux-v2 .v2-strip, body.ux-v2 .app-main { animation: none; } +} + +/* ── Shared-visibility surface (the agent's view) ──────────── */ +body.ux-v2 .cx-surface { + margin: 0 0 16px; + border: 1px solid var(--border, #e4e9f0); + border-radius: 14px; + background: var(--bg-card, #fff); + padding: 14px 16px; +} +body.ux-v2 .cx-surface.hidden { display: none; } +body.ux-v2 .cx-title { font-size: 11px; letter-spacing: .1em; text-transform: uppercase; color: var(--text-muted, #94a3b8); margin-bottom: 
8px; } +body.ux-v2 .cx-lens { margin-bottom: 10px; } +body.ux-v2 .cx-lens-h { font-size: 11px; font-weight: 600; color: var(--text-secondary, #475569); margin: 6px 0 4px; } +body.ux-v2 .cx-item { display: flex; align-items: center; gap: 9px; padding: 5px 0; flex-wrap: wrap; } +body.ux-v2 .cx-text { flex: 1; min-width: 0; font-size: 13px; color: var(--text, #0f172a); } +body.ux-v2 .cx-dot { width: 7px; height: 7px; border-radius: 50%; flex: none; } +body.ux-v2 .cx-dot.cx-q { background: #d97706; } +body.ux-v2 .cx-dot.cx-w { background: var(--accent, #2f6df6); } +body.ux-v2 .cx-dot.cx-e { background: var(--accent-green, #16a34a); } +body.ux-v2 .cx-act { flex: none; border: 1px solid var(--border, #e4e9f0); background: none; color: var(--text-secondary, #475569); border-radius: 8px; padding: 3px 10px; font: inherit; font-size: 12px; cursor: pointer; } +body.ux-v2 .cx-act:hover { border-color: var(--accent, #2f6df6); color: var(--accent, #2f6df6); } +body.ux-v2 .cx-answer { display: flex; gap: 6px; align-items: center; flex: 1 0 100%; margin-top: 4px; } +body.ux-v2 .cx-answer.hidden { display: none; } +body.ux-v2 .cx-answer-input { flex: 1; min-width: 0; border: 1px solid var(--border, #cbd5e1); border-radius: 8px; padding: 6px 9px; font: inherit; font-size: 12px; } +body.ux-v2 .cx-answer-go { border: 0; background: var(--accent, #2f6df6); color: #fff; border-radius: 8px; padding: 6px 10px; cursor: pointer; } +body.ux-v2 .cx-empty { font-size: 12.5px; color: var(--text-muted, #94a3b8); font-style: italic; padding: 2px 0 4px; } diff --git a/gently/ui/web/static/js/agent-chat.js b/gently/ui/web/static/js/agent-chat.js index 63339b92..58be6e5c 100644 Binary files a/gently/ui/web/static/js/agent-chat.js and b/gently/ui/web/static/js/agent-chat.js differ diff --git a/gently/ui/web/static/js/app.js b/gently/ui/web/static/js/app.js index d1e75a72..a8c6450c 100644 --- a/gently/ui/web/static/js/app.js +++ b/gently/ui/web/static/js/app.js @@ -60,6 +60,8 @@ function 
updateCalibrationCount() { function switchTab(tabName) { if (!tabName) return; state.tab = tabName; + // ux_v2 grouped rail mirrors the active tab off this single chokepoint. + if (typeof ClientEventBus !== 'undefined') ClientEventBus.emit('TAB_CHANGED', tabName); // Update tab styling document.querySelectorAll('.tab').forEach(t => t.classList.remove('active')); @@ -538,11 +540,13 @@ function fetchDeviceStatus() { .then(r => r.json()) .then(data => { _microscopeConnected = data.microscope; - _setBadge('status-microscope-badge', data.microscope, 'Online', 'Offline'); - updateTopLevelDot(); + ConnectionStatus.setMicroscope(data.microscope); }) .catch(() => { - _setBadge('status-microscope-badge', false, '', '--'); + // Transient poll failure: keep the last-known badge. The next + // successful poll re-renders via the store if the value changed + // (writing '--' here could stick, since the store only re-renders + // on an actual change, not on an unchanged success). }); } @@ -555,23 +559,26 @@ function _setBadge(id, isOn, onText, offText) { } function updateGentlyStatus(connected) { - _setBadge('status-gently-badge', connected, 'Online', 'Offline'); - updateTopLevelDot(); + // Feed the single source of truth; the header re-renders via the + // ConnectionStatus subscriber (renderConnectionUI). + ConnectionStatus.setGently(connected); } -function updateTopLevelDot() { +// Single renderer for the header connection UI, driven by a ConnectionStatus +// snapshot. Subscribed once at startup, so the pill, both popover badges, and +// the dot always reflect the same shared state. 
+function renderConnectionUI(s) { + _setBadge('status-gently-badge', s.gentlyConnected, 'Online', 'Offline'); + _setBadge('status-microscope-badge', s.microscopeConnected, 'Online', 'Offline'); const dot = document.getElementById('status-dot'); const text = document.getElementById('status-text'); if (!dot || !text) return; - const gentlyUp = state.connected; - const scopeUp = _microscopeConnected; - dot.classList.remove('connected', 'partial'); - if (gentlyUp && scopeUp) { + if (s.gentlyConnected && s.microscopeConnected) { dot.classList.add('connected'); text.textContent = 'Connected'; - } else if (gentlyUp) { + } else if (s.gentlyConnected) { dot.classList.add('partial'); text.textContent = 'Online'; } else { @@ -579,6 +586,11 @@ function updateTopLevelDot() { } } +// Back-compat shim: any legacy caller re-renders from the current snapshot. +function updateTopLevelDot() { + renderConnectionUI(ConnectionStatus.get()); +} + document.addEventListener('DOMContentLoaded', () => { // Initialize presence manager (before WebSocket so ID is ready) PresenceManager.init(); @@ -620,6 +632,11 @@ document.addEventListener('DOMContentLoaded', () => { } }); + // Connection status: one source of truth, three writers (this /ws, the + // device-status poll, and the agent /ws/agent). Subscribe the header + // renderer BEFORE connecting so the first handshake renders correctly. + ConnectionStatus.subscribe(renderConnectionUI); + // Start WebSocket connection connectWebSocket(); diff --git a/gently/ui/web/static/js/ask-stage.js b/gently/ui/web/static/js/ask-stage.js new file mode 100644 index 00000000..b5e63db1 --- /dev/null +++ b/gently/ui/web/static/js/ask-stage.js @@ -0,0 +1,53 @@ +/** + * AskStage (ux_v2) — renders the agent's CURRENT pending ask prominently on the + * main stage, in addition to the chat transcript. 
One payload, two renderers: + * it reuses AgentChat.buildAskCard so the stage and the transcript can't drift, + * and answering from either surface clears both (via the ASK_CLEARED event that + * AgentChat fires off the CHOICE lifecycle — not stream_end, which arrives only + * after an in-turn answer and never for a cancelled turn). + * + * No-ops unless #ask-stage is present (gated behind GENTLY_UX_V2 in the + * template), so it never affects the v1 dashboard. + */ +const AskStage = (() => { + let stageEl = null; + let current = null; // { reqId, data, isWake } + + function clear() { + current = null; + if (stageEl) { stageEl.innerHTML = ''; stageEl.classList.add('hidden'); } + } + + function render() { + if (!stageEl || !current || typeof AgentChat === 'undefined' || !AgentChat.buildAskCard) return; + const hasControl = AgentChat.hasControl ? AgentChat.hasControl() : true; + const card = AgentChat.buildAskCard(current.data, { + reqId: current.reqId, + isWake: current.isWake, + hasControl, + onPick: (sel) => AgentChat.answerChoice(current.reqId, sel), + }); + stageEl.innerHTML = ''; + stageEl.appendChild(card); + stageEl.classList.remove('hidden'); + } + + function init() { + stageEl = document.getElementById('ask-stage'); + if (!stageEl || typeof ClientEventBus === 'undefined') return; // ux_v2 off → no-op + + ClientEventBus.on('AGENT_ASK', ({ request_id, choice_data, origin }) => { + current = { reqId: request_id, data: choice_data || {}, isWake: origin === 'wake' }; + render(); + }); + ClientEventBus.on('ASK_CLEARED', ({ request_id }) => { + if (!current) return; + if (request_id === '*' || request_id === current.reqId) clear(); + }); + // Re-render read-only / actionable when control changes hands mid-ask. 
+ ClientEventBus.on('AGENT_CONTROL', () => { if (current) render(); }); + } + + document.addEventListener('DOMContentLoaded', init); + return { clear }; +})(); diff --git a/gently/ui/web/static/js/context-surface.js b/gently/ui/web/static/js/context-surface.js new file mode 100644 index 00000000..2e10000d --- /dev/null +++ b/gently/ui/web/static/js/context-surface.js @@ -0,0 +1,109 @@ +/** + * ContextSurface (ux_v2): renders the agent's "mind" as a calm, always-visible + * panel — open questions (uncertainty), watchpoints (attention), expectations + * (beliefs) — read from /api/context and refreshed live on the CONTEXT_UPDATED + * event (the store emits it; the server broadcasts it to /ws; no polling). + * + * The control holder can resolve items inline (answer a question, resolve a + * watchpoint, confirm an expectation); observers see it read-only. No-ops + * unless #context-surface is present (flag off → v1 untouched). + */ +const ContextSurface = (() => { + let el = null, loading = false; + + const esc = (s) => (typeof escapeHtml === 'function') + ? escapeHtml(String(s == null ? '' : s)) : String(s == null ? '' : s); + const hasControl = () => + (typeof AgentChat !== 'undefined' && AgentChat.hasControl) ? AgentChat.hasControl() : true; + + async function fetchAndRender() { + if (!el || loading) return; + loading = true; + try { render(await (await fetch('/api/context')).json()); } + catch (e) { /* keep last render */ } + finally { loading = false; } + } + + function section(label, html) { return html ? `
${label}
${html}
` : ''; } + + function render(data) { + if (!el) return; + const hc = hasControl(); + const questions = data.questions || [], watchpoints = data.watchpoints || [], expectations = data.expectations || []; + el.classList.remove('hidden'); + if (!questions.length && !watchpoints.length && !expectations.length) { + // Show an empty-state rather than vanishing, so the surface is + // discoverable before the agent has formed any beliefs. + el.innerHTML = '
Agent’s view
' + + '
Nothing yet — the agent’s expectations, watchpoints, and open questions appear here as it works.
'; + return; + } + + const qHtml = questions.map(it => ` +
+ + ${esc(it.content)} + ${hc ? '' : ''} + ${hc ? '' : ''} +
`).join(''); + const wHtml = watchpoints.map(it => ` +
+ + ${esc(it.target)}${it.condition ? ' — ' + esc(it.condition) : ''} + ${hc ? '' : ''} +
`).join(''); + const eHtml = expectations.map(it => ` +
+ + ${esc(it.target)}${it.prediction ? ': ' + esc(it.prediction) : ''} + ${hc ? '' : ''} +
`).join(''); + + el.innerHTML = '
Agent’s view
' + + section('Open questions', qHtml) + section('Watching', wHtml) + section('Expectations', eHtml); + wire(); + } + + function wire() { + el.querySelectorAll('.cx-item').forEach(item => { + const kind = item.dataset.kind, id = item.dataset.id; + const actBtn = item.querySelector('.cx-act'); + if (!actBtn) return; + const act = actBtn.dataset.act; + if (act === 'answer') { + const box = item.querySelector('.cx-answer'); + const input = item.querySelector('.cx-answer-input'); + const submit = () => resolve(kind, id, { resolution: input.value.trim() }); + actBtn.addEventListener('click', () => { box.classList.toggle('hidden'); if (!box.classList.contains('hidden')) input.focus(); }); + item.querySelector('.cx-answer-go').addEventListener('click', submit); + input.addEventListener('keydown', e => { if (e.key === 'Enter') { e.preventDefault(); submit(); } }); + } else if (act === 'resolve') { + actBtn.addEventListener('click', () => resolve(kind, id, {})); + } else if (act === 'confirm') { + actBtn.addEventListener('click', () => resolve(kind, id, { status: 'confirmed' })); + } + }); + } + + async function resolve(kind, id, body) { + try { + await fetch(`/api/context/${kind}/${encodeURIComponent(id)}/resolve`, { + method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify(body || {}), + }); + } catch (e) { /* ignore; surface stays as-is */ } + fetchAndRender(); // CONTEXT_UPDATED also re-fetches for every client + } + + function init() { + el = document.getElementById('context-surface'); + if (!el) return; // ux_v2 off → no-op + if (typeof ClientEventBus !== 'undefined') { + ClientEventBus.on('CONTEXT_UPDATED', () => fetchAndRender()); + ClientEventBus.on('AGENT_CONTROL', () => fetchAndRender()); // re-render with/without resolve controls + } + fetchAndRender(); + } + + document.addEventListener('DOMContentLoaded', init); + return { refresh: fetchAndRender }; +})(); diff --git a/gently/ui/web/static/js/devices.js 
b/gently/ui/web/static/js/devices.js index caa1bf48..040b11d1 100644 --- a/gently/ui/web/static/js/devices.js +++ b/gently/ui/web/static/js/devices.js @@ -14,7 +14,7 @@ */ const DevicesManager = (function () { const STALE_AFTER_MS = 4000; - const VIEWS = ['map', 'details']; + const VIEWS = ['map', 'details', 'optical3d']; const SVG_NS = 'http://www.w3.org/2000/svg'; // Status / details DOM @@ -1500,6 +1500,12 @@ const DevicesManager = (function () { if (typeof updateViewButtons === 'function') { updateViewButtons('devices-view-switcher', viewName); } + // The 3D optical-space view owns its own WebGL module. Build it lazily + // on first activation (the panel was display:none, so its container had + // no size until now); init() is idempotent and resizes on re-entry. + if (viewName === 'optical3d' && typeof Occupancy3DManager !== 'undefined') { + Occupancy3DManager.init(); + } } function setupViewSwitcher() { @@ -1511,6 +1517,7 @@ const DevicesManager = (function () { if (e.target.matches('input, textarea, select, [contenteditable]')) return; if (e.key === 'm') { e.preventDefault(); switchView('map'); } else if (e.key === 'd') { e.preventDefault(); switchView('details'); } + else if (e.key === 'v') { e.preventDefault(); switchView('optical3d'); } }); } diff --git a/gently/ui/web/static/js/experiment-overview.js b/gently/ui/web/static/js/experiment-overview.js index 33032a40..9e7fc221 100644 --- a/gently/ui/web/static/js/experiment-overview.js +++ b/gently/ui/web/static/js/experiment-overview.js @@ -1,147 +1,12 @@ /** - * Experiment Overview Tab — vector-graphics view of the planned timelapse. + * Experiment Overview Tab — vector-graphics view of the live imaging tactics + * (cadence patterns + reactive-monitoring rules) for the running experiment. * - * Data source priority: - * 1. GET /api/experiments/current/strategy — live snapshot from FileStore. - * 2. STUB_STRATEGY below — used when the live fetch - * fails or no session exists. 
- * - * The render path is data-shape-driven and doesn't care which source the - * snapshot came from — only ``ExperimentOverview.isLive`` differs so the - * header badge can say "live" or "mockup · stubbed data". + * Data source: GET /api/experiments/current/strategy — the live snapshot from + * FileStore. When there is no active experiment (or the fetch isn't ready), the + * view shows a calm empty state; it never renders stubbed/mock data. */ -const STUB_STRATEGY = { - session_id: "20260522_1430_dopaminergic_demo_a3f8e1c2", - session_name: "dopaminergic-reporter demo", - started_at: "2026-05-22T14:30:00", - now_offset_s: 8100, // 2h 15min into the run - horizon_s: 14400, // 4h total view window (past + projected) - base_interval_s: 120, - dose_budget_base_ms: 50000, - per_timepoint_ms: 500, // 50 slices × 10ms - monitoring_modes: [ - { - name: "expression_monitoring", - description: "Anticipating fluorescent-reporter onset on Test embryos: accelerate to 60s on signal, ramp 488 down on saturation.", - applies_to_roles: ["test"], - params: { - fast_interval: 60, - rampdown_step_pct: 1.0, - rampdown_floor_pct: 2.0, - rampdown_ceiling_pct: 6.0 - } - }, - { - name: "pre_terminal_monitoring", - description: "Anticipating organism pre-terminal stage (pretzel): accelerate to 30s on detection.", - applies_to_roles: ["test"], - params: { fast_interval: 30 } - } - ], - triggers: [ - { id: "t1", kind: "interval_rule", label: "signal onset", - when_text: "dopaminergic ≥ WEAK", then_text: "120s → 60s", - applies_to: ["test"], one_time: true }, - { id: "t2", kind: "power_rule", label: "488 ramp down", - when_text: "intensity = SATURATING (×3)", then_text: "488 ↓ 1%/step, floor 2%", - applies_to: ["test"] }, - { id: "t3", kind: "burst", label: "structure-triggered burst", - when_text: "structure_quality = GOOD", then_text: "burst 200 frames @ 20 Hz", - applies_to: ["test"] }, - { id: "t4", kind: "interval_rule", label: "pre-terminal speedup", - when_text: "stage = pretzel", 
then_text: "60s → 30s", - applies_to: ["test"], one_time: true } - ], - embryos: [ - { - id: "E1", role: "test", color: "#ff66cc", icon: "★", - dose_used_ms: 12500, dose_budget_ms: 50000, - tp_acquired: 25, - stop_condition: "hatching+3 OR 24h duration", - stop_kind: "bounded", - laser_488_pct_now: 3.0, - phases: [ - { mode: "base", start: 0, end: 1800, cadence_s: 120 }, - { mode: "fast", start: 1800, end: 3600, cadence_s: 60 }, - { mode: "burst", start: 3600, end: 3610, frames: 200, hz: 20 }, - { mode: "cooldown", start: 3610, end: 3640, cadence_s: 60 }, - { mode: "fast", start: 3640, end: 8100, cadence_s: 60 } - ], - trigger_events: [ - { trigger_id: "t1", at: 1800 }, - { trigger_id: "t3", at: 3600 }, - { trigger_id: "t2", at: 5400, count: 3 } - ], - power_history_488: [ - { at: 0, pct: 5.0 }, - { at: 5400, pct: 4.0 }, - { at: 5460, pct: 3.0 }, - { at: 8100, pct: 3.0 } - ], - // Future projection at current cadence (60s, fast). Hatching not - // deterministic so projected_end_s is null — render fades to ∞. 
- projected_cadence_s: 60, - projected_end_s: null - }, - { - id: "E2", role: "test", color: "#ff66cc", icon: "★", - dose_used_ms: 6500, dose_budget_ms: 50000, - tp_acquired: 13, - stop_condition: "hatching+3 OR 24h duration", - stop_kind: "bounded", - laser_488_pct_now: 5.0, - phases: [ - { mode: "base", start: 0, end: 8100, cadence_s: 120 } - ], - trigger_events: [], - power_history_488: [ - { at: 0, pct: 5.0 }, - { at: 8100, pct: 5.0 } - ], - projected_cadence_s: 120, - projected_end_s: null - }, - { - id: "E3", role: "test", color: "#ff66cc", icon: "★", - dose_used_ms: 38000, dose_budget_ms: 50000, - tp_acquired: 76, - stop_condition: "manual", - stop_kind: "open_ended", - laser_488_pct_now: 5.0, - phases: [ - { mode: "base", start: 0, end: 8100, cadence_s: 120 } - ], - trigger_events: [], - power_history_488: [ - { at: 0, pct: 5.0 }, - { at: 8100, pct: 5.0 } - ], - // Projected dose-exhaust horizon = 4.0h from now (warning condition) - projected_cadence_s: 120, - projected_end_s: null, - dose_exhaust_at_s: 12000 // budget will run out at this elapsed time - }, - { - id: "C1", role: "calibration", color: "#22d3ee", icon: "◆", - dose_used_ms: 33500, dose_budget_ms: 500000, // 10× multiplier - tp_acquired: 67, - stop_condition: "manual", - stop_kind: "open_ended", - laser_488_pct_now: 5.0, - phases: [ - { mode: "base", start: 0, end: 8100, cadence_s: 120 } - ], - trigger_events: [], - power_history_488: [ - { at: 0, pct: 5.0 }, - { at: 8100, pct: 5.0 } - ], - projected_cadence_s: 120, - projected_end_s: null - } - ] -}; const ExperimentOverview = { initialized: false, @@ -164,23 +29,19 @@ const ExperimentOverview = { cache: 'no-store' }); if (!resp.ok) { - console.warn( - '[ExperimentOverview] strategy fetch returned', - resp.status, '- falling back to stub' - ); + // No active experiment / not ready yet — show the empty state, + // never stubbed data. 
+ console.warn('[ExperimentOverview] strategy fetch returned', resp.status); this.isLive = false; - return STUB_STRATEGY; + return null; } const data = await resp.json(); this.isLive = true; return data; } catch (e) { - console.warn( - '[ExperimentOverview] strategy fetch error - falling back to stub:', - e - ); + console.warn('[ExperimentOverview] strategy fetch error:', e); this.isLive = false; - return STUB_STRATEGY; + return null; } }, @@ -193,7 +54,7 @@ const ExperimentOverview = { }); // Re-render against the last fetched strategy (no re-fetch on tab // switch — refresh happens on tab activation in the bootstrap). - this.render(this.activeStrategy || STUB_STRATEGY); + this.render(this.activeStrategy); }, render(s) { @@ -204,6 +65,12 @@ const ExperimentOverview = { } // Tear down any prior ticker before we blow away the SVG it pointed at. this._stopNowTicker(); + if (!s) { + // No active experiment — a calm empty state, never stubbed data. + root.innerHTML = '
' + + 'No active experiment — the imaging tactics (cadence, reactive rules) will appear here once a run is live.
'; + return; + } try { root.innerHTML = ''; if (this.activeView === 'rules') { @@ -301,7 +168,6 @@ const ExperimentOverview = { const metaRow = el('div', 'expov-header-row expov-header-row-meta'); metaRow.appendChild(elText('span', 'expov-session-name', s.session_name)); metaRow.appendChild(elText('span', 'expov-session-id', s.session_id)); - metaRow.appendChild(elText('span', 'expov-mockup-badge', 'mockup · stubbed data')); header.appendChild(metaRow); root.appendChild(header); @@ -330,11 +196,7 @@ const ExperimentOverview = { if (s.session_name && s.session_name !== s.session_id) { metaRow.appendChild(elText('span', 'expov-session-name', s.session_name)); } - if (this.isLive) { - metaRow.appendChild(elText('span', 'expov-live-badge', 'live')); - } else { - metaRow.appendChild(elText('span', 'expov-mockup-badge', 'mockup · stubbed data')); - } + metaRow.appendChild(elText('span', 'expov-live-badge', 'live')); wrap.appendChild(metaRow); // Compact key-metric strip diff --git a/gently/ui/web/static/js/home.js b/gently/ui/web/static/js/home.js index 089d7de3..bf5cad17 100644 --- a/gently/ui/web/static/js/home.js +++ b/gently/ui/web/static/js/home.js @@ -136,7 +136,13 @@ const HomeApp = (() => { function updateStatus() { const el = document.getElementById('home-status'); if (!el) return; - const connected = (typeof state !== 'undefined' && state.connected); + // Read the shared ConnectionStatus store, not a one-shot snapshot of + // state.connected — the latter was read once at tab init (before the + // /ws handshake) and never corrected, showing "Offline" while the + // header pill said "Online". + const connected = (typeof ConnectionStatus !== 'undefined') + ? ConnectionStatus.get().gentlyConnected + : (typeof state !== 'undefined' && state.connected); const n = (typeof state !== 'undefined' && Array.isArray(state.embryos)) ? state.embryos.length : 0; el.textContent = connected ? `Connected · ${n} embryo${n === 1 ? 
'' : 's'} in view` @@ -162,6 +168,12 @@ const HomeApp = (() => { if (AgentChat.runCommand) setTimeout(() => AgentChat.runCommand('/wizard'), 250); } }); + // Re-render the status line on every connection change. subscribe() + // replays the current snapshot immediately, so a late init still + // renders correct state. Registered once (inside the _inited guard). + if (typeof ConnectionStatus !== 'undefined') { + ConnectionStatus.subscribe(() => updateStatus()); + } } refresh(); // re-fetch on every entry to the tab } diff --git a/gently/ui/web/static/js/landing.js b/gently/ui/web/static/js/landing.js new file mode 100644 index 00000000..6d058ebe --- /dev/null +++ b/gently/ui/web/static/js/landing.js @@ -0,0 +1,876 @@ +/** + * V2Landing (ux_v2): the agent-first welcome AND the in-place plan wizard. + * + * Clicking "Plan an experiment" switches the landing to a plan screen, enters + * plan mode, and renders the agent's work IN THE WIZARD — not the chat REPL: + * - the agent's reasoning + tool calls render as a tidy, claude.ai-style + * collapsible activity feed (#v2-plan-activity), fed by the AGENT_ACTIVITY + * event that agent-chat.js mirrors off the /ws/agent stream; + * - the agent's ask_user_choice questions render as button cards + * (#v2-plan-ask) via AgentChat.buildAskCard; + * - "THE PLAN" panel (#v2-plan-summary) mirrors the REAL plan (phases→tasks) + * fetched from /api/campaigns once a turn settles. + * Chat is the last resort (the escape pill / "Open conversation"). + * + * No-ops unless #v2-landing is present (flag off → v1 untouched, overlay absent). 
+ */ +const V2Landing = (() => { + let el = null; + let current = null; // the ask currently in #v2-plan-ask + let feedTextEl = null; // current accumulating prose paragraph in the feed + let feedThinkingEl = null; // current accumulating reasoning (thinking) block + let runningTools = {}; // tool name -> stack of running card elements + let feedHadContent = false; // did this turn surface anything in the feed? + let capturedCampaignId = null; // best-effort id scraped from tool results + let planProposed = false; // propose_plan ran → plan is ready to commit + + const $ = (id) => document.getElementById(id); + + function greet() { + const g = $('v2-landing-greeting'); + if (!g) return; + const h = new Date().getHours(); + const t = h < 5 ? 'Still here.' : h < 12 ? 'Good morning.' + : h < 18 ? 'Good afternoon.' : 'Good evening.'; + g.innerHTML = t + '
What are we doing today?'; + } + + function setScreen(name) { if (el) el.dataset.screen = name; } + function planActive() { + return !!el && el.dataset.screen === 'plan' && !el.classList.contains('dismissed') + && el.style.display !== 'none'; + } + + function dismiss() { + if (!el || el.classList.contains('dismissed')) return; + el.classList.add('dismissed'); + let done = false; + const finish = () => { + if (done) return; + done = true; + el.style.display = 'none'; + el.setAttribute('aria-hidden', 'true'); + }; + el.addEventListener('transitionend', finish, { once: true }); + setTimeout(finish, 650); + } + + // ── status / error helpers ──────────────────────────────────────── + function setThinkingLabel(text) { + const l = document.querySelector('#v2-plan-thinking .v2-plan-thinking-label'); + if (l && text) l.textContent = text; + } + // Elapsed-time counter so a long think reads as progress, not a hang. Starts + // when the thinking indicator first shows and runs until it's hidden (turn end). + let _thinkTimer = null; + let _thinkStart = 0; + function _thinkTick() { + const t = $('v2-plan-thinking'); + if (!t) return; + let el = t.querySelector('.v2-plan-elapsed'); + if (!el) { + el = document.createElement('span'); + el.className = 'v2-plan-elapsed'; + el.style.cssText = 'margin-left:6px;opacity:.55;font-variant-numeric:tabular-nums;'; + t.appendChild(el); + } + const s = Math.round((Date.now() - _thinkStart) / 1000); + el.textContent = s > 0 ? 
s + 's' : ''; + } + function showThinking(on, label) { + const t = $('v2-plan-thinking'); + if (t) t.classList.toggle('hidden', !on); + if (on && label) setThinkingLabel(label); + if (on) { + if (!_thinkTimer) { + _thinkStart = Date.now(); + _thinkTick(); + _thinkTimer = setInterval(_thinkTick, 1000); + } + } else if (_thinkTimer) { + clearInterval(_thinkTimer); + _thinkTimer = null; + const el = t && t.querySelector('.v2-plan-elapsed'); + if (el) el.textContent = ''; + } + } + // Human-readable "what's happening right now" from a tool activity event, + // so the status line names the live operation instead of a static string. + function prettyTool(act) { + const raw = (act && (act.label || act.name)) || 'the next step'; + const s = String(raw).replace(/_/g, ' ').trim(); + return s.charAt(0).toUpperCase() + s.slice(1) + '…'; + } + function errorVisible() { const e = $('v2-plan-error'); return !!e && !e.classList.contains('hidden'); } + function showPlanError(msg) { + const e = $('v2-plan-error'); if (!e) return; + e.textContent = msg; e.classList.remove('hidden'); + showThinking(false); + } + function hidePlanError() { const e = $('v2-plan-error'); if (e) e.classList.add('hidden'); } + + function clearAsk() { const m = $('v2-plan-ask'); if (m) m.innerHTML = ''; } + function resetSummary() { + const list = $('v2-plan-summary'); + if (list) list.innerHTML = '
The plan will take shape here as Gently designs it.
'; + planPage = 0; planPages = []; planTitleText = ''; + } + + // ── activity feed: paginated, ONE agent step (turn) per page ─────── + // Instead of one ever-growing scroll, each agent turn — its reasoning + + // the tool calls it made — is a page you flip through with ‹ Prev / Next ›. + // The current question stays pinned below the feed (#v2-plan-ask). A new + // turn auto-advances to its page; you can flip back to review earlier steps. + let feedPages = []; // .v2-act-page elements, one per turn + let feedPage = 0; // index currently shown + let curPageEl = null; // page receiving this turn's content + let pendingNewPage = false; // a turn started; open a fresh page on first content + + function feedEl() { return $('v2-plan-activity'); } + function feedPagesWrap() { return feedEl()?.querySelector('.v2-feed-pages'); } + function clearActivity() { + const f = feedEl(); + if (f) { + f.innerHTML = + '' + + '
' + + ''; + } + feedPages = []; feedPage = 0; curPageEl = null; pendingNewPage = false; + feedTextEl = null; feedThinkingEl = null; runningTools = {}; feedHadContent = false; + capturedCampaignId = null; planProposed = false; + clearPlanReady(); + hidePlanError(); + } + function newFeedPage() { + const wrap = feedPagesWrap(); if (!wrap) return null; + const page = document.createElement('div'); + page.className = 'v2-act-page'; + wrap.appendChild(page); + feedPages.push(page); + curPageEl = page; + feedPage = feedPages.length - 1; // auto-advance to the live step + feedTextEl = null; + drawFeedPager(); + return page; + } + // Where this turn's prose/tool cards land. Opens a fresh page the first time + // content arrives after a turn_start (so content-less command turns don't + // leave empty pages), and lazily on the very first content. + function feedTarget() { + if (pendingNewPage || !curPageEl) { newFeedPage(); pendingNewPage = false; } + return curPageEl; + } + function viewingLatest() { return feedPage >= feedPages.length - 1; } + function drawFeedPager() { + const f = feedEl(); if (!f) return; + const n = feedPages.length; + const i = Math.min(Math.max(feedPage, 0), Math.max(n - 1, 0)); + feedPages.forEach((p, idx) => p.classList.toggle('active', idx === i)); + const bar = f.querySelector('.v2-feed-pager-bar'); + const dots = f.querySelector('.v2-feed-dots'); + if (!bar || !dots) return; + if (n <= 1) { bar.hidden = true; dots.hidden = true; return; } + bar.hidden = false; dots.hidden = false; + bar.innerHTML = ''; + const mkBtn = (txt, disabled, fn) => { + const b = document.createElement('button'); + b.className = 'v2-plan-pager-btn'; b.type = 'button'; b.textContent = txt; + b.disabled = disabled; b.addEventListener('click', fn); + return b; + }; + const pos = document.createElement('span'); + pos.className = 'v2-plan-pager-pos'; pos.textContent = `Step ${i + 1} of ${n}`; + bar.append( + mkBtn('‹ Prev', i === 0, () => { if (feedPage > 0) { feedPage--; 
drawFeedPager(); } }), + pos, + mkBtn('Next ›', i === n - 1, () => { if (feedPage < n - 1) { feedPage++; drawFeedPager(); } }), + ); + dots.innerHTML = ''; + for (let d = 0; d < n; d++) { + const dot = document.createElement('button'); + dot.className = 'v2-plan-dot' + (d === i ? ' active' : ''); + dot.type = 'button'; + dot.setAttribute('aria-label', `Step ${d + 1} of ${n}`); + dot.addEventListener('click', () => { feedPage = d; drawFeedPager(); }); + dots.appendChild(dot); + } + } + function scrollFeedIfNearBottom() { + if (!viewingLatest()) return; // don't yank the user off an earlier step + const m = document.querySelector('.v2-screen-plan .v2-plan-main'); + if (m && (m.scrollHeight - m.scrollTop - m.clientHeight) < 140) m.scrollTop = m.scrollHeight; + } + function clearFallback() { feedEl()?.querySelectorAll('.v2-plan-fallback').forEach(n => n.remove()); } + + // Render the agent's prose like the chat does (reuses AgentChat.mdToHtml — + // escapes then renders bold/italic/code/line-breaks), so the feed isn't raw + // markdown. Falls back to escaped text if the helper isn't available. + function renderMd(s) { + if (typeof AgentChat !== 'undefined' && AgentChat.mdToHtml) return AgentChat.mdToHtml(s); + const esc = (typeof escapeHtml === 'function') ? escapeHtml(String(s)) : String(s); + return esc.replace(/\n/g, '
'); + } + + // Plan-writing tools → refresh THE PLAN panel during the turn (debounced), + // not only at turn_end (ask_user_choice pauses the turn before it ends). + const PLAN_TOOLS = new Set([ + 'create_campaign', 'create_plan_item', 'link_plan_items', 'update_plan_item', + 'delete_plan_item', 'propose_plan', 'get_plan_status', 'validate_plan', + ]); + let planRefreshTimer = null; + function schedulePlanRefresh() { + if (planRefreshTimer) clearTimeout(planRefreshTimer); + planRefreshTimer = setTimeout(() => { planRefreshTimer = null; refreshPlanPanel(); }, 600); + } + + function safeStringify(v) { + try { + const s = (typeof v === 'string') ? v : JSON.stringify(v, null, 2); + return s.length > 4000 ? s.slice(0, 4000) + '\n…' : s; + } catch (e) { return String(v); } + } + function fillToolBody(body, act) { + body.innerHTML = ''; + // grid-template-rows reveal (landing.css) needs ONE collapsible child — + // append blocks into a single inner wrapper, not directly onto body. + const inner = document.createElement('div'); + body.appendChild(inner); + const inputStr = (act.input != null) ? safeStringify(act.input) : ''; + const full = act.full || act.summary || ''; + const block = (label, text) => { + const l = document.createElement('div'); l.className = 'v2-act-block-label'; l.textContent = label; + const b = document.createElement('pre'); b.className = 'v2-act-block'; b.textContent = text; + inner.append(l, b); + }; + if (inputStr) block('input', inputStr); + if (full) block('result', full); + if (!inputStr && !full) { + const e = document.createElement('div'); e.className = 'v2-act-block-label'; e.textContent = 'no details'; + inner.append(e); + } + } + function buildToolCard(act, done) { + const card = document.createElement('div'); + card.className = 'v2-act-tool' + (done ? (act.is_error ? 
' done err' : ' done') : ''); + const head = document.createElement('button'); + head.className = 'v2-act-tool-head'; head.type = 'button'; + head.setAttribute('aria-expanded', 'false'); + const ic = document.createElement('span'); ic.className = 'v2-act-ic'; + ic.innerHTML = done ? (act.is_error ? '⚠' : '✓') : ''; + const label = document.createElement('span'); label.className = 'v2-act-label'; + label.textContent = act.label || act.name || 'tool'; + const sum = document.createElement('span'); sum.className = 'v2-act-summary'; + sum.textContent = done ? (act.summary || '') : ''; + const chev = document.createElement('span'); chev.className = 'v2-act-chev'; chev.textContent = '›'; + head.append(ic, label, sum, chev); + const body = document.createElement('div'); body.className = 'v2-act-tool-body'; + fillToolBody(body, act); + head.addEventListener('click', () => { + const open = card.classList.toggle('open'); + head.setAttribute('aria-expanded', open ? 'true' : 'false'); + }); + card.append(head, body); + return card; + } + function updateToolCard(card, act) { + card.classList.add('done'); + if (act.is_error) card.classList.add('err'); + const ic = card.querySelector('.v2-act-ic'); if (ic) ic.textContent = act.is_error ? 
'⚠' : '✓'; + const sum = card.querySelector('.v2-act-summary'); if (sum) sum.textContent = act.summary || ''; + const body = card.querySelector('.v2-act-tool-body'); if (body) fillToolBody(body, act); + } + function captureCampaignId(text) { + if (!text) return; + const s = String(text); + const m = s.match(/campaign_id[=:\s]+([0-9a-f]{6,})/i) || s.match(/\(id:\s*([0-9a-f]{6,})/i); + if (m) capturedCampaignId = m[1]; + } + + function applyActivity(act) { + if (!planActive() || !act) return; + const f = feedEl(); if (!f) return; + switch (act.kind) { + case 'turn_start': + feedTextEl = null; feedThinkingEl = null; pendingNewPage = true; hidePlanError(); clearFallback(); + clearPlanReady(); // new work in flight — drop any "ready" state + showThinking(true, 'reviewing your campaign and plan…'); + break; + case 'thinking': { + // Stream the model's reasoning summary live into the feed as a dim + // block, so the wait shows what the agent is actually considering. + showThinking(true); + const chunk = act.text || ''; + if (!chunk) { setThinkingLabel('thinking through the next step…'); break; } + if (!feedThinkingEl) { + feedThinkingEl = document.createElement('div'); + feedThinkingEl.className = 'v2-act-think'; + feedThinkingEl.style.cssText = + 'font-style:italic;opacity:.7;white-space:pre-wrap;margin:2px 0 8px;font-size:12.5px;line-height:1.5;'; + feedThinkingEl._raw = ''; + feedTarget().appendChild(feedThinkingEl); + } + feedThinkingEl._raw += chunk; + feedThinkingEl.textContent = feedThinkingEl._raw; + feedHadContent = true; + setThinkingLabel('reasoning…'); + scrollFeedIfNearBottom(); + break; + } + case 'text': { + const chunk = act.text || ''; + if (!chunk) break; + // The reasoning that immediately precedes the spoken answer is + // wrap-up meta ("let me wrap this up concisely and offer to + // export…") — drop the block entirely so the feed keeps the + // answer, not the narration of getting there. 
Reasoning that + // precedes a TOOL is left in place (tool_start only nulls the + // pointer) as the rationale for that action. + if (feedThinkingEl) { feedThinkingEl.remove(); feedThinkingEl = null; } + if (!feedTextEl) { + feedTextEl = document.createElement('div'); + feedTextEl.className = 'v2-act-text'; + feedTextEl._raw = ''; + feedTarget().appendChild(feedTextEl); + } + feedTextEl._raw += chunk; + feedTextEl.innerHTML = renderMd(feedTextEl._raw); + feedHadContent = true; showThinking(true, 'composing the response…'); scrollFeedIfNearBottom(); + break; + } + case 'tool_start': { + // ask_user_choice IS the active question (rendered separately in + // #v2-plan-ask) — don't also show it as a feed card. + if (act.name === 'ask_user_choice') break; + feedTextEl = null; feedThinkingEl = null; + const card = buildToolCard(act, false); + feedTarget().appendChild(card); + (runningTools[act.name] = runningTools[act.name] || []).push(card); + feedHadContent = true; showThinking(true, prettyTool(act)); scrollFeedIfNearBottom(); + break; + } + case 'tool_result': { + captureCampaignId(act.summary); + captureCampaignId(act.full); + if (PLAN_TOOLS.has(act.name)) schedulePlanRefresh(); + if (act.name === 'propose_plan' && !act.is_error) planProposed = true; + if (act.name === 'ask_user_choice') break; + feedTextEl = null; feedThinkingEl = null; + const arr = runningTools[act.name] || []; + const card = arr.pop(); + if (card) updateToolCard(card, act); + else feedTarget().appendChild(buildToolCard(act, true)); + feedHadContent = true; setThinkingLabel('working through the next step…'); scrollFeedIfNearBottom(); + break; + } + case 'turn_end': + showThinking(false); feedTextEl = null; feedThinkingEl = null; + refreshPlanPanel(); + if (!current && !feedHadContent) showFallback(); + // Plan proposed and the agent has settled (no pending question) → + // the design is done. Surface a clear "ready" state instead of + // leaving the user parked on the last wizard step. 
+ if (planProposed && !current) showPlanReady(); + break; + case 'turn_error': + showPlanError(act.error || 'Something went wrong — open the conversation for detail.'); + break; + } + } + + function showFallback() { + const f = feedEl(); if (!f || f.querySelector('.v2-plan-fallback')) return; + const d = document.createElement('div'); + d.className = 'v2-plan-fallback'; + d.innerHTML = 'Gently replied in prose — open the conversation to read it.'; + d.querySelector('a').addEventListener('click', openChat); + feedTarget().appendChild(d); + } + + // ── plan-ready state ─────────────────────────────────────────────── + // Once the agent has proposed the plan and gone quiet, the wizard is done. + // Mark the screen "ready": rename the header, count phases/items from the + // panel, and promote "open workspace" to the obvious primary action — so the + // finish line is signposted instead of looking like one more wizard step. + function planCounts() { + let phases = 0, items = 0; + planPages.forEach(p => { + if (p.name !== 'Tasks') phases++; + items += (p.items || []).length; + }); + return { phases, items }; + } + function showPlanReady() { + const sec = document.querySelector('.v2-screen-plan'); + if (!sec) return; + sec.classList.add('ready'); + showThinking(false); + const who = sec.querySelector('.v2-plan-who'); + const title = sec.querySelector('.v2-plan-title'); + if (who) who.textContent = 'Gently · plan ready'; + if (title) { + const { phases, items } = planCounts(); + title.textContent = items + ? `Your plan is ready — ${items} item${items === 1 ? '' : 's'} across ${phases} phase${phases === 1 ? 
'' : 's'}` + : 'Your plan is ready'; + } + const cont = $('v2-plan-continue'); + if (cont) cont.textContent = 'Open the workspace ›'; + const exp = $('v2-plan-export'); + if (exp) exp.hidden = false; // the plan is final → offer the download + } + function clearPlanReady() { + const sec = document.querySelector('.v2-screen-plan'); + if (!sec || !sec.classList.contains('ready')) return; + sec.classList.remove('ready'); + const who = sec.querySelector('.v2-plan-who'); + const title = sec.querySelector('.v2-plan-title'); + if (who) who.textContent = 'Gently · planning'; + if (title) title.textContent = "Let's design your run"; + const cont = $('v2-plan-continue'); + if (cont) cont.textContent = 'Continue in workspace ›'; + const exp = $('v2-plan-export'); + if (exp) exp.hidden = true; + } + + // ── export the finished plan as a shareable markdown doc ──────────── + // Replaces the agent's end-of-plan "want me to export this?" prose with a + // real action: pull the enriched plan tree (/export) and render it to + // markdown client-side so the biologist can drop it in a doc or share it. + function specToLines(spec) { + let s = spec; + if (typeof s === 'string') { try { s = JSON.parse(s); } catch { return s ? ['- ' + s] : []; } } + if (!s || typeof s !== 'object') return []; + const out = []; + const fmt = (v) => Array.isArray(v) ? v.join(', ') : (typeof v === 'object' ? JSON.stringify(v) : String(v)); + const pick = (k, label) => { if (s[k] != null && s[k] !== '') out.push(`- **${label}:** ${fmt(s[k])}`); }; + pick('strain', 'Strain'); pick('goal', 'Goal'); + if (Array.isArray(s.channels) && s.channels.length) { + out.push('- **Channels:** ' + s.channels.map(c => `${c.name || '?'} (${c.excitation_nm || '?'} nm${c.exposure_ms ? 
`, ${c.exposure_ms} ms` : ''})`).join(', ')); + } + pick('num_slices', 'Slices'); pick('interval_s', 'Interval (s)'); pick('temperature_c', 'Temperature (°C)'); + pick('num_embryos', 'Embryos'); pick('start_stage', 'Start stage'); pick('stop_condition', 'Stop condition'); + pick('criteria', 'Criteria'); pick('success_criteria', 'Success criteria'); + return out; + } + function buildPlanMarkdown(tree) { + const L = []; + L.push(`# ${tree.description || tree.shorthand || 'Experimental plan'}`, ''); + if (tree.target) L.push(`**Goal:** ${tree.target}`, ''); + if (tree.shorthand) L.push(`**Plan ID:** \`${tree.shorthand}\``, ''); + L.push(`_Exported from Gently — ${new Date().toLocaleString()}_`, ''); + const renderItems = (items, prefix) => { + (items || []).slice().sort((a, b) => (a.phase_order || 0) - (b.phase_order || 0)).forEach((it, idx) => { + L.push(`### ${prefix}${idx + 1} ${it.title || '(task)'} \`${it.type || 'task'}\``, ''); + if (it.description) L.push(it.description, ''); + const sl = specToLines(it.spec); + if (sl.length) L.push(...sl, ''); + const refs = it.references || []; + if (refs.length) { + L.push('**References:**'); + refs.forEach((r, i) => L.push(`${i + 1}. ${r.citation || r.id || ''}${r.source ? 
` _(${r.source})_` : ''}`)); + L.push(''); + } + }); + }; + if ((tree.items || []).length) { L.push('## Tasks', ''); renderItems(tree.items, ''); } + (tree.children || []).forEach((ph, pi) => { + if (!ph) return; + L.push(`## ${ph.display_name || ph.description || ph.shorthand || `Phase ${pi + 1}`}`, ''); + if (ph.target) L.push(ph.target, ''); + renderItems(ph.items, `${pi + 1}.`); + }); + return L.join('\n').replace(/\n{3,}/g, '\n\n').trimEnd() + '\n'; + } + async function resolveCampaignId() { + if (capturedCampaignId) return capturedCampaignId; + try { + const r = await fetch('/api/campaigns'); + if (r.ok) { const d = await r.json(); const t = (d.campaigns || [])[0]; return (t && t.campaign && t.campaign.id) || null; } + } catch (e) { /* offline */ } + return null; + } + async function exportPlan() { + const btn = $('v2-plan-export'); + const id = await resolveCampaignId(); + if (!id) { showPlanError('No plan to export yet.'); return; } + if (btn) { btn.disabled = true; btn.textContent = '↓ Exporting…'; } + try { + const r = await fetch(`/api/campaigns/${encodeURIComponent(id)}/export`); + if (!r.ok) throw new Error(`export ${r.status}`); + const tree = await r.json(); + const md = buildPlanMarkdown(tree); + const blob = new Blob([md], { type: 'text/markdown' }); + const url = URL.createObjectURL(blob); + const a = document.createElement('a'); + a.href = url; + a.download = `${(tree.shorthand || 'plan').replace(/[^\w.-]+/g, '_')}.md`; + document.body.appendChild(a); a.click(); a.remove(); + setTimeout(() => URL.revokeObjectURL(url), 1000); + } catch (e) { + showPlanError('Could not export the plan — open the conversation to export it manually.'); + } finally { + if (btn) { btn.disabled = false; btn.textContent = '↓ Export plan'; } + } + } + + // ── THE PLAN panel: mirror the real campaign tree ────────────────── + async function refreshPlanPanel() { + try { + let tree = null; + if (capturedCampaignId) { + const r = await 
fetch(`/api/campaigns/${encodeURIComponent(capturedCampaignId)}/tree`); + if (r.ok) tree = await r.json(); + } + if (!tree) { + const r = await fetch('/api/campaigns'); + if (r.ok) { const d = await r.json(); tree = (d.campaigns || [])[0] || null; } + } + if (tree) renderPlanTree(tree); + } catch (e) { /* keep whatever is shown */ } + } + function planName(c) { + c = c || {}; + return c.shorthand || c.display_name || c.description || 'Plan'; + } + // Phases read better by their human name ("Phase 1 — Reporter validation") + // than by their code shorthand ("nrp-p1"), which looks like a machine id. + function phaseName(c) { + c = c || {}; + return c.display_name || c.description || c.shorthand || 'Phase'; + } + // THE PLAN renders as a pager — one phase per page with ‹ Prev / Next ›, + // a position label, and dots — instead of one long scroll. planPage is held + // across re-renders (the panel refetches on every plan-writing tool) so the + // page you're reading doesn't snap back to the start mid-design. 
+ let planPage = 0; + let planPages = []; // [{ name, items }] + let planTitleText = ''; + + function renderPlanTree(tree) { + if (!tree) return; + const phases = tree.children || []; + const rootItems = tree.items || []; + if (!phases.length && !rootItems.length) return; // nothing to show yet — keep placeholder + const pages = []; + if (rootItems.length) pages.push({ name: 'Tasks', items: rootItems }); + phases.forEach(phase => { + if (!phase) return; + pages.push({ name: phaseName(phase.campaign), items: phase.items || [] }); + }); + planPages = pages; + planTitleText = planName(tree.campaign); + if (planPage >= pages.length) planPage = pages.length - 1; + if (planPage < 0) planPage = 0; + drawPlanPage(); + } + + function drawPlanPage() { + const list = $('v2-plan-summary'); + if (!list) return; + const pages = planPages; + const n = pages.length; + if (!n) return; + const i = Math.min(Math.max(planPage, 0), n - 1); + const page = pages[i]; + + list.innerHTML = ''; + const title = document.createElement('div'); + title.className = 'v2-plan-title-row'; + title.textContent = planTitleText; + list.appendChild(title); + + if (n > 1) { + const bar = document.createElement('div'); + bar.className = 'v2-plan-pager'; + const prev = document.createElement('button'); + prev.className = 'v2-plan-pager-btn'; prev.type = 'button'; prev.textContent = '‹ Prev'; + prev.disabled = i === 0; + prev.addEventListener('click', () => { if (planPage > 0) { planPage--; drawPlanPage(); } }); + const pos = document.createElement('span'); + pos.className = 'v2-plan-pager-pos'; + pos.textContent = page.name; // position shown by the dots below + pos.title = `${page.name} · ${i + 1} of ${n}`; + const next = document.createElement('button'); + next.className = 'v2-plan-pager-btn'; next.type = 'button'; next.textContent = 'Next ›'; + next.disabled = i === n - 1; + next.addEventListener('click', () => { if (planPage < n - 1) { planPage++; drawPlanPage(); } }); + bar.append(prev, pos, next); + 
list.appendChild(bar); + } else { + const h = document.createElement('div'); + h.className = 'v2-plan-phase-h'; + h.textContent = page.name; + list.appendChild(h); + } + + const tasks = document.createElement('div'); + tasks.className = 'v2-plan-phase'; + const items = page.items || []; + // phase ordinal (1-based) for "P.I" numbering; the rootItems "Tasks" page + // isn't a phase, so it numbers items bare (1, 2, …). + const phaseOrd = pages.slice(0, i + 1).filter(p => p.name !== 'Tasks').length; + if (items.length) { + items.forEach((it, idx) => { + const type = String(it.type || '').toLowerCase(); + const t = document.createElement('div'); + t.className = 'v2-plan-task type-' + (type || 'other'); + const num = document.createElement('span'); + num.className = 'v2-task-num'; + num.textContent = phaseOrd ? `${phaseOrd}.${idx + 1}` : `${idx + 1}`; + const dot = document.createElement('span'); + dot.className = 'v2-task-dot'; + dot.title = type || 'task'; + const ttl = document.createElement('span'); + ttl.className = 'v2-task-ttl'; + ttl.textContent = it.title || it.shorthand || '(task)'; + t.append(num, dot, ttl); + if (it.estimated_days) { + const d = document.createElement('span'); + d.className = 'v2-task-days'; + d.textContent = `${it.estimated_days}d`; + t.append(d); + } + tasks.appendChild(t); + }); + } else { + const e = document.createElement('div'); + e.className = 'v2-plan-task v2-plan-task-empty'; + e.textContent = 'no items in this phase yet'; + tasks.appendChild(e); + } + list.appendChild(tasks); + + if (n > 1) { + const dots = document.createElement('div'); + dots.className = 'v2-plan-dots'; + for (let d = 0; d < n; d++) { + const dot = document.createElement('button'); + dot.className = 'v2-plan-dot' + (d === i ? 
' active' : ''); + dot.type = 'button'; + dot.setAttribute('aria-label', `Go to ${pages[d].name} (${d + 1} of ${n})`); + dot.addEventListener('click', () => { planPage = d; drawPlanPage(); }); + dots.appendChild(dot); + } + list.appendChild(dots); + } + } + + // ── ask rendering (the active question) ──────────────────────────── + function labelFor(data, sel) { + const opts = (data && data.options) || []; + const one = (s) => { + const o = opts.find(o => o && (o.id === s || o.value === s || o.label === s)); + return o ? o.label : String(s); + }; + return Array.isArray(sel) ? sel.map(one).join(', ') : one(sel); + } + function recordPick(data, sel) { + const list = $('v2-plan-summary'); + if (!list) return; + const empty = list.querySelector('.v2-plan-side-empty'); + if (empty) empty.remove(); + const matched = (data && data.options || []).some(o => o && (o.id === sel || o.value === sel || o.label === sel)); + const row = document.createElement('div'); + row.className = 'v2-plan-row' + (matched ? '' : ' v2-plan-row-freetext'); + row.innerHTML = ''; + row.querySelector('.k').textContent = (data && data.question) || 'Choice'; + row.querySelector('.v').textContent = labelFor(data, sel); + list.appendChild(row); + } + function renderAsk() { + const mount = $('v2-plan-ask'); + if (!mount || !current || typeof AgentChat === 'undefined' || !AgentChat.buildAskCard) return; + showThinking(false); hidePlanError(); clearFallback(); + const data = current.data, reqId = current.reqId; + const hasControl = AgentChat.hasControl ? 
AgentChat.hasControl() : true; + const card = AgentChat.buildAskCard(data, { + reqId, isWake: current.isWake, hasControl, + onPick: (sel) => { + recordPick(data, sel); + AgentChat.answerChoice(reqId, sel); + current = null; clearAsk(); showThinking(true); + }, + }); + mount.innerHTML = ''; + mount.appendChild(card); + const first = mount.querySelector('button:not([disabled])'); + if (first) setTimeout(() => first.focus(), 30); + } + + let planKickedOff = false; // guard: design-kickoff fires once per session + async function startPlan() { + setScreen('plan'); + // Re-entering the wizard (Back → Plan again) must NOT re-fire the + // kickoff — that stacked duplicate "/plan" + design turns. Just show + // the wizard with its existing state. + if (planKickedOff) return; + planKickedOff = true; + resetSummary(); clearAsk(); clearActivity(); + current = null; + showThinking(true); + // Campaigns are persistent agent memory (not session state), so the + // agent always builds on an existing one — which leaves a user wanting a + // fresh plan stuck. So if an active campaign exists, ask up front: + // continue it (the default) or start a brand-new one. With NO campaign + // there's nothing to continue, so skip the gate and design straight away + // (that path is fresh anyway). + let campaign = null; + try { + const r = await fetch('/api/campaigns'); + if (r.ok) { const d = await r.json(); campaign = (d.campaigns || [])[0] || null; } + } catch (e) { /* offline / no API — just design */ } + if (campaign) renderCampaignChoice(campaign); + else kickoffDesign('continue'); + } + + // Enter plan mode, then prompt design. The prompt differs by intent: build + // on the active campaign, or set it aside and create a new one. A free-typed + // answer from the choice card becomes the design brief directly. 
+ function kickoffDesign(mode) { + showThinking(true); + if (typeof AgentChat === 'undefined' || !AgentChat.runCommand) return; + AgentChat.runCommand('/plan'); + if (mode === 'fresh') { + AgentChat.runCommand( + "I want to start a brand-new experiment, not continue any existing " + + "campaign. Create a new campaign and let's design it from scratch — " + + "what should we capture?" + ); + } else if (mode === 'continue') { + AgentChat.runCommand("Let's design this run — what should it capture?"); + } else { + // free text from the choice card's "Something else…" escape + AgentChat.runCommand(String(mode)); + } + } + + // Continue-vs-fresh gate, shown only when an active campaign exists. Reuses + // the agent ask-card styling so it's visually identical to the agent's own + // questions; picking routes into kickoffDesign rather than the agent bridge. + function renderCampaignChoice(tree) { + const mount = $('v2-plan-ask'); + if (!mount || typeof AgentChat === 'undefined' || !AgentChat.buildAskCard) { + kickoffDesign('continue'); + return; + } + showThinking(false); hidePlanError(); clearFallback(); + const name = planName((tree && tree.campaign) || {}); + const data = { + question: `You have an active campaign — **${name}**. Design the next run inside it, or start something new?`, + options: [ + { id: 'continue', label: `Continue ${name}`, description: 'Design the next run inside your existing campaign' }, + { id: 'fresh', label: 'Start a brand-new campaign', description: 'Set the existing plan aside and plan from scratch' }, + ], + }; + const hasControl = AgentChat.hasControl ? 
AgentChat.hasControl() : true; + const card = AgentChat.buildAskCard(data, { + reqId: 'landing-campaign-choice', isWake: false, hasControl, + onPick: (sel) => { clearAsk(); kickoffDesign(sel); }, + }); + mount.innerHTML = ''; + mount.appendChild(card); + const first = mount.querySelector('button:not([disabled])'); + if (first) setTimeout(() => first.focus(), 30); + } + + function openScope() { + dismiss(); + if (typeof switchTab === 'function') switchTab('devices'); + } + function openChat() { + dismiss(); + if (typeof AgentChat !== 'undefined' && AgentChat.togglePanel) { + setTimeout(() => AgentChat.togglePanel(true), 300); + } + } + function sendFreeform(text) { + const v = (text || '').trim(); + dismiss(); + if (typeof AgentChat !== 'undefined' && AgentChat.togglePanel) { + AgentChat.togglePanel(true); + if (v && AgentChat.runCommand) setTimeout(() => AgentChat.runCommand(v), 300); + } + } + + function init() { + el = $('v2-landing'); + if (!el || typeof ClientEventBus === 'undefined') return; // flag off → no-op + greet(); + + el.querySelectorAll('[data-landing]').forEach(btn => btn.addEventListener('click', () => { + const kind = btn.dataset.landing; + if (kind === 'plan') startPlan(); + else if (kind === 'standalone') openScope(); + })); + + const esc = $('v2-escape'), escToggle = $('v2-escape-toggle'), + escInput = $('v2-escape-input'), escSend = $('v2-escape-send'); + if (escToggle && esc && escInput) { + escToggle.addEventListener('click', () => { + const open = esc.classList.toggle('open'); + escToggle.setAttribute('aria-expanded', open ? 
'true' : 'false'); + if (open) setTimeout(() => escInput.focus(), 120); + }); + const submit = () => sendFreeform(escInput.value); + if (escSend) escSend.addEventListener('click', submit); + escInput.addEventListener('keydown', e => { + if (e.key === 'Enter') { e.preventDefault(); submit(); } + else if (e.key === 'Escape') { e.stopPropagation(); esc.classList.remove('open'); escToggle.setAttribute('aria-expanded', 'false'); } + }); + } + + const skip = $('v2-landing-skip'); + if (skip) skip.addEventListener('click', dismiss); + + const back = $('v2-plan-back'); + if (back) back.addEventListener('click', () => setScreen('welcome')); + const planChat = $('v2-plan-chat'); + if (planChat) planChat.addEventListener('click', openChat); + const cont = $('v2-plan-continue'); + if (cont) cont.addEventListener('click', dismiss); + const exp = $('v2-plan-export'); + if (exp) exp.addEventListener('click', exportPlan); + + // The agent's questions + work render in the plan stage while it's active; + // once we've receded into the workspace, AskStage (#ask-stage) takes over. + ClientEventBus.on('AGENT_ASK', ({ request_id, choice_data, origin }) => { + if (!planActive()) return; + current = { reqId: request_id, data: choice_data || {}, isWake: origin === 'wake' }; + renderAsk(); + }); + ClientEventBus.on('ASK_CLEARED', ({ request_id }) => { + if (request_id === '*' || (current && request_id === current.reqId)) { + current = null; clearAsk(); + if (planActive() && !errorVisible()) showThinking(true); + // A question was just answered — the agent's continuation is the + // next step, so open a fresh feed page for it. (A turn stays one + // stream across an ask_user_choice pause, so turn_start alone + // would lump every step of the design into a single page.) 
+ pendingNewPage = true; + } + }); + ClientEventBus.on('AGENT_CONTROL', () => { if (current && planActive()) renderAsk(); }); + ClientEventBus.on('AGENT_ACTIVITY', (act) => applyActivity(act)); + + document.addEventListener('keydown', e => { + if (e.key !== 'Escape' || !el || el.classList.contains('dismissed')) return; + if (el.dataset.screen === 'plan') setScreen('welcome'); // step back, don't bail + else dismiss(); + }); + } + + document.addEventListener('DOMContentLoaded', init); + + return { + dismiss, + show: () => { + if (!el) return; + el.style.display = ''; + el.removeAttribute('aria-hidden'); + el.classList.remove('dismissed'); + setScreen('welcome'); + greet(); + }, + }; +})(); diff --git a/gently/ui/web/static/js/occupancy3d.js b/gently/ui/web/static/js/occupancy3d.js new file mode 100644 index 00000000..b9cfb7ee --- /dev/null +++ b/gently/ui/web/static/js/occupancy3d.js @@ -0,0 +1,489 @@ +// ══════════════════════════════════════════════════════════════════════ +// 3D Optical Space — live digital-twin of the addressable imaging volume +// +// Renders the acquisition cuboid (the box of voxels being scanned) with the +// live light-sheet plane inside it, plus a Z-neighbourhood reference frame. +// An HTML overlay (mode badge + readouts + a top-down minimap) carries the +// GLOBAL context: where in the addressable XY stage range this cuboid sits, +// and the embryos around it. +// +// Why two representations: the addressable stage XY (~tens of mm), the cuboid +// footprint (~hundreds of µm) and the piezo Z range (~µm) differ by ~100x, so +// a single literal-scale 3D box would draw the cuboid invisibly small. The 3D +// scene therefore stays in one µm scale around the cuboid; the minimap (2D) +// handles the much larger stage extent. Some scales are local by design — see +// FOV_UM / the outer-frame sizing below. +// +// Data: +// DEVICE_STATE_UPDATE → live Piezo.Position (sheet Z), Galvo.A/B, XYStage, +// and the firmware box (minimap extent). 
+// SCAN_GEOMETRY_UPDATE → cuboid extents, num_slices, pencil/sheet mode. +// EMBRYOS_UPDATE → minimap markers. +// Bootstrap via /api/devices/scan_geometry + /api/embryos/current. +// +// Mirrors the DevicesManager IIFE pattern (devices.js) and reuses the +// Three.js scaffold + drag-orbit from projection-viewer.js. +// ══════════════════════════════════════════════════════════════════════ + +const Occupancy3DManager = (function () { + 'use strict'; + + // --- Tunables / approximations (v1) -------------------------------- + // Camera FOV footprint of the SPIM cuboid in µm. SPIM is 0.1625 µm/px; + // a ~2048px sCMOS ROI ≈ 333 µm. Not currently streamed, so we use a + // constant until SCAN_GEOMETRY_UPDATE carries fov_um. (Documented approx.) + const FOV_UM = 333.0; + const MAX_SLICE_LINES = 30; // cap drawn slice outlines for perf + const COLORS = { + outer: 0x33414d, + cuboid: 0x14b8c4, + cuboidFace: 0x14b8c4, + sheet: 0x39d0ff, + slice: 0x2a6f78, + beam: 0xffd166, + }; + + // --- Module state -------------------------------------------------- + let _initialized = false; + let _scene = null, _camera = null, _renderer = null, _root = null; + let _animationId = null, _resizeObserver = null, _resizeRaf = null, _onLayoutChanged = null; + let _isDragging = false, _prevMouse = { x: 0, y: 0 }; + const _rot = { x: -0.6, y: 0.6 }; + let _zoom = 1.7; + + // Live data caches + let _geom = null; // last SCAN_GEOMETRY_UPDATE.data + let _firmwareBox = null; // {x:[min,max], y:[min,max]} µm + let _stage = { x: null, y: null }; + let _piezoZ = null; // live axial position (µm) + let _galvo = { a: null, b: null }; + let _embryos = []; // [{x,y,role,id}] + let _scaler = null; + + // Scene object handles (rebuilt as geometry changes) + let _outerBox = null, _cuboid = null, _cuboidEdges = null; + let _sheet = null, _beam = null, _sliceGroup = null; + + // DOM + let _container = null, _modeEl = null, _readoutsEl = null, _minimapEl = null, _demoBtn = null; + let _demoTimer = null; + + 
// =================================================================== + // Init / scene scaffold + // =================================================================== + function init() { + if (_initialized) { _resize(); return; } + if (typeof THREE === 'undefined') { + console.warn('[occupancy3d] THREE not loaded'); + return; + } + _container = document.getElementById('occ3d-container'); + _modeEl = document.getElementById('occ3d-mode'); + _readoutsEl = document.getElementById('occ3d-readouts'); + _minimapEl = document.getElementById('occ3d-minimap'); + _demoBtn = document.getElementById('occ3d-demo-btn'); + if (!_container) return; + + _buildScene(); + _wireInteraction(); + if (_demoBtn) _demoBtn.addEventListener('click', toggleDemo); + + // Subscribe to live data (mirror devices.js:1553-1559) + if (typeof ClientEventBus !== 'undefined') { + ClientEventBus.on('DEVICE_STATE_UPDATE', handleDeviceState); + ClientEventBus.on('SCAN_GEOMETRY_UPDATE', handleScanGeometry); + ClientEventBus.on('EMBRYOS_UPDATE', handleEmbryos); + } + _bootstrap(); + + _initialized = true; + _rebuildSceneObjects(); + _renderReadouts(); + _renderMinimap(); + _animate(); + // Container is 0×0 while the tab is hidden; size once it's visible. + requestAnimationFrame(_resize); + } + + function _buildScene() { + const w = _container.clientWidth || 600; + const h = _container.clientHeight || 460; + + _scene = new THREE.Scene(); + _camera = new THREE.PerspectiveCamera(45, w / h, 0.01, 100); + _camera.position.set(0, 0, _zoom); + + _renderer = new THREE.WebGLRenderer({ antialias: true }); + _renderer.setSize(w, h); + _renderer.setClearColor(0x0a0e12); + _container.innerHTML = ''; + _container.appendChild(_renderer.domElement); + + _root = new THREE.Group(); + _root.rotation.x = _rot.x; + _root.rotation.y = _rot.y; + _scene.add(_root); + + // Keep the canvas in sync with its container (chat dock / window resize). 
+ if (_resizeObserver) _resizeObserver.disconnect(); + _resizeObserver = new ResizeObserver(() => { + if (_resizeRaf) cancelAnimationFrame(_resizeRaf); + _resizeRaf = requestAnimationFrame(_resize); + }); + _resizeObserver.observe(_container); + if (!_onLayoutChanged) { + _onLayoutChanged = () => _resize(); + window.addEventListener('gently:layout-changed', _onLayoutChanged); + } + } + + function _resize() { + if (!_renderer || !_container) return; + const w = _container.clientWidth || 600; + const h = _container.clientHeight || 460; + if (w === 0 || h === 0) return; + _camera.aspect = w / h; + _camera.updateProjectionMatrix(); + _renderer.setSize(w, h); + } + + function _wireInteraction() { + const el = _renderer.domElement; + el.addEventListener('mousedown', (e) => { + _isDragging = true; _prevMouse = { x: e.clientX, y: e.clientY }; + }); + el.addEventListener('mousemove', (e) => { + if (!_isDragging) return; + _root.rotation.y += (e.clientX - _prevMouse.x) * 0.01; + _root.rotation.x += (e.clientY - _prevMouse.y) * 0.01; + _rot.x = _root.rotation.x; _rot.y = _root.rotation.y; + _prevMouse = { x: e.clientX, y: e.clientY }; + }); + window.addEventListener('mouseup', () => { _isDragging = false; }); + el.addEventListener('wheel', (e) => { + e.preventDefault(); + _zoom = Math.max(0.4, Math.min(6, _zoom + e.deltaY * 0.002)); + _camera.position.z = _zoom; + }, { passive: false }); + el.addEventListener('dblclick', () => { + _rot.x = -0.6; _rot.y = 0.6; _zoom = 1.7; + _root.rotation.x = _rot.x; _root.rotation.y = _rot.y; + _camera.position.z = _zoom; + }); + } + + function _animate() { + _animationId = requestAnimationFrame(_animate); + if (_renderer && _scene && _camera) _renderer.render(_scene, _camera); + } + + // =================================================================== + // Scene geometry (rebuilt when scan geometry changes) + // =================================================================== + function _disposeObj(obj) { + if (!obj) return; + 
_root.remove(obj); + obj.traverse?.((c) => { + c.geometry?.dispose?.(); + if (c.material) (Array.isArray(c.material) ? c.material : [c.material]).forEach(m => m.dispose()); + }); + obj.geometry?.dispose?.(); + if (obj.material) (Array.isArray(obj.material) ? obj.material : [obj.material]).forEach(m => m.dispose()); + } + + function _currentGeom() { + // Fall back to nominal defaults so the scene is never empty. + const g = _geom || {}; + const scan = g.scan || {}; + const derived = g.derived || {}; + const piezoCenter = scan.piezo_center_um != null ? scan.piezo_center_um : 50.0; + const zExtent = derived.z_extent_um != null ? derived.z_extent_um : 50.0; + return { + numSlices: scan.num_slices != null ? scan.num_slices : 50, + piezoCenter, + zExtent, + mode: g.mode || 'sheet', + }; + } + + function _rebuildSceneObjects() { + if (!_root) return; + [_outerBox, _cuboid, _cuboidEdges, _sheet, _beam, _sliceGroup].forEach(_disposeObj); + _outerBox = _cuboid = _cuboidEdges = _sheet = _beam = _sliceGroup = null; + + const g = _currentGeom(); + const fov = FOV_UM; + // Outer Z neighbourhood centred on the cuboid so it's always framed. 
+ const halfZ = Math.max(g.zExtent * 2.5, 75); + const zMin = g.piezoCenter - halfZ; + const zMax = g.piezoCenter + halfZ; + const halfXY = fov * 1.5; + + _scaler = makeSceneScaler({ + xRange: [-halfXY, halfXY], + yRange: [-halfXY, halfXY], + zRange: [zMin, zMax], + }); + const L = (um) => _scaler.scaleLen(um); + const Z = (um) => _scaler.toScene(um, 'z'); + + // --- Outer reference frame (addressable Z × local XY) ---------- + _outerBox = new THREE.LineSegments( + new THREE.EdgesGeometry(new THREE.BoxGeometry(L(2 * halfXY), L(zMax - zMin), L(2 * halfXY))), + new THREE.LineBasicMaterial({ color: COLORS.outer }) + ); + _outerBox.position.y = Z(g.piezoCenter); // box centred on its own midpoint == piezoCenter + _root.add(_outerBox); + + // --- Acquisition cuboid (footprint × z-extent) ----------------- + // Three.js Y is our axial (Z µm) axis; X/Z are the lateral footprint. + const cw = L(fov), cd = L(fov), ch = L(g.zExtent); + _cuboid = new THREE.Mesh( + new THREE.BoxGeometry(cw, ch, cd), + new THREE.MeshBasicMaterial({ + color: COLORS.cuboidFace, transparent: true, opacity: 0.06, + depthWrite: false, side: THREE.DoubleSide, + }) + ); + _cuboid.position.y = Z(g.piezoCenter); + _root.add(_cuboid); + _cuboidEdges = new THREE.LineSegments( + new THREE.EdgesGeometry(new THREE.BoxGeometry(cw, ch, cd)), + new THREE.LineBasicMaterial({ color: COLORS.cuboid }) + ); + _cuboidEdges.position.y = Z(g.piezoCenter); + _root.add(_cuboidEdges); + + // --- Slice planes (faint outlines through the cuboid) ---------- + _sliceGroup = new THREE.Group(); + const n = Math.max(1, Math.min(g.numSlices, MAX_SLICE_LINES)); + const sliceMat = new THREE.LineBasicMaterial({ color: COLORS.slice, transparent: true, opacity: 0.5 }); + for (let i = 0; i < n; i++) { + const frac = n === 1 ? 
0.5 : i / (n - 1); + const zUm = (g.piezoCenter - g.zExtent / 2) + frac * g.zExtent; + const ring = new THREE.LineLoop(_rectXZ(cw, cd), sliceMat); + ring.position.y = Z(zUm); + _sliceGroup.add(ring); + } + _root.add(_sliceGroup); + + // --- Light sheet / pencil beam --------------------------------- + if (g.mode === 'pencil') { + // Pencil: a thin beam along the lateral axis through cuboid centre. + _beam = new THREE.LineSegments( + new THREE.BufferGeometry().setFromPoints([ + new THREE.Vector3(-cw / 2, 0, 0), new THREE.Vector3(cw / 2, 0, 0), + ]), + new THREE.LineBasicMaterial({ color: COLORS.beam }) + ); + _root.add(_beam); + } else { + _sheet = new THREE.Mesh( + new THREE.PlaneGeometry(cw, cd), + new THREE.MeshBasicMaterial({ + color: COLORS.sheet, transparent: true, opacity: 0.35, + side: THREE.DoubleSide, depthWrite: false, + }) + ); + _sheet.rotation.x = -Math.PI / 2; // lie in the lateral (X-Z) plane + _root.add(_sheet); + } + _updateSheetPosition(); + } + + // A rectangle outline in the lateral (X-Z) plane, centred at origin. + function _rectXZ(w, d) { + const hw = w / 2, hd = d / 2; + return new THREE.BufferGeometry().setFromPoints([ + new THREE.Vector3(-hw, 0, -hd), new THREE.Vector3(hw, 0, -hd), + new THREE.Vector3(hw, 0, hd), new THREE.Vector3(-hw, 0, hd), + ]); + } + + // Move the sheet/beam to the live axial position (piezo µm), clamped to + // the cuboid extent. Falls back to the cuboid centre when no live value. + function _updateSheetPosition() { + if (!_scaler) return; + const g = _currentGeom(); + const zMin = g.piezoCenter - g.zExtent / 2; + const zMax = g.piezoCenter + g.zExtent / 2; + let zUm = _piezoZ != null ? 
_piezoZ : g.piezoCenter; + zUm = Math.max(zMin, Math.min(zMax, zUm)); + const y = _scaler.toScene(zUm, 'z'); + if (_sheet) _sheet.position.y = y; + if (_beam) _beam.position.y = y; + } + + // =================================================================== + // Event handlers + // =================================================================== + function handleDeviceState(payload) { + if (!payload) return; + const pos = payload.positions || {}; + for (const name of Object.keys(pos)) { + const e = pos[name] || {}; + if (e.kind === 'xy_stage') { + if (e.X != null) _stage.x = e.X; + if (e.Y != null) _stage.y = e.Y; + } else if (e.kind === 'piezo') { + if (e.Position != null) _piezoZ = e.Position; + } else if (e.kind === 'galvo') { + if (e.A != null) _galvo.a = e.A; + if (e.B != null) _galvo.b = e.B; + } + } + const box = extractFirmwareBox(payload.properties); + if (box) _firmwareBox = box; + _updateSheetPosition(); + _renderReadouts(); + _renderMinimap(); + } + + function handleScanGeometry(payload) { + if (!payload) return; + _geom = payload; + if (payload.stage_position_um) { + if (payload.stage_position_um.x != null) _stage.x = payload.stage_position_um.x; + if (payload.stage_position_um.y != null) _stage.y = payload.stage_position_um.y; + } + _rebuildSceneObjects(); + _renderReadouts(); + _renderMinimap(); + } + + function handleEmbryos(payload) { + if (!payload || !Array.isArray(payload.embryos)) return; + _embryos = payload.embryos.map((e) => { + const fine = e.position_fine || {}; + const coarse = e.position_coarse || {}; + const x = fine.x != null ? fine.x : coarse.x; + const y = fine.y != null ? 
fine.y : coarse.y; + return { x, y, role: e.role, id: e.id }; + }).filter((e) => e.x != null && e.y != null); + _renderMinimap(); + } + + async function _bootstrap() { + try { + const r = await fetch('/api/devices/scan_geometry'); + if (r.ok) handleScanGeometry(await r.json()); + } catch (_) { /* offline — demo button covers it */ } + try { + const r = await fetch('/api/embryos/current'); + if (r.ok) handleEmbryos(await r.json()); + } catch (_) { /* ignore */ } + } + + // =================================================================== + // HTML overlay: mode badge, readouts, minimap + // =================================================================== + function _fmt(v, digits = 1, unit = '') { + return v == null ? '—' : (Number(v).toFixed(digits) + unit); + } + + function _renderReadouts() { + const g = _currentGeom(); + if (_modeEl) { + _modeEl.textContent = g.mode === 'pencil' ? 'PENCIL' : 'SHEET'; + _modeEl.classList.toggle('is-pencil', g.mode === 'pencil'); + } + if (!_readoutsEl) return; + const scan = (_geom && _geom.scan) || {}; + const derived = (_geom && _geom.derived) || {}; + const rows = [ + ['stage X', _fmt(_stage.x, 0, ' µm')], + ['stage Y', _fmt(_stage.y, 0, ' µm')], + ['piezo Z', _fmt(_piezoZ, 1, ' µm')], + ['galvo A/B', `${_fmt(_galvo.a, 3)} / ${_fmt(_galvo.b, 3)}°`], + ['slices', scan.num_slices != null ? String(scan.num_slices) : '—'], + ['Z extent', _fmt(derived.z_extent_um, 1, ' µm')], + ['slice step', _fmt(derived.slice_spacing_um, 3, ' µm')], + ]; + _readoutsEl.innerHTML = rows + .map(([k, v]) => `
${k}${escapeHtml(v)}
`) + .join(''); + } + + function _renderMinimap() { + if (!_minimapEl) return; + const VB = { w: 200, h: 120, pad: 8 }; + const box = _firmwareBox || { x: [-25000, 25000], y: [-12000, 12000] }; + const bw = box.x[1] - box.x[0], bh = box.y[1] - box.y[0]; + if (!(bw > 0 && bh > 0)) return; + const sx = (VB.w - 2 * VB.pad) / bw; + const sy = (VB.h - 2 * VB.pad) / bh; + const s = Math.min(sx, sy); + const ox = VB.pad + (VB.w - 2 * VB.pad - bw * s) / 2; + const oy = VB.pad + (VB.h - 2 * VB.pad - bh * s) / 2; + const px = (x) => ox + (x - box.x[0]) * s; + const py = (y) => oy + (box.y[1] - y) * s; // flip Y for screen + + const parts = []; + parts.push(``); + for (const e of _embryos) { + parts.push(``); + } + if (_stage.x != null && _stage.y != null) { + const fovPx = FOV_UM * s; + parts.push(``); + parts.push(``); + } + _minimapEl.innerHTML = parts.join(''); + } + + // =================================================================== + // Demo driver — develop without live hardware (launch_gently.py --offline) + // =================================================================== + function toggleDemo() { + if (_demoTimer) { + clearInterval(_demoTimer); _demoTimer = null; + if (_demoBtn) _demoBtn.classList.remove('is-on'); + return; + } + if (_demoBtn) _demoBtn.classList.add('is-on'); + // Seed a firmware box, a scan geometry, and a few embryos. 
+ _firmwareBox = { x: [-25000, 25000], y: [-12000, 12000] }; + handleScanGeometry({ + embryo_id: 'demo_2', + stage_position_um: { x: 4200, y: -1800 }, + scan: { + num_slices: 60, exposure_ms: 5.0, + galvo_amplitude_deg: 0.5, galvo_center_deg: 0.0, + piezo_amplitude_um: 25.0, piezo_center_um: 50.0, + }, + derived: { z_extent_um: 50.0, slice_spacing_um: 50 / 59, z_min_um: 25, z_max_um: 75 }, + mode: 'sheet', ts: 0, + }); + handleEmbryos({ + embryos: [ + { id: 'demo_1', role: 'test', position_coarse: { x: 4200, y: -1800 } }, + { id: 'demo_2', role: 'control', position_coarse: { x: -8000, y: 5200 } }, + { id: 'demo_3', role: 'test', position_coarse: { x: 12000, y: 2400 } }, + ], + }); + // Sweep the sheet in Z to animate the plane. + let t = 0; + _demoTimer = setInterval(() => { + t += 0.08; + const g = _currentGeom(); + _piezoZ = g.piezoCenter + (g.zExtent / 2) * Math.sin(t); + _galvo.a = 0.5 * Math.sin(t); + _updateSheetPosition(); + _renderReadouts(); + }, 60); + } + + function cleanup() { + if (_animationId) cancelAnimationFrame(_animationId); + if (_demoTimer) { clearInterval(_demoTimer); _demoTimer = null; } + if (_resizeObserver) _resizeObserver.disconnect(); + if (_renderer) { _renderer.dispose(); } + } + + return { init, cleanup, toggleDemo, handleDeviceState, handleScanGeometry, handleEmbryos }; +})(); + +document.addEventListener('DOMContentLoaded', () => { + // Build lazily on first tab activation (container is 0×0 while hidden), + // so init() is invoked from app.js switchTab(), not here. +}); diff --git a/gently/ui/web/static/js/shell.js b/gently/ui/web/static/js/shell.js new file mode 100644 index 00000000..c85b2474 --- /dev/null +++ b/gently/ui/web/static/js/shell.js @@ -0,0 +1,58 @@ +/** + * Shell (ux_v2): the grouped left-rail nav (Now / Library / System) + the + * session-context strip that replace the flat 8-tab bar. 
+ * + * CRITICAL: the rail ROUTES THROUGH switchTab(tabId) for every reveal — it + * never reimplements tab activation, so each tab's lazy-init side-effect + * (HomeApp.init, EmbryosManager.clearDetectionBadge, CampaignsApp.init, …) + * still fires. switchTab emits TAB_CHANGED, which keeps the rail's active + * state in sync no matter who switched (rail, keyboard shortcut, home card, + * hash route). No-ops unless body.ux-v2 is present (flag off → v1 untouched). + */ +const Shell = (() => { + let railItems = []; + + function setActive(tabName) { + railItems.forEach(b => b.classList.toggle('active', b.dataset.tab === tabName)); + } + + function currentTab() { + const active = document.querySelector('.tab.active'); + return (active && active.dataset.tab) || + (typeof state !== 'undefined' && state.tab) || 'home'; + } + + function renderStrip(status) { + const el = document.getElementById('v2-strip-status'); + if (!el) return; + const s = status || (typeof ConnectionStatus !== 'undefined' ? ConnectionStatus.get() : {}); + const n = (typeof state !== 'undefined' && Array.isArray(state.embryos)) ? state.embryos.length : 0; + const conn = s.gentlyConnected ? (s.microscopeConnected ? 'Connected' : 'Online') : 'Offline'; + el.textContent = `${n} embryo${n === 1 ? 
'' : 's'} · ${conn}`; + } + + function init() { + if (!document.body.classList.contains('ux-v2')) return; // flag off → no-op + + railItems = Array.from(document.querySelectorAll('.v2-nav-item')); + railItems.forEach(btn => btn.addEventListener('click', () => { + if (typeof switchTab === 'function') switchTab(btn.dataset.tab); + })); + setActive(currentTab()); + + if (typeof ClientEventBus !== 'undefined') { + ClientEventBus.on('TAB_CHANGED', (tabName) => setActive(tabName)); + ClientEventBus.on('CONNECTION_STATUS', (s) => renderStrip(s)); + } + + const chatBtn = document.getElementById('v2-rail-chat'); + if (chatBtn) chatBtn.addEventListener('click', () => { + if (typeof AgentChat !== 'undefined' && AgentChat.togglePanel) AgentChat.togglePanel(true); + }); + + renderStrip(); + } + + document.addEventListener('DOMContentLoaded', init); + return {}; +})(); diff --git a/gently/ui/web/static/js/status-store.js b/gently/ui/web/static/js/status-store.js new file mode 100644 index 00000000..0e50dd30 --- /dev/null +++ b/gently/ui/web/static/js/status-store.js @@ -0,0 +1,54 @@ +/** + * ConnectionStatus — the single source of truth for connection liveness. + * + * Fixes the "three disagreeing indicators" bug where the header pill, the home + * landing line, and the agent dock each computed connection state from their + * own signal at their own time (home.js read state.connected ONCE at tab init, + * before the /ws handshake, and never corrected — showing "Offline" while the + * header showed "Online"). + * + * Three genuinely distinct signals (kept separate, not flattened): + * - gentlyConnected : the main /ws telemetry socket (websocket.js) + * - microscopeConnected : the /api/device-status health poll (app.js) + * - agentConnected : the /ws/agent chat socket (agent-chat.js) + * + * Writers call set*(); readers subscribe(). 
The store is STICKY: subscribe() + * immediately replays the current snapshot to the new subscriber, so a late + * subscriber can never miss the initial state. Events only fire on real change. + */ +const ConnectionStatus = (() => { + const s = { gentlyConnected: false, microscopeConnected: false, agentConnected: false }; + + function emit() { + if (typeof ClientEventBus !== 'undefined') { + ClientEventBus.emit('CONNECTION_STATUS', { ...s }); + } + } + + function set(key, val) { + val = !!val; + if (s[key] === val) return; // only emit on actual change + s[key] = val; + emit(); + } + + return { + setGently(v) { set('gentlyConnected', v); }, + setMicroscope(v) { set('microscopeConnected', v); }, + setAgent(v) { set('agentConnected', v); }, + get() { return { ...s }; }, + + /** + * Subscribe to status changes AND immediately receive the current + * snapshot (sticky replay). This is the guard against the original bug: + * a subscriber that registers after the first emit still renders from + * the correct current state instead of a stale default. + */ + subscribe(handler) { + if (typeof ClientEventBus !== 'undefined') { + ClientEventBus.on('CONNECTION_STATUS', handler); + } + try { handler({ ...s }); } catch (e) { console.error('ConnectionStatus subscriber error', e); } + } + }; +})(); diff --git a/gently/ui/web/static/js/utils.js b/gently/ui/web/static/js/utils.js index 4b8ff62b..8b4a3d79 100644 --- a/gently/ui/web/static/js/utils.js +++ b/gently/ui/web/static/js/utils.js @@ -5,6 +5,66 @@ // Tab and view name constants const TABS = { HOME: 'home', EMBRYOS: 'embryos', CALIBRATION: 'calibration', EVENTS: 'events', PLANS: 'plans', SESSIONS: 'sessions', DEVICES: 'devices', EXPERIMENT: 'experiment' }; +/** + * Extract the XY firmware fence (the addressable stage box) from a device-state + * properties map. The ASI adapter exposes LowerLimX/UpperLimX/LowerLimY/ + * UpperLimY in mm; we convert to µm. 
Single source of truth for both the 2D + * devices map and the 3D optical-space view. + * + * @param {Object} propsByDevice - payload.properties from DEVICE_STATE_UPDATE + * @returns {{x:[number,number], y:[number,number]}|null} + */ +function extractFirmwareBox(propsByDevice) { + if (!propsByDevice) return null; + for (const name of Object.keys(propsByDevice)) { + const p = propsByDevice[name] || {}; + const xMinMm = parseFloat(p['LowerLimX(mm)']); + const xMaxMm = parseFloat(p['UpperLimX(mm)']); + const yMinMm = parseFloat(p['LowerLimY(mm)']); + const yMaxMm = parseFloat(p['UpperLimY(mm)']); + if (isFinite(xMinMm) && isFinite(xMaxMm) && + isFinite(yMinMm) && isFinite(yMaxMm)) { + return { + x: [xMinMm * 1000, xMaxMm * 1000], // mm → µm + y: [yMinMm * 1000, yMaxMm * 1000], + }; + } + } + return null; +} + +/** + * Build a coordinate mapper from device microns to Three.js scene units. + * Centers each axis on its range midpoint and divides by the LARGEST span so + * the whole scene fits a ~[-0.5, 0.5] cube while keeping axes proportional + * (anisotropic Z vs XY preserved). Returns helpers used by all scene objects + * so a single scale governs geometry and camera distance. + * + * @param {{xRange:[number,number], yRange:[number,number], zRange:[number,number]}} ranges (µm) + */ +function makeSceneScaler(ranges) { + const xc = (ranges.xRange[0] + ranges.xRange[1]) / 2; + const yc = (ranges.yRange[0] + ranges.yRange[1]) / 2; + const zc = (ranges.zRange[0] + ranges.zRange[1]) / 2; + const xs = Math.abs(ranges.xRange[1] - ranges.xRange[0]); + const ys = Math.abs(ranges.yRange[1] - ranges.yRange[0]); + const zs = Math.abs(ranges.zRange[1] - ranges.zRange[0]); + const maxExtent = Math.max(xs, ys, zs, 1e-6); + return { + maxExtent, + center: { x: xc, y: yc, z: zc }, + // Map an absolute µm position on one axis into scene space. + toScene(um, axis) { + const c = axis === 'x' ? xc : axis === 'y' ? 
yc : zc; + return (um - c) / maxExtent; + }, + // Map a µm length (span) into scene units (no centering). + scaleLen(um) { + return um / maxExtent; + }, + }; +} + /** * HTML-escape a string (safe for insertion into innerHTML). * Uses the browser's built-in text node escaping. diff --git a/gently/ui/web/templates/index.html b/gently/ui/web/templates/index.html index 85a60345..cd0b1fba 100644 --- a/gently/ui/web/templates/index.html +++ b/gently/ui/web/templates/index.html @@ -15,19 +15,136 @@ + + + - + + {% if ux_v2 %} + {# Agent-first welcome — shown on entry, recedes into the workspace once the + user picks a path. Chat is the last resort (the escape pill), not the + first thing. Only rendered when the flag is on. #} +
+
+ + {# ── Screen 1: welcome ── #} +
+
+
+
Hello.
What are we doing today?
+
+ +
+ + + +
+ +
+ +
+ + +
+
+ + +
+ + {# ── Screen 2: the plan wizard, hosted IN the landing. The agent's + ask_user_choice questions render here as button cards (#v2-plan-ask), + NOT in the chat panel. The plan assembles on the right as you pick. #} +
+
+ +
+
Gently · planning
+
Let's design your run
+
+ +
+
+
+
+
+
working through the next step…
+ +
+ +
+
+ + + + +
+
+ +
+
+ {% endif %} {% include '_header.html' %} {% include '_navbar.html' %}
+ {% if ux_v2 %} + + {% endif %}
+ {% if ux_v2 %}
LIVE
{% endif %} + + {# ux_v2: the agent's current pending ask, dual-rendered here + in the chat. #} + {% if ux_v2 %}{% endif %} +
+ {% if ux_v2 %}{% endif %}

Welcome to Gently

@@ -301,6 +418,7 @@

Device +

@@ -532,6 +650,40 @@

Properties

+ + +
@@ -616,6 +768,7 @@

Properties

+ @@ -631,10 +784,15 @@

Properties

+ + + + + diff --git a/screenshots/occupancy3d-demo.png b/screenshots/occupancy3d-demo.png new file mode 100644 index 00000000..d7688572 Binary files /dev/null and b/screenshots/occupancy3d-demo.png differ diff --git a/screenshots/ux-v2-landing.png b/screenshots/ux-v2-landing.png new file mode 100644 index 00000000..62502ce3 Binary files /dev/null and b/screenshots/ux-v2-landing.png differ diff --git a/ux-prototype/MIGRATION-PLAN.md b/ux-prototype/MIGRATION-PLAN.md new file mode 100644 index 00000000..8e906515 --- /dev/null +++ b/ux-prototype/MIGRATION-PLAN.md @@ -0,0 +1,102 @@ +# Gently web UI → agent-first paradigm: migration plan + +Strangler-fig migration of the **existing** stack (FastAPI + Jinja2 + vanilla-JS, `gently/ui/web/`). **No SPA rewrite.** Everything new is gated behind a `GENTLY_UX_V2` flag and layered onto the seams that already exist. Target paradigm is the prototype in `ux-prototype/landing.html`. + +## Why this is cheap (the load-bearing discoveries) + +- **The structured-ask protocol already exists as data.** The agent emits `{type:'choice_request', request_id, choice_data:{question, options[], _type, allow_multiple}}` over `/ws/agent` (`conversation.py:617`, `bridge.stream_response:648`); the client replies `{type:'choice_response', request_id, selected}` (`agent_ws.py:769`). "One payload, two renderers" = add a `_kind` discriminator + factor `agent-chat.js renderChoice` (L356-390) into a pure `buildAskCard()` + a second mount. **Not a protocol rewrite.** +- **There's already an inference-first precedent**: `bridge.bootstrap_resolution_picker` builds inferred pickers in-memory from candidates without persisting. Plan-mode's draft-first flow models on it. +- **One init chokepoint**: `switchTab(name)` (`app.js:60-103`) is the *only* caller of every tab's manager init. The new shell **calls** it for each region reveal — never reimplements activation. 
+- **Two separate sockets**: `/ws` (telemetry + server `EventBus` fan-out) and `/ws/agent` (chat + asks). They have different lifecycles — the context surface (Phase 4) rides `/ws`, the ask-stage rides `/ws/agent`. + +## Coexistence (how old + new run side by side) + +A Jinja2 flag: `pages.py GET /` passes `ux_v2` (new `GENTLY_UX_V2` setting, default off) into `index.html`. The template keeps **both** the current 8-tab markup and the new grouped-rail markup, mutually exclusive via `{% if ux_v2 %}` + a `body.ux-v2` class. New JS modules (`status-store.js`, `ask-stage.js`, `shell.js`, `context-surface.js`) load on every page but **no-op without `body.ux-v2`**, so they can't regress v1. Same URL, same shell, same uvicorn process. Flip default-on after a soak (Phase 6), then delete v1 markup. Both UIs read the same state objects and sockets, so v1/v2 can be compared side-by-side on identical live data. + +> CSS hazard: `main.css` has **duplicate** `.tab`/`.tab-content`/`.status-dot` rulesets (~L547 and ~L2892). Consolidate or strictly scope v2 under `body.ux-v2` **before** Phase 2 touches nav CSS. 
+ +## Phase sequence + +| # | Phase | Ships | Flag | Depends | +|---|-------|-------|------|---------| +| 0 | Bug-fix beachhead: sticky status store + idle-telemetry | Status unification + quiet idle channel, **to prod now** | none | — | +| 1 | Dual-render the ask protocol + correct clear-signal | The paradigm enabler | ux_v2 | 0 | +| 2 | Shell unfold + grouped nav + session-context strip | The calm welcome→workspace | ux_v2 | 1 | +| 3a | Inference-first plan mode **backend** (headless) | Draft-from-strain + per-field provenance | — | 2 | +| 3b | Inference-first plan mode **UI** (`plan_confirm` renderer) | Draft renders with provenance | ux_v2 | 3a | +| 4 | Co-editable FileContextStore surface + proactive cards | Shared visibility (beliefs/attention/uncertainty) | ux_v2 | 3b | +| 5 | Carve per-embryo tactical Experiment view out of `embryos.js` | The tactical view mount (contents TBD) | ux_v2 | 4 | +| 6 | Default-on flip + v1 deletion | Irreversible cutover, isolated/soaked | flip | 5 | + +## Phase detail + +### Phase 0 — Bug-fix beachhead (no flag, pure value) +- **Single sticky/replaying `ConnectionStatus` store** (`status-store.js`) holding `{gentlyConnected, microscopeConnected, agentConnected}` and emitting `CONNECTION_STATUS`. Must **replay last state to late subscribers** or bug #1 just moves. +- **Bug #1**: header pill (`updateTopLevelDot` `app.js:562`), home line (`home.js updateStatus:136` — today reads `state.connected` once, the literal "Offline while connected" bug), and dock dot all **subscribe** and re-render on every event. +- **Bug #3 (measure first)**: code shows **15s** polls (not 1-2s); the real idle cost is likely the ~5Hz `DEVICE_STATE_UPDATE` WS stream. **Ship unconditionally:** decouple `#events-count` from `DEVICE_STATE_UPDATE`/`BOTTOM_CAMERA_FRAME`. **Gate polls only if** measurement implicates them; if it's the SSE stream, coalesce/backoff in `device_state_monitor.py`. 
Gate on a stable `DevicesManager.active` flag, **not** switchTab internals (Phase 2 rewrites those). +- **Bug #2**: capture the prototype's correct multi-select contract (Continue mounts with disabled state *derived from current selection*) for `buildAskCard`. +- Files: `status-store.js` (new), `websocket.js`, `app.js`, `home.js`, `agent-chat.js`, `devices.js`, `events.js`, `index.html`. +- Verify: cold + reload + kill each socket independently — all three indicators agree within one handshake; kill device layer → microscope badge flips in 15s, gently stays Online. + +### Phase 1 — Dual-render the ask protocol (the enabler) +- `_kind` always-present discriminator + additive `_surface` ∈ {transcript,stage,both} (default transcript so the Ink TUI/older clients are unaffected). Set on each payload the bridge builds. +- Factor `renderChoice` → pure `buildAskCard()` + a module-level `answered` Set keyed by **opaque** `request_id` (never parse a prefix — ids mix `req_N` and `resolve_*_`). New `ask-stage.js` renders the same card into `#ask-stage`. +- **BLOCKER fixed — clear signal**: fire `ASK_CLEARED` the instant a `choice_response` is sent (plus on cancel/error/control-loss/socket-close), **not** on `stream_end` — an in-turn ask suspends on `asend` (`bridge.py:657`) and `stream_end` only arrives *after* the answer; a cancelled turn emits none. +- **BLOCKER fixed — dismiss vs control**: `agent_ws.py:772` silently drops non-holder responses. Stage renders read-only when `!hasControl`; only the holder answers/dismisses; a holder's escape posts a real empty `choice_response` so the turn-lock releases. +- **BLOCKER fixed — leak cleanup**: pop orphaned `_choice_futures` on holder-change and answerer-disconnect, not only last-client (`agent_ws.py:889`). +- Add the **free-text "Something else"** affordance web cards lack today (`bridge._dispatch_resolution_pick:462` already routes unknown selections to LLM resolution). 
+- Verify: trigger the session-open bootstrap picker — appears in chat **and** `#ask-stage`; answering either clears both; cancel a turn mid-ask → stage clears and next turn works; lose control mid-ask → stage goes read-only. + +### Phase 2 — Shell unfold + grouped nav + session strip +- `shell.js` screen-state machine sets `body[data-screen]` (welcome|plan|standalone|shell) and **calls `switchTab()`** for every reveal (+ a dev assertion if a region shows without its init). +- Grouped left rail (Now / Library / System) → each item maps to an existing `data-tab` id via `switchTab`. Session-context strip reuses `ExperimentStrip` (fix stale `switchTab('tasks')` at `experiment-strip.js:176`), fed by `/api/experiments/current/strategy`, reading live status from the Phase 0 store. +- **Real routing**: replace the consume-once hash (`app.js:633` `replaceState` to `/`) with deliberate URL/state sync so refresh/back-button + the `/review`→`/#sessions` redirects resolve correctly. +- Decouple welcome→plan from the brittle `togglePanel(true)+setTimeout(250ms,'/wizard')` (`home.js:159`): the picker renders in `#ask-stage` on connect, driven by the bootstrap `choice_request`. +- **Resume**: replace the `window.location.href='/'` hard reload on `session_changed` (`websocket.js:147`, `review.js:108`) with in-place re-hydration (jarring otherwise). +- Verify: cold load → calm welcome; choosing a plan unfolds (no hard cut); each rail item fires its manager's init side-effect (verify the side-effect, not just visibility). + +### Phase 3a — Inference-first plan backend (headless-testable) +- Flip the plan-mode prompt from ask-first to **infer-first** (`plan_mode_system_prompt.tex`, `harness/plan_mode/prompt.py`): arrive with a draft, ask only for genuine gaps / low-confidence / consequential confirmations. +- Deterministic **strain→channel** inference in `research.py`: parse genotype from `search_strains` (TagRFP→561, GFP→488), attach source ref. 
**Must degrade to "confirm" — never fabricate a wavelength.** Network-dependent (WormBase REST + CGC scraping), so the degrade path is load-bearing. +- **Per-field provenance** (`model.py:186`): `ImagingSpec` fields are flat scalars with no per-field source (`references[]` is on `PlanItem`, not the spec). Add a parallel `{field → {source, confidence, citation}}` map; reuse the existing `Confidence` enum. +- **Drafts stay in-memory** (like `bootstrap_resolution_picker`), materialized via `create_campaign`/`create_plan_item` only on explicit confirm — avoids a `PlanItemStatus` enum change *and* orphan-folder cleanup. +- Use `gap_assessment.assess_gaps()` to *select* which gaps to ask (don't re-enable the deliberately-disabled multi-question wizard; note `conversation_weight` short-circuits after onboarding). +- Verify (no UI): known strain → draft with channels pre-filled + per-field source/confidence; unknown/offline → "confirm channel", never fabricated; reject → no folder written; confirm → materializes and `GET /api/campaigns/{id}/document` returns the tree. + +### Phase 3b — Inference-first plan UI +- `_kind:'plan_confirm'` ask: bridge emits the in-memory draft as `choice_data`; `ask-stage.js` renders cards with per-field source tags + confidence + edit affordances; chat shows a **compact reference line** (not a duplicate) so the surfaces can't drift. +- Extend `renderSpec` (`agent-chat.js:403`, today a flat key→value table) with a **source column**. +- Confirm posts the same `choice_response`; bridge materializes; stage clears via `ASK_CLEARED`. + +### Phase 4 — Co-editable context surface + proactive cards +- **BLOCKER fixed — real store change**: `FileContextStore` has no event bus. 
(1) add `CONTEXT_UPDATED` to the closed `EventType` enum (`core/event_bus.py`); (2) inject a bus/callback into the store; (3) emit a **single coalesced** event from each mutator (`add_expectation:1880`, `add_watchpoint:1933`, `add_question:1980`, `update_embryo_understanding:2049`, …); (4) wire in `launch_gently.py:497`. +- New `routes/context.py`: **read** side models on `campaigns.py` (`_serialize`); **write** side uses `Depends(require_control)` from `data.py`/`sessions.py` (NOT the `campaigns.py` mesh/account auth) so a viewer can't mutate the agent's mind. +- Live updates over `/ws` (telemetry socket) → `websocket.js:117` → `context-surface.js`. **Push on change, never poll** (`load_active` scans ~50 observations + YAML). +- `context-surface.js` renders beliefs/attention/uncertainty as a calm panel in the Now region with inline edit/resolve (disabled when `!hasControl`). +- **Proactive cards**: wire watchpoint/question creation + the existing wake-router `origin:'wake'` approvals (`agent.py:1079`) to surface prominent `#ask-stage` cards — real backing for the prototype's attention card, no new mechanism. + +### Phase 5 — Carve the tactical Experiment view (behind the flag, before the flip) +- Make Experiment a distinct renderer over `EmbryosManager.state` + the strategy snapshot rather than overloading the 4556-line `embryos.js`. **Preserve `reconcileWithServerState`/`clearAllState` as the contract.** Don't over-specify contents yet — it's a mount point. +- **Remove the `STUB_STRATEGY` fallback** (`experiment-overview.js:14` + the "mockup · stubbed data" badge) → real loading/empty state. Production must never render stubs. +- Stays behind the flag through its own soak so a reconciliation regression is caught before the irreversible flip. + +### Phase 6 — Default-on flip + v1 cleanup (irreversible, isolated) +- Flip `GENTLY_UX_V2` default-on after soak; delete v1 `{% else %}` nav markup, superseded v1 status writers, and the dead/duplicate `.tab` CSS. 
Isolated from Phase 5 so the high-regression carve-out never coincides with deletion.
+
+## Blockers the adversarial pass caught (now folded in)
+1. **Clear signal must follow the choice lifecycle, not the stream lifecycle** (asend suspension; cancelled turns emit no `stream_end`).
+2. **Dismiss vs control gate** — non-holder responses are silently dropped; only the holder dismisses; server cleans orphaned futures on holder-change/disconnect.
+3. **Phase 4 store change is real, not free** — new `EventType`, bus injection, coalesced emit from every mutator, launch wiring.
+4. Two answer paths (`_choice_futures` vs bridge-owned `_pending_import`) → client-authoritative `answered` Set + bridge idempotency guard.
+5. `switchTab` is the sole init chokepoint → shell calls it, never reimplements.
+6. Per-field provenance doesn't exist today → added in 3a.
+7. Phase 3 split into headless backend (3a) + UI (3b); embryos.js carve-out isolated in Phase 5.
+
+## Open decisions (yours)
+- **Measure bug #3 first**: is idle chatter the ~5 Hz `DEVICE_STATE_UPDATE` stream or the 15 s polls? Determines the lever (coalesce in `device_state_monitor.py` vs gate polls).
+- **Status**: client-computed sticky store (chosen for Phase 0) vs a single server-emitted status object over `/ws`.
+- **Routing**: History API vs hash-fragment for region state (keeping the `/review`→`/#sessions` redirects working without reload).
+- **Co-edit concurrency**: optimistic last-write-wins vs per-item version/lock, given the agent mutates the same YAML.
+- **Slash-command demotion**: re-render `/status`, `/embryos` rich content as affordances vs button-per-command.
+- **Per-field provenance schema**: parallel map on `ImagingSpec` vs sibling dataclass vs extending `PlanItem.references[]`.
+- **Experiment tactical view contents** — deferred to Phase 5 design.
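The Phase 4 store change (blocker 3 above) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's code: `ContextStoreSketch`, its `emit` callback signature, the `batch()` helper, and the `{"sections": [...]}` payload are all hypothetical, and `EventType` here is a one-member stand-in for the closed enum in `core/event_bus.py`. It reads "single coalesced event" as "one `CONTEXT_UPDATED` per outermost mutation, however many mutators ran inside it", which is only one possible interpretation.

```python
from contextlib import contextmanager
from enum import Enum
from typing import Callable, Iterator, List, Optional, Set


class EventType(Enum):
    # Stand-in for the real closed enum; only the new member is modeled.
    CONTEXT_UPDATED = "context_updated"


# Hypothetical callback shape: (event type, payload dict) -> None.
Emit = Callable[[EventType, dict], None]


class ContextStoreSketch:
    """Hypothetical stand-in for FileContextStore with an injected callback."""

    def __init__(self, emit: Optional[Emit] = None) -> None:
        self._emit = emit            # injected bus/callback; None means silent
        self._depth = 0              # nesting depth of open mutation scopes
        self._dirty: Set[str] = set()
        self.expectations: List[str] = []
        self.watchpoints: List[str] = []

    @contextmanager
    def _scope(self, section: Optional[str] = None) -> Iterator[None]:
        self._depth += 1
        try:
            yield
            # Reached only if the mutation body did not raise.
            if section is not None:
                self._dirty.add(section)
        finally:
            self._depth -= 1
            if self._depth == 0:
                self._flush()

    def _flush(self) -> None:
        if not self._dirty:
            return
        sections = sorted(self._dirty)
        self._dirty.clear()
        if self._emit is not None:
            self._emit(EventType.CONTEXT_UPDATED, {"sections": sections})

    def batch(self):
        """Group several mutators so they produce one event on exit."""
        return self._scope()

    def add_expectation(self, text: str) -> None:
        with self._scope("expectations"):
            self.expectations.append(text)

    def add_watchpoint(self, text: str) -> None:
        with self._scope("watchpoints"):
            self.watchpoints.append(text)


events = []
store = ContextStoreSketch(emit=lambda kind, payload: events.append((kind, payload)))

store.add_expectation("division within 20 min")   # one event on its own
with store.batch():                                # two mutators, one event
    store.add_watchpoint("embryo 3 quiet")
    store.add_expectation("hatching overnight")
```

A single mutator call emits one event, and the batched pair also emits one, so a `/ws` push handler wired to the callback would fire twice here rather than three times. Whether the real store needs batching across mutators, or only one emit per mutator, is left to the Phase 4 design.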
diff --git a/ux-prototype/landing.html b/ux-prototype/landing.html
new file mode 100644
index 00000000..59e7bd94
--- /dev/null
+++ b/ux-prototype/landing.html
@@ -0,0 +1,674 @@
+
+
+
+
+
+Gently — entry paradigm sketch
+
+
+
+
+
+
Gently
+
+ + Scope ready · 36.9 °C · stage idle +
+
+ +
+ +
+
+
+
Good evening.
What are we doing today?
+
+ +
+ + + +
+ +
+ +
+ + +
+
+
+ + +
+
+
+
+
+ Gently + +
+
+
+
+
+ + +
+
+
+
+
+ +
+

The plan

+

assembling from your choices

+
+
Nothing yet — pick above and watch it fill in.
+
+
+
+
+
+ + +
+
+
+

Quick look

+

I'll take one careful volume right where the stage is now — nothing scheduled, nothing committed.
(We'll design this surface next — it's a stub for now.)

+
+
+
+ + +
+
+
+ LIVE + run + + 0h 12m elapsed · next 1:43 +
+ + + +
+
+
⚠ Needs you
+
Embryo 3 has been quiet for 40 min — past its expected division window. Keep waiting, or flag it for you?
+
+ + + +
+
+ +

Embryos · 3 tracked

representative — the embryo-wise tactical view is yours to define
+
+
Embryo 1 · on track
4-cell · imaging normally
+
Embryo 2 · dividing
sped up to every 30 s
+
Embryo 3 · stalled
40 min quiet · flagged above
+
+
+
+
+ +
+
+ + + +