Agentic harness for microscopy.
Status: 0.22.0 — actively developed at Shroff Lab, Janelia.
Smart microscopy has evolved dramatically, but remains fundamentally rule-based. Adaptive illumination, event-triggered acquisition, real-time segmentation: these systems don't understand what they're imaging. There's a semantic gap between what microscopes measure (pixels, intensities) and what biologists care about (developmental stages, cell health, experimental outcomes).
Vision language models can bridge this gap through semantic reasoning over images. But how do you integrate VLMs with microscope hardware?
There's a useful distinction between workflows and agents: workflows orchestrate AI through predefined code paths, while agents dynamically direct their own tool usage.
Workflow approach: VLMs at specific decision points (classification, quality checks, event detection) within a traditional control system. Predictable, but rigid.
Agentic approach: The microscope exposed as tools an agent calls autonomously. Flexible, but risky without safety guarantees.
Gently supports both. Our orchestrator agent and perception agent operate agentically, while calibration workflows use VLMs at specific decision points (coverage detection, focus assessment). The safety architecture makes either pattern safe to experiment with.
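The difference between the two patterns can be sketched in a few lines. This is an illustrative toy, not Gently's orchestrator: `fake_vlm_classify`, `workflow_step`, `agent_loop`, and the tool names are all hypothetical, and the "policy" stands in for an LLM choosing its next tool call.

```python
from typing import Callable, Dict, List


def fake_vlm_classify(image: str) -> str:
    """Hypothetical stand-in for a VLM call; a real one would look at pixels."""
    return "bean" if "bean" in image else "unknown"


def workflow_step(image: str) -> str:
    """Workflow: the VLM is consulted at one fixed decision point.

    The control flow is written by the programmer, not chosen by the model.
    """
    stage = fake_vlm_classify(image)
    return "acquire_timelapse" if stage == "bean" else "skip"


def agent_loop(
    policy: Callable[[List[str]], str],
    tools: Dict[str, Callable[[], str]],
    max_steps: int = 5,
) -> List[str]:
    """Agent: a policy (an LLM in practice) picks the next tool on every turn."""
    history: List[str] = []
    for _ in range(max_steps):
        choice = policy(history)
        if choice == "stop":
            break
        if choice not in tools:
            # Only exposed tools can run; anything else is refused and logged.
            history.append(f"rejected:{choice}")
            continue
        history.append(f"{choice}:{tools[choice]()}")
    return history


def scripted_policy(history: List[str]) -> str:
    """Deterministic policy used here in place of a model, for illustration."""
    plan = ["find_embryo", "move_raw_stage", "snap"]
    return plan[len(history)] if len(history) < len(plan) else "stop"


tools = {"find_embryo": lambda: "ok", "snap": lambda: "image_001"}
trace = agent_loop(scripted_policy, tools)
print(workflow_step("bean_01.tif"), trace)
```

The agent's request for `move_raw_stage` is refused because that tool was never exposed, which is the same idea as giving agents templated actions instead of raw coordinates.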
Multiple independent layers of protection:
| Layer | Protection |
|---|---|
| Process Isolation | HTTP API separates agent from device layer. Client crashes don't affect the microscope. |
| Device Limits | Hard bounds validated in set() before any motion. Stage, piezo, galvo all protected. |
| Plan Constraints | Bluesky plans use a restricted vocabulary of safe primitives. |
| Templated Actions | Agents work with Embryo objects, not raw coordinates. |
| Automatic Cleanup | Try-finally patterns ensure lasers are switched off on any error. |
This means: bring your risky code. AI-generated plans, experimental perception, coding agents iterating on control logic. The device layer catches errors before they reach hardware.
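Two of the layers above, device limits and automatic cleanup, can be illustrated together. This is a minimal sketch under stated assumptions: `BoundedAxis`, `Laser`, `LimitError`, and `run_guarded` are hypothetical names, and the real device layer sits behind an HTTP API and is more involved.

```python
from dataclasses import dataclass
from typing import Iterable


class LimitError(ValueError):
    """Raised when a requested position falls outside the hard bounds."""


@dataclass
class BoundedAxis:
    """Hypothetical axis (stage, piezo, galvo) with hard limits checked in set()."""

    name: str
    low: float
    high: float
    position: float = 0.0

    def set(self, target: float) -> float:
        # Validate before any motion. NaN fails this comparison and is rejected too.
        if not (self.low <= target <= self.high):
            raise LimitError(
                f"{self.name}: {target} outside [{self.low}, {self.high}]"
            )
        self.position = target
        return self.position


class Laser:
    """Hypothetical laser with a simple on/off state."""

    def __init__(self) -> None:
        self.on = False


def run_guarded(axis: BoundedAxis, laser: Laser, targets: Iterable[float]) -> None:
    """Move through targets with the laser on; always switch it off afterwards."""
    laser.on = True
    try:
        for target in targets:
            axis.set(target)
    finally:
        # Runs on success and on any exception, so the laser never stays on.
        laser.on = False


stage = BoundedAxis("stage_x", low=-1000.0, high=1000.0)
laser = Laser()
error = None
try:
    run_guarded(stage, laser, [100.0, 5000.0])  # second target is out of range
except LimitError as exc:
    error = str(exc)
print(stage.position, laser.on, error)
```

The out-of-range move never reaches the axis: the position stays at the last valid target and the laser is off by the time the error propagates.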
Gently is designed for AI-assisted development. The safety stack exists precisely so that coding agents can iterate rapidly without risking hardware.
Our agent-developing-agent methodology: coding agents generate perception systems, test against benchmarks, analyze reasoning traces to identify failures, and refine. AI improving AI, with humans providing ground truth and guidance.
- Hardware: Dual-view selective plane illumination microscope (diSPIM)
- Sample: C. elegans embryo development (8 morphological stages)
- Perception: VLM-based stage classification with full reasoning traces
- Interface: Natural language agent for biologists
The sample is the basic unit of data, not the image or the acquisition. Each sample carries:
- Live imagery and timelapse history
- Calibration state
- Perception traces exposing all classification reasoning
- Detector configurations and event history
This design makes AI decision-making fully observable, addressing a key barrier to AI adoption in scientific instrumentation.
Currently, the sample abstraction is the Embryo object for C. elegans work. The pattern generalizes to other sample types through the plugin system — organism and hardware modules are swappable.
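As a rough picture of what a sample-centric record might look like, here is a sketch using plain dataclasses. The class and field names (`EmbryoSketch`, `PerceptionTrace`, `record_observation`) are assumptions made for illustration and are not the actual `Embryo` API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class PerceptionTrace:
    """One classification together with the reasoning that produced it."""

    timepoint: int
    stage: str
    reasoning: str


@dataclass
class EmbryoSketch:
    """Hypothetical sample object: everything known about one embryo in one place."""

    embryo_id: str
    timelapse: List[str] = field(default_factory=list)  # image paths, in order
    calibration: Dict[str, Any] = field(default_factory=dict)
    traces: List[PerceptionTrace] = field(default_factory=list)
    detectors: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    events: List[str] = field(default_factory=list)

    def record_observation(self, image_path: str, stage: str, reasoning: str) -> None:
        """Append an image and the trace explaining how it was classified."""
        self.timelapse.append(image_path)
        self.traces.append(
            PerceptionTrace(len(self.timelapse) - 1, stage, reasoning)
        )

    def latest_stage(self) -> Optional[str]:
        """Most recent classified stage, or None if nothing has been observed."""
        return self.traces[-1].stage if self.traces else None


embryo = EmbryoSketch("embryo_01", calibration={"focus_um": 12.5})
embryo.record_observation("t000.tif", "bean", "Indentation visible on one side.")
print(embryo.latest_stage(), embryo.traces[0].reasoning)
```

Keeping the reasoning next to the image it refers to is what makes each decision inspectable after the fact.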
- Python 3.10+
- An ANTHROPIC_API_KEY, either exported in your shell (export ANTHROPIC_API_KEY=your-key) or placed in a .env file in the project root (ANTHROPIC_API_KEY=your-key), which is loaded automatically on launch and is gitignored. (Not required if you launch with --no-api to browse the UI only; see Launch below.)
- (Optional) GENTLY_STORAGE_PATH: where sessions and data live (default D:/Gently3)
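The lookup described above can be pictured with a small standard-library sketch. It is not Gently's loader: `parse_env_file` and `resolve_settings` are hypothetical helpers, and the precedence shown (shell environment first, then the .env file, then the default) is an assumption.

```python
import tempfile
from pathlib import Path
from typing import Dict, Mapping, Optional


def parse_env_file(path: Path) -> Dict[str, str]:
    """Read simple KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    if not path.exists():
        return values
    for raw in path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_settings(environ: Mapping[str, str], env_file: Path) -> Dict[str, object]:
    """Assumed precedence: shell environment, then .env file, then default."""
    file_values = parse_env_file(env_file)

    def lookup(name: str, default: Optional[str] = None) -> Optional[str]:
        return environ.get(name) or file_values.get(name) or default

    return {
        # Without a key the UI can still be browsed (compare --no-api).
        "api_enabled": lookup("ANTHROPIC_API_KEY") is not None,
        "storage_path": lookup("GENTLY_STORAGE_PATH", "D:/Gently3"),
    }


with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / ".env"
    env_file.write_text("ANTHROPIC_API_KEY=your-key\n")
    from_file = resolve_settings({}, env_file)
    from_shell = resolve_settings({"GENTLY_STORAGE_PATH": "/data/gently"}, env_file)
    no_key = resolve_settings({}, Path(tmp) / "missing.env")

print(from_file, from_shell, no_key)
```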
Gently is web-first: the agent is driven from an in-page chat in your browser. There is no TUI to build (Node.js is only needed for the paper diagrams, not the app).
This project uses uv for environment and
dependency management. If you don't have it yet, install it following the
uv installation guide
(e.g. curl -LsSf https://astral.sh/uv/install.sh | sh on macOS/Linux).
Gently depends on gently-perception (the VLM perception harness, repo
gently-project/gently-perception), which is not published to PyPI. For development, it is
installed as an editable sibling clone, so clone both repos side by side:
git clone git@github.com:gently-project/gently.git
git clone git@github.com:gently-project/gently-perception.git
# Layout:
# <parent>/
# gently/ <- you run commands from here
# gently-perception/ <- editable, resolved via [tool.uv.sources]
cd gently
uv sync # base env (add --extra ... for torch etc., see below)
The git@github.com: URLs use SSH, which needs an SSH key configured with GitHub. If you don't use SSH, clone over HTTPS instead (https://github.com/gently-project/<repo>.git).
[tool.uv.sources] resolves gently-perception to the sibling as an editable
install, so your perception edits are live immediately and survive uv sync. If
the sibling isn't cloned, uv sync fails by design — clone it first.
You get a .venv in the project directory with the runtime + dev dependencies
pinned in uv.lock. Activate it with source .venv/bin/activate, or prefix
commands with uv run (e.g. uv run python ...) to use it without activating.
PyTorch is not in the base install — the CUDA build is machine-specific, so it lives in mutually-exclusive extras wired to the right PyTorch index:
# Device-layer accessories (microscope computer): BLE/serial/MQTT transports
uv sync --extra device
# PyTorch (needed for SAM detection and the ML pipeline)
# NOTE: the GPU and CPU builds are mutually exclusive, so they can't be combined.
uv sync --extra torch-gpu # CUDA 11.8 build (GPU box, e.g. the microscope PC)
uv sync --extra torch-cpu # CPU-only build (dev laptop / CI)
# Run the test suite
uv run pytest
The commands below use uv run so they work without activating the env. If you've activated it first (source .venv/bin/activate), the uv run prefix isn't necessary.
To verify the install, you can start gently without an API key or hardware. The web UI boots and is browsable, though the agent itself (chat, perception, plan mode) stays disabled until you add a key:
uv run python launch_gently.py --offline --no-api
For the full launch:
# 1. Device layer (hardware control + SAM detection) — separate process, own terminal
uv run python start_device_layer.py
# 2. Agent + web UI (starts the in-process server and opens your browser)
uv run python launch_gently.py
# Run without hardware (development / review)
uv run python launch_gently.py --offline
# UI-only — boot the web UI with no API key (chat/perception disabled)
uv run python launch_gently.py --no-api
# Don't auto-open a browser — open the printed URL yourself
uv run python launch_gently.py --no-browser
# Resume a previous session
uv run python launch_gently.py --resume # interactive picker
uv run python launch_gently.py --resume latest # most recent session
uv run python launch_gently.py --resume <id> # specific session
# Verbose / debug logging
uv run python launch_gently.py -v # INFO level
uv run python launch_gently.py --debug # DEBUG level
The launcher prints a banner with the URL (default http://localhost:8080),
device status, storage path, and log location. Open that URL in any browser on
the LAN.
Viewing is open — the dashboard loads read-only for anyone, no login. Signing in elevates you to control (driving hardware, taking the single-operator lock); it isn't a gate on the page.
On the first run, Gently creates one admin account and prints a one-time
random password in the startup banner:
First-run admin account created — sign in at the URL above:
username: admin
password: <random>
- Save it now: the password is printed to the console once and never written to the log (only a PBKDF2 hash is stored).
- After signing in, add accounts (roles viewer/operator/admin) via the admin-only POST /api/auth/users.
- Lost it? There's no reset command yet; delete <GENTLY_STORAGE_PATH>/auth/users.yaml and restart to bootstrap a fresh admin (this clears all accounts).
- Just trying it locally? GENTLY_NO_AUTH=1 disables accounts entirely (legacy mode: localhost gets control, remote callers need X-Gently-Token).
Accounts live under <GENTLY_STORAGE_PATH>/auth/ (users.yaml + secret.key),
outside the repo.
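For readers unfamiliar with the "print once, store only a hash" pattern, here is a minimal sketch using the Python standard library. The iteration count, salt size, hash function, and function names are assumptions for illustration; Gently's actual parameters and storage format may differ.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor, chosen only for this example


def new_one_time_password() -> str:
    """Generate a random password to show once in the console."""
    return secrets.token_urlsafe(16)


def hash_password(
    password: str, salt: Optional[bytes] = None, iterations: int = ITERATIONS
) -> Tuple[bytes, bytes]:
    """Derive a PBKDF2-HMAC-SHA256 digest; returns (salt, digest)."""
    salt = salt or secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def verify_password(
    password: str, salt: bytes, expected: bytes, iterations: int = ITERATIONS
) -> bool:
    """Re-derive the digest and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


password = new_one_time_password()  # printed once, never logged
salt, stored = hash_password(password)  # only salt and digest are persisted
print(verify_password(password, salt, stored))
```

Because only the salt and digest are kept, a lost password cannot be recovered from storage, which is why the recovery path is to bootstrap a fresh admin account.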
You don't need a microscope to try the core loop — plan mode is pure agent reasoning and works under --offline. The path from launch to an inspectable plan:
1. Open the agent chat. Click Agent in the header (or press Ctrl/Cmd+J). New here? The Home tab's Start an experiment button runs a short setup wizard (also available anytime via /wizard; it sets the organism, the campaign, and what you're trying to learn).
2. Enter plan mode: type /plan in the chat. The agent switches from operator to scientific collaborator: it won't touch hardware, it helps you design an experiment.
3. Describe what you want, in plain language. For example:
   "Follow GFP-tagged embryos from bean stage through elongation, imaging every 10 minutes, with a no-laser control — three embryos per condition."
   The agent drafts a campaign: a sequence of typed plan items (imaging 📷, bench 🧪, genetics 🧬, analysis 📊, decision points 🚦), each with concrete specs (strain, interval, laser power, Z-slices, target window, success criteria). Keep replying to refine it; /plan status shows progress and /plan exit returns to run mode.
4. Inspect it in the plan viewer. Open the Plans tab. Your campaign appears as a card; click it to open the plan document. Each item shows its status (○ planned · ◑ in progress · ● done) and specs; click one to see full details in the inspector. Switch layouts (document / board / graph / timeline) from the view controls, and browse plan versions as it evolves. (Typing /campaign in chat lists campaigns too.)
That's the loop: talk → plan → inspect. With hardware connected (drop --offline and start the device layer), the same campaign drives acquisition — and perception events can wake the agent to adjust it as the embryos develop.
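One way to picture a campaign of typed plan items is as plain data. The sketch below is hypothetical: `PlanItem`, `Campaign`, `ItemKind`, `Status`, and the spec keys are illustrative names, not Gently's plan schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List


class ItemKind(Enum):
    IMAGING = "imaging"
    BENCH = "bench"
    GENETICS = "genetics"
    ANALYSIS = "analysis"
    DECISION = "decision"


class Status(Enum):
    PLANNED = "planned"
    IN_PROGRESS = "in progress"
    DONE = "done"


@dataclass
class PlanItem:
    """One typed step of a campaign, with its concrete specs."""

    title: str
    kind: ItemKind
    specs: Dict[str, Any] = field(default_factory=dict)
    status: Status = Status.PLANNED


@dataclass
class Campaign:
    """An ordered sequence of plan items."""

    name: str
    items: List[PlanItem] = field(default_factory=list)

    def summary(self) -> str:
        """Progress line, loosely in the spirit of /plan status."""
        done = sum(1 for item in self.items if item.status is Status.DONE)
        return f"{self.name}: {done}/{len(self.items)} done"


# Spec keys and values below are made up to mirror the example request.
campaign = Campaign(
    "bean-to-elongation",
    [
        PlanItem(
            "Timelapse of GFP-tagged embryos",
            ItemKind.IMAGING,
            {"interval_min": 10, "embryos": 3, "start_stage": "bean"},
        ),
        PlanItem(
            "No-laser control",
            ItemKind.IMAGING,
            {"interval_min": 10, "embryos": 3, "laser": "off"},
        ),
        PlanItem("Review control vs. imaged embryos", ItemKind.DECISION),
    ],
)
campaign.items[0].status = Status.DONE
print(campaign.summary())
```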
| Guide | Audience | What you'll learn |
|---|---|---|
| Try Without Hardware | Everyone | Run the agent in 10 minutes — conversation, plan mode, perception |
| What Gently Can Do | Everyone | Perception, detection, plan mode, memory, mesh, safety |
| Build a Plugin | Developers | Create organism and hardware plugins for other modalities |
| Hardware Setup | Labs | Connect a diSPIM, start the device layer, first acquisition |
Four layers with strict downward-only dependencies. The harness (reusable agent framework) is separated from the application (microscopy agent), with organism and hardware as swappable plugins.
gently/
├── core/ # Layer 1: Foundation — zero domain knowledge
│ ├── event_bus.py # Async pub/sub messaging
│ ├── file_store.py # FileStore (file-based: YAML / JSONL / TIF)
│ ├── imaging.py # Projection, normalization, encoding
│ └── coordinates.py # Pixel/stage transforms
│
├── harness/ # Layer 2: Reusable agent framework
│ ├── tools/ # @tool decorator, ToolRegistry
│ ├── perception/ # VLM-based observation with reasoning traces
│ ├── memory/ # Persistent agent mind (campaigns, learnings)
│ ├── prompts/ # Prompt engineering and context injection
│ ├── detection/ # Event detection framework
│ ├── session/ # Session lifecycle, interaction logging
│ ├── plan_mode/ # Experimental design mode
│ ├── conversation.py # LLM conversation management
│ ├── bridge.py # WebSocket adapter
│ └── protocols.py # OrganismProtocol, HardwareProtocol
│
├── organisms/ # Layer 3: Swappable domain plugins
│ └── celegans/ # C. elegans stages, biology, detectors
├── hardware/
│ └── dispim/ # diSPIM devices, plans, config, device layer
│
├── app/ # Layer 4: The microscopy agent
│ ├── agent.py # MicroscopyAgent orchestrator
│ ├── tools/ # 19 domain-specific tool modules
│ └── orchestration/ # Timelapse, plan synthesis, ML subagent
│
├── ui/web/ # FastAPI viz server + web assets
├── mesh/ # Distributed multi-instrument coordination
├── ml/ # ML training infrastructure
└── analysis/ # Focus analysis utilities
Building a different microscopy agent (e.g. confocal + Drosophila) means writing a new organism plugin, a new hardware plugin, and optionally custom tools — the harness, core, and analysis layers are reused unchanged.
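The plugin boundary can be sketched with `typing.Protocol`. The method names below are assumptions made for illustration; the real `OrganismProtocol` and `HardwareProtocol` in gently/harness/protocols.py may define different members, and the stage labels are placeholders, not real biology.

```python
from typing import List, Protocol, runtime_checkable


@runtime_checkable
class OrganismProtocolSketch(Protocol):
    """Hypothetical contract an organism plugin would satisfy."""

    def name(self) -> str: ...
    def stages(self) -> List[str]: ...


@runtime_checkable
class HardwareProtocolSketch(Protocol):
    """Hypothetical contract a hardware plugin would satisfy."""

    def name(self) -> str: ...
    def capabilities(self) -> List[str]: ...


class DrosophilaSketch:
    """Example organism plugin; satisfies the protocol structurally."""

    def name(self) -> str:
        return "drosophila"

    def stages(self) -> List[str]:
        return ["stage_a", "stage_b"]  # placeholder labels


class ConfocalSketch:
    """Example hardware plugin; no inheritance from the harness is needed."""

    def name(self) -> str:
        return "confocal"

    def capabilities(self) -> List[str]:
        return ["z_stack", "timelapse"]


def describe_agent(organism: object, hardware: object) -> str:
    """Harness-side code that depends only on the protocols, never on plugins."""
    if not isinstance(organism, OrganismProtocolSketch):
        raise TypeError("organism plugin does not satisfy the protocol")
    if not isinstance(hardware, HardwareProtocolSketch):
        raise TypeError("hardware plugin does not satisfy the protocol")
    return f"{hardware.name()} + {organism.name()} ({len(organism.stages())} stages)"


print(describe_agent(DrosophilaSketch(), ConfocalSketch()))
```

Because the check is structural, plugins never import upward from the application layer, which keeps the dependency direction downward-only.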
We welcome contributions across the project:
Core Infrastructure
- Devices: Be careful. Changes here affect hardware safety. Add tests.
- Plans: Follow Bluesky conventions. Plans should be composable and device-agnostic.
- Simulated microscopes: Simulated hardware for testing across the stack without real instruments.
- Testing: Test coverage, integration tests, edge cases.
- Error recovery: Better failure modes, graceful degradation.
- Performance: Making things faster and more efficient.
AI & Agents
- Agent/perception: Experiment freely. The safety stack has your back. The harness layer (gently/harness/) is designed for reuse.
- Design patterns: Reusable patterns for LLM/agentic control in microscopy. If it can be a module, even better.
- Cognitive models: Thinking cognitively about microscopy and implementing cognitive computing models.
- Local LLMs: We currently use cloud providers. Support for local models would be valuable.
- Benchmark datasets: Ground truth annotations for perception. The agent-developing-agent loop needs data.
Architecture & Scope
- System architecture: Ideas on how to structure agentic microscopy systems.
- Sample abstractions: The Embryo object is our first sample type. What works for cells, tissue, other specimens?
- Other microscopy platforms: Write a new hardware plugin (gently/hardware/<name>/) and organism plugin (gently/organisms/<name>/). Confocal, widefield, other light-sheet systems, electron microscopy.
- Multi-modal integration: Combining microscopy with other data sources (genomics, proteomics, etc.).
Human Interface
- UI/UX: The web interface, agent experience, and visualization all need work.
- HCI research: How do biologists work with intelligent instruments?
- Documentation: Tutorials, examples, better explanations for newcomers.
- Accessibility: Making the interface accessible to users with disabilities.
- Internationalization: Supporting other languages for the agent.
Coding agents are welcome contributors.
Questions or ideas? Open an issue.
Gently was developed collaboratively with members of the Shroff Lab, Magdalena Schneider (AI@HHMI), and Subindev Devadasan.
These papers provide theoretical background for gently's approach:
- Kesavan, P.S. & Nordenfelt, P. "From observation to understanding: A multi-agent framework for smart microscopy." Journal of Microscopy (2025). DOI: 10.1111/jmi.70063
- Kesavan, P.S. & Bohra, D. "deepthought: domain driven design for microscopy with applications in DNA damage responses." bioRxiv (2025). DOI: 10.1101/2025.02.25.639997
One microscope, made intelligent. gently gives a microscope perception and reasoning — it understands what it's imaging, not just what it's measuring. A biologist talks to it in natural language. The safety stack means you can trust it.
Now multiply that. Every microscope running gently is an autonomous agent — it can perceive, reason, and act on its own instrument. Each one is a node with local intelligence.
Connect the nodes. gently-meta is a registry where these agents discover each other. Not a central brain — a shared awareness. Each instrument advertises what it can do, what it's working on, what it has seen.
Science stops being bottlenecked by single instruments. A genomics facility in Cambridge finds something unexpected. Microscopes in Boston, Tokyo, and Heidelberg are roped in to validate it across diverse samples and imaging modalities — automatically. The discovery-to-validation loop that currently takes months of emails and facility bookings happens in hours.
Instruments become a shared, coordinated resource. Discoveries in one modality trigger experiments in another. No single lab needs to own every capability. The collective sees more than any individual.
Copyright © 2026 Howard Hughes Medical Institute.
Gently is licensed under the GNU General Public License v3.0 or later (GPL-3.0-or-later) — see the LICENSE file.

