Generalize harness beyond stage classification: task routing and perception-orchestrator interface

## The question

The perception harness currently does one thing: `(image) → stage`. But perception in the broader microscopy system is much wider:

- **Stage classification** (current): "what developmental stage is this?"
- **Embryo detection**: "what's in this field of view? where are the embryos?"
- **Focus assessment**: "is this volume in focus? what's the best focal plane?"
- **Quality control**: "is this volume usable or was there a motion artifact?"
- **Calibration perception**: "is the embryo well-covered by the scan range? are the two-point galvo-piezo calibration measurements being made at the right spots? is the resulting configuration producing good volumes?"

These are all perception tasks. They share harness infrastructure (VLM calls, image handling, prompt caching) but differ in prompts, output schemas, and available tools.

## Generalizability

The harness already adapts per stage by selecting a configuration of (prompt, representation, examples, model, tools). It could adapt per **task type** in the same way: stage classification is one task configuration, embryo detection is another, calibration assessment is a third. The harness itself stays the same.
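A minimal sketch of what "one configuration per task type" could look like. The `TaskConfig` class, the `TASK_CONFIGS` registry, the `config_for` helper, and every value inside them (representation names, the `placeholder-vlm` model id, the `view_plane` tool) are hypothetical illustrations, not the harness's actual types; only the five fields come from the tuple above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskConfig:
    """Hypothetical container for the (prompt, representation, examples, model, tools) tuple."""
    prompt: str
    representation: str   # how the image is rendered for the model (assumed names)
    model: str            # placeholder identifier, not a real model name
    examples: tuple = ()  # few-shot examples, empty by default
    tools: tuple = ()     # tool names the model may call, empty by default


# Assumed registry keyed by task type; prompts are taken from the task list in this issue.
TASK_CONFIGS = {
    "classify": TaskConfig(
        prompt="What developmental stage is this?",
        representation="max_projection",
        model="placeholder-vlm",
    ),
    "detect": TaskConfig(
        prompt="What's in this field of view? Where are the embryos?",
        representation="field_overview",
        model="placeholder-vlm",
    ),
    "assess_focus": TaskConfig(
        prompt="Is this volume in focus? What's the best focal plane?",
        representation="z_stack_montage",
        model="placeholder-vlm",
        tools=("view_plane",),
    ),
}


def config_for(task: str) -> TaskConfig:
    """Look up the configuration for a task, failing loudly on unknown tasks."""
    try:
        return TASK_CONFIGS[task]
    except KeyError:
        known = ", ".join(sorted(TASK_CONFIGS))
        raise ValueError(f"unknown perception task {task!r}; known tasks: {known}") from None
```

Under this assumption, adding a new perception task means adding a registry entry rather than touching harness code.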

This raises the routing question: how does the harness know which task to run? Options:
- Task parameter on the call: `perceiver(task="detect", image=...)`
- Task-specific perceiver instances: `stage_perceiver`, `detection_perceiver`
- The orchestrator constructs the appropriate perceive function for each task
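The three options are less different than they look: each can be layered on a single dispatch table. The sketch below assumes stub handlers (`_classify`, `_detect`) that return canned dictionaries instead of calling a VLM; `make_perceive` is a hypothetical name for the factory the orchestrator would call in the third option.

```python
from functools import partial
from typing import Any, Callable, Dict

Result = Dict[str, Any]


def _classify(image: Any) -> Result:
    # Stub: a real handler would call the VLM with the classification config.
    return {"task": "classify", "stage": None}


def _detect(image: Any) -> Result:
    # Stub: a real handler would call the VLM with the detection config.
    return {"task": "detect", "embryos": []}


_HANDLERS: Dict[str, Callable[[Any], Result]] = {
    "classify": _classify,
    "detect": _detect,
}


def perceiver(task: str, image: Any) -> Result:
    """Option 1: one entry point that dispatches on a task parameter."""
    if task not in _HANDLERS:
        raise ValueError(f"unknown task {task!r}")
    return _HANDLERS[task](image)


# Option 2: task-specific perceiver instances, here just pre-bound calls.
stage_perceiver = partial(perceiver, "classify")
detection_perceiver = partial(perceiver, "detect")


def make_perceive(task: str) -> Callable[[Any], Result]:
    """Option 3: the orchestrator asks for a perceive function per task."""
    if task not in _HANDLERS:
        raise ValueError(f"unknown task {task!r}")
    return _HANDLERS[task]
```

The practical difference is where an unknown task fails: at call time for option 1, at construction time for option 3.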

## Perception-orchestrator communication

Currently the interface is a plain Python function call with a return value: `perceiver(embryo_id, timepoint, image, timestamp) → PerceptionOutput`.

A richer interaction pattern emerges for multi-step tasks like calibration:
```
orchestrator: perceiver.detect(field_image)     → embryo positions
orchestrator: move_stage(x1, y1)
orchestrator: perceiver.assess_focus(volume)    → focus quality
orchestrator: perceiver.classify(image)         → developmental stage
```

The intelligence is in the orchestrator's plan, not in the communication channel. The perceiver is a tool the orchestrator uses — sometimes for classification, sometimes for detection. The "dance" between them is the orchestrator making multiple calls with different tasks, not a conversation between two agents.
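To make "the intelligence is in the plan" concrete, here is a sketch of an orchestrator routine driving a perceiver through the sequence above. `StubPerceiver` returns canned values and records its calls; `survey`, `acquire`, and the `min_focus` threshold are hypothetical names introduced for illustration, and classifying the acquired volume directly is a simplification.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass
class StubPerceiver:
    """Stand-in perceiver: canned answers, plus a log of which tasks were invoked."""
    calls: List[str] = field(default_factory=list)

    def detect(self, field_image: Any) -> List[Tuple[float, float]]:
        self.calls.append("detect")
        return [(10.0, 20.0)]  # one embryo at a made-up position

    def assess_focus(self, volume: Any) -> float:
        self.calls.append("assess_focus")
        return 0.9  # made-up focus score in [0, 1]

    def classify(self, image: Any) -> str:
        self.calls.append("classify")
        return "stage-placeholder"


def survey(perceiver: StubPerceiver, field_image: Any,
           acquire: Callable[[], Any],
           move_stage: Callable[[float, float], None],
           min_focus: float = 0.5) -> List[dict]:
    """The plan lives here: detect, move, check focus, classify only if usable."""
    results = []
    for x, y in perceiver.detect(field_image):
        move_stage(x, y)
        volume = acquire()
        focus = perceiver.assess_focus(volume)
        stage = perceiver.classify(volume) if focus >= min_focus else None
        results.append({"position": (x, y), "focus": focus, "stage": stage})
    return results
```

Every perceiver call is an independent request-response; the branching on focus quality is ordinary orchestrator code.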

## Is a function-call API the right level?

Probably yes. Perception is fundamentally request-response. The alternative — two LLM agents conversing in natural language — adds latency and cost for marginal benefit. Structured function calls are more reliable and faster.

The richness comes from:
1. The perceiver accumulating context over time (session)
2. The orchestrator choosing which perception task to invoke
3. The result being rich enough to inform the orchestrator's next decision
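The first point, accumulated context, does not require a conversational channel. A sketch, assuming a hypothetical `PerceptionSession` object that the perceiver owns and consults before each call; the class, its method names, and the `last_n` window are illustrative choices.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, DefaultDict, List, Tuple


@dataclass
class PerceptionSession:
    """Hypothetical per-run memory: past results keyed by embryo."""
    history: DefaultDict[str, List[Tuple[int, str, Any]]] = field(
        default_factory=lambda: defaultdict(list)
    )

    def record(self, embryo_id: str, timepoint: int, task: str, result: Any) -> None:
        """Store one perception result so later calls can refer back to it."""
        self.history[embryo_id].append((timepoint, task, result))

    def context_for(self, embryo_id: str, task: str, last_n: int = 3) -> List[Tuple[int, Any]]:
        """Return the most recent (timepoint, result) pairs for this embryo and task."""
        entries = self.history.get(embryo_id, [])
        matching = [(t, r) for t, kind, r in entries if kind == task]
        return matching[-last_n:]
```

The perceiver could fold `context_for(...)` into its prompt while the orchestrator still sees a plain function call.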

## What this means for the harness

The harness should be able to handle different perception tasks without being rewritten for each one. The experiment framework already supports this — different experiments can implement different task types. The routing/configuration layer is what's missing.

## Open questions

- Should calibration perception (currently in gently's `calibration_tools.py`: embryo coverage assessment, galvo-piezo two-point calibration, volume quality checks) move into gently-perception?
- How does task routing interact with the experiment framework? Are detection experiments a dimension, or a separate concern?
- Does the perceive function signature need to change for non-classification tasks (different output schema)?
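One possible shape for the last question, offered as a sketch rather than a decision: keep a single output envelope and make only the payload task-specific. The actual fields of `PerceptionOutput` are not specified in this issue, so everything below (`StageResult`, `DetectionResult`, the envelope fields, the optional `embryo_id`) is an assumption.

```python
from dataclasses import dataclass
from typing import Generic, Optional, Tuple, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class StageResult:
    """Hypothetical payload for stage classification."""
    stage: str


@dataclass(frozen=True)
class DetectionResult:
    """Hypothetical payload for embryo detection."""
    positions: Tuple[Tuple[float, float], ...]


@dataclass(frozen=True)
class PerceptionOutput(Generic[T]):
    """Assumed envelope: shared metadata plus a task-specific payload."""
    task: str
    timepoint: int
    timestamp: float
    payload: T
    embryo_id: Optional[str] = None  # detection runs before any embryo id exists


# Usage: same envelope, different payload types.
stage_out = PerceptionOutput(
    task="classify", timepoint=3, timestamp=0.0,
    payload=StageResult(stage="stage-placeholder"), embryo_id="e1",
)
detect_out = PerceptionOutput(
    task="detect", timepoint=0, timestamp=0.0,
    payload=DetectionResult(positions=((10.0, 20.0),)),
)
```

With this shape the call signature would change only in that `embryo_id` becomes optional; whether that is acceptable is part of the open question.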
