[DSP-1.1] swarm: orchestrator over-shards by modulo-N input split instead of semantic decomposition #64

@justrach

Sub-issue of #55 (DSP-1). Found in smoke test of PR #61.

Observation

Ran npm run swarm (n=3, gpt-5.5 via Codex) against the default task: "List the .rs files under sdk/typescript/src and describe what each does in ONE sentence."

The orchestrator decomposed it into three nearly-identical subtasks:

1. Enumerate every .rs file under sdk/typescript/src recursively, sort the file paths
   lexicographically, select the files whose 0-based sort index... [shard 0/3]
2. ... select the files whose 0-based sort index... [shard 1/3]
3. ... select the files whose 0-based sort index... [shard 2/3]

The directory has 2 .rs files, so worker 3 got an empty shard and returned "no files". The synthesizer had to resolve the disagreement: "Worker 3's 'no files' result applies only to its modulo shard".
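The empty shard follows directly from the arithmetic. A minimal sketch, assuming the subtasks assign sorted index i to worker i % n (the subtask text above is elided, so the exact rule is an assumption); `shardByModulo` and the file names are hypothetical:

```typescript
// Hypothetical reproduction of the sharding the subtasks describe:
// sort the paths, then give worker k the files whose 0-based index i
// satisfies i % n === k. The real rule is elided in the issue text.
function shardByModulo(paths: string[], n: number): string[][] {
  const sorted = paths.slice().sort();
  const shards: string[][] = [];
  for (let k = 0; k < n; k++) {
    shards.push(sorted.filter((_, i) => i % n === k));
  }
  return shards;
}

// Placeholder names; the issue only states that the directory has 2 .rs files.
const exampleShards = shardByModulo(["a.rs", "b.rs"], 3);
console.log(exampleShards.map((s) => s.length)); // shard sizes: 1, 1, 0
```

With 2 files and n=3, the third shard is always empty, regardless of the file names.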

Why it broke

The current orchestrator prompt asks for "EXACTLY ${n} independent subtasks that can run in parallel without coordination". gpt-5.5 interpreted "independent" as "non-overlapping input shards" rather than "different aspects of the task". This is a reasonable LLM read of an ambiguous instruction.

Fix direction

Tighten the orchestrator prompt to push toward semantic decomposition:

  • Add a positive instruction: subtasks should attack the task from different angles (e.g. exploration vs analysis vs verification), or focus on disjoint concerns, not disjoint inputs.
  • Add a negative example explicitly forbidding modulo-N or index-based input sharding.
  • Optionally: if the task input space is small (< N items), the orchestrator should return fewer than N subtasks rather than padding with empty shards.
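One way the three points above could be folded into a prompt is sketched below. This is not the prompt in swarm.ts (which this issue does not show in full); `buildOrchestratorPrompt` and its wording are hypothetical and would need tuning against the model.

```typescript
// Hypothetical prompt builder; the function name and wording are assumptions,
// not the existing orchestrator prompt.
function buildOrchestratorPrompt(task: string, n: number): string {
  return [
    // Allow fewer than n subtasks instead of demanding exactly n.
    `Split the task below into AT MOST ${n} independent subtasks that can run in parallel without coordination.`,
    // Positive instruction: different angles or concerns, not different inputs.
    "Each subtask should attack the task from a different angle (for example exploration, analysis, or verification) or cover a disjoint concern, not a disjoint slice of the input.",
    // Negative example: explicitly forbid positional splitting.
    "Do NOT split the input by position: no modulo-N or index-based partitioning of files or items.",
    `If the task does not justify ${n} meaningful subtasks, return fewer rather than padding with empty ones.`,
    "",
    `Task: ${task}`,
  ].join("\n");
}

console.log(buildOrchestratorPrompt("List the .rs files under sdk/typescript/src and describe what each does in ONE sentence.", 3));
```

Replacing "EXACTLY" with "AT MOST" is a design choice that relies on swarm.ts handling dynamic N, as noted in the acceptance criteria.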

Acceptance criteria

  • Re-running the same default task produces 3 distinct subtasks whose content does not overlap (different concerns, not different input slices).
  • No subtask contains any of the phrases "shard", "modulo", or "0-based sort index".
  • If the orchestrator can't justify N meaningful subtasks, it returns fewer (swarm.ts handles dynamic N already).
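The mechanical parts of these criteria could be checked automatically in a smoke test. A rough sketch, assuming the orchestrator output is available as an array of subtask strings; `checkSubtasks` and `SubtaskCheck` are hypothetical names, and the duplicate check only catches near-verbatim repeats, not semantic overlap:

```typescript
// Phrases taken from the acceptance criteria above.
const FORBIDDEN_PHRASES: string[] = ["shard", "modulo", "0-based sort index"];

interface SubtaskCheck {
  ok: boolean;
  problems: string[];
}

// Hypothetical checker: flags forbidden phrases, duplicate subtasks,
// and a subtask count outside 1..n (fewer than n is allowed).
function checkSubtasks(subtasks: string[], n: number): SubtaskCheck {
  const problems: string[] = [];
  if (subtasks.length < 1 || subtasks.length > n) {
    problems.push(`expected between 1 and ${n} subtasks, got ${subtasks.length}`);
  }
  const seen: string[] = [];
  subtasks.forEach((text, i) => {
    // Normalize case and whitespace so trivial variations still match.
    const normalized = text.toLowerCase().replace(/\s+/g, " ").trim();
    for (const phrase of FORBIDDEN_PHRASES) {
      if (normalized.indexOf(phrase) !== -1) {
        problems.push(`subtask ${i + 1} contains "${phrase}"`);
      }
    }
    if (seen.indexOf(normalized) !== -1) {
      problems.push(`subtask ${i + 1} duplicates an earlier subtask`);
    }
    seen.push(normalized);
  });
  return { ok: problems.length === 0, problems: problems };
}

const sampleCheck = checkSubtasks(
  ["Enumerate the files and select those in [shard 0/3]", "Describe each file"],
  3
);
console.log(sampleCheck.problems);
```

Whether the subtasks are meaningfully different in concern still needs a human or model-based review; string checks only rule out the specific failure seen in this run.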

    Labels

    severity: high (Significant impact; core functionality is impaired.)
    type: bug (Something isn't working.)