Problem
When some agents fail or time out, the user must re-run the entire ensemble to get more results. This wastes time and API credits by re-running agents that already succeeded.
Common scenarios:
- 1 of 5 agents timed out; user wants a 5th successful result
- 2 agents errored due to a transient issue (rate limit, network blip); user wants to retry just those
- User wants to top up: had 3 agents, wants 2 more without losing the first 3
Proposed Solution
thinktank run --retry
Re-run only the agents that failed, timed out, or errored in the most recent run:
$ thinktank run --retry
Found last run: 5 agents (2 timed out, 1 error)
Re-running 3 failed agents...
Agent 3 (was: timeout) → ✓ done (tests passed)
Agent 4 (was: timeout) → ✓ done (tests passed)
Agent 5 (was: error) → ✓ done (tests failed)
Merging with 2 successful results from previous run...
The merged result set (original successes + new attempts) goes through convergence analysis together.
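A minimal sketch of the selection and merge steps described above, to make the intended behaviour concrete. The `AgentResult` shape, the status strings, and both function names are assumptions for illustration; the actual run-file schema is not specified in this issue.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical status values; the real run file may use different names.
FAILED_STATUSES = {"failed", "timeout", "error"}


@dataclass
class AgentResult:
    agent_id: int
    status: str                      # assumed: "success", "failed", "timeout", "error"
    tests_passed: Optional[bool] = None


def select_retry_targets(previous: List[AgentResult]) -> List[int]:
    """Return the IDs of agents whose last attempt did not succeed."""
    return [r.agent_id for r in previous if r.status in FAILED_STATUSES]


def merge_results(previous: List[AgentResult],
                  retried: List[AgentResult]) -> List[AgentResult]:
    """Keep the original successes and add the new attempts."""
    merged = {r.agent_id: r for r in previous if r.status == "success"}
    for r in retried:
        # A new attempt is included whether or not it succeeded this time.
        merged[r.agent_id] = r
    return [merged[agent_id] for agent_id in sorted(merged)]
```

The merged list is what would be handed to convergence analysis. Note that an agent whose tests failed still counts as a completed attempt here, matching the "done (tests failed)" line in the sample output.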
thinktank run --top-up <N>
Add N more agents to the existing result set and re-run convergence on the combined results:
$ thinktank run --top-up 2
Adding 2 more agents to last run (was: 3 agents)...
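One way the top-up step could assign IDs to the new agents without disturbing the existing results is sketched below. The function names and the "next free integer" ID scheme are assumptions, not part of the proposal.

```python
from typing import Iterable, List


def plan_top_up(existing_ids: Iterable[int], n: int) -> List[int]:
    """Pick n fresh agent IDs that do not collide with the previous run."""
    if n < 1:
        raise ValueError("top-up count must be a positive integer")
    # Assumed scheme: continue numbering after the highest existing ID.
    start = max(existing_ids, default=0) + 1
    return list(range(start, start + n))


def combine(existing: list, new: list) -> list:
    """Return the combined result set that convergence would be re-run on."""
    return list(existing) + list(new)
```

With three agents already recorded as 1, 2 and 3, `plan_top_up([1, 2, 3], 2)` yields IDs 4 and 5, and the first three results are left untouched.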
Acceptance Criteria
- --retry identifies failed/timeout/error agents from the latest run file
- --top-up <N> runs N additional new agents and merges with prior run
- --retry and --top-up print a clear error
- --run <N> can be combined to retry from a specific historical run
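A rough sketch of how the run to retry or top up might be chosen when --run <N> is or is not given. The issue does not state when the "clear error" should fire, so the two conditions below (no previous run at all, or a requested run that does not exist) are guesses, and `pick_run`, `NoSuchRunError`, and the message wording are hypothetical.

```python
from typing import List, Optional


class NoSuchRunError(Exception):
    """Raised when there is no usable previous run (assumed error condition)."""


def pick_run(available_runs: List[int], run_number: Optional[int] = None) -> int:
    """Return the run to operate on: the requested one, else the latest."""
    if not available_runs:
        # Assumption: having no prior run at all is an error case.
        raise NoSuchRunError("No previous run found; nothing to retry or top up.")
    if run_number is None:
        return max(available_runs)   # default to the most recent run
    if run_number not in available_runs:
        raise NoSuchRunError(f"Run {run_number} not found.")
    return run_number
```

Resolving the run in one place would let --retry and --top-up share the same lookup and the same error text.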