Skip to content

upneja/deep-research

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 

Repository files navigation

Deep Research

An adversarial multi-agent analysis skill for Claude Code.

A single AI response gives you one line of reasoning. It anchors early, hedges, and lands on a safe middle-ground answer. For hard questions — strategy, architecture, real trade-offs — that's not enough.

Deep Research runs the question through parallel advocate/critic agent pairs, each arguing from a genuinely distinct school of thought, then synthesizes what survived the cross-examination into a layered brief. Every layer gets adversarially checked, including the synthesis itself.

This repo contains the full implementation: a SKILL.md orchestration spec plus four agent prompt templates. It runs as a personal skill inside Claude Code — there's no separate runtime or server.


How It Works

Question in
    │
    ▼
GATE — score on 3 binary axes
(Ambiguity / Consequence / Breadth — 2+ to fire)
    │
    ▼
"How hard should I go?" → Light / Standard / Heavy
    │
    ▼
Propose N directions (distinct lenses) → user confirms
    │
    ▼
WAVE 1 — N advocate agents, in parallel
each builds the strongest case for one lens
    │
    ▼
WAVE 2 — N critic agents, in parallel
each attacks one advocate's actual output
    │
    ▼
Orchestrator — judges the debates, takes a position
    │
    ▼
Meta-steelman — attacks the synthesis itself
    │
    ▼
Layered output: verdict → surviving args → killed args → meta-critique

1. The gate

Every incoming question is scored on three binary axes:

Axis Scores 1 if...
Ambiguity Multiple valid interpretations or approaches exist
Consequence Getting it wrong has real downstream cost
Breadth Genuinely distinct schools of thought bear on the answer

Two or more points and the skill offers to fire: "This looks like it'd benefit from deep research. How hard should I go?" The user picks an intensity — or declines, and the skill stands down. Simple factual questions never trip the gate.

2. Direction proposal

The skill proposes N directions and waits for confirmation. Each direction is a lens, not an option: "operational complexity" and "team autonomy" are lenses; "microservices vs. monolith" restated two ways is not.

3. Wave 1 — advocates (parallel)

N agents launch simultaneously, one per direction, using advocate-prompt.md. Each builds the strongest possible case from its assigned lens — no hedging, no weasel words, evidence-backed, capped at 800 words.

4. Wave 2 — critics (parallel)

After Wave 1 completes, N critic agents launch using critic-prompt.md. Each receives one advocate's full output and goes after structural flaws: weak assumptions, cherry-picked evidence, hidden costs, critical omissions. Running Wave 2 strictly after Wave 1 is the load-bearing design choice — critics attack real arguments, not strawmen.

5. Orchestrator

A single agent (orchestrator-prompt.md) reads all N debates and judges them — which arguments survived intact, which got dismantled, where directions unexpectedly converged, where surviving arguments still conflict. It must take a position, not average everything into mush.

6. Meta-steelman

The orchestrator is a single point of failure, so a final agent (meta-steelman-prompt.md) attacks the synthesis: cherry-picking, false convergence, conclusions the debates don't actually support. If the synthesis is solid, it says so. If not, it amends the verdict.


Intensity Levels

Intensity Directions Advocates Critics Orchestrator Meta-steelman Total agents
Light 3 3 3 1 1 8
Standard 5 5 5 1 1 12
Heavy 8+ 8+ 8+ 1 1 18+

Advocates and critics run in parallel within each wave, so N agents cost roughly the wall-clock time of one. You're buying depth, not waiting for it.


Output

Layered so the reader stops at their depth:

  1. Verdict — 3 lines. Act on this alone if it's clear.
  2. Surviving arguments — what held up under attack and why.
  3. Killed arguments — what was considered and dismantled, one line each.
  4. Meta-critique — what might be wrong with the conclusion itself.
  5. Raw debates — full advocate/critic exchanges, on request.

Install

Copy the skill/ directory into your Claude Code personal skills folder:

git clone https://github.com/upneja/deep-research.git
cp -r deep-research/skill ~/.claude/skills/deep-research

Claude Code picks it up automatically. From then on, any question that trips the gate gets the "how hard should I go?" prompt. You can also invoke it directly by asking for "deep research on [question]". Saying "just answer" or "skip" stands it down.

Requires Claude Code with subagent (Task/Agent tool) support. No other dependencies.


Repo Layout

deep-research/
├── README.md
└── skill/
    ├── SKILL.md                  # Gate logic + orchestration flow
    ├── advocate-prompt.md        # Wave 1 agent template
    ├── critic-prompt.md          # Wave 2 agent template
    ├── orchestrator-prompt.md    # Synthesis agent template
    └── meta-steelman-prompt.md   # Final adversarial check template

The implementation is prompts and orchestration instructions — SKILL.md tells Claude Code when to fire, how to dispatch agents in parallel waves, and how to assemble the layered output. The four templates use {{PLACEHOLDER}} substitution for the question, direction, and upstream agent outputs.


When to Use It

Good fits: meaningful trade-offs with no obvious right answer, decisions with real downstream cost, questions where multiple disciplines disagree, stress-testing a position you already hold.

Bad fits: factual lookups, anything where speed beats depth, questions with one established answer. The gate exists to keep these out.


Design Principles

  • Adversarial review at every layer. Advocates get critics, the synthesis gets a meta-steelman. Remove it from any layer and you're back to theater.
  • Critics engage with the actual argument. Wave ordering guarantees it.
  • The synthesis judges, it doesn't average. Directions that lost their debates get discounted, not blended in.
  • Output respects the reader. The verdict comes first; depth is opt-in.

Built with Claude Code.

About

Adversarial multi-agent analysis engine for Claude Code

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors