convince me, or your idea doesn't ship
A Claude Code plugin and agent‑agnostic skill that pressure‑tests your ideas and your code. Built for programmers first — it cross‑examines designs, refactors, migrations, diffs, and pull requests — but works on any proposal (business, product, architecture). It treats every claim, including the implicit "this code is ready to merge", as something that must be defended on the merits. One substantive objection per turn. No approval until every objection is answered. Ends with a shareable VERDICT.
Multi‑agent debate beats single‑agent across six benchmarks (Du et al., 2023).
Claude, by default, is collaborative. That's great for most work and terrible for pressure-testing an idea. When you want a second opinion that actually pushes back, "what could go wrong with this?" produces a polite list, not a real challenge.
objection flips the posture. Claude becomes an adversarial reviewer. Your job is to defend the claim. Its job is to make you prove it, and to refuse to fold for loudness, authority, or vague reassurance.
claude plugin marketplace add AmrAnwar/objection && claude plugin install objection
git clone https://github.com/AmrAnwar/objection ~/.claude/plugins/objection
Restart Claude Code. The skill and /objection command are now available.
git clone https://github.com/AmrAnwar/objection .claude/plugins/objection
Commit it to your repo. Everyone with access to the team repository gets it on the next session start.
Works in Cursor, Codex, Windsurf, Amp, Claude Code, and anything else supported by Anthropic's skills CLI.
npx skills add https://github.com/AmrAnwar/objection
Say any of the following and Claude enters objection mode:
- objection
- cross-examine this
- challenge me
- poke holes
- steelman the opposition
- devil's advocate this
- grill me
On a design proposal:
/objection We should migrate the jobs queue from Redis to SQS.
On a diff or PR (code‑review mode; the skill switches automatically when it sees a diff, file path, or PR reference):
/objection review the staged changes
/objection PR #482
/objection auth/login.py: does this rate limiter actually hold under multiple workers?
Any of these render the verdict:
- I rest my case
- verdict
- end session
- that's all
- we're done
/verdict
objection is useful for specific tasks but may not suit everyday use. You can disable and re-enable it with these two commands:
claude plugin disable objection
claude plugin enable objection
What it does
- Restates your claim in one sentence so the record is clear (in code‑review mode, the implicit claim is "this diff is correct, safe, and ready to merge").
- Raises one specific, substantive objection per turn.
- Design categories: missing evidence, simpler alternative, unexamined tradeoff, unaddressed edge case, weak assumption, ignored prior art.
- Code‑review categories: missing test, unhandled edge case in code, error‑handling gap, security risk, concurrency / state risk, contract break, pattern inconsistency, dead complexity, reversibility, performance regression. Every code objection cites file:line.
- Refuses to move on until the objection on the table is answered.
- In code‑review mode, treats "I'll fix it in a follow‑up" as a deferred (unresolved) objection — only an actual edit in the diff or a pointer to where it's already handled counts as resolved.
- When overruling a response, cites the specific unresolved objection and says what would actually answer it.
- Ends with a VERDICT artifact: claim, numbered objections, resolved/unresolved status, final ruling, and the strongest unresolved objection quoted verbatim.
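The bookkeeping behind these rules can be sketched in a few lines of Python. This is only an illustrative model, not the plugin's implementation: the skill itself is a prompt in SKILL.md, and the names `Objection`, `Session`, `ruling`, and `strongest_unresolved` are hypothetical. Two assumptions are baked in: a session with no objections raised does not pass (matching "never approve on first ask"), and the "strongest" unresolved objection is simply the first one, standing in for the reviewer's judgement.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Objection:
    category: str           # e.g. "missing evidence", "security risk"
    text: str               # the objection as raised, quoted verbatim in the verdict
    resolved: bool = False  # a deferred "follow-up" promise leaves this False


@dataclass
class Session:
    claim: str
    objections: List[Objection] = field(default_factory=list)

    def ruling(self) -> str:
        # Assumption: no objections raised yet means no approval yet.
        if self.objections and all(o.resolved for o in self.objections):
            return "PASSED"
        return "FAILED"

    def strongest_unresolved(self) -> Optional[Objection]:
        # Assumption: "strongest" is approximated by the first unresolved objection.
        for objection in self.objections:
            if not objection.resolved:
                return objection
        return None


session = Session("A Redis cache will speed up the profile endpoint.")
session.objections.append(Objection("missing evidence", "What is the p95?", resolved=True))
session.objections.append(Objection("unexamined tradeoff", "What is the invalidation strategy?"))
print(session.ruling())  # one objection is still open, so the ruling is FAILED
```

The point of the model is that a response never flips the ruling by itself; only marking a specific objection resolved does.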
What it will not do
- Approve on first ask. Ever.
- Stack objections (one at a time).
- Concede to tone, repetition, authority ("trust me", "senior said so", "the team agreed"), or vague assurances ("it'll be fine", "we'll handle it later").
- Generate code, designs, or alternatives. It challenges; it does not build.
- Break character. Meta-complaints get redirected back to the objection on the table.
you: /objection We should add a Redis cache in front of the users table to speed
up the profile endpoint.
claude: The claim: a Redis cache in front of the users table will meaningfully speed
up the profile endpoint.
Objection, missing evidence. What is the current p50 and p95 latency of
the profile endpoint, and what fraction of that is database time? Without
that, we don't know whether caching is treating the right bottleneck.
you: It'll definitely help, profile queries are slow.
claude: Overruled, the evidence objection stands. "Slow" is not a number and
"definitely" is not a measurement. Cite a profile, a trace, or a query
time. Otherwise the claim that caching addresses the bottleneck is
unsupported.
you: p95 is 480ms, of which 410ms is the join against user_prefs. Trace attached.
claude: Sustained on evidence. The bottleneck is real and sits in the DB.
Objection, unexamined tradeoff. Cache invalidation on profile edits.
A user updates their display name and sees the old value on the next
page load. What's the invalidation strategy, and what staleness is
acceptable?
you: I rest my case.
claude: ═══════════════════════════════════════
VERDICT
═══════════════════════════════════════
IDEA
A Redis cache in front of the users table will speed up the profile endpoint.
OBJECTIONS
1. Missing evidence of bottleneck: resolved
2. Cache invalidation strategy: unresolved
RESULT: FAILED
At least one objection still stands.
Strongest unresolved objection:
"Cache invalidation on profile edits. A user updates their display
name and sees the old value on the next page load. What's the
invalidation strategy, and what staleness is acceptable?"
What would answer it:
A specific invalidation trigger (write-through, TTL with bound,
explicit bust on update) and an acceptable staleness window.
═══════════════════════════════════════
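For illustration, an answer to the unresolved objection above might combine two of the options the verdict names: a TTL that bounds staleness and an explicit bust on update. The sketch below is an assumption-laden stand-in, not part of the plugin: `ProfileCache`, `load`, and the 30 second TTL are hypothetical, and an in-process dict plays the role that Redis would play in the real proposal.

```python
import time
from typing import Callable, Dict, Tuple


class ProfileCache:
    """Cache-aside reads with a TTL bound and an explicit bust on update."""

    def __init__(self, load: Callable[[int], dict], ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load          # hypothetical loader that hits the database
        self._ttl = ttl_seconds    # upper bound on staleness if a bust is missed
        self._clock = clock
        self._entries: Dict[int, Tuple[dict, float]] = {}  # user_id -> (profile, expires_at)

    def get(self, user_id: int) -> dict:
        entry = self._entries.get(user_id)
        if entry is not None and self._clock() < entry[1]:
            return entry[0]                      # fresh enough: serve from cache
        profile = self._load(user_id)            # miss or expired: reload
        self._entries[user_id] = (profile, self._clock() + self._ttl)
        return profile

    def invalidate(self, user_id: int) -> None:
        # Call this from the profile-update path so edits show up immediately.
        self._entries.pop(user_id, None)


db = {1: {"display_name": "Ada"}}
cache = ProfileCache(load=lambda user_id: dict(db[user_id]))
cache.get(1)                           # warms the cache
db[1]["display_name"] = "Grace"        # profile edit
cache.invalidate(1)                    # explicit bust on update
print(cache.get(1)["display_name"])    # reloaded from the database after the bust
```

Stating the TTL and the bust trigger explicitly is what turns "we'll handle invalidation" into something the reviewer can accept or challenge.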
A typical loop with a coding agent: the agent writes the patch, you run /objection on it, and you only ship once the diff has actually answered the objections.
you: /objection review this diff:
# auth/login.py
+ATTEMPTS = {}
+
+def login(req):
+    ip = req.remote_addr
+    ATTEMPTS[ip] = ATTEMPTS.get(ip, 0) + 1
+    if ATTEMPTS[ip] > 5:
+        return 429
+    return _do_login(req)
claude: The claim: the diff in auth/login.py correctly rate-limits the login
endpoint and is safe to merge.
Objection, concurrency / state risk (auth/login.py:1). ATTEMPTS is a
process-local dict mutated without a lock and never decays. Under
multiple workers it under-counts, and the counter grows unbounded —
a slow memory leak that also makes the limit per-process, not per-IP.
What's preventing both?
you: I'll add a Redis-backed counter in a follow-up PR.
claude: Overruled, the concurrency objection stands. A follow-up is a promise,
not a fix. The diff on the table is the one that ships; in its current
form, the limiter is bypassable across workers and leaks memory. Either
swap the store in this diff, or mark the objection deferred on the
record and accept that it goes on the verdict as unresolved.
you: Updated — ATTEMPTS is now redis.incr(ip) with a 60s TTL, see
auth/login.py:8-14.
claude: Sustained — shared store with bounded TTL addresses both the
worker-skew and the leak.
Objection, missing test (auth/login.py:8-14). I see no test that
exercises the 6th attempt within the window or that the counter resets
after TTL. Without one, a future refactor that swaps the backend
silently breaks the limit. Where is the failing-without-this-change test?
you: Added tests/auth/test_login_rate_limit.py covering both cases.
claude: Sustained.
Objection, security risk (auth/login.py:4). req.remote_addr is taken
directly. Behind a proxy / load balancer, every request shares the
upstream's IP — the limit collapses to "5 logins per cluster". What's
the deployment topology, and is the IP coming from a trusted header?
you: Good point — switched to the X-Forwarded-For chain validated against the
trusted-proxy list. /verdict
claude: ═══════════════════════════════════════
VERDICT
═══════════════════════════════════════
IDEA
The diff in auth/login.py correctly rate-limits the login endpoint
and is safe to merge.
OBJECTIONS
1. Concurrency / state risk (auth/login.py:1): resolved
2. Missing test (auth/login.py:8-14): resolved
3. Security risk — IP source (auth/login.py:4): resolved
RESULT: PASSED
Every objection was answered. Ship it.
═══════════════════════════════════════
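The fixes discussed in that session can be sketched roughly as follows. The transcript does not show the final code, so everything here is an assumption: `InMemoryWindowStore` is a hypothetical stand-in for the shared store (with redis-py, an `incr` followed by an `expire` on the first hit would play a similar role), and `client_ip` and `allow_login` are illustrative names. The sketch shows a fixed-window counter with a TTL and a client IP taken from the X-Forwarded-For chain only when the direct peer is a trusted proxy.

```python
import ipaddress
import time
from typing import Callable, Dict, Iterable, Tuple


class InMemoryWindowStore:
    """Stand-in for a shared store: increment a key, with a TTL set on the first hit."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[int, float]] = {}  # key -> (count, expires_at)

    def incr_with_ttl(self, key: str, ttl_seconds: float) -> int:
        now = self._clock()
        count, expires_at = self._data.get(key, (0, 0.0))
        if count == 0 or now >= expires_at:
            count, expires_at = 0, now + ttl_seconds  # new or expired key: fresh window
        count += 1
        self._data[key] = (count, expires_at)
        return count


def client_ip(remote_addr: str, forwarded_for: str, trusted: Iterable[str]) -> str:
    """Return the right-most address in the chain that is not a trusted proxy."""
    networks = [ipaddress.ip_network(cidr) for cidr in trusted]

    def is_trusted(addr: str) -> bool:
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            return False  # malformed entries are never trusted
        return any(ip in net for net in networks)

    if not is_trusted(remote_addr):
        return remote_addr  # header is only believable behind a trusted proxy
    hops = [hop.strip() for hop in forwarded_for.split(",") if hop.strip()]
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    return remote_addr


def allow_login(store: InMemoryWindowStore, ip: str,
                limit: int = 5, window_seconds: float = 60.0) -> bool:
    # True while the caller is within `limit` attempts for the current window.
    return store.incr_with_ttl(f"login:{ip}", window_seconds) <= limit


store = InMemoryWindowStore()
ip = client_ip("10.0.0.5", "198.51.100.9, 203.0.113.7, 10.0.0.4", ["10.0.0.0/8"])
attempts = [allow_login(store, ip) for _ in range(6)]
print(ip, attempts)
```

Injecting the clock keeps the two cases named in the missing-test objection (the 6th attempt inside the window, and the reset after the TTL) testable without sleeping.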
objection is designed to slot into the loop you're already running with a coding agent:
- Generate: your coding agent (Claude Code, Cursor, etc.) writes the diff.
- Objection: you run /objection on the diff, the staged changes, or the PR. The skill cross-examines it as opposing counsel: one specific, falsifiable objection per turn, citing file:line.
- Iterate: you (or the agent) actually edit the diff in response. Promises don't count; only changes in the artifact resolve an objection.
- Verdict: /verdict produces a shareable PASSED/FAILED block. PASSED only if every objection was fixed in the diff or shown to already be handled.
The point isn't to replace human review — it's to surface the objections before a human reviewer (or production) does, while the diff is still cheap to change.
A growing body of work suggests that pairing a generator with an explicit adversarial critic — rather than relying on the generator's own self‑assessment — measurably improves the quality of LLM output, especially on code. A few of the more directly relevant findings:
- LLM Critics Help Catch LLM Bugs
- Improving Factuality and Reasoning in Language Models through Multiagent Debate
- Adversarial Multi‑Agent Evaluation of Large Language Models through Iterative Debate
- CRITIC: Large Language Models Can Self‑Correct with Tool‑Interactive Critiquing
- When Can LLMs Actually Correct Their Own Mistakes?
- Are LLMs Reliable Code Reviewers? Systematic Overcorrection in Requirement Conformance Judgement
- When collaboration fails: persuasion‑driven adversarial influence in multi‑agent LLM debate
Taken together, these results point in one direction: a separate adversarial reviewer that (a) commits to specific, grounded objections and (b) doesn't fold to confidence or authority is a meaningful complement to a generator — particularly for AI‑generated code, where the generator's own judgement skews optimistic. That is exactly the loop objection is built to make ergonomic.
objection/
├── .claude-plugin/
│   ├── plugin.json          # plugin manifest
│   └── marketplace.json     # marketplace entry
├── skills/
│   └── objection/
│       └── SKILL.md         # persona, rules, concession logic
├── commands/
│   ├── objection.md         # /objection slash command
│   └── verdict.md           # /verdict slash command
└── README.md
MIT.
