START HEREWhy a single agent's first answer is the most biased one
A single agent inherits your framing. You hand it a hunch, phrased as a hunch — "I think X is true — check it for me." The model reads that framing, treats your belief as the thing to support, and goes looking for evidence that fits. It comes back agreeing with you, citations included, and you feel smart. That's not research — that's confirmation bias with a tool-use loop bolted on. The fix isn't a better single prompt. It's structural: stop asking one agent to evaluate a claim, and instead run several agents that gather evidence independently, then make them argue. The answer you keep is the one still standing after the fight — not the first thing the model said to make you happy.
- A single agent inherits YOUR framing — phrase a hunch as a hunch and it confirms the hunch
- Independence first: separate agents gather evidence without seeing each other's conclusions
- Conflict second: force them to attack each other's findings before anything reaches you
- You read the survivor, not the first draft
THE SHAPEThe pattern: independent evidence-gatherers → forced cross-examination → survivor
Modern frontier models can genuinely fan a job out to sub-agents that work in parallel — this isn't a metaphor for 'think harder.' Anthropic's own multi-agent research system uses an orchestrator that spins up several sub-agents at once, each in its own context window, each running its own searches independently before reporting back. The adversarial pattern adds two deliberate moves on top of that raw capability: you assign each sub-agent a different evidence source so they can't all drift toward the same answer, and then you add an explicit debate stage where they have to challenge each other before the orchestrator writes a single line of conclusion. Getting each agent to gather independently stops the whole answer from converging on one source. The debate step is what surfaces the counter-case you'd actually miss — the evidence you wouldn't have found if you'd asked one agent to confirm you.
| Stage | What happens | Why it matters |
|---|---|---|
| 1. Frame as a test | State the hunch, then explicitly ask the model to PROVE it or DESTROY it | Removes the 'agree with me' framing — the goal is now a verdict, not a confirmation |
| 2. Fan out, independently | Spin up separate sub-agents, each on a DIFFERENT data source, working in parallel | No single source can quietly steer the whole answer; gathering happens before any conclusion |
| 3. Cross-examine | Make the agents challenge each other's findings before any summary is written | Surfaces the weakest evidence and the strongest counter-case — the part you'd otherwise miss |
| 4. Keep the survivor | Only the conclusion that withstands the debate reaches you, with the dissent attached | You see the bull case, the bear case, and which one actually held |
COPY THISThe "prove it or destroy it" prompt scaffold
This is the reusable skeleton. Swap the bracketed parts for your decision and keep the structure — the structure is what does the work. Notice the order: the hunch is stated plainly, then immediately reframed as something to attack; the sources are named and split; the debate is mandatory and comes BEFORE the report; and the model is told in plain language that you don't want the first answer, you want the one that survives.
- Frame it as a challenge, not a confirmation: "I have a hunch and I want you to either prove it or destroy it. My hunch is: [STATE YOUR ACTUAL BELIEF IN ONE SENTENCE]. I'm leaning toward acting on it — talk me out of it if it's wrong."
- Demand independent gathering across named sources: "Run deep research and don't stop until it's done. Spin up separate sub-agents, each working independently. Agent 1: [SOURCE A — e.g. the primary historical/usage data]. Agent 2: [SOURCE B — e.g. what comparable players/competitors actually did]. Agent 3: [SOURCE C — e.g. independent analysis or third-party signals]. They must gather evidence before forming any conclusion."
- Force the cross-examination: "Have the agents challenge each other's findings before you give me anything else. I don't want the first answer. I want the answer that survives the arguments."
- Specify the deliverable: "Then produce one clean report I can act on: my hypothesis at the top, the case FOR, the case AGAINST, and a clear verdict on which survived — with the evidence behind each side."
- Run it at high effort and walk away: give the model room and time (this is a long-horizon job, not a chat reply), then read the survivor — not the running commentary.
WHY AGENTS COME BACK EMPTYWiring real data in so the agents aren't blocked
An adversarial debate is worthless if both sides are arguing from the model's stale memory. The agents need to reach live, real data — and the moment they try to scrape it, a lot of useful sources block bots. The fix is to give the agent a real data path via a connector (MCP), so 'go check the source' actually fetches the source instead of guessing. This is generic plumbing: a scraping/fetch connector, a database connector, whatever your decision needs. The point is that each independent sub-agent has a genuine, non-blocked way to gather its own evidence.
- In your Claude client, go to Customize → Connectors, click +, then Add custom connector and paste the remote MCP server URL for your data source (a scraping/fetch service, a database, an internal API). The + menu inside a chat only enables connectors you've already added here — it's not where you add a new one.
- If it needs a key, add it where the connector asks (API key / OAuth under Advanced), then click Add to finish.
- Back in a conversation, open the + menu → Connectors and toggle your connector on for this chat; set its tools to always allow for the run so the agents aren't interrupted mid-research.
- Confirm the connector shows up and is enabled for the conversation before you start.
- Now write the scaffold so each sub-agent is pointed at a real source through that connector — not at the model's memory.
THE OUTPUTA clean report: hypothesis on top, bull vs bear, what survived
Don't let the model bury the verdict in prose. Ask for a fixed structure so you can act fast and audit the reasoning. The shape below is the deliverable to demand at the end of the scaffold — and the best part is that a frontier model is strong enough at building HTML to render this as an interactive one-pager you can actually click through, not just a wall of text.
- Hypothesis — your original hunch, restated verbatim at the very top, so there's no moving the goalposts.
- The case FOR (bull) — the strongest evidence the pro-side agent assembled, with where it came from.
- The case AGAINST (bear) — the strongest counter-evidence, including anything that directly contradicts your hunch.
- What survived the debate — the conclusion left standing after cross-examination, stated plainly.
- Confidence + what would change it — how strong the verdict is, and the one piece of new evidence that would flip it.
MAKE IT YOURSWhere builders actually use this
Strip out the example and the skeleton fits any decision where being wrong is expensive and you're tempted to trust your gut. The three sub-agents just become three different evidence angles on whatever you're deciding.
- Pricing a new offer — Agent 1: what comparable products actually charge. Agent 2: your own usage/cost data. Agent 3: independent signals (reviews, churn drivers, willingness-to-pay chatter). Hunch: 'we can charge $X.' Make them fight it.
- A tech-stack or vendor bet — Agent 1: the docs and real limits. Agent 2: what teams who adopted it report months later. Agent 3: failure stories and migration pain. Hunch: 'we should standardize on Y.'
- Market validation for a feature — Agent 1: demand signals. Agent 2: what competitors shipped and how it landed. Agent 3: the strongest reasons it would flop. Hunch: 'customers want Z.'
- A hiring or partner decision — frame the 'this is a great fit' belief as something to disprove, and let the bear-case agent do its job before you commit.
Get the next drop
New AI build guides + the occasional bonus template. No spam, unsubscribe anytime.
By submitting you agree to our Privacy Policy & Terms. Unsubscribe anytime.