Kno2gether kno2gether.com ↗ Try Knotie free
Field Guide

Claude Fable 5: The Field Guide

The model-picker decision table, the exact request settings, and every error Fable 5 throws that Opus never did — with the fix for each.

Start free with Knotie
01

Verified against Anthropic's docsThe fast facts (so the table makes sense)

Fable 5 (claude-fable-5, the public release of the Mythos line) is Anthropic's most capable widely-released model. Confirm these before you wire anything up — the rest of this guide builds on them.

  • Model idclaude-fable-5 — there is also claude-mythos-5, the same model without the safety classifiers, but it's invite-only (Project Glasswing). For everyone else it's claude-fable-5.
  • Context / output — 1M tokens in, 128K out. Stream anything over ~16K output or the SDK times out the request.
  • Price$10 / 1M input · $50 / 1M output — 2× Opus 4.8's $5/$25, 3.3× Sonnet's $3/$15, 10× Haiku's $1/$5.
  • Quiet cost multiplier — Fable 5 uses the Opus-4.7 tokenizer: the same text is ~30% more tokens than on older models. So your effective bill is closer to ~2.6× Opus on identical inputs, not 2×. Re-measure with count_tokens before you trust an old budget.
On Anthropic's own internal senior-engineer benchmark, Every scored Fable 5 at 91/100 (human-engineer level) vs 63 for Opus 4.8 and 62 for GPT-5.5. It's a single private benchmark, not an industry standard — treat it as a signal of "step-change on hard work," not a guarantee on yours.
02

Route by task, not by reflexWhen is Fable 5 actually worth $50/1M? A real decision table

The honest answer is rarely as your default. Match the task to the cheapest model that clears the quality bar — Fable 5 earns its premium on a narrow set of jobs. Use this as your routing rule:

TaskReach for Fable 5?Cheaper model that usually winsWhy
Deep, multi-source research briefYesLong-horizon synthesis across a full 1M window is where it pulls ahead.
Greenfield build from a clear, detailed specYesTesters report one-shot implementations of systems that took days to iterate.
Large refactor / repo-wide changeYesHolds long context and self-verifies across many files.
Hard judgment call (contracts, tradeoffs, design review)YesPay for being right when the cost of wrong is high.
Agentic coding loop (interactive)MaybeOpus 4.8 at xhighOpus 4.8 is near-frontier at half the price; benchmark both on YOUR repo first.
Everyday chat / drafting / Q&ANoSonnet 4.6Sonnet is the speed/intelligence sweet spot at $3/$15.
Classification, extraction, routing, taggingNoHaiku 4.5$1/$5 and fast; Fable 5 is pure waste here.
High-volume / latency-sensitive jobsNoHaiku 4.5 / Sonnet 4.6Minutes-long turns and 10× price make Fable 5 a non-starter at scale.
Offensive-security or bio/lab workNo — it'll refuseOpus 4.8Fable 5's classifiers decline these (see the refusal section).
Rule of thumb: send only the hard ~10% to Fable 5 and route the rest down the ladder. A request that runs fine on Sonnet costs you ~17× more on Fable 5 once the tokenizer shift is in.
03

The model-price ladder (the numbers behind the routing)

Same base_url, same SDK — you switch models by changing one string. Here's what each rung costs and what it's for.

Model$/1M in · outContext · max outBest for
Fable 5 (claude-fable-5)$10 · $501M · 128KHardest research, big greenfield builds, judgment
Opus 4.8 (claude-opus-4-8)$5 · $251M · 128KStrong default for agentic + coding
Sonnet 4.6 (claude-sonnet-4-6)$3 · $151M · 64KBest speed/intelligence balance
Haiku 4.5 (claude-haiku-4-5)$1 · $5200K · 64KFast, cheap, simple, high-volume
Opus 4.8 is the one to A/B against. It shares Fable 5's request surface (adaptive thinking, the same removed params) so the only code change to compare them is the model string — which is exactly the point of routing through a single endpoint.
04

Copy these, in orderSwitch from Opus without a 400: the exact settings

Fable 5 shares Opus 4.8's request shape but rejects a few parameters Opus tolerated. Each one below is a hard 400 if you leave it in. Steps 1–4 are the ones that bite during migration.

  1. Set the model to claude-fable-5.
  2. Delete temperature, top_p, and top_k. All three are removed — any of them returns a 400. Steer with the prompt instead.
  3. Remove thinking: {type: "enabled", budget_tokens: N}. Budgets are gone; sending one is a 400. Thinking is always on (adaptive) — just omit the thinking field, or set thinking: {type: "adaptive"}.
  4. Do NOT send thinking: {type: "disabled"}. This one is Fable-specific: it's accepted on Opus 4.8/4.7 but a 400 on Fable 5. There is no "thinking off" — control depth with effort instead.
  5. Set depth via output_config: {effort: ...}low · medium · high (default) · xhigh · max. Start at high; only go xhigh for the most capability-sensitive work, max for genuinely frontier problems (it can overthink).
  6. Drop any last-assistant-turn prefill — also a 400. Use output_config.format (structured output) or a system-prompt instruction to shape output.
  7. Stream for outputs over ~16K tokens so you don't hit an HTTP timeout, and give max_tokens real headroom (it caps thinking + text combined).
Quick gut-check before you send: if a request body for Fable 5 still contains temperature, top_p, top_k, budget_tokens, thinking:{type:"disabled"}, or a trailing assistant message, fix it — each is its own 400.
05

The one most people missThe error that isn't a 400 — and breaks naive code anyway

Unlike Opus, Fable 5 runs safety classifiers that can decline a request — and a decline is not an error. It comes back as a successful HTTP 200 with stop_reason: "refusal" and an empty (or partial) content array. Any code that reads response.content[0].text without checking stop_reason first will crash on a refusal.

  • What triggers it — Offensive-security/exploit work (cyber), bio/lab methods (bio), help building competing models (frontier_llm), or asking it to dump its own reasoning as text (reasoning_extraction). Benign security and life-sciences work can trip these too.
  • How to detect it — Branch on stop_reason == "refusal" — NOT on content. stop_details.category names the policy, but it can be null, so don't key your logic off it.
  • Billing quirk — A refusal before any output costs nothing and doesn't count against rate limits. A mid-stream refusal bills the input + already-streamed output — discard the partial.
  • Monitoring trap — Refusals are 200s, so dashboards built on error/5xx rates never see them. Emit your own metric per refusal.
Want auto-retry on a refusal? You implement it — branch on stop_reason == "refusal" and re-send the same request to a cheaper model like claude-opus-4-8. It's a fallback path you wire up yourself, not something that happens by default.
06

Two more gotchas that look like bugs

Both of these produce confusing failures that have nothing to do with your prompt.

  • Every request 400s out of nowhere — Fable 5 requires 30-day data retention and is not available under zero-data-retention (ZDR). If your org is on ZDR — or any retention below 30 days — every Fable 5 call returns 400 invalid_request_error, even a perfectly valid one. Check the org's retention setting before you debug the payload.
  • Your cost / token math is suddenly off — The ~30% tokenizer inflation means token counts, context budgets, and max_tokens values measured on Opus/Sonnet/Haiku don't carry over. A prompt that fit comfortably before can blow your budget. Re-run count_tokens passing model: "claude-fable-5" — the response reports counts under both the new and old tokenizers so you can see the delta.
  • Migrated prompts feel worse, not better — Over-prescriptive scaffolding written for older models can degrade Fable 5. It follows instructions tightly and plans well on its own — strip the step-by-step hand-holding and re-test; a short "act when you have enough info; don't refactor beyond the task" beats a long checklist.
07

Put it to work without lighting money on fire

Concrete patterns that play to Fable 5's strengths (long-horizon, autonomous, self-verifying) while keeping the bill sane.

  • Research analyst — Load 10+ sources into the 1M window at high effort; ask for a cited, decision-ready brief. This is the textbook Fable 5 job.
  • Architecture partner — Give the WHOLE spec up front in one well-specified turn, then let it plan and build. It rewards a clear goal more than mid-task nudging.
  • Long autonomous run — Plan for minutes-long (sometimes hours-long) turns: stream, show progress, and check in asynchronously instead of blocking. Add "audit each progress claim against a tool result before reporting it" to kill fabricated status updates.
  • Parallel sub-agents — It dispatches sub-agents reliably — delegate independent subtasks and keep working rather than spawning-and-blocking.
  • Route, don't default — Send the hard ~10% here; everything else to Opus 4.8 / Sonnet / Haiku. One endpoint, one-line model swap (see below).
08

If you don't want to run the plumbing yourselfReach every model — including Fable 5 — under one billed endpoint

The whole point of the model-picker table is that the model is a config value you swap per task. To make that real you need one place to call all of them — and if you're serving customers, a way to meter and bill what they use. Knotie's AI Gateway is exactly that: an OpenAI-compatible endpoint. Point the standard OpenAI SDK at https://api.knotie.ai, change the model name, and you reach budget, mid, and premium tiers (Claude — including Fable 5 — plus GPT and Gemini families) without a new provider integration per model.

  • One-line model switch — Swapping claude-fable-5 for claude-opus-4-8 (or a GPT/Gemini model) is the model name + base_url — not a new SDK, not new auth.
  • Per-customer metered keys — Mint virtual keys, restrict each key to specific models from a tiered list, set a profit markup, and bill usage on credits — under your own brand and domain.
  • Guardrails before a demo — Restrict a key to mid-tier only, add domain whitelisting, and watch spend per key — so a client demo can't quietly run the $50/1M model on everything.
This is the managed version of "make the model a swappable, billable config value." If you'd rather not stand up your own gateway and billing, it's the shortcut.

Get the next drop

New AI build guides + the occasional bonus template. No spam, unsubscribe anytime.

By submitting you agree to our Privacy Policy & Terms. Unsubscribe anytime.

Frequently asked questions

Is Fable 5 just a renamed Opus? Why the new name?
No — it's a distinct, higher tier. Opus 4.8 is $5/$25; Fable 5 is $10/$50 and is the public release of the Mythos line (the invite-only sibling claude-mythos-5 is the same model minus the safety classifiers). It sits above the Opus tier, not beside it.
What's the single most common 400 when moving from Opus to Fable 5?
Leftover sampling params. temperature, top_p, and top_k were fine on older Claude models and are all hard 400s on Fable 5. Delete them first. The Fable-only surprise is that thinking: {type: "disabled"} — which Opus 4.8 accepts — is also a 400.
My request returns 200 with empty content and no error. What happened?
That's a refusal: stop_reason: "refusal", classifiers declined the request. It's a successful response, not an exception. Check stop_reason before reading content, and retry on a cheaper model (e.g. Opus 4.8) if you want an answer.
Every Fable 5 call 400s but my payload looks valid. Why?
Almost always data retention. Fable 5 requires 30-day retention and isn't available under zero-data-retention. If your org is ZDR, every request 400s regardless of the body. Fix the retention setting, not the prompt.
How do I control how hard it thinks now that budget_tokens is gone?
Use output_config.effort: low / medium / high (default) / xhigh / max. Start at high. Effort affects all token spend — text, tool calls, and thinking — so lower effort also means fewer tool calls, not just shorter reasoning.
Will my Opus token budgets still be right on Fable 5?
No. The Fable 5 tokenizer produces ~30% more tokens for the same text, so old context budgets and max_tokens values are wrong — and your effective price is closer to ~2.6× Opus, not 2×. Re-measure with count_tokens using model: "claude-fable-5" (it returns both tokenizers' counts).
When is Opus 4.8 the smarter buy than Fable 5?
Most of the time. Opus 4.8 is near-frontier at half the price and shares Fable 5's request surface, so A/B-ing them is a one-string change. Reserve Fable 5 for the hardest research, big greenfield builds from a clear spec, and high-stakes judgment — and benchmark both on your actual task before committing.
Can I resell Fable 5 access to my own customers?
Yes — you don't have to be Anthropic. Front it with an OpenAI-compatible gateway (e.g. Knotie's AI Gateway), mint per-customer keys restricted to the models you allow, set your markup, and bill on credits under your own brand. Switching the model your customers get is a one-line change on your side.
Sources · Anthropic — Introducing Claude Fable 5 and Claude Mythos 5 (model id, pricing, context, API changes, data retention) · Anthropic — Models overview (pricing, context windows, and tokenizer note: Fable 5 / Opus 4.8 1M·128K, Sonnet 4.6 1M·64K, Haiku 4.5 200K·64K) · Anthropic — Effort parameter (levels + recommended starting points for Fable 5) · Anthropic — Refusals and fallback (stop_reason refusal, categories, billing, fallback patterns) · Anthropic — Prompting Claude Fable 5 (long runs, scaffolding, reasoning_extraction) · Every — "Vibe Check: Fable 5 Is the Best Coding Model in the World" (91/100 senior-engineer benchmark)

Sell access to Fable 5 — under YOUR brand

You don't have to be Anthropic to profit from Fable 5. With Knotie's AI Gateway you resell Fable 5 and every other model — voice, chat, automations — under your brand, your domain, your prices. Built-in credit billing means you set the margin and keep it. Start free.

Start free with Knotie
Explore Kno2getherOne home for the products, experiments, tools & free guides.