Anthropic's model tiers (Haiku, Sonnet, Opus) span a wide range of per-token prices. Claude Sonnet is the workhorse for most production workloads, which makes it the model most worth tracking closely: a shift up to Opus or a jump in context length changes your bill fast.
What actually moves Claude Sonnet cost
- Model mix. A workflow that moves from Sonnet to Opus, or one where Opus becomes the default for a high-volume path, multiplies cost per request. Watch the share of spend by model, not just the total.
- Context length. Long-context workflows — large documents, big retrieved contexts, full conversation history — raise input tokens per request. Reprocessing the same document repeatedly instead of caching is a classic Sonnet cost trap.
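- Agent loops and retries. Tool-use loops and retry logic increase the number of requests per user action. Cost scales with requests, so a loop that runs three times instead of once at least triples spend with no change in traffic, and often more, because each iteration typically resends the growing conversation as input.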
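To make the three drivers above concrete, here is a minimal sketch of the arithmetic. The prices in `PRICES` are illustrative placeholders, not Anthropic's actual rates, and `request_cost` and `action_cost` are hypothetical helper names; substitute the numbers from your own price sheet.

```python
# Placeholder USD prices per million tokens. These are assumptions for
# illustration only, not Anthropic's published rates.
PRICES = {
    "haiku": {"input": 1.0, "output": 5.0},
    "sonnet": {"input": 3.0, "output": 15.0},
    "opus": {"input": 15.0, "output": 75.0},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request: tokens times the per-million-token price."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def action_cost(model: str, input_tokens: int, output_tokens: int,
                requests_per_action: int = 1) -> float:
    """Cost of one user action that triggers several requests.

    Assumes every request is the same size. In a real agent loop the input
    usually grows with each iteration, so treat this as a lower bound.
    """
    return requests_per_action * request_cost(model, input_tokens, output_tokens)


# Model mix: same request, different model.
sonnet = request_cost("sonnet", 10_000, 1_000)
opus = request_cost("opus", 10_000, 1_000)

# Context length: same model, ten times the input tokens.
long_context = request_cost("sonnet", 100_000, 1_000)

# Agent loop: same request repeated three times per user action.
looped = action_cost("sonnet", 10_000, 1_000, requests_per_action=3)
```

With these placeholder prices, the 10,000-token-in, 1,000-token-out request costs about $0.045 on Sonnet and $0.225 on Opus, and the three-iteration loop costs about $0.135. None of those changes would show up in a request-count dashboard.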
The view you need
Tracking Claude Sonnet well means seeing cost broken down by model, workspace/feature, and input vs output tokens, compared against a baseline. That lets you separate three different stories that all look like "the bill went up": more traffic, longer prompts, or a model upgrade.
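One way to separate those three stories is to split the change in cost into three multiplicative factors, since cost = requests × tokens per request × cost per token. The sketch below assumes you can export request count, total tokens, and total cost for a baseline period and a current period; `Period` and `drivers` are hypothetical names, not part of any Anthropic or StackSpend API.

```python
from dataclasses import dataclass


@dataclass
class Period:
    requests: int   # number of API requests in the period
    tokens: int     # input + output tokens in the period
    cost: float     # total spend in USD


def drivers(baseline: Period, current: Period) -> dict[str, float]:
    """Split the cost ratio into traffic, prompt length, and price factors.

    The three factors multiply to the total ratio. A factor near 1.0 means
    that driver did not change between the two periods.
    """
    traffic = current.requests / baseline.requests
    tokens_per_request = (
        (current.tokens / current.requests)
        / (baseline.tokens / baseline.requests)
    )
    # Blended cost per token moves when the model mix changes, and also
    # when the input/output split changes, so check both before blaming
    # a model upgrade.
    cost_per_token = (
        (current.cost / current.tokens)
        / (baseline.cost / baseline.tokens)
    )
    return {
        "traffic": traffic,
        "tokens_per_request": tokens_per_request,
        "cost_per_token": cost_per_token,
        "total": current.cost / baseline.cost,
    }


# Example: the bill doubled with flat traffic and flat blended price,
# so the whole increase comes from longer prompts.
report = drivers(
    baseline=Period(requests=1_000, tokens=2_000_000, cost=10.0),
    current=Period(requests=1_000, tokens=4_000_000, cost=20.0),
)
```

Running the same split per model or per feature narrows down where the change came from. Multiplicative factors are used here because they account for the whole change with nothing left over; an additive split would leave interaction terms to allocate.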
Turn it into monitoring
StackSpend's Anthropic cost monitoring tracks Claude API usage by model so Sonnet, Opus, and Haiku spend are separated, and ties token volume to cost. Anomaly detection fires the day model mix or context length shifts — before the monthly invoice.
For Claude Code, Cowork, and Office agent usage (telemetry rather than API billing), see Claude cost monitoring. If the bill has already spiked, start with why is my Anthropic bill so high.