AI spend troubleshooting

OpenAI cost spikes: causes, checks, and alert policy.

OpenAI spend can move in hours because token volume, model choice, retries, and product traffic all multiply together.
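As a rough illustration of why these factors compound, the sketch below models daily spend as a product of traffic, attempts per request, and per-call token cost. The function name, parameter names, and prices are hypothetical placeholders, not real OpenAI rates.

```python
def estimate_daily_cost(
    requests: int,
    input_tokens_per_request: float,
    output_tokens_per_request: float,
    input_price_per_million: float,   # placeholder price, not a real rate
    output_price_per_million: float,  # placeholder price, not a real rate
    attempts_per_request: float = 1.0,  # 1.0 means no retries
) -> float:
    """Estimate daily spend as traffic x attempts x per-call token cost."""
    per_call_cost = (
        input_tokens_per_request * input_price_per_million
        + output_tokens_per_request * output_price_per_million
    ) / 1_000_000
    return requests * attempts_per_request * per_call_cost


# Baseline day versus a day with 20% more traffic, 50% longer prompts,
# and 20% of calls retried once. All numbers are made up for illustration.
baseline = estimate_daily_cost(100_000, 1_000, 200, 1.0, 4.0)
spike = estimate_daily_cost(120_000, 1_500, 200, 1.0, 4.0, attempts_per_request=1.2)
print(round(baseline, 2), round(spike, 2), round(spike / baseline, 2))
```

With these placeholder inputs, three moderate changes take the estimate from about 180 to about 331, an increase of roughly 84%, even though no single factor rose by more than 50%.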

Common causes

What usually moves the OpenAI bill

A feature ships on a more expensive model than expected, or a fallback model becomes the default.
Prompt or context length grows, increasing average input tokens per request; output tokens can rise too if responses get longer.
Retries, streaming reconnects, batch jobs, or background agents repeat calls silently.
Embeddings, evals, or summarisation jobs run per event instead of using cache or sampling.
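For the last cause above, a minimal sketch of the two mitigations it names: caching by content hash so identical text is embedded only once, and deterministic sampling so only a fraction of events trigger a job. The `embed` callable stands in for whatever client call you actually use; `make_cached_embedder` and `in_sample` are hypothetical helper names, and the in-memory dict is an assumption (a shared store would be needed across processes).

```python
import hashlib
from typing import Callable, Dict, List


def make_cached_embedder(
    embed: Callable[[str], List[float]],
) -> Callable[[str], List[float]]:
    """Wrap an embedding function so repeated text is only embedded once."""
    cache: Dict[str, List[float]] = {}

    def cached(text: str) -> List[float]:
        # Key on a hash of the content so the cache does not hold raw text as keys.
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in cache:
            cache[key] = embed(text)  # the only place a paid call would happen
        return cache[key]

    return cached


def in_sample(event_id: str, rate: float) -> bool:
    """Deterministically select roughly `rate` of events (0.0 to 1.0)."""
    digest = hashlib.sha256(event_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1].
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < rate
```

Hash-based sampling is used here instead of a random draw so that the same event always gets the same decision, which keeps reruns and retries from evaluating a different subset each time.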

First checks

Compare daily spend by project, model, endpoint, and feature owner.
Review request count, input tokens, output tokens, and average tokens per request.
Check deploys, prompt changes, eval runs, and retry behaviour in the spike window.
Separate launch-driven growth from inefficient token usage.
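The checks above can be sketched in a few lines. This assumes usage has already been exported as rows with hypothetical fields `model`, `requests`, `input_tokens`, and `output_tokens`; the real export schema will differ. The second function splits a token increase into a part explained by more requests (launch-driven growth) and a part explained by more tokens per request (inefficiency).

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def summarise_by_model(rows: Iterable[dict]) -> Dict[str, dict]:
    """Total requests and tokens per model, plus average tokens per request."""
    totals = defaultdict(lambda: {"requests": 0, "input_tokens": 0, "output_tokens": 0})
    for row in rows:
        entry = totals[row["model"]]
        entry["requests"] += row["requests"]
        entry["input_tokens"] += row["input_tokens"]
        entry["output_tokens"] += row["output_tokens"]

    summary = {}
    for model, entry in totals.items():
        tokens = entry["input_tokens"] + entry["output_tokens"]
        per_request = tokens / entry["requests"] if entry["requests"] else 0.0
        summary[model] = {**entry, "tokens_per_request": per_request}
    return summary


def split_token_growth(
    base_requests: int, base_tokens: int, spike_requests: int, spike_tokens: int
) -> Tuple[float, float]:
    """Split a token increase into (volume effect, tokens-per-request effect).

    The two parts always add up to spike_tokens - base_tokens.
    """
    base_per_request = base_tokens / base_requests
    spike_per_request = spike_tokens / spike_requests
    volume_effect = (spike_requests - base_requests) * base_per_request
    efficiency_effect = spike_requests * (spike_per_request - base_per_request)
    return volume_effect, efficiency_effect
```

For example, going from 1,000 requests and 1,000,000 tokens to 1,200 requests and 1,800,000 tokens splits into 200,000 tokens from extra traffic and 600,000 tokens from heavier requests, which points at prompt or context growth rather than the launch itself.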

Alert policy template

Green

Daily OpenAI spend is within 10% of baseline and model mix is unchanged.

Amber

Daily OpenAI spend is 10-25% above baseline, tokens per request jump, or the share of requests on a premium model rises.

Red

Daily OpenAI spend is more than 25% above baseline, or the month-end forecast exceeds the monthly AI budget.
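One way to turn the template into a check is sketched below. The function and parameter names are hypothetical, and two choices are assumptions rather than part of the template: spend exactly 10% or 25% above baseline falls into the lower band, and spend below baseline is treated as Green.

```python
def classify_spend(
    daily_spend: float,
    baseline: float,
    forecast_month_spend: float,
    monthly_budget: float,
    tokens_per_request_jumped: bool = False,
    premium_share_rose: bool = False,
) -> str:
    """Return "green", "amber", or "red" following the alert policy template."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")

    # Fractional change versus baseline, e.g. 0.15 means 15% above.
    change = (daily_spend - baseline) / baseline

    # Red: more than 25% above baseline, or forecast over the monthly budget.
    if change > 0.25 or forecast_month_spend > monthly_budget:
        return "red"
    # Amber: more than 10% above baseline, or a usage-shape warning sign.
    if change > 0.10 or tokens_per_request_jumped or premium_share_rose:
        return "amber"
    # Green: within 10% of baseline and model mix unchanged.
    return "green"


print(classify_spend(115.0, 100.0, 2_500.0, 3_000.0))
```

Red is checked first so that a budget overrun is never masked by a day that otherwise looks normal.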

Next step

StackSpend connects OpenAI to your cloud and AI cost view with daily Slack or email reporting, anomaly detection, and pace-to-forecast.