OpenAI cost spikes: causes, checks, and alert policy.
OpenAI spend can move in hours because token volume, model choice, retries, and product traffic all multiply together.
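To see how these factors compound, here is a minimal sketch of a daily cost estimate. The function name, the per-million-token prices, and the simple retry model are illustrative assumptions, not OpenAI's published pricing or billing logic.

```python
def estimate_daily_cost(requests, input_tokens, output_tokens,
                        input_price_per_m, output_price_per_m,
                        retry_rate=0.0):
    """Estimate daily spend in dollars.

    input_tokens / output_tokens are per-request averages.
    Prices are dollars per one million tokens (hypothetical values below).
    retry_rate is the fraction of requests that are repeated once.
    """
    token_cost = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return requests * (1 + retry_rate) * token_cost / 1_000_000


# Hypothetical baseline: 100k requests, 1,000 input and 300 output tokens each.
baseline = estimate_daily_cost(100_000, 1_000, 300, 2.0, 8.0)

# Same model, but 20% more traffic, 50% longer prompts, and 10% retries.
spike = estimate_daily_cost(120_000, 1_500, 300, 2.0, 8.0, retry_rate=0.10)

print(f"baseline ${baseline:.2f}, spike ${spike:.2f}, ratio {spike / baseline:.2f}")
```

With these made-up numbers, three modest changes combine into a daily bill roughly 62% higher than baseline, which is why no single metric explains a spike on its own.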
What usually moves the OpenAI bill
A feature ships on a more expensive model than expected or a fallback model becomes the default.
Prompt or context length grows, raising average input tokens per request, or responses get longer, raising average output tokens.
Retries, streaming reconnects, batch jobs, or background agents repeat calls silently.
Embeddings, evals, or summarisation jobs run on every event instead of using a cache or sampling.
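As one illustration of the caching point, the sketch below wraps an embedding call in an in-memory cache keyed by a hash of the input text. `EmbeddingCache` and `embed_fn` are hypothetical names: `embed_fn` stands in for whatever function actually calls the embeddings API, and a production setup would likely use a shared store rather than a Python dict.

```python
import hashlib


class EmbeddingCache:
    """Avoid paying twice to embed identical text (in-memory sketch)."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn  # hypothetical callable: str -> list of floats
        self._store = {}
        self.misses = 0  # number of times the paid call was actually made

    def get(self, text):
        # Hash the text so the key size stays fixed regardless of input length.
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed_fn(text)
        return self._store[key]
```

Comparing `misses` with the total number of `get` calls gives a rough measure of how much repeated work the cache is absorbing.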
Triage checklist
- Compare daily spend by project, model, endpoint, and feature owner.
- Review request count, input tokens, output tokens, and average tokens per request.
- Check deploys, prompt changes, eval runs, and retry behaviour in the spike window.
- Separate launch-driven growth from inefficient token usage.
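The last item in the checklist can be approximated with a simple decomposition of token growth into a volume part (more requests at the old tokens-per-request) and an efficiency part (more tokens per request at the new volume). This is a sketch under assumed inputs; the function name and the sample figures are illustrative and do not come from any OpenAI report.

```python
def split_token_growth(base_requests, base_tokens_per_request,
                       new_requests, new_tokens_per_request):
    """Split the change in total tokens into volume and efficiency effects.

    The two parts always sum to the full change:
    new_requests * new_tokens_per_request - base_requests * base_tokens_per_request
    """
    # Extra tokens caused purely by more requests, at the old request size.
    volume_effect = (new_requests - base_requests) * base_tokens_per_request
    # Extra tokens caused by each request getting larger, at the new volume.
    efficiency_effect = new_requests * (new_tokens_per_request - base_tokens_per_request)
    return volume_effect, efficiency_effect


# Hypothetical spike window: traffic up 20%, tokens per request up from 1,300 to 1,800.
volume, efficiency = split_token_growth(100_000, 1_300, 120_000, 1_800)
print(f"volume: {volume:,} tokens, efficiency: {efficiency:,} tokens")
```

In this made-up case about 26 million of the extra tokens come from traffic growth and 60 million from larger requests, so prompt and context length would be the first place to look.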
Green, amber, red thresholds for OpenAI
Green
Daily OpenAI spend is within 10% of baseline and model mix is unchanged.
Amber
Daily OpenAI spend is 10-25% above baseline, average tokens per request jump, or the share of traffic on a premium model rises.
Red
Daily OpenAI spend is more than 25% above baseline, or the month-end forecast exceeds the monthly AI budget.
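These thresholds can be encoded as a small classifier for a daily check. The sketch below is one possible reading: the function and parameter names are hypothetical, the boundary handling (exactly 10% counts as green, exactly 25% as amber) is an assumption, and spend below baseline is treated as green.

```python
def classify_spend(daily_spend, baseline,
                   tokens_per_request_jumped=False,
                   premium_share_rose=False,
                   forecast=None, monthly_budget=None):
    """Return "green", "amber", or "red" for one day's OpenAI spend.

    baseline must be a positive number (for example, a trailing daily average).
    forecast and monthly_budget are optional; the budget check is skipped
    unless both are provided.
    """
    deviation = (daily_spend - baseline) / baseline
    over_budget = (forecast is not None and monthly_budget is not None
                   and forecast > monthly_budget)

    if deviation > 0.25 or over_budget:
        return "red"
    if deviation > 0.10 or tokens_per_request_jumped or premium_share_rose:
        return "amber"
    return "green"
```

For example, `classify_spend(120, 100)` falls in the amber band, while the same day with a forecast above budget would be red regardless of the daily deviation.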
Turn this playbook into a daily signal.
StackSpend connects OpenAI to your cloud and AI cost view with daily Slack or email reporting, anomaly detection, and pace-to-forecast.