A total AI bill answers "how much did we spend?" It can't answer "do we make money on this?" For that you need unit economics — and the foundational unit for an AI product is cost per LLM request.
Why the unit matters more than the total
Two products with the same monthly OpenAI bill can have completely different economics. One serves a million cheap requests; the other serves a thousand expensive long-context ones. To put hypothetical numbers on it, a $5,000 bill works out to $0.005 per request in the first case and $5 per request in the second. Only the per-request view tells you which is which, and where margin is won or lost.
From cost per request you can roll up to the numbers that actually drive decisions:
- Cost per customer — is your pricing covering your heaviest users?
- Cost per feature — which AI features are profitable and which are a tax?
- Gross margin on AI — the number your board will eventually ask for.
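As a rough illustration of these roll-ups, the sketch below sums per-request costs by customer and by feature, then computes a gross margin. The record fields (`customer`, `feature`, `cost_usd`), the sample figures, and the helper names are all hypothetical and only stand in for whatever your own request log contains.

```python
from collections import defaultdict

# Hypothetical per-request cost records; field names and values are illustrative.
requests = [
    {"customer": "acme", "feature": "summarize", "cost_usd": 0.012},
    {"customer": "acme", "feature": "chat", "cost_usd": 0.003},
    {"customer": "globex", "feature": "summarize", "cost_usd": 0.020},
    {"customer": "globex", "feature": "chat", "cost_usd": 0.005},
]


def roll_up(records, dimension):
    """Sum per-request cost along one dimension, e.g. 'customer' or 'feature'."""
    totals = defaultdict(float)
    for record in records:
        totals[record[dimension]] += record["cost_usd"]
    return dict(totals)


def gross_margin(revenue_usd, cost_usd):
    """Gross margin as a fraction of revenue: (revenue - cost) / revenue."""
    if revenue_usd <= 0:
        raise ValueError("revenue must be positive")
    return (revenue_usd - cost_usd) / revenue_usd


cost_by_customer = roll_up(requests, "customer")
cost_by_feature = roll_up(requests, "feature")
```

The same `roll_up` function serves both views because each request carries both labels. That is the practical reason to start from the per-request record rather than from a monthly total.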
How to measure it
Cost per request comes from tying spend (by model, with input and output tokens counted separately) to a unit of usage (a request, a customer, a feature). The hard part is attribution: provider dashboards report by API key and model, not by your customer or feature. You need to map usage to your own dimensions, which in practice means recording who and what each request was for at the moment you make the call.
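A minimal sketch of that calculation, assuming you log token counts alongside your own customer and feature labels for every call. The model names, the per-million-token prices, and the `RequestUsage` type are placeholders invented for illustration, not real provider rates or a real API.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per 1M tokens; substitute your provider's actual rates.
PRICES_PER_MILLION = {
    "model-small": {"input": 0.50, "output": 1.50},
    "model-large": {"input": 5.00, "output": 15.00},
}


@dataclass
class RequestUsage:
    """One LLM call, tagged with your own attribution dimensions."""
    model: str
    input_tokens: int
    output_tokens: int
    customer_id: str  # your dimension, not the provider's
    feature: str      # your dimension, not the provider's


def request_cost_usd(usage: RequestUsage) -> float:
    """Price a single request from its token counts and the model's rates."""
    price = PRICES_PER_MILLION[usage.model]
    total = (
        usage.input_tokens * price["input"]
        + usage.output_tokens * price["output"]
    )
    return total / 1_000_000


usage = RequestUsage(
    model="model-large",
    input_tokens=2_000,
    output_tokens=500,
    customer_id="acme",
    feature="summarize",
)
cost = request_cost_usd(usage)
```

Input and output tokens are priced separately here because providers commonly charge different rates for each. The customer and feature tags travel with the record so the cost can be grouped later along either dimension.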
Make it a standing number, not a one-off analysis
StackSpend's AI COGS view attributes AI spend to features and customers, surfaces cost per LLM request and model-level breakdown, and tracks margin as usage scales — so unit economics is a live dashboard, not a quarterly spreadsheet. It sits alongside AI spend management and LLM cost monitoring so the total and the unit live in one place.
This is the core of AI FinOps: you can't optimize what you can't allocate.