FinOps
FinOps (Financial Operations) is the practice of giving engineering, finance, and product teams shared, real-time accountability for variable cloud and AI spend. Instead of treating the bill as a finance problem discovered weeks later, FinOps pushes cost decisions to the people whose work incurs the cost, at the moment they incur it, using shared data, budgets, and alerts.
Related: Showback and chargeback, Unit economics, Cost anomaly
AI COGS
AI COGS (Cost of Goods Sold) is the inference cost baked into a software product: the OpenAI, Anthropic, Bedrock, or other model spend consumed by each user interaction or feature. Tracking AI COGS lets a team calculate gross margin per product line and see how model usage affects unit economics, rather than burying inference cost in a single undifferentiated API bill.
Related: Gross margin, Unit economics, Cost per token
Cost anomaly
A cost anomaly is a sudden, statistically significant deviation in spend from a service or provider’s historical baseline — for example, OpenAI spend doubling overnight because of a prompt bug or a runaway agent loop. Anomalies are the early-warning signal cost monitoring exists to catch, because they usually surface days before the invoice does.
Related: Anomaly detection, Burn rate, Budget guardrail
Anomaly detection
Anomaly detection is the automated process of learning each service’s normal spend pattern and flagging spend that deviates from it by more than the expected variation. Effective detection accounts for weekly seasonality and growth trends so that it alerts on genuine spikes, not on a predictable Monday-morning increase, and delivers the alert (Slack, email, webhook) before the cost compounds.
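One simple way to respect weekly seasonality is to compare each day only against earlier days with the same weekday. The Python sketch below does that with a z-score; the function name, the threshold of 3.0, and the sample data are all illustrative assumptions, and it deliberately ignores growth trends, which a production detector would also need to model.

```python
from datetime import date, timedelta
from statistics import mean, stdev

def is_anomalous(history, day, spend, threshold=3.0):
    """Flag `spend` on `day` if it sits far above the same-weekday baseline.

    history: list of (date, spend) pairs for earlier days.
    threshold: z-score cut-off (an assumed default, tune per service).
    """
    same_weekday = [s for d, s in history if d.weekday() == day.weekday()]
    if len(same_weekday) < 3:
        return False  # too little data to judge
    baseline = mean(same_weekday)
    spread = stdev(same_weekday)
    if spread == 0:
        return spend > baseline
    return (spend - baseline) / spread > threshold

# Hypothetical four weeks of data: Mondays run near 200, other days near 100.
start = date(2024, 1, 1)  # a Monday
history = []
for i in range(28):
    d = start + timedelta(days=i)
    base = 200.0 if d.weekday() == 0 else 100.0
    history.append((d, base + (i % 3)))  # small deterministic wobble

print(is_anomalous(history, date(2024, 1, 29), 203.0))  # Monday: False
print(is_anomalous(history, date(2024, 1, 30), 203.0))  # Tuesday: True
```

The same 203.0 is normal for a Monday and a clear spike for a Tuesday, which is the point of a per-weekday baseline.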
Related: Cost anomaly, Budget guardrail
Burn rate
Burn rate is how fast a team is spending over a given period, usually expressed per day or per month. For cloud and AI costs, daily burn rate is the most actionable view because it makes a mid-month spike visible immediately, instead of being averaged away in a monthly total.
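To see how a monthly view averages a spike away, the following Python sketch computes burn rate over different trailing windows. The function name and the spend figures are made up for illustration.

```python
def burn_rate(daily_spend, window):
    """Average spend per day over the most recent `window` days."""
    if window <= 0:
        raise ValueError("window must be positive")
    recent = daily_spend[-window:]
    return sum(recent) / len(recent)

# Hypothetical month so far: 14 quiet days, then a spike on days 15 and 16.
daily_spend = [100.0] * 14 + [400.0] * 2

print(burn_rate(daily_spend, 1))                 # 400.0 (latest day)
print(burn_rate(daily_spend, len(daily_spend)))  # 137.5 (month-to-date average)
```

The latest day is running at four times the usual rate, while the month-to-date average has moved by less than 40 percent.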
Related: Runway, Pace to forecast
Runway
Runway is the amount of time a company can keep operating before it runs out of money, calculated as available cash divided by burn rate. Because cloud and AI spend is one of the largest variable costs for many software companies, an unnoticed spend spike directly shortens runway — which is why daily cost visibility is a runway-protection tool, not just a reporting one.
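The calculation is a single division, sketched here in Python with hypothetical figures to show how a sustained spend increase shortens runway.

```python
def runway_months(cash, monthly_burn):
    """Months of operation left: available cash divided by monthly burn."""
    if monthly_burn <= 0:
        raise ValueError("monthly_burn must be positive")
    return cash / monthly_burn

# Hypothetical company: 1.2M in cash, 100k monthly burn.
print(runway_months(1_200_000, 100_000))  # 12.0

# The same company after an unnoticed 20k/month increase in cloud and AI spend.
print(runway_months(1_200_000, 120_000))  # 10.0
```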
Related: Burn rate, Pace to forecast
Unit economics
Unit economics describes the direct revenue and costs tied to a single unit of a business — one customer, one API request, or one feature. For AI products, unit economics depend heavily on inference cost: if the model spend per active user grows faster than the revenue per user, the product becomes less profitable as it scales, even while top-line revenue rises.
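A small worked example makes the scaling effect concrete. In this Python sketch the `Period` class and all figures are hypothetical: revenue per user stays flat while model spend per user grows, so total revenue rises fivefold even as the margin on each user shrinks.

```python
from dataclasses import dataclass

@dataclass
class Period:
    active_users: int
    revenue_per_user: float
    model_spend_per_user: float

    @property
    def revenue(self):
        """Top-line revenue for the period."""
        return self.active_users * self.revenue_per_user

    @property
    def margin_per_user(self):
        """Share of each user's revenue left after inference cost."""
        return (self.revenue_per_user - self.model_spend_per_user) / self.revenue_per_user

early = Period(active_users=1_000, revenue_per_user=20.0, model_spend_per_user=8.0)
later = Period(active_users=5_000, revenue_per_user=20.0, model_spend_per_user=15.0)

for label, p in (("early", early), ("later", later)):
    print(f"{label}: revenue={p.revenue:,.0f} margin per user={p.margin_per_user:.0%}")
```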
Related: AI COGS, Gross margin, Cost per request
Gross margin
Gross margin is revenue minus cost of goods sold (COGS), divided by revenue. For AI-powered software, inference cost (AI COGS) is an increasingly large component of COGS, so attributing model spend to the features and customers that drive it is what makes a true gross-margin number possible rather than a guess.
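The formula, and the reason attribution matters, can be sketched in a few lines of Python. The product lines and amounts below are invented for illustration; the only part taken from the definition is (revenue - COGS) / revenue.

```python
def gross_margin(revenue, cogs):
    """Gross margin as a fraction: (revenue - COGS) / revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cogs) / revenue

# Hypothetical product lines: (revenue, non-AI COGS, AI COGS).
lines = {
    "assistant": (50_000.0, 5_000.0, 20_000.0),
    "dashboard": (50_000.0, 5_000.0, 2_500.0),
}

for name, (revenue, other_cogs, ai_cogs) in lines.items():
    print(f"{name}: {gross_margin(revenue, other_cogs + ai_cogs):.1%}")

# The blended figure hides the gap between the two lines.
total_revenue = sum(r for r, _, _ in lines.values())
total_cogs = sum(o + a for _, o, a in lines.values())
print(f"blended: {gross_margin(total_revenue, total_cogs):.1%}")
```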
Related: AI COGS, Unit economics
Cost per token
Cost per token is the unit price of large-language-model usage, billed separately for input (prompt) and output (completion) tokens. Because output tokens usually cost several times more than input tokens, and because prompt size compounds across retries and long contexts, cost per token is the lever that most directly determines AI COGS.
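As a rough illustration, the Python sketch below prices a single model call and then the same call with retries. The per-million-token prices are placeholders, not any provider's actual rates, and the function names are hypothetical.

```python
def call_cost(input_tokens, output_tokens, input_price_per_mtok, output_price_per_mtok):
    """Cost of one model call, with prices quoted per million tokens."""
    return (
        input_tokens * input_price_per_mtok
        + output_tokens * output_price_per_mtok
    ) / 1_000_000

# Placeholder prices: output tokens cost five times as much as input tokens.
INPUT_PRICE = 3.00    # per million input tokens (assumed)
OUTPUT_PRICE = 15.00  # per million output tokens (assumed)

single = call_cost(2_000, 500, INPUT_PRICE, OUTPUT_PRICE)

# Each retry resends the full prompt and generates a fresh completion,
# so three attempts cost three times a single call.
attempts = 3
with_retries = attempts * single

print(f"single call: {single:.4f}")
print(f"with retries: {with_retries:.4f}")
```

In this example the 500 output tokens cost more than the 2,000 input tokens, which is why trimming completions often saves more than trimming prompts.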
Related: Cost per request, AI COGS
Cost per request
Cost per request is the average cost of serving a single API call or user action, including model tokens, retries, tool calls, and any downstream infrastructure. It is the most useful denominator for AI unit economics because it maps cleanly onto product behaviour: a feature that triggers five model calls per click costs roughly five times as much per use as one that triggers a single call of similar size.
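A minimal Python sketch of this arithmetic follows. The function, its parameters (including `retry_rate`, read here as the expected number of extra calls per call), and the figures are assumptions made for illustration.

```python
def cost_per_request(model_calls, cost_per_call, retry_rate=0.0, infra_cost=0.0):
    """Estimated cost of one user action.

    model_calls: model calls triggered per action.
    cost_per_call: average token cost of one call.
    retry_rate: expected extra calls per call (0.1 means 10% are retried once).
    infra_cost: downstream infrastructure cost per action.
    """
    return model_calls * cost_per_call * (1 + retry_rate) + infra_cost

# Hypothetical features with the same per-call cost.
one_call_feature = cost_per_request(model_calls=1, cost_per_call=0.01)
five_call_feature = cost_per_request(model_calls=5, cost_per_call=0.01)

print(f"{one_call_feature:.3f} vs {five_call_feature:.3f}")
```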
Related: Cost per token, Unit economics
Egress cost
Egress cost is the fee a cloud provider charges to move data out of its network or across regions. It is a frequent source of surprise bills because it scales with traffic rather than with stored data, and it often hides inside an aggregate networking line item until something — a new integration, a misrouted backup — makes it spike.
Related: Cost anomaly
Idle resource cost
Idle resource cost is money spent on provisioned-but-unused capacity: oversized instances, forgotten dev environments, unattached storage volumes, or always-on resources that only need to run during business hours. Because idle resources accumulate silently and never trigger an error, they are typically found by cost review rather than by monitoring.
Related: Commitment discount
Commitment discount
A commitment discount is a reduced rate a cloud provider offers in exchange for a usage or spend commitment over a fixed term, typically one or three years. Examples include AWS Savings Plans and Reserved Instances, and committed use discounts on Google Cloud. The discount only pays off if committed capacity stays well utilised, so commitment decisions depend on accurate forecasts of baseline demand.
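Under a deliberately simplified model, where the commitment is paid in full whether or not it is used and usage never exceeds it, the break-even utilisation is one minus the discount. The Python sketch below works through that; the 30 percent discount, the hourly rate, and the function names are illustrative assumptions rather than any provider's terms.

```python
def break_even_utilisation(discount):
    """Fraction of committed capacity that must be used for the commitment to pay off."""
    return 1.0 - discount

def commitment_savings(committed_hours, used_hours, on_demand_rate, discount):
    """Savings versus paying on demand for the same usage (negative means a loss).

    Assumes used_hours <= committed_hours and that the commitment is billed in full.
    """
    committed_cost = committed_hours * on_demand_rate * (1.0 - discount)
    on_demand_cost = used_hours * on_demand_rate
    return on_demand_cost - committed_cost

# Hypothetical: 730 committed hours a month at a 30% discount, 1.00 per hour on demand.
well_used = commitment_savings(730, 657, 1.00, 0.30)    # 90% utilised
under_used = commitment_savings(730, 438, 1.00, 0.30)   # 60% utilised

print(f"break-even utilisation: {break_even_utilisation(0.30):.0%}")
print(f"savings at 90%: {well_used:.2f}")
print(f"savings at 60%: {under_used:.2f}")
```

At 60 percent utilisation the commitment costs more than on-demand pricing would have, which is why the forecast of baseline demand matters more than the headline discount.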
Related: Idle resource cost, Pace to forecast
Showback and chargeback
Showback and chargeback are two models for attributing shared cloud and AI cost to the teams, products, or customers that generate it. Showback reports each team’s cost for visibility and accountability; chargeback goes further and actually bills it to that team’s budget. Both depend on consistent cost allocation, usually through tagging.
Related: Cost allocation tagging, FinOps
Cost allocation tagging
Cost allocation tagging is the practice of labelling cloud resources with metadata — team, product, environment, customer — so that spend can be grouped and attributed instead of viewed only by service. Tag coverage is the foundation of showback, chargeback, and per-feature margin: untagged spend is unattributable spend.
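The Python sketch below groups billing line items by a tag and reports tag coverage as the share of spend that carries the tag. The line-item shape (a dict with `cost` and `tags` keys) and the sample values are assumptions; real billing exports differ by provider.

```python
from collections import defaultdict

def spend_by_tag(line_items, tag_key):
    """Total cost per tag value; items without the tag land in 'untagged'."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item["tags"].get(tag_key, "untagged")
        totals[owner] += item["cost"]
    return dict(totals)

def tag_coverage(line_items, tag_key):
    """Fraction of total spend that carries `tag_key`."""
    total = sum(item["cost"] for item in line_items)
    tagged = sum(item["cost"] for item in line_items if tag_key in item["tags"])
    return tagged / total if total else 0.0

# Hypothetical billing line items.
line_items = [
    {"service": "compute", "cost": 600.0, "tags": {"team": "search"}},
    {"service": "storage", "cost": 150.0, "tags": {"team": "platform"}},
    {"service": "compute", "cost": 250.0, "tags": {}},
]

print(spend_by_tag(line_items, "team"))
print(f"coverage: {tag_coverage(line_items, 'team'):.0%}")  # coverage: 75%
```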
Related: Showback and chargeback, Unit economics
Pace to forecast
Pace to forecast compares spend so far in a period against the projected end-of-period total, answering "are we on track to hit budget?" while there is still time to act. Unlike a month-end variance report, a pace-to-forecast signal is forward-looking: a red pace on day 10 is an invitation to intervene, not a post-mortem.
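One way to produce such a signal is straight-line extrapolation of spend so far, sketched below in Python. Linear projection is an assumption chosen for simplicity (it ignores seasonality and growth), and the function names and figures are hypothetical.

```python
def projected_total(spend_to_date, days_elapsed, days_in_period):
    """Straight-line projection of end-of-period spend."""
    if days_elapsed <= 0:
        raise ValueError("days_elapsed must be positive")
    return spend_to_date / days_elapsed * days_in_period

def pace_ratio(spend_to_date, days_elapsed, days_in_period, budget):
    """Projected spend as a multiple of budget; above 1.0 means over pace."""
    return projected_total(spend_to_date, days_elapsed, days_in_period) / budget

# Hypothetical: day 10 of a 30-day month, 4,000 spent against a 10,000 budget.
print(projected_total(4_000, 10, 30))     # 12000.0
print(pace_ratio(4_000, 10, 30, 10_000))  # 1.2
```

A ratio of 1.2 on day 10 means the month is heading 20 percent over budget with 20 days left to act.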
Related: Burn rate, Budget guardrail
Budget guardrail
A budget guardrail is a defined spend threshold that triggers a notification — or an automated action — when usage approaches or crosses it. Guardrails turn a budget from a number reviewed monthly into a live control: the team hears about a breach in Slack on the day it happens, not in next month’s invoice.
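The core of a guardrail is a threshold check, sketched here in Python. The 80 percent warning level and the status names are assumptions, and the notification or automated action itself is left out; a real system would route the returned status to Slack, email, or a webhook.

```python
def check_guardrail(spend, budget, warn_at=0.8):
    """Classify spend against a budget.

    Returns "breach" at or above the budget, "warning" at or above
    `warn_at` (a fraction of the budget), and "ok" otherwise.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    ratio = spend / budget
    if ratio >= 1.0:
        return "breach"
    if ratio >= warn_at:
        return "warning"
    return "ok"

print(check_guardrail(500.0, 1_000.0))    # ok
print(check_guardrail(850.0, 1_000.0))    # warning
print(check_guardrail(1_000.0, 1_000.0))  # breach
```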
Related: Anomaly detection, Pace to forecast