Cloud & AI Cost Glossary

Plain-English definitions of the terms behind cloud and AI cost management — from AI COGS and unit economics to anomaly detection, commitment discounts, and pace to forecast.

FinOps

FinOps (a portmanteau of "Finance" and "DevOps") is the practice of giving engineering, finance, and product teams shared, real-time accountability for variable cloud and AI spend. Instead of treating the bill as a finance problem discovered weeks later, FinOps pushes cost decisions to the people who incur the costs, at the moment they incur them, using shared data, budgets, and alerts.

Related: Showback and chargeback, Unit economics, Cost anomaly

AI COGS

AI COGS (Cost of Goods Sold) is the inference cost baked into a software product: the OpenAI, Anthropic, Bedrock, or other model spend consumed by each user interaction or feature. Tracking AI COGS lets a team calculate gross margin per product line and see how model usage affects unit economics, rather than burying inference cost in a single undifferentiated API bill.

Related: Gross margin, Unit economics, Cost per token

Cost anomaly

A cost anomaly is a sudden, statistically significant deviation in a service’s or provider’s spend from its historical baseline: for example, OpenAI spend doubling overnight because of a prompt bug or a runaway agent loop. Anomalies are the early-warning signal cost monitoring exists to catch, because they usually surface days before the invoice does.

Related: Anomaly detection, Burn rate, Budget guardrail

Anomaly detection

Anomaly detection is the automated process of learning each service’s normal spend pattern and flagging spend that deviates significantly from it. Effective detection accounts for weekly seasonality and growth trends, so it alerts on genuine spikes rather than on a predictable Monday-morning increase, and it delivers the alert (Slack, email, webhook) before the cost compounds.
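
As a rough illustration of seasonality-aware detection, the sketch below compares today's spend only against past spend on the same weekday and flags it when it sits more than a chosen number of standard deviations above that baseline. The function name, the data shape, and the threshold of 3 are assumptions made for this example; it does not model growth trends, which a production detector would also need to handle.

```python
from statistics import mean, stdev

def is_anomalous(history, today_spend, weekday, z_threshold=3.0):
    """Flag today's spend if it is far above the same-weekday baseline.

    history: list of (weekday, spend) tuples for past days, weekday 0-6.
    This is a simplified z-score check, not a full forecasting model.
    """
    same_day = [spend for day, spend in history if day == weekday]
    if len(same_day) < 3:
        return False  # too little data to form a baseline
    baseline = mean(same_day)
    spread = stdev(same_day)
    if spread == 0:
        return today_spend > baseline
    return (today_spend - baseline) / spread > z_threshold

# Hypothetical history: Mondays (0) run around $200, Tuesdays (1) around $100.
history = [
    (0, 200.0), (0, 210.0), (0, 190.0), (0, 205.0),
    (1, 100.0), (1, 105.0), (1, 95.0), (1, 102.0),
]
monday_flag = is_anomalous(history, 215.0, weekday=0)   # normal for a Monday
tuesday_flag = is_anomalous(history, 215.0, weekday=1)  # far above Tuesdays
```

The same $215 of spend is unremarkable on a Monday and a clear spike on a Tuesday, which is the point of keeping a per-weekday baseline.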

Related: Cost anomaly, Budget guardrail

Burn rate

Burn rate is how fast a team is spending over a given period, usually expressed per day or per month. For cloud and AI costs, daily burn rate is the most actionable view because it makes a mid-month spike visible immediately, instead of being averaged away in a monthly total.

Related: Runway, Pace to forecast

Runway

Runway is the amount of time a company can keep operating before it runs out of money, calculated as available cash divided by burn rate. Because cloud and AI spend is one of the largest variable costs for many software companies, an unnoticed spend spike directly shortens runway — which is why daily cost visibility is a runway-protection tool, not just a reporting one.
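
The arithmetic is simple enough to sketch. The figures below are invented for illustration, and the function assumes burn rate is expressed per month.

```python
def runway_months(available_cash, monthly_burn):
    """Runway in months: available cash divided by monthly burn rate."""
    if monthly_burn <= 0:
        raise ValueError("monthly_burn must be positive")
    return available_cash / monthly_burn

# Hypothetical company: $1.2M in cash, $100k monthly burn.
baseline = runway_months(1_200_000, 100_000)

# The same company after an unnoticed $20k/month cloud and AI spend increase.
after_spike = runway_months(1_200_000, 120_000)

months_lost = baseline - after_spike
```

In this made-up case a $20k monthly increase that goes unnoticed costs two months of runway.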

Related: Burn rate, Pace to forecast

Unit economics

Unit economics describes the direct revenue and costs tied to a single unit of a business — one customer, one API request, or one feature. For AI products, unit economics depend heavily on inference cost: if the model spend per active user grows faster than the revenue per user, the product becomes less profitable as it scales, even while top-line revenue rises.
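
To make the scaling effect concrete, here is a small sketch that projects per-user margin when model spend per user grows faster than revenue per user. All growth rates and dollar amounts are hypothetical.

```python
def project_margin_per_user(revenue, model_spend, revenue_growth,
                            spend_growth, periods):
    """Return per-user margin (revenue minus model spend) for each period.

    revenue and model_spend are per active user; growth rates are
    fractional increases applied once per period.
    """
    margins = []
    for _ in range(periods):
        margins.append(revenue - model_spend)
        revenue *= 1 + revenue_growth
        model_spend *= 1 + spend_growth
    return margins

# Hypothetical: $20 revenue and $5 model spend per user, with revenue
# growing 5% per period and model spend growing 30% per period.
margins = project_margin_per_user(20.0, 5.0, 0.05, 0.30, periods=6)
```

Revenue per user rises in every period here, yet the margin per user falls each time because inference cost grows faster.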

Related: AI COGS, Gross margin, Cost per request

Gross margin

Gross margin is revenue minus cost of goods sold (COGS), divided by revenue. For AI-powered software, inference cost (AI COGS) is an increasingly large component of COGS, so attributing model spend to the features and customers that drive it is what makes a true gross-margin number possible rather than a guess.
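
A minimal sketch of the formula, with AI COGS split out from the rest of COGS so its effect on margin is visible. The product figures are invented for illustration.

```python
def gross_margin(revenue, other_cogs, ai_cogs):
    """Gross margin as a fraction: (revenue - COGS) / revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - (other_cogs + ai_cogs)) / revenue

# Hypothetical product line: $10,000 revenue, $1,500 hosting and support
# COGS, $2,500 inference spend attributed to this product.
with_ai = gross_margin(10_000, 1_500, 2_500)

# The same product if inference cost were left out of COGS.
ignoring_ai = gross_margin(10_000, 1_500, 0)
```

Leaving the attributed inference spend out of COGS would overstate this product's margin by 25 percentage points.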

Related: AI COGS, Unit economics

Cost per token

Cost per token is the unit price of large-language-model usage, billed separately for input (prompt) and output (completion) tokens. Because output tokens usually cost several times more than input tokens, and because prompt size compounds across retries and long contexts, cost per token is the lever that most directly determines AI COGS.
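
The sketch below shows how input and output prices combine into the cost of one model call. The prices used ($3 and $15 per million tokens) are placeholders chosen for the example, not any provider's actual rates.

```python
def token_cost(input_tokens, output_tokens,
               input_price_per_million, output_price_per_million):
    """Cost in dollars of one model call, priced per million tokens."""
    return (input_tokens * input_price_per_million
            + output_tokens * output_price_per_million) / 1_000_000

# Hypothetical prices: $3 per million input tokens, $15 per million output.
single_call = token_cost(2_000, 500, 3.0, 15.0)

# The same call with a context four times longer (for example, a long
# chat history resent on every turn).
long_context = token_cost(8_000, 500, 3.0, 15.0)
```

With these placeholder prices, 500 output tokens cost more than 2,000 input tokens, and quadrupling the prompt more than doubles the cost of the call.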

Related: Cost per request, AI COGS

Cost per request

Cost per request is the average cost of serving a single API call or user action, including model tokens, retries, tool calls, and any downstream infrastructure. It is the most useful denominator for AI unit economics because it maps cleanly onto product behaviour: a feature that triggers five model calls per click costs roughly five times as much per use as one that triggers a single call, assuming the calls are of similar size.
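
One way to roll model calls, retries, and infrastructure into a single per-request figure is sketched below. The `ModelCall` type, the field names, and the prices are all assumptions for this example.

```python
from dataclasses import dataclass

@dataclass
class ModelCall:
    input_tokens: int
    output_tokens: int
    attempts: int = 1  # 1 means the call succeeded without retries

def request_cost(calls, input_price_per_million, output_price_per_million,
                 infra_cost=0.0):
    """Total dollar cost of one user action made up of several model calls.

    Each retry is assumed to resend the full prompt and be billed in full.
    """
    model_cost = sum(
        call.attempts * (call.input_tokens * input_price_per_million
                         + call.output_tokens * output_price_per_million)
        for call in calls
    ) / 1_000_000
    return model_cost + infra_cost

# Hypothetical feature comparison at $3 / $15 per million tokens.
call = ModelCall(input_tokens=2_000, output_tokens=500)
one_call_feature = request_cost([call], 3.0, 15.0)
five_call_feature = request_cost([call] * 5, 3.0, 15.0)
```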

Related: Cost per token, Unit economics

Egress cost

Egress cost is the fee a cloud provider charges to move data out of its network or across regions. It is a frequent source of surprise bills because it scales with traffic rather than with stored data, and it often hides inside an aggregate networking line item until something — a new integration, a misrouted backup — makes it spike.

Related: Cost anomaly

Idle resource cost

Idle resource cost is money spent on provisioned-but-unused capacity: oversized instances, forgotten dev environments, unattached storage volumes, or always-on resources that only need to run during business hours. Because idle resources accumulate silently and never trigger an error, they are typically found by cost review rather than by monitoring.
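
For the always-on case, the waste can be estimated with a few lines of arithmetic. The hourly rate and the 50-hour working week below are assumptions for illustration.

```python
HOURS_PER_WEEK = 24 * 7

def weekly_idle_cost(hourly_rate, hours_needed_per_week):
    """Dollars per week spent on hours when the resource is not needed."""
    if not 0 <= hours_needed_per_week <= HOURS_PER_WEEK:
        raise ValueError("hours_needed_per_week must be between 0 and 168")
    return hourly_rate * (HOURS_PER_WEEK - hours_needed_per_week)

# Hypothetical dev environment at $0.50/hour that is only used for
# 50 hours a week but runs around the clock.
idle_cost = weekly_idle_cost(0.50, 50)
always_on_cost = 0.50 * HOURS_PER_WEEK
idle_share = idle_cost / always_on_cost
```

In this example roughly 70% of the weekly bill for the environment pays for hours nobody uses.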

Related: Commitment discount

Commitment discount

A commitment discount is a reduced rate a cloud provider offers in exchange for a usage or spend commitment over a fixed term, typically one or three years. Examples include AWS Savings Plans and Reserved Instances, and committed-use discounts on GCP. The discount only pays off if committed capacity stays well utilised, so commitment decisions depend on accurate forecasts of baseline demand.
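
The "well utilised" condition can be made concrete with a simplified break-even model: the commitment is paid in full whether or not it is used, so it only beats on-demand pricing when utilisation exceeds one minus the discount. The 30% discount and unit counts below are hypothetical, and real commitment products have more moving parts than this sketch.

```python
def break_even_utilization(discount):
    """Fraction of committed capacity that must be used to break even."""
    return 1.0 - discount

def commitment_savings(on_demand_rate, discount, committed_units, used_units):
    """Savings versus on-demand for the usage the commitment covers.

    Negative values mean the commitment cost more than paying on demand.
    """
    committed_cost = committed_units * on_demand_rate * (1 - discount)
    covered_units = min(used_units, committed_units)
    on_demand_equivalent = covered_units * on_demand_rate
    return on_demand_equivalent - committed_cost

# Hypothetical 30% discount on 100 committed units at $1.00 on demand.
threshold = break_even_utilization(0.30)
well_used = commitment_savings(1.00, 0.30, 100, 90)   # 90% utilised
under_used = commitment_savings(1.00, 0.30, 100, 60)  # 60% utilised
```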

Related: Idle resource cost, Pace to forecast

Showback and chargeback

Showback and chargeback are two models for attributing shared cloud and AI cost to the teams, products, or customers that generate it. Showback reports each team’s cost for visibility and accountability; chargeback goes further and actually bills it to that team’s budget. Both depend on consistent cost allocation, usually through tagging.

Related: Cost allocation tagging, FinOps

Cost allocation tagging

Cost allocation tagging is the practice of labelling cloud resources with metadata — team, product, environment, customer — so that spend can be grouped and attributed instead of viewed only by service. Tag coverage is the foundation of showback, chargeback, and per-feature margin: untagged spend is unattributable spend.
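
A minimal sketch of tag-based allocation follows. The line-item shape (a dict with `cost` and `tags`) is an assumption for this example and does not correspond to any particular provider's billing export.

```python
from collections import defaultdict

def allocate_by_tag(line_items, tag_key):
    """Sum cost per tag value; items without the tag land in 'untagged'."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get(tag_key, "untagged")
        totals[owner] += item["cost"]
    return dict(totals)

def tag_coverage(line_items, tag_key):
    """Fraction of total cost that carries the given tag."""
    total = sum(item["cost"] for item in line_items)
    if total == 0:
        return 1.0
    tagged = sum(item["cost"] for item in line_items
                 if tag_key in item.get("tags", {}))
    return tagged / total

# Hypothetical billing line items.
items = [
    {"cost": 60.0, "tags": {"team": "search"}},
    {"cost": 30.0, "tags": {"team": "billing"}},
    {"cost": 10.0, "tags": {}},
]
by_team = allocate_by_tag(items, "team")
coverage = tag_coverage(items, "team")
```

The per-team totals are what a showback report would display, and the coverage figure shows how much of the bill that report can actually explain.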

Related: Showback and chargeback, Unit economics

Pace to forecast

Pace to forecast compares spend so far in a period against the projected end-of-period total, answering "are we on track to stay within budget?" while there is still time to act. Unlike a month-end variance report, a pace-to-forecast signal is forward-looking: a red pace on day 10 is an invitation to intervene, not a post-mortem.
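
One simple way to compute a pace signal is a linear run-rate projection, sketched below. This is an assumption made for illustration: it treats every day as equal and ignores seasonality, and the "red" and "green" labels are just example outputs.

```python
def projected_total(spend_to_date, day_of_period, days_in_period):
    """Extrapolate spend so far to the end of the period at the same rate."""
    if not 1 <= day_of_period <= days_in_period:
        raise ValueError("day_of_period must be within the period")
    return spend_to_date / day_of_period * days_in_period

def pace_status(spend_to_date, day_of_period, days_in_period, budget):
    """Return 'red' if the projection exceeds the budget, else 'green'."""
    projection = projected_total(spend_to_date, day_of_period, days_in_period)
    return "red" if projection > budget else "green"

# Hypothetical: $4,000 spent by day 10 of a 30-day month, $10,000 budget.
projection = projected_total(4_000, 10, 30)
status = pace_status(4_000, 10, 30, budget=10_000)
```

Only 40% of the budget is spent on day 10, yet the projection is already over budget, which a raw spend-to-date figure would not reveal.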

Related: Burn rate, Budget guardrail

Budget guardrail

A budget guardrail is a defined spend threshold that triggers a notification — or an automated action — when usage approaches or crosses it. Guardrails turn a budget from a number reviewed monthly into a live control: the team hears about a breach in Slack on the day it happens, not in next month’s invoice.
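
The threshold check itself can be sketched in a few lines. The 80% and 100% thresholds are assumed defaults for this example, and the notification or automated action is left out because it depends on the tooling in use.

```python
def guardrail_breaches(spend, budget, thresholds=(0.8, 1.0)):
    """Return the thresholds (as fractions of budget) that spend has reached."""
    return [t for t in sorted(thresholds) if spend >= t * budget]

# Hypothetical $1,000 monthly budget.
approaching = guardrail_breaches(850, 1_000)
crossed = guardrail_breaches(1_050, 1_000)
under = guardrail_breaches(500, 1_000)
```

A caller would run this check each time spend data refreshes and send an alert when a threshold appears in the result for the first time.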

Related: Anomaly detection, Pace to forecast

Cloud & AI Cost Management Glossary — StackSpend