Use this when you want to understand what "deployment cost correlation" means, how source-control cost attribution works, and whether it's worth wiring into your cost workflow.
The fast answer: Deployment cost correlation is the practice of linking a cost anomaly to the deployment — and the specific pull request — that most likely caused it. It works by matching the anomaly's service, environment, and start time against recent deployments, resolving those deployments to the PRs they shipped, and scoring each candidate on timing, service match, and cost-relevant code changes. It turns "the bill went up" into "this PR, deployed at this time, in this service, most likely caused it."
Most cost tools are good at telling you that spend changed. Far fewer connect that change to what your team shipped. Deployment cost correlation is the bridge between your cost data and your source control, and engineering-led teams are increasingly building it into how they operate their spend.
What is deployment cost correlation?
Deployment cost correlation is a form of source-control cost attribution: it attributes a cost change not to a provider or a category, but to a concrete engineering event — a deployment and the pull requests inside it.
The traditional attribution question is "which team or feature owns this cost?" Source-control attribution adds a sharper one: "which change caused this cost to move?" The first is about ownership; the second is about causality. You need both, but causality is what lets you actually fix a regression instead of just billing it to a team.
Why provider dashboards can't answer "what changed?"
Provider billing views — AWS Cost Explorer, GCP billing, the OpenAI usage dashboard — are built for reporting, not root-cause. They know your spend by service and day. They don't know:
- that you deployed `guide-generation` at 13:40 yesterday,
- that the deploy included a PR raising `max_tokens`,
- or that the prompt change is why output tokens per request jumped.
That knowledge lives in GitHub, in your CI/CD system, and in your engineers' heads. Deployment cost correlation is what joins those two worlds so the answer is in one place. For why the dashboards fall short more generally, see why provider dashboards aren't enough for anomaly detection.
How source-control cost attribution works
The pipeline is deterministic and explainable by design. Given an anomaly with a service, provider, environment, and start time:
- Find candidate deployments — successful deploys to the same service and environment before the anomaly start (a 6-hour lookback is a sensible default).
- Resolve deployed SHAs to changes — map each deployment's commit range to the pull requests it shipped since the previous successful deploy.
- Fetch the evidence — pull changed files, PR titles, bodies, and labels for each candidate.
- Match against code mappings — compare changed files to the mapping that links repository paths to services, so only relevant changes score highly.
- Score candidates deterministically — rank on timing, service match, code relevance, metric shape, and the presence of cost-relevant change types.
- Summarize the top candidates — optionally use an LLM, run only on the highest-scoring few, to summarize the PR evidence in plain language.
- Attach the result to the anomaly — persist ranked candidates and evidence so a human doesn't redo the work.
The important property is that the ranking is deterministic and auditable. An LLM, if used at all, only summarizes the evidence for the top candidates — it never invents the ranking.
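As a minimal sketch of the first pipeline step, the candidate search can be written as a plain filter over ingested deployment records. The `Deployment` and `Anomaly` classes, their field names, and the sample SHAs and timestamps are hypothetical and exist only for illustration; the 6-hour lookback is the default suggested above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Deployment:  # hypothetical record shape
    service: str
    environment: str
    sha: str
    deployed_at: datetime
    succeeded: bool


@dataclass(frozen=True)
class Anomaly:  # hypothetical record shape
    service: str
    environment: str
    started_at: datetime


def candidate_deployments(anomaly, deployments, lookback=timedelta(hours=6)):
    """Return successful deploys to the anomaly's service and environment
    that landed within `lookback` before the anomaly start, newest first."""
    window_start = anomaly.started_at - lookback
    matches = [
        d for d in deployments
        if d.succeeded
        and d.service == anomaly.service
        and d.environment == anomaly.environment
        and window_start <= d.deployed_at <= anomaly.started_at
    ]
    return sorted(matches, key=lambda d: d.deployed_at, reverse=True)


# Made-up sample data.
anomaly = Anomaly("guide-generation", "production", datetime(2024, 5, 2, 14, 0))
deployments = [
    Deployment("guide-generation", "production", "a1b2c3d", datetime(2024, 5, 2, 13, 40), True),
    # Eight hours before the anomaly: outside the lookback window.
    Deployment("guide-generation", "production", "e4f5a6b", datetime(2024, 5, 2, 6, 0), True),
    # Right time, wrong environment.
    Deployment("guide-generation", "staging", "c7d8e9f", datetime(2024, 5, 2, 13, 0), True),
    # Landed after the anomaly had already started.
    Deployment("guide-generation", "production", "0f1e2d3", datetime(2024, 5, 2, 14, 30), True),
]

candidates = candidate_deployments(anomaly, deployments)
print([d.sha for d in candidates])
```

Because the filter has no randomness and no model call, the same inputs always yield the same candidate list, which is what keeps the later ranking auditable.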
The scoring signals
Good correlation is about weighing evidence on both sides, not jumping to a verdict.
| Positive signals (likely cause) | Negative signals (likely not the cause) |
|---|---|
| Deployment landed before the anomaly start | Deployment happened after the anomaly started |
| Deployed service matches the anomalous service | No service mapping match |
| Changed files match service code mappings | Changed files are docs or tests only |
| Prompt, model, token-limit, retry, cache, queue, batch, or cron code changed | Cost increase is proportional to traffic growth |
| Metric shape matches the change type | Provider billing backfill explains the timing |
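One way to turn a subset of the signals in the table into a deterministic score is a fixed weight per signal, summed, with the list of matched signals kept as evidence. This is a sketch under stated assumptions: the weights, the keyword list, the path patterns, and the `Candidate` fields are all hypothetical, and a real system would tune them and would likely detect cost-relevant changes more carefully than a substring check.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

# Hypothetical weights, for illustration only (not tuned values).
WEIGHTS = {
    "deployed_before_anomaly": 30,
    "service_match": 25,
    "files_match_mapping": 20,
    "cost_relevant_change": 20,
    "docs_or_tests_only": -40,
    "traffic_proportional": -30,
}

# Crude, assumed heuristics for spotting cost-relevant and non-code changes.
COST_KEYWORDS = ("prompt", "model", "max_tokens", "retry", "cache", "queue", "batch", "cron")
NON_CODE_PATTERNS = ("docs/*", "*.md", "tests/*", "*_test.py")


@dataclass
class Candidate:  # hypothetical shape for one candidate PR
    pr_title: str
    changed_files: list
    deployed_before_anomaly: bool
    service_match: bool
    files_match_mapping: bool
    traffic_proportional: bool = False


def score(c):
    """Return (score, reasons): the summed weights of every signal that
    fired, plus the names of those signals as auditable evidence."""
    text = " ".join([c.pr_title, *c.changed_files]).lower()
    signals = {
        "deployed_before_anomaly": c.deployed_before_anomaly,
        "service_match": c.service_match,
        "files_match_mapping": c.files_match_mapping,
        "cost_relevant_change": any(k in text for k in COST_KEYWORDS),
        "docs_or_tests_only": bool(c.changed_files) and all(
            any(fnmatchcase(f, p) for p in NON_CODE_PATTERNS)
            for f in c.changed_files
        ),
        "traffic_proportional": c.traffic_proportional,
    }
    reasons = [name for name, fired in signals.items() if fired]
    return sum(WEIGHTS[name] for name in reasons), reasons


# Made-up candidates.
prompt_pr = Candidate(
    "Raise max_tokens for guide prompts",
    ["services/guide_generation/llm.py"],
    deployed_before_anomaly=True, service_match=True, files_match_mapping=True,
)
readme_pr = Candidate(
    "Update README", ["README.md"],
    deployed_before_anomaly=True, service_match=True, files_match_mapping=False,
)

ranked = sorted([prompt_pr, readme_pr], key=lambda c: score(c)[0], reverse=True)
for c in ranked:
    print(c.pr_title, score(c))
```

Returning the reasons alongside the number is the design choice that matters here: a reviewer can see which signals produced the rank, and any LLM summary only restates evidence that is already recorded.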
Language matters: candidates, not culprits
A correlation system should be careful about causality, both technically and culturally. The right vocabulary is "possible contributing change", "related deployed change", "candidate PR", "evidence suggests" — never "caused by", "culprit", or "responsible engineer".
This isn't just politeness. Correlation is probabilistic; the deploy that landed before a spike is a strong candidate, not a proven cause, until a human confirms it. Building blame into the tooling poisons the cost-awareness culture you're trying to create — engineers stop wanting cost signals attached to their names. The goal is to make cost a normal quality attribute reviewed alongside latency and reliability, which only works if attribution feels like debugging, not finger-pointing.
What you need to make it work
Three ingredients turn this from a manual investigation into an automatic one:
- Deployment ingestion — a record of what deployed where and when (service, environment, SHA, time). Without this, there's nothing to correlate against.
- A source-control connection — read-only access to repositories, pull requests, and deployments. Read-only matters: correlation needs to read your history, not write to it.
- Code mappings — the link from repository paths to services, so changed files resolve to the right cost center. This is the piece teams most often skip, and it's what makes the difference between precise and noisy candidates.
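To make the code-mappings ingredient concrete, here is a small sketch that resolves a PR's changed files to services using glob patterns. The mapping format, the paths, and the `search` service are assumptions for illustration; only `guide-generation` comes from the example earlier in this article.

```python
from fnmatch import fnmatchcase

# Hypothetical mapping from repository path patterns to service names.
# A shared library can map to more than one service.
CODE_MAPPINGS = [
    ("services/guide_generation/*", "guide-generation"),
    ("services/search/*", "search"),
    ("shared/llm_client/*", "guide-generation"),
    ("shared/llm_client/*", "search"),
]


def services_for_files(changed_files, mappings=CODE_MAPPINGS):
    """Return the set of services whose mapped paths cover any changed file."""
    return {
        service
        for path in changed_files
        for pattern, service in mappings
        if fnmatchcase(path, pattern)
    }


print(services_for_files(["services/guide_generation/prompts.py", "docs/changelog.md"]))
print(services_for_files(["shared/llm_client/retry.py"]))
print(services_for_files(["docs/changelog.md"]))
```

A PR that resolves to an empty set touches nothing mapped to a service, so it should rank low for any anomaly. Without the mapping, every PR in the deploy looks equally plausible, which is where noisy candidates come from.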
Where this fits in the broader workflow
Deployment cost correlation is the diagnosis step in a larger loop: detect the anomaly, correlate it to a change, assign the fix to the owning engineer, and confirm it stays fixed. On its own it gets you to a root cause faster; combined with issue-tracker sync, it becomes a closed cost-incident loop.
StackSpend implements this as a read-only, GitHub-first source-control integration that surfaces ranked related changes with their evidence directly on the anomaly — coexisting with the existing GitHub cost provider, not replacing it. GitLab and Bitbucket follow the same normalized adapter model. To see the whole loop, read cost incident response: from anomaly to root cause to resolved issue and how to turn cost anomalies into Jira and Linear tickets.
Practical takeaway
Deployment cost correlation answers the question every cost anomaly raises, "what did we ship?", by joining cost data to source control. It scores candidate PRs on timing, service match, and cost-relevant code, keeps the language probabilistic, and needs three things to work: deployment metadata, a read-only source-control connection, and code mappings. Done well, it turns root-cause analysis from a 30-minute cross-tool hunt into a glance.
For the surrounding product workflow, see cloud + AI cost monitoring and AI cost anomaly detection.
FAQ
What is the difference between cost attribution and deployment cost correlation?
Cost attribution maps spend to an owner — a team, feature, or customer. Deployment cost correlation maps a change in spend to the deployment and pull request that most likely caused it. Attribution answers "who owns this?"; correlation answers "what changed it?".
Does deployment cost correlation need write access to my repositories?
No. It only needs read access to repository metadata, pull requests, and deployments. Read-only access is enough to correlate history and is the safer permission model.
How accurate is correlating a deploy to a cost spike?
It produces ranked candidates with confidence levels, not certainties. Accuracy depends on having clean deployment metadata and code mappings. A human confirms the cause before it's treated as definitive.
Can this work with GitLab or Bitbucket, not just GitHub?
The model is provider-agnostic via a normalized adapter interface. GitHub is typically implemented first because of its App permission model; GitLab and Bitbucket follow the same approach.
Why does correlation avoid saying a PR "caused" a spike?
Because correlation is probabilistic until confirmed. Using "candidate" and "possible contributing change" keeps the analysis honest and protects a healthy, blame-free cost culture.