Guides
March 12, 2026
By Andrew Day

Query rewriting, decomposition, and retrieval routing

Improve retrieval quality by deciding when to rewrite the query, split it into parts, or route it to a different retrieval path before generation.


Use this when retrieval is inconsistent because the user query is vague, compound, or mismatched to how your corpus is organized.

The short answer: not every bad retrieval result is an embedding problem. Often the real issue is that the query needs rewriting, decomposition, or a different route before it hits the index.

What you will get in 9 minutes

  • A simple framework for rewrite vs decompose vs route
  • Examples of when each step helps
  • A worksheet for designing a better pre-retrieval layer
  • Metrics that isolate pre-retrieval quality from final answer quality

Use this when

  • Users ask multi-part or conversational questions
  • The same corpus contains very different document types
  • Retrieval sometimes works and sometimes fails for similar intent
  • The best answer often needs more than one retrieval path

The 60-second answer

| Problem | Best first move |
| --- | --- |
| Query is vague or slang-heavy | rewrite |
| Query contains multiple asks | decompose |
| Query belongs to one of several corpora or tools | route |

Do not jump straight to bigger prompts if the retrieval input is the real bottleneck.
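The table above can be sketched as a first-pass triage function. The trigger lists below (a slang set, a conjunction check, a corpus keyword map) are illustrative assumptions, not production rules; real systems often replace them with a small classifier.

```python
# First-pass triage: decide rewrite vs decompose vs route.
# SLANG and CORPUS_KEYWORDS are hypothetical examples, not real rules.
SLANG = {"asap", "lol", "w/", "b/c"}
CORPUS_KEYWORDS = {"ticket": "support_tickets", "policy": "policy_docs"}

def first_move(query: str) -> str:
    q = query.lower()
    if " and " in q or q.count("?") > 1:
        return "decompose"        # multiple asks in one query
    if any(w in SLANG for w in q.split()):
        return "rewrite"          # vague or slang-heavy
    if any(k in q for k in CORPUS_KEYWORDS):
        return "route"            # clearly belongs to one corpus
    return "retrieve-as-is"
```

Most queries should fall through to `retrieve-as-is`; the layer only intervenes when a trigger fires.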

Pattern 1: Query rewriting

Rewriting helps when the user query is:

  • informal
  • incomplete
  • phrased differently from the indexed material
  • missing stable domain language

Good rewrite goals:

  • preserve intent
  • improve retrievability

Bad rewrite goals:

  • guess the answer early
  • over-specify details the user did not ask for
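A minimal rewrite sketch, assuming a hand-built map from informal phrasing to the corpus's stable domain language (the `DOMAIN_TERMS` entries are hypothetical). Note it only normalizes wording; it never adds details the user did not ask for.

```python
# Rewrite sketch: normalize informal phrasing toward corpus vocabulary.
# DOMAIN_TERMS is a hypothetical synonym map; LLM-based rewriters are
# common in practice, but the goal is the same: preserve intent,
# improve retrievability.
DOMAIN_TERMS = {
    "billing page": "billing dashboard",
    "canceled": "subscription cancellation",
    "how much": "pricing",
}

def rewrite_query(query: str) -> str:
    q = query.lower().strip().rstrip("?")
    for informal, canonical in DOMAIN_TERMS.items():
        q = q.replace(informal, canonical)
    return q
```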

Pattern 2: Query decomposition

Decompose when one question contains several retrieval intents.

Examples:

  • “What changed in pricing and what does it mean for enterprise customers?”
  • “Which models are cheaper, and which still pass our coding evals?”

In those cases a single retrieval call often mixes unrelated evidence. Splitting the question lets the system retrieve and answer in smaller, cleaner parts.
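A decomposition sketch using the examples above. Splitting on a conjunction followed by a question word is a deliberate simplification; production systems often delegate this step to an LLM.

```python
import re

# Decomposition sketch: split a compound question into independent
# retrieval intents. The regex splits on "and" only when a question
# word follows, so ordinary conjunctions inside one intent survive.
def decompose(query: str) -> list[str]:
    parts = re.split(r",?\s+and\s+(?=(?:what|which|how|does)\b)",
                     query, flags=re.I)
    return [p.strip().rstrip("?") + "?" for p in parts if p.strip()]
```

Each part can then be retrieved and answered separately, avoiding a single call that mixes unrelated evidence.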

Pattern 3: Retrieval routing

Route when different sources require different retrieval methods.

Examples:

  • product docs vs support tickets
  • policy docs vs metrics dashboards
  • SQL-backed systems vs narrative corpora

Routing can decide:

  • which corpus to search
  • whether to use lexical, dense, or hybrid retrieval
  • whether the query should go to a tool instead of a retriever
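A routing sketch covering all three decisions at once: which corpus, which retrieval method, and whether the query goes to a tool instead of a retriever. The corpora, keyword triggers, and method names are assumptions for illustration; real routers are frequently learned classifiers.

```python
# Routing sketch: map a query to (corpus, method). "sql_tool" stands
# in for sending the query to a tool rather than a retriever.
# ROUTES is a hypothetical table, not a real configuration.
ROUTES = {
    "support_tickets": (["error", "ticket", "crash"], "lexical"),
    "product_docs":    (["configure", "api", "install"], "hybrid"),
    "metrics":         (["how many", "average", "count"], "sql_tool"),
}

def route(query: str) -> tuple[str, str]:
    q = query.lower()
    for corpus, (triggers, method) in ROUTES.items():
        if any(t in q for t in triggers):
            return corpus, method
    return "product_docs", "dense"   # default fallback route
```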

What a good pre-retrieval layer does

Before retrieval, decide:

  1. Does the query need normalization?
  2. Does it need splitting?
  3. Which source or method should receive it?

That layer often improves quality more cheaply than larger prompts or stronger generation models.
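The three decisions above compose into a small pipeline: normalize first, then split, then route each part. The `rewrite_query`, `decompose`, and `route` callables here are hypothetical helpers standing in for whatever implementations of the three patterns you plug in.

```python
# Pre-retrieval pipeline sketch: normalize, split, then route each part.
# Returns a retrieval plan: one (sub_query, corpus, method) per intent.
def pre_retrieval(query, rewrite_query, decompose, route):
    normalized = rewrite_query(query)                 # 1. normalization
    parts = decompose(normalized)                     # 2. splitting
    return [(part, *route(part)) for part in parts]   # 3. destination
```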

How to evaluate this layer

Measure:

  • rewritten-query retrieval lift
  • decomposed-query recall lift
  • routing accuracy
  • answer correctness after routing

If you only measure final answer quality, you cannot tell whether the improvement came from better routing or better generation.
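A sketch of the first metric, rewritten-query retrieval lift: run the raw and rewritten queries against the same index and compare recall@k on a labeled relevant set. This isolates pre-retrieval quality from generation quality; the same shape works for decomposition lift.

```python
# Recall@k over a labeled relevant set, then the lift from rewriting.
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / len(relevant) if relevant else 0.0

def rewrite_lift(raw_results, rewritten_results, relevant, k=5):
    # Positive lift means the rewrite retrieved more relevant docs.
    return (recall_at_k(rewritten_results, relevant, k)
            - recall_at_k(raw_results, relevant, k))
```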

Pre-retrieval worksheet

For one workflow, define:

  1. Common query shapes
  2. Which shapes need rewriting
  3. Which shapes should be decomposed
  4. Which sources or tools each shape should route to
  5. Which metric proves the layer helped

Common failure modes

  • rewriting every query when only a few need it
  • decomposing questions that should stay whole
  • using one retrieval route for every corpus
  • not logging the rewritten or routed query for debugging

How StackSpend helps

Pre-retrieval layers change workflow cost by altering how many searches, tool calls, and generation steps occur per request. Tracking spend by workflow helps show whether smarter routing reduced wasted retrieval and token volume.

What to do next

Continue in Academy

Build production LLM applications

Choose the right LLM pattern for structured data, retrieval, agents, chat, multimodal workflows, and ML-adjacent systems.

