Guides
March 11, 2026
By Andrew Day

Agentic tool-use patterns: planner, executor, and recovery

Build tool-using LLM systems with the lightest orchestration that works: fixed workflows first, planner and executor loops only when the task truly requires them.


Use this when your team is debating whether a workflow needs a true agent or just better orchestration.

The short answer: most teams should start with a fixed workflow plus tool calls, not an open-ended agent. Add planning and recovery loops only when the task genuinely needs branching, retries, or tool selection.

What you will get in 12 minutes

  • A practical ladder from fixed workflows to agent loops
  • When to separate planning from execution
  • A recovery pattern that keeps failures from spiraling
  • A worksheet for deciding how much autonomy a workflow should get

Use this when

  • The model needs to call APIs, search systems, or run actions
  • A single prompt is no longer enough
  • Users want the system to “figure it out” across multiple steps
  • You are worried about retries, loops, or silent failures

The 60-second answer

| Workflow shape | Best for |
| --- | --- |
| Fixed sequence with tool calls | predictable tasks with known steps |
| Router plus specialized flows | a small set of distinct task types |
| Planner plus executor | tasks that need decomposition or dynamic tool choice |
| Open-ended agent loop | rare; only when the environment is truly variable |

If you can enumerate the steps in advance, do not start with an agent.

Start with the lightest viable architecture

Level 1: Fixed workflow

The system decides the order of operations in code.

Example:

  1. classify the request
  2. fetch the relevant record
  3. validate the result
  4. generate the reply
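The four steps above can be sketched in a few lines; the helpers below are hypothetical stubs standing in for an LLM classification call, a plain API lookup, a deterministic validator, and a final generation call.

```python
# A fixed workflow: the order of operations is decided in code, not by the model.
# Every helper here is an illustrative stand-in, not a real API.

def classify(text: str) -> str:
    # Stand-in for a single LLM classification call with a fixed prompt.
    return "refund" if "refund" in text.lower() else "general"

def fetch_record(category: str) -> dict:
    # Stand-in for a plain API lookup; no model involved.
    return {"category": category, "status": "open"}

def validate(record: dict) -> bool:
    # Deterministic check in code, not a model judgment.
    return record.get("status") == "open"

def generate_reply(text: str, record: dict) -> str:
    # Stand-in for the final LLM call, given the fetched record as context.
    return f"Handled {record['category']} request"

def handle_request(text: str) -> str:
    category = classify(text)
    record = fetch_record(category)
    if not validate(record):
        return "escalated: validation failed"
    return generate_reply(text, record)
```

Because the sequence lives in code, each step can be unit-tested and observed on its own.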

This is the best default for:

  • support actions
  • document processing
  • account lookup
  • approval workflows

Why it wins:

  • easier to test
  • easier to observe
  • lower loop risk
  • lower token waste

Level 2: Router plus specialists

Use a router when one entry point serves a few distinct tasks.

Example:

  • refund request -> billing flow
  • account change -> profile flow
  • policy question -> retrieval flow
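A router like this can be sketched as a dispatch table; the flows and the keyword classifier below are illustrative stand-ins (in production the label would come from one constrained LLM classification call).

```python
# Router plus specialists: one cheap classification step dispatches to a
# fixed flow per task type. Flow names and handlers are illustrative.

def billing_flow(text: str) -> str:
    return "billing handled"

def profile_flow(text: str) -> str:
    return "profile handled"

def retrieval_flow(text: str) -> str:
    return "answered from policy docs"

ROUTES = {
    "refund_request": billing_flow,
    "account_change": profile_flow,
    "policy_question": retrieval_flow,
}

def route(text: str) -> str:
    # Keyword stub standing in for an LLM call constrained to known labels.
    if "refund" in text:
        label = "refund_request"
    elif "account" in text or "email" in text:
        label = "account_change"
    else:
        label = "policy_question"
    return ROUTES[label](text)
```

The dispatch table keeps the set of allowed behaviors explicit: adding a task type means adding a label and a flow, not loosening the system's freedom.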

This is often enough to create the “agentic” feel people want without giving the system too much freedom.

Level 3: Planner plus executor

Use this when the system must break a task into substeps or choose tools dynamically.

Good examples:

  • multi-step research over several sources
  • reconciling data from multiple systems
  • workflows where missing information changes the next action

A strong pattern is:

  1. planner proposes steps
  2. executor performs one step at a time
  3. validator checks whether the result is usable
  4. recovery logic decides retry, alternative path, or escalation

Keep the planner from calling tools directly. That separation makes failures easier to reason about.
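One way to enforce that separation is to have the planner emit only step names while the executor is the sole component that touches tools. This sketch is illustrative; the plan, tools, and validator are hypothetical stand-ins.

```python
# Planner/executor split: the planner proposes steps, the executor runs them
# one at a time, a validator checks each result. All names are illustrative.

def plan(task: str) -> list[str]:
    # Stand-in for an LLM planning call that returns step names only,
    # never tool invocations.
    return ["search", "reconcile", "summarize"]

TOOLS = {
    "search": lambda state: {**state, "hits": 3},
    "reconcile": lambda state: {**state, "clean": True},
    "summarize": lambda state: {**state, "summary": "done"},
}

def is_usable(step: str, state: dict) -> bool:
    # Validator: a deterministic per-step check in code.
    return state.get("hits", 1) > 0

def run(task: str) -> dict:
    state: dict = {"task": task}
    for step in plan(task):
        state = TOOLS[step](state)       # executor performs one step at a time
        if not is_usable(step, state):   # recovery decision point
            state["escalated_at"] = step
            break
    return state
```

When the planner cannot execute anything, a bad plan fails loudly at the executor or validator instead of silently mutating state.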

Recovery loops matter more than planning loops

Teams often over-focus on planning and under-design recovery.

Most production pain comes from:

  • tool timeouts
  • malformed tool arguments
  • missing permissions
  • contradictory retrieved data
  • repeated retries with no new information

Recovery should answer three questions:

  1. Can the same step be retried safely?
  2. Is there a fallback tool or path?
  3. When should the workflow stop and escalate?

If you cannot answer those clearly, the workflow is not ready for more autonomy.

A practical recovery policy

Use a small state machine:

  • success
  • retry_once
  • fallback
  • ask_user
  • escalate

That is usually better than “let the model keep trying.”

Tool-use design rules

  • Keep tool descriptions concrete and narrow.
  • Validate tool inputs in code.
  • Return structured tool outputs, not prose.
  • Record step count and retry count for every run.
  • Limit loop depth explicitly.

Anthropic's tool-use guidance is directionally helpful here: tool use works best when the model has a clear contract for what each tool does and when it should use it.
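Two of the rules above, validating tool inputs in code and returning structured outputs, can be shown with a single hypothetical tool; the `ord_` ID format and field names are assumptions for illustration.

```python
# Validate model-formed arguments in code before touching the real system,
# and return structured results rather than prose. Tool and schema are
# illustrative, not a real API.

def lookup_order(args: dict) -> dict:
    order_id = args.get("order_id")
    # Never trust arguments the model constructed; check shape in code.
    if not isinstance(order_id, str) or not order_id.startswith("ord_"):
        return {"ok": False, "error": "invalid_order_id"}  # structured, not prose
    return {"ok": True, "order_id": order_id, "status": "shipped"}
```

A structured error like `{"ok": False, "error": "invalid_order_id"}` gives the recovery logic something to branch on; a prose apology does not.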

How to evaluate an agentic workflow

Do not grade it only on “did the user get an answer?”

Measure:

  • task success rate
  • tool success rate
  • recovery success rate
  • escalation correctness
  • average step count
  • token or cost per successful task

If a more autonomous workflow improves completion but doubles step count and support review load, it may be the wrong design.
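Recording those metrics per run makes the trade-off measurable; this sketch assumes a simple flat-rate token price and illustrative field names.

```python
# Per-run metrics record so cost per successful task can be computed.
# Field names and the flat token price are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class RunRecord:
    task_success: bool
    tool_calls: int
    tool_failures: int
    recoveries: int
    steps: int
    tokens: int

def cost_per_success(runs: list[RunRecord], price_per_token: float) -> float:
    successes = sum(r.task_success for r in runs)
    spend = sum(r.tokens for r in runs) * price_per_token
    return spend / successes if successes else float("inf")
```

Tracking spend against *successful* tasks, not all tasks, is what exposes a design whose retries quietly double the bill.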

Copyable autonomy selector

Before building, answer:

  1. Are the steps mostly known in advance?
  2. Are tool contracts stable?
  3. Is the cost of a wrong action high?
  4. Can failed steps be retried safely?
  5. Is human escalation available?

Interpretation:

  • mostly yes to known steps -> fixed workflow
  • mixed task types -> router
  • unknown substeps but recoverable -> planner plus executor
  • high-risk actions with unclear recovery -> do not use an open-ended agent

Common failure modes

  • using an agent where a workflow would do
  • letting the planner execute actions directly
  • missing loop limits
  • no explicit fallback path
  • no measurement of cost per completed task

How StackSpend helps

Agentic systems can hide cost growth inside retries, longer sessions, and unnecessary tool loops. Tracking spend by workflow makes it easier to see whether a new “agent” improved useful completion or just increased model and tool costs.

