Use this when your team is debating whether a workflow needs a true agent or just better orchestration.
The short answer: most teams should start with a fixed workflow plus tool calls, not an open-ended agent. Add planning and recovery loops only when the task genuinely needs branching, retries, or tool selection.
What you will get in 12 minutes
- A practical ladder from fixed workflows to agent loops
- When to separate planning from execution
- A recovery pattern that keeps failures from spiraling
- A worksheet for deciding how much autonomy a workflow should get
Use this when
- The model needs to call APIs, search systems, or run actions
- A single prompt is no longer enough
- Users want the system to “figure it out” across multiple steps
- You are worried about retries, loops, or silent failures
The 60-second answer
| Workflow shape | Best for |
| --- | --- |
| Fixed sequence with tool calls | predictable tasks with known steps |
| Router plus specialized flows | a small set of distinct task types |
| Planner plus executor | tasks that need decomposition or dynamic tool choice |
| Open-ended agent loop | rare; only when the environment is truly variable |
If you can enumerate the steps in advance, do not start with an agent.
Start with the lightest viable architecture
Level 1: Fixed workflow
The system decides the order of operations in code.
Example:
- classify the request
- fetch the relevant record
- validate the result
- generate the reply
This is the best default for:
- support actions
- document processing
- account lookup
- approval workflows
Why it wins:
- easier to test
- easier to observe
- lower loop risk
- lower token waste
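A fixed workflow is just ordinary code. A minimal sketch of the four-step example above, where `classify`, `fetch_record`, `validate`, and `generate_reply` are hypothetical stand-ins for real model and API calls:

```python
# Fixed workflow sketch: the order of operations lives in code, not in the
# model. Each helper below is a placeholder for a real model or API call.

def classify(request: str) -> str:
    # In practice: a cheap model call or a rules engine.
    return "billing" if "refund" in request else "general"

def fetch_record(category: str) -> dict:
    # In practice: a database or API lookup.
    return {"category": category, "status": "active"}

def validate(record: dict) -> bool:
    # In practice: schema and business-rule checks.
    return record.get("status") == "active"

def generate_reply(request: str, record: dict) -> str:
    # In practice: a model call with the record as context.
    return f"Handled {record['category']} request."

def handle_request(request: str) -> str:
    category = classify(request)            # classify the request
    record = fetch_record(category)         # fetch the relevant record
    if not validate(record):                # validate the result
        raise ValueError("record failed validation")
    return generate_reply(request, record)  # generate the reply
```

Because the sequence is explicit, each step can be unit-tested and observed on its own.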
Level 2: Router plus specialists
Use a router when one entry point serves a few distinct tasks.
Example:
- refund request -> billing flow
- account change -> profile flow
- policy question -> retrieval flow
This is often enough to create the “agentic” feel people want without giving the system too much freedom.
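A router can be a plain dispatch table. A sketch, assuming hypothetical flow functions and an intent label produced upstream by a classifier:

```python
# Router sketch: one entry point dispatches to a small set of specialized
# flows. The flow functions here are illustrative placeholders.

def billing_flow(request: str) -> str:
    return "billing handled"

def profile_flow(request: str) -> str:
    return "profile handled"

def retrieval_flow(request: str) -> str:
    return "policy answered"

ROUTES = {
    "refund": billing_flow,
    "account_change": profile_flow,
    "policy_question": retrieval_flow,
}

def route(intent: str, request: str) -> str:
    # In practice, `intent` comes from a classifier model call.
    flow = ROUTES.get(intent)
    if flow is None:
        return "escalate to human"  # unknown intent: fail closed
    return flow(request)
```

Failing closed on unknown intents keeps the router from improvising.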
Level 3: Planner plus executor
Use this when the system must break a task into substeps or choose tools dynamically.
Good examples:
- multi-step research over several sources
- reconciling data from multiple systems
- workflows where missing information changes the next action
A strong pattern is:
- planner proposes steps
- executor performs one step at a time
- validator checks whether the result is usable
- recovery logic decides retry, alternative path, or escalation
Keep the planner from calling tools directly. That separation makes failures easier to reason about.
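The pattern above can be sketched as a loop where only the executor touches tools. All function bodies here are stubs standing in for model calls, tool dispatch, and validation logic:

```python
# Planner/executor sketch. The planner only proposes steps; the executor is
# the only component that performs actions. Names are illustrative.

def plan(task: str) -> list[str]:
    # In practice: a model call that returns a proposed step list.
    return ["search", "summarize"]

def execute(step: str) -> str:
    # In practice: dispatch to a real tool. The planner never calls this.
    return f"result of {step}"

def is_usable(result: str) -> bool:
    # In practice: schema checks or a validator model.
    return result.startswith("result")

def run(task: str, max_steps: int = 10) -> list[str]:
    results = []
    for step in plan(task)[:max_steps]:   # explicit loop limit
        result = execute(step)            # executor performs one step
        if not is_usable(result):         # validator checks the result
            break                         # recovery logic decides what happens next
        results.append(result)
    return results
```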
Recovery loops matter more than planning loops
Teams often over-focus on planning and under-design recovery.
Most production pain comes from:
- tool timeouts
- malformed tool arguments
- missing permissions
- contradictory retrieved data
- repeated retries with no new information
Recovery should answer three questions:
- Can the same step be retried safely?
- Is there a fallback tool or path?
- When should the workflow stop and escalate?
If you cannot answer those clearly, the workflow is not ready for more autonomy.
A practical recovery policy
Use a small state machine:
success → retry_once → fallback → ask_user → escalate
That is usually better than “let the model keep trying.”
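One way to make the policy concrete is a transition function over a small set of states. The states and inputs below are illustrative, not a fixed API:

```python
# Recovery state machine sketch:
# success -> retry_once -> fallback -> ask_user -> escalate.

from enum import Enum, auto

class Recovery(Enum):
    SUCCESS = auto()
    RETRY_ONCE = auto()
    FALLBACK = auto()
    ASK_USER = auto()
    ESCALATE = auto()

def next_action(succeeded: bool, retries: int,
                has_fallback: bool, user_available: bool) -> Recovery:
    if succeeded:
        return Recovery.SUCCESS
    if retries == 0:
        return Recovery.RETRY_ONCE   # one safe retry, never unbounded
    if has_fallback:
        return Recovery.FALLBACK     # alternative tool or path
    if user_available:
        return Recovery.ASK_USER     # request the missing information
    return Recovery.ESCALATE         # stop and hand off to a human
```

Encoding the policy this way makes the stopping condition explicit and testable, instead of leaving it implicit in a prompt.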
Tool-use design rules
- Keep tool descriptions concrete and narrow.
- Validate tool inputs in code.
- Return structured tool outputs, not prose.
- Record step count and retry count for every run.
- Limit loop depth explicitly.
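The second rule, validating tool inputs in code, can be as simple as a checker that runs before the tool call. The tool name and schema here are hypothetical:

```python
# Sketch of validating model-produced tool arguments in code before the
# tool is called. `refund` and its schema are hypothetical examples.

def validate_refund_args(args: dict) -> list[str]:
    errors = []
    order_id = args.get("order_id")
    if not isinstance(order_id, str) or not order_id:
        errors.append("order_id must be a non-empty string")
    amount = args.get("amount")
    if not isinstance(amount, (int, float)) or amount <= 0:
        errors.append("amount must be a positive number")
    return errors  # empty list means the call is safe to make
```

Returning structured errors also gives the recovery loop something concrete to act on, instead of a free-text failure message.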
Anthropic's tool-use guidance is directionally helpful here: tool use works best when the model has a clear contract for what each tool does and when it should use it.
How to evaluate an agentic workflow
Do not grade it only on “did the user get an answer?”
Measure:
- task success rate
- tool success rate
- recovery success rate
- escalation correctness
- average step count
- token or cost per successful task
If a more autonomous workflow improves completion but doubles step count and support review load, it may be the wrong design.
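One lightweight way to keep these metrics honest is to record them per run and derive cost per successful task. The field names below are illustrative, not a standard schema:

```python
# Per-run metrics sketch for evaluating an agentic workflow.
from dataclasses import dataclass

@dataclass
class RunMetrics:
    task_succeeded: bool
    tool_calls: int
    tool_failures: int
    escalated: bool
    steps: int
    cost_usd: float

def cost_per_success(runs: list[RunMetrics]) -> float:
    # Total spend divided by successful completions, not by raw runs.
    successes = sum(r.task_succeeded for r in runs)
    total = sum(r.cost_usd for r in runs)
    return total / successes if successes else float("inf")
```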
Copyable autonomy selector
Before building, answer:
- Are the steps mostly known in advance?
- Are tool contracts stable?
- Is the cost of a wrong action high?
- Can failed steps be retried safely?
- Is human escalation available?
Interpretation:
- mostly yes to known steps -> fixed workflow
- mixed task types -> router
- unknown substeps but recoverable -> planner plus executor
- high-risk actions with unclear recovery -> do not use an open-ended agent
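The selector above can be written down as a function, which makes the team's decision rule explicit and reviewable. The answers are booleans and the precedence shown is one reasonable reading of the interpretation list, not the only one:

```python
# Autonomy selector sketch: encodes the worksheet interpretation above.

def choose_architecture(known_steps: bool, mixed_task_types: bool,
                        recoverable: bool, high_risk: bool) -> str:
    if high_risk and not recoverable:
        return "no open-ended agent"   # unclear recovery plus real risk
    if known_steps:
        return "fixed workflow"        # steps mostly known in advance
    if mixed_task_types:
        return "router"                # a few distinct task types
    if recoverable:
        return "planner plus executor" # unknown substeps but recoverable
    return "no open-ended agent"
```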
Common failure modes
- using an agent where a workflow would do
- letting the planner execute actions directly
- missing loop limits
- no explicit fallback path
- no measurement of cost per completed task
How StackSpend helps
Agentic systems can hide cost growth inside retries, longer sessions, and unnecessary tool loops. Tracking spend by workflow makes it easier to see whether a new “agent” improved useful completion or just increased model and tool costs.