Agentic Architecture Patterns
From single agents juggling tools to orchestrators directing fleets of specialists. Learn the tradeoffs between swarms, pipelines, and hierarchies.
What is an Agent?
At its core, an agent is an LLM that calls tools. The design of how agents are structured and how they communicate with each other has a significant impact on reliability, cost, and performance.
An agent = an LLM + tools + a loop. The model reasons, decides which tool to call, observes the result, and repeats until it has enough information to respond.
Single Agent Systems
A single agent has access to all available tools and handles everything on its own.
flowchart TD
U([User]) --> A[Agent]
A --> T1[Tool: Search]
A --> T2[Tool: Calculator]
A --> T3[Tool: Email]
A --> T4[Tool: Database]
A --> T5[Tool: Calendar]
T1 & T2 & T3 & T4 & T5 --> A
A --> R([Response])
Problems
- Too many tools leads to poor decision-making. The sweet spot is roughly 5 to 10 tools per agent.
- As tools increase, so does context size, overwhelming the context window and increasing hallucinations.
- A single agent struggles to handle multiple specialization areas (e.g., planning, research, math) simultaneously.
Real-world examples
Customer Support Bot
A support chatbot that can look up orders, check FAQs, and issue refunds, all within a small, well-scoped tool set.
Early Copilot Chat
A coding assistant that answers questions, runs code, and reads files, all within a narrow, well-defined context.
Productivity Assistant
A personal assistant that manages your calendar, sends emails, and sets reminders.
Network of Agents (Swarm)
Each agent has its own tools, and agents communicate by deciding who acts next, with no central controller.
flowchart LR
U([User]) --> A1
A1["Agent 1\nResearcher"] -->|routes to| A2["Agent 2\nWriter"]
A2 -->|routes to| A3["Agent 3\nMath Expert"]
A3 --> A1
A2 -->|routes to| A4["Agent 4\nReviewer"]
A4 --> A1
A4 --> A2
A1 --> T1[Search]
A2 --> T2[Draft]
A3 --> T3[Calculator]
A4 --> T4[Critic]
A4 --> R([Response])
The term is borrowed from swarm intelligence (think ant colonies). It refers to a decentralized, multi-agent system where coordination emerges from agent-to-agent interactions rather than a central authority. It is a loose term with no single agreed-upon definition.
Problem
- Any agent can route to any other agent at any time. The lack of centralization leads to unreliable results, longer execution times, and higher costs.
Real-world examples
OpenAI Research Swarms
Early experiments with agent swarms for open-ended research, where agents self-organized to divide and tackle subtasks without a central controller.
Stanford Smallville
Generative game AI where NPCs coordinate behaviors with each other without a central controller, as demonstrated in Stanford's simulation paper.
Multi-Agent Debate Systems
Experimental systems where multiple LLM agents argue positions and respond to each other, using peer-to-peer debate to arrive at a consensus.
Supervisor Agent
A single supervisor agent is responsible for routing tasks to specialized subagents. The supervisor thinks about who to call next; subagents focus solely on doing their job.
flowchart TD
U([User]) --> S
S[Supervisor Agent] -->|delegates| A1[Researcher Agent]
S -->|delegates| A2[Writer Agent]
S -->|delegates| A3[Math Agent]
A1 --> T1[Search Tool]
A2 --> T2[Draft Tool]
A3 --> T3[Calculator Tool]
A1 & A2 & A3 -->|results| S
S --> R([Response])
- Subagents can also be treated as tools within a larger system.
- Centralized control leads to more predictable and reliable behavior.
Real-world examples
LangGraph Multi-Agent
The recommended multi-agent pattern: a top-level orchestrator routes between a researcher, writer, and critic agent to produce long-form content.
Enterprise Report Pipeline
A supervisor delegates to a data-fetching agent, a formatting agent, and an email-sending agent to generate and distribute reports automatically.
Devin
A planner agent coordinates subagents handling file editing, terminal commands, and browser testing to resolve software engineering tasks end-to-end.
Hierarchical Approach
The supervisor model extended recursively: a supervisor spawns subagents, which can themselves act as supervisors with their own subagents. This allows agents to organize into specialized clusters, handling complex nested workflows.
flowchart TD
U([User]) --> S
S[Top-Level Supervisor] --> S1[Research Supervisor]
S[Top-Level Supervisor] --> S2[Engineering Supervisor]
S1 --> A1[Web Search Agent]
S1 --> A2[Document Agent]
S2 --> A3[Code Agent]
S2 --> A4[Test Agent]
S2 --> A5[Deploy Agent]
A1 & A2 -->|results| S1
A3 & A4 & A5 -->|results| S2
S1 & S2 -->|results| S
S --> R([Response])
Real-world examples
AutoGPT / OpenDevin
A top-level goal gets broken into subgoals, each managed by its own sub-supervisor with specialized workers. Classic recursive delegation.
Software Release Pipeline
An AI system managing an entire software release: a top-level agent coordinates a testing supervisor (unit, integration, e2e) alongside a deployment supervisor (staging and production).
Multi-Department Assistant
A company-wide orchestrator delegates to department supervisors (HR, Finance, Engineering), each running their own subagents for domain-specific tasks.
Fully Custom Architecture (most common)
In practice, the most effective architecture is one tailored to the specific domain. Custom architectures are the most common choice precisely because no single pattern fits every use case.
flowchart TD
U([User]) --> G[Gateway / Router]
G -->|simple query| A1[Single Agent]
G -->|complex task| S[Supervisor]
G -->|data question| DB[(Database Tool)]
S --> A2[Researcher]
S --> A3[Writer]
A2 --> WS[Web Search]
A2 --> RAG[RAG Pipeline]
A1 & S & DB -->|results| AGG[Aggregator]
AGG --> R([Response])
Real-world examples
Cursor / AI IDEs
A custom blend of single-agent tool use, retrieval, and multi-step planning tuned specifically for the code editing context. No standard pattern covers all of it.
Perplexity AI
Combines web search, retrieval, and generation in a pipeline purpose-built for its search-and-synthesize workflow. It does not fit neatly into any standard architecture.
Production AI at Scale
Most production AI at companies like Stripe, Notion, or LinkedIn uses bespoke orchestration logic designed around their specific data sources, APIs, and user workflows.
Additional Nuances
When one agent calls another, there are two dimensions of choice in how they communicate. These choices affect transparency, debuggability, and cost.
| Dimension | Option A | Option B |
|---|---|---|
| What gets passed | Full graph state | Just tool parameters |
| What gets returned | Tool calls, reasoning, and final response | Final response only |
Passing full graph state gives subagents complete context and makes the system easier to debug, but increases token usage on every call. Passing just tool parameters keeps communication lean and focused, at the cost of less context for the receiving agent.