AI-Native Engineering
May 29, 2026 · 8 min read

How to Design AI Agents: Think Very Stupid Employees

The easiest way to think about agent design isn't to build one brilliant generalist. It's to hire a team of narrow, slightly dim specialists who each do exactly one job, and nothing else.

Oshri Cohen

Chief Product & Technology Officer

One job per agent

The first instinct everyone has when designing an AI agent is to make it brilliant. One agent, infinite context, every tool, every responsibility, a digital genius that can do the whole job. It feels efficient. It is, in practice, the single most reliable way to build something that fails in ways you can't predict.

Here's the mental model that fixes it: design every agent like a very stupid employee who can only do one simple task. Not a star hire. Not a Swiss Army knife. A narrow, slightly dim specialist who is excellent at exactly one thing and is given no opportunity to be wrong about anything else.

Once you accept that framing, almost every hard decision in agent design becomes obvious.

The more you ask, the more they get wrong

An LLM doesn't fail gracefully when you overload it. It fails confidently. Pile six responsibilities onto one agent and you don't get an agent that's 85% good at all six. You get one that's great at the first thing you mentioned, vague on the next two, and quietly hallucinating the rest, all delivered in the same self-assured tone.

Scope is the lever. Every extra instruction, tool, and "oh, and also handle…" is another opening for the model to wander off. The probability of a clean result is roughly the product of the probabilities of getting each sub-task right, so adding responsibilities doesn't add up the risk; it multiplies it.
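To put rough numbers on that multiplication, here is a small sketch. It assumes every sub-task succeeds independently and with the same probability, which is a simplification, and the 95% figure is purely illustrative, not a measurement of any model.

```python
def clean_run_probability(per_task_success: float, num_tasks: int) -> float:
    """Chance that every sub-task succeeds, assuming independent sub-tasks."""
    return per_task_success ** num_tasks


# Illustrative only: 95% per sub-task is an assumed figure.
for n in (1, 3, 6, 10):
    chance = clean_run_probability(0.95, n)
    print(f"{n:>2} responsibilities -> {chance:.0%} chance of a clean result")
```

Under these assumptions, an agent that is 95% reliable at each of six responsibilities delivers a fully clean result only about 74% of the time, and at ten responsibilities it is down to roughly 60%.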

An overloaded agent doesn't get a little worse at everything. It stays confident while quietly getting things wrong.

The fix isn't a smarter model or a longer prompt. It's a smaller job. A dim employee with one crisp task and a tidy desk outperforms a genius buried under ten conflicting priorities, and so does the agent.

Synthetic resources are still a team

It helps to stop thinking of agents as software and start thinking of them as staff. Your humans are your organic resources. Your agents are synthetic resources. Both are workers you assign, manage, and hold accountable. The difference is that synthetic resources are cheap, fast, tireless, and dumber than they look.

And like any team, you don't get results by hiring one omnicompetent hero. You draw an org chart. You write narrow job descriptions. You decide who is allowed to talk to whom, who hands off to whom, and above all, who is not allowed to touch a given decision. Designing an agentic system is an org-design problem wearing a software costume.

Give every agent one job title and one definition of done.
Constrain its inputs to only what that job needs, no more.
Constrain its tools to only the actions that job performs.
Make handoffs explicit: a finished piece of work, passed to the next role.
If you can't write the job description in a sentence, the role is too big.
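One way to make those rules concrete is to write each role down as data before writing any prompt. The sketch below is a hypothetical shape, not a standard: `AgentSpec`, its field names, and the thresholds in `scope_warnings` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentSpec:
    """One role on the org chart. Field names are illustrative."""
    title: str
    job: str                            # the one-sentence job description
    definition_of_done: str             # the finished artifact it hands off
    inputs: tuple[str, ...]             # the only context it may see
    tools: tuple[str, ...] = ()         # the only actions it may take
    hands_off_to: Optional[str] = None  # the next role, if any


def scope_warnings(spec: AgentSpec, max_words: int = 30) -> list[str]:
    """Crude checks for 'the role is too big'. The thresholds are arbitrary."""
    warnings = []
    sentences = [s for s in spec.job.split(".") if s.strip()]
    if len(sentences) > 1:
        warnings.append("job description is more than one sentence")
    if len(spec.job.split()) > max_words:
        warnings.append("job description is too long")
    if not spec.definition_of_done.strip():
        warnings.append("no definition of done")
    return warnings


line_editor = AgentSpec(
    title="Line editor",
    job="Tighten the language of one scene without changing plot, facts, or character.",
    definition_of_done="The same scene, with tighter sentences.",
    inputs=("scene_draft",),
)
overloaded = AgentSpec(
    title="Author",
    job="Write the book. Also check the history. Also keep characters consistent.",
    definition_of_done="",
    inputs=("premise",),
)

print(scope_warnings(line_editor))  # no warnings
print(scope_warnings(overloaded))   # two warnings
```

The checks are deliberately dumb. Their value is that a role which fails them gets split before it ever reaches a model.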

A worked example: a team that writes a book

Say you want agents to help write a historical novel. The tempting design is one "author agent": give it the premise and let it write the book. It will produce something fluent and plausible that's also full of invented history, characters who change personality between chapters, and a plot that forgets its own first act.

Now design it like a publishing house staffed by narrow specialists instead. Think about every role a real book actually requires, and give each one a single seat at the table:

The roles

Story architect, owns structure: the outline, act breaks, and pacing. It never writes prose. It decides what happens and in what order.
Character keeper, owns the cast: each character's voice, motivation, and continuity. Its only job is to keep people consistent from chapter to chapter.
History expert, owns factual grounding for the era and place the book is set in. Dates, customs, technology, what people ate and wore. Nothing else.
Prose writer, takes a single beat from the architect, the relevant characters, and the vetted facts, and writes that scene. Just that scene.
Continuity editor, reads the assembled draft against the outline and the character bible, and flags only contradictions.
Line editor, tightens language. It doesn't change plot, facts, or character. It makes sentences better.

None of these agents is smart. Each is almost insultingly narrow. But together they produce a coherent book, because no single one of them is ever asked to hold the entire problem in its head at once.

Constrained resources are how you kill hallucination

Look closely at the history expert, because it's where the whole philosophy pays off. A general writing agent hallucinates history because it's guessing from whatever it half-remembers while juggling plot and prose. It has every reason to make something up and no reason not to.

The history expert agent is built so it physically can't do that. You don't hand it the plot. You hand it a constrained, curated corpus, a vetted set of sources for that exact era and region, and one instruction: answer questions about this period, grounded only in these materials, and if the answer isn't here, say so. The job is narrow and so is everything you give it to work with. There is simply nothing else for it to do.
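As a rough sketch of that constraint, the code below hands the agent nothing but a curated corpus and refuses before any model is called when the corpus has nothing relevant. Everything here is an assumption for illustration: the corpus entries are placeholders, the keyword-overlap retrieval is a stand-in for whatever search you would actually use, and `ask_model` is a hypothetical callable wrapping your model of choice.

```python
import re
from typing import Callable, Optional

NOT_IN_SOURCES = "Not covered by the provided sources."

# Stand-in corpus: placeholder passages, not real historical material.
CORPUS = {
    "source-a": "Placeholder passage about bread and ale at a harbour market.",
    "source-b": "Placeholder passage about wool cloaks worn by dock workers.",
}


def terms(text: str) -> set[str]:
    """Lowercased words longer than three letters, a crude relevance signal."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3}


def retrieve(question: str, corpus: dict[str, str],
             min_overlap: int = 2) -> list[tuple[str, str]]:
    """Keep only passages sharing at least min_overlap terms with the question."""
    wanted = terms(question)
    return [(sid, text) for sid, text in corpus.items()
            if len(wanted & terms(text)) >= min_overlap]


def build_prompt(question: str, corpus: dict[str, str]) -> Optional[str]:
    """Return a grounded prompt, or None when nothing in the corpus is relevant."""
    hits = retrieve(question, corpus)
    if not hits:
        return None
    sources = "\n".join(f"[{sid}] {text}" for sid, text in hits)
    return (
        "Answer using only the sources below. "
        f"If the answer is not there, reply exactly: {NOT_IN_SOURCES}\n\n"
        f"{sources}\n\nQuestion: {question}"
    )


def ask_history_expert(question: str, corpus: dict[str, str],
                       ask_model: Callable[[str], str]) -> str:
    prompt = build_prompt(question, corpus)
    if prompt is None:
        return NOT_IN_SOURCES  # refuse before the model is ever called
    return ask_model(prompt)


def fake_model(prompt: str) -> str:
    """Stub standing in for a real model call."""
    return "stub answer"


print(ask_history_expert("What bread was sold at the harbour market?", CORPUS, fake_model))
print(ask_history_expert("Which cannons did the fleet carry?", CORPUS, fake_model))
```

Notice what the function never receives: the plot, the characters, or the draft. The only things in its world are the question and the sources that matched it.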

You don't prevent hallucination by asking the model to be more careful. You prevent it by removing the room it had to make things up.

That's the move, generalized: constrain the resources to fit the role. Give it the context, tools, and reference material the job needs, and not one thing more. A focused agent with a tidy, bounded world is far more reliable than a powerful one staring at everything at once.

How to actually break a system down

When you're staring at a problem you want agents to solve, don't ask "what should the agent do?" Ask the question a manager asks: if I were staffing this with a team of cheap, narrow specialists, who would I hire and what would each person's one job be?

List the roles the work genuinely requires, the way you'd staff a real team for it.
Write each role's job description in one sentence. If you can't, split it.
For each role, write its definition of done: the finished artifact it hands off.
Give each role only the context and tools that one job needs.
Wire the handoffs: who passes work to whom, and in what order.
Add a reviewer role whose only job is to catch a specific class of mistake.
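The steps above can be sketched as a tiny pipeline in which each role declares what it may read and what it produces, and the runner shows it nothing else. This is a minimal illustration under assumptions: the `Role` tuple layout, `run_pipeline`, and the stub workers are hypothetical, and in a real system each worker would wrap a model call.

```python
from typing import Callable

# A role is (name, keys it may read, key it produces, worker function).
Role = tuple[str, tuple[str, ...], str, Callable[[dict[str, str]], str]]


def run_pipeline(roles: list[Role], workspace: dict[str, str]) -> dict[str, str]:
    """Run roles in order, handing each one only the inputs it declared."""
    for name, reads, produces, work in roles:
        missing = [key for key in reads if key not in workspace]
        if missing:
            raise KeyError(f"{name} is missing inputs: {missing}")
        visible = {key: workspace[key] for key in reads}  # the role sees nothing else
        workspace = {**workspace, produces: work(visible)}
    return workspace


# Stub workers standing in for model-backed agents.
def architect(ctx: dict[str, str]) -> str:
    return f"Beat 1 for: {ctx['premise']}"


def writer(ctx: dict[str, str]) -> str:
    return f"Scene written from [{ctx['beat']}]"


def continuity_editor(ctx: dict[str, str]) -> str:
    # Reviewer role: flags exactly one class of mistake and nothing else.
    if ctx["beat"] in ctx["scene"]:
        return "no contradictions"
    return "contradiction: scene ignores the beat"


roles: list[Role] = [
    ("story architect", ("premise",), "beat", architect),
    ("prose writer", ("beat",), "scene", writer),
    ("continuity editor", ("beat", "scene"), "review", continuity_editor),
]

result = run_pipeline(roles, {"premise": "a historical novel"})
print(result["review"])
```

The prose writer never sees the premise here, only the beat it was handed. That restriction is the point of the design: the handoff is the entire interface between two roles.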

You'll end up with more agents than you expected, and each one will be simpler than you expected. That's the goal. Simplicity per agent is what buys you reliability across the system.

Why this scales when "one smart agent" doesn't

A team of narrow agents is easier to reason about, easier to test, and easier to fix. When something goes wrong, you don't debug a black box. You find the one role that failed and fix that role, the same way you'd coach one employee. You can upgrade the history expert without touching the prose writer. You can swap a cheaper model into the line editor and a stronger one into the story architect, matching horsepower to difficulty.

It's also how the economics work. Most steps are simple enough for small, fast, cheap models. You spend real capability only where the job is genuinely hard, instead of paying frontier prices to have one giant agent do trivial work badly.
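Matching horsepower to difficulty can be as plain as a lookup table. In the sketch below the tier names and relative costs are invented for illustration, not real models or prices, and apart from the line editor and story architect the tier assigned to each role is my assumption.

```python
# Hypothetical tiers with made-up relative costs per call.
TIER_COST = {"small": 1, "large": 20}

# Which tier each role from the book example runs on (mostly assumed).
ROLE_TIER = {
    "story architect": "large",
    "character keeper": "small",
    "history expert": "small",
    "prose writer": "large",
    "continuity editor": "small",
    "line editor": "small",
}


def pipeline_cost(role_tier: dict[str, str], tier_cost: dict[str, int]) -> int:
    """Relative cost of one pass through every role."""
    return sum(tier_cost[tier] for tier in role_tier.values())


matched = pipeline_cost(ROLE_TIER, TIER_COST)
all_large = pipeline_cost({role: "large" for role in ROLE_TIER}, TIER_COST)
print(f"matched tiers: {matched} units, everything on the large tier: {all_large} units")
```

With these made-up numbers, the matched setup costs 44 units per pass against 120 for running every role on the large tier. The real ratio depends entirely on your models and workload.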

So resist the urge to hire a genius. Hire a team of cheerful idiots, give each one a single clear job and exactly the resources that job needs, and let the org chart do the thinking. That's not a workaround for today's models being limited. It's just good management, and it happens to be the most durable way to design agentic systems.

Building teams like this inside real companies is the work I do as an AI-native leader. If you're staffing your first synthetic org chart, let's talk.