The best AI agents might be the ones with the fewest tools.
Recent case studies show a consistent pattern: agents stripped down to basic primitives (bash, file access, a single execution tool) outperform their over-engineered predecessors. Higher success rates, fewer tokens, faster responses.
Something is shifting in how we build agents.
The instinct to over-engineer
When teams first build AI agents, the instinct is to constrain. Pre-filter context so the model doesn’t get overwhelmed. Build specialized tools for each task. Protect the model from complexity.
One team’s text-to-SQL agent had all of this: schema lookups, query validators, join path finders, 15+ specialized tools. Then they stripped it down to two: bash and SQL execution. The agent figured out the rest using grep, cat, and ls.
| Metric | Before | After |
| --- | --- | --- |
| Tools | 15+ specialized | 2 (bash, SQL) |
| Tokens | ~102k | ~61k |
| Success rate | 80% | 100% |
| Avg steps | 12 | 7 |
Better across every metric. By doing less.
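A minimal sketch of what a two-tool agent like this can look like. The tool implementations are real; the `llm.complete` call and its return format are hypothetical stand-ins for whatever model client you use, not a specific vendor API:

```python
import json
import sqlite3
import subprocess

def run_bash(command: str) -> str:
    """Run a shell command (ls, grep, cat, ...) and return its output."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=30
    )
    return result.stdout + result.stderr

def run_sql(query: str, db_path: str = "warehouse.db") -> str:
    """Run a SQL query against the database and return rows as JSON text."""
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(query).fetchall()
    return json.dumps(rows, default=str)

TOOLS = {"bash": run_bash, "sql": run_sql}  # the entire toolbox

def agent_loop(task: str, llm) -> str:
    """Let the model drive: at each step it either calls a tool or answers.

    `llm.complete` is a hypothetical client that, given the transcript,
    returns {"tool": ..., "input": ...} or {"answer": ...}.
    """
    transcript = [f"Task: {task}"]
    while True:
        step = llm.complete(transcript)  # hypothetical API
        if "answer" in step:
            return step["answer"]
        output = TOOLS[step["tool"]](step["input"])
        transcript.append(f"{step['tool']}({step['input']}) -> {output}")
```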
The file system is the interface
What’s striking is where these approaches converge: the file system.
Semantic layers as directories of YAML and Markdown. Terminal output synced to local files. Tool descriptions stored as browsable code. The agent navigates context like a codebase:
```
semantic-layer/
├── models/
│   ├── users.yaml
│   ├── orders.yaml
│   └── products.yaml
├── metrics/
│   └── definitions.md
└── joins/
    └── relationships.json
```
Why files? Because they’re a 50-year-old abstraction that still works. Models have seen billions of examples of navigating directories, grepping through code, reading documentation. grep isn’t a new skill we’re teaching; it’s a capability models already have.
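In practice, “navigating context like a codebase” just means the model issues ordinary shell commands through its one bash tool. Reusing `run_bash` from the sketch above (the paths follow the layout shown; the “revenue” search term is illustrative):

```python
# The agent explores the semantic layer the way a developer would.
print(run_bash("ls semantic-layer/models/"))
# -> orders.yaml  products.yaml  users.yaml

# Find which files mention the concept it cares about...
print(run_bash("grep -ril 'revenue' semantic-layer/"))
# -> semantic-layer/metrics/definitions.md

# ...and read only that file into context.
print(run_bash("cat semantic-layer/metrics/definitions.md"))
```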
Let the model pull what it needs
The old pattern was to stuff everything into the prompt upfront: all tool definitions, all context, all instructions. This bloats the context window and can confuse the model with too much information.
The new pattern is dynamic discovery:
```
Agent receives task
↓
Explores filesystem (ls, find)
↓
Searches for relevant content (grep, cat)
↓
Pulls only what it needs into context
↓
Returns result
```
Don’t load every tool definition; let the agent search for the ones it needs. Don’t summarize aggressively; write history to a file the agent can reference. Don’t pre-filter data; let the model grep for what’s relevant.
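One way dynamic tool loading can work: keep each tool definition as a file on disk and let the agent pull in only the ones that match its task. A sketch, assuming a hypothetical `tools/` directory with one JSON definition per tool:

```python
import json
from pathlib import Path

TOOLS_DIR = Path("tools")  # hypothetical: one JSON definition per tool

def search_tools(keyword: str) -> list[str]:
    """File-system equivalent of `grep -l`: find tool definitions
    whose text mentions the keyword."""
    return [
        p.stem for p in TOOLS_DIR.glob("*.json")
        if keyword.lower() in p.read_text().lower()
    ]

def load_tools(names: list[str]) -> list[dict]:
    """Pull only the matched definitions into the prompt context."""
    return [json.loads((TOOLS_DIR / f"{n}.json").read_text()) for n in names]

# Instead of shipping every definition upfront, load the handful
# relevant to the task at hand.
context_tools = load_tools(search_tools("invoice"))
```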
The efficiency gains are real. Teams have reported 40-50% fewer tokens using this approach. In some cases, dynamic tool loading cut token usage from 150,000 to 2,000. Less context upfront, better results.
Stop doing the model's thinking
There’s a philosophical shift underneath all of this.
The assumption used to be that models would get lost in complex schemas, make bad joins, hallucinate table names. So we built guardrails. But the guardrails became the problem, constraining reasoning the model could handle on its own.
As one engineering team put it: “We were doing the model’s thinking for it.”
This doesn’t mean abandoning structure. These approaches work because the underlying data is well-organized: clear naming, consistent structure, good documentation. The foundation matters. But given a solid foundation, the model needs far less scaffolding than you’d think.
Build for the model you'll have in six months
Models are improving faster than the tooling built around them. The custom retrieval logic you write today may be unnecessary in three months. The constraints you add now may become liabilities.
This suggests a different approach to agent development. Start with the simplest possible architecture: model, file system, goal. Add complexity only when you’ve proven it’s necessary. Every tool you add is a choice you’re making for the model. Sometimes the model makes better choices.
The takeaway
The pattern is clear: simpler agents, built on universal primitives, with dynamic context discovery instead of static loading.
Files. Bash. Trust the model’s reasoning.
It sounds almost too simple. But that might be the point.
The future of agent architecture isn’t more sophisticated scaffolding. It’s having the confidence to delete most of it.