Build and Control Agents That Ship Real Systems

Most developers I talk to are stuck in the middle layer — past prompting, short of real agentic building. Here's how I think about closing that gap.

Robert Boulos · April 7, 2026 · 7 min read

I talk to a lot of developers and technical founders right now, and almost all of them are stuck in the same place.

They've used Claude in a browser tab. They've read about MCP. Some of them have opened Claude Code once or twice and poked at it. A few have shipped a small thing with an agent and felt the lift. But almost none of them can sit down on a Tuesday morning, point an agent at their own production codebase, and have it ship a change they trust by lunch.

That gap is the whole game right now. Closing it is the highest-leverage thing a developer can learn this year. Everything I do at Snappy is built around closing it, and I want to lay out how I think about it, because most of what's written about "AI coding" misses the point.

The middle layer is where everyone is stuck

There's a layer below where you are, and a layer above.

The layer below is prompting a chatbot. You type, it answers, you copy-paste, you decide what to do with it. That layer is fine. Most of the world lives there. It's also a dead end if you build software for a living, because the chatbot doesn't know your repo, doesn't run your tests, and doesn't ship anything.

The layer above is agentic building. You're not asking an agent for advice. You're directing an agent to do work. It reads your code, it writes diffs, it runs your test suite, it calls your APIs, it edits your backend, it commits. You sit next to it the way a senior dev sits next to a junior, except this junior types faster than any human and never gets tired.

The middle layer is where almost everyone I talk to lives. They know the layer below isn't enough. They've heard the layer above exists. They don't know how to actually work there day-to-day.

I'm going to be blunt about why.

Why people stay stuck

It's not the tools. The tools are good enough. Claude Code is good enough. MCP is good enough. The models are good enough. I shipped real systems with worse tools two years ago.

People stay stuck for three reasons.

First, they treat the agent like a magic box and get burned. They ask it to refactor a module, it touches twelve files, the diff is too big to read, something breaks, they roll back, and they decide agents aren't ready. What actually happened is they skipped the calibration step. On day one you start with changes small enough that you already know what the diff should look like. You're not testing the agent's intelligence. You're calibrating your trust.
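
One way to make the calibration step concrete is to put a size limit on what counts as a reviewable change. The sketch below is an illustration, not part of any tool named here: the thresholds and function names are hypothetical, and it assumes the input is the tab-separated text that `git diff --numstat` prints (added lines, deleted lines, path).

```python
# Hypothetical limits: set them to what you can actually read in one sitting.
MAX_FILES = 3
MAX_CHANGED_LINES = 60


def summarize_numstat(numstat: str) -> tuple[int, int]:
    """Return (files_changed, lines_changed) from `git diff --numstat` text."""
    files = 0
    lines = 0
    for row in numstat.splitlines():
        parts = row.split("\t", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed rows
        added, deleted, _path = parts
        files += 1
        # Binary files report "-" instead of a count: count the file, not lines.
        if added.isdigit():
            lines += int(added)
        if deleted.isdigit():
            lines += int(deleted)
    return files, lines


def is_reviewable(numstat: str) -> bool:
    """True when the diff is small enough to read line by line."""
    files, lines = summarize_numstat(numstat)
    return files <= MAX_FILES and lines <= MAX_CHANGED_LINES
```

Run `git diff --numstat` after the agent finishes and pass the output to `is_reviewable`. If it returns False, ask for a smaller change instead of trying to review the big one.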

Second, they re-explain their context every session. Every morning they tell the agent the same thing about how their repo is laid out, what their conventions are, what to never touch. It's exhausting and error-prone, and after a week they quit. The fix is skills — small, named instruction sets the agent loads on demand. I write a skill for almost every repeat move I make. The first one takes twenty minutes. After that, sessions get shorter and more accurate, and you stop dreading them.
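
To show how small a skill can be, here is a rough sketch that generates one. Everything specific in it is an assumption: the `SKILL.md` file name, the one-folder-per-skill layout, and the `name` and `description` header fields are modelled on how skill files are commonly laid out, so check the current Claude Code documentation before relying on them. The helper names are made up.

```python
from pathlib import Path


def render_skill(name: str, description: str, steps: list[str]) -> str:
    """Build the text of a skill file: a short header plus numbered steps."""
    # Assumed header format: a frontmatter block with a name and a description.
    header = f"---\nname: {name}\ndescription: {description}\n---\n"
    body = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return f"{header}\n{body}\n"


def write_skill(root: Path, name: str, description: str, steps: list[str]) -> Path:
    """Write the skill to <root>/<name>/SKILL.md (assumed layout) and return its path."""
    target = root / name / "SKILL.md"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(render_skill(name, description, steps), encoding="utf-8")
    return target
```

A skill written this way is a dozen lines of plain instructions: where the handlers live, which test command to run, which directories are off limits. In practice you would write the file by hand. The generator is only here to show the shape.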

Third, they think of the agent as a coder and nothing else. When they hit something the agent can't do from the filesystem alone — query a live database, call a real API, edit a backend function — they fall back to doing it themselves and the session collapses. The fix is MCP, in the specific places it earns its keep. I built a Xano MCP server because most of my work involves Xano backends, and now my sessions don't need me as a human relay between the agent and the database. That's the only reason to build an MCP server: to remove yourself from a relay you're tired of being.
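
The sketch below shows the kind of narrow tool that replaces the human relay. It is not the Xano MCP server and it uses no MCP SDK. It is just a read-only query function of the sort such a server might wrap, with an in-memory SQLite database standing in for a live one. The function names and the `users` table are hypothetical.

```python
import sqlite3


def run_readonly_query(conn: sqlite3.Connection, sql: str, limit: int = 50) -> list[dict]:
    """Run a single SELECT and return at most `limit` rows as dicts."""
    statement = sql.strip().rstrip(";")
    # Coarse guard only. A real deployment should also use read-only credentials.
    if not statement.lower().startswith("select") or ";" in statement:
        raise ValueError("only a single SELECT statement is allowed")
    cursor = conn.execute(statement)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchmany(limit)]


def demo_connection() -> sqlite3.Connection:
    """Stand-in for a live database: two rows in an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO users (email) VALUES (?)",
        [("a@example.com",), ("b@example.com",)],
    )
    return conn
```

The design choice that matters is the narrowness. The agent gets one function that can read and is capped on row count, rather than a general database handle.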

None of this is exotic. It's just the part nobody writes down.

Build and control

I keep using the phrase "build and control" and I want to be specific about it, because control is the part most people skip.

Build is the easy part to imagine. Agent writes code, agent runs tests, agent ships. Fine.

Control is the hard part and it's where the real skill is. Control means you decided what the agent was allowed to touch before it touched anything. Control means you can read the diff and know whether to merge it. Control means when the agent goes off the rails — and it will — you notice within a minute, not within a deploy. Control means you have skills, guardrails, scoped permissions, and a session structure that makes the agent's behaviour legible to you.
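
Deciding what the agent is allowed to touch can be as simple as a path list checked against the diff. This is a minimal sketch under stated assumptions: the glob patterns and the function name are invented for illustration, and the input is assumed to be the repo-relative paths that `git diff --name-only` prints.

```python
from fnmatch import fnmatchcase

# Hypothetical scope for one session: billing code and its tests, nothing else.
ALLOWED = ["src/billing/*", "tests/billing/*"]
# Never touched, even inside an allowed directory.
DENIED = ["*.env", "migrations/*"]


def out_of_scope(changed_paths, allowed=ALLOWED, denied=DENIED):
    """Return the changed paths that fall outside the agreed scope."""
    violations = []
    for path in changed_paths:
        is_denied = any(fnmatchcase(path, pattern) for pattern in denied)
        is_allowed = any(fnmatchcase(path, pattern) for pattern in allowed)
        if is_denied or not is_allowed:
            violations.append(path)
    return violations
```

An empty result means the agent stayed inside the lines. Anything else is a reason to stop and read before the change gets anywhere near a deploy.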

A lot of what's marketed as "AI coding" right now is the opposite of control. It's autocomplete on steroids, or it's a cloud agent that does things in the background and you read about them later. Both have a place. Neither is what I'm teaching, because neither puts you in the seat where you can build something you'd actually run in production.

I want you in that seat. That's the whole thesis.

The stack I actually use

People ask me what to install. Here's the honest answer.

Claude Code is the working surface. I use it every day, on every project, on real repos. It is the place where the agent and the codebase meet under my hand.

Skills are the memory layer. Anything I do twice becomes a skill. My skill library is bigger than my notes app at this point, and it's the single biggest reason my sessions have gotten shorter over the last year.

MCP servers are the bridges to systems that aren't the filesystem. I have one for Xano because I live in Xano. If you live somewhere else, build one there, or use one that already exists. Don't build MCP servers for things that don't need them — that's a tax, not a feature.

Xano is my backend of choice, but the principle generalises. Pick a backend you can drive programmatically and stop hand-clicking dashboards.

That's it. Four things. The reason it looks like a short list is that it is a short list. The work isn't in collecting tools. The work is in getting good at the four you have.

What I want you to do this week

If you read this far and you're still in the middle layer, here's the thing I'd actually do.

Pick one repo of your own. Not a toy. A real one. Open it with Claude Code and have the agent map the architecture back to you in plain English. Read what it says. Correct it where it's wrong. That's session one.

Then pick the smallest real change you've been putting off — the kind of thing you'd assign to yourself on a Friday afternoon — and have the agent do it while you watch. Read every line of the diff. Merge it if it's right. Back it out if it isn't.

That's the on-ramp. Do that twice and you're already past where most of the people I talk to are.

If you want to go deeper with me, I run a free course inside the Snappy community on Skool. It's where I walk through the stack above on real repos, answer questions from other builders, and post the skills and MCP setups I use day to day. Join us at skool.com/snappy and start closing the gap.

— Robert
