April 15, 2026
In the last piece, we talked about what AI agents are doing to the business model of pure data aggregators. This piece is about the other side of that conversation: if you're building in an agentic world, how do you build it right?
The answer is less exotic than you might expect. It is practically boring, in fact. It starts with remembering something foundational.
I want to be direct about this because the discourse around AI tends to obscure it: a large language model is a computer program. It has inputs and outputs. It can be prompted, constrained, and validated. Unlike most software (ideally), it can also hallucinate, drift, and produce outputs that are subtly wrong in ways that cascade badly downstream. This is fundamental to the way LLMs function, although the whys and wherefores are outside the scope of this article.
The engineering response to this is not to marvel at the intelligence of the model, nor to grit your teeth when the responses to your prompts are frustratingly close to being correct. It is to treat it like what it is: a program that needs clear inputs, constrained outputs, and hard validation gates at every step.
The less an agent has to intuit, the better. Every degree of freedom you give it to interpret or infer is a degree of freedom for things to go wrong. The goal of good agentic architecture is to minimize that surface area systematically.
In the early days of commercial computing, an IBM instructor named George Fuechsel is widely credited with coining a phrase while teaching computer operators: Garbage In, Garbage Out. GIGO. The idea was simple: the quality of a program's output is bounded by the quality of its input. Feed it bad data, get bad results, regardless of how sophisticated the program is.
Although this principle predates LLMs by sixty years, it applies to them completely. In a multi-agent system, where the output of one agent becomes the input of the next, GIGO compounds at every step. A malformed output has the potential to corrupt everything downstream in ways that are difficult to detect and expensive to debug.
The implication is architectural: every handoff between agents needs a hard PASS/FAIL gate. Not a soft check. Not a warning. A hard gate that refuses to pass malformed output to the next step and surfaces the failure clearly.
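As a minimal sketch of what such a gate might look like in Python, assume a hypothetical summarizing agent whose output must be a JSON object with a `title` and a non-empty list of `bullet_points`. The names `Summary`, `GateFailure`, and `gate` are illustrative assumptions, not part of any particular framework.

```python
import json
from dataclasses import dataclass


class GateFailure(Exception):
    """Raised when an agent's output fails validation at a handoff."""


@dataclass(frozen=True)
class Summary:
    """Hypothetical contract for a summarizing agent's output."""
    title: str
    bullet_points: list


def gate(raw_output: str, step_name: str) -> Summary:
    """PASS: return a typed Summary. FAIL: raise GateFailure naming the step."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise GateFailure(f"{step_name}: output is not valid JSON ({exc.msg})") from exc
    if not isinstance(data, dict):
        raise GateFailure(f"{step_name}: expected a JSON object")
    title = data.get("title")
    bullets = data.get("bullet_points")
    if not isinstance(title, str) or not title.strip():
        raise GateFailure(f"{step_name}: 'title' must be a non-empty string")
    if (not isinstance(bullets, list) or not bullets
            or not all(isinstance(b, str) for b in bullets)):
        raise GateFailure(f"{step_name}: 'bullet_points' must be a non-empty list of strings")
    return Summary(title=title, bullet_points=bullets)
```

There is no warning path and no "best effort" return value here. The only two outcomes are a typed object the next step can trust, or an exception that names the step that produced the bad output.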
This isn't theoretical. When you let agents pass loosely structured outputs between each other without validation, the system doesn't degrade linearly. Errors compound. One malformed output becomes the input to the next step, and then the next, and very quickly you're debugging something that looks completely disconnected from the original mistake. Recent work from MIT and Google Research shows exactly this pattern: independent agents without validation amplify errors dramatically, while systems with coordinated, structured handoffs reduce that compounding effect by a wide margin. [On the Reliability of Multi-Agent Systems]
Systems that enforce structured, validated handoffs behave very differently. They fail early. They fail loudly. And, most importantly, they fail at the point where the error was introduced. Related research on agent reliability reaches a similar conclusion: reliability is not primarily a property of the model. It's a property of the architecture. [Architectural Foundations for Reliable Agentic Systems]
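To make "fail at the point where the error was introduced" concrete, here is a small Python sketch of a pipeline runner that checks every step's output before handing it on. The step functions stand in for agent calls; `HandoffError`, `run_pipeline`, and the step tuple layout are assumptions made for illustration.

```python
from typing import Any, Callable, List, Tuple

# (step name, function standing in for an agent call, validator for its output)
Step = Tuple[str, Callable[[Any], Any], Callable[[Any], bool]]


class HandoffError(Exception):
    """Carries the name of the step whose output failed validation."""

    def __init__(self, step: str, output: Any):
        super().__init__(f"step {step!r} produced invalid output: {output!r}")
        self.step = step
        self.output = output


def run_pipeline(steps: List[Step], initial: Any) -> Any:
    """Run steps in order, validating each output before the next step sees it."""
    value = initial
    for name, run, is_valid in steps:
        value = run(value)
        if not is_valid(value):
            # Stop here: the failure is attributed to the step that caused it,
            # not to whichever later step happens to choke on the bad data.
            raise HandoffError(name, value)
    return value
```

Because the check runs immediately after each step, a bad output from the second step surfaces as a `HandoffError` whose `step` attribute names the second step, and the third step never runs.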
This is the same principle that underpins compilers, data pipelines, and API contracts. The difference is that with LLMs, the failure modes are less obvious and more expensive to trace, which makes hard validation gates more important, not less.
In 1999, NASA lost the Mars Climate Orbiter, part of a roughly $327 million mission, because one engineering team's software was outputting thruster impulse data in pound-force seconds and another's was expecting newton-seconds. The resulting trajectory error brought the spacecraft into its orbital insertion far too low, and it was lost in the Martian atmosphere. The spacecraft worked. The instruments worked. The failure was a unit mismatch at a data handoff. [Wikipedia] We live and die by standardization. This is not an abstract principle. It is a $327 million lesson recorded in NASA's failure logs.
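One way to keep a unit mismatch from passing silently is to make the unit part of the data contract rather than an assumption in someone's head. The Python sketch below is illustrative only: the `Impulse` type and the unit labels `"N*s"` and `"lbf*s"` are invented for this example and are not taken from any NASA software.

```python
from dataclasses import dataclass

# 1 pound-force second expressed in newton-seconds.
LBF_S_TO_N_S = 4.4482216152605

# Conversion factors to newton-seconds, keyed by an explicit unit label.
_TO_NEWTON_SECONDS = {"N*s": 1.0, "lbf*s": LBF_S_TO_N_S}


@dataclass(frozen=True)
class Impulse:
    """An impulse value that always travels with its unit."""
    value: float
    unit: str

    def to_newton_seconds(self) -> float:
        """Convert to newton-seconds, refusing any unit the contract does not list."""
        try:
            factor = _TO_NEWTON_SECONDS[self.unit]
        except KeyError:
            raise ValueError(f"unknown impulse unit: {self.unit!r}") from None
        return self.value * factor
```

A consumer that calls `to_newton_seconds()` gets the right number whichever listed unit the producer used, and an unlisted unit raises immediately instead of being treated as a bare float.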
The inverse of that story is the Rosetta Stone. The Stone wasn't remarkable because of its content. It was a fairly routine priestly decree that, among other things, recorded tax exemptions. It was remarkable because the same information was inscribed in three scripts: Egyptian hieroglyphs, Demotic, and Ancient Greek. Because scholars could read Greek, they finally had the key to decode a writing system that had been opaque for centuries. A shared encoding didn't just solve a translation problem. It unlocked an entire civilization's recorded history.
This is what a well-defined schema does between agents. Without a shared contract, one agent's output is hieroglyphics to the next. With one, any agent that understands the format can consume, validate, and act on the output of any other, regardless of what model produced it, what language it was written in, or what infrastructure is running underneath. The schema is the shared stone. Everything else is implementation detail.
The good news is that data standardization is a largely solved problem. ISO and ANSI have spent decades defining common formats for dates, currencies, country codes, measurements, and more. When designing the data contracts between agents, the right instinct is to reach for these standards first rather than inventing bespoke representations. The more your data speaks in formats the world already understands, the less room there is for the kind of silent unit mismatch that ends in a fiery mistake.
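As a rough illustration of reaching for existing standards first, the Python sketch below parses a hypothetical payment record whose date is an ISO 8601 calendar date and whose currency is an ISO 4217 code. The `Payment` type, the field names, and the four-code currency list are assumptions for the example; a real system would check against the full ISO 4217 list.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal, InvalidOperation

# Illustrative subset of ISO 4217 currency codes, not the full standard.
KNOWN_CURRENCIES = frozenset({"USD", "EUR", "GBP", "JPY"})


@dataclass(frozen=True)
class Payment:
    amount: Decimal
    currency: str   # ISO 4217 code
    paid_on: date   # parsed from an ISO 8601 date such as "2026-04-15"


def parse_payment(record: dict) -> Payment:
    """Parse a raw record, raising ValueError on anything non-standard."""
    currency = record.get("currency")
    if not isinstance(currency, str) or currency not in KNOWN_CURRENCIES:
        raise ValueError(f"unrecognized currency code: {currency!r}")
    try:
        amount = Decimal(record["amount"])
        paid_on = date.fromisoformat(record["paid_on"])
    except (KeyError, TypeError, ValueError, InvalidOperation) as exc:
        raise ValueError(f"malformed payment record: {exc!r}") from exc
    return Payment(amount=amount, currency=currency, paid_on=paid_on)
```

A date written as "04/15/2026" is rejected rather than guessed at, which is the point: an ambiguous format never gets the chance to be misread as day-first or month-first.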
The practice of defining the shape of data before writing the code that produces or consumes it has a name: Schema-Driven Development, or SDD. The core idea is to establish system contracts first: precise, machine-readable definitions of what valid input and valid output look like. The contracts are the templates that inform the system architecture and then the codebase. Validation, documentation, and testing all flow from the same source of truth. [SDD: Schema-Driven Development]
This was already a good way to design systems and APIs before AI agents existed. It's been my preferred methodology for years, and the de facto way that I think about problem abstraction and solution design. For an enterprise example, we can look to Facebook's internal development of GraphQL around 2012, and its subsequent open-sourcing. It helped popularize the schema-first approach: define the shape of the data before you build the server that produces it. The insight was that when multiple teams are consuming the same data, having a single authoritative contract is vastly more efficient than discovering mismatches after the fact.
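To show the "single source of truth" idea in miniature, the Python sketch below defines one schema and derives both a validator and human-readable documentation from it. The schema layout, the field names, and the `validate` and `describe` helpers are invented for illustration. In practice you would more likely use an established schema language than a hand-rolled dictionary.

```python
# One hypothetical contract: field name -> expected type and description.
QUOTE_SCHEMA = {
    "ticker": {"type": str, "doc": "Exchange ticker symbol"},
    "price": {"type": float, "doc": "Last trade price"},
    "as_of": {"type": str, "doc": "ISO 8601 date of the quote"},
}


def validate(record: dict, schema: dict) -> list:
    """Return a list of contract violations (an empty list means valid)."""
    errors = []
    for name, spec in schema.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], spec["type"]):
            expected = spec["type"].__name__
            actual = type(record[name]).__name__
            errors.append(f"wrong type for {name}: expected {expected}, got {actual}")
    for name in record:
        if name not in schema:
            errors.append(f"unexpected field: {name}")
    return errors


def describe(schema: dict) -> str:
    """Generate field documentation from the same schema the validator uses."""
    return "\n".join(
        f"{name} ({spec['type'].__name__}): {spec['doc']}"
        for name, spec in schema.items()
    )
```

Because the validator and the documentation are both generated from `QUOTE_SCHEMA`, changing the contract in one place changes both, and they cannot drift apart.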
In an agentic system, this principle becomes even more critical, for a specific reason: literally every handoff between agents is an API. These handoffs happen very quickly, and often. The output of one agent is the input contract for the next. If you are building agentic workflows and you are not thinking in schemas, you are building without contracts, and you will pay for it in debugging time, fragile pipelines, and hard-to-trace failures.
Think of agentic outputs like widgets on a factory line. You want every widget to be the same shape. The machine at station three is built to accept a specific shape of widget from station two. If station two occasionally produces a widget that is the wrong shape, even slightly, station three breaks or, worse, produces a subtly wrong output of its own. Schema validation is the QA gate between stations.
Standards like OpenAPI already give you the vocabulary to define these contracts in a way that is machine-readable, well-documented, and widely understood. Use them. The goal is not to be clever. The goal is to be boring and correct.
One of the underappreciated implications of building agentic systems correctly is how it changes the role of the filesystem. In a well-architected agentic pipeline, the filesystem is memory. Intermediate outputs are written to disk in well-defined formats at each step, so that if the process is interrupted, for any reason, it can be resumed cleanly from a known-good state rather than restarted from scratch.
This only works if your outputs are well-defined. A schema-validated JSON file at each checkpoint is resumable. An ambiguous blob of text is not. Designing for graceful interruption is designing for production reality: agentic workflows will fail, be interrupted, and need to recover. The agents that handle this gracefully are the ones whose architects thought carefully about the shape of every output before writing a single line of orchestration code. These checkpoint files are also extremely valuable for debugging your orchestration and then refining it.
This concept is in no way unique to agentic systems. Graceful failure recovery has been a core concern of software engineering for more than half a century.
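A minimal Python sketch of this checkpoint-and-resume pattern follows. It assumes each step's output is a JSON object with an `items` list; that shape, along with the `run_step` and `is_valid_checkpoint` names, is hypothetical and chosen only to keep the example short.

```python
import json
from pathlib import Path
from typing import Callable


def is_valid_checkpoint(data) -> bool:
    """Hypothetical contract: a JSON object carrying an 'items' list."""
    return isinstance(data, dict) and isinstance(data.get("items"), list)


def run_step(name: str, compute: Callable[[], dict], workdir: Path) -> dict:
    """Reuse a valid checkpoint if one exists; otherwise compute, validate, save."""
    path = workdir / f"{name}.json"
    if path.exists():
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
            if is_valid_checkpoint(data):
                return data  # known-good state: skip the expensive work
        except json.JSONDecodeError:
            pass  # corrupt checkpoint: fall through and recompute
    data = compute()
    if not is_valid_checkpoint(data):
        raise ValueError(f"step {name!r} produced output that violates its contract")
    # Write to a temporary file, then rename, so an interruption mid-write
    # does not leave a half-written checkpoint under the final name.
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(data), encoding="utf-8")
    tmp.replace(path)
    return data
```

Calling `run_step` a second time with the same name and directory returns the saved result without invoking `compute` again, which is what lets an interrupted run pick up where it left off. An invalid output is never written to disk, so every checkpoint that exists is one the next step can trust.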
In a world where every agentic handoff is an API, Schema-Driven Development is not an architectural nicety. It is the foundation that makes the whole thing work.
The tools already exist. The lessons are decades old. The failure modes are well understood.
The only question is whether you enforce the contract.
Because if you don't, the system will enforce it for you. Just not in a way you'll enjoy debugging.