According to VentureBeat, the tech industry is barreling toward 2026 as the year of “agentic AI,” when autonomous systems will handle tasks like booking flights and managing cloud infrastructure. Manoj Yerrasani, a senior technology executive who has overseen platforms serving 30 million concurrent users during events like the Olympics and the Super Bowl, argues that these agents are “incredibly fragile.” The primary failure point isn’t the AI models themselves but catastrophic data hygiene. In this new paradigm, a data pipeline error doesn’t just produce a wrong report; it causes an agent to take a wrong action, like recommending a horror movie to a child. Yerrasani’s proposed solution is a “data constitution” called the Creed framework, which enforces thousands of automated rules to block bad data before it ever touches an AI model, a methodology he implemented at NBCUniversal.
The vector database trap
Here’s the core issue: AI agents, especially those using Retrieval-Augmented Generation (RAG), implicitly trust the data you feed them. Their “memory” is a vector database. And in a vector system, a simple null value or a bit of drifted metadata isn’t just a missing field; it can completely warp the semantic meaning of what gets retrieved. Think about it. If a “live sports” tag gets incorrectly attached to a news clip during ingestion, an agent searching for “touchdown highlights” will retrieve and serve that news clip, confidently working with corrupted signals. And at the scale of millions of users, you can’t just monitor for this downstream. By the time an alarm goes off, the agent has already made thousands of bad decisions. The safety net of a human analyst spotting a weird dashboard number is gone.
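To see how one drifted tag poisons retrieval, here’s a minimal sketch of a metadata-filtered vector search. The toy in-memory store, the tag names, and the three-dimensional embeddings are all illustrative assumptions, not details from the article:

```python
import numpy as np

# Toy in-memory vector store: each record pairs an embedding with metadata.
# IDs, tags, and vectors are hypothetical stand-ins for a real catalog.
records = [
    {"id": "clip-001", "tags": {"live_sports"},
     "text": "Touchdown highlights from Sunday's game",
     "vec": np.array([0.9, 0.1, 0.0])},
    {"id": "clip-002", "tags": {"live_sports"},  # mis-tagged at ingestion: actually a news clip
     "text": "Evening news: city council budget vote",
     "vec": np.array([0.1, 0.9, 0.0])},
]

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def search(query_vec: np.ndarray, required_tag: str, k: int = 2) -> list[dict]:
    # The metadata filter runs before similarity ranking, so a wrong tag
    # silently changes which records are even eligible to be served.
    eligible = [r for r in records if required_tag in r["tags"]]
    return sorted(eligible, key=lambda r: cosine(query_vec, r["vec"]), reverse=True)[:k]

# An agent asking for "touchdown highlights" still ranks by similarity,
# but the mis-tagged news clip is now inside the candidate pool it trusts.
query = np.array([0.85, 0.15, 0.0])
for hit in search(query, required_tag="live_sports"):
    print(hit["id"], "->", hit["text"])
```

The embedding never lied; the metadata did. That is why downstream monitoring of model outputs catches this too late.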
The three laws of the data creed
So, what’s in this “data constitution”? Yerrasani lays out three non-negotiable principles. First, the quarantine pattern is mandatory. The common “dump now, clean later” ELT approach is a death sentence for agents: any data packet that violates its contract must be routed immediately to a dead-letter queue, never reaching the vector database. It’s better for an agent to say “I don’t know” than to hallucinate from polluted data. Second, schema is law. The era of schemaless flexibility for core AI pipelines is over. You need strict typing and rules that enforce business logic, like verifying that a user segment matches an active taxonomy; his system enforces over 1,000 such rules. Third, you need vector consistency checks. This is newer ground: ensuring the text chunk in the database actually matches its embedding vector, because silent API failures can leave you with vectors pointing at nothing, which is pure noise for an agent to retrieve.
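Here’s a minimal sketch of what an ingestion gate implementing all three laws might look like, assuming a toy in-memory store and dead-letter queue. The taxonomy, the rules, the embed() stub, and the similarity threshold are illustrative assumptions, not details from Yerrasani’s system:

```python
import numpy as np

ACTIVE_SEGMENTS = {"kids", "sports_fans", "news_junkies"}  # illustrative taxonomy
DEAD_LETTER_QUEUE: list[dict] = []  # stand-in for a real DLQ (e.g., a Kafka topic)
VECTOR_STORE: list[dict] = []       # stand-in for the vector database

def embed(text: str, dim: int = 8) -> np.ndarray:
    # Placeholder for a real embedding API call, which can fail silently
    # and return empty or garbage vectors. Deterministic within a run.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(dim)

def contract_errors(packet: dict) -> list[str]:
    # Law 2: schema is law. Two of the "over 1,000" rules the article
    # mentions, reduced to illustrative checks.
    errors = []
    text = packet.get("text")
    if not isinstance(text, str) or not text.strip():
        errors.append("text must be a non-empty string")
    if packet.get("segment") not in ACTIVE_SEGMENTS:
        errors.append(f"segment {packet.get('segment')!r} not in active taxonomy")
    return errors

def vector_is_consistent(text: str, vec: np.ndarray, dim: int = 8) -> bool:
    # Law 3: vector consistency. Catch silent embedding failures: missing,
    # wrong-sized, or non-finite vectors, and vectors that no longer match
    # a re-embedding of the stored text chunk.
    if vec is None or vec.shape != (dim,) or not np.isfinite(vec).all():
        return False
    fresh = embed(text, dim)
    cos = float(vec @ fresh / (np.linalg.norm(vec) * np.linalg.norm(fresh)))
    return cos > 0.99  # illustrative threshold

def ingest(packet: dict) -> None:
    # Law 1: quarantine. A packet that violates its contract goes straight
    # to the dead-letter queue and never touches the vector database.
    errors = contract_errors(packet)
    if errors:
        DEAD_LETTER_QUEUE.append({"packet": packet, "errors": errors})
        return
    vec = embed(packet["text"])
    if not vector_is_consistent(packet["text"], vec):
        DEAD_LETTER_QUEUE.append({"packet": packet, "errors": ["vector inconsistency"]})
        return
    VECTOR_STORE.append({"packet": packet, "vec": vec})

ingest({"text": "Touchdown highlights from Sunday's game", "segment": "sports_fans"})
ingest({"text": "", "segment": "ghost_segment"})  # violates both rules: quarantined
print(f"{len(VECTOR_STORE)} stored, {len(DEAD_LETTER_QUEUE)} quarantined")
```

The ordering is the point: contract validation and the consistency check both run before the write, so “I don’t know” beats a confident answer built on polluted data.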
The real battle is cultural
Now, here’s the thing. Implementing this isn’t just a technical hurdle; it’s a cultural fight. Engineers typically see strict schemas and data contracts as bureaucratic sludge that kills deployment velocity, a return to rigid, waterfall-era administration. The key to winning, according to the article, is flipping the incentive structure: frame governance as an accelerator. By guaranteeing clean data on the front end, you eliminate the weeks data scientists waste debugging bizarre model hallucinations. Data governance stops being a compliance checkbox and becomes a core quality-of-service guarantee. It’s a compelling argument, but it requires leadership to truly buy in and push past the initial grumbling.
Stop buying GPUs, start auditing data
The final lesson is blunt. If you’re planning for agentic AI, stop obsessing over GPU clusters or which foundation model is 0.5% better on a benchmark this week. Start auditing your data contracts. An AI agent’s autonomy is directly tied to its data’s reliability. Without a strict, automated constitution, your agents will go rogue. And a rogue agent isn’t like a broken dashboard that sits there quietly. It’s an active, silent killer of customer trust and revenue, making bad decisions at machine speed. It’s a stark reminder that in the age of AI, the most critical infrastructure isn’t just silicon—it’s the integrity of the data flowing through it.
