AI Agents Don’t Fail. Governance Does.

The technology is usually not the problem. The lack of governance is.

That one sentence explains most of what I am seeing in the AI agent space right now. I’ve read a lot of very pointed posts on social media about how organizations are pulling deployed AI customer communication agents out of production and concluding that the agents “don’t work.” The data tells a different story.

A recent IT Brew piece [1] reported on early findings that 74% of companies rolled back or cancelled an AI customer communication agent because of a governance failure, not a technology failure. PII exposure and data leakage drove roughly a third of those reversals. Hallucinations in front of real customers drove another fifth.

Here is the part worth pausing on. Companies that described their guardrails as “fully mature” had the highest rollback rate. At first glance that reads backwards. It is not. Those organizations were actively monitoring and governing the environment, so they SAW the problems sooner. Maturity did not cause the failures. It exposed them. That reframes the conversation.

The self-driving fleet.

Picture a company that buys 100 self-driving delivery trucks. The technology is excellent. The trucks navigate roads, avoid obstacles, and deliver faster than any human fleet could.

After a string of accidents and near misses, the executives reach a verdict: “The trucks don’t work.”

But the trucks were never the problem. The company never established driver licensing standards, approved routes, speed limits, maintenance schedules, incident reporting, fleet monitoring, or clear accountability for decisions. They deployed autonomous capability without fleet governance.

No transportation expert would blame the concept of self-driving vehicles for that outcome. They would call it what it is: a governance failure. AI agents are no different.

What governance actually is (and is not).

This is where I see many organizations stumble. They assume governance means more policies, more approvals, more bureaucracy, more restrictions. Governance has never been about control for the sake of control. Governance ensures that decisions are made by the right people, that risk is understood and managed, that accountability is clear, that performance is monitored, and that corrective action happens when something drifts. That is it.

I draw a hard line between load-bearing governance and decorative governance. Load-bearing governance actually holds weight when something goes wrong. Decorative governance looks impressive in a board deck and collapses the moment an agent exposes customer data. Most of the rollbacks we are reading about trace back to decorative governance that was never built to hold weight.

This is an old pattern in new clothes. What organizations are experiencing with AI is not, at its core, an AI problem. It is the same problem they faced with cloud computing, cybersecurity, social media, shadow IT, and every wave of digital transformation before it. They deployed capability faster than they deployed governance.

You don’t just throw the word “governance” behind something and watch it magically appear.

Saying you govern your AI agents does not make them governed. Governance is not a label you slap on a deployment after the fact, and it is not a single committee meeting. It must operate at three altitudes at once: strategic, tactical, and operational. Skip any one of them and the whole structure wobbles, which is exactly what we are watching happen with agentic AI right now.

| Governance altitude | What it is | Considerations for agentic AI |
| --- | --- | --- |
| Strategic | The board and executive level that sets direction, risk appetite, and accountability for whether and where AI agents are used at all. | Define your appetite for autonomous decision-making, decide which decisions an agent may NEVER make without a human, assign board-level accountability for agent outcomes, and align every deployment with enterprise objectives and regulatory exposure. |
| Tactical | The management and program level that translates direction into policies, standards, controls, and oversight structures. | Set guardrails and approval gates by agent risk tier, define human-in / on / out-of-the-loop requirements per use case, stand up monitoring and escalation paths, and require data-handling and model-validation standards before anything ships. |
| Operational | The day-to-day level where teams run, monitor, and correct the agents in production. | Continuously monitor outputs and PII handling, log decisions for auditability, validate consequential responses, capture incidents and near misses, and trigger rollback or human takeover the moment a threshold is breached. |

None of this calls for a heavier hand. It calls for being present at all three altitudes at once. When an agent gets pulled from production, the cause is rarely a single bad decision; more often it is a missing altitude, usually the tactical one, where the guardrails should have been agreed before anything ever went live.
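To make the operational altitude concrete, here is a minimal sketch of what "trigger rollback or human takeover the moment a threshold is breached" could look like in code. Everything in it is an assumption for illustration: the `AgentMonitor` class, the window size, the threshold, and the crude email pattern standing in for real PII detection. A production system would use a proper PII detection service and thresholds agreed at the tactical level.

```python
import re
from collections import deque
from dataclasses import dataclass, field

# Hypothetical stand-in for real PII detection: a crude email matcher.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@dataclass
class AgentMonitor:
    window_size: int = 100   # assumed: number of recent responses considered
    max_pii_hits: int = 2    # assumed threshold, agreed before go-live
    recent_flags: deque = field(default_factory=deque)
    audit_log: list = field(default_factory=list)
    human_takeover: bool = False

    def record(self, response: str) -> bool:
        """Log one agent response; return True if the agent may keep running."""
        flagged = bool(EMAIL_PATTERN.search(response))

        # Keep an auditable record, but never store the raw PII itself.
        redacted = EMAIL_PATTERN.sub("[REDACTED]", response)
        self.audit_log.append({"response": redacted, "pii_flag": flagged})

        # Track only the most recent `window_size` responses.
        self.recent_flags.append(flagged)
        if len(self.recent_flags) > self.window_size:
            self.recent_flags.popleft()

        # Once the threshold is breached, the takeover flag stays set
        # until a human deliberately resets it.
        if sum(self.recent_flags) >= self.max_pii_hits:
            self.human_takeover = True
        return not self.human_takeover


monitor = AgentMonitor(window_size=5, max_pii_hits=2)
for reply in ["Your order shipped.",
              "Contact jane@example.com",
              "Also try bob@example.org"]:
    if not monitor.record(reply):
        print("Threshold breached: routing conversations to a human.")
```

The design choice worth noticing is that the threshold is a parameter, not a judgment call made in the moment. Someone had to decide the number before the agent went live, which is exactly the tactical work that tends to get skipped.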

And the plumbing is moving faster than the oversight.

The agent world is standardizing fast around three protocols [2]: MCP for the tools and data an agent can reach, A2A for the agents it can delegate work to, and AG-UI for keeping a human in meaningful control. Notice what those three actually describe… data, agency, and oversight. The industry has agreed on HOW to wire agents together. It has not agreed on how to govern the decisions they make. Standardizing the connection is not the same as governing the consequence, and that gap is exactly where this year’s rollbacks live.

Technology creates opportunity. Governance ensures the opportunity is realized responsibly. Without governance, AI agents become unpredictable liabilities. With governance, they become scalable assets. The difference is not the intelligence of the technology. It is the maturity of the system governing it.

Final Thoughts

If you are deploying (or rethinking) AI agents, start here:

  1. Look past the model first. When an agent fails, ask what governance was missing before you ask what the technology got wrong.
  2. Reward detection, not silence. If your “mature” environment surfaces more problems, that is the system working. The organizations that see nothing are usually the ones not looking.
  3. Govern agents individually. Uniform governance over-restricts your low-risk agents and under-governs your high-risk ones. Match the control to the consequence.
  4. Decide where the human sits. For consequential decisions, choose deliberately whether a human is in, on, or out of the loop. Then write it down.
  5. Build load-bearing governance, not decorative governance. If your controls only look good in a board deck, they will fail at the worst possible moment.
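Points 3 and 4 above are the easiest to turn into something written down. The sketch below is one hypothetical way to do it: each agent gets a risk tier, each tier maps to a human oversight mode, and every agent has a named owner. The tier names, the mapping, and the example agents are all assumptions for illustration, not a recommended standard.

```python
from dataclasses import dataclass
from enum import Enum


class Oversight(Enum):
    IN_THE_LOOP = "human approves every action"
    ON_THE_LOOP = "human monitors and can intervene"
    OUT_OF_THE_LOOP = "agent acts autonomously"


class RiskTier(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Assumed policy: match the control to the consequence.
OVERSIGHT_BY_TIER = {
    RiskTier.LOW: Oversight.OUT_OF_THE_LOOP,
    RiskTier.MEDIUM: Oversight.ON_THE_LOOP,
    RiskTier.HIGH: Oversight.IN_THE_LOOP,
}


@dataclass(frozen=True)
class AgentPolicy:
    agent_name: str
    tier: RiskTier
    accountable_owner: str  # a named person or role, not "the AI team"

    @property
    def oversight(self) -> Oversight:
        return OVERSIGHT_BY_TIER[self.tier]

    def requires_approval(self) -> bool:
        """True when a human must approve each action before it happens."""
        return self.oversight is Oversight.IN_THE_LOOP


# Hypothetical agents, each governed individually.
policies = [
    AgentPolicy("store-hours-faq", RiskTier.LOW, "Support Operations Lead"),
    AgentPolicy("billing-dispute-handler", RiskTier.HIGH, "Head of Finance Ops"),
]

for policy in policies:
    print(f"{policy.agent_name}: {policy.oversight.value} "
          f"(owner: {policy.accountable_owner})")
```

A register like this is small enough to review in a single meeting, and it forces the two questions that decorative governance avoids: how risky is this agent, and who answers for it?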

The companies winning with AI are not the ones with the smartest agents. They are the ones with the most mature governance around them. 

For more on what happens when ambition outruns oversight, read my blog post “Strategy Without Governance is Just Expensive Hope.”

1. Source: Brianna Monsanto, “AI customer communications agents have a governance problem,” IT Brew, June 3, 2026, drawing on Sinch’s AI Production Paradox report.

2. Source: MindStudio, “MCP vs A2A vs AGUI: The Three Core Agent Protocols Compared.”