AI agent governance at scale: the four pillars every enterprise needs

Enterprise agents need governance infrastructure, not just better models

June 11, 2026

Enterprise agentic AI isn’t failing because the models aren’t good enough. It’s failing because nobody built the infrastructure to make imperfect agents safe to deploy. Every CIO conversation I’ve had hits the same wall: “I can’t trust them with my data.” 

AI agent governance at scale is the unsolved problem. This post lays out a framework for solving it. 

TL;DR: Enterprise AI agent governance requires four things: identity, authorization, observability, and accountability. These should all be enforced through infrastructure the agent cannot see or circumvent.

Why enterprise agents keep failing

Agents differ from humans and traditional software in three ways that compound one another. They’re unpredictable in ways no human employee is. They hallucinate, misinterpret ambiguous instructions, and are susceptible to prompt injection. They’re more capable than any individual operator, able to interact with production systems at a speed and scale no human can match. And they're directable to a fault: when given a bad plan, an agent doesn’t push back. It executes at machine speed across every system it can reach before anyone notices that something is wrong. 

That combination is genuinely new, and it’s what makes enterprise AI security infrastructure an unsolved problem. Human employees are unpredictable but limited in what they can actually do, and they push back when something feels wrong. Traditional software is capable but does exactly what you coded it to.

Agents are unpredictable, like humans, and capable, like software, but without the human judgment to question a bad plan or the determinism to at least fail consistently. We don’t need perfect agents. We need to manage imperfect ones the same way we’ve always managed imperfect humans. 

The foundation: out-of-band governance

Before getting into the specifics, there’s one principle that everything else in this framework depends on: governance must be enforced via channels that agents cannot access, modify, or circumvent.

Or more succinctly: out-of-band metadata.

Any policy enforced through the agent is only as strong as the agent’s ability to perfectly retain and obey it. Put rules in the system prompt, and prompt injection can override them entirely. Train the agent to respect boundaries, and hallucination can cause it to confidently invent permissions it doesn’t have. Even routine context management can silently drop the rules it was told to follow. 

Real governance runs out-of-band, outside the agent’s data path, invisible to it, and enforced by infrastructure the agent cannot touch. The agent doesn’t get a vote.
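To make the principle concrete, here is a minimal sketch of a gateway that sits between an agent and its tools. Everything in it (`ToolGateway`, `Policy`, the session IDs, the two toy tools) is a hypothetical name invented for illustration, not a real product API. The point is only that the policy lives in the gateway, keyed by a session ID the infrastructure assigns, so nothing in the agent's prompt or output can change it.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet


class PolicyViolation(Exception):
    """Raised when the gateway refuses a call."""


@dataclass(frozen=True)
class Policy:
    allowed_tools: FrozenSet[str]


class ToolGateway:
    """Hypothetical enforcement point between the agent and its tools."""

    def __init__(self, tools: Dict[str, Callable[..., Any]]) -> None:
        self._tools = tools
        self._policies: Dict[str, Policy] = {}

    def bind_policy(self, session_id: str, policy: Policy) -> None:
        # Called by the control plane when a session is provisioned,
        # never by the agent itself.
        self._policies[session_id] = policy

    def call(self, session_id: str, tool: str, **kwargs: Any) -> Any:
        policy = self._policies.get(session_id)
        # Unknown sessions and unlisted tools are both denied by default.
        if policy is None or tool not in policy.allowed_tools:
            raise PolicyViolation(f"{tool!r} is not permitted for this session")
        return self._tools[tool](**kwargs)


def read_orders(customer_id: str) -> str:
    return f"orders for {customer_id}"


def drop_table(name: str) -> str:
    return f"dropped {name}"


gateway = ToolGateway({"read_orders": read_orders, "drop_table": drop_table})
gateway.bind_policy("session-1", Policy(frozenset({"read_orders"})))
```

With this in place, `gateway.call("session-1", "drop_table", name="orders")` raises `PolicyViolation` no matter what an injected prompt talked the agent into. In a single Python process the agent could still reach into the gateway object, so a real deployment would presumably put this check in a separate process or network hop.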

The four pillars of AI agent governance at scale

Each of the four pillars below addresses a specific way existing infrastructure fails when agents are in the loop. 

Identity

Most agent deployments today use shared API keys or service account tokens, which is the equivalent of an entire department sharing one badge. You can’t tell one agent’s actions from another’s, and tracing anything back to the human who kicked off the task is basically impossible. 

The fix is to make the identity instance-bound: each agent instance gets its own cryptographic identity tied to this specific task and to this specific person or delegation chain. Spin up a copy without going through provisioning, and it doesn’t get in. It’s like issuing a new employee a badge on their first day, except agents get a new one for every shift.

For delegation, where an agent acts on behalf of a human, that chain needs to be carried by infrastructure the agent can’t touch. Not in the prompt or in a header the agent can modify. Every system the agent touches should know not just who the agent is, but who sent it. Some standards efforts are emerging here, including OAuth 2.0 Token Exchange (RFC 8693), but most deployed systems today have no concept of this. 
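As a rough illustration of instance-bound, delegation-aware identity, the sketch below issues a signed, short-lived credential per agent instance. The claim names `sub` and `act` loosely mirror the subject and actor claims described in RFC 8693, but this is not an implementation of that spec. The functions `issue_credential` and `verify_credential`, the `task` claim, and the use of a shared HMAC key are all assumptions made to keep the example standard-library only; a production system would more likely use asymmetric signatures and keep the token in a sidecar the agent never reads.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time
from typing import Optional

# Held by the (hypothetical) identity issuer only, never by the agent.
SIGNING_KEY = secrets.token_bytes(32)


def issue_credential(on_behalf_of: str, task_id: str,
                     ttl_seconds: int = 900,
                     now: Optional[float] = None) -> str:
    issued = int(time.time() if now is None else now)
    claims = {
        "sub": on_behalf_of,  # the human who delegated the task
        "act": {"sub": f"agent-instance:{secrets.token_hex(8)}"},  # this instance
        "task": task_id,
        "iat": issued,
        "exp": issued + ttl_seconds,
    }
    payload = base64.urlsafe_b64encode(
        json.dumps(claims, sort_keys=True).encode()
    ).decode()
    signature = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{signature}"


def verify_credential(token: str, now: Optional[float] = None) -> dict:
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    claims = json.loads(base64.urlsafe_b64decode(payload))
    current = time.time() if now is None else now
    if current >= claims["exp"]:
        raise ValueError("credential expired")
    return claims
```

Every downstream system that verifies the token learns both who the agent instance is (`act`) and who sent it (`sub`). Because the payload is signed, an agent that edits either value invalidates the credential.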

Authorization

The problem with most agent authorization today is that agents are given a role’s worth of permissions for a single task’s worth of work. In other words, their permissions are often far broader than any individual task requires.

What you actually want is the agent to get read access to the specific three tables it needs for a specific job, and for those permissions to evaporate once the job is done. Everything else should simply not exist from the agent’s perspective. 

To put it briefly, agent authorization needs to be narrowly scoped to the task at hand, short-lived, bounded by hard limits, and capped at the permissions of the human the agent is acting on behalf of. A marketing intern can’t access the production database, so neither should their agent. It’s like a visitor badge: having an employee escort you doesn’t get you into the server room if visitors aren’t permitted there.
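One way to picture task-scoped authorization is as a set intersection with an expiry. The sketch below is an assumption-laden toy: `Grant`, `grant_for_task`, and the `(action, resource)` permission tuples are hypothetical names, and a real system would enforce this in the data layer rather than in application code.

```python
import time
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Tuple

Permission = Tuple[str, str]  # (action, resource), e.g. ("read", "crm.leads")


@dataclass(frozen=True)
class Grant:
    permissions: FrozenSet[Permission]
    expires_at: float

    def allows(self, action: str, resource: str,
               now: Optional[float] = None) -> bool:
        current = time.time() if now is None else now
        # Anything not explicitly granted, or requested after expiry, is denied.
        return current < self.expires_at and (action, resource) in self.permissions


def grant_for_task(requested: Iterable[Permission],
                   human_permissions: Iterable[Permission],
                   ttl_seconds: float,
                   now: Optional[float] = None) -> Grant:
    start = time.time() if now is None else now
    # The intersection guarantees the agent never exceeds the delegating human.
    allowed = frozenset(requested) & frozenset(human_permissions)
    return Grant(allowed, start + ttl_seconds)


intern_permissions = {("read", "crm.leads"), ("read", "crm.campaigns")}
task_request = {("read", "crm.leads"), ("read", "prod.orders")}
grant = grant_for_task(task_request, intern_permissions, ttl_seconds=600, now=0)
```

Here the intern's agent asked for `prod.orders`, but the grant only contains `crm.leads`, because that is all the request and the intern's own permissions have in common. After 600 seconds, `grant.allows(...)` returns `False` for everything.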

Observability and explainability

When something goes wrong with traditional software, you can debug and find the if statement that made the bad decision. There’s no equivalent for an LLM. If you want to understand why an agent did what it did, you have two options: ask it (unreliable, for obvious reasons) or analyze everything that went in and everything that came out and draw your own conclusions. 

That means the transcript has to be complete. Not metadata, not just “the agent called this API at this timestamp,” but every input, every output, every tool call with every argument and every response, captured by infrastructure the agent can’t touch.

  • Observability is: “Can I see what happened?” 
  • Explainability is: “Can I reconstruct what happened and justify it to a regulator, an auditor, or the customer it affected?” 

Getting to explainability means the transcript needs to be structured and versioned: which model, which prompt, which tool versions were running when the decision was made. The EU AI Act already imposes record-keeping and logging requirements on high-risk AI systems, and that’s only going to spread.

The takeaway here is to record everything, and to capture the transcript with infrastructure the agent can’t influence (a throwback to the out-of-band principle). Don’t trust an agent to be its own record-keeper.
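Here is a minimal sketch of what a structured, versioned transcript could look like. The `Transcript` class and its field names are hypothetical, and the hash chain is my own addition to show one way of making edits detectable; the post itself only calls for complete, versioned, infrastructure-captured records. Payloads are assumed to be JSON-serializable.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


class Transcript:
    """Append-only log, written by infrastructure rather than by the agent."""

    def __init__(self, model: str, prompt_version: str,
                 tool_versions: Dict[str, str]) -> None:
        # Version context is stamped onto every record.
        self._context = {
            "model": model,
            "prompt_version": prompt_version,
            "tool_versions": tool_versions,
        }
        self.records: List[Dict[str, Any]] = []

    def append(self, kind: str, payload: Dict[str, Any]) -> Dict[str, Any]:
        body = {
            "seq": len(self.records),
            "kind": kind,  # e.g. "input", "tool_call", "tool_result", "output"
            "payload": payload,
            "context": self._context,
            "prev": self.records[-1]["hash"] if self.records else GENESIS,
        }
        encoded = json.dumps(body, sort_keys=True)
        record = json.loads(encoded)  # detached copy of what was hashed
        record["hash"] = hashlib.sha256(encoded.encode()).hexdigest()
        self.records.append(record)
        return record

    def verify(self) -> bool:
        prev = GENESIS
        for record in self.records:
            body = {k: v for k, v in record.items() if k != "hash"}
            encoded = json.dumps(body, sort_keys=True)
            digest = hashlib.sha256(encoded.encode()).hexdigest()
            if body["prev"] != prev or digest != record["hash"]:
                return False
            prev = record["hash"]
        return True
```

A caller would log each step with something like `transcript.append("tool_call", {"tool": "sql_query", "args": {...}})`. Note that chaining alone only detects edits if the latest hash is also stored somewhere the writer can't reach; anyone able to rewrite the whole log could recompute the chain.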

Accountability and control

When something goes wrong, you need to be able to follow the chain from what the agent did all the way back to the specific human who authorized the task. Not, “this API key belongs to the engineering team,” but the actual person who set the thing in motion. 

Then there's the kill switch dilemma. If an agent goes off the rails, do you revoke the API key that ninety-nine other agents are also using? Here's where instance-bound identity pays off: you can revoke this specific agent instance without taking down the other ninety-nine, then use the full transcripts to understand and remediate whatever it already did before you caught it.
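The accountability chain and the per-instance kill switch can be sketched together as a small registry. `InstanceRegistry` and its methods are hypothetical names, and the in-memory dictionary stands in for whatever durable store a real control plane would use.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class InstanceRecord:
    authorized_by: str  # the specific human who set the task in motion
    task_id: str
    revoked_reason: Optional[str] = None


class InstanceRegistry:
    """Hypothetical control-plane record of every live agent instance."""

    def __init__(self) -> None:
        self._instances: Dict[str, InstanceRecord] = {}

    def register(self, instance_id: str, authorized_by: str, task_id: str) -> None:
        self._instances[instance_id] = InstanceRecord(authorized_by, task_id)

    def revoke(self, instance_id: str, reason: str) -> None:
        # Affects exactly one instance; every other instance keeps working.
        self._instances[instance_id].revoked_reason = reason

    def is_active(self, instance_id: str) -> bool:
        record = self._instances.get(instance_id)
        return record is not None and record.revoked_reason is None

    def responsible_human(self, instance_id: str) -> str:
        return self._instances[instance_id].authorized_by


registry = InstanceRegistry()
for n in range(100):
    registry.register(f"agent-{n}",
                      authorized_by=f"user-{n % 5}@example.com",
                      task_id=f"task-{n}")

registry.revoke("agent-42", reason="unexpected bulk delete")
still_active = [n for n in range(100) if registry.is_active(f"agent-{n}")]
```

After the revocation, `still_active` holds the other ninety-nine instances, and `registry.responsible_human("agent-42")` still points at the person who authorized the task, which is where remediation starts.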

However, it’s less about stopping agents that have gone wrong and more about keeping them from going wrong in the first place. Most enterprise agent deployments today lean on human-in-the-loop as the primary safety mechanism. That’s fine as a starting point, but it doesn’t scale. 

If you can get four things in place (identity, scoped authorization, full transcripts, and clear accountability chains), you’ll have something most enterprises don't have today: the infrastructure to manage agents the way you manage employees. Yes, that means establishing constraints, but it also means development, performance reviews, and escalating trust as agents prove themselves. The same infrastructure that makes governance possible makes management possible.

The AI agent governance cheat sheet

The conversation around agent governance is growing, although much of it focuses on improving the models, tightening alignment, and reducing hallucinations. That work is certainly important, but we also need the institutional infrastructure that lets imperfect agents do real work safely.

So here's the cheat sheet. Clip this to the fridge:

The agents aren't the problem. The missing infrastructure between agents and your data is the problem. Agents are unpredictable, capable at machine scale, and directable to a fault. They’re fundamentally a new kind of coworker. We don't need perfect agents. We need to manage imperfect ones, just like we manage imperfect humans.

The foundation is out-of-band governance. Any policy enforced through the agent (e.g., in its prompt, in its training, in its good intentions) is only as strong as the agent's ability to perfectly retain and obey it. Real governance runs in channels the agent can't access, modify, or even see.

That governance has to cover four things:

  • Identity. Instance-bound, delegation-aware. Every agent instance gets its own cryptographic identity, and every on-behalf-of chain is propagated faithfully through infrastructure the agent doesn't control.
  • Authorization. Scoped per-task, short-lived, restricted. Not a human role's worth of permissions for a single task's worth of work.
  • Observability and explainability. Full-fidelity, versioned, infrastructure-captured transcripts of every input, output, and tool call. Not metadata or self-reports. The whole thing recorded out-of-band.
  • Accountability and control. Clear chains from every agent action to a responsible human, and kill switches that are fast enough and precise enough to actually contain the damage.

Get started with AI agent governance

Every major paradigm shift in how we work has demanded new governance infrastructure. It feels impossibly complex at the start, and then we build the systems, establish the norms, iterate. We never waited for perfect employees. We built systems that made imperfect ones successful, and we can do exactly the same thing for agents. 

Now we need to build the infrastructure that lets agents be their best selves—the digital coworkers we know they can be.

In the meantime, if you want to learn the most practical approach to get started, watch my talk on “Putting AI to work: Governing Enterprise Agents in Production” or get in touch and let’s chat through what your agent infrastructure needs to work safely at scale. 

---

This post draws on a piece published by Redpanda CTO Tyler Akidau on O’Reilly Radar. Read the full version: Posthuman: We All Built Agents. Nobody Built HR. 
