Who Authorized That? The Delegation Problem in Multi-Agent AI

Your AI agent booked a meeting, summarized a financial report, and emailed the highlights to three stakeholders. To do this, it called a calendar agent, a document analysis agent, and an email agent. Each accessed internal systems, made decisions about what to include, and acted on your behalf.

Here’s the question your security team can’t answer: Who authorized the email agent to read that financial report?

In most current architectures, the honest answer is no one explicitly. The logs may show that a service called another service. But they can’t show that the delegation itself was authorized. The authorization didn’t fail loudly. It leaked silently through the chain.

This is the delegation problem in multi-agent AI. As enterprises connect agents through protocols such as MCP and A2A, they’re solving the connectivity problem faster than they’re solving the authority problem. The result is a new security boundary that most enterprise architectures have not yet modeled, precisely because most organizations still treat it as orchestration rather than authorization.

Agents are connecting faster than authorization is adapting

The agent ecosystem has moved fast over the past two years. Anthropic’s MCP gave model-powered applications a standard way to connect to tools, data sources, and services. Google’s A2A protocol gave agents a standard way to communicate and coordinate across systems. Frameworks and SDKs such as LangChain, CrewAI, and Google’s ADK made it easier to build multi-agent workflows where one agent orchestrates several others.

What these protocols don’t yet provide, at least not as a mature common layer, is a delegation-aware authorization model.

MCP describes a protected server as an OAuth 2.1 resource server, with the MCP client acting as an OAuth client making requests on behalf of a resource owner. That’s a familiar and well-understood pattern, but it was designed for a world where a human clicks “Allow” and a single client gets a scoped token. It doesn’t address what happens when Agent A receives that token, delegates a subtask to Agent B, and Agent B spawns Agent C to handle part of it. Each hop in that chain either reuses the original token (overprivileged) or has no token at all (untracked).

A2A was built for interoperability: independent, potentially opaque agent systems communicating and coordinating actions across enterprise platforms. That’s the right problem to solve. But communication and delegation governance are different layers. A2A helps agents discover, describe, and communicate with one another. This is necessary infrastructure, but it isn’t the same as delegated authority. It doesn’t tell you whether a specific downstream action was legitimately derived from an upstream instruction.

Static API keys are even weaker for this problem. A key grants access to a service. It says nothing about who is using it, what they’re using it for, or whether the entity presenting it is the same one it was issued to. Service accounts identify a workload, not an intent. When three agents share a service account, every action looks the same in your logs.

None of these tools are broken. They solve different problems. The gap is structural. Authentication answers which agent is calling. Authorization defines what that agent may access. The harder question, and the one most enterprise architectures are not yet designed to answer, is whether a specific downstream action was legitimately derived from an upstream instruction, under narrowed constraints, with a verifiable chain back to a human decision. That’s the delegation question, and it sits in a layer that today’s stack doesn’t really have.

In a clean version of this picture, privilege should sit only with the agent that touches the outside world. If a payer (A) asks a bookkeeper agent (B) to make a payment, and the bookkeeper asks a banking agent (C) to execute the transfer, only the banking agent needs banking authority. The bookkeeper doesn’t need to move money. It only needs to know the request came from an authorized payer. The banking agent only needs to know the request came from an authorized bookkeeper. This is the principle of least privilege, a concept the security community has lived with for decades, applied to delegation chains. The difficulty is that today’s agent stacks make it hard to enforce.

What breaks in the chain

Consider a treasury reporting workflow in a regulated bank. A planning agent is allowed to read liquidity projections and produce a daily summary for senior finance users. To complete the task, it delegates chart generation to a visualization agent and narrative review to a communications agent. The visualization agent doesn’t need access to raw account-level data. The communications agent doesn’t need access to the underlying liquidity model. Yet unless the delegation layer attenuates permissions, both may receive more context than their task requires. The result isn’t a dramatic breach, but it is a quiet expansion of access that the access-control model never explicitly approved.

The risk isn’t limited to internet-facing agents. Many delegation failures happen entirely inside the enterprise boundary. An internal agent may call another internal agent, which calls an internal tool, which sends data to an approved SaaS service. Every individual step may look acceptable. The risk appears in the composition: The final data movement or action may exceed the intent of the original authorization.

This pattern creates three categories of failure that enterprises may have to explain to regulators, auditors, or customers.

Ghost permissions. A finance analyst assistant has been given access to a customer transactions database to support quarterly reporting. It calls a summarization agent: “summarize recent transactions for these accounts.” The summarization agent now operates against customer records, even though no policy engine granted it that access. The analyst assistant’s privileges effectively traveled with the request. The permission is a ghost. It exists in practice but not in any authorization system.

Scope drift. Even when an agent starts with narrow permissions, delegation tends to widen scope rather than narrow it. An agent authorized to read Q1 revenue data delegates to a charting agent, which calls an external rendering API, which now has the revenue figures. The data left the organization through three hops of implicit trust. Each agent acted within what it understood as its scope. The aggregate result exceeded what any human would have approved.

Broken audit trails. Regulated industries require the ability to answer “who did what and why” for any consequential action. In a single-agent system, this is manageable. In a multi-agent chain, the audit trail fragments across agents, protocols, and services. When a compliance team asks why a particular customer communication was sent, the answer might involve four agents across two protocols, none of which logged the delegation chain. The action is traceable to a system but not to a decision.

These aren’t edge cases. They’re a common outcome when delegation isn’t modeled explicitly. The delegation problem isn’t a bug in any particular framework. It’s a gap in the layer between them.

What a delegation-aware model requires

A delegation-aware authorization model has to solve four things at once, which is part of why no existing layer covers it cleanly.

The first is identity. The downstream agent needs a cryptographic credential that the receiving system can verify independently, not just a hostname or an API key. Hostnames lie. API keys travel. A real identity is one the calling system cannot fabricate.

The second is attenuation. When an agent delegates a task, the subagent should receive strictly fewer permissions than the parent—never the same set, and certainly never more. This is the principle of least privilege applied to delegation chains, and almost no current tooling enforces it by default.

The third is purpose. “Read this report to summarize liquidity exposure for the CFO” is a different authorization from “read this report and send selected figures to an external charting service.” It may be the same data and the same agent, but it’s two very different risk profiles. Without a purpose binding, the authorization layer has no way to distinguish them.

The fourth is audit. The organization should be able to reconstruct, after the fact, who delegated what, under which constraints, and what evidence each agent produced at completion. Not just which systems were called but which decisions were made and on whose authority.

It’s possible for agents to authenticate successfully even when they don’t have accountable authority. They can prove who they are and still execute actions that no human ever authorized.

Emerging approaches

Several efforts address parts of this problem: workload identity standards, agent metadata in tokens, OAuth-based MCP authorization, A2A authentication patterns, and agent identity frameworks. These are useful building blocks, but identity is not the same as delegated authority. A signed agent card can help establish an agent’s declared identity and capabilities. An OAuth token can tell you what a client may access. Neither, by itself, proves that a specific downstream action was authorized by a specific upstream decision under narrowed constraints.

One emerging pattern is delegation-bound capability tokens: short-lived credentials that bind an invocation to an agent identity, a constrained permission set, and a provenance record. One example is the Agent Identity Protocol (AIP), which I’ve been working on as an Internet-Draft and open source implementation. AIP is still early, but it illustrates the shape of one possible answer: invocation-bound tokens that carry identity, attenuated permissions, and provenance through a delegation chain. The token chain itself becomes part of the audit evidence rather than something reconstructed after the fact from fragmented logs.

Complementary approaches are also emerging. Behavioral credentials, the idea that agents should be continuously reauthorized based on runtime behavior rather than just initial permissions, address a related but distinct problem. Delegation tokens tell you who authorized what. Behavioral monitoring tells you whether the agent is still acting within its authorized profile. A complete solution will likely need both.

None of these approaches have reached mainstream adoption. But the fact that they are emerging simultaneously, from different corners of the industry, signals that the delegation gap is real and recognized.

What enterprise teams should do now

You don’t need to wait for standards to mature before addressing the delegation problem. There are concrete steps that security, platform, and architecture teams can take today.

Map your delegation chains. Most teams deploying multi-agent workflows haven’t documented which agents call which other agents, with what permissions, through which protocols. Start there. If you can’t draw the graph, you can’t secure it.

Audit implicit permissions. For every agent-to-agent interaction, ask: Was this access explicitly granted, or is the downstream agent inheriting permissions by proximity? If the answer is inheritance, you have a ghost permission that needs a policy decision.

Require scope attenuation. Establish an architectural rule: When an agent delegates a task, the subagent must receive fewer permissions than the parent, never more. Current tooling doesn’t enforce this automatically, but you can enforce it in your orchestration layer.

Build the audit trail before the auditor asks. If your organization is in a regulated industry, the question “Who authorized this agent action?” will eventually be asked. The time to instrument delegation logging is before that question arrives, not after. Log the full chain: which agent initiated the task, what permissions were passed, which subagents were invoked, and what each one accessed.

Test with real tooling. Delegation-aware approaches, including capability-token designs, workload identity standards, and agent identity frameworks, are early but functional. Running one in a nonproduction environment will expose gaps in your current authorization model that architecture review alone will not surface.

Delegation is the security boundary

The first phase of enterprise agent adoption was about connectivity: Can the agent reach the tool, the API, the database, or the other agent? The next phase will be about accountable delegation: Should this agent be allowed to ask that agent to do this specific thing, with this data, under these constraints?

That question won’t be answered by prompt engineering. It belongs in the authorization layer, the platform layer, and the audit trail.

Enterprises don’t need to solve the entire standards problem today. But they do need to stop treating delegation as an implementation detail. In multi-agent systems, delegation is the security boundary.

Who Authorized That? The Delegation Problem in Multi-Agent AI – O’Reilly

Agents are connecting faster than authorization is adapting

What breaks in the chain

What a delegation-aware model requires

Emerging approaches

What enterprise teams should do now

Delegation is the security boundary

SnapLogic Launch Brings Governed Enterprise Integration to AI Coding Agents

An Intelligence Operating System for Enterprise AI: Alation’s AIOS

Plex Keeps Getting Worse. Is Jellyfin a Decent Replacement?

Your X feed might now work, as platform finally does what everyone wanted

Leave a reply Cancel reply

ABOUT US

Who Authorized That? The Delegation Problem in Multi-Agent AI – O’Reilly

Agents are connecting faster than authorization is adapting

What breaks in the chain

What a delegation-aware model requires

Emerging approaches

What enterprise teams should do now

Delegation is the security boundary

SnapLogic Launch Brings Governed Enterprise Integration to AI Coding Agents

An Intelligence Operating System for Enterprise AI: Alation’s AIOS

Plex Keeps Getting Worse. Is Jellyfin a Decent Replacement?

Your X feed might now work, as platform finally does what everyone wanted

Plex Keeps Getting Worse. Is Jellyfin a Decent Replacement?

Lindsey Graham, Mitch McConnell, and the American gerontocracy

Kalshi’s legal battles intensify, Google prohibits extensions, Polymarket pushes US expansion, today in prediction market news LIVE

Leave a reply Cancel reply

ABOUT US