Governance

Runtime Authorization vs Policy Engines

Engineering teams often ask whether they should adopt a policy engine like Open Policy Agent (OPA) or Cedar, or whether they should adopt a runtime authorization system. The question is phrased as a choice, but it conflates two different parts of the stack. A policy engine evaluates rules against structured inputs. Runtime authorization is the system you build around a policy engine so that AI agents cannot commit actions without an explicit deterministic decision. One is a tool. The other is a discipline. You almost certainly want both.

Key concepts

This post explains the layering, how the pieces compose, what a policy engine cannot do on its own, and what a runtime authorization system adds in the specific case of AI agents.

What a policy engine actually is

A policy engine is a program that evaluates declarative rules against structured input and returns a decision. OPA evaluates Rego. Cedar evaluates Cedar. AWS IAM policies evaluate JSON policies. The shared abstraction is clean: take a policy and a structured input, produce a decision.

Policy engines are excellent at this. They are mature, fast, well tested, and have rich ecosystems of tooling. They are the right primitive for "given an input, does policy permit it." The Kubernetes admission control community uses OPA for exactly this shape of problem. The AWS IAM system uses policy evaluation for every API call. The Cedar language was designed at Amazon and published specifically for application-level authorization.
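Reduced to code, that abstraction looks like this. A toy sketch with illustrative names, not any engine's real API: the point is only the shape, rules plus structured input in, decision out.

```python
from typing import Any, Callable

# The shared abstraction: a policy engine is a function from a
# structured input to a decision value. Rego, Cedar, and IAM differ
# in rule language, not in this shape.
PolicyEngine = Callable[[dict[str, Any]], str]

def make_engine(rules: list) -> PolicyEngine:
    """Toy engine: the first rule that speaks wins."""
    def evaluate(request: dict[str, Any]) -> str:
        for rule in rules:
            result = rule(request)
            if result is not None:
                return result
        return "no_match"  # no rule spoke to this input
    return evaluate

# One toy rule: permit reads, say nothing about everything else.
engine = make_engine([lambda r: "permit" if r.get("action") == "read" else None])
```

Everything else in this post is about the system that has to live around that function.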

What they are not is a system. A policy engine evaluates what you give it. It has no opinion about how the input was constructed, whether the call chain that produced the input was trustworthy, what happens between the decision and the side effect, or how the decision should be recorded. Those questions belong to the layer around the engine. That layer is runtime authorization.

The layering, in one picture

The cleanest way to see the relationship is to draw the call path explicitly:

AI agent / planner
        │
        ▼
tool-call intent (structured)
        │
        ▼
┌────────────────────────────┐
│ runtime authorization gate │
│                            │
│  - validates request shape │
│  - binds policy version    │
│  - assembles inputs        │
│  ┌──────────────────────┐  │
│  │   policy engine      │  │
│  │   (OPA / Cedar)      │  │
│  └──────────────────────┘  │
│  - maps result to          │
│    PERMIT / DENY / SILENCE │
│  - signs receipt           │
└────────────────────────────┘
        │ (only PERMIT)
        ▼
execution surface

The policy engine sits inside the runtime authorization gate. The gate is responsible for everything the engine does not do: shaping the input, committing to a policy version, producing a receipt, enforcing fail-closed semantics, and refusing to call the actuation surface unless the decision is PERMIT.

What the gate adds on top of the engine

It is worth being specific about what a policy engine does not provide, because teams that pick up OPA or Cedar and call it done tend to hit the same gaps in production.

A stable request contract

A policy engine accepts whatever input shape you feed it. That flexibility is a feature for the engine and a liability for the system. If every service produces slightly different inputs, policies accumulate defensive checks and drift into mutually incompatible dialects. A runtime authorization gate pins the request shape: actor, surface, action, target, context, idempotency key, policy version. The shape is the contract. Services that want to call the gate conform to it. Policies that want to evaluate an action can assume the inputs are well-formed.
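A sketch of such a pinned contract, assuming Python on the gate side; the field names follow the list above, and the validation rules are illustrative:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AuthzRequest:
    # The contract: every caller conforms, every policy can assume it.
    actor: str
    surface: str
    action: str
    target: dict
    context: dict
    idempotency_key: str
    policy_version: str

    def __post_init__(self):
        # Reject malformed requests before they ever reach policy.
        for name in ("actor", "surface", "action",
                     "idempotency_key", "policy_version"):
            value = getattr(self, name)
            if not isinstance(value, str) or not value:
                raise ValueError(f"{name} must be a non-empty string")
```

Services conform to this shape at the edge, so policies never accumulate defensive checks for missing or malformed fields.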

Explicit decision semantics

OPA returns whatever your Rego says. Cedar returns an effect. Neither enforces a three-valued decision model. Runtime authorization for AI agents needs three outcomes, not two: PERMIT when a rule explicitly allows the action, DENY when a rule explicitly refuses it, and SILENCE when no rule speaks to the action at all.

SILENCE is how you make "explicit allow, implicit deny" enforceable. A policy engine can be configured to behave this way, but the behavior is not in the engine - it is in the wrapper. The wrapper is the gate. See decision vs enforcement in AI systems for a deeper dive on why keeping these two concepts distinct matters.
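The wrapper behavior can be pinned down in a few lines. A sketch, with `engine` standing in for any policy engine callable: anything that is not an explicit permit or deny, including an engine failure, collapses to SILENCE.

```python
from enum import Enum

class Decision(Enum):
    PERMIT = "permit"
    DENY = "deny"
    SILENCE = "silence"  # no rule spoke; treated as not-allowed

def gate_decide(engine, request: dict) -> Decision:
    """Map raw engine output onto the three-valued model, fail-closed."""
    try:
        raw = engine(request)
    except Exception:
        # Engine unavailable or crashed: the gate stays silent.
        return Decision.SILENCE
    if raw == "permit":
        return Decision.PERMIT
    if raw == "deny":
        return Decision.DENY
    # Unknown, empty, or unexpected output is never an allow.
    return Decision.SILENCE
```

Note that the fail-closed branch lives entirely in the wrapper; no configuration inside the engine can provide it.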

Policy versioning and replay

A policy engine evaluates whichever policy is loaded into it right now. That is fine for most request-time decisions, but it is not enough for audit. To defend a decision three months later, you need to know exactly which policy bundle was active when the decision was made, and you need the ability to re-run the same input against the same policy version and get the same answer. Runtime authorization binds every decision to an immutable policy version and keeps the policy artifact around long enough to replay. The engine is the evaluator. The gate is the custodian.
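One way to sketch the custodian role is to make the policy version the content hash of the bundle, so a version can never silently change underneath a recorded decision. Illustrative, not a real store; `evaluate` stands in for whatever engine is loaded.

```python
import hashlib

class PolicyStore:
    """Content-addressed store: a version is the hash of its bundle."""
    def __init__(self):
        self._bundles: dict[str, str] = {}

    def publish(self, policy_text: str) -> str:
        # Identical content always yields the identical version.
        version = hashlib.sha256(policy_text.encode()).hexdigest()[:16]
        self._bundles[version] = policy_text
        return version

    def get(self, version: str) -> str:
        return self._bundles[version]

def replay(store: PolicyStore, evaluate, version: str, request: dict) -> str:
    # Re-run the same input against the same policy artifact,
    # months later, and get the same answer.
    return evaluate(store.get(version), request)
```

The engine evaluates; the store guarantees the thing it evaluated is still around, byte for byte, when the audit question arrives.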

Signed receipts

Neither OPA nor Cedar signs its decisions. They return a result, the caller logs it, and the decision evaporates into application logs that may or may not still be there next quarter. Runtime authorization produces a signed receipt for every decision, binding request, policy version, and outcome together with a signature that a separate verification surface can check without trusting application logs. This is the difference between evidence and narrative. For the receipt shape and verification flow see verify and protocol/receipts.
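The receipt idea can be sketched with an HMAC over the canonicalized decision record. A production gate would use asymmetric signatures so verifiers never hold signing material; all names here are illustrative.

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key"  # illustrative; a real gate uses managed keys

def sign_receipt(request: dict, policy_version: str, decision: str) -> dict:
    """Bind request, policy version, and outcome under one signature."""
    body = {
        "request_hash": hashlib.sha256(
            json.dumps(request, sort_keys=True).encode()).hexdigest(),
        "policy_version": policy_version,
        "decision": decision,
    }
    payload = json.dumps(body, sort_keys=True).encode()
    body["signature"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return body

def verify_receipt(receipt: dict) -> bool:
    # A separate surface can check this without trusting application logs.
    body = {k: v for k, v in receipt.items() if k != "signature"}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt["signature"])
```

Flip any field after the fact and verification fails, which is exactly the property application logs cannot give you.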

Actuation coupling

The most important thing the gate does is the thing the engine cannot do on its own: refuse to let the action reach the execution surface unless the decision is PERMIT. A policy engine returns a value. What happens next is up to the caller. That coupling is the control. If the tool adapter looks at the result and decides to proceed anyway under some fallback path, the engine's answer was advisory. Runtime authorization makes the coupling structural: the SDK that wraps every tool call treats the gate's response as binding. Without PERMIT, the action does not dispatch.
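Structurally, the coupling is a wrapper the agent cannot route around: the tool function is simply unreachable without a PERMIT. A sketch; `gate` is assumed to return one of the three decision strings.

```python
class NotAuthorized(Exception):
    pass

def gated_tool(gate, tool):
    """Wrap a tool so dispatch is impossible without an explicit PERMIT."""
    def call(request: dict, **kwargs):
        decision = gate(request)
        if decision != "permit":
            # DENY and SILENCE both block; there is no fallback path.
            raise NotAuthorized(f"decision was {decision!r}")
        return tool(**kwargs)
    return call
```

The agent receives the wrapped `call`, never the raw `tool`, which is what makes the coupling structural rather than voluntary.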

Where the gate plugs OPA or Cedar in

A common and sound design is to let engineering teams continue using their existing policy engine inside the gate. Rego-first shops keep their Rego bundles. Cedar-first shops keep their Cedar policies. The gate supplies the envelope; the engine supplies the rules.

In Rego terms, a gate-style policy looks like this (illustrative, not a library binding):

package trigguard.authz

import rego.v1

default decision := "silence"

decision := "permit" if {
    input.actor.id in data.agents.approved
    input.surface in data.surfaces.allowed
    not blocked_by_freeze
    not amount_over_threshold
}

decision := "deny" if {
    amount_over_threshold
    not input.context.dual_control_approved
}

blocked_by_freeze if {
    data.ops.change_freeze
    input.surface in data.surfaces.change_sensitive
}

amount_over_threshold if {
    input.action == "transfer"
    input.target.amount > data.thresholds.transfer_max
}
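When the engine is OPA, the gate's evaluation call is typically OPA's REST data API against the package above. A sketch, assuming a default local OPA listening on port 8181 with that bundle loaded:

```python
import json
import urllib.request

OPA_URL = "http://localhost:8181"  # assumed local OPA with the bundle loaded

def build_opa_query(request: dict) -> tuple:
    """OPA's data API: POST /v1/data/<package path>/<rule>,
    with the request under the 'input' key."""
    url = f"{OPA_URL}/v1/data/trigguard/authz/decision"
    return url, json.dumps({"input": request}).encode()

def parse_opa_result(body: bytes) -> str:
    """OPA returns {'result': <value>}; a missing result is SILENCE."""
    result = json.loads(body).get("result")
    return result if result in ("permit", "deny") else "silence"

def query_opa(request: dict) -> str:
    url, payload = build_opa_query(request)
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=0.2) as resp:
        return parse_opa_result(resp.read())
```

The tight timeout is deliberate: a slow or missing OPA falls into the gate's SILENCE path rather than blocking the agent indefinitely.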

What the gate does with that engine is the interesting part:

1. The gate receives the structured request from the agent SDK.
2. The gate validates the request shape and rejects malformed inputs before they reach policy.
3. The gate attaches the active policy version and derived context (time, actor roles, risk signals).
4. The gate calls the policy engine with the assembled input.
5. The gate translates decision into one of PERMIT, DENY, SILENCE, refusing to emit anything else.
6. The gate produces a signed receipt containing the request, the decision, the policy version, and a content hash.
7. The gate returns the decision to the agent SDK, which only dispatches on PERMIT.

The engine does step 4. The gate does everything else. If you skip the gate and call the engine directly from the agent, you have a policy evaluator wired into a generation loop and no system. That is the failure mode most teams hit first.
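Those steps can be condensed into one pipeline sketch. Illustrative only: the receipt signature from step 6 is elided, and the helpers stand in for real SDK machinery.

```python
import time

def authorize(request: dict, engine, policy_version: str) -> dict:
    """The gate pipeline: everything around the engine call."""
    # Steps 1-2: receive and validate shape before policy sees it.
    required = {"actor", "surface", "action", "target"}
    if not required <= request.keys():
        return {"decision": "silence", "reason": "malformed request"}
    # Step 3: bind the policy version and derived context.
    enriched = {**request, "policy_version": policy_version,
                "context": {"ts": time.time()}}
    # Step 4: the engine evaluates. Step 5: map onto the closed set.
    try:
        raw = engine(enriched)
    except Exception:
        raw = None
    decision = raw if raw in ("permit", "deny") else "silence"
    # Steps 6-7: receipt (signature elided) and return to the SDK,
    # which dispatches only on "permit".
    return {"decision": decision, "policy_version": policy_version}
```

One function body, one engine call, and six responsibilities that are not the engine's.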

What about IAM?

A frequent follow-up is whether a cloud IAM system already covers this. IAM is a policy engine plus a specific set of predefined inputs and a specific set of execution surfaces (cloud APIs). It works extremely well for its scope: the surfaces it controls are the cloud provider's own APIs, the inputs are the signed request headers, and the decision is enforced by the provider's API gateway.

IAM stops being enough the moment the action crosses into a surface IAM does not control. A payment to an external processor. A write to an on-premises EHR. A message to a customer. A CRM update. An internal microservice with its own data model. IAM did not evaluate those. An agent that combines an IAM-gated cloud call with a non-IAM-gated external action is already past the boundary IAM was designed to protect.

Runtime authorization is the layer that generalizes the IAM discipline to the full set of surfaces an AI agent can touch, not just the cloud provider's APIs. It does so by putting the decision in front of every tool call, regardless of where the tool eventually lands.

Deployment patterns

Teams that have landed this successfully tend to converge on one of two shapes.

Sidecar gate

Every agent workload runs alongside a local authorization sidecar. The SDK in the agent process makes a local gRPC or HTTP call to the sidecar for each tool intent. The sidecar embeds the policy engine and a cached policy bundle. Latency is single-digit milliseconds per decision. Failure of the sidecar is treated as SILENCE; the agent does not dispatch. This is the pattern most service-mesh users will find familiar from their existing OPA deployments.
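The fail-closed rule for the sidecar is worth pinning in code: any transport failure, crash, or timeout reads as SILENCE, never as an allow. A sketch, with a generic callable standing in for the local gRPC/HTTP call:

```python
import concurrent.futures

def decide_via_sidecar(call_sidecar, request: dict,
                       timeout_s: float = 0.05) -> str:
    """A slow, crashed, or unreachable sidecar must read as SILENCE:
    unavailability can never widen what the agent may do."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(call_sidecar, request)
        decision = future.result(timeout=timeout_s)
    except Exception:  # timeout, connection error, sidecar crash
        return "silence"
    finally:
        pool.shutdown(wait=False)
    return decision if decision in ("permit", "deny") else "silence"
```

The same rule applies to the central decision service below; only the transport and the timeout budget change.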

Central decision service

A dedicated authorization service handles decisions for many agents. The SDK makes a remote call per tool intent. The central service has higher fan-in visibility (cross-agent coordination, global rate limits, cluster-wide risk signals) at the cost of a slightly higher per-call latency and a stronger dependency on the authorization service's availability. Fail-closed on timeout is still the rule.

Both shapes use the same policy engine underneath. Both enforce the same decision contract and receipt shape. The choice is about locality and availability, not about whether the policy engine is involved. See deterministic authorization for the shared contract and architecture for the deployment detail.

What you should actually adopt

If you are starting from zero, the order that tends to work is:

1. Pin the request contract, so every caller and every policy share one shape.
2. Wire the gate with three-valued, fail-closed decision semantics.
3. Plug in the policy engine your team already runs, whether that is OPA or Cedar.
4. Add policy versioning and signed receipts, so decisions can be replayed and defended later.

The right posture is that the policy engine is an implementation detail of the gate. You should be able to swap engines later without changing the gate contract. That is only true if the gate contract is the top-level abstraction, not the engine's native API.
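Keeping the engine swappable is easiest when the gate codes against an interface rather than an engine's native API. A sketch using a structural interface, with toy adapters standing in for real OPA and Cedar bindings:

```python
from typing import Protocol

class Engine(Protocol):
    """The gate's view of any policy engine: one evaluation method."""
    def evaluate(self, request: dict) -> str: ...

class RegoAdapter:
    """Would call OPA in production; toy logic stands in here."""
    def evaluate(self, request: dict) -> str:
        return "permit" if request.get("action") == "read" else "silence"

class CedarAdapter:
    """Would call Cedar in production; toy logic stands in here."""
    def evaluate(self, request: dict) -> str:
        return "deny" if request.get("action") == "delete" else "silence"

def gate(engine: Engine, request: dict) -> str:
    # The gate contract is the top-level abstraction;
    # the engine behind it is an implementation detail.
    result = engine.evaluate(request)
    return result if result in ("permit", "deny") else "silence"
```

Swapping Rego for Cedar then touches only the adapter, never the gate contract or the SDKs that depend on it.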

Frequently asked questions

Is runtime authorization just a wrapper around a policy engine?

It includes a wrapper, but the wrapper is most of the value. The engine answers "does policy permit this." The gate answers "is this action allowed to commit, under which policy version, with what evidence, and what happens if the evaluator is unavailable." Those extra questions are what makes the control defensible in production and in audit.

Can I use OPA directly from my agent and skip the gate?

You can, but you have then built a policy-advised agent, not an authorization-gated one. The coupling between decision and execution is voluntary on the agent's part. Any bug, any fallback path, any timeout handler that proceeds on error, re-opens the actuation surface. The gate exists precisely so that coupling is structural.

Does the gate slow things down noticeably?

In well-designed deployments, the gate adds single-digit millisecond latency for cached policy evaluation. The agents touching the gate are almost always already making external calls that take tens or hundreds of milliseconds. The relative cost is negligible. The absolute cost is lower than almost any other control in the stack.

Next step

If you are already running OPA or Cedar somewhere in your stack and want to understand how to put it in front of agent tool calls, start with runtime authorization for AI agents for the overall model, then read the protocol spec for the request and receipt shapes. The decision model page covers the three-valued outcome semantics in more detail.

