
AI agent safety

Agents compress the gap between recommendation and execution. Safety is whether dangerous or irreversible effects are blocked at the execution boundary, not whether the model sounds polite.

When recommendations become actions

Recommendation systems score and rank. Agentic systems issue actions: trades, deploys, tickets, exports, privilege changes. The execution layer introduces irreversible risk because external systems treat a successful call as commitment. That is why AI execution governance centers on authorization before commit, not on nicer wording in the model output.
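The authorization-before-commit pattern can be sketched in a few lines. This is an illustrative toy, not TrigGuard's API: the names `Decision`, `authorize`, and `execute_transfer`, and the single-limit policy, are all assumptions for the example.

```python
# Hypothetical sketch of authorization before commit.
# Decision, authorize, and the limit policy are illustrative, not a real API.
from enum import Enum

class Decision(Enum):
    PERMIT = "PERMIT"
    DENY = "DENY"
    SILENCE = "SILENCE"

def authorize(action: str, amount: float, limit: float = 1000.0) -> Decision:
    """Toy policy: permit small transfers, deny anything over the limit."""
    if action != "transfer":
        return Decision.SILENCE  # no applicable policy: withhold execution
    return Decision.PERMIT if amount <= limit else Decision.DENY

def execute_transfer(amount: float) -> str:
    decision = authorize("transfer", amount)
    if decision is not Decision.PERMIT:
        return f"blocked: {decision.value}"  # the call never reaches the external API
    return "committed"  # only a PERMIT lets the irreversible call proceed

print(execute_transfer(250.0))   # committed
print(execute_transfer(9000.0))  # blocked: DENY
```

The point is structural: the external system only ever sees requests that passed the gate, so "nicer wording in the model output" never becomes a commitment.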

Failure modes in production agent systems

Concrete patterns where speed meets consequence:

  • trading or treasury agents placing orders or transfers;
  • CI/CD bots merging, deploying, or rotating credentials;
  • support copilots mutating accounts, refunds, or entitlements;
  • infrastructure agents changing firewalls, DNS, scaling, or data stores.

In each case, the failure is not bad text; it is an unauthorized execution. See pre-execution authorization for the gate model.

Why monitoring alone is not execution safety

Dashboards, logs, and anomaly detectors tell you what happened. They do not, by themselves, withhold a commit at the moment a tool issues an irreversible call. Execution safety requires a fail-closed authorization step before the execution surface accepts work, not only post-hoc review. Read fail-closed AI systems for default posture.

Why sandboxing and rate limits are not enough

Sandboxes, throttles, and prompt guardrails reduce blast radius and slow abuse. They do not replace a deterministic PERMIT / DENY / SILENCE decision bound to the exact request context at commit time. Policy prompts drift; sandboxes leak when production credentials are reachable. Mitigation without an execution gate still allows almost-safe paths to become incidents.
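Determinism bound to request context can be illustrated by hashing a canonical form of the request and deriving the decision from the same inputs every time. The field names and threshold below are assumptions, not a real wire format.

```python
# Illustrative only: a deterministic decision bound to the exact request
# context via a content hash. Field names and the policy are assumptions.
import hashlib
import json

def request_hash(frame: dict) -> str:
    # Canonical serialization: key order and separators are fixed so the
    # same request always hashes to the same value.
    canonical = json.dumps(frame, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def evaluate(frame: dict) -> tuple[str, str]:
    h = request_hash(frame)
    decision = "PERMIT" if frame.get("amount", 0) <= 500 else "DENY"
    return decision, h  # the decision is bound to this exact context

frame = {"intent": "refund", "amount": 120, "actor": "support-bot"}
assert evaluate(frame) == evaluate(frame)  # same inputs, same decision and hash
```

Contrast this with a policy prompt: rewording the same request can change a model's answer, but it cannot change this function's output without changing the hashed context.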

The agent execution pipeline

Stages on the hot path: signal frame (declared intent and state), policy evaluation, explicit PERMIT / DENY / SILENCE, optional execution surface on PERMIT only, then a verifiable receipt. The figure below shows the full agent-oriented stack for this spoke; the pillar hub keeps a lighter path diagram.
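The hot path above can be sketched end to end. `TGSignalFrame` is named in the figure; the evaluation policy, the `run` wrapper, and the receipt shape are illustrative assumptions.

```python
# Sketch of the hot path: frame -> evaluate -> decide -> execute on PERMIT
# -> receipt material. Only TGSignalFrame is a documented name; the rest
# is an assumed shape for illustration.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass(frozen=True)
class TGSignalFrame:
    intent: str   # declared intent
    state: dict   # relevant state at request time

def evaluate(frame: TGSignalFrame) -> str:
    # Deterministic toy policy: same inputs always yield the same decision.
    if frame.intent not in {"deploy", "refund"}:
        return "SILENCE"                    # out of scope: withhold execution
    return "PERMIT" if frame.state.get("approved") else "DENY"

def run(frame: TGSignalFrame, surface: Callable[[TGSignalFrame], str]) -> dict:
    decision = evaluate(frame)
    result: Optional[str] = surface(frame) if decision == "PERMIT" else None
    return {"decision": decision, "result": result}  # inputs to the receipt

receipt = run(TGSignalFrame("deploy", {"approved": True}), lambda f: "deployed")
assert receipt == {"decision": "PERMIT", "result": "deployed"}
```

Note that the execution surface is only invoked inside the PERMIT branch; DENY and SILENCE both resolve to non-execution, matching the stack in the figure.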

[Figure: Execution governance stack, agent to execution surface. Flow: autonomous agent / tool caller → signal frame (TGSignalFrame: declared intent + state) → TrigGuard evaluation (deterministic policy: same inputs → same decision) → PERMIT / DENY / SILENCE → execution surface on PERMIT only → signed decision receipt (audit, verify).]
Figure: deterministic outcomes before irreversible effects. The pillar page uses a simpler path diagram; this is the full agent-oriented stack.

Irreversible execution surfaces

Classes of actions that need deterministic authorization include payments and treasury movements, production deployments, regulated data exports, and identity or permission changes. These are execution surfaces: once called, downstream systems assume commitment. Map your surfaces, then bind policy enforcement at the choke points.
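Binding enforcement at a choke point can look like a wrapper that irreversible calls must pass through. The decorator, the `policy` callable, and `export_dataset` are hypothetical names for the sketch.

```python
# Hypothetical choke-point binding: the irreversible call can only run
# behind an authorization check. All names here are illustrative.
from functools import wraps

def gated(policy):
    """Wrap an execution surface so a policy check runs before every call."""
    def wrap(fn):
        @wraps(fn)
        def inner(*args, **kwargs):
            if not policy(*args, **kwargs):
                # Deny before the surface is touched; downstream systems
                # never see the request at all.
                raise PermissionError(f"DENY: {fn.__name__}")
            return fn(*args, **kwargs)
        return inner
    return wrap

@gated(policy=lambda dataset, **_: dataset != "pii")
def export_dataset(dataset: str) -> str:
    return f"exported {dataset}"   # stand-in for the irreversible export

print(export_dataset("metrics"))   # exported metrics
# export_dataset("pii") raises PermissionError before the export starts
```

The design choice is that the surface itself is unreachable except through the gate, which is what "bind policy enforcement at the choke points" means in practice.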

Fail-closed control for autonomous systems

Autonomous systems run without a human in the loop for every step. The safe default is not to execute when evaluation is incomplete, ambiguous, or unsafe. That posture is fail-closed execution governance, not "fail until someone notices." See fail-closed AI systems for semantics and operating defaults.
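A minimal fail-closed sketch, under assumed names: any missing input or evaluator error resolves to a non-execution outcome, never to PERMIT.

```python
# Fail-closed sketch: incomplete input or an evaluator failure never
# resolves to PERMIT. Names and the toy policy are illustrative.
from typing import Optional

def evaluate_fail_closed(frame: Optional[dict]) -> str:
    try:
        if frame is None or "intent" not in frame:
            return "SILENCE"        # incomplete evaluation: do not execute
        if frame["intent"] == "rotate-credentials" and frame.get("ticket"):
            return "PERMIT"
        return "DENY"
    except Exception:
        return "DENY"               # an evaluator crash is never a PERMIT

assert evaluate_fail_closed(None) == "SILENCE"
assert evaluate_fail_closed({"intent": "rotate-credentials",
                             "ticket": "OPS-1"}) == "PERMIT"
```

The invariant worth testing in any real implementation is the same one shown here: every error path terminates in DENY or SILENCE.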

Verifiable decisions and receipts

Execution decisions must be tamper-evident: auditors, counterparties, and internal risk teams need to prove what was authorized, under which policy version, for which request hash. TrigGuard issues signed receipts consumable by Verify; the wire format and verification rules live in the protocol overview and receipt sections.
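Tamper evidence can be illustrated with an HMAC over the receipt body. This is a toy, not TrigGuard's wire format; the real format and verification rules live in the protocol overview and receipt sections, and the key handling here is deliberately simplistic.

```python
# Illustrative tamper-evident receipt using an HMAC over a canonical body.
# Not TrigGuard's wire format; the key and field names are assumptions.
import hashlib
import hmac
import json

KEY = b"demo-key"  # stand-in for a real signing key with proper management

def issue_receipt(decision: str, policy_version: str, request_hash: str) -> dict:
    body = {"decision": decision, "policy": policy_version,
            "request": request_hash}
    payload = json.dumps(body, sort_keys=True).encode()
    body["sig"] = hmac.new(KEY, payload, hashlib.sha256).hexdigest()
    return body

def verify_receipt(receipt: dict) -> bool:
    body = {k: v for k, v in receipt.items() if k != "sig"}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(receipt["sig"], expected)

r = issue_receipt("PERMIT", "v3", "abc123")
assert verify_receipt(r)
r["decision"] = "DENY"       # any tampering breaks verification
assert not verify_receipt(r)
```

The receipt binds exactly the three facts the paragraph names: what was authorized, under which policy version, for which request hash.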

Where TrigGuard sits in the agent stack

Conceptual stack: application and UX, agent orchestration (planners, tool routers), the execution governance layer (TrigGuard evaluation and receipts), then execution surfaces (payments, cloud control planes, data planes). TrigGuard is not another model; it is the control plane in front of irreversible APIs.

Category pillar

Return to the cluster hub: AI execution governance.