Enterprise Agent Trust Architecture
How Recursiv governs autonomous AI agents for regulated and enterprise environments.
Last updated: 2026-04-10
Overview
Recursiv provides agent orchestration infrastructure for enterprises deploying autonomous AI systems. This document describes the trust architecture that governs how agents access tools, act on data, and make decisions within the platform.
Our approach is informed by emerging enterprise trust frameworks, including zero trust principles applied to data quality and AI governance, as described in recent work by researchers at the SPARK AI Consortium at UC San Diego (Short, 2025; Massa & Short, 2025). We apply these principles specifically to the agent execution layer: every agent tool call is verified against policy, every action is audited, and the system defaults to denial when trust cannot be established.
Core Principles
1. Never Trust, Always Verify
No agent tool execution is trusted by default. Every tool call passes through a policy verification layer before execution. Tools that have not been explicitly classified and permitted are blocked. There is no implicit trust path.
This aligns with zero trust architectures where trust is not an assumed attribute but a continuously verified condition. In the context of agent orchestration, this means:
- Every tool is classified before an agent can access it
- Every execution is checked against the active policy at call time
- Unrecognized tools are denied (fail-closed)
- Policy changes take effect immediately for all subsequent tool calls
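A minimal sketch of call-time verification, assuming a hypothetical in-memory `PolicyStore` (the class and function names here are illustrative and not taken from the Recursiv codebase). The policy is looked up on every call rather than cached, so a change applies to the very next invocation, and a tool with no entry is denied.

```python
class PolicyStore:
    """Hypothetical in-memory policy table: tool name -> permission level."""

    def __init__(self, policies):
        self._policies = dict(policies)

    def set_policy(self, tool_name, level):
        self._policies[tool_name] = level

    def is_permitted(self, tool_name):
        # A missing entry yields None, which is not "auto", so unknown
        # tools are denied (fail-closed). For simplicity this sketch also
        # treats "approval" as not directly executable.
        return self._policies.get(tool_name) == "auto"


def call_tool(store, tool_name, fn, *args):
    """Check the active policy at call time, then execute the tool."""
    if not store.is_permitted(tool_name):
        raise PermissionError(f"tool {tool_name!r} is not permitted")
    return fn(*args)


store = PolicyStore({"search_docs": "auto"})
first_result = call_tool(store, "search_docs", str.upper, "ok")

# Revoking the policy affects the next call immediately.
store.set_policy("search_docs", "off")
```

After the `set_policy` call, any further `call_tool(store, "search_docs", ...)` raises `PermissionError`, as does a call to a tool that was never registered in the store.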
2. Classify, Then Govern
Every tool available to an agent is classified across multiple dimensions before it enters the system. These classifications drive policy enforcement automatically.
These classifications are defined in code, version-controlled, and auditable. They are not configurable by the agent.
3. Fail Closed
If a tool is not mapped to a known classification, it is blocked. The system does not default to permissive access. This is enforced at the tool registration layer: unclassified tools are filtered out before the agent model ever sees them.
This means an agent cannot discover and execute an arbitrary tool. The set of available tools is deterministic, policy-governed, and auditable.
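The registration-time filter might look roughly like the following sketch. The classification table, the `Tool` type, and `filter_classified` are assumptions made for illustration; only the behavior (unclassified tools never reach the model) comes from the description above.

```python
from dataclasses import dataclass

# Hypothetical classification table: tool name -> bundle (category).
TOOL_CLASSIFICATIONS = {
    "query_database": "database",
    "send_email_to_human": "email",
}


@dataclass(frozen=True)
class Tool:
    name: str


def filter_classified(tools):
    """Keep only tools with a known classification (fail closed)."""
    return [t for t in tools if t.name in TOOL_CLASSIFICATIONS]


registered = [Tool("query_database"), Tool("mystery_tool")]
visible = filter_classified(registered)
visible_names = [t.name for t in visible]
```

Here `mystery_tool` has no classification, so it is dropped before the toolset is handed to the model.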
4. Audit Everything
Every agent action produces audit records across multiple dimensions:
Tool Execution Log: Every tool invocation records the agent, tool name, success/failure status, error details, conversation context, and timestamp.
Prompt Guard Log: Every user message is scanned for prompt injection. Results are logged with threat level (clean/suspicious/hostile), detected patterns, and action taken (passed/sanitized/blocked). To preserve privacy, raw message content is never logged.
Agent Configuration Change Log: Every change to an agent’s configuration (model, system prompt, tool mode, guardrails, budget) is recorded with the old value, new value, who made the change, and why.
AI Usage Tracking: Every LLM inference call records provider, model, token counts, and cost for billing reconciliation and usage analysis.
Task Activity Log: For orchestrated multi-step workflows, every task state transition (created, claimed, completed, audited) is logged with the acting agent and detail.
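As an illustration of the Tool Execution Log described above, the sketch below wraps a tool call so that a record is written whether the call succeeds or fails. The record fields mirror the ones listed in the text; the type name, field names, and the list used as a log sink are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ToolExecutionRecord:
    agent_id: str
    tool_name: str
    success: bool
    error: Optional[str]
    conversation_id: str
    timestamp: datetime


def execute_with_audit(log, agent_id, conversation_id, tool_name, fn, *args):
    """Run fn(*args) and append an audit record for success or failure."""
    error = None
    try:
        return fn(*args)
    except Exception as exc:
        error = f"{type(exc).__name__}: {exc}"
        raise  # the caller still sees the original failure
    finally:
        # Runs on both paths, so no execution goes unrecorded.
        log.append(ToolExecutionRecord(
            agent_id=agent_id,
            tool_name=tool_name,
            success=error is None,
            error=error,
            conversation_id=conversation_id,
            timestamp=datetime.now(timezone.utc),
        ))
```

Writing the record in a `finally` clause is one way to guarantee that failed executions are audited as reliably as successful ones.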
Tool Policy Enforcement
Three-Level Permission Model
Every tool operates under one of three permission levels: auto, approval, or off.
Policy Resolution
Policies are resolved at three levels of specificity:
- Per-tool overrides (most specific): A specific tool can be set to any permission level
- Per-bundle overrides: A category of tools (e.g., all database tools) can be set
- Default bundle policy (least specific): The built-in default for each tool category
The most specific applicable policy wins. This allows organizations to customize agent capabilities while maintaining safe defaults.
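The resolution order can be sketched as a lookup that walks from most to least specific. The function name, the dictionary-based policy tables, and the final `"off"` fallback are assumptions; the fallback is chosen here to stay consistent with the fail-closed principle.

```python
def resolve_policy(tool_name, bundle, tool_overrides, bundle_overrides, bundle_defaults):
    """Return the most specific applicable permission level."""
    lookups = (
        (tool_overrides, tool_name),   # per-tool override (most specific)
        (bundle_overrides, bundle),    # per-bundle override
        (bundle_defaults, bundle),     # built-in bundle default
    )
    for table, key in lookups:
        level = table.get(key)
        if level is not None:
            return level
    # Assumption: nothing matched, so deny rather than permit.
    return "off"


tool_overrides = {"delete_rows": "off"}
bundle_overrides = {"database": "approval"}
bundle_defaults = {"database": "auto", "search": "auto"}

delete_level = resolve_policy("delete_rows", "database", tool_overrides, bundle_overrides, bundle_defaults)
select_level = resolve_policy("select_rows", "database", tool_overrides, bundle_overrides, bundle_defaults)
search_level = resolve_policy("web_search", "search", tool_overrides, bundle_overrides, bundle_defaults)
```

In this example `delete_rows` resolves to `off` through its per-tool override, `select_rows` resolves to `approval` through the database bundle override, and `web_search` falls through to the bundle default of `auto`.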
Enforcement Architecture
Policy enforcement is implemented as a wrapper around the tool definition itself. Before the agent model receives its available tools, each tool passes through wrapToolWithProjectPolicy:
- If the policy is off, the tool is removed from the toolset. The agent never sees it.
- If the policy is approval, the tool’s execute function is replaced with an approval gate that queues the request and notifies the user.
- If the policy is auto, the tool passes through unchanged.
This architecture ensures that policy enforcement cannot be bypassed by the agent. The policy layer sits between the tool registry and the model, not between the model and the execution.
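The wrapper pattern can be illustrated with the following sketch. It is a Python analogue written for this document, not the actual `wrapToolWithProjectPolicy` implementation: the `Tool` type, the `request_approval` callback, and the shape of the pending response are all assumptions.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable


@dataclass(frozen=True)
class Tool:
    name: str
    execute: Callable[..., Any]


def wrap_tool_with_project_policy(tool, policy, request_approval):
    """Return the tool as the model should see it, or None to hide it."""
    if policy == "auto":
        return tool  # passes through unchanged
    if policy == "approval":
        def gated(**params):
            # Queue the request instead of running the real tool.
            request_id = request_approval(tool.name, params)
            return {"status": "pending_approval", "request_id": request_id}
        return replace(tool, execute=gated)
    # "off" and any unrecognized policy value: remove the tool.
    return None


def build_toolset(tools, policies, request_approval):
    """Wrap every tool; tools without a policy default to "off"."""
    wrapped = (
        wrap_tool_with_project_policy(t, policies.get(t.name, "off"), request_approval)
        for t in tools
    )
    return [t for t in wrapped if t is not None]
```

Because the original `execute` function is swapped out before the toolset is built, the model only ever holds a reference to the gated version and has no handle on the ungated one.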
Hardened Tool Categories
Certain tool categories enforce approval regardless of the agent’s tool mode setting:
- Email sending (send_email_to_human): Always requires human approval. The agent cannot send email autonomously under any configuration.
- Code execution: Gated by sandbox provisioning checks and budget tier verification.
- Database mutations: Default to approval-required for write operations.
- Deployments: Require explicit approval before triggering production deployments.
Human-in-the-Loop Controls
Approval Gate System
When a tool requires approval, the system:
- Creates a pending execution record with the tool name, parameters, agent identity, and conversation context
- Sends a real-time WebSocket notification to the supervising user
- Returns a structured response to the agent instructing it to wait
- Blocks the agent from proceeding until the human approves or denies the request
Approval records include:
- Argument hash (SHA-256) for deduplication without exposing sensitive parameters
- Organization scope ensuring approvals are isolated between tenants
- Expiry enforcement (pending requests expire after 10-60 minutes if not acted on)
- Denial reason capture when a request is rejected
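A sketch of how such an approval record could be built is shown below. The SHA-256 argument hash, organization scope, expiry, and denial reason come from the list above; the canonical JSON encoding, the field names, the helper functions, and the 10-minute default are assumptions for illustration.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


def hash_arguments(params):
    """SHA-256 over canonical JSON, so key order does not change the hash."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class PendingApproval:
    org_id: str
    tool_name: str
    argument_hash: str
    expires_at: datetime
    status: str = "pending"
    denial_reason: Optional[str] = None


def create_pending(org_id, tool_name, params, now, ttl_minutes=10):
    expires_at = now + timedelta(minutes=ttl_minutes)
    return PendingApproval(org_id, tool_name, hash_arguments(params), expires_at)


def is_duplicate(record, org_id, tool_name, params):
    """Match on org, tool, and argument hash; raw parameters are not stored."""
    return (record.org_id == org_id
            and record.tool_name == tool_name
            and record.argument_hash == hash_arguments(params))


def is_expired(record, now):
    return record.status == "pending" and now >= record.expires_at


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
record = create_pending("org-1", "send_email_to_human",
                        {"to": "a@example.com", "subject": "hi"}, now)
```

Including the organization in the duplicate check means an identical request from another tenant is never matched against this record.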
Agent Tool Modes
Each agent operates in one of three modes that interact with the tool policy system.
Isolation and Containment
Multi-Tenant Isolation
All agent operations are scoped to a network and, optionally, an organization. The following controls prevent data from leaking between tenants:
- API keys are scoped to organizations
- Agents are bound to owners and organizations
- Database queries are filtered by network/organization context
- Approval decisions are org-scoped
Sandbox Isolation
Code execution runs in VM-level sandboxes (not containers) with:
- 1-hour maximum lifetime
- Per-user and per-org concurrency caps
- No access to platform credentials, database connections, or API keys
- Each agent gets its own sandbox keyed by project and agent ID
Rate Limiting and Budget Controls
Agent activity is bounded by multiple overlapping controls:
- Per-agent daily request limits (configurable, default 100)
- Per-user, per-org, and per-key API rate limits
- Budget-based throttling that degrades agent capabilities as spending approaches limits
- Per-request cost caps ($10 max) as runaway protection
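The way these controls overlap can be sketched as two independent checks. The default daily limit of 100 and the $10 per-request cap come from the list above; the function names, the tier labels, and the 80% and 100% degradation thresholds are hypothetical.

```python
DAILY_REQUEST_LIMIT = 100      # default per-agent daily limit from the text
MAX_REQUEST_COST_USD = 10.00   # per-request cost cap from the text

# Hypothetical degradation thresholds, as fractions of the budget spent.
DEGRADE_AT = 0.80
SUSPEND_AT = 1.00


def admit_request(requests_today, estimated_cost_usd, daily_limit=DAILY_REQUEST_LIMIT):
    """Return (allowed, reason) for a single agent request."""
    if requests_today >= daily_limit:
        return False, "daily request limit reached"
    if estimated_cost_usd > MAX_REQUEST_COST_USD:
        return False, "per-request cost cap exceeded"
    return True, "ok"


def capability_tier(spent_usd, budget_usd):
    """Map spend to a capability tier; a non-positive budget suspends."""
    if budget_usd <= 0:
        return "suspended"
    used = spent_usd / budget_usd
    if used >= SUSPEND_AT:
        return "suspended"
    if used >= DEGRADE_AT:
        return "degraded"
    return "full"
```

A request must pass `admit_request` regardless of the agent's current tier, so a single expensive call is rejected even when the overall budget is barely used.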
Prompt Injection Defense
Every user message processed by an agent is scanned by the Prompt Guard service before the agent model receives it. The system:
- Detects known injection patterns using pattern matching
- Classifies threats as clean, suspicious, or hostile
- Takes action: pass, sanitize, or block
- Logs the result without storing raw message content (privacy preservation)
This provides a defense layer between untrusted user input and agent tool execution.
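A minimal sketch of pattern-based scanning follows. The two regular expressions are placeholders, and the mapping from threat level to action (clean passes, suspicious is sanitized, hostile is blocked) is an assumption; the text above lists the levels and actions but does not state how they pair up. Sanitization itself is not implemented here.

```python
import re

# Hypothetical patterns for illustration; a real deployment would maintain
# a much larger, curated set.
HOSTILE_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
]
SUSPICIOUS_PATTERNS = [
    re.compile(r"system prompt", re.IGNORECASE),
]


def scan_message(text):
    """Classify a message and return a log-safe result (no raw content)."""
    hostile = [p.pattern for p in HOSTILE_PATTERNS if p.search(text)]
    if hostile:
        return {"threat_level": "hostile", "patterns": hostile, "action": "blocked"}
    suspicious = [p.pattern for p in SUSPICIOUS_PATTERNS if p.search(text)]
    if suspicious:
        return {"threat_level": "suspicious", "patterns": suspicious, "action": "sanitized"}
    return {"threat_level": "clean", "patterns": [], "action": "passed"}
```

The returned dictionary records which patterns matched, not the message that matched them, so it can be written to the Prompt Guard Log without storing user content.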
Alignment with Enterprise Trust Frameworks
Zero Trust Principles
Our agent trust architecture applies zero trust principles as described in NIST SP 800-207 and as extended to data quality contexts by researchers at the SPARK AI Consortium (Massa & Short, “Meeting Agentic AI’s Data Quality Needs with Zero Trust Data Quality,” 2025).
Governance Automation
Consistent with research on automating data quality governance decisions (Massa & Short, “Scientifically Automating Data Quality Decisions with AI Explainability Weights,” 2025), our platform automates governance decisions that would otherwise require manual intervention:
- Tool permission decisions are automated based on pre-defined policy classifications
- Budget-based capability degradation happens automatically without human intervention
- Rate limit enforcement is continuous and atomic
- Approval expiry is automatic (no stale pending requests)
Areas of Active Development
We are actively developing enhancements informed by enterprise trust research:
- Threshold-based policies: Extending the current three-level permission model (auto/approval/off) to support confidence-weighted decisions where agent autonomy scales with input data quality
- Data freshness verification: Systematic validation of input data recency before agent tool execution
- Decision justification capture: Structured reasoning attached to both automated and human-approved decisions for regulatory audit support
Compliance Posture
Recursiv maintains a comprehensive compliance program aligned with SOC 2 Type II Trust Services Criteria:
- 15 formal security and compliance policies mapped to SOC 2 controls (CC1-CC9, A1, C1, P4-P6)
- Quarterly automated evidence collection
- Formal risk register with 10 tracked risks, mitigations, and residual risk assessments
- Semi-annual policy review cycle with automated deadline enforcement
- Internal security audits with prioritized remediation tracking
For detailed compliance documentation, see our Security Policies and Operations Documentation.
References
- NIST Special Publication 800-207, “Zero Trust Architecture” (2020)
- Massa, J. & Short, J.E., “Meeting Agentic AI’s Data Quality Needs with Zero Trust Data Quality,” SPARK AI Consortium Executive Briefing Vol. 1 No. 2 (2025)
- Massa, J. & Short, J.E., “Scientifically Automating Data Quality Decisions with AI Explainability Weights,” SPARK AI Consortium Executive Briefing Vol. 1 No. 4 (2025)
- Short, J., “Is AI Governable? Industry Perspectives on the Adoption, Effectiveness and Accountability of Frontier AI,” SPARK AI Working Paper (2025)
Contact
For enterprise security inquiries: security@recursiv.io
For compliance documentation requests: compliance@recursiv.io