Enterprise Agent Trust Architecture

How Recursiv governs autonomous AI agents for regulated and enterprise environments.

Last updated: 2026-04-10


Overview

Recursiv provides agent orchestration infrastructure for enterprises deploying autonomous AI systems. This document describes the trust architecture that governs how agents access tools, act on data, and make decisions within the platform.

Our approach is informed by emerging enterprise trust frameworks, including zero trust principles applied to data quality and AI governance, as described in recent work by researchers at the SPARK AI Consortium at UC San Diego (Short, 2025; Massa & Short, 2025). We apply these principles specifically to the agent execution layer: every agent tool call is verified against policy, every action is audited, and the system defaults to denial when trust cannot be established.


Core Principles

1. Never Trust, Always Verify

No agent tool execution is trusted by default. Every tool call passes through a policy verification layer before execution. Tools that have not been explicitly classified and permitted are blocked. There is no implicit trust path.

This aligns with zero trust architectures where trust is not an assumed attribute but a continuously verified condition. In the context of agent orchestration, this means:

  • Every tool is classified before an agent can access it
  • Every execution is checked against the active policy at call time
  • Unrecognized tools are denied (fail-closed)
  • Policy changes take effect immediately for all subsequent tool calls
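
The properties above can be sketched in a few lines. This is a minimal illustration, not Recursiv's implementation: `PolicyStore` and `execute_tool` are hypothetical names, and the policy is reduced to a single permitted/denied flag per tool.

```python
from typing import Callable, Dict


class PolicyStore:
    """Hypothetical in-memory policy store mapping tool name -> permitted flag."""

    def __init__(self) -> None:
        self._permitted: Dict[str, bool] = {}

    def set(self, tool: str, permitted: bool) -> None:
        self._permitted[tool] = permitted

    def is_permitted(self, tool: str) -> bool:
        # Tools that were never classified are absent from the map,
        # so they are denied by default (fail-closed).
        return self._permitted.get(tool, False)


def execute_tool(store: PolicyStore, tool: str, fn: Callable[[], str]) -> str:
    # The policy is read at call time, so a policy change applies
    # to the very next call without restarting the agent.
    if not store.is_permitted(tool):
        raise PermissionError(f"tool {tool!r} denied by policy")
    return fn()
```

Because the check runs inside `execute_tool` rather than once at agent creation, revoking a permission with `store.set(name, False)` blocks the next call to that tool.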

2. Classify, Then Govern

Every tool available to an agent is classified across multiple dimensions before it enters the system. These classifications drive policy enforcement automatically.

Dimension | Values | Purpose
Trust Tier | native, self_hosted, third_party_processor, public_protocol | Where the tool executes and who controls it
Data Sensitivity | public, internal, confidential, regulated | What kind of data the tool accesses or produces
Export Boundary | none, org_internal, third_party_processor, public_network | Whether data leaves the organization
Consent Requirement | none, org_admin, end_user, case_by_case | Who must approve before the tool can be used
Retention Class | ephemeral, operational, audit, user_content | How long execution artifacts are retained
Scope | project, organization, network, platform | Blast radius of the tool’s effects
Audience | customer_safe, approval_required, internal_only | Who should have access to this tool

These classifications are defined in code, version-controlled, and auditable. They are not configurable by the agent.
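
One way to express such a classification in code is an immutable record validated against the allowed values for each dimension. The sketch below is an assumption about shape only: the `ToolClassification` class and `ALLOWED` table are hypothetical names, while the dimension values are the ones listed above.

```python
from dataclasses import dataclass

# Allowed values per dimension, as listed in the classification table.
ALLOWED = {
    "trust_tier": {"native", "self_hosted", "third_party_processor", "public_protocol"},
    "data_sensitivity": {"public", "internal", "confidential", "regulated"},
    "export_boundary": {"none", "org_internal", "third_party_processor", "public_network"},
    "consent_requirement": {"none", "org_admin", "end_user", "case_by_case"},
    "retention_class": {"ephemeral", "operational", "audit", "user_content"},
    "scope": {"project", "organization", "network", "platform"},
    "audience": {"customer_safe", "approval_required", "internal_only"},
}


@dataclass(frozen=True)  # frozen: a classification cannot be mutated at runtime
class ToolClassification:
    trust_tier: str
    data_sensitivity: str
    export_boundary: str
    consent_requirement: str
    retention_class: str
    scope: str
    audience: str

    def __post_init__(self) -> None:
        # Reject any value outside the declared vocabulary.
        for field_name, allowed in ALLOWED.items():
            value = getattr(self, field_name)
            if value not in allowed:
                raise ValueError(f"{field_name}={value!r} is not a known value")
```

Freezing the record mirrors the statement that classifications are not configurable by the agent: any attempt to assign to a field raises an error.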

3. Fail Closed

If a tool is not mapped to a known classification, it is blocked. The system does not default to permissive access. This is enforced at the tool registration layer: unclassified tools are filtered out before the agent model ever sees them.

This means an agent cannot discover and execute an arbitrary tool. The set of available tools is deterministic, policy-governed, and auditable.
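
A minimal sketch of registration-time filtering, assuming classifications are held in a mapping keyed by tool name (the function name `visible_tools` and the data shapes are hypothetical):

```python
from typing import Dict, List


def visible_tools(registered: List[str], classifications: Dict[str, dict]) -> List[str]:
    """Return only the tools that have a known classification.

    Unclassified tools are dropped before the model sees the toolset.
    Sorting makes the resulting toolset deterministic.
    """
    return sorted(name for name in registered if name in classifications)
```

For example, registering `web_search`, `db_query`, and an unclassified `mystery_tool` yields a toolset containing only the first two.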

4. Audit Everything

Every agent action produces audit records across multiple dimensions:

Tool Execution Log: Every tool invocation records the agent, tool name, success/failure status, error details, conversation context, and timestamp.

Prompt Guard Log: Every user message is scanned for prompt injection. Results are logged with threat level (clean/suspicious/hostile), detected patterns, and action taken (passed/sanitized/blocked). To preserve privacy, raw message content is never logged.

Agent Configuration Change Log: Every change to an agent’s configuration (model, system prompt, tool mode, guardrails, budget) is recorded with the old value, new value, who made the change, and why.

AI Usage Tracking: Every LLM inference call records provider, model, token counts, and cost for billing reconciliation and usage analysis.

Task Activity Log: For orchestrated multi-step workflows, every task state transition (created, claimed, completed, audited) is logged with the acting agent and detail.
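
As an illustration of the first record type, the sketch below builds a tool execution log entry with the fields named above. The field names, the JSON serialization, and the `record_execution` helper are assumptions made for the example, not the platform's schema.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ToolExecutionRecord:
    agent_id: str
    tool_name: str
    success: bool
    error: Optional[str]      # error details, or None on success
    conversation_id: str      # conversation context (hypothetical identifier)
    timestamp: str            # ISO 8601, UTC


def record_execution(agent_id: str, tool_name: str, conversation_id: str,
                     error: Optional[str] = None) -> str:
    """Serialize one tool invocation as a JSON log line."""
    record = ToolExecutionRecord(
        agent_id=agent_id,
        tool_name=tool_name,
        success=error is None,
        error=error,
        conversation_id=conversation_id,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
    return json.dumps(asdict(record), sort_keys=True)
```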


Tool Policy Enforcement

Three-Level Permission Model

Every tool operates under one of three permission levels:

Permission | Behavior | Use Case
auto | Tool executes immediately when called by the agent | Low-risk read operations, web search, memory recall
approval | Tool execution is queued for human review and approval | Data mutations, external communications, deployments
off | Tool is removed from the agent’s available toolset entirely | Disabled capabilities, restricted environments

Policy Resolution

Policies are resolved at three levels of specificity:

  1. Per-tool overrides (most specific): A specific tool can be set to any permission level
  2. Per-bundle overrides: A category of tools (e.g., all database tools) can be set to a permission level as a group
  3. Default bundle policy (least specific): The built-in default for each tool category

The most specific applicable policy wins. This allows organizations to customize agent capabilities while maintaining safe defaults.
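
The resolution order can be sketched as a lookup that walks from most to least specific. The function name, argument names, and the choice to return `off` when nothing matches are assumptions for illustration; the last one follows the fail-closed principle described earlier.

```python
from typing import Dict

VALID_LEVELS = {"auto", "approval", "off"}


def resolve_policy(tool: str, bundle: str,
                   tool_overrides: Dict[str, str],
                   bundle_overrides: Dict[str, str],
                   bundle_defaults: Dict[str, str]) -> str:
    """Return the permission level for a tool; the most specific match wins."""
    lookups = (
        (tool_overrides, tool),      # 1. per-tool override
        (bundle_overrides, bundle),  # 2. per-bundle override
        (bundle_defaults, bundle),   # 3. default bundle policy
    )
    for source, key in lookups:
        level = source.get(key)
        if level is not None:
            # Unknown levels are treated as "off" rather than trusted.
            return level if level in VALID_LEVELS else "off"
    return "off"  # no policy found at any level: fail closed
```

With a default of `approval` for a database bundle, a bundle override of `off`, and a per-tool override of `auto` for one read-only tool, that tool resolves to `auto` while every other tool in the bundle resolves to `off`.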

Enforcement Architecture

Policy enforcement is implemented as a wrapper around the tool definition itself. Before the agent model receives its available tools, each tool passes through wrapToolWithProjectPolicy:

  • If the policy is off, the tool is removed from the toolset. The agent never sees it.
  • If the policy is approval, the tool’s execute function is replaced with an approval gate that queues the request and notifies the user.
  • If the policy is auto, the tool passes through unchanged.

This architecture ensures that policy enforcement cannot be bypassed by the agent. The policy layer sits between the tool registry and the model, not between the model and execution, so the model only ever receives tools that have already been filtered or wrapped.
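
The wrapper pattern can be sketched as follows. This is a Python analogue written for illustration: the `Tool` type, the `pending` queue, and the function signature are assumptions, and only the three behaviors (remove, gate, pass through) come from the description above.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable, Dict, List, Optional


@dataclass(frozen=True)
class Tool:
    name: str
    execute: Callable[[Dict[str, Any]], Any]


def wrap_tool_with_project_policy(tool: Tool, policy: str,
                                  pending: List[Dict[str, Any]]) -> Optional[Tool]:
    """Apply a permission level to a tool before the model receives it."""
    if policy == "auto":
        return tool  # passes through unchanged
    if policy == "approval":
        def approval_gate(args: Dict[str, Any]) -> Dict[str, Any]:
            # Queue the request instead of running the real tool.
            pending.append({"tool": tool.name, "args": args})
            return {"status": "pending_approval", "tool": tool.name}
        return replace(tool, execute=approval_gate)
    # "off" (or any unrecognized policy value): the tool is removed.
    return None
```

Since the original `execute` function is replaced before the toolset is handed to the model, the model has no reference to the ungated function.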

Hardened Tool Categories

Certain tool categories carry hardened controls that apply regardless of the agent’s tool mode setting:

  • Email sending (send_email_to_human): Always requires human approval. The agent cannot send email autonomously under any configuration.
  • Code execution: Gated by sandbox provisioning checks and budget tier verification.
  • Database mutations: Default to approval-required for write operations.
  • Deployments: Require explicit approval before triggering production deployments.

Human-in-the-Loop Controls

Approval Gate System

When a tool requires approval, the system:

  1. Creates a pending execution record with the tool name, parameters, agent identity, and conversation context
  2. Sends a real-time WebSocket notification to the supervising user
  3. Returns a structured response to the agent instructing it to wait
  4. Blocks the agent from proceeding until the human approves or denies the request

Approval records include:

  • Argument hash (SHA-256) for deduplication without exposing sensitive parameters
  • Organization scope ensuring approvals are isolated between tenants
  • Expiry enforcement (pending requests expire if not acted on within a window of 10 to 60 minutes)
  • Denial reason capture when a request is rejected
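
The argument hash and expiry check can be sketched with the standard library. Serializing the arguments as canonical JSON before hashing is an assumption made here so that key order does not change the digest; the function names are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone


def argument_hash(args: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the tool arguments."""
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_expired(created_at: datetime, now: datetime, ttl_minutes: int) -> bool:
    """True once a pending request has outlived its approval window."""
    if not 10 <= ttl_minutes <= 60:
        raise ValueError("ttl_minutes must be between 10 and 60")
    return now - created_at >= timedelta(minutes=ttl_minutes)
```

Two requests with the same arguments produce the same digest, which allows deduplication by comparing hashes alone, without storing the parameters in the approval record.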

Agent Tool Modes

Each agent operates in one of three modes that interact with the tool policy system:

Mode | Behavior
chat_only | Agent can only converse. All tools are blocked. Tool descriptions are provided so the agent can explain what it would do, but execution is prevented.
permission | All write tools require human approval. Read tools execute immediately.
autonomous | Tools execute according to their bundle policy. Hardened categories (email, deployment) still require approval.
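
The interaction between agent mode and bundle policy can be sketched as a single decision function. The function name, the `blocked` return value, and the rule that a bundle policy of `approval` is never relaxed in permission mode are assumptions for illustration; the per-mode behavior follows the table above.

```python
def effective_permission(mode: str, bundle_policy: str,
                         is_write: bool, hardened: bool = False) -> str:
    """Combine agent mode and bundle policy into one outcome.

    Returns "auto", "approval", or "blocked".
    """
    if mode not in {"chat_only", "permission", "autonomous"}:
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "chat_only" or bundle_policy == "off":
        return "blocked"
    if hardened:
        # Hardened categories require approval in every executing mode.
        return "approval"
    if mode == "permission":
        # Writes always need approval; reads keep the stricter of the two
        # settings (assumption: mode never loosens the bundle policy).
        if is_write or bundle_policy == "approval":
            return "approval"
        return "auto"
    return bundle_policy  # autonomous: defer to the bundle policy
```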

Isolation and Containment

Multi-Tenant Isolation

All agent operations are scoped to a network and, optionally, an organization. The following controls keep data from crossing tenant boundaries:

  • API keys are scoped to organizations
  • Agents are bound to owners and organizations
  • Database queries are filtered by network/organization context
  • Approval decisions are org-scoped

Sandbox Isolation

Code execution runs in VM-level sandboxes (not containers) with:

  • 1-hour maximum lifetime
  • Per-user and per-org concurrency caps
  • No access to platform credentials, database connections, or API keys
  • Each agent gets its own sandbox keyed by project and agent ID

Rate Limiting and Budget Controls

Agent activity is bounded by multiple overlapping controls:

  • Per-agent daily request limits (configurable, default 100)
  • Per-user, per-org, and per-key API rate limits
  • Budget-based throttling that degrades agent capabilities as spending approaches limits
  • Per-request cost caps ($10 max) as runaway protection
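
Two of these controls, the per-agent daily request limit and the per-request cost cap, can be sketched together. The `AgentBudget` class and its `admit` method are hypothetical; the default of 100 requests and the $10 cap are the figures stated above. Daily reset and budget-based degradation are omitted to keep the example short.

```python
from dataclasses import dataclass
from decimal import Decimal

MAX_REQUEST_COST = Decimal("10")  # per-request cap in dollars


@dataclass
class AgentBudget:
    daily_limit: int = 100  # configurable per-agent daily request limit
    used_today: int = 0

    def admit(self, estimated_cost: Decimal) -> bool:
        """Return True and count the request only if both limits allow it."""
        if estimated_cost > MAX_REQUEST_COST:
            return False  # runaway protection: single request too expensive
        if self.used_today >= self.daily_limit:
            return False  # daily request quota exhausted
        self.used_today += 1
        return True
```

Rejected requests do not consume quota in this sketch, so a single oversized request cannot exhaust an agent's daily allowance.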

Prompt Injection Defense

Every user message processed by an agent is scanned by the Prompt Guard service before the agent model receives it. The system:

  • Detects known injection patterns using pattern matching
  • Classifies threats as clean, suspicious, or hostile
  • Takes action: pass, sanitize, or block
  • Logs the result without storing raw message content (privacy preservation)

This provides a defense layer between untrusted user input and agent tool execution.
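
A pattern-matching scanner of this kind can be sketched as follows. The two regular expressions and the mapping from match count to threat level are illustrative assumptions, not the Prompt Guard rule set; only the three threat levels and three actions come from the description above.

```python
import re
from typing import Dict

# Illustrative patterns only; a production rule set would be far larger.
PATTERNS = {
    "ignore_instructions": re.compile(
        r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    "reveal_system_prompt": re.compile(
        r"(reveal|print|show) (your )?system prompt", re.IGNORECASE),
}


def scan(message: str) -> Dict[str, object]:
    """Classify a message and return a log entry without the raw content."""
    detected = sorted(name for name, pattern in PATTERNS.items()
                      if pattern.search(message))
    if not detected:
        level, action = "clean", "passed"
    elif len(detected) == 1:
        level, action = "suspicious", "sanitized"
    else:
        level, action = "hostile", "blocked"
    # Only pattern names are recorded, never the message text itself.
    return {"threat_level": level, "patterns": detected, "action": action}
```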


Alignment with Enterprise Trust Frameworks

Zero Trust Principles

Our agent trust architecture applies zero trust principles as described in NIST SP 800-207 and as extended to data quality contexts by researchers at the SPARK AI Consortium (Massa & Short, “Meeting Agentic AI’s Data Quality Needs with Zero Trust Data Quality,” 2025):

Zero Trust Principle | Recursiv Implementation
Never trust, always verify | Every tool call verified against policy at execution time
Assume breach | Prompt injection scanning on every message; fail-closed tool defaults
Least privilege access | Tools scoped by bundle; agents see only permitted tools
Continuous verification | Policy checked at every tool call, not just at agent creation
Explicit authorization | Approval gates require affirmative human action for sensitive operations

Governance Automation

Consistent with research on automating data quality governance decisions (Massa & Short, “Scientifically Automating Data Quality Decisions with AI Explainability Weights,” 2025), our platform automates governance decisions that would otherwise require manual intervention:

  • Tool permission decisions are automated based on pre-defined policy classifications
  • Budget-based capability degradation happens automatically without human intervention
  • Rate limit enforcement is continuous and atomic
  • Approval expiry is automatic (no stale pending requests)

Areas of Active Development

We are actively developing enhancements informed by enterprise trust research:

  • Threshold-based policies: Extending the current discrete three-level permission model (auto/approval/off) to support confidence-weighted decisions where agent autonomy scales with input data quality
  • Data freshness verification: Systematic validation of input data recency before agent tool execution
  • Decision justification capture: Structured reasoning attached to both automated and human-approved decisions for regulatory audit support

Compliance Posture

Recursiv maintains a comprehensive compliance program aligned with SOC 2 Type II Trust Services Criteria:

  • 15 formal security and compliance policies mapped to SOC 2 controls (CC1-CC9, A1, C1, P4-P6)
  • Quarterly automated evidence collection
  • Formal risk register with 10 tracked risks, mitigations, and residual risk assessments
  • Semi-annual policy review cycle with automated deadline enforcement
  • Internal security audits with prioritized remediation tracking

For detailed compliance documentation, see our Security Policies and Operations Documentation.


References

  • NIST Special Publication 800-207, “Zero Trust Architecture” (2020)
  • Massa, J. & Short, J.E., “Meeting Agentic AI’s Data Quality Needs with Zero Trust Data Quality,” SPARK AI Consortium Executive Briefing Vol. 1 No. 2 (2025)
  • Massa, J. & Short, J.E., “Scientifically Automating Data Quality Decisions with AI Explainability Weights,” SPARK AI Consortium Executive Briefing Vol. 1 No. 4 (2025)
  • Short, J., “Is AI Governable? Industry Perspectives on the Adoption, Effectiveness and Accountability of Frontier AI,” SPARK AI Working Paper (2025)

Contact

For enterprise security inquiries: security@recursiv.io
For compliance documentation requests: compliance@recursiv.io