Guardrails & Approvals

Per-tool policy and a human approval queue for sensitive actions

Overview

An agent is only as safe as the tools it can call. On Recursiv, every tool sits under a policy that decides whether the agent runs it freely, has to wait for a human, or cannot see it at all. When a tool requires approval, the call is queued, a human reviews it, and the agent is blocked until someone approves or rejects it. This is the guardrail layer. The agent cannot bypass it, because policy is enforced between the tool registry and the model, not inside the model.

This page covers the three permission levels, how policy is resolved, the agent tool modes that interact with it, and the human approval queue.

Three permission levels

Every tool operates under one of three levels:

| Permission | Behavior | Typical use |
| --- | --- | --- |
| auto | Runs immediately when the agent calls it. | Low-risk reads: web search, memory recall. |
| approval | Queued for human review. The agent waits. | Mutations, external comms, deployments. |
| off | Removed from the toolset. The agent never sees it. | Disabled or restricted capabilities. |

When a tool is set to off, the agent model is never given the tool at all, so it cannot call it or even reference it as available. When set to approval, the tool’s execute step is replaced by an approval gate that queues the request and notifies the supervising human.
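
The filtering and gating described above can be pictured with a small model. This is a minimal sketch, not Recursiv's implementation: the `Tool` and `PendingRequest` shapes, the `buildToolset` helper, the tool names, and the fallback to approval for unlisted tools are all assumptions made for illustration.

```typescript
// Hypothetical types for illustration; not Recursiv's real internals.
type Permission = "auto" | "approval" | "off";

interface Tool {
  name: string;
  execute: (args: unknown) => string;
}

interface PendingRequest {
  tool: string;
  args: unknown;
}

// Build the toolset handed to the model:
// - "off" tools are dropped, so the model never sees them
// - "approval" tools keep their name, but execute is replaced by a gate
//   that queues the request instead of running it
// - "auto" tools pass through unchanged
function buildToolset(
  tools: Tool[],
  policy: Partial<Record<string, Permission>>,
  queue: PendingRequest[],
): Tool[] {
  const visible: Tool[] = [];
  for (const tool of tools) {
    // Assumed fallback: a tool with no policy entry is gated.
    const level = policy[tool.name] ?? "approval";
    if (level === "off") continue;
    if (level === "auto") {
      visible.push(tool);
      continue;
    }
    visible.push({
      name: tool.name,
      execute: (args) => {
        queue.push({ tool: tool.name, args });
        return "pending_approval";
      },
    });
  }
  return visible;
}

const queue: PendingRequest[] = [];
const toolset = buildToolset(
  [
    { name: "web_search", execute: () => "results" },
    { name: "run_sql", execute: () => "rows" },
    { name: "send_outreach", execute: () => "sent" },
  ],
  { web_search: "auto", run_sql: "approval", send_outreach: "off" },
  queue,
);
```

In this sketch `toolset` contains only `web_search` and `run_sql`, and calling `run_sql` adds an entry to `queue` rather than touching a database.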

Setting policy per tool

Policy is configured on a project through r.projectBrain settings. Tools are grouped into bundles (browser, database, github, storage, communication, outreach, sandbox, and more) and each bundle can be set independently. The settings response also returns the effective permission for each bundle after defaults and overrides are merged.

    // Inspect current and effective tool policy
    const { data: settings } = await r.projectBrain.getSettings('proj_123');
    console.log(settings.effectiveToolPermissions);
    // e.g. { database: 'approval', browser: 'auto', communication: 'approval', ... }

    // Require approval for database tools, turn outreach off entirely
    await r.projectBrain.updateSettings('proj_123', {
      tool_permissions: {
        database: 'approval',
        outreach: 'off',
      },
    });

How policy is resolved

The most specific applicable rule wins, in this order:

  1. Per-tool override (most specific)
  2. Per-bundle override
  3. Default bundle policy (least specific, the built-in safe default)

This lets you keep safe defaults while tightening or loosening a single category, without hand-listing every tool.
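
The resolution order can be sketched as a chain of lookups. This is an illustration under stated assumptions: the `PolicyConfig` shape, the `resolvePermission` helper, the tool names, and the final fallback to approval are hypothetical and are not part of the documented API.

```typescript
type Permission = "auto" | "approval" | "off";

// Hypothetical config shape; field names are assumptions for this sketch.
interface PolicyConfig {
  toolOverrides: Partial<Record<string, Permission>>;   // per-tool overrides
  bundleOverrides: Partial<Record<string, Permission>>; // per-bundle overrides
  bundleDefaults: Partial<Record<string, Permission>>;  // built-in defaults
}

// The most specific rule that exists wins: tool, then bundle, then default.
function resolvePermission(
  tool: string,
  bundle: string,
  cfg: PolicyConfig,
): Permission {
  return (
    cfg.toolOverrides[tool] ??
    cfg.bundleOverrides[bundle] ??
    cfg.bundleDefaults[bundle] ??
    "approval" // assumed fallback when nothing matches
  );
}

const cfg: PolicyConfig = {
  toolOverrides: { browser_screenshot: "auto" },
  bundleOverrides: { browser: "approval" },
  bundleDefaults: { browser: "auto", database: "approval" },
};

const screenshot = resolvePermission("browser_screenshot", "browser", cfg); // per-tool override applies
const click = resolvePermission("browser_click", "browser", cfg);           // bundle override applies
const query = resolvePermission("run_sql", "database", cfg);                // default applies
```

Here the browser bundle is tightened to approval while one hypothetical tool, `browser_screenshot`, is loosened back to auto, which is the single-category adjustment the paragraph above describes.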

Hardened categories

Some categories require approval no matter how the agent is configured. Sending email to a human (send_email_to_human) always requires approval. Code execution is additionally gated by sandbox provisioning and budget checks. Database writes default to approval. Production deployments require explicit approval. An autonomous agent does not get to skip these.

Agent tool modes

Each agent also has a tool mode that interacts with the per-tool policy:

| Mode | Behavior |
| --- | --- |
| chat_only | All tools blocked. The agent can describe what it would do but cannot execute. |
| permission | Every write tool requires approval. Read tools run immediately. |
| autonomous | Tools run according to their bundle policy. Hardened categories still require approval. |

    await r.agents.update('agent_123', { tool_mode: 'permission' });

chat_only is the safest default for a new or untrusted agent. permission is the right mode while you build confidence: the agent does useful work but a human signs off on every mutation. autonomous hands the agent the bundle policy you configured, with the hardened categories as a backstop.
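
One way to picture how tool mode, bundle policy, and hardened categories combine is a single decision function. This is a sketch under assumptions: the `ToolCall` shape and its `isWrite` and `hardened` flags are hypothetical, and in permission mode it assumes the stricter of the mode rule and the bundle policy applies, which the text above does not spell out.

```typescript
type Permission = "auto" | "approval" | "off";
type ToolMode = "chat_only" | "permission" | "autonomous";
type Decision = "run" | "queue" | "blocked";

// Hypothetical description of a call; the flags are assumptions.
interface ToolCall {
  bundlePermission: Permission; // already-resolved policy for the tool
  isWrite: boolean;             // true for mutations
  hardened: boolean;            // e.g. send_email_to_human
}

function decide(mode: ToolMode, call: ToolCall): Decision {
  // chat_only blocks everything; "off" tools are never available.
  if (mode === "chat_only" || call.bundlePermission === "off") return "blocked";
  // Hardened categories are queued in every executing mode.
  if (call.hardened) return "queue";
  // Bundle policy of "approval" always queues (assumed to hold in both modes).
  if (call.bundlePermission === "approval") return "queue";
  // permission mode additionally gates every write.
  if (mode === "permission" && call.isWrite) return "queue";
  return "run";
}

const autoWrite: ToolCall = { bundlePermission: "auto", isWrite: true, hardened: false };
const autoRead: ToolCall = { bundlePermission: "auto", isWrite: false, hardened: false };
const email: ToolCall = { bundlePermission: "auto", isWrite: true, hardened: true };

const decisions = {
  chatOnlyRead: decide("chat_only", autoRead),
  permissionWrite: decide("permission", autoWrite),
  permissionRead: decide("permission", autoRead),
  autonomousWrite: decide("autonomous", autoWrite),
  autonomousEmail: decide("autonomous", email),
};
```

The `autonomousEmail` case shows the backstop: even with an auto bundle policy and an autonomous agent, the hardened call is queued.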

The human approval queue

When a tool requires approval, the platform creates a pending execution record (tool name, arguments, agent identity, conversation context), notifies the supervising human in real time, and tells the agent to wait. The agent cannot proceed until a human acts. Pending requests expire if no one acts on them, so the queue does not accumulate stale items.
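
The record and its expiry can be modeled roughly as follows. This is an illustrative sketch only: the field names, the status values, the `expireStale` helper, and the 15-minute lifetime are assumptions, since the actual record schema and expiry window are not stated here.

```typescript
// Hypothetical record shape based on the fields listed above.
interface PendingExecution {
  id: string;
  tool: string;
  arguments: Record<string, unknown>;
  agentId: string;
  conversationId: string;
  createdAt: number; // milliseconds since epoch
  status: "pending" | "approved" | "rejected" | "expired";
}

// Assumed lifetime; the real expiry window is not documented on this page.
const TTL_MS = 15 * 60 * 1000;

// Mark pending records older than the TTL as expired. Records that were
// already approved or rejected are left untouched.
function expireStale(
  records: PendingExecution[],
  now: number,
  ttlMs: number = TTL_MS,
): PendingExecution[] {
  return records.map((item): PendingExecution =>
    item.status === "pending" && now - item.createdAt >= ttlMs
      ? { ...item, status: "expired" }
      : item,
  );
}

const base = {
  tool: "run_sql",
  arguments: { statement: "DELETE FROM sessions" },
  agentId: "agent_123",
  conversationId: "conv_123",
};

const swept = expireStale(
  [
    { ...base, id: "a", createdAt: 0, status: "pending" },
    { ...base, id: "b", createdAt: 14 * 60 * 1000, status: "pending" },
    { ...base, id: "c", createdAt: 0, status: "approved" },
  ],
  15 * 60 * 1000,
);
```

After the sweep, record `a` is expired, `b` is still pending because it is only one minute old, and `c` keeps its approved status.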

You drive the queue through r.integrations. These approval endpoints cover both integration tools and brain/platform tools such as provisioning a sandbox or database.

    // 1. See what an agent is waiting on for a given conversation
    const { data: pending } = await r.integrations.listPendingExecutions('conv_123');

    for (const item of pending) {
      console.log(item.id, item.tool, item.arguments);
    }

    // 2. Approve and run the first request; this unblocks a permission-mode agent
    const [first] = pending;
    if (first) {
      await r.integrations.approveExecution(first.id);
    }

    // 3. Or reject it instead of approving; the waiting agent is notified
    // await r.integrations.rejectExecution(first.id);

The same flow is available in MCP as list_pending_tool_executions, approve_tool_execution, and reject_tool_execution, so a human supervising from Claude can clear the queue directly. An agent can poll its own request with check_approval_status to know when it has been cleared.
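
An agent-side polling loop might look like the sketch below. It is not the platform's client code: `waitForApproval` is a hypothetical helper, `checkStatus` stands in for whatever wrapper you write around check_approval_status, and the status values are assumed.

```typescript
// Assumed status values; the real tool's return shape is not documented here.
type ApprovalStatus = "pending" | "approved" | "rejected" | "expired";

// Poll until the request leaves the pending state or attempts run out.
// checkStatus is injected so the loop does not depend on any specific client.
async function waitForApproval(
  checkStatus: () => Promise<ApprovalStatus>,
  opts: { intervalMs: number; maxAttempts: number },
): Promise<ApprovalStatus> {
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    const status = await checkStatus();
    if (status !== "pending") return status;
    // Wait before asking again so the queue is not hammered.
    await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs));
  }
  // Still undecided after the last attempt; the caller decides what to do.
  return "pending";
}
```

Bounding the loop with `maxAttempts` keeps a forgotten request from holding the agent indefinitely; because pending requests expire on the platform side, a caller would typically treat a final "pending" or "expired" result the same way as a rejection.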

Approval records carry an argument hash for deduplication without exposing sensitive parameters, an organization scope so approvals never cross tenants, and a captured denial reason when a request is rejected.

Requiring approval on destructive actions

To gate a destructive capability, set its bundle to approval and run the agent in permission or autonomous mode (in chat_only mode nothing executes at all):

    await r.projectBrain.updateSettings('proj_123', {
      tool_permissions: {
        database: 'approval',      // schema and data mutations wait for a human
        sandbox: 'approval',       // code execution waits for a human
        communication: 'approval', // outbound messages wait for a human
      },
    });

Now every database write, sandbox run, and outbound message the agent attempts lands in the approval queue. A human reviews the exact arguments before anything runs, and the decision is recorded.

What this proves

Every gated action produces a record of what the agent wanted to do, what a human decided, and when. Combined with Audit & Observability, the approval queue is the evidence trail for human-in-the-loop control: you can show that a given mutation was reviewed and authorized rather than run unsupervised.