Claude Managed Agents (CMA) handles the agent and infrastructure for you: the harness, the session state, the tools, and the execution environment. What if you want to plug in your own environment instead? This guide shows you how, using Vercel Sandbox as the execution layer.
Check out the demo app source code, or follow the walkthrough below to build it yourself.
Anthropic hosts the brain: Claude, the tool-calling loop, skills, and memory. The brain has no hands, so when Claude calls a tool, something on your side has to run it and post the result back. With Vercel, that "something" splits into two planes:
- Control plane (Vercel Function): receives session.status_run_started webhooks from Anthropic and spawns one Vercel Sandbox per session.
- Compute plane (Vercel Sandbox): the spawned VM attaches to the session's event stream, executes tool calls (run_shell, read_file, etc.), posts results back, and exits when the session ends.
Each session runs in a fresh isolated microVM that exits when the session ends. The environment key never enters the VM: Vercel Sandbox's credential brokering injects it on outbound requests scoped to this session, so a compromised sandbox can't extract the key or use it to act on other sessions.
- A Vercel account with Sandbox access
- An Anthropic account with environments access
- Node.js 22+
- Vercel CLI (pnpm add -g vercel)
The control plane webhook, the UI, and the setup scripts all live in the same Next.js app. Scaffold it with:
tsx auto-loads .env.local before running any script, so no dotenv import is needed in the scripts.
Link the project to Vercel and pull credentials:
This writes a VERCEL_OIDC_TOKEN to .env.local so @vercel/sandbox can authenticate without a long-lived Vercel token.
All API calls below require two headers, which the SDK adds automatically when you pass the beta tag:
Create the environment in the Anthropic dashboard (Workspace → Environments → New → Self-hosted) or in code:
From the project root, run it once and add the printed ID to .env.local:
In the console, open the environment and click Generate environment key. Save it as ANTHROPIC_ENVIRONMENT_KEY in .env.local. This key authenticates the whole worker flow: poll, ack, heartbeat, stop, and the session event stream. Ignore the on-screen instructions about an env_manager binary: Vercel Sandbox is the runtime.
Create an agent with the custom tools your runner will handle. The tools array here must match exactly what runTool implements in the sandbox:
Still in scripts/create-agent.ts, add a read_file tool the same way:
Run it and save the printed ID as ANTHROPIC_AGENT_ID in .env.local:
This is the code that runs inside each spawned sandbox. It reconciles any tool calls that arrived before it attached, then streams new ones and posts results back.
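Before building the runner step by step, it can help to picture the tool dispatch at its core. The sketch below is not the guide's runner: the handlers are stand-ins that return a description instead of running a shell command or reading a file, and `ToolHandler`, `toolHandlers`, and `diffToolNames` are hypothetical names introduced here. It also shows one way to check mechanically that the tool names declared on the agent line up with what `runTool` handles.

```typescript
// Hypothetical handler signature: takes the tool input, returns text for the result.
type ToolHandler = (input: Record<string, unknown>) => Promise<string>;

// Stand-in implementations. A real runner would execute the command or read the file.
const toolHandlers = new Map<string, ToolHandler>([
  ["run_shell", async (input) => `would run: ${String(input.command)}`],
  ["read_file", async (input) => `would read: ${String(input.path)}`],
]);

// Dispatch by tool name. Unknown names produce an error string instead of throwing,
// so every tool call still gets some result posted back.
async function runTool(name: string, input: Record<string, unknown>): Promise<string> {
  const handler = toolHandlers.get(name);
  if (!handler) return `error: unknown tool "${name}"`;
  return handler(input);
}

// Compares the names declared on the agent with the names the runner implements.
function diffToolNames(declared: string[], implemented: string[]) {
  const missing = declared.filter((n) => !implemented.includes(n)); // declared, no handler
  const extra = implemented.filter((n) => !declared.includes(n)); // handler, never declared
  return { missing, extra };
}
```

Calling `diffToolNames(["run_shell", "read_file"], Array.from(toolHandlers.keys()))` returns two empty arrays when the agent definition and the runner agree. A non-empty `missing` list means Claude could call a tool the sandbox cannot execute.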
Create sandbox/runner.ts with the imports and constants:
The released SDK uses one credential for everything: poll, ack, heartbeat, stop, and session events. The control plane will configure the sandbox's network policy to inject that credential on outbound requests to api.anthropic.com, so this code never sees the raw token.
In the same file, define your tool implementations:
Still in sandbox/runner.ts, start the heartbeat to keep the work-item lease alive:
Still in sandbox/runner.ts, post results back for each tool call:
Still in sandbox/runner.ts, list existing events to catch up on anything emitted while the sandbox was booting, then switch to the live stream. This top-level try block is the entry point (no main() wrapper needed: tsx runs the file directly):
The reconcile pass matters because the webhook may take a moment to spawn the sandbox. Listing first and deduplicating with handled ensures no tool call is dropped or processed twice. A 409 from work.stop means another runner already stopped it, which is safe to swallow.
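The list-then-stream deduplication and the 409 handling described above can be sketched in isolation. This is a simplified model, not the runner itself: plain arrays stand in for the SDK's event list and live stream, `ToolCallEvent` is a hypothetical trimmed-down event shape, and `stop` stands in for whatever call ends the work item. It assumes the thrown error exposes a numeric `status` field.

```typescript
// Hypothetical, simplified event shape. The real SDK event union is richer.
interface ToolCallEvent {
  id: string;
  name: string;
}

// Returns a handler that runs each tool call at most once, keyed by event ID.
function createDedupingHandler(run: (ev: ToolCallEvent) => void) {
  const handled = new Set<string>();
  return (ev: ToolCallEvent): boolean => {
    if (handled.has(ev.id)) return false; // already processed in an earlier pass
    handled.add(ev.id);
    run(ev);
    return true;
  };
}

// Reconcile pass over history first, then the live events. Events that show up
// in both (emitted while the sandbox was booting) are processed once.
function reconcileThenStream(
  history: ToolCallEvent[],
  live: ToolCallEvent[],
  run: (ev: ToolCallEvent) => void,
): void {
  const handle = createDedupingHandler(run);
  for (const ev of history) handle(ev);
  for (const ev of live) handle(ev);
}

// Swallows a 409 from the stop call and rethrows anything else.
async function stopIgnoringConflict(stop: () => Promise<void>): Promise<boolean> {
  try {
    await stop();
    return true; // this runner stopped the work item
  } catch (err) {
    const status = (err as { status?: number } | null)?.status;
    if (status === 409) return false; // another runner already stopped it
    throw err;
  }
}
```

Keying the set on the event ID is what makes the overlap between the two passes harmless: the same tool call can arrive through both paths and still runs once.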
Installing the Anthropic SDK and copying in the runner on every spawn would add noticeable latency to each session. Build a snapshot once, then every sandbox boots from that prebuilt image with no install step:
Run it from the project root and save the printed ID to .env.local:
Rebuild the snapshot whenever you change sandbox/runner.ts or bump the SDK version.
Register a webhook in the Anthropic dashboard for session.status_run_started. Each delivery triggers one poll, ack, and spawn pass.
Create app/api/webhook/route.ts with the imports and shared constants:
In the same file, add pollAndAck. The client is already authenticated with the environment key, so ack needs no per-call headers:
Still in app/api/webhook/route.ts, add spawn. It boots a sandbox from the snapshot and runs sandbox/runner.ts detached. The networkPolicy brokers the environment key at the firewall, scoped to just this session and work item: outbound calls to /v1/sessions/<sessionId>/... and /v1/environments/<envId>/work/<workId>/... get the Authorization: Bearer <key> header injected on the wire; anything else (e.g. work/poll or another session ID) gets no auth and is rejected by Anthropic. Matchers require @vercel/sandbox@beta:
process.env.ANTHROPIC_ENVIRONMENT_KEY is undefined inside the spawned VM. Even if an agent jailbreak or compromised tool ran console.log(process.env), there's no key to leak, and the scoped matchers mean a malicious request to work/poll or another session ID won't be authenticated. Adding more domains to the runner (e.g. a customer API) means extending the allow map. The default mode is deny-all once you set a network policy, so anything not in allow is blocked at the firewall.
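To make the scoping concrete, here is a small model of the decision the firewall makes for each outbound request. It is an illustration of the behavior described above, not the @vercel/sandbox API: `SessionScope`, `isInScope`, and `brokerHeaders` are hypothetical names, and the IDs in the usage note are placeholders.

```typescript
// Hypothetical model of scoped credential injection. Not the @vercel/sandbox API.
interface SessionScope {
  sessionId: string;
  environmentId: string;
  workId: string;
}

// True when the request path falls under this session or this work item.
function isInScope(path: string, scope: SessionScope): boolean {
  const prefixes = [
    `/v1/sessions/${scope.sessionId}/`,
    `/v1/environments/${scope.environmentId}/work/${scope.workId}/`,
  ];
  return prefixes.some((prefix) => path.startsWith(prefix));
}

// Returns the headers the request would carry after passing the firewall.
// In-scope requests gain the Authorization header; all others pass through unchanged.
function brokerHeaders(
  path: string,
  scope: SessionScope,
  key: string,
  headers: Record<string, string> = {},
): Record<string, string> {
  if (!isInScope(path, scope)) return { ...headers };
  return { ...headers, Authorization: `Bearer ${key}` };
}
```

With a scope of `{ sessionId: "sesn_A", environmentId: "env_1", workId: "work_1" }`, a request to `/v1/sessions/sesn_A/events` gets the header, while `/v1/environments/env_1/work/poll` and `/v1/sessions/sesn_B/events` do not, so Anthropic rejects them as unauthenticated.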
detached: true returns immediately, leaving the runner running inside the VM. Still in app/api/webhook/route.ts, export the POST handler. client.beta.webhooks.unwrap() verifies the HMAC signature, checks the timestamp, and parses the event in one call, so there's no hand-rolled crypto:
waitUntil hands the spawn off so the function returns 200 immediately while Sandbox.create finishes in the background.
For local development without deploying the webhook, copy the same pollAndAck and spawn helpers into scripts/poll.ts and run a blocking poll loop:
Keep only one control plane running at a time. If the deployed webhook and pnpm tsx scripts/poll.ts both run, they will compete for the same work items.
Push the project to Vercel and set the production environment variables:
In the Anthropic dashboard, add a webhook subscribed to session.status_run_started pointing at https://your-project.vercel.app/api/webhook. Save the webhook signing secret as ANTHROPIC_WEBHOOK_SECRET.
If your Vercel project has Deployment Protection enabled, Anthropic's delivery will be blocked with a 401. Append a bypass token to the URL so it gets through:
The bypass secret is in your Vercel project settings under Deployment Protection.
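As a sketch of what the resulting webhook URL looks like, the helper below appends the bypass secret as a query parameter. It assumes Vercel's `x-vercel-protection-bypass` parameter (Protection Bypass for Automation); `withProtectionBypass` is a hypothetical helper name, and the secret in the usage note is a placeholder.

```typescript
// Appends Vercel's protection-bypass query parameter to a webhook URL.
// Assumption: the project uses Protection Bypass for Automation.
function withProtectionBypass(webhookUrl: string, bypassSecret: string): string {
  const url = new URL(webhookUrl);
  url.searchParams.set("x-vercel-protection-bypass", bypassSecret);
  return url.toString();
}
```

For example, `withProtectionBypass("https://your-project.vercel.app/api/webhook", "SECRET")` yields `https://your-project.vercel.app/api/webhook?x-vercel-protection-bypass=SECRET`, which is the URL to register in the Anthropic dashboard.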
With the Next.js project in place, the UI is two API routes plus a client page. The browser never talks to Anthropic directly: app/page.tsx calls your API routes, which hold the API keys server-side.
Create app/api/session/route.ts. It creates the session and sends the first message:
Create app/api/session/[id]/route.ts. It streams session events as SSE: catch up on history, then switch to live streaming. The SDK exports BetaManagedAgentsSessionEvent as a discriminated union, so a switch on ev.type narrows each branch without manual casts.
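The narrowing and the SSE framing can be sketched without the SDK. In the example below, `DemoEvent` is an illustrative stand-in with made-up variant names, not the real BetaManagedAgentsSessionEvent union, and the SSE event names (`message`, `tool`, `done`) are likewise assumptions. The point is only to show how a switch on `type` gives each branch typed access to its own fields, and what a frame looks like on the wire.

```typescript
// Illustrative stand-ins for a few event variants. The SDK's union has its own
// names and fields.
type DemoEvent =
  | { type: "message"; text: string }
  | { type: "tool_use"; name: string }
  | { type: "idle" };

// Formats one SSE frame: an event name line, a JSON data line, and a blank line.
function sseFrame(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Writes the frame for one event and returns true when the turn is over.
function forward(ev: DemoEvent, write: (frame: string) => void): boolean {
  switch (ev.type) {
    case "message":
      write(sseFrame("message", { text: ev.text })); // ev.text exists only in this branch
      return false;
    case "tool_use":
      write(sseFrame("tool", { name: ev.name })); // ev.name exists only in this branch
      return false;
    case "idle":
      write(sseFrame("done", {}));
      return true;
  }
}
```

Because the switch covers every variant, TypeScript accepts the function without a trailing return, and adding a new variant to the union later produces a compile error until the switch handles it.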
Start the route file with imports and the GET handler shell:
Still in app/api/session/[id]/route.ts, translate each session event into an SSE frame. Returning true from forward signals the turn is over:
Still in app/api/session/[id]/route.ts, catch up on history, then stream live events until the turn ends:
Replace app/page.tsx with a client component that POSTs to /api/session, then opens an EventSource on /api/session/<id>:
Start the dev server and open the UI:
When you click Run, the flow is: app/page.tsx → POST /api/session → Anthropic creates the session → webhook (or pnpm tsx scripts/poll.ts) spawns a sandbox running sandbox/runner.ts → GET /api/session/<id> streams events back to the browser.
The UI above attaches SSE directly to Anthropic's session event stream. That is enough for a demo, but serverless functions can time out on long sessions, and you lose durable replay on refresh.
If you need a production chat UI with durable polling, multi-turn conversations, and full event replay, see Build a Claude Managed Agent with Vercel Workflow: a Vercel Workflow run polls session events, writes them to a durable stream, and the client reads them over SSE. The workflow run is both the execution engine and the event log.
These scripts live under scripts/ and read .env.local via tsx. Run them from the project root.
Create a session (scripts/test-session.ts): same API calls as app/api/session/route.ts, but from the CLI. Use this to get a sesn_01... ID without starting the UI:
Handle tools on your machine (scripts/run-session.ts): skips Vercel Sandbox entirely and executes run_shell locally. Pass the session ID from the step above:
Still in scripts/run-session.ts, catch up on existing events, then stream new ones until the session ends:
For the full sandbox path locally, run pnpm tsx scripts/poll.ts in one terminal (same logic as app/api/webhook/route.ts) and pnpm tsx scripts/test-session.ts in another. Stop the deployed webhook first, or the two control planes will compete for work items.
Self-hosting the CMA compute plane with Vercel Sandbox is the right choice when:
- Tools touch private infrastructure: your runner needs to reach internal databases, private APIs, or services not reachable from Anthropic's compute. Vercel Sandbox lets you run the compute inside or adjacent to your own network with low-latency, secure connectivity.
- You are handling per-customer credentials: in a SaaS context each user has their own API tokens. Passing those tokens as env vars to the runner works, but any code that runs in the sandbox can read them. Vercel Sandbox's credential brokering injects tokens at the firewall level instead: the sandbox calls fetch("https://api.example.com/...") with no auth header, and the firewall adds it before forwarding. console.log(process.env) inside the sandbox reveals nothing.
- You need egress control: Vercel Sandbox lets you define a domain allowlist and deny everything else, which matters when your runner processes private data and you want to prevent exfiltration.
The platform itself is also a good fit for this kind of work:
- Battle-tested infrastructure: Vercel has been running microVM sandboxes for 10 years to power its build system. The same infrastructure handles over a billion deployments and has hardened defenses against the kinds of attacks agent code can run into, like cryptominer abuse and container escapes.
- Built for TypeScript developers: the Sandbox SDK and CLI follow the same DX principles as Next.js, AI SDK, and Turborepo. You get secure OIDC authentication, no long-lived tokens, and a small surface area that fits cleanly into the rest of your toolchain.
- Low-latency connectivity to your cloud: sandboxes have direct egress to your AWS workloads with low data transfer costs. When your agent's tools need to reach private services, that is an advantage over general-purpose sandbox providers like Daytona or Cloudflare.
Instead of passing tokens as env vars:
Configure injection on the sandbox's network policy:
The runner's fetch calls to api.example.com are authenticated. The token never enters the sandbox. You can also scope injection to specific paths and methods using matchers:
Lock down egress to exactly what the runner needs:
Policies can be updated on running sandboxes without restarting. A useful pattern: start with allow-all to install dependencies, then tighten the policy before running agent-generated code or processing sensitive data.
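The two-phase pattern can be modeled as a pure function to show the intended outcome. This is a hypothetical model for illustration, not the @vercel/sandbox policy types: `EgressPolicy` and `isEgressAllowed` are names introduced here, the registry URL is a sample, and matching is by exact hostname only.

```typescript
// Hypothetical policy model for illustration. Not the @vercel/sandbox types.
type EgressPolicy =
  | { mode: "allow-all" }
  | { mode: "allowlist"; domains: string[] };

// Decides whether an outbound request may leave the sandbox under a policy.
function isEgressAllowed(url: string, policy: EgressPolicy): boolean {
  if (policy.mode === "allow-all") return true;
  const host = new URL(url).hostname;
  return policy.domains.includes(host); // exact host match, everything else is denied
}

// Phase 1: open policy while dependencies install.
const installPolicy: EgressPolicy = { mode: "allow-all" };

// Phase 2: tightened policy before agent-generated code runs.
const runPolicy: EgressPolicy = {
  mode: "allowlist",
  domains: ["api.anthropic.com", "api.example.com"],
};

const registryUrl = "https://registry.npmjs.org/some-package";
const allowedDuringInstall = isEgressAllowed(registryUrl, installPolicy); // true
const allowedDuringRun = isEgressAllowed(registryUrl, runPolicy); // false
```

The same request to the package registry succeeds during setup and is blocked once the policy is tightened, which is the property that limits exfiltration when the runner later handles private data.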
Check out the source or deploy the complete working implementation to Vercel in one click, then run pnpm tsx scripts/create-environment.ts, pnpm tsx scripts/create-agent.ts, and pnpm tsx scripts/build-snapshot.ts locally to fill in the environment variables before Anthropic's first webhook fires.