Build Your Own AI Coding Agent Harness

Build an AI coding agent harness from scratch using AI SDK, Vercel Sandbox, and just-bash. Covers the tool loop, tool design, system prompts, sandbox abstraction, context pruning, subagent delegation, lifecycle management, and extensibility.

A tool loop with three tools is a demo. The problems start when you try to use it for real work. You read a 5,000-line file and it stays in context forever. You give it bash and it runs rm -rf. You ask it to refactor a module and it explains how to refactor a module. One long task fills the context window and the agent loses its own instructions. The cloud sandbox costs money per minute and your code disappears when it times out.

Harness is the word for the system around the agent that handles all of this. This course builds TeensyCode from scratch.

What You'll Build

TeensyCode is a working AI coding agent harness with a compact TypeScript core, a real toolset, and multiple sandbox backends. Something you understand completely because you built every piece:

  • The loop: ToolLoopAgent with read, grep, write, edit, bash, task, and askUser tools
  • Safety gates: Execute-level safety with safe-command allowlists, evolving into configurable approval (interactive, background, delegated)
  • Behavioral prompts: A structured system prompt with Agency, Guardrails, and Handling Ambiguity sections, plus AGENTS.md injection for per-project configuration
  • Sandbox abstraction: One Sandbox interface, three implementations: local (Node fs and child_process), in-memory (just-bash with a copy-on-write virtual filesystem), and remote (Vercel Sandbox). Swap the backend, the tools don't change
  • Context management: pruneMessages, bounded tool output, and cache control to keep long-running sessions usable and affordable
  • Subagent delegation: Explorer and executor roles with isolated context, constrained tools, and a model picked per job
  • Human-in-the-loop: askUser with multiple-choice options and an ambiguity protocol that prefers search, then questions, then action
  • Sandbox lifecycle: State-machine thinking, snapshot and restore, and durable workflow concepts
  • Extensibility: Event bus, skills with progressive disclosure, and custom tool registration
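
To make "the loop" in the first bullet concrete, here is a minimal, dependency-free sketch. It is not the AI SDK's ToolLoopAgent. Every name in it (runToolLoop, Model, ToolSet) is hypothetical, and the model is a plain synchronous function standing in for a real, asynchronous LLM call.

```typescript
// A toy tool loop. All names are hypothetical; a real model call would be async.
type ToolCall = { tool: string; input: string };
type ModelTurn = { text: string; toolCall?: ToolCall };
// Stand-in for an LLM: sees the transcript, returns a final text or a tool call.
type Model = (transcript: string[]) => ModelTurn;
type ToolSet = Map<string, (input: string) => string>;

function runToolLoop(model: Model, tools: ToolSet, prompt: string, maxSteps: number): string {
  const transcript: string[] = [`user: ${prompt}`];
  for (let step = 0; step < maxSteps; step++) {
    const turn = model(transcript);
    // No tool call means the model has produced its final answer.
    if (turn.toolCall === undefined) return turn.text;
    const { tool, input } = turn.toolCall;
    const run = tools.get(tool);
    // Unknown tools become an error result the model can read and react to.
    const result = run ? run(input) : `error: unknown tool "${tool}"`;
    transcript.push(`assistant: call ${tool}(${input})`, `tool: ${result}`);
  }
  // The step limit keeps a confused model from looping forever.
  return "stopped: step limit reached";
}
```

Note that every tool result is appended to the transcript and never removed. That is the "stays in context forever" problem from the introduction, in miniature.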

Prerequisites

  • TypeScript, async/await, basic terminal experience
  • An AI_GATEWAY_API_KEY environment variable
  • Node.js 20+ or Bun runtime
  • Recommended: Building Filesystem Agents course

How The Course Works

Causal sequence. Each step exists because the previous one broke something. Step 1 adds read because the chatbot can't see files. Step 2 adds grep because the agent can't search. Step 3 adds bash because it can't run commands, but now it can rm -rf. Each step spotlights one concept while the rest stays runnable.

Modules 1 through 6 are build-along. You write code, run it, verify. Module 7 is concept and analysis (sandbox lifecycle involves durable workflows and state machines you can't safely demo locally). Modules 8 through 11 mix building and analysis.


Course Modules

Module 1: The Agent Loop

Build a ToolLoopAgent from zero tools (a chatbot) to read and grep (an agent) to bash with safety gates.
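
A safety gate for bash can start as a plain allowlist check. The sketch below is an assumption about what such a gate might look like, not the course's implementation. The command list, the metacharacter set, and the gateCommand name are all hypothetical.

```typescript
// Hypothetical allowlist of commands treated as read-only.
const SAFE_COMMANDS = new Set(["ls", "cat", "pwd", "echo", "grep", "head", "tail", "wc"]);

// Characters that can chain, redirect, or substitute commands.
// Anything containing them is never auto-approved.
const SHELL_METACHARACTERS = /[;&|<>`$()\n]/;

type GateDecision = "allow" | "needs-approval";

function gateCommand(command: string): GateDecision {
  const trimmed = command.trim();
  if (trimmed === "" || SHELL_METACHARACTERS.test(trimmed)) return "needs-approval";
  // Only the program name is checked; arguments are not inspected.
  const [program = ""] = trimmed.split(/\s+/);
  return SAFE_COMMANDS.has(program) ? "allow" : "needs-approval";
}
```

The gate fails closed: anything it does not recognize goes to approval instead of running. It is deliberately crude, and a real gate would likely need to parse the command properly.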

Module 2: Tool Design

Evolve descriptions into a 5-section contract, extract the factory pattern, and build configurable approval.

Module 3: The System Prompt

Shape behavior with structured instructions, dynamic composition, verification gates, and AGENTS.md.
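
Dynamic composition can be as simple as joining named sections and appending project instructions when they exist. This is a sketch under assumptions: the section bodies are placeholders, and composeSystemPrompt is a hypothetical helper, not an AI SDK function.

```typescript
interface PromptSection {
  title: string;
  body: string;
}

// Placeholder wording. Only the three section titles come from the course outline.
const BASE_SECTIONS: PromptSection[] = [
  { title: "Agency", body: "Do the work with tools. Do not just describe it." },
  { title: "Guardrails", body: "Never run destructive commands without approval." },
  { title: "Handling Ambiguity", body: "Search first, then ask, then act." },
];

function composeSystemPrompt(agentsMd?: string): string {
  const sections = [...BASE_SECTIONS];
  // Inject per-project instructions only when the project provides them.
  if (agentsMd !== undefined && agentsMd.trim() !== "") {
    sections.push({ title: "Project Instructions (AGENTS.md)", body: agentsMd.trim() });
  }
  return sections.map((s) => `## ${s.title}\n${s.body}`).join("\n\n");
}
```

The caller is assumed to read AGENTS.md from the project root and pass its contents in, so the composer itself stays free of file access.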

Module 4: The Sandbox Abstraction

One interface, three implementations. Tools call sandbox.exec(), not child_process.exec().
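
The idea can be sketched with a small interface and a toy backend. The method names, ExecResult shape, ToyMemorySandbox, and createReadTool below are assumptions for illustration. The course's actual Sandbox interface may differ, and the toy backend is not just-bash.

```typescript
interface ExecResult {
  stdout: string;
  stderr: string;
  exitCode: number;
}

// Hypothetical shape of the shared interface.
interface Sandbox {
  readFile(path: string): Promise<string>;
  writeFile(path: string, content: string): Promise<void>;
  exec(command: string): Promise<ExecResult>;
}

// Toy backend: files live in a Map, and exec understands only `cat <path>`.
class ToyMemorySandbox implements Sandbox {
  private files = new Map<string, string>();

  async readFile(path: string): Promise<string> {
    const content = this.files.get(path);
    if (content === undefined) throw new Error(`no such file: ${path}`);
    return content;
  }

  async writeFile(path: string, content: string): Promise<void> {
    this.files.set(path, content);
  }

  async exec(command: string): Promise<ExecResult> {
    const [program = "", target] = command.trim().split(/\s+/);
    if (program === "cat" && target !== undefined) {
      const content = this.files.get(target);
      return content === undefined
        ? { stdout: "", stderr: `cat: ${target}: no such file`, exitCode: 1 }
        : { stdout: content, stderr: "", exitCode: 0 };
    }
    return { stdout: "", stderr: `unsupported command: ${program}`, exitCode: 127 };
  }
}

// Tool factory: the tool closes over whichever Sandbox it is handed.
function createReadTool(sandbox: Sandbox) {
  return {
    description: "Read a file from the workspace.",
    execute: (input: { path: string }) => sandbox.readFile(input.path),
  };
}
```

Because createReadTool depends only on the interface, a backend built on child_process or a remote VM could be passed in without touching the tool.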

Module 5: Context Management

Every tool call stays in context forever. Fix it with pruning, bounded output, and cache control.
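
Two of these fixes are easy to sketch by hand. The helpers below are hypothetical and hand-rolled for illustration. They are not the AI SDK's pruneMessages, and the ChatMessage shape is a simplified assumption.

```typescript
type ChatMessage =
  | { role: "system" | "user" | "assistant"; content: string }
  | { role: "tool"; toolName: string; content: string };

// Bounded output: cap what a single tool call can add to the context.
function boundOutput(output: string, maxChars: number): string {
  if (output.length <= maxChars) return output;
  const omitted = output.length - maxChars;
  return `${output.slice(0, maxChars)}\n[truncated ${omitted} characters]`;
}

// Pruning: keep the most recent tool results, replace older ones with a stub.
function pruneOldToolResults(messages: ChatMessage[], keepLast: number): ChatMessage[] {
  const toolIndexes: number[] = [];
  messages.forEach((m, i) => {
    if (m.role === "tool") toolIndexes.push(i);
  });
  // slice(-0) would keep everything, so zero is handled explicitly.
  const keep = new Set(keepLast > 0 ? toolIndexes.slice(-keepLast) : []);
  return messages.map((m, i): ChatMessage =>
    m.role === "tool" && !keep.has(i) ? { ...m, content: "[tool result pruned]" } : m
  );
}
```

The stub keeps the message in place, so the model can still see that a tool was called even though the result is gone. Both functions return new values and leave their inputs untouched.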

Module 6: Subagent Delegation

Parent plans, subagents execute. Isolated context, constrained tools, role-based models.
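
One way to picture role-based delegation is a table of role configs and a function that builds a fresh request from it. This is a sketch under assumptions: the tool split between roles is a guess, the model ids are placeholders, and buildSubagentRequest is a hypothetical helper.

```typescript
type SubagentRole = "explorer" | "executor";

interface RoleConfig {
  model: string;
  tools: string[];
}

// Assumed split: explorers are read-only, executors may change files.
// Model ids are placeholders, not recommendations.
const ROLE_CONFIG: Record<SubagentRole, RoleConfig> = {
  explorer: { model: "placeholder/fast-model", tools: ["read", "grep"] },
  executor: { model: "placeholder/strong-model", tools: ["read", "grep", "write", "edit", "bash"] },
};

interface SubagentRequest {
  model: string;
  tools: string[];
  messages: { role: "user"; content: string }[];
}

function buildSubagentRequest(role: SubagentRole, task: string): SubagentRequest {
  const config = ROLE_CONFIG[role];
  return {
    model: config.model,
    // Constrained tools: the subagent gets only what its role allows.
    tools: [...config.tools],
    // Isolated context: the subagent sees its task, not the parent's history.
    messages: [{ role: "user", content: task }],
  };
}
```

Since the parent's history never reaches the subagent, the task string has to carry everything the subagent needs. That is where bad instructions come from.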

Module 7: Sandbox Lifecycle

Cloud sandboxes cost money and time out. Concept and analysis module.
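
State-machine thinking here means writing down which transitions are legal and rejecting the rest. The states and events below are assumptions chosen for illustration. They do not describe Vercel Sandbox's actual lifecycle or API.

```typescript
// Hypothetical lifecycle. "snapshotted" means saved to storage and not running.
type SandboxState = "running" | "snapshotted" | "stopped";
type LifecycleEvent = "snapshot" | "restore" | "timeout" | "stop";

const TRANSITIONS: Record<SandboxState, Partial<Record<LifecycleEvent, SandboxState>>> = {
  running: { snapshot: "snapshotted", timeout: "stopped", stop: "stopped" },
  snapshotted: { restore: "running" },
  // Nothing leaves "stopped": without a snapshot there is nothing to restore.
  stopped: {},
};

function transition(state: SandboxState, event: LifecycleEvent): SandboxState {
  const next = TRANSITIONS[state][event];
  if (next === undefined) {
    throw new Error(`invalid transition: "${event}" from "${state}"`);
  }
  return next;
}
```

In this toy model a timeout from "running" ends in "stopped" with no way back. That is the "your code disappears" failure, and it is why snapshotting before the timeout matters.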

Module 8: Human-in-the-Loop

Agents that guess wrong waste more time than agents that ask.
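
The "search, then questions, then action" preference can be sketched as a small decision function. This is one possible reading of the protocol, not the course's implementation. NextStep and resolveAmbiguity are hypothetical names.

```typescript
type NextStep =
  | { kind: "search"; query: string }
  | { kind: "ask"; question: string; options: string[] }
  | { kind: "act"; target: string };

// searchResults is undefined until the agent has actually searched.
function resolveAmbiguity(request: string, searchResults?: string[]): NextStep {
  // 1. Search first: look in the codebase before bothering the user.
  if (searchResults === undefined) return { kind: "search", query: request };
  // 2. Exactly one candidate means nothing is ambiguous, so act.
  const [only] = searchResults;
  if (searchResults.length === 1 && only !== undefined) return { kind: "act", target: only };
  // 3. Otherwise ask. Candidates become multiple-choice options.
  //    With zero candidates the options are empty and the question is free-form.
  return {
    kind: "ask",
    question: `Which target did you mean by "${request}"?`,
    options: searchResults,
  };
}
```

The agent only reaches "ask" after a search has failed to settle the question, which keeps interruptions for cases where the codebase cannot answer.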

Module 9: Planning and Verification

Plan before acting, verify after acting.

Module 10: Surfaces

The agent is headless. CLI, TUI, and web are rendering strategies.

Module 11: Extensibility

Events, not inheritance. Skills as progressive disclosure. Tools as registrations.
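
"Events, not inheritance" and "tools as registrations" can be sketched together: a typed event bus, plus a registry that announces what it does. All names here (EventBus, ToolRegistry, HarnessEvents, the event names) are hypothetical, and tools are simplified to synchronous string functions.

```typescript
type Handler = (payload: unknown) => void;

class EventBus<E extends Record<string, unknown>> {
  private handlers = new Map<keyof E, Handler[]>();

  // Subscribe to an event. Returns a function that unsubscribes.
  on<K extends keyof E>(event: K, handler: (payload: E[K]) => void): () => void {
    const stored = handler as Handler;
    this.handlers.set(event, [...(this.handlers.get(event) ?? []), stored]);
    return () => {
      this.handlers.set(event, (this.handlers.get(event) ?? []).filter((h) => h !== stored));
    };
  }

  emit<K extends keyof E>(event: K, payload: E[K]): void {
    for (const handler of this.handlers.get(event) ?? []) handler(payload);
  }
}

type HarnessEvents = {
  "tool:registered": { name: string };
  "tool:called": { name: string; input: string };
};

class ToolRegistry {
  private tools = new Map<string, (input: string) => string>();
  private bus: EventBus<HarnessEvents>;

  constructor(bus: EventBus<HarnessEvents>) {
    this.bus = bus;
  }

  register(name: string, run: (input: string) => string): void {
    this.tools.set(name, run);
    this.bus.emit("tool:registered", { name });
  }

  call(name: string, input: string): string {
    const run = this.tools.get(name);
    if (!run) throw new Error(`unknown tool: ${name}`);
    this.bus.emit("tool:called", { name, input });
    return run(input);
  }
}
```

A logger, a cost tracker, or a UI can subscribe to the bus without the registry knowing they exist, and nothing has to subclass anything.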


Capstone

Run your harness against a real project. Not "add a hello world endpoint" but "add rate limiting to the auth routes." Watch where context overflows. Watch where it picks the wrong tool. Watch where the subagent gets bad instructions. Fix what breaks.

Tech Stack

Component        Purpose
AI SDK           ToolLoopAgent, tool(), stepCountIs, pruneMessages, streaming
AI Gateway       Model routing. "anthropic/claude-haiku-4-5" as a string, no wrapper
Vercel Sandbox   Remote VM with an isolated filesystem, git, and npm
just-bash        In-memory virtual filesystem and simulated bash
Vercel Workflow  Durable workflows for sandbox lifecycle
Zod v3           Tool input schemas. v4 breaks AI SDK v6 types