CLI Entry Point
You've been running bun run index.ts . "prompt" since Module 1. That's a CLI. It just isn't a polite one.
The positional arguments are doing too much work. There's no flag for the sandbox backend, so you've been juggling process.env.SANDBOX. The model is hardcoded into the agent. And if you hit Ctrl-C mid-run, the sandbox doesn't shut down cleanly. For local that's fine. For a cloud sandbox, that's leaving a VM running on someone else's credit card.
This lesson formalizes the entry point. Arguments via parseArgs. Sandbox via a factory that reads a flag. Model via a flag with a default. Shutdown via a signal handler that always runs sandbox.stop().
Outcome
index.ts parses --sandbox, --model, a positional working directory, and a positional prompt. The sandbox shuts down cleanly on normal exit and on SIGINT.
Fast Track
- Use parseArgs from node:util for --sandbox and --model
- Build the sandbox from the flag through a small factory
- Wire a SIGINT handler that calls sandbox.stop() and exits
- Always call sandbox.stop() in a finally after the agent runs
Hands-on Exercise 10.1
Replace the ad-hoc CLI with parseArgs plus a clean shutdown.
Requirements:
- Use parseArgs from node:util with --sandbox (default local) and --model (default anthropic/claude-haiku-4-5)
- Allow positionals: the first is cwd, the rest are joined into the prompt
- Build the sandbox from the flag via a sandboxFromFlag(name, cwd) helper
- Wrap the agent run in try/finally so sandbox.stop() always runs
- Register a SIGINT handler that stops the sandbox and exits with code 0
Implementation hints:
- parseArgs is in node:util. Set allowPositionals: true to mix flags and positionals
- sandboxFromFlag is a one-liner switch over "local" and "just-bash"
- The finally matters more for cloud sandboxes than local, but you want the same code path either way
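To see how flags and positionals separate before wiring them into the agent, here is a minimal sketch that parses an explicit argument array instead of process.argv. The parseCli wrapper is a hypothetical name used only for this illustration; the options mirror the requirements above.

```typescript
import { parseArgs } from "node:util";
import { resolve } from "node:path";

// Hypothetical helper: takes an explicit argv array so the result is easy to inspect.
function parseCli(argv: string[]) {
  const { values, positionals } = parseArgs({
    args: argv,
    options: {
      sandbox: { type: "string", default: "local" },
      model: { type: "string", default: "anthropic/claude-haiku-4-5" },
    },
    allowPositionals: true,
  });
  return {
    sandbox: values.sandbox!,
    model: values.model!,
    // First positional is the working directory, the rest form the prompt.
    cwd: resolve(positionals[0] || process.cwd()),
    prompt: positionals.slice(1).join(" ") || "Hello!",
  };
}

const parsed = parseCli(["--sandbox=just-bash", ".", "Read", "the", "package.json"]);
console.log(parsed.sandbox); // just-bash
console.log(parsed.model); // anthropic/claude-haiku-4-5
console.log(parsed.prompt); // Read the package.json
```

Because the prompt is joined from every positional after the first, it works whether or not the shell passes it as one quoted string.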
The CLI
import { parseArgs } from "node:util";
import { ToolLoopAgent, stepCountIs, pruneMessages } from "ai";
import { resolve, join } from "node:path";
import { existsSync, readFileSync } from "node:fs";
import { createLocalSandbox } from "./src/sandbox-local";
import { createJustBashSandbox } from "./src/sandbox-just-bash";
import { buildSystemPrompt } from "./src/system";
import {
  createReadTool,
  createGrepTool,
  createBashTool,
  createTaskTool,
  createAskUserTool,
  createTodoTool,
} from "./src/tools";
import { createApproval } from "./src/approval";
import { addCacheControl } from "./src/cache";
import { discoverGates } from "./src/verification";
import type { Sandbox } from "./src/sandbox";

// Flags first, then positionals: <cwd> <prompt words...>
const { values, positionals } = parseArgs({
  args: process.argv.slice(2),
  options: {
    sandbox: { type: "string", default: "local" },
    model: { type: "string", default: "anthropic/claude-haiku-4-5" },
  },
  allowPositionals: true,
});

const cwd = resolve(positionals[0] || process.cwd());
const prompt = positionals.slice(1).join(" ") || "Hello!";

// Any value other than "just-bash" falls back to the local sandbox.
async function sandboxFromFlag(name: string, dir: string): Promise<Sandbox> {
  if (name === "just-bash") return createJustBashSandbox(dir);
  return createLocalSandbox(dir);
}

const sandbox = await sandboxFromFlag(values.sandbox!, cwd);
console.error(`Sandbox: ${sandbox.type}`);

const projectContext = existsSync(join(cwd, "AGENTS.md"))
  ? readFileSync(join(cwd, "AGENTS.md"), "utf-8")
  : undefined;
const verificationCommands = await discoverGates(sandbox);

const baseTools = {
  read: createReadTool(sandbox),
  grep: createGrepTool(sandbox),
  bash: createBashTool(sandbox, createApproval({ mode: "interactive" })),
};
const tools = {
  ...baseTools,
  task: createTaskTool(sandbox, { read: baseTools.read, grep: baseTools.grep }),
  askUser: createAskUserTool(),
  todo: createTodoTool(),
};

const agent = new ToolLoopAgent({
  model: values.model!,
  instructions: buildSystemPrompt({
    workingDirectory: cwd,
    sandboxType: sandbox.type,
    toolNames: Object.keys(tools),
    projectContext,
    verificationCommands,
  }),
  tools,
  stopWhen: stepCountIs(15),
  prepareCall: async (options) => {
    const pruned = options.messages
      ? pruneMessages({
          messages: options.messages,
          toolCalls: "before-last-3-messages",
        })
      : undefined;
    return {
      ...options,
      messages: pruned ? addCacheControl(pruned) : undefined,
    };
  },
  onStepFinish: ({ usage, stepNumber }) => {
    console.error(
      `Step ${stepNumber}: ${usage.inputTokens} input, ${usage.outputTokens} output`,
    );
  },
});

// Ctrl-C path: stop the sandbox, then exit.
process.on("SIGINT", async () => {
  console.error("\nShutting down...");
  await sandbox.stop();
  process.exit(0);
});

// Normal and error path: the finally always stops the sandbox.
try {
  const { text, steps } = await agent.generate({ prompt });
  console.log(text);
  console.log(`\n(${steps.length} steps)`);
} finally {
  await sandbox.stop();
}

That's the full file. Most of it is the assembly that the previous nine modules already wrote. The CLI changes are: parseArgs, the sandboxFromFlag helper, the SIGINT handler, and the try/finally.
Why finally matters
If the agent throws halfway through, the finally still runs. For a local sandbox, you've cleaned up nothing important. For a cloud sandbox, you've avoided leaving a VM running. For a just-bash sandbox, you've released some memory. Same code, different cost, all of them cleanly handled.
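The behavior is easy to check in isolation. The sketch below uses a fake sandbox and a task that always throws; FakeSandbox, createFakeSandbox, and runWithCleanup are hypothetical names for this illustration, not part of the harness.

```typescript
// Hypothetical stand-in for a sandbox; only stop() matters here.
interface FakeSandbox {
  stopped: boolean;
  stop(): Promise<void>;
}

function createFakeSandbox(): FakeSandbox {
  const sandbox: FakeSandbox = {
    stopped: false,
    stop: async () => {
      sandbox.stopped = true;
    },
  };
  return sandbox;
}

// Runs a task and always stops the sandbox, whether the task resolves or throws.
async function runWithCleanup(
  sandbox: FakeSandbox,
  task: () => Promise<string>,
): Promise<string> {
  try {
    return await task();
  } finally {
    await sandbox.stop();
  }
}

async function demo(): Promise<boolean> {
  const sandbox = createFakeSandbox();
  try {
    await runWithCleanup(sandbox, async () => {
      throw new Error("agent failed mid-run");
    });
  } catch {
    // The error still propagates; cleanup has already happened by this point.
  }
  return sandbox.stopped;
}

demo().then((stopped) => console.log(stopped)); // true
```

The finally does not swallow the error. It only guarantees that stop() has run before the error reaches the caller.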
The SIGINT handler duplicates the sandbox.stop() call on purpose. The finally covers normal exit and a thrown exception. Ctrl-C is a separate path: without a handler, the signal terminates the process before the finally runs, so the handler stops the sandbox itself and then exits.
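Since both paths call sandbox.stop(), an interrupt that arrives while the finally is already shutting down could call it twice. The lesson does not say whether a sandbox tolerates that. One possible guard, sketched here under that assumption, is a wrapper that shares a single promise across calls. The once helper and the counter are hypothetical and not part of the lesson's code.

```typescript
// Hypothetical helper: wraps an async stop function so repeated calls share one promise.
function once(stop: () => Promise<void>): () => Promise<void> {
  let pending: Promise<void> | undefined;
  return () => {
    if (!pending) pending = stop();
    return pending;
  };
}

// Stand-in for sandbox.stop(); counts how many times the real cleanup runs.
let stopCalls = 0;
const stopSandbox = once(async () => {
  stopCalls += 1;
});

// The SIGINT handler and the finally block could both call the same wrapper.
Promise.all([stopSandbox(), stopSandbox()]).then(() => {
  console.log(stopCalls); // 1
});
```

In index.ts this would amount to creating the wrapper once, for example const stop = once(() => sandbox.stop()), and calling stop() in both places.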
Almost nothing in this file is about CLI concerns. The agent, the tools, the prompt, the sandbox: that's all reusable. The CLI parts are five or six lines of parseArgs and a signal handler. If you build a different surface (a web server, a Slack bot, a VS Code extension), the only code that changes is those five or six lines. Everything below them stays put.
Try It
Run with the new flags:
bun run index.ts --sandbox=just-bash --model=anthropic/claude-haiku-4-5 . "Read the package.json"

You should see Sandbox: just-bash in stderr, then the model's response in stdout, then the step count.
Test the clean shutdown by sending SIGINT mid-run:
bun run index.ts . "Run a long task"

Press Ctrl-C. You should see "Shutting down..." and a clean exit, not a hang.
npx tsc --noEmit
Commit
git add index.ts
git commit -m "feat(cli): parseArgs and SIGINT-aware shutdown"
Done-When
- parseArgs reads --sandbox and --model flags
- Positionals supply cwd and the prompt
- sandbox.stop() runs on normal exit (via finally)
- sandbox.stop() runs on SIGINT (via handler)
- npx tsc --noEmit passes
Add --session=<id> that loads a previous run's messages from disk and replays them as messages to agent.generate({ prompt, messages }). On exit, save the new messages back. Now you can pick up a conversation tomorrow. Where does the file live? What happens when the file is corrupt? Should you version-stamp it so old sessions don't break when the harness changes?
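As a starting point for the persistence half of this exercise, here is one possible sketch. Everything in it is an assumption made for illustration: the directory, the JSON file shape, the SESSION_VERSION constant, and the choice to treat a missing, corrupt, or version-mismatched file as an empty session. Messages are typed as unknown[] because the sketch does not depend on the SDK's message type.

```typescript
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";

// Assumed version stamp; bump it when the stored shape changes.
const SESSION_VERSION = 1;

interface SessionFile {
  version: number;
  messages: unknown[];
}

function sessionPath(dir: string, id: string): string {
  return join(dir, `${id}.json`);
}

// Returns saved messages, or an empty list if the file is missing,
// corrupt, or written by a different version.
function loadSession(dir: string, id: string): unknown[] {
  const path = sessionPath(dir, id);
  if (!existsSync(path)) return [];
  try {
    const parsed = JSON.parse(readFileSync(path, "utf-8")) as Partial<SessionFile>;
    if (parsed.version !== SESSION_VERSION || !Array.isArray(parsed.messages)) return [];
    return parsed.messages;
  } catch {
    return [];
  }
}

function saveSession(dir: string, id: string, messages: unknown[]): void {
  mkdirSync(dir, { recursive: true });
  const file: SessionFile = { version: SESSION_VERSION, messages };
  writeFileSync(sessionPath(dir, id), JSON.stringify(file, null, 2));
}

// Demo location only; a real harness would pick a stable directory.
const dir = join(tmpdir(), "agent-sessions-demo");
saveSession(dir, "demo", [{ role: "user", content: "hi" }]);
console.log(loadSession(dir, "demo").length); // 1
console.log(loadSession(dir, "missing").length); // 0
```

Silently starting fresh is only one answer to the corrupt-file question. Failing loudly or migrating old versions are equally valid, and a real implementation should also validate the session id before using it in a path.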
Solution
See the full index.ts above.