Vercel Logo

Shell Execution with Safety

Your bash tool works, but the description, the safety check, and the call to execSync are sitting on top of each other in one big closure. That's fine when there's one bash tool. It stops being fine the moment you want to run commands somewhere other than this machine.

A few modules from now, we'll swap local execution for a sandbox. When that happens, you don't want to rewrite the bash tool. You want to hand it a different backend and let the rest stay still.

That's what the factory pattern is for.

Outcome

You have a createBashTool(operations, safePrefixes) factory that returns a fully configured bash tool. The model-facing contract (description, schema, safety check) lives in the factory. The execution backend is injected through an operations object.

Fast Track

  1. Define a BashOperations interface with one method, exec(command)
  2. Wrap the existing bash tool in createBashTool(operations, safePrefixes)
  3. Construct a localOps object that wraps execSync, then build the tool with it

Hands-on Exercise 2.2

Pull bash out into a factory function with a swappable execution backend.

Requirements:

  1. Define BashOperations with exec(command: string): Promise<{ stdout: string; exitCode: number }>
  2. Write createBashTool(operations: BashOperations, safePrefixes: string[]) that returns a tool()
  3. Keep the safety check inside the factory, using the injected safePrefixes
  4. Build a localOps implementation that wraps execSync
  5. Replace your existing bash constant with const bash = createBashTool(localOps, SAFE_PREFIXES)

Implementation hints:

  • The factory closes over operations and safePrefixes. The execute function inside the tool calls operations.exec(command) instead of execSync directly
  • Localops handles both stdout and errors uniformly. Return { stdout, exitCode } whether the command succeeded or not
  • Don't refactor read yet. The factory pattern earns its keep where the backend will actually vary, which for now is just bash

Where the seam goes

Right now your bash tool calls execSync directly:

index.ts
execute: async ({ command }) => {
  if (!isSafe(command)) return "Blocked...";
  const stdout = execSync(command, { cwd, encoding: "utf-8", timeout: 30_000 });
  return stdout;
}

When you add a sandbox later, this becomes sandbox.exec(command). Same idea, different backend. The factory introduces a seam between the two:

index.ts
interface BashOperations {
  exec(command: string): Promise<{ stdout: string; exitCode: number }>;
}

Everything the model sees lives above the seam. Everything that actually runs commands lives below.

Build the factory

index.ts
function createBashTool(operations: BashOperations, safePrefixes: string[]) {
  function isSafe(command: string): boolean {
    return safePrefixes.some((p) => command.trim().startsWith(p));
  }
 
  return tool({
    description: `Execute a shell command in the working directory.
 
WHEN TO USE: running build commands, installing packages, running tests,
  git operations, directory listings.
 
WHEN NOT TO USE: reading file contents (use read instead).
  Searching for patterns (use grep instead).
 
DO NOT USE FOR: reading files (use read), searching code (use grep).
 
USAGE: command is a single shell string. Commands not in the safe-prefix
  allowlist are blocked and return a clear error message.`,
    inputSchema: z.object({
      command: z.string().describe("Shell command to execute"),
    }),
    execute: async ({ command }) => {
      if (!isSafe(command)) {
        return `Blocked: "${command}" requires approval.`;
      }
      const { stdout } = await operations.exec(command);
      return stdout || "(no output)";
    },
  });
}

Notice what's gone: no execSync, no cwd reference, no error handling that knows about Node's child_process. The factory only knows there's something called operations.exec and it returns stdout.

Build the local backend

The localOps object is where the actual execSync call lives now:

index.ts
const localOps: BashOperations = {
  exec: async (command) => {
    try {
      const stdout = execSync(command, {
        cwd,
        encoding: "utf-8",
        timeout: 30_000,
      });
      return { stdout, exitCode: 0 };
    } catch (e: any) {
      return {
        stdout: e.stdout || e.stderr || e.message || "",
        exitCode: e.status ?? 1,
      };
    }
  },
};
 
const bash = createBashTool(localOps, SAFE_PREFIXES);

When you build the sandbox abstraction in Module 4, the swap is one line:

index.ts (preview, Module 4)
const sandboxOps: BashOperations = {
  exec: (command) => sandbox.exec(command),
};
 
const bash = createBashTool(sandboxOps, SAFE_PREFIXES);

Same tool. Different backend. The description, the schema, and the safety check don't move.

Why only bash, not read

You could apply the same factory pattern to read. We're not doing it yet. The factory earns its keep when the backend genuinely varies. For read, that won't happen until Module 4. For bash, execution backends and safety policy are already pulling in opposite directions. Refactor when there's pressure, not before.

Try It

Run a safe command to make sure the factory wired everything correctly:

Terminal
bun run index.ts . "List all files in this directory"

Same output as before. Same blocked-command behavior. The model can't tell anything changed, which is the whole point.

Try a blocked command to make sure the safety check still works:

Terminal
bun run index.ts . "Run: rm -rf node_modules"

You should still get the block message back.

Terminal
npx tsc --noEmit

Commit

git add index.ts
git commit -m "refactor(bash): extract createBashTool with operations interface"

Done-When

  • BashOperations interface defined with exec(command)
  • createBashTool(operations, safePrefixes) returns a working tool
  • localOps wraps execSync and returns { stdout, exitCode }
  • Safe commands still run, blocked commands still return the block message
  • npx tsc --noEmit passes
Sketch the sandbox swap

Without building it yet, write a mockOps: BashOperations that doesn't actually run anything. Just return { stdout: "(pretend output)", exitCode: 0 } for any command. Swap localOps for mockOps and watch the agent get plausible-but-fake output for everything. This is the seam that lets the sandbox abstraction in Module 4 work without rewriting your tools.

Solution

index.ts
interface BashOperations {
  exec(command: string): Promise<{ stdout: string; exitCode: number }>;
}
 
function createBashTool(operations: BashOperations, safePrefixes: string[]) {
  function isSafe(command: string): boolean {
    return safePrefixes.some((p) => command.trim().startsWith(p));
  }
 
  return tool({
    description: `Execute a shell command in the working directory.
 
WHEN TO USE: running build commands, installing packages, running tests,
  git operations, directory listings.
 
WHEN NOT TO USE: reading file contents (use read instead).
  Searching for patterns (use grep instead).
 
DO NOT USE FOR: reading files (use read), searching code (use grep).
 
USAGE: command is a single shell string. Commands not in the safe-prefix
  allowlist are blocked and return a clear error message.`,
    inputSchema: z.object({
      command: z.string().describe("Shell command to execute"),
    }),
    execute: async ({ command }) => {
      if (!isSafe(command)) {
        return `Blocked: "${command}" requires approval.`;
      }
      const { stdout } = await operations.exec(command);
      return stdout || "(no output)";
    },
  });
}
 
const localOps: BashOperations = {
  exec: async (command) => {
    try {
      const stdout = execSync(command, {
        cwd,
        encoding: "utf-8",
        timeout: 30_000,
      });
      return { stdout, exitCode: 0 };
    } catch (e: any) {
      return {
        stdout: e.stdout || e.stderr || e.message || "",
        exitCode: e.status ?? 1,
      };
    }
  },
};
 
const bash = createBashTool(localOps, SAFE_PREFIXES);

Was this helpful?

supported.