Your First Tools
In the last lesson, you handed the agent read and turned a chatbot into something useful. Useful, but limited. Your agent can only open files it already knows the name of. Ask it to "find all TODO comments" and it starts guessing which files might have one, then reads them one by one.
That's not searching. That's flailing politely.
Add grep and the agent gets a real search tool. But now you have a new problem. The model has to choose between read and grep every time it picks up a task. And until you tell it how, it picks wrong.
Outcome
You have a grep tool with a rich, behavior-shaping description. The model uses grep for searches and read for file inspection, with the routing decided entirely by the descriptions.
Fast Track
- Add a grep tool with regex pattern, optional glob filter, and a 50-match cap
- Write a description using WHEN TO USE, WHEN NOT TO USE, DO NOT USE FOR, and EXAMPLES
- Update read's description to match the same contract
Hands-on Exercise 1.2
Build the grep tool, then rewrite both descriptions until the model routes correctly.
Requirements:
- Add a grep tool with a Zod schema for pattern, optional path, and optional glob
- Implement execute using execSync with grep -rn, excluding node_modules and .git
- Cap output at 50 matches and report the total count
- Write descriptions for both read and grep using the four-section contract: WHEN TO USE, WHEN NOT TO USE, DO NOT USE FOR, EXAMPLES
Implementation hints:
- Import execSync from node:child_process
- Quote inputs into the shell command to avoid breaking on special characters
- Treat grep's non-zero exit (no matches found) as success, not error
- The description is the model's API for choosing tools. Write it for the model, not for the reader
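The quoting and exit-status hints can be sketched as two small helpers. This is a minimal illustration rather than the lesson's implementation: shellQuote and interpretGrepExit are hypothetical names, and the sketch assumes a POSIX shell and grep's documented exit statuses (0 for matches, 1 for no matches, greater than 1 for errors).

```typescript
// Hypothetical helper: wrap a value in single quotes for a POSIX shell.
// Each embedded single quote is closed, escaped, and reopened: ' becomes '\''
function shellQuote(value: string): string {
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

// Hypothetical result type for a grep run.
type GrepOutcome =
  | { kind: "matches"; output: string }
  | { kind: "no-matches" }
  | { kind: "error"; status: number | null };

// Hypothetical helper: map grep's exit status and stdout to an outcome.
function interpretGrepExit(status: number | null, stdout: string): GrepOutcome {
  const output = stdout.trim();
  if (status === 0) return { kind: "matches", output };
  if (status === 1) return { kind: "no-matches" }; // not a failure
  // A status above 1 can still carry partial matches (e.g. one unreadable file).
  if (output) return { kind: "matches", output };
  return { kind: "error", status };
}

console.log(shellQuote("it's a TODO")); // 'it'\''s a TODO'
console.log(interpretGrepExit(1, "")); // { kind: 'no-matches' }
```

Because execSync throws on any non-zero exit, a check like this belongs in the catch branch, where the thrown error carries the status and whatever stdout was captured.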
Watch the wrong tool win
Start with a minimal description and see how badly it goes:
const grep = tool({
  description: "Search files.",
  inputSchema: z.object({
    pattern: z.string(),
    glob: z.string().optional(),
  }),
  // ... execute with execSync grep
});
Now ask the agent:
bun run index.ts . "Find all TODO comments in this project"
The model ignores grep and reaches for read, opening random files and hoping one has a TODO. If you've already added bash, it'll try that instead. A two-word description doesn't give the model anything to work with, so it guesses.
This is the first time tool selection matters. It's also the first time it breaks.
Descriptions are prompts
The fix isn't a better implementation. It's a better description:
const grep = tool({
  description: `Search file contents using regex. Returns matching lines with file paths.
WHEN TO USE: finding patterns across multiple files, locating function definitions,
searching for imports, finding TODOs or error messages.
WHEN NOT TO USE: reading a known file (use read instead).
DO NOT USE FOR: running commands, listing directories.
EXAMPLES:
- Find all TODO comments: pattern "TODO" glob "*.ts"
- Find function definitions: pattern "function \\w+" glob "*.ts"`,
  inputSchema: z.object({
    pattern: z.string().describe("Regex pattern to search for"),
    path: z.string().optional().describe("Directory to search (default: working dir)"),
    glob: z.string().optional().describe("File glob filter, e.g. '*.ts'"),
  }),
  execute: async ({ pattern, path: searchPath, glob: globFilter }) => {
    const dir = resolve(cwd, searchPath || ".");
    // Escape single quotes so each value survives shell single-quoting.
    const escapedPattern = pattern.replace(/'/g, `'\\''`);
    const escapedGlob = (globFilter || "*").replace(/'/g, `'\\''`);
    const escapedDir = dir.replace(/'/g, `'\\''`);
    // -e keeps a pattern that starts with "-" from being parsed as an option.
    const cmd = `grep -rn --exclude-dir=node_modules --exclude-dir=.git --include='${escapedGlob}' -E -e '${escapedPattern}' '${escapedDir}' 2>/dev/null`;
    try {
      const stdout = execSync(cmd, { encoding: "utf-8", timeout: 10_000 });
      const lines = stdout.trim().split("\n").filter(Boolean);
      const MAX_MATCHES = 50;
      const truncated = lines.length > MAX_MATCHES;
      const result = truncated ? lines.slice(0, MAX_MATCHES) : lines;
      return truncated
        ? result.join("\n") + `\n... (${lines.length} total, showing first ${MAX_MATCHES})`
        : result.join("\n") || "No matches found.";
    } catch (error: any) {
      // execSync throws on any non-zero exit. grep exits 1 when nothing matches,
      // and may exit above 1 with partial matches still on stdout, so keep those.
      const stdout = String(error?.stdout || "").trim();
      if (stdout) {
        const lines = stdout.split("\n").filter(Boolean);
        const MAX_MATCHES = 50;
        const truncated = lines.length > MAX_MATCHES;
        const result = truncated ? lines.slice(0, MAX_MATCHES) : lines;
        return truncated
          ? result.join("\n") + `\n... (${lines.length} total, showing first ${MAX_MATCHES})`
          : result.join("\n");
      }
      return "No matches found.";
    }
  },
});
Two things are happening in that description. WHEN TO USE makes the tool's job legible. WHEN NOT TO USE and DO NOT USE FOR steer the model away from defaulting to whatever tool it likes best, which (in our experience) is usually bash.
Every model we've tested (Haiku, Sonnet, Opus) defaults to bash when tool descriptions are weak. Production harnesses double up the negative steering, with both WHEN NOT TO USE and DO NOT USE FOR on every tool, because saying it once isn't enough. The pattern in this lesson comes from a real harness after watching the agent consistently pick wrong.
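One way to keep the contract consistent across tools is to build each description from structured fields. The sketch below is illustrative only: ToolContract and buildDescription are hypothetical names, not part of the AI SDK or the lesson's code, and the output is simply the same text you would otherwise write by hand.

```typescript
// Hypothetical shape for the four-section contract used in this lesson.
interface ToolContract {
  summary: string;
  whenToUse: string[];
  whenNotToUse: string[];
  doNotUseFor: string[];
  examples?: string[];
}

// Hypothetical helper: render the contract as a plain-text description.
function buildDescription(c: ToolContract): string {
  const parts = [
    c.summary,
    `WHEN TO USE: ${c.whenToUse.join(", ")}.`,
    `WHEN NOT TO USE: ${c.whenNotToUse.join(", ")}.`,
    `DO NOT USE FOR: ${c.doNotUseFor.join(", ")}.`,
  ];
  // EXAMPLES is optional; it is only emitted when at least one is given.
  if (c.examples?.length) {
    parts.push("EXAMPLES:", ...c.examples.map((e) => `- ${e}`));
  }
  return parts.join("\n");
}

const readDescription = buildDescription({
  summary: "Read a file from the project. Returns numbered lines.",
  whenToUse: ["viewing file contents", "checking configs", "reading source code"],
  whenNotToUse: ["searching across files (use grep instead)"],
  doNotUseFor: ["running commands", "listing directories"],
});
console.log(readDescription);
```

Making the negative sections required fields means a new tool cannot be added without someone deciding what it should not be used for.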
Now do the same treatment on read. The description on read has to push back against grep just as hard:
const read = tool({
  description: `Read a file from the project. Returns numbered lines.
WHEN TO USE: viewing file contents, checking configs, reading source code.
WHEN NOT TO USE: searching across files (use grep instead).
DO NOT USE FOR: running commands, listing directories.`,
  // ... rest unchanged
});
Why a 50-match cap
grep gets the same context-conscious treatment as read. Without a cap, a search for import in a large codebase floods the context with hundreds of lines of imports the agent doesn't need. 50 matches is enough to answer the question. 500 is pollution.
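The cap itself is easy to check in isolation. The following is a minimal sketch, assuming grep's output arrives as newline-separated match lines; capMatches is a hypothetical helper name, and the 120 fake hits stand in for a real search.

```typescript
const MAX_MATCHES = 50;

// Hypothetical helper: keep the first `max` match lines and report the total.
function capMatches(stdout: string, max: number = MAX_MATCHES): string {
  const lines = stdout.split("\n").filter(Boolean);
  if (lines.length === 0) return "No matches found.";
  if (lines.length <= max) return lines.join("\n");
  return (
    lines.slice(0, max).join("\n") +
    `\n... (${lines.length} total, showing first ${max})`
  );
}

// Simulate 120 grep hits to see the cap in action.
const fakeOutput = Array.from(
  { length: 120 },
  (_, i) => `src/file${i}.ts:1:import x from "y";`,
).join("\n");

const capped = capMatches(fakeOutput);
console.log(capped.split("\n").length); // 51: 50 matches plus the summary line
console.log(capped.split("\n").at(-1)); // ... (120 total, showing first 50)
```

Reporting the total alongside the truncated list tells the model that more matches exist, so it can narrow the pattern or glob instead of assuming it has seen everything.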
Try It
Run the search prompt with the rewritten descriptions:
bun run index.ts . "Find all TODO comments in this project"
You should see the model call grep directly, with a pattern like TODO and a glob like *.ts. Seed a couple of // TODO: comments in a small file so the result is obvious. Excluding node_modules keeps the output focused on your code, not the dependencies.
Now run the file-inspection prompt:
bun run index.ts . "Read the tsconfig.json"
This still uses read, not grep. The descriptions steer the model in both directions. Search prompts pull toward grep. Known-file prompts pull toward read.
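Routing can also be checked programmatically rather than by reading the console. The sketch below assumes each entry in the steps array returned by agent.generate exposes a toolCalls list whose items carry a toolName; verify that shape against the SDK version you have installed. StepLike, toolsCalled, and routedTo are hypothetical names, and the sample data is hand-written, not real agent output.

```typescript
// Assumed minimal shape of a step record; only the fields used here.
interface StepLike {
  toolCalls: { toolName: string }[];
}

// Flatten every tool call across all steps into a list of tool names.
function toolsCalled(steps: StepLike[]): string[] {
  return steps.flatMap((s) => s.toolCalls.map((c) => c.toolName));
}

// True when the expected tool was called and none of the forbidden ones were.
function routedTo(steps: StepLike[], expected: string, forbidden: string[]): boolean {
  const called = toolsCalled(steps);
  return called.includes(expected) && !called.some((name) => forbidden.includes(name));
}

// Hand-written stand-in for the steps of a "Find all TODO comments" run.
const sampleSteps: StepLike[] = [
  { toolCalls: [{ toolName: "grep" }] },
  { toolCalls: [] }, // final text-only step
];

console.log(routedTo(sampleSteps, "grep", ["read", "bash"])); // true
console.log(routedTo(sampleSteps, "read", ["grep"])); // false
```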
Then confirm the project still type-checks:
npx tsc --noEmit
A real codebase has too many matches to eyeball. Drop two // TODO: comments in a small file you control, then run the search. The point is to verify routing, not to discover bugs. Make the test obvious.
Commit
git add index.ts
git commit -m "feat(tools): add grep with behavioral description contract"
Done-When
- "Find all TODO comments" calls grep, not read or bash
- "Read the tsconfig.json" calls read, not grep
- grep caps at 50 matches and reports the total when truncated
- Both read and grep use WHEN TO USE, WHEN NOT TO USE, and DO NOT USE FOR
- npx tsc --noEmit passes
Start weakening grep's description one section at a time. Drop EXAMPLES first. Then DO NOT USE FOR. Then WHEN NOT TO USE. At what point does the model switch back to bash or read? Does the threshold change between Haiku, Sonnet, and Opus?
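To run that experiment systematically, you could generate the weakened descriptions up front. This is a sketch under assumptions: weakenedVariants is a hypothetical helper, the section texts are shortened versions of the lesson's grep description, and actually running each variant against a model is left to you.

```typescript
interface Section {
  name: string;
  text: string;
}

// Ordered so that the exercise's drop order is "last section first".
const sections: Section[] = [
  { name: "SUMMARY", text: "Search file contents using regex. Returns matching lines with file paths." },
  { name: "WHEN TO USE", text: "WHEN TO USE: finding patterns across multiple files." },
  { name: "WHEN NOT TO USE", text: "WHEN NOT TO USE: reading a known file (use read instead)." },
  { name: "DO NOT USE FOR", text: "DO NOT USE FOR: running commands, listing directories." },
  { name: "EXAMPLES", text: 'EXAMPLES:\n- Find all TODO comments: pattern "TODO" glob "*.ts"' },
];

// Hypothetical helper: variant k drops the last k sections (k = 0..droppable).
function weakenedVariants(all: Section[], droppable: number) {
  return Array.from({ length: droppable + 1 }, (_, k) => ({
    dropped: all.slice(all.length - k).map((s) => s.name),
    description: all.slice(0, all.length - k).map((s) => s.text).join("\n"),
  }));
}

// Drop EXAMPLES, then DO NOT USE FOR, then WHEN NOT TO USE.
for (const v of weakenedVariants(sections, 3)) {
  console.log(`dropped: [${v.dropped.join(", ")}]`);
}
```

Each variant's description can then be swapped into the grep tool and run against the same prompt on each model, recording which tool gets called.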
Solution
import { ToolLoopAgent, stepCountIs, tool } from "ai";
import { z } from "zod";
import { readFileSync } from "node:fs";
import { resolve } from "node:path";
import { execSync } from "node:child_process";

// Working directory comes from the first CLI argument, else the current directory.
const cwd = resolve(process.argv[2] || process.cwd());

const read = tool({
  description: `Read a file from the project. Returns numbered lines.
WHEN TO USE: viewing file contents, checking configs, reading source code.
WHEN NOT TO USE: searching across files (use grep instead).
DO NOT USE FOR: running commands, listing directories.`,
  inputSchema: z.object({
    path: z.string().describe("File path relative to working directory"),
    offset: z.number().optional().describe("Start line (1-indexed)"),
    limit: z.number().optional().describe("Max lines to return"),
  }),
  execute: async ({ path: filePath, offset, limit }) => {
    const abs = resolve(cwd, filePath);
    const content = readFileSync(abs, "utf-8");
    let lines = content.split("\n");
    if (offset) lines = lines.slice(offset - 1);
    if (limit) lines = lines.slice(0, limit);
    // Hard cap so a huge file cannot flood the context.
    const MAX_LINES = 500;
    const truncated = lines.length > MAX_LINES;
    if (truncated) lines = lines.slice(0, MAX_LINES);
    // Number lines from the requested offset so they match the file.
    const numbered = lines.map((l, i) => `${(offset || 1) + i}: ${l}`);
    return truncated
      ? numbered.join("\n") + `\n... (truncated at ${MAX_LINES} lines)`
      : numbered.join("\n");
  },
});
const grep = tool({
  description: `Search file contents using regex. Returns matching lines with file paths.
WHEN TO USE: finding patterns across multiple files, locating function definitions,
searching for imports, finding TODOs or error messages.
WHEN NOT TO USE: reading a known file (use read instead).
DO NOT USE FOR: running commands, listing directories.
EXAMPLES:
- Find all TODO comments: pattern "TODO" glob "*.ts"
- Find function definitions: pattern "function \\w+" glob "*.ts"`,
  inputSchema: z.object({
    pattern: z.string().describe("Regex pattern to search for"),
    path: z.string().optional().describe("Directory to search (default: working dir)"),
    glob: z.string().optional().describe("File glob filter, e.g. '*.ts'"),
  }),
  execute: async ({ pattern, path: searchPath, glob: globFilter }) => {
    const dir = resolve(cwd, searchPath || ".");
    // Escape single quotes so each value survives shell single-quoting.
    const escapedPattern = pattern.replace(/'/g, `'\\''`);
    const escapedGlob = (globFilter || "*").replace(/'/g, `'\\''`);
    const escapedDir = dir.replace(/'/g, `'\\''`);
    // -e keeps a pattern that starts with "-" from being parsed as an option.
    const cmd = `grep -rn --exclude-dir=node_modules --exclude-dir=.git --include='${escapedGlob}' -E -e '${escapedPattern}' '${escapedDir}' 2>/dev/null`;
    try {
      const stdout = execSync(cmd, { encoding: "utf-8", timeout: 10_000 });
      const lines = stdout.trim().split("\n").filter(Boolean);
      const MAX_MATCHES = 50;
      const truncated = lines.length > MAX_MATCHES;
      const result = truncated ? lines.slice(0, MAX_MATCHES) : lines;
      return truncated
        ? result.join("\n") + `\n... (${lines.length} total, showing first ${MAX_MATCHES})`
        : result.join("\n") || "No matches found.";
    } catch (error: any) {
      // execSync throws on any non-zero exit. grep exits 1 when nothing matches,
      // and may exit above 1 with partial matches still on stdout, so keep those.
      const stdout = String(error?.stdout || "").trim();
      if (stdout) {
        const lines = stdout.split("\n").filter(Boolean);
        const MAX_MATCHES = 50;
        const truncated = lines.length > MAX_MATCHES;
        const result = truncated ? lines.slice(0, MAX_MATCHES) : lines;
        return truncated
          ? result.join("\n") + `\n... (${lines.length} total, showing first ${MAX_MATCHES})`
          : result.join("\n");
      }
      return "No matches found.";
    }
  },
});
const agent = new ToolLoopAgent({
  model: "anthropic/claude-haiku-4-5",
  instructions: `You are a coding agent.\nWorking directory: ${cwd}`,
  tools: { read, grep },
  stopWhen: stepCountIs(10),
});

// Everything after the working-directory argument is the prompt.
const prompt = process.argv.slice(3).join(" ") || "Hello!";
const { text, steps } = await agent.generate({ prompt });
console.log(text);
console.log(`\n(${steps.length} steps)`);