Vercel Logo

The Problem

Your agent has been getting away with something.

Through Modules 1 to 4, you've been running short tasks. Read a file, find a TODO, list the directory. Five steps, ten at most. The context window stays comfortable, the agent stays sharp, and everything is fine.

This is the part where we turn around and look at what happens on a real task. Twenty steps. Thirty. The kind of task where the agent reads three files, searches the codebase twice, runs a test, fixes a thing, runs the test again, and writes a summary.

Before we fix the problem, we need to see it. The fix is small. The seeing is the hard part.

Outcome

onStepFinish logs input and output tokens at every step. Running a multi-step task makes context growth visible: input tokens climb linearly while output stays roughly flat.

Fast Track

  1. Add an onStepFinish callback to ToolLoopAgent
  2. Log usage.inputTokens and usage.outputTokens per step
  3. Run a multi-step task and watch the input number climb

Hands-on Exercise 5.1

Wire token logging and run a task that does enough work to show the curve.

Requirements:

  1. Pass onStepFinish: ({ usage, stepNumber }) => { ... } to ToolLoopAgent
  2. Log to console.error (so it shows up next to the agent's normal output but stays out of stdout)
  3. Run a prompt that forces 4+ tool calls so the curve is visible
  4. Read the numbers. Don't try to fix anything yet

Implementation hints:

  • onStepFinish runs after every step, not just successful ones. The usage field is the relevant one
  • Use console.error for telemetry. console.log mixes with the agent's response and gets ugly
  • The prompt needs to do real work. "Read package.json" is one step. "Read package.json, then tsconfig, then the entry point, then summarize" is four

Add the logging

index.ts
const agent = new ToolLoopAgent({
  // ... existing config
  onStepFinish: ({ usage, stepNumber }) => {
    console.error(
      `Step ${stepNumber}: ${usage.inputTokens} input, ${usage.outputTokens} output`,
    );
  },
});

That's all of the instrumentation. The SDK calls this for you after every step.

Run a task that hurts

Terminal
bun run index.ts . "Read package.json, then tsconfig.json, then index.ts, then summarize everything"

You should see something like:

Step 0: 1,200 input, 450 output
Step 1: 2,800 input, 200 output
Step 2: 4,100 input, 180 output
Step 3: 8,900 input, 350 output
Step 4: 9,200 input, 600 output

The exact numbers will be different in your project. The shape won't be. Input tokens climb every step. Output tokens stay roughly flat.

Why input tokens grow

Every step sends the entire message history to the model. The user prompt. The system prompt. Every tool call the agent has made. Every tool result the agent has received.

The package.json from step 1 is still in context at step 4, even though the agent is done with it. The tsconfig from step 2 is still there. Nothing leaves on its own.

ComponentTokensBehavior
System prompt~500Fixed, sent every call
Each tool result200 to 2,000Stays in history forever
After 20 tool calls4,000 to 40,000Linearly accumulating

The context window is 200,000 tokens. A busy agent that reads big files can hit it in 30 to 50 steps. When it does:

  • The instructions at the top get pushed out of attention
  • The model starts ignoring its own system prompt
  • Tool selection degrades
  • The agent loops or hallucinates

What doesn't work

Three tempting non-fixes:

  • Hoping it doesn't happen. It always happens on real tasks
  • Reducing step count. Ten steps is too few for real work. Fifty is normal
  • Using a bigger model. A bigger context window delays the problem, doesn't solve it. And it costs more per token

The fix is to take old tool results out of the message history before they overflow attention. Lesson 5.2 does exactly that.

Telemetry first, fix second

This lesson doesn't fix anything. That's deliberate. You can't tell whether a fix worked unless you measured the problem first. Token logging stays in for the rest of the module so you can compare before and after at every step.

Try It

Run the multi-step task. Look at the numbers. Confirm:

  1. Input tokens grow with every step
  2. Output tokens stay relatively small
  3. By the last step, you're sending more than three times the tokens of step 0
Terminal
bun run index.ts . "Read package.json, then tsconfig.json, then index.ts, then summarize everything"

Try a longer task and watch the curve get steeper:

Terminal
bun run index.ts . "Read every .ts file in src/, then tell me what each one does"
Terminal
npx tsc --noEmit

Commit

git add index.ts
git commit -m "feat(telemetry): log token usage per step"

Done-When

  • onStepFinish is wired to log usage.inputTokens and usage.outputTokens
  • A 4+ step task shows input tokens climbing
  • Output tokens stay relatively flat across steps
  • The logging goes to console.error, not console.log
  • npx tsc --noEmit passes
Plot the curve

Logging numbers to stderr is fine. Logging them to a CSV is better. Append step,inputTokens,outputTokens to a file in onStepFinish, then load it into your spreadsheet of choice. The shape of the curve tells you whether your task is read-heavy (steep climb) or compute-heavy (gentler climb). Same agent, different curve, depending on what it's doing.

Solution

index.ts
const agent = new ToolLoopAgent({
  model: "anthropic/claude-haiku-4-5",
  instructions: buildSystemPrompt({
    workingDirectory: cwd,
    sandboxType: sandbox.type,
    toolNames: Object.keys(tools),
    projectContext,
  }),
  tools,
  stopWhen: stepCountIs(15),
  onStepFinish: ({ usage, stepNumber }) => {
    console.error(
      `Step ${stepNumber}: ${usage.inputTokens} input, ${usage.outputTokens} output`,
    );
  },
});

Was this helpful?

supported.