
What is agent()?

agent() turns high-level tasks into fully autonomous browser workflows:

await agent.execute("apply for a job at browserbase")

You can customize the agent by specifying the LLM provider and model, setting custom instructions for its behavior, and configuring the maximum number of steps it may take.

Using agent()

There are two ways to create agents in Stagehand:
  1. Use a Computer Use Agent
  2. Use Agent with any LLM (non-computer-use)

Feature Availability

Some advanced features are only available with non-CUA agents:
| Feature              | CUA | Non-CUA |
| -------------------- | --- | ------- |
| Basic execution      | ✅  | ✅      |
| Custom tools         | ❌  | ✅      |
| MCP integrations     | ✅  | ✅      |
| System prompt        | ✅  | ✅      |
| Streaming            | ❌  | ✅      |
| Callbacks            | ❌  | ✅      |
| Abort signal         | ❌  | ✅      |
| Message continuation | ❌  | ✅      |

Computer Use Agents

You can use specialized computer use models from Google, OpenAI, or Anthropic, with cua set to true, as shown below. To compare the performance of different computer use models, visit our evals page.
const agent = stagehand.agent({
    cua: true,
    model: {
        modelName: "google/gemini-2.5-computer-use-preview-10-2025",
        apiKey: process.env.GOOGLE_GENERATIVE_AI_API_KEY
    },
    systemPrompt: "You are a helpful assistant...",
});

await agent.execute({
    instruction: "Go to Hacker News and find the most controversial post from today, then read the top 3 comments and summarize the debate.",
    maxSteps: 20,
    highlightCursor: true
})

Use Stagehand Agent with Any LLM

Call agent() without cua: true to use any supported LLM provider and model:
Non-CUA agents are currently only supported in TypeScript.
const agent = stagehand.agent();
await agent.execute("apply for a job at Browserbase")

Available Agent Models

Check out the guide on how to use different models with Stagehand Agent.

Return value of agent()

agent.execute() returns a Promise<AgentResult> with the following structure:
{
  success: true,
  message: "The first name and email fields have been filled successfully with 'John' and '[email protected]'.",
  actions: [
    {
      type: 'ariaTree',
      reasoning: undefined,
      taskCompleted: true,
      pageUrl: 'https://example.com',
      timestamp: 1761598722055
    },
    {
      type: 'act',
      reasoning: undefined,
      taskCompleted: true,
      action: 'type "John" into the First Name textbox',
      playwrightArguments: {...},
      pageUrl: 'https://example.com',
      timestamp: 1761598731643
    },
    {
      type: 'close',
      reasoning: "The first name and email fields have been filled successfully.",
      taskCompleted: true,
      taskComplete: true,
      pageUrl: 'https://example.com',
      timestamp: 1761598732861
    }
  ],
  completed: true,
  usage: {
    input_tokens: 2040,
    output_tokens: 28,
    reasoning_tokens: 12,
    cached_input_tokens: 0,
    inference_time_ms: 14079
  }
}
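Because execute() resolves to a plain object, the result can be inspected programmatically, e.g. to pull out the actions the agent took or total token usage. A minimal sketch using a stand-in object rather than a live agent run (the interfaces below are illustrative, not Stagehand's exported types):

```typescript
// Illustrative types mirroring the AgentResult shape shown above.
interface AgentAction {
  type: string;
  taskCompleted: boolean;
  action?: string;
  pageUrl: string;
  timestamp: number;
}

interface AgentResult {
  success: boolean;
  message: string;
  actions: AgentAction[];
  completed: boolean;
  usage: { input_tokens: number; output_tokens: number };
}

// Summarize a result: count the concrete 'act' steps and total tokens.
function summarize(result: AgentResult): string {
  const acts = result.actions.filter((a) => a.type === "act" && a.action);
  const tokens = result.usage.input_tokens + result.usage.output_tokens;
  return `${result.success ? "OK" : "FAILED"}: ${acts.length} action(s), ${tokens} tokens`;
}

// Stand-in for a real execute() result, modeled on the example above.
const sample: AgentResult = {
  success: true,
  message: "Form filled.",
  actions: [
    { type: "ariaTree", taskCompleted: true, pageUrl: "https://example.com", timestamp: 1 },
    { type: "act", taskCompleted: true, action: 'type "John" into the First Name textbox', pageUrl: "https://example.com", timestamp: 2 },
    { type: "close", taskCompleted: true, pageUrl: "https://example.com", timestamp: 3 },
  ],
  completed: true,
  usage: { input_tokens: 2040, output_tokens: 28 },
};

console.log(summarize(sample)); // → "OK: 1 action(s), 2068 tokens"
```

The same pattern applies to a real result: check success before chaining further executions, and use usage for cost tracking.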

Custom Tools

Agents can be enhanced with custom tools for more granular control and better performance. Unlike MCP integrations, custom tools are defined inline and execute directly within your application.
Custom tools provide a cleaner, more performant alternative to MCP integrations when you need specific functionality.

Defining Custom Tools

Use the tool helper from the Vercel AI SDK to define custom tools:
import { tool } from "ai";
import { z } from "zod/v3";

const agent = stagehand.agent({
  model: "openai/gpt-5",
  tools: {
    getWeather: tool({
      description: 'Get the current weather in a location',
      inputSchema: z.object({
        location: z.string().describe('The location to get weather for'),
      }),
      execute: async ({ location }) => {
        // Your custom logic here
        const weather = await fetchWeatherAPI(location);
        return {
          location,
          temperature: weather.temp,
          conditions: weather.conditions,
        };
      },
    }),
  },
  systemPrompt: 'You are a helpful assistant with access to weather data.',
});

await agent.execute("What's the weather in San Francisco and should I bring an umbrella?");

Custom Tools vs MCP Integrations

| Custom Tools                            | MCP Integrations                 |
| --------------------------------------- | -------------------------------- |
| Defined inline with your code           | Connect to external services     |
| Direct function execution               | Standard protocol                |
| Better performance & optimized context  | Reusable across applications     |
| Type-safe with TypeScript               | Access to pre-built integrations |
| Granular control                        | Network-based communication      |
Use custom tools when you need specific functionality within your application. Use MCP integrations when connecting to external services or when you need standardized cross-application tools.

MCP Integrations

Agents can be enhanced with external tools and services through MCP (Model Context Protocol) integrations. This allows your agent to access external APIs and data sources beyond just browser interactions.
const agent = stagehand.agent({
    cua: true,
    model: {
        modelName: "openai/computer-use-preview",
        apiKey: process.env.OPENAI_API_KEY
    },
    integrations: [
      `https://mcp.exa.ai/mcp?exaApiKey=${process.env.EXA_API_KEY}`,
    ],
    systemPrompt: `You have access to web search through Exa. Use it to find current information before browsing.`
});

await agent.execute("Search for the best headphones of 2025 and go through checkout for the top recommendation");
MCP integrations enable agents to be more powerful by combining browser automation with external APIs, databases, and services. The agent can intelligently decide when to use browser actions versus external tools.

Streaming

Enable streaming mode to receive incremental responses from the agent. This is useful for building real-time UIs that show the agent’s reasoning as it progresses.
Non-CUA agents only. Streaming, callbacks, abort signals, and message continuation are only available when using the standard agent (without cua: true). These features are not supported with Computer Use Agents.
These are experimental features. Set experimental: true in your Stagehand constructor to enable them.

Enabling Streaming Mode

Set stream: true in the agent configuration to enable streaming:
const stagehand = new Stagehand({
  env: "LOCAL",
  experimental: true, // Required for streaming
});
await stagehand.init();

const agent = stagehand.agent({
  model: "anthropic/claude-sonnet-4-5-20250929",
  stream: true, // Enable streaming mode
});

const streamResult = await agent.execute({
  instruction: "Search for headphones on Amazon",
  maxSteps: 20,
});

// Stream the text output incrementally
for await (const delta of streamResult.textStream) {
  process.stdout.write(delta);
}

// Get the final result after streaming completes
const finalResult = await streamResult.result;
console.log("Completed:", finalResult.completed);

Stream Properties

When streaming is enabled, execute() returns an AgentStreamResult with:
| Property     | Type                          | Description                                          |
| ------------ | ----------------------------- | ---------------------------------------------------- |
| `textStream` | `AsyncIterable<string>`       | Incremental text output from the agent               |
| `fullStream` | `AsyncIterable<StreamPart>`   | All stream events, including tool calls and messages |
| `result`     | `Promise<AgentResult>`        | Final result after streaming completes               |
// Stream everything (tool calls, messages, etc.)
for await (const event of streamResult.fullStream) {
  console.log(event);
}
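Since textStream is just an AsyncIterable<string>, the consumption pattern can be exercised without a live agent by substituting a stand-in generator (the chunk values below are illustrative):

```typescript
// Stand-in for streamResult.textStream: any AsyncIterable<string> works.
async function* fakeTextStream(): AsyncIterable<string> {
  yield "Searching ";
  yield "for ";
  yield "headphones...";
}

// Collect deltas exactly as you would from streamResult.textStream.
async function collect(stream: AsyncIterable<string>): Promise<string> {
  let text = "";
  for await (const delta of stream) {
    text += delta; // in a real UI, render each delta incrementally
  }
  return text;
}

const full = await collect(fakeTextStream());
console.log(full); // → "Searching for headphones..."
```

Swapping fakeTextStream() for a real streamResult.textStream requires no changes to the consuming loop.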

Callbacks

Callbacks let you hook into the agent’s execution lifecycle to monitor progress, log events, or modify behavior.
Non-CUA agents only. Callbacks require experimental: true and are not available with Computer Use Agents.

Available Callbacks

When stream: false (default), these callbacks are available:
| Callback       | Description                                    |
| -------------- | ---------------------------------------------- |
| `prepareStep`  | Called before each LLM step to modify settings |
| `onStepFinish` | Called when each step completes                |
const agent = stagehand.agent({
  model: "anthropic/claude-sonnet-4-5-20250929",
});

await agent.execute({
  instruction: "Fill out the contact form",
  maxSteps: 10,
  callbacks: {
    prepareStep: async (stepContext) => {
      console.log(`Starting step ${stepContext.stepNumber}`);
      return stepContext; // Return modified or original context
    },
    onStepFinish: async (event) => {
      console.log(`Step finished: ${event.finishReason}`);
      if (event.toolCalls) {
        for (const tc of event.toolCalls) {
          console.log(`Tool called: ${tc.toolName}`);
        }
      }
    },
  },
});
Streaming-only callbacks (onChunk, onFinish, onError, onAbort) will throw an error if used without stream: true. If you need these callbacks, enable streaming in your agent configuration.
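The hook order can be seen without a browser by driving the documented callback shape from a mock step loop (this loop is illustrative, not Stagehand's internals; the StepContext and StepEvent shapes are assumptions based on the example above):

```typescript
// Simplified callback shapes: prepareStep runs before each step,
// onStepFinish after each step.
interface StepContext { stepNumber: number; }
interface StepEvent { finishReason: string; }

interface Callbacks {
  prepareStep?: (ctx: StepContext) => Promise<StepContext>;
  onStepFinish?: (event: StepEvent) => Promise<void>;
}

// Mock driver (not Stagehand internals) that exercises the hook order.
async function runMockSteps(total: number, callbacks: Callbacks): Promise<string[]> {
  const log: string[] = [];
  for (let i = 1; i <= total; i++) {
    let ctx: StepContext = { stepNumber: i };
    if (callbacks.prepareStep) ctx = await callbacks.prepareStep(ctx);
    log.push(`step ${ctx.stepNumber} start`);
    // ... the agent would act here ...
    if (callbacks.onStepFinish) await callbacks.onStepFinish({ finishReason: "stop" });
    log.push(`step ${ctx.stepNumber} done`);
  }
  return log;
}

const log = await runMockSteps(2, {
  prepareStep: async (ctx) => ctx, // return modified or original context
  onStepFinish: async (event) => console.log(event.finishReason),
});
console.log(log.length); // 4 entries: start/done for each of 2 steps
```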

Abort Signal

Cancel agent execution at any time using an AbortSignal. This is useful for implementing timeouts or allowing users to stop long-running tasks.
Non-CUA agents only. Abort signals require experimental: true and are not available with Computer Use Agents.

Basic Usage

const stagehand = new Stagehand({
  env: "LOCAL",
  experimental: true, // Required for abort signal
});
await stagehand.init();

const agent = stagehand.agent({
  model: "anthropic/claude-sonnet-4-5-20250929",
});

const controller = new AbortController();

// Set a 30 second timeout
setTimeout(() => controller.abort(), 30000);

try {
  const result = await agent.execute({
    instruction: "Complete a complex multi-step task",
    maxSteps: 50,
    signal: controller.signal,
  });
} catch (error) {
  if (error.name === "AgentAbortError") {
    console.log("Task was cancelled");
  }
}

Abort with Streaming

Abort signals also work with streaming mode:
const agent = stagehand.agent({
  model: "anthropic/claude-sonnet-4-5-20250929",
  stream: true,
});

const controller = new AbortController();

const streamResult = await agent.execute({
  instruction: "Describe every element on the page",
  maxSteps: 50,
  signal: controller.signal,
  callbacks: {
    onAbort: (event) => {
      console.log(`Aborted after ${event.steps.length} steps`);
    },
  },
});

// Abort after receiving 10 chunks
let chunkCount = 0;
for await (const delta of streamResult.textStream) {
  process.stdout.write(delta);
  chunkCount++;
  if (chunkCount >= 10) {
    controller.abort();
    break;
  }
}

// The result promise will reject with AgentAbortError
try {
  await streamResult.result;
} catch (error) {
  console.log("Stream was aborted:", error.message);
}

Custom Abort Reasons

You can pass a reason when aborting:
controller.abort("User cancelled the operation");

// The error message will include your reason
// Error: "User cancelled the operation"
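The reason travels through the standard AbortSignal API (signal.reason), which is plain Web/Node behavior rather than anything Stagehand-specific. A self-contained sketch with a stand-in task in place of agent.execute():

```typescript
const controller = new AbortController();

// A long-running task that honors the signal, analogous to how an
// agent execution can be cancelled mid-run.
async function longTask(signal: AbortSignal): Promise<string> {
  for (let step = 0; step < 100; step++) {
    if (signal.aborted) {
      // signal.reason carries whatever was passed to abort()
      throw new Error(String(signal.reason));
    }
    await new Promise((r) => setTimeout(r, 1));
  }
  return "done";
}

setTimeout(() => controller.abort("User cancelled the operation"), 10);

try {
  await longTask(controller.signal);
} catch (err) {
  console.log((err as Error).message); // → "User cancelled the operation"
}
```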

Message Continuation

Continue a conversation across multiple agent executions by passing the messages from a previous result. This is useful for multi-turn interactions or breaking complex tasks into steps while maintaining context.
Non-CUA agents only. Message continuation requires experimental: true and is not available with Computer Use Agents.

Basic Continuation

const stagehand = new Stagehand({
  env: "LOCAL",
  experimental: true, // Required for message continuation
});
await stagehand.init();

const agent = stagehand.agent({
  model: "anthropic/claude-sonnet-4-5-20250929",
});

const page = stagehand.context.pages()[0];
await page.goto("https://example.com/products");

// First execution: search for products
const firstResult = await agent.execute({
  instruction: "Search for wireless headphones and note the top 3 results",
  maxSteps: 10,
});

console.log("First task:", firstResult.message);

// Continue with the same context: ask follow-up
const secondResult = await agent.execute({
  instruction: "Now filter by price under $100 and tell me which of those 3 are still available",
  maxSteps: 10,
  messages: firstResult.messages, // Pass previous conversation
});

console.log("Follow-up:", secondResult.message);

// Continue further: take action based on conversation history
const thirdResult = await agent.execute({
  instruction: "Add the cheapest one to the cart",
  maxSteps: 10,
  messages: secondResult.messages, // Chain the conversation
});

console.log("Final action:", thirdResult.message);

Agent Execution Configuration

Stagehand uses a 1288x711 viewport by default; other viewport sizes may reduce performance. If you need a different viewport, change it in the browser configuration.
Control the maximum number of steps the agent can take to complete the task using the maxSteps parameter.
// Set maxSteps to control how many actions the agent can take
await agent.execute({
  instruction: "Sign me up for a library card",
  maxSteps: 15 // Agent will stop after 15 steps if task isn't complete
});

Best Practices

Following these best practices will improve your agent’s success rate, reduce execution time, and minimize unexpected errors during task completion.

Start on the Right Page

Navigate to your target page before executing tasks:
await page.goto('https://github.com/browserbase/stagehand');
await agent.execute('Get me the latest PR on the stagehand repo');

Be Specific

Provide detailed instructions for better results:
await agent.execute("Find Italian restaurants in Brooklyn that are open after 10pm and have outdoor seating");

Troubleshooting

Problem: Agent stops before finishing the requested task

Solutions:
  • Check if the agent is hitting the maxSteps limit (default is 20)
  • Increase maxSteps for complex tasks: maxSteps: 30 or higher
  • Break very complex tasks into smaller sequential executions
// Increase maxSteps for complex tasks
await agent.execute({
  instruction: "Complete the multi-page registration form with all required information",
  maxSteps: 40 // Increased limit for complex task
});

// Or break into smaller tasks with success checking
const firstResult = await agent.execute({
  instruction: "Fill out page 1 of the registration form", 
  maxSteps: 15
});

// Only proceed if the first task was successful
if (firstResult.success === true) {
  await agent.execute({
    instruction: "Navigate to page 2 and complete remaining fields",
    maxSteps: 15
  });
} else {
  console.log("First task failed, stopping execution");
}
Problem: Agent clicks on wrong elements or fails to interact with the correct UI components

Solutions:
  • Ensure proper viewport size: Stagehand uses 1288x711 by default (optimal for Computer Use models)
  • Avoid changing viewport dimensions as other sizes may reduce performance
