await agent.execute("apply for a job at browserbase")
agent turns high-level tasks into fully autonomous browser workflows. You can customize the agent by specifying the LLM provider and model, setting custom instructions for its behavior, and configuring the maximum number of steps.
You can use specialized computer use models from Google, OpenAI, or Anthropic, as shown below, with cua set to true. To compare the performance of different computer use models, visit our evals page.
const agent = stagehand.agent({
  cua: true,
  model: {
    modelName: "google/gemini-2.5-computer-use-preview-10-2025",
    apiKey: process.env.GOOGLE_GENERATIVE_AI_API_KEY,
  },
  systemPrompt: "You are a helpful assistant...",
});

await agent.execute({
  instruction: "Go to Hacker News and find the most controversial post from today, then read the top 3 comments and summarize the debate.",
  maxSteps: 20,
  highlightCursor: true,
});
Agents can be enhanced with custom tools for more granular control and better performance. Unlike MCP integrations, custom tools are defined inline and execute directly within your application.
Custom tools provide a cleaner, more performant alternative to MCP integrations when you need specific functionality.
Use the tool helper from the Vercel AI SDK to define custom tools:
import { tool } from "ai";
import { z } from "zod/v3";

const agent = stagehand.agent({
  model: "openai/gpt-5",
  tools: {
    getWeather: tool({
      description: 'Get the current weather in a location',
      inputSchema: z.object({
        location: z.string().describe('The location to get weather for'),
      }),
      execute: async ({ location }) => {
        // Your custom logic here
        const weather = await fetchWeatherAPI(location);
        return {
          location,
          temperature: weather.temp,
          conditions: weather.conditions,
        };
      },
    }),
  },
  systemPrompt: 'You are a helpful assistant with access to weather data.',
});

await agent.execute("What's the weather in San Francisco and should I bring an umbrella?");
Use custom tools when you need specific functionality within your application. Use MCP integrations when connecting to external services or when you need standardized cross-application tools.
Agents can be enhanced with external tools and services through MCP (Model Context Protocol) integrations. This allows your agent to access external APIs and data sources beyond just browser interactions.
const agent = stagehand.agent({
  cua: true,
  model: {
    modelName: "openai/computer-use-preview",
    apiKey: process.env.OPENAI_API_KEY,
  },
  integrations: [
    `https://mcp.exa.ai/mcp?exaApiKey=${process.env.EXA_API_KEY}`,
  ],
  systemPrompt: `You have access to web search through Exa. Use it to find current information before browsing.`,
});

await agent.execute("Search for the best headphones of 2025 and go through checkout for the top recommendation");
MCP integrations enable agents to be more powerful by combining browser automation with external APIs, databases, and services. The agent can intelligently decide when to use browser actions versus external tools.
Enable streaming mode to receive incremental responses from the agent. This is useful for building real-time UIs that show the agent’s reasoning as it progresses.
Non-CUA agents only. Streaming, callbacks, abort signals, and message continuation are only available when using the standard agent (without cua: true). These features are not supported with Computer Use Agents.
These are experimental features. Set experimental: true in your Stagehand constructor to enable them.
Streaming-only callbacks (onChunk, onFinish, onError, onAbort) will throw an error if used without stream: true. If you need these callbacks, enable streaming in your agent configuration.
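The constraints above can be checked before an agent is created. The following is a minimal sketch, not part of Stagehand: checkStreamingOptions and the flat AgentOptionsSketch shape are hypothetical, and only the option names (cua, stream, onChunk, onFinish, onError, onAbort) come from the text above. Where Stagehand actually accepts these callbacks may differ.

```typescript
// Hypothetical shapes: only the option names come from the docs above.
type StreamCallbacks = {
  onChunk?: (chunk: unknown) => void;
  onFinish?: (result: unknown) => void;
  onError?: (error: unknown) => void;
  onAbort?: () => void;
};

type AgentOptionsSketch = StreamCallbacks & {
  cua?: boolean;
  stream?: boolean;
};

const STREAM_ONLY_CALLBACKS = ["onChunk", "onFinish", "onError", "onAbort"] as const;

// Returns a list of human-readable problems. An empty list means the
// options are consistent with the constraints described above.
function checkStreamingOptions(options: AgentOptionsSketch): string[] {
  const problems: string[] = [];
  const used = STREAM_ONLY_CALLBACKS.filter((name) => options[name] !== undefined);

  // Streaming-only callbacks need stream: true.
  if (used.length > 0 && options.stream !== true) {
    problems.push(`${used.join(", ")} require stream: true`);
  }
  // Streaming features are not available with Computer Use Agents.
  if (options.cua === true && (options.stream === true || used.length > 0)) {
    problems.push("streaming is not supported with cua: true");
  }
  return problems;
}
```

Running a check like this at startup surfaces a misconfiguration as a readable message instead of a runtime error from the agent.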
Continue a conversation across multiple agent executions by passing the messages from a previous result. This is useful for multi-turn interactions or breaking complex tasks into steps while maintaining context.
Non-CUA agents only. Message continuation requires experimental: true and is not available with Computer Use Agents.
const stagehand = new Stagehand({
  env: "LOCAL",
  experimental: true, // Required for message continuation
});
await stagehand.init();

const agent = stagehand.agent({
  model: "anthropic/claude-sonnet-4-5-20250929",
});

const page = stagehand.context.pages()[0];
await page.goto("https://example.com/products");

// First execution: search for products
const firstResult = await agent.execute({
  instruction: "Search for wireless headphones and note the top 3 results",
  maxSteps: 10,
});
console.log("First task:", firstResult.message);

// Continue with the same context: ask follow-up
const secondResult = await agent.execute({
  instruction: "Now filter by price under $100 and tell me which of those 3 are still available",
  maxSteps: 10,
  messages: firstResult.messages, // Pass previous conversation
});
console.log("Follow-up:", secondResult.message);

// Continue further: take action based on conversation history
const thirdResult = await agent.execute({
  instruction: "Add the cheapest one to the cart",
  maxSteps: 10,
  messages: secondResult.messages, // Chain the conversation
});
console.log("Final action:", thirdResult.message);
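When a conversation has more than a few turns, the chaining pattern above can be wrapped in a loop. This is a rough sketch under assumptions: runConversation and the AgentLike, ExecuteOptions, and ExecuteResult types are hypothetical, written structurally around the fields used in the example above (instruction, maxSteps, messages, message) rather than imported from Stagehand.

```typescript
// Minimal structural types covering only the fields used in the example above.
interface ExecuteOptions {
  instruction: string;
  maxSteps?: number;
  messages?: unknown[];
}

interface ExecuteResult {
  message: string;
  messages?: unknown[];
}

interface AgentLike {
  execute(options: ExecuteOptions): Promise<ExecuteResult>;
}

// Hypothetical helper: runs the instructions in order and passes each
// result's messages into the next call, so every turn sees the history.
async function runConversation(
  agent: AgentLike,
  instructions: string[],
  maxSteps = 10,
): Promise<ExecuteResult[]> {
  const results: ExecuteResult[] = [];
  let history: unknown[] | undefined;

  for (const instruction of instructions) {
    const options: ExecuteOptions = { instruction, maxSteps };
    if (history !== undefined) {
      options.messages = history; // Omitted on the first turn
    }
    const result = await agent.execute(options);
    results.push(result);
    // Keep the previous history if a result carries no messages.
    history = result.messages ?? history;
  }
  return results;
}
```

Because the helper only depends on the execute method, an agent created with stagehand.agent(...) should be usable wherever its result matches these shapes, and a stub can stand in for it in unit tests.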
Stagehand uses a 1288x711 viewport by default. Other viewport sizes may reduce performance. If you need to modify the viewport, you can change it in the Browser Configuration.
Control the maximum number of steps the agent can take to complete the task using the maxSteps parameter.
// Set maxSteps to control how many actions the agent can take
await agent.execute({
  instruction: "Sign me up for a library card",
  maxSteps: 15, // Agent will stop after 15 steps if task isn't complete
});
Problem: Agent stops before finishing the requested task
Solutions:
Check if the agent is hitting the maxSteps limit (default is 20)
Increase maxSteps for complex tasks: maxSteps: 30 or higher
Break very complex tasks into smaller sequential executions
// Increase maxSteps for complex tasks
await agent.execute({
  instruction: "Complete the multi-page registration form with all required information",
  maxSteps: 40, // Increased limit for complex task
});

// Or break into smaller tasks with success checking
const firstResult = await agent.execute({
  instruction: "Fill out page 1 of the registration form",
  maxSteps: 15,
});

// Only proceed if the first task was successful
if (firstResult.success === true) {
  await agent.execute({
    instruction: "Navigate to page 2 and complete remaining fields",
    maxSteps: 15,
  });
} else {
  console.log("First task failed, stopping execution");
}
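For tasks with more than two parts, the success check above generalizes to a loop that stops at the first failed stage. The sketch below is illustrative only: runStages, Stage, StageAgent, and StageResult are hypothetical names, and the only fields assumed are instruction, maxSteps, and success as they appear in the example above.

```typescript
// Structural types limited to the fields used in the example above.
interface StageResult {
  success: boolean;
  message?: string;
}

interface Stage {
  instruction: string;
  maxSteps: number;
}

interface StageAgent {
  execute(options: Stage): Promise<StageResult>;
}

// Hypothetical helper: runs stages in order and stops at the first one
// that does not report success, returning how many stages completed.
async function runStages(
  agent: StageAgent,
  stages: Stage[],
): Promise<{ completed: number; ok: boolean }> {
  let completed = 0;
  for (const stage of stages) {
    const result = await agent.execute(stage);
    if (result.success !== true) {
      return { completed, ok: false };
    }
    completed += 1;
  }
  return { completed, ok: true };
}
```

Giving each stage its own maxSteps keeps the budget per stage small, and the completed count shows which stage to inspect or retry when a run stops early.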
Agent is failing to click the proper elements
Problem: Agent clicks on wrong elements or fails to interact with the correct UI components
Solutions:
Ensure proper viewport size: Stagehand uses 1288x711 by default (optimal for Computer Use models)
Avoid changing viewport dimensions as other sizes may reduce performance