What is agent()?

await agent.execute("apply for a job at browserbase")
agent() turns high-level tasks into fully autonomous browser workflows. You can customize the agent by specifying the LLM provider and model, setting custom instructions for its behavior, and configuring the maximum number of steps.

Why use agent()?

Using agent()

There are two ways to create agents in Stagehand:
  1. Use a Computer Use Agent
  2. Use Agent with any LLM (non-computer-use)

Computer Use Agents

You can use specialized computer use models from Google, OpenAI, or Anthropic by setting cua to true, as shown below. To compare the performance of different computer use models, visit our evals page.
const agent = stagehand.agent({
    cua: true,
    model: {
        modelName: "google/gemini-2.5-computer-use-preview-10-2025",
        apiKey: process.env.GOOGLE_GENERATIVE_AI_API_KEY
    },
    systemPrompt: "You are a helpful assistant...",
});

await agent.execute({
    instruction: "Go to Hacker News and find the most controversial post from today, then read the top 3 comments and summarize the debate.",
    maxSteps: 20,
    highlightCursor: true
})

Use Stagehand Agent with Any LLM

Create the agent without specifying a provider to use any model or LLM provider:
Note: Non-CUA agents are currently only supported in TypeScript.
const agent = stagehand.agent();
await agent.execute("apply for a job at Browserbase")

Available Agent Models

Check out the guide on how to use different models with Stagehand Agent.

Return value of agent()

When you run an agent with agent.execute(), Stagehand returns a Promise<AgentResult> with the following structure:
{
  success: true,
  message: "The first name and email fields have been filled successfully with 'John' and 'john@example.com'.",
  actions: [
    {
      type: 'ariaTree',
      reasoning: undefined,
      taskCompleted: true,
      pageUrl: 'https://example.com',
      timestamp: 1761598722055
    },
    {
      type: 'act',
      reasoning: undefined,
      taskCompleted: true,
      action: 'type "John" into the First Name textbox',
      playwrightArguments: {...},
      pageUrl: 'https://example.com',
      timestamp: 1761598731643
    },
    {
      type: 'close',
      reasoning: "The first name and email fields have been filled successfully.",
      taskCompleted: true,
      taskComplete: true,
      pageUrl: 'https://example.com',
      timestamp: 1761598732861
    }
  ],
  completed: true,
  usage: {
    input_tokens: 2040,
    output_tokens: 28,
    inference_time_ms: 14079
  }
}
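A result like the one above can be condensed into a short summary for logging. The following is a minimal sketch, not part of Stagehand: the AgentResultLike and AgentActionLike types and the summarizeResult helper are hypothetical names that mirror only the fields shown in the sample, and the real AgentResult type may contain more.

```typescript
// Hypothetical local types that mirror only the fields shown in the sample
// result above; they are not Stagehand's exported type definitions.
interface AgentActionLike {
  type: string;
  reasoning?: string;
  pageUrl?: string;
  timestamp?: number;
}

interface AgentResultLike {
  success: boolean;
  message: string;
  actions: AgentActionLike[];
  completed: boolean;
  usage?: { input_tokens: number; output_tokens: number; inference_time_ms: number };
}

interface ResultSummary {
  ok: boolean;                          // true only if success and completed are both true
  actionCounts: Record<string, number>; // number of actions per action type
  totalTokens: number;                  // input_tokens + output_tokens (0 if usage is absent)
  finalReasoning?: string;              // reasoning of the last action that has one
}

// Hypothetical helper: reduce a result to the fields most useful in logs.
function summarizeResult(result: AgentResultLike): ResultSummary {
  const actionCounts: Record<string, number> = {};
  for (const action of result.actions) {
    actionCounts[action.type] = (actionCounts[action.type] ?? 0) + 1;
  }

  // Walk the actions from last to first and keep the first reasoning found.
  let finalReasoning: string | undefined;
  for (const action of [...result.actions].reverse()) {
    if (action.reasoning) {
      finalReasoning = action.reasoning;
      break;
    }
  }

  const totalTokens =
    (result.usage?.input_tokens ?? 0) + (result.usage?.output_tokens ?? 0);

  return {
    ok: result.success && result.completed,
    actionCounts,
    totalTokens,
    finalReasoning,
  };
}

// Usage with a trimmed copy of the sample result shown above.
const sampleResult: AgentResultLike = {
  success: true,
  message: "The first name and email fields have been filled successfully.",
  actions: [
    { type: "ariaTree", pageUrl: "https://example.com", timestamp: 1761598722055 },
    { type: "act", pageUrl: "https://example.com", timestamp: 1761598731643 },
    {
      type: "close",
      reasoning: "The first name and email fields have been filled successfully.",
      pageUrl: "https://example.com",
      timestamp: 1761598732861,
    },
  ],
  completed: true,
  usage: { input_tokens: 2040, output_tokens: 28, inference_time_ms: 14079 },
};

const summary = summarizeResult(sampleResult);
console.log(summary);
```

Treating a run as ok only when both success and completed are true is a design choice of this sketch; adjust it if your workflow treats the two flags differently.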

MCP Integrations

Agents can be enhanced with external tools and services through MCP (Model Context Protocol) integrations. This allows your agent to access external APIs and data sources beyond just browser interactions.
const agent = stagehand.agent({
    cua: true,
    model: {
        modelName: "openai/computer-use-preview",
        apiKey: process.env.OPENAI_API_KEY
    },
    integrations: [
      `https://mcp.exa.ai/mcp?exaApiKey=${process.env.EXA_API_KEY}`,
    ],
    systemPrompt: `You have access to web search through Exa. Use it to find current information before browsing.`
});

await agent.execute("Search for the best headphones of 2025 and go through checkout for the top recommendation");
MCP integrations enable agents to be more powerful by combining browser automation with external APIs, databases, and services. The agent can intelligently decide when to use browser actions versus external tools.
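Because the integration URL above embeds an API key taken from the environment, a missing variable would silently produce a URL ending in "undefined". A small guard can catch this early. This is a sketch only: buildExaMcpUrl is a hypothetical helper, not a Stagehand or Exa API, and it assumes the endpoint and the exaApiKey query parameter shown in the example above.

```typescript
// Hypothetical helper: build the Exa MCP URL used in the example above,
// failing fast when the key is missing and URL-encoding the key value.
function buildExaMcpUrl(apiKey: string | undefined): string {
  if (!apiKey || apiKey.trim() === "") {
    throw new Error("EXA_API_KEY is not set; cannot build the Exa MCP URL");
  }
  const url = new URL("https://mcp.exa.ai/mcp");
  // searchParams.set() percent-encodes characters that are unsafe in a query string.
  url.searchParams.set("exaApiKey", apiKey);
  return url.toString();
}

// Intended usage (shown as a comment because it needs a configured Stagehand instance):
//   integrations: [buildExaMcpUrl(process.env.EXA_API_KEY)]
console.log(buildExaMcpUrl("abc123"));
```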
Stagehand uses a 1288x711 viewport by default (the optimal size for Computer Use Agents). Other viewport sizes may reduce performance. If you need to modify the viewport, you can edit it in the Browser Configuration.

Agent Execution Configuration

Control the maximum number of steps the agent can take to complete the task using the maxSteps parameter.
// Set maxSteps to control how many actions the agent can take
await agent.execute({
  instruction: "Sign me up for a library card",
  maxSteps: 15 // Agent will stop after 15 steps if task isn't complete
});

Best Practices

Following these best practices will improve your agent’s success rate, reduce execution time, and minimize unexpected errors during task completion.

Start on the Right Page

Navigate to your target page before executing tasks:
await page.goto('https://github.com/browserbase/stagehand');
await agent.execute('Get me the latest PR on the stagehand repo');

Be Specific

Provide detailed instructions for better results:
await agent.execute("Find Italian restaurants in Brooklyn that are open after 10pm and have outdoor seating");

Troubleshooting

Problem: Agent stops before finishing the requested task
Solutions:
  • Check if the agent is hitting the maxSteps limit (default is 20)
  • Increase maxSteps for complex tasks: maxSteps: 30 or higher
  • Break very complex tasks into smaller sequential executions
// Increase maxSteps for complex tasks
await agent.execute({
  instruction: "Complete the multi-page registration form with all required information",
  maxSteps: 40 // Increased limit for complex task
});

// Or break into smaller tasks with success checking
const firstResult = await agent.execute({
  instruction: "Fill out page 1 of the registration form", 
  maxSteps: 15
});

// Only proceed if the first task was successful
if (firstResult.success === true) {
  await agent.execute({
    instruction: "Navigate to page 2 and complete remaining fields",
    maxSteps: 15
  });
} else {
  console.log("First task failed, stopping execution");
}
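When a task splits into more than two parts, the same stop-on-failure pattern can be written as a loop. The sketch below is illustrative: runInSequence, AgentLike, ExecuteOptions, and StepResult are hypothetical names defined here, modeled on the execute() call shape and the success field used above. It assumes a real Stagehand agent is structurally compatible with AgentLike, which you should verify against your installed version.

```typescript
// Hypothetical minimal shapes, modeled on the execute() calls shown above.
interface ExecuteOptions {
  instruction: string;
  maxSteps?: number;
}

interface StepResult {
  success: boolean;
  message: string;
}

interface AgentLike {
  execute(options: ExecuteOptions): Promise<StepResult>;
}

// Run tasks one after another and stop at the first one that reports failure.
async function runInSequence(
  agent: AgentLike,
  tasks: ExecuteOptions[],
): Promise<{ succeeded: number; results: StepResult[] }> {
  const results: StepResult[] = [];
  for (const task of tasks) {
    const result = await agent.execute(task);
    results.push(result);
    if (!result.success) {
      break; // Later tasks depend on this one, so do not continue.
    }
  }
  const succeeded = results.filter((r) => r.success).length;
  return { succeeded, results };
}

// Stand-in agent for demonstration only: it fails any instruction mentioning "page 2".
const fakeAgent: AgentLike = {
  execute: async (options) => ({
    success: !options.instruction.includes("page 2"),
    message: options.instruction,
  }),
};

const demoRun = runInSequence(fakeAgent, [
  { instruction: "Fill out page 1 of the registration form", maxSteps: 15 },
  { instruction: "Navigate to page 2 and complete remaining fields", maxSteps: 15 },
  { instruction: "Submit the form", maxSteps: 5 },
]);

demoRun.then((outcome) => {
  console.log(`${outcome.succeeded} task(s) succeeded, ${outcome.results.length} attempted`);
});
```

Keeping each task's maxSteps small bounds the cost of a failed step, and the returned results array shows which instruction stopped the run.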
Problem: Agent clicks on wrong elements or fails to interact with the correct UI components
Solutions:
  • Ensure proper viewport size: Stagehand uses 1288x711 by default (optimal for Computer Use models)
  • Avoid changing viewport dimensions as other sizes may reduce performance
