You’re likely using AI to write code, and there’s a right and wrong way to do it. This page collects rules, configs, and copy‑paste snippets that help your AI agents and assistants write performant Stagehand code quickly.

Quickstart

Add MCP servers

Configure Browserbase (Stagehand), Context7, DeepWiki, and Stagehand Docs in your MCP client.

Pin editor rules

Drop in .cursorrules and claude.md so AI agents and assistants consistently emit Stagehand patterns.

Using MCP Servers

MCP (Model Context Protocol) servers act as intermediaries that connect AI systems to external data sources and tools. These servers enable your coding assistant to access real-time information, execute tasks, and retrieve structured data to enhance code generation accuracy. The following MCP servers provide specialized access to Stagehand documentation and related resources:
Context7

Provides semantic search across documentation and codebase context. Context7 enables AI assistants to find relevant code patterns, examples, and implementation details from your project history. It maintains contextual understanding of your development workflow and can surface related solutions from previous work.

Installation:
{
  "mcpServers": {
    "context7": {
      "command": "npx",
      "args": ["-y", "@upstash/context7-mcp"]
    }
  }
}
DeepWiki

Offers deep indexing of GitHub repositories and documentation. DeepWiki allows AI agents to understand project architecture, API references, and best practices from the entire Stagehand ecosystem. It provides comprehensive knowledge about repository structure, code relationships, and development patterns.

Installation:
{
  "mcpServers": {
    "deepwiki": {
      "url": "https://mcp.deepwiki.com/mcp"
    }
  }
}
Stagehand Docs

Direct access to official Stagehand documentation. This MCP server provides AI assistants with up-to-date API references, configuration options, and usage examples for accurate code generation. Mintlify auto-generates this server from the official docs, so your AI assistant can query the current documentation.

Usage:
{
  "mcpServers": {
    "stagehand-docs": {
      "url": "https://docs.stagehand.dev/mcp"
    }
  }
}
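
Many MCP clients read a single `mcpServers` object, so the three entries above are usually merged into one file. The sketch below builds that merged object in TypeScript and prints it as JSON. The `McpServer` type and the idea of generating the config from code are illustrative assumptions, not part of any MCP client's API. Check your client's documentation for the file name it expects and whether it accepts URL-based entries.

```typescript
// Illustrative type: a server is either launched as a command or reached by URL.
type McpServer =
  | { command: string; args: string[] }
  | { url: string };

// The three server entries shown above, merged under one `mcpServers` key.
const mcpConfig: { mcpServers: Record<string, McpServer> } = {
  mcpServers: {
    context7: { command: "npx", args: ["-y", "@upstash/context7-mcp"] },
    deepwiki: { url: "https://mcp.deepwiki.com/mcp" },
    "stagehand-docs": { url: "https://docs.stagehand.dev/mcp" },
  },
};

// Serialize with two-space indentation, ready to paste into a config file.
const mcpConfigJson = JSON.stringify(mcpConfig, null, 2);
console.log(mcpConfigJson);
```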
How MCP Servers Enhance Your Development:
  • Real-time Documentation Access: AI assistants can query the latest Stagehand docs, examples, and best practices
  • Context-Aware Code Generation: Servers provide relevant code patterns and configurations based on your specific use case
  • Reduced Integration Overhead: Standardized protocol eliminates the need for custom integrations with each documentation source
  • Enhanced Accuracy: AI agents receive structured, up-to-date information rather than relying on potentially outdated training data
Prompting tip: Explicitly ask your coding agent or assistant to use these MCP servers to fetch relevant information from the docs, so it has the context it needs to write proper Stagehand code. For example: “Use the stagehand-docs MCP to fetch the act/observe guidelines, then generate code that follows them. Prefer cached observe results.”

Editor rule files (copy‑paste)

Drop these in .cursorrules, .windsurfrules, claude.md, or any agent rule framework:
# Stagehand Project

This is a project that uses Stagehand V3, a browser automation framework with AI-powered `act`, `extract`, `observe`, and `agent` methods.

The main class can be imported as `Stagehand` from `@browserbasehq/stagehand`.

**Key Classes:**

- `Stagehand`: Main orchestrator class providing `act`, `extract`, `observe`, and `agent` methods
- `context`: A `V3Context` object that manages browser contexts and pages
- `page`: Individual page objects accessed via `stagehand.context.pages()[i]` or created with `stagehand.context.newPage()`

## Initialize

```typescript
import { Stagehand } from "@browserbasehq/stagehand";

const stagehand = new Stagehand({
  env: "LOCAL", // or "BROWSERBASE"
  verbose: 2, // 0, 1, or 2
  model: "openai/gpt-4.1-mini", // or any supported model
});

await stagehand.init();

// Access the browser context and pages
const page = stagehand.context.pages()[0];
const context = stagehand.context;

// Create new pages if needed
const page2 = await stagehand.context.newPage();
```

## Act

Actions are called on the `stagehand` instance (not the page). Use atomic, specific instructions:

```typescript
// Act on the current active page
await stagehand.act("click the sign in button");

// Act on a specific page (when you need to target a page that isn't currently active)
await stagehand.act("click the sign in button", { page: page2 });
```

**Important:** Act instructions should be atomic and specific:

- ✅ Good: "Click the sign in button" or "Type 'hello' into the search input"
- ❌ Bad: "Order me pizza" or "Type in the search bar and hit enter" (multi-step)
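
One way to handle a multi-step task is to split it into atomic instructions and run them in order. The sketch below assumes only that the target object has an `act(instruction)` method returning a promise. `ActLike` and `actInSequence` are hypothetical names for illustration, not part of Stagehand.

```typescript
// Minimal structural type: anything with an `act(instruction)` method.
// A Stagehand instance used as in the examples above is assumed to fit this shape.
interface ActLike {
  act(instruction: string): Promise<unknown>;
}

// Hypothetical helper: run a multi-step task as a sequence of atomic instructions.
async function actInSequence(target: ActLike, steps: string[]): Promise<void> {
  for (const step of steps) {
    // Await each step so the next one runs against the updated page state.
    await target.act(step);
  }
}

// Instead of: act("Type in the search bar and hit enter")
// await actInSequence(stagehand, [
//   "Type 'stagehand' into the search input",
//   "Press Enter in the search input",
// ]);
```

If any step rejects, the loop stops and the error propagates to the caller, so later steps never run against an unexpected page state.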

### Observe + Act Pattern (Recommended)

Plan the action with `observe`, then pass the cached result to `act`. This guards against unexpected DOM changes between planning and execution:

```typescript
const instruction = "Click the sign in button";

// Get candidate actions
const actions = await stagehand.observe(instruction);

// Execute the first action
await stagehand.act(actions[0]);
```

To target a specific page:

```typescript
const actions = await stagehand.observe("select blue as the favorite color", {
  page: page2,
});
await stagehand.act(actions[0], { page: page2 });
```
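
The observe-then-act pattern can be extended into a small in-memory cache so that repeated instructions skip the `observe` call. This is a minimal sketch under stated assumptions: `ObserveActLike` and `createCachedActor` are hypothetical names, and the only API assumed is `observe(instruction)` returning an array and `act(action)` accepting one of its items, as in the examples above. A cached action can go stale when the page changes, so the sketch includes an `invalidate` method.

```typescript
// Minimal structural type standing in for the observe/act surface used above.
interface ObserveActLike<A> {
  observe(instruction: string): Promise<A[]>;
  act(action: A): Promise<unknown>;
}

// Hypothetical helper: caches the first observed action per instruction string.
function createCachedActor<A>(target: ObserveActLike<A>) {
  const cache = new Map<string, A>();

  return {
    // Observe only on a cache miss, then act on the cached action.
    async actCached(instruction: string): Promise<void> {
      let action = cache.get(instruction);
      if (action === undefined) {
        const [first] = await target.observe(instruction);
        if (first === undefined) {
          throw new Error(`No action found for: ${instruction}`);
        }
        action = first;
        cache.set(instruction, action);
      }
      await target.act(action);
    },

    // Drop a cached entry, for example after the page layout changes.
    invalidate(instruction: string): void {
      cache.delete(instruction);
    },
  };
}

// Usage (assuming an initialized `stagehand` instance):
// const actor = createCachedActor(stagehand);
// await actor.actCached("Click the sign in button");
```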

## Extract

Extract data from pages using natural language instructions. The `extract` method is called on the `stagehand` instance.

### Basic Extraction (with schema)

```typescript
import { z } from "zod/v3";

// Extract with explicit schema
const data = await stagehand.extract(
  "extract all apartment listings with prices and addresses",
  z.object({
    listings: z.array(
      z.object({
        price: z.string(),
        address: z.string(),
      }),
    ),
  }),
);

console.log(data.listings);
```

### Simple Extraction (without schema)

```typescript
// Extract returns a default object with 'extraction' field
const result = await stagehand.extract("extract the sign in button text");

console.log(result);
// Output: { extraction: "Sign in" }

// Or destructure directly
const { extraction } = await stagehand.extract(
  "extract the sign in button text",
);
console.log(extraction); // "Sign in"
```

### Targeted Extraction

Extract data from a specific element using a selector:

```typescript
const reason = await stagehand.extract(
  "extract the reason why script injection fails",
  z.string(),
  { selector: "/html/body/div[2]/div[3]/iframe/html/body/p[2]" },
);
```

### URL Extraction

When extracting links or URLs, use `z.string().url()`:

```typescript
const { links } = await stagehand.extract(
  "extract all navigation links",
  z.object({
    links: z.array(z.string().url()),
  }),
);
```

### Extracting from a Specific Page

```typescript
// Extract from a specific page (when you need to target a page that isn't currently active)
const data = await stagehand.extract(
  "extract the placeholder text on the name field",
  { page: page2 },
);
```

## Observe

Plan actions before executing them. `observe` returns an array of candidate actions:

```typescript
// Get candidate actions on the current active page
const [action] = await stagehand.observe("Click the sign in button");

// Execute the action
await stagehand.act(action);
```

Observing on a specific page:

```typescript
// Target a specific page (when you need to target a page that isn't currently active)
const actions = await stagehand.observe("find the next page button", {
  page: page2,
});
await stagehand.act(actions[0], { page: page2 });
```

## Agent

Use the `agent` method to autonomously execute complex, multi-step tasks.

### Basic Agent Usage

```typescript
const page = stagehand.context.pages()[0];
await page.goto("https://www.google.com");

const agent = stagehand.agent({
  model: "google/gemini-2.0-flash",
  executionModel: "google/gemini-2.0-flash",
});

const result = await agent.execute({
  instruction: "Search for the stock price of NVDA",
  maxSteps: 20,
});

console.log(result.message);
```

### Computer Use Agent (CUA)

For more advanced scenarios using computer-use models:

```typescript
const agent = stagehand.agent({
  cua: true, // Enable Computer Use Agent mode
  model: "anthropic/claude-sonnet-4-20250514",
  // or "google/gemini-2.5-computer-use-preview-10-2025"
  systemPrompt: `You are a helpful assistant that can use a web browser.
    Do not ask follow up questions, the user will trust your judgement.`,
});

await agent.execute({
  instruction: "Apply for a library card at the San Francisco Public Library",
  maxSteps: 30,
});
```

### Agent with Custom Model Configuration

```typescript
const agent = stagehand.agent({
  cua: true,
  model: {
    modelName: "google/gemini-2.5-computer-use-preview-10-2025",
    apiKey: process.env.GEMINI_API_KEY,
  },
  systemPrompt: `You are a helpful assistant.`,
});
```

### Agent with Integrations (MCP/External Tools)

```typescript
const agent = stagehand.agent({
  integrations: [`https://mcp.exa.ai/mcp?exaApiKey=${process.env.EXA_API_KEY}`],
  systemPrompt: `You have access to the Exa search tool.`,
});
```

## Advanced Features

### DeepLocator (XPath Targeting)

Target specific elements across shadow DOM and iframes:

```typescript
await page
  .deepLocator("/html/body/div[2]/div[3]/iframe/html/body/p")
  .highlight({
    durationMs: 5000,
    contentColor: { r: 255, g: 0, b: 0 },
  });
```

### Multi-Page Workflows

```typescript
const page1 = stagehand.context.pages()[0];
await page1.goto("https://example.com");

const page2 = await stagehand.context.newPage();
await page2.goto("https://example2.com");

// Act/extract/observe operate on the current active page by default
// Pass { page } option to target a specific page
await stagehand.act("click button", { page: page1 });
await stagehand.extract("get title", { page: page2 });
```

# Stagehand Python Project

This is a project that uses [Stagehand Python](https://github.com/browserbase/stagehand-python), which provides AI-powered browser automation with `act`, `extract`, and `observe` methods.

`Stagehand` is a class that provides configuration and browser automation capabilities with:
- Pages accessed via `stagehand.context.pages()` or `stagehand.context.activePage()`
- `stagehand.context`: A StagehandContext object (extends Playwright BrowserContext)
- `stagehand.agent()`: Create AI-powered agents for autonomous multi-step workflows
- `stagehand.init()`: Initialize the browser session
- `stagehand.close()`: Clean up resources

`Page` extends Playwright's Page class with AI-powered methods:
- `act()`: Perform actions on web elements using natural language
- `extract()`: Extract structured data from pages using schemas
- `observe()`: Plan actions and get selectors before executing

`Agent` provides autonomous Computer Use Agent capabilities:
- `execute()`: Perform complex multi-step tasks using natural language instructions

Use the following rules to write code for this project.

- To plan an instruction like "click the sign in button", use Stagehand `observe` to get the action to execute.

You can also pass in the following params:

- The result of `observe` is a list of `ObserveResult` objects that can directly be used as params for `act` like this:
  
- When writing code that needs to extract data from the page, use Stagehand `extract`. Use Pydantic models for schemas:

## Initialize

### Configuration Options

Key configuration options in `StagehandConfig`:

## Act

You can act directly with string instructions:

Use variables for dynamic form filling:

**Best Practices:**
- Cache the results of `observe` and pass them to `act` to guard against unexpected DOM changes
- Keep actions atomic and specific (e.g., "Click the sign in button" not "Sign in to the website")
- Use specific, descriptive instructions

Each `act` instruction should be as atomic and specific as possible, e.g. "Click the sign in button" or "Type 'hello' into the search input".
Avoid instructions that take more than one step, e.g. "Order me pizza" or "Send an email to Paul asking him to call me".

## Extract

### Simple String Extraction

### Structured Extraction with Schema (Recommended)
Always use Pydantic models for structured data extraction:

### Array Extraction
For arrays, use List types:

### Complex Object Extraction
For more complex data structures:

## Agent System

Stagehand provides an Agent System for autonomous web browsing using Computer Use Agents (CUA).

### Creating Agents

### Agent Execution

**Best Practices:**
- Be specific with instructions: `"Fill out the contact form with name 'John Doe' and submit it"`
- Break down complex tasks into smaller steps
- Use error handling with try/except blocks
- Combine agents for navigation with traditional methods for precise data extraction

## Project Structure Best Practices

- Store configurations in environment variables or config files
- Use async/await patterns consistently
- Implement main automation logic in async functions
- Use async context managers for resource management
- Use type hints and Pydantic models for data validation
- Handle exceptions appropriately with try/except blocks

Security notes

  • Do not embed secrets in docs or rule files; use env vars in MCP configs.
  • Avoid broad actions that may trigger unintended navigation; prefer observe first.
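
To keep secrets out of rule files and configs, code can read them from the environment and fail fast when one is missing. The sketch below is one way to do that. `requireEnv` is a hypothetical helper, not a Stagehand or MCP API, and it takes the environment as a parameter so it is easy to test.

```typescript
// Hypothetical helper: read a required secret from an environment-like object.
function requireEnv(
  env: Record<string, string | undefined>,
  name: string,
): string {
  const value = env[name];
  if (value === undefined || value === "") {
    // Fail early with the variable name only; never log the secret itself.
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage in Node.js (variable name taken from the integrations example above):
// const exaKey = requireEnv(process.env, "EXA_API_KEY");
```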

Resources/references