# AI Rules

> Using AI to write Stagehand code faster and better.

You're likely using AI to write code, and there's a **right way and a wrong way to do it.** This page collects rules, configs, and copy‑paste snippets that help your AI agents and assistants write performant Stagehand code quickly.

## Quickstart

<CardGroup cols={2}>
  <Card title="Add MCP servers" icon="screwdriver-wrench">
    Configure Browserbase (Stagehand), Context7, DeepWiki, and Stagehand Docs in your MCP client.
  </Card>

  <Card title="Pin editor rules" icon="memo">
    Drop in `.cursorrules` and `claude.md` so AI agents and assistants consistently emit Stagehand patterns.
  </Card>
</CardGroup>

## Using MCP Servers

MCP (Model Context Protocol) servers act as intermediaries that connect AI systems to external data sources and tools. These servers enable your coding assistant to access real-time information, execute tasks, and retrieve structured data to enhance code generation accuracy.

The following **MCP servers** provide specialized access to Stagehand documentation and related resources:

<Accordion title="Context7 by Upstash" icon="database">
  Pulls up-to-date, version-specific library documentation and code examples into your assistant's context. With Context7, AI assistants can look up current APIs and usage patterns instead of relying on training data that may be stale.

  **Installation:**

  ```json theme={null}
  {
    "mcpServers": {
      "context7": {
        "command": "npx",
        "args": ["-y", "@upstash/context7-mcp"]
      }
    }
  }
  ```
</Accordion>

<Accordion title="DeepWiki by Cognition" icon="book-open">
  Offers deep indexing of GitHub repositories and documentation. DeepWiki allows AI agents to understand project architecture, API references, and best practices from the entire Stagehand ecosystem. It provides comprehensive knowledge about repository structure, code relationships, and development patterns.

  **Installation:**

  ```json theme={null}
  {
    "mcpServers": {
      "deepwiki": {
        "url": "https://mcp.deepwiki.com/mcp"
      }
    }
  }
  ```
</Accordion>

<Accordion title="Stagehand Docs by Mintlify" icon="mintbit">
  Direct access to official Stagehand documentation. This MCP server provides AI assistants with up-to-date API references, configuration options, and usage examples for accurate code generation. Mintlify auto-generates this server from the official docs, ensuring your AI assistant always has the latest information.

  **Installation:**

  ```json theme={null}
  {
    "mcpServers": {
      "stagehand-docs": {
        "url": "https://docs.stagehand.dev/mcp"
      }
    }
  }
  ```
</Accordion>
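
Many MCP clients read a single `mcpServers` object, so the three snippets above usually end up in one file. Here is a minimal TypeScript sketch of that merge. The helper name `mergeMcpConfigs` and its duplicate-name check are illustrative assumptions, and where the resulting JSON lives depends on your client.

```typescript
// Shapes taken from the snippets above: a server is either a local command or a remote URL.
type McpServer = { command: string; args?: string[] } | { url: string };
type McpConfig = { mcpServers: Record<string, McpServer> };

// Hypothetical helper: merge config fragments and refuse duplicate server names.
function mergeMcpConfigs(...fragments: McpConfig[]): McpConfig {
  const merged: McpConfig = { mcpServers: {} };
  for (const fragment of fragments) {
    for (const name of Object.keys(fragment.mcpServers)) {
      if (Object.prototype.hasOwnProperty.call(merged.mcpServers, name)) {
        throw new Error(`Duplicate MCP server name: ${name}`);
      }
      merged.mcpServers[name] = fragment.mcpServers[name];
    }
  }
  return merged;
}

const combined = mergeMcpConfigs(
  { mcpServers: { context7: { command: "npx", args: ["-y", "@upstash/context7-mcp"] } } },
  { mcpServers: { deepwiki: { url: "https://mcp.deepwiki.com/mcp" } } },
  { mcpServers: { "stagehand-docs": { url: "https://docs.stagehand.dev/mcp" } } },
);

// Paste the printed JSON into your client's MCP config file.
console.log(JSON.stringify(combined, null, 2));
```

Failing on duplicate names is a design choice for this sketch: silently overwriting a server entry is hard to notice later.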

**How MCP Servers Enhance Your Development:**

* **Real-time Documentation Access**: AI assistants can query the latest Stagehand docs, examples, and best practices
* **Context-Aware Code Generation**: Servers provide relevant code patterns and configurations based on your specific use case
* **Reduced Integration Overhead**: Standardized protocol eliminates the need for custom integrations with each documentation source
* **Enhanced Accuracy**: AI agents receive structured, up-to-date information rather than relying on potentially outdated training data

<Tip>
  **Prompting tip:**
  Explicitly ask your coding agent or assistant to use these MCP servers to fetch the relevant docs, so it has the context it needs to write correct Stagehand code.

  For example: **"Use the stagehand-docs MCP to fetch the act/observe guidelines, then generate code that follows them. Prefer cached observe results."**
</Tip>

## Editor rule files (copy‑paste)

Drop these in `.cursorrules`, `.windsurfrules`, `claude.md`, or any agent rule framework:

<Accordion title="TypeScript">
  ````md theme={null}
  # Stagehand Project

  This is a project that uses Stagehand V3, a browser automation framework with AI-powered `act`, `extract`, `observe`, and `agent` methods.

  The main class can be imported as `Stagehand` from `@browserbasehq/stagehand`.

  **Key Classes:**

  - `Stagehand`: Main orchestrator class providing `act`, `extract`, `observe`, and `agent` methods
  - `context`: A `V3Context` object that manages browser contexts and pages
  - `page`: Individual page objects accessed via `stagehand.context.pages()[i]` or created with `stagehand.context.newPage()`

  ## Initialize

  ```typescript
  import { Stagehand } from "@browserbasehq/stagehand";

  const stagehand = new Stagehand({
    env: "LOCAL", // or "BROWSERBASE"
    verbose: 2, // 0, 1, or 2
    model: "openai/gpt-4.1-mini", // or any supported model
  });

  await stagehand.init();

  // Access the browser context and pages
  const page = stagehand.context.pages()[0];
  const context = stagehand.context;

  // Create new pages if needed
  const page2 = await stagehand.context.newPage();
  ```

  ## Act

  Actions are called on the `stagehand` instance (not the page). Use atomic, specific instructions:

  ```typescript
  // Act on the current active page
  await stagehand.act("click the sign in button");

  // Act on a specific page (when you need to target a page that isn't currently active)
  await stagehand.act("click the sign in button", { page: page2 });
  ```

  **Important:** Act instructions should be atomic and specific:

  - ✅ Good: "Click the sign in button" or "Type 'hello' into the search input"
  - ❌ Bad: "Order me pizza" or "Type in the search bar and hit enter" (multi-step)

  ### Observe + Act Pattern (Recommended)

  Cache the results of `observe` to avoid unexpected DOM changes:

  ```typescript
  const instruction = "Click the sign in button";

  // Get candidate actions
  const actions = await stagehand.observe(instruction);

  // Execute the first action
  await stagehand.act(actions[0]);
  ```

  To target a specific page:

  ```typescript
  const actions = await stagehand.observe("select blue as the favorite color", {
    page: page2,
  });
  await stagehand.act(actions[0], { page: page2 });
  ```

  ## Extract

  Extract data from pages using natural language instructions. The `extract` method is called on the `stagehand` instance.

  ### Basic Extraction (with schema)

  ```typescript
  import { z } from "zod";

  // Extract with explicit schema
  const data = await stagehand.extract(
    "extract all apartment listings with prices and addresses",
    z.object({
      listings: z.array(
        z.object({
          price: z.string(),
          address: z.string(),
        }),
      ),
    }),
  );

  console.log(data.listings);
  ```

  ### Simple Extraction (without schema)

  ```typescript
  // Extract returns a default object with 'extraction' field
  const result = await stagehand.extract("extract the sign in button text");

  console.log(result);
  // Output: { extraction: "Sign in" }

  // Or destructure directly
  const { extraction } = await stagehand.extract(
    "extract the sign in button text",
  );
  console.log(extraction); // "Sign in"
  ```

  ### Targeted Extraction

  Extract data from a specific element using a selector:

  ```typescript
  const reason = await stagehand.extract(
    "extract the reason why script injection fails",
    z.string(),
    { selector: "/html/body/div[2]/div[3]/iframe/html/body/p[2]" },
  );
  ```

  ### URL Extraction

  When extracting links or URLs, use `z.string().url()`:

  ```typescript
  const { links } = await stagehand.extract(
    "extract all navigation links",
    z.object({
      links: z.array(z.string().url()),
    }),
  );
  ```

  ### Extracting from a Specific Page

  ```typescript
  // Extract from a specific page (when you need to target a page that isn't currently active)
  const data = await stagehand.extract(
    "extract the placeholder text on the name field",
    { page: page2 },
  );
  ```

  ## Observe

  Plan actions before executing them. Returns an array of candidate actions:

  ```typescript
  // Get candidate actions on the current active page
  const [action] = await stagehand.observe("Click the sign in button");

  // Execute the action
  await stagehand.act(action);
  ```

  Observing on a specific page:

  ```typescript
  // Target a specific page (when you need to target a page that isn't currently active)
  const actions = await stagehand.observe("find the next page button", {
    page: page2,
  });
  await stagehand.act(actions[0], { page: page2 });
  ```

  ## Agent

  Use the `agent` method to autonomously execute complex, multi-step tasks.

  ### Basic Agent Usage

  ```typescript
  const page = stagehand.context.pages()[0];
  await page.goto("https://www.google.com");

  const agent = stagehand.agent({
    model: "google/gemini-2.0-flash",
    executionModel: "google/gemini-2.0-flash",
  });

  const result = await agent.execute({
    instruction: "Search for the stock price of NVDA",
    maxSteps: 20,
  });

  console.log(result.message);
  ```

  ### Computer Use Agent (CUA)

  For more advanced scenarios using computer-use models:

  ```typescript
  const agent = stagehand.agent({
    mode: "cua", // Enable Computer Use Agent mode
    model: "anthropic/claude-sonnet-4-6",
    // or "google/gemini-2.5-computer-use-preview-10-2025"
    systemPrompt: `You are a helpful assistant that can use a web browser.
      Do not ask follow up questions, the user will trust your judgement.`,
  });

  await agent.execute({
    instruction: "Apply for a library card at the San Francisco Public Library",
    maxSteps: 30,
  });
  ```

  ### Agent with Custom Model Configuration

  ```typescript
  const agent = stagehand.agent({
    mode: "cua",
    model: {
      modelName: "google/gemini-2.5-computer-use-preview-10-2025",
      apiKey: process.env.GEMINI_API_KEY,
    },
    systemPrompt: `You are a helpful assistant.`,
  });
  ```

  ### Agent with Integrations (MCP/External Tools)

  ```typescript
  const agent = stagehand.agent({
    integrations: [`https://mcp.exa.ai/mcp?exaApiKey=${process.env.EXA_API_KEY}`],
    systemPrompt: `You have access to the Exa search tool.`,
  });
  ```

  ## Advanced Features

  ### DeepLocator (XPath Targeting)

  Target specific elements across shadow DOM and iframes:

  ```typescript
  await page
    .deepLocator("/html/body/div[2]/div[3]/iframe/html/body/p")
    .highlight({
      durationMs: 5000,
      contentColor: { r: 255, g: 0, b: 0 },
    });
  ```

  ### Multi-Page Workflows

  ```typescript
  const page1 = stagehand.context.pages()[0];
  await page1.goto("https://example.com");

  const page2 = await stagehand.context.newPage();
  await page2.goto("https://example2.com");

  // Act/extract/observe operate on the current active page by default
  // Pass { page } option to target a specific page
  await stagehand.act("click button", { page: page1 });
  await stagehand.extract("get title", { page: page2 });
  ```
  ````
</Accordion>
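
The Observe + Act pattern in the TypeScript rules asks you to cache `observe` results. One way to do that is sketched below. The `Action` and `ObserveActClient` shapes and the `ObserveCache` class are hypothetical stand-ins, not Stagehand's real types, and the stub client only exists so the sketch runs without a browser.

```typescript
// Hypothetical minimal shapes for illustration; these are NOT the real Stagehand types.
interface Action {
  selector: string;
  description: string;
}

interface ObserveActClient {
  observe(instruction: string): Promise<Action[]>;
  act(action: Action): Promise<void>;
}

// Remembers the first observed action per instruction and replays it on later calls.
class ObserveCache {
  private readonly client: ObserveActClient;
  private readonly cache = new Map<string, Action>();

  constructor(client: ObserveActClient) {
    this.client = client;
  }

  async actWithCache(instruction: string): Promise<Action> {
    let action = this.cache.get(instruction);
    if (action === undefined) {
      const candidates = await this.client.observe(instruction);
      const first = candidates[0];
      if (first === undefined) {
        throw new Error(`No action found for: ${instruction}`);
      }
      action = first;
      this.cache.set(instruction, action);
    }
    await this.client.act(action);
    return action;
  }
}

// Stub client that counts observe calls; a real project would wrap `stagehand` instead.
async function demo(): Promise<number> {
  let observeCalls = 0;
  const stub: ObserveActClient = {
    async observe(instruction) {
      observeCalls += 1;
      return [{ selector: "xpath=/html/body/button", description: instruction }];
    },
    async act(_action) {
      // A real client would perform the click or keystroke here.
    },
  };
  const cached = new ObserveCache(stub);
  await cached.actWithCache("Click the sign in button");
  await cached.actWithCache("Click the sign in button");
  return observeCalls;
}

demo().then((calls) => console.log(`observe calls: ${calls}`)); // logs "observe calls: 1"
```

A cached action can go stale when the page layout changes, so in practice you would likely evict the entry and observe again if `act` fails.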

<Accordion title="Python">
  ```md theme={null}
  # Stagehand Python Project

  This is a project that uses [Stagehand Python](https://github.com/browserbase/stagehand-python), which provides AI-powered browser automation with `act`, `extract`, and `observe` methods.

  `Stagehand` is a class that provides configuration and browser automation capabilities with:
  - Pages accessed via `stagehand.context.pages()` or `stagehand.context.activePage()`
  - `stagehand.context`: A StagehandContext object (extends Playwright BrowserContext)
  - `stagehand.agent()`: Create AI-powered agents for autonomous multi-step workflows
  - `stagehand.init()`: Initialize the browser session
  - `stagehand.close()`: Clean up resources

  `Page` extends Playwright's Page class with AI-powered methods:
  - `act()`: Perform actions on web elements using natural language
  - `extract()`: Extract structured data from pages using schemas
  - `observe()`: Plan actions and get selectors before executing

  `Agent` provides autonomous Computer Use Agent capabilities:
  - `execute()`: Perform complex multi-step tasks using natural language instructions

  Use the following rules to write code for this project.

  - To plan an instruction like "click the sign in button", use Stagehand `observe` to get the action to execute.
  - The result of `observe` is a list of `ObserveResult` objects that can be passed directly to `act`.
  - When writing code that needs to extract data from the page, use Stagehand `extract` with Pydantic models as schemas.

  ## Initialize

  ### Configuration Options

  Key configuration options in `StagehandConfig`:

  ## Act

  You can act directly with string instructions:

  Use variables for dynamic form filling:

  **Best Practices:**
  - Cache the results of `observe` to avoid unexpected DOM changes
  - Keep actions atomic and specific (e.g., "Click the sign in button" not "Sign in to the website")
  - Use specific, descriptive instructions

  Act `action` should be as atomic and specific as possible, e.g. "Click the sign in button" or "Type 'hello' into the search input".
  AVOID actions that take more than one step, e.g. "Order me pizza" or "Send an email to Paul asking him to call me".

  ## Extract

  ### Simple String Extraction

  ### Structured Extraction with Schema (Recommended)
  Always use Pydantic models for structured data extraction:

  ### Array Extraction
  For arrays, use List types:

  ### Complex Object Extraction
  For more complex data structures:

  ## Agent System

  Stagehand provides an Agent System for autonomous web browsing using Computer Use Agents (CUA).

  ### Creating Agents

  ### Agent Execution

  **Best Practices:**
  - Be specific with instructions: `"Fill out the contact form with name 'John Doe' and submit it"`
  - Break down complex tasks into smaller steps
  - Use error handling with try/except blocks
  - Combine agents for navigation with traditional methods for precise data extraction

  ## Project Structure Best Practices

  - Store configurations in environment variables or config files
  - Use async/await patterns consistently
  - Implement main automation logic in async functions
  - Use async context managers for resource management
  - Use type hints and Pydantic models for data validation
  - Handle exceptions appropriately with try/except blocks
  ```
</Accordion>
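
Both rule files ask for atomic `act` instructions. If you want a rough pre-flight check in your own tooling, a heuristic like the one below can flag instructions that chain steps together. The function name `looksAtomic` and the marker words are assumptions made for this sketch, not part of Stagehand.

```typescript
// Heuristic only: words that usually signal more than one step in a single instruction.
const MULTI_STEP_MARKERS = /\b(and|then|after that|afterwards)\b/i;

// Hypothetical helper: returns false when an instruction appears to chain several steps.
function looksAtomic(instruction: string): boolean {
  // Ignore quoted literals so text to be typed does not trigger a false positive.
  const withoutQuotes = instruction.replace(/'[^']*'|"[^"]*"/g, "");
  return !MULTI_STEP_MARKERS.test(withoutQuotes);
}

const samples = [
  "Click the sign in button",
  "Type 'salt and pepper' into the search input",
  "Type in the search bar and hit enter",
];

for (const sample of samples) {
  console.log(`${looksAtomic(sample) ? "ok  " : "warn"} ${sample}`);
}
```

This only catches chained steps. A broad goal such as "Order me pizza" contains no marker word and still passes, so treat the check as a lint hint rather than a guarantee.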

## Security notes

* Do not embed secrets in docs or rule files; use env vars in MCP configs.
* Avoid broad actions that may trigger unintended navigation; prefer `observe` first.
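
To keep secrets out of rule files and configs, you can resolve them from the environment at runtime and fail fast when one is missing. The sketch below builds the Exa MCP URL from the rule file that way. The helpers `requireEnv` and `exaMcpUrl` are hypothetical names, and the environment is passed in as a plain map so the example stays self-contained.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: read a required secret, throwing if it is unset or blank.
function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Build the Exa MCP URL without hardcoding the key in source or docs.
function exaMcpUrl(env: Env): string {
  const key = encodeURIComponent(requireEnv(env, "EXA_API_KEY"));
  return `https://mcp.exa.ai/mcp?exaApiKey=${key}`;
}

// In a real project pass `process.env`; a literal map is used here for illustration.
console.log(exaMcpUrl({ EXA_API_KEY: "test-key" }));
// prints "https://mcp.exa.ai/mcp?exaApiKey=test-key"
```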

## Resources/references

* [Context7 MCP (Upstash)](https://github.com/upstash/context7)
* [DeepWiki MCP](https://mcp.deepwiki.com/)
* [Stagehand Docs MCP (Mintlify)](https://docs.stagehand.dev/mcp)
