Agent Creation

// Create agent instance
const agent = stagehand.agent(config?: AgentConfig): AgentInstance
AgentConfig Interface:
interface AgentConfig {
  systemPrompt?: string;
  integrations?: (Client | string)[];
  tools?: ToolSet;
  /** @deprecated Use `mode: "cua"` instead */
  cua?: boolean;
  model?: string | AgentModelConfig<string>;
  executionModel?: string | AgentModelConfig<string>;
  stream?: boolean; // Enable streaming mode (experimental)
  mode?: "dom" | "hybrid" | "cua"; // Tool mode
}

// AgentModelConfig for advanced configuration
type AgentModelConfig<TModelName extends string = string> = {
  modelName: TModelName;
} & Record<string, unknown>;
AgentInstance Interface:
interface AgentInstance {
  execute: (instructionOrOptions: string | AgentExecuteOptions) => Promise<AgentResult>;
}

Agent Configuration

systemPrompt
string
Custom system prompt to provide to the agent. Overrides the default system prompt and defines agent behavior.
model
string | AgentModelConfig
The model to use for agent functionality. Can be either:
  • A string in the format "provider/model" (e.g., "openai/computer-use-preview", "anthropic/claude-sonnet-4-20250514")
  • An object with modelName and additional provider-specific options
Available CUA Models:
  • "openai/computer-use-preview"
  • "openai/computer-use-preview-2025-03-11"
  • "anthropic/claude-3-7-sonnet-latest"
  • "anthropic/claude-haiku-4-5-20251001"
  • "anthropic/claude-sonnet-4-20250514"
  • "anthropic/claude-sonnet-4-5-20250929"
  • "google/gemini-2.5-computer-use-preview-10-2025"
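Because `model` accepts either a "provider/model" string or an object with `modelName`, it can help to normalize both forms before passing them on. The sketch below is a hypothetical helper, not part of Stagehand; it assumes that `modelName` in the object form also follows the "provider/model" format.

```typescript
// Mirrors the AgentModelConfig type shown in this reference.
type AgentModelConfig<TModelName extends string = string> = {
  modelName: TModelName;
} & Record<string, unknown>;

interface ParsedModel {
  provider: string;
  model: string;
}

// Hypothetical helper (not a Stagehand API): splits a model identifier
// into its provider and model parts, accepting both documented forms.
function parseModel(model: string | AgentModelConfig): ParsedModel {
  const id = typeof model === "string" ? model : model.modelName;
  const slash = id.indexOf("/");
  // Reject identifiers with no slash, an empty provider, or an empty model.
  if (slash <= 0 || slash === id.length - 1) {
    throw new Error(`Expected "provider/model", got "${id}"`);
  }
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}

console.log(parseModel("openai/computer-use-preview").provider);
// openai
console.log(parseModel({ modelName: "anthropic/claude-sonnet-4-20250514" }).model);
// claude-sonnet-4-20250514
```

Running a check like this before creating the agent surfaces a malformed identifier early, rather than at the first model call.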
executionModel
string | AgentModelConfig
The model to use for tool execution (observe/act calls within agent tools). If not specified, inherits from the main model configuration. Format: "provider/model" (e.g., "openai/gpt-4o-mini", "google/gemini-2.0-flash-exp")
cua
boolean
Deprecated: Use mode: "cua" instead. This option will be removed in a future version.
Indicates whether Computer Use Agent (CUA) mode is enabled. When false, the agent uses standard tool-based operation instead of computer control.
integrations
(Client | string)[]
MCP (Model Context Protocol) integrations for external tools and services. Array of MCP server URLs (strings) or connected Client objects.
tools
ToolSet
Custom tool definitions to extend agent capabilities using the AI SDK ToolSet format.
stream
boolean
Enable streaming mode for the agent. When true, execute() returns AgentStreamResult with textStream for incremental output. When false (default), execute() returns AgentResult after completion.
Default: false
Non-CUA agents only. Requires experimental: true. Not available when mode: "cua".
mode
"dom" | "hybrid" | "cua"
Tool mode for the agent. Determines which set of tools is available to the agent.
Modes:
  • "dom" (default): Uses DOM-based tools (act, fillForm) for structured page interactions. Works with any model.
  • "hybrid": Uses both DOM-based and coordinate-based tools (act, click, type, dragAndDrop, clickAndHold, fillForm) for visual/screenshot-based interactions. Requires models with reliable coordinate-based action capabilities.
  • "cua": Uses Computer Use Agent (CUA) providers like Anthropic Claude, Google Gemini, or OpenAI for screenshot-based automation. This is the preferred way to enable CUA mode (replaces the deprecated cua: true option).
Default: "dom"
Hybrid Mode Model Requirements: Only use hybrid mode with models that can reliably perform coordinate-based actions:
  • Google: google/gemini-3-flash-preview
  • Anthropic: anthropic/claude-sonnet-4-20250514, anthropic/claude-sonnet-4-5-20250929, anthropic/claude-haiku-4-5-20251001
Requires experimental: true in Stagehand constructor.
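The mode rules above can be checked before an agent is created. The following is a minimal sketch of such a pre-flight check; `checkAgentConfig` and `ModeCheckInput` are hypothetical names, and Stagehand's own validation may behave differently. The hybrid model list is copied from this section.

```typescript
type AgentMode = "dom" | "hybrid" | "cua";

// Models this reference lists as suitable for hybrid mode.
const HYBRID_MODELS: readonly string[] = [
  "google/gemini-3-flash-preview",
  "anthropic/claude-sonnet-4-20250514",
  "anthropic/claude-sonnet-4-5-20250929",
  "anthropic/claude-haiku-4-5-20251001",
];

interface ModeCheckInput {
  mode?: AgentMode;
  model: string;
  stream?: boolean;
}

// Hypothetical pre-flight check (not a Stagehand API). Returns a list of
// human-readable problems; an empty list means no documented rule is violated.
function checkAgentConfig(input: ModeCheckInput): string[] {
  const mode = input.mode ?? "dom"; // "dom" is the documented default
  const problems: string[] = [];
  if (mode === "hybrid" && HYBRID_MODELS.indexOf(input.model) === -1) {
    problems.push(`"${input.model}" is not in the documented hybrid-mode model list`);
  }
  if (mode === "cua" && input.stream === true) {
    problems.push('stream is not available when mode is "cua"');
  }
  return problems;
}

console.log(checkAgentConfig({ mode: "hybrid", model: "openai/gpt-4o-mini" }).length);
// 1
console.log(checkAgentConfig({ model: "openai/gpt-4o-mini" }).length);
// 0
```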

Execute Method

// String instruction
await agent.execute(instruction: string): Promise<AgentResult>

// With options
await agent.execute(options: AgentExecuteOptions): Promise<AgentResult>
AgentExecuteOptions Interface:
interface AgentExecuteOptions {
  instruction: string;
  maxSteps?: number;
  page?: PlaywrightPage | PuppeteerPage | PatchrightPage | Page;
  highlightCursor?: boolean;
  messages?: ModelMessage[]; // Continue from previous conversation (experimental)
  signal?: AbortSignal; // Cancel execution (experimental)
  excludeTools?: string[]; // Tools to exclude from this execution (experimental)
  callbacks?: AgentExecuteCallbacks;
}

interface AgentExecuteCallbacks {
  prepareStep?: PrepareStepFunction<ToolSet>;
  onStepFinish?: GenerateTextOnStepFinishCallback<ToolSet>;
}

Execute Parameters

instruction
string
required
High-level task description in natural language.
maxSteps
number
Maximum number of actions the agent can take before stopping.
Default: 20
page
PlaywrightPage | PuppeteerPage | PatchrightPage | Page
Optional: Specify which page to perform the agent execution on. Supports multiple browser automation libraries:
  • Playwright: Native Playwright Page objects
  • Puppeteer: Puppeteer Page objects
  • Patchright: Patchright Page objects
  • Stagehand Page: Stagehand’s wrapped Page object
If not specified, defaults to the current “active” page in your Stagehand instance.
highlightCursor
boolean
Whether to show a visual cursor on the page during agent execution. Useful for debugging and demonstrations.
Default: false
messages
ModelMessage[]
Previous conversation messages to continue from. Pass the messages from a previous AgentResult to continue that conversation.
Non-CUA agents only. Requires experimental: true. Not available when mode: "cua".
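Continuing a conversation amounts to threading each result's `messages` into the next `execute()` call. The sketch below illustrates that pattern against minimal structural types; `AgentLike`, `ResultLike`, `Message`, and `runConversation` are hypothetical stand-ins (with `Message` approximating the AI SDK's `ModelMessage`), so a real agent instance may need a cast to fit them.

```typescript
// Stand-in for ModelMessage; the real type is richer (assumption).
type Message = { role: string; content: unknown };

interface ResultLike {
  success: boolean;
  message: string;
  messages?: Message[];
}

interface AgentLike {
  execute(options: { instruction: string; messages?: Message[] }): Promise<ResultLike>;
}

// Hypothetical helper: runs instructions in order and passes the messages
// from each result into the next call, so context carries across steps.
async function runConversation(
  agent: AgentLike,
  instructions: string[],
): Promise<ResultLike[]> {
  const results: ResultLike[] = [];
  let previous: Message[] | undefined;
  for (const instruction of instructions) {
    // Only set `messages` once there is a prior result to continue from.
    const options = previous ? { instruction, messages: previous } : { instruction };
    const result = await agent.execute(options);
    results.push(result);
    // Keep the last known history if a result carries no messages.
    previous = result.messages ?? previous;
  }
  return results;
}
```

With a real agent this might be called as `runConversation(agent, ["Open the pricing page", "Now compare the two cheapest plans"])`, where the instructions are illustrative.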
signal
AbortSignal
An AbortSignal that can be used to cancel the agent execution. When aborted, the agent will stop and throw an AgentAbortError.
Non-CUA agents only. Requires experimental: true. Not available when mode: "cua".
excludeTools
string[]
Tools to exclude from this execution. Pass an array of tool names to prevent the agent from using those tools.
Available tools by mode:
  • DOM mode (default): act, fillForm, ariaTree, extract, goto, scroll, keys, navback, screenshot, think, wait, search
  • Hybrid mode: click, type, dragAndDrop, clickAndHold, fillFormVision, act, ariaTree, extract, goto, scroll, keys, navback, screenshot, think, wait, search
Non-CUA agents only. Requires experimental: true. Not available when mode: "cua".
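To see which tools remain after an exclusion, and to catch misspelled tool names, the documented tool lists can be encoded directly. This is a sketch under the assumption that the names above are exhaustive; `remainingTools` is a hypothetical helper, not a Stagehand API.

```typescript
type ToolMode = "dom" | "hybrid";

// Tool names copied from the per-mode lists in this reference.
const TOOLS_BY_MODE: Record<ToolMode, readonly string[]> = {
  dom: [
    "act", "fillForm", "ariaTree", "extract", "goto", "scroll",
    "keys", "navback", "screenshot", "think", "wait", "search",
  ],
  hybrid: [
    "click", "type", "dragAndDrop", "clickAndHold", "fillFormVision", "act",
    "ariaTree", "extract", "goto", "scroll", "keys", "navback",
    "screenshot", "think", "wait", "search",
  ],
};

// Hypothetical helper: returns the tools still available after exclusion.
// Throws on names that are not in the mode's list, which are likely typos.
function remainingTools(mode: ToolMode, excludeTools: string[]): string[] {
  const available = TOOLS_BY_MODE[mode];
  const unknown = excludeTools.filter((tool) => available.indexOf(tool) === -1);
  if (unknown.length > 0) {
    throw new Error(`Unknown tools for ${mode} mode: ${unknown.join(", ")}`);
  }
  return available.filter((tool) => excludeTools.indexOf(tool) === -1);
}

console.log(remainingTools("dom", ["act", "fillForm", "keys"]).join(", "));
// ariaTree, extract, goto, scroll, navback, screenshot, think, wait, search
```

The same array would then be passed as `excludeTools` in the options given to `execute()`, for example to keep an agent from interacting with the page while it gathers information.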
callbacks
AgentExecuteCallbacks | AgentStreamCallbacks
Callbacks to hook into the agent’s execution lifecycle. The available callbacks depend on whether streaming is enabled.
Non-CUA agents only. Requires experimental: true. Not available when mode: "cua".

Response

Returns: Promise<AgentResult> (non-streaming) or Promise<AgentStreamResult> (streaming)
AgentResult Interface:
interface AgentResult {
  success: boolean;
  message: string;
  actions: AgentAction[];
  completed: boolean;
  metadata?: Record<string, unknown>;
  messages?: ModelMessage[]; // Conversation history for continuation (experimental)
  usage?: {
    input_tokens: number;
    output_tokens: number;
    reasoning_tokens?: number;
    cached_input_tokens?: number;
    inference_time_ms: number;
  };
}

// AgentAction can contain various tool-specific fields
interface AgentAction {
  type: string;
  reasoning?: string;
  taskCompleted?: boolean;
  action?: string;
  timeMs?: number;        // wait tool
  pageText?: string;      // ariaTree tool
  pageUrl?: string;       // ariaTree tool
  instruction?: string;   // various tools
  timestamp?: number;     // Action timestamp
  [key: string]: unknown; // Additional tool-specific fields
}
success
boolean
Whether the task was completed successfully.
message
string
Description of the execution result and status.
actions
AgentAction[]
Array of individual actions taken during execution. Each action contains tool-specific data.
completed
boolean
Whether the agent believes the task is fully complete.
metadata
Record<string, unknown>
Additional execution metadata and debugging information.
messages
ModelMessage[]
The conversation messages from this execution. Pass these to a subsequent execute() call via the messages option to continue the conversation.
Non-CUA agents only. Requires experimental: true.
usage
object
Token usage and performance metrics.
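As an illustration of working with the `usage` object, the sketch below derives a token total and an output rate. `summarizeUsage` is a hypothetical helper. This reference does not say whether `reasoning_tokens` are already counted in `output_tokens`, so the sketch leaves them out of the total rather than guessing.

```typescript
// Mirrors the usage field of AgentResult shown in this reference.
interface Usage {
  input_tokens: number;
  output_tokens: number;
  reasoning_tokens?: number;
  cached_input_tokens?: number;
  inference_time_ms: number;
}

interface UsageSummary {
  totalTokens: number;
  outputTokensPerSecond: number;
}

// Hypothetical helper (not a Stagehand API).
function summarizeUsage(usage: Usage): UsageSummary {
  const seconds = usage.inference_time_ms / 1000;
  return {
    // Input plus output only; reasoning tokens are not added (see lead-in).
    totalTokens: usage.input_tokens + usage.output_tokens,
    // Guard against division by zero when no inference time was recorded.
    outputTokensPerSecond: seconds > 0 ? usage.output_tokens / seconds : 0,
  };
}

const summary = summarizeUsage({
  input_tokens: 1250,
  output_tokens: 340,
  inference_time_ms: 2500,
});
console.log(summary.totalTokens); // 1590
console.log(summary.outputTokensPerSecond); // 136
```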

Example Response

{
  "success": true,
  "message": "Task completed successfully",
  "actions": [
    {
      "type": "act",
      "instruction": "click the submit button",
      "reasoning": "User requested to submit the form",
      "taskCompleted": false
    },
    {
      "type": "observe",
      "instruction": "check if submission was successful",
      "taskCompleted": true
    }
  ],
  "completed": true,
  "metadata": {
    "steps_taken": 2
  },
  "usage": {
    "input_tokens": 1250,
    "output_tokens": 340,
    "reasoning_tokens": 42,
    "cached_input_tokens": 0,
    "inference_time_ms": 2500
  }
}
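A response like the one above can be summarized by tallying its actions per tool type. The following is a minimal sketch; `countActionsByType` and `ActionLike` are hypothetical names, and the sample data is the `actions` array from the example response.

```typescript
// Minimal subset of the AgentAction interface shown in this reference.
interface ActionLike {
  type: string;
  taskCompleted?: boolean;
  [key: string]: unknown;
}

// Hypothetical helper: counts how many actions of each type were taken.
function countActionsByType(actions: ActionLike[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const action of actions) {
    counts[action.type] = (counts[action.type] ?? 0) + 1;
  }
  return counts;
}

const exampleActions: ActionLike[] = [
  {
    type: "act",
    instruction: "click the submit button",
    reasoning: "User requested to submit the form",
    taskCompleted: false,
  },
  {
    type: "observe",
    instruction: "check if submission was successful",
    taskCompleted: true,
  },
];

console.log(JSON.stringify(countActionsByType(exampleActions)));
// {"act":1,"observe":1}
```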

Code Examples

import { Stagehand } from "@browserbasehq/stagehand";

// Initialize with Browserbase (API key and project ID from environment variables)
// Set BROWSERBASE_API_KEY and BROWSERBASE_PROJECT_ID in your environment
const stagehand = new Stagehand({
  env: "BROWSERBASE",
  model: "anthropic/claude-sonnet-4-20250514"
});
await stagehand.init();

const page = stagehand.context.pages()[0];
// Create agent with default configuration
const agent = stagehand.agent();

// Navigate to a page
await page.goto("https://www.google.com");

// Execute a task
const result = await agent.execute("Search for 'Stagehand automation' and click the first result");

console.log(result.message);
console.log(`Completed: ${result.completed}`);
console.log(`Actions taken: ${result.actions.length}`);

Error Types

The following errors may be thrown by the agent() method:
  • StagehandError - Base class for all Stagehand-specific errors
  • StagehandInitError - Agent was not properly initialized
  • MissingLLMConfigurationError - No LLM API key or client configured
  • UnsupportedModelError - The specified model is not supported for agent functionality
  • UnsupportedModelProviderError - The specified model provider is not supported
  • InvalidAISDKModelFormatError - Model string does not follow the required provider/model format
  • MCPConnectionError - Failed to connect to MCP server
  • StagehandDefaultError - General execution error with detailed message
  • AgentAbortError - Thrown when agent execution is cancelled via an AbortSignal
  • StreamingCallbacksInNonStreamingModeError - Thrown when streaming-only callbacks (onChunk, onFinish, onError, onAbort) are used without stream: true
  • ExperimentalNotConfiguredError - Thrown when experimental features (callbacks, signal, messages, streaming) are used without experimental: true in Stagehand constructor
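When handling these errors it can be useful to separate cancellations and setup problems from everything else. The sketch below groups the error names listed above; the grouping itself is an editorial choice, and matching on `name` assumes each error's `name` property equals its class name, which this reference does not confirm. If the error classes are importable in your setup, `instanceof` checks would be the more robust option.

```typescript
// Error names from the list above that point at setup rather than at the
// task itself. This grouping is an assumption made for illustration.
const CONFIGURATION_ERRORS: readonly string[] = [
  "MissingLLMConfigurationError",
  "UnsupportedModelError",
  "UnsupportedModelProviderError",
  "InvalidAISDKModelFormatError",
  "ExperimentalNotConfiguredError",
  "StreamingCallbacksInNonStreamingModeError",
];

type ErrorKind = "aborted" | "connection" | "configuration" | "other";

// Hypothetical helper (not a Stagehand API): classifies a caught value by
// its error name. Non-Error values fall through to "other".
function classifyAgentError(err: unknown): ErrorKind {
  const errorName = err instanceof Error ? err.name : "";
  if (errorName === "AgentAbortError") return "aborted";
  if (errorName === "MCPConnectionError") return "connection";
  if (CONFIGURATION_ERRORS.indexOf(errorName) !== -1) return "configuration";
  return "other";
}
```

In a `try`/`catch` around `agent.execute()`, the result of `classifyAgentError(err)` could decide whether to retry, fix the configuration, or treat the run as deliberately cancelled.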