agent()

Agent

See how to use agent() to create autonomous AI agents for multi-step browser workflows.

Agent Creation

TypeScript

// Create agent instance
const agent = stagehand.agent(config: AgentConfig): AgentInstance

AgentConfig Interface:

interface AgentConfig {
  provider?: AgentProviderType;  // "openai" | "anthropic"
  model?: string;
  instructions?: string;
  options?: Record<string, unknown>;
  integrations?: (Client | string)[];
  tools?: ToolSet;
}

AgentInstance Interface:

interface AgentInstance {
  execute: (instructionOrOptions: string | AgentExecuteOptions) => Promise<AgentResult>;
}
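
A minimal sketch of the call shape described above: create an agent from a config, then run one task. The types are re-declared locally so the snippet stands alone (in a real project they would come from the Stagehand package), and `AgentResult` is trimmed to two fields. `StagehandLike`, `runTask`, and the instruction text are hypothetical names chosen for illustration.

```typescript
// Local mirrors of the documented interfaces (trimmed; assumptions for illustration).
interface AgentResult {
  success: boolean;
  message: string;
}

interface AgentConfig {
  provider?: "openai" | "anthropic";
  model?: string;
  instructions?: string;
}

interface AgentInstance {
  execute: (instruction: string) => Promise<AgentResult>;
}

// Hypothetical structural type: anything that exposes agent(config) fits.
interface StagehandLike {
  agent: (config: AgentConfig) => AgentInstance;
}

// Create an agent, run a single task, and return the result message.
async function runTask(stagehand: StagehandLike, task: string): Promise<string> {
  const agent = stagehand.agent({
    provider: "openai",
    model: "computer-use-preview",
    instructions: "You are a careful assistant that completes browser tasks.",
  });
  const result = await agent.execute(task);
  if (!result.success) {
    throw new Error(`Agent failed: ${result.message}`);
  }
  return result.message;
}
```

Because `runTask` only depends on the shape of `agent()`, it can be exercised with a stub object before being pointed at a real browser session.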

Agent Configuration

provider

AgentProviderType

AI provider for agent functionality. Options: "anthropic", "openai".

model

string

Specific model for agent execution. Anthropic: "claude-sonnet-4-20250514", "claude-3-5-sonnet-20241022". OpenAI: "computer-use-preview", "gpt-4o".

instructions

string

System instructions defining agent behavior.

options

Record<string, unknown>

Provider-specific configuration. Common: apiKey, baseURL, organization.

integrations

(Client | string)[]

MCP (Model Context Protocol) integrations for external tools and services. Array of: MCP server URLs (strings) or connected Client objects.

tools

ToolSet

Custom tool definitions to extend agent capabilities. Format: Object with tool name keys and tool definition values.
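
As an illustration of how provider and model fit together, the sketch below builds a config and checks that the chosen model is listed under the chosen provider. It treats the model names above as examples rather than an exhaustive registry, so unknown models pass unchecked. `DOCUMENTED_MODELS`, `providerForModel`, and `findConfigProblems` are hypothetical helpers, not part of the library, and the `integrations` and `tools` fields are omitted for brevity.

```typescript
type AgentProviderType = "openai" | "anthropic";

// Local mirror of the documented interface (integrations and tools omitted).
interface AgentConfig {
  provider?: AgentProviderType;
  model?: string;
  instructions?: string;
  options?: Record<string, unknown>;
}

// Model names copied from the list above; examples only, not a full registry.
const DOCUMENTED_MODELS: Record<AgentProviderType, string[]> = {
  anthropic: ["claude-sonnet-4-20250514", "claude-3-5-sonnet-20241022"],
  openai: ["computer-use-preview", "gpt-4o"],
};

// Hypothetical helper: the provider a documented model is listed under, if any.
function providerForModel(model: string): AgentProviderType | undefined {
  for (const provider of Object.keys(DOCUMENTED_MODELS) as AgentProviderType[]) {
    if (DOCUMENTED_MODELS[provider].indexOf(model) !== -1) {
      return provider;
    }
  }
  return undefined;
}

// Hypothetical helper: flags a provider/model pair that contradicts the list above.
function findConfigProblems(config: AgentConfig): string[] {
  const problems: string[] = [];
  if (config.provider && config.model) {
    const expected = providerForModel(config.model);
    if (expected && expected !== config.provider) {
      problems.push(
        `model "${config.model}" is listed under "${expected}", not "${config.provider}"`
      );
    }
  }
  return problems;
}

const config: AgentConfig = {
  provider: "anthropic",
  model: "claude-sonnet-4-20250514",
  instructions: "Fill in forms carefully and never submit payment details.",
  options: { apiKey: "<your-api-key>" }, // placeholder, not a real key
};
```

A check like this is cheap to run before creating the agent, since a mismatched provider and model would otherwise surface only when the first request is made.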

Execute Method

TypeScript

// String instruction
await agent.execute(instruction: string): Promise<AgentResult>

// With options
await agent.execute(options: AgentExecuteOptions): Promise<AgentResult>

AgentExecuteOptions Interface:

interface AgentExecuteOptions {
  instruction: string;
  maxSteps?: number;
  autoScreenshot?: boolean;
  waitBetweenActions?: number;
  context?: string;
}

Execute Parameters

instruction

string

required

High-level task description in natural language.

maxSteps

number

Maximum number of actions the agent can take. Default: 20

autoScreenshot

boolean

Whether to take screenshots before each action. Default: true

waitBetweenActions

number

Delay in milliseconds between actions. Default: 0

context

string

Additional context or constraints for the agent.
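
The two call forms and the defaults listed above can be pictured as a single normalization step. The sketch below is not the library's internal code. It only shows what the documented defaults (maxSteps 20, autoScreenshot true, waitBetweenActions 0) imply for a caller. `ResolvedExecuteOptions` and `resolveExecuteOptions` are hypothetical names.

```typescript
// Local mirror of the documented interface.
interface AgentExecuteOptions {
  instruction: string;
  maxSteps?: number;
  autoScreenshot?: boolean;
  waitBetweenActions?: number;
  context?: string;
}

// Every field with a documented default becomes required; context stays optional.
type ResolvedExecuteOptions = Required<Omit<AgentExecuteOptions, "context">> & {
  context?: string;
};

// Hypothetical helper: accepts either call form and fills in the documented defaults.
function resolveExecuteOptions(
  input: string | AgentExecuteOptions
): ResolvedExecuteOptions {
  const options: AgentExecuteOptions =
    typeof input === "string" ? { instruction: input } : input;
  return {
    instruction: options.instruction,
    maxSteps: options.maxSteps ?? 20,
    autoScreenshot: options.autoScreenshot ?? true,
    waitBetweenActions: options.waitBetweenActions ?? 0,
    context: options.context,
  };
}

// String form: all defaults apply.
const fromString = resolveExecuteOptions("Find the pricing page");

// Options form: explicit values win, including an explicit false.
const fromOptions = resolveExecuteOptions({
  instruction: "Find the pricing page",
  maxSteps: 5,
  autoScreenshot: false,
});
```

Using `??` rather than `||` matters here: an explicit `autoScreenshot: false` or `waitBetweenActions: 0` must not be replaced by the default.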

Response

Returns: Promise<AgentResult>

AgentResult Interface:

interface AgentResult {
  success: boolean;
  message: string;
  actions: AgentAction[];
  completed: boolean;
  metadata?: Record<string, unknown>;
  usage?: {
    input_tokens: number;
    output_tokens: number;
    inference_time_ms: number;
  };
}

success

boolean

Whether the task was completed successfully.

message

string

Description of the execution result and status.

actions

AgentAction[]

Array of individual actions taken during execution.

completed

boolean

Whether the agent believes the task is fully complete.

metadata

Record<string, unknown>

Additional execution metadata and debugging information.

usage

object

Token usage and performance metrics.

Example Response

{
  "success": true,
  "message": "Task completed successfully",
  "actions": [
    {
      "action": "click",
      "selector": "button.primary",
      "text": "Submit"
    }
  ],
  "completed": true,
  "metadata": {
    "execution_time": 1000
  },
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50,
    "inference_time_ms": 100
  }
}
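
To show how a caller might read such a response, the sketch below condenses an `AgentResult` into a one-line summary. The `AgentAction` shape is an assumption inferred from the example response (the page does not define it), and `summarizeResult` is a hypothetical helper.

```typescript
// Assumed shape: only "action" is required; other keys vary by action type.
interface AgentAction {
  action: string;
  [key: string]: unknown;
}

// Local mirror of the documented interface.
interface AgentResult {
  success: boolean;
  message: string;
  actions: AgentAction[];
  completed: boolean;
  metadata?: Record<string, unknown>;
  usage?: {
    input_tokens: number;
    output_tokens: number;
    inference_time_ms: number;
  };
}

// Hypothetical helper: reports both status flags, the action count, and total tokens.
function summarizeResult(result: AgentResult): string {
  const tokens = result.usage
    ? String(result.usage.input_tokens + result.usage.output_tokens)
    : "n/a"; // usage is optional, so it may be absent
  return (
    `success=${result.success} completed=${result.completed} ` +
    `actions=${result.actions.length} tokens=${tokens}`
  );
}

// The example response above, as a typed value.
const example: AgentResult = {
  success: true,
  message: "Task completed successfully",
  actions: [{ action: "click", selector: "button.primary", text: "Submit" }],
  completed: true,
  metadata: { execution_time: 1000 },
  usage: { input_tokens: 100, output_tokens: 50, inference_time_ms: 100 },
};

console.log(summarizeResult(example));
// prints "success=true completed=true actions=1 tokens=150"
```

`success` and `completed` are reported separately because the interface exposes them as distinct flags. Callers that need a single pass/fail signal will have to decide how to combine them.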
