Agent Creation

// Create agent instance
const agent = stagehand.agent(config: AgentConfig): AgentInstance
AgentConfig Interface:
interface AgentConfig {
  provider?: AgentProviderType;  // "openai" | "anthropic"
  model?: string;
  instructions?: string;
  options?: Record<string, unknown>;
}
AgentInstance Interface:
interface AgentInstance {
  execute: (instructionOrOptions: string | AgentExecuteOptions) => Promise<AgentResult>;
}

Agent Configuration

provider
AgentProviderType
AI provider for agent functionality. Options: "anthropic", "openai"
model
string
Specific model for agent execution. Anthropic: "claude-sonnet-4-20250514", "claude-3-5-sonnet-20241022". OpenAI: "computer-use-preview", "gpt-4o"
instructions
string
System instructions defining agent behavior.
options
Record<string, unknown>
Provider-specific configuration. Common: apiKey, baseURL, organization
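
The configuration fields above can be assembled as in the following minimal sketch. The types are copied locally from the interfaces documented here so the snippet stands alone; `buildAgentConfig` is a hypothetical helper, not part of Stagehand, and the API key shown is a placeholder.

```typescript
// Types mirrored from the interfaces documented above.
type AgentProviderType = "openai" | "anthropic";

interface AgentConfig {
  provider?: AgentProviderType;
  model?: string;
  instructions?: string;
  options?: Record<string, unknown>;
}

// Hypothetical helper (not a Stagehand API): assembles an AgentConfig and
// rejects an empty API key before any agent is created.
function buildAgentConfig(
  provider: AgentProviderType,
  model: string,
  apiKey: string,
  instructions?: string
): AgentConfig {
  if (apiKey.trim() === "") {
    throw new Error(`Missing API key for provider "${provider}"`);
  }
  const config: AgentConfig = { provider, model, options: { apiKey } };
  if (instructions !== undefined) {
    config.instructions = instructions;
  }
  return config;
}

const anthropicConfig = buildAgentConfig(
  "anthropic",
  "claude-sonnet-4-20250514",
  "sk-placeholder", // placeholder value, not a real key
  "Complete web tasks step by step and stop when the goal is reached."
);
```

The resulting object has the shape expected by stagehand.agent(config). Passing the key through options.apiKey follows the "Common" option names listed above.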

Execute Method

// String instruction
await agent.execute(instruction: string): Promise<AgentResult>

// With options
await agent.execute(options: AgentExecuteOptions): Promise<AgentResult>
AgentExecuteOptions Interface:
interface AgentExecuteOptions {
  instruction: string;
  maxSteps?: number;
  autoScreenshot?: boolean;
  waitBetweenActions?: number;
  context?: string;
}

Execute Parameters

instruction
string
required
High-level task description in natural language.
maxSteps
number
Maximum number of actions the agent can take. Default: 20
autoScreenshot
boolean
Whether to take screenshots before each action. Default: true
waitBetweenActions
number
Delay in milliseconds between actions. Default: 0
context
string
Additional context or constraints for the agent.
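
To illustrate how the two call forms and the documented defaults relate, here is a minimal sketch that normalizes either a string or an options object into a fully populated set of options. `resolveExecuteOptions` and `EXECUTE_DEFAULTS` are hypothetical names used for illustration; how Stagehand applies its defaults internally is not described here, so this should not be read as its implementation.

```typescript
// Mirrored from the AgentExecuteOptions interface documented above.
interface AgentExecuteOptions {
  instruction: string;
  maxSteps?: number;
  autoScreenshot?: boolean;
  waitBetweenActions?: number;
  context?: string;
}

// Default values as listed in the parameter descriptions above.
const EXECUTE_DEFAULTS = {
  maxSteps: 20,
  autoScreenshot: true,
  waitBetweenActions: 0,
};

type ResolvedExecuteOptions = AgentExecuteOptions & typeof EXECUTE_DEFAULTS;

// Hypothetical helper: accepts either call form and fills in the defaults.
function resolveExecuteOptions(
  input: string | AgentExecuteOptions
): ResolvedExecuteOptions {
  const options: AgentExecuteOptions =
    typeof input === "string" ? { instruction: input } : input;
  if (options.instruction.trim() === "") {
    throw new Error("instruction is required");
  }
  return {
    ...options,
    maxSteps: options.maxSteps ?? EXECUTE_DEFAULTS.maxSteps,
    autoScreenshot: options.autoScreenshot ?? EXECUTE_DEFAULTS.autoScreenshot,
    waitBetweenActions:
      options.waitBetweenActions ?? EXECUTE_DEFAULTS.waitBetweenActions,
  };
}

// String form: every optional field falls back to its default.
const fromString = resolveExecuteOptions("Find the pricing page");

// Object form: explicit values win, including false and 0.
const fromObject = resolveExecuteOptions({
  instruction: "Fill in the contact form",
  maxSteps: 5,
  autoScreenshot: false,
  context: "Use the test account only.",
});
```

Using `??` rather than `||` matters in this sketch: an explicit autoScreenshot of false or a waitBetweenActions of 0 is preserved instead of being replaced by the default.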

Response

Returns: Promise<AgentResult>
AgentResult Interface:
interface AgentResult {
  success: boolean;
  message: string;
  actions: AgentAction[];
  completed: boolean;
  metadata?: Record<string, unknown>;
  usage?: {
    input_tokens: number;
    output_tokens: number;
    inference_time_ms: number;
  };
}
success
boolean
Whether the task was completed successfully.
message
string
Description of the execution result and status.
actions
AgentAction[]
Array of individual actions taken during execution.
completed
boolean
Whether the agent believes the task is fully complete.
metadata
Record<string, unknown>
Additional execution metadata and debugging information.
usage
object
Token usage and performance metrics.

Example Response

{
  "success": true,
  "message": "Task completed successfully",
  "actions": [
    {
      "action": "click",
      "selector": "button.primary",
      "text": "Submit"
    }
  ],
  "completed": true,
  "metadata": {
    "execution_time": 1000
  },
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50,
    "inference_time_ms": 100
  }
}
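
A caller will usually inspect success, completed, and usage together. The sketch below shows one way to do that, using the example response above as input. `summarizeResult` is a hypothetical helper, the `AgentAction` shape is an assumption inferred from the example (its interface is not defined in this section), and treating a task as done only when both flags are true is an interpretation, not documented behavior.

```typescript
// Assumed shape: only the "action" field is taken as given from the example.
interface AgentAction {
  action: string;
  [key: string]: unknown;
}

// Mirrored from the AgentResult interface documented above.
interface AgentResult {
  success: boolean;
  message: string;
  actions: AgentAction[];
  completed: boolean;
  metadata?: Record<string, unknown>;
  usage?: {
    input_tokens: number;
    output_tokens: number;
    inference_time_ms: number;
  };
}

// Hypothetical helper: condenses a result into a one-line status string.
function summarizeResult(result: AgentResult): string {
  let status: string;
  if (!result.success) {
    status = "failed";
  } else if (!result.completed) {
    status = "incomplete";
  } else {
    status = "done";
  }
  // usage is optional, so report zero tokens when it is absent.
  const tokens = result.usage
    ? result.usage.input_tokens + result.usage.output_tokens
    : 0;
  return `${status}: ${result.message} (${result.actions.length} action(s), ${tokens} tokens)`;
}

// The example response from this section, typed as an AgentResult.
const exampleResult: AgentResult = {
  success: true,
  message: "Task completed successfully",
  actions: [{ action: "click", selector: "button.primary", text: "Submit" }],
  completed: true,
  metadata: { execution_time: 1000 },
  usage: { input_tokens: 100, output_tokens: 50, inference_time_ms: 100 },
};

const summary = summarizeResult(exampleResult);
// summary === "done: Task completed successfully (1 action(s), 150 tokens)"
```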