These are powerful tools that convert natural language into actions on the computer. On their own, however, they leave you to write the code that turns those actions into Playwright commands.
Stagehand not only handles the execution of Computer Use outputs, but also lets you hot-swap between OpenAI and Anthropic models with one line of code.
Stagehand lets you use Computer Use Agents with one line of code:
IMPORTANT! Configure your browser dimensions
Computer Use Agents often return (x, y) screen coordinates to click, so you’ll need to configure your browser dimensions.
If not specified, the default browser dimensions are 1024x768. You can also configure the browser dimensions in the browserbaseSessionCreateParams or localBrowserLaunchOptions options.
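As a rough sketch of the configuration described above, the helper below builds both option objects from a single width and height, using the 1024x768 default from the text. The function names `viewportOptions` and `isInsideViewport` are hypothetical, and the nested `viewport` / `browserSettings` shape is an assumption; only the two top-level option names come from this page, so check the exact shape against the Stagehand and Browserbase references.

```typescript
// Hypothetical helper, not part of Stagehand. The nested shape below is an
// assumption; only the two top-level option names appear in the text above.
interface Viewport {
  width: number;
  height: number;
}

interface ViewportOptions {
  browserbaseSessionCreateParams: { browserSettings: { viewport: Viewport } };
  localBrowserLaunchOptions: { viewport: Viewport };
}

// Builds matching options for a Browserbase session and a local browser.
// Defaults to 1024x768, the default dimensions stated above.
function viewportOptions(width: number = 1024, height: number = 768): ViewportOptions {
  if (!Number.isInteger(width) || !Number.isInteger(height) || width <= 0 || height <= 0) {
    throw new RangeError(`Invalid browser dimensions: ${width}x${height}`);
  }
  return {
    browserbaseSessionCreateParams: { browserSettings: { viewport: { width, height } } },
    localBrowserLaunchOptions: { viewport: { width, height } },
  };
}

// Returns true when a click coordinate falls inside the given viewport.
function isInsideViewport(x: number, y: number, v: Viewport): boolean {
  return x >= 0 && y >= 0 && x < v.width && y < v.height;
}
```

You would then pass whichever of the two objects matches your environment when constructing Stagehand. The bounds check illustrates why the dimensions matter: a coordinate the agent picks for a larger screen can fall outside a smaller viewport.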
Call execute on the agent to assign it a task.
// Navigate to a website
await stagehand.page.goto("https://www.google.com");

const agent = stagehand.agent({
  // You can use either OpenAI or Anthropic
  provider: "openai",
  // The model to use (claude-3-7-sonnet-latest for Anthropic)
  model: "computer-use-preview",
  // Customize the system prompt
  instructions: `You are a helpful assistant that can use a web browser. Do not ask follow up questions, the user will trust your judgement.`,
  // Customize the API key
  options: {
    apiKey: process.env.OPENAI_API_KEY,
  },
});

// Execute the agent
await agent.execute("Apply for a library card at the San Francisco Public Library");
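To illustrate the provider swap mentioned earlier, here is a minimal sketch that derives the agent configuration from a single provider value. The helper `agentConfigFor` is hypothetical. The `"openai"` provider string and both model names come from the example above; the `"anthropic"` provider string and the `ANTHROPIC_API_KEY` variable name are assumptions.

```typescript
// Hypothetical helper, not part of Stagehand.
type Provider = "openai" | "anthropic";

interface AgentConfig {
  provider: Provider;
  model: string;
  instructions: string;
  options: { apiKey: string | undefined };
}

// Model names are taken from the example above. The "anthropic" provider
// string and the ANTHROPIC_API_KEY variable name are assumptions.
const PROVIDER_DEFAULTS: Record<Provider, { model: string; apiKeyEnv: string }> = {
  openai: { model: "computer-use-preview", apiKeyEnv: "OPENAI_API_KEY" },
  anthropic: { model: "claude-3-7-sonnet-latest", apiKeyEnv: "ANTHROPIC_API_KEY" },
};

// Builds the object passed to stagehand.agent(...) for the chosen provider.
// `env` is passed in explicitly so the function stays easy to test.
function agentConfigFor(
  provider: Provider,
  env: Record<string, string | undefined>,
): AgentConfig {
  const { model, apiKeyEnv } = PROVIDER_DEFAULTS[provider];
  return {
    provider,
    model,
    instructions:
      "You are a helpful assistant that can use a web browser. " +
      "Do not ask follow up questions, the user will trust your judgement.",
    options: { apiKey: env[apiKeyEnv] },
  };
}
```

With this in place, switching models would be a one-word change, for example `stagehand.agent(agentConfigFor("anthropic", process.env))`.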
You can also define the maximum number of steps the agent can take with:
await agent.execute({
  instructions: "Apply for a library card at the San Francisco Public Library",
  maxSteps: 10,
});
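To make the idea of a step limit concrete, the sketch below shows a generic bounded loop: it keeps calling a step function until the task reports completion or the budget runs out. This is an illustration of the concept only, not Stagehand's implementation; `runWithBudget`, `StepResult`, and `RunSummary` are hypothetical names.

```typescript
// Illustrative only: a generic step budget, not Stagehand's internals.
interface StepResult {
  done: boolean;
}

interface RunSummary {
  completed: boolean;
  stepsTaken: number;
}

// Calls `step` up to `maxSteps` times and stops early once it reports done.
function runWithBudget(step: (index: number) => StepResult, maxSteps: number): RunSummary {
  let stepsTaken = 0;
  while (stepsTaken < maxSteps) {
    const result = step(stepsTaken);
    stepsTaken += 1;
    if (result.done) {
      return { completed: true, stepsTaken };
    }
  }
  // Budget exhausted (or zero/negative budget) before the task finished.
  return { completed: false, stepsTaken };
}
```

A lower limit bounds cost and runtime, at the risk of stopping before a long task such as a multi-page form is finished.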