CrewAI Integration
Automate browser tasks using natural language instructions with CrewAI
This tool integrates the Stagehand Python SDK with CrewAI, allowing agents to interact with websites and automate browser tasks using natural language instructions.
Description
The StagehandTool wraps the Stagehand Python SDK to provide CrewAI agents with the ability to control a real web browser and interact with websites using three core primitives:
- Act: Perform actions like clicking, typing, or navigating
- Extract: Extract structured data from web pages
- Observe: Identify and analyze elements on the page
Requirements
Before using this tool, you will need:
- A Browserbase account with API key and project ID
- An API key for an LLM (OpenAI or Anthropic Claude)
- The Stagehand Python SDK installed
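A quick check that these credentials are available can save a failed session start later. The sketch below is illustrative only: the environment variable names are assumptions chosen for this example, not names confirmed by this guide, and `missing_credentials` is a hypothetical helper.

```python
import os

# Assumed variable names; adjust them to match your own setup.
REQUIRED_VARS = ("BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID")
LLM_KEY_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def missing_credentials(env=None):
    """Return the names of credentials that are absent or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    # Only one LLM provider key is needed, so report the pair as one entry.
    if not any(env.get(name) for name in LLM_KEY_VARS):
        missing.append(" or ".join(LLM_KEY_VARS))
    return missing


if __name__ == "__main__":
    problems = missing_credentials()
    if problems:
        print("Missing credentials:", ", ".join(problems))
```

Passing a plain dict instead of relying on `os.environ` keeps the check easy to exercise in tests.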
Install the dependencies:
Usage
Basic Usage
Command Types
The StagehandTool supports three different command types, each designed for specific web automation tasks:
1. Act - Perform Actions on a Page
The act command type (default) allows the agent to perform actions on a webpage, such as clicking buttons, filling forms, navigating, and more.
When to use: Use act when you need to interact with a webpage by performing actions like clicking, typing, scrolling, or navigating.
Example usage:
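As a minimal sketch, the arguments for an act call can be assembled and validated before they reach the tool. The argument names `instruction`, `url`, and `command_type` are assumptions about the call shape, and `build_command` is a hypothetical helper rather than part of the SDK.

```python
VALID_COMMAND_TYPES = ("act", "extract", "observe")


def build_command(instruction, url=None, command_type="act"):
    """Assemble the arguments for one tool call as a plain dict.

    The key names are assumed for illustration; "act" is the default,
    matching the default command type described in this guide.
    """
    if command_type not in VALID_COMMAND_TYPES:
        raise ValueError(f"unknown command type: {command_type!r}")
    if not instruction.strip():
        raise ValueError("instruction must not be empty")
    command = {"instruction": instruction.strip(), "command_type": command_type}
    if url is not None:
        command["url"] = url
    return command


# A specific instruction names the element and where it sits on the page.
login_click = build_command(
    "Click the 'Log in' button in the page header",
    url="https://example.com",
)
```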
2. Extract - Get Data from a Page
The extract command type allows the agent to extract structured data from a webpage, such as product information, article text, or table data.
When to use: Use extract when you need to retrieve specific information from a webpage in a structured format.
Example usage:
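One way to work with extracted data is to check it against the fields you asked for before passing it on. The sketch below assumes the result arrives as JSON text or a dict, which this guide does not confirm; `check_extracted` and the sample payload are hypothetical.

```python
import json


def check_extracted(raw, required_fields):
    """Parse extracted data and report which required fields are missing.

    Accepts JSON text or a mapping. A field counts as missing when it is
    absent, None, an empty string, or an empty list.
    """
    data = json.loads(raw) if isinstance(raw, str) else dict(raw)
    missing = [f for f in required_fields if data.get(f) in (None, "", [])]
    return data, missing


# Made-up result text standing in for what an extract call might return.
sample = '{"title": "Blue Widget", "price": "19.99", "rating": null}'
data, missing = check_extracted(sample, ["title", "price", "rating"])
```

A non-empty `missing` list is a signal to refine the instruction or narrow the scope before retrying.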
3. Observe - Identify Elements on a Page
The observe command type allows the agent to identify and analyze specific elements on a webpage, returning information about their attributes, location, and suggested actions.
When to use: Use observe when you need to identify UI elements, understand page structure, or determine what actions are possible.
Example usage:
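A common pattern is to observe first and then turn one of the reported elements into a precise follow-up action. The sketch below assumes each observed element is a dict with `description`, `selector`, and `method` keys; that shape is an assumption for illustration, and `pick_element` is a hypothetical helper.

```python
def pick_element(observed, keyword):
    """Return the first observed element whose description mentions keyword."""
    needle = keyword.lower()
    for element in observed:
        if needle in element.get("description", "").lower():
            return element
    return None


# Made-up observe output; the field names are assumed, not confirmed.
observed = [
    {"description": "Search input field", "selector": "#search", "method": "fill"},
    {
        "description": "Submit button for the contact form",
        "selector": "#submit",
        "method": "click",
    },
]

target = pick_element(observed, "submit")
# Build a specific act instruction from what observe reported.
follow_up = (
    f"Click the element described as: {target['description']}" if target else None
)
```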
Advanced Configuration
You can customize the behavior of the StagehandTool by specifying different parameters:
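As a rough sketch, settings can be collected in one validated object and then passed on as keyword arguments. Only `dom_settle_timeout_ms` is named in this guide; the `headless` and `verbose` fields, the default values, and the `ToolSettings` class itself are assumptions made for this example.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ToolSettings:
    """Hypothetical container for tool settings."""

    dom_settle_timeout_ms: int = 3000  # named in this guide; default is illustrative
    headless: bool = True              # assumed option
    verbose: int = 1                   # assumed option

    def __post_init__(self):
        if self.dom_settle_timeout_ms <= 0:
            raise ValueError("dom_settle_timeout_ms must be positive")


# Give a slow-loading site more time to settle before acting.
slow_site = ToolSettings(dom_settle_timeout_ms=8000)
settings_kwargs = asdict(slow_site)
```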
Task Examples for CrewAI Agents
Here are some examples of tasks that effectively use the StagehandTool:
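To illustrate, task text that names the page, the command type, and the goal gives the agent less to guess. The sketch below builds such text as plain dicts; the `description` and `expected_output` keys are assumed to mirror the fields a task definition would take, and `describe_task` is a hypothetical helper.

```python
def describe_task(goal, url, command_type, expected_output):
    """Build a task spec whose description states which command type to use."""
    description = f"Go to {url}. Using the '{command_type}' command type, {goal}."
    return {"description": description, "expected_output": expected_output}


tasks = [
    describe_task(
        "click the 'Pricing' link in the top navigation",
        "https://example.com",
        "act",
        "Confirmation that the pricing page loaded",
    ),
    describe_task(
        "extract the name and monthly price of every plan",
        "https://example.com/pricing",
        "extract",
        "A list of plans with their names and monthly prices",
    ),
]
```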
Tips for Effective Use
- Be specific in instructions: The more specific your instructions, the better the results. For example, instead of “click the button,” use “click the ‘Submit’ button at the bottom of the contact form.”
- Use the right command type: Choose the appropriate command type based on your task:
  - Use act for interactions and navigation
  - Use extract for gathering information
  - Use observe for understanding page structure
- Leverage selectors: When extracting data or observing elements, use CSS selectors to narrow the scope and improve accuracy.
- Handle multi-step processes: For complex workflows, break them down into multiple tool calls, each handling a specific step.
- Error handling: Implement appropriate error handling in your agent’s logic to deal with potential issues like elements not found or pages not loading.
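The last two tips can be combined: split a workflow into steps and retry each step a bounded number of times. This is a minimal sketch, not the tool's own behavior; `run_steps` is a hypothetical helper, and `run_step` stands for whatever callable actually invokes the tool in your agent.

```python
import time


def run_steps(steps, run_step, max_attempts=3, delay_s=0.0):
    """Run steps in order, retrying a failing step up to max_attempts times.

    Raises RuntimeError as soon as one step exhausts its attempts, so
    later steps never run against a page in an unknown state.
    """
    results = []
    for step in steps:
        for attempt in range(1, max_attempts + 1):
            try:
                results.append(run_step(step))
                break
            except Exception as exc:
                if attempt == max_attempts:
                    raise RuntimeError(
                        f"step failed after {attempt} attempts: {step['instruction']}"
                    ) from exc
                time.sleep(delay_s)
    return results


# Stand-in runner that fails once, simulating a page that was slow to load.
calls = {"count": 0}


def flaky_runner(step):
    calls["count"] += 1
    if calls["count"] == 1:
        raise TimeoutError("page not loaded")
    return f"done: {step['instruction']}"


steps = [
    {"instruction": "Open the contact page", "command_type": "act"},
    {"instruction": "Extract the support email address", "command_type": "extract"},
]
results = run_steps(steps, flaky_runner)
```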
Troubleshooting
- Session not starting: Ensure you have valid API keys for both Browserbase and your LLM provider.
- Elements not found: Try increasing the dom_settle_timeout_ms parameter to give the page more time to load.
- Actions not working: Make sure your instructions are clear and specific. You may need to use observe first to identify the correct elements.
- Extract returning incomplete data: Try refining your instruction or providing a more specific selector.
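For the "elements not found" case, one option is to raise the settle timeout step by step rather than jumping straight to a large value. The sketch below only computes the values to try; the base, growth factor, and cap are illustrative assumptions, and `settle_timeouts` is a hypothetical helper.

```python
def settle_timeouts(base_ms=3000, attempts=3, factor=2, cap_ms=30000):
    """Return the dom_settle_timeout_ms value to try on each attempt.

    Values grow geometrically from base_ms and never exceed cap_ms.
    """
    if base_ms <= 0 or attempts <= 0 or factor < 1:
        raise ValueError("base_ms and attempts must be positive, factor >= 1")
    return [min(int(base_ms * factor ** i), cap_ms) for i in range(attempts)]


schedule = settle_timeouts(base_ms=3000, attempts=4)
```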
Additional Resources
Join the Stagehand Slack community for support and to connect with other users.