Method Signatures
// No parameters (raw page content)
await stagehand.extract(): Promise<{ pageText: string }>
// Options only (for example, for targeted extraction)
await stagehand.extract(options: ExtractOptions): Promise<{ pageText: string }>
// String instruction only
await stagehand.extract(instruction: string): Promise<{ extraction: string }>
// With schema
await stagehand.extract<T extends ZodTypeAny>(
  instruction: string,
  schema: T,
  options?: ExtractOptions
): Promise<z.infer<T>>
ExtractOptions Interface:
interface ExtractOptions {
  model?: ModelConfiguration;
  timeout?: number;
  selector?: string;
  page?: PlaywrightPage | PuppeteerPage | PatchrightPage | Page;
}
// ModelConfiguration can be either a string or an object
type ModelConfiguration =
  | string // Format: "provider/model" (e.g., "openai/gpt-5-mini", "anthropic/claude-sonnet-4-5")
  | {
      modelName: string; // The model name
      apiKey?: string; // Optional: API key override
      baseURL?: string; // Optional: Base URL override
      // Additional provider-specific options
    }
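Because a model can be given either as a string or as an object, calling code sometimes needs to read the provider and model name from both forms. The following is a minimal sketch, not Stagehand's own parsing logic: `parseModelName` is a hypothetical helper, and the local `ModelConfiguration` type is a copy of the shape above rather than an import from the library.

```typescript
// Local mirror of the ModelConfiguration shape shown above (assumed, not imported).
type ModelConfiguration =
  | string
  | { modelName: string; apiKey?: string; baseURL?: string };

// Hypothetical helper: split "provider/model" into its two parts.
// Splits on the first "/" only, so any later slashes stay in the model name.
// Throws if the provider prefix or the model name is missing.
function parseModelName(config: ModelConfiguration): { provider: string; model: string } {
  const name = typeof config === "string" ? config : config.modelName;
  const slash = name.indexOf("/");
  if (slash <= 0 || slash === name.length - 1) {
    throw new Error(`Expected "provider/model", got "${name}"`);
  }
  return { provider: name.slice(0, slash), model: name.slice(slash + 1) };
}
```

A check like this can run before calling extract() so that a malformed model string is caught early in your own code.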
Parameters
instruction
string
Natural language description of the data to extract. If both the instruction and the schema are omitted, extract() returns the raw page text.
schema
ZodTypeAny
Zod schema defining the structure of the data to extract. It provides type safety and validation, and the return type is inferred automatically from the schema.
model
ModelConfiguration
Configure the AI model to use for this action. Can be either:
A string in the format "provider/model" (e.g., "openai/gpt-5", "google/gemini-2.5-flash")
An object with detailed configuration:
modelName: The model name (e.g., "anthropic/claude-sonnet-4-5", "google/gemini-2.5-flash")
apiKey: API key for the model provider (overrides the default)
baseURL: Base URL for the API endpoint (for custom endpoints or proxies)
timeout
number
Maximum time in milliseconds to wait for the extraction to complete. Default varies by configuration.
selector
string
Optional selector (XPath, CSS selector, etc.) that limits extraction to a specific part of the page. Narrowing the scope reduces token usage and can improve accuracy.
page
PlaywrightPage | PuppeteerPage | PatchrightPage | Page
Optional: Specify which page to perform the extraction on. Supports multiple browser automation libraries:
Playwright: Native Playwright Page objects
Puppeteer: Puppeteer Page objects
Patchright: Patchright Page objects
Stagehand Page: Stagehand’s wrapped Page object
If not specified, defaults to the current “active” page in your Stagehand instance.
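The selector and timeout options described above can be combined with the options-only overload to read the raw text of a single page region. This is a sketch under stated assumptions: `PageTextExtractor` and `ScopedExtractOptions` are local structural stand-ins for the Stagehand types, `extractRegionText` is a hypothetical helper, and the 30000 ms default is an illustrative value rather than a documented one.

```typescript
// Structural stand-ins for the options-only overload of extract() shown above.
// Declared locally so the sketch compiles without the Stagehand package.
interface ScopedExtractOptions {
  selector?: string;
  timeout?: number;
}
interface PageTextExtractor {
  extract(options: ScopedExtractOptions): Promise<{ pageText: string }>;
}

// Hypothetical helper: fetch the raw text of one page region with a time limit.
async function extractRegionText(
  client: PageTextExtractor,
  selector: string,
  timeoutMs: number = 30000, // assumed value; the documented default varies by configuration
): Promise<string> {
  const { pageText } = await client.extract({ selector, timeout: timeoutMs });
  return pageText.trim();
}
```

Any object with a matching extract() method, such as an initialized Stagehand instance, should be usable as the `client` argument.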
Built-in Support
Iframe and Shadow DOM interactions are supported out of the box. Stagehand automatically handles iframe traversal and shadow DOM elements without requiring additional configuration or flags.
Response Types
With Schema
Returns: Promise<z.infer<T>> where T is your schema
The returned object will be strictly typed according to your Zod schema definition.
String Only
Returns: Promise<{ extraction: string }>
extraction: Simple string extraction without schema validation.
No Parameters
Returns: Promise<{ pageText: string }>
pageText: Raw accessibility tree representation of page content.
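When code handles both schema-less result shapes, TypeScript's `in` operator can tell them apart. The sketch below assumes only the two shapes listed above; `SchemalessResult` and `resultText` are hypothetical names introduced for illustration.

```typescript
// The two schema-less result shapes listed above.
type SchemalessResult = { extraction: string } | { pageText: string };

// Hypothetical helper: return the text payload regardless of which shape came back.
function resultText(result: SchemalessResult): string {
  // The `in` check narrows the union, so each branch is fully typed.
  return "extraction" in result ? result.extraction : result.pageText;
}
```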
Code Examples
Examples in this section: Single Object, Arrays, URLs, Scoped, Schema-less, Advanced.
Single Object
import { Stagehand } from "@browserbasehq/stagehand";
import { z } from "zod";

// Initialize with Browserbase (API key and project ID from environment variables)
// Set BROWSERBASE_API_KEY and BROWSERBASE_PROJECT_ID in your environment
const stagehand = new Stagehand({ env: "BROWSERBASE" });
await stagehand.init();
const page = stagehand.context.pages()[0];
await page.goto("https://example.com/product");

// Schema definition
const ProductSchema = z.object({
  name: z.string(),
  price: z.number(),
  inStock: z.boolean()
});

// Extraction with v3 API
const product = await stagehand.extract(
  "extract product details",
  ProductSchema
);
Example Response
{
  "name": "Product Name",
  "price": 100,
  "inStock": true
}
Arrays
import { z } from "zod";

// Schema definition
const ApartmentListingsSchema = z.array(
  z.object({
    address: z.string(),
    price: z.string(),
    bedrooms: z.number()
  })
);

// Extraction with v3 API
const listings = await stagehand.extract(
  "extract all apartment listings",
  ApartmentListingsSchema
);
Example Response
[
  {
    "address": "123 Main St",
    "price": "$100,000",
    "bedrooms": 3
  },
  {
    "address": "456 Elm St",
    "price": "$150,000",
    "bedrooms": 2
  }
]
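The listings example keeps price as a formatted string such as "$100,000". If numeric prices are needed afterward, one option is to convert them after extraction. The sketch below is illustrative only: `parsePrice` is a hypothetical helper, not part of Stagehand, and it assumes prices use "," as a thousands separator and "." as the decimal point.

```typescript
// Hypothetical helper: convert a formatted price string to a number.
// Strips every character except digits and ".", then parses the remainder.
// Returns null when nothing parseable is left.
function parsePrice(price: string): number | null {
  const cleaned = price.replace(/[^0-9.]/g, "");
  if (cleaned === "") return null;
  const value = Number(cleaned);
  return Number.isFinite(value) ? value : null;
}
```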
URLs
import { z } from "zod";

// Schema definition
const NavigationSchema = z.object({
  links: z.array(z.object({
    text: z.string(),
    url: z.string().url() // URL validation
  }))
});

// Extraction with v3 API
const links = await stagehand.extract(
  "extract navigation links",
  NavigationSchema
);
Example Response
{
  "links": [
    {
      "text": "Home",
      "url": "https://example.com"
    }
  ]
}
Scoped
import { z } from "zod";

const ProductSchema = z.object({
  name: z.string(),
  price: z.number(),
  description: z.string()
});

// Extract from specific page section with v3 API
const data = await stagehand.extract(
  "extract product info from this section",
  ProductSchema,
  { selector: "/html/body/div/div" }
);
Example Response
{
  "name": "Product Name",
  "price": 100,
  "description": "Product description"
}
Schema-less
// String only extraction
const title = await stagehand.extract("get the page title");
// Returns: { extraction: "Page Title" }

// Raw page content
const content = await stagehand.extract();
// Returns: { pageText: "Accessibility Tree: ..." }
Example Response
{
  "extraction": "Page Title"
}
Advanced
import { z } from "zod";

// Schema with descriptions and validation
const ProductSchema = z.object({
  price: z.number().describe("Product price in USD"),
  rating: z.number().min(0).max(5).describe("Customer rating out of 5"),
  available: z.boolean().describe("Whether product is in stock"),
  tags: z.array(z.string()).optional()
});

// Nested schema
const EcommerceSchema = z.object({
  product: z.object({
    name: z.string(),
    price: z.object({
      current: z.number(),
      original: z.number().optional()
    })
  }),
  reviews: z.array(z.object({
    rating: z.number(),
    comment: z.string()
  }))
});
Example Response (matches EcommerceSchema)
{
  "product": {
    "name": "Product Name",
    "price": {
      "current": 100,
      "original": 120
    }
  },
  "reviews": [
    {
      "rating": 4,
      "comment": "Great product!"
    }
  ]
}
Additional Examples
// Per-call model configuration
import { z } from "zod";

const DataSchema = z.object({
  title: z.string(),
  content: z.string()
});

// Using string format
const data1 = await stagehand.extract(
  "extract article data",
  DataSchema,
  { model: "openai/gpt-5-mini" }
);

// Using object format with custom configuration
const data2 = await stagehand.extract(
  "extract article data",
  DataSchema,
  {
    model: {
      // modelName uses the same "provider/model" format as the string form
      modelName: "anthropic/claude-sonnet-4-5",
      apiKey: process.env.ANTHROPIC_API_KEY
    }
  }
);
// Extracting from specific pages
import { z } from "zod";

const page1 = stagehand.context.pages()[0];
const page2 = await stagehand.context.newPage();
const Schema = z.object({ title: z.string() });

// Pass the target page through the `page` option
const data1 = await stagehand.extract("get title", Schema, { page: page1 });
const data2 = await stagehand.extract("get title", Schema, { page: page2 });
Error Types
The following errors may be thrown by the extract() method:
StagehandError - Base class for all Stagehand-specific errors
ZodSchemaValidationError - Extracted data does not match the provided Zod schema
StagehandDomProcessError - Error occurred while processing the DOM
StagehandEvalError - Error occurred while evaluating JavaScript in the page context
StagehandIframeError - Unable to resolve iframe for the target element
ContentFrameNotFoundError - Unable to obtain content frame for the selector
XPathResolutionError - XPath does not resolve in the current page or frames
StagehandShadowRootMissingError - No shadow root present on the resolved host element
LLMResponseError - Error in LLM response processing
MissingLLMConfigurationError - No LLM API key or client configured
UnsupportedModelError - The specified model is not supported for this operation
InvalidAISDKModelFormatError - Model string does not follow the required provider/model format
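One way to act on these error types is to retry transient failures and fail fast on configuration problems. The sketch below is an assumption-laden illustration, not documented Stagehand behavior: `withRetry` and `RETRYABLE` are hypothetical names, the choice of which errors are retryable is a guess, and the helper assumes each error's `name` property equals its class name.

```typescript
// Error names copied from the list above. Which ones are worth retrying is an
// assumption of this sketch, not documented Stagehand behavior.
const RETRYABLE = new Set(["LLMResponseError", "ZodSchemaValidationError"]);

// Hypothetical helper: run an extraction callback, retrying only on errors
// whose `name` is in RETRYABLE. Assumes each error's `name` equals its class name.
async function withRetry<T>(run: () => Promise<T>, maxAttempts: number = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await run();
    } catch (err) {
      lastError = err;
      const name = err instanceof Error ? err.name : "";
      // Anything not marked retryable (e.g. MissingLLMConfigurationError) is rethrown at once.
      if (!RETRYABLE.has(name)) throw err;
    }
  }
  throw lastError;
}
```

A call might look like `await withRetry(() => stagehand.extract("extract product details", ProductSchema))`, with `stagehand` and `ProductSchema` defined as in the Single Object example.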