Method Signatures

// With schema and options
await page.extract<T extends z.AnyZodObject>(options: ExtractOptions<T>): Promise<ExtractResult<T>>

// String instruction only
await page.extract(instruction: string): Promise<{ extraction: string }>

// No parameters (raw page content)
await page.extract(): Promise<{ page_text: string }>

ExtractOptions Interface:

interface ExtractOptions<T extends z.AnyZodObject> {
  instruction?: string;
  schema?: T;
  modelName?: AvailableModel;
  modelClientOptions?: ClientOptions;
  domSettleTimeoutMs?: number;
  selector?: string;
  iframes?: boolean;
}

type ExtractResult<T> = z.infer<T>;
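
The three call forms above can be modeled as TypeScript overloads. The sketch below is a dependency-free stand-in, not Stagehand itself: `SchemaLike`, `ExtractOptionsSketch`, `ExtractablePage`, `makeFakePage`, and `titleSchema` are hypothetical names, `SchemaLike` stands in for a Zod object schema, and `schema` is made required here only to keep the sketch short. It illustrates which return shape each call form produces.

```typescript
// Hypothetical stand-in for a Zod object schema: anything with parse().
interface SchemaLike<Out> {
  parse(input: unknown): Out;
}

// Trimmed-down mirror of ExtractOptions, for this sketch only.
interface ExtractOptionsSketch<Out> {
  instruction?: string;
  schema: SchemaLike<Out>;
}

// The three documented call forms, expressed as overloads.
interface ExtractablePage {
  extract<Out>(options: ExtractOptionsSketch<Out>): Promise<Out>;
  extract(instruction: string): Promise<{ extraction: string }>;
  extract(): Promise<{ page_text: string }>;
}

// Fake page that returns canned data instead of driving a browser or an LLM.
function makeFakePage(pageText: string, canned: unknown): ExtractablePage {
  function extract<Out>(options: ExtractOptionsSketch<Out>): Promise<Out>;
  function extract(instruction: string): Promise<{ extraction: string }>;
  function extract(): Promise<{ page_text: string }>;
  async function extract(arg?: unknown): Promise<unknown> {
    if (arg === undefined) return { page_text: pageText };
    if (typeof arg === "string") return { extraction: pageText };
    // Options form: the schema decides the shape of the result.
    return (arg as ExtractOptionsSketch<unknown>).schema.parse(canned);
  }
  return { extract };
}

// Minimal hand-written schema used to exercise the options form.
const titleSchema: SchemaLike<{ title: string }> = {
  parse(input) {
    const title = (input as { title?: unknown } | null)?.title;
    if (typeof title !== "string") throw new Error("title must be a string");
    return { title };
  },
};
```

With this stand-in, `makeFakePage("Hello world", { title: "Hello" }).extract({ schema: titleSchema })` resolves to a value typed as `{ title: string }`, while the string and no-argument forms resolve to `{ extraction }` and `{ page_text }` objects respectively.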

Parameters

instruction
string
Natural language description of what data to extract.
schema
z.ZodSchema (TypeScript) | BaseModel (Python, Pydantic)
Type schema defining the structure of data to extract. Ensures type safety and validation.
selector
string
XPath selector to limit extraction scope. Reduces token usage and improves accuracy.
iframes
boolean
Set to true if content exists within iframes. Default: false
modelName
AvailableModel
Override the default LLM model for this extraction.
modelClientOptions
ClientOptions
Model-specific configuration options.
domSettleTimeoutMs
number
Maximum time, in milliseconds, to wait for the DOM to stabilize. Default: 30000
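
To make the documented defaults concrete, here is a minimal sketch of the values that apply when optional settings are omitted. `ExtractSettings`, `DOCUMENTED_DEFAULTS`, and `effectiveSettings` are hypothetical names used for illustration only, and the XPath value is an invented example; this is not how the library resolves options internally.

```typescript
// Local mirror of three of the optional parameters above (sketch only).
interface ExtractSettings {
  selector?: string;
  iframes?: boolean;
  domSettleTimeoutMs?: number;
}

// Defaults as documented; selector has no default, so it stays optional.
const DOCUMENTED_DEFAULTS = { iframes: false, domSettleTimeoutMs: 30000 } as const;

// Hypothetical helper: shows the effective value of each setting.
function effectiveSettings(settings: ExtractSettings = {}) {
  return {
    selector: settings.selector,
    iframes: settings.iframes ?? DOCUMENTED_DEFAULTS.iframes,
    domSettleTimeoutMs:
      settings.domSettleTimeoutMs ?? DOCUMENTED_DEFAULTS.domSettleTimeoutMs,
  };
}

// Scope extraction to one table (invented XPath) and opt in to iframe content.
const scoped = effectiveSettings({
  selector: "//table[@id='prices']",
  iframes: true,
});
```

Here `scoped` keeps the explicit `selector` and `iframes` values and falls back to the documented timeout for `domSettleTimeoutMs`.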

Response Types

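Returns: Promise<ExtractResult<T>>, where T matches your schema. The returned object is strictly typed according to your schema definition. Without a schema, the result is { extraction: string } for an instruction-only call and { page_text: string } for a call with no parameters.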

Code Examples

import { z } from 'zod';

// Schema definition
const ProductSchema = z.object({
  name: z.string(),
  price: z.number(),
  inStock: z.boolean()
});

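// Extraction. `page` is assumed to be an already initialized Stagehand page.
const product = await page.extract({
  instruction: "extract product details",
  schema: ProductSchema
});
// product is typed as { name: string; price: number; inStock: boolean }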

Example Response

{
  "name": "Product Name",
  "price": 100,
  "inStock": true
}
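
The example response is plain JSON, so once it is stored or sent elsewhere the static type no longer applies. Below is a minimal sketch of a runtime check using a hand-written type guard; `Product` and `isProduct` are hypothetical names that mirror the fields of ProductSchema and are not part of the library. In a project that already depends on zod, `ProductSchema.safeParse` would serve the same purpose.

```typescript
// Shape of the example response, mirroring the ProductSchema fields.
interface Product {
  name: string;
  price: number;
  inStock: boolean;
}

// Hand-written guard: true only when every field has the expected type.
function isProduct(value: unknown): value is Product {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record.name === "string" &&
    typeof record.price === "number" &&
    typeof record.inStock === "boolean"
  );
}

// The example response, read back from JSON as an untyped value.
const rawResponse = '{"name": "Product Name", "price": 100, "inStock": true}';
const parsed: unknown = JSON.parse(rawResponse);

if (isProduct(parsed)) {
  // Inside this branch, parsed is narrowed to Product.
  console.log(`${parsed.name}: ${parsed.price}`);
}
```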