extract()

Use extract() to pull structured data from web pages.

Method Signatures

TypeScript:

// With schema and options
await page.extract<T extends z.AnyZodObject>(options: ExtractOptions<T>): Promise<ExtractResult<T>>

// String instruction only
await page.extract(instruction: string): Promise<{ extraction: string }>

// No parameters (raw page content)
await page.extract(): Promise<{ page_text: string }>

ExtractOptions Interface:

interface ExtractOptions<T extends z.AnyZodObject> {
  instruction?: string;
  schema?: T;
  modelName?: AvailableModel;
  modelClientOptions?: ClientOptions;
  domSettleTimeoutMs?: number;
  selector?: string;
  iframes?: boolean;
}

type ExtractResult<T extends z.AnyZodObject> = z.infer<T>;

Parameters

instruction

string

Natural language description of what data to extract.

schema

z.ZodSchema | BaseModel

Type schema defining the structure of data to extract. Ensures type safety and validation.

selector

string

XPath selector to limit extraction scope. Reduces token usage and improves accuracy.

iframes

boolean

Set to true if content exists within iframes. Default: false

modelName

AvailableModel

Override the default LLM model for this extraction.

modelClientOptions

ClientOptions

Model-specific configuration options.

domSettleTimeoutMs

number

Maximum time, in milliseconds, to wait for the DOM to stabilize. Default: 30000
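
As a rough illustration of how the documented defaults combine with caller-supplied options, the sketch below merges the two. `ExtractOptionsSketch`, `ResolvedOptions`, `DOCUMENTED_DEFAULTS`, and `resolveOptions` are hypothetical names invented for this example, and the `//main` XPath is an assumed selector; none of this describes how the library applies defaults internally.

```typescript
// Hypothetical mirror of a few documented option fields; schema and model
// fields are omitted so the sketch runs without zod or a browser.
interface ExtractOptionsSketch {
  instruction?: string;
  selector?: string; // XPath that limits the extraction scope
  iframes?: boolean;
  domSettleTimeoutMs?: number;
}

type ResolvedOptions = ExtractOptionsSketch & {
  iframes: boolean;
  domSettleTimeoutMs: number;
};

// Defaults as stated in the parameter list above.
const DOCUMENTED_DEFAULTS = { iframes: false, domSettleTimeoutMs: 30000 };

// resolveOptions is an illustrative helper, not part of the library.
// Caller-supplied fields take precedence over the documented defaults.
function resolveOptions(options: ExtractOptionsSketch): ResolvedOptions {
  return { ...DOCUMENTED_DEFAULTS, ...options };
}

const scoped = resolveOptions({
  instruction: "extract product details",
  selector: "//main", // assumed XPath, for illustration only
});
console.log(scoped);
```

In real code, an object with these fields plus a schema would be passed straight to page.extract().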

Response Types

With Schema

Returns: Promise<ExtractResult<T>>, where T matches your schema. The returned object is typed according to your schema definition.

String Only

Returns: Promise<{ extraction: string }>

No Parameters

Returns: Promise<{ page_text: string }>
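
The two schema-less call forms return differently shaped objects, so code that accepts either may need to tell them apart. A minimal sketch, assuming only the result shapes from the method signatures above; `StringOnlyResult`, `NoParamsResult`, and `describeResult` are hypothetical names for this example.

```typescript
// Result shapes taken from the method signatures above.
type StringOnlyResult = { extraction: string };
type NoParamsResult = { page_text: string };

// describeResult is an illustrative helper, not part of the library.
// The `in` check narrows the union to one of the two shapes.
function describeResult(result: StringOnlyResult | NoParamsResult): string {
  if ("extraction" in result) {
    return `extraction: ${result.extraction}`;
  }
  return `page_text: ${result.page_text.length} characters`;
}

console.log(describeResult({ extraction: "Product Name, $100" }));
console.log(describeResult({ page_text: "Raw page content" }));
```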

Code Examples

Single Object

import { z } from 'zod';

// Schema definition
const ProductSchema = z.object({
  name: z.string(),
  price: z.number(),
  inStock: z.boolean()
});

// Extraction
const product = await page.extract({
  instruction: "extract product details",
  schema: ProductSchema
});

Example Response

{
  "name": "Product Name",
  "price": 100,
  "inStock": true
}
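
If a result like this is serialized and read back later, its static type is lost and the shape may need to be re-checked at runtime. Reusing the zod schema for that is the likely choice in real code; the sketch below instead uses a hand-written guard so that it runs without dependencies. `Product` and `isProduct` are hypothetical names that mirror ProductSchema by assumption.

```typescript
// Hand-written mirror of the ProductSchema fields (assumed for illustration).
interface Product {
  name: string;
  price: number;
  inStock: boolean;
}

// isProduct is an illustrative runtime guard, not part of the library.
function isProduct(value: unknown): value is Product {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.name === "string" &&
    typeof v.price === "number" &&
    typeof v.inStock === "boolean"
  );
}

// The example response above, as it might arrive after serialization.
const raw = '{"name":"Product Name","price":100,"inStock":true}';
const parsed: unknown = JSON.parse(raw);

if (isProduct(parsed)) {
  console.log(`${parsed.name} costs ${parsed.price}`);
}
```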
