Extract
Extract structured data from the page
extract() grabs structured text from the current page using schemas. Given instructions and a schema, you will receive structured data.
For TypeScript, the extract schemas are defined using zod schemas.
For Python, the extract schemas are defined using pydantic models.
Extract a single object
An extract call for a single object passes an instruction together with a schema describing that object's fields. The output follows the same schema.
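A minimal Python sketch of a single-object extraction. The `ItemPrice` model, the `extract_price` helper, and the instruction text are illustrative names, not part of the original page. The sketch assumes `page` is a Stagehand page object and that `extract()` takes the instruction as its first argument and the pydantic model through a `schema` keyword; check the client version you use.

```python
from typing import Any

from pydantic import BaseModel


class ItemPrice(BaseModel):
    """Shape of the data we want back: one object with a single field."""

    price: float


async def extract_price(page: Any) -> ItemPrice:
    # Assumption: `page` is a Stagehand page, and extract() accepts the
    # instruction positionally plus the pydantic model via `schema=`.
    result = await page.extract("extract the price of the item", schema=ItemPrice)
    # Validate defensively in case the client returns a plain dict.
    return result if isinstance(result, ItemPrice) else ItemPrice(**result)
```

Under these assumptions the returned value is an `ItemPrice` instance, so the price is read as `item.price`.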
Extract a link
An extract call can also pull a link or URL from the page.
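One way to sketch link extraction in Python is to type the field as pydantic's `HttpUrl`, so that a value that is not a well-formed URL fails validation. The `ContactLink` model, the `extract_contact_link` helper, and the instruction text are hypothetical. The `extract()` call signature is the same assumption as elsewhere on this page (instruction first, model via `schema=`).

```python
from typing import Any

from pydantic import BaseModel, HttpUrl


class ContactLink(BaseModel):
    """A single URL field; HttpUrl rejects strings that are not http(s) URLs."""

    link: HttpUrl


async def extract_contact_link(page: Any) -> ContactLink:
    # Assumption: `page` is a Stagehand page, and extract() accepts the
    # instruction positionally plus the pydantic model via `schema=`.
    result = await page.extract(
        "extract the link to the 'contact us' page", schema=ContactLink
    )
    # Validate defensively in case the client returns a plain dict.
    return result if isinstance(result, ContactLink) else ContactLink(**result)
```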
Extract a list of objects
To extract a list of objects, describe one item in the schema and wrap it in a list. The output contains a list of objects with that shape.
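A possible Python sketch for list extraction: one model describes a single item, and a wrapper model holds the list. The `Apartment` and `Apartments` models, their field names, and the instruction text are illustrative only. As before, the `extract()` signature (instruction first, model via `schema=`) is an assumption.

```python
from typing import Any, List

from pydantic import BaseModel


class Apartment(BaseModel):
    """One list item."""

    address: str
    price: str
    square_feet: str


class Apartments(BaseModel):
    """Wrapper object: the top-level schema is an object holding the list."""

    list_of_apartments: List[Apartment]


async def extract_apartments(page: Any) -> Apartments:
    # Assumption: `page` is a Stagehand page, and extract() accepts the
    # instruction positionally plus the pydantic model via `schema=`.
    result = await page.extract(
        "extract all apartment listings with their address, price, and square feet",
        schema=Apartments,
    )
    # Validate defensively in case the client returns a plain dict.
    return result if isinstance(result, Apartments) else Apartments(**result)
```

Iterating over `result.list_of_apartments` then yields one `Apartment` per listing.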
Extract with additional context
You can provide additional context to your schema to help the model extract the data more accurately.
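In Python, one plausible way to attach such context is a per-field `description` on the pydantic model, which ends up in the model's JSON schema. The sketch below only shows that the descriptions are carried in the schema. Whether and how the client forwards them to the model is an assumption here, and the `Apartment` model and `json_schema` helper are hypothetical names.

```python
from typing import Any, Dict, Type

from pydantic import BaseModel, Field


class Apartment(BaseModel):
    # Each description gives the model a hint about what the field should hold.
    address: str = Field(..., description="the street address of the apartment")
    price: str = Field(..., description="the monthly rent, including the currency symbol")
    square_feet: str = Field(..., description="the floor area in square feet")


def json_schema(model: Type[BaseModel]) -> Dict[str, Any]:
    # pydantic v2 exposes model_json_schema(); pydantic v1 exposes schema().
    build = getattr(model, "model_json_schema", None) or model.schema
    return build()


# The descriptions are stored next to each property in the generated schema.
price_hint = json_schema(Apartment)["properties"]["price"]["description"]
```

The model is then passed to `extract()` in the same way as any other schema.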
Arguments: ExtractOptions<T extends z.AnyZodObject>
Provides instructions for extraction
Defines the structure of the data to extract (TypeScript only)
Set iframes: true if the extraction content exists within an iframe.
This field is now deprecated and has no effect.
An XPath that can be used to reduce the scope of an extraction. If an XPath is passed in, extract will only process the contents of the HTML element that the XPath points to. Useful for reducing input tokens and increasing extraction accuracy.
Specifies the model to use
Configuration options for the model client. See ClientOptions.
Timeout in milliseconds for waiting for the DOM to settle
Returns: Promise<ExtractResult<T extends z.AnyZodObject>>
Resolves to the structured data as defined by the provided schema.
Arguments: ExtractOptions<T extends BaseModel>
Provides instructions for extraction
Defines the structure of the data to extract
Specifies the model to use
Configuration options for the model client. See ClientOptions.
Timeout in milliseconds for waiting for the DOM to settle
Returns: Promise<ExtractResult<BaseModel>>
Resolves to the structured data as defined by the provided schema.