Extract
Extract structured data from the page
extract() grabs structured text from the current page using zod. Given instructions and a schema, you will receive structured data.
Set useTextExtract to true if you are extracting data from a longer body of text.
extract a single object
Here is how an extract call might look for a single object:
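A minimal sketch of such a call follows. It assumes the options are passed as `instruction` and `schema`, and it types the page through a hypothetical `ExtractPage` interface so the example does not need a running browser. The names `itemSchema`, `extractPrice`, and the `price` field are illustrative, not part of the library.

```typescript
import { z } from "zod";

// Hypothetical minimal view of a page object that exposes extract().
// Only the two options discussed above are modelled here (assumption).
interface ExtractPage {
  extract<T extends z.AnyZodObject>(options: {
    instruction: string;
    schema: T;
  }): Promise<z.infer<T>>;
}

// Schema for a single object with one numeric field (illustrative).
const itemSchema = z.object({
  price: z.number(),
});

// Ask the page for the price and return the schema-typed result.
async function extractPrice(
  page: ExtractPage,
): Promise<z.infer<typeof itemSchema>> {
  return page.extract({
    instruction: "extract the price of the item",
    schema: itemSchema,
  });
}
```

Because the return type is inferred from the schema, `(await extractPrice(page)).price` is typed as `number` at compile time.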
Your output schema will look like:
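For a schema with a single numeric `price` field (an illustrative field name, not one fixed by the library), the result would have roughly the following shape. The sample value is made up for the example.

```typescript
// Shape of the result for a schema with one numeric field.
// `price` is an illustrative field name.
type ItemResult = {
  price: number;
};

// A value of that shape; the number is a made-up sample.
const exampleItem: ItemResult = { price: 19.99 };
```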
extract a list of objects
Here is how an extract call might look for a list of objects. Note that you need to wrap the z.array in an outer z.object.
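The wrapping rule can be sketched as follows. As assumptions, the options are named `instruction` and `schema`, the page is typed through a hypothetical `ExtractPage` interface so no browser is needed, and the apartment fields are illustrative.

```typescript
import { z } from "zod";

// Hypothetical minimal view of a page object that exposes extract().
interface ExtractPage {
  extract<T extends z.AnyZodObject>(options: {
    instruction: string;
    schema: T;
  }): Promise<z.infer<T>>;
}

// The z.array is wrapped in an outer z.object, so the schema still
// satisfies the `T extends z.AnyZodObject` constraint.
const apartmentsSchema = z.object({
  list_of_apartments: z.array(
    z.object({
      address: z.string(),
      price: z.string(),
      square_feet: z.string(),
    }),
  ),
});

// Returns an object whose list_of_apartments field holds the array.
async function extractApartments(
  page: ExtractPage,
): Promise<z.infer<typeof apartmentsSchema>> {
  return page.extract({
    instruction:
      "Extract all the apartment listings and their details, including address, price, and square feet.",
    schema: apartmentsSchema,
  });
}
```

Passing a bare `z.array(...)` as `schema` would not type-check against this interface, which is why the outer object is needed.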
Your output schema will look like:
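For a list wrapped in an outer object, the result would have roughly the following shape. The field names `list_of_apartments`, `address`, `price`, and `square_feet` are illustrative, and the sample entry is made up.

```typescript
// Shape of the result when the array sits inside an outer object.
type ApartmentsResult = {
  list_of_apartments: {
    address: string;
    price: string;
    square_feet: string;
  }[];
};

// A value of that shape; the entry is a made-up sample.
const exampleApartments: ApartmentsResult = {
  list_of_apartments: [
    { address: "1 Example St", price: "$1,000", square_feet: "500" },
  ],
};
```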
You can attach additional context to schema fields with .describe().
Arguments: ExtractOptions<T extends z.AnyZodObject>
Provides instructions for extraction
Defines the structure of the data to extract
Setting useTextExtract to true converts the page to text, which is much cleaner for LLMs than the DOM. However, it may not work for use cases that involve DOM metadata elements.
An XPath that can be used to reduce the scope of an extraction. If an XPath is passed in, extract will only process the contents of the HTML element that the XPath points to. This is useful for reducing input tokens and increasing extraction accuracy. Only works when useTextExtract: true.
Specifies the model to use
Configuration options for the model client. See ClientOptions.
Timeout in milliseconds for waiting for the DOM to settle
Returns: Promise<ExtractResult<T extends z.AnyZodObject>>
Resolves to the structured data as defined by the provided schema.
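As a sketch of how `.describe()` annotations and the `useTextExtract` flag might fit into one call, consider the following. It assumes the options are named `instruction`, `schema`, and `useTextExtract`, and it uses a hypothetical `ExtractPage` interface in place of a real page. The `title` and `author` fields and their descriptions are illustrative.

```typescript
import { z } from "zod";

// Hypothetical minimal view of a page object that exposes extract().
interface ExtractPage {
  extract<T extends z.AnyZodObject>(options: {
    instruction: string;
    schema: T;
    useTextExtract?: boolean;
  }): Promise<z.infer<T>>;
}

// Each field carries a description that gives extra context.
const articleSchema = z.object({
  title: z.string().describe("The headline of the article, without the site name"),
  author: z.string().describe("Full name of the article's author"),
});

// Articles are long bodies of text, so useTextExtract is set to true.
async function extractArticle(
  page: ExtractPage,
): Promise<z.infer<typeof articleSchema>> {
  return page.extract({
    instruction: "extract the title and author of the article",
    schema: articleSchema,
    useTextExtract: true,
  });
}
```

The descriptions are stored on the schema itself (for example, `articleSchema.shape.title.description`), so they travel with the schema wherever it is passed.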