extract() grabs structured text from the current page using zod. Given instructions and schema, you will receive structured data.

extract a single object

Here is how an extract call might look for a single object:

const item = await page.extract({
    instruction: "extract the price of the item",
    schema: z.object({
      price: z.number(),
    }),
  });

Your output schema will look like:

{ price: number }
To extract links or URLs, the relevant field in your schema needs to be defined using z.string().url(). See the snippet below:

Here is how an extract call might look for extracting a link or URL.

const extraction = await stagehand.page.extract({
    instruction: "extract the link to the 'contact us' page",
    schema: z.object({
      link: z.string().url(),
    }),
  });

  console.log("the link to the contact us page is: ", extraction.link);

extract a list of objects

Here is how an extract call might look for a list of objects. Note that you need to wrap the z.array in an outer z.object.

const apartments = await stagehand.page.extract({
    instruction:
      "Extract ALL the apartment listings and their details, including address, price, and square feet."
    schema: z.object({
      list_of_apartments: z.array(
        z.object({
          address: z.string(),
          price: z.string(),
          square_feet: z.string(),
        }),
      ),
    }}
  })

  console.log("the apartment data is: ", apartments));

Your output schema will look like:

list_of_apartments: [
      {
        address: "street address here",
        price: "$1234.00",
        square_feet: "700"
      },
      {
         address: "another address here",
         price: "1010.00",
         square_feet: "500"
      },
      .
      .
      .
  ]
To provide some additional context at the field level within your schema, you can use .describe(). See the snippet below:
const apartments = await stagehand.page.extract({
 instruction:
   "Extract ALL the apartment listings and their details, including address, price, and square feet."
 schema: z.object({
   list_of_apartments: z.array(
     z.object({
       address: z.string().describe("the address of the apartment"),
       price: z.string().describe("the price of the apartment"),
       square_feet: z.string().describe("the square footage of the apartment"),
     }),
   ),
 }}
})

Arguments: ExtractOptions<T extends z.AnyZodObject>

instruction
string
required

Provides instructions for extraction

schema
z.AnyZodObject
required

Defines the structure of the data to extract

useTextExtract
boolean
deprecated

This field is now deprecated and has no effect.

selector
string

An xpath that can be used to reduce the scope of an extraction. If an xpath is passed in, extract will only process the contents of the HTML element that the xpath points to. Useful for reducing input tokens and increasing extraction accuracy.

modelName
AvailableModel

Specifies the model to use

modelClientOptions
object

Configuration options for the model client. See ClientOptions.

domSettleTimeoutMs
number

Timeout in milliseconds for waiting for the DOM to settle

Returns: Promise<ExtractResult<T extends z.AnyZodObject>>

Resolves to the structured data as defined by the provided schema.