Stagehand leverages a generic LLM client architecture to support language models from a range of providers, making it possible to integrate new models with minimal changes to the core system. Different models work better for different tasks, so you can choose the one that best suits your needs.

Currently Supported Models

Stagehand supports the latest models from OpenAI and Anthropic.

OpenAI Models

  • gpt-4o
  • gpt-4o-2024-08-06
  • o1-mini
  • o1-preview
  • gpt-4o-mini (not recommended due to low parameter count)

Anthropic Models

  • claude-3-5-sonnet-latest
  • claude-3-5-sonnet-20240620
  • claude-3-5-sonnet-20241022

These models can be specified in the Stagehand Config as modelName, or per call in methods like act() and extract(), as shown below.
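
For example, you can set a default model in the config and override it for a single call. This is a minimal sketch: the env value and the exact option shapes are illustrative, so check your version's Stagehand Config and method signatures.

import { Stagehand } from "@browserbasehq/stagehand";

// Set a default model for every act()/extract() call via the config.
const stagehand = new Stagehand({
  env: "LOCAL", // illustrative; use your own Stagehand Config
  modelName: "claude-3-5-sonnet-latest",
});
await stagehand.init();

// Override the model for a single call.
await stagehand.act({
  action: "click the sign-in button",
  modelName: "gpt-4o",
});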

Custom Models

Custom LLM clients are a very new feature and don’t have advanced features like prompt caching yet.

We also don’t yet support adding custom LLMClients directly to act/extract/observe methods; they can only be specified in the Stagehand Config.

Check out an example of how to implement a custom model like Llama 3.2 using Ollama.

Stagehand supports custom models through the LLMClient interface. By implementing this interface, you can plug in any language model, including locally hosted ones.

To implement a custom model, create a new class that implements the LLMClient interface; a rough sketch follows.
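
In this sketch, a custom client forwards chat requests to a locally hosted model. The createChatCompletion method name and its option shape are assumptions for illustration; consult the LLMClient interface for the exact methods you must implement.

import { LLMClient } from "@browserbasehq/stagehand";

// Illustrative sketch only: the real LLMClient interface may require
// different methods or signatures; check its definition for the contract.
class CustomLLMClient implements LLMClient {
  modelName = "llama3.2"; // hypothetical field for the local model name

  // Forward chat requests to a locally hosted model server (e.g. Ollama).
  async createChatCompletion(options: {
    messages: { role: string; content: string }[];
  }) {
    const res = await fetch("http://localhost:11434/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: this.modelName,
        messages: options.messages,
        stream: false,
      }),
    });
    return res.json();
  }
}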

You can then pass an instance of this class to Stagehand as the llmClient parameter in the Stagehand Config:

import { Stagehand } from "@browserbasehq/stagehand";

// Pass your client to Stagehand through the config.
const customLLMClient: LLMClient = new CustomLLMClient();
const stagehand = new Stagehand({ ...StagehandConfig, llmClient: customLLMClient });
await stagehand.init();

For more information on how to implement a custom model, check out the LLMClient interface.