The HUD SDK provides a base MCPAgent class and several pre-built agent implementations for interacting with MCP environments.
## Creating Agents
Use the `create()` factory method to instantiate agents with typed parameters:
```python
from hud.agents import ClaudeAgent

agent = ClaudeAgent.create(
    checkpoint_name="claude-sonnet-4-5",
    max_tokens=8192,
    verbose=True,
)

result = await agent.run(task, max_steps=20)
```
Direct constructor calls with kwargs are deprecated. Use `Agent.create()` instead.
## Base Class

### MCPAgent
```python
from hud.agents import MCPAgent
```
Abstract base class for all MCP-enabled agents. Handles the agent loop, MCP client lifecycle, tool discovery/filtering, and telemetry.
Create Parameters (shared by all agents):
| Parameter | Type | Description | Default |
|---|---|---|---|
| `mcp_client` | `AgentMCPClient` | MCP client for server connections | `None` |
| `auto_trace` | `bool` | Enable automatic tracing spans | `True` |
| `auto_respond` | `bool` | Use ResponseAgent to decide when to stop/continue | `False` |
| `verbose` | `bool` | Verbose console logs for development | `False` |
Base Config (shared by all agents):
| Parameter | Type | Description | Default |
|---|---|---|---|
| `allowed_tools` | `list[str]` | Tool patterns to expose to the model | `None` (all) |
| `disallowed_tools` | `list[str]` | Tool patterns to hide from the model | `None` |
| `system_prompt` | `str` | Custom system prompt | `None` |
| `append_setup_output` | `bool` | Include setup output in first turn | `True` |
| `initial_screenshot` | `bool` | Include screenshot in initial context | `True` |
| `response_tool_name` | `str` | Lifecycle tool for submitting responses | `None` |
Key Methods:
```python
@classmethod
def create(cls, **kwargs) -> MCPAgent:
    """Factory method to create an agent with typed parameters."""

async def run(prompt_or_task: str | Task | dict, max_steps: int = 10) -> Trace:
    """Run agent with prompt or task. Returns Trace with results."""

async def call_tools(tool_call: MCPToolCall | list[MCPToolCall]) -> list[MCPToolResult]:
    """Execute tool calls through MCP client."""

def get_available_tools() -> list[types.Tool]:
    """Get filtered list of available tools."""
```
## Pre-built Agents

### ClaudeAgent
```python
from hud.agents import ClaudeAgent
```
Claude-specific implementation using Anthropic's API.
Config Parameters:
| Parameter | Type | Description | Default |
|---|---|---|---|
| `checkpoint_name` | `str` | Claude model to use | `"claude-sonnet-4-5"` |
| `model_client` | `AsyncAnthropic` | Anthropic client | Auto-created |
| `max_tokens` | `int` | Maximum response tokens | `16384` |
| `use_computer_beta` | `bool` | Enable computer-use beta features | `True` |
| `validate_api_key` | `bool` | Validate key on init | `True` |
Example:
```python
from hud import Environment
from hud.agents import ClaudeAgent

env = Environment("browser").connect_hub("hud-evals/browser")

agent = ClaudeAgent.create(
    checkpoint_name="claude-sonnet-4-5",
    max_tokens=8192,
)

# Create task from scenario
task = env("navigate", url="https://example.com")
result = await agent.run(task)
```
### OpenAIAgent
```python
from hud.agents import OpenAIAgent
```
OpenAI agent using the Responses API for function calling.
Config Parameters:
| Parameter | Type | Description | Default |
|---|---|---|---|
| `checkpoint_name` | `str` | Model to use | `"gpt-5.1"` |
| `model_client` | `AsyncOpenAI` | OpenAI client | Auto-created |
| `max_output_tokens` | `int` | Maximum response tokens | `None` |
| `temperature` | `float` | Sampling temperature | `None` |
| `reasoning` | `Reasoning` | Reasoning configuration | `None` |
| `tool_choice` | `ToolChoice` | Tool selection strategy | `None` |
| `parallel_tool_calls` | `bool` | Enable parallel tool execution | `None` |
| `validate_api_key` | `bool` | Validate key on init | `True` |
Example:
```python
agent = OpenAIAgent.create(
    checkpoint_name="gpt-4o",
    max_output_tokens=2048,
    temperature=0.7,
)
```
### OperatorAgent
```python
from hud.agents import OperatorAgent
```
OpenAI Operator-style agent with computer-use capabilities. Extends `OpenAIAgent`.
Config Parameters:
| Parameter | Type | Description | Default |
|---|---|---|---|
| `checkpoint_name` | `str` | Model to use | `"computer-use-preview"` |
| `environment` | `Literal["windows", "mac", "linux", "browser"]` | Computer environment | `"linux"` |
Inherits all OpenAIAgent parameters.
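For example, targeting a browser environment (a sketch combining the parameters from the tables above; not verified against a live environment):

```python
from hud.agents import OperatorAgent

agent = OperatorAgent.create(
    checkpoint_name="computer-use-preview",
    environment="browser",
    max_output_tokens=2048,  # inherited from OpenAIAgent
)
```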
### GeminiAgent
```python
from hud.agents import GeminiAgent
```
Google Gemini agent with native computer-use capabilities.
Config Parameters:
| Parameter | Type | Description | Default |
|---|---|---|---|
| `checkpoint_name` | `str` | Gemini model to use | `"gemini-2.5-computer-use-preview-10-2025"` |
| `model_client` | `genai.Client` | Gemini client | Auto-created |
| `temperature` | `float` | Sampling temperature | `1.0` |
| `top_p` | `float` | Top-p sampling | `0.95` |
| `top_k` | `int` | Top-k sampling | `40` |
| `max_output_tokens` | `int` | Maximum response tokens | `8192` |
| `excluded_predefined_functions` | `list[str]` | Predefined functions to exclude | `[]` |
| `validate_api_key` | `bool` | Validate key on init | `True` |
Example:
```python
agent = GeminiAgent.create(
    checkpoint_name="gemini-2.5-computer-use-preview-10-2025",
    temperature=0.7,
    max_output_tokens=4096,
)
```
### OpenAIChatAgent
```python
from hud.agents import OpenAIChatAgent
```
OpenAI-compatible chat.completions agent. Works with any endpoint implementing the OpenAI schema (vLLM, Ollama, Together, etc.).
Config Parameters:
| Parameter | Type | Description | Default |
|---|---|---|---|
| `checkpoint_name` | `str` | Model name | `"gpt-5-mini"` |
| `openai_client` | `AsyncOpenAI` | OpenAI-compatible client | `None` |
| `api_key` | `str` | API key (if not using client) | `None` |
| `base_url` | `str` | Base URL (if not using client) | `None` |
| `completion_kwargs` | `dict` | Extra args for completions | `{}` |
Example:
```python
from hud.agents import OpenAIChatAgent

# Using base_url and api_key
agent = OpenAIChatAgent.create(
    base_url="http://localhost:11434/v1",  # Ollama
    api_key="not-needed",
    checkpoint_name="llama3.1",
    completion_kwargs={"temperature": 0.2},
)

# Or with a custom client
from openai import AsyncOpenAI

agent = OpenAIChatAgent.create(
    openai_client=AsyncOpenAI(base_url="http://localhost:8000/v1"),
    checkpoint_name="served-model",
)
```
## Usage Examples

### With Scenarios
```python
from hud import Environment
from hud.agents import ClaudeAgent

# Define environment with scenario
env = Environment("browser").connect_hub("hud-evals/browser")

@env.scenario("shopping")
async def shopping(instruction: str, start_url: str):
    navigate(url=start_url)  # Direct function call for local tools
    answer = yield instruction
    result = check_cart()
    yield 1.0 if result["has_items"] else 0.0

# Run agent on task
agent = ClaudeAgent.create()
task = env("shopping", instruction="Add laptop to cart", start_url="https://shop.example.com")
result = await agent.run(task, max_steps=20)
print(f"Reward: {result.reward}, Done: {result.done}")
```
### With Remote Environment
```python
from hud import Environment
from hud.agents import OperatorAgent

# Connect to a remote environment
env = Environment("browser").connect_hub("hud-evals/browser")

# Create task from remote scenario
task = env("web-task", instruction="Find the price of the product")

agent = OperatorAgent.create()
result = await agent.run(task, max_steps=20)
```
### Auto-Respond Mode

When `auto_respond=True`, the agent uses a `ResponseAgent` to decide whether to continue or stop after each model response:
```python
agent = ClaudeAgent.create(
    auto_respond=True,  # Uses HUD inference gateway
    verbose=True,
)
```
## See Also