The HUD SDK provides a comprehensive set of tools for environment interaction. All tools inherit from BaseTool and return standardized MCP content blocks.
Base Classes
from hud.tools import BaseTool
Abstract base class for all MCP tools. Provides standardized output formatting and MCP registration.
Constructor Parameters:
| Parameter | Type | Description | Default |
|---|
env | Any | Optional stateful context (game, executor, browser) | None |
name | str | Tool name for MCP registration | Auto from class |
title | str | Human-readable display name | Auto from class |
description | str | Tool description | Auto from docstring |
Abstract Methods:
async def __call__(self, **kwargs: Any) -> list[ContentBlock]
Properties:
mcp - FastMCP FunctionTool wrapper for registration
Usage:
class MyTool(BaseTool):
async def __call__(self, param: str) -> list[ContentBlock]:
result = await self.env.do_something(param)
return [TextContent(text=result, type="text")]
# MCPServer automatically handles BaseTool instances
tool = MyTool(env=my_context)
mcp_server.add_tool(tool) # No need for .mcp property
BaseHub
from hud.tools import BaseHub
A composition-friendly FastMCP server for organizing tools into nested namespaces.
Key Features:
- Internal tool dispatcher pattern
- Automatic resource catalog generation
- Hidden internal tools from MCP clients
Usage:
hub = BaseHub("evaluators")
@hub.tool() # Internal tool, hidden from agents
async def url_match(url: str) -> EvaluationResult:
return EvaluationResult(reward=1.0, done=True)
# Access via dispatcher when mounted on server
# Agents call: evaluators(name="url_match", arguments={"url": "..."})
from hud.tools import AgentTool
Wraps a scenario as a tool that can be called by another agent. Essential for building hierarchical agent systems where an orchestrator delegates to specialized subagents.
Constructor Parameters:
| Parameter | Type | Description | Default |
|---|
task | Task | Task template from env("scenario_name") | Required |
model | str | Model for subagent (via gateway) | None |
agent | type[MCPAgent] | Custom agent class | None |
agent_params | dict | Additional agent parameters | {} |
name | str | Tool name for orchestrator | From scenario |
description | str | Tool description | Auto-generated |
trace | bool | Enable tracing for standalone runs | False |
Must provide either model or agent, not both.
Eval-Only Parameters:
Parameters with | None = None are hidden from the orchestrator but available for evaluation:
@env.scenario("investigate")
async def investigate(
query: str, # Visible - orchestrator passes this
expected_finding: str | None = None, # Hidden - only used in eval scoring
):
response = yield f"Investigate: {query}"
# Scoring uses expected_finding but orchestrator never sees it
if expected_finding and response:
yield 1.0 if expected_finding in response else 0.5
else:
yield 1.0 if response else 0.0
Usage:
from hud import Environment
from hud.tools import AgentTool
# Subagent environment with scenario
sentry_env = Environment(name="sentry-agent")
@sentry_env.scenario("investigate")
async def investigate_sentry(query: str):
yield f"Investigate Sentry: {query}"
# Create orchestrator
orchestrator = Environment(name="orchestrator")
# Wrap subagent scenario as tool
tool = AgentTool(
sentry_env("investigate"), # Task template
model="gpt-4o-mini",
name="investigate_sentry",
description="Investigate errors in Sentry",
)
orchestrator.add_tool(tool.mcp)
# Now orchestrator agent can call investigate_sentry(query="...")
Trace Continuity:
When called from within an eval context, AgentTool automatically:
- Inherits the parent’s trace_id
- Skips duplicate trace registration
- Routes all inference/tool calls to the parent trace
async with hud.eval(task) as ctx:
agent = create_agent("gpt-4o")
result = await agent.run(ctx)
# All subagent activity appears in this single trace
To create a separate trace per AgentTool call (useful for parallel subagents),
set trace_subagent=True. The tool result includes the sub-trace id in
result.meta["trace_id"].
See Also: Ops Diagnostics Cookbook for a complete hierarchical agent example.
from hud.tools import BashTool
Execute bash commands in a persistent shell session.
Constructor Parameters:
| Parameter | Type | Description | Default |
|---|
session | _BashSession | Pre-configured bash session | None |
Tool Parameters:
| Parameter | Type | Description | Default |
|---|
command | str | Bash command to execute | Required |
restart | bool | Restart the bash session | False |
Example:
bash = BashTool()
result = await bash(command="ls -la")
# Returns: [TextContent(text="file1.txt\nfile2.txt", type="text")]
# Restart session
await bash(restart=True)
from hud.tools import EditTool
File system editor with undo support.
Constructor Parameters:
| Parameter | Type | Description | Default |
|---|
file_history | dict[Path, list[str]] | Edit history per file | None |
Commands:
| Command | Parameters | Description |
|---|
view | path, view_range | View file or directory contents |
create | path, file_text | Create new file |
str_replace | path, old_str, new_str | Replace string in file |
insert | path, insert_line, new_str | Insert text at line |
undo_edit | path | Undo last edit |
Example:
editor = EditTool()
# View file
await editor(command="view", path="/path/to/file.py", view_range=[1, 20])
# Create file
await editor(command="create", path="/new/file.py", file_text="print('hello')")
# Replace text
await editor(command="str_replace", path="/file.py",
old_str="old_text", new_str="new_text")
from hud.tools import PlaywrightTool
Web automation using Playwright browser.
Constructor Parameters:
| Parameter | Type | Description | Default |
|---|
page | Page | Existing Playwright page | None |
cdp_url | str | Chrome DevTools Protocol URL | None |
Actions:
| Action | Parameters | Description |
|---|
navigate | url, wait_for_load_state | Navigate to URL |
screenshot | - | Capture page screenshot |
click | selector | Click element |
type | selector, text | Type text in element |
get_page_info | - | Get page title and URL |
wait_for_element | selector | Wait for element to appear |
Example:
tool = PlaywrightTool()
# Navigate
await tool(action="navigate", url="https://example.com")
# Click button
await tool(action="click", selector="button.submit")
# Type text
await tool(action="type", selector="input#search", text="query")
from hud.tools import JupyterTool
Execute Python code in a Jupyter kernel.
Constructor Parameters:
| Parameter | Type | Description | Default |
|---|
url_suffix | str | (Optional) Kernel gateway host:port | ”localhost:8888” |
kernel_name | str | (Optional) Kernel name | ”python3” |
kernel_id | str | (Optional) If set, connect to the existed kernel with kernel_id | ""(empty string) |
Tool Parameters:
| Parameter | Type | Description | Default |
|---|
code | str | Python code to execute | Required |
execution_timeout | int | Execution timeout in seconds | 15 |
Example: Tool Calling
jupyter_tool = JupyterTool()
result = await jupyter_tool(code="print(\"hello world!\")")
# Returns: [TextContent(text="hello world\n", type="text")]
Example: Reuse the same kernel
# When initializing, register the kernel
jupyter_tool = JupyterTool()
JupyterTool.register_shared_kernel("my-jupyter", jupyter_tool.get_kernel_id())
# Reuse the same kernel in other places:
jupyter_tool_reuse = JupyterTool.from_shared_kernel("my-jupyter")
from hud.tools import HudComputerTool
Universal computer control with automatic scaling and executor selection.
Constructor Parameters:
| Parameter | Type | Description | Default |
|---|
executor | BaseExecutor | Executor to use | Auto-detect |
platform_type | "auto", "xdo", "pyautogui" | Executor type | "auto" |
display_num | int | X display number | From env |
width | int | Agent screen width | 1280 |
height | int | Agent screen height | 720 |
rescale_images | bool | Rescale screenshots | True |
Actions:
| Action | Parameters | Description |
|---|
screenshot | - | Capture screen |
click | x, y, button, pattern, hold_keys | Mouse click |
write | text, enter_after, delay | Type text |
press | keys | Press key combination |
scroll | x, y, scroll_x, scroll_y | Scroll |
drag | start_x, start_y, end_x, end_y | Drag mouse |
Example:
computer = HudComputerTool()
# Take screenshot
await computer(action="screenshot")
# Click at coordinates
await computer(action="click", x=100, y=200)
# Type text
await computer(action="write", text="Hello World", enter_after=True)
# Press hotkey
await computer(action="press", keys=["ctrl", "c"])
from hud.tools import AnthropicComputerTool
Computer control optimized for Anthropic’s Claude models.
Features:
- Pre-configured for 1280x720 resolution
- Optimized action names for Claude
- Built-in screenshot scaling
from hud.tools import OpenAIComputerTool
Computer control optimized for OpenAI models.
Features:
- Pre-configured for 1920x1080 resolution
- Simplified action interface
- No automatic screenshot scaling
Executors
Executors provide platform-specific implementations for computer control actions.
BaseExecutor
from hud.tools.executors import BaseExecutor
Abstract base providing simulation mode for all actions.
Core Methods:
click(x, y, button, pattern, hold_keys) - Mouse click
write(text, enter_after, delay) - Type text
press(keys) - Press key combination
scroll(x, y, scroll_x, scroll_y) - Scroll
drag(start_x, start_y, end_x, end_y) - Drag mouse
screenshot() - Capture screen
get_screen_size() - Get display dimensions
PyAutoGUIExecutor
from hud.tools.executors import PyAutoGUIExecutor
Cross-platform executor using PyAutoGUI library.
Features:
- Works on Windows, macOS, Linux
- Real mouse/keyboard control
- Screenshot capture
- Automatic failsafe
Example:
executor = PyAutoGUIExecutor()
computer = HudComputerTool(executor=executor)
XDOExecutor
from hud.tools.executors import XDOExecutor
Linux/X11 executor using xdotool.
Features:
- Native X11 integration
- Faster than PyAutoGUI on Linux
- Support for X display selection
- Window management capabilities
Example:
executor = XDOExecutor(display_num=1)
computer = HudComputerTool(executor=executor)
Common Types
ContentBlock
MCP standard output format (from mcp.types):
from mcp.types import TextContent, ImageContent
# Text output
TextContent(text="Operation complete", type="text")
# Image output
ImageContent(data="base64_data", mimeType="image/png", type="image")
EvaluationResult
from hud.tools.types import EvaluationResult
result = EvaluationResult(
reward=0.8, # Score 0-1
done=True, # Task complete
content="Details", # Optional text
info={"score": 80} # Metadata
)
ContentResult
from hud.tools.types import ContentResult
# Helper for building complex outputs
result = ContentResult(
output="Success message",
error="Error if any",
base64_image="screenshot_data",
system="System message"
)
# Convert to ContentBlocks
blocks = result.to_content_blocks()
Integration Examples
from hud.server import MCPServer
from hud.tools import BashTool, EditTool
mcp = MCPServer(name="my-env")
# MCPServer handles BaseTool instances automatically
bash = BashTool()
mcp.add_tool(bash) # Internally uses bash.mcp
editor = EditTool()
mcp.add_tool(editor) # Same here
from hud.tools import BaseTool
from mcp.types import TextContent
class DatabaseTool(BaseTool):
def __init__(self, db_connection):
super().__init__(
env=db_connection,
name="database",
title="Database Query Tool",
description="Execute SQL queries"
)
async def __call__(self, query: str) -> list[ContentBlock]:
try:
results = await self.env.execute(query)
return [TextContent(text=str(results), type="text")]
except Exception as e:
return [TextContent(text=f"Error: {e}", type="text")]
Hub Pattern for Evaluators
from hud.tools import BaseHub
from hud.tools.types import EvaluationResult
evaluators = BaseHub("evaluate")
@evaluators.tool("text_contains")
async def check_text(text: str, target: str) -> EvaluationResult:
return EvaluationResult(
reward=1.0 if target in text else 0.0,
done=True,
content=f"Checking if '{target}' in text"
)
# Use in environment
@mcp.tool()
async def evaluate(name: str, **kwargs):
return await evaluators.call_tool(name, kwargs)
Callback Functions
Callback functions help you monitor certain types of actions and hook
behaviors to them. You can manage callbacks using three main methods,
add_callback(self, event_type: str, callback: Callable)
remove_callback(self, event_type: str, callback: Callable)
_trigger_callbacks(self, event_type: str, **kwargs)
Special note: Make sure the callback functions are defined by async def
For example, when you want to take a screenshot every
time the webpage changes in Playwright
from hud.tools.playwright import PlaywrightTool
import base64
class MyPlaywrightTool(PlaywrightTool):
def __init__(self, context: Any = None, cdp_url: str | None = None) -> None:
super().__init__(cdp_url=cdp_url)
self.screenshots: List = []
self.add_callback("webpage_change", self._on_webpage_change)
async def _on_webpage_change(self, **kwargs) -> None:
"""Callback to capture screenshots on webpage changes"""
try:
await self._ensure_browser()
if self.page:
screenshot_bytes = await self.page.screenshot(full_page=True)
screenshot_b64 = base64.b64encode(screenshot_bytes).decode("utf-8")
self.screenshots.append(screenshot_b64)
except Exception as e:
pass
async def navigate(self, url: str, wait_for_load_state="networkidle"):
result = await super().navigate(url, wait_for_load_state)
# Trigger callbacks with event data
await self._trigger_callbacks("webpage_change")
return result
async def click(self, selector: str, **kwargs):
result = await super().click(selector, **kwargs)
await self._trigger_callbacks("webpage_change")
return result
See Also