hud.eval() is the primary way to run evaluations. It creates an EvalContext with telemetry, handles parallel execution, and integrates with the HUD platform.
hud.eval()
Parameters
| Parameter | Type | Description | Default |
|---|---|---|---|
| source | Task \| list[Task] \| str \| None | Task objects from env(), task slugs, or None | None |
| variants | dict[str, Any] \| None | A/B test configuration (lists expand to combinations) | None |
| group | int | Runs per variant for statistical significance | 1 |
| group_ids | list[str] \| None | Custom group IDs for parallel runs | None |
| job_id | str \| None | Job ID to link traces to | None |
| api_key | str \| None | API key for backend calls | None |
| max_concurrent | int \| None | Maximum concurrent evaluations | None |
| trace | bool | Send telemetry to backend | True |
| quiet | bool | Suppress console output | False |
Source Types
The source parameter accepts a single Task object from env(), a list of Task objects, a task slug (str), or None.
Variants
Test multiple configurations in parallel. List values in variants expand into every combination.
Groups
Run each variant multiple times for statistical significance. The total number of runs is len(evals) × len(variant_combinations) × group.
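
The variant expansion and run count described above can be sketched in plain Python. This is an illustration of the arithmetic only, not hud's implementation: expand_variants and total_runs are hypothetical helper names, the model names are placeholders, and treating non-list values as fixed settings is an assumption.

```python
from itertools import product


def expand_variants(variants):
    """Expand list-valued entries into every combination (hypothetical helper)."""
    if not variants:
        return [{}]  # no variants: a single, empty assignment
    keys = list(variants)
    # Assumption: a non-list value is a fixed setting shared by all combinations.
    value_lists = [v if isinstance(v, list) else [v] for v in variants.values()]
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]


def total_runs(n_evals, variants, group):
    """len(evals) x len(variant_combinations) x group."""
    return n_evals * len(expand_variants(variants)) * group


variants = {"model": ["model-a", "model-b"], "temperature": [0.0, 0.7]}
combos = expand_variants(variants)
for combo in combos:
    print(combo)

# 3 evals x 4 variant combinations x 5 runs per variant
print(total_runs(3, variants, group=5))  # 60
```

Two lists of two values each give four combinations, so three evals with group=5 produce 60 runs in total.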
Concurrency Control
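The max_concurrent parameter caps how many evaluations run at the same time. The sketch below illustrates that kind of cap with asyncio.Semaphore. It is not hud's implementation: run_limited and fake_eval are hypothetical names, and reading None as "no cap" is an assumption based on the parameter's default.

```python
import asyncio


async def run_limited(jobs, max_concurrent=None):
    """Run zero-argument coroutine functions with an optional concurrency cap.

    Returns (results, peak), where peak is the largest number of jobs
    that were in flight at the same moment.
    """
    jobs = list(jobs)
    # Assumption: None means "no cap", so allow every job to start at once.
    limit = max_concurrent if max_concurrent is not None else max(len(jobs), 1)
    gate = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def guarded(job):
        nonlocal in_flight, peak
        async with gate:  # waits here while `limit` jobs are already running
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await job()
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak


async def fake_eval(i):
    await asyncio.sleep(0.01)  # stands in for one evaluation
    return i * i


jobs = [lambda i=i: fake_eval(i) for i in range(6)]
results, peak = asyncio.run(run_limited(jobs, max_concurrent=2))
print(results)  # [0, 1, 4, 9, 16, 25]
print(peak)     # 2
```

asyncio.gather returns results in submission order, so the cap changes when each job runs but not where its result appears.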
EvalContext
EvalContext extends Environment with evaluation tracking.
Properties
| Property | Type | Description |
|---|---|---|
| trace_id | str | Unique trace identifier |
| eval_name | str | Evaluation name |
| prompt | str \| None | Task prompt (from scenario or task) |
| variants | dict[str, Any] | Current variant assignment |
| reward | float \| None | Evaluation reward (settable) |
| answer | str \| None | Submitted answer |
| error | BaseException \| None | Error if failed |
| results | list[EvalContext] | Results from parallel runs |
| headers | dict[str, str] | Trace headers for HTTP requests |
| job_id | str \| None | Parent job ID |
| group_id | str \| None | Group ID for parallel runs |
| index | int | Index in parallel execution |
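
The variants, reward, and error properties are enough to summarise a parallel run. The sketch below averages rewards per variant assignment. It does not use hud: FakeResult is a hypothetical stand-in that mirrors only those three properties, and skipping failed or unscored runs is an assumption about how you might want to aggregate.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from statistics import mean
from typing import Any, Optional


@dataclass
class FakeResult:
    """Hypothetical stand-in exposing a few EvalContext-like properties."""
    variants: dict[str, Any] = field(default_factory=dict)
    reward: Optional[float] = None
    error: Optional[BaseException] = None


def mean_reward_by_variant(results):
    """Average the reward of successful, scored runs for each variant assignment."""
    buckets = defaultdict(list)
    for r in results:
        # Assumption: runs that failed or never set a reward are left out.
        if r.error is None and r.reward is not None:
            key = tuple(sorted(r.variants.items()))  # hashable form of the dict
            buckets[key].append(r.reward)
    return {key: mean(rewards) for key, rewards in buckets.items()}


runs = [
    FakeResult({"model": "model-a"}, reward=1.0),
    FakeResult({"model": "model-a"}, reward=0.0),
    FakeResult({"model": "model-b"}, reward=1.0),
    FakeResult({"model": "model-b"}, error=RuntimeError("boom")),
]
summary = mean_reward_by_variant(runs)
for key, avg in summary.items():
    print(dict(key), avg)
```

With real contexts, the same function could be applied to the list held in the results property, provided the variant values are hashable.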
Methods
All Environment methods are available, plus:
Headers for Telemetry
Inside an eval context, trace headers are automatically injected into HTTP requests. The same headers are exposed through the headers property.
Working with Environments
The recommended pattern is to use async with env(...) directly.
Results
After parallel runs complete, access the individual runs through the results property on the context.
See Also
- Environments - Environment class reference
- A/B Evals - Variants and groups guide
- Deploy - Running evals at scale
- hud eval CLI - Command-line interface