Getting Started
Setting Up Evaluations
Set up a robust evaluation workflow before running your first full test suite.
Recommended Project Layout
Use a consistent workspace layout so CLI and App commands resolve configs, libraries, and results predictably.
- Keep evaluation YAML files in mcplab/evals.
- Keep reusable servers and agents in library files.
- Store run output in mcplab/results/evaluation-runs.
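As a minimal sketch, the directories named in the list above can be scaffolded from the shell. Placing servers.yaml and agents.yaml directly under mcplab/ is an assumption about where the library files live; adjust the paths if your workspace differs.

```shell
#!/usr/bin/env sh
set -eu

# Create the directories for evaluation configs and run output.
mkdir -p mcplab/evals mcplab/results/evaluation-runs

# Create empty placeholder files only if they do not exist yet,
# so re-running the script never overwrites real configuration.
# ASSUMPTION: library files sit directly under mcplab/.
for f in mcplab/evals/eval.yaml mcplab/servers.yaml mcplab/agents.yaml; do
  [ -e "$f" ] || : > "$f"
done
```

The script is safe to re-run: mkdir -p ignores existing directories and the loop skips files that are already present.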
recommended layout
mcplab/
  evals/
    eval.yaml
  results/
    evaluation-runs/
  servers.yaml
  agents.yaml
Configure Auth and Environment
Set provider keys and server auth variables before running evaluations. Keep secret values in environment variables, not in committed YAML.
- Use auth.type: bearer + env for bearer-token server auth.
- Use auth.type: oauth_client_credentials for client-credentials flows.
- Use auth.type: oauth_authorization_code when interactive/browser OAuth is required.
.env example
ANTHROPIC_API_KEY=...
OPENAI_API_KEY=...
MY_SERVER_TOKEN=...
Preflight Checklist
- MCP endpoint URL is reachable and returns MCP responses.
- Scenario IDs are unique and agent references resolve.
- Server labels in scenarios[].servers match your intended MCP server entries.
- All required environment variables are set in your shell/session.
- You can run one scenario once successfully before scaling.
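The environment-variable item in the checklist above can be automated with a small shell helper. This is a sketch, not part of mcplab: check_env is a hypothetical function name, the variable names are taken from the .env example, and the optional .env loading assumes the file contains only simple KEY=value lines.

```shell
#!/usr/bin/env sh

# check_env (hypothetical helper): report which of the named variables
# are unset or empty. Prints each missing name to stderr and returns
# non-zero if at least one is missing.
check_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup that works in POSIX sh. Only pass trusted,
    # literal variable names here, since eval executes its argument.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Optionally load a local .env file into the session.
# ASSUMPTION: the file holds plain KEY=value lines that are valid shell.
if [ -f .env ]; then
  set -a        # export every variable assigned while sourcing
  . ./.env
  set +a
fi

# Variable names taken from the .env example; replace with your own.
check_env ANTHROPIC_API_KEY OPENAI_API_KEY MY_SERVER_TOKEN \
  || echo "Set the variables listed above before running evaluations." >&2
```

Running this before the first validation run surfaces a missing key immediately instead of partway through a scenario.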
first validation run
npx @inspectr/mcplab run -c mcplab/evals/eval.yaml -s setup-check -n 1
Next Setup Steps
- Add more scenarios after the baseline setup-check passes.
- Use --agents to compare models on the same scenarios.
- Open report.html or the App results view to inspect failures and tool traces.