Scenario Configuration

Detailed guide for writing MCPLab scenarios and assertions.

Scenario Fields

A scenario defines one test case the agent must execute against available MCP servers.

  • `id`: unique scenario identifier. Use kebab-case.
  • `prompt`: the exact task instruction given to the agent.
  • `agent`: optional pinned agent id. Omit it to run the scenario against all selected agents.
  • `servers`: labels of the MCP servers available to the agent in this scenario.
  • `mcp_servers`: concrete MCP server definitions, each either a `ref` to a shared server or an inline server config.
  • `eval`: assertions on tool usage, tool call sequence, and response output.
  • `extract`: values captured from the final text by a regex whose named group `value` holds the captured value.

Minimal Scenario Example

Single scenario baseline:
scenarios:
  - id: weather-baseline
    agent: claude-haiku
    servers: [weather-api]
    mcp_servers:
      - id: weather-api
        transport: http
        url: http://localhost:3000/mcp
    prompt: Get today's forecast for Brussels and summarize in one sentence.
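
Because `agent` is optional, the same baseline can be run against every selected agent by leaving the field out. A minimal sketch, reusing the server definition above; the scenario id `weather-baseline-all-agents` is a hypothetical name chosen for illustration:

```yaml
scenarios:
  - id: weather-baseline-all-agents   # hypothetical id
    # No `agent` key: the scenario runs for all selected agents.
    servers: [weather-api]
    mcp_servers:
      - id: weather-api
        transport: http
        url: http://localhost:3000/mcp
    prompt: Get today's forecast for Brussels and summarize in one sentence.
```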

Tool and Response Assertions

Add `eval` only after the baseline prompt run works, then tighten expectations incrementally.

For a full assertion catalog with examples for every type, see Reference / Tool and Response Assertions.

  • Use `required_tools` to enforce critical tool calls.
  • Use `forbidden_tools` to block unsafe or irrelevant tools.
  • Use literal response assertions (`contains`, `equals`, etc.) for stable text checks.
  • Use regex assertions for variable outputs (numbers, IDs, timestamps).
  • Use JSONPath assertions when the response is structured JSON.

Scenario with assertions:
scenarios:
  - id: weather-asserted
    servers: [weather-api]
    mcp_servers:
      - ref: weather-api
    prompt: Return JSON with city and temperature_c for Brussels.
    eval:
      tool_constraints:
        required_tools: [get_weather]
        forbidden_tools: [delete_city]
      tool_sequence:
        allow:
          - [get_weather]
      response_assertions:
        - type: regex
          pattern: '([0-9]+)(\.[0-9]+)?\s?°?C'
        - type: jsonpath
          path: $.city
          equals: Brussels
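
One way to follow the advice to tighten expectations incrementally is to begin with a single tool constraint and add the remaining checks only once it passes. The sketch below shows that first step. It uses only keys from the example above and assumes `eval` accepts a subset of them; the id `weather-asserted-step-1` is hypothetical:

```yaml
scenarios:
  - id: weather-asserted-step-1   # hypothetical id for the first iteration
    servers: [weather-api]
    mcp_servers:
      - ref: weather-api
    prompt: Return JSON with city and temperature_c for Brussels.
    eval:
      tool_constraints:
        required_tools: [get_weather]   # the only assertion at this stage
```

Once this passes reliably, `forbidden_tools`, `tool_sequence`, and `response_assertions` can be added one at a time, so that a new failure points at the assertion just introduced.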

Extract Structured Values

Use `extract` when you want reusable values from the final response for downstream checks or reporting.

Extract with named capture group:
scenarios:
  - id: extract-temperature
    servers: [weather-api]
    prompt: Report the current temperature in Brussels in Celsius.
    extract:
      - name: brussels_temp_c
        from: final_text
        regex: 'Temperature:\s*(?<value>[0-9]+(\.[0-9]+)?)\s*°?C'
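
The regex above only matches if the final text contains a `Temperature:` label, which the prompt does not request. A possible variant, assuming the agent follows formatting instructions, pins the output format in the prompt so the `value` group has something predictable to capture. The id and the prompt wording are illustrative, not taken from MCPLab:

```yaml
scenarios:
  - id: extract-temperature-pinned   # hypothetical id
    servers: [weather-api]
    # Folded block scalar, so the colon inside the prompt is not parsed as a YAML key.
    prompt: >-
      Report the current temperature in Brussels in Celsius,
      formatted exactly as "Temperature: <number>°C".
    extract:
      - name: brussels_temp_c
        from: final_text
        regex: 'Temperature:\s*(?<value>[0-9]+(\.[0-9]+)?)\s*°?C'
```

With a response such as `Temperature: 12.5°C`, the `value` group would capture `12.5`.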

Common Configuration Patterns

  • Start with one scenario and one run (`-n 1`) until the results are stable.
  • Keep prompts deterministic; avoid broad, open-ended tasks at first.
  • Prefer `mcp_servers` refs for shared servers across many scenarios.
  • Add one assertion at a time so failures are easy to diagnose.
  • Split large workflows into multiple scenarios instead of one giant prompt.
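
As an illustration of the last three points, a larger weather workflow could be split into two narrow scenarios that share one server `ref` and carry one kind of assertion each. This is a sketch using only keys that appear earlier in this guide. The ids and the first prompt are hypothetical, and it assumes `weather-api` is defined as a shared server that `ref` can resolve:

```yaml
scenarios:
  # Scenario 1: checks only that the right tool is called.
  - id: weather-tool-call   # hypothetical id
    servers: [weather-api]
    mcp_servers:
      - ref: weather-api
    prompt: Get today's forecast for Brussels.
    eval:
      tool_constraints:
        required_tools: [get_weather]

  # Scenario 2: checks only the shape of the response.
  - id: weather-json-city   # hypothetical id
    servers: [weather-api]
    mcp_servers:
      - ref: weather-api
    prompt: Return JSON with city and temperature_c for Brussels.
    eval:
      response_assertions:
        - type: jsonpath
          path: $.city
          equals: Brussels
```

Keeping each scenario to one concern means a failure in either one identifies the behavior that regressed without reading through a long transcript.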

Scenario Validation Checklist

  • Scenario id is unique and readable.
  • `servers` labels and `mcp_servers` mapping are coherent.
  • Pinned `agent` exists in your agents list/library.
  • Assertions test behavior, not just cosmetic phrasing.
  • A focused command can run this scenario in isolation.

Run one scenario:
npx @inspectr/mcplab run -c mcplab/evals/eval.yaml -s weather-asserted -n 1