CLI
Running Evaluations
The mcplab run command and all its options.
Basic Run
Point mcplab at your eval config to run all scenarios.
npx @inspectr/mcplab run -c eval.yaml
Filter Scenarios
Run a single scenario by its ID using -s. Pass the flag multiple times to run several.
npx @inspectr/mcplab run -c eval.yaml -s basic-test
npx @inspectr/mcplab run -c eval.yaml -s test-one -s test-two
Select Agents
By default, all agents defined in the config are used. Narrow the selection by passing a comma-separated list of agent names to --agents, or expand it to every agent defined in the library with --agents-all.
npx @inspectr/mcplab run -c eval.yaml --agents claude,gpt4o
npx @inspectr/mcplab run -c eval.yaml --agents-all
Variance Runs
Run each scenario multiple times to measure consistency. The -n flag sets the number of runs per scenario. Results include a pass rate across all runs.
npx @inspectr/mcplab run -c eval.yaml -n 5
Interactive Mode
Interactive mode prompts you to pick a config and scenarios at the terminal instead of specifying them as flags. Useful for ad-hoc runs during development.
npx @inspectr/mcplab run --interactive
Annotate and Organise Runs
Use --run-note to attach a human-readable note to a run so it is easier to identify in reports and the App. Change the output directory with --runs-dir.
npx @inspectr/mcplab run -c eval.yaml --run-note "after refactor"
npx @inspectr/mcplab run -c eval.yaml --runs-dir ./my-runs
Batch Runs: Directory Mode
Pass a directory path to -c/--config and MCPLab will recursively discover and run every .yaml and .yml file under that directory. This is useful for running an entire suite of eval configs in one command.
Use --bail to stop the batch after the first config that has any failing scenario (fail-fast mode). Without --bail, all configs run regardless of individual failures.
npx @inspectr/mcplab run -c ./evals/
npx @inspectr/mcplab run -c ./evals/ --bail
Exit Codes
mcplab run exits 0 when all scenarios pass and non-zero when any scenario fails. Use this in CI to fail a pipeline on a regression.
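One way to use the exit code in CI is a small wrapper that runs the command and propagates its status. The sketch below is illustrative only: ci_gate is a hypothetical helper name, not part of mcplab, and it relies solely on the behaviour described above (0 on success, non-zero on any failure). The specific non-zero values mcplab returns are not documented here, so the wrapper simply passes through whatever it receives.

```shell
#!/usr/bin/env sh
# ci_gate: hypothetical helper (not part of mcplab).
# Runs the command given as arguments and propagates its exit status.
# Assumes only what the docs state: 0 = all scenarios passed,
# non-zero = at least one scenario failed.
ci_gate() {
  if "$@"; then
    echo "evals passed"
    return 0
  else
    status=$?   # exit status of the command that just failed
    echo "evals failed (exit code $status)" >&2
    return "$status"
  fi
}

# Example use in a pipeline step (commented out so the sketch runs standalone):
#   ci_gate npx @inspectr/mcplab run -c eval.yaml || exit 1
```

Most CI systems already fail a step when a command exits non-zero, so calling npx @inspectr/mcplab run directly is usually enough; a wrapper like this only helps when you want a custom message or extra cleanup before the step fails.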