mcpbr-cli 0.3.25 → 0.3.28

Files changed (2)
  1. package/README.md +48 -1
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -296,6 +296,9 @@ dataset: "SWE-bench/SWE-bench_Lite"
  sample_size: 10
  timeout_seconds: 300
  max_concurrent: 4
+
+ # Optional: disable default logging (logs are saved to output_dir/logs/ by default)
+ # disable_logs: true
  ```
 
  4. **Run the evaluation:**
@@ -519,7 +522,8 @@ Run SWE-bench evaluation with the configured MCP server.
  | `--output-junit PATH` | | Path to save JUnit XML report (for CI/CD integration) |
  | `--verbose` | `-v` | Verbose output (`-v` summary, `-vv` detailed) |
  | `--log-file PATH` | `-l` | Path to write raw JSON log output (single file) |
- | `--log-dir PATH` | | Directory to write per-instance JSON log files |
+ | `--log-dir PATH` | | Directory to write per-instance JSON log files (default: `output_dir/logs/`) |
+ | `--disable-logs` | | Disable detailed execution logs (overrides default and config) |
  | `--task TEXT` | `-t` | Run specific task(s) by instance_id (repeatable) |
  | `--prompt TEXT` | | Override agent prompt (use `{problem_statement}` placeholder) |
  | `--baseline-results PATH` | | Path to baseline results JSON for regression detection |
@@ -773,6 +777,38 @@ Results saved to results.json
  }
  ```
 
+ ### Output Directory Structure
+
+ By default, mcpbr consolidates all outputs into a single timestamped directory:
+
+ ```text
+ .mcpbr_run_20260126_133000/
+ ├── config.yaml # Copy of configuration used
+ ├── evaluation_state.json # Task results and state
+ ├── logs/ # Detailed MCP server logs
+ │ ├── task_1_mcp.log
+ │ ├── task_2_mcp.log
+ │ └── ...
+ └── README.txt # Auto-generated explanation
+ ```
+
+ This makes it easy to:
+ - **Archive results**: `tar -czf results.tar.gz .mcpbr_run_*`
+ - **Clean up**: `rm -rf .mcpbr_run_*`
+ - **Share**: Just zip one directory
+
+ You can customize the output directory:
+
+ ```bash
+ # Custom output directory
+ mcpbr run -c config.yaml --output-dir ./my-results
+
+ # Or in config.yaml
+ output_dir: "./my-results"
+ ```
+
+ **Note:** The `--output-dir` CLI flag takes precedence over the `output_dir` config setting. This ensures that the README.txt file in the output directory reflects the final effective configuration values after all CLI overrides are applied.
+
  ### Markdown Report (`--report`)
 
  Generates a human-readable report with:
@@ -782,6 +818,17 @@ Generates a human-readable report with:
 
  ### Per-Instance Logs (`--log-dir`)
 
+ **Logging is enabled by default** to prevent data loss. Detailed execution traces are automatically saved to `output_dir/logs/` unless disabled.
+
+ To disable logging:
+ ```bash
+ # Via CLI flag
+ mcpbr run -c config.yaml --disable-logs
+
+ # Or in config file
+ disable_logs: true
+ ```
+
  Creates a directory with detailed JSON log files for each task run. Filenames include timestamps to prevent overwrites:
 
  ```text
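
Reviewer note: the README changes above describe a precedence order. `--output-dir` wins over `output_dir` in the config, the log directory defaults to `output_dir/logs/`, and `--disable-logs` overrides both the default and the config. As a rough illustration only (this is not mcpbr's actual code), those rules could be sketched in Python as follows. The function names `resolve_output_dir` and `resolve_log_dir` are hypothetical, the timestamp format is inferred from the example directory name `.mcpbr_run_20260126_133000`, and treating a disabled-logs setting as winning over an explicit `--log-dir` is an assumption the diff does not confirm.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def resolve_output_dir(cli_output_dir: Optional[str], config: dict, now: datetime) -> Path:
    """Pick the effective output directory: CLI flag, then config, then default."""
    if cli_output_dir is not None:
        # The CLI flag takes precedence over the config setting.
        return Path(cli_output_dir)
    if config.get("output_dir"):
        return Path(config["output_dir"])
    # Assumed default: a timestamped directory named like the README example.
    return Path(now.strftime(".mcpbr_run_%Y%m%d_%H%M%S"))


def resolve_log_dir(
    cli_log_dir: Optional[str],
    cli_disable_logs: bool,
    config: dict,
    output_dir: Path,
) -> Optional[Path]:
    """Return the log directory, or None when logging is disabled."""
    # Assumption: either the flag or the config key turns logging off,
    # even if a log directory was also given.
    if cli_disable_logs or config.get("disable_logs", False):
        return None
    if cli_log_dir is not None:
        return Path(cli_log_dir)
    # Documented default: logs live under the output directory.
    return output_dir / "logs"


if __name__ == "__main__":
    out = resolve_output_dir(None, {"output_dir": "./my-results"}, datetime.now())
    print(out, resolve_log_dir(None, False, {}, out))
```

Returning `None` for the disabled case keeps the "no logs" decision in one place, so callers only need a single check before writing per-instance files.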
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "mcpbr-cli",
- "version": "0.3.25",
+ "version": "0.3.28",
  "description": "Model Context Protocol Benchmark Runner - CLI tool for evaluating MCP servers",
  "keywords": [
  "mcpbr",