@greynewell/mcpbr 0.9.1 → 0.10.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (2)
  1. package/README.md +14 -1
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -54,6 +54,8 @@ mcpbr runs controlled experiments: same model, same tasks, same environment - th
  - **Real GitHub issues** from SWE-bench (not toy examples)
  - **Reproducible results** via Docker containers with pinned dependencies
 
+ > Read the full origin story: **[Why I Built mcpbr](https://greynewell.com/blog/why-i-built-mcpbr/)** — the problem, the approach, and where the project is headed.
+
  ## Supported Benchmarks
 
  mcpbr supports 30+ benchmarks across 10 categories through a flexible abstraction layer:
@@ -727,6 +729,17 @@ Run SWE-bench evaluation with the configured MCP server.
  | `--smtp-port PORT` | | SMTP server port (default: 587) |
  | `--smtp-user USER` | | SMTP username for authentication |
  | `--smtp-password PASS` | | SMTP password for authentication |
+ | `--sampling-strategy TEXT` | | Task sampling strategy (`sequential`, `random`, `stratified`) |
+ | `--random-seed INT` | | Random seed for reproducible sampling |
+ | `--stratify-field TEXT` | | Field to stratify by (requires `--sampling-strategy stratified`) |
+ | `--notify-slack URL` | | Slack webhook URL for completion notifications |
+ | `--notify-discord URL` | | Discord webhook URL for completion notifications |
+ | `--notify-email JSON` | | Email config as JSON string |
+ | `--slack-bot-token TOKEN` | | Slack bot token (`xoxb-...`) for uploading results.json to a channel |
+ | `--slack-channel ID` | | Slack channel ID for file uploads (used with `--slack-bot-token`) |
+ | `--github-token TOKEN` | | GitHub token for auto-creating a Gist with full results (linked in notifications) |
+ | `--wandb/--no-wandb` | | Enable/disable Weights & Biases logging |
+ | `--wandb-project TEXT` | | W&B project name |
  | `--profile` | | Enable comprehensive performance profiling (tool latency, memory, overhead) |
  | `--help` | `-h` | Show help message |
 
@@ -1489,4 +1502,4 @@ MIT - see [LICENSE](LICENSE) for details.
 
  ---
 
- Built by [Grey Newell](https://greynewell.com)
+ Built by [Grey Newell](https://greynewell.com) | [Why I Built mcpbr](https://greynewell.com/blog/why-i-built-mcpbr/) | [About](https://mcpbr.org/about/)
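The README hunk above documents three values for `--sampling-strategy` (`sequential`, `random`, `stratified`) together with `--random-seed` and `--stratify-field`, but the diff does not show how mcpbr implements them. The following Python sketch illustrates what such strategies commonly mean. The function `sample_tasks`, its parameters, and the round-robin approach to stratification are all assumptions made for illustration and are not taken from the mcpbr source.

```python
import random
from collections import defaultdict


def sample_tasks(tasks, n, strategy="sequential", seed=None, stratify_field=None):
    """Pick up to n tasks. Hypothetical helper, not mcpbr's implementation."""
    if strategy == "sequential":
        # Take the first n tasks in their original order.
        return tasks[:n]

    # A dedicated generator makes the result depend only on the seed.
    rng = random.Random(seed)

    if strategy == "random":
        return rng.sample(tasks, min(n, len(tasks)))

    if strategy == "stratified":
        if stratify_field is None:
            raise ValueError("stratified sampling needs a stratify_field")
        # Group tasks by the chosen field, then shuffle within each group.
        groups = defaultdict(list)
        for task in tasks:
            groups[task[stratify_field]].append(task)
        for members in groups.values():
            rng.shuffle(members)
        # Draw one task per group in turn so small groups stay represented.
        picked = []
        while len(picked) < n and any(groups.values()):
            for key in sorted(groups):
                if groups[key] and len(picked) < n:
                    picked.append(groups[key].pop())
        return picked

    raise ValueError(f"unknown strategy: {strategy}")


# Made-up task list: eight tasks from repo "a", two from repo "b".
tasks = [{"id": i, "repo": "a" if i < 8 else "b"} for i in range(10)]
first_three = sample_tasks(tasks, 3)
balanced = sample_tasks(tasks, 4, "stratified", seed=7, stratify_field="repo")
```

With a fixed seed, repeated calls return the same selection, which is the property the `--random-seed` description refers to. The round-robin draw is only one way to stratify; proportional allocation per group is an equally plausible reading of the flag.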
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@greynewell/mcpbr",
- "version": "0.9.1",
+ "version": "0.10.2",
  "description": "Model Context Protocol Benchmark Runner - CLI tool for evaluating MCP servers",
  "keywords": [
  "mcpbr",