@dotsetlabs/bellwether 0.11.0 → 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (62)
  1. package/CHANGELOG.md +128 -0
  2. package/README.md +107 -645
  3. package/dist/baseline/accessors.d.ts +1 -1
  4. package/dist/baseline/baseline-hash.js +23 -6
  5. package/dist/baseline/cloud-types.d.ts +281 -0
  6. package/dist/baseline/cloud-types.js +12 -0
  7. package/dist/baseline/converter.d.ts +1 -1
  8. package/dist/baseline/pr-comment-generator.js +2 -2
  9. package/dist/baseline/types.d.ts +1 -1
  10. package/dist/benchmark/benchmarker.d.ts +30 -0
  11. package/dist/benchmark/benchmarker.js +309 -0
  12. package/dist/benchmark/index.d.ts +6 -0
  13. package/dist/benchmark/index.js +5 -0
  14. package/dist/benchmark/types.d.ts +133 -0
  15. package/dist/benchmark/types.js +5 -0
  16. package/dist/cli/commands/auth.d.ts +0 -1
  17. package/dist/cli/commands/auth.js +0 -1
  18. package/dist/cli/commands/benchmark.d.ts +11 -0
  19. package/dist/cli/commands/benchmark.js +260 -0
  20. package/dist/cli/commands/check.js +11 -3
  21. package/dist/cli/commands/cloud/badge.js +2 -2
  22. package/dist/cli/commands/discover.js +1 -0
  23. package/dist/cli/commands/explore.js +11 -3
  24. package/dist/cli/index.js +2 -28
  25. package/dist/cli/output/terminal-reporter.d.ts +1 -1
  26. package/dist/cli/output/terminal-reporter.js +4 -24
  27. package/dist/cloud/http-client.d.ts +2 -2
  28. package/dist/cloud/http-client.js +6 -6
  29. package/dist/cloud/mock-client.d.ts +2 -2
  30. package/dist/cloud/mock-client.js +26 -26
  31. package/dist/cloud/types.d.ts +28 -28
  32. package/dist/config/defaults.d.ts +0 -14
  33. package/dist/config/defaults.js +0 -14
  34. package/dist/config/loader.d.ts +14 -0
  35. package/dist/config/loader.js +59 -0
  36. package/dist/config/template.js +0 -40
  37. package/dist/config/validator.d.ts +24 -164
  38. package/dist/config/validator.js +0 -85
  39. package/dist/constants/cloud.d.ts +0 -36
  40. package/dist/constants/cloud.js +1 -38
  41. package/dist/constants/core.d.ts +4 -20
  42. package/dist/constants/core.js +4 -20
  43. package/dist/constants/testing.d.ts +68 -8
  44. package/dist/constants/testing.js +153 -33
  45. package/dist/docs/contract.js +1 -2
  46. package/dist/index.d.ts +0 -2
  47. package/dist/index.js +0 -2
  48. package/dist/interview/schema-test-generator.js +320 -24
  49. package/dist/interview/types.d.ts +23 -0
  50. package/dist/logging/logger.js +4 -2
  51. package/dist/transport/http-transport.d.ts +6 -2
  52. package/dist/transport/http-transport.js +23 -9
  53. package/dist/transport/mcp-client.d.ts +13 -0
  54. package/dist/transport/mcp-client.js +108 -6
  55. package/dist/transport/types.d.ts +20 -2
  56. package/dist/utils/timeout.d.ts +1 -1
  57. package/dist/utils/timeout.js +2 -2
  58. package/dist/validation/semantic-test-generator.d.ts +7 -0
  59. package/dist/validation/semantic-test-generator.js +13 -4
  60. package/dist/version.js +1 -1
  61. package/package.json +6 -3
  62. package/schemas/bellwether-check.schema.json +3 -2
package/CHANGELOG.md CHANGED
@@ -2,6 +2,134 @@
 
  All notable changes to this project will be documented in this file.
 
+ ## [1.0.0] - 2026-01-27
+
+ ### Breaking Changes
+
+ - **Removed cloud commands**: The following commands have been removed: `login`, `upload`, `projects`, `history`, `diff`, `link`, `teams`, `badge`
+ - **Removed benchmark command**: The `benchmark` command and "Tested with Bellwether" certification program have been removed
+ - **Removed cloud module**: All cloud integration code has been removed from the CLI
+
+ ### Changed
+
+ - **Fully open source**: Bellwether is now a completely free, open-source tool with no cloud dependencies
+ - **Simplified configuration**: Removed cloud-related settings from `bellwether.yaml` template
+ - **Updated documentation**: Removed all cloud-related documentation
+
+ ### Migration Guide
+
+ If you were using cloud features:
+
+ 1. **Baselines**: Store baselines in git instead of uploading to cloud
+    ```bash
+    bellwether baseline save
+    git add bellwether-baseline.json
+    git commit -m "Add baseline"
+    ```
+
+ 2. **CI/CD**: Use local baseline comparison instead of cloud upload
+    ```bash
+    # Old
+    bellwether upload --ci --fail-on-drift
+
+    # New
+    bellwether check --fail-on-drift
+    bellwether baseline compare ./bellwether-baseline.json
+    ```
+
+ 3. **Environment variables**: Remove `BELLWETHER_SESSION`, `BELLWETHER_API_URL`, `BELLWETHER_TEAM_ID` from your CI/CD configuration
+
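Step 3 of the 1.0.0 migration guide lends itself to a mechanical check. The following is a minimal TypeScript sketch, not part of the Bellwether CLI: `findRemovedCloudVars` is a hypothetical helper, and only the three variable names come from the changelog.

```typescript
// Hypothetical helper (not shipped with Bellwether): reports which of the
// cloud-era variables named in the 1.0.0 migration guide are still set.
const REMOVED_CLOUD_VARS = [
  "BELLWETHER_SESSION",
  "BELLWETHER_API_URL",
  "BELLWETHER_TEAM_ID",
] as const;

function findRemovedCloudVars(
  env: Record<string, string | undefined>
): string[] {
  // A variable counts as "still set" when it is present and non-empty.
  return REMOVED_CLOUD_VARS.filter((name) => {
    const value = env[name];
    return value !== undefined && value !== "";
  });
}

// Example with an explicit map. In a Node-based CI step, process.env
// could be passed instead.
const leftoverVars = findRemovedCloudVars({
  BELLWETHER_SESSION: "abc123",
  BELLWETHER_TEAM_ID: "",
  PATH: "/usr/bin",
});
console.log(leftoverVars); // only BELLWETHER_SESSION is reported
```

A CI job could fail when the returned list is non-empty, so that stale credentials are noticed during the upgrade rather than afterwards.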
+ ## [0.13.0] - 2026-01-27
+
+ ### Breaking Changes
+
+ - **Renamed `bellwether verify` to `bellwether benchmark`**: The verification command has been renamed to better reflect its purpose
+   - Old: `bellwether verify <server-command>`
+   - New: `bellwether benchmark <server-command>`
+ - **Renamed "Verified by Bellwether" to "Tested with Bellwether"**: Updated branding throughout the CLI and documentation
+   - Badge text now shows "Tested with Bellwether"
+   - Status values changed: `verified` → `passed`, `not_verified` → `not_tested`
+ - **Config section renamed**: The `verify:` section in `bellwether.yaml` is now `benchmark:`
+   - Old: `verify: { timeout: 30000 }`
+   - New: `benchmark: { timeout: 30000 }`
+ - **Output file renamed**: Default benchmark report file changed from `bellwether-verification.json` to `bellwether-benchmark.json`
+ - **Cloud API changes**: Benchmark-related API endpoints have been renamed
+   - `/verifications` → `/benchmarks`
+   - Activity events: `verification.completed` → `benchmark.completed`, `verification.failed` → `benchmark.failed`
+
+ ### Changed
+
+ - All CLI output messages updated to use "benchmark" terminology
+ - Documentation updated throughout to reflect new naming
+ - Badge command description updated to reference "benchmark badge"
+ - Constants renamed: `VERIFICATION_TIERS` → `BENCHMARK_TIERS`, `DEFAULT_VERIFICATION_REPORT_FILE` → `DEFAULT_BENCHMARK_REPORT_FILE`
+
+ ### Migration Guide
+
+ 1. Update your `bellwether.yaml` config file:
+    ```yaml
+    # Old
+    verify:
+      timeout: 30000
+
+    # New
+    benchmark:
+      timeout: 30000
+    ```
+
+ 2. Update any CI/CD scripts:
+    ```bash
+    # Old
+    bellwether verify npx @mcp/server
+
+    # New
+    bellwether benchmark npx @mcp/server
+    ```
+
+ 3. Update any references to the output file:
+    - `bellwether-verification.json` → `bellwether-benchmark.json`
+
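The 0.13.0 renames can also be applied programmatically to an already-parsed config or to stored status values. This is a rough TypeScript sketch under stated assumptions: `renameVerifySection` and `renameStatus` are hypothetical names, the config is assumed to be a plain object (for example, the result of parsing `bellwether.yaml` with any YAML library), and only the key and status names are taken from the changelog.

```typescript
// Hypothetical migration helpers for the 0.13.0 renames (not Bellwether APIs).
type ParsedConfig = Record<string, unknown>;

// Returns a copy of the config with a top-level `verify` section moved to
// `benchmark`. If both sections exist, `benchmark` wins and `verify` is dropped.
function renameVerifySection(config: ParsedConfig): ParsedConfig {
  if (!("verify" in config)) return { ...config };
  const { verify, ...rest } = config;
  return "benchmark" in rest ? rest : { ...rest, benchmark: verify };
}

// Old status value -> new status value, as listed under Breaking Changes.
const STATUS_RENAMES: Record<string, string> = {
  verified: "passed",
  not_verified: "not_tested",
};

// Maps an old status to its new name; unknown values pass through unchanged.
function renameStatus(status: string): string {
  return Object.prototype.hasOwnProperty.call(STATUS_RENAMES, status)
    ? STATUS_RENAMES[status]
    : status;
}

const migratedConfig = renameVerifySection({ verify: { timeout: 30000 } });
console.log(JSON.stringify(migratedConfig)); // {"benchmark":{"timeout":30000}}
console.log(renameStatus("not_verified")); // not_tested
```

The helpers return new values rather than mutating their input, which keeps the original config available for comparison.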
+ ## [0.12.0] - 2026-01-26
+
+ ### Features
+
+ - **Streamable HTTP transport improvements**: Full compliance with [MCP Streamable HTTP specification](https://modelcontextprotocol.io/specification/2025-03-26/basic/transports)
+   - Fixed Accept header to include both `application/json` and `text/event-stream` as required by spec
+   - Added automatic session ID capture from `Mcp-Session-Id` response header
+   - Session ID is automatically included in all subsequent requests after initialization
+   - Changed header name from `X-Session-Id` to `Mcp-Session-Id` per MCP specification
+ - **False positive reduction**: Intelligent pattern detection to reduce false positives in automated testing
+   - **Operation-based tool detection**: Tools with `operation` enum + `args` object patterns now use flexible `either` outcome
+   - **Self-stateful tool detection**: Tools requiring prior state (session/chain/context) are handled appropriately
+   - **Complex array schema detection**: Arrays with nested objects containing required properties use flexible validation
+ - **Flexible semantic validation**: Semantic type tests now use `either` outcome by default, allowing tools to accept varied formats (e.g., dayjs, date-fns)
+ - **Pattern detection metadata**: Test metadata now includes detection flags for transparency
+   - `operationBased`, `operationParam`, `argsParam` for operation-based tools
+   - `selfStateful`, `selfStatefulReason` for stateful tools
+   - `hasComplexArrays`, `complexArrayParams` for complex schema tools
+
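To make the operation-based pattern concrete, here is a minimal TypeScript sketch of how such a detector might look. It is an illustration, not the code in `schema-test-generator.js`: the function name and schema types are hypothetical, and matching on the literal property names `operation` and `args` is a simplifying assumption. Only the metadata field names (`operationBased`, `operationParam`, `argsParam`) come from the changelog.

```typescript
// Minimal JSON Schema shapes used by this sketch (hypothetical types).
interface PropertySchema {
  type?: string;
  enum?: unknown[];
}

interface ToolInputSchema {
  properties?: Record<string, PropertySchema>;
}

// Metadata flags named in the 0.12.0 changelog entry.
interface OperationPattern {
  operationBased: boolean;
  operationParam?: string;
  argsParam?: string;
}

// Flags a tool as operation-based when its input schema has an `operation`
// property with a non-empty enum and an `args` property of type "object".
function detectOperationPattern(schema: ToolInputSchema): OperationPattern {
  const props = schema.properties ?? {};
  const operation = props["operation"];
  const args = props["args"];

  const hasOperationEnum =
    operation !== undefined &&
    Array.isArray(operation.enum) &&
    operation.enum.length > 0;
  const hasArgsObject = args !== undefined && args.type === "object";

  if (hasOperationEnum && hasArgsObject) {
    return { operationBased: true, operationParam: "operation", argsParam: "args" };
  }
  return { operationBased: false };
}

const detected = detectOperationPattern({
  properties: {
    operation: { type: "string", enum: ["create", "delete"] },
    args: { type: "object" },
  },
});
console.log(detected.operationBased); // true
```

A test generator could use the returned flag to choose the flexible `either` outcome for such tools, since the valid shape of `args` depends on which operation is selected.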
+ ### Configuration
+
+ - **New semantic validation option**: `check.flexibleSemanticTests` (default: `true`)
+   - When `true`, semantic validation tests use `either` outcome
+   - Set to `false` for strict format enforcement
+
+ ### Documentation
+
+ - Updated remote-servers guide with correct streamable-http protocol details
+ - Added MCP specification link for transport documentation
+ - Clarified session ID behavior and Accept header requirements
+
+ ### Fixes
+
+ - **Streamable HTTP session management**: Fixed session ID header to use MCP-compliant `Mcp-Session-Id`
+ - **False positive tests**: Tests for operation-based, self-stateful, and complex array patterns no longer fail incorrectly
+
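The session handling described for 0.12.0 can be sketched as follows. This is a simplified TypeScript illustration and not the code in `http-transport.js`: `SessionTracker` and `HeaderMap` are hypothetical names, and headers are modelled as plain string maps rather than a real HTTP client's header objects. The header name and the Accept value are the ones given in the changelog.

```typescript
type HeaderMap = Record<string, string>;

const SESSION_HEADER = "Mcp-Session-Id";

// Hypothetical helper that remembers the session ID issued by the server
// and attaches it to later requests.
class SessionTracker {
  private sessionId: string | undefined;

  // Headers for an outgoing request: Accept lists both media types, and the
  // session ID is added once one has been captured.
  requestHeaders(): HeaderMap {
    const headers: HeaderMap = {
      Accept: "application/json, text/event-stream",
    };
    if (this.sessionId !== undefined) {
      headers[SESSION_HEADER] = this.sessionId;
    }
    return headers;
  }

  // Stores the session ID from a response, if present. HTTP header names
  // are case-insensitive, so the lookup ignores case.
  captureFrom(responseHeaders: HeaderMap): void {
    for (const name of Object.keys(responseHeaders)) {
      const value = responseHeaders[name];
      if (name.toLowerCase() === SESSION_HEADER.toLowerCase() && value !== "") {
        this.sessionId = value;
      }
    }
  }
}

const tracker = new SessionTracker();
tracker.captureFrom({ "mcp-session-id": "session-1" }); // e.g. the initialize response
console.log(tracker.requestHeaders()[SESSION_HEADER]); // session-1
```

Keeping the ID in one small object means every request after initialization picks it up without each call site having to pass it along.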
+ ### Tests
+
+ - Added 17 HTTP transport tests including session ID capture verification
+ - Added 11 new pattern detection tests for false positive reduction
+
  ## [0.11.0] - 2026-01-26
 
  ### Breaking Changes