@dotsetlabs/bellwether 0.10.0

Files changed (403)
  1. package/CHANGELOG.md +291 -0
  2. package/LICENSE +21 -0
  3. package/README.md +739 -0
  4. package/dist/auth/credentials.d.ts +64 -0
  5. package/dist/auth/credentials.js +218 -0
  6. package/dist/auth/index.d.ts +6 -0
  7. package/dist/auth/index.js +6 -0
  8. package/dist/auth/keychain.d.ts +64 -0
  9. package/dist/auth/keychain.js +268 -0
  10. package/dist/baseline/ab-testing.d.ts +80 -0
  11. package/dist/baseline/ab-testing.js +236 -0
  12. package/dist/baseline/ai-compatibility-scorer.d.ts +95 -0
  13. package/dist/baseline/ai-compatibility-scorer.js +606 -0
  14. package/dist/baseline/calibration.d.ts +77 -0
  15. package/dist/baseline/calibration.js +136 -0
  16. package/dist/baseline/category-matching.d.ts +85 -0
  17. package/dist/baseline/category-matching.js +289 -0
  18. package/dist/baseline/change-impact-analyzer.d.ts +98 -0
  19. package/dist/baseline/change-impact-analyzer.js +592 -0
  20. package/dist/baseline/comparator.d.ts +64 -0
  21. package/dist/baseline/comparator.js +916 -0
  22. package/dist/baseline/confidence.d.ts +55 -0
  23. package/dist/baseline/confidence.js +122 -0
  24. package/dist/baseline/converter.d.ts +61 -0
  25. package/dist/baseline/converter.js +585 -0
  26. package/dist/baseline/dependency-analyzer.d.ts +89 -0
  27. package/dist/baseline/dependency-analyzer.js +567 -0
  28. package/dist/baseline/deprecation-tracker.d.ts +133 -0
  29. package/dist/baseline/deprecation-tracker.js +322 -0
  30. package/dist/baseline/diff.d.ts +55 -0
  31. package/dist/baseline/diff.js +1584 -0
  32. package/dist/baseline/documentation-scorer.d.ts +205 -0
  33. package/dist/baseline/documentation-scorer.js +466 -0
  34. package/dist/baseline/embeddings.d.ts +118 -0
  35. package/dist/baseline/embeddings.js +251 -0
  36. package/dist/baseline/error-analyzer.d.ts +198 -0
  37. package/dist/baseline/error-analyzer.js +721 -0
  38. package/dist/baseline/evaluation/evaluator.d.ts +42 -0
  39. package/dist/baseline/evaluation/evaluator.js +323 -0
  40. package/dist/baseline/evaluation/expanded-dataset.d.ts +45 -0
  41. package/dist/baseline/evaluation/expanded-dataset.js +1164 -0
  42. package/dist/baseline/evaluation/golden-dataset.d.ts +58 -0
  43. package/dist/baseline/evaluation/golden-dataset.js +717 -0
  44. package/dist/baseline/evaluation/index.d.ts +15 -0
  45. package/dist/baseline/evaluation/index.js +15 -0
  46. package/dist/baseline/evaluation/types.d.ts +186 -0
  47. package/dist/baseline/evaluation/types.js +8 -0
  48. package/dist/baseline/external-dependency-detector.d.ts +181 -0
  49. package/dist/baseline/external-dependency-detector.js +524 -0
  50. package/dist/baseline/golden-output.d.ts +162 -0
  51. package/dist/baseline/golden-output.js +636 -0
  52. package/dist/baseline/health-scorer.d.ts +174 -0
  53. package/dist/baseline/health-scorer.js +451 -0
  54. package/dist/baseline/incremental-checker.d.ts +97 -0
  55. package/dist/baseline/incremental-checker.js +174 -0
  56. package/dist/baseline/index.d.ts +31 -0
  57. package/dist/baseline/index.js +42 -0
  58. package/dist/baseline/migration-generator.d.ts +137 -0
  59. package/dist/baseline/migration-generator.js +554 -0
  60. package/dist/baseline/migrations.d.ts +60 -0
  61. package/dist/baseline/migrations.js +197 -0
  62. package/dist/baseline/performance-tracker.d.ts +214 -0
  63. package/dist/baseline/performance-tracker.js +577 -0
  64. package/dist/baseline/pr-comment-generator.d.ts +117 -0
  65. package/dist/baseline/pr-comment-generator.js +546 -0
  66. package/dist/baseline/response-fingerprint.d.ts +127 -0
  67. package/dist/baseline/response-fingerprint.js +728 -0
  68. package/dist/baseline/response-schema-tracker.d.ts +129 -0
  69. package/dist/baseline/response-schema-tracker.js +420 -0
  70. package/dist/baseline/risk-scorer.d.ts +54 -0
  71. package/dist/baseline/risk-scorer.js +434 -0
  72. package/dist/baseline/saver.d.ts +89 -0
  73. package/dist/baseline/saver.js +554 -0
  74. package/dist/baseline/scenario-generator.d.ts +151 -0
  75. package/dist/baseline/scenario-generator.js +905 -0
  76. package/dist/baseline/schema-compare.d.ts +86 -0
  77. package/dist/baseline/schema-compare.js +557 -0
  78. package/dist/baseline/schema-evolution.d.ts +189 -0
  79. package/dist/baseline/schema-evolution.js +467 -0
  80. package/dist/baseline/semantic.d.ts +203 -0
  81. package/dist/baseline/semantic.js +908 -0
  82. package/dist/baseline/synonyms.d.ts +60 -0
  83. package/dist/baseline/synonyms.js +386 -0
  84. package/dist/baseline/telemetry.d.ts +165 -0
  85. package/dist/baseline/telemetry.js +294 -0
  86. package/dist/baseline/test-pruner.d.ts +120 -0
  87. package/dist/baseline/test-pruner.js +387 -0
  88. package/dist/baseline/types.d.ts +449 -0
  89. package/dist/baseline/types.js +5 -0
  90. package/dist/baseline/version.d.ts +138 -0
  91. package/dist/baseline/version.js +206 -0
  92. package/dist/cache/index.d.ts +5 -0
  93. package/dist/cache/index.js +5 -0
  94. package/dist/cache/response-cache.d.ts +151 -0
  95. package/dist/cache/response-cache.js +287 -0
  96. package/dist/ci/index.d.ts +60 -0
  97. package/dist/ci/index.js +342 -0
  98. package/dist/cli/commands/auth.d.ts +12 -0
  99. package/dist/cli/commands/auth.js +352 -0
  100. package/dist/cli/commands/badge.d.ts +3 -0
  101. package/dist/cli/commands/badge.js +74 -0
  102. package/dist/cli/commands/baseline-accept.d.ts +15 -0
  103. package/dist/cli/commands/baseline-accept.js +178 -0
  104. package/dist/cli/commands/baseline-migrate.d.ts +12 -0
  105. package/dist/cli/commands/baseline-migrate.js +164 -0
  106. package/dist/cli/commands/baseline.d.ts +14 -0
  107. package/dist/cli/commands/baseline.js +449 -0
  108. package/dist/cli/commands/beta.d.ts +10 -0
  109. package/dist/cli/commands/beta.js +231 -0
  110. package/dist/cli/commands/check.d.ts +11 -0
  111. package/dist/cli/commands/check.js +820 -0
  112. package/dist/cli/commands/cloud/badge.d.ts +3 -0
  113. package/dist/cli/commands/cloud/badge.js +74 -0
  114. package/dist/cli/commands/cloud/diff.d.ts +6 -0
  115. package/dist/cli/commands/cloud/diff.js +79 -0
  116. package/dist/cli/commands/cloud/history.d.ts +6 -0
  117. package/dist/cli/commands/cloud/history.js +102 -0
  118. package/dist/cli/commands/cloud/link.d.ts +9 -0
  119. package/dist/cli/commands/cloud/link.js +119 -0
  120. package/dist/cli/commands/cloud/login.d.ts +7 -0
  121. package/dist/cli/commands/cloud/login.js +499 -0
  122. package/dist/cli/commands/cloud/projects.d.ts +6 -0
  123. package/dist/cli/commands/cloud/projects.js +44 -0
  124. package/dist/cli/commands/cloud/shared.d.ts +7 -0
  125. package/dist/cli/commands/cloud/shared.js +42 -0
  126. package/dist/cli/commands/cloud/teams.d.ts +8 -0
  127. package/dist/cli/commands/cloud/teams.js +169 -0
  128. package/dist/cli/commands/cloud/upload.d.ts +8 -0
  129. package/dist/cli/commands/cloud/upload.js +181 -0
  130. package/dist/cli/commands/contract.d.ts +11 -0
  131. package/dist/cli/commands/contract.js +280 -0
  132. package/dist/cli/commands/discover.d.ts +3 -0
  133. package/dist/cli/commands/discover.js +82 -0
  134. package/dist/cli/commands/eval.d.ts +9 -0
  135. package/dist/cli/commands/eval.js +187 -0
  136. package/dist/cli/commands/explore.d.ts +11 -0
  137. package/dist/cli/commands/explore.js +437 -0
  138. package/dist/cli/commands/feedback.d.ts +9 -0
  139. package/dist/cli/commands/feedback.js +174 -0
  140. package/dist/cli/commands/golden.d.ts +12 -0
  141. package/dist/cli/commands/golden.js +407 -0
  142. package/dist/cli/commands/history.d.ts +10 -0
  143. package/dist/cli/commands/history.js +202 -0
  144. package/dist/cli/commands/init.d.ts +9 -0
  145. package/dist/cli/commands/init.js +219 -0
  146. package/dist/cli/commands/interview.d.ts +3 -0
  147. package/dist/cli/commands/interview.js +903 -0
  148. package/dist/cli/commands/link.d.ts +10 -0
  149. package/dist/cli/commands/link.js +169 -0
  150. package/dist/cli/commands/login.d.ts +7 -0
  151. package/dist/cli/commands/login.js +499 -0
  152. package/dist/cli/commands/preset.d.ts +33 -0
  153. package/dist/cli/commands/preset.js +297 -0
  154. package/dist/cli/commands/profile.d.ts +33 -0
  155. package/dist/cli/commands/profile.js +286 -0
  156. package/dist/cli/commands/registry.d.ts +11 -0
  157. package/dist/cli/commands/registry.js +146 -0
  158. package/dist/cli/commands/shared.d.ts +79 -0
  159. package/dist/cli/commands/shared.js +196 -0
  160. package/dist/cli/commands/teams.d.ts +8 -0
  161. package/dist/cli/commands/teams.js +169 -0
  162. package/dist/cli/commands/test.d.ts +9 -0
  163. package/dist/cli/commands/test.js +500 -0
  164. package/dist/cli/commands/upload.d.ts +8 -0
  165. package/dist/cli/commands/upload.js +223 -0
  166. package/dist/cli/commands/validate-config.d.ts +6 -0
  167. package/dist/cli/commands/validate-config.js +35 -0
  168. package/dist/cli/commands/verify.d.ts +11 -0
  169. package/dist/cli/commands/verify.js +283 -0
  170. package/dist/cli/commands/watch.d.ts +12 -0
  171. package/dist/cli/commands/watch.js +253 -0
  172. package/dist/cli/index.d.ts +3 -0
  173. package/dist/cli/index.js +178 -0
  174. package/dist/cli/interactive.d.ts +47 -0
  175. package/dist/cli/interactive.js +216 -0
  176. package/dist/cli/output/terminal-reporter.d.ts +19 -0
  177. package/dist/cli/output/terminal-reporter.js +104 -0
  178. package/dist/cli/output.d.ts +226 -0
  179. package/dist/cli/output.js +438 -0
  180. package/dist/cli/utils/env.d.ts +5 -0
  181. package/dist/cli/utils/env.js +14 -0
  182. package/dist/cli/utils/progress.d.ts +59 -0
  183. package/dist/cli/utils/progress.js +206 -0
  184. package/dist/cli/utils/server-context.d.ts +10 -0
  185. package/dist/cli/utils/server-context.js +36 -0
  186. package/dist/cloud/auth.d.ts +144 -0
  187. package/dist/cloud/auth.js +374 -0
  188. package/dist/cloud/client.d.ts +24 -0
  189. package/dist/cloud/client.js +65 -0
  190. package/dist/cloud/http-client.d.ts +38 -0
  191. package/dist/cloud/http-client.js +215 -0
  192. package/dist/cloud/index.d.ts +23 -0
  193. package/dist/cloud/index.js +25 -0
  194. package/dist/cloud/mock-client.d.ts +107 -0
  195. package/dist/cloud/mock-client.js +545 -0
  196. package/dist/cloud/types.d.ts +515 -0
  197. package/dist/cloud/types.js +15 -0
  198. package/dist/config/defaults.d.ts +160 -0
  199. package/dist/config/defaults.js +169 -0
  200. package/dist/config/loader.d.ts +24 -0
  201. package/dist/config/loader.js +122 -0
  202. package/dist/config/template.d.ts +42 -0
  203. package/dist/config/template.js +647 -0
  204. package/dist/config/validator.d.ts +2112 -0
  205. package/dist/config/validator.js +658 -0
  206. package/dist/constants/cloud.d.ts +107 -0
  207. package/dist/constants/cloud.js +110 -0
  208. package/dist/constants/core.d.ts +521 -0
  209. package/dist/constants/core.js +556 -0
  210. package/dist/constants/testing.d.ts +1283 -0
  211. package/dist/constants/testing.js +1568 -0
  212. package/dist/constants.d.ts +10 -0
  213. package/dist/constants.js +10 -0
  214. package/dist/contract/index.d.ts +6 -0
  215. package/dist/contract/index.js +5 -0
  216. package/dist/contract/validator.d.ts +177 -0
  217. package/dist/contract/validator.js +574 -0
  218. package/dist/cost/index.d.ts +6 -0
  219. package/dist/cost/index.js +5 -0
  220. package/dist/cost/tracker.d.ts +134 -0
  221. package/dist/cost/tracker.js +313 -0
  222. package/dist/discovery/discovery.d.ts +16 -0
  223. package/dist/discovery/discovery.js +173 -0
  224. package/dist/discovery/types.d.ts +51 -0
  225. package/dist/discovery/types.js +2 -0
  226. package/dist/docs/agents.d.ts +3 -0
  227. package/dist/docs/agents.js +995 -0
  228. package/dist/docs/contract.d.ts +51 -0
  229. package/dist/docs/contract.js +1681 -0
  230. package/dist/docs/generator.d.ts +4 -0
  231. package/dist/docs/generator.js +4 -0
  232. package/dist/docs/html-reporter.d.ts +9 -0
  233. package/dist/docs/html-reporter.js +757 -0
  234. package/dist/docs/index.d.ts +10 -0
  235. package/dist/docs/index.js +11 -0
  236. package/dist/docs/junit-reporter.d.ts +18 -0
  237. package/dist/docs/junit-reporter.js +210 -0
  238. package/dist/docs/report.d.ts +14 -0
  239. package/dist/docs/report.js +44 -0
  240. package/dist/docs/sarif-reporter.d.ts +19 -0
  241. package/dist/docs/sarif-reporter.js +335 -0
  242. package/dist/docs/shared.d.ts +35 -0
  243. package/dist/docs/shared.js +162 -0
  244. package/dist/docs/templates.d.ts +12 -0
  245. package/dist/docs/templates.js +76 -0
  246. package/dist/errors/index.d.ts +6 -0
  247. package/dist/errors/index.js +6 -0
  248. package/dist/errors/retry.d.ts +92 -0
  249. package/dist/errors/retry.js +323 -0
  250. package/dist/errors/types.d.ts +321 -0
  251. package/dist/errors/types.js +584 -0
  252. package/dist/index.d.ts +32 -0
  253. package/dist/index.js +32 -0
  254. package/dist/interview/dependency-resolver.d.ts +11 -0
  255. package/dist/interview/dependency-resolver.js +32 -0
  256. package/dist/interview/interviewer.d.ts +232 -0
  257. package/dist/interview/interviewer.js +1939 -0
  258. package/dist/interview/mock-response-generator.d.ts +7 -0
  259. package/dist/interview/mock-response-generator.js +102 -0
  260. package/dist/interview/orchestrator.d.ts +237 -0
  261. package/dist/interview/orchestrator.js +1296 -0
  262. package/dist/interview/rate-limiter.d.ts +15 -0
  263. package/dist/interview/rate-limiter.js +55 -0
  264. package/dist/interview/response-validator.d.ts +10 -0
  265. package/dist/interview/response-validator.js +132 -0
  266. package/dist/interview/schema-inferrer.d.ts +8 -0
  267. package/dist/interview/schema-inferrer.js +71 -0
  268. package/dist/interview/schema-test-generator.d.ts +71 -0
  269. package/dist/interview/schema-test-generator.js +834 -0
  270. package/dist/interview/smart-value-generator.d.ts +155 -0
  271. package/dist/interview/smart-value-generator.js +554 -0
  272. package/dist/interview/stateful-test-runner.d.ts +19 -0
  273. package/dist/interview/stateful-test-runner.js +106 -0
  274. package/dist/interview/types.d.ts +561 -0
  275. package/dist/interview/types.js +2 -0
  276. package/dist/llm/anthropic.d.ts +41 -0
  277. package/dist/llm/anthropic.js +355 -0
  278. package/dist/llm/client.d.ts +123 -0
  279. package/dist/llm/client.js +42 -0
  280. package/dist/llm/factory.d.ts +38 -0
  281. package/dist/llm/factory.js +145 -0
  282. package/dist/llm/fallback.d.ts +140 -0
  283. package/dist/llm/fallback.js +379 -0
  284. package/dist/llm/index.d.ts +18 -0
  285. package/dist/llm/index.js +15 -0
  286. package/dist/llm/ollama.d.ts +37 -0
  287. package/dist/llm/ollama.js +330 -0
  288. package/dist/llm/openai.d.ts +25 -0
  289. package/dist/llm/openai.js +320 -0
  290. package/dist/llm/token-budget.d.ts +161 -0
  291. package/dist/llm/token-budget.js +395 -0
  292. package/dist/logging/logger.d.ts +70 -0
  293. package/dist/logging/logger.js +130 -0
  294. package/dist/metrics/collector.d.ts +106 -0
  295. package/dist/metrics/collector.js +547 -0
  296. package/dist/metrics/index.d.ts +7 -0
  297. package/dist/metrics/index.js +7 -0
  298. package/dist/metrics/prometheus.d.ts +20 -0
  299. package/dist/metrics/prometheus.js +241 -0
  300. package/dist/metrics/types.d.ts +209 -0
  301. package/dist/metrics/types.js +5 -0
  302. package/dist/persona/builtins.d.ts +54 -0
  303. package/dist/persona/builtins.js +219 -0
  304. package/dist/persona/index.d.ts +8 -0
  305. package/dist/persona/index.js +8 -0
  306. package/dist/persona/loader.d.ts +30 -0
  307. package/dist/persona/loader.js +190 -0
  308. package/dist/persona/types.d.ts +144 -0
  309. package/dist/persona/types.js +5 -0
  310. package/dist/persona/validation.d.ts +94 -0
  311. package/dist/persona/validation.js +332 -0
  312. package/dist/prompts/index.d.ts +5 -0
  313. package/dist/prompts/index.js +5 -0
  314. package/dist/prompts/templates.d.ts +180 -0
  315. package/dist/prompts/templates.js +431 -0
  316. package/dist/registry/client.d.ts +49 -0
  317. package/dist/registry/client.js +191 -0
  318. package/dist/registry/index.d.ts +7 -0
  319. package/dist/registry/index.js +6 -0
  320. package/dist/registry/types.d.ts +140 -0
  321. package/dist/registry/types.js +6 -0
  322. package/dist/scenarios/evaluator.d.ts +43 -0
  323. package/dist/scenarios/evaluator.js +206 -0
  324. package/dist/scenarios/index.d.ts +10 -0
  325. package/dist/scenarios/index.js +9 -0
  326. package/dist/scenarios/loader.d.ts +20 -0
  327. package/dist/scenarios/loader.js +285 -0
  328. package/dist/scenarios/types.d.ts +153 -0
  329. package/dist/scenarios/types.js +8 -0
  330. package/dist/security/index.d.ts +17 -0
  331. package/dist/security/index.js +18 -0
  332. package/dist/security/payloads.d.ts +61 -0
  333. package/dist/security/payloads.js +268 -0
  334. package/dist/security/security-tester.d.ts +42 -0
  335. package/dist/security/security-tester.js +582 -0
  336. package/dist/security/types.d.ts +166 -0
  337. package/dist/security/types.js +8 -0
  338. package/dist/transport/base-transport.d.ts +59 -0
  339. package/dist/transport/base-transport.js +38 -0
  340. package/dist/transport/http-transport.d.ts +67 -0
  341. package/dist/transport/http-transport.js +238 -0
  342. package/dist/transport/mcp-client.d.ts +141 -0
  343. package/dist/transport/mcp-client.js +496 -0
  344. package/dist/transport/sse-transport.d.ts +88 -0
  345. package/dist/transport/sse-transport.js +316 -0
  346. package/dist/transport/stdio-transport.d.ts +43 -0
  347. package/dist/transport/stdio-transport.js +238 -0
  348. package/dist/transport/types.d.ts +125 -0
  349. package/dist/transport/types.js +16 -0
  350. package/dist/utils/concurrency.d.ts +123 -0
  351. package/dist/utils/concurrency.js +213 -0
  352. package/dist/utils/formatters.d.ts +16 -0
  353. package/dist/utils/formatters.js +37 -0
  354. package/dist/utils/index.d.ts +8 -0
  355. package/dist/utils/index.js +8 -0
  356. package/dist/utils/jsonpath.d.ts +87 -0
  357. package/dist/utils/jsonpath.js +326 -0
  358. package/dist/utils/markdown.d.ts +113 -0
  359. package/dist/utils/markdown.js +265 -0
  360. package/dist/utils/network.d.ts +14 -0
  361. package/dist/utils/network.js +17 -0
  362. package/dist/utils/sanitize.d.ts +92 -0
  363. package/dist/utils/sanitize.js +191 -0
  364. package/dist/utils/semantic.d.ts +194 -0
  365. package/dist/utils/semantic.js +1051 -0
  366. package/dist/utils/smart-truncate.d.ts +94 -0
  367. package/dist/utils/smart-truncate.js +361 -0
  368. package/dist/utils/timeout.d.ts +153 -0
  369. package/dist/utils/timeout.js +205 -0
  370. package/dist/utils/yaml-parser.d.ts +58 -0
  371. package/dist/utils/yaml-parser.js +86 -0
  372. package/dist/validation/index.d.ts +32 -0
  373. package/dist/validation/index.js +32 -0
  374. package/dist/validation/semantic-test-generator.d.ts +50 -0
  375. package/dist/validation/semantic-test-generator.js +176 -0
  376. package/dist/validation/semantic-types.d.ts +66 -0
  377. package/dist/validation/semantic-types.js +94 -0
  378. package/dist/validation/semantic-validator.d.ts +38 -0
  379. package/dist/validation/semantic-validator.js +340 -0
  380. package/dist/verification/index.d.ts +6 -0
  381. package/dist/verification/index.js +5 -0
  382. package/dist/verification/types.d.ts +133 -0
  383. package/dist/verification/types.js +5 -0
  384. package/dist/verification/verifier.d.ts +30 -0
  385. package/dist/verification/verifier.js +309 -0
  386. package/dist/version.d.ts +19 -0
  387. package/dist/version.js +48 -0
  388. package/dist/workflow/auto-generator.d.ts +27 -0
  389. package/dist/workflow/auto-generator.js +513 -0
  390. package/dist/workflow/discovery.d.ts +40 -0
  391. package/dist/workflow/discovery.js +195 -0
  392. package/dist/workflow/executor.d.ts +82 -0
  393. package/dist/workflow/executor.js +611 -0
  394. package/dist/workflow/index.d.ts +10 -0
  395. package/dist/workflow/index.js +10 -0
  396. package/dist/workflow/loader.d.ts +24 -0
  397. package/dist/workflow/loader.js +194 -0
  398. package/dist/workflow/state-tracker.d.ts +98 -0
  399. package/dist/workflow/state-tracker.js +424 -0
  400. package/dist/workflow/types.d.ts +337 -0
  401. package/dist/workflow/types.js +5 -0
  402. package/package.json +94 -0
  403. package/schemas/bellwether-check.schema.json +651 -0
package/README.md ADDED
@@ -0,0 +1,739 @@
1
+ # Bellwether
2
+
3
+ [![Build Status](https://github.com/dotsetlabs/bellwether/actions/workflows/ci.yml/badge.svg)](https://github.com/dotsetlabs/bellwether/actions)
4
+ [![npm version](https://img.shields.io/npm/v/@dotsetlabs/bellwether)](https://www.npmjs.com/package/@dotsetlabs/bellwether)
5
+ [![Documentation](https://img.shields.io/badge/docs-docs.bellwether.sh-blue)](https://docs.bellwether.sh)
6
+
7
+ > **Catch MCP server drift before your users do. Zero LLM required.**
8
+
9
+ Bellwether detects structural changes in your [MCP (Model Context Protocol)](https://modelcontextprotocol.io/) server using **schema comparison**. No LLM needed. Free. Deterministic.
10
+
11
+ ## Quick Start
12
+
13
+ ```bash
14
+ # Install
15
+ npm install -g @dotsetlabs/bellwether
16
+
17
+ # Initialize configuration (required before any other command)
18
+ bellwether init npx @mcp/your-server
19
+
20
+ # Check for drift (free, fast, deterministic)
21
+ bellwether check
22
+
23
+ # Save baseline for drift detection
24
+ bellwether baseline save
25
+
26
+ # Optional: Explore behavior with LLM
27
+ bellwether explore
28
+ ```
29
+
30
+ That's it. No API keys needed for check. No LLM costs. Deterministic results.
31
+
32
+ ## CI/CD Integration
33
+
34
+ Add drift detection to every PR:
35
+
36
+ ```yaml
37
+ # .github/workflows/bellwether.yml
38
+ name: MCP Drift Detection
39
+ on: [pull_request]
40
+
41
+ jobs:
42
+   bellwether:
43
+     runs-on: ubuntu-latest
44
+     steps:
45
+       - uses: actions/checkout@v4
46
+       - run: npx @dotsetlabs/bellwether init --preset ci npx @mcp/your-server
47
+       - run: npx @dotsetlabs/bellwether check --fail-on-drift
48
+ ```
49
+
50
+ Commit `bellwether.yaml` to your repo so CI always has your config. No secrets needed for `check`. Runs in seconds.
51
+
52
+ ### Exit Codes
53
+
54
+ The `check` command returns granular exit codes for CI/CD pipelines:
55
+
56
+ | Code | Meaning | CI Action |
57
+ |:-----|:--------|:----------|
58
+ | `0` | No changes detected | Pass |
59
+ | `1` | Info-level changes only | Exit code `1` (handle in CI as desired) |
60
+ | `2` | Warning-level changes | Exit code `2` (handle in CI as desired) |
61
+ | `3` | Breaking changes | Always fail |
62
+ | `4` | Runtime error | Fail |
63
+ | `5` | Low confidence (when `check.sampling.failOnLowConfidence` is true) | Fail |
64
+
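A pipeline wrapper can turn the exit codes in the table above into a pass, warn, or fail decision. The following is a minimal sketch, not part of the package: `decideCiAction` and the `ExitPolicy` flags are hypothetical names, and the handling of codes `1` and `2` is an assumed team policy, since the table leaves those to the CI author.

```typescript
// Hypothetical helper: maps a `bellwether check` exit code to a CI decision.
// Code meanings come from the exit code table; the policy flags are assumptions.
type CiDecision = "pass" | "warn" | "fail";

interface ExitPolicy {
  failOnInfo: boolean;    // assumed team choice for exit code 1
  failOnWarning: boolean; // assumed team choice for exit code 2
}

function decideCiAction(exitCode: number, policy: ExitPolicy): CiDecision {
  switch (exitCode) {
    case 0: // no changes detected
      return "pass";
    case 1: // info-level changes only
      return policy.failOnInfo ? "fail" : "warn";
    case 2: // warning-level changes
      return policy.failOnWarning ? "fail" : "warn";
    case 3: // breaking changes
    case 4: // runtime error
    case 5: // low confidence
      return "fail";
    default:
      // Assumption: codes outside the documented range are treated as failures.
      return "fail";
  }
}
```

With `{ failOnInfo: false, failOnWarning: true }`, info-level drift is surfaced as a warning while warning-level drift blocks the build.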
65
+ ## What Bellwether Detects
66
+
67
+ Check mode detects when your MCP server changes:
68
+
69
+ | Change Type | Example | Detected |
70
+ |:------------|:--------|:---------|
71
+ | **Tool added** | New `delete_file` tool appears | Yes |
72
+ | **Tool removed** | `write_file` tool disappears | Yes |
73
+ | **Schema changed** | Parameter `path` becomes required | Yes |
74
+ | **Description changed** | Tool help text updated | Yes |
75
+ | **Tool renamed** | `read` becomes `read_file` | Yes |
76
+ | **Performance regression** | Tool latency increased >10% | Yes |
77
+ | **Performance confidence** | Statistical reliability of metrics | Yes |
78
+ | **Security vulnerabilities** | SQL injection accepted (when `check.security.enabled` is on) | Yes |
79
+ | **Response schema changes** | Response fields added/removed | Yes |
80
+ | **Unstable schemas** | Inconsistent response structures | Yes |
81
+ | **Error trends** | New error types, increasing errors | Yes |
82
+
83
+ This catches the changes that break AI agent workflows.
84
+
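The "Performance regression" row in the table above can be read as a relative threshold test. The sketch below illustrates that reading under stated assumptions: `isLatencyRegression` is a hypothetical function, the 10% default is taken from the example in the table, and which latency statistic Bellwether actually compares is not stated in this README.

```typescript
// Hypothetical check, not Bellwether's implementation.
// Flags a regression when latency grows by more than `threshold` relative to the baseline.
function isLatencyRegression(
  baselineMs: number,
  currentMs: number,
  threshold: number = 0.1, // 10%, taken from the table's example
): boolean {
  if (baselineMs <= 0) {
    return false; // assumption: no meaningful baseline, so nothing to compare against
  }
  const relativeIncrease = (currentMs - baselineMs) / baselineMs;
  return relativeIncrease > threshold;
}
```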
85
+ ## Documentation
86
+
87
+ **[docs.bellwether.sh](https://docs.bellwether.sh)** - Full documentation including:
88
+
89
+ - [Quick Start](https://docs.bellwether.sh/quickstart)
90
+ - [CLI Reference](https://docs.bellwether.sh/cli/init)
91
+ - [Test Modes](https://docs.bellwether.sh/concepts/test-modes)
92
+ - [CI/CD Integration](https://docs.bellwether.sh/guides/ci-cd)
93
+ - [Cloud Features](https://docs.bellwether.sh/cloud)
94
+
95
+ ## Configuration
96
+
97
+ All settings are configured in `bellwether.yaml`. Create one with:
98
+
99
+ ```bash
100
+ bellwether init npx @mcp/your-server # Default (free, fast)
101
+ bellwether init --preset ci npx @mcp/server # Optimized for CI/CD
102
+ bellwether init --preset security npx @mcp/server # Security-focused exploration
103
+ bellwether init --preset thorough npx @mcp/server # Comprehensive exploration
104
+ bellwether init --preset local npx @mcp/server # Exploration with local Ollama
105
+ ```
106
+
107
+ The generated config file is fully documented with all available options.
108
+
109
+ ### Environment Variable Interpolation
110
+
111
+ Reference environment variables in your config:
112
+
113
+ ```yaml
114
+ server:
115
+   command: "npx @mcp/your-server"
116
+   env:
117
+     API_KEY: "${API_KEY}"
118
+     DEBUG: "${DEBUG:-false}"  # With default value
119
+ ```
120
+
121
+ This allows committing `bellwether.yaml` to version control without exposing secrets.
122
+
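The `${VAR}` and `${VAR:-default}` forms shown above follow the familiar shell convention. The sketch below shows one way such placeholders could be resolved. It is an illustration only and not Bellwether's actual loader: `interpolateEnv` is a hypothetical name, and resolving an unset variable with no default to an empty string is an assumption.

```typescript
// Illustrative only: resolves ${NAME} and ${NAME:-default} placeholders in a string.
function interpolateEnv(
  value: string,
  env: Record<string, string | undefined>,
): string {
  const placeholder = /\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}/g;
  return value.replace(
    placeholder,
    (_match: string, name: string, fallback: string | undefined) => {
      const resolved = env[name];
      // Mirrors shell ":-" semantics: the default applies when the variable is unset or empty.
      if (resolved !== undefined && resolved !== "") {
        return resolved;
      }
      // Assumption: an unset variable without a default becomes an empty string.
      return fallback ?? "";
    },
  );
}
```

For example, `interpolateEnv("${DEBUG:-false}", {})` yields `"false"`, while the same call with `{ DEBUG: "true" }` yields `"true"`.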
123
+ ## Commands
124
+
125
+ ### Check Command (Recommended for CI)
126
+
127
+ ```bash
128
+ bellwether init npx @mcp/your-server
129
+ bellwether check
130
+ ```
131
+
132
+ - **Zero LLM** - No API keys required
133
+ - **Free** - No token costs
134
+ - **Deterministic** - Same input = same output
135
+ - **Fast** - Runs in seconds (use `check.parallel` in config for more speed)
136
+ - **Output** - Writes `CONTRACT.md` to `output.docsDir` and `bellwether-check.json` to `output.dir` (filenames configurable via `output.files.contractDoc` and `output.files.checkReport`)
137
+ - **CI-Optimized** - Granular exit codes (0-5), JUnit/SARIF output formats
138
+
139
+ #### Check Mode Enhancements
140
+
141
+ - **Stateful testing** for create → use → delete chains
142
+ - **External service handling** (skip, mock, or fail when credentials are missing)
143
+ - **Response assertions** for semantic validation of outputs
144
+ - **Rate limiting** to avoid 429s on production servers
145
+
146
+ Example configuration:
147
+
148
+ ```yaml
149
+ check:
150
+   statefulTesting:
151
+     enabled: true
152
+     maxChainLength: 5
153
+     shareOutputsBetweenTools: true
154
+
155
+   externalServices:
156
+     mode: skip  # skip | mock | fail
157
+     services:
158
+       plaid:
159
+         enabled: false
160
+         sandboxCredentials:
161
+           clientId: "${PLAID_CLIENT_ID}"
162
+           secret: "${PLAID_SECRET}"
163
+
164
+   assertions:
165
+     enabled: true
166
+     strict: false
167
+     infer: true
168
+
169
+   rateLimit:
170
+     enabled: false
171
+     requestsPerSecond: 10
172
+     burstLimit: 20
173
+     backoffStrategy: exponential
174
+     maxRetries: 3
175
+ ```
176
+
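The `rateLimit` options in the example configuration above (`requestsPerSecond`, `burstLimit`, `backoffStrategy: exponential`, `maxRetries`) are commonly implemented with a token bucket plus an exponential retry delay. The sketch below illustrates that general technique under stated assumptions. It is not taken from the package: `TokenBucket`, `backoffDelayMs`, and the base delay parameter are hypothetical, and Bellwether's own limiter may behave differently.

```typescript
// Illustrative token bucket: allows bursts up to `burstLimit`,
// then refills at `requestsPerSecond` tokens per second.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly requestsPerSecond: number;
  private readonly burstLimit: number;

  constructor(requestsPerSecond: number, burstLimit: number, nowMs: number) {
    this.requestsPerSecond = requestsPerSecond;
    this.burstLimit = burstLimit;
    this.tokens = burstLimit; // start with a full bucket
    this.lastRefillMs = nowMs;
  }

  // Returns true if a request may be sent at time `nowMs`, consuming one token.
  tryAcquire(nowMs: number): boolean {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(
      this.burstLimit,
      this.tokens + elapsedSeconds * this.requestsPerSecond,
    );
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Exponential backoff: the delay doubles on each retry attempt (0-based).
// `baseMs` is a hypothetical parameter; the README does not specify a base delay.
function backoffDelayMs(attempt: number, baseMs: number): number {
  return baseMs * 2 ** attempt;
}
```

Passing the clock value in explicitly keeps the sketch deterministic and easy to test; a real limiter would typically read the current time itself.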
177
+ #### Check Report Schema
178
+
179
+ `bellwether-check.json` includes a `$schema` pointer and is validated before writing.
180
+ Schema URL:
181
+
182
+ ```
183
+ https://unpkg.com/@dotsetlabs/bellwether/schemas/bellwether-check.schema.json
184
+ ```
185
+
186
+ ### Explore Command (Optional)
187
+
188
+ ```bash
189
+ bellwether init --preset local npx @mcp/your-server # Uses local Ollama (free)
190
+ # or
191
+ bellwether init --preset thorough npx @mcp/server # Uses OpenAI (requires API key)
192
+
193
+ bellwether explore
194
+ ```
195
+
196
+ - Requires LLM (Ollama for free local, or OpenAI/Anthropic)
197
+ - Multi-persona testing (technical writer, security tester, QA, novice)
198
+ - Generates `AGENTS.md` documentation (filename configurable via `output.files.agentsDoc`)
199
+ - Better for local development and deep exploration
200
+
201
+ ### Core Commands
202
+
203
+ ```bash
204
+ # Initialize configuration (creates bellwether.yaml)
205
+ bellwether init npx @mcp/server
206
+ bellwether init --preset ci npx @mcp/server
207
+
208
+ # Validate configuration (no tests)
209
+ bellwether validate-config
210
+
211
+ # Check for drift (free, fast, deterministic)
212
+ bellwether check # Uses server.command from config
213
+ bellwether check npx @mcp/server # Override server command
214
+ bellwether check --fail-on-drift # Override baseline.failOnDrift from config
215
+ bellwether check --format junit # JUnit XML output for CI
216
+ bellwether check --format sarif # SARIF output for GitHub Code Scanning
217
+
218
+ # Configure performance, parallelism, incremental, and security in bellwether.yaml
219
+ # (check.parallel, check.incremental, check.security, check.sampling)
220
+
221
+ # Explore behavior (LLM-powered)
222
+ bellwether explore # Uses server.command from config
223
+ bellwether explore npx @mcp/server # Override server command
224
+
225
+ # Discover server capabilities
226
+ bellwether discover npx @mcp/server
227
+
228
+ # Watch mode (re-check on file changes, uses config)
229
+ bellwether watch
230
+
231
+ # Search MCP Registry
232
+ bellwether registry filesystem
233
+ bellwether registry database --limit 5
234
+
235
+ # Generate verification report
236
+ bellwether verify --tier gold
237
+
238
+ # Validate against contracts
239
+ bellwether contract generate npx @mcp/server
240
+ bellwether contract validate npx @mcp/server
241
+ bellwether contract show # Display current contract
242
+
243
+ # Manage golden outputs (deterministic regression tests)
244
+ bellwether golden save --tool my_tool --args '{"id":"123"}'
245
+ bellwether golden compare
246
+ ```
247
+
248
+ ### Baseline Commands
249
+
250
+ ```bash
251
+ # Save test results as baseline
252
+ bellwether baseline save
253
+ bellwether baseline save ./my-baseline.json
254
+
255
+ # Compare test results against baseline
256
+ bellwether baseline compare --fail-on-drift # Uses baseline.comparePath or baseline.path from config
257
+ bellwether baseline compare ./baseline.json --ignore-version-mismatch # Force compare incompatible versions
258
+
259
+ # Show baseline contents
260
+ bellwether baseline show
261
+ bellwether baseline show ./baseline.json --json
262
+
263
+ # Compare two baseline files
264
+ bellwether baseline diff v1.json v2.json
265
+ bellwether baseline diff v1.json v2.json --ignore-version-mismatch # Force compare incompatible versions
266
+
267
+ # Migrate baseline to current format version
268
+ bellwether baseline migrate ./bellwether-baseline.json
269
+ bellwether baseline migrate ./baseline.json --dry-run
270
+ bellwether baseline migrate ./baseline.json --info
271
+
272
+ # Accept drift as intentional (update baseline)
273
+ bellwether baseline accept --reason "Intentional API change"
274
+ bellwether baseline accept --dry-run # Preview without saving
275
+ bellwether baseline accept --force # Required for breaking changes
276
+ ```
277
+
278
+ ### Baseline Format Versioning
279
+
280
+ Baselines use semantic versioning (e.g., `1.0.0`) for the format version:
281
+
282
+ - **Major version** - Breaking contract changes (removed fields, type changes)
283
+ - **Minor version** - New optional fields (backwards compatible)
284
+ - **Patch version** - Bug fixes in baseline generation
285
+
286
+ **Compatibility rules:**
287
+ - Same major version = Compatible (can compare baselines)
288
+ - Different major version = Incompatible (requires migration)
289
+
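The compatibility rules above reduce to comparing major version numbers. A minimal sketch of that comparison follows. `majorOf` and `canCompareBaselines` are hypothetical names and are not taken from the package's own `version` module.

```typescript
// Extracts the major component from a format version such as "1.0.0" or "v1.0.0".
function majorOf(version: string): number {
  const head = version.replace(/^v/, "").split(".")[0] ?? "";
  const major = Number.parseInt(head, 10);
  if (Number.isNaN(major)) {
    throw new Error(`Invalid format version: ${version}`);
  }
  return major;
}

// Same major version: compatible. Different major version: migration required.
function canCompareBaselines(versionA: string, versionB: string): boolean {
  return majorOf(versionA) === majorOf(versionB);
}
```

Under this rule `1.0.0` and `1.2.3` can be compared directly, while `1.0.0` and `2.0.0` cannot.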
290
+ When comparing baselines with incompatible versions, the CLI will show an error:
291
+
292
+ ```
293
+ Cannot compare baselines with incompatible format versions: v1.0.0 vs v2.0.0.
294
+ Use 'bellwether baseline migrate' to upgrade the older baseline,
295
+ or use --ignore-version-mismatch to force comparison (results may be incorrect).
296
+ ```
297
+
298
+ To upgrade older baselines:
299
+
300
+ ```bash
301
+ # Check if migration is needed
302
+ bellwether baseline migrate ./baseline.json --info
303
+
304
+ # Preview changes without writing
305
+ bellwether baseline migrate ./baseline.json --dry-run
306
+
307
+ # Perform migration
308
+ bellwether baseline migrate ./baseline.json
309
+ ```
310
+
311
+ ### Cloud Commands
312
+
313
+ ```bash
314
+ # Authenticate with Bellwether Cloud
315
+ bellwether login
316
+ bellwether login --status
317
+ bellwether login --logout
318
+
319
+ # Manage team selection (for multi-team users)
320
+ bellwether teams # List your teams
321
+ bellwether teams switch # Interactive team selection
322
+ bellwether teams switch <id> # Switch to specific team
323
+ bellwether teams current # Show current active team
324
+
325
+ # Link project to cloud
326
+ bellwether link
327
+ bellwether link --status
328
+ bellwether link --unlink
329
+
330
+ # List cloud projects
331
+ bellwether projects
332
+ bellwether projects --json
333
+
334
+ # Upload baseline to cloud
335
+ bellwether upload
336
+ bellwether upload --ci --fail-on-drift
337
+
338
+ # View baseline version history
339
+ bellwether history
340
+ bellwether history --limit 20
341
+
342
+ # Compare cloud baseline versions
343
+ bellwether diff 1 2
344
+
345
+ # Get verification badge
346
+ bellwether badge --markdown
347
+ ```
348
+
349
+ ### Auth Commands
350
+
351
+ ```bash
352
+ # Manage LLM API keys (stored in system keychain)
353
+ bellwether auth # Interactive API key setup
354
+ bellwether auth status # Show configured providers
355
+ bellwether auth add openai # Add a specific provider key
356
+ bellwether auth remove openai # Remove a specific provider key
357
+ bellwether auth clear # Remove all stored keys
358
+ ```
359
+
360
+ ## Security Testing
361
+
362
+ Run deterministic security vulnerability testing on your MCP tools:
363
+
364
+ ```bash
365
+ # Enable security testing in config
366
+ bellwether init --preset security npx @mcp/your-server
367
+ bellwether check
368
+ ```
369
+
370
+ ### Security Categories
371
+
372
+ | Category | Description | CWE |
373
+ |:---------|:------------|:----|
374
+ | `sql_injection` | SQL injection payloads | CWE-89 |
375
+ | `xss` | Cross-site scripting payloads | CWE-79 |
376
+ | `path_traversal` | Path traversal attempts | CWE-22 |
377
+ | `command_injection` | Command injection payloads | CWE-78 |
378
+ | `ssrf` | Server-side request forgery | CWE-918 |
379
+ | `error_disclosure` | Sensitive error disclosure | CWE-209 |
380
+
381
+ ### Security Baseline
382
+
383
+ Security findings are stored in your baseline and compared across runs:
384
+
385
+ ```bash
386
+ # Enable security testing in config, then run check
387
+ bellwether check
388
+ bellwether baseline save
389
+
390
+ # On next run, compare security posture
391
+ bellwether check
392
+ # Reports: new findings, resolved findings, risk score changes
393
+ ```
394
+
395
+ ### Output
396
+
397
+ Security findings appear in:
398
+ - **CONTRACT.md** - Security Baseline section with findings and risk scores
399
+ - **Drift reports** - New/resolved findings when comparing baselines
400
+ - **SARIF format** - Integrates with GitHub Code Scanning
401
+
402
+ ### Risk Levels
403
+
404
+ | Level | Score Range | Description |
405
+ |:------|:------------|:------------|
406
+ | Critical | 80-100 | Immediate action required |
407
+ | High | 60-79 | Serious vulnerability |
408
+ | Medium | 40-59 | Moderate risk |
409
+ | Low | 20-39 | Minor concern |
410
+ | Info | 0-19 | Informational |
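As a rough sketch of how the score ranges in the table map to levels, the function below buckets a 0-100 score. The name `riskLevel` is hypothetical and the range check is an assumption; only the thresholds come from the table.

```typescript
type RiskLevel = "Critical" | "High" | "Medium" | "Low" | "Info";

// Hypothetical helper: maps a 0-100 risk score to the levels in the table.
function riskLevel(score: number): RiskLevel {
  if (score < 0 || score > 100) {
    throw new RangeError(`Score out of range: ${score}`);
  }
  if (score >= 80) return "Critical";
  if (score >= 60) return "High";
  if (score >= 40) return "Medium";
  if (score >= 20) return "Low";
  return "Info";
}

console.log(riskLevel(85)); // Critical
console.log(riskLevel(19)); // Info
```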
411
+
412
+ ## Semantic Validation
413
+
414
+ The check command automatically infers semantic types from parameter names and descriptions, then generates targeted validation tests.
415
+
416
+ ### Inferred Semantic Types
417
+
418
+ | Type | Example Parameters | Validation |
419
+ |:-----|:-------------------|:-----------|
420
+ | `date_iso8601` | `created_date`, `birth_day` | YYYY-MM-DD format |
421
+ | `datetime` | `created_at`, `updated_at` | ISO 8601 datetime |
422
+ | `timestamp` | `unix_epoch`, `time_ms` | Positive integer |
423
+ | `email` | `user_email`, `contact_email` | Valid email format |
424
+ | `url` | `website_url`, `api_endpoint` | Valid URL format |
425
+ | `identifier` | `user_id`, `order_uuid` | Non-empty string |
426
+ | `ip_address` | `server_ip`, `client_ip` | IPv4 or IPv6 |
427
+ | `phone` | `phone_number`, `mobile` | At least 7 digits |
428
+ | `percentage` | `tax_rate`, `progress` | Numeric value |
429
+ | `amount_currency` | `total_price`, `balance` | Numeric value |
430
+ | `file_path` | `file_path`, `directory` | Path string |
431
+ | `json` | `config_data`, `payload` | Valid JSON |
432
+ | `base64` | `encoded_data`, `b64_content` | Valid base64 |
433
+ | `regex` | `filter_pattern`, `regex` | Valid regex |
434
+
435
+ ### How It Works
436
+
437
+ 1. **Inference**: Parameters are analyzed based on name patterns and descriptions
438
+ 2. **Test Generation**: Invalid values are generated for each inferred type
439
+ 3. **Validation**: Tests verify that tools properly reject invalid semantic values
440
+ 4. **Documentation**: Inferred types appear in CONTRACT.md
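The inference step can be pictured as name-pattern matching. The sketch below is illustrative only: the rule list and the function `inferSemanticType` are hypothetical, cover just a few of the types in the table, and ignore parameter descriptions, which the real inference also uses.

```typescript
// Hypothetical name-pattern rules; the actual patterns used by
// Bellwether are not documented here.
const RULES: Array<[RegExp, string]> = [
  [/email/i, "email"],
  [/_at$/i, "datetime"],
  [/(_ip|ip_address)$/i, "ip_address"],
  [/url$/i, "url"],
  [/(_id|_uuid)$/i, "identifier"],
];

// Returns the first matching semantic type, or undefined if none applies.
function inferSemanticType(paramName: string): string | undefined {
  for (const [pattern, semanticType] of RULES) {
    if (pattern.test(paramName)) return semanticType;
  }
  return undefined;
}

console.log(inferSemanticType("user_email")); // email
console.log(inferSemanticType("created_at")); // datetime
console.log(inferSemanticType("count"));      // undefined
```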
441
+
442
+ Semantic validation runs automatically as part of `bellwether check` - no additional flags needed.
443
+
444
+ ## Response Schema Tracking
445
+
446
+ The check command tracks response schema consistency across multiple test samples, detecting when tools return inconsistent or evolving response structures.
447
+
448
+ ### What It Tracks
449
+
450
+ | Aspect | Detection | Impact |
451
+ |:-------|:----------|:-------|
452
+ | **Field consistency** | Fields appearing inconsistently across samples | Schema instability |
453
+ | **Type changes** | Field types varying between responses | Breaking changes |
454
+ | **Required changes** | Fields becoming required/optional | Contract changes |
455
+ | **Schema evolution** | Structural changes between baselines | API drift |
456
+
457
+ ### Stability Grades
458
+
459
+ | Grade | Confidence | Meaning |
460
+ |:------|:-----------|:--------|
461
+ | A | 95%+ | Fully stable, consistent responses |
462
+ | B | 85%+ | Mostly stable, minor variations |
463
+ | C | 70%+ | Moderately stable, some inconsistency |
464
+ | D | 50%+ | Unstable, significant variations |
465
+ | F | <50% | Very unstable, unreliable responses |
466
+ | N/A | - | Insufficient samples (< 3) |
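The grade table can be read as a threshold lookup. The following is a sketch under the assumption that confidence is expressed as a percentage and that the sample-count check takes precedence; `stabilityGrade` is a hypothetical name.

```typescript
type StabilityGrade = "A" | "B" | "C" | "D" | "F" | "N/A";

// Hypothetical helper mirroring the thresholds in the table above.
function stabilityGrade(confidencePercent: number, sampleCount: number): StabilityGrade {
  if (sampleCount < 3) return "N/A"; // too few samples to grade
  if (confidencePercent >= 95) return "A";
  if (confidencePercent >= 85) return "B";
  if (confidencePercent >= 70) return "C";
  if (confidencePercent >= 50) return "D";
  return "F";
}

console.log(stabilityGrade(97, 5)); // A
console.log(stabilityGrade(97, 2)); // N/A
```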
467
+
468
+ ### Breaking vs Non-Breaking Changes
469
+
470
+ **Breaking changes** (fail CI):
471
+ - Fields removed from responses
472
+ - Types changed to incompatible types (e.g., `string` → `number`)
473
+ - Previously optional fields becoming required
474
+
475
+ **Non-breaking changes** (warning):
476
+ - New fields added
477
+ - Required fields becoming optional
478
+ - Compatible type widening (e.g., `integer` → `number`)
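To make the two lists concrete, here is a small sketch that classifies differences between two response schemas. The `FieldSpec` shape and `classifyChanges` function are hypothetical, and treating `integer` → `number` as the only compatible widening is a simplifying assumption.

```typescript
interface FieldSpec { type: string; required: boolean }
type Schema = Record<string, FieldSpec>;
interface Change { field: string; kind: string; severity: "breaking" | "non-breaking" }

// Hypothetical classifier following the breaking/non-breaking rules above.
function classifyChanges(before: Schema, after: Schema): Change[] {
  const changes: Change[] = [];
  for (const [field, old] of Object.entries(before)) {
    const next = after[field];
    if (next === undefined) {
      changes.push({ field, kind: "removed", severity: "breaking" });
      continue;
    }
    if (old.type !== next.type) {
      // Assumption: integer -> number is the only compatible widening.
      const widening = old.type === "integer" && next.type === "number";
      changes.push({ field, kind: "type-changed", severity: widening ? "non-breaking" : "breaking" });
    }
    if (!old.required && next.required) {
      changes.push({ field, kind: "became-required", severity: "breaking" });
    } else if (old.required && !next.required) {
      changes.push({ field, kind: "became-optional", severity: "non-breaking" });
    }
  }
  for (const field of Object.keys(after)) {
    if (!(field in before)) {
      changes.push({ field, kind: "added", severity: "non-breaking" });
    }
  }
  return changes;
}

const result = classifyChanges(
  { id: { type: "string", required: true }, count: { type: "integer", required: true } },
  { id: { type: "number", required: true }, count: { type: "number", required: true } },
);
console.log(result);
```

In this example the `id` change (`string` → `number`) is classified as breaking, while the `count` change (`integer` → `number`) is a non-breaking widening.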
479
+
480
+ ### Output
481
+
482
+ Schema evolution findings appear in:
483
+ - **CONTRACT.md** - Schema Stability section with grades and consistency metrics
484
+ - **Drift reports** - Structure changes, breaking changes, stability changes
485
+ - **JUnit/SARIF** - Test cases for schema evolution issues
486
+
487
+ Response schema tracking runs automatically during `bellwether check` - no additional flags needed.
488
+
489
+ ## Error Analysis
490
+
491
+ The check command analyzes errors returned by tools, inferring likely root causes and suggesting remediations.
492
+
493
+ ### What It Analyzes
494
+
495
+ | Aspect | Detection | Impact |
496
+ |:-------|:----------|:-------|
497
+ | **HTTP status codes** | Parses 4xx/5xx codes from messages | Error categorization |
498
+ | **Root cause** | Infers cause from error patterns | Debugging guidance |
499
+ | **Remediation** | Generates fix suggestions | Actionable solutions |
500
+ | **Transient errors** | Identifies retryable errors | Retry strategies |
501
+ | **Error trends** | Tracks errors across baselines | Regression detection |
502
+
503
+ ### Error Categories
504
+
505
+ | Category | HTTP Codes | Description |
506
+ |:---------|:-----------|:------------|
507
+ | Validation Error | 400 | Invalid input or missing parameters |
508
+ | Authentication Error | 401, 403 | Auth or permission failure |
509
+ | Not Found | 404 | Resource does not exist |
510
+ | Conflict | 409 | Resource state conflict |
511
+ | Rate Limited | 429 | Too many requests |
512
+ | Server Error | 5xx | Internal server error |
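A minimal sketch of the categorization, assuming the status code is extracted from the error message with a regular expression. The function `categorizeError` and the `"Unknown"` fallback (used for messages without a code and for 4xx codes not listed in the table) are assumptions, not documented behavior.

```typescript
type ErrorCategory =
  | "Validation Error"
  | "Authentication Error"
  | "Not Found"
  | "Conflict"
  | "Rate Limited"
  | "Server Error"
  | "Unknown";

// Hypothetical helper: finds the first 4xx/5xx code in a message and
// maps it to the categories in the table above.
function categorizeError(message: string): ErrorCategory {
  const match = message.match(/\b([45]\d{2})\b/);
  if (!match) return "Unknown";
  const code = Number(match[1]);
  if (code === 400) return "Validation Error";
  if (code === 401 || code === 403) return "Authentication Error";
  if (code === 404) return "Not Found";
  if (code === 409) return "Conflict";
  if (code === 429) return "Rate Limited";
  if (code >= 500) return "Server Error";
  return "Unknown"; // other 4xx codes are not covered by the table
}

console.log(categorizeError("Request failed with status 429")); // Rate Limited
console.log(categorizeError("HTTP 503 Service Unavailable"));   // Server Error
```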
513
+
514
+ ### Error Trend Detection
515
+
516
+ When comparing baselines, Bellwether tracks:
517
+ - **New error types** - Errors that didn't occur before
518
+ - **Resolved errors** - Errors that no longer occur
519
+ - **Increasing errors** - Error frequency increased by more than 50%
519
+ - **Decreasing errors** - Error frequency decreased by more than 50%
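The four trend types can be sketched as a comparison of error counts between two baselines. The function `errorTrend`, the use of raw counts as "frequency", and the extra `"stable"` result for changes within 50% are assumptions made for illustration.

```typescript
type Trend = "new" | "resolved" | "increasing" | "decreasing" | "stable";

// Hypothetical helper comparing one error type's count across baselines.
function errorTrend(previousCount: number, currentCount: number): Trend {
  if (previousCount === 0 && currentCount === 0) return "stable";
  if (previousCount === 0) return "new";
  if (currentCount === 0) return "resolved";
  const relativeChange = (currentCount - previousCount) / previousCount;
  if (relativeChange > 0.5) return "increasing";
  if (relativeChange < -0.5) return "decreasing";
  return "stable";
}

console.log(errorTrend(0, 3));  // new
console.log(errorTrend(4, 7));  // increasing (+75%)
console.log(errorTrend(10, 4)); // decreasing (-60%)
```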
521
+
522
+ ### Output
523
+
524
+ Error analysis findings appear in:
525
+ - **CONTRACT.md** - Error Analysis section with root causes and remediations
526
+ - **Drift reports** - Error trend changes between baselines
527
+ - **JUnit/SARIF** - Test cases for error trend issues
528
+
529
+ Error analysis runs automatically during `bellwether check` - no additional flags needed.
530
+
531
+ ## Performance Confidence
532
+
533
+ The check command calculates statistical confidence for performance metrics, indicating how reliable your performance baselines are.
534
+
535
+ ### What It Measures
536
+
537
+ | Metric | Description | Impact |
538
+ |:-------|:------------|:-------|
539
+ | **Sample count** | Number of latency measurements | More samples = higher confidence |
540
+ | **Standard deviation** | Variability in response times | Lower = more consistent |
541
+ | **Coefficient of variation** | Relative variability (stdDev / mean) | Lower = more predictable |
542
+
543
+ ### Confidence Levels
544
+
545
+ | Level | Requirements | Meaning |
546
+ |:------|:-------------|:--------|
547
+ | HIGH | 10+ samples, CV ≤ 30% | Reliable baseline for regression detection |
548
+ | MEDIUM | 5+ samples, CV ≤ 50% | Moderately reliable, use with caution |
549
+ | LOW | < 5 samples or CV > 50% | Unreliable baseline, collect more data |
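The confidence levels combine sample count with the coefficient of variation (CV = stdDev / mean). The sketch below shows one way to compute them; the function names are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator is used.

```typescript
// Coefficient of variation: standard deviation divided by mean.
// Assumption: population standard deviation (divide by n).
function coefficientOfVariation(samples: number[]): number {
  const mean = samples.reduce((sum, x) => sum + x, 0) / samples.length;
  const variance = samples.reduce((sum, x) => sum + (x - mean) ** 2, 0) / samples.length;
  return Math.sqrt(variance) / mean;
}

type Confidence = "HIGH" | "MEDIUM" | "LOW";

// Hypothetical helper applying the thresholds from the table above.
function confidenceLevel(latenciesMs: number[]): Confidence {
  if (latenciesMs.length < 5) return "LOW"; // also covers an empty list
  const cv = coefficientOfVariation(latenciesMs);
  if (latenciesMs.length >= 10 && cv <= 0.3) return "HIGH";
  if (cv <= 0.5) return "MEDIUM";
  return "LOW";
}

console.log(confidenceLevel([100, 105, 98, 102, 101, 99, 103, 100, 97, 104]));
console.log(confidenceLevel([100, 200]));
```

With ten tightly clustered samples the first call should land in the HIGH bucket, while two samples are always LOW regardless of their spread.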
550
+
551
+ ### Why It Matters
552
+
553
+ Performance regressions detected with low confidence may not be real:
554
+ - **Few samples**: Random variation can look like regression
555
+ - **High variability**: Tool may have inconsistent performance
556
+
557
+ When confidence is low, the CLI recommends:
558
+ ```
559
+ Increase `check.sampling.minSamples` for reliable baselines
560
+ ```
561
+
562
+ ### Output
563
+
564
+ Confidence information appears in:
565
+ - **CONTRACT.md** - Performance Baseline section with confidence column
566
+ - **Drift reports** - Regression entries are marked when confidence is low (e.g., `(low confidence)`)
567
+ - **JUnit/SARIF** - Test cases for low confidence tools
568
+ - **GitHub Actions** - Annotations for confidence warnings
569
+
570
+ ### Example Output
571
+
572
+ ```
573
+ ─── Performance Regressions ───
574
+ ! read_file: 100ms → 150ms (+50%) (low confidence)
575
+ ! write_file: 200ms → 250ms (+25%)
576
+
577
+ Note: Some tools have low confidence metrics.
578
+ Run with more samples for reliable baselines: read_file
579
+ ```
580
+
581
+ Performance confidence runs automatically during `bellwether check` - no additional flags needed.
582
+
583
+ ## Documentation Quality Scoring
584
+
585
+ The check command calculates a documentation quality score for your MCP server, evaluating how well tools and parameters are documented.
586
+
587
+ ### What It Measures
588
+
589
+ | Component | Weight | Description |
590
+ |:----------|:-------|:------------|
591
+ | **Description Coverage** | 30% | Percentage of tools with descriptions |
592
+ | **Description Quality** | 30% | Length, clarity, and actionable language |
593
+ | **Parameter Documentation** | 25% | Percentage of parameters with descriptions |
594
+ | **Example Coverage** | 15% | Percentage of tools with schema examples |
595
+
596
+ ### Grade Thresholds
597
+
598
+ | Grade | Score Range | Meaning |
599
+ |:------|:------------|:--------|
600
+ | A | 90-100 | Excellent documentation |
601
+ | B | 80-89 | Good documentation |
602
+ | C | 70-79 | Acceptable documentation |
603
+ | D | 60-69 | Poor documentation |
604
+ | F | 0-59 | Failing documentation |
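Putting the component weights and grade thresholds together, a score could be computed roughly as follows. This is a sketch: `documentationScore` and `documentationGrade` are hypothetical names, each component is assumed to be a 0-100 value, and rounding to the nearest integer is an assumption.

```typescript
interface DocComponents {
  descriptionCoverage: number;     // 0-100, weight 30%
  descriptionQuality: number;      // 0-100, weight 30%
  parameterDocumentation: number;  // 0-100, weight 25%
  exampleCoverage: number;         // 0-100, weight 15%
}

// Hypothetical helper: weighted sum of the four components.
function documentationScore(c: DocComponents): number {
  const weighted =
    0.3 * c.descriptionCoverage +
    0.3 * c.descriptionQuality +
    0.25 * c.parameterDocumentation +
    0.15 * c.exampleCoverage;
  return Math.round(weighted); // rounding is an assumption
}

// Hypothetical helper: maps a score to the grade thresholds above.
function documentationGrade(score: number): "A" | "B" | "C" | "D" | "F" {
  if (score >= 90) return "A";
  if (score >= 80) return "B";
  if (score >= 70) return "C";
  if (score >= 60) return "D";
  return "F";
}

const score = documentationScore({
  descriptionCoverage: 100,
  descriptionQuality: 80,
  parameterDocumentation: 60,
  exampleCoverage: 0,
});
console.log(score, documentationGrade(score)); // 69 D
```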
605
+
606
+ ### Quality Criteria
607
+
608
+ Descriptions are scored based on:
609
+ - **Length**: At least 50 characters for "good", 20+ for "acceptable"
610
+ - **Imperative verbs**: Starting with action words (Creates, Gets, Deletes)
611
+ - **Behavior description**: Mentioning what the tool returns or provides
612
+ - **Examples/specifics**: Including "e.g.", "example", or "such as"
613
+
614
+ ### Issue Types
615
+
616
+ | Issue | Severity | Description |
617
+ |:------|:---------|:------------|
618
+ | Missing Description | Error | Tool has no description |
619
+ | Short Description | Warning | Description under 20 characters |
620
+ | Missing Param Description | Warning | Parameter has no description |
621
+ | No Examples | Info | Schema has no examples |
622
+
623
+ ### Output
624
+
625
+ Documentation scores appear in:
626
+ - **CONTRACT.md** - Documentation Quality section with breakdown
627
+ - **Drift reports** - Score changes between baselines
628
+ - **JUnit/SARIF** - Test cases for documentation issues
629
+ - **GitHub Actions** - Annotations for quality degradation
630
+
631
+ ### Example Output
632
+
633
+ ```
634
+ ─── Documentation Quality ───
635
+ ✓ Score: 60 → 85 (+25)
636
+ Grade: D → B
637
+ ✓ Issues fixed: 3
638
+
639
+ ─── Statistics ───
640
+ Documentation score: 85/100 (B)
641
+ Documentation change: +25
642
+ ```
643
+
644
+ Documentation quality scoring runs automatically during `bellwether check` - no additional flags needed.
645
+
646
+ ## Custom Test Scenarios
647
+
648
+ Define deterministic tests in `bellwether-tests.yaml`:
649
+
650
+ ```yaml
651
+ version: "1"
652
+ scenarios:
653
+ - tool: get_weather
654
+ args:
655
+ location: "San Francisco"
656
+ assertions:
657
+ - path: "content[0].text"
658
+ condition: "contains"
659
+ value: "temperature"
660
+ ```
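For intuition, the `contains` assertion above can be thought of as resolving `path` against the tool response and checking for a substring. The sketch below assumes a simple dot-and-bracket path syntax; `getPath` and `assertContains` are hypothetical names, and the actual path syntax and condition semantics supported by Bellwether may differ.

```typescript
// Hypothetical path resolver for paths such as "content[0].text".
function getPath(value: unknown, path: string): unknown {
  // Split "content[0].text" into ["content", "0", "text"].
  const keys = path.match(/[^.[\]]+/g) ?? [];
  let current: unknown = value;
  for (const key of keys) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Hypothetical evaluation of a "contains" assertion.
function assertContains(response: unknown, path: string, expected: string): boolean {
  const actual = getPath(response, path);
  return typeof actual === "string" && actual.includes(expected);
}

const response = { content: [{ type: "text", text: "Current temperature: 18C" }] };
console.log(assertContains(response, "content[0].text", "temperature")); // true
console.log(assertContains(response, "content[0].text", "humidity"));    // false
```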
661
+
662
+ Reference in your config:
663
+
664
+ ```yaml
665
+ # bellwether.yaml
666
+ scenarios:
667
+ path: "./bellwether-tests.yaml"
668
+ only: true # Run only scenarios, no LLM tests
669
+ ```
670
+
671
+ Then run:
672
+
673
+ ```bash
674
+ bellwether check # Run scenarios as part of check
675
+ bellwether explore # Run scenarios as part of explore
676
+ ```
677
+
678
+ ## Presets
679
+
680
+ | Preset | Optimized For | Description |
681
+ |:-------|:--------------|:------------|
682
+ | (default) | check | Zero LLM, free, deterministic |
683
+ | `ci` | check | Optimized for CI/CD, fails on drift |
684
+ | `security` | explore | Security + technical personas, OpenAI |
685
+ | `thorough` | explore | All 4 personas, workflow discovery |
686
+ | `local` | explore | Local Ollama, free, private |
687
+
688
+ Use with: `bellwether init --preset <name> npx @mcp/server`
689
+
690
+ ## GitHub Action
691
+
692
+ ```yaml
693
+ - name: Detect Behavioral Drift
694
+ uses: dotsetlabs/bellwether/action@v1
695
+ with:
696
+ server-command: 'npx @mcp/your-server'
697
+ baseline-path: './bellwether-baseline.json'
698
+ fail-on-severity: 'warning'
699
+ ```
700
+
701
+ See [action/README.md](./action/README.md) for full documentation.
702
+
703
+ ## Environment Variables
704
+
705
+ | Variable | Description |
706
+ |:---------|:------------|
707
+ | `OPENAI_API_KEY` | OpenAI API key (explore command) |
708
+ | `ANTHROPIC_API_KEY` | Anthropic API key (explore command) |
709
+ | `OLLAMA_BASE_URL` | Ollama server URL (default: `http://localhost:11434`) |
710
+ | `BELLWETHER_SESSION` | Cloud session token for CI/CD |
711
+ | `BELLWETHER_API_URL` | Cloud API URL (default: `https://api.bellwether.sh`) |
712
+ | `BELLWETHER_TEAM_ID` | Override active team for cloud operations (multi-team CI/CD) |
713
+ | `BELLWETHER_REGISTRY_URL` | Registry API URL override (for self-hosted registries) |
714
+
715
+ See [.env.example](./.env.example) for full documentation.
716
+
717
+ ## Development
718
+
719
+ ```bash
720
+ git clone https://github.com/dotsetlabs/bellwether
721
+ cd bellwether/cli
722
+ npm install
723
+ npm run build
724
+ npm test
725
+
726
+ # Run locally
727
+ ./dist/cli/index.js check npx @mcp/server
728
+ ./dist/cli/index.js explore npx @mcp/server
729
+ ```
730
+
731
+ ## License
732
+
733
+ MIT License - see [LICENSE](./LICENSE) for details.
734
+
735
+ ---
736
+
737
+ <p align="center">
738
+ Built by <a href="https://dotsetlabs.com">Dotset Labs LLC</a>
739
+ </p>