@kryptosai/mcp-observatory 0.23.0 → 0.24.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (46)
  1. package/README.md +8 -7
  2. package/dist/src/commands/init-ci.d.ts +3 -0
  3. package/dist/src/commands/init-ci.js +24 -12
  4. package/dist/src/commands/init-ci.js.map +1 -1
  5. package/dist/src/reporters/pr-comment.js +6 -2
  6. package/dist/src/reporters/pr-comment.js.map +1 -1
  7. package/docs/certification-campaign-template.md +2 -2
  8. package/docs/mcp-safety-report-latest.md +12 -7
  9. package/docs/mcp-server-safety-index.md +56 -80
  10. package/docs/methodology.md +90 -0
  11. package/docs/metrics-dashboard.md +105 -0
  12. package/docs/paid-pilot-offer.md +21 -5
  13. package/docs/project-case-study.md +12 -8
  14. package/docs/proof.md +28 -15
  15. package/docs/public-post-drafts.md +18 -6
  16. package/docs/publish-readiness.md +1 -5
  17. package/docs/reference-evaluations.md +1 -1
  18. package/docs/safety-index/artifacts/antv-chart-server.json +2765 -0
  19. package/docs/safety-index/artifacts/antv-chart-server.md +156 -0
  20. package/docs/safety-index/artifacts/browsermcp-server.json +416 -0
  21. package/docs/safety-index/artifacts/browsermcp-server.md +163 -0
  22. package/docs/safety-index/artifacts/context7-server.json +286 -0
  23. package/docs/safety-index/artifacts/context7-server.md +163 -0
  24. package/docs/safety-index/artifacts/everything-server.json +482 -0
  25. package/docs/safety-index/artifacts/everything-server.md +163 -0
  26. package/docs/safety-index/artifacts/executeautomation-playwright-server.json +955 -0
  27. package/docs/safety-index/artifacts/executeautomation-playwright-server.md +163 -0
  28. package/docs/safety-index/artifacts/filesystem-server.json +583 -0
  29. package/docs/safety-index/artifacts/filesystem-server.md +156 -0
  30. package/docs/safety-index/artifacts/memory-server.json +469 -0
  31. package/docs/safety-index/artifacts/memory-server.md +156 -0
  32. package/docs/safety-index/artifacts/opentofu-server.json +387 -0
  33. package/docs/safety-index/artifacts/opentofu-server.md +163 -0
  34. package/docs/safety-index/artifacts/playwright-mcp-server.json +919 -0
  35. package/docs/safety-index/artifacts/playwright-mcp-server.md +156 -0
  36. package/docs/safety-index/artifacts/promptopia-server.json +442 -0
  37. package/docs/safety-index/artifacts/promptopia-server.md +156 -0
  38. package/docs/safety-index/artifacts/puppeteer-server.json +377 -0
  39. package/docs/safety-index/artifacts/puppeteer-server.md +163 -0
  40. package/docs/safety-index/artifacts/ref-tools-server.json +262 -0
  41. package/docs/safety-index/artifacts/ref-tools-server.md +156 -0
  42. package/docs/safety-index/artifacts/sequential-thinking-server.json +286 -0
  43. package/docs/safety-index/artifacts/sequential-thinking-server.md +156 -0
  44. package/docs/safety-index/maintainer-note-template.md +25 -0
  45. package/docs/safety-index/targets.json +192 -0
  46. package/package.json +12 -9
package/docs/metrics-dashboard.md ADDED
@@ -0,0 +1,105 @@
1
+ # Local Metrics Dashboard
2
+
3
+ MCP Observatory includes a private laptop dashboard for collecting telemetry, GitHub activity, and npm download data going forward.
4
+
5
+ The dashboard is intentionally local. It stores raw telemetry in a SQLite database on your laptop, then renders a sanitized static HTML view that is safe to screenshot or skim without exposing raw emails, hostnames, private URLs, tokens, or private command bodies.
6
+
7
+ The layout follows an App Store Connect-style breakdown:
8
+
9
+ - Overview
10
+ - Acquisition
11
+ - Downloads
12
+ - Usage
13
+ - Reliability
14
+
15
+ Daily rows are shown newest to oldest so the most recent project activity is always at the top.
16
+
17
+ ## Refresh
18
+
19
+ ```bash
20
+ npm run metrics:refresh
21
+ npm run metrics:serve
22
+ ```
23
+
24
+ Outputs are written to:
25
+
26
+ - `.mcp-observatory-metrics/observatory.sqlite`
27
+ - `.mcp-observatory-metrics/dashboard/index.html`
28
+ - `.mcp-observatory-metrics/dashboard/latest.json`
29
+ - `.mcp-observatory-metrics/logs/`
30
+
31
+ The local metrics directory is ignored by git and should not be published.
32
+
33
+ ## Data Sources
34
+
35
+ The collector stores each source independently. If one source fails, the dashboard still builds from the last good SQLite data and records the failure in the Reliability section.
36
+
37
+ | Source | Data |
38
+ | --- | --- |
39
+ | Telemetry | Existing Cloudflare D1 export flow via `scripts/export-telemetry-d1.ts` |
40
+ | GitHub | clones, views, referrers, popular paths, repo snapshot, latest release, open issue/PR counts, workflow runs |
41
+ | npm | public daily downloads for `@kryptosai/mcp-observatory` |
42
+
43
+ GitHub traffic APIs have a limited visible window, so the collector stores snapshots locally going forward. npm daily buckets can lag; the dashboard labels npm data as complete public days rather than assuming current-day zero.
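The "complete public days" labelling described in this paragraph can be sketched as follows. This is a minimal illustration, not the collector's actual code: the `DailyDownloads` shape and the `completeDays` helper are hypothetical names, and treating every UTC day before today as complete is an assumption about how lag is handled.

```typescript
// Hypothetical bucket shape: one row per UTC day, as "YYYY-MM-DD".
interface DailyDownloads {
  day: string;
  downloads: number;
}

// Keep only days strictly before the current UTC day, newest first.
// Assumption: the current day's bucket may still be lagging, so it is
// dropped instead of being shown as a misleading zero.
function completeDays(buckets: DailyDownloads[], now: Date): DailyDownloads[] {
  const today = now.toISOString().slice(0, 10);
  return buckets
    .filter((bucket) => bucket.day < today) // ISO dates compare correctly as strings
    .sort((a, b) => b.day.localeCompare(a.day));
}
```

Sorting newest first matches the dashboard's stated row order. A real collector might need to hold back more than one day if the public counts lag further.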
44
+
45
+ ## Credentials
46
+
47
+ GitHub collection uses either:
48
+
49
+ ```bash
50
+ gh auth login
51
+ ```
52
+
53
+ or:
54
+
55
+ ```bash
56
+ export GH_TOKEN=...
57
+ ```
58
+
59
+ Telemetry collection uses the same Wrangler configuration as the existing telemetry export script. If the config is not auto-discovered, set:
60
+
61
+ ```bash
62
+ export MCP_OBSERVATORY_TELEMETRY_WRANGLER_CONFIG=/path/to/wrangler.toml
63
+ export MCP_OBSERVATORY_TELEMETRY_D1_DATABASE=mcp-observatory-telemetry
64
+ ```
65
+
66
+ No npm token is required for public download counts.
67
+
68
+ ## Commands
69
+
70
+ ```bash
71
+ npm run metrics:collect # collect telemetry, GitHub, and npm into SQLite
72
+ npm run metrics:build # render HTML from the existing SQLite database
73
+ npm run metrics:refresh # collect + build
74
+ npm run metrics:open # open the static read-only dashboard
75
+ npm run metrics:serve # open the local dashboard with an Update Data button
76
+ ```
77
+
78
+ The **Update Data** button is available in `metrics:serve` mode. A static `file://` dashboard cannot run local commands from the browser, so `metrics:open` shows the same data but disables the button.
79
+
80
+ For tests or offline recovery, you can seed telemetry from an existing export:
81
+
82
+ ```bash
83
+ npm run metrics:refresh -- --telemetry-input telemetry-exports/events-flat-full.json
84
+ ```
85
+
86
+ ## Optional Hourly Refresh
87
+
88
+ Generate a small local refresh wrapper:
89
+
90
+ ```bash
91
+ npx tsx scripts/metrics-dashboard.ts scheduler
92
+ ```
93
+
94
+ The generated script lives under `.mcp-observatory-metrics/` and can be called by `launchd` or cron. It uses the same refresh command and writes logs to `.mcp-observatory-metrics/logs/refresh.log`.
95
+
96
+ ## Privacy
97
+
98
+ Raw telemetry remains available in the local SQLite database for account intelligence and product analytics. The dashboard view uses sanitized aggregates by default:
99
+
100
+ - company domains, not raw emails
101
+ - aggregate source and command counts, not private command bodies
102
+ - GitHub and npm public metrics
103
+ - collection errors and freshness status
104
+
105
+ Do not commit `.mcp-observatory-metrics/`, telemetry exports, private reports, or screenshots that reveal raw data.
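The sanitized view described in the Privacy section above could be produced along these lines. This is a sketch under stated assumptions: `RawEvent`, `toDomain`, `toCommandName`, and `sanitize` are hypothetical names, and the real telemetry schema and sanitizer are not shown in this diff.

```typescript
// Hypothetical raw event shape; the actual telemetry schema may differ.
interface RawEvent {
  email?: string;
  command: string;
}

interface SanitizedSummary {
  domains: Map<string, number>;
  commands: Map<string, number>;
}

// Reduce an email to its lowercase domain, or null when it is not usable.
function toDomain(email: string | undefined): string | null {
  if (!email) return null;
  const at = email.lastIndexOf("@");
  if (at < 1 || at === email.length - 1) return null;
  return email.slice(at + 1).toLowerCase();
}

// Keep only the first token so arguments and command bodies never reach the view.
function toCommandName(command: string): string {
  const name = command.trim().split(/\s+/)[0] ?? "";
  return name === "" ? "unknown" : name;
}

// Count events per company domain and per command name.
function sanitize(events: RawEvent[]): SanitizedSummary {
  const summary: SanitizedSummary = { domains: new Map(), commands: new Map() };
  for (const event of events) {
    const domain = toDomain(event.email);
    if (domain !== null) {
      summary.domains.set(domain, (summary.domains.get(domain) ?? 0) + 1);
    }
    const name = toCommandName(event.command);
    summary.commands.set(name, (summary.commands.get(name) ?? 0) + 1);
  }
  return summary;
}
```

Only the aggregate maps would be handed to the HTML renderer. The raw events stay in the local SQLite database.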
package/docs/paid-pilot-offer.md CHANGED
@@ -17,11 +17,13 @@ This is a manual pilot, not a self-serve SaaS promise.
17
17
 
18
18
  ## What The Pilot Includes
19
19
 
20
- - review of the customer’s MCP config, repo, or startup commands
21
- - MCP Observatory CI rollout for selected servers
20
+ - MCP server inventory across selected repos, configs, or agent environments
21
+ - reproducible test artifacts for each reviewed server
22
22
  - private readiness report covering startup, capabilities, schema quality, security findings, and drift risk
23
- - MCP lock-file setup for contract drift review
24
- - prioritized remediation notes
23
+ - schema/tool drift baseline using MCP lock files
24
+ - MCP Observatory CI rollout plan for selected servers
25
+ - executive summary with “safe for agent dependency” verdicts
26
+ - prioritized remediation notes and owner-ready next steps
25
27
  - optional certification language for servers that pass agreed checks
26
28
 
27
29
  ## Starting Prices
@@ -55,4 +57,18 @@ William
55
57
 
56
58
  ## Delivery Shape
57
59
 
58
- Start with static reports and CI setup. Do not build a dashboard until paid pilot feedback proves exactly what buyers need.
60
+ Start with static reports and CI setup. The first deliverable should look like an internal security/readiness packet, not a SaaS login.
61
+
62
+ ## Evidence Standard
63
+
64
+ The public [Safety Methodology](./methodology.md) and [MCP Server Safety Index](./mcp-server-safety-index.md) are the template for private work:
65
+
66
+ - command/config used
67
+ - date and tool version
68
+ - JSON artifact
69
+ - Markdown or HTML report
70
+ - failure class
71
+ - verdict
72
+ - reproduction notes
73
+
74
+ Private pilots can include customer-specific details, but public/customer-facing summaries should use sanitized evidence unless the customer approves otherwise.
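One way to keep the Evidence Standard list above checkable is to model it as a typed record. The sketch below is illustrative only: the `EvidenceRecord` field names and the `missingFields` helper are assumptions that mirror the list, not a schema published by the package.

```typescript
// Hypothetical record mirroring the Evidence Standard list.
interface EvidenceRecord {
  command: string;             // command/config used
  date: string;                // date of the run
  toolVersion: string;         // tool version
  jsonArtifact: string;        // path to the JSON artifact
  report: string;              // path to the Markdown or HTML report
  failureClass: string | null; // null when the run had no failure
  verdict: string;
  reproductionNotes: string;
}

const REQUIRED_FIELDS: (keyof EvidenceRecord)[] = [
  "command",
  "date",
  "toolVersion",
  "jsonArtifact",
  "report",
  "failureClass",
  "verdict",
  "reproductionNotes",
];

// List the fields that are absent or blank in a draft record.
// failureClass may be null (no failure) but must be stated explicitly.
function missingFields(record: Partial<EvidenceRecord>): string[] {
  return REQUIRED_FIELDS.filter((field) => {
    const value = record[field];
    if (field === "failureClass") return value === undefined;
    return typeof value !== "string" || value.trim() === "";
  });
}
```

A report generator could refuse to publish any record for which `missingFields` returns a non-empty list.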
package/docs/project-case-study.md CHANGED
@@ -57,14 +57,15 @@ For deeper context, see the [MCP Server Security Field Guide](./mcp-security-fie
57
57
 
58
58
  Telemetry is used privately to understand product usage and identify account-level signals without publishing raw personal data.
59
59
 
60
- As of the latest local export on June 20, 2026:
60
+ As of the latest local export on June 21, 2026:
61
61
 
62
- - 10,918 telemetry events
63
- - 7,380 total sessions
64
- - 5,379 external sessions after separating internal activity
62
+ - 11,481 telemetry events
63
+ - 7,571 total sessions
64
+ - 5,389 external sessions after separating internal activity
65
65
  - 2,446 external CI sessions
66
- - 138 attributed company/org sessions
67
- - 11 attributed company/org candidates
66
+ - 148 attributed company/org sessions
67
+ - 12 attributed company/org candidates
68
+ - latest external activity: June 21, 2026
68
69
 
69
70
  Public claims use aggregate or sanitized data only. Raw emails, hostnames, private URLs, tokens, and response bodies are not published.
70
71
 
@@ -77,9 +78,12 @@ Current public distribution proof includes:
77
78
  - latest release: `v0.23.0`
78
79
  - npm package: `@kryptosai/mcp-observatory`
79
80
  - GitHub Action: `KryptosAI/mcp-observatory/action@main`
80
- - visible GitHub traffic window: 721 clones and 221 unique cloners
81
+ - npm downloads snapshot: 511 downloads for June 11-20, 2026
82
+ - visible GitHub traffic window: 745 clones and 232 unique cloners
83
+ - visible GitHub page-view window: 12 views and 9 unique visitors
84
+ - public code-search references in MCP indexes, listing mirrors, and external experiment repos
81
85
  - official MCP reference PR open and green: [`modelcontextprotocol/servers#4392`](https://github.com/modelcontextprotocol/servers/pull/4392)
82
- - open certification PRs for Microsoft Playwright MCP, Upstash Context7, ExecuteAutomation Playwright MCP, and other MCP projects
86
+ - open certification PRs for Microsoft Playwright MCP, Upstash Context7, ExecuteAutomation Playwright MCP, AntV, BrowserMCP, UI5, Notion, and other MCP projects
83
87
 
84
88
  See [reference evaluations](./reference-evaluations.md) and [public proof](./proof.md).
85
89
 
package/docs/proof.md CHANGED
@@ -5,13 +5,15 @@ MCP Observatory is early, but it is already a working MCP testing/security stack
5
5
  ## Current Public Surface
6
6
 
7
7
  - npm package: `@kryptosai/mcp-observatory`
8
- - GitHub Action: `KryptosAI/mcp-observatory/action@main`
8
+ - GitHub Action: `KryptosAI/mcp-observatory/action@v0.24.0`
9
9
  - Latest release: `v0.23.0`
10
10
  - CLI command count: scan, test, record, replay, verify, diff, watch, suggest, serve, lock, history, init-ci, ci-report, enterprise-report, score, badge, cloud
11
11
  - Test suite: 334 passing tests across 43 test files as of June 20, 2026
12
- - GitHub traffic snapshot: 721 clones and 221 unique cloners in the visible June 2026 traffic window
13
- - npm downloads snapshot: 104 downloads for June 11-17, 2026
12
+ - GitHub traffic snapshot: 745 clones and 232 unique cloners in the visible June 2026 traffic window
13
+ - GitHub page views snapshot: 12 views and 9 unique visitors in the visible June 2026 traffic window
14
+ - npm downloads snapshot: 511 downloads for June 11-20, 2026
14
15
  - Security guide: [MCP Server Security Field Guide](./mcp-security-field-guide.md)
16
+ - Safety methodology: [MCP Observatory Safety Methodology](./methodology.md)
15
17
  - Safety index: [MCP Server Safety Index](./mcp-server-safety-index.md)
16
18
  - Public examples: [Reference Evaluations](./reference-evaluations.md)
17
19
  - Lock-file CI primitive: [MCP Lock Files](./mcp-lock-files.md)
@@ -22,14 +24,15 @@ MCP Observatory is early, but it is already a working MCP testing/security stack
22
24
 
23
25
  Internal telemetry is used for product analytics and account-level outreach. Public reporting uses only aggregate or sanitized data.
24
26
 
25
- As of the latest local export on June 20, 2026:
27
+ As of the latest local export on June 21, 2026:
26
28
 
27
- - 10,918 telemetry events
28
- - 7,380 total sessions
29
- - 5,379 external sessions after separating internal/personal activity
29
+ - 11,481 telemetry events
30
+ - 7,571 total sessions
31
+ - 5,389 external sessions after separating internal/personal activity
30
32
  - 2,446 external CI sessions
31
- - 138 attributed company/org sessions
32
- - 11 attributed company/org candidates
33
+ - 148 attributed company/org sessions
34
+ - 12 attributed company/org candidates
35
+ - latest external activity: June 21, 2026
33
36
  - top external commands: `serve`, `run`, `diff`, `test`, `scan`, `history`
34
37
 
35
38
  Raw emails, hostnames, private URLs, tokens, and response bodies are not published.
@@ -54,20 +57,30 @@ MCP Observatory can:
54
57
 
55
58
  The certification campaign is designed to create public proof through accepted maintainer PRs.
56
59
 
57
- Accepted third-party integrations will be tracked here:
60
+ Open and accepted third-party integrations are tracked here:
58
61
 
59
62
  | Repo | PR | Check Added | Badge Added | Status |
60
63
  | --- | --- | --- | --- | --- |
61
64
  | `modelcontextprotocol/servers` | [#4392](https://github.com/modelcontextprotocol/servers/pull/4392) | Yes | No | Open, mergeable, MCP Observatory check passing |
62
- | `microsoft/playwright-mcp` | [#1657](https://github.com/microsoft/playwright-mcp/pull/1657) | Yes | No | Open |
63
- | `upstash/context7` | [#2800](https://github.com/upstash/context7/pull/2800) | Yes | No | Open |
65
+ | `microsoft/playwright-mcp` | [#1657](https://github.com/microsoft/playwright-mcp/pull/1657) | Yes | No | Closed, unmerged |
66
+ | `upstash/context7` | [#2800](https://github.com/upstash/context7/pull/2800) | Yes | No | Closed, maintainer declined third-party CI |
64
67
  | `executeautomation/mcp-playwright` | [#225](https://github.com/executeautomation/mcp-playwright/pull/225) | Yes | No | Open |
65
68
  | `kazuph/mcp-taskmanager` | [#11](https://github.com/kazuph/mcp-taskmanager/pull/11) | Yes | No | Open |
66
- | `cyanheads/filesystem-mcp-server` | [#19](https://github.com/cyanheads/filesystem-mcp-server/pull/19) | Yes | No | Open |
69
+ | `cyanheads/filesystem-mcp-server` | [#19](https://github.com/cyanheads/filesystem-mcp-server/pull/19) | Yes | No | Closed, unmerged |
67
70
  | `antvis/mcp-server-chart` | [#312](https://github.com/antvis/mcp-server-chart/pull/312) | Yes | No | Open |
68
71
  | `BrowserMCP/mcp` | [#189](https://github.com/BrowserMCP/mcp/pull/189) | Yes | No | Open |
69
- | `UI5/mcp-server` | [#348](https://github.com/UI5/mcp-server/pull/348) | Yes | No | Open |
70
- | `makenotion/notion-mcp-server` | [#324](https://github.com/makenotion/notion-mcp-server/pull/324) | Yes | No | Open |
72
+ | `UI5/mcp-server` | [#348](https://github.com/UI5/mcp-server/pull/348) | Yes | No | Closed, maintainer declined third-party CI |
73
+ | `makenotion/notion-mcp-server` | [#324](https://github.com/makenotion/notion-mcp-server/pull/324) | Yes | No | Closed after policy-style CI failure |
74
+
75
+ ## Public Discovery Snapshot
76
+
77
+ GitHub code search shows public references outside the main repo. These are discovery/listing signals, not customer claims:
78
+
79
+ - `punkpeye/awesome-mcp-devtools` lists MCP Observatory in an MCP developer-tools index.
80
+ - `linny006/mcp-servers-live` mirrors a public MCP Observatory listing page.
81
+ - `gabrielmoreira/awesome-ai-rabbit-holes` catalogs the GitHub project.
82
+ - `fmfg03/supermcp` includes an `apps/mcp-observatory` package path.
83
+ - `vellankikoti/mcp-observatory`, `LuKrlier/mcp-observatory`, and `shigeki7777/sasame-mcp-observatory` appear as separate public repos referencing or experimenting with the Observatory name/code surface.
71
84
 
72
85
  ## Commercial Proof
73
86
 
package/docs/public-post-drafts.md CHANGED
@@ -2,12 +2,22 @@
2
2
 
3
3
  Use these as launch posts, GitHub Discussion posts, LinkedIn posts, or short blog drafts. The framing is about MCP safety patterns, not “look at my tool.”
4
4
 
5
- ## 1. I Tested 20 MCP Servers. The Pattern Was Not “Bad Servers”; It Was Missing Gates.
5
+ ## Flagship Post: I Tested Popular MCP Servers. The Failure Pattern Was Not What I Expected.
6
6
 
7
7
  MCP servers are becoming production dependencies for agents, but many of them still ship without the kind of CI gate we expect from normal software dependencies.
8
8
 
9
9
  The main pattern I saw while building the first MCP Server Safety Index was simple: the risky part is rarely that a server exists. The risky part is that agents may depend on a tool surface nobody is testing for startup reliability, schema quality, security posture, or drift.
10
10
 
11
+ The industry does not need another vibes-based directory. It needs reproducible readiness evidence:
12
+
13
+ - exact command/config
14
+ - date and package version where available
15
+ - JSON artifact
16
+ - Markdown report
17
+ - verdict
18
+ - failure class
19
+ - reproduction notes
20
+
11
21
  The checks that matter most:
12
22
 
13
23
  - does the server start cleanly in CI?
@@ -18,7 +28,9 @@ The checks that matter most:
18
28
 
19
29
  My takeaway: MCP needs a package-lock moment. Commit the agent-facing contract, then make drift visible before agents depend on it.
20
30
 
21
- ## 2. Browser MCP Servers Need A Different Security Bar
31
+ I am publishing the Safety Methodology and the first MCP Server Safety Index as a small evidence standard, not a leaderboard. If your team is putting MCP into private or production agent workflows, I am doing a small number of private MCP readiness reviews: inventory, CI rollout, schema/tool drift baseline, security findings, and safe-for-agent-dependency verdicts.
32
+
33
+ ## Supporting Angle: Browser MCP Servers Need A Different Security Bar
22
34
 
23
35
  Browser automation MCP servers are powerful because agents can navigate pages, click, type, inspect state, and sometimes execute scripts.
24
36
 
@@ -32,9 +44,9 @@ For browser MCP servers, a useful review should separate:
32
44
  - network/navigation controls
33
45
  - tool schemas that are too broad for safe agent planning
34
46
 
35
- The goal is not to block browser MCP. The goal is to make the trust boundary visible before an agent gets a browser with hands.
47
+ The goal is not to block browser MCP. The goal is to make the trust boundary visible before an agent gets browser-control powers.
36
48
 
37
- ## 3. Filesystem MCP Servers Should Always Test In A Sandbox
49
+ ## Supporting Angle: Filesystem MCP Servers Should Always Test In A Sandbox
38
50
 
39
51
  Filesystem MCP servers are one of the clearest examples of why MCP CI needs context.
40
52
 
@@ -50,7 +62,7 @@ The minimum safety pattern:
50
62
 
51
63
  Agents need tools. They do not need accidental access to everything.
52
64
 
53
- ## 4. Token-Backed SaaS MCP Servers Need Issue-First Certification
65
+ ## Supporting Angle: Token-Backed SaaS MCP Servers Need Issue-First Certification
54
66
 
55
67
  Many SaaS, cloud, payments, database, and developer-platform MCP servers cannot be safely checked with a drive-by PR because meaningful startup requires tokens or live services.
56
68
 
@@ -68,7 +80,7 @@ Once maintainers provide a token-safe target config, the useful checks are:
68
80
 
69
81
  Security adoption works better when it starts by respecting maintainer context.
70
82
 
71
- ## 5. MCP Drift Is An AI Supply Chain Problem
83
+ ## Supporting Angle: MCP Drift Is An AI Supply Chain Problem
72
84
 
73
85
  When a package dependency changes, teams have lock files, diffs, review, and release notes.
74
86
 
package/docs/publish-readiness.md CHANGED
@@ -26,11 +26,7 @@ Confirm:
26
26
 
27
27
  Known audit note:
28
28
 
29
- - `npm audit` may report `undici <=6.26.0` through the `npm@11.17.0` package bundled under `@semantic-release/npm`. As of June 20, 2026, `npm audit fix` cannot update this bundled copy and `npm@11.17.0` is the current published npm package. The remaining vulnerable `undici` copy is release tooling only and is not part of MCP Observatory runtime dependencies or the packed npm artifact. Recheck after npm publishes a newer package.
30
-
31
- Known audit note:
32
-
33
- - `npm audit` may report `undici <=6.26.0` through the `npm@11.17.0` package bundled under `@semantic-release/npm`. `npm audit fix` updates the fixable `@actions/http-client` path, but the remaining `undici` copy is bundled inside npm release tooling and is not part of MCP Observatory runtime dependencies or the packed npm artifact. Recheck after npm publishes a newer package.
29
+ - Release automation runs `semantic-release` ephemerally in GitHub Actions instead of installing it into the repository lockfile. This keeps release-only bundled dependencies out of the default-branch audit surface and out of the packed CLI artifact.
34
30
 
35
31
  ## Public Distribution
36
32
 
package/docs/reference-evaluations.md CHANGED
@@ -70,7 +70,7 @@ Representative public category: filesystem-backed MCP servers.
70
70
 
71
71
  Public proof:
72
72
 
73
- - PR: [`cyanheads/filesystem-mcp-server#19`](https://github.com/cyanheads/filesystem-mcp-server/pull/19)
73
+ - PR: [`cyanheads/filesystem-mcp-server#19`](https://github.com/cyanheads/filesystem-mcp-server/pull/19), closed unmerged by maintainer
74
74
 
75
75
  What this represents:
76
76