@kryptosai/mcp-observatory 0.22.0 → 0.23.0

package/docs/proof.md CHANGED
@@ -6,22 +6,30 @@ MCP Observatory is early, but it is already a working MCP testing/security stack
 
  - npm package: `@kryptosai/mcp-observatory`
  - GitHub Action: `KryptosAI/mcp-observatory/action@main`
+ - Latest release: `v0.23.0`
  - CLI command count: scan, test, record, replay, verify, diff, watch, suggest, serve, lock, history, init-ci, ci-report, enterprise-report, score, badge, cloud
- - Test suite: 321 passing tests across 40 test files as of June 19, 2026
- - GitHub traffic snapshot: 582 clones and 175 unique cloners in the visible June 2026 traffic window
+ - Test suite: 334 passing tests across 43 test files as of June 20, 2026
+ - GitHub traffic snapshot: 721 clones and 221 unique cloners in the visible June 2026 traffic window
  - npm downloads snapshot: 104 downloads for June 11-17, 2026
+ - Security guide: [MCP Server Security Field Guide](./mcp-security-field-guide.md)
+ - Safety index: [MCP Server Safety Index](./mcp-server-safety-index.md)
+ - Public examples: [Reference Evaluations](./reference-evaluations.md)
+ - Lock-file CI primitive: [MCP Lock Files](./mcp-lock-files.md)
+ - Public post drafts: [Launch Post Drafts](./public-post-drafts.md)
+ - Pilot offer: [Private MCP Readiness Review](./paid-pilot-offer.md)
 
  ## Safe Aggregate Telemetry Snapshot
 
  Internal telemetry is used for product analytics and account-level outreach. Public reporting uses only aggregate or sanitized data.
 
- As of the latest local export on June 19, 2026:
+ As of the latest local export on June 20, 2026:
 
- - 10,278 telemetry events
- - 7,211 unique sessions
- - 5,368 external sessions after separating internal/personal activity
- - 2,434 external CI sessions
- - 128 attributed company/org sessions
+ - 10,918 telemetry events
+ - 7,380 total sessions
+ - 5,379 external sessions after separating internal/personal activity
+ - 2,446 external CI sessions
+ - 138 attributed company/org sessions
+ - 11 attributed company/org candidates
  - top external commands: `serve`, `run`, `diff`, `test`, `scan`, `history`
 
  Raw emails, hostnames, private URLs, tokens, and response bodies are not published.
@@ -50,7 +58,16 @@ Accepted third-party integrations will be tracked here:
 
  | Repo | PR | Check Added | Badge Added | Status |
  | --- | --- | --- | --- | --- |
- | _pending_ | | | | |
+ | `modelcontextprotocol/servers` | [#4392](https://github.com/modelcontextprotocol/servers/pull/4392) | Yes | No | Open, mergeable, MCP Observatory check passing |
+ | `microsoft/playwright-mcp` | [#1657](https://github.com/microsoft/playwright-mcp/pull/1657) | Yes | No | Open |
+ | `upstash/context7` | [#2800](https://github.com/upstash/context7/pull/2800) | Yes | No | Open |
+ | `executeautomation/mcp-playwright` | [#225](https://github.com/executeautomation/mcp-playwright/pull/225) | Yes | No | Open |
+ | `kazuph/mcp-taskmanager` | [#11](https://github.com/kazuph/mcp-taskmanager/pull/11) | Yes | No | Open |
+ | `cyanheads/filesystem-mcp-server` | [#19](https://github.com/cyanheads/filesystem-mcp-server/pull/19) | Yes | No | Open |
+ | `antvis/mcp-server-chart` | [#312](https://github.com/antvis/mcp-server-chart/pull/312) | Yes | No | Open |
+ | `BrowserMCP/mcp` | [#189](https://github.com/BrowserMCP/mcp/pull/189) | Yes | No | Open |
+ | `UI5/mcp-server` | [#348](https://github.com/UI5/mcp-server/pull/348) | Yes | No | Open |
+ | `makenotion/notion-mcp-server` | [#324](https://github.com/makenotion/notion-mcp-server/pull/324) | Yes | No | Open |
 
  ## Commercial Proof
 
@@ -0,0 +1,86 @@
+ # Public Post Drafts
+
+ Use these as launch posts, GitHub Discussion posts, LinkedIn posts, or short blog drafts. The framing is about MCP safety patterns, not “look at my tool.”
+
+ ## 1. I Tested 20 MCP Servers. The Pattern Was Not “Bad Servers”; It Was Missing Gates.
+
+ MCP servers are becoming production dependencies for agents, but many of them still ship without the kind of CI gate we expect from normal software dependencies.
+
+ The main pattern I saw while building the first MCP Server Safety Index was simple: the risky part is rarely that a server exists. The risky part is that agents may depend on a tool surface nobody is testing for startup reliability, schema quality, security posture, or drift.
+
+ The checks that matter most:
+
+ - does the server start cleanly in CI?
+ - do tools, prompts, and resources respond as advertised?
+ - are tool schemas precise enough for agents to call safely?
+ - did a release add, remove, or broaden a tool?
+ - are destructive tools clearly identifiable?
+
+ My takeaway: MCP needs a package-lock moment. Commit the agent-facing contract, then make drift visible before agents depend on it.
+
+ ## 2. Browser MCP Servers Need A Different Security Bar
+
+ Browser automation MCP servers are powerful because agents can navigate pages, click, type, inspect state, and sometimes execute scripts.
+
+ That is exactly why they need explicit CI and security gates.
+
+ For browser MCP servers, a useful review should separate:
+
+ - harmless inventory checks
+ - state-mutating browser actions
+ - code execution or page-evaluation tools
+ - network/navigation controls
+ - tool schemas that are too broad for safe agent planning
+
+ The goal is not to block browser MCP. The goal is to make the trust boundary visible before an agent gets a browser with hands.
+
+ ## 3. Filesystem MCP Servers Should Always Test In A Sandbox
+
+ Filesystem MCP servers are one of the clearest examples of why MCP CI needs context.
+
+ A server can be useful and still dangerous if the test command points at the wrong directory, if read/write boundaries are unclear, or if a tool schema makes broad path access look harmless.
+
+ The minimum safety pattern:
+
+ - run CI against a temporary harmless directory
+ - verify tools/resources respond as advertised
+ - flag broad filesystem access
+ - document which operations are read-only vs write-capable
+ - treat changes to path schemas as contract drift
+
+ Agents need tools. They do not need accidental access to everything.
+
+ ## 4. Token-Backed SaaS MCP Servers Need Issue-First Certification
+
+ Many SaaS, cloud, payments, database, and developer-platform MCP servers cannot be safely checked with a drive-by PR because meaningful startup requires tokens or live services.
+
+ For those repos, the right move is usually not a workflow PR first. It is an issue or maintainer question:
+
+ “What is the safest CI startup command for this server?”
+
+ Once maintainers provide a token-safe target config, the useful checks are:
+
+ - does startup fail cleanly without credentials?
+ - are auth requirements documented?
+ - are destructive tools obvious?
+ - are schemas narrow enough for agent use?
+ - can the repo publish a safe compatibility/security badge?
+
+ Security adoption works better when it starts by respecting maintainer context.
+
+ ## 5. MCP Drift Is An AI Supply Chain Problem
+
+ When a package dependency changes, teams have lock files, diffs, review, and release notes.
+
+ When an MCP server changes its tool surface, an agent dependency changed too.
+
+ That means tool additions, tool removals, schema broadening, new write actions, and prompt/resource changes should be visible in pull requests.
+
+ The useful primitive is an MCP lock file:
+
+ ```bash
+ npx @kryptosai/mcp-observatory lock
+ npx @kryptosai/mcp-observatory lock verify
+ ```
+
+ The point is not bureaucracy. It is to make the agent-facing contract reviewable before production workflows quietly depend on something new.
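A minimal sketch of how the `lock verify` command shown in the draft might be wrapped as a CI gate. It assumes, without confirmation from the package docs, that `lock verify` exits non-zero when the committed contract has drifted; `run_lock_gate` is a hypothetical helper name, not part of MCP Observatory.

```bash
#!/usr/bin/env bash
# Sketch of a CI drift gate.
# Assumption: `lock verify` exits non-zero on drift (not confirmed by the source).
run_lock_gate() {
  # "$@" is the verify command; default to the command from the draft.
  if [ "$#" -eq 0 ]; then
    set -- npx @kryptosai/mcp-observatory lock verify
  fi
  if "$@"; then
    echo "MCP lock verified: no contract drift detected"
  else
    echo "MCP lock verification failed: review tool/schema changes before merging" >&2
    return 1
  fi
}
```

Calling `run_lock_gate` with no arguments in a CI step would run the documented command; passing a different command makes the gate easy to exercise locally without network access.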
@@ -22,6 +22,11 @@ Confirm:
  - HTTP target examples use env references instead of inline tokens.
  - Security findings appear in artifact evidence as structured `findings`.
  - Hosted upload is available through `mcp-observatory cloud upload <artifact>` when `MCP_OBSERVATORY_CLOUD_TOKEN` is set.
+ - Hosted HTTP scans require `Authorization: Bearer <HOSTED_SCAN_TOKEN>` and are treated as an authenticated pilot surface.
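The hosted-upload item in this checklist could be guarded in a release script roughly as follows. This is a sketch under stated assumptions: `upload_if_configured` is a hypothetical helper, the artifact path is whatever the caller passes, and only the `mcp-observatory cloud upload <artifact>` form and the `MCP_OBSERVATORY_CLOUD_TOKEN` variable come from the checklist itself.

```bash
#!/usr/bin/env bash
# Sketch: upload an artifact only when the cloud token is present.
upload_if_configured() {
  local artifact="$1"
  if [ -z "${MCP_OBSERVATORY_CLOUD_TOKEN:-}" ]; then
    # No token: skip quietly so local and fork builds do not fail.
    echo "MCP_OBSERVATORY_CLOUD_TOKEN not set; skipping hosted upload"
    return 0
  fi
  mcp-observatory cloud upload "$artifact"
}
```

Skipping instead of failing keeps the token out of required configuration for contributors who only run the local CLI.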
+
+ Known audit note:
+
+ - `npm audit` may report `undici <=6.26.0` through the `npm@11.17.0` package bundled under `@semantic-release/npm`. As of June 20, 2026, `npm audit fix` cannot update this bundled copy and `npm@11.17.0` is the current published npm package. The remaining vulnerable `undici` copy is release tooling only and is not part of MCP Observatory runtime dependencies or the packed npm artifact. Recheck after npm publishes a newer package.
 
  Known audit note:
 
@@ -32,10 +37,10 @@ Known audit note:
  - Merge the health/commercialization PR.
  - Update the GitHub repo homepage to the README or commercial page.
  - Publish npm only after the release gate is green.
- - Refresh MCP directory listings with: “MCP Observatory helps teams test, secure, and monitor MCP servers before agents depend on them.”
- - Include “free for local OSS use; paid for hosted reporting, private repo CI, security reports, production monitoring, certification, support, and fleet visibility.”
+ - Refresh MCP directory listings with: “MCP Observatory is the CI and security gate for MCP servers before agents depend on them.”
+ - Include “free for local OSS use; paid for hosted reporting, private repo CI, recurring security reports, certification, support, and fleet visibility.”
  - Link production users to `COMMERCIAL.md` and `william@banksey.com`.
- - Submit or refresh listings on Glama, PulseMCP, Smithery, and relevant awesome-MCP lists with the tags: security, developer tools, CI/CD, testing, observability, schema drift.
+ - Submit or refresh listings on Glama, PulseMCP, Smithery, and relevant awesome-MCP lists with the tags: security, developer tools, CI/CD, testing, MCP security, schema drift.
  - Use the certification distribution loop to open helpful PRs against popular MCP server repos and convert accepted PRs into proof points.
  - Link public proof, the safety report, and directory listing copy from launch/outreach materials.
 
@@ -67,6 +72,7 @@ Worker:
 
  - `POST /api/v1/artifacts` stores a run artifact behind bearer-token auth.
  - `GET /api/v1/artifacts/:org` returns the org artifact index behind the same auth.
+ - `POST /api/v1/scan` requires `Authorization: Bearer <HOSTED_SCAN_TOKEN>`.
  - Hosted scans reject localhost/private-network targets; use local CLI for internal MCP servers.
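Because hosted scans reject localhost and private-network targets, a wrapper script could decide up front whether a target belongs with the hosted scan or the local CLI. The sketch below is a hypothetical pre-check, not the Worker's actual validation logic: `is_private_target` is an illustrative name, and it only inspects `localhost` names, RFC 1918 and loopback/link-local IPv4 literals, and `::1`, without resolving DNS.

```bash
#!/usr/bin/env bash
# Hypothetical local pre-check; the hosted service's own rules may differ.
is_private_target() {
  local host="${1#*://}"   # strip scheme
  host="${host%%/*}"       # strip path
  host="${host##*@}"       # strip userinfo
  if [[ "$host" == \[* ]]; then
    host="${host#\[}"      # bracketed IPv6 literal: keep what is inside [...]
    host="${host%%\]*}"
  else
    host="${host%%:*}"     # strip port
  fi
  case "$host" in
    localhost|*.localhost|::1) return 0 ;;
  esac
  # Loopback, 10/8, 192.168/16, 169.254/16, and 172.16/12 IPv4 literals.
  local re='^(127|10)\.[0-9]+\.[0-9]+\.[0-9]+$|^192\.168\.[0-9]+\.[0-9]+$|^169\.254\.[0-9]+\.[0-9]+$|^172\.(1[6-9]|2[0-9]|3[01])\.[0-9]+\.[0-9]+$'
  [[ "$host" =~ $re ]]
}
```

A caller might run the local CLI when `is_private_target "$url"` succeeds and only send public targets to the hosted endpoint.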
 
  ## What Not To Do Yet
@@ -0,0 +1,134 @@
+ # MCP Observatory Reference Evaluations
+
+ Reference evaluations show how MCP Observatory applies to common MCP server categories. These are public, safe examples intended to help maintainers and security reviewers understand what the tool checks and what kind of risk each category can expose.
+
+ The examples below are not customer claims. They are public evaluation targets, public pull requests, or category examples that can be reproduced with the CLI.
+
+ ## Official MCP Reference Servers
+
+ Representative repo: [`modelcontextprotocol/servers`](https://github.com/modelcontextprotocol/servers)
+
+ Public proof:
+
+ - PR: [`modelcontextprotocol/servers#4392`](https://github.com/modelcontextprotocol/servers/pull/4392)
+ - Status: open, mergeable, with a passing MCP Observatory check as of June 19, 2026
+
+ What this represents:
+
+ - reference MCP implementations
+ - simple tools that should behave predictably in CI
+ - a good baseline for model context protocol testing
+
+ What Observatory checks:
+
+ - server startup in GitHub Actions
+ - tools list/respond correctly
+ - schema quality and security scan output
+ - report generation for maintainers
+
+ Adoption command:
+
+ ```bash
+ npx @kryptosai/mcp-observatory init-ci --all --command "npx -y @modelcontextprotocol/server-sequential-thinking"
+ ```
+
+ ## Browser Automation MCP Servers
+
+ Representative public examples:
+
+ - [`microsoft/playwright-mcp`](https://github.com/microsoft/playwright-mcp)
+ - [`executeautomation/mcp-playwright`](https://github.com/executeautomation/mcp-playwright)
+
+ Public proof:
+
+ - PR: [`microsoft/playwright-mcp#1657`](https://github.com/microsoft/playwright-mcp/pull/1657)
+ - PR: [`executeautomation/mcp-playwright#225`](https://github.com/executeautomation/mcp-playwright/pull/225)
+
+ What this represents:
+
+ - high-capability browser tools
+ - agent access to pages, scripts, navigation, screenshots, and user-like actions
+ - a category where secure tool invocation and explicit trust boundaries matter
+
+ What Observatory checks:
+
+ - tool inventory
+ - schema quality
+ - risky browser/code-execution surfaces
+ - intentional suppressions for known acceptable findings
+ - whether deep invocation should be skipped for tools that can mutate browser state
+
+ Adoption command:
+
+ ```bash
+ npx @kryptosai/mcp-observatory test --security npx -y @playwright/mcp
+ ```
+
+ ## Filesystem MCP Servers
+
+ Representative public category: filesystem-backed MCP servers.
+
+ Public proof:
+
+ - PR: [`cyanheads/filesystem-mcp-server#19`](https://github.com/cyanheads/filesystem-mcp-server/pull/19)
+
+ What this represents:
+
+ - local file access exposed to agents
+ - read/write boundaries that should be explicit
+ - capability declarations that need to match observed MCP behavior
+
+ What Observatory checks:
+
+ - tools/resources capability consistency
+ - broad filesystem access findings
+ - schema quality for path-oriented tools
+ - safe sandbox target configuration for CI
+
+ Adoption command:
+
+ ```bash
+ npx @kryptosai/mcp-observatory test --security npx -y filesystem-mcp-server .
+ ```
+
+ Use a harmless temporary directory for CI checks when evaluating filesystem servers.
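One way to follow that advice is to create a throwaway directory per run and pass it in place of the `.` in the adoption command. The helper below is a sketch: `with_sandbox` is a hypothetical name, and it assumes the trailing argument of the documented command is the directory the filesystem server exposes.

```bash
#!/usr/bin/env bash
# Sketch: run a command against a temporary sandbox directory, then clean up.
with_sandbox() {
  local sandbox status=0
  sandbox="$(mktemp -d)" || return 1
  echo "fixture" > "$sandbox/README.txt"   # harmless sample file for read checks
  # The sandbox path is appended as the last argument of the given command.
  "$@" "$sandbox" || status=$?
  rm -rf "$sandbox"
  return "$status"
}
```

For example, `with_sandbox npx @kryptosai/mcp-observatory test --security npx -y filesystem-mcp-server` would run the documented check against the temporary directory instead of the repository root.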
+
+ ## Documentation And Search MCP Servers
+
+ Representative public example: [`upstash/context7`](https://github.com/upstash/context7)
+
+ Public proof:
+
+ - PR: [`upstash/context7#2800`](https://github.com/upstash/context7/pull/2800)
+
+ What this represents:
+
+ - documentation retrieval and search tools
+ - untrusted or fast-changing text entering an agent context
+ - a category where prompt-injection-aware review matters
+
+ What Observatory checks:
+
+ - tool inventory
+ - schema quality
+ - startup reliability
+ - security findings around broad retrieval or response behavior
+ - report artifacts that maintainers can review in pull requests
+
+ Adoption command:
+
+ ```bash
+ npx @kryptosai/mcp-observatory init-ci --all --command "npx -y @upstash/context7-mcp"
+ ```
+
+ ## How To Read These Evaluations
+
+ Passing an Observatory check means the server passed the configured compatibility and security checks for that run. It does not mean the server is universally safe for every environment.
+
+ Use the results as an engineering control:
+
+ - add CI for repeatability
+ - compare artifacts between releases
+ - review security findings and suppressions
+ - document accepted risk for broad tools
+ - escalate production/private usage to hosted reporting, certification, or fleet visibility when the server becomes operationally important
package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
  "name": "@kryptosai/mcp-observatory",
- "version": "0.22.0",
- "description": "Test, secure, and monitor MCP servers before agents depend on them.",
+ "version": "0.23.0",
+ "description": "The CI and security gate for MCP servers before agents depend on them.",
  "mcpName": "io.github.KryptosAI/mcp-observatory",
  "license": "MIT",
  "type": "module",
@@ -48,7 +48,7 @@
  "proof:refresh": "tsx scripts/refresh-proof-artifacts.ts",
  "release:prep": "node scripts/release.mjs",
  "certification:pr-body": "tsx scripts/print-certification-pr-body.ts",
- "typecheck": "tsc --noEmit -p tsconfig.json",
+ "typecheck": "tsc --noEmit -p tsconfig.json && npm --prefix api install && npm --prefix api run typecheck",
  "test": "vitest run",
  "validate:artifacts": "tsx scripts/validate-artifacts.ts",
  "verify:packed-install": "node scripts/verify-packed-install.mjs",
@@ -62,12 +62,13 @@
  "mcp-server",
  "model-context-protocol",
  "ai-agent",
+ "agent-security",
+ "ai-supply-chain",
  "ai-tools",
  "developer-tools",
  "cli",
  "regression-testing",
  "interoperability",
- "observability",
  "record",
  "replay",
  "cassette",
@@ -78,7 +79,7 @@
  "ci-cd",
  "mcp-ci",
  "github-action",
- "production-monitoring",
+ "mcp-lockfile",
  "enterprise",
  "enterprise-report",
  "feishu",