@kryptosai/mcp-observatory 0.22.0 → 0.23.0
- package/COMMERCIAL.md +5 -3
- package/PRIVACY.md +5 -2
- package/README.md +24 -10
- package/dist/src/cli.js +1 -1
- package/dist/src/cli.js.map +1 -1
- package/dist/src/commands/init-ci.js +5 -0
- package/dist/src/commands/init-ci.js.map +1 -1
- package/dist/src/commercial.js +2 -2
- package/dist/src/commercial.js.map +1 -1
- package/dist/src/score.js +1 -1
- package/dist/src/score.js.map +1 -1
- package/dist/src/validate.js +58 -3
- package/dist/src/validate.js.map +1 -1
- package/docs/certification-campaign-template.md +10 -10
- package/docs/certification-distribution.md +16 -0
- package/docs/directory-listing-copy.md +12 -5
- package/docs/distribution-launch.md +5 -5
- package/docs/enterprise-outreach-playbook.md +2 -2
- package/docs/mcp-lock-files.md +63 -0
- package/docs/mcp-safety-report-latest.md +10 -6
- package/docs/mcp-security-field-guide.md +97 -0
- package/docs/mcp-server-safety-index.md +85 -0
- package/docs/paid-pilot-offer.md +58 -0
- package/docs/project-case-study.md +73 -43
- package/docs/proof.md +26 -9
- package/docs/public-post-drafts.md +86 -0
- package/docs/publish-readiness.md +9 -3
- package/docs/reference-evaluations.md +134 -0
- package/package.json +6 -5
|
@@ -2,22 +2,24 @@
|
|
|
2
2
|
|
|
3
3
|
## Standard Positioning
|
|
4
4
|
|
|
5
|
-
MCP Observatory
|
|
5
|
+
MCP Observatory is the CI and security gate for MCP servers before agents depend on them.
|
|
6
6
|
|
|
7
7
|
## Short Description
|
|
8
8
|
|
|
9
|
-
CI, security checks, schema drift detection, reports, and badges for MCP servers.
|
|
9
|
+
CI, security checks, schema drift detection, lock files, reports, and badges for MCP servers.
|
|
10
10
|
|
|
11
11
|
## Medium Description
|
|
12
12
|
|
|
13
|
-
MCP Observatory is a CLI, GitHub Action, and MCP server for testing MCP servers before agents depend on them. It checks tools, prompts, resources, schema quality, security footguns, regressions, and drift, then generates reports and badges maintainers can share.
|
|
13
|
+
MCP Observatory is a CLI, GitHub Action, and MCP server for testing MCP servers before agents depend on them. It checks tools, prompts, resources, schema quality, security footguns, regressions, and drift, then generates lock files, reports, and badges maintainers can share.
|
|
14
14
|
|
|
15
15
|
## Long Description
|
|
16
16
|
|
|
17
|
-
MCP Observatory gives MCP servers production safety rails: one-command CI setup, compatibility checks, security analysis, schema drift detection, record/replay/verify workflows, PR comments, health score badges, and static enterprise reports. It can run as a CLI, inside GitHub Actions, or as an MCP server that lets agents inspect other MCP servers.
|
|
17
|
+
MCP Observatory gives MCP servers production safety rails: one-command CI setup, compatibility checks, security analysis, schema drift detection, lock-file verification, record/replay/verify workflows, PR comments, health score badges, and static enterprise reports. It can run as a CLI, inside GitHub Actions, or as an MCP server that lets agents inspect other MCP servers.
|
|
18
18
|
|
|
19
19
|
Free for local OSS use. Paid pilots are available for hosted reporting, private repo CI history, recurring security reports, certification, support, and fleet visibility.
|
|
20
20
|
|
|
21
|
+
For security and platform teams, see the MCP Server Security Field Guide and MCP Server Safety Index for agent security, AI supply chain security, and production MCP server review guidance.
|
|
22
|
+
|
|
21
23
|
## Primary CTA
|
|
22
24
|
|
|
23
25
|
Add MCP CI in one command:
|
|
@@ -49,7 +51,6 @@ npx @kryptosai/mcp-observatory init-ci --all --command "npx -y my-mcp-server"
|
|
|
49
51
|
- Developer Tools
|
|
50
52
|
- Testing
|
|
51
53
|
- CI/CD
|
|
52
|
-
- Observability
|
|
53
54
|
- Schema Drift
|
|
54
55
|
- Regression Testing
|
|
55
56
|
- AI Agents
|
|
@@ -66,6 +67,8 @@ npx @kryptosai/mcp-observatory init-ci --all --command "npx -y my-mcp-server"
|
|
|
66
67
|
- `github-action`
|
|
67
68
|
- `developer-tools`
|
|
68
69
|
- `security`
|
|
70
|
+
- `agent-security`
|
|
71
|
+
- `ai-supply-chain`
|
|
69
72
|
- `production-monitoring`
|
|
70
73
|
- `enterprise-report`
|
|
71
74
|
|
|
@@ -73,6 +76,10 @@ npx @kryptosai/mcp-observatory init-ci --all --command "npx -y my-mcp-server"
|
|
|
73
76
|
|
|
74
77
|
- README: `https://github.com/KryptosAI/mcp-observatory#readme`
|
|
75
78
|
- GitHub Action: `https://github.com/KryptosAI/mcp-observatory/tree/main/action`
|
|
79
|
+
- Security field guide: `https://github.com/KryptosAI/mcp-observatory/blob/main/docs/mcp-security-field-guide.md`
|
|
80
|
+
- Reference evaluations: `https://github.com/KryptosAI/mcp-observatory/blob/main/docs/reference-evaluations.md`
|
|
81
|
+
- Safety index: `https://github.com/KryptosAI/mcp-observatory/blob/main/docs/mcp-server-safety-index.md`
|
|
82
|
+
- Lock files: `https://github.com/KryptosAI/mcp-observatory/blob/main/docs/mcp-lock-files.md`
|
|
76
83
|
- Certification guide: `https://github.com/KryptosAI/mcp-observatory/blob/main/docs/certification-distribution.md`
|
|
77
84
|
- Proof: `https://github.com/KryptosAI/mcp-observatory/blob/main/docs/proof.md`
|
|
78
85
|
- Commercial pilots: `https://github.com/KryptosAI/mcp-observatory/blob/main/COMMERCIAL.md`
|
|
@@ -8,7 +8,7 @@ For public proof, use [MCP Observatory Proof](./proof.md).
|
|
|
8
8
|
|
|
9
9
|
## Positioning
|
|
10
10
|
|
|
11
|
-
MCP Observatory
|
|
11
|
+
MCP Observatory is the CI and security gate for MCP servers before agents depend on them.
|
|
12
12
|
|
|
13
13
|
## Public Surface Checklist
|
|
14
14
|
|
|
@@ -22,11 +22,11 @@ MCP Observatory helps teams test, secure, and monitor MCP servers before agents
|
|
|
22
22
|
|
|
23
23
|
## Launch Post Draft
|
|
24
24
|
|
|
25
|
-
MCP servers are becoming production dependencies. If an agent depends on a server, that server needs regression tests, security checks, and
|
|
25
|
+
MCP servers are becoming production dependencies. If an agent depends on a server, that server needs regression tests, security checks, and drift gates before it breaks workflows.
|
|
26
26
|
|
|
27
27
|
MCP Observatory scans MCP servers, verifies capabilities, detects schema drift, records/replays sessions, and can run in CI or as an MCP server itself.
|
|
28
28
|
|
|
29
|
-
Free for local OSS use. Paid pilots are available for hosted reporting, private repo CI, security reports,
|
|
29
|
+
Free for local OSS use. Paid pilots are available for hosted reporting, private repo CI, recurring security reports, certification, support, and fleet visibility.
|
|
30
30
|
|
|
31
31
|
Production MCP usage? Contact william@banksey.com.
|
|
32
32
|
|
|
@@ -36,9 +36,9 @@ Subject: MCP production testing and security checks
|
|
|
36
36
|
|
|
37
37
|
Hi,
|
|
38
38
|
|
|
39
|
-
I noticed signals that your team may be evaluating or using MCP servers. MCP Observatory
|
|
39
|
+
I noticed signals that your team may be evaluating or using MCP servers. MCP Observatory is the CI and security gate for MCP servers before agents depend on them.
|
|
40
40
|
|
|
41
|
-
We are running a small number of production pilots for hosted reports, private repo CI, security
|
|
41
|
+
We are running a small number of production pilots for hosted reports, private repo CI, recurring security reviews, certification, support, and fleet visibility.
|
|
42
42
|
|
|
43
43
|
Would it be useful to compare what your MCP servers look like today and where regressions or production risk could show up?
|
|
44
44
|
|
|
@@ -8,7 +8,7 @@ npm run telemetry:intelligence -- --input telemetry-exports/events-flat-full.jso
|
|
|
8
8
|
|
|
9
9
|
Start from `reports/telemetry-usage-summary.html` to confirm external usage before reading account rankings. Do not treat first-party CI, release workflows, or internal/personal sessions as market traction.
|
|
10
10
|
|
|
11
|
-
Do not include raw personal emails in public issues, posts, or
|
|
11
|
+
Raw telemetry is allowed for internal account intelligence and may include git email, git remote URL, hostname, target command or URL, CI metadata, target IDs, and command outcomes. Do not include raw personal emails, hostnames, private URLs, target commands, tokens, or private telemetry exports in public issues, posts, docs, or customer-facing outreach. Use account domains, GitHub orgs, and aggregate telemetry evidence.
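The public-sharing rule above can be sketched as a small aggregation step that keeps only account domains and counts. This is a minimal sketch, not MCP Observatory's telemetry code: the `RawEvent` shape, its field names, and the `toPublicSummary` helper are hypothetical and chosen only for illustration.

```typescript
// Hypothetical raw event shape; field names are assumptions for illustration.
interface RawEvent {
  gitEmail?: string;
  hostname?: string;
  targetCommand?: string;
  command: string; // e.g. "test" or "lock verify"
  ok: boolean;
}

// Public-safe aggregate: account domain plus counts, no raw identifiers.
interface PublicSummary {
  domain: string;
  sessions: number;
  failures: number;
}

// Reduce an email to its lowercase domain; "unknown" when no domain is usable.
function accountDomain(email: string | undefined): string {
  if (!email) return "unknown";
  const at = email.lastIndexOf("@");
  if (at < 0 || at === email.length - 1) return "unknown";
  return email.slice(at + 1).toLowerCase();
}

// Group events by account domain and drop every raw field along the way.
function toPublicSummary(events: RawEvent[]): PublicSummary[] {
  const byDomain: Record<string, PublicSummary> = {};
  for (const event of events) {
    const domain = accountDomain(event.gitEmail);
    if (!byDomain[domain]) {
      byDomain[domain] = { domain: domain, sessions: 0, failures: 0 };
    }
    byDomain[domain].sessions += 1;
    if (!event.ok) byDomain[domain].failures += 1;
  }
  return Object.keys(byDomain)
    .sort()
    .map((key) => byDomain[key]);
}
```

Hostnames, target commands, and full emails never reach the output type, so a summary built this way can be reviewed for public use without scanning free-form fields.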
|
|
12
12
|
|
|
13
13
|
## Priority Accounts
|
|
14
14
|
|
|
@@ -35,7 +35,7 @@ If your team is running MCP servers in production, I can prepare a short evidenc
|
|
|
35
35
|
- Feishu/Lark MCP compatibility
|
|
36
36
|
- private HTTP MCP health checks
|
|
37
37
|
- security findings and schema drift
|
|
38
|
-
- CI history and
|
|
38
|
+
- CI history and controlled drift review
|
|
39
39
|
- MCP fleet visibility across teams
|
|
40
40
|
|
|
41
41
|
Would it be useful to compare notes this week?
|
|
@@ -0,0 +1,63 @@
|
|
|
1
|
+
# MCP Lock Files
|
|
2
|
+
|
|
3
|
+
MCP lock files are the package-lock for AI tools.
|
|
4
|
+
|
|
5
|
+
They capture the MCP contract a server exposes to agents: tools, prompts, resources, and tool input schemas. Once committed, CI can verify that future changes are intentional before agents depend on a changed surface.
|
|
6
|
+
|
|
7
|
+
## Core Flow
|
|
8
|
+
|
|
9
|
+
Create the lock:
|
|
10
|
+
|
|
11
|
+
```bash
|
|
12
|
+
npx @kryptosai/mcp-observatory lock
|
|
13
|
+
```
|
|
14
|
+
|
|
15
|
+
Verify the live server still matches:
|
|
16
|
+
|
|
17
|
+
```bash
|
|
18
|
+
npx @kryptosai/mcp-observatory lock verify
|
|
19
|
+
```
|
|
20
|
+
|
|
21
|
+
Add CI:
|
|
22
|
+
|
|
23
|
+
```bash
|
|
24
|
+
npx @kryptosai/mcp-observatory init-ci --all --command "npx -y my-mcp-server"
|
|
25
|
+
```
|
|
26
|
+
|
|
27
|
+
## Why It Matters
|
|
28
|
+
|
|
29
|
+
Agents call tools based on schemas and descriptions. If a tool is added, removed, renamed, or made more permissive, the agent-facing contract changed.
|
|
30
|
+
|
|
31
|
+
Lock verification turns that into a reviewable event:
|
|
32
|
+
|
|
33
|
+
- what changed
|
|
34
|
+
- whether a tool, prompt, or resource was added or removed
|
|
35
|
+
- whether a tool schema changed
|
|
36
|
+
- whether the changed MCP surface should be accepted before release
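The review questions above can be sketched as a comparison between a committed lock and the live surface. This is a minimal sketch under stated assumptions: the `LockSnapshot` shape and the `diffLocks` helper are hypothetical, and the real `.mcp-observatory/lock.json` format and verification logic may differ.

```typescript
// Hypothetical lock shape for illustration; the real lock.json may differ.
interface LockSnapshot {
  tools: Record<string, unknown>; // tool name -> input schema
  prompts: string[];
  resources: string[];
}

interface LockDrift {
  added: string[];
  removed: string[];
  schemaChanged: string[];
}

// Serialize with sorted object keys so key order alone is not reported as drift.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map((item) => stableStringify(item)).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj).sort();
    return "{" + keys.map((k) => JSON.stringify(k) + ":" + stableStringify(obj[k])).join(",") + "}";
  }
  return JSON.stringify(value);
}

// Record names present on only one side, prefixed with their kind.
function diffNames(kind: string, before: string[], after: string[], drift: LockDrift): void {
  for (const name of after) {
    if (before.indexOf(name) < 0) drift.added.push(kind + ":" + name);
  }
  for (const name of before) {
    if (after.indexOf(name) < 0) drift.removed.push(kind + ":" + name);
  }
}

// Compare the committed lock against the live surface.
function diffLocks(locked: LockSnapshot, live: LockSnapshot): LockDrift {
  const drift: LockDrift = { added: [], removed: [], schemaChanged: [] };
  const lockedTools = Object.keys(locked.tools);
  const liveTools = Object.keys(live.tools);
  diffNames("tool", lockedTools, liveTools, drift);
  diffNames("prompt", locked.prompts, live.prompts, drift);
  diffNames("resource", locked.resources, live.resources, drift);
  for (const name of lockedTools) {
    const stillPresent = liveTools.indexOf(name) >= 0;
    if (stillPresent && stableStringify(locked.tools[name]) !== stableStringify(live.tools[name])) {
      drift.schemaChanged.push("tool:" + name);
    }
  }
  return drift;
}
```

A CI step built on this idea would fail when any of the three arrays is non-empty, unless the same pull request also updates the committed lock.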
|
|
37
|
+
|
|
38
|
+
## Production Positioning
|
|
39
|
+
|
|
40
|
+
For maintainers, lock files catch accidental breakage.
|
|
41
|
+
|
|
42
|
+
For security and platform teams, lock files create an approval point for AI supply chain changes. A production MCP server can treat new tools, broader schemas, and high-risk capabilities like dependency changes that deserve review.
|
|
43
|
+
|
|
44
|
+
## Recommended CI Policy
|
|
45
|
+
|
|
46
|
+
- Commit `.mcp-observatory/lock.json` for production MCP servers.
|
|
47
|
+
- Run `mcp-observatory lock verify` on pull requests.
|
|
48
|
+
- Treat drift as blocking unless the PR intentionally updates the MCP surface.
|
|
49
|
+
- Pair lock verification with `--security` checks before major releases.
|
|
50
|
+
- Record suppressions with an owner, reason, and expiration when accepted risk is intentional.
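The suppression rule in the policy above (owner, reason, expiration) can be enforced with a small audit step. The sketch below is illustrative only: the `Suppression` record and the `auditSuppressions` helper are hypothetical and do not describe how MCP Observatory stores or evaluates suppressions.

```typescript
// Hypothetical suppression record; field names are assumptions for illustration.
interface Suppression {
  finding: string; // identifier of the accepted finding
  owner: string;
  reason: string;
  expires: string; // ISO 8601 date, e.g. "2026-09-30"
}

// Return the problems that should block review: missing fields,
// unparseable dates, or suppressions whose expiration has passed.
function auditSuppressions(suppressions: Suppression[], now: Date): string[] {
  const problems: string[] = [];
  for (const s of suppressions) {
    if (!s.owner.trim()) problems.push(s.finding + ": missing owner");
    if (!s.reason.trim()) problems.push(s.finding + ": missing reason");
    const expiry = Date.parse(s.expires);
    if (isNaN(expiry)) {
      problems.push(s.finding + ": invalid expiration");
    } else if (expiry <= now.getTime()) {
      problems.push(s.finding + ": expired on " + s.expires);
    }
  }
  return problems;
}
```

Passing `now` as a parameter keeps the check deterministic in tests. Expired entries resurface as review items instead of remaining accepted indefinitely.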
|
|
51
|
+
|
|
52
|
+
## Commercial Pilot Use
|
|
53
|
+
|
|
54
|
+
Paid pilots can turn lock verification into a recurring MCP readiness report:
|
|
55
|
+
|
|
56
|
+
- current MCP surface
|
|
57
|
+
- drift since last approved lock
|
|
58
|
+
- new or removed tools
|
|
59
|
+
- schema changes
|
|
60
|
+
- security findings
|
|
61
|
+
- recommended review actions
|
|
62
|
+
|
|
63
|
+
This is the simplest enterprise story: commit your MCP contract, then make drift visible before agents depend on it.
|
|
@@ -1,9 +1,11 @@
|
|
|
1
1
|
# MCP Safety Report
|
|
2
2
|
|
|
3
|
-
Latest generated baseline: June
|
|
3
|
+
Latest generated baseline: June 20, 2026.
|
|
4
4
|
|
|
5
5
|
MCP servers are becoming production dependencies. When agents depend on a server, that server needs repeatable compatibility checks, security review, schema drift detection, and visible trust signals.
|
|
6
6
|
|
|
7
|
+
For a broader security framing, see the [MCP Server Security Field Guide](./mcp-security-field-guide.md). For public examples, see [Reference Evaluations](./reference-evaluations.md).
|
|
8
|
+
|
|
7
9
|
## What Observatory Checks
|
|
8
10
|
|
|
9
11
|
MCP Observatory checks:
|
|
@@ -22,11 +24,13 @@ Safe aggregate telemetry from the latest local export:
|
|
|
22
24
|
|
|
23
25
|
| Metric | Value |
|
|
24
26
|
| --- | ---: |
|
|
25
|
-
| Total telemetry events | 10,
|
|
26
|
-
|
|
|
27
|
-
| External sessions | 5,
|
|
28
|
-
| External CI sessions | 2,
|
|
29
|
-
| Attributed company/org sessions |
|
|
27
|
+
| Total telemetry events | 10,918 |
|
|
28
|
+
| Total sessions | 7,380 |
|
|
29
|
+
| External sessions | 5,379 |
|
|
30
|
+
| External CI sessions | 2,446 |
|
|
31
|
+
| Attributed company/org sessions | 138 |
|
|
32
|
+
| GitHub clones in visible traffic window | 721 |
|
|
33
|
+
| Unique cloners in visible traffic window | 221 |
|
|
30
34
|
|
|
31
35
|
Top external commands:
|
|
32
36
|
|
|
@@ -0,0 +1,97 @@
|
|
|
1
|
+
# MCP Server Security Field Guide
|
|
2
|
+
|
|
3
|
+
MCP servers are becoming part of AI agent infrastructure. They expose tools that agents can call, often with access to files, browsers, cloud APIs, databases, documents, and internal systems. That makes MCP security a practical engineering problem: teams need to know which tools exist, what they can touch, how their schemas change, and whether they are safe enough for production agent workflows.
|
|
4
|
+
|
|
5
|
+
MCP Observatory is built around that control point. It gives maintainers and platform teams a repeatable way to test production MCP servers, add MCP server CI, detect schema drift, and surface agent security risk before agents depend on a tool.
|
|
6
|
+
|
|
7
|
+
## Why MCP Servers Are An Agent-Facing Attack Surface
|
|
8
|
+
|
|
9
|
+
Traditional libraries run inside an application boundary. MCP servers sit beside an agent and expose capabilities the model may choose to call. A small schema mistake, broad tool surface, or unreliable startup path can become an operational risk when the server is wired into an autonomous workflow.
|
|
10
|
+
|
|
11
|
+
Important MCP risk patterns include:
|
|
12
|
+
|
|
13
|
+
- **Tool overreach:** tools that expose shell, browser, filesystem, network, or data-write behavior with weak constraints.
|
|
14
|
+
- **Schema ambiguity:** vague names, missing parameter descriptions, permissive object schemas, or unclear required fields that make agent calls less predictable.
|
|
15
|
+
- **Prompt injection paths:** tools that retrieve untrusted content and return it directly to an agent context.
|
|
16
|
+
- **Secret exposure:** responses, logs, headers, or environment-backed tools that can leak credentials or internal details.
|
|
17
|
+
- **Schema drift:** changed tool names, parameters, or capabilities that break dependent agents without warning.
|
|
18
|
+
- **Unreliable startup:** packages that work locally but hang, exit early, or fail under CI and production runners.
|
|
19
|
+
- **Capability mismatch:** servers that advertise tools, prompts, or resources but do not return valid MCP responses.
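The schema ambiguity pattern in the list above lends itself to simple static heuristics. The sketch below assumes a tool definition with `name`, `description`, and a JSON-Schema-like `inputSchema`. The `findSchemaAmbiguity` helper and its finding messages are hypothetical and are not MCP Observatory's actual rules.

```typescript
// Minimal JSON-Schema-like shapes; only the fields this sketch inspects.
interface ParamSchema {
  type?: string;
  description?: string;
}

interface ToolSchema {
  type?: string;
  properties?: Record<string, ParamSchema>;
  required?: string[];
  additionalProperties?: boolean | object;
}

interface Tool {
  name: string;
  description?: string;
  inputSchema: ToolSchema;
}

// Flag patterns that make agent calls less predictable.
function findSchemaAmbiguity(tool: Tool): string[] {
  const findings: string[] = [];
  if (!tool.description || !tool.description.trim()) {
    findings.push("tool has no description");
  }
  const props = tool.inputSchema.properties ?? {};
  const names = Object.keys(props);
  // An object schema with no declared properties accepts arbitrary input
  // unless additionalProperties is explicitly false.
  if (names.length === 0 && tool.inputSchema.additionalProperties !== false) {
    findings.push("permissive object schema: no declared properties");
  }
  for (const name of names) {
    if (!props[name].description) {
      findings.push(`parameter "${name}" has no description`);
    }
  }
  for (const name of tool.inputSchema.required ?? []) {
    if (names.indexOf(name) < 0) {
      findings.push(`required field "${name}" is not declared`);
    }
  }
  return findings;
}
```

Heuristics like these surface candidates for review. They cannot establish that a tool is safe, only that its contract is explicit enough to reason about.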
|
|
20
|
+
|
|
21
|
+
## What Can Go Wrong When Agents Depend On Tools
|
|
22
|
+
|
|
23
|
+
An MCP server can look harmless during manual evaluation and still fail in production agent infrastructure. The most common failure modes are not exotic. They are basic integration risks amplified by agent autonomy:
|
|
24
|
+
|
|
25
|
+
- a tool disappears or changes shape after an upgrade
|
|
26
|
+
- a server starts on a laptop but fails in GitHub Actions
|
|
27
|
+
- a broad filesystem or browser automation tool is exposed without a clear trust boundary
|
|
28
|
+
- a tool returns untrusted text that gets treated as instruction-like context
|
|
29
|
+
- a schema is technically valid but too vague for reliable model use
|
|
30
|
+
- a private or credential-backed tool is added without audit visibility
|
|
31
|
+
|
|
32
|
+
For security and platform teams, the goal is not to block every MCP server. The goal is to make tool invocation observable, testable, auditable, and safe enough for the workflow that depends on it.
|
|
33
|
+
|
|
34
|
+
## What MCP Observatory Checks Today
|
|
35
|
+
|
|
36
|
+
MCP Observatory focuses on model context protocol testing that can run locally, in CI, or through its own MCP server mode. It checks:
|
|
37
|
+
|
|
38
|
+
- tools, prompts, and resources list and respond correctly
|
|
39
|
+
- advertised capabilities match observed behavior
|
|
40
|
+
- safe read-only tools can be invoked
|
|
41
|
+
- schemas have enough structure for agents to call them reliably
|
|
42
|
+
- risky schema patterns are surfaced before production use
|
|
43
|
+
- runs can be compared for regressions and schema drift detection
|
|
44
|
+
- artifacts can be rendered as JSON, Markdown, HTML, JUnit, SARIF, or PR comments
|
|
45
|
+
- health scores and badges can create visible trust signals for MCP maintainers
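The "advertised capabilities match observed behavior" check in the list above can be sketched as a comparison between what a server declares and what its list calls return. This is an illustrative sketch: the `ListProbe` shape and the `findCapabilityMismatches` helper are hypothetical, and the probing itself (calling the server) is assumed to happen elsewhere.

```typescript
type Capability = "tools" | "prompts" | "resources";

// Outcome of calling the matching list method; hypothetical shape for this sketch.
interface ListProbe {
  ok: boolean; // the call returned a valid response
  count: number; // number of items returned
}

// Compare declared capabilities with the observed list results.
function findCapabilityMismatches(
  advertised: Capability[],
  probes: Partial<Record<Capability, ListProbe>>
): string[] {
  const all: Capability[] = ["tools", "prompts", "resources"];
  const findings: string[] = [];
  for (const cap of all) {
    const isAdvertised = advertised.indexOf(cap) >= 0;
    const probe = probes[cap];
    if (isAdvertised && (!probe || !probe.ok)) {
      findings.push(`${cap}: advertised but ${cap}/list did not return a valid response`);
    }
    if (!isAdvertised && probe && probe.ok && probe.count > 0) {
      findings.push(`${cap}: not advertised but ${cap}/list returned ${probe.count} item(s)`);
    }
  }
  return findings;
}
```

Both directions are reported because each one misleads an agent: a declared capability that fails at call time breaks workflows, and an undeclared one escapes review.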
|
|
46
|
+
|
|
47
|
+
This is intentionally practical. It is not a formal proof of semantic safety. It is a CI-friendly control that helps teams find obvious compatibility, drift, and security issues before they become agent failures.
|
|
48
|
+
|
|
49
|
+
## What CI Should Catch Before Deployment
|
|
50
|
+
|
|
51
|
+
A useful MCP server CI gate should answer a few operational questions:
|
|
52
|
+
|
|
53
|
+
- Does the server start reliably in a clean environment?
|
|
54
|
+
- Do tools, prompts, and resources respond with valid MCP shapes?
|
|
55
|
+
- Did any tool, parameter, prompt, or resource drift from the previous known-good run?
|
|
56
|
+
- Are there broad filesystem, shell, browser, network, or credential-sensitive tools?
|
|
57
|
+
- Are generated reports readable by maintainers and security reviewers?
|
|
58
|
+
- Can the run produce artifacts for later audit, diffing, or enterprise review?
|
|
59
|
+
|
|
60
|
+
MCP Observatory is designed to make that a one-command adoption path:
|
|
61
|
+
|
|
62
|
+
```bash
|
|
63
|
+
npx @kryptosai/mcp-observatory init-ci --all --command "npx -y my-mcp-server"
|
|
64
|
+
```
|
|
65
|
+
|
|
66
|
+
For a direct check:
|
|
67
|
+
|
|
68
|
+
```bash
|
|
69
|
+
npx @kryptosai/mcp-observatory test --security npx -y my-mcp-server
|
|
70
|
+
```
|
|
71
|
+
|
|
72
|
+
## How Security And Platform Teams Can Adopt MCP Checks
|
|
73
|
+
|
|
74
|
+
For open source maintainers, start with the generated GitHub Action and a public badge. This creates a visible compatibility/security signal without requiring an account.
|
|
75
|
+
|
|
76
|
+
For private teams, start with static artifacts:
|
|
77
|
+
|
|
78
|
+
- run MCP checks in CI
|
|
79
|
+
- store JSON and Markdown artifacts
|
|
80
|
+
- compare releases with `diff`
|
|
81
|
+
- use SARIF where security review tools expect it
|
|
82
|
+
- generate a static enterprise report for owner review
|
|
83
|
+
|
|
84
|
+
For production MCP fleets, the next layer is hosted history, recurring security reports, certification review, support, and fleet visibility across repositories and agent environments.
|
|
85
|
+
|
|
86
|
+
## Future Direction
|
|
87
|
+
|
|
88
|
+
The next generation of secure agentic systems will need more than ad hoc tool installs. Useful controls will include:
|
|
89
|
+
|
|
90
|
+
- policy for which tools agents may call
|
|
91
|
+
- provenance for MCP packages and server configurations
|
|
92
|
+
- schema locks and controlled drift review
|
|
93
|
+
- runtime monitoring for production agent tool use
|
|
94
|
+
- certification signals for high-trust MCP servers
|
|
95
|
+
- fleet inventory across teams, repositories, and hosts
|
|
96
|
+
|
|
97
|
+
MCP Observatory starts with the smallest durable wedge: make MCP servers testable, visible, and auditable before agents depend on them.
|
|
@@ -0,0 +1,85 @@
|
|
|
1
|
+
# MCP Server Safety Index
|
|
2
|
+
|
|
3
|
+
The MCP Server Safety Index is a public, reproducible way to show how MCP servers behave under compatibility, schema quality, drift, and security checks.
|
|
4
|
+
|
|
5
|
+
The goal is constructive proof, not callouts. Each entry shows what should be tested, how to reproduce it, what risk class matters, and what a maintainer can do next.
|
|
6
|
+
|
|
7
|
+
## Index v0
|
|
8
|
+
|
|
9
|
+
| # | Server | Category | Reproducible Command | What To Check | Risk Class | Status |
| ---: | --- | --- | --- | --- | --- | --- |
| 1 | [`modelcontextprotocol/servers`](https://github.com/modelcontextprotocol/servers) sequential thinking | Reference | `npx -y @modelcontextprotocol/server-sequential-thinking@latest` | Startup, tools/list, schema quality, security-lite | Reference compatibility | PR open: [#4392](https://github.com/modelcontextprotocol/servers/pull/4392) |
| 2 | [`modelcontextprotocol/servers`](https://github.com/modelcontextprotocol/servers) filesystem | Filesystem | `npx -y @modelcontextprotocol/server-filesystem .` | Startup in harmless temp dir, path tools, schema quality | Filesystem boundary | Researched |
| 3 | [`upstash/context7`](https://github.com/upstash/context7) | Documentation/search | `npx -y @upstash/context7-mcp@latest` | Startup, retrieval tools, schemas, prompt-injection-sensitive text flow | Untrusted content retrieval | PR open: [#2800](https://github.com/upstash/context7/pull/2800) |
| 4 | [`executeautomation/mcp-playwright`](https://github.com/executeautomation/mcp-playwright) | Browser automation | `npx -y @executeautomation/playwright-mcp-server@latest` | Browser tools, schema quality, intentional code-eval suppressions | Browser/code execution | PR open: [#225](https://github.com/executeautomation/mcp-playwright/pull/225) |
| 5 | [`microsoft/playwright-mcp`](https://github.com/microsoft/playwright-mcp) | Browser automation | `npx -y @playwright/mcp@latest` | Browser tools, skip-invoke policy, schema quality, suppressions | Browser/code execution | PR open: [#1657](https://github.com/microsoft/playwright-mcp/pull/1657) |
| 6 | [`kazuph/mcp-taskmanager`](https://github.com/kazuph/mcp-taskmanager) | Developer tools | `npx -y @kazuph/mcp-taskmanager@latest` | Task tools, schema quality, mutation clarity | Project/task mutation | PR open: [#11](https://github.com/kazuph/mcp-taskmanager/pull/11) |
| 7 | [`cyanheads/filesystem-mcp-server`](https://github.com/cyanheads/filesystem-mcp-server) | Filesystem | `node dist/index.js` | Capability declarations, resources/list, sandboxed filesystem target | Filesystem boundary | PR open: [#19](https://github.com/cyanheads/filesystem-mcp-server/pull/19) |
| 8 | [`browserbase/mcp-server-browserbase`](https://github.com/browserbase/mcp-server-browserbase) | Browser automation | `npx -y @browserbasehq/mcp-server-browserbase` | Auth-free startup, browser tools, network/browser boundaries | Hosted browser control | Researched; likely needs API key |
| 9 | [`redis/mcp-redis`](https://github.com/redis/mcp-redis) | Database | `uvx mcp-redis` | Startup without live database, command surface, destructive operations | Data mutation | Researched; may need service |
| 10 | [`mongodb-js/mongodb-mcp-server`](https://github.com/mongodb-js/mongodb-mcp-server) | Database | `npx -y mongodb-mcp-server` | Connection handling, read/write tools, auth posture | Data mutation/auth | Researched; likely needs connection string |
| 11 | [`supabase-community/supabase-mcp`](https://github.com/supabase-community/supabase-mcp) | Database/SaaS | `npx -y supabase-mcp` | Startup, token handling, project mutation tools | Cloud data access | Researched; likely needs token |
| 12 | [`cloudflare/mcp-server-cloudflare`](https://github.com/cloudflare/mcp-server-cloudflare) | Cloud | `npx -y @cloudflare/mcp-server-cloudflare` | Auth posture, deploy/config tools, schema clarity | Cloud control plane | Researched; likely needs auth |
| 13 | [`stripe/agent-toolkit`](https://github.com/stripe/agent-toolkit) | Payments | `npx -y @stripe/agent-toolkit` | MCP mode, payment/customer mutation tools, auth posture | Payments/destructive action | Researched; likely needs API key |
| 14 | [`github/github-mcp-server`](https://github.com/github/github-mcp-server) | Developer tools | `docker run ghcr.io/github/github-mcp-server` | Auth handling, repo mutation tools, schema clarity | Source-code control | Researched; likely needs token |
| 15 | [`jetbrains/mcpProxy`](https://github.com/JetBrains/mcpProxy) | IDE/developer tools | `npx -y @jetbrains/mcp-proxy` | IDE dependency, startup behavior, tool surface | Local IDE control | Researched; may need IDE process |
| 16 | [`BrowserMCP/mcp`](https://github.com/BrowserMCP/mcp) | Browser automation | `npx -y @browsermcp/mcp` | Browser tools, schema quality, browser-control boundary | Browser control | PR open: [#189](https://github.com/BrowserMCP/mcp/pull/189) |
| 17 | [`UI5/mcp-server`](https://github.com/UI5/mcp-server) | Developer tools | `npx -y @ui5/mcp-server` | UI5 tooling commands, schema quality, drift risk | App development tooling | PR open: [#348](https://github.com/UI5/mcp-server/pull/348) |
| 18 | [`antvis/mcp-server-chart`](https://github.com/antvis/mcp-server-chart) | Visualization/data | `npx -y @antv/mcp-server-chart` | Chart generation tools, schema quality, artifact-producing tools | Generated artifacts | PR open: [#312](https://github.com/antvis/mcp-server-chart/pull/312) |
| 19 | [`makenotion/notion-mcp-server`](https://github.com/makenotion/notion-mcp-server) | SaaS/API | `npx -y @notionhq/notion-mcp-server` | Auth handling, read/write tool separation, schema quality | Workspace data access | PR open: [#324](https://github.com/makenotion/notion-mcp-server/pull/324) |
| 20 | [`sentry/sentry-mcp`](https://github.com/getsentry/sentry-mcp) | Developer SaaS | `npx -y @sentry/mcp-server` | Auth handling, issue/project tools, schema quality | Production incident data | Researched; likely needs token |
|
|
31
|
+
|
|
32
|
+
## Evaluation Command
|
|
33
|
+
|
|
34
|
+
For simple npm-backed servers:
|
|
35
|
+
|
|
36
|
+
```bash
|
|
37
|
+
npx @kryptosai/mcp-observatory test --security npx -y <server-package>
|
|
38
|
+
```
|
|
39
|
+
|
|
40
|
+
For safer campaign PRs:
|
|
41
|
+
|
|
42
|
+
```bash
|
|
43
|
+
npx @kryptosai/mcp-observatory init-ci --all --command "npx -y <server-package>"
|
|
44
|
+
```
|
|
45
|
+
|
|
46
|
+
For production-style review:
|
|
47
|
+
|
|
48
|
+
```bash
|
|
49
|
+
npx @kryptosai/mcp-observatory lock
|
|
50
|
+
npx @kryptosai/mcp-observatory lock verify
|
|
51
|
+
```
|
|
52
|
+
|
|
53
|
+
## What Each Column Means
|
|
54
|
+
|
|
55
|
+
- What To Check: the minimum compatibility/security surface a maintainer or platform team should inspect.
|
|
56
|
+
- Risk Class: the operational reason the server matters before agents depend on it.
|
|
57
|
+
- Status: public proof such as PR open, PR accepted, badge added, researched, or needs maintainer review.
|
|
58
|
+
|
|
59
|
+
## Publication Rules
|
|
60
|
+
|
|
61
|
+
- Use only public repositories, public package commands, public PRs, or sample artifacts.
|
|
62
|
+
- Include a reproduction command for every row.
|
|
63
|
+
- Link to the maintainer PR or public artifact when available.
|
|
64
|
+
- Phrase findings constructively: “needs review” rather than “unsafe” unless there is clear public proof.
|
|
65
|
+
- Keep customer/domain telemetry internal unless the customer gives permission or there is independent public evidence.
|
|
66
|
+
|
|
67
|
+
## Five Patterns To Publish From v0
|
|
68
|
+
|
|
69
|
+
1. Browser automation MCP servers need explicit policy around code execution, screenshots, navigation, and mutation.
|
|
70
|
+
2. Filesystem MCP servers need harmless CI sandboxes and clear read/write boundaries.
|
|
71
|
+
3. SaaS and cloud MCP servers often cannot be meaningfully checked without token-safe target configs.
|
|
72
|
+
4. Database MCP servers need read/write classification and connection-string hygiene before CI rollout.
|
|
73
|
+
5. Lock files turn MCP surface drift into a reviewable PR event instead of an invisible agent dependency change.
|
|
74
|
+
|
|
75
|
+
## Next Wave Criteria
|
|
76
|
+
|
|
77
|
+
Prioritize 20-50 servers that have:
|
|
78
|
+
|
|
79
|
+
- active maintenance in the last 90 days
|
|
80
|
+
- visible stars, downloads, or directory listings
|
|
81
|
+
- simple `npx`, `uvx`, or Docker startup commands
|
|
82
|
+
- enterprise-relevant categories such as browser automation, filesystem, documentation/search, databases, cloud, productivity, and developer tools
|
|
83
|
+
- no existing MCP compatibility/security CI
|
|
84
|
+
|
|
85
|
+
One accepted PR in a respected repo is worth more than a large list of shallow checks.
|
|
@@ -0,0 +1,58 @@
|
|
|
1
|
+
# Paid Pilot Offer
|
|
2
|
+
|
|
3
|
+
## Private MCP Readiness Review
|
|
4
|
+
|
|
5
|
+
Offer:
|
|
6
|
+
|
|
7
|
+
> Private MCP readiness review + CI rollout + drift/security report.
|
|
8
|
+
|
|
9
|
+
This is a manual pilot, not a self-serve SaaS promise.
|
|
10
|
+
|
|
11
|
+
## Who It Is For
|
|
12
|
+
|
|
13
|
+
- teams running MCP servers in production or pre-production
|
|
14
|
+
- security/platform teams reviewing agent tool dependencies
|
|
15
|
+
- companies with private MCP repos
|
|
16
|
+
- teams that need proof before agents depend on internal tools
|
|
17
|
+
|
|
18
|
+
## What The Pilot Includes
|
|
19
|
+
|
|
20
|
+
- review of the customer’s MCP config, repo, or startup commands
|
|
21
|
+
- MCP Observatory CI rollout for selected servers
|
|
22
|
+
- private readiness report covering startup, capabilities, schema quality, security findings, and drift risk
|
|
23
|
+
- MCP lock-file setup for contract drift review
|
|
24
|
+
- prioritized remediation notes
|
|
25
|
+
- optional certification language for servers that pass agreed checks
|
|
26
|
+
|
|
27
|
+
## Starting Prices
|
|
28
|
+
|
|
29
|
+
- Business Pilot: starts at `$999/month`
|
|
30
|
+
- Enterprise Pilot: starts at `$3k/month`
|
|
31
|
+
- Strategic Accounts: custom, `$250k+/year`
|
|
32
|
+
|
|
33
|
+
Do not route major platforms, AI labs, or large enterprises to Team/Business pricing. Use a production/security pilot conversation and ask for the owner or procurement path.
|
|
34
|
+
|
|
35
|
+
## Simple Outreach Copy
|
|
36
|
+
|
|
37
|
+
Subject: Private MCP readiness review
|
|
38
|
+
|
|
39
|
+
Hi,
|
|
40
|
+
|
|
41
|
+
I build MCP Observatory, the CI and security gate for MCP servers before agents depend on them.
|
|
42
|
+
|
|
43
|
+
I am opening a small number of private MCP readiness pilots for teams running MCP in production or pre-production. The pilot includes CI rollout, schema/security review, drift checks, and a private readiness report for your MCP servers.
|
|
44
|
+
|
|
45
|
+
If MCP is becoming part of your agent infrastructure, I can help you answer:
|
|
46
|
+
|
|
47
|
+
- which servers are safe enough for agents to depend on?
|
|
48
|
+
- which tool surfaces changed recently?
|
|
49
|
+
- where are the schema/security risks?
|
|
50
|
+
- what should block a PR before production?
|
|
51
|
+
|
|
52
|
+
Would it be useful to compare notes this week?
|
|
53
|
+
|
|
54
|
+
William
|
|
55
|
+
|
|
56
|
+
## Delivery Shape
|
|
57
|
+
|
|
58
|
+
Start with static reports and CI setup. Do not build a dashboard until paid pilot feedback proves exactly what buyers need.
|
|
@@ -4,9 +4,17 @@
|
|
|
4
4
|
|
|
5
5
|
MCP Observatory is CI/security infrastructure for production MCP servers.
|
|
6
6
|
|
|
7
|
-
##
|
|
7
|
+
## Project Narrative
|
|
8
8
|
|
|
9
|
-
MCP
|
|
9
|
+
MCP Observatory identifies an emerging risk in AI agent infrastructure and turns it into a practical OSS control: CI checks, security reports, drift detection, telemetry intelligence, and certification workflows for production MCP servers.
|
|
10
|
+
|
|
11
|
+
The project is strongest as a signal because it connects product intuition with implementation depth. It starts from a real infrastructure shift, builds a working developer tool around that shift, instruments usage, and creates a credible path from open source adoption to production security workflows.
|
|
12
|
+
|
|
13
|
+
## Problem Discovery
|
|
14
|
+
|
|
15
|
+
MCP servers are becoming dependencies for AI agents. They expose tools, prompts, resources, and data access that agents can call directly. When those servers drift, fail to start, expose broad capabilities, or return ambiguous schemas, the failure can propagate into agent workflows.
|
|
16
|
+
|
|
17
|
+
The control gap is simple: teams need a way to test MCP servers before agents depend on them. They also need artifacts that maintainers, platform engineers, and security reviewers can understand.
|
|
10
18
|
|
|
11
19
|
## Product
|
|
12
20
|
|
|
@@ -23,46 +31,57 @@ MCP Observatory provides:
|
|
|
23
31
|
- static enterprise reports
|
|
24
32
|
- telemetry intelligence for product and account-level learning
|
|
25
33
|
|
|
26
|
-
##
|
|
34
|
+
## System Design
|
|
27
35
|
|
|
28
|
-
The project is a TypeScript/Node CLI with modular command handlers, MCP adapters, check runners, reporters, artifact schemas, and a GitHub Action wrapper.
|
|
36
|
+
The project is a TypeScript/Node CLI with modular command handlers, MCP adapters, check runners, reporters, artifact schemas, and a GitHub Action wrapper.
|
|
29
37
|
|
|
30
|
-
|
|
38
|
+
The system supports local-process and HTTP MCP targets, stores run artifacts, compares runs for regressions, generates reports for humans and CI systems, and can run as an MCP server itself. A Cloudflare Worker handles hosted artifact upload pilots. A separate telemetry Worker stores private aggregate usage events in D1 for product and account intelligence.
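
The "compares runs for regressions" step can be pictured as a diff over two snapshots of a server's tool list. The following is a minimal sketch under stated assumptions, not the package's actual implementation: `ToolSnapshot`, `DriftReport`, `stableStringify`, and `compareRuns` are hypothetical names, and the real artifact schema may differ.

```typescript
// Hypothetical shapes for illustration only; not MCP Observatory's artifact schema.
interface ToolSnapshot {
  name: string;
  inputSchema: unknown;
}

interface DriftReport {
  added: string[];
  removed: string[];
  changed: string[];
}

// Serialize with sorted object keys so key order alone never counts as drift.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const keys = Object.keys(record).sort();
    const body = keys.map((k) => `${JSON.stringify(k)}:${stableStringify(record[k])}`);
    return `{${body.join(",")}}`;
  }
  return JSON.stringify(value) ?? "null";
}

// Compare a baseline run against the current run, keyed by tool name.
function compareRuns(baseline: ToolSnapshot[], current: ToolSnapshot[]): DriftReport {
  const before = new Map<string, string>();
  baseline.forEach((t) => before.set(t.name, stableStringify(t.inputSchema)));
  const after = new Map<string, string>();
  current.forEach((t) => after.set(t.name, stableStringify(t.inputSchema)));

  const report: DriftReport = { added: [], removed: [], changed: [] };
  after.forEach((schema, name) => {
    const previous = before.get(name);
    if (previous === undefined) {
      report.added.push(name);
    } else if (previous !== schema) {
      report.changed.push(name);
    }
  });
  before.forEach((_schema, name) => {
    if (!after.has(name)) {
      report.removed.push(name);
    }
  });
  return report;
}
```

A CI gate built on a report like this could fail the job when `removed` or `changed` is non-empty and treat `added` as a warning. That policy is an assumption here, not documented behavior.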
|
|
31
39
|
|
|
32
|
-
|
|
40
|
+
## Security Model
|
|
33
41
|
|
|
34
|
-
-
|
|
35
|
-
- 40 test files
|
|
36
|
-
- 321 passing tests
|
|
37
|
-
- npm package published
|
|
38
|
-
- GitHub Action available
|
|
39
|
-
- MCP server mode available
|
|
40
|
-
- telemetry export and company intelligence tooling available
|
|
42
|
+
MCP Observatory treats MCP servers as agent-facing infrastructure. The goal is not to claim formal semantic safety. The goal is to make compatibility, drift, and obvious security risk visible before deployment.
|
|
41
43
|
|
|
42
|
-
|
|
44
|
+
Current controls include:
|
|
43
45
|
|
|
44
|
-
|
|
46
|
+
- lightweight security checks for risky schema patterns
|
|
47
|
+
- schema quality analysis for agent usability
|
|
48
|
+
- SARIF output for security review workflows
|
|
49
|
+
- support for security suppressions when broad tools are intentional
|
|
50
|
+
- private-network rejection for hosted scans
|
|
51
|
+
- privacy disclosure and telemetry opt-out controls
|
|
52
|
+
- sanitized public reporting policy
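
As an illustration of the "private-network rejection for hosted scans" control, a hosted scanner can refuse targets whose host is loopback, link-local, or in an RFC 1918 range. This is a minimal sketch with hypothetical helper names (`parseIPv4`, `isPrivateOrLocalHost`), not the project's code. It assumes the host is already extracted from the target URL, handles dotted-decimal IPv4 and `localhost` only, and omits the IPv6 and post-DNS-resolution checks a production control would also need.

```typescript
// Hypothetical helper: parse a dotted-decimal IPv4 literal into four octets.
function parseIPv4(host: string): number[] | null {
  const parts = host.split(".");
  if (parts.length !== 4) {
    return null;
  }
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  return octets.every((o) => Number.isInteger(o) && o <= 255) ? octets : null;
}

// Hypothetical helper: true when the host should be rejected for a hosted scan.
// Non-IP hostnames other than localhost return false here; a real control
// would resolve them and check every resolved address as well.
function isPrivateOrLocalHost(host: string): boolean {
  const h = host.toLowerCase();
  if (h === "localhost" || h.endsWith(".localhost")) {
    return true;
  }
  const ip = parseIPv4(h);
  if (ip === null) {
    return false;
  }
  const a = ip[0] ?? -1;
  const b = ip[1] ?? -1;
  return (
    a === 0 || // "this network"
    a === 10 || // RFC 1918 10.0.0.0/8
    a === 127 || // loopback
    (a === 169 && b === 254) || // link-local, including cloud metadata endpoints
    (a === 172 && b >= 16 && b <= 31) || // RFC 1918 172.16.0.0/12
    (a === 192 && b === 168) // RFC 1918 192.168.0.0/16
  );
}
```

Checking the literal host is only the first layer: a public hostname can still resolve to a private address, so the same range test would need to run on resolved addresses before connecting.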
|
|
45
53
|
|
|
46
|
-
|
|
47
|
-
- 7,211 telemetry sessions
|
|
48
|
-
- 5,368 external sessions after separating internal activity
|
|
49
|
-
- 582 GitHub clones and 175 unique cloners in the visible June 2026 traffic window
|
|
50
|
-
- 104 npm downloads during June 11-17, 2026
|
|
54
|
+
For deeper context, see the [MCP Server Security Field Guide](./mcp-security-field-guide.md).
|
|
51
55
|
|
|
52
|
-
|
|
56
|
+
## Telemetry Intelligence
|
|
53
57
|
|
|
54
|
-
|
|
58
|
+
Telemetry is used privately to understand product usage and identify account-level signals without publishing raw personal data.
|
|
55
59
|
|
|
56
|
-
|
|
60
|
+
As of the latest local export on June 20, 2026:
|
|
57
61
|
|
|
58
|
-
- telemetry
|
|
59
|
-
-
|
|
60
|
-
-
|
|
61
|
-
-
|
|
62
|
-
-
|
|
63
|
-
-
|
|
62
|
+
- 10,918 telemetry events
|
|
63
|
+
- 7,380 total sessions
|
|
64
|
+
- 5,379 external sessions after separating internal activity
|
|
65
|
+
- 2,446 external CI sessions
|
|
66
|
+
- 138 attributed company/org sessions
|
|
67
|
+
- 11 attributed company/org candidates
|
|
64
68
|
|
|
65
|
-
Public claims
|
|
69
|
+
Public claims use aggregate or sanitized data only. Raw emails, hostnames, private URLs, tokens, and response bodies are not published.
|
|
70
|
+
|
|
71
|
+
## Distribution Strategy
|
|
72
|
+
|
|
73
|
+
The distribution wedge is useful CI for other MCP repositories. The certification campaign opens small, helpful PRs that add MCP compatibility/security checks and leave maintainers with a public trust signal.
|
|
74
|
+
|
|
75
|
+
Current public distribution proof includes:
|
|
76
|
+
|
|
77
|
+
- latest release: `v0.23.0`
|
|
78
|
+
- npm package: `@kryptosai/mcp-observatory`
|
|
79
|
+
- GitHub Action: `KryptosAI/mcp-observatory/action@main`
|
|
80
|
+
- visible GitHub traffic window: 721 clones and 221 unique cloners
|
|
81
|
+
- official MCP reference PR open and green: [`modelcontextprotocol/servers#4392`](https://github.com/modelcontextprotocol/servers/pull/4392)
|
|
82
|
+
- open certification PRs for Microsoft Playwright MCP, Upstash Context7, ExecuteAutomation Playwright MCP, and other MCP projects
|
|
83
|
+
|
|
84
|
+
See [reference evaluations](./reference-evaluations.md) and [public proof](./proof.md).
|
|
66
85
|
|
|
67
86
|
## Commercial Path
|
|
68
87
|
|
|
@@ -83,24 +102,35 @@ Current pilot anchors:
|
|
|
83
102
|
- Enterprise: starts at `$3k/month`
|
|
84
103
|
- Strategic: `$250k+/year`
|
|
85
104
|
|
|
86
|
-
##
|
|
105
|
+
## Professional Signal
|
|
87
106
|
|
|
88
|
-
|
|
107
|
+
MCP Observatory demonstrates applied work across:
|
|
89
108
|
|
|
90
|
-
- AI infrastructure
|
|
109
|
+
- AI agent infrastructure
|
|
91
110
|
- developer tooling
|
|
92
|
-
-
|
|
93
|
-
-
|
|
111
|
+
- secure tool invocation
|
|
112
|
+
- software supply chain thinking
|
|
94
113
|
- CI/CD integrations
|
|
95
114
|
- telemetry and product analytics
|
|
96
|
-
-
|
|
115
|
+
- open source distribution
|
|
116
|
+
- enterprise packaging
|
|
117
|
+
|
|
118
|
+
It is designed to be evaluated through public work: code, docs, CI integrations, reference evaluations, proof surfaces, and real maintainer PRs.
|
|
119
|
+
|
|
120
|
+
## Future Roadmap
|
|
121
|
+
|
|
122
|
+
Near-term milestones:
|
|
97
123
|
|
|
98
|
-
|
|
124
|
+
1. Convert certification PRs into accepted public integrations.
|
|
125
|
+
2. Publish recurring MCP safety reports.
|
|
126
|
+
3. Add stronger policy/provenance language for production MCP adoption.
|
|
127
|
+
4. Improve hosted artifact upload into a simple pilot workflow.
|
|
128
|
+
5. Convert serious production users into paid pilots.
|
|
99
129
|
|
|
100
|
-
|
|
130
|
+
Longer-term opportunities:
|
|
101
131
|
|
|
102
|
-
|
|
103
|
-
|
|
104
|
-
|
|
105
|
-
|
|
106
|
-
|
|
132
|
+
- policy controls for agent tool use
|
|
133
|
+
- provenance for MCP packages and configurations
|
|
134
|
+
- schema locks and controlled drift review
|
|
135
|
+
- runtime monitoring for production agent tool calls
|
|
136
|
+
- fleet inventory across teams, repositories, and hosts
|