@nullplatform/mcp 0.1.0

Files changed (51)

package/LICENSE +21 -0
package/README.md +252 -0
package/dist/config.js +26 -0
package/dist/git.js +27 -0
package/dist/http.js +330 -0
package/dist/i18n.js +595 -0
package/dist/index.js +72 -0
package/dist/md.js +110 -0
package/dist/np/auth.js +130 -0
package/dist/np/client.js +72 -0
package/dist/np/context.js +201 -0
package/dist/np/journey.js +403 -0
package/dist/prompts.js +64 -0
package/dist/render.js +236 -0
package/dist/server.js +91 -0
package/dist/skills.js +84 -0
package/dist/surfaces/developer.js +29 -0
package/dist/surfaces/index.js +17 -0
package/dist/surfaces/surface.js +1 -0
package/dist/tool-names.js +25 -0
package/dist/tool.js +92 -0
package/dist/tools/approvals.js +80 -0
package/dist/tools/builds.js +94 -0
package/dist/tools/create-app.js +187 -0
package/dist/tools/create-release.js +52 -0
package/dist/tools/create-scope.js +82 -0
package/dist/tools/deploy.js +178 -0
package/dist/tools/find-apps.js +36 -0
package/dist/tools/index.js +39 -0
package/dist/tools/logs.js +83 -0
package/dist/tools/metrics.js +83 -0
package/dist/tools/overview.js +110 -0
package/dist/tools/params.js +58 -0
package/dist/tools/playbook.js +39 -0
package/dist/tools/services.js +58 -0
package/dist/tools/set-params.js +58 -0
package/dist/tools/shared.js +141 -0
package/dist/tools/status.js +70 -0
package/dist/tools/traffic.js +74 -0
package/dist/ui.js +76 -0
package/package.json +65 -0
package/skills/deploying-safely/SKILL.md +54 -0
package/skills/incident-response/SKILL.md +52 -0
package/skills/platform-conventions/SKILL.md +61 -0
package/widgets-dist/create-app.html +830 -0
package/widgets-dist/find-apps.html +831 -0
package/widgets-dist/logs.html +830 -0
package/widgets-dist/manifest.json +8 -0
package/widgets-dist/metrics.html +829 -0
package/widgets-dist/np-panel.html +831 -0
package/widgets-dist/params.html +829 -0

package/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 nullplatform
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md ADDED

@@ -0,0 +1,252 @@
+# ai-mcp — nullplatform from your code assistant
+An MCP server that turns your code assistant (Claude Code, Cursor, Windsurf, …) into the
+**frontend for nullplatform**. Status, deploys, traffic, rollbacks, logs and config — from
+the place you already work, aware of the repo you're sitting in.
+```text
+you  › deploy this
+claude › 🚀 Deploying #4312 on dev — ⏳ waiting for instances
+         Created release 1.4.2 from build #991 (main @ab12cd34)
+         → Next: traffic percent:25 … traffic action:"finalize"
+```
+## Why
+The web dashboard walks you through create → build → release → scope → deploy → observe.
+Your assistant already knows your repo, your branch and what you just changed — so the same
+journey collapses into a sentence: most tools work with **zero arguments** inside a linked repo.
+## 60-second setup
+You need a nullplatform API key. The server runs locally via `npx` — no install, no clone.
+**Claude Code**
+```bash
+claude mcp add nullplatform -e NP_API_KEY=<your-key> -- npx -y @nullplatform/mcp
+```
+**Claude Desktop** — Settings → Developer → Edit Config (`claude_desktop_config.json`), then fully restart:
+```json
+{
+  "mcpServers": {
+    "nullplatform": {
+      "command": "npx",
+      "args": ["-y", "@nullplatform/mcp"],
+      "env": { "NP_API_KEY": "<your-key>" }
+    }
+  }
+}
+```
+**Cursor / Windsurf / others** — same shape in their `mcp.json` (`command: "npx"`, `args: ["-y", "@nullplatform/mcp"]`).
+**Remote/HTTP mode — bring your own key.** `npx -y @nullplatform/mcp --http 8080` → `http://host:8080/mcp`.
+The server holds **no credential** (it refuses to start if `NP_API_KEY`/`NP_BEARER` are in its
+environment). Every caller authenticates each request with their *own* key, so nullplatform's
+RBAC applies to each user individually — you can only see and do what your platform user can:
+```bash
+claude mcp add --transport http nullplatform https://host/mcp \
+  --header "Authorization: Bearer <your-NP_API_KEY>"
+```
+A pre-issued JWT works in place of the api key, and hosts that reserve `Authorization` can send
+`X-NP-API-Key: <key>` instead. Unauthenticated requests get a `401` + `WWW-Authenticate` hint.
+`GET /healthz` is the unauthenticated liveness probe. Requests carrying a browser `Origin` are
+rejected unless allow-listed (DNS-rebinding guard), there's a 1 MiB body cap, and a generous
+per-IP rate limit (default 600 req/min, tunable) so a key-rotating caller can't amplify load
+against the platform's `/token`.
+**Deploy it behind a TLS-terminating reverse proxy on a trusted network** — every request
+carries a bearer credential, so plain HTTP must never be exposed. The proxy should also enforce
+its own per-IP rate limits and connection caps; the in-process limiter is a backstop, not a
+replacement. `X-Forwarded-For` is only trusted when `NP_TRUST_PROXY` is set.
+The api_key → access-token **exchange is the trust boundary** and is treated like one. The
+customer's key is **never retained**: it arrives with each request, exists only in that
+request's async scope, and is used at most once per expiry window to (re)exchange — what's
+cached per user (keyed by the key's SHA-256, never the key) is only the platform-issued
+short-lived access token. Every credential is verified with the platform *before* it reaches
+any tool (invalid ones stop at the edge as `401 invalid_token`), verification and exchange are
+single-flighted per credential across concurrent requests, rejected keys are negative-cached
+for a minute (they can't hammer the platform's `/token` or evict verified users), and secrets
+never appear in logs or error bodies.
+| Env | Meaning |
+| --- | --- |
+| `NP_API_KEY` | **stdio only** — your API key (exchanged for a bearer automatically) |
+| `NP_BEARER` | **stdio only** — a pre-issued token instead (expires; for quick tests) |
+| `NP_API_BASE` | Override the API host (default `https://api.nullplatform.com`) |
+| `NP_BFF_BASE` | Override the dashboard BFF host (metrics) |
+| `NP_ALLOWED_ORIGINS` | `--http` only: comma-separated browser origins allowed to call (server-to-server MCP clients send no Origin and always pass) |
+| `NP_RATE_LIMIT_RPM` | `--http` only: per-IP requests/minute (default `600`; `0` disables — rely on the fronting proxy then) |
+| `NP_TRUST_PROXY` | `--http` only: set to `1`/`true` to key the rate limit off `X-Forwarded-For` (only behind a proxy that sets it) |
+| `NP_LANG` | Fallback answer language (`en`/`es`). The real driver is the **conversation**: every tool takes a `language` argument the assistant sets to the language the user is speaking, which wins over `Accept-Language` (HTTP) and `NP_LANG`/`LANG` (stdio). |
+## The tools (15)
+Every answer is scannable markdown with one **→ Next:** hint, plus structured JSON for the model.
+Write tools are **convergent under retries** — an agent that re-issues a call after a timeout
+converges instead of duplicating (see "Built for agents, not browsers" below).
+| Tool | What it does |
+| --- | --- |
+| `application_get` | **Start here.** Scopes × what's live (release + traffic), latest build/release, next action. Zero-arg in a linked repo; `deployment:<id>` watches one rollout. |
+| `organization_get` | Org-wide digest with no web equivalent: what's mid-rollout and what last failed, across all apps. Answers "is anything broken?" without naming an app. |
+| `application_list` | Org-wide app search (parallel + cached). |
+| `application_build_list` | Recent CI builds — status, branch, commit, age, whether released. Closes the push→CI→deploy gap; `build:<id>` shows its assets. |
+| `application_log_list` | Recent log lines for the app/scope. |
+| `application_parameter_list` | List configuration parameters (secrets masked). |
+| `application_metric_list` | Golden signals per scope — throughput, response time, error rate, CPU, memory — with sparkline trends (1h/3h/24h/7d). |
+| `application_deployment_create` | Ship: picks the latest successful build, cuts the release for you (semver bump), targets the only scope — all overridable. Reuses an in-flight rollout on retry. |
+| `application_deployment_update` | Move traffic (snaps to 1/5/10/25/50/75/90/95/99/100), `finalize`, or `rollback`. Finds the active rollout itself. |
+| `application_release_create` | Cut a release from a build explicitly (reuses an existing one for the same build). |
+| `application_parameter_create` | Create/update env vars & file params (mark `secret`) — upserts, never duplicates. Apply on next deploy. |
+| `application_create` | Link the current repo as a new application (name/URL inferred from the git remote). Returns the existing app if already linked. |
+| `application_scope_create` | Create a deploy target (lists the org's scope types when ambiguous). Returns the existing scope if the name is taken. |
+| `application_approval_list` | List the approvals gating an app (e.g. a deploy stuck on a policy) and `approve`/`cancel` one — using your own permissions, so the platform denies what you can't do. |
+| `application_service_list` | List an app's dependency services (DBs, queues…) and the provisionable catalog; deep-links into the dashboard to create. |
+Plus three slash-command prompts: **/ship** (deploy + walk traffic to 100% with health checks),
+**/setup** (link this repo), **/rollback** (get me out, now).
+## Built for agents, not browsers
+MCP usage differs from web usage, and the tools are shaped for it:
+- **Convergent writes.** Agents retry on timeout; a web user doesn't double-submit a form. So
+  every write reconciles against current state instead of blindly POSTing: `application_parameter_create` reuses
+  an existing parameter definition, `application_release_create` reuses a release already cut from the same
+  build, `application_create`/`application_scope_create` return the existing entity, and `application_deployment_create` returns the
+  in-flight rollout for the same release. Re-running a tool is safe.
+- **Org-wide reads.** `organization_get` answers cross-application questions ("what's broken?") that the
+  dashboard only answers one entity-page at a time.
+- **Approvals as a loop, not a notification.** When a deploy blocks on a policy, `application_approval_list`
+  lets the agent see and (if permitted) clear the gate rather than dead-ending.
+- **Honest async + permissions.** Long operations say so and hand back an id to re-query;
+  `401`/`403` surface in plain language because every call carries the caller's own token.
+## Interactive UI (MCP Apps)
+On hosts that render MCP Apps (claude.ai web/desktop, ChatGPT), the journey is interactive —
+text-only hosts (terminals) keep the markdown answers automatically:
+| Widget | Bound to | What you can do in it |
+| --- | --- | --- |
+| **Application panel** | `application_get`, `application_deployment_create`, `application_deployment_update` | Scope cards with live release/traffic/domain, release chips with one-click **ship**, **Deploy latest**, create-scope when empty — and it morphs into the live rollout: traffic slider (snapped marks), **Finalize**/**Rollback** with confirm, deployment log, self-refreshing. |
+| **Create application** | `application_create` | Name + repo + namespace form (opens automatically when no git remote is inferable), creates and reports provisioning. |
+| **Parameters** | `application_parameter_list` | Editable table — add env vars/files, mark secrets, save via `application_parameter_create`. |
+| **Logs** | `application_log_list` | Terminal pane with filter, refresh and auto-tail. |
+| **Metrics** | `application_metric_list` | Golden-signal cards with live canvas charts, window selector (1h/3h/24h/7d), auto-refresh. |
+| **Applications** | `application_list` | Filterable picker — click an app to open its panel. |
+Widgets are single self-contained HTML files (~330KB: Preact-compat runtime + the ext-apps
+SDK, whose embedded zod accounts for most of it) built inline with esbuild — the sandbox
+blocks CDN fetches. They speak MCP Apps protocol `2026-01-26`. They **adopt the host's
+design tokens** (`hostContext.styles`: colors, border radii, fonts) so the UI looks native —
+Claude-styled in Claude, ChatGPT-styled in ChatGPT — with dark/light handled live and our own
+palette only as the fallback for hosts that don't send tokens.
+## Skills ship with the server
+The connector also ships **operating playbooks** — the methodology travels with the tools:
+| Playbook | Teaches |
+| --- | --- |
+| `deploying-safely` | Pre-flight checks, canary traffic steps gated on metrics, finalize/rollback criteria |
+| `incident-response` | Mitigate-first triage: rollback, then read logs/metrics/params for the cause |
+| `platform-conventions` | Entity chain semantics, dimensions, versioning, parameters, traffic lifecycle, known gotchas |
+They're plain markdown under `skills/` — platform teams change agent behavior by editing
+text, no server redeploy. The model reads them through the **`playbook_get` tool**: a tool is
+the one MCP primitive every coding assistant (Claude Code, Cursor, Claude Desktop, …) exposes
+to the model, so this works everywhere and on demand. The server instructions carry the
+catalog (name + when-to-use) so the model knows which to read before which task; they're also
+listed as passive `playbook://nullplatform/<name>` resources. MCP has no "skills" primitive —
+standard Agent Skills load client-side — and the earlier `skill://` scheme made Claude Desktop
+route to its native Agent Skills executor ("Unknown skill") instead, which is why this is a
+tool.
+## How it knows your app
+1. Your client's MCP **roots** (workspace folders) → `git remote get-url origin`
+2. Fallback: the server's working directory
+3. The remote URL is matched against applications' `repository_url` (ssh/https equivalent), then by repo name
+4. No match? Tools say exactly that and offer `application_create` / `application_list`
+## Design principles
+- **Journey-shaped, not endpoint-shaped** — one tool per developer intent; the API chain
+  (build → asset → release → deployment) is the server's problem.
+- **Defaults that match what you meant** — `application_deployment_create` does what the dashboard needs five screens for.
+- **Markdown is the UI** — tables, status glyphs, a traffic bar, one next-step per answer.
+- **Honest writes** — provisioning/asynchronous things say so and tell you how to watch;
+  errors carry the platform's message, never a stack trace.
+- **The dashboard stays one click away** — entities link to `https://<org>.app.nullplatform.io/nrn/<nrn>`.
+## Develop
+```bash
+npm install
+npm test           # unit + full in-memory MCP round-trips over a fake API (builds widgets first)
+npm run lint       # Biome (lint + format)
+npm run typecheck  # server + widgets, strict (noUncheckedIndexedAccess on)
+npm run dev        # stdio server (tsx)
+npm run build      # dist/ + minified widgets (NP_WIDGET_DEBUG=1 keeps identifiers)
+```
+### Testing (the MCP way)
+The suite covers every level a real client exercises, without ever touching the live platform:
+| Layer | File | What it proves |
+| --- | --- | --- |
+| Unit | `test/unit.test.ts` | pure logic: semver, traffic snapping, locale matching, credential parsing, the presenter |
+| Protocol round-trip | `test/tools.test.ts` | the **SDK pattern**: a real MCP `Client` over `InMemoryTransport` against the real server — tool surface, text replies, structured contracts, widget bindings, skills, UI negotiation |
+| Tool scenarios | `test/scenarios.test.ts` | the behavior matrix: ask-backs (which scope?), dimension targeting, action side effects asserted on the recorded platform calls, 403→permission mapping, partial failures, `#id`/git-remote resolution, prompt rendering per language |
+| Widget DOM | `test/widgets.test.tsx` | **ui-mode**: widgets render each tool's `structuredContent` and user actions go back through the (mocked) host bridge as `tools/call` — ship chips, traffic slider snapping, confirm gates, form submission |
+| Transport boundary | `test/http.test.ts` | multi-user HTTP: per-request auth, token-exchange trust boundary, isolation, guards, per-request language |
+| Black box | `test/stdio.test.ts` | the **raw pattern**: spawns `src/index.ts` as a child process and speaks MCP over stdio like an installed client — entry point, env credential policy, negotiated text-only surface |
+Everything runs against `test/fake-api.ts`, an in-memory fake of the nullplatform API faithful
+to the live-verified contract (query scoping, snake_case bodies, async statuses, the multi-asset
+deploy trap) — fast, deterministic, and it records every call so action tests assert exactly
+what hit the platform.
+### Layers
+| Layer | Where | What it owns |
+| --- | --- | --- |
+| Platform API | `src/np/` | auth + token exchange, HTTP client, org context/caches, the typed journey API (builds → releases → scopes → deployments → metrics/logs). Tools never call the raw client — they go through `journey.ts`. |
+| Tool framework | `src/tool.ts` | `ToolSpec`/`ToolReply`, the presenter (one place replies become wire results), the per-tool error net |
+| Tools | `src/tools/` | one file per tool + `index.ts` registry + `shared.ts` resolution helpers |
+| Prompts | `src/prompts.ts` | declarative slash-command prompts |
+| Presentation (text) | `src/md.ts`, `src/render.ts` | the markdown design language and views |
+| Presentation (ui) | `src/ui.ts`, `src/widgets-react/` | widget registry, MCP Apps glue, the React widgets |
+| i18n | `src/i18n.ts` | `en`/`es` catalogs (compile-checked), locale scoping |
+| Transport | `src/http.ts`, `src/index.ts` | multi-user HTTP boundary (per-request auth, per-user backends, guards) and stdio entry |
+| Assembly | `src/server.ts` | Config → Deps → McpServer, wired from the registries |
+### Dual-mode replies
+Every tool returns one `ToolReply { markdown, data }`: the markdown is the complete answer
+for text hosts, the data feeds both the model (structuredContent) and the bound widget on
+ui hosts. In stdio sessions widgets are only registered once the client actually negotiates
+the MCP Apps extension; in stateless HTTP they're always offered and hosts that don't speak
+the extension simply ignore them (that's the spec's graceful degradation).
+### Extend
+- **New tool**: create `src/tools/<name>.ts` exporting `defineTool({...})` (schema, handler
+  returning a `ToolReply`, optional `widget` binding), add it to `src/tools/index.ts`. Its
+  widget auto-registers; the framework supplies error handling and presentation.
+- **New widget**: drop `src/widgets-react/widgets/<name>.tsx`, add it to the `WIDGETS` map in
+  `src/ui.ts`, bind it from a tool spec. The build picks it up.
+- **New prompt**: append a spec to `src/prompts.ts`.
+- **New playbook**: drop `skills/<name>/SKILL.md` — picked up automatically by the `playbook_get`
+  tool and the instruction catalog.
+- **New language**: add one catalog object in `src/i18n.ts` — completeness is compile-checked.
+- **New platform call**: add a typed function to `src/np/journey.ts`; tools never touch the raw client.
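
Note on the README's traffic marks: the tool table says `application_deployment_update` snaps traffic to 1/5/10/25/50/75/90/95/99/100, but the snapping rule itself is not part of this section of the diff. The sketch below assumes "nearest mark, ties go to the lower mark". The names `TRAFFIC_MARKS` and `snapTraffic` are hypothetical and are not taken from the package.

```javascript
// Allowed traffic marks, copied from the README's tool table.
const TRAFFIC_MARKS = [1, 5, 10, 25, 50, 75, 90, 95, 99, 100];

// Hypothetical helper (an assumption, not the package's code):
// snap a requested percentage to the nearest allowed mark.
// The strict "<" keeps the first best candidate, so ties resolve to the lower mark.
function snapTraffic(percent) {
    if (!Number.isFinite(percent)) {
        throw new TypeError("percent must be a finite number");
    }
    return TRAFFIC_MARKS.reduce((best, mark) =>
        Math.abs(mark - percent) < Math.abs(best - percent) ? mark : best);
}

console.log(snapTraffic(30)); // nearest mark to 30
console.log(snapTraffic(0));  // below the smallest mark
```

Under this assumption a request for 30% lands on 25 and anything at or below 1 lands on 1; the shipped server may round differently (for example, always upward).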

package/dist/config.js ADDED

@@ -0,0 +1,26 @@
+export function loadConfig(env = process.env, policy = "require") {
+    const apiKey = env.NP_API_KEY?.trim() || undefined;
+    const bearer = env.NP_BEARER?.trim() || undefined;
+    const base = {
+        apiBase: env.NP_API_BASE?.trim() || "https://api.nullplatform.com",
+        bffBase: env.NP_BFF_BASE?.trim() || "https://bff-dashboard.nullplatform.io",
+    };
+    if (policy === "forbid") {
+        if (apiKey || bearer) {
+            throw new Error("HTTP mode is multi-user: the server must not hold credentials. Remove NP_API_KEY/NP_BEARER " +
+                "from the environment — each caller sends their own key per request:\n" +
+                "  Authorization: Bearer <NP_API_KEY>   (or a pre-issued JWT, or the X-NP-API-Key header)");
+        }
+        if (env.NP_HTTP_TOKEN?.trim()) {
+            throw new Error("NP_HTTP_TOKEN was removed: the shared-secret gate is superseded by per-user authentication. " +
+                "Callers now send their own nullplatform credential (Authorization: Bearer <NP_API_KEY>).");
+        }
+        return base;
+    }
+    if (!apiKey && !bearer) {
+        throw new Error("nullplatform credentials missing. Set NP_API_KEY (recommended) or NP_BEARER.\n" +
+            "  Claude Code:  claude mcp add nullplatform -e NP_API_KEY=<key> -- npx -y @nullplatform/mcp\n" +
+            '  Cursor:       add "env": {"NP_API_KEY": "<key>"} to the server entry in mcp.json');
+    }
+    return { ...base, apiKey, bearer };
+}
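
Note on the two credential policies: `loadConfig` above behaves differently for stdio (`"require"`) and HTTP (`"forbid"`). The following is a condensed restatement for illustration only, with shortened error messages and without the `NP_HTTP_TOKEN` check. `loadConfigSketch` and `DEFAULTS` are hypothetical names, and the `env` default is an empty object here instead of `process.env` so the sketch runs anywhere.

```javascript
// Default hosts as they appear in the diff above.
const DEFAULTS = {
    apiBase: "https://api.nullplatform.com",
    bffBase: "https://bff-dashboard.nullplatform.io",
};

// Condensed sketch of the policy logic (not the shipped function).
function loadConfigSketch(env = {}, policy = "require") {
    const apiKey = env.NP_API_KEY?.trim() || undefined;
    const bearer = env.NP_BEARER?.trim() || undefined;
    const base = {
        apiBase: env.NP_API_BASE?.trim() || DEFAULTS.apiBase,
        bffBase: env.NP_BFF_BASE?.trim() || DEFAULTS.bffBase,
    };
    if (policy === "forbid") {
        // HTTP mode is multi-user: a credential in the environment is an error.
        if (apiKey || bearer) throw new Error("HTTP mode must not hold credentials");
        return base;
    }
    // stdio mode: exactly the opposite, a credential is mandatory.
    if (!apiKey && !bearer) throw new Error("credentials missing");
    return { ...base, apiKey, bearer };
}

// stdio: the key is trimmed and returned alongside the default hosts.
const stdioConfig = loadConfigSketch({ NP_API_KEY: "  my-key  " });
console.log(stdioConfig.apiKey, stdioConfig.apiBase);

// HTTP: no credential fields are returned at all.
const httpConfig = loadConfigSketch({}, "forbid");
console.log(Object.keys(httpConfig));

// HTTP with a credential in the environment: refuses to start.
try {
    loadConfigSketch({ NP_BEARER: "token" }, "forbid");
} catch (error) {
    console.log(error.message);
}
```

The design choice worth noting is that the same environment variable is required in one mode and rejected in the other, which is what lets the HTTP server guarantee it never holds a shared credential.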

package/dist/git.js ADDED

@@ -0,0 +1,27 @@
+import { execFile } from "node:child_process";
+/**
+ * Normalize a git remote URL so dashboard-style https URLs and ssh remotes compare equal:
+ *   git@github.com:org/repo.git  ->  github.com/org/repo
+ *   https://github.com/org/repo  ->  github.com/org/repo
+ */
+export function normalizeRepoUrl(url) {
+    let normalized = url.trim().toLowerCase();
+    normalized = normalized.replace(/^git@([^:]+):/, "$1/");
+    normalized = normalized.replace(/^[a-z+]+:\/\//, "");
+    normalized = normalized.replace(/^[^@]+@/, "");
+    normalized = normalized.replace(/\/+$/, "").replace(/\.git$/, "");
+    return normalized;
+}
+export function repoName(url) {
+    const parts = normalizeRepoUrl(url).split("/");
+    return parts[parts.length - 1] ?? "";
+}
+function git(args, cwd) {
+    return new Promise((resolve) => {
+        execFile("git", args, { cwd, timeout: 3_000 }, (error, stdout) => resolve(error ? undefined : stdout.trim() || undefined));
+    });
+}
+/** The origin remote of the repo at `dir` (or the repo containing it), if any. */
+export async function detectRepoUrl(dir) {
+    return git(["config", "--get", "remote.origin.url"], dir);
+}
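
Note on how the normalized URL might be used: the README says the remote is matched against each application's `repository_url` (ssh/https equivalent) and then by repo name. The matching code is not in this section of the diff, so the sketch below is an assumption. `normalizeRepoUrl` and `repoName` are copied from the hunk above; `matchApplication`, the `name` field, and the sample data are hypothetical.

```javascript
// Copied from git.js above so the sketch is self-contained.
function normalizeRepoUrl(url) {
    let normalized = url.trim().toLowerCase();
    normalized = normalized.replace(/^git@([^:]+):/, "$1/");
    normalized = normalized.replace(/^[a-z+]+:\/\//, "");
    normalized = normalized.replace(/^[^@]+@/, "");
    normalized = normalized.replace(/\/+$/, "").replace(/\.git$/, "");
    return normalized;
}
function repoName(url) {
    const parts = normalizeRepoUrl(url).split("/");
    return parts[parts.length - 1] ?? "";
}

// Hypothetical matcher: exact normalized URL first, then fall back to the repo name.
function matchApplication(remoteUrl, applications) {
    const wanted = normalizeRepoUrl(remoteUrl);
    const byUrl = applications.find(
        (app) => app.repository_url && normalizeRepoUrl(app.repository_url) === wanted);
    if (byUrl) return byUrl;
    const name = repoName(remoteUrl);
    return applications.find((app) => app.name?.toLowerCase() === name);
}

// Made-up sample data for illustration.
const applications = [
    { name: "Billing", repository_url: "https://github.com/Acme/billing" },
    { name: "web" },
];

// An ssh remote matches the https URL stored on the application.
console.log(matchApplication("git@github.com:acme/billing.git", applications)?.name);
// No URL match, so the repo name is used as the fallback.
console.log(matchApplication("https://gitlab.com/other/web.git", applications)?.name);
// No match at all returns undefined, the case where the tools offer application_create.
console.log(matchApplication("https://github.com/acme/unknown", applications));
```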

package/dist/http.js ADDED

@@ -0,0 +1,330 @@
+import { AsyncLocalStorage } from "node:async_hooks";
+import { createHash } from "node:crypto";
+import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
+import { resolveLocale, translate, withLocale } from "./i18n.js";
+import { CredentialRejectedError, exchangeApiKey, orgIdFromJwt } from "./np/auth.js";
+import { buildDeps, buildServer } from "./server.js";
+/**
+ * Multi-user streamable HTTP. The server holds no credential; every request carries the
+ * caller's own nullplatform key and the platform's RBAC decides what that user can do.
+ *
+ * The api_key -> access-token exchange is the trust boundary, treated like one:
+ * - the customer's key is NEVER retained: it lives only in the request's async scope,
+ *   and is used at most once per expiry window to (re)exchange — what gets cached per
+ *   user is exclusively the platform-issued short-lived access token
+ * - a credential is VERIFIED with the platform before it is ever handed a tool
+ * - each credential gets an isolated backend (token, org, caches) — never shared
+ * - verification and exchange are single-flighted per credential, across requests
+ * - rejected credentials are negative-cached, so bad keys can't hammer /token
+ *   or evict verified users from the cache
+ * - raw secrets are never logged and never used as map keys (their hash is)
+ */
+const MAX_BODY_BYTES = 1024 * 1024; // tool args are small; anything bigger is abuse
+const USER_CACHE_MAX = 500;
+const USER_IDLE_MS = 15 * 60_000;
+const REJECTED_TTL_MS = 60_000;
+const REJECTED_MAX = 10_000;
+const RATE_WINDOW_MS = 60_000;
+const RATE_LIMITER_MAX_IPS = 10_000;
+/** Generous per-IP ceiling: a chatty single user (polling widgets + actions) stays well
+ *  under it, but a key-rotating attacker can't amplify unbounded /token load. Tune with
+ *  NP_RATE_LIMIT_RPM (0 disables); behind a proxy, set NP_TRUST_PROXY to key off XFF. */
+const DEFAULT_RATE_LIMIT_RPM = 600;
+/**
+ * Slowloris / connection-exhaustion guard. Node's defaults (requestTimeout 5 min,
+ * headersTimeout 1 min) let a trickle client hold a socket — and the verified backend it
+ * cost a /token round-trip to build — open for minutes. The rate limiter counts requests,
+ * not slow in-flight reads, so bounding how long a client may take to deliver its headers
+ * and body is the complementary defense. Tool args are small (< 1 MiB); an honest client
+ * sends them well inside these windows.
+ */
+export const SERVER_TIMEOUTS_MS = {
+    /** Whole request (headers + body) must arrive within this. */
+    request: 30_000,
+    /** Headers alone must arrive within this — Node requires it be <= request. */
+    headers: 15_000,
+    /** Idle keep-alive socket reclaimed after this. */
+    keepAlive: 5_000,
+};
+/** Apply the slowloris timeouts to the HTTP server that fronts this handler. */
+export function hardenServerTimeouts(server) {
+    server.requestTimeout = SERVER_TIMEOUTS_MS.request;
+    server.headersTimeout = SERVER_TIMEOUTS_MS.headers;
+    server.keepAliveTimeout = SERVER_TIMEOUTS_MS.keepAlive;
+}
+/** Fixed-window per-key request limiter, bounded so the key set can't grow unboundedly. */
+function makeRateLimiter(limitPerWindow) {
+    const windows = new Map();
+    return (key) => {
+        if (limitPerWindow <= 0)
+            return true; // disabled
+        const now = Date.now();
+        const existing = windows.get(key);
+        if (!existing || now - existing.windowStart >= RATE_WINDOW_MS) {
+            if (windows.size >= RATE_LIMITER_MAX_IPS) {
+                for (const [knownKey, window] of windows) {
+                    if (now - window.windowStart >= RATE_WINDOW_MS)
+                        windows.delete(knownKey);
+                }
+                while (windows.size >= RATE_LIMITER_MAX_IPS) {
+                    const oldest = windows.keys().next().value;
+                    if (oldest === undefined)
+                        break;
+                    windows.delete(oldest);
+                }
+            }
+            windows.set(key, { count: 1, windowStart: now });
+            return true;
+        }
+        existing.count++;
+        return existing.count <= limitPerWindow;
+    };
+}
+function clientIp(request, trustProxy) {
+    if (trustProxy) {
+        const forwarded = request.headers["x-forwarded-for"];
+        const first = (Array.isArray(forwarded) ? forwarded[0] : forwarded)?.split(",")[0]?.trim();
+        if (first)
+            return first;
+    }
+    return request.socket.remoteAddress ?? "unknown";
+}
+/** Three base64url segments starting with a JSON header — a pre-issued JWT, not an api key. */
+function looksLikeJwt(value) {
+    return /^eyJ[A-Za-z0-9_-]*\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+$/.test(value);
+}
+/** `Authorization: Bearer <NP_API_KEY | JWT>`, or `X-NP-API-Key` for hosts that reserve Authorization. */
+export function credentialFrom(headers) {
+    const authorization = headers.authorization;
+    if (authorization) {
+        const bearerMatch = /^Bearer\s+(.+)$/i.exec(authorization.trim());
+        const value = bearerMatch?.[1]?.trim();
+        if (!value)
+            return undefined; // wrong scheme (Basic, …) — not ours
+        return looksLikeJwt(value) ? { raw: value, bearer: value } : { raw: value, apiKey: value };
+    }
+    const headerKey = headers["x-np-api-key"];
+    const value = (Array.isArray(headerKey) ? headerKey[0] : headerKey)?.trim();
+    return value ? { raw: value, apiKey: value } : undefined;
+}
+/** The in-flight request's credential — exists only inside that request's async scope. */
+const requestCredential = new AsyncLocalStorage();
+/**
+ * A TokenSource that takes the api key from the live request, never from storage.
+ * Expiry mid-conversation is fine: whichever request is current re-exchanges with
+ * the key it carried; concurrent callers share one exchange via the slot.
+ */
+function tokenSourceFor(slot, apiBase, fetchImpl) {
+    return {
+        get organizationId() {
+            return slot.organizationId;
+        },
+        invalidate() {
+            slot.accessToken = undefined;
+            slot.expiresAt = 0;
+        },
+        async getToken() {
+            const credential = requestCredential.getStore();
+            if (credential?.bearer)
+                return credential.bearer; // pre-issued token: pure pass-through, cache nothing
+            if (slot.accessToken && Date.now() < slot.expiresAt - 60_000)
+                return slot.accessToken;
+            const apiKey = credential?.apiKey;
+            if (!apiKey)
+                throw new CredentialRejectedError();
+            slot.inflight ??= exchangeApiKey(apiBase, apiKey, fetchImpl)
+                .then((exchanged) => {
+                slot.accessToken = exchanged.accessToken;
+                slot.expiresAt = exchanged.expiresAt;
+                slot.organizationId = exchanged.organizationId;
+                return exchanged.accessToken;
+            })
+                .finally(() => {
+                slot.inflight = undefined;
+            });
+            return slot.inflight;
+        },
+    };
+}
+export function createMcpHttpHandler(config, options = {}) {
+    const fetchImpl = options.fetchImpl ?? fetch;
+    const allowedOrigins = new Set(options.allowedOrigins ?? []);
+    const surface = options.surface;
+    const rateLimit = makeRateLimiter(options.rateLimitRpm ?? DEFAULT_RATE_LIMIT_RPM);
+    const trustProxy = options.trustProxy ?? false;
+    // Verified per-user backends, LRU + idle eviction. Only credentials the platform
+    // accepted ever enter this map — unverified input cannot displace verified users.
+    const users = new Map();
+    // First sight of a credential, in flight: concurrent requests share one verification.
+    const creating = new Map();
+    // Credentials the platform rejected: answered 401 locally for a while.
+    const rejected = new Map();
+    /** Prove the credential to the platform before serving it: throws CredentialRejectedError. */
+    const verify = async (credential) => {
+        // The backend keeps the slot (platform-issued state) — the credential stays out of it.
+        const slot = {
+            expiresAt: 0,
+            organizationId: credential.bearer ? orgIdFromJwt(credential.bearer) : undefined,
+        };
+        const deps = buildDeps({ apiBase: config.apiBase, bffBase: config.bffBase }, fetchImpl, tokenSourceFor(slot, config.apiBase, fetchImpl));
+        if (credential.apiKey) {
+            await deps.tokens.getToken(); // the exchange itself is the validation
+        }
+        else {
+            // A platform-issued JWT always carries the org claim; prove it with a harmless read.
+            const organizationId = deps.tokens.organizationId;
+            if (!organizationId) {
+                throw new CredentialRejectedError(translate("error.bearerNoOrg"));
+            }
+            try {
+                await deps.np.get(`/organization/${organizationId}`);
+            }
+            catch (caught) {
+                // Optional chaining: a thrown value is not guaranteed to be an object.
+                const status = caught?.status;
+                if (status === 401 || status === 403)
+                    throw new CredentialRejectedError();
+                throw caught;
+            }
+        }
+        return deps;
+    };
+    const depsFor = (credential) => {
+        const credentialHash = createHash("sha256").update(credential.raw).digest("hex");
+        const now = Date.now();
+        const cached = users.get(credentialHash);
+        if (cached) {
+            users.delete(credentialHash); // re-insert: Map order doubles as the LRU order
+            cached.lastUsed = now;
+            users.set(credentialHash, cached);
+            return Promise.resolve(cached.deps);
+        }
+        const rejectedUntil = rejected.get(credentialHash);
+        if (rejectedUntil !== undefined) {
+            if (now < rejectedUntil)
+                return Promise.reject(new CredentialRejectedError());
+            rejected.delete(credentialHash); // window over — let it try the platform again
+        }
+        const pending = creating.get(credentialHash);
+        if (pending)
+            return pending;
+        const create = (async () => {
+            const deps = await verify(credential);
+            for (const [userKey, entry] of users) {
+                if (Date.now() - entry.lastUsed > USER_IDLE_MS)
+                    users.delete(userKey);
+            }
+            while (users.size >= USER_CACHE_MAX) {
+                const oldest = users.keys().next().value;
+                if (oldest === undefined)
+                    break;
+                users.delete(oldest);
+            }
+            users.set(credentialHash, { deps, lastUsed: Date.now() });
+            return deps;
+        })();
+        creating.set(credentialHash, create);
+        void create
+            .catch((caught) => {
+            if (caught instanceof CredentialRejectedError) {
+                // Evict the oldest entries (Map insertion order = LRU), never flush the whole
+                // cache — a key-rotating attacker must not be able to re-open the platform's
+                // /token to previously-rejected keys by overflowing this map.
+                while (rejected.size >= REJECTED_MAX) {
+                    const oldest = rejected.keys().next().value;
+                    if (oldest === undefined)
+                        break;
+                    rejected.delete(oldest);
+                }
+                rejected.set(credentialHash, Date.now() + REJECTED_TTL_MS);
+            }
+        })
+            .finally(() => creating.delete(credentialHash));
+        return create;
+    };
+    const reply = (response, status, body, headers = {}) => response
+        .writeHead(status, { "content-type": "application/json", ...headers })
+        .end(JSON.stringify(body));
+    const handle = async (request, response) => {
+        try {
+            const url = request.url ?? "";
+            if (request.method === "GET" && (url === "/healthz" || url.startsWith("/healthz?"))) {
+                reply(response, 200, { status: "ok" });
+                return;
+            }
+            if (!url.startsWith("/mcp")) {
+                response.writeHead(404).end();
+                return;
+            }
+            // Per-IP rate limit before any auth/verify work, so a key-rotating attacker can't
+            // amplify /token load or pin the event loop. The default is generous; a fronting
+            // proxy should enforce its own limit as well.
+            if (!rateLimit(clientIp(request, trustProxy))) {
+                reply(response, 429, { error: translate("http.rateLimited") }, { "Retry-After": String(Math.ceil(RATE_WINDOW_MS / 1000)) });
+                return;
+            }
+            // DNS-rebinding/browser guard: MCP clients call server-to-server and send no Origin.
+            const origin = request.headers.origin;
+            if (origin && !allowedOrigins.has(origin)) {
+                reply(response, 403, { error: translate("http.originDenied") });
+                return;
+            }
+            if (request.method !== "POST") {
+                // Stateless mode: no SSE stream, no sessions — only POST is meaningful.
+                response.writeHead(405, { Allow: "POST" }).end();
+                return;
+            }
+            const credential = credentialFrom(request.headers);
+            if (!credential) {
+                reply(response, 401, { error: translate("http.authRequired"), hint: translate("http.authHint") }, { "WWW-Authenticate": 'Bearer realm="nullplatform"' });
+                return;
+            }
+            // Everything from verification to tool execution runs inside the request's async
+            // scope: the key is reachable there when an exchange is due, and nowhere else.
+            await requestCredential.run(credential, async () => {
+                // Verify before reading the body or touching MCP: invalid keys stop here, as a 401.
+                let deps;
+                try {
+                    deps = await depsFor(credential);
+                }
+                catch (caught) {
+                    if (caught instanceof CredentialRejectedError) {
+                        reply(response, 401, { error: caught.message }, { "WWW-Authenticate": 'Bearer realm="nullplatform", error="invalid_token"' });
+                    }
+                    else {
+                        console.error("[ai-mcp] credential verification unavailable:", caught instanceof Error ? caught.message : caught);
+                        reply(response, 502, { error: translate("http.verifyUnavailable") });
+                    }
+                    return;
+                }
+                let body;
+                try {
+                    const chunks = [];
+                    let size = 0;
+                    for await (const chunk of request) {
+                        size += chunk.length;
+                        if (size > MAX_BODY_BYTES) {
+                            reply(response, 413, { error: translate("http.bodyTooLarge") });
+                            return;
+                        }
+                        chunks.push(chunk);
+                    }
+                    body = chunks.length ? JSON.parse(Buffer.concat(chunks).toString("utf8")) : undefined;
+                }
+                catch {
+                    reply(response, 400, { error: translate("http.invalidJson") });
+                    return;
+                }
+                // Fresh server+transport per request (stateless), wired to this caller's backend.
+                const transport = new StreamableHTTPServerTransport({ sessionIdGenerator: undefined });
+                response.on("close", () => void transport.close());
+                const server = buildServer(deps, surface ? { surface } : {});
+                await server.connect(transport);
+                await transport.handleRequest(request, response, body);
+            });
+        }
+        catch (caught) {
+            console.error("[ai-mcp] request failed:", caught instanceof Error ? caught.message : caught);
+            if (!response.headersSent)
+                response.writeHead(500).end();
+        }
+    };
+    // The caller's language travels with the request, exactly like their credential.
+    return (request, response) => withLocale(resolveLocale(request.headers["accept-language"]), () => handle(request, response));
+}
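
Annotation (not part of the package): `depsFor` uses a plain `Map` as an LRU cache by deleting and re-inserting an entry on every hit, so the first key in iteration order is always the least recently used. The sketch below isolates that technique together with the idle sweep. The `makeLruCache` helper and its parameters are hypothetical names chosen for illustration, not the package's API.

```javascript
// Hypothetical helper for illustration only; not the package's API.
function makeLruCache(maxEntries, idleMs, now = Date.now) {
    const entries = new Map(); // Map iteration order doubles as the LRU order
    return {
        get(key) {
            const entry = entries.get(key);
            if (!entry) return undefined;
            entries.delete(key); // re-insert moves the key to the newest end
            entry.lastUsed = now();
            entries.set(key, entry);
            return entry.value;
        },
        set(key, value) {
            const at = now();
            // Idle sweep: drop entries nobody has touched for longer than idleMs.
            for (const [otherKey, entry] of entries) {
                if (at - entry.lastUsed > idleMs) entries.delete(otherKey);
            }
            entries.delete(key); // an overwrite must not count toward capacity
            // Capacity eviction: remove from the oldest end until there is room.
            while (entries.size >= maxEntries) {
                const oldest = entries.keys().next().value;
                if (oldest === undefined) break;
                entries.delete(oldest);
            }
            entries.set(key, { value, lastUsed: at });
        },
        keys: () => [...entries.keys()],
    };
}

const cache = makeLruCache(2, 60_000);
cache.set("a", 1);
cache.set("b", 2);
cache.get("a");    // "a" becomes the most recently used entry
cache.set("c", 3); // evicts "b", the least recently used entry
const remaining = cache.keys();
console.log(remaining); // [ 'a', 'c' ]
```

Evicting one entry at a time from the oldest end, rather than clearing the map when it fills up, matches the reasoning in the source comments about the `rejected` map: an overflow should only cost the oldest entries.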