npm - backend-manager - Versions diffs - 5.6.3 → 5.7.0 - Mend

backend-manager 5.6.3 → 5.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (47)

package/CHANGELOG.md +43 -0
package/CLAUDE.md +4 -3
package/PROGRESS.md +34 -0
package/docs/ai-library.md +62 -11
package/docs/cdp-debugging.md +44 -0
package/docs/cli-output.md +22 -10
package/docs/mcp.md +166 -43
package/docs/test-framework.md +2 -2
package/package.json +1 -1
package/plans/mcp2.md +247 -0
package/src/cli/commands/mcp.js +8 -2
package/src/cli/commands/serve.js +155 -29
package/src/cli/commands/setup-tests/base-test.js +8 -0
package/src/cli/commands/setup-tests/firebase-auth.js +26 -0
package/src/cli/commands/setup-tests/firebase-cli.js +9 -13
package/src/cli/commands/setup-tests/index.js +4 -0
package/src/cli/commands/setup-tests/java-installed.js +26 -0
package/src/cli/commands/setup.js +2 -1
package/src/cli/commands/test.js +13 -0
package/src/cli/index.js +14 -0
package/src/cli/utils/ui.js +27 -5
package/src/manager/index.js +8 -3
package/src/manager/libraries/ai/index.js +45 -1
package/src/manager/libraries/ai/providers/anthropic-format.js +234 -0
package/src/manager/libraries/ai/providers/anthropic.js +28 -49
package/src/manager/libraries/ai/providers/claude-code.js +21 -47
package/src/manager/libraries/ai/providers/openai.js +154 -19
package/src/manager/libraries/ai/providers/test.js +242 -0
package/src/manager/libraries/email/data/disposable-domains.json +465 -0
package/src/mcp/client.js +48 -13
package/src/mcp/handler.js +222 -69
package/src/mcp/index.js +48 -18
package/src/mcp/tools.js +150 -0
package/src/mcp/utils.js +108 -0
package/src/test/fixtures/firebase-project/firebase.json +1 -1
package/src/test/test-accounts.js +31 -0
package/test/ai/tools-live.js +170 -0
package/test/email/marketing-lifecycle.js +10 -5
package/test/helpers/ai-test-provider.js +202 -0
package/test/helpers/ai-tools-format.js +350 -0
package/test/mcp/discovery.js +53 -0
package/test/mcp/oauth.js +161 -0
package/test/mcp/protocol.js +268 -0
package/test/mcp/roles.js +168 -0
package/test/mcp/utils.js +245 -0
package/test/routes/marketing/webhook.js +37 -33
package/.claude/settings.local.json +0 -12

package/CHANGELOG.md CHANGED

@@ -14,6 +14,49 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
 - `Fixed` for any bug fixes.
 - `Security` in case of vulnerabilities.
+# [5.7.0] - 2026-06-17
+### Added
+- **MCP role-based tool scoping.** 25 tools (was 19) with admin/user/public roles. Admin sees all, user sees `get_user` + `get_subscription` + `health_check`, unauthenticated gets 401 triggering OAuth. Defense-in-depth: route-level auth still validates.
+- **MCP OAuth user authentication.** OAuth 2.1 with PKCE + dynamic client registration (RFC 7591). User sign-in via consumer website's `/token` page → Firebase ID token → exchanged for `api.privateKey`. Verified end-to-end in Claude Desktop.
+- **MCP consumer tools.** Consumer projects define custom MCP tools in `functions/mcp.js` — route delegation (works on stdio + HTTP) or handler mode (HTTP only). Consumer tools override same-name built-ins.
+- **MCP tool annotations.** `title`, `readOnlyHint`, `destructiveHint`, `idempotentHint`, `openWorldHint` on all tools. Claude Desktop shows read/write categorization and human-readable titles.
+- **6 new MCP tools:** `update_post`, `update_campaign`, `delete_campaign`, `create_contact`, `delete_contact`, `get_payment_portal`.
+- **HTTPS local dev.** `npx mgr serve` starts an HTTPS proxy on port 5002 (firebase serve on 5443 internally) with auto-generated mkcert certificates. Claude Desktop requires HTTPS for MCP connectors. Disable with `--no-https`.
+- **MCP CLI `--token` flag.** `npx mgr mcp --token <api-key>` for user-level stdio connections.
+### Changed
+- **MCP discovery endpoints** use root-level issuer per RFC 8414 (was path-scoped, broke Claude Desktop's discovery chain).
+- **`getApiUrl()`** returns `https://localhost:<port>` when `BEM_HTTPS_PORT` is set (serve command sets this automatically).
+- **`cancel_subscription`, `refund_payment`, `generate_uuid`** moved from user/public to admin role (destructive operations and dev utilities shouldn't be in user-facing MCP).
+- **`resolveConsumerAuthUrl()`** uses `Manager.getWebsiteUrl()` (auto-resolves localhost in dev) instead of `brand.url` (always production).
+# [5.6.6] - 2026-06-15
+### Added
+- **`'warn'` return type for setup checks.** `run()` can now return the string `'warn'` for non-blocking failures — the check prints as `⚠` with detail lines from `getWarning()`, is counted in the summary (`36 passed, 1 warned, 0 failed`), but does **not** halt setup. `BaseTest` provides a default `getWarning()` returning `[]`. `Summary` gains a `warn(name, details)` method alongside `pass()` and `fail()`.
+- **Java setup check** (`setup-tests/java-installed.js`). Checks whether Java is installed (required by the Firebase Firestore emulator for testing). Uses the `'warn'` return type — setup continues without Java, but the summary reports it.
+- **Java pre-check in test runner.** `npx mgr test` now checks for Java before starting the emulator and fails fast with a clear message (`Java is required to run tests`) instead of the raw emulator crash.
+### Changed
+- **Firebase CLI and auth setup checks no longer halt setup.** Both checks now use the `'warn'` return type instead of throwing from `fix()`. Missing Firebase CLI or unauthenticated state is reported in the summary but does not block the remaining checks.
+# [5.6.5] - 2026-06-14
+### Added
+- **Cross-provider native tool calling in the AI library (agentic loops).** `ai.request()` now supports a unified tools interface on every text provider: `tools.list` accepts normalized function tools (`{ name, description, parameters }` — JSON Schema), `tools.choice` accepts `'auto' | 'required' | 'none' | { name }`, and the response adds `toolCalls: [{ id, name, arguments }]` (arguments parsed) plus `stopReason: 'tool_use' | 'end' | 'max_tokens'`. The Anthropic and claude-code providers gain native `tool_use` via a shared pure formatter (`providers/anthropic-format.js` — tool defs → `input_schema`, choice mapping incl. `required`→`any`, response extraction); the OpenAI provider normalizes function tools to the Responses API envelope (hosted tools like `{ type: 'web_search' }` still pass verbatim, and throw a clear error on Anthropic) and extracts `function_call` items. Multi-turn loop continuation is first-class through `options.messages`: `{ role: 'assistant', toolCalls }` and `{ role: 'tool', toolCallId, content }` turns map to each provider's wire format (consecutive tool results merge into one Anthropic user turn; OpenAI gets `function_call`/`function_call_output` items), raw Anthropic block arrays replay verbatim, and `normalizeOptions()` no longer string-flattens structured conversations (system-prompt injections still apply). OpenAI additionally gains a direct-messages mode: passing `messages[]` now sends ALL turns (previously middle turns were silently dropped in favor of prompt/message/history). JSON parsing (`response: 'json'`) is skipped on tool-call turns, where empty text is the normal intermediate state. Return shapes stay backward-compatible (`{ content, output, tokens, raw }` unchanged; OpenAI now also returns `raw`). Covered by `test/helpers/ai-tools-format.js` (pure, 22 cases) and `test/ai/tools-live.js` (extended-mode real 2-step tool loops on both providers; OpenAI live-validated).
+- **Deterministic `test` AI provider** (`providers/test.js`, registered as `provider: 'test'`) — the AI analog of the `test` payment processor: a first-class provider that consumer suites drive with directives embedded in the last user message (`[[tool:name {json}]]`, `[[tools:[...]]]` for parallel calls, `[[reply:{json}]]`, `[[delay:ms]]`, `[[error:msg]]`), consumed sequentially across the turns of a tool loop. Refuses to run outside development/testing (Manager environment detection; falls back to `BEM_TESTING`/`FUNCTIONS_EMULATOR` signals when constructed without a Manager). Lets consumer chat/agent routes test the full loop — Firestore writes, usage, locks, tool executors — against the real emulator with zero paid API calls. Covered by `test/helpers/ai-test-provider.js`.
+- **OpenAI provider constructor hardened** — `assistant.Manager` access now uses optional chaining (matching the Anthropic provider), so the provider can be constructed with a minimal assistant context (direct construction in tests/tools).
+- **`docs/cdp-debugging.md` — launching a controllable browser (mirrored across UJM/BEM/BXM/EM).** The canonical Chrome launch for agents and humans: CDP port + REQUIRED dedicated `--user-data-dir` (Chrome 136+ silently ignores the debug port on the default profile — verified on 149), the persistent agent profile (`~/Library/Application Support/chrome-profiles/agent` — log in once, state survives relaunches, verified), the shared-instance model (CDP is multi-client — agents share the one logged-in Chrome on one port, one tab each; a second profile/port only for a second identity), safe quit by profile match, and driving via the `chrome-devtools` MCP (`CHROME_CDP_PORT` set before the session) or any CDP client. BEM flavor: aimed at verifying the frontend against your routes — watch network payloads and drive auth'd flows through the real UI. Indexed in CLAUDE.md.
+# [5.6.4] - 2026-06-11
+### Changed
+- **Consent side effects moved off shared test accounts (journey-account isolation).** The marketing webhook suite's revoke-event tests (`test/routes/marketing/webhook.js`) repeatedly write `consent.marketing.status = 'revoked'` to their target account — persistent side-effect data that previously landed on the shared `basic` account, leaving it revoked for the remainder of every run (and, since the v5.6.3 library consent gate, changing `sync()`/`add()` behavior for every later suite touching it). They now target a dedicated `journey-webhook-revoke` account. The extended-mode lifecycle suite (`test/email/marketing-lifecycle.js`) likewise now syncs a dedicated `journey-marketing-sync` account (`_test.allow_*` prefix) instead of the shared `consent-granted` sentinel (which the signup + consent-lifecycle suites rely on). Its cleanup step also now deletes the contact the suite actually created — previously it deleted the ADMIN account's contact, which (post-v5.6.3) revoked admin's doc consent mid-run AND left the synced contact behind in SendGrid/Beehiiv after every extended run. `docs/test-framework.md`'s journey-account rule now lists `consent.marketing` writes as a trigger. Validated: marketing route suites pass (46 passing / 10 env-gated skips / 0 failures).
+### Fixed
+- **Anonymous HMAC unsubscribe tests now actually run.** The self-test boot (`src/cli/commands/test.js`) injects a test-only `UNSUBSCRIBE_HMAC_KEY` into the process env (the emulated functions inherit it — same mechanism as the fixture webhook key), closing the fixture gap that left all 8 anon-HMAC tests in `test/routes/marketing/email-preferences.js` failing as "known env gap". Those tests are the route-level coverage for the v5.6.3 HMAC changes (signature validation, rate limiting, consent mirroring); marketing suites went from 38 passing + 8 failing to 46 passing + 0 failing.
 # [5.6.3] - 2026-06-11
 ### Fixed

package/CLAUDE.md CHANGED

@@ -70,7 +70,7 @@ Every feature ships with tests at EVERY surface it exposes — logic (`test/rout
 | `watch` | Auto-reload functions on file change |
 | `deploy` | Deploy Cloud Functions to Firebase |
 | `test` | Run framework + project test suites against an emulator |
-| `mcp` | Start the stdio MCP server (for Claude Code / Claude Desktop) |
+| `mcp` | Start the stdio MCP server (for Claude Code / Claude Desktop). Supports `--token <key>` for user-level connections |
 | `firestore:get/set/query/delete` | Direct Firestore reads/writes from the terminal |
 | `auth:get/list/delete/set-claims` | Manage Auth users from the terminal |
 | `logs:read` / `logs:tail` | Cloud Function logs from Google Cloud Logging |
@@ -136,8 +136,9 @@ Deep references live in `docs/`. **Whenever you make a behavioral change, update
 - [docs/file-naming.md](docs/file-naming.md) — naming table for routes, schemas, API commands, events, cron jobs, hooks
 - [docs/common-mistakes.md](docs/common-mistakes.md) — anti-pattern checklist (don't modify Manager internals, always await, increment-before-update, etc.)
 - [docs/audit.md](docs/audit.md) — full-audit check catalog (U-xx universal / BEM-xx / F-xx IDs with severity + scope), protocol + fix loop
+- [docs/cdp-debugging.md](docs/cdp-debugging.md) — launching a controllable Chrome (CDP) to verify the frontend against your routes (network payloads, auth'd flows via the persistent agent profile)
 - [docs/key-files.md](docs/key-files.md) — quick lookup for the most-touched files (Manager, helpers, auth events, cron, payment processors, CLI commands)
-- [docs/cli-output.md](docs/cli-output.md) — shared CLI styling module (`src/cli/utils/ui.js`): OMEGA-style banner/dividers/sections/status symbols + the `Summary` block; used by `setup`, adoptable by other commands
+- [docs/cli-output.md](docs/cli-output.md) — shared CLI styling module (`src/cli/utils/ui.js`): OMEGA-style banner/dividers/sections/status symbols + the `Summary` block (pass/warn/fail); setup check return types (`true`/`false`/`Error`/`'warn'`); used by `setup`, adoptable by other commands
 - [docs/environment-detection.md](docs/environment-detection.md) — `getEnvironment()` returns `'development' | 'testing' | 'production'` (mutually exclusive); gate side effects on the INTENTIONAL check (`isProduction()` for prod-only, `isDevelopment() || isTesting()` for local-or-test) — never `!isDevelopment()`. Plus the URL helper convention (always `Manager.getApiUrl()` — auto-resolves local in dev+test, never read `project.apiUrl`)
 - [docs/response-headers.md](docs/response-headers.md) — automatic `bm-properties` header
@@ -157,7 +158,7 @@ Deep references live in `docs/`. **Whenever you make a behavioral change, update
 - [docs/payment-system.md](docs/payment-system.md) — full payment pipeline: Intent → Webhook → On-Write → Transition; subscription model, statuses, `resolveSubscription()`, transition handlers, processor interface, product config, test processor
 - [docs/marketing-campaigns.md](docs/marketing-campaigns.md) — campaign CRUD routes, recurring campaigns, generator pipeline (newsletter), newsletter-driven blog article (`content.article.enabled`), template-owned schemas, asset hosting, seed campaigns
 - [docs/consent.md](docs/consent.md) — marketing consent capture: canonical `consent.{legal,marketing}` user-doc shape, signup-form capture, account-page toggle, HMAC unsub link (cross-provider unsub + re-add on resubscribe), admin contact-DELETE revoke mirror, SendGrid+Beehiiv webhook receivers, parent forwarder (`/marketing/webhook/forward`), library-level consent gate in `email.add()`/`email.sync()` (revoked-only skip), migration script template
-- [docs/mcp.md](docs/mcp.md) — Model Context Protocol server: 19 tools, stdio + HTTP transports, OAuth, Claude Chat/Code configuration
+- [docs/mcp.md](docs/mcp.md) — Model Context Protocol server: 25 tools with role-based scoping (22 admin / 2 user / 1 public), tool annotations (title, read/write hints), OAuth 2.1 with PKCE + dynamic client registration + consumer website sign-in, consumer MCP tools (`functions/mcp.js`), HTTPS local dev (mkcert), Claude Desktop/Chat/Code configuration
 ### Subsystems & Libraries

package/PROGRESS.md ADDED

@@ -0,0 +1,34 @@
+# Project Progress Tracker
+> Agents and maintainers should update this file regularly to reflect the current state of the project.
+## 🎯 Current Focus
+* **Goal:** MCP role-based tool scoping + consumer extensibility
+* **Current Phase:** Complete — all phases done, 44 tests passing, docs finalized
+* **Priority:** High
+* **Last Updated:** 2026-06-17 3:42 AM PDT
+* **Notes:** Ready to ship. Full OAuth flow verified in Claude Desktop. Role reassignment (16 admin / 2 user / 1 public), annotations, HTTPS serve, dynamic client registration all working.
+## 📌 Active Task List
+## ✅ Completed Task List
+* [x] Phase 1: MCP role-based tool scoping + consumer extensibility
+  * [x] Foundation utilities (`src/mcp/utils.js`)
+  * [x] Add `role` to all 19 tools (`src/mcp/tools.js`)
+  * [x] User token support in HTTP client (`src/mcp/client.js`)
+  * [x] Stdio server role filtering + consumer tools (`src/mcp/index.js`)
+  * [x] CLI `--token` flag + `cwd` passthrough (`src/cli/commands/mcp.js`)
+  * [x] HTTP handler — role filtering, OAuth user flow, consumer tool execution (`src/mcp/handler.js`)
+  * [x] Test suite — 44 tests across 5 files
+  * [x] Documentation — `docs/mcp.md`, `CLAUDE.md`
+  * [x] UJM `/token` page update (separate repo)
+* [x] Phase 2: HTTPS local dev + Claude Desktop MCP testing
+  * [x] HTTPS proxy in `npx mgr serve` (mkcert certs, port 5002 → 5443)
+  * [x] `getApiUrl()` returns `https://` when `BEM_HTTPS_PORT` is set
+  * [x] Fix OAuth discovery (root-level issuer per RFC 8414)
+  * [x] Add dynamic client registration (`POST /mcp/register`)
+  * [x] Fix 401 trigger for OAuth flow
+  * [x] Tool annotations (title, readOnlyHint, destructiveHint, etc.)
+  * [x] Role reassignment (cancel/refund/uuid → admin, user = read-only)
+  * [x] Consumer tool override bug fix (listing/execution sync)
+  * [x] Test MCP connection in Claude Desktop — verified end-to-end
+  * [x] Update tests (44 passing) + docs

package/docs/ai-library.md CHANGED

@@ -8,7 +8,7 @@
 | `anthropic` | `claude-sonnet-4-6` | Better at SVG illustrations and creative output |
 | `claude-code` | `claude-opus-4-7` | Same Claude models as `anthropic`, but bills a Claude Pro/Max **subscription** instead of API credits |
-Return shape (same for all providers): `{ content, output, tokens, raw }`.
+Return shape (same for all providers): `{ content, output, tokens, raw }` — plus `toolCalls` and `stopReason` when tools are in play (see Tools below).
 `options.response: 'json'` triggers JSON parsing — all providers strip fences and parse with JSON5 for robustness. `options.schema` enforces structure on OpenAI (real JSON schema) and is injected into the system prompt on Anthropic / claude-code.
@@ -36,14 +36,48 @@ Return shape (single image): `{ buffer, b64, mime, revisedPrompt, model, size, q
 API key resolution is the same as `request()` — `BACKEND_MANAGER_OPENAI_API_KEY` / `OPENAI_API_KEY` (process.env or config).
-## Tools / web search (OpenAI)
+## Tools — cross-provider function calling (agentic loops)
-Tools are nested under `options.tools` and opt-in — when omitted, no tools are sent and behavior is identical to a plain request:
+Tools are nested under `options.tools` and opt-in — when omitted, no tools are sent and behavior is identical to a plain request.
-- `tools.list` — array of tool definitions passed to the OpenAI Responses API verbatim. Built-in hosted tools (e.g. `{ type: 'web_search' }`, `{ type: 'code_interpreter' }`) OR custom function tools (`{ type: 'function', name, parameters }`).
-- `tools.choice` *(optional)* — maps to `tool_choice` (`'auto'` | `'required'` | `'none'`, or a specific tool). Omit to let OpenAI default to `auto`.
+- `tools.list` — array of tool definitions. **Normalized function tools** (`{ name, description, parameters }` where `parameters` is a JSON Schema object — `type: 'function'` optional) work on EVERY provider. Provider-specific hosted tools (e.g. `{ type: 'web_search' }`, `{ type: 'code_interpreter' }`) pass verbatim on OpenAI and throw a clear error on Anthropic/claude-code.
+- `tools.choice` *(optional)* — `'auto' | 'required' | 'none'`, or `{ name: 'tool_name' }` to force a specific tool. Mapped per provider (Anthropic: `auto`/`any`/`none`/`tool`).
-The most common use is OpenAI's built-in **web search** so the model finds and cites real, currently-live URLs instead of hallucinating them:
+When the model decides to call tools, the response carries them in normalized form:
+- `r.toolCalls` — `[{ id, name, arguments }]`, `arguments` already parsed to an object.
+- `r.stopReason` — `'tool_use' | 'end' | 'max_tokens'`.
+A tool-call turn legitimately has `content: ''` — `response: 'json'` parsing is skipped on tool-call turns (the caller is expected to continue the loop, not consume a final answer).
+### Loop continuation via `options.messages`
+Structured conversations pass the full turn history through `options.messages` with two cross-provider conventions:
+- Assistant tool-call turn: `{ role: 'assistant', content?, toolCalls: [{ id, name, arguments }] }` — or replay the provider's raw blocks (`{ role: 'assistant', content: r.raw.content }`) on Anthropic.
+- Tool result turn: `{ role: 'tool', toolCallId, content }` — consecutive tool results merge into one Anthropic user turn of `tool_result` blocks; OpenAI gets `function_call_output` items.
+```js
+const ai = Manager.AI(assistant);
+const messages = [
+  { role: 'system', content: 'Use tools to answer.' },
+  { role: 'user', content: 'What is the weather in Paris?' },
+];
+const tools = { list: [{ name: 'get_weather', description: '...', parameters: { type: 'object', properties: { city: { type: 'string' } }, required: ['city'] } }] };
+const first = await ai.request({ provider: 'anthropic', messages, tools });
+// first.stopReason === 'tool_use'; first.toolCalls = [{ id, name: 'get_weather', arguments: { city: 'Paris' } }]
+messages.push({ role: 'assistant', content: first.raw.content });          // or { role: 'assistant', toolCalls: first.toolCalls }
+messages.push({ role: 'tool', toolCallId: first.toolCalls[0].id, content: '{"temp":"21C"}' });
+const second = await ai.request({ provider: 'anthropic', messages, tools });
+// second.stopReason === 'end'; second.content is the final answer
+```
+`normalizeOptions()` detects structured conversations (tool turns / toolCalls / raw blocks) and leaves them intact — only the system turn gets the universal prompt injections. Plain-text `messages[]` keep their legacy behavior, except OpenAI now sends ALL turns (previously middle turns were dropped in favor of prompt/message/history).
+### Hosted web search (OpenAI only)
 ```js
 const r = await ai.request({
@@ -56,7 +90,22 @@ const r = await ai.request({
 });
 ```
-When tools are active, the response `output` array may contain tool-call items (e.g. `web_search_call`) alongside the `message`; the message-text extractor ignores non-message items, so `r.content` is unaffected. URL citations live in the returned `output` (message content) as `annotations` of type `url_citation`.
+URL citations live in the returned `output` (message content) as `annotations` of type `url_citation`.
+## `test` provider — deterministic scripted AI for test suites
+`provider: 'test'` is the AI analog of the `test` payment processor: a first-class provider that suites drive with directives in the LAST user message, so consumer routes exercise their full loop (Firestore writes, usage, locks, tool execution) against the real emulator with zero paid API calls. It **refuses to run outside development/testing**.
+Directives form a sequence consumed across loop turns (call N executes directive N-1, indexed by assistant turns after the last user turn). Directive values must not contain `]]` internally (a trailing JSON `]` is fine):
+| Directive | Behavior |
+|---|---|
+| `[[tool:name {json}]]` | Emit one tool call this step (`stopReason: 'tool_use'`) |
+| `[[tools:[{"name":"a","arguments":{}}, ...]]]` | Emit parallel tool calls this step |
+| `[[reply:{json}]]` | Final reply (parsed when `response: 'json'`) |
+| `[[delay:ms]]` | Modifier — delay the NEXT step (max 30s) |
+| `[[error:msg]]` | Throw at this step |
+| *(none / exhausted)* | Echo reply: `Echo: <message>` (`{ message }` in json mode) |
 ## `claude-code` provider — subscription billing
@@ -74,8 +123,10 @@ The legacy `src/manager/libraries/openai.js` is a thin compatibility shim that r
 | File | Purpose |
 |---|---|
-| `src/manager/libraries/ai/index.js` | Unified `AI` class (dispatches by provider) |
-| `src/manager/libraries/ai/providers/openai.js` | OpenAI provider (original `openai.js` content) |
-| `src/manager/libraries/ai/providers/anthropic.js` | Anthropic provider (Claude Messages API, x-api-key, API credits) |
-| `src/manager/libraries/ai/providers/claude-code.js` | claude-code provider (Claude Messages API, OAuth Bearer, subscription billing) |
+| `src/manager/libraries/ai/index.js` | Unified `AI` class (dispatches by provider; structured-messages detection) |
+| `src/manager/libraries/ai/providers/openai.js` | OpenAI provider (Responses API; direct-messages mode + tool envelopes) |
+| `src/manager/libraries/ai/providers/anthropic.js` | Anthropic provider (Claude Messages API, x-api-key, API credits, native tool_use) |
+| `src/manager/libraries/ai/providers/claude-code.js` | claude-code provider (Claude Messages API, OAuth Bearer, subscription billing, native tool_use) |
+| `src/manager/libraries/ai/providers/anthropic-format.js` | Shared pure formatters for both Claude providers (tool defs, message building, extraction) |
+| `src/manager/libraries/ai/providers/test.js` | Deterministic `test` provider (scripted directives; dev/testing only) |
 | `src/manager/libraries/openai.js` | Back-compat shim → providers/openai.js |
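Reviewer note (not part of the package diff): the two-step example in the "Loop continuation" section generalizes to a loop that runs until the model stops asking for tools. The sketch below relies only on the normalized shapes this doc describes (`toolCalls`, `stopReason`, `{ role: 'assistant', toolCalls }`, `{ role: 'tool', toolCallId, content }`). The names `runToolLoop`, `executors`, `maxSteps`, and `fakeAi` are hypothetical, and `fakeAi` is a stand-in for `Manager.AI(assistant)` so the sketch runs without any API call.

```javascript
// Hypothetical helper: keeps calling ai.request() until the model stops requesting tools.
// `executors` maps a tool name to an async function that receives the parsed arguments.
async function runToolLoop(ai, { provider, messages, tools, executors, maxSteps = 5 }) {
  const history = messages.slice(); // do not mutate the caller's array
  for (let step = 0; step < maxSteps; step++) {
    const r = await ai.request({ provider, messages: history, tools });
    const calls = r.toolCalls || [];
    if (r.stopReason !== 'tool_use' || calls.length === 0) {
      return { content: r.content, messages: history, steps: step + 1 };
    }
    // Record the assistant's tool-call turn, then one tool-result turn per call.
    history.push({ role: 'assistant', toolCalls: calls });
    for (const call of calls) {
      const run = executors[call.name];
      const result = run ? await run(call.arguments) : { error: `Unknown tool: ${call.name}` };
      history.push({ role: 'tool', toolCallId: call.id, content: JSON.stringify(result) });
    }
  }
  throw new Error(`Tool loop did not finish within ${maxSteps} steps`);
}

// Stand-in AI (assumption, for illustration only): asks for a tool first,
// then answers once a tool result is present in the history.
const fakeAi = {
  async request({ messages }) {
    const toolTurn = messages.find((m) => m.role === 'tool');
    if (!toolTurn) {
      return {
        content: '',
        stopReason: 'tool_use',
        toolCalls: [{ id: 'call_1', name: 'get_weather', arguments: { city: 'Paris' } }],
      };
    }
    return { content: `Weather: ${JSON.parse(toolTurn.content).temp}`, stopReason: 'end' };
  },
};
```

A `maxSteps` cap is a design choice of this sketch: it keeps a misbehaving model or executor from looping forever.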

package/docs/cdp-debugging.md ADDED

@@ -0,0 +1,44 @@
+# CDP Debugging (driving a live browser)
+How to launch a browser you can CONTROL — see a site live, screenshot it, click, type, read console logs, inspect network requests — for agents (Claude via MCP/CDP) and humans. BEM has no UI of its own; reach for this when **verifying the frontend that consumes your backend** (the UJM site, a deployed app): drive the auth flow, watch the network panel for calls to your routes, read the actual request/response payloads.
+> Mirrored across the four sister frameworks (UJM / BEM / BXM / EM) — same core section, framework-flavored. Edit all four together.
+## Launching a controllable Chrome (the canonical command)
+```bash
+open -gna "Google Chrome" --args \
+  --remote-debugging-port=9223 \
+  --user-data-dir="$HOME/Library/Application Support/chrome-profiles/agent" \
+  --no-first-run --no-default-browser-check \
+  --disable-background-timer-throttling \
+  --disable-backgrounding-occluded-windows \
+  --disable-renderer-backgrounding \
+  https://localhost:4000          # ← the frontend talking to your backend (or the IP from .temp/_config_browsersync.yml if localhost doesn't connect)
+```
+Verify it's up: `curl -s http://127.0.0.1:9223/json/version`
+The rules that make this work (each one learned the hard way):
+- **`open -gna` launches WITHOUT stealing focus.** `-g` = don't bring to foreground, `-n` = new instance (required — without it `open` just activates the already-running daily Chrome and the `--args` are ignored). Launching the Chrome binary directly ALWAYS activates the app and steals focus. Do NOT use `-j`/`--hide` — animations need a visible window; instead the three `--disable-*` flags keep timers/rAF/rendering at FULL speed while the window sits behind your work (verified: rAF at the display's native 120fps while backgrounded, focus never moved).
+- **`--user-data-dir` is REQUIRED, not optional.** Chrome 136+ **silently ignores** `--remote-debugging-port` on the default profile — no error, no port, nothing (verified on Chrome 149). This is the #1 "why isn't CDP up" trap.
+- **The profile dir IS the persistent login state.** Cookies + localStorage survive relaunches (verified by round-trip). **Log into sites once in the agent profile and every agent reuses the authenticated state** — auth'd flows against your routes work without re-login. Ecosystem convention: ONE shared profile at `~/Library/Application Support/chrome-profiles/agent` across all four frameworks, so logins are a one-time setup.
+- **One Chrome instance per profile dir — but MANY agents per instance.** CDP is multi-client (verified: two concurrent clients driving different tabs of one instance): agents and sessions attach to the SAME port, each drives its own tab, and all share the profile's logins. One agent per tab is the only rule. A second launch with the same dir just opens a window in the existing instance and **ignores the new debug port** — attach to the running one instead. Reach for a second profile + port (`…/b` on 9224) only for a different IDENTITY (a different account = a different cookie jar) or hard isolation.
+- It runs **side-by-side with the daily Chrome** — a different `--user-data-dir` is a fully separate instance.
+- **Quit by profile match, never by app name**: `pkill -f "chrome-profiles/agent"`. (`osascript 'tell app "Google Chrome" to quit'` hits the daily browser too — same app name.)
+## Driving it
+| Client | Good for | Port handoff |
+|---|---|---|
+| `chrome-devtools` MCP | rich interaction — click, fill, type, screenshots, network requests, console messages, performance traces | `CHROME_CDP_PORT` env var, **expanded ONCE when the Claude session spawns its MCP — set it BEFORE launching `claude`** (mid-session changes do nothing) |
+| Any CDP client — including EM's `npx mgr cdp` run from any EM project | quick JS eval, per-renderer screenshots | per invocation: `EM_CDP_PORT=9223 npx mgr cdp eval ":4000" 'document.title'` |
+Port conventions: **9222** = Electron apps (EM), **9223+** = Chrome instances.
+## BEM specifics
+- **The UJM dev site is HTTPS.** BrowserSync serves over HTTPS (self-signed cert). Prefer `https://localhost:4000`; fall back to the machine's local network IP (e.g. `https://192.168.x.x:4000`) if localhost doesn't connect. Port 4000 by default, increments to 4001+ when multiple sites run. The exact URL is in `.temp/_config_browsersync.yml` at the root of the WEBSITE project (the UJM consumer — e.g. `<brand>-website/.temp/_config_browsersync.yml`, NOT this backend repo) — read that file first, every time, before navigating.
+- The network tab is the payoff: `list_network_requests` shows every call the frontend makes to your routes — method, status, and payloads — while you click through the real UI.
+- Backend-side observation stays where it always was: `npx mgr logs` (gcloud logs) and the emulator suite; this doc only covers the browser half of the loop.

package/docs/cli-output.md CHANGED

@@ -67,18 +67,20 @@ ui.rule();                                       // a bare 70-char rule string (
 ### `ui.Summary`
-Collects pass/fail outcomes and prints an OMEGA-style summary block (green `✅`
+Collects pass/warn/fail outcomes and prints an OMEGA-style summary block (green `✅`
 when all passed, yellow `⚠` otherwise).
 ```js
 const summary = new ui.Summary().start();
-summary.pass();                       // record a pass
+summary.pass();                        // record a pass
+summary.warn('check name', detailsArr);// record a warning (non-blocking)
 summary.fail('check name', detailsArr);// record a fail with pre-formatted detail lines
 summary.print({ hint: 'Fix the above, then run npx mgr setup again.' });
 ```
-`fail()`'s second arg is an array of already-styled lines shown indented under the
-failing check in the summary block.
+Results line: `36 passed, 1 warned, 0 failed` (the warned segment only appears
+when > 0). Warnings are listed before failures in the summary block — yellow `⚠`
+lines with their detail arrays indented beneath.
 ## How `setup` uses it
@@ -94,12 +96,22 @@ from these helpers:
 ### Test runner (`Main.prototype.test` in `src/cli/index.js`)
-Each setup check prints `    [N] <symbol> <name>`. A check can:
-- **pass** → `✓` (recorded via `setupSummary.pass()`).
-- **fail then auto-fix** → `⚠ … — fixing…` then `✓ fixed`.
-- **fail unfixably** → `✗ Could not fix: <message>`, recorded via
-  `setupSummary.fail(name, details)`, then `haltSetup()` prints the summary and
-  `process.exit(1)`.
+Each setup check prints `    [N] <symbol> <name>`. A check's `run()` can return:
+| Return value | Behavior |
+|---|---|
+| `true` | `✓` pass (recorded via `setupSummary.pass()`) |
+| `false` | Attempts `fix()`: `✓ fixed` on success; `✗ Could not fix` and halt if `fix()` throws |
+| `Error` | `✗` hard halt, no fix attempted |
+| `'warn'` | `⚠` non-blocking warning — reported in summary, does **not** halt |
+**`'warn'` return type:** When `run()` returns `'warn'`, the runner prints the
+check as `⚠`, calls `getWarning()` on the test instance for detail lines (array
+of strings), and records it via `setupSummary.warn()`. Setup continues. The
+summary shows `36 passed, 1 warned, 0 failed` with the warning details listed
+at the bottom. Use this for environment prerequisites that don't block dev/deploy
+(e.g. Java, optional CLIs). `BaseTest` provides a default `getWarning()` returning
+`[]`; override it with your detail lines.
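As a rough sketch of the `'warn'` path, assuming a check class shaped like the ones this doc describes: `BaseTestSketch`, `OptionalToolCheck`, and `runCheck` are hypothetical stand-ins, and the real `BaseTest` and runner in the package may differ. Only the `true` and `'warn'` rows of the table are modeled.

```javascript
// Stand-in for BEM's BaseTest; the real class may carry more behavior.
class BaseTestSketch {
  getWarning() { return []; }
}

// Hypothetical check that warns instead of failing when an optional tool is missing.
class OptionalToolCheck extends BaseTestSketch {
  constructor(isInstalled) {
    super();
    this.isInstalled = isInstalled;
  }

  async run() {
    return this.isInstalled ? true : 'warn';
  }

  getWarning() {
    return ['Optional tool not found; some local features may be unavailable.'];
  }
}

// Simplified runner step: records the outcome and returns the symbol to print.
async function runCheck(check, summary) {
  const result = await check.run();
  if (result === true) {
    summary.passed += 1;
    return '✓';
  }
  if (result === 'warn') {
    summary.warned.push(check.getWarning());
    return '⚠'; // non-blocking: the caller keeps going
  }
  throw new Error('false / Error results are outside this sketch');
}
```

The point of the design is that a warning never reaches the halt path: the runner records the detail lines and moves on to the next check.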
 A failing check's `fix()` may attach `error.summaryDetails` (an array of styled
 lines) to surface a compact version in the summary block — see

package/docs/mcp.md CHANGED

@@ -1,41 +1,163 @@
 # Model Context Protocol (MCP)
-BEM includes a built-in MCP server that exposes BEM routes as tools for Claude Chat, Claude Code, and other MCP clients.
+BEM includes a built-in MCP server that exposes BEM routes as tools for Claude Chat, Claude Code, Claude Desktop, and other MCP clients. The MCP layer is a thin wrapper over the existing BEM API — every tool maps to a route, and authentication goes through the same middleware pipeline.
 ## Architecture
 Two transport modes:
 - **Stdio** (local): `npx mgr mcp` — for Claude Code / Claude Desktop
-- **Streamable HTTP** (remote): `POST /backend-manager/mcp` — for Claude Chat (stateless, Firebase Functions compatible)
-## Available Tools (19)
-| Tool | Route | Description |
-|------|-------|-------------|
-| `firestore_read` | `GET /admin/firestore` | Read a Firestore document by path |
-| `firestore_write` | `POST /admin/firestore` | Write/merge a Firestore document |
-| `firestore_query` | `POST /admin/firestore/query` | Query a collection with where/orderBy/limit |
-| `send_email` | `POST /admin/email` | Send transactional email via SendGrid |
-| `send_notification` | `POST /admin/notification` | Send push notification via FCM |
-| `get_user` | `GET /user` | Get authenticated user info |
-| `get_subscription` | `GET /user/subscription` | Get subscription info for a user |
-| `sync_users` | `POST /admin/users/sync` | Sync user data across systems |
-| `list_campaigns` | `GET /marketing/campaign` | List marketing campaigns |
-| `create_campaign` | `POST /marketing/campaign` | Create a marketing campaign |
-| `get_stats` | `GET /admin/stats` | Get system statistics |
-| `cancel_subscription` | `POST /payments/cancel` | Cancel subscription at period end |
-| `refund_payment` | `POST /payments/refund` | Process a refund |
-| `run_cron` | `POST /admin/cron` | Trigger a cron job by ID |
-| `create_post` | `POST /admin/post` | Create a blog post |
-| `create_backup` | `POST /admin/backup` | Create a Firestore backup |
-| `run_hook` | `POST /admin/hook` | Execute a custom hook |
-| `generate_uuid` | `POST /general/uuid` | Generate a UUID |
-| `health_check` | `GET /test/health` | Check server health |
+- **Streamable HTTP** (remote): `POST /backend-manager/mcp` — for Claude Chat / Claude Desktop custom connectors (stateless, Firebase Functions compatible)
+## Roles
+Every tool has a `role` that controls who can see and call it:
+| Role | Who sees it | Tool count | Examples |
+|------|-------------|------------|---------|
+| `admin` | Admin key connections only | 22 | `firestore_read`, `send_email`, `cancel_subscription` |
+| `user` | Authenticated users + admins | 2 | `get_user`, `get_subscription` |
+| `public` | Everyone (after OAuth) | 1 | `health_check` |
+Admin connections see all tools. User connections see `user` and `public` tools. Unauthenticated connections get a 401 that triggers the OAuth flow; there is no unauthenticated tool access. As defense in depth, even if someone calls an admin tool by name, the underlying BEM route still rejects the request.
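The visibility rule can be sketched as a simple filter. This is an illustration under assumptions, not the logic in `src/mcp/utils.js`: `VISIBLE_ROLES` and `visibleTools` are hypothetical names, and an unknown connection level simply sees nothing here (the real server answers with a 401 instead).

```javascript
// Which tool roles each connection level may see, per the roles table.
const VISIBLE_ROLES = {
  admin: ['admin', 'user', 'public'],
  user: ['user', 'public'],
};

function visibleTools(tools, connectionRole) {
  const allowed = VISIBLE_ROLES[connectionRole] || [];
  // A tool without an explicit role is treated as admin-only.
  return tools.filter((tool) => allowed.includes(tool.role || 'admin'));
}

const tools = [
  { name: 'firestore_read', role: 'admin' },
  { name: 'get_user', role: 'user' },
  { name: 'health_check', role: 'public' },
];

console.log(visibleTools(tools, 'user').map((t) => t.name)); // [ 'get_user', 'health_check' ]
```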
+## Available Tools (25)
+| Tool | Role | Route | Description |
+|------|------|-------|-------------|
+| `firestore_read` | admin | `GET /admin/firestore` | Read a Firestore document by path |
+| `firestore_write` | admin | `POST /admin/firestore` | Write/merge a Firestore document |
+| `firestore_query` | admin | `POST /admin/firestore/query` | Query a collection with where/orderBy/limit |
+| `send_email` | admin | `POST /admin/email` | Send transactional email via SendGrid |
+| `send_notification` | admin | `POST /admin/notification` | Send push notification via FCM |
+| `get_user` | user | `GET /user` | Get authenticated user info |
+| `get_subscription` | user | `GET /user/subscription` | Get subscription info for a user |
+| `sync_users` | admin | `POST /admin/users/sync` | Sync user data across systems |
+| `list_campaigns` | admin | `GET /marketing/campaign` | List marketing campaigns |
+| `create_campaign` | admin | `POST /marketing/campaign` | Create a marketing campaign |
+| `get_stats` | admin | `GET /admin/stats` | Get system statistics |
+| `cancel_subscription` | admin | `POST /payments/cancel` | Cancel subscription at period end |
+| `refund_payment` | admin | `POST /payments/refund` | Process a refund |
+| `get_payment_portal` | admin | `POST /payments/portal` | Generate Stripe billing portal link |
+| `update_campaign` | admin | `PUT /marketing/campaign` | Update a pending campaign |
+| `delete_campaign` | admin | `DELETE /marketing/campaign` | Delete a pending campaign |
+| `create_contact` | admin | `POST /marketing/contact` | Add a marketing contact |
+| `delete_contact` | admin | `DELETE /marketing/contact` | Remove a marketing contact |
+| `run_cron` | admin | `POST /admin/cron` | Trigger a cron job by ID |
+| `create_post` | admin | `POST /admin/post` | Create a blog post |
+| `update_post` | admin | `PUT /admin/post` | Update an existing blog post |
+| `create_backup` | admin | `POST /admin/backup` | Create a Firestore backup |
+| `run_hook` | admin | `POST /admin/hook` | Execute a custom hook |
+| `generate_uuid` | admin | `POST /general/uuid` | Generate a UUID |
+| `health_check` | public | `GET /test/health` | Check server health |
+## Tool Annotations
+Every tool has MCP annotations that control how Claude Desktop categorizes and displays it:
+| Field | Purpose |
+|-------|---------|
+| `title` | Human-readable display name (e.g. "Get authenticated user info" instead of `get_user`) |
+| `readOnlyHint` | `true` → "Read-only tools" category in Claude Desktop |
+| `destructiveHint` | `true` → marked as destructive (cancel, refund) |
+| `idempotentHint` | `true` → safe to retry (firestore_write with merge) |
+| `openWorldHint` | `true` → touches external systems (email, notifications) |
+Consumer tools can set all the same annotations — they're passed through automatically.
 ## Authentication
-- **Stdio (local):** Reads `BACKEND_MANAGER_KEY` from `functions/.env` automatically
-- **HTTP (remote):** OAuth 2.1 Authorization Code flow with PKCE. Claude Chat handles the flow — user pastes BEM key once on the authorize page. If `OAuth Client ID` is set to the BEM key in the connector config, the authorize step auto-approves.
+### OAuth Flow (HTTP transport — Claude Desktop / Claude Chat)
+1. Client sends `POST /backend-manager/mcp` with no auth → 401 with `WWW-Authenticate` header
+2. Client discovers `/.well-known/oauth-protected-resource` → finds authorization server
+3. Client discovers `/.well-known/oauth-authorization-server` → gets endpoints
+4. Client registers via `POST /backend-manager/mcp/register` (RFC 7591 Dynamic Client Registration)
+5. Client opens browser to `/backend-manager/mcp/authorize`
+   - If `client_id` matches admin key → auto-redirects (admin access)
+   - Otherwise → redirects to consumer's website (`/token?redirect_uri=...&state=...&mcp=true`)
+6. User signs in on their familiar site, gets a Firebase ID token
+7. Consumer's `/token` page redirects back with `code={idToken}&state={state}`
+8. Client exchanges code: `POST /backend-manager/mcp/token` → BEM verifies ID token, returns `api.privateKey` as `access_token`
+9. Client uses the API key for all future MCP requests as `Authorization: Bearer {key}`
+The consumer auth URL is resolved from `Manager.getWebsiteUrl()` (auto-resolves localhost in dev, production domain otherwise), or overridden via `mcp.authUrl` in `backend-manager-config.json`.
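Steps 5 and 7 of the flow both come down to building URLs with the right query parameters. A minimal sketch follows, using only the parameter names given in the steps above; `buildConsumerAuthUrl` and `buildCallbackUrl` are hypothetical helpers, and the example hosts are placeholders.

```javascript
// Step 5: where the authorize endpoint sends a non-admin user (consumer's /token page).
function buildConsumerAuthUrl(websiteUrl, redirectUri, state) {
  const url = new URL('/token', websiteUrl);
  url.searchParams.set('redirect_uri', redirectUri);
  url.searchParams.set('state', state);
  url.searchParams.set('mcp', 'true');
  return url.toString();
}

// Step 7: the consumer's /token page returns to the client with the ID token as the code.
function buildCallbackUrl(redirectUri, idToken, state) {
  const url = new URL(redirectUri);
  url.searchParams.set('code', idToken);
  url.searchParams.set('state', state);
  return url.toString();
}

console.log(buildCallbackUrl('https://client.example/callback', 'tok', 'abc123'));
// https://client.example/callback?code=tok&state=abc123
```

Passing `state` through unchanged in both directions is what lets the client match the callback to the request it started.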
+### Admin (Stdio)
+```bash
+npx mgr mcp    # Reads BACKEND_MANAGER_KEY from functions/.env — sees all 25 tools
+```
+### User (Stdio)
+```bash
+npx mgr mcp --token <api-key>    # User-level — sees 3 tools (2 user + 1 public)
+```
+## Consumer MCP Tools
+Consumer projects expose custom MCP tools via a single `functions/mcp.js` file. Tools are automatically discovered and merged with the built-in tools.
+```js
+// functions/mcp.js
+module.exports = [
+  // Route delegation — points at an existing route (works on stdio + HTTP)
+  {
+    name: 'get_sponsorship',
+    description: 'Get sponsorship details by ID',
+    role: 'user',
+    method: 'GET',
+    path: 'sponsorship',
+    annotations: { title: 'Get sponsorship details', readOnlyHint: true },
+    inputSchema: {
+      type: 'object',
+      properties: {
+        id: { type: 'string', description: 'Sponsorship ID' },
+      },
+      required: ['id'],
+    },
+  },
+  // Handler mode — runs code directly (HTTP transport only)
+  {
+    name: 'newsletter_stats',
+    description: 'Get newsletter stats for the past N days',
+    role: 'admin',
+    annotations: { title: 'Get newsletter stats', readOnlyHint: true },
+    inputSchema: {
+      type: 'object',
+      properties: {
+        days: { type: 'number', description: 'Days to look back', default: 30 },
+      },
+    },
+    handler: async ({ Manager, assistant, user, params, libraries }) => {
+      const cutoff = Date.now() - (params.days || 30) * 86400000;
+      const snapshot = await libraries.admin.firestore()
+        .collection('newsletters')
+        .where('metadata.created.timestampUNIX', '>=', Math.floor(cutoff / 1000))
+        .get();
+      return { total: snapshot.docs.length };
+    },
+  },
+];
+```
+**Rules:**
+- Consumer tools with the same name as a built-in tool override it
+- Every tool needs `name`, `description`, and either `path` (route delegation) or `handler` (direct execution)
+- `role` defaults to `admin` if not specified
+- Handler-based tools only work on the HTTP transport (they return an error on stdio)
+- Handler-based tools bypass BEM route middleware — they execute directly with the Manager context
+- All MCP-standard fields are passed through: `annotations`, `outputSchema`, `inputSchema`
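The first three rules can be sketched as a validate-then-merge step. This is an assumption-laden illustration rather than the loader in `src/mcp/utils.js`: `validateTool` and `mergeTools` are hypothetical names, and the error messages are invented for the example.

```javascript
// Enforces the required fields and applies the default role.
function validateTool(tool) {
  if (!tool.name || !tool.description) {
    throw new Error('Tool needs a name and a description');
  }
  if (!tool.path && typeof tool.handler !== 'function') {
    throw new Error(`Tool "${tool.name}" needs either path or handler`);
  }
  return { ...tool, role: tool.role || 'admin' };
}

// Consumer tools replace built-in tools that share the same name.
function mergeTools(builtIn, consumer) {
  const byName = new Map(builtIn.map((tool) => [tool.name, tool]));
  for (const tool of consumer) {
    byName.set(tool.name, validateTool(tool));
  }
  return [...byName.values()];
}

const merged = mergeTools(
  [{ name: 'health_check', description: 'Check server health', role: 'public', path: 'test/health' }],
  [
    { name: 'health_check', description: 'Custom health check', path: 'custom/health' },
    { name: 'get_sponsorship', description: 'Get sponsorship details by ID', role: 'user', path: 'sponsorship' },
  ],
);

console.log(merged.map((t) => `${t.name}:${t.role}`)); // [ 'health_check:admin', 'get_sponsorship:user' ]
```

Note that in this sketch the override of `health_check` omits `role`, so it falls back to `admin`; an override that should stay public has to say so explicitly.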
+## HTTPS Local Development
+`npx mgr serve` starts an HTTPS proxy on port 5002 (`firebase serve` runs internally on port 5443). This lets Claude Desktop connect locally, since it requires HTTPS.
+- Certificates are auto-generated via mkcert into `.temp/certs/`
+- `getApiUrl()` returns `https://localhost:5002` when the HTTPS proxy is active
+- Disable with `--no-https` to fall back to plain HTTP
+- Install mkcert: `brew install mkcert && mkcert -install`
 ## Hosting Rewrites
@@ -48,11 +170,12 @@ The `npx mgr setup` command automatically adds required Firebase Hosting rewrite
 }
 ```
-## CLI Usage
+## Claude Desktop Configuration
-```bash
-npx mgr mcp                    # Start stdio MCP server (for Claude Code)
-```
+1. Go to Settings → Integrations → Add Custom Integration
+2. **URL:** `https://api.yourdomain.com/backend-manager/mcp` (production) or `https://localhost:5002/backend-manager/mcp` (local dev with HTTPS proxy)
+3. For admin access: set **OAuth Client ID** to your `BACKEND_MANAGER_KEY`
+4. For user access: leave Client ID empty — the OAuth flow redirects to the consumer's website for sign-in
 ## Claude Code Configuration
@@ -70,26 +193,26 @@ Add to `.claude/settings.json`:
 }
 ```
-## Claude Chat Configuration
-1. Go to Settings → Custom Connectors → Add
-2. **URL:** `https://api.yourdomain.com/backend-manager/mcp`
-3. **OAuth Client ID:** your `BACKEND_MANAGER_KEY` (enables auto-approve)
-4. **OAuth Client Secret:** your `BACKEND_MANAGER_KEY`
 ## Key Files
 | Purpose | File |
 |---------|------|
-| Tool definitions | `src/mcp/tools.js` |
-| HTTP handler (stateless + OAuth) | `src/mcp/handler.js` |
+| Tool definitions (roles + annotations) | `src/mcp/tools.js` |
+| Shared utilities (auth, filtering, consumer loading) | `src/mcp/utils.js` |
+| HTTP handler (OAuth + roles + consumer tools) | `src/mcp/handler.js` |
 | Stdio server | `src/mcp/index.js` |
 | HTTP client | `src/mcp/client.js` |
 | CLI command | `src/cli/commands/mcp.js` |
+| HTTPS proxy for local dev | `src/cli/commands/serve.js` |
 | MCP route interception | `src/manager/index.js` (`_handleMcp`, `resolveMcpRoutePath`) |
 | Hosting rewrites setup | `src/cli/commands/setup-tests/hosting-rewrites.js` |
 ## Adding New Tools
-1. Add the tool definition to `src/mcp/tools.js` with `name`, `description`, `method`, `path`, and `inputSchema`
-2. The tool automatically maps to the corresponding BEM route via the HTTP client — no handler code needed
+### Built-in tools (in BEM itself)
+Add a tool definition to `src/mcp/tools.js` with `name`, `description`, `role`, `method`, `path`, `annotations`, and `inputSchema`. The tool automatically maps to the corresponding BEM route via the HTTP client.
+### Consumer tools (in a consumer project)
+Add an entry to `functions/mcp.js`. Use `path` + `method` for route delegation (works on both transports), or `handler` for direct execution (HTTP only). All MCP fields (`annotations`, `outputSchema`, etc.) are passed through automatically.

package/docs/test-framework.md CHANGED

@@ -390,7 +390,7 @@ Security-rules tests use the `rules` client (`src/test/utils/firestore-rules-cli
 ## Test Account Isolation (CRITICAL)
-**NEVER use shared accounts (`basic`, `admin`, `premium-active`, …) with the `test` processor or any operation that creates side-effect data** (orders, webhooks, subscriptions). The test processor auto-fires webhooks that upgrade a user's subscription asynchronously — using `basic` for a payment-intent test upgrades `basic` to a paid subscription and breaks every subsequent test that depends on `basic` being a basic user.
+**NEVER use shared accounts (`basic`, `admin`, `premium-active`, …) with the `test` processor or any operation that creates side-effect data** (orders, webhooks, subscriptions, consent revocations). The test processor auto-fires webhooks that upgrade a user's subscription asynchronously — using `basic` for a payment-intent test upgrades `basic` to a paid subscription and breaks every subsequent test that depends on `basic` being a basic user.
 **Rule: any test that creates persistent side-effect data MUST use a dedicated `journey-*` account.**
@@ -402,7 +402,7 @@ const response = await http.as('basic').post('payments/intent', { processor: 'te
 const response = await http.as('journey-payments-intent-discount').post('payments/intent', { processor: 'test', ... });
 ```
-**When to create a journey account:** the test uses `processor: 'test'`, creates docs in `payments-orders` / `payments-intents` / `payments-webhooks`, modifies subscription state, or sends webhooks that trigger Firestore onWrite handlers. Add it to `src/test/test-accounts.js` (framework tests) or your project's `test/_init.js` `accounts` array (consumer tests).
+**When to create a journey account:** the test uses `processor: 'test'`, creates docs in `payments-orders` / `payments-intents` / `payments-webhooks`, modifies subscription state, sends webhooks that trigger Firestore onWrite handlers, or **writes `consent.marketing` (grant/revoke)** — e.g. marketing webhook revoke events, or `DELETE /marketing/contact` (which mirrors `revoked` to the user doc). Revoked consent persists for the rest of the run and trips the email library's consent gate (`{ blocked: 'consent' }`) on every later `sync()`/`add()` of that account. Existing examples: `journey-webhook-revoke` (webhook revoke events), `journey-marketing-sync` (extended-mode live-provider sync + cleanup; `_test.allow_*` prefix). Add new ones to `src/test/test-accounts.js` (framework tests) or your project's `test/_init.js` `accounts` array (consumer tests).
 **Shared accounts are safe for:** validation-only tests (missing fields, invalid input, auth rejection, unknown processor), read-only operations, and tests with no async side effects.

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "backend-manager",
-  "version": "5.6.3",
+  "version": "5.7.0",
   "description": "Quick tools for developing Firebase functions",
   "main": "src/manager/index.js",
   "bin": {