@rubytech/create-realagent 1.0.815 → 1.0.817

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (41)
  1. package/package.json +1 -1
  2. package/payload/platform/plugins/admin/PLUGIN.md +4 -2
  3. package/payload/platform/plugins/admin/skills/onboarding/SKILL.md +2 -2
  4. package/payload/platform/plugins/cloudflare/PLUGIN.md +2 -2
  5. package/payload/platform/plugins/docs/references/cloudflare.md +3 -5
  6. package/payload/platform/plugins/docs/references/deployment.md +12 -12
  7. package/payload/platform/plugins/docs/references/internals.md +24 -24
  8. package/payload/platform/plugins/docs/references/memory-guide.md +1 -1
  9. package/payload/platform/plugins/docs/references/outlook-guide.md +1 -1
  10. package/payload/platform/plugins/docs/references/plugins-guide.md +8 -8
  11. package/payload/platform/plugins/docs/references/troubleshooting.md +41 -45
  12. package/payload/platform/plugins/memory/PLUGIN.md +4 -4
  13. package/payload/platform/plugins/whatsapp/PLUGIN.md +10 -4
  14. package/payload/platform/plugins/whatsapp/mcp/dist/index.js +80 -0
  15. package/payload/platform/plugins/whatsapp/mcp/dist/index.js.map +1 -1
  16. package/payload/platform/plugins/whatsapp-import/PLUGIN.md +4 -4
  17. package/payload/platform/templates/agents/admin/IDENTITY.md +2 -0
  18. package/payload/platform/templates/specialists/agents/personal-assistant.md +2 -2
  19. package/payload/server/chunk-P3HTEK33.js +10074 -0
  20. package/payload/server/chunk-UYLZDEMC.js +1114 -0
  21. package/payload/server/chunk-Y3UQFQM7.js +10067 -0
  22. package/payload/server/client-pool-BMPFHXHB.js +31 -0
  23. package/payload/server/maxy-edge.js +2 -2
  24. package/payload/server/public/assets/{Checkbox-DZxF6s72.js → Checkbox-CTGhpDKq.js} +1 -1
  25. package/payload/server/public/assets/{admin-CTb65MiO.js → admin-2w0XSMC6.js} +20 -20
  26. package/payload/server/public/assets/data-Y77FLKjs.js +1 -0
  27. package/payload/server/public/assets/graph-C4-jEPDE.js +1 -0
  28. package/payload/server/public/assets/{jsx-runtime-Cb4WEnIV.css → jsx-runtime-D4WovFYk.css} +1 -1
  29. package/payload/server/public/assets/{page-BLanFGXC.js → page-DkBfWy4C.js} +1 -1
  30. package/payload/server/public/assets/{page-DuwlF8N5.js → page-zuI00fuC.js} +1 -1
  31. package/payload/server/public/assets/{public-BqeUfasT.js → public-BdVIVpv8.js} +1 -1
  32. package/payload/server/public/assets/{useAdminFetch-DLGqK3Fs.js → useAdminFetch-DmHu0oCx.js} +1 -1
  33. package/payload/server/public/assets/{useVoiceRecorder-CHPkBGVV.js → useVoiceRecorder-CSc_hxjV.js} +1 -1
  34. package/payload/server/public/data.html +5 -5
  35. package/payload/server/public/graph.html +6 -6
  36. package/payload/server/public/index.html +8 -8
  37. package/payload/server/public/public.html +5 -5
  38. package/payload/server/server.js +243 -409
  39. package/payload/server/public/assets/data-BO12TydY.js +0 -1
  40. package/payload/server/public/assets/graph-XIePTyWQ.js +0 -1
  41. package/payload/server/public/assets/{jsx-runtime-BZMOvG0f.js → jsx-runtime-DkaAusaX.js} +0 -0
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@rubytech/create-realagent",
- "version": "1.0.815",
+ "version": "1.0.817",
  "description": "Install Real Agent — Built for agents. By agents.",
  "bin": {
  "create-realagent": "./dist/index.js"
package/payload/platform/plugins/admin/PLUGIN.md CHANGED
@@ -51,9 +51,11 @@ Platform management tools for both the admin and public agents. The `plugin-read
 
  Tools are available via the `admin` MCP server.
 
- **Three-store admin auth invariant (Task 850).** `admin-add` writes to all three identity stores (`users.json` PIN auth, `account.json` `admins[]` role, Neo4j `:AdminUser`/`:Person` graph identity) with per-leg `[admin-auth-store]` log lines and returns `is_error: true` on any leg failure naming what's already written. `admin-update-pin` writes `users.json` only and emits the same line. Direct `Edit`/`Write` on `account.json` is blocked at the `pre-tool-use` hook — mutations go through `account-update`, `plugin-toggle-enabled`, or the `admin-*` tools. See `.docs/agents.md` § "Three-store admin auth invariant" for the full contract.
+ **Three-store admin auth invariant.** `admin-add` writes to all three identity stores (`users.json` PIN auth, `account.json` `admins[]` role, Neo4j `:AdminUser`/`:Person` graph identity) with per-leg `[admin-auth-store]` log lines and returns `is_error: true` on any leg failure naming what's already written. `admin-update-pin` writes `users.json` only and emits the same line. Direct `Edit`/`Write` on `account.json` is blocked at the `pre-tool-use` hook — mutations go through `account-update`, `plugin-toggle-enabled`, or the `admin-*` tools. See `.docs/agents.md` § "Three-store admin auth invariant" for the full contract.
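A minimal sketch of the write discipline this invariant implies — sequential legs, one `[admin-auth-store]` line per leg, and a failure result that names what already landed. The leg wiring and helper shape here are hypothetical, not the plugin's actual internals:

```ts
// Hypothetical sketch — the leg functions are illustrative stand-ins.
type Leg = "users.json" | "account.json" | "neo4j";

async function adminAdd(
  name: string,
  legs: Record<Leg, () => Promise<void>>,
): Promise<{ is_error: boolean; text: string }> {
  const written: Leg[] = [];
  for (const leg of Object.keys(legs) as Leg[]) {
    try {
      await legs[leg]();
      written.push(leg);
      console.error(`[admin-auth-store] leg=${leg} result=ok admin=${name}`);
    } catch (err) {
      console.error(`[admin-auth-store] leg=${leg} result=error admin=${name}`);
      // Fail loudly, naming the stores that were already written.
      return {
        is_error: true,
        text: `leg ${leg} failed; already written: ${written.join(", ") || "none"} — ${String(err)}`,
      };
    }
  }
  return { is_error: false, text: `admin ${name} present in all three stores` };
}
```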
 
- `logs-read { type: "agent-stream" }` (Task 850) is the canonical name for the per-conversation tool-use/tool-result archive previously called `system`; both names work and the legacy alias is preserved.
+ `logs-read { type: "agent-stream" }` is the canonical name for the per-conversation tool-use/tool-result archive previously called `system`; both names work and the legacy alias is preserved.
+
+ **Stream log is the primary diagnostic surface for an in-progress conversation.** The parent-process console fan-out tee in [`platform/ui/app/lib/claude-agent/logging.ts`](../../ui/app/lib/claude-agent/logging.ts) appends every `[<tag>]`-prefixed `console.error` / `console.log` line to every active conversation's stream log alongside `server.log`. For diagnosing an in-conversation issue (WhatsApp inbound, Cloudflare action, persist write, baileys error), call `logs-read { type: "agent-stream" }` first — the stream log carries both the agent lifecycle AND the parent-process events that occurred during the session window. `logs-read { type: "server" }` becomes the cross-session escape hatch (filtering across conversations or for events outside any session window), not the default.
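A rough sketch of that fan-out shape, assuming a hypothetical `activeStreamLogs` registry of per-conversation stream-log paths (the real tee in `logging.ts` also patches `console.log` and is more involved):

```ts
import { appendFileSync } from "node:fs";

// Hypothetical registry: one stream-log path per active conversation.
const activeStreamLogs = new Set<string>();
const SERVER_LOG = `${process.env.HOME}/.realagent/logs/server.log`; // illustrative path

const original = console.error.bind(console);
console.error = (...args: unknown[]) => {
  original(...args);
  const line = args.map(String).join(" ");
  if (!/^\[[^\]]+\]/.test(line)) return; // only [tag]-prefixed lines fan out
  const stamped = `[${new Date().toISOString()}] ${line}\n`;
  appendFileSync(SERVER_LOG, stamped);
  for (const logPath of activeStreamLogs) appendFileSync(logPath, stamped); // tee to every live session
};
```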
 
  ## Skills
 
package/payload/platform/plugins/admin/skills/onboarding/SKILL.md CHANGED
@@ -6,7 +6,7 @@ After completing each step, persist progress immediately by calling `onboarding-
 
  Every user selection and every document presentation in onboarding renders through `render-component` — never as markdown text, bullet lists, or inline chat content. Selections use `single-select` or `multi-select`; file content (SOUL, KNOWLEDGE, config) uses `document-editor`. Describing options in prose or pasting file content into a chat message bypasses the UI and the user's ability to review and edit before saving. Never use directional words ("above", "below") when referring to a rendered component — just name it or refer to "the browser", "the editor", etc. The `document-editor` component renders one file at a time — never call `render-component` with `document-editor` twice in the same turn. Present one file, wait for the user's approval or edit, then present the next.
 
- **Turn-completion contract (Task 879 §C).** Any onboarding turn that advances `currentStep` (via `onboarding-complete-step`) OR narrates a step transition with phrases like "Moving to step N", "Proceeding to step N", "Step N done" MUST end with one of: a `render-component` call surfacing the next step's UI, OR a `?`-terminated question. Bare prose statements like "Moving to step 9 — operator persona and profile bootstrap." with nothing after are forbidden — they leave the operator with no actionable surface and force a manual nudge. The post-restart resume contract for step 7 is the canonical case: when the cloudflare-relay turn arrives at `currentStep=7`, the agent's first turn must acknowledge AND immediately render the step-8 prompt (or `anthropic-setup`'s next surface) without an intervening dead-end paragraph. Same rule applies turn-by-turn through step 9. The `assistant-step-advance-deadend` review-detector rule (`platform/ui/app/lib/review-detector/rules.ts`) fires when this contract is violated, so an agent that drifts back to bare prose surfaces a `[review-detector]` alert against itself.
+ **Turn-completion contract.** Any onboarding turn that advances `currentStep` (via `onboarding-complete-step`) OR narrates a step transition with phrases like "Moving to step N", "Proceeding to step N", "Step N done" MUST end with one of: a `render-component` call surfacing the next step's UI, OR a `?`-terminated question. Bare prose statements like "Moving to step 9 — operator persona and profile bootstrap." with nothing after are forbidden — they leave the operator with no actionable surface and force a manual nudge. Same rule applies turn-by-turn through step 9. The post-restart resume contract (Task 881) for step 7 is the canonical case: when the operator's "Cloudflare setup completed" message arrives post-restart at `currentStep=7`, the agent's first turn must acknowledge AND immediately render the step-8 prompt (or `anthropic-setup`'s next surface) without an intervening dead-end paragraph.
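A contract like this is mechanically checkable. A simplified sketch of such a check, with the turn shape invented for illustration:

```ts
// Illustrative turn shape — not the platform's real transcript types.
interface Turn {
  text: string;
  toolCalls: { name: string }[];
}

function violatesTurnCompletionContract(turn: Turn): boolean {
  const advancesStep =
    turn.toolCalls.some((t) => t.name === "onboarding-complete-step") ||
    /\b(?:Moving|Proceeding) to step \d+|\bStep \d+ done\b/i.test(turn.text);
  if (!advancesStep) return false;
  const rendersNextSurface = turn.toolCalls.some((t) => t.name === "render-component");
  const endsWithQuestion = turn.text.trimEnd().endsWith("?");
  return !(rendersNextSurface || endsWithQuestion); // bare-prose dead end
}
```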
 
  ## Step 1 — Plugin selection
 
@@ -144,7 +144,7 @@ Then call `render-component` with `name: "cloudflare-setup-form"` and data conta
 
  Wait for the user's submission. The `_componentDone` payload contains the `setup-tunnel.sh` output verbatim. Relay that output to the user — quote any `ACTION REQUIRED` block exactly. When the script exits zero, step-7 completion has already been persisted by the script itself (Task 562) — relay the output and stop. Do not call `onboarding-complete-step` with step 7; the script is the authority for step-7 completion, and any call you make after the script's restart dispatch would race the service restart and almost always lose. If the script failed (the endpoint returned `ok: false, field: "script"`), the form surfaced the error and stayed open — relay the output, cite `plugins/cloudflare/references/reset-guide.md` for recovery, and offer to re-render the form after any manual steps. Do not synthesise alternative recovery commands. If the user skipped (step 7 not reached), call `onboarding-complete-step` with step 7 so the next session resumes at step 8.
 
- **Post-restart resume contract (Task 867).** A successful Cloudflare setup arms a brand-service restart that kills the in-flight admin agent; the operator's "Cloudflare setup completed" relay is queued by the form (write-once filesystem queue) and dispatched server-side as the first turn of the post-restart session via the boot-drain hook in `server/index.ts`. By the time a new admin session opens, `OnboardingState.currentStep` is already 7 (the script's filesystem flag was consumed by `consumeStep7FlagUI` on the way in) AND the queue's relay turn has been persisted as the most-recent user message. From your view as the admin agent, the operator just told you "Cloudflare setup completed (actionId: …)" in a fresh session at currentStep=7. Acknowledge, then proceed to step 8 — do NOT re-ask the Cloudflare question, do NOT re-render the cloudflare-setup-form, do NOT call `onboarding-complete-step` with step 7 (already done). The relay turn is your single source of truth that step 7 finished cleanly; the script's flag-consume is the orthogonal proof that the state machine advanced.
+ **Post-restart resume contract (Task 881).** A successful Cloudflare setup arms a brand-service restart that kills the in-flight admin agent; the operator's "Cloudflare setup completed" message is replayed by the chat client itself after the restart cycle completes (`POST /api/admin/sessions/<cid>/resume` re-binds the session via the surviving `__remote_session` cookie, then the client sends the marker as a normal hidden chat POST). By the time you receive that marker, `OnboardingState.currentStep` is already 7 (the script's filesystem flag was consumed by `consumeStep7FlagUI` on the way in). From your view as the admin agent, the operator just told you "Cloudflare setup completed (actionId: …)" at currentStep=7. Acknowledge, then proceed to step 8 — do NOT re-ask the Cloudflare question, do NOT re-render the cloudflare-setup-form, do NOT call `onboarding-complete-step` with step 7 (already done). The marker turn is your single source of truth that step 7 finished cleanly; the script's flag-consume is the orthogonal proof that the state machine advanced.
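Client-side, the whole contract reduces to three steps: wait out the restart, re-bind, replay the marker. A sketch under the endpoint names above — the health-poll cadence and the chat-POST route are illustrative stand-ins for what `useAdminChat.ts` actually does:

```ts
async function resumeAfterRestart(cid: string, sessionKey: string, actionId: string) {
  const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));
  const healthy = async () => {
    try { return (await fetch("/api/health")).ok; } catch { return false; }
  };

  while (await healthy()) await sleep(1000);    // wait for the service to go down…
  while (!(await healthy())) await sleep(1000); // …and come back up

  // Cookie-bridge rehydrates the wiped sessionStore via the surviving __remote_session cookie.
  await fetch(`/api/admin/sessions/${cid}/resume?session_key=${sessionKey}`, { method: "POST" });

  // Replay the marker as a normal hidden chat turn — this is what re-invokes the agent.
  await fetch(`/api/admin/chat?session_key=${sessionKey}`, { // route name illustrative
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      conversationId: cid,
      message: `Cloudflare setup completed (actionId: ${actionId}).`,
    }),
  });
}
```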
 
  ## Step 8 — Anthropic API key
 
package/payload/platform/plugins/cloudflare/PLUGIN.md CHANGED
@@ -24,13 +24,13 @@ Each installation has its own Cloudflare account. The operator signs in with OAu
 
  ## Operator-facing surface
 
- The plugin registers no agent-facing MCP tools (Task 554). Every Cloudflare operation is driven through one of four sanctioned surfaces — `setup-tunnel.sh`, `reset-tunnel.sh`, `references/manual-setup.md`, or `references/dashboard-guide.md`. See the skill below for the discipline rule that binds the agent to these four.
+ The plugin registers no agent-facing MCP tools. Every Cloudflare operation is driven through one of four sanctioned surfaces — `setup-tunnel.sh`, `reset-tunnel.sh`, `references/manual-setup.md`, or `references/dashboard-guide.md`. See the skill below for the discipline rule that binds the agent to these four.
 
  ### Scripts
 
  | Script | Purpose |
  |---|---|
- | [`scripts/setup-tunnel.sh`](scripts/setup-tunnel.sh) | Autonomous end-to-end setup: OAuth login, tunnel create, DNS route, config + state, service restart, post-restart verification. Invocation: `~/setup-tunnel.sh <brand> <port> <admin-hostname> [<public-hostname>] [<apex-hostname>]`. Apex hostnames print an `ACTION REQUIRED` block for the dashboard record the CLI cannot create. Step 1 (Task 858 — wrappers faithfully relay third-party CLI) spawns `cloudflared tunnel login`, extracts the argotunnel URL from its stdout, mechanically opens it on the Pi VNC chromium (`DISPLAY=${DISPLAY:-:99} /usr/bin/chromium <url> &`), then polls for `~/.cloudflared/cert.pem` while the operator clicks the zone row + Authorize on the VNC. 180 s budget with a 2-second `step=oauth-login result=awaiting-cert` heartbeat. No CDP auto-click, no DOM matcher. |
+ | [`scripts/setup-tunnel.sh`](scripts/setup-tunnel.sh) | Autonomous end-to-end setup: OAuth login, tunnel create, DNS route, config + state, service restart, post-restart verification. Invocation: `~/setup-tunnel.sh <brand> <port> <admin-hostname> [<public-hostname>] [<apex-hostname>]`. Apex hostnames print an `ACTION REQUIRED` block for the dashboard record the CLI cannot create. Step 1 (earlier platform fixes — wrappers faithfully relay third-party CLI) spawns `cloudflared tunnel login`, extracts the argotunnel URL from its stdout, mechanically opens it on the Pi VNC chromium (`DISPLAY=${DISPLAY:-:99} /usr/bin/chromium <url> &`), then polls for `~/.cloudflared/cert.pem` while the operator clicks the zone row + Authorize on the VNC. 180 s budget with a 2-second `step=oauth-login result=awaiting-cert` heartbeat. No CDP auto-click, no DOM matcher. |
  | [`scripts/reset-tunnel.sh`](scripts/reset-tunnel.sh) | Deletes every tunnel on the brand's CF account and wipes `${CFG_DIR}`. Does not touch the platform service, stray CNAMEs, or token-mode connectors — those require dashboard cleanup or `pkill`. Invocation: `~/reset-tunnel.sh <brand>`. No polling blocks — every long-wait is bounded by `cloudflared`'s network round-trip, so no heartbeat contract applies. |
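The OAuth step in `setup-tunnel.sh` (first row above) reduces to: spawn the CLI, lift the URL from its stdout, open it mechanically, and poll for the cert with a heartbeat. A TypeScript rendering of that loop — the budget, heartbeat cadence, and chromium invocation come from the table row, everything else is illustrative:

```ts
import { spawn, exec } from "node:child_process";
import { existsSync } from "node:fs";

async function oauthLogin(certPath = `${process.env.HOME}/.cloudflared/cert.pem`): Promise<boolean> {
  const login = spawn("cloudflared", ["tunnel", "login"]);
  login.stdout.on("data", (buf: Buffer) => {
    // Faithfully relay the third-party CLI: take the URL it printed, open it, nothing more.
    const m = String(buf).match(/https:\/\/\S*argotunnel\S*/);
    if (m) exec(`DISPLAY=${process.env.DISPLAY ?? ":99"} /usr/bin/chromium '${m[0]}' &`);
  });
  const deadline = Date.now() + 180_000; // 180 s budget
  while (Date.now() < deadline) {
    if (existsSync(certPath)) return true; // operator clicked the zone row + Authorize
    console.error("[setup-tunnel] step=oauth-login result=awaiting-cert"); // 2 s heartbeat
    await new Promise((r) => setTimeout(r, 2000));
  }
  return false; // budget exhausted — no CDP auto-click fallback, by design
}
```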
 
  ### Skills
package/payload/platform/plugins/docs/references/cloudflare.md CHANGED
@@ -8,7 +8,7 @@ Each installation has its own Cloudflare account. Sign-in is OAuth in the device
  |------|--------|
  | **Product identity** (Maxy vs Real Agent) | `brand.json` (`productName`, `configDir`) — known at install. |
  | **Cloudflare account identity** | `cert.pem` from OAuth. One account per brand per device. |
- | **Domain scope** (which zones the operator can route) | Live Cloudflare dashboard at form-render time via `list-cf-domains.sh`, not `brand.json`. Brand identity has no authority over which domains the operator's CF account holds. When the scrape returns an unexpected count (e.g. 1 on a two-zone account), the stream log's per-poll `phase=dom-scrape-poll n=<k> count=<n> domains=[…]` trajectory + the on-disk HTML dump at `~/{configDir}/logs/list-cf-domains-<ts>-count<n>-<mode>-pid<pid>.html` (Task 608 — written on every scrape outcome, not just empty ones) give the operator everything they need to triage the cause without re-running. |
+ | **Domain scope** (which zones the operator can route) | Live Cloudflare dashboard at form-render time via `list-cf-domains.sh`, not `brand.json`. Brand identity has no authority over which domains the operator's CF account holds. When the scrape returns an unexpected count (e.g. 1 on a two-zone account), the stream log's per-poll `phase=dom-scrape-poll n=<k> count=<n> domains=[…]` trajectory + the on-disk HTML dump at `~/{configDir}/logs/list-cf-domains-<ts>-count<n>-<mode>-pid<pid>.html` (earlier platform fixes — written on every scrape outcome, not just empty ones) give the operator everything they need to triage the cause without re-running. |
  | **Local tunnel state** | `~/{configDir}/cloudflared/` — `cert.pem`, `<UUID>.json`, `config.yml`, `tunnel.state`, `alias-domains.json`. |
 
  There is no token-based auth for the operator-owned path (Mode A). To switch Cloudflare accounts, run `reset-tunnel.sh` (which deletes the cert and every tunnel on the current account), then run `setup-tunnel.sh` again — `cloudflared tunnel login` inside the setup script will pick a fresh account when you sign in.
@@ -22,7 +22,7 @@ Ask the agent to set up Cloudflare. The agent first confirms the domain is alrea
  - **Proxy apex** — optional bare-domain hostname (e.g. `yourdomain.com`) that should also serve the public agent.
  - **Admin password** — the password used to gate remote access to the admin surface.
 
- When you submit, the `/api/admin/cloudflare/setup` endpoint runs — in strict order — `setRemotePassword`, launches a `cloudflare-setup` action (Task 664: `systemd-run --user` transient unit wrapping `setup-tunnel.sh <brand> <port> <hostname...>`), and registers a post-exit handler to write alias-domains for every non-`public.*` public or apex hostname (so e.g. `chat.yourdomain.com` is classified as public by `isPublicHost()`). The script runs end-to-end:
+ When you submit, the `/api/admin/cloudflare/setup` endpoint runs — in strict order — `setRemotePassword`, launches a `cloudflare-setup` action (earlier platform fixes: `systemd-run --user` transient unit wrapping `setup-tunnel.sh <brand> <port> <hostname...>`), and registers a post-exit handler to write alias-domains for every non-`public.*` public or apex hostname (so e.g. `chat.yourdomain.com` is classified as public by `isPublicHost`). The script runs end-to-end:
 
  - `cloudflared tunnel login` — OAuth browser sign-in. The VNC browser opens the Cloudflare authorize page; pick the account that owns your domain, click Authorize. `cert.pem` lands.
  - Tunnel creation under the naming convention `{brand}-{hostname}` (e.g. `maxy-neo`). Stream log emits `step=tunnel-resolve action=reused|created` once the UUID is known so the admin agent can see which tunnel the later steps will write against.
@@ -30,9 +30,7 @@ When you submit, the `/api/admin/cloudflare/setup` endpoint runs — in strict o
  - `cloudflared tunnel route dns` for each subdomain hostname. Apex hostnames cannot be routed this way — the script prints an **ACTION REQUIRED** block naming the exact dashboard record to add or edit. Stream log emits `step=route-dns hostname=… tunnel_id=…` before the call and `step=route-dns hostname=… result=ok|apex-skip|error` after; on error the bounded cloudflared stderr (≤400 chars) rides in the same phase line. **The script does not parse cloudflared's stdout** — exit code is the sole decision signal, so all three legitimate cloudflared output shapes (new record, overwrite, idempotent "already configured") are treated as success.
  - `config.yml` and `tunnel.state` written under `${CFG_DIR}`.
  - **Step-7 onboarding completion persisted** — the script writes `${ACCOUNT_DIR}/onboarding/step7-complete` (a JSON marker with the completion timestamp and tunnel ID) before arming the restart. Stream log: `step=onboarding-persist result=ok|error reason=<r>`. The marker is consumed by the next admin session's first state read and advances `OnboardingState.currentStep` to 7. Without this, the service restart below would SIGTERM the admin agent before it could persist step-7 completion, and the next session would re-ask the Cloudflare question you just finished. Both invocation surfaces (the form-driven action and the agent-via-Bash path) declare `ACCOUNT_DIR` explicitly because `systemd-run --user` does not inherit parent env — when ACCOUNT_DIR isn't reaching the script you'll see `result=skipped reason=no-account-dir` in the stream log instead of `result=ok`.
- - **Chat-relay queued for the operator's "Cloudflare setup completed" turn** (Task 867) — when the form's ActionLogPanel reports `code=0`, the form fires `POST /api/admin/cloudflare/relay-completion`. The route enqueues a record at `${ACCOUNT_DIR}/queue/action-completion-relay-<actionId>.json` (write-once via Node's `wx`/O_EXCL flag) BEFORE the brand restart kills the in-flight admin agent. After the restart, the brand service's boot-drain hook ([server/index.ts](../../../../platform/ui/server/index.ts)) consumes the queue once and dispatches a server-driven agent turn via a synthetic one-shot session bound to the queued conversationId; the agent's hoisted user-message persist (Task 867 `admin-agent.ts` + `public-agent.ts` write `role=user` BEFORE the SDK invoke now) captures the operator's relay even if SIGTERM hits mid-generation. The diagnostic line you grep on the working path is `[action-completion-relay] phase=consumed actionId=<id> conversationId=<cid> ageMs=<n> outcome=injected`. Failure modes are surfaced by the `cloudflare-setup-relay-not-acknowledged` review rule.
- - **Chat-surface restart-pending banner** (Task 879 §A) — the same `admin-chat:await-relay` CustomEvent that registers the relay-poll now also carries `reason: 'cloudflare-setup'` so the chat hook ([useAdminChat.ts](../../../../platform/ui/app/useAdminChat.ts)) renders an inline `"Service restarting after Cloudflare setup — picking back up…"` banner the moment the form fires. Closes the visible-silence window between the form's `Completed · 20s` and the first post-restart agent token. Idempotent on duplicate dispatch. Copy is keyed by `reason` so future restart sources (plugin-install, npm-update) can plug in their own banner without inventing a new chat surface state. Generic fallback `"Service restarting — reconnect will happen automatically."` is used when the dispatch omits `reason`.
- - **Client sessionKey rebind on first post-restart poll** (Task 879 §D.2) — when the relay-poll observes its first `200` for the captured cid (server-side cookie-bridge has just hydrated `(accountId, userId)` onto the wiped sessionStore entry), the chat hook fires `POST /api/admin/session/rebind` exactly once with `{session_key, lastKnownConversationId}`. The endpoint validates accountId scope via `getConversationOwner(cid).accountId === sessionAccountId` and binds the conversationId to the session via `setConversationIdForSession`. `sendMessage` awaits the in-flight rebind promise before opening the next chat POST, closing the silent-fork race where the operator's next turn would otherwise create a NEW Conversation and the `[admin/conversation-flush] result=missing-userId|writer-failed` line never reached the chat surface. Diagnostic: `grep '\[admin/session/rebind\]' ~/{configDir}/logs/server.log` — expects `result=ok conversationId=<cid8>` once per restart cycle; `result=conflict` means the server holds a different canonical cid (client adopts it).
+ - **Post-restart resume contract** — when the form's ActionLogPanel reports `code=0`, the form dispatches the `admin-chat:post-restart-resume` window CustomEvent with `{actionId, message}`. The chat hook ([useAdminChat.ts](../../../../platform/ui/app/useAdminChat.ts)) waits for the brand-service down-then-up cycle on `/api/health`, calls `POST /api/admin/sessions/<cid>/resume?session_key=<sk>` (cookie-bridge-rehydrates the wiped sessionStore via the surviving `__remote_session` cookie and binds the conversation), then sends the captured "Cloudflare setup completed (actionId: <id>)." marker as a normal hidden chat POST that re-invokes the agent. No relay queue, no boot-drain, no banner, no rebind endpoint. Diagnostic: `grep '\[admin-resume\] reason=post-restart' ~/{configDir}/logs/server.log` (expect one line per restart cycle), `grep '\[client-event\] kind=post-restart-resume' ~/{configDir}/logs/server.log` for the operator-visible client trace. See `.docs/web-chat.md` "Post-restart resume contract" for the full client/server contract.
  - `systemctl --user restart ${BRAND}.service` — restarts the platform service so the new tunnel spawns via the service's `ExecStartPre=resume-tunnel.sh`.
  - Post-restart verification — `ps -ef | grep '[c]loudflared'` confirms the connector is alive, then `curl -I https://<hostname>` against each subdomain (up to 60 s per host) confirms a non-530 response.
 
package/payload/platform/plugins/docs/references/deployment.md CHANGED
@@ -75,27 +75,27 @@ sudo journalctl -u maxy -n 50
  The logs will show which service failed to start and why. Common causes:
 
  - **Neo4j not started** — run `sudo systemctl start neo4j` and retry
- - **Port 19200 already in use** — check for another process: `lsof -i :19200`
+ - **Port 19200 already in use** — check for another process: `lsof -i:19200`
  - **Claude OAuth expired** — the next admin session will prompt you to re-authenticate
- - **NEO4J_URI guard throws** — the admin agent probes device reality at boot and fails closed on three shapes (Task 682, succeeding Task 681):
+ - **NEO4J_URI guard throws** — the admin agent probes device reality at boot and fails closed on three shapes (earlier platform fixes):
  - `no Neo4j listening on [ports]` — nothing is bound; start `neo4j.service` or `neo4j-<brand>.service`, or edit `NEO4J_URI` to a port a Neo4j is actually running on.
- - `port :X not listening; only :Y is live` — single-brand device where `.env` names a port the local Neo4j isn't bound to; edit `NEO4J_URI` in `~/{configDir}/.env` to match the live port (shown in the `[neo4j-probe] listening=[…]` log line).
- - `port :X disagrees with brand.json neo4jPort :Y` — co-tenant device (2+ Neo4js listening) where `.env` names the other brand's port; edit `NEO4J_URI` to match `brand.neo4jPort`, or correct `neo4jPort` in `brand.json` and reinstall. Preserves the Task 577 orphan-write protection on multi-brand devices.
+ - `port:X not listening; only:Y is live` — single-brand device where `.env` names a port the local Neo4j isn't bound to; edit `NEO4J_URI` in `~/{configDir}/.env` to match the live port (shown in the `[neo4j-probe] listening=[…]` log line).
+ - `port:X disagrees with brand.json neo4jPort:Y` — co-tenant device (2+ Neo4js listening) where `.env` names the other brand's port; edit `NEO4J_URI` to match `brand.neo4jPort`, or correct `neo4jPort` in `brand.json` and reinstall. Preserves the earlier orphan-write protection on multi-brand devices.
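A compressed sketch of that fail-closed decision. The probe helper is assumed; the three thrown messages mirror the shapes listed above:

```ts
// `listening` = Neo4j bolt ports actually bound on the device (probe assumed).
function guardNeo4jUri(envUri: string, brandNeo4jPort: number, listening: number[]): void {
  const port = Number(new URL(envUri).port || 7687);
  if (listening.length === 0) {
    throw new Error(`no Neo4j listening on [${listening.join(",")}]`);
  }
  if (listening.length === 1 && !listening.includes(port)) {
    // Single-brand device, .env names a dead port.
    throw new Error(`port :${port} not listening; only :${listening[0]} is live`);
  }
  if (listening.length >= 2 && port !== brandNeo4jPort) {
    // Co-tenant device, .env names the other brand's port.
    throw new Error(`port :${port} disagrees with brand.json neo4jPort :${brandNeo4jPort}`);
  }
  console.error(`[neo4j-probe] listening=[${listening.join(",")}] using=:${port}`);
}
```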
 
  ## Systemd units on each device
 
- Each installed brand runs two per-brand `--user` systemd units (Task 662 + Task 664 — unit filenames are prefixed with the brand's `hostname` so two brands on the same device never share a unit file):
+ Each installed brand runs two per-brand `--user` systemd units (earlier platform fixes — unit filenames are prefixed with the brand's `hostname` so two brands on the same device never share a unit file):
 
- - `{hostname}.service` — the admin + public HTTP server on `127.0.0.1:19201` (public port + 1). Restarted by the upgrade flow; short downtime is expected during steps 8→11 of an upgrade. Task 666: the unit carries two port env vars — `PORT=<public>` (canonical public port, read by the upgrade detector) and `MAXY_UI_INTERNAL_PORT=<public+1>` (the port maxy-ui actually binds).
- - `{hostname}-edge.service` — the always-on public listener on the configured port (default 19200). Reverse-proxies HTTP to the main brand service and handles `/websockify` (VNC) WebSocket upgrades locally. Task 666: also hosts `/api/admin/actions/*` and `/api/admin/version*` — the Software Update modal's own routes — so the log stream survives the brand service's restart window. Does NOT restart during an upgrade — the browser WebSocket stays connected by construction.
+ - `{hostname}.service` — the admin + public HTTP server on `127.0.0.1:19201` (public port + 1). Restarted by the upgrade flow; short downtime is expected during steps 8→11 of an upgrade. An earlier fix: the unit carries two port env vars — `PORT=<public>` (canonical public port, read by the upgrade detector) and `MAXY_UI_INTERNAL_PORT=<public+1>` (the port maxy-ui actually binds).
+ - `{hostname}-edge.service` — the always-on public listener on the configured port (default 19200). Reverse-proxies HTTP to the main brand service and handles `/websockify` (VNC) WebSocket upgrades locally. An earlier fix: also hosts `/api/admin/actions/*` and `/api/admin/version*` — the Software Update modal's own routes — so the log stream survives the brand service's restart window. Does NOT restart during an upgrade — the browser WebSocket stays connected by construction.
 
- **Port-drift recovery (Task 666).** Devices upgraded between Tasks 647 and 666 may have drifted +1 on every upgrade because the pre-Task-666 installer wrote `Environment=PORT=<internal>` into `{hostname}.service` and the upgrade reader correctly treated `PORT=` as public. The first post-Task-666 install detects this (comparing maxy's PORT against the edge's EDGE_PORT) and emits a one-shot loud log: `[port-recovery] detected drift maxy=<X> edge=<Y> — pinning at <Y>`. Subsequent upgrades are silent. If your Cloudflare tunnel was pointing at a drifted port, the ingress `config.yml` still needs a one-time manual fix: `sed -i 's|localhost:<old>|localhost:<current>|' ~/.{configDir}/cloudflared/config.yml && cloudflared tunnel ingress validate`. {{productName}} never rewrites cloudflared config programmatically.
+ **Port-drift recovery.** Devices upgraded between Tasks 647 and 666 may have drifted +1 on every upgrade because the pre-Task-666 installer wrote `Environment=PORT=<internal>` into `{hostname}.service` and the upgrade reader correctly treated `PORT=` as public. The first post-Task-666 install detects this (comparing maxy's PORT against the edge's EDGE_PORT) and emits a one-shot loud log: `[port-recovery] detected drift maxy=<X> edge=<Y> — pinning at <Y>`. Subsequent upgrades are silent. If your Cloudflare tunnel was pointing at a drifted port, the ingress `config.yml` still needs a one-time manual fix: `sed -i 's|localhost:<old>|localhost:<current>|' ~/.{configDir}/cloudflared/config.yml && cloudflared tunnel ingress validate`. {{productName}} never rewrites cloudflared config programmatically.
 
- Upgrade and Cloudflare setup (Task 664) run as detached actions: `systemd-run --user` transient units per invocation with stdout+stderr persisted to `~/.maxy/logs/actions/<actionId>.log` and streamed to the UI via SSE. No boot-time service file exists for these.
+ Upgrade and Cloudflare setup run as detached actions: `systemd-run --user` transient units per invocation with stdout+stderr persisted to `~/.maxy/logs/actions/<actionId>.log` and streamed to the UI via SSE. No boot-time service file exists for these.
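The detached-action shape is small enough to sketch. Assuming the unit-name and log-path conventions above (`maxy-action-<actionId>`, `~/.maxy/logs/actions/<actionId>.log`), a launch looks roughly like:

```ts
import { spawn } from "node:child_process";

function launchAction(actionId: string, script: string, args: string[]) {
  const log = `${process.env.HOME}/.maxy/logs/actions/${actionId}.log`;
  const unit = `maxy-action-${actionId}`;
  // Transient --user unit: survives the caller, owned by systemd, logs persisted per actionId.
  const child = spawn(
    "systemd-run",
    ["--user", `--unit=${unit}`, "--collect", "bash", "-c",
     `${script} ${args.join(" ")} >> '${log}' 2>&1`],
    { stdio: "ignore", detached: true },
  );
  child.unref(); // the SSE stream tails `log`; journalctl keeps systemd's record
  return { unit, log };
}
```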
 
  If an action looks stuck, read `~/.maxy/logs/actions/<actionId>.log` directly for the full output, or `journalctl --user --identifier=maxy-action-<actionId>` for systemd's record.
 
- **Pre-Task-662 / pre-Task-664 upgrade** — devices that ran an installer before Task 662 have legacy shared `maxy-edge.service` / `maxy-ttyd.service` units; devices that ran before Task 664 have per-brand `{hostname}-ttyd.service` units plus a pinned `/usr/local/bin/ttyd` binary. Neither is removed automatically — do this cleanup once per device before re-running any installer:
+ **Pre-Task-662 / pre-Task-664 upgrade** — devices that ran an older installer have legacy shared `maxy-edge.service` / `maxy-ttyd.service` units; devices that ran a newer but still pre-cleanup installer have per-brand `{hostname}-ttyd.service` units plus a pinned `/usr/local/bin/ttyd` binary. Neither is removed automatically — do this cleanup once per device before re-running any installer:
 
  ```bash
  systemctl --user stop maxy-edge maxy-ttyd realagent-ttyd 2>/dev/null || true
@@ -112,8 +112,8 @@ systemctl --user daemon-reload
  A single Pi or laptop can host more than one brand (for example Maxy and Real Agent) side by side. Each brand runs as its own service on its own port, with its own install directory and its own data. Installing one brand does not touch the other.
 
  - **Separate:** each brand has its own install folder (`~/maxy/`, `~/realagent/`), its own config folder (`~/.maxy/`, `~/.realagent/`), its own web port, its own Cloudflare tunnel state, its own edge systemd unit (`maxy-edge.service` vs `realagent-edge.service`), and by default its own Neo4j database (Maxy on bolt port 7687, Real Agent on 7688). Action runner units are transient and per-invocation, not per-brand, so no naming conflict is possible.
- - **Brand-isolated Neo4j (Task 787):** when a brand provisions a dedicated Neo4j instance (any port other than 7687), the installer stops and disables the apt-package's system `neo4j.service` after enabling the brand-dedicated unit, so only one Neo4j process holds the shared `/var/lib/neo4j/run/` PID file. The seed step receives the brand-correct `NEO4J_URI` and `NEO4J_PASSWORD` as explicit environment variables — the seed script no longer carries a `bolt://localhost:7687` default. A failed dedicated start aborts the install loudly with a journalctl tail; there is no silent fallback to the system instance. Stop/disable targets the literal `neo4j.service` only, so peer brands running their own `neo4j-{brand}.service` are unaffected.
- - **Peer-aware system-unit guard (Task 800):** before stopping the system `neo4j.service`, the installer checks whether any other brand on the device still depends on it — that is, has `NEO4J_URI=bolt://localhost:7687` in its `~/.<peer>/.env`. If so, the system unit is left enabled and active, and the install log shows `[neo4j] system unit kept active — peer brand <name> depends on port 7687 (Task 800)` instead of the usual `[neo4j] disabling system unit` line. This prevents a `create-realagent` install from disabling Maxy's database on a host where Maxy still uses the shared system instance (the Task 797 reproducer on Neo's laptop, 2026-04-28). On single-brand hosts and on multi-brand hosts where every peer runs a dedicated port, behaviour is unchanged from Task 787.
+ - **Brand-isolated Neo4j:** when a brand provisions a dedicated Neo4j instance (any port other than 7687), the installer stops and disables the apt-package's system `neo4j.service` after enabling the brand-dedicated unit, so only one Neo4j process holds the shared `/var/lib/neo4j/run/` PID file. The seed step receives the brand-correct `NEO4J_URI` and `NEO4J_PASSWORD` as explicit environment variables — the seed script no longer carries a `bolt://localhost:7687` default. A failed dedicated start aborts the install loudly with a journalctl tail; there is no silent fallback to the system instance. Stop/disable targets the literal `neo4j.service` only, so peer brands running their own `neo4j-{brand}.service` are unaffected.
+ - **Peer-aware system-unit guard:** before stopping the system `neo4j.service`, the installer checks whether any other brand on the device still depends on it — that is, has `NEO4J_URI=bolt://localhost:7687` in its `~/.<peer>/.env`. If so, the system unit is left enabled and active, and the install log shows `[neo4j] system unit kept active — peer brand <name> depends on port 7687` instead of the usual `[neo4j] disabling system unit` line. This prevents a `create-realagent` install from disabling Maxy's database on a host where Maxy still uses the shared system instance (the original reproducer on Neo's laptop, 2026-04-28). On single-brand hosts and on multi-brand hosts where every peer runs a dedicated port, behaviour is unchanged.
  - **Shared:** both brands share the system Chromium/VNC stack, the Ollama model server, and the `cloudflared` command itself. Browser automation is serialised — one admin session at a time across both brands.
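The peer-aware guard above amounts to one scan before touching the system unit. A sketch, assuming config dirs are the dot-folders in `$HOME` (the installer's real discovery is more precise):

```ts
import { readdirSync, readFileSync, existsSync } from "node:fs";
import { join } from "node:path";
import { homedir } from "node:os";

// Returns the first peer brand whose .env still points at the shared system Neo4j, or null.
function peerOnSharedNeo4j(selfConfigDir: string): string | null {
  for (const entry of readdirSync(homedir())) {
    if (!entry.startsWith(".") || entry === selfConfigDir) continue;
    const envPath = join(homedir(), entry, ".env");
    if (!existsSync(envPath)) continue;
    if (/NEO4J_URI=bolt:\/\/localhost:7687\b/.test(readFileSync(envPath, "utf8"))) {
      return entry; // keep system neo4j.service enabled for this peer
    }
  }
  return null; // safe to stop/disable the literal neo4j.service
}
```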
 
  To install a second brand on a device that already runs the first, just run the other installer. No flags needed for isolation:
package/payload/platform/plugins/docs/references/internals.md CHANGED
@@ -14,9 +14,9 @@ Every knowledge query flows through a hybrid search pipeline that combines seman
  QUERY
 
  ├── EMBED (EMBED_MODEL, default nomic-embed-text) ──► VECTOR SEARCH (per index, cosine)
-
- ├──► MERGE ──► EXPAND ──► RESULTS
-
+
+ ├──► MERGE ──► EXPAND ──► RESULTS
+
  └── ESCAPE (Lucene special chars) ──────► BM25 FULL-TEXT ──┘
  (entity_search index — universal coverage)
 
@@ -29,7 +29,7 @@ Fallback: if the full-text index doesn't exist, vector-only results are returned
 
  **Vector path:** The query is embedded via Ollama (model per `EMBED_MODEL` env var, default `nomic-embed-text`). The resulting vector is compared against Neo4j's HNSW cosine indexes — one per indexed label. Dimensions are configured at install time (default 768). The search runs against all discovered indexes (or a subset if the caller specifies label filters). Scores are in [0, 1] (cosine similarity).
 
- **BM25 path:** The raw query text is escaped for Lucene special characters and run against the `entity_search` full-text index (Task 748 — universal coverage), which spans every operator-meaningful label written by the platform on the canonical text-property union (~28 properties: `name`, `firstName`, `lastName`, `givenName`, `familyName`, `title`, `summary`, `body`, `content`, `description`, `headline`, `email`, `subject`, `bodyPreview`, etc.). Pre-Task-748 the index was named `knowledge_fulltext` and covered only `KnowledgeDocument | Section | Chunk` — that gap silently hid Person/Organization/Task/Event/etc. from BM25 regardless of query. Raw BM25 scores are in [0, infinity) — they are normalised to [0, 1] via min-max scaling within the result set before merging. When all scores are equal (or a single result), all normalise to 1.0.
+ **BM25 path:** The raw query text is escaped for Lucene special characters and run against the `entity_search` full-text index (earlier platform fixes — universal coverage), which spans every operator-meaningful label written by the platform on the canonical text-property union (~28 properties: `name`, `firstName`, `lastName`, `givenName`, `familyName`, `title`, `summary`, `body`, `content`, `description`, `headline`, `email`, `subject`, `bodyPreview`, etc.). Pre-Task-748 the index was named `knowledge_fulltext` and covered only `KnowledgeDocument | Section | Chunk` — that gap silently hid Person/Organization/Task/Event/etc. from BM25 regardless of query. Raw BM25 scores are in [0, infinity) — they are normalised to [0, 1] via min-max scaling within the result set before merging. When all scores are equal (or a single result), all normalise to 1.0.
 
  **Merge:** Results from both paths are collected in a single map keyed by `nodeId`. A node appearing in both paths accumulates the max vector score and max BM25 score independently. The combined score is `0.7 * vectorScore + 0.3 * bm25Score`. Results are sorted descending by combined score, then sliced to the requested limit (default 10).
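Those two rules — min-max normalisation of raw BM25 and a per-path max before the 0.7/0.3 blend — fit in a few lines. A sketch under the semantics described above:

```ts
interface Scored { nodeId: string; vectorScore: number; bm25Score: number; combined: number }

// Min-max normalise raw BM25 scores into [0, 1] within the result set.
function normaliseBm25(raw: Map<string, number>): Map<string, number> {
  if (raw.size === 0) return raw;
  const vals = [...raw.values()];
  const min = Math.min(...vals), span = Math.max(...vals) - min;
  return new Map([...raw].map(([id, s]) => [id, span === 0 ? 1.0 : (s - min) / span]));
}

function merge(vector: Map<string, number>, bm25: Map<string, number>, limit = 10): Scored[] {
  const byNode = new Map<string, Scored>();
  const upsert = (id: string, key: "vectorScore" | "bm25Score", score: number) => {
    const row = byNode.get(id) ?? { nodeId: id, vectorScore: 0, bm25Score: 0, combined: 0 };
    row[key] = Math.max(row[key], score); // max per path when a node hits both
    byNode.set(id, row);
  };
  for (const [id, s] of vector) upsert(id, "vectorScore", s);
  for (const [id, s] of normaliseBm25(bm25)) upsert(id, "bm25Score", s);
  for (const row of byNode.values()) row.combined = 0.7 * row.vectorScore + 0.3 * row.bm25Score;
  return [...byNode.values()].sort((a, b) => b.combined - a.combined).slice(0, limit);
}
```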
 
@@ -59,7 +59,7 @@ Indexed labels: `Question`, `DefinedTerm`, `Review`, `Service`, `Person`, `Local
 
  | Index name | Labels | Properties | Purpose |
  |---|---|---|---|
- | `entity_search` | All operator-meaningful labels (~40, see [`schema.cypher`](../../../neo4j/schema.cypher)) | Canonical text-property union (~28) | Universal BM25 keyword matching across the whole graph (Task 748) |
+ | `entity_search` | All operator-meaningful labels (~40, see [`schema.cypher`](../../../neo4j/schema.cypher)) | Canonical text-property union (~28) | Universal BM25 keyword matching across the whole graph |
 
  ### Embedding lifecycle
 
@@ -73,13 +73,13 @@ Large documents are decomposed into a three-level hierarchy for granular retriev
 
  ```
  KnowledgeDocument
- ├── summary (embedded) — document-level semantic anchor
+ ├── summary (embedded) — document-level semantic anchor
  ├── Section
- ├── summary (embedded) — section-level semantic anchor
- └── Chunk
- ├── summary (embedded) — chunk-level semantic anchor
- └── content (raw text, BM25-indexed) — full content for retrieval
- └── attachmentId — links back to the source file
+ ├── summary (embedded) — section-level semantic anchor
+ └── Chunk
+ ├── summary (embedded) — chunk-level semantic anchor
+ └── content (raw text, BM25-indexed) — full content for retrieval
+ └── attachmentId — links back to the source file
  ```
 
  All three levels are independently vector-indexed and BM25-indexed. A query may match at the document level (broad topic), section level (sub-topic), or chunk level (specific passage). Graph expansion from a matched chunk retrieves its parent section and document for context.
@@ -155,7 +155,7 @@ Before searching, a Haiku classifier decides whether a query needs knowledge ret
  | History window | Last 4 messages (2 user + 2 assistant) | Same |
  | Max tokens | 200 | 120 |
  | Query rewriting | Yes — resolves references from history into concrete search terms | Same |
- | Topic-change detection | Yes — detects shifts with confidence score | No (removed, Task 387) |
+ | Topic-change detection | Yes — detects shifts with confidence score | No (removed in an earlier platform fix) |
  | Fallback on failure | `search: true` (always search with raw message) | Same |
 
  ### Classification output
@@ -181,7 +181,7 @@ The classifier is one input to a broader decision tree that determines whether `
 
  Admin: `[admin-query-classifier]` log line with `topicChange`, `topicChangeConfidence`, `existingTopic`, `latencyMs`.
 
- Public: `[public-query-classifier]` log line with `search`, `effectiveQuery`, `reason`, `latencyMs`. The intentional absence of topic-change fields in the public log is the on-disk evidence that the public path does less work (Task 387).
+ Public: `[public-query-classifier]` log line with `search`, `effectiveQuery`, `reason`, `latencyMs`. The intentional absence of topic-change fields in the public log is the on-disk evidence that the public path does less work.
 
  ---
 
@@ -221,7 +221,7 @@ Haiku receives a sandboxed system prompt that:
  - Requires a JSON response with `nodeId`, `rank` (1-indexed, 1 = most important), and `reasoning` (one sentence, under 300 characters)
  - Explicitly labels the user-provided criterion as "data, not instructions" to prevent prompt injection
 
- The criterion itself (from the calling agent) is wrapped in `<<<CRITERION ... CRITERION` delimiters in the user message.
+ The criterion itself (from the calling agent) is wrapped in `<<<CRITERION... CRITERION` delimiters in the user message.
 
  ### Hallucination defence
 
@@ -284,7 +284,7 @@ Each public agent can subscribe to up to 5 keywords via `knowledgeKeywords` in i
 
  For each subscription keyword, two complementary searches run:
 
- 1. **BM25 full-text search** — queries the universal `entity_search` index (Task 748) with the keyword as the search term. Catches content that mentions the keyword in its text across every operator-meaningful label.
+ 1. **BM25 full-text search** — queries the universal `entity_search` index with the keyword as the search term. Catches content that mentions the keyword in its text across every operator-meaningful label.
 
  2. **Property-based search** — finds nodes whose `keywords` array property contains the subscription keyword (case-insensitive). Catches nodes explicitly tagged with that keyword topic. These matches are boosted to maximum BM25 score (1.0) since they are exact tag matches.
 
@@ -292,7 +292,7 @@ Both searches run **without** the per-agent tag filter (`agentSlug`) — keyword
 
  ### Union semantics
 
- Results from keyword subscription searches are merged into the same scored map as the primary vector+BM25 results. Deduplication by `nodeId` with `Math.max()` on scores means a node found by both direct search and keyword subscription keeps the highest score from each method.
+ Results from keyword subscription searches are merged into the same scored map as the primary vector+BM25 results. Deduplication by `nodeId` with `Math.max` on scores means a node found by both direct search and keyword subscription keeps the highest score from each method.
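One subscription keyword therefore contributes two result sets that fold into that shared map. A sketch with the search helpers declared rather than implemented (their real signatures live in the memory plugin):

```ts
declare function bm25Search(index: string, q: string): Promise<{ nodeId: string; score: number }[]>;
declare function runCypher(q: string, params: Record<string, unknown>): Promise<{ nodeId: string }[]>;

async function keywordSubscriptionScores(keyword: string): Promise<Map<string, number>> {
  // 1. Text mentions via the universal index — note: no per-agent tag filter here.
  const textHits = await bm25Search("entity_search", keyword);
  // 2. Explicit tags: nodes whose keywords[] contains the keyword, case-insensitively.
  const tagHits = await runCypher(
    `MATCH (n) WHERE any(k IN n.keywords WHERE toLower(k) = toLower($kw))
     RETURN elementId(n) AS nodeId`,
    { kw: keyword },
  );
  const scores = new Map(textHits.map((h) => [h.nodeId, h.score]));
  for (const { nodeId } of tagHits) {
    scores.set(nodeId, Math.max(scores.get(nodeId) ?? 0, 1.0)); // exact tag ⇒ max BM25 score
  }
  return scores; // folded into the primary map with the same Math.max dedupe
}
```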
 
  ### Lifecycle
 
@@ -315,17 +315,17 @@ This tool is read-only and available to both public and admin agents.
 
  ### When conversations are created
 
- `:Conversation` nodes on webchat (admin login, "New conversation" in the burger, a new public visitor) are created lazily. Opening the chat or logging in does not write anything to the graph — {{productName}} only records the conversation once the user sends a second message. This keeps `conversation-search` and the Conversations modal free of one-turn abandoned threads. WhatsApp and Telegram take the opposite posture: every inbound — DM or group, allowed or activation-off, agent-invoked or gated — MERGEs the `:Conversation` and writes a forensic `:Message:WhatsAppMessage` row before any access-control decision (Task 863). The graph is the durable record of every message the device received, not just the ones the agent replied to. See `.docs/web-chat.md` "Deferred conversation persistence (Task 650)" and `.docs/whatsapp.md` "Session continuity" for the full contract.
+ `:Conversation` nodes on webchat (admin login, "New conversation" in the burger, a new public visitor) are created lazily. Opening the chat or logging in does not write anything to the graph — {{productName}} only records the conversation once the user sends a second message. This keeps `conversation-search` and the Conversations modal free of one-turn abandoned threads. WhatsApp and Telegram take the opposite posture: every inbound — DM or group, allowed or activation-off, agent-invoked or gated — MERGEs the `:Conversation` and writes a forensic `:Message:WhatsAppMessage` row before any access-control decision. The graph is the durable record of every message the device received, not just the ones the agent replied to. See `.docs/web-chat.md` "Deferred conversation persistence" and `.docs/whatsapp.md` "Session continuity" for the full contract.
 
- Each row in the Conversations modal exposes a `View logs` row-action that opens a popover with three links — **Stream**, **Errors**, **SSE** — each of which targets `/api/admin/logs?type={stream|error|sse}&conversationId={full-id}` in a new tab. The row's 8-char id chip is click-to-copy; hover reveals the full `conversationId` as a tooltip. See `.docs/web-chat.md` "In-chat retrieval" for the route contract and `console.debug` observability (Task 686).
+ Each row in the Conversations modal exposes a `View logs` row-action that opens a popover with three links — **Stream**, **Errors**, **SSE** — each of which targets `/api/admin/logs?type={stream|error|sse}&conversationId={full-id}` in a new tab. The row's 8-char id chip is click-to-copy; hover reveals the full `conversationId` as a tooltip. See `.docs/web-chat.md` "In-chat retrieval" for the route contract and `console.debug` observability.
 
- ### Static publish surface — `/sites/*` (Task 853)
+ ### Static publish surface — `/sites/*`
 
- {{productName}} hosts a generic per-account static-tree publish surface at `https://public.<brand>/sites/<...>/<file>`. The route serves files from `<accountDir>/sites/<...>` with URL=disk mirroring — operator drops the tree on disk, no upload API. Extended MIME covers HTML/CSS/JS/woff2/fonts on top of images. Trailing-`/` or extension-less requests fall back to `index.html`. Path traversal (`..`, encoded `..`, segments failing `SAFE_SEG_RE`) returns 403; symlinks escaping the sites root are rejected via a `realpathSync` re-check. `.html` responses carry `Content-Security-Policy: default-src 'self' https: data:; script-src 'none'` and `Cache-Control: no-cache`; assets are cached for an hour; every response carries `X-Content-Type-Options: nosniff`. Per-account isolation comes from `resolveAccount()` — Maxy and Real Agent installs each see only their own tree. Drop a brochure at `~/.realagent/data/sites/properties/<id>/brochure/output/` and it serves at `https://public.realagent.bot/sites/properties/<id>/brochure/output/brochure.html`. See `.docs/web-chat.md` `/sites/*` route entry for the wire contract and `[sites]` log lines (`serve|not-found|path-traversal-rejected|symlink-escape-rejected|no-account`).
+ {{productName}} hosts a generic per-account static-tree publish surface at `https://public.<brand>/sites/<...>/<file>`. The route serves files from `<accountDir>/sites/<...>` with URL=disk mirroring — operator drops the tree on disk, no upload API. Extended MIME covers HTML/CSS/JS/woff2/fonts on top of images. Trailing-`/` or extension-less requests fall back to `index.html`. Path traversal (`..`, encoded `..`, segments failing `SAFE_SEG_RE`) returns 403; symlinks escaping the sites root are rejected via a `realpathSync` re-check. `.html` responses carry `Content-Security-Policy: default-src 'self' https: data:; script-src 'none'` and `Cache-Control: no-cache`; assets are cached for an hour; every response carries `X-Content-Type-Options: nosniff`. Per-account isolation comes from `resolveAccount` — Maxy and Real Agent installs each see only their own tree. Drop a brochure at `~/.realagent/data/sites/properties/<id>/brochure/output/` and it serves at `https://public.realagent.bot/sites/properties/<id>/brochure/output/brochure.html`. See `.docs/web-chat.md` `/sites/*` route entry for the wire contract and `[sites]` log lines (`serve|not-found|path-traversal-rejected|symlink-escape-rejected|no-account`).
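The safety core of that route is two checks: a per-segment allowlist before touching the filesystem, and a `realpathSync` re-check after. A sketch — `SAFE_SEG_RE` here is an illustrative guess at the real pattern:

```ts
import { realpathSync } from "node:fs";
import { resolve, sep } from "node:path";

const SAFE_SEG_RE = /^[A-Za-z0-9][A-Za-z0-9._-]*$/; // illustrative, not the shipped regex

// Returns the real on-disk path to serve, or null for a 403.
function resolveSitesPath(sitesRoot: string, segments: string[]): string | null {
  for (const raw of segments) {
    const seg = decodeURIComponent(raw);
    if (seg === ".." || !SAFE_SEG_RE.test(seg)) return null; // traversal / junk segment
  }
  const root = realpathSync(sitesRoot);
  try {
    const real = realpathSync(resolve(root, ...segments));
    return real.startsWith(root + sep) ? real : null; // symlink escaping the root ⇒ 403
  } catch {
    return null; // missing file — the route 404s separately
  }
}
```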
 
- ### Cross-tab session rotation (Task 848)
+ ### Cross-tab session rotation
 
- When you click "New conversation" in the chat tab, {{productName}} mints a fresh admin session key on the server and clears the old one. Sibling admin tabs (`/graph`, `/data`) opened in the same browser keep working without re-login: the chat tab broadcasts the new key on a same-origin channel so each sibling tab updates its captured key instantly, and any in-flight admin request that 401s with the rotation-orphan code retries once after re-reading the latest key from per-tab storage. If neither path recovers (browser locked down, second 401 after retry, session expired), the tab shows a single banner — "Your admin session was renewed in another tab. Click to reload." — and one click sends you back through login. No silent 401s; no re-clicking through the same trash icon hoping it sticks. See `.docs/web-chat.md` "Cross-tab rotation contract (Task 848)" for the wire-level `code` taxonomy and observability surfaces.
+ When you click "New conversation" in the chat tab, {{productName}} mints a fresh admin session key on the server and clears the old one. Sibling admin tabs (`/graph`, `/data`) opened in the same browser keep working without re-login: the chat tab broadcasts the new key on a same-origin channel so each sibling tab updates its captured key instantly, and any in-flight admin request that 401s with the rotation-orphan code retries once after re-reading the latest key from per-tab storage. If neither path recovers (browser locked down, second 401 after retry, session expired), the tab shows a single banner — "Your admin session was renewed in another tab. Click to reload." — and one click sends you back through login. No silent 401s; no re-clicking through the same trash icon hoping it sticks. See `.docs/web-chat.md` "Cross-tab rotation contract" for the wire-level `code` taxonomy and observability surfaces.
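The recovery ladder is broadcast first, retry second, banner last. A browser-side sketch — the channel name, storage key, and rotation-orphan `code` value are all illustrative:

```ts
const chan = new BroadcastChannel("admin-session-key"); // same-origin sibling-tab channel
let sessionKey = sessionStorage.getItem("admin_session_key") ?? "";

chan.onmessage = (ev: MessageEvent<string>) => {
  sessionKey = ev.data; // adopt the rotated key the instant the chat tab broadcasts it
  sessionStorage.setItem("admin_session_key", sessionKey);
};

declare function showReloadBanner(): void; // "session renewed in another tab — click to reload"

async function adminFetch(path: string): Promise<Response> {
  const go = () => fetch(`${path}?session_key=${encodeURIComponent(sessionKey)}`);
  let res = await go();
  if (res.status === 401 && (await res.clone().json()).code === "rotation-orphan") {
    sessionKey = sessionStorage.getItem("admin_session_key") ?? sessionKey; // re-read latest
    res = await go(); // retry exactly once
    if (res.status === 401) showReloadBanner(); // no silent 401s
  }
  return res;
}
```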
 
  ---
 
@@ -407,7 +407,7 @@ This means:
  - The `memory-reindex` tool can backfill embeddings for newly indexed labels
  - Index renames are transparent — the server discovers the current index names at startup
 
- The cache is cleared via `clearIndexCache()` after schema changes (e.g., after `memory-reindex` detects new indexes).
+ The cache is cleared via `clearIndexCache` after schema changes (e.g., after `memory-reindex` detects new indexes).
 
  ---
 
@@ -471,6 +471,6 @@ Each log entry includes the tool name and a truncated conversation ID for correl
 
  When an admin turn crosses 75% of the model's context window, {{productName}} runs a silent compaction turn that asks the agent to call the `session-compact` MCP tool with a structured briefing (what you asked for, what was done, decisions made, work-in-progress, things you've shared about yourself). The briefing is written to Neo4j; the next admin turn injects it back into the system prompt, so continuity survives across the compaction boundary without re-sending the full transcript.
 
- The compaction runs against a transient one-shot pool entry separate from the long-lived admin Query (Task 784). Operator-visible side effects:
+ The compaction runs against a transient one-shot pool entry separate from the long-lived admin Query. Operator-visible side effects:
  - Compaction logs land in `claude-agent-compaction-stream-YYYY-MM-DD.log` alongside the main stream log. Look for `[compaction-start]`, `[compaction-summary-captured]`, `[compaction-failed]`, `[compaction-timeout]`, `[compaction-crashed]`, or `[compaction-spawn-error]` to triage. Subprocess stderr is captured inline as `[subproc-stderr] <line>` — there is no longer a separate `claude-agent-compaction-stderr-…log` file.
  - The one-shot pool entry's lifecycle is greppable as `[client-cold-create] reason=compaction-one-shot …` paired with `[client-evict] reason=compaction-one-shot …`, distinguishable from the regular admin pool's lifecycle tags.
package/payload/platform/plugins/docs/references/memory-guide.md CHANGED
@@ -86,7 +86,7 @@ Ask naturally:
  - "What did I last discuss about the Acme proposal?"
  - "Who have I met from the fintech conference?"
 
- ## Listing and counting (Task 557)
+ ## Listing and counting
 
  {{productName}} answers relational questions — "list all my people", "how many tasks do I have", "find the person with email X", "show me the 20 most recently created nodes" — via direct read-only Cypher against your Neo4j. This is faster and more precise than semantic search when the question is "the exact set where", not "things similar to".
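For instance, "how many tasks do I have" becomes a single read-only query rather than a similarity search. A sketch with the official `neo4j-driver` package:

```ts
import neo4j from "neo4j-driver";

async function countTasks(uri: string, user: string, password: string): Promise<number> {
  const driver = neo4j.driver(uri, neo4j.auth.basic(user, password));
  const session = driver.session({ defaultAccessMode: neo4j.session.READ }); // read-only
  try {
    const res = await session.run("MATCH (t:Task) RETURN count(t) AS n");
    return res.records[0].get("n").toNumber(); // the exact set, not "things similar to"
  } finally {
    await session.close();
    await driver.close();
  }
}
```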
 
package/payload/platform/plugins/docs/references/outlook-guide.md CHANGED
@@ -62,7 +62,7 @@ Latency triage: `mail-list count=0 elapsedMs<200` consistent → permissions iss
  | `Outlook token refresh failed for account=X; re-auth required` | Network down at refresh time, or refresh token invalidated | Verify network; re-register |
  | `Outlook auth expired for account=X; run outlook-account-register` | Refresh-then-retry still got 401 | Re-register |
  | `Outlook rate-limited without Retry-After hint` | Graph 429 with no backoff guidance | Wait + retry; if persistent, file bug |
- | `Microsoft Graph does not support on-premises Exchange. Use Task 769 (IMAP).` | Mailbox is on hybrid Exchange | Use the `email` plugin |
+ | `Microsoft Graph does not support on-premises Exchange. Use IMAP.` | Mailbox is on hybrid Exchange | Use the `email` plugin |
66
66
 
67
67
  ## Out of scope
68
68
 
@@ -109,11 +109,11 @@ import { initStderrTee } from "../../../../lib/mcp-stderr-tee/dist/index.js";
109
109
  initStderrTee("your-plugin-name");
110
110
  ```
111
111
 
112
- After this, every `console.error("[your-tool] ...")` from any tool in the plugin appears as `[<iso-ts>] [mcp:your-plugin-name] [your-tool] ...` in the per-conversation stream log `claude-agent-stream-{conversationId}.log`, alongside the usual agent events. The raw per-server file `mcp-your-plugin-name-stderr-{date}.log` is still produced for deep-dive grep.
112
+ After this, every `console.error("[your-tool] ...")` from any tool in the plugin appears as `[<iso-ts>] [mcp:your-plugin-name] [your-tool] ...` in the per-conversation stream log `claude-agent-stream-{conversationId}.log`, alongside the usual agent events. The raw per-server file `mcp-your-plugin-name-stderr-{date}.log` is still produced for deep-dive grep.
113
113
 
114
- **How the tee decides which file to write to (Task 532):** the platform sets `STREAM_LOG_PATH` as an environment variable on every MCP server spawn, pointing to the conversation-scoped stream log. The MCP server does not know about conversations — it just trusts `STREAM_LOG_PATH`. Multiple concurrent conversations produce multiple concurrent MCP server processes, each teeing to its own file; no cross-conversation leakage.
114
+ **How the tee decides which file to write to:** the platform sets `STREAM_LOG_PATH` as an environment variable on every MCP server spawn, pointing to the conversation-scoped stream log. The MCP server does not know about conversations — it just trusts `STREAM_LOG_PATH`. Multiple concurrent conversations produce multiple concurrent MCP server processes, each teeing to its own file; no cross-conversation leakage.
115
115
 
116
- **`STREAM_LOG_PATH` reaches every Claude Code child (Task 556).** The platform now sets `STREAM_LOG_PATH` on the parent `claude` spawn env itself (not only on MCP server envs), so the bundled Bun runtime inherits it and every Bash-tool subprocess the CLI spawns sees it too. Opt-in shell scripts — currently `setup-tunnel.sh`, `reset-tunnel.sh`, and `list-cf-domains.sh` under `platform/plugins/cloudflare/scripts/` — read the variable, guard against a missing value with a loud exit, and tee subprocess output line-by-line into the same per-conversation file. Each spawn writes one `[spawn-env] STREAM_LOG_PATH=set pid=… conversationId=… site=…` line so the env-propagation is auditable per session. The chat UI tails the same file for lines matching `^\[([^\]]+)\] \[([a-z][a-z0-9-]*)((?::[a-z0-9:_-]+)?)\] ` — any lowercase scope shape participates on first write (Task 592 generalised from the pre-592 enum `setup-tunnel|reset-tunnel`) — and emits them as `script_stream` SSE events; see `.docs/web-chat.md` for the contract. Inner-layer helpers that a .sh wrapper spawns (e.g. `list-cf-domains.ts` via `node --experimental-strip-types`) must write phase lines directly to `STREAM_LOG_PATH` rather than relying on stderr propagation (Task 598); the build-gate `platform/ui/scripts/check-stream-log-contract.mjs` (Task 600) enforces this and is the definitive reference for the three allowed patterns (tee-wrapped, direct-write, or explicit stderr-only marker).
116
+ **`STREAM_LOG_PATH` reaches every Claude Code child.** The platform now sets `STREAM_LOG_PATH` on the parent `claude` spawn env itself (not only on MCP server envs), so the bundled Bun runtime inherits it and every Bash-tool subprocess the CLI spawns sees it too. Opt-in shell scripts — currently `setup-tunnel.sh`, `reset-tunnel.sh`, and `list-cf-domains.sh` under `platform/plugins/cloudflare/scripts/` — read the variable, guard against a missing value with a loud exit, and tee subprocess output line-by-line into the same per-conversation file. Each spawn writes one `[spawn-env] STREAM_LOG_PATH=set pid=… conversationId=… site=…` line so the env-propagation is auditable per session. The chat UI tails the same file for lines matching `^\[([^\]]+)\] \[([a-z][a-z0-9-]*)((?::[a-z0-9:_-]+)?)\] ` — any lowercase scope shape participates on first write (an earlier platform fix generalised this from the original `setup-tunnel|reset-tunnel` enum) — and emits them as `script_stream` SSE events; see `.docs/web-chat.md` for the contract. Inner-layer helpers that a `.sh` wrapper spawns (e.g. `list-cf-domains.ts` via `node --experimental-strip-types`) must write phase lines directly to `STREAM_LOG_PATH` rather than relying on stderr propagation; the build-gate `platform/ui/scripts/check-stream-log-contract.mjs` enforces this and is the definitive reference for the three allowed patterns (tee-wrapped, direct-write, or explicit stderr-only marker).
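A sketch of the tail-side filter, using the regex quoted above; the event shape is illustrative (the real `script_stream` contract is in `.docs/web-chat.md`):

```ts
// Matches "[<ts>] [<scope>] ..." where scope is lowercase, optionally
// suffixed like "mcp:memory". This is the pattern the chat UI tails for.
const SCOPE_LINE = /^\[([^\]]+)\] \[([a-z][a-z0-9-]*)((?::[a-z0-9:_-]+)?)\] /;

interface ScriptStreamEvent {
  ts: string;    // ISO timestamp from the first bracket group
  scope: string; // e.g. "setup-tunnel" or "mcp:memory"
  line: string;  // full raw line
}

export function toScriptStreamEvent(line: string): ScriptStreamEvent | null {
  const m = SCOPE_LINE.exec(line);
  if (!m) return null; // non-matching lines are not emitted as script_stream
  return { ts: m[1], scope: m[2] + m[3], line };
}
```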
117
117
 
118
118
  **Retrieve MCP diagnostic lines for a conversation:**
119
119
 
@@ -122,20 +122,20 @@ After this, every `console.error("[your-tool] ...")` from any tool in the plugin
122
122
 
123
123
  **Tee-state markers** land in the stream log: `[platform] [mcp-tee-attach] server=<name> streamLogPath=...` when the tee wires up, `[platform] [mcp-tee-skip] server=<name> destination=... reason=...` when a destination fails (missing `LOG_DIR`, unwritable path, `STREAM_LOG_PATH` not set, etc.), `[platform] [mcp-tee-detach] server=<name>` on graceful shutdown. If a server invoked tools but no `[mcp:<name>]` lines appear in the conversation's log, look for the skip marker first.
124
124
 
125
- **Main-subprocess stderr (Task 535).** The same teeing pattern applies to the main Claude Code subprocess's stderr — every line lands in the per-conversation stream log as `[subproc-stderr] …`, with lifecycle markers `[subproc-stderr-tee-attached] pid=…` and `[subproc-stderr-tee-detached] pid=… bytes=N lines=N`. A `bytes=0 lines=0` detach means the tee was attached but the subprocess emitted nothing on stderr — which is the normal state today, because the Claude Code CLI is a bundled Bun runtime binary that does not honour Node's `NODE_DEBUG` env var. The platform records this explicitly with one line per spawn: `[subproc-debug-unavailable] reason=bundled-bun-binary-ignores-node-debug pid=… cli=claude`. A reader who finds a `[spawn]` without these markers should treat that as a regression of the tee infrastructure, not as silence.
125
+ **Main-subprocess stderr.** The same teeing pattern applies to the main Claude Code subprocess's stderr — every line lands in the per-conversation stream log as `[subproc-stderr] …`, with lifecycle markers `[subproc-stderr-tee-attached] pid=…` and `[subproc-stderr-tee-detached] pid=… bytes=N lines=N`. A `bytes=0 lines=0` detach means the tee was attached but the subprocess emitted nothing on stderr — which is the normal state today, because the Claude Code CLI is a bundled Bun runtime binary that does not honour Node's `NODE_DEBUG` env var. The platform records this explicitly with one line per spawn: `[subproc-debug-unavailable] reason=bundled-bun-binary-ignores-node-debug pid=… cli=claude`. A reader who finds a `[spawn]` without these markers should treat that as a regression of the tee infrastructure, not as silence.
126
126
 
127
- ## Failure-path observability contract (Task 560 + Task 743)
127
+ ## Failure-path observability contract
128
128
 
129
129
  The `initStderrTee` wrapper writes to the per-conversation stream log and per-server raw file via `createWriteStream` — async, buffered. Any diagnostic `console.error(…)` followed by an immediate `process.exit(…)` is lost: the event loop never drains the WriteStream before the process terminates. Same race for any synchronous module-load throw: Node's uncaught-exception handler writes the stack to raw fd 2 and exits before the patched async stream flushes. The platform's `[mcp-init-error] tail="(no stderr file)"` line — operationally useless — is the public symptom of this race.
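A minimal reproduction of the race and the sync-write fix, with illustrative paths and prefixes:

```ts
import fs from "node:fs";

// Buffered, async: this line can be lost on immediate exit, because the
// WriteStream's internal buffer never drains before the process dies.
const stream = fs.createWriteStream("/tmp/demo-stream.log", { flags: "a" });
stream.write("[mcp:demo] [demo-plugin] fatal: lost in the buffer\n");

// Synchronous: this line survives process.exit(), the write completes
// before the call returns. Each destination gets its own try/catch so an
// unwritable log cannot mask the primary failure.
try {
  fs.appendFileSync("/tmp/demo-raw.log", "[mcp:demo] [demo-plugin] fatal: captured\n");
} catch {
  /* never mask the primary failure */
}

process.exit(1); // the async write above is dropped; the sync write landed
```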
130
130
 
131
131
  **Two layers now close the gap, each load-bearing on its own:**
132
132
 
133
- 1. **Plugin-side sync-write discipline.** Plugins that call `process.exit()` during module load (rare — `graph-mcp` is the in-tree example; it spawns a child at boot to proxy upstream stdio) use `fs.appendFileSync` at every named exit path to guarantee the cause lands in both log destinations before exit. Lines follow the `[mcp:<name>] [<plugin-prefix>] <cause>` format so existing `grep '[mcp:<name>]'` investigator paths work. Each destination is wrapped in its own try/catch — an unwritable log must not mask the primary failure. This is the discipline propagated from Task 560 to any plugin author who knows their failure paths.
133
+ 1. **Plugin-side sync-write discipline.** Plugins that call `process.exit` during module load (rare — `graph-mcp` is the in-tree example; it spawns a child at boot to proxy upstream stdio) use `fs.appendFileSync` at every named exit path to guarantee the cause lands in both log destinations before exit. Lines follow the `[mcp:<name>] [<plugin-prefix>] <cause>` format so existing `grep '[mcp:<name>]'` investigator paths work. Each destination is wrapped in its own try/catch — an unwritable log must not mask the primary failure. This is the discipline propagated to any plugin author who knows their failure paths.
134
134
 
135
- 2. **Parent-side `mcp-spawn-tee` wrapper (Task 743).** Every node-based core MCP server is spawned via the `lib/mcp-spawn-tee` wrapper rather than `node <entry>` directly. The wrapper spawns the real entry with `stdio: ['inherit', 'inherit', 'pipe']` and writes child stderr chunks to `${LOG_DIR}/mcp-${name}-stderr-<date>.log` via `appendFileSync` while passing the same chunks through to its own stderr (Claude Code's consumer is unchanged). Synchronous `appendFileSync` survives `process.exit`, so the per-server file captures even (a) module-load throws before `initStderrTee` runs, (b) `MODULE_NOT_FOUND` on the entry script itself, and (c) anything else a plugin author missed. The wrapper writes `[mcp-spawn-tee-attached] server=<name> pid=<n>` on attach and forwards SIGTERM/SIGINT to the child. This is the layer that makes capture independent of plugin discipline. Playwright stays unwrapped because it spawns via `npx`, not `node`.
135
+ 2. **Parent-side `mcp-spawn-tee` wrapper.** Every node-based core MCP server is spawned via the `lib/mcp-spawn-tee` wrapper rather than `node <entry>` directly. The wrapper spawns the real entry with `stdio: ['inherit', 'inherit', 'pipe']` and writes child stderr chunks to `${LOG_DIR}/mcp-${name}-stderr-<date>.log` via `appendFileSync` while passing the same chunks through to its own stderr (Claude Code's consumer is unchanged). Synchronous `appendFileSync` survives `process.exit`, so the per-server file captures even (a) module-load throws before `initStderrTee` runs, (b) `MODULE_NOT_FOUND` on the entry script itself, and (c) anything else a plugin author missed. The wrapper writes `[mcp-spawn-tee-attached] server=<name> pid=<n>` on attach and forwards SIGTERM/SIGINT to the child. This is the layer that makes capture independent of plugin discipline. Playwright stays unwrapped because it spawns via `npx`, not `node`.
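A minimal sketch of this layer-2 wrapper pattern, under assumed names and paths (not the shipped `lib/mcp-spawn-tee` source):

```ts
import { spawn } from "node:child_process";
import { appendFileSync } from "node:fs";

const name = process.env.MCP_NAME ?? "demo";
const entry = process.argv[2] ?? "dist/index.js";
const logFile = `${process.env.LOG_DIR ?? "/tmp"}/mcp-${name}-stderr-${new Date()
  .toISOString()
  .slice(0, 10)}.log`;

// stdin/stdout pass straight through so the MCP stdio transport is untouched;
// only stderr is piped for capture.
const child = spawn(process.execPath, [entry], {
  stdio: ["inherit", "inherit", "pipe"],
});

appendFileSync(logFile, `[mcp-spawn-tee-attached] server=${name} pid=${child.pid}\n`);

// Sync append survives exits; chunks also pass through to our own stderr so
// the parent consumer sees exactly what it would without the wrapper.
child.stderr!.on("data", (chunk: Buffer) => {
  try {
    appendFileSync(logFile, chunk);
  } catch {
    /* do not mask the child's failure */
  }
  process.stderr.write(chunk);
});

// Forward termination signals so the child shuts down with us.
for (const sig of ["SIGTERM", "SIGINT"] as const) {
  process.on(sig, () => child.kill(sig));
}

child.on("exit", (code, signal) => {
  appendFileSync(logFile, `[mcp-spawn-tee-exit] server=${name} code=${code}|signal=${signal}\n`);
  process.exit(code ?? 0);
});
```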
136
136
 
137
137
  A third layer closes the same gap from the platform side: when `claude-agent.ts` observes an `init` event with any MCP server reporting `status:"failed"`, it reads the last 512 bytes of `${LOG_DIR}/mcp-<name>-stderr-<date>.log` and emits `[mcp-init-error] server=<name> tail=<quoted>` into the stream log. Absent file → `tail="(no stderr file)"`; empty file → `tail="(empty)"`. With the spawn-tee wrapper now interposing on every core MCP, `tail="(no stderr file)"` post-Task-743 means the wrapper itself is broken — file follow-up.
138
138
 
139
139
  **Signal inventory after a failed session:** `[init] FAILED MCP servers: <names>` (names), `[mcp-init-error] server=<name> tail=…` (cause for each, from the platform's tail probe), `[mcp-spawn-tee-attached] server=<name> pid=<n>` (proof the wrapper attached), `[mcp-spawn-tee-exit] server=<name> code=<n>|signal=<s>` (proof the wrapper saw the exit), and optionally `[mcp:<name>] [<plugin>] …` from plugin-side sync-writes. Their union gives the investigator three independent sources for the same failure.
140
140
 
141
- **Boot-smoke as publish-time gate (Task 743).** The memory MCP carries `scripts/boot-smoke.sh` that spawns `dist/index.js` with stub env, sleeps 2s, asserts `kill -0 <pid>`, and reports `[boot-smoke] memory ok|FAILED tail=<n-lines>`. Wired to `prepublish` in `plugins/memory/mcp/package.json`. The pattern is propagatable to other plugin MCPs — it's deliberately not generalised yet because each plugin's stub-env requirements differ (memory needs ACCOUNT_ID + PLATFORM_ROOT + NEO4J_URI + SESSION_ID; others differ).
141
+ **Boot-smoke as publish-time gate.** The memory MCP carries `scripts/boot-smoke.sh` that spawns `dist/index.js` with stub env, sleeps 2s, asserts `kill -0 <pid>`, and reports `[boot-smoke] memory ok|FAILED tail=<n-lines>`. Wired to `prepublish` in `plugins/memory/mcp/package.json`. The pattern is propagatable to other plugin MCPs — it's deliberately not generalised yet because each plugin's stub-env requirements differ (memory needs ACCOUNT_ID + PLATFORM_ROOT + NEO4J_URI + SESSION_ID; others differ).
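The same check as a TypeScript sketch, with stub env values as illustrative placeholders mirroring what `scripts/boot-smoke.sh` sets:

```ts
import { spawn } from "node:child_process";
import { setTimeout as sleep } from "node:timers/promises";

async function bootSmoke(): Promise<void> {
  const child = spawn("node", ["dist/index.js"], {
    env: {
      ...process.env,
      ACCOUNT_ID: "smoke", // stub values; requirements differ per plugin
      PLATFORM_ROOT: "/tmp/smoke",
      NEO4J_URI: "bolt://localhost:7687",
      SESSION_ID: "smoke",
    },
    stdio: ["ignore", "ignore", "pipe"],
  });
  const stderr: Buffer[] = [];
  child.stderr.on("data", (c: Buffer) => stderr.push(c));

  await sleep(2000); // give module load time to crash, if it is going to

  if (child.exitCode === null) {
    // Still running: the kill -0 equivalent.
    console.log("[boot-smoke] memory ok");
    child.kill("SIGTERM");
  } else {
    const tail = Buffer.concat(stderr).toString().split("\n").slice(-5).join("\n");
    console.error(`[boot-smoke] memory FAILED tail=${tail}`);
    process.exit(1);
  }
}

bootSmoke();
```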