@rubytech/create-realagent 1.0.814 → 1.0.816
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/package.json +1 -1
- package/payload/platform/plugins/admin/PLUGIN.md +4 -2
- package/payload/platform/plugins/admin/skills/onboarding/SKILL.md +2 -0
- package/payload/platform/plugins/cloudflare/PLUGIN.md +2 -2
- package/payload/platform/plugins/docs/references/cloudflare.md +5 -3
- package/payload/platform/plugins/docs/references/deployment.md +12 -12
- package/payload/platform/plugins/docs/references/internals.md +24 -24
- package/payload/platform/plugins/docs/references/memory-guide.md +1 -1
- package/payload/platform/plugins/docs/references/outlook-guide.md +1 -1
- package/payload/platform/plugins/docs/references/plugins-guide.md +8 -8
- package/payload/platform/plugins/docs/references/troubleshooting.md +38 -38
- package/payload/platform/plugins/memory/PLUGIN.md +4 -4
- package/payload/platform/plugins/whatsapp/PLUGIN.md +10 -4
- package/payload/platform/plugins/whatsapp/mcp/dist/index.js +80 -0
- package/payload/platform/plugins/whatsapp/mcp/dist/index.js.map +1 -1
- package/payload/platform/plugins/whatsapp-import/PLUGIN.md +4 -4
- package/payload/platform/templates/agents/admin/IDENTITY.md +2 -0
- package/payload/platform/templates/specialists/agents/personal-assistant.md +2 -2
- package/payload/server/chunk-42BMMSRN.js +10066 -0
- package/payload/server/chunk-52LIWKMM.js +1032 -0
- package/payload/server/chunk-UYLZDEMC.js +1114 -0
- package/payload/server/chunk-Y3UQFQM7.js +10067 -0
- package/payload/server/client-pool-AIYWSJBR.js +31 -0
- package/payload/server/client-pool-BMPFHXHB.js +31 -0
- package/payload/server/maxy-edge.js +2 -2
- package/payload/server/public/assets/{Checkbox-DZxF6s72.js → Checkbox-CTGhpDKq.js} +1 -1
- package/payload/server/public/assets/{admin-DfuHw_TQ.js → admin-Cxtmv0wo.js} +60 -60
- package/payload/server/public/assets/data-Y77FLKjs.js +1 -0
- package/payload/server/public/assets/graph-C4-jEPDE.js +1 -0
- package/payload/server/public/assets/{jsx-runtime-Cb4WEnIV.css → jsx-runtime-D4WovFYk.css} +1 -1
- package/payload/server/public/assets/{page-BLanFGXC.js → page-DkBfWy4C.js} +1 -1
- package/payload/server/public/assets/{page-DuwlF8N5.js → page-zuI00fuC.js} +1 -1
- package/payload/server/public/assets/{public-BqeUfasT.js → public-BdVIVpv8.js} +1 -1
- package/payload/server/public/assets/{useAdminFetch-DLGqK3Fs.js → useAdminFetch-DmHu0oCx.js} +1 -1
- package/payload/server/public/assets/{useVoiceRecorder-CHPkBGVV.js → useVoiceRecorder-CSc_hxjV.js} +1 -1
- package/payload/server/public/data.html +5 -5
- package/payload/server/public/graph.html +6 -6
- package/payload/server/public/index.html +8 -8
- package/payload/server/public/public.html +5 -5
- package/payload/server/server.js +225 -9
- package/payload/server/public/assets/data-BO12TydY.js +0 -1
- package/payload/server/public/assets/graph-XIePTyWQ.js +0 -1
- /package/payload/server/public/assets/{jsx-runtime-BZMOvG0f.js → jsx-runtime-DkaAusaX.js} +0 -0
package/package.json
CHANGED
|
@@ -51,9 +51,11 @@ Platform management tools for both the admin and public agents. The `plugin-read
|
|
|
51
51
|
|
|
52
52
|
Tools are available via the `admin` MCP server.
|
|
53
53
|
|
|
54
|
-
**Three-store admin auth invariant
|
|
54
|
+
**Three-store admin auth invariant.** `admin-add` writes to all three identity stores (`users.json` PIN auth, `account.json` `admins[]` role, Neo4j `:AdminUser`/`:Person` graph identity) with per-leg `[admin-auth-store]` log lines and returns `is_error: true` on any leg failure naming what's already written. `admin-update-pin` writes `users.json` only and emits the same line. Direct `Edit`/`Write` on `account.json` is blocked at the `pre-tool-use` hook — mutations go through `account-update`, `plugin-toggle-enabled`, or the `admin-*` tools. See `.docs/agents.md` § "Three-store admin auth invariant" for the full contract.
|
|
55
55
|
|
|
56
|
-
`logs-read { type: "agent-stream" }`
|
|
56
|
+
`logs-read { type: "agent-stream" }` is the canonical name for the per-conversation tool-use/tool-result archive previously called `system`; both names work and the legacy alias is preserved.
|
|
57
|
+
|
|
58
|
+
**Stream log is the primary diagnostic surface for an in-progress conversation.** The parent-process console fan-out tee in [`platform/ui/app/lib/claude-agent/logging.ts`](../../ui/app/lib/claude-agent/logging.ts) appends every `[<tag>]`-prefixed `console.error` / `console.log` line to every active conversation's stream log alongside `server.log`. For diagnosing an in-conversation issue (WhatsApp inbound, Cloudflare action, persist write, baileys error), call `logs-read { type: "agent-stream" }` first — the stream log carries both the agent lifecycle AND the parent-process events that occurred during the session window. `logs-read { type: "server" }` becomes the cross-session escape hatch (filtering across conversations or for events outside any session window), not the default.
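A minimal sketch of the fan-out behaviour described above, assuming a tag-prefix filter; the `makeTee` helper and in-memory sinks are illustrative, not the actual `logging.ts` implementation:

```typescript
// Illustrative console fan-out tee: every "[tag]"-prefixed line goes to
// every registered sink (hypothetical shape, not the platform's code).
type Sink = (line: string) => void;

function makeTee(sinks: Sink[]): (...args: unknown[]) => void {
  return (...args: unknown[]) => {
    const line = args.map(String).join(" ");
    // Only fan out tagged lines like "[baileys] ..." or "[whatsapp] ..."
    if (/^\[[^\]]+\]/.test(line)) {
      for (const sink of sinks) sink(line);
    }
  };
}

// Two in-memory sinks stand in for server.log and one conversation's
// stream log.
const serverLog: string[] = [];
const streamLog: string[] = [];
const teedError = makeTee([l => serverLog.push(l), l => streamLog.push(l)]);

teedError("[baileys]", "socket closed");
teedError("untagged line"); // no tag prefix, not fanned out
```

In the real tee the stream-log sink set changes as conversations start and end, which is what makes the stream log carry the parent-process events that fall inside the session window.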
|
|
57
59
|
|
|
58
60
|
## Skills
|
|
59
61
|
|
|
@@ -6,6 +6,8 @@ After completing each step, persist progress immediately by calling `onboarding-
|
|
|
6
6
|
|
|
7
7
|
Every user selection and every document presentation in onboarding renders through `render-component` — never as markdown text, bullet lists, or inline chat content. Selections use `single-select` or `multi-select`; file content (SOUL, KNOWLEDGE, config) uses `document-editor`. Describing options in prose or pasting file content into a chat message bypasses the UI and the user's ability to review and edit before saving. Never use directional words ("above", "below") when referring to a rendered component — just name it or refer to "the browser", "the editor", etc. The `document-editor` component renders one file at a time — never call `render-component` with `document-editor` twice in the same turn. Present one file, wait for the user's approval or edit, then present the next.
|
|
8
8
|
|
|
9
|
+
**Turn-completion contract (Task 879 §C).** Any onboarding turn that advances `currentStep` (via `onboarding-complete-step`) OR narrates a step transition with phrases like "Moving to step N", "Proceeding to step N", "Step N done" MUST end with one of: a `render-component` call surfacing the next step's UI, OR a `?`-terminated question. Bare prose statements like "Moving to step 9 — operator persona and profile bootstrap." with nothing after are forbidden — they leave the operator with no actionable surface and force a manual nudge. The post-restart resume contract for step 7 is the canonical case: when the cloudflare-relay turn arrives at `currentStep=7`, the agent's first turn must acknowledge AND immediately render the step-8 prompt (or `anthropic-setup`'s next surface) without an intervening dead-end paragraph. Same rule applies turn-by-turn through step 9. The `assistant-step-advance-deadend` review-detector rule (`platform/ui/app/lib/review-detector/rules.ts`) fires when this contract is violated, so an agent that drifts back to bare prose surfaces a `[review-detector]` alert against itself.
|
|
10
|
+
|
|
9
11
|
## Step 1 — Plugin selection
|
|
10
12
|
|
|
11
13
|
*(skip if `currentStep` >= 1)*
|
|
@@ -24,13 +24,13 @@ Each installation has its own Cloudflare account. The operator signs in with OAu
|
|
|
24
24
|
|
|
25
25
|
## Operator-facing surface
|
|
26
26
|
|
|
27
|
-
The plugin registers no agent-facing MCP tools
|
|
27
|
+
The plugin registers no agent-facing MCP tools. Every Cloudflare operation is driven through one of four sanctioned surfaces — `setup-tunnel.sh`, `reset-tunnel.sh`, `references/manual-setup.md`, or `references/dashboard-guide.md`. See the skill below for the discipline rule that binds the agent to these four.
|
|
28
28
|
|
|
29
29
|
### Scripts
|
|
30
30
|
|
|
31
31
|
| Script | Purpose |
|
|
32
32
|
|---|---|
|
|
33
|
-
| [`scripts/setup-tunnel.sh`](scripts/setup-tunnel.sh) | Autonomous end-to-end setup: OAuth login, tunnel create, DNS route, config + state, service restart, post-restart verification. Invocation: `~/setup-tunnel.sh <brand> <port> <admin-hostname> [<public-hostname>] [<apex-hostname>]`. Apex hostnames print an `ACTION REQUIRED` block for the dashboard record the CLI cannot create. Step 1 (
|
|
33
|
+
| [`scripts/setup-tunnel.sh`](scripts/setup-tunnel.sh) | Autonomous end-to-end setup: OAuth login, tunnel create, DNS route, config + state, service restart, post-restart verification. Invocation: `~/setup-tunnel.sh <brand> <port> <admin-hostname> [<public-hostname>] [<apex-hostname>]`. Apex hostnames print an `ACTION REQUIRED` block for the dashboard record the CLI cannot create. Step 1 (earlier platform fixes — wrappers faithfully relay third-party CLI) spawns `cloudflared tunnel login`, extracts the argotunnel URL from its stdout, mechanically opens it on the Pi VNC chromium (`DISPLAY=${DISPLAY:-:99} /usr/bin/chromium <url> &`), then polls for `~/.cloudflared/cert.pem` while the operator clicks the zone row + Authorize on the VNC. 180 s budget with a 2-second `step=oauth-login result=awaiting-cert` heartbeat. No CDP auto-click, no DOM matcher. |
|
|
34
34
|
| [`scripts/reset-tunnel.sh`](scripts/reset-tunnel.sh) | Deletes every tunnel on the brand's CF account and wipes `${CFG_DIR}`. Does not touch the platform service, stray CNAMEs, or token-mode connectors — those require dashboard cleanup or `pkill`. Invocation: `~/reset-tunnel.sh <brand>`. No polling blocks — every long-wait is bounded by `cloudflared`'s network round-trip, so no heartbeat contract applies. |
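The bounded-poll contract that `setup-tunnel.sh` follows (finite budget, periodic heartbeat, loud failure on exhaustion) can be sketched generically. The `pollUntil` helper below is hypothetical and takes an injected clock so the logic runs without real sleeps; the script itself waits 2 s between polls:

```typescript
// Hypothetical bounded-poll helper mirroring the 180 s budget / 2 s
// heartbeat contract (illustrative only, not the shell script).
function pollUntil(
  check: () => boolean,
  budgetMs: number,
  now: () => number,
  onTick: (elapsedMs: number) => void,
): boolean {
  const start = now();
  while (now() - start < budgetMs) {
    if (check()) return true;   // e.g. cert.pem has appeared on disk
    onTick(now() - start);      // heartbeat, e.g. result=awaiting-cert
  }
  return false;                 // budget exhausted; caller fails loudly
}

// Fake clock advancing 2 units per read; the "cert" appears after
// three failed polls.
let t = 0;
let polls = 0;
const heartbeats: number[] = [];
const ok = pollUntil(() => ++polls > 3, 180, () => (t += 2), ms => heartbeats.push(ms));
```

The key property is that the operator-visible wait is always bounded and always observable, which is why `reset-tunnel.sh` (no polling blocks) carries no heartbeat contract.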
|
|
35
35
|
|
|
36
36
|
### Skills
|
|
@@ -8,7 +8,7 @@ Each installation has its own Cloudflare account. Sign-in is OAuth in the device
|
|
|
8
8
|
|------|--------|
|
|
9
9
|
| **Product identity** (Maxy vs Real Agent) | `brand.json` (`productName`, `configDir`) — known at install. |
|
|
10
10
|
| **Cloudflare account identity** | `cert.pem` from OAuth. One account per brand per device. |
|
|
11
|
-
| **Domain scope** (which zones the operator can route) | Live Cloudflare dashboard at form-render time via `list-cf-domains.sh`, not `brand.json`. Brand identity has no authority over which domains the operator's CF account holds. When the scrape returns an unexpected count (e.g. 1 on a two-zone account), the stream log's per-poll `phase=dom-scrape-poll n=<k> count=<n> domains=[…]` trajectory + the on-disk HTML dump at `~/{configDir}/logs/list-cf-domains-<ts>-count<n>-<mode>-pid<pid>.html` (
|
|
11
|
+
| **Domain scope** (which zones the operator can route) | Live Cloudflare dashboard at form-render time via `list-cf-domains.sh`, not `brand.json`. Brand identity has no authority over which domains the operator's CF account holds. When the scrape returns an unexpected count (e.g. 1 on a two-zone account), the stream log's per-poll `phase=dom-scrape-poll n=<k> count=<n> domains=[…]` trajectory + the on-disk HTML dump at `~/{configDir}/logs/list-cf-domains-<ts>-count<n>-<mode>-pid<pid>.html` (earlier platform fixes — written on every scrape outcome, not just empty ones) give the operator everything they need to triage the cause without re-running. |
|
|
12
12
|
| **Local tunnel state** | `~/{configDir}/cloudflared/` — `cert.pem`, `<UUID>.json`, `config.yml`, `tunnel.state`, `alias-domains.json`. |
|
|
13
13
|
|
|
14
14
|
There is no token-based auth for the operator-owned path (Mode A). To switch Cloudflare accounts, run `reset-tunnel.sh` (which deletes the cert and every tunnel on the current account), then run `setup-tunnel.sh` again — `cloudflared tunnel login` inside the setup script will pick a fresh account when you sign in.
|
|
@@ -22,7 +22,7 @@ Ask the agent to set up Cloudflare. The agent first confirms the domain is alrea
|
|
|
22
22
|
- **Proxy apex** — optional bare-domain hostname (e.g. `yourdomain.com`) that should also serve the public agent.
|
|
23
23
|
- **Admin password** — the password used to gate remote access to the admin surface.
|
|
24
24
|
|
|
25
|
-
When you submit, the `/api/admin/cloudflare/setup` endpoint runs — in strict order — `setRemotePassword`, launches a `cloudflare-setup` action (
|
|
25
|
+
When you submit, the `/api/admin/cloudflare/setup` endpoint runs — in strict order — `setRemotePassword`, launches a `cloudflare-setup` action (earlier platform fixes: `systemd-run --user` transient unit wrapping `setup-tunnel.sh <brand> <port> <hostname...>`), and registers a post-exit handler to write alias-domains for every non-`public.*` public or apex hostname (so e.g. `chat.yourdomain.com` is classified as public by `isPublicHost`). The script runs end-to-end:
|
|
26
26
|
|
|
27
27
|
- `cloudflared tunnel login` — OAuth browser sign-in. The VNC browser opens the Cloudflare authorize page; pick the account that owns your domain, click Authorize. `cert.pem` lands.
|
|
28
28
|
- Tunnel creation under the naming convention `{brand}-{hostname}` (e.g. `maxy-neo`). Stream log emits `step=tunnel-resolve action=reused|created` once the UUID is known so the admin agent can see which tunnel the later steps will write against.
|
|
@@ -30,7 +30,9 @@ When you submit, the `/api/admin/cloudflare/setup` endpoint runs — in strict o
|
|
|
30
30
|
- `cloudflared tunnel route dns` for each subdomain hostname. Apex hostnames cannot be routed this way — the script prints an **ACTION REQUIRED** block naming the exact dashboard record to add or edit. Stream log emits `step=route-dns hostname=… tunnel_id=…` before the call and `step=route-dns hostname=… result=ok|apex-skip|error` after; on error the bounded cloudflared stderr (≤400 chars) rides in the same phase line. **The script does not parse cloudflared's stdout** — exit code is the sole decision signal, so all three legitimate cloudflared output shapes (new record, overwrite, idempotent "already configured") are treated as success.
|
|
31
31
|
- `config.yml` and `tunnel.state` written under `${CFG_DIR}`.
|
|
32
32
|
- **Step-7 onboarding completion persisted** — the script writes `${ACCOUNT_DIR}/onboarding/step7-complete` (a JSON marker with the completion timestamp and tunnel ID) before arming the restart. Stream log: `step=onboarding-persist result=ok|error reason=<r>`. The marker is consumed by the next admin session's first state read and advances `OnboardingState.currentStep` to 7. Without this, the service restart below would SIGTERM the admin agent before it could persist step-7 completion, and the next session would re-ask the Cloudflare question you just finished. Both invocation surfaces (the form-driven action and the agent-via-Bash path) declare `ACCOUNT_DIR` explicitly because `systemd-run --user` does not inherit parent env — when ACCOUNT_DIR isn't reaching the script you'll see `result=skipped reason=no-account-dir` in the stream log instead of `result=ok`.
|
|
33
|
-
- **Chat-relay queued for the operator's "Cloudflare setup completed" turn**
|
|
33
|
+
- **Chat-relay queued for the operator's "Cloudflare setup completed" turn** — when the form's ActionLogPanel reports `code=0`, the form fires `POST /api/admin/cloudflare/relay-completion`. The route enqueues a record at `${ACCOUNT_DIR}/queue/action-completion-relay-<actionId>.json` (write-once via Node's `wx`/O_EXCL flag) BEFORE the brand restart kills the in-flight admin agent. After the restart, the brand service's boot-drain hook ([server/index.ts](../../../../platform/ui/server/index.ts)) consumes the queue once and dispatches a server-driven agent turn via a synthetic one-shot session bound to the queued conversationId; the agent's hoisted user-message persist writes `role=user` BEFORE the SDK invoke so the operator's relay survives even if SIGTERM hits mid-generation. The diagnostic line you grep on the working path is `[action-completion-relay] phase=consumed actionId=<id> conversationId=<cid> ageMs=<n> outcome=injected`. Failure modes are surfaced by the `cloudflare-setup-relay-not-acknowledged` review rule.
|
|
34
|
+
- **Chat-surface restart-pending banner** — the same `admin-chat:await-relay` CustomEvent that registers the relay-poll now also carries `reason: 'cloudflare-setup'` so the chat hook ([useAdminChat.ts](../../../../platform/ui/app/useAdminChat.ts)) renders an inline `"Service restarting after Cloudflare setup — picking back up…"` banner the moment the form fires. Closes the visible-silence window between the form's `Completed · 20s` and the first post-restart agent token. Idempotent on duplicate dispatch. Copy is keyed by `reason` so future restart sources (plugin-install, npm-update) can plug in their own banner without inventing a new chat surface state. Generic fallback `"Service restarting — reconnect will happen automatically."` is used when the dispatch omits `reason`.
|
|
35
|
+
- **Client sessionKey rebind on first post-restart poll** — when the relay-poll observes its first `200` for the captured cid (server-side cookie-bridge has just hydrated `(accountId, userId)` onto the wiped sessionStore entry), the chat hook fires `POST /api/admin/session/rebind` exactly once with `{session_key, lastKnownConversationId}`. The endpoint validates accountId scope via `getConversationOwner(cid).accountId === sessionAccountId` and binds the conversationId to the session via `setConversationIdForSession`. `sendMessage` awaits the in-flight rebind promise before opening the next chat POST, closing the silent-fork race where the operator's next turn would otherwise create a NEW Conversation and the `[admin/conversation-flush] result=missing-userId|writer-failed` line never reached the chat surface. Diagnostic: `grep '\[admin/session/rebind\]' ~/{configDir}/logs/server.log` — expects `result=ok conversationId=<cid8>` once per restart cycle; `result=conflict` means the server holds a different canonical cid (client adopts it).
|
|
34
36
|
- `systemctl --user restart ${BRAND}.service` — restarts the platform service so the new tunnel spawns via the service's `ExecStartPre=resume-tunnel.sh`.
|
|
35
37
|
- Post-restart verification — `ps -ef | grep '[c]loudflared'` confirms the connector is alive, then `curl -I https://<hostname>` against each subdomain (up to 60 s per host) confirms a non-530 response.
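The write-once queue record in the chat-relay step above relies on Node's `wx` open flag, which maps to `O_CREAT|O_EXCL`. A minimal sketch, using a temp directory in place of the real `${ACCOUNT_DIR}/queue/` path; `enqueueOnce` is a hypothetical helper name:

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Write-once enqueue: with flag "wx" a second writer racing on the same
// actionId fails with EEXIST instead of clobbering the first record.
function enqueueOnce(file: string, record: object): boolean {
  try {
    fs.writeFileSync(file, JSON.stringify(record), { flag: "wx" });
    return true;                             // we won the race
  } catch (err: any) {
    if (err.code === "EEXIST") return false; // already queued, no-op
    throw err;
  }
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), "relay-"));
const file = path.join(dir, "action-completion-relay-test.json");
const first = enqueueOnce(file, { actionId: "test", conversationId: "c1" });
const second = enqueueOnce(file, { actionId: "test", conversationId: "c2" });
```

Because the record is durable on disk before the restart fires, the boot-drain consumer can pick it up even when SIGTERM lands mid-generation.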
|
|
36
38
|
|
|
@@ -75,27 +75,27 @@ sudo journalctl -u maxy -n 50
|
|
|
75
75
|
The logs will show which service failed to start and why. Common causes:
|
|
76
76
|
|
|
77
77
|
- **Neo4j not started** — run `sudo systemctl start neo4j` and retry
|
|
78
|
-
- **Port 19200 already in use** — check for another process: `lsof -i
|
|
78
|
+
- **Port 19200 already in use** — check for another process: `lsof -i:19200`
|
|
79
79
|
- **Claude OAuth expired** — the next admin session will prompt you to re-authenticate
|
|
80
|
-
- **NEO4J_URI guard throws** — the admin agent probes device reality at boot and fails closed on three shapes (
|
|
80
|
+
- **NEO4J_URI guard throws** — the admin agent probes device reality at boot and fails closed on three shapes (earlier platform fixes):
|
|
81
81
|
- `no Neo4j listening on [ports]` — nothing is bound; start `neo4j.service` or `neo4j-<brand>.service`, or edit `NEO4J_URI` to a port a Neo4j is actually running on.
|
|
82
|
-
- `port
|
|
83
|
-
- `port
|
|
82
|
+
- `port:X not listening; only:Y is live` — single-brand device where `.env` names a port the local Neo4j isn't bound to; edit `NEO4J_URI` in `~/{configDir}/.env` to match the live port (shown in the `[neo4j-probe] listening=[…]` log line).
|
|
83
|
+
- `port:X disagrees with brand.json neo4jPort:Y` — co-tenant device (2+ Neo4js listening) where `.env` names the other brand's port; edit `NEO4J_URI` to match `brand.neo4jPort`, or correct `neo4jPort` in `brand.json` and reinstall. Preserves the orphan-write protection from earlier platform fixes on multi-brand devices.
|
|
84
84
|
|
|
85
85
|
## Systemd units on each device
|
|
86
86
|
|
|
87
|
-
Each installed brand runs two per-brand `--user` systemd units (
|
|
87
|
+
Each installed brand runs two per-brand `--user` systemd units (earlier platform fixes — unit filenames are prefixed with the brand's `hostname` so two brands on the same device never share a unit file):
|
|
88
88
|
|
|
89
|
-
- `{hostname}.service` — the admin + public HTTP server on `127.0.0.1:19201` (public port + 1). Restarted by the upgrade flow; short downtime is expected during steps 8→11 of an upgrade.
|
|
90
|
-
- `{hostname}-edge.service` — the always-on public listener on the configured port (default 19200). Reverse-proxies HTTP to the main brand service and handles `/websockify` (VNC) WebSocket upgrades locally.
|
|
89
|
+
- `{hostname}.service` — the admin + public HTTP server on `127.0.0.1:19201` (public port + 1). Restarted by the upgrade flow; short downtime is expected during steps 8→11 of an upgrade. An earlier fix: the unit carries two port env vars — `PORT=<public>` (canonical public port, read by the upgrade detector) and `MAXY_UI_INTERNAL_PORT=<public+1>` (the port maxy-ui actually binds).
|
|
90
|
+
- `{hostname}-edge.service` — the always-on public listener on the configured port (default 19200). Reverse-proxies HTTP to the main brand service and handles `/websockify` (VNC) WebSocket upgrades locally. An earlier fix: also hosts `/api/admin/actions/*` and `/api/admin/version*` — the Software Update modal's own routes — so the log stream survives the brand service's restart window. Does NOT restart during an upgrade — the browser WebSocket stays connected by construction.
|
|
91
91
|
|
|
92
|
-
**Port-drift recovery
|
|
92
|
+
**Port-drift recovery.** Devices upgraded between Tasks 647 and 666 may have drifted +1 on every upgrade because the pre-Task-666 installer wrote `Environment=PORT=<internal>` into `{hostname}.service` and the upgrade reader correctly treated `PORT=` as public. The first post-Task-666 install detects this (comparing maxy's PORT against the edge's EDGE_PORT) and emits a one-shot loud log: `[port-recovery] detected drift maxy=<X> edge=<Y> — pinning at <Y>`. Subsequent upgrades are silent. If your Cloudflare tunnel was pointing at a drifted port, the ingress `config.yml` still needs a one-time manual fix: `sed -i 's|localhost:<old>|localhost:<current>|' ~/{configDir}/cloudflared/config.yml && cloudflared tunnel ingress validate`. {{productName}} never rewrites cloudflared config programmatically.
|
|
93
93
|
|
|
94
|
-
Upgrade and Cloudflare setup
|
|
94
|
+
Upgrade and Cloudflare setup run as detached actions: `systemd-run --user` transient units per invocation with stdout+stderr persisted to `~/.maxy/logs/actions/<actionId>.log` and streamed to the UI via SSE. No boot-time service file exists for these.
|
|
95
95
|
|
|
96
96
|
If an action looks stuck, read `~/.maxy/logs/actions/<actionId>.log` directly for the full output, or `journalctl --user --identifier=maxy-action-<actionId>` for systemd's record.
|
|
97
97
|
|
|
98
|
-
**Pre-Task-662 / pre-Task-664 upgrade** — devices that ran an installer
|
|
98
|
+
**Pre-Task-662 / pre-Task-664 upgrade** — devices that ran a pre-Task-662 installer have legacy shared `maxy-edge.service` / `maxy-ttyd.service` units; devices that ran a pre-Task-664 installer have per-brand `{hostname}-ttyd.service` units plus a pinned `/usr/local/bin/ttyd` binary. Neither is removed automatically — do this cleanup once per device before re-running any installer:
|
|
99
99
|
|
|
100
100
|
```bash
|
|
101
101
|
systemctl --user stop maxy-edge maxy-ttyd realagent-ttyd 2>/dev/null || true
|
|
@@ -112,8 +112,8 @@ systemctl --user daemon-reload
|
|
|
112
112
|
A single Pi or laptop can host more than one brand (for example Maxy and Real Agent) side by side. Each brand runs as its own service on its own port, with its own install directory and its own data. Installing one brand does not touch the other.
|
|
113
113
|
|
|
114
114
|
- **Separate:** each brand has its own install folder (`~/maxy/`, `~/realagent/`), its own config folder (`~/.maxy/`, `~/.realagent/`), its own web port, its own Cloudflare tunnel state, its own edge systemd unit (`maxy-edge.service` vs `realagent-edge.service`), and by default its own Neo4j database (Maxy on bolt port 7687, Real Agent on 7688). Action runner units are transient and per-invocation, not per-brand, so no naming conflict is possible.
|
|
115
|
-
- **Brand-isolated Neo4j
|
|
116
|
-
- **Peer-aware system-unit guard
|
|
115
|
+
- **Brand-isolated Neo4j:** when a brand provisions a dedicated Neo4j instance (any port other than 7687), the installer stops and disables the apt-package's system `neo4j.service` after enabling the brand-dedicated unit, so only one Neo4j process holds the shared `/var/lib/neo4j/run/` PID file. The seed step receives the brand-correct `NEO4J_URI` and `NEO4J_PASSWORD` as explicit environment variables — the seed script no longer carries a `bolt://localhost:7687` default. A failed dedicated start aborts the install loudly with a journalctl tail; there is no silent fallback to the system instance. Stop/disable targets the literal `neo4j.service` only, so peer brands running their own `neo4j-{brand}.service` are unaffected.
|
|
116
|
+
- **Peer-aware system-unit guard:** before stopping the system `neo4j.service`, the installer checks whether any other brand on the device still depends on it — that is, has `NEO4J_URI=bolt://localhost:7687` in its `~/.<peer>/.env`. If so, the system unit is left enabled and active, and the install log shows `[neo4j] system unit kept active — peer brand <name> depends on port 7687` instead of the usual `[neo4j] disabling system unit` line. This prevents a `create-realagent` install from disabling Maxy's database on a host where Maxy still uses the shared system instance (the reproducer from earlier platform fixes on Neo's laptop, 2026-04-28). On single-brand hosts and on multi-brand hosts where every peer runs a dedicated port, behaviour is unchanged.
|
|
117
117
|
- **Shared:** both brands share the system Chromium/VNC stack, the Ollama model server, and the `cloudflared` command itself. Browser automation is serialised — one admin session at a time across both brands.
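The peer-aware guard's dependency check amounts to scanning peer `.env` files for the shared bolt port. The sketch below is an illustrative TypeScript rendering (the real installer performs the check in shell, and `peersOnSystemNeo4j` is a hypothetical name):

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// A peer brand depends on the system Neo4j when its .env pins
// NEO4J_URI to the shared bolt port 7687 (illustrative sketch).
function peersOnSystemNeo4j(peerEnvFiles: string[]): string[] {
  return peerEnvFiles.filter(envPath => {
    if (!fs.existsSync(envPath)) return false;
    const env = fs.readFileSync(envPath, "utf8");
    return /^NEO4J_URI=bolt:\/\/localhost:7687\s*$/m.test(env);
  });
}

// Fake two-brand layout: one peer on the shared port, one dedicated.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "brands-"));
fs.mkdirSync(path.join(root, ".maxy"));
fs.mkdirSync(path.join(root, ".realagent"));
fs.writeFileSync(path.join(root, ".maxy", ".env"), "NEO4J_URI=bolt://localhost:7687\n");
fs.writeFileSync(path.join(root, ".realagent", ".env"), "NEO4J_URI=bolt://localhost:7688\n");
const blockers = peersOnSystemNeo4j([
  path.join(root, ".maxy", ".env"),
  path.join(root, ".realagent", ".env"),
]);
```

A non-empty result is what keeps the system `neo4j.service` enabled and produces the `[neo4j] system unit kept active` log line.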
|
|
118
118
|
|
|
119
119
|
To install a second brand on a device that already runs the first, just run the other installer. No flags needed for isolation:
|
|
@@ -14,9 +14,9 @@ Every knowledge query flows through a hybrid search pipeline that combines seman
|
|
|
14
14
|
QUERY
|
|
15
15
|
│
|
|
16
16
|
├── EMBED (EMBED_MODEL, default nomic-embed-text) ──► VECTOR SEARCH (per index, cosine)
|
|
17
|
-
│
|
|
18
|
-
│
|
|
19
|
-
│
|
|
17
|
+
│ │
|
|
18
|
+
│ ├──► MERGE ──► EXPAND ──► RESULTS
|
|
19
|
+
│ │
|
|
20
20
|
└── ESCAPE (Lucene special chars) ──────► BM25 FULL-TEXT ──┘
|
|
21
21
|
(entity_search index — universal coverage)
|
|
22
22
|
|
|
@@ -29,7 +29,7 @@ Fallback: if the full-text index doesn't exist, vector-only results are returned
|
|
|
29
29
|
|
|
30
30
|
**Vector path:** The query is embedded via Ollama (model per `EMBED_MODEL` env var, default `nomic-embed-text`). The resulting vector is compared against Neo4j's HNSW cosine indexes — one per indexed label. Dimensions are configured at install time (default 768). The search runs against all discovered indexes (or a subset if the caller specifies label filters). Scores are in [0, 1] (cosine similarity).
|
|
31
31
|
|
|
32
|
-
**BM25 path:** The raw query text is escaped for Lucene special characters and run against the `entity_search` full-text index (
|
|
32
|
+
**BM25 path:** The raw query text is escaped for Lucene special characters and run against the `entity_search` full-text index (earlier platform fixes — universal coverage), which spans every operator-meaningful label written by the platform on the canonical text-property union (~28 properties: `name`, `firstName`, `lastName`, `givenName`, `familyName`, `title`, `summary`, `body`, `content`, `description`, `headline`, `email`, `subject`, `bodyPreview`, etc.). Pre-Task-748 the index was named `knowledge_fulltext` and covered only `KnowledgeDocument | Section | Chunk` — that gap silently hid Person/Organization/Task/Event/etc. from BM25 regardless of query. Raw BM25 scores are in [0, infinity) — they are normalised to [0, 1] via min-max scaling within the result set before merging. When all scores are equal (or a single result), all normalise to 1.0.
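The min-max scaling described above, including the equal-scores edge case, as a short sketch (`normaliseBm25` is a hypothetical name, not the platform's function):

```typescript
// Min-max normalisation of raw BM25 scores into [0, 1]. When max === min
// (all scores equal, or a single result), every score normalises to 1.0.
function normaliseBm25(scores: number[]): number[] {
  if (scores.length === 0) return [];
  const min = Math.min(...scores);
  const max = Math.max(...scores);
  if (max === min) return scores.map(() => 1.0);
  return scores.map(s => (s - min) / (max - min));
}

normaliseBm25([2, 6, 10]); // → [0, 0.5, 1]
normaliseBm25([3, 3]);     // → [1, 1]
```

Normalising within the result set is what lets the unbounded BM25 scores combine with the already-bounded cosine scores in the weighted merge.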
|
|
33
33
|
|
|
34
34
|
**Merge:** Results from both paths are collected in a single map keyed by `nodeId`. A node appearing in both paths accumulates the max vector score and max BM25 score independently. The combined score is `0.7 * vectorScore + 0.3 * bm25Score`. Results are sorted descending by combined score, then sliced to the requested limit (default 10).
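The merge step can be sketched as follows; the `Hit`/`Merged` shapes and the `mergeHits` name are illustrative, not the platform's actual types:

```typescript
// One map keyed by nodeId; each path's score accumulates via max
// independently; combined = 0.7 * vector + 0.3 * bm25 (sketch only).
interface Hit { nodeId: string; score: number }
interface Merged { nodeId: string; vector: number; bm25: number; combined: number }

function mergeHits(vectorHits: Hit[], bm25Hits: Hit[], limit = 10): Merged[] {
  const byId = new Map<string, { vector: number; bm25: number }>();
  const bump = (h: Hit, key: "vector" | "bm25") => {
    const e = byId.get(h.nodeId) ?? { vector: 0, bm25: 0 };
    e[key] = Math.max(e[key], h.score); // max per path, independently
    byId.set(h.nodeId, e);
  };
  vectorHits.forEach(h => bump(h, "vector"));
  bm25Hits.forEach(h => bump(h, "bm25"));
  return [...byId.entries()]
    .map(([nodeId, e]) => ({ nodeId, ...e, combined: 0.7 * e.vector + 0.3 * e.bm25 }))
    .sort((a, b) => b.combined - a.combined)
    .slice(0, limit);
}

// "a" appears in both paths and keeps both scores; "b" and "c" in one each.
const merged = mergeHits(
  [{ nodeId: "a", score: 0.9 }, { nodeId: "b", score: 0.4 }],
  [{ nodeId: "a", score: 1.0 }, { nodeId: "c", score: 0.8 }],
);
```

The 0.7/0.3 weighting means a vector-only hit can still outrank a BM25-only hit, which matches the semantic-first design of the pipeline.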
|
|
35
35
|
|
|
@@ -59,7 +59,7 @@ Indexed labels: `Question`, `DefinedTerm`, `Review`, `Service`, `Person`, `Local
|
|
|
59
59
|
|
|
60
60
|
| Index name | Labels | Properties | Purpose |
|
|
61
61
|
|---|---|---|---|
|
|
62
|
-
| `entity_search` | All operator-meaningful labels (~40, see [`schema.cypher`](../../../neo4j/schema.cypher)) | Canonical text-property union (~28) | Universal BM25 keyword matching across the whole graph
|
|
62
|
+
| `entity_search` | All operator-meaningful labels (~40, see [`schema.cypher`](../../../neo4j/schema.cypher)) | Canonical text-property union (~28) | Universal BM25 keyword matching across the whole graph |
|
|
63
63
|
|
|
64
64
|
### Embedding lifecycle
|
|
65
65
|
|
|
@@ -73,13 +73,13 @@ Large documents are decomposed into a three-level hierarchy for granular retriev
|
|
|
73
73
|
|
|
74
74
|
```
|
|
75
75
|
KnowledgeDocument
|
|
76
|
-
├── summary (embedded)
|
|
76
|
+
├── summary (embedded) — document-level semantic anchor
|
|
77
77
|
├── Section
|
|
78
|
-
│
|
|
79
|
-
│
|
|
80
|
-
│
|
|
81
|
-
│
|
|
82
|
-
└── attachmentId
|
|
78
|
+
│ ├── summary (embedded) — section-level semantic anchor
|
|
79
|
+
│ └── Chunk
|
|
80
|
+
│ ├── summary (embedded) — chunk-level semantic anchor
|
|
81
|
+
│ └── content (raw text, BM25-indexed) — full content for retrieval
|
|
82
|
+
└── attachmentId — links back to the source file
|
|
83
83
|
```
|
|
84
84
|
|
|
85
85
|
All three levels are independently vector-indexed and BM25-indexed. A query may match at the document level (broad topic), section level (sub-topic), or chunk level (specific passage). Graph expansion from a matched chunk retrieves its parent section and document for context.
|
|
@@ -155,7 +155,7 @@ Before searching, a Haiku classifier decides whether a query needs knowledge ret
|
|
|
155
155
|
| History window | Last 4 messages (2 user + 2 assistant) | Same |
|
|
156
156
|
| Max tokens | 200 | 120 |
|
|
157
157
|
| Query rewriting | Yes — resolves references from history into concrete search terms | Same |
|
|
158
|
-
| Topic-change detection | Yes — detects shifts with confidence score | No (removed,
|
|
158
|
+
| Topic-change detection | Yes — detects shifts with confidence score | No (removed, earlier platform fixes) |
|
|
159
159
|
| Fallback on failure | `search: true` (always search with raw message) | Same |
|
|
160
160
|
|
|
161
161
|
### Classification output
|
|
@@ -181,7 +181,7 @@ The classifier is one input to a broader decision tree that determines whether `
|
|
|
181
181
|
|
|
182
182
|
Admin: `[admin-query-classifier]` log line with `topicChange`, `topicChangeConfidence`, `existingTopic`, `latencyMs`.
|
|
183
183
|
|
|
184
|
-
Public: `[public-query-classifier]` log line with `search`, `effectiveQuery`, `reason`, `latencyMs`. The intentional absence of topic-change fields in the public log is the on-disk evidence that the public path does less work
|
|
184
|
+
Public: `[public-query-classifier]` log line with `search`, `effectiveQuery`, `reason`, `latencyMs`. The intentional absence of topic-change fields in the public log is the on-disk evidence that the public path does less work.

---
@@ -221,7 +221,7 @@ Haiku receives a sandboxed system prompt that:

- Requires a JSON response with `nodeId`, `rank` (1-indexed, 1 = most important), and `reasoning` (one sentence, under 300 characters)
- Explicitly labels the user-provided criterion as "data, not instructions" to prevent prompt injection

The criterion itself (from the calling agent) is wrapped in `<<<CRITERION... CRITERION` delimiters in the user message.

### Hallucination defence
@@ -284,7 +284,7 @@ Each public agent can subscribe to up to 5 keywords via `knowledgeKeywords` in i

For each subscription keyword, two complementary searches run:

1. **BM25 full-text search** — queries the universal `entity_search` index with the keyword as the search term. Catches content that mentions the keyword in its text across every operator-meaningful label.

2. **Property-based search** — finds nodes whose `keywords` array property contains the subscription keyword (case-insensitive). Catches nodes explicitly tagged with that keyword topic. These matches are boosted to maximum BM25 score (1.0) since they are exact tag matches.
@@ -292,7 +292,7 @@ Both searches run **without** the per-agent tag filter (`agentSlug`) — keyword

### Union semantics

Results from keyword subscription searches are merged into the same scored map as the primary vector+BM25 results. Deduplication by `nodeId` with `Math.max` on scores means a node found by both direct search and keyword subscription keeps the highest score from each method.
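The merge can be sketched as below. Only the 1.0 boost for exact tag matches and the `Math.max` dedup by `nodeId` come from the docs; the function and type names are illustrative:

```typescript
// Sketch of the union semantics: property-tag matches are boosted to the
// maximum BM25 score (1.0), then everything merges into one scored map
// where Math.max keeps the best score per nodeId.
interface Scored {
  nodeId: string;
  score: number;
}

function mergeResults(
  primary: Scored[],       // direct vector+BM25 results
  bm25Keyword: Scored[],   // keyword-subscription full-text hits
  propertyTagged: Scored[] // nodes whose `keywords` array matched
): Map<string, number> {
  const merged = new Map<string, number>();
  const take = (r: Scored) => {
    merged.set(r.nodeId, Math.max(merged.get(r.nodeId) ?? 0, r.score));
  };
  primary.forEach(take);
  bm25Keyword.forEach(take);
  // Exact tag matches are treated as maximally relevant.
  propertyTagged.forEach((r) => take({ nodeId: r.nodeId, score: 1.0 }));
  return merged;
}

// Worked example: "a" found twice keeps its best score; "b" is tag-boosted.
const merged = mergeResults(
  [{ nodeId: "a", score: 0.6 }],
  [{ nodeId: "a", score: 0.4 }, { nodeId: "b", score: 0.3 }],
  [{ nodeId: "b", score: 0.2 }]
);
```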

### Lifecycle
@@ -315,17 +315,17 @@ This tool is read-only and available to both public and admin agents.

### When conversations are created

`:Conversation` nodes on webchat (admin login, "New conversation" in the burger, a new public visitor) are created lazily. Opening the chat or logging in does not write anything to the graph — {{productName}} only records the conversation once the user sends a second message. This keeps `conversation-search` and the Conversations modal free of one-turn abandoned threads. WhatsApp and Telegram take the opposite posture: every inbound — DM or group, allowed or activation-off, agent-invoked or gated — MERGEs the `:Conversation` and writes a forensic `:Message:WhatsAppMessage` row before any access-control decision. The graph is the durable record of every message the device received, not just the ones the agent replied to. See `.docs/web-chat.md` "Deferred conversation persistence" and `.docs/whatsapp.md` "Session continuity" for the full contract.

Each row in the Conversations modal exposes a `View logs` row-action that opens a popover with three links — **Stream**, **Errors**, **SSE** — each of which targets `/api/admin/logs?type={stream|error|sse}&conversationId={full-id}` in a new tab. The row's 8-char id chip is click-to-copy; hover reveals the full `conversationId` as a tooltip. See `.docs/web-chat.md` "In-chat retrieval" for the route contract and `console.debug` observability.

### Static publish surface — `/sites/*`

{{productName}} hosts a generic per-account static-tree publish surface at `https://public.<brand>/sites/<...>/<file>`. The route serves files from `<accountDir>/sites/<...>` with URL=disk mirroring — operator drops the tree on disk, no upload API. Extended MIME covers HTML/CSS/JS/woff2/fonts on top of images. Trailing-`/` or extension-less requests fall back to `index.html`. Path traversal (`..`, encoded `..`, segments failing `SAFE_SEG_RE`) returns 403; symlinks escaping the sites root are rejected via a `realpathSync` re-check. `.html` responses carry `Content-Security-Policy: default-src 'self' https: data:; script-src 'none'` and `Cache-Control: no-cache`; assets are cached for an hour; every response carries `X-Content-Type-Options: nosniff`. Per-account isolation comes from `resolveAccount` — Maxy and Real Agent installs each see only their own tree. Drop a brochure at `~/.realagent/data/sites/properties/<id>/brochure/output/` and it serves at `https://public.realagent.bot/sites/properties/<id>/brochure/output/brochure.html`. See `.docs/web-chat.md` `/sites/*` route entry for the wire contract and `[sites]` log lines (`serve|not-found|path-traversal-rejected|symlink-escape-rejected|no-account`).
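The traversal guard described above can be sketched as follows. The real `SAFE_SEG_RE` pattern is not shown in these docs, so the regex below is a permissive stand-in, and the function shape is illustrative rather than the route's actual code:

```typescript
import { realpathSync } from "node:fs";
import { resolve, sep } from "node:path";

// Stand-in for the platform's SAFE_SEG_RE — assumption, not the real pattern.
const SAFE_SEG_RE = /^[A-Za-z0-9._-]+$/;

// Returns the on-disk path to serve, or null (caller responds 403/404).
function resolveSitePath(sitesRoot: string, urlPath: string): string | null {
  let segments: string[];
  try {
    segments = urlPath.split("/").filter(Boolean).map(decodeURIComponent);
  } catch {
    return null; // malformed percent-encoding
  }
  // Reject `..` (plain or encoded — decodeURIComponent already unmasked it)
  // and any segment failing the safe-segment pattern.
  if (segments.some((s) => s === ".." || !SAFE_SEG_RE.test(s))) return null;
  const candidate = resolve(sitesRoot, ...segments);
  if (!candidate.startsWith(resolve(sitesRoot) + sep)) return null;
  // Re-check through symlinks: a link escaping the sites root is rejected.
  try {
    const real = realpathSync(candidate);
    if (!real.startsWith(realpathSync(sitesRoot) + sep)) return null;
  } catch {
    return null; // missing file — caller handles index.html fallback / 404
  }
  return candidate;
}
```

The two-phase check matters: the string check rejects hostile URLs cheaply, while the `realpathSync` re-check catches the case where every segment looks safe but a symlink on disk points outside the root.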

### Cross-tab session rotation

When you click "New conversation" in the chat tab, {{productName}} mints a fresh admin session key on the server and clears the old one. Sibling admin tabs (`/graph`, `/data`) opened in the same browser keep working without re-login: the chat tab broadcasts the new key on a same-origin channel so each sibling tab updates its captured key instantly, and any in-flight admin request that 401s with the rotation-orphan code retries once after re-reading the latest key from per-tab storage. If neither path recovers (browser locked down, second 401 after retry, session expired), the tab shows a single banner — "Your admin session was renewed in another tab. Click to reload." — and one click sends you back through login. No silent 401s; no re-clicking through the same trash icon hoping it sticks. See `.docs/web-chat.md` "Cross-tab rotation contract" for the wire-level `code` taxonomy and observability surfaces.

---
@@ -407,7 +407,7 @@ This means:

- The `memory-reindex` tool can backfill embeddings for newly indexed labels
- Index renames are transparent — the server discovers the current index names at startup

The cache is cleared via `clearIndexCache` after schema changes (e.g., after `memory-reindex` detects new indexes).

---
@@ -471,6 +471,6 @@ Each log entry includes the tool name and a truncated conversation ID for correl

When an admin turn crosses 75% of the model's context window, {{productName}} runs a silent compaction turn that asks the agent to call the `session-compact` MCP tool with a structured briefing (what you asked for, what was done, decisions made, work-in-progress, things you've shared about yourself). The briefing is written to Neo4j; the next admin turn injects it back into the system prompt, so continuity survives across the compaction boundary without re-sending the full transcript.

The compaction runs against a transient one-shot pool entry separate from the long-lived admin Query. Operator-visible side effects:
- Compaction logs land in `claude-agent-compaction-stream-YYYY-MM-DD.log` alongside the main stream log. Look for `[compaction-start]`, `[compaction-summary-captured]`, `[compaction-failed]`, `[compaction-timeout]`, `[compaction-crashed]`, or `[compaction-spawn-error]` to triage. Subprocess stderr is captured inline as `[subproc-stderr] <line>` — there is no longer a separate `claude-agent-compaction-stderr-…log` file.
- The one-shot pool entry's lifecycle is greppable as `[client-cold-create] reason=compaction-one-shot …` paired with `[client-evict] reason=compaction-one-shot …`, distinguishable from the regular admin pool's lifecycle tags.
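Triage over those markers can be mechanised. The marker strings below come from the list above; the precedence logic (check for a start line first, then for any terminal marker) is an illustrative convention, not the platform's own tooling:

```typescript
// Sketch: classify a compaction stream log's outcome from its marker lines.
const COMPACTION_OUTCOMES = [
  "[compaction-summary-captured]",
  "[compaction-failed]",
  "[compaction-timeout]",
  "[compaction-crashed]",
  "[compaction-spawn-error]",
] as const;

function triageCompaction(logText: string): string {
  if (!logText.includes("[compaction-start]")) return "never-started";
  for (const marker of COMPACTION_OUTCOMES) {
    if (logText.includes(marker)) return marker; // first terminal marker wins
  }
  return "in-progress-or-log-truncated";
}
```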
@@ -86,7 +86,7 @@ Ask naturally:

- "What did I last discuss about the Acme proposal?"
- "Who have I met from the fintech conference?"

## Listing and counting

{{productName}} answers relational questions — "list all my people", "how many tasks do I have", "find the person with email X", "show me the 20 most recently created nodes" — via direct read-only Cypher against your Neo4j. This is faster and more precise than semantic search when the question is "the exact set where", not "things similar to".
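Illustrative read-only Cypher for those questions, held as strings here. The labels and properties (`:Person`, `:Task`, `email`, `createdAt`) are assumptions about a typical graph, not the platform's fixed vocabulary:

```typescript
// Sketch: the kind of read-only Cypher that answers "the exact set where"
// questions. Schema names are hypothetical illustration only.
const relationalQueries: Record<string, string> = {
  "list all my people":
    "MATCH (p:Person) RETURN p.name ORDER BY p.name",
  "how many tasks do I have":
    "MATCH (t:Task) RETURN count(t) AS tasks",
  "find the person with email X":
    "MATCH (p:Person {email: $email}) RETURN p",
  "show me the 20 most recently created nodes":
    "MATCH (n) RETURN n ORDER BY n.createdAt DESC LIMIT 20",
};
```

Each of these is exact-set retrieval — a count, an ordered list, a parameterised lookup — which is precisely where a vector search would be both slower and fuzzier.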
@@ -62,7 +62,7 @@ Latency triage: `mail-list count=0 elapsedMs<200` consistent → permissions iss

| `Outlook token refresh failed for account=X; re-auth required` | Network down at refresh time, or refresh token invalidated | Verify network; re-register |
| `Outlook auth expired for account=X; run outlook-account-register` | Refresh-then-retry still got 401 | Re-register |
| `Outlook rate-limited without Retry-After hint` | Graph 429 with no backoff guidance | Wait + retry; if persistent, file bug |
| `Microsoft Graph does not support on-premises Exchange. Use earlier platform fixes (IMAP).` | Mailbox is on hybrid Exchange | Use the `email` plugin |

## Out of scope
@@ -109,11 +109,11 @@ import { initStderrTee } from "../../../../lib/mcp-stderr-tee/dist/index.js";

initStderrTee("your-plugin-name");
```

After this, every `console.error("[your-tool]...")` from any tool in the plugin appears as `[<iso-ts>] [mcp:your-plugin-name] [your-tool]...` in the per-conversation stream log `claude-agent-stream-{conversationId}.log`, alongside the usual agent events. The raw per-server file `mcp-your-plugin-name-stderr-{date}.log` is still produced for deep-dive grep.

**How the tee decides which file to write to:** the platform sets `STREAM_LOG_PATH` as an environment variable on every MCP server spawn, pointing to the conversation-scoped stream log. The MCP server does not know about conversations — it just trusts `STREAM_LOG_PATH`. Multiple concurrent conversations produce multiple concurrent MCP server processes, each teeing to its own file; no cross-conversation leakage.

**`STREAM_LOG_PATH` reaches every Claude Code child.** The platform now sets `STREAM_LOG_PATH` on the parent `claude` spawn env itself (not only on MCP server envs), so the bundled Bun runtime inherits it and every Bash-tool subprocess the CLI spawns sees it too. Opt-in shell scripts — currently `setup-tunnel.sh`, `reset-tunnel.sh`, and `list-cf-domains.sh` under `platform/plugins/cloudflare/scripts/` — read the variable, guard against a missing value with a loud exit, and tee subprocess output line-by-line into the same per-conversation file. Each spawn writes one `[spawn-env] STREAM_LOG_PATH=set pid=… conversationId=… site=…` line so the env-propagation is auditable per session. The chat UI tails the same file for lines matching `^\[([^\]]+)\] \[([a-z][a-z0-9-]*)((?::[a-z0-9:_-]+)?)\] ` — any lowercase scope shape participates on first write (earlier platform fixes generalised from the pre-592 enum `setup-tunnel|reset-tunnel`) — and emits them as `script_stream` SSE events; see `.docs/web-chat.md` for the contract. Inner-layer helpers that a `.sh` wrapper spawns (e.g. `list-cf-domains.ts` via `node --experimental-strip-types`) must write phase lines directly to `STREAM_LOG_PATH` rather than relying on stderr propagation; the build-gate `platform/ui/scripts/check-stream-log-contract.mjs` enforces this and is the definitive reference for the three allowed patterns (tee-wrapped, direct-write, or explicit stderr-only marker).
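The script-side contract can be sketched in Node terms (the shell scripts implement the same shape): format lines so the chat UI's tail regex will match them, fail loudly when `STREAM_LOG_PATH` is missing, and sync-append each line. The split into a pure formatter and the scope name used below are illustrative:

```typescript
import { appendFileSync } from "node:fs";

// Pure formatter: `[<iso-ts>] [<scope>] <line>` — the shape the chat UI's
// tail regex picks up. Scope must be lowercase per the documented pattern.
function formatStreamLine(scope: string, line: string, isoTs: string): string {
  return `[${isoTs}] [${scope}] ${line}`;
}

// Opt-in tee: guard against a missing STREAM_LOG_PATH with a loud exit,
// then write each line to both the stream log and stderr.
function teeLine(scope: string, line: string): void {
  const streamLogPath = process.env.STREAM_LOG_PATH;
  if (!streamLogPath) {
    console.error(`[${scope}] STREAM_LOG_PATH not set — refusing to run blind`);
    process.exit(1);
  }
  const stamped = formatStreamLine(scope, line, new Date().toISOString()) + "\n";
  appendFileSync(streamLogPath, stamped); // sync write survives early exits
  process.stderr.write(stamped);          // normal stderr path stays intact
}
```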

**Retrieve MCP diagnostic lines for a conversation:**
@@ -122,20 +122,20 @@ After this, every `console.error("[your-tool] ...")` from any tool in the plugin

**Tee-state markers** land in the stream log: `[platform] [mcp-tee-attach] server=<name> streamLogPath=...` when the tee wires up, `[platform] [mcp-tee-skip] server=<name> destination=... reason=...` when a destination fails (missing `LOG_DIR`, unwritable path, `STREAM_LOG_PATH` not set, etc.), `[platform] [mcp-tee-detach] server=<name>` on graceful shutdown. If a server invoked tools but no `[mcp:<name>]` lines appear in the conversation's log, look for the skip marker first.

**Main-subprocess stderr.** The same teeing pattern applies to the main Claude Code subprocess's stderr — every line lands in the per-conversation stream log as `[subproc-stderr] …`, with lifecycle markers `[subproc-stderr-tee-attached] pid=…` and `[subproc-stderr-tee-detached] pid=… bytes=N lines=N`. A `bytes=0 lines=0` detach means the tee was attached but the subprocess emitted nothing on stderr — which is the normal state today, because the Claude Code CLI is a bundled Bun runtime binary that does not honour Node's `NODE_DEBUG` env var. The platform records this explicitly with one line per spawn: `[subproc-debug-unavailable] reason=bundled-bun-binary-ignores-node-debug pid=… cli=claude`. A reader who finds a `[spawn]` without these markers should treat that as a regression of the tee infrastructure, not as silence.

## Failure-path observability contract (earlier platform fixes + earlier platform fixes)

The `initStderrTee` wrapper writes to the per-conversation stream log and per-server raw file via `createWriteStream` — async, buffered. Any diagnostic `console.error(…)` followed by an immediate `process.exit(…)` is lost: the event loop never drains the WriteStream before the process terminates. Same race for any synchronous module-load throw: Node's uncaught-exception handler writes the stack to raw fd 2 and exits before the patched async stream flushes. The platform's `[mcp-init-error] tail="(no stderr file)"` line — operationally useless — is the public symptom of this race.

**Two layers now close the gap, each load-bearing on its own:**

1. **Plugin-side sync-write discipline.** Plugins that call `process.exit` during module load (rare — `graph-mcp` is the in-tree example; it spawns a child at boot to proxy upstream stdio) use `fs.appendFileSync` at every named exit path to guarantee the cause lands in both log destinations before exit. Lines follow the `[mcp:<name>] [<plugin-prefix>] <cause>` format so existing `grep '[mcp:<name>]'` investigator paths work. Each destination is wrapped in its own try/catch — an unwritable log must not mask the primary failure. This is the discipline propagated to any plugin author who knows their failure paths.

2. **Parent-side `mcp-spawn-tee` wrapper.** Every node-based core MCP server is spawned via the `lib/mcp-spawn-tee` wrapper rather than `node <entry>` directly. The wrapper spawns the real entry with `stdio: ['inherit', 'inherit', 'pipe']` and writes child stderr chunks to `${LOG_DIR}/mcp-${name}-stderr-<date>.log` via `appendFileSync` while passing the same chunks through to its own stderr (Claude Code's consumer is unchanged). Synchronous `appendFileSync` survives `process.exit`, so the per-server file captures even (a) module-load throws before `initStderrTee` runs, (b) `MODULE_NOT_FOUND` on the entry script itself, and (c) anything else a plugin author missed. The wrapper writes `[mcp-spawn-tee-attached] server=<name> pid=<n>` on attach and forwards SIGTERM/SIGINT to the child. This is the layer that makes capture independent of plugin discipline. Playwright stays unwrapped because it spawns via `npx`, not `node`.
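The plugin-side sync-write discipline from layer 1 can be sketched as below. The destination paths and plugin names are illustrative; only the line format and the per-destination try/catch come from the contract above:

```typescript
import { appendFileSync } from "node:fs";

// Sketch: before a process.exit on a known failure path, sync-write the
// cause to every log destination, each in its own try/catch so an
// unwritable log cannot mask the primary failure.
function syncWriteExitCause(
  name: string,          // MCP server name
  pluginPrefix: string,  // plugin's own log prefix
  cause: string,
  destinations: string[] // e.g. stream log + per-server raw file (illustrative)
): string {
  const line = `[mcp:${name}] [${pluginPrefix}] ${cause}\n`;
  for (const dest of destinations) {
    try {
      appendFileSync(dest, line); // sync: survives the imminent process.exit
    } catch {
      // Swallow: a failed log write must not mask the primary failure.
    }
  }
  return line; // returned so callers/tests can see the exact format
}
```

Because the write is synchronous, the cause is on disk before `process.exit` runs — exactly the property the async `createWriteStream` path lacks.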

A third layer closes the same gap from the platform side: when `claude-agent.ts` observes an `init` event with any MCP server reporting `status:"failed"`, it reads the last 512 bytes of `${LOG_DIR}/mcp-<name>-stderr-<date>.log` and emits `[mcp-init-error] server=<name> tail=<quoted>` into the stream log. Absent file → `tail="(no stderr file)"`; empty file → `tail="(empty)"`. With the spawn-tee wrapper now interposing on every core MCP, `tail="(no stderr file)"` post-Task-743 means the wrapper itself is broken — file follow-up.

**Signal inventory after a failed session:** `[init] FAILED MCP servers: <names>` (names), `[mcp-init-error] server=<name> tail=…` (cause for each, from the platform's tail probe), `[mcp-spawn-tee-attached] server=<name> pid=<n>` (proof the wrapper attached), `[mcp-spawn-tee-exit] server=<name> code=<n>|signal=<s>` (proof the wrapper saw the exit), and optionally `[mcp:<name>] [<plugin>] …` from plugin-side sync-writes. Their union gives the investigator three independent sources for the same failure.

**Boot-smoke as publish-time gate.** The memory MCP carries `scripts/boot-smoke.sh` that spawns `dist/index.js` with stub env, sleeps 2s, asserts `kill -0 <pid>`, and reports `[boot-smoke] memory ok|FAILED tail=<n-lines>`. Wired to `prepublish` in `plugins/memory/mcp/package.json`. The pattern is propagatable to other plugin MCPs — it's deliberately not generalised yet because each plugin's stub-env requirements differ (memory needs ACCOUNT_ID + PLATFORM_ROOT + NEO4J_URI + SESSION_ID; others differ).
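A Node-flavoured sketch of the same boot-smoke shape — spawn the entry with a stub env, wait a grace period, then use signal 0 to assert the process is still alive. The shell script is the real gate; entry path, env keys, and the 2s grace period are parameters here:

```typescript
import { spawn } from "node:child_process";

// Sketch of the boot-smoke pattern: resolves true iff the spawned server
// is still alive after `graceMs`. process.kill(pid, 0) is Node's `kill -0`.
function bootSmoke(
  entry: string,                     // e.g. dist/index.js (illustrative)
  stubEnv: Record<string, string>,   // plugin-specific stub env
  graceMs = 2000
): Promise<boolean> {
  return new Promise((resolvePromise) => {
    const child = spawn(process.execPath, [entry], {
      env: { ...process.env, ...stubEnv },
      stdio: "ignore",
    });
    let exited = false;
    child.on("exit", () => { exited = true; });
    setTimeout(() => {
      let alive = !exited;
      try { process.kill(child.pid!, 0); } catch { alive = false; }
      if (alive) child.kill("SIGTERM"); // clean up the smoke-test child
      resolvePromise(alive);
    }, graceMs);
  });
}
```

A `prepublish` hook would fail the publish when `bootSmoke(...)` resolves false, mirroring the shell script's `[boot-smoke] … FAILED` path.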
|