npm - @rubytech/create-realagent - Versions diffs - 1.0.852 → 1.0.854 - Mend

@rubytech/create-realagent 1.0.852 → 1.0.854


Files changed (41)

package/payload/platform/plugins/docs/references/getting-started.md CHANGED

@@ -60,6 +60,8 @@ When the text field is empty, a microphone button appears in place of the send b
 Voice recording requires a secure connection (HTTPS). When accessing {{productName}} over the local network via HTTP, use the tunnel URL for voice notes.
+You can also drop, paste, or pick an audio file (`.opus`, `.ogg`, `.m4a`, `.mp3`, `.wav`, `.webm`) into the chat composer — for example a voice note forwarded from WhatsApp. The file is transcribed the same way the in-browser recording is, and only the transcript reaches {{productName}}; the audio itself is discarded after transcription.
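A minimal sketch of the extension check implied by the added line, assuming the composer decides by filename suffix alone. The helper name `isAcceptedAudioFile` and the case-insensitive comparison are assumptions for illustration; the package may inspect MIME types or file contents instead.

```typescript
// Extensions listed in the documentation line above.
const AUDIO_EXTENSIONS = [".opus", ".ogg", ".m4a", ".mp3", ".wav", ".webm"] as const;

// Hypothetical helper: true when the filename ends in one of the listed
// extensions. Case-insensitive matching is an assumption, not documented.
function isAcceptedAudioFile(filename: string): boolean {
  const lower = filename.toLowerCase();
  return AUDIO_EXTENSIONS.some((ext) => lower.endsWith(ext));
}

console.log(isAcceptedAudioFile("PTT-20260507-WA0001.opus")); // true
console.log(isAcceptedAudioFile("notes.txt")); // false
```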
 ## What {{productName}} Remembers
 {{productName}} maintains a memory graph of everything important: contacts, conversations, preferences, relationships, and context. When you tell {{productName}} something, it stores it. When you ask about something later, it retrieves it.

package/payload/platform/plugins/docs/references/platform.md CHANGED

@@ -65,7 +65,7 @@ There is no dashboard, no settings panel, no menus. Everything is done through c
 The chat input auto-grows as you type — it expands to fit your message and shrinks back when you delete text. You can also drag the resize handle above the input to set a custom height.
-The admin interface is a three-pane layout: a sidebar on the left with your brand mark, navigation (Chat, People, Agents, Projects, Tasks, Artefacts), and your recent conversations; the chat in the middle; and an artefact pane on the right that opens when you select a document, click a project, or open Browser, Data, or Graph from the menu — holding the surface side-by-side with the conversation so the chat stays live while you work in it. The sidebar's nav rows swap the list view in place — Chat shows recent conversations, Projects shows your active work projects, and Artefacts lists every KnowledgeDocument plus this account's agent templates (your admin agent's IDENTITY, SOUL, and KNOWLEDGE files plus one entry per enabled specialist). The People, Agents, and Tasks rows are graph shortcuts: clicking each opens the artefact-pane Graph filtered to every Person, every public Agent, or every Task in your account respectively, with no side-list — the graph itself is the result. Public agents become first-class graph entities the moment you create them, with edges to their IDENTITY/SOUL/KNOWLEDGE files, edges to every knowledge document they have access to, and edges from every conversation they have handled, so a single Agents click reveals the whole shape of who knows what and who has been talking to whom. Click an artefact row to open the document. KnowledgeDocuments and your admin agent's templates are editable — type in the document and changes save automatically; specialist agent templates are read-only because they ship with Maxy and your edits would be overwritten on the next install. PDF artefacts render inline so you can read them without leaving the pane. If your browser doesn't have a built-in PDF viewer, a Download button appears instead. Artefacts that have no readable file backing them (orphan rows, files removed from disk, unsupported content types) show a one-line banner explaining the skip instead of opening to a blank pane. 
Click a project row to open the Graph view focused on that project's neighbourhood — clicking a second project swaps the focus rather than stacking on top. The chat / artefact divider is drag-resizable — drag the line between the columns to make either side wider; double-click it to reset to half of the available width (viewport minus sidebar), clamped to the chat / artefact min-width floors. Your chosen width is remembered across reloads. On wider screens (>1280px) all three panes are visible. The sidebar narrows at 1280px, the artefact pane hides at 1080px (Browser, Data, and Graph then open as full-window pages instead), and the sidebar collapses to a 56px icon rail at 820px. On phones (<720px) the sidebar slides in as a drawer from the left when you tap the menu icon in the chat header.
+The admin interface is a three-pane layout: a sidebar on the left with your brand mark, navigation (Chat, People, Agents, Projects, Tasks, Artefacts), and your recent conversations; the chat in the middle; and an artefact pane on the right that opens when you select a document, click a project, or open Browser, Data, or Graph from the menu — holding the surface side-by-side with the conversation so the chat stays live while you work in it. The sidebar's nav rows swap the list view in place — Chat shows recent conversations, Projects shows your active work projects, and Artefacts lists every KnowledgeDocument plus this account's agent templates (your admin agent's IDENTITY, SOUL, and KNOWLEDGE files plus one entry per enabled specialist). The People, Agents, and Tasks rows are graph shortcuts: clicking each opens the artefact-pane Graph filtered to every Person, every public Agent, or every Task in your account respectively, with no side-list — the graph itself is the result. Public agents become first-class graph entities the moment you create them, with edges to their IDENTITY/SOUL/KNOWLEDGE files, edges to every knowledge document they have access to, and edges from every conversation they have handled, so a single Agents click reveals the whole shape of who knows what and who has been talking to whom. Click an artefact row to open the document. KnowledgeDocuments and your admin agent's templates are editable — type in the document and changes save automatically; specialist agent templates are read-only because they ship with Maxy and your edits would be overwritten on the next install. PDF artefacts render inline so you can read them without leaving the pane. If your browser doesn't have a built-in PDF viewer, a Download button appears instead. Artefacts that have no readable file backing them (orphan rows, files removed from disk, unsupported content types) show a one-line banner explaining the skip instead of opening to a blank pane. 
Click a project row to open the Graph view focused on that project's neighbourhood — clicking a second project swaps the focus rather than stacking on top. The chat / artefact divider is drag-resizable — drag the line between the columns to make either side wider; double-click it to reset to half of the available width (viewport minus sidebar), clamped to the chat / artefact min-width floors. Your chosen width is remembered across reloads. On wider screens (>1280px) all three panes are visible. The sidebar narrows at 1280px, the artefact pane hides at 1080px (Browser, Data, and Graph then open as full-window pages instead), and the sidebar collapses to a 56px icon rail at 820px. On phones (<720px) the sidebar slides in as a drawer from the left when you tap the menu icon in the chat header. When the sidebar is collapsed to the 56px icon rail, clicking the Artefacts icon expands the rail back open so the artefact list is visible — the row was previously a silent no-op in collapsed state.
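The breakpoints in the paragraph above can be read as a small decision table. The sketch below is one interpretation, not the shipped layout code: the names `SidebarMode`, `Layout`, and `layoutForWidth` are hypothetical, and whether each boundary is inclusive is an assumption, since the text only says the change happens "at" a given width.

```typescript
type SidebarMode = "full" | "narrow" | "icon-rail" | "drawer";

interface Layout {
  sidebar: SidebarMode;
  artefactPaneVisible: boolean;
}

// Maps a viewport width to the layout described in the documentation.
// Boundary inclusivity (<= vs <) is assumed, not taken from the source.
function layoutForWidth(widthPx: number): Layout {
  if (widthPx < 720) return { sidebar: "drawer", artefactPaneVisible: false };
  if (widthPx <= 820) return { sidebar: "icon-rail", artefactPaneVisible: false };
  if (widthPx <= 1080) return { sidebar: "narrow", artefactPaneVisible: false };
  if (widthPx <= 1280) return { sidebar: "narrow", artefactPaneVisible: true };
  return { sidebar: "full", artefactPaneVisible: true };
}

console.log(layoutForWidth(1400)); // all three panes visible
console.log(layoutForWidth(800)); // 56px icon rail, artefact pane hidden
```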
 Page titles are brand-aware: the browser tab shows your product name (e.g. `Real Agent` instead of `Maxy`) on every shell — chat, graph, and data — so a non-default brand never leaks the default name in tab strips or browser history.

package/payload/platform/plugins/docs/references/troubleshooting.md CHANGED

@@ -1,5 +1,13 @@
 # Troubleshooting
+## Browser navigation to a local file (`file://`) used to time out for two minutes
+**Symptom:** Older versions of the platform's admin agent would attempt `browser_navigate file:///path/to.html`, hit Playwright's silent two-minute timeout, then guess fixed ports (8080 / 3000 / 8000 / 9000) and report `ERR_CONNECTION_REFUSED` for each before someone manually started a local HTTP server.
+**Resolution shipped:** The `playwright-file-guard` PreToolUse hook (admin plugin) intercepts `file://` URLs, picks a free loopback port, backgrounds `python3 -m http.server` rooted at the file's parent directory, connect-verifies the server within one second, and rewrites the tool call's URL to `http://127.0.0.1:<port>/<basename>` before Playwright sees it. The agent never sees the rewrite. Stale server processes are reaped opportunistically on every hook invocation (1 h threshold, gated by a `ps` cmdline check that won't kill a reused PID).
+**Diagnose if it ever recurs:** grep the per-conversation stream log for `[playwright-file-guard] action=`. One `action=rewrite original=file://… port=<n> pid=<m>` line per file:// navigate is the healthy signal. `action=fail reason=<r>` indicates the hook tried to rewrite but failed open (Playwright handled the original URL); the reason field names the cause (`python3-missing`, `port-pick-failed`, `server-not-ready`, `file-not-found`, `spawn-failed`). The `cleanup` argv on the hook script can be invoked manually to sweep `/tmp/playwright-file-guard.*.pid`; the suite at `platform/plugins/admin/hooks/__tests__/playwright-file-guard.test.sh` exercises every path.
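To make the rewrite step concrete, here is a sketch of only the URL transformation: a `file://` URL becomes `http://127.0.0.1:<port>/<basename>`, with the server rooted at the file's parent directory. The hook itself is a shell script that is not shown in this diff, so the function name `rewriteFileUrl`, the `Rewrite` shape, and the POSIX-path assumption are all illustrative. Port selection, spawning `python3 -m http.server`, and the connect check are left out.

```typescript
import { posix } from "node:path";

interface Rewrite {
  serveRoot: string; // directory the local HTTP server would be rooted at
  url: string; // loopback URL that replaces the file:// URL
}

// Hypothetical sketch of the rewrite. Returns null for anything that is not
// a parseable file:// URL, mirroring the documented fail-open behaviour.
function rewriteFileUrl(original: string, port: number): Rewrite | null {
  try {
    const parsed = new URL(original);
    if (parsed.protocol !== "file:") return null;
    const encodedPath = parsed.pathname; // still percent-encoded
    return {
      serveRoot: decodeURIComponent(posix.dirname(encodedPath)),
      url: `http://127.0.0.1:${port}/${posix.basename(encodedPath)}`,
    };
  } catch {
    return null;
  }
}

console.log(rewriteFileUrl("file:///tmp/site/report.html", 43117));
// { serveRoot: '/tmp/site', url: 'http://127.0.0.1:43117/report.html' }
```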
 ## First user-domain write rejected by `[graph-write-gate] reject reason=no-admin-user`
 **Symptom:** Admin chat reports "couldn't save that — set up your business profile first" or `[graph-write-gate] reject reason=no-admin-user` appears in `server.log` on the operator's first non-bootstrap write (a website, service, opening hours, etc.). Reproduces on Minimal-onboarded installs from before the seed-stamping fix shipped.
@@ -64,6 +72,8 @@ tail -200 ~/.maxy/logs/maxy-ui.log | rg '\[remote-auth\].*resolvedKind='
 **Agent searches the filesystem after uploading a zip.** If you uploaded a zip and the agent burns several turns running `find` / `Glob` instead of unzipping, that is the symptom of the recovery-retry attachment-context regression (now closed by the recovery context preservation contract in `.docs/agents.md`). Greppable confirmation is the `[context-overflow-recovery] retry … attachmentsCarried=<n>` line in the conversation stream log. If you see `[context-overflow-recovery] WARN attachment-context-lost`, the regression has returned — surface to support.
+**A turn rendered in chat is missing on the next page refresh.** Before the 2026-05-07 mandate this was a class of silent failure: Neo4j persists were wrapped in a no-op error catch, so a write that threw left the artefact "rendered then disappeared on resume". The 2026-05-07 mandate makes JSONL canonical: the resume route reads the SDK transcript file at `~/.claude/projects/<project-key>/<sessionId>.jsonl` first, supplements from Neo4j, and triggers async heal-on-resume writes for any turn the JSONL has but Neo4j does not. So a refreshed conversation always renders what the SDK saw, regardless of write outcome. If a heal write itself fails, the chat shows a top-of-conversation banner naming the count; if every heal succeeds the resume is silent and the missing rows are quietly restored to Neo4j. Greppable post-deploy invariants in the per-conversation stream log (`logs/claude-agent-stream-<conversationId>.log`): `[admin-resume] reason=<…> source=<jsonl|jsonl-missing|neo4j-only>` (one per resume), `[admin-persist] convId=<8> writer=<…> outcome=<ok|fail|skip>` (per persist site), `[admin-persist-heal] convId=<8> turnIndex=<n> outcome=<ok|fail>` (per heal write). To force-audit a specific conversation against its Neo4j projection without re-executing it, run `tsx platform/scripts/admin-persist-audit.ts --conversation-id=<uuid> --account-id=<uuid> --session-id=<uuid>`. A non-zero exit plus per-divergence `[admin-persist-audit] expected=<message|component> missing reason=neo4j-row-absent` lines name what would have been silently lost pre-mandate.
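As an illustration of how the `[admin-persist]` invariant could be checked in bulk, the sketch below tallies outcomes from stream-log text. It assumes the fields appear in exactly the order given in the template above. The function name `tallyPersistOutcomes` and the sample lines (including the writer name) are made up for the example and are not taken from a real log.

```typescript
interface PersistTally {
  ok: number;
  fail: number;
  skip: number;
}

// Counts outcomes on [admin-persist] lines. The closing bracket in the
// pattern keeps [admin-persist-heal] lines out of the tally.
function tallyPersistOutcomes(logText: string): PersistTally {
  const tally: PersistTally = { ok: 0, fail: 0, skip: 0 };
  const pattern = /\[admin-persist\] convId=\S+ writer=\S+ outcome=(ok|fail|skip)\b/;
  for (const line of logText.split("\n")) {
    const m = pattern.exec(line);
    if (m) tally[m[1] as keyof PersistTally] += 1;
  }
  return tally;
}

// Made-up sample lines following the documented template.
const sampleLog = [
  "[admin-persist] convId=1a2b3c4d writer=example-writer outcome=ok",
  "[admin-persist] convId=1a2b3c4d writer=example-writer outcome=fail",
  "[admin-persist-heal] convId=1a2b3c4d turnIndex=3 outcome=ok",
].join("\n");

console.log(tallyPersistOutcomes(sampleLog)); // { ok: 1, fail: 1, skip: 0 }
```

Any non-zero `fail` count would be the cue to run the audit script against that conversation.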
 **Wrong Claude account answering on a multi-brand device.** On a host running both Maxy and Real Agent, each brand's admin agent reads its own `~/${brand.configDir}/.claude/.credentials.json`; there is no longer a shared `~/.claude/` thrashing them against one another. If a brand reports auth failures or appears to be operating against the wrong subscription, check three things:
 1. `grep "\[claude-auth\] init" ~/.${brand}/logs/server.log | tail -1` — the resolved path must end with `~/.${brand}/.claude/.credentials.json`. If a `[claude-auth] WARN cross-brand-path-detected` line is present, the runtime is still pointing at `~/.claude/`; the brand main service did not pick up the `Environment=CLAUDE_CONFIG_DIR=` setting (re-run the brand installer to refresh the unit file).
 2. `diff <(jq .claudeAiOauth.accessToken ~/.maxy/.claude/.credentials.json) <(jq .claudeAiOauth.accessToken ~/.realagent/.claude/.credentials.json)` — must be non-empty after each brand's operator has run `claude /login` against distinct Anthropic accounts; if it's empty, both brands are still logged in to the same account (operator action, not a code bug).

package/payload/platform/scripts/admin-persist-audit.ts ADDED

@@ -0,0 +1,191 @@
+#!/usr/bin/env -S node --loader tsx
+/**
+ * Task 940 — admin persist audit harness.
+ *
+ * Compares JSONL canonical state against Neo4j projection for a given
+ * conversationId. Prints one [admin-persist-audit] divergence line per
+ * (sdkTurnUuid, expected) gap; non-zero exit on any mismatch. Designed to
+ * run against an operator-supplied stream log fixture WITHOUT re-executing
+ * the live session — JSONL is the only ground truth this script consults.
+ *
+ * Usage:
+ *   tsx platform/scripts/admin-persist-audit.ts \
+ *       --conversation-id=<uuid> \
+ *       --account-id=<uuid> \
+ *       --jsonl=<path>            # optional override; otherwise resolved from accountId+sessionId
+ *       --session-id=<uuid>       # required if --jsonl not provided
+ *
+ * Exit codes:
+ *   0 = no divergences
+ *   1 = at least one Message or Component absent from Neo4j
+ *   2 = invocation error (missing args, file unreadable, Neo4j unreachable)
+ *
+ * Why audit-only and not also auto-heal: the heal-on-resume writer at
+ * server/routes/admin/sessions.ts handles the live path. This harness exists
+ * for forensic investigation against operator-supplied JSONLs (e.g. the
+ * 2026-05-07 stream log that motivated this task) where the live session
+ * has long since terminated.
+ */
+import { existsSync } from "node:fs";
+import { homedir } from "node:os";
+import { resolve } from "node:path";
+import process from "node:process";
+import { replayJsonl, resolveJsonlPath } from "../ui/app/lib/claude-agent/jsonl-replay";
+import { ACCOUNTS_DIR } from "../ui/app/lib/claude-agent/account";
+import { getRecentMessages, getSession } from "../ui/app/lib/neo4j-store";
+import { PERSISTENT_COMPONENTS } from "../lib/persistent-components/src/index";
+interface Args {
+  conversationId: string;
+  accountId: string;
+  jsonlPath: string;
+}
+function parseArgs(argv: string[]): Args | { error: string } {
+  const out: Partial<Args> = {};
+  let sessionId: string | undefined;
+  let jsonlOverride: string | undefined;
+  for (const a of argv) {
+    const m = a.match(/^--([a-z-]+)=(.+)$/);
+    if (!m) continue;
+    const [, key, val] = m;
+    if (key === "conversation-id") out.conversationId = val;
+    else if (key === "account-id") out.accountId = val;
+    else if (key === "session-id") sessionId = val;
+    else if (key === "jsonl") jsonlOverride = val;
+  }
+  if (!out.conversationId) return { error: "--conversation-id required" };
+  if (!out.accountId) return { error: "--account-id required" };
+  if (jsonlOverride) {
+    out.jsonlPath = resolve(jsonlOverride);
+  } else if (sessionId) {
+    const accountDir = resolve(ACCOUNTS_DIR, out.accountId);
+    out.jsonlPath = resolveJsonlPath(homedir(), accountDir, sessionId);
+  } else {
+    return { error: "either --jsonl or --session-id required" };
+  }
+  return out as Args;
+}
+async function main(): Promise<number> {
+  const parsed = parseArgs(process.argv.slice(2));
+  if ("error" in parsed) {
+    console.error(`[admin-persist-audit] usage error: ${parsed.error}`);
+    return 2;
+  }
+  const { conversationId, jsonlPath } = parsed;
+  if (!existsSync(jsonlPath)) {
+    console.error(`[admin-persist-audit] jsonl absent path=${jsonlPath} convId=${conversationId.slice(0, 8)}`);
+    return 2;
+  }
+  // JSONL replay — derives the canonical message stream and the expected
+  // component side-effects.
+  const replay = replayJsonl(jsonlPath);
+  if (replay.malformedLines > 0) {
+    console.error(`[admin-persist-audit] jsonl-malformed-lines convId=${conversationId.slice(0, 8)} count=${replay.malformedLines}`);
+  }
+  // Neo4j projection — what the resume route would have rendered pre-940.
+  let neo4j: Awaited<ReturnType<typeof getRecentMessages>>;
+  try {
+    neo4j = await getRecentMessages(conversationId, 1000);
+  } catch (err) {
+    console.error(`[admin-persist-audit] neo4j-read-failed convId=${conversationId.slice(0, 8)} reason=${err instanceof Error ? err.message : String(err)}`);
+    return 2;
+  }
+  // Build a set of (role, content) keys present in Neo4j for fast lookup.
+  // Same matching key the resume route uses, for parity.
+  const neo4jByKey = new Map<string, typeof neo4j[number]>();
+  for (const n of neo4j) neo4jByKey.set(`${n.role}\x1f${n.content}`, n);
+  let divergences = 0;
+  for (const j of replay.messages) {
+    const key = `${j.role}\x1f${j.content}`;
+    const match = neo4jByKey.get(key);
+    if (!match) {
+      // Whole message absent from Neo4j.
+      console.log(`[admin-persist-audit] convId=${conversationId.slice(0, 8)} sdkTurnUuid=${j.messageId.slice(0, 8)} expected=message missing reason=neo4j-row-absent`);
+      divergences += 1;
+      // Each missing message implies its components are also missing — emit
+      // one divergence line per absent component for forensic completeness.
+      for (const c of j.components) {
+        console.log(`[admin-persist-audit] convId=${conversationId.slice(0, 8)} sdkTurnUuid=${j.messageId.slice(0, 8)} expected=component component_name=${c.name} ordinal=${c.ordinal} missing reason=neo4j-row-absent`);
+        divergences += 1;
+      }
+      continue;
+    }
+    // Message exists; cross-check component count (Neo4j carries components
+    // as siblings of :Message via :HAS_COMPONENT).
+    const neoComps = match.components ?? [];
+    if (neoComps.length < j.components.length) {
+      for (let i = neoComps.length; i < j.components.length; i++) {
+        const c = j.components[i];
+        console.log(`[admin-persist-audit] convId=${conversationId.slice(0, 8)} sdkTurnUuid=${j.messageId.slice(0, 8)} expected=component component_name=${c.name} ordinal=${c.ordinal} missing reason=neo4j-row-absent`);
+        divergences += 1;
+      }
+    }
+  }
+  // Task 942 — every PERSISTENT_COMPONENTS :Component row must have a
+  // sibling :KnowledgeDocument with a matching attachmentId. Two failure
+  // modes: (a) live writer succeeded the :Component CREATE but the
+  // sibling :KnowledgeDocument MERGE didn't fire (theoretical — they're
+  // in the same tx, so this is only possible if the row is pre-942 or if
+  // the FOREACH ran on a null attachmentId); (b) the disk-write failed
+  // mid-render and `c.attachmentId` is null but the artefact bytes might
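+  // be recoverable from `c.data`. Rows with a null attachmentId are
+  // surfaced as `empty-attachment-id`; rows with no matching
+  // :KnowledgeDocument are surfaced as `kd-row-absent`. In either case
+  // the operator runs component-knowledgedoc-backfill.ts to materialise
+  // the projection.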
+  const projectionSession = getSession();
+  try {
+    const componentRowsResult = await projectionSession.run(
+      `MATCH (m:Message {conversationId: $conversationId})-[:HAS_COMPONENT]->(c:Component)
+       WHERE c.name IN $names
+       OPTIONAL MATCH (k:KnowledgeDocument {accountId: c.accountId, attachmentId: c.attachmentId})
+         WHERE c.attachmentId IS NOT NULL
+       RETURN c.componentId AS componentId,
+              c.name AS componentName,
+              c.accountId AS accountId,
+              c.attachmentId AS attachmentId,
+              k IS NOT NULL AS hasProjection`,
+      { conversationId, names: Array.from(PERSISTENT_COMPONENTS) },
+    );
+    for (const record of componentRowsResult.records) {
+      const componentId = record.get("componentId") as string;
+      const componentName = record.get("componentName") as string;
+      const attachmentId = record.get("attachmentId") as string | null;
+      const hasProjection = record.get("hasProjection") === true;
+      if (!attachmentId) {
+        console.log(`[admin-persist-audit] convId=${conversationId.slice(0, 8)} componentId=${componentId.slice(0, 8)} expected=knowledgedoc component_name=${componentName} missing reason=empty-attachment-id`);
+        divergences += 1;
+        continue;
+      }
+      if (!hasProjection) {
+        console.log(`[admin-persist-audit] convId=${conversationId.slice(0, 8)} componentId=${componentId.slice(0, 8)} expected=knowledgedoc component_name=${componentName} missing reason=kd-row-absent attachmentId=${attachmentId.slice(0, 8)}`);
+        divergences += 1;
+      }
+    }
+  } finally {
+    await projectionSession.close();
+  }
+  if (divergences === 0) {
+    console.log(`[admin-persist-audit] convId=${conversationId.slice(0, 8)} jsonlMessages=${replay.messages.length} neo4jMessages=${neo4j.length} divergences=0 status=ok`);
+    return 0;
+  }
+  console.log(`[admin-persist-audit] convId=${conversationId.slice(0, 8)} jsonlMessages=${replay.messages.length} neo4jMessages=${neo4j.length} divergences=${divergences} status=mismatch`);
+  return 1;
+}
+main()
+  .then((code) => process.exit(code))
+  .catch((err) => {
+    console.error(`[admin-persist-audit] crashed: ${err instanceof Error ? err.stack : String(err)}`);
+    process.exit(2);
+  });
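The core of the message comparison in this script is a keyed set lookup: every canonical (JSONL) message is matched against the Neo4j projection by `role` plus `content`, joined with the `\x1f` unit separator. The sketch below isolates that step on in-memory data. The `Msg` shape and the function name `missingFromProjection` are simplifications for illustration; the real script also walks components and queries Neo4j.

```typescript
interface Msg {
  role: string;
  content: string;
}

// Returns the canonical messages that have no (role, content) match in the
// projection. Uses the same \x1f-joined key as the audit script.
function missingFromProjection(canonical: Msg[], projection: Msg[]): Msg[] {
  const keyOf = (m: Msg): string => `${m.role}\x1f${m.content}`;
  const present = new Set(projection.map(keyOf));
  return canonical.filter((m) => !present.has(keyOf(m)));
}

const fromJsonl: Msg[] = [
  { role: "user", content: "Draft the quarterly summary" },
  { role: "assistant", content: "Here is the draft." },
];
const fromNeo4j: Msg[] = [{ role: "user", content: "Draft the quarterly summary" }];

console.log(missingFromProjection(fromJsonl, fromNeo4j));
// [ { role: 'assistant', content: 'Here is the draft.' } ]
```

One consequence of keying on content is that two identical messages in the same conversation collapse to a single key, so a projection holding only one of them would still pass this check.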

package/payload/platform/scripts/component-knowledgedoc-backfill.ts ADDED

@@ -0,0 +1,214 @@
+#!/usr/bin/env -S node --loader tsx
+/**
+ * Task 942 — backfill :KnowledgeDocument projections for legacy
+ * :Component rows whose render-component was emitted before this
+ * task landed.
+ *
+ * Walks every `:Component {name ∈ PERSISTENT_COMPONENTS}` row that
+ * lacks a sibling `:KnowledgeDocument` (matched by accountId +
+ * deterministic attachmentId derived from the component's id) and
+ * for each one materialises the file on disk + MERGEs the projection
+ * row in a single Cypher tx. Idempotent — re-running against the
+ * same rows is a no-op (MERGE collapses, file write rewrites the
+ * same bytes).
+ *
+ * Usage:
+ *   tsx platform/scripts/component-knowledgedoc-backfill.ts \
+ *       [--account-id=<uuid>]                # optional filter, default = all accounts
+ *       [--dry-run]                          # print what would happen, do not write
+ *
+ * Exit codes:
+ *   0 = no rows needed backfill, OR all rows succeeded (including --dry-run)
+ *   1 = at least one row failed (disk write threw, Cypher tx threw)
+ *   2 = invocation / Neo4j connection error
+ *
+ * Per-row log line:
+ *   [component-kd-backfill] convId=<…> componentId=<…> outcome=created|skipped|failed|dry-run reason=<…>
+ *
+ * The skip cases:
+ *  - the component name is not in PERSISTENT_COMPONENTS (defensive check),
+ *  - component data is not valid JSON,
+ *  - component data does not contain `data.content` or `data.html`,
+ *  - both fields empty,
+ *  - the projection row already exists (idempotent re-run).
+ */
+import process from "node:process";
+import { isPersistentComponent, PERSISTENT_COMPONENTS } from "../lib/persistent-components/src/index";
+import { deriveComponentAttachmentId, deriveComponentTitle, pickComponentBytes } from "../ui/app/lib/claude-agent/component-attachment";
+import { getSession } from "../ui/app/lib/neo4j-store";
+import { storeComponentArtefact } from "../ui/app/lib/attachments";
+interface Args {
+  accountIdFilter?: string;
+  dryRun: boolean;
+}
+function parseArgs(argv: string[]): Args {
+  const out: Args = { dryRun: false };
+  for (const a of argv) {
+    const m = a.match(/^--([a-z-]+)(?:=(.+))?$/);
+    if (!m) continue;
+    const [, key, val] = m;
+    if (key === "account-id") out.accountIdFilter = val;
+    else if (key === "dry-run") out.dryRun = true;
+  }
+  return out;
+}
+interface ComponentRow {
+  componentId: string;
+  conversationId: string;
+  accountId: string;
+  name: string;
+  data: string;
+  existingAttachmentId: string | null;
+  messageId: string;
+}
+async function main(): Promise<number> {
+  const args = parseArgs(process.argv.slice(2));
+  const session = getSession();
+  let backfilled = 0;
+  let skipped = 0;
+  let failed = 0;
+  try {
+    // Pull every PERSISTENT_COMPONENTS :Component row, optionally
+    // filtered by accountId. The query also returns the existing
+    // c.attachmentId (null on legacy / pre-942 rows). For legacy rows
+    // we derive attachmentId from componentId — it's the only stable
+    // identifier on the historical data — write the file, MERGE the
+    // projection, AND back-fill c.attachmentId so the audit harness
+    // collapses on the same row on its next run.
+    const componentNames = Array.from(PERSISTENT_COMPONENTS);
+    const filterClause = args.accountIdFilter ? "AND c.accountId = $accountId" : "";
+    const result = await session.run(
+      `MATCH (m:Message)-[:HAS_COMPONENT]->(c:Component)
+       WHERE c.name IN $names ${filterClause}
+       RETURN c.componentId AS componentId,
+              c.conversationId AS conversationId,
+              c.accountId AS accountId,
+              c.name AS name,
+              c.data AS data,
+              c.attachmentId AS existingAttachmentId,
+              m.messageId AS messageId
+       ORDER BY c.createdAt`,
+      args.accountIdFilter
+        ? { names: componentNames, accountId: args.accountIdFilter }
+        : { names: componentNames },
+    );
+    for (const record of result.records) {
+      const row: ComponentRow = {
+        componentId: record.get("componentId") as string,
+        conversationId: record.get("conversationId") as string,
+        accountId: record.get("accountId") as string,
+        name: record.get("name") as string,
+        data: record.get("data") as string,
+        existingAttachmentId: (record.get("existingAttachmentId") as string | null) ?? null,
+        messageId: record.get("messageId") as string,
+      };
+      if (!isPersistentComponent(row.name)) {
+        // Defensive — the WHERE clause already filters, but a future
+        // schema change might mismatch; log and move on.
+        skipped += 1;
+        console.log(`[component-kd-backfill] convId=${row.conversationId.slice(0, 8)} componentId=${row.componentId.slice(0, 8)} outcome=skipped reason=name-not-persistent`);
+        continue;
+      }
+      let dataObj: Record<string, unknown>;
+      try {
+        dataObj = JSON.parse(row.data) as Record<string, unknown>;
+      } catch {
+        skipped += 1;
+        console.log(`[component-kd-backfill] convId=${row.conversationId.slice(0, 8)} componentId=${row.componentId.slice(0, 8)} outcome=skipped reason=data-not-json`);
+        continue;
+      }
+      const bytesPick = pickComponentBytes(dataObj);
+      if (!bytesPick) {
+        skipped += 1;
+        console.log(`[component-kd-backfill] convId=${row.conversationId.slice(0, 8)} componentId=${row.componentId.slice(0, 8)} outcome=skipped reason=no-content-or-html`);
+        continue;
+      }
+      // Prefer the live-writer-stamped attachmentId when present;
+      // otherwise derive from componentId (legacy / pre-942 rows).
+      // The derived value is then written back onto :Component below
+      // so the audit harness sees a single source of truth on the
+      // next run.
+      const attachmentId = row.existingAttachmentId ?? deriveComponentAttachmentId(row.componentId);
+      const derivedFromComponentId = !row.existingAttachmentId;
+      const title = deriveComponentTitle(row.name, dataObj);
+      const filename = bytesPick.mimeType === "text/html" ? `${title}.html` : `${title}.md`;
+      if (args.dryRun) {
+        console.log(`[component-kd-backfill] convId=${row.conversationId.slice(0, 8)} componentId=${row.componentId.slice(0, 8)} outcome=dry-run attachmentId=${attachmentId.slice(0, 8)} mimeType=${bytesPick.mimeType} bytes=${bytesPick.content.length} source=${derivedFromComponentId ? "derived" : "stamped"}`);
+        continue;
+      }
+      try {
+        await storeComponentArtefact(row.accountId, attachmentId, bytesPick.mimeType, bytesPick.content, filename);
+      } catch (err) {
+        failed += 1;
+        const reason = err instanceof Error ? err.message : String(err);
+        console.log(`[component-kd-backfill] convId=${row.conversationId.slice(0, 8)} componentId=${row.componentId.slice(0, 8)} outcome=failed reason=disk-write:${JSON.stringify(reason.slice(0, 200))}`);
+        continue;
+      }
+      try {
+        // MERGE the projection + the discovery edge from :Message + back-fill
+        // c.attachmentId so the audit harness sees a single attachmentId
+        // source on its next run. The Cypher returns whether the projection
+        // was newly created or already existed, so the per-row log line
+        // distinguishes the two outcomes for forensic purposes.
+        const mergeResult = await session.run(
+          `MATCH (m:Message {messageId: $messageId, accountId: $accountId})
+           MATCH (m)-[:HAS_COMPONENT]->(c:Component {componentId: $componentId})
+           MERGE (k:KnowledgeDocument {accountId: $accountId, attachmentId: $attachmentId})
+           ON CREATE SET k.name = $title,
+                         k.encodingFormat = $mimeType,
+                         k.createdAt = datetime(),
+                         k.updatedAt = datetime()
+           ON MATCH SET k.updatedAt = datetime()
+           MERGE (m)-[:HAS_KNOWLEDGE_DOCUMENT]->(k)
+           SET c.attachmentId = $attachmentId
+           RETURN CASE WHEN k.createdAt = k.updatedAt THEN 'created' ELSE 'projection-existed' END AS state`,
+          {
+            accountId: row.accountId,
+            attachmentId,
+            title,
+            mimeType: bytesPick.mimeType,
+            messageId: row.messageId,
+            componentId: row.componentId,
+          },
+        );
+        const state = mergeResult.records[0]?.get("state") as string | undefined;
+        if (state === "created") {
+          backfilled += 1;
+          console.log(`[component-kd-backfill] convId=${row.conversationId.slice(0, 8)} componentId=${row.componentId.slice(0, 8)} outcome=created attachmentId=${attachmentId.slice(0, 8)} mimeType=${bytesPick.mimeType} bytes=${bytesPick.content.length} source=${derivedFromComponentId ? "derived" : "stamped"}`);
+        } else {
+          skipped += 1;
+          console.log(`[component-kd-backfill] convId=${row.conversationId.slice(0, 8)} componentId=${row.componentId.slice(0, 8)} outcome=skipped reason=already-projected`);
+        }
+      } catch (err) {
+        failed += 1;
+        const reason = err instanceof Error ? err.message : String(err);
+        console.log(`[component-kd-backfill] convId=${row.conversationId.slice(0, 8)} componentId=${row.componentId.slice(0, 8)} outcome=failed reason=cypher:${JSON.stringify(reason.slice(0, 200))}`);
+      }
+    }
+  } finally {
+    await session.close();
+  }
+  console.log(`[component-kd-backfill] summary backfilled=${backfilled} skipped=${skipped} failed=${failed}`);
+  return failed === 0 ? 0 : 1;
+}
+main()
+  .then((code) => process.exit(code))
+  .catch((err) => {
+    console.error(`[component-kd-backfill] crashed: ${err instanceof Error ? err.stack : String(err)}`);
+    process.exit(2);
+  });
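The skip decision and the filename choice depend on `pickComponentBytes`, whose implementation is not part of this diff. The sketch below is a guess at its contract based only on the header comment (skip when neither `data.content` nor `data.html` holds text) and on the `.html` / `.md` filename branch in the loop. The name `pickBytesSketch`, the precedence of `html` over `content`, and the `text/markdown` MIME type are all assumptions.

```typescript
interface BytesPick {
  mimeType: string;
  content: string;
}

// Hypothetical stand-in for pickComponentBytes. Returns null when there is
// nothing to write, which the backfill loop treats as a skip.
function pickBytesSketch(data: Record<string, unknown>): BytesPick | null {
  const html = typeof data["html"] === "string" ? data["html"] : "";
  const content = typeof data["content"] === "string" ? data["content"] : "";
  if (html.length > 0) return { mimeType: "text/html", content: html }; // assumed precedence
  if (content.length > 0) return { mimeType: "text/markdown", content }; // assumed MIME type
  return null;
}

// Mirrors the filename branch in the backfill loop.
function filenameFor(title: string, pick: BytesPick): string {
  return pick.mimeType === "text/html" ? `${title}.html` : `${title}.md`;
}

const pick = pickBytesSketch({ content: "# Plan" });
console.log(pick ? filenameFor("Plan", pick) : "skipped"); // Plan.md
console.log(pickBytesSketch({ html: "", content: "" })); // null
```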