npm - @arcote.tech/arc-chat - Versions diffs - 0.7.10 → 0.7.12 - Mend

@arcote.tech/arc-chat 0.7.10 → 0.7.12

Files changed (9)

package/README.md +258 -0
package/package.json +7 -7
package/src/aggregates/message.ts +74 -58
package/src/chat-builder.ts +80 -6
package/src/index.ts +14 -2
package/src/listeners/ai-generation-listener.ts +234 -179
package/src/react/chat-component.tsx +241 -204
package/src/routes/chat-stream-route.ts +21 -10
package/src/streaming/stream-registry.ts +252 -118

package/README.md ADDED

@@ -0,0 +1,258 @@
+# @arcote.tech/arc-chat
+Chat fragment for Arc: Conversation/Message aggregate + AI generation listener
++ SSE streaming + React component. Builder API: `chat(name).identifyBy(...).ai(...).build()`.
+This document explains **how the chat works**. It does not repeat what is in the code;
+it describes the **mental model** that every modification must respect
+so that the architecture does not break.
+---
+## Mental model
+> **The live value of the assistant's content lives only in server memory.**
+> **The DB knows only the final state, and only after the turn has ended.**
+During generation the LLM streams chunks into `stream-registry` (in-memory,
+per `messageId`). The client subscribes to SSE by `messageId` and receives:
+1. `init`: a snapshot of the current `currentBlocks` at connection time
+2. live `text_delta` / `tool_call_*`: the subsequent chunks
+3. `done`: end of the turn
+Only after `provider.streamComplete()` returns the full result does the listener call
+`completeAssistantTurn({ blocks })`, **the only write of content to the DB in the whole
+turn**. Then `finalize(messageId)` closes the stream and, after a 5s grace period,
+drops it from the map.
+**This is NOT event sourcing for streaming.** Snapshots of partial content
+written to the DB were an anti-pattern (needless overhead, duplicated state). Stream-registry
+is the authoritative source of the live value; the DB is the authoritative source of state
+after the turn is closed.
+---
+## Components
+```
+src/
+├─ aggregates/message.ts        Aggregate: fields, events, mutations
+├─ listeners/
+│  └─ ai-generation-listener.ts Generation loop + 3 listeners (gen/resume/retry)
+├─ routes/chat-stream-route.ts  GET /chat/:name/stream/:messageId (SSE)
+├─ streaming/stream-registry.ts In-memory per-messageId MessageStream
+├─ react/chat-component.tsx     UI: auto-subscribe SSE + timeline rebuild from DB
+├─ tools/ask-questions.tsx      Reusable interactive tool
+└─ chat-builder.ts              chat().identifyBy(...).ai(...).build()
+```
+---
+## Flow end-to-end
+```
+USER types "Hello", clicks Send
+    │
+    ▼
+sendMessage mutation (atomic)
+    ├─ emit assistantTurnStarted  → projection: set empty assistant row
+    │                                (isGenerating=true, no blocks)
+    └─ emit messageSent           → projection: set user row
+                                  → triggers aiGenerationListener (async)
+    │
+    ▼
+DB query getByScope() pushes both messages to the client
+    │
+    ▼
+React effect sees the isGenerating=true assistant row
+    activeGeneratingMessageId = the assistant's id
+    useChatMessageStream auto-opens SSE
+    │
+    ▼                                          ◀───┐
+fetch /route/chat/:name/stream/:messageId          │
+    │                                              │
+    ├─ subscribe(messageId)                        │
+    │     ├─ No stream → 410 ───────┐              │
+    │     │                          │              │
+    │     │                          ▼              │
+    │     │                     UI: "Interrupted"  │
+    │     │                     + Retry button     │
+    │     │                          │              │
+    │     │                          ▼              │
+    │     │                     retryGeneration    │
+    │     │                          └──────────────┘
+    │     │
+    │     └─ Stream exists → init with currentBlocks snapshot
+    │
+    ▼
+Listener: startStream() → provider.streamComplete(onChunk)
+    onChunk → publish(messageId, event)
+        ├─ mutates currentBlocks (text append / push tool_call / set args)
+        └─ broadcasts SSE to all subscribers
+    SSE client → processEvent → setTimeline
+    │
+    ▼
+streamComplete returns the full result.blocks
+    │
+    ▼
+completeAssistantTurn({ blocks })  ← the only write of content to the DB
+    │
+    ▼
+finalize(messageId, { usage, finishReason })
+    ├─ broadcast done to subscribers
+    ├─ close controllers
+    └─ setTimeout(delete, 5s): grace for late subscribers
+    │
+    ▼
+SSE client: done → setIsStreaming(false)
+DB query update: isGenerating=false, blocks=...
+    └─ historySig refire → timeline rebuild from the DB's final blocks
+```
+---
+## Edge cases
+### Graceful reload mid-stream (F5)
+The server and the listener keep generating. After a refresh, on the client side:
+1. The DB query returns the assistant row with `isGenerating=true`
+2. `activeGeneratingMessageId` gets set → the hook opens SSE
+3. `subscribe(messageId)` returns the current `currentBlocks` in the `init` event
+4. The client renders what has already been generated + continues live
+**No duplication**: there is no replay buffer of chunks, only a single snapshot.
+### Server restart mid-stream
+The process dies with `currentBlocks` in memory → they are lost. The DB has a row with
+`isGenerating=true`, but `subscribe(messageId)` returns `null` → the route responds with
+HTTP 410.
+1. React hook: `res.status === 410` → `setInterruptedIds(prev.add(messageId))`
+2. The timeline shows a TimelineItem `"interrupted"` + Retry button
+3. Clicking Retry → `retryGeneration({ messageId })`:
+   - the mutation emits `assistantTurnStarted` (fresh row) + `retryRequested`
+     (the projection removes the interrupted row)
+   - `aiRetryListener` reacts and starts `runGenerationLoop` with the fresh
+     `preCreatedAssistantMessageId`
+### Server tool call in the middle of a turn
+After `streamComplete` with `finishReason="tool_call"`:
+1. `completeAssistantTurn(blocks)`: the assistant row is finalized (blocks
+   contains the tool_call in proper order)
+2. `finalize(messageId)`: the stream is closed
+3. Each server tool: `saveToolResult` → tool_result row in the DB
+4. **Next loop iteration**: `startAssistantTurn` creates a new assistant
+   row (`isGenerating=true`) → new `messageId` → the client sees it in a DB
+   query update → new SSE stream → the second turn streams
+Each loop iteration = **separate `messageId` = separate stream**.
+### Interactive tool (e.g. askQuestions)
+After `streamComplete` with interactive tool calls:
+1. `completeAssistantTurn` + `finalize`: the first turn is closed
+2. The listener returns (loop break)
+3. The client sees the tool in the timeline (status=pending), `ChatInput` disabled
+4. The user clicks an answer → `respondToTool` mutation (atomically emits
+   `assistantTurnStarted` + `userResponded`)
+5. `aiResumeListener` reacts → the next turn streams
+---
+## Stream-registry API
+```ts
+startStream(messageId)              // idempotent. The listener calls it before publish
+publish(messageId, event)           // mutates currentBlocks + broadcasts SSE
+subscribe(messageId): {             // route handler. null → 410
+  stream, currentBlocks
+} | null
+finalize(messageId, finalDetails?)  // broadcast done, close, delete after 5s
+isActive(messageId): boolean        // health check
+getCurrentBlocks(messageId)         // debug/test, readonly
+```
+`PublishableEvent` is the subset of `ChatStreamEvent` without `init/done/messageId`:
+the registry emits `init/done`, and `messageId` is injected automatically.
+---
+## Key invariants
+**Live value:**
+- `currentBlocks` in stream-registry is the single source of truth for the
+  assistant's in-progress content
+- `partialBlocks`/`partialLastSeq` **DO NOT EXIST**; if a PR shows up
+  that adds them, reject it
+**DB:**
+- An assistant row with `isGenerating=true` has `blocks=undefined`
+- After `assistantTurnCompleted` the row has `isGenerating=false` + final `blocks`
+- Content NEVER lands in the DB chunk by chunk
+**Stream lifecycle:**
+- `startStream(messageId)` BEFORE the first `publish` (the listener guarantees this)
+- `finalize(messageId)` AFTER `completeAssistantTurn` (DB → in-memory order)
+- Each generation loop iteration → separate `messageId` → separate stream
+**Subscribe:**
+- The first event after `subscribe()` is ALWAYS `init`
+- `subscribe()` returns `null` (→ HTTP 410) **only when** the stream does not exist
+  in the map (outside the grace window). The client interprets 410 as "interrupted".
+---
+## Gotchas when modifying
+**`assistantTurnStarted` is emitted BEFORE `messageSent`/`userResponded`/
+`retryRequested` within a single mutation.** Reason: the async listener reacts to
+the latter and needs the assistant row to already exist in the DB. See the comment
+in the `sendMessage` mutation.
+**`historySig` in chat-component depends on `_id:isGenerating:blocks:contentLen`.**
+Do not add fields such as `updatedAt` here: the useEffect refires on every DB
+update, but the timeline rebuild must not fire during streaming
+(it resets the bubble caret). Strategy: the rebuild fires only when `isStreaming === false`,
+and SSE flips that only on `done`.
+**`activeGeneratingMessageId` is derived from `historyData` + `interruptedIds`.**
+If you change the logic that detects "which row needs a subscription", keep
+it in this `useMemo`; the auto-subscribe effect will fire on its own.
+**`buildHistory` in the listener skips `assistant` rows with `isGenerating=true
+&& !blocks`.** So interrupted rows (before the retryRequested projection)
+and fresh rows still being generated do not reach the LLM history. After a retry
+the fresh row is skipped as well: the history ends at the last user
+message and the LLM continues from it.
+**Stream-registry keeps a `toolCallsById` Map.** `publish("tool_call_pending")`
+creates a block in `currentBlocks` AND an entry in the map. `tool_call_arguments_complete`
+updates args on that same block. If you change the structure of the assistant's
+blocks, both places must stay consistent.
+**The server-tool execution loop does NOT use stream-registry.** After `finalize` for
+a turn with tool_calls, further `publish` calls would be no-ops. Server tool results
+reach the client through an aggregate query update (`saveToolResult` → tool_result
+row in the DB). This is **deliberate**: the next turn has its own stream.
+**No retention of event buffers.** A client that connects 6s after `finalize`
+gets a 410. No `?afterSeq`, no replay. After `done` the client has the final
+blocks from the DB and does not need SSE.
+---
+## Related fragments
+- `@arcote.tech/arc-ai`: provider abstraction, `StreamChunk`, `ChatStreamEvent`,
+  tool system, billing
+- `@arcote.tech/arc-ai-{openai,claude,gemini}`: provider implementations
+- `@arcote.tech/arc-ds`: DS components (Chat, ChatMessage, ChatInput,
+  ChatToolLog, ChatLabels)
+- `@arcote.tech/arc-ai-voice`: VoiceTextarea (voice dictation out of the box)
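
Reviewer sketch (not part of the package): the README above describes a per-`messageId` in-memory registry whose lifecycle is `startStream`, `publish`, `subscribe`, `finalize`. The block below is a minimal model of that lifecycle, written only to make the invariants concrete. Everything in it is an assumption: the block and event shapes are hypothetical, subscribers are plain callbacks rather than SSE controllers, `graceMs` is an illustrative parameter, and sending `init` followed by `done` to a late subscriber inside the grace window is a guess at the intended behavior. The real `stream-registry.ts` may differ on all of these points.

```typescript
// Simplified model of the registry described in the README. Hypothetical
// shapes throughout; subscribers are callbacks instead of SSE controllers.
type Block =
  | { type: "text"; text: string }
  | { type: "tool_call"; id: string; name: string; args?: unknown };

type PublishableEvent =
  | { type: "text_delta"; delta: string }
  | { type: "tool_call_pending"; id: string; name: string }
  | { type: "tool_call_arguments_complete"; id: string; args: unknown };

type WireEvent =
  | { type: "init"; messageId: string; blocks: Block[] }
  | { type: "done"; messageId: string }
  | (PublishableEvent & { messageId: string });

type Send = (event: WireEvent) => void;

interface MessageStream {
  currentBlocks: Block[];
  subscribers: Set<Send>;
  closed: boolean;
}

const streams = new Map<string, MessageStream>();

function startStream(messageId: string): void {
  if (streams.has(messageId)) return; // idempotent
  streams.set(messageId, { currentBlocks: [], subscribers: new Set(), closed: false });
}

function publish(messageId: string, event: PublishableEvent): void {
  const s = streams.get(messageId);
  if (!s || s.closed) return; // publishing after finalize is a no-op
  if (event.type === "text_delta") {
    const last = s.currentBlocks[s.currentBlocks.length - 1];
    if (last && last.type === "text") last.text += event.delta;
    else s.currentBlocks.push({ type: "text", text: event.delta });
  } else if (event.type === "tool_call_pending") {
    s.currentBlocks.push({ type: "tool_call", id: event.id, name: event.name });
  } else {
    const { id, args } = event;
    for (const b of s.currentBlocks) {
      if (b.type === "tool_call" && b.id === id) b.args = args;
    }
  }
  const wire = { ...event, messageId } as WireEvent;
  s.subscribers.forEach((send) => send(wire));
}

// Returns an unsubscribe function, or null when the stream is gone (HTTP 410).
function subscribe(messageId: string, send: Send): (() => void) | null {
  const s = streams.get(messageId);
  if (!s) return null;
  // The first event is always `init`, carrying a copy of the current snapshot.
  const blocks = JSON.parse(JSON.stringify(s.currentBlocks)) as Block[];
  send({ type: "init", messageId, blocks });
  if (s.closed) {
    send({ type: "done", messageId }); // assumed behavior inside the grace window
    return () => {};
  }
  s.subscribers.add(send);
  return () => { s.subscribers.delete(send); };
}

function finalize(messageId: string, graceMs: number = 5000): void {
  const s = streams.get(messageId);
  if (!s || s.closed) return;
  s.closed = true;
  s.subscribers.forEach((send) => send({ type: "done", messageId }));
  s.subscribers.clear();
  setTimeout(() => streams.delete(messageId), graceMs); // drop after the grace period
}
```

Because `init` carries a copy of the snapshot rather than a replay of past chunks, a client that reconnects mid-stream sees each piece of text once, which matches the README's "no duplication" claim.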

package/package.json CHANGED

@@ -1,7 +1,7 @@
 {
   "name": "@arcote.tech/arc-chat",
   "type": "module",
-  "version": "0.7.10",
+  "version": "0.7.12",
   "private": false,
   "description": "Chat module with AI integration for Arc framework",
   "main": "./src/index.ts",
@@ -10,12 +10,12 @@
     "type-check": "tsc --noEmit"
   },
   "peerDependencies": {
-    "@arcote.tech/arc": "^0.7.10",
-    "@arcote.tech/arc-ai": "^0.7.10",
-    "@arcote.tech/arc-ai-voice": "^0.7.10",
-    "@arcote.tech/arc-auth": "^0.7.10",
-    "@arcote.tech/arc-ds": "^0.7.10",
-    "@arcote.tech/platform": "^0.7.10",
+    "@arcote.tech/arc": "^0.7.12",
+    "@arcote.tech/arc-ai": "^0.7.12",
+    "@arcote.tech/arc-ai-voice": "^0.7.12",
+    "@arcote.tech/arc-auth": "^0.7.12",
+    "@arcote.tech/arc-ds": "^0.7.12",
+    "@arcote.tech/platform": "^0.7.12",
     "lucide-react": ">=0.400.0",
     "react": ">=18.0.0",
     "typescript": "^5.0.0"

package/src/aggregates/message.ts CHANGED

@@ -4,7 +4,6 @@ import {
   boolean,
   date,
   id,
-  number,
   string,
   type ArcId,
 } from "@arcote.tech/arc";
@@ -79,18 +78,6 @@ export const createMessageAggregate = <
     previousResponseId: string().optional(),
     isGenerating: boolean().optional(),
     usage: string().optional(),
-    /**
-     * Partial snapshot blocks (JSON-serialized AssistantContentBlock[])
-     * written during streaming every few chunks. Lets the client
-     * restore state after a browser reload and continue SSE from
-     * `partialLastSeq`. Cleared after `assistantTurnCompleted`.
-     */
-    partialBlocks: string().optional(),
-    /**
-     * The last SSE event seq applied to `partialBlocks`. The client
-     * sends `?afterSeq=partialLastSeq` on SSE resume.
-     */
-    partialLastSeq: number().optional(),
     createdAt: date(),
   })
@@ -153,30 +140,10 @@ export const createMessageAggregate = <
       },
     )
-    // ─── assistantTurnProgressSnapshot: checkpoint during streaming ─
-    // The listener emits it every N chunks or T seconds; after a reload the client
-    // reads `partialBlocks` + `partialLastSeq` and continues SSE from where it
-    // left off.
-    .publicEvent(
-      "assistantTurnProgressSnapshot",
-      {
-        messageId,
-        partialBlocks: string(),
-        partialLastSeq: number(),
-      },
-      async (ctx, event) => {
-        const p = event.payload;
-        await ctx.modify(p.messageId, {
-          partialBlocks: p.partialBlocks,
-          partialLastSeq: p.partialLastSeq,
-        } as any);
-      },
-    )
     // ─── assistantTurnCompleted — finalize an in-progress turn row ───
-    // Partial update on the SAME row — fills `blocks`, flips
-    // `isGenerating` to false, optionally records `previousResponseId`,
-    // `usage`, or `error`. Clears `partialBlocks` / `partialLastSeq`.
+    // The only write of content during the turn. The listener mutates the in-memory
+    // stream-registry per chunk; the final `blocks` land in the DB once, here,
+    // when `provider.streamComplete` returns the full result.
     .publicEvent(
       "assistantTurnCompleted",
       {
@@ -193,8 +160,6 @@ export const createMessageAggregate = <
           previousResponseId: p.previousResponseId,
           usage: p.usage,
           isGenerating: false,
-          partialBlocks: undefined,
-          partialLastSeq: undefined,
         } as any);
       },
     )
@@ -252,6 +217,30 @@ export const createMessageAggregate = <
       },
     )
+    // ─── retryRequested — user retries an interrupted generation ─
+    // The stream-registry entry disappeared (server restart / process crash) during
+    // generation; the interrupted assistant row sits in the DB with `isGenerating=true`
+    // and no `blocks`. The client sees SSE 410 and shows "Generation interrupted"
+    // + a Retry button. The `retryGeneration` mutation emits TWO events:
+    //  1) `assistantTurnStarted`: creates a fresh assistant row (as in `sendMessage`)
+    //  2) `retryRequested`: removes the interrupted row + triggers `aiRetryListener`
+    .publicEvent(
+      "retryRequested",
+      {
+        /** Fresh assistant row created together with this event. */
+        messageId,
+        scopeId,
+        sessionId: string(),
+        /** Interrupted assistant row to remove from the DB. */
+        interruptedMessageId: messageId,
+        model: string().optional(),
+      },
+      async (ctx, event) => {
+        const p = event.payload;
+        await ctx.remove(p.interruptedMessageId);
+      },
+    )
     // ─── sendMessage — user sends message, creates session ──────
     // Emits TWO events in one transaction: messageSent (user row) +
     // assistantTurnStarted (empty assistant row with isGenerating=true). Thanks to
@@ -325,26 +314,6 @@ export const createMessageAggregate = <
       ),
     )
-    // ─── saveProgressSnapshot: write partial JSON during streaming ─
-    .mutateMethod(
-      "saveProgressSnapshot",
-      (fn) => fn.withParams({
-        messageId,
-        partialBlocks: string(),
-        partialLastSeq: number(),
-      }).handle(
-        ONLY_SERVER &&
-        (async (ctx, params) => {
-          await ctx.assistantTurnProgressSnapshot.emit({
-            messageId: params.messageId,
-            partialBlocks: params.partialBlocks,
-            partialLastSeq: params.partialLastSeq,
-          });
-          return { ok: true };
-        }),
-      ),
-    )
     // ─── completeAssistantTurn — partial update of the open turn row ─
     .mutateMethod(
       "completeAssistantTurn",
@@ -437,6 +406,53 @@ export const createMessageAggregate = <
       ),
     )
+    // ─── retryGeneration — re-run generation for an interrupted turn ─
+    // Called when the client sees an `isGenerating=true` row + a 410 from SSE
+    // (the process restarted mid-stream). Creates a fresh assistant row and
+    // emits `retryRequested`; `aiRetryListener` calls the provider again
+    // with the current history (the interrupted row is removed by the projection).
+    .mutateMethod(
+      "retryGeneration",
+      (fn) => fn.withParams({
+        messageId,
+      }).handle(
+        ONLY_SERVER &&
+        (async (ctx, params) => {
+          const interrupted = await ctx.$query.findOne({
+            where: { _id: params.messageId },
+          });
+          if (!interrupted) {
+            throw new Error("retryGeneration: message not found");
+          }
+          if ((interrupted as any).role !== "assistant" || !(interrupted as any).isGenerating) {
+            throw new Error("retryGeneration: row is not an interrupted assistant turn");
+          }
+          const assistantMsgId = messageId.generate();
+          const newSessionId = `session_${Date.now()}_${Math.random().toString(36).slice(2, 9)}`;
+          const model = (interrupted as any).model;
+          // EMIT ORDER: see the comment in `sendMessage`.
+          await ctx.assistantTurnStarted.emit({
+            messageId: assistantMsgId,
+            scopeId: (interrupted as any).scopeId,
+            sessionId: newSessionId,
+            model,
+          });
+          await ctx.retryRequested.emit({
+            messageId: assistantMsgId,
+            scopeId: (interrupted as any).scopeId,
+            sessionId: newSessionId,
+            interruptedMessageId: params.messageId,
+            model,
+          });
+          return { messageId: assistantMsgId, sessionId: newSessionId };
+        }),
+      ),
+    )
     // ─── startStage — initiate stage with a default priming prompt ─
     // Stored as role="system" so the UI timeline hides it, but the AI
     // generation listener still picks it up as a conversational turn
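
Reviewer sketch (not part of the package): the `retryGeneration` mutation above validates the interrupted row, then emits `assistantTurnStarted` before `retryRequested`. The model below replays that sequence against an in-memory `Map` to show the resulting state. The `MessageRow` shape, the `nextId` generator, and the two `on...` projection functions are hypothetical stand-ins for the aggregate's projections and the framework's `emit` calls; session ids and the model field are omitted for brevity.

```typescript
// In-memory model of the retry flow. The Map stands in for the aggregate's
// projection; `nextId` is a hypothetical id generator.
interface MessageRow {
  _id: string;
  role: "user" | "assistant";
  scopeId: string;
  isGenerating?: boolean;
  blocks?: string;
}

const rows = new Map<string, MessageRow>();
const emitted: string[] = [];
let counter = 0;
const nextId = (): string => `msg_${++counter}`;

// Projection for `assistantTurnStarted`: create an empty, generating row.
function onAssistantTurnStarted(p: { messageId: string; scopeId: string }): void {
  emitted.push("assistantTurnStarted");
  rows.set(p.messageId, {
    _id: p.messageId,
    role: "assistant",
    scopeId: p.scopeId,
    isGenerating: true,
  });
}

// Projection for `retryRequested`: remove the interrupted row.
function onRetryRequested(p: { messageId: string; interruptedMessageId: string }): void {
  emitted.push("retryRequested");
  rows.delete(p.interruptedMessageId);
}

function retryGeneration(messageId: string): { messageId: string } {
  const interrupted = rows.get(messageId);
  if (!interrupted) throw new Error("retryGeneration: message not found");
  if (interrupted.role !== "assistant" || !interrupted.isGenerating) {
    throw new Error("retryGeneration: row is not an interrupted assistant turn");
  }
  const fresh = nextId();
  // Order matters: the fresh row must exist before the event that triggers
  // the retry listener is emitted.
  onAssistantTurnStarted({ messageId: fresh, scopeId: interrupted.scopeId });
  onRetryRequested({ messageId: fresh, interruptedMessageId: messageId });
  return { messageId: fresh };
}
```

After a successful call the interrupted row is gone and exactly one fresh row with `isGenerating=true` exists in the same scope, which is the state the retry listener expects to find.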

package/src/chat-builder.ts CHANGED

@@ -14,7 +14,11 @@ import { tool as createToolFactory } from "@arcote.tech/arc-ai";
 import type { ArcTokenAny } from "@arcote.tech/arc";
 import type { ViewProtectionFn } from "@arcote.tech/arc";
 import { createMessageId, createMessageAggregate } from "./aggregates/message";
-import { createAiGenerationListener, createAiResumeListener } from "./listeners/ai-generation-listener";
+import {
+  createAiGenerationListener,
+  createAiResumeListener,
+  createAiRetryListener,
+} from "./listeners/ai-generation-listener";
 import { createChatStreamRoute } from "./routes/chat-stream-route";
 import { createChatComponent } from "./react/chat-component";
 import type { ChatInputTextareaSlotProps, ChatLabels } from "@arcote.tech/arc-ds";
@@ -51,6 +55,22 @@ export interface ChatReactComponentOptions {
 // ─── Chat Data ──────────────────────────────────────────────────
+/**
+ * Map snapshot token params (decoded payload of the token that protects this
+ * chat) to the scopeId we charge for an AI call. Consumers wire this via
+ * `.billTo(fn)`. Examples for typical setups:
+ *
+ *   .billTo(p => p.accountId)    // per-user billing
+ *   .billTo(p => p.workspaceId)  // per-workspace billing
+ *
+ * Called inside the ai-generation-listener with `ctx.$auth.params` (which
+ * is the decoded payload of the token snapshotted at `messageSent` emit
+ * time — i.e. whatever token the chat's `.protectBy(...)` is configured
+ * with). The returned string becomes the `_id` of the ledger row that gets
+ * debited.
+ */
+export type BillToFn = (tokenParams: Record<string, any>) => string;
 export interface ArcChatData {
   name: string;
   identifyBy: ArcId<any> | null;
@@ -63,6 +83,8 @@ export interface ArcChatData {
   tools: ArcToolAny[];
   maxExecutionCount: number;
   toolChoice: "auto" | "required" | { type: "function"; name: string };
+  alias: string | null;
+  billTo: BillToFn | null;
 }
 const defaultChatData = {
@@ -77,6 +99,8 @@ const defaultChatData = {
   tools: [],
   maxExecutionCount: 10,
   toolChoice: "auto" as const,
+  alias: null,
+  billTo: null,
 } as const satisfies ArcChatData;
 type DefaultChatData = typeof defaultChatData;
@@ -160,6 +184,41 @@ export class ArcChat<const Data extends ArcChatData = DefaultChatData> {
     } as any);
   }
+  /**
+   * Billing alias for this chat — written to the `usageRecorded` event
+   * payload (see `@arcote.tech/arc-ai`). Defaults to chat `name`. Override
+   * when you want consistent reporting across renames or to group multiple
+   * chats under one alias for admin SQL reports.
+   *
+   *   chat("identityConsultation").alias("chat-identity")...
+   */
+  alias<const A extends string>(alias: A) {
+    return new ArcChat<Merge<Data, { alias: A }>>({
+      ...this.data,
+      alias,
+    } as any);
+  }
+  /**
+   * Decide which scopeId to bill for an AI call made from this chat.
+   *
+   * Called with the decoded params of the token snapshotted at `messageSent`
+   * emit time (i.e. the token used in `.protectBy(...)`). The returned string
+   * becomes the `_id` of the row in `creditLedger` that gets debited.
+   *
+   *   .billTo(p => p.accountId)    // per-user billing
+   *   .billTo(p => p.workspaceId)  // per-workspace billing
+   *
+   * Required when `.ai(...)` config has billing wired — `build()` throws
+   * otherwise. Without billing, this method is a no-op.
+   */
+  billTo(fn: BillToFn) {
+    return new ArcChat<Merge<Data, { billTo: BillToFn }>>({
+      ...this.data,
+      billTo: fn,
+    } as any);
+  }
   createTool<const N extends string>(name: N) {
     type IdType = Data["identifyBy"] extends ArcId<any>
       ? $type<Data["identifyBy"]>
@@ -191,6 +250,8 @@ export class ArcChat<const Data extends ArcChatData = DefaultChatData> {
       tools,
       maxExecutionCount,
       toolChoice,
+      alias: aliasOverride,
+      billTo,
     } = this.data;
     if (!name) throw new Error("ArcChat: name is required");
@@ -198,6 +259,13 @@ export class ArcChat<const Data extends ArcChatData = DefaultChatData> {
     if (!accountId) throw new Error("ArcChat: accountId is required");
     if (!userToken) throw new Error("ArcChat: userToken is required");
     if (!aiConfig) throw new Error("ArcChat: ai is required");
+    // Billing wired but no `.billTo(...)` would silently skip recordUsage —
+    // forbid that. Consumer must explicitly decide which scopeId to charge.
+    if (aiConfig.recordUsage && !billTo) {
+      throw new Error(
+        `ArcChat "${name}": ai() factory has billing wired but chat is missing .billTo(...) — declare how to map snapshot token params to a billing scope, e.g. .billTo(p => p.accountId).`,
+      );
+    }
     const messageId = createMessageId({ name });
@@ -240,13 +308,14 @@ export class ArcChat<const Data extends ArcChatData = DefaultChatData> {
     const serverTools = tools.filter((t) => t.isServerTool);
     const interactiveTools = tools.filter((t) => t.isInteractiveTool);
-    // Add ledger element to mutation deps if billing configured
-    const billingElements: ArcContextElement<any>[] = [];
-    if (aiConfig.billing) {
-      for (const el of aiConfig.billing.ledger.elements) {
+    // Add usage-registry aggregate to listener mutate deps so `recordUsage`
+    // (which calls `ctx.mutate(registry).recordUsage(...)`) compiles and runs.
+    // Ledger view is a read-only projection — consumer queries it from React,
+    // listener never writes to it directly.
+    if (aiConfig.usageRegistry) {
+      for (const el of aiConfig.usageRegistry.elements) {
         if (!allMutationElements.includes(el)) allMutationElements.push(el);
         if (!allQueryElements.includes(el)) allQueryElements.push(el);
-        billingElements.push(el);
       }
     }
@@ -261,10 +330,14 @@ export class ArcChat<const Data extends ArcChatData = DefaultChatData> {
       allMutationElements,
       maxExecutionCount,
       toolChoice: toolChoice !== "auto" ? toolChoice : undefined,
+      alias: aliasOverride ?? name,
+      recordUsage: aiConfig.recordUsage,
+      billTo: billTo ?? undefined,
     };
     const aiListener = createAiGenerationListener(listenerConfig);
     const aiResumeListener = createAiResumeListener(listenerConfig);
+    const aiRetryListener = createAiRetryListener(listenerConfig);
     const streamRoute = createChatStreamRoute({
       name,
@@ -275,6 +348,7 @@ export class ArcChat<const Data extends ArcChatData = DefaultChatData> {
       Message,
       aiListener,
       aiResumeListener,
+      aiRetryListener,
       streamRoute,
     ];
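
Reviewer sketch (not part of the package): the `chat-builder.ts` changes above add `.alias(...)` and `.billTo(...)`, plus a `build()` guard that throws when billing is wired without a `billTo` mapping. The model below reduces that to a small immutable builder. `MiniChat`, `AiConfig`, `BuiltChat`, and `billingScope` are hypothetical names used only for illustration; the real `ArcChat` carries far more state and stronger types.

```typescript
// Minimal immutable-builder model of `.alias()` / `.billTo()` and the
// build-time guard. All names except BillToFn are hypothetical stand-ins.
type BillToFn = (tokenParams: Record<string, any>) => string;

interface AiConfig {
  recordUsage?: (scopeId: string, alias: string) => void;
}

interface MiniChatData {
  name: string;
  ai: AiConfig | null;
  alias: string | null;
  billTo: BillToFn | null;
}

interface BuiltChat {
  alias: string;
  billingScope: (tokenParams: Record<string, any>) => string | null;
}

class MiniChat {
  private readonly data: MiniChatData;

  constructor(data: MiniChatData) {
    this.data = data;
  }

  // Each setter returns a new builder and leaves the old one untouched.
  ai(config: AiConfig): MiniChat {
    return new MiniChat({ ...this.data, ai: config });
  }

  alias(alias: string): MiniChat {
    return new MiniChat({ ...this.data, alias });
  }

  billTo(fn: BillToFn): MiniChat {
    return new MiniChat({ ...this.data, billTo: fn });
  }

  build(): BuiltChat {
    const { name, ai, alias, billTo } = this.data;
    if (!ai) throw new Error("MiniChat: ai is required");
    // Billing wired without billTo would silently skip usage recording.
    if (ai.recordUsage && !billTo) {
      throw new Error(`MiniChat "${name}": billing is wired but .billTo(...) is missing`);
    }
    return {
      alias: alias ?? name, // alias defaults to the chat name
      billingScope: (tokenParams) => (billTo ? billTo(tokenParams) : null),
    };
  }
}

const chat = (name: string): MiniChat =>
  new MiniChat({ name, ai: null, alias: null, billTo: null });
```

Failing at `build()` rather than at the first AI call surfaces a missing billing scope at startup instead of on a user's first message.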

package/src/index.ts CHANGED

@@ -7,10 +7,22 @@ export { createMessageAggregate, createMessageId } from "./aggregates/message";
 export type { MessageAggregate, MessageId } from "./aggregates/message";
 // --- Streaming ---
-export { broadcast, endStream, hasActiveStream, subscribe } from "./streaming/stream-registry";
+export {
+  startStream,
+  publish,
+  subscribe,
+  finalize,
+  isActive,
+  getCurrentBlocks,
+} from "./streaming/stream-registry";
+export type { PublishableEvent } from "./streaming/stream-registry";
 // --- Listener ---
-export { createAiGenerationListener } from "./listeners/ai-generation-listener";
+export {
+  createAiGenerationListener,
+  createAiResumeListener,
+  createAiRetryListener,
+} from "./listeners/ai-generation-listener";
 export type { AiGenerationListenerConfig, InstructionResult } from "./listeners/ai-generation-listener";
 // --- Routes ---