bloby-bot 0.20.6 → 0.20.8

package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "bloby-bot",
3
- "version": "0.20.6",
3
+ "version": "0.20.8",
4
4
  "releaseNotes": [
5
5
  "1. react router implemented",
6
6
  "2. new workspace design",
@@ -0,0 +1,333 @@
1
+ # Agent Architecture — Long-lived Query Model
2
+
3
+ ## Current State (v1)
4
+
5
+ Each user message spawns a new `query()` call with `maxTurns: 8`. The orchestrator responds, optionally delegates coding tasks to background sub-agents via the SDK's native `agents` config, and the query ends. The next user message starts a fresh query.
6
+
7
+ ### What works
8
+ - Orchestrator delegates coding tasks to background `coder` agent
9
+ - Quick tasks (memory writes, config edits) handled directly by orchestrator
10
+ - User never sees the multi-agent abstraction — feels like one entity
11
+ - Sub-agent progress/completion events streamed via WebSocket
12
+ - Scalable agent definitions in `supervisor/agents/`
13
+
14
+ ### What doesn't
15
+ - **maxTurns budget**: 8 turns are consumed quickly (memory write + delegation + response), so complex messages with multiple intents hit the limit.
16
+ - **Blocking**: While the orchestrator runs its 8 turns, the user can't send another message. The chat locks for 5-15 seconds.
17
+ - **No proactive reporting**: When a sub-agent finishes, the orchestrator can't report back — it's dead. The user only learns about completion on their NEXT message.
18
+ - **Race conditions**: If the user sends a message while a query is running, a second `query()` spawns. Two queries run in parallel with interleaved responses and orphaned abort controllers.
19
+ - **Stateless context**: Each query re-reads memory files, re-assembles the system prompt, re-loads agent definitions. Wasteful and slow.
20
+
21
+ ---
22
+
23
+ ## Target Architecture (v2) — Long-lived Query
24
+
25
+ ### Core Concept
26
+
27
+ One `query()` per conversation that stays alive for the duration of the session. User messages are pushed into an async input queue. The agent processes them naturally — responding, delegating, and reporting — all within a single continuous stream.
28
+
+ ```
+ ┌──────────────┐         ┌─────────────────────┐
+ │  User sends  │────────▶│  Async Input Queue  │
+ │  messages    │         │  (push at any time) │
+ └──────────────┘         └────────┬────────────┘
+                                   │
+                                   ▼
+                          ┌─────────────────────┐
+                          │  Single long-lived  │
+                          │  query() loop       │◀── background agents report back
+                          │                     │
+                          └────────┬────────────┘
+                                   │
+                                   ▼
+                          ┌─────────────────────┐
+                          │  Stream SDKMessages │──▶ tokens, tools, responses
+                          │  back to WebSocket  │    all fluid, no blocking
+                          └─────────────────────┘
+ ```
48
+
49
+ ### SDK Features That Enable This
50
+
51
+ **1. AsyncIterable prompt input**
52
+
53
+ `query()` accepts an `AsyncIterable<SDKUserMessage>` as its `prompt` in place of a plain string. This lets us create an async queue that the SDK consumes as messages arrive:
54
+
55
+ ```typescript
56
+ // Create an async queue the SDK reads from
57
+ const inputQueue = createAsyncQueue<SDKUserMessage>();
58
+
59
+ // Start the query with the queue as input
60
+ const handle = query({
61
+ prompt: inputQueue,
62
+ options: { ... }
63
+ });
64
+
65
+ // Push messages at any time — the SDK picks them up
66
+ inputQueue.push({
67
+ type: 'user',
68
+ message: { role: 'user', content: 'Build me a page' },
69
+ });
70
+ ```
71
+
72
+ **2. SDKUserMessage.priority**
73
+
74
+ Controls when injected messages are processed:
75
+ - `'now'` — interrupt the agent immediately, inject this message into the current turn
76
+ - `'next'` — process after the current turn finishes (default for most user messages)
77
+ - `'later'` — low priority queue (good for system notifications)
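
As a minimal sketch of how a caller might choose among these priorities, the helper below maps a message source to a priority value. The `Source` type, the `QueuedMessage` shape, the mapping table, and `toQueuedMessage` are hypothetical names for illustration. Only the three priority strings come from the list above, and the real `SDKUserMessage` type may carry additional fields.

```typescript
// Hypothetical local types; the real SDKUserMessage may have more fields.
type Priority = 'now' | 'next' | 'later';
type Source = 'user' | 'urgent' | 'system';

interface QueuedMessage {
  type: 'user';
  message: { role: 'user'; content: string };
  priority: Priority;
}

// Assumed mapping: ordinary chat waits for the current turn, urgent
// interrupts cut in immediately, system notices go to the low-priority queue.
const PRIORITY_BY_SOURCE: Record<Source, Priority> = {
  user: 'next',
  urgent: 'now',
  system: 'later',
};

// Builds a queue-ready message; callers that pass no source get 'next'.
function toQueuedMessage(content: string, source: Source = 'user'): QueuedMessage {
  return {
    type: 'user',
    message: { role: 'user', content },
    priority: PRIORITY_BY_SOURCE[source],
  };
}
```

Keeping the mapping in one table means the decision about what counts as urgent lives in a single place rather than at every call site.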
78
+
79
+ **3. streamInput()**
80
+
81
+ Method on the query handle to pipe additional message streams into a running query. Alternative to the queue approach.
82
+
83
+ **4. resume / sessionId**
84
+
85
+ Session continuity across query restarts. The SDK persists conversation history to JSONL transcripts and can resume from where it left off. This means the long-lived query can survive supervisor restarts.
86
+
87
+ **5. stopTask(taskId)**
88
+
89
+ Stop specific background agents without killing the whole query. Already integrated in our current implementation.
90
+
91
+ ---
92
+
93
+ ## Architecture Details
94
+
95
+ ### Lifecycle
96
+
97
+ ```
98
+ User opens chat
99
+
100
+ ├── First message arrives
101
+ │ └── Start long-lived query(asyncQueue)
102
+ │ ├── System prompt + memory + config injected once
103
+ │ ├── Agents config (coder, researcher) loaded once
104
+ │ └── for-await loop begins processing events
105
+
106
+ ├── User sends more messages
107
+ │ └── Push to asyncQueue (priority: 'next')
108
+ │ └── Agent sees them naturally, responds in sequence
109
+
110
+ ├── User sends message while agent is working
111
+ │ └── Push to asyncQueue (priority: 'next')
112
+ │ └── Agent processes it after current turn
113
+ │ └── OR use priority: 'now' for urgent interrupts
114
+
115
+ ├── Background agent completes
116
+ │ └── SDK notifies the orchestrator via task_notification
117
+ │ └── Orchestrator responds naturally: "Done! Here's what I built."
118
+ │ └── No user message needed — orchestrator is alive
119
+
120
+ ├── User disconnects / reconnects
121
+ │ └── Query keeps running (no WebSocket needed for SDK)
122
+ │ └── Reconnecting client gets caught up via chat:state
123
+
124
+ ├── Conversation cleared
125
+ │ └── End the query (abort controller)
126
+ │ └── Next message starts a new long-lived query
127
+
128
+ └── Supervisor restarts
129
+ └── Resume from sessionId (SDK persists to JSONL)
130
+ ```
131
+
132
+ ### What Changes from v1
133
+
134
+ | Component | v1 (current) | v2 (long-lived) |
135
+ |---|---|---|
136
+ | `bloby-agent.ts` | `startBlobyAgentQuery()` called per message, creates new `query()` each time | `startConversation()` creates one `query()`, `pushMessage()` adds to queue |
137
+ | `index.ts` WS handler | `user:message` → call `startBlobyAgentQuery()` | `user:message` → call `pushMessage()` on existing conversation |
138
+ | maxTurns | 8 (orchestrator), 50 (sub-agents) | None for orchestrator (runs indefinitely), 50 for sub-agents |
139
+ | System prompt | Re-assembled every message | Assembled once on conversation start |
140
+ | Memory files | Re-read every message | Read once, updated via tool use (agent reads/writes them naturally) |
141
+ | Agent definitions | `buildAgents()` called every message | Built once on conversation start |
142
+ | `agentQueryActive` | Boolean flag, true during query | Always true while conversation exists |
143
+ | Backend restart deferral | Deferred until bot:done | Need new mechanism — query never ends. Defer until agent is between turns (no active tool use). |
144
+ | Sub-agent completion | User learns on next message | Orchestrator reports immediately — it's alive |
145
+ | Error recovery | Query fails → error message → next message starts fresh | Query fails → attempt resume from sessionId → fallback to new query |
146
+
147
+ ### Key Implementation Details
148
+
149
+ #### 1. Async Input Queue
150
+
+ ```typescript
+ // Queue type referenced by the conversation manager (LiveConversation.inputQueue)
+ type AsyncQueue<T> = AsyncIterable<T> & { push: (item: T) => void; end: () => void };
+
+ function createAsyncQueue<T>(): AsyncQueue<T> {
+   const pending: T[] = [];
+   // Resolver of the consumer's outstanding next() call, if it is waiting
+   let resolve: ((value: IteratorResult<T>) => void) | null = null;
+   let done = false;
+
+   return {
+     push(item: T) {
+       if (resolve) {
+         // Consumer is already waiting: hand the item over directly
+         resolve({ value: item, done: false });
+         resolve = null;
+       } else {
+         pending.push(item);
+       }
+     },
+     end() {
+       done = true;
+       if (resolve) {
+         resolve({ value: undefined, done: true });
+         resolve = null;
+       }
+     },
+     [Symbol.asyncIterator]() {
+       return {
+         next(): Promise<IteratorResult<T>> {
+           // Drain buffered items before reporting completion
+           if (pending.length > 0) {
+             return Promise.resolve({ value: pending.shift()!, done: false });
+           }
+           if (done) return Promise.resolve({ value: undefined, done: true });
+           return new Promise((r) => { resolve = r; });
+         },
+       };
+     },
+   };
+ }
+ ```
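
A sketch of how a consumer might drain such a queue with `for await`, standing in for the SDK's read loop. The queue is repeated in compact form so the block runs on its own. `collect` is a hypothetical helper, not part of any SDK.

```typescript
type AsyncQueue<T> = AsyncIterable<T> & { push(item: T): void; end(): void };

// Compact copy of the queue so this example is self-contained.
function createAsyncQueue<T>(): AsyncQueue<T> {
  const pending: T[] = [];
  let waiting: ((r: IteratorResult<T>) => void) | null = null;
  let done = false;
  return {
    push(item: T) {
      if (waiting) {
        waiting({ value: item, done: false });
        waiting = null;
      } else {
        pending.push(item);
      }
    },
    end() {
      done = true;
      if (waiting) {
        waiting({ value: undefined, done: true });
        waiting = null;
      }
    },
    [Symbol.asyncIterator]() {
      return {
        next(): Promise<IteratorResult<T>> {
          if (pending.length > 0) {
            return Promise.resolve({ value: pending.shift() as T, done: false });
          }
          if (done) return Promise.resolve({ value: undefined, done: true });
          return new Promise((r) => { waiting = r; });
        },
      };
    },
  };
}

// Hypothetical consumer: gathers everything pushed until end() is called.
async function collect<T>(queue: AsyncIterable<T>): Promise<T[]> {
  const seen: T[] = [];
  for await (const item of queue) seen.push(item);
  return seen;
}
```

Items pushed before the consumer starts reading are buffered in `pending`, so a message that arrives while the query is still starting up is not lost.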
184
+
185
+ #### 2. Conversation Manager
186
+
187
+ Replace the current per-message model with a conversation-scoped manager:
188
+
+ ```typescript
+ interface LiveConversation {
+   id: string;
+   inputQueue: AsyncQueue<SDKUserMessage>;
+   queryHandle: QueryHandle;
+   abortController: AbortController;
+ }
+
+ const conversations = new Map<string, LiveConversation>();
+
+ // Signatures only; bodies are part of the Phase 2 refactor.
+
+ // Start a conversation (called once, on first message)
+ declare function startConversation(convId: string, options: ConvOptions): LiveConversation;
+
+ // Push a user message into an existing conversation
+ declare function pushMessage(convId: string, content: string, attachments?: any[]): void;
+
+ // End a conversation (clear context, abort)
+ declare function endConversation(convId: string): void;
+ ```
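
One possible shape for the three functions, as a minimal sketch rather than the planned implementation. It assumes the queue and the abort hook are injected through a hypothetical `ConvDeps` object so that the block does not depend on the SDK. A real version would also start `query()` and keep its handle, and the `UserMessage` and `InputQueue` types here are simplified stand-ins.

```typescript
// Hypothetical stand-ins; the real SDK types and query() are not reproduced here.
interface UserMessage { type: 'user'; message: { role: 'user'; content: string } }
interface InputQueue { push(msg: UserMessage): void; end(): void }
interface ConvDeps { queue: InputQueue; abort: () => void }

interface LiveConversationSketch {
  id: string;
  inputQueue: InputQueue;
  abort: () => void;
}

const liveConversations = new Map<string, LiveConversationSketch>();

// Returns the existing conversation if there is one, so that each
// conversation id maps to exactly one long-lived query.
function startConversation(convId: string, deps: ConvDeps): LiveConversationSketch {
  const existing = liveConversations.get(convId);
  if (existing) return existing;
  const conv: LiveConversationSketch = { id: convId, inputQueue: deps.queue, abort: deps.abort };
  liveConversations.set(convId, conv);
  return conv;
}

// Wraps plain text in a user message and hands it to the conversation's queue.
function pushMessage(convId: string, content: string): void {
  const conv = liveConversations.get(convId);
  if (!conv) throw new Error(`No live conversation: ${convId}`);
  conv.inputQueue.push({ type: 'user', message: { role: 'user', content } });
}

// Closes the input, cancels in-flight work, and forgets the conversation.
function endConversation(convId: string): void {
  const conv = liveConversations.get(convId);
  if (!conv) return;
  conv.inputQueue.end();
  conv.abort();
  liveConversations.delete(convId);
}
```

Making `startConversation` idempotent is one way to avoid the v1 race in which two near-simultaneous messages each spawn their own query.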
208
+
209
+ #### 3. WebSocket Handler Changes
210
+
211
+ ```typescript
212
+ // Current (v1):
213
+ ws.on('user:message') → startBlobyAgentQuery(convId, content, ...)
214
+
215
+ // New (v2):
216
+ ws.on('user:message') → {
217
+ let conv = conversations.get(convId);
218
+ if (!conv) {
219
+ conv = startConversation(convId, { model, names, ... });
220
+ }
221
+ pushMessage(convId, content, attachments);
222
+ }
223
+ ```
224
+
225
+ #### 4. Backend Restart Timing
226
+
227
+ Currently, backend restart is deferred until `bot:done` (query end). With a long-lived query, there's no `bot:done` until the conversation ends. New approach:
228
+
229
+ - Track whether the agent is currently in a tool-use turn (has active tool calls)
230
+ - When file changes detected (Write/Edit used), schedule restart for when no tool is active
231
+ - The `bot:response` event (agent finished responding) is a safe restart point
232
+ - Sub-agent `bot:task-done` events are also safe restart points
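
The deferral rule above could be sketched as a small gate object. `RestartGate` and its method names are hypothetical, and the wiring to the actual `bot:response` and `bot:task-done` events is assumed rather than shown.

```typescript
// Hypothetical helper: defers a backend restart until no tool call is active.
class RestartGate {
  private activeTools = 0;
  private restartPending = false;
  private readonly restart: () => void;

  constructor(restart: () => void) {
    this.restart = restart;
  }

  // Call when the agent begins a tool call.
  toolStarted(): void {
    this.activeTools += 1;
  }

  // Call when a tool call returns; the counter never drops below zero.
  toolFinished(): void {
    this.activeTools = Math.max(0, this.activeTools - 1);
  }

  // Call when Write/Edit touched files: a restart is wanted, but not yet.
  requestRestart(): void {
    this.restartPending = true;
  }

  // Call at safe points such as bot:response or bot:task-done.
  // Restarts only if one was requested and no tool is currently running.
  safePoint(): void {
    if (this.restartPending && this.activeTools === 0) {
      this.restartPending = false;
      this.restart();
    }
  }
}
```

Several file changes between two safe points collapse into a single restart, because the gate only remembers that a restart is pending, not how many were requested.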
233
+
234
+ #### 5. Session Persistence / Resume
235
+
236
+ The SDK writes conversation history to JSONL. On supervisor restart:
237
+
238
+ ```typescript
239
+ // Try to resume existing conversation
240
+ const conv = query({
241
+ prompt: inputQueue,
242
+ options: {
243
+ ...opts,
244
+ resume: true,
245
+ sessionId: savedSessionId,
246
+ },
247
+ });
248
+ ```
249
+
250
+ If resume fails (corrupt session, SDK version mismatch), fall back to a fresh query with conversation history from the database.
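
The fallback path might look like the following sketch. `Starter`, `resumeOrRestart`, and `loadHistory` are hypothetical names. `Starter` stands in for whatever wraps `query()`, and how database history is fed into a fresh query is left open here.

```typescript
// Hypothetical shape: Starter stands in for whatever wraps query().
type StartOptions = { sessionId?: string; history?: string[] };
type Starter<H> = (opts: StartOptions) => Promise<H>;

// Tries to resume the saved session first; on any failure, starts a fresh
// query seeded with conversation history loaded from the database.
async function resumeOrRestart<H>(
  start: Starter<H>,
  savedSessionId: string | undefined,
  loadHistory: () => Promise<string[]>,
): Promise<{ handle: H; resumed: boolean }> {
  if (savedSessionId) {
    try {
      const handle = await start({ sessionId: savedSessionId });
      return { handle, resumed: true };
    } catch (err) {
      // Corrupt session or SDK version mismatch: fall through to a fresh start
    }
  }
  const history = await loadHistory();
  const handle = await start({ history });
  return { handle, resumed: false };
}
```

Returning the `resumed` flag lets the caller decide whether to tell the user that some context may have been lost.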
251
+
252
+ #### 6. Memory / Prompt Refresh
253
+
254
+ With a long-lived query, the system prompt is set once. If memory files change (agent writes to them), the agent already knows — it made the change. No need to re-inject.
255
+
256
+ However, external changes (config changes, channel updates) need a mechanism to notify the agent. Options:
257
+ - Push a system message into the queue: `{ role: 'user', content: '[System] Channel config updated: ...' }`
258
+ - End and restart the conversation (heavyweight, loses context)
259
+ - Accept that external config changes take effect on next conversation
260
+
261
+ ### Channel Manager (WhatsApp Admin)
262
+
263
+ Same pattern — `handleAdminMessage` pushes to the shared conversation queue instead of starting a new query. Admin messages from WhatsApp and chat messages from the UI flow into the same conversation.
264
+
265
+ ### Scheduler (Pulse / Cron)
266
+
267
+ Pulse and cron messages can be pushed into the active conversation queue:
268
+
269
+ ```typescript
270
+ // Instead of triggerAgent() starting a new query:
271
+ const conv = conversations.get(activeConvId);
272
+ if (conv) {
273
+ pushMessage(activeConvId, '<PULSE/>');
274
+ } else {
275
+ // No active conversation — start one for the pulse
276
+ const conv = startConversation('pulse-' + Date.now(), opts);
277
+ pushMessage(conv.id, '<PULSE/>');
278
+ }
279
+ ```
280
+
281
+ This means pulse/cron actions happen in the same conversation context as the user's chat. The agent can reference recent conversation when deciding what to do on pulse.
282
+
283
+ ---
284
+
285
+ ## Migration Path
286
+
287
+ ### Phase 1: Ship v1 (current)
288
+ - maxTurns: 8 orchestrator with SDK native background agents
289
+ - Request-response model (one query per message)
290
+ - Works, but has known limitations around turn budget and blocking
291
+
292
+ ### Phase 2: Long-lived Query Refactor
293
+ 1. Implement `createAsyncQueue` utility
294
+ 2. Refactor `bloby-agent.ts` → conversation manager with `startConversation` / `pushMessage` / `endConversation`
295
+ 3. Refactor `index.ts` WebSocket handler to push to queue instead of starting new queries
296
+ 4. Refactor `manager.ts` admin handler same way
297
+ 5. Update backend restart logic (trigger on response/task-done, not query-end)
298
+ 6. Add session resume for supervisor restarts
299
+ 7. Refactor scheduler to push into active conversation
300
+
301
+ ### Phase 3: Enhancements
302
+ - Session persistence across restarts (resume from sessionId)
303
+ - Priority-based message injection (urgent interrupts vs normal flow)
304
+ - External config change notifications via system messages
305
+ - Conversation timeout / cleanup (end long-idle conversations)
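
For the conversation timeout item, one option is an idle tracker that is swept periodically. This is a sketch under assumptions: `IdleSweeper` is a hypothetical class, the idle threshold is a free parameter, and the function that ends a conversation is passed in rather than imported.

```typescript
// Hypothetical idle tracker; the end-conversation callback is injected.
class IdleSweeper {
  private lastActivity = new Map<string, number>();
  private readonly maxIdleMs: number;
  private readonly onIdle: (convId: string) => void;

  constructor(maxIdleMs: number, onIdle: (convId: string) => void) {
    this.maxIdleMs = maxIdleMs;
    this.onIdle = onIdle;
  }

  // Record activity for a conversation (call on every pushed message).
  touch(convId: string, now: number = Date.now()): void {
    this.lastActivity.set(convId, now);
  }

  // Ends every conversation idle for longer than maxIdleMs.
  // Returns the ids that were ended.
  sweep(now: number = Date.now()): string[] {
    const ended: string[] = [];
    this.lastActivity.forEach((last, convId) => {
      if (now - last > this.maxIdleMs) ended.push(convId);
    });
    for (const convId of ended) {
      this.lastActivity.delete(convId);
      this.onIdle(convId);
    }
    return ended;
  }
}
```

Passing `now` explicitly keeps the sweep deterministic in tests; in production it could be driven by a `setInterval` timer.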
306
+
307
+ ---
308
+
309
+ ## Files Affected
310
+
311
+ | File | Change |
312
+ |---|---|
313
+ | `supervisor/bloby-agent.ts` | Complete rewrite — conversation manager replacing per-message queries |
314
+ | `supervisor/index.ts` | WebSocket handler refactored to push/pull from conversation |
315
+ | `supervisor/channels/manager.ts` | Admin handler pushes to conversation queue |
316
+ | `supervisor/scheduler.ts` | Pulse/cron push into active conversation |
317
+ | `supervisor/agents/index.ts` | No change — agent definitions loaded once per conversation start |
318
+ | `supervisor/agents/prompts/*` | No change |
319
+ | `worker/prompts/bloby-system-prompt.txt` | Remove maxTurns-related language. The "How You Work" section can be simplified since the orchestrator can now use tools freely without turn budget concerns. |
320
+
321
+ ---
322
+
323
+ ## Open Questions
324
+
325
+ 1. **Conversation lifetime**: When does a long-lived query end? Options: explicit clear, timeout after N hours of inactivity, or truly infinite (until supervisor restart).
326
+
327
+ 2. **Multiple conversations**: The current system tracks one active conversation via `/api/context/current`. With long-lived queries, do we support multiple concurrent conversations? Or always one?
328
+
329
+ 3. **Token accumulation**: A long-lived query accumulates context over time. The SDK handles context window management (sliding window, summarization?), but we need to understand the cost implications.
330
+
331
+ 4. **WhatsApp admin + chat convergence**: If WhatsApp admin messages and chat UI messages both feed into the same conversation queue, the agent sees both sources seamlessly. But the response routing needs to know which channel to reply on.
332
+
333
+ 5. **Customer agents**: Business mode customer agents are already stateless (per-customer buffers). Should they also become long-lived? Or keep them request-response since customer conversations are shorter?
@@ -445,7 +445,7 @@ export class ChannelManager {
445
445
  { botName, humanName },
446
446
  recentMessages,
447
447
  undefined, // no supportPrompt
448
- 5, // maxTurns: orchestrator mode
448
+ 8, // maxTurns: orchestrator mode
449
449
  );
450
450
  }
451
451
 
@@ -33,12 +33,14 @@ export default function TypingIndicator({ text, toolName }: Props) {
33
33
  <span className="w-1.5 h-1.5 rounded-full bg-muted-foreground/60 animate-bounce" style={{ animationDelay: '150ms' }} />
34
34
  <span className="w-1.5 h-1.5 rounded-full bg-muted-foreground/60 animate-bounce" style={{ animationDelay: '300ms' }} />
35
35
  </span>
36
+ {/* Tool activity label — hidden for now (will be used for skill UI later)
36
37
  {toolName && (
37
38
  <div className="flex items-center gap-1.5 mt-1 text-xs text-muted-foreground">
38
39
  <div className="w-2.5 h-2.5 border-[1.5px] border-muted-foreground/30 border-t-primary rounded-full animate-spin" />
39
40
  <span>{toolLabel(toolName)}...</span>
40
41
  </div>
41
42
  )}
43
+ */}
42
44
  </div>
43
45
  </div>
44
46
  );
@@ -59,12 +61,14 @@ export default function TypingIndicator({ text, toolName }: Props) {
59
61
  >
60
62
  {text}
61
63
  </Streamdown>
64
+ {/* Tool activity label — hidden for now (will be used for skill UI later)
62
65
  {toolName && (
63
66
  <div className="flex items-center gap-1.5 mt-1 text-xs text-muted-foreground">
64
67
  <div className="w-2.5 h-2.5 border-[1.5px] border-muted-foreground/30 border-t-primary rounded-full animate-spin" />
65
68
  <span>{toolLabel(toolName)}...</span>
66
69
  </div>
67
70
  )}
71
+ */}
68
72
  </div>
69
73
  </div>
70
74
  );
@@ -1132,7 +1132,7 @@ ${!connected ? '<script>setTimeout(()=>location.reload(),4000)</script>' : ''}
1132
1132
  const waMirrorJid = waStatus?.connected ? waStatus.info?.phoneNumber : null;
1133
1133
  let waChunkBuf = '';
1134
1134
 
1135
- // Start orchestrator query (maxTurns: 5 - fast, delegates heavy work)
1135
+ // Start orchestrator query (maxTurns: 8 - quick tasks direct, coding delegated)
1136
1136
  log.info(`[orchestrator] ──── USER MESSAGE ────`);
1137
1137
  log.info(`[orchestrator] Content: "${content.slice(0, 100)}..."`);
1138
1138
  log.info(`[orchestrator] Model: ${freshConfig.ai.model}`);
@@ -1203,7 +1203,7 @@ ${!connected ? '<script>setTimeout(()=>location.reload(),4000)</script>' : ''}
1203
1203
  broadcastBloby(type, eventData);
1204
1204
  }, data.attachments, savedFiles, { botName, humanName }, recentMessages,
1205
1205
  undefined, // no supportPrompt
1206
- 5, // maxTurns: orchestrator mode
1206
+ 8, // maxTurns: orchestrator mode
1207
1207
  );
1208
1208
  })();
1209
1209
  return;