ltcai 1.7.0 → 2.1.0

@@ -0,0 +1,423 @@
1
+ # Lattice AI Realtime Collaboration
2
+
3
+ Realtime Collaboration is the subsystem that gives a Lattice AI workspace a live
4
+ **presence** registry and an **activity feed**. In v2.1.0 it also carries
5
+ workspace-scoped execution observability for agents, handoffs, reviews,
6
+ workflows, plugins, retries, and failures. It is delivered over Server-Sent
7
+ Events (SSE) by an in-process pub/sub bus, the
8
+ [`RealtimeBus`](../latticeai/core/realtime.py).
9
+
10
+ The design goal is to surface "what is happening in the workspace right now"
11
+ (workspaces created, graphs indexed, agents and workflows run, plugins enabled,
12
+ who is online) without adding a new transport, a new dependency, or a second
13
+ event system.
14
+
15
+ The bus version is:
16
+
17
+ ```python
18
+ REALTIME_VERSION = "2.1.0"
19
+ ```
20
+
21
+ v2.1 execution event types include `agent_started`, `handoff_created`,
22
+ `handoff_accepted`, `handoff_completed`, `review_requested`, `review_approved`,
23
+ `retry_requested`, `workflow_started`, `workflow_completed`, `plugin_started`,
24
+ `plugin_completed`, `execution_failed`, and `execution_cancelled`.
25
+
26
+ ---
27
+
28
+ ## Why SSE
29
+
30
+ SSE was chosen deliberately rather than WebSockets:
31
+
32
+ - The codebase **already streams model output over SSE**
33
+ (`latticeai.services.model_runtime.sse_event`), so the wire format and the
34
+ client patterns are familiar.
35
+ - SSE needs **no extra dependency**: it is plain `text/event-stream` served by
36
+   the existing HTTP stack.
37
+ - It runs on the single HTTP port the local-first server already exposes, with
38
+   no extra ports and no protocol-upgrade handshake.
39
+
40
+ The bus is **in-process** (one server, local-first) and fans events out to
41
+ in-memory subscriber queues.
42
+
43
+ > **Compatibility.** This subsystem is purely additive. It introduces new
44
+ > `/realtime/*` endpoints and an `/activity` page, and it attaches to the
45
+ > existing `WorkspaceOSStore` through an optional `event_sink` hook. No v1.x
46
+ > data shape, API, or behavior changes. With zero subscribers the bus is a
47
+ > no-op, so single-user local mode behaves exactly as before.
48
+
49
+ ---
50
+
51
+ ## Architecture at a glance
52
+
53
+ ```
54
+ record_timeline_event(...)   (any workspace/graph/agent/workflow write)
55
+             │
56
+             ▼
57
+ WorkspaceOSStore.event_sink ──► RealtimeBus.publish(event)
58
+                                             │
59
+                     ┌───────────────────────┼──────────────────────┐
60
+                     ▼                       ▼                      ▼
61
+             ring-buffer feed   matching subscriber queues  presence registry
62
+               (capped, 200)      (bounded, drop-oldest)
63
+                                             │
64
+                                             ▼
65
+                             GET /realtime/stream (SSE frames)
66
+ ```
67
+
68
+ The bus is created once and wired into the store in `server_app.py`:
69
+
70
+ ```python
71
+ REALTIME_BUS = RealtimeBus()
72
+ WORKSPACE_OS = WorkspaceOSStore(DATA_DIR, event_sink=REALTIME_BUS)
73
+ ```
74
+
75
+ ### The key integration: a single `event_sink`
76
+
77
+ `WorkspaceOSStore` exposes exactly one realtime hook. Its
78
+ `record_timeline_event` method already runs on **every** meaningful workspace
79
+ write, and it ends by firing the sink:
80
+
81
+ ```python
82
+ def record_timeline_event(self, area, event_type, payload, workspace_id=None):
83
+     state = self.load_state()
84
+     event = {
85
+         "id": f"timeline-{...}",
86
+         "area": area,
87
+         "event_type": event_type,
88
+         "timestamp": _now(),
89
+         "workspace_id": self._resolve_scope(workspace_id, state),
90
+         "payload": payload,
91
+     }
92
+     # ... persist to the timeline ...
93
+     if self.event_sink is not None:
94
+         try:
95
+             self.event_sink(event)
96
+         except Exception:
97
+             # Realtime delivery is best-effort and must never break a write.
98
+             pass
99
+     return event
100
+ ```
101
+
102
+ Because the store calls `event_sink(event)` positionally and `RealtimeBus`
103
+ implements `__call__` as an alias for `publish`, wiring the bus as the sink
104
+ makes **all** workspace / graph / agent / workflow / memory / skill / plugin
105
+ activity flow into the realtime feed automatically. There is **no per-call
106
+ instrumentation** to maintain and no duplicated event system — anything that
107
+ already records a timeline event is realtime by construction.
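
The wiring described above can be sketched in a few lines. This is an illustration only, not the actual `RealtimeBus` or `WorkspaceOSStore`: `SketchBus` and `SketchStore` are hypothetical stand-ins, and only the `__call__`-as-`publish` alias and the best-effort `try/except` come from the text.

```python
from typing import Any, Callable, Dict, List, Optional


class SketchBus:
    """Hypothetical stand-in for RealtimeBus: it only collects events."""

    def __init__(self) -> None:
        self.events: List[Dict[str, Any]] = []

    def publish(self, event: Dict[str, Any]) -> Dict[str, Any]:
        self.events.append(event)
        return event

    # Alias so the bus instance itself can be passed as ``event_sink``.
    __call__ = publish


class SketchStore:
    """Hypothetical stand-in for WorkspaceOSStore with one optional sink."""

    def __init__(
        self, event_sink: Optional[Callable[[Dict[str, Any]], Any]] = None
    ) -> None:
        self.event_sink = event_sink
        self.timeline: List[Dict[str, Any]] = []

    def record_timeline_event(
        self, area: str, event_type: str, payload: Dict[str, Any]
    ) -> Dict[str, Any]:
        event = {"area": area, "event_type": event_type, "payload": payload}
        self.timeline.append(event)  # the durable write happens first
        if self.event_sink is not None:
            try:
                self.event_sink(event)  # positional call, as in the store
            except Exception:
                pass  # delivery is best-effort; the write already succeeded
        return event
```

Passing `SketchBus()` as `event_sink` routes every recorded event to `publish`, and a sink that raises still leaves the timeline write intact.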
108
+
109
+ Representative `area` / `event_type` pairs already emitted by the store
110
+ include:
111
+
112
+ | `area` | example `event_type` |
113
+ |-------------|---------------------------------------------------|
114
+ | `workspace` | `workspace_created`, `member_added`, `workspace_archived` |
115
+ | `graph` | `answer_trace`, `indexing_paused`, `indexing_resumed` |
116
+ | `agent` | `agent_run` |
117
+ | `workflow` | `workflow_created`, `workflow_run`, `workflow_edited` |
118
+ | `memory` | `memory_upserted`, `memory_deleted` |
119
+ | `skills` | `skill_installed`, `skill_enabled` |
120
+ | `plugins` | `plugin_installed`, `plugin_enabled` |
121
+ | `presence` | `join`, `leave` (emitted by the bus itself) |
122
+
123
+ ---
124
+
125
+ ## `publish()` — the core contract
126
+
127
+ ```python
128
+ def publish(self, event: Dict[str, Any]) -> Dict[str, Any]: ...
129
+ ```
130
+
131
+ `publish` is the heart of the bus and is built so it is always safe to call
132
+ from the store's synchronous write path:
133
+
134
+ - **Sync-callable.** No `await`, callable from any synchronous code.
135
+ - **Never raises.** Queue overflow and other failures are swallowed; the worst
136
+ case is a dropped frame, never a broken write. (The store also wraps the call
137
+ in its own `try/except` as a second layer.)
138
+ - **Never blocks.** Subscriber queues are bounded; the publisher never waits on
139
+ a slow or disconnected consumer.
140
+
141
+ On each call it:
142
+
143
+ 1. Assigns a monotonically increasing `seq` and a `received_at` timestamp, and
144
+ normalizes the event into a stable enriched shape.
145
+ 2. Appends the enriched event to a **capped ring-buffer feed** (`_FEED_LIMIT =
146
+ 200`); older events fall off the front.
147
+ 3. Fans the event out to every subscriber whose scope **accepts** the event's
148
+ `workspace_id` (see [Workspace isolation](#workspace-isolation)).
149
+
150
+ ### Enriched event shape
151
+
152
+ ```json
153
+ {
154
+   "seq": 42,
154
+   "received_at": "2026-06-01T10:15:30",
155
+   "area": "workflow",
156
+   "event_type": "workflow_run",
157
+   "workspace_id": "ws_marketing",
158
+   "payload": { "run_id": "wf-run-7", "workflow_id": "wf_3", "status": "ok" },
159
+   "id": "timeline-9f1c...",
160
+   "timestamp": "2026-06-01T10:15:30"
162
+ }
163
+ ```
164
+
165
+ `area`, `event_type`, `workspace_id`, and `payload` are always present
166
+ (defaulting to `"workspace"`, `"event"`, `None`, and `{}` respectively). Any
167
+ extra keys on the source event (such as the store's `id` and `timestamp`) are
168
+ preserved alongside them.
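
Steps 1 and 2 of `publish`, together with the defaults of the enriched shape, can be sketched as below. `SketchFeed` is a hypothetical simplification: it leaves out subscriber fan-out and any locking, and using `collections.deque(maxlen=...)` for the ring buffer is an assumption rather than a statement about the real implementation.

```python
import itertools
from collections import deque
from datetime import datetime
from typing import Any, Deque, Dict, List

FEED_LIMIT = 200  # mirrors _FEED_LIMIT from the text


class SketchFeed:
    """Hypothetical publisher covering sequencing, defaults, and the feed."""

    def __init__(self) -> None:
        self._seq = itertools.count(1)
        self._feed: Deque[Dict[str, Any]] = deque(maxlen=FEED_LIMIT)

    def publish(self, event: Dict[str, Any]) -> Dict[str, Any]:
        enriched = dict(event)  # keep extra keys such as id / timestamp
        enriched.setdefault("area", "workspace")
        enriched.setdefault("event_type", "event")
        enriched.setdefault("workspace_id", None)
        enriched.setdefault("payload", {})
        enriched["seq"] = next(self._seq)
        enriched["received_at"] = datetime.now().isoformat(timespec="seconds")
        self._feed.append(enriched)  # a full deque drops its oldest entry
        return enriched

    def recent(self, limit: int = 50) -> List[Dict[str, Any]]:
        limit = max(0, min(limit, FEED_LIMIT))  # clamp to the buffer size
        return list(self._feed)[::-1][:limit]  # newest first
```

After 251 publishes, for example, the feed holds only `seq` 52 through 251, and `recent(500)` returns those 200 entries newest-first.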
169
+
170
+ ### Backpressure: bounded queues drop the oldest
171
+
172
+ Each subscriber has an `asyncio.Queue(maxsize=100)`. On overflow the publisher
173
+ does **not** block — it discards the oldest queued event to make room for the
174
+ newest:
175
+
176
+ ```python
177
+ try:
178
+     sub.queue.put_nowait(enriched)
179
+ except asyncio.QueueFull:
180
+     try:
181
+         sub.queue.get_nowait()  # drop oldest
182
+         sub.queue.put_nowait(enriched)
183
+     except Exception:
184
+         pass
185
+ ```
186
+
187
+ A slow client therefore sees gaps rather than stalling the whole server.
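
To make the "gaps rather than stalls" behavior concrete, here is a small self-contained demo of the same drop-oldest pattern. The helper name `offer` and the tiny queue size are illustrative assumptions; the real bus uses `maxsize=100`.

```python
import asyncio
from typing import Any, Dict, List


def offer(queue: "asyncio.Queue[Dict[str, Any]]", event: Dict[str, Any]) -> None:
    """Enqueue without blocking; on overflow, evict the oldest queued event."""
    try:
        queue.put_nowait(event)
    except asyncio.QueueFull:
        try:
            queue.get_nowait()  # drop oldest
            queue.put_nowait(event)
        except Exception:
            pass  # worst case: this frame is dropped


async def demo(maxsize: int = 3, published: int = 5) -> List[int]:
    """Publish to a queue that nobody reads, then report what is left."""
    queue: "asyncio.Queue[Dict[str, Any]]" = asyncio.Queue(maxsize=maxsize)
    for seq in range(1, published + 1):
        offer(queue, {"seq": seq})  # the consumer never reads in between
    return [queue.get_nowait()["seq"] for _ in range(queue.qsize())]


remaining = asyncio.run(demo())
print(remaining)  # [3, 4, 5]: events 1 and 2 were evicted
```

A client can detect such a gap by watching for jumps in `seq`.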
188
+
189
+ ---
190
+
191
+ ## Workspace isolation
192
+
193
+ Every event carries a `workspace_id`. A subscriber is created with an allowed
194
+ **workspace scope** — a `Set[str]` of workspace IDs the caller may see — and
195
+ only receives events that its scope accepts:
196
+
197
+ ```python
198
+ def accepts(self, workspace_id: Optional[str]) -> bool:
199
+     # ``None`` scope = see everything the local user can (personal/unscoped).
200
+     if self.workspace_scope is None:
201
+         return True
202
+     if workspace_id is None:
203
+         return True
204
+     return workspace_id in self.workspace_scope
205
+ ```
206
+
207
+ Two rules fall out of this:
208
+
209
+ - **Unscoped events are always delivered.** An event whose `workspace_id` is
210
+   `None` reaches every subscriber, and a subscriber whose scope is `None`
211
+   receives every event. This is what makes **single-user local mode** work
212
+   with no scope restriction: nothing is filtered and everything is visible.
213
+ - **Scoped events are filtered.** When a subscriber has a concrete scope set,
214
+ it only receives events whose `workspace_id` is in that set (plus the always-
215
+ delivered unscoped events).
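
The two rules can be checked with a tiny stand-in for the subscriber. `SketchSubscriber` is hypothetical and carries only the scope; the `accepts` logic is copied from the method shown above.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class SketchSubscriber:
    """Hypothetical subscriber holding only the scope used by ``accepts``."""

    workspace_scope: Optional[Set[str]] = None

    def accepts(self, workspace_id: Optional[str]) -> bool:
        if self.workspace_scope is None:  # unscoped subscriber sees everything
            return True
        if workspace_id is None:  # unscoped event reaches everyone
            return True
        return workspace_id in self.workspace_scope


local_user = SketchSubscriber()  # single-user local mode
marketing = SketchSubscriber({"ws_marketing"})

print(local_user.accepts("ws_sales"))     # True
print(marketing.accepts("ws_marketing"))  # True
print(marketing.accepts(None))            # True
print(marketing.accepts("ws_sales"))      # False
```

Note that an empty set is not the same as `None`: a subscriber scoped to `set()` still receives unscoped events but no workspace-scoped ones.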
216
+
217
+ The scope is resolved per request, not hard-coded. The API layer calls
218
+ `PlatformRuntime.allowed_scopes`, which derives the set from the workspaces the
219
+ user can actually list:
220
+
221
+ ```python
222
+ def allowed_scopes(self, user: Optional[str]) -> Optional[Set[str]]:
223
+     try:
224
+         workspaces = self.svc.list_workspaces(user or None).get("workspaces", [])
225
+         return {ws.get("workspace_id") for ws in workspaces if ws.get("workspace_id")}
226
+     except Exception:
227
+         return None
228
+ ```
229
+
230
+ If scope resolution fails for any reason it returns `None` (the permissive,
231
+ local-friendly default) rather than erroring the stream. The feed
232
+ (`recent`) and presence (`presence`) reads apply the same scope filter, so a
233
+ caller with a resolved scope cannot read across workspaces it is not entitled to.
234
+
235
+ ---
236
+
237
+ ## `stream()` — replay tail, live frames, heartbeats
238
+
239
+ ```python
240
+ async def stream(self, sub: _Subscriber, *, heartbeat: float = 15.0) -> AsyncIterator[str]: ...
241
+ ```
242
+
243
+ When a client connects, `stream` first **replays a short tail** (up to the 10
244
+ most recent in-scope events) so a fresh subscriber immediately has context,
245
+ then yields **live frames** as they arrive. If no event arrives within the
246
+ `heartbeat` interval (default 15 seconds) it emits an SSE comment:
247
+
248
+ ```
249
+ : heartbeat
250
+ ```
251
+
252
+ The heartbeat keeps proxies from closing an idle connection and stops
253
+ single-user local mode from looking "stuck" when nothing is happening. When the
254
+ async generator is closed (client disconnect), the subscriber is removed in a
255
+ `finally` block, so queues do not leak.
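
The replay, live, heartbeat, and cleanup phases can be sketched as a single async generator. This is a hedged approximation, not the real `stream`: `sketch_stream`, its `tail` and `subscribers` parameters, and the use of `asyncio.wait_for` for the heartbeat timeout are all assumptions made for illustration.

```python
import asyncio
import json
from typing import Any, AsyncIterator, Dict, List, Set


async def sketch_stream(
    queue: "asyncio.Queue[Dict[str, Any]]",
    tail: List[Dict[str, Any]],
    subscribers: Set[Any],
    *,
    heartbeat: float = 15.0,
) -> AsyncIterator[str]:
    """Hypothetical stream loop: replay tail, then live frames or heartbeats."""
    subscribers.add(queue)
    try:
        for event in tail[-10:]:  # short replay tail for fresh context
            yield f"data: {json.dumps(event)}\n\n"
        while True:
            try:
                event = await asyncio.wait_for(queue.get(), timeout=heartbeat)
            except asyncio.TimeoutError:
                yield ": heartbeat\n\n"  # SSE comment line
                continue
            yield f"data: {json.dumps(event)}\n\n"
    finally:
        subscribers.discard(queue)  # runs when the generator is closed


async def demo() -> List[str]:
    queue: "asyncio.Queue[Dict[str, Any]]" = asyncio.Queue(maxsize=100)
    subscribers: Set[Any] = set()
    gen = sketch_stream(queue, [{"seq": 1}], subscribers, heartbeat=0.05)
    frames = [await gen.__anext__()]      # replayed tail
    queue.put_nowait({"seq": 2})
    frames.append(await gen.__anext__())  # live frame
    frames.append(await gen.__anext__())  # idle, so a heartbeat
    await gen.aclose()                    # simulates a client disconnect
    frames.append(f"subscribers={len(subscribers)}")
    return frames


frames = asyncio.run(demo())
```

In this run `frames` holds the replayed event, the live event, one heartbeat comment, and finally `subscribers=0`, which shows that closing the generator removed the subscriber.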
256
+
257
+ Each event is encoded with `sse_format`:
258
+
259
+ ```python
260
+ def sse_format(event: Dict[str, Any]) -> str:
261
+     """Encode an event as an SSE ``data:`` frame."""
262
+     return f"data: {json.dumps(event, ensure_ascii=False, default=str)}\n\n"
263
+ ```
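
As a quick usage example, the two `json.dumps` options matter for payloads that are not plain ASCII JSON: `ensure_ascii=False` keeps non-ASCII text readable on the wire, and `default=str` stringifies values such as `datetime` instead of raising. The sample event below is made up for illustration.

```python
import json
from datetime import datetime
from typing import Any, Dict


def sse_format(event: Dict[str, Any]) -> str:
    """Encode an event as an SSE ``data:`` frame."""
    return f"data: {json.dumps(event, ensure_ascii=False, default=str)}\n\n"


frame = sse_format(
    {"seq": 7, "payload": {"title": "café", "at": datetime(2026, 6, 1, 10, 15, 30)}}
)
print(frame, end="")
# data: {"seq": 7, "payload": {"title": "café", "at": "2026-06-01 10:15:30"}}
```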
264
+
265
+ ---
266
+
267
+ ## API reference
268
+
269
+ All endpoints require an authenticated user via `require_user`. In local mode
270
+ with auth disabled, `require_user` returns an empty string and the request
271
+ proceeds; the resolved scope is then `None` (see everything). The realtime
272
+ router is mounted in `server_app.py` with the live bus, the auth helpers, and
273
+ `PlatformRuntime.allowed_scopes` as the scope resolver.
274
+
275
+ ### `GET /activity`
276
+
277
+ Serves the Activity UI page (`static/activity.html`). Returns `404` if the UI
278
+ file or static directory is not available.
279
+
280
+ ### `GET /realtime/stream`
281
+
282
+ The SSE stream. Resolves the caller's allowed scope, registers a new subscriber
283
+ with a random ID, and returns a `StreamingResponse` of
284
+ `media_type="text/event-stream"`. The response sets streaming-friendly headers:
285
+
286
+ ```
287
+ Cache-Control: no-cache
288
+ X-Accel-Buffering: no
289
+ Connection: keep-alive
290
+ ```
291
+
292
+ The server stops generating frames as soon as `request.is_disconnected()` is
293
+ true.
294
+
295
+ ### `GET /realtime/feed`
296
+
297
+ Reads the recent activity feed (scope-filtered). Query parameter `limit`
298
+ defaults to `50` and is clamped to the `200`-entry buffer. Returns newest-first:
299
+
300
+ ```json
301
+ {
302
+   "events": [ /* enriched events, newest first */ ],
303
+   "stats": {
304
+     "version": "2.1.0",
305
+     "subscribers": 1,
306
+     "presence": 2,
307
+     "feed_size": 17,
308
+     "transport": "sse"
309
+   }
310
+ }
311
+ ```
312
+
313
+ ### `GET /realtime/presence`
314
+
315
+ Returns the scope-filtered presence registry plus the same `stats` block:
316
+
317
+ ```json
318
+ {
319
+   "presence": [
320
+     {
321
+       "client_id": "Hk3p...",
322
+       "user": "rnlgnquvk@gmail.com",
323
+       "workspace_id": "ws_marketing",
324
+       "joined_at": "2026-06-01T10:14:00",
325
+       "last_seen": "2026-06-01T10:15:30"
326
+     }
327
+   ],
328
+   "stats": { "version": "2.1.0", "subscribers": 1, "presence": 1, "feed_size": 17, "transport": "sse" }
329
+ }
330
+ ```
331
+
332
+ ### `POST /realtime/presence/join`
333
+
334
+ Registers a client as present. Request body:
335
+
336
+ ```json
337
+ {
338
+   "client_id": "optional-client-id",
339
+   "workspace_id": "ws_marketing"
340
+ }
341
+ ```
342
+
343
+ Both fields are optional. If `client_id` is omitted, the server generates one.
344
+ A `presence`/`join` event is published to subscribers in scope. Response:
345
+
346
+ ```json
347
+ {
348
+   "presence": {
349
+     "client_id": "Hk3p...",
350
+     "user": "rnlgnquvk@gmail.com",
351
+     "workspace_id": "ws_marketing",
352
+     "joined_at": "2026-06-01T10:14:00",
353
+     "last_seen": "2026-06-01T10:14:00"
354
+   }
355
+ }
356
+ ```
357
+
358
+ ### `POST /realtime/presence/leave`
359
+
360
+ Removes a client from the presence registry (publishing a `presence`/`leave`
361
+ event) when a `client_id` is supplied. The request body uses the same
362
+ `PresenceRequest` shape as join; only `client_id` is read. Response:
363
+
364
+ ```json
365
+ { "status": "ok" }
366
+ ```
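
The join and leave semantics can be approximated with a small in-memory registry. `SketchPresence` is hypothetical: it omits publishing the `presence` events and the scope filtering, and generating the ID with `secrets.token_urlsafe` is an assumption about how a server-side `client_id` might be produced.

```python
import secrets
from datetime import datetime
from typing import Any, Dict, List, Optional


class SketchPresence:
    """Hypothetical in-memory presence registry keyed by client_id."""

    def __init__(self) -> None:
        self._clients: Dict[str, Dict[str, Any]] = {}

    def join(
        self,
        user: str,
        client_id: Optional[str] = None,
        workspace_id: Optional[str] = None,
    ) -> Dict[str, Any]:
        now = datetime.now().isoformat(timespec="seconds")
        cid = client_id or secrets.token_urlsafe(8)  # generated if omitted
        entry = {
            "client_id": cid,
            "user": user,
            "workspace_id": workspace_id,
            "joined_at": now,
            "last_seen": now,
        }
        self._clients[cid] = entry
        return {"presence": entry}

    def leave(self, client_id: Optional[str] = None) -> Dict[str, str]:
        if client_id:
            self._clients.pop(client_id, None)  # unknown ids are ignored
        return {"status": "ok"}

    def snapshot(self) -> List[Dict[str, Any]]:
        return list(self._clients.values())
```

In this sketch `leave` answers `{"status": "ok"}` whether or not a `client_id` was supplied, which matches the response shape documented above.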
367
+
368
+ ---
369
+
370
+ ## Client example (`EventSource`)
371
+
372
+ The stream is standard SSE, so a browser can consume it with the built-in
373
+ `EventSource`. Join presence first, then subscribe to the feed:
374
+
375
+ ```javascript
376
+ // 1. Announce presence (optional but enables the presence registry).
377
+ const clientId = crypto.randomUUID();
378
+ await fetch("/realtime/presence/join", {
379
+   method: "POST",
380
+   headers: { "Content-Type": "application/json" },
381
+   body: JSON.stringify({ client_id: clientId, workspace_id: "ws_marketing" }),
382
+ });
383
+
384
+ // 2. Subscribe to the live activity stream.
385
+ const source = new EventSource("/realtime/stream");
386
+
387
+ source.onmessage = (e) => {
388
+   const event = JSON.parse(e.data);
389
+   console.log(`[${event.seq}] ${event.area}/${event.event_type}`, event.payload);
390
+   // e.g. render into an activity panel...
391
+ };
392
+
393
+ source.onerror = () => {
394
+   // EventSource auto-reconnects; on the next connect the server replays
395
+   // a short tail so missed-while-offline context is restored.
396
+ };
397
+
398
+ // 3. On unload, leave presence.
399
+ window.addEventListener("beforeunload", () => {
400
+   navigator.sendBeacon(
401
+     "/realtime/presence/leave",
402
+     new Blob([JSON.stringify({ client_id: clientId })], { type: "application/json" }),
403
+   );
404
+ });
405
+ ```
406
+
407
+ Heartbeat lines (`: heartbeat`) are SSE comments and never fire `onmessage`, so
408
+ no client-side filtering is needed.
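
Non-browser clients have to apply the same rule themselves. The following is a minimal sketch of an SSE line parser in Python that skips comment lines and decodes `data:` frames; `iter_sse_events` is a hypothetical helper, and it handles only the `data` field, which is all this stream is documented to send.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def iter_sse_events(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield decoded JSON payloads from SSE lines, skipping comment lines."""
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):  # comment, e.g. ": heartbeat"
            continue
        if line == "":  # a blank line dispatches the buffered event
            if data:
                yield json.loads("\n".join(data))
                data = []
            continue
        field, _, value = line.partition(":")
        if field == "data":
            # Per the SSE format, one leading space after the colon is dropped.
            data.append(value[1:] if value.startswith(" ") else value)


wire = (
    'data: {"seq": 1, "area": "agent"}\n\n'
    ": heartbeat\n\n"
    'data: {"seq": 2, "area": "graph"}\n\n'
)
events = list(iter_sse_events(wire.splitlines()))
print([e["seq"] for e in events])  # [1, 2]
```

The heartbeat never reaches the caller, which mirrors how `EventSource` treats comment lines.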
409
+
410
+ ---
411
+
412
+ ## Operational notes
413
+
414
+ - **Limits.** Feed ring buffer: `200` events (`_FEED_LIMIT`). Per-subscriber
415
+ queue: `100` events (`_QUEUE_MAX`). Heartbeat interval: `15` seconds.
416
+ - **No persistence.** The feed, presence registry, and subscriber set live in
417
+ memory and reset on server restart. The durable record of activity remains
418
+ the store's own `timeline` (capped at 500 events) — realtime is a live view
419
+ on top of it, not a replacement.
420
+ - **Single process.** The bus is in-process by design for the local-first
421
+ deployment; it does not coordinate across multiple server processes.
422
+ - **`stats()`** reports `version` (`2.1.0`), live `subscribers`, `presence`
423
+ count, `feed_size`, and the `transport` (`"sse"`) for health/observability.