elasticdash-test 0.1.24 → 0.1.25-alpha-2

@@ -0,0 +1,291 @@
1
+ # Backend & Dashboard Alignment Plan: Full Rerun Support
2
+
3
+ ## Current State (SDK is ready)
4
+
5
+ The SDK now supports two rerun modes, both triggered through the existing `trigger` socket event:
6
+
7
+ **Step mode** (existing): Re-execute individual tool/AI calls in isolation, evaluate results.
8
+
9
+ **Workflow mode** (new): Replay the full workflow function with frozen events — all HTTP/DB/AI calls are mocked from the original trace's captured I/O.
10
+
11
+ ```
12
+ Backend emits 'trigger' via socket
13
+ → SDK checks trigger.mode
14
+ → mode='step' (default): executeTrigger() reruns individual steps
15
+ → mode='workflow': executeWorkflowTrigger() calls runWithInitializedHttpContext()
16
+ → SDK fetches GET /api/observability/run-configs/:runConfigId (frozen events + mocks)
17
+ → Workflow executes with all external calls replayed from frozen data
18
+ → Results sent back via POST /api/observability/triggers/:id/results
19
+ ```
20
+
21
+ ### SDK TriggerSignal (extended)
22
+
23
+ ```typescript
24
+ interface TriggerSignal {
25
+ triggerId: number
26
+ runCount: number
27
+ steps: TriggerStep[]
28
+ mode?: 'step' | 'workflow' // NEW — default 'step'
29
+ workflowName?: string // NEW — for workflow mode
30
+ workflowInput?: unknown // NEW — original input to replay
31
+ runConfigId?: string // NEW — for fetching frozen events
32
+ }
33
+ ```
34
+
35
+ ### SDK Socket Events
36
+
37
+ | Event | Direction | Purpose |
38
+ |-------|-----------|---------|
39
+ | `register` | SDK → BE | On connect: register sessionId, tools, workflows |
40
+ | `trigger` | BE → SDK | Trigger step or workflow rerun |
41
+ | `portal:task` | BE → SDK | Ad-hoc single step rerun |
42
+ | `portal:result` | SDK → BE | Portal task result |
43
+ | `workflow:rerun` | BE → SDK | Direct workflow rerun (alternative to trigger.mode='workflow') |
44
+ | `workflow:rerun:result` | SDK → BE | Workflow rerun result |
45
+
46
+ ---
47
+
48
+ ## Backend Changes Required
49
+
50
+ ### 1. Include `http` and `db` in step selection
51
+
52
+ **File:** `controller/observability/observabilityController.js` → `selectStepsForTrigger()`
53
+
54
+ **Current:** Only selects `event_type IN ('ai', 'tool')`
55
+
56
+ **Change:** Include `http` and `db` so these events appear in frozen event sets.
57
+
58
+ ```sql
59
+ -- Updated query
60
+ WHERE event_type IN ('ai', 'tool', 'http', 'db')
61
+ ```
62
+
63
+ Also update `TriggerStep` event type validation in `SamplingConfigs` to accept `http` and `db`.
64
+
65
+ **Effort:** Small (1 query + validation change)
66
+
67
+ ---
68
+
69
+ ### 2. Session-scoped socket rooms
70
+
71
+ **File:** `index.js` → socket auth middleware
72
+
73
+ **Current:** SDK sockets join `observability:project:${projectId}`. Triggers broadcast to all SDKs in the project.
74
+
75
+ **Change:** Also join a session-scoped room so triggers can target a specific SDK instance:
76
+
77
+ ```javascript
78
+ // In socket auth middleware, after joining project room:
79
+ socket.join(`observability:session:${sessionId}`)
80
+
81
+ // When emitting triggers, target the specific session:
82
+ io.to(`observability:session:${sessionId}`).emit('trigger', triggerInfo)
83
+ ```
84
+
85
+ **Why:** Prevents duplicate execution when multiple SDK instances are connected.
86
+
87
+ **Effort:** Small (two one-line changes)
88
+
89
+ ---
90
+
91
+ ### 3. New table: `ObservabilityRunConfigs`
92
+
93
+ Storage for frozen events + mock configs that the SDK fetches during workflow reruns.
94
+
95
+ ```sql
96
+ CREATE TABLE ObservabilityRunConfigs (
97
+ id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
98
+ project_id INT NOT NULL REFERENCES Projects(id),
99
+ trigger_id INT REFERENCES Triggers(id),
100
+ trace_id VARCHAR(255) NOT NULL,
101
+ workflow_name VARCHAR(255),
102
+ workflow_input JSONB,
103
+ frozen_event_ids BIGINT[] DEFAULT '{}', -- empty = freeze all events in trace
104
+ prompt_mocks JSONB DEFAULT '{}',
105
+ tool_mock_config JSONB DEFAULT '{}',
106
+ ai_mock_config JSONB DEFAULT '{}',
107
+ user_prompt_mocks JSONB DEFAULT '{}',
108
+ created_at TIMESTAMPTZ DEFAULT NOW()
109
+ );
110
+
111
+ CREATE INDEX idx_run_configs_project ON ObservabilityRunConfigs(project_id);
112
+ ```
113
+
114
+ **Effort:** Small (1 migration)
115
+
116
+ ---
117
+
118
+ ### 4. New endpoint: `GET /api/observability/run-configs/:runConfigId`
119
+
120
+ The SDK's `runWithInitializedHttpContext()` already fetches this URL to get frozen events. It currently works only for test-mode dashboard runs, so an observability-mode implementation is needed.
121
+
122
+ **Logic:**
123
+
124
+ 1. Look up `ObservabilityRunConfigs` by ID
125
+ 2. Fetch events from `ObservabilityEvents` for the referenced `trace_id`
126
+ 3. If `frozen_event_ids` is empty → freeze all events (backend-triggered rerun)
127
+ 4. If `frozen_event_ids` is set → only freeze those specific events (selective rerun)
128
+ 5. Map events to the `WorkflowEvent` format the SDK expects
129
+
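Steps 3 and 4 of the logic above can be sketched as a small pure function. This is only an illustration: `selectFrozenEvents` is a hypothetical helper name, and the sketch assumes each event row exposes an `id` field as in the response example.

```javascript
// Hypothetical helper, not part of the existing backend.
// traceEvents: rows loaded from ObservabilityEvents for one trace_id.
// frozenEventIds: the run config's frozen_event_ids array.
function selectFrozenEvents(traceEvents, frozenEventIds) {
  // Step 3: an empty (or missing) list means "freeze every event in the trace".
  if (!Array.isArray(frozenEventIds) || frozenEventIds.length === 0) {
    return traceEvents.slice();
  }
  // Step 4: keep only the listed events. IDs are compared as strings
  // (assumption: the DB driver may return BIGINT values as strings).
  const wanted = new Set(frozenEventIds.map(String));
  return traceEvents.filter((evt) => wanted.has(String(evt.id)));
}

const sampleEvents = [
  { id: 1, type: 'ai' },
  { id: 2, type: 'http' },
  { id: 3, type: 'db' },
];
const allFrozen = selectFrozenEvents(sampleEvents, []);
const someFrozen = selectFrozenEvents(sampleEvents, ['2', 3]);
```

Keeping the selection separate from the query makes the "empty means all" rule easy to unit-test without a database.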
130
+ **Response:**
131
+
132
+ ```json
133
+ {
134
+ "frozenEvents": [
135
+ { "id": 1, "type": "ai", "name": "claude-sonnet-4-5-20250929", "input": {...}, "output": {...}, ... },
136
+ { "id": 2, "type": "http", "name": "fetch", "input": {...}, "output": {...}, ... },
137
+ { "id": 3, "type": "db", "name": "pg.query", "input": {...}, "output": {...}, ... }
138
+ ],
139
+ "promptMocks": {},
140
+ "toolMockConfig": {},
141
+ "aiMockConfig": {},
142
+ "userPromptMocks": {}
143
+ }
144
+ ```
145
+
146
+ **Effort:** Medium (new endpoint + query)
147
+
148
+ ---
149
+
150
+ ### 5. Extend trigger creation for workflow mode
151
+
152
+ **File:** `services/observability.js` → trigger creation logic, and `controller/observability/observabilityController.js` → `createTrigger()`
153
+
154
+ When a SamplingConfig has `mode: 'workflow'` (or when the dashboard requests a workflow rerun), the backend should:
155
+
156
+ 1. Create an `ObservabilityRunConfigs` row from the original trace
157
+ 2. Extract the workflow input from the trace's first `workflow` event (or the trigger config)
158
+ 3. Add `mode`, `workflowName`, `workflowInput`, and `runConfigId` to the trigger signal
159
+ 4. Emit the extended trigger via socket
160
+
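A minimal sketch of steps 2 and 3 above, under stated assumptions: `extractWorkflowInput` and `buildWorkflowTriggerSignal` are hypothetical names, trace events are assumed to carry `type` and `input` fields, and the trigger config is assumed to expose a `workflowInput` property as the fallback.

```javascript
// Hypothetical helpers for illustration only.

// Step 2: take the input of the first `workflow` event in the trace,
// falling back to the trigger config when the trace has none.
function extractWorkflowInput(traceEvents, triggerConfig = {}) {
  const first = traceEvents.find((evt) => evt.type === 'workflow');
  if (first && first.input !== undefined) return first.input;
  return triggerConfig.workflowInput; // assumed field name
}

// Step 3: extend the existing trigger payload with the workflow-mode fields.
function buildWorkflowTriggerSignal(base, workflowName, workflowInput, runConfigId) {
  return { ...base, mode: 'workflow', workflowName, workflowInput, runConfigId };
}

const traceEvents = [
  { type: 'workflow', name: 'chatStreamHandler', input: { messages: ['hi'] } },
  { type: 'ai', name: 'model-call', input: {} },
];
const signal = buildWorkflowTriggerSignal(
  { triggerId: 123, runCount: 1, steps: [] },
  'chatStreamHandler',
  extractWorkflowInput(traceEvents),
  'uuid-of-run-config'
);
```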
161
+ **Trigger signal emitted:**
162
+
163
+ ```json
164
+ {
165
+ "triggerId": 123,
166
+ "runCount": 1,
167
+ "mode": "workflow",
168
+ "workflowName": "chatStreamHandler",
169
+ "workflowInput": { "messages": [...] },
170
+ "runConfigId": "uuid-of-run-config",
171
+ "steps": [...]
172
+ }
173
+ ```
174
+
175
+ **Effort:** Medium (extends existing trigger flow)
176
+
177
+ ---
178
+
179
+ ### 6. Extend trigger results handling for workflow mode
180
+
181
+ **File:** `services/observability.js` → `POST /observability/triggers/:id/results`
182
+
183
+ The SDK now sends workflow-mode results with a different shape:
184
+
185
+ ```json
186
+ {
187
+ "mode": "workflow",
188
+ "workflowName": "chatStreamHandler",
189
+ "ok": true,
190
+ "output": { ... },
191
+ "durationMs": 1234
192
+ }
193
+ ```
194
+
195
+ The backend should detect `mode: 'workflow'` and handle it differently from step-mode results (which have a `steps[]` array). Store the workflow output, run evaluations against it, and mark the trigger as completed.
196
+
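The branching described above might look like the following sketch. `classifyTriggerResult` is a hypothetical name; the sketch assumes a body without `mode` is a step-mode result, which matches the `'step'` default on `TriggerSignal`.

```javascript
// Hypothetical helper: normalises a results body so the endpoint can branch on `kind`.
function classifyTriggerResult(body) {
  if (body && body.mode === 'workflow') {
    return {
      kind: 'workflow',
      workflowName: body.workflowName,
      ok: body.ok === true,
      output: body.output,
      durationMs: body.durationMs,
    };
  }
  // Assumption: anything else is a step-mode result carrying a steps[] array.
  const steps = body && Array.isArray(body.steps) ? body.steps : [];
  return { kind: 'step', steps };
}

const workflowResult = classifyTriggerResult({
  mode: 'workflow',
  workflowName: 'chatStreamHandler',
  ok: true,
  output: { text: 'done' },
  durationMs: 1234,
});
const stepResult = classifyTriggerResult({ steps: [{ name: 'search' }] });
```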
197
+ **Effort:** Small (conditional handling in existing endpoint)
198
+
199
+ ---
200
+
201
+ ## Dashboard Changes Required
202
+
203
+ ### 1. "Rerun Workflow" button on trace detail view
204
+
205
+ **File:** `src/components/observability/EventDetailPanel.tsx` (or trace view page)
206
+
207
+ Add a button that calls `POST /api/observability/workflow-reruns` (new backend endpoint, see below).
208
+
209
+ **Effort:** Small (1 button + API call)
210
+
211
+ ---
212
+
213
+ ### 2. New backend endpoint for dashboard-initiated reruns: `POST /api/observability/workflow-reruns`
214
+
215
+ **Request:**
216
+
217
+ ```json
218
+ {
219
+ "traceId": "chatStreamHandler::1712851200000::a1b2c3d4",
220
+ "workflowName": "chatStreamHandler",
221
+ "frozenEventIds": [],
222
+ "promptMocks": {},
223
+ "toolMockConfig": {},
224
+ "aiMockConfig": {}
225
+ }
226
+ ```
227
+
228
+ **Logic:**
229
+
230
+ 1. Fetch original trace events from `ObservabilityEvents`
231
+ 2. Find workflow input from the trace
232
+ 3. Create `ObservabilityRunConfigs` row
233
+ 4. Create `Triggers` row with `mode='workflow'`
234
+ 5. Emit `trigger` via socket to the session-scoped room
235
+ 6. Return `{ triggerId, runConfigId, status: 'sent' }`
236
+
237
+ **Effort:** Medium (new endpoint)
238
+
239
+ ---
240
+
241
+ ### 3. Socket listeners for real-time rerun status
242
+
243
+ **File:** `src/services/socketService.ts`
244
+
245
+ Listen for `workflow:rerun:result` relayed from the backend to update the dashboard UI when a rerun completes.
246
+
247
+ **Effort:** Small
248
+
249
+ ---
250
+
251
+ ### 4. "Rerun Step" button on event detail
252
+
253
+ **File:** `src/components/observability/EventDetailPanel.tsx`
254
+
255
+ Add a button per event that calls `POST /api/observability/portal-tasks` to rerun a single step. Backend emits `portal:task` via socket to the SDK.
256
+
257
+ **Effort:** Small (1 button + new backend endpoint)
258
+
259
+ ---
260
+
261
+ ## Implementation Priority
262
+
263
+ | Priority | System | Change | Effort |
264
+ |----------|--------|--------|--------|
265
+ | **P0** | Backend | Include http/db in step selection | Small |
266
+ | **P0** | Backend | Session-scoped socket rooms | Small |
267
+ | **P1** | Backend | `ObservabilityRunConfigs` table migration | Small |
268
+ | **P1** | Backend | `GET /api/observability/run-configs/:id` endpoint | Medium |
269
+ | **P1** | Backend | Extend trigger creation for workflow mode | Medium |
270
+ | **P1** | Backend | Extend trigger results for workflow mode | Small |
271
+ | **P1** | Backend | `POST /api/observability/workflow-reruns` endpoint | Medium |
272
+ | **P1** | Dashboard | "Rerun Workflow" button on trace view | Small |
273
+ | **P2** | Dashboard | Socket listeners for rerun status | Small |
274
+ | **P2** | Backend | `POST /api/observability/portal-tasks` endpoint | Medium |
275
+ | **P2** | Dashboard | "Rerun Step" button on event detail | Small |
276
+ | **P2** | Dashboard | Rerun progress monitor component | Medium |
277
+
278
+ ## SDK — Done (no further changes needed)
279
+
280
+ | Feature | Status |
281
+ |---------|--------|
282
+ | Frozen event replay for HTTP/DB/AI calls | Done |
283
+ | Auto-install interceptors on context init | Done |
284
+ | Socket connection with auto-reconnect | Done |
285
+ | `trigger` listener (step + workflow mode) | Done |
286
+ | `workflow:rerun` listener (direct mode) | Done |
287
+ | `portal:task` listener | Done |
288
+ | `TriggerSignal` extended with workflow fields | Done |
289
+ | `executeTrigger` handles workflow mode | Done |
290
+ | `startTrace()` for workflow attribution | Done |
291
+ | `wrapDB` / `interceptFetch` / `installDBAutoInterceptor` in all modes | Done |
@@ -0,0 +1,141 @@
1
+ # Backend Update: Parse Workflow Name from traceId
2
+
3
+ ## What Changed in the SDK
4
+
5
+ The SDK now generates structured `traceId` values in the format:
6
+
7
+ ```
8
+ {workflowName}::{timestamp}::{shortId}
9
+ ```
10
+
11
+ Examples:
12
+ ```
13
+ chatStreamHandler::1712851200000::a1b2c3d4
14
+ chatHandler::1712851234567::f7e8d9c0
15
+ unknown-workflow::1712851200000::b3c4d5e6 ← fallback when no workflow name is discovered
16
+ ```
17
+
18
+ - `workflowName` — the workflow function name from `ed_workflows.ts`, or `'unknown-workflow'` as fallback
19
+ - `timestamp` — Unix ms when the trace started
20
+ - `shortId` — first 8 chars of a UUID for uniqueness
21
+
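For reference, the format above can be reproduced with a few lines. This is a sketch, not the SDK's actual implementation: `buildTraceId` is a hypothetical name, and the UUID is passed in so the function stays deterministic.

```javascript
// Hypothetical helper that assembles {workflowName}::{timestamp}::{shortId}.
function buildTraceId(workflowName, startedAtMs, uuid) {
  const name = workflowName || 'unknown-workflow'; // documented fallback
  const shortId = uuid.slice(0, 8); // first 8 chars of the UUID
  return `${name}::${startedAtMs}::${shortId}`;
}

const exampleTraceId = buildTraceId(
  'chatStreamHandler',
  1712851200000,
  'a1b2c3d4-0000-4000-8000-000000000000'
);
const fallbackTraceId = buildTraceId(
  undefined,
  1712851200000,
  'b3c4d5e6-0000-4000-8000-000000000000'
);
```

In real use the caller would supply `Date.now()` and a freshly generated UUID (for example from Node's `crypto.randomUUID()`).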
22
+ Every event in a batch now has this `traceId` set (previously it was `undefined`).
23
+
24
+ ## What the Backend Needs to Change
25
+
26
+ ### 1. `upsertDiscoveredSteps()` — Parse workflow name from traceId
27
+
28
+ **Current** (in `observabilityController.js`, line ~610-615):
29
+ ```javascript
30
+ if (serviceId) {
31
+ const hash = sha256(serviceId);
32
+ seen.set(`workflow:${hash}`, {
33
+ kind: 'workflow',
34
+ name: serviceId, // ← Always "OPEN-REACT-TEMPLATE"
35
+ hash,
36
+ systemPrompt: null,
37
+ model: null,
38
+ count: events.length
39
+ });
40
+ }
41
+ ```
42
+
43
+ **New:**
44
+ ```javascript
45
+ // Extract unique workflow names from traceId fields in the batch
46
+ const workflowNames = new Set();
47
+ for (const evt of events) {
48
+ if (evt.traceId && typeof evt.traceId === 'string') {
49
+ const parts = evt.traceId.split('::');
50
+ if (parts.length >= 2) {
51
+ workflowNames.add(parts[0]);
52
+ }
53
+ }
54
+ }
55
+
56
+ for (const name of workflowNames) {
57
+ const hash = sha256(name);
58
+ // Count events belonging to this workflow
59
+ const count = events.filter(e => typeof e.traceId === 'string' && e.traceId.startsWith(name + '::')).length;
60
+ seen.set(`workflow:${hash}`, {
61
+ kind: 'workflow',
62
+ name,
63
+ hash,
64
+ systemPrompt: null,
65
+ model: null,
66
+ count: count || events.length
67
+ });
68
+ }
69
+ ```
70
+
71
+ This way, a single batch from a web server that handles multiple workflows (e.g. `chatHandler` and `chatStreamHandler`) correctly discovers both workflows.
72
+
73
+ ### 2. Event querying — Group by workflow from traceId
74
+
75
+ When querying events for a specific workflow (e.g. the `/steps/:stepKind/:stepName/calls` endpoint), filter by traceId prefix:
76
+
77
+ ```sql
78
+ SELECT * FROM ObservabilityEvents
79
+ WHERE project_id = $1
80
+ AND trace_id LIKE $2 || '::%'
81
+ ORDER BY timestamp DESC
82
+ LIMIT $3 OFFSET $4
83
+ ```
84
+
85
+ Where `$2` is the workflow name (e.g. `chatStreamHandler`).
86
+
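One caveat with the prefix filter: `%` and `_` are wildcards in a `LIKE` pattern, so a workflow name such as `chat_handler` would also match `chatXhandler::...`. A possible mitigation, assuming PostgreSQL with its default backslash escape character, is to escape the name in application code and pass the full pattern as the parameter (the query would then read `trace_id LIKE $2`). `workflowTracePattern` is a hypothetical helper name.

```javascript
// Hypothetical helper: builds an escaped LIKE pattern for a workflow-name prefix.
// Assumes the database uses backslash as the LIKE escape character.
function workflowTracePattern(workflowName) {
  // Escape backslash, percent and underscore so they match literally.
  const escaped = workflowName.replace(/[\\%_]/g, (ch) => `\\${ch}`);
  return `${escaped}::%`;
}

const plainPattern = workflowTracePattern('chatStreamHandler');
const escapedPattern = workflowTracePattern('chat_handler');
```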
87
+ ### 3. Health computation — Scope to workflow
88
+
89
+ The health endpoint (`/steps/:stepKind/:stepName/health`) for workflows should aggregate metrics from events whose `traceId` starts with that workflow name, not from all events in the service.
90
+
91
+ ### 4. Trigger step selection — Use traceId to scope
92
+
93
+ When `upsertDiscoveredSteps` processes tool/prompt steps, associate them with the workflow from their `traceId`:
94
+
95
+ ```javascript
96
+ for (const evt of events) {
97
+ const workflowName = typeof evt.traceId === 'string' && evt.traceId.includes('::') ? evt.traceId.split('::')[0] : 'unknown-workflow';
98
+
99
+ if (evt.type === 'tool' && evt.name) {
100
+ // Store tool with workflow association if needed
101
+ seen.set(`tool:${sha256(evt.name)}`, { kind: 'tool', name: evt.name, ... });
102
+ }
103
+ }
104
+ ```
105
+
106
+ ## traceId Parsing Helper
107
+
108
+ ```javascript
109
+ function parseTraceId(traceId) {
110
+ if (!traceId || typeof traceId !== 'string') return null;
111
+ const parts = traceId.split('::');
112
+ if (parts.length < 2) return null;
113
+ return {
114
+ workflowName: parts[0],
115
+ timestamp: parseInt(parts[1], 10), // parts[1] exists here: the length guard above returned otherwise
116
+ shortId: parts.length >= 3 ? parts[2] : null,
117
+ };
118
+ }
119
+ ```
120
+
121
+ ## Backward Compatibility
122
+
123
+ - Events without `traceId` (from older SDK versions) → treat as `'unknown-workflow'`
124
+ - Events with plain UUID `traceId` (no `::` separator) → treat as `'unknown-workflow'`
125
+ - Events with structured `traceId` → parse workflow name from prefix
126
+
127
+ The check is simple: a string `traceId` for which `traceId.includes('::')` is true is in the structured format; anything else uses the `'unknown-workflow'` fallback.
128
+
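The three cases above can be sketched as a single function. `resolveWorkflowName` is a hypothetical name, and treating an empty prefix (a `traceId` that starts with `::`) as the fallback is an assumption not stated in this document.

```javascript
// Hypothetical helper implementing the backward-compatibility rules.
function resolveWorkflowName(traceId) {
  // Older SDKs: traceId missing or not a string.
  if (typeof traceId !== 'string') return 'unknown-workflow';
  const sep = traceId.indexOf('::');
  // Plain UUID (no separator), or an empty prefix (assumption): fall back.
  if (sep <= 0) return 'unknown-workflow';
  // Structured format: the workflow name is everything before the first '::'.
  return traceId.slice(0, sep);
}

const fromMissing = resolveWorkflowName(undefined);
const fromPlainUuid = resolveWorkflowName('3f2b8c1e-9d4a-4c6b-8e2f-1a2b3c4d5e6f');
const fromStructured = resolveWorkflowName('chatStreamHandler::1712851200000::a1b2c3d4');
```

Routing every lookup through one function like this keeps step discovery, event querying, and health computation consistent about which events count as `'unknown-workflow'`.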
129
+ ## SDK-Side: How traceId Gets Set
130
+
131
+ The SDK discovers workflow names from `ed_workflows.ts` at init time. If exactly one workflow is exported, it is used as the default traceId prefix. Otherwise `'unknown-workflow'` is used until `startTrace()` is called.
132
+
133
+ | Scenario | traceId value |
134
+ |----------|---------------|
135
+ | `initObservability()` with single workflow in `ed_workflows.ts` | `{workflowName}::timestamp::shortId` |
136
+ | `initObservability()` with multiple/no workflows | `unknown-workflow::timestamp::shortId` |
137
+ | `startTrace('chatStreamHandler')` in route handler | `chatStreamHandler::timestamp::shortId` |
138
+ | `startTrace()` without workflow name | `{defaultWorkflowName}::timestamp::shortId` |
139
+ | No observability (test/dashboard mode) | Not set by this system |
140
+
141
+ `serviceId` is no longer used by the SDK. The API key identifies the project.