npm - elasticdash-test - Versions diffs - 0.1.24 → 0.1.25-alpha-2 - Mend

elasticdash-test 0.1.24 → 0.1.25-alpha-2


Files changed (22)

package/README.md +22 -0
package/dist/cli.js +53 -6
package/dist/cli.js.map +1 -1
package/docs/agent-integration-guide.md +557 -0
package/docs/agents.md +140 -0
package/docs/backend_rerun_alignment.md +291 -0
package/docs/backend_traceid_update.md +141 -0
package/docs/dashboard.md +394 -0
package/docs/deno.md +69 -0
package/docs/instrumentation.md +424 -0
package/docs/langfuse-trace-structure.md +145 -0
package/docs/matchers.md +173 -0
package/docs/observability_backend_contract.md +577 -0
package/docs/observability_mode.md +195 -0
package/docs/observability_rerun_backend_plan.md +596 -0
package/docs/quickstart.md +621 -0
package/docs/security-compliance.md +566 -0
package/docs/test-writing-guidelines.md +444 -0
package/docs/tools.md +165 -0
package/docs/workflow-modes.md +253 -0
package/package.json +2 -1
package/src/cli.ts +60 -7

package/docs/backend_rerun_alignment.md ADDED

@@ -0,0 +1,291 @@
+# Backend & Dashboard Alignment Plan: Full Rerun Support
+## Current State (SDK is ready)
+The SDK now supports two rerun modes, both triggered through the existing `trigger` socket event:
+**Step mode** (existing): Re-execute individual tool/AI calls in isolation, evaluate results.
+**Workflow mode** (new): Replay the full workflow function with frozen events — all HTTP/DB/AI calls are mocked from the original trace's captured I/O.
+```
+Backend emits 'trigger' via socket
+  → SDK checks trigger.mode
+  → mode='step' (default): executeTrigger() reruns individual steps
+  → mode='workflow': executeWorkflowTrigger() calls runWithInitializedHttpContext()
+       → SDK fetches GET /api/run-configs/:runConfigId (frozen events + mocks)
+       → Workflow executes with all external calls replayed from frozen data
+       → Results sent back via POST /api/observability/triggers/:id/results
+```
+### SDK TriggerSignal (extended)
+```typescript
+interface TriggerSignal {
+  triggerId: number
+  runCount: number
+  steps: TriggerStep[]
+  mode?: 'step' | 'workflow'         // NEW — default 'step'
+  workflowName?: string              // NEW — for workflow mode
+  workflowInput?: unknown            // NEW — original input to replay
+  runConfigId?: string               // NEW — for fetching frozen events
+}
+```
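The default-mode rule above can be sketched as a small helper. This is an illustration only: `resolveTriggerMode` and `validateTrigger` are hypothetical names, the `TriggerStep` shape is left open because this document does not define it, and treating `workflowName` and `runConfigId` as mandatory in workflow mode is an assumption rather than documented SDK behavior.

```typescript
// Sketch only: helper names are hypothetical, not part of the SDK.
type TriggerMode = 'step' | 'workflow'

interface TriggerStep {
  // Shape not given in this document; left open on purpose.
  [key: string]: unknown
}

interface TriggerSignal {
  triggerId: number
  runCount: number
  steps: TriggerStep[]
  mode?: TriggerMode
  workflowName?: string
  workflowInput?: unknown
  runConfigId?: string
}

// A missing mode means 'step', matching the documented default.
function resolveTriggerMode(signal: TriggerSignal): TriggerMode {
  return signal.mode ?? 'step'
}

// Returns the problems that would block a workflow rerun; empty means OK.
// Assumption: workflow mode needs both workflowName and runConfigId.
function validateTrigger(signal: TriggerSignal): string[] {
  const problems: string[] = []
  if (resolveTriggerMode(signal) === 'workflow') {
    if (!signal.workflowName) problems.push('workflowName is required in workflow mode')
    if (!signal.runConfigId) problems.push('runConfigId is required in workflow mode')
  }
  return problems
}
```

A trigger from an older backend that omits `mode` resolves to `'step'`, so existing step reruns keep working unchanged.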
+### SDK Socket Events
+| Event | Direction | Purpose |
+|-------|-----------|---------|
+| `register` | SDK → BE | On connect: register sessionId, tools, workflows |
+| `trigger` | BE → SDK | Trigger step or workflow rerun |
+| `portal:task` | BE → SDK | Ad-hoc single step rerun |
+| `portal:result` | SDK → BE | Portal task result |
+| `workflow:rerun` | BE → SDK | Direct workflow rerun (alternative to trigger.mode='workflow') |
+| `workflow:rerun:result` | SDK → BE | Workflow rerun result |
+---
+## Backend Changes Required
+### 1. Include `http` and `db` in step selection
+**File:** `controller/observability/observabilityController.js` → `selectStepsForTrigger()`
+**Current:** Only selects `event_type IN ('ai', 'tool')`
+**Change:** Include `http` and `db` so these events appear in frozen event sets.
+```sql
+-- Updated query
+WHERE event_type IN ('ai', 'tool', 'http', 'db')
+```
+Also update `TriggerStep` event type validation in `SamplingConfigs` to accept `http` and `db`.
+**Effort:** Small (1 query + validation change)
+---
+### 2. Session-scoped socket rooms
+**File:** `index.js` → socket auth middleware
+**Current:** SDK sockets join `observability:project:${projectId}`. Triggers broadcast to all SDKs in the project.
+**Change:** Also join a session-scoped room so triggers can target a specific SDK instance:
+```javascript
+// In socket auth middleware, after joining project room:
+socket.join(`observability:session:${sessionId}`)
+// When emitting triggers, target the specific session:
+io.to(`observability:session:${sessionId}`).emit('trigger', triggerInfo)
+```
+**Why:** Prevents duplicate execution when multiple SDK instances are connected.
+**Effort:** Small (two-line change)
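To illustrate why the session-scoped room avoids duplicate execution, here is a minimal in-memory model of room membership. It is not socket.io: `RoomRegistry` and the two room-name helpers are hypothetical, and the project id and session ids are made up for the example.

```typescript
// Room-name helpers mirroring the two template strings above.
function projectRoom(projectId: number): string {
  return `observability:project:${projectId}`
}

function sessionRoom(sessionId: string): string {
  return `observability:session:${sessionId}`
}

// Minimal in-memory stand-in for socket rooms (hypothetical, not socket.io).
class RoomRegistry {
  private rooms = new Map<string, Set<string>>()

  join(socketId: string, room: string): void {
    const members = this.rooms.get(room) ?? new Set<string>()
    members.add(socketId)
    this.rooms.set(room, members)
  }

  // Returns the socket ids that would receive an emit to this room.
  recipients(room: string): string[] {
    return Array.from(this.rooms.get(room) ?? new Set<string>())
  }
}

const registry = new RoomRegistry()
// Two SDK instances in the same project, each with its own session.
registry.join('sock-a', projectRoom(42))
registry.join('sock-a', sessionRoom('sess-a'))
registry.join('sock-b', projectRoom(42))
registry.join('sock-b', sessionRoom('sess-b'))

// Emitting to the project room reaches both instances;
// emitting to a session room reaches exactly one.
const projectRecipients = registry.recipients(projectRoom(42))
const sessionRecipients = registry.recipients(sessionRoom('sess-a'))
```

Here `projectRecipients` contains both sockets while `sessionRecipients` contains only `sock-a`, which is the behavior the change is after.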
+---
+### 3. New table: `ObservabilityRunConfigs`
+Storage for frozen events + mock configs that the SDK fetches during workflow reruns.
+```sql
+CREATE TABLE ObservabilityRunConfigs (
+    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+    project_id INT NOT NULL REFERENCES Projects(id),
+    trigger_id INT REFERENCES Triggers(id),
+    trace_id VARCHAR(255) NOT NULL,
+    workflow_name VARCHAR(255),
+    workflow_input JSONB,
+    frozen_event_ids BIGINT[] DEFAULT '{}',   -- empty = freeze all events in trace
+    prompt_mocks JSONB DEFAULT '{}',
+    tool_mock_config JSONB DEFAULT '{}',
+    ai_mock_config JSONB DEFAULT '{}',
+    user_prompt_mocks JSONB DEFAULT '{}',
+    created_at TIMESTAMPTZ DEFAULT NOW()
+);
+CREATE INDEX idx_run_configs_project ON ObservabilityRunConfigs(project_id);
+```
+**Effort:** Small (1 migration)
+---
+### 4. New endpoint: `GET /api/observability/run-configs/:runConfigId`
+The SDK's `runWithInitializedHttpContext()` already fetches this URL to get frozen events. It currently works only for test-mode dashboard runs and needs an observability-mode implementation.
+**Logic:**
+1. Look up `ObservabilityRunConfigs` by ID
+2. Fetch events from `ObservabilityEvents` for the referenced `trace_id`
+3. If `frozen_event_ids` is empty → freeze all events (backend-triggered rerun)
+4. If `frozen_event_ids` is set → only freeze those specific events (selective rerun)
+5. Map events to the `WorkflowEvent` format the SDK expects
+**Response:**
+```json
+{
+  "frozenEvents": [
+    { "id": 1, "type": "ai", "name": "claude-sonnet-4-5-20250929", "input": {...}, "output": {...}, ... },
+    { "id": 2, "type": "http", "name": "fetch", "input": {...}, "output": {...}, ... },
+    { "id": 3, "type": "db", "name": "pg.query", "input": {...}, "output": {...}, ... }
+  ],
+  "promptMocks": {},
+  "toolMockConfig": {},
+  "aiMockConfig": {},
+  "userPromptMocks": {}
+}
+```
+**Effort:** Medium (new endpoint + query)
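Steps 3 and 4 of the logic above (empty `frozen_event_ids` freezes everything, otherwise only the listed events) could be sketched as a pure function. `StoredEvent` and `selectFrozenEvents` are hypothetical names, and the sketch assumes event ids fit in a JavaScript `number`; a `BIGINT` column may arrive as a string depending on the database driver.

```typescript
// Hypothetical minimal event shape; real rows carry input, output, and more.
interface StoredEvent {
  id: number
  type: string
  name: string
}

// Empty frozenEventIds means "freeze all events in the trace";
// otherwise keep only the events whose id is listed.
function selectFrozenEvents<T extends { id: number }>(traceEvents: T[], frozenEventIds: number[]): T[] {
  if (frozenEventIds.length === 0) return traceEvents.slice()
  const wanted = new Set(frozenEventIds)
  return traceEvents.filter((evt) => wanted.has(evt.id))
}

const sampleTraceEvents: StoredEvent[] = [
  { id: 1, type: 'ai', name: 'claude-sonnet-4-5-20250929' },
  { id: 2, type: 'http', name: 'fetch' },
  { id: 3, type: 'db', name: 'pg.query' },
]

const allFrozen = selectFrozenEvents(sampleTraceEvents, [])    // all three events
const onlyHttp = selectFrozenEvents(sampleTraceEvents, [2])    // just the http event
```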
+---
+### 5. Extend trigger creation for workflow mode
+**File:** `services/observability.js` → trigger creation logic, and `controller/observability/observabilityController.js` → `createTrigger()`
+When a SamplingConfig has `mode: 'workflow'` (or when the dashboard requests a workflow rerun), the backend should:
+1. Create an `ObservabilityRunConfigs` row from the original trace
+2. Extract the workflow input from the trace's first `workflow` event (or the trigger config)
+3. Add `mode`, `workflowName`, `workflowInput`, and `runConfigId` to the trigger signal
+4. Emit the extended trigger via socket
+**Trigger signal emitted:**
+```json
+{
+  "triggerId": 123,
+  "runCount": 1,
+  "mode": "workflow",
+  "workflowName": "chatStreamHandler",
+  "workflowInput": { "messages": [...] },
+  "runConfigId": "uuid-of-run-config",
+  "steps": [...]
+}
+```
+**Effort:** Medium (extends existing trigger flow)
+---
+### 6. Extend trigger results handling for workflow mode
+**File:** `services/observability.js` → `POST /observability/triggers/:id/results`
+The SDK now sends workflow-mode results with a different shape:
+```json
+{
+  "mode": "workflow",
+  "workflowName": "chatStreamHandler",
+  "ok": true,
+  "output": { ... },
+  "durationMs": 1234
+}
+```
+The backend should detect `mode: 'workflow'` and handle it differently from step-mode results (which have a `steps[]` array). Store the workflow output, run evaluations against it, and mark the trigger as completed.
+**Effort:** Small (conditional handling in existing endpoint)
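One way to branch between the two result shapes is a type guard on the request body. This is a sketch under assumptions: the guard and `classifyResult` are hypothetical names, and the step-mode shape is reduced to "has a `steps` array" because the full step result format is not shown here.

```typescript
interface WorkflowResult {
  mode: 'workflow'
  workflowName: string
  ok: boolean
  output: unknown
  durationMs: number
}

// Narrows an untrusted body to the workflow-mode shape shown above.
function isWorkflowResult(body: unknown): body is WorkflowResult {
  if (typeof body !== 'object' || body === null) return false
  const candidate = body as Record<string, unknown>
  return (
    candidate['mode'] === 'workflow' &&
    typeof candidate['workflowName'] === 'string' &&
    typeof candidate['ok'] === 'boolean'
  )
}

// Classifies a result body; anything without mode 'workflow' or a steps array is rejected.
function classifyResult(body: unknown): 'workflow' | 'step' | 'invalid' {
  if (isWorkflowResult(body)) return 'workflow'
  if (typeof body === 'object' && body !== null && Array.isArray((body as Record<string, unknown>)['steps'])) {
    return 'step'
  }
  return 'invalid'
}
```

Checking for `mode: 'workflow'` first keeps the existing step-mode path untouched for older SDK versions that never send a `mode` field.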
+---
+## Dashboard Changes Required
+### 1. "Rerun Workflow" button on trace detail view
+**File:** `src/components/observability/EventDetailPanel.tsx` (or trace view page)
+Add a button that calls `POST /api/observability/workflow-reruns` (new backend endpoint, see below).
+**Effort:** Small (1 button + API call)
+---
+### 2. New backend endpoint for dashboard-initiated reruns: `POST /api/observability/workflow-reruns`
+**Request:**
+```json
+{
+  "traceId": "chatStreamHandler::1712851200000::a1b2c3d4",
+  "workflowName": "chatStreamHandler",
+  "frozenEventIds": [],
+  "promptMocks": {},
+  "toolMockConfig": {},
+  "aiMockConfig": {}
+}
+```
+**Logic:**
+1. Fetch original trace events from `ObservabilityEvents`
+2. Find workflow input from the trace
+3. Create `ObservabilityRunConfigs` row
+4. Create `Triggers` row with `mode='workflow'`
+5. Emit `trigger` via socket to the session-scoped room
+6. Return `{ triggerId, runConfigId, status: 'sent' }`
+**Effort:** Medium (new endpoint)
+---
+### 3. Socket listeners for real-time rerun status
+**File:** `src/services/socketService.ts`
+Listen for `workflow:rerun:result` relayed from the backend to update the dashboard UI when a rerun completes.
+**Effort:** Small
+---
+### 4. "Rerun Step" button on event detail
+**File:** `src/components/observability/EventDetailPanel.tsx`
+Add a button per event that calls `POST /api/observability/portal-tasks` to rerun a single step. Backend emits `portal:task` via socket to the SDK.
+**Effort:** Small (1 button + new backend endpoint)
+---
+## Implementation Priority
+| Priority | System | Change | Effort |
+|----------|--------|--------|--------|
+| **P0** | Backend | Include http/db in step selection | Small |
+| **P0** | Backend | Session-scoped socket rooms | Small |
+| **P1** | Backend | `ObservabilityRunConfigs` table migration | Small |
+| **P1** | Backend | `GET /api/observability/run-configs/:id` endpoint | Medium |
+| **P1** | Backend | Extend trigger creation for workflow mode | Medium |
+| **P1** | Backend | Extend trigger results for workflow mode | Small |
+| **P1** | Backend | `POST /api/observability/workflow-reruns` endpoint | Medium |
+| **P1** | Dashboard | "Rerun Workflow" button on trace view | Small |
+| **P2** | Dashboard | Socket listeners for rerun status | Small |
+| **P2** | Backend | `POST /api/observability/portal-tasks` endpoint | Medium |
+| **P2** | Dashboard | "Rerun Step" button on event detail | Small |
+| **P2** | Dashboard | Rerun progress monitor component | Medium |
+## SDK — Done (no further changes needed)
+| Feature | Status |
+|---------|--------|
+| Frozen event replay for HTTP/DB/AI calls | Done |
+| Auto-install interceptors on context init | Done |
+| Socket connection with auto-reconnect | Done |
+| `trigger` listener (step + workflow mode) | Done |
+| `workflow:rerun` listener (direct mode) | Done |
+| `portal:task` listener | Done |
+| `TriggerSignal` extended with workflow fields | Done |
+| `executeTrigger` handles workflow mode | Done |
+| `startTrace()` for workflow attribution | Done |
+| `wrapDB` / `interceptFetch` / `installDBAutoInterceptor` in all modes | Done |

package/docs/backend_traceid_update.md ADDED

@@ -0,0 +1,141 @@
+# Backend Update: Parse Workflow Name from traceId
+## What Changed in the SDK
+The SDK now generates structured `traceId` values in the format:
+```
+{workflowName}::{timestamp}::{shortId}
+```
+Examples:
+```
+chatStreamHandler::1712851200000::a1b2c3d4
+chatHandler::1712851234567::f7e8d9c0
+unknown-workflow::1712851200000::b3c4d5e6   ← fallback when no workflow name is discovered
+```
+- `workflowName` — the workflow function name from `ed_workflows.ts`, or `'unknown-workflow'` as fallback
+- `timestamp` — Unix ms when the trace started
+- `shortId` — first 8 chars of a UUID for uniqueness
+Every event in a batch now has this `traceId` set (previously it was `undefined`).
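As a rough illustration of the format, the following sketch assembles a `traceId` from its three parts. `buildTraceId` is a hypothetical helper, not the SDK's function; it takes the UUID as a parameter so the example stays deterministic, and the UUID value itself is made up.

```typescript
// Sketch: builds a traceId in the documented {workflowName}::{timestamp}::{shortId} shape.
// The function name and parameters are hypothetical, not the SDK's API.
function buildTraceId(workflowName: string | undefined, timestampMs: number, uuid: string): string {
  // Fall back to 'unknown-workflow' when no workflow name is known.
  const prefix = workflowName ? workflowName : 'unknown-workflow'
  // shortId is the first 8 characters of the UUID.
  const shortId = uuid.slice(0, 8)
  return `${prefix}::${timestampMs}::${shortId}`
}

const exampleTraceId = buildTraceId(
  'chatStreamHandler',
  1712851200000,
  'a1b2c3d4-0000-4000-8000-000000000000',
)
// exampleTraceId is 'chatStreamHandler::1712851200000::a1b2c3d4'
```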
+## What the Backend Needs to Change
+### 1. `upsertDiscoveredSteps()` — Parse workflow name from traceId
+**Current** (in `observabilityController.js`, lines ~610-615):
+```javascript
+if (serviceId) {
+  const hash = sha256(serviceId);
+  seen.set(`workflow:${hash}`, {
+    kind: 'workflow',
+    name: serviceId,   // ← Always "OPEN-REACT-TEMPLATE"
+    hash,
+    systemPrompt: null,
+    model: null,
+    count: events.length
+  });
+}
+```
+**New:**
+```javascript
+// Extract unique workflow names from traceId fields in the batch
+const workflowNames = new Set();
+for (const evt of events) {
+  if (evt.traceId && typeof evt.traceId === 'string') {
+    const parts = evt.traceId.split('::');
+    if (parts.length >= 2) {
+      workflowNames.add(parts[0]);
+    }
+  }
+}
+for (const name of workflowNames) {
+  const hash = sha256(name);
+  // Count events belonging to this workflow
+  const count = events.filter(e => e.traceId && e.traceId.startsWith(name + '::')).length;
+  seen.set(`workflow:${hash}`, {
+    kind: 'workflow',
+    name,
+    hash,
+    systemPrompt: null,
+    model: null,
+    count: count || events.length
+  });
+}
+```
+This way, a single batch from a web server that handles multiple workflows (e.g. `chatHandler` and `chatStreamHandler`) correctly discovers both workflows.
+### 2. Event querying — Group by workflow from traceId
+When querying events for a specific workflow (e.g. the `/steps/:stepKind/:stepName/calls` endpoint), filter by traceId prefix:
+```sql
+SELECT * FROM ObservabilityEvents
+WHERE project_id = $1
+  AND trace_id LIKE $2 || '::%'
+ORDER BY timestamp DESC
+LIMIT $3 OFFSET $4
+```
+Where `$2` is the workflow name (e.g. `chatStreamHandler`).
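One caveat with the prefix filter: `LIKE` treats `%` and `_` in the bound value as wildcards, so a workflow name such as `chat_handler` would also match `chatXhandler::...`. The sketch below escapes those characters before binding. `escapeLikeLiteral` is a hypothetical helper, and it assumes PostgreSQL's default `LIKE` escape character (backslash) with the value passed as a bound parameter.

```typescript
// Escape LIKE metacharacters (backslash, %, _) so the workflow name matches literally.
// Assumes the default backslash escape character for LIKE in PostgreSQL.
function escapeLikeLiteral(value: string): string {
  return value.replace(/[\\%_]/g, (ch) => `\\${ch}`)
}

// The escaped name can then be bound as $2 in the query above.
const escapedWorkflowName = escapeLikeLiteral('chat_handler')
```

Names without special characters, such as `chatStreamHandler`, pass through unchanged.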
+### 3. Health computation — Scope to workflow
+The health endpoint (`/steps/:stepKind/:stepName/health`) for workflows should aggregate metrics from events whose `traceId` starts with that workflow name, not from all events in the service.
+### 4. Trigger step selection — Use traceId to scope
+When `upsertDiscoveredSteps` processes tool/prompt steps, associate them with the workflow from their `traceId`:
+```javascript
+for (const evt of events) {
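+  // Only structured traceIds (containing '::') carry a workflow name; plain UUIDs fall back.
+  const workflowName = typeof evt.traceId === 'string' && evt.traceId.includes('::') ? evt.traceId.split('::')[0] : 'unknown-workflow';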
+  if (evt.type === 'tool' && evt.name) {
+    // Store tool with workflow association if needed
+    seen.set(`tool:${sha256(evt.name)}`, { kind: 'tool', name: evt.name, ... });
+  }
+}
+```
+## traceId Parsing Helper
+```javascript
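+// Parse a structured traceId ({workflowName}::{timestamp}::{shortId}).
+// Returns null for missing, non-string, or unstructured (no '::') values.
+function parseTraceId(traceId) {
+  if (!traceId || typeof traceId !== 'string') return null;
+  const parts = traceId.split('::');
+  if (parts.length < 2) return null;
+  return {
+    workflowName: parts[0],
+    // parts[1] always exists here because of the length check above.
+    timestamp: parseInt(parts[1], 10),
+    shortId: parts.length >= 3 ? parts[2] : null,
+  };
+}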
+```
+## Backward Compatibility
+- Events without `traceId` (from older SDK versions) → treat as `'unknown-workflow'`
+- Events with plain UUID `traceId` (no `::` separator) → treat as `'unknown-workflow'`
+- Events with structured `traceId` → parse workflow name from prefix
+The check is simple: `traceId.includes('::')` means structured format, otherwise fallback.
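The three compatibility cases can be captured in one small function. This is a sketch: `workflowNameFromTraceId` is a hypothetical name, and mapping an empty prefix (a `traceId` that starts with `::`) to the fallback is an assumption the rules above do not cover.

```typescript
const FALLBACK_WORKFLOW = 'unknown-workflow'

// Missing, non-string, or unstructured traceIds map to the fallback;
// structured ones yield the prefix before the first '::'.
function workflowNameFromTraceId(traceId: unknown): string {
  if (typeof traceId !== 'string' || !traceId.includes('::')) return FALLBACK_WORKFLOW
  const prefix = traceId.split('::')[0]
  // Assumption: an empty prefix is treated like an unknown workflow.
  return prefix ? prefix : FALLBACK_WORKFLOW
}

const fromOldSdk = workflowNameFromTraceId(undefined)
const fromPlainUuid = workflowNameFromTraceId('550e8400-e29b-41d4-a716-446655440000')
const fromStructured = workflowNameFromTraceId('chatHandler::1712851234567::f7e8d9c0')
```

The first two calls return `'unknown-workflow'` and the third returns `'chatHandler'`.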
+## SDK-Side: How traceId Gets Set
+The SDK discovers workflow names from `ed_workflows.ts` at init time. If exactly one workflow is exported, it is used as the default traceId prefix. Otherwise `'unknown-workflow'` is used until `startTrace()` is called.
+| Scenario | traceId value |
+|----------|---------------|
+| `initObservability()` with single workflow in `ed_workflows.ts` | `{workflowName}::timestamp::shortId` |
+| `initObservability()` with multiple/no workflows | `unknown-workflow::timestamp::shortId` |
+| `startTrace('chatStreamHandler')` in route handler | `chatStreamHandler::timestamp::shortId` |
+| `startTrace()` without workflow name | `{defaultWorkflowName}::timestamp::shortId` |
+| No observability (test/dashboard mode) | Not set by this system |
+`serviceId` is no longer used by the SDK. The API key identifies the project.
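The prefix selection in the table above can be sketched as two small functions. Both names are hypothetical and the list of exported workflow names is passed in directly; how the SDK actually reads `ed_workflows.ts` is not shown in this document.

```typescript
// Exactly one exported workflow becomes the default prefix;
// zero or several fall back to 'unknown-workflow'.
function defaultWorkflowName(exportedWorkflows: string[]): string {
  const [only] = exportedWorkflows
  return exportedWorkflows.length === 1 && only !== undefined ? only : 'unknown-workflow'
}

// An explicit name passed to startTrace() wins over the default.
function tracePrefix(defaultName: string, explicitName?: string): string {
  return explicitName ?? defaultName
}

const singleDefault = defaultWorkflowName(['chatHandler'])
const multiDefault = defaultWorkflowName(['chatHandler', 'chatStreamHandler'])
const explicitPrefix = tracePrefix(multiDefault, 'chatStreamHandler')
```

With two exported workflows the default is `'unknown-workflow'`, and an explicit name passed at trace start replaces it, matching the `startTrace('chatStreamHandler')` row.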