@archetypeai/ds-cli 0.3.9 → 0.3.10

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,986 @@
1
+ ---
2
+ name: newton-machine-state-upload
3
+ description: Run a Machine State Lens by uploading a CSV file for server-side processing. Use when you want to upload a file and let the server handle streaming (no local streaming loop needed).
4
+ argument-hint: [csv-file-path]
5
+ ---
6
+
7
+ # Newton Machine State Lens — Upload File (Server-Side Processing)
8
+
9
+ Generate a script that uploads a CSV file to the Archetype AI platform and runs Machine State classification server-side. Unlike the streaming approaches, this method lets the server read the file directly — no local streaming loop required. Supports both Python and JavaScript/Web.
10
+ 1
11
+ > **Frontend architecture:** See `@rules/frontend-architecture` for component decomposition conventions. This skill already includes detailed frontend guidance in the "Row Synchronization" section below.
12
+
13
+ ---
14
+
15
+ ## Python Implementation
16
+
17
+ ### Requirements
18
+
19
+ - `archetypeai` Python package
20
+ - Environment variables: `ATAI_API_KEY`, optionally `ATAI_API_ENDPOINT`
21
+
22
+ ### Architecture
23
+
24
+ Uses `create_and_run_lens` with YAML config. After the lens session is created, the data file is uploaded and a `csv_file_reader` input stream tells the server to read it directly.
25
+
26
+ #### 1. API Client Setup
27
+
28
+ ```python
29
+ from archetypeai.api_client import ArchetypeAI
30
+ import os
31
+
32
+ api_key = os.getenv("ATAI_API_KEY")
33
+ api_endpoint = os.getenv("ATAI_API_ENDPOINT", ArchetypeAI.get_default_endpoint())
34
+ client = ArchetypeAI(api_key, api_endpoint=api_endpoint)
35
+ ```
36
+
37
+ #### 2. Upload N-Shot Example Files & Build YAML
38
+
39
+ ```python
40
+ from pathlib import Path
41
+
42
+ input_n_shot = {}
43
+ for file_path in n_shot_file_paths:
44
+     class_name = Path(file_path).stem.upper()
45
+     resp = client.files.local.upload(file_path)
46
+     input_n_shot[class_name] = resp["file_id"]
47
+ ```
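The stem-to-class mapping above can be checked in isolation; the file names here are hypothetical examples, not files the skill requires:

```python
from pathlib import Path

# Class labels are derived from the file name stem, upper-cased
paths = ["healthy.csv", "broken.csv", "warning_state.csv"]  # hypothetical files
labels = [Path(p).stem.upper() for p in paths]
print(labels)  # ['HEALTHY', 'BROKEN', 'WARNING_STATE']
```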
48
+
49
+ #### 3. Lens YAML Configuration
50
+
51
+ ```yaml
52
+ lens_name: Machine State Lens
53
+ lens_config:
54
+   model_pipeline:
55
+     - processor_name: lens_timeseries_state_processor
56
+       processor_config: {}
57
+   model_parameters:
58
+     model_name: OmegaEncoder
59
+     model_version: OmegaEncoder::omega_embeddings_01
60
+     normalize_input: true
61
+     buffer_size: {window_size}
62
+     input_n_shot:
63
+       HEALTHY: {healthy_file_id}
64
+       BROKEN: {broken_file_id}
65
+     csv_configs:
66
+       timestamp_column: timestamp
67
+       data_columns: ['a1', 'a2', 'a3', 'a4']
68
+       window_size: {window_size}
69
+       step_size: {step_size}
70
+     knn_configs:
71
+       n_neighbors: 5
72
+       metric: manhattan
73
+       weights: distance
74
+       algorithm: ball_tree
75
+       normalize_embeddings: false
76
+   output_streams:
77
+     - stream_type: server_sent_events_writer
78
+ ```
79
+
80
+ Build `input_n_shot` dynamically:
81
+
82
+ ```python
83
+ n_shot_yaml_lines = []
84
+ for class_name, file_id in input_n_shot.items():
85
+     n_shot_yaml_lines.append(f"      {class_name}: {file_id}")  # indent to match input_n_shot nesting
86
+ n_shot_yaml = "\n".join(n_shot_yaml_lines)
87
+ ```
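To see what the substitution produces, here is the loop run against two hypothetical file IDs (the IDs are made up; the six-space indent must line up with the `input_n_shot:` entries in the YAML template):

```python
input_n_shot = {"HEALTHY": "file_abc123", "BROKEN": "file_def456"}  # hypothetical IDs

n_shot_yaml_lines = []
for class_name, file_id in input_n_shot.items():
    n_shot_yaml_lines.append(f"      {class_name}: {file_id}")
n_shot_yaml = "\n".join(n_shot_yaml_lines)

print(n_shot_yaml)
# prints:
#       HEALTHY: file_abc123
#       BROKEN: file_def456
```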
88
+
89
+ #### 4. Event Builders for Server-Side File Reading
90
+
91
+ After the lens session is created, send these events to make the server read the uploaded CSV:
92
+
93
+ ##### input_stream.set — Point server at the uploaded CSV
94
+
95
+ ```python
96
+ def build_input_event(file_id, window_size, step_size):
97
+     return {
98
+         "type": "input_stream.set",
99
+         "event_data": {
100
+             "stream_type": "csv_file_reader",
101
+             "stream_config": {
102
+                 "file_id": file_id,
103
+                 "window_size": window_size,
104
+                 "step_size": step_size,
105
+                 "loop_recording": False,
106
+                 "output_format": ""
107
+             }
108
+         }
109
+     }
110
+ ```
111
+
112
+ ##### output_stream.set — Enable SSE output
113
+
114
+ ```python
115
+ def build_output_event():
116
+     return {
117
+         "type": "output_stream.set",
118
+         "event_data": {
119
+             "stream_type": "server_side_events_writer",
120
+             "stream_config": {}
121
+         }
122
+     }
123
+ ```
124
+
125
+ #### 5. Session Callback
126
+
127
+ ```python
128
+ def session_callback(session_id, session_endpoint, client, args):
129
+     print(f"Session created: {session_id}")
130
+
131
+     # Upload the data CSV
132
+     data_resp = client.files.local.upload(args["data_file_path"])
133
+     data_file_id = data_resp["file_id"]
134
+
135
+     # Tell server to read the uploaded CSV
136
+     client.lens.sessions.process_event(
137
+         session_id,
138
+         build_input_event(data_file_id, args["window_size"], args["step_size"])
139
+     )
140
+     client.lens.sessions.process_event(
141
+         session_id,
142
+         build_output_event()
143
+     )
144
+
145
+     # Listen for results via SSE
146
+     sse_reader = client.lens.sessions.create_sse_consumer(
147
+         session_id, max_read_time_sec=args["max_run_time_sec"]
148
+     )
149
+
150
+     try:
151
+         for event in sse_reader.read(block=True):
152
+             if stop_flag:
153
+                 break
154
+             if isinstance(event, dict) and event.get("type") == "inference.result":
155
+                 ed = event.get("event_data", {})
156
+                 result = ed.get("response")
157
+                 meta = ed.get("query_metadata", {})
158
+                 ts = meta.get("query_timestamp", "N/A")
159
+                 if result is not None:
160
+                     print(f"[{ts}] Predicted class: {result}")
161
+     finally:
162
+         sse_reader.close()
163
+         print("Stopped.")
164
+ ```
165
+
166
+ #### 6. Create and Run Lens
167
+
168
+ ```python
169
+ client.lens.create_and_run_lens(
170
+     yaml_config, session_callback,
171
+     client=client, args=args
172
+ )
173
+ ```
174
+
175
+ ### CLI Arguments to Include
176
+
177
+ ```
178
+ --api-key            API key (fallback to ATAI_API_KEY env var)
179
+ --api-endpoint       API endpoint (default from SDK)
180
+ --data-file          Path to CSV file to analyze (required)
181
+ --n-shot-files       Paths to n-shot example CSVs (required, nargs='+')
182
+ --window-size        Window size in samples (default: 100)
183
+ --step-size          Step size in samples (default: 100)
184
+ --max-run-time-sec   Max runtime (default: 600)
185
+ ```
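A minimal argparse sketch covering these flags could look like the following; the defaults mirror the table above, and nothing here is prescribed by the SDK:

```python
import argparse
import os

def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Machine State Lens - upload file")
    parser.add_argument("--api-key", default=os.getenv("ATAI_API_KEY"))
    parser.add_argument("--api-endpoint", default=os.getenv("ATAI_API_ENDPOINT"))
    parser.add_argument("--data-file", required=True)
    parser.add_argument("--n-shot-files", required=True, nargs="+")
    parser.add_argument("--window-size", type=int, default=100)
    parser.add_argument("--step-size", type=int, default=100)
    parser.add_argument("--max-run-time-sec", type=int, default=600)
    return parser.parse_args(argv)

# Hypothetical invocation, mirroring the example usage below
args = parse_args(["--data-file", "data.csv", "--n-shot-files", "healthy.csv", "broken.csv"])
print(args.window_size)  # 100
```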
186
+
187
+ ### Example Usage
188
+
189
+ ```bash
190
+ python machine_state_upload.py \
191
+   --data-file data.csv \
192
+   --n-shot-files healthy.csv broken.csv
193
+
194
+ # With custom window size
195
+ python machine_state_upload.py \
196
+   --data-file sensor_recording.csv \
197
+   --n-shot-files normal.csv warning.csv critical.csv \
198
+   --window-size 512 --step-size 256
199
+ ```
200
+
201
+ ---
202
+
203
+ ## Web / JavaScript Implementation
204
+
205
+ Uses direct `fetch` calls to the Archetype AI REST API. The simplest of the three approaches on web — just upload files and let the server handle everything. Based on the working pattern from `test-stream/src/lib/atai-client.ts`.
206
+
207
+ ### Requirements
208
+
209
+ - `@microsoft/fetch-event-source` for SSE consumption
210
+
211
+ ### API Reference
212
+
213
+ | Operation | Method | Endpoint | Body |
214
+ |-----------|--------|----------|------|
215
+ | Upload file | POST | `/files` | `FormData` |
216
+ | Register lens | POST | `/lens/register` | `{ lens_config: config }` |
217
+ | Delete lens | POST | `/lens/delete` | `{ lens_id }` |
218
+ | Create session | POST | `/lens/sessions/create` | `{ lens_id }` |
219
+ | Process event | POST | `/lens/sessions/events/process` | `{ session_id, event }` |
220
+ | Destroy session | POST | `/lens/sessions/destroy` | `{ session_id }` |
221
+ | SSE consumer | GET | `/lens/sessions/consumer/{sessionId}` | — |
222
+
223
+ ### Helper: API fetch wrapper
224
+
225
+ ```typescript
226
+ const API_ENDPOINT = 'https://api.u1.archetypeai.app/v0.5'
227
+
228
+ async function apiPost<T>(path: string, apiKey: string, body: unknown, timeoutMs = 5000): Promise<T> {
229
+   const controller = new AbortController()
230
+   const timeoutId = setTimeout(() => controller.abort(), timeoutMs)
231
+
232
+   try {
233
+     const response = await fetch(`${API_ENDPOINT}${path}`, {
234
+       method: 'POST',
235
+       headers: {
236
+         Authorization: `Bearer ${apiKey}`,
237
+         'Content-Type': 'application/json',
238
+       },
239
+       body: JSON.stringify(body),
240
+       signal: controller.signal,
241
+     })
242
+
243
+     if (!response.ok) {
244
+       const errorBody = await response.json().catch(() => ({}))
245
+       throw new Error(`API POST ${path} failed: ${response.status} - ${JSON.stringify(errorBody)}`)
246
+     }
247
+
248
+     return response.json()
249
+   } finally {
250
+     clearTimeout(timeoutId)
251
+   }
252
+ }
253
+ ```
254
+
255
+ ### Step 1: Upload n-shot CSV files
256
+
257
+ ```typescript
258
+ const nShotMap: Record<string, string> = {}
259
+
260
+ for (const { file, className } of nShotFiles) {
261
+   const formData = new FormData()
262
+   formData.append('file', file) // File from <input type="file">
263
+
264
+   const response = await fetch(`${API_ENDPOINT}/files`, {
265
+     method: 'POST',
266
+     headers: { Authorization: `Bearer ${apiKey}` },
267
+     body: formData,
268
+   })
269
+   const result = await response.json()
270
+   nShotMap[className.toUpperCase()] = result.file_id
271
+ }
272
+ ```
273
+
274
+ ### Step 2: Upload the data CSV
275
+
276
+ ```typescript
277
+ const dataFormData = new FormData()
278
+ dataFormData.append('file', dataFile)
279
+
280
+ const dataResponse = await fetch(`${API_ENDPOINT}/files`, {
281
+   method: 'POST',
282
+   headers: { Authorization: `Bearer ${apiKey}` },
283
+   body: dataFormData,
284
+ })
285
+ const dataUpload = await dataResponse.json()
286
+ const dataFileId = dataUpload.file_id
287
+ ```
288
+
289
+ ### Step 3: Register lens, create session, wait for ready
290
+
291
+ ```typescript
292
+ const windowSize = 100
293
+ const stepSize = 100
294
+
295
+ const lensConfig = {
296
+   lens_name: 'machine_state_lens',
297
+   lens_config: {
298
+     model_pipeline: [
299
+       { processor_name: 'lens_timeseries_state_processor', processor_config: {} },
300
+     ],
301
+     model_parameters: {
302
+       model_name: 'OmegaEncoder',
303
+       model_version: 'OmegaEncoder::omega_embeddings_01',
304
+       normalize_input: true,
305
+       buffer_size: windowSize,
306
+       input_n_shot: nShotMap,
307
+       csv_configs: {
308
+         timestamp_column: 'timestamp',
309
+         data_columns: ['a1', 'a2', 'a3', 'a4'],
310
+         window_size: windowSize,
311
+         step_size: stepSize,
312
+       },
313
+       knn_configs: {
314
+         n_neighbors: 5,
315
+         metric: 'manhattan',
316
+         weights: 'distance',
317
+         algorithm: 'ball_tree',
318
+         normalize_embeddings: false,
319
+       },
320
+     },
321
+     output_streams: [
322
+       { stream_type: 'server_sent_events_writer' },
323
+     ],
324
+   },
325
+ }
326
+
327
+ // Register lens — NOTE: body wraps config as { lens_config: config }
328
+ const registeredLens = await apiPost<{ lens_id: string }>(
329
+   '/lens/register', apiKey, { lens_config: lensConfig }
330
+ )
331
+ const lensId = registeredLens.lens_id
332
+
333
+ // Create session
334
+ const session = await apiPost<{ session_id: string; session_endpoint: string }>(
335
+   '/lens/sessions/create', apiKey, { lens_id: lensId }
336
+ )
337
+ const sessionId = session.session_id
338
+
339
+ // The lens definition is no longer needed once the session exists (optional cleanup)
+ await apiPost('/lens/delete', apiKey, { lens_id: lensId })
340
+
341
+ // Wait for session to be ready
342
+ async function waitForSessionReady(sessionId: string, maxWaitMs = 30000): Promise<boolean> {
343
+   const start = Date.now()
344
+   while (Date.now() - start < maxWaitMs) {
345
+     const status = await apiPost<{ session_status: string }>(
346
+       '/lens/sessions/events/process', apiKey,
347
+       { session_id: sessionId, event: { type: 'session.status' } },
348
+       10000
349
+     )
350
+     if (status.session_status === 'LensSessionStatus.SESSION_STATUS_RUNNING' ||
351
+         status.session_status === '3') {
352
+       return true
353
+     }
354
+     if (status.session_status === 'LensSessionStatus.SESSION_STATUS_FAILED' ||
355
+         status.session_status === '6') {
356
+       return false
357
+     }
358
+     await new Promise(r => setTimeout(r, 500))
359
+   }
360
+   return false
361
+ }
362
+
363
+ const isReady = await waitForSessionReady(sessionId)
364
+ if (!isReady) throw new Error('Session failed to start')
365
+ ```
366
+
367
+ ### Step 4: Tell server to read the uploaded CSV
368
+
369
+ ```typescript
370
+ // Set input stream to CSV file reader
371
+ await apiPost('/lens/sessions/events/process', apiKey, {
372
+   session_id: sessionId,
373
+   event: {
374
+     type: 'input_stream.set',
375
+     event_data: {
376
+       stream_type: 'csv_file_reader',
377
+       stream_config: {
378
+         file_id: dataFileId,
379
+         window_size: windowSize,
380
+         step_size: stepSize,
381
+         loop_recording: false,
382
+         output_format: '',
383
+       },
384
+     },
385
+   },
386
+ }, 10000)
387
+
388
+ // Enable SSE output
389
+ await apiPost('/lens/sessions/events/process', apiKey, {
390
+   session_id: sessionId,
391
+   event: {
392
+     type: 'output_stream.set',
393
+     event_data: {
394
+       stream_type: 'server_side_events_writer',
395
+       stream_config: {},
396
+     },
397
+   },
398
+ }, 10000)
399
+ ```
400
+
401
+ ### Step 5: Consume SSE results
402
+
403
+ ```typescript
404
+ import { fetchEventSource } from '@microsoft/fetch-event-source'
405
+
406
+ const abortController = new AbortController()
407
+
408
+ fetchEventSource(`${API_ENDPOINT}/lens/sessions/consumer/${sessionId}`, {
409
+   headers: { Authorization: `Bearer ${apiKey}` },
410
+   signal: abortController.signal,
411
+   onmessage(event) {
412
+     const parsed = JSON.parse(event.data)
413
+
414
+     if (parsed.type === 'inference.result') {
415
+       const result = parsed.event_data.response
416
+       const meta = parsed.event_data.query_metadata
417
+       const ts = meta?.query_timestamp ?? 'N/A'
418
+       console.log(`[${ts}] Predicted class: ${result}`)
419
+     }
420
+
421
+     if (parsed.type === 'sse.stream.end') {
422
+       console.log('Analysis complete')
423
+       abortController.abort()
424
+     }
425
+   },
426
+ })
427
+ ```
428
+
429
+ ### Step 6: Cleanup
430
+
431
+ ```typescript
432
+ abortController.abort()
433
+ await apiPost('/lens/sessions/destroy', apiKey, { session_id: sessionId })
434
+ ```
435
+
436
+ ### Web Lifecycle Summary
437
+
438
+ ```
439
+ 1.  Upload n-shot CSVs     -> POST /files (FormData, one per class)
440
+ 2.  Upload data CSV        -> POST /files (FormData)
441
+ 3.  Register lens          -> POST /lens/register { lens_config: config }
442
+ 4.  Create session         -> POST /lens/sessions/create { lens_id }
443
+ 5.  Wait for ready         -> POST /lens/sessions/events/process { session_id, event: { type: 'session.status' } }
444
+ 6.  (Optional) Delete lens -> POST /lens/delete { lens_id }
445
+ 7.  Set input stream       -> POST /lens/sessions/events/process { session_id, event: { type: 'input_stream.set', ... } }
446
+ 8.  Set output stream      -> POST /lens/sessions/events/process { session_id, event: { type: 'output_stream.set', ... } }
447
+ 9.  Consume SSE results    -> GET  /lens/sessions/consumer/{sessionId}
448
+ 10. Destroy session        -> POST /lens/sessions/destroy { session_id }
449
+ ```
450
+
451
+ ---
452
+
453
+ ## Optional: Results Logging
454
+
455
+ Save predictions to a timestamped CSV for analysis or visualization.
456
+
457
+ ### Python — Results CSV
458
+
459
+ ```python
460
+ import csv
461
+ from pathlib import Path
462
+ from datetime import datetime
463
+
464
+ # Create results directory and timestamped filename
465
+ results_dir = Path("results")
466
+ results_dir.mkdir(exist_ok=True)
467
+ file_stem = Path(args["data_file_path"]).stem
468
+ timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
469
+ results_file = results_dir / f"{file_stem}_{timestamp}.csv"
470
+
471
+ # Write CSV header
472
+ with open(results_file, 'w', newline='', encoding='utf-8') as f:
473
+     writer = csv.writer(f)
474
+     writer.writerow(['read_index', 'predicted_class', 'confidence_scores',
475
+                      'file_id', 'window_size', 'total_rows'])
476
+
477
+ # Inside the SSE event loop, when handling inference.result:
478
+ if isinstance(event, dict) and event.get("type") == "inference.result":
479
+     ed = event.get("event_data", {})
480
+     result = ed.get("response")
481
+     meta = ed.get("query_metadata", {})
482
+     query_meta = meta.get("query_metadata", {})
483
+
484
+     predicted_class = result[0] if isinstance(result, list) and len(result) > 0 else "unknown"
485
+     raw_votes = result[1] if isinstance(result, list) and len(result) > 1 else {}
486
+
487
+     # Normalize KNN votes to percentages (votes are out of n_neighbors, default 5)
488
+     total_votes = sum(raw_votes.values()) if raw_votes else 1
489
+     confidence_scores = {k: round(v / total_votes * 100, 1) for k, v in raw_votes.items()}
490
+
491
+     read_index = query_meta.get("read_index", "N/A")
492
+     file_id = query_meta.get("file_id", "N/A")
493
+     window_size_val = query_meta.get("window_size", "N/A")
494
+     total_rows = query_meta.get("total_rows", "N/A")
495
+
496
+     print(f"[{read_index}] Predicted: {predicted_class} | Confidence: {confidence_scores}")
497
+
498
+     with open(results_file, 'a', newline='', encoding='utf-8') as f:
499
+         writer = csv.writer(f)
500
+         writer.writerow([read_index, predicted_class, str(confidence_scores),
501
+                          file_id, window_size_val, total_rows])
502
+ ```
503
+
504
+ **Row synchronization note**: Each result's `read_index` and `window_size` map directly to CSV row numbers (`[read_index, read_index + window_size)`). When building a UI on top of the Python backend, forward these fields to the frontend for chart-to-result synchronization. See the "Row Synchronization" section below for the full web pattern.
505
+
506
+ ### Response Structure
507
+
508
+ The `inference.result` response contains:
509
+ - `response[0]`: predicted class name (string, e.g. `"HEALTHY"`)
510
+ - `response[1]`: KNN vote counts dict (e.g. `{"HEALTHY": 3, "BROKEN": 2}`) — these are raw votes out of `n_neighbors` (default 5), **not percentages**. Normalize by dividing each value by the total votes to get percentages (e.g. 3/5 = 60%, 2/5 = 40%)
511
+ - `query_metadata.query_metadata.read_index`: window position in the file
512
+ - `query_metadata.query_metadata.file_id`: the file being analyzed
513
+ - `query_metadata.query_metadata.window_size`: window size used
514
+ - `query_metadata.query_metadata.total_rows`: total rows in the file
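A worked example of the vote normalization described above, using the sample vote counts from the bullet:

```python
raw_votes = {"HEALTHY": 3, "BROKEN": 2}  # raw KNN votes out of n_neighbors=5

total = sum(raw_votes.values())
percentages = {cls: round(v / total * 100, 1) for cls, v in raw_votes.items()}
print(percentages)  # {'HEALTHY': 60.0, 'BROKEN': 40.0}
```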
515
+
516
+ ### Web/JS — Results Array + CSV Download
517
+
518
+ ```typescript
519
+ interface PredictionResult {
520
+   readIndex: number
521
+   predictedClass: string
522
+   confidenceScores: Record<string, number>
523
+   fileId: string
524
+   windowSize: number
525
+   totalRows: number
526
+ }
527
+
528
+ const results: PredictionResult[] = []
529
+
530
+ // Inside the SSE onmessage handler:
531
+ if (parsed.type === 'inference.result') {
532
+   const result = parsed.event_data.response
533
+   const meta = parsed.event_data.query_metadata
534
+   const queryMeta = meta?.query_metadata ?? {}
535
+
536
+   // Normalize KNN votes to percentages (votes are out of n_neighbors, default 5)
537
+   const rawVotes: Record<string, number> = Array.isArray(result) && result.length > 1 ? result[1] : {}
538
+   const totalVotes = Object.values(rawVotes).reduce((sum, v) => sum + v, 0) || 1
539
+   const confidenceScores: Record<string, number> = {}
540
+   for (const [cls, votes] of Object.entries(rawVotes)) {
541
+     confidenceScores[cls] = Math.round((votes / totalVotes) * 1000) / 10 // e.g. 60.0
542
+   }
543
+
544
+   const prediction: PredictionResult = {
545
+     readIndex: queryMeta.read_index ?? 0,
546
+     predictedClass: Array.isArray(result) && result.length > 0 ? result[0] : 'unknown',
547
+     confidenceScores,
548
+     fileId: queryMeta.file_id ?? 'N/A',
549
+     windowSize: queryMeta.window_size ?? 0,
550
+     totalRows: queryMeta.total_rows ?? 0,
551
+   }
552
+
553
+   results.push(prediction)
554
+   console.log(`[${prediction.readIndex}] ${prediction.predictedClass}`, prediction.confidenceScores)
555
+ }
556
+
557
+ // Download results as CSV
558
+ function downloadResultsCsv(results: PredictionResult[], filename: string) {
559
+   const header = 'read_index,predicted_class,confidence_scores,file_id,window_size,total_rows\n'
560
+   const rows = results.map(r =>
561
+     // Double any quotes inside the JSON so the quoted CSV field stays valid
+     `${r.readIndex},${r.predictedClass},"${JSON.stringify(r.confidenceScores).replace(/"/g, '""')}",${r.fileId},${r.windowSize},${r.totalRows}`
562
+   ).join('\n')
563
+
564
+   const blob = new Blob([header + rows], { type: 'text/csv' })
565
+   const url = URL.createObjectURL(blob)
566
+   const a = document.createElement('a')
567
+   a.href = url
568
+   a.download = filename
569
+   a.click()
570
+   URL.revokeObjectURL(url)
571
+ }
572
+ ```
573
+
574
+ ### CLI Flag
575
+
576
+ Add `--save-results` flag (default: `True`) to enable/disable results logging:
577
+
578
+ ```
579
+ --save-results       Save predictions to CSV in results/ directory (default: True)
580
+ ```
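One way to get a default-on flag that can still be switched off is `argparse.BooleanOptionalAction` (Python 3.9+); this is a sketch of one option, not the only way to wire the flag:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--save-results", action=argparse.BooleanOptionalAction, default=True,
                    help="Save predictions to CSV in results/ directory")

# BooleanOptionalAction auto-generates a matching --no-save-results flag
print(parser.parse_args([]).save_results)                     # True (default)
print(parser.parse_args(["--no-save-results"]).save_results)  # False
```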
581
+
582
+ ---
583
+
584
+ ## Row Synchronization — Linking Results to Raw Data (Web)
585
+
586
+ Every `inference.result` from the API includes `query_metadata.query_metadata.read_index` and `query_metadata.query_metadata.window_size`. This tells you exactly which CSV rows `[read_index, read_index + window_size)` each classification covers. Use this to synchronize the results list with a line chart of the raw data.
587
+
588
+ ### Architecture Overview
589
+
590
+ ```
591
+ CSV File (user upload)
592
+   ├─ Sent to API for classification (existing flow)
593
+   └─ Parsed client-side for visualization
594
+
595
+ ┌─────────────────────────────────────┐
596
+ │ Line Chart (all CSV data columns)   │
597
+ │ ██████░░░░░░░░░░░░░░░░░░░░░░░░░░    │ ← highlight rect on selected result
598
+ └─────────────────────────────────────┘
599
+           ↑ click result sets activeResult
600
+ ┌─────────────────────────────────────┐
601
+ │ Results List (clickable rows)       │
602
+ │ [0-99] HEALTHY ██████████ 0.95      │
603
+ │ [100-199] BROKEN ████░░░░░░ 0.40    │ ← click to highlight on chart
604
+ └─────────────────────────────────────┘
605
+ ```
606
+
607
+ ### Step 1: Parse the CSV client-side on upload
608
+
609
+ When the user selects the data CSV file, parse it locally to feed the chart. This happens alongside the existing file upload to the API.
610
+
611
+ ```typescript
612
+ interface CsvRow {
613
+   [column: string]: number
614
+ }
615
+
616
+ function parseCsv(file: File): Promise<CsvRow[]> {
617
+   return new Promise((resolve, reject) => {
618
+     const reader = new FileReader()
619
+     reader.onload = (e) => {
620
+       const text = e.target?.result as string
621
+       const lines = text.trim().split('\n')
622
+       const headers = lines[0].split(',').map(h => h.trim())
623
+       const rows = lines.slice(1).map(line => {
624
+         // Naive split: assumes numeric columns and no quoted fields
+         const values = line.split(',')
625
+         const row: CsvRow = {}
626
+         headers.forEach((h, i) => {
627
+           row[h] = parseFloat(values[i])
628
+         })
629
+         return row
630
+       })
631
+       resolve(rows)
632
+     }
633
+     reader.onerror = reject
634
+     reader.readAsText(file)
635
+   })
636
+ }
637
+
638
+ // Usage: parse when user selects file
639
+ const csvData = await parseCsv(dataFile)
640
+ ```
641
+
642
+ ### Step 2: Class color mapping
643
+
644
+ Assign consistent colors to each predicted class. Both the results list and chart highlight use these colors.
645
+
646
+ **Important (Svelte 5):** Do NOT use a `$state` object with imperative mutation inside a getter — this causes `state_unsafe_mutation` errors when called from templates or `$derived`. Instead, use `$derived.by()` to compute the color map reactively from the detections array:
647
+
648
+ ```javascript
649
+ // Color palette — use 600-weight design tokens for light mode visibility
650
+ const CLASS_COLORS = [
651
+   'var(--color-atai-screen-green-600)',
652
+   'var(--color-atai-fire-red-600)',
653
+   'var(--color-atai-sunshine-yellow-600)',
654
+   'var(--chart-4)',
655
+   'var(--chart-5)',
656
+   'var(--chart-1)'
657
+ ];
658
+
659
+ // Compute color map reactively from detections (no mutation)
660
+ let classColorMap = $derived.by(() => {
661
+   const map = {};
662
+   let idx = 0;
663
+   for (const d of detections) {
664
+     if (!map[d.predictedClass]) {
665
+       map[d.predictedClass] = CLASS_COLORS[idx % CLASS_COLORS.length];
666
+       idx++;
667
+     }
668
+     for (const cls of Object.keys(d.confidenceScores)) {
669
+       if (!map[cls]) {
670
+         map[cls] = CLASS_COLORS[idx % CLASS_COLORS.length];
671
+         idx++;
672
+       }
673
+     }
674
+   }
675
+   return map;
676
+ });
677
+
678
+ function getClassColor(className) {
679
+   return classColorMap[className] || CLASS_COLORS[0];
680
+ }
681
+ ```
682
+
683
+ **Light mode note:** Avoid `--atai-good` (88% lightness) and `--atai-warning` (94% lightness) — they're invisible on white backgrounds. Use the 600-weight variants (`--color-atai-screen-green-600`, `--color-atai-fire-red-600`, `--color-atai-sunshine-yellow-600`) which have ~63% lightness.
684
+
685
+ ### Step 3: Reactive state for selected result
686
+
687
+ ```typescript
688
+ // Svelte 5 runes
689
+ let selectedResult: PredictionResult | null = $state(null)
690
+
691
+ // Or Svelte store
692
+ import { writable } from 'svelte/store'
693
+ const selectedResult = writable<PredictionResult | null>(null)
694
+ ```
695
+
696
+ ### Step 4: Line chart with D3 + highlight overlay
697
+
698
+ Render the raw CSV data as a multi-line chart. When `selectedResult` is set, draw a colored rectangle overlay for the row range `[readIndex, readIndex + windowSize)`.
699
+
700
+ ```typescript
701
+ import * as d3 from 'd3'
702
+
703
+ function drawChart(
704
+ container: HTMLElement,
705
+ csvData: CsvRow[],
706
+ dataColumns: string[],
707
+ selectedResult: PredictionResult | null
708
+ ) {
709
+ const width = container.clientWidth
710
+ const height = 300
711
+ const margin = { top: 20, right: 30, bottom: 40, left: 60 }
712
+ const innerW = width - margin.left - margin.right
713
+ const innerH = height - margin.top - margin.bottom
714
+
715
+ // Clear previous
716
+ d3.select(container).selectAll('*').remove()
717
+
718
+ const svg = d3.select(container)
719
+ .append('svg')
720
+ .attr('width', width)
721
+ .attr('height', height)
722
+
723
+ const g = svg.append('g')
724
+ .attr('transform', `translate(${margin.left},${margin.top})`)
725
+
726
+ // Scales
727
+ const xScale = d3.scaleLinear()
728
+ .domain([0, csvData.length - 1])
729
+ .range([0, innerW])
730
+
731
+ const allValues = csvData.flatMap(row => dataColumns.map(col => row[col]))
732
+ const yScale = d3.scaleLinear()
733
+ .domain([d3.min(allValues)!, d3.max(allValues)!])
734
+ .range([innerH, 0])
735
+ .nice()
736
+
737
+ // Axes
738
+ g.append('g')
739
+ .attr('transform', `translate(0,${innerH})`)
740
+ .call(d3.axisBottom(xScale).ticks(10))
741
+
742
+ g.append('g')
743
+ .call(d3.axisLeft(yScale).ticks(6))
744
+
745
+ // Line colors for data columns
746
+ const lineColors = ['#ff4444', '#44ff44', '#4444ff', '#ff44ff', '#ffaa00', '#00aaff']
747
+
748
+ // Draw lines
749
+ dataColumns.forEach((col, i) => {
750
+ const line = d3.line<CsvRow>()
751
+ .x((_, idx) => xScale(idx))
752
+ .y(d => yScale(d[col]))
753
+
754
+ g.append('path')
755
+ .datum(csvData)
756
+ .attr('fill', 'none')
757
+ .attr('stroke', lineColors[i % lineColors.length])
758
+ .attr('stroke-width', 1)
759
+ .attr('d', line)
760
+ })
761
+
762
+ // Highlight selected result's row range
763
+ if (selectedResult) {
764
+ const startX = xScale(selectedResult.readIndex)
765
+ const endX = xScale(selectedResult.readIndex + selectedResult.windowSize)
766
+ const color = getClassColor(selectedResult.predictedClass)
767
+
768
+ g.append('rect')
769
+ .attr('x', startX)
770
+ .attr('y', 0)
771
+ .attr('width', endX - startX)
772
+ .attr('height', innerH)
773
+ .attr('fill', color)
774
+ .attr('fill-opacity', 0.2)
775
+ .attr('stroke', color)
776
+ .attr('stroke-width', 1.5)
777
+ .attr('stroke-opacity', 0.6)
778
+
779
+ // Label at top of highlight
780
+ g.append('text')
781
+ .attr('x', startX + 4)
782
+ .attr('y', 14)
783
+ .attr('fill', color)
784
+ .attr('font-size', '11px')
785
+ .attr('font-weight', 'bold')
786
+ .text(`${selectedResult.predictedClass} [${selectedResult.readIndex}–${selectedResult.readIndex + selectedResult.windowSize}]`)
787
+ }
788
+ }
789
+ ```
790
+
791
+ ### Step 4b: SensorChart overlay alternative (layerchart)
792
+
793
+ If using the SensorChart pattern (layerchart) instead of raw D3, use an absolute-positioned div overlay rather than drawing into the SVG. SensorChart manages its own SVG, so you overlay a highlight div on top.
794
+
795
+ **Critical:** Use `[data-slot="chart"] svg` selector to find the layerchart SVG — a plain `querySelector('svg')` will match lucide icon SVGs (24×24) in CardHeader instead, producing wrong dimensions.
796
+
797
+ ```svelte
798
+ <script>
799
+   let chartWrapperRef = $state(null);
800
+   let highlightRect = $state(null);
801
+
802
+   $effect(() => {
803
+     const _sel = selectedDetection;
804
+     const _len = chartData.length;
805
+     if (!_sel || !chartWrapperRef || _len === 0) {
806
+       highlightRect = null;
807
+       return;
808
+     }
809
+
810
+     // IMPORTANT: target [data-slot="chart"] to skip icon SVGs
811
+     const svg = chartWrapperRef.querySelector('[data-slot="chart"] svg');
812
+     if (!svg) { highlightRect = null; return; }
813
+
814
+     const svgRect = svg.getBoundingClientRect();
815
+     const wrapperRect = chartWrapperRef.getBoundingClientRect();
816
+
817
+     // SensorChart uses padding={{ left: 20, bottom: 15 }}
818
+     const paddingLeft = 20;
819
+     const paddingBottom = 15;
820
+     const plotLeft = svgRect.left - wrapperRect.left + paddingLeft;
821
+     const plotWidth = svgRect.width - paddingLeft;
822
+     const plotTop = svgRect.top - wrapperRect.top;
823
+     const plotHeight = svgRect.height - paddingBottom;
824
+
825
+     const startFrac = _sel.readIndex / _len;
826
+     const widthFrac = _sel.windowSize / _len;
827
+
828
+     highlightRect = {
829
+       left: plotLeft + startFrac * plotWidth,
830
+       width: widthFrac * plotWidth,
831
+       top: plotTop,
832
+       height: plotHeight
833
+     };
834
+   });
835
+ </script>
836
+
837
+ <!-- Wrapper with relative positioning for overlay -->
838
+ <div bind:this={chartWrapperRef} class="relative">
839
+   <SensorChart
840
+     title="SENSOR DATA"
841
+     data={chartData}
842
+     signals={{ a1: 'A1', a2: 'A2', a3: 'A3', a4: 'A4' }}
843
+     xKey="timestamp"
844
+     yMin={yMin}
845
+     yMax={yMax}
846
+   />
847
+   {#if highlightRect}
848
+     <div
849
+       class="pointer-events-none absolute rounded-sm border-2"
850
+       style="
851
+         left: {highlightRect.left}px;
852
+         top: {highlightRect.top}px;
853
+         width: {highlightRect.width}px;
854
+         height: {highlightRect.height}px;
855
+         background-color: color-mix(in srgb, {getClassColor(selectedDetection.predictedClass)} 20%, transparent);
856
+         border-color: {getClassColor(selectedDetection.predictedClass)};
857
+       "
858
+     ></div>
859
+   {/if}
860
+ </div>
861
+ ```
862
+
863
+ **Key details:**
864
+ - `[data-slot="chart"]` is set by `Chart.Container` (see `chart-container.svelte`)
865
+ - SensorChart padding constants: `left: 20`, `bottom: 15` — match `padding={{ left: 20, bottom: 15 }}` in SensorChart
866
+ - The overlay is `pointer-events-none` so it doesn't block chart interactions
867
+ - `color-mix` creates a translucent fill from the class color
868
+
869
+ ### Step 5: Clickable results list
870
+
871
+ Each result row is clickable. Clicking sets `selectedResult`, which triggers the chart to redraw with the highlight.
872
+
873
+ **Important:** Do not use `result.readIndex` as the `{#each}` key — `readIndex` values can repeat when `step_size < window_size` (overlapping windows). Use the array index `i` instead.
874
+
875
+ ```svelte
876
+ <div class="results-list">
877
+   {#each results as result, i (i)}
878
+     <button
879
+       class="result-row"
880
+       class:selected={selectedResult === result}
881
+       onclick={() => selectedResult = result}
882
+       style="border-left: 4px solid {getClassColor(result.predictedClass)}"
883
+     >
884
+       <span class="row-range">[{result.readIndex}–{result.readIndex + result.windowSize}]</span>
885
+       <span class="class-name">{result.predictedClass}</span>
886
+       <span class="confidence">
887
+         {#each Object.entries(result.confidenceScores) as [cls, pct] (cls)}
888
+           <span style="color: {getClassColor(cls)}">{cls}: {pct}%</span>
889
+         {/each}
890
+       </span>
891
+     </button>
892
+   {/each}
893
+ </div>
894
+ ```
895
+
896
+ ### Step 6: Wire chart + results together
897
+
898
+ In the page component, reactively redraw the chart when `selectedResult` changes:
899
+
900
+ ```svelte
901
+ <script lang="ts">
902
+   let chartContainer: HTMLElement
903
+   let csvData: CsvRow[] = $state([])
904
+   let results: PredictionResult[] = $state([])
905
+   let selectedResult: PredictionResult | null = $state(null)
906
+   const dataColumns = ['a1', 'a2', 'a3', 'a4'] // from csv_configs
907
+
908
+   // Redraw chart when data or selection changes
909
+   $effect(() => {
910
+     if (chartContainer && csvData.length > 0) {
911
+       drawChart(chartContainer, csvData, dataColumns, selectedResult)
912
+     }
913
+   })
914
+ </script>
915
+
916
+ <!-- Chart -->
917
+ <div bind:this={chartContainer} class="w-full h-[300px]"></div>
918
+
919
+ <!-- Results list -->
920
+ <!-- ... clickable results from Step 5 ... -->
921
+ ```
922
+
923
+ ### Row Range Mapping — How It Works
924
+
925
+ Each API result contains the exact row range it classified:
926
+
927
+ ```
928
+ query_metadata.query_metadata = {
929
+   "read_index": 200,    // start row in CSV
930
+   "window_size": 100,   // number of rows in this window
931
+   "total_rows": 6090    // total rows in the file
932
+ }
933
+ ```
934
+
935
+ - **Start row**: `read_index` (0-based)
936
+ - **End row**: `read_index + window_size` (exclusive)
937
+ - **No proportional mapping needed** — the indices map directly to CSV row numbers
938
+
939
+ This is simpler than the embeddings demo's approach (which uses proportional mapping between datasets). Here, the API gives you exact row positions.
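The mapping can be captured in a two-line helper; the function name is illustrative, not part of the API:

```python
def row_range(read_index: int, window_size: int) -> range:
    """Rows covered by one classification window: [read_index, read_index + window_size)."""
    return range(read_index, read_index + window_size)

r = row_range(200, 100)
print(r.start, r.stop - 1)  # 200 299
```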
940
+
941
+ ---
942
+
943
+ ## Key Differences from Streaming Approaches
944
+
945
+ | | Upload (this skill) | Stream from File | Stream from Sensor |
946
+ |---|---|---|---|
947
+ | Data reading | Server-side `csv_file_reader` | Local pandas/JS + windowed push | Local sensor + buffered push |
948
+ | Lens creation | YAML config via `create_and_run_lens` / `registerLens` | Same | Same |
949
+ | Data delivery | Upload file, server reads it | Local code streams windows | Local code buffers + streams |
950
+ | Local processing | None (just upload) | Window slicing + streaming | Sensor acquisition + buffering |
951
+ | Best for | Batch analysis of existing data | Controlled local streaming | Real-time from hardware |
952
+
953
+ ## CSV Format Expected
954
+
955
+ ```csv
956
+ timestamp,a1,a2,a3,a4
957
+ 1700000000.0,100,200,300,374
958
+ ```
959
+
960
+ - Column names are configurable via `csv_configs.data_columns`
961
+ - N-shot files and data file must share the same column structure
962
+ - `a4` is typically the magnitude: sqrt(a1² + a2² + a3²)
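If a recording lacks the magnitude column, it can be derived from the three axis columns as described in the last bullet; this helper is a sketch, and the column naming follows the example format above:

```python
import math

def magnitude(a1: float, a2: float, a3: float) -> float:
    """a4 as the Euclidean magnitude of the three axis readings."""
    return math.sqrt(a1**2 + a2**2 + a3**2)

# Matches the sample row's a4 of 374 (rounded)
print(magnitude(100, 200, 300))  # ≈ 374.166
```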
963
+
964
+ ## Key Implementation Notes
965
+
966
+ - Default `window_size` and `step_size`: **100**
967
+ - N-shot class names derived from filename stems (e.g., `healthy.csv` → `HEALTHY`)
968
+ - No local streaming loop needed — the server reads the file via `csv_file_reader`
969
+ - Python: `signal.SIGINT` for graceful shutdown, always use `try/finally` for `sse_reader`
970
+ - Web: `AbortController` for SSE cancellation, destroy session on unmount
971
+ - On web, this is the simplest approach — just file uploads and SSE consumption, no sensor APIs needed
972
+
973
+ ## Graceful Shutdown
974
+
975
+ ```python
976
+ import signal
977
+
978
+ stop_flag = False
979
+ def _sigint(sig, frame):
980
+     global stop_flag
981
+     stop_flag = True
982
+
983
+ signal.signal(signal.SIGINT, _sigint)
984
+ ```
985
+
986
+ Always wrap the SSE reader in a `try/finally` to ensure `sse_reader.close()` is called.