@archetypeai/ds-cli 0.3.7 → 0.3.10

Files changed (28)
  1. package/README.md +25 -67
  2. package/commands/create.js +5 -27
  3. package/commands/init.js +5 -27
  4. package/files/AGENTS.md +19 -3
  5. package/files/CLAUDE.md +21 -3
  6. package/files/rules/accessibility.md +49 -0
  7. package/files/rules/frontend-architecture.md +77 -0
  8. package/files/skills/apply-ds/SKILL.md +92 -80
  9. package/files/skills/apply-ds/scripts/audit.sh +169 -0
  10. package/files/skills/apply-ds/scripts/setup.sh +48 -166
  11. package/files/skills/create-dashboard/SKILL.md +12 -0
  12. package/files/skills/embedding-from-file/SKILL.md +415 -0
  13. package/files/skills/embedding-from-sensor/SKILL.md +406 -0
  14. package/files/skills/embedding-upload/SKILL.md +414 -0
  15. package/files/skills/fix-accessibility/SKILL.md +57 -9
  16. package/files/skills/newton-activity-monitor-lens-on-video/SKILL.md +817 -0
  17. package/files/skills/newton-camera-frame-analysis/SKILL.md +611 -0
  18. package/files/skills/newton-camera-frame-analysis/scripts/activity-monitor-frame.py +165 -0
  19. package/files/skills/newton-camera-frame-analysis/scripts/captures/logs/api_responses_20260206_105610.json +62 -0
  20. package/files/skills/newton-camera-frame-analysis/scripts/continuous_monitor.py +119 -0
  21. package/files/skills/newton-direct-query/SKILL.md +212 -0
  22. package/files/skills/newton-direct-query/scripts/direct_query.py +129 -0
  23. package/files/skills/newton-machine-state-from-file/SKILL.md +545 -0
  24. package/files/skills/newton-machine-state-from-sensor/SKILL.md +707 -0
  25. package/files/skills/newton-machine-state-upload/SKILL.md +986 -0
  26. package/lib/add-ds-ui-svelte.js +5 -2
  27. package/lib/scaffold-ds-svelte-project.js +25 -18
  28. package/package.json +13 -2
package/files/skills/newton-activity-monitor-lens-on-video/SKILL.md (new file, +817 lines):
---
name: newton-activity-monitor-lens-on-video
description: Analyze uploaded video files using Newton's activity monitor lens. Server reads the video via video_file_reader and streams results via SSE. Use for batch video analysis, activity detection, or temporal event extraction from video files. NOT for live webcam — use /camera-frame-analysis for that.
---

# Newton Activity Monitor Lens on Video (Upload Video → Server-Side Processing → SSE)

Upload a video file to the Archetype AI platform and analyze it server-side using Newton's activity monitor lens. The server reads the video via `video_file_reader` and streams inference results back via SSE.

**This skill is for UPLOADED VIDEO FILES only.** For live webcam analysis, use `/camera-frame-analysis` instead.

| | This skill (activity-monitor-lens-on-video) | camera-frame-analysis |
|---|---|---|
| **Input** | Uploaded video file (mp4, etc.) | Live webcam (base64 JPEG frames) |
| **Who reads frames** | Server (`video_file_reader`) | Client (Python cv2 / JS canvas) |
| **Event type** | Server-driven processing | `model.query` (request/response) |
| **Response** | Async via SSE stream (`inference.result`) | Direct in POST response |
| **Use case** | Batch video analysis, post-hoc review | Real-time webcam Q&A |

---
## Step 0: Choose an implementation approach

Before writing any code, ask the user which approach they want:

| Approach | Best for | API key security | Requires |
|----------|----------|------------------|----------|
| **Server Proxy (SvelteKit)** | SvelteKit apps, demos with MediaRecorder | Private (server-side only) | SvelteKit server routes, `ffmpeg-static` for MediaRecorder video conversion |
| **Plain JS (client-side)** | Quick prototypes, static sites | Exposed in client | Direct `fetch` calls, pre-recorded MP4 files only |
| **Python** | Backend scripts, batch processing, CLI tools | Private (server-side) | `archetypeai` Python package |

### Key considerations

- **MediaRecorder video (webcam recordings):** Chrome's `MediaRecorder` produces fragmented MP4 or WebM that the Archetype `video_file_reader` **cannot parse**. You MUST convert to proper MP4 (with a moov atom) before uploading. The Server Proxy approach handles this automatically with `ffmpeg-static`. Plain JS does NOT support this — use pre-recorded MP4 files only.
- **CORS:** The Archetype AI API does not allow direct browser requests. The Server Proxy approach avoids this by routing all API calls through SvelteKit server routes. Plain JS only works where CORS does not apply (e.g., server-side Node.js scripts).
- **API key:** The Server Proxy approach keeps the API key in private server env vars. Plain JS exposes it to the browser.

---
## Default Model Parameters

| Parameter | Default | Description |
|---|---|---|
| `model_version` | `Newton::c2_4_7b_251215a172f6d7` | Newton model ID |
| `instruction` | *(your activity description prompt)* | What the model should output |
| `focus` | *(what to look for)* | Focus of analysis |
| `temporal_focus` | `5` | Temporal window in seconds |
| `max_new_tokens` | `512` | Max response length |
| `camera_buffer_size` | `5` | Frames to buffer before processing |
| `camera_buffer_step_size` | `5` | Step size for frame sampling |
| `memory_prompt_buffer_size` | `0` | Prior prompts to retain (0 = no memory) |
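As a sketch, the defaults above can be collected into one plain object and spread into a lens config, overriding only what a given call needs (the constant name `DEFAULT_MODEL_PARAMETERS` is illustrative, not part of the API; the instruction/focus strings are the sample prompts used later in this skill):

```javascript
// Default model parameters from the table above, gathered into a reusable object.
const DEFAULT_MODEL_PARAMETERS = {
  model_version: 'Newton::c2_4_7b_251215a172f6d7',
  instruction: 'Describe the activity currently being performed in one concise sentence.',
  focus: 'Focus on the main person and their current action.',
  temporal_focus: 5,
  max_new_tokens: 512,
  camera_buffer_size: 5,
  camera_buffer_step_size: 5,
  memory_prompt_buffer_size: 0
};

// Override only what you need, e.g. a wider temporal window:
const params = { ...DEFAULT_MODEL_PARAMETERS, temporal_focus: 10 };
```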
---
## API Reference

| Operation | Method | Endpoint | Body |
|-----------|--------|----------|------|
| Upload file | POST | `/files` | `FormData` |
| List lenses | GET | `/lens/metadata` | — |
| Register lens | POST | `/lens/register` | `{ lens_config: config }` |
| Delete lens | POST | `/lens/delete` | `{ lens_id }` |
| Create session | POST | `/lens/sessions/create` | `{ lens_id }` |
| Process event | POST | `/lens/sessions/events/process` | `{ session_id, event }` |
| Destroy session | POST | `/lens/sessions/destroy` | `{ session_id }` |
| SSE consumer | GET | `/lens/sessions/consumer/{sessionId}` | — |

---
## Frontend Architecture

Decompose the UI into components rather than building a monolithic `+page.svelte`. See `@rules/frontend-architecture` for full conventions.

### Recommended decomposition

| UI Area | Component | Composes (DS primitives) | Key Props |
|---------|-----------|--------------------------|-----------|
| Video input | `VideoUpload.svelte` (or reuse the VideoPlayer pattern) | Card, Button | `onupload`, `status`, `videoUrl` |
| Status | `ProcessingStatus.svelte` | Card, Badge, Spinner | `status`, `message` |
| Summary | `AnalysisSummary.svelte` | Card | `title`, `summary` |
| Log | Reuse the ExpandableLog pattern | Collapsible, Badge, Item | `title`, `logs[]`, `count` |

### Layout and API logic

- Use `@skills/create-dashboard` as a layout starting point for dashboard-style UIs
- Extract SSE consumer, session management, and upload logic into `$lib/api/activity-monitor.js`
- `+page.svelte` orchestrates flow state (status, sessionId) and passes data to components via props

### What NOT to include unless requested

- Charts (SensorChart, ScatterChart) — only if the user asks for time-series visualization
- Complex data tables — only if the user needs tabular result inspection

---
## Approach A: Server Proxy (SvelteKit)

Routes all API calls through SvelteKit server routes. Converts MediaRecorder video to proper MP4 using `ffmpeg-static`. API key stays private on the server.

### Prerequisites

```bash
npm install ffmpeg-static
```

Add to `.env`:
```
ATAI_API_KEY=your-api-key
ATAI_API_ENDPOINT=https://api.u1.archetypeai.app/v0.5
```
### Server Route 1: Upload (`src/routes/api/upload/+server.js`)

Accepts a video blob, converts it to proper MP4 via ffmpeg, then uploads to the Archetype API.

```javascript
import { json, error } from '@sveltejs/kit';
import { ATAI_API_KEY, ATAI_API_ENDPOINT } from '$env/static/private';
import { writeFile, unlink, readFile } from 'node:fs/promises';
import { join } from 'node:path';
import { tmpdir } from 'node:os';
import { spawn } from 'node:child_process';
import ffmpegPath from 'ffmpeg-static';

function convertToMp4(inputPath, outputPath) {
  return new Promise((resolve, reject) => {
    const args = [
      '-i', inputPath,
      '-c:v', 'libx264', '-preset', 'ultrafast',
      '-movflags', '+faststart',
      '-an', '-y', outputPath
    ];
    const proc = spawn(ffmpegPath, args, { stdio: ['ignore', 'pipe', 'pipe'] });
    let stderr = '';
    proc.stderr.on('data', (d) => (stderr += d.toString()));
    proc.on('close', (code) => {
      if (code === 0) resolve();
      else reject(new Error(`ffmpeg exited with code ${code}: ${stderr.slice(-500)}`));
    });
    proc.on('error', reject);
  });
}

export async function POST({ request }) {
  const formData = await request.formData();
  const file = formData.get('file');
  if (!file) throw error(400, 'No file provided');

  const timestamp = Date.now();
  const inputPath = join(tmpdir(), `upload_input_${timestamp}`);
  const outputPath = join(tmpdir(), `upload_output_${timestamp}.mp4`);

  try {
    const buffer = Buffer.from(await file.arrayBuffer());
    await writeFile(inputPath, buffer);
    await convertToMp4(inputPath, outputPath);

    const convertedBuffer = await readFile(outputPath);
    const outputName = file.name.replace(/\.\w+$/, '.mp4');
    const convertedFile = new File([convertedBuffer], outputName, { type: 'video/mp4' });

    const uploadForm = new FormData();
    uploadForm.append('file', convertedFile);

    const response = await fetch(`${ATAI_API_ENDPOINT}/files`, {
      method: 'POST',
      headers: { Authorization: `Bearer ${ATAI_API_KEY}` },
      body: uploadForm
    });

    if (!response.ok) {
      const errorBody = await response.json().catch(() => ({}));
      throw error(response.status, `Upload failed: ${JSON.stringify(errorBody)}`);
    }

    return json(await response.json());
  } finally {
    await unlink(inputPath).catch(() => {});
    await unlink(outputPath).catch(() => {});
  }
}
```
### Server Route 2: Activity Monitor (`src/routes/api/activity-monitor/+server.js`)

Handles stale lens cleanup, lens registration, session creation, and session-ready polling — all server-side.

```javascript
import { json, error } from '@sveltejs/kit';
import { ATAI_API_KEY, ATAI_API_ENDPOINT } from '$env/static/private';

async function apiGet(path) {
  const response = await fetch(`${ATAI_API_ENDPOINT}${path}`, {
    headers: { Authorization: `Bearer ${ATAI_API_KEY}` }
  });
  if (!response.ok) throw new Error(`API GET ${path} failed: ${response.status}`);
  return response.json();
}

async function apiPost(path, body, timeoutMs = 10000) {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetch(`${ATAI_API_ENDPOINT}${path}`, {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${ATAI_API_KEY}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify(body),
      signal: controller.signal
    });
    if (!response.ok) {
      const errorBody = await response.json().catch(() => ({}));
      throw new Error(`API POST ${path} failed: ${response.status} - ${JSON.stringify(errorBody)}`);
    }
    return response.json();
  } finally {
    clearTimeout(timeoutId);
  }
}

async function waitForSessionReady(sessionId, maxWaitMs = 60000) {
  const start = Date.now();
  while (Date.now() - start < maxWaitMs) {
    const status = await apiPost(
      '/lens/sessions/events/process',
      { session_id: sessionId, event: { type: 'session.status' } },
      10000
    );
    if (status.session_status === 'LensSessionStatus.SESSION_STATUS_RUNNING' ||
        status.session_status === '3') return true;
    if (status.session_status === 'LensSessionStatus.SESSION_STATUS_FAILED' ||
        status.session_status === '6') return false;
    await new Promise((r) => setTimeout(r, 500));
  }
  return false;
}

// POST: Create activity monitor session
export async function POST({ request }) {
  const { lensConfig } = await request.json();

  // Clean up ALL existing lenses to avoid stale state blocking new registrations
  try {
    const existing = await apiGet('/lens/metadata');
    if (Array.isArray(existing)) {
      for (const lens of existing) {
        await apiPost('/lens/delete', { lens_id: lens.lens_id }).catch(() => {});
      }
    }
  } catch { /* ignore cleanup errors */ }

  const registered = await apiPost('/lens/register', { lens_config: lensConfig });
  const lensId = registered.lens_id;

  const session = await apiPost('/lens/sessions/create', { lens_id: lensId });
  const sessionId = session.session_id;

  // Delete the lens definition; the session keeps running independently
  await apiPost('/lens/delete', { lens_id: lensId });

  const isReady = await waitForSessionReady(sessionId);
  if (!isReady) throw error(500, 'Session failed to start');

  return json({ sessionId });
}

// DELETE: Destroy a session
export async function DELETE({ request }) {
  const { sessionId } = await request.json();
  await apiPost('/lens/sessions/destroy', { session_id: sessionId });
  return json({ ok: true });
}
```
### Server Route 3: SSE Stream Proxy (`src/routes/api/activity-monitor/stream/[sessionId]/+server.js`)

```javascript
import { ATAI_API_KEY, ATAI_API_ENDPOINT } from '$env/static/private';

export async function GET({ params }) {
  const { sessionId } = params;

  const upstream = await fetch(
    `${ATAI_API_ENDPOINT}/lens/sessions/consumer/${sessionId}`,
    { headers: { Authorization: `Bearer ${ATAI_API_KEY}` } }
  );

  if (!upstream.ok) {
    return new Response('Failed to connect to SSE stream', { status: upstream.status });
  }

  return new Response(upstream.body, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      Connection: 'keep-alive'
    }
  });
}
```
### Server Route 4: Direct Query Proxy (`src/routes/api/query/+server.js`)

```javascript
import { json, error } from '@sveltejs/kit';
import { ATAI_API_KEY, ATAI_API_ENDPOINT } from '$env/static/private';

export async function POST({ request }) {
  const { query, systemPrompt = '', maxNewTokens = 1024 } = await request.json();

  const response = await fetch(`${ATAI_API_ENDPOINT}/query`, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${ATAI_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      query,
      system_prompt: systemPrompt,
      instruction_prompt: systemPrompt,
      file_ids: [],
      model: 'Newton::c2_4_7b_251215a172f6d7',
      max_new_tokens: maxNewTokens,
      sanitize: false
    }),
    signal: AbortSignal.timeout(120000)
  });

  if (!response.ok) {
    const errorBody = await response.json().catch(() => ({}));
    throw error(response.status, `Query failed: ${JSON.stringify(errorBody)}`);
  }

  const data = await response.json();

  // Normalize the various response shapes into a single text string
  let text = '';
  const r = data?.response;
  if (r?.response && Array.isArray(r.response)) text = r.response[0] || '';
  else if (Array.isArray(r)) text = r[0] || '';
  else if (typeof r === 'string') text = r;
  else if (data?.text) text = data.text;
  else text = JSON.stringify(data);

  return json({ text });
}
```
### Client-side usage (Svelte / JS)

The client calls the local server routes — no API key or endpoint needed:

```javascript
// 1. Upload video (server converts to proper MP4 automatically)
async function uploadVideo(blob) {
  const formData = new FormData();
  const ext = blob.type.includes('mp4') ? 'mp4' : 'webm';
  const timestamp = Date.now();
  formData.append('file', new File([blob], `recording_${timestamp}.${ext}`, { type: blob.type }));

  const res = await fetch('/api/upload', { method: 'POST', body: formData });
  if (!res.ok) throw new Error(`Upload failed: ${res.status}`);
  const result = await res.json();
  return result.file_id;
}

// 2. Create activity monitor session (server handles lens cleanup, registration, session polling)
async function createSession(lensConfig) {
  const res = await fetch('/api/activity-monitor', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ lensConfig })
  });
  if (!res.ok) throw new Error(`Session creation failed: ${res.status}`);
  const { sessionId } = await res.json();
  return sessionId;
}

// 3. Consume SSE results (proxied through the server)
async function consumeSSE(sessionId, onResult) {
  const res = await fetch(`/api/activity-monitor/stream/${sessionId}`);
  if (!res.ok) throw new Error(`SSE stream failed: ${res.status}`);

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop();

    for (const line of lines) {
      if (!line.startsWith('data: ')) continue;
      try {
        const parsed = JSON.parse(line.slice(6));
        if (parsed.type === 'inference.result') {
          const text = parsed.event_data.response[0];
          const ts = parsed.event_data.query_metadata?.sensor_timestamp;
          onResult({ text, timestamp: ts });
        }
        if (parsed.type === 'sse.stream.end') return;
      } catch { /* skip malformed JSON */ }
    }
  }
}

// 4. Destroy session
async function destroySession(sessionId) {
  await fetch('/api/activity-monitor', {
    method: 'DELETE',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ sessionId })
  });
}

// 5. Direct query (for post-processing)
async function directQuery(systemPrompt, query) {
  const res = await fetch('/api/query', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, systemPrompt })
  });
  if (!res.ok) throw new Error(`Query failed: ${res.status}`);
  return res.json(); // { text: "..." }
}
```
### Server Proxy Lifecycle Summary

```
Client                          SvelteKit Server                    Archetype AI API
──────                          ────────────────                    ────────────────
POST /api/upload             →  ffmpeg convert → POST /files     →  returns file_id
POST /api/activity-monitor   →  GET /lens/metadata (cleanup)     →
                                POST /lens/register              →  returns lens_id
                                POST /lens/sessions/create       →  returns session_id
                                POST /lens/delete                →
                                poll session.status              →  SESSION_STATUS_RUNNING
                             ←  { sessionId }
GET /api/.../stream/:id      →  GET /lens/sessions/consumer/     →  SSE events proxied
                             ←  inference.result events
DELETE /api/activity-monitor →  POST /lens/sessions/destroy      →
```

---
## Approach B: Plain JS (client-side)

Direct `fetch` calls from the browser or Node.js. **Only works with pre-recorded MP4 files** (not MediaRecorder blobs). API key is exposed to the client. May hit CORS issues in the browser.

### Helpers: API wrappers

```typescript
const API_ENDPOINT = 'https://api.u1.archetypeai.app/v0.5'

async function apiGet<T>(path: string, apiKey: string): Promise<T> {
  const response = await fetch(`${API_ENDPOINT}${path}`, {
    method: 'GET',
    headers: { Authorization: `Bearer ${apiKey}` },
  })
  if (!response.ok) throw new Error(`API GET ${path} failed: ${response.status}`)
  return response.json()
}

async function apiPost<T>(path: string, apiKey: string, body: unknown, timeoutMs = 5000): Promise<T> {
  const controller = new AbortController()
  const timeoutId = setTimeout(() => controller.abort(), timeoutMs)

  try {
    const response = await fetch(`${API_ENDPOINT}${path}`, {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
      signal: controller.signal,
    })

    if (!response.ok) {
      const errorBody = await response.json().catch(() => ({}))
      throw new Error(`API POST ${path} failed: ${response.status} - ${JSON.stringify(errorBody)}`)
    }

    return response.json()
  } finally {
    clearTimeout(timeoutId)
  }
}
```
### Step 1: Upload the video file

```typescript
const formData = new FormData()
formData.append('file', videoFile) // File object from <input type="file">

const uploadResponse = await fetch(`${API_ENDPOINT}/files`, {
  method: 'POST',
  headers: { Authorization: `Bearer ${apiKey}` },
  body: formData,
})
const uploadResult = await uploadResponse.json()
const fileId = uploadResult.file_id
```
### Step 2: Register the lens (clean up stale lenses first)

```typescript
const LENS_NAME = 'activity-monitor-video-lens'

const lensConfig = {
  lens_name: LENS_NAME,
  lens_config: {
    model_pipeline: [
      { processor_name: 'lens_camera_processor', processor_config: {} },
    ],
    model_parameters: {
      model_version: 'Newton::c2_4_7b_251215a172f6d7',
      instruction: 'Describe the activity currently being performed in one concise sentence.',
      focus: 'Focus on the main person and their current action.',
      temporal_focus: 5,
      max_new_tokens: 512,
      camera_buffer_size: 5,
      camera_buffer_step_size: 5,
      memory_prompt_buffer_size: 0,
    },
    input_streams: [
      {
        stream_type: 'video_file_reader',
        stream_config: { file_id: fileId },
      },
    ],
    output_streams: [
      { stream_type: 'server_sent_events_writer' },
    ],
  },
}

// Delete ALL existing lenses to avoid stale state blocking new registrations
const existingLenses = await apiGet<Array<{ lens_id: string; lens_name: string }>>(
  '/lens/metadata', apiKey
)
if (Array.isArray(existingLenses)) {
  for (const lens of existingLenses) {
    await apiPost('/lens/delete', apiKey, { lens_id: lens.lens_id }).catch(() => {})
  }
}

// Register a fresh lens
const registeredLens = await apiPost<{ lens_id: string }>(
  '/lens/register', apiKey, { lens_config: lensConfig }
)
const lensId = registeredLens.lens_id
```
### Step 3: Create session and wait for ready

```typescript
const session = await apiPost<{ session_id: string }>(
  '/lens/sessions/create', apiKey, { lens_id: lensId }
)
const sessionId = session.session_id

// Optionally delete the lens definition (the session keeps running independently)
await apiPost('/lens/delete', apiKey, { lens_id: lensId })

// Wait for the session to be ready
async function waitForSessionReady(sessionId: string, maxWaitMs = 60000): Promise<boolean> {
  const start = Date.now()
  while (Date.now() - start < maxWaitMs) {
    const status = await apiPost<{ session_status: string }>(
      '/lens/sessions/events/process', apiKey,
      { session_id: sessionId, event: { type: 'session.status' } },
      10000
    )
    if (status.session_status === 'LensSessionStatus.SESSION_STATUS_RUNNING' ||
        status.session_status === '3') return true
    if (status.session_status === 'LensSessionStatus.SESSION_STATUS_FAILED' ||
        status.session_status === '6') return false
    await new Promise(r => setTimeout(r, 500))
  }
  return false
}

const isReady = await waitForSessionReady(sessionId)
if (!isReady) throw new Error('Session failed to initialize')
```
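The readiness check above accepts either the enum name or its numeric string form, in two places in this skill. A small predicate pair keeps that comparison in one spot (the helper names are illustrative, not part of the Archetype API):

```javascript
// The session status API may report either the enum name or its numeric
// string form; treat both spellings as equivalent.
function isSessionRunning(status) {
  return status === 'LensSessionStatus.SESSION_STATUS_RUNNING' || status === '3';
}

function isSessionFailed(status) {
  return status === 'LensSessionStatus.SESSION_STATUS_FAILED' || status === '6';
}

console.log(isSessionRunning('3')); // → true
console.log(isSessionFailed('6'));  // → true
```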
### Step 4: Consume SSE results

```typescript
import { fetchEventSource } from '@microsoft/fetch-event-source'

const results = []
const abortController = new AbortController()

fetchEventSource(`${API_ENDPOINT}/lens/sessions/consumer/${sessionId}`, {
  headers: { Authorization: `Bearer ${apiKey}` },
  signal: abortController.signal,
  onmessage(event) {
    const parsed = JSON.parse(event.data)

    if (parsed.type === 'inference.result') {
      const response = parsed.event_data.response
      const metadata = parsed.event_data.query_metadata
      const text = Array.isArray(response) ? response[0] : response
      const timestamp = metadata?.sensor_timestamp ?? 'N/A'

      results.push({ timestamp, frameIds: metadata?.frame_ids ?? [], response: text })
      console.log(`[${timestamp}] ${text}`)
    }

    if (parsed.type === 'sse.stream.end') {
      console.log(`Complete. ${results.length} results collected.`)
      abortController.abort()
    }
  },
})
```
### Step 5: Cleanup

```typescript
abortController.abort()
await apiPost('/lens/sessions/destroy', apiKey, { session_id: sessionId })
```
### Plain JS Lifecycle Summary

```
1. Upload video          -> POST /files (FormData)
2. List existing lenses  -> GET /lens/metadata
3. Delete stale lenses   -> POST /lens/delete { lens_id } (delete every existing lens)
4. Register lens         -> POST /lens/register { lens_config with video_file_reader }
5. Create session        -> POST /lens/sessions/create { lens_id }
6. Wait for ready        -> POST /lens/sessions/events/process (poll session.status)
7. Consume SSE results   -> GET /lens/sessions/consumer/{sessionId}
8. Destroy session       -> POST /lens/sessions/destroy { session_id }
```

---
## Approach C: Python

### Requirements

- `archetypeai` Python package
- Environment variables: `ATAI_API_KEY`, optionally `ATAI_API_ENDPOINT`
657
+
658
+ ```python
659
+ import os
660
+ import time
661
+ from archetypeai.api_client import ArchetypeAI
662
+
663
+ api_key = os.getenv("ATAI_API_KEY")
664
+ api_endpoint = os.getenv("ATAI_API_ENDPOINT", ArchetypeAI.get_default_endpoint())
665
+ client = ArchetypeAI(api_key, api_endpoint=api_endpoint)
666
+
667
+ # 1. Upload the video file
668
+ upload_resp = client.files.local.upload("my_video.mp4")
669
+ file_id = upload_resp["file_id"]
670
+ print(f"Uploaded: {file_id}")
671
+
672
+ # 2. Lens config with video_file_reader input stream
673
+ lens_config = {
674
+ "lens_name": "activity-monitor-video-lens",
675
+ "lens_config": {
676
+ "model_pipeline": [
677
+ {"processor_name": "lens_camera_processor", "processor_config": {}}
678
+ ],
679
+ "model_parameters": {
680
+ "model_version": "Newton::c2_4_7b_251215a172f6d7",
681
+ "instruction": "Describe the activity currently being performed in one concise sentence.",
682
+ "focus": "Focus on the main person and their current action.",
683
+ "temporal_focus": 5,
684
+ "max_new_tokens": 512,
685
+ "camera_buffer_size": 5,
686
+ "camera_buffer_step_size": 5,
687
+ "memory_prompt_buffer_size": 0,
688
+ },
689
+ "input_streams": [
690
+ {
691
+ "stream_type": "video_file_reader",
692
+ "stream_config": {"file_id": file_id},
693
+ }
694
+ ],
695
+ "output_streams": [
696
+ {"stream_type": "server_sent_events_writer"}
697
+ ],
698
+ },
699
+ }
700
+
701
+ # 3. Register lens
702
+ lens = client.lens.register(lens_config)
703
+ lens_id = lens["lens_id"]
704
+
705
+ # 4. Create session
706
+ session = client.lens.sessions.create(lens_id)
707
+ session_id = session["session_id"]
708
+
709
+ # Optionally delete the lens (session runs independently)
710
+ client.lens.delete(lens_id)
711
+
712
+ # 5. Wait for session ready
713
+ for _ in range(60):
714
+ try:
715
+ status = client.lens.sessions.process_event(
716
+ session_id, {"type": "session.status"}
717
+ )
718
+ if status.get("session_status") in ["3", "LensSessionStatus.SESSION_STATUS_RUNNING"]:
719
+ break
720
+ except Exception:
721
+ pass
722
+ time.sleep(0.5)
723
+
724
+ # 6. Consume SSE results (server drives video processing)
725
+ sse_reader = client.lens.sessions.create_sse_consumer(
726
+ session_id, max_read_time_sec=600
727
+ )
728
+
729
+ results = []
730
+ for event in sse_reader.read(block=True):
731
+ if isinstance(event, dict) and event.get("type") == "inference.result":
732
+ ed = event.get("event_data", {})
733
+ response = ed.get("response", "")
734
+ meta = ed.get("query_metadata", {})
735
+
736
+ text = response[0] if isinstance(response, list) else response
737
+
738
+ results.append({
739
+ "timestamp": meta.get("sensor_timestamp", "N/A"),
740
+ "frame_ids": meta.get("frame_ids", []),
741
+ "response": text,
742
+ })
743
+ print(f"[{meta.get('sensor_timestamp', '?')}] {text}")
744
+
745
+ sse_reader.close()
746
+ print(f"Done. {len(results)} results collected.")
747
+
748
+ # 7. Cleanup
749
+ client.lens.sessions.destroy(session_id)
750
+ ```
751
+
752
+ ---
753
+
## SSE Event Types

| Event type | Description |
|---|---|
| `inference.result` | A model response with `response[]` text and `query_metadata` |
| `sse.stream.end` | The session has finished processing the entire video |

### `inference.result` payload

```json
{
  "type": "inference.result",
  "event_data": {
    "response": ["Person is stirring a pot on the stove."],
    "query_id": "qry-abc123",
    "query_metadata": {
      "sensor_timestamp": "00:02:15",
      "frame_ids": [270, 275, 280, 285, 290, 295, 300, 305]
    }
  }
}
```

- `sensor_timestamp`: position in the video (HH:MM:SS)
- `frame_ids`: the specific video frames the model analyzed
- `response`: array of model output strings (typically one element)
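As a sketch of consuming this payload, the fields above can be pulled out with a small helper (the function name and the inlined sample event are illustrative):

```javascript
// Sample inference.result event, matching the payload shape shown above.
const event = {
  type: 'inference.result',
  event_data: {
    response: ['Person is stirring a pot on the stove.'],
    query_id: 'qry-abc123',
    query_metadata: {
      sensor_timestamp: '00:02:15',
      frame_ids: [270, 275, 280, 285, 290, 295, 300, 305]
    }
  }
};

// Pull out the display text, video position, and analyzed frames.
function extractResult(parsed) {
  const r = parsed.event_data.response;
  return {
    text: Array.isArray(r) ? r[0] : r, // response is usually a one-element array
    timestamp: parsed.event_data.query_metadata?.sensor_timestamp ?? 'N/A',
    frameIds: parsed.event_data.query_metadata?.frame_ids ?? []
  };
}

const { text, timestamp } = extractResult(event);
console.log(`[${timestamp}] ${text}`); // → [00:02:15] Person is stirring a pot on the stove.
```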
---
## Lens Configuration Reference

### Model Parameters

| Parameter | Description | Default |
|---|---|---|
| `model_version` | Newton model ID | `Newton::c2_4_7b_251215a172f6d7` |
| `instruction` | Main prompt guiding output format | *(user-defined)* |
| `focus` | What to look for in the video | *(user-defined)* |
| `temporal_focus` | Temporal window in seconds. Small (<5 s) = fine-grained, large (>5 s) = macro | `5` |
| `camera_buffer_size` | Number of frames to buffer before processing | `5` |
| `camera_buffer_step_size` | Step size for frame sampling from the buffer | `5` |
| `memory_prompt_buffer_size` | How many prior prompts to retain for context (0 = stateless) | `0` |
| `max_new_tokens` | Max tokens for model inference output | `512` |

### Input Stream Types

**Pre-recorded video file (this skill):**
```json
{ "stream_type": "video_file_reader", "stream_config": { "file_id": "<uploaded_file_id>" } }
```

**Live RTSP camera stream:**
```json
{ "stream_type": "rtsp_video_reader", "stream_config": { "rtsp_url": "rtsp://example.com:554/stream", "target_image_size": [360, 640], "target_frame_rate_hz": 1.0 } }
```
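Combining the parameter table and stream types above, a helper can assemble the complete lens config for an uploaded file. This is a sketch: `buildVideoLensConfig` is an illustrative name, not part of the Archetype API, and the instruction/focus strings are the sample prompts used elsewhere in this skill:

```javascript
// Assemble a full lens config for video_file_reader input and SSE output,
// using the documented defaults, with per-call model-parameter overrides.
function buildVideoLensConfig(fileId, overrides = {}) {
  return {
    lens_name: 'activity-monitor-video-lens',
    lens_config: {
      model_pipeline: [{ processor_name: 'lens_camera_processor', processor_config: {} }],
      model_parameters: {
        model_version: 'Newton::c2_4_7b_251215a172f6d7',
        instruction: 'Describe the activity currently being performed in one concise sentence.',
        focus: 'Focus on the main person and their current action.',
        temporal_focus: 5,
        max_new_tokens: 512,
        camera_buffer_size: 5,
        camera_buffer_step_size: 5,
        memory_prompt_buffer_size: 0,
        ...overrides
      },
      input_streams: [{ stream_type: 'video_file_reader', stream_config: { file_id: fileId } }],
      output_streams: [{ stream_type: 'server_sent_events_writer' }]
    }
  };
}

// Example: wider temporal window for macro-level activity summaries.
const cfg = buildVideoLensConfig('file-123', { temporal_focus: 10 });
```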
## Troubleshooting

- **"Input stream is unhealthy!" / "Failed to load video"**: The uploaded video is likely fragmented MP4 (from MediaRecorder) or WebM. The `video_file_reader` requires standard MP4 with a moov atom. Use the Server Proxy approach, which converts automatically via ffmpeg, or convert manually before uploading.
- **"Input stream is unhealthy!" (stale lens)**: Delete **ALL** existing lenses before registering a new one (not just name-matched ones). Stale lenses from previous sessions can block new registrations even under different names. All approaches above include aggressive lens cleanup via `GET /lens/metadata` → delete every lens.
- **CORS errors in the browser**: The Archetype AI API does not support direct browser CORS. Use the Server Proxy approach or run from Node.js/Python server-side.
- **No SSE events**: Ensure the session reached `SESSION_STATUS_RUNNING` before connecting the SSE consumer. All approaches above include session-ready polling.
- **Session fails**: Check that the `file_id` in the `video_file_reader` config matches the uploaded file.
- **Video too long**: Use `max_run_time_sec` or adjust `temporal_focus` for faster processing.