@archetypeai/ds-cli 0.3.7 → 0.3.10

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (28)
  1. package/README.md +25 -67
  2. package/commands/create.js +5 -27
  3. package/commands/init.js +5 -27
  4. package/files/AGENTS.md +19 -3
  5. package/files/CLAUDE.md +21 -3
  6. package/files/rules/accessibility.md +49 -0
  7. package/files/rules/frontend-architecture.md +77 -0
  8. package/files/skills/apply-ds/SKILL.md +92 -80
  9. package/files/skills/apply-ds/scripts/audit.sh +169 -0
  10. package/files/skills/apply-ds/scripts/setup.sh +48 -166
  11. package/files/skills/create-dashboard/SKILL.md +12 -0
  12. package/files/skills/embedding-from-file/SKILL.md +415 -0
  13. package/files/skills/embedding-from-sensor/SKILL.md +406 -0
  14. package/files/skills/embedding-upload/SKILL.md +414 -0
  15. package/files/skills/fix-accessibility/SKILL.md +57 -9
  16. package/files/skills/newton-activity-monitor-lens-on-video/SKILL.md +817 -0
  17. package/files/skills/newton-camera-frame-analysis/SKILL.md +611 -0
  18. package/files/skills/newton-camera-frame-analysis/scripts/activity-monitor-frame.py +165 -0
  19. package/files/skills/newton-camera-frame-analysis/scripts/captures/logs/api_responses_20260206_105610.json +62 -0
  20. package/files/skills/newton-camera-frame-analysis/scripts/continuous_monitor.py +119 -0
  21. package/files/skills/newton-direct-query/SKILL.md +212 -0
  22. package/files/skills/newton-direct-query/scripts/direct_query.py +129 -0
  23. package/files/skills/newton-machine-state-from-file/SKILL.md +545 -0
  24. package/files/skills/newton-machine-state-from-sensor/SKILL.md +707 -0
  25. package/files/skills/newton-machine-state-upload/SKILL.md +986 -0
  26. package/lib/add-ds-ui-svelte.js +5 -2
  27. package/lib/scaffold-ds-svelte-project.js +25 -18
  28. package/package.json +13 -2
@@ -0,0 +1,611 @@
---
name: newton-camera-frame-analysis
description: Live webcam frame analysis using Newton's vision model via model.query (request/response). Captures frames from a webcam as base64 JPEG and sends them to Newton. Use for live camera analysis, scene description, presence detection, or visual Q&A. NOT for video file uploads — use /activity-monitor-lens-on-video for that.
argument-hint: [question] [camera_index]
allowed-tools: Bash(python *), Read
---

# Newton Camera Frame Analysis (Live Webcam → base64 → model.query)

Capture live webcam frames, encode as base64 JPEG, and send to Newton's vision model via `model.query` (synchronous request/response). Supports Python (OpenCV) and JavaScript (getUserMedia + canvas).

> **Frontend architecture:** When building a web UI for this skill, decompose into components (webcam input, status display, results view) rather than a monolithic page. Extract API logic into `$lib/api/`. See `@rules/frontend-architecture` for conventions and `@skills/create-dashboard` / `@skills/build-pattern` for layout and component patterns.

**This skill is for LIVE WEBCAM input only.** For analyzing uploaded video files, use `/activity-monitor-lens-on-video` instead.

| | This skill (camera-frame-analysis) | activity-monitor-lens-on-video |
|---|---|---|
| **Input** | Live webcam (base64 JPEG frames) | Uploaded video file |
| **Who captures frames** | Client (Python cv2 / JS canvas) | Server (`video_file_reader`) |
| **Event type** | `model.query` (request/response) | Server-driven, results via SSE |
| **Response** | Direct in POST response | Async via SSE stream |
| **Use case** | Real-time webcam Q&A | Batch video analysis |

---

## Model Parameters

| Parameter | Default | Notes |
|---|---|---|
| `model_version` | `Newton::c2_4_7b_251215a172f6d7` | Newton model ID |
| `template_name` | `image_qa_template_task` | Prompt template |
| `instruction` | *(user-provided)* | System prompt guiding output format |
| `focus` | *(user-provided)* | The question or what to look for |
| `max_new_tokens` | `512` | Max response length |
| `camera_buffer_size` | `1` | Single-frame buffer for webcam |
| `min_replicas` / `max_replicas` | `1` / `1` | Scaling config |

**IMPORTANT:** `instruction` and `focus` must be passed as parameters — not hardcoded. The values in the lens config (registration) and in each `model.query` event must be consistent. Pass the user's values into both.

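
To make that invariant concrete, one hedged way to keep the two call sites in sync is to build both payload fragments from a single shared dict (a sketch only; the field names mirror the tables above, the helper names are illustrative, not part of the SDK):

```python
# Sketch: derive both the lens registration parameters and the per-query
# payload from one shared dict so instruction/focus can never drift apart.
MODEL_DEFAULTS = {
    "model_version": "Newton::c2_4_7b_251215a172f6d7",
    "template_name": "image_qa_template_task",
    "max_new_tokens": 512,
}

def shared_params(instruction: str, focus: str) -> dict:
    return {**MODEL_DEFAULTS, "instruction": instruction, "focus": focus}

def lens_model_parameters(instruction: str, focus: str) -> dict:
    # Registration additionally carries buffer/replica settings.
    return {
        **shared_params(instruction, focus),
        "camera_buffer_size": 1,
        "min_replicas": 1,
        "max_replicas": 1,
    }

def query_event_data(instruction: str, focus: str, raw_base64: str) -> dict:
    # The model.query event carries the image payload instead.
    return {
        **shared_params(instruction, focus),
        "data": [{"type": "base64_img", "base64_img": raw_base64}],
    }
```

Anything built this way passes the same `instruction` and `focus` to both registration and query by construction.
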
---

## Python Implementation

### Requirements

- `archetypeai` Python package
- `opencv-python` (`cv2`), `Pillow`
- Environment variables: `ATAI_API_KEY` or `ARCHETYPE_API_KEY`

### Quick Start

```bash
export ATAI_API_KEY=your_key_here
python camera_frame_analysis.py "Describe what you see"

# Custom question
python camera_frame_analysis.py "Is anyone present?"

# Different camera
python camera_frame_analysis.py "Describe the scene" 1
```

### Parameters

- **question** (positional, optional): What to analyze (default: "Describe what you see")
- **camera_index** (positional, optional): Camera index (default: 0)

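
The two optional positionals above can be parsed with `argparse`; a minimal sketch matching the Quick Start usage (the script's actual argument handling may differ):

```python
import argparse

def parse_args(argv=None) -> argparse.Namespace:
    """Parse [question] [camera_index], both optional, as in the Quick Start."""
    parser = argparse.ArgumentParser(
        description="Newton live-webcam frame analysis")
    parser.add_argument("question", nargs="?",
                        default="Describe what you see",
                        help="What to analyze in the captured frame")
    parser.add_argument("camera_index", nargs="?", type=int, default=0,
                        help="OpenCV camera index (0 = default webcam)")
    return parser.parse_args(argv)
```

For example, `python camera_frame_analysis.py "Describe the scene" 1` yields `question="Describe the scene"`, `camera_index=1`.
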
### How It Works

1. **Capture**: Opens webcam with OpenCV, reads a frame
2. **Encode**: Converts frame to base64 JPEG (BGR → RGB → PIL → JPEG → base64)
3. **Setup**: Registers Newton lens, creates session, waits for ready
4. **Initialize**: Sends `session.modify` to initialize the processor
5. **Query**: Sends base64 image as `model.query` event, gets response directly
6. **Cleanup**: Destroys session

### Webcam Capture → base64

```python
import cv2
import base64
import io
from PIL import Image

def capture_frame_base64(camera_index=0, jpeg_quality=80, resize=(640, 480)):
    """Capture a webcam frame and return it as a raw base64 JPEG string."""
    cap = cv2.VideoCapture(camera_index)
    if not cap.isOpened():
        raise RuntimeError(f"Cannot open camera {camera_index}")
    ret, frame = cap.read()
    cap.release()

    if not ret:
        raise RuntimeError(f"Failed to capture from camera {camera_index}")

    # Resize if needed
    h, w = frame.shape[:2]
    if (w, h) != resize:
        frame = cv2.resize(frame, resize)

    # BGR (OpenCV) → RGB → PIL → JPEG → base64
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    pil_image = Image.fromarray(rgb_frame)

    buffer = io.BytesIO()
    pil_image.save(buffer, format="JPEG", quality=jpeg_quality)
    raw_base64 = base64.b64encode(buffer.getvalue()).decode()

    return raw_base64  # No "data:image/jpeg;base64," prefix
```

### Full Python Example

```python
import os
import time
from archetypeai.api_client import ArchetypeAI

api_key = os.getenv("ATAI_API_KEY") or os.getenv("ARCHETYPE_API_KEY")
client = ArchetypeAI(api_key)

# --- User-provided values (NOT hardcoded) ---
instruction = "Answer the following question about the image:"
focus = "Describe what you see in this image."

def build_lens_config(instruction: str, focus: str) -> dict:
    """Build lens config with user-provided instruction and focus."""
    return {
        "lens_name": "camera-frame-capture-lens",
        "lens_config": {
            "model_pipeline": [
                {"processor_name": "lens_camera_processor", "processor_config": {}}
            ],
            "model_parameters": {
                "model_version": "Newton::c2_4_7b_251215a172f6d7",
                "template_name": "image_qa_template_task",
                "instruction": instruction,
                "focus": focus,
                "max_new_tokens": 512,
                "camera_buffer_size": 1,
                "min_replicas": 1,
                "max_replicas": 1,
            },
        },
    }

def build_query_event(raw_base64: str, instruction: str, focus: str) -> dict:
    """Build a model.query event with the SAME instruction and focus as the lens config."""
    return {
        "type": "model.query",
        "event_data": {
            "model_version": "Newton::c2_4_7b_251215a172f6d7",
            "template_name": "image_qa_template_task",
            "instruction": instruction,
            "focus": focus,
            "max_new_tokens": 512,
            "data": [{"type": "base64_img", "base64_img": raw_base64}],
        },
    }

# 1. Register lens (pass user's instruction + focus)
lens_config = build_lens_config(instruction, focus)
lens = client.lens.register(lens_config)
lens_id = lens["lens_id"]

# 2. Create session
session = client.lens.sessions.create(lens_id)
session_id = session["session_id"]

# 3. Wait for session ready
for _ in range(60):
    try:
        status = client.lens.sessions.process_event(
            session_id, {"type": "session.status"}
        )
        if status.get("session_status") in ["3", "LensSessionStatus.SESSION_STATUS_RUNNING"]:
            break
    except Exception:
        pass
    time.sleep(0.5)

# 4. Initialize processor (REQUIRED)
client.lens.sessions.process_event(session_id, {
    "type": "session.modify",
    "event_data": {"camera_buffer_size": 1}
})

# 5. Capture frame (capture_frame_base64 from the previous section) and send
#    as model.query with the same instruction + focus
raw_base64 = capture_frame_base64(camera_index=0)
event = build_query_event(raw_base64, instruction, focus)

response = client.lens.sessions.process_event(session_id, event)

if response.get("type") == "model.query.response":
    result = response["event_data"]["response"]
    if isinstance(result, list):
        result = result[0]
    print(f"Answer: {result}")

# 6. Cleanup
client.lens.sessions.destroy(session_id)
```

---

## Web / JavaScript Implementation

Uses direct `fetch` calls to the Archetype AI REST API.

### Requirements

- Browser with `getUserMedia` support (webcam access)
- HTTPS (required for camera access, except `localhost`)

### API Reference

| Operation | Method | Endpoint | Body |
|-----------|--------|----------|------|
| List lenses | GET | `/lens/metadata` | — |
| Register lens | POST | `/lens/register` | `{ lens_config: config }` |
| Delete lens | POST | `/lens/delete` | `{ lens_id }` |
| Create session | POST | `/lens/sessions/create` | `{ lens_id }` |
| Process event | POST | `/lens/sessions/events/process` | `{ session_id, event }` |
| Destroy session | POST | `/lens/sessions/destroy` | `{ session_id }` |

### Helpers: API wrappers

```typescript
const API_ENDPOINT = 'https://api.u1.archetypeai.app/v0.5'

async function apiGet<T>(path: string, apiKey: string): Promise<T> {
  const response = await fetch(`${API_ENDPOINT}${path}`, {
    method: 'GET',
    headers: { Authorization: `Bearer ${apiKey}` },
  })
  if (!response.ok) throw new Error(`API GET ${path} failed: ${response.status}`)
  return response.json()
}

async function apiPost<T>(path: string, apiKey: string, body: unknown, timeoutMs = 5000): Promise<T> {
  const controller = new AbortController()
  const timeoutId = setTimeout(() => controller.abort(), timeoutMs)

  try {
    const response = await fetch(`${API_ENDPOINT}${path}`, {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
      signal: controller.signal,
    })

    if (!response.ok) {
      const errorBody = await response.json().catch(() => ({}))
      throw new Error(`API POST ${path} failed: ${response.status} - ${JSON.stringify(errorBody)}`)
    }

    return response.json()
  } finally {
    clearTimeout(timeoutId)
  }
}
```

### Step 1: Find or create the lens (clean up stale lenses)

A stale lens from a previous run causes `"Input stream is unhealthy!"` errors. Always check for an existing lens with the same name and delete it before registering a fresh one.

**Pass the user's `instruction` and `focus` into the lens config** — do not hardcode them.

```typescript
const LENS_NAME = 'camera-frame-capture-lens'

// --- User-provided values (NOT hardcoded) ---
const instruction = 'Answer the following question about the image:'
const focus = 'Describe what you see in this image.'

function buildLensConfig(instruction: string, focus: string) {
  return {
    lens_name: LENS_NAME,
    lens_config: {
      model_pipeline: [
        { processor_name: 'lens_camera_processor', processor_config: {} },
      ],
      model_parameters: {
        model_version: 'Newton::c2_4_7b_251215a172f6d7',
        template_name: 'image_qa_template_task',
        instruction,
        focus,
        max_new_tokens: 512,
        camera_buffer_size: 1,
        min_replicas: 1,
        max_replicas: 1,
      },
    },
  }
}

// Delete any existing lens with the same name to avoid stale state
const existingLenses = await apiGet<Array<{ lens_id: string; lens_name: string }>>(
  '/lens/metadata', apiKey
)
const staleLens = existingLenses.find(l => l.lens_name === LENS_NAME)
if (staleLens) {
  console.log('Deleting stale lens:', staleLens.lens_id)
  await apiPost('/lens/delete', apiKey, { lens_id: staleLens.lens_id })
}

// Register fresh lens with user's instruction + focus
const lensConfig = buildLensConfig(instruction, focus)
const registeredLens = await apiPost<{ lens_id: string }>(
  '/lens/register', apiKey, { lens_config: lensConfig }
)
const lensId = registeredLens.lens_id
```

### Step 2: Create session and wait for ready

```typescript
const session = await apiPost<{ session_id: string; session_endpoint: string }>(
  '/lens/sessions/create', apiKey, { lens_id: lensId }
)
const sessionId = session.session_id

// Wait for session to be ready (poll until status = running)
async function waitForSessionReady(sessionId: string, maxWaitMs = 30000): Promise<boolean> {
  const start = Date.now()
  while (Date.now() - start < maxWaitMs) {
    const status = await apiPost<{ session_status: string }>(
      '/lens/sessions/events/process', apiKey,
      { session_id: sessionId, event: { type: 'session.status' } },
      10000
    )
    if (status.session_status === 'LensSessionStatus.SESSION_STATUS_RUNNING' ||
        status.session_status === '3') {
      return true
    }
    if (status.session_status === 'LensSessionStatus.SESSION_STATUS_FAILED' ||
        status.session_status === '6') {
      return false
    }
    await new Promise(r => setTimeout(r, 500))
  }
  return false
}

const isReady = await waitForSessionReady(sessionId)
if (!isReady) throw new Error('Session failed to initialize')
```

### Step 3: Initialize the processor (REQUIRED for lens_camera_processor)

This sends a `session.modify` event that triggers `update_lens_params()`, which initializes `video_narrator_memory`. **Without this step, inference will fail.**

```typescript
await apiPost('/lens/sessions/events/process', apiKey, {
  session_id: sessionId,
  event: {
    type: 'session.modify',
    event_data: {
      camera_buffer_size: 1,
    },
  },
}, 30000) // 30s timeout for initialization
```

### Step 4: Start webcam and capture frames as base64

#### 4a. Create a video element

```html
<!-- Visible preview (optional) -->
<video id="webcam" autoplay playsinline muted></video>

<!-- Or create it in JS (no visible preview) -->
```

```typescript
// Pick ONE of the two options — both bind the same `video` variable.

// Option A: Reference an existing <video> element
const video = document.getElementById('webcam') as HTMLVideoElement

// Option B: Create a hidden video element in JS
// const video = document.createElement('video')
// video.autoplay = true
// video.playsInline = true // Required for iOS
// video.muted = true
```

#### 4b. Request camera access and start the stream

```typescript
async function startCamera(
  preferredWidth = 640,
  preferredHeight = 480,
  facingMode: 'user' | 'environment' = 'user', // 'user' = front, 'environment' = back
): Promise<MediaStream> {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: {
      width: { ideal: preferredWidth },
      height: { ideal: preferredHeight },
      facingMode,
    },
    audio: false,
  })

  video.srcObject = stream

  // Wait until video is actually playing and has dimensions
  await new Promise<void>((resolve) => {
    video.onloadedmetadata = () => {
      video.play()
      resolve()
    }
  })

  console.log(`Camera started: ${video.videoWidth}x${video.videoHeight}`)
  return stream
}

const stream = await startCamera()
```

**Permission notes:**

- Browser will show a permission prompt on first call
- HTTPS is **required** (except `localhost`)
- On mobile, `facingMode: 'environment'` selects the rear camera

#### 4c. Capture a frame as base64 JPEG

The flow is: **video element → canvas → toDataURL → base64 string**.

```typescript
function captureFrame(quality = 0.8): string | null {
  if (!video.videoWidth || !video.videoHeight) return null

  const canvas = document.createElement('canvas')
  canvas.width = video.videoWidth
  canvas.height = video.videoHeight

  const ctx = canvas.getContext('2d')
  if (!ctx) return null

  // Draw current video frame onto canvas
  ctx.drawImage(video, 0, 0)

  // Convert to base64 JPEG — returns "data:image/jpeg;base64,/9j/4AAQ..."
  return canvas.toDataURL('image/jpeg', quality)
}
```

**Quality vs size tradeoffs:**

| Quality | ~Size (640x480) | Use case |
|---------|-----------------|----------|
| `0.5` | ~20-30 KB | Fast continuous streaming |
| `0.8` | ~40-60 KB | Good balance (recommended) |
| `1.0` | ~80-120 KB | Maximum detail |

#### 4d. Strip the data URI prefix before sending to the API

The API expects **raw base64**, not the `data:image/jpeg;base64,` prefix that `toDataURL` produces.

```typescript
function captureFrameRaw(quality = 0.8): string | null {
  const dataUri = captureFrame(quality)
  if (!dataUri) return null

  // Strip "data:image/jpeg;base64," prefix → raw base64
  return dataUri.replace(/^data:image\/\w+;base64,/, '')
}
```

This raw base64 string is what goes into the `model.query` event's `base64_img` field.

### Step 5: Send frames for analysis (model.query)

This uses **request/response — NOT SSE**. Each frame is sent as a `model.query` event and the response comes back directly in the POST response.

The `instruction` and `focus` in the `model.query` event **must match** the values used at lens registration. Pass them through — do not hardcode different values.

```typescript
function createModelQueryEvent(
  rawBase64Images: string[], // Already stripped of data URI prefix
  instruction: string, // Same as lens config
  focus: string, // Same as lens config
  modelVersion = 'Newton::c2_4_7b_251215a172f6d7',
  templateName = 'image_qa_template_task',
  maxNewTokens = 512,
) {
  return {
    type: 'model.query' as const,
    event_data: {
      model_version: modelVersion,
      template_name: templateName,
      instruction,
      focus,
      max_new_tokens: maxNewTokens,
      data: rawBase64Images.map(img => ({
        type: 'base64_img',
        base64_img: img,
      })),
    },
  }
}

// Send a frame and get the response (uses the same instruction + focus from Step 1)
async function analyzeFrame(instruction: string, focus: string): Promise<string> {
  const frame = captureFrameRaw() // Raw base64 (no data URI prefix)
  if (!frame) throw new Error('Failed to capture frame')

  const event = createModelQueryEvent([frame], instruction, focus)

  const response = await apiPost<{
    type: string
    event_data?: { response?: string | string[]; message?: string }
  }>(
    '/lens/sessions/events/process', apiKey,
    { session_id: sessionId, event },
    60000 // 60s timeout for model inference
  )

  // Extract text from response
  if (response.type === 'model.query.response' && response.event_data) {
    const text = response.event_data.response
    if (typeof text === 'string') return text
    if (Array.isArray(text)) return text.join('\n')
    return JSON.stringify(response.event_data)
  }

  return JSON.stringify(response)
}
```

### Step 6: Continuous capture loop

Send the first frame **immediately** after initialization — do not wait for the interval. The processor expects data promptly after `session.modify`.

```typescript
let isSending = false

async function captureAndSend() {
  if (isSending) {
    console.log('Previous request still in progress, skipping frame')
    return
  }

  isSending = true
  try {
    const result = await analyzeFrame(instruction, focus)
    console.log('Result:', result)
  } catch (error) {
    console.error('Frame analysis failed:', error)
  } finally {
    isSending = false
  }
}

// Send first frame immediately
captureAndSend()

// Then continue at 1 frame per second
const intervalId = setInterval(captureAndSend, 1000)
```
563
+
564
+ ### Step 7: Cleanup
565
+
566
+ ```typescript
567
+ // Stop capture loop
568
+ clearInterval(intervalId)
569
+
570
+ // Stop camera
571
+ stream.getTracks().forEach(track => track.stop())
572
+
573
+ // Destroy session
574
+ await apiPost('/lens/sessions/destroy', apiKey, { session_id: sessionId })
575
+ ```
576
+
577
+ ### Web Lifecycle Summary
578
+
579
+ ```
580
+ 1. List existing lenses -> GET /lens/metadata
581
+ 2. Delete stale lens -> POST /lens/delete { lens_id } (if same name exists)
582
+ 3. Register fresh lens -> POST /lens/register { lens_config: config }
583
+ 4. Create session -> POST /lens/sessions/create { lens_id }
584
+ 5. Wait for ready -> POST /lens/sessions/events/process (poll session.status)
585
+ 6. Initialize processor -> POST /lens/sessions/events/process { session_id, event: session.modify }
586
+ 7. Start webcam -> navigator.mediaDevices.getUserMedia()
587
+ 8. Send first frame NOW -> POST /lens/sessions/events/process { session_id, event: model.query }
588
+ 9. Capture loop (1fps) -> POST /lens/sessions/events/process { session_id, event: model.query }
589
+ 10. Stop camera -> stream.getTracks().forEach(t => t.stop())
590
+ 11. Destroy session -> POST /lens/sessions/destroy { session_id }
591
+ ```
592
+
---

## Use Cases

- **Quick scene analysis**: Single frame description
- **Presence detection**: Check if someone is at their desk
- **Safety monitoring**: Verify safety equipment usage
- **Object identification**: Identify specific items in view
- **Continuous monitoring**: Stream frames with periodic analysis

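
The continuous-monitoring case can also be driven from Python; a hedged sketch of a fixed-cadence loop with injectable callables (the helper is illustrative, not part of the SDK — plug in `capture_frame_base64` and a wrapper around `client.lens.sessions.process_event` from the Python example):

```python
import time

def monitor(capture, send, interval_s=1.0, max_frames=None):
    """Call capture() then send(frame) at a fixed cadence.

    Sleeps only the remainder of each interval so per-frame latency does not
    stretch the cadence; per-frame failures are recorded but do not stop the loop.
    """
    results = []
    sent = 0
    while max_frames is None or sent < max_frames:
        start = time.monotonic()
        try:
            results.append(send(capture()))
        except Exception as exc:  # Keep the loop alive on transient failures
            results.append(exc)
        sent += 1
        time.sleep(max(0.0, interval_s - (time.monotonic() - start)))
    return results
```

This mirrors the JavaScript loop in Step 6 (single in-flight request, fixed interval) for the Python side.
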
## Troubleshooting

- **"Input stream is unhealthy!"**: Stale lens from a previous run. Always delete the existing lens before registering a new one (see Step 1).
- **Camera not found**: Try different camera indices (Python) or check browser permissions (Web)
- **API errors**: Verify the API key is set correctly
- **Session fails**: Ensure `session.modify` (Step 3) is called before sending queries
- **Timeout on inference**: Model queries can take 10-30s; use a 60s timeout
- **Frame too large**: Use JPEG encoding with quality 0.8 to reduce payload size
- **Requests overlap**: Gate with the `isSending` flag to skip frames while a previous request is in flight
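
For the transient failures above (inference timeouts, sessions still warming up), one hedged option is a small retry helper with exponential backoff (illustrative, not part of the SDK):

```python
import time

def with_retries(fn, attempts=3, base_delay_s=0.5):
    """Run fn(); on failure, retry with exponential backoff (0.5s, 1s, 2s, ...)."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error
            time.sleep(base_delay_s * (2 ** attempt))
```

For example, wrap the query from the Python example as `with_retries(lambda: client.lens.sessions.process_event(session_id, event))`.
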