tanuki-telemetry 1.3.6 → 1.3.7
- package/package.json +1 -1
- package/skills/compare-image.md +120 -264
package/package.json
CHANGED
package/skills/compare-image.md
CHANGED
@@ -1,32 +1,29 @@
 # Compare Image — Visual Diff with Qualitative Annotations
 
-Compare two sets of images (reference vs
+Compare two sets of images (reference vs actual) with pixel-diff heatmaps and qualitative callouts. Works for any before/after image comparison: UI screenshots, design mockups, rendered templates, chart output, etc.
 
 ## Usage
 
 ```
-/compare-image <
-/compare-image
-/compare-image
-/compare-image
+/compare-image <ref-dir> <actual-dir>
+/compare-image ./mockups ./screenshots
+/compare-image --ref ./expected --actual ./output --session-id=<existing-session>
+/compare-image --ref ./v1-screenshots --actual ./v2-screenshots --output-dir=./diffs
 ```
 
 **Arguments:**
-- `
+- `ref-dir`: Directory containing reference (expected) images — PNGs, numbered or named.
+- `actual-dir`: Directory containing actual (generated/current) images to compare against.
 - `--session-id=<id>`: Attach to an existing telemetry session instead of creating a new one.
-- `--
-- `--
+- `--output-dir=<path>`: Override output directory (default: `$TANUKI_OUTPUTS/comparisons/`)
+- `--label=<name>`: Label for this comparison set (default: derived from directory names).
 
 ---
 
 ## Prerequisites
 
-- **
-- **
-- **Inngest dev server** on `localhost:8288` (`yarn start-inngest`)
-- **LibreOffice** installed (`soffice` on PATH)
-- **Python packages:** `fitz` (PyMuPDF), `PIL` (Pillow), `numpy`
-- **agent-browser** via `npx agent-browser`
+- **Python packages:** `fitz` (PyMuPDF — only if comparing PDFs), `PIL` (Pillow), `numpy`
+- **agent-browser** via `npx agent-browser` (only if capturing live screenshots)
 
 ---
 
@@ -34,134 +31,83 @@ Compare two sets of images (reference vs generated) with pixel-diff heatmaps and
 
 ### Phase 1: Setup & Discovery
 
-1. **Parse arguments** — extract
+1. **Parse arguments** — extract directories and flags.
 2. **Create telemetry session** (unless `--session-id` provided):
    ```
    mcp__telemetry__log_session_start({ worktree_name: "image-comparison-<date>" })
    ```
-3. **
-
-
-
-
-   ```
-4. **Find existing presentations** (if `--skip-generation`):
-   ```sql
-   SELECT p.id, p.title, p.template_id, p.generation_status,
-     (SELECT count(*) FROM slide s WHERE s.presentation_id = p.id) as slide_count
-   FROM presentation p
-   WHERE p.template_id = '<template-id>' AND p.generation_status = 'completed'
-   ORDER BY p.created_at DESC LIMIT 1;
-   ```
-5. **Log event** for each template found.
-
-### Phase 2: Render Reference Slides (from PPTX)
-
-For each template:
-
-1. **Download PPTX** from Supabase storage:
-   ```bash
-   SERVICE_KEY=$(yarn supabase status 2>/dev/null | grep 'service_role' | awk '{print $NF}')
-   curl -s -o /tmp/compare/<name>.pptx \
-     "http://127.0.0.1:54321/storage/v1/object/presentations/<source_file_path>" \
-     -H "Authorization: Bearer $SERVICE_KEY"
-   ```
+3. **Discover image pairs** — match reference and actual images by filename or index:
+   - Sort both directories by filename
+   - Pair them 1:1 (ref-01.png ↔ actual-01.png, or by matching name stems)
+   - Report any unmatched images
+4. **Log event** with pair count and any mismatches.
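The pairing rules in step 3 can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: `discover_pairs` is a hypothetical helper, and the choice to try matching name stems first and fall back to sorted position only when no stems match is an assumption.

```python
from pathlib import Path


def discover_pairs(ref_dir, actual_dir):
    """Pair PNGs from two directories.

    Returns (pairs, unmatched_ref, unmatched_actual). Hypothetical helper.
    """
    ref_files = sorted(Path(ref_dir).glob("*.png"))
    actual_files = sorted(Path(actual_dir).glob("*.png"))

    # First try matching by filename stem (home.png <-> home.png).
    actual_by_stem = {p.stem: p for p in actual_files}
    pairs = [(r, actual_by_stem[r.stem]) for r in ref_files if r.stem in actual_by_stem]

    if not pairs:
        # No shared stems (ref-01.png vs actual-01.png): pair by sorted position.
        pairs = list(zip(ref_files, actual_files))

    paired_ref = {r for r, _ in pairs}
    paired_actual = {a for _, a in pairs}
    unmatched_ref = [p for p in ref_files if p not in paired_ref]
    unmatched_actual = [p for p in actual_files if p not in paired_actual]
    return pairs, unmatched_ref, unmatched_actual
```

The two unmatched lists are what step 4 would report as mismatches; the pair count is `len(pairs)`.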
 
-
-```bash
-soffice --headless --convert-to pdf --outdir /tmp/compare/ref-<name> /tmp/compare/<name>.pptx
-```
+### Phase 2: Prepare Reference Images
 
-
-```python
-import fitz
-doc = fitz.open(pdf_path)
-for i, page in enumerate(doc):
-    zoom = 1920 / page.rect.width
-    mat = fitz.Matrix(zoom, zoom)
-    pix = page.get_pixmap(matrix=mat)
-    pix.save(f'ref-{name}/slide-{i+1:02d}.png')
-```
+Depending on your source format, prepare reference PNGs:
 
-
+- **Already PNGs:** Use directly — no conversion needed.
+- **From PDF:** Render pages to PNGs via PyMuPDF:
+  ```python
+  import fitz
+  doc = fitz.open(pdf_path)
+  for i, page in enumerate(doc):
+      zoom = 1920 / page.rect.width
+      mat = fitz.Matrix(zoom, zoom)
+      pix = page.get_pixmap(matrix=mat)
+      pix.save(f'ref/image-{i+1:02d}.png')
+  ```
+- **From live URL:** Capture with agent-browser:
+  ```bash
+  npx agent-browser --url "http://localhost:3000/page" --width 1920 --height 1080 --output ref/page.png
+  ```
 
-### Phase 3:
+### Phase 3: Prepare Actual Images
 
-
+Same as Phase 2 — get actual/generated images as PNGs by whatever method fits your use case (screenshots, renders, exports, etc.).
 
-
+### Phase 4: Qualitative Analysis (visual review)
 
-
-2. **Viewport:** `npx agent-browser set viewport 1920 1080`
-3. **Navigate:** `npx agent-browser open http://localhost:3000/project/<projectId>/slides/<presentationId>`
-4. **Wait:** `npx agent-browser wait --load networkidle --timeout 15000` + `sleep 2`
-5. **Enter Present mode:**
-   ```bash
-   npx agent-browser snapshot -i | grep "Present" # find ref e.g. @e9
-   npx agent-browser click @e9 # click Present button
-   sleep 3
-   ```
-6. **Capture each slide:**
-   ```bash
-   # Slide 1 (already showing after entering Present mode)
-   SHOT=$(npx agent-browser screenshot | grep -o '/Users/.*\.png')
-   cp "$SHOT" gen-<name>/slide-01.png
-
-   # Slides 2–N: press ArrowRight to advance
-   for i in $(seq 2 $NUM_SLIDES); do
-     npx agent-browser press ArrowRight
-     sleep 1
-     SHOT=$(npx agent-browser screenshot | grep -o '/Users/.*\.png')
-     cp "$SHOT" gen-<name>/slide-$(printf '%02d' $i).png
-   done
-   ```
-7. **Verify uniqueness:** `md5 -q gen-<name>/*.png` — all hashes must differ. If duplicates found, re-capture the affected slides.
-8. **Exit Present mode:** `npx agent-browser press Escape`
-9. **Log event** per slide captured.
-
-### Phase 4: Qualitative Analysis (visual code review)
-
-For each slide pair, **read both images** and identify every meaningful difference. Think of this as a visual code review — call out specifics, not just "things changed."
+For each image pair, **read both images** and identify every meaningful difference. Think of this as a visual code review — call out specifics, not just "things changed."
 
 | Category | What to look for |
 |----------|-----------------|
-| **
-| **Text
-| **
-| **
-| **
-| **
-| **
-| **Color/style** | Background gradient, accent colors, border styles |
+| **Layout** | Element positioning, spacing, alignment, column/grid structure |
+| **Text** | Content differences, missing text, placeholder values, truncation |
+| **Images/icons** | Missing assets, wrong variants, broken renders, placeholder boxes |
+| **Color/style** | Background, accent colors, borders, gradients, opacity |
+| **Typography** | Font size, weight, color, line height changes |
+| **Data** | Missing values, wrong numbers, empty states |
+| **Chrome/UI** | Headers, footers, navigation, page numbers, timestamps |
 
 **Severity classification:**
-- **CRITICAL** (red): Missing content
-- **NOTABLE** (yellow):
-- **MINOR** (blue): Rendering differences — font antialiasing,
-- **GOOD** (green): Things that
+- **CRITICAL** (red): Missing content, broken layout, data that should exist but doesn't
+- **NOTABLE** (yellow): Important differences — content changes, removed elements, placeholder values
+- **MINOR** (blue): Rendering differences — font antialiasing, sub-pixel spacing, minor color shifts
+- **GOOD** (green): Things that match correctly — always include at least one positive finding per pair
 
-Build a `callouts` list for each
+Build a `callouts` list for each pair: `[{severity, title, details}]` (max 4 per image).
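One way to represent and check the `callouts` list is sketched below. The helper name `summarize_callouts` and the decision to raise an error on more than four callouts are assumptions made for illustration; the skill text only specifies the `{severity, title, details}` shape and the per-image cap. The lowercase severity strings follow the `<critical|notable|minor|good>` values used in the Phase 6 metadata.

```python
from collections import Counter

# Severity levels from most to least severe, as in the classification list.
SEVERITY_ORDER = ["critical", "notable", "minor", "good"]
MAX_CALLOUTS = 4  # per-image cap stated in the skill text


def summarize_callouts(callouts):
    """Validate a callouts list; return (highest_severity, finding_count)."""
    if len(callouts) > MAX_CALLOUTS:
        raise ValueError(f"at most {MAX_CALLOUTS} callouts per image, got {len(callouts)}")
    for c in callouts:
        if c["severity"] not in SEVERITY_ORDER:
            raise ValueError(f"unknown severity: {c['severity']!r}")

    counts = Counter(c["severity"] for c in callouts)
    finding_count = {level: counts.get(level, 0) for level in SEVERITY_ORDER}
    # The first level in SEVERITY_ORDER with a non-zero count is the highest.
    highest = next((lvl for lvl in SEVERITY_ORDER if finding_count[lvl]), None)
    return highest, finding_count


callouts = [
    {"severity": "critical", "title": "Chart data missing", "details": "..."},
    {"severity": "good", "title": "Header matches", "details": "..."},
]
highest, finding_count = summarize_callouts(callouts)
```

The two return values line up with the `highest_severity` and `finding_count` fields that the image-level summary event expects.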
 
 ### Phase 5: Generate Comparison Images
 
-Each comparison image has
+Each comparison image has three columns plus qualitative callout boxes.
 
-**Layout
+**Layout:**
 ```
-
-│ Title: "
-
-│ REFERENCE │ PIXEL DIFF HEATMAP
-│ (
-│ [green border] │ [red border]
-│ [533x450] │ [533x450]
-
-│ DIFFERENCES:
-│ ┌─CRITICAL────────┐ ┌─NOTABLE─────────┐ ┌─MINOR──────────┐ ┌─GOOD
-│ │ Title │ │ Title │ │ Title │ │ Title
-│ │ Details... │ │ Details... │ │ Details... │ │ Details
-│ └─────────────────┘ └─────────────────┘ └────────────────┘
-
+┌──────────────────────────────────────────────────────────────────────────┐
+│ Title: "Page 3 — Dashboard"                                 [DIFF 18.2%] │
+├────────────────────────┬──────────────────────┬──────────────────────────┤
+│ REFERENCE              │ PIXEL DIFF HEATMAP   │ ACTUAL                   │
+│ (Expected)             │ (red = changes)      │ (Current)                │
+│ [green border]         │ [red border]         │ [blue border]            │
+│ [533x450]              │ [533x450]            │ [533x450]                │
+├────────────────────────┴──────────────────────┴──────────────────────────┤
+│ DIFFERENCES:                                                             │
+│ ┌─CRITICAL────────┐ ┌─NOTABLE─────────┐ ┌─MINOR──────────┐ ┌─GOOD───┐│
+│ │ Title           │ │ Title           │ │ Title          │ │ Title   ││
+│ │ Details...      │ │ Details...      │ │ Details...     │ │ Details ││
+│ └─────────────────┘ └─────────────────┘ └────────────────┘ └─────────┘│
+└──────────────────────────────────────────────────────────────────────────┘
 ```
 
 **Python implementation** (Pillow + numpy):
@@ -211,7 +157,7 @@ def normalize_to_size(img, target_w, target_h, bg_color=(255, 255, 255)):
 
 def compute_diff_heatmap(ref, gen, threshold=25):
     """
-    Compute a red pixel-diff heatmap overlaid on the
+    Compute a red pixel-diff heatmap overlaid on the actual image.
     Returns (overlay_image, diff_percentage).
     """
     ref_arr = np.array(ref.convert("RGB"), dtype=np.float32)
@@ -237,22 +183,18 @@ def compute_diff_heatmap(ref, gen, threshold=25):
     return heatmap, diff_pct
 
 
-def create_comparison(ref_path, gen_path, out_path,
+def create_comparison(ref_path, gen_path, out_path, label, callouts):
     """
     Generate a full comparison image with 3 columns side by side:
-    REFERENCE | HEATMAP |
+    REFERENCE | HEATMAP | ACTUAL — all same height, equal width.
     Plus qualitative callout boxes below.
     """
-    # Normalize both images to exact same dimensions — no stretching, no black bars.
-    # Uses white bg to match typical slide backgrounds. Aspect ratio preserved.
     ref = normalize_to_size(Image.open(ref_path), COL_W, COL_H)
     gen = normalize_to_size(Image.open(gen_path), COL_W, COL_H)
 
-    # Compute heatmap at normalized size — both are now identical dimensions
     heatmap, diff_pct = compute_diff_heatmap(ref, gen)
-    heatmap_col = heatmap
+    heatmap_col = heatmap
 
-    # Canvas dimensions: 3 columns + 4 padding gaps
     content_w = COL_W * 3 + PAD * 4
     total_h = PAD + 40 + 24 + COL_H + PAD + 24 + CALLOUT_H + PAD
     canvas = Image.new("RGB", (content_w, total_h), (25, 25, 25))
@@ -261,7 +203,7 @@ def create_comparison(ref_path, gen_path, out_path, slide_label, callouts):
     y = PAD
 
     # --- Title bar + diff badge ---
-    draw.text((PAD, y),
+    draw.text((PAD, y), label, fill=(255, 255, 255), font=FONT_TITLE)
     badge_text = f"DIFF {diff_pct:.1f}%"
     if diff_pct < 5:
         badge_color = (40, 150, 40)
@@ -274,13 +216,13 @@ def create_comparison(ref_path, gen_path, out_path, slide_label, callouts):
     draw.text((badge_x + 8, y + 6), badge_text, fill=(255, 255, 255), font=FONT_LABEL)
     y += 44
 
-    # --- Column labels
+    # --- Column labels ---
     col1_x = PAD
     col2_x = PAD * 2 + COL_W
     col3_x = PAD * 3 + COL_W * 2
-    draw.text((col1_x, y), "REFERENCE (
+    draw.text((col1_x, y), "REFERENCE (Expected)", fill=(140, 200, 140), font=FONT_LABEL)
     draw.text((col2_x, y), "PIXEL DIFF HEATMAP", fill=(200, 120, 120), font=FONT_LABEL)
-    draw.text((col3_x, y), "
+    draw.text((col3_x, y), "ACTUAL (Current)", fill=(140, 160, 240), font=FONT_LABEL)
     y += 24
 
     # --- 3 images side by side ---
@@ -328,124 +270,72 @@ def create_comparison(ref_path, gen_path, out_path, slide_label, callouts):
 
 ### Phase 6: Upload to Telemetry (structured findings)
 
-Each
+Each image comparison produces telemetry artifacts: a screenshot, structured finding events per callout, and an image-level summary event.
 
-#### 6a. Screenshots per
+#### 6a. Screenshots per image pair
 
 **The comparison image is always the primary output:**
 ```
 mcp__telemetry__log_screenshot({
   session_id,
   phase: "verification",
-  description: "[COMPARISON] <
+  description: "[COMPARISON] <label> <N> — <highest severity>: <key finding>",
   file_path: "<absolute path to comparison PNG>"
 })
 ```
 
-
-```
-mcp__telemetry__log_screenshot({
-  session_id,
-  phase: "verification",
-  description: "[FIXED] <Template> <N> <Title> — after <what was fixed>",
-  file_path: "<absolute path to generated slide PNG>"
-})
-```
-
-Also log comparison as an artifact for download/browsing on the dashboard:
+Also log as an artifact for download/browsing on the dashboard:
 ```
 mcp__telemetry__log_artifact({
   session_id,
   file_path: "<absolute path to comparison PNG>",
   artifact_type: "comparison",
-  description: "<
-  metadata: {
+  description: "<label> image <N> comparison",
+  metadata: { label: "<label>", image_number: <N>, diff_pct: <X.X> }
 })
 ```
 
 #### 6b. Structured finding event per callout
 
-For **each individual finding
+For **each individual finding**, log a `comparison_finding` event with queryable metadata:
 
 ```
 mcp__telemetry__log_event({
   session_id,
   phase: "verification",
   event_type: "info",
-  message: "<severity>: <title> — <
+  message: "<severity>: <title> — <label> image <N>",
   metadata: {
     type: "comparison_finding",
-
-
-
+    label: "<label>",
+    image_number: <N>,
+    image_name: "<filename>",
     severity: "<critical|notable|minor|good>",
-    finding_title: "<short title>",
-    finding_details: "<full description>",
-    diff_pct: <X.X>,
-    comparison_image: "<absolute path>",
-    ref_image: "<absolute path>",
-
+    finding_title: "<short title>",
+    finding_details: "<full description>",
+    diff_pct: <X.X>,
+    comparison_image: "<absolute path>",
+    ref_image: "<absolute path>",
+    actual_image: "<absolute path>"
   }
 })
 ```
 
-
-```
-// Event 1: critical finding
-metadata: {
-  type: "comparison_finding",
-  template: "trinity",
-  slide_number: 2,
-  slide_title: "Table of Contents",
-  severity: "critical",
-  finding_title: "Items 02-03: '[Not available]'",
-  finding_details: "Reference: '02 Details & Requirements', '03 Success Criteria'. Generated: both show '[Not available]' — LLM failed to map content to these TOC slots.",
-  diff_pct: 18.2,
-  comparison_image: "/Users/.../comparisons/trinity-02-toc.png",
-  ref_image: "/tmp/.../ref-trinity/slide-02.png",
-  gen_image: "/tmp/.../gen-trinity/slide-02.png"
-}
-
-// Event 2: notable finding
-metadata: {
-  type: "comparison_finding",
-  template: "trinity",
-  slide_number: 2,
-  slide_title: "Table of Contents",
-  severity: "notable",
-  finding_title: "Logo expanded",
-  finding_details: "Reference: icon-only client logo. Generated: full wordmark with icon — different logo variant.",
-  ...
-}
-
-// Event 3: good finding
-metadata: {
-  type: "comparison_finding",
-  template: "trinity",
-  slide_number: 2,
-  slide_title: "Table of Contents",
-  severity: "good",
-  finding_title: "Layout & footer preserved",
-  finding_details: "TOC numbering, arrow icons, divider lines, footer text all in correct positions.",
-  ...
-}
-```
-
-#### 6c. Slide-level summary event
+#### 6c. Image-level summary event
 
-After logging all findings for
+After logging all findings for an image pair:
 
 ```
 mcp__telemetry__log_event({
   session_id,
   phase: "verification",
   event_type: "info",
-  message: "Compared <
+  message: "Compared <label> image <N> (<name>) — <highest severity>, diff <X.X>%",
   metadata: {
-    type: "
-
-
-
+    type: "comparison_image_summary",
+    label: "<label>",
+    image_number: <N>,
+    image_name: "<filename>",
     diff_pct: <X.X>,
     highest_severity: "<critical|notable|minor|good>",
     finding_count: { critical: <N>, notable: <N>, minor: <N>, good: <N> },
@@ -456,52 +346,42 @@ mcp__telemetry__log_event({
 
 #### 6d. Final rollup event
 
-After all
+After all image pairs:
 
 ```
 mcp__telemetry__log_event({
   session_id,
   phase: "deliverables",
   event_type: "info",
-  message: "Image comparison complete — <N>
+  message: "Image comparison complete — <N> images, <C> critical, <N> notable findings",
   metadata: {
     type: "comparison_rollup",
-
-
+    label: "<label>",
+    total_images: <N>,
     total_findings: <N>,
     by_severity: { critical: <N>, notable: <N>, minor: <N>, good: <N> },
-    avg_diff_pct: <X.X
-    per_template: {
-      "trinity": { slides: 7, avg_diff_pct: 14.2, critical: 2, notable: 3, good: 7 },
-      "ow": { slides: 9, avg_diff_pct: 16.8, critical: 1, notable: 5, good: 9 }
-    }
+    avg_diff_pct: <X.X>
   }
 })
 ```
 
 #### Querying findings programmatically
 
-The `type: "comparison_finding"` field in metadata enables downstream tools to query findings:
-
 ```sql
 -- All critical findings across sessions
 SELECT * FROM events
 WHERE metadata->>'type' = 'comparison_finding'
 AND metadata->>'severity' = 'critical';
 
--- All findings for a specific
+-- All findings for a specific comparison
 SELECT * FROM events
 WHERE metadata->>'type' = 'comparison_finding'
-AND metadata->>'
+AND metadata->>'label' = 'homepage-redesign';
 
---
+-- Image summaries sorted by diff percentage
 SELECT * FROM events
-WHERE metadata->>'type' = '
+WHERE metadata->>'type' = 'comparison_image_summary'
 ORDER BY (metadata->>'diff_pct')::float DESC;
-
--- Rollup across all comparison sessions
-SELECT * FROM events
-WHERE metadata->>'type' = 'comparison_rollup';
 ```
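The rollup metadata from step 6d can be derived from the per-image summaries rather than tallied by hand. The sketch below assumes each summary is a dict carrying the `diff_pct` and `finding_count` fields from step 6c. `build_rollup` is a hypothetical helper, and rounding the average to one decimal is an assumption based on the `<X.X>` placeholder.

```python
def build_rollup(label, summaries):
    """Aggregate per-image summary metadata into comparison_rollup metadata."""
    levels = ["critical", "notable", "minor", "good"]
    # Sum each severity count across all image summaries.
    by_severity = {
        lvl: sum(s["finding_count"].get(lvl, 0) for s in summaries) for lvl in levels
    }
    # Guard against an empty run so the average never divides by zero.
    avg = sum(s["diff_pct"] for s in summaries) / len(summaries) if summaries else 0.0
    return {
        "type": "comparison_rollup",
        "label": label,
        "total_images": len(summaries),
        "total_findings": sum(by_severity.values()),
        "by_severity": by_severity,
        "avg_diff_pct": round(avg, 1),
    }


summaries = [
    {"diff_pct": 10.0, "finding_count": {"critical": 0, "notable": 1, "minor": 1, "good": 1}},
    {"diff_pct": 20.0, "finding_count": {"critical": 1, "notable": 0, "minor": 0, "good": 1}},
]
rollup = build_rollup("homepage-redesign", summaries)
```

The resulting dict has the same keys as the `metadata` object in the rollup event, so it could be passed through unchanged.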
 
 ### Phase 7: Summary Output
@@ -509,50 +389,26 @@ WHERE metadata->>'type' = 'comparison_rollup';
 ```markdown
 ## Image Comparison Results
 
-|
+| Image | Diff % | Severity | Key Finding |
 |-------|:------:|----------|-------------|
-|
-|
+| 01 — Homepage | 12.3% | NOTABLE | Header layout shifted, CTA button color changed |
+| 02 — Dashboard | 18.2% | CRITICAL | Chart data missing, sidebar collapsed |
 | ... | ... | ... | ... |
 
-**Critical:** <count>
-**Notable:** <count>
-**Good:** <count>
+**Critical:** <count> images
+**Notable:** <count> images
+**Good:** <count> images
 
 **Output:** <output-dir>/comparisons/
-**Telemetry:** Session <id
+**Telemetry:** Session <id>
 ```
 
 ---
 
-##
-
-### Supabase Auth Bypass for Template Upload
-The template upload API requires CSRF tokens. For programmatic access, use the Supabase REST API directly with the service_role key:
-```bash
-SERVICE_KEY="eyJhbG..." # from `yarn supabase status`
-curl -s "http://127.0.0.1:54321/rest/v1/presentation_template" \
-  -H "Authorization: Bearer $SERVICE_KEY" -H "apikey: $SERVICE_KEY" ...
-```
-
-### Inngest Event Trigger
-Trigger template analysis or slide generation directly:
-```bash
-curl -s "http://localhost:8288/e/test" -X POST \
-  -H "Content-Type: application/json" \
-  -d '[{"name": "presentation.generate_slides", "data": {...}}]'
-```
-
-### agent-browser Present Mode Navigation
-**Prefer Present mode + ArrowRight** for clean fullscreen captures. The editor sidebar thumbnails are `.chakra-stack` elements at `x≈264`, but Present mode avoids all UI chrome.
-
-### Shell Variable Pitfall in agent-browser eval
-When using `npx agent-browser eval "document.elementFromPoint(x, $VAR)"` in a bash loop, ensure `$VAR` is non-empty. Array indexing with `${ARR[$i]}` can produce empty values if `i=0` and the array wasn't initialized with explicit values.
-
-### Extending Beyond Slides
-This workflow works for any before/after image comparison:
-- **UI screenshots:** Compare a design mockup against the implemented page
-- **Chart rendering:** Compare expected chart output against actual
-- **Email templates:** Compare HTML email reference against rendered output
+## Common Use Cases
 
-
+- **UI regression testing:** Compare screenshots before/after a code change
+- **Design fidelity:** Compare design mockup PNGs against implemented page screenshots
+- **Generated content:** Compare expected output against LLM/AI-generated output
+- **Email templates:** Compare HTML email reference renders against actual sends
+- **Chart/data viz:** Compare expected chart renders against actual output