deepflow 0.1.78 → 0.1.80
- package/README.md +14 -3
- package/bin/install.js +3 -2
- package/package.json +4 -1
- package/src/commands/df/auto-cycle.md +33 -19
- package/src/commands/df/execute.md +166 -473
- package/src/commands/df/plan.md +113 -163
- package/src/commands/df/verify.md +433 -3
- package/src/skills/browse-fetch/SKILL.md +258 -0
- package/src/skills/browse-verify/SKILL.md +264 -0
- package/templates/config-template.yaml +14 -0
- package/src/skills/context-hub/SKILL.md +0 -87
|
@@ -131,16 +131,444 @@ Run AFTER L0 passes and L1-L2 complete. Run even if L1-L2 found issues.
|
|
|
131
131
|
**Flaky test handling** (if `quality.test_retry_on_fail: true` in config):
|
|
132
132
|
- Re-run ONCE on failure. Second pass → "⚠ L4: Passed on retry (possible flaky test)". Second fail → genuine failure.
|
|
133
133
|
|
|
134
|
+
**L5: Browser Verification** (if frontend detected)
|
|
135
|
+
|
|
136
|
+
**Step 1: Detect frontend framework** (config override always wins):
|
|
137
|
+
|
|
138
|
+
```bash
|
|
139
|
+
BROWSER_VERIFY=$(yq '.quality.browser_verify' .deepflow/config.yaml 2>/dev/null)
|
|
140
|
+
|
|
141
|
+
if [ "${BROWSER_VERIFY}" = "false" ]; then
|
|
142
|
+
# Explicitly disabled — skip L5 unconditionally
|
|
143
|
+
echo "L5 — (no frontend)"
|
|
144
|
+
L5_RESULT="skipped-no-frontend"
|
|
145
|
+
elif [ "${BROWSER_VERIFY}" = "true" ]; then
|
|
146
|
+
# Explicitly enabled — proceed even without frontend deps
|
|
147
|
+
FRONTEND_DETECTED=true
|
|
148
|
+
FRONTEND_FRAMEWORK="configured"
|
|
149
|
+
else
|
|
150
|
+
# Auto-detect from package.json (both dependencies and devDependencies)
|
|
151
|
+
FRONTEND_DETECTED=false
|
|
152
|
+
FRONTEND_FRAMEWORK=""
|
|
153
|
+
|
|
154
|
+
if [ -f package.json ]; then
|
|
155
|
+
# Check for React / Next.js
|
|
156
|
+
if jq -e '(.dependencies + (.devDependencies // {})) | keys[] | select(. == "react" or . == "react-dom" or . == "next")' package.json >/dev/null 2>&1; then
|
|
157
|
+
FRONTEND_DETECTED=true
|
|
158
|
+
# Prefer Next.js label when next is present
|
|
159
|
+
if jq -e '(.dependencies + (.devDependencies // {}))["next"]' package.json >/dev/null 2>&1; then
|
|
160
|
+
FRONTEND_FRAMEWORK="Next.js"
|
|
161
|
+
else
|
|
162
|
+
FRONTEND_FRAMEWORK="React"
|
|
163
|
+
fi
|
|
164
|
+
# Check for Nuxt / Vue
|
|
165
|
+
elif jq -e '(.dependencies + (.devDependencies // {})) | keys[] | select(. == "vue" or . == "nuxt" or startswith("@vue/"))' package.json >/dev/null 2>&1; then
|
|
166
|
+
FRONTEND_DETECTED=true
|
|
167
|
+
if jq -e '(.dependencies + (.devDependencies // {}))["nuxt"]' package.json >/dev/null 2>&1; then
|
|
168
|
+
FRONTEND_FRAMEWORK="Nuxt"
|
|
169
|
+
else
|
|
170
|
+
FRONTEND_FRAMEWORK="Vue"
|
|
171
|
+
fi
|
|
172
|
+
# Check for Svelte / SvelteKit
|
|
173
|
+
elif jq -e '(.dependencies + (.devDependencies // {})) | keys[] | select(. == "svelte" or startswith("@sveltejs/"))' package.json >/dev/null 2>&1; then
|
|
174
|
+
FRONTEND_DETECTED=true
|
|
175
|
+
if jq -e '(.dependencies + (.devDependencies // {}))["@sveltejs/kit"]' package.json >/dev/null 2>&1; then
|
|
176
|
+
FRONTEND_FRAMEWORK="SvelteKit"
|
|
177
|
+
else
|
|
178
|
+
FRONTEND_FRAMEWORK="Svelte"
|
|
179
|
+
fi
|
|
180
|
+
fi
|
|
181
|
+
fi
|
|
182
|
+
|
|
183
|
+
if [ "${FRONTEND_DETECTED}" = "false" ]; then
|
|
184
|
+
echo "L5 — (no frontend)"
|
|
185
|
+
L5_RESULT="skipped-no-frontend"
|
|
186
|
+
fi
|
|
187
|
+
fi
|
|
188
|
+
```
|
|
189
|
+
|
|
190
|
+
Packages checked in both `dependencies` and `devDependencies`:
|
|
191
|
+
|
|
192
|
+
| Package(s) | Detected Framework |
|
|
193
|
+
|------------|--------------------|
|
|
194
|
+
| `next` | Next.js |
|
|
195
|
+
| `react`, `react-dom` | React |
|
|
196
|
+
| `nuxt` | Nuxt |
|
|
197
|
+
| `vue`, `@vue/*` | Vue |
|
|
198
|
+
| `@sveltejs/kit` | SvelteKit |
|
|
199
|
+
| `svelte`, `@sveltejs/*` | Svelte |
|
|
200
|
+
|
|
201
|
+
Config key `quality.browser_verify`:
|
|
202
|
+
- `false` → always skip L5, output `L5 — (no frontend)`, even if frontend deps are present
|
|
203
|
+
- `true` → always run L5, even if no frontend deps detected
|
|
204
|
+
- absent → auto-detect from package.json as above
|
|
205
|
+
|
|
206
|
+
No frontend deps found and `quality.browser_verify` not set → output `L5 — (no frontend)`, skip all remaining L5 steps.
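The precedence described above (config override first, then auto-detection over both dependency maps) can be sketched in Node as follows. This is a minimal illustration, not the package's implementation: `detectFrontend` and `FAMILIES` are hypothetical names, and `browserVerify` is assumed to be the already-parsed value of `quality.browser_verify` (`true`, `false`, or `undefined` when absent).

```javascript
// Ordered rules: the first matching family wins, mirroring the if/elif chain above.
// Within a family, the meta-framework label is preferred when its package is present.
const FAMILIES = [
  { match: n => n === 'react' || n === 'react-dom' || n === 'next', meta: 'next', metaLabel: 'Next.js', label: 'React' },
  { match: n => n === 'vue' || n === 'nuxt' || n.startsWith('@vue/'), meta: 'nuxt', metaLabel: 'Nuxt', label: 'Vue' },
  { match: n => n === 'svelte' || n.startsWith('@sveltejs/'), meta: '@sveltejs/kit', metaLabel: 'SvelteKit', label: 'Svelte' },
];

// pkg: parsed package.json (hypothetical input); browserVerify: true | false | undefined.
function detectFrontend(pkg, browserVerify) {
  if (browserVerify === false) return { detected: false, framework: '' };
  if (browserVerify === true) return { detected: true, framework: 'configured' };

  // Merge dependencies and devDependencies; a missing map spreads as empty.
  const deps = { ...(pkg && pkg.dependencies), ...(pkg && pkg.devDependencies) };
  const names = Object.keys(deps);

  for (const family of FAMILIES) {
    if (names.some(family.match)) {
      return { detected: true, framework: family.meta in deps ? family.metaLabel : family.label };
    }
  }
  return { detected: false, framework: '' };
}
```

For example, `detectFrontend({ dependencies: { react: '18' }, devDependencies: { next: '14' } })` yields the `Next.js` label because `next` takes priority inside the React family.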
|
|
207
|
+
|
|
208
|
+
**Step 2: Dev server lifecycle**
|
|
209
|
+
|
|
210
|
+
**2a. Resolve dev command** (config override always wins):
|
|
211
|
+
|
|
212
|
+
```bash
|
|
213
|
+
# 1. Config override
|
|
214
|
+
DEV_COMMAND=$(yq '.quality.dev_command' .deepflow/config.yaml 2>/dev/null)
|
|
215
|
+
|
|
216
|
+
# 2. Auto-detect from package.json scripts.dev
|
|
217
|
+
if [ -z "${DEV_COMMAND}" ] || [ "${DEV_COMMAND}" = "null" ]; then
|
|
218
|
+
if [ -f package.json ] && jq -e '.scripts.dev' package.json >/dev/null 2>&1; then
|
|
219
|
+
DEV_COMMAND="npm run dev"
|
|
220
|
+
fi
|
|
221
|
+
fi
|
|
222
|
+
|
|
223
|
+
# 3. No dev command found → skip L5 dev server steps
|
|
224
|
+
if [ -z "${DEV_COMMAND}" ] || [ "${DEV_COMMAND}" = "null" ]; then
|
|
225
|
+
echo "⚠ L5: No dev command found (scripts.dev not in package.json, quality.dev_command not set). Skipping browser check."
|
|
226
|
+
L5_RESULT="skipped-no-dev-command"
|
|
227
|
+
fi
|
|
228
|
+
```
|
|
229
|
+
|
|
230
|
+
**2b. Resolve port:**
|
|
231
|
+
|
|
232
|
+
```bash
|
|
233
|
+
# Config override wins; fallback to 3000
|
|
234
|
+
DEV_PORT=$(yq '.quality.dev_port' .deepflow/config.yaml 2>/dev/null)
|
|
235
|
+
if [ -z "${DEV_PORT}" ] || [ "${DEV_PORT}" = "null" ]; then
|
|
236
|
+
DEV_PORT=3000
|
|
237
|
+
fi
|
|
238
|
+
```
|
|
239
|
+
|
|
240
|
+
**2c. Check if dev server is already running (port already bound):**
|
|
241
|
+
|
|
242
|
+
```bash
|
|
243
|
+
PORT_IN_USE=false
|
|
244
|
+
if curl -s -o /dev/null -w "%{http_code}" "http://localhost:${DEV_PORT}" | grep -q "200"; then
|
|
245
|
+
PORT_IN_USE=true
|
|
246
|
+
echo "ℹ L5: Port ${DEV_PORT} already bound — using existing dev server, will not kill on exit."
|
|
247
|
+
fi
|
|
248
|
+
```
|
|
249
|
+
|
|
250
|
+
**2d. Start dev server and poll for readiness:**
|
|
251
|
+
|
|
252
|
+
```bash
|
|
253
|
+
DEV_SERVER_PID=""
|
|
254
|
+
if [ "${PORT_IN_USE}" = "false" ]; then
|
|
255
|
+
# Start in a new process group so all child processes can be killed together
|
|
256
|
+
setsid ${DEV_COMMAND} &
|
|
257
|
+
DEV_SERVER_PID=$!
|
|
258
|
+
fi
|
|
259
|
+
|
|
260
|
+
# Resolve timeout from config (default 30s)
|
|
261
|
+
TIMEOUT=$(yq '.quality.browser_timeout' .deepflow/config.yaml 2>/dev/null)
|
|
262
|
+
if [ -z "${TIMEOUT}" ] || [ "${TIMEOUT}" = "null" ]; then
|
|
263
|
+
TIMEOUT=30
|
|
264
|
+
fi
|
|
265
|
+
POLL_INTERVAL=0.5
|
|
266
|
+
MAX_POLLS=$(echo "${TIMEOUT} / ${POLL_INTERVAL}" | bc)
|
|
267
|
+
|
|
268
|
+
HTTP_STATUS=""
|
|
269
|
+
for i in $(seq 1 ${MAX_POLLS}); do
|
|
270
|
+
HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" "http://localhost:${DEV_PORT}" 2>/dev/null)
|
|
271
|
+
[ "${HTTP_STATUS}" = "200" ] && break
|
|
272
|
+
sleep ${POLL_INTERVAL}
|
|
273
|
+
done
|
|
274
|
+
|
|
275
|
+
if [ "${HTTP_STATUS}" != "200" ]; then
|
|
276
|
+
# Kill process group before reporting failure
|
|
277
|
+
if [ -n "${DEV_SERVER_PID}" ]; then
|
|
278
|
+
kill -SIGTERM -${DEV_SERVER_PID} 2>/dev/null
|
|
279
|
+
fi
|
|
280
|
+
echo "✗ L5 FAIL: dev server did not start within ${TIMEOUT}s"
|
|
281
|
+
# add fix task to PLAN.md
|
|
282
|
+
exit 1
|
|
283
|
+
fi
|
|
284
|
+
```
|
|
285
|
+
|
|
286
|
+
**2e. Teardown — always runs on both pass and fail paths:**
|
|
287
|
+
|
|
288
|
+
```bash
|
|
289
|
+
cleanup_dev_server() {
|
|
290
|
+
if [ -n "${DEV_SERVER_PID}" ]; then
|
|
291
|
+
# Kill the entire process group to catch any child processes spawned by the dev server
|
|
292
|
+
kill -SIGTERM -${DEV_SERVER_PID} 2>/dev/null
|
|
293
|
+
# Give it up to 5s to exit cleanly, then force-kill
|
|
294
|
+
for i in $(seq 1 10); do
|
|
295
|
+
kill -0 ${DEV_SERVER_PID} 2>/dev/null || break
|
|
296
|
+
sleep 0.5
|
|
297
|
+
done
|
|
298
|
+
kill -SIGKILL -${DEV_SERVER_PID} 2>/dev/null || true
|
|
299
|
+
fi
|
|
300
|
+
}
|
|
301
|
+
# Register cleanup for both success and failure paths
|
|
302
|
+
trap cleanup_dev_server EXIT
|
|
303
|
+
```
|
|
304
|
+
|
|
305
|
+
Note: When `PORT_IN_USE=true` (dev server was already running before L5 began), `DEV_SERVER_PID` is empty and `cleanup_dev_server` is a no-op — the pre-existing server is left running.
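The readiness poll from step 2d (fixed interval, bounded number of attempts, success only on HTTP 200) can be sketched in Node as below. The names `maxPolls` and `waitForReady` are hypothetical, and `probe` is assumed to be any async function that resolves to an HTTP status code; injecting `sleep` is a choice made here so the loop can be exercised without real delays.

```javascript
// Number of polls that fit into the timeout window (30s at 0.5s intervals gives 60).
function maxPolls(timeoutSeconds, pollIntervalSeconds = 0.5) {
  return Math.floor(timeoutSeconds / pollIntervalSeconds);
}

// Polls `probe` until it reports 200 or the poll budget is exhausted.
// Resolves to true when the server became ready, false on timeout.
async function waitForReady(probe, { timeoutSeconds = 30, pollIntervalSeconds = 0.5, sleep } = {}) {
  const wait = sleep || (ms => new Promise(resolve => setTimeout(resolve, ms)));
  const polls = maxPolls(timeoutSeconds, pollIntervalSeconds);

  for (let i = 0; i < polls; i++) {
    let status;
    try {
      status = await probe();
    } catch {
      status = 0; // connection refused or similar: treat as "not ready yet"
    }
    if (status === 200) return true;
    await wait(pollIntervalSeconds * 1000);
  }
  return false;
}
```

A caller would treat a `false` result like the `HTTP_STATUS != 200` branch above: tear down the process group, report the L5 failure, and stop.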
|
|
306
|
+
|
|
307
|
+
**Step 3: Read assertions from PLAN.md**
|
|
308
|
+
|
|
309
|
+
Assertions are written into PLAN.md at plan-time (REQ-8). Extract them for the current spec:
|
|
310
|
+
|
|
311
|
+
```bash
|
|
312
|
+
# Parse structured browser assertions block from PLAN.md
|
|
313
|
+
# Format expected in PLAN.md under each spec section:
|
|
314
|
+
# browser_assertions:
|
|
315
|
+
# - selector: "nav"
|
|
316
|
+
# role: "navigation"
|
|
317
|
+
# name: "Main navigation"
|
|
318
|
+
# - selector: "button[type=submit]"
|
|
319
|
+
# visible: true
|
|
320
|
+
# text: "Submit"
|
|
321
|
+
ASSERTIONS=$(parse_yaml_block "browser_assertions" PLAN.md)
|
|
322
|
+
```
|
|
323
|
+
|
|
324
|
+
If no `browser_assertions` block found for the spec → L5 — (no assertions), skip Playwright step.
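The bash snippet above relies on a `parse_yaml_block` helper that the source does not define. As a rough sketch only, a line-based Node equivalent for the simple list-of-maps shape shown in the comments might look like this. `parseBrowserAssertions` and `parseScalar` are hypothetical names, the indentation of the block is assumed, and this is not a general YAML parser.

```javascript
// Converts a scalar from the block: booleans become booleans, quotes are stripped.
function parseScalar(text) {
  if (text === 'true') return true;
  if (text === 'false') return false;
  return text.replace(/^"(.*)"$/, '$1');
}

// Returns the list of assertion objects found under the first
// `browser_assertions:` line, or [] when no such block exists.
function parseBrowserAssertions(planText) {
  const lines = planText.split('\n');
  const start = lines.findIndex(line => line.trim() === 'browser_assertions:');
  if (start === -1) return [];

  const assertions = [];
  for (const raw of lines.slice(start + 1)) {
    // Matches `- key: value` (new list item) or `key: value` (continuation).
    const match = raw.match(/^\s*(- )?\s*(\w+):\s*(.+?)\s*$/);
    if (!match) break; // a blank line or unrelated text ends the block
    const [, dash, key, value] = match;
    if (dash) assertions.push({});
    else if (assertions.length === 0) break; // key before any list item: not this format
    assertions[assertions.length - 1][key] = parseScalar(value);
  }
  return assertions;
}
```

An empty result corresponds to the `L5 — (no assertions)` outcome, in which case the Playwright step is skipped.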
|
|
325
|
+
|
|
326
|
+
**Step 3.5: Playwright browser auto-install**
|
|
327
|
+
|
|
328
|
+
Before launching Playwright, verify the Chromium browser binary is available. Run this check once per session; cache the result to avoid repeated installs.
|
|
329
|
+
|
|
330
|
+
```bash
|
|
331
|
+
# Marker file path — presence means Playwright Chromium was verified this session
|
|
332
|
+
PW_MARKER="${TMPDIR:-/tmp}/.deepflow-pw-chromium-ok"
|
|
333
|
+
|
|
334
|
+
if [ ! -f "${PW_MARKER}" ]; then
|
|
335
|
+
# Dry-run to detect whether the browser binary is already installed
|
|
336
|
+
if ! npx --yes playwright install --dry-run chromium 2>&1 | grep -q "chromium.*already installed"; then
|
|
337
|
+
echo "ℹ L5: Playwright Chromium not found — installing (one-time setup)..."
|
|
338
|
+
if npx --yes playwright install chromium 2>&1; then
|
|
339
|
+
echo "✓ L5: Playwright Chromium installed successfully."
|
|
340
|
+
touch "${PW_MARKER}"
|
|
341
|
+
else
|
|
342
|
+
echo "✗ L5 FAIL: Playwright Chromium install failed. Browser verification skipped."
|
|
343
|
+
L5_RESULT="skipped-install-failed"
|
|
344
|
+
# Skip the remaining L5 steps for this run
|
|
345
|
+
fi
|
|
346
|
+
else
|
|
347
|
+
# Already installed — cache for this session
|
|
348
|
+
touch "${PW_MARKER}"
|
|
349
|
+
fi
|
|
350
|
+
fi
|
|
351
|
+
|
|
352
|
+
# If install failed, skip Playwright launch and jump to L5 outcome reporting
|
|
353
|
+
if [ "${L5_RESULT}" = "skipped-install-failed" ]; then
|
|
354
|
+
# No assertions can be evaluated — treat as a non-blocking skip with error notice
|
|
355
|
+
: # fall through to report section
|
|
356
|
+
fi
|
|
357
|
+
```
|
|
358
|
+
|
|
359
|
+
Skip Steps 4–6 when `L5_RESULT="skipped-install-failed"`.
|
|
360
|
+
|
|
361
|
+
**Step 4: Playwright verification**
|
|
362
|
+
|
|
363
|
+
Launch Chromium headlessly via Playwright and evaluate each assertion deterministically — no LLM judgment:
|
|
364
|
+
|
|
365
|
+
```javascript
|
|
366
|
+
const { chromium } = require('playwright');
|
|
367
|
+
const browser = await chromium.launch({ headless: true });
|
|
368
|
+
const page = await browser.newPage();
|
|
369
|
+
await page.goto('http://localhost:' + DEV_PORT); // use the port resolved in Step 2b rather than a hardcoded 3000
|
|
370
|
+
|
|
371
|
+
const failures = [];
|
|
372
|
+
|
|
373
|
+
for (const assertion of assertions) {
|
|
374
|
+
const locator = page.locator(assertion.selector);
|
|
375
|
+
|
|
376
|
+
// Capture accessibility tree (replaces deprecated page.accessibility.snapshot())
|
|
377
|
+
// locator.ariaSnapshot() returns YAML-like text with roles, names, hierarchy
|
|
378
|
+
const ariaSnapshot = await locator.ariaSnapshot();
|
|
379
|
+
|
|
380
|
+
if (assertion.role && !new RegExp(`^\\s*- ${assertion.role}\\b`, 'm').test(ariaSnapshot)) { // snapshot lines look like `- role "name"`, not `role: ...`
|
|
381
|
+
failures.push(`${assertion.selector}: expected role "${assertion.role}", not found in aria snapshot`);
|
|
382
|
+
}
|
|
383
|
+
if (assertion.name && !ariaSnapshot.includes(assertion.name)) {
|
|
384
|
+
failures.push(`${assertion.selector}: expected name "${assertion.name}", not found in aria snapshot`);
|
|
385
|
+
}
|
|
386
|
+
|
|
387
|
+
// Capture bounding boxes for visible assertions
|
|
388
|
+
if (assertion.visible !== undefined) {
|
|
389
|
+
const box = await locator.boundingBox();
|
|
390
|
+
const isVisible = box !== null && box.width > 0 && box.height > 0;
|
|
391
|
+
if (assertion.visible !== isVisible) {
|
|
392
|
+
failures.push(`${assertion.selector}: expected visible=${assertion.visible}, got visible=${isVisible}`);
|
|
393
|
+
}
|
|
394
|
+
}
|
|
395
|
+
|
|
396
|
+
if (assertion.text) {
|
|
397
|
+
const text = await locator.innerText();
|
|
398
|
+
if (!text.includes(assertion.text)) {
|
|
399
|
+
failures.push(`${assertion.selector}: expected text "${assertion.text}", got "${text}"`);
|
|
400
|
+
}
|
|
401
|
+
}
|
|
402
|
+
}
|
|
403
|
+
```
|
|
404
|
+
|
|
405
|
+
Note: `page.accessibility.snapshot()` is deprecated in Playwright. Use `locator.ariaSnapshot()` instead, which returns YAML-like text describing roles, names, and hierarchy for the matched element subtree.
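To make the role and name checks concrete: an aria snapshot is text in which each node appears as a line such as `- navigation "Main navigation":`. The following sketch matches a role (and optionally an accessible name) against such text. `snapshotHasNode` and `escapeRegExp` are hypothetical helpers, and the sample snapshot in the usage note is an assumed illustration of the format rather than output captured from a real page.

```javascript
// Escape a string for literal use inside a RegExp.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// True when some node line in the snapshot has the given role and,
// if `name` is provided, that exact accessible name in double quotes.
function snapshotHasNode(snapshot, role, name) {
  const namePart = name === undefined ? '' : `\\s+"${escapeRegExp(name)}"`;
  const pattern = new RegExp(`^\\s*- ${escapeRegExp(role)}${namePart}(\\s|:|$)`, 'm');
  return pattern.test(snapshot);
}
```

With a snapshot of `- navigation "Main navigation":` followed by an indented `- link "Home"`, `snapshotHasNode(snapshot, 'navigation', 'Main navigation')` is true, while a partial role such as `'nav'` does not match. Matching whole node lines avoids false positives from a name that merely appears somewhere in the subtree text.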
|
|
406
|
+
|
|
407
|
+
**Step 5: Screenshot capture**
|
|
408
|
+
|
|
409
|
+
After evaluation (pass or fail), capture a full-page screenshot:
|
|
410
|
+
|
|
411
|
+
```javascript
|
|
412
|
+
const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
|
|
413
|
+
const specName = 'doing-upload'; // derived from current spec filename
|
|
414
|
+
const screenshotPath = `.deepflow/screenshots/${specName}/${timestamp}.png`;
|
|
415
|
+
fs.mkdirSync(path.dirname(screenshotPath), { recursive: true }); // synchronous call, no await needed
|
|
416
|
+
await page.screenshot({ path: screenshotPath, fullPage: true });
|
|
417
|
+
```
|
|
418
|
+
|
|
419
|
+
Screenshot path: `.deepflow/screenshots/{spec-name}/{timestamp}.png`
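As a small worked example of the path scheme, the helper below builds a screenshot path from a spec name and a timestamp. `screenshotPathFor` is a hypothetical name; the `suffix` parameter is an assumption added here to cover the `-retry` variant used in Step 6.

```javascript
const path = require('node:path');

// Builds `.deepflow/screenshots/{spec-name}/{timestamp}{suffix}.png`.
// Colons and dots in the ISO timestamp are replaced so the name is filesystem-safe.
function screenshotPathFor(specName, date = new Date(), suffix = '') {
  const timestamp = date.toISOString().replace(/[:.]/g, '-');
  return path.posix.join('.deepflow', 'screenshots', specName, `${timestamp}${suffix}.png`);
}

console.log(screenshotPathFor('doing-upload', new Date('2024-01-02T03:04:05.678Z')));
// prints ".deepflow/screenshots/doing-upload/2024-01-02T03-04-05-678Z.png"
```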
|
|
420
|
+
|
|
421
|
+
**Step 6: Retry logic**
|
|
422
|
+
|
|
423
|
+
On first failure, retry the FULL L5 check once (total 2 attempts). Re-navigate and re-evaluate all assertions from scratch on the retry:
|
|
424
|
+
|
|
425
|
+
```javascript
|
|
426
|
+
const attempt1_failures = failures; // the `failures` array populated by Step 4 above
|
|
427
|
+
let attempt2_failures = [];
|
|
428
|
+
|
|
429
|
+
if (attempt1_failures.length > 0) {
|
|
430
|
+
// Retry: re-navigate and re-evaluate all assertions (identical logic to Step 4)
|
|
431
|
+
await page.goto('http://localhost:' + DEV_PORT);
|
|
432
|
+
attempt2_failures = await evaluateAssertions(page, assertions); // same loop as Step 4
|
|
433
|
+
|
|
434
|
+
// Capture a second screenshot for the retry attempt
|
|
435
|
+
const retryTimestamp = new Date().toISOString().replace(/[:.]/g, '-');
|
|
436
|
+
const retryScreenshotPath = `.deepflow/screenshots/${specName}/${retryTimestamp}-retry.png`;
|
|
437
|
+
await page.screenshot({ path: retryScreenshotPath, fullPage: true });
|
|
438
|
+
}
|
|
439
|
+
```
|
|
440
|
+
|
|
441
|
+
**Outcome matrix:**
|
|
442
|
+
|
|
443
|
+
| Attempt 1 | Attempt 2 | Result |
|
|
444
|
+
|-----------|-----------|--------|
|
|
445
|
+
| Pass | — (not run) | L5 ✓ |
|
|
446
|
+
| Fail | Pass | L5 ✓ with warning "(passed on retry)" |
|
|
447
|
+
| Fail | Fail — same assertions | L5 ✗ — genuine failure |
|
|
448
|
+
| Fail | Fail — different assertions | L5 ✗ (flaky) |
|
|
449
|
+
|
|
450
|
+
**Outcome reporting:**
|
|
451
|
+
|
|
452
|
+
- **First attempt passes:** `✓ L5: All assertions passed` — no retry needed.
|
|
453
|
+
|
|
454
|
+
- **First fails, retry passes:**
|
|
455
|
+
```
|
|
456
|
+
⚠ L5: Passed on retry (possible flaky render)
|
|
457
|
+
First attempt failed on: {list of assertion selectors from attempt 1}
|
|
458
|
+
```
|
|
459
|
+
→ L5 pass with warning. No fix task added.
|
|
460
|
+
|
|
461
|
+
- **Both fail on SAME assertions** (identical set of failing selectors):
|
|
462
|
+
```
|
|
463
|
+
✗ L5: Browser assertions failed (both attempts)
|
|
464
|
+
{selector}: {failure detail}
|
|
465
|
+
{selector}: {failure detail}
|
|
466
|
+
...
|
|
467
|
+
```
|
|
468
|
+
→ L5 FAIL. Add fix task to PLAN.md.
|
|
469
|
+
|
|
470
|
+
- **Both fail on DIFFERENT assertions** (flaky — assertion sets differ between attempts):
|
|
471
|
+
```
|
|
472
|
+
✗ L5: Browser assertions failed (flaky — inconsistent failures across attempts)
|
|
473
|
+
Attempt 1 failures:
|
|
474
|
+
{selector}: {failure detail}
|
|
475
|
+
Attempt 2 failures:
|
|
476
|
+
{selector}: {failure detail}
|
|
477
|
+
```
|
|
478
|
+
→ L5 ✗ (flaky). Add fix task to PLAN.md noting flakiness.
|
|
479
|
+
|
|
480
|
+
**Fix task generation on L5 failure (both same and flaky):**
|
|
481
|
+
|
|
482
|
+
When both attempts fail (`L5_RESULT = 'fail'` or `L5_RESULT = 'fail-flaky'`), generate a fix task and append it to PLAN.md under the spec's section:
|
|
483
|
+
|
|
484
|
+
```javascript
|
|
485
|
+
// 1. Determine next task ID
|
|
486
|
+
// Scan PLAN.md for highest T{n} and increment
|
|
487
|
+
const planContent = fs.readFileSync('PLAN.md', 'utf8');
|
|
488
|
+
const taskIds = [...planContent.matchAll(/\bT(\d+)\b/g)].map(m => parseInt(m[1], 10));
|
|
489
|
+
const nextId = taskIds.length > 0 ? Math.max(...taskIds) + 1 : 1;
|
|
490
|
+
const taskId = `T${nextId}`;
|
|
491
|
+
|
|
492
|
+
// 2. Collect fix task context
|
|
493
|
+
// - Failing assertions: the structured assertion objects that failed
|
|
494
|
+
const failingAssertions = attempt2_failures.length > 0 ? attempt2_failures : attempt1_failures;
|
|
495
|
+
|
|
496
|
+
// - DOM snapshot excerpt: capture aria snapshot of body at the time of failure
|
|
497
|
+
const domSnapshotExcerpt = await page.locator('body').ariaSnapshot();
|
|
498
|
+
|
|
499
|
+
// - Screenshot path: already captured in Step 5 / Step 6 retry
|
|
500
|
+
// screenshotPath / retryScreenshotPath are available from those steps
|
|
501
|
+
|
|
502
|
+
// 3. Build task description
|
|
503
|
+
const isFlaky = L5_RESULT === 'fail-flaky';
|
|
504
|
+
const flakySuffix = isFlaky ? ' (flaky — inconsistent failures across attempts)' : '';
|
|
505
|
+
const screenshotRef = isFlaky ? retryScreenshotPath : screenshotPath;
|
|
506
|
+
|
|
507
|
+
const fixTaskBlock = `
|
|
508
|
+
- [ ] ${taskId}: Fix L5 browser assertion failures in ${specName}${flakySuffix}
|
|
509
|
+
**Failing assertions:**
|
|
510
|
+
${failingAssertions.map(f => ` - ${f}`).join('\n')}
|
|
511
|
+
**DOM snapshot (aria tree excerpt at failure):**
|
|
512
|
+
\`\`\`
|
|
513
|
+
${domSnapshotExcerpt.split('\n').slice(0, 40).join('\n')}
|
|
514
|
+
\`\`\`
|
|
515
|
+
**Screenshot:** ${screenshotRef}
|
|
516
|
+
`;
|
|
517
|
+
|
|
518
|
+
// 4. Append fix task under spec section in PLAN.md
|
|
519
|
+
// Find the spec section and append before the next section header or EOF
|
|
520
|
+
const specSectionPattern = new RegExp(`(## ${specName}[\\s\\S]*?)(\n## |$)`);
|
|
521
|
+
const updated = planContent.replace(specSectionPattern, (_, section, next) => section + fixTaskBlock + next);
|
|
522
|
+
fs.writeFileSync('PLAN.md', updated);
|
|
523
|
+
|
|
524
|
+
console.log(`Fix task added to PLAN.md: ${taskId}: Fix L5 browser assertion failures in ${specName}`);
|
|
525
|
+
```
|
|
526
|
+
|
|
527
|
+
Fix task context included:
|
|
528
|
+
- **Failing assertions**: the structured assertion data (selector + failure detail) from whichever attempt(s) failed
|
|
529
|
+
- **DOM snapshot excerpt**: first 40 lines of `locator('body').ariaSnapshot()` output at time of failure (textual a11y tree)
|
|
530
|
+
- **Screenshot path**: `.deepflow/screenshots/{spec-name}/{timestamp}.png` (retry screenshot when available)
|
|
531
|
+
- **Flakiness note**: appended to task title when assertion sets differed between attempts
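The two PLAN.md manipulations in the fix-task step (choosing the next `T{n}` id and appending a block under the spec's `## ` section) can be isolated into pure functions, which makes them easy to check without touching the filesystem. This is a sketch under assumptions: `nextTaskId`, `appendUnderSpec`, and `escapeRegExp` are hypothetical names, and escaping the spec name before building the pattern is an addition made here so that names containing regex metacharacters such as `.` match literally.

```javascript
// Escape a string for literal use inside a RegExp.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Scans for the highest T{n} token and returns the next id ("T1" when none exist).
function nextTaskId(planContent) {
  const ids = [...planContent.matchAll(/\bT(\d+)\b/g)].map(m => parseInt(m[1], 10));
  return `T${ids.length > 0 ? Math.max(...ids) + 1 : 1}`;
}

// Inserts `block` at the end of the `## {specName}` section, i.e. just before
// the next `## ` header or at end of file. Returns the content unchanged when
// the section is not found.
function appendUnderSpec(planContent, specName, block) {
  const pattern = new RegExp(`(## ${escapeRegExp(specName)}[\\s\\S]*?)(\\n## |$)`);
  return planContent.replace(pattern, (_, section, next) => section + block + next);
}
```

Using a replacer function rather than a replacement string also means a `$` inside the fix-task text is inserted verbatim.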
|
|
532
|
+
|
|
533
|
+
**Comparing assertion sets (same vs. different):**
|
|
534
|
+
|
|
535
|
+
```javascript
|
|
536
|
+
// Compare by selector strings only — ignore detail text differences
|
|
537
|
+
const attempt1_selectors = attempt1_failures.map(f => f.split(': expected')[0]).sort(); // selectors may contain ':' (e.g. a:hover)
|
|
538
|
+
const attempt2_selectors = attempt2_failures.map(f => f.split(': expected')[0]).sort();
|
|
539
|
+
const same_assertions = JSON.stringify(attempt1_selectors) === JSON.stringify(attempt2_selectors);
|
|
540
|
+
|
|
541
|
+
if (attempt2_failures.length === 0) {
|
|
542
|
+
// Retry passed
|
|
543
|
+
L5_RESULT = 'pass-on-retry';
|
|
544
|
+
} else if (same_assertions) {
|
|
545
|
+
// Genuine failure — same assertions failed both times
|
|
546
|
+
L5_RESULT = 'fail';
|
|
547
|
+
} else {
|
|
548
|
+
// Flaky — different assertions failed each time
|
|
549
|
+
L5_RESULT = 'fail-flaky';
|
|
550
|
+
}
|
|
551
|
+
```
|
|
552
|
+
|
|
553
|
+
**L5 outcomes:**
|
|
554
|
+
- L5 ✓ — all assertions pass on first attempt
|
|
555
|
+
- L5 ⚠ — passed on retry (possible flaky render); first-attempt failures listed as context
|
|
556
|
+
- L5 ✗ — assertions failed on both attempts (same assertions), fix tasks added
|
|
557
|
+
- L5 ✗ (flaky) — assertions failed on both attempts but on different assertions; fix tasks added noting flakiness
|
|
558
|
+
- L5 — (no frontend) — no frontend deps detected and no config override
|
|
559
|
+
- L5 — (no assertions) — frontend detected but no `browser_assertions` in PLAN.md
|
|
560
|
+
- L5 ✗ (install failed) — Playwright Chromium install failed; browser verification skipped for this run
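One way to keep the report line consistent with the outcomes listed above is a single mapping from the result value to the L5 segment. This is an illustrative sketch: `l5Segment` is a hypothetical name, and the result values `'pass'` and `'skipped-no-assertions'` are assumptions, since the source only names `pass-on-retry`, `fail`, `fail-flaky`, `skipped-no-frontend`, and `skipped-install-failed`. The source gives no report label for `skipped-no-dev-command`, so it is deliberately left unmapped.

```javascript
// Maps an L5 result value to the segment shown in the verification report line.
// failedCount is only used for the genuine-failure case.
function l5Segment(result, failedCount = 0) {
  switch (result) {
    case 'pass': // assumed name for "all assertions pass on first attempt"
      return 'L5 ✓';
    case 'pass-on-retry':
      return 'L5 ⚠ (passed on retry)';
    case 'fail':
      return `L5 ✗ (${failedCount} assertion${failedCount === 1 ? '' : 's'} failed)`;
    case 'fail-flaky':
      return 'L5 ✗ (flaky)';
    case 'skipped-no-frontend':
      return 'L5 — (no frontend)';
    case 'skipped-no-assertions': // assumed name; the source defines only the label
      return 'L5 — (no assertions)';
    case 'skipped-install-failed':
      return 'L5 ✗ (install failed)';
    default:
      throw new Error(`Unmapped L5 result: ${result}`);
  }
}
```

Throwing on an unmapped value is a design choice for this sketch: it surfaces a new result value immediately instead of silently printing an empty segment.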
|
|
561
|
+
|
|
134
562
|
### 3. GENERATE REPORT
|
|
135
563
|
|
|
136
564
|
**Format on success:**
|
|
137
565
|
```
|
|
138
|
-
doing-upload.md: L0 ✓ | L1 ✓ (5/5 files) | L2 ⚠ (no coverage tool) | L3 — (subsumed) | L4 ✓ (12 tests) | 0 quality issues
|
|
566
|
+
doing-upload.md: L0 ✓ | L1 ✓ (5/5 files) | L2 ⚠ (no coverage tool) | L3 — (subsumed) | L4 ✓ (12 tests) | L5 ✓ | 0 quality issues
|
|
139
567
|
```
|
|
140
568
|
|
|
141
569
|
**Format on failure:**
|
|
142
570
|
```
|
|
143
|
-
doing-upload.md: L0 ✓ | L1 ✗ (3/5 files) | L2 ⚠ | L3 — | L4 ✗ (3 failed)
|
|
571
|
+
doing-upload.md: L0 ✓ | L1 ✗ (3/5 files) | L2 ⚠ | L3 — | L4 ✗ (3 failed) | L5 ✗ (2 assertions failed)
|
|
144
572
|
|
|
145
573
|
Issues:
|
|
146
574
|
✗ L1: Missing files: src/api/upload.ts, src/services/storage.ts
|
|
@@ -160,6 +588,7 @@ Run /df:execute --continue to fix in the same worktree.
|
|
|
160
588
|
- L1: All planned files appear in diff
|
|
161
589
|
- L2: Coverage didn't drop (or no coverage tool detected)
|
|
162
590
|
- L4: Tests pass (or no test command detected)
|
|
591
|
+
- L5: Browser assertions pass (or no frontend detected, or no assertions defined)
|
|
163
592
|
|
|
164
593
|
**If all gates pass:** Proceed to Post-Verification merge.
|
|
165
594
|
|
|
@@ -192,8 +621,9 @@ Files: ...
|
|
|
192
621
|
| L2: Coverage | Coverage didn't drop | Coverage tool (before/after) | Orchestrator (Bash) |
|
|
193
622
|
| L3: Integration | Build + tests pass | Subsumed by L0 + L4 | — |
|
|
194
623
|
| L4: Tested | Tests pass | Run test command | Orchestrator (Bash) |
|
|
624
|
+
| L5: Browser | UI assertions pass | Playwright + `locator.ariaSnapshot()` | Orchestrator (Bash + Node) |
|
|
195
625
|
|
|
196
|
-
**Default: L0 through
|
|
626
|
+
**Default: L0 through L5.** L0 and L4 skipped ONLY if no build/test command detected (see step 1.5). L5 skipped if no frontend detected and no config override. All checks are machine-verifiable. No LLM agents are used.
|
|
197
627
|
|
|
198
628
|
## Rules
|
|
199
629
|
- Verify against spec, not assumptions
|