npm - deepflow - Versions diffs - 0.1.78 → 0.1.80 - Mend

deepflow 0.1.78 → 0.1.80

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

package/README.md +14 -3
package/bin/install.js +3 -2
package/package.json +4 -1
package/src/commands/df/auto-cycle.md +33 -19
package/src/commands/df/execute.md +166 -473
package/src/commands/df/plan.md +113 -163
package/src/commands/df/verify.md +433 -3
package/src/skills/browse-fetch/SKILL.md +258 -0
package/src/skills/browse-verify/SKILL.md +264 -0
package/templates/config-template.yaml +14 -0
package/src/skills/context-hub/SKILL.md +0 -87

package/src/skills/browse-fetch/SKILL.md ADDED

@@ -0,0 +1,258 @@
+---
+name: browse-fetch
+description: Fetches live web content using headless Chromium via Playwright. Use when you need to read documentation, articles, or any public URL that requires JavaScript rendering. Falls back to WebFetch for simple HTML pages.
+---
+# Browse-Fetch
+Retrieve live web content with a headless browser. Handles JavaScript-rendered pages, SPAs, and dynamic content that WebFetch cannot reach.
+## When to Use
+- Reading documentation sites that require JavaScript to render (e.g., React-based docs, Vite, Next.js portals)
+- Fetching the current content of a specific URL provided by the user
+- Extracting article or reference content from a known page before implementing code against it
+## Skip When
+- The URL is a plain HTML page or GitHub raw file — use WebFetch instead (faster, no overhead)
+- The target requires authentication (login wall) or CAPTCHA — browser cannot bypass; note the block and continue
+---
+## Browser Core Protocol
+This protocol is the reusable foundation for all browser-based skills (browse-fetch, browse-verify, etc.).
+### 1. Install Check
+Before launching, verify Playwright is available:
+```bash
+# Prefer bun if available, fall back to node
+if which bun > /dev/null 2>&1; then RUNTIME=bun; else RUNTIME=node; fi
+$RUNTIME -e "require('playwright')" 2>/dev/null \
+  || npx --yes playwright install chromium --with-deps 2>&1 | tail -5
+```
+If installation fails, fall back to WebFetch (see Fallback section below).
+### 2. Launch Command
+```bash
+# Detect runtime
+if which bun > /dev/null 2>&1; then RUNTIME=bun; else RUNTIME=node; fi
+$RUNTIME -e "
+const { chromium } = require('playwright');
+(async () => {
+  const browser = await chromium.launch({ headless: true });
+  const context = await browser.newContext({
+    userAgent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36'
+  });
+  const page = await context.newPage();
+  // --- navigation + extraction (see sections 3–4) ---
+  await browser.close();
+})().catch(e => { console.error(e.message); process.exit(1); });
+"
+```
+### 3. Navigation
+```js
+// Inside the async IIFE above
+await page.goto(URL, { waitUntil: 'domcontentloaded', timeout: 30000 });
+// Allow JS to settle
+await page.waitForTimeout(1500);
+```
+- Use `waitUntil: 'domcontentloaded'` for speed; upgrade to `'networkidle'` only if content is missing.
+- Set `timeout: 30000` (30 s). On timeout, treat as graceful failure (see section 5).
+### 4. Content Extraction
+Extract the main readable text, not raw HTML:
+```js
+// Primary: semantic content containers
+let text = await page.innerText('main, article, [role="main"]').catch(() => '');
+// Fallback: full body text
+if (!text || text.trim().length < 100) {
+  text = await page.innerText('body').catch(() => '');
+}
+// Truncate to ~4000 tokens (~16000 chars) to stay within context budget
+const MAX_CHARS = 16000;
+if (text.length > MAX_CHARS) {
+  text = text.slice(0, MAX_CHARS) + '\n\n[content truncated — use a more specific selector or paginate]';
+}
+console.log(text);
+```
+For interactive element inspection (e.g., browse-verify), use `locator.ariaSnapshot()` instead of `innerText`.
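The extraction rules in section 4 can be isolated as pure functions, which makes them testable without a browser. This is a minimal sketch, not part of the package: `pickReadableText` and `truncateForContext` are hypothetical names, and the 100-character threshold and 16000-character budget are the values quoted in the snippet above.

```javascript
// Hypothetical helpers (not part of deepflow) mirroring the section 4 logic.
const TRUNCATION_NOTICE = '\n\n[content truncated]';

// Prefer the semantic container text; fall back to the body text when the
// container yields fewer than minLength non-whitespace-trimmed characters.
function pickReadableText(mainText, bodyText, minLength = 100) {
  const main = mainText || '';
  return main.trim().length >= minLength ? main : (bodyText || '');
}

// Cap the text at maxChars and append a notice when anything was cut.
function truncateForContext(text, maxChars = 16000) {
  if (text.length <= maxChars) return text;
  return text.slice(0, maxChars) + TRUNCATION_NOTICE;
}
```

In a Playwright script these would take the results of the two `page.innerText(...)` calls as inputs, so the browser code only has to fetch strings.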
+### 5. Graceful Failure
+Detect and handle blocks without crashing:
+```js
+const title = await page.title();
+const url   = page.url();
+// Login wall
+if (/sign.?in|log.?in|auth/i.test(title) || url.includes('/login')) {
+  console.log(`[browse-fetch] Blocked by login wall at ${url}. Skipping.`);
+  await browser.close();
+  process.exit(0);
+}
+// CAPTCHA
+const bodyText = await page.innerText('body').catch(() => '');
+if (/captcha|robot|human verification/i.test(bodyText)) {
+  console.log(`[browse-fetch] CAPTCHA detected at ${url}. Skipping.`);
+  await browser.close();
+  process.exit(0);
+}
+```
+On graceful failure: return the URL and a short explanation, then continue with the task using available context.
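The two block checks above can also be written as one pure classifier, so the decision is separate from the browser teardown. A sketch under assumptions: `detectBlock` and its return labels are hypothetical; the regular expressions are copied from the snippet above.

```javascript
// Hypothetical classifier (not part of deepflow) using the same patterns
// as the section 5 snippet. Returns 'login-wall', 'captcha', or null.
function detectBlock({ title = '', url = '', bodyText = '' }) {
  if (/sign.?in|log.?in|auth/i.test(title) || url.includes('/login')) {
    return 'login-wall';
  }
  if (/captcha|robot|human verification/i.test(bodyText)) {
    return 'captcha';
  }
  return null;
}
```

Because the title pattern includes the bare alternative `auth`, a page titled "Author guide" is also classified as a login wall; callers that see false positives may want a narrower pattern.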
+### 6. Cleanup
+Always close the browser in a `finally` block or after use:
+```js
+await browser.close();
+```
+---
+## Fetch Workflow
+**Goal:** retrieve and return the text content of a single URL.
+```bash
+# Full inline script — adapt URL and selector per query
+if which bun > /dev/null 2>&1; then RUNTIME=bun; else RUNTIME=node; fi
+$RUNTIME -e "
+const { chromium } = require('playwright');
+(async () => {
+  const browser = await chromium.launch({ headless: true });
+  const context = await browser.newContext({
+    userAgent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36'
+  });
+  const page = await context.newPage();
+  try {
+    await page.goto('https://example.com/docs/page', {
+      waitUntil: 'domcontentloaded',
+      timeout: 30000
+    });
+    await page.waitForTimeout(1500);
+    const title = await page.title();
+    const url   = page.url();
+    if (/sign.?in|log.?in|auth/i.test(title) || url.includes('/login')) {
+      console.log('[browse-fetch] Blocked by login wall at ' + url);
+      return;
+    }
+    let text = await page.innerText('main, article, [role=\"main\"]').catch(() => '');
+    if (!text || text.trim().length < 100) {
+      text = await page.innerText('body').catch(() => '');
+    }
+    const MAX_CHARS = 16000;
+    if (text.length > MAX_CHARS) {
+      text = text.slice(0, MAX_CHARS) + '\n\n[content truncated]';
+    }
+    console.log('=== ' + title + ' ===\n' + text);
+  } finally {
+    await browser.close();
+  }
+})().catch(e => { console.error(e.message); process.exit(1); });
+"
+```
+Adapt the URL and selector per query. The agent inlines the full script via `node -e` or `bun -e` so no temp files are needed for extractions under ~4000 tokens.
+---
+## Search + Navigation Protocol
+**Time-box:** 60 seconds total. **Page cap:** 5 pages per query.
+> Search engines (Google, DuckDuckGo) block headless browsers with CAPTCHAs. Do NOT use Playwright to search them.
+Instead, use one of these strategies:
+| Strategy | When to use |
+|----------|-------------|
+| Direct URL construction | You know the domain (e.g., `docs.stripe.com/api/charges`) |
+| WebSearch tool | General keyword search before fetching pages |
+| Site-specific search | Navigate to `site.com/search?q=term` if the site exposes it |
+**Navigation loop** (up to 5 pages):
+1. Construct or obtain the target URL.
+2. Run the fetch workflow above.
+3. If the page lacks the needed information, look for a next-page link or a more specific sub-URL.
+4. Repeat up to 4 more times (5 total).
+5. Stop and summarize what was found within the 60 s window.
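The navigation loop above, with its 5-page cap and 60-second time box, might be sketched as follows. Everything here is illustrative: `boundedNavigate`, `fetchPage`, and `nextUrl` are hypothetical names, with `fetchPage` standing in for the fetch workflow and `nextUrl` for whatever logic picks the next link.

```javascript
// Hypothetical driver (not part of deepflow) for the bounded navigation loop.
// fetchPage(url) resolves to page text; nextUrl(url, text) returns the next
// URL to visit, or null/undefined to stop.
async function boundedNavigate(startUrl, fetchPage, nextUrl, opts = {}) {
  const { maxPages = 5, budgetMs = 60000, now = Date.now } = opts;
  const deadline = now() + budgetMs;
  const visited = [];
  let url = startUrl;
  // Stop when there is no next URL, the page cap is hit, or time runs out.
  while (url && visited.length < maxPages && now() < deadline) {
    const text = await fetchPage(url);
    visited.push({ url, text });
    url = nextUrl(url, text);
  }
  return visited;
}
```

The deadline is only checked between pages, so a single slow fetch can overrun the budget; the per-page `timeout: 30000` in the fetch workflow bounds that overrun.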
+---
+## Session Cache
+The context window is the cache. Extracted content lives in the conversation until it is no longer needed.
+For extractions larger than ~4000 tokens, write to a temp file and reference it:
+```bash
+# Write large extraction to temp file
+TMPFILE=$(mktemp /tmp/browse-fetch-XXXXXX.txt)
+$RUNTIME -e "...script..." > "$TMPFILE"
+echo "Content saved to $TMPFILE"
+# Read relevant sections with grep or head rather than loading all at once
+```
+---
+## Fallback Without Playwright
+When Playwright is unavailable or fails to install, fall back to the WebFetch tool for:
+- Static HTML sites (GitHub README, raw docs, Wikipedia)
+- Any URL the user provides where JavaScript rendering is not required
+| Condition | Action |
+|-----------|--------|
+| `playwright` not installed, install fails | Use WebFetch |
+| Page is a known static domain (github.com/raw, pastebin, etc.) | Use WebFetch directly — skip Playwright |
+| Playwright times out twice | Use WebFetch as fallback attempt |
+```
+WebFetch: { url: "https://example.com/page", prompt: "Extract the main content" }
+```
+If WebFetch also fails, return the URL with an explanation and continue the task.
+---
+## Rules
+- Always run the install check before the first browser launch in a session.
+- Detect runtime with `which bun` first; use `node` if bun is absent.
+- Never navigate to Google or DuckDuckGo with Playwright — use WebSearch tool or direct URLs.
+- Truncate output at ~4000 tokens (~16000 chars) to protect context budget.
+- On login wall or CAPTCHA, log the block, skip, and continue — never retry infinitely.
+- Close the browser in every code path (use `finally`).
+- Do not persist browser sessions across unrelated tasks.

package/src/skills/browse-verify/SKILL.md ADDED

@@ -0,0 +1,264 @@
+---
+name: browse-verify
+description: Verifies UI acceptance criteria by launching a headless browser, extracting the accessibility tree, and evaluating structured assertions deterministically. Use when a spec has browser-based ACs that need automated verification after implementation.
+---
+# Browse-Verify
+Headless browser verification using Playwright's accessibility tree. Evaluates structured assertions from PLAN.md without LLM calls — purely deterministic matching.
+## When to Use
+After implementing a spec that contains browser-based acceptance criteria:
+- Visual/layout checks (element presence, text content, roles)
+- Interactive state checks (aria-checked, aria-expanded, aria-disabled)
+- Structural checks (element within a container)
+**Skip when:** The spec has no browser-facing ACs, or the implementation is backend-only.
+## Prerequisites
+- Node.js (preferred) or Bun
+- Playwright 1.49 or later, the first release with `locator.ariaSnapshot()` (`npm install playwright`, then `npx playwright install`)
+- Chromium browser (auto-installed if missing)
+## Runtime Detection
+```bash
+# Prefer Node.js; fall back to Bun
+if which node > /dev/null 2>&1; then
+  RUNTIME=node
+elif which bun > /dev/null 2>&1; then
+  RUNTIME=bun
+else
+  echo "Error: neither node nor bun found" && exit 1
+fi
+```
+## Browser Auto-Install
+Before running, ensure Chromium is available:
+```bash
+npx playwright install chromium
+```
+Run this once per environment. If it fails due to permissions, instruct the user to run it manually.
+## Protocol
+### 1. Read Assertions from PLAN.md
+Assertions are written into PLAN.md by the `plan` skill during planning. Format:
+```yaml
+assertions:
+  - role: button
+    name: "Submit"
+    check: visible
+  - role: checkbox
+    name: "Accept terms"
+    check: state
+    value: checked
+  - role: heading
+    name: "Dashboard"
+    check: visible
+    within: main
+  - role: textbox
+    name: "Email"
+    check: value
+    value: "user@example.com"
+```
+Assertion schema:
+| Field    | Required | Description |
+|----------|----------|-------------|
+| `role`   | yes      | ARIA role (button, checkbox, heading, textbox, link, etc.) |
+| `name`   | yes      | Accessible name (exact or partial match) |
+| `check`  | yes      | One of: `visible`, `absent`, `state`, `value`, `count` |
+| `value`  | no       | Expected value for `state` or `value` checks |
+| `within` | no       | Ancestor role or selector to scope the search |
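The schema table can be enforced before any browser is launched. The validator below is a sketch, not the package's code: `validateAssertion` is a hypothetical name, and requiring `value` for `count` checks is an inference from the later `parseInt(value, 10)` call rather than something the table states.

```javascript
// Hypothetical validator (not part of deepflow) for one assertion object.
// Returns a list of error strings; an empty list means the assertion is valid.
const CHECKS = new Set(['visible', 'absent', 'state', 'value', 'count']);

function validateAssertion(assertion) {
  const errors = [];
  for (const field of ['role', 'name', 'check']) {
    if (typeof assertion[field] !== 'string' || assertion[field] === '') {
      errors.push(`missing ${field}`);
    }
  }
  if (assertion.check && !CHECKS.has(assertion.check)) {
    errors.push(`unknown check: ${assertion.check}`);
  }
  // Assumption: state, value, and count checks all need an expected value.
  if (['state', 'value', 'count'].includes(assertion.check) &&
      assertion.value === undefined) {
    errors.push(`${assertion.check} check requires value`);
  }
  return errors;
}
```

Running this over every entry in the `assertions` list up front turns a malformed PLAN.md into a clear error instead of a silent false result.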
+### 2. Launch Browser and Navigate
+```javascript
+const { chromium } = require('playwright');
+const browser = await chromium.launch({ headless: true });
+const page = await browser.newPage();
+await page.goto(TARGET_URL, { waitUntil: 'networkidle' });
+```
+`TARGET_URL` is read from the spec's metadata or passed as an argument.
+### 3. Extract Accessibility Tree
+Use `locator.ariaSnapshot()` — **NOT** `page.accessibility.snapshot()` (removed in Playwright 1.x):
+```javascript
+// Full-page aria snapshot (YAML-like role tree)
+const snapshot = await page.locator('body').ariaSnapshot();
+// Scoped snapshot within a container
+const containerSnapshot = await page.locator('main').ariaSnapshot();
+```
+`ariaSnapshot()` returns a YAML-like string such as:
+```yaml
+- heading "Dashboard" [level=1]
+- button "Submit" [disabled]
+- checkbox "Accept terms" [checked]
+- textbox "Email": user@example.com
+```
+### 4. Capture Bounding Boxes (optional)
+For spatial/layout assertions or debugging:
+```javascript
+const element = page.getByRole(role, { name: assertionName });
+const box = await element.boundingBox();
+// box: { x, y, width, height } or null if not visible
+```
+### 5. Evaluate Assertions Deterministically
+Parse the aria snapshot and evaluate each assertion. No LLM calls during this phase.
+```javascript
+function evaluateAssertion(snapshot, assertion) {
+  const { role, name, check, value, within } = assertion;
+  // Optionally scope to a sub-tree
+  const tree = within
+    ? extractSubtree(snapshot, within)
+    : snapshot;
+  switch (check) {
+    case 'visible':
+      return treeContains(tree, role, name);
+    case 'absent':
+      return !treeContains(tree, role, name);
+    case 'state':
+      // e.g., value: "checked", "disabled", "expanded"
+      return treeContainsWithState(tree, role, name, value);
+    case 'value':
+      // Matches textbox/combobox displayed value
+      return treeContainsWithValue(tree, role, name, value);
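+    case 'count':
+      return countMatches(tree, role, name) === parseInt(value, 10);
+    default:
+      // Unknown check types fail loudly instead of returning undefined
+      throw new Error(`Unknown check type: ${check}`);
+  }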
+}
+```
+Matching rules:
+- Role matching is case-insensitive
+- Name matching is case-insensitive substring match (unless wrapped in quotes for exact match)
+- State tokens (`[checked]`, `[disabled]`, `[expanded]`) are parsed from the snapshot line
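The helpers that `evaluateAssertion` calls (`treeContains`, `extractSubtree`, and so on) are not defined in this file. One possible shape for them, following the matching rules above, is sketched here. It is an assumption-laden illustration: it works on an array of parsed nodes rather than the raw snapshot string, so a caller would pass `parseSnapshot(snapshot)`; `parseSnapshot` and `matches` are hypothetical names; and the line pattern only covers the simple `- role "name" [state]: value` form shown in step 3, not YAML-quoted lines.

```javascript
// Hypothetical helpers (not part of deepflow) for evaluating assertions
// against an aria snapshot string.
const LINE = /^(\s*)- ([\w-]+)(?: "([^"]*)")?((?: \[[^\]]+\])*)(?::\s*(.*))?$/;

// Turn each recognizable snapshot line into { depth, role, name, states, value }.
function parseSnapshot(snapshot) {
  return snapshot.split('\n').map((line) => {
    const m = LINE.exec(line);
    if (!m) return null; // skip lines this simple pattern does not cover
    const [, indent, role, name = '', attrs, value = ''] = m;
    const states = (attrs.match(/\[([^\]]+)\]/g) || []).map((s) => s.slice(1, -1));
    return { depth: indent.length, role: role.toLowerCase(), name, states, value: value.trim() };
  }).filter(Boolean);
}

// Case-insensitive role match plus case-insensitive substring name match.
function matches(node, role, name) {
  return node.role === role.toLowerCase() &&
    node.name.toLowerCase().includes(name.toLowerCase());
}

function treeContains(nodes, role, name) {
  return nodes.some((n) => matches(n, role, name));
}

function treeContainsWithState(nodes, role, name, state) {
  return nodes.some((n) => matches(n, role, name) && n.states.includes(state));
}

function treeContainsWithValue(nodes, role, name, value) {
  return nodes.some((n) => matches(n, role, name) && n.value === value);
}

function countMatches(nodes, role, name) {
  return nodes.filter((n) => matches(n, role, name)).length;
}

// Return the nodes nested (by indentation) under the first node with this role.
function extractSubtree(nodes, role) {
  const start = nodes.findIndex((n) => n.role === role.toLowerCase());
  if (start === -1) return [];
  const out = [];
  for (let i = start + 1; i < nodes.length && nodes[i].depth > nodes[start].depth; i++) {
    out.push(nodes[i]);
  }
  return out;
}
```

Scoping by indentation depth keeps `within: main` from matching elements that merely appear later in the page, which a plain substring search over the snapshot would not guarantee.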
+### 6. Capture Screenshot
+After evaluation, capture a screenshot for the audit trail:
+```javascript
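+const fs = require('fs/promises'); // promise-based mkdir, as in the full template below
+const screenshotDir = `.deepflow/screenshots/${specName}`;
+// ISO timestamp with ':' and '.' replaced so it is safe in file names
+const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
+const screenshotPath = `${screenshotDir}/${timestamp}.png`;
+await fs.mkdir(screenshotDir, { recursive: true });
+await page.screenshot({ path: screenshotPath, fullPage: true });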
+```
+Screenshot path convention: `.deepflow/screenshots/{spec-name}/{timestamp}.png`
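The path convention can be captured in a small pure function, which makes the timestamp format easy to verify. This is a sketch: `screenshotPathFor` is a hypothetical name, and `path.posix` is used only so the result is identical on every platform.

```javascript
const path = require('node:path');

// Hypothetical helper (not part of deepflow): build the screenshot path
// .deepflow/screenshots/{spec-name}/{timestamp}.png for a given date.
function screenshotPathFor(specName, date = new Date()) {
  // Replace ':' and '.' so the ISO timestamp is a valid file name everywhere.
  const timestamp = date.toISOString().replace(/[:.]/g, '-');
  return path.posix.join('.deepflow', 'screenshots', specName, `${timestamp}.png`);
}
```

For example, `screenshotPathFor('login-form', new Date('2026-03-14T12:00:00.000Z'))` yields `.deepflow/screenshots/login-form/2026-03-14T12-00-00-000Z.png`.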
+### 7. Report Results
+Emit a structured result for each assertion:
+```
+[PASS] button "Submit" — visible ✓
+[PASS] checkbox "Accept terms" — state: checked ✓
+[FAIL] heading "Dashboard" — expected visible, not found in snapshot
+[PASS] textbox "Email" — value: user@example.com ✓
+Results: 3 passed, 1 failed
+Screenshot: .deepflow/screenshots/login-form/2026-03-14T12-00-00-000Z.png
+```
+Exit with code 0 if all assertions pass, 1 if any fail.
+### 8. Tear Down
+```javascript
+await browser.close();
+```
+Always close the browser, even on error (use try/finally).
+## Full Script Template
+```javascript
+#!/usr/bin/env node
+const { chromium } = require('playwright');
+const fs = require('fs/promises');
+const path = require('path');
+async function main({ targetUrl, specName, assertions }) {
+  // Auto-install chromium if needed
+  // (handled by: npx playwright install chromium)
+  const browser = await chromium.launch({ headless: true });
+  const page = await browser.newPage();
+  try {
+    await page.goto(targetUrl, { waitUntil: 'networkidle' });
+    const snapshot = await page.locator('body').ariaSnapshot();
+    const results = assertions.map(assertion => ({
+      assertion,
+      passed: evaluateAssertion(snapshot, assertion),
+    }));
+    // Screenshot
+    const screenshotDir = path.join('.deepflow', 'screenshots', specName);
+    await fs.mkdir(screenshotDir, { recursive: true });
+    const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
+    const screenshotPath = path.join(screenshotDir, `${timestamp}.png`);
+    await page.screenshot({ path: screenshotPath, fullPage: true });
+    // Report
+    const passed = results.filter(r => r.passed).length;
+    const failed = results.filter(r => !r.passed).length;
+    for (const { assertion, passed: ok } of results) {
+      const status = ok ? '[PASS]' : '[FAIL]';
+      console.log(`${status} ${assertion.role} "${assertion.name}" — ${assertion.check}${assertion.value ? ': ' + assertion.value : ''}`);
+    }
+    console.log(`\nResults: ${passed} passed, ${failed} failed`);
+    console.log(`Screenshot: ${screenshotPath}`);
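+    // Set the exit code without calling process.exit(), which would
+    // terminate immediately and skip the finally block that closes the browser
+    process.exitCode = failed > 0 ? 1 : 0;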
+  } finally {
+    await browser.close();
+  }
+}
+```
+## Rules
+- Never call an LLM during the verify phase — all assertion evaluation is deterministic
+- Always use `locator.ariaSnapshot()`, never `page.accessibility.snapshot()` (removed)
+- Always close the browser in a `finally` block
+- Screenshot every run regardless of pass/fail outcome
+- If Playwright is not installed, emit a clear error and instructions — don't silently skip
+- Partial name matching is the default; use exact matching only when the assertion specifies it
+- Report results to stdout in the structured format above for downstream parsing

package/templates/config-template.yaml CHANGED

@@ -75,3 +75,17 @@ quality:
   # Retry flaky tests once before failing (default: true)
   test_retry_on_fail: true
+  # Enable L5 browser verification after tests pass (default: false)
+  # When true, deepflow will start the dev server and run visual checks
+  browser_verify: false
+  # Override the dev server start command for browser verification
+  # If empty, deepflow will attempt to auto-detect (e.g., "npm run dev", "yarn dev")
+  dev_command: ""
+  # Port that the dev server listens on (default: 3000)
+  dev_port: 3000
+  # Timeout in seconds to wait for the dev server to become ready (default: 30)
+  browser_timeout: 30
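How these four keys might be consumed is sketched below. This is an illustration only: `resolveBrowserVerify` is a hypothetical name, the `http://localhost:<port>` URL is an assumption about how `dev_port` is used, and the defaults are the ones stated in the comments above.

```javascript
// Hypothetical resolver (not part of deepflow): apply the documented defaults
// to the `quality` section of a parsed config object.
function resolveBrowserVerify(quality = {}) {
  return {
    enabled: quality.browser_verify === true,      // default: false
    devCommand: quality.dev_command || null,       // null means auto-detect
    // Assumption: the dev server is reached on localhost at dev_port.
    url: `http://localhost:${quality.dev_port ?? 3000}`,
    timeoutMs: (quality.browser_timeout ?? 30) * 1000, // config value is in seconds
  };
}
```

An empty `dev_command` maps to `null` here so that callers can distinguish "auto-detect" from an explicit command.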

package/src/skills/context-hub/SKILL.md DELETED

@@ -1,87 +0,0 @@
----
-name: context-hub
-description: Fetches curated API docs for external libraries before coding. Use when implementing code that uses external APIs/SDKs (Stripe, OpenAI, MongoDB, etc.) to avoid hallucinating APIs and reduce token usage.
----
-# Context Hub
-Fetch curated, versioned docs for external libraries instead of guessing APIs.
-## When to Use
-Before writing code that calls an external API or SDK:
-- New library integration (e.g., Stripe payments, AWS S3)
-- Unfamiliar API version or method
-- Complex API with many options (e.g., MongoDB aggregation)
-**Skip when:** Working with internal code (use LSP instead) or well-known stdlib APIs.
-## Prerequisites
-Requires `chub` CLI: `npm install -g @aisuite/chub`
-If `chub` is not installed, tell the user and skip — don't block implementation.
-## Workflow
-### 1. Search for docs
-```bash
-chub search "<library or API>" --json
-```
-Example:
-```bash
-chub search "stripe payments" --json
-chub search "mongodb aggregation" --json
-```
-### 2. Fetch relevant docs
-```bash
-chub get <id> --lang <py|js|ts>
-```
-Use `--lang` matching the project language. Use `--full` only if the summary lacks what you need.
-### 3. Write code using fetched docs
-Use the retrieved documentation as ground truth for API signatures, parameter names, and patterns.
-### 4. Annotate discoveries
-When you find something the docs missed or got wrong:
-```bash
-chub annotate <id> "Note: method X requires param Y since v2.0"
-```
-This persists locally and appears on future `chub get` calls — the agent learns across sessions.
-### 5. Rate docs (optional)
-```bash
-chub feedback <id> up --label accurate
-chub feedback <id> down --label outdated
-```
-Labels: `accurate`, `outdated`, `incomplete`, `wrong-version`, `helpful`
-## Integration with LSP
-| Need | Tool |
-|------|------|
-| Internal code navigation | LSP (`goToDefinition`, `findReferences`) |
-| External API signatures | Context Hub (`chub get`) |
-| Symbol search in project | LSP (`workspaceSymbol`) |
-| Library usage patterns | Context Hub (`chub search`) |
-**Combined approach:** Use LSP to understand how the project currently uses a library, then use Context Hub to verify correct API usage and discover better patterns.
-## Rules
-- Always search before implementing external API calls
-- Trust chub docs over training data for API specifics
-- Annotate gaps so future sessions benefit
-- Don't block on chub failures — fall back to best knowledge
-- Prefer `--json` flag for programmatic parsing in automated workflows