npm - argusqa-os - Versions diffs - 9.7.4 → 9.7.6 - Mend

argusqa-os 9.7.4 → 9.7.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9)

package/README.md +8 -7
package/glama.json +1 -1
package/package.json +9 -1
package/src/adapters/browser.js +72 -5
package/src/orchestration/crawl-and-report.js +1 -1
package/src/orchestration/orchestrator.js +31 -55
package/src/utils/issues-analyzer.js +8 -2
package/src/utils/lighthouse-checker.js +44 -4
package/src/utils/mcp-client.js +7 -4

package/README.md CHANGED

@@ -4,7 +4,7 @@
 [![npm](https://img.shields.io/npm/v/argusqa-os?color=7C3AED)](https://www.npmjs.com/package/argusqa-os)
 [![MCP Server](https://glama.ai/mcp/servers/ironclawdevs27/Argus/badges/card.svg)](https://glama.ai/mcp/servers/ironclawdevs27/Argus)
-[![Harness](https://img.shields.io/badge/harness-688%2F688-4ADE80)](test-harness/)
+[![Harness](https://img.shields.io/badge/harness-846%2F846-4ADE80)](test-harness/)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
 **Argus catches the bugs your test suite misses — visual regressions, API loops, CSS drift, console noise, accessibility failures, and more — and delivers rich reports to Slack (or a local HTML dashboard).**
@@ -217,9 +217,10 @@ npm run report:html    # Generate reports/report.html from last JSON audit
 npm run report:pdf     # Export HTML report to A4 PDF (requires: npm install puppeteer)
 npm run server         # Start Slack slash-command server (port 3001)
 npm run init           # Interactive setup wizard
-npm run test:unit          # 61 unit tests — no Chrome required
-npm run test:harness       # 142-block correctness harness — requires Chrome
+npm run test:unit          # 94 unit tests — no Chrome required
+npm run test:harness       # 149-block correctness harness — requires Chrome
 npm run test:harness:log   # same, but tees full output to harness-results.txt
+npm run test:coverage      # merged unit + harness coverage gate (requires Chrome)
 ```
 **Watch mode** — live monitoring as you develop:
@@ -342,7 +343,7 @@ Argus is a **complementary layer**, not a replacement for unit or E2E tests:
 ## Known Limitations
-All 688 harness assertions pass (`688/688`) — there are currently no known MCP- or Chrome-layer restrictions. Soft assertions (Lighthouse, performance traces) still require non-headless Chrome and are skipped in headless CI.
+All 846 harness assertions pass (`846/846`) — there are currently no known MCP- or Chrome-layer restrictions. Lighthouse now runs in headless (after the `lighthouse_audit` argument fix); the remaining soft assertions (perf traces, GC-dependent heap-growth) are promoted to counted hard assertions only in the weekly strict-soft lane (`harness-strict.yml`) via `ARGUS_HARNESS_STRICT_SOFT`.
 ---
@@ -361,8 +362,8 @@ src/
     chrome-launcher.js  — npm run chrome / argus-chrome — launches Chrome with correct flags
     doctor.js           — npm run doctor / argus-doctor — pre-flight checks
     pr-validate.js      — headless CI entry point for GitHub Actions
-test-harness/           — 142-block correctness harness, 688 hard assertions, 62 fixture pages
-test/unit/              — 61 Vitest unit tests (no Chrome required)
+test-harness/           — 149-block correctness harness, 846 hard assertions, 63 fixture pages
+test/unit/              — 94 Vitest unit tests (no Chrome required)
 landing/                — Product landing page (React 19 + Vite + Tailwind)
 ```
@@ -373,7 +374,7 @@ Full source map → [CLAUDE.md](CLAUDE.md) · MCP/DSL reference → [SKILL.md](S
 ## Contributing
 1. Fork the repo and create a branch
-2. `npm run test:unit` — verify without Chrome (61 tests)
+2. `npm run test:unit` — verify without Chrome (94 tests)
 3. `npm run test:harness` — full integration coverage (requires Chrome on port 9222)
 4. Open a PR — Argus audits itself via the CI workflow

package/glama.json CHANGED

@@ -1,7 +1,7 @@
 {
   "$schema": "https://glama.ai/mcp/schemas/server.json",
   "name": "argus",
-  "description": "AI-powered QA harness that audits web apps via Chrome DevTools Protocol. Catches JS errors, network failures, a11y violations, SEO issues, security headers, CSS regressions, and more — directly from Claude conversations. 9 MCP tools: argus_audit (fast 8-analyzer pass), argus_audit_full (Lighthouse + memory + responsive), argus_compare (dev vs staging diff), argus_last_report (retrieve last JSON report), argus_watch_snapshot (live tab snapshot without navigating), argus_get_context (LLM-optimized context + fix loop with snapshot_id diff), argus_design_audit (Figma design fidelity — 13 finding types), argus_visual_diff (screenshot baseline comparison, updateBaseline flag), argus_pr_validate (PR diff → affected routes → targeted audit → blocked flag). Every finding is post-processed with intelligent baseline filtering (cross-run noise classifier) and root cause linking (recent git commits mapped to new findings). 142 test blocks, 688 hard assertions, 67 detection categories.",
+  "description": "AI-powered QA harness that audits web apps via Chrome DevTools Protocol. Catches JS errors, network failures, a11y violations, SEO issues, security headers, CSS regressions, and more — directly from Claude conversations. 9 MCP tools: argus_audit (fast 8-analyzer pass), argus_audit_full (Lighthouse + memory + responsive), argus_compare (dev vs staging diff), argus_last_report (retrieve last JSON report), argus_watch_snapshot (live tab snapshot without navigating), argus_get_context (LLM-optimized context + fix loop with snapshot_id diff), argus_design_audit (Figma design fidelity — 13 finding types), argus_visual_diff (screenshot baseline comparison, updateBaseline flag), argus_pr_validate (PR diff → affected routes → targeted audit → blocked flag). Every finding is post-processed with intelligent baseline filtering (cross-run noise classifier) and root cause linking (recent git commits mapped to new findings). 149 test blocks, 846 hard assertions, 67 detection categories.",
   "maintainers": ["ironclawdevs27"],
   "tools": [
     {

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "argusqa-os",
-  "version": "9.7.4",
+  "version": "9.7.6",
   "mcpName": "io.github.ironclawdevs27/argus",
   "description": "Argus — AI-powered automated dev-testing platform using Chrome DevTools MCP and Claude Code",
   "keywords": [
@@ -53,6 +53,10 @@
     "test:harness:log": "node test-harness/run-with-log.mjs",
     "test:unit": "vitest run test/unit",
     "test": "npm run test:unit && npm run test:harness",
+    "coverage:unit": "vitest run test/unit --coverage",
+    "coverage:harness": "c8 npm run test:harness",
+    "coverage:gate": "node scripts/coverage-gate.mjs --min-lines 60 --allow-uncovered src/mcp-server.js,src/orchestration/env-comparison.js,src/server/index.js",
+    "test:coverage": "npm run coverage:harness && npm run coverage:unit && npm run coverage:gate",
     "report:html": "node src/utils/html-reporter.js",
     "report:pdf": "node src/utils/pdf-exporter.js",
     "mcp-server": "node src/mcp-server.js"
@@ -72,6 +76,10 @@
     "zod": "^4.4.3"
   },
   "devDependencies": {
+    "@vitest/coverage-v8": "^4.1.8",
+    "c8": "^10.1.3",
+    "fast-check": "^4.8.0",
+    "istanbul-lib-coverage": "^3.2.2",
     "vitest": "^4.1.8"
   }
 }

package/src/adapters/browser.js CHANGED

@@ -21,7 +21,24 @@ export class CdpBrowserAdapter {
   constructor(mcp) { this._mcp = mcp; }
   // ── Navigation ──────────────────────────────────────────────────────────────
-  navigate(url)            { return withRetry(() => this._mcp.navigate_page({ url }), { label: `navigate(${url})` }); }
+  // navigate_page reports failures as RESOLVED text ("Unable to navigate ...
+  // net::ERR_CONNECTION_REFUSED", "Could not connect to Chrome ..."), never as a
+  // thrown error. Unchecked, a dead target or dead browser produced a "clean"
+  // audit: analyzers ran against chrome-error://chromewebdata and emitted bogus
+  // findings (or none), and CI gates passed with Chrome down. Throw so failures
+  // propagate through the existing crawl error path.
+  navigate(url) {
+    return withRetry(async () => {
+      const resp = await this._mcp.navigate_page({ url });
+      if (typeof resp === 'string' &&
+          (resp.includes('Unable to navigate') ||
+           resp.includes('Could not connect to Chrome') ||
+           resp.includes('A dialog is open'))) {
+        throw new Error(`navigate(${url}) failed: ${resp.split('\n')[0].slice(0, 200)}`);
+      }
+      return resp;
+    }, { label: `navigate(${url})` });
+  }
   // ── Evaluation & snapshots ──────────────────────────────────────────────────
   evaluate(fn)             { return this._mcp.evaluate_script({ function: fn }); }
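
Note on the navigate() hunk above: the change turns a failure that arrives as resolved text into a thrown error. The following is a minimal sketch of that pattern in isolation, not the package's code. The helper name `callAndThrowOnErrorText`, the stub client, and the exact response strings are assumptions made for illustration; only the three marker substrings come from the diff.

```javascript
// Marker substrings copied from the navigate() hunk above.
const FAILURE_MARKERS = [
  'Unable to navigate',
  'Could not connect to Chrome',
  'A dialog is open',
];

// Hypothetical helper: run `call`, and if it RESOLVES with text containing a
// known failure marker, reject instead so the caller's catch path runs.
async function callAndThrowOnErrorText(call, label) {
  const resp = await call();
  if (typeof resp === 'string' && FAILURE_MARKERS.some(m => resp.includes(m))) {
    // First line only, capped at 200 characters, mirroring the diff.
    throw new Error(`${label} failed: ${resp.split('\n')[0].slice(0, 200)}`);
  }
  return resp;
}

// Stub standing in for the MCP client. The response strings are invented;
// the point is only that errors come back as resolved values.
const stubMcp = {
  navigate_page: async ({ url }) =>
    url.includes('down')
      ? 'Unable to navigate: net::ERR_CONNECTION_REFUSED\nsecond line of detail'
      : `Navigated to ${url}`,
};

// Usage: a dead target now surfaces as a rejection instead of a "clean" result.
callAndThrowOnErrorText(
  () => stubMcp.navigate_page({ url: 'http://down.test/' }),
  'navigate(http://down.test/)'
).catch(err => console.log(err.message));
```

Substring matching on response text is brittle if the upstream wording changes, which is presumably why the diff keeps the marker list short and specific.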
@@ -41,14 +58,52 @@ export class CdpBrowserAdapter {
   hover(uid)               { return this._mcp.hover({ uid }); }
   drag(src, tgt)           { return this._mcp.drag({ from_uid: src, to_uid: tgt }); }
   uploadFile(uid, filePath) { return this._mcp.upload_file({ uid, filePath }); }
-  handleDialog(accept, promptText = '') { return this._mcp.handle_dialog({ accept, promptText }); }
-  waitFor(opts)            { return this._mcp.wait_for(opts); }
+  // handle_dialog wire schema is { action: 'accept'|'dismiss', promptText? } — sending
+  // { accept: bool } is rejected by the tool's input validation (and the rejection comes
+  // back as a resolved error-text response, so the failure was silent in production).
+  handleDialog(accept, promptText = '') {
+    const args = { action: accept ? 'accept' : 'dismiss' };
+    if (promptText) args.promptText = promptText;
+    return this._mcp.handle_dialog(args);
+  }
+  // wait_for requires text as a non-empty string ARRAY. A bare string is rejected by
+  // input validation, and { state: 'networkidle' } is not part of the tool's schema at
+  // all — both shapes used to resolve to error text and silently wait for nothing.
+  waitFor(opts = {})       {
+    if (typeof opts.text === 'string') opts = { ...opts, text: [opts.text] };
+    if (opts.state === 'networkidle') return this.#waitForNetworkIdle();
+    return this._mcp.wait_for(opts);
+  }
+  // Bounded network-quiet poll: resolves once the page's resource-timing entry count
+  // is stable across two consecutive 250 ms polls, or after 3 s — whichever is first.
+  async #waitForNetworkIdle() {
+    let prev = -1;
+    for (let i = 0; i < 12; i++) {
+      const raw = await this.evaluate(`() => performance.getEntriesByType('resource').length`);
+      const count = Number(typeof raw === 'object' ? raw?.result ?? 0 : raw) || 0;
+      if (count === prev) return;
+      prev = count;
+      await new Promise(r => setTimeout(r, 250));
+    }
+  }
   // ── Viewport ────────────────────────────────────────────────────────────────
   emulate(viewport)              { return this._mcp.emulate({ viewport }); }
   emulateCpu(rate)               { return this._mcp.emulate({ cpuThrottlingRate: rate }); }
   emulateColorScheme(scheme)     { return this._mcp.emulate({ colorScheme: scheme }); }
-  emulateReducedMotion(pref)     { return this._mcp.emulate({ reducedMotion: pref }); }
+  // chrome-devtools-mcp@1.1.1's emulate tool has no reduced-motion capability — the
+  // unsupported argument comes back as RESOLVED error text ("Unknown argument"), not a
+  // thrown error, so callers' graceful-skip catch paths (motion-analyzer) never ran and
+  // analysis proceeded unemulated. Surface it as a real error; if a future upstream
+  // version adds the argument, the call succeeds and emulation lights up automatically.
+  async emulateReducedMotion(pref) {
+    const resp = await this._mcp.emulate({ reducedMotion: pref });
+    if (typeof resp === 'string' && resp.includes('Unknown argument')) {
+      throw new Error(`emulate does not support reducedMotion in this chrome-devtools-mcp version: ${resp.slice(0, 120)}`);
+    }
+    return resp;
+  }
   resize(w, h)                   { return this._mcp.resize_page({ width: w, height: h }); }
   // ── Network & performance ───────────────────────────────────────────────────
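
Note on the waitFor() and #waitForNetworkIdle() changes in the hunk above: the sketch below restates the two ideas as standalone functions so they can be exercised without Chrome. It is an illustration under assumptions, not the adapter itself. `normalizeWaitForOpts`, `waitUntilStable`, and the `readCount` callback are hypothetical names, and unlike the diff's method this version returns a small status object so the outcome is visible.

```javascript
// The diff states that wait_for requires `text` as a non-empty string array,
// so a bare string is wrapped. Other options pass through unchanged.
function normalizeWaitForOpts(opts = {}) {
  return typeof opts.text === 'string' ? { ...opts, text: [opts.text] } : opts;
}

// Bounded "quiet" poll: resolve once two consecutive reads agree, or after
// maxPolls reads. `readCount` is a hypothetical stand-in for evaluating
// performance.getEntriesByType('resource').length in the page.
async function waitUntilStable(readCount, { maxPolls = 12, intervalMs = 250 } = {}) {
  let prev = -1;
  for (let i = 0; i < maxPolls; i++) {
    const count = Number(await readCount()) || 0;
    if (count === prev) return { stable: true, polls: i + 1 };
    prev = count;
    await new Promise(r => setTimeout(r, intervalMs));
  }
  return { stable: false, polls: maxPolls };
}

// Usage with a fake counter that settles at 5 resources.
const reads = [3, 5, 5];
waitUntilStable(async () => reads.shift() ?? 5, { intervalMs: 1 })
  .then(result => console.log(result));
```

With the defaults (12 polls, 250 ms apart) the wait is capped at roughly 3 s, matching the bound described in the diff comment. A stable resource count is only a heuristic for network idleness: a long-running request that has not completed adds no new entry in the meantime.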
@@ -56,7 +111,19 @@ export class CdpBrowserAdapter {
   // is rejected with an Unknown-argument error). Callers still pass the numeric
   // requestId parsed from list_network_requests.
   getNetworkRequest(reqId) { return this._mcp.get_network_request({ reqid: reqId }); }
-  lighthouse(url, opts = {}) { return this._mcp.lighthouse_audit({ url, ...opts }); }
+  // lighthouse_audit audits the CURRENTLY-NAVIGATED page and accepts only
+  // mode/device/outputDirPath. Passing `url` (or `categories`) is REJECTED with an
+  // "Unknown argument" error that comes back as RESOLVED text — so every Argus Lighthouse
+  // run silently no-op'd (caught upstream as "Lighthouse skipped", scores perpetually N/A).
+  // Navigate to the target first, then audit; strip url/categories defensively so legacy
+  // callers cannot reintroduce the rejected args. mode 'navigation' (the tool default)
+  // reloads + audits. Performance is intentionally excluded by lighthouse_audit (covered by
+  // the web-vitals analyzer) — it returns accessibility/best-practices/seo/agentic-browsing.
+  async lighthouse(url, opts = {}) {
+    if (url) await this.navigate(url);
+    const { url: _ignoredUrl, categories: _ignoredCats, ...valid } = opts;
+    return this._mcp.lighthouse_audit(valid);
+  }
   startTrace()             { return this._mcp.performance_start_trace({}); }
   stopTrace()              { return this._mcp.performance_stop_trace({}); }
   analyzeInsight(opts)     { return this._mcp.performance_analyze_insight(opts); }

package/src/orchestration/crawl-and-report.js CHANGED

@@ -11,6 +11,6 @@
  * continue to import from this file unchanged.
  */
-export { runCrawl, crawlRouteCheap, crawlRouteExpensive } from './orchestrator.js';
+export { runCrawl, crawlRouteCheap, crawlRouteExpensive, checkHttpsRequired } from './orchestrator.js';
 export { processReport, deduplicateFindings, rebuildSummary } from './report-processor.js';
 export { dispatchAll } from './dispatcher.js';

package/src/orchestration/orchestrator.js CHANGED

@@ -370,45 +370,31 @@ function analyzeNetworkPerformance(perfEntries, pageUrl) {
   return bugs;
 }
-// ── Performance Budgets ────────────────────────────────────────────────────────
-async function checkPerformanceBudgets(browser, url) {
-  const violations = [];
+/**
+ * HTTPS-enforcement rule (single source of truth, exported so it can be verified
+ * directly — the harness can only crawl localhost, which is excluded, so the
+ * positive-trigger path has no live fixture).
+ *
+ * Returns a `security_no_https` finding for an http:// page on a non-loopback host,
+ * or null otherwise (https, or any localhost/127.x/::1 address).
+ *
+ * @param {string} url
+ * @returns {{type:string,message:string,severity:string,url:string}|null}
+ */
+export function checkHttpsRequired(url) {
   try {
-    await browser.startTrace();
-    await new Promise(r => setTimeout(r, 3000));
-    const trace    = await browser.stopTrace();
-    const insights = await browser.analyzeInsight({ insightSetId: trace?.insightSetId ?? trace?.id ?? trace });
-    const metrics = insights?.metrics ?? insights?.performanceMetrics ?? {};
-    const checks = [
-      { key: 'LCP',  value: metrics.largestContentfulPaint ?? metrics.LCP,  budget: thresholds.perf.LCP,  unit: 'ms' },
-      { key: 'CLS',  value: metrics.cumulativeLayoutShift  ?? metrics.CLS,  budget: thresholds.perf.CLS,  unit: ''   },
-      { key: 'FID',  value: metrics.totalBlockingTime ?? metrics.TBT ?? metrics.FID, budget: thresholds.perf.FID, unit: 'ms' },
-      { key: 'TTFB', value: metrics.timeToFirstByte   ?? metrics.TTFB,      budget: thresholds.perf.TTFB, unit: 'ms' },
-    ];
-    for (const { key, value, budget, unit } of checks) {
-      if (value == null) continue;
-      if (value > budget) {
-        violations.push({
-          type:      'performance_budget',
-          metric:    key,
-          value:     `${value}${unit}`,
-          budget:    `${budget}${unit}`,
-          message:   `Performance budget exceeded: ${key} = ${value}${unit} (budget: ${budget}${unit})`,
-          severity:  'warning',
-          url,
-        });
-      }
+    const parsed = new URL(url);
+    const isLocalhost = /^(localhost|127\.|::1)/.test(parsed.hostname);
+    if (parsed.protocol === 'http:' && !isLocalhost) {
+      return {
+        type:     'security_no_https',
+        message:  `Page served over HTTP — enforce HTTPS via server redirect or HSTS`,
+        severity: 'warning',
+        url,
+      };
     }
-  } catch (err) {
-    logger.warn(`[ARGUS] Performance trace skipped for ${url}: ${err.message}`);
-  }
-  return violations;
+  } catch { /* URL parse failure */ }
+  return null;
 }
 // ── Cheap Crawl (called ×2 for flakiness detection) ───────────────────────────
@@ -417,7 +403,7 @@ async function checkPerformanceBudgets(browser, url) {
  * Cheap detections for one route.
  * Runs: console, network, JS errors, blank page, API frequency, contracts,
  *       SEO, security, content, CSS, debugger statements, duplicate ids, screenshot.
- * Does NOT run: Lighthouse, perf budgets, network perf, redirect chain, broken links, cache headers.
+ * Does NOT run: Lighthouse, network perf, redirect chain, broken links, cache headers.
  */
 export async function crawlRouteCheap(route, baseUrl, mcp) {
   const browser = new CdpBrowserAdapter(mcp);
@@ -721,19 +707,9 @@ export async function crawlRouteCheap(route, baseUrl, mcp) {
     logger.warn(`[ARGUS] Issues analysis skipped for ${url}: ${err.message}`);
   }
-  // 9f. HTTPS enforcement check
-  try {
-    const parsed = new URL(url);
-    const isLocalhost = /^(localhost|127\.|::1)/.test(parsed.hostname);
-    if (parsed.protocol === 'http:' && !isLocalhost) {
-      result.errors.push({
-        type:     'security_no_https',
-        message:  `Page served over HTTP — enforce HTTPS via server redirect or HSTS`,
-        severity: 'warning',
-        url,
-      });
-    }
-  } catch { /* URL parse failure */ }
+  // 9f. HTTPS enforcement check (shared rule — see checkHttpsRequired above)
+  const httpsFinding = checkHttpsRequired(url);
+  if (httpsFinding) result.errors.push(httpsFinding);
   // 10. Deduplicate within this cheap run
   result.errors = deduplicateErrors(result.errors);
@@ -757,8 +733,11 @@ export async function crawlRouteCheap(route, baseUrl, mcp) {
 /**
  * Expensive/deterministic analyzers for one route — called ONCE per route.
- * Runs: network perf, redirect chain, perf budgets, Lighthouse,
+ * Runs: network perf, redirect chain, Lighthouse,
  *       broken internal links, cache headers.
+ * (Core Web Vitals — LCP/CLS/TTFB — are emitted by the web-vitals-analyzer
+ *  registerExpensive plugin, which is headless-compatible; the old trace-based
+ *  perf-budget path was removed as dead + redundant.)
  */
 export async function crawlRouteExpensive(route, baseUrl, mcp) {
   const browser = new CdpBrowserAdapter(mcp);
@@ -805,9 +784,6 @@ export async function crawlRouteExpensive(route, baseUrl, mcp) {
     logger.warn(`[ARGUS] Redirect chain check skipped for ${url}: ${err.message}`);
   }
-  // Performance budget check
-  errors.push(...(await checkPerformanceBudgets(browser, url)));
   // Full Lighthouse audit (capped at LIGHTHOUSE_TIMEOUT_MS to prevent indefinite hang)
   errors.push(...(await Promise.race([
     checkLighthouse(browser, url),
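
Note on checkHttpsRequired() in the diff above: because the rule is now exported, it can be probed directly. The sketch below copies its logic (with a shortened message) and runs it against a few URLs. Two edge cases appear to follow from how the WHATWG URL parser reports hostnames and from the prefix-only regex: a bracketed IPv6 loopback is reported as `[::1]`, which the pattern does not match, and any hostname that merely starts with `localhost` is treated as loopback. `isLoopbackHost` is a hypothetical stricter variant shown for comparison only; it is not part of the package.

```javascript
// Same logic as the exported rule in the diff; message text shortened here.
function checkHttpsRequired(url) {
  try {
    const parsed = new URL(url);
    const isLocalhost = /^(localhost|127\.|::1)/.test(parsed.hostname);
    if (parsed.protocol === 'http:' && !isLocalhost) {
      return { type: 'security_no_https', message: 'Page served over HTTP', severity: 'warning', url };
    }
  } catch { /* URL parse failure */ }
  return null;
}

// Hypothetical stricter loopback test (an assumption, not the package's code):
// strips IPv6 brackets and requires an exact name or a full 127.x.x.x address.
function isLoopbackHost(hostname) {
  const host = hostname.replace(/^\[|\]$/g, '').toLowerCase();
  return host === 'localhost' || host === '::1' || /^127(\.\d{1,3}){3}$/.test(host);
}

const samples = [
  'https://example.com/',           // https: no finding
  'http://example.com/',            // plain http on a public host: finding
  'http://localhost:3000/',         // loopback name: no finding
  'http://127.0.0.1:8080/',         // loopback IPv4: no finding
  'http://[::1]:3000/',             // hostname is "[::1]", so the regex misses it
  'http://localhost.example.com/',  // prefix match treats this as loopback
  'not a url',                      // parse failure: no finding
];

for (const url of samples) {
  const finding = checkHttpsRequired(url);
  console.log(url, finding ? finding.type : null);
}
```

If those two edge cases matter for a given deployment, an exact-match check along the lines of `isLoopbackHost` is one possible tightening; whether the package intends the looser behaviour is not stated in the diff.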

package/src/utils/issues-analyzer.js CHANGED

@@ -31,7 +31,10 @@ const CLASSIFIERS = [
   {
     type:             'cors_violation',
     issueTypePattern: /cors/i,
-    textPattern:      /cors policy|cross.origin.*blocked|access.control.allow.origin/i,
+    // Live Chrome 149 surfaces CorsIssue in the panel as e.g.
+    // "Ensure CORS response header values are valid" — the older phrase-specific
+    // patterns missed it, so the \bcors\b word-match anchors any CORS issue title.
+    textPattern:      /cors policy|cross.origin.*blocked|access.control.allow.origin|\bcors\b/i,
     severity:         (isCritical) => isCritical ? 'critical' : 'warning',
   },
   {
@@ -49,7 +52,10 @@ const CLASSIFIERS = [
   {
     type:             'cookie_attribute_missing',
     issueTypePattern: /cookie/i,
-    textPattern:      /samesite|secure attribute|partitioned|cookie.*rejected|set-cookie.*blocked/i,
+    // Live Chrome 149 surfaces the SameSite=None-without-Secure cookie Issue as
+    // "Mark cross-site cookies as Secure to allow setting them in cross-site contexts"
+    // — neither "samesite" nor "secure attribute" appear, so match those phrasings too.
+    textPattern:      /samesite|secure attribute|partitioned|cookie.*rejected|set-cookie.*blocked|cross-site cookies?|cookies? as secure/i,
     severity:         () => 'warning',
   },
   {
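
Note on the two textPattern changes above: the effect of the widened patterns can be checked against the issue titles quoted in the diff comments. The snippet below is a small sketch that copies the old and new regular expressions verbatim and tests them against those two titles. The variable names are illustrative, and the "Martin Scorsese" string is an invented negative example showing that the `\bcors\b` word boundary avoids substring hits.

```javascript
// Old and new patterns, copied from the removed (-) and added (+) lines.
const corsBefore = /cors policy|cross.origin.*blocked|access.control.allow.origin/i;
const corsAfter  = /cors policy|cross.origin.*blocked|access.control.allow.origin|\bcors\b/i;

const cookieBefore = /samesite|secure attribute|partitioned|cookie.*rejected|set-cookie.*blocked/i;
const cookieAfter  = /samesite|secure attribute|partitioned|cookie.*rejected|set-cookie.*blocked|cross-site cookies?|cookies? as secure/i;

// Issue titles quoted in the diff comments.
const corsTitle   = 'Ensure CORS response header values are valid';
const cookieTitle = 'Mark cross-site cookies as Secure to allow setting them in cross-site contexts';

console.log('cors   before/after:', corsBefore.test(corsTitle), corsAfter.test(corsTitle));
console.log('cookie before/after:', cookieBefore.test(cookieTitle), cookieAfter.test(cookieTitle));

// The word boundary keeps "cors" from matching inside a longer word.
console.log('substring guard:', corsAfter.test('Martin Scorsese'));
```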

package/src/utils/lighthouse-checker.js CHANGED

@@ -5,12 +5,49 @@
  * checkLighthouse directly without pulling in the Slack-initialised orchestrator.
  */
+import fs from 'node:fs';
 import { registerExpensive } from '../registry.js';
 import { thresholds }        from '../config/targets.js';
 import { childLogger }       from './logger.js';
 const logger = childLogger('lighthouse-checker');
+/**
+ * Parse a chrome-devtools-mcp `lighthouse_audit` response into the Lighthouse
+ * result shape this module consumes: `{ categories, audits }` (category scores 0–1,
+ * `audits` keyed by id). The tool returns markdown with a "### Reports" section that
+ * points at a full `report.json`; we read that for complete category scores +
+ * per-audit detail (`auditRefs`, `title`, `description`). If the file is unavailable
+ * we fall back to the markdown "### Category Scores" block (scores only, no audits).
+ * Returns `{ categories: {}, audits: {} }` when nothing parses — never throws.
+ *
+ * @param {string} responseText - raw lighthouse_audit response (markdown text)
+ * @returns {{ categories: object, audits: object }}
+ */
+export function parseLighthouseReport(responseText) {
+  const text = String(responseText ?? '');
+  // Prefer the authoritative report.json (categories + auditRefs + per-audit detail).
+  const pathMatch = text.match(/([A-Za-z]:\\[^\r\n]*?report\.json|\/[^\r\n]*?report\.json)/);
+  if (pathMatch) {
+    try {
+      const json = JSON.parse(fs.readFileSync(pathMatch[1].trim(), 'utf8'));
+      if (json && typeof json === 'object' && json.categories) {
+        return { categories: json.categories, audits: json.audits ?? {} };
+      }
+    } catch { /* fall through to the markdown scores */ }
+  }
+  // Fallback: synthesize categories from the "### Category Scores" markdown block,
+  // e.g. "- Accessibility: 96 (accessibility)". Scores normalised to 0–1 to match report.json.
+  const categories = {};
+  const block = text.match(/### Category Scores\s*\n([\s\S]*?)(?:\n###|\s*$)/);
+  if (block) {
+    for (const m of block[1].matchAll(/^\s*-\s+.+?:\s*([\d.]+)\s*\(([\w-]+)\)\s*$/gm)) {
+      categories[m[2]] = { id: m[2], score: Number(m[1]) / 100 };
+    }
+  }
+  return { categories, audits: {} };
+}
 const LIGHTHOUSE_LABELS = {
   accessibility:    'Accessibility',
   performance:      'Performance',
@@ -39,13 +76,16 @@ export async function checkLighthouse(browser, url) {
   const LIGHTHOUSE_TIMEOUT_MS = parseInt(process.env.ARGUS_LIGHTHOUSE_TIMEOUT ?? '120000', 10);
   try {
-    const auditPromise = browser.lighthouse(url, {
-      categories: ['accessibility', 'performance', 'seo', 'best-practices'],
-    });
+    // browser.lighthouse navigates to url + audits the current page. lighthouse_audit
+    // returns markdown referencing a full report.json — parseLighthouseReport reads that
+    // back into the { categories, audits } shape this function consumes. Performance is
+    // excluded by the tool (covered by web-vitals); thresholds.lighthouse.performance is
+    // simply skipped below when its category is absent.
+    const auditPromise = browser.lighthouse(url);
     const timeoutPromise = new Promise((_, reject) =>
       setTimeout(() => reject(new Error(`Lighthouse timed out after ${LIGHTHOUSE_TIMEOUT_MS / 1000}s`)), LIGHTHOUSE_TIMEOUT_MS)
     );
-    const result = await Promise.race([auditPromise, timeoutPromise]);
+    const result = parseLighthouseReport(await Promise.race([auditPromise, timeoutPromise]));
     const categories = result?.categories ?? {};
     const audits     = result?.audits     ?? {};
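
Note on parseLighthouseReport() in the diff above: the markdown fallback path can be illustrated without touching the filesystem. The sketch below reuses the two regular expressions from the added code on an invented sample response. The sample text is an assumption modelled on the single example line quoted in the diff comment ("- Accessibility: 96 (accessibility)"); the real tool output may differ. `parseCategoryScores` is a hypothetical name for the fallback logic alone.

```javascript
// Fallback-only sketch: build { id, score } entries from the
// "### Category Scores" block, normalising 0-100 scores to 0-1.
function parseCategoryScores(responseText) {
  const text = String(responseText ?? '');
  const categories = {};
  const block = text.match(/### Category Scores\s*\n([\s\S]*?)(?:\n###|\s*$)/);
  if (block) {
    for (const m of block[1].matchAll(/^\s*-\s+.+?:\s*([\d.]+)\s*\(([\w-]+)\)\s*$/gm)) {
      categories[m[2]] = { id: m[2], score: Number(m[1]) / 100 };
    }
  }
  return categories;
}

// Invented sample response (an assumption about the general shape only).
const sample = [
  '## Lighthouse Audit',
  '### Category Scores',
  '- Accessibility: 96 (accessibility)',
  '- Best Practices: 100 (best-practices)',
  '- SEO: 90 (seo)',
  '',
  '### Reports',
  '- (path elided)',
].join('\n');

console.log(parseCategoryScores(sample));
console.log(parseCategoryScores('no scores here')); // nothing parses: empty object
```

As in the diff, a response with no recognisable block yields an empty result rather than an error, so downstream threshold checks simply find no categories to compare.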

package/src/utils/mcp-client.js CHANGED

@@ -181,11 +181,14 @@ export async function createMcpClient() {
         // MCP returns { content: [{ type, text|data }] } — extract the value
         const content = result?.content;
         if (Array.isArray(content) && content.length > 0) {
-          const item = content[0];
-          if (item.type === 'image') {
-            // take_screenshot returns base64 image data — return in a shape callers expect
-            return { data: item.data, mimeType: item.mimeType ?? 'image/png' };
+          // take_screenshot returns [text caption, image] — the image is NOT content[0],
+          // so scan the whole array for it. Reading only content[0] returned the caption
+          // string and starved every screenshot consumer of image data.
+          const img = content.find(c => c.type === 'image');
+          if (img) {
+            return { data: img.data, mimeType: img.mimeType ?? 'image/png' };
           }
+          const item = content[0];
           if (item.type === 'text') {
             const text = item.text;
             // chrome-devtools-mcp wraps evaluate_script results in a markdown code block: