argusqa-os 9.7.3 → 9.7.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,394 +1,394 @@
1
- <div align="center">
2
-
3
- # Argus — AI-Powered Automated QA
4
-
5
- [![npm](https://img.shields.io/npm/v/argusqa-os?color=7C3AED)](https://www.npmjs.com/package/argusqa-os)
6
- [![MCP Server](https://glama.ai/mcp/servers/ironclawdevs27/Argus/badges/card.svg)](https://glama.ai/mcp/servers/ironclawdevs27/Argus)
7
- [![Harness](https://img.shields.io/badge/harness-679%2F679-4ADE80)](test-harness/)
8
- [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
9
-
10
- **Argus catches the bugs your test suite misses — visual regressions, API loops, CSS drift, console noise, accessibility failures, and more — and delivers rich reports to Slack (or a local HTML dashboard).**
11
-
12
- [Quick Start](#quick-start) · [Features](#what-argus-catches) · [Setup](#full-setup) · [MCP Tools](#mcp-tools) · [CLI Commands](#cli-commands) · [Troubleshooting](#troubleshooting) · [Full Reference](REFERENCE.md)
13
-
14
- </div>
15
-
16
- ---
17
-
18
- ## Quick Start
19
-
20
- > **No install required.** `npx` auto-downloads Argus on first run.
21
-
22
- **Step 1 — Add to `.mcp.json`** in your project root:
23
-
24
- ```json
25
- {
26
- "mcpServers": {
27
- "chrome-devtools": { "command": "npx", "args": ["-y", "chrome-devtools-mcp@latest"] },
28
- "argus": { "command": "npx", "args": ["-y", "argusqa-os"] }
29
- }
30
- }
31
- ```
32
-
33
- Or via Claude Code CLI:
34
-
35
- ```bash
36
- claude mcp add chrome-devtools -- npx -y chrome-devtools-mcp@latest
37
- claude mcp add argus -- npx -y argusqa-os
38
- ```
39
-
40
- **Step 2 — Start Chrome with remote debugging:**
41
-
42
- ```bash
43
- # macOS
44
- open -a "Google Chrome" --args --remote-debugging-port=9222 --headless=new
45
-
46
- # Windows (PowerShell)
47
- & "C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222 --headless=new --no-sandbox --disable-gpu --user-data-dir="$env:TEMP\chrome-argus"
48
-
49
- # Linux
50
- google-chrome --remote-debugging-port=9222 --headless=new --no-sandbox
51
- ```
52
-
53
- **Step 3 — Run an audit:**
54
-
55
- ```
56
- Run argus_audit on http://localhost:3000
57
- ```
58
-
59
- Argus scans your app and either posts findings to Slack or opens a local `report.html`. That's it.
60
-
61
- ---
62
-
63
- ## What Argus Catches
64
-
65
- 32 analysis engines, 140 distinct issue types, zero test-file maintenance:
66
-
67
- | Category | What it detects |
68
- |---|---|
69
- | **JavaScript** | Uncaught exceptions, unhandled promise rejections, `console.error` on critical routes |
70
- | **Network & API** | HTTP 5xx, 401/403 auth failures, duplicate API calls (infinite loops), 4xx errors, broken links |
71
- | **Performance** | LCP > 2500ms, CLS > 0.1, TTFB > 800ms, slow APIs > 1s/3s, payloads > 500KB/2MB, JS bundles > 500KB |
72
- | **Accessibility** | axe-core (80+ WCAG rules), color-blind simulation, missing ARIA, keyboard focus, heading hierarchy |
73
- | **SEO** | Missing meta description, OG tags, canonical, viewport, h1 |
74
- | **Security** | Auth tokens in localStorage/URL, `eval()`, missing CSP/X-Frame-Options, CSP violations, missing SRI on external scripts, source map exposure, open redirects, npm CVEs |
75
- | **CSS** | Cascade overrides, component style leaks, unused rules, React inline style conflicts |
76
- | **Content** | `null`/`undefined` as visible text, lorem ipsum, broken images, empty data lists |
77
- | **Responsive** | Horizontal overflow at 375px/768px, touch targets < 44×44px |
78
- | **Memory** | Detached DOM nodes via V8 heap snapshot, heap growth across navigation |
79
- | **Visual** | Pixel-level screenshot regression via pixelmatch (≥0.1% warning, ≥5% critical) |
80
- | **Figma** | Design-to-implementation fidelity — 13 property types (color, spacing, typography, shadows, etc.) |
81
- | **Forms** | Missing `required`, `autocomplete`, `aria-describedby`; unlabelled inputs |
82
- | **Fonts** | FOIT, FOUT, missing fallbacks, slow loads > 1s, suboptimal formats |
83
- | **Motion** | `prefers-reduced-motion` violations, `autoplay` without pause controls |
84
- | **Network baseline** | New requests, missing requests, status-code regressions vs saved HAR baseline |
85
- | **Environment diff** | Dev vs staging — screenshot diff, DOM changes, console/network regressions |
86
-
87
- And every finding is post-processed with:
88
-
89
- | Post-processor | What it adds |
90
- |---|---|
91
- | **Intelligent baseline filtering** | Findings that flip-flop across runs are tagged `noisy` and downgraded to info — pure cross-run heuristics, no API calls (`ARGUS_NOISE_FILTER=0` to disable) |
92
- | **Root cause linking** | New findings are annotated with the recent git commits and files most likely to have caused them (`ARGUS_ROOT_CAUSE=0` to disable) |
93
-
94
- > All findings are classified as `critical` / `warning` / `info` and routed to the right Slack channel — or surfaced in the local HTML report. For per-finding severity tables and detection methods, see [REFERENCE.md](REFERENCE.md).
95
-
96
- ---
97
-
98
- ## MCP Tools
99
-
100
- Ask Claude (or any MCP client) — no terminal required:
101
-
102
- | Tool | Description |
103
- |---|---|
104
- | `argus_audit` | Fast pass — JS, network, accessibility, SEO, security, CSS, content |
105
- | `argus_audit_full` | Deep pass — adds Lighthouse, responsive checks, memory leak detection, hover-state bugs |
106
- | `argus_compare` | Diff dev vs staging — screenshots, findings delta, environment regressions |
107
- | `argus_get_context` | Capture everything broken on the open tab for Claude to diagnose |
108
- | `argus_watch_snapshot` | Snapshot the open tab without navigating (preserves auth/form state) |
109
- | `argus_last_report` | Return last JSON report without re-running |
110
- | `argus_design_audit` | Figma URL → 13 design-token finding types (color, spacing, typography, shadows, etc.) |
111
- | `argus_visual_diff` | Screenshot baseline comparison. Pass `updateBaseline: true` to reset. |
112
- | `argus_pr_validate` | Fetch GitHub PR diff → map changed files to affected routes → targeted audit → `{ blocked, findings }` |
113
-
114
- **Example prompts:**
115
-
116
- ```
117
- Run argus_audit on http://localhost:3000/checkout
118
- Run argus_audit_full on http://localhost:3000/dashboard
119
- Run argus_compare
120
- Run argus_get_context
121
- ```
122
-
123
- ---
124
-
125
- ## Full Setup
126
-
127
- ### Prerequisites
128
-
129
- | Requirement | Version |
130
- |---|---|
131
- | Node.js | v20.19+ |
132
- | Chrome | Stable (desktop or headless) |
133
- | Claude Code | Latest (`npm install -g @anthropic-ai/claude-code`) |
134
- | Slack workspace | **Optional** — omit for local `report.html` mode |
135
-
136
- ---
137
-
138
- ### Option A — MCP Server *(recommended for Claude Code users)*
139
-
140
- No local install needed. Use the [Quick Start](#quick-start) above, then add your target URL:
141
-
142
- ```env
143
- # .env in your project root
144
- TARGET_DEV_URL=http://localhost:3000
145
- TARGET_STAGING_URL=https://staging.example.com # optional — enables argus_compare
146
- ```
147
-
148
- **Optional — Slack notifications:**
149
-
150
- 1. [api.slack.com/apps](https://api.slack.com/apps) → Create New App → name it **BugBot**
151
- 2. OAuth & Permissions → Bot Token Scopes: `chat:write`, `files:write`, `files:read`
152
- 3. Install to workspace → copy the `xoxb-...` token
153
- 4. Create channels `#bugs-critical`, `#bugs-warnings`, `#bugs-digest` and run `/invite @BugBot` in each
154
-
155
- ```env
156
- SLACK_BOT_TOKEN=xoxb-...
157
- SLACK_CHANNEL_CRITICAL=C0000000000
158
- SLACK_CHANNEL_WARNINGS=C0000000001
159
- SLACK_CHANNEL_DIGEST=C0000000002
160
- ```
161
-
162
- > Without Slack: Argus auto-generates `reports/report.html` and opens it in your browser — zero extra config.
163
-
164
- ---
165
-
166
- ### Option B — npm Package (CI / dev dependency)
167
-
168
- ```bash
169
- npm install --save-dev argusqa-os
170
- npx argus init # interactive wizard — detects framework, discovers routes, writes .env
171
- npm run crawl # run after Chrome is started
172
- ```
173
-
174
- ---
175
-
176
- ### Option C — Clone the Repository (contributors / full source)
177
-
178
- ```bash
179
- git clone https://github.com/ironclawdevs27/Argus.git
180
- cd Argus
181
- npm install
182
- npm run init # interactive setup wizard
183
- ```
184
-
185
- **Manual setup (skip the wizard):**
186
-
187
- ```bash
188
- cp .env.example .env
189
- # Fill in TARGET_DEV_URL and optional Slack tokens
190
- ```
191
-
192
- Then configure your routes in [src/config/targets.js](src/config/targets.js):
193
-
194
- ```js
195
- export const routes = [
196
- { path: '/', name: 'Home', critical: true, waitFor: 'main' },
197
- { path: '/login', name: 'Login', critical: true, waitFor: 'form' },
198
- { path: '/dashboard', name: 'Dashboard', critical: true, waitFor: '[data-testid="dashboard"]' },
199
- { path: '/settings', name: 'Settings', critical: false, waitFor: null },
200
- ];
201
- ```
202
-
203
- - `critical: true` — errors on this route go to `#bugs-critical`
204
- - `waitFor` — CSS selector Argus waits for before capturing (signals page-ready)
205
-
206
- ---
207
-
208
- ## CLI Commands
209
-
210
- ```bash
211
- npm run chrome # Launch Chrome with --remote-debugging-port=9222 (auto-detects binary)
212
- npm run doctor # Pre-flight check: Chrome reachable, .mcp.json valid, .env has TARGET_DEV_URL
213
- npm run crawl # Batch audit of all configured routes
214
- npm run compare # Dev vs staging diff (CSS-only if no staging URL)
215
- npm run watch # Passive monitor — polls open Chrome tab every 1s
216
- npm run report:html # Generate reports/report.html from last JSON audit
217
- npm run report:pdf # Export HTML report to A4 PDF (requires: npm install puppeteer)
218
- npm run server # Start Slack slash-command server (port 3001)
219
- npm run init # Interactive setup wizard
220
- npm run test:unit # 61 unit tests — no Chrome required
221
- npm run test:harness # 139-block correctness harness — requires Chrome
222
- npm run test:harness:log # same, but tees full output to harness-results.txt
223
- ```
224
-
225
- **Watch mode** — live monitoring as you develop:
226
-
227
- ```bash
228
- # Terminal 1: start your app
229
- npm run dev
230
-
231
- # Terminal 2: start Argus watcher
232
- npm run watch
233
- # Ctrl+C → stops monitor and writes reports/report.html
234
- ```
235
-
236
- **Slack slash command** (on-demand from any channel):
237
-
238
- ```
239
- /argus-retest https://staging.example.com/checkout
240
- ```
241
-
242
- To expose the server via tunnel: `cloudflared tunnel --url http://localhost:3001` (free, no account required). Set the resulting URL as the Request URL in Slack App → Slash Commands.
243
-
244
- ---
245
-
246
- ## GitHub Actions CI
247
-
248
- Add to your repo's secrets (Settings → Secrets → Actions):
249
-
250
- | Secret | Required | Value |
251
- |---|---|---|
252
- | `TARGET_STAGING_URL` | Yes | Your staging base URL |
253
- | `SLACK_BOT_TOKEN` | No | `xoxb-...` token (omit for HTML-only mode) |
254
- | `SLACK_CHANNEL_CRITICAL` | No* | Channel ID (needed when Slack is configured) |
255
- | `SLACK_CHANNEL_WARNINGS` | No* | Channel ID |
256
- | `SLACK_CHANNEL_DIGEST` | No* | Channel ID |
257
- | `GITHUB_TOKEN` | No | Auto-injected by Actions for PR comments + Check Runs |
258
-
259
- The included [workflow](.github/workflows/argus.yml) runs on push to `main`, daily at 6 AM UTC, and on manual trigger. If critical issues are found, the pipeline fails.
260
-
261
- ---
262
-
263
- ## Environment Variables
264
-
265
- <details>
266
- <summary>Full reference (click to expand)</summary>
267
-
268
- | Variable | Default | Description |
269
- |---|---|---|
270
- | `TARGET_DEV_URL` | — | **Required.** Base URL of your dev environment |
271
- | `TARGET_STAGING_URL` | — | Staging URL — enables `argus_compare`; omit for CSS-only mode |
272
- | `SLACK_BOT_TOKEN` | — | `xoxb-...` token. Omit for local `report.html` mode |
273
- | `SLACK_SIGNING_SECRET` | — | For `/argus-retest` slash command verification |
274
- | `SLACK_CHANNEL_CRITICAL` | — | Channel ID for critical bugs |
275
- | `SLACK_CHANNEL_WARNINGS` | — | Channel ID for warnings |
276
- | `SLACK_CHANNEL_DIGEST` | — | Channel ID for info / daily digest |
277
- | `PORT` | `3001` | Slack slash-command server port |
278
- | `REPORT_OUTPUT_DIR` | `./reports` | Where to write JSON reports |
279
- | `ARGUS_CONCURRENCY` | `1` | Parallel MCP clients for route crawling |
280
- | `ARGUS_LOG_LEVEL` | `info` | `trace` / `debug` / `info` / `warn` / `error` |
281
- | `ARGUS_LOG_PRETTY` | — | Set `1` for human-readable logs in dev |
282
- | `ARGUS_RETRY_ATTEMPTS` | `3` | Max retries for `navigate`/`fill` MCP calls |
283
- | `ARGUS_WATCH_INTERVAL_MS` | `1000` | Watch mode poll interval (ms) |
284
- | `ARGUS_WATCH_UI_PORT` | `3002` | Watch mode web dashboard port |
285
- | `ARGUS_SOURCE_DIR` | — | App source path — enables env-var / feature-flag / dead-route analysis |
286
- | `ARGUS_ENV_FILE` | — | Path to app `.env` for codebase cross-reference |
287
- | `SCREENSHOT_DIFF_THRESHOLD` | `0.5` | Pixel diff % threshold for environment comparison |
288
- | `GITHUB_TOKEN` | — | For PR comments + Check Runs |
289
- | `GITHUB_REPOSITORY` | — | `owner/repo` format |
290
- | `GITHUB_PR_NUMBER` | — | Auto-injected by Actions from PR context |
291
- | `ARGUS_CRITICAL_THRESHOLD` | `1` | New criticals before blocking merge (0 = never block) |
292
- | `ARGUS_DIFF_IMAGE_URL` | — | Visual diff image URL to embed in PR comment |
293
- | `OTEL_EXPORTER_OTLP_ENDPOINT` | — | OTLP collector for Jaeger / Grafana Tempo |
294
- | `FIGMA_API_TOKEN` | — | Required for `argus_design_audit` |
295
- | `FONT_SLOW_MS` | `1000` | Slow web font load threshold (ms) |
296
- | `A11Y_CONTRAST_AA` | `4.5` | WCAG AA min contrast ratio for CVD simulation |
297
-
298
- </details>
299
-
300
- ---
301
-
302
- ## Troubleshooting
303
-
304
- **Chrome DevTools MCP not connecting**
305
- ```bash
306
- claude mcp add chrome-devtools -- npx chrome-devtools-mcp@latest
307
- # Restart Claude Code after adding
308
- ```
309
-
310
- **Slack messages not posting**
311
- - Token must start with `xoxb-` (not `xoxp-`, `xoxe-`, or `xapp-`)
312
- - Run `/invite @BugBot` in each channel
313
- - Required scopes: `chat:write`, `files:write`, `files:read`
314
-
315
- **Screenshots are blank**
316
- - Page hasn't settled — increase `pageSettleMs` in `src/config/targets.js` or add a `waitFor` selector for the route
317
-
318
- **`/argus-retest` returns "dispatch_failed"**
319
- - Tunnel URL changed — update the Request URL in Slack App → Slash Commands and reinstall
320
-
321
- **CSS analysis returns empty results**
322
- - Page may be behind auth — ensure you're logged in on the Chrome instance Argus is controlling
323
-
324
- **CI pipeline fails immediately**
325
- - Chrome may not start fast enough — increase `sleep 3` to `sleep 5` in [.github/workflows/argus.yml](.github/workflows/argus.yml)
326
-
327
- ---
328
-
329
- ## How Argus Differs From Playwright / Cypress
330
-
331
- Argus is a **complementary layer**, not a replacement for unit or E2E tests:
332
-
333
- | | Playwright / Cypress | Argus |
334
- |---|---|---|
335
- | **Purpose** | Test your logic and API contracts | Catch what the user actually sees |
336
- | **What it catches** | Regressions in behavior | CSS drift, visual regressions, API loops, console noise, perf budgets |
337
- | **When it runs** | In your test suite | Continuously, on the live running app |
338
- | **Setup** | Write test files | Configure routes in `targets.js` |
339
- | **Output** | Pass / fail | Structured Slack reports with screenshots |
340
-
341
- ---
342
-
343
- ## Known Limitations
344
-
345
- All 679 harness assertions pass (`679/679`) — there are currently no known MCP- or Chrome-layer restrictions. Soft assertions (Lighthouse, performance traces) still require non-headless Chrome and are skipped in headless CI.
346
-
347
- ---
348
-
349
- ## Project Structure
350
-
351
- ```
352
- src/
353
- argus.js — single-page audit entry point
354
- mcp-server.js — 9 MCP tools exposed to Claude / any MCP client
355
- orchestration/ — crawl loop, Slack/GitHub dispatch, env comparison, watch mode
356
- utils/ — 32 analysis engines (accessibility, security, performance, PDF, recording, etc.)
357
- adapters/browser.js — CdpBrowserAdapter — wraps all chrome-devtools-mcp calls
358
- config/targets.js — routes, thresholds, auth steps
359
- cli/
360
- init.js — argus init interactive setup wizard
361
- chrome-launcher.js — npm run chrome / argus-chrome — launches Chrome with correct flags
362
- doctor.js — npm run doctor / argus-doctor — pre-flight checks
363
- pr-validate.js — headless CI entry point for GitHub Actions
364
- test-harness/ — 139-block correctness harness, 664 hard assertions, 62 fixture pages
365
- test/unit/ — 61 Vitest unit tests (no Chrome required)
366
- landing/ — Product landing page (React 19 + Vite + Tailwind)
367
- ```
368
-
369
- Full source map → [CLAUDE.md](CLAUDE.md) · MCP/DSL reference → [SKILL.md](SKILL.md)
370
-
371
- ---
372
-
373
- ## Contributing
374
-
375
- 1. Fork the repo and create a branch
376
- 2. `npm run test:unit` — verify without Chrome (61 tests)
377
- 3. `npm run test:harness` — full integration coverage (requires Chrome on port 9222)
378
- 4. Open a PR — Argus audits itself via the CI workflow
379
-
380
- ---
381
-
382
- ## License
383
-
384
- MIT © [ironclawdevs27](https://github.com/ironclawdevs27)
385
-
386
- ---
387
-
388
- <div align="center">
389
-
390
- *Argus Panoptes — the all-seeing giant of Greek mythology who never slept.*
391
-
392
- [argus-qa.com](https://argus-qa.com) · [npm](https://www.npmjs.com/package/argusqa-os) · [MCP Registry](https://glama.ai/mcp/servers/ironclawdevs27/Argus)
393
-
394
- </div>
1
+ <div align="center">
2
+
3
+ # Argus — AI-Powered Automated QA
4
+
5
+ [![npm](https://img.shields.io/npm/v/argusqa-os?color=7C3AED)](https://www.npmjs.com/package/argusqa-os)
6
+ [![MCP Server](https://glama.ai/mcp/servers/ironclawdevs27/Argus/badges/card.svg)](https://glama.ai/mcp/servers/ironclawdevs27/Argus)
7
+ [![Harness](https://img.shields.io/badge/harness-688%2F688-4ADE80)](test-harness/)
8
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
9
+
10
+ **Argus catches the bugs your test suite misses — visual regressions, API loops, CSS drift, console noise, accessibility failures, and more — and delivers rich reports to Slack (or a local HTML dashboard).**
11
+
12
+ [Quick Start](#quick-start) · [Features](#what-argus-catches) · [Setup](#full-setup) · [MCP Tools](#mcp-tools) · [CLI Commands](#cli-commands) · [Troubleshooting](#troubleshooting) · [Full Reference](REFERENCE.md)
13
+
14
+ </div>
15
+
16
+ ---
17
+
18
+ ## Quick Start
19
+
20
+ > **No install required.** `npx` auto-downloads Argus on first run.
21
+
22
+ **Step 1 — Add to `.mcp.json`** in your project root:
23
+
24
+ ```json
25
+ {
26
+ "mcpServers": {
27
+ "chrome-devtools": { "command": "npx", "args": ["-y", "chrome-devtools-mcp@latest"] },
28
+ "argus": { "command": "npx", "args": ["-y", "argusqa-os"] }
29
+ }
30
+ }
31
+ ```
32
+
33
+ Or via Claude Code CLI:
34
+
35
+ ```bash
36
+ claude mcp add chrome-devtools -- npx -y chrome-devtools-mcp@latest
37
+ claude mcp add argus -- npx -y argusqa-os
38
+ ```
39
+
40
+ **Step 2 — Start Chrome with remote debugging:**
41
+
42
+ ```bash
43
+ # macOS
44
+ open -a "Google Chrome" --args --remote-debugging-port=9222 --headless=new
45
+
46
+ # Windows (PowerShell)
47
+ & "C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222 --headless=new --no-sandbox --disable-gpu --user-data-dir="$env:TEMP\chrome-argus"
48
+
49
+ # Linux
50
+ google-chrome --remote-debugging-port=9222 --headless=new --no-sandbox
51
+ ```
52
+
53
+ **Step 3 — Run an audit:**
54
+
55
+ ```
56
+ Run argus_audit on http://localhost:3000
57
+ ```
58
+
59
+ Argus scans your app and either posts findings to Slack or opens a local `report.html`. That's it.
60
+
61
+ ---
62
+
63
+ ## What Argus Catches
64
+
65
+ 32 analysis engines, 140 distinct issue types, zero test-file maintenance:
66
+
67
+ | Category | What it detects |
68
+ |---|---|
69
+ | **JavaScript** | Uncaught exceptions, unhandled promise rejections, `console.error` on critical routes |
70
+ | **Network & API** | HTTP 5xx, 401/403 auth failures, duplicate API calls (infinite loops), 4xx errors, broken links |
71
+ | **Performance** | LCP > 2500ms, CLS > 0.1, TTFB > 800ms, slow APIs > 1s/3s, payloads > 500KB/2MB, JS bundles > 500KB |
72
+ | **Accessibility** | axe-core (80+ WCAG rules), color-blind simulation, missing ARIA, keyboard focus, heading hierarchy |
73
+ | **SEO** | Missing meta description, OG tags, canonical, viewport, h1 |
74
+ | **Security** | Auth tokens in localStorage/URL, `eval()`, missing CSP/X-Frame-Options, CSP violations, missing SRI on external scripts, source map exposure, open redirects, npm CVEs |
75
+ | **CSS** | Cascade overrides, component style leaks, unused rules, React inline style conflicts |
76
+ | **Content** | `null`/`undefined` as visible text, lorem ipsum, broken images, empty data lists |
77
+ | **Responsive** | Horizontal overflow at 375px/768px, touch targets < 44×44px |
78
+ | **Memory** | Detached DOM nodes via V8 heap snapshot, heap growth across navigation |
79
+ | **Visual** | Pixel-level screenshot regression via pixelmatch (≥0.1% warning, ≥5% critical) |
80
+ | **Figma** | Design-to-implementation fidelity — 13 property types (color, spacing, typography, shadows, etc.) |
81
+ | **Forms** | Missing `required`, `autocomplete`, `aria-describedby`; unlabelled inputs |
82
+ | **Fonts** | FOIT, FOUT, missing fallbacks, slow loads > 1s, suboptimal formats |
83
+ | **Motion** | `prefers-reduced-motion` violations, `autoplay` without pause controls |
84
+ | **Network baseline** | New requests, missing requests, status-code regressions vs saved HAR baseline |
85
+ | **Environment diff** | Dev vs staging — screenshot diff, DOM changes, console/network regressions |
86
+
87
+ And every finding is post-processed with:
88
+
89
+ | Post-processor | What it adds |
90
+ |---|---|
91
+ | **Intelligent baseline filtering** | Findings that flip-flop across runs are tagged `noisy` and downgraded to info — pure cross-run heuristics, no API calls (`ARGUS_NOISE_FILTER=0` to disable) |
92
+ | **Root cause linking** | New findings are annotated with the recent git commits and files most likely to have caused them (`ARGUS_ROOT_CAUSE=0` to disable) |
93
+
94
+ > All findings are classified as `critical` / `warning` / `info` and routed to the right Slack channel — or surfaced in the local HTML report. For per-finding severity tables and detection methods, see [REFERENCE.md](REFERENCE.md).
95
+
96
+ ---
97
+
98
+ ## MCP Tools
99
+
100
+ Ask Claude (or any MCP client) — no terminal required:
101
+
102
+ | Tool | Description |
103
+ |---|---|
104
+ | `argus_audit` | Fast pass — JS, network, accessibility, SEO, security, CSS, content |
105
+ | `argus_audit_full` | Deep pass — adds Lighthouse, responsive checks, memory leak detection, hover-state bugs |
106
+ | `argus_compare` | Diff dev vs staging — screenshots, findings delta, environment regressions |
107
+ | `argus_get_context` | Capture everything broken on the open tab for Claude to diagnose |
108
+ | `argus_watch_snapshot` | Snapshot the open tab without navigating (preserves auth/form state) |
109
+ | `argus_last_report` | Return last JSON report without re-running |
110
+ | `argus_design_audit` | Figma URL → 13 design-token finding types (color, spacing, typography, shadows, etc.) |
111
+ | `argus_visual_diff` | Screenshot baseline comparison. Pass `updateBaseline: true` to reset. |
112
+ | `argus_pr_validate` | Fetch GitHub PR diff → map changed files to affected routes → targeted audit → `{ blocked, findings }` |
113
+
114
+ **Example prompts:**
115
+
116
+ ```
117
+ Run argus_audit on http://localhost:3000/checkout
118
+ Run argus_audit_full on http://localhost:3000/dashboard
119
+ Run argus_compare
120
+ Run argus_get_context
121
+ ```
122
+
123
+ ---
124
+
125
+ ## Full Setup
126
+
127
+ ### Prerequisites
128
+
129
+ | Requirement | Version |
130
+ |---|---|
131
+ | Node.js | v20.19+ |
132
+ | Chrome | Stable (desktop or headless) |
133
+ | Claude Code | Latest (`npm install -g @anthropic-ai/claude-code`) |
134
+ | Slack workspace | **Optional** — omit for local `report.html` mode |
135
+
136
+ ---
137
+
138
+ ### Option A — MCP Server *(recommended for Claude Code users)*
139
+
140
+ No local install needed. Use the [Quick Start](#quick-start) above, then add your target URL:
141
+
142
+ ```env
143
+ # .env in your project root
144
+ TARGET_DEV_URL=http://localhost:3000
145
+ TARGET_STAGING_URL=https://staging.example.com # optional — enables argus_compare
146
+ ```
147
+
148
+ **Optional — Slack notifications:**
149
+
150
+ 1. [api.slack.com/apps](https://api.slack.com/apps) → Create New App → name it **BugBot**
151
+ 2. OAuth & Permissions → Bot Token Scopes: `chat:write`, `files:write`, `files:read`
152
+ 3. Install to workspace → copy the `xoxb-...` token
153
+ 4. Create channels `#bugs-critical`, `#bugs-warnings`, `#bugs-digest` and run `/invite @BugBot` in each
154
+
155
+ ```env
156
+ SLACK_BOT_TOKEN=xoxb-...
157
+ SLACK_CHANNEL_CRITICAL=C0000000000
158
+ SLACK_CHANNEL_WARNINGS=C0000000001
159
+ SLACK_CHANNEL_DIGEST=C0000000002
160
+ ```
161
+
162
+ > Without Slack: Argus auto-generates `reports/report.html` and opens it in your browser — zero extra config.
163
+
164
+ ---
165
+
166
+ ### Option B — npm Package (CI / dev dependency)
167
+
168
+ ```bash
169
+ npm install --save-dev argusqa-os
170
+ npx argus init # interactive wizard — detects framework, discovers routes, writes .env
171
+ npm run crawl # run after Chrome is started
172
+ ```
173
+
174
+ ---
175
+
176
+ ### Option C — Clone the Repository (contributors / full source)
177
+
178
+ ```bash
179
+ git clone https://github.com/ironclawdevs27/Argus.git
180
+ cd Argus
181
+ npm install
182
+ npm run init # interactive setup wizard
183
+ ```
184
+
185
+ **Manual setup (skip the wizard):**
186
+
187
+ ```bash
188
+ cp .env.example .env
189
+ # Fill in TARGET_DEV_URL and optional Slack tokens
190
+ ```
191
+
192
+ Then configure your routes in [src/config/targets.js](src/config/targets.js):
193
+
194
+ ```js
195
+ export const routes = [
196
+ { path: '/', name: 'Home', critical: true, waitFor: 'main' },
197
+ { path: '/login', name: 'Login', critical: true, waitFor: 'form' },
198
+ { path: '/dashboard', name: 'Dashboard', critical: true, waitFor: '[data-testid="dashboard"]' },
199
+ { path: '/settings', name: 'Settings', critical: false, waitFor: null },
200
+ ];
201
+ ```
202
+
203
+ - `critical: true` — errors on this route go to `#bugs-critical`
204
+ - `waitFor` — CSS selector Argus waits for before capturing (signals page-ready)
205
+
206
+ ---
207
+
208
+ ## CLI Commands
209
+
210
+ ```bash
211
+ npm run chrome # Launch Chrome with --remote-debugging-port=9222 (auto-detects binary)
212
+ npm run doctor # Pre-flight check: Chrome reachable, .mcp.json valid, .env has TARGET_DEV_URL
213
+ npm run crawl # Batch audit of all configured routes
214
+ npm run compare # Dev vs staging diff (CSS-only if no staging URL)
215
+ npm run watch # Passive monitor — polls open Chrome tab every 1s
216
+ npm run report:html # Generate reports/report.html from last JSON audit
217
+ npm run report:pdf # Export HTML report to A4 PDF (requires: npm install puppeteer)
218
+ npm run server # Start Slack slash-command server (port 3001)
219
+ npm run init # Interactive setup wizard
220
+ npm run test:unit # 61 unit tests — no Chrome required
221
+ npm run test:harness # 142-block correctness harness — requires Chrome
222
+ npm run test:harness:log # same, but tees full output to harness-results.txt
223
+ ```
224
+
225
+ **Watch mode** — live monitoring as you develop:
226
+
227
+ ```bash
228
+ # Terminal 1: start your app
229
+ npm run dev
230
+
231
+ # Terminal 2: start Argus watcher
232
+ npm run watch
233
+ # Ctrl+C → stops monitor and writes reports/report.html
234
+ ```
235
+
236
+ **Slack slash command** (on-demand from any channel):
237
+
238
+ ```
239
+ /argus-retest https://staging.example.com/checkout
240
+ ```
241
+
242
+ To expose the server via tunnel: `cloudflared tunnel --url http://localhost:3001` (free, no account required). Set the resulting URL as the Request URL in Slack App → Slash Commands.
243
+
244
+ ---
245
+
246
+ ## GitHub Actions CI
247
+
248
+ Add to your repo's secrets (Settings → Secrets → Actions):
249
+
250
+ | Secret | Required | Value |
251
+ |---|---|---|
252
+ | `TARGET_STAGING_URL` | Yes | Your staging base URL |
253
+ | `SLACK_BOT_TOKEN` | No | `xoxb-...` token (omit for HTML-only mode) |
254
+ | `SLACK_CHANNEL_CRITICAL` | No* | Channel ID (needed when Slack is configured) |
255
+ | `SLACK_CHANNEL_WARNINGS` | No* | Channel ID |
256
+ | `SLACK_CHANNEL_DIGEST` | No* | Channel ID |
257
+ | `GITHUB_TOKEN` | No | Auto-injected by Actions for PR comments + Check Runs |
258
+
259
+ The included [workflow](.github/workflows/argus.yml) runs on push to `main`, daily at 6 AM UTC, and on manual trigger. If critical issues are found, the pipeline fails.
260
+
261
+ ---
262
+
263
+ ## Environment Variables
264
+
265
+ <details>
266
+ <summary>Full reference (click to expand)</summary>
267
+
268
+ | Variable | Default | Description |
269
+ |---|---|---|
270
+ | `TARGET_DEV_URL` | — | **Required.** Base URL of your dev environment |
271
+ | `TARGET_STAGING_URL` | — | Staging URL — enables `argus_compare`; omit for CSS-only mode |
272
+ | `SLACK_BOT_TOKEN` | — | `xoxb-...` token. Omit for local `report.html` mode |
273
+ | `SLACK_SIGNING_SECRET` | — | For `/argus-retest` slash command verification |
274
+ | `SLACK_CHANNEL_CRITICAL` | — | Channel ID for critical bugs |
275
+ | `SLACK_CHANNEL_WARNINGS` | — | Channel ID for warnings |
276
+ | `SLACK_CHANNEL_DIGEST` | — | Channel ID for info / daily digest |
277
+ | `PORT` | `3001` | Slack slash-command server port |
278
+ | `REPORT_OUTPUT_DIR` | `./reports` | Where to write JSON reports |
279
+ | `ARGUS_CONCURRENCY` | `1` | Parallel MCP clients for route crawling |
280
+ | `ARGUS_LOG_LEVEL` | `info` | `trace` / `debug` / `info` / `warn` / `error` |
281
+ | `ARGUS_LOG_PRETTY` | — | Set `1` for human-readable logs in dev |
282
+ | `ARGUS_RETRY_ATTEMPTS` | `3` | Max retries for `navigate`/`fill` MCP calls |
283
+ | `ARGUS_WATCH_INTERVAL_MS` | `1000` | Watch mode poll interval (ms) |
284
+ | `ARGUS_WATCH_UI_PORT` | `3002` | Watch mode web dashboard port |
285
+ | `ARGUS_SOURCE_DIR` | — | App source path — enables env-var / feature-flag / dead-route analysis |
286
+ | `ARGUS_ENV_FILE` | — | Path to app `.env` for codebase cross-reference |
287
+ | `SCREENSHOT_DIFF_THRESHOLD` | `0.5` | Pixel diff % threshold for environment comparison |
288
+ | `GITHUB_TOKEN` | — | For PR comments + Check Runs |
289
+ | `GITHUB_REPOSITORY` | — | `owner/repo` format |
290
+ | `GITHUB_PR_NUMBER` | — | Auto-injected by Actions from PR context |
291
+ | `ARGUS_CRITICAL_THRESHOLD` | `1` | New criticals before blocking merge (0 = never block) |
292
+ | `ARGUS_DIFF_IMAGE_URL` | — | Visual diff image URL to embed in PR comment |
293
+ | `OTEL_EXPORTER_OTLP_ENDPOINT` | — | OTLP collector for Jaeger / Grafana Tempo |
294
+ | `FIGMA_API_TOKEN` | — | Required for `argus_design_audit` |
295
+ | `FONT_SLOW_MS` | `1000` | Slow web font load threshold (ms) |
296
+ | `A11Y_CONTRAST_AA` | `4.5` | WCAG AA min contrast ratio for CVD simulation |
297
+
298
+ </details>
299
+
300
+ ---
301
+
302
+ ## Troubleshooting
303
+
304
+ **Chrome DevTools MCP not connecting**
305
+ ```bash
306
+ claude mcp add chrome-devtools -- npx chrome-devtools-mcp@latest
307
+ # Restart Claude Code after adding
308
+ ```
309
+
310
+ **Slack messages not posting**
311
+ - Token must start with `xoxb-` (not `xoxp-`, `xoxe-`, or `xapp-`)
312
+ - Run `/invite @BugBot` in each channel
313
+ - Required scopes: `chat:write`, `files:write`, `files:read`
314
+
315
+ **Screenshots are blank**
316
+ - Page hasn't settled — increase `pageSettleMs` in `src/config/targets.js` or add a `waitFor` selector for the route
317
+
318
+ **`/argus-retest` returns "dispatch_failed"**
319
+ - Tunnel URL changed — update the Request URL in Slack App → Slash Commands and reinstall
320
+
321
+ **CSS analysis returns empty results**
322
+ - Page may be behind auth — ensure you're logged in on the Chrome instance Argus is controlling
323
+
324
+ **CI pipeline fails immediately**
325
+ - Chrome may not start fast enough — increase `sleep 3` to `sleep 5` in [.github/workflows/argus.yml](.github/workflows/argus.yml)
326
+
327
+ ---
328
+
329
+ ## How Argus Differs From Playwright / Cypress
330
+
331
+ Argus is a **complementary layer**, not a replacement for unit or E2E tests:
332
+
333
+ | | Playwright / Cypress | Argus |
334
+ |---|---|---|
335
+ | **Purpose** | Test your logic and API contracts | Catch what the user actually sees |
336
+ | **What it catches** | Regressions in behavior | CSS drift, visual regressions, API loops, console noise, perf budgets |
337
+ | **When it runs** | In your test suite | Continuously, on the live running app |
338
+ | **Setup** | Write test files | Configure routes in `targets.js` |
339
+ | **Output** | Pass / fail | Structured Slack reports with screenshots |
340
+
341
+ ---
342
+
343
+ ## Known Limitations
344
+
345
+ All 688 harness assertions pass (`688/688`) — there are currently no known MCP- or Chrome-layer restrictions. Soft assertions (Lighthouse, performance traces) still require non-headless Chrome and are skipped in headless CI.
346
+
347
+ ---
348
+
349
+ ## Project Structure
350
+
351
+ ```
352
+ src/
353
+ argus.js — single-page audit entry point
354
+ mcp-server.js — 9 MCP tools exposed to Claude / any MCP client
355
+ orchestration/ — crawl loop, Slack/GitHub dispatch, env comparison, watch mode
356
+ utils/ — 32 analysis engines (accessibility, security, performance, PDF, recording, etc.)
357
+ adapters/browser.js — CdpBrowserAdapter — wraps all chrome-devtools-mcp calls
358
+ config/targets.js — routes, thresholds, auth steps
359
+ cli/
360
+ init.js — argus init interactive setup wizard
361
+ chrome-launcher.js — npm run chrome / argus-chrome — launches Chrome with correct flags
362
+ doctor.js — npm run doctor / argus-doctor — pre-flight checks
363
+ pr-validate.js — headless CI entry point for GitHub Actions
364
+ test-harness/ — 142-block correctness harness, 688 hard assertions, 62 fixture pages
365
+ test/unit/ — 61 Vitest unit tests (no Chrome required)
366
+ landing/ — Product landing page (React 19 + Vite + Tailwind)
367
+ ```
368
+
369
+ Full source map → [CLAUDE.md](CLAUDE.md) · MCP/DSL reference → [SKILL.md](SKILL.md)
370
+
371
+ ---
372
+
373
+ ## Contributing
374
+
375
+ 1. Fork the repo and create a branch
376
+ 2. `npm run test:unit` — verify without Chrome (61 tests)
377
+ 3. `npm run test:harness` — full integration coverage (requires Chrome on port 9222)
378
+ 4. Open a PR — Argus audits itself via the CI workflow
379
+
380
+ ---
381
+
382
+ ## License
383
+
384
+ MIT © [ironclawdevs27](https://github.com/ironclawdevs27)
385
+
386
+ ---
387
+
388
+ <div align="center">
389
+
390
+ *Argus Panoptes — the all-seeing giant of Greek mythology who never slept.*
391
+
392
+ [argus-qa.com](https://argus-qa.com) · [npm](https://www.npmjs.com/package/argusqa-os) · [MCP Registry](https://glama.ai/mcp/servers/ironclawdevs27/Argus)
393
+
394
+ </div>
package/glama.json CHANGED
@@ -1,12 +1,12 @@
1
1
  {
2
2
  "$schema": "https://glama.ai/mcp/schemas/server.json",
3
3
  "name": "argus",
4
- "description": "AI-powered QA harness that audits web apps via Chrome DevTools Protocol. Catches JS errors, network failures, a11y violations, SEO issues, security headers, CSS regressions, and more — directly from Claude conversations. 9 MCP tools: argus_audit (fast 8-analyzer pass), argus_audit_full (Lighthouse + memory + responsive), argus_compare (dev vs staging diff), argus_last_report (retrieve last JSON report), argus_watch_snapshot (live tab snapshot without navigating), argus_get_context (LLM-optimized context + fix loop with snapshot_id diff), argus_design_audit (Figma design fidelity — 13 finding types), argus_visual_diff (screenshot baseline comparison, updateBaseline flag), argus_pr_validate (PR diff → affected routes → targeted audit → blocked flag). Every finding is post-processed with intelligent baseline filtering (cross-run noise classifier) and root cause linking (recent git commits mapped to new findings). 141 test blocks, 679 hard assertions, 67 detection categories.",
4
+ "description": "AI-powered QA harness that audits web apps via Chrome DevTools Protocol. Catches JS errors, network failures, a11y violations, SEO issues, security headers, CSS regressions, and more — directly from Claude conversations. 9 MCP tools: argus_audit (fast 8-analyzer pass), argus_audit_full (Lighthouse + memory + responsive), argus_compare (dev vs staging diff), argus_last_report (retrieve last JSON report), argus_watch_snapshot (live tab snapshot without navigating), argus_get_context (LLM-optimized context + fix loop with snapshot_id diff), argus_design_audit (Figma design fidelity — 13 finding types), argus_visual_diff (screenshot baseline comparison, updateBaseline flag), argus_pr_validate (PR diff → affected routes → targeted audit → blocked flag). Every finding is post-processed with intelligent baseline filtering (cross-run noise classifier) and root cause linking (recent git commits mapped to new findings). 142 test blocks, 688 hard assertions, 67 detection categories.",
5
5
  "maintainers": ["ironclawdevs27"],
6
6
  "tools": [
7
7
  {
8
8
  "name": "argus_audit",
9
- "description": "Fast QA audit — JS errors, network failures (4xx/5xx), API frequency loops, CSS cascade issues, SEO violations, security headers, accessibility, and content. Returns { findings, summary }. Supports cache: true to skip re-crawl on repeat calls."
9
+ "description": "Fast QA audit — JS errors, network failures (4xx/5xx), CORS, API frequency loops, slow/blocking third-party requests, API contract violations, SEO violations, security headers, content quality, DevTools Issues panel, and HTTPS enforcement. Returns { findings, summary }. Supports cache: true to skip re-crawl on repeat calls."
10
10
  },
11
11
  {
12
12
  "name": "argus_audit_full",
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "argusqa-os",
3
- "version": "9.7.3",
3
+ "version": "9.7.4",
4
4
  "mcpName": "io.github.ironclawdevs27/argus",
5
5
  "description": "Argus — AI-powered automated dev-testing platform using Chrome DevTools MCP and Claude Code",
6
6
  "keywords": [
@@ -48,7 +48,7 @@
48
48
  "watch": "node src/orchestration/watch-mode.js",
49
49
  "server": "node src/server/index.js",
50
50
  "harness": "node test-harness/server.js",
51
- "harness:staging": "PORT=3101 node test-harness/server.js",
51
+ "harness:staging": "node test-harness/server.js --port=3101 --staging",
52
52
  "test:harness": "node --env-file=test-harness/.env.harness test-harness/validate.js",
53
53
  "test:harness:log": "node test-harness/run-with-log.mjs",
54
54
  "test:unit": "vitest run test/unit",
@@ -52,15 +52,22 @@ export class CdpBrowserAdapter {
52
52
  resize(w, h) { return this._mcp.resize_page({ width: w, height: h }); }
53
53
 
54
54
  // ── Network & performance ───────────────────────────────────────────────────
55
- getNetworkRequest(reqId) { return this._mcp.get_network_request({ requestId: reqId }); }
55
+ // chrome-devtools-mcp expects the wire parameter "reqid" (sending "requestId"
56
+ // is rejected with an Unknown-argument error). Callers still pass the numeric
57
+ // requestId parsed from list_network_requests.
58
+ getNetworkRequest(reqId) { return this._mcp.get_network_request({ reqid: reqId }); }
56
59
  lighthouse(url, opts = {}) { return this._mcp.lighthouse_audit({ url, ...opts }); }
57
60
  startTrace() { return this._mcp.performance_start_trace({}); }
58
61
  stopTrace() { return this._mcp.performance_stop_trace({}); }
59
62
  analyzeInsight(opts) { return this._mcp.performance_analyze_insight(opts); }
60
63
 
61
64
  // ── Tab management ─────────────────────────────────────────────────────────
65
+ // list_pages returns markdown text ("## Pages\n1: <url> [selected]") like all
66
+ // MCP responses — callers parse with parseListPagesResponse (mcp-parsers.js).
62
67
  listPages() { return this._mcp.list_pages({}); }
63
- selectPage(tabId) { return this._mcp.select_page({ pageId: tabId }); }
68
+ // select_page validates pageId as a number — coerce so callers may pass the
69
+ // string tabId they received from an MCP tool argument.
70
+ selectPage(tabId) { return this._mcp.select_page({ pageId: Number(tabId) }); }
64
71
 
65
72
  // ── Lifecycle ───────────────────────────────────────────────────────────────
66
73
  close() { return this._mcp.close(); }
package/src/cli/doctor.js CHANGED
@@ -47,7 +47,7 @@ export function checkMcpConfig(filePath = '.mcp.json') {
47
47
  if (!fs.existsSync(filePath)) {
48
48
  return {
49
49
  ok: false,
50
- detail: `${path.basename(filePath)} not found — run \`argus init\` or copy .mcp.json from the Argus repo`,
50
+ detail: `${path.basename(filePath)} not found — create it with: {"mcpServers":{"chrome-devtools":{"command":"npx","args":["-y","chrome-devtools-mcp@1.1.1"]}}}`,
51
51
  };
52
52
  }
53
53
  let parsed;
@@ -63,7 +63,7 @@ export function checkMcpConfig(filePath = '.mcp.json') {
63
63
  return cmd.includes('chrome-devtools') || args.some(a => a.includes('chrome-devtools'));
64
64
  });
65
65
  if (!hasCdp) {
66
- return { ok: false, detail: 'No chrome-devtools MCP server entry found in mcpServers' };
66
+ return { ok: false, detail: 'No chrome-devtools entry in mcpServers add: "chrome-devtools": {"command":"npx","args":["-y","chrome-devtools-mcp@1.1.1"]}' };
67
67
  }
68
68
  return { ok: true, detail: `${Object.keys(servers).length} server(s) configured` };
69
69
  }
@@ -230,6 +230,11 @@ async function main() {
230
230
  const files = await fetchPrFiles(prUrl, token);
231
231
  changedFiles.push(...files);
232
232
  console.log(`[argus] ${files.length} changed file(s)`);
233
+ if (files.length >= 300) {
234
+ // CI annotation lives here (CLI owns stdout) — fetchPrFiles itself only
235
+ // logs to stderr so the MCP server's JSON-RPC stdout stays clean.
236
+ console.log('::warning::PR has 300+ changed files — Argus analyzed the first 300. Routes affected by later files may be missed.');
237
+ }
233
238
 
234
239
  // Step 2: Map changed files to affected routes
235
240
  const routes = await loadRoutes();
package/src/mcp-server.js CHANGED
@@ -1,6 +1,6 @@
1
1
  #!/usr/bin/env node
2
2
  /**
3
- * Argus MCP Server (v9.6.6)
3
+ * Argus MCP Server
4
4
  *
5
5
  * Exposes Argus as an MCP server so Claude (or any MCP client) can call
6
6
  * argus_audit, argus_audit_full, argus_compare, argus_last_report, and
@@ -24,8 +24,11 @@ import {
24
24
  } from '@modelcontextprotocol/sdk/types.js';
25
25
  import fs from 'fs';
26
26
  import path from 'path';
27
+ import { createRequire } from 'module';
27
28
 
28
29
  import { createMcpClient } from './utils/mcp-client.js';
30
+ import { childLogger } from './utils/logger.js';
31
+ import { parseListPagesResponse } from './utils/mcp-parsers.js';
29
32
  import { crawlRouteCheap, runCrawl } from './orchestration/crawl-and-report.js';
30
33
  import { runComparison } from './orchestration/env-comparison.js';
31
34
  import { WatchSession } from './orchestration/watch-mode.js';
@@ -35,6 +38,13 @@ import { analyzeDesignFidelity } from './utils/design-fidelity-analy
35
38
  import { analyzeVisualRegression } from './utils/visual-diff-analyzer.js';
36
39
  import { fetchPrFiles, mapFilesToRoutes } from './utils/pr-diff-analyzer.js';
37
40
 
41
+ const logger = childLogger('mcp-server');
42
+
43
+ // Read version from package.json so the MCP server always self-reports the
44
+ // published package version (a hardcoded string here drifted in the past).
45
+ const require_ = createRequire(import.meta.url);
46
+ const pkg = require_('../package.json');
47
+
38
48
  const REPORTS_DIR = path.resolve(process.cwd(), 'reports');
39
49
 
40
50
  // Fix loop: stores up to 20 snapshots keyed by snapshot_id so argus_get_context
@@ -65,7 +75,7 @@ function cacheAudit(url, result) {
65
75
  const TOOLS = [
66
76
  {
67
77
  name: 'argus_audit',
68
- description: 'Fast QA audit on a URL via Chrome DevTools Protocol. Runs 8 analyzers in one pass: JS errors, unhandled rejections, network failures (4xx/5xx), API frequency loops, CSS cascade issues, SEO violations, security header checks, and accessibility. Returns { findings: [{severity, type, message, url}], summary: {critical, warning, info} }. Use for CI smoke tests and pre-deploy gates. Pass cache: true to skip re-crawl on repeat calls to the same URL within a session — useful in tight fix loops. For Lighthouse scoring and memory leak detection, use argus_audit_full. Requires Chrome running with --remote-debugging-port=9222.',
78
+ description: 'Fast QA audit on a URL via Chrome DevTools Protocol. One-pass detection sweep: JS errors, unhandled rejections, network failures (4xx/5xx), CORS errors, API frequency loops, slow APIs and blocking third-party requests, API contract violations, sync XHR, document.write, long tasks, service worker failures, debugger statements, duplicate IDs, SEO violations, security header checks, content quality, Chrome DevTools Issues panel, and HTTPS enforcement. Returns { findings: [{severity, type, message, url}], summary: {critical, warning, info} }. Use for CI smoke tests and pre-deploy gates. Pass cache: true to skip re-crawl on repeat calls to the same URL within a session — useful in tight fix loops. For Lighthouse scoring, CSS analysis, responsive checks, and memory leak detection, use argus_audit_full. Requires Chrome running with --remote-debugging-port=9222.',
69
79
  inputSchema: {
70
80
  type: 'object',
71
81
  properties: {
@@ -181,6 +191,9 @@ async function withMcp(fn) {
181
191
  async function handleAudit({ url, critical = false, cache = false }) {
182
192
  if (cache && auditCache.has(url)) {
183
193
  const { result, ts } = auditCache.get(url);
194
+ // Refresh recency on read so eviction is true LRU, not insertion-order FIFO.
195
+ auditCache.delete(url);
196
+ auditCache.set(url, { result, ts });
184
197
  return { content: [{ type: 'text', text: JSON.stringify({ ...result, _cached: true, _cachedAt: new Date(ts).toISOString() }, null, 2) }] };
185
198
  }
186
199
  return withMcp(async (mcp) => {
@@ -243,12 +256,13 @@ async function handleGetContext({ url, snapshot_id: prevId, tabId } = {}) {
243
256
  const { findings, newConsole, newNetwork } = await session.poll();
244
257
 
245
258
  // List all open tabs so the caller can target a specific tab on the next call.
259
+ // list_pages returns markdown text ("## Pages\n1: <url> [selected]") — parse
260
+ // it like every other MCP response; treating it as a structured array left
261
+ // open_tabs permanently empty.
246
262
  let open_tabs = [];
247
263
  try {
248
- const pages = await browser.listPages();
249
- if (Array.isArray(pages)) {
250
- open_tabs = pages.map(p => ({ id: p.id ?? p.pageId, url: p.url, title: p.title }));
251
- }
264
+ const pages = parseListPagesResponse(await browser.listPages());
265
+ open_tabs = pages.map(p => ({ id: p.id, url: p.url, selected: p.selected }));
252
266
  } catch { /* list_pages not available in all Chrome configs — degrade gracefully */ }
253
267
 
254
268
  const newId = Date.now().toString(36) + Math.random().toString(36).slice(2, 6);
@@ -397,8 +411,12 @@ async function handlePrValidate({ prUrl, targetUrl, githubToken, blockOn } = {})
397
411
  const allFindings = [];
398
412
  const perRoute = [];
399
413
 
414
+ // Preserve any path prefix in the target URL (e.g. http://host/app) — new URL()
415
+ // with a leading-slash path would drop it. Mirrors src/cli/pr-validate.js.
416
+ const baseUrl = String(base).replace(/\/$/, '');
400
417
  for (const route of affectedRoutes) {
401
- const url = new URL(route.path, base).href;
418
+ const routePath = String(route.path ?? '/').startsWith('/') ? route.path : `/${route.path}`;
419
+ const url = `${baseUrl}${routePath}`;
402
420
  const res = await handleAudit({ url, critical: route.critical ?? false });
403
421
  const data = JSON.parse(res.content[0].text);
404
422
  allFindings.push(...(data.findings ?? []));
@@ -447,7 +465,7 @@ async function handleLastReport() {
447
465
  // ── Server bootstrap ──────────────────────────────────────────────────────────
448
466
 
449
467
  const server = new Server(
450
- { name: 'argus', version: '9.6.6' },
468
+ { name: 'argus', version: pkg.version },
451
469
  { capabilities: { tools: {} } },
452
470
  );
453
471
 
@@ -298,6 +298,13 @@ function classifyNetworkReq(req, url) {
298
298
  * the interval-based runWatchMode() entry point.
299
299
  */
300
300
  export class WatchSession {
301
+ // Long-run safety caps: a watch session left running for hours against an app
302
+ // with cache-busted polling URLs would otherwise grow the dedup sets without
303
+ // bound. When a set exceeds its cap the oldest fifth is evicted (Sets iterate
304
+ // in insertion order) — worst case a very old message is re-reported once.
305
+ static MAX_SEEN_KEYS = 5000;
306
+ static MAX_ALL_FINDINGS = 2000;
307
+
301
308
  constructor(browser, baseUrl) {
302
309
  this._browser = browser;
303
310
  this._baseUrl = baseUrl;
@@ -306,6 +313,14 @@ export class WatchSession {
306
313
  this._allFindings = [];
307
314
  }
308
315
 
316
+ /** Evict the oldest 20% of a dedup set once it exceeds the cap. */
317
+ static _trimSeen(set) {
318
+ if (set.size <= WatchSession.MAX_SEEN_KEYS) return;
319
+ const drop = Math.floor(WatchSession.MAX_SEEN_KEYS / 5);
320
+ const it = set.values();
321
+ for (let i = 0; i < drop; i++) set.delete(it.next().value);
322
+ }
323
+
309
324
  /**
310
325
  * Run one poll cycle.
311
326
  *
@@ -350,6 +365,11 @@ export class WatchSession {
350
365
  findings.push(...analyzeSecurityNetwork(newNetwork, this._baseUrl));
351
366
 
352
367
  this._allFindings.push(...findings);
368
+ if (this._allFindings.length > WatchSession.MAX_ALL_FINDINGS) {
369
+ this._allFindings = this._allFindings.slice(-WatchSession.MAX_ALL_FINDINGS);
370
+ }
371
+ WatchSession._trimSeen(this._seenConsole);
372
+ WatchSession._trimSeen(this._seenNetwork);
353
373
  return { findings, newConsole, newNetwork };
354
374
  }
355
375
 
@@ -125,6 +125,32 @@ function loadSchema(contract) {
125
125
  return null;
126
126
  }
127
127
 
128
+ /**
129
+ * Extract and JSON-parse the response body from a get_network_request result.
130
+ *
131
+ * chrome-devtools-mcp returns the request detail as markdown text with the
132
+ * body under a "### Response Body" section — the dominant production shape.
133
+ * Structured shapes ({ responseBody } / { body }) are kept for legacy clients.
134
+ *
135
+ * @param {any} raw - Raw value returned by browser.getNetworkRequest()
136
+ * @returns {any|null} Parsed JSON body, or null when absent
137
+ * @throws {SyntaxError} when a body section exists but is not valid JSON
138
+ */
139
+ export function extractResponseBody(raw) {
140
+ if (raw == null) return null;
141
+ if (typeof raw === 'object') {
142
+ const text = raw.responseBody ?? raw.body ?? null;
143
+ if (text == null) return null;
144
+ return typeof text === 'string' ? JSON.parse(text) : text;
145
+ }
146
+ const text = String(raw);
147
+ const m = text.match(/### Response Body\s*\n([\s\S]*?)(?=\n###? |$)/);
148
+ if (!m) return null;
149
+ const section = m[1].trim();
150
+ if (!section) return null;
151
+ return JSON.parse(section);
152
+ }
153
+
128
154
  /**
129
155
  * Validate captured network requests against apiContracts[].
130
156
  * For each request that matches a contract, fetches the response body via
@@ -153,8 +179,7 @@ export async function validateApiContracts(networkReqs, browser, contracts, page
153
179
  let body = null;
154
180
  try {
155
181
  const raw = await browser.getNetworkRequest(req.id ?? req.requestId);
156
- const text = raw?.responseBody ?? raw?.body ?? null;
157
- if (text) body = JSON.parse(text);
182
+ body = extractResponseBody(raw);
158
183
  } catch {
159
184
  continue; // body unavailable — skip validation for this request
160
185
  }
@@ -55,3 +55,23 @@ export function parseNetworkReqResponse(raw) {
55
55
  }
56
56
  return reqs;
57
57
  }
58
+
59
+ /**
60
+ * Parse the text response from list_pages.
61
+ * Format: "## Pages\n1: http://host/page.html [selected]\n2: about:blank"
62
+ * The numeric prefix is the pageId that select_page expects (as a number).
63
+ * @param {any} raw - Raw value returned by the MCP tool
64
+ * @returns {Array<{ id: number, url: string, selected: boolean }>}
65
+ */
66
+ export function parseListPagesResponse(raw) {
67
+ if (!raw) return [];
68
+ if (Array.isArray(raw)) return raw;
69
+ if (typeof raw !== 'string') return [];
70
+ const pages = [];
71
+ const re = /^(\d+):\s+(\S+)(\s+\[selected\])?\s*$/gm;
72
+ let m;
73
+ while ((m = re.exec(raw)) !== null) {
74
+ pages.push({ id: Number(m[1]), url: m[2], selected: Boolean(m[3]) });
75
+ }
76
+ return pages;
77
+ }
@@ -7,8 +7,16 @@
7
7
  *
8
8
  * Pure functions + one async fetch — no Chrome, no MCP, no AI verdict.
9
9
  * AI verdict logic ships separately in the private argus-pro repo.
10
+ *
11
+ * This module is imported by the MCP server — nothing here may write to
12
+ * stdout (reserved for JSON-RPC). CI annotations are emitted by the CLI
13
+ * entry point (src/cli/pr-validate.js), which owns stdout.
10
14
  */
11
15
 
16
+ import { childLogger } from './logger.js';
17
+
18
+ const logger = childLogger('pr-diff-analyzer');
19
+
12
20
  /**
13
21
  * Parse a GitHub PR URL into its owner/repo/prNumber components.
14
22
  *
@@ -59,7 +67,7 @@ export async function fetchPrFiles(prUrl, githubToken) {
59
67
  }
60
68
 
61
69
  if (allFiles.length >= 300) {
62
- console.log('::warning::PR has 300+ changed files — Argus analyzed the first 300. Routes affected by later files may be missed.');
70
+ logger.warn('[ARGUS] PR has 300+ changed files — Argus analyzed the first 300. Routes affected by later files may be missed.');
63
71
  }
64
72
 
65
73
  return allFiles;
@@ -56,13 +56,19 @@ export function getRecentChanges(repoDir = process.env.ARGUS_SOURCE_DIR || proce
56
56
  const changes = [];
57
57
  let current = null;
58
58
  for (const line of out.split('\n')) {
59
- const trimmed = line.trimEnd();
60
- if (!trimmed) continue;
61
- if (trimmed.includes('\t')) {
62
- const [hash, ...subjectParts] = trimmed.split('\t');
59
+ // Detect commit lines on the RAW line — an empty commit subject leaves a
60
+ // trailing tab ("abc1234\t") that trimEnd() would strip, misclassifying
61
+ // the hash as a file path. File lines never contain raw tabs: git quotes
62
+ // paths with special characters (core.quotePath octal escapes).
63
+ if (line.includes('\t')) {
64
+ const [hash, ...subjectParts] = line.trimEnd().split('\t');
63
65
  current = { hash, subject: subjectParts.join('\t'), files: [] };
64
66
  changes.push(current);
65
- } else if (current) {
67
+ continue;
68
+ }
69
+ const trimmed = line.trimEnd();
70
+ if (!trimmed) continue;
71
+ if (current) {
66
72
  // Git emits paths with forward slashes on every platform
67
73
  current.files.push(trimmed);
68
74
  }