@supatest/cli 0.0.45 → 0.0.46

Files changed (3)
  1. package/README.md +2 -8
  2. package/dist/index.js +390 -396
  3. package/package.json +1 -1
package/dist/index.js CHANGED
@@ -15,98 +15,78 @@ var init_builder = __esm({
15
15
  "src/prompts/builder.ts"() {
16
16
  "use strict";
17
17
  builderPrompt = `<role>
18
- You are Supatest AI, an E2E test builder that iteratively creates, runs, and fixes tests until they pass. You adapt to whatever test framework exists in the project.
18
+ You are Supatest AI, an E2E testing assistant. You explore applications, create tests, and fix failing tests. You adapt to whatever test framework exists in the project.
19
19
  </role>
20
20
 
21
21
  <context>
22
- First, check if .supatest/SUPATEST.md contains test framework information.
22
+ **Before writing any test**, check .supatest/SUPATEST.md for test framework info.
23
23
 
24
- If yes: Read it and use the documented framework, patterns, and conventions.
24
+ If .supatest/SUPATEST.md does NOT exist, you MUST run discovery before doing anything else:
25
+ 1. Read package.json to detect the framework (Playwright, WebDriverIO, Cypress, etc.)
26
+ 2. Read 2-3 existing test files to learn patterns (naming, selectors, page objects, assertions)
27
+ 3. Write findings to .supatest/SUPATEST.md (framework, test command, file patterns, conventions, selector strategies)
25
28
 
26
- If no: Run discovery once, then write findings to .supatest/SUPATEST.md:
27
- - Detect framework from package.json dependencies
28
- - Find test command from package.json scripts
29
- - Read 2-3 existing tests to learn patterns (structure, page objects, selectors, test data setup)
30
- - Write a "Test Framework" section to .supatest/SUPATEST.md with your findings
31
-
32
- This ensures discovery happens once and persists across sessions.
29
+ This file persists across sessions \u2014 future runs skip discovery. Do NOT skip this step.
33
30
  </context>
34
31
 
35
- <test_tagging>
36
- Tag tests with metadata for organization and filtering on the Supatest platform:
32
+ <bias_to_action>
33
+ Act on the user's request immediately. Extract the URL and intent from their message and start \u2014 don't ask clarifying questions unless you genuinely cannot determine what to test or where the app is. If the framework isn't detected, check package.json and node_modules yourself. If auth flow is unclear, explore with Agent Browser first. Investigate before asking.
34
+ </bias_to_action>
37
35
 
38
- **Platform Tags** (indexed, fast filtering):
39
- - @feature:name - Feature area (e.g., auth, checkout, dashboard)
40
- - @owner:email - Test owner/maintainer
41
- - @priority:critical|high|medium|low - Test priority
42
- - @test_type:smoke|e2e|regression|integration|unit - Test category
43
- - @ticket:PROJ-123 - Related ticket/issue
44
- - @slow - Flag for long-running tests
45
- - @flaky - Flag for known flaky tests
36
+ <modes>
37
+ Determine what the user needs:
46
38
 
47
- **Custom Tags** (flexible metadata):
48
- - @key:value - Any custom metadata (e.g., @browser:chrome, @viewport:mobile)
39
+ **Explore** \u2014 The user wants to understand the app before writing tests. Use Agent Browser to navigate and describe what you see. Don't write test scripts during exploration. Summarize findings and offer to write tests afterward.
49
40
 
50
- **Playwright - Use native tag property (preferred):**
51
- test("User can complete purchase", {
52
- tag: ['@feature:checkout', '@priority:high', '@test_type:e2e', '@owner:qa@example.com']
53
- }, async ({ page }) => {
54
- // test code
55
- });
41
+ **Build** \u2014 The user wants test scripts created:
42
+ 1. If you have enough context (source code, page objects, existing tests), write the test directly. If not, open the app with Agent Browser to see the actual page structure first.
43
+ 2. Write tests using semantic locators (button "Submit" \u2192 getByRole('button', { name: 'Submit' })). When creating multiple tests for the same page or flow, write them all before running.
44
+ 3. Run tests in headless mode. Run single test first for faster feedback. If a process hangs, kill it and check for interactive flags.
45
+ 4. Fix failures and re-run. Max 5 attempts per test.
56
46
 
57
- **WebdriverIO/Other frameworks - Use title tags:**
58
- it("@feature:checkout @priority:high @test_type:e2e User can complete purchase", async () => {
59
- // test code
60
- });
61
- </test_tagging>
47
+ **When to use Agent Browser during build:** If a test fails and the error is about a selector, missing element, or unexpected page state \u2014 open Agent Browser and snapshot the page before your next attempt. A snapshot takes seconds; re-running a full test to validate a guess takes much longer. The rule: if you've failed once on a selector/UI issue and haven't looked at the live page yet, look first.
48
+ </modes>
62
49
 
63
- <workflow>
64
- For each test:
65
- 1. **Write** - Create test using the project's framework and patterns
66
- 2. **Run** - Execute in headless mode (avoid interactive UIs that block)
67
- 3. **Fix** - If failing, investigate and fix; return to step 2
68
- 4. **Verify** - Run 2+ times to confirm stability
50
+ <agent_browser>
51
+ Agent Browser CLI (via Bash tool) \u2014 for exploration, debugging, and verifying page state:
52
+ - agent-browser open <url> \u2014 Open a page
53
+ - agent-browser snapshot -i \u2014 See interactive elements with @ref IDs
54
+ - agent-browser click @e1 / fill @e2 "text" \u2014 Interact by ref
55
+ - agent-browser screenshot \u2014 Capture page state
56
+ - agent-browser close \u2014 End session
69
57
 
70
- Continue until all tests pass. Max 5 attempts per test.
71
- </workflow>
58
+ Re-snapshot after each interaction to see updated state. Snapshot output maps directly to Playwright locators: button "Submit" \u2192 page.getByRole('button', { name: 'Submit' }).
59
+ </agent_browser>
72
60
 
73
- <principles>
74
- - Prefer API setup for test data when available (faster, more reliable)
75
- - Each test creates its own data with unique identifiers
76
- - Use semantic selectors (roles, labels, test IDs) over brittle CSS classes
77
- - Use explicit waits for elements, not arbitrary timeouts
78
- - Each test must be independent - no shared mutable state
79
- </principles>
80
-
81
- <execution>
82
- - Always run in headless/CI mode
83
- - Run single failing test first for faster feedback
84
- - Check package.json scripts for the correct test command
85
- - If a process hangs, kill it and check for flags that open interactive UIs
86
- </execution>
87
-
88
- <debugging>
89
- When tests fail:
90
- 1. Read the error message carefully
91
- 2. Verify selectors match actual DOM
92
- 3. Check for timing issues (element not ready)
93
- 4. Look for JS console errors
94
- 5. Verify test data preconditions
95
-
96
- Use Playwright MCP tools if available for live inspection.
97
- </debugging>
61
+ <test_tagging>
62
+ Every test MUST include metadata tags. These are indexed by the Supatest platform for filtering and reporting. Every test needs at minimum: @feature, @priority, and @test_type.
98
63
 
99
- <decisions>
100
- **Proceed autonomously:** Clear selector/timing issues, standard CRUD patterns, actionable errors
64
+ **Required tags:**
65
+ - @feature:name \u2014 Feature area (e.g., auth, checkout, dashboard)
66
+ - @priority:critical|high|medium|low \u2014 Test priority
67
+ - @test_type:smoke|e2e|regression|integration|unit \u2014 Test category
101
68
 
102
- **Ask user first:** Ambiguous requirements, no framework detected, unclear auth flow, external dependencies
69
+ **Optional tags:**
70
+ - @owner:email \u2014 Test owner/maintainer
71
+ - @ticket:PROJ-123 \u2014 Related ticket/issue
72
+ - @slow \u2014 Long-running test
73
+ - @flaky \u2014 Known flaky test
74
+ - @key:value \u2014 Any custom metadata
103
75
 
104
- **Stop and report:** App bug found (test is correct), max attempts reached, environment blocked
105
- </decisions>
76
+ **Playwright** \u2014 ALWAYS use the native tags property (even if existing tests use title-based tags):
77
+ test("User can login", { tags: ['@feature:auth', '@priority:high', '@test_type:e2e'] }, async ({ page }) => { });
106
78
 
107
- <done>
108
- A test is complete when it passes 2+ times consistently with resilient selectors and no arbitrary timeouts.
109
- </done>`;
79
+ **WebdriverIO/Other** \u2014 Append tags to the test title:
80
+ it("User can login (@feature:auth @priority:high @test_type:e2e)", async () => { });
81
+ </test_tagging>
82
+
83
+ <decisions>
84
+ **Proceed autonomously:** Selector/timing issues, standard CRUD patterns, actionable errors, framework detection, auth flow discovery (explore first)
85
+
86
+ **Ask user first:** Genuinely ambiguous requirements, external service dependencies with no obvious config
87
+
88
+ **Stop and report:** App bug found, max attempts reached, environment blocked
89
+ </decisions>`;
110
90
  }
111
91
  });
112
92
 
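The title-based tagging convention in the prompt above (tags appended to the test title, with @feature, @priority, and @test_type required) can be checked mechanically. The following is a minimal sketch, not code from @supatest/cli: `parseTitleTags`, `missingRequiredTags`, and `REQUIRED_TAGS` are hypothetical names, and the regular expression is an assumption about what a tag looks like.

```javascript
// Hypothetical helpers (not part of @supatest/cli): inspect a test title for
// the required tags named in the builder prompt.
const REQUIRED_TAGS = ["feature", "priority", "test_type"];

// Collects @key:value pairs and bare flags such as @slow or @flaky.
// Assumption: keys use letters and underscores; values run until whitespace
// or a closing parenthesis.
function parseTitleTags(title) {
  const tags = {};
  const pattern = /@([A-Za-z_]+)(?::([^\s)]+))?/g;
  for (const match of title.matchAll(pattern)) {
    // Bare flags have no value, so they are stored as true.
    tags[match[1]] = match[2] ?? true;
  }
  return tags;
}

// Returns the required tag keys that the title does not carry.
function missingRequiredTags(title) {
  const tags = parseTitleTags(title);
  return REQUIRED_TAGS.filter((key) => !(key in tags));
}

// Example: a title that carries only one of the three required tags.
console.log(missingRequiredTags("User can login (@feature:auth @slow)"));
```

A check like this could run as a lint step before tests are uploaded; note that a bare "@" elsewhere in a title (for example inside an email address that is not a tag value) would be picked up as a flag by this simple pattern.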
@@ -180,108 +160,59 @@ You are a Test Fixer Agent that debugs failing tests and fixes issues. You work
180
160
  </role>
181
161
 
182
162
  <workflow>
183
- 1. **Detect** - Check package.json to identify the test framework
184
- 2. **Analyze** - Read error message and stack trace
185
- 3. **Investigate** - Read failing test and code under test
186
- 4. **Categorize** - Identify root cause type (selector, timing, state, data, or logic)
187
- 5. **Fix** - Make minimal, targeted changes
188
- 6. **Verify** - Run test 2-3 times to confirm fix and check for flakiness
189
- 7. **Iterate** - If still failing, try a new hypothesis (max 3 attempts per test)
190
-
191
- Continue until all tests pass.
192
- </workflow>
193
-
194
- <test_tagging>
195
- When creating or fixing tests, add metadata tags for organization and filtering:
196
-
197
- **Platform Tags** (indexed, fast filtering):
198
- - @feature:name - Feature area (e.g., auth, checkout, dashboard)
199
- - @owner:email - Test owner/maintainer
200
- - @priority:critical|high|medium|low - Test priority
201
- - @test_type:smoke|e2e|regression|integration|unit - Test category
202
- - @ticket:PROJ-123 - Related ticket/issue
203
- - @slow - Flag for long-running tests
204
- - @flaky - Flag for known flaky tests
163
+ The failing test output is provided with your task. Start there.
205
164
 
206
- **Custom Tags** (flexible metadata):
207
- - @key:value - Any custom metadata (e.g., @browser:chrome, @viewport:mobile)
208
-
209
- **Playwright - Use native tag property (preferred):**
210
- test("User can complete purchase", {
211
- tag: ['@feature:checkout', '@priority:high', '@test_type:e2e', '@owner:qa@example.com']
212
- }, async ({ page }) => {
213
- // test code
214
- });
165
+ 1. **Analyze** \u2014 Read the error message and stack trace from the provided output
166
+ 2. **Categorize** \u2014 Identify root cause: selector, timing, state, data, or logic
167
+ 3. **Investigate** \u2014 Read the failing test and relevant source code
168
+ 4. **Fix** \u2014 Make minimal, targeted changes. Don't weaken assertions or skip tests.
169
+ 5. **Verify** \u2014 Run the single failing test in headless mode to confirm the fix. If a process hangs, kill it and check for interactive flags.
170
+ 6. **Iterate** \u2014 If still failing after a fix attempt, and the error involves selectors, missing elements, or unexpected page state: open Agent Browser and snapshot the page before your next attempt. A snapshot takes seconds; re-running the test to validate a guess takes much longer. Max 3 attempts per test.
215
171
 
216
- **WebdriverIO/Other frameworks - Use title tags:**
217
- it("@feature:checkout @priority:high @test_type:e2e User can complete purchase", async () => {
218
- // test code
219
- });
220
- </test_tagging>
172
+ Continue until all tests pass. After all individual fixes, run the full suite once to check for regressions.
173
+ </workflow>
221
174
 
222
175
  <root_causes>
223
- **Selector** - Element changed or locator is fragile \u2192 update selector, add wait, make more specific
176
+ **Selector** \u2014 Element changed or locator fragile \u2192 update to roles/labels/test IDs (survive refactors unlike CSS classes)
177
+ **Timing** \u2014 Race condition or async issue \u2192 explicit wait for element/state/network, not arbitrary delays
178
+ **State** \u2014 Test pollution or setup issue \u2192 ensure cleanup, add preconditions, refresh data
179
+ **Data** \u2014 Hardcoded or missing data \u2192 use dynamic data, create via API
180
+ **Logic** \u2014 Assertion wrong or outdated \u2192 update expectation to match actual behavior
181
+ </root_causes>
224
182
 
225
- **Timing** - Race condition or async issue \u2192 add explicit wait for element/state/network
183
+ <agent_browser>
184
+ Agent Browser CLI (via Bash tool) \u2014 for checking live page state during debugging:
185
+ - agent-browser open <url> \u2014 Open the page
186
+ - agent-browser snapshot -i \u2014 See interactive elements with @ref IDs
187
+ - agent-browser click @e1 / fill @e2 "text" \u2014 Interact by ref
188
+ - agent-browser screenshot \u2014 Capture page state
189
+ - agent-browser errors \u2014 Check console errors
190
+ - agent-browser close \u2014 End session
226
191
 
227
- **State** - Test pollution or setup issue \u2192 ensure cleanup, add preconditions, refresh data
192
+ Re-snapshot after each interaction. Walk through the test flow manually to compare expected vs actual behavior.
193
+ </agent_browser>
228
194
 
229
- **Data** - Hardcoded or missing data \u2192 use dynamic data, create via API
195
+ <test_tagging>
196
+ If tests are missing metadata tags, add them \u2014 this is a Supatest platform feature used for filtering, assignment, and reporting.
230
197
 
231
- **Logic** - Assertion wrong or outdated \u2192 update expectation to match actual behavior
232
- </root_causes>
198
+ **Tags**: @feature:name, @priority:critical|high|medium|low, @test_type:smoke|e2e|regression|integration|unit, @owner:email, @ticket:PROJ-123, @slow, @flaky, @key:value
233
199
 
234
- <execution>
235
- - Run in headless/CI mode - avoid interactive UIs that block
236
- - Check package.json scripts for correct test command
237
- - Run single failing test first for faster feedback
238
- - If process hangs, kill it and check for interactive flags
239
- </execution>
240
-
241
- <fixing_principles>
242
- - Use semantic selectors (roles, labels, test IDs) over CSS classes
243
- - Use condition-based waits, not arbitrary delays
244
- - Each test should be independent with its own data
245
- - Don't weaken assertions to make tests pass
246
- - Don't skip or remove tests without understanding the failure
247
- </fixing_principles>
248
-
249
- <browser_inspection>
250
- If available in /mcp commands, use Playwright MCP for live debugging when the failure is unclear from all available assets:
251
- - Test code, error logs, and stack traces
252
- - Application code and related files in the repo
253
- - Configuration and test setup files
254
-
255
- Execute the test flow with MCP to observe actual behavior:
256
- - Navigate and interact as the test does
257
- - Verify element states, attributes, and content
258
- - Check console errors and runtime issues
259
- - Test selectors and locators against live DOM
260
- - Inspect page state at each step
261
- </browser_inspection>
262
-
263
- <flakiness>
264
- After fixing, verify stability by running 2-3 times. Watch for:
265
- - Inconsistent pass/fail results
266
- - Timing sensitivity
267
- - Order dependence with other tests
268
- - Coupling to specific data state
269
- </flakiness>
200
+ **Playwright**: test("...", { tags: ['@feature:auth', '@priority:high'] }, async ({ page }) => { });
201
+ **WebdriverIO/Other**: it("... (@feature:auth @priority:high)", async () => { });
202
+ </test_tagging>
270
203
 
271
204
  <decisions>
272
205
  **Keep iterating:** New hypothesis available, error message changed (progress), under 3 attempts
273
-
274
- **Escalate:** 3 attempts with no progress, actual app bug found, requirements unclear
275
-
276
- When escalating, report what you tried and why it didn't work.
206
+ **Escalate:** 3 attempts with no progress, actual app bug found, requirements unclear \u2014 report what you tried and why it didn't work
277
207
  </decisions>
278
208
 
279
209
  <report>
280
- **Status**: fixed | escalated | in-progress
210
+ Generate this once after all tests are addressed, not after each individual test.
211
+
212
+ **Status**: fixed | escalated
281
213
  **Test**: [file and name]
282
214
  **Root Cause**: [category] - [specific cause]
283
215
  **Fix**: [what changed]
284
- **Verification**: [N runs, results]
285
216
 
286
217
  Summarize: X/Y tests passing
287
218
  </report>`;
@@ -346,8 +277,7 @@ Use these commands in interactive mode (type them and press Enter):
346
277
  ### Setup & Discovery
347
278
  - **/setup** - Check prerequisites and set up required tools
348
279
  - Verifies Node.js version (requires 18+)
349
- - Checks for required browsers and frameworks
350
- - Configures the default Playwright MCP server
280
+ - Checks and installs Agent Browser for browser automation
351
281
  - Run this once when starting with a new project
352
282
 
353
283
  - **/discover** - Scan your project to detect test framework and structure
@@ -390,7 +320,7 @@ Use these commands in interactive mode (type them and press Enter):
390
320
  - View all Model Context Protocol servers available to the agent
391
321
  - See connection status of each server
392
322
  - Add, remove, or test servers
393
- - MCP servers extend capabilities (e.g., Playwright for browser automation)
323
+ - MCP servers extend capabilities with additional tools and services
394
324
 
395
325
  - **/login** - Authenticate with Supatest
396
326
  - Opens your browser to log in to your Supatest account
@@ -495,7 +425,7 @@ Supatest creates and uses a .supatest/ directory in your project:
495
425
  - **.supatest/mcp.json** - MCP server configuration
496
426
  - Defines Model Context Protocol servers available to the agent
497
427
  - Can be project-level (committed to version control) or global
498
- - Created by /setup with default Playwright server
428
+ - Optional file for custom MCP server configuration
499
429
 
500
430
  - **.supatest/settings.json** - Project settings
501
431
  - Stores user preferences
@@ -517,16 +447,14 @@ MCP servers extend Supatest with additional tools and capabilities.
517
447
 
518
448
  ### What is MCP?
519
449
  Model Context Protocol is a standard that allows AI agents to interact with external tools and services. MCP servers provide access to:
520
- - Browser automation (Playwright)
450
+ - Custom tool integrations
521
451
  - File system operations
522
- - Custom project tools
523
452
  - External services
524
453
 
525
- ### Default Setup
526
- When you run /setup, Supatest automatically configures:
527
- - **Playwright MCP Server** - Browser automation for E2E testing
528
- - Command: npx @modelcontextprotocol/server-playwright
529
- - Enables: Opening browsers, navigating pages, interacting with UI elements
454
+ ### Browser Automation
455
+ Browser automation is handled by Agent Browser, a CLI tool installed during /setup.
456
+ The agent uses it via Bash commands (e.g., agent-browser open, agent-browser snapshot -i).
457
+ No MCP configuration is needed for browser automation.
530
458
 
531
459
  ### Configuration
532
460
 
@@ -548,19 +476,13 @@ Project servers take precedence over global servers with the same name.
548
476
  \`\`\`json
549
477
  {
550
478
  "mcpServers": {
551
- "playwright": {
552
- "command": "npx",
553
- "args": ["@modelcontextprotocol/server-playwright"],
554
- "description": "Browser automation via Playwright",
555
- "enabled": true
556
- },
557
479
  "custom-tool": {
558
480
  "command": "node",
559
481
  "args": ["/path/to/server.js"],
560
482
  "env": {
561
483
  "API_KEY": "value"
562
484
  },
563
- "description": "My custom tool",
485
+ "description": "My custom MCP server",
564
486
  "enabled": true
565
487
  }
566
488
  }
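The documentation in this hunk states that project servers take precedence over global servers with the same name, and each entry carries an "enabled" field. A minimal sketch of that resolution rule follows. It is an illustration only: `resolveMcpServers` is a hypothetical name, the CLI's real merge logic is not shown in this diff, and treating a missing "enabled" field as enabled is an assumption.

```javascript
// Hypothetical helper (not taken from the package): merges two parsed
// mcp.json objects so that project entries override global entries with the
// same name, then drops servers that are explicitly disabled.
function resolveMcpServers(globalConfig, projectConfig) {
  const merged = {
    ...(globalConfig?.mcpServers ?? {}),
    // Spread last so same-name project entries win.
    ...(projectConfig?.mcpServers ?? {}),
  };
  // Assumption: only "enabled": false disables a server.
  return Object.fromEntries(
    Object.entries(merged).filter(([, server]) => server.enabled !== false)
  );
}

// Example with the "custom-tool" entry from the documentation above.
const resolved = resolveMcpServers(
  { mcpServers: { "custom-tool": { command: "node", args: ["/global/server.js"], enabled: true } } },
  { mcpServers: { "custom-tool": { command: "node", args: ["/path/to/server.js"], enabled: true } } }
);
console.log(resolved["custom-tool"].args[0]);
```

The object spread keeps the rule in one place: whichever config is spread last wins on a name collision.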
@@ -854,13 +776,13 @@ Map risk levels to priority tags:
854
776
  - MEDIUM risk \u2192 @priority:medium
855
777
  - LOW risk \u2192 @priority:low
856
778
 
857
- **Playwright - Use native tag property (preferred):**
779
+ **Playwright - Use native tags property (preferred):**
858
780
  test("User can complete purchase", {
859
- tag: ['@feature:checkout', '@priority:high', '@test_type:e2e']
781
+ tags: ['@feature:checkout', '@priority:high', '@test_type:e2e']
860
782
  }, async ({ page }) => { });
861
783
 
862
- **WebdriverIO/Other frameworks - Use title tags:**
863
- it("@feature:checkout @priority:high @test_type:e2e User can complete purchase", async () => { });
784
+ **WebdriverIO/Other frameworks - Use title tags (at end for readability):**
785
+ it("User can complete purchase (@feature:checkout @priority:high @test_type:e2e)", async () => { });
864
786
  </test_tagging>
865
787
 
866
788
  <example>
@@ -1636,8 +1558,8 @@ var init_shared_es = __esm({
1636
1558
  };
1637
1559
  overrideErrorMap = errorMap;
1638
1560
  makeIssue = (params) => {
1639
- const { data, path: path6, errorMaps, issueData } = params;
1640
- const fullPath = [...path6, ...issueData.path || []];
1561
+ const { data, path: path5, errorMaps, issueData } = params;
1562
+ const fullPath = [...path5, ...issueData.path || []];
1641
1563
  const fullIssue = {
1642
1564
  ...issueData,
1643
1565
  path: fullPath
@@ -1728,11 +1650,11 @@ var init_shared_es = __esm({
1728
1650
  errorUtil2.toString = (message) => typeof message === "string" ? message : message?.message;
1729
1651
  })(errorUtil || (errorUtil = {}));
1730
1652
  ParseInputLazyPath = class {
1731
- constructor(parent, value, path6, key) {
1653
+ constructor(parent, value, path5, key) {
1732
1654
  this._cachedPath = [];
1733
1655
  this.parent = parent;
1734
1656
  this.data = value;
1735
- this._path = path6;
1657
+ this._path = path5;
1736
1658
  this._key = key;
1737
1659
  }
1738
1660
  get path() {
@@ -6415,9 +6337,6 @@ var init_shared_es = __esm({
6415
6337
 
6416
6338
  // src/commands/setup.ts
6417
6339
  import { execSync, spawn, spawnSync } from "child_process";
6418
- import fs from "fs";
6419
- import os from "os";
6420
- import path from "path";
6421
6340
  function parseVersion(versionString) {
6422
6341
  const cleaned = versionString.trim().replace(/^v/, "");
6423
6342
  const match = cleaned.match(/^(\d+)\.(\d+)\.(\d+)/);
@@ -6442,52 +6361,24 @@ function getNodeVersion() {
6442
6361
  return null;
6443
6362
  }
6444
6363
  }
6445
- function getPlaywrightVersion() {
6364
+ function getAgentBrowserVersion() {
6446
6365
  try {
6447
- const result = spawnSync("npx", ["playwright", "--version"], {
6366
+ const result = spawnSync("agent-browser", ["--version"], {
6448
6367
  encoding: "utf-8",
6449
6368
  stdio: ["ignore", "pipe", "ignore"],
6450
6369
  shell: true
6451
- // Required for Windows where npx is npx.cmd
6370
+ // Required for Windows
6452
6371
  });
6453
6372
  if (result.status === 0 && result.stdout) {
6454
- return result.stdout.trim().replace("Version ", "");
6373
+ return result.stdout.trim();
6455
6374
  }
6456
6375
  return null;
6457
6376
  } catch {
6458
6377
  return null;
6459
6378
  }
6460
6379
  }
6461
- function getPlaywrightCachePath() {
6462
- const homeDir = os.homedir();
6463
- const cachePaths = [
6464
- path.join(homeDir, "Library", "Caches", "ms-playwright"),
6465
- // macOS
6466
- path.join(homeDir, ".cache", "ms-playwright"),
6467
- // Linux
6468
- path.join(homeDir, "AppData", "Local", "ms-playwright")
6469
- // Windows
6470
- ];
6471
- for (const cachePath of cachePaths) {
6472
- if (fs.existsSync(cachePath)) {
6473
- return cachePath;
6474
- }
6475
- }
6476
- return null;
6477
- }
6478
- function getInstalledChromiumVersion() {
6479
- const cachePath = getPlaywrightCachePath();
6480
- if (!cachePath) return null;
6481
- try {
6482
- const entries = fs.readdirSync(cachePath);
6483
- const chromiumVersions = entries.filter((entry) => entry.startsWith("chromium-") && !entry.includes("headless")).map((entry) => entry.replace("chromium-", "")).sort((a, b) => Number(b) - Number(a));
6484
- return chromiumVersions[0] || null;
6485
- } catch {
6486
- return null;
6487
- }
6488
- }
6489
- function isChromiumInstalled() {
6490
- return getInstalledChromiumVersion() !== null;
6380
+ function isAgentBrowserInstalled() {
6381
+ return getAgentBrowserVersion() !== null;
6491
6382
  }
6492
6383
  function checkNodeVersion() {
6493
6384
  const nodeVersion = getNodeVersion();
@@ -6511,86 +6402,60 @@ function checkNodeVersion() {
6511
6402
  version: nodeVersion.raw
6512
6403
  };
6513
6404
  }
6514
- async function installChromium() {
6405
+ async function installAgentBrowser() {
6515
6406
  return new Promise((resolve2) => {
6516
- const child = spawn("npx", ["playwright", "install", "chromium"], {
6407
+ const child = spawn("npm", ["install", "-g", "agent-browser"], {
6517
6408
  stdio: "inherit",
6518
6409
  shell: true
6519
- // Required for Windows where npx is npx.cmd
6410
+ // Required for Windows
6520
6411
  });
6521
6412
  child.on("close", (code) => {
6522
- if (code === 0) {
6413
+ if (code !== 0) {
6523
6414
  resolve2({
6524
- ok: true,
6525
- message: "Chromium browser installed successfully."
6415
+ ok: false,
6416
+ message: `npm install -g agent-browser exited with code ${code}`
6526
6417
  });
6527
- } else {
6418
+ return;
6419
+ }
6420
+ const browserInstall = spawn("agent-browser", ["install"], {
6421
+ stdio: "inherit",
6422
+ shell: true
6423
+ });
6424
+ browserInstall.on("close", (browserCode) => {
6425
+ if (browserCode === 0) {
6426
+ resolve2({
6427
+ ok: true,
6428
+ message: "Agent Browser and Chromium installed successfully."
6429
+ });
6430
+ } else {
6431
+ resolve2({
6432
+ ok: false,
6433
+ message: `agent-browser install exited with code ${browserCode}`
6434
+ });
6435
+ }
6436
+ });
6437
+ browserInstall.on("error", (error) => {
6528
6438
  resolve2({
6529
6439
  ok: false,
6530
- message: `Playwright install exited with code ${code}`
6440
+ message: `Failed to install Chromium via agent-browser: ${error.message}`
6531
6441
  });
6532
- }
6442
+ });
6533
6443
  });
6534
6444
  child.on("error", (error) => {
6535
6445
  resolve2({
6536
6446
  ok: false,
6537
- message: `Failed to install Chromium: ${error.message}`
6447
+ message: `Failed to install Agent Browser: ${error.message}`
6538
6448
  });
6539
6449
  });
6540
6450
  });
6541
6451
  }
6542
- function createSupatestConfig(cwd) {
6543
- const supatestDir = path.join(cwd, ".supatest");
6544
- const mcpJsonPath = path.join(supatestDir, "mcp.json");
6545
- try {
6546
- if (!fs.existsSync(supatestDir)) {
6547
- fs.mkdirSync(supatestDir, { recursive: true });
6548
- }
6549
- let config2;
6550
- let fileExisted = false;
6551
- if (fs.existsSync(mcpJsonPath)) {
6552
- fileExisted = true;
6553
- const existingContent = fs.readFileSync(mcpJsonPath, "utf-8");
6554
- config2 = JSON.parse(existingContent);
6555
- } else {
6556
- config2 = {};
6557
- }
6558
- if (!config2.mcpServers || typeof config2.mcpServers !== "object") {
6559
- config2.mcpServers = {};
6560
- }
6561
- if (!config2.mcpServers.playwright) {
6562
- config2.mcpServers.playwright = DEFAULT_MCP_CONFIG.mcpServers.playwright;
6563
- }
6564
- fs.writeFileSync(mcpJsonPath, JSON.stringify(config2, null, 2) + "\n", "utf-8");
6565
- if (fileExisted) {
6566
- return {
6567
- ok: true,
6568
- message: "Updated .supatest/mcp.json with Playwright MCP server configuration",
6569
- created: false
6570
- };
6571
- }
6572
- return {
6573
- ok: true,
6574
- message: "Created .supatest/mcp.json with Playwright MCP server configuration",
6575
- created: true
6576
- };
6577
- } catch (error) {
6578
- return {
6579
- ok: false,
6580
- message: `Failed to create mcp.json: ${error instanceof Error ? error.message : String(error)}`,
6581
- created: false
6582
- };
6583
- }
6584
- }
6585
6452
  function getVersionSummary() {
6586
6453
  const nodeVersion = getNodeVersion();
6587
- const playwrightVersion = getPlaywrightVersion();
6588
- const chromiumVersion = getInstalledChromiumVersion();
6454
+ const agentBrowserVersion = getAgentBrowserVersion();
6589
6455
  const lines = [];
6590
6456
  lines.push("\n\u{1F4CB} Installed Versions:");
6591
- lines.push(` Node.js: ${nodeVersion?.raw || "Not installed"}`);
6592
- lines.push(` Playwright: ${playwrightVersion || "Not installed"}`);
6593
- lines.push(` Chromium: ${chromiumVersion ? `build ${chromiumVersion}` : "Not installed"}`);
6457
+ lines.push(` Node.js: ${nodeVersion?.raw || "Not installed"}`);
6458
+ lines.push(` Agent Browser: ${agentBrowserVersion || "Not installed"}`);
6594
6459
  return lines.join("\n");
6595
6460
  }
6596
6461
  async function setupCommand(options) {
@@ -6600,7 +6465,7 @@ async function setupCommand(options) {
6600
6465
  };
6601
6466
  const result = {
6602
6467
  nodeVersionOk: false,
6603
- playwrightInstalled: false,
6468
+ agentBrowserInstalled: false,
6604
6469
  errors: [],
6605
6470
  output: ""
6606
6471
  };
@@ -6624,33 +6489,21 @@ async function setupCommand(options) {
6624
6489
  log(` nvm install ${MINIMUM_NODE_VERSION}`);
6625
6490
  log(` nvm use ${MINIMUM_NODE_VERSION}`);
6626
6491
  }
6627
- log("\n2. Checking Chromium browser...");
6628
- if (isChromiumInstalled()) {
6629
- log(" \u2705 Chromium browser already installed");
6630
- result.playwrightInstalled = true;
6492
+ log("\n2. Checking Agent Browser...");
6493
+ if (isAgentBrowserInstalled()) {
6494
+ log(" \u2705 Agent Browser already installed");
6495
+ result.agentBrowserInstalled = true;
6631
6496
  } else {
6632
- log(" \u{1F4E6} Chromium browser not found. Installing...\n");
6633
- const chromiumResult = await installChromium();
6634
- result.playwrightInstalled = chromiumResult.ok;
6497
+ log(" \u{1F4E6} Agent Browser not found. Installing...\n");
6498
+ const installResult = await installAgentBrowser();
6499
+ result.agentBrowserInstalled = installResult.ok;
6635
6500
  log("");
6636
- if (chromiumResult.ok) {
6637
- log(` \u2705 ${chromiumResult.message}`);
6638
- } else {
6639
- log(` \u274C ${chromiumResult.message}`);
6640
- result.errors.push(chromiumResult.message);
6641
- }
6642
- }
6643
- log("\n3. Setting up MCP configuration...");
6644
- const configResult = createSupatestConfig(options.cwd);
6645
- if (configResult.ok) {
6646
- if (configResult.created) {
6647
- log(` \u2705 ${configResult.message}`);
6501
+ if (installResult.ok) {
6502
+ log(` \u2705 ${installResult.message}`);
6648
6503
  } else {
6649
- log(` \u2705 ${configResult.message}`);
6504
+ log(` \u274C ${installResult.message}`);
6505
+ result.errors.push(installResult.message);
6650
6506
  }
6651
- } else {
6652
- log(` \u274C ${configResult.message}`);
6653
- result.errors.push(configResult.message);
6654
6507
  }
6655
6508
  const versionSummary = getVersionSummary();
6656
6509
  log(versionSummary);
@@ -6667,19 +6520,11 @@ async function setupCommand(options) {
6667
6520
  result.output = output.join("\n");
6668
6521
  return result;
6669
6522
  }
6670
- var MINIMUM_NODE_VERSION, DEFAULT_MCP_CONFIG;
6523
+ var MINIMUM_NODE_VERSION;
6671
6524
  var init_setup = __esm({
6672
6525
  "src/commands/setup.ts"() {
6673
6526
  "use strict";
6674
6527
  MINIMUM_NODE_VERSION = 18;
6675
- DEFAULT_MCP_CONFIG = {
6676
- mcpServers: {
6677
- playwright: {
6678
- command: "npx",
6679
- args: ["@playwright/mcp@latest"]
6680
- }
6681
- }
6682
- };
6683
6528
  }
6684
6529
  });
6685
6530
 
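The setup module keeps MINIMUM_NODE_VERSION = 18 and parses version strings with the regular expression visible in the context lines of an earlier hunk. The sketch below shows how such a check might fit together. Only the two parsing lines and the constant come from the diff; the return shape of `parseVersion` (beyond the `raw` field the diff reads) and the `meetsMinimumNode` helper are assumptions made for illustration.

```javascript
// Value taken from the diff.
const MINIMUM_NODE_VERSION = 18;

// The first two statements mirror the context lines in the diff; everything
// after them is an assumed completion, since the rest of the function is
// not shown.
function parseVersion(versionString) {
  const cleaned = versionString.trim().replace(/^v/, "");
  const match = cleaned.match(/^(\d+)\.(\d+)\.(\d+)/);
  if (!match) return null; // assumed behaviour for unparseable input
  return {
    major: Number(match[1]),
    minor: Number(match[2]),
    patch: Number(match[3]),
    raw: versionString.trim(),
  };
}

// Hypothetical helper: true when the major version satisfies the minimum.
function meetsMinimumNode(versionString) {
  const parsed = parseVersion(versionString);
  return parsed !== null && parsed.major >= MINIMUM_NODE_VERSION;
}

console.log(meetsMinimumNode("v18.17.0")); // true
console.log(meetsMinimumNode("v16.20.2")); // false
```

Comparing only the major version is enough here because the documented requirement is "18+"; a stricter minimum would also need the minor and patch fields.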
@@ -6693,13 +6538,13 @@ var init_version = __esm({
  });
  
  // src/utils/error-logger.ts
- import * as fs2 from "fs";
- import * as os2 from "os";
- import * as path2 from "path";
+ import * as fs from "fs";
+ import * as os from "os";
+ import * as path from "path";
  function ensureLogDir() {
  try {
- if (!fs2.existsSync(LOGS_DIR)) {
- fs2.mkdirSync(LOGS_DIR, { recursive: true });
+ if (!fs.existsSync(LOGS_DIR)) {
+ fs.mkdirSync(LOGS_DIR, { recursive: true });
  }
  return true;
  } catch {
6708
6553
  }
6709
6554
  function rotateLogIfNeeded() {
6710
6555
  try {
6711
- if (!fs2.existsSync(ERROR_LOG_FILE)) return;
6712
- const stats = fs2.statSync(ERROR_LOG_FILE);
6556
+ if (!fs.existsSync(ERROR_LOG_FILE)) return;
6557
+ const stats = fs.statSync(ERROR_LOG_FILE);
6713
6558
  if (stats.size > MAX_LOG_SIZE) {
6714
6559
  const oldLogFile = `${ERROR_LOG_FILE}.old`;
6715
- if (fs2.existsSync(oldLogFile)) {
6716
- fs2.unlinkSync(oldLogFile);
6560
+ if (fs.existsSync(oldLogFile)) {
6561
+ fs.unlinkSync(oldLogFile);
6717
6562
  }
6718
- fs2.renameSync(ERROR_LOG_FILE, oldLogFile);
6563
+ fs.renameSync(ERROR_LOG_FILE, oldLogFile);
6719
6564
  }
6720
6565
  } catch {
6721
6566
  }
@@ -6751,7 +6596,7 @@ function logError(error, context) {
  const logLine = `${JSON.stringify(entry)}
  `;
  try {
- fs2.appendFileSync(ERROR_LOG_FILE, logLine);
+ fs.appendFileSync(ERROR_LOG_FILE, logLine);
  } catch {
  }
  }
@@ -6760,16 +6605,16 @@ var init_error_logger = __esm({
  "src/utils/error-logger.ts"() {
  "use strict";
  init_version();
- SUPATEST_DIR = process.platform === "win32" ? path2.join(os2.tmpdir(), ".supatest") : path2.join(os2.homedir(), ".supatest");
- LOGS_DIR = path2.join(SUPATEST_DIR, "logs");
- ERROR_LOG_FILE = path2.join(LOGS_DIR, "error.log");
+ SUPATEST_DIR = process.platform === "win32" ? path.join(os.tmpdir(), ".supatest") : path.join(os.homedir(), ".supatest");
+ LOGS_DIR = path.join(SUPATEST_DIR, "logs");
+ ERROR_LOG_FILE = path.join(LOGS_DIR, "error.log");
  MAX_LOG_SIZE = 5 * 1024 * 1024;
  }
  });
  
  // src/utils/logger.ts
- import * as fs3 from "fs";
- import * as path3 from "path";
+ import * as fs2 from "fs";
+ import * as path2 from "path";
  import chalk from "chalk";
  var Logger, logger;
  var init_logger = __esm({
@@ -6795,14 +6640,14 @@ var init_logger = __esm({
  enableFileLogging(isDev = false) {
  this.isDev = isDev;
  if (!isDev) return;
- this.logFile = path3.join(process.cwd(), "cli.log");
+ this.logFile = path2.join(process.cwd(), "cli.log");
  const separator = `
  ${"=".repeat(80)}
  [${(/* @__PURE__ */ new Date()).toISOString()}] New CLI session started
  ${"=".repeat(80)}
  `;
  try {
- fs3.appendFileSync(this.logFile, separator);
+ fs2.appendFileSync(this.logFile, separator);
  } catch (error) {
  }
  }
@@ -6816,7 +6661,7 @@ ${"=".repeat(80)}
  ` : `[${timestamp}] [${level}] ${message}
  `;
  try {
- fs3.appendFileSync(this.logFile, logEntry);
+ fs2.appendFileSync(this.logFile, logEntry);
  } catch (error) {
  }
  }
@@ -7003,9 +6848,21 @@ var init_api_client = __esm({
  constructor(status, statusText, body) {
  let message;
  if (status === 401) {
- message = "Authentication required. Use /login to authenticate.";
+ message = "Authentication required. Run 'supatest' to login, or set SUPATEST_API_KEY for CI/headless use.";
  } else if (status === 403) {
- message = "Access denied. Your token may have been revoked.";
+ message = "Access denied. Your token may have been revoked. Run 'supatest' to re-authenticate.";
+ } else if (status === 429) {
+ let details = "";
+ try {
+ const parsed = JSON.parse(body);
+ if (parsed.used !== void 0 && parsed.limit !== void 0) {
+ const usedM = (parsed.used / 1e6).toFixed(1);
+ const limitM = (parsed.limit / 1e6).toFixed(1);
+ details = ` You've used ${usedM}M of your ${limitM}M monthly tokens.`;
+ }
+ } catch {
+ }
+ message = `Monthly token limit exceeded.${details} Usage resets at the start of next month. Manage your plan at https://code.supatest.ai/api-keys`;
  } else {
  message = `API error: ${status} ${statusText}`;
  if (body) {
@@ -7593,6 +7450,7 @@ var init_api_client = __esm({
  if (query2?.page) urlParams.set("page", query2.page.toString());
  if (query2?.limit) urlParams.set("limit", query2.limit.toString());
  if (query2?.status) urlParams.set("status", query2.status);
+ if (query2?.isFlaky !== void 0) urlParams.set("isFlaky", query2.isFlaky.toString());
  const url = `${this.apiUrl}/v1/tests-catalog/runs/${runId}?${urlParams.toString()}`;
  logger.debug(`Fetching tests catalog for run: ${runId}`);
  const response = await fetch(url, {
@@ -7802,10 +7660,10 @@ function loadProjectInstructions(cwd) {
  join5(cwd, "SUPATEST.md"),
  join5(cwd, ".supatest", "SUPATEST.md")
  ];
- for (const path6 of paths) {
- if (existsSync4(path6)) {
+ for (const path5 of paths) {
+ if (existsSync4(path5)) {
  try {
- return readFileSync3(path6, "utf-8");
+ return readFileSync3(path5, "utf-8");
  } catch {
  }
  }
@@ -7983,7 +7841,7 @@ ${projectInstructions}`,
  includePartialMessages: true,
  executable: "node",
  // MCP servers from .supatest/mcp.json
- // Users can add servers like Playwright if needed
+ // Users can add custom MCP servers if needed
  mcpServers: (() => {
  logger.debug("[agent] Loading MCP servers for query", { cwd });
  const servers = loadMcpServers(cwd);
@@ -8277,7 +8135,7 @@ ${projectInstructions}`,
  return result;
  }
  async resolveClaudeCodePath() {
- const fs5 = await import("fs/promises");
+ const fs4 = await import("fs/promises");
  let claudeCodePath;
  const require2 = createRequire(import.meta.url);
  const sdkPath = require2.resolve("@anthropic-ai/claude-agent-sdk/sdk.mjs");
@@ -8290,7 +8148,7 @@ ${projectInstructions}`,
  );
  }
  try {
- await fs5.access(claudeCodePath);
+ await fs4.access(claudeCodePath);
  this.presenter.onLog(`\u2713 Claude Code CLI found: ${claudeCodePath}`);
  } catch {
  const error = `Claude Code executable not found at: ${claudeCodePath}
@@ -8321,8 +8179,8 @@ function getToolDescription(toolName, input) {
  return `pattern: "${input?.pattern || "files"}"`;
  case "Grep": {
  const pattern = input?.pattern || "code";
- const path6 = input?.path;
- return path6 ? `"${pattern}" (in ${path6})` : `"${pattern}"`;
+ const path5 = input?.path;
+ return path5 ? `"${pattern}" (in ${path5})` : `"${pattern}"`;
  }
  case "Task":
  return input?.subagent_type || "task";
@@ -10264,10 +10122,10 @@ function escapeForCmd(value) {
  return value.replace(/[&^]/g, "^$&");
  }
  function openBrowser(url) {
- const os3 = platform();
+ const os2 = platform();
  let command;
  let args;
- switch (os3) {
+ switch (os2) {
  case "darwin":
  command = "open";
  args = [url];
@@ -10574,11 +10432,50 @@ function buildErrorPage(errorMessage) {
  </html>
  `;
  }
+ function isPortAvailable(port) {
+ return new Promise((resolve2) => {
+ const testServer = http.createServer();
+ testServer.once("error", () => resolve2(false));
+ testServer.listen(port, "127.0.0.1", () => {
+ testServer.close(() => resolve2(true));
+ });
+ });
+ }
+ async function startCallbackServerWithRetry(ports, expectedState) {
+ for (const port of ports) {
+ const available = await isPortAvailable(port);
+ if (available) {
+ const loginPromise = startCallbackServer(port, expectedState);
+ return { loginPromise, port };
+ }
+ }
+ const portList = ports.join(", ");
+ const err = new Error(
+ `Login failed: All callback ports (${portList}) are in use.
+ Close the applications using these ports, or authenticate with an API key instead:
+ SUPATEST_API_KEY=<your-key> supatest
+
+ Get your API key at: https://code.supatest.ai/api-keys`
+ );
+ err.code = "EADDRINUSE";
+ throw err;
+ }
  async function loginCommand() {
  console.log("\nAuthenticating with Supatest...\n");
  const state = generateState();
- const loginPromise = startCallbackServer(CLI_LOGIN_PORT, state);
- const loginUrl = `${FRONTEND_URL}/cli-login?port=${CLI_LOGIN_PORT}&state=${state}`;
+ let loginPromise;
+ let port;
+ try {
+ const result = await startCallbackServerWithRetry(LOGIN_RETRY_PORTS, state);
+ loginPromise = result.loginPromise;
+ port = result.port;
+ } catch (error) {
+ console.error(`
+ \u274C ${error.message}
+ `);
+ throw error;
+ }
+ const loginUrl = `${FRONTEND_URL}/cli-login?port=${port}&state=${state}`;
  console.log(`Opening browser to: ${loginUrl}`);
  console.log("\nIf your browser doesn't open automatically, please visit the URL above.\n");
  try {
@@ -10597,19 +10494,30 @@ ${loginUrl}
  } catch (error) {
  const err = error;
  if (err.code === "EADDRINUSE") {
- console.error("\n\u274C Login failed: Something went wrong.");
- console.error(" Please restart the CLI and try again.\n");
+ console.error(
+ `
+ \u274C Login failed: Port ${port} is in use.
+ Close the application using it, or authenticate with an API key instead:
+ SUPATEST_API_KEY=<your-key> supatest
+
+ Get your API key at: https://code.supatest.ai/api-keys
+ `
+ );
+ } else if (error.message.includes("timeout")) {
+ console.error(
+ "\n\u274C Login timed out. Make sure you completed sign-in in your browser, then run 'supatest' to try again.\n\n Alternatively, use an API key: https://code.supatest.ai/api-keys\n"
+ );
  } else {
  console.error("\n\u274C Login failed:", error.message, "\n");
  }
  throw error;
  }
  }
- var CLI_LOGIN_PORT, FRONTEND_URL, API_URL, CALLBACK_TIMEOUT_MS, STATE_LENGTH;
+ var LOGIN_RETRY_PORTS, FRONTEND_URL, API_URL, CALLBACK_TIMEOUT_MS, STATE_LENGTH;
  var init_login = __esm({
  "src/commands/login.ts"() {
  "use strict";
- CLI_LOGIN_PORT = 8420;
+ LOGIN_RETRY_PORTS = [8420, 8422, 8423];
  FRONTEND_URL = process.env.SUPATEST_FRONTEND_URL || "https://code.supatest.ai";
  API_URL = process.env.SUPATEST_API_URL || "https://code-api.supatest.ai";
  CALLBACK_TIMEOUT_MS = 3e5;
@@ -10626,7 +10534,7 @@ import { spawn as spawn4 } from "child_process";
  import { createHash, randomBytes } from "crypto";
  import http2 from "http";
  import { platform as platform2 } from "os";
- var OAUTH_CONFIG, CALLBACK_PORT, CALLBACK_TIMEOUT_MS2, ClaudeOAuthService;
+ var OAUTH_CONFIG, OAUTH_RETRY_PORTS, CALLBACK_TIMEOUT_MS2, ClaudeOAuthService;
  var init_claude_oauth = __esm({
  "src/utils/claude-oauth.ts"() {
  "use strict";
@@ -10639,7 +10547,7 @@ var init_claude_oauth = __esm({
  // Local callback for CLI
  scopes: ["user:inference", "user:profile", "org:create_api_key"]
  };
- CALLBACK_PORT = 8421;
+ OAUTH_RETRY_PORTS = [8421, 8422, 8423];
  CALLBACK_TIMEOUT_MS2 = 3e5;
  ClaudeOAuthService = class _ClaudeOAuthService {
  secretStorage;
@@ -10647,9 +10555,39 @@ var init_claude_oauth = __esm({
  // 5 minutes
  pendingCodeVerifier = null;
  // Store code verifier for PKCE
+ activeRedirectUri = OAUTH_CONFIG.redirectUri;
+ // Dynamic redirect URI based on available port
  constructor(secretStorage) {
  this.secretStorage = secretStorage;
  }
+ /**
+ * Check if a port is available by briefly listening on it.
+ */
+ isPortAvailable(port) {
+ return new Promise((resolve2) => {
+ const testServer = http2.createServer();
+ testServer.once("error", () => resolve2(false));
+ testServer.listen(port, "127.0.0.1", () => {
+ testServer.close(() => resolve2(true));
+ });
+ });
+ }
+ /**
+ * Try to find an available port and start the callback server on it.
+ */
+ async findAvailablePort(ports, state) {
+ for (const port of ports) {
+ const available = await this.isPortAvailable(port);
+ if (available) {
+ const tokenPromise = this.startCallbackServer(port, state);
+ return { tokenPromise, port };
+ }
+ }
+ const portList = ports.join(", ");
+ throw new Error(
+ `Claude authentication failed: All callback ports (${portList}) are in use. Close the applications using these ports and try again.`
+ );
+ }
  /**
  * Starts the OAuth authorization flow
  * Opens the default browser for user authentication
@@ -10660,11 +10598,12 @@ var init_claude_oauth = __esm({
  const state = this.generateRandomState();
  const pkce = this.generatePKCEChallenge();
  this.pendingCodeVerifier = pkce.codeVerifier;
- const authUrl = this.buildAuthorizationUrl(state, pkce.codeChallenge);
  console.log("\nAuthenticating with Claude...\n");
+ const { tokenPromise, port } = await this.findAvailablePort(OAUTH_RETRY_PORTS, state);
+ this.activeRedirectUri = `http://localhost:${port}/callback`;
+ const authUrl = this.buildAuthorizationUrl(state, pkce.codeChallenge);
  console.log(`Opening browser to: ${authUrl}
  `);
- const tokenPromise = this.startCallbackServer(CALLBACK_PORT, state);
  try {
  this.openBrowser(authUrl);
  } catch (error) {
@@ -10679,9 +10618,16 @@ ${authUrl}
  return { success: true };
  } catch (error) {
  this.pendingCodeVerifier = null;
+ const message = error instanceof Error ? error.message : "Authentication failed";
+ if (message.includes("timeout")) {
+ return {
+ success: false,
+ error: "Claude authentication timed out. Make sure you completed sign-in in your browser, then use /provider to try again."
+ };
+ }
  return {
  success: false,
- error: error instanceof Error ? error.message : "Authentication failed"
+ error: message
  };
  }
  }
@@ -10776,7 +10722,7 @@ ${authUrl}
  code,
  state,
  // Non-standard: state in body
- redirect_uri: OAUTH_CONFIG.redirectUri,
+ redirect_uri: this.activeRedirectUri,
  client_id: OAUTH_CONFIG.clientId,
  code_verifier: this.pendingCodeVerifier
  // PKCE verifier
@@ -10924,7 +10870,7 @@ ${authUrl}
  const params = new URLSearchParams({
  response_type: "code",
  client_id: OAUTH_CONFIG.clientId,
- redirect_uri: OAUTH_CONFIG.redirectUri,
+ redirect_uri: this.activeRedirectUri,
  scope: OAUTH_CONFIG.scopes.join(" "),
  state
  });
@@ -10963,10 +10909,10 @@ ${authUrl}
  * Open a URL in the default browser cross-platform
  */
  openBrowser(url) {
- const os3 = platform2();
+ const os2 = platform2();
  let command;
  let args;
- switch (os3) {
+ switch (os2) {
  case "darwin":
  command = "open";
  args = [url];
@@ -11111,7 +11057,7 @@ __export(secret_storage_exports, {
  listSecrets: () => listSecrets,
  setSecret: () => setSecret
  });
- import { promises as fs4 } from "fs";
+ import { promises as fs3 } from "fs";
  import { homedir as homedir6 } from "os";
  import { dirname as dirname2, join as join8 } from "path";
  async function getSecret(key) {
@@ -11143,11 +11089,11 @@ var init_secret_storage = __esm({
  }
  async ensureDirectoryExists() {
  const dir = dirname2(this.secretFilePath);
- await fs4.mkdir(dir, { recursive: true, mode: 448 });
+ await fs3.mkdir(dir, { recursive: true, mode: 448 });
  }
  async loadSecrets() {
  try {
- const data = await fs4.readFile(this.secretFilePath, "utf-8");
+ const data = await fs3.readFile(this.secretFilePath, "utf-8");
  const secrets = JSON.parse(data);
  return new Map(Object.entries(secrets));
  } catch (error) {
@@ -11156,7 +11102,7 @@ var init_secret_storage = __esm({
  return /* @__PURE__ */ new Map();
  }
  try {
- await fs4.unlink(this.secretFilePath);
+ await fs3.unlink(this.secretFilePath);
  } catch {
  }
  return /* @__PURE__ */ new Map();
@@ -11166,7 +11112,7 @@ var init_secret_storage = __esm({
  await this.ensureDirectoryExists();
  const data = Object.fromEntries(secrets);
  const json = JSON.stringify(data, null, 2);
- await fs4.writeFile(this.secretFilePath, json, { mode: 384 });
+ await fs3.writeFile(this.secretFilePath, json, { mode: 384 });
  }
  async getSecret(key) {
  const secrets = await this.loadSecrets();
@@ -11185,7 +11131,7 @@ var init_secret_storage = __esm({
  secrets.delete(key);
  if (secrets.size === 0) {
  try {
- await fs4.unlink(this.secretFilePath);
+ await fs3.unlink(this.secretFilePath);
  } catch (error) {
  const err = error;
  if (err.code !== "ENOENT") {
@@ -12694,17 +12640,35 @@ var init_TestSelector = __esm({
  setError(null);
  try {
  const page = Math.floor(allTests.length / PAGE_SIZE2) + 1;
- const result = await apiClient.getRunTestsCatalog(run.id, {
- page,
- limit: PAGE_SIZE2,
- status: "failed"
- // Only fetch failed tests
- });
- setTotalTests(result.total ?? result.tests.length);
- const loadedCount = allTests.length + result.tests.length;
- const total = result.total ?? loadedCount;
- setHasMore(result.tests.length === PAGE_SIZE2 && loadedCount < total);
- setAllTests((prev) => [...prev, ...result.tests]);
+ const [failedResult, flakyResult] = await Promise.all([
+ apiClient.getRunTestsCatalog(run.id, {
+ page,
+ limit: PAGE_SIZE2,
+ status: "failed"
+ // Fetch failed tests
+ }),
+ apiClient.getRunTestsCatalog(run.id, {
+ page,
+ limit: PAGE_SIZE2,
+ isFlaky: true
+ // Fetch flaky tests
+ })
+ ]);
+ const testsMap = /* @__PURE__ */ new Map();
+ for (const test of failedResult.tests) {
+ testsMap.set(test.id, test);
+ }
+ for (const test of flakyResult.tests) {
+ testsMap.set(test.id, test);
+ }
+ const newTests = Array.from(testsMap.values());
+ const maxTotal = Math.max(failedResult.total ?? 0, flakyResult.total ?? 0);
+ setTotalTests(maxTotal);
+ const loadedCount = allTests.length + newTests.length;
+ const hasMoreFailed = failedResult.tests.length === PAGE_SIZE2;
+ const hasMoreFlaky = flakyResult.tests.length === PAGE_SIZE2;
+ setHasMore((hasMoreFailed || hasMoreFlaky) && loadedCount < maxTotal);
+ setAllTests((prev) => [...prev, ...newTests]);
  } catch (err) {
  setError(err instanceof Error ? err.message : String(err));
  setHasMore(false);
@@ -12858,7 +12822,8 @@ var init_TestSelector = __esm({
  const visibleTests = filteredTests.slice(adjustedStart, adjustedEnd);
  const branch = run.git?.branch || "unknown";
  const commit = run.git?.commit?.slice(0, 7) || "";
- return /* @__PURE__ */ React21.createElement(Box18, { borderColor: "cyan", borderStyle: "round", flexDirection: "column", padding: 1 }, /* @__PURE__ */ React21.createElement(Box18, { marginBottom: 1 }, /* @__PURE__ */ React21.createElement(Text16, { bold: true, color: "cyan" }, "Run: ", branch, commit && /* @__PURE__ */ React21.createElement(Text16, { color: theme.text.dim }, " @ ", commit), /* @__PURE__ */ React21.createElement(Text16, { color: theme.text.dim }, " \u2022 "), /* @__PURE__ */ React21.createElement(Text16, { color: "red" }, allTests.length, " failed"), /* @__PURE__ */ React21.createElement(Text16, { color: theme.text.dim }, " \u2022 "), /* @__PURE__ */ React21.createElement(Text16, { color: "green" }, availableCount, " avail"), /* @__PURE__ */ React21.createElement(Text16, { color: theme.text.dim }, " \u2022 "), /* @__PURE__ */ React21.createElement(Text16, { color: "yellow" }, assignedCount, " working"))), /* @__PURE__ */ React21.createElement(Box18, { marginBottom: 1 }, /* @__PURE__ */ React21.createElement(Text16, { color: theme.text.dim }, "[", showAvailableOnly ? "x" : " ", "] ", /* @__PURE__ */ React21.createElement(Text16, { bold: true }, "t"), " avail only", " ", "[", groupByFile ? "x" : " ", "] ", /* @__PURE__ */ React21.createElement(Text16, { bold: true }, "f"), " group files")), /* @__PURE__ */ React21.createElement(Box18, { flexDirection: "column" }, /* @__PURE__ */ React21.createElement(Box18, { marginBottom: 1 }, /* @__PURE__ */ React21.createElement(
+ const flakyCount = run.summary?.flaky ?? 0;
+ return /* @__PURE__ */ React21.createElement(Box18, { borderColor: "cyan", borderStyle: "round", flexDirection: "column", padding: 1 }, /* @__PURE__ */ React21.createElement(Box18, { marginBottom: 1 }, /* @__PURE__ */ React21.createElement(Text16, { bold: true, color: "cyan" }, "Run: ", branch, commit && /* @__PURE__ */ React21.createElement(Text16, { color: theme.text.dim }, " @ ", commit), /* @__PURE__ */ React21.createElement(Text16, { color: theme.text.dim }, " \u2022 "), /* @__PURE__ */ React21.createElement(Text16, { color: "red" }, allTests.length, " failed"), /* @__PURE__ */ React21.createElement(Text16, { color: theme.text.dim }, " \u2022 "), /* @__PURE__ */ React21.createElement(Text16, { color: "magenta" }, flakyCount, " flaky"), /* @__PURE__ */ React21.createElement(Text16, { color: theme.text.dim }, " \u2022 "), /* @__PURE__ */ React21.createElement(Text16, { color: "green" }, availableCount, " avail"), /* @__PURE__ */ React21.createElement(Text16, { color: theme.text.dim }, " \u2022 "), /* @__PURE__ */ React21.createElement(Text16, { color: "yellow" }, assignedCount, " working"))), /* @__PURE__ */ React21.createElement(Box18, { marginBottom: 1 }, /* @__PURE__ */ React21.createElement(Text16, { color: theme.text.dim }, "[", showAvailableOnly ? "x" : " ", "] ", /* @__PURE__ */ React21.createElement(Text16, { bold: true }, "t"), " avail only", " ", "[", groupByFile ? "x" : " ", "] ", /* @__PURE__ */ React21.createElement(Text16, { bold: true }, "f"), " group files")), /* @__PURE__ */ React21.createElement(Box18, { flexDirection: "column" }, /* @__PURE__ */ React21.createElement(Box18, { marginBottom: 1 }, /* @__PURE__ */ React21.createElement(
  Text16,
  {
  backgroundColor: isOnFixNext10 ? theme.text.accent : void 0,
@@ -12956,14 +12921,29 @@ var init_FixFlow = __esm({
  };
  const fetchAssignments = async (runId) => {
  try {
- const result = await apiClient.getRunTestsCatalog(runId, {
- status: "failed",
- limit: 1e3
- // Get all failed tests for assignment lookup
- });
+ const [failedResult, flakyResult] = await Promise.all([
+ apiClient.getRunTestsCatalog(runId, {
+ status: "failed",
+ limit: 1e3
+ // Get all failed tests for assignment lookup
+ }),
+ apiClient.getRunTestsCatalog(runId, {
+ isFlaky: true,
+ limit: 1e3
+ // Get all flaky tests for assignment lookup
+ })
+ ]);
+ const testsMap = /* @__PURE__ */ new Map();
+ for (const test of failedResult.tests) {
+ testsMap.set(test.id, test);
+ }
+ for (const test of flakyResult.tests) {
+ testsMap.set(test.id, test);
+ }
+ const allTests = Array.from(testsMap.values());
  const assignmentMap = /* @__PURE__ */ new Map();
  const catalogMap = /* @__PURE__ */ new Map();
- for (const test of result.tests) {
+ for (const test of allTests) {
  catalogMap.set(test.id, test.testId);
  if (test.assignment) {
  assignmentMap.set(test.id, {
@@ -13161,7 +13141,7 @@ Press ESC to go back and try again.`);
  });
  
  // src/ui/utils/file-search.ts
- import path4 from "path";
+ import path3 from "path";
  import { glob } from "glob";
  function fuzzyMatch(text, query2) {
  const textLower = text.toLowerCase();
@@ -13182,7 +13162,7 @@ function fuzzyMatch(text, query2) {
  if (queryIdx < queryLower.length) {
  return 0;
  }
- const segments = textLower.split(path4.sep);
+ const segments = textLower.split(path3.sep);
  for (const segment of segments) {
  if (segment.startsWith(queryLower[0])) {
  score += 0.5;
@@ -13371,7 +13351,7 @@ var init_ModelSelector = __esm({
  });
  
  // src/ui/components/InputPrompt.tsx
- import path5 from "path";
+ import path4 from "path";
  import chalk4 from "chalk";
  import { Box as Box21, Text as Text19 } from "ink";
  import React24, { forwardRef, memo as memo3, useEffect as useEffect10, useImperativeHandle, useState as useState11 } from "react";
@@ -13597,11 +13577,11 @@ var init_InputPrompt = __esm({
  cleanPath = cleanPath.slice(1, -1);
  }
  cleanPath = cleanPath.replace(/\\ /g, " ");
- if (path5.isAbsolute(cleanPath)) {
+ if (path4.isAbsolute(cleanPath)) {
  try {
  const cwd2 = process.cwd();
- const rel = path5.relative(cwd2, cleanPath);
- if (!rel.startsWith("..") && !path5.isAbsolute(rel)) {
+ const rel = path4.relative(cwd2, cleanPath);
+ if (!rel.startsWith("..") && !path4.isAbsolute(rel)) {
  cleanPath = rel;
  }
  } catch (e) {
@@ -15258,8 +15238,8 @@ function getToolDescription2(toolName, input) {
  return `pattern: "${input?.pattern || "files"}"`;
  case "Grep": {
  const pattern = input?.pattern || "code";
- const path6 = input?.path;
- return path6 ? `"${pattern}" (in ${path6})` : `"${pattern}"`;
+ const path5 = input?.path;
+ return path5 ? `"${pattern}" (in ${path5})` : `"${pattern}"`;
  }
  case "Task":
  return input?.subagent_type || "task";
@@ -16290,7 +16270,7 @@ program.name("supatest").description(
  "-m, --claude-max-iterations <number>",
  "Maximum number of iterations",
  "100"
- ).option("--supatest-api-key <key>", "Supatest API key (or use SUPATEST_API_KEY env)").option("--supatest-api-url <url>", "Supatest API URL (or use SUPATEST_API_URL env, defaults to https://code-api.supatest.ai)").option("--headless", "Run in headless mode (for CI/CD, minimal output)").option("--verbose", "Enable verbose logging").option("--model <model>", "Model to use (or use ANTHROPIC_MODEL_NAME env). Use 'small', 'medium', or 'premium' for tier-based selection").action(async (task, options) => {
+ ).option("--supatest-api-key <key>", "Supatest API key (or use SUPATEST_API_KEY env)").option("--supatest-api-url <url>", "Supatest API URL (or use SUPATEST_API_URL env, defaults to https://code-api.supatest.ai)").option("--headless", "Run in headless mode (for CI/CD, minimal output)").option("--mode <mode>", "Agent mode for headless: fix (default), build, or plan").option("--verbose", "Enable verbose logging").option("--model <model>", "Model to use (or use ANTHROPIC_MODEL_NAME env). Use 'small', 'medium', or 'premium' for tier-based selection").action(async (task, options) => {
  try {
  checkNodeVersion2();
  await checkAndAutoUpdate();
@@ -16311,9 +16291,9 @@ program.name("supatest").description(
  logs = stdinContent;
  }
  if (options.logs) {
- const fs5 = await import("fs/promises");
+ const fs4 = await import("fs/promises");
  try {
- logs = await fs5.readFile(options.logs, "utf-8");
+ logs = await fs4.readFile(options.logs, "utf-8");
  } catch (error) {
  logger.error(`Failed to read log file: ${options.logs}`);
  process.exit(1);
@@ -16338,6 +16318,8 @@ program.name("supatest").description(
  );
  logger.error(" 1. Set SUPATEST_API_KEY environment variable");
  logger.error(" 2. Use --supatest-api-key option");
+ logger.error("");
+ logger.error(" Get your API key at: https://code.supatest.ai/api-keys");
  process.exit(1);
  }
  } else {
@@ -16367,6 +16349,17 @@ program.name("supatest").description(
  if (!prompt) {
  throw new Error("Task is required in headless mode");
  }
+ const headlessMode = options.mode || "fix";
+ const validModes = ["fix", "build", "plan"];
+ if (!validModes.includes(headlessMode)) {
+ logger.error(`Invalid mode "${headlessMode}". Valid modes: ${validModes.join(", ")}`);
+ process.exit(1);
+ }
+ const systemPromptMap = {
+ fix: config.headlessSystemPrompt,
+ build: config.interactiveSystemPrompt,
+ plan: config.planSystemPrompt
+ };
  logger.raw(getBanner());
  const result = await runAgent({
  task: prompt,
@@ -16376,9 +16369,10 @@ program.name("supatest").description(
  maxIterations: Number.parseInt(options.maxIterations || "100", 10),
  verbose: options.verbose || false,
  cwd: options.cwd,
- systemPromptAppend: config.headlessSystemPrompt,
+ systemPromptAppend: systemPromptMap[headlessMode],
  selectedModel,
- oauthToken
+ oauthToken,
+ mode: headlessMode === "plan" ? "plan" : "build"
  });
  process.exit(result.success ? 0 : 1);
  } else {
@@ -16406,7 +16400,7 @@ program.name("supatest").description(
  process.exit(1);
  }
  });
- program.command("setup").description("Check prerequisites and set up required tools (Node.js, Playwright MCP)").option("-C, --cwd <path>", "Working directory for setup", process.cwd()).action(async (options) => {
+ program.command("setup").description("Check prerequisites and set up required tools (Node.js, Agent Browser)").option("-C, --cwd <path>", "Working directory for setup", process.cwd()).action(async (options) => {
  try {
  const result = await setupCommand({ cwd: options.cwd });
  process.exit(result.errors.length === 0 ? 0 : 1);