greenrun-cli 0.1.5 → 0.1.6

@@ -191,7 +191,7 @@ function installSettings() {
  function installCommands() {
    const commandsDir = join(process.cwd(), '.claude', 'commands');
    mkdirSync(commandsDir, { recursive: true });
-   const commands = ['greenrun.md', 'greenrun-sweep.md'];
+   const commands = ['greenrun.md', 'greenrun-sweep.md', 'procedures.md'];
    for (const cmd of commands) {
      const src = join(TEMPLATES_DIR, 'commands', cmd);
      if (!existsSync(src)) {
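For context, a minimal sketch of what the full `installCommands` function plausibly looks like around this hunk. Only the lines shown in the diff are confirmed; the imports, the `TEMPLATES_DIR` definition, and the `copyFileSync` call are assumptions inferred from the surrounding code:

```js
import { mkdirSync, existsSync, copyFileSync } from 'node:fs';
import { join, dirname } from 'node:path';
import { fileURLToPath } from 'node:url';

// Assumption: templates ship inside the installed package, next to this module.
const TEMPLATES_DIR = join(dirname(fileURLToPath(import.meta.url)), 'templates');

function installCommands() {
  const commandsDir = join(process.cwd(), '.claude', 'commands');
  mkdirSync(commandsDir, { recursive: true });
  // 0.1.6 adds procedures.md, the shared execution-procedures template.
  const commands = ['greenrun.md', 'greenrun-sweep.md', 'procedures.md'];
  for (const cmd of commands) {
    const src = join(TEMPLATES_DIR, 'commands', cmd);
    if (!existsSync(src)) {
      continue; // assumed: silently skip templates missing from the package
    }
    copyFileSync(src, join(commandsDir, cmd));
  }
}
```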
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "greenrun-cli",
-   "version": "0.1.5",
+   "version": "0.1.6",
    "description": "CLI and MCP server for Greenrun - browser test management for Claude Code",
    "type": "module",
    "main": "dist/server.js",
@@ -36,41 +36,4 @@ Present the affected tests:

  ### 6. Offer to run

- Ask the user if they want to run the affected tests. If yes, execute them **in parallel** using the same approach as the `/greenrun` command:
-
- Use the project's `concurrency` setting (default: 5) to determine batch size. Split affected tests into batches and launch each batch simultaneously using the **Task tool** with `run_in_background: true`.
-
- For each test in a batch, launch a background agent with `max_turns: 30` and `model: "sonnet"`. Use this prompt:
-
- ```
- You are executing a single Greenrun browser test. You have access to browser automation tools and Greenrun MCP tools.
-
- **Test: {test_name}** (ID: {test_id})
-
- Step 1: Call `get_test` with test_id "{test_id}" to get full instructions.
- Step 2: Call `start_run` with test_id "{test_id}" to begin - save the returned `run_id`.
- Step 3: Execute the test instructions using browser automation:
- - Call `tabs_context_mcp` then create a new browser tab for this test
- - Follow each instruction step exactly as written
- - The instructions will tell you where to navigate and what to do
- - Only take a screenshot when you need to verify a visual assertion — not for every navigation or click
- - When reading page content, prefer `find` or `read_page` with `filter: "interactive"` over full DOM reads
- - NEVER trigger JavaScript alerts, confirms, or prompts — they block the browser extension entirely. Before clicking delete buttons or other destructive actions, use `javascript_tool` to override: `window.alert = () => {}; window.confirm = () => true; window.prompt = () => null;`
- - If browser tools stop responding (no result or timeout), assume a dialog is blocking — report the error and stop. Do not keep retrying.
- - If you get stuck or a step fails, record the failure and move on — do not retry more than once
- Step 4: Call `complete_run` with:
- - run_id: the run ID from step 2
- - status: "passed" if all checks succeeded, "failed" if any check failed, "error" if execution was blocked
- - result: a brief summary of what happened (include the failure reason if failed/error)
- Step 5: Close the browser tab you created to clean up.
-
- Return a single line summary: {test_name} | {status} | {result_summary}
- ```
-
- Wait for each batch to complete before launching the next. After all tests finish, present a summary table:
-
- | Test | Pages | Tags | Status | Result |
- |------|-------|------|--------|--------|
- | Test name | Affected page URLs | tag1, tag2 | passed/failed/error | Brief summary |
-
- Include the total count: "X passed, Y failed, Z errors out of N tests"
+ Ask the user if they want to run the affected tests. If yes, read `.claude/commands/procedures.md` for the agent prompt template and execution procedures. Follow those procedures to pre-fetch test details, launch agents in batches, collect results, and summarize.
@@ -23,79 +23,6 @@ If no argument is given, run all active tests.

  If there are no matching active tests, tell the user and stop.

- ### 3. Pre-fetch test details
+ ### 3. Execute tests

- Call `get_test` for ALL matching tests **in parallel** (multiple tool calls in one message). This retrieves the full instructions for each test.
-
- Then call `start_run` for ALL tests **in parallel** to get run IDs.
-
- You now have everything needed to launch agents: test name, full instructions, and run_id for each test.
-
- ### 4. Execute tests in parallel
-
- Split the test list into batches of size `concurrency` (from the project settings).
-
- For each batch, launch all tests simultaneously using the **Task tool** with `run_in_background: true`. Each background agent receives a prompt with the full instructions and run_id embedded — agents do NOT need to call `get_test` or `start_run`.
-
- ```
- For each test in the current batch, call the Task tool with:
- - subagent_type: "general-purpose"
- - run_in_background: true
- - max_turns: 50
- - model: "sonnet"
- - prompt: (see below)
- ```
-
- The prompt for each background agent should be:
-
- ```
- You are executing a single Greenrun browser test using browser automation tools. Be efficient — minimize tool calls to complete the test as fast as possible.
-
- **Test: {test_name}**
- **Run ID: {run_id}**
-
- ## Test Instructions
-
- {paste the full test instructions from get_test here}
-
- ## Execution Steps
-
- 1. Call `tabs_context_mcp` then create a new browser tab with `tabs_create_mcp`
- 2. Follow each test instruction step exactly as written, using these rules to minimize tool calls:
-
- **Speed rules (critical):**
- - NEVER take screenshots. Use `read_page` or `find` for all assertions and to locate elements.
- - Navigate directly to URLs (e.g. `navigate` to `/tokens`) instead of clicking through nav links
- - Use `javascript_tool` for quick assertions: `document.querySelector('h1')?.textContent` is faster than `read_page` for checking a heading
- - Use `read_page` with `filter: "interactive"` to verify multiple things in one call rather than separate `find` calls
- - Use `form_input` with element refs for filling forms — avoid click-then-type sequences
- - When clicking elements, use `ref` parameter instead of coordinates to avoid needing screenshots
- - Combine verification: after a page loads, do ONE `read_page` call and check all assertions from that result
-
- **Reliability rules:**
- - NEVER trigger JavaScript alerts, confirms, or prompts — they block the browser extension entirely. Before clicking delete buttons or other destructive actions, use `javascript_tool` to override: `window.alert = () => {}; window.confirm = () => true; window.prompt = () => null;`
- - If browser tools stop responding (no result or timeout), assume a dialog is blocking — report the error and stop. Do not keep retrying.
- - If you get stuck or a step fails, record the failure and move on — do not retry more than once
- - If you are redirected to a login page, try using an existing logged-in tab from `tabs_context_mcp` instead of creating a new one
-
- 3. Call `complete_run` with:
- - run_id: "{run_id}"
- - status: "passed" if all checks succeeded, "failed" if any check failed, "error" if execution was blocked
- - result: a brief summary of what happened (include the failure reason if failed/error)
-
- Return a single line summary: {test_name} | {status} | {result_summary}
- ```
-
- After launching all agents in a batch, wait for them all to complete (use `TaskOutput` to collect results) before launching the next batch.
-
- ### 5. Summarize results
-
- After all batches complete, collect results from all background agents and present a summary table:
-
- | Test | Pages | Tags | Status | Result |
- |------|-------|------|--------|--------|
- | Test name | /login, /dashboard | smoke, auth | passed/failed/error | Brief summary |
-
- Include the total count: "X passed, Y failed, Z errors out of N tests"
-
- If any tests failed, highlight what went wrong and suggest next steps.
+ Read `.claude/commands/procedures.md` for the agent prompt template and execution procedures. Follow those procedures to pre-fetch test details, launch agents in batches, collect results, and summarize.
@@ -0,0 +1,64 @@
+ Shared procedures for executing Greenrun browser tests in parallel. Referenced by `/greenrun` and `/greenrun-sweep`.
+
+ ## Pre-fetch
+
+ Before launching agents, call `get_test` for ALL tests **in parallel** to get full instructions. Then call `start_run` for ALL tests **in parallel** to get run IDs.
+
+ ## Launch agents
+
+ Split tests into batches of size `concurrency` (from project settings, default: 5).
+
+ For each batch, launch all tests simultaneously using the **Task tool** with `run_in_background: true`:
+
+ ```
+ For each test in the current batch, call the Task tool with:
+ - subagent_type: "general-purpose"
+ - run_in_background: true
+ - max_turns: 25
+ - model: "haiku"
+ - prompt: (see agent prompt below)
+ ```
+
+ ### Agent prompt
+
+ ```
+ Execute a Greenrun browser test. Run ID: {run_id}
+
+ **Test: {test_name}**
+
+ ## Instructions
+ {paste the full test instructions from get_test here}
+
+ ## Setup
+ 1. Call `tabs_context_mcp`, then `tabs_create_mcp` to create YOUR tab. Use ONLY this tabId — other tabs belong to parallel tests.
+ 2. Navigate to the first URL. Run `javascript_tool`: `window.location.pathname`. If it returns `/login`, call `complete_run` with status "error", result "Not authenticated", then `window.close()` and stop.
+
+ ## Execution rules
+ - Verify assertions with `screenshot` after actions that change the page. Do NOT use `read_page` for verification.
+ - Use `find` to locate elements, then `ref` parameter on `computer` tool or `form_input` to interact.
+ - Navigate with absolute URLs via `navigate` — don't click nav links.
+ - Before destructive buttons: `window.alert = () => {}; window.confirm = () => true; window.prompt = () => null;`
+ - On failure or timeout, retry ONCE then move on. Max 35 tool calls total.
+
+ ## Finish
+ Call `complete_run` with run_id "{run_id}", status ("passed"/"failed"/"error"), and a brief result summary.
+ Then run `javascript_tool`: `window.close()`.
+
+ Return: {test_name} | {status} | {result_summary}
+ ```
+
+ ## Collect results
+
+ After launching all agents in a batch, wait for them all to complete (use `TaskOutput`) before launching the next batch.
+
+ ## Summarize
+
+ After all batches complete, present a summary table:
+
+ | Test | Pages | Tags | Status | Result |
+ |------|-------|------|--------|--------|
+ | Test name | /login, /dashboard | smoke, auth | passed/failed/error | Brief summary |
+
+ Include the total count: "X passed, Y failed, Z errors out of N tests"
+
+ If any tests failed, highlight what went wrong and suggest next steps.
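Taken together, the new procedures file describes a simple batch-and-wait loop that Claude drives with tool calls rather than library code. As an illustration only, the control flow corresponds roughly to the following sketch; `launchAgent` is a hypothetical stand-in for the Task tool plus `TaskOutput` collection, not a greenrun-cli API:

```js
// Hypothetical stand-in for one background agent: launch it, await its
// single-line summary, and parse it into { name, status, result }.
async function launchAgent(test) {
  // ...Task tool with run_in_background: true, then TaskOutput...
  return { name: test.name, status: 'passed', result: 'ok' };
}

// Batch tests into groups of `concurrency`; each batch runs in parallel,
// and the next batch starts only after the previous one completes.
async function runInBatches(tests, concurrency = 5) {
  const results = [];
  for (let i = 0; i < tests.length; i += concurrency) {
    const batch = tests.slice(i, i + concurrency);
    results.push(...(await Promise.all(batch.map(launchAgent))));
  }
  // Matches the summary format the procedures ask for,
  // e.g. "3 passed, 1 failed, 0 errors out of 4 tests".
  const count = (s) => results.filter((r) => r.status === s).length;
  console.log(
    `${count('passed')} passed, ${count('failed')} failed, ` +
    `${count('error')} errors out of ${results.length} tests`
  );
  return results;
}
```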