@fro.bot/systematic 2.0.1 → 2.0.3

Files changed (57)
  1. package/agents/design/figma-design-sync.md +1 -1
  2. package/agents/document-review/coherence-reviewer.md +40 -0
  3. package/agents/document-review/design-lens-reviewer.md +46 -0
  4. package/agents/document-review/feasibility-reviewer.md +42 -0
  5. package/agents/document-review/product-lens-reviewer.md +50 -0
  6. package/agents/document-review/scope-guardian-reviewer.md +54 -0
  7. package/agents/document-review/security-lens-reviewer.md +38 -0
  8. package/agents/research/best-practices-researcher.md +2 -1
  9. package/agents/research/git-history-analyzer.md +1 -1
  10. package/agents/research/repo-research-analyst.md +164 -9
  11. package/agents/review/api-contract-reviewer.md +49 -0
  12. package/agents/review/correctness-reviewer.md +49 -0
  13. package/agents/review/data-migrations-reviewer.md +53 -0
  14. package/agents/review/maintainability-reviewer.md +49 -0
  15. package/agents/review/pattern-recognition-specialist.md +2 -1
  16. package/agents/review/performance-reviewer.md +51 -0
  17. package/agents/review/reliability-reviewer.md +49 -0
  18. package/agents/review/schema-drift-detector.md +12 -10
  19. package/agents/review/security-reviewer.md +51 -0
  20. package/agents/review/testing-reviewer.md +48 -0
  21. package/agents/workflow/pr-comment-resolver.md +1 -1
  22. package/agents/workflow/spec-flow-analyzer.md +60 -89
  23. package/dist/index.js +3 -3
  24. package/package.json +1 -1
  25. package/skills/agent-browser/SKILL.md +69 -48
  26. package/skills/ce-brainstorm/SKILL.md +2 -1
  27. package/skills/ce-compound/SKILL.md +26 -1
  28. package/skills/ce-compound-refresh/SKILL.md +11 -1
  29. package/skills/ce-ideate/SKILL.md +2 -1
  30. package/skills/ce-plan/SKILL.md +424 -414
  31. package/skills/ce-review/SKILL.md +12 -13
  32. package/skills/ce-review-beta/SKILL.md +506 -0
  33. package/skills/ce-review-beta/references/diff-scope.md +31 -0
  34. package/skills/ce-review-beta/references/findings-schema.json +128 -0
  35. package/skills/ce-review-beta/references/persona-catalog.md +50 -0
  36. package/skills/ce-review-beta/references/review-output-template.md +115 -0
  37. package/skills/ce-review-beta/references/subagent-template.md +56 -0
  38. package/skills/ce-work/SKILL.md +14 -6
  39. package/skills/ce-work-beta/SKILL.md +14 -8
  40. package/skills/claude-permissions-optimizer/SKILL.md +15 -14
  41. package/skills/deepen-plan/SKILL.md +348 -483
  42. package/skills/document-review/SKILL.md +160 -52
  43. package/skills/feature-video/SKILL.md +209 -178
  44. package/skills/file-todos/SKILL.md +72 -94
  45. package/skills/frontend-design/SKILL.md +243 -27
  46. package/skills/git-worktree/SKILL.md +37 -28
  47. package/skills/lfg/SKILL.md +7 -7
  48. package/skills/reproduce-bug/SKILL.md +154 -60
  49. package/skills/resolve-pr-parallel/SKILL.md +19 -12
  50. package/skills/resolve-todo-parallel/SKILL.md +9 -6
  51. package/skills/setup/SKILL.md +33 -56
  52. package/skills/slfg/SKILL.md +5 -5
  53. package/skills/test-browser/SKILL.md +69 -145
  54. package/skills/test-xcode/SKILL.md +61 -183
  55. package/skills/triage/SKILL.md +10 -10
  56. package/skills/ce-plan-beta/SKILL.md +0 -571
  57. package/skills/deepen-plan-beta/SKILL.md +0 -323
@@ -8,26 +8,20 @@ disable-model-invocation: true
8
8
 
9
9
  ## Interaction Method
10
10
 
11
- If `question` is available, use it for all prompts below.
11
+ Ask the user each question below using the platform's blocking question tool (e.g., `question` in OpenCode, `request_user_input` in Codex, `ask_user` in Gemini). If no structured question tool is available, present each question as a numbered list and wait for a reply before proceeding. For multiSelect questions, accept comma-separated numbers (e.g. `1, 3`). Never skip or auto-configure.
12
12
 
13
- If not, present each question as a numbered list and wait for a reply before proceeding to the next step. For multiSelect questions, accept comma-separated numbers (e.g. `1, 3`). Never skip or auto-configure.
14
-
15
- Interactive setup for `systematic.local.md` — configures which agents run during `/ce:review` and `/ce:work`.
13
+ Interactive setup for `systematic.local.md`: configures which agents run during `ce:review` and `ce:work`.
16
14
 
17
15
  ## Step 1: Check Existing Config
18
16
 
19
- Read `systematic.local.md` in the project root. If it exists, display current settings summary and use question:
17
+ Read `systematic.local.md` in the project root. If it exists, display current settings and ask:
20
18
 
21
19
  ```
22
- question: "Settings file already exists. What would you like to do?"
23
- header: "Config"
24
- options:
25
- - label: "Reconfigure"
26
- description: "Run the interactive setup again from scratch"
27
- - label: "View current"
28
- description: "Show the file contents, then stop"
29
- - label: "Cancel"
30
- description: "Keep current settings"
20
+ Settings file already exists. What would you like to do?
21
+
22
+ 1. Reconfigure - Run the interactive setup again from scratch
23
+ 2. View current - Show the file contents, then stop
24
+ 3. Cancel - Keep current settings
31
25
  ```
32
26
 
33
27
  If "View current": read and display the file, then stop.
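
The numbered-list fallback described under "Interaction Method" accepts replies such as `1, 3` for multiSelect questions. The following is a minimal sketch of how such a reply could be validated. It assumes a hypothetical helper named `parse_selection`, which is not part of the package, and it treats any non-numeric or out-of-range entry as an error.

```bash
# Hypothetical helper (not part of @fro.bot/systematic).
# Usage: parse_selection "<reply>" <option_count>
# Prints the chosen option numbers separated by spaces, or fails with status 1.
parse_selection() (
  set -f                      # disable filename expansion for the unquoted split below
  max=$2
  result=""
  # Commas become spaces, then word splitting yields one token per choice.
  for token in $(printf '%s' "$1" | tr ',' ' '); do
    case $token in
      *[!0-9]*) echo "invalid selection: $token" >&2; return 1 ;;
    esac
    if [ "$token" -lt 1 ] || [ "$token" -gt "$max" ]; then
      echo "out of range: $token" >&2
      return 1
    fi
    result="${result:+$result }$token"
  done
  # An empty reply is rejected rather than treated as "use defaults".
  [ -n "$result" ] || { echo "no selection given" >&2; return 1; }
  printf '%s\n' "$result"
)
```

Rejecting an empty reply mirrors the "Never skip or auto-configure" rule: the caller would ask again instead of picking defaults silently.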
@@ -47,16 +41,13 @@ test -f requirements.txt && echo "python" || \
47
41
  echo "general"
48
42
  ```
49
43
 
50
- Use question:
44
+ Ask:
51
45
 
52
46
  ```
53
- question: "Detected {type} project. How would you like to configure?"
54
- header: "Setup"
55
- options:
56
- - label: "Auto-configure (Recommended)"
57
- description: "Use smart defaults for {type}. Done in one click."
58
- - label: "Customize"
59
- description: "Choose stack, focus areas, and review depth."
47
+ Detected {type} project. How would you like to configure?
48
+
49
+ 1. Auto-configure (Recommended) - Use smart defaults for {type}. Done in one click.
50
+ 2. Customize - Choose stack, focus areas, and review depth.
60
51
  ```
61
52
 
62
53
  ### If Auto-configure → Skip to Step 4 with defaults:
@@ -73,50 +64,35 @@ options:
73
64
  **a. Stack** — confirm or override:
74
65
 
75
66
  ```
76
- question: "Which stack should we optimize for?"
77
- header: "Stack"
78
- options:
79
- - label: "{detected_type} (Recommended)"
80
- description: "Auto-detected from project files"
81
- - label: "Rails"
82
- description: "Ruby on Rails — adds DHH-style and Rails-specific reviewers"
83
- - label: "Python"
84
- description: "Python — adds Pythonic pattern reviewer"
85
- - label: "TypeScript"
86
- description: "TypeScript — adds type safety reviewer"
67
+ Which stack should we optimize for?
68
+
69
+ 1. {detected_type} (Recommended) - Auto-detected from project files
70
+ 2. Rails - Ruby on Rails, adds DHH-style and Rails-specific reviewers
71
+ 3. Python - Adds Pythonic pattern reviewer
72
+ 4. TypeScript - Adds type safety reviewer
87
73
  ```
88
74
 
89
75
  Only show options that differ from the detected type.
90
76
 
91
- **b. Focus areas** — multiSelect:
77
+ **b. Focus areas** — multiSelect (user picks one or more):
92
78
 
93
79
  ```
94
- question: "Which review areas matter most?"
95
- header: "Focus"
96
- multiSelect: true
97
- options:
98
- - label: "Security"
99
- description: "Vulnerability scanning, auth, input validation (security-sentinel)"
100
- - label: "Performance"
101
- description: "N+1 queries, memory leaks, complexity (performance-oracle)"
102
- - label: "Architecture"
103
- description: "Design patterns, SOLID, separation of concerns (architecture-strategist)"
104
- - label: "Code simplicity"
105
- description: "Over-engineering, YAGNI violations (code-simplicity-reviewer)"
80
+ Which review areas matter most? (comma-separated, e.g. 1, 3)
81
+
82
+ 1. Security - Vulnerability scanning, auth, input validation (security-sentinel)
83
+ 2. Performance - N+1 queries, memory leaks, complexity (performance-oracle)
84
+ 3. Architecture - Design patterns, SOLID, separation of concerns (architecture-strategist)
85
+ 4. Code simplicity - Over-engineering, YAGNI violations (code-simplicity-reviewer)
106
86
  ```
107
87
 
108
88
  **c. Depth:**
109
89
 
110
90
  ```
111
- question: "How thorough should reviews be?"
112
- header: "Depth"
113
- options:
114
- - label: "Thorough (Recommended)"
115
- description: "Stack reviewers + all selected focus agents."
116
- - label: "Fast"
117
- description: "Stack reviewers + code simplicity only. Less context, quicker."
118
- - label: "Comprehensive"
119
- description: "All above + git history, data integrity, agent-native checks."
91
+ How thorough should reviews be?
92
+
93
+ 1. Thorough (Recommended) - Stack reviewers + all selected focus agents.
94
+ 2. Fast - Stack reviewers + code simplicity only. Less context, quicker.
95
+ 3. Comprehensive - All above + git history, data integrity, agent-native checks.
120
96
  ```
121
97
 
122
98
  ## Step 4: Build Agent List and Write File
@@ -151,7 +127,7 @@ plan_review_agents: [{computed plan agent list}]
151
127
  # Review Context
152
128
 
153
129
  Add project-specific review instructions here.
154
- These notes are passed to all review agents during /ce:review and /ce:work.
130
+ These notes are passed to all review agents during ce:review and ce:work.
155
131
 
156
132
  Examples:
157
133
  - "We use Turbo Frames heavily — check for frame-busting issues"
@@ -172,3 +148,4 @@ Agents: {count} configured
172
148
  Tip: Edit the "Review Context" section to add project-specific instructions.
173
149
  Re-run this setup anytime to reconfigure.
174
150
  ```
151
+
@@ -9,26 +9,26 @@ Swarm-enabled LFG. Run these steps in order, parallelizing where indicated. Do n
9
9
 
10
10
  ## Sequential Phase
11
11
 
12
- 1. **Optional:** If the `ralph-wiggum` skill is available, run `/ralph-wiggum:ralph-loop "finish all slash commands" --completion-promise "DONE"`. If not available or it fails, skip and continue to step 2 immediately.
13
- 2. `/systematic:ce-plan $ARGUMENTS`
12
+ 1. **Optional:** If the `ralph-loop` skill is available, run `/ralph-loop:ralph-loop "finish all slash commands" --completion-promise "DONE"`. If not available or it fails, skip and continue to step 2 immediately.
13
+ 2. `/ce:plan $ARGUMENTS`
14
14
  3. **Conditionally** run `/systematic:deepen-plan`
15
15
  - Run the `deepen-plan` workflow only if the plan is `Standard` or `Deep`, touches a high-risk area (auth, security, payments, migrations, external APIs, significant rollout concerns), or still has obvious confidence gaps in decisions, sequencing, system-wide impact, risks, or verification
16
16
  - If you run the `deepen-plan` workflow, confirm the plan was deepened or explicitly judged sufficiently grounded before moving on
17
17
  - If you skip it, note why and continue to step 4
18
- 4. `/systematic:ce-work` — **Use swarm mode**: Make a Task list and launch an army of agent swarm subagents to build the plan
18
+ 4. `/ce:work` — **Use swarm mode**: Make a Task list and launch an army of agent swarm subagents to build the plan
19
19
 
20
20
  ## Parallel Phase
21
21
 
22
22
  After work completes, launch steps 5 and 6 as **parallel swarm agents** (both only need code to be written):
23
23
 
24
- 5. `/systematic:ce-review` — spawn as background Task agent
24
+ 5. `/ce:review` — spawn as background Task agent
25
25
  6. `/systematic:test-browser` — spawn as background Task agent
26
26
 
27
27
  Wait for both to complete before continuing.
28
28
 
29
29
  ## Finalize Phase
30
30
 
31
- 7. `/systematic:resolve_todo_parallel` — resolve any findings from the review
31
+ 7. `/systematic:resolve-todo-parallel` — resolve findings, compound on learnings, clean up completed todos
32
32
  8. `/systematic:feature-video` — record the final walkthrough and add to PR
33
33
  9. Output `<promise>DONE</promise>` when video is in PR
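
The Parallel Phase above launches two steps together and waits for both before finalizing. As a rough analogy only, the same fork-and-join shape can be sketched with shell background jobs. The functions `run_review` and `run_browser_tests` are hypothetical stand-ins for the two steps; the skill itself spawns Task agents, not shell jobs.

```bash
# Hypothetical stand-ins for steps 5 and 6; replace with real work.
run_review() { echo "review done"; }
run_browser_tests() { echo "browser tests done"; }

run_parallel_phase() {
  # Fork: start both steps in the background and remember their PIDs.
  run_review &
  review_pid=$!
  run_browser_tests &
  tests_pid=$!

  # Join: wait for each job and keep its exit status.
  review_status=0
  tests_status=0
  wait "$review_pid" || review_status=$?
  wait "$tests_pid" || tests_status=$?

  # Succeed only if both steps succeeded.
  [ "$review_status" -eq 0 ] && [ "$tests_status" -eq 0 ]
}
```

Waiting on each PID separately keeps the two exit codes distinct, so a later step can tell which of the two failed.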
34
34
 
@@ -4,56 +4,45 @@ description: Run browser tests on pages affected by current PR or branch
4
4
  argument-hint: '[PR number, branch name, ''current'', or --port PORT]'
5
5
  ---
6
6
 
7
- # Browser Test Command
7
+ # Browser Test Skill
8
8
 
9
- <command_purpose>Run end-to-end browser tests on pages affected by a PR or branch changes using agent-browser CLI.</command_purpose>
9
+ Run end-to-end browser tests on pages affected by a PR or branch changes using the `agent-browser` CLI.
10
10
 
11
- ## CRITICAL: Use agent-browser CLI Only
11
+ ## Use `agent-browser` Only For Browser Automation
12
12
 
13
- **DO NOT use Chrome MCP tools (mcp__claude-in-chrome__*).**
13
+ This workflow uses the `agent-browser` CLI exclusively. Do not use any alternative browser automation system, browser MCP integration, or built-in browser-control tool. If the platform offers multiple ways to control a browser, always choose `agent-browser`.
14
14
 
15
- This command uses the `agent-browser` CLI exclusively. The agent-browser CLI is a Bash-based tool from Vercel that runs headless Chromium. It is NOT the same as Chrome browser automation via MCP.
15
+ Use `agent-browser` for: opening pages, clicking elements, filling forms, taking screenshots, and scraping rendered content.
16
16
 
17
- If you find yourself calling `mcp__claude-in-chrome__*` tools, STOP. Use `agent-browser` Bash commands instead.
18
-
19
- ## Introduction
20
-
21
- <role>QA Engineer specializing in browser-based end-to-end testing</role>
22
-
23
- This command tests affected pages in a real browser, catching issues that unit tests miss:
24
- - JavaScript integration bugs
25
- - CSS/layout regressions
26
- - User workflow breakages
27
- - Console errors
17
+ Platform-specific hints:
18
+ - In OpenCode, do not use Chrome MCP tools (`mcp__claude-in-chrome__*`).
19
+ - In Codex, do not substitute unrelated browsing tools.
28
20
 
29
21
  ## Prerequisites
30
22
 
31
- <requirements>
32
23
  - Local development server running (e.g., `bin/dev`, `rails server`, `npm run dev`)
33
- - agent-browser CLI installed (see Setup below)
24
+ - `agent-browser` CLI installed (see Setup below)
34
25
  - Git repository with changes to test
35
- </requirements>
36
26
 
37
27
  ## Setup
38
28
 
39
- **Check installation:**
40
29
  ```bash
41
30
  command -v agent-browser >/dev/null 2>&1 && echo "Installed" || echo "NOT INSTALLED"
42
31
  ```
43
32
 
44
- **Install if needed:**
33
+ Install if needed:
45
34
  ```bash
46
35
  npm install -g agent-browser
47
- agent-browser install # Downloads Chromium (~160MB)
36
+ agent-browser install
48
37
  ```
49
38
 
50
39
  See the `agent-browser` skill for detailed usage.
51
40
 
52
- ## Main Tasks
41
+ ## Workflow
53
42
 
54
- ### 0. Verify agent-browser Installation
43
+ ### 1. Verify Installation
55
44
 
56
- Before starting ANY browser testing, verify agent-browser is installed:
45
+ Before starting, verify `agent-browser` is available:
57
46
 
58
47
  ```bash
59
48
  command -v agent-browser >/dev/null 2>&1 && echo "Ready" || (echo "Installing..." && npm install -g agent-browser && agent-browser install)
@@ -61,27 +50,20 @@ command -v agent-browser >/dev/null 2>&1 && echo "Ready" || (echo "Installing...
61
50
 
62
51
  If installation fails, inform the user and stop.
63
52
 
64
- ### 1. Ask Browser Mode
65
-
66
- <ask_browser_mode>
67
-
68
- Before starting tests, ask user if they want to watch the browser:
69
-
70
- Use question with:
71
- - Question: "Do you want to watch the browser tests run?"
72
- - Options:
73
- 1. **Headed (watch)** - Opens visible browser window so you can see tests run
74
- 2. **Headless (faster)** - Runs in background, faster but invisible
53
+ ### 2. Ask Browser Mode
75
54
 
76
- Store the choice and use `--headed` flag when user selects "Headed".
55
+ Ask the user whether to run headed or headless (using the platform's question tool — e.g., `question` in OpenCode, `request_user_input` in Codex, `ask_user` in Gemini — or present options and wait for a reply):
77
56
 
78
- </ask_browser_mode>
57
+ ```
58
+ Do you want to watch the browser tests run?
79
59
 
80
- ### 2. Determine Test Scope
60
+ 1. Headed (watch) - Opens visible browser window so you can see tests run
61
+ 2. Headless (faster) - Runs in background, faster but invisible
62
+ ```
81
63
 
82
- <test_target> $ARGUMENTS </test_target>
64
+ Store the choice and use the `--headed` flag when the user selects option 1.
83
65
 
84
- <determine_scope>
66
+ ### 3. Determine Test Scope
85
67
 
86
68
  **If PR number provided:**
87
69
  ```bash
@@ -98,11 +80,7 @@ git diff --name-only main...HEAD
98
80
  git diff --name-only main...[branch]
99
81
  ```
100
82
 
101
- </determine_scope>
102
-
103
- ### 3. Map Files to Routes
104
-
105
- <file_to_route_mapping>
83
+ ### 4. Map Files to Routes
106
84
 
107
85
  Map changed files to testable routes:
108
86
 
@@ -120,45 +98,23 @@ Map changed files to testable routes:
120
98
 
121
99
  Build a list of URLs to test based on the mapping.
122
100
 
123
- </file_to_route_mapping>
124
-
125
- ### 4. Detect Dev Server Port
126
-
127
- <detect_port>
101
+ ### 5. Detect Dev Server Port
128
102
 
129
- Determine the dev server port using this priority order:
103
+ Determine the dev server port using this priority:
130
104
 
131
- **Priority 1: Explicit argument**
132
- If the user passed a port number (e.g., `/test-browser 5000` or `/test-browser --port 5000`), use that port directly.
105
+ 1. **Explicit argument** — if the user passed `--port 5000`, use that directly
106
+ 2. **Project instructions** - check `AGENTS.md` or other instruction files for port references
107
+ 3. **package.json** — check dev/start scripts for `--port` flags
108
+ 4. **Environment files** — check `.env`, `.env.local`, `.env.development` for `PORT=`
109
+ 5. **Default** — fall back to `3000`
133
110
 
134
- **Priority 2: AGENTS.md / project instructions**
135
111
  ```bash
136
- # Check AGENTS.md for port references
137
- grep -Eio '(port\s*[:=]\s*|localhost:)([0-9]{4,5})' AGENTS.md 2>/dev/null | grep -Eo '[0-9]{4,5}' | head -1
138
- ```
139
-
140
- **Priority 3: package.json scripts**
141
- ```bash
142
- # Check dev/start scripts for --port flags
143
- grep -Eo '\-\-port[= ]+[0-9]{4,5}' package.json 2>/dev/null | grep -Eo '[0-9]{4,5}' | head -1
144
- ```
145
-
146
- **Priority 4: Environment files**
147
- ```bash
148
- # Check .env, .env.local, .env.development for PORT=
149
- grep -h '^PORT=' .env .env.local .env.development 2>/dev/null | tail -1 | cut -d= -f2
150
- ```
151
-
152
- **Priority 5: Default fallback**
153
- If none of the above yields a port, default to `3000`.
154
-
155
- Store the result in a `PORT` variable for use in all subsequent steps.
156
-
157
- ```bash
158
- # Combined detection (run this)
159
112
  PORT="${EXPLICIT_PORT:-}"
160
113
  if [ -z "$PORT" ]; then
161
114
  PORT=$(grep -Eio '(port\s*[:=]\s*|localhost:)([0-9]{4,5})' AGENTS.md 2>/dev/null | grep -Eo '[0-9]{4,5}' | head -1)
115
+ if [ -z "$PORT" ]; then
116
+ PORT=$(grep -Eio '(port\s*[:=]\s*|localhost:)([0-9]{4,5})' AGENTS.md 2>/dev/null | grep -Eo '[0-9]{4,5}' | head -1)
117
+ fi
162
118
  fi
163
119
  if [ -z "$PORT" ]; then
164
120
  PORT=$(grep -Eo '\-\-port[= ]+[0-9]{4,5}' package.json 2>/dev/null | grep -Eo '[0-9]{4,5}' | head -1)
@@ -170,77 +126,64 @@ PORT="${PORT:-3000}"
170
126
  echo "Using dev server port: $PORT"
171
127
  ```
172
128
 
173
- </detect_port>
174
-
175
- ### 5. Verify Server is Running
176
-
177
- <check_server>
178
-
179
- Before testing, verify the local server is accessible using the detected port:
129
+ ### 6. Verify Server is Running
180
130
 
181
131
  ```bash
182
132
  agent-browser open http://localhost:${PORT}
183
133
  agent-browser snapshot -i
184
134
  ```
185
135
 
186
- If server is not running, inform user:
187
- ```markdown
188
- **Server not running on port ${PORT}**
136
+ If the server is not running, inform the user:
137
+
138
+ ```
139
+ Server not running on port ${PORT}
189
140
 
190
141
  Please start your development server:
191
142
  - Rails: `bin/dev` or `rails server`
192
143
  - Node/Next.js: `npm run dev`
193
- - Custom port: `/test-browser --port <your-port>`
144
+ - Custom port: run this skill again with `--port <your-port>`
194
145
 
195
- Then run `/test-browser` again.
146
+ Then re-run this skill.
196
147
  ```
197
148
 
198
- </check_server>
199
-
200
- ### 6. Test Each Affected Page
149
+ ### 7. Test Each Affected Page
201
150
 
202
- <test_pages>
151
+ For each affected route:
203
152
 
204
- For each affected route, use agent-browser CLI commands (NOT Chrome MCP):
205
-
206
- **Step 1: Navigate and capture snapshot**
153
+ **Navigate and capture snapshot:**
207
154
  ```bash
208
155
  agent-browser open "http://localhost:${PORT}/[route]"
209
156
  agent-browser snapshot -i
210
157
  ```
211
158
 
212
- **Step 2: For headed mode (visual debugging)**
159
+ **For headed mode:**
213
160
  ```bash
214
161
  agent-browser --headed open "http://localhost:${PORT}/[route]"
215
162
  agent-browser --headed snapshot -i
216
163
  ```
217
164
 
218
- **Step 3: Verify key elements**
165
+ **Verify key elements:**
219
166
  - Use `agent-browser snapshot -i` to get interactive elements with refs
220
167
  - Page title/heading present
221
168
  - Primary content rendered
222
169
  - No error messages visible
223
170
  - Forms have expected fields
224
171
 
225
- **Step 4: Test critical interactions**
172
+ **Test critical interactions:**
226
173
  ```bash
227
- agent-browser click @e1 # Use ref from snapshot
174
+ agent-browser click @e1
228
175
  agent-browser snapshot -i
229
176
  ```
230
177
 
231
- **Step 5: Take screenshots**
178
+ **Take screenshots:**
232
179
  ```bash
233
180
  agent-browser screenshot page-name.png
234
- agent-browser screenshot --full page-name-full.png # Full page
181
+ agent-browser screenshot --full page-name-full.png
235
182
  ```
236
183
 
237
- </test_pages>
238
-
239
- ### 7. Human Verification (When Required)
240
-
241
- <human_verification>
184
+ ### 8. Human Verification (When Required)
242
185
 
243
- Pause for human input when testing touches:
186
+ Pause for human input when testing touches flows that require external interaction:
244
187
 
245
188
  | Flow Type | What to Ask |
246
189
  |-----------|-------------|
@@ -250,11 +193,12 @@ Pause for human input when testing touches:
250
193
  | SMS | "Verify you received the SMS code" |
251
194
  | External APIs | "Confirm the [service] integration is working" |
252
195
 
253
- Use question:
254
- ```markdown
255
- **Human Verification Needed**
196
+ Ask the user (using the platform's question tool, or present numbered options and wait):
197
+
198
+ ```
199
+ Human Verification Needed
256
200
 
257
- This test touches the [flow type]. Please:
201
+ This test touches [flow type]. Please:
258
202
  1. [Action to take]
259
203
  2. [What to verify]
260
204
 
@@ -263,11 +207,7 @@ Did it work correctly?
263
207
  2. No - describe the issue
264
208
  ```
265
209
 
266
- </human_verification>
267
-
268
- ### 8. Handle Failures
269
-
270
- <failure_handling>
210
+ ### 9. Handle Failures
271
211
 
272
212
  When a test fails:
273
213
 
@@ -275,40 +215,27 @@ When a test fails:
275
215
  - Screenshot the error state: `agent-browser screenshot error.png`
276
216
  - Note the exact reproduction steps
277
217
 
278
- 2. **Ask user how to proceed:**
279
- ```markdown
280
- **Test Failed: [route]**
218
+ 2. **Ask the user how to proceed:**
219
+
220
+ ```
221
+ Test Failed: [route]
281
222
 
282
223
  Issue: [description]
283
224
  Console errors: [if any]
284
225
 
285
226
  How to proceed?
286
227
  1. Fix now - I'll help debug and fix
287
- 2. Create todo - Add to todos/ for later
228
+ 2. Create todo - Add a todo for later (using the file-todos skill)
288
229
  3. Skip - Continue testing other pages
289
230
  ```
290
231
 
291
- 3. **If "Fix now":**
292
- - Investigate the issue
293
- - Propose a fix
294
- - Apply fix
295
- - Re-run the failing test
296
-
297
- 4. **If "Create todo":**
298
- - Create `{id}-pending-p1-browser-test-{description}.md`
299
- - Continue testing
300
-
301
- 5. **If "Skip":**
302
- - Log as skipped
303
- - Continue testing
232
+ 3. **If "Fix now":** investigate, propose a fix, apply, re-run the failing test
233
+ 4. **If "Create todo":** load the `file-todos` skill and create a todo with priority p1 and description `browser-test-{description}`, continue
234
+ 5. **If "Skip":** log as skipped, continue
304
235
 
305
- </failure_handling>
236
+ ### 10. Test Summary
306
237
 
307
- ### 9. Test Summary
308
-
309
- <test_summary>
310
-
311
- After all tests complete, present summary:
238
+ After all tests complete, present a summary:
312
239
 
313
240
  ```markdown
314
241
  ## Browser Test Results
@@ -341,8 +268,6 @@ After all tests complete, present summary:
341
268
  ### Result: [PASS / FAIL / PARTIAL]
342
269
  ```
343
270
 
344
- </test_summary>
345
-
346
271
  ## Quick Usage Examples
347
272
 
348
273
  ```bash
@@ -361,8 +286,6 @@ After all tests complete, present summary:
361
286
 
362
287
  ## agent-browser CLI Reference
363
288
 
364
- **ALWAYS use these Bash commands. NEVER use mcp__claude-in-chrome__* tools.**
365
-
366
289
  ```bash
367
290
  # Navigation
368
291
  agent-browser open <url> # Navigate to URL
@@ -391,3 +314,4 @@ agent-browser --headed click @e1 # Click in visible browser
391
314
  agent-browser wait @e1 # Wait for element
392
315
  agent-browser wait 2000 # Wait milliseconds
393
316
  ```
317
+