sdlc-framework 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (53)
  1. package/LICENSE +21 -0
  2. package/README.md +321 -0
  3. package/bin/install.js +193 -0
  4. package/package.json +39 -0
  5. package/src/commands/close.md +200 -0
  6. package/src/commands/debug.md +124 -0
  7. package/src/commands/fast.md +149 -0
  8. package/src/commands/fix.md +104 -0
  9. package/src/commands/help.md +144 -0
  10. package/src/commands/hotfix.md +99 -0
  11. package/src/commands/impl.md +142 -0
  12. package/src/commands/init.md +93 -0
  13. package/src/commands/milestone.md +136 -0
  14. package/src/commands/pause.md +115 -0
  15. package/src/commands/research.md +136 -0
  16. package/src/commands/resume.md +103 -0
  17. package/src/commands/review.md +195 -0
  18. package/src/commands/spec.md +164 -0
  19. package/src/commands/status.md +118 -0
  20. package/src/commands/verify.md +153 -0
  21. package/src/references/clarification-strategy.md +352 -0
  22. package/src/references/engineering-laws.md +374 -0
  23. package/src/references/loop-phases.md +331 -0
  24. package/src/references/playwright-testing.md +298 -0
  25. package/src/references/prompt-detection.md +264 -0
  26. package/src/references/sub-agent-strategy.md +260 -0
  27. package/src/rules/commands.md +180 -0
  28. package/src/rules/style.md +354 -0
  29. package/src/rules/templates.md +238 -0
  30. package/src/rules/workflows.md +314 -0
  31. package/src/templates/HANDOFF.md +121 -0
  32. package/src/templates/LAWS.md +521 -0
  33. package/src/templates/PROJECT.md +112 -0
  34. package/src/templates/REVIEW.md +145 -0
  35. package/src/templates/ROADMAP.md +101 -0
  36. package/src/templates/SPEC.md +231 -0
  37. package/src/templates/STATE.md +106 -0
  38. package/src/templates/SUMMARY.md +126 -0
  39. package/src/workflows/close-phase.md +189 -0
  40. package/src/workflows/debug-flow.md +302 -0
  41. package/src/workflows/fast-forward.md +340 -0
  42. package/src/workflows/fix-findings.md +235 -0
  43. package/src/workflows/hotfix-flow.md +190 -0
  44. package/src/workflows/impl-phase.md +229 -0
  45. package/src/workflows/init-project.md +249 -0
  46. package/src/workflows/milestone-management.md +169 -0
  47. package/src/workflows/pause-work.md +153 -0
  48. package/src/workflows/research.md +219 -0
  49. package/src/workflows/resume-project.md +159 -0
  50. package/src/workflows/review-phase.md +337 -0
  51. package/src/workflows/spec-phase.md +379 -0
  52. package/src/workflows/transition-phase.md +203 -0
  53. package/src/workflows/verify-phase.md +280 -0
---

**package/src/references/loop-phases.md**
# SDLC Loop Phases Reference

This document explains the five phases of the SDLC loop for developers who are new to structured development lifecycles. It covers what each phase does, what it produces, how phases connect, and why closing the loop matters.

---

## The Big Picture

The SDLC loop consists of five phases that every piece of work passes through:

```
SPEC ──▶ IMPLEMENT ──▶ VERIFY ──▶ REVIEW ──▶ CLOSE
  ▲                                            │
  └────────────────────────────────────────────┘
                (next iteration)
```

Each phase has a specific job. Skipping a phase creates problems that compound over time. The loop closes when all five phases complete successfully, and then a new loop begins for the next piece of work.

---

## Phase 1: SPEC (Specification)

### What It Does

The spec phase answers: **What are we building, and why?**

Before writing any code, the spec phase:
1. Clarifies requirements by asking questions (no guessing)
2. Defines acceptance criteria in testable BDD format
3. Breaks the work into discrete tasks
4. Identifies which tasks can run in parallel
5. Draws boundaries (what NOT to change)

### What It Produces

A spec document (SPEC.md) containing:
- Clear objective (goal, purpose, output)
- Clarification audit trail (questions asked, answers received)
- Numbered acceptance criteria (AC-1, AC-2, etc.)
- Task list with dependencies
- Task dependency graph
- Boundary definitions

### Why It Exists

Without a spec, the developer starts coding based on assumptions. Assumptions are often wrong. The developer builds the wrong thing, discovers this during review, and starts over. A spec that takes 15 minutes to write saves hours of rework.

### Common Mistakes

- **Skipping clarification:** "I know what they want" — you probably do not.
- **Vague acceptance criteria:** "It should work" is not testable. "Given X, When Y, Then Z" is.
- **Missing boundaries:** Without a "do not touch" list, developers refactor code they should not.

---

## Phase 2: IMPLEMENT (Implementation)

### What It Does

The implementation phase answers: **How do we build it?**

Using the spec as a blueprint, this phase:
1. Analyzes the task dependency graph
2. Groups independent tasks into parallel waves
3. Assigns tasks to sub-agents (or executes inline for simple work)
4. Executes tasks wave by wave, integrating outputs between waves
5. Marks each task as done in the spec

### What It Produces

- Working code that implements all tasks in the spec
- Updated spec with completed task checkmarks
- Any decisions made during implementation (recorded in the spec or state file)

### Why It Exists

Implementation without a spec is random typing. Implementation with a spec is directed construction. The spec tells the developer exactly which files to create, what each function should do, and how to verify completion. No time is wasted deciding what to build next.

### Common Mistakes

- **Going beyond the spec:** "While I'm here, I'll also add..." — YAGNI. Build what is in the spec.
- **Ignoring the dependency graph:** Starting tasks whose dependencies are not complete leads to rework.
- **Not recording decisions:** Decisions made during implementation that are not in the spec must be recorded. Otherwise, the reviewer does not understand why something was built a certain way.

---

## Phase 3: VERIFY (Verification)

### What It Does

The verification phase answers: **Does it actually work?**

For each acceptance criterion in the spec:
1. Set up the preconditions (the "Given")
2. Perform the action (the "When")
3. Check the result (the "Then")
4. Collect evidence (screenshots, test output, API responses)

### What It Produces

- Pass/fail status for each acceptance criterion
- Evidence for each check (screenshots, test output)
- List of any criteria that failed and why

### Why It Exists

Code that compiles is not code that works. The implementation phase produces code that the developer believes is correct. The verification phase proves it. Every acceptance criterion gets checked. No assumptions. No "it probably works."

### Verification Methods

| Work Type | Primary Verification | Fallback |
|-----------|---------------------|----------|
| UI features | Playwright MCP (navigate, interact, screenshot) | Manual browser testing |
| API endpoints | Playwright `browser_evaluate` with fetch, or curl | Integration tests |
| Business logic | Unit test runner (`npm test`) | Manual function calls |
| Refactors | Existing test suite (all tests still pass) | Before/after behavior comparison |
| Bug fixes | Reproduce the bug, verify it is gone | Regression test |

### Common Mistakes

- **Only testing the happy path:** If AC-1 passes but AC-3 (the error case) was skipped, verification is incomplete.
- **Verifying in the wrong environment:** Testing against a mock database when the AC specifies "real data." Use the environment the spec defines.
- **Not collecting evidence:** Without screenshots or test output, the reviewer has no proof. Verification without evidence is just another assumption.

---

## Phase 4: REVIEW (Code Review)

### What It Does

The review phase answers: **Is the code good enough?**

Every file created or modified during implementation is reviewed against the engineering laws:
1. SOLID — proper structure and separation
2. DRY — no duplicate logic
3. YAGNI — nothing beyond the spec
4. Clean Code — small functions, shallow nesting
5. Security — no hardcoded secrets, input validation
6. Testing — every behavior has a test
7. Naming — descriptive, consistent names
8. Error Handling — no empty catches, domain-specific exceptions

### What It Produces

A review document (REVIEW.md) containing:
- List of files reviewed
- Findings with severity (error, warning, info)
- For each finding: file, line numbers, law violated, description, suggested fix
- Overall assessment and blocking status

### Why It Exists

Verification proves the code works. Review proves the code is maintainable. Code that works but violates engineering laws creates technical debt. Technical debt compounds. A 5-minute shortcut today becomes a 5-hour debugging session next month.

### What Blocks Closure

Findings at `error` severity block loop closure. The developer must fix the issues and re-run the review. Findings at `warning` severity are reported but do not block. Findings at `info` severity are noted.

### Common Mistakes

- **Treating warnings as ignorable:** Warnings are "fix this soon." Not fixing them means they accumulate until the codebase is drowning in minor issues.
- **Fixing symptoms, not causes:** A reviewer says "this function is too long." The developer splits it into two functions that are both still too long. Fix the root cause — extract meaningful helper functions.
- **Arguing with the law:** "But my 60-line function is readable!" The law says 40 lines. Refactor.

---

## Phase 5: CLOSE (Loop Closure)

### What It Does

The close phase answers: **Are we done, and what did we learn?**

Loop closure:
1. Confirms all acceptance criteria passed
2. Confirms all error-severity review findings are resolved
3. Produces a summary document with deliverables, results, and lessons learned
4. Updates the roadmap and state files
5. Sets the next action for the next loop iteration

### What It Produces

A summary document (SUMMARY.md) containing:
- Deliverables (what was built)
- Acceptance criteria results (pass/fail with evidence)
- Deviations from the spec (planned vs actual)
- Decisions made during implementation
- Engineering laws review results
- Files modified
- Lessons learned

### Why It Exists

Without closure, work is "sort of done." Is it done-done? Did everything pass? Were there deviations? Nobody knows because nobody wrote it down. Closure creates a clear record that this work is complete, what was delivered, and what to work on next.

### Closure Checklist

The loop cannot close unless:
- [ ] All acceptance criteria have a pass/fail status
- [ ] All `error` severity findings are resolved
- [ ] A summary document is written
- [ ] The state file is updated with the next action
- [ ] The roadmap is updated to reflect progress

### Common Mistakes

- **Skipping lessons learned:** "We'll remember." You will not. Write it down.
- **Leaving the next action empty:** The next developer (or the next session) opens the project and does not know what to do. Every close must set the next action.
- **Closing with unresolved blockers:** If there is a blocker, the loop does not close. Address it or defer it explicitly.

---

## How Phases Connect

Each phase consumes the output of the previous phase:

```
SPEC produces → spec document

IMPLEMENT consumes spec → produces working code

VERIFY consumes acceptance criteria → produces pass/fail evidence

REVIEW consumes code changes → produces findings

CLOSE consumes all outputs → produces summary + next action
```

If a phase fails, it loops back:
- **Verify fails:** Go back to IMPLEMENT. Fix the code. Re-verify.
- **Review fails:** Go back to IMPLEMENT. Fix the findings. Re-review.
- **Spec is incomplete:** Stay in SPEC. Ask more questions. Update the spec.

The loop never skips forward. You cannot review before verifying. You cannot close before reviewing.

---

## What "Closing the Loop" Means

Closing the loop means all five phases completed successfully for one unit of work (one plan). The result is:
- Working, verified code
- No error-severity engineering law violations
- A complete audit trail (spec, review, summary)
- A clear next action for the next iteration

When the loop closes, the system resets for the next plan. The state file updates. The roadmap advances. The developer knows exactly what to do next.

---

## With Loop vs Without Loop

### Without a Loop (Ad-Hoc Development)

1. Developer gets a vague request: "Add user search."
2. Starts coding immediately. Guesses at requirements.
3. Builds search with filters, sorting, and pagination (nobody asked for pagination).
4. Does not write tests ("I'll add them later").
5. Discovers the stakeholder wanted search by name only, not the full-featured search.
6. Reworks half the code. Still no tests.
7. Ships with a hardcoded API key in the config.
8. Next developer opens the codebase: "What was built? Is it done? What do I work on next?"

**Result:** Wrong thing built, no tests, security issue, no documentation, confused team.

### With a Loop (SDLC)

1. Developer gets the request: "Add user search."
2. **SPEC:** Asks clarifying questions. "Search by what fields? How should results be sorted? Any pagination needed?" Stakeholder says: "Just name search, alphabetical, no pagination."
3. **IMPLEMENT:** Builds exactly what the spec says. Name search, alphabetical sort, no pagination.
4. **VERIFY:** Tests that search by name works, results are sorted, and the error case (no results) is handled.
5. **REVIEW:** Code checked against engineering laws. One warning for a vague variable name. Fixed.
6. **CLOSE:** Summary written. Tests pass. No pagination to rip out later. Next action set.

**Result:** Right thing built, tested, reviewed, documented, clear path forward.

---

## Visual Diagrams

### Single Loop Iteration

```
┌─────────────────────────────────────────────────────────────────┐
│                            ONE LOOP                             │
│                                                                 │
│  ┌──────┐   ┌───────────┐   ┌────────┐   ┌────────┐   ┌───────┐ │
│  │ SPEC │──▶│ IMPLEMENT │──▶│ VERIFY │──▶│ REVIEW │──▶│ CLOSE │ │
│  └──────┘   └───────────┘   └────────┘   └────────┘   └───┬───┘ │
│     ▲            ▲  ▲           │            │            │     │
│     │            │  └───────────┘            │            │     │
│     │            │   (verify fails)          │            │     │
│     │            └───────────────────────────┘            │     │
│     │             (review fails)                          │     │
│     │                                                     │     │
│     └─────────────────────────────────────────────────────┘     │
│                        (next plan)                              │
└─────────────────────────────────────────────────────────────────┘
```

### Multiple Loops in a Phase

```
Phase 1: Authentication
├── Loop 1: Spec → Impl → Verify → Review → Close (user login)
├── Loop 2: Spec → Impl → Verify → Review → Close (password reset)
└── Loop 3: Spec → Impl → Verify → Review → Close (session management)

Phase 2: User Management
├── Loop 4: Spec → Impl → Verify → Review → Close (CRUD operations)
├── Loop 5: Spec → Impl → Verify → Review → Close (role-based access)
└── Loop 6: Spec → Impl → Verify → Review → Close (user search)
```

### Milestone Progression

```
Milestone 1: MVP
├── Phase 1 ── Loop 1 ── Loop 2 ── Loop 3
├── Phase 2 ── Loop 4 ── Loop 5
└── Phase 3 ── Loop 6 ── Loop 7 ── Loop 8

Milestone 2: v1.0
├── Phase 4 ── Loop 9 ── Loop 10
└── Phase 5 ── Loop 11
```

Each loop is one complete iteration of SPEC → IMPLEMENT → VERIFY → REVIEW → CLOSE.
---

**package/src/references/playwright-testing.md**
# Playwright MCP Verification Reference

This document explains how to use Playwright MCP tools for verification during the `/sdlc:verify` phase. It covers the available tools, how to translate acceptance criteria into Playwright sequences, and common pitfalls.

---

## Available MCP Tools

The SDLC framework uses Playwright through the Model Context Protocol (MCP). These are the tools available for browser-based verification.

### Navigation

| Tool | Purpose | Key Parameters |
|------|---------|----------------|
| `browser_navigate` | Go to a URL | `url` — the full URL to navigate to |
| `browser_navigate_back` | Go back one page | (none) |
| `browser_tabs` | List open tabs | (none) |

### Interaction

| Tool | Purpose | Key Parameters |
|------|---------|----------------|
| `browser_click` | Click an element | `element` — description or selector of what to click |
| `browser_fill_form` | Fill a form field | `element` — the field, `value` — what to type |
| `browser_type` | Type text (character by character) | `text` — the text to type |
| `browser_press_key` | Press a keyboard key | `key` — the key name (Enter, Tab, Escape, etc.) |
| `browser_select_option` | Select from a dropdown | `element` — the dropdown, `value` — the option |
| `browser_hover` | Hover over an element | `element` — what to hover over |
| `browser_drag` | Drag an element | `startElement`, `endElement` |
| `browser_file_upload` | Upload a file | `element` — the file input, `paths` — file paths |

### Observation

| Tool | Purpose | Key Parameters |
|------|---------|----------------|
| `browser_snapshot` | Get the current page accessibility tree (a text representation of the page) | (none) |
| `browser_take_screenshot` | Capture a screenshot | (none) |
| `browser_console_messages` | Get browser console output | (none) |
| `browser_network_requests` | Get network request log | (none) |

### Control

| Tool | Purpose | Key Parameters |
|------|---------|----------------|
| `browser_wait_for` | Wait for a condition | `condition` — what to wait for |
| `browser_handle_dialog` | Accept or dismiss a dialog | `action` — accept or dismiss |
| `browser_evaluate` | Run JavaScript in the page | `code` — JavaScript to execute |
| `browser_run_code` | Run JavaScript for complex automation | `code` — JavaScript code |
| `browser_resize` | Resize the browser window | `width`, `height` |
| `browser_close` | Close the browser | (none) |
| `browser_install` | Install browser binaries | (none) |

---

## Translating BDD Criteria to Playwright Sequences

Each acceptance criterion in BDD format (Given/When/Then) maps directly to a sequence of Playwright MCP tool calls.

### The Pattern

```
Given → Setup (navigate, fill prerequisites)
When  → Action (click, type, submit)
Then  → Assert (snapshot + verify content, screenshot for evidence)
```

### Example 1: Login Form

**Acceptance Criterion:**
```
AC-1: Successful login
Given a registered user with email "alice@example.com" and password "secure123"
When the user navigates to /login and submits valid credentials
Then the user is redirected to /dashboard and sees "Welcome, Alice"
```

**Playwright Sequence:**

```
1. browser_navigate → url: "http://localhost:3000/login"
2. browser_snapshot → verify login form is visible
3. browser_fill_form → element: "email input", value: "alice@example.com"
4. browser_fill_form → element: "password input", value: "secure123"
5. browser_click → element: "Login button"
6. browser_wait_for → condition: "navigation to /dashboard"
7. browser_snapshot → verify page contains "Welcome, Alice"
8. browser_take_screenshot → evidence for review
```

### Example 2: Form Validation

**Acceptance Criterion:**
```
AC-2: Email validation error
Given an empty login form
When the user submits without entering an email
Then an error message "Email is required" appears below the email field
```

**Playwright Sequence:**

```
1. browser_navigate → url: "http://localhost:3000/login"
2. browser_click → element: "Login button" (submit empty form)
3. browser_snapshot → verify error message "Email is required" is visible
4. browser_take_screenshot → evidence
```

### Example 3: API Response Verification

**Acceptance Criterion:**
```
AC-3: User list API returns paginated results
Given the database has 50 users
When a GET request is made to /api/users?page=1&limit=10
Then the response contains 10 users and a total count of 50
```

**Playwright Sequence:**

```
1. browser_evaluate → code: `fetch('/api/users?page=1&limit=10').then(r => r.json())`
2. Verify the result contains 10 items and totalCount === 50
```

Alternatively, use `browser_network_requests` after navigating to a page that triggers the API call.

---

## Screenshot Evidence Collection

Screenshots provide visual proof that acceptance criteria passed. They are stored alongside the review and summary documents.

### Best Practices

- **Take screenshots after assertions, not before.** The screenshot should show the passing state.
- **Name screenshots descriptively.** Use the AC number: `AC-001-login-success.png`.
- **Capture the relevant viewport.** Use `browser_resize` to set a consistent viewport before screenshots.
- **Include error states.** If testing error handling, screenshot the error message.

### Standard Viewport Sizes

| Device | Width | Height |
|--------|-------|--------|
| Desktop | 1280 | 720 |
| Tablet | 768 | 1024 |
| Mobile | 375 | 667 |

Use `browser_resize` to set the viewport before taking screenshots for responsive testing.

---

## Handling Dynamic Content

Web applications often have content that changes: loading spinners, animations, data that loads asynchronously.

### Loading States

Use `browser_wait_for` to wait for content to appear before asserting:

```
1. browser_navigate → url: "http://localhost:3000/dashboard"
2. browser_wait_for → condition: "Loading spinner disappears"
3. browser_wait_for → condition: "User list is visible"
4. browser_snapshot → verify content
```

### Animations

Animations can cause flaky assertions if the snapshot captures a mid-animation state:

- Wait for animations to complete before taking snapshots.
- Use `browser_evaluate` to check CSS animation state if needed:

```javascript
document.querySelector('.modal').getAnimations().every(a => a.playState === 'finished')
```

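When a `browser_wait_for` condition is hard to express, a small polling helper passed through `browser_evaluate` or `browser_run_code` can wait for arbitrary page state, such as the animation check above. A generic sketch; the timeout values are illustrative:

```javascript
// Poll an arbitrary (possibly async) condition until it is true
// or the timeout expires. Returns true on success, false on timeout.
async function pollUntil(check, { timeoutMs = 2000, intervalMs = 50 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await check()) return true;
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
  return false;
}

// Example: a condition that becomes true after roughly 100 ms.
const start = Date.now();
pollUntil(() => Date.now() - start >= 100, { timeoutMs: 1000 })
  .then(ok => console.log(ok));
```

In the browser the `check` callback would inspect the DOM (for instance the `getAnimations()` expression above) instead of the clock.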
### Asynchronous Data

Data loaded via API calls may not be present immediately after navigation:

```
1. browser_navigate → url: "http://localhost:3000/users"
2. browser_wait_for → condition: "Table contains user rows"
3. browser_snapshot → verify data
```

### Time-Dependent Content

Timestamps, "5 minutes ago" labels, and countdown timers change between test runs:

- Assert on stable parts of the content (user name, status) rather than volatile parts (timestamps).
- Use `browser_evaluate` to freeze time if needed: `Date.now = () => 1700000000000`.

---

## Fallback Verification (Non-UI Work)

Not all specs produce UI changes. For backend, library, or CLI work, use alternative verification methods.

### Test Runner Verification

For specs that produce testable code, run the test suite:

```bash
npm test -- --grep "UserService"
```

Verify that:
- All tests pass.
- New tests exist for new behavior.
- No existing tests broke.

### API Verification with curl

For API endpoints without a frontend:

```bash
# Test endpoint exists and returns correct status
curl -s -o /dev/null -w "%{http_code}" http://localhost:3000/api/health

# Test response body
curl -s http://localhost:3000/api/users/1 | jq '.name'
```

### CLI Verification

For CLI tools:

```bash
# Test command exists and runs
node dist/cli.js --help

# Test specific behavior
node dist/cli.js create-user --name Alice --email alice@example.com
echo $? # verify exit code
```

### File System Verification

For specs that generate files:

```bash
# Verify file exists
test -f dist/output.json && echo "exists"

# Verify file content
cat dist/output.json | jq '.version'
```

---

## Common Pitfalls and Solutions

### Pitfall 1: Element Not Found

**Problem:** `browser_click` or `browser_fill_form` cannot find the element.
**Cause:** The element has not loaded yet, or the selector is wrong.
**Solution:** Use `browser_snapshot` first to see the accessibility tree. Find the element's actual label or description, then use that in the tool call. Use `browser_wait_for` if the element loads asynchronously.

### Pitfall 2: Navigation Timing

**Problem:** Asserting on a page before navigation completes.
**Cause:** `browser_click` on a link triggers navigation, but the new page is not ready yet.
**Solution:** After clicking a navigation link, use `browser_wait_for` with a condition that indicates the new page loaded (specific text, specific element).

### Pitfall 3: Stale State

**Problem:** A previous test's state affects the current test.
**Cause:** The browser session persists between verifications. A logged-in user from AC-1 is still logged in for AC-2.
**Solution:** Either use the persisted state intentionally (if AC-2 requires login) or navigate to a logout endpoint / clear cookies before the next AC.

### Pitfall 4: Form Auto-Complete

**Problem:** The browser auto-fills form fields with previous values.
**Cause:** The browser remembers form submissions.
**Solution:** Clear the field before filling: use `browser_click` on the field, then `browser_press_key` with Ctrl+A, then type the new value. Or use `browser_evaluate` to set the value directly.

### Pitfall 5: Pop-ups and Dialogs

**Problem:** An unexpected dialog (alert, confirm, prompt) blocks interaction.
**Cause:** JavaScript `alert()`, `confirm()`, or `prompt()` calls.
**Solution:** Use `browser_handle_dialog` with `action: "accept"` or `action: "dismiss"` before the action that triggers the dialog. Set up the dialog handler proactively.

### Pitfall 6: Iframes

**Problem:** Cannot interact with content inside an iframe.
**Cause:** MCP tools operate on the main frame by default.
**Solution:** Use `browser_evaluate` to access iframe content:

```javascript
document.querySelector('iframe').contentDocument.querySelector('button').click()
```

### Pitfall 7: Screenshot Shows Blank Page

**Problem:** The screenshot is empty or shows a loading state.
**Cause:** The screenshot was taken before the page finished rendering.
**Solution:** Always use `browser_wait_for` before `browser_take_screenshot`. Wait for specific content, not just page load.