sdlc-framework 1.0.0
- package/LICENSE +21 -0
- package/README.md +321 -0
- package/bin/install.js +193 -0
- package/package.json +39 -0
- package/src/commands/close.md +200 -0
- package/src/commands/debug.md +124 -0
- package/src/commands/fast.md +149 -0
- package/src/commands/fix.md +104 -0
- package/src/commands/help.md +144 -0
- package/src/commands/hotfix.md +99 -0
- package/src/commands/impl.md +142 -0
- package/src/commands/init.md +93 -0
- package/src/commands/milestone.md +136 -0
- package/src/commands/pause.md +115 -0
- package/src/commands/research.md +136 -0
- package/src/commands/resume.md +103 -0
- package/src/commands/review.md +195 -0
- package/src/commands/spec.md +164 -0
- package/src/commands/status.md +118 -0
- package/src/commands/verify.md +153 -0
- package/src/references/clarification-strategy.md +352 -0
- package/src/references/engineering-laws.md +374 -0
- package/src/references/loop-phases.md +331 -0
- package/src/references/playwright-testing.md +298 -0
- package/src/references/prompt-detection.md +264 -0
- package/src/references/sub-agent-strategy.md +260 -0
- package/src/rules/commands.md +180 -0
- package/src/rules/style.md +354 -0
- package/src/rules/templates.md +238 -0
- package/src/rules/workflows.md +314 -0
- package/src/templates/HANDOFF.md +121 -0
- package/src/templates/LAWS.md +521 -0
- package/src/templates/PROJECT.md +112 -0
- package/src/templates/REVIEW.md +145 -0
- package/src/templates/ROADMAP.md +101 -0
- package/src/templates/SPEC.md +231 -0
- package/src/templates/STATE.md +106 -0
- package/src/templates/SUMMARY.md +126 -0
- package/src/workflows/close-phase.md +189 -0
- package/src/workflows/debug-flow.md +302 -0
- package/src/workflows/fast-forward.md +340 -0
- package/src/workflows/fix-findings.md +235 -0
- package/src/workflows/hotfix-flow.md +190 -0
- package/src/workflows/impl-phase.md +229 -0
- package/src/workflows/init-project.md +249 -0
- package/src/workflows/milestone-management.md +169 -0
- package/src/workflows/pause-work.md +153 -0
- package/src/workflows/research.md +219 -0
- package/src/workflows/resume-project.md +159 -0
- package/src/workflows/review-phase.md +337 -0
- package/src/workflows/spec-phase.md +379 -0
- package/src/workflows/transition-phase.md +203 -0
- package/src/workflows/verify-phase.md +280 -0
package/src/references/loop-phases.md
@@ -0,0 +1,331 @@

# SDLC Loop Phases Reference

This document explains the five phases of the SDLC loop for developers who are new to structured development lifecycles. It covers what each phase does, what it produces, how phases connect, and why closing the loop matters.

---

## The Big Picture

The SDLC loop consists of five phases that every piece of work goes through:

```
SPEC ──▶ IMPLEMENT ──▶ VERIFY ──▶ REVIEW ──▶ CLOSE
  ▲                                            │
  └────────────────────────────────────────────┘
                (next iteration)
```

Each phase has a specific job. Skipping a phase creates problems that compound over time. The loop closes when all five phases complete successfully, and then a new loop begins for the next piece of work.
---

## Phase 1: SPEC (Specification)

### What It Does

The spec phase answers: **What are we building, and why?**

Before writing any code, the spec phase:
1. Clarifies requirements by asking questions (no guessing)
2. Defines acceptance criteria in testable BDD format
3. Breaks the work into discrete tasks
4. Identifies which tasks can run in parallel
5. Draws boundaries (what NOT to change)

### What It Produces

A spec document (SPEC.md) containing:
- Clear objective (goal, purpose, output)
- Clarification audit trail (questions asked, answers received)
- Numbered acceptance criteria (AC-1, AC-2, etc.)
- Task list with dependencies
- Task dependency graph
- Boundary definitions

### Why It Exists

Without a spec, the developer starts coding based on assumptions. Assumptions are often wrong. The developer builds the wrong thing, discovers this during review, and starts over. A spec that takes 15 minutes to write saves hours of rework.

### Common Mistakes

- **Skipping clarification:** "I know what they want" — you probably do not.
- **Vague acceptance criteria:** "It should work" is not testable. "Given X, When Y, Then Z" is.
- **Missing boundaries:** Without a "do not touch" list, developers refactor code they should not.
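
For illustration, a testable acceptance criterion in the BDD format the spec phase expects might look like this (the feature and values are hypothetical):

```
AC-1: Short password is rejected
  Given a registration form
  When the user submits a password shorter than 8 characters
  Then an error message "Password must be at least 8 characters" appears
```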
---

## Phase 2: IMPLEMENT (Implementation)

### What It Does

The implementation phase answers: **How do we build it?**

Using the spec as a blueprint, the implementation phase:
1. Analyzes the task dependency graph
2. Groups independent tasks into parallel waves
3. Assigns tasks to sub-agents (or executes inline for simple work)
4. Executes tasks wave by wave, integrating outputs between waves
5. Marks each task as done in the spec
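
The wave-grouping step above can be sketched as a small function. This is a minimal illustration, not the framework's actual implementation; the task shape with `id` and `deps` fields is an assumption:

```javascript
// Group tasks into waves: each wave contains only tasks whose
// dependencies completed in earlier waves, so a wave can run in parallel.
function groupIntoWaves(tasks) {
  const done = new Set();
  const waves = [];
  let remaining = [...tasks];
  while (remaining.length > 0) {
    const wave = remaining.filter(t => t.deps.every(d => done.has(d)));
    if (wave.length === 0) throw new Error('Cycle in task dependencies');
    wave.forEach(t => done.add(t.id));
    remaining = remaining.filter(t => !done.has(t.id));
    waves.push(wave.map(t => t.id));
  }
  return waves;
}

const waves = groupIntoWaves([
  { id: 'T1', deps: [] },
  { id: 'T2', deps: [] },
  { id: 'T3', deps: ['T1', 'T2'] },
]);
// waves → [['T1', 'T2'], ['T3']]: T1 and T2 run in parallel, then T3
```

The cycle check matters: a spec with circular task dependencies can never be scheduled, and failing loudly is better than looping forever.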
### What It Produces

- Working code that implements all tasks in the spec
- Updated spec with completed task checkmarks
- Any decisions made during implementation (recorded in the spec or state file)

### Why It Exists

Implementation without a spec is random typing. Implementation with a spec is directed construction. The spec tells the developer exactly which files to create, what each function should do, and how to verify completion. No time is wasted deciding what to build next.

### Common Mistakes

- **Going beyond the spec:** "While I'm here, I'll also add..." — YAGNI. Build what is in the spec.
- **Ignoring the dependency graph:** Starting tasks whose dependencies are not complete leads to rework.
- **Not recording decisions:** Decisions made during implementation that are not in the spec must be recorded. Otherwise, the reviewer does not understand why something was built a certain way.

---

## Phase 3: VERIFY (Verification)

### What It Does

The verification phase answers: **Does it actually work?**

For each acceptance criterion in the spec:
1. Set up the preconditions (the "Given")
2. Perform the action (the "When")
3. Check the result (the "Then")
4. Collect evidence (screenshots, test output, API responses)
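
Conceptually, the phase walks every criterion and records a result with evidence. A minimal sketch, assuming each criterion carries a hypothetical async `check` function that returns evidence on success and throws on failure:

```javascript
// Run every acceptance criterion; record pass/fail plus evidence for each.
async function verifyAll(criteria) {
  const results = [];
  for (const ac of criteria) {
    try {
      const evidence = await ac.check(); // performs Given/When/Then
      results.push({ id: ac.id, status: 'pass', evidence });
    } catch (err) {
      results.push({ id: ac.id, status: 'fail', evidence: String(err) });
    }
  }
  return results;
}

verifyAll([
  { id: 'AC-1', check: async () => 'screenshot: AC-001-login.png' },
  { id: 'AC-2', check: async () => { throw new Error('error message missing'); } },
]).then(results => console.log(results));
// AC-1 passes with its evidence; AC-2 records its failure reason as evidence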
### What It Produces

- Pass/fail status for each acceptance criterion
- Evidence for each check (screenshots, test output)
- List of any criteria that failed and why

### Why It Exists

Code that compiles is not code that works. The implementation phase produces code that the developer believes is correct. The verification phase proves it. Every acceptance criterion gets checked. No assumptions. No "it probably works."

### Verification Methods

| Work Type | Primary Verification | Fallback |
|-----------|---------------------|----------|
| UI features | Playwright MCP (navigate, interact, screenshot) | Manual browser testing |
| API endpoints | Playwright `browser_evaluate` with fetch, or curl | Integration tests |
| Business logic | Unit test runner (npm test) | Manual function calls |
| Refactors | Existing test suite (all tests still pass) | Before/after behavior comparison |
| Bug fixes | Reproduce the bug, verify it is gone | Regression test |

### Common Mistakes

- **Only testing the happy path:** If AC-1 passes but AC-3 (the error case) was skipped, verification is incomplete.
- **Verifying in the wrong environment:** Testing against a mock database when the AC specifies "real data." Use the environment the spec defines.
- **Not collecting evidence:** Without screenshots or test output, the reviewer has no proof. Verification without evidence is just another assumption.

---

## Phase 4: REVIEW (Code Review)

### What It Does

The review phase answers: **Is the code good enough?**

Every file created or modified during implementation is reviewed against the engineering laws:
1. SOLID — proper structure and separation
2. DRY — no duplicate logic
3. YAGNI — nothing beyond the spec
4. Clean Code — small functions, shallow nesting
5. Security — no hardcoded secrets, input validation
6. Testing — every behavior has a test
7. Naming — descriptive, consistent names
8. Error Handling — no empty catches, domain-specific exceptions

### What It Produces

A review document (REVIEW.md) containing:
- List of files reviewed
- Findings with severity (error, warning, info)
- For each finding: file, line numbers, law violated, description, suggested fix
- Overall assessment and blocking status

### Why It Exists

Verification proves the code works. Review proves the code is maintainable. Code that works but violates engineering laws creates technical debt. Technical debt compounds. A 5-minute shortcut today becomes a 5-hour debugging session next month.

### What Blocks Closure

Findings at `error` severity block loop closure. The developer must fix the issues and re-run the review. Findings at `warning` severity are reported but do not block. Findings at `info` severity are noted.
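
The blocking rule is mechanical, which makes it easy to automate. A minimal sketch of the decision, assuming a hypothetical finding shape with a `severity` field:

```javascript
// Tally findings by severity; only error-level findings block closure.
function summarizeReview(findings) {
  const counts = { error: 0, warning: 0, info: 0 };
  for (const f of findings) counts[f.severity] += 1;
  return { ...counts, blocksClosure: counts.error > 0 };
}

const summary = summarizeReview([
  { severity: 'error', law: 'Security', file: 'src/config.js' },
  { severity: 'warning', law: 'Naming', file: 'src/search.js' },
]);
// summary.blocksClosure → true (one error-level finding remains)
```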
### Common Mistakes

- **Treating warnings as ignorable:** Warnings are "fix this soon." Not fixing them means they accumulate until the codebase is drowning in minor issues.
- **Fixing symptoms, not causes:** A reviewer says "this function is too long." The developer splits it into two functions that are both still too long. Fix the root cause — extract meaningful helper functions.
- **Arguing with the law:** "But my 60-line function is readable!" The law says 40 lines. Refactor.

---

## Phase 5: CLOSE (Loop Closure)

### What It Does

The close phase answers: **Are we done, and what did we learn?**

Loop closure:
1. Confirms all acceptance criteria passed
2. Confirms all error-severity review findings are resolved
3. Produces a summary document with deliverables, results, and lessons learned
4. Updates the roadmap and state files
5. Sets the next action for the next loop iteration

### What It Produces

A summary document (SUMMARY.md) containing:
- Deliverables (what was built)
- Acceptance criteria results (pass/fail with evidence)
- Deviations from the spec (planned vs actual)
- Decisions made during implementation
- Engineering laws review results
- Files modified
- Lessons learned

### Why It Exists

Without closure, work is "sort of done." Is it done-done? Did everything pass? Were there deviations? Nobody knows because nobody wrote it down. Closure creates a clear record that this work is complete, what was delivered, and what to work on next.

### Closure Checklist

The loop cannot close unless:
- [ ] All acceptance criteria have a pass/fail status
- [ ] All `error` severity findings are resolved
- [ ] A summary document is written
- [ ] The state file is updated with the next action
- [ ] The roadmap is updated to reflect progress
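
The checklist above can be enforced as a gate. A sketch under assumed state fields (`criteria`, `findings`, `summaryWritten`, `nextAction`, `roadmapUpdated` are hypothetical names, not the framework's actual state schema):

```javascript
// Return whether the loop may close, plus the reasons if it may not.
function canClose(state) {
  const reasons = [];
  if (state.criteria.some(ac => !['pass', 'fail'].includes(ac.status))) {
    reasons.push('acceptance criteria without a pass/fail status');
  }
  if (state.findings.some(f => f.severity === 'error' && !f.resolved)) {
    reasons.push('unresolved error-severity findings');
  }
  if (!state.summaryWritten) reasons.push('summary document not written');
  if (!state.nextAction) reasons.push('next action not set');
  if (!state.roadmapUpdated) reasons.push('roadmap not updated');
  return { ok: reasons.length === 0, reasons };
}

const gate = canClose({
  criteria: [{ id: 'AC-1', status: 'pass' }],
  findings: [{ severity: 'error', resolved: false }],
  summaryWritten: true,
  nextAction: 'Spec the password-reset loop',
  roadmapUpdated: true,
});
// gate.ok → false; gate.reasons → ['unresolved error-severity findings']
```

Reporting the reasons, rather than a bare boolean, tells the developer exactly which checklist item still blocks closure.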
### Common Mistakes

- **Skipping lessons learned:** "We'll remember." You will not. Write it down.
- **Leaving the next action empty:** The next developer (or the next session) opens the project and does not know what to do. Every close must set the next action.
- **Closing with unresolved blockers:** If there is a blocker, the loop does not close. Address it or defer it explicitly.

---

## How Phases Connect

Each phase consumes the output of the previous phase:

```
SPEC produces → spec document
        ↓
IMPLEMENT consumes spec → produces working code
        ↓
VERIFY consumes acceptance criteria → produces pass/fail evidence
        ↓
REVIEW consumes code changes → produces findings
        ↓
CLOSE consumes all outputs → produces summary + next action
```

If a phase fails, it loops back:
- **Verify fails:** Go back to IMPLEMENT. Fix the code. Re-verify.
- **Review fails:** Go back to IMPLEMENT. Fix the findings. Re-review.
- **Spec is incomplete:** Stay in SPEC. Ask more questions. Update the spec.

The loop never skips forward. You cannot review before verifying. You cannot close before reviewing.

---

## What "Closing the Loop" Means

Closing the loop means all five phases completed successfully for one unit of work (one plan). The result is:
- Working, verified code
- No error-severity engineering law violations
- A complete audit trail (spec, review, summary)
- A clear next action for the next iteration

When the loop closes, the system resets for the next plan. The state file updates. The roadmap advances. The developer knows exactly what to do next.

---

## With Loop vs Without Loop

### Without a Loop (Ad-Hoc Development)

1. Developer gets a vague request: "Add user search."
2. Starts coding immediately. Guesses at requirements.
3. Builds search with filters, sorting, and pagination (nobody asked for pagination).
4. Does not write tests ("I'll add them later").
5. Discovers the stakeholder wanted search by name only, not the full-featured search.
6. Reworks half the code. Still no tests.
7. Ships with a hardcoded API key in the config.
8. Next developer opens the codebase: "What was built? Is it done? What do I work on next?"

**Result:** Wrong thing built, no tests, security issue, no documentation, confused team.

### With a Loop (SDLC)

1. Developer gets the request: "Add user search."
2. **SPEC:** Asks clarifying questions. "Search by what fields? How should results be sorted? Any pagination needed?" Stakeholder says: "Just name search, alphabetical, no pagination."
3. **IMPLEMENT:** Builds exactly what the spec says. Name search, alphabetical sort, no pagination.
4. **VERIFY:** Tests that search by name works, results are sorted, and the error case (no results) is handled.
5. **REVIEW:** Code checked against engineering laws. One warning for a vague variable name. Fixed.
6. **CLOSE:** Summary written. Tests pass. No pagination to rip out later. Next action set.

**Result:** Right thing built, tested, reviewed, documented, clear path forward.

---

## Visual Diagrams

### Single Loop Iteration

```
┌─────────────────────────────────────────────────────┐
│                      ONE LOOP                       │
│                                                     │
│  ┌──────┐   ┌──────────┐   ┌────────┐   ┌────────┐  │   ┌───────┐
│  │ SPEC │──▶│IMPLEMENT │──▶│ VERIFY │──▶│ REVIEW │──│──▶│ CLOSE │
│  └──────┘   └──────────┘   └────────┘   └────────┘  │   └───────┘
│      │          ▲              │           │        │       │
│      │          └──────────────┘           │        │       │
│      │  (verify fails → re-impl)           │        │       │
│      │          ▲                          │        │       │
│      │          └──────────────────────────┘        │       │
│      │  (review fails → re-impl)                    │       │
│      │                                              │       │
└──────│──────────────────────────────────────────────┘       │
       │                                                      │
       └──────────────────────────────────────────────────────┘
                          (next plan)
```

### Multiple Loops in a Phase

```
Phase 1: Authentication
├── Loop 1: Spec → Impl → Verify → Review → Close (user login)
├── Loop 2: Spec → Impl → Verify → Review → Close (password reset)
└── Loop 3: Spec → Impl → Verify → Review → Close (session management)

Phase 2: User Management
├── Loop 4: Spec → Impl → Verify → Review → Close (CRUD operations)
├── Loop 5: Spec → Impl → Verify → Review → Close (role-based access)
└── Loop 6: Spec → Impl → Verify → Review → Close (user search)
```

### Milestone Progression

```
Milestone 1: MVP
├── Phase 1 ── Loop 1 ── Loop 2 ── Loop 3
├── Phase 2 ── Loop 4 ── Loop 5
└── Phase 3 ── Loop 6 ── Loop 7 ── Loop 8

Milestone 2: v1.0
├── Phase 4 ── Loop 9 ── Loop 10
└── Phase 5 ── Loop 11
```

Each loop is one complete iteration of SPEC → IMPLEMENT → VERIFY → REVIEW → CLOSE.
package/src/references/playwright-testing.md
@@ -0,0 +1,298 @@

# Playwright MCP Verification Reference

This document explains how to use Playwright MCP tools for verification during the `/sdlc:verify` phase. It covers the available tools, how to translate acceptance criteria into Playwright sequences, and common pitfalls.

---

## Available MCP Tools

The SDLC framework uses Playwright through the Model Context Protocol (MCP). These are the tools available for browser-based verification.

### Navigation

| Tool | Purpose | Key Parameters |
|------|---------|----------------|
| `browser_navigate` | Go to a URL | `url` — the full URL to navigate to |
| `browser_navigate_back` | Go back one page | (none) |
| `browser_tabs` | List open tabs | (none) |

### Interaction

| Tool | Purpose | Key Parameters |
|------|---------|----------------|
| `browser_click` | Click an element | `element` — description or selector of what to click |
| `browser_fill_form` | Fill a form field | `element` — the field, `value` — what to type |
| `browser_type` | Type text (character by character) | `text` — the text to type |
| `browser_press_key` | Press a keyboard key | `key` — the key name (Enter, Tab, Escape, etc.) |
| `browser_select_option` | Select from a dropdown | `element` — the dropdown, `value` — the option |
| `browser_hover` | Hover over an element | `element` — what to hover over |
| `browser_drag` | Drag an element | `startElement`, `endElement` |
| `browser_file_upload` | Upload a file | `element` — the file input, `paths` — file paths |

### Observation

| Tool | Purpose | Key Parameters |
|------|---------|----------------|
| `browser_snapshot` | Get the current page accessibility tree (a text representation of the page) | (none) |
| `browser_take_screenshot` | Capture a screenshot | (none) |
| `browser_console_messages` | Get browser console output | (none) |
| `browser_network_requests` | Get network request log | (none) |

### Control

| Tool | Purpose | Key Parameters |
|------|---------|----------------|
| `browser_wait_for` | Wait for a condition | `condition` — what to wait for |
| `browser_handle_dialog` | Accept or dismiss a dialog | `action` — accept or dismiss |
| `browser_evaluate` | Run JavaScript in the page | `code` — JavaScript to execute |
| `browser_run_code` | Run JavaScript for complex automation | `code` — JavaScript code |
| `browser_resize` | Resize the browser window | `width`, `height` |
| `browser_close` | Close the browser | (none) |
| `browser_install` | Install browser binaries | (none) |

---

## Translating BDD Criteria to Playwright Sequences

Each acceptance criterion in BDD format (Given/When/Then) maps directly to a sequence of Playwright MCP tool calls.

### The Pattern

```
Given → Setup (navigate, fill prerequisites)
When  → Action (click, type, submit)
Then  → Assert (snapshot + verify content, screenshot for evidence)
```

### Example 1: Login Form

**Acceptance Criterion:**
```
AC-1: Successful login
  Given a registered user with email "alice@example.com" and password "secure123"
  When the user navigates to /login and submits valid credentials
  Then the user is redirected to /dashboard and sees "Welcome, Alice"
```

**Playwright Sequence:**

```
1. browser_navigate → url: "http://localhost:3000/login"
2. browser_snapshot → verify login form is visible
3. browser_fill_form → element: "email input", value: "alice@example.com"
4. browser_fill_form → element: "password input", value: "secure123"
5. browser_click → element: "Login button"
6. browser_wait_for → condition: "navigation to /dashboard"
7. browser_snapshot → verify page contains "Welcome, Alice"
8. browser_take_screenshot → evidence for review
```

### Example 2: Form Validation

**Acceptance Criterion:**
```
AC-2: Email validation error
  Given an empty login form
  When the user submits without entering an email
  Then an error message "Email is required" appears below the email field
```

**Playwright Sequence:**

```
1. browser_navigate → url: "http://localhost:3000/login"
2. browser_click → element: "Login button" (submit empty form)
3. browser_snapshot → verify error message "Email is required" is visible
4. browser_take_screenshot → evidence
```

### Example 3: API Response Verification

**Acceptance Criterion:**
```
AC-3: User list API returns paginated results
  Given the database has 50 users
  When a GET request is made to /api/users?page=1&limit=10
  Then the response contains 10 users and a total count of 50
```

**Playwright Sequence:**

```
1. browser_evaluate → code: `fetch('/api/users?page=1&limit=10').then(r => r.json())`
2. Verify the result contains 10 items and totalCount === 50
```

Alternatively, use `browser_network_requests` after navigating to a page that triggers the API call.
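
The verification in step 2 can be made explicit as a small checker applied to the fetched body. A sketch, assuming the response exposes `users` and `totalCount` fields (hypothetical names; match them to the API under test):

```javascript
// Validate the paginated response body against AC-3.
// Returns a list of problems; an empty list means the criterion passed.
function checkAc3(body) {
  const problems = [];
  if (body.users.length !== 10) problems.push(`expected 10 users, got ${body.users.length}`);
  if (body.totalCount !== 50) problems.push(`expected totalCount 50, got ${body.totalCount}`);
  return problems;
}

const problems = checkAc3({ users: new Array(10).fill({ id: 1 }), totalCount: 50 });
// problems → [] (AC-3 passes)
```

Returning problem descriptions instead of a boolean gives the review phase concrete evidence when the criterion fails.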
---

## Screenshot Evidence Collection

Screenshots provide visual proof that acceptance criteria passed. They are stored alongside the review and summary documents.

### Best Practices

- **Take screenshots after assertions, not before.** The screenshot should show the passing state.
- **Name screenshots descriptively.** Use the AC number: `AC-001-login-success.png`.
- **Capture the relevant viewport.** Use `browser_resize` to set a consistent viewport before screenshots.
- **Include error states.** If testing error handling, screenshot the error message.

### Standard Viewport Sizes

| Device | Width | Height |
|--------|-------|--------|
| Desktop | 1280 | 720 |
| Tablet | 768 | 1024 |
| Mobile | 375 | 667 |

Use `browser_resize` to set the viewport before taking screenshots for responsive testing.

---

## Handling Dynamic Content

Web applications often have content that changes: loading spinners, animations, data that loads asynchronously.

### Loading States

Use `browser_wait_for` to wait for content to appear before asserting:

```
1. browser_navigate → url: "http://localhost:3000/dashboard"
2. browser_wait_for → condition: "Loading spinner disappears"
3. browser_wait_for → condition: "User list is visible"
4. browser_snapshot → verify content
```

### Animations

Animations can cause flaky assertions if the snapshot captures a mid-animation state:

- Wait for animations to complete before taking snapshots.
- Use `browser_evaluate` to check CSS animation state if needed:

```javascript
document.querySelector('.modal').getAnimations().every(a => a.playState === 'finished')
```

### Asynchronous Data

Data loaded via API calls may not be present immediately after navigation:

```
1. browser_navigate → url: "http://localhost:3000/users"
2. browser_wait_for → condition: "Table contains user rows"
3. browser_snapshot → verify data
```

### Time-Dependent Content

Timestamps, "5 minutes ago" labels, and countdown timers change between test runs:

- Assert on stable parts of the content (user name, status) rather than volatile parts (timestamps).
- Use `browser_evaluate` to freeze time if needed: `Date.now = () => 1700000000000`.

---

## Fallback Verification (Non-UI Work)

Not all specs produce UI changes. For backend, library, or CLI work, use alternative verification methods.

### Test Runner Verification

For specs that produce testable code, run the test suite:

```bash
npm test -- --grep "UserService"
```

Verify that:
- All tests pass.
- New tests exist for new behavior.
- No existing tests broke.

### API Verification with curl

For API endpoints without a frontend:

```bash
# Test endpoint exists and returns correct status
curl -s -o /dev/null -w "%{http_code}" http://localhost:3000/api/health

# Test response body
curl -s http://localhost:3000/api/users/1 | jq '.name'
```

### CLI Verification

For CLI tools:

```bash
# Test command exists and runs
node dist/cli.js --help

# Test specific behavior
node dist/cli.js create-user --name Alice --email alice@example.com
echo $?  # verify exit code
```

### File System Verification

For specs that generate files:

```bash
# Verify file exists
test -f dist/output.json && echo "exists"

# Verify file content
cat dist/output.json | jq '.version'
```

---

## Common Pitfalls and Solutions

### Pitfall 1: Element Not Found

**Problem:** `browser_click` or `browser_fill_form` cannot find the element.
**Cause:** The element has not loaded yet, or the selector is wrong.
**Solution:** Use `browser_snapshot` first to see the accessibility tree. Find the element's actual label or description, then use that in the tool call. Use `browser_wait_for` if the element loads asynchronously.

### Pitfall 2: Navigation Timing

**Problem:** Asserting on a page before navigation completes.
**Cause:** `browser_click` on a link triggers navigation, but the new page is not ready yet.
**Solution:** After clicking a navigation link, use `browser_wait_for` with a condition that indicates the new page loaded (specific text, specific element).

### Pitfall 3: Stale State

**Problem:** The previous test's state affects the current test.
**Cause:** The browser session persists between verifications. A logged-in user from AC-1 is still logged in for AC-2.
**Solution:** Either use the persisted state intentionally (if AC-2 requires login) or navigate to a logout endpoint / clear cookies before the next AC.

### Pitfall 4: Form Auto-Complete

**Problem:** The browser auto-fills form fields with previous values.
**Cause:** The browser remembers form submissions.
**Solution:** Clear the field before filling: use `browser_click` on the field, then `browser_press_key` with Ctrl+A, then type the new value. Or use `browser_evaluate` to set the value directly.

### Pitfall 5: Pop-ups and Dialogs

**Problem:** An unexpected dialog (alert, confirm, prompt) blocks interaction.
**Cause:** JavaScript `alert()`, `confirm()`, or `prompt()` calls.
**Solution:** Use `browser_handle_dialog` with `action: "accept"` or `action: "dismiss"` before the action that triggers the dialog. Set up the dialog handler proactively.

### Pitfall 6: Iframes

**Problem:** Cannot interact with content inside an iframe.
**Cause:** MCP tools operate on the main frame by default.
**Solution:** Use `browser_evaluate` to access iframe content:

```javascript
document.querySelector('iframe').contentDocument.querySelector('button').click()
```

### Pitfall 7: Screenshot Shows Blank Page

**Problem:** The screenshot is empty or shows a loading state.
**Cause:** The screenshot was taken before the page finished rendering.
**Solution:** Always use `browser_wait_for` before `browser_take_screenshot`. Wait for specific content, not just page load.