@juho0719/cckit 0.1.1
- package/assets/agents/architect.md +211 -0
- package/assets/agents/build-error-resolver.md +114 -0
- package/assets/agents/ccwin-code-reviewer.md +224 -0
- package/assets/agents/database-reviewer.md +91 -0
- package/assets/agents/doc-updater.md +107 -0
- package/assets/agents/e2e-runner.md +107 -0
- package/assets/agents/planner.md +212 -0
- package/assets/agents/python-reviewer.md +98 -0
- package/assets/agents/refactor-cleaner.md +85 -0
- package/assets/agents/security-reviewer.md +108 -0
- package/assets/agents/superpower-code-reviewer.md +48 -0
- package/assets/agents/tdd-guide.md +80 -0
- package/assets/commands/build-fix.md +62 -0
- package/assets/commands/checkpoint.md +74 -0
- package/assets/commands/code-review.md +40 -0
- package/assets/commands/e2e.md +362 -0
- package/assets/commands/eval.md +120 -0
- package/assets/commands/orchestrate.md +172 -0
- package/assets/commands/plan.md +113 -0
- package/assets/commands/python-review.md +297 -0
- package/assets/commands/refactor-clean.md +80 -0
- package/assets/commands/sessions.md +305 -0
- package/assets/commands/tdd.md +326 -0
- package/assets/commands/test-coverage.md +69 -0
- package/assets/commands/update-codemaps.md +72 -0
- package/assets/commands/update-docs.md +84 -0
- package/assets/commands/verify.md +59 -0
- package/assets/hooks/post-edit-format.js +49 -0
- package/assets/hooks/post-edit-typecheck.js +96 -0
- package/assets/mcps/mcp-servers.json +92 -0
- package/assets/rules/common/agents.md +49 -0
- package/assets/rules/common/coding-style.md +48 -0
- package/assets/rules/common/git-workflow.md +45 -0
- package/assets/rules/common/hooks.md +30 -0
- package/assets/rules/common/patterns.md +31 -0
- package/assets/rules/common/performance.md +55 -0
- package/assets/rules/common/security.md +29 -0
- package/assets/rules/common/testing.md +29 -0
- package/assets/rules/python/coding-style.md +42 -0
- package/assets/rules/python/hooks.md +19 -0
- package/assets/rules/python/patterns.md +39 -0
- package/assets/rules/python/security.md +30 -0
- package/assets/rules/python/testing.md +38 -0
- package/assets/rules/typescript/coding-style.md +18 -0
- package/assets/rules/typescript/hooks.md +19 -0
- package/assets/rules/typescript/patterns.md +39 -0
- package/assets/rules/typescript/security.md +30 -0
- package/assets/rules/typescript/testing.md +38 -0
- package/assets/skills/api-design/SKILL.md +522 -0
- package/assets/skills/backend-patterns/SKILL.md +597 -0
- package/assets/skills/brainstorming/SKILL.md +96 -0
- package/assets/skills/coding-standards/SKILL.md +529 -0
- package/assets/skills/database-migrations/SKILL.md +334 -0
- package/assets/skills/deployment-patterns/SKILL.md +426 -0
- package/assets/skills/dispatching-parallel-agents/SKILL.md +180 -0
- package/assets/skills/docker-patterns/SKILL.md +363 -0
- package/assets/skills/e2e-testing/SKILL.md +325 -0
- package/assets/skills/eval-harness/SKILL.md +235 -0
- package/assets/skills/executing-plans/SKILL.md +84 -0
- package/assets/skills/finishing-a-development-branch/SKILL.md +200 -0
- package/assets/skills/frontend-patterns/SKILL.md +641 -0
- package/assets/skills/iterative-retrieval/SKILL.md +210 -0
- package/assets/skills/postgres-patterns/SKILL.md +145 -0
- package/assets/skills/python-patterns/SKILL.md +749 -0
- package/assets/skills/python-testing/SKILL.md +815 -0
- package/assets/skills/receiving-code-review/SKILL.md +213 -0
- package/assets/skills/requesting-code-review/SKILL.md +105 -0
- package/assets/skills/requesting-code-review/code-reviewer-template.md +146 -0
- package/assets/skills/subagent-driven-development/SKILL.md +242 -0
- package/assets/skills/subagent-driven-development/code-quality-reviewer-prompt.md +20 -0
- package/assets/skills/subagent-driven-development/implementer-prompt.md +78 -0
- package/assets/skills/subagent-driven-development/spec-reviewer-prompt.md +61 -0
- package/assets/skills/systematic-debugging/CREATION-LOG.md +114 -0
- package/assets/skills/systematic-debugging/SKILL.md +296 -0
- package/assets/skills/systematic-debugging/condition-based-waiting-example.ts +158 -0
- package/assets/skills/systematic-debugging/condition-based-waiting.md +115 -0
- package/assets/skills/systematic-debugging/defense-in-depth.md +122 -0
- package/assets/skills/systematic-debugging/root-cause-tracing.md +169 -0
- package/assets/skills/systematic-debugging/scripts/find-polluter.sh +63 -0
- package/assets/skills/systematic-debugging/test-academic.md +14 -0
- package/assets/skills/systematic-debugging/test-pressure-1.md +58 -0
- package/assets/skills/systematic-debugging/test-pressure-2.md +68 -0
- package/assets/skills/systematic-debugging/test-pressure-3.md +69 -0
- package/assets/skills/tdd-workflow/SKILL.md +409 -0
- package/assets/skills/test-driven-development/SKILL.md +371 -0
- package/assets/skills/test-driven-development/testing-anti-patterns.md +299 -0
- package/assets/skills/using-git-worktrees/SKILL.md +218 -0
- package/assets/skills/verification-before-completion/SKILL.md +139 -0
- package/assets/skills/verification-loop/SKILL.md +125 -0
- package/assets/skills/writing-plans/SKILL.md +116 -0
- package/dist/agents-AEKT67A6.js +9 -0
- package/dist/chunk-3GUKEMND.js +28 -0
- package/dist/chunk-3UNN3IBE.js +54 -0
- package/dist/chunk-3Y26YU4R.js +27 -0
- package/dist/chunk-5XOKKPAA.js +21 -0
- package/dist/chunk-6B46AIFM.js +136 -0
- package/dist/chunk-EYY2IZ7N.js +27 -0
- package/dist/chunk-K25UZZVG.js +17 -0
- package/dist/chunk-KEENFBLL.js +24 -0
- package/dist/chunk-RMUKD7CW.js +44 -0
- package/dist/chunk-W63UKEIT.js +50 -0
- package/dist/cli-VZRGF733.js +238 -0
- package/dist/commands-P5LILVZ5.js +9 -0
- package/dist/hooks-IIG2XK4I.js +9 -0
- package/dist/index.js +131 -0
- package/dist/mcps-67Q7TBGW.js +6 -0
- package/dist/paths-FT6KBIRD.js +10 -0
- package/dist/registry-EGXWYWWK.js +17 -0
- package/dist/rules-2CPBVNNJ.js +7 -0
- package/dist/skills-ULMW3UCM.js +8 -0
- package/package.json +36 -0
|
@@ -0,0 +1,362 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: Generate and run end-to-end tests with Playwright. Creates test journeys, runs tests, captures screenshots/videos/traces, and uploads artifacts.
|
|
3
|
+
---
|
|
4
|
+
|
|
5
|
+
# E2E Command
|
|
6
|
+
|
|
7
|
+
This command invokes the **e2e-runner** agent to generate, maintain, and execute end-to-end tests using Playwright.
|
|
8
|
+
|
|
9
|
+
## What This Command Does
|
|
10
|
+
|
|
11
|
+
1. **Generate Test Journeys** - Create Playwright tests for user flows
|
|
12
|
+
2. **Run E2E Tests** - Execute tests across browsers
|
|
13
|
+
3. **Capture Artifacts** - Screenshots, videos, traces on failures
|
|
14
|
+
4. **Upload Results** - HTML reports and JUnit XML
|
|
15
|
+
5. **Identify Flaky Tests** - Quarantine unstable tests
|
|
16
|
+
|
|
17
|
+
## When to Use
|
|
18
|
+
|
|
19
|
+
Use `/e2e` when:
|
|
20
|
+
- Testing critical user journeys (login, trading, payments)
|
|
21
|
+
- Verifying multi-step flows work end-to-end
|
|
22
|
+
- Testing UI interactions and navigation
|
|
23
|
+
- Validating integration between frontend and backend
|
|
24
|
+
- Preparing for production deployment
|
|
25
|
+
|
|
26
|
+
## How It Works
|
|
27
|
+
|
|
28
|
+
The e2e-runner agent will:
|
|
29
|
+
|
|
30
|
+
1. **Analyze user flow** and identify test scenarios
|
|
31
|
+
2. **Generate Playwright test** using Page Object Model pattern
|
|
32
|
+
3. **Run tests** across multiple browser engines (Chromium, Firefox, WebKit)
|
|
33
|
+
4. **Capture failures** with screenshots, videos, and traces
|
|
34
|
+
5. **Generate report** with results and artifacts
|
|
35
|
+
6. **Identify flaky tests** and recommend fixes
|
|
36
|
+
|
|
37
|
+
## Example Usage
|
|
38
|
+
|
|
39
|
+
```
|
|
40
|
+
User: /e2e Test the market search and view flow
|
|
41
|
+
|
|
42
|
+
Agent (e2e-runner):
|
|
43
|
+
# E2E Test Generation: Market Search and View Flow
|
|
44
|
+
|
|
45
|
+
## Test Scenario Identified
|
|
46
|
+
|
|
47
|
+
**User Journey:** Market Search → View Results → Click Market → View Details
|
|
48
|
+
|
|
49
|
+
**Test Coverage:**
|
|
50
|
+
1. Navigate to markets page
|
|
51
|
+
2. Perform semantic search
|
|
52
|
+
3. Verify search results
|
|
53
|
+
4. Click on first result
|
|
54
|
+
5. Verify market details page loads
|
|
55
|
+
6. Verify chart renders
|
|
56
|
+
|
|
57
|
+
## Generated Test Code
|
|
58
|
+
|
|
59
|
+
```typescript
|
|
60
|
+
// tests/e2e/markets/search-and-view.spec.ts
|
|
61
|
+
import { test, expect } from '@playwright/test'
|
|
62
|
+
import { MarketsPage } from '../../pages/MarketsPage'
|
|
63
|
+
import { MarketDetailsPage } from '../../pages/MarketDetailsPage'
|
|
64
|
+
|
|
65
|
+
test.describe('Market Search and View Flow', () => {
|
|
66
|
+
test('user can search markets and view details', async ({ page }) => {
|
|
67
|
+
// 1. Navigate to markets page
|
|
68
|
+
const marketsPage = new MarketsPage(page)
|
|
69
|
+
await marketsPage.goto()
|
|
70
|
+
|
|
71
|
+
// Verify page loaded
|
|
72
|
+
await expect(page).toHaveTitle(/Markets/)
|
|
73
|
+
await expect(page.locator('h1')).toContainText('Markets')
|
|
74
|
+
|
|
75
|
+
// 2. Perform semantic search (start waiting for the API response before triggering it)
|
|
76
|
+
const searchResponse = page.waitForResponse(resp =>
|
|
77
|
+
resp.url().includes('/api/markets/search') && resp.status() === 200
|
|
78
|
+
)
|
|
79
|
+
await marketsPage.searchMarkets('election')
|
|
80
|
+
// Resolves once the search API has answered with 200
|
|
81
|
+
await searchResponse
|
|
82
|
+
|
|
83
|
+
// 3. Verify search results
|
|
84
|
+
const marketCards = marketsPage.marketCards
|
|
85
|
+
await expect(marketCards.first()).toBeVisible()
|
|
86
|
+
const resultCount = await marketCards.count()
|
|
87
|
+
expect(resultCount).toBeGreaterThan(0)
|
|
88
|
+
|
|
89
|
+
// Take screenshot of search results
|
|
90
|
+
await page.screenshot({ path: 'artifacts/search-results.png' })
|
|
91
|
+
|
|
92
|
+
// 4. Click on first result
|
|
93
|
+
const firstMarketTitle = await marketCards.first().textContent()
|
|
94
|
+
await marketCards.first().click()
|
|
95
|
+
|
|
96
|
+
// 5. Verify market details page loads
|
|
97
|
+
await expect(page).toHaveURL(/\/markets\/[a-z0-9-]+/)
|
|
98
|
+
|
|
99
|
+
const detailsPage = new MarketDetailsPage(page)
|
|
100
|
+
await expect(detailsPage.marketName).toBeVisible()
|
|
101
|
+
await expect(detailsPage.marketDescription).toBeVisible()
|
|
102
|
+
|
|
103
|
+
// 6. Verify chart renders
|
|
104
|
+
await expect(detailsPage.priceChart).toBeVisible()
|
|
105
|
+
|
|
106
|
+
// Verify market name matches
|
|
107
|
+
const detailsTitle = await detailsPage.marketName.textContent()
|
|
108
|
+
expect(detailsTitle?.toLowerCase()).toContain(
|
|
109
|
+
firstMarketTitle?.toLowerCase().substring(0, 20) || ''
|
|
110
|
+
)
|
|
111
|
+
|
|
112
|
+
// Take screenshot of market details
|
|
113
|
+
await page.screenshot({ path: 'artifacts/market-details.png' })
|
|
114
|
+
})
|
|
115
|
+
|
|
116
|
+
test('search with no results shows empty state', async ({ page }) => {
|
|
117
|
+
const marketsPage = new MarketsPage(page)
|
|
118
|
+
await marketsPage.goto()
|
|
119
|
+
|
|
120
|
+
// Search for non-existent market
|
|
121
|
+
await marketsPage.searchMarkets('xyznonexistentmarket123456')
|
|
122
|
+
|
|
123
|
+
// Verify empty state
|
|
124
|
+
await expect(page.locator('[data-testid="no-results"]')).toBeVisible()
|
|
125
|
+
await expect(page.locator('[data-testid="no-results"]')).toContainText(
|
|
126
|
+
/no.*results|no.*markets/i
|
|
127
|
+
)
|
|
128
|
+
|
|
129
|
+
const marketCount = await marketsPage.marketCards.count()
|
|
130
|
+
expect(marketCount).toBe(0)
|
|
131
|
+
})
|
|
132
|
+
|
|
133
|
+
test('can clear search and see all markets again', async ({ page }) => {
|
|
134
|
+
const marketsPage = new MarketsPage(page)
|
|
135
|
+
await marketsPage.goto()
|
|
136
|
+
|
|
137
|
+
// Initial market count
|
|
138
|
+
const initialCount = await marketsPage.marketCards.count()
|
|
139
|
+
|
|
140
|
+
// Perform search
|
|
141
|
+
await marketsPage.searchMarkets('trump')
|
|
142
|
+
await page.waitForLoadState('networkidle')
|
|
143
|
+
|
|
144
|
+
// Verify filtered results
|
|
145
|
+
const filteredCount = await marketsPage.marketCards.count()
|
|
146
|
+
expect(filteredCount).toBeLessThan(initialCount)
|
|
147
|
+
|
|
148
|
+
// Clear search
|
|
149
|
+
await marketsPage.searchInput.clear()
|
|
150
|
+
await page.waitForLoadState('networkidle')
|
|
151
|
+
|
|
152
|
+
// Verify all markets shown again
|
|
153
|
+
const finalCount = await marketsPage.marketCards.count()
|
|
154
|
+
expect(finalCount).toBe(initialCount)
|
|
155
|
+
})
|
|
156
|
+
})
|
|
157
|
+
```
|
|
158
|
+
|
|
159
|
+
## Running Tests
|
|
160
|
+
|
|
161
|
+
```bash
|
|
162
|
+
# Run the generated test
|
|
163
|
+
npx playwright test tests/e2e/markets/search-and-view.spec.ts
|
|
164
|
+
|
|
165
|
+
Running 3 tests using 3 workers
|
|
166
|
+
|
|
167
|
+
✓ [chromium] › search-and-view.spec.ts:5:3 › user can search markets and view details (4.2s)
|
|
168
|
+
✓ [chromium] › search-and-view.spec.ts:52:3 › search with no results shows empty state (1.8s)
|
|
169
|
+
✓ [chromium] › search-and-view.spec.ts:67:3 › can clear search and see all markets again (2.9s)
|
|
170
|
+
|
|
171
|
+
3 passed (9.1s)
|
|
172
|
+
|
|
173
|
+
Artifacts generated:
|
|
174
|
+
- artifacts/search-results.png
|
|
175
|
+
- artifacts/market-details.png
|
|
176
|
+
- playwright-report/index.html
|
|
177
|
+
```
|
|
178
|
+
|
|
179
|
+
## Test Report
|
|
180
|
+
|
|
181
|
+
```
|
|
182
|
+
╔══════════════════════════════════════════════════════════════╗
|
|
183
|
+
║ E2E Test Results ║
|
|
184
|
+
╠══════════════════════════════════════════════════════════════╣
|
|
185
|
+
║ Status: ✅ ALL TESTS PASSED ║
|
|
186
|
+
║ Total: 3 tests ║
|
|
187
|
+
║ Passed: 3 (100%) ║
|
|
188
|
+
║ Failed: 0 ║
|
|
189
|
+
║ Flaky: 0 ║
|
|
190
|
+
║ Duration: 9.1s ║
|
|
191
|
+
╚══════════════════════════════════════════════════════════════╝
|
|
192
|
+
|
|
193
|
+
Artifacts:
|
|
194
|
+
📸 Screenshots: 2 files
|
|
195
|
+
📹 Videos: 0 files (only on failure)
|
|
196
|
+
🔍 Traces: 0 files (only on failure)
|
|
197
|
+
📊 HTML Report: playwright-report/index.html
|
|
198
|
+
|
|
199
|
+
View report: npx playwright show-report
|
|
200
|
+
```
|
|
201
|
+
|
|
202
|
+
✅ E2E test suite ready for CI/CD integration!
|
|
203
|
+
```
|
|
204
|
+
|
|
205
|
+
## Test Artifacts
|
|
206
|
+
|
|
207
|
+
When tests run, the following artifacts are captured:
|
|
208
|
+
|
|
209
|
+
**On All Tests:**
|
|
210
|
+
- HTML Report with timeline and results
|
|
211
|
+
- JUnit XML for CI integration
|
|
212
|
+
|
|
213
|
+
**On Failure Only:**
|
|
214
|
+
- Screenshot of the failing state
|
|
215
|
+
- Video recording of the test
|
|
216
|
+
- Trace file for debugging (step-by-step replay)
|
|
217
|
+
- Network logs
|
|
218
|
+
- Console logs
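
The split between always-on reports and failure-only captures maps onto Playwright's `reporter` and `use` options. The sketch below is an assumption about how such a configuration might look, not the package's actual `playwright.config.ts`; the paths `playwright-report`, `artifacts/junit.xml`, and `artifacts` are illustrative. Network and console activity are recorded as part of the trace rather than as separate files.

```typescript
// Sketch of playwright.config.ts settings (assumed, not taken from the package).
// A plain object keeps the snippet self-contained; a real project could wrap it
// in defineConfig() from '@playwright/test'.
const config = {
  // Reports written on every run
  reporter: [
    ['html', { outputFolder: 'playwright-report', open: 'never' }],
    ['junit', { outputFile: 'artifacts/junit.xml' }],
  ],
  // Directory for failure artifacts (illustrative path)
  outputDir: 'artifacts',
  use: {
    screenshot: 'only-on-failure', // screenshot of the failing state
    video: 'retain-on-failure',    // record always, keep only for failures
    trace: 'retain-on-failure',    // step-by-step replay, kept only for failures
  },
};

// In playwright.config.ts, finish with: export default config
```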
|
|
219
|
+
|
|
220
|
+
## Viewing Artifacts
|
|
221
|
+
|
|
222
|
+
```bash
|
|
223
|
+
# View HTML report in browser
|
|
224
|
+
npx playwright show-report
|
|
225
|
+
|
|
226
|
+
# View specific trace file
|
|
227
|
+
npx playwright show-trace artifacts/trace-abc123.zip
|
|
228
|
+
|
|
229
|
+
# Screenshots are saved in artifacts/ directory
|
|
230
|
+
open artifacts/search-results.png
|
|
231
|
+
```
|
|
232
|
+
|
|
233
|
+
## Flaky Test Detection
|
|
234
|
+
|
|
235
|
+
If a test fails intermittently:
|
|
236
|
+
|
|
237
|
+
```
|
|
238
|
+
⚠️ FLAKY TEST DETECTED: tests/e2e/markets/trade.spec.ts
|
|
239
|
+
|
|
240
|
+
Test passed 7/10 runs (70% pass rate)
|
|
241
|
+
|
|
242
|
+
Common failure:
|
|
243
|
+
"Timeout waiting for element '[data-testid="confirm-btn"]'"
|
|
244
|
+
|
|
245
|
+
Recommended fixes:
|
|
246
|
+
1. Wait with a web-first assertion: await expect(page.locator('[data-testid="confirm-btn"]')).toBeVisible()
|
|
247
|
+
2. Increase timeout: { timeout: 10000 }
|
|
248
|
+
3. Check for race conditions in component
|
|
249
|
+
4. Verify element is not hidden by animation
|
|
250
|
+
|
|
251
|
+
Quarantine recommendation: Mark as test.fixme() until fixed
|
|
252
|
+
```
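
The pass-rate figure in a report like the one above can be derived from repeated runs of the same test (for example with `npx playwright test --repeat-each=10`). The following is a minimal sketch of that classification; the labels and the rule "any mix of passes and failures is flaky" are assumptions for illustration, not the agent's actual logic.

```typescript
// Classify a test from repeated runs (true = passed).
// The three labels and the classification rule are assumptions of this sketch.
type Stability = 'stable' | 'flaky' | 'failing';

interface RunSummary {
  passRate: number; // fraction of passing runs, between 0 and 1
  stability: Stability;
}

function summarizeRuns(runs: boolean[]): RunSummary {
  if (runs.length === 0) {
    throw new Error('at least one run is required');
  }
  const passed = runs.filter(Boolean).length;
  const passRate = passed / runs.length;
  const stability: Stability =
    passed === runs.length ? 'stable' : passed === 0 ? 'failing' : 'flaky';
  return { passRate, stability };
}

// Seven passes out of ten runs
const summary = summarizeRuns([true, true, false, true, true, false, true, true, false, true]);
console.log(`${Math.round(summary.passRate * 100)}% pass rate, ${summary.stability}`);
```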
|
|
253
|
+
|
|
254
|
+
## Browser Configuration
|
|
255
|
+
|
|
256
|
+
Tests run on multiple browsers by default:
|
|
257
|
+
- ✅ Chromium (Desktop Chrome)
|
|
258
|
+
- ✅ Firefox (Desktop)
|
|
259
|
+
- ✅ WebKit (Desktop Safari)
|
|
260
|
+
- ✅ Mobile Chrome (optional)
|
|
261
|
+
|
|
262
|
+
Adjust the browser list in `playwright.config.ts`.
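
As a rough illustration, the desktop engines can be declared as Playwright projects, each pinning one `browserName`. This is a sketch under the assumption of a default setup, not the package's real configuration. A Mobile Chrome project would normally spread a descriptor from the `devices` registry exported by `@playwright/test`; it is left out here so the snippet stays self-contained.

```typescript
// Sketch of a `projects` list for playwright.config.ts (assumed layout).
// Each project runs the whole suite in one browser engine.
const config = {
  projects: [
    { name: 'chromium', use: { browserName: 'chromium' } },
    { name: 'firefox', use: { browserName: 'firefox' } },
    { name: 'webkit', use: { browserName: 'webkit' } },
  ],
};

// In playwright.config.ts, finish with: export default config
```

A single engine can then be selected at run time with `npx playwright test --project=firefox`.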
|
|
263
|
+
|
|
264
|
+
## CI/CD Integration
|
|
265
|
+
|
|
266
|
+
Add to your CI pipeline:
|
|
267
|
+
|
|
268
|
+
```yaml
|
|
269
|
+
# .github/workflows/e2e.yml
|
|
270
|
+
- name: Install Playwright
|
|
271
|
+
run: npx playwright install --with-deps
|
|
272
|
+
|
|
273
|
+
- name: Run E2E tests
|
|
274
|
+
run: npx playwright test
|
|
275
|
+
|
|
276
|
+
- name: Upload artifacts
|
|
277
|
+
if: always()
|
|
278
|
+
uses: actions/upload-artifact@v4
|
|
279
|
+
with:
|
|
280
|
+
name: playwright-report
|
|
281
|
+
path: playwright-report/
|
|
282
|
+
```
|
|
283
|
+
|
|
284
|
+
## PMX-Specific Critical Flows
|
|
285
|
+
|
|
286
|
+
For PMX, prioritize these E2E tests:
|
|
287
|
+
|
|
288
|
+
**🔴 CRITICAL (Must Always Pass):**
|
|
289
|
+
1. User can connect wallet
|
|
290
|
+
2. User can browse markets
|
|
291
|
+
3. User can search markets (semantic search)
|
|
292
|
+
4. User can view market details
|
|
293
|
+
5. User can place trade (with test funds)
|
|
294
|
+
6. Market resolves correctly
|
|
295
|
+
7. User can withdraw funds
|
|
296
|
+
|
|
297
|
+
**🟡 IMPORTANT:**
|
|
298
|
+
1. Market creation flow
|
|
299
|
+
2. User profile updates
|
|
300
|
+
3. Real-time price updates
|
|
301
|
+
4. Chart rendering
|
|
302
|
+
5. Filter and sort markets
|
|
303
|
+
6. Mobile responsive layout
|
|
304
|
+
|
|
305
|
+
## Best Practices
|
|
306
|
+
|
|
307
|
+
**DO:**
|
|
308
|
+
- ✅ Use Page Object Model for maintainability
|
|
309
|
+
- ✅ Use data-testid attributes for selectors
|
|
310
|
+
- ✅ Wait for API responses, not arbitrary timeouts
|
|
311
|
+
- ✅ Test critical user journeys end-to-end
|
|
312
|
+
- ✅ Run tests before merging to main
|
|
313
|
+
- ✅ Review artifacts when tests fail
|
|
314
|
+
|
|
315
|
+
**DON'T:**
|
|
316
|
+
- ❌ Use brittle selectors (CSS classes can change)
|
|
317
|
+
- ❌ Test implementation details
|
|
318
|
+
- ❌ Run tests against production
|
|
319
|
+
- ❌ Ignore flaky tests
|
|
320
|
+
- ❌ Skip artifact review on failures
|
|
321
|
+
- ❌ Test every edge case with E2E (use unit tests)
|
|
322
|
+
|
|
323
|
+
## Important Notes
|
|
324
|
+
|
|
325
|
+
**CRITICAL for PMX:**
|
|
326
|
+
- E2E tests involving real money MUST run on testnet/staging only
|
|
327
|
+
- Never run trading tests against production
|
|
328
|
+
- Guard financial tests with `test.skip(process.env.NODE_ENV === 'production')`
|
|
329
|
+
- Use test wallets with small test funds only
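
One way to make these rules harder to bypass is to centralize the decision in a helper and feed its result to `test.skip`. The sketch below is an assumption, not part of the package: `E2E_BASE_URL`, the `SAFE_HOSTS` allow-list, and the helper name are hypothetical.

```typescript
// Decide whether money-moving E2E tests must be skipped for the current target.
// E2E_BASE_URL and SAFE_HOSTS are hypothetical names used only in this sketch.
type Env = Record<string, string | undefined>;

const SAFE_HOSTS = ['localhost', 'staging.example.com'];

function shouldSkipFinancialTests(env: Env): boolean {
  if (env.NODE_ENV === 'production') return true;
  const baseUrl = env.E2E_BASE_URL;
  if (!baseUrl) return true; // unknown target: skip rather than risk real funds
  try {
    return SAFE_HOSTS.indexOf(new URL(baseUrl).hostname) === -1;
  } catch {
    return true; // malformed URL: treat as unsafe
  }
}

// In a spec file this could feed Playwright's conditional skip:
//   test.skip(shouldSkipFinancialTests(process.env), 'financial tests run on staging only')
```

Skipping by default when the target is unknown is a deliberate choice here: a misconfigured environment then produces skipped tests instead of trades against a live system.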
|
|
330
|
+
|
|
331
|
+
## Integration with Other Commands
|
|
332
|
+
|
|
333
|
+
- Use `/plan` to identify critical journeys to test
|
|
334
|
+
- Use `/tdd` for unit tests (faster, more granular)
|
|
335
|
+
- Use `/e2e` for integration and user journey tests
|
|
336
|
+
- Use `/code-review` to verify test quality
|
|
337
|
+
|
|
338
|
+
## Related Agents
|
|
339
|
+
|
|
340
|
+
This command invokes the `e2e-runner` agent located at: `~/.claude/agents/e2e-runner.md`
|
|
341
|
+
|
|
342
|
+
## Quick Commands
|
|
343
|
+
|
|
344
|
+
```bash
|
|
345
|
+
# Run all E2E tests
|
|
346
|
+
npx playwright test
|
|
347
|
+
|
|
348
|
+
# Run specific test file
|
|
349
|
+
npx playwright test tests/e2e/markets/search.spec.ts
|
|
350
|
+
|
|
351
|
+
# Run in headed mode (see browser)
|
|
352
|
+
npx playwright test --headed
|
|
353
|
+
|
|
354
|
+
# Debug test
|
|
355
|
+
npx playwright test --debug
|
|
356
|
+
|
|
357
|
+
# Generate test code
|
|
358
|
+
npx playwright codegen http://localhost:3000
|
|
359
|
+
|
|
360
|
+
# View report
|
|
361
|
+
npx playwright show-report
|
|
362
|
+
```
|
|
@@ -0,0 +1,120 @@
|
|
|
1
|
+
# Eval Command
|
|
2
|
+
|
|
3
|
+
Manage the eval-driven development workflow.
|
|
4
|
+
|
|
5
|
+
## Usage
|
|
6
|
+
|
|
7
|
+
`/eval [define|check|report|list|clean] [feature-name]`
|
|
8
|
+
|
|
9
|
+
## Define Evals
|
|
10
|
+
|
|
11
|
+
`/eval define feature-name`
|
|
12
|
+
|
|
13
|
+
Create a new eval definition:
|
|
14
|
+
|
|
15
|
+
1. Create `.claude/evals/feature-name.md` with template:
|
|
16
|
+
|
|
17
|
+
```markdown
|
|
18
|
+
## EVAL: feature-name
|
|
19
|
+
Created: $(date)
|
|
20
|
+
|
|
21
|
+
### Capability Evals
|
|
22
|
+
- [ ] [Description of capability 1]
|
|
23
|
+
- [ ] [Description of capability 2]
|
|
24
|
+
|
|
25
|
+
### Regression Evals
|
|
26
|
+
- [ ] [Existing behavior 1 still works]
|
|
27
|
+
- [ ] [Existing behavior 2 still works]
|
|
28
|
+
|
|
29
|
+
### Success Criteria
|
|
30
|
+
- pass@3 > 90% for capability evals
|
|
31
|
+
- pass^3 = 100% for regression evals
|
|
32
|
+
```
|
|
33
|
+
|
|
34
|
+
2. Prompt the user to fill in specific criteria
|
|
35
|
+
|
|
36
|
+
## Check Evals
|
|
37
|
+
|
|
38
|
+
`/eval check feature-name`
|
|
39
|
+
|
|
40
|
+
Run evals for a feature:
|
|
41
|
+
|
|
42
|
+
1. Read eval definition from `.claude/evals/feature-name.md`
|
|
43
|
+
2. For each capability eval:
|
|
44
|
+
- Attempt to verify criterion
|
|
45
|
+
- Record PASS/FAIL
|
|
46
|
+
- Log attempt in `.claude/evals/feature-name.log`
|
|
47
|
+
3. For each regression eval:
|
|
48
|
+
- Run relevant tests
|
|
49
|
+
- Compare against baseline
|
|
50
|
+
- Record PASS/FAIL
|
|
51
|
+
4. Report current status:
|
|
52
|
+
|
|
53
|
+
```
|
|
54
|
+
EVAL CHECK: feature-name
|
|
55
|
+
========================
|
|
56
|
+
Capability: X/Y passing
|
|
57
|
+
Regression: X/Y passing
|
|
58
|
+
Status: IN PROGRESS / READY
|
|
59
|
+
```
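
The `X/Y passing` counts could be derived from the eval definition file itself. The sketch below assumes that a passing eval is recorded by ticking its checkbox (`- [x]`); the source does not say how results are stored, so treat that convention, and the function name, as hypothetical.

```typescript
// Count ticked checklist items under each "### <Name> Evals" heading.
// Recording PASS as "- [x]" is an assumption of this sketch.
interface SectionStatus {
  passing: number;
  total: number;
}

function summarizeEvals(markdown: string): Record<string, SectionStatus> {
  const sections: Record<string, SectionStatus> = {};
  let current: SectionStatus | undefined;
  for (const line of markdown.split('\n')) {
    const heading = line.match(/^###\s+(\w+)\s+Evals\s*$/);
    if (heading) {
      current = { passing: 0, total: 0 };
      sections[heading[1]] = current;
      continue;
    }
    if (/^#{1,6}\s/.test(line)) {
      current = undefined; // any other heading ends the eval section
      continue;
    }
    const item = line.match(/^\s*- \[( |x|X)\]/);
    if (item && current) {
      current.total += 1;
      if (item[1] !== ' ') current.passing += 1;
    }
  }
  return sections;
}

// Hypothetical definition file contents
const definition = [
  '## EVAL: feature-auth',
  '### Capability Evals',
  '- [x] User can log in',
  '- [ ] User can reset password',
  '### Regression Evals',
  '- [x] Existing sessions still work',
  '### Success Criteria',
  '- pass@3 > 90% for capability evals',
].join('\n');

const status = summarizeEvals(definition);
for (const name of Object.keys(status)) {
  console.log(`${name}: ${status[name].passing}/${status[name].total} passing`);
}
```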
|
|
60
|
+
|
|
61
|
+
## Report Evals
|
|
62
|
+
|
|
63
|
+
`/eval report feature-name`
|
|
64
|
+
|
|
65
|
+
Generate a comprehensive eval report:
|
|
66
|
+
|
|
67
|
+
```
|
|
68
|
+
EVAL REPORT: feature-name
|
|
69
|
+
=========================
|
|
70
|
+
Generated: $(date)
|
|
71
|
+
|
|
72
|
+
CAPABILITY EVALS
|
|
73
|
+
----------------
|
|
74
|
+
[eval-1]: PASS (pass@1)
|
|
75
|
+
[eval-2]: PASS (pass@2) - required retry
|
|
76
|
+
[eval-3]: FAIL - see notes
|
|
77
|
+
|
|
78
|
+
REGRESSION EVALS
|
|
79
|
+
----------------
|
|
80
|
+
[test-1]: PASS
|
|
81
|
+
[test-2]: PASS
|
|
82
|
+
[test-3]: PASS
|
|
83
|
+
|
|
84
|
+
METRICS
|
|
85
|
+
-------
|
|
86
|
+
Capability pass@1: 33%
|
|
87
|
+
Capability pass@3: 67%
|
|
88
|
+
Regression pass^3: 100%
|
|
89
|
+
|
|
90
|
+
NOTES
|
|
91
|
+
-----
|
|
92
|
+
[Any issues, edge cases, or observations]
|
|
93
|
+
|
|
94
|
+
RECOMMENDATION
|
|
95
|
+
--------------
|
|
96
|
+
[SHIP / NEEDS WORK / BLOCKED]
|
|
97
|
+
```
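
The metrics in such a report can be computed from per-eval attempt logs. The sketch below reads pass@k as "at least one of the first k attempts passed" and pass^k as "all of the first k attempts passed"; that reading, the data, and the function names are assumptions for illustration rather than the command's documented behavior.

```typescript
// attempts[i] is the outcome of the i-th attempt at one eval (true = PASS).
type Attempts = boolean[];

// pass@k: at least one success within the first k attempts
function passAtK(attempts: Attempts, k: number): boolean {
  return attempts.slice(0, k).some((ok) => ok);
}

// pass^k: k attempts were made and every one of them succeeded
function passPowK(attempts: Attempts, k: number): boolean {
  const firstK = attempts.slice(0, k);
  return firstK.length === k && firstK.every((ok) => ok);
}

// Share of evals, as a whole percentage, that satisfy the given metric
function rate(evals: Attempts[], metric: (a: Attempts) => boolean): number {
  if (evals.length === 0) return 0;
  return Math.round((evals.filter(metric).length / evals.length) * 100);
}

// Hypothetical capability results: passed first try, passed on retry, never passed
const capability: Attempts[] = [[true], [false, true], [false, false, false]];
// Hypothetical regression results: three clean runs each
const regression: Attempts[] = [[true, true, true], [true, true, true]];

console.log(`Capability pass@1: ${rate(capability, (a) => passAtK(a, 1))}%`);
console.log(`Capability pass@3: ${rate(capability, (a) => passAtK(a, 3))}%`);
console.log(`Regression pass^3: ${rate(regression, (a) => passPowK(a, 3))}%`);
```

pass^k is the stricter metric, which is why it suits regression evals: a single failed run out of k is enough to fail it.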
|
|
98
|
+
|
|
99
|
+
## List Evals
|
|
100
|
+
|
|
101
|
+
`/eval list`
|
|
102
|
+
|
|
103
|
+
Show all eval definitions:
|
|
104
|
+
|
|
105
|
+
```
|
|
106
|
+
EVAL DEFINITIONS
|
|
107
|
+
================
|
|
108
|
+
feature-auth [3/5 passing] IN PROGRESS
|
|
109
|
+
feature-search [5/5 passing] READY
|
|
110
|
+
feature-export [0/4 passing] NOT STARTED
|
|
111
|
+
```
|
|
112
|
+
|
|
113
|
+
## Arguments
|
|
114
|
+
|
|
115
|
+
$ARGUMENTS:
|
|
116
|
+
- `define <name>` - Create new eval definition
|
|
117
|
+
- `check <name>` - Run and check evals
|
|
118
|
+
- `report <name>` - Generate full report
|
|
119
|
+
- `list` - Show all evals
|
|
120
|
+
- `clean` - Remove old eval logs (keeps last 10 runs)
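
A small parser shows how the argument string could be validated before any work starts. This is a sketch, not the command's implementation; in particular, treating `list` and `clean` as taking no feature name is an assumption drawn from the list above.

```typescript
// Parse the text that follows "/eval" into an action and optional feature name.
type EvalAction = 'define' | 'check' | 'report' | 'list' | 'clean';

interface EvalCommand {
  action: EvalAction;
  name?: string;
}

const ACTIONS: EvalAction[] = ['define', 'check', 'report', 'list', 'clean'];
const NEEDS_NAME: EvalAction[] = ['define', 'check', 'report'];

function parseEvalArgs(args: string): EvalCommand {
  const [action, name] = args.trim().split(/\s+/);
  const known = ACTIONS.filter((candidate) => candidate === action)[0];
  if (!known) {
    throw new Error(`unknown action "${action}"; expected one of ${ACTIONS.join(', ')}`);
  }
  if (NEEDS_NAME.indexOf(known) !== -1 && !name) {
    throw new Error(`"${known}" requires a feature name`);
  }
  return name ? { action: known, name } : { action: known };
}

console.log(parseEvalArgs('check feature-auth'));
console.log(parseEvalArgs('list'));
```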
|
|
@@ -0,0 +1,172 @@
|
|
|
1
|
+
# Orchestrate Command
|
|
2
|
+
|
|
3
|
+
Sequential agent workflow for complex tasks.
|
|
4
|
+
|
|
5
|
+
## Usage
|
|
6
|
+
|
|
7
|
+
`/orchestrate [workflow-type] [task-description]`
|
|
8
|
+
|
|
9
|
+
## Workflow Types
|
|
10
|
+
|
|
11
|
+
### feature
|
|
12
|
+
Full feature implementation workflow:
|
|
13
|
+
```
|
|
14
|
+
planner -> tdd-guide -> code-reviewer -> security-reviewer
|
|
15
|
+
```
|
|
16
|
+
|
|
17
|
+
### bugfix
|
|
18
|
+
Bug investigation and fix workflow:
|
|
19
|
+
```
|
|
20
|
+
planner -> tdd-guide -> code-reviewer
|
|
21
|
+
```
|
|
22
|
+
|
|
23
|
+
### refactor
|
|
24
|
+
Safe refactoring workflow:
|
|
25
|
+
```
|
|
26
|
+
architect -> code-reviewer -> tdd-guide
|
|
27
|
+
```
|
|
28
|
+
|
|
29
|
+
### security
|
|
30
|
+
Security-focused review:
|
|
31
|
+
```
|
|
32
|
+
security-reviewer -> code-reviewer -> architect
|
|
33
|
+
```
|
|
34
|
+
|
|
35
|
+
## Execution Pattern
|
|
36
|
+
|
|
37
|
+
For each agent in the workflow:
|
|
38
|
+
|
|
39
|
+
1. **Invoke agent** with context from the previous agent
|
|
40
|
+
2. **Collect output** as a structured handoff document
|
|
41
|
+
3. **Pass to next agent** in chain
|
|
42
|
+
4. **Aggregate results** into final report
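
The four steps above can be sketched as a loop over async functions. Everything here is illustrative: `Handoff`, `stubAgent`, and `runWorkflow` are hypothetical names, and the stub agents only record that they ran, where real agents would do the actual planning, coding, or reviewing.

```typescript
// A handoff carries one agent's output to the next agent in the chain.
interface Handoff {
  from: string;
  context: string;
}

// An agent is modelled as an async function from one handoff to the next.
type Agent = (input: Handoff) => Promise<Handoff>;

// Stub agent: appends its name to the context (placeholder for real work)
function stubAgent(name: string): Agent {
  return async (input) => ({ from: name, context: `${input.context} -> ${name}` });
}

async function runWorkflow(task: string, chain: Agent[]) {
  const handoffs: Handoff[] = [];
  let current: Handoff = { from: 'user', context: task };
  for (const agent of chain) {
    current = await agent(current); // 1. invoke with the previous agent's context
    handoffs.push(current);         // 2. collect the handoff; 3. it feeds the next turn
  }
  // 4. aggregate every handoff into a final report
  const report = handoffs.map((h) => `${h.from}: ${h.context}`).join('\n');
  return { handoffs, report };
}

const featureChain = ['planner', 'tdd-guide', 'code-reviewer', 'security-reviewer'].map(stubAgent);
runWorkflow('Add user authentication', featureChain).then(({ report }) => console.log(report));
```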
|
|
43
|
+
|
|
44
|
+
## Handoff Document Format
|
|
45
|
+
|
|
46
|
+
Between agents, create a handoff document:
|
|
47
|
+
|
|
48
|
+
```markdown
|
|
49
|
+
## HANDOFF: [previous-agent] -> [next-agent]
|
|
50
|
+
|
|
51
|
+
### Context
|
|
52
|
+
[Summary of what was done]
|
|
53
|
+
|
|
54
|
+
### Findings
|
|
55
|
+
[Key discoveries or decisions]
|
|
56
|
+
|
|
57
|
+
### Files Modified
|
|
58
|
+
[List of files touched]
|
|
59
|
+
|
|
60
|
+
### Open Questions
|
|
61
|
+
[Unresolved items for next agent]
|
|
62
|
+
|
|
63
|
+
### Recommendations
|
|
64
|
+
[Suggested next steps]
|
|
65
|
+
```
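
If handoffs are produced programmatically, a typed structure plus a renderer keeps every document in the same shape. This is a sketch under that assumption; the `HandoffDoc` interface, its field names, and the sample values are hypothetical.

```typescript
// Typed form of the handoff template, one field per section.
interface HandoffDoc {
  from: string;
  to: string;
  context: string;
  findings: string[];
  filesModified: string[];
  openQuestions: string[];
  recommendations: string[];
}

// Render a list as markdown bullets, with a placeholder when it is empty
function bullets(items: string[]): string {
  return items.length > 0 ? items.map((item) => `- ${item}`).join('\n') : '- (none)';
}

function renderHandoff(doc: HandoffDoc): string {
  return [
    `## HANDOFF: ${doc.from} -> ${doc.to}`,
    '',
    '### Context',
    doc.context,
    '',
    '### Findings',
    bullets(doc.findings),
    '',
    '### Files Modified',
    bullets(doc.filesModified),
    '',
    '### Open Questions',
    bullets(doc.openQuestions),
    '',
    '### Recommendations',
    bullets(doc.recommendations),
  ].join('\n');
}

// Hypothetical example values
const rendered = renderHandoff({
  from: 'planner',
  to: 'tdd-guide',
  context: 'Planned the login flow in three steps.',
  findings: ['Session storage already exists'],
  filesModified: [],
  openQuestions: ['Should tokens expire after 24 hours?'],
  recommendations: ['Write the token expiry tests first'],
});
console.log(rendered);
```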
|
|
66
|
+
|
|
67
|
+
## Example: Feature Workflow
|
|
68
|
+
|
|
69
|
+
```
|
|
70
|
+
/orchestrate feature "Add user authentication"
|
|
71
|
+
```
|
|
72
|
+
|
|
73
|
+
Executes:
|
|
74
|
+
|
|
75
|
+
1. **Planner Agent**
|
|
76
|
+
- Analyzes requirements
|
|
77
|
+
- Creates implementation plan
|
|
78
|
+
- Identifies dependencies
|
|
79
|
+
- Output: `HANDOFF: planner -> tdd-guide`
|
|
80
|
+
|
|
81
|
+
2. **TDD Guide Agent**
|
|
82
|
+
- Reads planner handoff
|
|
83
|
+
- Writes tests first
|
|
84
|
+
- Implements to pass tests
|
|
85
|
+
- Output: `HANDOFF: tdd-guide -> code-reviewer`
|
|
86
|
+
|
|
87
|
+
3. **Code Reviewer Agent**
|
|
88
|
+
- Reviews implementation
|
|
89
|
+
- Checks for issues
|
|
90
|
+
- Suggests improvements
|
|
91
|
+
- Output: `HANDOFF: code-reviewer -> security-reviewer`
|
|
92
|
+
|
|
93
|
+
4. **Security Reviewer Agent**
|
|
94
|
+
- Security audit
|
|
95
|
+
- Vulnerability check
|
|
96
|
+
- Final approval
|
|
97
|
+
- Output: Final Report
|
|
98
|
+
|
|
99
|
+
## Final Report Format
|
|
100
|
+
|
|
101
|
+
```
|
|
102
|
+
ORCHESTRATION REPORT
|
|
103
|
+
====================
|
|
104
|
+
Workflow: feature
|
|
105
|
+
Task: Add user authentication
|
|
106
|
+
Agents: planner -> tdd-guide -> code-reviewer -> security-reviewer
|
|
107
|
+
|
|
108
|
+
SUMMARY
|
|
109
|
+
-------
|
|
110
|
+
[One paragraph summary]
|
|
111
|
+
|
|
112
|
+
AGENT OUTPUTS
|
|
113
|
+
-------------
|
|
114
|
+
Planner: [summary]
|
|
115
|
+
TDD Guide: [summary]
|
|
116
|
+
Code Reviewer: [summary]
|
|
117
|
+
Security Reviewer: [summary]
|
|
118
|
+
|
|
119
|
+
FILES CHANGED
|
|
120
|
+
-------------
|
|
121
|
+
[List all files modified]
|
|
122
|
+
|
|
123
|
+
TEST RESULTS
|
|
124
|
+
------------
|
|
125
|
+
[Test pass/fail summary]
|
|
126
|
+
|
|
127
|
+
SECURITY STATUS
|
|
128
|
+
---------------
|
|
129
|
+
[Security findings]
|
|
130
|
+
|
|
131
|
+
RECOMMENDATION
|
|
132
|
+
--------------
|
|
133
|
+
[SHIP / NEEDS WORK / BLOCKED]
|
|
134
|
+
```
|
|
135
|
+
|
|
136
|
+
## Parallel Execution
|
|
137
|
+
|
|
138
|
+
For independent checks, run agents in parallel:
|
|
139
|
+
|
|
140
|
+
```markdown
|
|
141
|
+
### Parallel Phase
|
|
142
|
+
Run simultaneously:
|
|
143
|
+
- code-reviewer (quality)
|
|
144
|
+
- security-reviewer (security)
|
|
145
|
+
- architect (design)
|
|
146
|
+
|
|
147
|
+
### Merge Results
|
|
148
|
+
Combine outputs into single report
|
|
149
|
+
```
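
In code, the parallel phase amounts to starting every reviewer before awaiting any of them, then merging the outputs in a fixed order. The sketch below uses stub reviewers and `Promise.all`; the `Reviewer` type and `runParallel` are hypothetical names, not part of the package.

```typescript
// Each reviewer is modelled as an async function that returns its findings.
type Reviewer = (diff: string) => Promise<string>;

// Stub reviewers (placeholders; real ones would invoke the named agents)
const reviewers: Record<string, Reviewer> = {
  'code-reviewer': async (diff) => `quality review of ${diff.length} chars`,
  'security-reviewer': async (diff) => `security review of ${diff.length} chars`,
  architect: async (diff) => `design review of ${diff.length} chars`,
};

async function runParallel(diff: string): Promise<string> {
  const names = Object.keys(reviewers);
  // Start all reviewers first, then wait for the whole batch
  const outputs = await Promise.all(names.map((name) => reviewers[name](diff)));
  // Merge in declaration order so the combined report is deterministic
  return names.map((name, i) => `${name}: ${outputs[i]}`).join('\n');
}

runParallel('example diff').then((report) => console.log(report));
```

`Promise.all` rejects as soon as one reviewer fails; if a partial report is preferable, `Promise.allSettled` returns every outcome instead.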
|
|
150
|
+
|
|
151
|
+
## Arguments
|
|
152
|
+
|
|
153
|
+
$ARGUMENTS:
|
|
154
|
+
- `feature <description>` - Full feature workflow
|
|
155
|
+
- `bugfix <description>` - Bug fix workflow
|
|
156
|
+
- `refactor <description>` - Refactoring workflow
|
|
157
|
+
- `security <description>` - Security review workflow
|
|
158
|
+
- `custom <agents> <description>` - Custom agent sequence
|
|
159
|
+
|
|
160
|
+
## Custom Workflow Example
|
|
161
|
+
|
|
162
|
+
```
|
|
163
|
+
/orchestrate custom "architect,tdd-guide,code-reviewer" "Redesign caching layer"
|
|
164
|
+
```
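
Resolving the arguments into an agent chain could look like the sketch below. The built-in chains are copied from the workflow types listed earlier; the quoting rules and the names `tokenize` and `resolveWorkflow` are assumptions, since the source does not specify how arguments are parsed.

```typescript
// Built-in chains, copied from the workflow types in this document
const WORKFLOWS: Record<string, string[]> = {
  feature: ['planner', 'tdd-guide', 'code-reviewer', 'security-reviewer'],
  bugfix: ['planner', 'tdd-guide', 'code-reviewer'],
  refactor: ['architect', 'code-reviewer', 'tdd-guide'],
  security: ['security-reviewer', 'code-reviewer', 'architect'],
};

interface Plan {
  agents: string[];
  task: string;
}

// Split on whitespace but keep double-quoted segments together
function tokenize(args: string): string[] {
  const tokens: string[] = [];
  const pattern = /"([^"]*)"|(\S+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(args)) !== null) {
    tokens.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return tokens;
}

function resolveWorkflow(args: string): Plan {
  const [type, ...rest] = tokenize(args);
  if (type === 'custom') {
    const [agentList, ...task] = rest;
    if (!agentList) throw new Error('custom requires a comma-separated agent list');
    const agents = agentList.split(',').map((a) => a.trim()).filter((a) => a.length > 0);
    return { agents, task: task.join(' ') };
  }
  // Own-property check so names like "constructor" are not mistaken for workflows
  if (type === undefined || !Object.prototype.hasOwnProperty.call(WORKFLOWS, type)) {
    throw new Error(`unknown workflow type "${type}"`);
  }
  return { agents: WORKFLOWS[type], task: rest.join(' ') };
}

console.log(resolveWorkflow('custom "architect,tdd-guide,code-reviewer" "Redesign caching layer"'));
```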
|
|
165
|
+
|
|
166
|
+
## Tips
|
|
167
|
+
|
|
168
|
+
1. **Start with planner** for complex features
|
|
169
|
+
2. **Always include code-reviewer** before merge
|
|
170
|
+
3. **Use security-reviewer** for auth/payment/PII
|
|
171
|
+
4. **Keep handoffs concise** - focus on what the next agent needs
|
|
172
|
+
5. **Run verification** between agents if needed
|