@champpaba/claude-agent-kit 1.6.0 → 1.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (72)
  1. package/.claude/CHANGELOG-v1.1.1.md +259 -259
  2. package/.claude/CLAUDE.md +21 -6
  3. package/.claude/agents/01-integration.md +1 -1
  4. package/.claude/agents/02-uxui-frontend.md +1 -1
  5. package/.claude/agents/03-test-debug.md +1 -1
  6. package/.claude/agents/04-frontend.md +1 -1
  7. package/.claude/agents/05-backend.md +1 -1
  8. package/.claude/agents/06-database.md +1 -1
  9. package/.claude/commands/agentsetup.md +1464 -1464
  10. package/.claude/commands/cdev.md +3 -4
  11. package/.claude/commands/csetup.md +82 -3
  12. package/.claude/commands/cstatus.md +60 -60
  13. package/.claude/commands/cview.md +364 -364
  14. package/.claude/commands/psetup.md +101 -101
  15. package/.claude/contexts/design/accessibility.md +611 -611
  16. package/.claude/contexts/design/layout.md +400 -400
  17. package/.claude/contexts/design/responsive.md +551 -551
  18. package/.claude/contexts/design/shadows.md +522 -522
  19. package/.claude/contexts/design/typography.md +465 -465
  20. package/.claude/contexts/domain/README.md +164 -164
  21. package/.claude/contexts/patterns/agent-coordination.md +388 -388
  22. package/.claude/contexts/patterns/agent-discovery.md +182 -182
  23. package/.claude/contexts/patterns/change-workflow.md +538 -538
  24. package/.claude/contexts/patterns/code-standards.md +515 -515
  25. package/.claude/contexts/patterns/development-principles.md +513 -513
  26. package/.claude/contexts/patterns/error-handling.md +478 -478
  27. package/.claude/contexts/patterns/error-recovery.md +365 -365
  28. package/.claude/contexts/patterns/logging.md +424 -424
  29. package/.claude/contexts/patterns/task-breakdown.md +452 -452
  30. package/.claude/contexts/patterns/task-classification.md +523 -523
  31. package/.claude/contexts/patterns/tdd-classification.md +516 -516
  32. package/.claude/contexts/patterns/testing.md +413 -413
  33. package/.claude/contexts/patterns/validation-framework.md +776 -776
  34. package/.claude/lib/agent-executor.md +450 -1
  35. package/.claude/lib/agent-router.md +572 -572
  36. package/.claude/lib/detailed-guides/agent-system.md +11 -9
  37. package/.claude/lib/detailed-guides/incremental-testing.md +460 -0
  38. package/.claude/lib/flags-updater.md +469 -469
  39. package/.claude/lib/task-analyzer.md +398 -2
  40. package/.claude/lib/tdd-classifier.md +345 -345
  41. package/.claude/lib/validation-gates.md +484 -484
  42. package/.claude/settings.local.json +42 -42
  43. package/.claude/templates/context-template.md +45 -45
  44. package/.claude/templates/flags-template.json +42 -42
  45. package/.claude/templates/phase-templates.json +173 -124
  46. package/.claude/templates/phases-sections/accessibility-test.md +17 -17
  47. package/.claude/templates/phases-sections/api-design.md +37 -37
  48. package/.claude/templates/phases-sections/backend-tests.md +16 -16
  49. package/.claude/templates/phases-sections/backend.md +37 -37
  50. package/.claude/templates/phases-sections/business-logic-validation.md +16 -16
  51. package/.claude/templates/phases-sections/component-tests.md +17 -17
  52. package/.claude/templates/phases-sections/contract-backend.md +16 -16
  53. package/.claude/templates/phases-sections/contract-frontend.md +16 -16
  54. package/.claude/templates/phases-sections/database.md +35 -35
  55. package/.claude/templates/phases-sections/documentation.md +17 -17
  56. package/.claude/templates/phases-sections/e2e-tests.md +16 -16
  57. package/.claude/templates/phases-sections/fix-implementation.md +17 -17
  58. package/.claude/templates/phases-sections/frontend-integration.md +18 -18
  59. package/.claude/templates/phases-sections/frontend-mockup.md +123 -123
  60. package/.claude/templates/phases-sections/manual-flow-test.md +15 -15
  61. package/.claude/templates/phases-sections/manual-ux-test.md +16 -16
  62. package/.claude/templates/phases-sections/refactor-implementation.md +17 -17
  63. package/.claude/templates/phases-sections/refactor.md +16 -16
  64. package/.claude/templates/phases-sections/regression-tests.md +15 -15
  65. package/.claude/templates/phases-sections/report.md +16 -16
  66. package/.claude/templates/phases-sections/responsive-test.md +16 -16
  67. package/.claude/templates/phases-sections/script-implementation.md +43 -43
  68. package/.claude/templates/phases-sections/test-coverage.md +16 -16
  69. package/.claude/templates/phases-sections/user-approval.md +14 -14
  70. package/LICENSE +21 -21
  71. package/README.md +171 -35
  72. package/package.json +1 -1
@@ -2,7 +2,7 @@
2
2
 
3
3
  > **Detailed guide to the multi-agent architecture**
4
4
  > **Source:** Extracted from CLAUDE.md (Navigation Hub)
5
- > **Version:** 1.4.0
5
+ > **Version:** 1.7.0 (Opus 4.5)
6
6
 
7
7
  ---
8
8
 
@@ -29,14 +29,16 @@
29
29
 
30
30
  ## Available Agents (6 specialists)
31
31
 
32
- | Agent | Color | When to Use | Phase |
33
- |-------|-------|-------------|-------|
34
- | **integration** | Orange | Validate API contracts before connecting | 2.5 |
35
- | **uxui-frontend** | Blue | Design UI components with mock data | 1 |
36
- | **test-debug** | Red | Run tests and fix bugs (max 3-4 iterations) | 1,3,4 |
37
- | **frontend** | Green | Connect UI to backend APIs | 3 |
38
- | **backend** | Purple | Create API endpoints with validation | 2 |
39
- | **database** | Pink | Design schemas, migrations, complex queries | 2 |
32
+ **All agents use Opus 4.5** for best-in-class reasoning and code quality.
33
+
34
+ | Agent | Color | Model | When to Use | Phase |
35
+ |-------|-------|-------|-------------|-------|
36
+ | **integration** | Orange | opus | Validate API contracts before connecting | 2.5 |
37
+ | **uxui-frontend** | Blue | opus | Design UI components with mock data | 1 |
38
+ | **test-debug** | Red | opus | Run tests and fix bugs (max 3-4 iterations) | 1,3,4 |
39
+ | **frontend** | Green | opus | Connect UI to backend APIs | 3 |
40
+ | **backend** | Cyan | opus | Create API endpoints with validation | 2 |
41
+ | **database** | Pink | opus | Design schemas, migrations, complex queries | 2 |
40
42
 
41
43
  ---
42
44
 
@@ -0,0 +1,460 @@
1
+ # Incremental Integration Testing (v1.4.0)
2
+
3
+ > **Detailed guide to progressive validation for high-risk tasks**
4
+ > **Source:** User requirement for sample-based validation
5
+ > **Version:** 1.4.0
6
+
7
+ ---
8
+
9
+ ## 🧠 The Problem: All-or-Nothing Testing
10
+
11
+ **Before v1.4.0:**
12
+ ```
13
+ Task: "Integrate Google Maps API"
14
+ → Agent implements complete solution (1000 locations)
15
+ → Tests with full dataset
16
+ → Bug found → Hard to debug (which part failed?)
17
+ → Fix → Retest full dataset → Slow iteration
18
+
19
+ Problem:
20
+ ❌ Large scope = hard to debug
21
+ ❌ Late bug detection (at scale)
22
+ ❌ Rework expensive (threw away 1000-location implementation)
23
+ ❌ No confidence in progressive scaling
24
+ ```
25
+
26
+ **After v1.4.0:**
27
+ ```
28
+ Task: "Integrate Google Maps API"
29
+ → Milestone 1: Test 1 location (hardcoded)
30
+ → Bug found → Easy to debug (small scope)
31
+ → Fix → Retest 1 location → Fast iteration
32
+ → Milestone 2: Test 10 locations (parameterized)
33
+ → Works! Confidence++
34
+ → Milestone 3: Error handling
35
+ → Refine edge cases
36
+ → Milestone 4: Scale to 1000
37
+ → Already confident (1 and 10 worked)
38
+
39
+ Benefits:
40
+ ✅ Small scope = easy debugging
41
+ ✅ Early bug detection (at milestone 1)
42
+ ✅ Low rework (fix before scaling)
43
+ ✅ Progressive confidence
44
+ ```
45
+
46
+ ---
47
+
48
+ ## The Solution: Milestone-based Validation
49
+
50
+ **Inspired by:** Incremental integration testing best practices
51
+
52
+ **Key Files:**
53
+ - `@/.claude/lib/task-analyzer.md` - Detection + milestone generation
54
+ - `@/.claude/commands/csetup.md` - Inject milestones into phases.md
55
+ - `@/.claude/lib/agent-executor.md` - Round-based retry execution
56
+
57
+ ---
58
+
59
+ ## 🎯 When to Use Incremental Testing
60
+
61
+ ### Automatic Detection (by `/csetup`)
62
+
63
+ Incremental testing triggers when:
64
+
65
+ | Criteria | Example |
66
+ |----------|---------|
67
+ | **Risk = HIGH** | Payment integration, Auth system |
68
+ | **Risk = MEDIUM + Complexity ≥ 7** | Complex form with 20 fields |
69
+ | **External API dependency** | Google Maps, Stripe, OpenAI |
70
+ | **Data-intensive operation** | ETL, migration, batch processing |
71
+
72
+ **Detection Rate:** ~20-30% of tasks (only high-risk)
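The four criteria in the table above can be read as a single predicate. The following is a minimal sketch under stated assumptions: the `TaskProfile` shape, its field names, and the function name are hypothetical and are not taken from `task-analyzer.md`; only the thresholds come from the table.

```typescript
// Hypothetical task description; field names are illustrative assumptions.
type Risk = "LOW" | "MEDIUM" | "HIGH";

interface TaskProfile {
  risk: Risk;
  complexity: number; // assumed to be on the same scale as "Complexity ≥ 7"
  usesExternalApi: boolean; // e.g. Google Maps, Stripe, OpenAI
  dataIntensive: boolean; // e.g. ETL, migration, batch processing
}

// Returns true when any one of the four table criteria applies.
function needsIncrementalTesting(task: TaskProfile): boolean {
  if (task.risk === "HIGH") return true;
  if (task.risk === "MEDIUM" && task.complexity >= 7) return true;
  return task.usesExternalApi || task.dataIntensive;
}
```

Any single criterion is enough to trigger incremental mode in this sketch; whether the real detector combines them this way is an assumption.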
73
+
74
+ ---
75
+
76
+ ## 📊 Milestone Patterns
77
+
78
+ ### Pattern 1: Backend API Integration (4 milestones)
79
+
80
+ **Use for:** External APIs (Google, Stripe, payment gateways)
81
+
82
+ ```
83
+ M1: Core implementation (30%)
84
+ Test: 1 record, hardcoded
85
+ Goal: Prove API connection works
86
+
87
+ M2: Parameterized query (30%)
88
+ Test: 10 records, dynamic input
89
+ Goal: Validate data flow
90
+
91
+ M3: Error handling (20%)
92
+ Test: Invalid input, timeouts, rate limits
93
+ Goal: Resilience
94
+
95
+ M4: Scale + performance (20%)
96
+ Test: 100-1000 records, load test
97
+ Goal: Production-ready
98
+ ```
99
+
100
+ **Time distribution:** 30-30-20-20 (core/params/errors/scale)
101
+
102
+ ---
103
+
104
+ ### Pattern 2: Complex Form (3 milestones)
105
+
106
+ **Use for:** Multi-step forms, wizards, surveys (complexity ≥ 7)
107
+
108
+ ```
109
+ M1: Architecture + skeleton (40%)
110
+ Implement: Full structure with 2-3 critical fields
111
+ Goal: Architecture supports full field count
112
+
113
+ M2: E2E flow validation (30%)
114
+ Test: Submit minimal form → API → DB
115
+ Goal: Prove full flow works
116
+
117
+ M3: Complete all fields (30%)
118
+ Implement: All 20 fields + validation + accessibility
119
+ Goal: Production-ready
120
+ ```
121
+
122
+ **Time distribution:** 40-30-30 (architecture/flow/completion)
123
+
124
+ **Why architecture-first?**
125
+ Avoids rework if structure changes (2-field form → 20-field wizard)
126
+
127
+ ---
128
+
129
+ ### Pattern 3: Database Migration / ETL (3 milestones)
130
+
131
+ **Use for:** Data migrations, ETL pipelines, batch operations
132
+
133
+ ```
134
+ M1: Dry-run with 10 records (25%)
135
+ Test: Transformation logic, rollback
136
+ Goal: Prove concept
137
+
138
+ M2: Scale to 100 records (25%)
139
+ Test: Performance, duplicates, error logging
140
+ Goal: Validate at moderate scale
141
+
142
+ M3: Full dataset (staging) (50%)
143
+ Test: Complete migration on staging
144
+ Goal: Ready for production
145
+ ```
146
+
147
+ **Time distribution:** 25-25-50 (dry-run/scale/full)
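To make the three time distributions concrete, here is a small sketch that splits a total estimate across milestones. The percentages are the ones stated for each pattern above; `TIME_WEIGHTS`, `splitEstimate`, and the use of minutes as the unit are assumptions made for illustration.

```typescript
// Percent weights per pattern, copied from the "Time distribution" lines above.
// The keys ("api", "form", "migration") are hypothetical labels.
const TIME_WEIGHTS: Record<string, number[]> = {
  api: [30, 30, 20, 20],
  form: [40, 30, 30],
  migration: [25, 25, 50],
};

// Splits a total estimate across milestones, rounding each share up.
// Integer percentages avoid floating-point surprises from values like 0.3.
function splitEstimate(totalMinutes: number, weights: number[]): number[] {
  return weights.map((pct) => Math.ceil((totalMinutes * pct) / 100));
}
```

Because each share is rounded up, the parts can add up to slightly more than the total when the estimate does not divide evenly.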
148
+
149
+ ---
150
+
151
+ ## 🔄 Round-based Retry Logic
152
+
153
+ ### Per-Milestone Quota
154
+
155
+ ```
156
+ Milestone 1: Core implementation
157
+ → Round 1:
158
+ → Attempt 1: ❌ FAIL (API key not set)
159
+ → Attempt 2: ❌ FAIL (Still no API key)
160
+ → Quota exhausted → Escalate to Main Claude
161
+
162
+ → Main Claude analyzes:
163
+ Error pattern: Same error 2x
164
+ Complexity: SIMPLE
165
+ Root cause: Config issue (HIGH confidence)
166
+ Decision: Give hints
167
+
168
+ → Hints provided:
169
+ - Check if API_KEY env variable is set
170
+ - Verify API key in .env.local
171
+ - Restart dev server after adding key
172
+
173
+ → Round 2 (NEW quota: 2 attempts):
174
+ → Attempt 1: ✅ PASS (API key added)
175
+
176
+ Total attempts: 3 (2 in Round 1, 1 in Round 2)
177
+ ```
178
+
179
+ **Key principles:**
180
+ - **2 attempts per round** (not total)
181
+ - **Unlimited rounds** (Main Claude decides)
182
+ - **Hints reset quota** (fresh start)
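The three principles above can be sketched as a loop. This is not the code in `agent-executor.md`; it is a simplified, synchronous illustration in which `attempt` and `analyze` are hypothetical callbacks standing in for the agent and for Main Claude.

```typescript
type AttemptResult = { pass: boolean; error?: string };
type Decision = { action: "hints"; hints: string[] } | { action: "ask_human" };

interface MilestoneOutcome {
  status: "passed" | "escalated";
  rounds: number;
  totalAttempts: number;
}

const ATTEMPTS_PER_ROUND = 2; // quota applies per round, not in total

function runMilestone(
  attempt: (hints: string[]) => AttemptResult,
  analyze: (errors: string[], round: number) => Decision,
): MilestoneOutcome {
  let hints: string[] = [];
  let totalAttempts = 0;
  // No fixed cap on rounds: the loop ends on a pass or when analyze escalates.
  for (let round = 1; ; round++) {
    const errors: string[] = [];
    for (let i = 0; i < ATTEMPTS_PER_ROUND; i++) {
      totalAttempts++;
      const result = attempt(hints);
      if (result.pass) return { status: "passed", rounds: round, totalAttempts };
      errors.push(result.error ?? "unknown error");
    }
    // Quota exhausted: hand the round's errors over for analysis.
    const decision = analyze(errors, round);
    if (decision.action === "ask_human") {
      return { status: "escalated", rounds: round, totalAttempts };
    }
    hints = decision.hints; // new hints start a fresh round with a fresh quota
  }
}
```

With an `attempt` that only passes once hints are present, this reproduces the walkthrough above: two failures in Round 1, one pass in Round 2, three attempts in total.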
183
+
184
+ ---
185
+
186
+ ## 🤖 Main Claude Intervention
187
+
188
+ ### Decision Matrix
189
+
190
+ | Error Pattern | Complexity | Confidence | Action |
191
+ |---------------|------------|------------|--------|
192
+ | Same error 2x | SIMPLE | HIGH | Give Hints |
193
+ | Same error 2x | COMPLEX | LOW | Ask Human |
194
+ | Different errors | ANY | ANY | Ask Human |
195
+ | Intermittent | ANY | ANY | Ask Human |
196
+ | 2+ rounds no progress | ANY | ANY | Ask Human |
197
+
198
+ **Default:** When in doubt → Give Hints (let the agent try one more round)
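One way to read the decision matrix is as an ordered set of checks that fall through to the default. The sketch below is an interpretation, not the shipped logic: the type names, the `roundsWithoutProgress` field, and the order of the checks are assumptions.

```typescript
type ErrorPattern = "same_error" | "different_errors" | "intermittent";
type Complexity = "SIMPLE" | "COMPLEX";
type Confidence = "HIGH" | "LOW";
type Action = "GIVE_HINTS" | "ASK_HUMAN";

interface FailureAnalysis {
  errorPattern: ErrorPattern;
  complexity: Complexity;
  confidence: Confidence;
  roundsWithoutProgress: number;
}

function decideAction(a: FailureAnalysis): Action {
  // "2+ rounds no progress" row
  if (a.roundsWithoutProgress >= 2) return "ASK_HUMAN";
  // "Different errors" and "Intermittent" rows
  if (a.errorPattern !== "same_error") return "ASK_HUMAN";
  // "Same error 2x | COMPLEX | LOW" row
  if (a.complexity === "COMPLEX" && a.confidence === "LOW") return "ASK_HUMAN";
  // Everything else, including combinations the table leaves open, gets hints
  return "GIVE_HINTS";
}
```

Combinations the table does not list, such as SIMPLE with LOW confidence, fall to the Give Hints default here.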
199
+
200
+ ### Hint Generation
201
+
202
+ **Pattern-based hints:**
203
+ ```
204
+ Error: 401 Unauthorized
205
+ → Hints:
206
+ - Check API_KEY env variable
207
+ - Verify API key validity
208
+ - Ensure key has permissions
209
+
210
+ Error: Timeout
211
+ → Hints:
212
+ - Increase timeout threshold
213
+ - Check network connectivity
214
+ - Verify API endpoint URL
215
+
216
+ Error: Schema mismatch
217
+ → Hints:
218
+ - Compare actual vs expected schema
219
+ - Check API version changes
220
+ - Add console.log() to inspect response
221
+ ```
222
+
223
+ **Generic hints:**
224
+ - Review exit criteria
225
+ - Add detailed logging
226
+ - Check implementation vs requirements
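The pattern-based and generic hints listed above could be held in a small lookup table. This is a sketch only: the hint strings are copied from the lists above, while `HINT_RULES`, `hintsFor`, and the regular expressions used to recognise each error are assumptions.

```typescript
interface HintRule {
  pattern: RegExp; // assumed matcher for the error message
  hints: string[];
}

const HINT_RULES: HintRule[] = [
  {
    pattern: /401|unauthorized/i,
    hints: ["Check API_KEY env variable", "Verify API key validity", "Ensure key has permissions"],
  },
  {
    pattern: /timeout|timed out/i,
    hints: ["Increase timeout threshold", "Check network connectivity", "Verify API endpoint URL"],
  },
  {
    pattern: /schema/i,
    hints: ["Compare actual vs expected schema", "Check API version changes", "Add console.log() to inspect response"],
  },
];

const GENERIC_HINTS: string[] = [
  "Review exit criteria",
  "Add detailed logging",
  "Check implementation vs requirements",
];

// Returns the hints of the first matching rule, or the generic hints.
function hintsFor(errorMessage: string): string[] {
  for (const rule of HINT_RULES) {
    if (rule.pattern.test(errorMessage)) return rule.hints;
  }
  return GENERIC_HINTS;
}
```

The first matching rule wins, so more specific patterns would need to be placed earlier in the list.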
227
+
228
+ ---
229
+
230
+ ## 🛑 Human Escalation
231
+
232
+ ### When to Ask Human
233
+
234
+ ```typescript
235
+ // Rule 1: Complex + Low confidence
236
+ if (complexity === 'COMPLEX' && confidence === 'LOW') → Ask Human
237
+
238
+ // Rule 2: Non-deterministic errors
239
+ if (error_pattern === 'different_errors') → Ask Human
240
+
241
+ // Rule 3: Intermittent failures
242
+ if (error_pattern === 'intermittent') → Ask Human
243
+
244
+ // Rule 4: No progress after 2 rounds
245
+ if (rounds >= 2 && !progress) → Ask Human
246
+ ```
247
+
248
+ ### Report Format
249
+
250
+ ```markdown
251
+ 🛑 Human Intervention Required
252
+
253
+ Phase: Google Maps Integration
254
+ Milestone: 3/4 - Error handling
255
+ Total Attempts: 4 across 2 rounds
256
+ Status: AWAITING RESOLUTION
257
+
258
+ ## Failure Summary
259
+ Round 1: (Attempts 1-2)
260
+ - ❌ Timeout errors (5s threshold)
261
+
262
+ Round 2: (Attempts 3-4, after hints)
263
+ - ❌ 503 Service Unavailable (intermittent)
264
+
265
+ ## Analysis
266
+ Error Pattern: Intermittent (non-deterministic)
267
+ Complexity: COMPLEX
268
+ Root Cause: API instability or network issues
269
+ Confidence: LOW
270
+
271
+ ## Recommendations
272
+ 1. Check Google Maps API status page
273
+ 2. Test API directly (curl/Postman)
274
+ 3. Consider retry with exponential backoff
275
+ 4. Verify API quota not exhausted
276
+
277
+ ## Next Steps
278
+ Please investigate and advise:
279
+ - Continue with current approach?
280
+ - Or fix infrastructure first?
281
+ ```
282
+
283
+ ---
284
+
285
+ ## ✅ Exit Criteria Validation
286
+
287
+ ### Agent Output Format (Required)
288
+
289
+ ```markdown
290
+ ## Milestone 2 Results
291
+
292
+ **Implementation Summary:**
293
+ Implemented parameterized query with dynamic address input
294
+
295
+ **Test Results:**
296
+ - [ ] Accepts dynamic input - PASS - Accepts string parameter
297
+ - [ ] Returns correct results for 10 queries - PASS - All 10 matched
298
+ - [ ] No duplicate API calls - PASS - Checked logs, no dupes
299
+ - [ ] Response time < 700ms - FAIL - Got 823ms (too slow)
300
+
301
+ **Issues Found:**
302
+ - Response time exceeds threshold (optimization needed)
303
+
304
+ **Conclusion:**
305
+ FAIL → Need to add caching to reduce response time
306
+ ```
307
+
308
+ **Validation:**
309
+ - ALL criteria must report PASS/FAIL
310
+ - Missing criteria = automatic FAIL
311
+ - Strict mode: ALL must pass (no lenient 80% rule)
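The validation rules above can be sketched as a small checker over the agent's report. The line format it parses (`- [ ] <criterion> - PASS|FAIL - <note>`) follows the sample output shown earlier; the function names and return shapes are hypothetical and not taken from the package.

```typescript
interface CriterionResult {
  name: string;
  status: "PASS" | "FAIL";
  note: string;
}

// Extracts "- [ ] <name> - PASS|FAIL - <note>" lines from a milestone report.
function parseResults(report: string): CriterionResult[] {
  const results: CriterionResult[] = [];
  for (const line of report.split("\n")) {
    const m = line.match(/^\s*- \[[ x]\] (.+?) - (PASS|FAIL)(?: - (.*))?$/);
    if (m) {
      results.push({ name: m[1], status: m[2] as "PASS" | "FAIL", note: m[3] || "" });
    }
  }
  return results;
}

// Strict mode: every expected criterion must be reported and must pass.
function validateMilestone(
  expected: string[],
  report: string,
): { passed: boolean; failures: string[] } {
  const byName: Record<string, CriterionResult> = {};
  for (const r of parseResults(report)) byName[r.name] = r;

  const failures: string[] = [];
  for (const name of expected) {
    const r = byName[name];
    if (!r) failures.push(`${name}: missing (automatic FAIL)`);
    else if (r.status === "FAIL") failures.push(`${name}: ${r.note}`);
  }
  return { passed: failures.length === 0, failures };
}
```

Matching criteria by exact name is a simplification; a real checker would likely need to tolerate small wording differences between phases.md and the agent's report.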
312
+
313
+ ---
314
+
315
+ ## 📊 Benefits & Trade-offs
316
+
317
+ ### Benefits
318
+
319
+ | Benefit | Quantified |
320
+ |---------|------------|
321
+ | **Bug detection** | Catch at M1 (1 record) vs M4 (1000 records) → 75% faster debug |
322
+ | **Rework reduction** | Fix before scaling → 60-70% less rework |
323
+ | **Debug time** | Small scope (1 record) → 80% faster than full dataset |
324
+ | **Confidence** | Progressive proof → 90% success rate at M4 |
325
+ | **Risk mitigation** | Early validation → Catch critical bugs before production |
326
+
327
+ ### Trade-offs
328
+
329
+ | Trade-off | Impact | Mitigation |
330
+ |-----------|--------|------------|
331
+ | **Timeline +15-20%** | Slower initial dev | Saves 60-70% rework time (net positive) |
332
+ | **phases.md 2-3x longer** | Harder to read | Summary table at top |
333
+ | **Complexity** | Learning curve | Comprehensive docs |
334
+ | **Overhead** | More coordination | Automated by `/csetup` |
335
+
336
+ **Net benefit:** +15-20% time → -60-70% rework = **40-50% faster overall**
337
+
338
+ ---
339
+
340
+ ## 🎯 Usage Example
341
+
342
+ ### User Flow
343
+
344
+ ```bash
345
+ # 1. User creates OpenSpec change
346
+ /csetup CHANGE-042
347
+
348
+ # Output:
349
+ ✅ Task Analysis Complete
350
+ 🔄 Testing Strategy:
351
+ - Incremental: 2 tasks (7 milestones total)
352
+ → Google Maps integration (4 milestones)
353
+ → Payment processing (3 milestones)
354
+ - Standard: 5 tasks
355
+
356
+ # 2. Review phases.md
357
+ cat openspec/changes/CHANGE-042/.claude/phases.md
358
+
359
+ # Shows:
360
+ ### Phase 2: Google Maps Integration
361
+ Testing Strategy: 🔄 INCREMENTAL
362
+ Total Milestones: 4
363
+
364
+ #### Milestone 1/4: Core implementation
365
+ Test Scope: Single happy path (1 record)
366
+ Exit Criteria:
367
+ - [ ] Response status = 200
368
+ - [ ] Data structure valid
369
+ - [ ] Response time < 500ms
370
+ ...
371
+
372
+ # 3. Execute
373
+ /cdev CHANGE-042
374
+
375
+ # Execution:
376
+ 🔄 INCREMENTAL MODE
377
+ Milestone 1/4: Core implementation
378
+ Round 1, Attempt 1: ❌ FAIL (API key missing)
379
+ Round 1, Attempt 2: ❌ FAIL (Still missing)
380
+ 💡 Main Claude: "Check API_KEY env variable"
381
+ Round 2, Attempt 1: ✅ PASS
382
+ Milestone 2/4: Parameterized query
383
+ Round 1, Attempt 1: ✅ PASS
384
+ ...
385
+ ```
386
+
387
+ ---
388
+
389
+ ## 🔧 Maintenance Guide
390
+
391
+ ### Adding New Patterns
392
+
393
+ **File:** `.claude/lib/task-analyzer.md` → `generateMilestones()`
394
+
395
+ ```typescript
396
+ // Pattern 4: Real-time WebSocket (NEW)
397
+ if (taskLower.match(/websocket|realtime|socket\.io/i)) {
398
+ milestones.push({
399
+ id: 1,
400
+ name: 'Basic connection',
401
+ testScope: 'Single client, simple message',
402
+ exitCriteria: [
403
+ 'WebSocket connects successfully',
404
+ 'Message sent and received',
405
+ 'Connection closes gracefully'
406
+ ],
407
+ estimatedTime: Math.ceil(estimatedTime * 0.3),
408
+ retryLimit: 2
409
+ })
410
+ // ... more milestones
411
+ }
412
+ ```
413
+
414
+ ### Adjusting Time Distribution
415
+
416
+ Current: 30-30-20-20 (API pattern)
417
+
418
+ If Pattern 1 (API) milestones take longer:
419
+ ```typescript
420
+ // Before: 30-30-20-20
421
+ milestones[0].estimatedTime = estimatedTime * 0.3
422
+ milestones[1].estimatedTime = estimatedTime * 0.3
423
+ milestones[2].estimatedTime = estimatedTime * 0.2
424
+ milestones[3].estimatedTime = estimatedTime * 0.2
425
+
426
+ // After: 25-25-25-25 (more balanced)
427
+ milestones.forEach(m => {
428
+ m.estimatedTime = Math.ceil(estimatedTime / 4)
429
+ })
430
+ ```
431
+
432
+ ---
433
+
434
+ ## 📖 References
435
+
436
+ - **Original inspiration:** User request for "incremental integration testing"
437
+ - **Similar concepts:**
438
+ - Sample-based validation (ML/AI pipelines)
439
+ - Progressive enhancement (web development)
440
+ - Iterative refinement testing (data science)
441
+ - **Related files:**
442
+ - Task analysis: `.claude/lib/task-analyzer.md`
443
+ - Execution logic: `.claude/lib/agent-executor.md`
444
+ - Workflow generation: `.claude/commands/csetup.md`
445
+
446
+ ---
447
+
448
+ ## 🎓 Key Takeaways
449
+
450
+ 1. **Use for high-risk only** (20-30% of tasks, not all)
451
+ 2. **Milestone-based > size-based** (functionality over record count)
452
+ 3. **Architecture-first** (avoid rework from structure changes)
453
+ 4. **Trust Main Claude** (hints > blind retry)
454
+ 5. **Progressive confidence** (each milestone proves the next)
455
+
456
+ **Bottom line:** Incremental testing trades +15-20% time for -60-70% rework → **40-50% net speedup** + higher quality.
457
+
458
+ ---
459
+
460
+ This testing strategy ensures high-risk tasks succeed systematically! 🚀