npm - @champpaba/claude-agent-kit - Versions diffs - 1.5.1 → 1.6.1 - Mend

@champpaba/claude-agent-kit 1.5.1 → 1.6.1


Files changed (9)

package/.claude/CLAUDE.md +39 -13
package/.claude/commands/csetup.md +82 -3
package/.claude/commands/pageplan.md +233 -6
package/.claude/lib/agent-executor.md +449 -0
package/.claude/lib/detailed-guides/incremental-testing.md +460 -0
package/.claude/lib/task-analyzer.md +398 -2
package/.claude/templates/phase-templates.json +50 -1
package/README.md +125 -39
package/package.json +1 -1

package/README.md CHANGED

@@ -917,52 +917,138 @@ Built with:
 ---
-## 🆕 What's New in v1.4.1
+## 🆕 What's New in v1.6.0
-**Feature: Intelligent Auto-Proceed - Eliminate Double Confirmations** 🚀
+**Feature: Incremental Testing - Milestone-based Validation for High-Risk Tasks** 🔄
-### Smart Approval Detection
+### The Problem: All-or-Nothing Testing
-**Problem Solved:**
-- Before: Agent asks "Proceed?" → Main Claude asks user again (redundant!)
-- User frustration: "I already said 'ลุยเลย', why ask twice?"
+**Before v1.6.0:**
+```
+Task: "Integrate Google Maps API"
+→ Agent implements complete solution (1000 locations)
+→ Tests with full dataset
+→ Bug found → Hard to debug (which part failed?)
+→ Fix → Retest full dataset → Slow iteration
+Problem:
+❌ Large scope = hard to debug
+❌ Late bug detection (at scale)
+❌ Rework expensive (threw away 1000-location implementation)
+❌ No confidence in progressive scaling
+```
-**Solution Implemented:**
-- ✅ Main Claude detects user approval keywords ("continue", "proceed", "yes", "ลุยเลย")
-- ✅ Passes approval context to agents in prompt
-- ✅ Auto-responds to agent questions without re-prompting user
-- ✅ Backward compatible: Manual approval mode still available
+**After v1.6.0:**
+```
+Task: "Integrate Google Maps API"
+→ Milestone 1: Test 1 location (hardcoded)
+   → Bug found → Easy to debug (small scope)
+   → Fix → Retest 1 location → Fast iteration
+→ Milestone 2: Test 10 locations (parameterized)
+   → Works! Confidence++
+→ Milestone 3: Error handling
+   → Refine edge cases
+→ Milestone 4: Scale to 1000
+   → Already confident (1 and 10 worked)
+Benefits:
+✅ Small scope = easy debugging
+✅ Early bug detection (at milestone 1)
+✅ Low rework (fix before scaling)
+✅ Progressive confidence
+```
-**Results:**
-- **50-90% fewer confirmations** (1x per workflow vs 2x per phase)
-- **25% faster execution** (no waiting for redundant approvals)
-- **Better UX** (approve once, system handles the rest)
-- **Lean implementation** (80 lines, 1 file, +0.1% context)
+### The Solution: Milestone-based Validation
-### How It Works
+**Automatic Detection:** `/csetup` detects high-risk tasks automatically
+- Risk = HIGH (payment, auth, security)
+- Risk = MEDIUM + Complexity ≥ 7 (complex forms)
+- External API dependency (Google Maps, Stripe, OpenAI)
+- Data-intensive operation (ETL, migration, batch processing)
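Reviewer note: the four detection rules added above can be read as a single predicate. The following is a minimal sketch of that reading, not the kit's actual implementation; the `Task` fields and the function name `needs_incremental_testing` are hypothetical, and the README does not say how `/csetup` actually computes risk or complexity.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task summary; field names are assumptions, not the kit's schema."""
    risk: str                      # "LOW", "MEDIUM", or "HIGH"
    complexity: int                # assumed 1-10 score
    external_api: bool = False     # e.g. Google Maps, Stripe, OpenAI
    data_intensive: bool = False   # e.g. ETL, migration, batch processing


def needs_incremental_testing(task: Task) -> bool:
    """Return True if any of the four README rules matches."""
    if task.risk == "HIGH":
        return True
    if task.risk == "MEDIUM" and task.complexity >= 7:
        return True
    # The last two rules apply regardless of the risk label.
    return task.external_api or task.data_intensive
```

Under this reading, a MEDIUM-risk task with complexity 6 and no external dependency would not trigger milestones, while the same task calling an external API would.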
-```bash
-# Before v1.4.1 (Double confirmation ❌)
-User: "ลุยเลย"
-Main: Calls uxui-frontend agent
-Agent: "Pre-work done. Proceed?"
-Main: "Agent is asking... Proceed? (yes/no)"  ← Asks user again!
-User: "Why ask twice?"
-# After v1.4.1 (Smart auto-proceed ✅)
-User: "ลุยเลย"
-Main: Detects approval → auto_proceed = true
-Agent: "Pre-work done. Proceed?"
-Main: "YES, proceed immediately"  ← Answers agent directly!
-Agent: Continues work...
-```
-### Auto-Proceed Trigger Words
-These keywords enable auto-proceed mode:
-- ✅ `/cdev` command (implicit approval for all phases)
-- ✅ "continue", "proceed", "yes"
-- ✅ "ลุยเลย" (Thai: "go ahead")
+**3 Milestone Patterns:**
+1. **Backend API Integration** (4 milestones)
+   - M1: Core implementation (1 record, hardcoded)
+   - M2: Parameterized query (10 records, dynamic)
+   - M3: Error handling (invalid input, timeouts)
+   - M4: Scale + performance (100-1000 records)
+2. **Complex Form** (3 milestones)
+   - M1: Architecture + skeleton (2-3 critical fields)
+   - M2: E2E flow validation (submit → API → DB)
+   - M3: Complete all fields (all 20 fields + validation)
+3. **Database Migration / ETL** (3 milestones)
+   - M1: Dry-run (10 records)
+   - M2: Scale to 100 records
+   - M3: Full dataset (staging)
+### Round-based Retry Logic
+**Per-Milestone Quota:**
+- **2 attempts per round** (not total)
+- **Unlimited rounds** (Main Claude decides when to stop)
+- **Hints reset quota** (fresh start)
+**Example:**
+```
+Milestone 1: Core implementation
+→ Round 1: Attempt 1 ❌ (API key missing)
+→ Round 1: Attempt 2 ❌ (Still missing)
+→ Main Claude: "Check API_KEY env variable" 💡
+→ Round 2: Attempt 1 ✅ (Fixed!)
+Total attempts: 3 (2 in Round 1, 1 in Round 2)
+```
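Reviewer note: the quota rules above (2 attempts per round, a hint opens a fresh round) can be sketched as a small loop. This is an illustrative model only; `run_milestone`, `attempt`, and `give_hint` are hypothetical names, and treating "no hint available" as the stop condition is an assumption standing in for "Main Claude decides when to stop".

```python
from typing import Callable, Optional, Tuple

ATTEMPTS_PER_ROUND = 2  # quota is per round, not per milestone


def run_milestone(
    attempt: Callable[[Optional[str]], bool],
    give_hint: Callable[[int], Optional[str]],
) -> Tuple[bool, int]:
    """Run rounds until an attempt succeeds or no further hint is offered.

    attempt(hint) returns True on success; give_hint(round_no) returns a hint
    string to start a new round, or None to stop (assumed stop condition).
    Returns (succeeded, total_attempts).
    """
    hint: Optional[str] = None
    total_attempts = 0
    round_no = 1
    while True:
        for _ in range(ATTEMPTS_PER_ROUND):
            total_attempts += 1
            if attempt(hint):
                return True, total_attempts
        hint = give_hint(round_no)  # a hint resets the quota
        if hint is None:
            return False, total_attempts
        round_no += 1


# Reproduces the README example: two failures, one hint, then success.
def flaky_attempt(hint: Optional[str]) -> bool:
    return hint is not None


def one_hint(round_no: int) -> Optional[str]:
    return "Check API_KEY env variable" if round_no == 1 else None
```

Calling `run_milestone(flaky_attempt, one_hint)` uses two attempts in Round 1 and one in Round 2, matching the total of 3 in the example.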
+### Main Claude Intervention
+**Decision Matrix:**
+| Error Pattern | Complexity | Confidence | Action |
+|---------------|------------|------------|--------|
+| Same error 2x | SIMPLE | HIGH | Give Hints |
+| Same error 2x | COMPLEX | LOW | Ask Human |
+| Different errors | ANY | ANY | Ask Human |
+| Intermittent | ANY | ANY | Ask Human |
+| 2+ rounds no progress | ANY | ANY | Ask Human |
+**Pattern-based Hints:**
+- 401 Unauthorized → Check API_KEY, verify key validity
+- Timeout → Increase threshold, check network
+- Schema mismatch → Compare actual vs expected, check API version
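Reviewer note: the decision matrix and hint list above can be expressed as a lookup. The sketch below is one possible encoding under stated assumptions: the pattern labels, `decide`, and `hint_for` are hypothetical names, and combinations the table does not list (for example SIMPLE with LOW confidence) are assumed to fall back to "Ask Human".

```python
from typing import Optional

# Hint text copied from the README list; the dictionary keys are assumed labels.
HINTS = {
    "401 Unauthorized": "Check API_KEY, verify key validity",
    "Timeout": "Increase threshold, check network",
    "Schema mismatch": "Compare actual vs expected, check API version",
}


def decide(error_pattern: str, complexity: str, confidence: str,
           rounds_without_progress: int = 0) -> str:
    """Return "Give Hints" or "Ask Human" following the README table."""
    if rounds_without_progress >= 2:
        return "Ask Human"
    if (error_pattern == "same_error_2x"
            and complexity == "SIMPLE"
            and confidence == "HIGH"):
        return "Give Hints"
    # Different errors, intermittent failures, and unlisted combinations
    # all escalate to a human (the last case is an assumption).
    return "Ask Human"


def hint_for(error: str) -> Optional[str]:
    """Look up a pattern-based hint, or None if the error is unrecognized."""
    return HINTS.get(error)
```

Only one row of the table leads to an automatic hint; every other row escalates, which is why the fallback branch covers most inputs.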
+### Benefits & Trade-offs
+**Benefits:**
+- ✅ **75% faster debug** - Catch bugs at M1 (1 record) vs M4 (1000 records)
+- ✅ **60-70% rework reduction** - Fix before scaling
+- ✅ **80% faster debugging** - Small scope (1 record) vs full dataset
+- ✅ **90% success rate** - Progressive confidence at M4
+- ✅ **40-50% net speedup** - +15-20% time upfront → -60-70% rework time
+**Trade-offs:**
+- ⚠️ **Timeline:** +15-20% upfront (but saves 60-70% rework)
+- ⚠️ **Complexity:** phases.md 2-3x longer (summary table at top)
+- ⚠️ **Learning curve:** More coordination (automated by `/csetup`)
+**Net benefit:** +15-20% time → -60-70% rework = **40-50% faster overall**
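Reviewer note: the README does not show how "+15-20% time" and "-60-70% rework" combine into "40-50% faster overall". One simple model, offered here as an assumption rather than the author's derivation, splits baseline time into implementation time T and rework time R, with r = R / T. The function name and parameters below are hypothetical.

```python
def net_speedup(rework_ratio: float, upfront_overhead: float,
                rework_reduction: float) -> float:
    """Fractional reduction in total time under the assumed model.

    Baseline total:  T + R             with R = rework_ratio * T
    New total:       (1 + a) * T + (1 - b) * R
    Speedup:         (b * r - a) / (1 + r)
    """
    r, a, b = rework_ratio, upfront_overhead, rework_reduction
    return (b * r - a) / (1 + r)


# Conservative and optimistic ends of the README's ranges.
conservative = net_speedup(3.0, 0.20, 0.60)    # about 0.40
optimistic = net_speedup(3.25, 0.15, 0.70)     # about 0.50
equal_rework = net_speedup(1.0, 0.20, 0.60)    # about 0.20
```

In this model the 40-50% figure is reached when baseline rework is roughly three times the initial implementation time; if rework only equals implementation time, the same percentages give about a 20-28% reduction.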
+### When to Use Incremental Testing
+**✅ Use for:**
+- Payment integration, Auth systems (HIGH risk)
+- Complex forms with 20+ fields (Complexity ≥ 7)
+- External APIs (Google Maps, Stripe, OpenAI)
+- Data migrations, ETL pipelines (data-intensive)
+**❌ Skip for:**
+- Simple CRUD operations (LOW risk, Complexity < 5)
+- UI components (standard testing sufficient)
+- Configuration changes (no integration testing needed)
+**Detection Rate:** ~20-30% of tasks (only high-risk)
 ---

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@champpaba/claude-agent-kit",
-  "version": "1.5.1",
+  "version": "1.6.1",
   "description": "Universal multi-agent template for Claude Code - AI-assisted development with specialized agents",
   "main": "bin/cli.js",
   "bin": {