npm - ctx-cc - Versions diffs - 3.3.1 → 3.3.3 - Mend

ctx-cc 3.3.1 → 3.3.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10)

package/README.md CHANGED

@@ -15,7 +15,7 @@
 **AI that learns your preferences. Predictive planning. Self-healing deployments. Voice control.**
-[Installation](#installation) · [Quick Start](#quick-start) · [New in 3.3](#new-in-33) · [Commands](#commands) · [Why CTX](#why-ctx)
+[Installation](#installation) · [Quick Start](#quick-start) · [New in 3.3](#new-in-33) · [Commands](#commands) · [Why CTX](#why-ctx) · [**Getting Started Guide**](./GETTING_STARTED.md)
 </div>
@@ -403,12 +403,31 @@ Configure in `.ctx/config.json`:
 }
 ```
-### Persistent Debug State
-Debug sessions survive context resets:
+### Persistent Debug Mode
+Scientific debugging with persistent state across sessions:
 ```bash
-/ctx debug --resume      # Continue previous session
+/ctx debug "login fails"    # Start debugging
+/ctx debug --resume         # Resume after context reset
+/ctx debug --list           # See all sessions
+```
+**How it works:**
+```
+1. OBSERVE   → Capture exact error, context, state
+2. RESEARCH  → Search codebase and web for similar issues
+3. HYPOTHESIZE → Form testable theory with confidence level
+4. TEST      → Apply minimal fix
+5. VERIFY    → Build + Tests + Lint + Browser
+6. ITERATE   → Refine hypothesis, max 10 attempts
 ```
+**Key features:**
+- Sessions survive context resets and days between attempts
+- Browser verification with stored credentials
+- Screenshots saved for each attempt
+- Escalation report if max attempts reached
 State stored in `.ctx/debug/sessions/`:
 - `STATE.json` - Machine-readable progress
 - `TRACE.md` - Human-readable log
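The six-step loop described in this README hunk can be read as a bounded retry loop. The following is a minimal sketch and not code from the package: `runDebugLoop`, `hypothesize`, `applyFix`, and `verify` are hypothetical names, and the only details taken from the README are the cap of 10 attempts and the escalation outcome.

```javascript
// Hypothetical sketch of the bounded debug loop described in the README.
// None of these function names come from ctx-cc; they are placeholders.
function runDebugLoop({ hypothesize, applyFix, verify, maxAttempts = 10 }) {
  const attempts = [];
  for (let n = 1; n <= maxAttempts; n++) {
    // HYPOTHESIZE: form a theory, informed by earlier failed attempts.
    const hypothesis = hypothesize(attempts);
    // TEST: apply a minimal fix for that single hypothesis.
    applyFix(hypothesis);
    // VERIFY: the caller decides what "passing" means (build, tests, lint, browser).
    const passed = verify();
    attempts.push({ n, hypothesis, passed });
    if (passed) return { status: 'resolved', attempts };
  }
  // Attempts exhausted: hand over to a human with the full attempt history.
  return { status: 'escalated', attempts };
}

// Usage with stub callbacks: the third fix is the one that verifies.
let applied = 0;
const result = runDebugLoop({
  hypothesize: (prev) => `theory-${prev.length + 1}`,
  applyFix: () => { applied += 1; },
  verify: () => applied === 3,
});
console.log(result.status, result.attempts.length); // resolved 3
```

Passing the attempt history into `hypothesize` is the design point here: each new theory can take earlier failures into account, which is what the persistent session state is meant to enable.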
@@ -486,6 +505,15 @@ Results synthesized into `SUMMARY.md`.
 | `/ctx verify` | Force three-level verification |
 | `/ctx quick "task"` | Quick task bypass |
+### Debug
+| Command | Purpose |
+|---------|---------|
+| `/ctx debug` | Start debugging current issue |
+| `/ctx debug "issue"` | Debug specific problem |
+| `/ctx debug --resume` | Resume last debug session |
+| `/ctx debug --list` | List all debug sessions |
+| `/ctx debug --status` | Show current session status |
 ### Session
 | Command | Purpose |
 |---------|---------|

package/agents/ctx-designer.md CHANGED

@@ -623,6 +623,45 @@ If ALL criteria pass:
 }
 ```
+## Delivery Guarantee Loop
+```
+┌─────────────────────────────────────────┐
+│    DESIGN DELIVERY GUARANTEE LOOP       │
+├─────────────────────────────────────────┤
+│                                         │
+│  verify design → ALL PASS?              │
+│     │                                   │
+│     ├─ YES → COMPLETE (100% working)    │
+│     │                                   │
+│     └─ NO → debug/fix → verify again    │
+│              ↑                     │    │
+│              └─────────────────────┘    │
+│                                         │
+│  Loop continues until:                  │
+│  1. ALL checks pass, OR                 │
+│  2. Max attempts (10) → escalate        │
+│                                         │
+│  NEVER mark complete with failures      │
+│                                         │
+└─────────────────────────────────────────┘
+```
+**Design work is ONLY marked complete when:**
+- ✓ All approval gates passed (mood board, direction, prototype)
+- ✓ Three-level check passes (exists, substantive, wired)
+- ✓ Visual verification matches prototype
+- ✓ WCAG 2.2 AA compliance verified
+- ✓ Browser renders correctly (all breakpoints)
+- ✓ No console errors
+- ✓ All acceptance criteria satisfied
+**If ANY fails:**
+- Set status = "debugging"
+- Spawn ctx-debugger for technical issues
+- Return to designer for visual/a11y issues
+- Loop until 100% verified
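The completion rule above amounts to an all-or-nothing gate. As a rough sketch (not part of the package), it could look like the following; the check keys are invented shorthand for the seven checklist items, and `designStatus` is a hypothetical helper.

```javascript
// Hypothetical sketch: the check names mirror the checklist above, but the
// function and its input shape are assumptions, not ctx-cc code.
const DESIGN_CHECKS = [
  'approvalGates',
  'threeLevelCheck',
  'visualMatch',
  'wcag22AA',
  'browserRender',
  'noConsoleErrors',
  'acceptanceCriteria',
];

function designStatus(results) {
  // A check that is missing from `results` counts as a failure.
  const failing = DESIGN_CHECKS.filter((name) => results[name] !== true);
  return failing.length === 0
    ? { status: 'complete', failing }
    : { status: 'debugging', failing };
}

const allGreen = Object.fromEntries(DESIGN_CHECKS.map((name) => [name, true]));
console.log(designStatus(allGreen).status); // complete
const oneRed = designStatus({ ...allGreen, wcag22AA: false });
console.log(oneRed.status, oneRed.failing.join(', ')); // debugging wcag22AA
```

Treating an absent result as a failure keeps the gate conservative: work cannot be marked complete just because a check was never run.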
 </verification>
 <output>

package/agents/ctx-executor.md CHANGED

@@ -312,11 +312,40 @@ After each task:
 - Set status = "debugging"
 - Capture error details
 - Hand off to ctx-debugger
+- **Loop until fixed** (debug → fix → verify → repeat)
 ### If All Tasks Complete
 - Set status = "verifying"
 - Hand off to ctx-verifier
+## 8. Delivery Guarantee
+```
+┌─────────────────────────────────────────┐
+│       EXECUTOR DELIVERY GUARANTEE       │
+├─────────────────────────────────────────┤
+│                                         │
+│  execute task → quick verify            │
+│     │                                   │
+│     ├─ PASS → commit → next task        │
+│     │                                   │
+│     └─ FAIL → debug → fix → verify      │
+│              ↑                     │    │
+│              └─────────────────────┘    │
+│                                         │
+│  NEVER commit failing code              │
+│  NEVER skip to next task on failure     │
+│  ALWAYS loop until 100% working         │
+│                                         │
+└─────────────────────────────────────────┘
+```
+**Task is ONLY committed when:**
+- ✓ Build passes
+- ✓ Tests pass
+- ✓ Lint passes
+- ✓ No blocking anti-patterns
 </process>
 <commit_message_examples>

package/agents/ctx-planner.md CHANGED

@@ -148,6 +148,15 @@ Write `.ctx/phases/{story_id}/PLAN.md`:
 - Run tests
 - Browser verify each criterion
+## Delivery Guarantee
+Every task loops until verified:
+```
+execute → verify → PASS? → commit
+              ↓ NO
+          debug → fix → verify again
+```
+NEVER mark done until ALL criteria pass.
 ## Notes
 {Key insights from research}
 ```

package/agents/ctx-verifier.md CHANGED

@@ -399,9 +399,42 @@ Based on results:
 - Update PRD progress
 **If FAIL:**
-- Create fix tasks mapped to failing criteria
-- Set status = "debugging" or "executing"
-- Keep current story
+- Analyze failure type:
+  - **Build/Test/Runtime error** → Set status = "debugging"
+  - **Missing feature** → Set status = "executing" with fix tasks
+- Record `debug_issue` with error details
+- **DO NOT mark as complete until 100% pass**
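The new "If FAIL" branch routes by failure type rather than leaving the choice of status open. A minimal sketch of that routing follows. The `kind` values, the `routeFailure` name, and the returned shape are assumptions; only the two target statuses and the `debug_issue` field name appear in the diff.

```javascript
// Hypothetical sketch of the "If FAIL" routing. The failure `kind` values and
// the returned shape are assumptions; only the two target statuses and the
// `debug_issue` field name come from the diff.
const DEBUG_KINDS = new Set(['build', 'test', 'runtime']);

function routeFailure(failure) {
  if (DEBUG_KINDS.has(failure.kind)) {
    // Errors go to the debugger with the error details attached.
    return { status: 'debugging', debug_issue: failure.message };
  }
  if (failure.kind === 'missing_feature') {
    // Gaps in the implementation go back to execution as fix tasks.
    return { status: 'executing', fixTasks: [`Implement: ${failure.message}`] };
  }
  throw new Error(`Unknown failure kind: ${failure.kind}`);
}

console.log(routeFailure({ kind: 'test', message: 'login.test.ts failed' }).status); // debugging
console.log(routeFailure({ kind: 'missing_feature', message: 'password reset' }).status); // executing
```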
+## 11. Delivery Guarantee Loop
+```
+┌─────────────────────────────────────────┐
+│         DELIVERY GUARANTEE LOOP         │
+├─────────────────────────────────────────┤
+│                                         │
+│  verify → ALL CRITERIA PASS?            │
+│     │                                   │
+│     ├─ YES → COMPLETE (100% working)    │
+│     │                                   │
+│     └─ NO → debug/fix → verify again    │
+│              ↑                     │    │
+│              └─────────────────────┘    │
+│                                         │
+│  Loop continues until:                  │
+│  1. ALL criteria pass, OR               │
+│  2. Max debug attempts (10) → escalate  │
+│                                         │
+│  NEVER mark complete with failures      │
+│                                         │
+└─────────────────────────────────────────┘
+```
+**Key principle:** Work is ONLY marked complete when:
+- ✓ ALL acceptance criteria pass
+- ✓ Three-level check passes (exists, substantive, wired)
+- ✓ No blocking anti-patterns
+- ✓ Browser verification passes (if UI)
+- ✓ Build + Tests + Lint all green
 </process>

package/commands/ctx.md CHANGED

@@ -88,8 +88,12 @@ Route to: **Three-Level Verification**
    - Level 2: Substantive (real code, not stub?)
    - Level 3: Wired (imported and used?)
 3. Scan for anti-patterns (TODO, empty catch, placeholders)
-4. If all pass: complete phase, update STATE.md
-5. If fails: create fix tasks, set status = "executing"
+4. Browser verification (if UI)
+5. **DELIVERY GUARANTEE:**
+   - If ALL pass: complete phase → COMPLETE
+   - If ANY fail: set status = "debugging" → fix → verify again
+   - Loop until 100% working or max attempts
+   - NEVER mark complete with failures
 ### If status = "paused"
 Route to: **Resume**
@@ -149,9 +153,14 @@ executing → debugging (if verification fails)
 executing → verifying (if all tasks done)
 debugging → executing (if fix works)
 debugging → ESCALATE (if max attempts fail)
-verifying → executing (if anti-patterns found)
-verifying → COMPLETE (if all passes)
+verifying → debugging (if any check fails)    ← LOOP BACK
+verifying → COMPLETE (if ALL pass 100%)       ← ONLY EXIT
 paused → (previous state)
+DELIVERY GUARANTEE LOOP:
+  verifying ←→ debugging ←→ executing
+       ↓
+  COMPLETE (only when 100% verified)
 ```
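The 3.3.3 side of this hunk can be modelled as a small transition table. This is an illustrative sketch only: the event names are invented, the `paused` state is left out because its target depends on saved history, and nothing here is taken from the package's code.

```javascript
// Hypothetical sketch: the table encodes only the transitions visible in the
// hunk above (3.3.3 side). Event names are invented for illustration.
const TRANSITIONS = {
  executing: { verification_failed: 'debugging', all_tasks_done: 'verifying' },
  debugging: { fix_works: 'executing', max_attempts_failed: 'ESCALATE' },
  verifying: { check_failed: 'debugging', all_pass: 'COMPLETE' },
};

function nextStatus(current, event) {
  const next = TRANSITIONS[current]?.[event];
  if (!next) throw new Error(`No transition from "${current}" on "${event}"`);
  return next;
}

// A failed verification loops back through debugging and executing.
const visited = ['check_failed', 'fix_works', 'all_tasks_done', 'all_pass'].reduce(
  (states, event) => [...states, nextStatus(states[states.length - 1], event)],
  ['verifying'],
);
console.log(visited.join(' -> '));
// verifying -> debugging -> executing -> verifying -> COMPLETE
```

In this table `COMPLETE` is reachable only from `verifying`, which matches the "ONLY EXIT" annotation in the diff.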
 </state_transitions>

package/commands/debug.md ADDED

@@ -0,0 +1,461 @@
+---
+name: ctx:debug
+description: Systematic debugging with persistent state across sessions. Scientific method. Max 10 attempts. Browser verification.
+---
+<objective>
+Enter debug mode to systematically fix an issue using scientific method.
+Debug sessions are **persistent** - they survive context resets, session restarts, and days between attempts. No context is ever lost.
+</objective>
+<usage>
+```bash
+/ctx:debug                      # Start debug on current issue (from STATE.md)
+/ctx:debug "description"        # Start debug with specific issue
+/ctx:debug --resume             # Resume most recent debug session
+/ctx:debug --resume {id}        # Resume specific session
+/ctx:debug --list               # List all debug sessions
+/ctx:debug --status             # Show current debug session status
+/ctx:debug --abort              # Abort current session (mark failed)
+```
+</usage>
+<process>
+## Step 1: Parse Command
+Check arguments:
+- No args → Check STATE.md for debug_issue, or ask user
+- `"description"` → Start fresh session with description
+- `--resume` → Resume active session
+- `--resume {id}` → Resume specific session
+- `--list` → Show sessions and exit
+- `--status` → Show current session and exit
+- `--abort` → Mark session failed and exit
+## Step 2: Handle List/Status/Abort (if requested)
+### --list
+```bash
+echo "## Debug Sessions"
+echo ""
+for session in .ctx/debug/sessions/*/; do
+  id=$(basename "$session")
+  state=$(cat "$session/STATE.json" | jq -r '.status')
+  issue=$(cat "$session/STATE.json" | jq -r '.issue.description' | head -c 50)
+  attempts=$(cat "$session/STATE.json" | jq -r '.currentAttempt')
+  echo "- $id [$state] $attempts attempts"
+  echo "  $issue..."
+done
+```
+Output example:
+```
+## Debug Sessions
+- debug-20240120-143022 [resolved] 2 attempts
+  Login form shows blank error...
+- debug-20240119-091500 [escalated] 10 attempts
+  API returns 500 on checkout...
+- debug-20240118-160045 [in_progress] 3 attempts  ← ACTIVE
+  File upload fails silently...
+```
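For comparison, here is a Node sketch of the same listing that tolerates sessions with a missing or corrupt `STATE.json` (the shell loop would print `jq` errors in that case). It is not part of the package. `listSessions` is a hypothetical helper that reads the same three fields as the shell version, and it truncates the description by characters rather than by bytes as `head -c 50` does.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical Node counterpart to the shell loop above. It reads the same
// fields (.status, .issue.description, .currentAttempt) but skips sessions
// whose STATE.json is missing or unparsable.
function listSessions(sessionsDir) {
  if (!fs.existsSync(sessionsDir)) return [];
  return fs
    .readdirSync(sessionsDir, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .flatMap((entry) => {
      try {
        const raw = fs.readFileSync(path.join(sessionsDir, entry.name, 'STATE.json'), 'utf8');
        const state = JSON.parse(raw);
        const issue = String(state.issue?.description ?? '').slice(0, 50);
        return [`- ${entry.name} [${state.status}] ${state.currentAttempt} attempts\n  ${issue}...`];
      } catch {
        return []; // unreadable session: leave it out of the listing
      }
    });
}

// Demo against a throwaway directory so the sketch runs anywhere.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'ctx-debug-'));
const sessionDir = path.join(demoDir, 'debug-20240120-143022');
fs.mkdirSync(sessionDir);
fs.writeFileSync(
  path.join(sessionDir, 'STATE.json'),
  JSON.stringify({ status: 'resolved', currentAttempt: 2, issue: { description: 'Login form shows blank error' } }),
);
fs.mkdirSync(path.join(demoDir, 'debug-broken')); // no STATE.json on purpose
console.log(listSessions(demoDir).join('\n'));
```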
+### --status
+Read `.ctx/debug/active-session.json` and show:
+- Session ID
+- Issue description
+- Current attempt number
+- Last hypothesis
+- Last result
+- Time spent
+### --abort
+```json
+// Update STATE.json
+{
+  "status": "aborted",
+  "abortedAt": "{timestamp}",
+  "reason": "User requested abort"
+}
+```
+Clear active-session.json and return to main router.
+## Step 3: Initialize or Resume Session
+### Resume Existing
+```javascript
+// Check for active session
+const active = JSON.parse(fs.readFileSync('.ctx/debug/active-session.json'));
+if (active.sessionId) {
+  const state = JSON.parse(fs.readFileSync(`.ctx/debug/sessions/${active.sessionId}/STATE.json`));
+  // Continue from state.currentAttempt
+}
+```
+### Start Fresh
+```javascript
+const sessionId = `debug-${date}-${time}`;
+const sessionDir = `.ctx/debug/sessions/${sessionId}`;
+// Create session directory
+fs.mkdirSync(sessionDir, { recursive: true });
+fs.mkdirSync(`${sessionDir}/screenshots`, { recursive: true });
+// Initialize STATE.json
+const state = {
+  sessionId,
+  created: new Date().toISOString(),
+  updated: new Date().toISOString(),
+  status: "in_progress",
+  issue: {
+    description: userDescription,
+    type: null,  // Will classify
+    severity: null,
+    errorMessage: null,
+    stackTrace: null,
+    stepsToReproduce: []
+  },
+  attempts: [],
+  currentAttempt: 0,
+  maxAttempts: 10  // From config.json
+};
+// Set as active
+fs.writeFileSync('.ctx/debug/active-session.json', JSON.stringify({ sessionId }));
+```
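The two snippets above leave `date` and `time` undefined, and the resume path throws if `active-session.json` does not exist yet. One way to fill those gaps is sketched below. Both helpers are hypothetical, and the use of UTC for the session ID is an assumption (the diff does not say which time zone the IDs use).

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: builds an ID in the debug-YYYYMMDD-HHMMSS shape seen in
// the example output. UTC is an assumption.
function makeSessionId(now = new Date()) {
  const iso = now.toISOString(); // e.g. 2024-01-20T14:30:22.000Z
  const date = iso.slice(0, 10).replace(/-/g, '');
  const time = iso.slice(11, 19).replace(/:/g, '');
  return `debug-${date}-${time}`;
}

// Returns the saved state of the active session, or null when there is no
// pointer file or the pointer references a session that no longer exists.
function loadActiveState(debugDir) {
  try {
    const pointer = fs.readFileSync(path.join(debugDir, 'active-session.json'), 'utf8');
    const { sessionId } = JSON.parse(pointer);
    if (!sessionId) return null;
    const statePath = path.join(debugDir, 'sessions', sessionId, 'STATE.json');
    return JSON.parse(fs.readFileSync(statePath, 'utf8'));
  } catch (err) {
    if (err.code === 'ENOENT') return null;
    throw err; // malformed JSON or permission problems should stay visible
  }
}

console.log(makeSessionId(new Date(Date.UTC(2024, 0, 20, 14, 30, 22)))); // debug-20240120-143022
const emptyDir = fs.mkdtempSync(path.join(os.tmpdir(), 'ctx-debug-'));
console.log(loadActiveState(emptyDir)); // null
```

A caller can then start a fresh session whenever `loadActiveState` returns `null`, and continue from `state.currentAttempt` otherwise.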
+## Step 4: Gather Issue Context
+### Automatic Detection
+1. Check build output for errors
+2. Check test output for failures
+3. Check STATE.md for debug_issue
+4. Check git diff for recent changes
+### Classify Issue Type
+```
+build    → Compilation/bundling errors
+test     → Test failures
+runtime  → Crashes, exceptions
+ui       → Visual/interaction bugs
+api      → Backend/network errors
+perf     → Performance issues
+```
+### Document in STATE.json
+```json
+{
+  "issue": {
+    "description": "Login form submits but shows blank error",
+    "type": "ui",
+    "severity": "high",
+    "errorMessage": "TypeError: Cannot read property 'message' of undefined",
+    "stackTrace": "at handleSubmit (login.tsx:45)...",
+    "stepsToReproduce": [
+      "Go to /login",
+      "Enter invalid credentials",
+      "Click submit",
+      "Observe blank error message"
+    ],
+    "affectedFiles": ["src/auth/login.tsx"],
+    "relatedCommits": ["abc1234"]
+  }
+}
+```
+## Step 5: Spawn ctx-debugger Agent
+```
+Task tool:
+  subagent_type: ctx-debugger
+  prompt: |
+    Resume debug session: {sessionId}
+    Issue: {issue.description}
+    Type: {issue.type}
+    Error: {issue.errorMessage}
+    Previous attempts: {attempts.length}
+    Last result: {lastAttempt.result}
+    Last learning: {lastAttempt.learnings}
+    Continue debugging. Max {remainingAttempts} more attempts.
+    Credentials available in .ctx/.env for browser testing.
+```
+## Step 6: Monitor Progress
+The debugger agent will:
+1. **Form Hypothesis** based on:
+   - Error analysis
+   - Previous attempts (if any)
+   - Codebase patterns
+   - Web research
+2. **Apply Minimal Fix**
+   - Single focused change
+   - No collateral modifications
+3. **Verify All Layers**
+   - Build passes
+   - Tests pass
+   - Lint passes
+   - Browser works (if UI)
+4. **Record Result**
+   - Update STATE.json
+   - Append to TRACE.md
+   - Save screenshots
+5. **Loop or Exit**
+   - If fixed → Mark resolved
+   - If max attempts → Escalate
+   - Otherwise → Next hypothesis
+## Step 7: Handle Outcomes
+### Success (resolved)
+```
+[DEBUG] ✅ Issue resolved!
+Session: debug-20240120-143022
+Issue: Login form shows blank error
+Attempts: 2
+Duration: 15 minutes
+Root Cause:
+  Error state was logged but not set in React state
+Fix Applied:
+  src/auth/login.tsx:45 - Added setError() call
+Files Changed:
+  - src/auth/login.tsx
+Commit: abc1234
+Full trace: .ctx/debug/sessions/debug-20240120-143022/TRACE.md
+```
+Update STATE.md:
+- Set status = "executing"
+- Clear debug_issue
+- Log resolution
+### Escalation (max attempts)
+```
+[DEBUG] ⚠️ Max attempts reached (10)
+Session: debug-20240119-091500
+Issue: API returns 500 on checkout
+Duration: 2 hours
+What We Tried:
+  1. [FAIL] Null check on request body
+  2. [FAIL] Validate payment data format
+  3. [PARTIAL] Fix database connection pool
+  ...
+What We Know:
+  - Error occurs only with specific product combinations
+  - Database connection is stable
+  - Payment API returns valid response
+Possible Root Causes:
+  1. [60%] Race condition in inventory check
+  2. [30%] Stale cache invalidation
+Recommended:
+  - Add logging to checkInventory()
+  - Review concurrent order tests
+Full report: .ctx/debug/sessions/debug-20240119-091500/ESCALATION.md
+```
+Ask user:
+- Provide more context?
+- Try different approach?
+- Manual investigation?
+</process>
+<debug_philosophy>
+## Scientific Method
+```
+OBSERVE   → Exact error, context, state
+RESEARCH  → Similar issues, web search
+HYPOTHESIZE → Testable theory with confidence
+PREDICT   → Expected outcome if correct
+TEST      → Apply minimal fix
+ANALYZE   → Did prediction match?
+ITERATE   → Refine and repeat
+```
+## Hypothesis Confidence Levels
+| Confidence | Meaning | Action |
+|------------|---------|--------|
+| 90%+ | Strong evidence, likely root cause | Test first |
+| 70-90% | Good evidence, probable cause | Test early |
+| 50-70% | Some evidence, possible cause | Test if higher fails |
+| <50% | Weak evidence, unlikely | Last resort |
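The confidence table can be turned into a simple ordering rule. The sketch below is illustrative and not from the package. The table's bands share their endpoints (90 and 70 each appear in two rows), so treating each lower bound as inclusive is an assumption, and `actionFor` and `prioritize` are hypothetical names.

```javascript
// Hypothetical sketch. The table's bands overlap at 90 and 70; treating each
// lower bound as inclusive is an assumption made here.
function actionFor(confidence) {
  if (confidence >= 90) return 'Test first';
  if (confidence >= 70) return 'Test early';
  if (confidence >= 50) return 'Test if higher fails';
  return 'Last resort';
}

// Order candidate hypotheses so the best-supported one is tested first.
function prioritize(hypotheses) {
  return [...hypotheses]
    .sort((a, b) => b.confidence - a.confidence)
    .map((h) => ({ ...h, action: actionFor(h.confidence) }));
}

const ranked = prioritize([
  { theory: 'Stale cache invalidation', confidence: 30 },
  { theory: 'Race condition in inventory check', confidence: 60 },
]);
for (const h of ranked) console.log(`${h.confidence}% ${h.theory}: ${h.action}`);
// 60% Race condition in inventory check: Test if higher fails
// 30% Stale cache invalidation: Last resort
```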
+## Fix Principles
+1. **Minimal** - Change only what's necessary
+2. **Focused** - One hypothesis per attempt
+3. **Reversible** - Easy to undo if wrong
+4. **Documented** - Clear diff and rationale
+## Verification Layers
+| Layer | What | How |
+|-------|------|-----|
+| 1 | Build | `npm run build` or equivalent |
+| 2 | Tests | Run affected test files |
+| 3 | Lint | `npm run lint` |
+| 4 | Browser | Playwright/Chrome DevTools |
+All layers must pass for "resolved" status.
+</debug_philosophy>
+<persistence>
+## Why Persistence Matters
+Regular debugging:
+```
+Session 1: Try 3 fixes, context resets
+Session 2: Forget what was tried, repeat same fixes
+Session 3: Still stuck, no progress
+```
+CTX debugging:
+```
+Session 1: Try 3 fixes, save state
+Session 2: Load state, know what failed, try fix 4
+Session 3: Load state, try fix 5, success!
+```
+## State Files
+```
+.ctx/debug/
+├── active-session.json     # Current session pointer
+└── sessions/
+    └── debug-20240120-143022/
+        ├── STATE.json      # Machine-readable state
+        ├── TRACE.md        # Human-readable log
+        ├── hypotheses.json # All hypotheses
+        └── screenshots/    # Visual evidence
+```
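How an attempt might be persisted to these files is sketched below. This is an assumption-laden illustration: the attempt fields, the `TRACE.md` line format, and the `recordAttempt` helper are all hypothetical. Only the file names `STATE.json` and `TRACE.md` and the `attempts`, `currentAttempt`, and `updated` keys come from the diff.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical sketch of "record result": the attempt fields and the TRACE.md
// line format are assumptions; only the two file names come from the diff.
function recordAttempt(sessionDir, attempt) {
  const statePath = path.join(sessionDir, 'STATE.json');
  const state = JSON.parse(fs.readFileSync(statePath, 'utf8'));
  state.attempts.push(attempt);
  state.currentAttempt = state.attempts.length;
  state.updated = new Date().toISOString();
  // Write to a temp file and rename so a crash never leaves half a STATE.json.
  fs.writeFileSync(`${statePath}.tmp`, JSON.stringify(state, null, 2));
  fs.renameSync(`${statePath}.tmp`, statePath);
  fs.appendFileSync(
    path.join(sessionDir, 'TRACE.md'),
    `- Attempt ${state.currentAttempt}: ${attempt.hypothesis} -> ${attempt.result}\n`,
  );
  return state;
}

// Demo against a throwaway session directory.
const demo = fs.mkdtempSync(path.join(os.tmpdir(), 'ctx-debug-'));
fs.writeFileSync(path.join(demo, 'STATE.json'), JSON.stringify({ attempts: [], currentAttempt: 0 }));
const after = recordAttempt(demo, { hypothesis: 'Missing setError() call', result: 'pass' });
console.log(after.currentAttempt); // 1
console.log(fs.readFileSync(path.join(demo, 'TRACE.md'), 'utf8').trim());
// - Attempt 1: Missing setError() call -> pass
```

Keeping `STATE.json` as the source of truth and `TRACE.md` append-only matches the split between machine-readable state and a human-readable log.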
+## Resume After Days
+```bash
+# Monday - start debugging
+/ctx:debug "checkout fails"
+# Try 3 fixes, no luck, go home
+# Wednesday - resume
+/ctx:debug --resume
+# CTX knows: 3 attempts failed, here's what we learned
+# Continues from attempt 4
+```
+</persistence>
+<browser_verification>
+## Autonomous Browser Testing
+CTX uses stored credentials from `.ctx/.env`:
+```bash
+APP_URL=http://localhost:3000
+TEST_USER_EMAIL=test@example.com
+TEST_USER_PASSWORD=testpass123
+```
+### Verification Flow
+```javascript
+// 1. Navigate to app
+browser_navigate({ url: process.env.APP_URL });
+// 2. Login if needed
+browser_snapshot();
+if (page.includes('login')) {
+  browser_type({ ref: 'email-input', text: process.env.TEST_USER_EMAIL });
+  browser_type({ ref: 'password-input', text: process.env.TEST_USER_PASSWORD });
+  browser_click({ ref: 'submit-button' });
+}
+// 3. Navigate to affected page
+browser_navigate({ url: `${process.env.APP_URL}/affected-page` });
+// 4. Reproduce issue
+browser_snapshot();
+// ... interaction steps
+// 5. Verify fix
+browser_snapshot();
+browser_take_screenshot({ filename: `attempt-${n}.png` });
+```
+### Screenshot Evidence
+Every attempt saves screenshots:
+```
+screenshots/
+├── issue-initial.png    # Before any fixes
+├── attempt-1.png        # After fix 1
+├── attempt-2.png        # After fix 2
+├── attempt-3-success.png # Fixed!
+```
+</browser_verification>
+<config>
+## Debug Settings
+In `.ctx/config.json`:
+```json
+{
+  "debug": {
+    "maxAttempts": 10,
+    "persistSessions": true,
+    "screenshotOnFailure": true,
+    "autoResume": true
+  }
+}
+```
+| Setting | Default | Description |
+|---------|---------|-------------|
+| maxAttempts | 10 | Max tries before escalation |
+| persistSessions | true | Save state across sessions |
+| screenshotOnFailure | true | Capture on each failed attempt |
+| autoResume | true | Auto-resume if active session exists |
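A loader that applies these defaults when `.ctx/config.json` omits some or all of the `debug` keys could look like this. It is a sketch under stated assumptions: `resolveDebugConfig` is a hypothetical name, and the validation of `maxAttempts` is an addition that the diff does not describe.

```javascript
// Defaults copied from the settings table above; the loader itself is a
// hypothetical sketch, not the package's implementation.
const DEBUG_DEFAULTS = Object.freeze({
  maxAttempts: 10,
  persistSessions: true,
  screenshotOnFailure: true,
  autoResume: true,
});

function resolveDebugConfig(configJsonText) {
  const parsed = configJsonText ? JSON.parse(configJsonText) : {};
  const merged = { ...DEBUG_DEFAULTS, ...(parsed.debug ?? {}) };
  // Assumed validation: a non-positive cap would make the loop meaningless.
  if (!Number.isInteger(merged.maxAttempts) || merged.maxAttempts < 1) {
    throw new RangeError('debug.maxAttempts must be a positive integer');
  }
  return merged;
}

console.log(resolveDebugConfig('{"debug": {"maxAttempts": 5}}').maxAttempts); // 5
console.log(resolveDebugConfig('{}').autoResume); // true
```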
+</config>
+<output>
+After spawning ctx-debugger, report:
+- Session status (in_progress / resolved / escalated)
+- Attempts made
+- If resolved: root cause, fix, commit
+- If escalated: report path, recommendations
+- Next action for main router
+</output>

package/commands/help.md CHANGED

@@ -93,6 +93,16 @@ Output ONLY the reference content below. Do NOT add project-specific analysis.
 | `/ctx verify` | Force three-level verification |
 | `/ctx quick "task"` | Quick task bypass |
+### Debug
+| Command | Purpose |
+|---------|---------|
+| `/ctx debug` | Start debugging current issue |
+| `/ctx debug "issue"` | Debug specific problem |
+| `/ctx debug --resume` | Resume last debug session |
+| `/ctx debug --list` | List all debug sessions |
+| `/ctx debug --status` | Show current session status |
+| `/ctx debug --abort` | Abort current session |
 ### Session
 | Command | Purpose |
 |---------|---------|

package/commands/verify.md CHANGED

@@ -121,8 +121,11 @@ Based on results:
   - Set status to "initializing" for next story
   - Update current story reference
 - **FAIL**:
-  - Create fix tasks
-  - Set status = "debugging" or "executing"
+  - Analyze failure type:
+    - **Build/Test/Runtime error** → Set status = "debugging" (spawn ctx-debugger)
+    - **Missing feature/incomplete** → Create fix tasks, set status = "executing"
+  - Record `debug_issue` in STATE.md with error details
+  - Debug mode will loop until fixed (max 10 attempts)
 </workflow>
 <output_format>

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "ctx-cc",
-  "version": "3.3.1",
+  "version": "3.3.3",
   "description": "CTX 3.3 (Continuous Task eXecution) - AI that learns your preferences. Learning system, predictive planning, self-healing deployments (Sentry/LogRocket), voice control for hands-free development.",
   "keywords": [
     "claude",