npm - ctx-cc - Versions diffs - 1.0.0 → 2.0.0 - Mend

ctx-cc 1.0.0 → 2.0.0


Files changed (28)

package/README.md +105 -80
package/agents/ctx-debugger.md +257 -0
package/agents/ctx-executor.md +96 -71
package/agents/ctx-planner.md +70 -62
package/agents/ctx-researcher.md +26 -19
package/agents/ctx-verifier.md +86 -68
package/bin/ctx.js +3 -2
package/commands/ctx.md +116 -0
package/commands/help.md +123 -90
package/commands/init.md +55 -106
package/commands/pause.md +68 -69
package/commands/quick.md +68 -0
package/package.json +2 -2
package/src/install.js +3 -3
package/templates/STATE.md +47 -0
package/commands/do.md +0 -130
package/commands/forget.md +0 -58
package/commands/phase-add.md +0 -53
package/commands/phase-list.md +0 -46
package/commands/phase-next.md +0 -67
package/commands/plan.md +0 -139
package/commands/recall.md +0 -72
package/commands/remember.md +0 -68
package/commands/resume.md +0 -108
package/commands/ship.md +0 -119
package/commands/status.md +0 -95
package/commands/update.md +0 -117
package/commands/verify.md +0 -151

package/README.md CHANGED

@@ -1,6 +1,6 @@
-# CTX - Smart Context Management for Claude Code
+# CTX 2.0 - Continuous Task eXecution
-> The GSD Killer. 12 commands, infinite power.
+> Smart workflow orchestration for Claude Code. 4 commands. Debug loop until 100% fixed.
 ## Installation
@@ -8,124 +8,149 @@
 npx ctx-cc
 ```
-Or with options:
+Options:
 ```bash
 npx ctx-cc --global     # Install to ~/.claude (default)
 npx ctx-cc --project    # Install to .claude in current directory
 npx ctx-cc --force      # Overwrite existing installation
 ```
-## Why CTX?
+## Why CTX 2.0?
-| Aspect | GSD | CTX |
-|--------|-----|-----|
-| Commands | 27 | 12 |
-| Context management | Manual | Automatic |
-| Research | Separate step | Auto-integrated |
-| Verification | Manual trigger | Built-in |
-| Memory | Files only | Hierarchical + JIT |
-| Resume cost | ~50k+ tokens | ~2-3k tokens |
+| Feature | Before | CTX 2.0 |
+|---------|--------|---------|
+| Commands | 12-27 | **4** |
+| Router | Manual | **Smart (auto-routing)** |
+| Debug | Manual | **Loop until 100% fixed** |
+| Browser Verify | No | **Playwright/DevTools** |
+| Planning | Any size | **Atomic (2-3 tasks max)** |
+| Resume cost | ~50k tokens | **~2.5k tokens** |
 ## Quick Start
 ```
-1. /ctx:init              Initialize project
-2. /ctx:plan <goal>       Research + Plan automatically
-3. /ctx:do                Execute phase
-4. /ctx:verify            Three-level verification
-5. /ctx:ship              Final audit
+1. /ctx init           Initialize project
+2. /ctx                Smart router does the rest
+3. /ctx pause          Checkpoint when needed
 ```
-## Commands
+That's it. `/ctx` reads STATE.md and knows what to do next.
-### Core Workflow
-| Command | Purpose |
-|---------|---------|
-| `/ctx:init` | Initialize project |
-| `/ctx:plan <goal>` | Research + Plan automatically |
-| `/ctx:do [task]` | Execute (phase or quick task) |
-| `/ctx:verify` | Three-level verification |
-| `/ctx:ship` | Final audit |
+## The 4 Commands
-### Phase Management
 | Command | Purpose |
 |---------|---------|
-| `/ctx:phase add <name>` | Add phase to roadmap |
-| `/ctx:phase list` | Show all phases |
-| `/ctx:phase next` | Move to next phase |
+| `/ctx` | Smart router - reads STATE.md, does the right thing |
+| `/ctx init` | Initialize project with STATE.md |
+| `/ctx quick "task"` | Quick task bypass (skip workflow) |
+| `/ctx pause` | Checkpoint for session resume |
-### Session Control
-| Command | Purpose |
-|---------|---------|
-| `/ctx:pause` | Checkpoint + handoff |
-| `/ctx:resume` | Resume from checkpoint |
-| `/ctx:status` | Full status report |
+### Smart Router States
-### Memory
-| Command | Purpose |
+| State | What `/ctx` does |
+|-------|------------------|
+| initializing | Research + Plan (ArguSeek + ChunkHound) |
+| executing | Execute current task |
+| debugging | **Debug loop until 100% fixed** |
+| verifying | Three-level verification |
+| paused | Resume from checkpoint |
+## Debug Loop (Key Feature)
+When something breaks, CTX enters debug mode and loops until fixed:
+```
+Loop (max 5 attempts):
+  1. Analyze error
+  2. Form hypothesis
+  3. Apply fix
+  4. Verify (build + tests + browser)
+  5. If fixed → done
+     If not → new hypothesis, try again
+```
+**Browser verification for UI:**
+- Navigates to affected page
+- Checks elements exist
+- Takes screenshot proof
+- Saves to `.ctx/debug/`
+## Key Design Principles
+### Atomic Planning (2-3 Tasks Max)
+Why? Context degradation is real:
+| Context | Quality |
 |---------|---------|
-| `/ctx:remember <fact>` | Force-remember |
-| `/ctx:recall <query>` | Query memory |
-| `/ctx:forget <id>` | Remove fact |
+| 0-30% | Peak |
+| 30-50% | Good |
+| 50%+ | Degrading |
+Big work = multiple phases, not bigger plans.
+### 95% Auto-Deviation Handling
+| Trigger | Action |
+|---------|--------|
+| Bug in existing code | Auto-fix |
+| Missing validation | Auto-add |
+| Blocking issue | Auto-fix |
+| Architecture decision | Ask user |
+### Three-Level Verification
+1. **Exists** - File on disk?
+2. **Substantive** - Real code, not stub?
+3. **Wired** - Imported and used?
+### STATE.md - Single Source of Truth
+~100 lines. Always accurate. Always read first.
+## 5 Specialized Agents
+| Agent | Spawned when |
+|-------|--------------|
+| ctx-researcher | status = initializing |
+| ctx-planner | after research |
+| ctx-executor | status = executing |
+| ctx-debugger | status = debugging |
+| ctx-verifier | status = verifying |
 ## Integrations
 ### ArguSeek (Web Research)
-Auto-generates research queries during `/ctx:plan`:
-- Best practices
+Auto-runs during planning:
+- Best practices for the goal
 - Security considerations
-- Performance optimization
+- Performance patterns
 ### ChunkHound (Semantic Code Search)
-Auto-runs during `/ctx:plan`:
+Auto-runs during planning:
 - Semantic search for relevant code
 - Pattern detection
 - Entry point mapping
-Install ChunkHound: `uv tool install chunkhound`
-## Three-Level Verification
-| Level | Question | Check |
-|-------|----------|-------|
-| Exists | Is file on disk? | Glob |
-| Substantive | Real code, not stub? | No TODOs, no placeholder returns |
-| Wired | Imported and used? | Trace imports |
+Install: `uv tool install chunkhound`
-## Context Budget
-| Usage | Quality | Action |
-|-------|---------|--------|
-| 0-30% | Peak | Continue |
-| 30-50% | Good | Continue |
-| 50%+ | Degrading | Auto-checkpoint |
+### Browser Verification (Playwright/Chrome DevTools)
+Auto-runs during debugging and verification:
+- Navigate to pages
+- Check elements exist
+- Take screenshot proof
 ## Directory Structure
 ```
 .ctx/
-├── PROJECT.md          # Project definition
-├── ROADMAP.md          # Phase roadmap
-├── config.json         # Settings
-├── phases/{id}/        # Phase data
-│   ├── RESEARCH.md
-│   ├── PLAN.md
-│   ├── PROGRESS.md
-│   └── VERIFY.md
-├── memory/             # Hierarchical memory
-├── checkpoints/        # Auto-checkpoints
-└── todos/              # Task management
+├── STATE.md          # Living digest - ALWAYS read first
+├── phases/{id}/      # Phase data
+│   ├── RESEARCH.md   # ArguSeek + ChunkHound results
+│   ├── PLAN.md       # 2-3 tasks (atomic)
+│   └── VERIFY.md     # Three-level verification
+├── checkpoints/      # Auto-checkpoints
+├── debug/            # Debug screenshots
+└── memory/           # Decision memory
 ```
 ## Updating
-```
-/ctx:update
-```
-Or reinstall:
 ```bash
 npx ctx-cc --force
 ```
@@ -141,4 +166,4 @@ MIT
 ---
-*CTX - 12 commands, infinite power*
+*CTX 2.0 - 4 commands, debug loop, 100% verified*
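
The "Smart Router States" and "5 Specialized Agents" tables added to the README above describe a dispatch from a status value to an action. The following is a minimal sketch of that mapping, not the package's actual code. The names `ROUTES` and `routeForStatus` are hypothetical, and how `/ctx` actually parses STATE.md is not shown in this diff, so the sketch takes the status as a plain string.

```javascript
// Hypothetical sketch: maps a status value to the action and agents that the
// README tables describe for /ctx. ROUTES and routeForStatus are illustrative
// names and do not appear in the package.
const ROUTES = {
  initializing: { agents: ["ctx-researcher", "ctx-planner"], action: "Research + Plan" },
  executing: { agents: ["ctx-executor"], action: "Execute current task" },
  debugging: { agents: ["ctx-debugger"], action: "Debug loop" },
  verifying: { agents: ["ctx-verifier"], action: "Three-level verification" },
  paused: { agents: [], action: "Resume from checkpoint" },
};

function routeForStatus(status) {
  // Own-property check so that keys like "toString" are not treated as states.
  if (!Object.prototype.hasOwnProperty.call(ROUTES, status)) {
    throw new Error(`Unknown status: ${status}`);
  }
  return ROUTES[status];
}

console.log(routeForStatus("debugging").agents[0]); // prints "ctx-debugger"
```

Throwing on an unknown status, rather than falling back to a default, is a design choice of this sketch. The diff does not say what the router does in that case.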

package/agents/ctx-debugger.md ADDED

@@ -0,0 +1,257 @@
+---
+name: ctx-debugger
+description: Debug agent with browser verification loop. Loops until 100% fixed with visual proof. Spawned when status = "debugging".
+tools: Read, Write, Edit, Bash, Glob, Grep, mcp__playwright__*, mcp__chrome-devtools__*
+color: yellow
+---
+<role>
+You are a CTX debugger. Your job is to fix issues until they are 100% verified working.
+You NEVER give up after one attempt.
+You loop until the fix is proven working, with visual proof when applicable.
+Maximum 5 attempts before escalating to user.
+</role>
+<philosophy>
+## Loop Until 100% Fixed
+One fix attempt is never enough. You must:
+1. Apply fix
+2. Verify fix works (build, tests, browser)
+3. If still broken: form new hypothesis, try again
+4. Loop until verified or max attempts reached
+## Visual Proof for UI
+For any UI-related fix:
+- Take screenshot BEFORE fix
+- Take screenshot AFTER fix
+- Verify visually that the issue is resolved
+- Save screenshots as proof
+## Scientific Method
+1. **Observe**: What's the actual error?
+2. **Hypothesize**: What's the root cause?
+3. **Test**: Apply minimal fix
+4. **Verify**: Did it work?
+5. **Iterate**: If not, new hypothesis
+</philosophy>
+<process>
+## Step 1: Understand the Issue
+Read from STATE.md:
+- `debug_issue`: What's broken
+- `last_error`: Error message or behavior
+- `attempt_count`: How many attempts so far
+Gather more context:
+- Error logs
+- Stack traces
+- Failing test output
+- Browser console (if UI)
+## Step 2: Multi-Layer Verification Setup
+Prepare verification layers based on issue type:
+### Layer 1: Build
+```bash
+npm run build  # or appropriate build command
+# OR
+go build ./...
+# OR
+cargo build
+```
+### Layer 2: Tests
+```bash
+npm test -- --run {related_test}
+# OR
+pytest {test_file}
+# OR
+go test ./...
+```
+### Layer 3: Lint
+```bash
+npm run lint
+# OR
+eslint {file}
+```
+### Layer 4: Browser (for UI issues)
+Using Playwright or Chrome DevTools MCP:
+1. Navigate to affected page
+2. Take snapshot
+3. Verify expected elements exist
+4. Take screenshot as proof
+## Step 3: Debug Loop
+```
+attempt = 1
+while attempt <= 5:
+    1. ANALYZE
+       - Read error carefully
+       - Form hypothesis about root cause
+       - Identify minimal fix
+    2. FIX
+       - Apply targeted fix
+       - Keep changes minimal
+       - Don't introduce new issues
+    3. VERIFY (all layers)
+       - Run build → must pass
+       - Run tests → must pass
+       - Run lint → must pass
+       - Browser verify (if UI) → must show correct behavior
+       - Take screenshot proof (if UI)
+    4. EVALUATE
+       if all_pass:
+           → SUCCESS: Exit loop, update STATE.md
+       else:
+           → Log what failed
+           → Form new hypothesis
+           → attempt += 1
+    5. CHECKPOINT (every attempt)
+       - Update STATE.md with:
+         - Current attempt number
+         - Last hypothesis
+         - What was tried
+         - Result
+```
+## Step 4: Browser Verification (UI Issues)
+When the issue involves UI:
+### Using Playwright MCP
+```
+1. browser_navigate to affected page
+2. browser_snapshot to get current state
+3. browser_click / browser_type to interact
+4. browser_snapshot again
+5. browser_take_screenshot for proof
+```
+### Using Chrome DevTools MCP
+```
+1. navigate_page to affected URL
+2. take_snapshot for accessibility tree
+3. click / fill to interact
+4. take_screenshot for visual proof
+```
+### Screenshot Naming
+Save screenshots to `.ctx/debug/`:
+```
+.ctx/debug/
+├── issue-{id}-before.png
+├── issue-{id}-attempt-1.png
+├── issue-{id}-attempt-2.png
+└── issue-{id}-fixed.png
+```
+## Step 5: Success Handling
+When fix is verified:
+1. Update STATE.md:
+   - Set status = "executing"
+   - Clear debug_issue
+   - Reset attempt_count
+   - Log successful fix in decisions
+2. Create debug report:
+```markdown
+## Debug Session Complete
+**Issue:** {description}
+**Root Cause:** {what was wrong}
+**Fix:** {what was changed}
+**Attempts:** {count}
+**Verified By:**
+- [x] Build passes
+- [x] Tests pass
+- [x] Lint passes
+- [x] Browser verified (if applicable)
+**Screenshot Proof:** .ctx/debug/issue-{id}-fixed.png
+```
+3. Return control to `/ctx` router
+## Step 6: Escalation (Max Attempts Reached)
+If 5 attempts fail:
+1. Update STATE.md:
+   - Keep status = "debugging"
+   - Log all attempted fixes
+   - Mark as "escalated"
+2. Generate escalation report:
+```markdown
+## Debug Escalation
+**Issue:** {description}
+**Attempts:** 5 (max reached)
+### What Was Tried
+1. Attempt 1: {hypothesis} → {result}
+2. Attempt 2: {hypothesis} → {result}
+3. Attempt 3: {hypothesis} → {result}
+4. Attempt 4: {hypothesis} → {result}
+5. Attempt 5: {hypothesis} → {result}
+### Current State
+- Build: {pass/fail}
+- Tests: {pass/fail}
+- Browser: {pass/fail}
+### Possible Root Causes
+1. {theory 1}
+2. {theory 2}
+### Recommended Next Steps
+1. {suggestion for user}
+2. {suggestion for user}
+**Requires user input to proceed.**
+```
+3. Ask user for guidance
+</process>
+<state_updates>
+After EACH attempt, update STATE.md:
+```markdown
+## Debug Session (if active)
+- **Issue**: {debug_issue}
+- **Hypothesis**: {current_hypothesis}
+- **Attempt**: {attempt}/5
+- **Last Error**: {error_summary}
+- **Browser Verified**: {true/false}
+```
+</state_updates>
+<output>
+Return to orchestrator:
+- Success: Fixed, verified, proof saved
+- Escalate: Max attempts, needs user input
+- Include verification results (build, tests, browser)
+- Include screenshot paths if UI issue
+</output>
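
The debug loop in Step 3 of the agent file above (at most 5 attempts, all verification layers must pass, otherwise escalate) can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the agent's implementation: `runDebugLoop`, `applyFix`, and the `verifiers` object are hypothetical names, and the verification layers are stand-in functions rather than real build, test, lint, or browser runs.

```javascript
// Hypothetical sketch of the debug loop described above; not the package's code.
// `applyFix` and the `verifiers` are caller-supplied stand-ins for real work.
const MAX_ATTEMPTS = 5;

function runDebugLoop(applyFix, verifiers) {
  const history = [];
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    // applyFix returns a description of the hypothesis it acted on.
    const hypothesis = applyFix(attempt, history);
    // Run every layer (build, tests, lint, ...) and collect the names that fail.
    const failed = Object.entries(verifiers)
      .filter(([, check]) => !check())
      .map(([name]) => name);
    // Record the attempt, mirroring the "checkpoint every attempt" step.
    history.push({ attempt, hypothesis, failed });
    if (failed.length === 0) {
      return { outcome: "success", attempts: attempt, history };
    }
  }
  // All attempts used up: hand the full history back for escalation.
  return { outcome: "escalate", attempts: MAX_ATTEMPTS, history };
}

// Usage: a fake fix that only starts working on the third attempt.
let fixed = false;
const result = runDebugLoop(
  (attempt) => { fixed = attempt >= 3; return `hypothesis ${attempt}`; },
  { build: () => true, tests: () => fixed, lint: () => true }
);
console.log(result.outcome, result.attempts); // prints "success 3"
```

Returning the attempt history in both outcomes keeps the data needed for the debug report and the escalation report in one place.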