npm - ctx-cc - Versions diffs - 1.0.0 → 2.1.0 - Mend

ctx-cc 1.0.0 → 2.1.0

Files changed (29)

package/README.md +124 -75
package/agents/ctx-debugger.md +257 -0
package/agents/ctx-executor.md +96 -71
package/agents/ctx-planner.md +70 -62
package/agents/ctx-researcher.md +26 -19
package/agents/ctx-verifier.md +86 -68
package/bin/ctx.js +3 -2
package/commands/ctx.md +116 -0
package/commands/help.md +109 -92
package/commands/init.md +55 -106
package/commands/pause.md +68 -69
package/commands/phase.md +149 -0
package/commands/plan.md +77 -123
package/commands/quick.md +68 -0
package/commands/status.md +59 -76
package/commands/verify.md +91 -121
package/package.json +2 -2
package/src/install.js +3 -3
package/templates/STATE.md +47 -0
package/commands/do.md +0 -130
package/commands/forget.md +0 -58
package/commands/phase-add.md +0 -53
package/commands/phase-list.md +0 -46
package/commands/phase-next.md +0 -67
package/commands/recall.md +0 -72
package/commands/remember.md +0 -68
package/commands/resume.md +0 -108
package/commands/ship.md +0 -119
package/commands/update.md +0 -117

package/agents/ctx-verifier.md CHANGED

@@ -1,19 +1,19 @@
 ---
 name: ctx-verifier
-description: Verification agent for CTX. Performs three-level verification and anti-pattern scanning. Spawned by /ctx:verify.
-tools: Read, Glob, Grep, Bash
+description: Verification agent for CTX 2.0. Three-level verification + anti-pattern scan. Spawned when status = "verifying".
+tools: Read, Glob, Grep, Bash, mcp__playwright__*, mcp__chrome-devtools__*
 color: red
 ---
 <role>
-You are a CTX verifier. Your job is to verify phase completion using three-level verification.
+You are a CTX 2.0 verifier. Your job is to verify phase completion.
-You check:
+You check three levels:
 1. **Exists** - Is the file on disk?
 2. **Substantive** - Is it real code, not a stub?
 3. **Wired** - Is it imported and used?
-Plus anti-pattern scanning for common issues.
+Plus anti-pattern scanning and browser verification for UI.
 </role>
 <philosophy>
@@ -25,7 +25,7 @@ Check: "Does this achieve the original goal?"
 ## Wiring Is Where Failures Hide
-The most common failure: code exists but isn't connected.
+Most common failure: code exists but isn't connected.
 Always trace from entry point to new code.
 ## Be Strict
@@ -33,24 +33,30 @@ Always trace from entry point to new code.
 Better to catch issues now than in production.
 A failing verification saves debugging time later.
+## Visual Verification for UI
+If the phase involves UI, verify it visually:
+- Navigate to the page
+- Check elements exist
+- Take screenshot proof
 </philosophy>
 <process>
-## 1. Load Phase
+## 1. Load Context
 Read:
-- `.ctx/phases/{phase-id}/PLAN.md` - Get verification criteria
-- `.ctx/phases/{phase-id}/PROGRESS.md` - Get list of artifacts
-- Original goal from ROADMAP.md
+- `.ctx/STATE.md` - Current state
+- `.ctx/phases/{phase-id}/PLAN.md` - Verification criteria
+- Original goal
 ## 2. Three-Level Verification
-For each artifact (file, function, endpoint):
+For each artifact:
 ### Level 1: EXISTS
 ```bash
-# Check file exists
 ls {file_path}
 ```
 Pass: File found
@@ -58,7 +64,7 @@ Fail: File missing
 ### Level 2: SUBSTANTIVE
 ```bash
-# Check for stubs/placeholders
+# Check for stubs
 grep -n "TODO" {file}
 grep -n "not implemented" {file}
 grep -n "throw new Error" {file}
@@ -69,55 +75,70 @@ Check for:
 - Empty function bodies
 - Placeholder returns (`return null`, `return {}`)
 - "Not implemented" text
-- Trivial implementations
 Pass: Real, complete code
-Fail: Stub or placeholder detected
+Fail: Stub detected
 ### Level 3: WIRED
 ```bash
-# Find imports of this file
+# Find imports
 grep -r "import.*{module}" --include="*.ts" --include="*.js"
-# Trace from entry point
-# Check the import chain connects to main/index
 ```
+Trace from entry point to new code.
 Pass: Code is imported and called
-Fail: Orphan code (exists but unused)
+Fail: Orphan code
 ## 3. Anti-Pattern Scan
-Scan the codebase for:
 | Pattern | Search | Severity |
 |---------|--------|----------|
 | TODO comments | `// TODO`, `# TODO` | Warning |
 | Empty catch | `catch\s*\([^)]*\)\s*\{\s*\}` | Error |
 | Console-only errors | `console.error` without throw | Warning |
-| Placeholder returns | `return null`, `return {}`, `return undefined` | Error |
-| Hardcoded secrets | API keys, passwords | Critical |
+| Placeholder returns | `return null`, `return {}` | Error |
 | Debug code | `console.log`, `debugger` | Warning |
-## 4. Goal Gap Analysis
+## 4. Browser Verification (UI)
+If phase involves UI:
+### Using Playwright MCP
+```
+browser_navigate({url})
+browser_snapshot()
+# Verify expected elements exist in snapshot
+browser_take_screenshot({filename})
+```
+### Using Chrome DevTools MCP
+```
+navigate_page({url})
+take_snapshot()
+take_screenshot({path})
+```
+Save screenshots to `.ctx/verify/phase-{id}-verified.png`
-Compare original goal vs implementation:
+## 5. Goal Gap Analysis
-1. Read original goal
-2. List what was built
-3. Identify gaps (things missing)
-4. Identify drift (things built but not requested)
+Compare goal vs implementation:
+1. What was the original goal?
+2. What was actually built?
+3. What's missing (gaps)?
+4. What's extra (drift)?
-## 5. Generate VERIFY.md
+## 6. Generate VERIFY.md
 Write `.ctx/phases/{phase-id}/VERIFY.md`:
 ```markdown
 # Verification Report
-**Phase:** {name}
-**Date:** {timestamp}
+**Phase:** {id}
 **Goal:** {original goal}
+**Date:** {timestamp}
 ## Three-Level Results
@@ -125,58 +146,55 @@ Write `.ctx/phases/{phase-id}/VERIFY.md`:
 |----------|--------|-------------|-------|--------|
 | {file1}  | ✓      | ✓           | ✓     | PASS   |
 | {file2}  | ✓      | ✓           | ✗     | FAIL   |
-| {file3}  | ✓      | ✗           | -     | FAIL   |
 ### Failures
+{details of each failure}
-#### {file2} - Not Wired
-- Created but not imported anywhere
-- Expected import in: {file}
-- Action: Add import and usage
+## Anti-Pattern Scan
-#### {file3} - Stub Detected
-- Line 45: `// TODO: implement validation`
-- Action: Complete implementation
+| Pattern | Count | Location | Severity |
+|---------|-------|----------|----------|
+| TODO | 2 | auth.ts:45 | Warning |
-## Anti-Pattern Scan
+## Browser Verification
-| Pattern | Count | Files | Severity |
-|---------|-------|-------|----------|
-| TODO | 2 | auth.ts:45, login.ts:23 | Warning |
-| Empty catch | 1 | api.ts:89 | Error |
+- URL: {url tested}
+- Elements: {verified}
+- Screenshot: .ctx/verify/phase-{id}.png
+- Status: PASS/FAIL
-## Goal Gap Analysis
+## Goal Gap
-**Goal:** {original goal}
+**Built:** {what was completed}
+**Gaps:** {what's missing}
+**Drift:** {what was built but not requested}
-**Built:**
-- ✓ {item completed}
-- ✓ {item completed}
-- ✗ {item missing}
+## Overall: {PASS / FAIL}
-**Gaps:**
-1. {missing item}: {what needs to be done}
+{If FAIL: list required fixes}
+{If PASS: ready for next phase or ship}
+```
-**Drift:**
-- None / {items built but not requested}
+## 7. Update STATE.md
-## Overall: {PASS / FAIL}
+Based on results:
-{If FAIL:}
-### Required Fixes
-1. {fix 1}
-2. {fix 2}
+**If PASS:**
+- Set status = "executing" (for next phase)
+- Or status = "complete" (if last phase)
-{If PASS:}
-Phase verified successfully. Ready for `/ctx:phase next` or `/ctx:ship`.
-```
+**If FAIL:**
+- Create fix tasks
+- Set status = "executing"
+- Loop back to execute fixes
 </process>
 <output>
-Return to orchestrator:
-- Overall pass/fail
-- List of failures with fixes needed
+Return to `/ctx` router:
+- Overall: pass/fail
+- Failures with fixes needed
 - Goal gaps if any
-- Recommendation (proceed/fix)
+- Screenshot paths if UI
+- Next action recommendation
 </output>
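
The three-level check in the `ctx-verifier.md` hunks above is specified only as prose and `grep` commands for an agent to follow. As a rough illustration of that logic, here is a minimal Node.js sketch. It assumes that "substantive" means "none of the Level 2 grep patterns match" and that "wired" means "some other source file imports or requires the module by name". The function names (`verifyArtifact`, `isSubstantive`, `isWired`) and the temp-directory demo are hypothetical and are not part of the package.

```javascript
// Hypothetical sketch of the three-level check; not the ctx-verifier implementation.
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Stub markers copied from the Level 2 grep commands in the diff.
const STUB_PATTERNS = [/TODO/, /not implemented/i, /throw new Error/];

// Level 1 (EXISTS): is the file on disk?
function exists(file) {
  return fs.existsSync(file);
}

// Level 2 (SUBSTANTIVE): the file contains none of the stub markers.
function isSubstantive(file) {
  const text = fs.readFileSync(file, "utf8");
  return !STUB_PATTERNS.some((re) => re.test(text));
}

// Level 3 (WIRED): some other source file imports or requires the module by name.
function isWired(file, sourceFiles) {
  const moduleName = path.basename(file, path.extname(file));
  const escaped = moduleName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const importRe = new RegExp(`(import|require).*${escaped}`);
  return sourceFiles
    .filter((f) => path.resolve(f) !== path.resolve(file))
    .some((f) => importRe.test(fs.readFileSync(f, "utf8")));
}

// Run the levels in order and report the first one that fails.
function verifyArtifact(file, sourceFiles) {
  if (!exists(file)) return { file, status: "FAIL", level: "exists" };
  if (!isSubstantive(file)) return { file, status: "FAIL", level: "substantive" };
  if (!isWired(file, sourceFiles)) return { file, status: "FAIL", level: "wired" };
  return { file, status: "PASS", level: null };
}

// Demo on throwaway files in a temp directory (file names are made up).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "ctx-verify-"));
const auth = path.join(dir, "auth.js");
const stub = path.join(dir, "stub.js");
const index = path.join(dir, "index.js");
fs.writeFileSync(auth, "module.exports = { login: (u) => u.length > 0 };\n");
fs.writeFileSync(stub, "// TODO: implement\nmodule.exports = {};\n");
fs.writeFileSync(index, 'const auth = require("./auth");\nauth.login("a");\n');

const sources = [auth, stub, index];
const results = [auth, stub].map((f) => verifyArtifact(f, sources));
console.log(results.map((r) => `${path.basename(r.file)}: ${r.status}`).join("\n"));
```

A name-based import match is deliberately crude: it does not follow the import chain back to an entry point, which the agent instructions ask for. Treating every `throw new Error` as a stub marker, as the grep list does, will also flag legitimate error handling, so a real check would need a human or agent to review the hits.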

package/bin/ctx.js CHANGED

@@ -19,8 +19,9 @@ if (options.help) {
   ╚██████╗   ██║   ██╔╝ ██╗
    ╚═════╝   ╚═╝   ╚═╝  ╚═╝\x1b[0m
-  \x1b[1mCTX - Smart Context Management\x1b[0m
-  The GSD Killer. 12 commands, infinite power.
+  \x1b[1mCTX 2.1 - Continuous Task eXecution\x1b[0m
+  Smart workflow orchestration for Claude Code.
+  8 commands. Smart routing. Debug loop.
   \x1b[1mUsage:\x1b[0m
     npx ctx-cc [options]

package/commands/ctx.md ADDED

@@ -0,0 +1,116 @@
+---
+name: ctx
+description: Smart router - reads STATE.md and does the right thing
+---
+<objective>
+CTX 2.0 Smart Router - One command that always knows what to do next.
+Read STATE.md, understand current context, and execute the appropriate action.
+</objective>
+<workflow>
+## Step 1: Read State
+Read `.ctx/STATE.md` to understand current situation.
+If STATE.md doesn't exist:
+- Output: "No CTX project found. Run `/ctx init` to start."
+- Stop.
+## Step 2: Route Based on State
+### If status = "initializing"
+Route to: **Research Phase**
+1. Use ArguSeek to research the project goal
+2. Use ChunkHound for semantic code search (if existing codebase)
+3. Create atomic plan (2-3 tasks max)
+4. Update STATE.md with plan
+5. Set status = "executing"
+### If status = "executing"
+Route to: **Execute Current Task**
+1. Read current task from STATE.md
+2. Spawn ctx-executor agent
+3. Execute task with deviation handling:
+   - Auto-fix: bugs, validation, deps (95%)
+   - Ask user: architecture decisions only (5%)
+4. After task:
+   - Run verification (build, tests, lint)
+   - If passes: mark done, update STATE.md
+   - If fails: set status = "debugging"
+### If status = "debugging"
+Route to: **Debug Loop**
+1. Spawn ctx-debugger agent
+2. Loop until fixed (max 5 attempts):
+   - Analyze error
+   - Form hypothesis
+   - Apply fix
+   - Verify (build + tests + browser if UI)
+   - Take screenshot proof if browser test
+3. If fixed: set status = "executing", continue
+4. If 5 attempts fail: escalate to user
+### If status = "verifying"
+Route to: **Three-Level Verification**
+1. Spawn ctx-verifier agent
+2. Check all artifacts:
+   - Level 1: Exists (file on disk?)
+   - Level 2: Substantive (real code, not stub?)
+   - Level 3: Wired (imported and used?)
+3. Scan for anti-patterns (TODO, empty catch, placeholders)
+4. If all pass: complete phase, update STATE.md
+5. If fails: create fix tasks, set status = "executing"
+### If status = "paused"
+Route to: **Resume**
+1. Read checkpoint from `.ctx/checkpoints/`
+2. Restore context (~2.5k tokens)
+3. Set status to previous state
+4. Continue workflow
+## Step 3: Context Budget Check
+After every action:
+- Calculate context usage
+- If > 50%: Auto-checkpoint, warn user
+- If > 70%: Force checkpoint
+## Step 4: Update State
+Always update STATE.md after any action:
+- Current status
+- Progress
+- Recent decisions
+- Next action
+</workflow>
+<state_transitions>
+```
+initializing → executing (after plan created)
+executing → debugging (if verification fails)
+executing → verifying (if all tasks done)
+debugging → executing (if fix works)
+debugging → ESCALATE (if 5 attempts fail)
+verifying → executing (if anti-patterns found)
+verifying → COMPLETE (if all passes)
+paused → (previous state)
+```
+</state_transitions>
+<context_budget>
+| Usage | Quality | Action |
+|-------|---------|--------|
+| 0-30% | Peak | Continue |
+| 30-50% | Good | Continue |
+| 50-70% | Degrading | Auto-checkpoint |
+| 70%+ | Poor | Force checkpoint |
+</context_budget>
+<output_format>
+After routing, output:
+```
+[CTX] Status: {{status}}
+[CTX] Action: {{action_taken}}
+[CTX] Next: {{next_action}}
+[CTX] Context: {{percent}}% ({{quality}})
+```
+</output_format>
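
The `<state_transitions>`, `<context_budget>`, and `<output_format>` blocks added in `ctx.md` describe a small state machine that the agent is expected to follow from prose. The sketch below encodes those tables as data to make the routing explicit. The event names (`planCreated`, `verificationFailed`, and so on) and the function names are hypothetical, and the budget thresholds assume the lower bound of each range is inclusive, which the source does not state (Step 3 says "> 50%" while the table says "50-70%").

```javascript
// Hypothetical encoding of the ctx.md state tables; event names are illustrative.
const TRANSITIONS = {
  initializing: { planCreated: "executing" },
  executing: { verificationFailed: "debugging", allTasksDone: "verifying" },
  debugging: { fixWorked: "executing", attemptsExhausted: "ESCALATE" },
  verifying: { antiPatternsFound: "executing", allPassed: "COMPLETE" },
};

// Resolve the next status; "paused" always returns to the saved previous state.
function nextStatus(status, event, previousStatus) {
  if (status === "paused") return previousStatus;
  const next = TRANSITIONS[status]?.[event];
  if (!next) throw new Error(`No transition from "${status}" on "${event}"`);
  return next;
}

// Map context usage (percent) to the quality and action from <context_budget>.
// Assumption: each range includes its lower bound.
function budgetAction(percent) {
  if (percent >= 70) return { quality: "Poor", action: "Force checkpoint" };
  if (percent >= 50) return { quality: "Degrading", action: "Auto-checkpoint" };
  if (percent >= 30) return { quality: "Good", action: "Continue" };
  return { quality: "Peak", action: "Continue" };
}

// Render the four-line report from <output_format>.
function report(status, actionTaken, nextAction, percent) {
  const { quality } = budgetAction(percent);
  return [
    `[CTX] Status: ${status}`,
    `[CTX] Action: ${actionTaken}`,
    `[CTX] Next: ${nextAction}`,
    `[CTX] Context: ${percent}% (${quality})`,
  ].join("\n");
}

const status = nextStatus("executing", "verificationFailed");
console.log(report(status, "Task verification failed", "Run debug loop", 42));
```

Keeping the transitions in a lookup table means an unexpected status or event fails loudly instead of silently routing somewhere. That matters here because `STATE.md` is free-form text that the router has to parse.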

package/commands/help.md CHANGED

@@ -4,143 +4,160 @@ description: Show CTX commands and usage guide
 ---
 <objective>
-Display the complete CTX command reference.
+Display the CTX 2.1 command reference.
-Output ONLY the reference content below. Do NOT add project-specific analysis or suggestions.
+Output ONLY the reference content below. Do NOT add project-specific analysis.
 </objective>
 <reference>
-# CTX Command Reference
+# CTX 2.1 Command Reference
-**CTX** (Context) is a smart context management system for Claude Code.
-12 commands. Infinite power.
+**CTX** (Continuous Task eXecution) - Smart workflow orchestration for Claude Code.
+8 commands. Smart routing. Debug loop until 100% fixed.
 ## Quick Start
 ```
-1. /ctx:init              Initialize project
-2. /ctx:plan <goal>       Research + Plan automatically
-3. /ctx:do                Execute phase
-4. (repeat 2-3 for each phase)
-5. /ctx:ship              Final audit
+1. /ctx init           Initialize project
+2. /ctx                Smart router does the right thing
+3. /ctx status         Check progress (read-only)
+4. /ctx pause          Checkpoint when needed
 ```
-## Why CTX?
+## The 8 Commands
-| Aspect | GSD | CTX |
-|--------|-----|-----|
-| Commands | 27 | 12 |
-| Context management | Manual | Automatic |
-| Research | Separate step | Auto-integrated |
-| Verification | Manual trigger | Built-in |
-| Memory | Files only | Hierarchical + JIT |
-| Resume cost | ~50k+ tokens | ~2-3k tokens |
+### Smart (Auto-routing)
-## Core Workflow
+| Command | Purpose |
+|---------|---------|
+| `/ctx` | **Smart router** - reads STATE.md, does the right thing |
+| `/ctx init` | Initialize project with STATE.md |
-**`/ctx:init`**
-Initialize project. Detects tech stack, maps codebase, creates PROJECT.md.
+### Inspect (Read-only)
-**`/ctx:plan <goal>`**
-Research + Plan automatically. Uses ArguSeek for web research and ChunkHound for semantic code search.
+| Command | Purpose |
+|---------|---------|
+| `/ctx status` | See current state without triggering action |
-**`/ctx:do [task]`**
-Execute current phase, or run a quick task if argument provided.
-- `/ctx:do` - Execute current phase
-- `/ctx:do "fix the login bug"` - Quick task (bypasses workflow)
+### Control (Override smart router)
-**`/ctx:verify`**
-Three-level verification:
-1. Exists - Is file on disk?
-2. Substantive - Real code, not stub?
-3. Wired - Imported and used?
+| Command | Purpose |
+|---------|---------|
+| `/ctx plan [goal]` | Force research + planning |
+| `/ctx verify` | Force three-level verification |
+| `/ctx quick "task"` | Quick task bypass (skip workflow) |
-Plus anti-pattern scan for TODOs, empty catches, placeholder returns.
+### Session
-**`/ctx:ship`**
-Final audit before shipping. Checks all phases complete, no pending todos, verification passes.
+| Command | Purpose |
+|---------|---------|
+| `/ctx pause` | Checkpoint for session resume |
-## Phase Management
+### Phase Management
-**`/ctx:phase add <name>`**
-Add a new phase to the roadmap.
+| Command | Purpose |
+|---------|---------|
+| `/ctx phase list` | Show all phases with status |
+| `/ctx phase add "goal"` | Add new phase to roadmap |
+| `/ctx phase next` | Complete current, move to next |
+| `/ctx phase skip` | Skip current phase |
-**`/ctx:phase list`**
-Show all phases with status.
+---
+## Smart Router States
+When you run `/ctx`, it reads STATE.md and auto-routes:
-**`/ctx:phase next`**
-Move to the next phase.
+| State | What happens |
+|-------|--------------|
+| initializing | Research + Plan (ArguSeek + ChunkHound) |
+| executing | Execute current task |
+| debugging | **Debug loop until 100% fixed** |
+| verifying | Three-level verification |
+| paused | Resume from checkpoint |
-## Memory
+## Debug Loop
-**`/ctx:remember <fact>`**
-Force-remember something important.
+When something breaks, CTX enters debug mode:
-**`/ctx:recall <query>`**
-Query memory for relevant facts.
+```
+Loop (max 5 attempts):
+  1. Analyze error
+  2. Form hypothesis
+  3. Apply fix
+  4. Verify (build + tests + browser)
+  5. If fixed → done
+     If not → new hypothesis, try again
+```
-**`/ctx:forget <id>`**
-Remove a fact from memory.
+**Browser verification for UI:**
+- Playwright or Chrome DevTools
+- Screenshots saved to `.ctx/debug/`
-## Session Control
+## Three-Level Verification
-**`/ctx:pause`**
-Create checkpoint with handoff notes. Safe to close session.
+| Level | Question | Check |
+|-------|----------|-------|
+| Exists | File on disk? | Glob |
+| Substantive | Real code, not stub? | No TODOs, no placeholders |
+| Wired | Imported and used? | Trace imports |
-**`/ctx:resume`**
-Resume from last checkpoint. Restores full context in ~2-3k tokens.
+## Key Design Principles
-**`/ctx:status`**
-Full status report: project, phase, progress, context usage, todos.
+### Atomic Planning (2-3 Tasks Max)
+Prevents context degradation. Big work = multiple phases.
-## Integrations
+### 95% Auto-Deviation Handling
+| Trigger | Action |
+|---------|--------|
+| Bug in existing code | Auto-fix |
+| Missing validation | Auto-add |
+| Blocking issue | Auto-fix |
+| Architecture decision | Ask user |
+### Context Budget
+| Usage | Quality | Action |
+|-------|---------|--------|
+| 0-30% | Peak | Continue |
+| 30-50% | Good | Continue |
+| 50%+ | Degrading | Auto-checkpoint |
+## 5 Specialized Agents
-### ArguSeek (Web Research)
-Auto-generates research queries during `/ctx:plan`:
-- Best practices for the goal
-- Security considerations
-- Performance optimization
-- Error handling patterns
+| Agent | When spawned |
+|-------|--------------|
+| ctx-researcher | During planning (ArguSeek + ChunkHound) |
+| ctx-planner | After research |
+| ctx-executor | During execution |
+| ctx-debugger | When debugging |
+| ctx-verifier | During verification |
-### ChunkHound (Semantic Code Search)
-Auto-runs during `/ctx:plan`:
-- Semantic search for goal-relevant code
-- Pattern detection
-- Entry point mapping
+## Integrations
-Install: `uv tool install chunkhound`
+- **ArguSeek**: Web research during planning
+- **ChunkHound**: Semantic code search (`uv tool install chunkhound`)
+- **Playwright/DevTools**: Browser verification for UI
 ## Directory Structure
 ```
 .ctx/
-├── PROJECT.md          # Project definition
-├── ROADMAP.md          # Phase roadmap
-├── config.json         # Settings
-├── phases/{id}/        # Phase data
-│   ├── RESEARCH.md     # ArguSeek + ChunkHound results
-│   ├── PLAN.md         # Task breakdown
-│   ├── PROGRESS.md     # Execution state
-│   └── VERIFY.md       # Verification report
-├── memory/             # Hierarchical memory
-├── checkpoints/        # Auto-checkpoints
-└── todos/              # Task management
+├── STATE.md          # Living digest - ALWAYS read first
+├── phases/{id}/      # Phase data
+│   ├── RESEARCH.md   # ArguSeek + ChunkHound results
+│   ├── PLAN.md       # 2-3 tasks (atomic)
+│   └── VERIFY.md     # Verification report
+├── checkpoints/      # Auto-checkpoints
+├── debug/            # Debug screenshots
+└── verify/           # Verification screenshots
 ```
-## Context Budget
-| Usage | Quality | Action |
-|-------|---------|--------|
-| 0-30% | Peak | Continue |
-| 30-50% | Good | Continue |
-| 50%+ | Degrading | Auto-checkpoint |
 ## Updating CTX
-```
-/ctx:update
+```bash
+npx ctx-cc --force
 ```
 ---
-*CTX - 12 commands, infinite power*
+*CTX 2.1 - 8 commands, smart routing, debug loop, 100% verified*
 </reference>
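
The debug loop that `help.md` and `ctx.md` introduce (analyze, hypothesize, fix, verify, at most 5 attempts, then escalate) can be sketched as a bounded retry loop. This is only an illustration under assumptions: the callbacks `hypothesize`, `applyFix`, and `verify` are hypothetical stand-ins for work the ctx-debugger agent would do, and the real package specifies this behavior as agent instructions rather than code.

```javascript
// Hypothetical sketch of the bounded debug loop; callbacks stand in for agent work.
const MAX_ATTEMPTS = 5; // "max 5 attempts" from the diff

function debugLoop(error, { hypothesize, applyFix, verify }) {
  const tried = [];
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    // Form a new hypothesis, taking earlier failed ones into account.
    const hypothesis = hypothesize(error, tried);
    applyFix(hypothesis);
    tried.push(hypothesis);
    // Verify (build + tests + browser in the real workflow).
    if (verify()) return { status: "executing", attempts: attempt, tried };
  }
  // All attempts failed: hand the problem to the user.
  return { status: "ESCALATE", attempts: MAX_ATTEMPTS, tried };
}

// Demo with made-up hypotheses where the third one happens to work.
const candidates = ["missing import", "wrong env var", "off-by-one"];
let applied = null;
const result = debugLoop("TypeError: x is undefined", {
  hypothesize: (_err, tried) => candidates[tried.length] ?? `guess ${tried.length + 1}`,
  applyFix: (h) => { applied = h; },
  verify: () => applied === "off-by-one",
});
console.log(result.status, result.attempts);
```

Passing the list of already-tried hypotheses back into `hypothesize` reflects the "If not → new hypothesis, try again" step, so the loop does not repeat a fix that has already failed.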