npm - ctx-cc - Versions diffs - 1.0.0 → 2.0.0 - Mend

ctx-cc 1.0.0 → 2.0.0

Files changed (28)

package/README.md +105 -80
package/agents/ctx-debugger.md +257 -0
package/agents/ctx-executor.md +96 -71
package/agents/ctx-planner.md +70 -62
package/agents/ctx-researcher.md +26 -19
package/agents/ctx-verifier.md +86 -68
package/bin/ctx.js +3 -2
package/commands/ctx.md +116 -0
package/commands/help.md +123 -90
package/commands/init.md +55 -106
package/commands/pause.md +68 -69
package/commands/quick.md +68 -0
package/package.json +2 -2
package/src/install.js +3 -3
package/templates/STATE.md +47 -0
package/commands/do.md +0 -130
package/commands/forget.md +0 -58
package/commands/phase-add.md +0 -53
package/commands/phase-list.md +0 -46
package/commands/phase-next.md +0 -67
package/commands/plan.md +0 -139
package/commands/recall.md +0 -72
package/commands/remember.md +0 -68
package/commands/resume.md +0 -108
package/commands/ship.md +0 -119
package/commands/status.md +0 -95
package/commands/update.md +0 -117
package/commands/verify.md +0 -151

package/agents/ctx-verifier.md CHANGED

@@ -1,19 +1,19 @@
 ---
 name: ctx-verifier
-description: Verification agent for CTX. Performs three-level verification and anti-pattern scanning. Spawned by /ctx:verify.
-tools: Read, Glob, Grep, Bash
+description: Verification agent for CTX 2.0. Three-level verification + anti-pattern scan. Spawned when status = "verifying".
+tools: Read, Glob, Grep, Bash, mcp__playwright__*, mcp__chrome-devtools__*
 color: red
 ---
 <role>
-You are a CTX verifier. Your job is to verify phase completion using three-level verification.
+You are a CTX 2.0 verifier. Your job is to verify phase completion.
-You check:
+You check three levels:
 1. **Exists** - Is the file on disk?
 2. **Substantive** - Is it real code, not a stub?
 3. **Wired** - Is it imported and used?
-Plus anti-pattern scanning for common issues.
+Plus anti-pattern scanning and browser verification for UI.
 </role>
 <philosophy>
@@ -25,7 +25,7 @@ Check: "Does this achieve the original goal?"
 ## Wiring Is Where Failures Hide
-The most common failure: code exists but isn't connected.
+Most common failure: code exists but isn't connected.
 Always trace from entry point to new code.
 ## Be Strict
@@ -33,24 +33,30 @@ Always trace from entry point to new code.
 Better to catch issues now than in production.
 A failing verification saves debugging time later.
+## Visual Verification for UI
+If the phase involves UI, verify it visually:
+- Navigate to the page
+- Check elements exist
+- Take screenshot proof
 </philosophy>
 <process>
-## 1. Load Phase
+## 1. Load Context
 Read:
-- `.ctx/phases/{phase-id}/PLAN.md` - Get verification criteria
-- `.ctx/phases/{phase-id}/PROGRESS.md` - Get list of artifacts
-- Original goal from ROADMAP.md
+- `.ctx/STATE.md` - Current state
+- `.ctx/phases/{phase-id}/PLAN.md` - Verification criteria
+- Original goal
 ## 2. Three-Level Verification
-For each artifact (file, function, endpoint):
+For each artifact:
 ### Level 1: EXISTS
 ```bash
-# Check file exists
 ls {file_path}
 ```
 Pass: File found
@@ -58,7 +64,7 @@ Fail: File missing
 ### Level 2: SUBSTANTIVE
 ```bash
-# Check for stubs/placeholders
+# Check for stubs
 grep -n "TODO" {file}
 grep -n "not implemented" {file}
 grep -n "throw new Error" {file}
@@ -69,55 +75,70 @@ Check for:
 - Empty function bodies
 - Placeholder returns (`return null`, `return {}`)
 - "Not implemented" text
-- Trivial implementations
 Pass: Real, complete code
-Fail: Stub or placeholder detected
+Fail: Stub detected
 ### Level 3: WIRED
 ```bash
-# Find imports of this file
+# Find imports
 grep -r "import.*{module}" --include="*.ts" --include="*.js"
-# Trace from entry point
-# Check the import chain connects to main/index
 ```
+Trace from entry point to new code.
 Pass: Code is imported and called
-Fail: Orphan code (exists but unused)
+Fail: Orphan code
 ## 3. Anti-Pattern Scan
-Scan the codebase for:
 | Pattern | Search | Severity |
 |---------|--------|----------|
 | TODO comments | `// TODO`, `# TODO` | Warning |
 | Empty catch | `catch\s*\([^)]*\)\s*\{\s*\}` | Error |
 | Console-only errors | `console.error` without throw | Warning |
-| Placeholder returns | `return null`, `return {}`, `return undefined` | Error |
-| Hardcoded secrets | API keys, passwords | Critical |
+| Placeholder returns | `return null`, `return {}` | Error |
 | Debug code | `console.log`, `debugger` | Warning |
-## 4. Goal Gap Analysis
+## 4. Browser Verification (UI)
+If phase involves UI:
+### Using Playwright MCP
+```
+browser_navigate({url})
+browser_snapshot()
+# Verify expected elements exist in snapshot
+browser_take_screenshot({filename})
+```
+### Using Chrome DevTools MCP
+```
+navigate_page({url})
+take_snapshot()
+take_screenshot({path})
+```
+Save screenshots to `.ctx/verify/phase-{id}-verified.png`
-Compare original goal vs implementation:
+## 5. Goal Gap Analysis
-1. Read original goal
-2. List what was built
-3. Identify gaps (things missing)
-4. Identify drift (things built but not requested)
+Compare goal vs implementation:
+1. What was the original goal?
+2. What was actually built?
+3. What's missing (gaps)?
+4. What's extra (drift)?
-## 5. Generate VERIFY.md
+## 6. Generate VERIFY.md
 Write `.ctx/phases/{phase-id}/VERIFY.md`:
 ```markdown
 # Verification Report
-**Phase:** {name}
-**Date:** {timestamp}
+**Phase:** {id}
 **Goal:** {original goal}
+**Date:** {timestamp}
 ## Three-Level Results
@@ -125,58 +146,55 @@ Write `.ctx/phases/{phase-id}/VERIFY.md`:
 |----------|--------|-------------|-------|--------|
 | {file1}  | ✓      | ✓           | ✓     | PASS   |
 | {file2}  | ✓      | ✓           | ✗     | FAIL   |
-| {file3}  | ✓      | ✗           | -     | FAIL   |
 ### Failures
+{details of each failure}
-#### {file2} - Not Wired
-- Created but not imported anywhere
-- Expected import in: {file}
-- Action: Add import and usage
+## Anti-Pattern Scan
-#### {file3} - Stub Detected
-- Line 45: `// TODO: implement validation`
-- Action: Complete implementation
+| Pattern | Count | Location | Severity |
+|---------|-------|----------|----------|
+| TODO | 2 | auth.ts:45 | Warning |
-## Anti-Pattern Scan
+## Browser Verification
-| Pattern | Count | Files | Severity |
-|---------|-------|-------|----------|
-| TODO | 2 | auth.ts:45, login.ts:23 | Warning |
-| Empty catch | 1 | api.ts:89 | Error |
+- URL: {url tested}
+- Elements: {verified}
+- Screenshot: .ctx/verify/phase-{id}.png
+- Status: PASS/FAIL
-## Goal Gap Analysis
+## Goal Gap
-**Goal:** {original goal}
+**Built:** {what was completed}
+**Gaps:** {what's missing}
+**Drift:** {what was built but not requested}
-**Built:**
-- ✓ {item completed}
-- ✓ {item completed}
-- ✗ {item missing}
+## Overall: {PASS / FAIL}
-**Gaps:**
-1. {missing item}: {what needs to be done}
+{If FAIL: list required fixes}
+{If PASS: ready for next phase or ship}
+```
-**Drift:**
-- None / {items built but not requested}
+## 7. Update STATE.md
-## Overall: {PASS / FAIL}
+Based on results:
-{If FAIL:}
-### Required Fixes
-1. {fix 1}
-2. {fix 2}
+**If PASS:**
+- Set status = "executing" (for next phase)
+- Or status = "complete" (if last phase)
-{If PASS:}
-Phase verified successfully. Ready for `/ctx:phase next` or `/ctx:ship`.
-```
+**If FAIL:**
+- Create fix tasks
+- Set status = "executing"
+- Loop back to execute fixes
 </process>
 <output>
-Return to orchestrator:
-- Overall pass/fail
-- List of failures with fixes needed
+Return to `/ctx` router:
+- Overall: pass/fail
+- Failures with fixes needed
 - Goal gaps if any
-- Recommendation (proceed/fix)
+- Screenshot paths if UI
+- Next action recommendation
 </output>
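Note on the three-level check described in `ctx-verifier.md`: the agent file only gives shell one-liners (`ls`, `grep`), so the following is a minimal JavaScript sketch of the same Exists / Substantive / Wired logic, not code from the package. The function names (`findStubs`, `isWired`, `verifyArtifact`) and the choice to pass file contents in as strings are assumptions made for illustration; the stub patterns mirror the `grep` commands in the diff and are heuristics (a legitimate `throw new Error` would also be flagged).

```javascript
// Hypothetical helper names; patterns mirror the grep commands in ctx-verifier.md.
const STUB_PATTERNS = [/TODO/, /not implemented/i, /throw new Error/];

// Escape a module name so it can be embedded in a RegExp safely.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Level 2: collect lines that look like stubs (heuristic, like grep -n).
function findStubs(source) {
  const hits = [];
  source.split("\n").forEach((line, index) => {
    if (STUB_PATTERNS.some((pattern) => pattern.test(line))) {
      hits.push({ line: index + 1, text: line.trim() });
    }
  });
  return hits;
}

// Level 3: is the module imported by at least one other source file?
// Equivalent in spirit to: grep -r "import.*{module}"
function isWired(moduleName, otherSources) {
  const importPattern = new RegExp("import.*" + escapeRegExp(moduleName));
  return otherSources.some((source) => importPattern.test(source));
}

// Combine the three levels. `source` is the file content, or null if the
// file is missing on disk (Level 1 fails).
function verifyArtifact(source, moduleName, otherSources) {
  const exists = source !== null;
  const stubs = exists ? findStubs(source) : [];
  const substantive = exists && stubs.length === 0;
  const wired = exists && isWired(moduleName, otherSources);
  const status = exists && substantive && wired ? "PASS" : "FAIL";
  return { exists, substantive, wired, stubs, status };
}

// Usage: one artifact that is imported elsewhere, one that still has a TODO.
const good = verifyArtifact(
  "export function login() { return checkPassword(); }",
  "auth",
  ["import { login } from './auth';"]
);
const stubbed = verifyArtifact("// TODO: implement\nexport function x() {}", "x", []);
console.log(good.status, stubbed.status);
```

Keeping the checks as pure functions over strings means the file reading (and the "trace from entry point" step, which this sketch does not attempt) stays with the caller.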

package/bin/ctx.js CHANGED

@@ -19,8 +19,9 @@ if (options.help) {
   ╚██████╗   ██║   ██╔╝ ██╗
    ╚═════╝   ╚═╝   ╚═╝  ╚═╝\x1b[0m
-  \x1b[1mCTX - Smart Context Management\x1b[0m
-  The GSD Killer. 12 commands, infinite power.
+  \x1b[1mCTX 2.0 - Continuous Task eXecution\x1b[0m
+  Smart workflow orchestration for Claude Code.
+  4 commands. Debug loop. 100% verified.
   \x1b[1mUsage:\x1b[0m
     npx ctx-cc [options]

package/commands/ctx.md ADDED

@@ -0,0 +1,116 @@
+---
+name: ctx
+description: Smart router - reads STATE.md and does the right thing
+---
+<objective>
+CTX 2.0 Smart Router - One command that always knows what to do next.
+Read STATE.md, understand current context, and execute the appropriate action.
+</objective>
+<workflow>
+## Step 1: Read State
+Read `.ctx/STATE.md` to understand current situation.
+If STATE.md doesn't exist:
+- Output: "No CTX project found. Run `/ctx init` to start."
+- Stop.
+## Step 2: Route Based on State
+### If status = "initializing"
+Route to: **Research Phase**
+1. Use ArguSeek to research the project goal
+2. Use ChunkHound for semantic code search (if existing codebase)
+3. Create atomic plan (2-3 tasks max)
+4. Update STATE.md with plan
+5. Set status = "executing"
+### If status = "executing"
+Route to: **Execute Current Task**
+1. Read current task from STATE.md
+2. Spawn ctx-executor agent
+3. Execute task with deviation handling:
+   - Auto-fix: bugs, validation, deps (95%)
+   - Ask user: architecture decisions only (5%)
+4. After task:
+   - Run verification (build, tests, lint)
+   - If passes: mark done, update STATE.md
+   - If fails: set status = "debugging"
+### If status = "debugging"
+Route to: **Debug Loop**
+1. Spawn ctx-debugger agent
+2. Loop until fixed (max 5 attempts):
+   - Analyze error
+   - Form hypothesis
+   - Apply fix
+   - Verify (build + tests + browser if UI)
+   - Take screenshot proof if browser test
+3. If fixed: set status = "executing", continue
+4. If 5 attempts fail: escalate to user
+### If status = "verifying"
+Route to: **Three-Level Verification**
+1. Spawn ctx-verifier agent
+2. Check all artifacts:
+   - Level 1: Exists (file on disk?)
+   - Level 2: Substantive (real code, not stub?)
+   - Level 3: Wired (imported and used?)
+3. Scan for anti-patterns (TODO, empty catch, placeholders)
+4. If all pass: complete phase, update STATE.md
+5. If fails: create fix tasks, set status = "executing"
+### If status = "paused"
+Route to: **Resume**
+1. Read checkpoint from `.ctx/checkpoints/`
+2. Restore context (~2.5k tokens)
+3. Set status to previous state
+4. Continue workflow
+## Step 3: Context Budget Check
+After every action:
+- Calculate context usage
+- If > 50%: Auto-checkpoint, warn user
+- If > 70%: Force checkpoint
+## Step 4: Update State
+Always update STATE.md after any action:
+- Current status
+- Progress
+- Recent decisions
+- Next action
+</workflow>
+<state_transitions>
+```
+initializing → executing (after plan created)
+executing → debugging (if verification fails)
+executing → verifying (if all tasks done)
+debugging → executing (if fix works)
+debugging → ESCALATE (if 5 attempts fail)
+verifying → executing (if anti-patterns found)
+verifying → COMPLETE (if all passes)
+paused → (previous state)
+```
+</state_transitions>
+<context_budget>
+| Usage | Quality | Action |
+|-------|---------|--------|
+| 0-30% | Peak | Continue |
+| 30-50% | Good | Continue |
+| 50-70% | Degrading | Auto-checkpoint |
+| 70%+ | Poor | Force checkpoint |
+</context_budget>
+<output_format>
+After routing, output:
+```
+[CTX] Status: {{status}}
+[CTX] Action: {{action_taken}}
+[CTX] Next: {{next_action}}
+[CTX] Context: {{percent}}% ({{quality}})
+```
+</output_format>
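Note on the router added in `ctx.md`: the file describes the state machine in prose for the agent to follow, so the sketch below is only one possible JavaScript reading of the `<state_transitions>`, `<context_budget>`, and `<output_format>` blocks. The event names (`plan_created`, `verification_failed`, and so on), the `previousStatus` field, and the function names are assumptions, not part of the package. The boundary handling follows Step 3 ("> 50%", "> 70%"); the table's "70%+" could equally be read as inclusive.

```javascript
// Transition table transcribed from <state_transitions>; event names are hypothetical.
const TRANSITIONS = {
  initializing: { plan_created: "executing" },
  executing: { verification_failed: "debugging", all_tasks_done: "verifying" },
  debugging: { fix_works: "executing", attempts_exhausted: "ESCALATE" },
  verifying: { anti_patterns_found: "executing", all_passed: "COMPLETE" },
};

// Resolve the next status. A paused project returns to whatever status it
// had before the pause (assumed to be stored as `previousStatus`).
function nextStatus(state, event) {
  if (state.status === "paused") return state.previousStatus;
  const next = TRANSITIONS[state.status]?.[event];
  if (!next) {
    throw new Error(`No transition from "${state.status}" on "${event}"`);
  }
  return next;
}

// Map context usage (percent) to the quality/action rows of <context_budget>.
function budgetAction(percent) {
  if (percent > 70) return { quality: "Poor", action: "Force checkpoint" };
  if (percent > 50) return { quality: "Degrading", action: "Auto-checkpoint" };
  if (percent > 30) return { quality: "Good", action: "Continue" };
  return { quality: "Peak", action: "Continue" };
}

// Render the four-line report from <output_format>.
function formatOutput(status, actionTaken, nextAction, percent) {
  const { quality } = budgetAction(percent);
  return [
    `[CTX] Status: ${status}`,
    `[CTX] Action: ${actionTaken}`,
    `[CTX] Next: ${nextAction}`,
    `[CTX] Context: ${percent}% (${quality})`,
  ].join("\n");
}

// Usage: a failed post-task verification moves the project into the debug loop.
const status = nextStatus({ status: "executing" }, "verification_failed");
console.log(formatOutput(status, "Task 1 failed checks", "Start debug loop", 42));
```

A lookup table keeps every allowed transition visible in one place, and an unknown status/event pair fails loudly instead of silently leaving STATE.md unchanged.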

package/commands/help.md CHANGED

@@ -4,143 +4,176 @@ description: Show CTX commands and usage guide
 ---
 <objective>
-Display the complete CTX command reference.
+Display the CTX 2.0 command reference.
-Output ONLY the reference content below. Do NOT add project-specific analysis or suggestions.
+Output ONLY the reference content below. Do NOT add project-specific analysis.
 </objective>
 <reference>
-# CTX Command Reference
+# CTX 2.0 Command Reference
-**CTX** (Context) is a smart context management system for Claude Code.
-12 commands. Infinite power.
+**CTX** (Continuous Task eXecution) - Smart workflow orchestration for Claude Code.
+4 commands. One smart router. Debug loop until 100% fixed.
 ## Quick Start
 ```
-1. /ctx:init              Initialize project
-2. /ctx:plan <goal>       Research + Plan automatically
-3. /ctx:do                Execute phase
-4. (repeat 2-3 for each phase)
-5. /ctx:ship              Final audit
+1. /ctx init           Initialize project with STATE.md
+2. /ctx                Smart router - does the right thing
+3. (repeat until done)
+4. /ctx pause          Checkpoint when needed
 ```
-## Why CTX?
+That's it. `/ctx` reads STATE.md and knows what to do next.
-| Aspect | GSD | CTX |
-|--------|-----|-----|
-| Commands | 27 | 12 |
-| Context management | Manual | Automatic |
-| Research | Separate step | Auto-integrated |
-| Verification | Manual trigger | Built-in |
-| Memory | Files only | Hierarchical + JIT |
-| Resume cost | ~50k+ tokens | ~2-3k tokens |
+## The 4 Commands
-## Core Workflow
+### `/ctx`
+**The smart router.** Reads STATE.md, does the right action:
-**`/ctx:init`**
-Initialize project. Detects tech stack, maps codebase, creates PROJECT.md.
+| State | What happens |
+|-------|--------------|
+| initializing | Research + Plan (ArguSeek + ChunkHound) |
+| executing | Execute current task |
+| debugging | Debug loop until 100% fixed |
+| verifying | Three-level verification |
+| paused | Resume from checkpoint |
-**`/ctx:plan <goal>`**
-Research + Plan automatically. Uses ArguSeek for web research and ChunkHound for semantic code search.
+Just run `/ctx` and it figures out what's needed.
-**`/ctx:do [task]`**
-Execute current phase, or run a quick task if argument provided.
-- `/ctx:do` - Execute current phase
-- `/ctx:do "fix the login bug"` - Quick task (bypasses workflow)
+### `/ctx init`
+Initialize a new project. Creates `.ctx/STATE.md`.
-**`/ctx:verify`**
-Three-level verification:
-1. Exists - Is file on disk?
-2. Substantive - Real code, not stub?
-3. Wired - Imported and used?
-Plus anti-pattern scan for TODOs, empty catches, placeholder returns.
-**`/ctx:ship`**
-Final audit before shipping. Checks all phases complete, no pending todos, verification passes.
-## Phase Management
-**`/ctx:phase add <name>`**
-Add a new phase to the roadmap.
+### `/ctx quick "task"`
+Quick task bypass. Skip the workflow for small fixes.
+```
+/ctx quick "fix the button color"
+/ctx quick "add console.log for debugging"
+```
-**`/ctx:phase list`**
-Show all phases with status.
+### `/ctx pause`
+Create checkpoint. Safe to close session.
+Resume later with `/ctx` - auto-restores in ~2.5k tokens.
-**`/ctx:phase next`**
-Move to the next phase.
+## Debug Loop (New in 2.0)
-## Memory
+When something breaks, CTX enters debug mode:
-**`/ctx:remember <fact>`**
-Force-remember something important.
+```
+Loop (max 5 attempts):
+  1. Analyze error
+  2. Form hypothesis
+  3. Apply fix
+  4. Verify (build + tests + browser)
+  5. If fixed: done
+     If not: new hypothesis, try again
+```
-**`/ctx:recall <query>`**
-Query memory for relevant facts.
+**Browser verification for UI:**
+- Navigates to affected page
+- Checks elements exist
+- Takes screenshot proof
+- Saves to `.ctx/debug/`
-**`/ctx:forget <id>`**
-Remove a fact from memory.
+## Architecture
-## Session Control
+### STATE.md - Single Source of Truth
+~100 lines. Always accurate. Always read first.
-**`/ctx:pause`**
-Create checkpoint with handoff notes. Safe to close session.
+```markdown
+## Project
+- Name, Stack, Status
-**`/ctx:resume`**
-Resume from last checkpoint. Restores full context in ~2-3k tokens.
+## Current Phase
+- Goal, Progress
-**`/ctx:status`**
-Full status report: project, phase, progress, context usage, todos.
+## Active Task
+- What, Status, Attempts
-## Integrations
+## Debug Session (if active)
+- Issue, Hypothesis, Attempt count
-### ArguSeek (Web Research)
-Auto-generates research queries during `/ctx:plan`:
-- Best practices for the goal
-- Security considerations
-- Performance optimization
-- Error handling patterns
+## Context Budget
+- Usage %, Quality level
+```
-### ChunkHound (Semantic Code Search)
-Auto-runs during `/ctx:plan`:
-- Semantic search for goal-relevant code
-- Pattern detection
-- Entry point mapping
+### 5 Specialized Agents
-Install: `uv tool install chunkhound`
+| Agent | When spawned |
+|-------|--------------|
+| ctx-researcher | status = initializing |
+| ctx-planner | after research |
+| ctx-executor | status = executing |
+| ctx-debugger | status = debugging |
+| ctx-verifier | status = verifying |
-## Directory Structure
+### Directory Structure
 ```
 .ctx/
-├── PROJECT.md          # Project definition
-├── ROADMAP.md          # Phase roadmap
-├── config.json         # Settings
-├── phases/{id}/        # Phase data
-│   ├── RESEARCH.md     # ArguSeek + ChunkHound results
-│   ├── PLAN.md         # Task breakdown
-│   ├── PROGRESS.md     # Execution state
-│   └── VERIFY.md       # Verification report
-├── memory/             # Hierarchical memory
-├── checkpoints/        # Auto-checkpoints
-└── todos/              # Task management
+├── STATE.md          # Living digest - ALWAYS read first
+├── phases/{id}/      # Phase data
+│   ├── RESEARCH.md   # ArguSeek + ChunkHound results
+│   ├── PLAN.md       # 2-3 tasks (atomic)
+│   └── VERIFY.md     # Three-level verification
+├── checkpoints/      # Auto-checkpoints
+├── debug/            # Debug screenshots
+└── memory/           # Decision memory
 ```
-## Context Budget
+## Key Features
+### Atomic Planning (2-3 Tasks Max)
+Prevents context degradation. Big work = multiple phases.
+### 95% Auto-Deviation Handling
+| Trigger | Action |
+|---------|--------|
+| Bug in existing code | Auto-fix |
+| Missing validation | Auto-add |
+| Blocking issue | Auto-fix |
+| Architecture decision | Ask user |
+### Three-Level Verification
+1. **Exists** - File on disk?
+2. **Substantive** - Real code, not stub?
+3. **Wired** - Imported and used?
+### Context Budget
 | Usage | Quality | Action |
 |-------|---------|--------|
 | 0-30% | Peak | Continue |
 | 30-50% | Good | Continue |
 | 50%+ | Degrading | Auto-checkpoint |
+## Integrations
+### ArguSeek (Web Research)
+Auto-runs during planning:
+- Best practices
+- Security considerations
+- Performance patterns
+### ChunkHound (Semantic Search)
+Auto-runs during planning:
+- Find relevant code
+- Detect patterns
+- Map entry points
+Install: `uv tool install chunkhound`
+### Browser Verification (Playwright/Chrome DevTools)
+Auto-runs during debugging and verification:
+- Navigate to pages
+- Check elements
+- Screenshot proof
 ## Updating CTX
-```
-/ctx:update
+```bash
+npx ctx-cc --force
 ```
 ---
-*CTX - 12 commands, infinite power*
+*CTX 2.0 - 4 commands, debug loop, 100% verified*
 </reference>
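Note on the debug loop documented in `help.md`: the loop is specified as numbered steps for the agent, with a cap of 5 attempts. The following is a minimal JavaScript sketch of that control flow under stated assumptions: `attemptFix` and `verify` are hypothetical callbacks standing in for "analyze, form hypothesis, apply fix" and "build + tests + browser", and the returned `status` values reuse the `executing` / `ESCALATE` labels from the `ctx.md` transition list.

```javascript
const MAX_ATTEMPTS = 5; // cap taken from "Loop (max 5 attempts)"

// attemptFix(error, history) applies a fix and returns the hypothesis it tried.
// verify() returns true when the checks pass. Both are hypothetical callbacks.
function debugLoop(error, attemptFix, verify, maxAttempts = MAX_ATTEMPTS) {
  const history = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Steps 1-3: analyze the error, form a hypothesis, apply a fix.
    // Earlier attempts are passed in so a new hypothesis can differ.
    const hypothesis = attemptFix(error, history);
    // Step 4: verify the fix.
    const fixed = verify();
    history.push({ attempt, hypothesis, fixed });
    // Step 5: stop as soon as verification passes.
    if (fixed) return { status: "executing", attempts: attempt, history };
  }
  // All attempts used up: hand the problem to the user.
  return { status: "ESCALATE", attempts: maxAttempts, history };
}

// Usage: a fake verifier that only passes on the third try.
let calls = 0;
const result = debugLoop(
  "TypeError: x is undefined",
  (err, past) => `hypothesis #${past.length + 1}`,
  () => ++calls === 3
);
console.log(result.status, result.attempts);
```

Recording each hypothesis in `history` matches the "Debug Session" fields that STATE.md tracks (issue, hypothesis, attempt count), though how the package persists them is not shown in this diff.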