ctx-cc 1.0.0 → 2.1.0

package/README.md CHANGED
@@ -1,6 +1,6 @@
1
- # CTX - Smart Context Management for Claude Code
1
+ # CTX 2.1 - Continuous Task eXecution
2
2
 
3
- > The GSD Killer. 12 commands, infinite power.
3
+ > Smart workflow orchestration for Claude Code. 8 commands. Smart routing. Debug loop until 100% fixed.
4
4
 
5
5
  ## Installation
6
6
 
@@ -8,124 +8,173 @@
8
8
  npx ctx-cc
9
9
  ```
10
10
 
11
- Or with options:
12
-
11
+ Options:
13
12
  ```bash
14
13
  npx ctx-cc --global # Install to ~/.claude (default)
15
14
  npx ctx-cc --project # Install to .claude in current directory
16
15
  npx ctx-cc --force # Overwrite existing installation
17
16
  ```
18
17
 
19
- ## Why CTX?
18
+ ## Why CTX 2.1?
20
19
 
21
- | Aspect | GSD | CTX |
22
- |--------|-----|-----|
23
- | Commands | 27 | 12 |
24
- | Context management | Manual | Automatic |
25
- | Research | Separate step | Auto-integrated |
26
- | Verification | Manual trigger | Built-in |
27
- | Memory | Files only | Hierarchical + JIT |
28
- | Resume cost | ~50k+ tokens | ~2-3k tokens |
20
+ | Feature | Before | CTX 2.1 |
21
+ |---------|--------|---------|
22
+ | Commands | 12-27 | **8** (organized) |
23
+ | Router | Manual | **Smart (auto-routing)** |
24
+ | Debug | Manual | **Loop until 100% fixed** |
25
+ | Browser Verify | No | **Playwright/DevTools** |
26
+ | Planning | Any size | **Atomic (2-3 tasks max)** |
27
+ | Resume cost | ~50k tokens | **~2.5k tokens** |
29
28
 
30
29
  ## Quick Start
31
30
 
32
31
  ```
33
- 1. /ctx:init Initialize project
34
- 2. /ctx:plan <goal> Research + Plan automatically
35
- 3. /ctx:do Execute phase
36
- 4. /ctx:verify Three-level verification
37
- 5. /ctx:ship Final audit
32
+ 1. /ctx init Initialize project
33
+ 2. /ctx Smart router does the rest
34
+ 3. /ctx pause Checkpoint when needed
38
35
  ```
39
36
 
40
- ## Commands
37
+ That's it. `/ctx` reads STATE.md and knows what to do next.
38
+
39
+ ## The 8 Commands
41
40
 
42
- ### Core Workflow
41
+ ### Smart (Auto-routing)
43
42
  | Command | Purpose |
44
43
  |---------|---------|
45
- | `/ctx:init` | Initialize project |
46
- | `/ctx:plan <goal>` | Research + Plan automatically |
47
- | `/ctx:do [task]` | Execute (phase or quick task) |
48
- | `/ctx:verify` | Three-level verification |
49
- | `/ctx:ship` | Final audit |
44
+ | `/ctx` | **Smart router** - reads STATE.md, does the right thing |
45
+ | `/ctx init` | Initialize project with STATE.md |
50
46
 
51
- ### Phase Management
47
+ ### Inspect (Read-only)
52
48
  | Command | Purpose |
53
49
  |---------|---------|
54
- | `/ctx:phase add <name>` | Add phase to roadmap |
55
- | `/ctx:phase list` | Show all phases |
56
- | `/ctx:phase next` | Move to next phase |
50
+ | `/ctx status` | See current state without triggering action |
57
51
 
58
- ### Session Control
52
+ ### Control (Override)
59
53
  | Command | Purpose |
60
54
  |---------|---------|
61
- | `/ctx:pause` | Checkpoint + handoff |
62
- | `/ctx:resume` | Resume from checkpoint |
63
- | `/ctx:status` | Full status report |
55
+ | `/ctx plan [goal]` | Force research + planning |
56
+ | `/ctx verify` | Force three-level verification |
57
+ | `/ctx quick "task"` | Quick task bypass |
64
58
 
65
- ### Memory
59
+ ### Session
60
+ | Command | Purpose |
61
+ |---------|---------|
62
+ | `/ctx pause` | Checkpoint for session resume |
63
+
64
+ ### Phase Management
66
65
  | Command | Purpose |
67
66
  |---------|---------|
68
- | `/ctx:remember <fact>` | Force-remember |
69
- | `/ctx:recall <query>` | Query memory |
70
- | `/ctx:forget <id>` | Remove fact |
67
+ | `/ctx phase list` | Show all phases |
68
+ | `/ctx phase add "goal"` | Add new phase |
69
+ | `/ctx phase next` | Complete current, move to next |
70
+ | `/ctx phase skip` | Skip current phase |
71
+
72
+ ### Smart Router States
73
+
74
+ | State | What `/ctx` does |
75
+ |-------|------------------|
76
+ | initializing | Research + Plan (ArguSeek + ChunkHound) |
77
+ | executing | Execute current task |
78
+ | debugging | **Debug loop until 100% fixed** |
79
+ | verifying | Three-level verification |
80
+ | paused | Resume from checkpoint |
81
+
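The state table above amounts to a lookup from the status recorded in STATE.md to a single action. A minimal sketch of that dispatch, assuming the status arrives as a plain string; the action names on the right are hypothetical labels for illustration, not identifiers from the package:

```python
from typing import Dict

# State -> action, mirroring the "Smart Router States" table.
# The action names are illustrative placeholders, not part of ctx-cc.
ROUTES: Dict[str, str] = {
    "initializing": "research_and_plan",
    "executing": "execute_current_task",
    "debugging": "debug_loop",
    "verifying": "three_level_verification",
    "paused": "resume_from_checkpoint",
}


def route(state: str) -> str:
    """Return the action for a STATE.md status; reject unknown statuses."""
    try:
        return ROUTES[state]
    except KeyError:
        raise ValueError(f"unknown state: {state!r}") from None
```

Raising on an unknown status, rather than silently picking a default, is a choice made for this sketch; the README does not say how `/ctx` treats a status outside the table.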
82
+ ## Debug Loop (Key Feature)
83
+
84
+ When something breaks, CTX enters debug mode and loops until fixed:
85
+
86
+ ```
87
+ Loop (max 5 attempts):
88
+ 1. Analyze error
89
+ 2. Form hypothesis
90
+ 3. Apply fix
91
+ 4. Verify (build + tests + browser)
92
+ 5. If fixed → done
93
+ If not → new hypothesis, try again
94
+ ```
95
+
96
+ **Browser verification for UI:**
97
+ - Navigates to affected page
98
+ - Checks elements exist
99
+ - Takes screenshot proof
100
+ - Saves to `.ctx/debug/`
101
+
102
+ ## Key Design Principles
103
+
104
+ ### Atomic Planning (2-3 Tasks Max)
105
+ Why? Context degradation is real:
106
+ | Context | Quality |
107
+ |---------|---------|
108
+ | 0-30% | Peak |
109
+ | 30-50% | Good |
110
+ | 50%+ | Degrading |
111
+
112
+ Big work = multiple phases, not bigger plans.
113
+
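The context bands in the table can be read as a simple threshold function. The sketch below assumes usage is given as a fraction between 0 and 1 and that the boundary values belong to the higher band (exactly 30% counts as "Good", exactly 50% as "Degrading"); the README does not specify either point.

```python
def context_quality(used_fraction: float) -> str:
    """Map context-window usage (0.0 to 1.0) to the README's quality band."""
    if not 0.0 <= used_fraction <= 1.0:
        raise ValueError("used_fraction must be between 0.0 and 1.0")
    if used_fraction < 0.30:
        return "Peak"
    if used_fraction < 0.50:
        return "Good"
    return "Degrading"
```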
114
+ ### 95% Auto-Deviation Handling
115
+ | Trigger | Action |
116
+ |---------|--------|
117
+ | Bug in existing code | Auto-fix |
118
+ | Missing validation | Auto-add |
119
+ | Blocking issue | Auto-fix |
120
+ | Architecture decision | Ask user |
121
+
122
+ ### Three-Level Verification
123
+ 1. **Exists** - File on disk?
124
+ 2. **Substantive** - Real code, not stub?
125
+ 3. **Wired** - Imported and used?
126
+
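One way to picture the three levels is as three increasingly strict checks on a single file. The following is a rough sketch under stated assumptions: "substantive" is approximated by a non-empty file with no stub markers, and "wired" by another project file mentioning the file's stem. Both heuristics and the function name are illustrative, not how the package implements the checks.

```python
import re
from pathlib import Path
from typing import Dict, List

# Assumed stub markers; the actual verifier may look for different patterns.
STUB_MARKERS = re.compile(r"TODO|FIXME|NotImplementedError")


def verify_file(path: Path, project_files: List[Path]) -> Dict[str, bool]:
    """Run the Exists / Substantive / Wired checks for one file."""
    exists = path.is_file()
    text = path.read_text(encoding="utf-8") if exists else ""
    # Substantive: has content and contains none of the stub markers.
    substantive = exists and bool(text.strip()) and not STUB_MARKERS.search(text)
    # Wired: some other file mentions this file's stem (crude import trace).
    wired = exists and any(
        other != path
        and other.is_file()
        and path.stem in other.read_text(encoding="utf-8")
        for other in project_files
    )
    return {"exists": exists, "substantive": substantive, "wired": wired}
```

A substring match on the stem will produce false positives for short or common names, so a real check would need to parse imports for the language in question.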
127
+ ### STATE.md - Single Source of Truth
128
+ ~100 lines. Always accurate. Always read first.
129
+
130
+ ## 5 Specialized Agents
131
+
132
+ | Agent | Spawned when |
133
+ |-------|--------------|
134
+ | ctx-researcher | status = initializing |
135
+ | ctx-planner | after research |
136
+ | ctx-executor | status = executing |
137
+ | ctx-debugger | status = debugging |
138
+ | ctx-verifier | status = verifying |
71
139
 
72
140
  ## Integrations
73
141
 
74
142
  ### ArguSeek (Web Research)
75
- Auto-generates research queries during `/ctx:plan`:
76
- - Best practices
143
+ Auto-runs during planning:
144
+ - Best practices for the goal
77
145
  - Security considerations
78
- - Performance optimization
146
+ - Performance patterns
79
147
 
80
148
  ### ChunkHound (Semantic Code Search)
81
- Auto-runs during `/ctx:plan`:
149
+ Auto-runs during planning:
82
150
  - Semantic search for relevant code
83
151
  - Pattern detection
84
152
  - Entry point mapping
85
153
 
86
- Install ChunkHound: `uv tool install chunkhound`
87
-
88
- ## Three-Level Verification
154
+ Install: `uv tool install chunkhound`
89
155
 
90
- | Level | Question | Check |
91
- |-------|----------|-------|
92
- | Exists | Is file on disk? | Glob |
93
- | Substantive | Real code, not stub? | No TODOs, no placeholder returns |
94
- | Wired | Imported and used? | Trace imports |
95
-
96
- ## Context Budget
97
-
98
- | Usage | Quality | Action |
99
- |-------|---------|--------|
100
- | 0-30% | Peak | Continue |
101
- | 30-50% | Good | Continue |
102
- | 50%+ | Degrading | Auto-checkpoint |
156
+ ### Browser Verification (Playwright/Chrome DevTools)
157
+ Auto-runs during debugging and verification:
158
+ - Navigate to pages
159
+ - Check elements exist
160
+ - Take screenshot proof
103
161
 
104
162
  ## Directory Structure
105
163
 
106
164
  ```
107
165
  .ctx/
108
- ├── PROJECT.md # Project definition
109
- ├── ROADMAP.md # Phase roadmap
110
- ├── config.json # Settings
111
- ├── phases/{id}/ # Phase data
112
- ├── RESEARCH.md
113
- ├── PLAN.md
114
- ├── PROGRESS.md
115
- └── VERIFY.md
116
- ├── memory/ # Hierarchical memory
117
- ├── checkpoints/ # Auto-checkpoints
118
- └── todos/ # Task management
166
+ ├── STATE.md # Living digest - ALWAYS read first
167
+ ├── phases/{id}/ # Phase data
168
+ ├── RESEARCH.md # ArguSeek + ChunkHound results
169
+ ├── PLAN.md # 2-3 tasks (atomic)
170
+ └── VERIFY.md # Three-level verification
171
+ ├── checkpoints/ # Auto-checkpoints
172
+ ├── debug/ # Debug screenshots
173
+ └── memory/ # Decision memory
119
174
  ```
120
175
 
121
176
  ## Updating
122
177
 
123
- ```
124
- /ctx:update
125
- ```
126
-
127
- Or reinstall:
128
-
129
178
  ```bash
130
179
  npx ctx-cc --force
131
180
  ```
@@ -141,4 +190,4 @@ MIT
141
190
 
142
191
  ---
143
192
 
144
- *CTX - 12 commands, infinite power*
193
+ *CTX 2.1 - 8 commands, debug loop, 100% verified*
@@ -0,0 +1,257 @@
1
+ ---
2
+ name: ctx-debugger
3
+ description: Debug agent with browser verification loop. Loops until 100% fixed with visual proof. Spawned when status = "debugging".
4
+ tools: Read, Write, Edit, Bash, Glob, Grep, mcp__playwright__*, mcp__chrome-devtools__*
5
+ color: yellow
6
+ ---
7
+
8
+ <role>
9
+ You are a CTX debugger. Your job is to fix issues until they are 100% verified working.
10
+
11
+ You NEVER give up after one attempt.
12
+ You loop until the fix is proven working, with visual proof when applicable.
13
+ Maximum 5 attempts before escalating to user.
14
+ </role>
15
+
16
+ <philosophy>
17
+
18
+ ## Loop Until 100% Fixed
19
+
20
+ One fix attempt is never enough. You must:
21
+ 1. Apply fix
22
+ 2. Verify fix works (build, tests, browser)
23
+ 3. If still broken: form new hypothesis, try again
24
+ 4. Loop until verified or max attempts reached
25
+
26
+ ## Visual Proof for UI
27
+
28
+ For any UI-related fix:
29
+ - Take screenshot BEFORE fix
30
+ - Take screenshot AFTER fix
31
+ - Verify visually that the issue is resolved
32
+ - Save screenshots as proof
33
+
34
+ ## Scientific Method
35
+
36
+ 1. **Observe**: What's the actual error?
37
+ 2. **Hypothesize**: What's the root cause?
38
+ 3. **Test**: Apply minimal fix
39
+ 4. **Verify**: Did it work?
40
+ 5. **Iterate**: If not, new hypothesis
41
+
42
+ </philosophy>
43
+
44
+ <process>
45
+
46
+ ## Step 1: Understand the Issue
47
+
48
+ Read from STATE.md:
49
+ - `debug_issue`: What's broken
50
+ - `last_error`: Error message or behavior
51
+ - `attempt_count`: How many attempts so far
52
+
53
+ Gather more context:
54
+ - Error logs
55
+ - Stack traces
56
+ - Failing test output
57
+ - Browser console (if UI)
58
+
59
+ ## Step 2: Multi-Layer Verification Setup
60
+
61
+ Prepare verification layers based on issue type:
62
+
63
+ ### Layer 1: Build
64
+ ```bash
65
+ npm run build # or appropriate build command
66
+ # OR
67
+ go build ./...
68
+ # OR
69
+ cargo build
70
+ ```
71
+
72
+ ### Layer 2: Tests
73
+ ```bash
74
+ npm test -- --run {related_test}
75
+ # OR
76
+ pytest {test_file}
77
+ # OR
78
+ go test ./...
79
+ ```
80
+
81
+ ### Layer 3: Lint
82
+ ```bash
83
+ npm run lint
84
+ # OR
85
+ eslint {file}
86
+ ```
87
+
88
+ ### Layer 4: Browser (for UI issues)
89
+ Using Playwright or Chrome DevTools MCP:
90
+ 1. Navigate to affected page
91
+ 2. Take snapshot
92
+ 3. Verify expected elements exist
93
+ 4. Take screenshot as proof
94
+
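The command-line layers (build, tests, lint) can be driven by one small runner that records a pass or fail per layer. This is a sketch only: it assumes each layer is a single command whose exit code signals success, and the name `run_layers` is made up for illustration. The browser layer is left out because it runs through MCP tools rather than a shell command.

```python
import subprocess
from typing import Dict, List, Sequence, Tuple

# A layer is a (name, command) pair, e.g. ("build", ["npm", "run", "build"]).
Layer = Tuple[str, Sequence[str]]


def run_layers(layers: List[Layer]) -> Dict[str, bool]:
    """Run every layer in order and report whether each one exited with 0."""
    results: Dict[str, bool] = {}
    for name, command in layers:
        completed = subprocess.run(command, capture_output=True, text=True)
        results[name] = completed.returncode == 0
    return results


def all_pass(results: Dict[str, bool]) -> bool:
    """True only when every layer passed."""
    return all(results.values())
```

Running all layers even after one fails keeps a full pass/fail picture for each attempt; stopping at the first failure would be faster but would report less.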
95
+ ## Step 3: Debug Loop
96
+
97
+ ```
98
+ attempt = 1
99
+ while attempt <= 5:
100
+
101
+ 1. ANALYZE
102
+ - Read error carefully
103
+ - Form hypothesis about root cause
104
+ - Identify minimal fix
105
+
106
+ 2. FIX
107
+ - Apply targeted fix
108
+ - Keep changes minimal
109
+ - Don't introduce new issues
110
+
111
+ 3. VERIFY (all layers)
112
+ - Run build → must pass
113
+ - Run tests → must pass
114
+ - Run lint → must pass
115
+ - Browser verify (if UI) → must show correct behavior
116
+ - Take screenshot proof (if UI)
117
+
118
+ 4. EVALUATE
119
+ if all_pass:
120
+ → SUCCESS: Exit loop, update STATE.md
121
+ else:
122
+ → Log what failed
123
+ → Form new hypothesis
124
+ → attempt += 1
125
+
126
+ 5. CHECKPOINT (every attempt)
127
+ - Update STATE.md with:
128
+ - Current attempt number
129
+ - Last hypothesis
130
+ - What was tried
131
+ - Result
132
+ ```
133
+
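The loop above can be sketched as a bounded retry with a log entry per attempt. In this sketch `apply_fix` and `verify` are hypothetical callables standing in for the ANALYZE/FIX and VERIFY steps, and `DebugResult` is an illustrative container; none of these names come from the package.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MAX_ATTEMPTS = 5  # the agent escalates to the user after this many tries


@dataclass
class DebugResult:
    fixed: bool
    attempts: int
    log: List[str] = field(default_factory=list)


def debug_loop(
    apply_fix: Callable[[int], str],
    verify: Callable[[], bool],
    max_attempts: int = MAX_ATTEMPTS,
) -> DebugResult:
    """Retry fix-then-verify until verification passes or attempts run out."""
    log: List[str] = []
    for attempt in range(1, max_attempts + 1):
        hypothesis = apply_fix(attempt)  # ANALYZE + FIX: returns what was tried
        passed = verify()                # VERIFY: all layers must pass
        outcome = "pass" if passed else "fail"
        log.append(f"Attempt {attempt}: {hypothesis} -> {outcome}")  # CHECKPOINT
        if passed:
            return DebugResult(True, attempt, log)
    return DebugResult(False, max_attempts, log)  # caller escalates
```

Unlike the pseudocode, this version logs the successful attempt as well, so the final record shows which hypothesis worked.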
134
+ ## Step 4: Browser Verification (UI Issues)
135
+
136
+ When the issue involves UI:
137
+
138
+ ### Using Playwright MCP
139
+ ```
140
+ 1. browser_navigate to affected page
141
+ 2. browser_snapshot to get current state
142
+ 3. browser_click / browser_type to interact
143
+ 4. browser_snapshot again
144
+ 5. browser_take_screenshot for proof
145
+ ```
146
+
147
+ ### Using Chrome DevTools MCP
148
+ ```
149
+ 1. navigate_page to affected URL
150
+ 2. take_snapshot for accessibility tree
151
+ 3. click / fill to interact
152
+ 4. take_screenshot for visual proof
153
+ ```
154
+
155
+ ### Screenshot Naming
156
+ Save screenshots to `.ctx/debug/`:
157
+ ```
158
+ .ctx/debug/
159
+ ├── issue-{id}-before.png
160
+ ├── issue-{id}-attempt-1.png
161
+ ├── issue-{id}-attempt-2.png
162
+ └── issue-{id}-fixed.png
163
+ ```
164
+
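The naming scheme above is regular enough to generate from an issue id and a stage. A small sketch, assuming the id is a plain string and the stages are exactly `before`, `attempt`, and `fixed`; the helper name is hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional

DEBUG_DIR = PurePosixPath(".ctx/debug")


def screenshot_path(issue_id: str, stage: str, attempt: Optional[int] = None) -> str:
    """Build a screenshot path following the issue-{id}-{stage}.png scheme."""
    if stage == "attempt":
        if attempt is None or attempt < 1:
            raise ValueError("stage 'attempt' needs an attempt number >= 1")
        suffix = f"attempt-{attempt}"
    elif stage in ("before", "fixed"):
        suffix = stage
    else:
        raise ValueError(f"unknown stage: {stage!r}")
    return str(DEBUG_DIR / f"issue-{issue_id}-{suffix}.png")
```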
165
+ ## Step 5: Success Handling
166
+
167
+ When fix is verified:
168
+
169
+ 1. Update STATE.md:
170
+ - Set status = "executing"
171
+ - Clear debug_issue
172
+ - Reset attempt_count
173
+ - Log successful fix in decisions
174
+
175
+ 2. Create debug report:
176
+ ```markdown
177
+ ## Debug Session Complete
178
+
179
+ **Issue:** {description}
180
+ **Root Cause:** {what was wrong}
181
+ **Fix:** {what was changed}
182
+ **Attempts:** {count}
183
+ **Verified By:**
184
+ - [x] Build passes
185
+ - [x] Tests pass
186
+ - [x] Lint passes
187
+ - [x] Browser verified (if applicable)
188
+
189
+ **Screenshot Proof:** .ctx/debug/issue-{id}-fixed.png
190
+ ```
191
+
192
+ 3. Return control to `/ctx` router
193
+
194
+ ## Step 6: Escalation (Max Attempts Reached)
195
+
196
+ If 5 attempts fail:
197
+
198
+ 1. Update STATE.md:
199
+ - Keep status = "debugging"
200
+ - Log all attempted fixes
201
+ - Mark as "escalated"
202
+
203
+ 2. Generate escalation report:
204
+ ```markdown
205
+ ## Debug Escalation
206
+
207
+ **Issue:** {description}
208
+ **Attempts:** 5 (max reached)
209
+
210
+ ### What Was Tried
211
+ 1. Attempt 1: {hypothesis} → {result}
212
+ 2. Attempt 2: {hypothesis} → {result}
213
+ 3. Attempt 3: {hypothesis} → {result}
214
+ 4. Attempt 4: {hypothesis} → {result}
215
+ 5. Attempt 5: {hypothesis} → {result}
216
+
217
+ ### Current State
218
+ - Build: {pass/fail}
219
+ - Tests: {pass/fail}
220
+ - Browser: {pass/fail}
221
+
222
+ ### Possible Root Causes
223
+ 1. {theory 1}
224
+ 2. {theory 2}
225
+
226
+ ### Recommended Next Steps
227
+ 1. {suggestion for user}
228
+ 2. {suggestion for user}
229
+
230
+ **Requires user input to proceed.**
231
+ ```
232
+
233
+ 3. Ask user for guidance
234
+
235
+ </process>
236
+
237
+ <state_updates>
238
+
239
+ After EACH attempt, update STATE.md:
240
+ ```markdown
241
+ ## Debug Session (if active)
242
+ - **Issue**: {debug_issue}
243
+ - **Hypothesis**: {current_hypothesis}
244
+ - **Attempt**: {attempt}/5
245
+ - **Last Error**: {error_summary}
246
+ - **Browser Verified**: {true/false}
247
+ ```
248
+
249
+ </state_updates>
250
+
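The per-attempt STATE.md block is a fixed template, so it can be rendered from a handful of values. A minimal sketch, assuming the fields are plain strings and the boolean is written in lowercase as in the template; the function name and parameters are illustrative.

```python
def render_debug_session(
    issue: str,
    hypothesis: str,
    attempt: int,
    last_error: str,
    browser_verified: bool,
    max_attempts: int = 5,
) -> str:
    """Render the 'Debug Session' section written to STATE.md after an attempt."""
    lines = [
        "## Debug Session (if active)",
        f"- **Issue**: {issue}",
        f"- **Hypothesis**: {hypothesis}",
        f"- **Attempt**: {attempt}/{max_attempts}",
        f"- **Last Error**: {last_error}",
        f"- **Browser Verified**: {str(browser_verified).lower()}",
    ]
    return "\n".join(lines)
```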
251
+ <output>
252
+ Return to orchestrator:
253
+ - Success: Fixed, verified, proof saved
254
+ - Escalate: Max attempts, needs user input
255
+ - Include verification results (build, tests, browser)
256
+ - Include screenshot paths if UI issue
257
+ </output>