ctx-cc 1.0.0 → 2.0.0
- package/README.md +105 -80
- package/agents/ctx-debugger.md +257 -0
- package/agents/ctx-executor.md +96 -71
- package/agents/ctx-planner.md +70 -62
- package/agents/ctx-researcher.md +26 -19
- package/agents/ctx-verifier.md +86 -68
- package/bin/ctx.js +3 -2
- package/commands/ctx.md +116 -0
- package/commands/help.md +123 -90
- package/commands/init.md +55 -106
- package/commands/pause.md +68 -69
- package/commands/quick.md +68 -0
- package/package.json +2 -2
- package/src/install.js +3 -3
- package/templates/STATE.md +47 -0
- package/commands/do.md +0 -130
- package/commands/forget.md +0 -58
- package/commands/phase-add.md +0 -53
- package/commands/phase-list.md +0 -46
- package/commands/phase-next.md +0 -67
- package/commands/plan.md +0 -139
- package/commands/recall.md +0 -72
- package/commands/remember.md +0 -68
- package/commands/resume.md +0 -108
- package/commands/ship.md +0 -119
- package/commands/status.md +0 -95
- package/commands/update.md +0 -117
- package/commands/verify.md +0 -151
package/README.md
CHANGED
@@ -1,6 +1,6 @@
-# CTX -
+# CTX 2.0 - Continuous Task eXecution
 
->
+> Smart workflow orchestration for Claude Code. 4 commands. Debug loop until 100% fixed.
 
 ## Installation
 
@@ -8,124 +8,149 @@
 npx ctx-cc
 ```
 
-
-
+Options:
 ```bash
 npx ctx-cc --global # Install to ~/.claude (default)
 npx ctx-cc --project # Install to .claude in current directory
 npx ctx-cc --force # Overwrite existing installation
 ```
 
-## Why CTX?
+## Why CTX 2.0?
 
-|
-
-| Commands | 27 |
-|
-|
-|
-|
-| Resume cost | ~50k
+| Feature | Before | CTX 2.0 |
+|---------|--------|---------|
+| Commands | 12-27 | **4** |
+| Router | Manual | **Smart (auto-routing)** |
+| Debug | Manual | **Loop until 100% fixed** |
+| Browser Verify | No | **Playwright/DevTools** |
+| Planning | Any size | **Atomic (2-3 tasks max)** |
+| Resume cost | ~50k tokens | **~2.5k tokens** |
 
 ## Quick Start
 
 ```
-1. /ctx
-2. /ctx
-3. /ctx
-4. /ctx:verify Three-level verification
-5. /ctx:ship Final audit
+1. /ctx init Initialize project
+2. /ctx Smart router does the rest
+3. /ctx pause Checkpoint when needed
 ```
 
-
+That's it. `/ctx` reads STATE.md and knows what to do next.
 
-
-| Command | Purpose |
-|---------|---------|
-| `/ctx:init` | Initialize project |
-| `/ctx:plan <goal>` | Research + Plan automatically |
-| `/ctx:do [task]` | Execute (phase or quick task) |
-| `/ctx:verify` | Three-level verification |
-| `/ctx:ship` | Final audit |
+## The 4 Commands
 
-### Phase Management
 | Command | Purpose |
 |---------|---------|
-| `/ctx
-| `/ctx
-| `/ctx
+| `/ctx` | Smart router - reads STATE.md, does the right thing |
+| `/ctx init` | Initialize project with STATE.md |
+| `/ctx quick "task"` | Quick task bypass (skip workflow) |
+| `/ctx pause` | Checkpoint for session resume |
 
-###
-| Command | Purpose |
-|---------|---------|
-| `/ctx:pause` | Checkpoint + handoff |
-| `/ctx:resume` | Resume from checkpoint |
-| `/ctx:status` | Full status report |
+### Smart Router States
 
-
-
+| State | What `/ctx` does |
+|-------|------------------|
+| initializing | Research + Plan (ArguSeek + ChunkHound) |
+| executing | Execute current task |
+| debugging | **Debug loop until 100% fixed** |
+| verifying | Three-level verification |
+| paused | Resume from checkpoint |
+
+## Debug Loop (Key Feature)
+
+When something breaks, CTX enters debug mode and loops until fixed:
+
+```
+Loop (max 5 attempts):
+1. Analyze error
+2. Form hypothesis
+3. Apply fix
+4. Verify (build + tests + browser)
+5. If fixed → done
+If not → new hypothesis, try again
+```
+
+**Browser verification for UI:**
+- Navigates to affected page
+- Checks elements exist
+- Takes screenshot proof
+- Saves to `.ctx/debug/`
+
+## Key Design Principles
+
+### Atomic Planning (2-3 Tasks Max)
+Why? Context degradation is real:
+| Context | Quality |
 |---------|---------|
-|
-|
-|
+| 0-30% | Peak |
+| 30-50% | Good |
+| 50%+ | Degrading |
+
+Big work = multiple phases, not bigger plans.
+
+### 95% Auto-Deviation Handling
+| Trigger | Action |
+|---------|--------|
+| Bug in existing code | Auto-fix |
+| Missing validation | Auto-add |
+| Blocking issue | Auto-fix |
+| Architecture decision | Ask user |
+
+### Three-Level Verification
+1. **Exists** - File on disk?
+2. **Substantive** - Real code, not stub?
+3. **Wired** - Imported and used?
+
+### STATE.md - Single Source of Truth
+~100 lines. Always accurate. Always read first.
+
+## 5 Specialized Agents
+
+| Agent | Spawned when |
+|-------|--------------|
+| ctx-researcher | status = initializing |
+| ctx-planner | after research |
+| ctx-executor | status = executing |
+| ctx-debugger | status = debugging |
+| ctx-verifier | status = verifying |
 
 ## Integrations
 
 ### ArguSeek (Web Research)
-Auto-
-- Best practices
+Auto-runs during planning:
+- Best practices for the goal
 - Security considerations
-- Performance
+- Performance patterns
 
 ### ChunkHound (Semantic Code Search)
-Auto-runs during
+Auto-runs during planning:
 - Semantic search for relevant code
 - Pattern detection
 - Entry point mapping
 
-Install
-
-## Three-Level Verification
-
-| Level | Question | Check |
-|-------|----------|-------|
-| Exists | Is file on disk? | Glob |
-| Substantive | Real code, not stub? | No TODOs, no placeholder returns |
-| Wired | Imported and used? | Trace imports |
+Install: `uv tool install chunkhound`
 
-
-
-
-
-
-| 30-50% | Good | Continue |
-| 50%+ | Degrading | Auto-checkpoint |
+### Browser Verification (Playwright/Chrome DevTools)
+Auto-runs during debugging and verification:
+- Navigate to pages
+- Check elements exist
+- Take screenshot proof
 
 ## Directory Structure
 
 ```
 .ctx/
-├──
-├──
-├──
-├──
-│
-
-
-
-├── memory/ # Hierarchical memory
-├── checkpoints/ # Auto-checkpoints
-└── todos/ # Task management
+├── STATE.md # Living digest - ALWAYS read first
+├── phases/{id}/ # Phase data
+│ ├── RESEARCH.md # ArguSeek + ChunkHound results
+│ ├── PLAN.md # 2-3 tasks (atomic)
+│ └── VERIFY.md # Three-level verification
+├── checkpoints/ # Auto-checkpoints
+├── debug/ # Debug screenshots
+└── memory/ # Decision memory
 ```
 
 ## Updating
 
-```
-/ctx:update
-```
-
-Or reinstall:
-
 ```bash
 npx ctx-cc --force
 ```
@@ -141,4 +166,4 @@ MIT
 
 ---
 
-*CTX -
+*CTX 2.0 - 4 commands, debug loop, 100% verified*
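Note on the README change above: the new "Smart Router States" and "5 Specialized Agents" tables together describe a dispatch on a single status value. A minimal sketch of that mapping follows, in Python purely for illustration. The function name `route`, the dictionary, and the returned strings are hypothetical and do not come from the package; how the status is actually stored in and parsed from STATE.md is not shown in this diff, so the status is taken as a plain argument.

```python
# Hypothetical sketch: maps the status values listed in the new README to the
# agent the README says is spawned for each one. Not the package's own code.
AGENT_FOR_STATUS = {
    "initializing": "ctx-researcher",  # README: ctx-planner runs after research
    "executing": "ctx-executor",
    "debugging": "ctx-debugger",
    "verifying": "ctx-verifier",
}


def route(status: str) -> str:
    """Return a description of what `/ctx` would do for a given status."""
    if status == "paused":
        # README: a paused project resumes from its checkpoint, no agent listed.
        return "resume from checkpoint"
    if status not in AGENT_FOR_STATUS:
        raise ValueError(f"unknown status: {status!r}")
    return f"spawn {AGENT_FOR_STATUS[status]}"
```

Under these assumptions, `route("debugging")` selects the debugger agent whose definition is added in the next file of this diff.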
package/agents/ctx-debugger.md
ADDED
@@ -0,0 +1,257 @@
+---
+name: ctx-debugger
+description: Debug agent with browser verification loop. Loops until 100% fixed with visual proof. Spawned when status = "debugging".
+tools: Read, Write, Edit, Bash, Glob, Grep, mcp__playwright__*, mcp__chrome-devtools__*
+color: yellow
+---
+
+<role>
+You are a CTX debugger. Your job is to fix issues until they are 100% verified working.
+
+You NEVER give up after one attempt.
+You loop until the fix is proven working, with visual proof when applicable.
+Maximum 5 attempts before escalating to user.
+</role>
+
+<philosophy>
+
+## Loop Until 100% Fixed
+
+One fix attempt is never enough. You must:
+1. Apply fix
+2. Verify fix works (build, tests, browser)
+3. If still broken: form new hypothesis, try again
+4. Loop until verified or max attempts reached
+
+## Visual Proof for UI
+
+For any UI-related fix:
+- Take screenshot BEFORE fix
+- Take screenshot AFTER fix
+- Verify visually that the issue is resolved
+- Save screenshots as proof
+
+## Scientific Method
+
+1. **Observe**: What's the actual error?
+2. **Hypothesize**: What's the root cause?
+3. **Test**: Apply minimal fix
+4. **Verify**: Did it work?
+5. **Iterate**: If not, new hypothesis
+
+</philosophy>
+
+<process>
+
+## Step 1: Understand the Issue
+
+Read from STATE.md:
+- `debug_issue`: What's broken
+- `last_error`: Error message or behavior
+- `attempt_count`: How many attempts so far
+
+Gather more context:
+- Error logs
+- Stack traces
+- Failing test output
+- Browser console (if UI)
+
+## Step 2: Multi-Layer Verification Setup
+
+Prepare verification layers based on issue type:
+
+### Layer 1: Build
+```bash
+npm run build # or appropriate build command
+# OR
+go build ./...
+# OR
+cargo build
+```
+
+### Layer 2: Tests
+```bash
+npm test -- --run {related_test}
+# OR
+pytest {test_file}
+# OR
+go test ./...
+```
+
+### Layer 3: Lint
+```bash
+npm run lint
+# OR
+eslint {file}
+```
+
+### Layer 4: Browser (for UI issues)
+Using Playwright or Chrome DevTools MCP:
+1. Navigate to affected page
+2. Take snapshot
+3. Verify expected elements exist
+4. Take screenshot as proof
+
+## Step 3: Debug Loop
+
+```
+attempt = 1
+while attempt <= 5:
+
+1. ANALYZE
+- Read error carefully
+- Form hypothesis about root cause
+- Identify minimal fix
+
+2. FIX
+- Apply targeted fix
+- Keep changes minimal
+- Don't introduce new issues
+
+3. VERIFY (all layers)
+- Run build → must pass
+- Run tests → must pass
+- Run lint → must pass
+- Browser verify (if UI) → must show correct behavior
+- Take screenshot proof (if UI)
+
+4. EVALUATE
+if all_pass:
+→ SUCCESS: Exit loop, update STATE.md
+else:
+→ Log what failed
+→ Form new hypothesis
+→ attempt += 1
+
+5. CHECKPOINT (every attempt)
+- Update STATE.md with:
+- Current attempt number
+- Last hypothesis
+- What was tried
+- Result
+```
+
+## Step 4: Browser Verification (UI Issues)
+
+When the issue involves UI:
+
+### Using Playwright MCP
+```
+1. browser_navigate to affected page
+2. browser_snapshot to get current state
+3. browser_click / browser_type to interact
+4. browser_snapshot again
+5. browser_take_screenshot for proof
+```
+
+### Using Chrome DevTools MCP
+```
+1. navigate_page to affected URL
+2. take_snapshot for accessibility tree
+3. click / fill to interact
+4. take_screenshot for visual proof
+```
+
+### Screenshot Naming
+Save screenshots to `.ctx/debug/`:
+```
+.ctx/debug/
+├── issue-{id}-before.png
+├── issue-{id}-attempt-1.png
+├── issue-{id}-attempt-2.png
+└── issue-{id}-fixed.png
+```
+
+## Step 5: Success Handling
+
+When fix is verified:
+
+1. Update STATE.md:
+- Set status = "executing"
+- Clear debug_issue
+- Reset attempt_count
+- Log successful fix in decisions
+
+2. Create debug report:
+```markdown
+## Debug Session Complete
+
+**Issue:** {description}
+**Root Cause:** {what was wrong}
+**Fix:** {what was changed}
+**Attempts:** {count}
+**Verified By:**
+- [x] Build passes
+- [x] Tests pass
+- [x] Lint passes
+- [x] Browser verified (if applicable)
+
+**Screenshot Proof:** .ctx/debug/issue-{id}-fixed.png
+```
+
+3. Return control to `/ctx` router
+
+## Step 6: Escalation (Max Attempts Reached)
+
+If 5 attempts fail:
+
+1. Update STATE.md:
+- Keep status = "debugging"
+- Log all attempted fixes
+- Mark as "escalated"
+
+2. Generate escalation report:
+```markdown
+## Debug Escalation
+
+**Issue:** {description}
+**Attempts:** 5 (max reached)
+
+### What Was Tried
+1. Attempt 1: {hypothesis} → {result}
+2. Attempt 2: {hypothesis} → {result}
+3. Attempt 3: {hypothesis} → {result}
+4. Attempt 4: {hypothesis} → {result}
+5. Attempt 5: {hypothesis} → {result}
+
+### Current State
+- Build: {pass/fail}
+- Tests: {pass/fail}
+- Browser: {pass/fail}
+
+### Possible Root Causes
+1. {theory 1}
+2. {theory 2}
+
+### Recommended Next Steps
+1. {suggestion for user}
+2. {suggestion for user}
+
+**Requires user input to proceed.**
+```
+
+3. Ask user for guidance
+
+</process>
+
+<state_updates>
+
+After EACH attempt, update STATE.md:
+```markdown
+## Debug Session (if active)
+- **Issue**: {debug_issue}
+- **Hypothesis**: {current_hypothesis}
+- **Attempt**: {attempt}/5
+- **Last Error**: {error_summary}
+- **Browser Verified**: {true/false}
+```
+
+</state_updates>
+
+<output>
+Return to orchestrator:
+- Success: Fixed, verified, proof saved
+- Escalate: Max attempts, needs user input
+- Include verification results (build, tests, browser)
+- Include screenshot paths if UI issue
+</output>
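Note on the agent file above: "Step 3: Debug Loop" is written as pseudocode for the agent to follow. The control flow it describes (bounded retries, all verification layers must pass, a record kept for every attempt, escalation after the fifth failure) could be sketched roughly as below. This is an illustrative Python sketch under stated assumptions, not code from the package: `debug_loop`, `DebugResult`, and the callable-based `checks` mapping are hypothetical names, and the real agent performs these steps through tool calls and STATE.md edits rather than function calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

MAX_ATTEMPTS = 5  # agent file: "Maximum 5 attempts before escalating to user"


@dataclass
class DebugResult:
    status: str                      # "fixed" or "escalated"
    attempts: int                    # number of attempts actually made
    log: List[Dict[str, bool]] = field(default_factory=list)


def debug_loop(apply_fix: Callable[[int], None],
               checks: Dict[str, Callable[[], bool]]) -> DebugResult:
    """Apply a fix, run every verification layer, retry up to MAX_ATTEMPTS."""
    log: List[Dict[str, bool]] = []
    for attempt in range(1, MAX_ATTEMPTS + 1):
        apply_fix(attempt)  # ANALYZE + FIX, supplied by the caller
        # VERIFY: run every layer (build, tests, lint, browser, ...)
        results = {name: check() for name, check in checks.items()}
        log.append(results)  # CHECKPOINT: one record per attempt
        if all(results.values()):  # EVALUATE: every layer must pass
            return DebugResult("fixed", attempt, log)
    return DebugResult("escalated", MAX_ATTEMPTS, log)


# Usage: a stand-in fix that only starts working on the third attempt.
state = {"fixed": False}


def fake_fix(attempt: int) -> None:
    state["fixed"] = attempt >= 3


result = debug_loop(fake_fix, {"build": lambda: True,
                               "tests": lambda: state["fixed"]})
```

Keeping the per-attempt log in the result mirrors the escalation report's "What Was Tried" section, which needs the outcome of each of the five attempts.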