ctx-cc 2.0.0 → 2.2.0

@@ -1,19 +1,20 @@
  ---
  name: ctx-verifier
- description: Verification agent for CTX 2.0. Three-level verification + anti-pattern scan. Spawned when status = "verifying".
- tools: Read, Glob, Grep, Bash, mcp__playwright__*, mcp__chrome-devtools__*
+ description: Verification agent for CTX 2.1. Verifies story against PRD acceptance criteria. Updates passes flag on success. Spawned when status = "verifying".
+ tools: Read, Write, Glob, Grep, Bash, mcp__playwright__*, mcp__chrome-devtools__*
  color: red
  ---
 
  <role>
- You are a CTX 2.0 verifier. Your job is to verify phase completion.
+ You are a CTX 2.1 verifier. Your job is to verify story completion against PRD acceptance criteria.
 
- You check three levels:
- 1. **Exists** - Is the file on disk?
- 2. **Substantive** - Is it real code, not a stub?
- 3. **Wired** - Is it imported and used?
+ You verify:
+ 1. **Acceptance Criteria** - Each criterion from PRD.json satisfied?
+ 2. **Three-Level Check** - Exists, Substantive, Wired
+ 3. **Anti-Patterns** - No TODO, stubs, or broken code
 
- Plus anti-pattern scanning and browser verification for UI.
+ On success: Set `story.passes = true` in PRD.json
+ On failure: List fixes needed, keep `passes = false`
  </role>
 
  <philosophy>
@@ -47,11 +48,31 @@ If the phase involves UI, verify it visually:
  ## 1. Load Context
 
  Read:
+ - `.ctx/PRD.json` - Current story and acceptance criteria
  - `.ctx/STATE.md` - Current state
- - `.ctx/phases/{phase-id}/PLAN.md` - Verification criteria
- - Original goal
+ - `.ctx/phases/{story_id}/PLAN.md` - Task-to-criteria mapping
 
- ## 2. Three-Level Verification
+ Extract:
+ - Story ID and title
+ - `acceptanceCriteria` array (this is what you verify)
+ - Verification matrix from PLAN.md
+
+ ## 2. Verify Acceptance Criteria
+
+ For each criterion in `story.acceptanceCriteria`:
+
+ ```
+ Criterion: "User can log in with email"
+ How to verify: (from PLAN.md verification matrix)
+ - Test: npm test auth.test.ts
+ - Browser: Navigate to /login, enter email, submit
+ Result: PASS / FAIL
+ Evidence: {what proved it}
+ ```
+
+ This is the PRIMARY verification. Story passes only if ALL criteria pass.
+
+ ## 3. Three-Level Verification
 
  For each artifact:
 
@@ -90,7 +111,7 @@ Trace from entry point to new code.
  Pass: Code is imported and called
  Fail: Orphan code
 
- ## 3. Anti-Pattern Scan
+ ## 4. Anti-Pattern Scan
 
  | Pattern | Search | Severity |
  |---------|--------|----------|
@@ -100,28 +121,51 @@ Fail: Orphan code
  | Placeholder returns | `return null`, `return {}` | Error |
  | Debug code | `console.log`, `debugger` | Warning |
 
- ## 4. Browser Verification (UI)
+ ## 5. Browser Verification (UI)
 
- If phase involves UI:
+ If phase involves UI, use credentials from `.ctx/.env`:
+
+ ### Load Credentials
+ ```
+ Read .ctx/.env:
+ - APP_URL → where to navigate
+ - TEST_USER_EMAIL / TEST_USER_PASSWORD → for login
+ - ADMIN_EMAIL / ADMIN_PASSWORD → for admin tests
+ ```
 
  ### Using Playwright MCP
  ```
- browser_navigate({url})
- browser_snapshot()
- # Verify expected elements exist in snapshot
- browser_take_screenshot({filename})
+ 1. browser_navigate to APP_URL
+ 2. If login required:
+ - browser_type TEST_USER_EMAIL into email field
+ - browser_type TEST_USER_PASSWORD into password field
+ - browser_click submit
+ 3. Navigate to page being verified
+ 4. browser_snapshot to check elements
+ 5. browser_take_screenshot for proof
  ```
 
  ### Using Chrome DevTools MCP
  ```
- navigate_page({url})
- take_snapshot()
- take_screenshot({path})
+ 1. navigate_page to APP_URL
+ 2. If login required:
+ - fill email with TEST_USER_EMAIL
+ - fill password with TEST_USER_PASSWORD
+ - click submit
+ 3. Navigate to page being verified
+ 4. take_snapshot
+ 5. take_screenshot for proof
  ```
 
- Save screenshots to `.ctx/verify/phase-{id}-verified.png`
+ ### Credential Security
+ - NEVER echo credentials in output
+ - NEVER hardcode credentials
+ - Use ONLY from .ctx/.env file
+ - Credentials enable AUTONOMOUS verification
+
+ Save screenshots to `.ctx/verify/story-{id}-verified.png`
 
- ## 5. Goal Gap Analysis
+ ## 6. Goal Gap Analysis
 
  Compare goal vs implementation:
  1. What was the original goal?
@@ -129,17 +173,24 @@ Compare goal vs implementation:
  3. What's missing (gaps)?
  4. What's extra (drift)?
 
- ## 6. Generate VERIFY.md
+ ## 7. Generate VERIFY.md
 
- Write `.ctx/phases/{phase-id}/VERIFY.md`:
+ Write `.ctx/phases/{story_id}/VERIFY.md`:
 
  ```markdown
  # Verification Report
 
- **Phase:** {id}
- **Goal:** {original goal}
+ **Story:** {story_id} - {story_title}
  **Date:** {timestamp}
 
+ ## Acceptance Criteria
+
+ | Criterion | Status | Evidence |
+ |-----------|--------|----------|
+ | {criterion_1} | ✓ PASS | {what proved it} |
+ | {criterion_2} | ✓ PASS | {what proved it} |
+ | {criterion_3} | ✗ FAIL | {why it failed} |
+
  ## Three-Level Results
 
  | Artifact | Exists | Substantive | Wired | Status |
@@ -147,9 +198,6 @@ Write `.ctx/phases/{phase-id}/VERIFY.md`:
  | {file1} | ✓ | ✓ | ✓ | PASS |
  | {file2} | ✓ | ✓ | ✗ | FAIL |
 
- ### Failures
- {details of each failure}
-
  ## Anti-Pattern Scan
 
  | Pattern | Count | Location | Severity |
@@ -159,34 +207,44 @@ Write `.ctx/phases/{phase-id}/VERIFY.md`:
  ## Browser Verification
 
  - URL: {url tested}
- - Elements: {verified}
- - Screenshot: .ctx/verify/phase-{id}.png
+ - Screenshot: .ctx/verify/story-{id}.png
  - Status: PASS/FAIL
 
- ## Goal Gap
+ ## Overall: {PASS / FAIL}
 
- **Built:** {what was completed}
- **Gaps:** {what's missing}
- **Drift:** {what was built but not requested}
+ {If FAIL: list required fixes with criterion mapping}
+ {If PASS: story verified}
+ ```
 
- ## Overall: {PASS / FAIL}
+ ## 8. Update PRD.json
 
- {If FAIL: list required fixes}
- {If PASS: ready for next phase or ship}
+ **If ALL criteria PASS:**
+ ```json
+ {
+ "stories[story_id].passes": true,
+ "stories[story_id].verifiedAt": "{ISO8601 timestamp}",
+ "metadata.passedStories": {increment by 1},
+ "metadata.currentStory": "{next story where passes=false, or null if all done}"
+ }
  ```
 
- ## 7. Update STATE.md
+ **If ANY criterion FAILS:**
+ - Keep `passes: false`
+ - Add failure details to `stories[story_id].notes`
+
+ ## 9. Update STATE.md
 
  Based on results:
 
  **If PASS:**
- - Set status = "executing" (for next phase)
- - Or status = "complete" (if last phase)
+ - Set status = "initializing" (for next story)
+ - Update current story to next unpassed
+ - Update PRD progress
 
  **If FAIL:**
- - Create fix tasks
- - Set status = "executing"
- - Loop back to execute fixes
+ - Create fix tasks mapped to failing criteria
+ - Set status = "debugging" or "executing"
+ - Keep current story
 
  </process>
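
Reviewer sketch (not part of the package): the "Update PRD.json" step added above is written as pseudo-JSON with dotted paths, so it may help to see one way the same update could look against the `PRD.json` shape shown in `help.md`. The function name `markStoryPassed`, its arguments, and the second story `S002` are hypothetical; the field names (`passes`, `verifiedAt`, `passedStories`, `currentStory`) are taken from the diff. The guard against double counting is an assumption of this sketch, since the diff only says "increment by 1".

```javascript
// Sketch only: applies the "If ALL criteria PASS" update to an in-memory PRD
// object. markStoryPassed is a hypothetical helper, not a CTX API.
function markStoryPassed(prd, storyId, now = new Date()) {
  const story = prd.stories.find((s) => s.id === storyId);
  if (!story) {
    throw new Error(`Unknown story: ${storyId}`);
  }

  // Count the story only once, even if verification is re-run (assumption).
  if (!story.passes) {
    story.passes = true;
    prd.metadata.passedStories += 1;
  }
  story.verifiedAt = now.toISOString();

  // Next story where passes is still false, or null when all stories passed.
  const next = prd.stories.find((s) => !s.passes);
  prd.metadata.currentStory = next ? next.id : null;
  return prd;
}

// Example data: S002 is made up for illustration.
const prd = {
  stories: [
    { id: "S001", title: "User login", acceptanceCriteria: ["User can log in with email"], passes: false },
    { id: "S002", title: "User logout", acceptanceCriteria: ["User can log out"], passes: false },
  ],
  metadata: { currentStory: "S001", passedStories: 0, totalStories: 2 },
};

markStoryPassed(prd, "S001", new Date("2025-01-01T00:00:00Z"));
console.log(prd.metadata);
```

On the failure path the diff keeps `passes: false` and appends details to `stories[story_id].notes`, so a real implementation would leave `metadata` untouched in that case.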
 
package/bin/ctx.js CHANGED
@@ -19,9 +19,9 @@ if (options.help) {
  ╚██████╗ ██║ ██╔╝ ██╗
  ╚═════╝ ╚═╝ ╚═╝ ╚═╝\x1b[0m
 
- \x1b[1mCTX 2.0 - Continuous Task eXecution\x1b[0m
- Smart workflow orchestration for Claude Code.
- 4 commands. Debug loop. 100% verified.
+ \x1b[1mCTX 2.2 - Continuous Task eXecution\x1b[0m
+ PRD-driven workflow orchestration for Claude Code.
+ 8 commands. Story-verified. Debug loop.
 
  \x1b[1mUsage:\x1b[0m
  npx ctx-cc [options]
package/commands/help.md CHANGED
@@ -4,124 +4,134 @@ description: Show CTX commands and usage guide
  ---
 
  <objective>
- Display the CTX 2.0 command reference.
+ Display the CTX 2.2 command reference.
 
  Output ONLY the reference content below. Do NOT add project-specific analysis.
  </objective>
 
  <reference>
- # CTX 2.0 Command Reference
+ # CTX 2.2 Command Reference
 
  **CTX** (Continuous Task eXecution) - Smart workflow orchestration for Claude Code.
- 4 commands. One smart router. Debug loop until 100% fixed.
+ 8 commands. PRD-driven. Smart routing. Debug loop until 100% fixed.
 
  ## Quick Start
 
  ```
- 1. /ctx init Initialize project with STATE.md
- 2. /ctx Smart router - does the right thing
- 3. (repeat until done)
+ 1. /ctx init Initialize project + generate PRD.json
+ 2. /ctx Smart router - does the right thing
+ 3. /ctx status Check progress (read-only)
  4. /ctx pause Checkpoint when needed
  ```
 
- That's it. `/ctx` reads STATE.md and knows what to do next.
+ ## What's New in 2.2
 
- ## The 4 Commands
+ - **Front-Loaded Approach** - Gather ALL info upfront, execute autonomously
+ - **PRD.json** - Requirements contract with user stories
+ - **Secure Credentials** - `.ctx/.env` for test credentials (gitignored)
+ - **Acceptance Criteria** - Each story has verifiable criteria
+ - **`passes` Flag** - Auto-tracks story completion
+ - **Story-Driven Workflow** - Plan → Execute → Verify → Next Story
 
- ### `/ctx`
- **The smart router.** Reads STATE.md, does the right action:
+ ## Front-Loaded Philosophy
 
- | State | What happens |
- |-------|--------------|
- | initializing | Research + Plan (ArguSeek + ChunkHound) |
- | executing | Execute current task |
- | debugging | Debug loop until 100% fixed |
- | verifying | Three-level verification |
- | paused | Resume from checkpoint |
-
- Just run `/ctx` and it figures out what's needed.
-
- ### `/ctx init`
- Initialize a new project. Creates `.ctx/STATE.md`.
-
- ### `/ctx quick "task"`
- Quick task bypass. Skip the workflow for small fixes.
  ```
- /ctx quick "fix the button color"
- /ctx quick "add console.log for debugging"
+ /ctx init gathers:
+ ├── Requirements → PRD.json stories
+ ├── Context → problem, target user, success criteria
+ ├── Credentials → .ctx/.env (gitignored)
+ └── Constitution → rules for autonomous decisions
+
+ Then /ctx runs autonomously:
+ ├── Only interrupts for architecture decisions
+ ├── Uses stored credentials for browser testing
+ └── Loops until all stories pass
  ```
 
- ### `/ctx pause`
- Create checkpoint. Safe to close session.
- Resume later with `/ctx` - auto-restores in ~2.5k tokens.
+ ## The 8 Commands
 
- ## Debug Loop (New in 2.0)
+ ### Smart (Auto-routing)
 
- When something breaks, CTX enters debug mode:
+ | Command | Purpose |
+ |---------|---------|
+ | `/ctx` | **Smart router** - reads STATE.md, does the right thing |
+ | `/ctx init` | Initialize project with STATE.md |
 
- ```
- Loop (max 5 attempts):
- 1. Analyze error
- 2. Form hypothesis
- 3. Apply fix
- 4. Verify (build + tests + browser)
- 5. If fixed: done
- If not: new hypothesis, try again
- ```
+ ### Inspect (Read-only)
 
- **Browser verification for UI:**
- - Navigates to affected page
- - Checks elements exist
- - Takes screenshot proof
- - Saves to `.ctx/debug/`
+ | Command | Purpose |
+ |---------|---------|
+ | `/ctx status` | See current state without triggering action |
 
- ## Architecture
+ ### Control (Override smart router)
 
- ### STATE.md - Single Source of Truth
- ~100 lines. Always accurate. Always read first.
+ | Command | Purpose |
+ |---------|---------|
+ | `/ctx plan [goal]` | Force research + planning |
+ | `/ctx verify` | Force three-level verification |
+ | `/ctx quick "task"` | Quick task bypass (skip workflow) |
 
- ```markdown
- ## Project
- - Name, Stack, Status
+ ### Session
 
- ## Current Phase
- - Goal, Progress
+ | Command | Purpose |
+ |---------|---------|
+ | `/ctx pause` | Checkpoint for session resume |
 
- ## Active Task
- - What, Status, Attempts
+ ### Phase Management
 
- ## Debug Session (if active)
- - Issue, Hypothesis, Attempt count
+ | Command | Purpose |
+ |---------|---------|
+ | `/ctx phase list` | Show all phases with status |
+ | `/ctx phase add "goal"` | Add new phase to roadmap |
+ | `/ctx phase next` | Complete current, move to next |
+ | `/ctx phase skip` | Skip current phase |
 
- ## Context Budget
- - Usage %, Quality level
- ```
+ ---
 
- ### 5 Specialized Agents
+ ## Smart Router States
 
- | Agent | When spawned |
+ When you run `/ctx`, it reads STATE.md and PRD.json, auto-routes:
+
+ | State | What happens |
  |-------|--------------|
- | ctx-researcher | status = initializing |
- | ctx-planner | after research |
- | ctx-executor | status = executing |
- | ctx-debugger | status = debugging |
- | ctx-verifier | status = verifying |
+ | initializing | Research + Plan for current story |
+ | executing | Execute tasks for current story |
+ | debugging | **Debug loop until 100% fixed** |
+ | verifying | Verify acceptance criteria → mark story as passed |
+ | paused | Resume from checkpoint |
+
+ **Story Flow:**
+ ```
+ S001 → plan → execute → verify ✓ → S002 → plan → execute → verify ✓ → ...
+ ```
+
+ ## Debug Loop
 
- ### Directory Structure
+ When something breaks, CTX enters debug mode:
 
  ```
- .ctx/
- ├── STATE.md # Living digest - ALWAYS read first
- ├── phases/{id}/ # Phase data
- │ ├── RESEARCH.md # ArguSeek + ChunkHound results
- │ ├── PLAN.md # 2-3 tasks (atomic)
- │ └── VERIFY.md # Three-level verification
- ├── checkpoints/ # Auto-checkpoints
- ├── debug/ # Debug screenshots
- └── memory/ # Decision memory
+ Loop (max 5 attempts):
+ 1. Analyze error
+ 2. Form hypothesis
+ 3. Apply fix
+ 4. Verify (build + tests + browser)
+ 5. If fixed → done
+ If not → new hypothesis, try again
  ```
 
- ## Key Features
+ **Browser verification for UI:**
+ - Playwright or Chrome DevTools
+ - Screenshots saved to `.ctx/debug/`
+
+ ## Three-Level Verification
+
+ | Level | Question | Check |
+ |-------|----------|-------|
+ | Exists | File on disk? | Glob |
+ | Substantive | Real code, not stub? | No TODOs, no placeholders |
+ | Wired | Imported and used? | Trace imports |
+
+ ## Key Design Principles
 
  ### Atomic Planning (2-3 Tasks Max)
  Prevents context degradation. Big work = multiple phases.
@@ -134,11 +144,6 @@ Prevents context degradation. Big work = multiple phases.
  | Blocking issue | Auto-fix |
  | Architecture decision | Ask user |
 
- ### Three-Level Verification
- 1. **Exists** - File on disk?
- 2. **Substantive** - Real code, not stub?
- 3. **Wired** - Imported and used?
-
  ### Context Budget
  | Usage | Quality | Action |
  |-------|---------|--------|
@@ -146,27 +151,58 @@ Prevents context degradation. Big work = multiple phases.
  | 30-50% | Good | Continue |
  | 50%+ | Degrading | Auto-checkpoint |
 
+ ## 5 Specialized Agents
+
+ | Agent | When spawned |
+ |-------|--------------|
+ | ctx-researcher | During planning (ArguSeek + ChunkHound) |
+ | ctx-planner | After research |
+ | ctx-executor | During execution |
+ | ctx-debugger | When debugging |
+ | ctx-verifier | During verification |
+
  ## Integrations
 
- ### ArguSeek (Web Research)
- Auto-runs during planning:
- - Best practices
- - Security considerations
- - Performance patterns
+ - **ArguSeek**: Web research during planning
+ - **ChunkHound**: Semantic code search (`uv tool install chunkhound`)
+ - **Playwright/DevTools**: Browser verification for UI
 
- ### ChunkHound (Semantic Search)
- Auto-runs during planning:
- - Find relevant code
- - Detect patterns
- - Map entry points
+ ## Directory Structure
 
- Install: `uv tool install chunkhound`
+ ```
+ .ctx/
+ ├── STATE.md # Living digest - execution state
+ ├── PRD.json # Requirements contract - stories + criteria
+ ├── .env # Test credentials (GITIGNORED)
+ ├── .gitignore # Protects secrets
+ ├── phases/{story_id}/ # Per-story data
+ │ ├── RESEARCH.md # ArguSeek + ChunkHound results
+ │ ├── PLAN.md # Tasks mapped to acceptance criteria
+ │ └── VERIFY.md # Verification report
+ ├── checkpoints/ # Auto-checkpoints
+ ├── debug/ # Debug screenshots
+ └── verify/ # Verification screenshots
+ ```
 
- ### Browser Verification (Playwright/Chrome DevTools)
- Auto-runs during debugging and verification:
- - Navigate to pages
- - Check elements
- - Screenshot proof
+ ## PRD.json Structure
+
+ ```json
+ {
+ "stories": [
+ {
+ "id": "S001",
+ "title": "User login",
+ "acceptanceCriteria": ["User can log in with email", "..."],
+ "passes": false
+ }
+ ],
+ "metadata": {
+ "currentStory": "S001",
+ "passedStories": 0,
+ "totalStories": 5
+ }
+ }
+ ```
 
  ## Updating CTX
 
@@ -175,5 +211,5 @@ npx ctx-cc --force
  ```
 
  ---
- *CTX 2.0 - 4 commands, debug loop, 100% verified*
+ *CTX 2.2 - PRD-driven, story-verified, debug loop until 100% fixed*
  </reference>
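
Reviewer sketch (not part of the package): the help text describes the debug loop only as a numbered list capped at 5 attempts. A minimal way to express that control flow, assuming the "apply fix" and "verify" steps are supplied as callbacks, might look like the following. `runDebugLoop`, `applyFix`, and `verify` are hypothetical names, not CTX APIs, and the callbacks are synchronous here purely to keep the example short.

```javascript
// Sketch only: bounded retry loop mirroring "Loop (max 5 attempts)".
// applyFix(error, history) covers steps 1-3 (analyze, hypothesize, fix);
// verify() covers step 4 (build + tests + browser in the help text).
function runDebugLoop(error, applyFix, verify, maxAttempts = 5) {
  const history = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    // Earlier attempts are passed in so a new hypothesis can differ from them.
    const hypothesis = applyFix(error, history);
    const fixed = verify();
    history.push({ attempt, hypothesis, fixed });
    if (fixed) {
      // Step 5: fixed, so stop looping.
      return { fixed: true, attempts: attempt, history };
    }
  }
  // Attempt budget exhausted without a passing verification.
  return { fixed: false, attempts: maxAttempts, history };
}

// Usage with stub callbacks: verification succeeds on the third attempt.
let verifyCalls = 0;
const result = runDebugLoop(
  "TypeError: x is undefined",
  (_error, history) => `hypothesis ${history.length + 1}`,
  () => {
    verifyCalls += 1;
    return verifyCalls === 3;
  },
);
console.log(result.fixed, result.attempts);
```

What CTX does once the fifth attempt fails is not stated in this diff, so the sketch simply returns `fixed: false` and leaves escalation to the caller.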