ctx-cc 4.0.0 → 4.1.0

package/README.md CHANGED
@@ -1,589 +1,543 @@
1
1
  <div align="center">
2
2
 
3
- # CTX
4
-
5
- ### Continuous Task eXecution
3
+ ```
4
+ ██████╗████████╗██╗ ██╗
5
+ ██╔════╝╚══██╔══╝╚██╗██╔╝
6
+ ██║ ██║ ╚███╔╝
7
+ ██║ ██║ ██╔██╗
8
+ ╚██████╗ ██║ ██╔╝ ██╗
9
+ ╚═════╝ ╚═╝ ╚═╝ ╚═╝
10
+ ```
6
11
 
7
- **Intelligent workflow orchestration for Claude Code.**
12
+ **Intelligent workflow orchestration for Claude Code**
8
13
 
9
14
  [![npm version](https://img.shields.io/npm/v/ctx-cc.svg?style=flat-square)](https://www.npmjs.com/package/ctx-cc)
10
- [![npm downloads](https://img.shields.io/npm/dm/ctx-cc.svg?style=flat-square)](https://www.npmjs.com/package/ctx-cc)
11
15
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=flat-square)](https://opensource.org/licenses/MIT)
12
- [![GitHub stars](https://img.shields.io/github/stars/jufjuf/CTX?style=flat-square)](https://github.com/jufjuf/CTX/stargazers)
13
-
14
- <img src="./assets/terminal.png" alt="CTX Terminal" width="700">
16
+ [![Tests](https://img.shields.io/badge/tests-264%20passing-brightgreen.svg?style=flat-square)](#testing)
17
+ [![Zero deps](https://img.shields.io/badge/dependencies-0-brightgreen.svg?style=flat-square)](#)
15
18
 
16
- **Conversational-first. Just describe what you want — no commands to memorize.**
17
-
18
- AI that learns your preferences. Predictive planning. Self-healing deployments. 21 specialized agents.
19
-
20
- [Installation](#installation) · [Quick Start](#quick-start) · [New in 3.5](#new-in-35) · [Commands](#commands) · [Why CTX](#why-ctx) · [**Getting Started Guide**](./GETTING_STARTED.md)
19
+ ```bash
20
+ npx ctx-cc
21
+ ```
21
22
 
22
23
  </div>
23
24
 
24
25
  ---
25
26
 
26
- ## Installation
27
+ ## What is CTX?
27
28
 
28
- ```bash
29
- npx ctx-cc
30
- ```
29
+ CTX transforms Claude Code from a single AI assistant into a full development agency. One installer wires 25 specialized agents, 7 auto-discovered skills, and 3 deterministic enforcement hooks directly into Claude Code's native extension points — no runtime daemon, no wrapper, no proxy.
31
30
 
32
- That's it. CTX installs itself to your Claude Code environment.
31
+ **Three verticals out of the box:**
33
32
 
34
- ```bash
35
- # Options
36
- npx ctx-cc --global # Install to ~/.claude (default)
37
- npx ctx-cc --project # Install to .claude in current directory
38
- npx ctx-cc --force # Overwrite existing installation
39
- ```
33
+ | Vertical | Coverage |
34
+ |----------|----------|
35
+ | Software Development | Phase-based lifecycle, autonomous execution, persistent debug, review gates |
36
+ | Agency-Grade Design | Figma MCP-first workflow, W3C DTCG tokens, pixel-perfect QA, WCAG 2.2 AA |
37
+ | Machine Learning | Experiment tracking, model registry, conformal prediction, drift detection |
38
+
39
+ **Key value propositions:**
40
+
41
+ - **Phase-based lifecycle** — `init → plan → execute → verify → complete` with state tracked in `.ctx/STATE.json`
42
+ - **Autonomous execution with review gates** — three-stage review: spec compliance, code quality, and optional cross-model adversarial review via OpenAI Codex
43
+ - **Figma MCP-first design workflow** — tokens sync from Figma, visual QA measures pixels numerically
44
+ - **ML experiment-driven development** — hypothesis tracking, XGBoost+MAPIE patterns, KS drift, Digital Twin workflows
45
+ - **Zero dependencies** — installs into Claude Code's native extension points; nothing runs outside Claude
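The phase-based lifecycle listed above can be pictured as a small state machine. The sketch below is illustrative only: the phase order comes from the README, while the state shape (`phase`, `history`) and the `advancePhase` helper are assumptions and not the actual `.ctx/STATE.json` schema.

```javascript
// Hypothetical sketch: phase names follow the README's lifecycle; the
// { phase, history } shape is assumed, not CTX's real state format.
const PHASES = ["init", "plan", "execute", "verify", "complete"];

function advancePhase(state) {
  const index = PHASES.indexOf(state.phase);
  if (index === -1) throw new Error(`Unknown phase: ${state.phase}`);
  // The final phase has no successor, so the state is returned unchanged.
  if (index === PHASES.length - 1) return state;
  return {
    ...state,
    phase: PHASES[index + 1],
    history: [...(state.history ?? []), state.phase],
  };
}

let state = { phase: "init", history: [] };
state = advancePhase(state);
console.log(state.phase, state.history);
```

Returning a new object rather than mutating the old one keeps each transition easy to log or persist, which fits the idea of state tracked in a file.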
40
46
 
41
47
  ---
42
48
 
43
49
  ## Quick Start
44
50
 
45
- **Just talk to CTX.** No commands to memorize:
46
-
47
- ```
48
- "I want to build a todo app" → CTX sets up your project
49
- "Fix the login bug" → CTX starts debugging
50
- "Is my app accessible?" → CTX runs accessibility QA
51
- "What should I do next?" → CTX shows status + recommendation
51
+ ```bash
52
+ npx ctx-cc # Install globally (~/.claude)
53
+ npx ctx-cc --project # Install for current project only (.claude/)
54
+ npx ctx-cc --force # Overwrite existing installation
52
55
  ```
53
56
 
54
- Or use commands directly:
55
- ```
56
- 1. /ctx init Gather requirements + credentials + design context
57
- 2. /ctx map Build repository map (existing codebases)
58
- 3. /ctx Autonomous execution with minimal interruption
59
- ```
57
+ Then inside Claude Code:
60
58
 
61
- **The Flow:**
62
59
  ```
63
- Tell CTX what you want → CTX figures out the rest → Delivered!
60
+ /ctx Smart router reads state, does the right thing
61
+ /ctx:help Full command reference
62
+ /ctx:init Initialize project (PRD + STATE + config)
64
63
  ```
65
64
 
66
65
  ---
67
66
 
68
- ## New in 3.5
69
-
70
- ### Complete Redesign
71
- v3.5 is a ground-up rewrite focused on reliability over feature count:
72
- - **Unified version** across all 21 agents and commands
73
- - **Trimmed config** — removed 260 lines of settings for unimplemented features
74
- - **Single router** — eliminated duplicate routing logic that caused agent spawn failures
75
- - **GSD-proven architecture** — rebuilt on patterns validated in production
67
+ ## Architecture
76
68
 
77
- ### Conversational-First Routing
78
- **No commands to memorize.** CTX understands natural language from your first prompt:
69
+ CTX uses a **thin installer / fat Claude Code** architecture. The CLI's only job is to copy files into the right directories. All orchestration happens inside Claude Code via the Agent tool.
79
70
 
80
- | You Say | CTX Does |
81
- |---------|----------|
82
- | "I want to build a todo app" | Sets up project, researches best practices, creates plan |
83
- | "Fix the login bug" | Analyzes codebase, starts debugging |
84
- | "Is my app accessible?" | Runs WCAG 2.1 AA accessibility audit |
85
- | "Test everything" | Crawls every page, clicks every button |
86
- | "What's next?" | Shows status and recommended action |
87
-
88
- Commands still work as shortcuts for power users.
89
-
90
- ### Full System QA
91
- Crawl every page, click every button, find all issues:
92
-
93
- ```bash
94
- /ctx qa # Full system QA (WCAG 2.1 AA)
95
- /ctx qa --a11y-only # Accessibility audit only
96
- /ctx qa --visual-only # Visual regression (3 viewports)
97
- /ctx qa --resume # Resume interrupted session
98
71
  ```
99
-
100
- Features:
101
- - **WCAG 2.1 AA compliance** - Touch targets, alt text, labels, contrast, keyboard
102
- - **Multi-viewport testing** - Mobile (375px), Tablet (768px), Desktop (1280px)
103
- - **Performance monitoring** - Slow requests, large assets
104
- - **Trace capture** - Screenshots and logs for every failed interaction
105
- - **Fix tasks** - Issues organized by section, ready for execution
106
-
107
- ### Persistent Debugging
108
- Debug sessions survive context resets, `/clear`, and days between attempts:
109
-
110
- ```bash
111
- /ctx debug "checkout fails" # Start debug session
112
- /ctx debug --resume # Resume where you left off
113
- /ctx debug --list # Show all sessions
72
+ ~/.claude/
73
+ ├── agents/ 25 subagents (invoked via Agent tool)
74
+ ├── skills/ 7 skills (auto-discovered by Claude from descriptions)
75
+ ├── commands/ 26 slash commands (/ctx:*)
76
+ ├── hooks/ 3 hook scripts (deterministic enforcement)
77
+ └── settings.json hooks registered
114
78
  ```
115
79
 
116
- - Scientific method: observe, hypothesize, test, analyze
117
- - Max 10 attempts before escalation with full report
118
- - Browser verification with stored credentials
119
- - Every hypothesis and result recorded in `.ctx/debug/sessions/`
80
+ **Key decisions:**
120
81
 
121
- ### Smart Context Handoff
122
- Seamless transitions at context limits:
82
+ - The CLI is installer-only. It does not run, proxy, or wrap Claude Code.
83
+ - Agents are Markdown files with native frontmatter (`model`, `maxTurns`, `description`). Claude reads these directly.
84
+ - Skills are auto-invoked by Claude Code when task descriptions match the skill's `WHEN:` triggers — no commands needed.
85
+ - Hooks are separate `.js` scripts registered in `settings.json`. They run deterministically on every tool call.
86
+ - `plugin.json` enables marketplace distribution.
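To make the frontmatter point concrete, here is a rough sketch of how the `model`, `maxTurns`, and `description` fields might be read from an agent file. The parser handles only flat `key: value` pairs (not full YAML), and the sample agent text is a made-up stand-in rather than a file shipped by ctx-cc.

```javascript
// Minimal frontmatter reader for illustration. It is not a YAML parser and
// the sample agent below is hypothetical.
function parseFrontmatter(markdown) {
  const match = markdown.match(/^---\r?\n([\s\S]*?)\r?\n---/);
  if (!match) return {};
  const fields = {};
  for (const line of match[1].split(/\r?\n/)) {
    const separator = line.indexOf(":");
    if (separator === -1) continue; // skip lines that are not key: value
    const key = line.slice(0, separator).trim();
    const value = line.slice(separator + 1).trim();
    // Store purely numeric values (such as maxTurns) as numbers.
    fields[key] = /^\d+$/.test(value) ? Number(value) : value;
  }
  return fields;
}

const sampleAgent = [
  "---",
  "description: Implements tasks with git-native commits",
  "model: sonnet",
  "maxTurns: 50",
  "---",
  "Agent instructions go here.",
].join("\n");

console.log(parseFrontmatter(sampleAgent));
```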
123
87
 
124
- | Threshold | Action |
125
- |-----------|--------|
126
- | 40% | Prepare handoff notes |
127
- | 50% | Write HANDOFF.md, warn |
128
- | 60% | Spawn fresh agent |
88
+ ---
129
89
 
130
- Zero information loss. Work continues automatically.
90
+ ## Agents (25)
131
91
 
132
- ### Pre-Commit Review
133
- Catches errors BEFORE they're committed:
134
- - Type errors, unresolved imports, circular dependencies
135
- - Security vulnerabilities, empty catch blocks
136
- - Blocks on critical issues, warns on medium
92
+ ### Software Development (21)
137
93
 
138
- ### Acceptance Criteria Auto-Generation
139
- AI suggests comprehensive criteria before implementation starts:
140
- ```
141
- Story: "Add user authentication"
94
+ #### Planning
142
95
 
143
- Suggested Criteria:
144
- ✓ User can register with email/password
145
- ✓ Invalid credentials show error
146
- ✓ Passwords hashed with bcrypt
147
- ✓ Session expires after 24h
148
- ✓ Brute force protection enabled
96
+ | Agent | Model | maxTurns | Purpose |
97
+ |-------|-------|----------|---------|
98
+ | ctx-planner | opus | 25 | Atomic plans (2–3 tasks), maps tasks to acceptance criteria |
99
+ | ctx-researcher | opus | 25 | ArguSeek web research + semantic code search before planning |
100
+ | ctx-criteria-suggester | sonnet | 25 | Auto-generates acceptance criteria before implementation |
101
+ | ctx-parallelizer | haiku | 15 | Identifies tasks that can run concurrently, saves total time |
102
+ | ctx-predictor | haiku | 15 | Analyzes patterns and suggests next features |
149
103
 
150
- [A] Accept all [B] See more [C] Edit
151
- ```
104
+ #### Execution
152
105
 
153
- ---
106
+ | Agent | Model | maxTurns | Purpose |
107
+ |-------|-------|----------|---------|
108
+ | ctx-executor | sonnet | 50 | Implements tasks with git-native commits per completed task |
109
+ | ctx-debugger | sonnet | 75 | Persistent debug loop: observe → hypothesize → test → verify |
154
110
 
155
- ## From 3.0
111
+ #### Review
156
112
 
157
- ### Repository Mapping (like Aider)
158
- ```bash
159
- /ctx map # Build token-optimized codebase map
160
- /ctx map --expand # Include call graph (8k tokens)
161
- /ctx map --refresh # Force full rebuild
162
- ```
113
+ | Agent | Model | maxTurns | Purpose |
114
+ |-------|-------|----------|---------|
115
+ | ctx-reviewer | sonnet | 25 | Pre-commit review: type errors, imports, security, empty catches |
116
+ | ctx-auditor | haiku | 15 | Background audit trail and compliance traceability |
117
+ | ctx-verifier | haiku | 15 | Three-level verification: exists, substantive, wired |
163
118
 
164
- Creates `REPO-MAP.md` with symbols, dependencies, and navigation hints.
119
+ #### Mapping
165
120
 
166
- ### Discussion Phase (like GSD)
167
- ```bash
168
- /ctx discuss S001 # Capture decisions BEFORE planning
169
- /ctx discuss --review # Review locked decisions
170
- ```
121
+ | Agent | Model | maxTurns | Purpose |
122
+ |-------|-------|----------|---------|
123
+ | ctx-mapper | haiku | 15 | Token-optimized repository map (REPO-MAP.md) |
124
+ | ctx-arch-mapper | haiku | 15 | Architecture patterns, data flow, module structure |
125
+ | ctx-tech-mapper | haiku | 15 | Languages, frameworks, dependencies |
126
+ | ctx-quality-mapper | haiku | 15 | Test coverage, lint status, type safety |
127
+ | ctx-concerns-mapper | haiku | 15 | Security vulnerabilities, tech debt, performance risks |
171
128
 
172
- Prevents mid-implementation questions by locking decisions in `CONTEXT.md`.
129
+ #### Knowledge & Coordination
173
130
 
174
- ### Model Profiles (Cost Optimization)
175
- ```bash
176
- /ctx profile # Show current profile
177
- /ctx profile quality # Best models (Opus everywhere)
178
- /ctx profile balanced # Smart mix (default)
179
- /ctx profile budget # Fast models (60% savings)
180
- ```
131
+ | Agent | Model | maxTurns | Purpose |
132
+ |-------|-------|----------|---------|
133
+ | ctx-discusser | sonnet | 25 | Captures implementation decisions before planning begins |
134
+ | ctx-learner | haiku | 15 | Observes patterns and decisions, builds project memory |
135
+ | ctx-handoff | haiku | 15 | Creates HANDOFF.md at context thresholds, zero info loss |
136
+ | ctx-team-coordinator | haiku | 15 | File locks, parallel work coordination, prevents conflicts |
181
137
 
182
- | Profile | Research | Execute | Verify | Cost |
183
- |---------|----------|---------|--------|------|
184
- | quality | Opus | Opus | Sonnet | 3x |
185
- | balanced | Opus | Sonnet | Haiku | 1x |
186
- | budget | Sonnet | Sonnet | Haiku | 0.4x |
138
+ #### Design & QA
187
139
 
188
- ### Git-Native Workflow
189
- Every completed task auto-commits:
190
- ```
191
- [CTX] Implement user login endpoint
140
+ | Agent | Model | maxTurns | Purpose |
141
+ |-------|-------|----------|---------|
142
+ | ctx-designer | sonnet | 50 | Brand establishment, component design, Figma MCP integration |
143
+ | ctx-qa | sonnet | 50 | Full system QA: crawls every page, clicks every button |
192
144
 
193
- Story: S001 - User Authentication
194
- Criteria: User can log in with credentials
195
- Files: src/auth/login.ts, src/routes/auth.ts
145
+ ### Machine Learning (4)
196
146
 
197
- Co-Authored-By: Claude <noreply@anthropic.com>
198
- ```
147
+ | Agent | Model | maxTurns | Purpose |
148
+ |-------|-------|----------|---------|
149
+ | ctx-ml-scientist | opus | 75 | Designs experiments, selects models, autonomous hypothesis loop |
150
+ | ctx-ml-engineer | sonnet | 50 | MLOps pipelines, inference envelope, model registry |
151
+ | ctx-ml-analyst | sonnet | 50 | EDA, dataset quality scoring, feature analysis |
152
+ | ctx-ml-reviewer | sonnet | 25 | ML code review: correctness, leakage, statistical validity |
199
153
 
200
- Configure in `.ctx/config.json`:
201
- ```json
202
- {
203
- "git": {
204
- "autoCommit": true,
205
- "commitPerTask": true
206
- }
207
- }
208
- ```
154
+ ---
209
155
 
210
- ### Persistent Debug Mode
211
- Scientific debugging with persistent state across sessions:
156
+ ## Skills (7)
212
157
 
213
- ```bash
214
- /ctx debug "login fails" # Start debugging
215
- /ctx debug --resume # Resume after context reset
216
- /ctx debug --list # See all sessions
217
- ```
158
+ Skills are auto-discovered. Claude Code reads each skill's `WHEN:` description and invokes it automatically when the task matches — no slash command needed.
218
159
 
219
- **How it works:**
220
- ```
221
- 1. OBSERVE → Capture exact error, context, state
222
- 2. RESEARCH → Search codebase and web for similar issues
223
- 3. HYPOTHESIZE → Form testable theory with confidence level
224
- 4. TEST → Apply minimal fix
225
- 5. VERIFY → Build + Tests + Lint + Browser
226
- 6. ITERATE → Refine hypothesis, max 10 attempts
227
- ```
160
+ ### Core
228
161
 
229
- **Key features:**
230
- - Sessions survive context resets and days between attempts
231
- - Browser verification with stored credentials
232
- - Screenshots saved for each attempt
233
- - Escalation report if max attempts reached
162
+ | Skill | Auto-invoked when... | Purpose |
163
+ |-------|----------------------|---------|
164
+ | ctx-orchestrator | User asks for pipeline, "ctx next", "ctx auto", or autonomous story execution | Runs the full `init → plan → execute → verify → complete` lifecycle via the Agent tool |
165
+ | ctx-state | Any CTX operation needs to read/write `.ctx/STATE.json` or track phase transitions | Manages persistent state, phase transitions, agent history, and task completion log |
166
+ | ctx-review-gate | Code implemented, story ready to close | Three-stage review: spec compliance → code quality → optional Codex cross-model adversarial review; blocks on failures, soft-skips on Codex infrastructure issues |
234
167
 
235
- State stored in `.ctx/debug/sessions/`:
236
- - `STATE.json` - Machine-readable progress
237
- - `TRACE.md` - Human-readable log
238
- - `hypotheses.json` - All theories tested
239
- - `screenshots/` - Visual evidence
168
+ ### Design
240
169
 
241
- ### Parallel Codebase Analysis
242
- ```bash
243
- /ctx map-codebase # Full analysis with 4 parallel agents
244
- ```
170
+ | Skill | Auto-invoked when... | Purpose |
171
+ |-------|----------------------|---------|
172
+ | ctx-design-system | Design system creation, token management, brand kit updates, token export | Manages W3C DTCG 2025.10 tokens as single source of truth; exports to CSS/SCSS/JS/Tailwind |
173
+ | ctx-visual-qa | Visual QA, design parity, pixel-perfect verification, responsive testing, WCAG 2.2 audit | Numerical measurement-driven QA — every delta is a number, every fix is a specific CSS property |
245
174
 
246
- Spawns 4 agents simultaneously:
247
- | Agent | Output | Analyzes |
248
- |-------|--------|----------|
249
- | TECH | TECH.md | Languages, frameworks, dependencies |
250
- | ARCH | ARCH.md | Patterns, data flow, modules |
251
- | QUALITY | QUALITY.md | Test coverage, lint, type safety |
252
- | CONCERNS | CONCERNS.md | Security, tech debt, performance |
175
+ ### Machine Learning
253
176
 
254
- Results synthesized into `SUMMARY.md`.
177
+ | Skill | Auto-invoked when... | Purpose |
178
+ |-------|----------------------|---------|
179
+ | ctx-ml-experiment | User wants to run ML experiments, track hypotheses, compare models | Hypothesis tracking, model registry, experiment lifecycle |
180
+ | ctx-ml-pipeline | Production ML deployment, inference, drift monitoring | Inference envelope, circuit breaker, KS drift detection, retraining triggers |
255
181
 
256
182
  ---
257
183
 
258
- ## Why CTX?
259
-
260
- | Feature | Aider | GSD | CTX 3.5 |
261
- |---------|-------|-----|---------|
262
- | Repository Map | Yes | No | **Yes** |
263
- | Discussion Phase | No | Yes | **Yes** |
264
- | Model Profiles | Yes | Partial | **Yes** |
265
- | Git-Native Commits | Yes | No | **Yes** |
266
- | Persistent Debug | No | Partial | **Yes** |
267
- | Parallel Analysis | No | Yes | **Yes** |
268
- | PRD-Driven | No | Yes | **Yes** |
269
- | Design System | No | No | **Yes** |
270
- | Browser Verification | No | No | **Yes** |
271
-
272
- **CTX 3.5 combines the best of Aider and GSD.**
273
-
274
- ---
184
+ ## Commands (26)
275
185
 
276
- ## Commands
186
+ ### Smart
277
187
 
278
- ### Smart (Auto-routing)
279
188
  | Command | Purpose |
280
189
  |---------|---------|
281
- | `/ctx` | **Smart router** - reads STATE.md, does the right thing |
282
- | `/ctx init` | Initialize project with STATE.md + PRD.json |
190
+ | `/ctx` | Smart router reads STATE.json, does the right thing |
283
191
 
284
192
  ### Mapping
193
+
285
194
  | Command | Purpose |
286
195
  |---------|---------|
287
- | `/ctx map` | Build repository map (REPO-MAP.md) |
288
- | `/ctx map-codebase` | Deep analysis (4 parallel agents) |
196
+ | `/ctx:map` | Build token-optimized repository map (REPO-MAP.md) |
197
+ | `/ctx:map-codebase` | Deep parallel analysis: TECH + ARCH + QUALITY + CONCERNS → SUMMARY |
289
198
 
290
199
  ### Discussion
200
+
291
201
  | Command | Purpose |
292
202
  |---------|---------|
293
- | `/ctx discuss [story]` | Capture decisions before planning |
203
+ | `/ctx:discuss [story]` | Capture implementation decisions before planning; locks them in CONTEXT.md |
294
204
 
295
205
  ### Configuration
206
+
296
207
  | Command | Purpose |
297
208
  |---------|---------|
298
- | `/ctx profile [name]` | Switch model profile (quality/balanced/budget) |
209
+ | `/ctx:profile [name]` | Switch model profile: `quality`, `balanced` (default), `budget` |
210
+
211
+ ### Inspect
299
212
 
300
- ### Inspect (Read-only)
301
213
  | Command | Purpose |
302
214
  |---------|---------|
303
- | `/ctx status` | See current state without triggering action |
215
+ | `/ctx:status` | Show current state without triggering any action |
216
+
217
+ ### Control
304
218
 
305
- ### Control (Override)
306
219
  | Command | Purpose |
307
220
  |---------|---------|
308
- | `/ctx plan [goal]` | Force research + planning |
309
- | `/ctx verify` | Force three-level verification |
310
- | `/ctx quick "task"` | Quick task bypass |
221
+ | `/ctx:init` | Initialize project: PRD.json + STATE.json + config |
222
+ | `/ctx:plan [goal]` | Force research + planning phase |
223
+ | `/ctx:verify` | Force three-level verification |
224
+ | `/ctx:quick "task"` | Quick task bypass (skips full lifecycle) |
311
225
 
312
226
  ### Debug
227
+
228
+ | Command | Purpose |
229
+ |---------|---------|
230
+ | `/ctx:debug` | Start debugging current issue |
231
+ | `/ctx:debug "issue"` | Debug specific problem |
232
+ | `/ctx:debug --resume` | Resume last debug session |
233
+ | `/ctx:debug --list` | List all debug sessions |
234
+ | `/ctx:debug --status` | Show current session status |
235
+
236
+ ### Design
237
+
238
+ | Command | Purpose |
239
+ |---------|---------|
240
+ | `/ctx:brand` | Brand establishment: mood board → 3 options → BRAND_KIT.md |
241
+ | `/ctx:design` | Component design: research → 3 options → prototype → implement |
242
+
243
+ ### QA
244
+
313
245
  | Command | Purpose |
314
246
  |---------|---------|
315
- | `/ctx debug` | Start debugging current issue |
316
- | `/ctx debug "issue"` | Debug specific problem |
317
- | `/ctx debug --resume` | Resume last debug session |
318
- | `/ctx debug --list` | List all debug sessions |
319
- | `/ctx debug --status` | Show current session status |
247
+ | `/ctx:qa` | Full system QA: WCAG 2.1 AA, every page, every interaction |
248
+ | `/ctx:qa --a11y-only` | Accessibility audit only |
249
+ | `/ctx:qa --visual-only` | Visual regression across mobile/tablet/desktop |
250
+ | `/ctx:qa --resume` | Resume interrupted QA session |
251
+ | `/ctx:visual-qa` | Measurement-driven design parity check |
252
+
253
+ ### ML
320
254
 
321
- ### QA (Full System Testing)
322
255
  | Command | Purpose |
323
256
  |---------|---------|
324
- | `/ctx qa` | Full system QA - WCAG 2.1 AA, every page, every button |
325
- | `/ctx qa --section "auth"` | QA specific section only |
326
- | `/ctx qa --a11y-only` | Accessibility audit only |
327
- | `/ctx qa --visual-only` | Visual regression (mobile/tablet/desktop) |
328
- | `/ctx qa --resume` | Resume interrupted QA session |
329
- | `/ctx qa --report` | Show last QA report |
257
+ | `/ctx:experiment` | Start ML experiment loop |
258
+ | `/ctx:train` | Trigger training pipeline |
259
+ | `/ctx:ml-status` | Show experiment registry and model status |
330
260
 
331
261
  ### Session
262
+
332
263
  | Command | Purpose |
333
264
  |---------|---------|
334
- | `/ctx pause` | Checkpoint for session resume |
265
+ | `/ctx:pause` | Checkpoint state for session resume |
266
+
267
+ ### Phase
335
268
 
336
- ### Phase Management
337
269
  | Command | Purpose |
338
270
  |---------|---------|
339
- | `/ctx phase list` | Show all phases |
340
- | `/ctx phase add "goal"` | Add new phase |
341
- | `/ctx phase next` | Complete current, move to next |
271
+ | `/ctx:phase list` | Show all phases and their status |
272
+ | `/ctx:phase add "goal"` | Add a new phase |
273
+ | `/ctx:phase next` | Complete current phase, advance to next |
342
274
 
343
275
  ### Integration
276
+
344
277
  | Command | Purpose |
345
278
  |---------|---------|
346
- | `/ctx integrate` | Show integration status |
347
- | `/ctx integrate linear` | Setup Linear |
348
- | `/ctx integrate jira` | Setup Jira |
349
- | `/ctx integrate github` | Setup GitHub Issues |
350
- | `/ctx integrate --sync` | Sync all stories |
279
+ | `/ctx:integrate` | Show integration status |
280
+ | `/ctx:integrate linear` | Set up Linear sync |
281
+ | `/ctx:integrate jira` | Set up Jira sync |
282
+ | `/ctx:integrate github` | Set up GitHub Issues sync |
283
+ | `/ctx:integrate --sync` | Sync all stories with connected tracker |
351
284
 
352
285
  ### Milestone
286
+
287
+ | Command | Purpose |
288
+ |---------|---------|
289
+ | `/ctx:milestone` | Show current milestone |
290
+ | `/ctx:milestone list` | List all milestones |
291
+ | `/ctx:milestone audit` | Verify milestone completion |
292
+ | `/ctx:milestone complete` | Archive and tag release |
293
+ | `/ctx:milestone new [name]` | Start next version |
294
+ | `/ctx:milestone gaps` | Generate fix phases for gaps |
295
+
296
+ ### Metrics
297
+
353
298
  | Command | Purpose |
354
299
  |---------|---------|
355
- | `/ctx milestone` | Show current milestone |
356
- | `/ctx milestone list` | List all milestones |
357
- | `/ctx milestone audit` | Verify completion |
358
- | `/ctx milestone complete` | Archive and tag |
359
- | `/ctx milestone new [name]` | Start next version |
360
- | `/ctx milestone gaps` | Generate fix phases |
361
-
362
- ### Metrics & Audit
300
+ | `/ctx:metrics` | Productivity dashboard |
301
+ | `/ctx:metrics cost` | Cost analysis by model/profile |
302
+ | `/ctx:metrics export` | Export HTML dashboard |
303
+
304
+ ### Learning
305
+
363
306
  | Command | Purpose |
364
307
  |---------|---------|
365
- | `/ctx metrics` | Show productivity dashboard |
366
- | `/ctx metrics cost` | Cost analysis |
367
- | `/ctx metrics export` | Export HTML dashboard |
368
- | `/ctx audit` | Show audit summary |
369
- | `/ctx audit export` | Generate compliance report |
308
+ | `/ctx:learn` | Show what CTX has learned about your project |
309
+ | `/ctx:learn patterns` | Show detected code patterns |
310
+ | `/ctx:learn decisions` | Show architectural decisions log |
311
+ | `/ctx:predict` | Get AI-suggested next features |
312
+ | `/ctx:predict --quick` | Quick wins only |
313
+
314
+ ### Monitoring
370
315
 
371
- ### Learning & Prediction
372
316
  | Command | Purpose |
373
317
  |---------|---------|
374
- | `/ctx learn` | Show what CTX has learned |
375
- | `/ctx learn patterns` | Show code patterns |
376
- | `/ctx learn decisions` | Show architectural decisions |
377
- | `/ctx learn forget [id]` | Remove a learned pattern |
378
- | `/ctx predict` | Get feature suggestions |
379
- | `/ctx predict --quick` | Quick wins only |
380
- | `/ctx predict --create [id]` | Create story from suggestion |
381
-
382
- ### Monitoring & Voice
318
+ | `/ctx:monitor` | Show monitoring status |
319
+ | `/ctx:monitor connect sentry` | Connect Sentry error tracking |
320
+ | `/ctx:monitor errors` | List recent production errors |
321
+ | `/ctx:monitor auto-fix [id]` | Auto-fix error with PR |
322
+ | `/ctx:monitor --watch` | Continuous monitoring mode |
323
+
324
+ ### Voice
325
+
383
326
  | Command | Purpose |
384
327
  |---------|---------|
385
- | `/ctx monitor` | Show monitoring status |
386
- | `/ctx monitor connect sentry` | Connect Sentry |
387
- | `/ctx monitor errors` | List recent errors |
388
- | `/ctx monitor auto-fix [id]` | Auto-fix with PR |
389
- | `/ctx monitor --watch` | Continuous monitoring |
390
- | `/ctx voice` | Start voice input |
391
- | `/ctx voice --continuous` | Always listening mode |
392
- | `/ctx voice --dictate` | Long-form dictation |
328
+ | `/ctx:voice` | Start voice input |
329
+ | `/ctx:voice --continuous` | Always-listening mode |
330
+ | `/ctx:voice --dictate` | Long-form dictation |
331
+
332
+ ---
333
+
334
+ ## Hooks (3)
335
+
336
+ Hooks are deterministic Node.js scripts registered in `settings.json`. They run synchronously on every tool call, independent of Claude's reasoning.
337
+
338
+ | Hook | File | Trigger | Behavior |
339
+ |------|------|---------|----------|
340
+ | pre-tool-use | `hooks/pre-tool-use.js` | Before any tool executes | TDD enforcement + capability restrictions. Exit 2 blocks the tool call. |
341
+ | post-tool-use | `hooks/post-tool-use.js` | After any tool executes | Logs file modifications to audit trail in `.ctx/audit.log` |
342
+ | subagent-stop | `hooks/subagent-stop.js` | When a subagent finishes | Records agent completion in `.ctx/STATE.json` |
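As a rough illustration of the pre-tool-use behavior in the table, the sketch below decides whether a write should be blocked. It is not the shipped `hooks/pre-tool-use.js`: the event fields (`tool_name`, `tool_input.file_path`), the `.test.` naming rule, and the `decide` helper are all assumptions. Only the "exit code 2 blocks the call" convention and the `strict`/`warn`/`off` modes come from the README.

```javascript
// Sketch of a TDD gate. Event field names and the test-file naming rule
// are assumptions made for this example.
const WRITE_TOOLS = new Set(["Write", "Edit"]);

function decide(event, tddMode, testExists) {
  const filePath = event.tool_input?.file_path ?? "";
  const isSource =
    /\.(js|ts)$/.test(filePath) && !/\.test\.(js|ts)$/.test(filePath);
  if (tddMode === "off" || !WRITE_TOOLS.has(event.tool_name) || !isSource) {
    return { exitCode: 0, message: "" };
  }
  // Assumed convention: src/login.ts is covered by src/login.test.ts.
  const testPath = filePath.replace(/\.(js|ts)$/, ".test.$1");
  if (testExists(testPath)) return { exitCode: 0, message: "" };
  const message = `No test found at ${testPath}`;
  // strict blocks with exit code 2; warn reports but lets the call through.
  return { exitCode: tddMode === "strict" ? 2 : 0, message };
}

const event = { tool_name: "Write", tool_input: { file_path: "src/login.ts" } };
const result = decide(event, "strict", () => false);
console.log(result);
```

In a real hook script, the result would presumably be wired up by writing `message` to stderr and calling `process.exit(result.exitCode)`. Keeping the decision in a pure function makes the rule testable without spawning a process.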
343
+
344
+ **Configure hook behavior:**
345
+
346
+ ```bash
347
+ npx ctx-cc config set hooks.tddMode strict # Block writes without tests
348
+ npx ctx-cc config set hooks.tddMode warn # Warn but allow
349
+ npx ctx-cc config set hooks.tddMode off # Disabled
350
+ ```
393
351
 
394
352
  ---
395
353
 
396
- ## State Machine
354
+ ## Design Workflow
355
+
356
+ CTX implements an agency-grade design process with mandatory approval gates.
397
357
 
358
+ **Phase 1 — Brand**
398
359
  ```
399
- initializing → discussing → executing → verifying → COMPLETE
400
- ↑ ↓
401
- └── debugging ──┘
360
+ Research → Mood board → 3 direction options → User picks → BRAND_KIT.md
402
361
  ```
362
+ BRAND_KIT.md becomes the constraint for all subsequent design work. Colors, typography, and spacing flow from tokens only.
403
363
 
404
- | State | What happens |
405
- |-------|--------------|
406
- | initializing | Research + Map + Plan |
407
- | discussing | Capture decisions in CONTEXT.md |
408
- | executing | Execute with git-native commits |
409
- | debugging | Persistent debug loop (max 10 attempts) |
410
- | verifying | Three-level verification |
411
- | paused | Resume from checkpoint |
364
+ **Phase 2 — Component Design**
365
+ ```
366
+ Research → 3 options (A/B/C) → User approves direction → Prototype → Implement
367
+ ```
368
+ Never a single design. Options are always presented before implementation.
369
+
370
+ **Phase 3 — Visual QA**
371
+ Every design change triggers numerical verification:
372
+ - Measure rendered values vs design spec (px, rem, hex)
373
+ - Report deltas as numbers, not subjective descriptions
374
+ - Fixes are specific: `change font-size from 14px to 16px`
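The measurement-driven idea above can be sketched as a comparison between a spec and rendered values. This is an illustration under assumptions: `diffStyles`, the input shapes, and the sample values are hypothetical, and how CTX actually collects rendered values is not described in this diff.

```javascript
// Compares rendered CSS values against a design spec and emits numeric
// deltas plus a specific fix string. Inputs here are made-up examples.
function diffStyles(spec, rendered) {
  const fixes = [];
  for (const [property, expected] of Object.entries(spec)) {
    const actual = rendered[property];
    if (actual === expected) continue;
    const delta = parseFloat(actual) - parseFloat(expected);
    fixes.push({
      property,
      // Non-numeric values (such as hex colors) get no numeric delta.
      delta: Number.isNaN(delta) ? null : delta,
      fix: `change ${property} from ${actual} to ${expected}`,
    });
  }
  return fixes;
}

const spec = { "font-size": "16px", color: "#1a1a1a" };
const rendered = { "font-size": "14px", color: "#1a1a1a" };
console.log(diffStyles(spec, rendered));
```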
375
+
376
+ **Compliance:** WCAG 2.2 AA + EAA 2025. Touch targets, contrast ratios, keyboard navigation, and screen reader labels are all measured.
377
+
378
+ **Key tools:** Figma MCP (token sync + component metadata), Gemini design analysis, W3C DTCG 2025.10 token format.
412
379
 
413
380
  ---
414
381
 
415
- ## Context Management
382
+ ## ML Workflow
383
+
384
+ CTX implements an experiment-driven ML development loop.
385
+
386
+ **Phase 1 — Data Analysis**
387
+ ```
388
+ Load data → EDA → Quality scoring → Feature correlation → Pandera validation schema
389
+ ```
390
+
391
+ **Phase 2 — Experiment Loop**
392
+ ```
393
+ Hypothesize → Design experiment → Run → Analyze → Register result → Iterate
394
+ ```
395
+ All hypotheses and results are tracked in `.ctx/ml/experiments/`. The model registry stores every trained artifact with metadata.
416
396
 
417
- CTX actively manages context budget:
397
+ **Phase 3 — Model Evaluation**
398
+ - Conformal prediction intervals (MAPIE)
399
+ - Statistical significance testing
400
+ - Calibration curves and reliability diagrams
418
401
 
419
- | Usage | Quality | Action |
420
- |-------|---------|--------|
421
- | 0-30% | Peak | Continue |
422
- | 30-40% | Good | Continue |
423
- | 40-50% | Good | Prepare handoff notes |
424
- | 50-60% | Degrading | Auto-checkpoint |
425
- | 60-70% | Degrading | Create HANDOFF.md |
426
- | 70%+ | Poor | Force checkpoint |
402
+ **Phase 4 — Production Pipeline**
403
+ - Model registry with version pinning
404
+ - Inference envelope with latency SLA
405
+ - Circuit breaker (auto-disable on error spike)
406
+ - KS drift detection with configurable thresholds
407
+ - Retraining triggers on drift
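For readers unfamiliar with KS drift detection, the sketch below computes the two-sample Kolmogorov-Smirnov statistic, which is the largest gap between the empirical CDFs of a reference sample and a live sample. It is a generic illustration, not CTX's pipeline code: the function names and the 0.3 threshold are placeholders, and a production check would normally use a p-value or a tuned threshold.

```javascript
// Two-sample KS statistic: walk both sorted samples and track the largest
// difference between their empirical CDFs.
function ksStatistic(reference, live) {
  const a = [...reference].sort((x, y) => x - y);
  const b = [...live].sort((x, y) => x - y);
  let i = 0;
  let j = 0;
  let maxGap = 0;
  while (i < a.length && j < b.length) {
    const value = Math.min(a[i], b[j]);
    // Advance past every observation <= value in both samples.
    while (i < a.length && a[i] <= value) i++;
    while (j < b.length && b[j] <= value) j++;
    maxGap = Math.max(maxGap, Math.abs(i / a.length - j / b.length));
  }
  return maxGap;
}

// The default threshold is an arbitrary placeholder, not a CTX default.
function hasDrifted(reference, live, threshold = 0.3) {
  return ksStatistic(reference, live) > threshold;
}

console.log(ksStatistic([1, 2, 3, 4], [3, 4, 5, 6]));
console.log(hasDrifted([1, 2, 3, 4], [3, 4, 5, 6]));
```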
427
408
 
428
- Smart handoff creates `HANDOFF.md` with:
429
- - Completed tasks with commit hashes
430
- - Current task progress
431
- - Key decisions made
432
- - Files modified
433
- - Next steps
409
+ **Proven patterns:** XGBoost + MAPIE conformal prediction, T-learner causal inference, KS drift, Pandera schema validation — from Digital Twin production workflows.
434
410
 
435
411
  ---
436
412
 
437
- ## 21 Specialized Agents
438
-
439
- | Agent | Spawned when | Model (balanced) |
440
- |-------|--------------|------------------|
441
- | ctx-mapper | /ctx map | haiku |
442
- | ctx-tech-mapper | /ctx map-codebase | haiku |
443
- | ctx-arch-mapper | /ctx map-codebase | haiku |
444
- | ctx-quality-mapper | /ctx map-codebase | haiku |
445
- | ctx-concerns-mapper | /ctx map-codebase | haiku |
446
- | ctx-discusser | status = discussing | sonnet |
447
- | ctx-researcher | status = initializing | opus |
448
- | ctx-planner | after research | opus |
449
- | ctx-executor | status = executing | sonnet |
450
- | ctx-designer | design stories | sonnet |
451
- | ctx-debugger | status = debugging | sonnet |
452
- | ctx-verifier | status = verifying | haiku |
453
- | ctx-parallelizer | before execution | haiku |
454
- | ctx-reviewer | before commit | sonnet |
455
- | ctx-criteria-suggester | during init/discuss | sonnet |
456
- | ctx-handoff | at context thresholds | haiku |
457
- | ctx-team-coordinator | team mode | sonnet |
458
- | ctx-auditor | always (background) | haiku |
459
- | ctx-learner | observing patterns | haiku |
460
- | ctx-predictor | after milestone/on demand | sonnet |
461
- | ctx-qa | /ctx qa (full system test) | sonnet |
413
+ ## Configuration
414
+
415
+ ```bash
416
+ npx ctx-cc config list # Show all config values
417
+ npx ctx-cc config get activeProfile # Get a specific value
418
+ npx ctx-cc config set hooks.tddMode strict # Set a value
419
+ ```
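
The dotted key in `config set hooks.tddMode strict` suggests that configuration is a nested object addressed by path. The sketch below shows one way such lookups can work. It is an assumption for illustration only: `getPath` and `setPath` are hypothetical helpers, not functions from the ctx-cc source, and the sample config object is invented.

```javascript
// Dotted-path access into a nested config object (hypothetical helpers).

// Keys that must never be used as path segments (prototype pollution guard).
const FORBIDDEN = new Set(['__proto__', 'constructor', 'prototype']);

function splitKey(key) {
  const parts = key.split('.');
  if (parts.some((p) => p === '' || FORBIDDEN.has(p))) {
    throw new Error(`Invalid config key: ${key}`);
  }
  return parts;
}

// Returns the value at the dotted key, or undefined if any segment is missing.
function getPath(config, key) {
  return splitKey(key).reduce(
    (node, part) => (node !== null && typeof node === 'object' ? node[part] : undefined),
    config
  );
}

// Sets the value at the dotted key, creating intermediate objects as needed.
function setPath(config, key, value) {
  const parts = splitKey(key);
  const last = parts.pop();
  let node = config;
  for (const part of parts) {
    if (typeof node[part] !== 'object' || node[part] === null) node[part] = {};
    node = node[part];
  }
  node[last] = value;
  return config;
}

// Invented sample config, mirroring the keys used in the commands above.
const config = { activeProfile: 'balanced', hooks: {} };
setPath(config, 'hooks.tddMode', 'strict');
```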
462
420
 
463
- ---
421
+ **Model profiles:**
464
422
 
465
- ## Directory Structure
423
+ | Profile | Research | Planning | Execution | Verify | Relative Cost |
424
+ |---------|----------|----------|-----------|--------|---------------|
425
+ | quality | opus | opus | opus | sonnet | ~3x |
426
+ | balanced | opus | opus | sonnet | haiku | 1x (default) |
427
+ | budget | sonnet | sonnet | sonnet | haiku | ~0.4x |
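
Read as data, the table above is a lookup from profile and stage to a model tier. The following sketch encodes it that way. The object shape and the `resolveModel` name are assumptions made for illustration; only the profile names, stage columns, and model tiers come from the table.

```javascript
// Profile table as a lookup (values copied from the table above;
// the structure and function name are hypothetical).
const PROFILES = {
  quality:  { research: 'opus',   planning: 'opus',   execution: 'opus',   verify: 'sonnet' },
  balanced: { research: 'opus',   planning: 'opus',   execution: 'sonnet', verify: 'haiku' },
  budget:   { research: 'sonnet', planning: 'sonnet', execution: 'sonnet', verify: 'haiku' },
};

// Resolves the model tier for a stage under the given profile.
function resolveModel(profileName, stage) {
  if (!Object.prototype.hasOwnProperty.call(PROFILES, profileName)) {
    throw new Error(`Unknown profile: ${profileName}`);
  }
  const profile = PROFILES[profileName];
  if (!Object.prototype.hasOwnProperty.call(profile, stage)) {
    throw new Error(`Unknown stage: ${stage}`);
  }
  return profile[stage];
}
```

For example, `resolveModel('balanced', 'execution')` yields `'sonnet'`, matching the default row of the table.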
466
428
 
429
+ Switch profiles without reinstalling:
467
430
  ```
468
- .ctx/
469
- ├── config.json # Model profiles, git settings
470
- ├── STATE.md # Living digest - execution state
471
- ├── PRD.json # Requirements contract
472
- ├── REPO-MAP.md # Token-optimized codebase map
473
- ├── REPO-MAP.json # Structured map data
474
- ├── .env # Test credentials (GITIGNORED)
475
- ├── codebase/ # Deep analysis results
476
- │ ├── TECH.md
477
- │ ├── ARCH.md
478
- │ ├── QUALITY.md
479
- │ ├── CONCERNS.md
480
- │ └── SUMMARY.md
481
- ├── phases/{story_id}/
482
- │ ├── CONTEXT.md # Locked decisions (discussion phase)
483
- │ ├── RESEARCH.md # ArguSeek results
484
- │ ├── PLAN.md # Tasks mapped to criteria
485
- │ └── VERIFY.md # Verification report
486
- ├── debug/
487
- │ ├── sessions/ # Persistent debug state
488
- │ └── screenshots/ # Visual proof
489
- ├── checkpoints/ # Auto-checkpoints
490
- └── memory/ # Decision memory
431
+ /ctx:profile quality
432
+ /ctx:profile balanced
433
+ /ctx:profile budget
491
434
  ```
492
435
 
493
436
  ---
494
437
 
495
- ## Configuration
438
+ ## Phase Lifecycle
496
439
 
497
- `.ctx/config.json`:
498
- ```json
499
- {
500
- "activeProfile": "balanced",
501
- "models": {
502
- "architect": { "id": "claude-opus-4", "costTier": "high" },
503
- "default": { "id": "claude-sonnet-4", "costTier": "medium" },
504
- "fast": { "id": "claude-haiku-4", "costTier": "low" }
505
- },
506
- "profiles": {
507
- "quality": {
508
- "research": "architect",
509
- "discussion": "architect",
510
- "planning": "architect",
511
- "execution": "architect"
512
- },
513
- "balanced": {
514
- "research": "architect",
515
- "discussion": "default",
516
- "planning": "architect",
517
- "execution": "default"
518
- },
519
- "budget": {
520
- "research": "default",
521
- "planning": "default",
522
- "execution": "default"
523
- }
524
- },
525
- "git": {
526
- "autoCommit": true,
527
- "commitPerTask": true
528
- }
529
- }
530
440
  ```
441
+ init → plan → execute → verify → complete
442
+                  ↑        ↓
443
+                  └────────┘  (fix failures)
444
+ ```
445
+
446
+ State is persisted in `.ctx/STATE.json` after every transition. The `ctx-state` skill manages reads and writes. The `ctx-orchestrator` skill drives transitions.
447
+
448
+ | Phase | What happens |
449
+ |-------|--------------|
450
+ | init | Research + repo map + PRD validation |
451
+ | plan | Acceptance criteria + atomic task plan (2–3 tasks) |
452
+ | execute | Implementation with per-task git commits |
453
+ | verify | Three-level check: exists → substantive → wired |
454
+ | complete | Review gate passed, story archived |
455
+
456
+ If verification fails, state returns to `execute` automatically. The fix-loop runs until all three verification levels pass.
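
The lifecycle described above behaves like a small state machine with one backward edge. Below is a minimal sketch of that idea, assuming an in-memory state object. The names `TRANSITIONS`, `transition`, and `afterVerify` are hypothetical, and persistence to `.ctx/STATE.json` is deliberately left out.

```javascript
// Phase lifecycle as a transition table (illustrative, hypothetical names).
const TRANSITIONS = {
  init: ['plan'],
  plan: ['execute'],
  execute: ['verify'],
  verify: ['complete', 'execute'], // a failed verification loops back to execute
  complete: [],
};

// Returns a new state in the next phase, or throws on an invalid transition.
function transition(state, next) {
  const allowed = TRANSITIONS[state.phase] || [];
  if (!allowed.includes(next)) {
    throw new Error(`Invalid transition: ${state.phase} -> ${next}`);
  }
  return { ...state, phase: next, history: [...state.history, next] };
}

// Applies the three-level check: all of exists, substantive, and wired
// must pass to complete; otherwise the story returns to execute.
function afterVerify(state, levels) {
  const passed = levels.exists && levels.substantive && levels.wired;
  return transition(state, passed ? 'complete' : 'execute');
}

// Example walk through one fix-loop iteration.
let state = { phase: 'init', history: ['init'] };
for (const next of ['plan', 'execute', 'verify']) state = transition(state, next);
state = afterVerify(state, { exists: true, substantive: true, wired: false });
```

Keeping the allowed edges in one table means an invalid jump such as `init` to `verify` is rejected in a single place, which is also what makes the transitions easy to test.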
531
457
 
532
458
  ---
533
459
 
534
- ## Integrations
460
+ ## Plugin Manifest
535
461
 
536
- ### ArguSeek (Web Research)
537
- Auto-runs during planning for best practices, security, and patterns.
462
+ CTX ships with `plugin.json` for Claude Code marketplace distribution. Future marketplace installs will use:
538
463
 
539
- ### ChunkHound (Semantic Code Search)
540
- Auto-runs during planning for semantic search and pattern detection.
541
464
  ```bash
542
- uv tool install chunkhound
465
+ /plugin install ctx@my-marketplace
543
466
  ```
544
467
 
545
- ### Browser Verification (Playwright/Chrome DevTools)
546
- Auto-runs during debugging and verification for visual proof.
547
-
548
- ### Figma MCP (Design Context)
549
- Auto-runs during design stories for tokens and component metadata.
550
-
551
- ### Gemini Design MCP (Visual Generation)
552
- Auto-runs during design stories for mockups and UI code.
468
+ The manifest declares all agents, skills, commands, and hooks so the marketplace can display capabilities, manage versions, and handle updates without reinstalling.
553
469
 
554
470
  ---
555
471
 
556
- ## Key Principles
472
+ ## CLI Reference
557
473
 
558
- ### 95% Auto-Deviation Handling
474
+ ```bash
475
+ npx ctx-cc [options]            Install CTX into Claude Code
476
+ npx ctx-cc list                 List all 25 agents with model/maxTurns
477
+ npx ctx-cc skills               Analyze skill descriptions and triggers
478
+ npx ctx-cc config list          Show full configuration
479
+ npx ctx-cc config get <key>     Get a config value
480
+ npx ctx-cc config set <k> <v>   Set a config value
481
+ npx ctx-cc --help               Show help
482
+
483
+ Options:
484
+   --global, -g    Install to ~/.claude (default)
485
+   --project, -p   Install to .claude in current directory
486
+   --force, -f     Overwrite existing installation
487
+ ```
559
488
 
560
- | Trigger | Action |
561
- |---------|--------|
562
- | Bug in existing code | Auto-fix, document in commit |
563
- | Missing validation | Auto-add, document |
564
- | Blocking issue | Auto-fix, document |
565
- | Architecture decision | **Ask user** |
489
+ ---
566
490
 
567
- ### Three-Level Verification
491
+ ## Development
568
492
 
569
- | Level | Question | Check |
570
- |-------|----------|-------|
571
- | Exists | File on disk? | Glob |
572
- | Substantive | Real code, not stub? | No TODOs, no placeholders |
573
- | Wired | Imported and used? | Trace imports |
493
+ ```bash
494
+ git clone https://github.com/jufjuf/CTX.git
495
+ cd CTX
496
+ npm test # 264 tests, node:test runner
497
+ ```
574
498
 
575
- ### Atomic Planning
499
+ **Project structure:**
576
500
 
577
- Plans limited to 2-3 tasks to prevent context degradation.
501
+ ```
502
+ ctx-cc/
503
+ ├── agents/       25 agent definitions (.md with frontmatter)
504
+ ├── skills/       7 skill directories (each contains SKILL.md)
505
+ ├── commands/     26 slash command definitions (.md)
506
+ ├── hooks/        3 enforcement hook scripts (.js)
507
+ ├── src/          17 source modules (.js)
508
+ ├── test/         19 test files (.test.js)
509
+ ├── templates/    config.json, PRD.json, state templates
510
+ ├── bin/ctx.js    CLI entry point (installer only)
511
+ ├── plugin.json   Marketplace manifest
512
+ └── package.json  Zero runtime dependencies
513
+ ```
578
514
 
579
515
  ---
580
516
 
581
- ## Updating
517
+ ## Testing
582
518
 
583
519
  ```bash
584
- npx ctx-cc --force
520
+ npm test
521
+ # 264 tests, 0 failures, ~2s
585
522
  ```
586
523
 
524
+ **Coverage:**
525
+
526
+ | Area | What is tested |
527
+ |------|----------------|
528
+ | Agent discovery | Frontmatter parsing, model/maxTurns validation |
529
+ | State machine | Phase transitions, invalid transition rejection |
530
+ | Pipelines | Orchestrator flow, review gate stages |
531
+ | Worktrees | Parallel execution isolation |
532
+ | Hooks | TDD enforcement, audit logging, subagent tracking |
533
+ | Capabilities | Restriction rules, exit codes |
534
+ | Context profiles | Model resolution per profile |
535
+ | Skills format | SKILL.md structure, description format |
536
+ | Design compliance | Token format, BRAND_KIT schema |
537
+ | ML compliance | Experiment schema, pipeline config |
538
+ | CLI commands | list, skills, config get/set |
539
+ | Integration | End-to-end install + verify |
540
+
587
541
  ---
588
542
 
589
543
  ## License
@@ -594,8 +548,8 @@ MIT
594
548
 
595
549
  <div align="center">
596
550
 
597
- **[GitHub](https://github.com/jufjuf/CTX)** · **[Issues](https://github.com/jufjuf/CTX/issues)** · **[npm](https://www.npmjs.com/package/ctx-cc)**
551
+ **[GitHub](https://github.com/jufjuf/CTX)** · **[npm](https://www.npmjs.com/package/ctx-cc)** · **[Issues](https://github.com/jufjuf/CTX/issues)**
598
552
 
599
- *CTX 3.5 - Conversational-first. Just describe what you want. 21 specialized agents. PRD-driven development.*
553
+ CTX 4.0 · 25 agents · 7 skills · 3 hooks · zero dependencies
600
554
 
601
555
  </div>