npm - ctx-cc - Versions diffs - 4.0.0 → 4.1.0 - Mend

ctx-cc 4.0.0 → 4.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (37)

package/README.md +378 -424
package/agents/ctx-arch-mapper.md +1 -1
package/agents/ctx-auditor.md +1 -1
package/agents/ctx-codex-reviewer.md +214 -0
package/agents/ctx-concerns-mapper.md +1 -1
package/agents/ctx-criteria-suggester.md +2 -2
package/agents/ctx-debugger.md +1 -1
package/agents/ctx-discusser.md +1 -1
package/agents/ctx-executor.md +1 -1
package/agents/ctx-handoff.md +2 -2
package/agents/ctx-learner.md +1 -1
package/agents/ctx-mapper.md +1 -1
package/agents/ctx-parallelizer.md +1 -1
package/agents/ctx-planner.md +1 -1
package/agents/ctx-predictor.md +1 -1
package/agents/ctx-quality-mapper.md +1 -1
package/agents/ctx-researcher.md +1 -1
package/agents/ctx-reviewer.md +2 -2
package/agents/ctx-team-coordinator.md +1 -1
package/agents/ctx-tech-mapper.md +1 -1
package/agents/ctx-verifier.md +1 -1
package/bin/ctx.js +33 -2
package/commands/ctx.md +10 -10
package/commands/help.md +5 -5
package/commands/init.md +25 -0
package/commands/metrics.md +1 -1
package/commands/milestone.md +1 -1
package/commands/monitor.md +1 -1
package/commands/voice.md +1 -1
package/hooks/pre-tool-use.js +2 -1
package/package.json +2 -2
package/plugin.json +2 -1
package/skills/ctx-review-gate/SKILL.md +49 -13
package/src/capabilities.js +97 -42
package/src/install.js +10 -3
package/src/review-gate.js +103 -9
package/templates/config.json +3 -0

package/README.md CHANGED

@@ -1,589 +1,543 @@
 <div align="center">
-# CTX
-### Continuous Task eXecution
+```
+   ██████╗████████╗██╗  ██╗
+  ██╔════╝╚══██╔══╝╚██╗██╔╝
+  ██║        ██║    ╚███╔╝
+  ██║        ██║    ██╔██╗
+  ╚██████╗   ██║   ██╔╝ ██╗
+   ╚═════╝   ╚═╝   ╚═╝  ╚═╝
+```
-**Intelligent workflow orchestration for Claude Code.**
+**Intelligent workflow orchestration for Claude Code**
 [![npm version](https://img.shields.io/npm/v/ctx-cc.svg?style=flat-square)](https://www.npmjs.com/package/ctx-cc)
-[![npm downloads](https://img.shields.io/npm/dm/ctx-cc.svg?style=flat-square)](https://www.npmjs.com/package/ctx-cc)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=flat-square)](https://opensource.org/licenses/MIT)
-[![GitHub stars](https://img.shields.io/github/stars/jufjuf/CTX?style=flat-square)](https://github.com/jufjuf/CTX/stargazers)
-<img src="./assets/terminal.png" alt="CTX Terminal" width="700">
+[![Tests](https://img.shields.io/badge/tests-264%20passing-brightgreen.svg?style=flat-square)](#testing)
+[![Zero deps](https://img.shields.io/badge/dependencies-0-brightgreen.svg?style=flat-square)](#)
-**Conversational-first. Just describe what you want — no commands to memorize.**
-AI that learns your preferences. Predictive planning. Self-healing deployments. 21 specialized agents.
-[Installation](#installation) · [Quick Start](#quick-start) · [New in 3.5](#new-in-35) · [Commands](#commands) · [Why CTX](#why-ctx) · [**Getting Started Guide**](./GETTING_STARTED.md)
+```bash
+npx ctx-cc
+```
 </div>
 ---
-## Installation
+## What is CTX?
-```bash
-npx ctx-cc
-```
+CTX transforms Claude Code from a single AI assistant into a full development agency. One installer wires 25 specialized agents, 7 auto-discovered skills, and 3 deterministic enforcement hooks directly into Claude Code's native extension points — no runtime daemon, no wrapper, no proxy.
-That's it. CTX installs itself to your Claude Code environment.
+**Three verticals out of the box:**
-```bash
-# Options
-npx ctx-cc --global     # Install to ~/.claude (default)
-npx ctx-cc --project    # Install to .claude in current directory
-npx ctx-cc --force      # Overwrite existing installation
-```
+| Vertical | Coverage |
+|----------|----------|
+| Software Development | Phase-based lifecycle, autonomous execution, persistent debug, review gates |
+| Agency-Grade Design | Figma MCP-first workflow, W3C DTCG tokens, pixel-perfect QA, WCAG 2.2 AA |
+| Machine Learning | Experiment tracking, model registry, conformal prediction, drift detection |
+**Key value propositions:**
+- **Phase-based lifecycle** — `init → plan → execute → verify → complete` with state tracked in `.ctx/STATE.json`
+- **Autonomous execution with review gates** — three-stage review: spec compliance, code quality, and optional cross-model adversarial review via OpenAI Codex
+- **Figma MCP-first design workflow** — tokens sync from Figma, visual QA measures pixels numerically
+- **ML experiment-driven development** — hypothesis tracking, XGBoost+MAPIE patterns, KS drift, Digital Twin workflows
+- **Zero dependencies** — installs into Claude Code's native extension points; nothing runs outside Claude
 ---
 ## Quick Start
-**Just talk to CTX.** No commands to memorize:
-```
-"I want to build a todo app"     → CTX sets up your project
-"Fix the login bug"              → CTX starts debugging
-"Is my app accessible?"          → CTX runs accessibility QA
-"What should I do next?"         → CTX shows status + recommendation
+```bash
+npx ctx-cc              # Install globally (~/.claude)
+npx ctx-cc --project    # Install for current project only (.claude/)
+npx ctx-cc --force      # Overwrite existing installation
 ```
-Or use commands directly:
-```
-1. /ctx init           Gather requirements + credentials + design context
-2. /ctx map            Build repository map (existing codebases)
-3. /ctx                Autonomous execution with minimal interruption
-```
+Then inside Claude Code:
-**The Flow:**
 ```
-Tell CTX what you want → CTX figures out the rest → Delivered!
+/ctx              Smart router — reads state, does the right thing
+/ctx:help         Full command reference
+/ctx:init         Initialize project (PRD + STATE + config)
 ```
 ---
-## New in 3.5
-### Complete Redesign
-v3.5 is a ground-up rewrite focused on reliability over feature count:
-- **Unified version** across all 21 agents and commands
-- **Trimmed config** — removed 260 lines of settings for unimplemented features
-- **Single router** — eliminated duplicate routing logic that caused agent spawn failures
-- **GSD-proven architecture** — rebuilt on patterns validated in production
+## Architecture
-### Conversational-First Routing
-**No commands to memorize.** CTX understands natural language from your first prompt:
+CTX uses a **thin installer / fat Claude Code** architecture. The CLI's only job is to copy files into the right directories. All orchestration happens inside Claude Code via the Agent tool.
-| You Say | CTX Does |
-|---------|----------|
-| "I want to build a todo app" | Sets up project, researches best practices, creates plan |
-| "Fix the login bug" | Analyzes codebase, starts debugging |
-| "Is my app accessible?" | Runs WCAG 2.1 AA accessibility audit |
-| "Test everything" | Crawls every page, clicks every button |
-| "What's next?" | Shows status and recommended action |
-Commands still work as shortcuts for power users.
-### Full System QA
-Crawl every page, click every button, find all issues:
-```bash
-/ctx qa                     # Full system QA (WCAG 2.1 AA)
-/ctx qa --a11y-only         # Accessibility audit only
-/ctx qa --visual-only       # Visual regression (3 viewports)
-/ctx qa --resume            # Resume interrupted session
 ```
-Features:
-- **WCAG 2.1 AA compliance** - Touch targets, alt text, labels, contrast, keyboard
-- **Multi-viewport testing** - Mobile (375px), Tablet (768px), Desktop (1280px)
-- **Performance monitoring** - Slow requests, large assets
-- **Trace capture** - Screenshots and logs for every failed interaction
-- **Fix tasks** - Issues organized by section, ready for execution
-### Persistent Debugging
-Debug sessions survive context resets, `/clear`, and days between attempts:
-```bash
-/ctx debug "checkout fails"  # Start debug session
-/ctx debug --resume          # Resume where you left off
-/ctx debug --list            # Show all sessions
+~/.claude/
+├── agents/       25 subagents     (invoked via Agent tool)
+├── skills/       7 skills         (auto-discovered by Claude from descriptions)
+├── commands/     26 slash commands (/ctx:*)
+├── hooks/        3 hook scripts   (deterministic enforcement)
+└── settings.json hooks registered
 ```
-- Scientific method: observe, hypothesize, test, analyze
-- Max 10 attempts before escalation with full report
-- Browser verification with stored credentials
-- Every hypothesis and result recorded in `.ctx/debug/sessions/`
+**Key decisions:**
-### Smart Context Handoff
-Seamless transitions at context limits:
+- The CLI is installer-only. It does not run, proxy, or wrap Claude Code.
+- Agents are Markdown files with native frontmatter (`model`, `maxTurns`, `description`). Claude reads these directly.
+- Skills are auto-invoked by Claude Code when task descriptions match the skill's `WHEN:` triggers — no commands needed.
+- Hooks are separate `.js` scripts registered in `settings.json`. They run deterministically on every tool call.
+- `plugin.json` enables marketplace distribution.
-| Threshold | Action |
-|-----------|--------|
-| 40% | Prepare handoff notes |
-| 50% | Write HANDOFF.md, warn |
-| 60% | Spawn fresh agent |
+---
-Zero information loss. Work continues automatically.
+## Agents (25)
-### Pre-Commit Review
-Catches errors BEFORE they're committed:
-- Type errors, unresolved imports, circular dependencies
-- Security vulnerabilities, empty catch blocks
-- Blocks on critical issues, warns on medium
+### Software Development (21)
-### Acceptance Criteria Auto-Generation
-AI suggests comprehensive criteria before implementation starts:
-```
-Story: "Add user authentication"
+#### Planning
-Suggested Criteria:
-  ✓ User can register with email/password
-  ✓ Invalid credentials show error
-  ✓ Passwords hashed with bcrypt
-  ✓ Session expires after 24h
-  ✓ Brute force protection enabled
+| Agent | Model | maxTurns | Purpose |
+|-------|-------|----------|---------|
+| ctx-planner | opus | 25 | Atomic plans (2–3 tasks), maps tasks to acceptance criteria |
+| ctx-researcher | opus | 25 | ArguSeek web research + semantic code search before planning |
+| ctx-criteria-suggester | sonnet | 25 | Auto-generates acceptance criteria before implementation |
+| ctx-parallelizer | haiku | 15 | Identifies tasks that can run concurrently, saves total time |
+| ctx-predictor | haiku | 15 | Analyzes patterns and suggests next features |
-[A] Accept all  [B] See more  [C] Edit
-```
+#### Execution
----
+| Agent | Model | maxTurns | Purpose |
+|-------|-------|----------|---------|
+| ctx-executor | sonnet | 50 | Implements tasks with git-native commits per completed task |
+| ctx-debugger | sonnet | 75 | Persistent debug loop: observe → hypothesize → test → verify |
-## From 3.0
+#### Review
-### Repository Mapping (like Aider)
-```bash
-/ctx map                 # Build token-optimized codebase map
-/ctx map --expand        # Include call graph (8k tokens)
-/ctx map --refresh       # Force full rebuild
-```
+| Agent | Model | maxTurns | Purpose |
+|-------|-------|----------|---------|
+| ctx-reviewer | sonnet | 25 | Pre-commit review: type errors, imports, security, empty catches |
+| ctx-auditor | haiku | 15 | Background audit trail and compliance traceability |
+| ctx-verifier | haiku | 15 | Three-level verification: exists, substantive, wired |
-Creates `REPO-MAP.md` with symbols, dependencies, and navigation hints.
+#### Mapping
-### Discussion Phase (like GSD)
-```bash
-/ctx discuss S001        # Capture decisions BEFORE planning
-/ctx discuss --review    # Review locked decisions
-```
+| Agent | Model | maxTurns | Purpose |
+|-------|-------|----------|---------|
+| ctx-mapper | haiku | 15 | Token-optimized repository map (REPO-MAP.md) |
+| ctx-arch-mapper | haiku | 15 | Architecture patterns, data flow, module structure |
+| ctx-tech-mapper | haiku | 15 | Languages, frameworks, dependencies |
+| ctx-quality-mapper | haiku | 15 | Test coverage, lint status, type safety |
+| ctx-concerns-mapper | haiku | 15 | Security vulnerabilities, tech debt, performance risks |
-Prevents mid-implementation questions by locking decisions in `CONTEXT.md`.
+#### Knowledge & Coordination
-### Model Profiles (Cost Optimization)
-```bash
-/ctx profile             # Show current profile
-/ctx profile quality     # Best models (Opus everywhere)
-/ctx profile balanced    # Smart mix (default)
-/ctx profile budget      # Fast models (60% savings)
-```
+| Agent | Model | maxTurns | Purpose |
+|-------|-------|----------|---------|
+| ctx-discusser | sonnet | 25 | Captures implementation decisions before planning begins |
+| ctx-learner | haiku | 15 | Observes patterns and decisions, builds project memory |
+| ctx-handoff | haiku | 15 | Creates HANDOFF.md at context thresholds, zero info loss |
+| ctx-team-coordinator | haiku | 15 | File locks, parallel work coordination, prevents conflicts |
-| Profile | Research | Execute | Verify | Cost |
-|---------|----------|---------|--------|------|
-| quality | Opus | Opus | Sonnet | 3x |
-| balanced | Opus | Sonnet | Haiku | 1x |
-| budget | Sonnet | Sonnet | Haiku | 0.4x |
+#### Design & QA
-### Git-Native Workflow
-Every completed task auto-commits:
-```
-[CTX] Implement user login endpoint
+| Agent | Model | maxTurns | Purpose |
+|-------|-------|----------|---------|
+| ctx-designer | sonnet | 50 | Brand establishment, component design, Figma MCP integration |
+| ctx-qa | sonnet | 50 | Full system QA: crawls every page, clicks every button |
-Story: S001 - User Authentication
-Criteria: User can log in with credentials
-Files: src/auth/login.ts, src/routes/auth.ts
+### Machine Learning (4)
-Co-Authored-By: Claude <noreply@anthropic.com>
-```
+| Agent | Model | maxTurns | Purpose |
+|-------|-------|----------|---------|
+| ctx-ml-scientist | opus | 75 | Designs experiments, selects models, autonomous hypothesis loop |
+| ctx-ml-engineer | sonnet | 50 | MLOps pipelines, inference envelope, model registry |
+| ctx-ml-analyst | sonnet | 50 | EDA, dataset quality scoring, feature analysis |
+| ctx-ml-reviewer | sonnet | 25 | ML code review: correctness, leakage, statistical validity |
-Configure in `.ctx/config.json`:
-```json
-{
-  "git": {
-    "autoCommit": true,
-    "commitPerTask": true
-  }
-}
-```
+---
-### Persistent Debug Mode
-Scientific debugging with persistent state across sessions:
+## Skills (7)
-```bash
-/ctx debug "login fails"    # Start debugging
-/ctx debug --resume         # Resume after context reset
-/ctx debug --list           # See all sessions
-```
+Skills are auto-discovered. Claude Code reads each skill's `WHEN:` description and invokes it automatically when the task matches — no slash command needed.
-**How it works:**
-```
-1. OBSERVE   → Capture exact error, context, state
-2. RESEARCH  → Search codebase and web for similar issues
-3. HYPOTHESIZE → Form testable theory with confidence level
-4. TEST      → Apply minimal fix
-5. VERIFY    → Build + Tests + Lint + Browser
-6. ITERATE   → Refine hypothesis, max 10 attempts
-```
+### Core
-**Key features:**
-- Sessions survive context resets and days between attempts
-- Browser verification with stored credentials
-- Screenshots saved for each attempt
-- Escalation report if max attempts reached
+| Skill | Auto-invoked when... | Purpose |
+|-------|----------------------|---------|
+| ctx-orchestrator | User asks for pipeline, "ctx next", "ctx auto", or autonomous story execution | Runs the full `init → plan → execute → verify → complete` lifecycle via the Agent tool |
+| ctx-state | Any CTX operation needs to read/write `.ctx/STATE.json` or track phase transitions | Manages persistent state, phase transitions, agent history, and task completion log |
+| ctx-review-gate | Code implemented, story ready to close | Three-stage review: spec compliance → code quality → optional Codex cross-model adversarial review; blocks on failures, soft-skips on Codex infrastructure issues |
-State stored in `.ctx/debug/sessions/`:
-- `STATE.json` - Machine-readable progress
-- `TRACE.md` - Human-readable log
-- `hypotheses.json` - All theories tested
-- `screenshots/` - Visual evidence
+### Design
-### Parallel Codebase Analysis
-```bash
-/ctx map-codebase        # Full analysis with 4 parallel agents
-```
+| Skill | Auto-invoked when... | Purpose |
+|-------|----------------------|---------|
+| ctx-design-system | Design system creation, token management, brand kit updates, token export | Manages W3C DTCG 2025.10 tokens as single source of truth; exports to CSS/SCSS/JS/Tailwind |
+| ctx-visual-qa | Visual QA, design parity, pixel-perfect verification, responsive testing, WCAG 2.2 audit | Numerical measurement-driven QA — every delta is a number, every fix is a specific CSS property |
-Spawns 4 agents simultaneously:
-| Agent | Output | Analyzes |
-|-------|--------|----------|
-| TECH | TECH.md | Languages, frameworks, dependencies |
-| ARCH | ARCH.md | Patterns, data flow, modules |
-| QUALITY | QUALITY.md | Test coverage, lint, type safety |
-| CONCERNS | CONCERNS.md | Security, tech debt, performance |
+### Machine Learning
-Results synthesized into `SUMMARY.md`.
+| Skill | Auto-invoked when... | Purpose |
+|-------|----------------------|---------|
+| ctx-ml-experiment | User wants to run ML experiments, track hypotheses, compare models | Hypothesis tracking, model registry, experiment lifecycle |
+| ctx-ml-pipeline | Production ML deployment, inference, drift monitoring | Inference envelope, circuit breaker, KS drift detection, retraining triggers |
 ---
-## Why CTX?
-| Feature | Aider | GSD | CTX 3.5 |
-|---------|-------|-----|---------|
-| Repository Map | Yes | No | **Yes** |
-| Discussion Phase | No | Yes | **Yes** |
-| Model Profiles | Yes | Partial | **Yes** |
-| Git-Native Commits | Yes | No | **Yes** |
-| Persistent Debug | No | Partial | **Yes** |
-| Parallel Analysis | No | Yes | **Yes** |
-| PRD-Driven | No | Yes | **Yes** |
-| Design System | No | No | **Yes** |
-| Browser Verification | No | No | **Yes** |
-**CTX 3.5 combines the best of Aider and GSD.**
----
+## Commands (26)
-## Commands
+### Smart
-### Smart (Auto-routing)
 | Command | Purpose |
 |---------|---------|
-| `/ctx` | **Smart router** - reads STATE.md, does the right thing |
-| `/ctx init` | Initialize project with STATE.md + PRD.json |
+| `/ctx` | Smart router — reads STATE.json, does the right thing |
 ### Mapping
 | Command | Purpose |
 |---------|---------|
-| `/ctx map` | Build repository map (REPO-MAP.md) |
-| `/ctx map-codebase` | Deep analysis (4 parallel agents) |
+| `/ctx:map` | Build token-optimized repository map (REPO-MAP.md) |
+| `/ctx:map-codebase` | Deep parallel analysis: TECH + ARCH + QUALITY + CONCERNS → SUMMARY |
 ### Discussion
 | Command | Purpose |
 |---------|---------|
-| `/ctx discuss [story]` | Capture decisions before planning |
+| `/ctx:discuss [story]` | Capture implementation decisions before planning; locks them in CONTEXT.md |
 ### Configuration
 | Command | Purpose |
 |---------|---------|
-| `/ctx profile [name]` | Switch model profile (quality/balanced/budget) |
+| `/ctx:profile [name]` | Switch model profile: `quality`, `balanced` (default), `budget` |
+### Inspect
-### Inspect (Read-only)
 | Command | Purpose |
 |---------|---------|
-| `/ctx status` | See current state without triggering action |
+| `/ctx:status` | Show current state without triggering any action |
+### Control
-### Control (Override)
 | Command | Purpose |
 |---------|---------|
-| `/ctx plan [goal]` | Force research + planning |
-| `/ctx verify` | Force three-level verification |
-| `/ctx quick "task"` | Quick task bypass |
+| `/ctx:init` | Initialize project: PRD.json + STATE.json + config |
+| `/ctx:plan [goal]` | Force research + planning phase |
+| `/ctx:verify` | Force three-level verification |
+| `/ctx:quick "task"` | Quick task bypass (skips full lifecycle) |
 ### Debug
+| Command | Purpose |
+|---------|---------|
+| `/ctx:debug` | Start debugging current issue |
+| `/ctx:debug "issue"` | Debug specific problem |
+| `/ctx:debug --resume` | Resume last debug session |
+| `/ctx:debug --list` | List all debug sessions |
+| `/ctx:debug --status` | Show current session status |
+### Design
+| Command | Purpose |
+|---------|---------|
+| `/ctx:brand` | Brand establishment: mood board → 3 options → BRAND_KIT.md |
+| `/ctx:design` | Component design: research → 3 options → prototype → implement |
+### QA
 | Command | Purpose |
 |---------|---------|
-| `/ctx debug` | Start debugging current issue |
-| `/ctx debug "issue"` | Debug specific problem |
-| `/ctx debug --resume` | Resume last debug session |
-| `/ctx debug --list` | List all debug sessions |
-| `/ctx debug --status` | Show current session status |
+| `/ctx:qa` | Full system QA — WCAG 2.1 AA, every page, every interaction |
+| `/ctx:qa --a11y-only` | Accessibility audit only |
+| `/ctx:qa --visual-only` | Visual regression across mobile/tablet/desktop |
+| `/ctx:qa --resume` | Resume interrupted QA session |
+| `/ctx:visual-qa` | Measurement-driven design parity check |
+### ML
-### QA (Full System Testing)
 | Command | Purpose |
 |---------|---------|
-| `/ctx qa` | Full system QA - WCAG 2.1 AA, every page, every button |
-| `/ctx qa --section "auth"` | QA specific section only |
-| `/ctx qa --a11y-only` | Accessibility audit only |
-| `/ctx qa --visual-only` | Visual regression (mobile/tablet/desktop) |
-| `/ctx qa --resume` | Resume interrupted QA session |
-| `/ctx qa --report` | Show last QA report |
+| `/ctx:experiment` | Start ML experiment loop |
+| `/ctx:train` | Trigger training pipeline |
+| `/ctx:ml-status` | Show experiment registry and model status |
 ### Session
 | Command | Purpose |
 |---------|---------|
-| `/ctx pause` | Checkpoint for session resume |
+| `/ctx:pause` | Checkpoint state for session resume |
+### Phase
-### Phase Management
 | Command | Purpose |
 |---------|---------|
-| `/ctx phase list` | Show all phases |
-| `/ctx phase add "goal"` | Add new phase |
-| `/ctx phase next` | Complete current, move to next |
+| `/ctx:phase list` | Show all phases and their status |
+| `/ctx:phase add "goal"` | Add a new phase |
+| `/ctx:phase next` | Complete current phase, advance to next |
 ### Integration
 | Command | Purpose |
 |---------|---------|
-| `/ctx integrate` | Show integration status |
-| `/ctx integrate linear` | Setup Linear |
-| `/ctx integrate jira` | Setup Jira |
-| `/ctx integrate github` | Setup GitHub Issues |
-| `/ctx integrate --sync` | Sync all stories |
+| `/ctx:integrate` | Show integration status |
+| `/ctx:integrate linear` | Set up Linear sync |
+| `/ctx:integrate jira` | Set up Jira sync |
+| `/ctx:integrate github` | Set up GitHub Issues sync |
+| `/ctx:integrate --sync` | Sync all stories with connected tracker |
 ### Milestone
+| Command | Purpose |
+|---------|---------|
+| `/ctx:milestone` | Show current milestone |
+| `/ctx:milestone list` | List all milestones |
+| `/ctx:milestone audit` | Verify milestone completion |
+| `/ctx:milestone complete` | Archive and tag release |
+| `/ctx:milestone new [name]` | Start next version |
+| `/ctx:milestone gaps` | Generate fix phases for gaps |
+### Metrics
 | Command | Purpose |
 |---------|---------|
-| `/ctx milestone` | Show current milestone |
-| `/ctx milestone list` | List all milestones |
-| `/ctx milestone audit` | Verify completion |
-| `/ctx milestone complete` | Archive and tag |
-| `/ctx milestone new [name]` | Start next version |
-| `/ctx milestone gaps` | Generate fix phases |
-### Metrics & Audit
+| `/ctx:metrics` | Productivity dashboard |
+| `/ctx:metrics cost` | Cost analysis by model/profile |
+| `/ctx:metrics export` | Export HTML dashboard |
+### Learning
 | Command | Purpose |
 |---------|---------|
-| `/ctx metrics` | Show productivity dashboard |
-| `/ctx metrics cost` | Cost analysis |
-| `/ctx metrics export` | Export HTML dashboard |
-| `/ctx audit` | Show audit summary |
-| `/ctx audit export` | Generate compliance report |
+| `/ctx:learn` | Show what CTX has learned about your project |
+| `/ctx:learn patterns` | Show detected code patterns |
+| `/ctx:learn decisions` | Show architectural decisions log |
+| `/ctx:predict` | Get AI-suggested next features |
+| `/ctx:predict --quick` | Quick wins only |
+### Monitoring
-### Learning & Prediction
 | Command | Purpose |
 |---------|---------|
-| `/ctx learn` | Show what CTX has learned |
-| `/ctx learn patterns` | Show code patterns |
-| `/ctx learn decisions` | Show architectural decisions |
-| `/ctx learn forget [id]` | Remove a learned pattern |
-| `/ctx predict` | Get feature suggestions |
-| `/ctx predict --quick` | Quick wins only |
-| `/ctx predict --create [id]` | Create story from suggestion |
-### Monitoring & Voice
+| `/ctx:monitor` | Show monitoring status |
+| `/ctx:monitor connect sentry` | Connect Sentry error tracking |
+| `/ctx:monitor errors` | List recent production errors |
+| `/ctx:monitor auto-fix [id]` | Auto-fix error with PR |
+| `/ctx:monitor --watch` | Continuous monitoring mode |
+### Voice
 | Command | Purpose |
 |---------|---------|
-| `/ctx monitor` | Show monitoring status |
-| `/ctx monitor connect sentry` | Connect Sentry |
-| `/ctx monitor errors` | List recent errors |
-| `/ctx monitor auto-fix [id]` | Auto-fix with PR |
-| `/ctx monitor --watch` | Continuous monitoring |
-| `/ctx voice` | Start voice input |
-| `/ctx voice --continuous` | Always listening mode |
-| `/ctx voice --dictate` | Long-form dictation |
+| `/ctx:voice` | Start voice input |
+| `/ctx:voice --continuous` | Always-listening mode |
+| `/ctx:voice --dictate` | Long-form dictation |
+---
+## Hooks (3)
+Hooks are deterministic Node.js scripts registered in `settings.json`. They run synchronously on every tool call, independent of Claude's reasoning.
+| Hook | File | Trigger | Behavior |
+|------|------|---------|----------|
+| pre-tool-use | `hooks/pre-tool-use.js` | Before any tool executes | TDD enforcement + capability restrictions. Exit 2 blocks the tool call. |
+| post-tool-use | `hooks/post-tool-use.js` | After any tool executes | Logs file modifications to audit trail in `.ctx/audit.log` |
+| subagent-stop | `hooks/subagent-stop.js` | When a subagent finishes | Records agent completion in `.ctx/STATE.json` |
+**Configure hook behavior:**
+```bash
+npx ctx-cc config set hooks.tddMode strict    # Block writes without tests
+npx ctx-cc config set hooks.tddMode warn      # Warn but allow
+npx ctx-cc config set hooks.tddMode off       # Disabled
+```
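The hook table above says that exit code 2 blocks a tool call and that `hooks.tddMode` selects between `strict`, `warn`, and `off`. The following is a minimal sketch of how such a gate could decide, not the package's actual `hooks/pre-tool-use.js`. The function names, the `foo.test.js` naming convention, the `Write`/`Edit` tool filter, and the injected `fileExists` callback are all assumptions made for illustration.

```javascript
// Hypothetical TDD gate sketch; not the real hooks/pre-tool-use.js.
const path = require("node:path");

// Assumed convention: src/foo.js is covered by src/foo.test.js.
function expectedTestPath(filePath) {
  const ext = path.extname(filePath);
  return filePath.slice(0, filePath.length - ext.length) + ".test" + ext;
}

// Treat *.test.* and *.spec.* as test files (assumption).
function isTestFile(filePath) {
  return /\.(test|spec)\.[^.]+$/.test(filePath);
}

// Returns the exit code a hook process would use: 0 allows, 2 blocks.
// `fileExists` is injected so the decision can be checked without touching disk.
function decide(event, tddMode, fileExists) {
  const writeTools = new Set(["Write", "Edit"]);
  const filePath = event.tool_input && event.tool_input.file_path;
  if (tddMode === "off" || !writeTools.has(event.tool_name) || !filePath) {
    return { exitCode: 0, message: "" };
  }
  if (isTestFile(filePath) || fileExists(expectedTestPath(filePath))) {
    return { exitCode: 0, message: "" };
  }
  const message =
    `No test found for ${filePath} (expected ${expectedTestPath(filePath)})`;
  // strict blocks the call; warn reports the problem but still allows it.
  return { exitCode: tddMode === "strict" ? 2 : 0, message };
}
```

In a real hook script the result would presumably be applied by writing `message` to stderr and calling `process.exit(exitCode)`; keeping the decision in a pure function makes the three modes easy to check in isolation.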
 ---
-## State Machine
+## Design Workflow
+CTX implements an agency-grade design process with mandatory approval gates.
+**Phase 1 — Brand**
 ```
-initializing → discussing → executing → verifying → COMPLETE
-                    ↑            ↓
-                    └── debugging ──┘
+Research → Mood board → 3 direction options → User picks → BRAND_KIT.md
 ```
+BRAND_KIT.md becomes the constraint for all subsequent design work. Colors, typography, and spacing flow from tokens only.
-| State | What happens |
-|-------|--------------|
-| initializing | Research + Map + Plan |
-| discussing | Capture decisions in CONTEXT.md |
-| executing | Execute with git-native commits |
-| debugging | Persistent debug loop (max 10 attempts) |
-| verifying | Three-level verification |
-| paused | Resume from checkpoint |
+**Phase 2 — Component Design**
+```
+Research → 3 options (A/B/C) → User approves direction → Prototype → Implement
+```
+Never a single design. Options are always presented before implementation.
+**Phase 3 — Visual QA**
+Every design change triggers numerical verification:
+- Measure rendered values vs design spec (px, rem, hex)
+- Report deltas as numbers, not subjective descriptions
+- Fixes are specific: `change font-size from 14px to 16px`
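The three bullets above describe measuring rendered values against a spec and reporting numeric deltas with property-level fixes. A small sketch of that comparison might look like the following. The `diffStyles` name and the plain-object input shape are hypothetical, and the numeric delta is only meaningful when both values use the same unit.

```javascript
// Hypothetical sketch: compare design-spec values with rendered values.
// Both arguments map CSS property names to string values, e.g. "16px".
function diffStyles(spec, rendered) {
  const fixes = [];
  for (const [prop, expected] of Object.entries(spec)) {
    const actual = rendered[prop];
    if (actual === expected) continue;
    const e = parseFloat(expected);
    const a = parseFloat(actual);
    // Non-numeric values such as hex colors get no numeric delta.
    const delta = Number.isNaN(e) || Number.isNaN(a) ? null : a - e;
    fixes.push({
      prop,
      expected,
      actual,
      delta,
      fix: `change ${prop} from ${actual} to ${expected}`,
    });
  }
  return fixes;
}
```

For a spec of `{ "font-size": "16px" }` and a rendered value of `14px`, this yields a delta of `-2` and the fix string `change font-size from 14px to 16px`, matching the wording of the README example.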
+**Compliance:** WCAG 2.2 AA + EAA 2025. Touch targets, contrast ratios, keyboard navigation, and screen reader labels are all measured.
+**Key tools:** Figma MCP (token sync + component metadata), Gemini design analysis, W3C DTCG 2025.10 token format.
 ---
-## Context Management
+## ML Workflow
+CTX implements an experiment-driven ML development loop.
+**Phase 1 — Data Analysis**
+```
+Load data → EDA → Quality scoring → Feature correlation → Pandera validation schema
+```
+**Phase 2 — Experiment Loop**
+```
+Hypothesize → Design experiment → Run → Analyze → Register result → Iterate
+```
+All hypotheses and results are tracked in `.ctx/ml/experiments/`. The model registry stores every trained artifact with metadata.
-CTX actively manages context budget:
+**Phase 3 — Model Evaluation**
+- Conformal prediction intervals (MAPIE)
+- Statistical significance testing
+- Calibration curves and reliability diagrams
-| Usage | Quality | Action |
-|-------|---------|--------|
-| 0-30% | Peak | Continue |
-| 30-40% | Good | Continue |
-| 40-50% | Good | Prepare handoff notes |
-| 50-60% | Degrading | Auto-checkpoint |
-| 60-70% | Degrading | Create HANDOFF.md |
-| 70%+ | Poor | Force checkpoint |
+**Phase 4 — Production Pipeline**
+- Model registry with version pinning
+- Inference envelope with latency SLA
+- Circuit breaker (auto-disable on error spike)
+- KS drift detection with configurable thresholds
+- Retraining triggers on drift
-Smart handoff creates `HANDOFF.md` with:
-- Completed tasks with commit hashes
-- Current task progress
-- Key decisions made
-- Files modified
-- Next steps
+**Proven patterns:** XGBoost + MAPIE conformal prediction, T-learner causal inference, KS drift, Pandera schema validation — from Digital Twin production workflows.
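The production pipeline above lists KS drift detection with configurable thresholds. As an illustration of the underlying statistic, here is a sketch of the two-sample Kolmogorov-Smirnov statistic, the largest gap between two empirical CDFs. The function names and the idea of thresholding the statistic directly (rather than a p-value) are assumptions, not taken from the package.

```javascript
// Two-sample KS statistic: max |F_a(v) - F_b(v)| over all observed values v.
function ksStatistic(a, b) {
  if (a.length === 0 || b.length === 0) {
    throw new Error("both samples must be non-empty");
  }
  const x = [...a].sort((p, q) => p - q);
  const y = [...b].sort((p, q) => p - q);
  let i = 0;
  let j = 0;
  let d = 0;
  while (i < x.length && j < y.length) {
    const v = Math.min(x[i], y[j]);
    // Advance past every value equal to v in both samples (handles ties).
    while (i < x.length && x[i] === v) i++;
    while (j < y.length && y[j] === v) j++;
    d = Math.max(d, Math.abs(i / x.length - j / y.length));
  }
  return d;
}

// Hypothetical drift check: flag drift when the statistic exceeds a threshold.
function driftDetected(reference, current, threshold = 0.2) {
  return ksStatistic(reference, current) > threshold;
}
```

The default threshold of `0.2` is arbitrary; in practice it would be tuned per feature, or replaced by a significance test on the statistic.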
 ---
-## 21 Specialized Agents
-| Agent | Spawned when | Model (balanced) |
-|-------|--------------|------------------|
-| ctx-mapper | /ctx map | haiku |
-| ctx-tech-mapper | /ctx map-codebase | haiku |
-| ctx-arch-mapper | /ctx map-codebase | haiku |
-| ctx-quality-mapper | /ctx map-codebase | haiku |
-| ctx-concerns-mapper | /ctx map-codebase | haiku |
-| ctx-discusser | status = discussing | sonnet |
-| ctx-researcher | status = initializing | opus |
-| ctx-planner | after research | opus |
-| ctx-executor | status = executing | sonnet |
-| ctx-designer | design stories | sonnet |
-| ctx-debugger | status = debugging | sonnet |
-| ctx-verifier | status = verifying | haiku |
-| ctx-parallelizer | before execution | haiku |
-| ctx-reviewer | before commit | sonnet |
-| ctx-criteria-suggester | during init/discuss | sonnet |
-| ctx-handoff | at context thresholds | haiku |
-| ctx-team-coordinator | team mode | sonnet |
-| ctx-auditor | always (background) | haiku |
-| ctx-learner | observing patterns | haiku |
-| ctx-predictor | after milestone/on demand | sonnet |
-| ctx-qa | /ctx qa (full system test) | sonnet |
+## Configuration
+```bash
+npx ctx-cc config list                       # Show all config values
+npx ctx-cc config get activeProfile          # Get a specific value
+npx ctx-cc config set hooks.tddMode strict   # Set a value
+```
----
+**Model profiles:**
-## Directory Structure
+| Profile | Research | Planning | Execution | Verify | Relative Cost |
+|---------|----------|----------|-----------|--------|---------------|
+| quality | opus | opus | opus | sonnet | ~3x |
+| balanced | opus | opus | sonnet | haiku | 1x (default) |
+| budget | sonnet | sonnet | sonnet | haiku | ~0.4x |
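The profile table above can be read as a lookup from profile and stage to a model tier. The sketch below encodes exactly those rows. The `PROFILES` object and `modelFor` helper are hypothetical names for illustration and do not reflect how the package stores its configuration.

```javascript
// Stage-to-model mapping copied from the model profiles table.
const PROFILES = {
  quality: { research: "opus", planning: "opus", execution: "opus", verify: "sonnet" },
  balanced: { research: "opus", planning: "opus", execution: "sonnet", verify: "haiku" },
  budget: { research: "sonnet", planning: "sonnet", execution: "sonnet", verify: "haiku" },
};

// Hypothetical helper: resolve the model tier for a stage under a profile.
function modelFor(profile, stage) {
  const stages = PROFILES[profile];
  if (!stages) throw new Error(`unknown profile: ${profile}`);
  const model = stages[stage];
  if (!model) throw new Error(`unknown stage: ${stage}`);
  return model;
}
```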
+Switch profiles without reinstalling:
 ```
-.ctx/
-├── config.json       # Model profiles, git settings
-├── STATE.md          # Living digest - execution state
-├── PRD.json          # Requirements contract
-├── REPO-MAP.md       # Token-optimized codebase map
-├── REPO-MAP.json     # Structured map data
-├── .env              # Test credentials (GITIGNORED)
-├── codebase/         # Deep analysis results
-│   ├── TECH.md
-│   ├── ARCH.md
-│   ├── QUALITY.md
-│   ├── CONCERNS.md
-│   └── SUMMARY.md
-├── phases/{story_id}/
-│   ├── CONTEXT.md    # Locked decisions (discussion phase)
-│   ├── RESEARCH.md   # ArguSeek results
-│   ├── PLAN.md       # Tasks mapped to criteria
-│   └── VERIFY.md     # Verification report
-├── debug/
-│   ├── sessions/     # Persistent debug state
-│   └── screenshots/  # Visual proof
-├── checkpoints/      # Auto-checkpoints
-└── memory/           # Decision memory
+/ctx:profile quality
+/ctx:profile balanced
+/ctx:profile budget
 ```
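The profile table above can be read as a two-level lookup: profile first, then phase. A minimal sketch of that lookup follows. The `PROFILES` constant and the `resolveModel` function are hypothetical names used for illustration, and the lowercase phase keys are an assumption; only the model assignments come from the table.

```javascript
// Hypothetical lookup table mirroring the "Model profiles" table.
const PROFILES = {
  quality:  { research: "opus",   planning: "opus",   execution: "opus",   verify: "sonnet" },
  balanced: { research: "opus",   planning: "opus",   execution: "sonnet", verify: "haiku" },
  budget:   { research: "sonnet", planning: "sonnet", execution: "sonnet", verify: "haiku" },
};

// Return the model name for a given profile and phase, failing loudly on typos.
function resolveModel(profile, phase) {
  if (!Object.prototype.hasOwnProperty.call(PROFILES, profile)) {
    throw new Error(`Unknown profile: ${profile}`);
  }
  const row = PROFILES[profile];
  if (!Object.prototype.hasOwnProperty.call(row, phase)) {
    throw new Error(`Unknown phase: ${phase}`);
  }
  return row[phase];
}

console.log(resolveModel("balanced", "execution")); // sonnet
console.log(resolveModel("quality", "verify"));     // sonnet
console.log(resolveModel("budget", "verify"));      // haiku
```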
 ---
-## Configuration
+## Phase Lifecycle
-`.ctx/config.json`:
-```json
-{
-  "activeProfile": "balanced",
-  "models": {
-    "architect": { "id": "claude-opus-4", "costTier": "high" },
-    "default": { "id": "claude-sonnet-4", "costTier": "medium" },
-    "fast": { "id": "claude-haiku-4", "costTier": "low" }
-  },
-  "profiles": {
-    "quality": {
-      "research": "architect",
-      "discussion": "architect",
-      "planning": "architect",
-      "execution": "architect"
-    },
-    "balanced": {
-      "research": "architect",
-      "discussion": "default",
-      "planning": "architect",
-      "execution": "default"
-    },
-    "budget": {
-      "research": "default",
-      "planning": "default",
-      "execution": "default"
-    }
-  },
-  "git": {
-    "autoCommit": true,
-    "commitPerTask": true
-  }
-}
 ```
+init → plan → execute → verify → complete
+                ↑          ↓
+                ←── (fix failures)
+```
+State is persisted in `.ctx/STATE.json` after every transition. The `ctx-state` skill manages reads and writes. The `ctx-orchestrator` skill drives transitions.
+| Phase | What happens |
+|-------|--------------|
+| init | Research + repo map + PRD validation |
+| plan | Acceptance criteria + atomic task plan (2–3 tasks) |
+| execute | Implementation with per-task git commits |
+| verify | Three-level check: exists → substantive → wired |
+| complete | Review gate passed, story archived |
+If verification fails, state returns to `execute` automatically. The fix-loop runs until all three verification levels pass.
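The lifecycle above behaves like a small state machine with one backward edge from `verify` to `execute`. The sketch below illustrates that shape under stated assumptions: `TRANSITIONS`, `transition`, and `afterVerify` are hypothetical names, the state object is simplified to a single `phase` field, and the review gate that precedes `complete` is not modeled. It is not the `ctx-state` or `ctx-orchestrator` implementation.

```javascript
// Hypothetical transition table derived from the lifecycle diagram.
const TRANSITIONS = {
  init: ["plan"],
  plan: ["execute"],
  execute: ["verify"],
  verify: ["complete", "execute"], // "execute" is the fix-loop edge
  complete: [],
};

// Return a new state object in the next phase, rejecting invalid transitions.
function transition(state, next) {
  const allowed = TRANSITIONS[state.phase] || [];
  if (!allowed.includes(next)) {
    throw new Error(`Invalid transition: ${state.phase} -> ${next}`);
  }
  return { ...state, phase: next };
}

// Move on only when all three verification levels pass; otherwise loop back.
function afterVerify(state, results) {
  const passed = results.exists && results.substantive && results.wired;
  return transition(state, passed ? "complete" : "execute");
}

// Usage: a story that fails the "wired" check once, then passes.
let state = { phase: "init" };
state = transition(state, "plan");
state = transition(state, "execute");
state = transition(state, "verify");
state = afterVerify(state, { exists: true, substantive: true, wired: false });
console.log(state.phase); // execute
state = transition(state, "verify");
state = afterVerify(state, { exists: true, substantive: true, wired: true });
console.log(state.phase); // complete
```

Keeping the allowed edges in one table makes invalid jumps, such as `init` straight to `execute`, fail in a single place.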
 ---
-## Integrations
+## Plugin Manifest
-### ArguSeek (Web Research)
-Auto-runs during planning for best practices, security, and patterns.
+CTX ships with `plugin.json` for Claude Code marketplace distribution. Future marketplace installs will use:
-### ChunkHound (Semantic Code Search)
-Auto-runs during planning for semantic search and pattern detection.
 ```bash
-uv tool install chunkhound
+/plugin install ctx@my-marketplace
 ```
-### Browser Verification (Playwright/Chrome DevTools)
-Auto-runs during debugging and verification for visual proof.
-### Figma MCP (Design Context)
-Auto-runs during design stories for tokens and component metadata.
-### Gemini Design MCP (Visual Generation)
-Auto-runs during design stories for mockups and UI code.
+The manifest declares all agents, skills, commands, and hooks so the marketplace can display capabilities, manage versions, and handle updates without reinstalling.
 ---
-## Key Principles
+## CLI Reference
-### 95% Auto-Deviation Handling
+```bash
+npx ctx-cc [options]           Install CTX into Claude Code
+npx ctx-cc list                List all 25 agents with model/maxTurns
+npx ctx-cc skills              Analyze skill descriptions and triggers
+npx ctx-cc config list         Show full configuration
+npx ctx-cc config get <key>    Get a config value
+npx ctx-cc config set <k> <v>  Set a config value
+npx ctx-cc --help              Show help
+Options:
+  --global, -g     Install to ~/.claude (default)
+  --project, -p    Install to .claude in current directory
+  --force, -f      Overwrite existing installation
+```
-| Trigger | Action |
-|---------|--------|
-| Bug in existing code | Auto-fix, document in commit |
-| Missing validation | Auto-add, document |
-| Blocking issue | Auto-fix, document |
-| Architecture decision | **Ask user** |
+---
-### Three-Level Verification
+## Development
-| Level | Question | Check |
-|-------|----------|-------|
-| Exists | File on disk? | Glob |
-| Substantive | Real code, not stub? | No TODOs, no placeholders |
-| Wired | Imported and used? | Trace imports |
+```bash
+git clone https://github.com/jufjuf/CTX.git
+cd CTX
+npm test    # 264 tests, node:test runner
+```
-### Atomic Planning
+**Project structure:**
-Plans limited to 2-3 tasks to prevent context degradation.
+```
+ctx-cc/
+├── agents/          25 agent definitions (.md with frontmatter)
+├── skills/          7 skill directories (each contains SKILL.md)
+├── commands/        26 slash command definitions (.md)
+├── hooks/           3 enforcement hook scripts (.js)
+├── src/             17 source modules (.js)
+├── test/            19 test files (.test.js)
+├── templates/       config.json, PRD.json, state templates
+├── bin/ctx.js       CLI entry point (installer only)
+├── plugin.json      Marketplace manifest
+└── package.json     Zero runtime dependencies
+```
 ---
-## Updating
+## Testing
 ```bash
-npx ctx-cc --force
+npm test
+# 264 tests, 0 failures, ~2s
 ```
+**Coverage:**
+| Area | What is tested |
+|------|----------------|
+| Agent discovery | Frontmatter parsing, model/maxTurns validation |
+| State machine | Phase transitions, invalid transition rejection |
+| Pipelines | Orchestrator flow, review gate stages |
+| Worktrees | Parallel execution isolation |
+| Hooks | TDD enforcement, audit logging, subagent tracking |
+| Capabilities | Restriction rules, exit codes |
+| Context profiles | Model resolution per profile |
+| Skills format | SKILL.md structure, description format |
+| Design compliance | Token format, BRAND_KIT schema |
+| ML compliance | Experiment schema, pipeline config |
+| CLI commands | list, skills, config get/set |
+| Integration | End-to-end install + verify |
 ---
 ## License
@@ -594,8 +548,8 @@ MIT
 <div align="center">
-**[GitHub](https://github.com/jufjuf/CTX)** · **[Issues](https://github.com/jufjuf/CTX/issues)** · **[npm](https://www.npmjs.com/package/ctx-cc)**
+**[GitHub](https://github.com/jufjuf/CTX)** · **[npm](https://www.npmjs.com/package/ctx-cc)** · **[Issues](https://github.com/jufjuf/CTX/issues)**
-*CTX 3.5 - Conversational-first. Just describe what you want. 21 specialized agents. PRD-driven development.*
+CTX 4.0 — 25 agents · 7 skills · 3 hooks · zero dependencies
 </div>