npm - ctx-cc - Versions diffs - 3.5.0 → 4.1.0 - Mend

ctx-cc 3.5.0 → 4.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (74)

package/README.md +375 -676
package/agents/ctx-arch-mapper.md +5 -3
package/agents/ctx-auditor.md +5 -3
package/agents/ctx-codex-reviewer.md +214 -0
package/agents/ctx-concerns-mapper.md +5 -3
package/agents/ctx-criteria-suggester.md +6 -4
package/agents/ctx-debugger.md +5 -3
package/agents/ctx-designer.md +488 -114
package/agents/ctx-discusser.md +5 -3
package/agents/ctx-executor.md +5 -3
package/agents/ctx-handoff.md +6 -4
package/agents/ctx-learner.md +5 -3
package/agents/ctx-mapper.md +4 -3
package/agents/ctx-ml-analyst.md +600 -0
package/agents/ctx-ml-engineer.md +933 -0
package/agents/ctx-ml-reviewer.md +485 -0
package/agents/ctx-ml-scientist.md +626 -0
package/agents/ctx-parallelizer.md +4 -3
package/agents/ctx-planner.md +5 -3
package/agents/ctx-predictor.md +4 -3
package/agents/ctx-qa.md +5 -3
package/agents/ctx-quality-mapper.md +5 -3
package/agents/ctx-researcher.md +5 -3
package/agents/ctx-reviewer.md +6 -4
package/agents/ctx-team-coordinator.md +5 -3
package/agents/ctx-tech-mapper.md +5 -3
package/agents/ctx-verifier.md +5 -3
package/bin/ctx.js +199 -27
package/commands/brand.md +309 -0
package/commands/ctx.md +10 -10
package/commands/design.md +304 -0
package/commands/experiment.md +251 -0
package/commands/help.md +57 -7
package/commands/init.md +25 -0
package/commands/metrics.md +1 -1
package/commands/milestone.md +1 -1
package/commands/ml-status.md +197 -0
package/commands/monitor.md +1 -1
package/commands/train.md +266 -0
package/commands/visual-qa.md +559 -0
package/commands/voice.md +1 -1
package/hooks/post-tool-use.js +39 -0
package/hooks/pre-tool-use.js +94 -0
package/hooks/subagent-stop.js +32 -0
package/package.json +9 -3
package/plugin.json +46 -0
package/skills/ctx-design-system/SKILL.md +572 -0
package/skills/ctx-ml-experiment/SKILL.md +334 -0
package/skills/ctx-ml-pipeline/SKILL.md +437 -0
package/skills/ctx-orchestrator/SKILL.md +91 -0
package/skills/ctx-review-gate/SKILL.md +147 -0
package/skills/ctx-state/SKILL.md +100 -0
package/skills/ctx-visual-qa/SKILL.md +587 -0
package/src/agents.js +109 -0
package/src/auto.js +287 -0
package/src/capabilities.js +226 -0
package/src/commits.js +94 -0
package/src/config.js +112 -0
package/src/context.js +241 -0
package/src/handoff.js +156 -0
package/src/hooks.js +218 -0
package/src/install.js +125 -50
package/src/lifecycle.js +194 -0
package/src/metrics.js +198 -0
package/src/pipeline.js +269 -0
package/src/review-gate.js +338 -0
package/src/runner.js +120 -0
package/src/skills.js +143 -0
package/src/state.js +267 -0
package/src/worktree.js +244 -0
package/templates/PRD.json +1 -1
package/templates/config.json +4 -237
package/workflows/ctx-router.md +0 -485
package/workflows/map-codebase.md +0 -329

package/README.md CHANGED

@@ -1,843 +1,542 @@
 <div align="center">
-# CTX
-### Continuous Task eXecution
+```
+   ██████╗████████╗██╗  ██╗
+  ██╔════╝╚══██╔══╝╚██╗██╔╝
+  ██║        ██║    ╚███╔╝
+  ██║        ██║    ██╔██╗
+  ╚██████╗   ██║   ██╔╝ ██╗
+   ╚═════╝   ╚═╝   ╚═╝  ╚═╝
+```
-**Intelligent workflow orchestration for Claude Code.**
+**Intelligent workflow orchestration for Claude Code**
 [![npm version](https://img.shields.io/npm/v/ctx-cc.svg?style=flat-square)](https://www.npmjs.com/package/ctx-cc)
-[![npm downloads](https://img.shields.io/npm/dm/ctx-cc.svg?style=flat-square)](https://www.npmjs.com/package/ctx-cc)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=flat-square)](https://opensource.org/licenses/MIT)
-[![GitHub stars](https://img.shields.io/github/stars/jufjuf/CTX?style=flat-square)](https://github.com/jufjuf/CTX/stargazers)
-<img src="./assets/terminal.png" alt="CTX Terminal" width="700">
-**Conversational-first. Just describe what you want — no commands to memorize.**
-AI that learns your preferences. Predictive planning. Self-healing deployments. 21 specialized agents.
-[Installation](#installation) · [Quick Start](#quick-start) · [New in 3.3](#new-in-33) · [Commands](#commands) · [Why CTX](#why-ctx) · [**Getting Started Guide**](./GETTING_STARTED.md)
-</div>
----
-## Installation
+[![Tests](https://img.shields.io/badge/tests-264%20passing-brightgreen.svg?style=flat-square)](#testing)
+[![Zero deps](https://img.shields.io/badge/dependencies-0-brightgreen.svg?style=flat-square)](#)
 ```bash
 npx ctx-cc
 ```
-That's it. CTX installs itself to your Claude Code environment.
-```bash
-# Options
-npx ctx-cc --global     # Install to ~/.claude (default)
-npx ctx-cc --project    # Install to .claude in current directory
-npx ctx-cc --force      # Overwrite existing installation
-```
----
-## Quick Start
-**Just talk to CTX.** No commands to memorize:
-```
-"I want to build a todo app"     → CTX sets up your project
-"Fix the login bug"              → CTX starts debugging
-"Is my app accessible?"          → CTX runs accessibility QA
-"What should I do next?"         → CTX shows status + recommendation
-```
-Or use commands directly:
-```
-1. /ctx init           Gather requirements + credentials + design context
-2. /ctx map            Build repository map (existing codebases)
-3. /ctx                Autonomous execution with minimal interruption
-```
-**The Flow:**
-```
-Tell CTX what you want → CTX figures out the rest → Delivered!
-```
+</div>
 ---
-## New in 3.3
-### Conversational-First Routing
-**No commands to memorize.** CTX understands natural language from your first prompt:
-| You Say | CTX Does |
-|---------|----------|
-| "I want to build a todo app" | Sets up project, researches best practices, creates plan |
-| "Fix the login bug" | Analyzes codebase, starts debugging |
-| "Is my app accessible?" | Runs WCAG 2.1 AA accessibility audit |
-| "Test everything" | Crawls every page, clicks every button |
-| "What's next?" | Shows status and recommended action |
-| "Ship it" | Runs verification, then deploys |
-**New user?** CTX welcomes you and guides setup:
-```
-━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
-Welcome to CTX!
-━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
-I understood: "build a todo app with React"
-You want to build something new. Let's set it up!
-I'll now:
-1. Create your project structure
-2. Research best practices for React todo apps
-3. Create a plan with clear milestones
-```
-Commands still work as shortcuts for power users.
-### Full System QA
-Crawl every page, click every button, find all issues:
+## What is CTX?
-```bash
-/ctx qa                     # Full system QA (WCAG 2.1 AA)
-/ctx qa --a11y-only         # Accessibility audit only
-/ctx qa --visual-only       # Visual regression (3 viewports)
-/ctx qa --resume            # Resume interrupted session
-```
+CTX transforms Claude Code from a single AI assistant into a full development agency. One installer wires 25 specialized agents, 7 auto-discovered skills, and 3 deterministic enforcement hooks directly into Claude Code's native extension points — no runtime daemon, no wrapper, no proxy.
-Features:
-- **WCAG 2.1 AA compliance** - Touch targets, alt text, labels, contrast, keyboard
-- **Multi-viewport testing** - Mobile (375px), Tablet (768px), Desktop (1280px)
-- **Performance monitoring** - Slow requests, large assets
-- **Trace capture** - Screenshots and logs for every failed interaction
-- **Fix tasks** - Issues organized by section, ready for execution
-### Learning System
-CTX learns from your codebase and preferences:
-```
-[LEARNER] Applying 3 learned preferences
+**Three verticals out of the box:**
-1. Using Zod for validation
-   → Based on pattern P001 (95% confidence, 12 observations)
+| Vertical | Coverage |
+|----------|----------|
+| Software Development | Phase-based lifecycle, autonomous execution, persistent debug, review gates |
+| Agency-Grade Design | Figma MCP-first workflow, W3C DTCG tokens, pixel-perfect QA, WCAG 2.2 AA |
+| Machine Learning | Experiment tracking, model registry, conformal prediction, drift detection |
-2. Functional components only
-   → Based on decision D005 (explicit preference)
-3. Avoiding moment.js, using date-fns
-   → Based on failure F001 (bundle size)
-```
-Memory stored in `.ctx/memory/`:
-- `patterns.json` - Code patterns you prefer
-- `decisions.json` - Past architectural decisions
-- `failures.json` - What didn't work
-- `preferences.json` - Communication style
-- `conventions.json` - Naming and structure
-Commands:
-```bash
-/ctx learn                  # Show what CTX has learned
-/ctx learn patterns         # Show code patterns
-/ctx learn decisions        # Show architectural decisions
-/ctx learn forget [id]      # Remove a learned pattern
-```
+**Key value propositions:**
-### Predictive Planning
-AI suggests what to build next based on patterns:
-```
-[PREDICT] Feature Suggestions for MyApp
-Based on your e-commerce app at MVP stage:
-1. Cart Abandonment Recovery
-   ├─ ROI: 9/10
-   ├─ Effort: Medium (2-3 days)
-   └─ Impact: +15-25% recovered revenue
-2. Product Reviews & Ratings
-   ├─ ROI: 9/10
-   ├─ Effort: Low (1-2 days)
-   └─ Impact: +18% conversion rate
-3. Discount Codes System
-   ├─ ROI: 7/10
-   └─ Effort: Low (1 day)
-[1] Create story  [A] Create all  [R] Research more
-```
-Commands:
-```bash
-/ctx predict                # Full prediction analysis
-/ctx predict --quick        # Quick wins only (< 1 day)
-/ctx predict --create [id]  # Create story from suggestion
-```
-### Self-Healing Deployments
-Monitor production errors and auto-fix:
-```bash
-/ctx monitor connect sentry  # Connect Sentry
-/ctx monitor errors          # List recent errors
-/ctx monitor auto-fix [id]   # Auto-fix with PR
-/ctx monitor --watch         # Continuous monitoring
-```
-Auto-fix workflow:
-```
-1. Sentry reports: "TypeError in checkout.ts:45"
-2. CTX analyzes error pattern
-3. If safe pattern (null check, import, etc.):
-   → Auto-generates fix
-   → Runs tests
-   → Creates PR
-   → Notifies team
-4. If complex pattern:
-   → Creates fix story
-   → Assigns priority
-   → Provides context
-```
-Supports: Sentry, LogRocket, Bugsnag, Datadog
-### Voice Control
-Speak your requirements instead of typing:
-```bash
-/ctx voice                  # Start voice input
-/ctx voice --continuous     # Always listening
-/ctx voice --dictate        # Long-form input
-```
-```
-[VOICE] 🎤 Listening...
-User: "Create a story for user authentication
-       with email login and forgot password"
-[VOICE] Story created:
-  Title: User Authentication
-  Criteria:
-    ✓ User can login with email/password
-    ✓ Forgot password sends reset email
-    ✓ Session persists across browser
-Add to PRD? [Y/n/edit]
-```
-Supports: macOS Speech, Whisper (local), Google Speech, Azure
+- **Phase-based lifecycle** — `init → plan → execute → verify → complete` with state tracked in `.ctx/STATE.json`
+- **Autonomous execution with review gates** — three-stage review: spec compliance, code quality, and optional cross-model adversarial review via OpenAI Codex
+- **Figma MCP-first design workflow** — tokens sync from Figma, visual QA measures pixels numerically
+- **ML experiment-driven development** — hypothesis tracking, XGBoost+MAPIE patterns, KS drift, Digital Twin workflows
+- **Zero dependencies** — installs into Claude Code's native extension points; nothing runs outside Claude
 ---
-## From 3.2
-### Milestone Workflow
-Full release management with audit, archive, and git tagging:
-```bash
-/ctx milestone              # Show current milestone status
-/ctx milestone list         # List all milestones
-/ctx milestone audit        # Verify milestone complete
-/ctx milestone complete     # Archive and create git tag
-/ctx milestone new v2.0     # Start next version
-/ctx milestone gaps         # Generate fix phases for failures
-```
-Features:
-- Archive phases to `.ctx/archive/v{version}/`
-- Auto-create git tags on complete
-- Gap analysis for incomplete milestones
-- Integration with Linear/Jira/GitHub
-### Team Collaboration
-Multiple developers working with CTX on same repo:
-```
-[TEAM] Current Activity
-Active:
-  alice: T001 - User authentication (10m)
-  bob: T003 - API endpoints (waiting for T001)
-Available:
-  charlie: idle
-```
-Features:
-- **File locking** - Prevent conflicts during execution
-- **Lock expiry** - Auto-release after 1 hour
-- **Conflict detection** - Fail fast on overlaps
-- **Team notifications** - Slack/Discord on phase complete
-- **Branch strategy** - Auto-create feature branches
+## Quick Start
-### Audit Trail (SOC2/HIPAA Compliance)
-Complete traceability for enterprise:
 ```bash
-/ctx audit                  # Show today's summary
-/ctx audit week             # Weekly summary
-/ctx audit export           # Generate compliance report
-```
-Every action logged:
-```json
-{
-  "timestamp": "2024-01-20T09:30:00Z",
-  "action": "execute",
-  "story": "S001",
-  "agent": "ctx-executor",
-  "model": "claude-sonnet-4",
-  "tokens": {"input": 5000, "output": 2000},
-  "filesModified": ["src/auth/login.ts"],
-  "decisions": ["Use JWT for auth"],
-  "commit": "abc1234"
-}
+npx ctx-cc              # Install globally (~/.claude)
+npx ctx-cc --project    # Install for current project only (.claude/)
+npx ctx-cc --force      # Overwrite existing installation
 ```
-Compliance features:
-- Access logging (who read what)
-- Decision tracking (with rationale)
-- Change log (all modifications)
-- Retention policies (90d daily, 1y weekly)
-- SOC2, HIPAA, GDPR modes
-### Metrics Dashboard
-Understand AI productivity impact:
-```bash
-/ctx metrics                # Overview dashboard
-/ctx metrics cost           # Cost analysis
-/ctx metrics export         # Export HTML dashboard
-```
+Then inside Claude Code:
 ```
-╔══════════════════════════════════════════════════════════╗
-║                  CTX METRICS - Week 3                     ║
-╠══════════════════════════════════════════════════════════╣
-║  Stories/Week:    8  (+2)       Total Cost:    $4.87     ║
-║  Time/Story:      1h 45m        Per Story:     $0.61     ║
-║  Debug Loops:     12            Success Rate:  92%       ║
-║                                                          ║
-║  ROI: 30x  (24 hours saved, $145 net savings)            ║
-╚══════════════════════════════════════════════════════════╝
+/ctx              Smart router — reads state, does the right thing
+/ctx:help         Full command reference
+/ctx:init         Initialize project (PRD + STATE + config)
 ```
-Metrics tracked:
-- Stories completed per day/week
-- Average time per story
-- Token usage by phase/model
-- Cost analysis with ROI
-- Debug loop frequency
-- Verification success rate
-- Human intervention rate
 ---
-## From 3.1
+## Architecture
-### Intelligent Task Parallelization
-Tasks without dependencies run simultaneously:
-```
-Wave 1: [T001, T003] → Parallel (no deps)
-Wave 2: [T002]       → After T001
-Wave 3: [T004]       → After T002
+CTX uses a **thin installer / fat Claude Code** architecture. The CLI's only job is to copy files into the right directories. All orchestration happens inside Claude Code via the Agent tool.
-Result: 40% faster execution
 ```
-### Pre-Commit Review (ctx-reviewer)
-Catches errors BEFORE they're committed:
-- Type errors (TypeScript, Python, Go)
-- Unresolved imports
-- Circular dependencies
-- Security vulnerabilities
-- Empty catch blocks, console.logs
+~/.claude/
+├── agents/       25 subagents     (invoked via Agent tool)
+├── skills/       7 skills         (auto-discovered by Claude from descriptions)
+├── commands/     26 slash commands (/ctx:*)
+├── hooks/        3 hook scripts   (deterministic enforcement)
+└── settings.json hooks registered
 ```
-[CTX] Pre-Commit Review
-  ✅ Types: Pass
-  ✅ Imports: Pass
-  ⚠️  Medium: 2 console.log statements
-  ❌ Critical: SQL injection risk at line 45
-Status: BLOCKED - Fix critical issue
-```
+**Key decisions:**
-### Acceptance Criteria Auto-Generation
-AI suggests comprehensive criteria:
-```
-Story: "Add user authentication"
+- The CLI is installer-only. It does not run, proxy, or wrap Claude Code.
+- Agents are Markdown files with native frontmatter (`model`, `maxTurns`, `description`). Claude reads these directly.
+- Skills are auto-invoked by Claude Code when task descriptions match the skill's `WHEN:` triggers — no commands needed.
+- Hooks are separate `.js` scripts registered in `settings.json`. They run deterministically on every tool call.
+- `plugin.json` enables marketplace distribution.
-Suggested Criteria:
-  ✓ User can register with email/password
-  ✓ Invalid credentials show error
-  ✓ Passwords hashed with bcrypt
-  ✓ Session expires after 24h
-  ✓ Brute force protection enabled
+---
-[A] Accept all  [B] See more  [C] Edit
-```
+## Agents (25)
-### Smart Context Handoff
-Seamless transitions at context limits:
+### Software Development (21)
-| Threshold | Action |
-|-----------|--------|
-| 40% | Prepare handoff notes |
-| 50% | Write HANDOFF.md, warn |
-| 60% | Spawn fresh agent |
+#### Planning
-Zero information loss. Work continues automatically.
+| Agent | Model | maxTurns | Purpose |
+|-------|-------|----------|---------|
+| ctx-planner | opus | 25 | Atomic plans (2–3 tasks), maps tasks to acceptance criteria |
+| ctx-researcher | opus | 25 | ArguSeek web research + semantic code search before planning |
+| ctx-criteria-suggester | sonnet | 25 | Auto-generates acceptance criteria before implementation |
+| ctx-parallelizer | haiku | 15 | Identifies tasks that can run concurrently, saves total time |
+| ctx-predictor | haiku | 15 | Analyzes patterns and suggests next features |
-### Issue Tracker Integration
-Sync with Linear, Jira, or GitHub Issues:
-```bash
-/ctx integrate linear    # Setup Linear
-/ctx integrate jira      # Setup Jira
-/ctx integrate github    # Setup GitHub Issues
-/ctx integrate --sync    # Force sync all stories
-```
+#### Execution
-Features:
-- Bidirectional story sync
-- Status mapping (CTX → tracker)
-- Auto-close on verify pass
-- Comment on verify fail
+| Agent | Model | maxTurns | Purpose |
+|-------|-------|----------|---------|
+| ctx-executor | sonnet | 50 | Implements tasks with git-native commits per completed task |
+| ctx-debugger | sonnet | 75 | Persistent debug loop: observe → hypothesize → test → verify |
----
-## From 3.0
+#### Review
-### Repository Mapping (like Aider)
-```bash
-/ctx map                 # Build token-optimized codebase map
-/ctx map --expand        # Include call graph (8k tokens)
-/ctx map --refresh       # Force full rebuild
-```
+| Agent | Model | maxTurns | Purpose |
+|-------|-------|----------|---------|
+| ctx-reviewer | sonnet | 25 | Pre-commit review: type errors, imports, security, empty catches |
+| ctx-auditor | haiku | 15 | Background audit trail and compliance traceability |
+| ctx-verifier | haiku | 15 | Three-level verification: exists, substantive, wired |
-Creates `REPO-MAP.md` with symbols, dependencies, and navigation hints.
+#### Mapping
-### Discussion Phase (like GSD)
-```bash
-/ctx discuss S001        # Capture decisions BEFORE planning
-/ctx discuss --review    # Review locked decisions
-```
+| Agent | Model | maxTurns | Purpose |
+|-------|-------|----------|---------|
+| ctx-mapper | haiku | 15 | Token-optimized repository map (REPO-MAP.md) |
+| ctx-arch-mapper | haiku | 15 | Architecture patterns, data flow, module structure |
+| ctx-tech-mapper | haiku | 15 | Languages, frameworks, dependencies |
+| ctx-quality-mapper | haiku | 15 | Test coverage, lint status, type safety |
+| ctx-concerns-mapper | haiku | 15 | Security vulnerabilities, tech debt, performance risks |
-Prevents mid-implementation questions by locking decisions in `CONTEXT.md`.
+#### Knowledge & Coordination
-### Model Profiles (Cost Optimization)
-```bash
-/ctx profile             # Show current profile
-/ctx profile quality     # Best models (Opus everywhere)
-/ctx profile balanced    # Smart mix (default)
-/ctx profile budget      # Fast models (60% savings)
-```
+| Agent | Model | maxTurns | Purpose |
+|-------|-------|----------|---------|
+| ctx-discusser | sonnet | 25 | Captures implementation decisions before planning begins |
+| ctx-learner | haiku | 15 | Observes patterns and decisions, builds project memory |
+| ctx-handoff | haiku | 15 | Creates HANDOFF.md at context thresholds, zero info loss |
+| ctx-team-coordinator | haiku | 15 | File locks, parallel work coordination, prevents conflicts |
-| Profile | Research | Execute | Verify | Cost |
-|---------|----------|---------|--------|------|
-| quality | Opus | Opus | Sonnet | 3x |
-| balanced | Opus | Sonnet | Haiku | 1x |
-| budget | Sonnet | Sonnet | Haiku | 0.4x |
+#### Design & QA
-### Git-Native Workflow
-Every completed task auto-commits:
-```
-[CTX] Implement user login endpoint
+| Agent | Model | maxTurns | Purpose |
+|-------|-------|----------|---------|
+| ctx-designer | sonnet | 50 | Brand establishment, component design, Figma MCP integration |
+| ctx-qa | sonnet | 50 | Full system QA: crawls every page, clicks every button |
-Story: S001 - User Authentication
-Criteria: User can log in with credentials
-Files: src/auth/login.ts, src/routes/auth.ts
+### Machine Learning (4)
-Co-Authored-By: Claude <noreply@anthropic.com>
-```
+| Agent | Model | maxTurns | Purpose |
+|-------|-------|----------|---------|
+| ctx-ml-scientist | opus | 75 | Designs experiments, selects models, autonomous hypothesis loop |
+| ctx-ml-engineer | sonnet | 50 | MLOps pipelines, inference envelope, model registry |
+| ctx-ml-analyst | sonnet | 50 | EDA, dataset quality scoring, feature analysis |
+| ctx-ml-reviewer | sonnet | 25 | ML code review: correctness, leakage, statistical validity |
-Configure in `.ctx/config.json`:
-```json
-{
-  "git": {
-    "autoCommit": true,
-    "commitPerTask": true
-  }
-}
-```
+---
-### Persistent Debug Mode
-Scientific debugging with persistent state across sessions:
+## Skills (7)
-```bash
-/ctx debug "login fails"    # Start debugging
-/ctx debug --resume         # Resume after context reset
-/ctx debug --list           # See all sessions
-```
+Skills are auto-discovered. Claude Code reads each skill's `WHEN:` description and invokes it automatically when the task matches — no slash command needed.
-**How it works:**
-```
-1. OBSERVE   → Capture exact error, context, state
-2. RESEARCH  → Search codebase and web for similar issues
-3. HYPOTHESIZE → Form testable theory with confidence level
-4. TEST      → Apply minimal fix
-5. VERIFY    → Build + Tests + Lint + Browser
-6. ITERATE   → Refine hypothesis, max 10 attempts
-```
+### Core
-**Key features:**
-- Sessions survive context resets and days between attempts
-- Browser verification with stored credentials
-- Screenshots saved for each attempt
-- Escalation report if max attempts reached
+| Skill | Auto-invoked when... | Purpose |
+|-------|----------------------|---------|
+| ctx-orchestrator | User asks for pipeline, "ctx next", "ctx auto", or autonomous story execution | Runs the full `init → plan → execute → verify → complete` lifecycle via the Agent tool |
+| ctx-state | Any CTX operation needs to read/write `.ctx/STATE.json` or track phase transitions | Manages persistent state, phase transitions, agent history, and task completion log |
+| ctx-review-gate | Code implemented, story ready to close | Three-stage review: spec compliance → code quality → optional Codex cross-model adversarial review; blocks on failures, soft-skips on Codex infrastructure issues |
-State stored in `.ctx/debug/sessions/`:
-- `STATE.json` - Machine-readable progress
-- `TRACE.md` - Human-readable log
-- `hypotheses.json` - All theories tested
-- `screenshots/` - Visual evidence
+### Design
-### Parallel Codebase Analysis
-```bash
-/ctx map-codebase        # Full analysis with 4 parallel agents
-```
+| Skill | Auto-invoked when... | Purpose |
+|-------|----------------------|---------|
+| ctx-design-system | Design system creation, token management, brand kit updates, token export | Manages W3C DTCG 2025.10 tokens as single source of truth; exports to CSS/SCSS/JS/Tailwind |
+| ctx-visual-qa | Visual QA, design parity, pixel-perfect verification, responsive testing, WCAG 2.2 audit | Numerical measurement-driven QA — every delta is a number, every fix is a specific CSS property |
-Spawns 4 agents simultaneously:
-| Agent | Output | Analyzes |
-|-------|--------|----------|
-| TECH | TECH.md | Languages, frameworks, dependencies |
-| ARCH | ARCH.md | Patterns, data flow, modules |
-| QUALITY | QUALITY.md | Test coverage, lint, type safety |
-| CONCERNS | CONCERNS.md | Security, tech debt, performance |
+### Machine Learning
-Results synthesized into `SUMMARY.md`.
+| Skill | Auto-invoked when... | Purpose |
+|-------|----------------------|---------|
+| ctx-ml-experiment | User wants to run ML experiments, track hypotheses, compare models | Hypothesis tracking, model registry, experiment lifecycle |
+| ctx-ml-pipeline | Production ML deployment, inference, drift monitoring | Inference envelope, circuit breaker, KS drift detection, retraining triggers |
 ---
-## Why CTX?
-| Feature | Aider | GSD | CTX 3.0 |
-|---------|-------|-----|---------|
-| Repository Map | Yes | No | **Yes** |
-| Discussion Phase | No | Yes | **Yes** |
-| Model Profiles | Yes | Partial | **Yes** |
-| Git-Native Commits | Yes | No | **Yes** |
-| Persistent Debug | No | Partial | **Yes** |
-| Parallel Analysis | No | Yes | **Yes** |
-| PRD-Driven | No | Yes | **Yes** |
-| Design System | No | No | **Yes** |
-| Browser Verification | No | No | **Yes** |
+## Commands (26)
-**CTX 3.0 combines the best of Aider and GSD.**
+### Smart
----
-## Commands
-### Smart (Auto-routing)
 | Command | Purpose |
 |---------|---------|
-| `/ctx` | **Smart router** - reads STATE.md, does the right thing |
-| `/ctx init` | Initialize project with STATE.md + PRD.json |
+| `/ctx` | Smart router — reads STATE.json, does the right thing |
 ### Mapping
 | Command | Purpose |
 |---------|---------|
-| `/ctx map` | Build repository map (REPO-MAP.md) |
-| `/ctx map-codebase` | Deep analysis (4 parallel agents) |
+| `/ctx:map` | Build token-optimized repository map (REPO-MAP.md) |
+| `/ctx:map-codebase` | Deep parallel analysis: TECH + ARCH + QUALITY + CONCERNS → SUMMARY |
 ### Discussion
 | Command | Purpose |
 |---------|---------|
-| `/ctx discuss [story]` | Capture decisions before planning |
+| `/ctx:discuss [story]` | Capture implementation decisions before planning; locks them in CONTEXT.md |
 ### Configuration
 | Command | Purpose |
 |---------|---------|
-| `/ctx profile [name]` | Switch model profile (quality/balanced/budget) |
+| `/ctx:profile [name]` | Switch model profile: `quality`, `balanced` (default), `budget` |
+### Inspect
-### Inspect (Read-only)
 | Command | Purpose |
 |---------|---------|
-| `/ctx status` | See current state without triggering action |
+| `/ctx:status` | Show current state without triggering any action |
+### Control
-### Control (Override)
 | Command | Purpose |
 |---------|---------|
-| `/ctx plan [goal]` | Force research + planning |
-| `/ctx verify` | Force three-level verification |
-| `/ctx quick "task"` | Quick task bypass |
+| `/ctx:init` | Initialize project: PRD.json + STATE.json + config |
+| `/ctx:plan [goal]` | Force research + planning phase |
+| `/ctx:verify` | Force three-level verification |
+| `/ctx:quick "task"` | Quick task bypass (skips full lifecycle) |
 ### Debug
 | Command | Purpose |
 |---------|---------|
-| `/ctx debug` | Start debugging current issue |
-| `/ctx debug "issue"` | Debug specific problem |
-| `/ctx debug --resume` | Resume last debug session |
-| `/ctx debug --list` | List all debug sessions |
-| `/ctx debug --status` | Show current session status |
+| `/ctx:debug` | Start debugging current issue |
+| `/ctx:debug "issue"` | Debug specific problem |
+| `/ctx:debug --resume` | Resume last debug session |
+| `/ctx:debug --list` | List all debug sessions |
+| `/ctx:debug --status` | Show current session status |
+### Design
-### QA (Full System Testing)
 | Command | Purpose |
 |---------|---------|
-| `/ctx qa` | Full system QA - WCAG 2.1 AA, every page, every button |
-| `/ctx qa --section "auth"` | QA specific section only |
-| `/ctx qa --a11y-only` | Accessibility audit only |
-| `/ctx qa --visual-only` | Visual regression (mobile/tablet/desktop) |
-| `/ctx qa --resume` | Resume interrupted QA session |
-| `/ctx qa --report` | Show last QA report |
+| `/ctx:brand` | Brand establishment: mood board → 3 options → BRAND_KIT.md |
+| `/ctx:design` | Component design: research → 3 options → prototype → implement |
+### QA
+| Command | Purpose |
+|---------|---------|
+| `/ctx:qa` | Full system QA — WCAG 2.1 AA, every page, every interaction |
+| `/ctx:qa --a11y-only` | Accessibility audit only |
+| `/ctx:qa --visual-only` | Visual regression across mobile/tablet/desktop |
+| `/ctx:qa --resume` | Resume interrupted QA session |
+| `/ctx:visual-qa` | Measurement-driven design parity check |
+### ML
+| Command | Purpose |
+|---------|---------|
+| `/ctx:experiment` | Start ML experiment loop |
+| `/ctx:train` | Trigger training pipeline |
+| `/ctx:ml-status` | Show experiment registry and model status |
 ### Session
 | Command | Purpose |
 |---------|---------|
-| `/ctx pause` | Checkpoint for session resume |
+| `/ctx:pause` | Checkpoint state for session resume |
+### Phase
-### Phase Management
 | Command | Purpose |
 |---------|---------|
-| `/ctx phase list` | Show all phases |
-| `/ctx phase add "goal"` | Add new phase |
-| `/ctx phase next` | Complete current, move to next |
+| `/ctx:phase list` | Show all phases and their status |
+| `/ctx:phase add "goal"` | Add a new phase |
+| `/ctx:phase next` | Complete current phase, advance to next |
 ### Integration
 | Command | Purpose |
 |---------|---------|
-| `/ctx integrate` | Show integration status |
-| `/ctx integrate linear` | Setup Linear |
-| `/ctx integrate jira` | Setup Jira |
-| `/ctx integrate github` | Setup GitHub Issues |
-| `/ctx integrate --sync` | Sync all stories |
+| `/ctx:integrate` | Show integration status |
+| `/ctx:integrate linear` | Set up Linear sync |
+| `/ctx:integrate jira` | Set up Jira sync |
+| `/ctx:integrate github` | Set up GitHub Issues sync |
+| `/ctx:integrate --sync` | Sync all stories with connected tracker |
 ### Milestone
+| Command | Purpose |
+|---------|---------|
+| `/ctx:milestone` | Show current milestone |
+| `/ctx:milestone list` | List all milestones |
+| `/ctx:milestone audit` | Verify milestone completion |
+| `/ctx:milestone complete` | Archive and tag release |
+| `/ctx:milestone new [name]` | Start next version |
+| `/ctx:milestone gaps` | Generate fix phases for gaps |
+### Metrics
 | Command | Purpose |
 |---------|---------|
-| `/ctx milestone` | Show current milestone |
-| `/ctx milestone list` | List all milestones |
-| `/ctx milestone audit` | Verify completion |
-| `/ctx milestone complete` | Archive and tag |
-| `/ctx milestone new [name]` | Start next version |
-| `/ctx milestone gaps` | Generate fix phases |
-### Metrics & Audit
+| `/ctx:metrics` | Productivity dashboard |
+| `/ctx:metrics cost` | Cost analysis by model/profile |
+| `/ctx:metrics export` | Export HTML dashboard |
+### Learning
 | Command | Purpose |
 |---------|---------|
-| `/ctx metrics` | Show productivity dashboard |
-| `/ctx metrics cost` | Cost analysis |
-| `/ctx metrics export` | Export HTML dashboard |
-| `/ctx audit` | Show audit summary |
-| `/ctx audit export` | Generate compliance report |
+| `/ctx:learn` | Show what CTX has learned about your project |
+| `/ctx:learn patterns` | Show detected code patterns |
+| `/ctx:learn decisions` | Show architectural decisions log |
+| `/ctx:predict` | Get AI-suggested next features |
+| `/ctx:predict --quick` | Quick wins only |
+### Monitoring
-### Learning & Prediction
 | Command | Purpose |
 |---------|---------|
-| `/ctx learn` | Show what CTX has learned |
-| `/ctx learn patterns` | Show code patterns |
-| `/ctx learn decisions` | Show architectural decisions |
-| `/ctx learn forget [id]` | Remove a learned pattern |
-| `/ctx predict` | Get feature suggestions |
-| `/ctx predict --quick` | Quick wins only |
-| `/ctx predict --create [id]` | Create story from suggestion |
-### Monitoring & Voice
+| `/ctx:monitor` | Show monitoring status |
+| `/ctx:monitor connect sentry` | Connect Sentry error tracking |
+| `/ctx:monitor errors` | List recent production errors |
+| `/ctx:monitor auto-fix [id]` | Auto-fix error with PR |
+| `/ctx:monitor --watch` | Continuous monitoring mode |
+### Voice
 | Command | Purpose |
 |---------|---------|
-| `/ctx monitor` | Show monitoring status |
-| `/ctx monitor connect sentry` | Connect Sentry |
-| `/ctx monitor errors` | List recent errors |
-| `/ctx monitor auto-fix [id]` | Auto-fix with PR |
-| `/ctx monitor --watch` | Continuous monitoring |
-| `/ctx voice` | Start voice input |
-| `/ctx voice --continuous` | Always listening mode |
-| `/ctx voice --dictate` | Long-form dictation |
+| `/ctx:voice` | Start voice input |
+| `/ctx:voice --continuous` | Always-listening mode |
+| `/ctx:voice --dictate` | Long-form dictation |
 ---
-## State Machine
+## Hooks (3)
-```
-initializing → discussing → executing → verifying → COMPLETE
-                    ↑            ↓
-                    └── debugging ──┘
-```
+Hooks are deterministic Node.js scripts registered in `settings.json`. Each runs synchronously on its trigger event (before or after a tool call, or when a subagent finishes), independent of Claude's reasoning.
-| State | What happens |
-|-------|--------------|
-| initializing | Research + Map + Plan |
-| discussing | Capture decisions in CONTEXT.md |
-| executing | Execute with git-native commits |
-| debugging | Persistent debug loop (max 10 attempts) |
-| verifying | Three-level verification |
-| paused | Resume from checkpoint |
+| Hook | File | Trigger | Behavior |
+|------|------|---------|----------|
+| pre-tool-use | `hooks/pre-tool-use.js` | Before any tool executes | TDD enforcement + capability restrictions. Exit 2 blocks the tool call. |
+| post-tool-use | `hooks/post-tool-use.js` | After any tool executes | Logs file modifications to audit trail in `.ctx/audit.log` |
+| subagent-stop | `hooks/subagent-stop.js` | When a subagent finishes | Records agent completion in `.ctx/STATE.json` |
+**Configure hook behavior:**
+```bash
+npx ctx-cc config set hooks.tddMode strict    # Block writes without tests
+npx ctx-cc config set hooks.tddMode warn      # Warn but allow
+npx ctx-cc config set hooks.tddMode off       # Disabled
+```
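To make the `tddMode` settings and the "exit 2 blocks the tool call" rule above concrete, here is a minimal sketch of the decision a pre-tool-use hook might make. It is not the contents of `hooks/pre-tool-use.js`: the function `decideExit`, the `hasMatchingTest` flag, the list of write tools, the `.test.js` naming rule, and the event shape (`tool_name`, `tool_input.file_path`) are all assumptions made for illustration.

```javascript
// Sketch only: every name below is hypothetical and not taken from the package.
const EXIT_ALLOW = 0;
const EXIT_BLOCK = 2; // per the hooks table, exit code 2 blocks the tool call

// Assumed set of tools that write files.
const WRITE_TOOLS = new Set(['Write', 'Edit']);

// Assumed naming rule: test files end in .test.js (or .test.mjs / .test.cjs).
function isTestFile(filePath) {
  return /\.test\.[cm]?js$/.test(filePath);
}

// Returns the exit code and an optional message for one tool-call event.
// tddMode is one of 'strict', 'warn', 'off'.
function decideExit(event, tddMode, hasMatchingTest) {
  const filePath = (event.tool_input && event.tool_input.file_path) || '';
  const isSourceWrite =
    WRITE_TOOLS.has(event.tool_name) && filePath !== '' && !isTestFile(filePath);

  if (tddMode === 'off' || !isSourceWrite || hasMatchingTest) {
    return { code: EXIT_ALLOW, message: '' };
  }
  const message = `No test found for ${filePath}`;
  // strict blocks the write; warn lets it through but keeps the message.
  return tddMode === 'strict'
    ? { code: EXIT_BLOCK, message }
    : { code: EXIT_ALLOW, message };
}
```

A real hook would read the event from its input, look up whether a test exists, and end with `process.exit(result.code)`. Keeping the decision in a pure function makes the three modes easy to unit-test.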
 ---
-## Context Management
+## Design Workflow
-CTX actively manages context budget:
+CTX implements an agency-grade design process with mandatory approval gates.
-| Usage | Quality | Action |
-|-------|---------|--------|
-| 0-30% | Peak | Continue |
-| 30-40% | Good | Continue |
-| 40-50% | Good | Prepare handoff notes |
-| 50-60% | Degrading | Auto-checkpoint |
-| 60-70% | Degrading | Create HANDOFF.md |
-| 70%+ | Poor | Force checkpoint |
+**Phase 1 — Brand**
+```
+Research → Mood board → 3 direction options → User picks → BRAND_KIT.md
+```
+BRAND_KIT.md becomes the constraint for all subsequent design work. Colors, typography, and spacing flow from tokens only.
-Smart handoff creates `HANDOFF.md` with:
-- Completed tasks with commit hashes
-- Current task progress
-- Key decisions made
-- Files modified
-- Next steps
+**Phase 2 — Component Design**
+```
+Research → 3 options (A/B/C) → User approves direction → Prototype → Implement
+```
+Never a single design. Options are always presented before implementation.
----
+**Phase 3 — Visual QA**
+Every design change triggers numerical verification:
+- Measure rendered values vs design spec (px, rem, hex)
+- Report deltas as numbers, not subjective descriptions
+- Fixes are specific: `change font-size from 14px to 16px`
+**Compliance:** WCAG 2.2 AA + EAA 2025. Touch targets, contrast ratios, keyboard navigation, and screen reader labels are all measured.
-## 21 Specialized Agents
-| Agent | Spawned when | Model (balanced) |
-|-------|--------------|------------------|
-| ctx-mapper | /ctx map | haiku |
-| ctx-tech-mapper | /ctx map-codebase | haiku |
-| ctx-arch-mapper | /ctx map-codebase | haiku |
-| ctx-quality-mapper | /ctx map-codebase | haiku |
-| ctx-concerns-mapper | /ctx map-codebase | haiku |
-| ctx-discusser | status = discussing | sonnet |
-| ctx-researcher | status = initializing | opus |
-| ctx-planner | after research | opus |
-| ctx-executor | status = executing | sonnet |
-| ctx-designer | design stories | sonnet |
-| ctx-debugger | status = debugging | sonnet |
-| ctx-verifier | status = verifying | haiku |
-| ctx-parallelizer | before execution | haiku |
-| ctx-reviewer | before commit | sonnet |
-| ctx-criteria-suggester | during init/discuss | sonnet |
-| ctx-handoff | at context thresholds | haiku |
-| ctx-team-coordinator | team mode | sonnet |
-| ctx-auditor | always (background) | haiku |
-| ctx-learner | observing patterns | haiku |
-| ctx-predictor | after milestone/on demand | sonnet |
-| ctx-qa | /ctx qa (full system test) | sonnet |
+**Key tools:** Figma MCP (token sync + component metadata), Gemini design analysis, W3C DTCG 2025.10 token format.
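As an illustration of the "report deltas as numbers" rule under Visual QA, the sketch below computes a WCAG contrast ratio from two hex colors and reports how far it is from the AA minimum for normal-size text (4.5:1). The luminance and ratio formulas are the standard WCAG definitions. The function names and the shape of the report object are assumptions, not CTX's actual implementation.

```javascript
// Sketch only: function names and the report shape are hypothetical.

// Convert one 8-bit sRGB channel to linear light (sRGB transfer function).
function channelToLinear(c8) {
  const c = c8 / 255;
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a "#rrggbb" color, per the WCAG definition.
function relativeLuminance(hex) {
  const n = parseInt(hex.replace('#', ''), 16);
  const r = channelToLinear((n >> 16) & 0xff);
  const g = channelToLinear((n >> 8) & 0xff);
  const b = channelToLinear(n & 0xff);
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio: (lighter + 0.05) / (darker + 0.05), ranging from 1 to 21.
function contrastRatio(fg, bg) {
  const l1 = relativeLuminance(fg);
  const l2 = relativeLuminance(bg);
  return (Math.max(l1, l2) + 0.05) / (Math.min(l1, l2) + 0.05);
}

// Report the measured ratio and its numeric distance from the required minimum.
function checkContrast(fg, bg, minRatio = 4.5) {
  const ratio = contrastRatio(fg, bg);
  return { ratio, minRatio, delta: ratio - minRatio, pass: ratio >= minRatio };
}
```

For example, `checkContrast('#777777', '#ffffff')` yields a ratio just under 4.5 and a small negative `delta`, which is the kind of specific, numeric finding the section describes.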
 ---
-## Directory Structure
+## ML Workflow
+CTX implements an experiment-driven ML development loop.
+**Phase 1 — Data Analysis**
 ```
-.ctx/
-├── config.json       # Model profiles, git settings
-├── STATE.md          # Living digest - execution state
-├── PRD.json          # Requirements contract
-├── REPO-MAP.md       # Token-optimized codebase map
-├── REPO-MAP.json     # Structured map data
-├── .env              # Test credentials (GITIGNORED)
-├── codebase/         # Deep analysis results
-│   ├── TECH.md
-│   ├── ARCH.md
-│   ├── QUALITY.md
-│   ├── CONCERNS.md
-│   └── SUMMARY.md
-├── phases/{story_id}/
-│   ├── CONTEXT.md    # Locked decisions (discussion phase)
-│   ├── RESEARCH.md   # ArguSeek results
-│   ├── PLAN.md       # Tasks mapped to criteria
-│   └── VERIFY.md     # Verification report
-├── debug/
-│   ├── sessions/     # Persistent debug state
-│   └── screenshots/  # Visual proof
-├── checkpoints/      # Auto-checkpoints
-└── memory/           # Decision memory
+Load data → EDA → Quality scoring → Feature correlation → Pandera validation schema
 ```
+**Phase 2 — Experiment Loop**
+```
+Hypothesize → Design experiment → Run → Analyze → Register result → Iterate
+```
+All hypotheses and results are tracked in `.ctx/ml/experiments/`. The model registry stores every trained artifact with metadata.
+**Phase 3 — Model Evaluation**
+- Conformal prediction intervals (MAPIE)
+- Statistical significance testing
+- Calibration curves and reliability diagrams
+**Phase 4 — Production Pipeline**
+- Model registry with version pinning
+- Inference envelope with latency SLA
+- Circuit breaker (auto-disable on error spike)
+- KS drift detection with configurable thresholds
+- Retraining triggers on drift
+**Proven patterns:** XGBoost + MAPIE conformal prediction, T-learner causal inference, KS drift, Pandera schema validation — from Digital Twin production workflows.
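The KS drift check listed under the production pipeline can be sketched as a two-sample Kolmogorov-Smirnov statistic: the largest gap between the empirical CDFs of a reference sample and a current sample. The code below is a generic illustration in JavaScript, not the package's pipeline code. The function names and the 0.2 default threshold are assumptions.

```javascript
// Sketch only: names and the default threshold are hypothetical.

// Two-sample KS statistic: max |F_ref(x) - F_cur(x)| over all observed x.
function ksStatistic(reference, current) {
  const a = [...reference].sort((x, y) => x - y);
  const b = [...current].sort((x, y) => x - y);
  let i = 0;
  let j = 0;
  let d = 0;
  while (i < a.length && j < b.length) {
    // Step both empirical CDFs past the next smallest value (handles ties).
    const v = Math.min(a[i], b[j]);
    while (i < a.length && a[i] <= v) i++;
    while (j < b.length && b[j] <= v) j++;
    d = Math.max(d, Math.abs(i / a.length - j / b.length));
  }
  return d;
}

// Flag drift when the statistic exceeds a configurable threshold.
function detectDrift(reference, current, threshold = 0.2) {
  const statistic = ksStatistic(reference, current);
  return { statistic, threshold, drifted: statistic > threshold };
}
```

A production check would usually compare against a p-value or a sample-size-aware critical value rather than a fixed cutoff. The fixed threshold is used here only to keep the sketch short.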
 ---
 ## Configuration
-`.ctx/config.json`:
-```json
-{
-  "activeProfile": "balanced",
-  "models": {
-    "architect": { "id": "claude-opus-4", "costTier": "high" },
-    "default": { "id": "claude-sonnet-4", "costTier": "medium" },
-    "fast": { "id": "claude-haiku-4", "costTier": "low" }
-  },
-  "profiles": {
-    "quality": {
-      "research": "architect",
-      "discussion": "architect",
-      "planning": "architect",
-      "execution": "architect"
-    },
-    "balanced": {
-      "research": "architect",
-      "discussion": "default",
-      "planning": "architect",
-      "execution": "default"
-    },
-    "budget": {
-      "research": "default",
-      "planning": "default",
-      "execution": "default"
-    }
-  },
-  "git": {
-    "autoCommit": true,
-    "commitPerTask": true
-  }
-}
+```bash
+npx ctx-cc config list                       # Show all config values
+npx ctx-cc config get activeProfile          # Get a specific value
+npx ctx-cc config set hooks.tddMode strict   # Set a value
 ```
----
+**Model profiles:**
-## Integrations
+| Profile | Research | Planning | Execution | Verify | Relative Cost |
+|---------|----------|----------|-----------|--------|---------------|
+| quality | opus | opus | opus | sonnet | ~3x |
+| balanced | opus | opus | sonnet | haiku | 1x (default) |
+| budget | sonnet | sonnet | sonnet | haiku | ~0.4x |
-### ArguSeek (Web Research)
-Auto-runs during planning for best practices, security, and patterns.
+Switch profiles without reinstalling:
+```
+/ctx:profile quality
+/ctx:profile balanced
+/ctx:profile budget
+```
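The model profiles table can be read as a lookup from profile and stage to model tier. The sketch below encodes the table exactly as shown. The object layout, the stage keys, and the `resolveModel` function are assumptions for illustration and may not match how `config.json` stores profiles.

```javascript
// Sketch only: data layout and function name are hypothetical.
// Values are copied from the model profiles table.
const PROFILES = {
  quality:  { research: 'opus',   planning: 'opus',   execution: 'opus',   verify: 'sonnet' },
  balanced: { research: 'opus',   planning: 'opus',   execution: 'sonnet', verify: 'haiku' },
  budget:   { research: 'sonnet', planning: 'sonnet', execution: 'sonnet', verify: 'haiku' },
};

// Resolve the model tier for a stage, failing loudly on unknown input.
function resolveModel(profile, stage) {
  const stages = Object.hasOwn(PROFILES, profile) ? PROFILES[profile] : undefined;
  if (!stages) throw new Error(`Unknown profile: ${profile}`);
  if (!Object.hasOwn(stages, stage)) throw new Error(`Unknown stage: ${stage}`);
  return stages[stage];
}
```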
-### ChunkHound (Semantic Code Search)
-Auto-runs during planning for semantic search and pattern detection.
-```bash
-uv tool install chunkhound
+---
+## Phase Lifecycle
+```
+init → plan → execute → verify → complete
+                ↑          ↓
+                ←── (fix failures)
 ```
-### Browser Verification (Playwright/Chrome DevTools)
-Auto-runs during debugging and verification for visual proof.
+State is persisted in `.ctx/STATE.json` after every transition. The `ctx-state` skill manages reads and writes. The `ctx-orchestrator` skill drives transitions.
-### Figma MCP (Design Context)
-Auto-runs during design stories for tokens and component metadata.
+| Phase | What happens |
+|-------|--------------|
+| init | Research + repo map + PRD validation |
+| plan | Acceptance criteria + atomic task plan (2–3 tasks) |
+| execute | Implementation with per-task git commits |
+| verify | Three-level check: exists → substantive → wired |
+| complete | Review gate passed, story archived |
-### Gemini Design MCP (Visual Generation)
-Auto-runs during design stories for mockups and UI code.
+If verification fails, state returns to `execute` automatically. The fix-loop runs until all three verification levels pass.
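The lifecycle above, including the automatic return from `verify` to `execute` on failure, can be sketched as a small transition table. This is an illustration only: the event names (`done`, `pass`, `fail`) and the `nextPhase` function are assumptions, and the real `ctx-state` and `ctx-orchestrator` skills may model transitions differently.

```javascript
// Sketch only: event names and function name are hypothetical.
const TRANSITIONS = {
  init:     { done: 'plan' },
  plan:     { done: 'execute' },
  execute:  { done: 'verify' },
  verify:   { pass: 'complete', fail: 'execute' }, // failed verification re-enters the fix loop
  complete: {},
};

// Return the next phase, rejecting any transition not listed in the table.
function nextPhase(phase, event) {
  const allowed = Object.hasOwn(TRANSITIONS, phase) ? TRANSITIONS[phase] : {};
  if (!Object.hasOwn(allowed, event)) {
    throw new Error(`Invalid transition: ${phase} on "${event}"`);
  }
  return allowed[event];
}
```

Rejecting unlisted transitions, rather than ignoring them, means a corrupted or hand-edited state file is reported immediately instead of silently skipping a phase.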
 ---
-## Key Principles
+## Plugin Manifest
-### 95% Auto-Deviation Handling
+CTX ships with `plugin.json` for Claude Code marketplace distribution. Future marketplace installs will use:
-| Trigger | Action |
-|---------|--------|
-| Bug in existing code | Auto-fix, document in commit |
-| Missing validation | Auto-add, document |
-| Blocking issue | Auto-fix, document |
-| Architecture decision | **Ask user** |
+```bash
+/plugin install ctx@my-marketplace
+```
-### Three-Level Verification
+The manifest declares all agents, skills, commands, and hooks so the marketplace can display capabilities, manage versions, and handle updates without reinstalling.
-| Level | Question | Check |
-|-------|----------|-------|
-| Exists | File on disk? | Glob |
-| Substantive | Real code, not stub? | No TODOs, no placeholders |
-| Wired | Imported and used? | Trace imports |
+---
-### Atomic Planning
+## CLI Reference
-Plans limited to 2-3 tasks to prevent context degradation.
+```bash
+npx ctx-cc [options]           Install CTX into Claude Code
+npx ctx-cc list                List all 25 agents with model/maxTurns
+npx ctx-cc skills              Analyze skill descriptions and triggers
+npx ctx-cc config list         Show full configuration
+npx ctx-cc config get <key>    Get a config value
+npx ctx-cc config set <k> <v>  Set a config value
+npx ctx-cc --help              Show help
+Options:
+  --global, -g     Install to ~/.claude (default)
+  --project, -p    Install to .claude in current directory
+  --force, -f      Overwrite existing installation
+```
 ---
-## Updating
+## Development
 ```bash
-npx ctx-cc --force
+git clone https://github.com/jufjuf/CTX.git
+cd CTX
+npm test    # 264 tests, node:test runner
+```
+**Project structure:**
 ```
+ctx-cc/
+├── agents/          25 agent definitions (.md with frontmatter)
+├── skills/          7 skill directories (each contains SKILL.md)
+├── commands/        26 slash command definitions (.md)
+├── hooks/           3 enforcement hook scripts (.js)
+├── src/             17 source modules (.js)
+├── test/            19 test files (.test.js)
+├── templates/       config.json, PRD.json, state templates
+├── bin/ctx.js       CLI entry point (installer only)
+├── plugin.json      Marketplace manifest
+└── package.json     Zero runtime dependencies
+```
+---
+## Testing
+```bash
+npm test
+# 264 tests, 0 failures, ~2s
+```
+**Coverage:**
+| Area | What is tested |
+|------|----------------|
+| Agent discovery | Frontmatter parsing, model/maxTurns validation |
+| State machine | Phase transitions, invalid transition rejection |
+| Pipelines | Orchestrator flow, review gate stages |
+| Worktrees | Parallel execution isolation |
+| Hooks | TDD enforcement, audit logging, subagent tracking |
+| Capabilities | Restriction rules, exit codes |
+| Context profiles | Model resolution per profile |
+| Skills format | SKILL.md structure, description format |
+| Design compliance | Token format, BRAND_KIT schema |
+| ML compliance | Experiment schema, pipeline config |
+| CLI commands | list, skills, config get/set |
+| Integration | End-to-end install + verify |
 ---
@@ -849,8 +548,8 @@ MIT
 <div align="center">
-**[GitHub](https://github.com/jufjuf/CTX)** · **[Issues](https://github.com/jufjuf/CTX/issues)** · **[npm](https://www.npmjs.com/package/ctx-cc)**
+**[GitHub](https://github.com/jufjuf/CTX)** · **[npm](https://www.npmjs.com/package/ctx-cc)** · **[Issues](https://github.com/jufjuf/CTX/issues)**
-*CTX 3.3 - Conversational-first. Just describe what you want. 21 agents. Learning system. Self-healing.*
+CTX 4.0 — 25 agents · 7 skills · 3 hooks · zero dependencies
 </div>