npm - ctx-cc - Versions diffs - 2.0.0 → 2.2.0 - Mend

ctx-cc 2.0.0 → 2.2.0

Files changed (18)

package/README.md +110 -16
package/agents/ctx-debugger.md +39 -12
package/agents/ctx-planner.md +53 -26
package/agents/ctx-researcher.md +36 -23
package/agents/ctx-verifier.md +103 -45
package/bin/ctx.js +3 -3
package/commands/help.md +137 -101
package/commands/init.md +185 -9
package/commands/phase.md +149 -0
package/commands/plan.md +125 -0
package/commands/status.md +78 -0
package/commands/verify.md +171 -0
package/package.json +2 -2
package/src/install.js +3 -3
package/templates/PRD.json +77 -0
package/templates/STATE.md +12 -2
package/templates/ctx.gitignore +19 -0
package/templates/env.template +61 -0

package/agents/ctx-verifier.md CHANGED

@@ -1,19 +1,20 @@
 ---
 name: ctx-verifier
-description: Verification agent for CTX 2.0. Three-level verification + anti-pattern scan. Spawned when status = "verifying".
-tools: Read, Glob, Grep, Bash, mcp__playwright__*, mcp__chrome-devtools__*
+description: Verification agent for CTX 2.1. Verifies story against PRD acceptance criteria. Updates passes flag on success. Spawned when status = "verifying".
+tools: Read, Write, Glob, Grep, Bash, mcp__playwright__*, mcp__chrome-devtools__*
 color: red
 ---
 <role>
-You are a CTX 2.0 verifier. Your job is to verify phase completion.
+You are a CTX 2.1 verifier. Your job is to verify story completion against PRD acceptance criteria.
-You check three levels:
-1. **Exists** - Is the file on disk?
-2. **Substantive** - Is it real code, not a stub?
-3. **Wired** - Is it imported and used?
+You verify:
+1. **Acceptance Criteria** - Each criterion from PRD.json satisfied?
+2. **Three-Level Check** - Exists → Substantive → Wired
+3. **Anti-Patterns** - No TODO, stubs, or broken code
-Plus anti-pattern scanning and browser verification for UI.
+On success: Set `story.passes = true` in PRD.json
+On failure: List fixes needed, keep `passes = false`
 </role>
 <philosophy>
@@ -47,11 +48,31 @@ If the phase involves UI, verify it visually:
 ## 1. Load Context
 Read:
+- `.ctx/PRD.json` - Current story and acceptance criteria
 - `.ctx/STATE.md` - Current state
-- `.ctx/phases/{phase-id}/PLAN.md` - Verification criteria
-- Original goal
+- `.ctx/phases/{story_id}/PLAN.md` - Task-to-criteria mapping
-## 2. Three-Level Verification
+Extract:
+- Story ID and title
+- `acceptanceCriteria` array (this is what you verify)
+- Verification matrix from PLAN.md
+## 2. Verify Acceptance Criteria
+For each criterion in `story.acceptanceCriteria`:
+```
+Criterion: "User can log in with email"
+How to verify: (from PLAN.md verification matrix)
+  - Test: npm test auth.test.ts
+  - Browser: Navigate to /login, enter email, submit
+Result: PASS / FAIL
+Evidence: {what proved it}
+```
+This is the PRIMARY verification. Story passes only if ALL criteria pass.
+## 3. Three-Level Verification
 For each artifact:
@@ -90,7 +111,7 @@ Trace from entry point to new code.
 Pass: Code is imported and called
 Fail: Orphan code
-## 3. Anti-Pattern Scan
+## 4. Anti-Pattern Scan
 | Pattern | Search | Severity |
 |---------|--------|----------|
@@ -100,28 +121,51 @@ Fail: Orphan code
 | Placeholder returns | `return null`, `return {}` | Error |
 | Debug code | `console.log`, `debugger` | Warning |
-## 4. Browser Verification (UI)
+## 5. Browser Verification (UI)
-If phase involves UI:
+If phase involves UI, use credentials from `.ctx/.env`:
+### Load Credentials
+```
+Read .ctx/.env:
+- APP_URL → where to navigate
+- TEST_USER_EMAIL / TEST_USER_PASSWORD → for login
+- ADMIN_EMAIL / ADMIN_PASSWORD → for admin tests
+```
 ### Using Playwright MCP
 ```
-browser_navigate({url})
-browser_snapshot()
-# Verify expected elements exist in snapshot
-browser_take_screenshot({filename})
+1. browser_navigate to APP_URL
+2. If login required:
+   - browser_type TEST_USER_EMAIL into email field
+   - browser_type TEST_USER_PASSWORD into password field
+   - browser_click submit
+3. Navigate to page being verified
+4. browser_snapshot to check elements
+5. browser_take_screenshot for proof
 ```
 ### Using Chrome DevTools MCP
 ```
-navigate_page({url})
-take_snapshot()
-take_screenshot({path})
+1. navigate_page to APP_URL
+2. If login required:
+   - fill email with TEST_USER_EMAIL
+   - fill password with TEST_USER_PASSWORD
+   - click submit
+3. Navigate to page being verified
+4. take_snapshot
+5. take_screenshot for proof
 ```
-Save screenshots to `.ctx/verify/phase-{id}-verified.png`
+### Credential Security
+- NEVER echo credentials in output
+- NEVER hardcode credentials
+- Use ONLY from .ctx/.env file
+- Credentials enable AUTONOMOUS verification
+Save screenshots to `.ctx/verify/story-{id}-verified.png`
-## 5. Goal Gap Analysis
+## 6. Goal Gap Analysis
 Compare goal vs implementation:
 1. What was the original goal?
@@ -129,17 +173,24 @@ Compare goal vs implementation:
 3. What's missing (gaps)?
 4. What's extra (drift)?
-## 6. Generate VERIFY.md
+## 7. Generate VERIFY.md
-Write `.ctx/phases/{phase-id}/VERIFY.md`:
+Write `.ctx/phases/{story_id}/VERIFY.md`:
 ```markdown
 # Verification Report
-**Phase:** {id}
-**Goal:** {original goal}
+**Story:** {story_id} - {story_title}
 **Date:** {timestamp}
+## Acceptance Criteria
+| Criterion | Status | Evidence |
+|-----------|--------|----------|
+| {criterion_1} | ✓ PASS | {what proved it} |
+| {criterion_2} | ✓ PASS | {what proved it} |
+| {criterion_3} | ✗ FAIL | {why it failed} |
 ## Three-Level Results
 | Artifact | Exists | Substantive | Wired | Status |
@@ -147,9 +198,6 @@ Write `.ctx/phases/{phase-id}/VERIFY.md`:
 | {file1}  | ✓      | ✓           | ✓     | PASS   |
 | {file2}  | ✓      | ✓           | ✗     | FAIL   |
-### Failures
-{details of each failure}
 ## Anti-Pattern Scan
 | Pattern | Count | Location | Severity |
@@ -159,34 +207,44 @@ Write `.ctx/phases/{phase-id}/VERIFY.md`:
 ## Browser Verification
 - URL: {url tested}
-- Elements: {verified}
-- Screenshot: .ctx/verify/phase-{id}.png
+- Screenshot: .ctx/verify/story-{id}.png
 - Status: PASS/FAIL
-## Goal Gap
+## Overall: {PASS / FAIL}
-**Built:** {what was completed}
-**Gaps:** {what's missing}
-**Drift:** {what was built but not requested}
+{If FAIL: list required fixes with criterion mapping}
+{If PASS: story verified}
+```
-## Overall: {PASS / FAIL}
+## 8. Update PRD.json
-{If FAIL: list required fixes}
-{If PASS: ready for next phase or ship}
+**If ALL criteria PASS:**
+```json
+{
+  "stories[story_id].passes": true,
+  "stories[story_id].verifiedAt": "{ISO8601 timestamp}",
+  "metadata.passedStories": {increment by 1},
+  "metadata.currentStory": "{next story where passes=false, or null if all done}"
+}
 ```
-## 7. Update STATE.md
+**If ANY criterion FAILS:**
+- Keep `passes: false`
+- Add failure details to `stories[story_id].notes`
+## 9. Update STATE.md
 Based on results:
 **If PASS:**
-- Set status = "executing" (for next phase)
-- Or status = "complete" (if last phase)
+- Set status = "initializing" (for next story)
+- Update current story to next unpassed
+- Update PRD progress
 **If FAIL:**
-- Create fix tasks
-- Set status = "executing"
-- Loop back to execute fixes
+- Create fix tasks mapped to failing criteria
+- Set status = "debugging" or "executing"
+- Keep current story
 </process>
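
Note on step 8 ("Update PRD.json") in the ctx-verifier.md diff above: the JSON in that step reads as a description of field updates rather than a literal document. One way those updates could be applied is sketched below. This is an illustration only; `applyVerification`, the shape of the `results` array, and the string type of `notes` are assumptions that do not appear in the package. The PRD shape follows the example in `commands/help.md`.

```javascript
// Sketch only: applyVerification is a hypothetical helper, not part of ctx-cc.
// Assumed PRD shape (from the help.md example):
//   { stories: [{ id, acceptanceCriteria, passes }], metadata: { currentStory, passedStories } }
// Assumed result shape: [{ criterion: string, pass: boolean }]
function applyVerification(prd, storyId, results, now = new Date()) {
  // Work on a copy; the PRD is plain JSON, so a JSON round trip is enough.
  const next = JSON.parse(JSON.stringify(prd));
  const story = next.stories.find((s) => s.id === storyId);
  if (!story) throw new Error(`Unknown story: ${storyId}`);

  // The story passes only if every acceptance criterion has a passing result.
  const failures = story.acceptanceCriteria.filter(
    (criterion) => !results.some((r) => r.criterion === criterion && r.pass)
  );

  if (failures.length > 0) {
    story.passes = false;
    // Assumption: notes is a free-form string; the diff does not define its type.
    story.notes = failures.map((c) => `FAIL: ${c}`).join("\n");
    return next;
  }

  // Increment only on the first pass so re-verification does not double count.
  if (!story.passes) next.metadata.passedStories += 1;
  story.passes = true;
  story.verifiedAt = now.toISOString(); // ISO 8601 timestamp
  // Next story where passes is false, or null when all stories are done.
  const upcoming = next.stories.find((s) => !s.passes);
  next.metadata.currentStory = upcoming ? upcoming.id : null;
  return next;
}
```

The sketch returns a new object instead of mutating its input, so the caller decides when to write `.ctx/PRD.json` back to disk.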

package/bin/ctx.js CHANGED

@@ -19,9 +19,9 @@ if (options.help) {
   ╚██████╗   ██║   ██╔╝ ██╗
    ╚═════╝   ╚═╝   ╚═╝  ╚═╝\x1b[0m
-  \x1b[1mCTX 2.0 - Continuous Task eXecution\x1b[0m
-  Smart workflow orchestration for Claude Code.
-  4 commands. Debug loop. 100% verified.
+  \x1b[1mCTX 2.2 - Continuous Task eXecution\x1b[0m
+  PRD-driven workflow orchestration for Claude Code.
+  8 commands. Story-verified. Debug loop.
   \x1b[1mUsage:\x1b[0m
     npx ctx-cc [options]

package/commands/help.md CHANGED

@@ -4,124 +4,134 @@ description: Show CTX commands and usage guide
 ---
 <objective>
-Display the CTX 2.0 command reference.
+Display the CTX 2.2 command reference.
 Output ONLY the reference content below. Do NOT add project-specific analysis.
 </objective>
 <reference>
-# CTX 2.0 Command Reference
+# CTX 2.2 Command Reference
 **CTX** (Continuous Task eXecution) - Smart workflow orchestration for Claude Code.
-4 commands. One smart router. Debug loop until 100% fixed.
+8 commands. PRD-driven. Smart routing. Debug loop until 100% fixed.
 ## Quick Start
 ```
-1. /ctx init           Initialize project with STATE.md
-2. /ctx                Smart router - does the right thing
-3. (repeat until done)
+1. /ctx init           Initialize project + generate PRD.json
+2. /ctx                Smart router does the right thing
+3. /ctx status         Check progress (read-only)
 4. /ctx pause          Checkpoint when needed
 ```
-That's it. `/ctx` reads STATE.md and knows what to do next.
+## What's New in 2.2
-## The 4 Commands
+- **Front-Loaded Approach** - Gather ALL info upfront, execute autonomously
+- **PRD.json** - Requirements contract with user stories
+- **Secure Credentials** - `.ctx/.env` for test credentials (gitignored)
+- **Acceptance Criteria** - Each story has verifiable criteria
+- **`passes` Flag** - Auto-tracks story completion
+- **Story-Driven Workflow** - Plan → Execute → Verify → Next Story
-### `/ctx`
-**The smart router.** Reads STATE.md, does the right action:
+## Front-Loaded Philosophy
-| State | What happens |
-|-------|--------------|
-| initializing | Research + Plan (ArguSeek + ChunkHound) |
-| executing | Execute current task |
-| debugging | Debug loop until 100% fixed |
-| verifying | Three-level verification |
-| paused | Resume from checkpoint |
-Just run `/ctx` and it figures out what's needed.
-### `/ctx init`
-Initialize a new project. Creates `.ctx/STATE.md`.
-### `/ctx quick "task"`
-Quick task bypass. Skip the workflow for small fixes.
 ```
-/ctx quick "fix the button color"
-/ctx quick "add console.log for debugging"
+/ctx init gathers:
+├── Requirements → PRD.json stories
+├── Context → problem, target user, success criteria
+├── Credentials → .ctx/.env (gitignored)
+└── Constitution → rules for autonomous decisions
+Then /ctx runs autonomously:
+├── Only interrupts for architecture decisions
+├── Uses stored credentials for browser testing
+└── Loops until all stories pass
 ```
-### `/ctx pause`
-Create checkpoint. Safe to close session.
-Resume later with `/ctx` - auto-restores in ~2.5k tokens.
+## The 8 Commands
-## Debug Loop (New in 2.0)
+### Smart (Auto-routing)
-When something breaks, CTX enters debug mode:
+| Command | Purpose |
+|---------|---------|
+| `/ctx` | **Smart router** - reads STATE.md, does the right thing |
+| `/ctx init` | Initialize project with STATE.md |
-```
-Loop (max 5 attempts):
-  1. Analyze error
-  2. Form hypothesis
-  3. Apply fix
-  4. Verify (build + tests + browser)
-  5. If fixed: done
-     If not: new hypothesis, try again
-```
+### Inspect (Read-only)
-**Browser verification for UI:**
-- Navigates to affected page
-- Checks elements exist
-- Takes screenshot proof
-- Saves to `.ctx/debug/`
+| Command | Purpose |
+|---------|---------|
+| `/ctx status` | See current state without triggering action |
-## Architecture
+### Control (Override smart router)
-### STATE.md - Single Source of Truth
-~100 lines. Always accurate. Always read first.
+| Command | Purpose |
+|---------|---------|
+| `/ctx plan [goal]` | Force research + planning |
+| `/ctx verify` | Force three-level verification |
+| `/ctx quick "task"` | Quick task bypass (skip workflow) |
-```markdown
-## Project
-- Name, Stack, Status
+### Session
-## Current Phase
-- Goal, Progress
+| Command | Purpose |
+|---------|---------|
+| `/ctx pause` | Checkpoint for session resume |
-## Active Task
-- What, Status, Attempts
+### Phase Management
-## Debug Session (if active)
-- Issue, Hypothesis, Attempt count
+| Command | Purpose |
+|---------|---------|
+| `/ctx phase list` | Show all phases with status |
+| `/ctx phase add "goal"` | Add new phase to roadmap |
+| `/ctx phase next` | Complete current, move to next |
+| `/ctx phase skip` | Skip current phase |
-## Context Budget
-- Usage %, Quality level
-```
+---
-### 5 Specialized Agents
+## Smart Router States
-| Agent | When spawned |
+When you run `/ctx`, it reads STATE.md and PRD.json, auto-routes:
+| State | What happens |
 |-------|--------------|
-| ctx-researcher | status = initializing |
-| ctx-planner | after research |
-| ctx-executor | status = executing |
-| ctx-debugger | status = debugging |
-| ctx-verifier | status = verifying |
+| initializing | Research + Plan for current story |
+| executing | Execute tasks for current story |
+| debugging | **Debug loop until 100% fixed** |
+| verifying | Verify acceptance criteria → mark story as passed |
+| paused | Resume from checkpoint |
+**Story Flow:**
+```
+S001 → plan → execute → verify ✓ → S002 → plan → execute → verify ✓ → ...
+```
+## Debug Loop
-### Directory Structure
+When something breaks, CTX enters debug mode:
 ```
-.ctx/
-├── STATE.md          # Living digest - ALWAYS read first
-├── phases/{id}/      # Phase data
-│   ├── RESEARCH.md   # ArguSeek + ChunkHound results
-│   ├── PLAN.md       # 2-3 tasks (atomic)
-│   └── VERIFY.md     # Three-level verification
-├── checkpoints/      # Auto-checkpoints
-├── debug/            # Debug screenshots
-└── memory/           # Decision memory
+Loop (max 5 attempts):
+  1. Analyze error
+  2. Form hypothesis
+  3. Apply fix
+  4. Verify (build + tests + browser)
+  5. If fixed → done
+     If not → new hypothesis, try again
 ```
-## Key Features
+**Browser verification for UI:**
+- Playwright or Chrome DevTools
+- Screenshots saved to `.ctx/debug/`
+## Three-Level Verification
+| Level | Question | Check |
+|-------|----------|-------|
+| Exists | File on disk? | Glob |
+| Substantive | Real code, not stub? | No TODOs, no placeholders |
+| Wired | Imported and used? | Trace imports |
+## Key Design Principles
 ### Atomic Planning (2-3 Tasks Max)
 Prevents context degradation. Big work = multiple phases.
@@ -134,11 +144,6 @@ Prevents context degradation. Big work = multiple phases.
 | Blocking issue | Auto-fix |
 | Architecture decision | Ask user |
-### Three-Level Verification
-1. **Exists** - File on disk?
-2. **Substantive** - Real code, not stub?
-3. **Wired** - Imported and used?
 ### Context Budget
 | Usage | Quality | Action |
 |-------|---------|--------|
@@ -146,27 +151,58 @@ Prevents context degradation. Big work = multiple phases.
 | 30-50% | Good | Continue |
 | 50%+ | Degrading | Auto-checkpoint |
+## 5 Specialized Agents
+| Agent | When spawned |
+|-------|--------------|
+| ctx-researcher | During planning (ArguSeek + ChunkHound) |
+| ctx-planner | After research |
+| ctx-executor | During execution |
+| ctx-debugger | When debugging |
+| ctx-verifier | During verification |
 ## Integrations
-### ArguSeek (Web Research)
-Auto-runs during planning:
-- Best practices
-- Security considerations
-- Performance patterns
+- **ArguSeek**: Web research during planning
+- **ChunkHound**: Semantic code search (`uv tool install chunkhound`)
+- **Playwright/DevTools**: Browser verification for UI
-### ChunkHound (Semantic Search)
-Auto-runs during planning:
-- Find relevant code
-- Detect patterns
-- Map entry points
+## Directory Structure
-Install: `uv tool install chunkhound`
+```
+.ctx/
+├── STATE.md          # Living digest - execution state
+├── PRD.json          # Requirements contract - stories + criteria
+├── .env              # Test credentials (GITIGNORED)
+├── .gitignore        # Protects secrets
+├── phases/{story_id}/  # Per-story data
+│   ├── RESEARCH.md   # ArguSeek + ChunkHound results
+│   ├── PLAN.md       # Tasks mapped to acceptance criteria
+│   └── VERIFY.md     # Verification report
+├── checkpoints/      # Auto-checkpoints
+├── debug/            # Debug screenshots
+└── verify/           # Verification screenshots
+```
-### Browser Verification (Playwright/Chrome DevTools)
-Auto-runs during debugging and verification:
-- Navigate to pages
-- Check elements
-- Screenshot proof
+## PRD.json Structure
+```json
+{
+  "stories": [
+    {
+      "id": "S001",
+      "title": "User login",
+      "acceptanceCriteria": ["User can log in with email", "..."],
+      "passes": false
+    }
+  ],
+  "metadata": {
+    "currentStory": "S001",
+    "passedStories": 0,
+    "totalStories": 5
+  }
+}
+```
 ## Updating CTX
@@ -175,5 +211,5 @@ npx ctx-cc --force
 ```
 ---
-*CTX 2.0 - 4 commands, debug loop, 100% verified*
+*CTX 2.2 - PRD-driven, story-verified, debug loop until 100% fixed*
 </reference>
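
Note on the "Debug Loop" section in the help.md diff above: the bounded retry it describes (max 5 attempts, a new hypothesis after each failed verification) could be sketched roughly as follows. Nothing here comes from the package: `runDebugLoop` and the `analyze`, `applyFix`, and `verify` callbacks are hypothetical names, and the callbacks are synchronous only to keep the sketch short.

```javascript
// Sketch only: runDebugLoop and its callbacks are hypothetical, not part of ctx-cc.
//   analyze(error, history) -> hypothesis (any value)
//   applyFix(hypothesis)    -> void
//   verify()                -> { fixed: boolean, error?: string }
function runDebugLoop(initialError, { analyze, applyFix, verify }, maxAttempts = 5) {
  const history = [];
  let error = initialError;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const hypothesis = analyze(error, history); // steps 1-2: analyze error, form hypothesis
    applyFix(hypothesis);                       // step 3: apply fix
    const result = verify();                    // step 4: build + tests + browser
    history.push({ attempt, hypothesis, fixed: result.fixed });

    if (result.fixed) {
      return { fixed: true, attempts: attempt, history }; // step 5: fixed, done
    }
    // Not fixed: carry the latest failure into the next hypothesis.
    error = result.error ?? error;
  }

  // Attempt budget exhausted without a fix.
  return { fixed: false, attempts: maxAttempts, history };
}
```

Passing the accumulated `history` to `analyze` is a design choice in this sketch, so that each new hypothesis can avoid repeating an earlier one; the help text does not say how CTX tracks earlier attempts.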