ctx-cc 3.5.0 → 4.1.0

Files changed (74)
  1. package/README.md +375 -676
  2. package/agents/ctx-arch-mapper.md +5 -3
  3. package/agents/ctx-auditor.md +5 -3
  4. package/agents/ctx-codex-reviewer.md +214 -0
  5. package/agents/ctx-concerns-mapper.md +5 -3
  6. package/agents/ctx-criteria-suggester.md +6 -4
  7. package/agents/ctx-debugger.md +5 -3
  8. package/agents/ctx-designer.md +488 -114
  9. package/agents/ctx-discusser.md +5 -3
  10. package/agents/ctx-executor.md +5 -3
  11. package/agents/ctx-handoff.md +6 -4
  12. package/agents/ctx-learner.md +5 -3
  13. package/agents/ctx-mapper.md +4 -3
  14. package/agents/ctx-ml-analyst.md +600 -0
  15. package/agents/ctx-ml-engineer.md +933 -0
  16. package/agents/ctx-ml-reviewer.md +485 -0
  17. package/agents/ctx-ml-scientist.md +626 -0
  18. package/agents/ctx-parallelizer.md +4 -3
  19. package/agents/ctx-planner.md +5 -3
  20. package/agents/ctx-predictor.md +4 -3
  21. package/agents/ctx-qa.md +5 -3
  22. package/agents/ctx-quality-mapper.md +5 -3
  23. package/agents/ctx-researcher.md +5 -3
  24. package/agents/ctx-reviewer.md +6 -4
  25. package/agents/ctx-team-coordinator.md +5 -3
  26. package/agents/ctx-tech-mapper.md +5 -3
  27. package/agents/ctx-verifier.md +5 -3
  28. package/bin/ctx.js +199 -27
  29. package/commands/brand.md +309 -0
  30. package/commands/ctx.md +10 -10
  31. package/commands/design.md +304 -0
  32. package/commands/experiment.md +251 -0
  33. package/commands/help.md +57 -7
  34. package/commands/init.md +25 -0
  35. package/commands/metrics.md +1 -1
  36. package/commands/milestone.md +1 -1
  37. package/commands/ml-status.md +197 -0
  38. package/commands/monitor.md +1 -1
  39. package/commands/train.md +266 -0
  40. package/commands/visual-qa.md +559 -0
  41. package/commands/voice.md +1 -1
  42. package/hooks/post-tool-use.js +39 -0
  43. package/hooks/pre-tool-use.js +94 -0
  44. package/hooks/subagent-stop.js +32 -0
  45. package/package.json +9 -3
  46. package/plugin.json +46 -0
  47. package/skills/ctx-design-system/SKILL.md +572 -0
  48. package/skills/ctx-ml-experiment/SKILL.md +334 -0
  49. package/skills/ctx-ml-pipeline/SKILL.md +437 -0
  50. package/skills/ctx-orchestrator/SKILL.md +91 -0
  51. package/skills/ctx-review-gate/SKILL.md +147 -0
  52. package/skills/ctx-state/SKILL.md +100 -0
  53. package/skills/ctx-visual-qa/SKILL.md +587 -0
  54. package/src/agents.js +109 -0
  55. package/src/auto.js +287 -0
  56. package/src/capabilities.js +226 -0
  57. package/src/commits.js +94 -0
  58. package/src/config.js +112 -0
  59. package/src/context.js +241 -0
  60. package/src/handoff.js +156 -0
  61. package/src/hooks.js +218 -0
  62. package/src/install.js +125 -50
  63. package/src/lifecycle.js +194 -0
  64. package/src/metrics.js +198 -0
  65. package/src/pipeline.js +269 -0
  66. package/src/review-gate.js +338 -0
  67. package/src/runner.js +120 -0
  68. package/src/skills.js +143 -0
  69. package/src/state.js +267 -0
  70. package/src/worktree.js +244 -0
  71. package/templates/PRD.json +1 -1
  72. package/templates/config.json +4 -237
  73. package/workflows/ctx-router.md +0 -485
  74. package/workflows/map-codebase.md +0 -329
package/commands/ctx.md CHANGED
@@ -120,29 +120,29 @@ Call Task 4 times in a SINGLE message with these parameters:
120
120
 
121
121
  ```
122
122
  Task 1:
123
- subagent_type: "gsd-codebase-mapper"
124
- prompt: "Focus area: TECH. Analyze technology stack. Write to .ctx/codebase/TECH.md with languages, frameworks, dependencies, build tools, versions."
123
+ subagent_type: "ctx-tech-mapper"
124
+ prompt: "Analyze technology stack. Write to .ctx/codebase/TECH.md with languages, frameworks, dependencies, build tools, versions."
125
125
  model: "haiku"
126
126
  run_in_background: true
127
127
  description: "Map tech stack"
128
128
 
129
129
  Task 2:
130
- subagent_type: "gsd-codebase-mapper"
131
- prompt: "Focus area: ARCH. Analyze architecture. Write to .ctx/codebase/ARCH.md with patterns, layers, modules, entry points, data flow."
130
+ subagent_type: "ctx-arch-mapper"
131
+ prompt: "Analyze architecture. Write to .ctx/codebase/ARCH.md with patterns, layers, modules, entry points, data flow."
132
132
  model: "haiku"
133
133
  run_in_background: true
134
134
  description: "Map architecture"
135
135
 
136
136
  Task 3:
137
- subagent_type: "gsd-codebase-mapper"
138
- prompt: "Focus area: QUALITY. Analyze code quality. Write to .ctx/codebase/QUALITY.md with test coverage, linting, type safety, documentation, code smells."
137
+ subagent_type: "ctx-quality-mapper"
138
+ prompt: "Analyze code quality. Write to .ctx/codebase/QUALITY.md with test coverage, linting, type safety, documentation, code smells."
139
139
  model: "haiku"
140
140
  run_in_background: true
141
141
  description: "Map quality"
142
142
 
143
143
  Task 4:
144
- subagent_type: "gsd-codebase-mapper"
145
- prompt: "Focus area: CONCERNS. Analyze risks and concerns. Write to .ctx/codebase/CONCERNS.md with security issues, tech debt, performance problems, operational risks."
144
+ subagent_type: "ctx-concerns-mapper"
145
+ prompt: "Analyze risks and concerns. Write to .ctx/codebase/CONCERNS.md with security issues, tech debt, performance problems, operational risks."
146
146
  model: "haiku"
147
147
  run_in_background: true
148
148
  description: "Map concerns"
@@ -264,7 +264,7 @@ If .ctx/codebase/ doesn't exist, run quick mapping first.
264
264
  **Spawn debugger agent:**
265
265
  ```
266
266
  Task:
267
- subagent_type: "debugger"
267
+ subagent_type: "ctx-debugger"
268
268
  prompt: "Investigate this issue: [user's problem]. Use scientific method: reproduce, isolate, fix, verify. The codebase analysis is in .ctx/codebase/. Write debug session to .ctx/debug/SESSION-[timestamp].md"
269
269
  description: "Debug issue"
270
270
  ```
@@ -283,7 +283,7 @@ Task:
283
283
  **Spawn QA agent:**
284
284
  ```
285
285
  Task:
286
- subagent_type: "qa-engineer"
286
+ subagent_type: "ctx-qa"
287
287
  prompt: "Run comprehensive QA validation on this codebase. Test user flows, validate accessibility, check for regressions. Write report to .ctx/qa/REPORT-[timestamp].md"
288
288
  description: "QA validation"
289
289
  ```
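As a reading aid, the renames in the three hunks above can be restated as a lookup table. This is only a sketch: `SUBAGENT_RENAMES` and `migrate_subagent` are hypothetical names that do not appear in the package, and the table covers only the renames visible in this diff.

```python
# Hypothetical helper, not part of ctx-cc: it restates the renames shown in the hunks above.
# Keys are (old subagent_type, focus area from the old prompt or None).
SUBAGENT_RENAMES = {
    ("gsd-codebase-mapper", "TECH"): "ctx-tech-mapper",
    ("gsd-codebase-mapper", "ARCH"): "ctx-arch-mapper",
    ("gsd-codebase-mapper", "QUALITY"): "ctx-quality-mapper",
    ("gsd-codebase-mapper", "CONCERNS"): "ctx-concerns-mapper",
    ("debugger", None): "ctx-debugger",
    ("qa-engineer", None): "ctx-qa",
}


def migrate_subagent(old_type, focus_area=None):
    """Return the renamed subagent, or the input unchanged if it is not in the table."""
    return SUBAGENT_RENAMES.get((old_type, focus_area), old_type)
```

Note that the four mapper tasks no longer need the "Focus area: ..." prefix in their prompts, since the focus is now carried by the agent name itself.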
package/commands/design.md ADDED
@@ -0,0 +1,304 @@
1
+ ---
2
+ name: ctx:design
3
+ description: Launch design workflow — brand identity, component design, design system audit, or visual QA. Detects BRAND_KIT.md and routes accordingly. Spawns ctx-designer agent.
4
+ ---
5
+
6
+ <objective>
7
+ Launch the correct design workflow based on what the project needs and what the user wants to build. Detects existing brand foundation, asks targeted questions, and spawns the ctx-designer agent with full context.
8
+ </objective>
9
+
10
+ <usage>
11
+ ```bash
12
+ /ctx:design # Interactive — detects context, asks what to build
13
+ /ctx:design "login page" # Jump straight to component design with a description
14
+ /ctx:design --brand # Force brand establishment workflow
15
+ /ctx:design --audit # Design system audit
16
+ /ctx:design --visual-qa # Visual QA on current implementation
17
+ ```
18
+ </usage>
19
+
20
+ <process>
21
+
22
+ ## Step 1: Detect Brand Foundation
23
+
24
+ ```bash
25
+ # Check for BRAND_KIT.md in project root
26
+ ls BRAND_KIT.md 2>/dev/null && echo "EXISTS" || echo "MISSING"
27
+
28
+ # Check for token files
29
+ ls tokens/ 2>/dev/null && echo "TOKENS_EXIST" || echo "TOKENS_MISSING"
30
+ ```
31
+
32
+ Build context:
33
+ ```
34
+ brand_kit_exists = BRAND_KIT.md found
35
+ tokens_exist = tokens/ directory found with .tokens.json files
36
+ prd_loaded = .ctx/PRD.json loaded (if present)
37
+ ```
38
+
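The shell check in Step 1 only tests that `tokens/` exists, while the context description asks for `.tokens.json` files inside it. A minimal Python sketch of the stricter check follows. The function name `detect_brand_context` and the `token_file_count` key are assumptions made for illustration, and `prd_loaded` here only tests that the file is present rather than parsing it.

```python
from pathlib import Path


def detect_brand_context(project_root="."):
    """Collect the Step 1 context flags for a project directory (illustrative sketch)."""
    root = Path(project_root)
    tokens_dir = root / "tokens"
    # Only files matching *.tokens.json count, as the context description requires.
    token_files = sorted(tokens_dir.glob("*.tokens.json")) if tokens_dir.is_dir() else []
    return {
        "brand_kit_exists": (root / "BRAND_KIT.md").is_file(),
        "tokens_exist": bool(token_files),
        "token_file_count": len(token_files),  # hypothetical extra field
        "prd_loaded": (root / ".ctx" / "PRD.json").is_file(),  # presence only
    }
```

With this version an empty `tokens/` directory reports `tokens_exist` as false, which the `ls tokens/` check would not.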
39
+ ## Step 2: Determine Workflow
40
+
41
+ ### If BRAND_KIT.md does NOT exist
42
+
43
+ Present this message and route to brand:
44
+
45
+ ```
46
+ No BRAND_KIT.md found in this project.
47
+
48
+ Design work requires a visual foundation first. Without brand tokens,
49
+ components will be inconsistent and require rework later.
50
+
51
+ Recommended: Run /ctx:brand to establish your visual foundation first.
52
+ This takes 30-60 minutes and produces:
53
+ - BRAND_KIT.md (colors, typography, spacing, motion)
54
+ - tokens/ (W3C DTCG format, three-tier architecture)
55
+ - brand-assets/ (CSS, SCSS, JS, Tailwind exports)
56
+
57
+ Alternatively, if you have existing brand assets, answer:
58
+ 1. What are your primary brand colors? (hex or oklch)
59
+ 2. What fonts do you use? (or "system default")
60
+ 3. Do you have a Figma file? (share key if yes)
61
+
62
+ These answers let me create a minimal BRAND_KIT.md and proceed.
63
+ ```
64
+
65
+ If user provides existing brand assets, create minimal BRAND_KIT.md from their answers and continue.
66
+
67
+ ### If BRAND_KIT.md exists
68
+
69
+ Present workflow choices:
70
+
71
+ ```
72
+ Brand foundation detected: BRAND_KIT.md [version]
73
+ Tokens: [n] token files found
74
+
75
+ What type of design work?
76
+
77
+ A Component or page design
78
+ Design a new UI component or full page using brand tokens.
79
+ → Mood board, 3 options, prototype, implementation
80
+
81
+ B Design system
82
+ Add tokens, update brand kit, export for a new platform,
83
+ audit for unused tokens or accessibility violations.
84
+ → Token management, export, audit
85
+
86
+ C Visual QA
87
+ Check that implementation matches design specs.
88
+ → Measurement-driven parity check, a11y audit, responsive matrix
89
+
90
+ D Visual regression
91
+ Compare current implementation against last baseline screenshots.
92
+ → Gemini diff analysis, bounding box flagging
93
+ ```
94
+
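The routing described in Step 2, combined with the flags from the usage block, can be sketched as a small pure function. This is a simplification under stated assumptions: `choose_design_workflow` is a hypothetical name, flags are assumed to take priority over the A to D menu, and the "minimal BRAND_KIT.md from answers" fallback is not modelled (a missing brand kit always routes to `brand` here).

```python
# Hypothetical routing sketch; workflow names follow the <output> block of this command.
FLAG_ROUTES = {"--brand": "brand", "--audit": "design-system", "--visual-qa": "visual-qa"}
CHOICE_ROUTES = {"A": "component", "B": "design-system", "C": "visual-qa", "D": "regression"}


def choose_design_workflow(brand_kit_exists, flag=None, description=None, choice=None):
    """Return a workflow name, or None when the user still has to be asked."""
    if flag == "--brand" or not brand_kit_exists:
        return "brand"  # no visual foundation yet, so establish it first
    if flag in FLAG_ROUTES:
        return FLAG_ROUTES[flag]
    if description:
        return "component"  # e.g. /ctx:design "login page"
    if choice:
        return CHOICE_ROUTES.get(choice.strip().upper())
    return None  # interactive mode: present the A/B/C/D menu
```

Keeping the decision in one function makes the precedence explicit, which the prose leaves open.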
95
+ ## Step 3: Component/Page Design (Choice A)
96
+
97
+ Ask:
98
+ ```
99
+ Describe what you want to design:
100
+ - Component name and purpose
101
+ - Where it appears in the product
102
+ - Any Figma node ID or link? (optional)
103
+ - Priority states to handle: (default, hover, focus, disabled, loading, error)
104
+ - Responsive: mobile-only / tablet+ / all breakpoints?
105
+ ```
106
+
107
+ Load story context if available:
108
+ ```bash
109
+ # Read current PRD story for context
110
+ cat .ctx/PRD.json | python3 -c "
111
+ import json, sys
112
+ prd = json.load(sys.stdin)
113
+ current = prd.get('metadata', {}).get('currentStory')
114
+ if current:
115
+     story = next((s for s in prd['stories'] if s['id'] == current), None)
116
+     if story:
117
+         print(json.dumps(story, indent=2))
118
+ "
119
+ ```
120
+
121
+ Spawn ctx-designer with component context:
122
+
123
+ ```
124
+ Agent({
125
+ subagent_type: "ctx-designer",
126
+ prompt: "
127
+ Design story: [title from user input or PRD]
128
+
129
+ Story type: design
130
+ Component: [component name]
131
+ Description: [user description]
132
+
133
+ Brand context:
134
+ - BRAND_KIT.md exists at project root
135
+ - Token files: tokens/primitive.tokens.json, tokens/semantic.tokens.json, tokens/component.tokens.json
136
+
137
+ States required: [list from user]
138
+ Responsive: [scope from user]
139
+ Figma node: [if provided]
140
+
141
+ Follow the design-workflow in your instructions:
142
+ 1. Pre-flight check (BRAND_KIT.md exists — confirmed)
143
+ 2. Component research
144
+ 3. Mood board — STOP for approval
145
+ 4. 3 design options (A/B/C) — STOP for selection
146
+ 5. Prototype — STOP for approval
147
+ 6. Implement with brand tokens
148
+ 7. Visual QA (all breakpoints)
149
+ 8. Accessibility audit (WCAG 2.2 AA)
150
+ 9. DESIGN_BRIEF.md documentation
151
+
152
+ Acceptance criteria from story (if PRD loaded):
153
+ [acceptance criteria]
154
+ ",
155
+ description: "Design [component name]"
156
+ })
157
+ ```
158
+
159
+ ## Step 4: Design System Work (Choice B)
160
+
161
+ Ask:
162
+ ```
163
+ What design system task?
164
+
165
+ 1 Add new tokens — extend the token scale
166
+ 2 Update brand colors or typography
167
+ 3 Export tokens — CSS / SCSS / JS / Tailwind
168
+ 4 Figma sync — pull from or push to Figma variables
169
+ 5 Audit — find unused tokens, missing semantic mappings, contrast violations
170
+ 6 Theme — add or update light/dark theme overrides
171
+ ```
172
+
173
+ Spawn ctx-designer or use ctx-design-system skill directly based on task.
174
+
175
+ For audit tasks, run:
176
+ ```bash
177
+ # List the CSS custom properties referenced in source (compare against defined tokens to spot unused ones)
178
+ grep -r "var(--" src/ --include="*.tsx" --include="*.css" --include="*.scss" 2>/dev/null | \
179
+ grep -oP 'var\(--[\w-]+\)' | sort | uniq
180
+
181
+ # Find any direct primitive color references bypassing semantic layer
182
+ grep -r "color\.gray\|color\.brand" \
183
+ tokens/component.tokens.json 2>/dev/null
184
+ ```
185
+
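The first grep above lists the variables that are referenced. Finding unused tokens needs a set difference against the tokens that are defined. A minimal sketch, assuming DTCG-style token files (a token is an object with a `$value` key) and assuming CSS variable names are the token path joined with hyphens. The real export naming may differ, and all function names here are hypothetical.

```python
import re

VAR_REF = re.compile(r"var\(--([\w-]+)\)")


def referenced_vars(source_text):
    """Names used as var(--name) in the given CSS/TSX/SCSS text."""
    return set(VAR_REF.findall(source_text))


def flatten_tokens(tree, prefix=()):
    """Yield hyphen-joined names for every token (a dict that has a "$value" key)."""
    for key, node in tree.items():
        if key.startswith("$") or not isinstance(node, dict):
            continue  # skip metadata such as $type and $description
        path = prefix + (key,)
        if "$value" in node:
            yield "-".join(path)
        else:
            yield from flatten_tokens(node, path)


def unused_tokens(token_tree, source_text):
    """Defined tokens that never appear as var(--name) in the source text."""
    return sorted(set(flatten_tokens(token_tree)) - referenced_vars(source_text))
```

For example, with `color.brand.primary` and `color.gray.100` defined and only `var(--color-brand-primary)` used, the result is `["color-gray-100"]`.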
186
+ Report findings before spawning agent with fix instructions.
187
+
188
+ ## Step 5: Visual QA (Choice C)
189
+
190
+ Collect target information:
191
+ ```
192
+ What to QA?
193
+ - Component name (e.g., "Button", "NavBar", "LoginForm")
194
+ - Figma node ID (for spec extraction) — optional but recommended
195
+ - App URL path (e.g., "/components/button" or "/login")
196
+
197
+ QA scope:
198
+ - All breakpoints (375px / 768px / 1440px)?
199
+ - Accessibility audit included?
200
+ - Gemini design analysis included?
201
+ ```
202
+
203
+ Load app URL:
204
+ ```bash
205
+ cat .ctx/.env 2>/dev/null | grep APP_URL
206
+ ```
207
+
208
+ Spawn ctx-designer with visual QA mode:
209
+
210
+ ```
211
+ Agent({
212
+ subagent_type: "ctx-designer",
213
+ prompt: "
214
+ Run visual QA for: [component/page]
215
+
216
+ Story type: visual-qa
217
+ Target URL path: [path]
218
+ Figma node ID: [if provided]
219
+ App URL: [from .ctx/.env]
220
+
221
+ QA scope:
222
+ - Breakpoints: 375px, 768px, 1440px
223
+ - Design parity: measurement-driven precision diff
224
+ - Accessibility: WCAG 2.2 AA automated checks
225
+ - Gemini analysis: design quality review
226
+
227
+ Follow the ctx-visual-qa skill:
228
+ 1. Extract Figma specs (if node ID provided)
229
+ 2. Measure rendered output at each breakpoint
230
+ 3. Generate precision diff tables
231
+ 4. Run accessibility audit
232
+ 5. Run Gemini design analysis
233
+ 6. Write VISUAL_QA_REPORT.md to .ctx/qa/
234
+
235
+ Output:
236
+ - Pass/fail summary per breakpoint
237
+ - Specific corrections: file, line, property, old value, new value
238
+ ",
239
+ description: "Visual QA: [component/page]"
240
+ })
241
+ ```
242
+
243
+ ## Step 6: Visual Regression (Choice D)
244
+
245
+ ```
246
+ Agent({
247
+ subagent_type: "ctx-designer",
248
+ prompt: "
249
+ Run visual regression check.
250
+
251
+ Story type: visual-regression
252
+ Baselines: .ctx/qa/baselines/
253
+ App URL: [from .ctx/.env]
254
+ Breakpoints: 375px, 768px, 1440px
255
+
256
+ Process:
257
+ 1. Take current screenshots at all breakpoints
258
+ 2. For each: compare against baseline using Gemini analyze_design
259
+ 3. Flag any unintended differences with descriptions
260
+ 4. Save new screenshots to .ctx/qa/visual/
261
+ 5. Write regression report to .ctx/qa/REGRESSION_REPORT.md
262
+
263
+ Report format: component | breakpoint | diff description | severity (minor/major)
264
+ ",
265
+ description: "Visual regression check"
266
+ })
267
+ ```
268
+
269
+ ## Step 7: Update STATE.json
270
+
271
+ After workflow completes, record design activity:
272
+
273
+ ```bash
274
+ python3 -c "
275
+ import json, sys
276
+ from datetime import datetime
277
+
278
+ with open('.ctx/STATE.json', 'r') as f:
279
+     state = json.load(f)
280
+
281
+ state.setdefault('design', {})
282
+ state['design']['lastActivity'] = datetime.utcnow().isoformat() + 'Z'
283
+ state['design']['workflow'] = '$WORKFLOW_TYPE'
284
+ state['design']['brandKitExists'] = True
285
+
286
+ with open('.ctx/STATE.json', 'w') as f:
287
+     json.dump(state, f, indent=2)
288
+ "
289
+ ```
290
+
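The Step 7 snippet assumes `.ctx/STATE.json` already exists and uses `datetime.utcnow()`, which is deprecated as of Python 3.12. A hedged alternative is sketched below. `record_design_activity` is a hypothetical name, the timestamp is truncated to seconds (the original keeps microseconds), and starting from an empty state when the file is missing is an assumption rather than documented package behavior.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def record_design_activity(workflow, state_path=".ctx/STATE.json", brand_kit_exists=True):
    """Merge a 'design' entry into STATE.json, creating the file if it is missing."""
    path = Path(state_path)
    state = json.loads(path.read_text()) if path.is_file() else {}
    design = state.setdefault("design", {})
    # Timezone-aware UTC timestamp, formatted with a trailing Z.
    design["lastActivity"] = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    design["workflow"] = workflow
    design["brandKitExists"] = brand_kit_exists
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(state, indent=2))
    return state
```

Passing the workflow as an argument also avoids interpolating `$WORKFLOW_TYPE` into Python source through the shell.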
291
+ </process>
292
+
293
+ <output>
294
+ ```
295
+ [CTX DESIGN]
296
+
297
+ Brand foundation: [FOUND / NOT FOUND]
298
+ Workflow: [brand / component / design-system / visual-qa / regression]
299
+
300
+ [Spawning ctx-designer...]
301
+
302
+ [Agent output follows]
303
+ ```
304
+ </output>
package/commands/experiment.md ADDED
@@ -0,0 +1,251 @@
1
+ ---
2
+ name: ctx:experiment
3
+ description: ML experiment workflow — hypothesize, design, run, analyze, iterate. Routes to the right ML agent based on intent.
4
+ args: intent (optional — what you want to do, e.g. "new", "analyze", "review", "status")
5
+ ---
6
+
7
+ <objective>
8
+ Run the ML experiment lifecycle. Detects intent from your message and routes to the appropriate ML specialist. Keeps all work in .ctx/ml/ with full traceability.
9
+ </objective>
10
+
11
+ <usage>
12
+ ```bash
13
+ /ctx:experiment new "depth=8 improves AUC by 2 points" # Start new experiment with hypothesis
14
+ /ctx:experiment analyze # Run EDA on current dataset
15
+ /ctx:experiment review # Review latest experiment results
16
+ /ctx:experiment pipeline # Build or improve training pipeline
17
+ /ctx:experiment status # Show current experiment status
18
+ /ctx:experiment # Auto-detect from context
19
+ ```
20
+ </usage>
21
+
22
+ <process>
23
+
24
+ ## Step 1: Parse Intent
25
+
26
+ Check arguments and message content:
27
+
28
+ | Trigger | Route |
29
+ |---------|-------|
30
+ | "new", "hypothes", quoted string | New experiment flow |
31
+ | "analyz", "eda", "explore", "data" | EDA flow |
32
+ | "review", "results", "evaluate" | Review flow |
33
+ | "pipeline", "train", "build" | Pipeline flow |
34
+ | "status", "log", "list" | Status flow |
35
+ | No clear signal | Read .ctx/ml/ML-STATUS.md and ask |
36
+
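The trigger table above can be read as an ordered keyword match. The sketch below is one possible interpretation, not the command's actual logic: `parse_intent` and the route labels are hypothetical, triggers are matched as word prefixes, and rows are tried in table order with a quoted string checked first.

```python
import re

# Rows in the same order as the trigger table; each trigger is matched as a word prefix.
ROUTES = [
    ("new", ("new", "hypothes")),
    ("eda", ("analyz", "eda", "explore", "data")),
    ("review", ("review", "results", "evaluate")),
    ("pipeline", ("pipeline", "train", "build")),
    ("status", ("status", "log", "list")),
]


def parse_intent(message):
    """Map a free-form message to a route label, or "ask" when nothing matches."""
    text = message.lower()
    if re.search(r'"[^"]+"', text):
        return "new"  # a quoted string is treated as a hypothesis
    words = re.findall(r"[a-z]+", text)
    for route, triggers in ROUTES:
        if any(word.startswith(trigger) for word in words for trigger in triggers):
            return route
    return "ask"  # no clear signal: read ML-STATUS.md and ask the user
```

Prefix matching keeps stems such as "analyz" and "hypothes" working, at the cost of occasional false positives (for instance, "logistic" would match "log").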
37
+ ## Step 2: Initialize ML Directory
38
+
39
+ Check whether `.ctx/ml/` exists. If not, bootstrap it:
40
+
41
+ ```bash
42
+ mkdir -p .ctx/ml/experiments
43
+ mkdir -p .ctx/ml/analysis/plots
44
+ mkdir -p .ctx/ml/features/transforms
45
+ mkdir -p .ctx/ml/models/configs
46
+ ```
47
+
48
+ Create `.ctx/ml/EXPERIMENT-LOG.md` with empty table header if it does not exist:
49
+
50
+ ```markdown
51
+ # ML Experiment Log
52
+
53
+ | ID | Hypothesis | Model | Primary Metric | Result | Status |
54
+ |----|-----------|-------|---------------|--------|--------|
55
+ ```
56
+
57
+ Create `.ctx/ml/ML-STATUS.md` with initialized content if it does not exist:
58
+
59
+ ```markdown
60
+ # ML Project Status
61
+
62
+ **Updated**: {current date}
63
+ **Active Experiment**: none
64
+
65
+ ## Current Focus
66
+
67
+ No active experiment. Run `/ctx:experiment new "<hypothesis>"` to start.
68
+
69
+ ## Blocking Issues
70
+
71
+ - none
72
+ ```
73
+
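The bootstrap in Step 2 is meant to be idempotent: directories are created if missing and the two markdown files are written only when absent. A minimal Python sketch of that behavior, assuming the templates shown above. `bootstrap_ml_dir` is a hypothetical name and the date format is an assumption, since the source only says `{current date}`.

```python
from datetime import date
from pathlib import Path

ML_SUBDIRS = ("experiments", "analysis/plots", "features/transforms", "models/configs")

LOG_HEADER = (
    "# ML Experiment Log\n\n"
    "| ID | Hypothesis | Model | Primary Metric | Result | Status |\n"
    "|----|-----------|-------|---------------|--------|--------|\n"
)

STATUS_TEMPLATE = (
    "# ML Project Status\n\n"
    "**Updated**: {today}\n"
    "**Active Experiment**: none\n\n"
    "## Current Focus\n\n"
    'No active experiment. Run `/ctx:experiment new "<hypothesis>"` to start.\n\n'
    "## Blocking Issues\n\n"
    "- none\n"
)


def bootstrap_ml_dir(root=".ctx/ml"):
    """Create the ML directory layout; return the names of files newly written."""
    base = Path(root)
    for sub in ML_SUBDIRS:
        (base / sub).mkdir(parents=True, exist_ok=True)
    files = {
        "EXPERIMENT-LOG.md": LOG_HEADER,
        "ML-STATUS.md": STATUS_TEMPLATE.format(today=date.today().isoformat()),
    }
    created = []
    for name, content in files.items():
        target = base / name
        if not target.exists():  # never overwrite an existing log or status file
            target.write_text(content)
            created.append(name)
    return created
```

Running it a second time returns an empty list and leaves existing files untouched.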
74
+ ## Step 3: Route to Agent
75
+
76
+ ### Route: New Experiment
77
+
78
+ Determine next experiment ID by reading EXPERIMENT-LOG.md and incrementing max ID.
79
+
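"Incrementing max ID" can be sketched as a scan over the log table. The sketch assumes IDs appear in the first column as `EXP-` followed by an unpadded integer, which matches the `EXP-{n}` placeholders in this file but is not otherwise specified. `next_experiment_id` is a hypothetical name.

```python
import re


def next_experiment_id(log_text):
    """Return the next sequential ID given the contents of EXPERIMENT-LOG.md."""
    ids = [
        int(number)
        for number in re.findall(r"^\|\s*EXP-(\d+)\s*\|", log_text, flags=re.MULTILINE)
    ]
    # Use the maximum rather than the row count so gaps never lead to a reused ID.
    return f"EXP-{max(ids, default=0) + 1}"
```

Taking the maximum instead of counting rows keeps IDs unique even if a row was deleted, in line with the "never reuse" guardrail.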
80
+ ```
81
+ Agent({
82
+ subagent_type: "ctx-ml-scientist",
83
+ prompt: |
84
+ Create a new ML experiment.
85
+
86
+ Experiment ID: EXP-{n}
87
+ Hypothesis: {user's hypothesis string, or ask if not provided}
88
+
89
+ 1. Write .ctx/ml/experiments/EXP-{n}/HYPOTHESIS.md
90
+ - Formalize the hypothesis (one sentence: "We believe X will improve Y by Z because W")
91
+ - Document rationale and expected outcome
92
+ - Define null hypothesis
93
+ - Identify risks
94
+
95
+ 2. Write .ctx/ml/experiments/EXP-{n}/DESIGN.md
96
+ - Define baseline and treatment
97
+ - Specify metrics (primary + guard rails)
98
+ - Define acceptance criteria
99
+
100
+ 3. Write .ctx/ml/experiments/EXP-{n}/config.yaml
101
+ - Reproducible configuration
102
+ - Seed set to 42
103
+ - Data paths from existing .ctx/ml/ structure
104
+
105
+ 4. Append row to .ctx/ml/EXPERIMENT-LOG.md with status "draft"
106
+
107
+ 5. Update .ctx/ml/ML-STATUS.md to set active experiment = EXP-{n}
108
+
109
+ Follow the ctx-ml-experiment skill for all formats.
110
+ })
111
+ ```
112
+
113
+ ### Route: EDA / Analyze
114
+
115
+ ```
116
+ Agent({
117
+ subagent_type: "ctx-ml-analyst",
118
+ prompt: |
119
+ Run exploratory data analysis.
120
+
121
+ 1. Identify the dataset (check .ctx/ml/experiments/ for config.yaml data paths,
122
+ or ask user if not clear)
123
+
124
+ 2. Write .ctx/ml/analysis/EDA-{dataset_name}.md with:
125
+ - Shape, dtypes, missing value counts
126
+ - Distribution summary for numeric features
127
+ - Class balance for target variable
128
+ - Top correlations with target
129
+ - Potential data quality issues
130
+ - Recommended features to engineer
131
+
132
+ 3. Save any plots to .ctx/ml/analysis/plots/
133
+
134
+ Follow the ctx-ml-experiment skill for file formats.
135
+ })
136
+ ```
137
+
138
+ ### Route: Review Results
139
+
140
+ Read current active experiment from `.ctx/ml/ML-STATUS.md`.
141
+
142
+ ```
143
+ Agent({
144
+ subagent_type: "ctx-ml-reviewer",
145
+ prompt: |
146
+ Review results for {active_experiment}.
147
+
148
+ Files to read:
149
+ - .ctx/ml/experiments/{active_experiment}/HYPOTHESIS.md
150
+ - .ctx/ml/experiments/{active_experiment}/DESIGN.md
151
+ - .ctx/ml/experiments/{active_experiment}/RESULTS.md (if exists)
152
+ - .ctx/ml/experiments/{active_experiment}/artifacts/metrics.json (if exists)
153
+
154
+ Review checklist:
155
+ - [ ] Primary metric vs target
156
+ - [ ] Guard rail metrics not violated
157
+ - [ ] Training stability (check train.log for NaN, divergence)
158
+ - [ ] Calibration error acceptable
159
+ - [ ] Inference latency within budget
160
+ - [ ] Feature drift not detected
161
+
162
+ Write verdict to RESULTS.md if not already there.
163
+ Update EXPERIMENT-LOG.md row status: accepted | rejected | inconclusive.
164
+ Update ML-STATUS.md with outcome and next experiment recommendation.
165
+
166
+ Follow the ctx-ml-experiment skill for all formats.
167
+ })
168
+ ```
169
+
170
+ ### Route: Pipeline
171
+
172
+ ```
173
+ Agent({
174
+ subagent_type: "ctx-ml-engineer",
175
+ prompt: |
176
+ Build or improve the ML training pipeline.
177
+
178
+ Read:
179
+ - .ctx/ml/features/feature-registry.yaml
180
+ - .ctx/ml/models/registry.yaml
181
+ - Active experiment config if available
182
+
183
+ Apply the full pipeline architecture from the ctx-ml-pipeline skill:
184
+ validation → features → training → HPO → evaluation → registry → inference → monitoring
185
+
186
+ Required patterns:
187
+ - Pandera schema validation at ingestion
188
+ - Deterministic, serializable feature transforms
189
+ - Conformal prediction via MAPIE
190
+ - Prediction envelope with lineage
191
+ - Circuit breaker on inference
192
+ - KS-based drift detection
193
+
194
+ Save all artifacts per the reproducibility requirements in ctx-ml-pipeline skill.
195
+ })
196
+ ```
197
+
198
+ ### Route: Status
199
+
200
+ Read and display current state without spawning an agent:
201
+
202
+ 1. Read `.ctx/ml/ML-STATUS.md`
203
+ 2. Read `.ctx/ml/EXPERIMENT-LOG.md` (last 5 rows)
204
+ 3. Read `.ctx/ml/models/registry.yaml` (current versions)
205
+
206
+ Output format:
207
+
208
+ ```
209
+ [ML Status]
210
+
211
+ Active Experiment: EXP-{n} — {hypothesis title}
212
+ Status: {draft | running | concluded}
213
+
214
+ Recent Experiments:
215
+ EXP-{n} {status} {primary metric result}
216
+ EXP-{n-1} {status} {primary metric result}
217
+ EXP-{n-2} {status} {primary metric result}
218
+
219
+ Production Models:
220
+ {model-name}: v{n} (AUC {value})
221
+
222
+ Current Focus:
223
+ {from ML-STATUS.md}
224
+
225
+ Run /ctx:experiment new "<hypothesis>" to start the next experiment.
226
+ ```
227
+
228
+ ## Step 4: Report Outcome
229
+
230
+ After agent completes, report:
231
+
232
+ ```
233
+ [Experiment] EXP-{n} — {phase completed}
234
+
235
+ Files written:
236
+ .ctx/ml/experiments/EXP-{n}/{file1}
237
+ .ctx/ml/experiments/EXP-{n}/{file2}
238
+
239
+ Next action:
240
+ {what to do next — e.g. "run training", "review results", "promote model"}
241
+ ```
242
+
243
+ </process>
244
+
245
+ <guardrails>
246
+ - Never run experiments without HYPOTHESIS.md and DESIGN.md existing first.
247
+ - Never update model registry from this command — use /ctx:train for that.
248
+ - If active experiment is running, warn before creating a new one.
249
+ - Status route never spawns agents — it is read-only.
250
+ - All experiment IDs must be sequential and unique — never reuse.
251
+ </guardrails>
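The first guardrail (no run without HYPOTHESIS.md and DESIGN.md) is easy to check mechanically before any training step. A minimal sketch, assuming the `.ctx/ml/experiments/EXP-{n}/` layout used throughout this command. `missing_prerequisites` is a hypothetical helper, not something the package is known to ship.

```python
from pathlib import Path

REQUIRED_FILES = ("HYPOTHESIS.md", "DESIGN.md")


def missing_prerequisites(experiment_id, ml_root=".ctx/ml"):
    """List the required experiment files that do not exist yet (empty list means ready)."""
    exp_dir = Path(ml_root) / "experiments" / experiment_id
    return [name for name in REQUIRED_FILES if not (exp_dir / name).is_file()]
```

A caller would refuse to start a run while the returned list is non-empty and report the missing file names to the user.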