nubos-pilot 0.1.0

Files changed (273)
  1. package/agents/np-ai-researcher.md +140 -0
  2. package/agents/np-code-fixer.md +363 -0
  3. package/agents/np-code-reviewer.md +351 -0
  4. package/agents/np-domain-researcher.md +136 -0
  5. package/agents/np-eval-auditor.md +167 -0
  6. package/agents/np-eval-planner.md +153 -0
  7. package/agents/np-executor.md +72 -0
  8. package/agents/np-framework-selector.md +171 -0
  9. package/agents/np-nyquist-auditor.md +185 -0
  10. package/agents/np-plan-checker.md +165 -0
  11. package/agents/np-planner.md +199 -0
  12. package/agents/np-researcher.md +150 -0
  13. package/agents/np-security-auditor.md +206 -0
  14. package/agents/np-ui-auditor.md +369 -0
  15. package/agents/np-ui-checker.md +192 -0
  16. package/agents/np-ui-researcher.md +324 -0
  17. package/agents/np-verifier.md +79 -0
  18. package/bin/check-coverage.cjs +40 -0
  19. package/bin/check-workflows.cjs +171 -0
  20. package/bin/check-workflows.test.cjs +208 -0
  21. package/bin/install.js +500 -0
  22. package/bin/np-tools/_commands.cjs +70 -0
  23. package/bin/np-tools/add-tests.cjs +171 -0
  24. package/bin/np-tools/add-tests.test.cjs +122 -0
  25. package/bin/np-tools/add-todo.cjs +108 -0
  26. package/bin/np-tools/add-todo.test.cjs +112 -0
  27. package/bin/np-tools/agent-skills.cjs +14 -0
  28. package/bin/np-tools/agent-skills.test.cjs +42 -0
  29. package/bin/np-tools/ai-integration-phase.cjs +109 -0
  30. package/bin/np-tools/ai-integration-phase.test.cjs +123 -0
  31. package/bin/np-tools/askuser.cjs +53 -0
  32. package/bin/np-tools/askuser.test.cjs +49 -0
  33. package/bin/np-tools/autonomous.cjs +69 -0
  34. package/bin/np-tools/autonomous.test.cjs +74 -0
  35. package/bin/np-tools/checkpoint.cjs +101 -0
  36. package/bin/np-tools/checkpoint.test.cjs +119 -0
  37. package/bin/np-tools/code-review.cjs +133 -0
  38. package/bin/np-tools/code-review.test.cjs +96 -0
  39. package/bin/np-tools/commit-task.cjs +120 -0
  40. package/bin/np-tools/commit-task.test.cjs +160 -0
  41. package/bin/np-tools/commit.cjs +103 -0
  42. package/bin/np-tools/commit.test.cjs +93 -0
  43. package/bin/np-tools/config.cjs +101 -0
  44. package/bin/np-tools/config.test.cjs +71 -0
  45. package/bin/np-tools/discuss-phase-power.cjs +265 -0
  46. package/bin/np-tools/discuss-phase-power.test.cjs +242 -0
  47. package/bin/np-tools/discuss-phase.cjs +132 -0
  48. package/bin/np-tools/discuss-phase.test.cjs +148 -0
  49. package/bin/np-tools/dispatch.cjs +116 -0
  50. package/bin/np-tools/doctor.cjs +242 -0
  51. package/bin/np-tools/eval-review.cjs +116 -0
  52. package/bin/np-tools/eval-review.test.cjs +123 -0
  53. package/bin/np-tools/execute-phase.cjs +182 -0
  54. package/bin/np-tools/execute-phase.test.cjs +116 -0
  55. package/bin/np-tools/execute-plan.cjs +124 -0
  56. package/bin/np-tools/execute-plan.test.cjs +82 -0
  57. package/bin/np-tools/help.cjs +28 -0
  58. package/bin/np-tools/help.test.cjs +29 -0
  59. package/bin/np-tools/init-dispatch.test.cjs +91 -0
  60. package/bin/np-tools/metrics.cjs +97 -0
  61. package/bin/np-tools/metrics.test.cjs +188 -0
  62. package/bin/np-tools/new-milestone.cjs +288 -0
  63. package/bin/np-tools/new-milestone.test.cjs +166 -0
  64. package/bin/np-tools/new-project.cjs +284 -0
  65. package/bin/np-tools/new-project.test.cjs +165 -0
  66. package/bin/np-tools/next.cjs +7 -0
  67. package/bin/np-tools/next.test.cjs +30 -0
  68. package/bin/np-tools/park.cjs +48 -0
  69. package/bin/np-tools/park.test.cjs +50 -0
  70. package/bin/np-tools/pause-work.cjs +24 -0
  71. package/bin/np-tools/pause-work.test.cjs +74 -0
  72. package/bin/np-tools/phase.cjs +71 -0
  73. package/bin/np-tools/phase.test.cjs +81 -0
  74. package/bin/np-tools/plan-diff.cjs +57 -0
  75. package/bin/np-tools/plan-diff.test.cjs +134 -0
  76. package/bin/np-tools/plan-milestone-gaps.cjs +115 -0
  77. package/bin/np-tools/plan-milestone-gaps.test.cjs +122 -0
  78. package/bin/np-tools/plan-phase.cjs +350 -0
  79. package/bin/np-tools/plan-phase.test.cjs +263 -0
  80. package/bin/np-tools/progress.cjs +7 -0
  81. package/bin/np-tools/progress.test.cjs +44 -0
  82. package/bin/np-tools/queue.cjs +213 -0
  83. package/bin/np-tools/research-phase.cjs +144 -0
  84. package/bin/np-tools/research-phase.test.cjs +154 -0
  85. package/bin/np-tools/reset-slice.cjs +17 -0
  86. package/bin/np-tools/reset-slice.test.cjs +96 -0
  87. package/bin/np-tools/resolve-model.cjs +110 -0
  88. package/bin/np-tools/resolve-model.test.cjs +200 -0
  89. package/bin/np-tools/resume-work.cjs +76 -0
  90. package/bin/np-tools/resume-work.test.cjs +91 -0
  91. package/bin/np-tools/skip.cjs +48 -0
  92. package/bin/np-tools/skip.test.cjs +66 -0
  93. package/bin/np-tools/slug.cjs +34 -0
  94. package/bin/np-tools/slug.test.cjs +46 -0
  95. package/bin/np-tools/state.cjs +16 -0
  96. package/bin/np-tools/state.test.cjs +40 -0
  97. package/bin/np-tools/stats.cjs +151 -0
  98. package/bin/np-tools/stats.test.cjs +118 -0
  99. package/bin/np-tools/triage.cjs +128 -0
  100. package/bin/np-tools/ui-phase.cjs +108 -0
  101. package/bin/np-tools/ui-phase.test.cjs +121 -0
  102. package/bin/np-tools/ui-review.cjs +108 -0
  103. package/bin/np-tools/ui-review.test.cjs +120 -0
  104. package/bin/np-tools/undo-task.cjs +31 -0
  105. package/bin/np-tools/undo-task.test.cjs +117 -0
  106. package/bin/np-tools/undo.cjs +43 -0
  107. package/bin/np-tools/undo.test.cjs +120 -0
  108. package/bin/np-tools/unpark.cjs +48 -0
  109. package/bin/np-tools/unpark.test.cjs +50 -0
  110. package/bin/np-tools/verify-work.cjs +186 -0
  111. package/bin/np-tools/verify-work.test.cjs +97 -0
  112. package/docs/adr/0001-no-daemon-invariant.md +82 -0
  113. package/docs/adr/0002-zero-runtime-dependencies.md +90 -0
  114. package/docs/adr/0003-max-six-unit-types.md +85 -0
  115. package/docs/adr/0004-atomic-commit-per-unit.md +102 -0
  116. package/docs/adr/0005-three-orthogonal-file-trees.md +98 -0
  117. package/docs/adr/0006-yaml-dependency-amendment.md +60 -0
  118. package/docs/adr/README.md +27 -0
  119. package/docs/agent-frontmatter-schema.md +84 -0
  120. package/docs/phase-artifact-schemas.md +292 -0
  121. package/docs/phase-directory-layout.md +82 -0
  122. package/lib/__tests__/README.md +1 -0
  123. package/lib/agents.cjs +98 -0
  124. package/lib/agents.test.cjs +286 -0
  125. package/lib/askuser.cjs +36 -0
  126. package/lib/askuser.test.cjs +310 -0
  127. package/lib/checkpoint.cjs +135 -0
  128. package/lib/checkpoint.test.cjs +184 -0
  129. package/lib/core.cjs +165 -0
  130. package/lib/core.test.cjs +405 -0
  131. package/lib/fixtures/README.md +1 -0
  132. package/lib/fixtures/phase-tree/README.md +1 -0
  133. package/lib/fixtures/plans/cycle/PLAN.md +16 -0
  134. package/lib/fixtures/plans/cycle/tasks/T-01.md +20 -0
  135. package/lib/fixtures/plans/cycle/tasks/T-02.md +20 -0
  136. package/lib/fixtures/plans/cycle/tasks/T-03.md +20 -0
  137. package/lib/fixtures/plans/linear/PLAN.md +16 -0
  138. package/lib/fixtures/plans/linear/tasks/T-01.md +20 -0
  139. package/lib/fixtures/plans/linear/tasks/T-02.md +20 -0
  140. package/lib/fixtures/plans/linear/tasks/T-03.md +20 -0
  141. package/lib/fixtures/plans/parallel/PLAN.md +16 -0
  142. package/lib/fixtures/plans/parallel/tasks/T-01.md +20 -0
  143. package/lib/fixtures/plans/parallel/tasks/T-02.md +20 -0
  144. package/lib/fixtures/plans/parallel/tasks/T-03.md +20 -0
  145. package/lib/fixtures/plans/wave-conflict/PLAN.md +16 -0
  146. package/lib/fixtures/plans/wave-conflict/tasks/T-01.md +20 -0
  147. package/lib/fixtures/plans/wave-conflict/tasks/T-02.md +20 -0
  148. package/lib/fixtures/roadmap/ROADMAP-malformed.md +3 -0
  149. package/lib/fixtures/roadmap/ROADMAP-minimal.md +51 -0
  150. package/lib/fixtures/roadmap/roadmap-malformed.yaml +7 -0
  151. package/lib/fixtures/roadmap/roadmap-minimal.yaml +40 -0
  152. package/lib/fixtures/roadmap/roadmap-ten-phases.yaml +101 -0
  153. package/lib/fixtures/templates/phase-context.md +6 -0
  154. package/lib/fixtures/templates/plan-skeleton.md +6 -0
  155. package/lib/frontmatter.cjs +251 -0
  156. package/lib/frontmatter.test.cjs +177 -0
  157. package/lib/gaps.cjs +197 -0
  158. package/lib/gaps.test.cjs +200 -0
  159. package/lib/git.cjs +207 -0
  160. package/lib/git.test.cjs +305 -0
  161. package/lib/install/agents-md.cjs +77 -0
  162. package/lib/install/backup.cjs +70 -0
  163. package/lib/install/codex-toml.cjs +440 -0
  164. package/lib/install/managed-block.cjs +30 -0
  165. package/lib/install/manifest.cjs +148 -0
  166. package/lib/install/mcp-writer.cjs +127 -0
  167. package/lib/install/runtime-detect.cjs +44 -0
  168. package/lib/install/staging.cjs +149 -0
  169. package/lib/metrics-aggregate.cjs +229 -0
  170. package/lib/metrics-aggregate.test.cjs +192 -0
  171. package/lib/metrics.cjs +120 -0
  172. package/lib/metrics.test.cjs +182 -0
  173. package/lib/model-aliases.regression.test.cjs +16 -0
  174. package/lib/model-profiles.cjs +42 -0
  175. package/lib/model-profiles.test.cjs +61 -0
  176. package/lib/next.cjs +236 -0
  177. package/lib/next.test.cjs +194 -0
  178. package/lib/phase.cjs +95 -0
  179. package/lib/phase.test.cjs +189 -0
  180. package/lib/plan-checker-contract.test.cjs +72 -0
  181. package/lib/plan-diff.cjs +173 -0
  182. package/lib/plan-diff.test.cjs +217 -0
  183. package/lib/plan.cjs +85 -0
  184. package/lib/plan.test.cjs +263 -0
  185. package/lib/progress.cjs +95 -0
  186. package/lib/progress.test.cjs +116 -0
  187. package/lib/researcher-contract.test.cjs +61 -0
  188. package/lib/roadmap-render.cjs +206 -0
  189. package/lib/roadmap-render.test.cjs +121 -0
  190. package/lib/roadmap.cjs +416 -0
  191. package/lib/roadmap.test.cjs +371 -0
  192. package/lib/runtime/_contract.test.cjs +61 -0
  193. package/lib/runtime/_readline.cjs +119 -0
  194. package/lib/runtime/_readline.test.cjs +126 -0
  195. package/lib/runtime/claude.cjs +48 -0
  196. package/lib/runtime/claude.test.cjs +101 -0
  197. package/lib/runtime/codex.cjs +35 -0
  198. package/lib/runtime/codex.test.cjs +114 -0
  199. package/lib/runtime/gemini.cjs +35 -0
  200. package/lib/runtime/gemini.test.cjs +109 -0
  201. package/lib/runtime/index.cjs +49 -0
  202. package/lib/runtime/index.test.cjs +181 -0
  203. package/lib/runtime/opencode.cjs +35 -0
  204. package/lib/runtime/opencode.test.cjs +124 -0
  205. package/lib/state.cjs +205 -0
  206. package/lib/state.test.cjs +264 -0
  207. package/lib/surface-audit.test.cjs +46 -0
  208. package/lib/tasks.cjs +327 -0
  209. package/lib/tasks.test.cjs +389 -0
  210. package/lib/template.cjs +66 -0
  211. package/lib/template.test.cjs +159 -0
  212. package/lib/undo.cjs +179 -0
  213. package/lib/undo.test.cjs +261 -0
  214. package/lib/verify.cjs +116 -0
  215. package/lib/verify.test.cjs +187 -0
  216. package/np-tools.cjs +303 -0
  217. package/package.json +39 -0
  218. package/templates/AI-SPEC.md +90 -0
  219. package/templates/CONTEXT.md +32 -0
  220. package/templates/PLAN.md +69 -0
  221. package/templates/PROJECT.md +60 -0
  222. package/templates/REQUIREMENTS.md +38 -0
  223. package/templates/SECURITY.md +61 -0
  224. package/templates/UI-SPEC.md +64 -0
  225. package/templates/VALIDATION.md +76 -0
  226. package/templates/claude/payload/README.md +11 -0
  227. package/templates/opencode/opencode.json +6 -0
  228. package/templates/opencode/payload/AGENTS.md +9 -0
  229. package/workflows/add-backlog.md +212 -0
  230. package/workflows/add-tests.md +69 -0
  231. package/workflows/add-todo.md +222 -0
  232. package/workflows/ai-integration-phase.md +230 -0
  233. package/workflows/autonomous.md +94 -0
  234. package/workflows/cleanup.md +325 -0
  235. package/workflows/code-review-fix.md +435 -0
  236. package/workflows/code-review.md +447 -0
  237. package/workflows/discuss-phase-assumptions.md +269 -0
  238. package/workflows/discuss-phase-power.md +139 -0
  239. package/workflows/discuss-phase.md +386 -0
  240. package/workflows/dispatch.md +9 -0
  241. package/workflows/doctor.md +10 -0
  242. package/workflows/eval-review.md +243 -0
  243. package/workflows/execute-phase.md +142 -0
  244. package/workflows/execute-plan.md +82 -0
  245. package/workflows/help.md +8 -0
  246. package/workflows/new-milestone.md +166 -0
  247. package/workflows/new-project.md +213 -0
  248. package/workflows/next.md +8 -0
  249. package/workflows/note.md +244 -0
  250. package/workflows/park.md +29 -0
  251. package/workflows/pause-work.md +34 -0
  252. package/workflows/plan-milestone-gaps.md +233 -0
  253. package/workflows/plan-phase.md +351 -0
  254. package/workflows/progress.md +8 -0
  255. package/workflows/queue.md +9 -0
  256. package/workflows/research-phase.md +327 -0
  257. package/workflows/reset-slice.md +39 -0
  258. package/workflows/resume-work.md +79 -0
  259. package/workflows/review.md +489 -0
  260. package/workflows/secure-phase.md +209 -0
  261. package/workflows/session-report.md +243 -0
  262. package/workflows/skip.md +29 -0
  263. package/workflows/state.md +7 -0
  264. package/workflows/stats.md +170 -0
  265. package/workflows/thread.md +214 -0
  266. package/workflows/triage.md +9 -0
  267. package/workflows/ui-phase.md +246 -0
  268. package/workflows/ui-review.md +222 -0
  269. package/workflows/undo-task.md +42 -0
  270. package/workflows/undo.md +55 -0
  271. package/workflows/unpark.md +29 -0
  272. package/workflows/validate-phase.md +231 -0
  273. package/workflows/verify-work.md +83 -0
@@ -0,0 +1,386 @@
1
+ ---
2
+ command: np:discuss-phase
3
+ description: Adaptive interview to capture phase implementation decisions; writes CONTEXT.md.
4
+ ---
5
+
6
+ # np:discuss-phase
7
+
8
+ Extract implementation decisions that downstream agents (researcher, planner)
9
+ need. Minimum Phase-5 scope: adaptive askUser()-based interview covering the
10
+ nine context areas and a single CONTEXT.md render.
11
+
12
+ The `--assumptions` flag routes to `workflows/discuss-phase-assumptions.md`
13
+ (lighter-weight codebase-first mode). The `--power` flag is owned by Plan
14
+ 05-08 and is not implemented here.
15
+
16
+ **Scope note (Phase 5):** No advisor subagent spawn, no `--batch`, no
17
+ `--analyze`, no `--chain` auto-advance. Those are deferred; this
18
+ workflow delivers PLAN-01 and nothing beyond it.
19
+
20
+ ## Initialize
21
+
22
+ ```bash
23
+ INIT=$(node np-tools.cjs init discuss-phase "$PHASE")
24
+ if [[ "$INIT" == @file:* ]]; then INIT=$(cat "${INIT#@file:}"); fi
25
+ ```
26
+
27
+ Parse JSON for: `phase_number`, `padded`, `phase_dir`, `phase_name`,
28
+ `phase_slug`, `has_context`, `goal`, `requirements`, `agent_skills`, `mode`.
29
+
30
+ If the user passed `--assumptions`, route to
31
+ `workflows/discuss-phase-assumptions.md` and exit this workflow.
32
+
33
+ ## Purpose
34
+
35
+ <purpose>
36
+ Extract implementation decisions that downstream agents need. Analyze the
37
+ phase to identify gray areas, let the user choose what to discuss, then
38
+ deep-dive each selected area until satisfied.
39
+
40
+ You are a thinking partner, not an interviewer. The user is the visionary —
41
+ you are the builder. Your job is to capture decisions that will guide
42
+ research and planning, not to figure out implementation yourself.
43
+ </purpose>
44
+
45
+ ## Downstream Awareness
46
+
47
+ <downstream_awareness>
48
+ **CONTEXT.md feeds into:**
49
+
50
+ 1. **researcher** — Reads CONTEXT.md to know WHAT to research
51
+ - "User wants card-based layout" → researcher investigates card component patterns
52
+ - "Infinite scroll decided" → researcher looks into virtualization libraries
53
+
54
+ 2. **planner** — Reads CONTEXT.md to know WHAT decisions are locked
55
+ - "Pull-to-refresh on mobile" → planner includes that in task specs
56
+ - "Claude's Discretion: loading skeleton" → planner can decide approach
57
+
58
+ **Your job:** Capture decisions clearly enough that downstream agents can act
59
+ on them without asking the user again.
60
+
61
+ **Not your job:** Figure out HOW to implement. That's what research and
62
+ planning do with the decisions you capture.
63
+ </downstream_awareness>
64
+
65
+ ## Philosophy
66
+
67
+ <philosophy>
68
+ **User = founder/visionary. Claude = builder.**
69
+
70
+ The user knows:
71
+ - How they imagine it working
72
+ - What it should look/feel like
73
+ - What's essential vs nice-to-have
74
+ - Specific behaviors or references they have in mind
75
+
76
+ The user doesn't know (and shouldn't be asked):
77
+ - Codebase patterns (researcher reads the code)
78
+ - Technical risks (researcher identifies these)
79
+ - Implementation approach (planner figures this out)
80
+ - Success metrics (inferred from the work)
81
+
82
+ Ask about vision and implementation choices. Capture decisions for downstream
83
+ agents.
84
+ </philosophy>
85
+
86
+ ## Scope Guardrail
87
+
88
+ <scope_guardrail>
89
+ **CRITICAL: No scope creep.**
90
+
91
+ The phase boundary comes from ROADMAP.md and is FIXED. Discussion clarifies
92
+ HOW to implement what's scoped, never WHETHER to add new capabilities.
93
+
94
+ **Allowed (clarifying ambiguity):**
95
+ - "How should posts be displayed?" (layout, density, info shown)
96
+ - "What happens on empty state?" (within the feature)
97
+ - "Pull to refresh or manual?" (behavior choice)
98
+
99
+ **Not allowed (scope creep):**
100
+ - "Should we also add comments?" (new capability)
101
+ - "What about search/filtering?" (new capability)
102
+ - "Maybe include bookmarking?" (new capability)
103
+
104
+ **The heuristic:** Does this clarify how we implement what's already in the
105
+ phase, or does it add a new capability that could be its own phase?
106
+
107
+ **When user suggests scope creep:**
108
+ ```
109
+ "[Feature X] would be a new capability — that's its own phase.
110
+ Want me to note it for the roadmap backlog?
111
+
112
+ For now, let's focus on [phase domain]."
113
+ ```
114
+
115
+ Capture the idea in a "Deferred Ideas" section. Don't lose it, don't act on it.
116
+ </scope_guardrail>
117
+
118
+ ## Answer Validation
119
+
120
+ <answer_validation>
121
+ **IMPORTANT: Answer validation** — After every interactive prompt, check if the
122
+ response is empty or whitespace-only. If so:
123
+ 1. Retry the question once with the same parameters
124
+ 2. If still empty, present the options as a plain-text numbered list and ask
125
+ the user to type their choice number
126
+ Never proceed with an empty answer.
127
+
128
+ **Text mode (`workflow.text_mode: true` in config or `--text` flag):**
129
+ When text mode is active, **do not use `np-tools.cjs askuser` at all**.
130
+ Instead, present every question as a plain-text numbered list and ask the
131
+ user to type their choice number. This is required for Claude Code remote
132
+ sessions (`/rc` mode) where the Claude App cannot forward TUI menu selections
133
+ back to the host.
134
+
135
+ Enable text mode:
136
+ - Per-session: pass `--text` flag
137
+ - Per-project: `np-tools.cjs config-set workflow.text_mode true`
138
+
139
+ Text mode applies to ALL workflows in the session, not just discuss-phase.
140
+ </answer_validation>
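The retry-then-fallback rule above can be sketched as a small helper. This is only an illustration: `askWithValidation` and the injected `ask` callback are hypothetical names, not part of nubos-pilot, and `ask` stands in for whatever actually issues one prompt.

```javascript
// Hypothetical helper: `ask` stands in for whatever issues a single prompt
// (for example a wrapper around `np-tools.cjs askuser`); it is not a real API.
function askWithValidation(ask, question) {
  const isBlank = (s) => typeof s !== "string" || s.trim() === "";

  // First attempt, then exactly one retry with the same parameters.
  for (let attempt = 0; attempt < 2; attempt++) {
    const answer = ask(question);
    if (!isBlank(answer)) return answer;
  }

  // Fallback: plain-text numbered list; the user types a choice number.
  const options = question.options || [];
  const menu = options.map((opt, i) => `${i + 1}. ${opt}`).join("\n");
  const typed = ask({
    type: "input",
    prompt: `${question.prompt}\n${menu}\nType a number:`,
  });
  const index = Number.parseInt(typed, 10) - 1;
  if (Number.isNaN(index) || index < 0 || index >= options.length) {
    // Never proceed with an empty or unusable answer.
    throw new Error("No valid answer received; refusing to proceed");
  }
  return options[index];
}
```

Passing the prompt function in as a parameter keeps the sketch independent of how the real `askuser` command is invoked.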
141
+
142
+ ## Process
143
+
144
+ ### Step 1: Guard against existing CONTEXT.md
145
+
146
+ If `has_context` is `true`, ask the user how to proceed:
147
+
148
+ ```bash
149
+ node np-tools.cjs askuser --json '{
150
+ "type": "select",
151
+ "prompt": "Phase '"$PHASE"' already has a CONTEXT.md. What do you want to do?",
152
+ "options": [
153
+ "Overwrite existing CONTEXT.md",
154
+ "Append update section",
155
+ "Abort"
156
+ ]
157
+ }'
158
+ ```
159
+
160
+ - **Overwrite** → preserve the prior file as `{padded}-CONTEXT.archive.md`
161
+ before writing the new one (`$PHASE_DIR` and `$PADDED` come from INIT; see Step 6):
162
+ ```bash
163
+ mv "$PHASE_DIR/$PADDED-CONTEXT.md" "$PHASE_DIR/$PADDED-CONTEXT.archive.md"
164
+ ```
165
+ - **Append update section** → skip the archive move; the write step below
166
+ appends a fresh `## Update — <date>` section instead of replacing content.
167
+ - **Abort** → exit the workflow. No file changes.
168
+
169
+ If `has_context` is `false`, continue directly to Step 2.
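As a rough sketch, the three Step 1 outcomes can be modelled as a pure mapping from the selected option to a write plan. The function name `planContextWrite` and the returned object shape are assumptions made for illustration; the workflow itself only specifies the `mv`, append, and abort behaviours.

```javascript
const path = require("node:path");

// Hypothetical mapping from the Step 1 answer to a write plan.
// POSIX joins are used so the result is the same on every platform.
function planContextWrite(choice, phaseDir, padded) {
  const target = path.posix.join(phaseDir, `${padded}-CONTEXT.md`);
  if (choice.startsWith("Overwrite")) {
    // Archive the prior file first, then write a fresh CONTEXT.md.
    const archiveTo = path.posix.join(phaseDir, `${padded}-CONTEXT.archive.md`);
    return { mode: "replace", target, archiveTo };
  }
  if (choice.startsWith("Append")) {
    // No archive move; the write step adds a dated update section.
    return { mode: "append", target, archiveTo: null };
  }
  // Abort (or anything unrecognised): touch no files.
  return { mode: "abort", target: null, archiveTo: null };
}
```

Treating any unrecognised answer as an abort is a deliberately conservative choice in this sketch, since it avoids a silent overwrite.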
170
+
171
+ ### Step 2: Confirm phase goal
172
+
173
+ Read `goal` and `requirements` from INIT. Confirm the phase goal is what the
174
+ user expects (users sometimes discover the roadmap goal is stale before
175
+ discussion starts):
176
+
177
+ ```bash
178
+ node np-tools.cjs askuser --json '{
179
+ "type": "confirm",
180
+ "prompt": "ROADMAP goal for phase '"$PHASE"': \"'"$GOAL"'\". Still accurate?",
181
+ "default": true
182
+ }'
183
+ ```
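One caveat with splicing `$GOAL` into a hand-written JSON string is that a goal containing double quotes or backslashes would produce invalid JSON. A minimal sketch of a safer construction, assuming the payload fields shown above (`type`, `prompt`, `default`) and a hypothetical helper name `buildConfirmPayload`:

```javascript
// Sketch: let JSON.stringify handle escaping instead of splicing shell
// variables into a JSON literal. `buildConfirmPayload` is a hypothetical name.
function buildConfirmPayload(phase, goal) {
  return JSON.stringify({
    type: "confirm",
    prompt: `ROADMAP goal for phase ${phase}: "${goal}". Still accurate?`,
    default: true,
  });
}

// Possible usage, assuming np-tools.cjs accepts the same --json argument:
//   const { execFileSync } = require("node:child_process");
//   execFileSync("node", ["np-tools.cjs", "askuser", "--json",
//     buildConfirmPayload("5", goal)], { stdio: "inherit" });
```

Passing the payload as a single argv element through `execFileSync` also sidesteps shell quoting entirely.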
184
+
185
+ If the user says `no`, capture the refined goal with a free-text input call
186
+ and record it for the `<domain>` section of CONTEXT.md:
187
+
188
+ ```bash
189
+ node np-tools.cjs askuser --json '{
190
+ "type": "input",
191
+ "prompt": "Refined goal for phase '"$PHASE"':"
192
+ }'
193
+ ```
194
+
195
+ ### Step 3: Present phase-specific gray areas
196
+
197
+ Based on the phase goal + domain, generate 3–4 concrete gray areas (not
198
+ generic UI/UX labels — specific decisions like "Session handling", "Error
199
+ responses", "Multi-device policy"). Present them via a multi-select:
200
+
201
+ ```bash
202
+ node np-tools.cjs askuser --json '{
203
+ "type": "multiselect",
204
+ "prompt": "Which areas do you want to discuss for '"$PHASE_NAME"'?",
205
+ "options": [
206
+ "<area 1>",
207
+ "<area 2>",
208
+ "<area 3>",
209
+ "<area 4>"
210
+ ]
211
+ }'
212
+ ```
213
+
214
+ Per the scope-guardrail block above: options must clarify HOW to build what
215
+ is in scope — never introduce new capabilities.
216
+
217
+ ### Step 4: Discuss each selected area
218
+
219
+ For each selected area, ask 2–4 focused questions. Every prompt routes
220
+ through `np-tools.cjs askuser` — never through the runtime-native structured
221
+ question tool directly (SC-5 enforcement from Phase 3).
222
+
223
+ Per area, the recommended flow is:
224
+
225
+ ```bash
226
+ # Decision question (typed as select when options exist)
227
+ node np-tools.cjs askuser --json '{
228
+ "type": "select",
229
+ "prompt": "For <area>: <specific decision>?",
230
+ "options": ["<choice A>", "<choice B>", "<choice C>"]
231
+ }'
232
+
233
+ # Follow-up free-text capture when the user picks "Other" or needs nuance
234
+ node np-tools.cjs askuser --json '{
235
+ "type": "input",
236
+ "prompt": "Anything specific about <area> downstream agents must know?"
237
+ }'
238
+
239
+ # Continuation gate
240
+ node np-tools.cjs askuser --json '{
241
+ "type": "select",
242
+ "prompt": "More questions about <area>, or move on?",
243
+ "options": ["More questions", "Next area"]
244
+ }'
245
+ ```
246
+
247
+ After all selected areas are covered:
248
+
249
+ ```bash
250
+ node np-tools.cjs askuser --json '{
251
+ "type": "select",
252
+ "prompt": "We have discussed <areas>. Anything else before we write CONTEXT.md?",
253
+ "options": ["Explore more gray areas", "I am ready for CONTEXT.md"]
254
+ }'
255
+ ```
256
+
257
+ If the user chooses to explore more, loop back to Step 3 with 2–4 fresh
258
+ candidate areas. Otherwise proceed to Step 5.
259
+
260
+ **Canonical ref accumulation.** When the user references a doc/ADR/spec
261
+ during any answer ("read adr-014", "per browse-spec.md"), read it and add
262
+ its full relative path to the canonical-refs accumulator — these are the
263
+ most important refs because they come straight from the user.
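The canonical-refs accumulator is not specified beyond "add its full relative path", so the following is only one possible shape: an insertion-ordered, de-duplicating collector. `createRefAccumulator` and its methods are hypothetical names.

```javascript
// Hypothetical accumulator: keeps refs in the order they were cited
// and silently drops duplicates.
function createRefAccumulator() {
  const refs = new Set();
  return {
    // Record one path mentioned by the user during the discussion.
    add(relativePath) {
      const cleaned = String(relativePath).trim();
      if (cleaned !== "") refs.add(cleaned);
    },
    // Merge a comma-separated free-text answer; "none" adds nothing.
    addFromAnswer(answer) {
      if (answer.trim().toLowerCase() === "none") return;
      answer.split(",").forEach((p) => this.add(p));
    },
    toList() {
      return [...refs];
    },
  };
}
```

Because a `Set` preserves insertion order, refs the user cited mid-discussion stay ahead of anything merged in later from the Step 5 prompt.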
264
+
265
+ ### Step 5: Capture remaining CONTEXT.md sections
266
+
267
+ Collect short free-text inputs for the remaining required sections before
268
+ rendering:
269
+
270
+ ```bash
271
+ node np-tools.cjs askuser --json '{
272
+ "type": "input",
273
+ "prompt": "Canonical refs (paths to ADRs/specs/docs downstream agents must read) — comma separated or \"none\":"
274
+ }'
275
+ ```
276
+
277
+ ```bash
278
+ node np-tools.cjs askuser --json '{
279
+ "type": "input",
280
+ "prompt": "Reusable code / existing assets relevant to this phase — or \"none\":"
281
+ }'
282
+ ```
283
+
284
+ ```bash
285
+ node np-tools.cjs askuser --json '{
286
+ "type": "input",
287
+ "prompt": "Specific references (\"I want it like X\" moments) — or \"none\":"
288
+ }'
289
+ ```
290
+
291
+ ```bash
292
+ node np-tools.cjs askuser --json '{
293
+ "type": "input",
294
+ "prompt": "Deferred ideas (things we noted but belong in later phases) — or \"none\":"
295
+ }'
296
+ ```
297
+
298
+ ```bash
299
+ node np-tools.cjs askuser --json '{
300
+ "type": "input",
301
+ "prompt": "Claude\u2019s Discretion — areas where you want Claude to decide without asking:"
302
+ }'
303
+ ```
304
+
305
+ ### Step 6: Render CONTEXT.md
306
+
307
+ Render `templates/CONTEXT.md` with `lib/template.cjs`. The render call is
308
+ fail-loud on unknown placeholders, so the variables object below must match
309
+ the template's `{{var}}` keys exactly.
310
+
311
+ ```bash
312
+ PHASE_DIR=$(echo "$INIT" | node -e 'let d="";process.stdin.on("data",c=>d+=c).on("end",()=>{console.log(JSON.parse(d).phase_dir)})')
313
+ PADDED=$(echo "$INIT" | node -e 'let d="";process.stdin.on("data",c=>d+=c).on("end",()=>{console.log(JSON.parse(d).padded)})')
314
+ mkdir -p "$PHASE_DIR"
315
+
316
+ node -e '
317
+ const { render } = require("./lib/template.cjs");
318
+ const fs = require("node:fs");
319
+ const tpl = fs.readFileSync("templates/CONTEXT.md", "utf-8");
320
+ const vars = JSON.parse(process.argv[1]);
321
+ process.stdout.write(render(tpl, vars));
322
+ ' "$VARS_JSON" > "$PHASE_DIR/$PADDED-CONTEXT.md"
323
+ ```
324
+
325
+ `$VARS_JSON` is the JSON-serialised accumulator from Steps 2–5:
326
+
327
+ ```jsonc
328
+ {
329
+ "phase_number": "5",
330
+ "phase_name": "...",
331
+ "goal": "...",
332
+ "domain": "...",
333
+ "decisions": "...", // collected from Step 4
334
+ "canonical_refs": "...",
335
+ "code_context": "...",
336
+ "specifics": "...",
337
+ "deferred": "...",
338
+ "date": "2026-04-15"
339
+ }
340
+ ```
341
+
342
+ If `templates/CONTEXT.md` lacks a key, `render()` throws
343
+ `NubosPilotError('template-missing-key', …)` — the workflow must not swallow
344
+ that error. Fix the template or the accumulator, don't mask the failure.
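To make the fail-loud behaviour concrete, here is a minimal sketch of a strict `{{var}}` renderer. It is a stand-in written for illustration, not the code in `lib/template.cjs`: the real `render()` and `NubosPilotError` are not shown in this diff, so a plain `Error` with a `code` property is used instead.

```javascript
// Hypothetical stand-in for render() in lib/template.cjs.
// Throws as soon as the template references a key that vars does not provide.
function renderStrict(template, vars) {
  return template.replace(/\{\{\s*([A-Za-z0-9_]+)\s*\}\}/g, (match, key) => {
    if (!Object.prototype.hasOwnProperty.call(vars, key)) {
      const err = new Error(`template-missing-key: ${key}`);
      err.code = "template-missing-key";
      throw err;
    }
    return String(vars[key]);
  });
}
```

Throwing inside the replacer aborts the whole render, so a partially filled CONTEXT.md is never produced by this sketch.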
345
+
346
+ ### Step 7: Commit respecting config.commit_docs
347
+
348
+ ```bash
349
+ COMMIT_DOCS=$(node np-tools.cjs config-get workflow.commit_docs 2>/dev/null || echo "true")
350
+ if [[ "$COMMIT_DOCS" == "true" ]]; then
351
+ git add "$PHASE_DIR/$PADDED-CONTEXT.md"
352
+ git commit -m "docs($PADDED): capture phase context"
353
+ fi
354
+ ```
355
+
356
+ If `workflow.commit_docs` is false, leave the file uncommitted — the user is
357
+ opting into manual commit gating.
358
+
359
+ ### Step 8: Confirm and next steps
360
+
361
+ ```bash
362
+ node np-tools.cjs askuser --json '{
363
+ "type": "confirm",
364
+ "prompt": "CONTEXT.md written at '"$PHASE_DIR"'/'"$PADDED"'-CONTEXT.md. Run np:plan-phase '"$PHASE"' now?",
365
+ "default": true
366
+ }'
367
+ ```
368
+
369
+ Yes → invoke `np:plan-phase $PHASE` via the runtime's standard workflow
370
+ dispatcher. No → print the manual next-step hint:
371
+
372
+ ```
373
+ Next: /np:plan-phase $PHASE
374
+ ```
375
+
376
+ ## Success Criteria
377
+
378
+ - `{phase_dir}/{padded}-CONTEXT.md` exists with all six required sections
379
+ (domain, decisions, canonical_refs, code_context, specifics, deferred).
380
+ - Every interactive prompt went through `np-tools.cjs askuser`; zero direct
381
+ calls to the runtime-native structured question tool.
382
+ - If prior CONTEXT.md existed, user explicitly chose overwrite / append /
383
+ abort — no silent overwrite.
384
+ - Deferred ideas preserved verbatim for future phases.
385
+ - Commit (if `workflow.commit_docs=true`) landed via
386
+ `docs(PADDED): capture phase context`.
@@ -0,0 +1,9 @@
1
+ # np:dispatch
2
+
3
+ State-router for the current phase. Reads state → determines next action
4
+ (discuss / plan / execute / verify) → delegates via `Skill()` call.
5
+ Pass `--force` or `--action=<name>` to override; `--action` takes precedence over the recommended action.
6
+
7
+ ```bash
8
+ node np-tools.cjs dispatch "$@"
9
+ ```
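The precedence of `--action` over the state-derived recommendation can be sketched as follows. This is an assumption-laden illustration: `chooseAction` is a hypothetical name, the behaviour of `--force` is not described in this diff and is left out, and the real logic lives in `bin/np-tools/dispatch.cjs`.

```javascript
const ACTIONS = ["discuss", "plan", "execute", "verify"];

// Hypothetical: an explicit --action=<name> overrides the recommendation.
function chooseAction(argv, recommended) {
  const flag = argv.find((arg) => arg.startsWith("--action="));
  const chosen = flag ? flag.slice("--action=".length) : recommended;
  if (!ACTIONS.includes(chosen)) {
    throw new Error(`unknown action: ${chosen}`);
  }
  return chosen;
}
```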
@@ -0,0 +1,10 @@
1
+ # np:doctor
2
+
3
+ Run a 5-check integrity scan of the nubos-pilot install (manifest integrity,
4
+ version mismatch, missing hooks, trapped Codex `[features]`, broken askUser).
5
+ Use `--fix` to apply auto-safe fixes; anything touching user files outside the
6
+ manifest will prompt via `askUser()` (SC-5).
7
+
8
+ ```bash
9
+ node np-tools.cjs doctor "$@"
10
+ ```
@@ -0,0 +1,243 @@
1
+ ---
2
+ command: np:eval-review
3
+ description: Retroactive evaluation-coverage audit of a completed AI phase. Spawns np-eval-auditor to score each planned eval dimension as COVERED/PARTIAL/MISSING against AI-SPEC.md (if present) or general best-practice rubric. Produces EVAL-REVIEW.md.
4
+ ---
5
+
6
+ # np:eval-review
7
+
8
+ Produces `{phase_dir}/{padded}-EVAL-REVIEW.md` via a single `np-eval-auditor`
9
+ spawn that audits the phase's implemented AI system against its
10
+ evaluation plan. Runs AFTER `/np:execute-phase` has landed code — the
11
+ audit needs a SUMMARY.md to know what was built.
12
+
13
+ Three states (resolved by the init payload, not by this workflow):
14
+
15
+ - **State A — spec-conformance audit.** `AI-SPEC.md` and `SUMMARY.md`
16
+ both present. The auditor scores the implementation against the
17
+ planned eval dimensions, rubrics, guardrails, and monitoring plan.
18
+ - **State B — retroactive general audit.** `SUMMARY.md` present but no
19
+ `AI-SPEC.md`. The auditor scores against the generic best-practice
20
+ checklist. The output file header labels the mode explicitly
21
+ (Pitfall 10 parallel — avoids silent drift between spec-backed and
22
+ spec-less reviews).
23
+ - **State C — abort.** No `SUMMARY.md`. The workflow exits with a
24
+ clear message before spawning the auditor — there is nothing to
25
+ audit until the phase has been executed.
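The three states reduce to a two-flag decision. The sketch below restates that rule only; the function name is hypothetical, and the real resolution happens in the init payload rather than in this workflow.

```javascript
// Sketch of the A/B/C rule described above.
function resolveEvalReviewState({ summaryPresent, hasAiSpec }) {
  if (!summaryPresent) return "C"; // nothing executed yet: abort
  return hasAiSpec ? "A" : "B"; // spec-conformance vs retroactive general
}
```

Note that `hasAiSpec` is irrelevant when no SUMMARY.md exists: State C wins regardless.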
26
+
27
+ The single Task-spawn site is wrapped in the Plan 09-05 metrics +
28
+ resolve-model pattern (D-06, D-01). `RUNTIME` is detected once at the
29
+ top of the bash block and re-used by the `metrics record` call.
30
+
31
+ ## Initialize
32
+
33
+ ```bash
34
+ PHASE="$1"
35
+ if [[ -z "$PHASE" ]]; then
36
+ echo "Usage: /np:eval-review <phase-number>" >&2
37
+ exit 2
38
+ fi
39
+
40
+ INIT=$(node np-tools.cjs init eval-review "$PHASE")
41
+ if [[ "$INIT" == @file:* ]]; then INIT=$(cat "${INIT#@file:}"); fi
42
+ RUNTIME=$(node -e "console.log(require('./lib/runtime/index.cjs').detect().runtime)")
43
+ ```
44
+
45
+ Parse JSON for: `phase`, `padded`, `phase_dir`, `eval_review_path`,
46
+ `summary_present`, `summary_path`, `ai_spec_path`, `has_ai_spec`,
47
+ `state`, `agents.eval_auditor`.
48
+
49
+ ```bash
50
+ PADDED=$(echo "$INIT" | jq -r '.padded')
51
+ PHASE_DIR=$(echo "$INIT" | jq -r '.phase_dir')
52
+ EVAL_REVIEW_PATH=$(echo "$INIT" | jq -r '.eval_review_path')
53
+ SUMMARY_PRESENT=$(echo "$INIT" | jq -r '.summary_present')
54
+ SUMMARY_PATH=$(echo "$INIT" | jq -r '.summary_path')
55
+ AI_SPEC_PATH=$(echo "$INIT" | jq -r '.ai_spec_path')
56
+ HAS_AI_SPEC=$(echo "$INIT" | jq -r '.has_ai_spec')
57
+ STATE=$(echo "$INIT" | jq -r '.state')
58
+ PLAN_ID="${PADDED}-eval-review"
59
+ TASK_ID="${PADDED}-eval-review"
60
+ ```
+
+ ## Pre-Flight Gates
+
+ <pre_flight>
+
+ ### Gate 1 — State C aborts before any spawn
+
+ State C means no SUMMARY.md, so the phase has not been executed and
+ there is nothing to audit. Exit with a clear message before any agent
+ is spawned or any metrics record is written.
+
+ ```bash
+ if [[ "$STATE" == "C" ]]; then
+   echo "Error: Phase $PHASE has no SUMMARY.md at $SUMMARY_PATH." >&2
+   echo "The phase must be executed (/np:execute-phase) before its evals can be audited." >&2
+   exit 1
+ fi
+
+ if [[ "$SUMMARY_PRESENT" != "true" ]]; then
+   echo "Error: summary_present=false for phase $PHASE; expected state=C but got $STATE." >&2
+   exit 1
+ fi
+ ```
+
+ ### Gate 2 — EVAL-REVIEW.md already exists
+
+ If a prior review is present, let the user choose between re-running,
+ viewing the current review, or skipping.
+
+ ```bash
+ if [[ -f "$EVAL_REVIEW_PATH" ]]; then
+   CHOICE=$(node np-tools.cjs askuser --json '{
+     "type": "select",
+     "header": "Existing EVAL-REVIEW",
+     "question": "EVAL-REVIEW.md already exists for Phase '"$PHASE"'. What would you like to do?",
+     "options": [
+       {"label": "Re-run — replace the current review", "description": "Re-runs np-eval-auditor and overwrites the existing file."},
+       {"label": "View — display current review and exit", "description": "Reads the file and exits without changes."},
+       {"label": "Skip — keep current review and exit", "description": "Leaves the file untouched."}
+     ]
+   }')
+   case "$CHOICE" in
+     "View"*) cat "$EVAL_REVIEW_PATH"; exit 0 ;;
+     "Skip"*) exit 0 ;;
+   esac
+ fi
+ ```
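The `case` above matches on label prefixes and lets every other value, including an empty one, fall through to a re-run. A stricter dispatch could name all three outcomes and reject anything else. The sketch below assumes, as the original block implies, that `askuser` prints the selected label on stdout; `gate2_action` is a hypothetical name:

```bash
# gate2_action: hypothetical helper, not part of np-tools.
# Maps the label printed by the prompt to one of: rerun, view, skip.
# Returns non-zero for an empty or unrecognised answer instead of
# silently treating it as a re-run.
gate2_action() {
  case "$1" in
    "Re-run"*) echo "rerun" ;;
    "View"*)   echo "view" ;;
    "Skip"*)   echo "skip" ;;
    *)
      echo "unrecognised choice: '$1'" >&2
      return 1
      ;;
  esac
}
```

With this in place, a cancelled prompt that returns nothing would stop the workflow rather than overwrite an existing review.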
+
+ ### Gate 3 — Label audit mode from state
+
+ ```bash
+ case "$STATE" in
+   "A") AUDIT_MODE="spec-conformance" ;;
+   "B") AUDIT_MODE="retroactive-general" ;;
+   *)
+     echo "Error: unexpected state '$STATE' from init payload (expected A or B after Gate 1)." >&2
+     exit 1
+     ;;
+ esac
+ ```
+
+ </pre_flight>
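Gate 3 derives the mode from `state` alone, while the payload also carries `has_ai_spec`. A consistency check between the two might be useful in a test for the init CLI. The sketch assumes state A always coincides with `has_ai_spec=true` and state B with `false`, which the workflow text implies but does not state outright. `mode_matches_spec` is a hypothetical name:

```bash
# mode_matches_spec: hypothetical check, not part of np-tools.
# Prints the audit mode when STATE and HAS_AI_SPEC agree.
# Returns 1 on a mismatch and 2 when no audit mode applies.
mode_matches_spec() {
  local state="$1" has_ai_spec="$2" mode
  case "$state" in
    A) mode="spec-conformance" ;;
    B) mode="retroactive-general" ;;
    *) return 2 ;;                      # C or anything else
  esac
  case "$mode:$has_ai_spec" in
    spec-conformance:true|retroactive-general:false)
      echo "$mode"
      ;;
    *)
      return 1                          # state and has_ai_spec disagree
      ;;
  esac
}
```

This does not re-derive `state`; it only cross-checks two fields that the init payload already provides.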
+
+ ## Philosophy
+
+ <philosophy>
+ Eval plans decay the moment the first commit lands. Planned rubrics
+ lose their binding to code, guardrails get stubbed "for now", tracing
+ is wired but never turned on, and the reference dataset never leaves
+ the design doc. A retroactive eval-coverage audit catches all of that
+ in one pass and emits a ranked list of gaps with concrete remediation
+ steps. When an AI-SPEC.md exists, the audit is a conformance check
+ against planned dimensions. When it does not, the audit is a
+ best-practice sweep — and the mode label on EVAL-REVIEW.md makes that
+ difference explicit so reviewers never treat a general audit as if it
+ had SPEC backing.
+ </philosophy>
+
+ ## Main Flow
+
+ Single serial spawn — the auditor is self-contained (codebase scan,
+ dimension scoring, infrastructure audit, report writing all happen
+ inside `np-eval-auditor`).
+
+ ### Step 1 — Eval auditor (np-eval-auditor, haiku)
+
+ ```bash
+ START=$(node np-tools.cjs metrics start-timestamp)
+ MODEL=$(node np-tools.cjs resolve-model np-eval-auditor --profile balanced)
+ > NOTE: Spawn agent=np-eval-auditor model=$MODEL state=$STATE mode=$AUDIT_MODE
+ > NOTE: input: phase_number=$PHASE, phase_dir=$PHASE_DIR,
+ > NOTE: summary_path=$SUMMARY_PATH, ai_spec_path=$AI_SPEC_PATH,
+ > NOTE: has_ai_spec=$HAS_AI_SPEC, audit_mode=$AUDIT_MODE,
+ > NOTE: eval_review_path=$EVAL_REVIEW_PATH
+ > NOTE: output: $EVAL_REVIEW_PATH with dimension scores
+ > NOTE: (COVERED/PARTIAL/MISSING), infrastructure scores,
+ > NOTE: overall verdict, and a mode label
+ > NOTE: ("spec-conformance" or "retroactive-general") in the
+ > NOTE: header frontmatter.
+ END=$(node np-tools.cjs metrics end-timestamp)
+ node np-tools.cjs metrics record \
+   --agent np-eval-auditor --tier haiku --resolved-model "$MODEL" \
+   --phase "$PHASE" --plan "$PLAN_ID" --task "$TASK_ID" \
+   --started "$START" --ended "$END" \
+   --tokens-in "${TOKENS_IN:-0}" --tokens-out "${TOKENS_OUT:-0}" \
+   --retry-count 0 --status ok --runtime "$RUNTIME"
+ ```
+
+ ## Validation Gate
+
+ After the auditor finishes, verify EVAL-REVIEW.md was written. If the
+ file is missing, the spawn failed silently and the user is prompted to
+ re-run or abort.
+
+ ```bash
+ if [[ ! -f "$EVAL_REVIEW_PATH" ]]; then
+   CHOICE=$(node np-tools.cjs askuser --json '{
+     "type": "select",
+     "header": "EVAL-REVIEW.md missing",
+     "question": "np-eval-auditor did not write EVAL-REVIEW.md. What would you like to do?",
+     "options": [
+       {"label": "Re-run np-eval-auditor", "description": "Spawn the auditor once more."},
+       {"label": "Abort", "description": "Exit without committing."}
+     ]
+   }')
+   case "$CHOICE" in
+     "Abort") exit 1 ;;
+   esac
+ fi
+ ```
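The block above only handles `Abort` explicitly; what happens on a re-run is left to whoever executes the workflow. One way to make that path explicit is a bounded retry that stops as soon as the expected file appears. This is a sketch under assumptions: `run_with_retry` is a hypothetical helper, and the command passed to it stands in for the auditor spawn, which the workflow does not expose as a shell command:

```bash
# run_with_retry: hypothetical helper, not part of np-tools.
# Usage: run_with_retry <expected-file> <max-attempts> <command> [args...]
# Runs the command until the expected file exists or the attempt budget
# is used up. Returns 0 once the file is present, 1 otherwise.
run_with_retry() {
  local expected_file="$1" max_attempts="$2" attempt=1
  shift 2
  while [ "$attempt" -le "$max_attempts" ]; do
    "$@" || true                        # a failed attempt still counts
    if [ -f "$expected_file" ]; then
      return 0
    fi
    attempt=$((attempt + 1))
  done
  echo "gave up after $max_attempts attempt(s): $expected_file not written" >&2
  return 1
}
```

Since the workflow expects one metrics record per Task spawn, each extra attempt would presumably need its own `metrics record` call; how `--retry-count` should be set in that case is not specified in the source.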
+
+ ## Commit
+
+ ```bash
+ git add "$EVAL_REVIEW_PATH"
+ git commit -m "docs(${PADDED}): generate EVAL-REVIEW.md (${AUDIT_MODE})"
+ ```
+
+ ## Scope Guardrail
+
+ <scope_guardrail>
+ **Do:**
+ - Run `np-eval-auditor` exactly once per invocation (single-pass audit).
+ - Emit a metrics record AFTER the Task spawn (D-06).
+ - Resolve MODEL via `np-tools.cjs resolve-model` — no hardcoded IDs.
+ - Use `np-tools.cjs askuser` for every prompt (INST-03 invariant).
+ - Honour the `state` field from the init payload: A → spec-conformance,
+ B → retroactive-general, C → abort before spawning anything.
+ - Label the audit mode explicitly in EVAL-REVIEW.md
+ (`spec-conformance` when AI-SPEC.md exists, `retroactive-general`
+ otherwise) — Pitfall 10 parallel.
+ - Abort early when SUMMARY.md is missing; retroactive audits are only
+ meaningful against executed phases.
+
+ **Don't:**
+ - Run this workflow on a phase that has not been executed — there is
+ nothing to audit until SUMMARY.md lands.
+ - Invoke host-specific prompt tools directly — always route through
+ `np-tools.cjs askuser`.
+ - Silently treat a spec-less audit as if it had SPEC backing — the
+ mode label in the output header is mandatory.
+ - Spawn any additional agent beyond `np-eval-auditor`; if a follow-up
+ remediation pass is needed, that is the planner's job, not this
+ workflow's.
+ - Call any tools binary other than `np-tools.cjs` (the sole CLI entry
+ per Plan 09-05 D-14).
+ - Reference legacy homedir payload paths — those directories do not
+ exist in nubos-pilot projects.
+ - Skip the metrics record block — the Phase-10 np:stats consumer
+ expects one record per Task spawn.
+ - Re-derive `state` inside this workflow; state detection is the init
+ CLI's responsibility (`bin/np-tools/eval-review.cjs`).
+ </scope_guardrail>
+
+ ## Output
+
+ - `{phase_dir}/{padded}-EVAL-REVIEW.md` — per-dimension scores
+ (COVERED/PARTIAL/MISSING), infrastructure scores, overall verdict,
+ remediation plan, and mode label
+ (`spec-conformance` or `retroactive-general`).
+ - 1 metrics record in `.nubos-pilot/metrics/phase-${PHASE}.jsonl`
+ for the single `np-eval-auditor` Task spawn.
+ - One git commit when EVAL-REVIEW.md is produced successfully.