nubos-pilot 0.4.0 → 0.5.0

Files changed (190)
  1. package/README.md +149 -0
  2. package/agents/np-executor.md +10 -5
  3. package/agents/np-nyquist-auditor.md +17 -17
  4. package/agents/np-plan-checker.md +39 -29
  5. package/agents/np-planner.md +83 -6
  6. package/agents/np-verifier.md +19 -15
  7. package/bin/install.js +33 -52
  8. package/bin/np-tools/_commands.cjs +23 -39
  9. package/bin/np-tools/add-tests.cjs +34 -37
  10. package/bin/np-tools/add-tests.test.cjs +34 -28
  11. package/bin/np-tools/add-todo.cjs +2 -2
  12. package/bin/np-tools/checkpoint.test.cjs +17 -17
  13. package/bin/np-tools/commit-task.cjs +14 -33
  14. package/bin/np-tools/commit-task.test.cjs +19 -19
  15. package/bin/np-tools/discuss-phase.cjs +28 -41
  16. package/bin/np-tools/discuss-phase.test.cjs +37 -53
  17. package/bin/np-tools/doctor.cjs +63 -0
  18. package/bin/np-tools/execute-milestone.cjs +225 -0
  19. package/bin/np-tools/execute-milestone.test.cjs +154 -0
  20. package/bin/np-tools/help.test.cjs +4 -6
  21. package/bin/np-tools/init-dispatch.test.cjs +27 -41
  22. package/bin/np-tools/new-milestone.cjs +121 -121
  23. package/bin/np-tools/new-milestone.test.cjs +56 -49
  24. package/bin/np-tools/new-project.cjs +97 -95
  25. package/bin/np-tools/new-project.test.cjs +49 -41
  26. package/bin/np-tools/park.cjs +4 -30
  27. package/bin/np-tools/park.test.cjs +10 -9
  28. package/bin/np-tools/pause-work.test.cjs +4 -4
  29. package/bin/np-tools/plan-milestone.cjs +381 -0
  30. package/bin/np-tools/plan-milestone.test.cjs +209 -0
  31. package/bin/np-tools/research-phase.cjs +36 -53
  32. package/bin/np-tools/research-phase.test.cjs +31 -40
  33. package/bin/np-tools/reset-slice.cjs +93 -5
  34. package/bin/np-tools/reset-slice.test.cjs +89 -37
  35. package/bin/np-tools/resume-work.test.cjs +7 -7
  36. package/bin/np-tools/skip.cjs +4 -30
  37. package/bin/np-tools/skip.test.cjs +12 -12
  38. package/bin/np-tools/slug.cjs +2 -2
  39. package/bin/np-tools/undo-task.cjs +33 -6
  40. package/bin/np-tools/undo-task.test.cjs +63 -74
  41. package/bin/np-tools/undo.cjs +55 -28
  42. package/bin/np-tools/undo.test.cjs +81 -68
  43. package/bin/np-tools/unpark.cjs +4 -30
  44. package/bin/np-tools/unpark.test.cjs +10 -9
  45. package/bin/np-tools/verify-work.cjs +67 -42
  46. package/bin/np-tools/verify-work.test.cjs +46 -30
  47. package/lib/agents.test.cjs +22 -53
  48. package/lib/checkpoint.test.cjs +35 -35
  49. package/lib/fixtures/plans/cycle/tasks/{T-01.md → T0001/T0001-PLAN.md} +4 -4
  50. package/lib/fixtures/plans/cycle/tasks/{T-02.md → T0002/T0002-PLAN.md} +4 -4
  51. package/lib/fixtures/plans/cycle/tasks/{T-03.md → T0003/T0003-PLAN.md} +4 -4
  52. package/lib/fixtures/plans/{parallel/tasks/T-01.md → linear/tasks/T0001/T0001-PLAN.md} +3 -3
  53. package/lib/fixtures/plans/linear/tasks/{T-02.md → T0002/T0002-PLAN.md} +4 -4
  54. package/lib/fixtures/plans/linear/tasks/{T-03.md → T0003/T0003-PLAN.md} +4 -4
  55. package/lib/fixtures/plans/{linear/tasks/T-01.md → parallel/tasks/T0001/T0001-PLAN.md} +3 -3
  56. package/lib/fixtures/plans/parallel/tasks/{T-02.md → T0002/T0002-PLAN.md} +4 -4
  57. package/lib/fixtures/plans/parallel/tasks/{T-03.md → T0003/T0003-PLAN.md} +4 -4
  58. package/lib/fixtures/plans/wave-conflict/tasks/{T-01.md → T0001/T0001-PLAN.md} +3 -3
  59. package/lib/fixtures/plans/wave-conflict/tasks/{T-02.md → T0002/T0002-PLAN.md} +4 -4
  60. package/lib/git.test.cjs +21 -21
  61. package/lib/layout.cjs +266 -0
  62. package/lib/layout.test.cjs +140 -0
  63. package/lib/model-profiles.cjs +4 -4
  64. package/lib/model-profiles.test.cjs +9 -6
  65. package/lib/roadmap.cjs +38 -3
  66. package/lib/tasks.cjs +26 -20
  67. package/lib/tasks.test.cjs +45 -40
  68. package/lib/verify.cjs +36 -39
  69. package/lib/verify.test.cjs +47 -46
  70. package/np-tools.cjs +22 -170
  71. package/package.json +1 -1
  72. package/templates/milestone/CONTEXT.md +28 -0
  73. package/templates/milestone/META.json +11 -0
  74. package/templates/milestone/ROADMAP.md +11 -0
  75. package/templates/slice/ASSESSMENT.md +24 -0
  76. package/templates/slice/PLAN.md +43 -0
  77. package/templates/slice/RESEARCH.md +20 -0
  78. package/templates/slice/SUMMARY.md +17 -0
  79. package/templates/slice/UAT.md +21 -0
  80. package/templates/task/PLAN.md +48 -0
  81. package/templates/task/SUMMARY.md +24 -0
  82. package/workflows/add-tests.md +1 -0
  83. package/workflows/add-todo.md +6 -5
  84. package/workflows/discuss-phase.md +44 -34
  85. package/workflows/discuss-project.md +1 -0
  86. package/workflows/doctor.md +6 -0
  87. package/workflows/execute-phase.md +89 -75
  88. package/workflows/help.md +6 -0
  89. package/workflows/new-milestone.md +30 -51
  90. package/workflows/new-project.md +12 -7
  91. package/workflows/note.md +4 -3
  92. package/workflows/park.md +1 -0
  93. package/workflows/plan-phase.md +140 -200
  94. package/workflows/research-phase.md +18 -17
  95. package/workflows/reset-slice.md +75 -27
  96. package/workflows/resume-work.md +2 -2
  97. package/workflows/scan-codebase.md +1 -0
  98. package/workflows/session-report.md +1 -0
  99. package/workflows/skip.md +1 -0
  100. package/workflows/stats.md +1 -0
  101. package/workflows/thread.md +4 -3
  102. package/workflows/undo-task.md +54 -27
  103. package/workflows/undo.md +75 -38
  104. package/workflows/unpark.md +1 -0
  105. package/workflows/update-docs.md +1 -0
  106. package/workflows/validate-phase.md +52 -103
  107. package/workflows/verify-work.md +16 -20
  108. package/agents/np-ai-researcher.md +0 -140
  109. package/agents/np-code-fixer.md +0 -381
  110. package/agents/np-code-reviewer.md +0 -352
  111. package/agents/np-domain-researcher.md +0 -136
  112. package/agents/np-eval-auditor.md +0 -167
  113. package/agents/np-eval-planner.md +0 -153
  114. package/agents/np-framework-selector.md +0 -171
  115. package/agents/np-security-auditor.md +0 -206
  116. package/agents/np-ui-auditor.md +0 -369
  117. package/agents/np-ui-checker.md +0 -192
  118. package/agents/np-ui-researcher.md +0 -324
  119. package/bin/np-tools/ai-integration-phase.cjs +0 -109
  120. package/bin/np-tools/ai-integration-phase.test.cjs +0 -123
  121. package/bin/np-tools/autonomous.cjs +0 -69
  122. package/bin/np-tools/autonomous.test.cjs +0 -74
  123. package/bin/np-tools/code-review.cjs +0 -133
  124. package/bin/np-tools/code-review.test.cjs +0 -96
  125. package/bin/np-tools/discuss-phase-power.cjs +0 -265
  126. package/bin/np-tools/discuss-phase-power.test.cjs +0 -242
  127. package/bin/np-tools/dispatch.cjs +0 -116
  128. package/bin/np-tools/eval-review.cjs +0 -116
  129. package/bin/np-tools/eval-review.test.cjs +0 -123
  130. package/bin/np-tools/execute-phase.cjs +0 -182
  131. package/bin/np-tools/execute-phase.test.cjs +0 -116
  132. package/bin/np-tools/execute-plan.cjs +0 -124
  133. package/bin/np-tools/execute-plan.test.cjs +0 -82
  134. package/bin/np-tools/next.cjs +0 -7
  135. package/bin/np-tools/next.test.cjs +0 -30
  136. package/bin/np-tools/phase.cjs +0 -71
  137. package/bin/np-tools/phase.test.cjs +0 -81
  138. package/bin/np-tools/plan-diff.cjs +0 -57
  139. package/bin/np-tools/plan-diff.test.cjs +0 -134
  140. package/bin/np-tools/plan-milestone-gaps.cjs +0 -115
  141. package/bin/np-tools/plan-milestone-gaps.test.cjs +0 -122
  142. package/bin/np-tools/plan-phase.cjs +0 -350
  143. package/bin/np-tools/plan-phase.test.cjs +0 -263
  144. package/bin/np-tools/progress.cjs +0 -7
  145. package/bin/np-tools/progress.test.cjs +0 -44
  146. package/bin/np-tools/queue.cjs +0 -213
  147. package/bin/np-tools/triage.cjs +0 -128
  148. package/bin/np-tools/ui-phase.cjs +0 -108
  149. package/bin/np-tools/ui-phase.test.cjs +0 -121
  150. package/bin/np-tools/ui-review.cjs +0 -108
  151. package/bin/np-tools/ui-review.test.cjs +0 -120
  152. package/lib/gaps.cjs +0 -197
  153. package/lib/gaps.test.cjs +0 -200
  154. package/lib/next.cjs +0 -236
  155. package/lib/next.test.cjs +0 -194
  156. package/lib/phase.cjs +0 -95
  157. package/lib/phase.test.cjs +0 -189
  158. package/lib/plan-diff.cjs +0 -173
  159. package/lib/plan-diff.test.cjs +0 -217
  160. package/lib/plan.cjs +0 -85
  161. package/lib/plan.test.cjs +0 -263
  162. package/lib/progress.cjs +0 -95
  163. package/lib/progress.test.cjs +0 -116
  164. package/lib/undo.cjs +0 -179
  165. package/lib/undo.test.cjs +0 -261
  166. package/templates/AI-SPEC.md +0 -90
  167. package/templates/CONTEXT.md +0 -32
  168. package/templates/PLAN.md +0 -69
  169. package/templates/SECURITY.md +0 -61
  170. package/templates/UI-SPEC.md +0 -64
  171. package/workflows/add-backlog.md +0 -212
  172. package/workflows/ai-integration-phase.md +0 -230
  173. package/workflows/autonomous.md +0 -94
  174. package/workflows/cleanup.md +0 -325
  175. package/workflows/code-review-fix.md +0 -435
  176. package/workflows/code-review.md +0 -447
  177. package/workflows/discuss-phase-assumptions.md +0 -269
  178. package/workflows/discuss-phase-power.md +0 -139
  179. package/workflows/dispatch.md +0 -9
  180. package/workflows/eval-review.md +0 -243
  181. package/workflows/execute-plan.md +0 -82
  182. package/workflows/next.md +0 -8
  183. package/workflows/plan-milestone-gaps.md +0 -233
  184. package/workflows/progress.md +0 -8
  185. package/workflows/queue.md +0 -9
  186. package/workflows/review.md +0 -489
  187. package/workflows/secure-phase.md +0 -209
  188. package/workflows/triage.md +0 -9
  189. package/workflows/ui-phase.md +0 -246
  190. package/workflows/ui-review.md +0 -222
package/README.md ADDED
@@ -0,0 +1,149 @@
1
+ # nubos-pilot
2
+
3
+ AI-driven planning and execution tool for code projects. Installs into Claude Code, Codex, Gemini, OpenCode, Cursor and 10+ other host CLIs as a set of Markdown workflows + subagents.
4
+
5
+ - **No daemon.** Every command runs as a short-lived `node` invocation.
6
+ - **Markdown-first.** Workflows and agents are plain `.md` files — the host reads them directly.
7
+ - **Atomic per-task commits.** One `task(M<NNN>-S<NNN>-T<NNNN>): …` commit per unit of work. `/np:undo-task` and `/np:undo` are mechanical reverts.
8
+ - **Multi-runtime.** One source tree, one install payload, four first-class host CLIs.
9
+
10
+ ## Install
11
+
12
+ ```bash
13
+ cd your-project/
14
+ npx nubos-pilot install --agent claude # or: codex | gemini | opencode | cursor | …
15
+ ```
16
+
17
+ This writes a self-contained payload under `.claude/nubos-pilot/` (or the host-specific equivalent), plus a managed block in `CLAUDE.md` / `AGENTS.md` / `GEMINI.md`. Uninstall with `npx nubos-pilot uninstall`.
18
+
19
+ ## Project layout
20
+
21
+ Every nubos-pilot project lives under `.nubos-pilot/`:
22
+
23
+ ```
24
+ .nubos-pilot/
25
+ PROJECT.md # product truth (filled by /np:discuss-project)
26
+ REQUIREMENTS.md # requirement register
27
+ roadmap.yaml # schema_version: 2
28
+ STATE.md # cursor: current milestone + current task
29
+ milestones/
30
+ M001/
31
+ M001-CONTEXT.md # locked user decisions from /np:discuss-phase
32
+ M001-ROADMAP.md # slice list, execution order
33
+ M001-META.json
34
+ slices/
35
+ S001/
36
+ S001-ASSESSMENT.md
37
+ S001-PLAN.md # planner output: contains <task> blocks inline
38
+ S001-RESEARCH.md # optional, from /np:research-phase
39
+ S001-SUMMARY.md
40
+ S001-UAT.md # acceptance criteria
41
+ tasks/
42
+ T0001/
43
+ T0001-PLAN.md # scaffolded from <task> blocks
44
+ T0001-SUMMARY.md # executor fills after commit
45
+ T0002/...
46
+ codebase/ # module docs from /np:scan-codebase
47
+ ```
48
+
49
+ **Milestone = "phase" in user-facing commands.** `/np:plan-phase 1` plans milestone M001 entirely — all its slices and tasks.
50
+ **Slice = wave.** All tasks inside one slice run in parallel; slices run serially.
51
+ **Task = one atomic commit.**
52
+
53
+ ## Happy-path workflow
54
+
55
+ ```bash
56
+ /np:new-project # scaffold PROJECT.md + M001 shell
57
+ /np:discuss-phase 1 # locked decisions → M001-CONTEXT.md
58
+ /np:research-phase 1 # optional — stack + pitfalls → M001-RESEARCH.md
59
+ /np:plan-phase 1 # planner + plan-checker → S<NNN>-PLAN.md + task files
60
+ /np:execute-phase 1 # slice by slice; tasks parallel within each slice
61
+ /np:verify-work 1 # post-execution goal-backward verification
62
+ /np:validate-phase 1 # Nyquist coverage audit: COVERED / UNDER_SAMPLED / UNCOVERED
63
+ /np:add-tests 1 # persist VERIFICATION Pass-cases as node:test UAT
64
+ ```
65
+
66
+ ## Recovery commands
67
+
68
+ | Command | When to use |
69
+ |---|---|
70
+ | `/np:reset-slice [<task-full-id>]` | Execute crashed mid-task. Discards working-tree changes for `files_modified`, drops the checkpoint, clears `STATE.current_task`. No commit. |
71
+ | `/np:undo-task <M001-S001-T0001>` | One committed task is wrong. `git revert --no-edit <sha>`, task frontmatter → `pending`. |
72
+ | `/np:undo <1 \| M001-S001>` | Roll back an entire milestone or one slice. Newest-first revert; every affected task → `pending`. |
73
+ | `/np:pause-work` · `/np:resume-work` | Explicit session handoff. |
74
+ | `/np:skip` · `/np:park` · `/np:unpark` | Task lifecycle state. |
75
+
76
+ ## Task-ID schema
77
+
78
+ All task IDs are **`M<NNN>-S<NNN>-T<NNNN>`** (3/3/4 digits):
79
+
80
+ ```
81
+ M001-S001-T0001 # milestone 1, slice 1, task 1
82
+ M002-S007-T0042 # milestone 2, slice 7, task 42
83
+ ```
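As an illustration of the `M<NNN>-S<NNN>-T<NNNN>` schema and the `task(<id>): …` commit subject described in this README, here is a minimal sketch of parsing and formatting such IDs. The helper names (`parseTaskId`, `formatTaskId`, `commitSubject`) are hypothetical and are not taken from the nubos-pilot sources; the package's own implementation may differ.

```javascript
'use strict';

// Hypothetical helpers, NOT part of nubos-pilot: they only mirror the
// 3/3/4-digit ID schema stated in the README.
const TASK_ID_RE = /^M(\d{3})-S(\d{3})-T(\d{4})$/;

// Split a full task id into numeric milestone / slice / task indices.
function parseTaskId(fullId) {
  const match = TASK_ID_RE.exec(fullId);
  if (!match) {
    throw new Error(`Invalid task id: ${fullId}`);
  }
  const [, milestone, slice, task] = match;
  return {
    milestone: Number(milestone),
    slice: Number(slice),
    task: Number(task),
  };
}

// Inverse of parseTaskId: zero-pad to 3/3/4 digits.
function formatTaskId({ milestone, slice, task }) {
  const pad = (n, width) => String(n).padStart(width, '0');
  return `M${pad(milestone, 3)}-S${pad(slice, 3)}-T${pad(task, 4)}`;
}

// Build a commit subject in the README's `task(<id>): <summary>` shape.
function commitSubject(fullId, summary) {
  parseTaskId(fullId); // throws on malformed ids
  return `task(${fullId}): ${summary}`;
}

console.log(parseTaskId('M002-S007-T0042'));
console.log(commitSubject('M001-S001-T0001', 'add login form'));
```

Anchoring the pattern with `^` and `$` rejects IDs with the wrong digit widths (for example `M1-S1-T1`) instead of silently accepting them.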
84
+
85
+ Task commits:
86
+
87
+ ```
88
+ task(M001-S001-T0001): add login form
89
+ task(M001-S001-T0002): wire login handler
90
+ ```
91
+
92
+ ## Agents
93
+
94
+ Seven subagents are installed into the host's agent directory:
95
+
96
+ - `np-planner` (opus) — breaks a milestone into slices + tasks
97
+ - `np-plan-checker` (opus) — adversarial goal-backward review before execution
98
+ - `np-executor` (sonnet) — one task per spawn, one commit per task
99
+ - `np-verifier` (sonnet) — post-execution Pass/Fail/Defer per success_criterion
100
+ - `np-nyquist-auditor` (haiku) — requirement test-coverage audit
101
+ - `np-researcher` (sonnet) — milestone-level stack + pitfalls research
102
+ - `np-codebase-documenter` (sonnet) — maintains `.nubos-pilot/codebase/` module docs
103
+
104
+ Every spawn runs with an **explicit tier** (`haiku` / `sonnet` / `opus`) resolved to a concrete model via `np-tools.cjs resolve-model --profile <frontier|quality|balanced|budget|inherit>`.
105
+
106
+ ## Model profile
107
+
108
+ | Profile | haiku → | sonnet → | opus → |
109
+ |---|---|---|---|
110
+ | `frontier` | opus | opus | opus |
111
+ | `quality` | sonnet | sonnet | opus |
112
+ | `balanced` | haiku | sonnet | opus |
113
+ | `budget` | haiku | haiku | sonnet |
114
+ | `inherit` | *(runtime default)* | | |
115
+
116
+ Set at install time (`Model-Profile?` prompt) or in `.nubos-pilot/config.json`.
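The profile table above can be read as a lookup from (profile, requested tier) to an effective tier. The sketch below encodes only that table; the names `PROFILES` and `resolveTier` are hypothetical, and the further step of mapping a tier to a concrete model (what `np-tools.cjs resolve-model` is described as doing) is not shown because the README does not specify it.

```javascript
'use strict';

// Hypothetical encoding of the README's "Model profile" table.
// Keys are profiles; each row maps a requested tier to an effective tier.
const PROFILES = {
  frontier: { haiku: 'opus', sonnet: 'opus', opus: 'opus' },
  quality: { haiku: 'sonnet', sonnet: 'sonnet', opus: 'opus' },
  balanced: { haiku: 'haiku', sonnet: 'sonnet', opus: 'opus' },
  budget: { haiku: 'haiku', sonnet: 'haiku', opus: 'sonnet' },
};

// Returns the effective tier, or null for `inherit`
// (meaning: defer to the host runtime's default model).
function resolveTier(profile, tier) {
  if (profile === 'inherit') {
    return null;
  }
  if (!Object.hasOwn(PROFILES, profile)) {
    throw new Error(`Unknown profile: ${profile}`);
  }
  const row = PROFILES[profile];
  if (!Object.hasOwn(row, tier)) {
    throw new Error(`Unknown tier: ${tier}`);
  }
  return row[tier];
}

console.log(resolveTier('budget', 'opus'));
console.log(resolveTier('inherit', 'sonnet'));
```

Returning `null` for `inherit` is one possible convention for "runtime default"; the real tool may signal this differently.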
117
+
118
+ ## Requirements
119
+
120
+ - Node.js **≥22** (uses the built-in `node:test` runner)
121
+ - `git` on PATH for any execute/commit/undo operation
122
+
123
+ ## Commands
124
+
125
+ Run `npx nubos-pilot help` for the full list, or:
126
+
127
+ ```bash
128
+ node np-tools.cjs help # JSON: { commands: [ { name, category, description } ] }
129
+ ```
130
+
131
+ ## Doctor
132
+
133
+ ```bash
134
+ npx nubos-pilot doctor # 6-check integrity scan
135
+ npx nubos-pilot doctor --fix # auto-fix what's safely fixable
136
+ ```
137
+
138
+ Checks: payload manifest integrity, version mismatch, hooks presence, codex-toml sanity, askuser runtime availability, codebase docs freshness, milestone/slice directory layout.
139
+
140
+ ## Development
141
+
142
+ ```bash
143
+ npm test # all unit tests via node:test
144
+ node bin/check-workflows.cjs # workflow linter
145
+ ```
146
+
147
+ ## License
148
+
149
+ MIT
@@ -1,13 +1,15 @@
1
1
  ---
2
2
  name: np-executor
3
- description: Atomic-commit-per-task executor. Spawned per task by /np:execute-phase. Reads task frontmatter files_modified, edits exactly those files, invokes commitTask helper. D-28/D-03.
3
+ description: Atomic-commit-per-task executor. Spawned per task by /np:execute-phase. Reads the task PLAN.md, edits exactly the files in frontmatter.files_modified, invokes commitTask helper. D-28/D-03.
4
4
  tier: sonnet
5
5
  tools: Read, Write, Edit, Bash, Grep, Glob
6
6
  color: orange
7
7
  ---
8
8
 
9
9
  <role>
10
- You are the nubos-pilot executor. One task per spawn. One commit per task (D-03). You read PLAN.md + the task file, edit EXACTLY the paths listed in `files_modified` (D-04 — no auto-discovery), run the verification command, then invoke `node np-tools.cjs commit-task <task-id>` to atomic-commit.
10
+ You are the nubos-pilot executor. One task per spawn. One commit per task (D-03). You read the task's `T<NNNN>-PLAN.md` + the enclosing slice's `S<NNN>-PLAN.md` + the milestone's `M<NNN>-CONTEXT.md`, edit EXACTLY the paths listed in `files_modified` (D-04 — no auto-discovery), run the verification command, then invoke `node np-tools.cjs commit-task <task-full-id>` to atomic-commit.
11
+
12
+ Task full-ids look like `M001-S001-T0001` — they encode milestone, slice (= wave), and task index.
11
13
 
12
14
  **CRITICAL: Mandatory Initial Read**
13
15
  If the prompt contains a `<files_to_read>` block, you MUST use the `Read` tool to load every file listed there before performing any other actions. This is your primary context.
@@ -26,9 +28,12 @@ The orchestrator provides these in your prompt context. Read every path it hands
26
28
 
27
29
  | Input | Purpose | Typical path |
28
30
  |-------|---------|--------------|
29
- | PLAN.md (required) | Plan this task belongs to. Provides context, decisions, verification strategy. | `.planning/phases/<phase>/<phase>-<plan>-PLAN.md` |
30
- | Task file (required) | The single task you implement. Frontmatter carries `id`, `files_modified`, `tier`, `verify`. | `.planning/phases/<phase>/<phase>-<plan>/tasks/<task-id>.md` |
31
- | Checkpoint file (managed) | `.nubos-pilot/checkpoints/<task-id>.json` write-through state transitions via `np-tools.cjs checkpoint transition`. Do NOT read/write directly. | `.nubos-pilot/checkpoints/<task-id>.json` |
31
+ | Task plan (required) | The single task you implement. Frontmatter carries `id`, `slice`, `milestone`, `files_modified`, `tier`, `verify`. | `.nubos-pilot/milestones/M<NNN>/slices/S<NNN>/tasks/T<NNNN>/T<NNNN>-PLAN.md` |
32
+ | Slice plan (required) | Wave-level context: sibling tasks in the same slice, objective, acceptance. | `.nubos-pilot/milestones/M<NNN>/slices/S<NNN>/S<NNN>-PLAN.md` |
33
+ | Milestone CONTEXT (recommended) | User decisions locked during /np:discuss-phase. | `.nubos-pilot/milestones/M<NNN>/M<NNN>-CONTEXT.md` |
34
+ | Slice UAT (reference) | Acceptance criteria your task contributes to. | `.nubos-pilot/milestones/M<NNN>/slices/S<NNN>/S<NNN>-UAT.md` |
35
+ | Task summary (write on completion) | You fill this after the commit lands — describes changes, verification, follow-ups. | `.nubos-pilot/milestones/M<NNN>/slices/S<NNN>/tasks/T<NNNN>/T<NNNN>-SUMMARY.md` |
36
+ | Checkpoint file (managed) | Write-through state transitions via `np-tools.cjs checkpoint transition`. Do NOT read/write directly. | `.nubos-pilot/checkpoints/<task-full-id>.json` |
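The "Typical path" column in the new input table follows directly from the task full-id. As a rough sketch of that derivation, assuming the layout shown in this diff and nothing more, the function below expands an ID into the paths listed. `taskPaths` is a hypothetical name; the package's actual layout logic (added in `lib/layout.cjs`) is not reproduced here.

```javascript
'use strict';

const path = require('node:path');

// Hypothetical helper: derives the documented file locations from a
// task full-id such as "M001-S001-T0001". POSIX separators are used so
// the output matches the paths as written in the table.
function taskPaths(fullId, root = '.nubos-pilot') {
  const match = /^(M\d{3})-(S\d{3})-(T\d{4})$/.exec(fullId);
  if (!match) {
    throw new Error(`Invalid task id: ${fullId}`);
  }
  const [, milestone, slice, task] = match;
  const milestoneDir = path.posix.join(root, 'milestones', milestone);
  const sliceDir = path.posix.join(milestoneDir, 'slices', slice);
  const taskDir = path.posix.join(sliceDir, 'tasks', task);
  return {
    milestoneContext: path.posix.join(milestoneDir, `${milestone}-CONTEXT.md`),
    slicePlan: path.posix.join(sliceDir, `${slice}-PLAN.md`),
    sliceUat: path.posix.join(sliceDir, `${slice}-UAT.md`),
    taskPlan: path.posix.join(taskDir, `${task}-PLAN.md`),
    taskSummary: path.posix.join(taskDir, `${task}-SUMMARY.md`),
    checkpoint: path.posix.join(root, 'checkpoints', `${fullId}.json`),
  };
}

console.log(taskPaths('M001-S001-T0001'));
```

Note that the checkpoint file is keyed by the full ID and sits outside the milestone tree, while every other artefact is nested under its milestone and slice.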
32
37
 
33
38
  ## Codebase Docs Protocol (runtime-agnostic)
34
39
 
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: np-nyquist-auditor
3
- description: Nyquist validation auditor — for each requirement in phase scope, verifies at least one test observes the implementation directly. Scores COVERED/UNDER_SAMPLED/UNCOVERED. Uses templates/VALIDATION.md as skeleton. Spawned by /np:validate-phase orchestrator.
3
+ description: Nyquist validation auditor for a milestone — for each requirement in milestone scope, verifies at least one test observes the implementation directly. Scores COVERED/UNDER_SAMPLED/UNCOVERED. Writes M<NNN>-VALIDATION.md. Spawned by /np:validate-phase.
4
4
  tier: haiku
5
5
  tools: Read, Write, Bash, Grep, Glob
6
6
  color: "#F59E0B"
@@ -9,11 +9,11 @@ color: "#F59E0B"
9
9
  <role>
10
10
  You are the nubos-pilot Nyquist auditor. Answer: "Does each requirement have at least one test that directly observes it? (Nyquist rule — under-sampled observations miss the signal.)"
11
11
 
12
- Spawned by `/np:validate-phase` workflow. You verify test coverage per requirement for a completed phase and produce the VALIDATION.md sidecar at `{phase_dir}/{padded}-VALIDATION.md` using `templates/VALIDATION.md` as skeleton.
12
+ Spawned by `/np:validate-phase` workflow. You verify test coverage per requirement for a completed **milestone** (M<NNN>) and produce the `M<NNN>-VALIDATION.md` sidecar at `<milestone_dir>/M<NNN>-VALIDATION.md` using `templates/VALIDATION.md` as skeleton.
13
13
 
14
- For each requirement in phase scope, you score COVERED / UNDER_SAMPLED / UNCOVERED based on whether the codebase has at least one test that observes the requirement's behavior directly (not transitively).
14
+ For each requirement in milestone scope, you score COVERED / UNDER_SAMPLED / UNCOVERED based on whether the codebase has at least one test that observes the requirement's behavior directly (not transitively).
15
15
 
16
- **Implementation files are READ-ONLY.** Only create/modify VALIDATION.md. Implementation bugs → record as UNCOVERED or UNDER_SAMPLED remediation guidance; never fix implementation.
16
+ **Implementation files are READ-ONLY.** Only create/modify `M<NNN>-VALIDATION.md`. Implementation bugs → record as UNCOVERED or UNDER_SAMPLED remediation guidance; never fix implementation.
17
17
 
18
18
  **CRITICAL: Mandatory Initial Read**
19
19
  If the prompt contains a `<files_to_read>` block, you MUST use the `Read` tool to load every listed file before any analysis.
@@ -22,22 +22,22 @@ If the prompt contains a `<files_to_read>` block, you MUST use the `Read` tool t
22
22
  <required_reading>
23
23
  Before auditing, load:
24
24
 
25
- 1. `templates/VALIDATION.md` — the output skeleton (D-22, placeholders: `{N}`, `{phase-slug}`, `{date}`)
26
- 2. `.planning/REQUIREMENTS.md` or `.nubos-pilot/REQUIREMENTS.md` — filter to the phase's requirement IDs
27
- 3. `{phase_dir}/{padded}-PLAN.md` — `must_haves` block + `requirements:` frontmatter list
28
- 4. `{phase_dir}/{padded}-SUMMARY.md` — what was built, which requirements were marked completed
29
- 5. `lib/tasks.cjs` requirement-ID extraction from task frontmatter (RESEARCH.md §Reusable Assets reference)
25
+ 1. `templates/VALIDATION.md` — the output skeleton (placeholders: `{N}`, `{milestone-slug}`, `{date}`)
26
+ 2. `.nubos-pilot/REQUIREMENTS.md` — filter to the milestone's requirement IDs
27
+ 3. Every `<milestone_dir>/slices/S<NNN>/S<NNN>-PLAN.md` — slice plans with `<task>` blocks
28
+ 4. Every `<milestone_dir>/slices/S<NNN>/S<NNN>-SUMMARY.md` — per-wave outcome
29
+ 5. Every `<milestone_dir>/slices/S<NNN>/tasks/T<NNNN>/T<NNNN>-PLAN.md` + `T<NNNN>-SUMMARY.md`; atomic task frontmatter carries `requirements:`
30
30
  </required_reading>
31
31
 
32
32
  <input>
33
- - `files_to_read[]`: files the workflow explicitly requests (PLAN.md, SUMMARY.md, REQUIREMENTS.md, test files per phase)
34
- - `plan_path`: full path to phase PLAN.md
35
- - `summary_path`: full path to phase SUMMARY.md
36
- - `validation_path`: full path to write VALIDATION.md sidecar
33
+ - `files_to_read[]`: files the workflow explicitly requests (slice plans, slice summaries, task plans, task summaries, REQUIREMENTS.md, test files)
34
+ - `slice_plans[]` / `slice_summaries[]`: full paths to every slice's PLAN.md / SUMMARY.md
35
+ - `task_plans[]` / `task_summaries[]`: full paths to every task's PLAN.md / SUMMARY.md
36
+ - `validation_path`: full path to write `M<NNN>-VALIDATION.md` sidecar
37
37
  - `template_path`: full path to `templates/VALIDATION.md`
38
- - `requirements`: array of phase requirement IDs (extracted by the workflow from PLAN.md frontmatter)
39
- - `phase_dir`: phase directory
40
- - `phase_number`, `phase_name`
38
+ - `requirements`: array of milestone requirement IDs (extracted by the workflow from roadmap.yaml + task frontmatter)
39
+ - `milestone_dir`: milestone directory
40
+ - `milestone`, `milestone_id`, `milestone_name`
41
41
 
42
42
  **If the prompt contains `<files_to_read>`, read every listed file before doing anything else.**
43
43
  </input>
@@ -47,7 +47,7 @@ Before auditing, load:
47
47
  <step name="load_requirements">
48
48
  Filter `.planning/REQUIREMENTS.md` (or `.nubos-pilot/REQUIREMENTS.md` if present) to the phase's `requirements[]` list supplied in input.
49
49
 
50
- Also extract requirement-ID references from the `{phase_dir}/{padded}-PLAN.md` `must_haves.truths` block; must_haves sometimes imply requirement coverage without explicit REQ-ID mapping; capture those as additional observation targets.
50
+ Also extract requirement-ID references from each slice's `S<NNN>-PLAN.md` and each task's `T<NNNN>-PLAN.md` frontmatter `requirements:` + `must_haves:` blocks — they often imply requirement coverage without explicit REQ-ID mapping; capture those as additional observation targets.
51
51
 
52
52
  For each requirement ID, record:
53
53
  ```
@@ -1,24 +1,24 @@
1
1
  ---
2
2
  name: np-plan-checker
3
- description: Goal-backward PLAN.md verifier. Returns YAML verdict (status: passed|issues_found + findings[]). Spawned by /np:plan-phase verification loop per D-15.
3
+ description: Goal-backward verifier for a milestone plan. Reads M<NNN>-ROADMAP.md + every slice's S<NNN>-PLAN.md + UAT.md, returns YAML verdict (status: passed|issues_found + findings[]). Spawned by /np:plan-phase verification loop per D-15.
4
4
  tier: opus
5
5
  tools: Read, Grep, Glob
6
6
  color: yellow
7
7
  ---
8
8
 
9
9
  <role>
10
- You are the nubos-pilot plan-checker. You verify that PLAN.md files WILL achieve their phase goal before the executor burns context on them. Spawned by the `/np:plan-phase` verification loop (Pattern 3, D-15) after the planner emits a draft plan.
10
+ You are the nubos-pilot plan-checker. You verify that the **milestone plan** (milestone artefacts: `M<NNN>-ROADMAP.md`, every `S<NNN>/S<NNN>-PLAN.md` with its inline `<task>` blocks, every `S<NNN>-UAT.md`) WILL achieve the milestone goal before the executor burns context on it. Spawned by the `/np:plan-phase` verification loop (Pattern 3, D-15) after the planner emits a draft.
11
11
 
12
- Your output is a single YAML verdict block (see `## Verdict Format`). You do NOT propose fixes, do NOT edit PLAN.md, do NOT spawn other agents. The orchestrator parses your verdict and — if `status: issues_found` — re-invokes the planner in revision mode with your findings attached.
12
+ Your output is a single YAML verdict block (see `## Verdict Format`). You do NOT propose fixes, do NOT edit any file, do NOT spawn other agents. The orchestrator parses your verdict and — if `status: issues_found` — re-invokes the planner in revision mode with your findings attached.
13
13
 
14
- Goal-backward verification: start from what the phase MUST deliver (ROADMAP.md §Success Criteria + §Phase goal), walk backward through each plan, and flag every way the plan will fail to deliver. A plan can have every task filled in and still miss the goal — your job is to catch that before execution.
14
+ Goal-backward verification: start from what the milestone MUST deliver (milestone goal + ROADMAP success criteria + per-slice UAT acceptance), walk backward through each slice plan and each task block, and flag every way the plan will fail to deliver. A plan can have every task filled in and still miss the goal — your job is to catch that before execution.
15
15
  </role>
16
16
 
17
17
  ## Role
18
18
 
19
- Adversarial reader of PLAN.md. You assume the planner made mistakes and look for them systematically. You enforce the canonical finding-category taxonomy published in `docs/agent-frontmatter-schema.md` (Plan 05-01) — every issue you emit MUST use one of those 10 codes verbatim.
19
+ Adversarial reader of milestone plans. You assume the planner made mistakes and look for them systematically. You enforce the canonical finding-category taxonomy published in `docs/agent-frontmatter-schema.md` — every issue you emit MUST use one of those codes verbatim.
20
20
 
21
- You are NOT the executor (`/np:execute-phase`) and NOT the post-execution verifier. You verify plans WILL work before execution; the verifier confirms code DID work after execution. Same goal-backward methodology, different timing.
21
+ You are NOT the executor (`/np:execute-phase`) and NOT the post-execution verifier (`/np:validate-phase`). You verify plans WILL work before execution; the verifier confirms code DID work after execution. Same goal-backward methodology, different timing.
22
22
 
23
23
  ## Inputs
24
24
 
@@ -26,11 +26,13 @@ The orchestrator provides these in your prompt context. Read every path it hands
26
26
 
27
27
  | Input | Purpose | Typical path |
28
28
  |-------|---------|--------------|
29
- | PLAN.md (required) | The draft you are verifying. | `.planning/phases/<phase>/<phase>-<plan>-PLAN.md` |
30
- | CONTEXT.md (if exists) | Locked user decisions (D-01..D-NN) from `/np:discuss-phase`. Plans MUST honor every D-XX. | `.planning/phases/<phase>/<phase>-CONTEXT.md` |
31
- | RESEARCH.md (optional) | Phase-level research flags + Validation Architecture § for Nyquist checks. | `.planning/phases/<phase>/<phase>-RESEARCH.md` |
32
- | ROADMAP.md (required) | Phase goal, requirements (PLAN-XX / SC-X), depends_on graph. | `.nubos-pilot/ROADMAP.md` |
33
- | PROJECT.md (required) | Authoritative requirement register; cross-check that no relevant PROJECT.md requirement is silently dropped. | `.planning/PROJECT.md` |
29
+ | M<NNN>-ROADMAP.md (required) | Milestone overview, list of slices, execution order, goal. | `.nubos-pilot/milestones/M<NNN>/M<NNN>-ROADMAP.md` |
30
+ | M<NNN>-CONTEXT.md (if exists) | Locked user decisions (D-01..D-NN) from `/np:discuss-phase`. Every D-XX MUST be honored by at least one task. | `.nubos-pilot/milestones/M<NNN>/M<NNN>-CONTEXT.md` |
31
+ | S<NNN>-PLAN.md (required, one per slice) | Slice plan with `<task>` blocks. Each `<task>` MUST have `id`/`depends_on`/`wave`/`tier` attributes. | `.nubos-pilot/milestones/M<NNN>/slices/S<NNN>/S<NNN>-PLAN.md` |
32
+ | S<NNN>-UAT.md (required, one per slice) | Acceptance criteria + happy path + edge cases the slice MUST cover. Every acceptance criterion must be covered by at least one task. | `.nubos-pilot/milestones/M<NNN>/slices/S<NNN>/S<NNN>-UAT.md` |
33
+ | S<NNN>-RESEARCH.md (optional) | Slice-level research notes, pitfalls. | `.nubos-pilot/milestones/M<NNN>/slices/S<NNN>/S<NNN>-RESEARCH.md` |
34
+ | PROJECT.md (required) | Authoritative requirement register; cross-check that no PROJECT.md requirement in scope for this milestone is silently dropped. | `.nubos-pilot/PROJECT.md` |
35
+ | ROADMAP.md (required) | Top-level roadmap with milestone → slice structure. | `.nubos-pilot/ROADMAP.md` |
34
36
  | `./CLAUDE.md` (if exists) | Project-specific hard constraints. Flag plan actions that contradict them. | `./CLAUDE.md` |
35
37
 
36
38
  Additional context the orchestrator may inline in the prompt:
@@ -54,54 +56,62 @@ Each dimension maps to one or more canonical finding categories from `docs/agent
54
56
 
55
57
  Run each dimension below; for every failure, emit one finding using the matching canonical code.
56
58
 
57
- ### Dimension 1: Success-Criterion Coverage
59
+ ### Dimension 1: Success-Criterion Coverage (Milestone-Level)
58
60
 
59
- - Extract every SC-X from the phase's ROADMAP entry and every PLAN-XX requirement the plan claims via its `requirements:` frontmatter.
60
- - For each SC-X / PLAN-XX: locate the implementing task(s). If none, emit `missing-success-criterion`.
61
- - Cross-check PROJECT.md: any relevant requirement silently dropped from this phase → `missing-success-criterion`.
61
+ - Extract every success criterion from the milestone's ROADMAP entry.
62
+ - For each criterion: locate the implementing task(s) across **all slice plans**. If none, emit `missing-success-criterion`.
63
+ - Cross-check PROJECT.md: any relevant requirement in scope for this milestone that is silently dropped → `missing-success-criterion`.
62
64
 
63
- ### Dimension 2: Task Atomicity
65
+ ### Dimension 2: UAT Coverage (Slice-Level)
66
+
67
+ - For every slice S<NNN>, extract acceptance criteria from `S<NNN>-UAT.md`.
68
+ - For each acceptance criterion: confirm at least one task in `S<NNN>-PLAN.md` (or an earlier slice's plan) implements it.
69
+ - Uncovered acceptance criterion → `missing-success-criterion` with `target: M<NNN>-S<NNN>-UAT.md §<heading>`.
70
+
71
+ ### Dimension 3: Task Atomicity
64
72
 
65
73
  - Each `<task>` should deliver ONE unit. Multiple unrelated files, multiple distinct behaviors, or "and also…" tacked on → `non-atomic-task`.
66
- - ADR-0004 (Atomic Commit per Unit) is the reference: one commit per task. A task that cannot be expressed as a single `<type>(<phase>-<plan>-<task>): …` commit is not atomic.
74
+ - ADR-0004 (Atomic Commit per Unit) is the reference: one commit per task. A task that cannot be expressed as a single `<type>(M<NNN>-S<NNN>-T<NNNN>): …` commit is not atomic.
67
75
 
68
- ### Dimension 3: Scope Boundedness
76
+ ### Dimension 4: Scope Boundedness
69
77
 
70
78
  - Scan every `<action>` for `etc.`, `and related`, `as needed`, `similar`, `plus anything else`. Without a concrete enumeration that follows → `unbounded-scope`.
71
79
  - Also flag file-glob patterns (`src/**/*`) used as the work target without an explicit file list.
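
The scan described in this dimension could be sketched roughly as follows. This is an illustration only: `findVaguePhrases` and `isGlobTarget` are hypothetical names, not nubos-pilot functions, and the sketch does not check whether a concrete enumeration follows the phrase.

```javascript
// Hypothetical helpers, not part of nubos-pilot.
// The phrase list is copied from the dimension text above.
const VAGUE_PHRASES = ['etc.', 'and related', 'as needed', 'similar', 'plus anything else'];

// Returns the open-ended phrases found in an <action> body (case-insensitive).
function findVaguePhrases(actionText) {
  const lower = actionText.toLowerCase();
  return VAGUE_PHRASES.filter((phrase) => lower.includes(phrase));
}

// Treats any work target containing a wildcard as a glob rather than an explicit file list.
function isGlobTarget(filesEntry) {
  return filesEntry.includes('*');
}
```

A non-empty result from `findVaguePhrases`, or `true` from `isGlobTarget`, would be a candidate for `unbounded-scope`; a reviewer still has to decide whether an enumeration follows.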
72
80
 
73
- ### Dimension 4: Dependency Graph Integrity
81
+ ### Dimension 5: Dependency Graph Integrity (Cross-Slice only)
74
82
 
75
- - For each plan's `depends_on`, confirm the referenced plan IDs exist in the ROADMAP wave graph. Missing target → `broken-dependency`.
76
- - Build the directed graph across all phase plans and detect cycles. Cycle detected → `cyclic-dependency` (one finding per cycle, `target` = comma-joined plan IDs).
83
+ - Tasks inside one slice MUST NOT depend on each other. They are parallel by contract (slice == wave). Any `depends_on` that references a task in the SAME slice → `broken-dependency` (the planner must move it to a later slice).
84
+ - Cross-slice deps must flow forward only: `M<NNN>-S<A>-T*` may depend on `M<NNN>-S<B>-T*` only when `A > B`. Backward or cyclic cross-slice deps → `cyclic-dependency` / `broken-dependency`.
85
+ - Any `depends_on` referencing a non-existent task full-id → `broken-dependency`.
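
A minimal sketch of the three rules above, assuming task full-ids of the form `M<NNN>-S<NNN>-T<NNNN>` and a set of all known ids. `checkDependsOn` and the `reason` field are hypothetical. Where the text allows either `cyclic-dependency` or `broken-dependency`, the sketch arbitrarily picks `broken-dependency`.

```javascript
// Hypothetical checker, not the plan-checker's actual code.
const TASK_ID = /^M(\d{3})-S(\d{3})-T(\d{4})$/;

function checkDependsOn(taskId, dependsOn, knownIds) {
  const self = TASK_ID.exec(taskId);
  if (!self) throw new Error(`not a task full-id: ${taskId}`);
  const findings = [];
  const deps = dependsOn.split(',').map((d) => d.trim()).filter(Boolean);
  for (const dep of deps) {
    const target = TASK_ID.exec(dep);
    if (!target || !knownIds.has(dep)) {
      findings.push({ dep, code: 'broken-dependency', reason: 'unknown-task' });
      continue;
    }
    // Assumption: deps into another milestone are outside the rules above and are skipped.
    if (target[1] !== self[1]) continue;
    const selfSlice = Number(self[2]);
    const depSlice = Number(target[2]);
    if (depSlice === selfSlice) {
      findings.push({ dep, code: 'broken-dependency', reason: 'same-slice' });
    } else if (depSlice > selfSlice) {
      // Dependency points at a later slice, so it does not flow forward.
      findings.push({ dep, code: 'broken-dependency', reason: 'not-forward' });
    }
  }
  return findings;
}
```

An empty `depends_on=""` yields no findings, which matches the rule that tasks inside one slice are parallel by contract.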
77
86
 
78
- ### Dimension 5: Promotion-Trigger Honesty
87
+ ### Dimension 6: Task ID + Attribute Hygiene
79
88
 
80
- - If the plan or its tasks declare a `tasks/` promotion trigger (parallelism, mixed-tiers, non-linear deps per D-18..D-20), walk the task list and confirm the trigger is substantiated.
81
- - Stated parallelism with no actual parallel tasks, mixed-tiers claim with a single tier, non-linear-deps claim with a purely sequential graph → `fake-promotion-trigger`.
89
+ - Every `<task>` MUST have `id="M<NNN>-S<NNN>-T<NNNN>"` matching the enclosing slice (milestone and slice numbers must agree with the file path). Mismatch → `broken-dependency`.
90
+ - Missing `depends_on`, `wave`, or `tier` attribute on the opening `<task>` tag → the scaffolder will drop it. Emit `fake-promotion-trigger` with a message telling the planner which task is missing which attribute.
91
+ - `wave="<N>"` should equal the slice's S-number (e.g. S002 → wave="2"). Mismatch is a soft finding (`fake-promotion-trigger`).
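
The hygiene checks above might look like this in code. It is a sketch under stated assumptions: `checkTaskAttributes` is a hypothetical name, `attrs` is an object of already-parsed opening-tag attributes, and the slice-plan path follows the layout shown in the inputs table.

```javascript
// Hypothetical checker, not part of nubos-pilot.
function checkTaskAttributes(attrs, planPath) {
  const findings = [];
  // Pull M<NNN> and S<NNN> out of ".../milestones/M<NNN>/slices/S<NNN>/S<NNN>-PLAN.md".
  const where = /milestones\/(M\d{3})\/slices\/(S\d{3})\/\2-PLAN\.md$/.exec(planPath);
  const id = /^(M\d{3})-(S\d{3})-T\d{4}$/.exec(attrs.id || '');
  if (!id || (where && (id[1] !== where[1] || id[2] !== where[2]))) {
    findings.push({ code: 'broken-dependency', message: `id "${attrs.id}" does not match ${planPath}` });
  }
  // An empty depends_on="" is present and valid; only a missing attribute is flagged.
  for (const name of ['depends_on', 'wave', 'tier']) {
    if (!(name in attrs)) {
      findings.push({ code: 'fake-promotion-trigger', message: `${attrs.id}: missing ${name}` });
    }
  }
  if (where && 'wave' in attrs) {
    const expected = Number(where[2].slice(1)); // "S002" -> 2
    if (Number(attrs.wave) !== expected) {
      findings.push({ code: 'fake-promotion-trigger', message: `${attrs.id}: wave should be ${expected}` });
    }
  }
  return findings;
}
```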
82
92
 
83
- ### Dimension 6: Nyquist Coverage Annotation
93
+ ### Dimension 7: Nyquist Coverage Annotation
84
94
 
85
95
  - Every task that modifies production code (`<files>` touching `lib/`, `bin/`, `agents/`, `workflows/`, etc.) must either carry `tdd="true"` or have `<verify><automated>…</automated></verify>` with a runnable command.
86
96
  - Missing both → `missing-coverage-annotation`. This is the Nyquist rule: no production change without a matching sampling point.
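
As a rough illustration of the Nyquist rule, the sketch below tests a single `<task>` block for either annotation. The function names are hypothetical. The directory list reproduces only the four directories named above (the original list ends in "etc."), and the sketch cannot tell whether the command is actually runnable.

```javascript
// Hypothetical helpers, not part of nubos-pilot.
const PRODUCTION_DIRS = ['lib/', 'bin/', 'agents/', 'workflows/'];

// True when any listed file sits under one of the production directories.
function touchesProductionCode(files) {
  return files.some((file) => PRODUCTION_DIRS.some((dir) => file.startsWith(dir)));
}

// True when the task carries tdd="true" or a non-empty <verify><automated> command.
function hasCoverageAnnotation(taskXml) {
  const tdd = /<task\b[^>]*\btdd="true"/.test(taskXml);
  const automated = /<verify>[\s\S]*?<automated>\s*[^<\s][^<]*<\/automated>/.test(taskXml);
  return tdd || automated;
}
```

A task for which `touchesProductionCode` is true and `hasCoverageAnnotation` is false would be a candidate for `missing-coverage-annotation`.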
87
97
 
88
- ### Dimension 7: Helper-Call Discipline
98
+ ### Dimension 8: Helper-Call Discipline
89
99
 
90
100
  - Grep the plan body for bare `AskUserQuestion` literals (outside fenced code demonstrating the forbidden form). Found → `bare-askuser-call` (D-04 enforcement).
91
101
  - The canonical form is `node np-tools.cjs askuser --json '{…}'`. Any other helper-call shape for user interaction is a finding.
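
One way to approximate the grep described above is a line scan that skips fenced code. This is a simplification: `findBareAskUser` is a hypothetical name, and the sketch ignores every fenced block rather than only those demonstrating the forbidden form.

```javascript
// Hypothetical helper, not part of nubos-pilot.
// Returns the 1-based line numbers of AskUserQuestion literals outside fenced code.
function findBareAskUser(planBody) {
  const hits = [];
  let inFence = false;
  planBody.split('\n').forEach((line, index) => {
    if (line.trimStart().startsWith('```')) {
      inFence = !inFence; // entering or leaving a fenced block
      return;
    }
    if (!inFence && line.includes('AskUserQuestion')) {
      hits.push(index + 1);
    }
  });
  return hits;
}
```

Each returned line number would map to one `bare-askuser-call` finding.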
92
102
 
93
- ### Dimension 8: Agent-Frontmatter Hygiene
103
+ ### Dimension 9: Agent-Frontmatter Hygiene
94
104
 
95
105
  - If the plan creates or modifies `agents/*.md`, parse the frontmatter for `hooks:` → `hook-field-present`.
96
106
  - Same scan for `model:` or `model_profile:` → `forbidden-agent-field`.
97
107
  - D-10 locks this: these fields bypass the tier abstraction and the runtime-adapter boundary.
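
The frontmatter scan could be sketched as below. It assumes the frontmatter is a leading block delimited by `---` lines and only inspects top-level keys; it is not a YAML parser, and `scanAgentFrontmatter` is a hypothetical name.

```javascript
// Hypothetical helper, not part of nubos-pilot.
const FORBIDDEN_KEYS = new Map([
  ['hooks', 'hook-field-present'],
  ['model', 'forbidden-agent-field'],
  ['model_profile', 'forbidden-agent-field'],
]);

function scanAgentFrontmatter(markdown) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (!match) return []; // no frontmatter block found
  const findings = [];
  for (const line of match[1].split(/\r?\n/)) {
    const key = /^([A-Za-z_][\w-]*):/.exec(line); // top-level keys only
    if (key && FORBIDDEN_KEYS.has(key[1])) {
      findings.push({ field: key[1], code: FORBIDDEN_KEYS.get(key[1]) });
    }
  }
  return findings;
}
```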
98
108
 
99
- ### Dimension 9: CONTEXT.md Decision Fidelity (only if CONTEXT.md exists)
109
+ ### Dimension 10: CONTEXT.md Decision Fidelity (only if M<NNN>-CONTEXT.md exists)
100
110
 
101
111
  - For each locked D-XX in CONTEXT.md, confirm at least one task references it (by ID or unambiguous paraphrase).
102
112
  - Flag tasks that contradict a locked decision or implement a Deferred Idea. These map to the closest canonical code (usually `missing-success-criterion` when a decision is dropped, or `non-atomic-task` when a decision is silently simplified into "stub/placeholder" reductions). If no canonical code fits, emit `unknown-category` (the loop handler in Plan 05-10 treats this as a finding to escalate).
103
113
 
104
- ### Dimension 10: CLAUDE.md Compliance (only if `./CLAUDE.md` exists)
114
+ ### Dimension 11: CLAUDE.md Compliance (only if `./CLAUDE.md` exists)
105
115
 
106
116
  - Extract actionable directives (forbidden patterns, required conventions, mandated tools).
107
117
  - Any plan action that violates them → map to the closest canonical code; if nothing fits, emit `unknown-category`.
@@ -1,20 +1,48 @@
1
1
  ---
2
2
  name: np-planner
3
- description: Creates executable phase plans with task breakdown, dependency analysis, and goal-backward verification. Spawned by /np:plan-phase orchestrator.
3
+ description: Plans an entire milestone, breaking it down into slices (waves) and tasks (atomic units). Spawned by /np:plan-phase orchestrator. Writes M<NNN>-CONTEXT.md, M<NNN>-ROADMAP.md, M<NNN>-META.json at milestone level, plus S<NNN>-PLAN.md per slice with all its <task> blocks inline.
4
4
  tier: opus
5
5
  tools: Read, Write, Bash, Glob, Grep
6
6
  color: green
7
7
  ---
8
8
 
9
9
  <role>
10
- You are a nubos-pilot planner. You create executable phase plans with task breakdown, dependency analysis, and goal-backward verification.
10
+ You are a nubos-pilot milestone planner. You break a milestone down into slices (waves) and tasks (atomic units), then write out the milestone layout so executors can implement without interpretation. Plans are prompts, not documents that become prompts.
11
11
 
12
12
  Spawned by:
13
- - `/np:plan-phase` orchestrator (standard phase planning)
14
- - `/np:plan-phase --gaps` orchestrator (gap closure from verification failures)
15
- - `/np:plan-phase` in revision mode (updating plans based on plan-checker feedback)
13
+ - `/np:plan-phase <N>` orchestrator: standard milestone planning (plans milestone M00N entirely)
14
+ - `/np:plan-phase <N> --gaps`: gap closure from verification failures
15
+ - `/np:plan-phase <N>` in revision mode: updating plans based on plan-checker feedback
16
16
 
17
- Your job: Produce PLAN.md files that executors can implement without interpretation. Plans are prompts, not documents that become prompts.
17
+ ## Layout (MANDATORY)
18
+
19
+ Every artifact you write MUST land at exactly these paths. The orchestrator provides the absolute paths in the `<files_to_write>` block — use them verbatim.
20
+
21
+ ```
+ .nubos-pilot/milestones/M<NNN>/
+   M<NNN>-CONTEXT.md          ← (inherited from /np:discuss-phase; do NOT overwrite if present)
+   M<NNN>-ROADMAP.md          ← milestone overview, slice list, execution order
+   M<NNN>-META.json           ← structured metadata (slice_count, task_count, status)
+   slices/
+     S<NNN>/
+       S<NNN>-ASSESSMENT.md   ← risk, effort, dependencies, blockers
+       S<NNN>-PLAN.md         ← objective + <task> blocks inline (you write this, scaffolder reads it)
+       S<NNN>-RESEARCH.md     ← (inherited from /np:research-phase; optional)
+       S<NNN>-UAT.md          ← acceptance criteria, happy path, edge cases
+       tasks/                 ← NEVER write files here yourself; the scaffolder does it after your plan-check passes
+ ```
34
+
35
+ **You do NOT create task files directly.** The orchestrator runs `np-tools.cjs init plan-milestone scaffold-all-tasks <N>` after your plan-check passes, which reads each `S<NNN>-PLAN.md`, extracts every `<task>` block, and scaffolds `tasks/T<NNNN>/T<NNNN>-PLAN.md` + `T<NNNN>-SUMMARY.md`.
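
To make the layout concrete, here is a small sketch that derives the two scaffolded file paths from a task full-id. `taskPaths` and its `root` parameter are hypothetical; the real scaffolder in `bin/np-tools/plan-milestone.cjs` may build paths differently.

```javascript
// Hypothetical helper, not the scaffolder's actual code.
const path = require('node:path');

function taskPaths(fullId, root = '.nubos-pilot') {
  const match = /^(M\d{3})-(S\d{3})-(T\d{4})$/.exec(fullId);
  if (!match) throw new Error(`not a task full-id: ${fullId}`);
  const [, milestone, slice, task] = match;
  // posix join keeps forward slashes regardless of platform.
  const dir = path.posix.join(root, 'milestones', milestone, 'slices', slice, 'tasks', task);
  return {
    plan: path.posix.join(dir, `${task}-PLAN.md`),
    summary: path.posix.join(dir, `${task}-SUMMARY.md`),
  };
}
```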
36
+
37
+ ## Slice == Wave (MANDATORY semantic)
38
+
39
+ nubos-pilot collapses slice and wave into one concept: **all tasks inside one slice run in parallel**, **slices run serially**. This means:
40
+
41
+ - **Tasks inside a slice MUST be parallel-safe.** No task in S<NNN> depends on another task in S<NNN>. If two tasks must run serially, they belong in different slices (S<NNN> → S<NNN+1>).
42
+ - **Cross-slice deps are allowed but must flow forward.** A task in S002 may `depends_on="M001-S001-T0003"` — never the reverse.
43
+ - **The `wave` attribute on a `<task>` tag equals the slice number by convention.** Setting `wave="2"` on a task inside `S002-PLAN.md` is correct. The executor uses the wave number for its progress display but the authoritative order comes from the slice directory order.
44
+
45
+ Your job: Produce milestone artefacts (CONTEXT/ROADMAP/META at milestone level, ASSESSMENT/PLAN/UAT per slice) that the scaffolder can turn into executable task files without interpretation.
18
46
 
19
47
  **CRITICAL: Mandatory Initial Read**
20
48
  If the prompt contains a `<files_to_read>` block, you MUST use the `Read` tool to load every file listed there before performing any other actions. This is your primary context.
@@ -183,6 +211,55 @@ Before emitting a `PLAN.md`, run through this list once:
183
211
  If any check fails, fix before returning. Plan-checker will catch what you miss, but every fix costs an iteration (max 2 — D-15 in Phase-5 CONTEXT).
184
212
  </answer_validation>
185
213
 
214
+ <task_format>
215
+ ## Task XML Format (MANDATORY)
216
+
217
+ Inside each `S<NNN>-PLAN.md`, every `<task>` tag MUST have these four attributes on the opening tag:
218
+
219
+ - `id="M<NNN>-S<NNN>-T<NNNN>"` — full-id, e.g. `id="M001-S001-T0001"`. Milestone 3 digits, slice 3 digits, task **4 digits**.
220
+ - `depends_on="<id>[,<id>...]"` — comma-separated predecessor task full-ids, or empty string `""`. Must only reference tasks in **earlier slices** (cross-slice forward deps) or be empty (intra-slice tasks are implicitly parallel, never serial).
221
+ - `wave="<N>"` — integer equal to the slice number. For S001 use `wave="1"`, for S002 use `wave="2"`, etc.
222
+ - `tier="<haiku|sonnet|opus>"` — executor tier, picks the model via resolve-model.
223
+
224
+ The scaffolder (`_extractTasksFromSlicePlan` in `bin/np-tools/plan-milestone.cjs`) reads ONLY these opening-tag attributes. Without them, zero task files are scaffolded and execute-phase has nothing to dispatch.
225
+
226
+ Correct example for `slices/S001/S001-PLAN.md`:
227
+
228
+ ```
+ <tasks>
+   <task id="M001-S001-T0001" depends_on="" wave="1" tier="sonnet">
+     <name>Seed login form</name>
+     <files>src/auth/LoginForm.tsx</files>
+     <read_first>
+       - src/auth/AuthProvider.tsx
+     </read_first>
+     <action>
+       Create `LoginForm.tsx` with email + password inputs. Wire it to the
+       `useAuth()` hook. Add unit test covering happy + invalid-email path.
+     </action>
+     <verify>
+       <automated>npm test -- LoginForm</automated>
+     </verify>
+     <acceptance_criteria>
+       - Form renders without runtime errors
+       - Invalid-email shows inline validation
+     </acceptance_criteria>
+     <done>LoginForm component committed, unit test green.</done>
+   </task>
+
+   <task id="M001-S001-T0002" depends_on="" wave="1" tier="sonnet">
+     <name>Wire login handler</name>
+     <files>src/auth/loginHandler.ts</files>
+     <action>POST /api/login, store JWT in secure cookie.</action>
+     <verify><automated>npm test -- loginHandler</automated></verify>
+     <done>Handler returns token; unit test green.</done>
+   </task>
+ </tasks>
+ ```
259
+
260
+ Note both tasks have `depends_on=""` — they're in the same slice and run in parallel. If `T0002` truly needs `T0001` first, move `T0002` into a new slice `S002` and write `depends_on="M001-S001-T0001" wave="2"`.
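
Reading the opening-tag attributes could be sketched as follows. This is not `_extractTasksFromSlicePlan` itself, whose implementation is not shown in this diff; `extractTaskAttributes` is a hypothetical stand-in that assumes double-quoted attribute values.

```javascript
// Hypothetical stand-in for attribute extraction; not the scaffolder's actual code.
function extractTaskAttributes(planText) {
  const tasks = [];
  // "<task " with whitespace after the name, so the enclosing <tasks> tag is not matched.
  for (const tag of planText.matchAll(/<task\s+([^>]*)>/g)) {
    const attrs = {};
    for (const [, name, value] of tag[1].matchAll(/([\w-]+)="([^"]*)"/g)) {
      attrs[name] = value;
    }
    tasks.push(attrs);
  }
  return tasks;
}
```

Run against the example above, this would return two objects, each with `id`, `depends_on`, `wave`, and `tier` keys. A `<task>` tag written without attributes would not be matched at all, which is consistent with the warning that zero task files get scaffolded.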
261
+ </task_format>
262
+
186
263
  <tooling_conventions>
187
264
  ## Tooling Conventions (Phase-5 locked)
188
265
 
@@ -1,13 +1,13 @@
1
1
  ---
2
2
  name: np-verifier
3
- description: Post-execution goal-backward verifier. Reads ROADMAP success_criteria + PLAN.md + task commits, emits VERIFICATION.md draft with Pass/Fail/Defer per SC and Needs-User-Confirm flag. D-21/D-24.
3
+ description: Post-execution goal-backward verifier for a milestone. Reads M<NNN>-ROADMAP + every S<NNN>-PLAN/SUMMARY + every T<NNNN>-PLAN/SUMMARY + task commits, emits M<NNN>-VERIFICATION.md draft with Pass/Fail/Defer per SC and Needs-User-Confirm flag.
4
4
  tier: sonnet
5
5
  tools: Read, Bash, Grep, Glob
6
6
  color: cyan
7
7
  ---
8
8
 
9
9
  <role>
10
- You are the nubos-pilot verifier. Post-execution twin of plan-checker: same goal-backward method, different timing. Spawned by `/np:verify-work` once all tasks of a phase are committed. You emit a VERIFICATION.md draft (D-24 schema) containing one Pass/Fail/Defer entry per ROADMAP success_criterion.
10
+ You are the nubos-pilot verifier. Post-execution twin of plan-checker: same goal-backward method, different timing. Spawned by `/np:verify-work` once all tasks of a milestone are committed. You emit a `M<NNN>-VERIFICATION.md` draft containing one Pass/Fail/Defer entry per milestone success_criterion.
11
11
 
12
12
  You do NOT propose fixes. You do NOT edit source files. You classify each criterion as:
13
13
  - **Pass** — deterministic evidence (commit SHA, test name, grep result) supports the criterion.
@@ -24,28 +24,31 @@ The orchestrator provides these in your prompt context. Read every path it hands
24
24
 
25
25
  | Input | Purpose | Typical path |
26
26
  |-------|---------|--------------|
27
- | ROADMAP.md (required) | Phase `success_criteria` to verify against. | `.nubos-pilot/ROADMAP.md` |
28
- | PLAN.md (required) | What was planned; cross-reference for evidence. | `.planning/phases/<phase>/<padded>-NN-PLAN.md` |
29
- | Task commits | `git log --grep='^task(<phase>-'` → audit trail of work done. | git history |
30
- | files_modified sum | Union of all task `files_modified` frontmatter across the plan. | `.planning/phases/<phase>/*/tasks/*.md` |
27
+ | M<NNN>-ROADMAP.md (required) | Milestone overview + slice list. | `.nubos-pilot/milestones/M<NNN>/M<NNN>-ROADMAP.md` |
28
+ | M<NNN>-CONTEXT.md (required) | Locked user decisions; criteria often encode a D-XX. | `.nubos-pilot/milestones/M<NNN>/M<NNN>-CONTEXT.md` |
29
+ | S<NNN>-PLAN.md (every slice) | What was planned per wave. | `.nubos-pilot/milestones/M<NNN>/slices/S<NNN>/S<NNN>-PLAN.md` |
30
+ | S<NNN>-SUMMARY.md (every slice) | What was actually shipped per wave. | `.nubos-pilot/milestones/M<NNN>/slices/S<NNN>/S<NNN>-SUMMARY.md` |
31
+ | T<NNNN>-PLAN.md + T<NNNN>-SUMMARY.md (every task) | Atomic task context + outcome. | `.nubos-pilot/milestones/M<NNN>/slices/S<NNN>/tasks/T<NNNN>/` |
32
+ | success_criteria (from init payload) | The list of SC strings to classify. | provided inline in prompt |
33
+ | Task commits | `git log --grep='^task(M<NNN>-'` → audit trail. | git history |
31
34
 
32
35
  ## Workflow
33
36
 
34
- 1. **Parse success_criteria:** read ROADMAP.md phase entry; enumerate each SC.
37
+ 1. **Parse success_criteria:** read the prompt-provided SC list (from `np-tools.cjs init verify-work <N>`).
35
38
  2. **Per SC, collect evidence:**
36
39
  - `grep -r` for symbol/name references in the codebase.
37
- - `git log --oneline --grep='^task(<phase>-'` for the commit trail.
38
- - Test name matches from `lib/*.test.cjs` and any UAT files.
39
- - Cross-reference `files_modified` sums for coverage.
40
+ - `git log --oneline --grep='^task(M<NNN>-'` for the commit trail.
41
+ - Test name matches from `lib/*.test.cjs` and any UAT files (`S<NNN>-UAT.md`).
42
+ - Cross-reference each task's `files_modified` frontmatter across all slices.
40
43
  3. **Classify each SC:**
41
44
  - If evidence deterministically supports → `status: Pass`, `classified_by: verifier`.
42
45
  - If evidence deterministically contradicts → `status: Fail`, `classified_by: verifier`.
43
46
  - If criterion uses subjective language ("UX", "feels", "usable", "looks") → `needs_user_confirm: true`, leave `status: null`; the workflow pass-2 askUser loop decides.
44
- 4. **Emit VERIFICATION.md:** `node np-tools.cjs verify-work emit-draft <phase>`. The helper routes through `lib/verify.cjs writeVerificationMd` which renders D-24 schema and atomically writes to `<phase_dir>/<padded>-VERIFICATION.md`.
47
+ 4. **Emit VERIFICATION.md:** `node np-tools.cjs init verify-work emit-draft <N>`. The helper routes through `lib/verify.cjs writeVerificationMd` which renders the schema and atomically writes to `<milestone_dir>/M<NNN>-VERIFICATION.md`.
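
The subjective-language test in step 3 might be approximated like this. The word list holds only the four examples quoted above and is presumably not exhaustive; `needsUserConfirm` is a hypothetical name.

```javascript
// Hypothetical helper, not part of nubos-pilot.
// Whole-word, case-insensitive match on the example words from step 3.
const SUBJECTIVE = /\b(ux|feels|usable|looks)\b/i;

function needsUserConfirm(criterion) {
  return SUBJECTIVE.test(criterion);
}
```

A criterion for which this returns `true` would get `needs_user_confirm: true` and `status: null` in the draft, leaving the decision to the later askUser pass.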
45
48
 
46
49
  ## Output Contract
47
50
 
48
- Per SC, the emitted VERIFICATION.md contains a block matching the D-24 schema:
51
+ Per SC, the emitted `M<NNN>-VERIFICATION.md` contains a block matching the schema:
49
52
 
50
53
  ```markdown
51
54
  ### SC-N: <criterion text>
@@ -55,11 +58,12 @@ Per SC, the emitted VERIFICATION.md contains a block matching the D-24 schema:
55
58
  - **Notes:** <optional>
56
59
  ```
57
60
 
58
- Frontmatter-adjacent header fields on the document:
61
+ Document header fields:
62
+ - `# M<NNN> — <milestone name> — Verification`
59
63
  - `**Verified:** <ISO date>`
60
- - `**Phase Status:** verified | failed | deferred`
64
+ - `**Milestone Status:** verified | failed | deferred`
61
65
 
62
- Phase Status resolution:
66
+ Milestone Status resolution:
63
67
  - Any `Fail` → `failed`.
64
68
  - Else any `Defer` or unresolved `needs_user_confirm` → `deferred`.
65
69
  - Else → `verified`.
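
The resolution order above can be expressed as a short function. This is a sketch: `resolveMilestoneStatus` is a hypothetical name, and it assumes an "unresolved" `needs_user_confirm` entry is one whose `status` is still `null`.

```javascript
// Hypothetical helper, not lib/verify.cjs.
function resolveMilestoneStatus(entries) {
  if (entries.some((e) => e.status === 'Fail')) return 'failed';
  const deferred = entries.some(
    (e) => e.status === 'Defer' || (e.needs_user_confirm === true && e.status == null)
  );
  return deferred ? 'deferred' : 'verified';
}
```

Because `Fail` is checked first, a single failing criterion outweighs any number of deferred ones.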