nubos-pilot 0.1.0
- package/agents/np-ai-researcher.md +140 -0
- package/agents/np-code-fixer.md +363 -0
- package/agents/np-code-reviewer.md +351 -0
- package/agents/np-domain-researcher.md +136 -0
- package/agents/np-eval-auditor.md +167 -0
- package/agents/np-eval-planner.md +153 -0
- package/agents/np-executor.md +72 -0
- package/agents/np-framework-selector.md +171 -0
- package/agents/np-nyquist-auditor.md +185 -0
- package/agents/np-plan-checker.md +165 -0
- package/agents/np-planner.md +199 -0
- package/agents/np-researcher.md +150 -0
- package/agents/np-security-auditor.md +206 -0
- package/agents/np-ui-auditor.md +369 -0
- package/agents/np-ui-checker.md +192 -0
- package/agents/np-ui-researcher.md +324 -0
- package/agents/np-verifier.md +79 -0
- package/bin/check-coverage.cjs +40 -0
- package/bin/check-workflows.cjs +171 -0
- package/bin/check-workflows.test.cjs +208 -0
- package/bin/install.js +500 -0
- package/bin/np-tools/_commands.cjs +70 -0
- package/bin/np-tools/add-tests.cjs +171 -0
- package/bin/np-tools/add-tests.test.cjs +122 -0
- package/bin/np-tools/add-todo.cjs +108 -0
- package/bin/np-tools/add-todo.test.cjs +112 -0
- package/bin/np-tools/agent-skills.cjs +14 -0
- package/bin/np-tools/agent-skills.test.cjs +42 -0
- package/bin/np-tools/ai-integration-phase.cjs +109 -0
- package/bin/np-tools/ai-integration-phase.test.cjs +123 -0
- package/bin/np-tools/askuser.cjs +53 -0
- package/bin/np-tools/askuser.test.cjs +49 -0
- package/bin/np-tools/autonomous.cjs +69 -0
- package/bin/np-tools/autonomous.test.cjs +74 -0
- package/bin/np-tools/checkpoint.cjs +101 -0
- package/bin/np-tools/checkpoint.test.cjs +119 -0
- package/bin/np-tools/code-review.cjs +133 -0
- package/bin/np-tools/code-review.test.cjs +96 -0
- package/bin/np-tools/commit-task.cjs +120 -0
- package/bin/np-tools/commit-task.test.cjs +160 -0
- package/bin/np-tools/commit.cjs +103 -0
- package/bin/np-tools/commit.test.cjs +93 -0
- package/bin/np-tools/config.cjs +101 -0
- package/bin/np-tools/config.test.cjs +71 -0
- package/bin/np-tools/discuss-phase-power.cjs +265 -0
- package/bin/np-tools/discuss-phase-power.test.cjs +242 -0
- package/bin/np-tools/discuss-phase.cjs +132 -0
- package/bin/np-tools/discuss-phase.test.cjs +148 -0
- package/bin/np-tools/dispatch.cjs +116 -0
- package/bin/np-tools/doctor.cjs +242 -0
- package/bin/np-tools/eval-review.cjs +116 -0
- package/bin/np-tools/eval-review.test.cjs +123 -0
- package/bin/np-tools/execute-phase.cjs +182 -0
- package/bin/np-tools/execute-phase.test.cjs +116 -0
- package/bin/np-tools/execute-plan.cjs +124 -0
- package/bin/np-tools/execute-plan.test.cjs +82 -0
- package/bin/np-tools/help.cjs +28 -0
- package/bin/np-tools/help.test.cjs +29 -0
- package/bin/np-tools/init-dispatch.test.cjs +91 -0
- package/bin/np-tools/metrics.cjs +97 -0
- package/bin/np-tools/metrics.test.cjs +188 -0
- package/bin/np-tools/new-milestone.cjs +288 -0
- package/bin/np-tools/new-milestone.test.cjs +166 -0
- package/bin/np-tools/new-project.cjs +284 -0
- package/bin/np-tools/new-project.test.cjs +165 -0
- package/bin/np-tools/next.cjs +7 -0
- package/bin/np-tools/next.test.cjs +30 -0
- package/bin/np-tools/park.cjs +48 -0
- package/bin/np-tools/park.test.cjs +50 -0
- package/bin/np-tools/pause-work.cjs +24 -0
- package/bin/np-tools/pause-work.test.cjs +74 -0
- package/bin/np-tools/phase.cjs +71 -0
- package/bin/np-tools/phase.test.cjs +81 -0
- package/bin/np-tools/plan-diff.cjs +57 -0
- package/bin/np-tools/plan-diff.test.cjs +134 -0
- package/bin/np-tools/plan-milestone-gaps.cjs +115 -0
- package/bin/np-tools/plan-milestone-gaps.test.cjs +122 -0
- package/bin/np-tools/plan-phase.cjs +350 -0
- package/bin/np-tools/plan-phase.test.cjs +263 -0
- package/bin/np-tools/progress.cjs +7 -0
- package/bin/np-tools/progress.test.cjs +44 -0
- package/bin/np-tools/queue.cjs +213 -0
- package/bin/np-tools/research-phase.cjs +144 -0
- package/bin/np-tools/research-phase.test.cjs +154 -0
- package/bin/np-tools/reset-slice.cjs +17 -0
- package/bin/np-tools/reset-slice.test.cjs +96 -0
- package/bin/np-tools/resolve-model.cjs +110 -0
- package/bin/np-tools/resolve-model.test.cjs +200 -0
- package/bin/np-tools/resume-work.cjs +76 -0
- package/bin/np-tools/resume-work.test.cjs +91 -0
- package/bin/np-tools/skip.cjs +48 -0
- package/bin/np-tools/skip.test.cjs +66 -0
- package/bin/np-tools/slug.cjs +34 -0
- package/bin/np-tools/slug.test.cjs +46 -0
- package/bin/np-tools/state.cjs +16 -0
- package/bin/np-tools/state.test.cjs +40 -0
- package/bin/np-tools/stats.cjs +151 -0
- package/bin/np-tools/stats.test.cjs +118 -0
- package/bin/np-tools/triage.cjs +128 -0
- package/bin/np-tools/ui-phase.cjs +108 -0
- package/bin/np-tools/ui-phase.test.cjs +121 -0
- package/bin/np-tools/ui-review.cjs +108 -0
- package/bin/np-tools/ui-review.test.cjs +120 -0
- package/bin/np-tools/undo-task.cjs +31 -0
- package/bin/np-tools/undo-task.test.cjs +117 -0
- package/bin/np-tools/undo.cjs +43 -0
- package/bin/np-tools/undo.test.cjs +120 -0
- package/bin/np-tools/unpark.cjs +48 -0
- package/bin/np-tools/unpark.test.cjs +50 -0
- package/bin/np-tools/verify-work.cjs +186 -0
- package/bin/np-tools/verify-work.test.cjs +97 -0
- package/docs/adr/0001-no-daemon-invariant.md +82 -0
- package/docs/adr/0002-zero-runtime-dependencies.md +90 -0
- package/docs/adr/0003-max-six-unit-types.md +85 -0
- package/docs/adr/0004-atomic-commit-per-unit.md +102 -0
- package/docs/adr/0005-three-orthogonal-file-trees.md +98 -0
- package/docs/adr/0006-yaml-dependency-amendment.md +60 -0
- package/docs/adr/README.md +27 -0
- package/docs/agent-frontmatter-schema.md +84 -0
- package/docs/phase-artifact-schemas.md +292 -0
- package/docs/phase-directory-layout.md +82 -0
- package/lib/__tests__/README.md +1 -0
- package/lib/agents.cjs +98 -0
- package/lib/agents.test.cjs +286 -0
- package/lib/askuser.cjs +36 -0
- package/lib/askuser.test.cjs +310 -0
- package/lib/checkpoint.cjs +135 -0
- package/lib/checkpoint.test.cjs +184 -0
- package/lib/core.cjs +165 -0
- package/lib/core.test.cjs +405 -0
- package/lib/fixtures/README.md +1 -0
- package/lib/fixtures/phase-tree/README.md +1 -0
- package/lib/fixtures/plans/cycle/PLAN.md +16 -0
- package/lib/fixtures/plans/cycle/tasks/T-01.md +20 -0
- package/lib/fixtures/plans/cycle/tasks/T-02.md +20 -0
- package/lib/fixtures/plans/cycle/tasks/T-03.md +20 -0
- package/lib/fixtures/plans/linear/PLAN.md +16 -0
- package/lib/fixtures/plans/linear/tasks/T-01.md +20 -0
- package/lib/fixtures/plans/linear/tasks/T-02.md +20 -0
- package/lib/fixtures/plans/linear/tasks/T-03.md +20 -0
- package/lib/fixtures/plans/parallel/PLAN.md +16 -0
- package/lib/fixtures/plans/parallel/tasks/T-01.md +20 -0
- package/lib/fixtures/plans/parallel/tasks/T-02.md +20 -0
- package/lib/fixtures/plans/parallel/tasks/T-03.md +20 -0
- package/lib/fixtures/plans/wave-conflict/PLAN.md +16 -0
- package/lib/fixtures/plans/wave-conflict/tasks/T-01.md +20 -0
- package/lib/fixtures/plans/wave-conflict/tasks/T-02.md +20 -0
- package/lib/fixtures/roadmap/ROADMAP-malformed.md +3 -0
- package/lib/fixtures/roadmap/ROADMAP-minimal.md +51 -0
- package/lib/fixtures/roadmap/roadmap-malformed.yaml +7 -0
- package/lib/fixtures/roadmap/roadmap-minimal.yaml +40 -0
- package/lib/fixtures/roadmap/roadmap-ten-phases.yaml +101 -0
- package/lib/fixtures/templates/phase-context.md +6 -0
- package/lib/fixtures/templates/plan-skeleton.md +6 -0
- package/lib/frontmatter.cjs +251 -0
- package/lib/frontmatter.test.cjs +177 -0
- package/lib/gaps.cjs +197 -0
- package/lib/gaps.test.cjs +200 -0
- package/lib/git.cjs +207 -0
- package/lib/git.test.cjs +305 -0
- package/lib/install/agents-md.cjs +77 -0
- package/lib/install/backup.cjs +70 -0
- package/lib/install/codex-toml.cjs +440 -0
- package/lib/install/managed-block.cjs +30 -0
- package/lib/install/manifest.cjs +148 -0
- package/lib/install/mcp-writer.cjs +127 -0
- package/lib/install/runtime-detect.cjs +44 -0
- package/lib/install/staging.cjs +149 -0
- package/lib/metrics-aggregate.cjs +229 -0
- package/lib/metrics-aggregate.test.cjs +192 -0
- package/lib/metrics.cjs +120 -0
- package/lib/metrics.test.cjs +182 -0
- package/lib/model-aliases.regression.test.cjs +16 -0
- package/lib/model-profiles.cjs +42 -0
- package/lib/model-profiles.test.cjs +61 -0
- package/lib/next.cjs +236 -0
- package/lib/next.test.cjs +194 -0
- package/lib/phase.cjs +95 -0
- package/lib/phase.test.cjs +189 -0
- package/lib/plan-checker-contract.test.cjs +72 -0
- package/lib/plan-diff.cjs +173 -0
- package/lib/plan-diff.test.cjs +217 -0
- package/lib/plan.cjs +85 -0
- package/lib/plan.test.cjs +263 -0
- package/lib/progress.cjs +95 -0
- package/lib/progress.test.cjs +116 -0
- package/lib/researcher-contract.test.cjs +61 -0
- package/lib/roadmap-render.cjs +206 -0
- package/lib/roadmap-render.test.cjs +121 -0
- package/lib/roadmap.cjs +416 -0
- package/lib/roadmap.test.cjs +371 -0
- package/lib/runtime/_contract.test.cjs +61 -0
- package/lib/runtime/_readline.cjs +119 -0
- package/lib/runtime/_readline.test.cjs +126 -0
- package/lib/runtime/claude.cjs +48 -0
- package/lib/runtime/claude.test.cjs +101 -0
- package/lib/runtime/codex.cjs +35 -0
- package/lib/runtime/codex.test.cjs +114 -0
- package/lib/runtime/gemini.cjs +35 -0
- package/lib/runtime/gemini.test.cjs +109 -0
- package/lib/runtime/index.cjs +49 -0
- package/lib/runtime/index.test.cjs +181 -0
- package/lib/runtime/opencode.cjs +35 -0
- package/lib/runtime/opencode.test.cjs +124 -0
- package/lib/state.cjs +205 -0
- package/lib/state.test.cjs +264 -0
- package/lib/surface-audit.test.cjs +46 -0
- package/lib/tasks.cjs +327 -0
- package/lib/tasks.test.cjs +389 -0
- package/lib/template.cjs +66 -0
- package/lib/template.test.cjs +159 -0
- package/lib/undo.cjs +179 -0
- package/lib/undo.test.cjs +261 -0
- package/lib/verify.cjs +116 -0
- package/lib/verify.test.cjs +187 -0
- package/np-tools.cjs +303 -0
- package/package.json +39 -0
- package/templates/AI-SPEC.md +90 -0
- package/templates/CONTEXT.md +32 -0
- package/templates/PLAN.md +69 -0
- package/templates/PROJECT.md +60 -0
- package/templates/REQUIREMENTS.md +38 -0
- package/templates/SECURITY.md +61 -0
- package/templates/UI-SPEC.md +64 -0
- package/templates/VALIDATION.md +76 -0
- package/templates/claude/payload/README.md +11 -0
- package/templates/opencode/opencode.json +6 -0
- package/templates/opencode/payload/AGENTS.md +9 -0
- package/workflows/add-backlog.md +212 -0
- package/workflows/add-tests.md +69 -0
- package/workflows/add-todo.md +222 -0
- package/workflows/ai-integration-phase.md +230 -0
- package/workflows/autonomous.md +94 -0
- package/workflows/cleanup.md +325 -0
- package/workflows/code-review-fix.md +435 -0
- package/workflows/code-review.md +447 -0
- package/workflows/discuss-phase-assumptions.md +269 -0
- package/workflows/discuss-phase-power.md +139 -0
- package/workflows/discuss-phase.md +386 -0
- package/workflows/dispatch.md +9 -0
- package/workflows/doctor.md +10 -0
- package/workflows/eval-review.md +243 -0
- package/workflows/execute-phase.md +142 -0
- package/workflows/execute-plan.md +82 -0
- package/workflows/help.md +8 -0
- package/workflows/new-milestone.md +166 -0
- package/workflows/new-project.md +213 -0
- package/workflows/next.md +8 -0
- package/workflows/note.md +244 -0
- package/workflows/park.md +29 -0
- package/workflows/pause-work.md +34 -0
- package/workflows/plan-milestone-gaps.md +233 -0
- package/workflows/plan-phase.md +351 -0
- package/workflows/progress.md +8 -0
- package/workflows/queue.md +9 -0
- package/workflows/research-phase.md +327 -0
- package/workflows/reset-slice.md +39 -0
- package/workflows/resume-work.md +79 -0
- package/workflows/review.md +489 -0
- package/workflows/secure-phase.md +209 -0
- package/workflows/session-report.md +243 -0
- package/workflows/skip.md +29 -0
- package/workflows/state.md +7 -0
- package/workflows/stats.md +170 -0
- package/workflows/thread.md +214 -0
- package/workflows/triage.md +9 -0
- package/workflows/ui-phase.md +246 -0
- package/workflows/ui-review.md +222 -0
- package/workflows/undo-task.md +42 -0
- package/workflows/undo.md +55 -0
- package/workflows/unpark.md +29 -0
- package/workflows/validate-phase.md +231 -0
- package/workflows/verify-work.md +83 -0
@@ -0,0 +1,386 @@
---
command: np:discuss-phase
description: Adaptive interview to capture phase implementation decisions; writes CONTEXT.md.
---

# np:discuss-phase

Extract implementation decisions that downstream agents (researcher, planner)
need. Minimum Phase-5 scope: adaptive askUser()-based interview covering the
nine context areas and a single CONTEXT.md render.

The `--assumptions` flag routes to `workflows/discuss-phase-assumptions.md`
(lighter-weight codebase-first mode). The `--power` flag is owned by Plan
05-08 and is not implemented here.

**Scope note (Phase 5):** No advisor subagent spawn, no `--batch`, no
`--analyze`, no `--chain` auto-advance. Those are deferred; this
workflow delivers PLAN-01 and nothing beyond it.

## Initialize

```bash
INIT=$(node np-tools.cjs init discuss-phase "$PHASE")
if [[ "$INIT" == @file:* ]]; then INIT=$(cat "${INIT#@file:}"); fi
```

Parse JSON for: `phase_number`, `padded`, `phase_dir`, `phase_name`,
`phase_slug`, `has_context`, `goal`, `requirements`, `agent_skills`, `mode`.

If the user passed `--assumptions`, route to
`workflows/discuss-phase-assumptions.md` and exit this workflow.

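The `@file:` indirection and field list in the Initialize section can be sketched in Node as follows. This is a minimal illustration, not code from the package: `resolveInitPayload`, `missingInitFields`, and `REQUIRED_FIELDS` are hypothetical names, and the only behaviour assumed is what the bash snippet shows (a payload that is either inline JSON or an `@file:<path>` pointer to it).

```javascript
const fs = require("node:fs");

// ASSUMPTION: hypothetical helper mirroring the bash `@file:` check.
// The init output is either inline JSON or "@file:<path>" pointing at JSON.
function resolveInitPayload(raw) {
  const prefix = "@file:";
  const trimmed = raw.trim();
  const text = trimmed.startsWith(prefix)
    ? fs.readFileSync(trimmed.slice(prefix.length), "utf-8")
    : trimmed;
  return JSON.parse(text);
}

// Field names taken from the "Parse JSON for" list in this workflow.
const REQUIRED_FIELDS = [
  "phase_number", "padded", "phase_dir", "phase_name", "phase_slug",
  "has_context", "goal", "requirements", "agent_skills", "mode",
];

// ASSUMPTION: hypothetical check that reports absent fields up front.
function missingInitFields(init) {
  return REQUIRED_FIELDS.filter((key) => !(key in init));
}
```

Checking for missing fields before the interview starts keeps a malformed payload from surfacing later as an empty prompt.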
## Purpose

<purpose>
Extract implementation decisions that downstream agents need. Analyze the
phase to identify gray areas, let the user choose what to discuss, then
deep-dive each selected area until satisfied.

You are a thinking partner, not an interviewer. The user is the visionary —
you are the builder. Your job is to capture decisions that will guide
research and planning, not to figure out implementation yourself.
</purpose>

## Downstream Awareness

<downstream_awareness>
**CONTEXT.md feeds into:**

1. **researcher** — Reads CONTEXT.md to know WHAT to research
   - "User wants card-based layout" → researcher investigates card component patterns
   - "Infinite scroll decided" → researcher looks into virtualization libraries

2. **planner** — Reads CONTEXT.md to know WHAT decisions are locked
   - "Pull-to-refresh on mobile" → planner includes that in task specs
   - "Claude's Discretion: loading skeleton" → planner can decide approach

**Your job:** Capture decisions clearly enough that downstream agents can act
on them without asking the user again.

**Not your job:** Figure out HOW to implement. That's what research and
planning do with the decisions you capture.
</downstream_awareness>

## Philosophy

<philosophy>
**User = founder/visionary. Claude = builder.**

The user knows:
- How they imagine it working
- What it should look/feel like
- What's essential vs nice-to-have
- Specific behaviors or references they have in mind

The user doesn't know (and shouldn't be asked):
- Codebase patterns (researcher reads the code)
- Technical risks (researcher identifies these)
- Implementation approach (planner figures this out)
- Success metrics (inferred from the work)

Ask about vision and implementation choices. Capture decisions for downstream
agents.
</philosophy>

## Scope Guardrail

<scope_guardrail>
**CRITICAL: No scope creep.**

The phase boundary comes from ROADMAP.md and is FIXED. Discussion clarifies
HOW to implement what's scoped, never WHETHER to add new capabilities.

**Allowed (clarifying ambiguity):**
- "How should posts be displayed?" (layout, density, info shown)
- "What happens on empty state?" (within the feature)
- "Pull to refresh or manual?" (behavior choice)

**Not allowed (scope creep):**
- "Should we also add comments?" (new capability)
- "What about search/filtering?" (new capability)
- "Maybe include bookmarking?" (new capability)

**The heuristic:** Does this clarify how we implement what's already in the
phase, or does it add a new capability that could be its own phase?

**When user suggests scope creep:**
```
"[Feature X] would be a new capability — that's its own phase.
Want me to note it for the roadmap backlog?

For now, let's focus on [phase domain]."
```

Capture the idea in a "Deferred Ideas" section. Don't lose it, don't act on it.
</scope_guardrail>

## Answer Validation

<answer_validation>
**IMPORTANT: Answer validation** — After every interactive prompt, check if the
response is empty or whitespace-only. If so:
1. Retry the question once with the same parameters
2. If still empty, present the options as a plain-text numbered list and ask
   the user to type their choice number
Never proceed with an empty answer.

**Text mode (`workflow.text_mode: true` in config or `--text` flag):**
When text mode is active, **do not use `np-tools.cjs askuser` at all**.
Instead, present every question as a plain-text numbered list and ask the
user to type their choice number. This is required for Claude Code remote
sessions (`/rc` mode) where the Claude App cannot forward TUI menu selections
back to the host.

Enable text mode:
- Per-session: pass `--text` flag
- Per-project: `np-tools.cjs config-set workflow.text_mode true`

Text mode applies to ALL workflows in the session, not just discuss-phase.
</answer_validation>

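The retry-then-fallback rule in the answer-validation block can be sketched as a small wrapper. This is an illustration under stated assumptions, not the package's implementation: `ask` is a hypothetical synchronous callback that shows one question and returns the raw answer, and `askValidated` is a made-up name. Text mode is not modelled here.

```javascript
// ASSUMPTION: `ask(question)` is a hypothetical synchronous callback that
// presents one question and returns the user's raw answer as a string.
function askValidated(ask, question) {
  // First attempt plus exactly one retry with the same parameters.
  for (let attempt = 0; attempt < 2; attempt += 1) {
    const answer = String(ask(question) ?? "").trim();
    if (answer !== "") return answer;
  }

  // Still empty: fall back to a plain-text numbered list of the options.
  const options = question.options ?? [];
  const menu = options.map((option, i) => `${i + 1}. ${option}`).join("\n");
  const typed = String(
    ask({
      type: "input",
      prompt: `${question.prompt}\n${menu}\nType the number of your choice:`,
    }) ?? ""
  ).trim();

  const index = Number.parseInt(typed, 10) - 1;
  if (!Number.isInteger(index) || index < 0 || index >= options.length) {
    // Never proceed with an empty (or unusable) answer.
    throw new Error("Empty or invalid answer; refusing to proceed.");
  }
  return options[index];
}
```

Throwing on the final failure, rather than returning a default, matches the rule that the workflow must never continue with an empty answer.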
## Process

### Step 1: Guard against existing CONTEXT.md

If `has_context` is `true`, ask the user how to proceed:

```bash
node np-tools.cjs askuser --json '{
  "type": "select",
  "prompt": "Phase '"$PHASE"' already has a CONTEXT.md. What do you want to do?",
  "options": [
    "Overwrite existing CONTEXT.md",
    "Append update section",
    "Abort"
  ]
}'
```

- **Overwrite** → preserve the prior file as `{padded}-CONTEXT.archive.md`
  before writing the new one:
  ```bash
  mv "$PHASE_DIR/$PADDED-CONTEXT.md" "$PHASE_DIR/$PADDED-CONTEXT.archive.md"
  ```
- **Append update section** → skip the archive move; the write step below
  appends a fresh `## Update — <date>` section instead of replacing content.
- **Abort** → exit the workflow. No file changes.

If `has_context` is `false`, continue directly to Step 2.

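The three branches of Step 1 can be sketched as one function that performs the archive move and reports which write mode the later render step should use. `prepareContextWrite` and the returned `mode` values are hypothetical names introduced for illustration; the option strings and file names are the ones listed in this step.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// ASSUMPTION: hypothetical helper; `choice` is the option string returned
// by the select prompt in Step 1.
function prepareContextWrite(choice, phaseDir, padded) {
  const contextPath = path.join(phaseDir, `${padded}-CONTEXT.md`);
  switch (choice) {
    case "Overwrite existing CONTEXT.md": {
      // Preserve the prior file before a fresh one is written.
      const archivePath = path.join(phaseDir, `${padded}-CONTEXT.archive.md`);
      fs.renameSync(contextPath, archivePath);
      return { mode: "overwrite", contextPath };
    }
    case "Append update section":
      // No archive move; the write step appends an update section.
      return { mode: "append", contextPath };
    case "Abort":
      // No file changes.
      return { mode: "abort", contextPath };
    default:
      throw new Error(`Unexpected choice: ${choice}`);
  }
}
```

Returning the mode explicitly makes it harder for the write step to overwrite a file the user asked to append to.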
### Step 2: Confirm phase goal

Read `goal` and `requirements` from INIT. Confirm the phase goal is what the
user expects (users sometimes discover the roadmap goal is stale before
discussion starts):

```bash
node np-tools.cjs askuser --json '{
  "type": "confirm",
  "prompt": "ROADMAP goal for phase '"$PHASE"': \"'"$GOAL"'\". Still accurate?",
  "default": true
}'
```

If the user says `no`, capture the refined goal with a free-text input call
and record it for the `<domain>` section of CONTEXT.md:

```bash
node np-tools.cjs askuser --json '{
  "type": "input",
  "prompt": "Refined goal for phase '"$PHASE"':"
}'
```

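The shell snippets in this step splice `$PHASE` and `$GOAL` directly into a JSON literal, so a goal containing a double quote or backslash would yield invalid JSON. One way to avoid that, sketched here as an assumption rather than the package's approach, is to build the payload with `JSON.stringify`. `confirmGoalPayload` is a hypothetical name; the payload shape is the one shown above.

```javascript
// ASSUMPTION: hypothetical helper that builds the Step 2 confirm payload.
// JSON.stringify escapes quotes and backslashes inside `goal`.
function confirmGoalPayload(phase, goal) {
  return JSON.stringify({
    type: "confirm",
    prompt: `ROADMAP goal for phase ${phase}: "${goal}". Still accurate?`,
    default: true,
  });
}

// The resulting string could then be passed as the single argument after
// `--json`, for example through an argv array rather than a shell string.
```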
### Step 3: Present phase-specific gray areas

Based on the phase goal + domain, generate 3–4 concrete gray areas (not
generic UI/UX labels — specific decisions like "Session handling", "Error
responses", "Multi-device policy"). Present them via a multi-select:

```bash
node np-tools.cjs askuser --json '{
  "type": "multiselect",
  "prompt": "Which areas do you want to discuss for '"$PHASE_NAME"'?",
  "options": [
    "<area 1>",
    "<area 2>",
    "<area 3>",
    "<area 4>"
  ]
}'
```

Per the scope-guardrail block above: options must clarify HOW to build what
is in scope — never introduce new capabilities.

### Step 4: Discuss each selected area

For each selected area, ask 2–4 focused questions. Every prompt routes
through `np-tools.cjs askuser` — never through the runtime-native structured
question tool directly (SC-5 enforcement from Phase 3).

Per area, the recommended flow is:

```bash
# Decision question (typed as select when options exist)
node np-tools.cjs askuser --json '{
  "type": "select",
  "prompt": "For <area>: <specific decision>?",
  "options": ["<choice A>", "<choice B>", "<choice C>"]
}'

# Follow-up free-text capture when the user picks "Other" or needs nuance
node np-tools.cjs askuser --json '{
  "type": "input",
  "prompt": "Anything specific about <area> downstream agents must know?"
}'

# Continuation gate
node np-tools.cjs askuser --json '{
  "type": "select",
  "prompt": "More questions about <area>, or move on?",
  "options": ["More questions", "Next area"]
}'
```

After all selected areas are covered:

```bash
node np-tools.cjs askuser --json '{
  "type": "select",
  "prompt": "We have discussed <areas>. Anything else before we write CONTEXT.md?",
  "options": ["Explore more gray areas", "I am ready for CONTEXT.md"]
}'
```

If the user chooses to explore more, loop back to Step 3 with 2–4 fresh
candidate areas. Otherwise proceed to Step 5.

**Canonical ref accumulation.** When the user references a doc/ADR/spec
during any answer ("read adr-014", "per browse-spec.md"), read it and add
its full relative path to the canonical-refs accumulator — these are the
most important refs because they come straight from the user.

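The canonical-refs accumulator described above can be sketched as a small de-duplicating collection of project-relative paths. The workflow does not specify its data structure, so `createRefAccumulator` and the choice of a `Set` are assumptions made for illustration.

```javascript
const path = require("node:path");

// ASSUMPTION: hypothetical accumulator; stores each referenced document once,
// as a forward-slash path relative to the project root.
function createRefAccumulator(projectRoot) {
  const root = path.resolve(projectRoot);
  const refs = new Set();
  return {
    add(filePath) {
      const relative = path.relative(root, path.resolve(root, filePath));
      refs.add(relative.split(path.sep).join("/"));
    },
    list() {
      return [...refs];
    },
  };
}
```

Normalising to a relative path on insertion means the same document mentioned twice, once by absolute and once by relative path, is recorded a single time.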
### Step 5: Capture remaining CONTEXT.md sections

Collect short free-text inputs for the remaining required sections before
rendering:

```bash
node np-tools.cjs askuser --json '{
  "type": "input",
  "prompt": "Canonical refs (paths to ADRs/specs/docs downstream agents must read) — comma separated or \"none\":"
}'
```

```bash
node np-tools.cjs askuser --json '{
  "type": "input",
  "prompt": "Reusable code / existing assets relevant to this phase — or \"none\":"
}'
```

```bash
node np-tools.cjs askuser --json '{
  "type": "input",
  "prompt": "Specific references (\"I want it like X\" moments) — or \"none\":"
}'
```

```bash
node np-tools.cjs askuser --json '{
  "type": "input",
  "prompt": "Deferred ideas (things we noted but belong in later phases) — or \"none\":"
}'
```

```bash
node np-tools.cjs askuser --json '{
  "type": "input",
  "prompt": "Claude\u2019s Discretion — areas where you want Claude to decide without asking:"
}'
```

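The canonical-refs prompt accepts either a comma-separated list or the literal `none`. A minimal sketch of turning that answer into a list follows; `parseListAnswer` is a hypothetical name, and treating `none` case-insensitively is an assumption not stated in the workflow.

```javascript
// ASSUMPTION: hypothetical parser for "comma separated or none" answers.
function parseListAnswer(answer) {
  const text = answer.trim();
  // "none" (assumed case-insensitive) or an empty answer means no entries.
  if (text === "" || text.toLowerCase() === "none") return [];
  return text
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry !== "");
}
```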
### Step 6: Render CONTEXT.md

Render `templates/CONTEXT.md` with `lib/template.cjs`. The render call is
fail-loud on unknown placeholders, so the variables object below must match
the template's `{{var}}` keys exactly.

```bash
PHASE_DIR=$(echo "$INIT" | node -e 'let d="";process.stdin.on("data",c=>d+=c).on("end",()=>{console.log(JSON.parse(d).phase_dir)})')
PADDED=$(echo "$INIT" | node -e 'let d="";process.stdin.on("data",c=>d+=c).on("end",()=>{console.log(JSON.parse(d).padded)})')
mkdir -p "$PHASE_DIR"

node -e '
  const { render } = require("./lib/template.cjs");
  const fs = require("node:fs");
  const tpl = fs.readFileSync("templates/CONTEXT.md", "utf-8");
  const vars = JSON.parse(process.argv[1]);
  process.stdout.write(render(tpl, vars));
' "$VARS_JSON" > "$PHASE_DIR/$PADDED-CONTEXT.md"
```

`$VARS_JSON` is the JSON-serialised accumulator from Steps 2–5:

```jsonc
{
  "phase_number": "5",
  "phase_name": "...",
  "goal": "...",
  "domain": "...",
  "decisions": "...",         // collected from Step 4
  "canonical_refs": "...",
  "code_context": "...",
  "specifics": "...",
  "deferred": "...",
  "date": "2026-04-15"
}
```

If `templates/CONTEXT.md` lacks a key, `render()` throws
`NubosPilotError('template-missing-key', …)` — the workflow must not swallow
that error. Fix the template or the accumulator, don't mask the failure.

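To illustrate what "fail-loud" rendering means here, the following is a minimal sketch of a strict `{{var}}` renderer. It is not the code in `lib/template.cjs`, whose internals are not shown in this diff: `renderStrict` is a hypothetical name, and a plain `Error` carrying a `code` property stands in for `NubosPilotError`.

```javascript
// ASSUMPTION: illustrative stand-in for a fail-loud renderer; the real
// lib/template.cjs may differ in placeholder syntax and error type.
function renderStrict(template, vars) {
  return template.replace(/\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g, (match, key) => {
    if (!Object.prototype.hasOwnProperty.call(vars, key)) {
      // Refuse to emit a half-filled document.
      const error = new Error(`template-missing-key: ${key}`);
      error.code = "template-missing-key";
      throw error;
    }
    return String(vars[key]);
  });
}
```

Throwing instead of leaving the placeholder in place keeps a CONTEXT.md with unresolved `{{...}}` markers from reaching the researcher or planner.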
### Step 7: Commit respecting config.commit_docs

```bash
COMMIT_DOCS=$(node np-tools.cjs config-get workflow.commit_docs 2>/dev/null || echo "true")
if [[ "$COMMIT_DOCS" == "true" ]]; then
  git add "$PHASE_DIR/$PADDED-CONTEXT.md"
  git commit -m "docs($PADDED): capture phase context"
fi
```

If `workflow.commit_docs` is false, leave the file uncommitted — the user is
opting into manual commit gating.

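The commit gate in Step 7 can also be expressed as a pure function that returns the git invocations to run, which makes the gating easy to check without touching a repository. `planCommit` is a hypothetical name; the commands and commit message are the ones in the bash block above.

```javascript
// ASSUMPTION: hypothetical planner; returns argv arrays instead of running git.
// `commitDocs` is the string printed by `config-get workflow.commit_docs`.
function planCommit(commitDocs, phaseDir, padded) {
  if (String(commitDocs).trim() !== "true") {
    // Manual commit gating: leave the file uncommitted.
    return [];
  }
  const file = `${phaseDir}/${padded}-CONTEXT.md`;
  return [
    ["git", "add", file],
    ["git", "commit", "-m", `docs(${padded}): capture phase context`],
  ];
}
```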
### Step 8: Confirm and next steps

```bash
node np-tools.cjs askuser --json '{
  "type": "confirm",
  "prompt": "CONTEXT.md written at '"$PHASE_DIR"'/'"$PADDED"'-CONTEXT.md. Run np:plan-phase '"$PHASE"' now?",
  "default": true
}'
```

Yes → invoke `np:plan-phase $PHASE` via the runtime's standard workflow
dispatcher. No → print the manual next-step hint:

```
Next: /np:plan-phase $PHASE
```

## Success Criteria

- `{phase_dir}/{padded}-CONTEXT.md` exists with all six required sections
  (domain, decisions, canonical_refs, code_context, specifics, deferred).
- Every interactive prompt went through `np-tools.cjs askuser`; zero direct
  calls to the runtime-native structured question tool.
- If prior CONTEXT.md existed, user explicitly chose overwrite / append /
  abort — no silent overwrite.
- Deferred ideas preserved verbatim for future phases.
- Commit (if `workflow.commit_docs=true`) landed via
  `docs(PADDED): capture phase context`.
@@ -0,0 +1,9 @@
# np:dispatch

State-router for the current phase. Reads state → determines next action
(discuss / plan / execute / verify) → delegates via `Skill()` call.
Pass `--force` or `--action=<name>` to override; `--action` takes precedence over the recommendation.

```bash
node np-tools.cjs dispatch "$@"
```
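The routing rule for `np:dispatch` (recommend an action from state, let `--action` override it) can be sketched as below. The diff does not show how the recommendation is derived, so the artifact flags (`hasContext`, `hasPlan`, `executed`) and the ordering in `recommendAction` are assumptions made purely for illustration; `--force` is not modelled because its semantics are not described here.

```javascript
const ACTIONS = ["discuss", "plan", "execute", "verify"];

// ASSUMPTION: hypothetical state flags and ordering; the real router in
// bin/np-tools/dispatch.cjs may use different signals.
function recommendAction(state) {
  if (!state.hasContext) return "discuss";
  if (!state.hasPlan) return "plan";
  if (!state.executed) return "execute";
  return "verify";
}

// An explicit --action value takes precedence over the recommendation.
function chooseAction(state, flags = {}) {
  if (flags.action !== undefined) {
    if (!ACTIONS.includes(flags.action)) {
      throw new Error(`Unknown action: ${flags.action}`);
    }
    return flags.action;
  }
  return recommendAction(state);
}
```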
@@ -0,0 +1,10 @@
# np:doctor

Run a 5-check integrity scan of the nubos-pilot install (manifest integrity,
version mismatch, missing hooks, trapped Codex `[features]`, broken askUser).
Use `--fix` to apply auto-safe fixes; anything touching user files outside the
manifest will prompt via `askUser()` (SC-5).

```bash
node np-tools.cjs doctor "$@"
```
@@ -0,0 +1,243 @@
---
command: np:eval-review
description: Retroactive evaluation-coverage audit of a completed AI phase. Spawns np-eval-auditor to score each planned eval dimension as COVERED/PARTIAL/MISSING against AI-SPEC.md (if present) or general best-practice rubric. Produces EVAL-REVIEW.md.
---

# np:eval-review

Produces `{phase_dir}/{padded}-EVAL-REVIEW.md` via a single `np-eval-auditor`
spawn that audits the phase's implemented AI system against its
evaluation plan. Runs AFTER `/np:execute-phase` has landed code — the
audit needs a SUMMARY.md to know what was built.

Three states (resolved by the init payload, not by this workflow):

- **State A — spec-conformance audit.** `AI-SPEC.md` and `SUMMARY.md`
  both present. The auditor scores the implementation against the
  planned eval dimensions, rubrics, guardrails, and monitoring plan.
- **State B — retroactive general audit.** `SUMMARY.md` present but no
  `AI-SPEC.md`. The auditor scores against the generic best-practice
  checklist. The output file header labels the mode explicitly
  (Pitfall 10 parallel — avoids silent drift between spec-backed and
  spec-less reviews).
- **State C — abort.** No `SUMMARY.md`. The workflow exits with a
  clear message before spawning the auditor — there is nothing to
  audit until the phase has been executed.

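The three states reduce to a two-flag decision. The workflow says the init payload resolves the state, so the function below is only a sketch of the rule as described, with `resolveEvalReviewState` a hypothetical name; the flag names echo the `summary_present` and `has_ai_spec` fields of the payload.

```javascript
// ASSUMPTION: hypothetical mirror of the A/B/C rule described above;
// the actual resolution happens inside `np-tools.cjs init eval-review`.
function resolveEvalReviewState({ summaryPresent, hasAiSpec }) {
  // Without SUMMARY.md there is nothing to audit, regardless of AI-SPEC.md.
  if (!summaryPresent) return "C";
  // With SUMMARY.md, the presence of AI-SPEC.md selects the audit mode.
  return hasAiSpec ? "A" : "B";
}
```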
|
|
27
|
+
The single Task-spawn site is wrapped in the Plan 09-05 metrics +
|
|
28
|
+
resolve-model pattern (D-06, D-01). `RUNTIME` is detected once at the
|
|
29
|
+
top of the bash block and re-used by the `metrics record` call.

## Initialize

```bash
PHASE="$1"
if [[ -z "$PHASE" ]]; then
  echo "Usage: /np:eval-review <phase-number>" >&2
  exit 2
fi

INIT=$(node np-tools.cjs init eval-review "$PHASE")
if [[ "$INIT" == @file:* ]]; then INIT=$(cat "${INIT#@file:}"); fi
RUNTIME=$(node -e "console.log(require('./lib/runtime/index.cjs').detect().runtime)")
```
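
The `@file:` check above implies that the init output may be either inline JSON or a pointer to a file that holds the JSON. A minimal sketch of that indirection follows; `resolve_payload` is a hypothetical helper name, and the sample payloads are made up for the demonstration.

```bash
#!/usr/bin/env bash
# resolve_payload is a hypothetical helper. The "@file:" prefix is the
# convention used in the Initialize block; everything else is illustrative.
resolve_payload() {
  local raw="$1"
  if [[ "$raw" == @file:* ]]; then
    cat "${raw#@file:}"     # strip the prefix and read the referenced file
  else
    printf '%s\n' "$raw"    # payload is already inline
  fi
}

tmp=$(mktemp)
printf '{"state":"A"}\n' > "$tmp"
resolve_payload "@file:$tmp"      # prints {"state":"A"}
resolve_payload '{"state":"B"}'   # prints {"state":"B"}
rm -f "$tmp"
```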

Parse the init JSON for these fields: `phase`, `padded`, `phase_dir`,
`eval_review_path`, `summary_present`, `summary_path`, `ai_spec_path`,
`has_ai_spec`, `state`, `agents.eval_auditor`.

```bash
PADDED=$(echo "$INIT" | jq -r '.padded')
PHASE_DIR=$(echo "$INIT" | jq -r '.phase_dir')
EVAL_REVIEW_PATH=$(echo "$INIT" | jq -r '.eval_review_path')
SUMMARY_PRESENT=$(echo "$INIT" | jq -r '.summary_present')
SUMMARY_PATH=$(echo "$INIT" | jq -r '.summary_path')
AI_SPEC_PATH=$(echo "$INIT" | jq -r '.ai_spec_path')
HAS_AI_SPEC=$(echo "$INIT" | jq -r '.has_ai_spec')
STATE=$(echo "$INIT" | jq -r '.state')
PLAN_ID="${PADDED}-eval-review"
TASK_ID="${PADDED}-eval-review"
```
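
Because `jq -r` prints the literal string `null` for a key that is absent, a malformed payload would otherwise flow into paths such as `null-eval-review`. One possible guard is sketched below; `require_field` is a hypothetical helper, and the example values stand in for a parsed payload.

```bash
#!/usr/bin/env bash
# require_field is a hypothetical helper. It rejects empty values and the
# literal string "null", which is what `jq -r` prints for a missing key.
require_field() {
  local name="$1" value="$2"
  if [[ -z "$value" || "$value" == "null" ]]; then
    echo "Error: init payload is missing '$name'." >&2
    return 1
  fi
}

# Made-up values standing in for the variables parsed above.
PADDED="07"
EVAL_REVIEW_PATH="null"

require_field padded "$PADDED" && echo "padded ok"
require_field eval_review_path "$EVAL_REVIEW_PATH" || echo "eval_review_path missing"
```

Whether the workflow should fail hard on a missing field is a design choice left to the init CLI; the sketch only shows how such a check could look.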

## Pre-Flight Gates

<pre_flight>

### Gate 1 — State C aborts before any spawn

State C means no SUMMARY.md, so the phase has not been executed and
there is nothing to audit. Exit with a clear message before any agent
is spawned or any metrics record is written.

```bash
if [[ "$STATE" == "C" ]]; then
  echo "Error: Phase $PHASE has no SUMMARY.md at $SUMMARY_PATH." >&2
  echo "The phase must be executed (/np:execute-phase) before its evals can be audited." >&2
  exit 1
fi

if [[ "$SUMMARY_PRESENT" != "true" ]]; then
  echo "Error: summary_present=false for phase $PHASE; expected state=C but got $STATE." >&2
  exit 1
fi
```

### Gate 2 — EVAL-REVIEW.md already exists

If a prior review is present, let the user choose between re-running,
viewing the current review, or skipping.

```bash
if [[ -f "$EVAL_REVIEW_PATH" ]]; then
  CHOICE=$(node np-tools.cjs askuser --json '{
    "type": "select",
    "header": "Existing EVAL-REVIEW",
    "question": "EVAL-REVIEW.md already exists for Phase '"$PHASE"'. What would you like to do?",
    "options": [
      {"label": "Re-run — replace the current review", "description": "Re-runs np-eval-auditor and overwrites the existing file."},
      {"label": "View — display current review and exit", "description": "Reads the file and exits without changes."},
      {"label": "Skip — keep current review and exit", "description": "Leaves the file untouched."}
    ]
  }')
  case "$CHOICE" in
    "View"*) cat "$EVAL_REVIEW_PATH"; exit 0 ;;
    "Skip"*) exit 0 ;;
  esac
fi
```
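
The `case` statement above matches on the leading word of the selected label, so the wording after that word can change without breaking the dispatch. A minimal sketch of the same prefix matching, assuming (as the gate does) that `askuser` prints the chosen label on stdout; `action_for_choice` is a hypothetical helper:

```bash
#!/usr/bin/env bash
# action_for_choice is a hypothetical helper that maps a selected label to an
# action keyword using the same prefix globs as the gate above.
action_for_choice() {
  case "$1" in
    "View"*)   echo "view" ;;
    "Skip"*)   echo "skip" ;;
    "Re-run"*) echo "rerun" ;;
    *)         echo "unknown" ;;   # empty or unexpected answer
  esac
}

action_for_choice "Re-run np-eval-auditor"   # prints rerun
action_for_choice "Skip for now"             # prints skip
action_for_choice ""                         # prints unknown
```

An explicit fallback branch makes an empty or unexpected answer visible instead of letting it fall through as an implicit re-run.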

### Gate 3 — Label audit mode from state

```bash
case "$STATE" in
  "A") AUDIT_MODE="spec-conformance" ;;
  "B") AUDIT_MODE="retroactive-general" ;;
  *)
    echo "Error: unexpected state '$STATE' from init payload (expected A or B after Gate 1)." >&2
    exit 1
    ;;
esac
```

</pre_flight>

## Philosophy

<philosophy>
Eval plans decay the moment the first commit lands. Planned rubrics
lose their binding to code, guardrails get stubbed "for now", tracing
is wired but never turned on, and the reference dataset never leaves
the design doc. A retroactive eval-coverage audit catches all of that
in one pass and emits a ranked list of gaps with concrete remediation
steps. When an AI-SPEC.md exists, the audit is a conformance check
against planned dimensions. When it does not, the audit is a
best-practice sweep — and the mode label on EVAL-REVIEW.md makes that
difference explicit so reviewers never treat a general audit as if it
had SPEC backing.
</philosophy>

## Main Flow

Single serial spawn — the auditor is self-contained (codebase scan,
dimension scoring, infrastructure audit, report writing all happen
inside `np-eval-auditor`).

### Step 1 — Eval auditor (np-eval-auditor, haiku)

```bash
START=$(node np-tools.cjs metrics start-timestamp)
MODEL=$(node np-tools.cjs resolve-model np-eval-auditor --profile balanced)
# NOTE: Spawn agent=np-eval-auditor model=$MODEL state=$STATE mode=$AUDIT_MODE
# NOTE: input: phase_number=$PHASE, phase_dir=$PHASE_DIR,
# NOTE:        summary_path=$SUMMARY_PATH, ai_spec_path=$AI_SPEC_PATH,
# NOTE:        has_ai_spec=$HAS_AI_SPEC, audit_mode=$AUDIT_MODE,
# NOTE:        eval_review_path=$EVAL_REVIEW_PATH
# NOTE: output: $EVAL_REVIEW_PATH with dimension scores
# NOTE:         (COVERED/PARTIAL/MISSING), infrastructure scores,
# NOTE:         overall verdict, and a mode label
# NOTE:         ("spec-conformance" or "retroactive-general") in the
# NOTE:         header frontmatter.
END=$(node np-tools.cjs metrics end-timestamp)
node np-tools.cjs metrics record \
  --agent np-eval-auditor --tier haiku --resolved-model "$MODEL" \
  --phase "$PHASE" --plan "$PLAN_ID" --task "$TASK_ID" \
  --started "$START" --ended "$END" \
  --tokens-in "${TOKENS_IN:-0}" --tokens-out "${TOKENS_OUT:-0}" \
  --retry-count 0 --status ok --runtime "$RUNTIME"
```

## Validation Gate

After the auditor finishes, verify EVAL-REVIEW.md was written. If the
file is missing, the spawn failed silently and the user is prompted to
re-run or abort.

```bash
if [[ ! -f "$EVAL_REVIEW_PATH" ]]; then
  CHOICE=$(node np-tools.cjs askuser --json '{
    "type": "select",
    "header": "EVAL-REVIEW.md missing",
    "question": "np-eval-auditor did not write EVAL-REVIEW.md. What would you like to do?",
    "options": [
      {"label": "Re-run np-eval-auditor", "description": "Spawn the auditor once more."},
      {"label": "Abort", "description": "Exit without committing."}
    ]
  }')
  case "$CHOICE" in
    "Abort") exit 1 ;;
    "Re-run"*) ;;  # no shell action here: repeat Step 1 to spawn the auditor once more
  esac
fi
```
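
The gate above only checks that the file exists. Since the mode label in the header is described as mandatory, a stricter variant could also confirm that one of the two mode values appears in the frontmatter. The sketch below is one way to do that; `has_mode_label` is a hypothetical helper, and the `mode:` key in the demo file is made up, because the actual frontmatter key is not specified here.

```bash
#!/usr/bin/env bash
# has_mode_label is a hypothetical helper. It scans only the frontmatter
# (the lines between the first two "---" markers) for either mode value.
has_mode_label() {
  local file="$1"
  awk '
    /^---[[:space:]]*$/ { fence++; next }   # count frontmatter delimiters
    fence == 1          { print }           # inside the frontmatter
    fence >= 2          { exit }            # past the closing delimiter
  ' "$file" | grep -Eq 'spec-conformance|retroactive-general'
}

review=$(mktemp)
# "mode:" is an assumed key name used only for this demonstration.
printf '%s\n' '---' 'mode: retroactive-general' '---' '# Review' > "$review"
has_mode_label "$review" && echo "mode label present"
rm -f "$review"
```

Restricting the search to the frontmatter avoids a false positive when the mode name is only mentioned in the body of the review.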

## Commit

```bash
git add "$EVAL_REVIEW_PATH"
git commit -m "docs(${PADDED}): generate EVAL-REVIEW.md (${AUDIT_MODE})"
```

## Scope Guardrail

<scope_guardrail>
**Do:**
- Run `np-eval-auditor` exactly once per invocation (single-pass audit).
- Emit a metrics record AFTER the Task spawn (D-06).
- Resolve MODEL via `np-tools.cjs resolve-model` — no hardcoded IDs.
- Use `np-tools.cjs askuser` for every prompt (INST-03 invariant).
- Honour the `state` field from the init payload: A → spec-conformance,
  B → retroactive-general, C → abort before spawning anything.
- Label the audit mode explicitly in EVAL-REVIEW.md
  (`spec-conformance` when AI-SPEC.md exists, `retroactive-general`
  otherwise) — Pitfall 10 parallel.
- Abort early when SUMMARY.md is missing; retroactive audits are only
  meaningful against executed phases.

**Don't:**
- Run this workflow on a phase that has not been executed — there is
  nothing to audit until SUMMARY.md lands.
- Invoke host-specific prompt tools directly — always route through
  `np-tools.cjs askuser`.
- Silently treat a spec-less audit as if it had SPEC backing — the
  mode label in the output header is mandatory.
- Spawn any additional agent beyond `np-eval-auditor`; if a follow-up
  remediation pass is needed, that is the planner's job, not this
  workflow's.
- Call any tools binary other than `np-tools.cjs` (the sole CLI entry
  per Plan 09-05 D-14).
- Reference legacy homedir payload paths — those directories do not
  exist in nubos-pilot projects.
- Skip the metrics record block — the Phase-10 np:stats consumer
  expects one record per Task spawn.
- Re-derive `state` inside this workflow; state detection is the init
  CLI's responsibility (`bin/np-tools/eval-review.cjs`).
</scope_guardrail>

## Output

- `{phase_dir}/{padded}-EVAL-REVIEW.md` — per-dimension scores
  (COVERED/PARTIAL/MISSING), infrastructure scores, overall verdict,
  remediation plan, and mode label
  (`spec-conformance` or `retroactive-general`).
- One metrics record in `.nubos-pilot/metrics/phase-${PHASE}.jsonl`
  for the single `np-eval-auditor` Task spawn.
- One git commit when EVAL-REVIEW.md is produced successfully.