nubos-pilot 0.9.7 → 0.9.8
- package/agents/np-architect.md +2 -0
- package/agents/np-plan-checker.md +4 -1
- package/agents/np-planner.md +33 -1
- package/bin/np-tools/_commands.cjs +1 -0
- package/bin/np-tools/plan-lint.cjs +188 -0
- package/bin/np-tools/plan-lint.test.cjs +205 -0
- package/docs/adr/0013-plan-trust-layer.md +95 -0
- package/lib/plan-lint.cjs +383 -0
- package/lib/plan-lint.test.cjs +313 -0
- package/np-tools.cjs +2 -0
- package/package.json +1 -1
- package/workflows/plan-phase.md +24 -0
package/agents/np-architect.md
CHANGED
@@ -70,6 +70,8 @@ If the project already documents a module/pattern that fits, extend it instead o
 
 ## Output Contract
 
+**Granularity (ADR-0013).** Architecture decisions are intent-level: which library, which boundary, which protocol. They do NOT prescribe implementation — no schema DDL, no exact framework-generated filenames, no code-style edicts. Those are executor-territory and downstream `np-planner` will refuse plans that bake them in (Plan-side Trust Layer, ADR-0013). If you find yourself describing how a controller method should be structured, stop — that's not architecture.
+
 ```markdown
 # M<NNN> — <milestone name> — Architecture
 
package/agents/np-plan-checker.md
CHANGED
@@ -53,7 +53,7 @@ Additional context the orchestrator may inline in the prompt:
 
 ## Review Dimensions
 
-Each dimension maps to one or more canonical finding categories from `docs/agent-frontmatter-schema.md`. The
+Each dimension maps to one or more canonical finding categories from `docs/agent-frontmatter-schema.md`. The 14 canonical codes are:
 
 - `missing-success-criterion` — a ROADMAP SC-X is not mapped to any task.
 - `non-atomic-task` — a task bundles multiple distinct deliverables that should be split.
@@ -66,6 +66,9 @@ Each dimension maps to one or more canonical finding categories from `docs/agent
 - `hook-field-present` — agent frontmatter contains `hooks:` (D-10).
 - `forbidden-agent-field` — agent frontmatter contains `model:` or `model_profile:` (D-10).
 - `unverified-assumption` — a slice plan's `<reality_check>` block is missing, empty, or contains an `<assumption>` without a non-empty `verified_by` attribute, OR a `<files_read>` path does not exist in the repo (Reality-Check rule, see Dimension 12).
+- `verify-command-unknown` — a `<verify>` block invokes a command that is not a known np-tools verb, declared composer/npm script, vendor binary, or POSIX baseline tool (Plan-side Trust Layer, ADR-0013). Mechanically detected by `np-tools.cjs plan-lint`; you mirror the verdict into your findings array so the loop handler treats it uniformly with semantic findings.
+- `parallel-task-implicit-dependency` — tasks marked `depends_on: []` in the same slice but one of them runs a working-tree-reading verify (`update-docs`, `phpstan analyse`, `git diff`, etc.) against files another sibling modifies. Implicit ordering must be made explicit (Plan-side Trust Layer, ADR-0013).
+- `plan-over-specifies-implementation` — PLAN.md body contains schema DDL, framework-controlled timestamped filenames, or large inline code snippets. Plans specify intent + boundary + acceptance, not implementation. Severity is `major` (advisory) — not a hard block, but you flag it so the planner course-corrects (Plan-side Granularity Doctrine, ADR-0013).
 
 Run each dimension below; for every failure, emit one finding using the matching canonical code.
 
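The `verify-command-unknown` rule described above can be illustrated with a minimal sketch. This is not the code shipped in `lib/plan-lint.cjs`: the function name `lintNpToolsVerbs`, the regular expressions, the sample verb set, and the `raw.verb` / `raw.line` fields are assumptions made for illustration, and only the np-tools case is covered (composer/npm scripts, vendor binaries, and POSIX tools are left out).

```javascript
'use strict';

// Hypothetical sample of registered verbs; the real list lives in _commands.cjs.
const KNOWN_VERBS = new Set(['discuss-phase', 'research-phase', 'plan-milestone', 'plan-lint']);

// Matches `node <path>np-tools.cjs <verb>` and captures the verb (assumed pattern).
const NP_TOOLS_RE = /^node\s+\S*np-tools\.cjs\s+(\S+)/;

// Returns one finding per <verify> line that calls an unregistered np-tools verb.
function lintNpToolsVerbs(planBody, knownVerbs = KNOWN_VERBS) {
  const findings = [];
  for (const [, text] of planBody.matchAll(/<verify>([\s\S]*?)<\/verify>/g)) {
    for (const line of text.split('\n').map((l) => l.trim())) {
      if (!line || line.startsWith('#')) continue; // skip blank and comment lines
      const match = NP_TOOLS_RE.exec(line);
      if (match && !knownVerbs.has(match[1])) {
        findings.push({
          category: 'verify-command-unknown',
          severity: 'critical',
          raw: { reason: 'np-tools-unknown-verb', verb: match[1], line },
        });
      }
    }
  }
  return findings;
}

// Usage: the phantom `codebase` verb is reported, `plan-lint` is accepted.
console.log(lintNpToolsVerbs('<verify>node .nubos-pilot/bin/np-tools.cjs codebase doc-lint</verify>'));
```

Checking the first token against a closed set keeps the check deterministic, which is the property the changelog entries above attribute to the mechanical linter as opposed to the LLM-based checker.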
package/agents/np-planner.md
CHANGED
@@ -261,7 +261,7 @@ Every PLAN.md you write will be consumed by an executor agent that:
 **Implications for your writing style:**
 
 - **Name the library, not the category.** "Use `jose` for JWT" > "use a JWT library".
-- **Name the file, not the area
+- **Name the file, not the area** — for *deterministic edits the planner can know up-front*. "Modify `src/api/auth/login.ts`" > "update the auth layer". For *scaffolding tasks where a framework generates files at install/publish time*, use a glob (`database/migrations/*_cashier_*.php`) or leave `files_modified` empty — the executor resolves the real paths from the actual publish output and `commit-task` falls back to `checkpoint.files_touched` (D-04, ADR-0013 Layer-D Granularity).
 - **Name the command, not the intent.** "Run `npm test -- --filter=auth`" > "run the tests".
 - **Cite existing interfaces verbatim.** If `lib/core.cjs` exports `NubosPilotError(code, message, details)` — quote that signature in the task context so the executor doesn't mis-remember.
 - **Document deviations from canonical advice.** If you deviate from CONTEXT.md's stack choice, say so explicitly and note why.
@@ -269,6 +269,38 @@ Every PLAN.md you write will be consumed by an executor agent that:
 If the executor has to stop and read three more files to figure out what you meant, the plan failed.
 </downstream_awareness>
 
+<plan_granularity>
+## Plan Granularity Doctrine — Intent + Boundary + Acceptance, NOT Implementation (ADR-0013)
+
+A PLAN.md is a contract. It specifies **what** must be true at the end (intent), **where** the work is allowed to touch (boundary), and **how** success is measured (acceptance). It does NOT specify HOW the implementation looks. That's the executor's territory; you don't have ground-truth on it and pretending you do is the bug class that produced the M004 plan-bugs.
+
+**You DO write:**
+- Intent: "Install Cashier 16 for billing." "Add subscription resource at `/billing`." "Force 2FA for org owners."
+- Boundary: which directories the change is allowed to touch (`database/migrations/`, `app/Providers/AppServiceProvider.php`).
+- Acceptance: observable, falsifiable success criteria (Pest test names, exit codes, HTTP responses, file presence).
+- Verify command: a real, runnable shell invocation that returns exit-code 0 on success. **The first token must be a known command** (np-tools verb, composer/npm script, vendor binary, POSIX tool). `plan-lint` mechanically refuses unknown verbs.
+
+**You DO NOT write:**
+- **Schema DDL.** No `CREATE TABLE`, no `Schema::create('...', function (Blueprint $table) { ... })`, no column-by-column lists. The framework decides the schema; the executor publishes/migrates it; you check that migration applies and tests pass.
+- **Framework-controlled filenames.** Cashier publishes 5 migration files with publish-time timestamps; you cannot know the exact names. `0001_01_01_000004_create_customer_columns_table.php` is a **smell** — `plan-lint` flags it as `framework-timestamped-filename`. Use globs (`database/migrations/*_cashier_*.php`) or leave files_modified empty.
+- **Code-style prescriptions.** Whether `boot()` inlines `Cashier::calculateTaxes()` or routes through `configureCashier()` is a codebase-state decision the executor reads from `.nubos-pilot/codebase/<module>.md`. You don't override it.
+- **Library-internal details.** "Cashier publishes one migration with subscriptions + subscription_items as two `Schema::create` blocks" is a falsifiable claim about an external library's internals. Either stay above that level (intent: "install Cashier"), or invoke a researcher to verify the claim and tag it `[VERIFIED]`. Unverified library-shape claims are the M004 plan-bug class.
+- **Inline implementation snippets > 10 lines.** Code blocks of significant length push implementation into the plan. Describe what the code must do; the executor writes it. `plan-lint` warns at >200-character code blocks (heuristic).
+
+**The Cashier example, done right:**
+
+> **Goal:** Install Cashier 16 for subscription billing.
+> **Boundary:** `database/migrations/`, `app/Providers/AppServiceProvider.php`, `phpunit.xml`.
+> **Acceptance:**
+> - `composer show laravel/cashier` reports version `^16.0`
+> - `php artisan migrate` exits 0 with at least one Cashier migration applied
+> - Pest test `tests/Feature/Cashier/InstallTest.php::installs_cashier` passes
+> **Verify:** `composer test:cashier`
+> **files_modified:** *empty* — let the executor resolve from publish output.
+
+The plan does not say which migration files Cashier publishes, what columns they contain, or how `AppServiceProvider::boot()` should look. Those are executor-resolved.
+</plan_granularity>
+
 <answer_validation>
 ## Self-Check Before Returning
 
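The "You DO NOT write" rules above lend themselves to simple pattern checks. The following is a rough sketch of how such heuristics might look, not the linter's real implementation: the function name `lintOverSpecification`, all three patterns, and the reason codes `schema-ddl` and `large-inline-code` are assumptions. Only `framework-timestamped-filename`, the `major` severity, and the 200-character threshold are taken from the text.

```javascript
'use strict';

// Assumed heuristics paraphrasing the doctrine; the shipped patterns may differ.
const DDL_RE = /\bCREATE\s+TABLE\b|Schema::create\s*\(/i;
// Laravel-style timestamp prefix, e.g. 2024_01_01_000000_name.php
const TIMESTAMPED_FILE_RE = /\b\d{4}_\d{2}_\d{2}_\d{6}_\w+\.php\b/;
const MAX_SNIPPET_CHARS = 200;
// Built at runtime so this example does not contain a literal code fence.
const FENCE = '`'.repeat(3);
const FENCED_BLOCK_RE = new RegExp(FENCE + '[^\\n]*\\n([\\s\\S]*?)' + FENCE, 'g');

// Returns advisory findings for plan text that bakes in implementation detail.
function lintOverSpecification(planText) {
  const reasons = [];
  if (DDL_RE.test(planText)) reasons.push('schema-ddl');
  if (TIMESTAMPED_FILE_RE.test(planText)) reasons.push('framework-timestamped-filename');
  for (const [, code] of planText.matchAll(FENCED_BLOCK_RE)) {
    if (code.length > MAX_SNIPPET_CHARS) {
      reasons.push('large-inline-code');
      break; // one finding per plan is enough for this heuristic
    }
  }
  return reasons.map((reason) => ({
    category: 'plan-over-specifies-implementation',
    severity: 'major',
    raw: { reason },
  }));
}

// Usage: a glob passes, an exact timestamped filename is flagged.
console.log(lintOverSpecification('files: database/migrations/*_cashier_*.php').length);
console.log(lintOverSpecification('files: database/migrations/0001_01_01_000004_create_customer_columns_table.php'));
```

Because these are heuristics that can misfire on legitimate text, it makes sense that the doctrine treats them as advisory (`major`) and not as a hard block.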
package/bin/np-tools/_commands.cjs
CHANGED
@@ -7,6 +7,7 @@ const COMMANDS = [
   { name: 'discuss-phase', category: 'Planning', description: 'Adaptive milestone-context interview (writes M<NNN>-CONTEXT.md)', description_de: 'Adaptives Milestone-Kontext-Interview (schreibt M<NNN>-CONTEXT.md)' },
   { name: 'research-phase', category: 'Planning', description: 'Milestone-level research (WebFetch + MCP; offline fallback)', description_de: 'Milestone-Recherche (WebFetch + MCP; Offline-Fallback)' },
   { name: 'plan-milestone', category: 'Planning', description: 'Plan a milestone: scaffolds slices + tasks', description_de: 'Plant einen Milestone: erzeugt Slices + Tasks' },
+  { name: 'plan-lint', category: 'Planning', description: 'Mechanical Trust-Layer linter for PLAN.md (verify-command + parallel-race + over-specification). ADR-0013', description_de: 'Mechanischer Trust-Layer-Linter für PLAN.md (verify-command + parallel-race + Über-Spezifikation). ADR-0013' },
   { name: 'new-project', category: 'Planning', description: 'Greenfield project init (PROJECT.md + REQUIREMENTS.md + M001 milestone)', description_de: 'Greenfield-Projekt-Init (PROJECT.md + REQUIREMENTS.md + M001-Milestone)' },
   { name: 'new-milestone', category: 'Planning', description: 'Append a new milestone (M<NNN>) to an existing project', description_de: 'Hängt einen neuen Milestone (M<NNN>) an ein bestehendes Projekt an' },
   { name: 'propose-milestones', category: 'Planning', description: 'Re-plan all not-yet-done milestones: AI proposes add/update/remove from PROJECT.md + REQUIREMENTS.md', description_de: 'Plant offene Milestones neu: KI schlägt add/update/remove aus PROJECT.md + REQUIREMENTS.md vor' },
package/bin/np-tools/plan-lint.cjs
@@ -0,0 +1,188 @@
+'use strict';
+
+// CLI driver for the plan-side Trust Layer (ADR-0013, lib/plan-lint.cjs).
+// Lints a single PLAN.md file or every PLAN.md under a milestone tree.
+//
+// Usage:
+//   plan-lint <path/to/PLAN.md>
+//   plan-lint --milestone M004
+//   plan-lint --milestone M004 --json
+//
+// Exit code:
+//   0 — no critical findings (major findings still reported)
+//   2 — at least one critical finding emitted
+//
+// Output: JSON object {summary, files: [{path, findings[]}]}.
+
+const fs = require('node:fs');
+const path = require('node:path');
+
+const { NubosPilotError, findProjectRoot } = require('../../lib/core.cjs');
+const { extractFrontmatter } = require('../../lib/frontmatter.cjs');
+const planLint = require('../../lib/plan-lint.cjs');
+const args = require('./_args.cjs');
+
+const MILESTONE_RE = /^M\d{3,}$/;
+
+function _walkMilestonePlans(milestoneDir) {
+  // Returns ordered list of PLAN.md paths under .nubos-pilot/milestones/M<NNN>/.
+  const out = [];
+  if (!fs.existsSync(milestoneDir)) return out;
+  // Milestone-level plan
+  const mPlan = fs.readdirSync(milestoneDir)
+    .filter((f) => /^M\d{3,}-PLAN\.md$/.test(f))
+    .map((f) => path.join(milestoneDir, f));
+  out.push(...mPlan);
+  // Slice plans
+  const slicesDir = path.join(milestoneDir, 'slices');
+  if (fs.existsSync(slicesDir)) {
+    const slices = fs.readdirSync(slicesDir).filter((d) => /^S\d{3,}$/.test(d)).sort();
+    for (const sId of slices) {
+      const sDir = path.join(slicesDir, sId);
+      for (const f of fs.readdirSync(sDir)) {
+        if (/^S\d{3,}-PLAN\.md$/.test(f)) out.push(path.join(sDir, f));
+      }
+      // Task plans
+      const tasksDir = path.join(sDir, 'tasks');
+      if (!fs.existsSync(tasksDir)) continue;
+      const tasks = fs.readdirSync(tasksDir).filter((d) => /^T\d{4,}$/.test(d)).sort();
+      for (const tId of tasks) {
+        const tDir = path.join(tasksDir, tId);
+        for (const f of fs.readdirSync(tDir)) {
+          if (/^T\d{4,}-PLAN\.md$/.test(f)) out.push(path.join(tDir, f));
+        }
+      }
+    }
+  }
+  return out;
+}
+
+function _sliceTaskCollect(milestoneDir) {
+  // Collect parallel-task race input: per slice, the set of tasks with their
+  // files_modified, depends_on, and verify text.
+  const out = []; // [{ slice, tasks: [{id, files_modified, depends_on, verifyText}] }]
+  const slicesDir = path.join(milestoneDir, 'slices');
+  if (!fs.existsSync(slicesDir)) return out;
+  const slices = fs.readdirSync(slicesDir).filter((d) => /^S\d{3,}$/.test(d)).sort();
+  for (const sId of slices) {
+    const tasksDir = path.join(slicesDir, sId, 'tasks');
+    if (!fs.existsSync(tasksDir)) continue;
+    const tasks = fs.readdirSync(tasksDir).filter((d) => /^T\d{4,}$/.test(d)).sort();
+    const collected = [];
+    for (const tId of tasks) {
+      const taskDir = path.join(tasksDir, tId);
+      const planFile = fs.readdirSync(taskDir).find((f) => /^T\d{4,}-PLAN\.md$/.test(f));
+      if (!planFile) continue;
+      const raw = fs.readFileSync(path.join(taskDir, planFile), 'utf-8');
+      const { frontmatter, body } = extractFrontmatter(raw);
+      const verifyMatch = String(body || '').match(/<verify>([\s\S]*?)<\/verify>/);
+      collected.push({
+        id: (frontmatter && frontmatter.id) || tId,
+        files_modified: (frontmatter && Array.isArray(frontmatter.files_modified))
+          ? frontmatter.files_modified : [],
+        depends_on: (frontmatter && Array.isArray(frontmatter.depends_on))
+          ? frontmatter.depends_on : [],
+        verifyText: verifyMatch ? verifyMatch[1] : '',
+        slice: sId,
+      });
+    }
+    if (collected.length) out.push({ slice: sId, tasks: collected });
+  }
+  return out;
+}
+
+function _summarize(filesResult, raceFindings) {
+  const counts = { critical: 0, major: 0, minor: 0, total: 0 };
+  for (const f of filesResult) {
+    for (const finding of f.findings) {
+      counts.total += 1;
+      counts[finding.severity || 'minor'] = (counts[finding.severity || 'minor'] || 0) + 1;
+    }
+  }
+  for (const finding of raceFindings) {
+    counts.total += 1;
+    counts[finding.severity || 'minor'] = (counts[finding.severity || 'minor'] || 0) + 1;
+  }
+  return counts;
+}
+
+function run(argv, ctx) {
+  const context = ctx || {};
+  const cwd = context.cwd || process.cwd();
+  const stdout = context.stdout || process.stdout;
+  const list = Array.isArray(argv) ? argv : [];
+
+  const milestoneFlag = args.getFlag(list, '--milestone');
+  const positional = list.filter((a) => !String(a).startsWith('--'));
+  const targetPath = positional[0];
+
+  let filePaths = [];
+  let raceInputs = [];
+
+  if (milestoneFlag) {
+    if (!MILESTONE_RE.test(milestoneFlag)) {
+      throw new NubosPilotError(
+        'plan-lint-invalid-milestone',
+        '--milestone expects M<NNN> form (e.g. M004)',
+        { got: milestoneFlag },
+      );
+    }
+    const root = findProjectRoot(cwd);
+    const mDir = path.join(root, '.nubos-pilot', 'milestones', milestoneFlag);
+    if (!fs.existsSync(mDir)) {
+      throw new NubosPilotError(
+        'plan-lint-milestone-not-found',
+        'milestone directory does not exist: ' + mDir,
+        { milestone: milestoneFlag, path: mDir },
+      );
+    }
+    filePaths = _walkMilestonePlans(mDir);
+    raceInputs = _sliceTaskCollect(mDir);
+  } else if (targetPath) {
+    const abs = path.isAbsolute(targetPath) ? targetPath : path.resolve(cwd, targetPath);
+    if (!fs.existsSync(abs)) {
+      throw new NubosPilotError(
+        'plan-lint-file-not-found',
+        'plan file not found: ' + targetPath,
+        { path: abs },
+      );
+    }
+    filePaths = [abs];
+  } else {
+    throw new NubosPilotError(
+      'plan-lint-missing-target',
+      'plan-lint requires a path argument OR --milestone <M<NNN>>',
+      { hint: 'examples: `plan-lint M004-S001-PLAN.md` or `plan-lint --milestone M004`' },
+    );
+  }
+
+  const filesResult = filePaths.map((p) => {
+    const raw = fs.readFileSync(p, 'utf-8');
+    const { body } = extractFrontmatter(raw);
+    return {
+      path: path.relative(cwd, p),
+      findings: planLint.lintPlan(body, { cwd, raw }),
+    };
+  });
+
+  let raceFindings = [];
+  for (const group of raceInputs) {
+    raceFindings.push(...planLint.lintParallelTaskRaces(group.tasks));
+  }
+
+  const summary = _summarize(filesResult, raceFindings);
+  const payload = {
+    target: milestoneFlag || (positional[0] || null),
+    summary,
+    files: filesResult,
+    parallel_race_findings: raceFindings,
+  };
+  stdout.write(JSON.stringify(payload, null, 2) + '\n');
+  return summary.critical > 0 ? 2 : 0;
+}
+
+module.exports = { run };
+
+if (require.main === module) {
+  process.exit(run(process.argv.slice(2)));
+}
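The exit-code contract of the CLI driver above (exit 2 only when a critical finding exists, exit 0 otherwise) can be restated as a small standalone sketch. The names `summarize` and `exitCodeFor` are illustrative and not part of the package. The counting logic follows `_summarize`, flattened to take a single list of findings.

```javascript
'use strict';

// Counts findings per severity; a finding without a severity counts as 'minor'.
function summarize(findings) {
  const counts = { critical: 0, major: 0, minor: 0, total: 0 };
  for (const finding of findings) {
    const severity = finding.severity || 'minor';
    counts[severity] = (counts[severity] || 0) + 1;
    counts.total += 1;
  }
  return counts;
}

// Only critical findings fail the gate; major findings are reported but advisory.
function exitCodeFor(counts) {
  return counts.critical > 0 ? 2 : 0;
}

// Usage: one advisory finding alone keeps the gate open.
const advisoryOnly = summarize([{ severity: 'major' }]);
console.log(advisoryOnly, exitCodeFor(advisoryOnly));
```

Separating the count from the exit decision would make it easy to tighten the gate later (for example, failing on `major` too) without touching the aggregation.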
package/bin/np-tools/plan-lint.test.cjs
@@ -0,0 +1,205 @@
+'use strict';
+
+const fs = require('node:fs');
+const os = require('node:os');
+const path = require('node:path');
+const { test, afterEach } = require('node:test');
+const assert = require('node:assert/strict');
+
+const planLintCli = require('./plan-lint.cjs');
+
+const _sandboxes = [];
+
+function _mkProject(milestoneTree) {
+  const root = fs.mkdtempSync(path.join(os.tmpdir(), 'np-pl-cli-'));
+  fs.mkdirSync(path.join(root, '.nubos-pilot'), { recursive: true });
+  // Mark project root via STATE.md (findProjectRoot anchors on .nubos-pilot/).
+  fs.writeFileSync(path.join(root, '.nubos-pilot', 'STATE.md'),
+    '---\nschema_version: 2\ncurrent_phase: null\ncurrent_plan: null\ncurrent_task: null\n---\n', 'utf-8');
+  if (milestoneTree) {
+    for (const [rel, content] of Object.entries(milestoneTree)) {
+      const abs = path.join(root, rel);
+      fs.mkdirSync(path.dirname(abs), { recursive: true });
+      fs.writeFileSync(abs, content, 'utf-8');
+    }
+  }
+  _sandboxes.push(root);
+  return root;
+}
+
+function _cap() {
+  let buf = '';
+  return { stub: { write: (s) => { buf += s; return true; } }, get: () => buf };
+}
+
+afterEach(() => {
+  while (_sandboxes.length) {
+    try { fs.rmSync(_sandboxes.pop(), { recursive: true, force: true }); } catch {}
+  }
+});
+
+function _taskMd(id, filesModified, dependsOn, verifyText) {
+  return `---
+id: ${id}
+files_modified: ${JSON.stringify(filesModified)}
+depends_on: ${JSON.stringify(dependsOn)}
+---
+# ${id}
+
+<verify>${verifyText}</verify>
+`;
+}
+
+test('PLCLI-1: refuses without --milestone or path', () => {
+  assert.throws(
+    () => planLintCli.run([], { cwd: _mkProject({}), stdout: _cap().stub }),
+    (err) => err && err.code === 'plan-lint-missing-target',
+  );
+});
+
+test('PLCLI-2: rejects malformed --milestone value', () => {
+  assert.throws(
+    () => planLintCli.run(['--milestone', 'm1'], { cwd: _mkProject({}), stdout: _cap().stub }),
+    (err) => err && err.code === 'plan-lint-invalid-milestone',
+  );
+});
+
+test('PLCLI-3: rejects nonexistent milestone directory', () => {
+  assert.throws(
+    () => planLintCli.run(['--milestone', 'M999'], { cwd: _mkProject({}), stdout: _cap().stub }),
+    (err) => err && err.code === 'plan-lint-milestone-not-found',
+  );
+});
+
+test('PLCLI-4: returns exit 0 + zero findings on a clean milestone', () => {
+  const root = _mkProject({
+    '.nubos-pilot/milestones/M001/M001-PLAN.md': '# Milestone\n\n<verify>echo ok</verify>\n',
+    '.nubos-pilot/milestones/M001/slices/S001/S001-PLAN.md': '# Slice\n\n<verify>echo ok</verify>\n',
+    '.nubos-pilot/milestones/M001/slices/S001/tasks/T0001/T0001-PLAN.md': _taskMd(
+      'M001-S001-T0001', ['src/foo.ts'], [], 'echo ok',
+    ),
+  });
+  const cap = _cap();
+  const code = planLintCli.run(['--milestone', 'M001'], { cwd: root, stdout: cap.stub });
+  const payload = JSON.parse(cap.get());
+  assert.equal(code, 0);
+  assert.equal(payload.summary.critical, 0);
+  assert.equal(payload.summary.total, 0);
+});
+
+test('PLCLI-5: catches the exact M004 plan-bug — verify uses unknown np-tools verb', () => {
+  const root = _mkProject({
+    '.nubos-pilot/milestones/M004/slices/S001/tasks/T0002/T0002-PLAN.md': _taskMd(
+      'M004-S001-T0002', [], [],
+      'node .nubos-pilot/bin/np-tools.cjs codebase doc-lint',
+    ),
+  });
+  const cap = _cap();
+  const code = planLintCli.run(['--milestone', 'M004'], { cwd: root, stdout: cap.stub });
+  const payload = JSON.parse(cap.get());
+  assert.equal(code, 2, 'must exit non-zero on critical findings');
+  const verifyFinding = payload.files
+    .flatMap((f) => f.findings)
+    .find((f) => f.category === 'verify-command-unknown');
+  assert.ok(verifyFinding, 'expected verify-command-unknown finding');
+  assert.equal(verifyFinding.severity, 'critical');
+  assert.equal(verifyFinding.raw.reason, 'np-tools-unknown-verb');
+});
+
+test('PLCLI-6: catches the exact M004 plan-bug — parallel race against working-tree-reading verify', () => {
+  const root = _mkProject({
+    // T0001 modifies migration files
+    '.nubos-pilot/milestones/M004/slices/S001/tasks/T0001/T0001-PLAN.md': _taskMd(
+      'M004-S001-T0001',
+      ['database/migrations/2024_01_01_000000_install_cashier.php'],
+      [],
+      'php artisan migrate',
+    ),
+    // T0002 runs update-docs which hashes working tree → implicit dep
+    '.nubos-pilot/milestones/M004/slices/S001/tasks/T0002/T0002-PLAN.md': _taskMd(
+      'M004-S001-T0002', [], [],
+      'node .nubos-pilot/bin/np-tools.cjs update-docs --check',
+    ),
+  });
+  const cap = _cap();
+  const code = planLintCli.run(['--milestone', 'M004'], { cwd: root, stdout: cap.stub });
+  const payload = JSON.parse(cap.get());
+  assert.equal(code, 2);
+  const raceFinding = payload.parallel_race_findings.find(
+    (f) => f.category === 'parallel-task-implicit-dependency',
+  );
+  assert.ok(raceFinding, 'expected parallel-task-implicit-dependency finding');
+  assert.equal(raceFinding.target, 'M004-S001-T0002');
+  assert.deepEqual(raceFinding.raw.conflicts, ['M004-S001-T0001']);
+});
+
+test('PLCLI-7: catches over-specification (Schema::create DDL in PLAN body)', () => {
+  const root = _mkProject({
+    '.nubos-pilot/milestones/M004/slices/S001/tasks/T0001/T0001-PLAN.md': _taskMd(
+      'M004-S001-T0001', ['x.php'], [], 'echo ok',
+    ).replace('# M004-S001-T0001\n',
+      '# M004-S001-T0001\n\nSchema::create(\'subscriptions\', function () {});\n'),
+  });
+  const cap = _cap();
+  const code = planLintCli.run(['--milestone', 'M004'], { cwd: root, stdout: cap.stub });
+  const payload = JSON.parse(cap.get());
+  // Major (advisory) is not enough to fail the gate by default — exit 0.
+  assert.equal(code, 0);
+  const finding = payload.files
+    .flatMap((f) => f.findings)
+    .find((f) => f.category === 'plan-over-specifies-implementation');
+  assert.ok(finding);
+  assert.equal(finding.severity, 'major');
+});
+
+test('PLCLI-8: lints a single file when given a path argument', () => {
+  const root = _mkProject({
+    'mytask.md': _taskMd('M001-S001-T0001', [], [], 'node .nubos-pilot/bin/np-tools.cjs nonexistent-verb'),
+  });
+  const cap = _cap();
+  const code = planLintCli.run(['mytask.md'], { cwd: root, stdout: cap.stub });
+  const payload = JSON.parse(cap.get());
+  assert.equal(code, 2);
+  assert.equal(payload.files.length, 1);
+  assert.ok(payload.files[0].findings.find((f) => f.category === 'verify-command-unknown'));
+});
+
+test('PLCLI-9: file-not-found surfaces a clear error', () => {
+  assert.throws(
+    () => planLintCli.run(['nonexistent.md'], { cwd: _mkProject({}), stdout: _cap().stub }),
+    (err) => err && err.code === 'plan-lint-file-not-found',
+  );
+});
+
+test('PLCLI-10: end-to-end — all three M004 plan-bug classes surfaced together', () => {
+  const root = _mkProject({
+    // T0001 modifies migration files (race target)
+    '.nubos-pilot/milestones/M004/slices/S001/tasks/T0001/T0001-PLAN.md': _taskMd(
+      'M004-S001-T0001',
+      ['database/migrations/0001_01_01_000004_create_customer_columns_table.php'],
+      [],
+      'php artisan migrate',
+    ),
+    // T0002 has working-tree-reader verify (creates implicit race) AND
+    // an unknown np-tools verb on the second line.
+    '.nubos-pilot/milestones/M004/slices/S001/tasks/T0002/T0002-PLAN.md': _taskMd(
+      'M004-S001-T0002', [], [],
+      'node .nubos-pilot/bin/np-tools.cjs update-docs --check\nnode .nubos-pilot/bin/np-tools.cjs codebase doc-lint',
+    ),
+  });
+  const cap = _cap();
+  const code = planLintCli.run(['--milestone', 'M004'], { cwd: root, stdout: cap.stub });
+  const payload = JSON.parse(cap.get());
+  assert.equal(code, 2);
+  const cats = new Set([
+    ...payload.files.flatMap((f) => f.findings).map((f) => f.category),
+    ...payload.parallel_race_findings.map((f) => f.category),
+  ]);
+  assert.ok(cats.has('verify-command-unknown'),
+    'must catch verify-command-unknown — saw: ' + [...cats].join(', '));
+  assert.ok(cats.has('parallel-task-implicit-dependency'),
+    'must catch parallel-task-implicit-dependency — saw: ' + [...cats].join(', '));
+  assert.ok(cats.has('plan-over-specifies-implementation'),
+    'must catch plan-over-specifies-implementation (framework-timestamped filename) — saw: '
+    + [...cats].join(', '));
+});
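The behaviour that PLCLI-6 asserts (a task whose verify reads the working tree is flagged, and the sibling that modifies files is named in `raw.conflicts`) can be sketched as follows. This is an assumed reconstruction from the test expectations, not the code in `lib/plan-lint.cjs`: the function name `lintParallelRaces` and the reader pattern, which covers only a few of the commands, are hypothetical.

```javascript
'use strict';

// Partial, assumed list of commands whose result depends on working-tree state.
const WORKING_TREE_READER_RE = /\bupdate-docs\b|\bphpstan\s+analyse\b|\bgit\s+(diff|status)\b/;

// tasks: [{ id, files_modified, depends_on, verifyText }] for one slice.
function lintParallelRaces(tasks) {
  // Only tasks that claim to be parallel-safe take part in the check.
  const parallel = tasks.filter((t) => t.depends_on.length === 0);
  const findings = [];
  for (const reader of parallel) {
    if (!WORKING_TREE_READER_RE.test(reader.verifyText)) continue;
    // Any sibling that writes files can change what the reader observes.
    const conflicts = parallel
      .filter((t) => t.id !== reader.id && t.files_modified.length > 0)
      .map((t) => t.id);
    if (conflicts.length > 0) {
      findings.push({
        category: 'parallel-task-implicit-dependency',
        severity: 'critical',
        target: reader.id,
        raw: { conflicts },
      });
    }
  }
  return findings;
}

// Usage with the PLCLI-6 fixture data.
console.log(JSON.stringify(lintParallelRaces([
  { id: 'M004-S001-T0001', files_modified: ['database/migrations/2024_01_01_000000_install_cashier.php'], depends_on: [], verifyText: 'php artisan migrate' },
  { id: 'M004-S001-T0002', files_modified: [], depends_on: [], verifyText: 'node .nubos-pilot/bin/np-tools.cjs update-docs --check' },
]), null, 2));
```

In this sketch the `conflicts` array doubles as the suggested `depends_on` value for the flagged task, which is one plausible way to turn the finding into a concrete fix.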
@@ -0,0 +1,95 @@
|
|
|
1
|
+
# ADR-0013: Plan-side Trust Layer — Mechanical PLAN.md Validation Before Promote
|
|
2
|
+
|
|
3
|
+
* Status: Accepted
|
|
4
|
+
* Date: 2026-05-05
|
|
5
|
+
* Supersedes: None
|
|
6
|
+
* Related: [ADR-0010](0010-nubosloop.md) — Execute-side Trust Layer (Layers A/B/C)
|
|
7
|
+
|
|
8
|
+
## Context and Problem Statement
|
|
9
|
+
|
|
10
|
+
ADR-0010 closed the Execute-side Trust gaps (Layers A, B, C) — `commit-task` and `loop-run-round` now refuse to advance unless the per-task evidence and audit-trail are intact. Real spawns must demonstrably have happened. That solved one half of the problem.
|
|
11
|
+
|
|
12
|
+
The other half surfaced in production runs: **the plans themselves were buggy**. Three failure classes recurred across milestones M002 + M004:
|
|
13
|
+
|
|
14
|
+
1. **Phantom CLI verbs in `<verify>` blocks.** A plan would specify `<verify>node .nubos-pilot/bin/np-tools.cjs codebase doc-lint</verify>`, but `codebase` is not a registered np-tools verb. The verify command is mechanically unexecutable. The Nubosloop catches this at execute time — verify-red → build-fixer → verify-red → build-fixer → stuck — but the cost is ~3 executor + 3 build-fixer + ~9 critic spawns for a deterministically-failing outcome that a 30-line lint check would have caught at plan time.
|
|
15
|
+
|
|
16
|
+
2. **False parallel-safety claims.** Tasks marked `depends_on: []` (parallel-safe) where one task's `<verify>` reads working-tree state (`update-docs --check`, `phpstan analyse`, `git diff`) against files another sibling task modifies. Filesystem-race at runtime; parallel-safe in name only.
|
|
17
|
+
|
|
18
|
+
3. **Implementation over-specification.** Plans bake in framework-controlled details — exact migration filenames (`0001_01_01_000004_create_customer_columns_table.php`), schema DDL (`Schema::create('subscriptions', function (Blueprint $table) { ... })`), code-style edicts (`use Cashier::calculateTaxes() inline in boot()`). These are not the planner's territory: the framework decides migration shapes, the executor reads codebase docs for style, the publish step decides filenames. A plan that pretends to know these is making falsifiable claims it cannot verify.
|
|
19
|
+
|
|
20
|
+
In all three classes, the orchestrator fielded the consequences at execute time when the planner should have prevented them.
|
|
21
|
+
|
|
22
|
+
## Decision Drivers
|
|
23
|
+
|
|
24
|
+
* The Nubosloop is expensive — 1 executor + 3 critics per task per round. Burning a full Nubosloop on a plan-bug is wasteful when mechanical detection is cheap.
|
|
25
|
+
* Plan-checker is an LLM-judgment agent (opus tier). LLM judgment is unreliable for syntactic checks (verb-existence, regex-pattern-detection) — those are mechanical-checker territory.
|
|
26
|
+
* Planners under user pressure rationalize over-specification ("being thorough"). Doctrine alone doesn't hold; mechanical refusal does.
|
|
27
|
+
|
|
28
|
+
## Considered Options
|
|
29
|
+
|
|
30
|
+
* **Status quo (LLM-judgment plan-checker only).** *Rejected.* Demonstrably fails on all three classes — observed in M002 + M004.
|
|
31
|
+
* **Add the three checks to `np-plan-checker` agent prompt.** *Rejected.* LLM-judgment is non-deterministic. The same plan, checked twice, can produce different verdicts. For mechanical violations we need deterministic refusal.
|
|
32
|
+
* **New mechanical lint verb + Layer-D enforcement in `plan-phase` workflow.** *Chosen.*
|
|
33
|
+
|
|
34
|
+
## Decision Outcome
|
|
35
|
+
|
|
36
|
+
Chosen: **Plan-side Trust Layer with three mechanical linters wrapped in a CLI verb (`np-tools.cjs plan-lint`), called from `plan-phase` workflow before each verification-loop iteration. Critical findings are merged into the LLM-checker verdict and force iteration-2.**
|
|
37
|
+
|
|
38
|
+
### Layer-D — three deterministic linters
|
|
39
|
+
|
|
40
|
+
* **D1 — `lintVerifyCommands`** (severity: critical). Every `<verify>` block is parsed; the first command per non-comment line is validated against:
|
|
41
|
+
* Known np-tools verbs (read from `_commands.cjs::COMMANDS`)
|
|
42
|
+
* Declared composer scripts (`composer.json::scripts`)
|
|
43
|
+
* Declared npm/pnpm/yarn scripts (`package.json::scripts`)
|
|
44
|
+
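* Path commands (anything containing `/`): the file must exist at lint time, except under the conventional bin dirs `vendor/bin/`, `node_modules/.bin/`, `bin/`, and `scripts/`, which are tolerated when absent because they may only appear after install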
|
|
45
|
+
* POSIX baseline (echo, test, [, sed, grep, find, …)
|
|
46
|
+
* Interpreter-prefixed calls (`node`, `php`, `composer`, `npm`, `npx`, …)
|
|
47
|
+
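Commands that fail validation emit `verify-command-unknown` with a concrete `raw.reason` (`np-tools-missing-verb`, `np-tools-unknown-verb`, `node-script-not-found`, `npm-script-not-declared`, `composer-script-not-declared`, `path-not-found`). Bareword commands that match none of the lists above are assumed to be PATH-resolved and pass.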
|
|
48
|
+
|
|
49
|
+
* **D2 — `lintParallelTaskRaces`** (severity: critical). For every slice with multiple tasks marked `depends_on: []`, computes whether any sibling's `<verify>` matches the working-tree-reader pattern (`update-docs`, `phpstan analyse`, `pint`, `eslint`, `tsc`, `git diff/status/ls-files/log`, `find -newer`, `pre-commit run`). If yes AND another sibling has non-empty `files_modified`, emits `parallel-task-implicit-dependency` naming the conflict. The hint includes the exact `depends_on` array the planner should have written.
|
|
50
|
+
|
|
51
|
+
* **D3 — `lintOverSpecification`** (severity: major, advisory). Heuristic regex scan for:
|
|
52
|
+
* Schema DDL (`CREATE TABLE`, `ALTER TABLE`, `ALTER COLUMN`, `Schema::create`, `Schema::table`, and column-builder calls such as `->string(`, `->integer(`, `->bigInteger(`, `->foreignId(`, `->timestamp(`)
|
|
53
|
+
* Framework-timestamped filenames (regex `\d{4}_\d{2}_\d{2}_\d{6}_[a-z_]+\.php`)
|
|
54
|
+
* Inline code blocks > 200 characters
|
|
55
|
+
Emits `plan-over-specifies-implementation`. Severity is `major` (advisory) so it surfaces without blocking the gate — heuristic false-positives are tolerated.
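
To illustrate the filename signal, the sketch below applies the same pattern that `lib/plan-lint.cjs` uses for `framework-timestamped-filename` to two `files_modified` entries. The `database/migrations/` prefix and the variable names are assumptions chosen for the example.

```javascript
// Same pattern as the framework-timestamped-filename signal in lib/plan-lint.cjs.
const TIMESTAMPED_MIGRATION = /\b\d{4}_\d{2}_\d{2}_\d{6}_[a-z_]+\.php\b/;

// Hypothetical files_modified entries: one exact filename, one glob.
const exact = 'database/migrations/0001_01_01_000004_create_customer_columns_table.php';
const glob = 'database/migrations/*_create_customer_columns_table.php';

// The exact, framework-generated name trips the heuristic; the glob does not.
const exactFlagged = TIMESTAMPED_MIGRATION.test(exact);
const globFlagged = TIMESTAMPED_MIGRATION.test(glob);
console.log(exactFlagged, globFlagged);
```

This is why the doctrine recommends a glob (or an empty `files_modified`) for migrations: the plan still names the intent without claiming a timestamp it cannot know.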
|
|
56
|
+
|
|
57
|
+
### Granularity Doctrine — propagated to planner agents
|
|
58
|
+
|
|
59
|
+
`agents/np-planner.md` gains a `<plan_granularity>` section codifying: plans specify intent + boundary + acceptance, not implementation. Concrete prohibitions enumerated, including:
|
|
60
|
+
* Schema DDL belongs to executor.
|
|
61
|
+
* Framework-timestamped filenames are publish-time output; use globs or empty `files_modified`.
|
|
62
|
+
* Code-style is codebase-state, executor reads `.nubos-pilot/codebase/<module>.md`.
|
|
63
|
+
* Library-internal claims must be `[VERIFIED]` via researcher or stay above that level.
|
|
64
|
+
|
|
65
|
+
`agents/np-architect.md` gains a granularity reminder: architecture decisions are intent-level (which library, which boundary, which protocol), not implementation prescriptions.
|
|
66
|
+
|
|
67
|
+
`agents/np-plan-checker.md` gains the three new canonical finding categories. The plan-checker agent mirrors mechanical findings into its YAML verdict so the verification loop treats them uniformly with semantic findings.
|
|
68
|
+
|
|
69
|
+
### Workflow integration
|
|
70
|
+
|
|
71
|
+
`workflows/plan-phase.md` calls `plan-lint --milestone $milestone_id` between the plan-checker spawn and the status-pass check, in each iteration of the verification loop. Critical findings are merged into the verdict JSON; the loop forces iteration-2 (and rejects the plan if iteration-2 still has critical findings).
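
The merge step might look roughly like the sketch below. The verdict shape (`status`, `findings`) and the helper name `mergeLintFindings` are assumptions for illustration; the workflow file itself defines the real contract.

```javascript
// Hypothetical verdict shape: { status: 'pass' | 'fail', findings: [...] }.
function mergeLintFindings(verdict, lintFindings) {
  const critical = lintFindings.filter((f) => f.severity === 'critical');
  return {
    ...verdict,
    // Only critical lint findings are merged into the verdict.
    findings: [...verdict.findings, ...critical],
    // Any critical finding overrides an LLM "pass" and forces another iteration.
    status: critical.length > 0 ? 'fail' : verdict.status,
  };
}

const llmVerdict = { status: 'pass', findings: [] };
const lintFindings = [
  { category: 'verify-command-unknown', severity: 'critical' },
  { category: 'plan-over-specifies-implementation', severity: 'major' },
];

const merged = mergeLintFindings(llmVerdict, lintFindings);
console.log(merged.status, merged.findings.length);
```

The advisory D3 finding is left out of the merged verdict in this sketch, which matches its non-blocking role; whether the real workflow surfaces it elsewhere is not shown here.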
|
|
72
|
+
|
|
73
|
+
### Defaults / Configuration
|
|
74
|
+
|
|
75
|
+
No new configuration. The linter is always-on; severity is fixed (critical for D1+D2, major for D3). Future configurability lives in `.nubos-pilot/config.json::plan_lint` if needed (e.g. project-specific allowlist for vendor binaries).
|
|
76
|
+
|
|
77
|
+
## Consequences
|
|
78
|
+
|
|
79
|
+
* **Good.** All three plan-bug classes from M002+M004 are now caught at plan time, before any executor spawns. On a milestone the size of M004 this saves roughly 190 agent invocations (27 tasks × ~7 spawns/task that would have hit the deterministic failure).
|
|
80
|
+
* **Good.** The mechanical layer is deterministic and auditable. Same plan → same verdict, every time.
|
|
81
|
+
* **Good.** Plan-checker (LLM, opus) can now focus on semantic checks (success-criterion coverage, decision fidelity) where its judgment adds value.
|
|
82
|
+
* **Good.** Doctrine in planner agents reinforces the lesson; mechanical layer enforces it when doctrine slips.
|
|
83
|
+
* **Bad.** D3 (over-specification) is heuristic and can false-positive on legitimate plans (e.g. a plan that *must* prescribe a schema for a custom CRUD-builder task). Mitigated by `severity: major` (advisory, not blocking).
|
|
84
|
+
* **Bad.** Plan-lint adds ~50ms to each plan-phase iteration. Acceptable — saves orders of magnitude more on the execute side.
|
|
85
|
+
* **Bad.** The Granularity Doctrine constrains planner output style. Existing plans in flight may fail D3 until rewritten. Migration path: D3 is advisory only; D1+D2 catch the actual blockers.
|
|
86
|
+
|
|
87
|
+
## More Information
|
|
88
|
+
|
|
89
|
+
* **Library:** `lib/plan-lint.cjs` — pure-function linters, no I/O outside file reads.
|
|
90
|
+
* **CLI verb:** `bin/np-tools/plan-lint.cjs` — `plan-lint <path>` or `plan-lint --milestone M<NNN>`. Exit 2 on critical findings, 0 otherwise. Output is JSON.
|
|
91
|
+
* **Tests:** `lib/plan-lint.test.cjs` (25 unit tests), `bin/np-tools/plan-lint.test.cjs` (10 CLI integration + e2e tests).
|
|
92
|
+
* **Workflow integration:** `workflows/plan-phase.md` § Verification Loop — plan-lint runs each iteration, findings merged into verdict.
|
|
93
|
+
* **Related ADR:** [ADR-0010](0010-nubosloop.md) — Execute-side Trust Layer (A/B/C). Layer-D (this ADR) is the planner-side counterpart.
|
|
94
|
+
* **Related ADR:** [ADR-0011](0011-researcher-swarm-consensus.md): the researcher swarm runs at plan time too; D1's `verify-command-unknown` is structurally similar to a researcher's `[VERIFIED]` provenance check, applied to the verify-command surface.
|
|
95
|
+
* **Related ADR:** [ADR-0012](0012-completeness-doctrine.md) — Layer-D enforces Rule 3 (do it with tests) at plan time: the verify command must be runnable.
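
The exit-code rule stated for the CLI verb can be sketched as follows. The helper name `exitCodeFor` is hypothetical, and the findings are reduced to the one field the rule depends on.

```javascript
// Exit 2 when any finding is critical, 0 otherwise (the rule stated for plan-lint).
function exitCodeFor(findings) {
  return findings.some((f) => f.severity === 'critical') ? 2 : 0;
}

// Advisory-only output (D3) does not fail the command.
const advisoryOnly = exitCodeFor([{ severity: 'major' }]);
// A single critical finding (D1 or D2) does.
const blocking = exitCodeFor([{ severity: 'major' }, { severity: 'critical' }]);
console.log(advisoryOnly, blocking);
```

A caller that only inspects the exit status therefore sees D1 and D2 violations but must read the JSON output to notice D3 advisories.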
|
|
@@ -0,0 +1,383 @@
|
|
|
1
|
+
'use strict';
|
|
2
|
+
|
|
3
|
+
// Plan-side Trust Layer — mechanical validation of PLAN.md files before they
|
|
4
|
+
// promote to execute. Counterpart to the Execute-side Trust Layer (ADR-0010
|
|
5
|
+
// Layers A/B/C). See ADR-0013.
|
|
6
|
+
//
|
|
7
|
+
// Three linters, all stateless and deterministic:
|
|
8
|
+
//
|
|
9
|
+
// * lintVerifyCommands — every `<verify>` block must invoke commands that
|
|
10
|
+
// exist (np-tools verbs, composer/npm/pnpm scripts, vendor binaries).
|
|
11
|
+
// Catches the "phantom CLI verb" bug class.
|
|
12
|
+
//
|
|
13
|
+
// * lintParallelTaskRaces — tasks marked depends_on:[] in the same slice
|
|
14
|
+
// must be truly parallel-safe. Working-tree-reading verify commands
|
|
15
|
+
// (update-docs, git diff, hash-of-working-tree) create implicit ordering.
|
|
16
|
+
//
|
|
17
|
+
// * lintOverSpecification — heuristic warning: PLAN.md should specify
|
|
18
|
+
// intent, not implementation. Schema DDL, exact framework-controlled
|
|
19
|
+
// filenames, and inline implementation snippets are advisory smells.
|
|
20
|
+
//
|
|
21
|
+
// Findings shape (matches np-plan-checker schema):
|
|
22
|
+
// { category, severity, target, message, hint?, raw? }
|
|
23
|
+
|
|
24
|
+
const fs = require('node:fs');
|
|
25
|
+
const path = require('node:path');
|
|
26
|
+
const { extractFrontmatter } = require('./frontmatter.cjs');
|
|
27
|
+
|
|
28
|
+
// --- Known-command surface --------------------------------------------------
|
|
29
|
+
|
|
30
|
+
// POSIX baseline — these always pass without further inspection.
|
|
31
|
+
const POSIX_BASELINE = Object.freeze(new Set([
|
|
32
|
+
'true', 'false', 'echo', 'printf', 'test', '[', '[[',
|
|
33
|
+
'cat', 'head', 'tail', 'wc', 'sort', 'uniq', 'cut', 'tr', 'sed', 'awk',
|
|
34
|
+
'grep', 'egrep', 'fgrep', 'find', 'xargs',
|
|
35
|
+
'ls', 'mkdir', 'rmdir', 'rm', 'cp', 'mv', 'ln', 'touch', 'chmod', 'chown',
|
|
36
|
+
'pwd', 'cd', 'set', 'unset', 'export', 'source', '.', 'eval', 'exit',
|
|
37
|
+
'diff', 'patch', 'tar', 'gzip', 'gunzip', 'zip', 'unzip',
|
|
38
|
+
'env', 'sleep', 'date', 'time', 'which', 'type',
|
|
39
|
+
]));
|
|
40
|
+
|
|
41
|
+
// Interpreter prefixes — for these, we look at the next argument as the
|
|
42
|
+
// "real" command and route validation accordingly.
|
|
43
|
+
const INTERPRETER_PREFIXES = Object.freeze(new Set([
|
|
44
|
+
'node', 'npx', 'pnpm', 'yarn', 'npm', 'bun', 'bunx',
|
|
45
|
+
'php', 'composer', 'python', 'python3', 'pipx', 'uv', 'poetry',
|
|
46
|
+
'ruby', 'bundle', 'go',
|
|
47
|
+
]));
|
|
48
|
+
|
|
49
|
+
// Working-tree-reading verify operations — used by lintParallelTaskRaces.
|
|
50
|
+
// If a parallel task uses one of these, it has an implicit dependency on
|
|
51
|
+
// every other task in the slice that has files_modified ≠ [].
|
|
52
|
+
const WORKING_TREE_READERS = Object.freeze([
|
|
53
|
+
/\bupdate-docs\b/i,
|
|
54
|
+
/\bgit\s+(diff|status|ls-files|log)/i,
|
|
55
|
+
/\bfind\s+\S+\s+-newer\b/i,
|
|
56
|
+
/\bpre-commit\s+run\b/i,
|
|
57
|
+
/\bphpstan\s+analyse\b/i, // reads source files across the project
|
|
58
|
+
/\bpint\b/i, // reads + may rewrite source files
|
|
59
|
+
/\beslint\b/i,
|
|
60
|
+
/\btsc\b/i, // reads tsconfig + project files
|
|
61
|
+
]);
|
|
62
|
+
|
|
63
|
+
// --- helpers ----------------------------------------------------------------
|
|
64
|
+
|
|
65
|
+
function _readJsonSafe(filepath) {
|
|
66
|
+
try {
|
|
67
|
+
if (!fs.existsSync(filepath)) return null;
|
|
68
|
+
const raw = fs.readFileSync(filepath, 'utf-8');
|
|
69
|
+
const parsed = JSON.parse(raw);
|
|
70
|
+
return parsed && typeof parsed === 'object' ? parsed : null;
|
|
71
|
+
} catch { return null; }
|
|
72
|
+
}
|
|
73
|
+
|
|
74
|
+
function _resolveKnownVerbs(opts) {
|
|
75
|
+
if (Array.isArray(opts && opts.knownVerbs)) return new Set(opts.knownVerbs);
|
|
76
|
+
try {
|
|
77
|
+
const cmds = require('../bin/np-tools/_commands.cjs');
|
|
78
|
+
if (Array.isArray(cmds.COMMANDS)) {
|
|
79
|
+
return new Set(cmds.COMMANDS.map((c) => c.name));
|
|
80
|
+
}
|
|
81
|
+
} catch {}
|
|
82
|
+
return new Set();
|
|
83
|
+
}
|
|
84
|
+
|
|
85
|
+
function _resolveScripts(cwd) {
|
|
86
|
+
const composer = _readJsonSafe(path.join(cwd, 'composer.json'));
|
|
87
|
+
const npm = _readJsonSafe(path.join(cwd, 'package.json'));
|
|
88
|
+
return {
|
|
89
|
+
composer: composer && composer.scripts && typeof composer.scripts === 'object'
|
|
90
|
+
? new Set(Object.keys(composer.scripts)) : new Set(),
|
|
91
|
+
npm: npm && npm.scripts && typeof npm.scripts === 'object'
|
|
92
|
+
? new Set(Object.keys(npm.scripts)) : new Set(),
|
|
93
|
+
};
|
|
94
|
+
}
|
|
95
|
+
|
|
96
|
+
function _binaryExists(cwd, relPath) {
|
|
97
|
+
try { return fs.existsSync(path.join(cwd, relPath)); }
|
|
98
|
+
catch { return false; }
|
|
99
|
+
}
|
|
100
|
+
|
|
101
|
+
// Tokenize a single command line — split on whitespace, drop empty tokens,
|
|
102
|
+
// strip leading env-vars (FOO=bar cmd...), shell control (;, &&, ||, |).
|
|
103
|
+
function _firstCommand(line) {
|
|
104
|
+
const stripped = String(line || '').trim();
|
|
105
|
+
if (!stripped) return null;
|
|
106
|
+
if (stripped.startsWith('#')) return null;
|
|
107
|
+
// Split on shell separators — only validate the first sub-command.
|
|
108
|
+
const head = stripped.split(/[;|&]+/)[0].trim();
|
|
109
|
+
if (!head) return null;
|
|
110
|
+
const tokens = head.split(/\s+/).filter(Boolean);
|
|
111
|
+
// Drop NAME=value env-var prefix(es).
|
|
112
|
+
let i = 0;
|
|
113
|
+
while (i < tokens.length && /^[A-Za-z_][A-Za-z0-9_]*=/.test(tokens[i])) i++;
|
|
114
|
+
if (i >= tokens.length) return null;
|
|
115
|
+
return { command: tokens[i], rest: tokens.slice(i + 1), full: head };
|
|
116
|
+
}
|
|
117
|
+
|
|
118
|
+
// Extract every `<verify>...</verify>` block from a PLAN.md body.
|
|
119
|
+
function _extractVerifyBlocks(body) {
|
|
120
|
+
const out = [];
|
|
121
|
+
const matches = String(body || '').matchAll(/<verify>([\s\S]*?)<\/verify>/g);
|
|
122
|
+
for (const m of matches) {
|
|
123
|
+
out.push({ start: m.index, body: m[1] });
|
|
124
|
+
}
|
|
125
|
+
return out;
|
|
126
|
+
}
|
|
127
|
+
|
|
128
|
+
// --- D1: lintVerifyCommands -------------------------------------------------
|
|
129
|
+
|
|
130
|
+
function _validateCommand(cmd, ctx) {
|
|
131
|
+
// Returns { ok, reason?, hint? }.
|
|
132
|
+
const { command, rest, full } = cmd;
|
|
133
|
+
// POSIX baseline
|
|
134
|
+
if (POSIX_BASELINE.has(command)) return { ok: true };
|
|
135
|
+
// Interpreter dispatch
|
|
136
|
+
if (INTERPRETER_PREFIXES.has(command)) {
|
|
137
|
+
return _validateInterpreterCall(command, rest, full, ctx);
|
|
138
|
+
}
|
|
139
|
+
// Path to a vendored binary (e.g. vendor/bin/phpstan)
|
|
140
|
+
if (command.includes('/')) { // also covers ./ and ../ prefixes
|
|
141
|
+
if (_binaryExists(ctx.cwd, command)) return { ok: true };
|
|
142
|
+
// Could be a binary that exists post-`composer install` / `npm install`.
|
|
143
|
+
// We accept these as legitimate IF they live under known bin dirs.
|
|
144
|
+
if (/^(\.\/)?(vendor\/bin|node_modules\/\.bin|bin|scripts)\//.test(command)) {
|
|
145
|
+
return { ok: true, hint: 'binary not present at lint-time but path is conventional' };
|
|
146
|
+
}
|
|
147
|
+
return { ok: false, reason: 'path-not-found',
|
|
148
|
+
hint: 'verify path "' + command + '" does not exist; check the project layout' };
|
|
149
|
+
}
|
|
150
|
+
// Bareword — treat as PATH-resolved binary; we cannot statically prove it,
|
|
151
|
+
// so accept but mark soft.
|
|
152
|
+
return { ok: true, hint: 'bareword command "' + command + '" assumed PATH-resolved' };
|
|
153
|
+
}
|
|
154
|
+
|
|
155
|
+
function _validateInterpreterCall(interp, rest, full, ctx) {
|
|
156
|
+
// node .../np-tools.cjs <verb> ...
|
|
157
|
+
if (interp === 'node' && rest.length >= 1) {
|
|
158
|
+
const target = rest[0];
|
|
159
|
+
if (/np-tools\.cjs$/.test(target)) {
|
|
160
|
+
const verb = rest[1];
|
|
161
|
+
if (!verb || verb.startsWith('-')) {
|
|
162
|
+
return { ok: false, reason: 'np-tools-missing-verb',
|
|
163
|
+
hint: 'np-tools.cjs requires a verb as second argument' };
|
|
164
|
+
}
|
|
165
|
+
if (!ctx.knownVerbs.has(verb)) {
|
|
166
|
+
return { ok: false, reason: 'np-tools-unknown-verb',
|
|
167
|
+
hint: 'verb "' + verb + '" is not a registered np-tools command (see _commands.cjs)' };
|
|
168
|
+
}
|
|
169
|
+
return { ok: true };
|
|
170
|
+
}
|
|
171
|
+
// node <some-script.js/cjs> — accept if script exists
|
|
172
|
+
if (/\.(c?js|mjs)$/.test(target)) {
|
|
173
|
+
if (_binaryExists(ctx.cwd, target)) return { ok: true };
|
|
174
|
+
return { ok: false, reason: 'node-script-not-found',
|
|
175
|
+
hint: 'node script "' + target + '" not found at lint-time' };
|
|
176
|
+
}
|
|
177
|
+
return { ok: true }; // node -e "..." or other forms
|
|
178
|
+
}
|
|
179
|
+
// npm run <script> / yarn <script> / pnpm <script> / pnpm run <script>
|
|
180
|
+
if (interp === 'npm' || interp === 'pnpm' || interp === 'yarn') {
|
|
181
|
+
let i = 0;
|
|
182
|
+
if (rest[i] === 'run' || rest[i] === 'run-script' || rest[i] === 'exec') i++;
|
|
183
|
+
const script = rest[i];
|
|
184
|
+
if (!script || script.startsWith('-')) return { ok: true }; // npm install etc.
|
|
185
|
+
// If the script name matches a package.json scripts entry, ok.
|
|
186
|
+
if (ctx.scripts.npm.has(script)) return { ok: true };
|
|
187
|
+
// Common pass-throughs: install, ci, test (if package.json doesn't list it,
|
|
188
|
+
// npm test still works as a default — accept).
|
|
189
|
+
if (['install', 'ci', 'test', 'audit', 'update', 'outdated'].includes(script)) {
|
|
190
|
+
return { ok: true };
|
|
191
|
+
}
|
|
192
|
+
return { ok: false, reason: 'npm-script-not-declared',
|
|
193
|
+
hint: '"' + script + '" is not declared in package.json scripts' };
|
|
194
|
+
}
|
|
195
|
+
// composer <script>
|
|
196
|
+
if (interp === 'composer') {
|
|
197
|
+
const script = rest[0];
|
|
198
|
+
if (!script || script.startsWith('-')) return { ok: true };
|
|
199
|
+
if (ctx.scripts.composer.has(script)) return { ok: true };
|
|
200
|
+
// Standard composer subcommands (not project-defined scripts):
|
|
201
|
+
const builtin = new Set([
|
|
202
|
+
'install', 'update', 'require', 'remove', 'dump-autoload', 'dumpautoload',
|
|
203
|
+
'show', 'why', 'depends', 'why-not', 'audit', 'check-platform-reqs',
|
|
204
|
+
'create-project', 'init', 'self-update', 'about', 'archive', 'browse',
|
|
205
|
+
'clear-cache', 'clearcache', 'config', 'diagnose', 'exec', 'fund',
|
|
206
|
+
'global', 'home', 'licenses', 'list', 'outdated', 'prohibits', 'reinstall',
|
|
207
|
+
'run-script', 'run', 'search', 'status', 'suggests', 'validate',
|
|
208
|
+
]);
|
|
209
|
+
if (builtin.has(script)) return { ok: true };
|
|
210
|
+
return { ok: false, reason: 'composer-script-not-declared',
|
|
211
|
+
hint: '"' + script + '" is neither a composer builtin nor declared in composer.json scripts' };
|
|
212
|
+
}
|
|
213
|
+
// npx / bunx / pnpx — assume the package will be fetched at runtime.
|
|
214
|
+
if (interp === 'npx' || interp === 'bunx' || interp === 'pnpx') return { ok: true };
|
|
215
|
+
// php <file> — accept; php artisan, php -r, php script.php — all standard.
|
|
216
|
+
if (interp === 'php') return { ok: true };
|
|
217
|
+
// ruby/python/python3/go/bun: accepted the same way.
|
|
218
|
+
if (['ruby', 'python', 'python3', 'go', 'bun'].includes(interp)) return { ok: true };
|
|
219
|
+
// bundle exec <something>
|
|
220
|
+
if (interp === 'bundle') return { ok: true };
|
|
221
|
+
return { ok: true };
|
|
222
|
+
}
|
|
223
|
+
|
|
224
|
+
function lintVerifyCommands(planBody, opts) {
|
|
225
|
+
const cwd = (opts && opts.cwd) || process.cwd();
|
|
226
|
+
const ctx = {
|
|
227
|
+
cwd,
|
|
228
|
+
knownVerbs: _resolveKnownVerbs(opts),
|
|
229
|
+
scripts: _resolveScripts(cwd),
|
|
230
|
+
};
|
|
231
|
+
const findings = [];
|
|
232
|
+
const blocks = _extractVerifyBlocks(planBody || '');
|
|
233
|
+
for (const block of blocks) {
|
|
234
|
+
const lines = block.body.split(/\r?\n/);
|
|
235
|
+
for (const line of lines) {
|
|
236
|
+
const cmd = _firstCommand(line);
|
|
237
|
+
if (!cmd) continue;
|
|
238
|
+
const verdict = _validateCommand(cmd, ctx);
|
|
239
|
+
if (!verdict.ok) {
|
|
240
|
+
findings.push({
|
|
241
|
+
category: 'verify-command-unknown',
|
|
242
|
+
severity: 'critical',
|
|
243
|
+
target: '<verify> block',
|
|
244
|
+
message: '`' + cmd.full + '` — ' + (verdict.hint || verdict.reason || 'unknown command'),
|
|
245
|
+
hint: verdict.hint || null,
|
|
246
|
+
raw: { reason: verdict.reason, command: cmd.command, line },
|
|
247
|
+
});
|
|
248
|
+
}
|
|
249
|
+
}
|
|
250
|
+
}
|
|
251
|
+
return findings;
|
|
252
|
+
}
|
|
253
|
+
|
|
254
|
+
// --- D2: lintParallelTaskRaces ---------------------------------------------
|
|
255
|
+
|
|
256
|
+
function _verifyReadsWorkingTree(verifyText) {
|
|
257
|
+
const t = String(verifyText || '');
|
|
258
|
+
for (const re of WORKING_TREE_READERS) {
|
|
259
|
+
if (re.test(t)) return true;
|
|
260
|
+
}
|
|
261
|
+
return false;
|
|
262
|
+
}
|
|
263
|
+
|
|
264
|
+
function lintParallelTaskRaces(tasks) {
|
|
265
|
+
// tasks: array of { id, files_modified, verifyText, depends_on, slice? }
|
|
266
|
+
const findings = [];
|
|
267
|
+
// Group by slice (or treat as one group).
|
|
268
|
+
const groups = new Map();
|
|
269
|
+
for (const t of tasks || []) {
|
|
270
|
+
const key = t.slice || '__default__';
|
|
271
|
+
if (!groups.has(key)) groups.set(key, []);
|
|
272
|
+
groups.get(key).push(t);
|
|
273
|
+
}
|
|
274
|
+
for (const [, group] of groups) {
|
|
275
|
+
if (group.length < 2) continue;
|
|
276
|
+
// Identify parallel tasks (depends_on is empty array OR missing).
|
|
277
|
+
const parallel = group.filter((t) => {
|
|
278
|
+
const d = t.depends_on;
|
|
279
|
+
return !Array.isArray(d) || d.length === 0;
|
|
280
|
+
});
|
|
281
|
+
if (parallel.length < 2) continue;
|
|
282
|
+
for (const a of parallel) {
|
|
283
|
+
if (!_verifyReadsWorkingTree(a.verifyText)) continue;
|
|
284
|
+
// a's verify reads working-tree → if any sibling has files_modified ≠ [],
|
|
285
|
+
// a has an implicit dependency on it.
|
|
286
|
+
const conflicts = parallel.filter(
|
|
287
|
+
(b) => b.id !== a.id
|
|
288
|
+
&& Array.isArray(b.files_modified)
|
|
289
|
+
&& b.files_modified.length > 0,
|
|
290
|
+
);
|
|
291
|
+
if (conflicts.length === 0) continue;
|
|
292
|
+
findings.push({
|
|
293
|
+
category: 'parallel-task-implicit-dependency',
|
|
294
|
+
severity: 'critical',
|
|
295
|
+
target: a.id,
|
|
296
|
+
message: 'task ' + a.id + ' is marked parallel (depends_on:[]) but its <verify> reads the working tree, ' +
|
|
297
|
+
'creating an implicit ordering against sibling task(s) that modify files: ' +
|
|
298
|
+
conflicts.map((c) => c.id).join(', '),
|
|
299
|
+
hint: 'set depends_on to [' + conflicts.map((c) => '"' + c.id + '"').join(', ') + '] OR ' +
|
|
300
|
+
'replace the working-tree-reading verify with a stateless check',
|
|
301
|
+
raw: { task: a.id, conflicts: conflicts.map((c) => c.id) },
|
|
302
|
+
});
|
|
303
|
+
}
|
|
304
|
+
}
|
|
305
|
+
return findings;
|
|
306
|
+
}
|
|
307
|
+
|
|
308
|
+
// --- D3: lintOverSpecification (heuristic) ---------------------------------
|
|
309
|
+
|
|
310
|
+
const OVER_SPECIFICATION_SIGNALS = [
|
|
311
|
+
{
|
|
312
|
+
name: 'schema-ddl',
|
|
313
|
+
re: /^(\s*)?(CREATE\s+TABLE|ALTER\s+(TABLE|COLUMN)|Schema::(create|table)|->\s*(string|integer|bigInteger|foreignId|timestamp)\s*\()/im,
|
|
314
|
+
hint: 'schema DDL belongs to the executor — the plan describes intent (e.g. "subscriptions table with columns the framework dictates"), not exact column shape',
|
|
315
|
+
},
|
|
316
|
+
{
|
|
317
|
+
name: 'framework-timestamped-filename',
|
|
318
|
+
re: /\b\d{4}_\d{2}_\d{2}_\d{6}_[a-z_]+\.php\b/,
|
|
319
|
+
hint: 'framework-controlled migration filenames are publish-time output, not plan input — use a glob pattern in files_modified',
|
|
320
|
+
},
|
|
321
|
+
{
|
|
322
|
+
name: 'inline-code-snippet',
|
|
323
|
+
re: /```(?:[a-z]+)?\n[\s\S]{200,}\n```/,
|
|
324
|
+
hint: 'large code blocks in PLAN.md push implementation into the planner — describe what the code must achieve, let the executor write it',
|
|
325
|
+
},
|
|
326
|
+
];
|
|
327
|
+
|
|
328
|
+
function lintOverSpecification(planBody) {
|
|
329
|
+
const findings = [];
|
|
330
|
+
const body = String(planBody || '');
|
|
331
|
+
for (const sig of OVER_SPECIFICATION_SIGNALS) {
|
|
332
|
+
const m = body.match(sig.re);
|
|
333
|
+
if (m) {
|
|
334
|
+
findings.push({
|
|
335
|
+
category: 'plan-over-specifies-implementation',
|
|
336
|
+
severity: 'major',
|
|
337
|
+
target: 'PLAN.md body',
|
|
338
|
+
message: 'over-specification signal: ' + sig.name + ' (matched: ' +
|
|
339
|
+
String(m[0]).replace(/\s+/g, ' ').slice(0, 80) + ')',
|
|
340
|
+
hint: sig.hint,
|
|
341
|
+
raw: { signal: sig.name, snippet: String(m[0]).slice(0, 200) },
|
|
342
|
+
});
|
|
343
|
+
}
|
|
344
|
+
}
|
|
345
|
+
return findings;
|
|
346
|
+
}
|
|
347
|
+
|
|
348
|
+
// --- combined lintPlan -----------------------------------------------------
|
|
349
|
+
|
|
350
|
+
// lintPlan takes the post-frontmatter body for verify-command parsing and an
|
|
351
|
+
// optional raw string (frontmatter + body) for over-specification scanning —
|
|
352
|
+
// framework-timestamped filenames typically live in frontmatter `files_modified`,
|
|
353
|
+
// so the over-spec heuristic needs to see both.
|
|
354
|
+
function lintPlan(planBody, opts) {
|
|
355
|
+
const raw = (opts && typeof opts.raw === 'string') ? opts.raw : planBody;
|
|
356
|
+
return [
|
|
357
|
+
...lintVerifyCommands(planBody, opts),
|
|
358
|
+
...lintOverSpecification(raw),
|
|
359
|
+
];
|
|
360
|
+
}
|
|
361
|
+
|
|
362
|
+
// Lint a single task PLAN.md (frontmatter parsed automatically).
|
|
363
|
+
function lintTaskFile(planMdPath, opts) {
|
|
364
|
+
const cwd = (opts && opts.cwd) || process.cwd();
|
|
365
|
+
const raw = fs.readFileSync(planMdPath, 'utf-8');
|
|
366
|
+
const { frontmatter, body } = extractFrontmatter(raw);
|
|
367
|
+
return {
|
|
368
|
+
path: planMdPath,
|
|
369
|
+
frontmatter: frontmatter || {},
|
|
370
|
+
findings: lintPlan(body, { ...opts, cwd, raw }),
|
|
371
|
+
};
|
|
372
|
+
}
|
|
373
|
+
|
|
374
|
+
module.exports = {
|
|
375
|
+
lintVerifyCommands,
|
|
376
|
+
lintParallelTaskRaces,
|
|
377
|
+
lintOverSpecification,
|
|
378
|
+
lintPlan,
|
|
379
|
+
lintTaskFile,
|
|
380
|
+
POSIX_BASELINE,
|
|
381
|
+
INTERPRETER_PREFIXES,
|
|
382
|
+
WORKING_TREE_READERS,
|
|
383
|
+
};
|
|
@@ -0,0 +1,313 @@
|
|
|
1
|
+
'use strict';
|
|
2
|
+
|
|
3
|
+
const fs = require('node:fs');
|
|
4
|
+
const os = require('node:os');
|
|
5
|
+
const path = require('node:path');
|
|
6
|
+
const { test, afterEach } = require('node:test');
|
|
7
|
+
const assert = require('node:assert/strict');
|
|
8
|
+
|
|
9
|
+
const planLint = require('./plan-lint.cjs');
|
|
10
|
+
|
|
11
|
+
const _sandboxes = [];
|
|
12
|
+
function _mkRoot(files) {
|
|
13
|
+
const r = fs.mkdtempSync(path.join(os.tmpdir(), 'np-pl-'));
|
|
14
|
+
if (files) {
|
|
15
|
+
for (const [rel, content] of Object.entries(files)) {
|
|
16
|
+
const abs = path.join(r, rel);
|
|
17
|
+
fs.mkdirSync(path.dirname(abs), { recursive: true });
|
|
18
|
+
fs.writeFileSync(abs, content, 'utf-8');
|
|
19
|
+
}
|
|
20
|
+
}
|
|
21
|
+
_sandboxes.push(r);
|
|
22
|
+
return r;
|
|
23
|
+
}
|
|
24
|
+
afterEach(() => {
|
|
25
|
+
while (_sandboxes.length) {
|
|
26
|
+
try { fs.rmSync(_sandboxes.pop(), { recursive: true, force: true }); } catch {}
|
|
27
|
+
}
|
|
28
|
+
});
|
|
29
|
+
|
|
30
|
+
// ===========================================================================
|
|
31
|
+
// D1 — lintVerifyCommands
|
|
32
|
+
// ===========================================================================
|
|
33
|
+
|
|
34
|
+
test('PL-VC-1: passes for known np-tools verb', () => {
|
|
35
|
+
const findings = planLint.lintVerifyCommands(
|
|
36
|
+
'<verify>node .nubos-pilot/bin/np-tools.cjs commit-task M001-S001-T0001</verify>',
|
|
37
|
+
{ knownVerbs: ['commit-task', 'state'] },
|
|
38
|
+
);
|
|
39
|
+
assert.equal(findings.length, 0);
|
|
40
|
+
});
|
|
41
|
+
|
|
42
|
+
test('PL-VC-2: catches unknown np-tools verb (the M004 bug class)', () => {
|
|
43
|
+
const findings = planLint.lintVerifyCommands(
|
|
44
|
+
'<verify>node .nubos-pilot/bin/np-tools.cjs codebase doc-lint</verify>',
|
|
45
|
+
{ knownVerbs: ['commit-task', 'state', 'help'] },
|
|
46
|
+
);
|
|
47
|
+
assert.equal(findings.length, 1);
|
|
48
|
+
assert.equal(findings[0].category, 'verify-command-unknown');
|
|
49
|
+
assert.equal(findings[0].severity, 'critical');
|
|
50
|
+
assert.equal(findings[0].raw.reason, 'np-tools-unknown-verb');
|
|
51
|
+
});
|
|
52
|
+
|
|
53
|
+
test('PL-VC-3: catches np-tools call without a verb', () => {
|
|
54
|
+
const findings = planLint.lintVerifyCommands(
|
|
55
|
+
'<verify>node .nubos-pilot/bin/np-tools.cjs --help</verify>',
|
|
56
|
+
{ knownVerbs: ['commit-task'] },
|
|
57
|
+
);
|
|
58
|
+
assert.equal(findings.length, 1);
|
|
59
|
+
assert.equal(findings[0].raw.reason, 'np-tools-missing-verb');
|
|
60
|
+
});
|
|
61
|
+
|
|
62
|
+
test('PL-VC-4: passes for declared composer script', () => {
|
|
63
|
+
const r = _mkRoot({
|
|
64
|
+
'composer.json': JSON.stringify({ scripts: { test: 'phpunit' } }),
|
|
65
|
+
});
|
|
66
|
+
const findings = planLint.lintVerifyCommands(
|
|
67
|
+
'<verify>composer test</verify>',
|
|
68
|
+
{ cwd: r },
|
|
69
|
+
);
|
|
70
|
+
assert.equal(findings.length, 0);
|
|
71
|
+
});
|
|
72
|
+
|
|
73
|
+
test('PL-VC-5: catches undeclared composer script', () => {
|
|
74
|
+
const r = _mkRoot({
|
|
75
|
+
'composer.json': JSON.stringify({ scripts: { test: 'phpunit' } }),
|
|
76
|
+
});
|
|
77
|
+
const findings = planLint.lintVerifyCommands(
|
|
78
|
+
'<verify>composer phantom-script</verify>',
|
|
79
|
+
{ cwd: r },
|
|
80
|
+
);
|
|
81
|
+
assert.equal(findings.length, 1);
|
|
82
|
+
assert.equal(findings[0].raw.reason, 'composer-script-not-declared');
|
|
83
|
+
});
|
|
84
|
+
|
|
85
|
+
test('PL-VC-6: composer builtin (install/update/dump-autoload) always passes', () => {
|
|
86
|
+
const r = _mkRoot({});
|
|
87
|
+
const findings = planLint.lintVerifyCommands(
|
|
88
|
+
'<verify>composer dump-autoload</verify>',
|
|
89
|
+
{ cwd: r },
|
|
90
|
+
);
|
|
91
|
+
assert.equal(findings.length, 0);
|
|
92
|
+
});
|
|
93
|
+
|
|
94
|
+
test('PL-VC-7: passes for declared npm script', () => {
|
|
95
|
+
const r = _mkRoot({
|
|
96
|
+
'package.json': JSON.stringify({ scripts: { lint: 'eslint .' } }),
|
|
97
|
+
});
|
|
98
|
+
const findings = planLint.lintVerifyCommands(
|
|
99
|
+
'<verify>npm run lint</verify>',
|
|
100
|
+
{ cwd: r },
|
|
101
|
+
);
|
|
102
|
+
assert.equal(findings.length, 0);
|
|
103
|
+
});
|
|
104
|
+
|
|
105
|
+
test('PL-VC-8: catches undeclared npm script', () => {
|
|
106
|
+
const r = _mkRoot({
|
|
107
|
+
'package.json': JSON.stringify({ scripts: { lint: 'eslint .' } }),
|
|
108
|
+
});
|
|
109
|
+
const findings = planLint.lintVerifyCommands(
|
|
110
|
+
'<verify>npm run nonexistent</verify>',
|
|
111
|
+
{ cwd: r },
|
|
112
|
+
);
|
|
113
|
+
assert.equal(findings.length, 1);
|
|
114
|
+
assert.equal(findings[0].raw.reason, 'npm-script-not-declared');
|
|
115
|
+
});
|
|
116
|
+
|
|
117
|
+
test('PL-VC-9: passes for vendor/bin/* path even if file is absent (post-install)', () => {
|
|
118
|
+
const r = _mkRoot({});
|
|
119
|
+
const findings = planLint.lintVerifyCommands(
|
|
120
|
+
'<verify>vendor/bin/phpstan analyse</verify>',
|
|
121
|
+
{ cwd: r },
|
|
122
|
+
);
|
|
123
|
+
assert.equal(findings.length, 0);
|
|
124
|
+
});
|
|
125
|
+
|
|
126
|
+
test('PL-VC-10: passes for POSIX baseline (echo, test, [, sed)', () => {
|
|
127
|
+
const findings = planLint.lintVerifyCommands(
|
|
128
|
+
'<verify>echo ok && test -f file.txt</verify>',
|
|
129
|
+
{},
|
|
130
|
+
);
|
|
131
|
+
assert.equal(findings.length, 0);
|
|
132
|
+
});
|
|
133
|
+
|
|
134
|
+
test('PL-VC-11: catches non-existent path command', () => {
|
|
135
|
+
const r = _mkRoot({});
|
|
136
|
+
const findings = planLint.lintVerifyCommands(
|
|
137
|
+
'<verify>./scripts-elsewhere/run.sh</verify>',
|
|
138
|
+
{ cwd: r },
|
|
139
|
+
);
|
|
140
|
+
assert.equal(findings.length, 1);
|
|
141
|
+
assert.equal(findings[0].raw.reason, 'path-not-found');
|
|
142
|
+
});
|
|
143
|
+
|
|
144
|
+
test('PL-VC-12: multi-line verify, only first non-comment validated per line', () => {
|
|
145
|
+
const findings = planLint.lintVerifyCommands(
|
|
146
|
+
`<verify>
|
|
147
|
+
# this is a comment
|
|
148
|
+
echo "step 1"
|
|
149
|
+
node .nubos-pilot/bin/np-tools.cjs nonexistent-verb
|
|
150
|
+
</verify>`,
|
|
151
|
+
{ knownVerbs: ['existing-verb'] },
|
|
152
|
+
);
|
|
153
|
+
assert.equal(findings.length, 1);
|
|
154
|
+
assert.equal(findings[0].raw.reason, 'np-tools-unknown-verb');
|
|
155
|
+
});
|
|
156
|
+
|
|
157
|
+
test('PL-VC-13: env-var prefix is stripped before validation', () => {
|
|
158
|
+
const findings = planLint.lintVerifyCommands(
|
|
159
|
+
'<verify>FOO=bar BAZ=qux node .nubos-pilot/bin/np-tools.cjs commit-task X</verify>',
|
|
160
|
+
{ knownVerbs: ['commit-task'] },
|
|
161
|
+
);
|
|
162
|
+
assert.equal(findings.length, 0);
|
|
163
|
+
});
|
|
164
|
+
|
|
165
|
+
test('PL-VC-14: shell pipe — only validates first sub-command', () => {
|
|
166
|
+
const findings = planLint.lintVerifyCommands(
|
|
167
|
+
'<verify>echo data | grep pattern</verify>',
|
|
168
|
+
{},
|
|
169
|
+
);
|
|
170
|
+
assert.equal(findings.length, 0); // echo is POSIX baseline
|
|
171
|
+
});
|
|
172
|
+
|
|
173
|
+
// ===========================================================================
|
|
174
|
+
// D2 — lintParallelTaskRaces
|
|
175
|
+
// ===========================================================================
|
|
176
|
+
|
|
177
|
+
test('PL-PR-1: detects update-docs race against sibling that modifies files', () => {
|
|
178
|
+
const tasks = [
|
|
179
|
+
{ id: 'M001-S001-T0001', files_modified: ['src/foo.ts'], depends_on: [],
|
|
180
|
+
verifyText: 'php artisan test', slice: 'S001' },
|
|
181
|
+
{ id: 'M001-S001-T0002', files_modified: [], depends_on: [],
|
|
182
|
+
verifyText: 'node .nubos-pilot/bin/np-tools.cjs update-docs --check', slice: 'S001' },
|
|
183
|
+
];
|
|
184
|
+
const findings = planLint.lintParallelTaskRaces(tasks);
|
|
185
|
+
assert.equal(findings.length, 1);
|
|
186
|
+
assert.equal(findings[0].category, 'parallel-task-implicit-dependency');
|
|
187
|
+
assert.equal(findings[0].target, 'M001-S001-T0002');
|
|
188
|
+
assert.deepEqual(findings[0].raw.conflicts, ['M001-S001-T0001']);
|
|
189
|
+
});
|
|
190
|
+
|
|
191
|
+
test('PL-PR-2: detects phpstan-analyse race', () => {
|
|
192
|
+
const tasks = [
|
|
193
|
+
{ id: 'M001-S001-T0001', files_modified: ['src/a.php'], depends_on: [],
|
|
194
|
+
verifyText: '', slice: 'S001' },
|
|
195
|
+
{ id: 'M001-S001-T0002', files_modified: [], depends_on: [],
|
|
196
|
+
verifyText: 'vendor/bin/phpstan analyse', slice: 'S001' },
|
|
197
|
+
];
|
|
198
|
+
const findings = planLint.lintParallelTaskRaces(tasks);
|
|
199
|
+
assert.equal(findings.length, 1);
|
|
200
|
+
assert.equal(findings[0].target, 'M001-S001-T0002');
|
|
201
|
+
});
|
|
202
|
+
|
|
203
|
+
test('PL-PR-3: skips when explicit depends_on already declared', () => {
|
|
204
|
+
const tasks = [
|
|
205
|
+
{ id: 'M001-S001-T0001', files_modified: ['src/foo.ts'], depends_on: [],
|
|
206
|
+
verifyText: 'php artisan test', slice: 'S001' },
|
|
207
|
+
{ id: 'M001-S001-T0002', files_modified: [], depends_on: ['M001-S001-T0001'],
|
|
208
|
+
verifyText: 'node .nubos-pilot/bin/np-tools.cjs update-docs --check', slice: 'S001' },
|
|
209
|
+
];
|
|
210
|
+
const findings = planLint.lintParallelTaskRaces(tasks);
|
|
211
|
+
assert.equal(findings.length, 0);
|
|
212
|
+
});
|
|
213
|
+
|
|
214
|
+
test('PL-PR-4: ignores stateless verify (plain echo commands)', () => {
|
|
215
|
+
const tasks = [
|
|
216
|
+
{ id: 'M001-S001-T0001', files_modified: ['src/foo.ts'], depends_on: [],
|
|
217
|
+
verifyText: 'echo hi', slice: 'S001' },
|
|
218
|
+
{ id: 'M001-S001-T0002', files_modified: ['src/bar.ts'], depends_on: [],
|
|
219
|
+
verifyText: 'echo there', slice: 'S001' },
|
|
220
|
+
];
|
|
221
|
+
const findings = planLint.lintParallelTaskRaces(tasks);
|
|
222
|
+
assert.equal(findings.length, 0);
|
|
223
|
+
});
|
|
224
|
+
|
|
225
|
+
test('PL-PR-5: cross-slice tasks are not pairs (different slice keys)', () => {
|
|
226
|
+
const tasks = [
|
|
227
|
+
{ id: 'M001-S001-T0001', files_modified: ['src/foo.ts'], depends_on: [],
|
|
228
|
+
verifyText: 'php artisan test', slice: 'S001' },
|
|
229
|
+
{ id: 'M001-S002-T0001', files_modified: [], depends_on: [],
|
|
230
|
+
verifyText: 'update-docs --check', slice: 'S002' },
|
|
231
|
+
];
|
|
232
|
+
const findings = planLint.lintParallelTaskRaces(tasks);
|
|
233
|
+
assert.equal(findings.length, 0);
|
|
234
|
+
});
|
|
235
|
+
|
|
236
|
+
// ===========================================================================
|
|
237
|
+
// D3 — lintOverSpecification (heuristic)
|
|
238
|
+
// ===========================================================================
|
|
239
|
+
|
|
240
|
+
test('PL-OS-1: catches Schema::create DDL block', () => {
|
|
241
|
+
const findings = planLint.lintOverSpecification(`
|
|
242
|
+
## Migration
|
|
243
|
+
|
|
244
|
+
Schema::create('subscriptions', function (Blueprint $table) {
|
|
245
|
+
$table->bigIncrements('id');
|
|
246
|
+
});
|
|
247
|
+
`);
|
|
248
|
+
assert.equal(findings.length, 1);
|
|
249
|
+
assert.equal(findings[0].category, 'plan-over-specifies-implementation');
|
|
250
|
+
assert.equal(findings[0].raw.signal, 'schema-ddl');
|
|
251
|
+
});
|
|
252
|
+
|
|
253
|
+
test('PL-OS-2: catches framework-controlled migration filename', () => {
|
|
254
|
+
const findings = planLint.lintOverSpecification(`
|
|
255
|
+
files_modified:
|
|
256
|
+
- database/migrations/0001_01_01_000004_create_customer_columns_table.php
|
|
257
|
+
`);
|
|
258
|
+
assert.equal(findings.length, 1);
|
|
259
|
+
assert.equal(findings[0].raw.signal, 'framework-timestamped-filename');
|
|
260
|
+
});
|
|
261
|
+
|
|
262
|
+
test('PL-OS-3: catches a large inline code block', () => {
|
|
263
|
+
const big = Array(20).fill(' some_field: value').join('\n');
|
|
264
|
+
const findings = planLint.lintOverSpecification('```yaml\n' + big + '\n```');
|
|
265
|
+
assert.equal(findings.length, 1);
|
|
266
|
+
assert.equal(findings[0].raw.signal, 'inline-code-snippet');
|
|
267
|
+
});
|
|
268
|
+
|
|
269
|
+
test('PL-OS-4: clean intent-only PLAN body produces zero findings', () => {
|
|
270
|
+
const findings = planLint.lintOverSpecification(`
|
|
271
|
+
## Goal
|
|
272
|
+
Install Cashier billing into the project.
|
|
273
|
+
|
|
274
|
+
## Boundary
|
|
275
|
+
- App service provider
|
|
276
|
+
- Test surface
|
|
277
|
+
|
|
278
|
+
## Acceptance
|
|
279
|
+
- Pest tests for Cashier integration green
|
|
280
|
+
- Migrations applied successfully
|
|
281
|
+
`);
|
|
282
|
+
assert.equal(findings.length, 0);
|
|
283
|
+
});
|
|
284
|
+
|
|
285
|
+
// ===========================================================================
|
|
286
|
+
// Integration
|
|
287
|
+
// ===========================================================================
|
|
288
|
+
|
|
289
|
+
test('PL-INT-1: lintPlan combines verify-command + over-specification', () => {
|
|
290
|
+
const findings = planLint.lintPlan(`
|
|
291
|
+
<verify>node .nubos-pilot/bin/np-tools.cjs codebase doc-lint</verify>
|
|
292
|
+
|
|
293
|
+
Schema::create('tbl', function () {});
|
|
294
|
+
`, { knownVerbs: ['commit-task'] });
|
|
295
|
+
assert.equal(findings.length, 2);
|
|
296
|
+
const cats = findings.map((f) => f.category).sort();
|
|
297
|
+
assert.deepEqual(cats, ['plan-over-specifies-implementation', 'verify-command-unknown']);
|
|
298
|
+
});
|
|
299
|
+
|
|
300
|
+
test('PL-INT-2: lintTaskFile reads frontmatter + body and runs full lint', () => {
|
|
301
|
+
const r = _mkRoot({
|
|
302
|
+
'task.md': `---
|
|
303
|
+
id: M001-S001-T0001
|
|
304
|
+
files_modified: []
|
|
305
|
+
---
|
|
306
|
+
<verify>node .nubos-pilot/bin/np-tools.cjs codebase doc-lint</verify>
|
|
307
|
+
`,
|
|
308
|
+
});
|
|
309
|
+
const result = planLint.lintTaskFile(path.join(r, 'task.md'), { knownVerbs: ['commit-task'] });
|
|
310
|
+
assert.equal(result.frontmatter.id, 'M001-S001-T0001');
|
|
311
|
+
assert.equal(result.findings.length, 1);
|
|
312
|
+
assert.equal(result.findings[0].category, 'verify-command-unknown');
|
|
313
|
+
});
|
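The PL-VC-12 to PL-VC-14 cases above describe three parsing rules for `<verify>` blocks: comment lines are skipped, leading `VAR=value` assignments are stripped, and only the first sub-command of a pipeline is validated. A minimal sketch of those rules follows. It is not the code in `lib/plan-lint.cjs`; the function name `firstCommands` and the regular expressions are assumptions made for illustration, and treating `&&` and `;` the same way as `|` is also an assumption.

```javascript
'use strict';

// Hypothetical helper, not part of nubos-pilot: it only illustrates the
// behaviour that the PL-VC-12..14 test names describe.
const ENV_ASSIGNMENT = /^[A-Za-z_][A-Za-z0-9_]*=\S*$/;

// Return the command that would be validated for each non-comment line
// inside every <verify>...</verify> block of a plan body.
function firstCommands(planText) {
  const commands = [];
  for (const [, body] of planText.matchAll(/<verify>([\s\S]*?)<\/verify>/g)) {
    for (const rawLine of body.split('\n')) {
      const line = rawLine.trim();
      // Blank lines and shell comments carry no command.
      if (line === '' || line.startsWith('#')) continue;
      // Keep only the first sub-command of a pipeline or chain.
      // Naive split: operators inside quoted strings are not handled.
      const head = line.split(/\|\||&&|[|;]/)[0].trim();
      const tokens = head.split(/\s+/);
      // Drop leading environment assignments such as FOO=bar.
      while (tokens.length > 0 && ENV_ASSIGNMENT.test(tokens[0])) tokens.shift();
      if (tokens.length > 0) commands.push(tokens.join(' '));
    }
  }
  return commands;
}

console.log(firstCommands('<verify>FOO=bar echo data | grep pattern</verify>'));
```

A real linter would then classify each returned command (POSIX baseline, declared npm script, known np-tools verb, existing path), which is what the `raw.reason` assertions in the tests check.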
package/np-tools.cjs
CHANGED
|
@@ -93,6 +93,8 @@ const topLevelCommands = {
|
|
|
93
93
|
'session-snapshot-write': require('./bin/np-tools/session-snapshot-write.cjs'),
|
|
94
94
|
'session-snapshot-read': require('./bin/np-tools/session-snapshot-read.cjs'),
|
|
95
95
|
|
|
96
|
+
'plan-lint': require('./bin/np-tools/plan-lint.cjs'),
|
|
97
|
+
|
|
96
98
|
'loop-state-read': require('./bin/np-tools/loop-state-read.cjs'),
|
|
97
99
|
'loop-state-record': require('./bin/np-tools/loop-state-record.cjs'),
|
|
98
100
|
'loop-evaluate': require('./bin/np-tools/loop-evaluate.cjs'),
|
package/package.json
CHANGED
package/workflows/plan-phase.md
CHANGED
|
@@ -258,6 +258,30 @@ for ITER in 1 2; do
|
|
|
258
258
|
VERDICT_JSON_PATH="$milestone_dir/.tmp-verdict-$ITER.json"
|
|
259
259
|
# (verdict JSON: {status: passed|issues_found, findings: [...] })
|
|
260
260
|
|
|
261
|
+
# --- Plan-side Trust Layer (ADR-0013): mechanical pre-flight ---
|
|
262
|
+
# Before treating the LLM-judgment verdict as authoritative, run the
|
|
263
|
+
# mechanical plan-lint over every PLAN.md in the milestone. Critical
|
|
264
|
+
# findings (verify-command-unknown / parallel-task-implicit-dependency)
|
|
265
|
+
# MUST be merged into the verdict so iteration-2 forces a fix. Mechanical
|
|
266
|
+
# findings are non-negotiable; the planner-checker LLM cannot override them.
|
|
267
|
+
PLAN_LINT_JSON=$(node .nubos-pilot/bin/np-tools.cjs plan-lint --milestone "$milestone_id" 2>&1) || true
|
|
268
|
+
PLAN_LINT_CRITICAL=$(echo "$PLAN_LINT_JSON" | node -e 'let s="";process.stdin.on("data",d=>{s+=d}).on("end",()=>{try{const j=JSON.parse(s);console.log((j.summary&&j.summary.critical)||0)}catch{console.log(0)}})')
|
|
269
|
+
if [ "${PLAN_LINT_CRITICAL:-0}" -gt 0 ]; then
|
|
270
|
+
# Promote mechanical findings into the verdict file so iteration-2 sees them.
|
|
271
|
+
echo "$PLAN_LINT_JSON" > "$milestone_dir/.tmp-plan-lint-$ITER.json"
|
|
272
|
+
node -e '
|
|
273
|
+
const fs = require("fs");
|
|
274
|
+
const verdict = JSON.parse(fs.readFileSync(process.argv[1], "utf-8"));
|
|
275
|
+
const lint = JSON.parse(fs.readFileSync(process.argv[2], "utf-8"));
|
|
276
|
+
const findings = Array.isArray(verdict.findings) ? verdict.findings.slice() : [];
|
|
277
|
+
for (const f of (lint.files || []).flatMap(x => x.findings || [])) findings.push(f);
|
|
278
|
+
for (const f of (lint.parallel_race_findings || [])) findings.push(f);
|
|
279
|
+
verdict.findings = findings;
|
|
280
|
+
verdict.status = "issues_found";
|
|
281
|
+
fs.writeFileSync(process.argv[1], JSON.stringify(verdict, null, 2));
|
|
282
|
+
' "$VERDICT_JSON_PATH" "$milestone_dir/.tmp-plan-lint-$ITER.json"
|
|
283
|
+
fi
|
|
284
|
+
|
|
261
285
|
# (Plan-review append uses the milestone-id form — append-only audit)
|
|
262
286
|
# Future: move to plan-milestone plan-review-append verb.
|
|
263
287
|
|