nubos-pilot 0.9.4 → 0.9.6

@@ -26,6 +26,10 @@ This agent operates under [`templates/COMPLETENESS.md`](../templates/COMPLETENES
 
 Refusal of any rule is a hard-stop. Surface the violation to the orchestrator verbatim and abort the spawn.
 
+## Spawn-Evidence Audit (Trust Layer, ADR-0010)
+
+Your spawn must be stamped into the per-task `nubosloop.tool_use_audit` log via `loop-audit-tool-use --agent np-critic-acceptance --tool-use-log <json>` after you emit your findings JSON. This is the orchestrator's responsibility, not yours — but if you observe (in the verify output or task summary) that a prior round's critic-schwarm completed without an audit stamp, surface that as a finding of category `locked-decision-violation` because it indicates a bypass of ADR-0010 Layer C. The post-critics gate (`loop-run-round --phase post-critics`) refuses without the three critic stamps; missing your stamp blocks the entire round.
+
 ## Inputs
 
 The orchestrator provides these paths in your prompt context. Read every path it hands you via `Read` — do not guess.
@@ -25,6 +25,10 @@ This agent operates under [`templates/COMPLETENESS.md`](../templates/COMPLETENES
 
 Refusal of any rule is a hard-stop. Surface the violation to the orchestrator verbatim and abort the spawn.
 
+## Spawn-Evidence Audit (Trust Layer, ADR-0010)
+
+Your spawn must be stamped into the per-task `nubosloop.tool_use_audit` log via `loop-audit-tool-use --agent np-critic-style --tool-use-log <json>` after you emit your findings JSON. The post-critics gate refuses without the three critic stamps; missing your stamp blocks the entire round. Synthesizing a fake findings JSON without spawning your sibling critics is a Layer-C violation and the orchestrator must NOT do it.
+
 ## Inputs
 
 The orchestrator provides these paths in your prompt context. Read every path it hands you via `Read` — do not guess.
@@ -25,6 +25,10 @@ This agent operates under [`templates/COMPLETENESS.md`](../templates/COMPLETENES
 
 Refusal of any rule is a hard-stop. Surface the violation to the orchestrator verbatim and abort the spawn.
 
+## Spawn-Evidence Audit (Trust Layer, ADR-0010)
+
+Your spawn must be stamped into the per-task `nubosloop.tool_use_audit` log via `loop-audit-tool-use --agent np-critic-tests --tool-use-log <json>` after you emit your findings JSON. The post-critics gate refuses without the three critic stamps; missing your stamp blocks the entire round. Synthesizing a fake findings JSON without spawning your sibling critics is a Layer-C violation and the orchestrator must NOT do it.
+
 ## Inputs
 
 The orchestrator provides these paths in your prompt context. Read every path it hands you via `Read` — do not guess.
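
A minimal sketch of the three-stamp rule that the three prompt sections above describe, assuming a hypothetical in-memory audit log of `{ round, agent, tools }` entries. The real `nubosloop.tool_use_audit` layout and the gate implementation are not shown in these hunks, so `REQUIRED_CRITICS` and `missingCriticStamps` are illustrative names only; the agent names come from the prompts themselves.

```javascript
// Hypothetical stand-in for the per-task audit log; the real layout is an assumption.
const REQUIRED_CRITICS = ['np-critic-style', 'np-critic-tests', 'np-critic-acceptance'];

// Return the critics that have no stamp for the given round, in REQUIRED_CRITICS order.
function missingCriticStamps(auditLog, round) {
  const stamped = new Set(
    auditLog.filter((entry) => entry.round === round).map((entry) => entry.agent),
  );
  return REQUIRED_CRITICS.filter((agent) => !stamped.has(agent));
}

const auditLog = [
  { round: 1, agent: 'np-executor', tools: ['search-knowledge'] },
  { round: 1, agent: 'np-critic-style', tools: [] },
  { round: 1, agent: 'np-critic-tests', tools: [] },
];

console.log(missingCriticStamps(auditLog, 1)); // [ 'np-critic-acceptance' ]
```

A gate built this way would refuse the round while the returned list is non-empty, which matches the "missing your stamp blocks the entire round" wording in the prompts.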
@@ -11,27 +11,53 @@ const { deleteCheckpoint, readCheckpoint } = require('../../lib/checkpoint.cjs')
 
 const BYPASS_FLAG = '--bypass-nubosloop';
 
+// Evidence-based gate: a complete Nubosloop run accumulates fields on the
+// checkpoint envelope (cache_hit from preflight, verify_exit_code from
+// post-executor, findings from post-critics, committed_at from commit). A
+// gamed run that only invokes `loop-run-round --phase commit` directly leaves
+// verify_exit_code and findings undefined. Checking last_phase alone is not
+// enough — we require the cumulative signature.
 function _assertLoopGate(taskId, cwd, bypass, stderr) {
   const cp = readCheckpoint(taskId, cwd);
-  const last = cp && cp.nubosloop && cp.nubosloop.last_phase;
-  if (last === 'commit') return { bypassed: false, last_phase: last };
-  const reason = !cp ? 'no-checkpoint' : 'last-phase-mismatch';
-  const observed = last || (cp ? 'none' : 'no-checkpoint');
+  const np = (cp && cp.nubosloop) || null;
+  const last = np && np.last_phase;
+  const checks = [
+    { ok: !!cp, reason: 'no-checkpoint', missing: 'checkpoint', observed: 'no-checkpoint' },
+    { ok: last === 'commit', reason: 'last-phase-mismatch', missing: 'last_phase=commit', observed: last || 'none' },
+    { ok: np && np.verify_exit_code === 0, reason: 'post-executor-not-green', missing: 'verify_exit_code=0', observed: np && np.verify_exit_code !== undefined ? String(np.verify_exit_code) : 'undefined' },
+    { ok: np && Array.isArray(np.findings), reason: 'post-critics-missing', missing: 'findings (array)', observed: np && np.findings !== undefined ? JSON.stringify(np.findings).slice(0, 60) : 'undefined' },
+    { ok: np && !!np.committed_at, reason: 'commit-phase-not-stamped', missing: 'committed_at', observed: (np && np.committed_at) || 'undefined' },
+  ];
+  const failed = checks.find((c) => !c.ok);
+  if (!failed) {
+    return { bypassed: false, last_phase: last, forced_commit_phase: !!(np && np.forced_commit_phase) };
+  }
   if (bypass) {
     stderr.write(
       '[nubos-pilot] WARNING: commit-task ' + taskId +
-      ' bypassing Nubosloop gate (' + BYPASS_FLAG + '; observed=' + observed +
+      ' bypassing Nubosloop gate (' + BYPASS_FLAG +
+      '; reason=' + failed.reason + '; missing=' + failed.missing +
+      '; observed=' + failed.observed +
       '). Single-pass commit, no critic review enforced.\n',
     );
-    return { bypassed: true, last_phase: last || null };
+    return { bypassed: true, last_phase: last || null, forced_commit_phase: !!(np && np.forced_commit_phase) };
   }
   throw new NubosPilotError(
     'commit-task-loop-bypass-violation',
-    'commit-task refused: Nubosloop did not reach phase=commit for ' + taskId +
-    ' (observed nubosloop.last_phase=' + observed + '). ' +
-    'Run `loop-run-round ' + taskId + ' --phase commit` first, or pass ' + BYPASS_FLAG +
+    'commit-task refused: Nubosloop sequence incomplete for ' + taskId +
+    ' (reason=' + failed.reason + '; missing=' + failed.missing +
+    '; observed=' + failed.observed + '). ' +
+    'Run the full loop (preflight → post-executor verify-green → post-critics → commit) first, or pass ' + BYPASS_FLAG +
     ' for an explicit single-pass override.',
-    { taskId, reason, last_phase: last || null },
+    {
+      taskId,
+      reason: failed.reason,
+      missing: failed.missing,
+      observed_last_phase: last || null,
+      observed_verify_exit_code: np && np.verify_exit_code !== undefined ? np.verify_exit_code : null,
+      observed_findings_is_array: !!(np && Array.isArray(np.findings)),
+      observed_committed_at: (np && np.committed_at) || null,
+    },
   );
 }
 
@@ -141,6 +167,7 @@ function run(args, ctx) {
     files: safeFiles,
     files_source: filesSource,
     nubosloop_bypassed: gate.bypassed,
+    nubosloop_forced_commit_phase: !!gate.forced_commit_phase,
   };
   stdout.write(JSON.stringify(payload));
   return payload;
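
The ordered-checks pattern used by `_assertLoopGate` in the hunk above can be sketched in isolation. This is a simplified illustration rather than the package's code: `firstMissingEvidence` is a hypothetical name, the checkpoint is passed in directly instead of being read from disk, and only the `reason` strings are taken from the diff.

```javascript
// Ordered evidence checks: the first failing entry names the earliest missing phase.
function firstMissingEvidence(checkpoint) {
  const loop = (checkpoint && checkpoint.nubosloop) || null;
  const checks = [
    { ok: Boolean(checkpoint), reason: 'no-checkpoint' },
    { ok: Boolean(loop) && loop.last_phase === 'commit', reason: 'last-phase-mismatch' },
    { ok: Boolean(loop) && loop.verify_exit_code === 0, reason: 'post-executor-not-green' },
    { ok: Boolean(loop) && Array.isArray(loop.findings), reason: 'post-critics-missing' },
    { ok: Boolean(loop) && Boolean(loop.committed_at), reason: 'commit-phase-not-stamped' },
  ];
  const failed = checks.find((c) => !c.ok);
  return failed ? failed.reason : null;
}

// A "gamed" envelope: only the commit phase was stamped.
const gamed = { nubosloop: { last_phase: 'commit', committed_at: '2026-05-04T12:00:00Z' } };
// A complete envelope: every phase left its marker.
const complete = {
  nubosloop: { last_phase: 'commit', verify_exit_code: 0, findings: [], committed_at: '2026-05-04T12:00:00Z' },
};

console.log(firstMissingEvidence(gamed));    // post-executor-not-green
console.log(firstMissingEvidence(complete)); // null
console.log(firstMissingEvidence(null));     // no-checkpoint
```

Because the list is ordered by phase, the reported reason always points at the earliest gap, which is what makes the error message and the bypass warning actionable.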
@@ -82,9 +82,10 @@ function _capture() {
   return { stub, get: () => buf };
 }
 
-// Seed a checkpoint that satisfies the Nubosloop gate (last_phase=commit) so
-// commit-task accepts it. Tests that exercise the gate explicitly bypass this
-// helper. Optional `extra` overrides any field on the envelope.
+// Seed a checkpoint that satisfies the full Nubosloop gate (sequence-integrity).
+// A real loop accumulates evidence on the envelope; the gate refuses unless
+// every required marker is present. Tests that exercise game-paths build their
+// own partial fixtures.
 function seedLoopReadyCheckpoint(root, taskId, extra) {
   const cpPath = path.join(root, '.nubos-pilot', 'checkpoints', taskId + '.json');
   fs.mkdirSync(path.dirname(cpPath), { recursive: true });
@@ -94,12 +95,21 @@ function seedLoopReadyCheckpoint(root, taskId, extra) {
     status: 'pre-commit',
     files_touched: [],
     nubosloop: {
+      round: 1,
+      cache_hit: false,
       last_phase: 'commit',
       last_action: 'commit',
+      verify_exit_code: 0,
+      findings: [],
       committed_at: '2026-05-04T12:00:00Z',
     },
   };
-  fs.writeFileSync(cpPath, JSON.stringify(Object.assign(base, extra || {})), 'utf-8');
+  // Allow the test to override individual fields (incl. nubosloop sub-fields).
+  const merged = Object.assign({}, base, extra || {});
+  if (extra && extra.nubosloop) {
+    merged.nubosloop = Object.assign({}, base.nubosloop, extra.nubosloop);
+  }
+  fs.writeFileSync(cpPath, JSON.stringify(merged), 'utf-8');
   return cpPath;
 }
 
@@ -249,10 +259,112 @@ test('CT-9: refuse commit when nubosloop.last_phase ≠ commit', () => {
     () => subcmd.run(['M006-S001-T0021'], { cwd: root, stdout: cap.stub, stderr: stderr.stub }),
     (err) => err && err.code === 'commit-task-loop-bypass-violation'
       && err.details && err.details.reason === 'last-phase-mismatch'
-      && err.details.last_phase === 'verifying',
+      && err.details.observed_last_phase === 'verifying',
+  );
+});
+
+test('CT-12: refuse gamed commit (last_phase=commit but no verify_exit_code)', () => {
+  const root = makeRepo();
+  seedPlanAndTask(root, '06-01', 'M006-S001-T0030', ['src/g.ts']);
+  fs.mkdirSync(path.join(root, 'src'), { recursive: true });
+  fs.writeFileSync(path.join(root, 'src', 'g.ts'), 'export const g = 7;\n', 'utf-8');
+  // Simulates an agent that ran ONLY `loop-run-round --phase commit` to game
+  // the gate, without going through preflight/post-executor/post-critics.
+  // verify_exit_code is undefined → post-executor never ran.
+  const cpPath = path.join(root, '.nubos-pilot', 'checkpoints', 'M006-S001-T0030.json');
+  fs.mkdirSync(path.dirname(cpPath), { recursive: true });
+  fs.writeFileSync(cpPath, JSON.stringify({
+    schema_version: 1,
+    task_id: 'M006-S001-T0030',
+    status: 'pre-commit',
+    files_touched: [],
+    nubosloop: { last_phase: 'commit', last_action: 'commit', committed_at: '2026-05-04T12:00:00Z' },
+  }), 'utf-8');
+  const cap = _capture();
+  const stderr = _capture();
+  assert.throws(
+    () => subcmd.run(['M006-S001-T0030'], { cwd: root, stdout: cap.stub, stderr: stderr.stub }),
+    (err) => err && err.code === 'commit-task-loop-bypass-violation'
+      && err.details && err.details.reason === 'post-executor-not-green',
+  );
+});
+
+test('CT-13: refuse gamed commit when verify ran but post-critics findings missing', () => {
+  const root = makeRepo();
+  seedPlanAndTask(root, '06-01', 'M006-S001-T0031', ['src/h.ts']);
+  fs.mkdirSync(path.join(root, 'src'), { recursive: true });
+  fs.writeFileSync(path.join(root, 'src', 'h.ts'), 'export const h = 8;\n', 'utf-8');
+  // verify ran (exit_code=0) but critics never produced findings — agent
+  // skipped the critic-schwarm step.
+  const cpPath = path.join(root, '.nubos-pilot', 'checkpoints', 'M006-S001-T0031.json');
+  fs.mkdirSync(path.dirname(cpPath), { recursive: true });
+  fs.writeFileSync(cpPath, JSON.stringify({
+    schema_version: 1,
+    task_id: 'M006-S001-T0031',
+    status: 'pre-commit',
+    files_touched: [],
+    nubosloop: {
+      last_phase: 'commit', last_action: 'commit',
+      verify_exit_code: 0, // post-executor ran
+      committed_at: '2026-05-04T12:00:00Z',
+      // findings: missing → post-critics never ran
+    },
+  }), 'utf-8');
+  const cap = _capture();
+  const stderr = _capture();
+  assert.throws(
+    () => subcmd.run(['M006-S001-T0031'], { cwd: root, stdout: cap.stub, stderr: stderr.stub }),
+    (err) => err && err.code === 'commit-task-loop-bypass-violation'
+      && err.details && err.details.reason === 'post-critics-missing',
+  );
+});
+
+test('CT-14: refuse when verify-red was recorded (post-executor failed)', () => {
+  const root = makeRepo();
+  seedPlanAndTask(root, '06-01', 'M006-S001-T0032', ['src/i.ts']);
+  fs.mkdirSync(path.join(root, 'src'), { recursive: true });
+  fs.writeFileSync(path.join(root, 'src', 'i.ts'), 'export const i = 9;\n', 'utf-8');
+  // Loop reached commit-stamp somehow but verify was red — must refuse.
+  seedLoopReadyCheckpoint(root, 'M006-S001-T0032', {
+    nubosloop: { verify_exit_code: 1 },
+  });
+  const cap = _capture();
+  const stderr = _capture();
+  assert.throws(
+    () => subcmd.run(['M006-S001-T0032'], { cwd: root, stdout: cap.stub, stderr: stderr.stub }),
+    (err) => err && err.code === 'commit-task-loop-bypass-violation'
+      && err.details && err.details.reason === 'post-executor-not-green'
+      && err.details.observed_verify_exit_code === 1,
   );
 });
 
+test('CT-15: bypass on gamed commit logs precise reason in stderr', () => {
+  const root = makeRepo();
+  seedPlanAndTask(root, '06-01', 'M006-S001-T0033', ['src/j.ts']);
+  fs.mkdirSync(path.join(root, 'src'), { recursive: true });
+  fs.writeFileSync(path.join(root, 'src', 'j.ts'), 'export const j = 10;\n', 'utf-8');
+  const cpPath = path.join(root, '.nubos-pilot', 'checkpoints', 'M006-S001-T0033.json');
+  fs.mkdirSync(path.dirname(cpPath), { recursive: true });
+  fs.writeFileSync(cpPath, JSON.stringify({
+    schema_version: 1, task_id: 'M006-S001-T0033', status: 'pre-commit', files_touched: [],
+    nubosloop: { last_phase: 'commit', last_action: 'commit', committed_at: 'z' },
+  }), 'utf-8');
+  const prev = process.cwd();
+  process.chdir(root);
+  const cap = _capture();
+  const stderr = _capture();
+  try {
+    subcmd.run(['M006-S001-T0033', '--bypass-nubosloop'], { cwd: root, stdout: cap.stub, stderr: stderr.stub });
+  } finally {
+    process.chdir(prev);
+  }
+  const payload = JSON.parse(cap.get());
+  assert.equal(payload.ok, true);
+  assert.equal(payload.nubosloop_bypassed, true);
+  assert.match(stderr.get(), /reason=post-executor-not-green/);
+  assert.match(stderr.get(), /missing=verify_exit_code=0/);
+});
+
 test('CT-10: --bypass-nubosloop allows single-pass commit and warns on stderr', () => {
   const root = makeRepo();
   seedPlanAndTask(root, '06-01', 'M006-S001-T0022', ['src/e.ts']);
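
CT-14 above overrides only `nubosloop.verify_exit_code` and still expects the other seeded markers to survive, which is why `seedLoopReadyCheckpoint` now merges the `nubosloop` key one level deep. A small self-contained sketch of the difference, using a trimmed-down `base` object that is illustrative rather than the helper's full fixture:

```javascript
const base = {
  status: 'pre-commit',
  nubosloop: { last_phase: 'commit', verify_exit_code: 0, findings: [] },
};
const extra = { nubosloop: { verify_exit_code: 1 } };

// Shallow merge: extra.nubosloop replaces the whole nested object,
// so last_phase and findings are lost.
const shallow = Object.assign({}, base, extra);

// One-level-deep merge for the nubosloop key, mirroring the updated helper.
const merged = Object.assign({}, base, extra);
if (extra && extra.nubosloop) {
  merged.nubosloop = Object.assign({}, base.nubosloop, extra.nubosloop);
}

console.log(Object.keys(shallow.nubosloop)); // [ 'verify_exit_code' ]
console.log(merged.nubosloop.last_phase, merged.nubosloop.verify_exit_code); // commit 1
```

With the shallow form, a test that sets one sub-field would accidentally trip `last-phase-mismatch` instead of the check it means to exercise.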
@@ -31,18 +31,35 @@ function run(argv, ctx) {
       { hint: 'agents requiring search tools: ' + nubosloop.AUDITED_AGENTS.join(', ') },
     );
   }
-  const log = args.getJsonFlag(
-    tail,
-    '--tool-use-log',
-    'loop-audit-missing-log',
-    "JSON array of tool-name strings, e.g. '[\"Read\",\"search-knowledge\",\"Edit\"]'",
-  );
-  if (!Array.isArray(log)) {
+  // --tool-use-log is required for AUDITED_AGENTS (Rule 9 enforcement reads
+  // the tool list to verify search-knowledge / match-existing-learning calls).
+  // For non-audited spawns (critics, plan-checker, etc.) the orchestrator may
+  // omit it — we still record the spawn for Layer-C audit-trail evidence with
+  // an empty log. Explicit empty-array is also accepted.
+  const isAuditedAgent = nubosloop.AUDITED_AGENTS.includes(agent);
+  let log;
+  if (tail.includes('--tool-use-log')) {
+    log = args.getJsonFlag(
+      tail,
+      '--tool-use-log',
+      'loop-audit-missing-log',
+      "JSON array of tool-name strings, e.g. '[\"Read\",\"search-knowledge\",\"Edit\"]'",
+    );
+    if (!Array.isArray(log)) {
+      throw new (require('../../lib/core.cjs').NubosPilotError)(
+        'loop-audit-invalid-log',
+        '--tool-use-log must be a JSON array',
+        { got: typeof log },
+      );
+    }
+  } else if (isAuditedAgent) {
     throw new (require('../../lib/core.cjs').NubosPilotError)(
-      'loop-audit-invalid-log',
-      '--tool-use-log must be a JSON array',
-      { got: typeof log },
+      'loop-audit-missing-log',
+      'loop-audit-tool-use requires --tool-use-log for audited agent: ' + agent,
+      { hint: 'audited agents drive Rule 9 enforcement; pass --tool-use-log \'[]\' to record an empty spawn' },
     );
+  } else {
+    log = [];
   }
   const result = nubosloop.auditToolUse(taskId, agent, log, cwd);
   const payload = { task_id: taskId, ...result };
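
The three-way branch in the hunk above (flag present, flag absent for an audited agent, flag absent otherwise) can be sketched without the package's `args` helper. Everything here is a simplified stand-in: `resolveToolUseLog` is a hypothetical name, plain `Error` objects replace `NubosPilotError`, and the contents of `AUDITED_AGENTS` are an assumption taken from a test comment elsewhere in this diff, not from `lib/nubosloop.cjs`.

```javascript
// Assumption: audited set inferred from a test comment (np-executor / np-build-fixer).
const AUDITED_AGENTS = ['np-executor', 'np-build-fixer'];

function resolveToolUseLog(agent, argv) {
  const flagIndex = argv.indexOf('--tool-use-log');
  if (flagIndex !== -1) {
    // JSON.parse throws on a malformed or missing value.
    const log = JSON.parse(argv[flagIndex + 1]);
    if (!Array.isArray(log)) throw new Error('loop-audit-invalid-log');
    return log;
  }
  // No flag: audited agents must supply one, everyone else gets an empty stamp.
  if (AUDITED_AGENTS.includes(agent)) throw new Error('loop-audit-missing-log');
  return [];
}

console.log(resolveToolUseLog('np-critic-style', [])); // []
console.log(resolveToolUseLog('np-executor', ['--tool-use-log', '["Read","search-knowledge"]']));
// [ 'Read', 'search-knowledge' ]
```

Returning an empty array for non-audited agents keeps the spawn visible to the Layer-C gate while leaving Rule 9 enforcement to the audited agents only.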
@@ -349,9 +349,25 @@ test('LCLI-RR-2: loop-run-round preflight on populated store → spawn-executor-
   assert.ok(out.cache_hit);
 });
 
+// Helper: seed the per-round spawn-evidence audit log so Layer-C gates accept
+// post-executor / post-critics. Tests that exercise the gate explicitly
+// (LCLI-RR-12+) build their own partial fixtures.
+function _seedSpawnEvidence(taskId, round, agents, cwd) {
+  const nubosloop = require('../../lib/nubosloop.cjs');
+  nubosloop.recordLoopState(taskId, { round }, cwd);
+  for (const a of agents) {
+    // Pass an empty tool-use log — these are evidence stamps, not Rule 9 audits.
+    // For AUDITED_AGENTS in this test (np-executor / np-build-fixer) we need to
+    // pass a valid search-tool to avoid generating a rule-9-violation finding.
+    const log = nubosloop.AUDITED_AGENTS.includes(a) ? ['search-knowledge'] : [];
+    nubosloop.auditToolUse(taskId, a, log, cwd);
+  }
+}
+
 test('LCLI-RR-3: loop-run-round phase=post-executor with verify-green → spawn-critic-schwarm', () => {
   const r = _mkRoot();
   checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  _seedSpawnEvidence('M001-S001-T0001', 1, ['np-executor'], r);
   const cap = _cap();
   const loopRunRound = require('./loop-run-round.cjs');
   loopRunRound.run(
@@ -366,6 +382,7 @@ test('LCLI-RR-3: loop-run-round phase=post-executor with verify-green → spawn-
 test('LCLI-RR-4: loop-run-round phase=post-executor with verify-red → spawn-build-fixer', () => {
   const r = _mkRoot();
   checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  _seedSpawnEvidence('M001-S001-T0001', 1, ['np-executor'], r);
   const cap = _cap();
   const loopRunRound = require('./loop-run-round.cjs');
   loopRunRound.run(
@@ -380,6 +397,8 @@ test('LCLI-RR-4: loop-run-round phase=post-executor with verify-red → spawn-bu
 test('LCLI-RR-5: loop-run-round phase=post-critics with zero findings → commit', () => {
   const r = _mkRoot();
   checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  _seedSpawnEvidence('M001-S001-T0001', 1,
+    ['np-executor', 'np-critic-style', 'np-critic-tests', 'np-critic-acceptance'], r);
   const cap = _cap();
   const loopRunRound = require('./loop-run-round.cjs');
   loopRunRound.run(
@@ -399,6 +418,10 @@ test('LCLI-RR-5b: post-critics surfaces rule-9-violation from audit log even wit
   // Round 1, executor shipped without searching → audit captures violation
   nubosloop.recordLoopState('M001-S001-T0001', { round: 1 }, r);
   nubosloop.auditToolUse('M001-S001-T0001', 'np-executor', ['Read', 'Edit'], r);
+  // Seed the three critic spawn evidences so the Layer-C gate is satisfied —
+  // we want the rule-9-violation to surface from the audit log, not the gate.
+  _seedSpawnEvidence('M001-S001-T0001', 1,
+    ['np-critic-style', 'np-critic-tests', 'np-critic-acceptance'], r);
   // Critics return zero findings (style/tests/acceptance all clean) — without
   // the Rule 9 chain the loop would commit. With it, the audit violation must
   // still route the round to executor.
@@ -428,6 +451,9 @@ test('LCLI-RR-5c: post-critics scopes audit findings to current round only', ()
   nubosloop.auditToolUse('M001-S001-T0001', 'np-executor', ['Read'], r);
   nubosloop.recordLoopState('M001-S001-T0001', { round: 2 }, r);
   nubosloop.auditToolUse('M001-S001-T0001', 'np-build-fixer', ['search-knowledge'], r);
+  // Seed critic-spawn evidence for round 2 so the Layer-C gate is satisfied.
+  _seedSpawnEvidence('M001-S001-T0001', 2,
+    ['np-critic-style', 'np-critic-tests', 'np-critic-acceptance'], r);
   const cap = _cap();
   const loopRunRound = require('./loop-run-round.cjs');
   loopRunRound.run(
@@ -467,6 +493,246 @@ test('LCLI-RR-7: loop-run-round rejects unknown --phase', () => {
   );
 });
 
+test('LCLI-RR-8: phase=commit refuses without verify_exit_code (post-executor never ran)', () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  const loopRunRound = require('./loop-run-round.cjs');
+  assert.throws(
+    () => loopRunRound.run(
+      ['M001-S001-T0001', '--phase', 'commit'],
+      { cwd: r, stdout: _cap().stub },
+    ),
+    (err) => err && err.code === 'loop-commit-precondition-missing'
+      && err.details && err.details.missing === 'verify_exit_code',
+  );
+});
+
+test('LCLI-RR-9: phase=commit refuses without findings (post-critics never ran)', () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  const nubosloop = require('../../lib/nubosloop.cjs');
+  // post-executor ran (verify-green) but critics never produced findings.
+  nubosloop.recordLoopState('M001-S001-T0001', { round: 1 }, r);
+  checkpoint.mergeCheckpoint('M001-S001-T0001', (cur) => ({
+    nubosloop: Object.assign({}, (cur && cur.nubosloop) || {}, { verify_exit_code: 0 }),
+  }), r);
+  const loopRunRound = require('./loop-run-round.cjs');
+  assert.throws(
+    () => loopRunRound.run(
+      ['M001-S001-T0001', '--phase', 'commit'],
+      { cwd: r, stdout: _cap().stub },
+    ),
+    (err) => err && err.code === 'loop-commit-precondition-missing'
+      && err.details && err.details.missing === 'findings',
+  );
+});
+
+test('LCLI-RR-10: phase=commit accepts complete loop state (verify-green + findings array)', () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  const nubosloop = require('../../lib/nubosloop.cjs');
+  nubosloop.recordLoopState('M001-S001-T0001', { round: 1 }, r);
+  checkpoint.mergeCheckpoint('M001-S001-T0001', (cur) => ({
+    nubosloop: Object.assign({}, (cur && cur.nubosloop) || {}, {
+      verify_exit_code: 0,
+      findings: [],
+    }),
+  }), r);
+  const cap = _cap();
+  const loopRunRound = require('./loop-run-round.cjs');
+  loopRunRound.run(['M001-S001-T0001', '--phase', 'commit'], { cwd: r, stdout: cap.stub });
+  const out = JSON.parse(cap.get());
+  assert.equal(out.next_action, 'commit-task');
+  assert.equal(out.forced, false);
+  const cp = checkpoint.readCheckpoint('M001-S001-T0001', r);
+  assert.equal(cp.nubosloop.last_phase, 'commit');
+  assert.equal(cp.nubosloop.forced_commit_phase, false);
+});
+
+test('LCLI-RR-11: phase=commit --force-commit-phase bypasses preconditions and stamps the override', () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  // Empty checkpoint — no verify, no findings. Force should still allow.
+  const cap = _cap();
+  const loopRunRound = require('./loop-run-round.cjs');
+  loopRunRound.run(
+    ['M001-S001-T0001', '--phase', 'commit', '--force-commit-phase'],
+    { cwd: r, stdout: cap.stub },
+  );
+  const out = JSON.parse(cap.get());
+  assert.equal(out.next_action, 'commit-task');
+  assert.equal(out.forced, true);
+  const cp = checkpoint.readCheckpoint('M001-S001-T0001', r);
+  assert.equal(cp.nubosloop.forced_commit_phase, true);
+});
+
+// Layer C — audit-trail evidence enforcement -------------------------------
+
+test('LCLI-RR-12: post-executor refuses without np-executor audit (R1)', () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  // Round defaults to 1 with no audit entries.
+  const loopRunRound = require('./loop-run-round.cjs');
+  assert.throws(
+    () => loopRunRound.run(
+      ['M001-S001-T0001', '--phase', 'post-executor', '--verify-exit-code', '0'],
+      { cwd: r, stdout: _cap().stub },
+    ),
+    (err) => err && err.code === 'loop-post-executor-missing-spawn-audit'
+      && Array.isArray(err.details && err.details.missing)
+      && err.details.missing.includes('np-executor')
+      && err.details.round === 1,
+  );
+});
+
+test('LCLI-RR-13: post-executor refuses on R1 if only np-build-fixer was audited (wrong agent)', () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  _seedSpawnEvidence('M001-S001-T0001', 1, ['np-build-fixer'], r);
+  const loopRunRound = require('./loop-run-round.cjs');
+  assert.throws(
+    () => loopRunRound.run(
+      ['M001-S001-T0001', '--phase', 'post-executor', '--verify-exit-code', '0'],
+      { cwd: r, stdout: _cap().stub },
+    ),
+    (err) => err && err.code === 'loop-post-executor-missing-spawn-audit'
+      && err.details.missing.includes('np-executor'),
+  );
+});
+
+test('LCLI-RR-14: post-executor on R≥2 requires np-build-fixer audit, not np-executor', () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  // Advance to round 2; audit only the wrong agent (np-executor).
+  const nubosloop = require('../../lib/nubosloop.cjs');
+  nubosloop.recordLoopState('M001-S001-T0001', { round: 2 }, r);
+  nubosloop.auditToolUse('M001-S001-T0001', 'np-executor', ['search-knowledge'], r);
+  const loopRunRound = require('./loop-run-round.cjs');
+  assert.throws(
+    () => loopRunRound.run(
+      ['M001-S001-T0001', '--phase', 'post-executor', '--verify-exit-code', '0'],
+      { cwd: r, stdout: _cap().stub },
+    ),
+    (err) => err && err.code === 'loop-post-executor-missing-spawn-audit'
+      && err.details.missing.includes('np-build-fixer')
+      && err.details.round === 2,
+  );
+});
+
+test('LCLI-RR-15: post-critics refuses without any critic audit (synthetic-JSON bypass)', () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  _seedSpawnEvidence('M001-S001-T0001', 1, ['np-executor'], r);
+  // No critic-spawn audit → gate must refuse even if --critic-outputs is valid.
+  const loopRunRound = require('./loop-run-round.cjs');
+  assert.throws(
+    () => loopRunRound.run(
+      ['M001-S001-T0001', '--phase', 'post-critics', '--critic-outputs',
+        '[{"critic":"style","findings":[]},{"critic":"tests","findings":[]},{"critic":"acceptance","findings":[],"criteria":[]}]'],
+      { cwd: r, stdout: _cap().stub },
+    ),
+    (err) => err && err.code === 'loop-post-critics-missing-critic-audit'
+      && Array.isArray(err.details.missing)
+      && err.details.missing.length === 3,
+  );
+});
+
+test('LCLI-RR-16: post-critics refuses with only 2 of 3 critic audits (partial bypass)', () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  _seedSpawnEvidence('M001-S001-T0001', 1,
+    ['np-executor', 'np-critic-style', 'np-critic-tests'], r); // missing acceptance
+  const loopRunRound = require('./loop-run-round.cjs');
+  assert.throws(
+    () => loopRunRound.run(
+      ['M001-S001-T0001', '--phase', 'post-critics', '--critic-outputs',
+        '[{"critic":"style","findings":[]},{"critic":"tests","findings":[]},{"critic":"acceptance","findings":[],"criteria":[]}]'],
+      { cwd: r, stdout: _cap().stub },
+    ),
+    (err) => err && err.code === 'loop-post-critics-missing-critic-audit'
+      && err.details.missing.length === 1
+      && err.details.missing[0] === 'np-critic-acceptance',
+  );
+});
+
+test('LCLI-RR-17: --force-post-executor bypasses Layer-C gate', () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  // No audit entries; force flag must let us through.
+  const cap = _cap();
+  const loopRunRound = require('./loop-run-round.cjs');
+  loopRunRound.run(
+    ['M001-S001-T0001', '--phase', 'post-executor', '--verify-exit-code', '0', '--force-post-executor'],
+    { cwd: r, stdout: cap.stub },
+  );
+  const out = JSON.parse(cap.get());
+  assert.equal(out.next_action, 'spawn-critic-schwarm');
+});
+
+test('LCLI-RR-18: --force-post-critics bypasses Layer-C gate', () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  _seedSpawnEvidence('M001-S001-T0001', 1, ['np-executor'], r); // executor audited, critics not
+  const cap = _cap();
+  const loopRunRound = require('./loop-run-round.cjs');
+  loopRunRound.run(
+    ['M001-S001-T0001', '--phase', 'post-critics', '--critic-outputs',
+      '[{"critic":"style","findings":[]},{"critic":"tests","findings":[]},{"critic":"acceptance","findings":[],"criteria":[]}]',
+      '--force-post-critics'],
+    { cwd: r, stdout: cap.stub },
+  );
+  const out = JSON.parse(cap.get());
+  assert.equal(out.next_action, 'commit');
+});
+
+test('LCLI-RR-19: assertSpawnsAuditedForRound returns ordered missing list', () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  const nubosloop = require('../../lib/nubosloop.cjs');
+  nubosloop.recordLoopState('M001-S001-T0001', { round: 1 }, r);
+  nubosloop.auditToolUse('M001-S001-T0001', 'np-critic-style', [], r);
+  const v = nubosloop.assertSpawnsAuditedForRound(
+    'M001-S001-T0001', nubosloop.POST_CRITICS_EVIDENCE, 1, r,
+  );
+  assert.equal(v.satisfied, false);
+  assert.deepEqual(v.missing, ['np-critic-tests', 'np-critic-acceptance']);
+});
+
+test('LCLI-RR-20: findSpawnAuditForRound is round-scoped (round-1 audit not visible from round-2)', () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  const nubosloop = require('../../lib/nubosloop.cjs');
+  nubosloop.recordLoopState('M001-S001-T0001', { round: 1 }, r);
+  nubosloop.auditToolUse('M001-S001-T0001', 'np-critic-style', [], r);
+  assert.ok(nubosloop.findSpawnAuditForRound('M001-S001-T0001', 'np-critic-style', 1, r));
+  assert.equal(nubosloop.findSpawnAuditForRound('M001-S001-T0001', 'np-critic-style', 2, r), null);
+});
+
+test('LCLI-RR-21: loop-audit-tool-use accepts critics without --tool-use-log (records empty spawn)', () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  const nubosloop = require('../../lib/nubosloop.cjs');
+  nubosloop.recordLoopState('M001-S001-T0001', { round: 1 }, r);
+  const loopAudit = require('./loop-audit-tool-use.cjs');
+  const cap = _cap();
+  loopAudit.run(['M001-S001-T0001', '--agent', 'np-critic-style'], { cwd: r, stdout: cap.stub });
+  const out = JSON.parse(cap.get());
+  assert.equal(out.agent, 'np-critic-style');
+  assert.equal(out.violation, null); // critics aren't audited for Rule 9
+  // The audit log must still record the spawn so Layer C can find it.
+  assert.ok(nubosloop.findSpawnAuditForRound('M001-S001-T0001', 'np-critic-style', 1, r));
+});
+
+test('LCLI-RR-22: loop-audit-tool-use still REQUIRES --tool-use-log for AUDITED_AGENTS', () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  const loopAudit = require('./loop-audit-tool-use.cjs');
+  assert.throws(
+    () => loopAudit.run(['M001-S001-T0001', '--agent', 'np-executor'], { cwd: r, stdout: _cap().stub }),
+    (err) => err && err.code === 'loop-audit-missing-log',
+  );
+});
+
 test('LCLI-22: learning-match queries the local store', () => {
   const r = _mkRoot();
   const lr = require('../../lib/learnings.cjs');
@@ -81,6 +81,27 @@ function _runPostExecutor(taskId, list, cwd) {
81
81
  { hint: 'pass the exit code of the task verify command' },
82
82
  );
83
83
  }
84
+ // Layer C: audit-trail enforcement — refuse if no executor spawn was
85
+ // recorded for this round via `loop-audit-tool-use`. This blocks the
86
+ // bypass where an orchestrator stamps verify-green without actually
87
+ // spawning np-executor / np-build-fixer.
88
+ const force = list.includes('--force-post-executor');
89
+ if (!force) {
90
+ const cur = checkpoint.readCheckpoint(taskId, cwd) || {};
91
+ const round = Number((cur.nubosloop && cur.nubosloop.round)) || 1;
92
+ const required = round === 1 ? nubosloop.POST_EXECUTOR_EVIDENCE_R1 : nubosloop.POST_EXECUTOR_EVIDENCE_RN;
93
+ const verdict = nubosloop.assertSpawnsAuditedForRound(taskId, required, round, cwd);
94
+ if (!verdict.satisfied) {
95
+ throw new NubosPilotError(
96
+ 'loop-post-executor-missing-spawn-audit',
97
+ 'phase=post-executor refused: no `loop-audit-tool-use` record found for round=' + round +
98
+ ', agent=' + verdict.missing.join('/') + ' on ' + taskId + '. ' +
99
+ 'Spawn the executor/build-fixer agent and call `loop-audit-tool-use ' + taskId +
100
+ ' --agent <name> --tool-use-log <json>` first, or pass --force-post-executor for an explicit override.',
101
+ { taskId, round, missing: verdict.missing.slice(), required: required.slice() },
102
+ );
103
+ }
104
+ }
84
105
  const code = Number(verifyExitCode);
85
106
  const verifyOutputPath = args.getFlag(list, '--verify-output-path');
86
107
  let verifyOutput = '';
@@ -132,6 +153,27 @@ function _runPostCritics(taskId, list, cwd) {
132
153
  const pb = cp.nubosloop || {};
133
154
  return Number(pb.round) || 1;
134
155
  })();
156
+ // Layer C: audit-trail enforcement — refuse if the three critic spawns
157
+ // (style/tests/acceptance) are not present in the audit log for this round.
158
+ // This blocks the bypass where an orchestrator hand-writes synthetic
159
+ // critic-output JSON without actually spawning the critic agents.
160
+ const force = list.includes('--force-post-critics');
161
+ if (!force) {
162
+ const verdict = nubosloop.assertSpawnsAuditedForRound(
163
+ taskId, nubosloop.POST_CRITICS_EVIDENCE, round, cwd,
164
+ );
165
+ if (!verdict.satisfied) {
166
+ throw new NubosPilotError(
167
+ 'loop-post-critics-missing-critic-audit',
168
+ 'phase=post-critics refused: critic-schwarm spawn-evidence missing for round=' + round +
169
+ ' on ' + taskId + ' (missing audits: ' + verdict.missing.join(', ') + '). ' +
170
+ 'For each critic agent, call `loop-audit-tool-use ' + taskId +
171
+ ' --agent <np-critic-style|np-critic-tests|np-critic-acceptance> --tool-use-log <json>` ' +
172
+ 'after the spawn, then re-run --phase post-critics. Pass --force-post-critics for an explicit override.',
173
+ { taskId, round, missing: verdict.missing.slice(), required: nubosloop.POST_CRITICS_EVIDENCE.slice() },
174
+ );
175
+ }
176
+ }
135
177
  const opts = nubosloop.resolveLoopOpts(cwd);
136
178
  // Rule 9 chain: convert this round's audit violations into rule-9-violation
137
179
  // findings so they participate in routing alongside critic findings.
@@ -164,6 +206,43 @@ function _runPostCritics(taskId, list, cwd) {
164
206
  }
165
207
 
166
208
  function _runCommit(taskId, list, cwd) {
209
+ // Sequence-integrity guard: the commit phase MUST follow a complete loop run.
210
+ // Stamping last_phase='commit' is what unlocks commit-task, so without this
211
+ // check an agent could shell out `loop-run-round --phase commit` directly,
212
+ // skip preflight/executor/critics, and bypass the entire Nubosloop. The
213
+ // commit-task gate then sees a satisfied last_phase and lets the commit
214
+ // through. Defense-in-depth: refuse here AND in commit-task.
215
+ //
216
+ // Required evidence on the checkpoint envelope:
217
+ // - verify_exit_code === 0 → post-executor ran AND verify was green
218
+ // - findings is an array → post-critics ran (empty array = passed)
219
+ //
220
+ // Bypass for legitimate test fixtures / migration: --force-commit-phase.
221
+ const force = list.includes('--force-commit-phase');
222
+ if (!force) {
223
+ const cur = checkpoint.readCheckpoint(taskId, cwd) || {};
224
+ const np = (cur && cur.nubosloop) || {};
225
+ if (np.verify_exit_code !== 0) {
226
+ throw new NubosPilotError(
227
+ 'loop-commit-precondition-missing',
228
+ 'phase=commit refused: post-executor did not record a verify-green run for ' + taskId +
229
+ ' (observed verify_exit_code=' + (np.verify_exit_code === undefined ? 'undefined' : np.verify_exit_code) + '). ' +
230
+ 'Run `loop-run-round ' + taskId + ' --phase post-executor --verify-exit-code 0 --verify-output-path ...` first, ' +
231
+ 'or pass --force-commit-phase for an explicit override.',
232
+ { taskId, missing: 'verify_exit_code', observed: np.verify_exit_code === undefined ? null : np.verify_exit_code },
233
+ );
234
+ }
235
+ if (!Array.isArray(np.findings)) {
236
+ throw new NubosPilotError(
237
+ 'loop-commit-precondition-missing',
238
+ 'phase=commit refused: post-critics did not produce a findings array for ' + taskId +
239
+ ' (observed findings=' + (np.findings === undefined ? 'undefined' : JSON.stringify(np.findings)) + '). ' +
240
+ 'Run `loop-run-round ' + taskId + ' --phase post-critics --critic-outputs <json>` first, ' +
241
+ 'or pass --force-commit-phase for an explicit override.',
242
+ { taskId, missing: 'findings', observed: np.findings === undefined ? null : np.findings },
243
+ );
244
+ }
245
+ }
167
246
  const pattern = args.getFlag(list, '--learning-pattern') || null;
168
247
  const outcome = args.getFlag(list, '--learning-outcome') || 'verified';
169
248
  let logged = null;
@@ -184,6 +263,7 @@ function _runCommit(taskId, list, cwd) {
184
263
  last_phase: 'commit',
185
264
  last_action: 'commit',
186
265
  committed_at: new Date().toISOString(),
266
+ forced_commit_phase: force ? true : (cur && cur.nubosloop && cur.nubosloop.forced_commit_phase) || false,
187
267
  }),
188
268
  }),
189
269
  cwd,
@@ -192,6 +272,7 @@ function _runCommit(taskId, list, cwd) {
192
272
  phase: 'commit',
193
273
  next_action: 'commit-task',
194
274
  learning_logged: logged,
275
+ forced: force,
195
276
  };
196
277
  }
197
278
 
@@ -77,6 +77,39 @@ When `loop.maxRounds` is hit:
77
77
  * Bad, because per-task token cost grows compared to the single-pass model. Accepted — that cost is the price of completeness, and the cache + cap bound it.
78
78
  * Bad, because the orchestrator must coordinate 1 Executor + 3 Critics + occasional Researcher-Schwarm per task. Accepted — that coordination is what makes per-task adversarial review possible.
79
79
 
80
+ ## Trust Layer (amended 2026-05-04)
81
+
82
+ The original spec assumed a cooperative orchestrator: each `loop-run-round --phase X` call was treated as evidence that the corresponding work happened. Multiple production runs proved that assumption wrong. Under user pressure or budget constraints, an orchestrator can rationalize partial loops or fully synthetic loops while still emitting the right CLI calls. Three failure modes were observed in the wild:
83
+
84
+ 1. **Single-pass bypass**: `executor → commit-task` directly, skipping the loop. (Closed by the `commit-task` Layer-A gate, which refuses without `nubosloop.last_phase=commit`.)
85
+ 2. **Stamp bypass**: `loop-run-round --phase commit` invoked directly without prior phases, just to satisfy Layer A. (Closed by the Layer-B precondition in `_runCommit`, which refuses unless the checkpoint carries `verify_exit_code=0` and a `findings` array; an empty array counts as passed.)
86
+ 3. **Synthetic-evidence bypass**: the orchestrator invokes every `loop-run-round` phase but with hand-written `--critic-outputs '[{"critic":"style","findings":[]}, ...]'` JSON, never actually spawning the three critic agents. Layers A and B see a perfectly shaped checkpoint and accept. (Closed by the Layer-C audit-trail gate, this amendment.)
87
+
88
+ ### Layer-C — Spawn-evidence audit-trail
89
+
90
+ Each LLM spawn (executor, build-fixer, three critics) MUST be stamped into the per-task `nubosloop.tool_use_audit` log via `loop-audit-tool-use <task-id> --agent <name> --tool-use-log <json>`. The round number is sourced automatically from `nubosloop.round`, so the caller does not pass it.
91
+
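The stamping step can be pictured with a small in-memory sketch. This is not the package's implementation: the helper names `makeTaskState` and `stampSpawn` and the entry shape (`agent`, `round`, `tools`) are assumptions made for illustration, and a plain array stands in for the `nubosloop.tool_use_audit` log.

```javascript
// Hypothetical in-memory stand-in for the per-task checkpoint state.
// Only `round` and `tool_use_audit` are names taken from the text;
// everything else here is an assumption for the sake of the example.
function makeTaskState() {
  return { round: 1, tool_use_audit: [] };
}

// Record one spawn. The round is read from the state rather than passed
// in, mirroring the "sourced automatically" behaviour described above.
function stampSpawn(state, agent, tools) {
  const entry = { agent, round: state.round, tools: tools.slice() };
  state.tool_use_audit.push(entry);
  return entry;
}

const state = makeTaskState();
stampSpawn(state, 'np-executor', ['Read', 'Edit', 'Bash']);
state.round = 2; // the loop advanced to the next round
stampSpawn(state, 'np-build-fixer', ['Read', 'Edit']);

console.log(state.tool_use_audit.map((e) => e.agent + '@r' + e.round).join(', '));
// prints: np-executor@r1, np-build-fixer@r2
```

Because each entry carries its own round, a later gate can ask "was this agent stamped in this round?" without seeing stamps left over from earlier rounds.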
92
+ Two phase verbs now consult this log before accepting an advance:
93
+
94
+ * **`loop-run-round --phase post-executor`** requires an audit entry for `np-executor` (round 1) or `np-build-fixer` (round ≥ 2) in the current round. Refuses with `loop-post-executor-missing-spawn-audit` otherwise.
95
+ * **`loop-run-round --phase post-critics`** requires audit entries for all three: `np-critic-style`, `np-critic-tests`, `np-critic-acceptance`. Refuses with `loop-post-critics-missing-critic-audit` otherwise.
96
+
97
+ Both phases accept explicit overrides — `--force-post-executor` / `--force-post-critics` — for legitimate test fixtures and migration. The override stamps `forced_*` flags on the checkpoint so dashboards can count them.
98
+
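The two gates and their overrides can be sketched as a single decision function. This is a simplified illustration under stated assumptions, not the code in `loop-run-round.cjs`: `gate` and `missingSpawns` are hypothetical names, `audits` is assumed to be an array of `{ agent, round }` entries, and the `force` argument models the `--force-post-executor` / `--force-post-critics` flags.

```javascript
// Required agents per phase, as listed in the bullets above.
const POST_EXECUTOR_R1 = ['np-executor'];
const POST_EXECUTOR_RN = ['np-build-fixer'];
const POST_CRITICS = ['np-critic-style', 'np-critic-tests', 'np-critic-acceptance'];

// Agents from `required` that have no audit entry for `round`.
function missingSpawns(audits, required, round) {
  return required.filter(
    (agent) => !audits.some((a) => a.agent === agent && a.round === round),
  );
}

// Returns { ok, missing, forced }. A forced call skips the audit lookup.
function gate(phase, audits, round, force) {
  const required =
    phase === 'post-critics'
      ? POST_CRITICS
      : round === 1 ? POST_EXECUTOR_R1 : POST_EXECUTOR_RN;
  if (force) return { ok: true, missing: [], forced: true };
  const missing = missingSpawns(audits, required, round);
  return { ok: missing.length === 0, missing, forced: false };
}

// Round 1: the executor and only one of the three critics were stamped.
const audits = [
  { agent: 'np-executor', round: 1 },
  { agent: 'np-critic-style', round: 1 },
];
const executorVerdict = gate('post-executor', audits, 1, false);
const criticsVerdict = gate('post-critics', audits, 1, false);
console.log(executorVerdict.ok, criticsVerdict.missing.join(', '));
// prints: true np-critic-tests, np-critic-acceptance
```

Returning the `missing` list rather than a bare boolean is what lets the refusal message name exactly which stamps are absent.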
99
+ ### Defense-in-depth summary
100
+
101
+ | Layer | Where | What it proves | Bypass cost |
102
+ |-------|---------------------|---------------------------------------------------------------|----------------------------------------|
103
+ | A | `commit-task.cjs` | The full sequence signature is on the checkpoint | Falsify all five evidence fields |
104
+ | B | `_runCommit` | Verify-green AND a post-critics findings array preceded the commit phase | Pre-write fake `verify_exit_code=0` and `findings: []` to the checkpoint manually |
105
+ | C | `_runPostExecutor` + `_runPostCritics` | Each declared spawn appears in the per-round audit log | Issue extra `loop-audit-tool-use` calls naming agents that didn't actually run |
106
+
107
+ No layer is sufficient alone. Together they require a deliberate, multi-step lie across multiple verbs to bypass — far more deliberate than the "pragmatic compression" rationalizations that produced bypasses 1–3 in production.
108
+
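As an illustration of the Layer-B row, the precondition reduces to two field checks. The sketch below is hedged: `commitPreconditionGap` is a hypothetical helper, and `np` is assumed to be a plain object standing in for the `nubosloop` section of the checkpoint.

```javascript
// Returns null when the commit phase may proceed, otherwise the name of
// the evidence field that is missing or wrong.
function commitPreconditionGap(np) {
  // post-executor must have recorded a green verify run
  if (np.verify_exit_code !== 0) return 'verify_exit_code';
  // post-critics must have produced an array (an empty array means "passed")
  if (!Array.isArray(np.findings)) return 'findings';
  return null;
}

console.log(commitPreconditionGap({}));                                    // verify_exit_code
console.log(commitPreconditionGap({ verify_exit_code: 0 }));               // findings
console.log(commitPreconditionGap({ verify_exit_code: 0, findings: [] })); // null
```

The strict `!== 0` comparison matters here: an absent `verify_exit_code` is `undefined`, which must not be mistaken for a green run.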
109
+ ### What the Trust Layer cannot prove
110
+
111
+ Layer C still cannot prove that the agent named in an audit entry actually ran. The orchestrator could call `loop-audit-tool-use --agent np-critic-style …` without spawning the critic. Closing this gap requires runtime instrumentation: the LLM runtime itself would stamp spawn-provenance metadata into the audit entry, which the orchestrator cannot forge. That is "Stufe 2" (stage 2) and is tracked separately; this amendment closes the practical bypass class without it.
112
+
80
113
  ## More Information
81
114
 
82
115
  * **Related ADR:** [ADR-0001](0001-no-daemon-invariant.md) — the loop runs in-session; no daemon coordinates spawns.
@@ -85,3 +118,4 @@ When `loop.maxRounds` is hit:
85
118
  * **Related ADR:** [ADR-0012](0012-completeness-doctrine.md) — the loop enforces the Completeness Mandate.
86
119
  * **Concept page:** [`v1/concepts/nubosloop.md`](../../knowledge/libraries/nubos-pilot/v1/concepts/nubosloop.md).
87
120
  * **Library:** `lib/nubosloop.cjs`.
121
+ * **Gate code:** `bin/np-tools/commit-task.cjs::_assertLoopGate` (Layer A); `bin/np-tools/loop-run-round.cjs::_runCommit` (Layer B); `bin/np-tools/loop-run-round.cjs::_runPostExecutor` + `_runPostCritics` (Layer C).
package/lib/nubosloop.cjs CHANGED
@@ -341,6 +341,52 @@ const SEARCH_TOOLS = Object.freeze([
341
341
 
342
342
  const AUDITED_AGENTS = Object.freeze(['np-researcher', 'np-executor', 'np-build-fixer']);
343
343
 
344
+ // Spawn-evidence agent groups (ADR-0010 Layer-C audit-trail enforcement).
345
+ // These lists are NOT about Rule 9 (which AUDITED_AGENTS gates) — they declare
346
+ // which spawns MUST appear in the per-round tool-use audit log before the
347
+ // orchestrator can advance loop-run-round through `post-executor`/`post-critics`.
348
+ // An entry in tool_use_audit with matching agent+round is the only evidence
349
+ // the gate accepts that the spawn actually happened.
350
+ const POST_EXECUTOR_EVIDENCE_R1 = Object.freeze(['np-executor']);
351
+ const POST_EXECUTOR_EVIDENCE_RN = Object.freeze(['np-build-fixer']);
352
+ const POST_CRITICS_EVIDENCE = Object.freeze([
353
+ 'np-critic-style',
354
+ 'np-critic-tests',
355
+ 'np-critic-acceptance',
356
+ ]);
357
+
358
+ /**
359
+ * Look up a spawn-audit entry for a given (taskId, agent, round). Returns the
360
+ * audit entry object if found, null otherwise. Used by Layer-C gates in
361
+ * loop-run-round to assert that real spawns preceded each phase advance.
362
+ */
363
+ function findSpawnAuditForRound(taskId, agent, round, cwd) {
364
+ if (!checkpoint.TASK_ID_RE.test(taskId)) return null;
365
+ const target = Number(round);
366
+ if (!Number.isFinite(target) || target < 1) return null;
367
+ const audits = readToolUseAudit(taskId, cwd) || [];
368
+ for (const a of audits) {
369
+ if (!a) continue;
370
+ if (a.agent !== agent) continue;
371
+ if ((Number(a.round) || 1) !== target) continue;
372
+ return a;
373
+ }
374
+ return null;
375
+ }
376
+
377
+ /**
378
+ * Assert every required spawn for a phase exists in the audit log for the
379
+ * current round. Returns { satisfied, missing } — the orchestrator-side gate
380
+ * uses `missing` to compose actionable error messages.
381
+ */
382
+ function assertSpawnsAuditedForRound(taskId, requiredAgents, round, cwd) {
383
+ const missing = [];
384
+ for (const agent of requiredAgents) {
385
+ if (!findSpawnAuditForRound(taskId, agent, round, cwd)) missing.push(agent);
386
+ }
387
+ return { satisfied: missing.length === 0, missing };
388
+ }
389
+
344
390
  /**
345
391
  * Rule 9 mechanical check (Completeness Doctrine + ADR-0010 Step 4).
346
392
  * The orchestrator collects each spawn's tool-use log (most LLM APIs
@@ -637,6 +683,11 @@ module.exports = {
637
683
  auditToolUse,
638
684
  readToolUseAudit,
639
685
  auditFindingsForRound,
686
+ findSpawnAuditForRound,
687
+ assertSpawnsAuditedForRound,
688
+ POST_EXECUTOR_EVIDENCE_R1,
689
+ POST_EXECUTOR_EVIDENCE_RN,
690
+ POST_CRITICS_EVIDENCE,
640
691
  KNOWN_ROUTING_BUCKETS,
641
692
  SEARCH_TOOLS,
642
693
  AUDITED_AGENTS,
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "nubos-pilot",
3
- "version": "0.9.4",
3
+ "version": "0.9.6",
4
4
  "description": "AI-driven planning and execution tool for code projects",
5
5
  "homepage": "https://github.com/Nubos-AI/nubos-pilot",
6
6
  "repository": {
@@ -223,13 +223,19 @@ for WAVE_INDEX in 0 1 2 ...; do
223
223
 
224
224
  node .nubos-pilot/bin/np-tools.cjs checkpoint transition "$TASK_ID" verifying
225
225
 
226
- # === Step 4: Mechanical Checks + tool-use audit (orchestrator-side) ===
226
+ # === Step 4: Mechanical Checks + spawn-evidence audit (orchestrator-side) ===
227
227
  VERIFY_LOG="${TMPDIR:-/tmp}/np-verify-${TASK_ID}-r${ROUND}.log"
228
228
  # Orchestrator (NOT the agent) runs the task's <verify> command + stack
229
229
  # linters; redirect stdout+stderr to $VERIFY_LOG.
230
230
  VERIFY_EXIT=$?
231
+ # Stamp executor spawn-evidence into the audit log. EXECUTOR_TOOL_LOG is
232
+ # the tool-name JSON array harvested from the spawn's tool_use stream
233
+ # (e.g. '["Read","search-knowledge","Edit","Bash"]'). For AUDITED_AGENTS
234
+ # this drives Rule 9 enforcement; the round number is sourced automatically
235
+ # from the checkpoint by loop-audit-tool-use. The post-executor gate (Layer C)
236
+ # refuses to advance unless this evidence stamp exists for the current round.
231
237
  node .nubos-pilot/bin/np-tools.cjs loop-audit-tool-use "$TASK_ID" \
232
- --round "$ROUND" --agent "$EXECUTOR_AGENT"
238
+ --agent "$EXECUTOR_AGENT" --tool-use-log "$EXECUTOR_TOOL_LOG"
233
239
 
234
240
  POST_EXEC=$(node .nubos-pilot/bin/np-tools.cjs loop-run-round "$TASK_ID" \
235
241
  --phase post-executor \
@@ -249,6 +255,19 @@ for WAVE_INDEX in 0 1 2 ...; do
249
255
  # - agents/np-critic-acceptance.md (sonnet) → CRITIC_ACCEPTANCE_JSON
250
256
  CRITIC_OUTPUTS_JSON=$(printf '[%s,%s,%s]' "$CRITIC_STYLE_JSON" "$CRITIC_TESTS_JSON" "$CRITIC_ACCEPTANCE_JSON")
251
257
 
258
+ # === Step 5b: Stamp critic spawn-evidence (one audit entry per critic) ===
259
+ # MANDATORY — without these three stamps, post-critics refuses with
260
+ # `loop-post-critics-missing-critic-audit` (Layer C, ADR-0010 Trust-Layer).
261
+ # The orchestrator MUST issue all three calls AFTER the critic spawns
262
+ # have actually run; synthetic --critic-outputs JSON without these
263
+ # corresponding audit entries is mechanically blocked.
264
+ # --tool-use-log may be empty for critics (they aren't AUDITED_AGENTS for
265
+ # Rule 9), but supplying the actual critic tool list is preferred for
266
+ # observability on np:dashboard.
267
+ node .nubos-pilot/bin/np-tools.cjs loop-audit-tool-use "$TASK_ID" --agent np-critic-style --tool-use-log '[]'
268
+ node .nubos-pilot/bin/np-tools.cjs loop-audit-tool-use "$TASK_ID" --agent np-critic-tests --tool-use-log '[]'
269
+ node .nubos-pilot/bin/np-tools.cjs loop-audit-tool-use "$TASK_ID" --agent np-critic-acceptance --tool-use-log '[]'
270
+
252
271
  # === Step 6: Route via loop-evaluate (post-critics) ===
253
272
  POST_CRIT=$(node .nubos-pilot/bin/np-tools.cjs loop-run-round "$TASK_ID" \
254
273
  --phase post-critics --critic-outputs "$CRITIC_OUTPUTS_JSON")
@@ -347,7 +366,7 @@ After every slice completes, point the operator at `/np:validate-phase $PHASE` t
347
366
  - Spawn the three Critic agents (`np-critic-style`, `np-critic-tests`, `np-critic-acceptance`) IN PARALLEL — single message, three Agent blocks per task per round.
348
367
  - Run `loop-run-round --phase post-executor` AFTER mechanical checks; honor `next_action: spawn-build-fixer` (verify-red short-circuit, skip critics this round).
349
368
  - Run `loop-run-round --phase post-critics` AFTER critics return, to obtain the routing `next_action`.
350
- - Run `loop-audit-tool-use` per round per spawn — Rule 9 (search-knowledge / match-existing-learning) is mechanically enforced.
369
+ - Run `loop-audit-tool-use` per round per spawn. For executor/build-fixer the call drives Rule 9 enforcement and supplies the spawn-evidence for the post-executor gate (`loop-post-executor-missing-spawn-audit`); for the three Critic agents it supplies the spawn-evidence for the post-critics gate (`loop-post-critics-missing-critic-audit`). All four audit calls per round are mandatory before the corresponding `loop-run-round --phase post-{executor|critics}` invocation.
351
370
  - Route every commit through `node .nubos-pilot/bin/np-tools.cjs commit-task` so `assertCommittablePaths` (D-25) runs.
352
371
  - Hard-stop the wave when `commit-task` returns non-zero, OR a task hits `stuck`/`plan-checker`.
353
372
 
@@ -357,7 +376,7 @@ After every slice completes, point the operator at `/np:validate-phase $PHASE` t
357
376
  - Skip the Nubosloop and call `commit-task` directly after the executor (single-pass executor → commit is forbidden — ADR-0010).
358
377
  - Spawn the Critic agents serially — they MUST run in parallel (single message, three Agent blocks).
359
378
  - Use `np-executor` on Round ≥ 2 — use `np-build-fixer` (it gets prior critic findings + verify output excerpt).
360
- - Skip `loop-audit-tool-use` Rule 9 violations must surface as `rule-9-violation` findings, not be silenced.
379
+ - Skip `loop-audit-tool-use` for ANY spawn (executor/build-fixer/the three Critics). Skipping the executor audit silences Rule 9; skipping any critic audit means the orchestrator cannot prove the critic actually ran, and the post-critics gate refuses. Synthesizing `--critic-outputs` JSON without spawning real critic agents is the canonical bypass — Layer C blocks it mechanically.
361
380
  - Extend a task's scope beyond `files_modified` — D-04 violations route to `plan-checker`, not post-hoc PLAN.md mutations.
362
381
  - Invoke `git commit`, `git add`, or any bare git command from this workflow or the spawned agent (CLAUDE.md §Git operations).
363
382
  - Bundle two tasks into one commit (ADR-0004 atomicity).