npm - nubos-pilot - Versions diffs - 1.3.3 → 1.3.4 - Mend

nubos-pilot 1.3.3 → 1.3.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (17)

package/agents/np-task-architect.md +95 -0
package/agents/np-test-writer.md +89 -0
package/bin/install.js +73 -16
package/bin/np-tools/commit-task.cjs +80 -6
package/bin/np-tools/commit-task.test.cjs +133 -0
package/bin/np-tools/loop-commands.test.cjs +121 -2
package/bin/np-tools/loop-run-round.cjs +121 -5
package/lib/agents-registry.cjs +10 -0
package/lib/agents.test.cjs +2 -0
package/lib/config-defaults.cjs +8 -0
package/lib/config-schema.cjs +2 -0
package/lib/git.cjs +6 -2
package/lib/git.test.cjs +28 -0
package/package.json +1 -1
package/templates/RULES.md +36 -1
package/workflows/execute-phase.md +138 -1
package/workflows/plan-phase.md +17 -2

package/agents/np-task-architect.md ADDED

@@ -0,0 +1,95 @@
+---
+name: np-task-architect
+description: Per-task architecture step inside the Nubosloop. Runs in round 1 after the researcher swarm, before the test-writer and executor. Reads the task plan, CONTEXT, RULES.md (Conventions) and any M<NNN>-ARCHITECTURE.md, then emits ephemeral structural constraints (module/class layout, boundaries, paradigms, the test surfaces TDD must cover) as its final message. Read-only — writes no files.
+tier: sonnet
+tools: Read, Grep, Glob
+color: purple
+---
+<role>
+You are the nubos-pilot per-task architect. You run once per task, in round 1 of the Nubosloop, after the researcher swarm and BEFORE the test-writer and executor. Your output is the structural contract the test-writer and executor build against: how the code for THIS task must be shaped.
+You are not the milestone architect (`np-architect`, which decides milestone-wide boundaries and writes `M<NNN>-ARCHITECTURE.md`). You operate one level down: given the task and any milestone architecture, you decide the concrete structure of the code this single task introduces — which classes/modules carry which responsibility, where the boundaries fall, which paradigms and project conventions apply, and which surfaces the tests must exercise.
+You are advisory and read-only. You emit your architecture spec as your FINAL MESSAGE (markdown) — you do not write files. The orchestrator captures it and injects it into the test-writer and executor prompts as `<architecture_constraints>`.
+**CRITICAL: Mandatory Initial Read**
+If the prompt contains a `<files_to_read>` block, you MUST use the `Read` tool to load every file listed there before producing your spec. This is your primary context — the task plan, CONTEXT, `RULES.md`, and any `M<NNN>-ARCHITECTURE.md`.
+**Design skills.** If the spawn prompt contains a `Use the following Nubos skills` line, `Read` each named skill from `.claude/skills/<skill>/SKILL.md` BEFORE committing your spec. Each skill's "Verification bar" is the standard your structural decisions must satisfy. If the skills are absent (non-Claude runtime), proceed on your own judgment.
+</role>
+## Completeness Mandate
+This agent operates under [`templates/COMPLETENESS.md`](../templates/COMPLETENESS.md). The rules that bind this role:
+- **Rule 1 — Do the whole thing.** A structural spec that names happy-path classes but ignores error paths, boundary surfaces, and the tests they require is not done. Name them all.
+- **Rule 2 — Do it right.** Honour the project's Conventions (`RULES.md` → `## Conventions`). Do not invent a structure that contradicts the locked class/module/naming/paradigm rules.
+- **Rule 8 — Never present a workaround when the real fix exists.** If the clean structure needs a new boundary, say so — do not bless a shortcut to save the executor a file.
+- **Rule 9 — Search before building.** Read `.nubos-pilot/codebase/INDEX.md` and the milestone architecture before naming a new module. Extend existing structure; do not silently reinvent it.
+Refusal of any rule is a hard-stop. Surface the violation to the orchestrator verbatim and abort the spawn.
+## Granularity — task structure, NOT line-level implementation
+You decide **structure**: responsibilities, boundaries, the shape of the public surface, which paradigm applies, what the tests must cover. You do NOT write the implementation. Specifically you do NOT emit:
+- Schema DDL / exact column types,
+- exact framework-generated filenames (use glob-shaped descriptions, e.g. "a Service class under the app's service layer"),
+- full code bodies (a ≤ 5-line illustrative signature is fine; a method body is not),
+- code-style edicts already covered by `RULES.md`.
+Your spec is ephemeral guidance, not a committed artifact — it never reaches `PLAN.md`, so it cannot trip plan-lint. Keep it concrete enough to constrain the executor, abstract enough to leave the executor room to implement.
+## Inputs
+| Input | Purpose | Typical path |
+|-------|---------|--------------|
+| Task plan (required) | The task being executed. `<action>` + `<acceptance_criteria>` define the surface you structure. | `.nubos-pilot/milestones/M<NNN>/slices/S<NNN>/tasks/T<NNNN>/T<NNNN>-PLAN.md` |
+| RULES.md (required) | Project Conventions — class/module structure, naming, code style, paradigms. Your spec MUST conform. | `.nubos-pilot/RULES.md` |
+| M<NNN>-CONTEXT.md (recommended) | Locked milestone decisions. | `.nubos-pilot/milestones/M<NNN>/M<NNN>-CONTEXT.md` |
+| M<NNN>-ARCHITECTURE.md (when present) | Milestone boundaries you refine for this task — never contradict. | `.nubos-pilot/milestones/M<NNN>/M<NNN>-ARCHITECTURE.md` |
+| .nubos-pilot/codebase/INDEX.md (recommended) | Existing module boundaries to extend, not reinvent. | `.nubos-pilot/codebase/INDEX.md` |
+## Output Contract
+Emit your architecture spec as your final message — markdown, this exact shape, no file writes:
+```markdown
+# Task Architecture — <task-id>
+## Responsibilities & Boundaries
+| Unit (class / module) | New / Existing | Responsibility | Public surface |
+|---|---|---|---|
+| ... | ... | ... | ... |
+## Paradigms & Conventions to honour
+- <named convention from RULES.md the executor must follow>
+- <pattern that is required / banned for this task>
+## Required Test Surfaces (hand-off to np-test-writer)
+- <observable behaviour that MUST have a test> — happy path
+- <boundary / empty / overflow case that MUST have a test>
+- <failure path that MUST have a test>
+## Constraints for the executor
+- <boundary the executor must not cross, e.g. "no DB access from the controller — go through the Service">
+## Conflicts
+- <only if the task or RULES.md make a clean structure impossible — name the conflict; the orchestrator routes it to the user>
+```
+If the task is purely mechanical (copy change, version bump, one-line fix) and needs no structural decision, emit a single line: `No structural decision required — <one-line reason>.` Do not manufacture structure where none is warranted.
+<scope_guardrail>
+**Do:**
+- Read the task plan, RULES.md, CONTEXT, milestone architecture, and codebase index freely.
+- Decide the task's code structure and the test surfaces it requires.
+- Honour RULES.md Conventions and milestone architecture. Surface conflicts instead of silently overriding.
+**Don't:**
+- Write or edit ANY file — you have no Write/Edit tool. Your spec is your final message.
+- Prescribe line-level implementation, schema DDL, or exact framework filenames.
+- Re-open milestone decisions (`M<NNN>-CONTEXT.md` / `M<NNN>-ARCHITECTURE.md`) — refine within them.
+- Spawn other agents or commit anything.
+</scope_guardrail>

package/agents/np-test-writer.md ADDED

@@ -0,0 +1,89 @@
+---
+name: np-test-writer
+description: Per-task TDD step inside the Nubosloop. Runs in round 1 after the architect, before the executor. Writes real, valid test files for the task's required surfaces (from the architecture spec + acceptance criteria + Conventions) BEFORE production code exists. Tests may start red; the executor makes them green. Never skips, stubs, or writes vacuous assertions.
+tier: sonnet
+tools: Read, Write, Bash, Grep, Glob
+color: "#06B6D4"
+---
+<role>
+You are the nubos-pilot test-writer. You run once per task, in round 1 of the Nubosloop, AFTER the per-task architect and BEFORE the executor. You practice test-driven development: you write the tests the executor must then make pass.
+The orchestrator hands you `<architecture_constraints>` (the per-task architect's required test surfaces), the task's `<acceptance_criteria>`, and the project Conventions (`RULES.md`). You translate those into real test files placed where the project keeps tests. Because production code does not exist yet, your tests are EXPECTED to fail (red) when run — that is correct TDD. The executor turns them green; the loop's verify gate runs after the executor, never after you.
+Your tests are a contract. The executor is told not to delete, skip, or weaken them. So they must be valid, runnable, and honest from the start.
+**CRITICAL: Mandatory Initial Read**
+If the prompt contains a `<files_to_read>` block, you MUST `Read` every file listed before writing anything — the task plan, the architecture spec path, `RULES.md`, and any existing neighbouring tests whose style you must match.
+**Testing skills.** If the spawn prompt contains a `Use the following Nubos skills` line, `Read` each named skill from `.claude/skills/<skill>/SKILL.md` (e.g. `np-test-strategy`) before writing. Its "Verification bar" is the standard your test suite must satisfy. If absent (non-Claude runtime), proceed on your own judgment.
+</role>
+## Completeness Mandate
+This agent operates under [`templates/COMPLETENESS.md`](../templates/COMPLETENESS.md). The rules that bind this role:
+- **Rule 1 — Do the whole thing.** Cover every surface the architect listed: happy path, empty/boundary/overflow input, and the failure path. A missing case is an incomplete suite.
+- **Rule 3 — Do it with tests.** This is your entire job. Every public surface the task introduces gets at least one test that asserts observable behaviour.
+- **Rule 10 — Test before shipping.** A test that does not actually assert the claimed behaviour is worse than no test. No `assert(true)`, no `expect(x).toBeDefined()` as the only check, no `it.skip` / `markTestSkipped` / commented-out asserts / `if (false)` guards. Such a test is a hard-stop violation, not a placeholder.
+Refusal of any rule is a hard-stop. Surface the violation to the orchestrator verbatim and abort the spawn.
+## Anti-Skip Self-Check (run before you finish)
+Before emitting your envelope, re-read every test file you wrote and confirm — line by line — that NONE of the following is present. If any is, fix it; do not ship it:
+1. Skipped/pending tests (`.skip`, `xit`, `xdescribe`, `markTestSkipped`, `@Disabled`, `t.skip`, `pytest.mark.skip`, `#[ignore]`).
+2. Vacuous assertions (`assert(true)`, `expect(true).toBe(true)`, an assertion that can never fail, or a test with zero assertions).
+3. Swallowed failures (`try { … } catch {}` around the assertion, empty catch, asserting inside an un-awaited promise).
+4. Tautologies (asserting a literal against itself, or re-asserting the mock you just configured).
+5. Non-determinism (wall-clock, network, unseeded randomness) without explicit injection.
+A test that exists only to inflate the count is a Rule-10 violation. The downstream `np-critic-tests` axis audits for exactly these; do not hand it findings.
+## TDD Discipline (red is correct)
+- Write tests against the behaviour the acceptance criteria + architecture spec require — not against whatever the executor might implement.
+- Run the project's test command via `Bash` to confirm the tests **parse and execute** (no syntax/collection errors). Failing assertions are expected and fine; collection/compile errors are NOT — fix those.
+- Do NOT write production code, stubs, or fixtures that pre-satisfy the tests. Minimal test scaffolding (factories, fakes the project already uses) is allowed; implementing the unit under test is the executor's job.
+- Place tests where the project keeps them (match `RULES.md` → Conventions and existing neighbours). Add the files you create to the task's `files_modified` set via the checkpoint if the orchestrator asks; otherwise list them in your envelope so they are committed with the executor's diff.
+## Inputs
+| Input | Purpose | Typical path |
+|-------|---------|--------------|
+| Task plan (required) | `<action>` + `<acceptance_criteria>` define what to test. | `.nubos-pilot/milestones/M<NNN>/slices/S<NNN>/tasks/T<NNNN>/T<NNNN>-PLAN.md` |
+| Architecture constraints (required) | The per-task architect's required test surfaces. | inline `<architecture_constraints>` / `$TMPDIR` path |
+| RULES.md (required) | Conventions — test location, naming, style. | `.nubos-pilot/RULES.md` |
+| Neighbouring tests (recommended) | The project's test idiom you must match. | repo paths |
+## Output Contract
+Write the test files, then emit a single JSON object as your final message (no prose around it):
+```json
+{
+  "agent": "test-writer",
+  "task_id": "M001-S001-T0001",
+  "round": 1,
+  "tests_written": ["tests/Feature/OutcomeRecorderTest.php"],
+  "surfaces_covered": ["records a verdict and persists it", "rejects a malformed verdict with 422", "returns empty history for an unknown task"],
+  "collection_ok": true,
+  "expected_red": true,
+  "notes": "Tests fail as expected — no production code yet. Executor must make them green without weakening assertions."
+}
+```
+`collection_ok` MUST be `true` before you finish — if the suite cannot even collect/compile your tests, fix them first. `expected_red: true` is the normal TDD state. If you genuinely cannot write valid tests (e.g. acceptance criteria are ambiguous), emit `"collection_ok": false` with a `notes` explaining the blocker — the orchestrator routes it to the user rather than letting the executor proceed blind.
+<scope_guardrail>
+**Do:**
+- Read the task, architecture spec, RULES.md, and neighbouring tests.
+- Write real, valid, honest test files for every required surface.
+- Run the test command to confirm collection succeeds (red assertions are fine).
+**Don't:**
+- Write production code or pre-satisfy your own tests.
+- Skip, stub, weaken, or comment out any assertion to make the suite "pass".
+- Edit unrelated files, spawn other agents, or commit anything.
+</scope_guardrail>
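
The Anti-Skip Self-Check list above can be approximated mechanically. The following is a minimal sketch, not part of the package: `findAntiSkipViolations` and `SKIP_PATTERNS` are hypothetical names, and the regular expressions cover only the first three items of the list (skipped tests, vacuous assertions, swallowed failures). Tautologies and non-determinism need more than a line scan.

```javascript
// Hypothetical helper, NOT part of nubos-pilot: a line-based scan for some of
// the patterns named in the Anti-Skip Self-Check list.
const SKIP_PATTERNS = [
  {
    id: 'skipped-test',
    re: /\b(?:it|test|describe)\.skip\b|\bx(?:it|describe)\s*\(|markTestSkipped|@Disabled|\bt\.skip\b|pytest\.mark\.skip|#\[ignore\]/,
  },
  {
    id: 'vacuous-assertion',
    re: /assert\(\s*true\s*\)|expect\(\s*true\s*\)\.toBe\(\s*true\s*\)/,
  },
  {
    // Matches an empty catch block, with or without a binding.
    id: 'swallowed-failure',
    re: /catch\s*(?:\([^)]*\))?\s*\{\s*\}/,
  },
];

// Returns one finding per (line, pattern) hit; line numbers are 1-based.
function findAntiSkipViolations(source) {
  const findings = [];
  String(source).split('\n').forEach((line, i) => {
    for (const { id, re } of SKIP_PATTERNS) {
      if (re.test(line)) findings.push({ line: i + 1, id });
    }
  });
  return findings;
}

const sample = [
  "it.skip('records a verdict', () => {});",
  'expect(add(1, 2)).toBe(3);',
  'assert(true);',
  'try { run(); } catch {}',
].join('\n');

console.log(findAntiSkipViolations(sample));
```

A scan like this would only be a first pass. The agent prompt still asks for a line-by-line re-read, and the `np-critic-tests` axis mentioned above performs its own audit.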

package/bin/install.js CHANGED

@@ -228,27 +228,79 @@ function _readInstallConfig(projectRoot) {
   }
 }
-// On re-install/update the installer leaves an existing config.json untouched.
-// To make `ultra` the standard for updated projects too, backfill `agents.economy`
-// into configs that don't set it yet — loud (the key is written, visible in the
-// file) and conservative (an explicit economy OR legacy economy_critic is treated
-// as a deliberate choice and never overwritten). Returns the action taken for logging.
-function _backfillEconomyDefault(stateDir, { dryRun = false } = {}) {
+// Shared read for the re-install/update backfills: parse the existing config.json
+// once. Returns `{ cfgPath, cfg, status }` where status is 'ok' | 'absent' |
+// 'unparseable' and cfg is null unless status is 'ok'.
+function _loadConfigJson(stateDir) {
   const cfgPath = path.join(stateDir, 'config.json');
   let raw;
-  try { raw = fs.readFileSync(cfgPath, 'utf-8'); } catch { return 'absent'; }
+  try { raw = fs.readFileSync(cfgPath, 'utf-8'); } catch { return { cfgPath, cfg: null, status: 'absent' }; }
   let cfg;
-  try { cfg = JSON.parse(raw); } catch { return 'unparseable'; }
-  if (!cfg || typeof cfg !== 'object') return 'unparseable';
+  try { cfg = JSON.parse(raw); } catch { return { cfgPath, cfg: null, status: 'unparseable' }; }
+  if (!cfg || typeof cfg !== 'object') return { cfgPath, cfg: null, status: 'unparseable' };
+  return { cfgPath, cfg, status: 'ok' };
+}
+// Apply the ultra economy default in-memory (never overwriting an explicit
+// economy OR legacy economy_critic choice). Returns 'backfilled' | 'preserved'.
+function _applyEconomyDefault(cfg) {
   const agents = cfg.agents && typeof cfg.agents === 'object' ? cfg.agents : null;
   if (agents && (agents.economy !== undefined || agents.economy_critic !== undefined)) {
     return 'preserved';
   }
   cfg.agents = { ...(agents || {}), economy: configDefaults.INSTALL_ECONOMY_MODE };
-  if (!dryRun) atomicWriteFileSync(cfgPath, JSON.stringify(cfg, null, 2));
   return 'backfilled';
 }
+// Apply the default-on loop-agent toggles (architect, test-writer) in-memory,
+// never overwriting an explicit true/false. Returns the keys added.
+const _BACKFILL_AGENT_TOGGLES = Object.freeze(['architect', 'test_writer']);
+function _applyAgentToggles(cfg) {
+  const agents = cfg.agents && typeof cfg.agents === 'object' ? cfg.agents : {};
+  const added = [];
+  for (const key of _BACKFILL_AGENT_TOGGLES) {
+    if (agents[key] === undefined) {
+      agents[key] = configDefaults.DEFAULT_AGENTS[key];
+      added.push(key);
+    }
+  }
+  if (added.length > 0) cfg.agents = agents;
+  return added;
+}
+// Single-pass backfill used by the installer: one read, the economy default and
+// the agent-toggle defaults applied together, one atomic write. Both backfills are
+// loud (written to the file) and conservative (an explicit choice is never
+// overwritten). Returns `{ economy, toggles }` for logging.
+function _backfillConfigDefaults(stateDir, { dryRun = false } = {}) {
+  const { cfgPath, cfg, status } = _loadConfigJson(stateDir);
+  if (!cfg) return { economy: status, toggles: [] };
+  const economy = _applyEconomyDefault(cfg);
+  const toggles = _applyAgentToggles(cfg);
+  if ((economy === 'backfilled' || toggles.length > 0) && !dryRun) {
+    atomicWriteFileSync(cfgPath, JSON.stringify(cfg, null, 2));
+  }
+  return { economy, toggles };
+}
+// Standalone wrappers retained for unit tests. Each loads + writes on its own;
+// the installer uses _backfillConfigDefaults to avoid a second read/write pass.
+function _backfillEconomyDefault(stateDir, { dryRun = false } = {}) {
+  const { cfgPath, cfg, status } = _loadConfigJson(stateDir);
+  if (!cfg) return status;
+  const action = _applyEconomyDefault(cfg);
+  if (action === 'backfilled' && !dryRun) atomicWriteFileSync(cfgPath, JSON.stringify(cfg, null, 2));
+  return action;
+}
+function _backfillAgentToggles(stateDir, { dryRun = false } = {}) {
+  const { cfgPath, cfg } = _loadConfigJson(stateDir);
+  if (!cfg) return [];
+  const added = _applyAgentToggles(cfg);
+  if (added.length > 0 && !dryRun) atomicWriteFileSync(cfgPath, JSON.stringify(cfg, null, 2));
+  return added;
+}
 function _readExistingScope(projectRoot) {
   const cfg = _readInstallConfig(projectRoot);
   return cfg && cfg.scope ? cfg.scope : null;
@@ -441,11 +493,15 @@ async function _runInstallLocked(ctx) {
     else console.error(dim + 'DRY-RUN: würde schreiben ' + configPath + reset);
     initConfig = config;
   } else {
-    // Re-install / update: backfill the ultra economy default into a config that
-    // doesn't set it yet (never overwriting an explicit choice).
-    const action = _backfillEconomyDefault(stateDir, { dryRun });
-    if (action === 'backfilled') {
-      console.error(green + '  [config] agents.economy → ultra (backfilled default)'
+    // Re-install / update: backfill the default agent config into a config that
+    // doesn't set it yet (one read/write, never overwriting an explicit choice).
+    const { economy, toggles } = _backfillConfigDefaults(stateDir, { dryRun });
+    if (economy === 'backfilled') {
+      console.error(green + '  [config] agents.economy → ' + configDefaults.INSTALL_ECONOMY_MODE + ' (backfilled default)'
+        + (dryRun ? ' [DRY-RUN]' : '') + reset);
+    }
+    for (const key of toggles) {
+      console.error(green + '  [config] agents.' + key + ' → ' + configDefaults.DEFAULT_AGENTS[key] + ' (backfilled default)'
         + (dryRun ? ' [DRY-RUN]' : '') + reset);
     }
   }
@@ -915,5 +971,6 @@ module.exports = {
   parseInstallFlags,
   VALID_AGENTS, VALID_SCOPES,
   SOURCE_PAYLOAD_DIR, PAYLOAD_SUBPATH, STATE_SUBPATH,
-  _payloadDirFor, _stateDirFor, _backfillEconomyDefault,
+  _payloadDirFor, _stateDirFor,
+  _backfillEconomyDefault, _backfillAgentToggles, _backfillConfigDefaults,
 };
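
To illustrate the "conservative" backfill described in the comments above, here is a self-contained sketch of the two in-memory steps applied to sample configs. The function bodies mirror `_applyEconomyDefault` and `_applyAgentToggles` from the diff, but the default values are assumptions: `'ultra'` is taken from the removed log line, and `DEFAULT_AGENTS` is assumed to be `true` for both toggles because the comment calls them "default-on". The real values live in `lib/config-defaults.cjs`. The `'custom-mode'` economy value is hypothetical.

```javascript
// Assumed defaults for illustration only; see lib/config-defaults.cjs for the real ones.
const INSTALL_ECONOMY_MODE = 'ultra';
const DEFAULT_AGENTS = { architect: true, test_writer: true };

// Mirrors _applyEconomyDefault: an explicit economy or legacy economy_critic wins.
function applyEconomyDefault(cfg) {
  const agents = cfg.agents && typeof cfg.agents === 'object' ? cfg.agents : null;
  if (agents && (agents.economy !== undefined || agents.economy_critic !== undefined)) {
    return 'preserved';
  }
  cfg.agents = { ...(agents || {}), economy: INSTALL_ECONOMY_MODE };
  return 'backfilled';
}

// Mirrors _applyAgentToggles: only keys that are undefined get a default.
function applyAgentToggles(cfg) {
  const agents = cfg.agents && typeof cfg.agents === 'object' ? cfg.agents : {};
  const added = [];
  for (const key of ['architect', 'test_writer']) {
    if (agents[key] === undefined) {
      agents[key] = DEFAULT_AGENTS[key];
      added.push(key);
    }
  }
  if (added.length > 0) cfg.agents = agents;
  return added;
}

// Both steps on one in-memory object, as _backfillConfigDefaults does before its single write.
function backfill(cfg) {
  return { economy: applyEconomyDefault(cfg), toggles: applyAgentToggles(cfg) };
}

const untouched = {};
const explicit = { agents: { economy: 'custom-mode', architect: false } };

console.log(backfill(untouched), untouched);
console.log(backfill(explicit), explicit);
```

In the second case the explicit `architect: false` survives and only `test_writer` is added, which is the behaviour the "never overwriting an explicit true/false" comment describes.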

package/bin/np-tools/commit-task.cjs CHANGED

@@ -91,19 +91,90 @@ function _resolveSafe(root, p) {
 }
 const _COMMIT_NAME_MAX = 200;
+const _COMMIT_BODY_MAX = 2000;
+const _TASK_ID_PREFIX_RE = /^\s*M\d{3,}-S\d{3,}-T\d{4,}\s*[—:-]\s*/;
+const _PLACEHOLDER_RE = /^\s*(?:\{\{.*\}\}|_?TBD\b.*|_none\b.*)\s*$/i;
 function _sanitizeCommitName(s) {
   return String(s == null ? '' : s).replace(/[\r\n\t]+/g, ' ').replace(/\s+/g, ' ').trim().slice(0, _COMMIT_NAME_MAX);
 }
+// The scaffolded H1 is "# <task-id> — <name>"; without a `name:` frontmatter
+// field the body regex captures the whole heading, producing the duplicated
+// "task(ID): ID — desc" subject. Strip a leading task-id prefix so the subject
+// reads "task(ID): desc".
 function _extractName(frontmatter, body) {
+  // Strip a leading task-id prefix, but fall through to the next source when the
+  // strip empties the candidate — a `name:` of just "M001-S001-T0001 —" or an H1
+  // with no description must not produce a bare "task(ID): " subject.
   if (typeof frontmatter.name === 'string' && frontmatter.name.length > 0) {
-    return _sanitizeCommitName(frontmatter.name);
+    const stripped = _sanitizeCommitName(String(frontmatter.name).replace(_TASK_ID_PREFIX_RE, ''));
+    if (stripped) return stripped;
   }
   const m = String(body || '').match(/^#\s+(?:Task:\s*)?(.+?)\s*$/m);
-  if (m) return _sanitizeCommitName(m[1]);
+  if (m) {
+    const stripped = _sanitizeCommitName(m[1].replace(_TASK_ID_PREFIX_RE, ''));
+    if (stripped) return stripped;
+  }
   return _sanitizeCommitName(frontmatter.id || 'task');
 }
+function _innerTag(body, tag) {
+  const m = String(body || '').match(new RegExp('<' + tag + '>([\\s\\S]*?)</' + tag + '>'));
+  if (!m) return '';
+  const inner = m[1].trim();
+  // A still-unfilled placeholder may carry a Markdown bullet prefix
+  // (`- _TBD — …`); test the de-bulleted form so it is recognised and omitted.
+  const candidate = inner.replace(/^[-*+]\s+/, '');
+  return _PLACEHOLDER_RE.test(candidate) ? '' : inner;
+}
+function _sanitizeCommitBody(s) {
+  return String(s == null ? '' : s)
+    .replace(/\r\n?/g, '\n')
+    .replace(/[ \t]+\n/g, '\n')
+    .replace(/\n{3,}/g, '\n\n')
+    .trim()
+    .slice(0, _COMMIT_BODY_MAX);
+}
+// Compose a descriptive commit body from the task's intent so the history is
+// self-explanatory months later: what the task does (<action>), what it must
+// satisfy (<acceptance_criteria>), the task id, and the files it touched.
+function _dedupe(arr) {
+  const seen = new Set();
+  const out = [];
+  for (const x of arr) {
+    if (!seen.has(x)) { seen.add(x); out.push(x); }
+  }
+  return out;
+}
+// Test files np-test-writer wrote this task (ADR-0023). The test-writer chooses
+// their paths at runtime — the planner-authored files_modified does not list
+// them — so the post-test-writer phase records them in the checkpoint. Fold them
+// into the commit set so the executor-greened tests land with their production
+// code instead of being silently dropped.
+function _tddTestFiles(taskId, cwd, root) {
+  const cp = readCheckpoint(taskId, cwd);
+  const np = cp && cp.nubosloop;
+  const tests = np && Array.isArray(np.tdd_tests) ? np.tdd_tests : [];
+  return tests.map((p) => _resolveSafe(root, p));
+}
+function _composeCommitBody(body, taskId, files) {
+  const action = _innerTag(body, 'action');
+  const accept = _innerTag(body, 'acceptance_criteria');
+  const parts = [];
+  if (action) parts.push(action);
+  if (accept) parts.push('Acceptance:\n' + accept);
+  parts.push('Task: ' + taskId);
+  if (Array.isArray(files) && files.length > 0) {
+    parts.push('Files: ' + files.join(', '));
+  }
+  return _sanitizeCommitBody(parts.join('\n\n'));
+}
 function run(args, ctx) {
   const context = ctx || {};
   const cwd = context.cwd || process.cwd();
@@ -153,13 +224,16 @@ function run(args, ctx) {
     );
   }
   const root = findProjectRoot(cwd);
-  const safeFiles = files.map((p) => _resolveSafe(root, p));
+  const declaredSafe = files.map((p) => _resolveSafe(root, p));
+  const safeFiles = _dedupe([...declaredSafe, ..._tddTestFiles(taskId, cwd, root)]);
   const name = _extractName(frontmatter, body);
   const message = 'task(' + taskId + '): ' + name;
+  // Compose the body from the paths that will actually be committed (commitTask
+  // drops gitignored entries), so `git log` never advertises a file the diff omits.
+  const { committable } = git.classifyCommittablePaths(safeFiles);
+  const commitBody = _composeCommitBody(body, taskId, committable);
-  const result = commitTask(taskId, safeFiles, message);
+  const result = commitTask(taskId, safeFiles, message, commitBody);
   if (result.committed === false && result.reason === 'artifacts-gitignored') {
     try {
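
The subject-line change in this hunk is easier to follow on concrete input. The sketch below copies `_TASK_ID_PREFIX_RE`, `_sanitizeCommitName`, and `_extractName` from the diff into a standalone file (underscores dropped). `subjectFor` is a hypothetical wrapper added here to show the final `task(ID): name` string. The inputs are the headings used by tests CT-3b and CT-3d.

```javascript
const COMMIT_NAME_MAX = 200;
// Same pattern as _TASK_ID_PREFIX_RE in the diff: a task id followed by a dash or colon.
const TASK_ID_PREFIX_RE = /^\s*M\d{3,}-S\d{3,}-T\d{4,}\s*[—:-]\s*/;

function sanitizeCommitName(s) {
  return String(s == null ? '' : s)
    .replace(/[\r\n\t]+/g, ' ')
    .replace(/\s+/g, ' ')
    .trim()
    .slice(0, COMMIT_NAME_MAX);
}

// Try frontmatter.name, then the first H1, then fall back to the task id.
// A candidate that is empty after stripping the id prefix is skipped.
function extractName(frontmatter, body) {
  if (typeof frontmatter.name === 'string' && frontmatter.name.length > 0) {
    const stripped = sanitizeCommitName(frontmatter.name.replace(TASK_ID_PREFIX_RE, ''));
    if (stripped) return stripped;
  }
  const m = String(body || '').match(/^#\s+(?:Task:\s*)?(.+?)\s*$/m);
  if (m) {
    const stripped = sanitizeCommitName(m[1].replace(TASK_ID_PREFIX_RE, ''));
    if (stripped) return stripped;
  }
  return sanitizeCommitName(frontmatter.id || 'task');
}

// Hypothetical wrapper (not in the diff) that builds the full subject line.
function subjectFor(frontmatter, body) {
  return 'task(' + frontmatter.id + '): ' + extractName(frontmatter, body);
}

console.log(subjectFor(
  { id: 'M006-S001-T0050' },
  '# M006-S001-T0050 — wire the outcome feedback loop',
));
console.log(subjectFor({ id: 'M006-S001-T0061' }, '# M006-S001-T0061 — '));
```

The first call drops the duplicated id from the heading. The second has nothing left after stripping, so it falls back to the id rather than producing a bare `task(ID): ` subject.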

package/bin/np-tools/commit-task.test.cjs CHANGED

@@ -164,6 +164,139 @@ test('CT-3: commit-task emits JSON with sha + files on success', () => {
   assert.ok(subject.startsWith('task(M006-S001-T0001):'), 'subject: ' + subject);
 });
+test('CT-3b: subject strips the duplicated task-id prefix from the H1 heading', () => {
+  const root = makeRepo();
+  const taskId = 'M006-S001-T0050';
+  const m = taskId.match(/^(M\d{3,})-(S\d{3,})-(T\d{4,})$/);
+  const [, mId, sId, tId] = m;
+  const taskDir = path.join(root, '.nubos-pilot', 'milestones', mId, 'slices', sId, 'tasks', tId);
+  fs.mkdirSync(taskDir, { recursive: true });
+  fs.writeFileSync(path.join(taskDir, tId + '-PLAN.md'), [
+    '---',
+    `id: ${taskId}`,
+    `milestone: ${mId}`,
+    `slice: ${mId}-${sId}`,
+    'type: execute',
+    'status: in-progress',
+    'tier: sonnet',
+    'owner: np-executor',
+    'wave: 1',
+    'depends_on: []',
+    'files_modified:',
+    '  - src/a.ts',
+    'autonomous: true',
+    'must_haves: {}',
+    '---',
+    '',
+    `# ${taskId} — wire the outcome feedback loop`,
+    '',
+    '<action>',
+    'Add the OutcomeRecorder service and persist verdicts.',
+    '</action>',
+    '',
+    '<acceptance_criteria>',
+    '- Verdicts persist across restarts',
+    '- API returns 201 on record',
+    '</acceptance_criteria>',
+  ].join('\n'), 'utf-8');
+  seedLoopReadyCheckpoint(root, taskId);
+  fs.mkdirSync(path.join(root, 'src'), { recursive: true });
+  fs.writeFileSync(path.join(root, 'src', 'a.ts'), 'export const a = 1;\n', 'utf-8');
+  const prev = process.cwd();
+  process.chdir(root);
+  const cap = _capture();
+  try {
+    subcmd.run([taskId], { cwd: root, stdout: cap.stub });
+  } finally {
+    process.chdir(prev);
+  }
+  const subject = execFileSync('git', ['-C', root, 'log', '-n', '1', '--format=%s'], { encoding: 'utf-8' }).trim();
+  assert.equal(subject, `task(${taskId}): wire the outcome feedback loop`,
+    'subject must not repeat the task id after the colon');
+  const fullBody = execFileSync('git', ['-C', root, 'log', '-n', '1', '--format=%b'], { encoding: 'utf-8' });
+  assert.match(fullBody, /Add the OutcomeRecorder service/, 'body should carry the <action> intent');
+  assert.match(fullBody, /Acceptance:/);
+  assert.match(fullBody, /Verdicts persist across restarts/);
+  assert.match(fullBody, new RegExp('Task: ' + taskId));
+  assert.match(fullBody, /Files: src\/a\.ts/);
+});
+test('CT-3c: test-writer files recorded in nubosloop.tdd_tests are folded into the commit', () => {
+  const root = makeRepo();
+  const taskId = 'M006-S001-T0060';
+  seedPlanAndTask(root, '06-01', taskId, ['src/a.ts']);
+  seedLoopReadyCheckpoint(root, taskId, { nubosloop: { tdd_tests: ['tests/a.test.ts'] } });
+  fs.mkdirSync(path.join(root, 'src'), { recursive: true });
+  fs.mkdirSync(path.join(root, 'tests'), { recursive: true });
+  fs.writeFileSync(path.join(root, 'src', 'a.ts'), 'export const a = 1;\n', 'utf-8');
+  fs.writeFileSync(path.join(root, 'tests', 'a.test.ts'), 'test("a", () => {});\n', 'utf-8');
+  const prev = process.cwd();
+  process.chdir(root);
+  try {
+    subcmd.run([taskId], { cwd: root, stdout: _capture().stub });
+  } finally {
+    process.chdir(prev);
+  }
+  const committed = execFileSync('git', ['-C', root, 'show', '--name-only', '--format=', 'HEAD'], { encoding: 'utf-8' });
+  assert.match(committed, /src\/a\.ts/, 'production file must be committed');
+  assert.match(committed, /tests\/a\.test\.ts/, 'tdd test file must be committed even though it is not in files_modified');
+});
+test('CT-3d: degenerate task name falls back to the id instead of an empty subject', () => {
+  const root = makeRepo();
+  const taskId = 'M006-S001-T0061';
+  const m = taskId.match(/^(M\d{3,})-(S\d{3,})-(T\d{4,})$/);
+  const [, mId, sId, tId] = m;
+  const taskDir = path.join(root, '.nubos-pilot', 'milestones', mId, 'slices', sId, 'tasks', tId);
+  fs.mkdirSync(taskDir, { recursive: true });
+  fs.writeFileSync(path.join(taskDir, tId + '-PLAN.md'), [
+    '---',
+    `id: ${taskId}`,
+    `milestone: ${mId}`,
+    `slice: ${mId}-${sId}`,
+    'type: execute', 'status: in-progress', 'tier: sonnet', 'owner: np-executor',
+    'wave: 1', 'depends_on: []',
+    'files_modified:', '  - src/a.ts',
+    'autonomous: true', 'must_haves: {}',
+    '---', '',
+    `# ${taskId} — `,
+  ].join('\n'), 'utf-8');
+  seedLoopReadyCheckpoint(root, taskId);
+  fs.mkdirSync(path.join(root, 'src'), { recursive: true });
+  fs.writeFileSync(path.join(root, 'src', 'a.ts'), 'export const a = 1;\n', 'utf-8');
+  const prev = process.cwd();
+  process.chdir(root);
+  try {
+    subcmd.run([taskId], { cwd: root, stdout: _capture().stub });
+  } finally {
+    process.chdir(prev);
+  }
+  const subject = execFileSync('git', ['-C', root, 'log', '-n', '1', '--format=%s'], { encoding: 'utf-8' }).trim();
+  assert.equal(subject, `task(${taskId}): ${taskId}`, 'empty name must fall back to the id, never a bare colon');
+});
+test('CT-3e: commit body Files: lists only committed paths, not gitignored ones', () => {
+  const root = makeRepo();
+  const taskId = 'M006-S001-T0062';
+  seedPlanAndTask(root, '06-01', taskId, ['src/a.ts', 'build/out.js']);
+  seedLoopReadyCheckpoint(root, taskId);
+  fs.writeFileSync(path.join(root, '.gitignore'), 'build/\n', 'utf-8');
+  fs.mkdirSync(path.join(root, 'src'), { recursive: true });
+  fs.mkdirSync(path.join(root, 'build'), { recursive: true });
+  fs.writeFileSync(path.join(root, 'src', 'a.ts'), 'export const a = 1;\n', 'utf-8');
+  fs.writeFileSync(path.join(root, 'build', 'out.js'), 'noise', 'utf-8');
+  const prev = process.cwd();
+  process.chdir(root);
+  try {
+    subcmd.run([taskId], { cwd: root, stdout: _capture().stub });
+  } finally {
+    process.chdir(prev);
+  }
+  const fullBody = execFileSync('git', ['-C', root, 'log', '-n', '1', '--format=%b'], { encoding: 'utf-8' });
+  assert.match(fullBody, /Files: src\/a\.ts/);
+  assert.doesNotMatch(fullBody, /build\/out\.js/, 'gitignored path must not be advertised in the body');
+});
 test('CT-4: commit-task SOFT-SKIPS when every files_modified entry is gitignored (artifacts-gitignored terminator)', () => {
   const root = makeRepo();
   seedPlanAndTask(root, '06-01', 'M006-S001-T0002', ['build/out.js']);

package/bin/np-tools/loop-commands.test.cjs CHANGED

@@ -426,6 +426,125 @@ function _seedSpawnEvidence(taskId, round, agents, cwd) {
   }
 }
+test('LCLI-RR-2a: phase=post-architect refuses without np-task-architect spawn audit', async () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  require('../../lib/nubosloop.cjs').recordLoopState('M001-S001-T0001', { round: 1 }, r);
+  const loopRunRound = require('./loop-run-round.cjs');
+  await assert.rejects(
+    () => loopRunRound.run(
+      ['M001-S001-T0001', '--phase', 'post-architect'],
+      { cwd: r, stdout: _cap().stub },
+    ),
+    (err) => err && err.code === 'loop-post-architect-missing-spawn-audit',
+  );
+});
+test('LCLI-RR-2b: phase=post-architect with stamp → spawn-test-writer, no round bump', async () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  _seedSpawnEvidence('M001-S001-T0001', 1, ['np-task-architect'], r);
+  const cap = _cap();
+  const loopRunRound = require('./loop-run-round.cjs');
+  await loopRunRound.run(['M001-S001-T0001', '--phase', 'post-architect'], { cwd: r, stdout: cap.stub });
+  const out = JSON.parse(cap.get());
+  assert.equal(out.phase, 'post-architect');
+  assert.equal(out.next_action, 'spawn-test-writer');
+  const cp = checkpoint.readCheckpoint('M001-S001-T0001', r);
+  assert.equal(cp.nubosloop.last_phase, 'post-architect');
+  assert.equal(cp.nubosloop.round, 1, 'prep step must not bump the round counter');
+});
+test('LCLI-RR-2c: phase=post-architect --force-post-architect bypasses the audit gate', async () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  require('../../lib/nubosloop.cjs').recordLoopState('M001-S001-T0001', { round: 1 }, r);
+  const cap = _cap();
+  const loopRunRound = require('./loop-run-round.cjs');
+  await loopRunRound.run(
+    ['M001-S001-T0001', '--phase', 'post-architect', '--force-post-architect'],
+    { cwd: r, stdout: cap.stub },
+  );
+  const out = JSON.parse(cap.get());
+  assert.equal(out.forced, true);
+  assert.equal(out.next_action, 'spawn-test-writer');
+});
+test('LCLI-RR-2d: phase=post-test-writer refuses without np-test-writer spawn audit', async () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  require('../../lib/nubosloop.cjs').recordLoopState('M001-S001-T0001', { round: 1 }, r);
+  const loopRunRound = require('./loop-run-round.cjs');
+  await assert.rejects(
+    () => loopRunRound.run(
+      ['M001-S001-T0001', '--phase', 'post-test-writer'],
+      { cwd: r, stdout: _cap().stub },
+    ),
+    (err) => err && err.code === 'loop-post-test-writer-missing-spawn-audit',
+  );
+});
+test('LCLI-RR-2e: phase=post-test-writer with stamp → spawn-executor, no round bump', async () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  _seedSpawnEvidence('M001-S001-T0001', 1, ['np-test-writer'], r);
+  const cap = _cap();
+  const loopRunRound = require('./loop-run-round.cjs');
+  await loopRunRound.run(['M001-S001-T0001', '--phase', 'post-test-writer'], { cwd: r, stdout: cap.stub });
+  const out = JSON.parse(cap.get());
+  assert.equal(out.phase, 'post-test-writer');
+  assert.equal(out.next_action, 'spawn-executor');
+  const cp = checkpoint.readCheckpoint('M001-S001-T0001', r);
+  assert.equal(cp.nubosloop.last_phase, 'post-test-writer');
+  assert.equal(cp.nubosloop.round, 1);
+});
+test('LCLI-RR-2f: phase=post-test-writer --tests records the written paths in nubosloop.tdd_tests', async () => {
+  const r = _mkRoot();
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  _seedSpawnEvidence('M001-S001-T0001', 1, ['np-test-writer'], r);
+  const cap = _cap();
+  const loopRunRound = require('./loop-run-round.cjs');
+  await loopRunRound.run(
+    ['M001-S001-T0001', '--phase', 'post-test-writer', '--tests', 'tests/a.test.ts, tests/b.test.ts'],
+    { cwd: r, stdout: cap.stub },
+  );
+  const out = JSON.parse(cap.get());
+  assert.deepEqual(out.tdd_tests, ['tests/a.test.ts', 'tests/b.test.ts']);
+  const cp = checkpoint.readCheckpoint('M001-S001-T0001', r);
+  assert.deepEqual(cp.nubosloop.tdd_tests, ['tests/a.test.ts', 'tests/b.test.ts']);
+});
+test('LCLI-RR-2g: post-researcher next_action skips disabled prep steps (architect off → test-writer)', async () => {
+  const r = _mkRoot();
+  fs.writeFileSync(
+    path.join(r, '.nubos-pilot', 'config.json'),
+    JSON.stringify({ agents: { architect: false } }),
+    'utf-8',
+  );
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  _seedSpawnEvidence('M001-S001-T0001', 1, ['np-researcher', 'np-researcher', 'np-researcher'], r);
+  const cap = _cap();
+  const loopRunRound = require('./loop-run-round.cjs');
+  await loopRunRound.run(['M001-S001-T0001', '--phase', 'post-researcher'], { cwd: r, stdout: cap.stub });
+  assert.equal(JSON.parse(cap.get()).next_action, 'spawn-test-writer');
+});
+test('LCLI-RR-2h: post-researcher next_action → executor when both prep steps are off', async () => {
+  const r = _mkRoot();
+  fs.writeFileSync(
+    path.join(r, '.nubos-pilot', 'config.json'),
+    JSON.stringify({ agents: { architect: false, test_writer: false } }),
+    'utf-8',
+  );
+  checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
+  _seedSpawnEvidence('M001-S001-T0001', 1, ['np-researcher', 'np-researcher', 'np-researcher'], r);
+  const cap = _cap();
+  const loopRunRound = require('./loop-run-round.cjs');
+  await loopRunRound.run(['M001-S001-T0001', '--phase', 'post-researcher'], { cwd: r, stdout: cap.stub });
+  assert.equal(JSON.parse(cap.get()).next_action, 'spawn-executor');
+});
 test('LCLI-RR-3: loop-run-round phase=post-executor with verify-green → spawn-critic-schwarm', async () => {
   const r = _mkRoot();
   checkpoint.startTask({ id: 'M001-S001-T0001' }, r);
@@ -1184,7 +1303,7 @@ test('LCLI-RR-35: post-researcher accepts when k=3 np-researcher audits exist (T
     { cwd: r, stdout: cap.stub },
   );
   const out = JSON.parse(cap.get());
-  assert.equal(out.next_action, 'spawn-executor');
+  assert.equal(out.next_action, 'spawn-architect');
   assert.equal(out.forced, false);
   assert.equal(out.expected_researcher_count, 3);
 });
@@ -1207,7 +1326,7 @@ test('LCLI-RR-35b: post-researcher k-gate honors swarm.research.k config (Gap #6
   );
   const out = JSON.parse(cap.get());
   assert.equal(out.expected_researcher_count, 1);
-  assert.equal(out.next_action, 'spawn-executor');
+  assert.equal(out.next_action, 'spawn-architect');
 });
 test('LCLI-RR-36: --force-post-researcher bypasses Layer-C gate + stamps flag (T2)', async () => {

package/bin/np-tools/loop-run-round.cjs CHANGED

@@ -7,6 +7,7 @@ const { NubosPilotError, safeAssign } = require('../../lib/core.cjs');
 const safePath = require('../../lib/safe-path.cjs');
 const checkpoint = require('../../lib/checkpoint.cjs');
+const config = require('../../lib/config.cjs');
 const nubosloop = require('../../lib/nubosloop.cjs');
 const messaging = require('../../lib/messaging.cjs');
 const compress = require('../../lib/compress.cjs');
@@ -40,6 +41,8 @@ function _verifyExcerpt(verifyOutput, cwd) {
 const VALID_PHASES = new Set([
   'preflight',
   'post-researcher',
+  'post-architect',
+  'post-test-writer',
   'post-executor',
   'post-critics',
   'commit',
@@ -153,13 +156,124 @@ function _runPostResearcher(taskId, list, cwd) {
   );
   return {
     phase: 'post-researcher',
-    next_action: 'spawn-executor',
+    next_action: _nextAfterResearcher(cwd),
     forced: force,
     expected_researcher_count: expectedK,
     round: merged.nubosloop ? merged.nubosloop.round : null,
   };
 }
+// Round-1 preparation steps (architect, test-writer) that run between the
+// researcher swarm and the executor. Each verifies its spawn was stamped via
+// `loop-audit-tool-use` (Layer-C SKIP-GUARD) and records last_phase. They never
+// bump the round counter — TDD writes tests once; build-fixer rounds iterate.
+// Config-gated in the orchestrator: when agents.architect / agents.test_writer
+// is off the step (and this phase) is simply never invoked.
+function _checkPrepSpawnAudited(taskId, list, cwd, agent, forceFlag) {
+  const force = list.includes(forceFlag);
+  const cur = checkpoint.readCheckpoint(taskId, cwd) || {};
+  const round = Number((cur.nubosloop && cur.nubosloop.round)) || 1;
+  const satisfied = force
+    ? true
+    : nubosloop.assertSpawnsAuditedForRound(taskId, [agent], round, cwd).satisfied;
+  return { force, round, satisfied };
+}
+function _stampPrepPhase(taskId, cwd, phase, lastAction, forcedKey, force, extraFn) {
+  return checkpoint.mergeCheckpoint(
+    taskId,
+    (curCp) => {
+      const prev = (curCp && curCp.nubosloop) || {};
+      const partial = { last_phase: phase, last_action: lastAction };
+      if (force) partial[forcedKey] = true;
+      if (typeof extraFn === 'function') safeAssign(partial, extraFn(prev));
+      return { nubosloop: safeAssign({}, prev, partial) };
+    },
+    cwd,
+  );
+}
+// The round-1 prep steps are config-gated, so the emitted next_action hint must
+// reflect which downstream steps are actually enabled — a consumer driving off
+// the JSON hint (rather than the ACTION CONTRACT prose) must not skip an enabled
+// architect/test-writer or stall on a disabled one.
+function _nextEnabled(cwd, steps) {
+  for (const [path, action] of steps) {
+    if (config.tryReadConfigPath(cwd, path, true, { onWarn() {} })) return action;
+  }
+  return 'spawn-executor';
+}
+function _nextAfterResearcher(cwd) {
+  return _nextEnabled(cwd, [['agents.architect', 'spawn-architect'], ['agents.test_writer', 'spawn-test-writer']]);
+}
+function _nextAfterArchitect(cwd) {
+  return _nextEnabled(cwd, [['agents.test_writer', 'spawn-test-writer']]);
+}
+// `--tests "a, b, c"` → ['a','b','c']; the post-test-writer phase records these
+// (the paths np-test-writer wrote) so commit-task can fold them into the commit.
+function _parseTestPaths(raw) {
+  if (typeof raw !== 'string' || raw.trim() === '') return [];
+  return raw.split(',').map((s) => s.trim()).filter((s) => s.length > 0);
+}
+function _runPostArchitect(taskId, list, cwd) {
+  const g = _checkPrepSpawnAudited(taskId, list, cwd, 'np-task-architect', '--force-post-architect');
+  if (!g.satisfied) {
+    throw new NubosPilotError(
+      'loop-post-architect-missing-spawn-audit',
+      'phase=post-architect refused: no `loop-audit-tool-use` record for round=' + g.round +
+      ', agent=np-task-architect on ' + taskId + '. ' +
+      'Spawn np-task-architect and call `loop-audit-tool-use ' + taskId +
+      ' --agent np-task-architect --tool-use-log <json>` after the spawn, ' +
+      'or pass --force-post-architect for an explicit override.',
+      { taskId, round: g.round, missing: ['np-task-architect'], required: ['np-task-architect'] },
+    );
+  }
+  const merged = _stampPrepPhase(taskId, cwd, 'post-architect', 'architect-spawned', 'forced_post_architect', g.force);
+  return {
+    phase: 'post-architect',
+    next_action: _nextAfterArchitect(cwd),
+    forced: g.force,
+    round: merged.nubosloop ? merged.nubosloop.round : g.round,
+  };
+}
+function _runPostTestWriter(taskId, list, cwd) {
+  const g = _checkPrepSpawnAudited(taskId, list, cwd, 'np-test-writer', '--force-post-test-writer');
+  if (!g.satisfied) {
+    throw new NubosPilotError(
+      'loop-post-test-writer-missing-spawn-audit',
+      'phase=post-test-writer refused: no `loop-audit-tool-use` record for round=' + g.round +
+      ', agent=np-test-writer on ' + taskId + '. ' +
+      'Spawn np-test-writer and call `loop-audit-tool-use ' + taskId +
+      ' --agent np-test-writer --tool-use-log <json>` after the spawn, ' +
+      'or pass --force-post-test-writer for an explicit override.',
+      { taskId, round: g.round, missing: ['np-test-writer'], required: ['np-test-writer'] },
+    );
+  }
+  const testPaths = _parseTestPaths(args.getFlag(list, '--tests'));
+  const merged = _stampPrepPhase(
+    taskId, cwd, 'post-test-writer', 'test-writer-spawned', 'forced_post_test_writer', g.force,
+    (prev) => {
+      const prior = Array.isArray(prev.tdd_tests) ? prev.tdd_tests : [];
+      const seen = new Set(prior);
+      const tdd_tests = prior.slice();
+      for (const p of testPaths) { if (!seen.has(p)) { seen.add(p); tdd_tests.push(p); } }
+      return { tdd_tests };
+    },
+  );
+  return {
+    phase: 'post-test-writer',
+    next_action: 'spawn-executor',
+    forced: g.force,
+    tdd_tests: merged.nubosloop && Array.isArray(merged.nubosloop.tdd_tests) ? merged.nubosloop.tdd_tests : [],
+    round: merged.nubosloop ? merged.nubosloop.round : g.round,
+  };
+}
 function _runPostExecutor(taskId, list, cwd) {
   const verifyExitCode = args.getFlag(list, '--verify-exit-code');
   if (verifyExitCode === undefined) {
@@ -557,7 +671,7 @@ async function run(argv, ctx) {
   if (!phase) {
     throw new NubosPilotError(
       'loop-run-round-missing-phase',
-      'loop-run-round requires --phase <preflight|post-researcher|post-executor|post-critics|commit|stuck>',
+      'loop-run-round requires --phase <preflight|post-researcher|post-architect|post-test-writer|post-executor|post-critics|commit|stuck>',
       { hint: 'each phase corresponds to a non-LLM transition between LLM spawns' },
     );
   }
@@ -572,9 +686,11 @@ async function run(argv, ctx) {
   const tail = list.slice(1);
   let result;
   switch (phase) {
-    case 'preflight':       result = await _runPreflight(taskId, tail, cwd); break;
-    case 'post-researcher': result = _runPostResearcher(taskId, tail, cwd); break;
-    case 'post-executor':   result = _runPostExecutor(taskId, tail, cwd); break;
+    case 'preflight':        result = await _runPreflight(taskId, tail, cwd); break;
+    case 'post-researcher':  result = _runPostResearcher(taskId, tail, cwd); break;
+    case 'post-architect':   result = _runPostArchitect(taskId, tail, cwd); break;
+    case 'post-test-writer': result = _runPostTestWriter(taskId, tail, cwd); break;
+    case 'post-executor':    result = _runPostExecutor(taskId, tail, cwd); break;
     case 'post-critics':    result = _runPostCritics(taskId, tail, cwd); break;
     case 'commit':          result = _runCommit(taskId, tail, cwd); break;
     case 'stuck':           result = _runStuck(taskId, tail, cwd); break;
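
The `--tests` handling in the post-test-writer phase above can be sketched in isolation. This is a minimal illustration, not the package's code: `parseTestPaths` mirrors `_parseTestPaths`, while `mergeTestPaths` is a hypothetical name for the dedupe logic that the diff passes inline to `_stampPrepPhase`. The checkpoint layer is left out entirely.

```javascript
'use strict';

// Split a comma-separated --tests value into trimmed, non-empty paths.
function parseTestPaths(raw) {
  if (typeof raw !== 'string' || raw.trim() === '') return [];
  return raw.split(',').map((s) => s.trim()).filter((s) => s.length > 0);
}

// Hypothetical helper: append incoming paths to the previously recorded
// ones, keeping first-seen order and dropping duplicates.
function mergeTestPaths(prior, incoming) {
  const merged = Array.isArray(prior) ? prior.slice() : [];
  const seen = new Set(merged);
  for (const p of incoming) {
    if (!seen.has(p)) {
      seen.add(p);
      merged.push(p);
    }
  }
  return merged;
}

// First call: nothing recorded yet.
const first = mergeTestPaths(undefined, parseTestPaths('tests/a.test.ts, tests/b.test.ts'));
// Second call: one repeated path, one new path.
const second = mergeTestPaths(first, parseTestPaths('tests/b.test.ts,tests/c.test.ts'));

console.log(first);
console.log(second);
```

Because the merge is additive and order-preserving, invoking the phase twice with overlapping `--tests` values should leave `nubosloop.tdd_tests` free of duplicates rather than overwriting the earlier list.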

package/lib/agents-registry.cjs CHANGED

@@ -15,6 +15,14 @@ const BUILD_FIXER_AGENT = 'np-build-fixer';
 const RESEARCHER_AGENT = 'np-researcher';
+// Round-1 preparation agents that run between the researcher swarm and the
+// executor. Config-gated (agents.architect / agents.test_writer). They get
+// Layer-C spawn-evidence stamps but are NOT Rule-9 audited (not in
+// AUDITED_AGENTS) — the architect is advisory/read-only and the test-writer's
+// quality is enforced downstream by the np-critic-tests axis.
+const TASK_ARCHITECT_AGENT = 'np-task-architect';
+const TEST_WRITER_AGENT = 'np-test-writer';
 const AUDITED_AGENTS = Object.freeze([
   EXECUTOR_AGENT,
   BUILD_FIXER_AGENT,
@@ -29,5 +37,7 @@ module.exports = {
   EXECUTOR_AGENT,
   BUILD_FIXER_AGENT,
   RESEARCHER_AGENT,
+  TASK_ARCHITECT_AGENT,
+  TEST_WRITER_AGENT,
   AUDITED_AGENTS,
 };

package/lib/agents.test.cjs CHANGED

@@ -242,6 +242,8 @@ const NP_AGENTS = [
   { file: 'np-researcher-reconciler', expected_tier: 'sonnet' },
   { file: 'np-codebase-documenter', expected_tier: 'sonnet' },
   { file: 'np-architect', expected_tier: 'sonnet' },
+  { file: 'np-task-architect', expected_tier: 'sonnet' },
+  { file: 'np-test-writer', expected_tier: 'sonnet' },
   { file: 'np-build-fixer', expected_tier: 'sonnet' },
   { file: 'np-security-reviewer', expected_tier: 'sonnet' },
   { file: 'np-nyquist-auditor', expected_tier: 'haiku' },

package/lib/config-defaults.cjs CHANGED

@@ -18,6 +18,14 @@ const DEFAULT_AGENTS = Object.freeze({
   research: true,
   plan_checker: true,
   verifier: true,
+  // Per-task architecture step in the Nubosloop (np-task-architect). Runs in
+  // round 1 after the researcher swarm, before the test-writer and executor.
+  // Backfilled to true on install/update when absent (see bin/install.js).
+  architect: true,
+  // Per-task TDD step (np-test-writer). Runs in round 1 after the architect,
+  // before the executor — writes the tests the executor must make green.
+  // Backfilled to true on install/update when absent (see bin/install.js).
+  test_writer: true,
   // Economy axis level (off|lite|full|ultra). Default `lite` = prevention-first:
   // the climb-the-ladder discipline is on, the in-loop critic is opt-in (full/ultra).
   // Resolved via lib/economy-mode.cjs; legacy `economy_critic` bool still honoured.

package/lib/config-schema.cjs CHANGED

@@ -33,6 +33,8 @@ const SCHEMA = Object.freeze({
       research:        { type: 'boolean', optional: true },
       plan_checker:    { type: 'boolean', optional: true },
       verifier:        { type: 'boolean', optional: true },
+      architect:       { type: 'boolean', optional: true },
+      test_writer:     { type: 'boolean', optional: true },
       economy:         { type: 'enum', values: VALID_ECONOMY_MODES, optional: true },
       economy_critic:  { type: 'boolean', optional: true },
     },
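
How the two new boolean toggles steer the step after the researcher swarm can be sketched as follows. This is a simplified illustration under stated assumptions: the config is a plain in-memory object, and `isEnabled` is a hypothetical stand-in for the package's `config.tryReadConfigPath` lookup, assumed here to fall back to `true` when a toggle is absent (matching the defaults added in `config-defaults.cjs`).

```javascript
'use strict';

// Hypothetical lookup: a missing or non-boolean toggle counts as enabled.
function isEnabled(cfg, key) {
  const agents = (cfg && cfg.agents) || {};
  return typeof agents[key] === 'boolean' ? agents[key] : true;
}

// The first enabled prep step wins; with both off, go straight to the executor.
function nextAfterResearcher(cfg) {
  if (isEnabled(cfg, 'architect')) return 'spawn-architect';
  if (isEnabled(cfg, 'test_writer')) return 'spawn-test-writer';
  return 'spawn-executor';
}

console.log(nextAfterResearcher({}));
console.log(nextAfterResearcher({ agents: { architect: false } }));
console.log(nextAfterResearcher({ agents: { architect: false, test_writer: false } }));
```

The three calls correspond to the default case and to the configurations exercised by tests LCLI-RR-2g and LCLI-RR-2h.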

package/lib/git.cjs CHANGED

@@ -58,7 +58,7 @@ function assertCommittablePaths(paths, opts) {
   return committable;
 }
-function commitTask(taskId, files, message) {
+function commitTask(taskId, files, message, body) {
   const { committable, ignored } = classifyCommittablePaths(files);
   if (committable.length === 0) {
     if (ignored.length > 0) {
@@ -84,7 +84,11 @@ function commitTask(taskId, files, message) {
     });
   }
   execFileSync('git', ['add', '--', ...committable], { stdio: 'pipe', timeout: GIT_TIMEOUT_MS });
-  execFileSync('git', ['commit', '-m', message, '--', ...committable], { stdio: 'pipe', timeout: GIT_TIMEOUT_MS });
+  const commitArgs = ['commit', '-m', message];
+  if (typeof body === 'string' && body.trim().length > 0) {
+    commitArgs.push('-m', body);
+  }
+  execFileSync('git', [...commitArgs, '--', ...committable], { stdio: 'pipe', timeout: GIT_TIMEOUT_MS });
   return {
     committed: true,
     files_committed: committable.slice(),
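
The argument construction in the `commitTask` change above can be shown without touching a repository. A minimal sketch: `buildCommitArgs` is a hypothetical name, and it only assembles the argv that the diff hands to `execFileSync('git', ...)`. Git joins multiple `-m` values as separate paragraphs, so the second `-m` becomes the commit body.

```javascript
'use strict';

// Build the argv for `git commit`: subject first, body as a second -m only
// when it contains non-whitespace text, then the explicit pathspec.
function buildCommitArgs(message, body, paths) {
  const args = ['commit', '-m', message];
  if (typeof body === 'string' && body.trim().length > 0) {
    args.push('-m', body);
  }
  return [...args, '--', ...paths];
}

const withBody = buildCommitArgs(
  'task(M006-S001-T0001): add git helper',
  'Implements the git helper.\n\nTask: M006-S001-T0001',
  ['lib/git.cjs'],
);
const withoutBody = buildCommitArgs('task(M006-S001-T0001): add git helper', '   ', ['lib/git.cjs']);

console.log(withBody);
console.log(withoutBody);
```

Skipping the second `-m` for an empty or whitespace-only body keeps three-argument callers producing the same subject-only commit as before, which is the case GIT-5c checks.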

package/lib/git.test.cjs CHANGED

@@ -199,6 +199,34 @@ test('GIT-5: commitTask creates a single commit containing exactly the supplied
   });
 });
+test('GIT-5b: commitTask attaches a multi-line body via a second -m when body is supplied', () => {
+  const root = makeRepo();
+  inRepo(root, () => {
+    writeFile(root, 'lib/git.cjs', '// stub');
+    git.commitTask(
+      'M006-S001-T0001',
+      ['lib/git.cjs'],
+      'task(M006-S001-T0001): add git helper',
+      'Implements the git helper.\n\nTask: M006-S001-T0001',
+    );
+    const subject = execFileSync('git', ['log', '-n', '1', '--format=%s'], { encoding: 'utf-8' }).trim();
+    const fullBody = execFileSync('git', ['log', '-n', '1', '--format=%b'], { encoding: 'utf-8' });
+    assert.equal(subject, 'task(M006-S001-T0001): add git helper');
+    assert.match(fullBody, /Implements the git helper\./);
+    assert.match(fullBody, /Task: M006-S001-T0001/);
+  });
+});
+test('GIT-5c: commitTask omits the body -m when body is empty/whitespace (backward-compatible)', () => {
+  const root = makeRepo();
+  inRepo(root, () => {
+    writeFile(root, 'lib/git.cjs', '// stub');
+    git.commitTask('M006-S001-T0001', ['lib/git.cjs'], 'task(M006-S001-T0001): add git helper', '   ');
+    const fullBody = execFileSync('git', ['log', '-n', '1', '--format=%b'], { encoding: 'utf-8' }).trim();
+    assert.equal(fullBody, '');
+  });
+});
 test('GIT-6: findCommitByTaskId returns 40-char SHA for known task commit', () => {
   const root = makeRepo();
   inRepo(root, () => {

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "nubos-pilot",
-  "version": "1.3.3",
+  "version": "1.3.4",
   "description": "Self-hosted AI pilot for any codebase. Researcher and critic agents plan, execute and verify each change.",
   "homepage": "https://pilot.nubos.cloud",
   "repository": {

package/templates/RULES.md CHANGED

@@ -79,7 +79,33 @@ Examples:
 -->
 - _TBD — fill with logging policy._
-## Code Style
+## Conventions
+> **How your code must be built.** These rules bind the architect (`np-task-architect`),
+> the test-writer (`np-test-writer`), the executor, and the style critic
+> (`np-critic-style`). They are read on every task. Each subsection is **MUST FILL** —
+> use `- _none — <reason>_` only when a subsection genuinely does not apply.
+### Class / Module Structure
+<!-- How classes, modules, and units are shaped. Examples:
+- One public class per file; file name matches the class name.
+- Constructor injection only — no service-locator / static singletons.
+- Business logic lives in Services/Actions; controllers stay thin (no DB access).
+- Public surface ≤ 5 methods; split when it grows past that.
+-->
+- _TBD — fill with class/module construction rules._
+### Naming
+<!-- Identifier conventions. Examples:
+- Classes PascalCase, methods camelCase, constants UPPER_SNAKE.
+- Booleans read as predicates (`isActive`, `hasAccess`), never `flag`/`status`.
+- Table names follow the framework's pluralization — never override.
+-->
+- _TBD — fill with naming conventions._
+### Code Style
 <!-- Format/lint/comment policy. Examples:
 - No comments inside source — names + tests carry intent.
@@ -88,6 +114,15 @@ Examples:
 -->
 - _TBD — fill with style policy._
+### Patterns & Paradigms
+<!-- Architectural patterns that are required or banned. Examples:
+- Required: Repository pattern for all persistence; Result objects over exceptions for control flow.
+- Banned: anemic domain models; inheritance for code reuse (prefer composition).
+- Errors are typed and surfaced — never swallowed or stringly-typed.
+-->
+- _TBD — fill with required/banned patterns._
 ## Out-of-Scope (Forever)
 <!-- Things this project explicitly will never do. Distinct from deferred ideas.

package/workflows/execute-phase.md CHANGED

@@ -219,6 +219,10 @@ SPAWN_HEADLESS_ENABLED=$(node .nubos-pilot/bin/np-tools.cjs config-get spawn.hea
 SPAWN_HEADLESS_AGENTS=$(node .nubos-pilot/bin/np-tools.cjs config-get spawn.headless.agents 2>/dev/null || echo '["np-critic","np-researcher"]')
 SPAWN_HEADLESS_FALLBACK=$(node .nubos-pilot/bin/np-tools.cjs config-get spawn.headless.fallback_on_error 2>/dev/null || echo true)
 CONF_INJECT_CRITERIA=$(node .nubos-pilot/bin/np-tools.cjs config-get conformance.inject_criteria 2>/dev/null || echo true)
+# Round-1 prep agents (default on; backfilled on install/update). When a toggle
+# is false the matching ACTION CONTRACT (Step 2b / Step 2c) is skipped wholesale.
+ARCHITECT_ENABLED=$(node .nubos-pilot/bin/np-tools.cjs config-get agents.architect 2>/dev/null || echo true)
+TEST_WRITER_ENABLED=$(node .nubos-pilot/bin/np-tools.cjs config-get agents.test_writer 2>/dev/null || echo true)
 # Milestone success_criteria as the executor's acceptance target (rendered once from the INIT payload).
 # Intent-level only (ADR-0019): these describe what "done right" means, NOT how to build it.
 SUCCESS_CRITERIA_BLOCK=$(echo "$INIT" | node -e 'process.stdin.on("data",d=>{try{const c=JSON.parse(d).success_criteria||[];console.log(c.map(x=>"- "+(x.id?x.id+": ":"")+(x.text||x)).join("\n"))}catch(e){console.log("")}})')
@@ -414,6 +418,131 @@ for WAVE_INDEX in 0 1 2 ...; do
         CONSENSUS_PATTERN=""
       fi
+      # ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+      # ACTION CONTRACT — Step 2b: Per-Task Architect (Round 1, config-gated)
+      # ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+      # WHEN: $ROUND -eq 1 AND $ARCHITECT_ENABLED = true. Skip wholesale otherwise
+      #   (agents.architect=false → no architect this run; R≥2 build-fixer rounds
+      #   never run it).
+      # SKIP-GUARD: loop-post-architect-missing-spawn-audit (needs 1 architect audit).
+      #
+      # Execute EXACTLY these three groups, in order:
+      #
+      # (1) ONE Agent tool-call (real, not bash):
+      #       Agent(subagent_type="np-task-architect", prompt=<…>)
+      #     Prompt fields:
+      #       <files_to_read>: task plan, slice plan, CONTEXT.md, RULES.md,
+      #         M<NNN>-ARCHITECTURE.md (if present), .nubos-pilot/codebase/INDEX.md
+      #       <consensus_pattern>: $CONSENSUS_PATTERN (researcher output; may be empty)
+      #       <lang_directive>: $LANG_DIRECTIVE
+      #     Curated skills (quality bar) — instruct the agent to Read each that
+      #     applies from .claude/skills/<skill>/SKILL.md: np-system-design,
+      #     np-service-boundary, np-api-design, np-composition-patterns,
+      #     np-error-handling, np-adr (only for a costly-to-reverse choice).
+      #     The agent is READ-ONLY: it emits its Task-Architecture spec as its FINAL
+      #     MESSAGE (markdown per its Output Contract). Write that message verbatim
+      #     to "$ARCH_CONSTRAINTS_PATH".
+      #
+      # (2) ONE Bash audit-stamp (same round) — architect is NOT Rule-9 audited,
+      #     so an empty tool-use log is correct:
+      #       node .nubos-pilot/bin/np-tools.cjs loop-audit-tool-use "$TASK_ID" \
+      #         --agent np-task-architect --tool-use-log '[]'
+      #
+      # (3) ONE Bash advance:
+      #       node .nubos-pilot/bin/np-tools.cjs loop-run-round "$TASK_ID" \
+      #         --phase post-architect
+      #
+      # $ARCH_CONSTRAINTS is injected as <architecture_constraints> into the
+      # test-writer (Step 2c) AND executor (Step 3) prompts.
+      #
+      # Rationale: ADR-0023 — a per-task structural pass before tests/code so the
+      # test-writer and executor build against a decided shape, honouring RULES.md
+      # Conventions. Ephemeral ($TMPDIR, never committed) → plan-lint untouched.
+      # ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+      ARCH_CONSTRAINTS=""
+      ARCH_CONSTRAINTS_PATH="${TMPDIR:-/tmp}/np-arch-${TASK_ID}.md"
+      if [ "$ROUND" -eq 1 ] && [ "$ARCHITECT_ENABLED" = "true" ]; then
+        # Off-host (ADR-0021): np-task-architect is read-only (Read/Grep/Glob), not
+        # Rule-9 audited, writes no files — run via spawn-offhost with default cwd
+        # when it routes to an openai-compat provider; its spec returns as the
+        # spawn's final message (content).
+        ARCHITECT_KIND=$(node .nubos-pilot/bin/np-tools.cjs resolve-model np-task-architect --json 2>/dev/null \
+          | node -e 'let s="";process.stdin.on("data",d=>s+=d).on("end",()=>{try{console.log(JSON.parse(s).kind||"native")}catch{console.log("native")}})')
+        if [ "$ARCHITECT_KIND" = "openai-compat" ]; then
+          A_PROMPT="${TMPDIR:-/tmp}/np-offhost-task-architect-${TASK_ID}.md"
+          # … render the files_to_read block + consensus + skills + $LANG_DIRECTIVE into "$A_PROMPT" …
+          A_OUT=$(node .nubos-pilot/bin/np-tools.cjs spawn-offhost \
+            --agent np-task-architect --task-file "$A_PROMPT" --task-id "$TASK_ID" \
+            --read-only --no-audit ${SLICE_CWD:+--cwd "$SLICE_CWD"})
+          echo "$A_OUT" | ARCH_PATH="$ARCH_CONSTRAINTS_PATH" node -e 'let s="";process.stdin.on("data",d=>s+=d).on("end",()=>{let c="";try{c=JSON.parse(s).content||""}catch{}require("fs").writeFileSync(process.env.ARCH_PATH,c)})'
+        else
+          true  # → execute group (1): native Agent spawn; write its final message to "$ARCH_CONSTRAINTS_PATH".
+        fi
+        node .nubos-pilot/bin/np-tools.cjs loop-audit-tool-use "$TASK_ID" --agent np-task-architect --tool-use-log '[]'
+        node .nubos-pilot/bin/np-tools.cjs loop-run-round "$TASK_ID" --phase post-architect
+        [ -f "$ARCH_CONSTRAINTS_PATH" ] && ARCH_CONSTRAINTS=$(cat "$ARCH_CONSTRAINTS_PATH")
+      fi
+      # ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+      # ACTION CONTRACT — Step 2c: Test-Writer / TDD (Round 1, config-gated)
+      # ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+      # WHEN: $ROUND -eq 1 AND $TEST_WRITER_ENABLED = true. Runs AFTER the architect,
+      #   BEFORE the executor. Skip wholesale otherwise.
+      # SKIP-GUARD: loop-post-test-writer-missing-spawn-audit (needs 1 test-writer audit).
+      #
+      # Execute EXACTLY these three groups, in order:
+      #
+      # (1) ONE Agent tool-call (real, not bash):
+      #       Agent(subagent_type="np-test-writer", prompt=<…>)
+      #     Prompt fields:
+      #       <files_to_read>: task plan, slice plan, RULES.md, neighbouring tests
+      #       <architecture_constraints>: $ARCH_CONSTRAINTS (the architect's required
+      #         test surfaces; empty when the architect is disabled)
+      #       <success_criteria>: $SUCCESS_CRITERIA_BLOCK + slice UAT path (intent-level)
+      #       <lang_directive>: $LANG_DIRECTIVE
+      #     Curated skill (quality bar) — instruct the agent to Read
+      #     .claude/skills/np-test-strategy/SKILL.md and satisfy its Verification bar.
+      #     RULES — the agent writes REAL, VALID test files for every required surface;
+      #     it MUST NOT skip/stub/weaken assertions (Rule 10). Tests MAY be red now;
+      #     the executor makes them green. The agent emits a JSON envelope whose
+      #     tests_written paths you collect into $TDD_TESTS.
+      #
+      # (2) ONE Bash audit-stamp (same round) — test-writer is NOT Rule-9 audited:
+      #       node .nubos-pilot/bin/np-tools.cjs loop-audit-tool-use "$TASK_ID" \
+      #         --agent np-test-writer --tool-use-log '[]'
+      #
+      # (3) ONE Bash advance — pass the written test paths so they are recorded in
+      #     the checkpoint (nubosloop.tdd_tests) and commit-task folds them into the
+      #     commit even when files_modified did not enumerate them:
+      #       node .nubos-pilot/bin/np-tools.cjs loop-run-round "$TASK_ID" \
+      #         --phase post-test-writer --tests "$TDD_TESTS"
+      #
+      # Rationale: ADR-0023 — TDD inside the loop. The mechanical verify gate
+      # (Step 4) runs only AFTER the executor, so red-until-executor is expected
+      # and not a failure. The np-critic-tests axis (Step 5) re-audits for any
+      # skipped/vacuous assertions that slipped through.
+      # ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+      TDD_TESTS=""
+      if [ "$ROUND" -eq 1 ] && [ "$TEST_WRITER_ENABLED" = "true" ]; then
+        # Off-host (ADR-0021): np-test-writer writes test files, so off-host needs
+        # worktree isolation exactly like the executor (model-driven Write confined
+        # + ff-merged back). When worktree isolation is off, it runs native.
+        TEST_WRITER_KIND=$(node .nubos-pilot/bin/np-tools.cjs resolve-model np-test-writer --json 2>/dev/null \
+          | node -e 'let s="";process.stdin.on("data",d=>s+=d).on("end",()=>{try{console.log(JSON.parse(s).kind||"native")}catch{console.log("native")}})')
+        if [ "$TEST_WRITER_KIND" = "openai-compat" ] && [ "$WORKTREE_ISOLATION" = "true" ] && [ -n "$SLICE_CWD" ] && [ "$SLICE_CWD" != "." ]; then
+          TW_PROMPT="${TMPDIR:-/tmp}/np-offhost-test-writer-${TASK_ID}.md"
+          # … render files_to_read + architecture_constraints + success_criteria + skill + $LANG_DIRECTIVE into "$TW_PROMPT" …
+          TW_OUT=$(node .nubos-pilot/bin/np-tools.cjs spawn-offhost \
+            --agent np-test-writer --task-file "$TW_PROMPT" --task-id "$TASK_ID" \
+            --cwd "$SLICE_CWD" --allow-bash --no-audit)
+          TDD_TESTS=$(echo "$TW_OUT" | node -e 'let s="";process.stdin.on("data",d=>s+=d).on("end",()=>{try{const j=JSON.parse(JSON.parse(s).content||"{}");console.log((j.tests_written||[]).join(", "))}catch{console.log("")}})')
+        else
+          true  # → execute group (1): native Agent spawn; collect tests_written from the envelope into $TDD_TESTS.
+        fi
+        node .nubos-pilot/bin/np-tools.cjs loop-audit-tool-use "$TASK_ID" --agent np-test-writer --tool-use-log '[]'
+        node .nubos-pilot/bin/np-tools.cjs loop-run-round "$TASK_ID" --phase post-test-writer --tests "$TDD_TESTS"
+      fi
       # ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
       # ACTION CONTRACT — Step 3: Executor (R1) / Build-Fixer (R≥2)
       # ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
@@ -424,6 +553,13 @@ for WAVE_INDEX in 0 1 2 ...; do
       #     Prompt fields:
       #       <files_to_read>: task plan, slice plan, prior slice SUMMARYs, CONTEXT.md
       #       <consensus_pattern>: $CONSENSUS_PATTERN (with [VERIFIED]/[PROVISIONAL]/[CACHED])
+      #       <architecture_constraints>: $ARCH_CONSTRAINTS — the per-task architect's
+      #         decided structure + constraints (empty when agents.architect is off).
+      #         The executor builds against this shape; it is intent-level, not a code spec.
+      #       <tdd_tests>: $TDD_TESTS — test files np-test-writer wrote (R1, empty when off).
+      #         The executor MUST make them green WITHOUT deleting, skipping, or weakening
+      #         any assertion. They are in scope alongside files_modified (recorded in the
+      #         checkpoint at post-test-writer) and commit-task commits them with the diff.
       #       <success_criteria>: when $CONF_INJECT_CRITERIA = true, include the milestone
       #         acceptance target — $SUCCESS_CRITERIA_BLOCK plus the slice UAT path
       #         (.nubos-pilot/milestones/M<NNN>/slices/S<NNN>/S<NNN>-UAT.md). Frame it as
@@ -437,7 +573,8 @@ for WAVE_INDEX in 0 1 2 ...; do
       #         ultra) instruct the agent to APPLY the np-executor "Climb the ladder"
       #         discipline before writing (prevention-first). When $ECONOMY_MODE = off,
       #         instruct it to SKIP the ladder (no economy pressure this run).
-      #     RULES — Agent MUST: edit ONLY paths in files_modified (D-04 scope guard) —
+      #     RULES — Agent MUST: edit ONLY paths in files_modified plus the <tdd_tests>
+      #     paths (D-04 scope guard; the TDD tests are the sole sanctioned addition) —
       #     success_criteria are the acceptance target, NEVER a licence to touch other files,
       #     run `node np-tools.cjs knowledge-search "<q>" --task $TASK_ID` via Bash
       #     ≥1× (Rule 9 — the --task flag writes the audit evidence ledger),
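
Reading aid for the off-host branch in the hunk above: the inline `node -e` one-liner that fills `$TDD_TESTS` is dense, so the same parsing is sketched here as a standalone function. The name `extractTestsWritten` is hypothetical and not part of the package. The envelope shape (an outer JSON object whose `content` field is itself a JSON string carrying `tests_written`) is inferred only from that one-liner, not from any documented `spawn-offhost` contract.

```javascript
// Hypothetical helper (not in the package): mirrors the inline `node -e`
// script that derives $TDD_TESTS from the spawn-offhost output.
function extractTestsWritten(raw) {
  try {
    // The outer envelope is JSON; its `content` field is a JSON string.
    const envelope = JSON.parse(raw);
    const inner = JSON.parse(envelope.content || "{}");
    // A missing `tests_written` falls back to an empty list.
    return (inner.tests_written || []).join(", ");
  } catch {
    // Any parse or shape error yields an empty string, as in the one-liner.
    return "";
  }
}
```

Under these assumptions, malformed or empty output produces an empty string, so the subsequent `loop-run-round ... --tests "$TDD_TESTS"` call would receive no test paths rather than fail.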

package/workflows/plan-phase.md CHANGED

@@ -1,7 +1,7 @@
 ---
 command: np:plan-phase
 description: Plans a milestone (M<NNN>) — breaks it into slices (waves) and tasks. Spawns np-planner (opus) + np-plan-checker (opus), 2-iteration verification, then scaffolds every task file.
-argument-hint: <milestone-number> [--research] [--repromote]
+argument-hint: <milestone-number> [--research] [--architect] [--repromote]
 ---
 # np:plan-phase
@@ -41,17 +41,19 @@ Output layout:
 ```bash
 PHASE=""
 RESEARCH_FLAG=0
+ARCHITECT_FLAG=0
 REPROMOTE_FLAG=0
 for arg in "$@"; do
   case "$arg" in
     --research)  RESEARCH_FLAG=1 ;;
+    --architect) ARCHITECT_FLAG=1 ;;
     --repromote) REPROMOTE_FLAG=1 ;;
     --*)         echo "Unknown flag: $arg" >&2; exit 2 ;;
     *)           [[ -z "$PHASE" ]] && PHASE="$arg" ;;
   esac
 done
 if [[ -z "$PHASE" ]]; then
-  echo "Usage: /np:plan-phase <milestone-number> [--research] [--repromote]" >&2
+  echo "Usage: /np:plan-phase <milestone-number> [--research] [--architect] [--repromote]" >&2
   exit 2
 fi
 ```
@@ -154,6 +156,19 @@ fi
 **Exit code 42 contract:** orchestrator sees exit 42 → runs `/np:research-phase $PHASE` → re-enters `/np:plan-phase $PHASE` without the `--research` flag.
+### Gate 2b — Optional architecture pass (`--architect`)
+The `--architect` flag auto-dispatches `/np:architect-phase` before planning, so a structural ADR pass (`M<NNN>-ARCHITECTURE.md`) is decided up front and the planner consumes it like an extension of CONTEXT.md. Dispatched AFTER research (the established flow is research → architect → plan): when both flags are set, the research re-entry strips `--research`, leaving `--architect` to dispatch on the next pass.
+```bash
+if [[ "$ARCHITECT_FLAG" == "1" ]]; then
+  echo "architect-auto: dispatching /np:architect-phase $PHASE before planning" >&2
+  exit 43
+fi
+```
+**Exit code 43 contract:** orchestrator sees exit 43 → runs `/np:architect-phase $PHASE` → re-enters `/np:plan-phase $PHASE` without the `--architect` flag. The milestone `np-architect` stays intent-level (ADR-0019): its decisions inform the plan; they do not bake schema/filenames/code-style into `PLAN.md`.
 **Researcher-Schwarm semantics (ADR-0011).** The dispatched `/np:research-phase` runs in Schwarm mode by default (`swarm.research.k=3`). The cache-bypass at Pre-flight short-circuits the swarm whenever the milestone goal + requirements match a stored learning at similarity ≥ `swarm.research.threshold` and `occurrence ≥ swarm.research.minOccurrence`. The merged consensus carries a `<consensus_meta>` block (`k`, `agreement_score`, `flagged_decisions`) which `np-plan-checker` reads to weight downstream verdicts. No additional flags needed at this site — the swarm runs automatically when `--research` is set.
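
Reading aid for the two exit-code contracts above (42 and 43): a minimal sketch of how an orchestrator might map a `/np:plan-phase` exit code to the follow-up command and the re-entry invocation. The function `nextStep` and its return shape are hypothetical; the diff only states the contracts in prose and does not show the orchestrator's implementation.

```javascript
// Hypothetical orchestrator helper (not in the package): maps a plan-phase
// exit code to the follow-up command and the flags kept for re-entry.
function nextStep(exitCode, phase, flags) {
  const contracts = {
    42: { run: "/np:research-phase", strip: "--research" },
    43: { run: "/np:architect-phase", strip: "--architect" },
  };
  const contract = contracts[exitCode];
  if (!contract) return null; // not a dispatch code: nothing to re-enter
  // Re-entry drops only the flag that triggered this dispatch.
  const kept = flags.filter((f) => f !== contract.strip);
  return {
    dispatch: `${contract.run} ${phase}`,
    reenter: ["/np:plan-phase", phase, ...kept].join(" "),
  };
}
```

With both flags set, the first pass exits 42 and re-enters with only `--architect`, which then exits 43 on the next pass. This matches the research → architect → plan order described in Gate 2b.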
 ### Gate 3 — Milestone already planned