nubos-pilot 1.3.3 → 1.3.5

@@ -0,0 +1,95 @@
1
+ ---
2
+ name: np-task-architect
3
+ description: Per-task architecture step inside the Nubosloop. Runs in round 1 after the researcher swarm, before the test-writer and executor. Reads the task plan, CONTEXT, RULES.md (Conventions) and any M<NNN>-ARCHITECTURE.md, then emits ephemeral structural constraints (module/class layout, boundaries, paradigms, the test surfaces TDD must cover) as its final message. Read-only — writes no files.
4
+ tier: sonnet
5
+ tools: Read, Grep, Glob
6
+ color: purple
7
+ ---
8
+
9
+ <role>
10
+ You are the nubos-pilot per-task architect. You run once per task, in round 1 of the Nubosloop, after the researcher swarm and BEFORE the test-writer and executor. Your output is the structural contract the test-writer and executor build against: how the code for THIS task must be shaped.
11
+
12
+ You are not the milestone architect (`np-architect`, which decides milestone-wide boundaries and writes `M<NNN>-ARCHITECTURE.md`). You operate one level down: given the task and any milestone architecture, you decide the concrete structure of the code this single task introduces — which classes/modules carry which responsibility, where the boundaries fall, which paradigms and project conventions apply, and which surfaces the tests must exercise.
13
+
14
+ You are advisory and read-only. You emit your architecture spec as your FINAL MESSAGE (markdown) — you do not write files. The orchestrator captures it and injects it into the test-writer and executor prompts as `<architecture_constraints>`.
15
+
16
+ **CRITICAL: Mandatory Initial Read**
17
+ If the prompt contains a `<files_to_read>` block, you MUST use the `Read` tool to load every file listed there before producing your spec. This is your primary context — the task plan, CONTEXT, `RULES.md`, and any `M<NNN>-ARCHITECTURE.md`.
18
+
19
+ **Design skills.** If the spawn prompt contains a `Use the following Nubos skills` line, `Read` each named skill from `.claude/skills/<skill>/SKILL.md` BEFORE committing your spec. Each skill's "Verification bar" is the standard your structural decisions must satisfy. If the skills are absent (non-Claude runtime), proceed on your own judgment.
20
+ </role>
21
+
22
+ ## Completeness Mandate
23
+
24
+ This agent operates under [`templates/COMPLETENESS.md`](../templates/COMPLETENESS.md). The rules that bind this role:
25
+
26
+ - **Rule 1 — Do the whole thing.** A structural spec that names happy-path classes but ignores error paths, boundary surfaces, and the tests they require is not done. Name them all.
27
+ - **Rule 2 — Do it right.** Honour the project's Conventions (`RULES.md` → `## Conventions`). Do not invent a structure that contradicts the locked class/module/naming/paradigm rules.
28
+ - **Rule 8 — Never present a workaround when the real fix exists.** If the clean structure needs a new boundary, say so — do not bless a shortcut to save the executor a file.
29
+ - **Rule 9 — Search before building.** Read `.nubos-pilot/codebase/INDEX.md` and the milestone architecture before naming a new module. Extend existing structure; do not silently reinvent it.
30
+
31
+ Refusal of any rule is a hard-stop. Surface the violation to the orchestrator verbatim and abort the spawn.
32
+
33
+ ## Granularity — task structure, NOT line-level implementation
34
+
35
+ You decide **structure**: responsibilities, boundaries, the shape of the public surface, which paradigm applies, what the tests must cover. You do NOT write the implementation. Specifically you do NOT emit:
36
+
37
+ - Schema DDL / exact column types,
38
+ - exact framework-generated filenames (use glob-shaped descriptions, e.g. "a Service class under the app's service layer"),
39
+ - full code bodies (a ≤ 5-line illustrative signature is fine; a method body is not),
40
+ - code-style edicts already covered by `RULES.md`.
41
+
42
+ Your spec is ephemeral guidance, not a committed artifact — it never reaches `PLAN.md`, so it cannot trip plan-lint. Keep it concrete enough to constrain the executor, abstract enough to leave the executor room to implement.
43
+
44
+ ## Inputs
45
+
46
+ | Input | Purpose | Typical path |
47
+ |-------|---------|--------------|
48
+ | Task plan (required) | The task being executed. `<action>` + `<acceptance_criteria>` define the surface you structure. | `.nubos-pilot/milestones/M<NNN>/slices/S<NNN>/tasks/T<NNNN>/T<NNNN>-PLAN.md` |
49
+ | RULES.md (required) | Project Conventions — class/module structure, naming, code style, paradigms. Your spec MUST conform. | `.nubos-pilot/RULES.md` |
50
+ | M<NNN>-CONTEXT.md (recommended) | Locked milestone decisions. | `.nubos-pilot/milestones/M<NNN>/M<NNN>-CONTEXT.md` |
51
+ | M<NNN>-ARCHITECTURE.md (when present) | Milestone boundaries you refine for this task — never contradict. | `.nubos-pilot/milestones/M<NNN>/M<NNN>-ARCHITECTURE.md` |
52
+ | .nubos-pilot/codebase/INDEX.md (recommended) | Existing module boundaries to extend, not reinvent. | `.nubos-pilot/codebase/INDEX.md` |
53
+
54
+ ## Output Contract
55
+
56
+ Emit your architecture spec as your final message — markdown, this exact shape, no file writes:
57
+
58
+ ```markdown
59
+ # Task Architecture — <task-id>
60
+
61
+ ## Responsibilities & Boundaries
62
+ | Unit (class / module) | New / Existing | Responsibility | Public surface |
63
+ |---|---|---|---|
64
+ | ... | ... | ... | ... |
65
+
66
+ ## Paradigms & Conventions to honour
67
+ - <named convention from RULES.md the executor must follow>
68
+ - <pattern that is required / banned for this task>
69
+
70
+ ## Required Test Surfaces (hand-off to np-test-writer)
71
+ - <observable behaviour that MUST have a test> — happy path
72
+ - <boundary / empty / overflow case that MUST have a test>
73
+ - <failure path that MUST have a test>
74
+
75
+ ## Constraints for the executor
76
+ - <boundary the executor must not cross, e.g. "no DB access from the controller — go through the Service">
77
+
78
+ ## Conflicts
79
+ - <only if the task or RULES.md make a clean structure impossible — name the conflict; the orchestrator routes it to the user>
80
+ ```
81
+
82
+ If the task is purely mechanical (copy change, version bump, one-line fix) and needs no structural decision, emit a single line: `No structural decision required — <one-line reason>.` Do not manufacture structure where none is warranted.
83
+
84
+ <scope_guardrail>
85
+ **Do:**
86
+ - Read the task plan, RULES.md, CONTEXT, milestone architecture, and codebase index freely.
87
+ - Decide the task's code structure and the test surfaces it requires.
88
+ - Honour RULES.md Conventions and milestone architecture. Surface conflicts instead of silently overriding.
89
+
90
+ **Don't:**
91
+ - Write or edit ANY file — you have no Write/Edit tool. Your spec is your final message.
92
+ - Prescribe line-level implementation, schema DDL, or exact framework filenames.
93
+ - Re-open milestone decisions (`M<NNN>-CONTEXT.md` / `M<NNN>-ARCHITECTURE.md`) — refine within them.
94
+ - Spawn other agents or commit anything.
95
+ </scope_guardrail>
@@ -0,0 +1,89 @@
1
+ ---
2
+ name: np-test-writer
3
+ description: Per-task TDD step inside the Nubosloop. Runs in round 1 after the architect, before the executor. Writes real, valid test files for the task's required surfaces (from the architecture spec + acceptance criteria + Conventions) BEFORE production code exists. Tests may start red; the executor makes them green. Never skips, stubs, or writes vacuous assertions.
4
+ tier: sonnet
5
+ tools: Read, Write, Bash, Grep, Glob
6
+ color: "#06B6D4"
7
+ ---
8
+
9
+ <role>
10
+ You are the nubos-pilot test-writer. You run once per task, in round 1 of the Nubosloop, AFTER the per-task architect and BEFORE the executor. You practice test-driven development: you write the tests the executor must then make pass.
11
+
12
+ The orchestrator hands you `<architecture_constraints>` (the per-task architect's required test surfaces), the task's `<acceptance_criteria>`, and the project Conventions (`RULES.md`). You translate those into real test files placed where the project keeps tests. Because production code does not exist yet, your tests are EXPECTED to fail (red) when run — that is correct TDD. The executor turns them green; the loop's verify gate runs after the executor, never after you.
13
+
14
+ Your tests are a contract. The executor is told not to delete, skip, or weaken them. So they must be valid, runnable, and honest from the start.
15
+
16
+ **CRITICAL: Mandatory Initial Read**
17
+ If the prompt contains a `<files_to_read>` block, you MUST `Read` every file listed before writing anything — the task plan, the architecture spec path, `RULES.md`, and any existing neighbouring tests whose style you must match.
18
+
19
+ **Testing skills.** If the spawn prompt contains a `Use the following Nubos skills` line, `Read` each named skill from `.claude/skills/<skill>/SKILL.md` (e.g. `np-test-strategy`) before writing. Its "Verification bar" is the standard your test suite must satisfy. If absent (non-Claude runtime), proceed on your own judgment.
20
+ </role>
21
+
22
+ ## Completeness Mandate
23
+
24
+ This agent operates under [`templates/COMPLETENESS.md`](../templates/COMPLETENESS.md). The rules that bind this role:
25
+
26
+ - **Rule 1 — Do the whole thing.** Cover every surface the architect listed: happy path, empty/boundary/overflow input, and the failure path. A missing case is an incomplete suite.
27
+ - **Rule 3 — Do it with tests.** This is your entire job. Every public surface the task introduces gets at least one test that asserts observable behaviour.
28
+ - **Rule 10 — Test before shipping.** A test that does not actually assert the claimed behaviour is worse than no test. No `assert(true)`, no `expect(x).toBeDefined()` as the only check, no `it.skip` / `markTestSkipped` / commented-out asserts / `if (false)` guards. Such a test is a hard-stop violation, not a placeholder.
29
+
30
+ Refusal of any rule is a hard-stop. Surface the violation to the orchestrator verbatim and abort the spawn.
31
+
32
+ ## Anti-Skip Self-Check (run before you finish)
33
+
34
+ Before emitting your envelope, re-read every test file you wrote and confirm — line by line — that NONE of the following is present. If any is, fix it; do not ship it:
35
+
36
+ 1. Skipped/pending tests (`.skip`, `xit`, `xdescribe`, `markTestSkipped`, `@Disabled`, `t.skip`, `pytest.mark.skip`, `#[ignore]`).
37
+ 2. Vacuous assertions (`assert(true)`, `expect(true).toBe(true)`, an assertion that can never fail, or a test with zero assertions).
38
+ 3. Swallowed failures (`try { … } catch {}` around the assertion, empty catch, asserting inside an un-awaited promise).
39
+ 4. Tautologies (asserting a literal against itself, or re-asserting the mock you just configured).
40
+ 5. Non-determinism (wall-clock, network, unseeded randomness) without explicit injection.
41
+
42
+ A test that exists only to inflate the count is a Rule-10 violation. The downstream `np-critic-tests` axis audits for exactly these; do not hand it findings.
43
+
44
+ ## TDD Discipline (red is correct)
45
+
46
+ - Write tests against the behaviour the acceptance criteria + architecture spec require — not against whatever the executor might implement.
47
+ - Run the project's test command via `Bash` to confirm the tests **parse and execute** (no syntax/collection errors). Failing assertions are expected and fine; collection/compile errors are NOT — fix those.
48
+ - Do NOT write production code, stubs, or fixtures that pre-satisfy the tests. Minimal test scaffolding (factories, fakes the project already uses) is allowed; implementing the unit under test is the executor's job.
49
+ - Place tests where the project keeps them (match `RULES.md` → Conventions and existing neighbours). Add the files you create to the task's `files_modified` set via the checkpoint if the orchestrator asks; otherwise list them in your envelope so they are committed with the executor's diff.
50
+
51
+ ## Inputs
52
+
53
+ | Input | Purpose | Typical path |
54
+ |-------|---------|--------------|
55
+ | Task plan (required) | `<action>` + `<acceptance_criteria>` define what to test. | `.nubos-pilot/milestones/M<NNN>/slices/S<NNN>/tasks/T<NNNN>/T<NNNN>-PLAN.md` |
56
+ | Architecture constraints (required) | The per-task architect's required test surfaces. | inline `<architecture_constraints>` / `$TMPDIR` path |
57
+ | RULES.md (required) | Conventions — test location, naming, style. | `.nubos-pilot/RULES.md` |
58
+ | Neighbouring tests (recommended) | The project's test idiom you must match. | repo paths |
59
+
60
+ ## Output Contract
61
+
62
+ Write the test files, then emit a single JSON object as your final message (no prose around it):
63
+
64
+ ```json
65
+ {
66
+ "agent": "test-writer",
67
+ "task_id": "M001-S001-T0001",
68
+ "round": 1,
69
+ "tests_written": ["tests/Feature/OutcomeRecorderTest.php"],
70
+ "surfaces_covered": ["records a verdict and persists it", "rejects a malformed verdict with 422", "returns empty history for an unknown task"],
71
+ "collection_ok": true,
72
+ "expected_red": true,
73
+ "notes": "Tests fail as expected — no production code yet. Executor must make them green without weakening assertions."
74
+ }
75
+ ```
76
+
77
+ `collection_ok` MUST be `true` before you finish: if the suite cannot even collect/compile your tests, fix them first. `expected_red: true` is the normal TDD state. The one exception: if you genuinely cannot write valid tests (e.g. acceptance criteria are ambiguous), emit `"collection_ok": false` with a `notes` field explaining the blocker; the orchestrator routes it to the user rather than letting the executor proceed blind.
78
+
79
+ <scope_guardrail>
80
+ **Do:**
81
+ - Read the task, architecture spec, RULES.md, and neighbouring tests.
82
+ - Write real, valid, honest test files for every required surface.
83
+ - Run the test command to confirm collection succeeds (red assertions are fine).
84
+
85
+ **Don't:**
86
+ - Write production code or pre-satisfy your own tests.
87
+ - Skip, stub, weaken, or comment out any assertion to make the suite "pass".
88
+ - Edit unrelated files, spawn other agents, or commit anything.
89
+ </scope_guardrail>
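The Anti-Skip Self-Check in the np-test-writer prompt above lists concrete markers to look for. As a rough illustration only, the first two items (skipped tests and vacuous assertions) could be approximated with a line scanner like the one below. `findViolations` and both pattern lists are hypothetical names invented for this sketch; nothing in the diff suggests that nubos-pilot or `np-critic-tests` works this way.

```javascript
// Hypothetical helper (not part of nubos-pilot): flags skip markers and
// vacuous assertions named in the Anti-Skip Self-Check, one line at a time.
const SKIP_MARKERS = [
  /\b(?:it|test|describe)\.skip\b/,
  /\bx(?:it|describe)\s*\(/,
  /\bmarkTestSkipped\b/,
  /@Disabled\b/,
  /\bt\.skip\b/,
  /pytest\.mark\.skip/,
  /#\[ignore\]/,
];
const VACUOUS_ASSERTIONS = [
  /\bassert\(\s*true\s*\)/,
  /\bexpect\(\s*true\s*\)\.toBe\(\s*true\s*\)/,
];

function findViolations(source) {
  const findings = [];
  source.split('\n').forEach((line, index) => {
    for (const re of SKIP_MARKERS) {
      if (re.test(line)) findings.push({ line: index + 1, kind: 'skip', text: line.trim() });
    }
    for (const re of VACUOUS_ASSERTIONS) {
      if (re.test(line)) findings.push({ line: index + 1, kind: 'vacuous', text: line.trim() });
    }
  });
  return findings;
}

// Usage: the first two tests are flagged, the third is an honest assertion.
const sampleSuite = [
  "it.skip('persists a verdict', () => {});",
  "it('rejects malformed input', () => { expect(true).toBe(true); });",
  "it('returns 201', () => { expect(res.status).toBe(201); });",
].join('\n');
console.log(findViolations(sampleSuite));
```

A scanner like this only catches literal markers; swallowed failures, tautologies, and non-determinism (items 3 to 5) still need a careful read of each test.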
package/bin/install.js CHANGED
@@ -228,27 +228,79 @@ function _readInstallConfig(projectRoot) {
228
228
  }
229
229
  }
230
230
 
231
- // On re-install/update the installer leaves an existing config.json untouched.
232
- // To make `ultra` the standard for updated projects too, backfill `agents.economy`
233
- // into configs that don't set it yet loud (the key is written, visible in the
234
- // file) and conservative (an explicit economy OR legacy economy_critic is treated
235
- // as a deliberate choice and never overwritten). Returns the action taken for logging.
236
- function _backfillEconomyDefault(stateDir, { dryRun = false } = {}) {
231
+ // Shared read for the re-install/update backfills: parse the existing config.json
232
+ // once. Returns `{ cfgPath, cfg, status }` where status is 'ok' | 'absent' |
233
+ // 'unparseable' and cfg is null unless status is 'ok'.
234
+ function _loadConfigJson(stateDir) {
237
235
  const cfgPath = path.join(stateDir, 'config.json');
238
236
  let raw;
239
- try { raw = fs.readFileSync(cfgPath, 'utf-8'); } catch { return 'absent'; }
237
+ try { raw = fs.readFileSync(cfgPath, 'utf-8'); } catch { return { cfgPath, cfg: null, status: 'absent' }; }
240
238
  let cfg;
241
- try { cfg = JSON.parse(raw); } catch { return 'unparseable'; }
242
- if (!cfg || typeof cfg !== 'object') return 'unparseable';
239
+ try { cfg = JSON.parse(raw); } catch { return { cfgPath, cfg: null, status: 'unparseable' }; }
240
+ if (!cfg || typeof cfg !== 'object') return { cfgPath, cfg: null, status: 'unparseable' };
241
+ return { cfgPath, cfg, status: 'ok' };
242
+ }
243
+
244
+ // Apply the ultra economy default in-memory (never overwriting an explicit
245
+ // economy OR legacy economy_critic choice). Returns 'backfilled' | 'preserved'.
246
+ function _applyEconomyDefault(cfg) {
243
247
  const agents = cfg.agents && typeof cfg.agents === 'object' ? cfg.agents : null;
244
248
  if (agents && (agents.economy !== undefined || agents.economy_critic !== undefined)) {
245
249
  return 'preserved';
246
250
  }
247
251
  cfg.agents = { ...(agents || {}), economy: configDefaults.INSTALL_ECONOMY_MODE };
248
- if (!dryRun) atomicWriteFileSync(cfgPath, JSON.stringify(cfg, null, 2));
249
252
  return 'backfilled';
250
253
  }
251
254
 
255
+ // Apply the default-on loop-agent toggles (architect, test-writer) in-memory,
256
+ // never overwriting an explicit true/false. Returns the keys added.
257
+ const _BACKFILL_AGENT_TOGGLES = Object.freeze(['architect', 'test_writer']);
258
+ function _applyAgentToggles(cfg) {
259
+ const agents = cfg.agents && typeof cfg.agents === 'object' ? cfg.agents : {};
260
+ const added = [];
261
+ for (const key of _BACKFILL_AGENT_TOGGLES) {
262
+ if (agents[key] === undefined) {
263
+ agents[key] = configDefaults.DEFAULT_AGENTS[key];
264
+ added.push(key);
265
+ }
266
+ }
267
+ if (added.length > 0) cfg.agents = agents;
268
+ return added;
269
+ }
270
+
271
+ // Single-pass backfill used by the installer: one read, the economy default and
272
+ // the agent-toggle defaults applied together, one atomic write. Both backfills are
273
+ // loud (written to the file) and conservative (an explicit choice is never
274
+ // overwritten). Returns `{ economy, toggles }` for logging.
275
+ function _backfillConfigDefaults(stateDir, { dryRun = false } = {}) {
276
+ const { cfgPath, cfg, status } = _loadConfigJson(stateDir);
277
+ if (!cfg) return { economy: status, toggles: [] };
278
+ const economy = _applyEconomyDefault(cfg);
279
+ const toggles = _applyAgentToggles(cfg);
280
+ if ((economy === 'backfilled' || toggles.length > 0) && !dryRun) {
281
+ atomicWriteFileSync(cfgPath, JSON.stringify(cfg, null, 2));
282
+ }
283
+ return { economy, toggles };
284
+ }
285
+
286
+ // Standalone wrappers retained for unit tests. Each loads + writes on its own;
287
+ // the installer uses _backfillConfigDefaults to avoid a second read/write pass.
288
+ function _backfillEconomyDefault(stateDir, { dryRun = false } = {}) {
289
+ const { cfgPath, cfg, status } = _loadConfigJson(stateDir);
290
+ if (!cfg) return status;
291
+ const action = _applyEconomyDefault(cfg);
292
+ if (action === 'backfilled' && !dryRun) atomicWriteFileSync(cfgPath, JSON.stringify(cfg, null, 2));
293
+ return action;
294
+ }
295
+
296
+ function _backfillAgentToggles(stateDir, { dryRun = false } = {}) {
297
+ const { cfgPath, cfg } = _loadConfigJson(stateDir);
298
+ if (!cfg) return [];
299
+ const added = _applyAgentToggles(cfg);
300
+ if (added.length > 0 && !dryRun) atomicWriteFileSync(cfgPath, JSON.stringify(cfg, null, 2));
301
+ return added;
302
+ }
303
+
252
304
  function _readExistingScope(projectRoot) {
253
305
  const cfg = _readInstallConfig(projectRoot);
254
306
  return cfg && cfg.scope ? cfg.scope : null;
@@ -441,11 +493,15 @@ async function _runInstallLocked(ctx) {
441
493
  else console.error(dim + 'DRY-RUN: würde schreiben ' + configPath + reset);
442
494
  initConfig = config;
443
495
  } else {
444
- // Re-install / update: backfill the ultra economy default into a config that
445
- // doesn't set it yet (never overwriting an explicit choice).
446
- const action = _backfillEconomyDefault(stateDir, { dryRun });
447
- if (action === 'backfilled') {
448
- console.error(green + ' [config] agents.economy → ultra (backfilled default)'
496
+ // Re-install / update: backfill the default agent config into a config that
497
+ // doesn't set it yet (one read/write, never overwriting an explicit choice).
498
+ const { economy, toggles } = _backfillConfigDefaults(stateDir, { dryRun });
499
+ if (economy === 'backfilled') {
500
+ console.error(green + ' [config] agents.economy → ' + configDefaults.INSTALL_ECONOMY_MODE + ' (backfilled default)'
501
+ + (dryRun ? ' [DRY-RUN]' : '') + reset);
502
+ }
503
+ for (const key of toggles) {
504
+ console.error(green + ' [config] agents.' + key + ' → ' + configDefaults.DEFAULT_AGENTS[key] + ' (backfilled default)'
449
505
  + (dryRun ? ' [DRY-RUN]' : '') + reset);
450
506
  }
451
507
  }
@@ -915,5 +971,6 @@ module.exports = {
915
971
  parseInstallFlags,
916
972
  VALID_AGENTS, VALID_SCOPES,
917
973
  SOURCE_PAYLOAD_DIR, PAYLOAD_SUBPATH, STATE_SUBPATH,
918
- _payloadDirFor, _stateDirFor, _backfillEconomyDefault,
974
+ _payloadDirFor, _stateDirFor,
975
+ _backfillEconomyDefault, _backfillAgentToggles, _backfillConfigDefaults,
919
976
  };
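The install.js hunks above describe a conservative backfill: a default is written only where the key is still undefined, and an explicit choice (including the legacy `economy_critic`) is never overwritten. The sketch below reproduces that in-memory logic in one function so the outcomes are easy to see. The default values are assumptions for illustration: `'ultra'` is taken from the old log line, and `true` is inferred from the "default-on" comment. The real values live in `configDefaults`, which this diff does not show. `applyDefaults` is a hypothetical name, and file I/O is left out.

```javascript
// Assumed values for illustration; the real ones come from configDefaults.
const INSTALL_ECONOMY_MODE = 'ultra';
const DEFAULT_AGENTS = { architect: true, test_writer: true };
const TOGGLE_KEYS = ['architect', 'test_writer'];

// Hypothetical single-function version of the two in-memory backfills.
function applyDefaults(cfg) {
  const hasAgents = cfg.agents && typeof cfg.agents === 'object';
  const agents = hasAgents ? { ...cfg.agents } : {};
  const result = { economy: 'preserved', toggles: [] };

  // Economy: an explicit economy OR legacy economy_critic counts as a choice.
  if (agents.economy === undefined && agents.economy_critic === undefined) {
    agents.economy = INSTALL_ECONOMY_MODE;
    result.economy = 'backfilled';
  }

  // Toggles: only keys that are undefined are filled; explicit false survives.
  for (const key of TOGGLE_KEYS) {
    if (agents[key] === undefined) {
      agents[key] = DEFAULT_AGENTS[key];
      result.toggles.push(key);
    }
  }

  cfg.agents = agents;
  return result;
}

// Usage: an explicit `architect: false` is kept, the rest is backfilled.
const partial = { agents: { architect: false } };
console.log(applyDefaults(partial), partial.agents);

// Usage: a fully explicit config is left untouched.
const explicit = { agents: { economy_critic: true, architect: true, test_writer: false } };
console.log(applyDefaults(explicit), explicit.agents);
```

Returning which keys were added, as the diff's `_backfillConfigDefaults` does, lets the caller skip the write entirely when nothing changed and log exactly one line per backfilled key.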
@@ -91,19 +91,90 @@ function _resolveSafe(root, p) {
91
91
  }
92
92
 
93
93
  const _COMMIT_NAME_MAX = 200;
94
+ const _COMMIT_BODY_MAX = 2000;
95
+ const _TASK_ID_PREFIX_RE = /^\s*M\d{3,}-S\d{3,}-T\d{4,}\s*[—:-]\s*/;
96
+ const _PLACEHOLDER_RE = /^\s*(?:\{\{.*\}\}|_?TBD\b.*|_none\b.*)\s*$/i;
97
+
94
98
  function _sanitizeCommitName(s) {
95
99
  return String(s == null ? '' : s).replace(/[\r\n\t]+/g, ' ').replace(/\s+/g, ' ').trim().slice(0, _COMMIT_NAME_MAX);
96
100
  }
97
101
 
102
+ // The scaffolded H1 is "# <task-id> — <name>"; without a `name:` frontmatter
103
+ // field the body regex captures the whole heading, producing the duplicated
104
+ // "task(ID): ID — desc" subject. Strip a leading task-id prefix so the subject
105
+ // reads "task(ID): desc".
98
106
  function _extractName(frontmatter, body) {
107
+ // Strip a leading task-id prefix, but fall through to the next source when the
108
+ // strip empties the candidate — a `name:` of just "M001-S001-T0001 —" or an H1
109
+ // with no description must not produce a bare "task(ID): " subject.
99
110
  if (typeof frontmatter.name === 'string' && frontmatter.name.length > 0) {
100
- return _sanitizeCommitName(frontmatter.name);
111
+ const stripped = _sanitizeCommitName(String(frontmatter.name).replace(_TASK_ID_PREFIX_RE, ''));
112
+ if (stripped) return stripped;
101
113
  }
102
114
  const m = String(body || '').match(/^#\s+(?:Task:\s*)?(.+?)\s*$/m);
103
- if (m) return _sanitizeCommitName(m[1]);
115
+ if (m) {
116
+ const stripped = _sanitizeCommitName(m[1].replace(_TASK_ID_PREFIX_RE, ''));
117
+ if (stripped) return stripped;
118
+ }
104
119
  return _sanitizeCommitName(frontmatter.id || 'task');
105
120
  }
106
121
 
122
+ function _innerTag(body, tag) {
123
+ const m = String(body || '').match(new RegExp('<' + tag + '>([\\s\\S]*?)</' + tag + '>'));
124
+ if (!m) return '';
125
+ const inner = m[1].trim();
126
+ // A still-unfilled placeholder may carry a Markdown bullet prefix
127
+ // (`- _TBD — …`); test the de-bulleted form so it is recognised and omitted.
128
+ const candidate = inner.replace(/^[-*+]\s+/, '');
129
+ return _PLACEHOLDER_RE.test(candidate) ? '' : inner;
130
+ }
131
+
132
+ function _sanitizeCommitBody(s) {
133
+ return String(s == null ? '' : s)
134
+ .replace(/\r\n?/g, '\n')
135
+ .replace(/[ \t]+\n/g, '\n')
136
+ .replace(/\n{3,}/g, '\n\n')
137
+ .trim()
138
+ .slice(0, _COMMIT_BODY_MAX);
139
+ }
140
+
141
+ // Compose a descriptive commit body from the task's intent so the history is
142
+ // self-explanatory months later: what the task does (<action>), what it must
143
+ // satisfy (<acceptance_criteria>), the task id, and the files it touched.
144
+ function _dedupe(arr) {
145
+ const seen = new Set();
146
+ const out = [];
147
+ for (const x of arr) {
148
+ if (!seen.has(x)) { seen.add(x); out.push(x); }
149
+ }
150
+ return out;
151
+ }
152
+
153
+ // Test files np-test-writer wrote this task (ADR-0023). The test-writer chooses
154
+ // their paths at runtime — the planner-authored files_modified does not list
155
+ // them — so the post-test-writer phase records them in the checkpoint. Fold them
156
+ // into the commit set so the executor-greened tests land with their production
157
+ // code instead of being silently dropped.
158
+ function _tddTestFiles(taskId, cwd, root) {
159
+ const cp = readCheckpoint(taskId, cwd);
160
+ const np = cp && cp.nubosloop;
161
+ const tests = np && Array.isArray(np.tdd_tests) ? np.tdd_tests : [];
162
+ return tests.map((p) => _resolveSafe(root, p));
163
+ }
164
+
165
+ function _composeCommitBody(body, taskId, files) {
166
+ const action = _innerTag(body, 'action');
167
+ const accept = _innerTag(body, 'acceptance_criteria');
168
+ const parts = [];
169
+ if (action) parts.push(action);
170
+ if (accept) parts.push('Acceptance:\n' + accept);
171
+ parts.push('Task: ' + taskId);
172
+ if (Array.isArray(files) && files.length > 0) {
173
+ parts.push('Files: ' + files.join(', '));
174
+ }
175
+ return _sanitizeCommitBody(parts.join('\n\n'));
176
+ }
177
+
107
178
  function run(args, ctx) {
108
179
  const context = ctx || {};
109
180
  const cwd = context.cwd || process.cwd();
@@ -153,13 +224,16 @@ function run(args, ctx) {
153
224
  );
154
225
  }
155
226
  const root = findProjectRoot(cwd);
156
- const safeFiles = files.map((p) => _resolveSafe(root, p));
227
+ const declaredSafe = files.map((p) => _resolveSafe(root, p));
228
+ const safeFiles = _dedupe([...declaredSafe, ..._tddTestFiles(taskId, cwd, root)]);
157
229
  const name = _extractName(frontmatter, body);
158
230
  const message = 'task(' + taskId + '): ' + name;
231
+ // Compose the body from the paths that will actually be committed (commitTask
232
+ // drops gitignored entries), so `git log` never advertises a file the diff omits.
233
+ const { committable } = git.classifyCommittablePaths(safeFiles);
234
+ const commitBody = _composeCommitBody(body, taskId, committable);
159
235
 
160
-
161
-
162
- const result = commitTask(taskId, safeFiles, message);
236
+ const result = commitTask(taskId, safeFiles, message, commitBody);
163
237
 
164
238
  if (result.committed === false && result.reason === 'artifacts-gitignored') {
165
239
  try {
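The commit-task hunks above change how the commit subject and body are derived from the task plan. The following standalone sketch mirrors that logic using the two regular expressions from the diff (with the dash written as the `\u2014` escape). `subjectFor` and `bodyFor` are hypothetical names for this illustration, and the length caps and whitespace normalisation of the real helpers are omitted.

```javascript
// Regexes copied from the diff; \u2014 is the dash used in scaffolded headings.
const TASK_ID_PREFIX_RE = /^\s*M\d{3,}-S\d{3,}-T\d{4,}\s*[\u2014:-]\s*/;
const PLACEHOLDER_RE = /^\s*(?:\{\{.*\}\}|_?TBD\b.*|_none\b.*)\s*$/i;

// Returns the trimmed content of <tag>...</tag>, or '' when the tag is
// missing or still holds an unfilled placeholder (optionally bulleted).
function innerTag(body, tag) {
  const m = String(body || '').match(new RegExp('<' + tag + '>([\\s\\S]*?)</' + tag + '>'));
  if (!m) return '';
  const inner = m[1].trim();
  return PLACEHOLDER_RE.test(inner.replace(/^[-*+]\s+/, '')) ? '' : inner;
}

// Hypothetical: strip the duplicated task-id prefix, fall back to the id.
function subjectFor(taskId, heading) {
  const name = String(heading).replace(TASK_ID_PREFIX_RE, '').trim() || taskId;
  return 'task(' + taskId + '): ' + name;
}

// Hypothetical: join the filled-in sections with blank lines between them.
function bodyFor(plan, taskId, files) {
  const parts = [];
  const action = innerTag(plan, 'action');
  const accept = innerTag(plan, 'acceptance_criteria');
  if (action) parts.push(action);
  if (accept) parts.push('Acceptance:\n' + accept);
  parts.push('Task: ' + taskId);
  if (files.length > 0) parts.push('Files: ' + files.join(', '));
  return parts.join('\n\n');
}

// Usage: the heading's id prefix is dropped; a placeholder section is omitted.
const plan = [
  '<action>',
  'Add the OutcomeRecorder service and persist verdicts.',
  '</action>',
  '<acceptance_criteria>',
  '- _TBD',
  '</acceptance_criteria>',
].join('\n');
console.log(subjectFor('M006-S001-T0050', 'M006-S001-T0050 \u2014 wire the outcome feedback loop'));
console.log(bodyFor(plan, 'M006-S001-T0050', ['src/a.ts']));
```

In the diff itself the `Files:` list is built from the paths that survive the gitignore filter, so the same body function would be fed the committable subset rather than the declared file list.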
@@ -164,6 +164,139 @@ test('CT-3: commit-task emits JSON with sha + files on success', () => {
164
164
  assert.ok(subject.startsWith('task(M006-S001-T0001):'), 'subject: ' + subject);
165
165
  });
166
166
 
167
+ test('CT-3b: subject strips the duplicated task-id prefix from the H1 heading', () => {
168
+ const root = makeRepo();
169
+ const taskId = 'M006-S001-T0050';
170
+ const m = taskId.match(/^(M\d{3,})-(S\d{3,})-(T\d{4,})$/);
171
+ const [, mId, sId, tId] = m;
172
+ const taskDir = path.join(root, '.nubos-pilot', 'milestones', mId, 'slices', sId, 'tasks', tId);
173
+ fs.mkdirSync(taskDir, { recursive: true });
174
+ fs.writeFileSync(path.join(taskDir, tId + '-PLAN.md'), [
175
+ '---',
176
+ `id: ${taskId}`,
177
+ `milestone: ${mId}`,
178
+ `slice: ${mId}-${sId}`,
179
+ 'type: execute',
180
+ 'status: in-progress',
181
+ 'tier: sonnet',
182
+ 'owner: np-executor',
183
+ 'wave: 1',
184
+ 'depends_on: []',
185
+ 'files_modified:',
186
+ ' - src/a.ts',
187
+ 'autonomous: true',
188
+ 'must_haves: {}',
189
+ '---',
190
+ '',
191
+ `# ${taskId} — wire the outcome feedback loop`,
192
+ '',
193
+ '<action>',
194
+ 'Add the OutcomeRecorder service and persist verdicts.',
195
+ '</action>',
196
+ '',
197
+ '<acceptance_criteria>',
198
+ '- Verdicts persist across restarts',
199
+ '- API returns 201 on record',
200
+ '</acceptance_criteria>',
201
+ ].join('\n'), 'utf-8');
202
+ seedLoopReadyCheckpoint(root, taskId);
203
+ fs.mkdirSync(path.join(root, 'src'), { recursive: true });
204
+ fs.writeFileSync(path.join(root, 'src', 'a.ts'), 'export const a = 1;\n', 'utf-8');
205
+ const prev = process.cwd();
206
+ process.chdir(root);
207
+ const cap = _capture();
208
+ try {
209
+ subcmd.run([taskId], { cwd: root, stdout: cap.stub });
210
+ } finally {
211
+ process.chdir(prev);
212
+ }
213
+ const subject = execFileSync('git', ['-C', root, 'log', '-n', '1', '--format=%s'], { encoding: 'utf-8' }).trim();
214
+ assert.equal(subject, `task(${taskId}): wire the outcome feedback loop`,
215
+ 'subject must not repeat the task id after the colon');
216
+ const fullBody = execFileSync('git', ['-C', root, 'log', '-n', '1', '--format=%b'], { encoding: 'utf-8' });
217
+ assert.match(fullBody, /Add the OutcomeRecorder service/, 'body should carry the <action> intent');
218
+ assert.match(fullBody, /Acceptance:/);
219
+ assert.match(fullBody, /Verdicts persist across restarts/);
220
+ assert.match(fullBody, new RegExp('Task: ' + taskId));
221
+ assert.match(fullBody, /Files: src\/a\.ts/);
222
+ });
223
+
224
+ test('CT-3c: test-writer files recorded in nubosloop.tdd_tests are folded into the commit', () => {
225
+ const root = makeRepo();
226
+ const taskId = 'M006-S001-T0060';
227
+ seedPlanAndTask(root, '06-01', taskId, ['src/a.ts']);
228
+ seedLoopReadyCheckpoint(root, taskId, { nubosloop: { tdd_tests: ['tests/a.test.ts'] } });
229
+ fs.mkdirSync(path.join(root, 'src'), { recursive: true });
230
+ fs.mkdirSync(path.join(root, 'tests'), { recursive: true });
231
+ fs.writeFileSync(path.join(root, 'src', 'a.ts'), 'export const a = 1;\n', 'utf-8');
232
+ fs.writeFileSync(path.join(root, 'tests', 'a.test.ts'), 'test("a", () => {});\n', 'utf-8');
233
+ const prev = process.cwd();
234
+ process.chdir(root);
235
+ try {
236
+ subcmd.run([taskId], { cwd: root, stdout: _capture().stub });
237
+ } finally {
238
+ process.chdir(prev);
239
+ }
240
+ const committed = execFileSync('git', ['-C', root, 'show', '--name-only', '--format=', 'HEAD'], { encoding: 'utf-8' });
241
+ assert.match(committed, /src\/a\.ts/, 'production file must be committed');
242
+ assert.match(committed, /tests\/a\.test\.ts/, 'tdd test file must be committed even though it is not in files_modified');
243
+ });
244
+
245
+ test('CT-3d: degenerate task name falls back to the id instead of an empty subject', () => {
246
+ const root = makeRepo();
247
+ const taskId = 'M006-S001-T0061';
248
+ const m = taskId.match(/^(M\d{3,})-(S\d{3,})-(T\d{4,})$/);
249
+ const [, mId, sId, tId] = m;
250
+ const taskDir = path.join(root, '.nubos-pilot', 'milestones', mId, 'slices', sId, 'tasks', tId);
251
+ fs.mkdirSync(taskDir, { recursive: true });
252
+ fs.writeFileSync(path.join(taskDir, tId + '-PLAN.md'), [
253
+ '---',
254
+ `id: ${taskId}`,
255
+ `milestone: ${mId}`,
256
+ `slice: ${mId}-${sId}`,
257
+ 'type: execute', 'status: in-progress', 'tier: sonnet', 'owner: np-executor',
258
+ 'wave: 1', 'depends_on: []',
259
+ 'files_modified:', ' - src/a.ts',
260
+ 'autonomous: true', 'must_haves: {}',
261
+ '---', '',
262
+ `# ${taskId} — `,
263
+ ].join('\n'), 'utf-8');
264
+ seedLoopReadyCheckpoint(root, taskId);
265
+ fs.mkdirSync(path.join(root, 'src'), { recursive: true });
266
+ fs.writeFileSync(path.join(root, 'src', 'a.ts'), 'export const a = 1;\n', 'utf-8');
267
+ const prev = process.cwd();
268
+ process.chdir(root);
269
+ try {
270
+ subcmd.run([taskId], { cwd: root, stdout: _capture().stub });
271
+ } finally {
272
+ process.chdir(prev);
273
+ }
274
+ const subject = execFileSync('git', ['-C', root, 'log', '-n', '1', '--format=%s'], { encoding: 'utf-8' }).trim();
275
+ assert.equal(subject, `task(${taskId}): ${taskId}`, 'empty name must fall back to the id, never a bare colon');
276
+ });
277
+
278
+ test('CT-3e: commit body Files: lists only committed paths, not gitignored ones', () => {
279
+ const root = makeRepo();
280
+ const taskId = 'M006-S001-T0062';
281
+ seedPlanAndTask(root, '06-01', taskId, ['src/a.ts', 'build/out.js']);
282
+ seedLoopReadyCheckpoint(root, taskId);
283
+ fs.writeFileSync(path.join(root, '.gitignore'), 'build/\n', 'utf-8');
284
+ fs.mkdirSync(path.join(root, 'src'), { recursive: true });
285
+ fs.mkdirSync(path.join(root, 'build'), { recursive: true });
286
+ fs.writeFileSync(path.join(root, 'src', 'a.ts'), 'export const a = 1;\n', 'utf-8');
287
+ fs.writeFileSync(path.join(root, 'build', 'out.js'), 'noise', 'utf-8');
288
+ const prev = process.cwd();
289
+ process.chdir(root);
290
+ try {
291
+ subcmd.run([taskId], { cwd: root, stdout: _capture().stub });
292
+ } finally {
293
+ process.chdir(prev);
294
+ }
295
+ const fullBody = execFileSync('git', ['-C', root, 'log', '-n', '1', '--format=%b'], { encoding: 'utf-8' });
296
+ assert.match(fullBody, /Files: src\/a\.ts/);
297
+ assert.doesNotMatch(fullBody, /build\/out\.js/, 'gitignored path must not be advertised in the body');
298
+ });
299
+
167
300
  test('CT-4: commit-task SOFT-SKIPS when every files_modified entry is gitignored (artifacts-gitignored terminator)', () => {
168
301
  const root = makeRepo();
169
302
  seedPlanAndTask(root, '06-01', 'M006-S001-T0002', ['build/out.js']);