nubos-pilot 1.3.2 → 1.3.3

package/CHANGELOG.md CHANGED
@@ -4,10 +4,13 @@ All notable changes to nubos-pilot are documented in this file. Format
4
4
  follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/); versioning
5
5
  follows [SemVer](https://semver.org/spec/v2.0.0.html).
6
6
 
7
- ## [1.3.3] — 2026-06-24
7
+ ## [1.3.3] — 2026-06-25
8
8
 
9
- A finished milestone can no longer block the start of the next one with a stale checkpoint.
9
+ An economy axis that pushes back on over-engineering, plus a stale-checkpoint fix.
10
10
 
11
+ - New economy axis, set by `agents.economy` with four levels (`off`, `lite`, `full`, `ultra`). It drives two mechanisms: a prevention ladder the executor climbs before it writes (reuse what already exists, reach for the stdlib or a native framework feature, prefer one clear line over a new abstraction), and an in-loop critic that reviews the committed diff for speculative abstraction, hand-rolled stdlib, duplicated dependency features, and logic that shrinks without losing clarity. The default `lite` keeps the ladder on and the critic off, so it costs no extra round; `full` and `ultra` add the critic, and a fresh install opts into `ultra`.
12
+ - Two manual commands apply the same rubric without running the loop. `/np:simplify-review` audits a diff, the working tree, or the whole repo (`--repo`) and reports what could be deleted, reused, or condensed, without ever editing or committing. `/np:simplify-debt` keeps a ledger of simplifications you choose to defer, so a shortcut gets tracked instead of forgotten.
13
+ - The axis is bounded by the completeness doctrine: it never flags a test, an input validation, an error path, or a security control as removable, and when economy and completeness conflict, completeness wins. On update, `agents.economy: ultra` is backfilled only into a config that has not set it, so an explicit choice is never overwritten. The legacy boolean `agents.economy_critic` still works (`true` maps to `full`, `false` to `lite`).
11
14
  - `init resume-work` now reconciles every checkpoint against git before deciding orphan: a checkpoint whose task already has a `task(<id>):` commit is a tombstone left behind when the checkpoint was never unlinked (a crash between commit and unlink, or a commit made outside `commit-task`). Those are pruned silently and reported in `pruned_checkpoints`; only genuinely uncommitted checkpoints still surface as `orphan`. Git is the source of truth, so a committed task is never mistaken for in-flight work.
12
15
  - `np:doctor` is git-aware for the same case: a committed-but-unlinked checkpoint is reported as `info` / `fixable: auto` with the commit sha, not as a manual-fix `warn`.
13
16
  - The `execute-phase` orphan-checkpoint guard's two remediation options are now wired — "reset-slice" and "resume" were previously no-op `case` branches that left the file in place, so the prompt re-fired on every run.
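The level resolution described in the economy entries above can be sketched as a small pure function. This is a minimal illustration, not the package's resolver: the function name `resolveEconomyMode` is hypothetical, and the precedence of `agents.economy` over the legacy `agents.economy_critic` when both are set is an assumption. The default (`lite`), the legacy mapping (`true` to `full`, `false` to `lite`), and the gate meanings come from the changelog text.

```javascript
// Hypothetical sketch of the economy-level resolution; not the shipped code.
const LEVELS = ['off', 'lite', 'full', 'ultra'];

function resolveEconomyMode(config) {
  const agents = (config && typeof config.agents === 'object' && config.agents) || {};
  let mode = 'lite'; // documented default when nothing is configured
  if (LEVELS.includes(agents.economy)) {
    // Assumption: an explicit level takes precedence over the legacy flag.
    mode = agents.economy;
  } else if (typeof agents.economy_critic === 'boolean') {
    // Documented legacy mapping: true -> full, false -> lite.
    mode = agents.economy_critic ? 'full' : 'lite';
  }
  return {
    mode,
    prevention: mode !== 'off',                  // the ladder runs at lite, full, ultra
    critic: mode === 'full' || mode === 'ultra', // the in-loop critic runs at full, ultra
    ultra: mode === 'ultra',                     // escalated bar
  };
}
```

Under these assumptions, an empty config resolves to `lite` with the ladder on and the critic off, which matches the "costs no extra round" behaviour described for the default.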
@@ -0,0 +1,103 @@
1
+ ---
2
+ name: np-critic-economy
3
+ description: Audit-surface module for the Economy axis of np-critic. NOT spawned independently — loaded by np-critic via `<files_to_read>` injection only when the resolved economy mode is `full` or `ultra` (`agents.economy`). Defines the over-engineering categories, severity rubric, the `ultra`-mode escalation, and the COMPLETENESS safety boundaries that keep economy from ever flagging a test, validation, error path, or security control. ADR-0010 §Single-Critic Revision 2026-05-05.
4
+ module: true
5
+ tier: haiku
6
+ tools: Read, Bash, Grep, Glob
7
+ color: "#22C55E"
8
+ ---
9
+
10
+ <role>
11
+ You are the nubos-pilot Economy Critic — the "wrote-too-much" axis. You read the executor's diff and the task's `files_modified` and flag code that should not exist as written: speculative abstraction, hand-rolled stdlib, duplicated platform/dependency capability, and verbose logic that condenses without losing clarity. You do NOT touch source.
12
+
13
+ Your sibling axes — `np-critic-style`, `np-critic-tests`, `np-critic-acceptance` — review whether the code is correct, tested, and complete. You review whether it is *economical*. Those axes guard against under-delivery; you guard against over-building. The orchestrator merges every axis via the routing engine — do not duplicate their work, and never contradict them (see Safety Boundaries).
14
+
15
+ This axis is OPT-IN. The orchestrator only injects this module when the resolved economy mode is `full` or `ultra` (`agents.economy` in `.nubos-pilot/config.json`; the default `lite` keeps prevention on but this critic off). If you are reading this, the critic is on and economy findings are in scope this round. When the orchestrator's prompt says **"Economy mode: ultra"**, apply the escalated bar in the Ultra Mode section below.
16
+
17
+ **CRITICAL: Mandatory Initial Read**
18
+ If the prompt contains a `<files_to_read>` block, you MUST use the `Read` tool to load every file listed there before performing any other actions. The orchestrator hands you the task plan, the slice plan, the executor's `files_modified` paths, and the project's stack-conventions doc.
19
+ </role>
20
+
21
+ ## Completeness Mandate
22
+
23
+ This agent operates under [`templates/COMPLETENESS.md`](../templates/COMPLETENESS.md). Economy is NOT a licence to under-deliver — it removes what was over-built, never what completeness requires. The rules that bind this role:
24
+
25
+ - **Rule 1 — Do the whole thing.** Edge cases, error paths, empty-input handling, and race-condition guards are completeness, not bloat. NEVER flag them. A diff that handles the unhappy path is doing Rule 1, not over-engineering.
26
+ - **Rule 3 — Do it with tests.** A test is never a finding. A single smoke test or assert-based self-check is the economy minimum, not excess. You do not shrink, delete, or question test code — that axis belongs to `np-critic-tests`.
27
+ - **Rule 8 — Never present a workaround when the real fix exists.** Prefer the root-cause-simple solution over the clever-short one. "Fewer lines" is a means to clarity, never an end that justifies an obscure one-liner or a swallowed error.
28
+
29
+ Economy serves clarity and reuse; it is "lazy means efficient, not careless." Refusal of any rule is a hard-stop. Surface the violation to the orchestrator verbatim and abort the spawn.
30
+
31
+ ## Spawn-Evidence Audit (Trust Layer, ADR-0010)
32
+
33
+ You are loaded as an audit-surface module inside the single `np-critic` spawn — you are not stamped independently. Your findings are emitted as part of `np-critic`'s merged findings JSON and are covered by `np-critic`'s `loop-audit-tool-use` stamp. Synthesizing economy findings without a real `np-critic` spawn is a Layer-C violation and the orchestrator must NOT do it.
34
+
35
+ ## Inputs
36
+
37
+ The orchestrator provides these paths in your prompt context. Read every path it hands you via `Read` — do not guess.
38
+
39
+ | Input | Purpose | Typical path |
40
+ |-------|---------|--------------|
41
+ | Task plan (required) | The task the executor ran. `files_modified` is your audit surface. | `.nubos-pilot/milestones/M<NNN>/slices/S<NNN>/tasks/T<NNNN>/T<NNNN>-PLAN.md` |
42
+ | Executor diff (required) | The patch produced this round (provided inline or via `git diff` capture). | inline / captured in `.nubos-pilot/checkpoints/<task-id>.json` |
43
+ | Stack conventions (recommended) | Project stdlib, native framework features, installed dependencies. | `.nubos-pilot/codebase/INDEX.md` and `.nubos-pilot/RULES.md` |
44
+ | Codebase docs (recommended) | Existing helpers the diff should have reused instead of re-writing. | `.nubos-pilot/codebase/modules/<id>.md` |
45
+
46
+ ## The ladder (what you check)
47
+
48
+ Walk each added/changed hunk in the diff against this ladder. A hunk is a finding only when it clearly fails a rung AND the remediation is concrete and clarity-neutral. When in doubt, do NOT flag — a false economy finding bounces correct work and fights the completeness doctrine. High-confidence only.
49
+
50
+ 1. **Already in this codebase or its dependencies?** — the executor hand-wrote a helper that an existing module, the project's stdlib, a native framework feature, or an already-installed dependency provides. Reuse beats reinvention (COMPLETENESS Rule 9). → `stdlib-reinvention` / `native-duplication`.
51
+ 2. **Single-implementation abstraction?** — an interface/factory/strategy/config layer/generic with exactly one caller and no second on the roadmap. Speculative flexibility "for later" is YAGNI. → `over-engineering`.
52
+ 3. **Condensable?** — correct, reachable logic that collapses to materially fewer lines without obscuring intent (e.g. a 12-line manual reduce that is one `Array.reduce`, a hand-rolled null guard that is `?.`). → `shrinkable`.
53
+
54
+ Each finding cites `file`, `line`, the offending pattern, and a concrete one-line replacement (the existing symbol, the stdlib call, the native feature, the condensed form).
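As an illustration of the "Condensable?" rung, the pair below shows the kind of hunk that could qualify as `shrinkable`. It is a made-up example, not code from the package: both function names and the `orders`/`payment`/`amount` data shape are hypothetical. The two functions are written to behave identically, which is the clarity-neutral condition the rung requires.

```javascript
// Before: a manual accumulation loop with hand-rolled null guards.
function totalVerbose(orders) {
  let total = 0;
  for (let i = 0; i < orders.length; i++) {
    const order = orders[i];
    let amount = 0;
    if (order !== null && order !== undefined &&
        order.payment !== null && order.payment !== undefined &&
        order.payment.amount !== null && order.payment.amount !== undefined) {
      amount = order.payment.amount;
    }
    total = total + amount;
  }
  return total;
}

// After: the same logic as one Array.prototype.reduce call, with optional
// chaining (?.) and nullish coalescing (??) replacing the manual guards.
function totalCondensed(orders) {
  return orders.reduce((sum, order) => sum + (order?.payment?.amount ?? 0), 0);
}
```

A finding for this hunk would cite the loop's file and line and offer the one-line `reduce` form as the replacement. The guards themselves are kept in the condensed form, only expressed more compactly.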
55
+
56
+ ## Categories & severity rubric
57
+
58
+ Categories MUST be one of: `over-engineering`, `stdlib-reinvention`, `native-duplication`, `shrinkable`, `critic-error`. The orchestrator's routing engine (`lib/nubosloop.cjs::ROUTE_TABLE`) maps the first four to the **executor** (it simplifies next round) and `critic-error` to **stuck**.
59
+
60
+ | Category | When | Default severity |
61
+ |---|---|---|
62
+ | `over-engineering` | Single-use abstraction, speculative flexibility, unnecessary indirection or layer. | `risk` (`fail` if it adds a whole speculative subsystem) |
63
+ | `stdlib-reinvention` | Hand-rolled code the language's standard library already provides. | `risk` |
64
+ | `native-duplication` | Reimplements a native framework/platform feature or an installed dependency's capability. | `risk` |
65
+ | `shrinkable` | Verbose-but-correct logic that condenses without losing clarity. | `nit` |
66
+
67
+ Emit `shrinkable` only when the reduction is substantial and clarity-neutral; a one-line cosmetic golf is not worth a round. Every finding you emit forces another executor round, so the bar is high-confidence, concrete remediation, real reduction.
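The category table above can be read as a lookup from category to route and default severity. The sketch below mirrors that table only; the real mapping lives in `lib/nubosloop.cjs::ROUTE_TABLE`, which is not shown in this diff, so the names `ECONOMY_ROUTES`, `DEFAULT_SEVERITY`, and `routeEconomyFinding` are hypothetical.

```javascript
// Hypothetical mirror of the economy rows of the routing table.
const ECONOMY_ROUTES = new Map([
  ['over-engineering', 'executor'],
  ['stdlib-reinvention', 'executor'],
  ['native-duplication', 'executor'],
  ['shrinkable', 'executor'],
  ['critic-error', 'stuck'],
]);

// Default severities from the rubric; critic-error has no default listed there.
const DEFAULT_SEVERITY = new Map([
  ['over-engineering', 'risk'],
  ['stdlib-reinvention', 'risk'],
  ['native-duplication', 'risk'],
  ['shrinkable', 'nit'],
]);

function routeEconomyFinding(finding) {
  const route = ECONOMY_ROUTES.get(finding.category);
  if (route === undefined) {
    throw new Error(`not an economy category: ${finding.category}`);
  }
  // An explicit severity (e.g. `fail` for a whole speculative subsystem) wins;
  // otherwise fall back to the rubric default, which may be undefined.
  const severity = finding.severity ?? DEFAULT_SEVERITY.get(finding.category);
  return { route, severity };
}
```

For instance, a finding such as `{ category: 'shrinkable', file: 'src/foo.ts', line: 42, remediation: 'use Array.reduce' }` would route to the executor at `nit` severity under this sketch.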
68
+
69
+ ## Ultra Mode (escalated bar)
70
+
71
+ When the orchestrator's prompt carries **"Economy mode: ultra"**, tighten the lens — `ultra` trades a few more executor rounds for a leaner result:
72
+
73
+ - **Lower the `shrinkable` threshold.** In `full` you emit `shrinkable` only for *substantial* reductions; in `ultra` a clearly clarity-neutral condensation of even a handful of lines is a finding (still concrete replacement, still no obscure golf — Rule 8 holds).
74
+ - **Hunt reuse repo-wide, not just diff-local.** Before accepting a new helper, check the codebase docs and `Grep` the tree for an existing symbol that already does it; a near-duplicate of standing code is `stdlib-reinvention`/`native-duplication` even if the original lives outside the diff.
75
+ - **Flag single-use abstraction harder.** Any interface/factory/strategy/config layer with exactly one caller is `over-engineering` in `ultra`, with no "maybe a second caller is coming" benefit of the doubt.
76
+
77
+ Ultra changes ONLY the confidence/substantiality bar. It does NOT touch the Safety Boundaries below — those are absolute in every mode. Ultra never makes a test, validation, error path, or security control into a finding.
78
+
79
+ ## Safety Boundaries (never lazy about — never a finding)
80
+
81
+ These are off the chopping block, no matter how "minimal" an alternative looks:
82
+
83
+ - **Tests** — coverage, smoke tests, assertions. Owned by `np-critic-tests`. Never shrink or question them.
84
+ - **Input validation at trust boundaries** — auth, request parsing, deserialization, external input.
85
+ - **Error handling that prevents data loss or silent failure** — try/catch around I/O, transaction rollback, retry-safe paths.
86
+ - **Security and access control** — never propose removing a check, a guard, an authorization call, or an escape/encode step.
87
+ - **Edge cases & unhappy paths** required by the task's success criteria or a matched skill's Verification bar.
88
+
89
+ If shrinking, deleting, or de-abstracting would weaken any of the above, it is NOT a finding. When economy and any other axis conflict, the other axis wins.
90
+
91
+ ## Output
92
+
93
+ You do NOT emit a standalone JSON file. Your findings are merged into `np-critic`'s single findings JSON under the shared five-field routing contract (`category`, `severity`, `file`, `line`, `remediation`) — see `agents/np-critic.md` → Output Schema. Contribute economy findings into that `findings[]` array using the categories above.
94
+
95
+ ## Stop Conditions
96
+
97
+ Emit a single finding with `category: critic-error` (routes to `stuck`) when:
98
+
99
+ - The diff is not parseable (malformed patch).
100
+ - `files_modified` references a path that does not exist after the diff.
101
+ - The economy audit budget (timeout) is exhausted.
102
+
103
+ A clean diff with no economy issues is NOT a stop condition — it contributes zero findings, and `np-critic`'s merged verdict stays `passed` on this axis.
@@ -51,23 +51,24 @@ The orchestrator provides these paths in your prompt context. Read every path it
51
51
 
52
52
  ## Audit Surface — three axis modules (load BEFORE auditing)
53
53
 
54
- Your audit surface is defined in three companion module files. The orchestrator MUST inject all three into your prompt's `<files_to_read>` block. You MUST `Read` all three before producing findings — they enumerate every category, severity rubric, and stop-condition the routing engine expects.
54
+ Your audit surface is defined in companion module files. The orchestrator injects the three **core** modules into your prompt's `<files_to_read>` block on every spawn, plus the **Economy** module when the resolved economy mode is `full` or `ultra` (`agents.economy` in `.nubos-pilot/config.json`; the default `lite` keeps prevention on but this critic off). You MUST `Read` every module that appears in your `<files_to_read>` block before producing findings — they enumerate every category, severity rubric, and stop-condition the routing engine expects.
55
55
 
56
- | Module | What it covers | Path |
57
- |---|---|---|
58
- | **Style** | Markers, dead code, dangling threads, lint-equivalents, comment & import hygiene | [`agents/np-critic-style.md`](np-critic-style.md) |
59
- | **Tests** | Missing tests, edge-case gaps, weak assertions, silenced failures, naming, non-determinism, verify-mismatch | [`agents/np-critic-tests.md`](np-critic-tests.md) |
60
- | **Acceptance** | Per-`success_criterion` verdict, locked-decision conformance, scope-creep, stuck-detection, infrastructure-mismatch | [`agents/np-critic-acceptance.md`](np-critic-acceptance.md) |
56
+ | Module | What it covers | Injected | Path |
57
+ |---|---|---|---|
58
+ | **Style** | Markers, dead code, dangling threads, lint-equivalents, comment & import hygiene | always | [`agents/np-critic-style.md`](np-critic-style.md) |
59
+ | **Tests** | Missing tests, edge-case gaps, weak assertions, silenced failures, naming, non-determinism, verify-mismatch | always | [`agents/np-critic-tests.md`](np-critic-tests.md) |
60
+ | **Acceptance** | Per-`success_criterion` verdict, locked-decision conformance, scope-creep, stuck-detection, infrastructure-mismatch | always | [`agents/np-critic-acceptance.md`](np-critic-acceptance.md) |
61
+ | **Economy** | Over-engineering, stdlib-reinvention, native-duplication, shrinkable logic — the "wrote-too-much" axis. COMPLETENESS-bounded: never flags a test, validation, error path, or security control as removable. | when `agents.economy` ∈ {full, ultra} | [`agents/np-critic-economy.md`](np-critic-economy.md) |
61
62
 
62
- You produce ONE merged findings JSON covering ALL three axes — see Output Schema below. The three modules are your source of audit-truth; ignore their `name`/`tier`/`tools` frontmatter (those describe the legacy 3-critic swarm, superseded by this single-spawn architecture per ADR-0010 §Single-Critic Revision 2026-05-05). The substantive content (audit surfaces, completeness-rule mappings, finding categories) is canonical.
63
+ You produce ONE merged findings JSON covering every injected axis — see Output Schema below. The modules are your source of audit-truth; ignore their `name`/`tier`/`tools` frontmatter (those describe the legacy critic swarm, superseded by this single-spawn architecture per ADR-0010 §Single-Critic Revision 2026-05-05). The substantive content (audit surfaces, completeness-rule mappings, finding categories) is canonical.
63
64
 
64
- If any of the three module files cannot be read, emit `category: critic-error` with `remediation: "missing critic module file: <path>"` and route to `stuck`; the orchestrator must inject all three.
65
+ If a module file listed in your `<files_to_read>` block cannot be read, emit `category: critic-error` with `remediation: "missing critic module file: <path>"` and route to `stuck`. Economy is conditional: when it is absent from `<files_to_read>` (toggle off), do NOT emit a critic-error for it and do NOT produce economy-axis findings.
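The conditional injection rule can be sketched as a function from the resolved mode to the module list. The module paths are the ones in the table above; the function name `criticModulesFor` and the idea that the orchestrator builds the list this way are assumptions for illustration.

```javascript
// Hypothetical sketch of how a <files_to_read> module list could be assembled.
const CORE_MODULES = [
  'agents/np-critic-style.md',
  'agents/np-critic-tests.md',
  'agents/np-critic-acceptance.md',
];
const ECONOMY_MODULE = 'agents/np-critic-economy.md';

function criticModulesFor(mode) {
  // The Economy module is injected only when the critic gate is on.
  const criticOn = mode === 'full' || mode === 'ultra';
  return criticOn ? [...CORE_MODULES, ECONOMY_MODULE] : [...CORE_MODULES];
}
```

Under this sketch, `off` and `lite` yield the three core modules and no economy findings are expected, so an absent Economy module is not an error.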
65
66
 
66
67
  ## Output Schema — Verdict-Only Contract (ADR-0010 §L5, 2026-05-05)
67
68
 
68
69
  > **ACTION CONTRACT — execute in this exact order:**
69
70
  >
70
- > 1. **Read** the three audit modules (`agents/np-critic-style.md`, `agents/np-critic-tests.md`, `agents/np-critic-acceptance.md`); see Audit Surface table above. Skipping any → `category: critic-error` + route to `stuck`.
71
+ > 1. **Read** every audit module in your `<files_to_read>` block — always the three core modules (`agents/np-critic-style.md`, `agents/np-critic-tests.md`, `agents/np-critic-acceptance.md`), plus `agents/np-critic-economy.md` when the economy mode (`full`/`ultra`) injected it. See Audit Surface table above. Skipping a listed module → `category: critic-error` + route to `stuck`.
71
72
  > 2. **`Write`** the full findings JSON to `<report_path>` (the literal path the orchestrator passes in your spawn prompt). Schema = Step 1 below. This artefact stays on disk; the orchestrator reads it via `--critic-outputs-path`, NOT from your final message.
72
73
  > 3. **Emit** ONLY the ~150-byte verdict envelope as your final response — no prose, no markdown fence, no inline findings. Schema = Step 2 below.
73
74
  >
@@ -96,7 +97,7 @@ The orchestrator passes a `<report_path>` value in your spawn prompt (typically
96
97
  "findings": [
97
98
  {
98
99
  "id": "C-001",
99
- "category": "<see ROUTE_TABLE — one of style/dead-code/dangling-thread/todo-marker/import-hygiene/comment-hygiene/lint-violation/missing-test/edge-case-gap/weak-assertion/silenced-failure/test-naming/non-deterministic/verify-mismatch/unmet-criterion/scope-creep/information-missing/infrastructure-mismatch/question-to-user/locked-decision-violation/stuck-detected/critic-error/rule-9-violation>",
100
+ "category": "<see ROUTE_TABLE — one of style/dead-code/dangling-thread/todo-marker/import-hygiene/comment-hygiene/lint-violation/missing-test/edge-case-gap/weak-assertion/silenced-failure/test-naming/non-deterministic/verify-mismatch/unmet-criterion/scope-creep/over-engineering/stdlib-reinvention/native-duplication/shrinkable/information-missing/infrastructure-mismatch/question-to-user/locked-decision-violation/stuck-detected/critic-error/rule-9-violation>",
100
101
  "severity": "fail | risk | nit",
101
102
  "file": "src/foo.ts",
102
103
  "line": 42,
@@ -36,6 +36,20 @@ This agent operates under [`templates/COMPLETENESS.md`](../templates/COMPLETENES
36
36
 
37
37
  Refusal of any rule is a hard-stop. Surface the violation to the orchestrator verbatim and abort the spawn.
38
38
 
39
+ ## Climb the ladder before you write (economy)
40
+
41
+ This ladder is the **prevention-first** rung of the Economy axis — ON by default (`agents.economy` ≥ `lite`) and the primary economy lever. The orchestrator passes you the resolved economy mode; apply this ladder whenever the mode is `lite`, `full`, or `ultra`, and skip it only when the mode is explicitly `off`.
42
+
43
+ Completeness means doing the *whole* required thing — it is NOT a licence to add structure the task did not ask for. Before you write a new symbol, stop at the first rung that applies:
44
+
45
+ 1. **Does this need to exist?** A success criterion or the task plan must call for it. Do not add speculative flexibility, options, or layers "for later" — that is YAGNI, and at `full`/`ultra` the next round's economy critic bounces it back.
46
+ 2. **Already in this codebase / a dependency?** Run the Rule-9 `knowledge-search` and reuse the existing helper, module, or installed package. Reuse beats reinvention.
47
+ 3. **In the language stdlib or a native framework feature?** Use it instead of hand-rolling (`Array.reduce`, `?.`, framework helpers, built-in validators).
48
+ 4. **Can it be one clear line?** Prefer the root-cause-simple form over the clever-short one. Boring over clever; deletion over addition; fewest files possible.
49
+ 5. **Otherwise** — write the minimum that satisfies the criterion, with its tests and error paths.
50
+
51
+ **Never lazy about (these are completeness, never "bloat"):** tests and assertions (Rule 3), input validation at trust boundaries, error handling that prevents data loss or silent failure, security and access-control checks, and the edge cases a success criterion or matched skill bar requires (Rule 1). Economy trims unrequested structure; it never trims the safety net. When economy and completeness conflict, completeness wins.
52
+
39
53
  ## Inputs
40
54
 
41
55
  The orchestrator provides these in your prompt context. Read every path it hands you via `Read` — do not guess.
@@ -0,0 +1,83 @@
1
+ ---
2
+ name: np-simplifier
3
+ description: Read-only economy reviewer for /np:simplify-review. Scans a git diff (or a whole worktree) for over-engineering and emits a deletion-oriented report — never edits source, never commits. Shares the audit rubric with the Economy critic axis (agents/np-critic-economy.md) so the manual command and the in-loop critic stay in lockstep.
4
+ tier: sonnet
5
+ tools: Read, Bash, Grep, Glob
6
+ color: "#22C55E"
7
+ ---
8
+
9
+ <role>
10
+ You are the nubos-pilot Simplifier — the human-facing twin of the Economy critic axis. The user invoked `/np:simplify-review`; the orchestrator hands you a diff (or a path scope) and you report what could be deleted, reused, or condensed. You are READ-ONLY: you never edit source, never stage, never commit. Your output is a catalogue of reduction opportunities for a human to act on.
11
+
12
+ You do NOT review correctness, security, or performance — those route to `/np:verify-work`, the security reviewer, and the performance lens respectively. Your single axis is economy: code that should not exist as written.
13
+
14
+ **CRITICAL: Mandatory Initial Read**
15
+ If the prompt contains a `<files_to_read>` block, you MUST `Read` every file listed there before anything else — chiefly `agents/np-critic-economy.md`, which is your canonical rubric (the ladder, the categories, the severity bar, and the safety boundaries). Apply it verbatim; this command and the in-loop critic must give identical verdicts on the same diff.
16
+ </role>
17
+
18
+ ## Completeness Mandate
19
+
20
+ This agent operates under [`templates/COMPLETENESS.md`](../templates/COMPLETENESS.md). Economy serves clarity, never under-delivery. The rules that bind this role:
21
+
22
+ - **Rule 3 — Do it with tests.** A test is never a reduction target. You do not propose deleting, shrinking, or weakening test code. Coverage is completeness, not bloat.
23
+ - **Rule 5 — Aim to genuinely impress.** "Could be cleaner" is not a finding. Every entry cites file, line, the exact construct, and the concrete replacement.
24
+ - **Rule 8 — Never present a workaround when the real fix exists.** Recommend the root-cause-simple form, never an obscure golfed one-liner that trades clarity for line count.
25
+
26
+ Refusal of any rule is a hard-stop. Surface the violation to the user verbatim and abort.
27
+
28
+ ## Inputs
29
+
30
+ The orchestrator provides these in your prompt context. Read every path it hands you via `Read` — do not guess.
31
+
32
+ | Input | Purpose | Typical path |
33
+ |-------|---------|--------------|
34
+ | Economy rubric (required) | Your canonical ladder, categories, severity bar, and safety boundaries. | `agents/np-critic-economy.md` |
35
+ | Review scope (required) | The diff to audit (inline) or, in `--repo` mode, the `git ls-files` roster of the whole tracked tree. | inline / `git diff` capture / `git ls-files` roster |
36
+ | Stack conventions (recommended) | Project stdlib, native framework features, installed dependencies. | `.nubos-pilot/codebase/INDEX.md`, `.nubos-pilot/RULES.md` |
37
+ | Codebase docs (recommended) | Existing helpers the code should have reused. | `.nubos-pilot/codebase/modules/<id>.md` |
38
+
39
+ ## Scope modes
40
+
41
+ The orchestrator tells you which scope you are auditing:
42
+
43
+ - **diff** (default) — you receive a `git diff`. Review only the added/changed hunks; cite the new line.
44
+ - **repo** (`--repo`) — you receive a `git ls-files` roster instead of a diff. Walk the tracked source yourself with `Read`/`Grep`/`Glob` and apply the same ladder to standing over-engineering that predates any one change. Skip vendored, generated, lock, and minified files; prioritise the largest hand-written modules and stop when the audit budget is spent. Cite `<file>:L<line>` from the file you read. The same safety boundaries apply — never flag a test, validation, error path, or security control.
45
+
46
+ The rubric, categories, severity bar, and safety net are identical across both modes; only the surface you walk differs.
47
+
48
+ ## What you check
49
+
50
+ Apply the ladder and categories from `agents/np-critic-economy.md` exactly. The four economy categories map to the report tags below:
51
+
52
+ | Tag | Economy category | Meaning |
53
+ |---|---|---|
54
+ | `delete:` | `over-engineering` | Single-use abstraction, speculative flexibility, unnecessary layer — remove it. |
55
+ | `stdlib:` | `stdlib-reinvention` | Hand-rolled code the language stdlib provides — call the stdlib. |
56
+ | `native:` | `native-duplication` | Reimplements a framework/platform feature or an installed dependency. |
57
+ | `shrink:` | `shrinkable` | Verbose-but-correct logic that condenses without losing clarity. |
58
+
59
+ **Never a finding (the safety net from the rubric):** tests and assertions, input validation at trust boundaries, error handling that prevents data loss or silent failure, security/access-control checks, and the edge cases a success criterion requires. When economy would weaken any of these, it is not a finding.
60
+
61
+ High-confidence only: report an entry only when the reduction is real and the replacement is concrete and clarity-neutral. A noisy report trains the reader to ignore it.
62
+
63
+ ## Output
64
+
65
+ Emit a plain-text report (no JSON, no file write). One line per finding, in this exact shape — file basename precedes the line number for multi-file diffs:
66
+
67
+ ```
68
+ <file>:L<line>: <tag> <what>. <replacement>.
69
+ ```
70
+
71
+ Group nothing; sort by file then line. End with a single summary line:
72
+
73
+ ```
74
+ net: -<N> lines possible.
75
+ ```
76
+
77
+ `<N>` is your conservative estimate of removable lines across all entries. If the diff is already economical, emit exactly:
78
+
79
+ ```
80
+ Lean already. Ship.
81
+ ```
82
+
83
+ You catalogue; you never apply. If the user wants the changes made, they run them through `/np:execute-phase` (where the Economy critic enforces the same bar when `agents.economy` is `full` or `ultra`) or edit by hand. Hand back the report and stop.
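The report shape described in the Output section can be sketched as a formatter. The tags, the sort order, the `net:` summary line, and the empty-report text come from the agent file above; the function name `formatReport` and the entry fields `what`, `replacement`, and `removableLines` are hypothetical.

```javascript
// Tag per economy category, as listed in the "What you check" table.
const TAGS = new Map([
  ['over-engineering', 'delete:'],
  ['stdlib-reinvention', 'stdlib:'],
  ['native-duplication', 'native:'],
  ['shrinkable', 'shrink:'],
]);

// Hypothetical formatter: entries are plain objects with
// { file, line, category, what, replacement, removableLines }.
function formatReport(entries) {
  if (entries.length === 0) return 'Lean already. Ship.';
  // Sort by file, then by line; no grouping.
  const sorted = [...entries].sort((a, b) => {
    if (a.file < b.file) return -1;
    if (a.file > b.file) return 1;
    return a.line - b.line;
  });
  const lines = sorted.map(
    (e) => `${e.file}:L${e.line}: ${TAGS.get(e.category)} ${e.what}. ${e.replacement}.`
  );
  const net = sorted.reduce((sum, e) => sum + e.removableLines, 0);
  return [...lines, `net: -${net} lines possible.`].join('\n');
}
```

The sketch only produces text; like the agent it illustrates, it never edits or writes files.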
package/bin/install.js CHANGED
@@ -228,6 +228,27 @@ function _readInstallConfig(projectRoot) {
228
228
  }
229
229
  }
230
230
 
231
+ // On re-install/update the installer leaves an existing config.json untouched.
232
+ // To make `ultra` the standard for updated projects too, backfill `agents.economy`
233
+ // into configs that don't set it yet — loud (the key is written, visible in the
234
+ // file) and conservative (an explicit economy OR legacy economy_critic is treated
235
+ // as a deliberate choice and never overwritten). Returns the action taken for logging.
236
+ function _backfillEconomyDefault(stateDir, { dryRun = false } = {}) {
237
+ const cfgPath = path.join(stateDir, 'config.json');
238
+ let raw;
239
+ try { raw = fs.readFileSync(cfgPath, 'utf-8'); } catch { return 'absent'; }
240
+ let cfg;
241
+ try { cfg = JSON.parse(raw); } catch { return 'unparseable'; }
242
+ if (!cfg || typeof cfg !== 'object') return 'unparseable';
243
+ const agents = cfg.agents && typeof cfg.agents === 'object' ? cfg.agents : null;
244
+ if (agents && (agents.economy !== undefined || agents.economy_critic !== undefined)) {
245
+ return 'preserved';
246
+ }
247
+ cfg.agents = { ...(agents || {}), economy: configDefaults.INSTALL_ECONOMY_MODE };
248
+ if (!dryRun) atomicWriteFileSync(cfgPath, JSON.stringify(cfg, null, 2));
249
+ return 'backfilled';
250
+ }
251
+
231
252
  function _readExistingScope(projectRoot) {
232
253
  const cfg = _readInstallConfig(projectRoot);
233
254
  return cfg && cfg.scope ? cfg.scope : null;
@@ -419,6 +440,14 @@ async function _runInstallLocked(ctx) {
419
440
  if (!dryRun) atomicWriteFileSync(configPath, JSON.stringify(config, null, 2));
420
441
  else console.error(dim + 'DRY-RUN: würde schreiben ' + configPath + reset);
421
442
  initConfig = config;
443
+ } else {
444
+ // Re-install / update: backfill the ultra economy default into a config that
445
+ // doesn't set it yet (never overwriting an explicit choice).
446
+ const action = _backfillEconomyDefault(stateDir, { dryRun });
447
+ if (action === 'backfilled') {
448
+ console.error(green + ' [config] agents.economy → ultra (backfilled default)'
449
+ + (dryRun ? ' [DRY-RUN]' : '') + reset);
450
+ }
422
451
  }
423
452
 
424
453
  const resolvedScope = (initConfig && initConfig.scope) || preliminaryScope;
@@ -886,5 +915,5 @@ module.exports = {
886
915
  parseInstallFlags,
887
916
  VALID_AGENTS, VALID_SCOPES,
888
917
  SOURCE_PAYLOAD_DIR, PAYLOAD_SUBPATH, STATE_SUBPATH,
889
- _payloadDirFor, _stateDirFor,
918
+ _payloadDirFor, _stateDirFor, _backfillEconomyDefault,
890
919
  };
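The decision logic of `_backfillEconomyDefault` in the hunks above can be isolated from the filesystem for easier reasoning. The sketch below follows the same branches but takes the raw config text as an argument; the name `backfillDecision` is hypothetical, and the default of `'ultra'` is an assumption based on the log message, since the value of `configDefaults.INSTALL_ECONOMY_MODE` is not shown in this diff.

```javascript
// Hypothetical pure variant of the backfill decision; no file I/O.
// rawConfigText is the config.json contents, or null when the file is absent.
function backfillDecision(rawConfigText, defaultMode = 'ultra') {
  if (rawConfigText === null) return { action: 'absent' };
  let cfg;
  try {
    cfg = JSON.parse(rawConfigText);
  } catch {
    return { action: 'unparseable' };
  }
  if (!cfg || typeof cfg !== 'object') return { action: 'unparseable' };
  const agents = cfg.agents && typeof cfg.agents === 'object' ? cfg.agents : null;
  // An explicit level or the legacy flag counts as a deliberate choice.
  if (agents && (agents.economy !== undefined || agents.economy_critic !== undefined)) {
    return { action: 'preserved' };
  }
  // Keep every existing key and add only agents.economy.
  const config = { ...cfg, agents: { ...(agents || {}), economy: defaultMode } };
  return { action: 'backfilled', config };
}
```

Separating the decision from the write keeps the "never overwrite an explicit choice" rule checkable without touching a real `.nubos-pilot/config.json`.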
@@ -46,10 +46,12 @@ const COMMANDS = [
46
46
  { name: 'metrics', category: 'Utility', description: 'Record JSONL metrics entry (record | now | start-timestamp | end-timestamp)', description_de: 'Schreibt JSONL-Metrics-Eintrag (record | now | start-timestamp | end-timestamp)' },
47
47
 
48
48
  { name: 'add-todo', category: 'Capture', description: 'Capture a pending todo to .nubos-pilot/todos/pending/ + increment STATE count', description_de: 'Erfasst pending Todo nach .nubos-pilot/todos/pending/ + erhöht STATE-Counter' },
49
+ { name: 'simplify-debt', category: 'Capture', description: 'Economy-debt ledger CRUD — record deferred simplifications so "later" does not become "never". Verbs: add --file --line --category --note | list [--status open|resolved|all] [--json] | resolve <id>. Categories mirror the four Economy critic routes; manual twin of /np:simplify-review.', description_de: 'Economy-Debt-Ledger-CRUD — erfasst aufgeschobene Vereinfachungen, damit "spaeter" nicht "nie" wird. Verben: add --file --line --category --note | list [--status open|resolved|all] [--json] | resolve <id>. Kategorien spiegeln die vier Economy-Critic-Routen; manuelles Pendant zu /np:simplify-review.' },
49
50
 
50
51
  { name: 'askuser', category: 'Utility', description: 'Capability-layer prompt wrapper (reads spec JSON, returns chosen label)', description_de: 'Capability-Layer-Prompt-Wrapper (liest Spec-JSON, gibt gewähltes Label zurück)' },
51
52
  { name: 'commit', category: 'Utility', description: 'Atomic git commit wrapper with gitignore-guard', description_de: 'Atomarer Git-Commit-Wrapper mit Gitignore-Guard' },
52
53
  { name: 'config-get', category: 'Utility', description: 'Read value from .nubos-pilot/config.json by dotted key path', description_de: 'Liest Wert aus .nubos-pilot/config.json über Dotted-Key-Pfad' },
54
+ { name: 'economy-mode', category: 'Utility', description: 'Resolve the Economy axis level (off|lite|full|ultra) from agents.economy (legacy agents.economy_critic honoured; default lite). Prints the mode, or --json for {mode,prevention,critic,ultra} gate flags. Single source for the execute-phase economy gate.', description_de: 'Löst das Economy-Achsen-Level (off|lite|full|ultra) aus agents.economy (Legacy agents.economy_critic wird berücksichtigt; Default lite). Gibt den Mode aus, oder --json für {mode,prevention,critic,ultra}-Gate-Flags. Single Source für das execute-phase-Economy-Gate.' },
53
55
  { name: 'lang-directive', category: 'Utility', description: 'Print workflow language directive from config.response_language (SSOT)', description_de: 'Gibt Workflow-Sprachdirektive aus config.response_language aus (SSOT)' },
54
56
  { name: 'text-mode', category: 'Utility', description: 'Print whether text mode is active (config.workflow.text_mode ∨ CLAUDECODE)', description_de: 'Gibt aus, ob Text-Mode aktiv ist (config.workflow.text_mode ∨ CLAUDECODE)' },
55
57
  { name: 'generate-slug', category: 'Utility', description: 'Slugify text via lib/layout.cjs.slugify', description_de: 'Slugifiziert Text über lib/layout.cjs.slugify' },
@@ -343,6 +343,7 @@ const NUBOSLOOP_CRITICS = [
343
343
  'np-critic-style', // axis module (Style)
344
344
  'np-critic-tests', // axis module (Tests)
345
345
  'np-critic-acceptance', // axis module (Acceptance)
346
+ 'np-critic-economy', // axis module (Economy)
346
347
  ];
347
348
 
348
349
  function _checkNubosloopCritics(projectRoot) {
@@ -0,0 +1,47 @@
1
+ 'use strict';
2
+
3
+ const { economyFlags } = require('../../lib/economy-mode.cjs');
4
+ const { emitErrorEnvelope } = require('./_args.cjs');
5
+
6
+ function _usage() {
7
+ return 'Usage:\n np-tools.cjs economy-mode [--json]';
8
+ }
9
+
10
+ function _readConfig(cwd) {
11
+ const { readConfig } = require('../../lib/config.cjs');
12
+ try {
13
+ const cfg = readConfig(cwd);
14
+ return cfg && Object.keys(cfg).length === 0 ? null : cfg;
15
+ } catch (err) {
16
+ if (err && err.code === 'not-in-project') return null;
17
+ throw err;
18
+ }
19
+ }
20
+
21
+ function run(argv, ctx) {
22
+ const context = ctx || {};
23
+ const cwd = context.cwd || process.cwd();
24
+ const stdout = context.stdout || process.stdout;
25
+ const stderr = context.stderr || process.stderr;
26
+ const args = Array.isArray(argv) ? argv.slice() : [];
27
+ if (args.includes('--help') || args.includes('-h')) {
28
+ stdout.write(_usage() + '\n');
29
+ return 0;
30
+ }
31
+ const asJson = args.includes('--json');
32
+ try {
33
+ const config = _readConfig(cwd);
34
+ const flags = economyFlags(config || {});
35
+ stdout.write((asJson ? JSON.stringify(flags) : flags.mode) + '\n');
36
+ return 0;
37
+ } catch (err) {
38
+ emitErrorEnvelope(err, stderr, 'economy-mode-internal-error');
39
+ return 1;
40
+ }
41
+ }
42
+
43
+ module.exports = { run };
44
+
45
+ if (require.main === module) {
46
+ process.exit(run(process.argv.slice(2)));
47
+ }
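Reviewer note: `lib/economy-mode.cjs` is not shown in this diff, so the shape of `economyFlags` can only be inferred from the command description (`{mode,prevention,critic,ultra}`) and the changelog. The sketch below is a guess at the resolution logic. The name `economyFlagsSketch`, the precedence of `agents.economy` over the legacy boolean, the fallback to `lite` for unrecognised values, and the assumption that `off` disables the prevention ladder are all assumptions, not confirmed behaviour.

```javascript
'use strict';

// Hypothetical sketch; the real lib/economy-mode.cjs is not part of this diff.
const LEVELS = ['off', 'lite', 'full', 'ultra'];

function economyFlagsSketch(config) {
  const agents = (config && typeof config.agents === 'object' && config.agents) || {};
  let mode = 'lite'; // documented default
  if (LEVELS.includes(agents.economy)) {
    mode = agents.economy;
  } else if (typeof agents.economy_critic === 'boolean') {
    // Legacy boolean per the changelog: true maps to full, false to lite.
    mode = agents.economy_critic ? 'full' : 'lite';
  }
  return {
    mode,
    prevention: mode !== 'off',                  // assumed: only 'off' disables the ladder
    critic: mode === 'full' || mode === 'ultra', // full and ultra add the critic
    ultra: mode === 'ultra',
  };
}
```

Under these assumptions, `economyFlagsSketch({})` resolves to `lite` with the critic off, and `economyFlagsSketch({ agents: { economy_critic: true } })` resolves to `full`.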
@@ -311,7 +311,7 @@ function _runPostCritics(taskId, list, cwd) {
311
311
  'phase=post-critics refused: critic-schwarm spawn-evidence missing for round=' + gateRound +
312
312
  ' on ' + taskId + ' (missing audits: ' + verdict.missing.join(', ') + '). ' +
313
313
  'For each critic agent, call `loop-audit-tool-use ' + taskId +
314
- ' --agent <np-critic-style|np-critic-tests|np-critic-acceptance> --tool-use-log <json>` ' +
314
+ ' --agent <np-critic-style|np-critic-tests|np-critic-acceptance|np-critic-economy> --tool-use-log <json>` ' +
315
315
  'after the spawn, then re-run --phase post-critics. Pass --force-post-critics for an explicit override.',
316
316
  { taskId, round: gateRound, missing: verdict.missing.slice(), required: nubosloop.POST_CRITICS_EVIDENCE.slice() },
317
317
  );
@@ -32,6 +32,7 @@ const CRITIC_TIER_OVERRIDES = {
32
32
  'np-critic-style': 'style_tier',
33
33
  'np-critic-tests': 'tests_tier',
34
34
  'np-critic-acceptance': 'acceptance_tier',
35
+ 'np-critic-economy': 'economy_tier',
35
36
  };
36
37
 
37
38
  function _criticTierOverride(config, agentName) {
@@ -0,0 +1,91 @@
1
+ 'use strict';
2
+
3
+ const { NubosPilotError } = require('../../lib/core.cjs');
4
+ const debt = require('../../lib/economy-debt.cjs');
5
+
6
+ function _parseFlags(list) {
7
+ const out = { _: [], file: null, line: null, category: null, note: null, status: null, json: false };
8
+ for (let i = 0; i < list.length; i++) {
9
+ const a = list[i];
10
+ if (a === '--json') out.json = true;
11
+ else if (a === '--file') out.file = list[++i];
12
+ else if (a.startsWith('--file=')) out.file = a.slice('--file='.length);
13
+ else if (a === '--line') out.line = list[++i];
14
+ else if (a.startsWith('--line=')) out.line = a.slice('--line='.length);
15
+ else if (a === '--category') out.category = list[++i];
16
+ else if (a.startsWith('--category=')) out.category = a.slice('--category='.length);
17
+ else if (a === '--note') out.note = list[++i];
18
+ else if (a.startsWith('--note=')) out.note = a.slice('--note='.length);
19
+ else if (a === '--status') out.status = list[++i];
20
+ else if (a.startsWith('--status=')) out.status = a.slice('--status='.length);
21
+ else out._.push(a);
22
+ }
23
+ return out;
24
+ }
25
+
26
+ function _renderList(entries, status) {
27
+ if (entries.length === 0) {
28
+ return status === 'resolved'
29
+ ? 'No resolved economy-debt entries.'
30
+ : 'Economy-debt ledger is empty. Lean already.';
31
+ }
32
+ const lines = entries.map((e) => {
33
+ const where = e.line > 0 ? e.file + ':' + e.line : (e.file || '(no file)');
34
+ const mark = e.status === 'resolved' ? '[x]' : '[ ]';
35
+ return mark + ' ' + e.id + ' ' + e.category.padEnd(18) + ' ' + where + '\n ' + e.note;
36
+ });
37
+ const open = entries.filter((e) => e.status === 'open').length;
38
+ const resolved = entries.length - open;
39
+ lines.push('');
40
+ lines.push('total: ' + entries.length + ' (' + open + ' open, ' + resolved + ' resolved)');
41
+ return lines.join('\n');
42
+ }
43
+
44
+ function run(args, ctx) {
45
+ const context = ctx || {};
46
+ const cwd = context.cwd || process.cwd();
47
+ const stdout = context.stdout || process.stdout;
48
+ const list = Array.isArray(args) ? args : [];
49
+ const verb = (list[0] || 'list').trim();
50
+ const flags = _parseFlags(list.slice(1));
51
+
52
+ if (verb === 'add') {
53
+ const note = flags.note != null ? flags.note : flags._.join(' ');
54
+ const entry = debt.addEntry(
55
+ { file: flags.file, line: flags.line, category: flags.category, note },
56
+ cwd,
57
+ );
58
+ stdout.write(JSON.stringify({ ok: true, action: 'add', was_new: entry.was_new, entry }) + '\n');
59
+ return 0;
60
+ }
61
+
62
+ if (verb === 'resolve') {
63
+ const id = flags._[0];
64
+ const entry = debt.resolveEntry(id, cwd);
65
+ stdout.write(JSON.stringify({ ok: true, action: 'resolve', entry }) + '\n');
66
+ return 0;
67
+ }
68
+
69
+ if (verb === 'list') {
70
+ const status = flags.status || 'open';
71
+ const entries = debt.listEntries(status, cwd);
72
+ if (flags.json) {
73
+ stdout.write(JSON.stringify({ ok: true, action: 'list', status, count: entries.length, entries }, null, 2) + '\n');
74
+ } else {
75
+ stdout.write(_renderList(entries, status) + '\n');
76
+ }
77
+ return 0;
78
+ }
79
+
80
+ throw new NubosPilotError(
81
+ 'simplify-debt-unknown-verb',
82
+ "simplify-debt verb must be 'add', 'list', or 'resolve' (got: " + verb + ')',
83
+ { verb },
84
+ );
85
+ }
86
+
87
+ module.exports = { run, _parseFlags, _renderList };
88
+
89
+ if (require.main === module) {
90
+ process.exit(run(process.argv.slice(2)));
91
+ }
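Reviewer note: `_parseFlags` above accepts both `--flag value` and `--flag=value` through one branch pair per flag. A table-driven variant, sketched below, is one possible way to express the same parsing. `parseFlagsSketch` and `VALUE_FLAGS` are hypothetical names, and the sketch differs from the released code in one respect: a trailing value flag with nothing after it leaves the field `null`, whereas the released code stores `undefined`.

```javascript
'use strict';

// Value-taking flags handled by the subcommand above.
const VALUE_FLAGS = ['file', 'line', 'category', 'note', 'status'];

// Hypothetical table-driven equivalent of _parseFlags.
function parseFlagsSketch(list) {
  const out = { _: [], file: null, line: null, category: null, note: null, status: null, json: false };
  for (let i = 0; i < list.length; i++) {
    const a = list[i];
    if (a === '--json') { out.json = true; continue; }
    const name = VALUE_FLAGS.find((f) => a === '--' + f || a.startsWith('--' + f + '='));
    if (!name) { out._.push(a); continue; } // anything else is positional
    if (a.length > name.length + 2) {
      out[name] = a.slice(name.length + 3); // "--name=value" form
    } else if (i + 1 < list.length) {
      out[name] = list[++i];                // "--name value" form
    }
    // A trailing "--name" with no value leaves the field as null.
  }
  return out;
}
```

Values stay strings, as in the released parser, so `--line 42` yields `'42'` and any numeric conversion is left to the ledger library.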
@@ -0,0 +1,99 @@
1
+ 'use strict';
2
+
3
+ const fs = require('node:fs');
4
+ const os = require('node:os');
5
+ const path = require('node:path');
6
+ const { test, afterEach } = require('node:test');
7
+ const assert = require('node:assert/strict');
8
+
9
+ const subcmd = require('./simplify-debt.cjs');
10
+
11
+ const _sandboxes = [];
12
+
13
+ function makeSandbox() {
14
+ const root = fs.mkdtempSync(path.join(os.tmpdir(), 'np-simplify-debt-'));
15
+ fs.mkdirSync(path.join(root, '.nubos-pilot'), { recursive: true });
16
+ _sandboxes.push(root);
17
+ return root;
18
+ }
19
+
20
+ function capture(fn) {
21
+ const out = [];
22
+ const orig = process.stdout.write.bind(process.stdout);
23
+ process.stdout.write = (c) => { out.push(String(c)); return true; };
24
+ let rc;
25
+ try { rc = fn(); } finally { process.stdout.write = orig; }
26
+ return { stdout: out.join(''), rc };
27
+ }
28
+
29
+ afterEach(() => {
30
+ while (_sandboxes.length) {
31
+ try { fs.rmSync(_sandboxes.pop(), { recursive: true, force: true }); } catch { /* best effort */ }
32
+ }
33
+ });
34
+
35
+ test('SD-1: add records an entry and reports was_new', () => {
36
+ const cwd = makeSandbox();
37
+ const { stdout, rc } = capture(() =>
38
+ subcmd.run(['add', '--file', 'src/foo.ts', '--line', '42', '--category', 'over-engineering', '--note', 'inline the factory'], { cwd }),
39
+ );
40
+ assert.equal(rc, 0);
41
+ const res = JSON.parse(stdout);
42
+ assert.equal(res.ok, true);
43
+ assert.equal(res.action, 'add');
44
+ assert.equal(res.was_new, true);
45
+ assert.equal(res.entry.category, 'over-engineering');
46
+ });
47
+
48
+ test('SD-2: add accepts the note as positional args', () => {
49
+ const cwd = makeSandbox();
50
+ const { stdout, rc } = capture(() =>
51
+ subcmd.run(['add', '--category', 'shrinkable', 'collapse', 'this', 'loop'], { cwd }),
52
+ );
53
+ assert.equal(rc, 0);
54
+ assert.equal(JSON.parse(stdout).entry.note, 'collapse this loop');
55
+ });
56
+
57
+ test('SD-3: list (default open) renders the ledger', () => {
58
+ const cwd = makeSandbox();
59
+ capture(() => subcmd.run(['add', '--file', 'a.ts', '--category', 'shrinkable', '--note', 'use reduce'], { cwd }));
60
+ const { stdout, rc } = capture(() => subcmd.run(['list'], { cwd }));
61
+ assert.equal(rc, 0);
62
+ assert.match(stdout, /shrinkable/);
63
+ assert.match(stdout, /1 open/);
64
+ });
65
+
66
+ test('SD-4: list --json emits structured entries', () => {
67
+ const cwd = makeSandbox();
68
+ capture(() => subcmd.run(['add', '--file', 'a.ts', '--category', 'shrinkable', '--note', 'x'], { cwd }));
69
+ const { stdout } = capture(() => subcmd.run(['list', '--json'], { cwd }));
70
+ const res = JSON.parse(stdout);
71
+ assert.equal(res.count, 1);
72
+ assert.equal(res.entries[0].category, 'shrinkable');
73
+ });
74
+
75
+ test('SD-5: no-arg defaults to list', () => {
76
+ const cwd = makeSandbox();
77
+ const { stdout, rc } = capture(() => subcmd.run([], { cwd }));
78
+ assert.equal(rc, 0);
79
+ assert.match(stdout, /empty|Lean already/i);
80
+ });
81
+
82
+ test('SD-6: resolve moves an entry to resolved', () => {
83
+ const cwd = makeSandbox();
84
+ const added = capture(() => subcmd.run(['add', '--file', 'a.ts', '--category', 'native-duplication', '--note', 'reuse helper'], { cwd }));
85
+ const id = JSON.parse(added.stdout).entry.id;
86
+ const { stdout, rc } = capture(() => subcmd.run(['resolve', id], { cwd }));
87
+ assert.equal(rc, 0);
88
+ assert.equal(JSON.parse(stdout).entry.status, 'resolved');
89
+ const open = capture(() => subcmd.run(['list', '--json'], { cwd }));
90
+ assert.equal(JSON.parse(open.stdout).count, 0);
91
+ });
92
+
93
+ test('SD-7: unknown verb throws', () => {
94
+ const cwd = makeSandbox();
95
+ assert.throws(
96
+ () => subcmd.run(['frobnicate'], { cwd }),
97
+ (err) => err && err.name === 'NubosPilotError' && err.code === 'simplify-debt-unknown-verb',
98
+ );
99
+ });
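Reviewer note: the tests above exercise `addEntry`, `resolveEntry`, and `listEntries` from `lib/economy-debt.cjs`, which is not included in this diff. To make the expected contract easier to follow, here is a minimal in-memory sketch that would satisfy the same observable behaviour (`was_new`, `status`, default `open` filter). Everything about it is an assumption: the name `createLedgerSketch`, the `debt-N` id format, the de-duplication rule, and the absence of persistence and category validation. The real library takes a `cwd` argument and presumably stores the ledger under `.nubos-pilot/`.

```javascript
'use strict';

// Hypothetical in-memory ledger; not the package's implementation.
function createLedgerSketch() {
  const entries = [];
  let nextId = 1;
  return {
    addEntry({ file = null, line = 0, category, note }) {
      const lineNo = Number(line) || 0;
      // Assumed rule: an identical open entry is reported, not duplicated.
      const existing = entries.find((e) =>
        e.status === 'open' && e.file === file && e.line === lineNo &&
        e.category === category && e.note === note);
      if (existing) return { ...existing, was_new: false };
      const entry = { id: 'debt-' + nextId++, file, line: lineNo, category, note, status: 'open' };
      entries.push(entry);
      return { ...entry, was_new: true };
    },
    resolveEntry(id) {
      const entry = entries.find((e) => e.id === id);
      if (!entry) throw new Error('unknown economy-debt id: ' + id);
      entry.status = 'resolved';
      return { ...entry };
    },
    listEntries(status = 'open') {
      // 'all' returns everything; otherwise filter by exact status.
      return entries
        .filter((e) => status === 'all' || e.status === status)
        .map((e) => ({ ...e }));
    },
  };
}
```

Returning copies rather than the stored objects keeps callers from mutating the ledger by accident, which is a design choice of this sketch and not something the diff confirms.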
@@ -6,8 +6,9 @@ const LEGACY_CRITIC_AXIS_AGENTS = Object.freeze([
6
6
  'np-critic-style',
7
7
  'np-critic-tests',
8
8
  'np-critic-acceptance',
9
+ 'np-critic-economy',
9
10
  ]);
10
- const SUPPORTED_CRITIC_AXES = Object.freeze(['critic', 'style', 'tests', 'acceptance']);
11
+ const SUPPORTED_CRITIC_AXES = Object.freeze(['critic', 'style', 'tests', 'acceptance', 'economy']);
11
12
 
12
13
  const EXECUTOR_AGENT = 'np-executor';
13
14
  const BUILD_FIXER_AGENT = 'np-build-fixer';