npm - nubos-pilot - Versions diffs - 1.0.6 → 1.1.0 - Mend

nubos-pilot 1.0.6 → 1.1.0

Files changed (32)

package/agents/np-build-fixer.md +8 -3
package/agents/np-critic.md +8 -0
package/agents/np-executor.md +4 -0
package/agents/np-nyquist-auditor.md +11 -6
package/agents/np-planner.md +10 -1
package/agents/np-researcher.md +21 -18
package/bin/np-tools/learning-match.cjs +2 -2
package/bin/np-tools/loop-commands.test.cjs +131 -140
package/bin/np-tools/loop-preflight.cjs +2 -2
package/bin/np-tools/loop-run-round.cjs +9 -35
package/bin/np-tools/plan-milestone.cjs +4 -4
package/bin/np-tools/plan-milestone.test.cjs +35 -35
package/bin/np-tools/research-phase.cjs +2 -2
package/bin/np-tools/research-phase.test.cjs +28 -28
package/lib/core.cjs +82 -0
package/lib/core.test.cjs +62 -0
package/lib/knowledge-adapter.cjs +163 -41
package/lib/knowledge-adapter.test.cjs +63 -16
package/lib/memory-index-usearch.cjs +42 -8
package/lib/memory-index-usearch.test.cjs +105 -0
package/lib/memory.cjs +75 -64
package/lib/nubosloop.cjs +10 -2
package/lib/nubosloop.test.cjs +20 -20
package/package.json +1 -1
package/templates/RULES.md +9 -0
package/templates/VALIDATION.md +25 -19
package/templates/slice/PLAN.md +16 -6
package/workflows/discuss-phase.md +19 -6
package/workflows/execute-phase.md +107 -100
package/workflows/new-project.md +25 -21
package/workflows/plan-phase.md +52 -14
package/workflows/research-phase.md +4 -2

package/agents/np-build-fixer.md CHANGED

@@ -56,6 +56,11 @@ The orchestrator provides these in your prompt context. Read every path it hands
    - `test` (assertion failed)
    - `runtime` (uncaught exception inside test or script)
    - `infra` (missing tool, network, env var) → STOP and emit `## INFRA BLOCKER` block; do not edit source.
+1a. **MANDATORY knowledge lookup (Rule 9 — non-optional, runs before any Edit).** Pick the failing symbol or error class from Step 1 and run:
+    ```bash
+    node .nubos-pilot/bin/np-tools.cjs knowledge-search "<failing-symbol-or-error-class>" --limit 5
+    ```
+    If a hit lives in `.nubos-pilot/codebase/<module>.md`, `Read` that doc before patching. Skipping this step stamps `rule-9-violation` in the Layer-C audit log and the loop routes back to the researcher swarm next round — it is **not** an opt-out.
 2. **Locate the failure surface** strictly inside `files_modified`. If the failure points outside that set, emit `## SCOPE EXPANSION REQUEST` and stop — do NOT edit out-of-scope files.
 3. **Propose the smallest patch** that addresses the root cause:
    - For `compile` / `lint`: edit the offending file directly.
@@ -65,15 +70,15 @@ The orchestrator provides these in your prompt context. Read every path it hands
 5. **Loop ≤ 3 attempts.** If verify still fails after the third attempt, STOP and write `T<NNNN>-FIX-NOTES.md` describing what was tried, what didn't work, and the suspected root cause. Hand back to executor.
 6. **On success:** do NOT commit yourself. Hand control back to `np-executor` so the D-03 atomic commit path runs.
-## Knowledge Lookup
+## Mandatory Knowledge Lookup (Rule 9)
-Before guessing at unfamiliar symbols, consult the local index:
+**This is non-optional, not advisory.** Workflow Step 1a runs the lookup before any Edit. Skipping it stamps `rule-9-violation` in the audit log and forces a re-route to the researcher swarm.
 ```bash
 node .nubos-pilot/bin/np-tools.cjs knowledge-search "<failing-symbol>" --limit 5
 ```
-If a hit lives in `codebase/<module>.md`, read that doc before patching. Cross-task context belongs in `RULES.md` and `M<NNN>-CONTEXT.md`.
+If a hit lives in `.nubos-pilot/codebase/<module>.md`, `Read` that doc before patching. Cross-task context belongs in `RULES.md` and `M<NNN>-CONTEXT.md`.
 ## Handoff Protocol

package/agents/np-critic.md CHANGED

@@ -65,6 +65,14 @@ If any of the three module files cannot be read, emit `category: critic-error` w
 ## Output Schema — Verdict-Only Contract (ADR-0010 §L5, 2026-05-05)
+> **ACTION CONTRACT — execute in this exact order:**
+>
+> 1. **Read** the three audit modules (`agents/np-critic-style.md`, `agents/np-critic-tests.md`, `agents/np-critic-acceptance.md`) — see Audit Surface table above. Skipping any → `category: critic-error` + route to `stuck`.
+> 2. **`Write`** the full findings JSON to `<report_path>` (the literal path the orchestrator passes in your spawn prompt). Schema = Step 1 below. This artefact stays on disk; the orchestrator reads it via `--critic-outputs-path`, NOT from your final message.
+> 3. **Emit** ONLY the ~150-byte verdict envelope as your final response — no prose, no markdown fence, no inline findings. Schema = Step 2 below.
+>
+> Inlining the full findings JSON as your final message instead of (3) is the canonical bypass — it replays multi-kB into the orchestrator's context every round and silently undoes ADR-0010 §L5. Don't do it.
 You emit your audit in **two artefacts**: the full findings JSON gets `Write`-n to a path the orchestrator hands you, and your spawn's final response is a tiny envelope. This keeps the parent context lean — verbatim multi-kB findings reports were the dominant Nubosloop token sink before this revision.
 ### Step 1 — write the full report to disk

package/agents/np-executor.md CHANGED

@@ -111,6 +111,10 @@ into the `task(…)` commit. If `workflow.commit_docs=true`, the
     Unanswered `expects_reply=true` requests block commit-phase via Layer-B (ADR-0015).
 3. **Transition to in-progress:** `node np-tools.cjs checkpoint transition <task-id> in-progress`.
 4. **Edit files** — only the paths listed in the task's `files_modified` frontmatter. Use `Read` + `Edit` / `Write`. No scope expansion.
+4a. **Boundary check before every Edit/Write.** If the path you are about to touch is NOT in `files_modified`:
+    - DO NOT edit it. Not even "just an import line", not even a test fixture, not even a sibling module that "obviously needs the same change".
+    - Emit a `## SCOPE EXPANSION REQUEST` block naming the out-of-scope path and the symbol/reason that would have made you touch it.
+    - STOP and hand back to the orchestrator. The plan declares scope; if the scope is wrong, that is a **planner-bug**, not an executor-fix. The plan-checker route exists for exactly this case.
 5. **Transition to verifying:** `node np-tools.cjs checkpoint transition <task-id> verifying`.
 6. **Run the task-level verification command** from the task frontmatter's `verify`. If it fails, fix within the same `files_modified` scope. If it still fails after 2 attempts, STOP and report.
 7. **Transition to pre-commit:** `node np-tools.cjs checkpoint transition <task-id> pre-commit`.
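
The boundary check added in step 4a can be sketched as a small path comparison. This is a minimal illustration, not code from the package: `isInScope` and `scopeExpansionRequest` are hypothetical helper names, and the body layout of the request block (the `path` and `reason` bullets) is an assumption, since the diff only fixes the `## SCOPE EXPANSION REQUEST` heading.

```javascript
const path = require('node:path');

// Hypothetical helper: true only when `candidate` resolves to one of the
// paths declared in the task's `files_modified` frontmatter.
function isInScope(candidate, filesModified, cwd = process.cwd()) {
  const target = path.resolve(cwd, candidate);
  return filesModified.some((declared) => path.resolve(cwd, declared) === target);
}

// Hypothetical helper: builds the block the executor emits instead of editing.
// Only the heading comes from the diff; the bullet layout is assumed.
function scopeExpansionRequest(candidate, reason) {
  return [
    '## SCOPE EXPANSION REQUEST',
    `- path: ${candidate}`,
    `- reason: ${reason}`,
  ].join('\n');
}

// Usage: one declared path and one out-of-scope sibling module.
const filesModified = ['lib/core.cjs', 'lib/core.test.cjs'];
for (const p of ['./lib/core.cjs', 'lib/memory.cjs']) {
  if (isInScope(p, filesModified)) {
    console.log(`in scope: ${p}`);
  } else {
    console.log(scopeExpansionRequest(p, 'sibling module would need the same change'));
  }
}
```

Resolving both sides before comparing means `./lib/core.cjs` and `lib/core.cjs` count as the same file, so a differently spelled path cannot slip past the check.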

package/agents/np-nyquist-auditor.md CHANGED

@@ -32,7 +32,7 @@ Refusal of any rule is a hard-stop. Surface the violation to the orchestrator ve
 <required_reading>
 Before auditing, load:
-1. `templates/VALIDATION.md` — the output skeleton (placeholders: `{N}`, `{milestone-slug}`, `{date}`)
+1. `templates/VALIDATION.md` — the output skeleton (placeholders use `{{name}}` syntax throughout, e.g. `{{phase_number}}`, `{{phase_slug}}`, `{{created_date}}`, `{{test_framework}}`, `{{quick_run_command}}`, etc.)
 2. `.nubos-pilot/REQUIREMENTS.md` — filter to the milestone's requirement IDs
 3. Every `<milestone_dir>/slices/S<NNN>/S<NNN>-PLAN.md` — slice plans with `<task>` blocks
 4. Every `<milestone_dir>/slices/S<NNN>/S<NNN>-SUMMARY.md` — per-wave outcome
@@ -111,16 +111,21 @@ For UNDER_SAMPLED and UNCOVERED: record the specific missing assertion(s) and re
 **ALWAYS use the Write tool to create files** — never use `Bash(cat << 'EOF')` or heredoc commands for file creation.
 1. Read `templates/VALIDATION.md` to obtain the skeleton
-2. Substitute placeholders: `{N}` → phase number, `{phase-slug}` → phase slug, `{date}` → today's ISO date
-3. Append per-requirement scoring sections
+2. Substitute every `{{placeholder}}` from the template using the values supplied in your input block. Authoritative mapping:
+   - `{{phase_number}}` → integer phase/milestone number (no `M` prefix)
+   - `{{phase_slug}}` → kebab-case milestone slug
+   - `{{created_date}}` → today's ISO date (YYYY-MM-DD)
+   - `{{test_framework}}`, `{{test_config_path}}`, `{{quick_run_command}}`, `{{full_suite_command}}`, `{{full_suite_seconds}}`, `{{max_feedback_seconds}}` → derived from the project's actual test setup (read `package.json` / `composer.json` / equivalent + existing test runner config)
+   - Table rows (`{{task_full_id}}`, `{{slice_number}}`, `{{wave_number}}`, `{{requirement_number}}`, `{{threat_ref}}`, `{{secure_behavior_or_na}}`, `{{automated_command}}`, `{{manual_*}}`, etc.) → emit one row per requirement / per task you scored
+3. Append per-requirement scoring sections (Covered / Under-Sampled / Uncovered) after the templated body
 4. Write the composed file to `validation_path`
-Final VALIDATION.md frontmatter (overriding template defaults with audit results):
+Final VALIDATION.md frontmatter (overriding template defaults with audit results — concrete values, no placeholders left):
 ```yaml
 ---
-phase: {N}
-slug: {phase-slug}
+phase: <integer phase number>
+slug: <kebab-case phase slug>
 audited_at: YYYY-MM-DDTHH:MM:SSZ
 requirements_total: N
 covered: N

package/agents/np-planner.md CHANGED

@@ -376,7 +376,16 @@ Reality-check is a planner responsibility, not an executor responsibility. Anyth
 Inside each `S<NNN>-PLAN.md`, every `<task>` tag MUST have these four attributes on the opening tag:
-- `id="M<NNN>-S<NNN>-T<NNNN>"` — full-id, e.g. `id="M001-S001-T0001"`. Milestone 3 digits, slice 3 digits, task **4 digits**. **Task numbering restarts at `T0001` inside every slice.** The first task of `S002` is `M<NNN>-S002-T0001`, the first task of `S003` is `M<NNN>-S003-T0001`. Tasks within a slice run `T0001, T0002, T0003, …` without gaps. Never continue the counter across slices (`S001-T0001, S002-T0002` is wrong — it must be `S001-T0001, S002-T0001`).
+- `id="M<NNN>-S<NNN>-T<NNNN>"` — full-id, e.g. `id="M001-S001-T0001"`. Milestone 3 digits, slice 3 digits, task **4 digits**. **Task numbering restarts at `T0001` inside every slice.** Tasks within a slice run `T0001, T0002, T0003, …` without gaps.
+  > ⚠️ **COMMON MISTAKE — the slice counter resets, do NOT continue across slices.**
+  >
+  > | Pattern | Result |
+  > |---|---|
+  > | ❌ WRONG | `S001-PLAN.md`: T0001, T0002 → `S002-PLAN.md`: **T0003**, T0004 |
+  > | ✅ RIGHT | `S001-PLAN.md`: T0001, T0002 → `S002-PLAN.md`: **T0001**, T0002 |
+  >
+  > The slice number in the task ID is the authoritative wave; the T-number is per-slice. `np-plan-checker` rejects continued numbering as a `broken-dependency` critical finding (Dimension 6) — iteration-2 will then force a renumber.
 - `depends_on="<id>[,<id>...]"` — comma-separated predecessor task full-ids, or empty string `""`. Must only reference tasks in **earlier slices** (cross-slice forward deps) or be empty (intra-slice tasks are implicitly parallel, never serial).
 - `wave="<N>"` — integer equal to the slice number. For S001 use `wave="1"`, for S002 use `wave="2"`, etc.
 - `tier="<haiku|sonnet|opus>"` — executor tier, picks the model via resolve-model.
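
The numbering rule above (T-numbers restart at `T0001` in every slice and run without gaps) is easy to check mechanically. The sketch below is an assumption-laden illustration: `checkTaskIds` and `expectedWave` are hypothetical names, and this is not how `np-plan-checker` is known to implement Dimension 6.

```javascript
// Matches the full-id shape from the planner doc: M<NNN>-S<NNN>-T<NNNN>.
const TASK_ID = /^M(\d{3})-S(\d{3})-T(\d{4})$/;

// Hypothetical checker: takes task ids in document order and returns a list
// of human-readable problems (an empty list means the numbering is valid).
function checkTaskIds(ids) {
  const problems = [];
  const nextPerSlice = new Map(); // "M001-S002" -> next expected T-number
  for (const id of ids) {
    const m = TASK_ID.exec(id);
    if (!m) {
      problems.push(`${id}: malformed id`);
      continue;
    }
    const sliceKey = `M${m[1]}-S${m[2]}`;
    const expected = nextPerSlice.get(sliceKey) ?? 1;
    const actual = Number(m[3]);
    if (actual !== expected) {
      const want = String(expected).padStart(4, '0');
      problems.push(`${id}: expected T${want} in ${sliceKey}`);
    }
    // Continue from the observed number so one mistake is reported once.
    nextPerSlice.set(sliceKey, actual + 1);
  }
  return problems;
}

// Per the doc, the `wave` attribute equals the slice number.
function expectedWave(id) {
  const m = TASK_ID.exec(id);
  return m ? Number(m[2]) : null;
}

// The WRONG row from the table above: the counter continues into S002.
console.log(checkTaskIds([
  'M001-S001-T0001', 'M001-S001-T0002',
  'M001-S002-T0003', 'M001-S002-T0004',
]));
// The RIGHT row: the counter restarts in S002.
console.log(checkTaskIds([
  'M001-S001-T0001', 'M001-S001-T0002',
  'M001-S002-T0001', 'M001-S002-T0002',
]));
```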

package/agents/np-researcher.md CHANGED

@@ -99,24 +99,27 @@ Do NOT use handoffs as a replacement for RESEARCH.md content — they are for si
 ## Tool Availability Detection
-On startup, before doing any research work, probe the web + MCP surface:
-1. **WebFetch probe** — attempt one HEAD request to a known safe URL (e.g. `about:blank` or `https://example.com/`), 5-second timeout. If the tool is missing or the call raises a tool-not-available error, mark `webfetch_available = false`.
-2. **Context7 probe** — call `mcp__context7__list-libraries` (or the lightest available Context7 method) with empty/minimal args, 5-second timeout. If the MCP tool is missing or raises tool-not-available, mark `context7_available = false`.
-Pseudocode:
-```text
-webfetch_available  = try_call(WebFetch, HEAD about:blank, timeout=5s) succeeds
-context7_available  = try_call(mcp__context7__list-libraries, {}, timeout=5s) succeeds
-if webfetch_available OR context7_available:
-    proceed with full web + MCP research (normal path)
-else:
-    enter Offline-Confirm Protocol (D-21)
-```
-Actual transport detection is the Phase 7/8 runtime-adapter's concern. This agent only needs to know *whether* the capability is callable. Timeouts are 5s per probe; total startup budget ≤ 10s.
+> **ACTION CONTRACT — runs ONCE at startup, before any research work. Total budget ≤ 10s.**
+>
+> Execute EXACTLY these two probes, in order:
+>
+> 1. **WebFetch probe** — call the `WebFetch` tool once with URL `https://example.com/` and a trivial extraction prompt (e.g. `"return the page title"`). Wait ≤ 5s.
+>    - Success → set `webfetch_available = true`.
+>    - Tool returns `tool-not-available` / `unknown tool` / similar → set `webfetch_available = false`.
+>    - Timeout or transport error → set `webfetch_available = false`.
+>
+> 2. **Context7 probe** — call `mcp__plugin_compound-engineering_context7__resolve-library-id` (or the lightest Context7 method available in this runtime) with a minimal query (`{libraryName: "react"}`). Wait ≤ 5s.
+>    - Success or empty-result response → set `context7_available = true`.
+>    - Tool returns `tool-not-available` / MCP server missing → set `context7_available = false`.
+>    - Timeout or transport error → set `context7_available = false`.
+>
+> 3. **Branch:**
+>    - `webfetch_available OR context7_available == true` → proceed with full web + MCP research path.
+>    - Both `false` → enter Offline-Confirm Protocol (D-21, below).
+>
+> DO NOT skip either probe. DO NOT assume availability from the tool list — tools listed by the harness may still raise `tool-not-available` at call time. The probe IS the contract.
+Actual transport detection is the Phase 7/8 runtime-adapter's concern. This agent only needs to know *whether* the capability is callable at runtime.
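
The probe-then-branch logic in the new action contract can be sketched as two sequential, time-boxed calls. In the real agent the probes are tool invocations made by the model, so the code below only illustrates the control flow: `probeAvailable`, `detectTools`, the `mode` field, and the stub probes are all hypothetical.

```javascript
// Resolves to true if `probe()` succeeds within `ms`; resolves to false on
// rejection (for example a tool-not-available error) or on timeout.
function probeAvailable(probe, ms = 5000) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(false), ms);
  });
  const attempt = Promise.resolve()
    .then(probe)
    .then(() => true, () => false);
  return Promise.race([attempt, timeout]).finally(() => clearTimeout(timer));
}

// Runs both probes in order (5s each keeps the total within the 10s budget),
// then picks the branch described in step 3 of the contract.
async function detectTools({ webFetchProbe, context7Probe, timeoutMs = 5000 }) {
  const webfetch_available = await probeAvailable(webFetchProbe, timeoutMs);
  const context7_available = await probeAvailable(context7Probe, timeoutMs);
  const mode = webfetch_available || context7_available ? 'online' : 'offline-confirm';
  return { webfetch_available, context7_available, mode };
}

// Usage with stub probes: WebFetch answers, Context7 reports a missing tool.
detectTools({
  webFetchProbe: async () => 'Example Domain',
  context7Probe: async () => { throw new Error('tool-not-available'); },
}).then((result) => console.log(result));
```

Treating timeout and rejection identically mirrors the contract, where either outcome sets the flag to `false`.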
 ## Offline-Confirm Protocol (D-21)

package/bin/np-tools/learning-match.cjs CHANGED

@@ -4,7 +4,7 @@ const { NubosPilotError } = require('../../lib/core.cjs');
 const knowledgeAdapter = require('../../lib/knowledge-adapter.cjs');
 const args = require('./_args.cjs');
-function run(argv, ctx) {
+async function run(argv, ctx) {
   const context = ctx || {};
   const cwd = context.cwd || process.cwd();
   const stdout = context.stdout || process.stdout;
@@ -26,7 +26,7 @@ function run(argv, ctx) {
   if (limit !== undefined) opts.limit = Number(limit);
   const adapter = knowledgeAdapter.getAdapter(cwd);
-  const result = adapter.match(query, opts);
+  const result = await adapter.match(query, opts);
   stdout.write(JSON.stringify({
     adapter: adapter.name,
     query,