nubos-pilot 1.2.3 → 1.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (45)

package/CHANGELOG.md +24 -0
package/README.md +18 -1
package/SECURITY.md +3 -4
package/bin/np-tools/_commands.cjs +1 -0
package/bin/np-tools/learnings.cjs +5 -1
package/bin/np-tools/resolve-model.cjs +55 -1
package/bin/np-tools/resolve-model.test.cjs +139 -0
package/bin/np-tools/security.cjs +4 -1
package/bin/np-tools/spawn-headless.cjs +135 -2
package/bin/np-tools/spawn-headless.test.cjs +225 -40
package/bin/np-tools/spawn-offhost.cjs +93 -0
package/bin/np-tools/spawn-offhost.test.cjs +38 -0
package/lib/agents.cjs +16 -2
package/lib/config-schema.cjs +5 -1
package/lib/headless-guard.cjs +127 -0
package/lib/headless-guard.test.cjs +119 -0
package/lib/learnings/extract.cjs +4 -4
package/lib/learnings/extract.test.cjs +8 -8
package/lib/model-providers.cjs +118 -0
package/lib/model-providers.test.cjs +85 -0
package/lib/runtime/agent-loop.cjs +64 -0
package/lib/runtime/agent-loop.test.cjs +135 -0
package/lib/runtime/dispatch.cjs +174 -0
package/lib/runtime/dispatch.test.cjs +193 -0
package/lib/runtime/preflight.cjs +68 -0
package/lib/runtime/preflight.test.cjs +62 -0
package/lib/runtime/providers/openai-compat.cjs +102 -0
package/lib/runtime/providers/openai-compat.test.cjs +103 -0
package/lib/runtime/tools/index.cjs +415 -0
package/lib/runtime/tools/index.test.cjs +230 -0
package/lib/security/review.cjs +4 -4
package/lib/security/review.test.cjs +6 -6
package/np-tools.cjs +1 -0
package/package.json +1 -1
package/templates/claude/payload/hooks/np-learnings-hook.cjs +1 -0
package/templates/claude/payload/hooks/np-security-hook.cjs +1 -0
package/workflows/add-tests.md +41 -0
package/workflows/architect-phase.md +19 -0
package/workflows/discuss-phase.md +29 -10
package/workflows/execute-phase.md +93 -4
package/workflows/plan-phase.md +57 -16
package/workflows/research-phase.md +45 -0
package/workflows/scan-codebase.md +21 -3
package/workflows/validate-phase.md +30 -13
package/workflows/verify-work.md +17 -0

package/CHANGELOG.md CHANGED

@@ -4,6 +4,30 @@ All notable changes to nubos-pilot are documented in this file. Format
 follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/); versioning
 follows [SemVer](https://semver.org/spec/v2.0.0.html).
+## [1.3.0] — 2026-06-17
+Run any agent on any model, not only Claude.
+- Per-agent model routing: two new config blocks, `model_providers` and `agent_routing`, send each agent to a specific model in the same run — planner on Claude opus, critic on OpenAI gpt-4o, executor on a local Ollama model. Any provider that speaks the OpenAI `/v1/chat/completions` dialect (OpenAI, xAI/Grok, Ollama, vLLM, LM Studio, LiteLLM) is reached through one `fetch`-based client, with no SDK added. Both blocks are optional; without them, resolution and spawning behave exactly as before.
+- When the host can't route an agent to a non-native model — Claude Code's Agent tool only accepts Claude tiers — nubos-pilot runs the loop itself. It's a one-shot, zero-dependency tool-use harness: builds the prompt from `agents/<name>.md`, advertises the agent's tools as function schemas, runs the model's tool-calls against the workspace, and loops until a final answer. No daemon, the process exits when the loop returns.
+- The off-host path runs through the same guards as the Claude path: working-tree safety, commit-policy, output-schema lint, the Nubosloop Rule-9 audit, and in-session security review, all unchanged. Off-host file writes are confined through `safe-path`. Off-host Bash runs only inside a slice worktree and stays off until `workflow.worktree_isolation` is on.
+- Every workflow spawn now has an off-host branch — execute, plan, discuss, research, architect, validate, verify, scan. A test (`check-offhost-coverage`) walks the workflows and fails the suite if any spawn lacks one, so a new agent can't ship Claude-only by accident.
+- A preflight runs before any off-host spawn and fails loud: it checks the server is reachable, the model is present, and tool-calling works, then aborts with an actionable message (`run: ollama pull <model>`) instead of dying mid-task. A routing entry that names an undefined provider is a hard config error at load time, never a quiet fallback to Claude.
+Local models are weaker at multi-step tool-use than frontier Claude, so keep high-risk agents like the planner and security-reviewer on Claude — that's why the whole thing is opt-in. ADR-0021 has the full design.
+Full documentation at <https://pilot.nubos.cloud>.
+## [1.2.4] — 2026-06-15
+Fixed a recursion fault in the in-session hooks that could spawn an unbounded cascade of headless `claude -p` processes.
+- The Stop-hook security review and continuous-learning capture each spawn a headless `claude -p` to do their work. That headless run re-fires the same SessionStart/Stop hooks, which spawned another headless run, and so on — a fork bomb of `claude`, `np-tools` and duplicated MCP servers that survived closing the terminal. nubos-pilot now marks every headless spawn with `NUBOS_PILOT_HEADLESS=1` and a `NUBOS_PILOT_HOOK_DEPTH` counter; the hooks no-op immediately inside a headless run, so the chain stops at exactly one level.
+- Three independent guards back this up: the hook scripts and the `security`/`learnings` backends exit early when `NUBOS_PILOT_HEADLESS` is set; `spawn-headless` refuses to start a nested headless run (reentrancy + depth cap, default one level); and a per-agent lockfile under `.nubos-pilot/run/` bounds concurrent headless runs to one per agent even if the environment is not inherited. Headless runs already carry a hard timeout with SIGKILL, so a hung review cannot linger.
+- Escape hatch: the guard keys off `NUBOS_PILOT_HEADLESS`, set automatically on the spawned `claude` — do not set it in your own shell or the in-session hooks will silently no-op. Raise the depth cap with `NUBOS_PILOT_MAX_HOOK_DEPTH` only if you understand the recursion risk.
+Full documentation at <https://pilot.nubos.cloud>.
 ## [1.2.3] — 2026-06-14
 Three opt-in layers that make execution cheaper, more reliable, and self-improving.
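
The 1.3.0 entry above describes the off-host harness as a loop: send the prompt, run whatever tool calls the model asks for, feed the results back, and stop at the first reply with no tool calls. A minimal sketch of that shape follows, assuming the OpenAI chat-completions message format (`tool_calls` on the assistant message, `role: 'tool'` replies keyed by `tool_call_id`). The names `runAgentLoop`, `makeStubChat`, `chat`, `tools` and `maxTurns` are hypothetical; the real `lib/runtime/agent-loop.cjs` is not reproduced in this part of the diff and may differ.

```javascript
'use strict';

// Hypothetical sketch of a one-shot tool-use loop; not the package's code.
// `chat` is an injected async function that receives the message list and
// returns one assistant message: { role, content, tool_calls? }.
async function runAgentLoop({ chat, tools, systemPrompt, task, maxTurns = 8 }) {
  const messages = [
    { role: 'system', content: systemPrompt },
    { role: 'user', content: task },
  ];
  for (let turn = 0; turn < maxTurns; turn += 1) {
    const reply = await chat(messages);
    messages.push(reply);
    const calls = Array.isArray(reply.tool_calls) ? reply.tool_calls : [];
    // No tool calls means the model has produced its final answer.
    if (calls.length === 0) return { content: reply.content || '', turns: turn + 1 };
    for (const call of calls) {
      const impl = tools[call.function.name];
      let output;
      try {
        if (!impl) throw new Error('unknown tool: ' + call.function.name);
        // In this dialect the arguments arrive as a JSON-encoded string.
        output = await impl(JSON.parse(call.function.arguments || '{}'));
      } catch (err) {
        // Report the failure back to the model instead of aborting the loop.
        output = { error: String((err && err.message) || err) };
      }
      messages.push({ role: 'tool', tool_call_id: call.id, content: JSON.stringify(output) });
    }
  }
  throw new Error('agent loop exceeded ' + maxTurns + ' turns without a final answer');
}

// Stub "model" for trying the loop without a server: it requests one
// read_file call, then returns a final answer.
function makeStubChat() {
  let calls = 0;
  return async () => {
    calls += 1;
    if (calls === 1) {
      return {
        role: 'assistant',
        content: null,
        tool_calls: [{
          id: 'call_1',
          type: 'function',
          function: { name: 'read_file', arguments: '{"path":"README.md"}' },
        }],
      };
    }
    return { role: 'assistant', content: 'done' };
  };
}
```

Injecting `chat` keeps the loop independent of any one provider; a real client would presumably wrap a `fetch` call to `/v1/chat/completions`, as the changelog describes. The turn cap is an assumption added here so that a model which never stops calling tools cannot loop forever.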

package/README.md CHANGED

@@ -95,7 +95,7 @@ task(M001-S001-T0002): wire login handler
 ## Agents
-Thirteen spawnable subagents are installed into the host's agent directory (alongside three `np-critic-*` audit modules consumed by `np-critic`):
+Fourteen spawnable subagents are installed into the host's agent directory (alongside three `np-critic-*` audit modules consumed by `np-critic`):
 - `np-planner` (opus) — breaks a milestone into slices + tasks
 - `np-plan-checker` (opus) — adversarial goal-backward review before execution
@@ -109,6 +109,7 @@ Thirteen spawnable subagents are installed into the host's agent directory (alon
 - `np-critic` (sonnet) — Nubosloop critic; audits executor output across style, tests and acceptance
 - `np-verifier` (sonnet) — post-execution Pass/Fail/Defer per success_criterion
 - `np-nyquist-auditor` (haiku) — requirement test-coverage audit
+- `np-learnings-extractor` (haiku) — headless continuous-learning observer; distils reusable `{pattern, outcome}` learnings from a session's turn-diff
 - `np-security-reviewer` (sonnet) — OWASP-aligned read-only audit (manual spawn)
 Every spawn runs with an **explicit tier** (`haiku` / `sonnet` / `opus`) resolved to a concrete model via `np-tools.cjs resolve-model --profile <frontier|quality|balanced|budget|inherit>`.
@@ -169,6 +170,22 @@ load-bearing ones for users and contributors:
 See [`SECURITY.md`](./SECURITY.md) for the vulnerability disclosure policy
 and threat model.
+### Headless recursion guard
+The in-session security review and continuous-learning hooks do their work in
+a headless `claude -p` subprocess. To stop that subprocess from re-firing the
+same hooks (which would cascade into an unbounded fork of `claude`/`np-tools`
+processes), nubos-pilot sets `NUBOS_PILOT_HEADLESS=1` and a
+`NUBOS_PILOT_HOOK_DEPTH` counter on every headless spawn. The hooks no-op when
+`NUBOS_PILOT_HEADLESS` is set, `spawn-headless` refuses a nested or
+depth-exceeded spawn, and a per-agent lockfile under `.nubos-pilot/run/` bounds
+concurrent headless runs to one per agent.
+The guard is automatic — do not export `NUBOS_PILOT_HEADLESS` in your own
+shell, or the in-session hooks will silently do nothing. The depth cap is one
+level; override it with `NUBOS_PILOT_MAX_HOOK_DEPTH` only if you understand the
+recursion risk.
 ## Support
 - Bugs / features: [GitHub issues](https://github.com/Nubos-AI/nubos-pilot/issues)
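
The "Headless recursion guard" section added in the README hunk above comes down to two environment variables and a depth comparison. The sketch below illustrates that logic under stated assumptions: the function names match the `headlessGuard.*` calls visible elsewhere in this diff, but the bodies are guesses, and the actual `lib/headless-guard.cjs` (which also handles the per-agent lockfile) is not shown here.

```javascript
'use strict';

// Sketch only: names follow the headlessGuard.* calls in this diff; the
// bodies are assumptions, not the contents of lib/headless-guard.cjs.
const DEFAULT_MAX_DEPTH = 1; // the changelog states a default cap of one level

// Parse a non-negative integer from an env value, else use the fallback.
function _int(value, fallback) {
  const n = Number.parseInt(value, 10);
  return Number.isInteger(n) && n >= 0 ? n : fallback;
}

// True inside a headless run, where hooks and backends should no-op.
function isHeadless(env) {
  return env.NUBOS_PILOT_HEADLESS === '1';
}

function currentDepth(env) {
  return _int(env.NUBOS_PILOT_HOOK_DEPTH, 0);
}

function maxDepth(env) {
  return _int(env.NUBOS_PILOT_MAX_HOOK_DEPTH, DEFAULT_MAX_DEPTH);
}

// True once the current depth has reached the cap.
function depthExceeded(env) {
  return currentDepth(env) >= maxDepth(env);
}

// Variables to merge into the environment of the spawned `claude -p`.
function childSpawnEnv(env) {
  return {
    NUBOS_PILOT_HEADLESS: '1',
    NUBOS_PILOT_HOOK_DEPTH: String(currentDepth(env) + 1),
  };
}
```

With an empty parent environment, `childSpawnEnv` yields depth `1`, so any hook that fires inside the child sees both the headless marker and a depth at the default cap. That is the "exactly one level" behaviour the 1.2.4 changelog entry describes.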

package/SECURITY.md CHANGED

@@ -18,11 +18,10 @@ versions and announced in `CHANGELOG.md`.
 | Version | Supported |
 |---------|-----------|
-| 0.2.x   | ✅ active |
-| < 0.2   | ❌ end of life |
+| 1.3.x   | ✅ active |
+| < 1.3   | ❌ end of life |
-Only the latest minor on the current major receives security patches until
-1.0 is reached.
+Only the latest minor on the current major (1.x) receives security patches.
 ## Threat Model

package/bin/np-tools/_commands.cjs CHANGED

@@ -101,6 +101,7 @@ const COMMANDS = [
   { name: 'loop-audit-tool-use',     category: 'Execution', description: 'Record/read the tool-use audit per spawn (Completeness Rule 9 mechanical check)', description_de: 'Tool-use Audit pro Spawn schreiben/lesen (Completeness Rule 9 mechanische Prüfung)' },
   { name: 'loop-stuck',              category: 'Execution', description: 'Mark a task as stuck (writes loop-state + flips checkpoint status to stuck)', description_de: 'Markiert Task als stuck (schreibt Loop-State + setzt Checkpoint-Status auf stuck)' },
   { name: 'spawn-headless',          category: 'Execution', description: 'Spawn an agent as a headless `claude -p` subprocess (ADR-0010 §L6); writes stdout to --output-path and returns exit code', description_de: 'Spawnt einen Agent als headless `claude -p` Subprozess (ADR-0010 §L6); schreibt stdout nach --output-path und liefert Exit-Code' },
+  { name: 'spawn-offhost',           category: 'Execution', description: 'Run an agent routed to an openai-compat provider (Ollama/OpenAI/Grok) as a one-shot tool-use loop (ADR-0021). Args: --agent --task|--task-file [--allow-bash] [--read-only]. Preflights the endpoint, records metrics.', description_de: 'Führt einen auf einen openai-compat-Provider (Ollama/OpenAI/Grok) gerouteten Agent als One-Shot-Tool-Use-Loop aus (ADR-0021). Args: --agent --task|--task-file [--allow-bash] [--read-only]. Preflight des Endpoints, Metrics-Aufzeichnung.' },
   { name: 'security',                category: 'Review',    description: 'In-session security review hook backend (ADR-0020). Verbs: session-start | baseline | scan | review | commit | run-review. Reads the Claude Code hook payload via --stdin; non-blocking, report-once, independent reviewer spawn.', description_de: 'Backend für die In-Session-Security-Review-Hooks (ADR-0020). Verben: session-start | baseline | scan | review | commit | run-review. Liest die Claude-Code-Hook-Payload via --stdin; non-blocking, report-once, unabhängiger Reviewer-Spawn.' },
   { name: 'loop-metrics',            category: 'Utility',   description: 'Aggregate Nubosloop telemetry across all checkpoints (commits, stuck, route distribution)', description_de: 'Aggregiert Nubosloop-Telemetrie über alle Checkpoints (Commits, Stuck, Routing)' },
   { name: 'learning-log',            category: 'Execution', description: 'Persist a learning to the local store (or MCP adapter when configured)', description_de: 'Persistiert ein Learning im lokalen Store (oder MCP-Adapter falls konfiguriert)' },

package/bin/np-tools/learnings.cjs CHANGED

@@ -6,6 +6,7 @@ const child_process = require('node:child_process');
 const { tryReadConfigPath } = require('../../lib/config.cjs');
 const ledger = require('../../lib/learnings/capture-ledger.cjs');
 const extract = require('../../lib/learnings/extract.cjs');
+const headlessGuard = require('../../lib/headless-guard.cjs');
 const args = require('./_args.cjs');
 function _readStdin() {
@@ -60,6 +61,9 @@ async function run(argv, ctx) {
   const stdout = context.stdout || process.stdout;
   const list = Array.isArray(argv) ? argv : [];
   const verb = list[0];
+  if (headlessGuard.isHeadless(process.env)) return 0;
   const cfg = _cfg(cwd);
   // 'reset' (UserPromptSubmit) and 'run-extract' (background worker) are not
@@ -86,7 +90,7 @@ async function run(argv, ctx) {
   if (verb === 'run-extract') {
     const sid = args.getFlag(list, '--session') || '';
     try {
-      const result = extract.runExtract({ cwd, sid, config: cfg });
+      const result = await extract.runExtract({ cwd, sid, config: cfg });
       _emit(stdout, result);
     } catch (err) {
       _emit(stdout, { ran: false, reason: 'error', error: String(err && err.code || err) });

package/bin/np-tools/resolve-model.cjs CHANGED

@@ -2,6 +2,7 @@ const { NubosPilotError } = require('../../lib/core.cjs');
 const { readConfig, _CONFIG_PARSE_CODES } = require('../../lib/config.cjs');
 const { loadAgent, loadAgentModule } = require('../../lib/agents.cjs');
 const { resolve: resolveAlias, MODEL_ALIAS_MAP, VALID_TIERS } = require('../../lib/model-profiles.cjs');
+const { resolveProvider } = require('../../lib/model-providers.cjs');
 let _warnedCorruptOnce = false;
 function _readConfig(cwd) {
@@ -56,11 +57,13 @@ function resolveFromConfig({ agentOrTier, profileOverride, cwd, format }) {
   const config = _readConfig(cwd);
   let tier;
+  let agentName = null;
   if (VALID_TIERS.includes(agentOrTier)) {
     tier = agentOrTier;
   } else {
     const fm = _loadAgentForResolve(agentOrTier, cwd);
     tier = fm.tier;
+    agentName = agentOrTier;
     const override = _criticTierOverride(config, agentOrTier);
     if (override) tier = override;
   }
@@ -91,7 +94,32 @@ function resolveFromConfig({ agentOrTier, profileOverride, cwd, format }) {
     resolved = alias;
   }
-  return { tier, profile, alias, resolved, mode };
+  const prov = resolveProvider({ agentName, tier, config });
+  let providerModel = prov.model;
+  if (prov.kind === 'native') {
+    if (prov.model) {
+      resolved = prov.model;
+      mode = 'full-id';
+    } else {
+      providerModel = null;
+    }
+  } else {
+    resolved = prov.model;
+    mode = 'provider';
+  }
+  return {
+    tier,
+    profile,
+    alias,
+    resolved,
+    mode,
+    provider: prov.provider,
+    kind: prov.kind,
+    model: providerModel,
+    baseUrl: prov.baseUrl,
+    apiKeyEnv: prov.apiKeyEnv,
+  };
 }
 function run(argv) {
@@ -105,12 +133,18 @@ function run(argv) {
   const agentOrTier = args.shift();
   let profileOverride = null;
   let format = null;
+  let asJson = false;
+  let asKind = false;
   while (args.length) {
     const flag = args.shift();
     if (flag === '--profile') {
       profileOverride = args.shift();
     } else if (flag === '--format') {
       format = args.shift();
+    } else if (flag === '--json') {
+      asJson = true;
+    } else if (flag === '--kind') {
+      asKind = true;
     } else if (flag === '--raw') {
     }
@@ -122,6 +156,26 @@ function run(argv) {
       cwd: process.cwd(),
       format,
     });
+    if (asKind) {
+      process.stdout.write((out.kind || 'native') + '\n');
+      return 0;
+    }
+    if (asJson) {
+      process.stdout.write(JSON.stringify(out) + '\n');
+      return 0;
+    }
+    if (out.kind === 'openai-compat') {
+      process.stderr.write(
+        JSON.stringify({
+          code: 'off-host-not-on-native-path',
+          message: 'agent "' + agentOrTier + '" routes to provider "' + out.provider
+            + '" (model ' + out.model + '), which the native `claude` spawn path cannot run. '
+            + 'Run it off-host with: np-tools spawn-offhost --agent ' + agentOrTier + ' --task <…>',
+          details: { provider: out.provider, kind: out.kind, model: out.model },
+        }) + '\n',
+      );
+      return 1;
+    }
     process.stdout.write(out.resolved + '\n');
     return 0;
   } catch (err) {

package/bin/np-tools/resolve-model.test.cjs CHANGED

@@ -72,6 +72,11 @@ test('RM-1: tier branch with empty config returns alias mode, default balanced p
     alias: 'opus',
     resolved: 'opus',
     mode: 'alias',
+    provider: 'claude',
+    kind: 'native',
+    model: null,
+    baseUrl: null,
+    apiKeyEnv: null,
   });
 });
@@ -276,3 +281,137 @@ test('RM-18: module agent without override falls back to module frontmatter tier
   });
   assert.equal(out.tier, 'haiku');
 });
+test('RM-19: agent_routing to openai-compat resolves model from provider models table by tier', () => {
+  const cwd = _sandbox({
+    model_providers: {
+      default: 'claude',
+      claude: { kind: 'native' },
+      ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { opus: 'qwen2.5-coder:32b' } },
+    },
+    agent_routing: { 'np-planner': { provider: 'ollama' } },
+  }, { 'np-planner': _plannerAgent });
+  const out = subcmd.resolveFromConfig({ agentOrTier: 'np-planner', cwd });
+  assert.equal(out.provider, 'ollama');
+  assert.equal(out.kind, 'openai-compat');
+  assert.equal(out.model, 'qwen2.5-coder:32b');
+  assert.equal(out.resolved, 'qwen2.5-coder:32b');
+  assert.equal(out.mode, 'provider');
+});
+test('RM-20: explicit model pin in agent_routing beats provider models table', () => {
+  const cwd = _sandbox({
+    model_providers: {
+      ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { opus: 'fallback-model' } },
+    },
+    agent_routing: { 'np-planner': { provider: 'ollama', model: 'qwen3.5' } },
+  }, { 'np-planner': _plannerAgent });
+  const out = subcmd.resolveFromConfig({ agentOrTier: 'np-planner', cwd });
+  assert.equal(out.model, 'qwen3.5');
+  assert.equal(out.resolved, 'qwen3.5');
+});
+test('RM-21: native provider with an explicit model pin forces full-id mode', () => {
+  const cwd = _sandbox({
+    model_providers: { claude: { kind: 'native' } },
+    agent_routing: { 'np-planner': { provider: 'claude', model: 'claude-opus-4-7' } },
+  }, { 'np-planner': _plannerAgent });
+  const out = subcmd.resolveFromConfig({ agentOrTier: 'np-planner', cwd });
+  assert.equal(out.kind, 'native');
+  assert.equal(out.resolved, 'claude-opus-4-7');
+  assert.equal(out.mode, 'full-id');
+  assert.equal(out.model, 'claude-opus-4-7');
+});
+test('RM-22: glob routing key (np-critic*) matches a critic agent', () => {
+  const cwd = _sandbox({
+    model_providers: { openai: { kind: 'openai-compat', base_url: 'https://api.openai.com/v1', models: { sonnet: 'gpt-4o' } } },
+    agent_routing: { 'np-critic*': { provider: 'openai', model: 'gpt-4o' } },
+  }, { 'np-critic': '---\nname: np-critic\ndescription: x\ntier: sonnet\ntools: Read\n---\nbody' });
+  const out = subcmd.resolveFromConfig({ agentOrTier: 'np-critic', cwd });
+  assert.equal(out.provider, 'openai');
+  assert.equal(out.model, 'gpt-4o');
+});
+test('RM-23: routing to an undefined provider fails loud (no silent claude fallback)', () => {
+  const cwd = _sandbox({
+    model_providers: { claude: { kind: 'native' } },
+    agent_routing: { 'np-planner': { provider: 'ollama' } },
+  }, { 'np-planner': _plannerAgent });
+  let thrown = null;
+  try { subcmd.resolveFromConfig({ agentOrTier: 'np-planner', cwd }); } catch (e) { thrown = e; }
+  assert.ok(thrown);
+  assert.equal(thrown.name, 'NubosPilotError');
+  assert.equal(thrown.code, 'provider-undefined');
+});
+test('RM-24: openai-compat with no pin and no models[tier] fails loud', () => {
+  const cwd = _sandbox({
+    model_providers: { ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { haiku: 'small' } } },
+    agent_routing: { 'np-planner': { provider: 'ollama' } },
+  }, { 'np-planner': _plannerAgent });
+  let thrown = null;
+  try { subcmd.resolveFromConfig({ agentOrTier: 'np-planner', cwd }); } catch (e) { thrown = e; }
+  assert.ok(thrown);
+  assert.equal(thrown.code, 'provider-model-unresolved');
+});
+test('RM-25: bare tier never routes — stays on implicit claude-native default', () => {
+  const cwd = _sandbox({
+    model_providers: { ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { opus: 'qwen' } } },
+    agent_routing: { opus: { provider: 'ollama' } },
+  });
+  const out = subcmd.resolveFromConfig({ agentOrTier: 'opus', cwd });
+  assert.equal(out.provider, 'claude');
+  assert.equal(out.kind, 'native');
+  assert.equal(out.resolved, 'opus');
+});
+test('RM-27: run() refuses an off-host (openai-compat) agent loud — no model id on stdout', () => {
+  const root = _sandbox({
+    model_providers: { ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { opus: 'qwen2.5-coder:32b' } } },
+    agent_routing: { 'np-planner': { provider: 'ollama' } },
+  }, { 'np-planner': _plannerAgent });
+  const origCwd = process.cwd();
+  process.chdir(root);
+  try {
+    const cap = _captureStdout(() => subcmd.run(['np-planner']));
+    assert.equal(cap.rc, 1);
+    assert.equal(cap.stdout, '');
+    assert.match(cap.stderr, /off-host-not-on-native-path/);
+    assert.match(cap.stderr, /spawn-offhost/);
+  } finally {
+    process.chdir(origCwd);
+  }
+});
+test('RM-28: --json reports the full resolution and succeeds even for off-host', () => {
+  const root = _sandbox({
+    model_providers: { ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { opus: 'qwen2.5-coder:32b' } } },
+    agent_routing: { 'np-planner': { provider: 'ollama' } },
+  }, { 'np-planner': _plannerAgent });
+  const origCwd = process.cwd();
+  process.chdir(root);
+  try {
+    const cap = _captureStdout(() => subcmd.run(['np-planner', '--json']));
+    assert.equal(cap.rc, 0);
+    const out = JSON.parse(cap.stdout.trim());
+    assert.equal(out.kind, 'openai-compat');
+    assert.equal(out.provider, 'ollama');
+    assert.equal(out.baseUrl, 'http://localhost:11434/v1');
+  } finally {
+    process.chdir(origCwd);
+  }
+});
+test('RM-26: model_providers.default switches the default provider for all agents', () => {
+  const cwd = _sandbox({
+    model_providers: {
+      default: 'ollama',
+      ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { opus: 'qwen2.5-coder:32b' } },
+    },
+  }, { 'np-planner': _plannerAgent });
+  const out = subcmd.resolveFromConfig({ agentOrTier: 'np-planner', cwd });
+  assert.equal(out.provider, 'ollama');
+  assert.equal(out.model, 'qwen2.5-coder:32b');
+});
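
Taken together, tests RM-19 through RM-26 pin down the routing rules: an exact or glob key in `agent_routing` selects a provider, an explicit `model` pin beats the provider's `models[tier]` table, an undeclared provider or an unresolvable model is a hard error, and a bare tier is never routed. The sketch below is one way those rules could be satisfied. `resolveProviderSketch` is a hypothetical stand-in for the `resolveProvider` exported by `lib/model-providers.cjs`, which is not shown here; plain `Error` objects with a `code` stand in for `NubosPilotError`; the `api_key_env` config key is a guess.

```javascript
'use strict';

// Sketch only: mirrors the behaviour the RM-19..RM-26 tests assert, not the
// actual lib/model-providers.cjs.
function _fail(code, message) {
  const err = new Error(message);
  err.code = code; // stand-in for NubosPilotError's code field
  return err;
}

// Turn a routing key such as "np-critic*" into an anchored RegExp.
function _globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&').replace(/\*/g, '.*');
  return new RegExp('^' + escaped + '$');
}

function _findRoute(routing, agentName) {
  if (!agentName) return null; // a bare tier is never routed (RM-25)
  if (Object.prototype.hasOwnProperty.call(routing, agentName)) return routing[agentName];
  for (const key of Object.keys(routing)) {
    if (key.includes('*') && _globToRegExp(key).test(agentName)) return routing[key];
  }
  return null;
}

function resolveProviderSketch({ agentName, tier, config }) {
  const providers = (config && config.model_providers) || {};
  const route = _findRoute((config && config.agent_routing) || {}, agentName);
  // Assumption: `model_providers.default` applies only to named agents.
  const name = (route && route.provider) || (agentName && providers.default) || 'claude';
  // "claude" is treated as an implicit native provider even when undeclared.
  const def = providers[name] || (name === 'claude' ? { kind: 'native' } : null);
  if (!def) throw _fail('provider-undefined', 'agent routes to undefined provider "' + name + '"');
  const pin = (route && route.model) || null;
  if (def.kind === 'native') {
    return { provider: name, kind: 'native', model: pin, baseUrl: null, apiKeyEnv: null };
  }
  // An explicit pin wins over the per-tier models table (RM-20).
  const model = pin || (def.models && def.models[tier]) || null;
  if (!model) {
    throw _fail('provider-model-unresolved', 'no model for tier "' + tier + '" on provider "' + name + '"');
  }
  return {
    provider: name,
    kind: def.kind,
    model,
    baseUrl: def.base_url || null,
    apiKeyEnv: def.api_key_env || null, // config key name is an assumption
  };
}
```

Throwing on an undeclared provider, rather than falling back to `claude`, matches the "fails loud" wording of RM-23 and the changelog. Whether `model_providers.default` should also apply to a bare tier is not settled by the tests shown, so the sketch takes the conservative reading.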

package/bin/np-tools/security.cjs CHANGED

@@ -8,6 +8,7 @@ const { tryReadConfigPath } = require('../../lib/config.cjs');
 const scan = require('../../lib/security/scan.cjs');
 const ledger = require('../../lib/security/ledger.cjs');
 const review = require('../../lib/security/review.cjs');
+const headlessGuard = require('../../lib/headless-guard.cjs');
 const args = require('./_args.cjs');
 const COMMIT_RE = /\bgit\b[\s\S]*\b(commit|push)\b/;
@@ -93,6 +94,8 @@ async function run(argv, ctx) {
   const list = Array.isArray(argv) ? argv : [];
   const verb = list[0];
+  if (headlessGuard.isHeadless(process.env)) return 0;
   const cfg = _cfg(cwd);
   if (!cfg.enabled && verb !== 'run-review') return 0;
@@ -167,7 +170,7 @@ async function run(argv, ctx) {
   if (verb === 'run-review') {
     if (!cfg.enabled || !sid) return 0;
     const mode = args.getFlag(list, '--mode') === 'commit' ? 'commit' : 'stop';
-    try { review.runReview({ cwd, sid, mode, config: { ...cfg, guidance_path: _resolveRel(cwd, cfg.guidance_path) } }); } catch {}
+    try { await review.runReview({ cwd, sid, mode, config: { ...cfg, guidance_path: _resolveRel(cwd, cfg.guidance_path) } }); } catch {}
     return 0;
   }

package/bin/np-tools/spawn-headless.cjs CHANGED

@@ -8,6 +8,7 @@ const child_process = require('node:child_process');
 const { NubosPilotError, atomicWriteFileSync, appendJsonl, findProjectRoot } = require('../../lib/core.cjs');
 const runContext = require('../../lib/run-context.cjs');
 const safePath = require('../../lib/safe-path.cjs');
+const headlessGuard = require('../../lib/headless-guard.cjs');
 const args = require('./_args.cjs');
 function _sha256(s) {
@@ -165,12 +166,109 @@ function _stripFrontmatter(md) {
   return md.slice(end + 5);
 }
-function run(argv, ctx) {
+function _defaultResolveKind(agent, cwd) {
+  const { resolveFromConfig } = require('./resolve-model.cjs');
+  return resolveFromConfig({ agentOrTier: agent, cwd });
+}
+async function _runOffHost(o) {
+  const dispatch = o.dispatchImpl || require('../../lib/runtime/dispatch.cjs').dispatchOffHost;
+  const startedAt = o.startedAt;
+  let result = null;
+  let errObj = null;
+  try {
+    result = await dispatch({ agent: o.agent, task: o.userPrompt, cwd: o.cwd });
+  } catch (err) {
+    errObj = {
+      code: (err && err.code) || 'spawn-headless-offhost-failed',
+      message: (err && err.message) || 'off-host dispatch failed',
+    };
+  }
+  const endedAt = new Date().toISOString();
+  const content = result ? (result.content || '') : '';
+  const envelope = JSON.stringify({
+    type: 'result',
+    result: content,
+    model: result ? result.model : null,
+    provider: result ? result.provider : null,
+    is_error: !!errObj,
+  });
+  atomicWriteFileSync(o.resolvedOutput, envelope, 'utf-8', 0o600);
+  const exitCode = errObj ? 2 : 0;
+  const spawnTrailPath = _spawnTrailPath(o.cwd);
+  const spawnRecord = {
+    run_id: o.runId,
+    started_at: startedAt,
+    ended_at: endedAt,
+    duration_ms: Math.max(0, Date.parse(endedAt) - Date.parse(startedAt)),
+    agent: o.agent,
+    bin: 'off-host:' + (result ? result.provider : 'unknown'),
+    exit_code: exitCode,
+    timed_out: false,
+    prompt_sha256: _sha256(o.userPrompt),
+    prompt_bytes: Buffer.byteLength(o.userPrompt || '', 'utf-8'),
+    response_sha256: _sha256(content),
+    response_bytes: Buffer.byteLength(content, 'utf-8'),
+    stderr_excerpt: errObj ? _redactSecrets(String(errObj.message)).slice(-STDERR_TAIL_BYTES) : '',
+    model_actual: result ? result.model : null,
+    tokens_in: null,
+    tokens_out: null,
+    payload_parse_ok: !errObj,
+    runtime: result ? result.provider : 'off-host',
+  };
+  try { appendJsonl(spawnTrailPath, spawnRecord, { maxLineBytes: 16 * 1024, mode: 0o600 }); }
+  catch (err) {
+    throw new NubosPilotError(
+      'spawn-headless-audit-persist-failed',
+      'spawn-trail append failed; refusing to commit response without audit record',
+      { file: 'spawns.jsonl', cause: (err && err.code) || 'unknown' },
+    );
+  }
+  const payload = {
+    agent: o.agent,
+    output_path: o.outputPath,
+    output_path_resolved: o.resolvedOutput,
+    exit_code: exitCode,
+    stderr_excerpt: spawnRecord.stderr_excerpt,
+    bin: spawnRecord.bin,
+    timed_out: false,
+    run_id: o.runId,
+    spawn_trail_path: spawnTrailPath,
+    model_actual: spawnRecord.model_actual,
+    tokens_in: null,
+    tokens_out: null,
+    payload_parse_ok: spawnRecord.payload_parse_ok,
+    off_host: true,
+  };
+  o.stdout.write(JSON.stringify(payload) + '\n');
+  return exitCode;
+}
+async function run(argv, ctx) {
   const context = ctx || {};
   const cwd = context.cwd || process.cwd();
   const stdout = context.stdout || process.stdout;
   const list = Array.isArray(argv) ? argv : [];
+  if (headlessGuard.isHeadless(process.env)) {
+    throw new NubosPilotError(
+      'spawn-headless-reentrant',
+      'refusing to spawn a nested headless `claude` (NUBOS_PILOT_HEADLESS is set) — recursion guard',
+      { depth: headlessGuard.currentDepth(process.env) },
+    );
+  }
+  if (headlessGuard.depthExceeded(process.env)) {
+    throw new NubosPilotError(
+      'spawn-headless-depth-exceeded',
+      'refusing to spawn headless `claude`: hook depth ' + headlessGuard.currentDepth(process.env)
+        + ' has reached the cap ' + headlessGuard.maxDepth(process.env) + ' (recursion guard)',
+      { depth: headlessGuard.currentDepth(process.env), max: headlessGuard.maxDepth(process.env) },
+    );
+  }
   const agent = args.getFlag(list, '--agent');
   if (!agent) {
     throw new NubosPilotError(
@@ -213,6 +311,39 @@ function run(argv, ctx) {
   const runId = runContext.getRunId();
+  // ADR-0021: off-host routing. If the agent routes to an openai-compat provider,
+  // run the nubos-pilot dispatch loop instead of `claude -p`, writing the result
+  // in the same {result,...} envelope `claude --output-format json` produces so the
+  // callers (review/extract parseXxxOutput → outer.result) parse it unchanged. The
+  // native path below is untouched. run() is async; the CLI dispatcher already
+  // awaits returned promises (handleRunResult), and the two in-process callers
+  // (review.cjs / extract.cjs) await runReview/runExtract.
+  const resolveKind = context.resolveImpl || _defaultResolveKind;
+  let routed;
+  try { routed = resolveKind(agent, cwd); } catch { routed = { kind: 'native' }; }
+  if (routed && routed.kind === 'openai-compat') {
+    return _runOffHost({
+      agent, userPrompt, resolvedOutput, outputPath, cwd, runId, stdout,
+      startedAt: new Date().toISOString(),
+      dispatchImpl: context.dispatchImpl,
+    });
+  }
+  let lockRoot;
+  try { lockRoot = findProjectRoot(cwd); }
+  catch { lockRoot = cwd; }
+  const lock = headlessGuard.tryAcquireSpawnLock(lockRoot, agent);
+  if (!lock.acquired) {
+    throw new NubosPilotError(
+      'spawn-headless-locked',
+      'another headless run for agent `' + agent + '` is already active in this project (concurrency guard)',
+      { agent, holder: lock.holder || null },
+    );
+  }
+  const childEnv = _filterSpawnEnv(process.env);
+  Object.assign(childEnv, headlessGuard.childSpawnEnv(process.env));
   const bin = _claudeBinary();
   const claudeArgs = ['-p', '--output-format', 'json'];
   const startedAt = new Date().toISOString();
@@ -224,7 +355,7 @@ function run(argv, ctx) {
       timeout: timeoutMs,
       maxBuffer: 64 * 1024 * 1024,
       encoding: 'utf-8',
-      env: _filterSpawnEnv(process.env),
+      env: childEnv,
       killSignal: 'SIGKILL',
     });
   } catch (err) {
@@ -233,6 +364,8 @@ function run(argv, ctx) {
       'failed to spawn `' + bin + '`: ' + (err && err.message),
       { bin, cause: err && err.code },
     );
+  } finally {
+    lock.release();
   }
   if (result.error && result.error.code === 'ENOENT') {
     throw new NubosPilotError(