nubos-pilot 1.2.3 → 1.3.0

Files changed (45)
  1. package/CHANGELOG.md +24 -0
  2. package/README.md +18 -1
  3. package/SECURITY.md +3 -4
  4. package/bin/np-tools/_commands.cjs +1 -0
  5. package/bin/np-tools/learnings.cjs +5 -1
  6. package/bin/np-tools/resolve-model.cjs +55 -1
  7. package/bin/np-tools/resolve-model.test.cjs +139 -0
  8. package/bin/np-tools/security.cjs +4 -1
  9. package/bin/np-tools/spawn-headless.cjs +135 -2
  10. package/bin/np-tools/spawn-headless.test.cjs +225 -40
  11. package/bin/np-tools/spawn-offhost.cjs +93 -0
  12. package/bin/np-tools/spawn-offhost.test.cjs +38 -0
  13. package/lib/agents.cjs +16 -2
  14. package/lib/config-schema.cjs +5 -1
  15. package/lib/headless-guard.cjs +127 -0
  16. package/lib/headless-guard.test.cjs +119 -0
  17. package/lib/learnings/extract.cjs +4 -4
  18. package/lib/learnings/extract.test.cjs +8 -8
  19. package/lib/model-providers.cjs +118 -0
  20. package/lib/model-providers.test.cjs +85 -0
  21. package/lib/runtime/agent-loop.cjs +64 -0
  22. package/lib/runtime/agent-loop.test.cjs +135 -0
  23. package/lib/runtime/dispatch.cjs +174 -0
  24. package/lib/runtime/dispatch.test.cjs +193 -0
  25. package/lib/runtime/preflight.cjs +68 -0
  26. package/lib/runtime/preflight.test.cjs +62 -0
  27. package/lib/runtime/providers/openai-compat.cjs +102 -0
  28. package/lib/runtime/providers/openai-compat.test.cjs +103 -0
  29. package/lib/runtime/tools/index.cjs +415 -0
  30. package/lib/runtime/tools/index.test.cjs +230 -0
  31. package/lib/security/review.cjs +4 -4
  32. package/lib/security/review.test.cjs +6 -6
  33. package/np-tools.cjs +1 -0
  34. package/package.json +1 -1
  35. package/templates/claude/payload/hooks/np-learnings-hook.cjs +1 -0
  36. package/templates/claude/payload/hooks/np-security-hook.cjs +1 -0
  37. package/workflows/add-tests.md +41 -0
  38. package/workflows/architect-phase.md +19 -0
  39. package/workflows/discuss-phase.md +29 -10
  40. package/workflows/execute-phase.md +93 -4
  41. package/workflows/plan-phase.md +57 -16
  42. package/workflows/research-phase.md +45 -0
  43. package/workflows/scan-codebase.md +21 -3
  44. package/workflows/validate-phase.md +30 -13
  45. package/workflows/verify-work.md +17 -0
package/CHANGELOG.md CHANGED
@@ -4,6 +4,30 @@ All notable changes to nubos-pilot are documented in this file. Format
4
4
  follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/); versioning
5
5
  follows [SemVer](https://semver.org/spec/v2.0.0.html).
6
6
 
7
+ ## [1.3.0] — 2026-06-17
8
+
9
+ Run any agent on any model, not only Claude.
10
+
11
+ - Per-agent model routing: two new config blocks, `model_providers` and `agent_routing`, send each agent to a specific model in the same run — planner on Claude opus, critic on OpenAI gpt-4o, executor on a local Ollama model. Any provider that speaks the OpenAI `/v1/chat/completions` dialect (OpenAI, xAI/Grok, Ollama, vLLM, LM Studio, LiteLLM) is reached through one `fetch`-based client, with no SDK added. Both blocks are optional; without them, resolution and spawning behave exactly as before.
12
+ - When the host can't route an agent to a non-native model — Claude Code's Agent tool only accepts Claude tiers — nubos-pilot runs the loop itself. It's a one-shot, zero-dependency tool-use harness: builds the prompt from `agents/<name>.md`, advertises the agent's tools as function schemas, runs the model's tool-calls against the workspace, and loops until a final answer. No daemon, the process exits when the loop returns.
13
+ - The off-host path runs through the same guards as the Claude path: working-tree safety, commit-policy, output-schema lint, the Nubosloop Rule-9 audit, and in-session security review, all unchanged. Off-host file writes are confined through `safe-path`. Off-host Bash runs only inside a slice worktree and stays off until `workflow.worktree_isolation` is on.
14
+ - Every workflow spawn now has an off-host branch — execute, plan, discuss, research, architect, validate, verify, scan. A test (`check-offhost-coverage`) walks the workflows and fails the suite if any spawn lacks one, so a new agent can't ship Claude-only by accident.
15
+ - A preflight runs before any off-host spawn and fails loud: it checks the server is reachable, the model is present, and tool-calling works, then aborts with an actionable message (`run: ollama pull <model>`) instead of dying mid-task. A routing entry that names an undefined provider is a hard config error at load time, never a quiet fallback to Claude.
16
+
17
+ Local models are weaker at multi-step tool-use than frontier Claude, so keep high-risk agents like the planner and security-reviewer on Claude — that's why the whole thing is opt-in. ADR-0021 has the full design.
18
+
19
+ Full documentation at <https://pilot.nubos.cloud>.
20
+
21
+ ## [1.2.4] — 2026-06-15
22
+
23
+ Fixed a recursion fault in the in-session hooks that could spawn an unbounded cascade of headless `claude -p` processes.
24
+
25
+ - The Stop-hook security review and continuous-learning capture each spawn a headless `claude -p` to do their work. That headless run re-fires the same SessionStart/Stop hooks, which spawned another headless run, and so on — a fork bomb of `claude`, `np-tools` and duplicated MCP servers that survived closing the terminal. nubos-pilot now marks every headless spawn with `NUBOS_PILOT_HEADLESS=1` and a `NUBOS_PILOT_HOOK_DEPTH` counter; the hooks no-op immediately inside a headless run, so the chain stops at exactly one level.
26
+ - Three independent guards back this up: the hook scripts and the `security`/`learnings` backends exit early when `NUBOS_PILOT_HEADLESS` is set; `spawn-headless` refuses to start a nested headless run (reentrancy + depth cap, default one level); and a per-agent lockfile under `.nubos-pilot/run/` bounds concurrent headless runs to one per agent even if the environment is not inherited. Headless runs already carry a hard timeout with SIGKILL, so a hung review cannot linger.
27
+ - Escape hatch: the guard keys off `NUBOS_PILOT_HEADLESS`, set automatically on the spawned `claude` — do not set it in your own shell or the in-session hooks will silently no-op. Raise the depth cap with `NUBOS_PILOT_MAX_HOOK_DEPTH` only if you understand the recursion risk.
28
+
29
+ Full documentation at <https://pilot.nubos.cloud>.
30
+
7
31
  ## [1.2.3] — 2026-06-14
8
32
 
9
33
  Three opt-in layers that make execution cheaper, more reliable, and self-improving.
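The 1.3.0 entry above introduces two optional config blocks, `model_providers` and `agent_routing`, and says that a routing entry naming an undefined provider is a hard config error. The sketch below shows roughly what such a config might look like. The key names (`kind`, `base_url`, `models`, `provider`) are taken from the test fixtures in this diff; the agent names and model ids are placeholders, and `validateRouting` is a hypothetical helper written for illustration, not the package's loader.

```javascript
// Illustrative config only. Key names follow the resolve-model test fixtures
// in this diff; agent names and model ids are placeholders.
const config = {
  model_providers: {
    default: 'claude',
    claude: { kind: 'native' },
    ollama: {
      kind: 'openai-compat',
      base_url: 'http://localhost:11434/v1',
      models: { opus: 'qwen2.5-coder:32b' },
    },
  },
  agent_routing: {
    'np-planner': { provider: 'claude' },
    'np-executor': { provider: 'ollama' },
  },
};

// Hypothetical helper (not part of nubos-pilot): lists every routing entry
// whose provider has no definition under model_providers.
function validateRouting(cfg) {
  const providers = cfg.model_providers || {};
  const routing = cfg.agent_routing || {};
  const problems = [];
  for (const [agent, route] of Object.entries(routing)) {
    const entry = providers[route.provider];
    // `default` holds a provider name (a string), so it never counts as a definition.
    if (!entry || typeof entry !== 'object') {
      problems.push(agent + ' -> ' + route.provider);
    }
  }
  return problems;
}
```

A loader in this style would presumably throw when the returned list is non-empty rather than fall back to Claude, which is the behaviour the changelog describes.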
package/README.md CHANGED
@@ -95,7 +95,7 @@ task(M001-S001-T0002): wire login handler
95
95
 
96
96
  ## Agents
97
97
 
98
- Thirteen spawnable subagents are installed into the host's agent directory (alongside three `np-critic-*` audit modules consumed by `np-critic`):
98
+ Fourteen spawnable subagents are installed into the host's agent directory (alongside three `np-critic-*` audit modules consumed by `np-critic`):
99
99
 
100
100
  - `np-planner` (opus) — breaks a milestone into slices + tasks
101
101
  - `np-plan-checker` (opus) — adversarial goal-backward review before execution
@@ -109,6 +109,7 @@ Thirteen spawnable subagents are installed into the host's agent directory (alon
109
109
  - `np-critic` (sonnet) — Nubosloop critic; audits executor output across style, tests and acceptance
110
110
  - `np-verifier` (sonnet) — post-execution Pass/Fail/Defer per success_criterion
111
111
  - `np-nyquist-auditor` (haiku) — requirement test-coverage audit
112
+ - `np-learnings-extractor` (haiku) — headless continuous-learning observer; distils reusable `{pattern, outcome}` learnings from a session's turn-diff
112
113
  - `np-security-reviewer` (sonnet) — OWASP-aligned read-only audit (manual spawn)
113
114
 
114
115
  Every spawn runs with an **explicit tier** (`haiku` / `sonnet` / `opus`) resolved to a concrete model via `np-tools.cjs resolve-model --profile <frontier|quality|balanced|budget|inherit>`.
@@ -169,6 +170,22 @@ load-bearing ones for users and contributors:
169
170
  See [`SECURITY.md`](./SECURITY.md) for the vulnerability disclosure policy
170
171
  and threat model.
171
172
 
173
+ ### Headless recursion guard
174
+
175
+ The in-session security review and continuous-learning hooks do their work in
176
+ a headless `claude -p` subprocess. To stop that subprocess from re-firing the
177
+ same hooks (which would cascade into an unbounded fork of `claude`/`np-tools`
178
+ processes), nubos-pilot sets `NUBOS_PILOT_HEADLESS=1` and a
179
+ `NUBOS_PILOT_HOOK_DEPTH` counter on every headless spawn. The hooks no-op when
180
+ `NUBOS_PILOT_HEADLESS` is set, `spawn-headless` refuses a nested or
181
+ depth-exceeded spawn, and a per-agent lockfile under `.nubos-pilot/run/` bounds
182
+ concurrent headless runs to one per agent.
183
+
184
+ The guard is automatic — do not export `NUBOS_PILOT_HEADLESS` in your own
185
+ shell, or the in-session hooks will silently do nothing. The depth cap is one
186
+ level; override it with `NUBOS_PILOT_MAX_HOOK_DEPTH` only if you understand the
187
+ recursion risk.
188
+
172
189
  ## Support
173
190
 
174
191
  - Bugs / features: [GitHub issues](https://github.com/Nubos-AI/nubos-pilot/issues)
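The headless recursion guard described in the README hunk above can be pictured as a handful of small functions over the process environment. The function names mirror the calls visible elsewhere in this diff (`isHeadless`, `currentDepth`, `maxDepth`, `depthExceeded`, `childSpawnEnv`), but the bodies below are assumptions: `lib/headless-guard.cjs` is not shown in this part of the diff, so treat this as a sketch of the idea rather than the shipped logic.

```javascript
// Hypothetical re-implementation for illustration; the real logic lives in
// lib/headless-guard.cjs, which is not shown here.
const DEFAULT_MAX_DEPTH = 1; // the README states the default cap is one level

// Assumption: the marker is the literal string '1'.
function isHeadless(env) {
  return env.NUBOS_PILOT_HEADLESS === '1';
}

// Missing or malformed counters are treated as depth 0.
function currentDepth(env) {
  const n = Number.parseInt(env.NUBOS_PILOT_HOOK_DEPTH || '0', 10);
  return Number.isInteger(n) && n > 0 ? n : 0;
}

// NUBOS_PILOT_MAX_HOOK_DEPTH overrides the cap when it is a positive integer.
function maxDepth(env) {
  const n = Number.parseInt(env.NUBOS_PILOT_MAX_HOOK_DEPTH || '', 10);
  return Number.isInteger(n) && n > 0 ? n : DEFAULT_MAX_DEPTH;
}

function depthExceeded(env) {
  return currentDepth(env) >= maxDepth(env);
}

// Variables to merge into the child's environment: mark it headless and
// increment the depth counter.
function childSpawnEnv(env) {
  return {
    NUBOS_PILOT_HEADLESS: '1',
    NUBOS_PILOT_HOOK_DEPTH: String(currentDepth(env) + 1),
  };
}
```

With these definitions a top-level session (empty environment) is allowed to spawn once, while the spawned child sees both the marker and a depth of 1 and is refused, so the chain stops after one level.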
package/SECURITY.md CHANGED
@@ -18,11 +18,10 @@ versions and announced in `CHANGELOG.md`.
18
18
 
19
19
  | Version | Supported |
20
20
  |---------|-----------|
21
- | 0.2.x | ✅ active |
22
- | < 0.2 | ❌ end of life |
21
+ | 1.3.x | ✅ active |
22
+ | < 1.3 | ❌ end of life |
23
23
 
24
- Only the latest minor on the current major receives security patches until
25
- 1.0 is reached.
24
+ Only the latest minor on the current major (1.x) receives security patches.
26
25
 
27
26
  ## Threat Model
28
27
 
package/bin/np-tools/_commands.cjs CHANGED
@@ -101,6 +101,7 @@ const COMMANDS = [
101
101
  { name: 'loop-audit-tool-use', category: 'Execution', description: 'Record/read the tool-use audit per spawn (Completeness Rule 9 mechanical check)', description_de: 'Tool-use Audit pro Spawn schreiben/lesen (Completeness Rule 9 mechanische Prüfung)' },
102
102
  { name: 'loop-stuck', category: 'Execution', description: 'Mark a task as stuck (writes loop-state + flips checkpoint status to stuck)', description_de: 'Markiert Task als stuck (schreibt Loop-State + setzt Checkpoint-Status auf stuck)' },
103
103
  { name: 'spawn-headless', category: 'Execution', description: 'Spawn an agent as a headless `claude -p` subprocess (ADR-0010 §L6); writes stdout to --output-path and returns exit code', description_de: 'Spawnt einen Agent als headless `claude -p` Subprozess (ADR-0010 §L6); schreibt stdout nach --output-path und liefert Exit-Code' },
104
+ { name: 'spawn-offhost', category: 'Execution', description: 'Run an agent routed to an openai-compat provider (Ollama/OpenAI/Grok) as a one-shot tool-use loop (ADR-0021). Args: --agent --task|--task-file [--allow-bash] [--read-only]. Preflights the endpoint, records metrics.', description_de: 'Führt einen auf einen openai-compat-Provider (Ollama/OpenAI/Grok) gerouteten Agent als One-Shot-Tool-Use-Loop aus (ADR-0021). Args: --agent --task|--task-file [--allow-bash] [--read-only]. Preflight des Endpoints, Metrics-Aufzeichnung.' },
104
105
  { name: 'security', category: 'Review', description: 'In-session security review hook backend (ADR-0020). Verbs: session-start | baseline | scan | review | commit | run-review. Reads the Claude Code hook payload via --stdin; non-blocking, report-once, independent reviewer spawn.', description_de: 'Backend für die In-Session-Security-Review-Hooks (ADR-0020). Verben: session-start | baseline | scan | review | commit | run-review. Liest die Claude-Code-Hook-Payload via --stdin; non-blocking, report-once, unabhängiger Reviewer-Spawn.' },
105
106
  { name: 'loop-metrics', category: 'Utility', description: 'Aggregate Nubosloop telemetry across all checkpoints (commits, stuck, route distribution)', description_de: 'Aggregiert Nubosloop-Telemetrie über alle Checkpoints (Commits, Stuck, Routing)' },
106
107
  { name: 'learning-log', category: 'Execution', description: 'Persist a learning to the local store (or MCP adapter when configured)', description_de: 'Persistiert ein Learning im lokalen Store (oder MCP-Adapter falls konfiguriert)' },
package/bin/np-tools/learnings.cjs CHANGED
@@ -6,6 +6,7 @@ const child_process = require('node:child_process');
6
6
  const { tryReadConfigPath } = require('../../lib/config.cjs');
7
7
  const ledger = require('../../lib/learnings/capture-ledger.cjs');
8
8
  const extract = require('../../lib/learnings/extract.cjs');
9
+ const headlessGuard = require('../../lib/headless-guard.cjs');
9
10
  const args = require('./_args.cjs');
10
11
 
11
12
  function _readStdin() {
@@ -60,6 +61,9 @@ async function run(argv, ctx) {
60
61
  const stdout = context.stdout || process.stdout;
61
62
  const list = Array.isArray(argv) ? argv : [];
62
63
  const verb = list[0];
64
+
65
+ if (headlessGuard.isHeadless(process.env)) return 0;
66
+
63
67
  const cfg = _cfg(cwd);
64
68
 
65
69
  // 'reset' (UserPromptSubmit) and 'run-extract' (background worker) are not
@@ -86,7 +90,7 @@ async function run(argv, ctx) {
86
90
  if (verb === 'run-extract') {
87
91
  const sid = args.getFlag(list, '--session') || '';
88
92
  try {
89
- const result = extract.runExtract({ cwd, sid, config: cfg });
93
+ const result = await extract.runExtract({ cwd, sid, config: cfg });
90
94
  _emit(stdout, result);
91
95
  } catch (err) {
92
96
  _emit(stdout, { ran: false, reason: 'error', error: String(err && err.code || err) });
package/bin/np-tools/resolve-model.cjs CHANGED
@@ -2,6 +2,7 @@ const { NubosPilotError } = require('../../lib/core.cjs');
2
2
  const { readConfig, _CONFIG_PARSE_CODES } = require('../../lib/config.cjs');
3
3
  const { loadAgent, loadAgentModule } = require('../../lib/agents.cjs');
4
4
  const { resolve: resolveAlias, MODEL_ALIAS_MAP, VALID_TIERS } = require('../../lib/model-profiles.cjs');
5
+ const { resolveProvider } = require('../../lib/model-providers.cjs');
5
6
 
6
7
  let _warnedCorruptOnce = false;
7
8
  function _readConfig(cwd) {
@@ -56,11 +57,13 @@ function resolveFromConfig({ agentOrTier, profileOverride, cwd, format }) {
56
57
  const config = _readConfig(cwd);
57
58
 
58
59
  let tier;
60
+ let agentName = null;
59
61
  if (VALID_TIERS.includes(agentOrTier)) {
60
62
  tier = agentOrTier;
61
63
  } else {
62
64
  const fm = _loadAgentForResolve(agentOrTier, cwd);
63
65
  tier = fm.tier;
66
+ agentName = agentOrTier;
64
67
  const override = _criticTierOverride(config, agentOrTier);
65
68
  if (override) tier = override;
66
69
  }
@@ -91,7 +94,32 @@ function resolveFromConfig({ agentOrTier, profileOverride, cwd, format }) {
91
94
  resolved = alias;
92
95
  }
93
96
 
94
- return { tier, profile, alias, resolved, mode };
97
+ const prov = resolveProvider({ agentName, tier, config });
98
+ let providerModel = prov.model;
99
+ if (prov.kind === 'native') {
100
+ if (prov.model) {
101
+ resolved = prov.model;
102
+ mode = 'full-id';
103
+ } else {
104
+ providerModel = null;
105
+ }
106
+ } else {
107
+ resolved = prov.model;
108
+ mode = 'provider';
109
+ }
110
+
111
+ return {
112
+ tier,
113
+ profile,
114
+ alias,
115
+ resolved,
116
+ mode,
117
+ provider: prov.provider,
118
+ kind: prov.kind,
119
+ model: providerModel,
120
+ baseUrl: prov.baseUrl,
121
+ apiKeyEnv: prov.apiKeyEnv,
122
+ };
95
123
  }
96
124
 
97
125
  function run(argv) {
@@ -105,12 +133,18 @@ function run(argv) {
105
133
  const agentOrTier = args.shift();
106
134
  let profileOverride = null;
107
135
  let format = null;
136
+ let asJson = false;
137
+ let asKind = false;
108
138
  while (args.length) {
109
139
  const flag = args.shift();
110
140
  if (flag === '--profile') {
111
141
  profileOverride = args.shift();
112
142
  } else if (flag === '--format') {
113
143
  format = args.shift();
144
+ } else if (flag === '--json') {
145
+ asJson = true;
146
+ } else if (flag === '--kind') {
147
+ asKind = true;
114
148
  } else if (flag === '--raw') {
115
149
 
116
150
  }
@@ -122,6 +156,26 @@ function run(argv) {
122
156
  cwd: process.cwd(),
123
157
  format,
124
158
  });
159
+ if (asKind) {
160
+ process.stdout.write((out.kind || 'native') + '\n');
161
+ return 0;
162
+ }
163
+ if (asJson) {
164
+ process.stdout.write(JSON.stringify(out) + '\n');
165
+ return 0;
166
+ }
167
+ if (out.kind === 'openai-compat') {
168
+ process.stderr.write(
169
+ JSON.stringify({
170
+ code: 'off-host-not-on-native-path',
171
+ message: 'agent "' + agentOrTier + '" routes to provider "' + out.provider
172
+ + '" (model ' + out.model + '), which the native `claude` spawn path cannot run. '
173
+ + 'Run it off-host with: np-tools spawn-offhost --agent ' + agentOrTier + ' --task <…>',
174
+ details: { provider: out.provider, kind: out.kind, model: out.model },
175
+ }) + '\n',
176
+ );
177
+ return 1;
178
+ }
125
179
  process.stdout.write(out.resolved + '\n');
126
180
  return 0;
127
181
  } catch (err) {
package/bin/np-tools/resolve-model.test.cjs CHANGED
@@ -72,6 +72,11 @@ test('RM-1: tier branch with empty config returns alias mode, default balanced p
72
72
  alias: 'opus',
73
73
  resolved: 'opus',
74
74
  mode: 'alias',
75
+ provider: 'claude',
76
+ kind: 'native',
77
+ model: null,
78
+ baseUrl: null,
79
+ apiKeyEnv: null,
75
80
  });
76
81
  });
77
82
 
@@ -276,3 +281,137 @@ test('RM-18: module agent without override falls back to module frontmatter tier
276
281
  });
277
282
  assert.equal(out.tier, 'haiku');
278
283
  });
284
+
285
+ test('RM-19: agent_routing to openai-compat resolves model from provider models table by tier', () => {
286
+ const cwd = _sandbox({
287
+ model_providers: {
288
+ default: 'claude',
289
+ claude: { kind: 'native' },
290
+ ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { opus: 'qwen2.5-coder:32b' } },
291
+ },
292
+ agent_routing: { 'np-planner': { provider: 'ollama' } },
293
+ }, { 'np-planner': _plannerAgent });
294
+ const out = subcmd.resolveFromConfig({ agentOrTier: 'np-planner', cwd });
295
+ assert.equal(out.provider, 'ollama');
296
+ assert.equal(out.kind, 'openai-compat');
297
+ assert.equal(out.model, 'qwen2.5-coder:32b');
298
+ assert.equal(out.resolved, 'qwen2.5-coder:32b');
299
+ assert.equal(out.mode, 'provider');
300
+ });
301
+
302
+ test('RM-20: explicit model pin in agent_routing beats provider models table', () => {
303
+ const cwd = _sandbox({
304
+ model_providers: {
305
+ ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { opus: 'fallback-model' } },
306
+ },
307
+ agent_routing: { 'np-planner': { provider: 'ollama', model: 'qwen3.5' } },
308
+ }, { 'np-planner': _plannerAgent });
309
+ const out = subcmd.resolveFromConfig({ agentOrTier: 'np-planner', cwd });
310
+ assert.equal(out.model, 'qwen3.5');
311
+ assert.equal(out.resolved, 'qwen3.5');
312
+ });
313
+
314
+ test('RM-21: native provider with an explicit model pin forces full-id mode', () => {
315
+ const cwd = _sandbox({
316
+ model_providers: { claude: { kind: 'native' } },
317
+ agent_routing: { 'np-planner': { provider: 'claude', model: 'claude-opus-4-7' } },
318
+ }, { 'np-planner': _plannerAgent });
319
+ const out = subcmd.resolveFromConfig({ agentOrTier: 'np-planner', cwd });
320
+ assert.equal(out.kind, 'native');
321
+ assert.equal(out.resolved, 'claude-opus-4-7');
322
+ assert.equal(out.mode, 'full-id');
323
+ assert.equal(out.model, 'claude-opus-4-7');
324
+ });
325
+
326
+ test('RM-22: glob routing key (np-critic*) matches a critic agent', () => {
327
+ const cwd = _sandbox({
328
+ model_providers: { openai: { kind: 'openai-compat', base_url: 'https://api.openai.com/v1', models: { sonnet: 'gpt-4o' } } },
329
+ agent_routing: { 'np-critic*': { provider: 'openai', model: 'gpt-4o' } },
330
+ }, { 'np-critic': '---\nname: np-critic\ndescription: x\ntier: sonnet\ntools: Read\n---\nbody' });
331
+ const out = subcmd.resolveFromConfig({ agentOrTier: 'np-critic', cwd });
332
+ assert.equal(out.provider, 'openai');
333
+ assert.equal(out.model, 'gpt-4o');
334
+ });
335
+
336
+ test('RM-23: routing to an undefined provider fails loud (no silent claude fallback)', () => {
337
+ const cwd = _sandbox({
338
+ model_providers: { claude: { kind: 'native' } },
339
+ agent_routing: { 'np-planner': { provider: 'ollama' } },
340
+ }, { 'np-planner': _plannerAgent });
341
+ let thrown = null;
342
+ try { subcmd.resolveFromConfig({ agentOrTier: 'np-planner', cwd }); } catch (e) { thrown = e; }
343
+ assert.ok(thrown);
344
+ assert.equal(thrown.name, 'NubosPilotError');
345
+ assert.equal(thrown.code, 'provider-undefined');
346
+ });
347
+
348
+ test('RM-24: openai-compat with no pin and no models[tier] fails loud', () => {
349
+ const cwd = _sandbox({
350
+ model_providers: { ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { haiku: 'small' } } },
351
+ agent_routing: { 'np-planner': { provider: 'ollama' } },
352
+ }, { 'np-planner': _plannerAgent });
353
+ let thrown = null;
354
+ try { subcmd.resolveFromConfig({ agentOrTier: 'np-planner', cwd }); } catch (e) { thrown = e; }
355
+ assert.ok(thrown);
356
+ assert.equal(thrown.code, 'provider-model-unresolved');
357
+ });
358
+
359
+ test('RM-25: bare tier never routes — stays on implicit claude-native default', () => {
360
+ const cwd = _sandbox({
361
+ model_providers: { ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { opus: 'qwen' } } },
362
+ agent_routing: { opus: { provider: 'ollama' } },
363
+ });
364
+ const out = subcmd.resolveFromConfig({ agentOrTier: 'opus', cwd });
365
+ assert.equal(out.provider, 'claude');
366
+ assert.equal(out.kind, 'native');
367
+ assert.equal(out.resolved, 'opus');
368
+ });
369
+
370
+ test('RM-27: run() refuses an off-host (openai-compat) agent loud — no model id on stdout', () => {
371
+ const root = _sandbox({
372
+ model_providers: { ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { opus: 'qwen2.5-coder:32b' } } },
373
+ agent_routing: { 'np-planner': { provider: 'ollama' } },
374
+ }, { 'np-planner': _plannerAgent });
375
+ const origCwd = process.cwd();
376
+ process.chdir(root);
377
+ try {
378
+ const cap = _captureStdout(() => subcmd.run(['np-planner']));
379
+ assert.equal(cap.rc, 1);
380
+ assert.equal(cap.stdout, '');
381
+ assert.match(cap.stderr, /off-host-not-on-native-path/);
382
+ assert.match(cap.stderr, /spawn-offhost/);
383
+ } finally {
384
+ process.chdir(origCwd);
385
+ }
386
+ });
387
+
388
+ test('RM-28: --json reports the full resolution and succeeds even for off-host', () => {
389
+ const root = _sandbox({
390
+ model_providers: { ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { opus: 'qwen2.5-coder:32b' } } },
391
+ agent_routing: { 'np-planner': { provider: 'ollama' } },
392
+ }, { 'np-planner': _plannerAgent });
393
+ const origCwd = process.cwd();
394
+ process.chdir(root);
395
+ try {
396
+ const cap = _captureStdout(() => subcmd.run(['np-planner', '--json']));
397
+ assert.equal(cap.rc, 0);
398
+ const out = JSON.parse(cap.stdout.trim());
399
+ assert.equal(out.kind, 'openai-compat');
400
+ assert.equal(out.provider, 'ollama');
401
+ assert.equal(out.baseUrl, 'http://localhost:11434/v1');
402
+ } finally {
403
+ process.chdir(origCwd);
404
+ }
405
+ });
406
+
407
+ test('RM-26: model_providers.default switches the default provider for all agents', () => {
408
+ const cwd = _sandbox({
409
+ model_providers: {
410
+ default: 'ollama',
411
+ ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { opus: 'qwen2.5-coder:32b' } },
412
+ },
413
+ }, { 'np-planner': _plannerAgent });
414
+ const out = subcmd.resolveFromConfig({ agentOrTier: 'np-planner', cwd });
415
+ assert.equal(out.provider, 'ollama');
416
+ assert.equal(out.model, 'qwen2.5-coder:32b');
417
+ });
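Read together, tests RM-19 through RM-26 above imply a resolution order: an exact or glob `agent_routing` key selects the provider, an explicit `model` pin beats the provider's `models[tier]` table, a bare tier is never routed, and an undefined provider or unresolvable model is an error. The sketch below reconstructs that order from the tests alone. It is not the contents of `lib/model-providers.cjs` (which this part of the diff does not show); `matchRoute` and `resolveProviderSketch` are hypothetical names, the glob handling assumes only a single trailing `*`, and the treatment of `model_providers.default` for bare tiers is a guess.

```javascript
// Hypothetical sketch reconstructed from the RM-19..RM-26 tests; not the
// package's implementation.
function matchRoute(routing, agentName) {
  if (!agentName) return null;                       // bare tier: never routed (RM-25)
  if (routing[agentName]) return routing[agentName]; // exact key wins
  for (const [key, route] of Object.entries(routing)) {
    // Assumed glob support: one trailing "*", as in 'np-critic*' (RM-22).
    if (key.endsWith('*') && agentName.startsWith(key.slice(0, -1))) return route;
  }
  return null;
}

function fail(code, message) {
  const err = new Error(message);
  err.code = code;
  return err;
}

function resolveProviderSketch({ agentName, tier, config }) {
  const providers = config.model_providers || {};
  const route = matchRoute(config.agent_routing || {}, agentName);
  // RM-26: an unrouted agent uses model_providers.default; otherwise implicit claude.
  const name = route ? route.provider : (agentName && providers.default) || 'claude';
  // The implicit claude provider is native even when not declared (RM-1, RM-25).
  const entry = name === 'claude' && !providers.claude ? { kind: 'native' } : providers[name];
  if (!entry || typeof entry !== 'object') {
    throw fail('provider-undefined', 'provider "' + name + '" is not defined'); // RM-23
  }
  const pin = route && route.model ? route.model : null;
  if (entry.kind === 'native') {
    return { provider: name, kind: 'native', model: pin, baseUrl: null }; // RM-21
  }
  const model = pin || (entry.models && entry.models[tier]) || null; // RM-20: pin first
  if (!model) {
    throw fail('provider-model-unresolved', 'no model for tier "' + tier + '"'); // RM-24
  }
  return { provider: name, kind: entry.kind, model, baseUrl: entry.base_url || null };
}
```

The error codes are the ones the tests assert on; everything else about error shape and field names beyond `provider`, `kind`, `model` and `baseUrl` is left out because the diff shown here does not confirm it.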
package/bin/np-tools/security.cjs CHANGED
@@ -8,6 +8,7 @@ const { tryReadConfigPath } = require('../../lib/config.cjs');
8
8
  const scan = require('../../lib/security/scan.cjs');
9
9
  const ledger = require('../../lib/security/ledger.cjs');
10
10
  const review = require('../../lib/security/review.cjs');
11
+ const headlessGuard = require('../../lib/headless-guard.cjs');
11
12
  const args = require('./_args.cjs');
12
13
 
13
14
  const COMMIT_RE = /\bgit\b[\s\S]*\b(commit|push)\b/;
@@ -93,6 +94,8 @@ async function run(argv, ctx) {
93
94
  const list = Array.isArray(argv) ? argv : [];
94
95
  const verb = list[0];
95
96
 
97
+ if (headlessGuard.isHeadless(process.env)) return 0;
98
+
96
99
  const cfg = _cfg(cwd);
97
100
  if (!cfg.enabled && verb !== 'run-review') return 0;
98
101
 
@@ -167,7 +170,7 @@ async function run(argv, ctx) {
167
170
  if (verb === 'run-review') {
168
171
  if (!cfg.enabled || !sid) return 0;
169
172
  const mode = args.getFlag(list, '--mode') === 'commit' ? 'commit' : 'stop';
170
- try { review.runReview({ cwd, sid, mode, config: { ...cfg, guidance_path: _resolveRel(cwd, cfg.guidance_path) } }); } catch {}
173
+ try { await review.runReview({ cwd, sid, mode, config: { ...cfg, guidance_path: _resolveRel(cwd, cfg.guidance_path) } }); } catch {}
171
174
  return 0;
172
175
  }
173
176
 
package/bin/np-tools/spawn-headless.cjs CHANGED
@@ -8,6 +8,7 @@ const child_process = require('node:child_process');
8
8
  const { NubosPilotError, atomicWriteFileSync, appendJsonl, findProjectRoot } = require('../../lib/core.cjs');
9
9
  const runContext = require('../../lib/run-context.cjs');
10
10
  const safePath = require('../../lib/safe-path.cjs');
11
+ const headlessGuard = require('../../lib/headless-guard.cjs');
11
12
  const args = require('./_args.cjs');
12
13
 
13
14
  function _sha256(s) {
@@ -165,12 +166,109 @@ function _stripFrontmatter(md) {
165
166
  return md.slice(end + 5);
166
167
  }
167
168
 
168
- function run(argv, ctx) {
169
+ function _defaultResolveKind(agent, cwd) {
170
+ const { resolveFromConfig } = require('./resolve-model.cjs');
171
+ return resolveFromConfig({ agentOrTier: agent, cwd });
172
+ }
173
+
174
+ async function _runOffHost(o) {
175
+ const dispatch = o.dispatchImpl || require('../../lib/runtime/dispatch.cjs').dispatchOffHost;
176
+ const startedAt = o.startedAt;
177
+ let result = null;
178
+ let errObj = null;
179
+ try {
180
+ result = await dispatch({ agent: o.agent, task: o.userPrompt, cwd: o.cwd });
181
+ } catch (err) {
182
+ errObj = {
183
+ code: (err && err.code) || 'spawn-headless-offhost-failed',
184
+ message: (err && err.message) || 'off-host dispatch failed',
185
+ };
186
+ }
187
+ const endedAt = new Date().toISOString();
188
+ const content = result ? (result.content || '') : '';
189
+
190
+ const envelope = JSON.stringify({
191
+ type: 'result',
192
+ result: content,
193
+ model: result ? result.model : null,
194
+ provider: result ? result.provider : null,
195
+ is_error: !!errObj,
196
+ });
197
+ atomicWriteFileSync(o.resolvedOutput, envelope, 'utf-8', 0o600);
198
+
199
+ const exitCode = errObj ? 2 : 0;
200
+ const spawnTrailPath = _spawnTrailPath(o.cwd);
201
+ const spawnRecord = {
202
+ run_id: o.runId,
203
+ started_at: startedAt,
204
+ ended_at: endedAt,
205
+ duration_ms: Math.max(0, Date.parse(endedAt) - Date.parse(startedAt)),
206
+ agent: o.agent,
207
+ bin: 'off-host:' + (result ? result.provider : 'unknown'),
208
+ exit_code: exitCode,
209
+ timed_out: false,
210
+ prompt_sha256: _sha256(o.userPrompt),
211
+ prompt_bytes: Buffer.byteLength(o.userPrompt || '', 'utf-8'),
212
+ response_sha256: _sha256(content),
213
+ response_bytes: Buffer.byteLength(content, 'utf-8'),
214
+ stderr_excerpt: errObj ? _redactSecrets(String(errObj.message)).slice(-STDERR_TAIL_BYTES) : '',
215
+ model_actual: result ? result.model : null,
216
+ tokens_in: null,
217
+ tokens_out: null,
218
+ payload_parse_ok: !errObj,
219
+ runtime: result ? result.provider : 'off-host',
220
+ };
221
+ try { appendJsonl(spawnTrailPath, spawnRecord, { maxLineBytes: 16 * 1024, mode: 0o600 }); }
222
+ catch (err) {
223
+ throw new NubosPilotError(
224
+ 'spawn-headless-audit-persist-failed',
225
+ 'spawn-trail append failed; refusing to commit response without audit record',
226
+ { file: 'spawns.jsonl', cause: (err && err.code) || 'unknown' },
227
+ );
228
+ }
229
+
230
+ const payload = {
231
+ agent: o.agent,
232
+ output_path: o.outputPath,
233
+ output_path_resolved: o.resolvedOutput,
234
+ exit_code: exitCode,
235
+ stderr_excerpt: spawnRecord.stderr_excerpt,
236
+ bin: spawnRecord.bin,
237
+ timed_out: false,
238
+ run_id: o.runId,
239
+ spawn_trail_path: spawnTrailPath,
240
+ model_actual: spawnRecord.model_actual,
241
+ tokens_in: null,
242
+ tokens_out: null,
243
+ payload_parse_ok: spawnRecord.payload_parse_ok,
244
+ off_host: true,
245
+ };
246
+ o.stdout.write(JSON.stringify(payload) + '\n');
247
+ return exitCode;
248
+ }
249
+
250
+ async function run(argv, ctx) {
169
251
  const context = ctx || {};
170
252
  const cwd = context.cwd || process.cwd();
171
253
  const stdout = context.stdout || process.stdout;
172
254
  const list = Array.isArray(argv) ? argv : [];
173
255
 
256
+ if (headlessGuard.isHeadless(process.env)) {
257
+ throw new NubosPilotError(
258
+ 'spawn-headless-reentrant',
259
+ 'refusing to spawn a nested headless `claude` (NUBOS_PILOT_HEADLESS is set) — recursion guard',
260
+ { depth: headlessGuard.currentDepth(process.env) },
261
+ );
262
+ }
263
+ if (headlessGuard.depthExceeded(process.env)) {
264
+ throw new NubosPilotError(
265
+ 'spawn-headless-depth-exceeded',
266
+ 'refusing to spawn headless `claude`: hook depth ' + headlessGuard.currentDepth(process.env)
267
+ + ' has reached the cap ' + headlessGuard.maxDepth(process.env) + ' (recursion guard)',
268
+ { depth: headlessGuard.currentDepth(process.env), max: headlessGuard.maxDepth(process.env) },
269
+ );
270
+ }
271
+
174
272
  const agent = args.getFlag(list, '--agent');
175
273
  if (!agent) {
176
274
  throw new NubosPilotError(
@@ -213,6 +311,39 @@ function run(argv, ctx) {
213
311
 
214
312
  const runId = runContext.getRunId();
215
313
 
314
+ // ADR-0021: off-host routing. If the agent routes to an openai-compat provider,
315
+ // run the nubos-pilot dispatch loop instead of `claude -p`, writing the result
316
+ // in the same {result,...} envelope `claude --output-format json` produces so the
317
+ // callers (review/extract parseXxxOutput → outer.result) parse it unchanged. The
318
+ // native path below is untouched. run() is async; the CLI dispatcher already
319
+ // awaits returned promises (handleRunResult), and the two in-process callers
320
+ // (review.cjs / extract.cjs) await runReview/runExtract.
321
+ const resolveKind = context.resolveImpl || _defaultResolveKind;
322
+ let routed;
323
+ try { routed = resolveKind(agent, cwd); } catch { routed = { kind: 'native' }; }
324
+ if (routed && routed.kind === 'openai-compat') {
325
+ return _runOffHost({
326
+ agent, userPrompt, resolvedOutput, outputPath, cwd, runId, stdout,
327
+ startedAt: new Date().toISOString(),
328
+ dispatchImpl: context.dispatchImpl,
329
+ });
330
+ }
331
+
332
+ let lockRoot;
333
+ try { lockRoot = findProjectRoot(cwd); }
334
+ catch { lockRoot = cwd; }
335
+ const lock = headlessGuard.tryAcquireSpawnLock(lockRoot, agent);
336
+ if (!lock.acquired) {
337
+ throw new NubosPilotError(
338
+ 'spawn-headless-locked',
339
+ 'another headless run for agent `' + agent + '` is already active in this project (concurrency guard)',
340
+ { agent, holder: lock.holder || null },
341
+ );
342
+ }
343
+
344
+ const childEnv = _filterSpawnEnv(process.env);
345
+ Object.assign(childEnv, headlessGuard.childSpawnEnv(process.env));
346
+
216
347
  const bin = _claudeBinary();
217
348
  const claudeArgs = ['-p', '--output-format', 'json'];
218
349
  const startedAt = new Date().toISOString();
@@ -224,7 +355,7 @@ function run(argv, ctx) {
224
355
  timeout: timeoutMs,
225
356
  maxBuffer: 64 * 1024 * 1024,
226
357
  encoding: 'utf-8',
227
- env: _filterSpawnEnv(process.env),
358
+ env: childEnv,
228
359
  killSignal: 'SIGKILL',
229
360
  });
230
361
  } catch (err) {
@@ -233,6 +364,8 @@ function run(argv, ctx) {
233
364
  'failed to spawn `' + bin + '`: ' + (err && err.message),
234
365
  { bin, cause: err && err.code },
235
366
  );
367
+ } finally {
368
+ lock.release();
236
369
  }
237
370
  if (result.error && result.error.code === 'ENOENT') {
238
371
  throw new NubosPilotError(