nubos-pilot 1.2.4 → 1.3.0

Files changed (41)
  1. package/CHANGELOG.md +17 -1
  2. package/README.md +2 -1
  3. package/SECURITY.md +3 -4
  4. package/bin/np-tools/_commands.cjs +1 -0
  5. package/bin/np-tools/learnings.cjs +1 -1
  6. package/bin/np-tools/resolve-model.cjs +55 -1
  7. package/bin/np-tools/resolve-model.test.cjs +139 -0
  8. package/bin/np-tools/security.cjs +1 -1
  9. package/bin/np-tools/spawn-headless.cjs +100 -1
  10. package/bin/np-tools/spawn-headless.test.cjs +108 -58
  11. package/bin/np-tools/spawn-offhost.cjs +93 -0
  12. package/bin/np-tools/spawn-offhost.test.cjs +38 -0
  13. package/lib/agents.cjs +16 -2
  14. package/lib/config-schema.cjs +5 -1
  15. package/lib/learnings/extract.cjs +4 -4
  16. package/lib/learnings/extract.test.cjs +8 -8
  17. package/lib/model-providers.cjs +118 -0
  18. package/lib/model-providers.test.cjs +85 -0
  19. package/lib/runtime/agent-loop.cjs +64 -0
  20. package/lib/runtime/agent-loop.test.cjs +135 -0
  21. package/lib/runtime/dispatch.cjs +174 -0
  22. package/lib/runtime/dispatch.test.cjs +193 -0
  23. package/lib/runtime/preflight.cjs +68 -0
  24. package/lib/runtime/preflight.test.cjs +62 -0
  25. package/lib/runtime/providers/openai-compat.cjs +102 -0
  26. package/lib/runtime/providers/openai-compat.test.cjs +103 -0
  27. package/lib/runtime/tools/index.cjs +415 -0
  28. package/lib/runtime/tools/index.test.cjs +230 -0
  29. package/lib/security/review.cjs +4 -4
  30. package/lib/security/review.test.cjs +6 -6
  31. package/np-tools.cjs +1 -0
  32. package/package.json +1 -1
  33. package/workflows/add-tests.md +41 -0
  34. package/workflows/architect-phase.md +19 -0
  35. package/workflows/discuss-phase.md +29 -10
  36. package/workflows/execute-phase.md +93 -4
  37. package/workflows/plan-phase.md +57 -16
  38. package/workflows/research-phase.md +45 -0
  39. package/workflows/scan-codebase.md +21 -3
  40. package/workflows/validate-phase.md +30 -13
  41. package/workflows/verify-work.md +17 -0
package/CHANGELOG.md CHANGED
@@ -4,7 +4,21 @@ All notable changes to nubos-pilot are documented in this file. Format
4
4
  follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/); versioning
5
5
  follows [SemVer](https://semver.org/spec/v2.0.0.html).
6
6
 
7
- ## [1.2.4] - 2026-06-15
7
+ ## [1.3.0] - 2026-06-17
8
+
9
+ Run any agent on any model, not only Claude.
10
+
11
+ - Per-agent model routing: two new config blocks, `model_providers` and `agent_routing`, send each agent to a specific model in the same run — planner on Claude opus, critic on OpenAI gpt-4o, executor on a local Ollama model. Any provider that speaks the OpenAI `/v1/chat/completions` dialect (OpenAI, xAI/Grok, Ollama, vLLM, LM Studio, LiteLLM) is reached through one `fetch`-based client, with no SDK added. Both blocks are optional; without them, resolution and spawning behave exactly as before.
12
+ - When the host can't route an agent to a non-native model — Claude Code's Agent tool only accepts Claude tiers — nubos-pilot runs the loop itself. It's a one-shot, zero-dependency tool-use harness: builds the prompt from `agents/<name>.md`, advertises the agent's tools as function schemas, runs the model's tool-calls against the workspace, and loops until a final answer. No daemon, the process exits when the loop returns.
13
+ - The off-host path runs through the same guards as the Claude path: working-tree safety, commit-policy, output-schema lint, the Nubosloop Rule-9 audit, and in-session security review, all unchanged. Off-host file writes are confined through `safe-path`. Off-host Bash runs only inside a slice worktree and stays off until `workflow.worktree_isolation` is on.
14
+ - Every workflow spawn now has an off-host branch — execute, plan, discuss, research, architect, validate, verify, scan. A test (`check-offhost-coverage`) walks the workflows and fails the suite if any spawn lacks one, so a new agent can't ship Claude-only by accident.
15
+ - A preflight runs before any off-host spawn and fails loud: it checks the server is reachable, the model is present, and tool-calling works, then aborts with an actionable message (`run: ollama pull <model>`) instead of dying mid-task. A routing entry that names an undefined provider is a hard config error at load time, never a quiet fallback to Claude.
16
+
17
+ Local models are weaker at multi-step tool-use than frontier Claude, so keep high-risk agents like the planner and security-reviewer on Claude — that's why the whole thing is opt-in. ADR-0021 has the full design.
18
+
19
+ Full documentation at <https://pilot.nubos.cloud>.
20
+
21
+ ## [1.2.4] — 2026-06-15
8
22
 
9
23
  Fixed a recursion fault in the in-session hooks that could spawn an unbounded cascade of headless `claude -p` processes.
10
24
 
@@ -12,6 +26,8 @@ Fixed a recursion fault in the in-session hooks that could spawn an unbounded ca
12
26
  - Three independent guards back this up: the hook scripts and the `security`/`learnings` backends exit early when `NUBOS_PILOT_HEADLESS` is set; `spawn-headless` refuses to start a nested headless run (reentrancy + depth cap, default one level); and a per-agent lockfile under `.nubos-pilot/run/` bounds concurrent headless runs to one per agent even if the environment is not inherited. Headless runs already carry a hard timeout with SIGKILL, so a hung review cannot linger.
13
27
  - Escape hatch: the guard keys off `NUBOS_PILOT_HEADLESS`, set automatically on the spawned `claude` — do not set it in your own shell or the in-session hooks will silently no-op. Raise the depth cap with `NUBOS_PILOT_MAX_HOOK_DEPTH` only if you understand the recursion risk.
14
28
 
29
+ Full documentation at <https://pilot.nubos.cloud>.
30
+
15
31
  ## [1.2.3] — 2026-06-14
16
32
 
17
33
  Three opt-in layers that make execution cheaper, more reliable, and self-improving.
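The 1.3.0 entry describes the off-host harness as a one-shot loop: advertise the agent's tools as function schemas, run the model's tool calls, and repeat until a final answer arrives. The following is a minimal sketch of that shape only, assuming the OpenAI `/v1/chat/completions` message format. The names `runLoopSketch`, `chat`, `tools`, and `maxTurns` are hypothetical and are not taken from the package's `lib/runtime/agent-loop.cjs`, whose source is not shown in this section.

```javascript
'use strict';

// Hypothetical sketch of a one-shot tool-use loop; NOT the package's agent-loop.cjs.
// `chat(messages, schemas)` stands in for one POST to /v1/chat/completions and
// must resolve to the assistant message ({ role, content, tool_calls? }).
async function runLoopSketch({ system, task, tools, chat, maxTurns = 8 }) {
  // Advertise each tool as an OpenAI-style function schema.
  const schemas = Object.entries(tools).map(([name, t]) => ({
    type: 'function',
    function: { name, description: t.description, parameters: t.parameters },
  }));
  const messages = [
    { role: 'system', content: system },
    { role: 'user', content: task },
  ];

  for (let turn = 0; turn < maxTurns; turn += 1) {
    const reply = await chat(messages, schemas);
    messages.push(reply);
    const calls = reply.tool_calls || [];
    if (calls.length === 0) return reply.content || ''; // no tool calls: final answer

    for (const call of calls) {
      const tool = tools[call.function.name];
      let output;
      try {
        if (!tool) throw new Error('unknown tool: ' + call.function.name);
        output = await tool.run(JSON.parse(call.function.arguments || '{}'));
      } catch (err) {
        // Report the failure back to the model instead of aborting the loop.
        output = 'error: ' + err.message;
      }
      messages.push({ role: 'tool', tool_call_id: call.id, content: String(output) });
    }
  }
  throw new Error('no final answer after ' + maxTurns + ' turns');
}
```

Injecting `chat` keeps the loop testable without a live endpoint. The turn cap is an assumption of this sketch; the changelog does not say how the real loop bounds itself.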
package/README.md CHANGED
@@ -95,7 +95,7 @@ task(M001-S001-T0002): wire login handler
95
95
 
96
96
  ## Agents
97
97
 
98
- Thirteen spawnable subagents are installed into the host's agent directory (alongside three `np-critic-*` audit modules consumed by `np-critic`):
98
+ Fourteen spawnable subagents are installed into the host's agent directory (alongside three `np-critic-*` audit modules consumed by `np-critic`):
99
99
 
100
100
  - `np-planner` (opus) — breaks a milestone into slices + tasks
101
101
  - `np-plan-checker` (opus) — adversarial goal-backward review before execution
@@ -109,6 +109,7 @@ Thirteen spawnable subagents are installed into the host's agent directory (alon
109
109
  - `np-critic` (sonnet) — Nubosloop critic; audits executor output across style, tests and acceptance
110
110
  - `np-verifier` (sonnet) — post-execution Pass/Fail/Defer per success_criterion
111
111
  - `np-nyquist-auditor` (haiku) — requirement test-coverage audit
112
+ - `np-learnings-extractor` (haiku) — headless continuous-learning observer; distils reusable `{pattern, outcome}` learnings from a session's turn-diff
112
113
  - `np-security-reviewer` (sonnet) — OWASP-aligned read-only audit (manual spawn)
113
114
 
114
115
  Every spawn runs with an **explicit tier** (`haiku` / `sonnet` / `opus`) resolved to a concrete model via `np-tools.cjs resolve-model --profile <frontier|quality|balanced|budget|inherit>`.
package/SECURITY.md CHANGED
@@ -18,11 +18,10 @@ versions and announced in `CHANGELOG.md`.
18
18
 
19
19
  | Version | Supported |
20
20
  |---------|-----------|
21
- | 0.2.x | ✅ active |
22
- | < 0.2 | ❌ end of life |
21
+ | 1.3.x | ✅ active |
22
+ | < 1.3 | ❌ end of life |
23
23
 
24
- Only the latest minor on the current major receives security patches until
25
- 1.0 is reached.
24
+ Only the latest minor on the current major (1.x) receives security patches.
26
25
 
27
26
  ## Threat Model
28
27
 
package/bin/np-tools/_commands.cjs CHANGED
@@ -101,6 +101,7 @@ const COMMANDS = [
101
101
  { name: 'loop-audit-tool-use', category: 'Execution', description: 'Record/read the tool-use audit per spawn (Completeness Rule 9 mechanical check)', description_de: 'Tool-use Audit pro Spawn schreiben/lesen (Completeness Rule 9 mechanische Prüfung)' },
102
102
  { name: 'loop-stuck', category: 'Execution', description: 'Mark a task as stuck (writes loop-state + flips checkpoint status to stuck)', description_de: 'Markiert Task als stuck (schreibt Loop-State + setzt Checkpoint-Status auf stuck)' },
103
103
  { name: 'spawn-headless', category: 'Execution', description: 'Spawn an agent as a headless `claude -p` subprocess (ADR-0010 §L6); writes stdout to --output-path and returns exit code', description_de: 'Spawnt einen Agent als headless `claude -p` Subprozess (ADR-0010 §L6); schreibt stdout nach --output-path und liefert Exit-Code' },
104
+ { name: 'spawn-offhost', category: 'Execution', description: 'Run an agent routed to an openai-compat provider (Ollama/OpenAI/Grok) as a one-shot tool-use loop (ADR-0021). Args: --agent --task|--task-file [--allow-bash] [--read-only]. Preflights the endpoint, records metrics.', description_de: 'Führt einen auf einen openai-compat-Provider (Ollama/OpenAI/Grok) gerouteten Agent als One-Shot-Tool-Use-Loop aus (ADR-0021). Args: --agent --task|--task-file [--allow-bash] [--read-only]. Preflight des Endpoints, Metrics-Aufzeichnung.' },
104
105
  { name: 'security', category: 'Review', description: 'In-session security review hook backend (ADR-0020). Verbs: session-start | baseline | scan | review | commit | run-review. Reads the Claude Code hook payload via --stdin; non-blocking, report-once, independent reviewer spawn.', description_de: 'Backend für die In-Session-Security-Review-Hooks (ADR-0020). Verben: session-start | baseline | scan | review | commit | run-review. Liest die Claude-Code-Hook-Payload via --stdin; non-blocking, report-once, unabhängiger Reviewer-Spawn.' },
105
106
  { name: 'loop-metrics', category: 'Utility', description: 'Aggregate Nubosloop telemetry across all checkpoints (commits, stuck, route distribution)', description_de: 'Aggregiert Nubosloop-Telemetrie über alle Checkpoints (Commits, Stuck, Routing)' },
106
107
  { name: 'learning-log', category: 'Execution', description: 'Persist a learning to the local store (or MCP adapter when configured)', description_de: 'Persistiert ein Learning im lokalen Store (oder MCP-Adapter falls konfiguriert)' },
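The `spawn-offhost` description says the command preflights the endpoint, and the changelog adds that the preflight checks reachability and model presence before aborting with an actionable hint. A rough sketch of such a check follows, assuming the provider exposes the OpenAI-style `GET <base_url>/models` listing. `preflightSketch`, its result codes, and the injected `fetchImpl` are hypothetical; the tool-calling probe mentioned in the changelog is left out, and this is not the package's `lib/runtime/preflight.cjs`.

```javascript
'use strict';

// Hypothetical preflight sketch; NOT the package's lib/runtime/preflight.cjs.
// Assumes an OpenAI-style model listing at `<base_url>/models` ({ data: [{ id }] }).
// The default `fetch` is the Node 18+ global; pass `fetchImpl` to stub it.
async function preflightSketch({ baseUrl, model, apiKey, fetchImpl = fetch }) {
  const headers = apiKey ? { Authorization: 'Bearer ' + apiKey } : {};
  let res;
  try {
    res = await fetchImpl(baseUrl.replace(/\/+$/, '') + '/models', { headers });
  } catch {
    // Network-level failure: the server is down or the URL is wrong.
    return { ok: false, code: 'unreachable', hint: 'is the server at ' + baseUrl + ' running?' };
  }
  if (!res.ok) {
    return { ok: false, code: 'http-' + res.status, hint: 'check the API key and base_url' };
  }

  const body = await res.json();
  const ids = Array.isArray(body.data) ? body.data.map((m) => m.id) : [];
  if (!ids.includes(model)) {
    // The pull hint only makes sense for an Ollama endpoint.
    return { ok: false, code: 'model-missing', hint: 'run: ollama pull ' + model };
  }
  return { ok: true };
}
```

Returning a result object rather than throwing is a choice made for this sketch; the changelog only states that the real preflight fails loudly before the task starts.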
package/bin/np-tools/learnings.cjs CHANGED
@@ -90,7 +90,7 @@ async function run(argv, ctx) {
90
90
  if (verb === 'run-extract') {
91
91
  const sid = args.getFlag(list, '--session') || '';
92
92
  try {
93
- const result = extract.runExtract({ cwd, sid, config: cfg });
93
+ const result = await extract.runExtract({ cwd, sid, config: cfg });
94
94
  _emit(stdout, result);
95
95
  } catch (err) {
96
96
  _emit(stdout, { ran: false, reason: 'error', error: String(err && err.code || err) });
package/bin/np-tools/resolve-model.cjs CHANGED
@@ -2,6 +2,7 @@ const { NubosPilotError } = require('../../lib/core.cjs');
2
2
  const { readConfig, _CONFIG_PARSE_CODES } = require('../../lib/config.cjs');
3
3
  const { loadAgent, loadAgentModule } = require('../../lib/agents.cjs');
4
4
  const { resolve: resolveAlias, MODEL_ALIAS_MAP, VALID_TIERS } = require('../../lib/model-profiles.cjs');
5
+ const { resolveProvider } = require('../../lib/model-providers.cjs');
5
6
 
6
7
  let _warnedCorruptOnce = false;
7
8
  function _readConfig(cwd) {
@@ -56,11 +57,13 @@ function resolveFromConfig({ agentOrTier, profileOverride, cwd, format }) {
56
57
  const config = _readConfig(cwd);
57
58
 
58
59
  let tier;
60
+ let agentName = null;
59
61
  if (VALID_TIERS.includes(agentOrTier)) {
60
62
  tier = agentOrTier;
61
63
  } else {
62
64
  const fm = _loadAgentForResolve(agentOrTier, cwd);
63
65
  tier = fm.tier;
66
+ agentName = agentOrTier;
64
67
  const override = _criticTierOverride(config, agentOrTier);
65
68
  if (override) tier = override;
66
69
  }
@@ -91,7 +94,32 @@ function resolveFromConfig({ agentOrTier, profileOverride, cwd, format }) {
91
94
  resolved = alias;
92
95
  }
93
96
 
94
- return { tier, profile, alias, resolved, mode };
97
+ const prov = resolveProvider({ agentName, tier, config });
98
+ let providerModel = prov.model;
99
+ if (prov.kind === 'native') {
100
+ if (prov.model) {
101
+ resolved = prov.model;
102
+ mode = 'full-id';
103
+ } else {
104
+ providerModel = null;
105
+ }
106
+ } else {
107
+ resolved = prov.model;
108
+ mode = 'provider';
109
+ }
110
+
111
+ return {
112
+ tier,
113
+ profile,
114
+ alias,
115
+ resolved,
116
+ mode,
117
+ provider: prov.provider,
118
+ kind: prov.kind,
119
+ model: providerModel,
120
+ baseUrl: prov.baseUrl,
121
+ apiKeyEnv: prov.apiKeyEnv,
122
+ };
95
123
  }
96
124
 
97
125
  function run(argv) {
@@ -105,12 +133,18 @@ function run(argv) {
105
133
  const agentOrTier = args.shift();
106
134
  let profileOverride = null;
107
135
  let format = null;
136
+ let asJson = false;
137
+ let asKind = false;
108
138
  while (args.length) {
109
139
  const flag = args.shift();
110
140
  if (flag === '--profile') {
111
141
  profileOverride = args.shift();
112
142
  } else if (flag === '--format') {
113
143
  format = args.shift();
144
+ } else if (flag === '--json') {
145
+ asJson = true;
146
+ } else if (flag === '--kind') {
147
+ asKind = true;
114
148
  } else if (flag === '--raw') {
115
149
 
116
150
  }
@@ -122,6 +156,26 @@ function run(argv) {
122
156
  cwd: process.cwd(),
123
157
  format,
124
158
  });
159
+ if (asKind) {
160
+ process.stdout.write((out.kind || 'native') + '\n');
161
+ return 0;
162
+ }
163
+ if (asJson) {
164
+ process.stdout.write(JSON.stringify(out) + '\n');
165
+ return 0;
166
+ }
167
+ if (out.kind === 'openai-compat') {
168
+ process.stderr.write(
169
+ JSON.stringify({
170
+ code: 'off-host-not-on-native-path',
171
+ message: 'agent "' + agentOrTier + '" routes to provider "' + out.provider
172
+ + '" (model ' + out.model + '), which the native `claude` spawn path cannot run. '
173
+ + 'Run it off-host with: np-tools spawn-offhost --agent ' + agentOrTier + ' --task <…>',
174
+ details: { provider: out.provider, kind: out.kind, model: out.model },
175
+ }) + '\n',
176
+ );
177
+ return 1;
178
+ }
125
179
  process.stdout.write(out.resolved + '\n');
126
180
  return 0;
127
181
  } catch (err) {
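With this change `resolve-model` prints a model id only for native agents; `--json` reports the full resolution, and an off-host agent without `--json` or `--kind` is refused with a pointer to `spawn-offhost`. One way a caller might branch on the `--json` line is sketched below. `planSpawnSketch` is a hypothetical helper and not part of np-tools; the `--agent` and `--task-file` flags are the ones listed in the `spawn-offhost` command description.

```javascript
'use strict';

// Hypothetical caller-side helper; NOT part of np-tools.
// `jsonLine` is one line of `np-tools resolve-model <agent> --json` output.
function planSpawnSketch(agent, jsonLine, taskFile) {
  const out = JSON.parse(jsonLine);
  const kind = out.kind || 'native'; // same default the --kind branch applies
  if (kind === 'openai-compat') {
    // Off-host: hand the agent to the one-shot tool-use loop.
    return { path: 'off-host', argv: ['spawn-offhost', '--agent', agent, '--task-file', taskFile] };
  }
  // Native: the resolved alias or full model id goes to the Claude spawn path.
  return { path: 'native', model: out.resolved };
}
```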
package/bin/np-tools/resolve-model.test.cjs CHANGED
@@ -72,6 +72,11 @@ test('RM-1: tier branch with empty config returns alias mode, default balanced p
72
72
  alias: 'opus',
73
73
  resolved: 'opus',
74
74
  mode: 'alias',
75
+ provider: 'claude',
76
+ kind: 'native',
77
+ model: null,
78
+ baseUrl: null,
79
+ apiKeyEnv: null,
75
80
  });
76
81
  });
77
82
 
@@ -276,3 +281,137 @@ test('RM-18: module agent without override falls back to module frontmatter tier
276
281
  });
277
282
  assert.equal(out.tier, 'haiku');
278
283
  });
284
+
285
+ test('RM-19: agent_routing to openai-compat resolves model from provider models table by tier', () => {
286
+ const cwd = _sandbox({
287
+ model_providers: {
288
+ default: 'claude',
289
+ claude: { kind: 'native' },
290
+ ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { opus: 'qwen2.5-coder:32b' } },
291
+ },
292
+ agent_routing: { 'np-planner': { provider: 'ollama' } },
293
+ }, { 'np-planner': _plannerAgent });
294
+ const out = subcmd.resolveFromConfig({ agentOrTier: 'np-planner', cwd });
295
+ assert.equal(out.provider, 'ollama');
296
+ assert.equal(out.kind, 'openai-compat');
297
+ assert.equal(out.model, 'qwen2.5-coder:32b');
298
+ assert.equal(out.resolved, 'qwen2.5-coder:32b');
299
+ assert.equal(out.mode, 'provider');
300
+ });
301
+
302
+ test('RM-20: explicit model pin in agent_routing beats provider models table', () => {
303
+ const cwd = _sandbox({
304
+ model_providers: {
305
+ ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { opus: 'fallback-model' } },
306
+ },
307
+ agent_routing: { 'np-planner': { provider: 'ollama', model: 'qwen3.5' } },
308
+ }, { 'np-planner': _plannerAgent });
309
+ const out = subcmd.resolveFromConfig({ agentOrTier: 'np-planner', cwd });
310
+ assert.equal(out.model, 'qwen3.5');
311
+ assert.equal(out.resolved, 'qwen3.5');
312
+ });
313
+
314
+ test('RM-21: native provider with an explicit model pin forces full-id mode', () => {
315
+ const cwd = _sandbox({
316
+ model_providers: { claude: { kind: 'native' } },
317
+ agent_routing: { 'np-planner': { provider: 'claude', model: 'claude-opus-4-7' } },
318
+ }, { 'np-planner': _plannerAgent });
319
+ const out = subcmd.resolveFromConfig({ agentOrTier: 'np-planner', cwd });
320
+ assert.equal(out.kind, 'native');
321
+ assert.equal(out.resolved, 'claude-opus-4-7');
322
+ assert.equal(out.mode, 'full-id');
323
+ assert.equal(out.model, 'claude-opus-4-7');
324
+ });
325
+
326
+ test('RM-22: glob routing key (np-critic*) matches a critic agent', () => {
327
+ const cwd = _sandbox({
328
+ model_providers: { openai: { kind: 'openai-compat', base_url: 'https://api.openai.com/v1', models: { sonnet: 'gpt-4o' } } },
329
+ agent_routing: { 'np-critic*': { provider: 'openai', model: 'gpt-4o' } },
330
+ }, { 'np-critic': '---\nname: np-critic\ndescription: x\ntier: sonnet\ntools: Read\n---\nbody' });
331
+ const out = subcmd.resolveFromConfig({ agentOrTier: 'np-critic', cwd });
332
+ assert.equal(out.provider, 'openai');
333
+ assert.equal(out.model, 'gpt-4o');
334
+ });
335
+
336
+ test('RM-23: routing to an undefined provider fails loud (no silent claude fallback)', () => {
337
+ const cwd = _sandbox({
338
+ model_providers: { claude: { kind: 'native' } },
339
+ agent_routing: { 'np-planner': { provider: 'ollama' } },
340
+ }, { 'np-planner': _plannerAgent });
341
+ let thrown = null;
342
+ try { subcmd.resolveFromConfig({ agentOrTier: 'np-planner', cwd }); } catch (e) { thrown = e; }
343
+ assert.ok(thrown);
344
+ assert.equal(thrown.name, 'NubosPilotError');
345
+ assert.equal(thrown.code, 'provider-undefined');
346
+ });
347
+
348
+ test('RM-24: openai-compat with no pin and no models[tier] fails loud', () => {
349
+ const cwd = _sandbox({
350
+ model_providers: { ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { haiku: 'small' } } },
351
+ agent_routing: { 'np-planner': { provider: 'ollama' } },
352
+ }, { 'np-planner': _plannerAgent });
353
+ let thrown = null;
354
+ try { subcmd.resolveFromConfig({ agentOrTier: 'np-planner', cwd }); } catch (e) { thrown = e; }
355
+ assert.ok(thrown);
356
+ assert.equal(thrown.code, 'provider-model-unresolved');
357
+ });
358
+
359
+ test('RM-25: bare tier never routes — stays on implicit claude-native default', () => {
360
+ const cwd = _sandbox({
361
+ model_providers: { ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { opus: 'qwen' } } },
362
+ agent_routing: { opus: { provider: 'ollama' } },
363
+ });
364
+ const out = subcmd.resolveFromConfig({ agentOrTier: 'opus', cwd });
365
+ assert.equal(out.provider, 'claude');
366
+ assert.equal(out.kind, 'native');
367
+ assert.equal(out.resolved, 'opus');
368
+ });
369
+
370
+ test('RM-27: run() refuses an off-host (openai-compat) agent loud — no model id on stdout', () => {
371
+ const root = _sandbox({
372
+ model_providers: { ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { opus: 'qwen2.5-coder:32b' } } },
373
+ agent_routing: { 'np-planner': { provider: 'ollama' } },
374
+ }, { 'np-planner': _plannerAgent });
375
+ const origCwd = process.cwd();
376
+ process.chdir(root);
377
+ try {
378
+ const cap = _captureStdout(() => subcmd.run(['np-planner']));
379
+ assert.equal(cap.rc, 1);
380
+ assert.equal(cap.stdout, '');
381
+ assert.match(cap.stderr, /off-host-not-on-native-path/);
382
+ assert.match(cap.stderr, /spawn-offhost/);
383
+ } finally {
384
+ process.chdir(origCwd);
385
+ }
386
+ });
387
+
388
+ test('RM-28: --json reports the full resolution and succeeds even for off-host', () => {
389
+ const root = _sandbox({
390
+ model_providers: { ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { opus: 'qwen2.5-coder:32b' } } },
391
+ agent_routing: { 'np-planner': { provider: 'ollama' } },
392
+ }, { 'np-planner': _plannerAgent });
393
+ const origCwd = process.cwd();
394
+ process.chdir(root);
395
+ try {
396
+ const cap = _captureStdout(() => subcmd.run(['np-planner', '--json']));
397
+ assert.equal(cap.rc, 0);
398
+ const out = JSON.parse(cap.stdout.trim());
399
+ assert.equal(out.kind, 'openai-compat');
400
+ assert.equal(out.provider, 'ollama');
401
+ assert.equal(out.baseUrl, 'http://localhost:11434/v1');
402
+ } finally {
403
+ process.chdir(origCwd);
404
+ }
405
+ });
406
+
407
+ test('RM-26: model_providers.default switches the default provider for all agents', () => {
408
+ const cwd = _sandbox({
409
+ model_providers: {
410
+ default: 'ollama',
411
+ ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { opus: 'qwen2.5-coder:32b' } },
412
+ },
413
+ }, { 'np-planner': _plannerAgent });
414
+ const out = subcmd.resolveFromConfig({ agentOrTier: 'np-planner', cwd });
415
+ assert.equal(out.provider, 'ollama');
416
+ assert.equal(out.model, 'qwen2.5-coder:32b');
417
+ });
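Tests RM-19 through RM-26 pin the routing rules: an explicit `model` pin beats the provider's `models[tier]` table, glob keys such as `np-critic*` match, an undefined provider is a hard error, and a bare tier never routes. The sketch below restates those rules in one function so they can be read together. `resolveProviderSketch` is a hypothetical re-implementation written from the tests, not the package's `lib/model-providers.cjs`; the exact-before-glob precedence and the implicit `claude` provider are assumptions.

```javascript
'use strict';

// Hypothetical restatement of the rules pinned by RM-19..RM-26.
// NOT the package's lib/model-providers.cjs, whose source is not shown here.
function globToRegExp(key) {
  const escaped = key.replace(/[.+?^${}()|[\]\\]/g, '\\$&').replace(/\*/g, '.*');
  return new RegExp('^' + escaped + '$');
}

function fail(code, message) {
  const err = new Error(message);
  err.code = code;
  return err;
}

function resolveProviderSketch({ agentName, tier, config }) {
  const providers = (config && config.model_providers) || {};
  const routing = (config && config.agent_routing) || {};
  const native = { provider: 'claude', kind: 'native', model: null, baseUrl: null };

  // RM-25: a bare tier carries no agent name and therefore never routes.
  if (!agentName) return native;

  // Exact key first, then the first matching glob key (assumed precedence).
  let route = routing[agentName] || null;
  if (!route) {
    const key = Object.keys(routing).find((k) => k.includes('*') && globToRegExp(k).test(agentName));
    route = key ? routing[key] : null;
  }

  // RM-26: model_providers.default applies when no routing entry matches.
  const name = (route && route.provider) || providers.default || 'claude';
  const def = providers[name] || (name === 'claude' ? { kind: 'native' } : null);
  // RM-23: never fall back to Claude quietly.
  if (!def) throw fail('provider-undefined', 'provider "' + name + '" is not defined in model_providers');

  const pinned = (route && route.model) || null;
  // RM-21: a native provider passes an explicit pin through unchanged.
  if (def.kind === 'native') return { ...native, provider: name, model: pinned };

  // RM-20 / RM-19: the pin wins, otherwise look the tier up in the models table.
  const model = pinned || (def.models && def.models[tier]) || null;
  if (!model) throw fail('provider-model-unresolved', 'no model for tier "' + tier + '" on provider "' + name + '"'); // RM-24
  return { provider: name, kind: def.kind, model, baseUrl: def.base_url || null };
}

// Example config in the shape the tests pass to _sandbox().
const exampleConfig = {
  model_providers: {
    default: 'claude',
    claude: { kind: 'native' },
    openai: { kind: 'openai-compat', base_url: 'https://api.openai.com/v1', models: { sonnet: 'gpt-4o' } },
    ollama: { kind: 'openai-compat', base_url: 'http://localhost:11434/v1', models: { opus: 'qwen2.5-coder:32b' } },
  },
  agent_routing: {
    'np-planner': { provider: 'ollama' },
    'np-critic*': { provider: 'openai' },
  },
};

console.log(resolveProviderSketch({ agentName: 'np-critic', tier: 'sonnet', config: exampleConfig }));
```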
package/bin/np-tools/security.cjs CHANGED
@@ -170,7 +170,7 @@ async function run(argv, ctx) {
170
170
  if (verb === 'run-review') {
171
171
  if (!cfg.enabled || !sid) return 0;
172
172
  const mode = args.getFlag(list, '--mode') === 'commit' ? 'commit' : 'stop';
173
- try { review.runReview({ cwd, sid, mode, config: { ...cfg, guidance_path: _resolveRel(cwd, cfg.guidance_path) } }); } catch {}
173
+ try { await review.runReview({ cwd, sid, mode, config: { ...cfg, guidance_path: _resolveRel(cwd, cfg.guidance_path) } }); } catch {}
174
174
  return 0;
175
175
  }
176
176
 
package/bin/np-tools/spawn-headless.cjs CHANGED
@@ -166,7 +166,88 @@ function _stripFrontmatter(md) {
166
166
  return md.slice(end + 5);
167
167
  }
168
168
 
169
- function run(argv, ctx) {
169
+ function _defaultResolveKind(agent, cwd) {
170
+ const { resolveFromConfig } = require('./resolve-model.cjs');
171
+ return resolveFromConfig({ agentOrTier: agent, cwd });
172
+ }
173
+
174
+ async function _runOffHost(o) {
175
+ const dispatch = o.dispatchImpl || require('../../lib/runtime/dispatch.cjs').dispatchOffHost;
176
+ const startedAt = o.startedAt;
177
+ let result = null;
178
+ let errObj = null;
179
+ try {
180
+ result = await dispatch({ agent: o.agent, task: o.userPrompt, cwd: o.cwd });
181
+ } catch (err) {
182
+ errObj = {
183
+ code: (err && err.code) || 'spawn-headless-offhost-failed',
184
+ message: (err && err.message) || 'off-host dispatch failed',
185
+ };
186
+ }
187
+ const endedAt = new Date().toISOString();
188
+ const content = result ? (result.content || '') : '';
189
+
190
+ const envelope = JSON.stringify({
191
+ type: 'result',
192
+ result: content,
193
+ model: result ? result.model : null,
194
+ provider: result ? result.provider : null,
195
+ is_error: !!errObj,
196
+ });
197
+ atomicWriteFileSync(o.resolvedOutput, envelope, 'utf-8', 0o600);
198
+
199
+ const exitCode = errObj ? 2 : 0;
200
+ const spawnTrailPath = _spawnTrailPath(o.cwd);
201
+ const spawnRecord = {
202
+ run_id: o.runId,
203
+ started_at: startedAt,
204
+ ended_at: endedAt,
205
+ duration_ms: Math.max(0, Date.parse(endedAt) - Date.parse(startedAt)),
206
+ agent: o.agent,
207
+ bin: 'off-host:' + (result ? result.provider : 'unknown'),
208
+ exit_code: exitCode,
209
+ timed_out: false,
210
+ prompt_sha256: _sha256(o.userPrompt),
211
+ prompt_bytes: Buffer.byteLength(o.userPrompt || '', 'utf-8'),
212
+ response_sha256: _sha256(content),
213
+ response_bytes: Buffer.byteLength(content, 'utf-8'),
214
+ stderr_excerpt: errObj ? _redactSecrets(String(errObj.message)).slice(-STDERR_TAIL_BYTES) : '',
215
+ model_actual: result ? result.model : null,
216
+ tokens_in: null,
217
+ tokens_out: null,
218
+ payload_parse_ok: !errObj,
219
+ runtime: result ? result.provider : 'off-host',
220
+ };
221
+ try { appendJsonl(spawnTrailPath, spawnRecord, { maxLineBytes: 16 * 1024, mode: 0o600 }); }
222
+ catch (err) {
223
+ throw new NubosPilotError(
224
+ 'spawn-headless-audit-persist-failed',
225
+ 'spawn-trail append failed; refusing to commit response without audit record',
226
+ { file: 'spawns.jsonl', cause: (err && err.code) || 'unknown' },
227
+ );
228
+ }
229
+
230
+ const payload = {
231
+ agent: o.agent,
232
+ output_path: o.outputPath,
233
+ output_path_resolved: o.resolvedOutput,
234
+ exit_code: exitCode,
235
+ stderr_excerpt: spawnRecord.stderr_excerpt,
236
+ bin: spawnRecord.bin,
237
+ timed_out: false,
238
+ run_id: o.runId,
239
+ spawn_trail_path: spawnTrailPath,
240
+ model_actual: spawnRecord.model_actual,
241
+ tokens_in: null,
242
+ tokens_out: null,
243
+ payload_parse_ok: spawnRecord.payload_parse_ok,
244
+ off_host: true,
245
+ };
246
+ o.stdout.write(JSON.stringify(payload) + '\n');
247
+ return exitCode;
248
+ }
249
+
250
+ async function run(argv, ctx) {
170
251
  const context = ctx || {};
171
252
  const cwd = context.cwd || process.cwd();
172
253
  const stdout = context.stdout || process.stdout;
@@ -230,6 +311,24 @@ function run(argv, ctx) {
230
311
 
231
312
  const runId = runContext.getRunId();
232
313
 
314
+ // ADR-0021: off-host routing. If the agent routes to an openai-compat provider,
315
+ // run the nubos-pilot dispatch loop instead of `claude -p`, writing the result
316
+ // in the same {result,...} envelope `claude --output-format json` produces so the
317
+ // callers (review/extract parseXxxOutput → outer.result) parse it unchanged. The
318
+ // native path below is untouched. run() is async; the CLI dispatcher already
319
+ // awaits returned promises (handleRunResult), and the two in-process callers
320
+ // (review.cjs / extract.cjs) await runReview/runExtract.
321
+ const resolveKind = context.resolveImpl || _defaultResolveKind;
322
+ let routed;
323
+ try { routed = resolveKind(agent, cwd); } catch { routed = { kind: 'native' }; }
324
+ if (routed && routed.kind === 'openai-compat') {
325
+ return _runOffHost({
326
+ agent, userPrompt, resolvedOutput, outputPath, cwd, runId, stdout,
327
+ startedAt: new Date().toISOString(),
328
+ dispatchImpl: context.dispatchImpl,
329
+ });
330
+ }
331
+
233
332
  let lockRoot;
234
333
  try { lockRoot = findProjectRoot(cwd); }
235
334
  catch { lockRoot = cwd; }
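The off-host branch writes its answer to the output path as `{ type, result, model, provider, is_error }`, which the code comment says mirrors the envelope from `claude --output-format json` so existing callers can keep reading `outer.result`. A small reader for that file might look like the sketch below. `readEnvelopeSketch` is hypothetical; the package's own parsers in `review.cjs` and `extract.cjs` are not shown in this section and may treat errors differently.

```javascript
'use strict';

// Hypothetical reader for the envelope written by _runOffHost.
// NOT the package's parser; field names are taken from the diff above.
function readEnvelopeSketch(text) {
  const outer = JSON.parse(text);
  if (outer.type !== 'result') {
    throw new Error('unexpected envelope type: ' + outer.type);
  }
  if (outer.is_error) {
    // On failure _runOffHost writes an empty result and null model/provider.
    throw new Error('off-host run failed (provider: ' + outer.provider + ')');
  }
  return { text: outer.result, model: outer.model, provider: outer.provider };
}
```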