@runtypelabs/cli 2.0.2 → 2.1.0

package/README.md CHANGED
@@ -220,15 +220,44 @@ EOF
220
220
  - `planWritten` — advances when the agent writes its plan artifact
221
221
  - `never` — only the agent's `TASK_COMPLETE` signal can advance (if `canAcceptCompletion: true`)
222
222
 
223
+ **Playbook policies**:
224
+
225
+ The optional `policy` block lets you restrict what the agent can do at runtime. Policies are additive restrictions — they can only narrow behavior, never override global safety denies (e.g. `.env` files and private keys are always blocked).
226
+
227
+ ```yaml
228
+ name: blog-writer
229
+ policy:
230
+ allowedReadGlobs: ['content/**', 'templates/**']
231
+ allowedWriteGlobs: ['content/**']
232
+ blockedTools: ['search_repo']
233
+ blockDiscoveryTools: true
234
+ requirePlanBeforeWrite: true
235
+ requireVerification: true
236
+ outputRoot: 'content/'
237
+ milestones:
238
+ - ...
239
+ ```
240
+
241
+ | Field | Type | Description |
242
+ | ------------------------ | ---------- | ----------------------------------------------------------------------------------------------------------------------------- |
243
+ | `allowedReadGlobs` | `string[]` | Glob patterns for allowed read paths. If set, reads outside these are blocked. |
244
+ | `allowedWriteGlobs` | `string[]` | Glob patterns for allowed write paths. If set, writes outside these are blocked. The plan file is always writable regardless. |
245
+ | `blockedTools` | `string[]` | Tool names to block entirely (e.g. `["write_file", "search_repo"]`). |
246
+ | `blockDiscoveryTools` | `boolean` | Block `search_repo`, `glob_files`, `tree_directory`, and `list_directory`. |
247
+ | `requirePlanBeforeWrite` | `boolean` | Require the agent to write its plan before any other file writes. |
248
+ | `requireVerification` | `boolean` | Require verification before `TASK_COMPLETE`. |
249
+ | `outputRoot` | `string` | For creation tasks: confine writes to this directory (e.g. `"public/"`). |
250
+
223
251
  #### Marathon Anatomy
224
252
 
225
253
  ```
226
254
  ┌─ marathon ──────────────────────────────────────────────────────┐
227
255
  │ │
228
- │ ┌─ playbook (optional) ─────────────────────────────┐
229
- │ │ Defines milestones, models, verification, rules
230
- │ │ .runtype/marathons/playbooks/tdd.yaml
231
- └───────────────────────────────────────────────────┘
256
+ │ ┌─ playbook (optional) ──────────────────────────────────┐
257
+ │ │ Defines milestones, models, verification, rules,
258
+ │ │ and policy constraints
259
+ │ │ .runtype/marathons/playbooks/tdd.yaml │ │
260
+ │ └────────────────────────────────────────────────────────┘ │
232
261
  │ │ │
233
262
  │ ▼ │
234
263
  │ ┌─ milestone 1 ──┐ ┌─ milestone 2 ──┐ ┌─ milestone 3 ─────┐ |
@@ -261,6 +290,7 @@ What's optional:
261
290
  ✓ Rules Without them, agent follows only playbook/milestone instructions
262
291
  ✓ Models Without overrides, uses CLI --model flag or default
263
292
  ✓ Verification Without it, no verification gate between milestones
293
+ ✓ Policy Without one, only global safety denies apply
264
294
  ```
265
295
 
266
296
  #### Reasoning / Thinking
@@ -271,6 +301,44 @@ Marathon enables model reasoning by default for models that support it (Gemini 3
271
301
  runtype marathon "Code Builder" --goal "Fix the bug" --no-reasoning
272
302
  ```
273
303
 
304
+ #### Fallback Models
305
+
306
+ When an upstream model provider returns a transient error (e.g. overload, rate limit), marathon can automatically retry and then fall back to a different model instead of dying mid-run.
307
+
308
+ **CLI flag** — applies to all phases:
309
+
310
+ ```bash
311
+ # If claude-opus-4-6 fails, retry once then fall back to claude-sonnet-4-5
312
+ runtype marathon "Code Builder" --goal "Refactor auth" \
313
+ --model claude-opus-4-6 \
314
+ --fallback-model claude-sonnet-4-5
315
+ ```
316
+
317
+ **Playbook** — per-milestone fallback chains:
318
+
319
+ ```yaml
320
+ milestones:
321
+ - name: research
322
+ model: claude-sonnet-4-5
323
+ fallbackModels:
324
+ - gpt-4o # string shorthand
325
+ - gemini-3-flash
326
+ instructions: |
327
+ Research the codebase...
328
+
329
+ - name: execution
330
+ model: claude-opus-4-6
331
+ fallbackModels:
332
+ - model: claude-sonnet-4-5 # object form with overrides
333
+ temperature: 0.5
334
+ - model: gpt-4o
335
+ maxTokens: 8192
336
+ instructions: |
337
+ Implement the changes...
338
+ ```
339
+
340
+ Playbook per-milestone fallbacks take priority over the CLI `--fallback-model` flag. The fallback chain always starts with a retry (5s delay) before trying alternative models.
341
+
274
342
  #### Tool Context Modes
275
343
 
276
344
  When a marathon runs multiple sessions, tool call/result pairs from previous sessions are preserved in the conversation history. The `--tool-context` flag controls how older tool results are stored to balance cost and re-readability:
package/dist/index.js CHANGED
@@ -12272,7 +12272,7 @@ import { theme as theme24 } from "@runtypelabs/ink-components";
12272
12272
  import { jsx as jsx25, jsxs as jsxs21 } from "react/jsx-runtime";
12273
12273
  var MENU_ITEMS = [
12274
12274
  { key: "c", label: "Copy session JSON" },
12275
- { key: "o", label: "Open session JSON in editor" },
12275
+ { key: "e", label: "Open session JSON in editor" },
12276
12276
  { key: "f", label: "Open marathon folder in file manager" },
12277
12277
  { key: "d", label: "Open agent in Runtype dashboard" }
12278
12278
  ];
@@ -12294,7 +12294,7 @@ function SessionActionMenu({
12294
12294
  onCopySession();
12295
12295
  return;
12296
12296
  }
12297
- if (input === "o" && hasStateFile) {
12297
+ if (input === "e" && hasStateFile) {
12298
12298
  onOpenStateFile();
12299
12299
  return;
12300
12300
  }
@@ -12320,7 +12320,7 @@ function SessionActionMenu({
12320
12320
  children: [
12321
12321
  /* @__PURE__ */ jsx25(Text24, { bold: true, color: theme24.accent, children: "Session" }),
12322
12322
  /* @__PURE__ */ jsx25(Box22, { flexDirection: "column", marginTop: 1, children: MENU_ITEMS.map((item) => {
12323
- const dimmed = item.key === "o" && !hasStateFile || item.key === "f" && !hasStateFile || item.key === "d" && !hasDashboard;
12323
+ const dimmed = item.key === "e" && !hasStateFile || item.key === "f" && !hasStateFile || item.key === "d" && !hasDashboard;
12324
12324
  return /* @__PURE__ */ jsxs21(Text24, { children: [
12325
12325
  /* @__PURE__ */ jsx25(Text24, { color: dimmed ? theme24.textSubtle : theme24.accentActive, children: ` ${item.key} ` }),
12326
12326
  /* @__PURE__ */ jsx25(Text24, { color: dimmed ? theme24.textSubtle : theme24.textMuted, children: item.label })
@@ -15311,7 +15311,9 @@ function extractRunTaskResumeState(state) {
15311
15311
  ...sanitized.bestCandidateNeedsVerification ? { bestCandidateNeedsVerification: sanitized.bestCandidateNeedsVerification } : {},
15312
15312
  ...sanitized.bestCandidateVerified ? { bestCandidateVerified: sanitized.bestCandidateVerified } : {},
15313
15313
  ...sanitized.verificationRequired !== void 0 ? { verificationRequired: sanitized.verificationRequired } : {},
15314
- ...sanitized.lastVerificationPassed ? { lastVerificationPassed: sanitized.lastVerificationPassed } : {}
15314
+ ...sanitized.lastVerificationPassed ? { lastVerificationPassed: sanitized.lastVerificationPassed } : {},
15315
+ ...sanitized.isCreationTask !== void 0 ? { isCreationTask: sanitized.isCreationTask } : {},
15316
+ ...sanitized.outputRoot ? { outputRoot: sanitized.outputRoot } : {}
15315
15317
  };
15316
15318
  }
15317
15319
  function findStateFile(name, stateDir) {
@@ -15476,6 +15478,29 @@ var IGNORED_REPO_DIRS = /* @__PURE__ */ new Set([
15476
15478
  "dist",
15477
15479
  "node_modules"
15478
15480
  ]);
15481
+ var SENSITIVE_PATH_PATTERNS = [
15482
+ { name: ".env", test: (n) => n === ".env" || n.endsWith("/.env") },
15483
+ { name: ".env.*", test: (n) => /\.env\.?[^/]*$/.test(n) || /\/\.env\.?[^/]*$/.test(n) },
15484
+ { name: "private keys", test: (n) => /(^|\/)(id_rsa|id_ed25519|id_ecdsa)(\.pub)?$/.test(n) },
15485
+ { name: "known_hosts", test: (n) => n.endsWith("known_hosts") || n.endsWith("/known_hosts") },
15486
+ { name: "authorized_keys", test: (n) => n.endsWith("authorized_keys") || n.endsWith("/authorized_keys") },
15487
+ { name: "cert/key extensions", test: (n) => /\.(pem|key|p12|pfx)$/i.test(n) },
15488
+ { name: "npm/pypi config", test: (n) => /(^|\/)(\.npmrc|\.pypirc|\.netrc)$/.test(n) },
15489
+ { name: "docker config", test: (n) => /\.docker\/config\.json$/i.test(n) },
15490
+ { name: "credentials", test: (n) => /(^|\/)(credentials\.json|secrets\.json)$/i.test(n) },
15491
+ { name: "service account", test: (n) => /service-account.*\.json$/i.test(n) || /firebase-admin.*\.json$/i.test(n) },
15492
+ { name: ".ssh", test: (n) => n === ".ssh" || n.startsWith(".ssh/") || n.includes("/.ssh/") },
15493
+ { name: ".aws", test: (n) => n === ".aws" || n.startsWith(".aws/") || n.includes("/.aws/") },
15494
+ { name: ".gnupg", test: (n) => n === ".gnupg" || n.startsWith(".gnupg/") || n.includes("/.gnupg/") },
15495
+ { name: ".terraform", test: (n) => n === ".terraform" || n.startsWith(".terraform/") || n.includes("/.terraform/") },
15496
+ { name: ".git", test: (n) => n === ".git" || n.startsWith(".git/") || n.includes("/.git/") },
15497
+ { name: ".runtype", test: (n) => n === ".runtype" || n.startsWith(".runtype/") || n.includes("/.runtype/") }
15498
+ ];
15499
+ function isSensitivePath(normalizedPath) {
15500
+ const n = normalizedPath.replace(/\\/g, "/").trim();
15501
+ if (!n) return false;
15502
+ return SENSITIVE_PATH_PATTERNS.some(({ test }) => test(n));
15503
+ }
15479
15504
  var DEFAULT_DISCOVERY_MAX_RESULTS = 50;
15480
15505
  var MAX_FILE_BYTES_TO_SCAN = 1024 * 1024;
15481
15506
  var LOW_SIGNAL_FILE_NAMES = /* @__PURE__ */ new Set([
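Note on the hunk above: the new global deny list can be exercised in isolation. The sketch below copies a subset of the `SENSITIVE_PATH_PATTERNS` entries and `isSensitivePath` from this diff; the sample paths and the `matchedPatternName` helper are hypothetical and exist only for illustration. It suggests that the `.env.*` regex is not anchored to the start of a path segment, so it also appears to match file names that merely contain `.env`.

```javascript
// Subset of SENSITIVE_PATH_PATTERNS, copied from the 2.1.0 hunk above.
const SENSITIVE_PATH_PATTERNS = [
  { name: ".env", test: (n) => n === ".env" || n.endsWith("/.env") },
  { name: ".env.*", test: (n) => /\.env\.?[^/]*$/.test(n) || /\/\.env\.?[^/]*$/.test(n) },
  { name: "private keys", test: (n) => /(^|\/)(id_rsa|id_ed25519|id_ecdsa)(\.pub)?$/.test(n) },
  { name: "cert/key extensions", test: (n) => /\.(pem|key|p12|pfx)$/i.test(n) },
  { name: ".git", test: (n) => n === ".git" || n.startsWith(".git/") || n.includes("/.git/") }
];

// Copied from the diff: normalizes backslashes, then tests every pattern.
function isSensitivePath(normalizedPath) {
  const n = normalizedPath.replace(/\\/g, "/").trim();
  if (!n) return false;
  return SENSITIVE_PATH_PATTERNS.some(({ test }) => test(n));
}

// Hypothetical helper (not in the package): name of the first matching pattern, or null.
function matchedPatternName(somePath) {
  const n = somePath.replace(/\\/g, "/").trim();
  if (!n) return null;
  const hit = SENSITIVE_PATH_PATTERNS.find(({ test }) => test(n));
  return hit ? hit.name : null;
}

// Made-up sample paths.
const samples = [
  ".env.local",
  "src/index.ts",
  "certs\\server.PEM",
  "vendor/.git/config",
  "src/config.environment.ts" // contains ".env" mid-name, so the unanchored regex matches
];

for (const sample of samples) {
  console.log(sample, "->", matchedPatternName(sample) ?? "allowed");
}
```

If the over-match on names such as `config.environment.ts` is unintended, anchoring the regex to a segment start (for example `(^|\/)\.env(\.[^/]*)?$`) would be one option; that is a suggestion, not something this release does.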
@@ -15564,12 +15589,15 @@ function scoreSearchPath(relativePath) {
15564
15589
  return score;
15565
15590
  }
15566
15591
  function shouldIgnoreRepoEntry(entryPath) {
15567
- const normalized = normalizeToolPath(entryPath);
15592
+ const normalized = normalizeToolPath(entryPath).replace(/\\/g, "/");
15568
15593
  if (normalized === ".") return false;
15594
+ if (isSensitivePath(normalized)) return true;
15569
15595
  return normalized.split(path8.sep).some((segment) => IGNORED_REPO_DIRS.has(segment));
15570
15596
  }
15571
15597
  function safeReadTextFile(filePath) {
15572
15598
  try {
15599
+ const normalized = normalizeToolPath(filePath).replace(/\\/g, "/");
15600
+ if (isSensitivePath(normalized)) return null;
15573
15601
  const stat = fs8.statSync(filePath);
15574
15602
  if (!stat.isFile() || stat.size > MAX_FILE_BYTES_TO_SCAN) return null;
15575
15603
  const buffer = fs8.readFileSync(filePath);
@@ -15700,9 +15728,10 @@ function resolveToolPath(toolPath, options = {}) {
15700
15728
  return { ok: false, error: `Path does not exist: ${requestedPath}` };
15701
15729
  }
15702
15730
  const workspaceRoot = fs9.realpathSync.native(process.cwd());
15731
+ const extraRoots = (options.allowedRoots || []).map((rootPath) => canonicalizeAllowedRoot(rootPath));
15703
15732
  const allowedRoots = [
15704
- workspaceRoot,
15705
- ...(options.allowedRoots || []).map((rootPath) => canonicalizeAllowedRoot(rootPath))
15733
+ ...extraRoots,
15734
+ workspaceRoot
15706
15735
  ];
15707
15736
  const matchedRoot = allowedRoots.find(
15708
15737
  (rootPath) => isPathWithinRoot(resolved.canonicalPath, rootPath)
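Note on the reordering above: `Array.prototype.find` returns the first element that satisfies the predicate, so moving `extraRoots` ahead of `workspaceRoot` changes which root is reported as `matchedRoot` when a path lies inside both. The diff does not state why the order changed; one plausible reading is that task-state and plan directories nested under the workspace should match their own root first. The sketch below assumes a simple prefix-based containment check (`isPathWithinRootSketch` is a hypothetical stand-in for the real `isPathWithinRoot`, which is not shown here) and uses made-up paths.

```javascript
// Hypothetical stand-in for isPathWithinRoot; the real helper is not part of this diff.
function isPathWithinRootSketch(target, root) {
  const prefix = root.endsWith("/") ? root : root + "/";
  return target === root || target.startsWith(prefix);
}

// Mirrors the allowedRoots.find(...) call in resolveToolPath.
function matchRoot(target, allowedRoots) {
  return allowedRoots.find((root) => isPathWithinRootSketch(target, root));
}

// Made-up paths for illustration.
const workspaceRoot = "/repo";
const planDir = "/repo/.runtype/marathons/demo";
const target = "/repo/.runtype/marathons/demo/plan.md";

// 2.0.2 order: the workspace root wins because it comes first.
const before = matchRoot(target, [workspaceRoot, planDir]);
// 2.1.0 order: the more specific extra root wins.
const after = matchRoot(target, [planDir, workspaceRoot]);

console.log("2.0.2 matched root:", before);
console.log("2.1.0 matched root:", after);
```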
@@ -15721,6 +15750,13 @@ function resolveToolPath(toolPath, options = {}) {
15721
15750
  error: `Access denied: ${requestedPath} is inside restricted workspace state (${blockedSegment})`
15722
15751
  };
15723
15752
  }
15753
+ const relativeFromWorkspace = path9.relative(workspaceRoot, resolved.canonicalPath).replace(/\\/g, "/");
15754
+ if (isSensitivePath(relativeFromWorkspace)) {
15755
+ return {
15756
+ ok: false,
15757
+ error: `Access denied: ${requestedPath} is a sensitive path and cannot be read or written`
15758
+ };
15759
+ }
15724
15760
  }
15725
15761
  if (resolved.exists) {
15726
15762
  const stat = fs9.statSync(resolved.canonicalPath);
@@ -15741,8 +15777,17 @@ function resolveToolPath(toolPath, options = {}) {
15741
15777
  }
15742
15778
  return { ok: true, resolvedPath: resolved.canonicalPath };
15743
15779
  }
15780
+ function getTaskStateRoot(taskName, stateDir) {
15781
+ return path9.join(stateDir || getMarathonStateDir(), stateSafeName3(taskName));
15782
+ }
15744
15783
  function createDefaultLocalTools(context) {
15745
- const allowedReadRoots = context?.taskName ? [getOffloadedOutputDir(context.taskName, context.stateDir)] : [];
15784
+ const taskStateRoot = context?.taskName ? getTaskStateRoot(context.taskName, context.stateDir) : void 0;
15785
+ const planDir = context?.taskName ? path9.resolve(`.runtype/marathons/${stateSafeName3(context.taskName)}`) : void 0;
15786
+ const allowedReadRoots = context?.taskName ? [
15787
+ getOffloadedOutputDir(context.taskName, context.stateDir),
15788
+ ...taskStateRoot ? [taskStateRoot] : [],
15789
+ ...planDir ? [planDir] : []
15790
+ ] : [];
15746
15791
  return {
15747
15792
  read_file: {
15748
15793
  description: "Read the contents of a file at the given path",
@@ -15944,6 +15989,8 @@ function createDefaultLocalTools(context) {
15944
15989
  };
15945
15990
  }
15946
15991
  function createCheckpointedWriteFileTool(taskName, stateDir) {
15992
+ const taskStateRoot = getTaskStateRoot(taskName, stateDir);
15993
+ const planDir = path9.resolve(`.runtype/marathons/${stateSafeName3(taskName)}`);
15947
15994
  return {
15948
15995
  description: "Write content to a file, creating directories as needed and checkpointing original repo files",
15949
15996
  parametersSchema: {
@@ -15956,7 +16003,8 @@ function createCheckpointedWriteFileTool(taskName, stateDir) {
15956
16003
  },
15957
16004
  execute: async (args) => {
15958
16005
  const resolvedPath = resolveToolPath(String(args.path || ""), {
15959
- allowMissing: true
16006
+ allowMissing: true,
16007
+ allowedRoots: [taskStateRoot, planDir]
15960
16008
  });
15961
16009
  if (!resolvedPath.ok) return `Error: ${resolvedPath.error}`;
15962
16010
  const content = String(args.content || "");
@@ -16047,6 +16095,7 @@ function createRunCheckTool() {
16047
16095
  if (!isSafeVerificationCommand(command)) {
16048
16096
  return JSON.stringify({
16049
16097
  success: false,
16098
+ blocked: true,
16050
16099
  command,
16051
16100
  error: "Blocked unsafe verification command. Use a single non-destructive lint/test/typecheck/build command."
16052
16101
  });
@@ -16462,12 +16511,46 @@ function resolveModelForPhase(phase, cliOverrides, milestoneModels) {
16462
16511
  }
16463
16512
  return cliOverrides.defaultModel;
16464
16513
  }
16514
+ function resolveErrorHandlingForPhase(phase, cliFallbackModel, milestoneFallbackModels) {
16515
+ const phaseFallbacks = phase ? milestoneFallbackModels?.[phase] : void 0;
16516
+ if (phaseFallbacks?.length) {
16517
+ return {
16518
+ onError: "fallback",
16519
+ fallbacks: [
16520
+ { type: "retry", delay: 5e3 },
16521
+ ...phaseFallbacks.map((fb) => ({
16522
+ type: "model",
16523
+ model: fb.model,
16524
+ ...fb.temperature !== void 0 ? { temperature: fb.temperature } : {},
16525
+ ...fb.maxTokens !== void 0 ? { maxTokens: fb.maxTokens } : {}
16526
+ }))
16527
+ ]
16528
+ };
16529
+ }
16530
+ if (cliFallbackModel) {
16531
+ return {
16532
+ onError: "fallback",
16533
+ fallbacks: [
16534
+ { type: "retry", delay: 5e3 },
16535
+ { type: "model", model: cliFallbackModel }
16536
+ ]
16537
+ };
16538
+ }
16539
+ return void 0;
16540
+ }
16465
16541
 
16466
16542
  // src/marathon/playbook-loader.ts
16467
16543
  import * as fs12 from "fs";
16468
16544
  import * as path12 from "path";
16469
16545
  import * as os4 from "os";
16546
+ import micromatch from "micromatch";
16470
16547
  import { parse as parseYaml } from "yaml";
16548
+ var DISCOVERY_TOOLS = /* @__PURE__ */ new Set([
16549
+ "search_repo",
16550
+ "glob_files",
16551
+ "tree_directory",
16552
+ "list_directory"
16553
+ ]);
16471
16554
  var PLAYBOOKS_DIR = ".runtype/marathons/playbooks";
16472
16555
  function getCandidatePaths(nameOrPath, cwd) {
16473
16556
  const home = os4.homedir();
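Note on `resolveErrorHandlingForPhase` in the hunk above: the precedence described in the README (per-milestone fallbacks over `--fallback-model`, with a 5 s retry first) can be checked by running the function on its own. The function body below is copied from this diff; the milestone map and model names are illustrative inputs taken from the README example, not output captured from the CLI.

```javascript
// Copied from the 2.1.0 hunk above.
function resolveErrorHandlingForPhase(phase, cliFallbackModel, milestoneFallbackModels) {
  const phaseFallbacks = phase ? milestoneFallbackModels?.[phase] : void 0;
  if (phaseFallbacks?.length) {
    return {
      onError: "fallback",
      fallbacks: [
        { type: "retry", delay: 5e3 },
        ...phaseFallbacks.map((fb) => ({
          type: "model",
          model: fb.model,
          ...(fb.temperature !== void 0 ? { temperature: fb.temperature } : {}),
          ...(fb.maxTokens !== void 0 ? { maxTokens: fb.maxTokens } : {})
        }))
      ]
    };
  }
  if (cliFallbackModel) {
    return {
      onError: "fallback",
      fallbacks: [
        { type: "retry", delay: 5e3 },
        { type: "model", model: cliFallbackModel }
      ]
    };
  }
  return void 0;
}

// Illustrative input: only the "execution" milestone declares fallbacks.
const milestoneFallbackModels = {
  execution: [
    { model: "claude-sonnet-4-5", temperature: 0.5 },
    { model: "gpt-4o", maxTokens: 8192 }
  ]
};

// Milestone chain wins over the CLI flag.
const execution = resolveErrorHandlingForPhase("execution", "gemini-3-flash", milestoneFallbackModels);
// No milestone chain for "research", so the CLI flag applies.
const research = resolveErrorHandlingForPhase("research", "gemini-3-flash", milestoneFallbackModels);
// Neither source configured for this phase: no error handling is returned.
const none = resolveErrorHandlingForPhase("research", undefined, milestoneFallbackModels);

console.log(JSON.stringify(execution, null, 2));
console.log(JSON.stringify(research, null, 2));
console.log(none);
```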
@@ -16542,7 +16625,54 @@ function buildIsComplete(criteria) {
16542
16625
  return () => false;
16543
16626
  }
16544
16627
  }
16628
+ function buildPolicyIntercept(policy) {
16629
+ if (!policy.blockedTools?.length && !policy.blockDiscoveryTools && !policy.allowedReadGlobs?.length && !policy.allowedWriteGlobs?.length && !policy.requirePlanBeforeWrite) {
16630
+ return void 0;
16631
+ }
16632
+ const blockedSet = new Set(
16633
+ (policy.blockedTools ?? []).map((t) => t.trim()).filter(Boolean)
16634
+ );
16635
+ const readGlobs = policy.allowedReadGlobs ?? [];
16636
+ const writeGlobs = policy.allowedWriteGlobs ?? [];
16637
+ return (toolName, args, ctx) => {
16638
+ if (blockedSet.has(toolName)) {
16639
+ return `Blocked by playbook policy: ${toolName} is not allowed for this task.`;
16640
+ }
16641
+ if (policy.blockDiscoveryTools && DISCOVERY_TOOLS.has(toolName)) {
16642
+ return `Blocked by playbook policy: discovery tools are disabled for this task.`;
16643
+ }
16644
+ const pathArg = typeof args.path === "string" && args.path.trim() ? ctx.normalizePath(String(args.path)) : void 0;
16645
+ if (pathArg) {
16646
+ const isWrite = toolName === "write_file" || toolName === "restore_file_checkpoint";
16647
+ const isRead = toolName === "read_file";
16648
+ if (isRead && readGlobs.length > 0) {
16649
+ const allowed = micromatch.some(pathArg, readGlobs, { dot: true });
16650
+ if (!allowed) {
16651
+ return `Blocked by playbook policy: ${toolName} path "${pathArg}" is outside allowed read globs: ${readGlobs.join(", ")}`;
16652
+ }
16653
+ }
16654
+ if (isWrite && writeGlobs.length > 0) {
16655
+ const planPath = ctx.state.planPath ? ctx.normalizePath(ctx.state.planPath) : void 0;
16656
+ if (planPath && pathArg === planPath) {
16657
+ } else {
16658
+ const allowed = micromatch.some(pathArg, writeGlobs, { dot: true });
16659
+ if (!allowed) {
16660
+ return `Blocked by playbook policy: ${toolName} path "${pathArg}" is outside allowed write globs: ${writeGlobs.join(", ")}`;
16661
+ }
16662
+ }
16663
+ }
16664
+ if (isWrite && policy.requirePlanBeforeWrite && !ctx.state.planWritten && !ctx.trace.planWritten) {
16665
+ const planPath = ctx.state.planPath ? ctx.normalizePath(ctx.state.planPath) : void 0;
16666
+ if (!planPath || pathArg !== planPath) {
16667
+ return `Blocked by playbook policy: write the plan before creating other files.`;
16668
+ }
16669
+ }
16670
+ }
16671
+ return void 0;
16672
+ };
16673
+ }
16545
16674
  function convertToWorkflow(config2) {
16675
+ const policyIntercept = config2.policy ? buildPolicyIntercept(config2.policy) : void 0;
16546
16676
  const phases = config2.milestones.map((milestone) => ({
16547
16677
  name: milestone.name,
16548
16678
  description: milestone.description,
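Note on `buildPolicyIntercept` in the hunk above: the order of checks is easier to see in a reduced form. The sketch below is a simplified approximation, not the shipped function: it keeps the blocked-tool, discovery-tool, and plan-before-write checks, and it leaves out the `micromatch` glob checks, `ctx.normalizePath`, and `ctx.trace`. The name `buildPolicyInterceptSketch` and the flat `state` argument are hypothetical.

```javascript
// Same tool names as DISCOVERY_TOOLS in the diff.
const DISCOVERY_TOOLS = new Set(["search_repo", "glob_files", "tree_directory", "list_directory"]);

// Hypothetical, reduced version of buildPolicyIntercept (glob checks omitted).
function buildPolicyInterceptSketch(policy) {
  const blockedSet = new Set((policy.blockedTools ?? []).map((t) => t.trim()).filter(Boolean));
  return (toolName, args, state) => {
    // 1. Explicitly blocked tools are rejected first.
    if (blockedSet.has(toolName)) {
      return `Blocked by playbook policy: ${toolName} is not allowed for this task.`;
    }
    // 2. Then the discovery-tool switch.
    if (policy.blockDiscoveryTools && DISCOVERY_TOOLS.has(toolName)) {
      return "Blocked by playbook policy: discovery tools are disabled for this task.";
    }
    // 3. Path-based checks only run when a non-empty string path is supplied.
    //    Assumption: plain trim() stands in for ctx.normalizePath.
    const pathArg = typeof args.path === "string" && args.path.trim() ? args.path.trim() : undefined;
    const isWrite = toolName === "write_file" || toolName === "restore_file_checkpoint";
    if (pathArg && isWrite && policy.requirePlanBeforeWrite && !state.planWritten) {
      if (!state.planPath || pathArg !== state.planPath) {
        return "Blocked by playbook policy: write the plan before creating other files.";
      }
    }
    // undefined means "no objection"; the tool call proceeds.
    return undefined;
  };
}

const intercept = buildPolicyInterceptSketch({
  blockedTools: ["search_repo"],
  blockDiscoveryTools: true,
  requirePlanBeforeWrite: true
});
const state = { planPath: "plan.md", planWritten: false };

console.log(intercept("search_repo", {}, state));
console.log(intercept("glob_files", {}, state));
console.log(intercept("write_file", { path: "content/post.md" }, state));
console.log(intercept("write_file", { path: "plan.md" }, state));
```

In this reduced form, as in the diff, the intercept returns a message string to block a call and `undefined` to let it through, and a write to the plan path is the only write allowed before the plan exists.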
@@ -16558,6 +16688,7 @@ ${instructions}`;
16558
16688
  return milestone.toolGuidance ?? [];
16559
16689
  },
16560
16690
  isComplete: buildIsComplete(milestone.completionCriteria),
16691
+ interceptToolCall: policyIntercept,
16561
16692
  // Default to rejecting TASK_COMPLETE unless the playbook explicitly allows it.
16562
16693
  // The SDK accepts completion by default when canAcceptCompletion is undefined,
16563
16694
  // which would let the model end the marathon prematurely in early phases.
@@ -16568,23 +16699,37 @@ ${instructions}`;
16568
16699
  phases
16569
16700
  };
16570
16701
  }
16702
+ function normalizeFallbackModel(input) {
16703
+ if (typeof input === "string") return { model: input };
16704
+ return {
16705
+ model: input.model,
16706
+ ...input.temperature !== void 0 ? { temperature: input.temperature } : {},
16707
+ ...input.maxTokens !== void 0 ? { maxTokens: input.maxTokens } : {}
16708
+ };
16709
+ }
16571
16710
  function loadPlaybook(nameOrPath, cwd) {
16572
16711
  const baseCwd = cwd || process.cwd();
16573
16712
  const candidates = getCandidatePaths(nameOrPath, baseCwd);
16574
16713
  for (const candidate of candidates) {
16575
- if (!fs12.existsSync(candidate)) continue;
16714
+ if (!fs12.existsSync(candidate) || fs12.statSync(candidate).isDirectory()) continue;
16576
16715
  const config2 = parsePlaybookFile(candidate);
16577
16716
  validatePlaybook(config2, candidate);
16578
16717
  const milestoneModels = {};
16718
+ const milestoneFallbackModels = {};
16579
16719
  for (const m of config2.milestones) {
16580
16720
  if (m.model) milestoneModels[m.name] = m.model;
16721
+ if (m.fallbackModels?.length) {
16722
+ milestoneFallbackModels[m.name] = m.fallbackModels.map(normalizeFallbackModel);
16723
+ }
16581
16724
  }
16582
16725
  return {
16583
16726
  workflow: convertToWorkflow(config2),
16584
16727
  milestones: config2.milestones.map((m) => m.name),
16585
16728
  milestoneModels: Object.keys(milestoneModels).length > 0 ? milestoneModels : void 0,
16729
+ milestoneFallbackModels: Object.keys(milestoneFallbackModels).length > 0 ? milestoneFallbackModels : void 0,
16586
16730
  verification: config2.verification,
16587
- rules: config2.rules
16731
+ rules: config2.rules,
16732
+ policy: config2.policy
16588
16733
  };
16589
16734
  }
16590
16735
  throw new Error(
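Note on `normalizeFallbackModel` in the hunk above: it is what makes the README's two `fallbackModels` spellings (string shorthand and object form) equivalent downstream. The function below is copied from this diff; the input list is illustrative, and the `note` key is a made-up extra field included to show that only `model`, `temperature`, and `maxTokens` are carried over.

```javascript
// Copied from the 2.1.0 hunk above.
function normalizeFallbackModel(input) {
  if (typeof input === "string") return { model: input };
  return {
    model: input.model,
    ...(input.temperature !== void 0 ? { temperature: input.temperature } : {}),
    ...(input.maxTokens !== void 0 ? { maxTokens: input.maxTokens } : {})
  };
}

// Illustrative playbook entries, as they might look after YAML parsing.
const fallbackModels = [
  "gpt-4o", // string shorthand
  { model: "claude-sonnet-4-5", temperature: 0.5 }, // object form with an override
  { model: "gpt-4o", maxTokens: 8192, note: "made-up extra key" } // unknown keys are dropped
];

const normalized = fallbackModels.map(normalizeFallbackModel);
console.log(JSON.stringify(normalized, null, 2));
```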
@@ -16749,13 +16894,22 @@ function normalizeMarathonAgentArgument(agent) {
16749
16894
  function buildMarathonAutoCreatedAgentBootstrap(agentName, options = {}) {
16750
16895
  const normalizedModel = options.model?.trim();
16751
16896
  const normalizedToolIds = [...new Set((options.toolIds || []).map((toolId) => toolId.trim()).filter(Boolean))];
16752
- const config2 = normalizedModel || normalizedToolIds.length > 0 ? {
16897
+ const normalizedFallbackModel = options.fallbackModel?.trim();
16898
+ const errorHandling = normalizedFallbackModel ? {
16899
+ onError: "fallback",
16900
+ fallbacks: [
16901
+ { type: "retry", delay: 5e3 },
16902
+ { type: "model", model: normalizedFallbackModel }
16903
+ ]
16904
+ } : void 0;
16905
+ const config2 = normalizedModel || normalizedToolIds.length > 0 || errorHandling ? {
16753
16906
  ...normalizedModel ? { model: normalizedModel } : {},
16754
16907
  ...normalizedToolIds.length > 0 ? {
16755
16908
  tools: {
16756
16909
  toolIds: normalizedToolIds
16757
16910
  }
16758
- } : {}
16911
+ } : {},
16912
+ ...errorHandling ? { errorHandling } : {}
16759
16913
  } : void 0;
16760
16914
  return {
16761
16915
  description: `Powering a marathon for ${agentName}`,
@@ -17109,11 +17263,17 @@ async function taskAction(agent, options) {
17109
17263
  let playbookWorkflow;
17110
17264
  let playbookMilestones;
17111
17265
  let playbookMilestoneModels;
17266
+ let playbookMilestoneFallbackModels;
17267
+ let playbookPolicy;
17112
17268
  if (options.playbook) {
17113
17269
  const result = loadPlaybook(options.playbook);
17114
17270
  playbookWorkflow = result.workflow;
17115
17271
  playbookMilestones = result.milestones;
17116
17272
  playbookMilestoneModels = result.milestoneModels;
17273
+ playbookMilestoneFallbackModels = result.milestoneFallbackModels;
17274
+ playbookPolicy = result.policy;
17275
+ } else {
17276
+ playbookPolicy = void 0;
17117
17277
  }
17118
17278
  if (useStartupShell && !options.model?.trim()) {
17119
17279
  if (playbookMilestoneModels && Object.keys(playbookMilestoneModels).length > 0 && startupShellRef.current) {
@@ -17214,7 +17374,8 @@ ${rulesContext}`;
17214
17374
  if (autoCreatedAgent) {
17215
17375
  const bootstrapPayload = buildMarathonAutoCreatedAgentBootstrap(normalizedAgent, {
17216
17376
  model: options.model || agentConfigModel || defaultConfiguredModel,
17217
- toolIds: resolvedToolIds
17377
+ toolIds: resolvedToolIds,
17378
+ fallbackModel: options.fallbackModel
17218
17379
  });
17219
17380
  try {
17220
17381
  await client.agents.update(agentId, bootstrapPayload);
@@ -17230,6 +17391,16 @@ ${rulesContext}`;
17230
17391
  );
17231
17392
  }
17232
17393
  }
17394
+ } else if (options.fallbackModel || playbookMilestoneFallbackModels) {
17395
+ const initialErrorHandling = resolveErrorHandlingForPhase(
17396
+ currentPhase,
17397
+ options.fallbackModel,
17398
+ playbookMilestoneFallbackModels
17399
+ );
17400
+ if (initialErrorHandling) {
17401
+ await client.agents.update(agentId, { config: { errorHandling: initialErrorHandling } }).catch(() => {
17402
+ });
17403
+ }
17233
17404
  }
17234
17405
  let localTools = buildLocalTools(client, parsedSandbox, options, {
17235
17406
  taskName,
@@ -17532,7 +17703,13 @@ Saving state... done. Session saved to ${filePath}`);
17532
17703
  model: event.model || effectiveModelForContext
17533
17704
  });
17534
17705
  },
17535
- ...resumeState ? { resumeState } : {},
17706
+ ...resumeState || playbookPolicy ? {
17707
+ resumeState: {
17708
+ ...resumeState ?? {},
17709
+ ...playbookPolicy?.outputRoot ? { outputRoot: playbookPolicy.outputRoot } : {},
17710
+ ...playbookPolicy?.requireVerification !== void 0 ? { verificationRequired: playbookPolicy.requireVerification } : {}
17711
+ }
17712
+ } : {},
17536
17713
  toolContextMode: options.toolContext || "hot-tail",
17537
17714
  toolWindow: options.toolWindow === "session" || !options.toolWindow ? "session" : parseInt(options.toolWindow, 10) || 10,
17538
17715
  onSession: async (state) => {
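Note on the `resumeState` change above: because the policy fields are spread after the saved state, `outputRoot` and `requireVerification` from the playbook take precedence over values restored from a previous run. The sketch below wraps the same expression in a hypothetical helper, `buildResumeStateFragment`, so the merge can be inspected; the sample state and policy objects are made up.

```javascript
// Hypothetical wrapper around the object-spread expression from the hunk above.
function buildResumeStateFragment(resumeState, playbookPolicy) {
  return {
    ...(resumeState || playbookPolicy
      ? {
          resumeState: {
            ...(resumeState ?? {}),
            ...(playbookPolicy?.outputRoot ? { outputRoot: playbookPolicy.outputRoot } : {}),
            ...(playbookPolicy?.requireVerification !== void 0
              ? { verificationRequired: playbookPolicy.requireVerification }
              : {})
          }
        }
      : {})
  };
}

// Made-up saved state and policy.
const saved = { workflowPhase: "execution", verificationRequired: false };
const policy = { outputRoot: "content/", requireVerification: true };

// Policy overrides the saved verificationRequired and adds outputRoot.
const merged = buildResumeStateFragment(saved, policy);
// No saved state and no policy: the resumeState key is omitted entirely.
const empty = buildResumeStateFragment(undefined, undefined);
// A policy object with no relevant fields still yields an (empty) resumeState.
const policyOnly = buildResumeStateFragment(undefined, {});

console.log(merged);
console.log(empty);
console.log(policyOnly);
```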
@@ -17594,6 +17771,17 @@ Saving state... done. Session saved to ${filePath}`);
17594
17771
  options.model = newPhaseModel;
17595
17772
  modelChangedOnPhaseTransition = true;
17596
17773
  }
17774
+ if (options.fallbackModel || playbookMilestoneFallbackModels) {
17775
+ const newErrorHandling = resolveErrorHandlingForPhase(
17776
+ resumeState.workflowPhase,
17777
+ options.fallbackModel,
17778
+ playbookMilestoneFallbackModels
17779
+ );
17780
+ client.agents.update(agentId, {
17781
+ config: { errorHandling: newErrorHandling ?? null }
17782
+ }).catch(() => {
17783
+ });
17784
+ }
17597
17785
  }
17598
17786
  if (state.recentActionKeys && state.recentActionKeys.length > 0) {
17599
17787
  for (const key of state.recentActionKeys) {
@@ -17970,7 +18158,7 @@ function resolveSandboxWorkflowSelection(message, sandboxProvider, resumeState)
17970
18158
  };
17971
18159
  }
17972
18160
  function applyTaskOptions(cmd) {
17973
- return cmd.argument("<agent>", "Agent ID or name").option("-g, --goal <text>", "Goal message for the agent").option("--max-sessions <n>", "Maximum sessions", "50").option("--max-cost <n>", "Budget in USD").option("--model <modelId>", "Model ID to use (overrides agent config)").option("--name <name>", "Task name (used for state file, defaults to agent name)").option("--session <name>", "Resume a specific session by name").option("--state-dir <path>", "Directory for state files (default: ~/.runtype/projects/<hash>/marathons/)").option("--resume [message]", "Resume from existing local state, optionally with a new message").option("--fresh", "Start a new run and ignore any existing local state for this task").option("--compact", "Force compact-summary resume mode instead of replaying full history").option("--compact-strategy <strategy>", "Compaction strategy: auto (default), provider_native, or summary_fallback").option("--compact-threshold <value>", "Auto-compact when estimated context crosses this threshold (default: 80% fallback, 90% native; accepts percent like 90% or absolute token count like 120000)").option("--compact-instructions <text>", "Extra instructions for what a compact summary must preserve").option("--no-auto-compact", "Disable automatic context-aware history compaction").option("--track", "Sync progress to a Runtype record (visible in dashboard)").option("--debug", "Show debug output from each session").option("--json", "Output final result as JSON").option("--sandbox <provider>", "Enable sandbox code execution tool (cloudflare-worker, quickjs, or daytona)").option("--no-local-tools", "Disable built-in local tool execution (read_file, write_file, list_directory)").option("-t, --tools <tools...>", "Enable built-in tools (e.g., exa, firecrawl, dalle, openai_web_search, anthropic_web_search)").option("--plain-text", "Disable markdown rendering in output").option("--no-reasoning", "Disable model reasoning/thinking (enabled by default for supported 
models)").option("--no-checkpoint", "Run all iterations without checkpoint pauses (fully autonomous)").option("--checkpoint-timeout <seconds>", "Auto-continue timeout in seconds (default: 10)", "10").option("--planning-model <modelId>", "Model to use during research/planning phases").option("--execution-model <modelId>", "Model to use during execution phase").option("--playbook <name>", "Load a playbook from .runtype/marathons/playbooks/").option("--offload-threshold <chars>", 'Offload tool outputs larger than this to files (default: 100000; use "off" or "0" to disable guardrails)').option("--tool-context <mode>", "Tool result storage: hot-tail (default), observation-mask, or full-inline").option("--tool-window <window>", 'Compaction window: "session" (default) or a number for last-N tool results (e.g. 10)').option("--runner-char <char>", "Custom runner emoji (default: \u{1F3C3})").option("--finish-char <char>", "Custom finish line emoji (default: \u{1F3C1})").option("--no-runner", "Hide the runner emoji from the header border").option("--no-finish", "Hide the finish line emoji from the header border").action(taskAction);
18161
+ return cmd.argument("<agent>", "Agent ID or name").option("-g, --goal <text>", "Goal message for the agent").option("--max-sessions <n>", "Maximum sessions", "50").option("--max-cost <n>", "Budget in USD").option("--model <modelId>", "Model ID to use (overrides agent config)").option("--name <name>", "Task name (used for state file, defaults to agent name)").option("--session <name>", "Resume a specific session by name").option("--state-dir <path>", "Directory for state files (default: ~/.runtype/projects/<hash>/marathons/)").option("--resume [message]", "Resume from existing local state, optionally with a new message").option("--fresh", "Start a new run and ignore any existing local state for this task").option("--compact", "Force compact-summary resume mode instead of replaying full history").option("--compact-strategy <strategy>", "Compaction strategy: auto (default), provider_native, or summary_fallback").option("--compact-threshold <value>", "Auto-compact when estimated context crosses this threshold (default: 80% fallback, 90% native; accepts percent like 90% or absolute token count like 120000)").option("--compact-instructions <text>", "Extra instructions for what a compact summary must preserve").option("--no-auto-compact", "Disable automatic context-aware history compaction").option("--track", "Sync progress to a Runtype record (visible in dashboard)").option("--debug", "Show debug output from each session").option("--json", "Output final result as JSON").option("--sandbox <provider>", "Enable sandbox code execution tool (cloudflare-worker, quickjs, or daytona)").option("--no-local-tools", "Disable built-in local tool execution (read_file, write_file, list_directory)").option("-t, --tools <tools...>", "Enable built-in tools (e.g., exa, firecrawl, dalle, openai_web_search, anthropic_web_search)").option("--plain-text", "Disable markdown rendering in output").option("--no-reasoning", "Disable model reasoning/thinking (enabled by default for supported 
models)").option("--no-checkpoint", "Run all iterations without checkpoint pauses (fully autonomous)").option("--checkpoint-timeout <seconds>", "Auto-continue timeout in seconds (default: 10)", "10").option("--planning-model <modelId>", "Model to use during research/planning phases").option("--execution-model <modelId>", "Model to use during execution phase").option("--fallback-model <modelId>", "Model to fall back to when primary model fails").option("--playbook <name>", "Load a playbook from .runtype/marathons/playbooks/").option("--offload-threshold <chars>", 'Offload tool outputs larger than this to files (default: 100000; use "off" or "0" to disable guardrails)').option("--tool-context <mode>", "Tool result storage: hot-tail (default), observation-mask, or full-inline").option("--tool-window <window>", 'Compaction window: "session" (default) or a number for last-N tool results (e.g. 10)').option("--runner-char <char>", "Custom runner emoji (default: \u{1F3C3})").option("--finish-char <char>", "Custom finish line emoji (default: \u{1F3C1})").option("--no-runner", "Hide the runner emoji from the header border").option("--no-finish", "Hide the finish line emoji from the header border").action(taskAction);
17974
18162
  }
17975
18163
  var taskCommand = applyTaskOptions(
17976
18164
  new Command10("task").description("Run a multi-session agent task")