@runtypelabs/cli 2.0.2 → 2.1.0
- package/README.md +72 -4
- package/dist/index.js +204 -16
- package/dist/index.js.map +1 -1
- package/package.json +4 -2
package/README.md
CHANGED
|
@@ -220,15 +220,44 @@ EOF
|
|
|
220
220
|
- `planWritten` — advances when the agent writes its plan artifact
|
|
221
221
|
- `never` — only the agent's `TASK_COMPLETE` signal can advance (if `canAcceptCompletion: true`)
|
|
222
222
|
|
|
223
|
+
**Playbook policies**:
|
|
224
|
+
|
|
225
|
+
The optional `policy` block lets you restrict what the agent can do at runtime. Policies are additive restrictions — they can only narrow behavior, never override global safety denies (e.g. `.env` files and private keys are always blocked).
|
|
226
|
+
|
|
227
|
+
```yaml
|
|
228
|
+
name: blog-writer
|
|
229
|
+
policy:
|
|
230
|
+
allowedReadGlobs: ['content/**', 'templates/**']
|
|
231
|
+
allowedWriteGlobs: ['content/**']
|
|
232
|
+
blockedTools: ['search_repo']
|
|
233
|
+
blockDiscoveryTools: true
|
|
234
|
+
requirePlanBeforeWrite: true
|
|
235
|
+
requireVerification: true
|
|
236
|
+
outputRoot: 'content/'
|
|
237
|
+
milestones:
|
|
238
|
+
- ...
|
|
239
|
+
```
|
|
240
|
+
|
|
241
|
+
| Field | Type | Description |
|
|
242
|
+
| ------------------------ | ---------- | ----------------------------------------------------------------------------------------------------------------------------- |
|
|
243
|
+
| `allowedReadGlobs` | `string[]` | Glob patterns for allowed read paths. If set, reads outside these are blocked. |
|
|
244
|
+
| `allowedWriteGlobs` | `string[]` | Glob patterns for allowed write paths. If set, writes outside these are blocked. The plan file is always writable regardless. |
|
|
245
|
+
| `blockedTools` | `string[]` | Tool names to block entirely (e.g. `["write_file", "search_repo"]`). |
|
|
246
|
+
| `blockDiscoveryTools` | `boolean` | Block `search_repo`, `glob_files`, `tree_directory`, and `list_directory`. |
|
|
247
|
+
| `requirePlanBeforeWrite` | `boolean` | Require the agent to write its plan before any other file writes. |
|
|
248
|
+
| `requireVerification` | `boolean` | Require verification before `TASK_COMPLETE`. |
|
|
249
|
+
| `outputRoot` | `string` | For creation tasks: confine writes to this directory (e.g. `"public/"`). |
|
|
250
|
+
|
|
223
251
|
#### Marathon Anatomy
|
|
224
252
|
|
|
225
253
|
```
|
|
226
254
|
┌─ marathon ──────────────────────────────────────────────────────┐
|
|
227
255
|
│ │
|
|
228
|
-
│ ┌─ playbook (optional)
|
|
229
|
-
│ │ Defines milestones, models, verification, rules
|
|
230
|
-
│ │
|
|
231
|
-
│
|
|
256
|
+
│ ┌─ playbook (optional) ──────────────────────────────────┐ │
|
|
257
|
+
│ │ Defines milestones, models, verification, rules, │ │
|
|
258
|
+
│ │ and policy constraints │ │
|
|
259
|
+
│ │ .runtype/marathons/playbooks/tdd.yaml │ │
|
|
260
|
+
│ └────────────────────────────────────────────────────────┘ │
|
|
232
261
|
│ │ │
|
|
233
262
|
│ ▼ │
|
|
234
263
|
│ ┌─ milestone 1 ──┐ ┌─ milestone 2 ──┐ ┌─ milestone 3 ─────┐ |
|
|
@@ -261,6 +290,7 @@ What's optional:
|
|
|
261
290
|
✓ Rules Without them, agent follows only playbook/milestone instructions
|
|
262
291
|
✓ Models Without overrides, uses CLI --model flag or default
|
|
263
292
|
✓ Verification Without it, no verification gate between milestones
|
|
293
|
+
✓ Policy Without one, only global safety denies apply
|
|
264
294
|
```
|
|
265
295
|
|
|
266
296
|
#### Reasoning / Thinking
|
|
@@ -271,6 +301,44 @@ Marathon enables model reasoning by default for models that support it (Gemini 3
|
|
|
271
301
|
runtype marathon "Code Builder" --goal "Fix the bug" --no-reasoning
|
|
272
302
|
```
|
|
273
303
|
|
|
304
|
+
#### Fallback Models
|
|
305
|
+
|
|
306
|
+
When an upstream model provider returns a transient error (e.g. overload, rate limit), marathon can automatically retry and then fall back to a different model instead of dying mid-run.
|
|
307
|
+
|
|
308
|
+
**CLI flag** — applies to all phases:
|
|
309
|
+
|
|
310
|
+
```bash
|
|
311
|
+
# If claude-opus-4-6 fails, retry once then fall back to claude-sonnet-4-5
|
|
312
|
+
runtype marathon "Code Builder" --goal "Refactor auth" \
|
|
313
|
+
--model claude-opus-4-6 \
|
|
314
|
+
--fallback-model claude-sonnet-4-5
|
|
315
|
+
```
|
|
316
|
+
|
|
317
|
+
**Playbook** — per-milestone fallback chains:
|
|
318
|
+
|
|
319
|
+
```yaml
|
|
320
|
+
milestones:
|
|
321
|
+
- name: research
|
|
322
|
+
model: claude-sonnet-4-5
|
|
323
|
+
fallbackModels:
|
|
324
|
+
- gpt-4o # string shorthand
|
|
325
|
+
- gemini-3-flash
|
|
326
|
+
instructions: |
|
|
327
|
+
Research the codebase...
|
|
328
|
+
|
|
329
|
+
- name: execution
|
|
330
|
+
model: claude-opus-4-6
|
|
331
|
+
fallbackModels:
|
|
332
|
+
- model: claude-sonnet-4-5 # object form with overrides
|
|
333
|
+
temperature: 0.5
|
|
334
|
+
- model: gpt-4o
|
|
335
|
+
maxTokens: 8192
|
|
336
|
+
instructions: |
|
|
337
|
+
Implement the changes...
|
|
338
|
+
```
|
|
339
|
+
|
|
340
|
+
Playbook per-milestone fallbacks take priority over the CLI `--fallback-model` flag. The fallback chain always starts with a retry (5s delay) before trying alternative models.
|
|
341
|
+
|
|
274
342
|
#### Tool Context Modes
|
|
275
343
|
|
|
276
344
|
When a marathon runs multiple sessions, tool call/result pairs from previous sessions are preserved in the conversation history. The `--tool-context` flag controls how older tool results are stored to balance cost and re-readability:
|
package/dist/index.js
CHANGED
|
@@ -12272,7 +12272,7 @@ import { theme as theme24 } from "@runtypelabs/ink-components";
|
|
|
12272
12272
|
import { jsx as jsx25, jsxs as jsxs21 } from "react/jsx-runtime";
|
|
12273
12273
|
var MENU_ITEMS = [
|
|
12274
12274
|
{ key: "c", label: "Copy session JSON" },
|
|
12275
|
-
{ key: "
|
|
12275
|
+
{ key: "e", label: "Open session JSON in editor" },
|
|
12276
12276
|
{ key: "f", label: "Open marathon folder in file manager" },
|
|
12277
12277
|
{ key: "d", label: "Open agent in Runtype dashboard" }
|
|
12278
12278
|
];
|
|
@@ -12294,7 +12294,7 @@ function SessionActionMenu({
|
|
|
12294
12294
|
onCopySession();
|
|
12295
12295
|
return;
|
|
12296
12296
|
}
|
|
12297
|
-
if (input === "
|
|
12297
|
+
if (input === "e" && hasStateFile) {
|
|
12298
12298
|
onOpenStateFile();
|
|
12299
12299
|
return;
|
|
12300
12300
|
}
|
|
@@ -12320,7 +12320,7 @@ function SessionActionMenu({
|
|
|
12320
12320
|
children: [
|
|
12321
12321
|
/* @__PURE__ */ jsx25(Text24, { bold: true, color: theme24.accent, children: "Session" }),
|
|
12322
12322
|
/* @__PURE__ */ jsx25(Box22, { flexDirection: "column", marginTop: 1, children: MENU_ITEMS.map((item) => {
|
|
12323
|
-
const dimmed = item.key === "
|
|
12323
|
+
const dimmed = item.key === "e" && !hasStateFile || item.key === "f" && !hasStateFile || item.key === "d" && !hasDashboard;
|
|
12324
12324
|
return /* @__PURE__ */ jsxs21(Text24, { children: [
|
|
12325
12325
|
/* @__PURE__ */ jsx25(Text24, { color: dimmed ? theme24.textSubtle : theme24.accentActive, children: ` ${item.key} ` }),
|
|
12326
12326
|
/* @__PURE__ */ jsx25(Text24, { color: dimmed ? theme24.textSubtle : theme24.textMuted, children: item.label })
|
|
@@ -15311,7 +15311,9 @@ function extractRunTaskResumeState(state) {
|
|
|
15311
15311
|
...sanitized.bestCandidateNeedsVerification ? { bestCandidateNeedsVerification: sanitized.bestCandidateNeedsVerification } : {},
|
|
15312
15312
|
...sanitized.bestCandidateVerified ? { bestCandidateVerified: sanitized.bestCandidateVerified } : {},
|
|
15313
15313
|
...sanitized.verificationRequired !== void 0 ? { verificationRequired: sanitized.verificationRequired } : {},
|
|
15314
|
-
...sanitized.lastVerificationPassed ? { lastVerificationPassed: sanitized.lastVerificationPassed } : {}
|
|
15314
|
+
...sanitized.lastVerificationPassed ? { lastVerificationPassed: sanitized.lastVerificationPassed } : {},
|
|
15315
|
+
...sanitized.isCreationTask !== void 0 ? { isCreationTask: sanitized.isCreationTask } : {},
|
|
15316
|
+
...sanitized.outputRoot ? { outputRoot: sanitized.outputRoot } : {}
|
|
15315
15317
|
};
|
|
15316
15318
|
}
|
|
15317
15319
|
function findStateFile(name, stateDir) {
|
|
@@ -15476,6 +15478,29 @@ var IGNORED_REPO_DIRS = /* @__PURE__ */ new Set([
|
|
|
15476
15478
|
"dist",
|
|
15477
15479
|
"node_modules"
|
|
15478
15480
|
]);
|
|
15481
|
+
var SENSITIVE_PATH_PATTERNS = [
|
|
15482
|
+
{ name: ".env", test: (n) => n === ".env" || n.endsWith("/.env") },
|
|
15483
|
+
{ name: ".env.*", test: (n) => /\.env\.?[^/]*$/.test(n) || /\/\.env\.?[^/]*$/.test(n) },
|
|
15484
|
+
{ name: "private keys", test: (n) => /(^|\/)(id_rsa|id_ed25519|id_ecdsa)(\.pub)?$/.test(n) },
|
|
15485
|
+
{ name: "known_hosts", test: (n) => n.endsWith("known_hosts") || n.endsWith("/known_hosts") },
|
|
15486
|
+
{ name: "authorized_keys", test: (n) => n.endsWith("authorized_keys") || n.endsWith("/authorized_keys") },
|
|
15487
|
+
{ name: "cert/key extensions", test: (n) => /\.(pem|key|p12|pfx)$/i.test(n) },
|
|
15488
|
+
{ name: "npm/pypi config", test: (n) => /(^|\/)(\.npmrc|\.pypirc|\.netrc)$/.test(n) },
|
|
15489
|
+
{ name: "docker config", test: (n) => /\.docker\/config\.json$/i.test(n) },
|
|
15490
|
+
{ name: "credentials", test: (n) => /(^|\/)(credentials\.json|secrets\.json)$/i.test(n) },
|
|
15491
|
+
{ name: "service account", test: (n) => /service-account.*\.json$/i.test(n) || /firebase-admin.*\.json$/i.test(n) },
|
|
15492
|
+
{ name: ".ssh", test: (n) => n === ".ssh" || n.startsWith(".ssh/") || n.includes("/.ssh/") },
|
|
15493
|
+
{ name: ".aws", test: (n) => n === ".aws" || n.startsWith(".aws/") || n.includes("/.aws/") },
|
|
15494
|
+
{ name: ".gnupg", test: (n) => n === ".gnupg" || n.startsWith(".gnupg/") || n.includes("/.gnupg/") },
|
|
15495
|
+
{ name: ".terraform", test: (n) => n === ".terraform" || n.startsWith(".terraform/") || n.includes("/.terraform/") },
|
|
15496
|
+
{ name: ".git", test: (n) => n === ".git" || n.startsWith(".git/") || n.includes("/.git/") },
|
|
15497
|
+
{ name: ".runtype", test: (n) => n === ".runtype" || n.startsWith(".runtype/") || n.includes("/.runtype/") }
|
|
15498
|
+
];
|
|
15499
|
+
function isSensitivePath(normalizedPath) {
|
|
15500
|
+
const n = normalizedPath.replace(/\\/g, "/").trim();
|
|
15501
|
+
if (!n) return false;
|
|
15502
|
+
return SENSITIVE_PATH_PATTERNS.some(({ test }) => test(n));
|
|
15503
|
+
}
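To make the new global deny list easier to read, the sketch below runs a subset of the `SENSITIVE_PATH_PATTERNS` entries from this hunk against a few sample paths. The pattern bodies and `isSensitivePath` are copied from the diff; the `samples` array and its paths are hypothetical and only illustrate the matching rules.

```javascript
// Subset of the patterns added in 2.1.0, copied from the hunk above.
const SENSITIVE_PATH_PATTERNS = [
  { name: ".env", test: (n) => n === ".env" || n.endsWith("/.env") },
  { name: ".env.*", test: (n) => /\.env\.?[^/]*$/.test(n) || /\/\.env\.?[^/]*$/.test(n) },
  { name: "private keys", test: (n) => /(^|\/)(id_rsa|id_ed25519|id_ecdsa)(\.pub)?$/.test(n) },
  { name: "cert/key extensions", test: (n) => /\.(pem|key|p12|pfx)$/i.test(n) },
  { name: ".git", test: (n) => n === ".git" || n.startsWith(".git/") || n.includes("/.git/") },
];

// Same logic as the diff: normalize backslashes, ignore empty input,
// then report whether any pattern matches.
function isSensitivePath(normalizedPath) {
  const n = normalizedPath.replace(/\\/g, "/").trim();
  if (!n) return false;
  return SENSITIVE_PATH_PATTERNS.some(({ test }) => test(n));
}

// Hypothetical sample paths (not taken from the package).
const samples = [
  "src/index.js",
  ".env",
  "config/.env.local",
  "keys\\id_rsa",
  "certs/server.PEM",
  ".git/config",
  "src/app.environment.ts",
];

for (const p of samples) {
  console.log(p, isSensitivePath(p));
}
```

With this subset, only `src/index.js` is reported as non-sensitive. The `.env.*` expression is not anchored to the start of a path segment, so a name such as `src/app.environment.ts` also matches; that appears to be a side effect of the pattern as written rather than something the README describes.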
|
|
15479
15504
|
var DEFAULT_DISCOVERY_MAX_RESULTS = 50;
|
|
15480
15505
|
var MAX_FILE_BYTES_TO_SCAN = 1024 * 1024;
|
|
15481
15506
|
var LOW_SIGNAL_FILE_NAMES = /* @__PURE__ */ new Set([
|
|
@@ -15564,12 +15589,15 @@ function scoreSearchPath(relativePath) {
|
|
|
15564
15589
|
return score;
|
|
15565
15590
|
}
|
|
15566
15591
|
function shouldIgnoreRepoEntry(entryPath) {
|
|
15567
|
-
const normalized = normalizeToolPath(entryPath);
|
|
15592
|
+
const normalized = normalizeToolPath(entryPath).replace(/\\/g, "/");
|
|
15568
15593
|
if (normalized === ".") return false;
|
|
15594
|
+
if (isSensitivePath(normalized)) return true;
|
|
15569
15595
|
return normalized.split(path8.sep).some((segment) => IGNORED_REPO_DIRS.has(segment));
|
|
15570
15596
|
}
|
|
15571
15597
|
function safeReadTextFile(filePath) {
|
|
15572
15598
|
try {
|
|
15599
|
+
const normalized = normalizeToolPath(filePath).replace(/\\/g, "/");
|
|
15600
|
+
if (isSensitivePath(normalized)) return null;
|
|
15573
15601
|
const stat = fs8.statSync(filePath);
|
|
15574
15602
|
if (!stat.isFile() || stat.size > MAX_FILE_BYTES_TO_SCAN) return null;
|
|
15575
15603
|
const buffer = fs8.readFileSync(filePath);
|
|
@@ -15700,9 +15728,10 @@ function resolveToolPath(toolPath, options = {}) {
|
|
|
15700
15728
|
return { ok: false, error: `Path does not exist: ${requestedPath}` };
|
|
15701
15729
|
}
|
|
15702
15730
|
const workspaceRoot = fs9.realpathSync.native(process.cwd());
|
|
15731
|
+
const extraRoots = (options.allowedRoots || []).map((rootPath) => canonicalizeAllowedRoot(rootPath));
|
|
15703
15732
|
const allowedRoots = [
|
|
15704
|
-
|
|
15705
|
-
|
|
15733
|
+
...extraRoots,
|
|
15734
|
+
workspaceRoot
|
|
15706
15735
|
];
|
|
15707
15736
|
const matchedRoot = allowedRoots.find(
|
|
15708
15737
|
(rootPath) => isPathWithinRoot(resolved.canonicalPath, rootPath)
|
|
@@ -15721,6 +15750,13 @@ function resolveToolPath(toolPath, options = {}) {
|
|
|
15721
15750
|
error: `Access denied: ${requestedPath} is inside restricted workspace state (${blockedSegment})`
|
|
15722
15751
|
};
|
|
15723
15752
|
}
|
|
15753
|
+
const relativeFromWorkspace = path9.relative(workspaceRoot, resolved.canonicalPath).replace(/\\/g, "/");
|
|
15754
|
+
if (isSensitivePath(relativeFromWorkspace)) {
|
|
15755
|
+
return {
|
|
15756
|
+
ok: false,
|
|
15757
|
+
error: `Access denied: ${requestedPath} is a sensitive path and cannot be read or written`
|
|
15758
|
+
};
|
|
15759
|
+
}
|
|
15724
15760
|
}
|
|
15725
15761
|
if (resolved.exists) {
|
|
15726
15762
|
const stat = fs9.statSync(resolved.canonicalPath);
|
|
@@ -15741,8 +15777,17 @@ function resolveToolPath(toolPath, options = {}) {
|
|
|
15741
15777
|
}
|
|
15742
15778
|
return { ok: true, resolvedPath: resolved.canonicalPath };
|
|
15743
15779
|
}
|
|
15780
|
+
function getTaskStateRoot(taskName, stateDir) {
|
|
15781
|
+
return path9.join(stateDir || getMarathonStateDir(), stateSafeName3(taskName));
|
|
15782
|
+
}
|
|
15744
15783
|
function createDefaultLocalTools(context) {
|
|
15745
|
-
const
|
|
15784
|
+
const taskStateRoot = context?.taskName ? getTaskStateRoot(context.taskName, context.stateDir) : void 0;
|
|
15785
|
+
const planDir = context?.taskName ? path9.resolve(`.runtype/marathons/${stateSafeName3(context.taskName)}`) : void 0;
|
|
15786
|
+
const allowedReadRoots = context?.taskName ? [
|
|
15787
|
+
getOffloadedOutputDir(context.taskName, context.stateDir),
|
|
15788
|
+
...taskStateRoot ? [taskStateRoot] : [],
|
|
15789
|
+
...planDir ? [planDir] : []
|
|
15790
|
+
] : [];
|
|
15746
15791
|
return {
|
|
15747
15792
|
read_file: {
|
|
15748
15793
|
description: "Read the contents of a file at the given path",
|
|
@@ -15944,6 +15989,8 @@ function createDefaultLocalTools(context) {
|
|
|
15944
15989
|
};
|
|
15945
15990
|
}
|
|
15946
15991
|
function createCheckpointedWriteFileTool(taskName, stateDir) {
|
|
15992
|
+
const taskStateRoot = getTaskStateRoot(taskName, stateDir);
|
|
15993
|
+
const planDir = path9.resolve(`.runtype/marathons/${stateSafeName3(taskName)}`);
|
|
15947
15994
|
return {
|
|
15948
15995
|
description: "Write content to a file, creating directories as needed and checkpointing original repo files",
|
|
15949
15996
|
parametersSchema: {
|
|
@@ -15956,7 +16003,8 @@ function createCheckpointedWriteFileTool(taskName, stateDir) {
|
|
|
15956
16003
|
},
|
|
15957
16004
|
execute: async (args) => {
|
|
15958
16005
|
const resolvedPath = resolveToolPath(String(args.path || ""), {
|
|
15959
|
-
allowMissing: true
|
|
16006
|
+
allowMissing: true,
|
|
16007
|
+
allowedRoots: [taskStateRoot, planDir]
|
|
15960
16008
|
});
|
|
15961
16009
|
if (!resolvedPath.ok) return `Error: ${resolvedPath.error}`;
|
|
15962
16010
|
const content = String(args.content || "");
|
|
@@ -16047,6 +16095,7 @@ function createRunCheckTool() {
|
|
|
16047
16095
|
if (!isSafeVerificationCommand(command)) {
|
|
16048
16096
|
return JSON.stringify({
|
|
16049
16097
|
success: false,
|
|
16098
|
+
blocked: true,
|
|
16050
16099
|
command,
|
|
16051
16100
|
error: "Blocked unsafe verification command. Use a single non-destructive lint/test/typecheck/build command."
|
|
16052
16101
|
});
|
|
@@ -16462,12 +16511,46 @@ function resolveModelForPhase(phase, cliOverrides, milestoneModels) {
|
|
|
16462
16511
|
}
|
|
16463
16512
|
return cliOverrides.defaultModel;
|
|
16464
16513
|
}
|
|
16514
|
+
function resolveErrorHandlingForPhase(phase, cliFallbackModel, milestoneFallbackModels) {
|
|
16515
|
+
const phaseFallbacks = phase ? milestoneFallbackModels?.[phase] : void 0;
|
|
16516
|
+
if (phaseFallbacks?.length) {
|
|
16517
|
+
return {
|
|
16518
|
+
onError: "fallback",
|
|
16519
|
+
fallbacks: [
|
|
16520
|
+
{ type: "retry", delay: 5e3 },
|
|
16521
|
+
...phaseFallbacks.map((fb) => ({
|
|
16522
|
+
type: "model",
|
|
16523
|
+
model: fb.model,
|
|
16524
|
+
...fb.temperature !== void 0 ? { temperature: fb.temperature } : {},
|
|
16525
|
+
...fb.maxTokens !== void 0 ? { maxTokens: fb.maxTokens } : {}
|
|
16526
|
+
}))
|
|
16527
|
+
]
|
|
16528
|
+
};
|
|
16529
|
+
}
|
|
16530
|
+
if (cliFallbackModel) {
|
|
16531
|
+
return {
|
|
16532
|
+
onError: "fallback",
|
|
16533
|
+
fallbacks: [
|
|
16534
|
+
{ type: "retry", delay: 5e3 },
|
|
16535
|
+
{ type: "model", model: cliFallbackModel }
|
|
16536
|
+
]
|
|
16537
|
+
};
|
|
16538
|
+
}
|
|
16539
|
+
return void 0;
|
|
16540
|
+
}
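The precedence described in the README (playbook per-milestone fallbacks over the CLI `--fallback-model` flag, with a 5-second retry first) can be checked with a small standalone run. The function body below is copied from the hunk above, with `void 0` and `5e3` written out; the `milestoneFallbacks` object and the phase names are hypothetical inputs modeled on the README example.

```javascript
// Copied from the diff: per-milestone fallbacks win over the CLI flag,
// and every chain starts with a single retry after 5000 ms.
function resolveErrorHandlingForPhase(phase, cliFallbackModel, milestoneFallbackModels) {
  const phaseFallbacks = phase ? milestoneFallbackModels?.[phase] : undefined;
  if (phaseFallbacks?.length) {
    return {
      onError: "fallback",
      fallbacks: [
        { type: "retry", delay: 5000 },
        ...phaseFallbacks.map((fb) => ({
          type: "model",
          model: fb.model,
          ...(fb.temperature !== undefined ? { temperature: fb.temperature } : {}),
          ...(fb.maxTokens !== undefined ? { maxTokens: fb.maxTokens } : {}),
        })),
      ],
    };
  }
  if (cliFallbackModel) {
    return {
      onError: "fallback",
      fallbacks: [
        { type: "retry", delay: 5000 },
        { type: "model", model: cliFallbackModel },
      ],
    };
  }
  return undefined;
}

// Hypothetical inputs: only the "execution" milestone defines its own chain.
const milestoneFallbacks = {
  execution: [
    { model: "claude-sonnet-4-5", temperature: 0.5 },
    { model: "gpt-4o", maxTokens: 8192 },
  ],
};

const forExecution = resolveErrorHandlingForPhase("execution", "gemini-3-flash", milestoneFallbacks);
const forResearch = resolveErrorHandlingForPhase("research", "gemini-3-flash", milestoneFallbacks);
const noFallback = resolveErrorHandlingForPhase("research", undefined, milestoneFallbacks);

console.log(JSON.stringify(forExecution.fallbacks)); // retry, then the two playbook models
console.log(JSON.stringify(forResearch.fallbacks)); // retry, then the CLI fallback model
console.log(noFallback); // undefined
```

In this run the CLI model is ignored for `execution` because that milestone has its own chain, and it is used for `research` because that milestone defines none.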
|
|
16465
16541
|
|
|
16466
16542
|
// src/marathon/playbook-loader.ts
|
|
16467
16543
|
import * as fs12 from "fs";
|
|
16468
16544
|
import * as path12 from "path";
|
|
16469
16545
|
import * as os4 from "os";
|
|
16546
|
+
import micromatch from "micromatch";
|
|
16470
16547
|
import { parse as parseYaml } from "yaml";
|
|
16548
|
+
var DISCOVERY_TOOLS = /* @__PURE__ */ new Set([
|
|
16549
|
+
"search_repo",
|
|
16550
|
+
"glob_files",
|
|
16551
|
+
"tree_directory",
|
|
16552
|
+
"list_directory"
|
|
16553
|
+
]);
|
|
16471
16554
|
var PLAYBOOKS_DIR = ".runtype/marathons/playbooks";
|
|
16472
16555
|
function getCandidatePaths(nameOrPath, cwd) {
|
|
16473
16556
|
const home = os4.homedir();
|
|
@@ -16542,7 +16625,54 @@ function buildIsComplete(criteria) {
|
|
|
16542
16625
|
return () => false;
|
|
16543
16626
|
}
|
|
16544
16627
|
}
|
|
16628
|
+
function buildPolicyIntercept(policy) {
|
|
16629
|
+
if (!policy.blockedTools?.length && !policy.blockDiscoveryTools && !policy.allowedReadGlobs?.length && !policy.allowedWriteGlobs?.length && !policy.requirePlanBeforeWrite) {
|
|
16630
|
+
return void 0;
|
|
16631
|
+
}
|
|
16632
|
+
const blockedSet = new Set(
|
|
16633
|
+
(policy.blockedTools ?? []).map((t) => t.trim()).filter(Boolean)
|
|
16634
|
+
);
|
|
16635
|
+
const readGlobs = policy.allowedReadGlobs ?? [];
|
|
16636
|
+
const writeGlobs = policy.allowedWriteGlobs ?? [];
|
|
16637
|
+
return (toolName, args, ctx) => {
|
|
16638
|
+
if (blockedSet.has(toolName)) {
|
|
16639
|
+
return `Blocked by playbook policy: ${toolName} is not allowed for this task.`;
|
|
16640
|
+
}
|
|
16641
|
+
if (policy.blockDiscoveryTools && DISCOVERY_TOOLS.has(toolName)) {
|
|
16642
|
+
return `Blocked by playbook policy: discovery tools are disabled for this task.`;
|
|
16643
|
+
}
|
|
16644
|
+
const pathArg = typeof args.path === "string" && args.path.trim() ? ctx.normalizePath(String(args.path)) : void 0;
|
|
16645
|
+
if (pathArg) {
|
|
16646
|
+
const isWrite = toolName === "write_file" || toolName === "restore_file_checkpoint";
|
|
16647
|
+
const isRead = toolName === "read_file";
|
|
16648
|
+
if (isRead && readGlobs.length > 0) {
|
|
16649
|
+
const allowed = micromatch.some(pathArg, readGlobs, { dot: true });
|
|
16650
|
+
if (!allowed) {
|
|
16651
|
+
return `Blocked by playbook policy: ${toolName} path "${pathArg}" is outside allowed read globs: ${readGlobs.join(", ")}`;
|
|
16652
|
+
}
|
|
16653
|
+
}
|
|
16654
|
+
if (isWrite && writeGlobs.length > 0) {
|
|
16655
|
+
const planPath = ctx.state.planPath ? ctx.normalizePath(ctx.state.planPath) : void 0;
|
|
16656
|
+
if (planPath && pathArg === planPath) {
|
|
16657
|
+
} else {
|
|
16658
|
+
const allowed = micromatch.some(pathArg, writeGlobs, { dot: true });
|
|
16659
|
+
if (!allowed) {
|
|
16660
|
+
return `Blocked by playbook policy: ${toolName} path "${pathArg}" is outside allowed write globs: ${writeGlobs.join(", ")}`;
|
|
16661
|
+
}
|
|
16662
|
+
}
|
|
16663
|
+
}
|
|
16664
|
+
if (isWrite && policy.requirePlanBeforeWrite && !ctx.state.planWritten && !ctx.trace.planWritten) {
|
|
16665
|
+
const planPath = ctx.state.planPath ? ctx.normalizePath(ctx.state.planPath) : void 0;
|
|
16666
|
+
if (!planPath || pathArg !== planPath) {
|
|
16667
|
+
return `Blocked by playbook policy: write the plan before creating other files.`;
|
|
16668
|
+
}
|
|
16669
|
+
}
|
|
16670
|
+
}
|
|
16671
|
+
return void 0;
|
|
16672
|
+
};
|
|
16673
|
+
}
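The order of checks in `buildPolicyIntercept` can be followed with a simplified, dependency-free sketch. This is not the package's code: `checkPolicy` and `matchesAny` are hypothetical names, `matchesAny` is a stand-in for `micromatch.some` that understands only `prefix/**` patterns and exact names, the `ctx.trace.planWritten` check is omitted, and the returned strings are shortened labels rather than the real messages.

```javascript
const DISCOVERY_TOOLS = new Set(["search_repo", "glob_files", "tree_directory", "list_directory"]);

// Stand-in for micromatch.some(): handles only "prefix/**" and exact matches.
function matchesAny(filePath, globs) {
  return globs.some((g) => (g.endsWith("/**") ? filePath.startsWith(g.slice(0, -2)) : filePath === g));
}

// Simplified version of the intercept: returns a reason string when the call
// is blocked, or undefined when it is allowed.
function checkPolicy(policy, toolName, filePath, state) {
  if ((policy.blockedTools ?? []).includes(toolName)) return "blocked tool";
  if (policy.blockDiscoveryTools && DISCOVERY_TOOLS.has(toolName)) return "discovery disabled";
  if (!filePath) return undefined;

  const isWrite = toolName === "write_file" || toolName === "restore_file_checkpoint";
  const isRead = toolName === "read_file";
  const readGlobs = policy.allowedReadGlobs ?? [];
  const writeGlobs = policy.allowedWriteGlobs ?? [];
  const isPlanFile = state.planPath !== undefined && filePath === state.planPath;

  if (isRead && readGlobs.length > 0 && !matchesAny(filePath, readGlobs)) return "outside read globs";
  // The plan file is exempt from the write-glob check.
  if (isWrite && writeGlobs.length > 0 && !isPlanFile && !matchesAny(filePath, writeGlobs)) return "outside write globs";
  // Until the plan exists, the only permitted write is the plan itself.
  if (isWrite && policy.requirePlanBeforeWrite && !state.planWritten && !isPlanFile) return "plan first";
  return undefined;
}

// Policy values taken from the README example; the state object is hypothetical.
const policy = {
  allowedReadGlobs: ["content/**", "templates/**"],
  allowedWriteGlobs: ["content/**"],
  blockedTools: ["search_repo"],
  blockDiscoveryTools: true,
  requirePlanBeforeWrite: true,
};
const beforePlan = { planPath: "plan.md", planWritten: false };
const afterPlan = { planPath: "plan.md", planWritten: true };

console.log(checkPolicy(policy, "search_repo", undefined, beforePlan)); // "blocked tool"
console.log(checkPolicy(policy, "glob_files", undefined, beforePlan)); // "discovery disabled"
console.log(checkPolicy(policy, "read_file", "src/secret.ts", beforePlan)); // "outside read globs"
console.log(checkPolicy(policy, "write_file", "content/post.md", beforePlan)); // "plan first"
console.log(checkPolicy(policy, "write_file", "plan.md", beforePlan)); // undefined
console.log(checkPolicy(policy, "write_file", "content/post.md", afterPlan)); // undefined
console.log(checkPolicy(policy, "write_file", "src/x.ts", afterPlan)); // "outside write globs"
```

As in the diff, only `read_file`, `write_file`, and `restore_file_checkpoint` are subject to the glob checks; other tools that take a `path` argument pass through unless they are blocked by name or as discovery tools.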
|
|
16545
16674
|
function convertToWorkflow(config2) {
|
|
16675
|
+
const policyIntercept = config2.policy ? buildPolicyIntercept(config2.policy) : void 0;
|
|
16546
16676
|
const phases = config2.milestones.map((milestone) => ({
|
|
16547
16677
|
name: milestone.name,
|
|
16548
16678
|
description: milestone.description,
|
|
@@ -16558,6 +16688,7 @@ ${instructions}`;
|
|
|
16558
16688
|
return milestone.toolGuidance ?? [];
|
|
16559
16689
|
},
|
|
16560
16690
|
isComplete: buildIsComplete(milestone.completionCriteria),
|
|
16691
|
+
interceptToolCall: policyIntercept,
|
|
16561
16692
|
// Default to rejecting TASK_COMPLETE unless the playbook explicitly allows it.
|
|
16562
16693
|
// The SDK accepts completion by default when canAcceptCompletion is undefined,
|
|
16563
16694
|
// which would let the model end the marathon prematurely in early phases.
|
|
@@ -16568,23 +16699,37 @@ ${instructions}`;
|
|
|
16568
16699
|
phases
|
|
16569
16700
|
};
|
|
16570
16701
|
}
|
|
16702
|
+
function normalizeFallbackModel(input) {
|
|
16703
|
+
if (typeof input === "string") return { model: input };
|
|
16704
|
+
return {
|
|
16705
|
+
model: input.model,
|
|
16706
|
+
...input.temperature !== void 0 ? { temperature: input.temperature } : {},
|
|
16707
|
+
...input.maxTokens !== void 0 ? { maxTokens: input.maxTokens } : {}
|
|
16708
|
+
};
|
|
16709
|
+
}
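A quick way to see what the two `fallbackModels` forms from the README turn into is to run `normalizeFallbackModel` on both. The function is copied from the hunk above; the input array is hypothetical, and the `topP` key is included only to show that keys other than `model`, `temperature`, and `maxTokens` are not carried over.

```javascript
// Copied from the diff: strings become { model }, objects keep only the
// three recognized fields.
function normalizeFallbackModel(input) {
  if (typeof input === "string") return { model: input };
  return {
    model: input.model,
    ...(input.temperature !== undefined ? { temperature: input.temperature } : {}),
    ...(input.maxTokens !== undefined ? { maxTokens: input.maxTokens } : {}),
  };
}

// Hypothetical playbook entries: one string shorthand, one object form.
const normalized = [
  "gpt-4o",
  { model: "claude-sonnet-4-5", temperature: 0.5, topP: 0.9 },
].map(normalizeFallbackModel);

console.log(JSON.stringify(normalized));
// [{"model":"gpt-4o"},{"model":"claude-sonnet-4-5","temperature":0.5}]
```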
|
|
16571
16710
|
function loadPlaybook(nameOrPath, cwd) {
|
|
16572
16711
|
const baseCwd = cwd || process.cwd();
|
|
16573
16712
|
const candidates = getCandidatePaths(nameOrPath, baseCwd);
|
|
16574
16713
|
for (const candidate of candidates) {
|
|
16575
|
-
if (!fs12.existsSync(candidate)) continue;
|
|
16714
|
+
if (!fs12.existsSync(candidate) || fs12.statSync(candidate).isDirectory()) continue;
|
|
16576
16715
|
const config2 = parsePlaybookFile(candidate);
|
|
16577
16716
|
validatePlaybook(config2, candidate);
|
|
16578
16717
|
const milestoneModels = {};
|
|
16718
|
+
const milestoneFallbackModels = {};
|
|
16579
16719
|
for (const m of config2.milestones) {
|
|
16580
16720
|
if (m.model) milestoneModels[m.name] = m.model;
|
|
16721
|
+
if (m.fallbackModels?.length) {
|
|
16722
|
+
milestoneFallbackModels[m.name] = m.fallbackModels.map(normalizeFallbackModel);
|
|
16723
|
+
}
|
|
16581
16724
|
}
|
|
16582
16725
|
return {
|
|
16583
16726
|
workflow: convertToWorkflow(config2),
|
|
16584
16727
|
milestones: config2.milestones.map((m) => m.name),
|
|
16585
16728
|
milestoneModels: Object.keys(milestoneModels).length > 0 ? milestoneModels : void 0,
|
|
16729
|
+
milestoneFallbackModels: Object.keys(milestoneFallbackModels).length > 0 ? milestoneFallbackModels : void 0,
|
|
16586
16730
|
verification: config2.verification,
|
|
16587
|
-
rules: config2.rules
|
|
16731
|
+
rules: config2.rules,
|
|
16732
|
+
policy: config2.policy
|
|
16588
16733
|
};
|
|
16589
16734
|
}
|
|
16590
16735
|
throw new Error(
|
|
@@ -16749,13 +16894,22 @@ function normalizeMarathonAgentArgument(agent) {
|
|
|
16749
16894
|
function buildMarathonAutoCreatedAgentBootstrap(agentName, options = {}) {
|
|
16750
16895
|
const normalizedModel = options.model?.trim();
|
|
16751
16896
|
const normalizedToolIds = [...new Set((options.toolIds || []).map((toolId) => toolId.trim()).filter(Boolean))];
|
|
16752
|
-
const
|
|
16897
|
+
const normalizedFallbackModel = options.fallbackModel?.trim();
|
|
16898
|
+
const errorHandling = normalizedFallbackModel ? {
|
|
16899
|
+
onError: "fallback",
|
|
16900
|
+
fallbacks: [
|
|
16901
|
+
{ type: "retry", delay: 5e3 },
|
|
16902
|
+
{ type: "model", model: normalizedFallbackModel }
|
|
16903
|
+
]
|
|
16904
|
+
} : void 0;
|
|
16905
|
+
const config2 = normalizedModel || normalizedToolIds.length > 0 || errorHandling ? {
|
|
16753
16906
|
...normalizedModel ? { model: normalizedModel } : {},
|
|
16754
16907
|
...normalizedToolIds.length > 0 ? {
|
|
16755
16908
|
tools: {
|
|
16756
16909
|
toolIds: normalizedToolIds
|
|
16757
16910
|
}
|
|
16758
|
-
} : {}
|
|
16911
|
+
} : {},
|
|
16912
|
+
...errorHandling ? { errorHandling } : {}
|
|
16759
16913
|
} : void 0;
|
|
16760
16914
|
return {
|
|
16761
16915
|
description: `Powering a marathon for ${agentName}`,
|
|
@@ -17109,11 +17263,17 @@ async function taskAction(agent, options) {
|
|
|
17109
17263
|
let playbookWorkflow;
|
|
17110
17264
|
let playbookMilestones;
|
|
17111
17265
|
let playbookMilestoneModels;
|
|
17266
|
+
let playbookMilestoneFallbackModels;
|
|
17267
|
+
let playbookPolicy;
|
|
17112
17268
|
if (options.playbook) {
|
|
17113
17269
|
const result = loadPlaybook(options.playbook);
|
|
17114
17270
|
playbookWorkflow = result.workflow;
|
|
17115
17271
|
playbookMilestones = result.milestones;
|
|
17116
17272
|
playbookMilestoneModels = result.milestoneModels;
|
|
17273
|
+
playbookMilestoneFallbackModels = result.milestoneFallbackModels;
|
|
17274
|
+
playbookPolicy = result.policy;
|
|
17275
|
+
} else {
|
|
17276
|
+
playbookPolicy = void 0;
|
|
17117
17277
|
}
|
|
17118
17278
|
if (useStartupShell && !options.model?.trim()) {
|
|
17119
17279
|
if (playbookMilestoneModels && Object.keys(playbookMilestoneModels).length > 0 && startupShellRef.current) {
|
|
@@ -17214,7 +17374,8 @@ ${rulesContext}`;
|
|
|
17214
17374
|
if (autoCreatedAgent) {
|
|
17215
17375
|
const bootstrapPayload = buildMarathonAutoCreatedAgentBootstrap(normalizedAgent, {
|
|
17216
17376
|
model: options.model || agentConfigModel || defaultConfiguredModel,
|
|
17217
|
-
toolIds: resolvedToolIds
|
|
17377
|
+
toolIds: resolvedToolIds,
|
|
17378
|
+
fallbackModel: options.fallbackModel
|
|
17218
17379
|
});
|
|
17219
17380
|
try {
|
|
17220
17381
|
await client.agents.update(agentId, bootstrapPayload);
|
|
@@ -17230,6 +17391,16 @@ ${rulesContext}`;
|
|
|
17230
17391
|
);
|
|
17231
17392
|
}
|
|
17232
17393
|
}
|
|
17394
|
+
} else if (options.fallbackModel || playbookMilestoneFallbackModels) {
|
|
17395
|
+
const initialErrorHandling = resolveErrorHandlingForPhase(
|
|
17396
|
+
currentPhase,
|
|
17397
|
+
options.fallbackModel,
|
|
17398
|
+
playbookMilestoneFallbackModels
|
|
17399
|
+
);
|
|
17400
|
+
if (initialErrorHandling) {
|
|
17401
|
+
await client.agents.update(agentId, { config: { errorHandling: initialErrorHandling } }).catch(() => {
|
|
17402
|
+
});
|
|
17403
|
+
}
|
|
17233
17404
|
}
|
|
17234
17405
|
let localTools = buildLocalTools(client, parsedSandbox, options, {
|
|
17235
17406
|
taskName,
|
|
@@ -17532,7 +17703,13 @@ Saving state... done. Session saved to ${filePath}`);
|
|
|
17532
17703
|
model: event.model || effectiveModelForContext
|
|
17533
17704
|
});
|
|
17534
17705
|
},
|
|
17535
|
-
...resumeState
|
|
17706
|
+
...resumeState || playbookPolicy ? {
|
|
17707
|
+
resumeState: {
|
|
17708
|
+
...resumeState ?? {},
|
|
17709
|
+
...playbookPolicy?.outputRoot ? { outputRoot: playbookPolicy.outputRoot } : {},
|
|
17710
|
+
...playbookPolicy?.requireVerification !== void 0 ? { verificationRequired: playbookPolicy.requireVerification } : {}
|
|
17711
|
+
}
|
|
17712
|
+
} : {},
|
|
17536
17713
|
toolContextMode: options.toolContext || "hot-tail",
|
|
17537
17714
|
toolWindow: options.toolWindow === "session" || !options.toolWindow ? "session" : parseInt(options.toolWindow, 10) || 10,
|
|
17538
17715
|
onSession: async (state) => {
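The `resumeState` change in this hunk merges two policy fields into whatever state was loaded from disk. The sketch below isolates that expression in a hypothetical helper, `buildResumeOptions` (the package inlines it in the options object), to show that policy values are applied after the saved state and therefore take precedence. The sample state and policy objects are assumptions.

```javascript
// Hypothetical wrapper around the spread expression added in the diff.
function buildResumeOptions(resumeState, playbookPolicy) {
  return {
    ...(resumeState || playbookPolicy
      ? {
          resumeState: {
            ...(resumeState ?? {}),
            ...(playbookPolicy?.outputRoot ? { outputRoot: playbookPolicy.outputRoot } : {}),
            ...(playbookPolicy?.requireVerification !== undefined
              ? { verificationRequired: playbookPolicy.requireVerification }
              : {}),
          },
        }
      : {}),
  };
}

// Saved state says verification is not required; the policy says it is.
const saved = { workflowPhase: "execution", verificationRequired: false };
const policyFromPlaybook = { outputRoot: "content/", requireVerification: true };

const merged = buildResumeOptions(saved, policyFromPlaybook);
const policyOnly = buildResumeOptions(undefined, policyFromPlaybook);
const neither = buildResumeOptions(undefined, undefined);

console.log(JSON.stringify(merged));
console.log(JSON.stringify(policyOnly));
console.log(JSON.stringify(neither)); // {}
```

With neither a saved state nor a policy, no `resumeState` key is produced at all, which matches the `: {}` branch in the diff.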
|
|
@@ -17594,6 +17771,17 @@ Saving state... done. Session saved to ${filePath}`);
|
|
|
17594
17771
|
options.model = newPhaseModel;
|
|
17595
17772
|
modelChangedOnPhaseTransition = true;
|
|
17596
17773
|
}
|
|
17774
|
+
if (options.fallbackModel || playbookMilestoneFallbackModels) {
|
|
17775
|
+
const newErrorHandling = resolveErrorHandlingForPhase(
|
|
17776
|
+
resumeState.workflowPhase,
|
|
17777
|
+
options.fallbackModel,
|
|
17778
|
+
playbookMilestoneFallbackModels
|
|
17779
|
+
);
|
|
17780
|
+
client.agents.update(agentId, {
|
|
17781
|
+
config: { errorHandling: newErrorHandling ?? null }
|
|
17782
|
+
}).catch(() => {
|
|
17783
|
+
});
|
|
17784
|
+
}
|
|
17597
17785
|
}
|
|
17598
17786
|
if (state.recentActionKeys && state.recentActionKeys.length > 0) {
|
|
17599
17787
|
for (const key of state.recentActionKeys) {
|
|
@@ -17970,7 +18158,7 @@ function resolveSandboxWorkflowSelection(message, sandboxProvider, resumeState)
|
|
|
17970
18158
|
};
|
|
17971
18159
|
}
|
|
17972
18160
|
function applyTaskOptions(cmd) {
|
|
17973
|
-
return cmd.argument("<agent>", "Agent ID or name").option("-g, --goal <text>", "Goal message for the agent").option("--max-sessions <n>", "Maximum sessions", "50").option("--max-cost <n>", "Budget in USD").option("--model <modelId>", "Model ID to use (overrides agent config)").option("--name <name>", "Task name (used for state file, defaults to agent name)").option("--session <name>", "Resume a specific session by name").option("--state-dir <path>", "Directory for state files (default: ~/.runtype/projects/<hash>/marathons/)").option("--resume [message]", "Resume from existing local state, optionally with a new message").option("--fresh", "Start a new run and ignore any existing local state for this task").option("--compact", "Force compact-summary resume mode instead of replaying full history").option("--compact-strategy <strategy>", "Compaction strategy: auto (default), provider_native, or summary_fallback").option("--compact-threshold <value>", "Auto-compact when estimated context crosses this threshold (default: 80% fallback, 90% native; accepts percent like 90% or absolute token count like 120000)").option("--compact-instructions <text>", "Extra instructions for what a compact summary must preserve").option("--no-auto-compact", "Disable automatic context-aware history compaction").option("--track", "Sync progress to a Runtype record (visible in dashboard)").option("--debug", "Show debug output from each session").option("--json", "Output final result as JSON").option("--sandbox <provider>", "Enable sandbox code execution tool (cloudflare-worker, quickjs, or daytona)").option("--no-local-tools", "Disable built-in local tool execution (read_file, write_file, list_directory)").option("-t, --tools <tools...>", "Enable built-in tools (e.g., exa, firecrawl, dalle, openai_web_search, anthropic_web_search)").option("--plain-text", "Disable markdown rendering in output").option("--no-reasoning", "Disable model reasoning/thinking (enabled by default for supported models)").option("--no-checkpoint", "Run all iterations without checkpoint pauses (fully autonomous)").option("--checkpoint-timeout <seconds>", "Auto-continue timeout in seconds (default: 10)", "10").option("--planning-model <modelId>", "Model to use during research/planning phases").option("--execution-model <modelId>", "Model to use during execution phase").option("--playbook <name>", "Load a playbook from .runtype/marathons/playbooks/").option("--offload-threshold <chars>", 'Offload tool outputs larger than this to files (default: 100000; use "off" or "0" to disable guardrails)').option("--tool-context <mode>", "Tool result storage: hot-tail (default), observation-mask, or full-inline").option("--tool-window <window>", 'Compaction window: "session" (default) or a number for last-N tool results (e.g. 10)').option("--runner-char <char>", "Custom runner emoji (default: \u{1F3C3})").option("--finish-char <char>", "Custom finish line emoji (default: \u{1F3C1})").option("--no-runner", "Hide the runner emoji from the header border").option("--no-finish", "Hide the finish line emoji from the header border").action(taskAction);
|
|
18161
|
+
return cmd.argument("<agent>", "Agent ID or name").option("-g, --goal <text>", "Goal message for the agent").option("--max-sessions <n>", "Maximum sessions", "50").option("--max-cost <n>", "Budget in USD").option("--model <modelId>", "Model ID to use (overrides agent config)").option("--name <name>", "Task name (used for state file, defaults to agent name)").option("--session <name>", "Resume a specific session by name").option("--state-dir <path>", "Directory for state files (default: ~/.runtype/projects/<hash>/marathons/)").option("--resume [message]", "Resume from existing local state, optionally with a new message").option("--fresh", "Start a new run and ignore any existing local state for this task").option("--compact", "Force compact-summary resume mode instead of replaying full history").option("--compact-strategy <strategy>", "Compaction strategy: auto (default), provider_native, or summary_fallback").option("--compact-threshold <value>", "Auto-compact when estimated context crosses this threshold (default: 80% fallback, 90% native; accepts percent like 90% or absolute token count like 120000)").option("--compact-instructions <text>", "Extra instructions for what a compact summary must preserve").option("--no-auto-compact", "Disable automatic context-aware history compaction").option("--track", "Sync progress to a Runtype record (visible in dashboard)").option("--debug", "Show debug output from each session").option("--json", "Output final result as JSON").option("--sandbox <provider>", "Enable sandbox code execution tool (cloudflare-worker, quickjs, or daytona)").option("--no-local-tools", "Disable built-in local tool execution (read_file, write_file, list_directory)").option("-t, --tools <tools...>", "Enable built-in tools (e.g., exa, firecrawl, dalle, openai_web_search, anthropic_web_search)").option("--plain-text", "Disable markdown rendering in output").option("--no-reasoning", "Disable model reasoning/thinking (enabled by default for supported models)").option("--no-checkpoint", "Run all iterations without checkpoint pauses (fully autonomous)").option("--checkpoint-timeout <seconds>", "Auto-continue timeout in seconds (default: 10)", "10").option("--planning-model <modelId>", "Model to use during research/planning phases").option("--execution-model <modelId>", "Model to use during execution phase").option("--fallback-model <modelId>", "Model to fall back to when primary model fails").option("--playbook <name>", "Load a playbook from .runtype/marathons/playbooks/").option("--offload-threshold <chars>", 'Offload tool outputs larger than this to files (default: 100000; use "off" or "0" to disable guardrails)').option("--tool-context <mode>", "Tool result storage: hot-tail (default), observation-mask, or full-inline").option("--tool-window <window>", 'Compaction window: "session" (default) or a number for last-N tool results (e.g. 10)').option("--runner-char <char>", "Custom runner emoji (default: \u{1F3C3})").option("--finish-char <char>", "Custom finish line emoji (default: \u{1F3C1})").option("--no-runner", "Hide the runner emoji from the header border").option("--no-finish", "Hide the finish line emoji from the header border").action(taskAction);
|
|
17974
18162
|
}
|
|
17975
18163
|
var taskCommand = applyTaskOptions(
|
|
17976
18164
|
new Command10("task").description("Run a multi-session agent task")
|