@runtypelabs/cli 1.9.0 → 1.9.2
- package/README.md +320 -47
- package/dist/index.js +1186 -599
- package/dist/index.js.map +1 -1
- package/package.json +3 -3
package/README.md
CHANGED
|
@@ -1,8 +1,14 @@
|
|
|
1
|
-
|
|
1
|
+
<p align="center" style="background:white;border-radius:4px;padding: 12px;margin-bottom:16px;">
|
|
2
|
+
<img
|
|
3
|
+
src="https://www.runtype.com/runtype-text-only.svg"
|
|
4
|
+
alt="Runtype: The Intelligent Product Company"
|
|
5
|
+
width="240"
|
|
6
|
+
/>
|
|
7
|
+
</p>
|
|
2
8
|
|
|
3
|
-
|
|
9
|
+
# The Runtype CLI
|
|
4
10
|
|
|
5
|
-
|
|
11
|
+
This is our command-line interface for the platform, which includes _Marathon_, our harness for long-running tasks and deep workflow analysis.
|
|
6
12
|
|
|
7
13
|
## Installation
|
|
8
14
|
|
|
@@ -13,74 +19,83 @@ npm install -g @runtypelabs/cli
|
|
|
13
19
|
Or run without installing:
|
|
14
20
|
|
|
15
21
|
```bash
|
|
16
|
-
npx @runtypelabs/cli <command>
|
|
22
|
+
npx @runtypelabs/cli@latest <command>
|
|
17
23
|
```
|
|
18
24
|
|
|
19
|
-
##
|
|
25
|
+
## Start Here
|
|
20
26
|
|
|
21
|
-
|
|
27
|
+
The easiest way to try Runtype from the CLI is to run a Marathon task. If you are not logged in yet, the CLI will guide you through login or signup first.
|
|
22
28
|
|
|
23
29
|
```bash
|
|
24
|
-
|
|
25
|
-
|
|
30
|
+
npx @runtypelabs/cli@latest marathon researcher \
  -g "fetch the home page of hacker news, read the articles related to AI, and summarize them. Use your firecrawl tool liberally" \
  --model qwen/qwen3.5-397b-a17b \
  --tools firecrawl \
  --fresh \
  --max-sessions 2
|
|
36
|
+
```
|
|
26
37
|
|
|
27
|
-
|
|
28
|
-
runtype auth login
|
|
38
|
+
Marathon can research on the web, edit code in your current repo, or run code in a sandbox, and you get full insight into what is happening during each run. It can process very long tasks with 1000+ tool calls and write the session output to your local machine or (optionally) store it within Runtype.
|
|
29
39
|
|
|
30
|
-
|
|
31
|
-
runtype auth whoami
|
|
32
|
-
```
|
|
40
|
+
Keep in mind it's a harness meant to aid understanding and analysis of AI workflows: how LLMs and tools interact with the system and user prompts to solve problems over many runs. It's a research harness, built for people building on top of AI.
|
|
33
41
|
|
|
34
|
-
|
|
42
|
+
It's not aiming to replace your favorite AI coding assistant, even though it'll gladly show you its work as it makes a valiant effort!
|
|
35
43
|
|
|
36
44
|
```bash
|
|
37
|
-
#
|
|
38
|
-
runtype
|
|
45
|
+
# Edit code in the current repo
|
|
46
|
+
runtype marathon "Code Editor" --goal "Refactor the theme editor to use modern UX best practices"
|
|
39
47
|
|
|
40
|
-
#
|
|
41
|
-
runtype
|
|
48
|
+
# Build something and deploy it publicly
|
|
49
|
+
runtype marathon calculator --goal "Build a calculator in 3d and deploy it publicly" --model claude-sonnet-4-6
|
|
50
|
+
```
|
|
42
51
|
|
|
43
|
-
|
|
44
|
-
runtype records create --name "My Record" --type "document"
|
|
52
|
+
## Quick Start
|
|
45
53
|
|
|
46
|
-
|
|
47
|
-
runtype records list --type document
|
|
48
|
-
```
|
|
54
|
+
### 1. Get Authenticated
|
|
49
55
|
|
|
50
|
-
|
|
56
|
+
```bash
# Browser login
runtype login

# API key login for CI or non-interactive use
runtype login --api-key <key>
```
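
In CI, the key usually comes from the environment rather than being typed by hand. A minimal sketch, assuming the key is stored in a variable named `RUNTYPE_API_KEY` (a hypothetical name chosen for this example, not one the CLI documents). The helper only assembles the `runtype login --api-key <key>` invocation shown above and fails early when the variable is empty:

```bash
# RUNTYPE_API_KEY is an assumed variable name for this sketch,
# not a documented CLI setting.
build_login_cmd() {
  if [ -z "${RUNTYPE_API_KEY:-}" ]; then
    echo "RUNTYPE_API_KEY is not set" >&2
    return 1
  fi
  # Store the command as an array so the key stays a single argument.
  LOGIN_CMD=(runtype login --api-key "$RUNTYPE_API_KEY")
}

# In a CI step you would then run: build_login_cmd && "${LOGIN_CMD[@]}"
```

Building the command as an array keeps the key intact even if it contains characters the shell would otherwise split on.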
|
|
51
63
|
|
|
52
|
-
|
|
64
|
+
### 2. Work With Agents, Flows, and Records
|
|
53
65
|
|
|
54
66
|
```bash
|
|
55
|
-
#
|
|
56
|
-
runtype
|
|
67
|
+
# List available agents
|
|
68
|
+
runtype agents list
|
|
57
69
|
|
|
58
|
-
#
|
|
59
|
-
runtype marathon agent_abc123 --goal "
|
|
60
|
-
--max-sessions 10 --max-cost 2.50 --name "api-tests"
|
|
70
|
+
# Run a saved dashboard agent by ID, which uses your configured custom and cloud tools in a marathon
|
|
71
|
+
runtype marathon agent_abc123 --goal "Refactor the theme editor to use modern UX best practices"
|
|
61
72
|
|
|
62
|
-
#
|
|
63
|
-
runtype
|
|
73
|
+
# Run an agent directly
|
|
74
|
+
runtype dispatch --agent <agent-id> --message "Summarize this document"
|
|
64
75
|
|
|
65
|
-
#
|
|
66
|
-
runtype
|
|
67
|
-
runtype marathon "Code Builder" -g "Find docs and scrape them" --tools exa firecrawl
|
|
76
|
+
# Create a flow
|
|
77
|
+
runtype flows create --name "My Flow" --description "Description"
|
|
68
78
|
|
|
69
|
-
#
|
|
70
|
-
runtype
|
|
71
|
-
--tools exa dalle --max-sessions 5
|
|
79
|
+
# Run a flow directly with variables
|
|
80
|
+
runtype dispatch --flow <flow-id> --variable customerName=Alyss --variable priority=high --message "Hello, this is Claudia. Nice to meet you."
|
|
72
81
|
|
|
73
|
-
#
|
|
74
|
-
runtype
|
|
75
|
-
|
|
82
|
+
# Create a record
|
|
83
|
+
runtype records create --name "My Record" --type "document"
|
|
84
|
+
```
|
|
85
|
+
|
|
86
|
+
### 3. Run Multi-Session Agent Tasks
|
|
87
|
+
|
|
88
|
+
Use `runtype marathon` to run long-running tasks with real-time streaming output. If you want to try it without installing first, use `npx @runtypelabs/cli@latest marathon ...`. Swap the agent, model, tools, and execution environment depending on whether you want to research, edit code, or build and ship something end-to-end.
|
|
76
89
|
|
|
77
|
-
|
|
78
|
-
|
|
79
|
-
|
|
90
|
+
```bash
|
|
91
|
+
# Session-limited research task that writes results into the folder you run it in
|
|
92
|
+
runtype marathon researcher -g "fetch the home page of hacker news, read the articles related to AI, and summarize them. Use your firecrawl tool liberally" --model qwen/qwen3.5-397b-a17b --tools firecrawl --fresh --max-sessions 2
|
|
93
|
+
|
|
94
|
+
# Edit code in the current repo
|
|
95
|
+
runtype marathon "Code Editor" --goal "Refactor the theme editor to use modern UX best practices"
|
|
80
96
|
|
|
81
|
-
#
|
|
82
|
-
runtype marathon
|
|
83
|
-
--max-sessions 20 --track --name "payments"
|
|
97
|
+
# Build something and deploy it publicly
|
|
98
|
+
runtype marathon calculator --goal "Build a calculator in 3d and deploy it publicly" --model claude-sonnet-4-6
|
|
84
99
|
```
|
|
85
100
|
|
|
86
101
|
#### Customizing the Runner Animation
|
|
@@ -132,6 +147,53 @@ runtype marathon "Code Builder" --goal "Fix the bug" \
|
|
|
132
147
|
runtype marathon "Code Builder" --goal "Fix the bug" --tool-context full-inline
|
|
133
148
|
```
|
|
134
149
|
|
|
150
|
+
#### Automatic History Compaction
|
|
151
|
+
|
|
152
|
+
Marathon now manages continuation context against the active model's usable input budget instead of raw message history alone. It accounts for conversation history, tool results, tool definitions, and reserved output headroom before deciding whether to compact.
|
|
153
|
+
|
|
154
|
+
- This is enabled by default.
- The threshold is model-aware and follows the model currently selected for the run.
- Provider-native compaction is preferred when the active provider supports it. Today that means Anthropic-backed marathon runs compact at 90% of the effective input budget.
- Other models fall back to structured summary compaction at 80% of the effective input budget.
- `--compact` still forces compact-summary mode for resume/restart scenarios even if the threshold has not been reached.
- Use `--compact-strategy auto|provider_native|summary_fallback` to override the default selection.
- Use `--compact-instructions "..."` to tell summary compaction what state must be preserved.
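
The threshold arithmetic described in this list can be sketched as follows. This is an illustration under stated assumptions, not Marathon's implementation: the function names, the provider labels, the example token counts, and the choice to trigger once usage reaches (rather than exceeds) the threshold are all hypothetical.

```bash
# Illustrative only: names and numbers are assumptions, not Marathon internals.

# usage: compaction_trigger <effective_input_budget_tokens> <provider>
# Prints the token count at which compaction would start.
compaction_trigger() {
  local budget="$1" provider="$2" percent
  if [ "$provider" = "anthropic" ]; then
    percent=90   # provider-native compaction
  else
    percent=80   # structured summary fallback
  fi
  echo $(( budget * percent / 100 ))
}

# usage: should_compact <used_tokens> <effective_input_budget_tokens> <provider>
# used_tokens stands for history + tool results + tool definitions.
should_compact() {
  local used="$1" budget="$2" provider="$3"
  [ "$used" -ge "$(compaction_trigger "$budget" "$provider")" ]
}

# Example with a made-up 200000-token effective input budget
compaction_trigger 200000 anthropic   # prints 180000
compaction_trigger 200000 other       # prints 160000
```

With these made-up numbers, a run that has used 170000 tokens would already compact under the summary fallback but not yet under provider-native compaction.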
|
|
161
|
+
|
|
162
|
+
```bash
# Default behavior: provider-aware auto compaction
runtype marathon "Code Builder" --goal "Refactor the auth module"

# Force summary fallback even on providers with native compaction
runtype marathon "Code Builder" --goal "Refactor the auth module" \
  --compact-strategy summary_fallback

# Preserve specific context in compact summaries
runtype marathon "Code Builder" --goal "Refactor the auth module" \
  --compact-instructions "Preserve changed files, test results, and unresolved blockers."

# Raise the threshold to 90% of the effective input budget
runtype marathon "Code Builder" --goal "Refactor the auth module" \
  --compact-threshold 90%

# Use an absolute token threshold instead
runtype marathon "Code Builder" --goal "Refactor the auth module" \
  --compact-threshold 120000

# Disable automatic history compaction
runtype marathon "Code Builder" --goal "Refactor the auth module" \
  --no-auto-compact
```
|
|
186
|
+
|
|
187
|
+
Percentages must include `%` (e.g. `80%`). Bare numbers are treated as absolute token counts (e.g. `120000`).
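
The distinction between the two forms can be sketched as a small classifier. This is a hypothetical helper for illustration, not the CLI's parser: it assumes whole numbers only, and the CLI's actual validation rules may differ.

```bash
# Illustrative parser; the CLI's actual validation rules may differ.
# usage: parse_compact_threshold <value>
parse_compact_threshold() {
  local value="$1" number
  case "$value" in
    *%)
      number="${value%?}"   # drop the trailing percent sign
      case "$number" in
        ''|*[!0-9]*) echo "invalid"; return 1 ;;
      esac
      echo "percent $number"
      ;;
    ''|*[!0-9]*)
      echo "invalid"; return 1
      ;;
    *)
      echo "tokens $value"
      ;;
  esac
}

parse_compact_threshold 80%      # prints "percent 80"
parse_compact_threshold 120000   # prints "tokens 120000"
```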
|
|
188
|
+
|
|
189
|
+
#### Tool Output Guardrails
|
|
190
|
+
|
|
191
|
+
Marathon also guards against oversized local tool results so they do not silently consume the whole context window.
|
|
192
|
+
|
|
193
|
+
- Outputs above the soft warning threshold are surfaced in the TUI as context notices.
- Outputs above the hard threshold are offloaded to disk and replaced with a compact reference the model can reopen with `read_file`.
- The default hard threshold is `100000` characters. Set `--offload-threshold <chars>` to tune it, or `--offload-threshold off` to disable it.
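
The offload behavior can be pictured with a rough shell sketch. Everything here is an assumption made for illustration (the function name, the `.offloaded` file naming, and the wording of the reference line); only the `100000`-character default comes from the list above.

```bash
# Illustrative only: the reference text and file naming are assumptions.
OFFLOAD_THRESHOLD=100000   # default hard threshold, in characters

# usage: guard_tool_output <output-file> <offload-dir>
guard_tool_output() {
  local file="$1" dir="$2" chars dest
  chars=$(wc -m < "$file" | tr -d '[:space:]')
  if [ "$chars" -gt "$OFFLOAD_THRESHOLD" ]; then
    mkdir -p "$dir"
    dest="$dir/$(basename "$file").offloaded"
    cp "$file" "$dest"
    # Compact reference that stands in for the full output
    echo "[tool output offloaded: $chars characters saved to $dest]"
  else
    cat "$file"
  fi
}
```

Small outputs pass through unchanged. Oversized ones are copied to the offload directory, and only the one-line reference is printed.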
|
|
196
|
+
|
|
135
197
|
#### Built-in Tools
|
|
136
198
|
|
|
137
199
|
The `--tools` (or `-t`) flag enables built-in platform tools during agent execution. Tools are validated at startup against the built-in tools registry, and compatibility is checked against all models used in the task (including planning and execution phase models).
|
|
@@ -168,12 +230,207 @@ runtype marathon "Code Builder" -g "Scrape docs" -t firecrawl
|
|
|
168
230
|
- Tools incompatible with the selected model(s) are rejected with a specific error
|
|
169
231
|
- All validation errors are reported together so you can fix them in one pass
|
|
170
232
|
|
|
233
|
+
## Command Reference
|
|
234
|
+
|
|
235
|
+
### `runtype init`
|
|
236
|
+
|
|
237
|
+
Guided onboarding wizard for first-time setup. Walks through authentication and product creation.
|
|
238
|
+
|
|
239
|
+
```bash
runtype init
runtype init --api-key <key> --name "My Product"
```
|
|
243
|
+
|
|
244
|
+
### `runtype login`
|
|
245
|
+
|
|
246
|
+
Top-level alias for `runtype auth login`.
|
|
247
|
+
|
|
248
|
+
### `runtype auth`
|
|
249
|
+
|
|
250
|
+
Manage authentication.
|
|
251
|
+
|
|
252
|
+
| Subcommand    | Description                                    |
| ------------- | ---------------------------------------------- |
| `auth signup` | Create a new account (alias for `auth login`)  |
| `auth login`  | Login via browser or `--api-key <key>`         |
| `auth status` | Show authentication status                     |
| `auth whoami` | Display current user info with billing details |
| `auth logout` | Remove stored credentials                      |
|
|
259
|
+
|
|
260
|
+
### `runtype flows`
|
|
261
|
+
|
|
262
|
+
Manage flows.
|
|
263
|
+
|
|
264
|
+
| Subcommand               | Description                 |
| ------------------------ | --------------------------- |
| `flows list`             | List all flows              |
| `flows get <id>`         | Get flow details            |
| `flows create -n <name>` | Create a new flow           |
| `flows run <id>`         | Execute a flow via dispatch |
| `flows delete <id>`      | Delete a flow               |
|
|
271
|
+
|
|
272
|
+
### `runtype records`
|
|
273
|
+
|
|
274
|
+
Manage records.
|
|
275
|
+
|
|
276
|
+
| Subcommand                           | Description                                                      |
| ------------------------------------ | ---------------------------------------------------------------- |
| `records list`                       | List all records (`--type`, `--limit`)                           |
| `records get <id>`                   | Get record details                                               |
| `records create -n <name> -t <type>` | Create a new record (`--metadata <json>`)                        |
| `records delete <id>`                | Delete a record                                                  |
| `records export`                     | Export records to file (`--format json\|csv`, `--output <file>`) |
|
|
283
|
+
|
|
284
|
+
### `runtype agents`
|
|
285
|
+
|
|
286
|
+
Manage agents.
|
|
287
|
+
|
|
288
|
+
| Subcommand                         | Description                                     |
| ---------------------------------- | ----------------------------------------------- |
| `agents list`                      | List all agents                                 |
| `agents get <id>`                  | Get agent details                               |
| `agents create -n <name>`          | Create a new agent                              |
| `agents execute <id> -m <message>` | Execute an agent with a message                 |
| `agents task <agent> -g <goal>`    | Run a multi-session task (see Marathon section) |
| `agents delete <id>`               | Delete an agent                                 |
|
|
296
|
+
|
|
297
|
+
### `runtype dispatch`
|
|
298
|
+
|
|
299
|
+
Execute a flow or agent via the dispatch API.
|
|
300
|
+
|
|
301
|
+
```bash
runtype dispatch --flow <id> --message "Hello"
runtype dispatch --agent <id> --message "Hello"
runtype dispatch --flow <id> --record <record-id>
runtype dispatch --flow <id> --record-json data.json
runtype dispatch --flow <id> --variable key=value --variable key2=value2
runtype dispatch --flow <id> --no-stream --json
```
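
When the variables come from a script rather than being typed by hand, the repeated `--variable key=value` pairs can be assembled in a loop. A minimal sketch, using only the `dispatch`, `--flow`, and `--variable` options shown above; the helper name and the `flow_123` ID are placeholders invented for this example:

```bash
# usage: build_dispatch_args <flow-id> [key=value ...]
# Fills the DISPATCH_ARGS array; run it with: runtype "${DISPATCH_ARGS[@]}"
build_dispatch_args() {
  local flow_id="$1" pair
  shift
  DISPATCH_ARGS=(dispatch --flow "$flow_id")
  for pair in "$@"; do
    case "$pair" in
      ?*=*) DISPATCH_ARGS+=(--variable "$pair") ;;
      *)    echo "skipping malformed variable: $pair" >&2 ;;
    esac
  done
}

# flow_123 is a placeholder ID for this example
build_dispatch_args flow_123 customerName=Alyss priority=high
printf '%s\n' "runtype ${DISPATCH_ARGS[*]}"
```

Keeping the arguments in an array means values containing spaces survive intact when the command is finally executed.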
|
|
309
|
+
|
|
310
|
+
### `runtype prompts`
|
|
311
|
+
|
|
312
|
+
Manage prompts.
|
|
313
|
+
|
|
314
|
+
| Subcommand          | Description                                          |
| ------------------- | ---------------------------------------------------- |
| `prompts list`      | List all prompts                                     |
| `prompts get <id>`  | Get prompt details                                   |
| `prompts test <id>` | Test a prompt (`-i <input>`, `--stream/--no-stream`) |
|
|
319
|
+
|
|
320
|
+
### `runtype models`
|
|
321
|
+
|
|
322
|
+
Manage model configurations.
|
|
323
|
+
|
|
324
|
+
| Subcommand                | Description                                   |
| ------------------------- | --------------------------------------------- |
| `models list`             | List your enabled model configurations        |
| `models available`        | List all available models grouped by provider |
| `models enable <modelId>` | Enable a model                                |
| `models disable <id>`     | Disable a model configuration                 |
| `models default <id>`     | Set a model configuration as default          |
| `models usage`            | Show model usage statistics                   |
|
|
332
|
+
|
|
333
|
+
### `runtype batch`
|
|
334
|
+
|
|
335
|
+
Manage batch operations.
|
|
336
|
+
|
|
337
|
+
| Subcommand                                   | Description                        |
| -------------------------------------------- | ---------------------------------- |
| `batch submit -f <flowId> -r <records.json>` | Submit a batch job                 |
| `batch status <id>`                          | Check batch job status (`--watch`) |
| `batch cancel <id>`                          | Cancel a batch job                 |
|
|
342
|
+
|
|
343
|
+
### `runtype eval`
|
|
344
|
+
|
|
345
|
+
Manage evaluations.
|
|
346
|
+
|
|
347
|
+
| Subcommand                                  | Description                                  |
| ------------------------------------------- | -------------------------------------------- |
| `eval submit -f <flowId> -r <records.json>` | Submit an eval batch (`-n <name>`)           |
| `eval list`                                 | List eval batches (`--flow <id>`, `--limit`) |
| `eval results <id>`                         | Get eval batch results                       |
| `eval compare <groupId>`                    | Compare evals in a group                     |
|
|
353
|
+
|
|
354
|
+
### `runtype schedules`
|
|
355
|
+
|
|
356
|
+
Manage scheduled flow runs.
|
|
357
|
+
|
|
358
|
+
| Subcommand                               | Description                     |
| ---------------------------------------- | ------------------------------- |
| `schedules list`                         | List all schedules              |
| `schedules get <id>`                     | Get schedule details            |
| `schedules create -f <flowId> -c <cron>` | Create a schedule (`-n <name>`) |
| `schedules pause <id>`                   | Pause a schedule                |
| `schedules resume <id>`                  | Resume a paused schedule        |
| `schedules run-now <id>`                 | Trigger immediate execution     |
| `schedules delete <id>`                  | Delete a schedule               |
|
|
367
|
+
|
|
368
|
+
### `runtype api-keys`
|
|
369
|
+
|
|
370
|
+
Manage API keys.
|
|
371
|
+
|
|
372
|
+
| Subcommand                  | Description                                      |
| --------------------------- | ------------------------------------------------ |
| `api-keys list`             | List your API keys                               |
| `api-keys get <id>`         | Get API key details                              |
| `api-keys create -n <name>` | Create a new API key                             |
| `api-keys delete <id>`      | Delete an API key (`--yes` to skip confirmation) |
| `api-keys regenerate <id>`  | Regenerate an API key                            |
| `api-keys analytics`        | Show usage analytics (`--key <id>`)              |
|
|
380
|
+
|
|
381
|
+
### `runtype products`
|
|
382
|
+
|
|
383
|
+
Manage products.
|
|
384
|
+
|
|
385
|
+
| Subcommand                   | Description                                                   |
| ---------------------------- | ------------------------------------------------------------- |
| `products init --from <url>` | Import a product from an authenticated external A2A agent URL |
|
|
388
|
+
|
|
389
|
+
```bash
# After logging in, import an external A2A agent
runtype products init --from https://example.com/.well-known/agent-card.json
runtype products init --from <url> --name "Custom Name"
```
|
|
394
|
+
|
|
395
|
+
### `runtype flow-versions`
|
|
396
|
+
|
|
397
|
+
Manage flow versions.
|
|
398
|
+
|
|
399
|
+
| Subcommand                                      | Description                  |
| ----------------------------------------------- | ---------------------------- |
| `flow-versions list <flowId>`                   | List all versions for a flow |
| `flow-versions get <flowId> <versionId>`        | Get a specific version       |
| `flow-versions published <flowId>`              | Get the published version    |
| `flow-versions publish <flowId> -v <versionId>` | Publish a version            |
|
|
405
|
+
|
|
406
|
+
### `runtype billing`
|
|
407
|
+
|
|
408
|
+
View billing and subscription info.
|
|
409
|
+
|
|
410
|
+
| Subcommand        | Description                             |
| ----------------- | --------------------------------------- |
| `billing status`  | Show current plan and usage             |
| `billing portal`  | Open the billing portal in your browser |
| `billing refresh` | Refresh plan data from billing provider |
|
|
415
|
+
|
|
416
|
+
### `runtype analytics`
|
|
417
|
+
|
|
418
|
+
View analytics and execution results.
|
|
419
|
+
|
|
420
|
+
| Subcommand          | Description                                                          |
| ------------------- | -------------------------------------------------------------------- |
| `analytics stats`   | Show account statistics                                              |
| `analytics results` | List execution results (`--flow`, `--record`, `--status`, `--limit`) |
|
|
424
|
+
|
|
171
425
|
## Configuration
|
|
172
426
|
|
|
173
427
|
```bash
|
|
174
428
|
# View all configuration
|
|
175
429
|
runtype config get
|
|
176
430
|
|
|
431
|
+
# Get a specific key
|
|
432
|
+
runtype config get apiUrl
|
|
433
|
+
|
|
177
434
|
# Set API URL
|
|
178
435
|
runtype config set apiUrl https://api.runtype.com
|
|
179
436
|
|
|
@@ -182,8 +439,24 @@ runtype config set defaultModel gpt-4o
|
|
|
182
439
|
|
|
183
440
|
# Reset configuration
|
|
184
441
|
runtype config reset
|
|
442
|
+
|
|
443
|
+
# Show configuration file path
|
|
444
|
+
runtype config path
|
|
185
445
|
```
|
|
186
446
|
|
|
447
|
+
Valid config keys: `apiUrl`, `defaultModel`, `defaultTemperature`, `outputFormat`, `streamResponses`
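
A script that sets configuration can check the key against this list before calling the CLI. A minimal sketch with hypothetical helper names; it assumes keys are case-sensitive and spelled exactly as listed, which the README does not state explicitly:

```bash
# usage: is_valid_config_key <key>
# Assumption: keys are case-sensitive and match the documented list exactly.
is_valid_config_key() {
  case "$1" in
    apiUrl|defaultModel|defaultTemperature|outputFormat|streamResponses) return 0 ;;
    *) return 1 ;;
  esac
}

# usage: build_config_set <key> <value>; fills CONFIG_CMD on success
build_config_set() {
  if ! is_valid_config_key "$1"; then
    echo "unknown config key: $1" >&2
    return 1
  fi
  CONFIG_CMD=(runtype config set "$1" "$2")
}

# Then run it with: build_config_set defaultModel gpt-4o && "${CONFIG_CMD[@]}"
```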
|
|
448
|
+
|
|
449
|
+
### Global Options
|
|
450
|
+
|
|
451
|
+
All commands support these flags:
|
|
452
|
+
|
|
453
|
+
| Flag                 | Description              |
| -------------------- | ------------------------ |
| `--json`             | Output in JSON format    |
| `--tty` / `--no-tty` | Force TTY / non-TTY mode |
| `-v, --verbose`      | Enable verbose output    |
| `--api-url <url>`    | Override API URL         |
|
|
459
|
+
|
|
187
460
|
## Development
|
|
188
461
|
|
|
189
462
|
### Local Development Setup
|