@runtypelabs/cli 1.9.0 → 1.9.1

package/README.md CHANGED
@@ -1,8 +1,14 @@
1
- # Runtype CLI
1
+ <p align="center" style="background:white;border-radius:4px;padding: 12px;margin-bottom:16px;">
2
+ <img
3
+ src="https://www.runtype.com/runtype-text-only.svg"
4
+ alt="Runtype: The Intelligent Product Company"
5
+ width="240"
6
+ />
7
+ </p>
2
8
 
3
- Command-line interface for the Runtype AI platform.
9
+ # The Runtype CLI
4
10
 
5
- > Official CLI tool published on npm
11
+ This is our command-line interface for the platform, which includes _Marathon_, our harness for long-running tasks and deep workflow analysis.
6
12
 
7
13
  ## Installation
8
14
 
@@ -13,74 +19,83 @@ npm install -g @runtypelabs/cli
13
19
  Or run without installing:
14
20
 
15
21
  ```bash
16
- npx @runtypelabs/cli <command>
22
+ npx @runtypelabs/cli@latest <command>
17
23
  ```
18
24
 
19
- ## Quick Start
25
+ ## Start Here
20
26
 
21
- ### 1. Create Account & Authenticate
27
+ The easiest way to try Runtype from the CLI is to run a Marathon task. If you are not logged in yet, the CLI will guide you through login or signup first.
22
28
 
23
29
  ```bash
24
- # Create a new account
25
- runtype auth signup
30
+ npx @runtypelabs/cli@latest marathon researcher \
31
+ -g "fetch the home page of hacker news, read the articles related to AI, and summarize them. Use your firecrawl tool liberally" \
32
+ --model qwen/qwen3.5-397b-a17b \
33
+ --tools firecrawl \
34
+ --fresh \
35
+ --max-sessions 2
36
+ ```
26
37
 
27
- # Or login to existing account
28
- runtype auth login
38
+ Marathon can research on the web, edit code in your current repo, or run code in a sandbox. As it does this, you get full insight into what is happening during each run. You can have it process very long tasks with 1000+ tool calls and get an output of the session on your local machine or (optionally) store it within Runtype.
29
39
 
30
- # Check authentication status
31
- runtype auth whoami
32
- ```
40
+ Keep in mind it's a harness meant to aid understanding and analysis of AI workflows: how LLMs and tools interact with the system and user prompts to solve problems over many runs. It's a research harness, built for people building on top of AI.
33
41
 
34
- ### 2. Manage Flows and Records
42
+ It's not aiming to replace your favorite AI coding assistant, even though it'll gladly show you its work as it makes a valiant effort!
35
43
 
36
44
  ```bash
37
- # List flows
38
- runtype flows list
45
+ # Edit code in the current repo
46
+ runtype marathon "Code Editor" --goal "Refactor the theme editor to use modern UX best practices"
39
47
 
40
- # Run a flow
41
- runtype flows run <flow-id> --record <record-id>
48
+ # Build something and deploy it publicly
49
+ runtype marathon calculator --goal "Build a calculator in 3d and deploy it publicly" --model claude-sonnet-4-6
50
+ ```
42
51
 
43
- # Create a record
44
- runtype records create --name "My Record" --type "document"
52
+ ## Quick Start
45
53
 
46
- # List records
47
- runtype records list --type document
48
- ```
54
+ ### 1. Get Authenticated
49
55
 
50
- ### 3. Run Multi-Session Agent Tasks
56
+ ```bash
57
+ # Browser login
58
+ runtype login
59
+
60
+ # API key login for CI or non-interactive use
61
+ runtype login --api-key <key>
62
+ ```
51
63
 
52
- Use `runtype marathon` (or `runtype agents task`) to run long-running, multi-session agent tasks with real-time streaming output.
64
+ ### 2. Work With Agents, Flows, and Records
53
65
 
54
66
  ```bash
55
- # Run a task — agent output streams to your terminal in real time
56
- runtype marathon "Code Builder" --goal "Refactor the auth module to use JWT tokens"
67
+ # List available agents
68
+ runtype agents list
57
69
 
58
- # Set a budget and session limit, save progress with a custom name
59
- runtype marathon agent_abc123 --goal "Write integration tests for the API" \
60
- --max-sessions 10 --max-cost 2.50 --name "api-tests"
70
+ # Run a saved dashboard agent by ID, which uses your configured custom and cloud tools in a marathon
71
+ runtype marathon agent_abc123 --goal "Refactor the theme editor to use modern UX best practices"
61
72
 
62
- # Override the agent's configured model
63
- runtype marathon "Code Builder" --goal "Build a calculator" --model claude-sonnet-4-5
73
+ # Run an agent directly
74
+ runtype dispatch --agent <agent-id> --message "Summarize this document"
64
75
 
65
- # Enable built-in tools for the task (web search, scraping, image generation, etc.)
66
- runtype marathon "Code Builder" --goal "Search the web and summarize" --tools exa
67
- runtype marathon "Code Builder" -g "Find docs and scrape them" --tools exa firecrawl
76
+ # Create a flow
77
+ runtype flows create --name "My Flow" --description "Description"
68
78
 
69
- # Combine tools with other options
70
- runtype marathon "Code Builder" --goal "Research and generate images" \
71
- --tools exa dalle --max-sessions 5
79
+ # Run a flow directly with variables
80
+ runtype dispatch --flow <flow-id> --variable customerName=Alyss --variable priority=high --message "Hello, this is Claudia. Nice to meet you."
72
81
 
73
- # Enable sandboxed code execution tooling for the task (QuickJS or Daytona)
74
- runtype marathon "Code Builder" --goal "Compute totals from this dataset" --sandbox quickjs
75
- runtype marathon "Code Builder" -g "Generate a script and run it" --sandbox daytona
82
+ # Create a record
83
+ runtype records create --name "My Record" --type "document"
84
+ ```
85
+
86
+ ### 3. Run Multi-Session Agent Tasks
87
+
88
+ Use `runtype marathon` to run long-running tasks with real-time streaming output. If you want to try it without installing first, use `npx @runtypelabs/cli@latest marathon ...`. Swap the agent, model, tools, and execution environment depending on whether you want to research, edit code, or build and ship something end-to-end.
76
89
 
77
- # Resume an interrupted task (picks up where it left off)
78
- runtype marathon "Code Builder" --goal "Refactor the auth module to use JWT tokens" \
79
- --resume --name "auth-refactor" --debug
90
+ ```bash
91
+ # Session-limited research task that writes results into the folder you run it in
92
+ runtype marathon researcher -g "fetch the home page of hacker news, read the articles related to AI, and summarize them. Use your firecrawl tool liberally" --model qwen/qwen3.5-397b-a17b --tools firecrawl --fresh --max-sessions 2
93
+
94
+ # Edit code in the current repo
95
+ runtype marathon "Code Editor" --goal "Refactor the theme editor to use modern UX best practices"
80
96
 
81
- # Sync progress to a Runtype record (visible in the dashboard)
82
- runtype marathon "Code Builder" --goal "Build the payments integration" \
83
- --max-sessions 20 --track --name "payments"
97
+ # Build something and deploy it publicly
98
+ runtype marathon calculator --goal "Build a calculator in 3d and deploy it publicly" --model claude-sonnet-4-6
84
99
  ```
85
100
 
86
101
  #### Customizing the Runner Animation
@@ -132,6 +147,53 @@ runtype marathon "Code Builder" --goal "Fix the bug" \
132
147
  runtype marathon "Code Builder" --goal "Fix the bug" --tool-context full-inline
133
148
  ```
134
149
 
150
+ #### Automatic History Compaction
151
+
152
+ Marathon now manages continuation context against the active model's usable input budget instead of raw message history alone. It accounts for conversation history, tool results, tool definitions, and reserved output headroom before deciding whether to compact.
153
+
154
+ - This is enabled by default.
155
+ - The threshold is model-aware and follows the model currently selected for the run.
156
+ - Provider-native compaction is preferred when the active provider supports it. Today that means Anthropic-backed marathon runs compact at 90% of the effective input budget.
157
+ - Other models fall back to structured summary compaction at 80% of the effective input budget.
158
+ - `--compact` still forces compact-summary mode for resume/restart scenarios even if the threshold has not been reached.
159
+ - Use `--compact-strategy auto|provider_native|summary_fallback` to override the default selection.
160
+ - Use `--compact-instructions "..."` to tell summary compaction what state must be preserved.
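The default decision described in the bullets above can be pictured roughly as follows. This is only a sketch, not the CLI's implementation: the function name, the provider label `anthropic`, the argument order, and the formula "effective input budget = context window minus reserved output headroom" are all assumptions made for illustration. Only the 90% and 80% defaults come from the README.

```bash
# Sketch only: names and the budget formula are assumptions, not the CLI's code.
# Args: provider, context window, reserved output tokens,
#       history tokens, tool result tokens, tool definition tokens
should_compact() (
  provider="$1"; window="$2"; reserved_output="$3"
  history="$4"; tool_results="$5"; tool_defs="$6"

  # Defaults from the README: 90% for Anthropic-backed runs, 80% otherwise.
  case "$provider" in
    anthropic) pct=90 ;;
    *)         pct=80 ;;
  esac

  # Assumed formula for the effective input budget.
  budget=$((window - reserved_output))
  used=$((history + tool_results + tool_defs))

  # Integer form of: used / budget >= pct / 100
  if [ $((used * 100)) -ge $((budget * pct)) ]; then
    echo "compact"
  else
    echo "continue"
  fi
)

should_compact anthropic 200000 8000 140000 15000 5000
should_compact other 200000 8000 140000 15000 5000
```

With these made-up numbers the same 160000 tokens of context stay under the 90% line but cross the 80% line, which is why the two calls disagree.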
161
+
162
+ ```bash
163
+ # Default behavior: provider-aware auto compaction
164
+ runtype marathon "Code Builder" --goal "Refactor the auth module"
165
+
166
+ # Force summary fallback even on providers with native compaction
167
+ runtype marathon "Code Builder" --goal "Refactor the auth module" \
168
+ --compact-strategy summary_fallback
169
+
170
+ # Preserve specific context in compact summaries
171
+ runtype marathon "Code Builder" --goal "Refactor the auth module" \
172
+ --compact-instructions "Preserve changed files, test results, and unresolved blockers."
173
+
174
+ # Raise the threshold to 90% of the effective input budget
175
+ runtype marathon "Code Builder" --goal "Refactor the auth module" \
176
+ --compact-threshold 90%
177
+
178
+ # Use an absolute token threshold instead
179
+ runtype marathon "Code Builder" --goal "Refactor the auth module" \
180
+ --compact-threshold 120000
181
+
182
+ # Disable automatic history compaction
183
+ runtype marathon "Code Builder" --goal "Refactor the auth module" \
184
+ --no-auto-compact
185
+ ```
186
+
187
+ Percentages must include `%` (e.g. `80%`). Bare numbers are treated as absolute token counts (e.g. `120000`).
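To make the percentage versus absolute rule concrete, here is a minimal sketch of how a `--compact-threshold` value could be turned into a token count. The helper name `resolve_threshold` and the idea that the caller already knows the effective input budget are assumptions for illustration; the CLI's own parsing may differ.

```bash
# Hypothetical helper, not part of the CLI.
# $1 = threshold as typed (e.g. "90%" or "120000")
# $2 = effective input budget in tokens (assumed known to the caller)
resolve_threshold() (
  value="$1"; budget="$2"
  number=${value%\%}   # strip one trailing % if present

  # Reject anything that is not a plain non-negative integer.
  case "$number" in
    ''|*[!0-9]*) echo "invalid threshold: $value" >&2; exit 1 ;;
  esac

  case "$value" in
    *%) echo $((budget * number / 100)) ;;  # percentage of the budget
    *)  echo "$number" ;;                   # absolute token count
  esac
)

resolve_threshold 90% 192000     # prints 172800
resolve_threshold 120000 192000  # prints 120000
```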
188
+
189
+ #### Tool Output Guardrails
190
+
191
+ Marathon also guards against oversized local tool results so they do not silently consume the whole context window.
192
+
193
+ - Outputs above the soft warning threshold are surfaced in the TUI as context notices.
194
+ - Outputs above the hard threshold are offloaded to disk and replaced with a compact reference the model can reopen with `read_file`.
195
+ - The default hard threshold is `100000` characters. Set `--offload-threshold <chars>` to tune it, or `--offload-threshold off` to disable it.
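The offload behavior above might look something like the sketch below. Everything here is illustrative: the helper name, the temp-file location, and the wording of the reference line are assumptions, and `${#output}` counts bytes rather than characters in some shells. Only the `100000` default, the `off` value, and the `read_file` hint come from the README.

```bash
# Hypothetical helper, not the CLI's implementation.
# $1 = tool output text
# $2 = hard threshold in characters, or "off" (default 100000)
# $3 = directory for offloaded files (assumed; defaults to the temp dir)
offload_if_large() (
  output="$1"
  threshold="${2:-100000}"
  dir="${3:-${TMPDIR:-/tmp}}"

  # Pass the result through when offloading is disabled or the output is small.
  if [ "$threshold" = "off" ] || [ "${#output}" -le "$threshold" ]; then
    printf '%s\n' "$output"
    exit 0
  fi

  # Otherwise write it to disk and return a short reference instead.
  file="$(mktemp "$dir/tool-output.XXXXXX")"
  printf '%s' "$output" > "$file"
  echo "[output offloaded: ${#output} characters saved to $file; reopen with read_file]"
)

offload_if_large "short result" 100000   # prints the text unchanged
```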
196
+
135
197
  #### Built-in Tools
136
198
 
137
199
  The `--tools` (or `-t`) flag enables built-in platform tools during agent execution. Tools are validated at startup against the built-in tools registry, and compatibility is checked against all models used in the task (including planning and execution phase models).
@@ -168,12 +230,207 @@ runtype marathon "Code Builder" -g "Scrape docs" -t firecrawl
168
230
  - Tools incompatible with the selected model(s) are rejected with a specific error
169
231
  - All validation errors are reported together so you can fix them in one pass
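The "report everything in one pass" behavior can be sketched as below. The registry contents are an assumption (only tool names that appear in this diff are listed), the helper name is hypothetical, and the model compatibility check is left out because its rules are not described here.

```bash
# Hypothetical registry; the real built-in list lives in the CLI.
REGISTRY="exa firecrawl dalle"

# Collect every unknown tool before failing, instead of stopping at the first.
validate_tools() (
  errors=""
  for tool in "$@"; do
    case " $REGISTRY " in
      *" $tool "*) ;;  # known tool, nothing to report
      *) errors="${errors}unknown tool: ${tool}
" ;;
    esac
  done

  if [ -n "$errors" ]; then
    printf '%s' "$errors" >&2
    exit 1
  fi
  echo "ok"
)

validate_tools exa firecrawl                              # prints "ok"
validate_tools exa nope bogus || echo "validation failed" # two errors, then the message
```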
170
232
 
233
+ ## Command Reference
234
+
235
+ ### `runtype init`
236
+
237
+ Guided onboarding wizard for first-time setup. Walks through authentication and product creation.
238
+
239
+ ```bash
240
+ runtype init
241
+ runtype init --api-key <key> --name "My Product"
242
+ ```
243
+
244
+ ### `runtype login`
245
+
246
+ Top-level alias for `runtype auth login`.
247
+
248
+ ### `runtype auth`
249
+
250
+ Manage authentication.
251
+
252
+ | Subcommand | Description |
253
+ | ------------- | ---------------------------------------------- |
254
+ | `auth signup` | Create a new account (alias for `auth login`) |
255
+ | `auth login` | Login via browser or `--api-key <key>` |
256
+ | `auth status` | Show authentication status |
257
+ | `auth whoami` | Display current user info with billing details |
258
+ | `auth logout` | Remove stored credentials |
259
+
260
+ ### `runtype flows`
261
+
262
+ Manage flows.
263
+
264
+ | Subcommand | Description |
265
+ | ------------------------ | --------------------------- |
266
+ | `flows list` | List all flows |
267
+ | `flows get <id>` | Get flow details |
268
+ | `flows create -n <name>` | Create a new flow |
269
+ | `flows run <id>` | Execute a flow via dispatch |
270
+ | `flows delete <id>` | Delete a flow |
271
+
272
+ ### `runtype records`
273
+
274
+ Manage records.
275
+
276
+ | Subcommand | Description |
277
+ | ------------------------------------ | ---------------------------------------------------------------- |
278
+ | `records list` | List all records (`--type`, `--limit`) |
279
+ | `records get <id>` | Get record details |
280
+ | `records create -n <name> -t <type>` | Create a new record (`--metadata <json>`) |
281
+ | `records delete <id>` | Delete a record |
282
+ | `records export` | Export records to file (`--format json\|csv`, `--output <file>`) |
283
+
284
+ ### `runtype agents`
285
+
286
+ Manage agents.
287
+
288
+ | Subcommand | Description |
289
+ | ---------------------------------- | ----------------------------------------------- |
290
+ | `agents list` | List all agents |
291
+ | `agents get <id>` | Get agent details |
292
+ | `agents create -n <name>` | Create a new agent |
293
+ | `agents execute <id> -m <message>` | Execute an agent with a message |
294
+ | `agents task <agent> -g <goal>` | Run a multi-session task (see Marathon section) |
295
+ | `agents delete <id>` | Delete an agent |
296
+
297
+ ### `runtype dispatch`
298
+
299
+ Execute a flow or agent via the dispatch API.
300
+
301
+ ```bash
302
+ runtype dispatch --flow <id> --message "Hello"
303
+ runtype dispatch --agent <id> --message "Hello"
304
+ runtype dispatch --flow <id> --record <record-id>
305
+ runtype dispatch --flow <id> --record-json data.json
306
+ runtype dispatch --flow <id> --variable key=value --variable key2=value2
307
+ runtype dispatch --flow <id> --no-stream --json
308
+ ```
309
+
310
+ ### `runtype prompts`
311
+
312
+ Manage prompts.
313
+
314
+ | Subcommand | Description |
315
+ | ------------------- | ---------------------------------------------------- |
316
+ | `prompts list` | List all prompts |
317
+ | `prompts get <id>` | Get prompt details |
318
+ | `prompts test <id>` | Test a prompt (`-i <input>`, `--stream/--no-stream`) |
319
+
320
+ ### `runtype models`
321
+
322
+ Manage model configurations.
323
+
324
+ | Subcommand | Description |
325
+ | ------------------------- | --------------------------------------------- |
326
+ | `models list` | List your enabled model configurations |
327
+ | `models available` | List all available models grouped by provider |
328
+ | `models enable <modelId>` | Enable a model |
329
+ | `models disable <id>` | Disable a model configuration |
330
+ | `models default <id>` | Set a model configuration as default |
331
+ | `models usage` | Show model usage statistics |
332
+
333
+ ### `runtype batch`
334
+
335
+ Manage batch operations.
336
+
337
+ | Subcommand | Description |
338
+ | -------------------------------------------- | ---------------------------------- |
339
+ | `batch submit -f <flowId> -r <records.json>` | Submit a batch job |
340
+ | `batch status <id>` | Check batch job status (`--watch`) |
341
+ | `batch cancel <id>` | Cancel a batch job |
342
+
343
+ ### `runtype eval`
344
+
345
+ Manage evaluations.
346
+
347
+ | Subcommand | Description |
348
+ | ------------------------------------------- | -------------------------------------------- |
349
+ | `eval submit -f <flowId> -r <records.json>` | Submit an eval batch (`-n <name>`) |
350
+ | `eval list` | List eval batches (`--flow <id>`, `--limit`) |
351
+ | `eval results <id>` | Get eval batch results |
352
+ | `eval compare <groupId>` | Compare evals in a group |
353
+
354
+ ### `runtype schedules`
355
+
356
+ Manage scheduled flow runs.
357
+
358
+ | Subcommand | Description |
359
+ | ---------------------------------------- | ------------------------------- |
360
+ | `schedules list` | List all schedules |
361
+ | `schedules get <id>` | Get schedule details |
362
+ | `schedules create -f <flowId> -c <cron>` | Create a schedule (`-n <name>`) |
363
+ | `schedules pause <id>` | Pause a schedule |
364
+ | `schedules resume <id>` | Resume a paused schedule |
365
+ | `schedules run-now <id>` | Trigger immediate execution |
366
+ | `schedules delete <id>` | Delete a schedule |
367
+
368
+ ### `runtype api-keys`
369
+
370
+ Manage API keys.
371
+
372
+ | Subcommand | Description |
373
+ | --------------------------- | ------------------------------------------------ |
374
+ | `api-keys list` | List your API keys |
375
+ | `api-keys get <id>` | Get API key details |
376
+ | `api-keys create -n <name>` | Create a new API key |
377
+ | `api-keys delete <id>` | Delete an API key (`--yes` to skip confirmation) |
378
+ | `api-keys regenerate <id>` | Regenerate an API key |
379
+ | `api-keys analytics` | Show usage analytics (`--key <id>`) |
380
+
381
+ ### `runtype products`
382
+
383
+ Manage products.
384
+
385
+ | Subcommand | Description |
386
+ | ---------------------------- | ------------------------------------------------------------- |
387
+ | `products init --from <url>` | Import a product from an authenticated external A2A agent URL |
388
+
389
+ ```bash
390
+ # After logging in, import an external A2A agent
391
+ runtype products init --from https://example.com/.well-known/agent-card.json
392
+ runtype products init --from <url> --name "Custom Name"
393
+ ```
394
+
395
+ ### `runtype flow-versions`
396
+
397
+ Manage flow versions.
398
+
399
+ | Subcommand | Description |
400
+ | ----------------------------------------------- | ---------------------------- |
401
+ | `flow-versions list <flowId>` | List all versions for a flow |
402
+ | `flow-versions get <flowId> <versionId>` | Get a specific version |
403
+ | `flow-versions published <flowId>` | Get the published version |
404
+ | `flow-versions publish <flowId> -v <versionId>` | Publish a version |
405
+
406
+ ### `runtype billing`
407
+
408
+ View billing and subscription info.
409
+
410
+ | Subcommand | Description |
411
+ | ----------------- | --------------------------------------- |
412
+ | `billing status` | Show current plan and usage |
413
+ | `billing portal` | Open the billing portal in your browser |
414
+ | `billing refresh` | Refresh plan data from billing provider |
415
+
416
+ ### `runtype analytics`
417
+
418
+ View analytics and execution results.
419
+
420
+ | Subcommand | Description |
421
+ | ------------------- | -------------------------------------------------------------------- |
422
+ | `analytics stats` | Show account statistics |
423
+ | `analytics results` | List execution results (`--flow`, `--record`, `--status`, `--limit`) |
424
+
171
425
  ## Configuration
172
426
 
173
427
  ```bash
174
428
  # View all configuration
175
429
  runtype config get
176
430
 
431
+ # Get a specific key
432
+ runtype config get apiUrl
433
+
177
434
  # Set API URL
178
435
  runtype config set apiUrl https://api.runtype.com
179
436
 
@@ -182,8 +439,24 @@ runtype config set defaultModel gpt-4o
182
439
 
183
440
  # Reset configuration
184
441
  runtype config reset
442
+
443
+ # Show configuration file path
444
+ runtype config path
185
445
  ```
186
446
 
447
+ Valid config keys: `apiUrl`, `defaultModel`, `defaultTemperature`, `outputFormat`, `streamResponses`
448
+
449
+ ### Global Options
450
+
451
+ All commands support these flags:
452
+
453
+ | Flag | Description |
454
+ | -------------------- | ------------------------ |
455
+ | `--json` | Output in JSON format |
456
+ | `--tty` / `--no-tty` | Force TTY / non-TTY mode |
457
+ | `-v, --verbose` | Enable verbose output |
458
+ | `--api-url <url>` | Override API URL |
459
+
187
460
  ## Development
188
461
 
189
462
  ### Local Development Setup