npm - @runtypelabs/cli - Versions diffs - 1.9.0 → 1.9.2 - Mend

@runtypelabs/cli 1.9.0 → 1.9.2


Files changed (4)

package/README.md CHANGED

@@ -1,8 +1,14 @@
-# Runtype CLI
+<p align="center" style="background:white;border-radius:4px;padding: 12px;margin-bottom:16px;">
+  <img
+    src="https://www.runtype.com/runtype-text-only.svg"
+    alt="Runtype: The Intelligent Product Company"
+    width="240"
+  />
+</p>
-Command-line interface for the Runtype AI platform.
+# The Runtype CLI
-> Official CLI tool published on npm
+This is our command-line interface for the platform, which includes _Marathon_, our harness for long-running tasks and deep workflow analysis.
 ## Installation
@@ -13,74 +19,83 @@ npm install -g @runtypelabs/cli
 Or run without installing:
 ```bash
-npx @runtypelabs/cli <command>
+npx @runtypelabs/cli@latest <command>
 ```
-## Quick Start
+## Start Here
-### 1. Create Account & Authenticate
+The easiest way to try Runtype from the CLI is to run a Marathon task. If you are not logged in yet, the CLI will guide you through login or signup first.
 ```bash
-# Create a new account
-runtype auth signup
+npx @runtypelabs/cli@latest marathon researcher \
+  -g "fetch the home page of hacker news, read the articles related to AI, and summarize them. Use your firecrawl tool liberally" \
+  --model qwen/qwen3.5-397b-a17b \
+  --tools firecrawl \
+  --fresh \
+  --max-sessions 2
+```
-# Or login to existing account
-runtype auth login
+Marathon can research on the web, edit code in your current repo, or run code in a sandbox. As it does this, you get full insight into what is happening during each run. You can have it process very long tasks with 1000+ tool calls and get an output of the session on your local machine or (optionally) store it within Runtype.
-# Check authentication status
-runtype auth whoami
-```
+Keep in mind it's a harness meant to aid understanding and analysis of AI workflows: how LLMs and tools interact with the system and user prompts to solve problems over many runs. It's a research harness, built for people building on top of AI.
-### 2. Manage Flows and Records
+It's not aiming to replace your favorite AI coding assistant, even though it'll gladly show you its work as it makes a valiant effort!
 ```bash
-# List flows
-runtype flows list
+# Edit code in the current repo
+runtype marathon "Code Editor" --goal "Refactor the theme editor to use modern UX best practices"
-# Run a flow
-runtype flows run <flow-id> --record <record-id>
+# Build something and deploy it publicly
+runtype marathon calculator --goal "Build a calculator in 3d and deploy it publicly" --model claude-sonnet-4-6
+```
-# Create a record
-runtype records create --name "My Record" --type "document"
+## Quick Start
-# List records
-runtype records list --type document
-```
+### 1. Get Authenticated
-### 3. Run Multi-Session Agent Tasks
+```bash
+# Browser login
+runtype login
+# API key login for CI or non-interactive use
+runtype login --api-key <key>
+```
-Use `runtype marathon` (or `runtype agents task`) to run long-running, multi-session agent tasks with real-time streaming output.
+### 2. Work With Agents, Flows, and Records
 ```bash
-# Run a task — agent output streams to your terminal in real time
-runtype marathon "Code Builder" --goal "Refactor the auth module to use JWT tokens"
+# List available agents
+runtype agents list
-# Set a budget and session limit, save progress with a custom name
-runtype marathon agent_abc123 --goal "Write integration tests for the API" \
-  --max-sessions 10 --max-cost 2.50 --name "api-tests"
+# Run a saved dashboard agent by ID, which uses your configured custom and cloud tools in a marathon
+runtype marathon agent_abc123 --goal "Refactor the theme editor to use modern UX best practices"
-# Override the agent's configured model
-runtype marathon "Code Builder" --goal "Build a calculator" --model claude-sonnet-4-5
+# Run an agent directly
+runtype dispatch --agent <agent-id> --message "Summarize this document"
-# Enable built-in tools for the task (web search, scraping, image generation, etc.)
-runtype marathon "Code Builder" --goal "Search the web and summarize" --tools exa
-runtype marathon "Code Builder" -g "Find docs and scrape them" --tools exa firecrawl
+# Create a flow
+runtype flows create --name "My Flow" --description "Description"
-# Combine tools with other options
-runtype marathon "Code Builder" --goal "Research and generate images" \
-  --tools exa dalle --max-sessions 5
+# Run a flow directly with variables
+runtype dispatch --flow <flow-id> --variable customerName=Alyss --variable priority=high --message "Hello, this is Claudia. Nice to meet you."
-# Enable sandboxed code execution tooling for the task (QuickJS or Daytona)
-runtype marathon "Code Builder" --goal "Compute totals from this dataset" --sandbox quickjs
-runtype marathon "Code Builder" -g "Generate a script and run it" --sandbox daytona
+# Create a record
+runtype records create --name "My Record" --type "document"
+```
+### 3. Run Multi-Session Agent Tasks
+Use `runtype marathon` to run long-running tasks with real-time streaming output. If you want to try it without installing first, use `npx @runtypelabs/cli@latest marathon ...`. Swap the agent, model, tools, and execution environment depending on whether you want to research, edit code, or build and ship something end-to-end.
-# Resume an interrupted task (picks up where it left off)
-runtype marathon "Code Builder" --goal "Refactor the auth module to use JWT tokens" \
-  --resume --name "auth-refactor" --debug
+```bash
+# Session-limited research task that writes results into the folder you run it in
+runtype marathon researcher -g "fetch the home page of hacker news, read the articles related to AI, and summarize them. Use your firecrawl tool liberally" --model qwen/qwen3.5-397b-a17b --tools firecrawl --fresh --max-sessions 2
+# Edit code in the current repo
+runtype marathon "Code Editor" --goal "Refactor the theme editor to use modern UX best practices"
-# Sync progress to a Runtype record (visible in the dashboard)
-runtype marathon "Code Builder" --goal "Build the payments integration" \
-  --max-sessions 20 --track --name "payments"
+# Build something and deploy it publicly
+runtype marathon calculator --goal "Build a calculator in 3d and deploy it publicly" --model claude-sonnet-4-6
 ```
 #### Customizing the Runner Animation
@@ -132,6 +147,53 @@ runtype marathon "Code Builder" --goal "Fix the bug" \
 runtype marathon "Code Builder" --goal "Fix the bug" --tool-context full-inline
 ```
+#### Automatic History Compaction
+Marathon now manages continuation context against the active model's usable input budget instead of raw message history alone. It accounts for conversation history, tool results, tool definitions, and reserved output headroom before deciding whether to compact.
+- This is enabled by default.
+- The threshold is model-aware and follows the model currently selected for the run.
+- Provider-native compaction is preferred when the active provider supports it. Today that means Anthropic-backed marathon runs compact at 90% of the effective input budget.
+- Other models fall back to structured summary compaction at 80% of the effective input budget.
+- `--compact` still forces compact-summary mode for resume/restart scenarios even if the threshold has not been reached.
+- Use `--compact-strategy auto|provider_native|summary_fallback` to override the default selection.
+- Use `--compact-instructions "..."` to tell summary compaction what state must be preserved.
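As a rough illustration of the defaults listed above, the sketch below maps a provider to the strategy and threshold that `auto` would pick, then converts that percentage into a token count. It is not the CLI's implementation: the function names, the provider labels (`anthropic`, `openai`), and the 150000-token budget are assumptions made for the example.

```bash
# Hypothetical helper mirroring the documented "auto" defaults.
# $1 = provider label (assumed naming, e.g. "anthropic")
auto_compaction_default() {
  case "$1" in
    anthropic) echo "provider_native 90" ;;   # native compaction at 90%
    *)         echo "summary_fallback 80" ;;  # summary compaction at 80%
  esac
}

# Convert the default percentage into a token count.
# $1 = provider label, $2 = effective input budget in tokens (assumed known)
compaction_trigger_tokens() {
  provider="$1"; budget="$2"
  # Re-use the positional parameters: $1 = strategy, $2 = percent
  set -- $(auto_compaction_default "$provider")
  echo $(( budget * $2 / 100 ))
}

auto_compaction_default anthropic          # prints: provider_native 90
compaction_trigger_tokens anthropic 150000 # prints: 135000
compaction_trigger_tokens openai 150000    # prints: 120000
```

How the CLI derives the effective input budget from history, tool results, tool definitions, and output headroom is not shown in this diff, so the sketch takes the budget as an input.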
+```bash
+# Default behavior: provider-aware auto compaction
+runtype marathon "Code Builder" --goal "Refactor the auth module"
+# Force summary fallback even on providers with native compaction
+runtype marathon "Code Builder" --goal "Refactor the auth module" \
+  --compact-strategy summary_fallback
+# Preserve specific context in compact summaries
+runtype marathon "Code Builder" --goal "Refactor the auth module" \
+  --compact-instructions "Preserve changed files, test results, and unresolved blockers."
+# Raise the threshold to 90% of the effective input budget
+runtype marathon "Code Builder" --goal "Refactor the auth module" \
+  --compact-threshold 90%
+# Use an absolute token threshold instead
+runtype marathon "Code Builder" --goal "Refactor the auth module" \
+  --compact-threshold 120000
+# Disable automatic history compaction
+runtype marathon "Code Builder" --goal "Refactor the auth module" \
+  --no-auto-compact
+```
+Percentages must include `%` (e.g. `80%`). Bare numbers are treated as absolute token counts (e.g. `120000`).
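The rule above (a trailing `%` means a percentage, a bare number means tokens) can be sketched as a small classifier. This is only an illustration of the documented rule, not the CLI's parser: `classify_threshold`, `resolve_threshold`, and the 200000-token budget are hypothetical.

```bash
# Hypothetical helper: classify a --compact-threshold value.
classify_threshold() {
  case "$1" in
    *%)
      pct="${1%\%}"                      # strip the trailing percent sign
      case "$pct" in
        ''|*[!0-9]*) echo "invalid"; return 1 ;;
      esac
      echo "percent $pct"
      ;;
    ''|*[!0-9]*)
      echo "invalid"; return 1           # empty or non-numeric input
      ;;
    *)
      echo "tokens $1"                   # bare number: absolute token count
      ;;
  esac
}

# Resolve a threshold to tokens, given an assumed effective input budget.
resolve_threshold() {
  value="$1"; budget="$2"
  kind=$(classify_threshold "$value") || { echo "invalid"; return 1; }
  case "$kind" in
    percent*) echo $(( budget * ${kind#percent } / 100 )) ;;
    tokens*)  echo "${kind#tokens }" ;;
  esac
}

resolve_threshold "90%" 200000     # prints: 180000
resolve_threshold "120000" 200000  # prints: 120000
```

Whether the real CLI rejects values such as `150%` or a threshold larger than the budget is not stated in the README, so the sketch does not check for them.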
+#### Tool Output Guardrails
+Marathon also guards against oversized local tool results so they do not silently consume the whole context window.
+- Outputs above the soft warning threshold are surfaced in the TUI as context notices.
+- Outputs above the hard threshold are offloaded to disk and replaced with a compact reference the model can reopen with `read_file`.
+- The default hard threshold is `100000` characters. Set `--offload-threshold <chars>` to tune it, or `--offload-threshold off` to disable it.
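The hard-threshold behavior described above could look roughly like the following. This is a minimal sketch under stated assumptions: `guard_tool_output`, the offload directory, and the wording of the reference are invented for the example, and only the `100000`-character default and the `off` value come from the README.

```bash
# Hypothetical sketch of offloading oversized tool output to disk.
OFFLOAD_THRESHOLD=100000                              # README default, in characters
OFFLOAD_DIR="${TMPDIR:-/tmp}/marathon-offload-demo"   # assumed path, not the CLI's

guard_tool_output() {
  output="$1"
  # Pass the output through when offloading is off or the output is small enough.
  if [ "$OFFLOAD_THRESHOLD" = "off" ] || [ "${#output}" -le "$OFFLOAD_THRESHOLD" ]; then
    printf '%s' "$output"
    return 0
  fi
  # Otherwise write it to disk and hand back a compact reference.
  mkdir -p "$OFFLOAD_DIR"
  file="$OFFLOAD_DIR/tool-output-$$.txt"
  printf '%s' "$output" > "$file"
  printf '[output of %s characters offloaded to %s; reopen with read_file]' "${#output}" "$file"
}

# Demo with a tiny threshold so the effect is visible.
OFFLOAD_THRESHOLD=10
guard_tool_output "short"; echo
guard_tool_output "a much longer tool result"; echo
```

The first call prints the text unchanged; the second prints a one-line reference to the file on disk instead of the full output.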
 #### Built-in Tools
 The `--tools` (or `-t`) flag enables built-in platform tools during agent execution. Tools are validated at startup against the built-in tools registry, and compatibility is checked against all models used in the task (including planning and execution phase models).
@@ -168,12 +230,207 @@ runtype marathon "Code Builder" -g "Scrape docs" -t firecrawl
 - Tools incompatible with the selected model(s) are rejected with a specific error
 - All validation errors are reported together so you can fix them in one pass
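Reporting all validation errors in one pass, as described above, amounts to collecting failures instead of stopping at the first one. The sketch below shows that pattern only. The registry contents are an assumption built from tool names that appear in examples in this diff (`exa`, `firecrawl`, `dalle`), and `validate_tools` is a hypothetical name. Model compatibility checks are left out.

```bash
# Assumed registry for illustration; the real list lives in the CLI.
KNOWN_TOOLS="exa firecrawl dalle"

# Check every requested tool and report all unknown names together.
validate_tools() {
  errors=""
  for tool in "$@"; do
    case " $KNOWN_TOOLS " in
      *" $tool "*) ;;                                  # known tool, nothing to do
      *) errors="${errors}unknown tool: ${tool}
" ;;                                                   # keep going, remember the error
    esac
  done
  if [ -n "$errors" ]; then
    printf '%s' "$errors"
    return 1
  fi
}

validate_tools exa firecrawl && echo "all tools valid"
validate_tools exa foo bar || echo "validation failed"
```

The second call lists both `foo` and `bar` before failing, so both can be fixed in a single edit.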
+## Command Reference
+### `runtype init`
+Guided onboarding wizard for first-time setup. Walks through authentication and product creation.
+```bash
+runtype init
+runtype init --api-key <key> --name "My Product"
+```
+### `runtype login`
+Top-level alias for `runtype auth login`.
+### `runtype auth`
+Manage authentication.
+| Subcommand    | Description                                    |
+| ------------- | ---------------------------------------------- |
+| `auth signup` | Create a new account (alias for `auth login`)  |
+| `auth login`  | Login via browser or `--api-key <key>`         |
+| `auth status` | Show authentication status                     |
+| `auth whoami` | Display current user info with billing details |
+| `auth logout` | Remove stored credentials                      |
+### `runtype flows`
+Manage flows.
+| Subcommand               | Description                 |
+| ------------------------ | --------------------------- |
+| `flows list`             | List all flows              |
+| `flows get <id>`         | Get flow details            |
+| `flows create -n <name>` | Create a new flow           |
+| `flows run <id>`         | Execute a flow via dispatch |
+| `flows delete <id>`      | Delete a flow               |
+### `runtype records`
+Manage records.
+| Subcommand                           | Description                                                      |
+| ------------------------------------ | ---------------------------------------------------------------- |
+| `records list`                       | List all records (`--type`, `--limit`)                           |
+| `records get <id>`                   | Get record details                                               |
+| `records create -n <name> -t <type>` | Create a new record (`--metadata <json>`)                        |
+| `records delete <id>`                | Delete a record                                                  |
+| `records export`                     | Export records to file (`--format json\|csv`, `--output <file>`) |
+### `runtype agents`
+Manage agents.
+| Subcommand                         | Description                                     |
+| ---------------------------------- | ----------------------------------------------- |
+| `agents list`                      | List all agents                                 |
+| `agents get <id>`                  | Get agent details                               |
+| `agents create -n <name>`          | Create a new agent                              |
+| `agents execute <id> -m <message>` | Execute an agent with a message                 |
+| `agents task <agent> -g <goal>`    | Run a multi-session task (see Marathon section) |
+| `agents delete <id>`               | Delete an agent                                 |
+### `runtype dispatch`
+Execute a flow or agent via the dispatch API.
+```bash
+runtype dispatch --flow <id> --message "Hello"
+runtype dispatch --agent <id> --message "Hello"
+runtype dispatch --flow <id> --record <record-id>
+runtype dispatch --flow <id> --record-json data.json
+runtype dispatch --flow <id> --variable key=value --variable key2=value2
+runtype dispatch --flow <id> --no-stream --json
+```
+### `runtype prompts`
+Manage prompts.
+| Subcommand          | Description                                          |
+| ------------------- | ---------------------------------------------------- |
+| `prompts list`      | List all prompts                                     |
+| `prompts get <id>`  | Get prompt details                                   |
+| `prompts test <id>` | Test a prompt (`-i <input>`, `--stream/--no-stream`) |
+### `runtype models`
+Manage model configurations.
+| Subcommand                | Description                                   |
+| ------------------------- | --------------------------------------------- |
+| `models list`             | List your enabled model configurations        |
+| `models available`        | List all available models grouped by provider |
+| `models enable <modelId>` | Enable a model                                |
+| `models disable <id>`     | Disable a model configuration                 |
+| `models default <id>`     | Set a model configuration as default          |
+| `models usage`            | Show model usage statistics                   |
+### `runtype batch`
+Manage batch operations.
+| Subcommand                                   | Description                        |
+| -------------------------------------------- | ---------------------------------- |
+| `batch submit -f <flowId> -r <records.json>` | Submit a batch job                 |
+| `batch status <id>`                          | Check batch job status (`--watch`) |
+| `batch cancel <id>`                          | Cancel a batch job                 |
+### `runtype eval`
+Manage evaluations.
+| Subcommand                                  | Description                                  |
+| ------------------------------------------- | -------------------------------------------- |
+| `eval submit -f <flowId> -r <records.json>` | Submit an eval batch (`-n <name>`)           |
+| `eval list`                                 | List eval batches (`--flow <id>`, `--limit`) |
+| `eval results <id>`                         | Get eval batch results                       |
+| `eval compare <groupId>`                    | Compare evals in a group                     |
+### `runtype schedules`
+Manage scheduled flow runs.
+| Subcommand                               | Description                     |
+| ---------------------------------------- | ------------------------------- |
+| `schedules list`                         | List all schedules              |
+| `schedules get <id>`                     | Get schedule details            |
+| `schedules create -f <flowId> -c <cron>` | Create a schedule (`-n <name>`) |
+| `schedules pause <id>`                   | Pause a schedule                |
+| `schedules resume <id>`                  | Resume a paused schedule        |
+| `schedules run-now <id>`                 | Trigger immediate execution     |
+| `schedules delete <id>`                  | Delete a schedule               |
+### `runtype api-keys`
+Manage API keys.
+| Subcommand                  | Description                                      |
+| --------------------------- | ------------------------------------------------ |
+| `api-keys list`             | List your API keys                               |
+| `api-keys get <id>`         | Get API key details                              |
+| `api-keys create -n <name>` | Create a new API key                             |
+| `api-keys delete <id>`      | Delete an API key (`--yes` to skip confirmation) |
+| `api-keys regenerate <id>`  | Regenerate an API key                            |
+| `api-keys analytics`        | Show usage analytics (`--key <id>`)              |
+### `runtype products`
+Manage products.
+| Subcommand                   | Description                                                   |
+| ---------------------------- | ------------------------------------------------------------- |
+| `products init --from <url>` | Import a product from an authenticated external A2A agent URL |
+```bash
+# After logging in, import an external A2A agent
+runtype products init --from https://example.com/.well-known/agent-card.json
+runtype products init --from <url> --name "Custom Name"
+```
+### `runtype flow-versions`
+Manage flow versions.
+| Subcommand                                      | Description                  |
+| ----------------------------------------------- | ---------------------------- |
+| `flow-versions list <flowId>`                   | List all versions for a flow |
+| `flow-versions get <flowId> <versionId>`        | Get a specific version       |
+| `flow-versions published <flowId>`              | Get the published version    |
+| `flow-versions publish <flowId> -v <versionId>` | Publish a version            |
+### `runtype billing`
+View billing and subscription info.
+| Subcommand        | Description                             |
+| ----------------- | --------------------------------------- |
+| `billing status`  | Show current plan and usage             |
+| `billing portal`  | Open the billing portal in your browser |
+| `billing refresh` | Refresh plan data from billing provider |
+### `runtype analytics`
+View analytics and execution results.
+| Subcommand          | Description                                                          |
+| ------------------- | -------------------------------------------------------------------- |
+| `analytics stats`   | Show account statistics                                              |
+| `analytics results` | List execution results (`--flow`, `--record`, `--status`, `--limit`) |
 ## Configuration
 ```bash
 # View all configuration
 runtype config get
+# Get a specific key
+runtype config get apiUrl
 # Set API URL
 runtype config set apiUrl https://api.runtype.com
@@ -182,8 +439,24 @@ runtype config set defaultModel gpt-4o
 # Reset configuration
 runtype config reset
+# Show configuration file path
+runtype config path
 ```
+Valid config keys: `apiUrl`, `defaultModel`, `defaultTemperature`, `outputFormat`, `streamResponses`
+### Global Options
+All commands support these flags:
+| Flag                 | Description              |
+| -------------------- | ------------------------ |
+| `--json`             | Output in JSON format    |
+| `--tty` / `--no-tty` | Force TTY / non-TTY mode |
+| `-v, --verbose`      | Enable verbose output    |
+| `--api-url <url>`    | Override API URL         |
 ## Development
 ### Local Development Setup