keystone-cli 1.2.0 → 2.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (52)
  1. package/README.md +163 -138
  2. package/package.json +6 -3
  3. package/src/cli.ts +54 -369
  4. package/src/commands/init.ts +19 -27
  5. package/src/db/dynamic-state-manager.test.ts +319 -0
  6. package/src/db/dynamic-state-manager.ts +411 -0
  7. package/src/db/memory-db.test.ts +45 -0
  8. package/src/db/memory-db.ts +47 -21
  9. package/src/db/sqlite-setup.ts +26 -3
  10. package/src/db/workflow-db.ts +76 -5
  11. package/src/parser/config-schema.ts +11 -13
  12. package/src/parser/schema.ts +37 -2
  13. package/src/parser/workflow-parser.test.ts +3 -4
  14. package/src/parser/workflow-parser.ts +3 -62
  15. package/src/runner/__test__/llm-mock-setup.ts +173 -0
  16. package/src/runner/__test__/llm-test-setup.ts +271 -0
  17. package/src/runner/engine-executor.test.ts +25 -18
  18. package/src/runner/executors/blueprint-executor.ts +0 -1
  19. package/src/runner/executors/dynamic-executor.test.ts +613 -0
  20. package/src/runner/executors/dynamic-executor.ts +723 -0
  21. package/src/runner/executors/dynamic-types.ts +69 -0
  22. package/src/runner/executors/engine-executor.ts +5 -1
  23. package/src/runner/executors/llm-executor.ts +502 -1033
  24. package/src/runner/executors/memory-executor.ts +35 -19
  25. package/src/runner/executors/plan-executor.ts +0 -1
  26. package/src/runner/executors/types.ts +4 -4
  27. package/src/runner/llm-adapter.integration.test.ts +151 -0
  28. package/src/runner/llm-adapter.ts +263 -1401
  29. package/src/runner/llm-clarification.test.ts +91 -106
  30. package/src/runner/llm-executor.test.ts +217 -1181
  31. package/src/runner/memoization.test.ts +0 -1
  32. package/src/runner/recovery-security.test.ts +51 -20
  33. package/src/runner/reflexion.test.ts +55 -18
  34. package/src/runner/standard-tools-integration.test.ts +137 -87
  35. package/src/runner/step-executor.test.ts +36 -80
  36. package/src/runner/step-executor.ts +20 -2
  37. package/src/runner/test-harness.ts +3 -29
  38. package/src/runner/tool-integration.test.ts +122 -73
  39. package/src/runner/workflow-runner.ts +92 -35
  40. package/src/runner/workflow-scheduler.ts +11 -1
  41. package/src/runner/workflow-summary.ts +144 -0
  42. package/src/templates/dynamic-demo.yaml +31 -0
  43. package/src/templates/scaffolding/decompose-problem.yaml +1 -1
  44. package/src/templates/scaffolding/dynamic-decompose.yaml +39 -0
  45. package/src/utils/auth-manager.test.ts +10 -520
  46. package/src/utils/auth-manager.ts +3 -756
  47. package/src/utils/config-loader.ts +12 -0
  48. package/src/utils/constants.ts +0 -17
  49. package/src/utils/process-sandbox.ts +15 -3
  50. package/src/utils/topo-sort.ts +47 -0
  51. package/src/runner/llm-adapter-runtime.test.ts +0 -209
  52. package/src/runner/llm-adapter.test.ts +0 -1012
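The file list above shows a new `src/utils/topo-sort.ts` (+47 lines), presumably backing the DAG dependency ordering the README describes. As orientation only, a typical Kahn's-algorithm topological sort over `needs`-declared steps might look like the sketch below; the type and function names are hypothetical, not taken from the package:

```typescript
// Hypothetical sketch of a Kahn's-algorithm topological sort for workflow steps.
// Names and shape are assumed; the package's actual topo-sort.ts may differ.

type Step = { id: string; needs?: string[] };

// Returns step ids in dependency order, or throws if the graph has a cycle.
function topoSort(steps: Step[]): string[] {
  const indegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();

  for (const step of steps) {
    indegree.set(step.id, step.needs?.length ?? 0);
    for (const dep of step.needs ?? []) {
      dependents.set(dep, [...(dependents.get(dep) ?? []), step.id]);
    }
  }

  // Seed the queue with steps that have no unmet dependencies.
  const queue = steps.filter((s) => indegree.get(s.id) === 0).map((s) => s.id);
  const order: string[] = [];

  while (queue.length > 0) {
    const id = queue.shift()!;
    order.push(id);
    for (const next of dependents.get(id) ?? []) {
      const remaining = (indegree.get(next) ?? 0) - 1;
      indegree.set(next, remaining);
      if (remaining === 0) queue.push(next);
    }
  }

  // If some steps never reached indegree 0, the graph has a cycle.
  if (order.length !== steps.length) {
    throw new Error("Cycle detected in step dependencies");
  }
  return order;
}

const order = topoSort([
  { id: "deploy", needs: ["build", "test"] },
  { id: "build" },
  { id: "test", needs: ["build"] },
]);
console.log(order); // → ["build", "test", "deploy"]
```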
package/README.md CHANGED
@@ -34,11 +34,12 @@ Keystone allows you to define complex automation workflows using a simple YAML s
 
  ---
 
- ## Features
+ ## <a id="features">✨ Features</a>
 
  - ⚡ **Local-First:** Built on Bun with a local SQLite database for state management.
  - 🧩 **Declarative:** Define workflows in YAML with automatic dependency tracking (DAG).
  - 🤖 **Agentic:** First-class support for LLM agents defined in Markdown with YAML frontmatter.
+ - 🎯 **Dynamic Workflows:** LLM-driven orchestration where a supervisor generates and executes steps at runtime.
  - 🧑‍💻 **Human-in-the-Loop:** Support for manual approval and text input steps.
  - 🔄 **Resilient:** Built-in retries, timeouts, and state persistence. Resume failed or paused runs exactly where they left off.
  - 📊 **TUI Dashboard:** Built-in interactive dashboard for monitoring and managing runs.
@@ -51,7 +52,7 @@ Keystone allows you to define complex automation workflows using a simple YAML s
 
  ---
 
- ## 🚀 Installation
+ ## <a id="installation">🚀 Installation</a>
 
  Ensure you have [Bun](https://bun.sh) installed.
 
@@ -89,7 +90,7 @@ source <(keystone completion bash)
 
  ---
 
- ## 🚦 Quick Start
+ ## <a id="quick-start">🚦 Quick Start</a>
 
  ### 1. Initialize a Project
  ```bash
@@ -97,52 +98,64 @@ keystone init
  ```
 
  This creates the `.keystone/` directory for configuration and seeds `.keystone/workflows/` plus `.keystone/workflows/agents/` with bundled workflows and agents (see "Bundled Workflows" below).
 
- ### 2. Configure your Environment
- Add your API keys to the generated `.env` file:
+ ### 2. Install AI SDK Providers
+ Keystone uses the **Vercel AI SDK**. Install the provider packages you need:
+ ```bash
+ npm install @ai-sdk/openai @ai-sdk/anthropic
+ # Or use other AI SDK providers like @ai-sdk/google, @ai-sdk/mistral, etc.
+ ```
+
+ ### 3. Configure Providers
+ Edit `.keystone/config.yaml` to configure your providers:
+ ```yaml
+ default_provider: openai
+
+ providers:
+   openai:
+     package: "@ai-sdk/openai"
+     api_key_env: OPENAI_API_KEY
+     default_model: gpt-4o
+
+   anthropic:
+     package: "@ai-sdk/anthropic"
+     api_key_env: ANTHROPIC_API_KEY
+     default_model: claude-3-5-sonnet-20240620
+
+ model_mappings:
+   "gpt-*": openai
+   "claude-*": anthropic
+ ```
+
+ Then add your API keys to `.env`:
  ```env
  OPENAI_API_KEY=sk-...
  ANTHROPIC_API_KEY=sk-ant-...
  ```
- Alternatively, you can use the built-in authentication management:
- ```bash
- keystone auth login openai
- keystone auth login anthropic
- keystone auth login anthropic-claude
- keystone auth login openai-chatgpt
- keystone auth login gemini
- keystone auth login github
- ```
- Use `anthropic-claude` for Claude Pro/Max subscriptions (OAuth) instead of an API key.
- Use `openai-chatgpt` for ChatGPT Plus/Pro subscriptions (OAuth) instead of an API key.
- Use `gemini` (alias `google-gemini`) for Google Gemini subscriptions (OAuth) instead of an API key.
- Use `github` to authenticate GitHub Copilot via the GitHub device flow.
 
- ### 3. Run a Workflow
+ See the [Configuration](#configuration) section for more details on BYOP (Bring Your Own Provider).
+
+ ### 4. Run a Workflow
  ```bash
  keystone run scaffold-feature
  ```
  Keystone automatically looks in `.keystone/workflows/` (locally and in your home directory) for `.yaml` or `.yml` files.
 
- ### 4. Monitor with the Dashboard
+ ### 5. Monitor with the Dashboard
  ```bash
  keystone ui
  ```
 
  ---
 
- ## 🧰 Bundled Workflows
+ ## <a id="bundled-workflows">🧰 Bundled Workflows</a>
 
  `keystone init` seeds these workflows under `.keystone/workflows/` (and the agents they rely on under `.keystone/workflows/agents/`):
 
- Top-level workflows (seeded in `.keystone/workflows/`):
+ Top-level utility workflows (seeded in `.keystone/workflows/`):
  - `scaffold-feature.yaml`: Interactive workflow scaffolder. Prompts for requirements, plans files, generates content, and writes them.
  - `decompose-problem.yaml`: Decomposes a problem into research/implementation/review tasks, waits for approval, runs sub-workflows, and summarizes.
  - `dev.yaml`: Self-bootstrapping DevMode workflow for an interactive plan/implement/verify loop.
- - `agent-handoff.yaml`: Demonstrates agent handoffs and tool-driven context updates.
- - `full-feature-demo.yaml`: A comprehensive workflow demonstrating multiple step types (shell, file, request, etc.).
- - `script-example.yaml`: Demonstrates sandboxed JavaScript execution.
- - `artifact-example.yaml`: Demonstrates artifact upload and download between steps.
- - `idempotency-example.yaml`: Demonstrates safe retries for side-effecting steps.
+ - `dynamic-decompose.yaml`: Dynamic version of decompose-problem using LLM-driven orchestration.
 
  Sub-workflows (seeded in `.keystone/workflows/`):
  - `scaffold-plan.yaml`: Generates a file plan from `requirements` input.
@@ -156,14 +169,14 @@ Example runs:
  ```bash
  keystone run scaffold-feature
  keystone run decompose-problem -i problem="Add caching to the API" -i context="Node/Bun service"
- keystone run agent-handoff -i topic="billing" -i user="Ada"
+ keystone run dev "Improve the user profile UI"
  ```
 
  Sub-workflows are used by the top-level workflows, but can be run directly if you want just one phase.
 
  ---
 
- ## ⚙️ Configuration
+ ## <a id="configuration">⚙️ Configuration</a>
 
  Keystone loads configuration from project `.keystone/config.yaml` (and user-level config; see `keystone config show` for search order) to manage model providers and model mappings.
 
@@ -178,42 +191,27 @@ State is stored at `.keystone/state.db` by default (project-local).
  default_provider: openai
 
  providers:
+   # Example: Using a standard AI SDK provider package (Bring Your Own Provider)
    openai:
-     type: openai
+     package: "@ai-sdk/openai"
      base_url: https://api.openai.com/v1
      api_key_env: OPENAI_API_KEY
      default_model: gpt-4o
-   openai-chatgpt:
-     type: openai-chatgpt
-     base_url: https://api.openai.com/v1
-     default_model: gpt-5-codex
+
+   # Example: Using another provider
    anthropic:
-     type: anthropic
-     base_url: https://api.anthropic.com/v1
+     package: "@ai-sdk/anthropic"
      api_key_env: ANTHROPIC_API_KEY
      default_model: claude-3-5-sonnet-20240620
-   anthropic-claude:
-     type: anthropic-claude
-     base_url: https://api.anthropic.com/v1
-     default_model: claude-3-5-sonnet-20240620
-   google-gemini:
-     type: google-gemini
-     base_url: https://cloudcode-pa.googleapis.com
-     default_model: gemini-1.5-pro
-   groq:
-     type: openai
-     base_url: https://api.groq.com/openai/v1
-     api_key_env: GROQ_API_KEY
-     default_model: llama-3.3-70b-versatile
+
+   # Example: Using a custom provider script
+   # my-custom-provider:
+   #   script: "./providers/my-provider.ts"
+   #   default_model: my-special-model
 
  model_mappings:
-   "gpt-5*": openai-chatgpt
    "gpt-*": openai
-   "claude-4*": anthropic-claude
    "claude-*": anthropic
-   "gemini-*": google-gemini
-   "o1-*": openai
-   "llama-*": groq
 
  mcp_servers:
    filesystem:
@@ -241,37 +239,36 @@ expression:
    strict: false
  ```
 
- `storage.retention_days` sets the default window used by `keystone maintenance` / `keystone prune`. `storage.redact_secrets_at_rest` controls whether secret inputs and known secrets are redacted before storing run data (default `true`).
+ ### Storage Configuration
 
- ### Context Injection (Opt-in)
+ The `storage` section controls data retention and security for workflow runs:
 
- Keystone can automatically inject project context files (`README.md`, `AGENTS.md`, `.cursor/rules`, `.claude/rules`) into LLM system prompts. This helps agents understand your project's conventions and guidelines.
+ - **`retention_days`**: Sets the default window used by `keystone maintenance` / `keystone prune` commands to clean up old run data.
+ - **`redact_secrets_at_rest`**: Controls whether secret inputs and known secrets are redacted before storing run data (default `true`).
 
- ```yaml
- features:
-   context_injection:
-     enabled: true # Opt-in feature (default: false)
-     search_depth: 3 # How many directories up to search (default: 3)
-     sources: # Which context sources to include
-       - readme # README.md files
-       - agents_md # AGENTS.md files
-       - cursor_rules # .cursor/rules or .claude/rules
- ```
+ ### Bring Your Own Provider (BYOP)
 
- When enabled, Keystone will:
- 1. Search from the workflow directory up to the project root
- 2. Find the nearest `README.md` and `AGENTS.md` files
- 3. Parse rules from `.cursor/rules` or `.claude/rules` directories
- 4. Prepend this context to the LLM system prompt
+ Keystone uses the **Vercel AI SDK**, allowing you to use any compatible provider. You must install the provider package (e.g., `@ai-sdk/openai`, `ai-sdk-provider-gemini-cli`) so Keystone can resolve it.
 
- Context is cached for 1 minute to avoid redundant file reads.
+ Keystone searches for provider packages in:
+ 1. **Local `node_modules`**: The project where you run `keystone`.
+ 2. **Global `node_modules`**: Your system-wide npm/bun/yarn directory.
+
+ To install a provider globally:
+ ```bash
+ bun install -g ai-sdk-provider-gemini-cli
+ # or
+ npm install -g @ai-sdk/openai
+ ```
+
+ Then configure it in `.keystone/config.yaml` using the `package` field.
 
  ### Model & Provider Resolution
 
  Keystone resolves which provider to use for a model in the following order:
 
  1. **Explicit Provider:** Use the `provider` field in an agent or step definition.
- 2. **Provider Prefix:** Use the `provider:model` syntax (e.g., `model: copilot:gpt-4o`).
+ 2. **Provider Prefix:** Use the `provider:model` syntax (e.g., `model: anthropic:claude-3-5-sonnet-latest`).
  3. **Model Mappings:** Matches the model name against the `model_mappings` in your config (supports suffix `*` for prefix matching).
  4. **Default Provider:** Falls back to the `default_provider` defined in your config.
 
@@ -290,75 +287,55 @@ model: claude-3-5-sonnet-latest
  - id: notify
    type: llm
    agent: summarizer
-   model: copilot:gpt-4o
+   model: anthropic:claude-3-5-sonnet-latest
    prompt: ...
  ```
 
  ### OpenAI Compatible Providers
- You can add any OpenAI-compatible provider (Together AI, Perplexity, Local Ollama, etc.) by setting the `type` to `openai` and providing the `base_url` and `api_key_env`.
-
- ### GitHub Copilot Support
-
- Keystone supports using your GitHub Copilot subscription directly. To authenticate (using the GitHub Device Flow):
-
- ```bash
- keystone auth login github
- ```
-
- Then, you can use Copilot in your configuration:
+ You can add any OpenAI-compatible provider (Together AI, Perplexity, Local Ollama, etc.) by using the `@ai-sdk/openai` package and providing the `base_url` and `api_key_env`.
 
  ```yaml
  providers:
-   copilot:
-     type: copilot
-     default_model: gpt-4o
+   ollama:
+     package: "@ai-sdk/openai"
+     base_url: http://localhost:11434/v1
+     api_key_env: OLLAMA_API_KEY # Can be any value for local Ollama
+     default_model: llama3.2
  ```
 
- Authentication tokens for Copilot are managed automatically after the initial login.
-
- ### OpenAI ChatGPT Plus/Pro (OAuth)
-
- Keystone supports using your ChatGPT Plus/Pro subscription (OAuth) instead of an API key:
-
- ```bash
- keystone auth login openai-chatgpt
- ```
-
- Then map models to the `openai-chatgpt` provider in your config.
-
- ### Anthropic Claude Pro/Max (OAuth)
-
- Keystone supports using your Claude Pro/Max subscription (OAuth) instead of an API key:
-
- ```bash
- keystone auth login anthropic-claude
- ```
+ ### API Key Management
 
- Then map models to the `anthropic-claude` provider in your config. This flow uses the Claude web auth code and refreshes tokens automatically.
+ For other providers, store API keys in a `.env` file in your project root:
+ - `OPENAI_API_KEY`
+ - `ANTHROPIC_API_KEY`
 
- ### Google Gemini (OAuth)
+ ### Context Injection (Opt-in)
 
- Keystone supports using your Google Gemini subscription (OAuth) instead of an API key:
+ Keystone can automatically inject project context files (`README.md`, `AGENTS.md`, `.cursor/rules`, `.claude/rules`) into LLM system prompts. This helps agents understand your project's conventions and guidelines.
 
- ```bash
- keystone auth login gemini
+ ```yaml
+ features:
+   context_injection:
+     enabled: true # Opt-in feature (default: false)
+     search_depth: 3 # How many directories up to search (default: 3)
+     sources: # Which context sources to include
+       - readme # README.md files
+       - agents_md # AGENTS.md files
+       - cursor_rules # .cursor/rules or .claude/rules
  ```
 
- Then map models to the `google-gemini` provider in your config.
-
- ### API Key Management
+ When enabled, Keystone will:
+ 1. Search from the workflow directory up to the project root
+ 2. Find the nearest `README.md` and `AGENTS.md` files
+ 3. Parse rules from `.cursor/rules` or `.claude/rules` directories
+ 4. Prepend this context to the LLM system prompt
 
- For other providers, you can either store API keys in a `.env` file in your project root:
- - `OPENAI_API_KEY`
- - `ANTHROPIC_API_KEY`
+ Context is cached for 1 minute to avoid redundant file reads.
 
- Or use the `keystone auth login` command to securely store them in your local machine's configuration:
- - `keystone auth login openai`
- - `keystone auth login anthropic`
 
  ---
 
- ## 📝 Workflow Example
+ ## <a id="workflow-example">📝 Workflow Example</a>
 
  Workflows are defined in YAML. Dependencies are automatically resolved based on the `needs` field, and **Keystone also automatically detects implicit dependencies** from your `${{ }}` expressions.
 
@@ -441,7 +418,7 @@ expression:
 
  ---
 
- ## 🏗️ Step Types
+ ## <a id="step-types">🏗️ Step Types</a>
 
  Keystone supports several specialized step types:
 
@@ -521,6 +498,54 @@ Keystone supports several specialized step types:
    - `env` and `cwd` are required and must be explicit.
    - `input` is sent to stdin (objects/arrays are JSON-encoded).
    - Summary is parsed from stdout or a file at `KEYSTONE_ENGINE_SUMMARY_PATH` and stored as an artifact.
+ - `git`: Execute git operations with automatic worktree management.
+   - Operations: `clone`, `checkout`, `pull`, `push`, `commit`, `worktree_add`, `worktree_remove`.
+   - `cleanup: true` automatically removes worktrees at workflow end.
+   ```yaml
+   - id: clone_repo
+     type: git
+     op: clone
+     url: https://github.com/example/repo.git
+     path: ./repo
+     branch: main
+     cleanup: true
+   ```
+ - `dynamic`: LLM-driven workflow orchestration where a supervisor agent generates steps at runtime.
+   - The supervisor LLM creates a plan of steps that are then executed dynamically.
+   - Supports resumability - state is persisted after each generated step.
+   - Generated steps can be: `llm`, `shell`, `workflow`, `file`, or `request`.
+   - `goal`: High-level goal for the supervisor to accomplish (required).
+   - `context`: Additional context for planning.
+   - `prompt`: Custom supervisor prompt (overrides default).
+   - `supervisor`: Agent for planning (defaults to `keystone-architect`).
+   - `agent`: Default agent for generated LLM steps.
+   - `templates`: Role-to-agent mapping for specialized tasks.
+   - `maxSteps`: Maximum number of steps to generate.
+   - `concurrency`: Maximum number of steps to run in parallel (default: `1`).
+   - `confirmPlan`: Review and approve/modify the plan before execution (default: `false`).
+   - `maxReplans`: Number of automatic recovery attempts if the plan fails (default: `3`).
+   - `allowStepFailure`: Continue execution even if individual generated steps fail.
+   - `library`: A list of pre-defined step patterns available to the supervisor.
+   ```yaml
+   - id: implement_feature
+     type: dynamic
+     goal: "Implement user authentication with JWT"
+     context: "This is a Node.js Express application"
+     agent: keystone-architect
+     templates:
+       planner: "keystone-architect"
+       developer: "software-engineer"
+     maxSteps: 10
+     allowStepFailure: false
+   ```
+
+ #### Dynamic Orchestration vs. Rigid Pipelines
+ Traditional workflows often require complex multi-file decomposition (e.g., `decompose-problem.yaml` calling separate research, implementation, and review workflows). The `dynamic` step type replaces these rigid patterns with **Agentic Orchestration**:
+ - **Simplified Structure**: A single `dynamic` step can replace multiple nested pipelines.
+ - **Adaptive Execution**: The agent adjusts its plan based on real-time feedback and results from previous steps.
+ - **Improved Resumability**: Each sub-step generated by the agent is persisted, allowing seamless resumption even inside long-running dynamic tasks.
+
+ Use **Deterministic Workflows** (standard steps) for predictable, repeatable processes. Use **Dynamic Orchestration** for open-ended tasks where the specific steps cannot be known in advance.
 
  ### Human Steps in Non-Interactive Mode
  If stdin is not a TTY (CI, piped input), `human` steps suspend. Resume by providing an answer via inputs using the step id and `__answer`:
@@ -726,7 +751,7 @@ Until `strategy.matrix` is wired end-to-end, use explicit `foreach` with an arra
 
  ---
 
- ## 🔧 Advanced Features
+ ## <a id="advanced-features">🔧 Advanced Features</a>
 
  ### Idempotency Keys
 
@@ -939,7 +964,7 @@ You can also define a workflow-level `compensate` step to handle overall cleanup
 
  ---
 
- ## 🤖 Agent Definitions
+ ## <a id="agent-definitions">🤖 Agent Definitions</a>
 
  Agents are defined in Markdown files with YAML frontmatter, making them easy to read and version control.
 
@@ -1123,7 +1148,7 @@ In these examples, the agent will have access to all tools provided by the MCP s
 
  ---
 
- ## 🛠️ CLI Commands
+ ## <a id="cli-commands">🛠️ CLI Commands</a>
 
  | Command | Description |
  | :--- | :--- |
@@ -1146,9 +1171,6 @@ In these examples, the agent will have access to all tools provided by the MCP s
  | `dev <task>` | Run the self-bootstrapping DevMode workflow |
  | `manifest` | Show embedded assets manifest |
  | `config show` | Show current configuration and discovery paths (alias: `list`) |
- | `auth status [provider]` | Show authentication status |
- | `auth login [provider]` | Login to an authentication provider (github, openai, anthropic, openai-chatgpt, anthropic-claude, gemini/google-gemini) |
- | `auth logout [provider]` | Logout and clear authentication tokens |
  | `ui` | Open the interactive TUI dashboard |
  | `mcp start` | Start the Keystone MCP server |
  | `mcp login <server>` | Login to a remote MCP server |
@@ -1187,19 +1209,22 @@ Input keys passed via `-i key=val` must be alphanumeric/underscore and cannot be
  ### Dry Run
  `keystone run --dry-run` prints shell commands without executing them and skips non-shell steps (including human prompts). Outputs from skipped steps are empty, so conditional branches may differ from a real run.
 
- ## 🛡️ Security
+ ## <a id="security">🛡️ Security</a>
 
  ### Shell Execution
  Keystone blocks shell commands that match common injection/destructive patterns (like `rm -rf /` or pipes to shells). To run them, set `allowInsecure: true` on the step. Prefer `${{ escape(...) }}` when interpolating user input.
 
- You can bypass this check if you trust the command:
- ```yaml
  - id: deploy
    type: shell
    run: ./deploy.sh ${{ inputs.env }}
    allowInsecure: true
  ```
 
+ #### Troubleshooting Security Errors
+ If you see a `Security Error: Evaluated command contains shell metacharacters`, it means your command contains characters like `\n`, `|`, or `&` that were not explicitly escaped or are not in the safe whitelist.
+ - **Fix 1**: Use `${{ escape(steps.id.output) }}` for any dynamic values.
+ - **Fix 2**: Set `allowInsecure: true` if the command naturally uses special characters (like `echo "line1\nline2"`).
+
  ### Expression Safety
  Expressions `${{ }}` are evaluated using a safe AST parser (`jsep`) which:
  - Prevents arbitrary code execution (no `eval` or `Function`).
@@ -1215,7 +1240,7 @@ Request steps enforce SSRF protections and require HTTPS by default. Cross-origi
 
  ---
 
- ## 🏗️ Architecture
+ ## <a id="architecture">🏗️ Architecture</a>
 
  ```mermaid
  graph TD
@@ -1251,12 +1276,12 @@ graph TD
  EX --> Join[Join Step]
  EX --> Blueprint[Blueprint Step]
 
- LLM --> Adapters[LLM Adapters]
- Adapters --> Providers[OpenAI, Anthropic, Gemini, Copilot, etc.]
+ LLM --> Adapter[LLM Adapter (AI SDK)]
+ Adapter --> Providers[OpenAI, Anthropic, Gemini, Copilot, etc.]
  LLM --> MCPClient[MCP Client]
  ```
 
- ## 📂 Project Structure
+ ## <a id="project-structure">📂 Project Structure</a>
 
  - `src/cli.ts`: CLI entry point.
  - `src/db/`: SQLite persistence layer.
@@ -1271,6 +1296,6 @@ graph TD
 
  ---
 
- ## 📄 License
+ ## <a id="license">📄 License</a>
 
  MIT
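The "Model & Provider Resolution" hunks in the README diff above document a four-step order: explicit `provider` field, `provider:model` prefix syntax, `model_mappings` with a trailing `*` for prefix matching, then `default_provider`. As an illustration only, that order could be implemented roughly like this; it is a hedged reimplementation of the documented behavior, not the package's actual code:

```typescript
// Illustrative sketch of the documented resolution order; the real logic lives
// in the package's config loader and may differ in detail.

type Config = {
  default_provider: string;
  model_mappings: Record<string, string>; // e.g. { "gpt-*": "openai" }
};

function resolveProvider(
  model: string,
  config: Config,
  explicit?: string,
): { provider: string; model: string } {
  // 1. Explicit `provider` field on the agent or step wins.
  if (explicit) return { provider: explicit, model };

  // 2. `provider:model` prefix syntax.
  const colon = model.indexOf(":");
  if (colon > 0) {
    return { provider: model.slice(0, colon), model: model.slice(colon + 1) };
  }

  // 3. model_mappings, where a trailing `*` means prefix match.
  for (const [pattern, provider] of Object.entries(config.model_mappings)) {
    const matches = pattern.endsWith("*")
      ? model.startsWith(pattern.slice(0, -1))
      : model === pattern;
    if (matches) return { provider, model };
  }

  // 4. Fall back to default_provider.
  return { provider: config.default_provider, model };
}

const config: Config = {
  default_provider: "openai",
  model_mappings: { "gpt-*": "openai", "claude-*": "anthropic" },
};
console.log(resolveProvider("claude-3-5-sonnet-latest", config).provider); // → "anthropic"
console.log(resolveProvider("anthropic:claude-3-5-sonnet-latest", config).model); // prefix stripped
```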
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "keystone-cli",
3
- "version": "1.2.0",
3
+ "version": "2.0.0",
4
4
  "description": "A local-first, declarative, agentic workflow orchestrator built on Bun",
5
5
  "type": "module",
6
6
  "bin": {
@@ -8,7 +8,9 @@
8
8
  },
9
9
  "scripts": {
10
10
  "dev": "bun run src/cli.ts",
11
- "test": "bun test",
11
+ "test": "bun test --timeout 60000",
12
+ "test:adapter": "SKIP_LLM_MOCK=1 bun test ./src/runner/llm-adapter.integration.test.ts --timeout 60000",
13
+ "test:unit": "bun test --timeout 60000 --filter '!llm-adapter.integration.test.ts'",
12
14
  "lint": "biome check .",
13
15
  "lint:fix": "biome check --write .",
14
16
  "format": "biome format --write .",
@@ -30,6 +32,7 @@
30
32
  "@jsep-plugin/object": "^1.2.2",
31
33
  "@types/react": "^19.0.0",
32
34
  "@xenova/transformers": "^2.17.2",
35
+ "ai": "^6.0.3",
33
36
  "ajv": "^8.12.0",
34
37
  "commander": "^12.1.0",
35
38
  "dagre": "^0.8.5",
@@ -41,7 +44,7 @@
41
44
  "jsep": "^1.4.0",
42
45
  "react": "^19.0.0",
43
46
  "sqlite-vec": "0.1.6",
44
- "zod": "^3.23.8",
47
+ "zod": "^3.25.76",
45
48
  "zod-to-json-schema": "^3.25.1"
46
49
  },
47
50
  "optionalDependencies": {
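The Security section in the README diff describes blocking shell commands that contain unescaped metacharacters (pipes, `&`, newlines) unless the step sets `allowInsecure: true`. A rough sketch of such a guard, with a deliberately simplified pattern list (the package's real checks are more extensive and its patterns are not reproduced here):

```typescript
// Hedged sketch of a shell-command guard like the one the README's Security
// section describes. The pattern list below is illustrative, not the package's.

const METACHARACTERS = /[|&;`$><\n]/; // simplified: real whitelist is richer
const DESTRUCTIVE = /rm\s+-rf\s+\//; // e.g. `rm -rf /`

function checkShellCommand(run: string, allowInsecure = false): void {
  if (allowInsecure) return; // step opted out of the guard
  if (DESTRUCTIVE.test(run) || METACHARACTERS.test(run)) {
    throw new Error("Security Error: Evaluated command contains shell metacharacters");
  }
}

checkShellCommand("./deploy.sh staging"); // passes: no metacharacters
checkShellCommand('echo "line1\nline2"', true); // passes only via allowInsecure
```

A command like `cat /etc/passwd | sh` would throw here, mirroring the documented behavior where the fix is either `${{ escape(...) }}` on dynamic values or an explicit `allowInsecure: true`.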