keystone-cli 1.2.0 → 2.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (52)
  1. package/README.md +163 -138
  2. package/package.json +6 -3
  3. package/src/cli.ts +54 -369
  4. package/src/commands/init.ts +19 -27
  5. package/src/db/dynamic-state-manager.test.ts +319 -0
  6. package/src/db/dynamic-state-manager.ts +411 -0
  7. package/src/db/memory-db.test.ts +45 -0
  8. package/src/db/memory-db.ts +47 -21
  9. package/src/db/sqlite-setup.ts +26 -3
  10. package/src/db/workflow-db.ts +76 -5
  11. package/src/parser/config-schema.ts +11 -13
  12. package/src/parser/schema.ts +37 -2
  13. package/src/parser/workflow-parser.test.ts +3 -4
  14. package/src/parser/workflow-parser.ts +3 -62
  15. package/src/runner/__test__/llm-mock-setup.ts +173 -0
  16. package/src/runner/__test__/llm-test-setup.ts +271 -0
  17. package/src/runner/engine-executor.test.ts +25 -18
  18. package/src/runner/executors/blueprint-executor.ts +0 -1
  19. package/src/runner/executors/dynamic-executor.test.ts +613 -0
  20. package/src/runner/executors/dynamic-executor.ts +723 -0
  21. package/src/runner/executors/dynamic-types.ts +69 -0
  22. package/src/runner/executors/engine-executor.ts +5 -1
  23. package/src/runner/executors/llm-executor.ts +502 -1033
  24. package/src/runner/executors/memory-executor.ts +35 -19
  25. package/src/runner/executors/plan-executor.ts +0 -1
  26. package/src/runner/executors/types.ts +4 -4
  27. package/src/runner/llm-adapter.integration.test.ts +151 -0
  28. package/src/runner/llm-adapter.ts +263 -1401
  29. package/src/runner/llm-clarification.test.ts +91 -106
  30. package/src/runner/llm-executor.test.ts +217 -1181
  31. package/src/runner/memoization.test.ts +0 -1
  32. package/src/runner/recovery-security.test.ts +51 -20
  33. package/src/runner/reflexion.test.ts +55 -18
  34. package/src/runner/standard-tools-integration.test.ts +137 -87
  35. package/src/runner/step-executor.test.ts +36 -80
  36. package/src/runner/step-executor.ts +20 -2
  37. package/src/runner/test-harness.ts +3 -29
  38. package/src/runner/tool-integration.test.ts +122 -73
  39. package/src/runner/workflow-runner.ts +92 -35
  40. package/src/runner/workflow-scheduler.ts +11 -1
  41. package/src/runner/workflow-summary.ts +144 -0
  42. package/src/templates/dynamic-demo.yaml +31 -0
  43. package/src/templates/scaffolding/decompose-problem.yaml +1 -1
  44. package/src/templates/scaffolding/dynamic-decompose.yaml +39 -0
  45. package/src/utils/auth-manager.test.ts +10 -520
  46. package/src/utils/auth-manager.ts +3 -756
  47. package/src/utils/config-loader.ts +12 -0
  48. package/src/utils/constants.ts +0 -17
  49. package/src/utils/process-sandbox.ts +15 -3
  50. package/src/utils/topo-sort.ts +47 -0
  51. package/src/runner/llm-adapter-runtime.test.ts +0 -209
  52. package/src/runner/llm-adapter.test.ts +0 -1012
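The file list above shows a new `src/utils/topo-sort.ts` (+47 lines), presumably backing the DAG dependency ordering the README describes. As orientation only, a typical Kahn's-algorithm topological sort over `needs`-declared steps might look like the sketch below; the type and function names are hypothetical, not taken from the package:

```typescript
// Hypothetical sketch of a Kahn's-algorithm topological sort for workflow steps.
// Names and shape are assumed; the package's actual topo-sort.ts may differ.

type Step = { id: string; needs?: string[] };

// Returns step ids in dependency order, or throws if the graph has a cycle.
function topoSort(steps: Step[]): string[] {
  const indegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();

  for (const step of steps) {
    indegree.set(step.id, step.needs?.length ?? 0);
    for (const dep of step.needs ?? []) {
      dependents.set(dep, [...(dependents.get(dep) ?? []), step.id]);
    }
  }

  // Seed the queue with steps that have no unmet dependencies.
  const queue = steps.filter((s) => indegree.get(s.id) === 0).map((s) => s.id);
  const order: string[] = [];

  while (queue.length > 0) {
    const id = queue.shift()!;
    order.push(id);
    for (const next of dependents.get(id) ?? []) {
      const remaining = (indegree.get(next) ?? 0) - 1;
      indegree.set(next, remaining);
      if (remaining === 0) queue.push(next);
    }
  }

  // If some steps never reached indegree 0, the graph has a cycle.
  if (order.length !== steps.length) {
    throw new Error("Cycle detected in step dependencies");
  }
  return order;
}

const order = topoSort([
  { id: "deploy", needs: ["build", "test"] },
  { id: "build" },
  { id: "test", needs: ["build"] },
]);
console.log(order); // → ["build", "test", "deploy"]
```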
package/README.md CHANGED
@@ -34,11 +34,12 @@ Keystone allows you to define complex automation workflows using a simple YAML s
 
  ---
 
- ## Features
+ ## <a id="features">✨ Features</a>
 
  - ⚡ **Local-First:** Built on Bun with a local SQLite database for state management.
  - 🧩 **Declarative:** Define workflows in YAML with automatic dependency tracking (DAG).
  - 🤖 **Agentic:** First-class support for LLM agents defined in Markdown with YAML frontmatter.
+ - 🎯 **Dynamic Workflows:** LLM-driven orchestration where a supervisor generates and executes steps at runtime.
  - 🧑‍💻 **Human-in-the-Loop:** Support for manual approval and text input steps.
  - 🔄 **Resilient:** Built-in retries, timeouts, and state persistence. Resume failed or paused runs exactly where they left off.
  - 📊 **TUI Dashboard:** Built-in interactive dashboard for monitoring and managing runs.
@@ -51,7 +52,7 @@ Keystone allows you to define complex automation workflows using a simple YAML s
 
  ---
 
- ## 🚀 Installation
+ ## <a id="installation">🚀 Installation</a>
 
  Ensure you have [Bun](https://bun.sh) installed.
 
@@ -89,7 +90,7 @@ source <(keystone completion bash)
 
  ---
 
- ## 🚦 Quick Start
+ ## <a id="quick-start">🚦 Quick Start</a>
 
  ### 1. Initialize a Project
  ```bash
@@ -97,52 +98,64 @@ keystone init
  ```
 
  This creates the `.keystone/` directory for configuration and seeds `.keystone/workflows/` plus `.keystone/workflows/agents/` with bundled workflows and agents (see "Bundled Workflows" below).
 
- ### 2. Configure your Environment
- Add your API keys to the generated `.env` file:
+ ### 2. Install AI SDK Providers
+ Keystone uses the **Vercel AI SDK**. Install the provider packages you need:
+ ```bash
+ npm install @ai-sdk/openai @ai-sdk/anthropic
+ # Or use other AI SDK providers like @ai-sdk/google, @ai-sdk/mistral, etc.
+ ```
+
+ ### 3. Configure Providers
+ Edit `.keystone/config.yaml` to configure your providers:
+ ```yaml
+ default_provider: openai
+
+ providers:
+   openai:
+     package: "@ai-sdk/openai"
+     api_key_env: OPENAI_API_KEY
+     default_model: gpt-4o
+
+   anthropic:
+     package: "@ai-sdk/anthropic"
+     api_key_env: ANTHROPIC_API_KEY
+     default_model: claude-3-5-sonnet-20240620
+
+ model_mappings:
+   "gpt-*": openai
+   "claude-*": anthropic
+ ```
+
+ Then add your API keys to `.env`:
  ```env
  OPENAI_API_KEY=sk-...
  ANTHROPIC_API_KEY=sk-ant-...
  ```
- Alternatively, you can use the built-in authentication management:
- ```bash
- keystone auth login openai
- keystone auth login anthropic
- keystone auth login anthropic-claude
- keystone auth login openai-chatgpt
- keystone auth login gemini
- keystone auth login github
- ```
- Use `anthropic-claude` for Claude Pro/Max subscriptions (OAuth) instead of an API key.
- Use `openai-chatgpt` for ChatGPT Plus/Pro subscriptions (OAuth) instead of an API key.
- Use `gemini` (alias `google-gemini`) for Google Gemini subscriptions (OAuth) instead of an API key.
- Use `github` to authenticate GitHub Copilot via the GitHub device flow.
 
- ### 3. Run a Workflow
+ See the [Configuration](#configuration) section for more details on BYOP (Bring Your Own Provider).
+
+ ### 4. Run a Workflow
  ```bash
  keystone run scaffold-feature
  ```
  Keystone automatically looks in `.keystone/workflows/` (locally and in your home directory) for `.yaml` or `.yml` files.
 
- ### 4. Monitor with the Dashboard
+ ### 5. Monitor with the Dashboard
  ```bash
  keystone ui
  ```
 
  ---
 
- ## 🧰 Bundled Workflows
+ ## <a id="bundled-workflows">🧰 Bundled Workflows</a>
 
  `keystone init` seeds these workflows under `.keystone/workflows/` (and the agents they rely on under `.keystone/workflows/agents/`):
 
- Top-level workflows (seeded in `.keystone/workflows/`):
+ Top-level utility workflows (seeded in `.keystone/workflows/`):
  - `scaffold-feature.yaml`: Interactive workflow scaffolder. Prompts for requirements, plans files, generates content, and writes them.
  - `decompose-problem.yaml`: Decomposes a problem into research/implementation/review tasks, waits for approval, runs sub-workflows, and summarizes.
  - `dev.yaml`: Self-bootstrapping DevMode workflow for an interactive plan/implement/verify loop.
- - `agent-handoff.yaml`: Demonstrates agent handoffs and tool-driven context updates.
- - `full-feature-demo.yaml`: A comprehensive workflow demonstrating multiple step types (shell, file, request, etc.).
- - `script-example.yaml`: Demonstrates sandboxed JavaScript execution.
- - `artifact-example.yaml`: Demonstrates artifact upload and download between steps.
- - `idempotency-example.yaml`: Demonstrates safe retries for side-effecting steps.
+ - `dynamic-decompose.yaml`: Dynamic version of decompose-problem using LLM-driven orchestration.
 
  Sub-workflows (seeded in `.keystone/workflows/`):
  - `scaffold-plan.yaml`: Generates a file plan from `requirements` input.
@@ -156,14 +169,14 @@ Example runs:
  ```bash
  keystone run scaffold-feature
  keystone run decompose-problem -i problem="Add caching to the API" -i context="Node/Bun service"
- keystone run agent-handoff -i topic="billing" -i user="Ada"
+ keystone run dev "Improve the user profile UI"
  ```
 
  Sub-workflows are used by the top-level workflows, but can be run directly if you want just one phase.
 
  ---
 
- ## ⚙️ Configuration
+ ## <a id="configuration">⚙️ Configuration</a>
 
  Keystone loads configuration from project `.keystone/config.yaml` (and user-level config; see `keystone config show` for search order) to manage model providers and model mappings.
 
@@ -178,42 +191,27 @@ State is stored at `.keystone/state.db` by default (project-local).
  default_provider: openai
 
  providers:
+   # Example: Using a standard AI SDK provider package (Bring Your Own Provider)
    openai:
-     type: openai
+     package: "@ai-sdk/openai"
      base_url: https://api.openai.com/v1
      api_key_env: OPENAI_API_KEY
      default_model: gpt-4o
-   openai-chatgpt:
-     type: openai-chatgpt
-     base_url: https://api.openai.com/v1
-     default_model: gpt-5-codex
+
+   # Example: Using another provider
    anthropic:
-     type: anthropic
-     base_url: https://api.anthropic.com/v1
+     package: "@ai-sdk/anthropic"
      api_key_env: ANTHROPIC_API_KEY
      default_model: claude-3-5-sonnet-20240620
-   anthropic-claude:
-     type: anthropic-claude
-     base_url: https://api.anthropic.com/v1
-     default_model: claude-3-5-sonnet-20240620
-   google-gemini:
-     type: google-gemini
-     base_url: https://cloudcode-pa.googleapis.com
-     default_model: gemini-1.5-pro
-   groq:
-     type: openai
-     base_url: https://api.groq.com/openai/v1
-     api_key_env: GROQ_API_KEY
-     default_model: llama-3.3-70b-versatile
+
+   # Example: Using a custom provider script
+   # my-custom-provider:
+   #   script: "./providers/my-provider.ts"
+   #   default_model: my-special-model
 
  model_mappings:
-   "gpt-5*": openai-chatgpt
    "gpt-*": openai
-   "claude-4*": anthropic-claude
    "claude-*": anthropic
-   "gemini-*": google-gemini
-   "o1-*": openai
-   "llama-*": groq
 
  mcp_servers:
    filesystem:
@@ -241,37 +239,36 @@ expression:
    strict: false
  ```
 
- `storage.retention_days` sets the default window used by `keystone maintenance` / `keystone prune`. `storage.redact_secrets_at_rest` controls whether secret inputs and known secrets are redacted before storing run data (default `true`).
+ ### Storage Configuration
 
- ### Context Injection (Opt-in)
+ The `storage` section controls data retention and security for workflow runs:
 
- Keystone can automatically inject project context files (`README.md`, `AGENTS.md`, `.cursor/rules`, `.claude/rules`) into LLM system prompts. This helps agents understand your project's conventions and guidelines.
+ - **`retention_days`**: Sets the default window used by `keystone maintenance` / `keystone prune` commands to clean up old run data.
+ - **`redact_secrets_at_rest`**: Controls whether secret inputs and known secrets are redacted before storing run data (default `true`).
 
- ```yaml
- features:
-   context_injection:
-     enabled: true # Opt-in feature (default: false)
-     search_depth: 3 # How many directories up to search (default: 3)
-     sources: # Which context sources to include
-       - readme # README.md files
-       - agents_md # AGENTS.md files
-       - cursor_rules # .cursor/rules or .claude/rules
- ```
+ ### Bring Your Own Provider (BYOP)
 
- When enabled, Keystone will:
- 1. Search from the workflow directory up to the project root
- 2. Find the nearest `README.md` and `AGENTS.md` files
- 3. Parse rules from `.cursor/rules` or `.claude/rules` directories
- 4. Prepend this context to the LLM system prompt
+ Keystone uses the **Vercel AI SDK**, allowing you to use any compatible provider. You must install the provider package (e.g., `@ai-sdk/openai`, `ai-sdk-provider-gemini-cli`) so Keystone can resolve it.
 
- Context is cached for 1 minute to avoid redundant file reads.
+ Keystone searches for provider packages in:
+ 1. **Local `node_modules`**: The project where you run `keystone`.
+ 2. **Global `node_modules`**: Your system-wide npm/bun/yarn directory.
+
+ To install a provider globally:
+ ```bash
+ bun install -g ai-sdk-provider-gemini-cli
+ # or
+ npm install -g @ai-sdk/openai
+ ```
+
+ Then configure it in `.keystone/config.yaml` using the `package` field.
 
  ### Model & Provider Resolution
 
  Keystone resolves which provider to use for a model in the following order:
 
  1. **Explicit Provider:** Use the `provider` field in an agent or step definition.
- 2. **Provider Prefix:** Use the `provider:model` syntax (e.g., `model: copilot:gpt-4o`).
+ 2. **Provider Prefix:** Use the `provider:model` syntax (e.g., `model: anthropic:claude-3-5-sonnet-latest`).
  3. **Model Mappings:** Matches the model name against the `model_mappings` in your config (supports suffix `*` for prefix matching).
  4. **Default Provider:** Falls back to the `default_provider` defined in your config.
 
@@ -290,75 +287,55 @@ model: claude-3-5-sonnet-latest
  - id: notify
    type: llm
    agent: summarizer
-   model: copilot:gpt-4o
+   model: anthropic:claude-3-5-sonnet-latest
    prompt: ...
  ```
 
  ### OpenAI Compatible Providers
- You can add any OpenAI-compatible provider (Together AI, Perplexity, Local Ollama, etc.) by setting the `type` to `openai` and providing the `base_url` and `api_key_env`.
-
- ### GitHub Copilot Support
-
- Keystone supports using your GitHub Copilot subscription directly. To authenticate (using the GitHub Device Flow):
-
- ```bash
- keystone auth login github
- ```
-
- Then, you can use Copilot in your configuration:
+ You can add any OpenAI-compatible provider (Together AI, Perplexity, Local Ollama, etc.) by using the `@ai-sdk/openai` package and providing the `base_url` and `api_key_env`.
 
  ```yaml
  providers:
-   copilot:
-     type: copilot
-     default_model: gpt-4o
+   ollama:
+     package: "@ai-sdk/openai"
+     base_url: http://localhost:11434/v1
+     api_key_env: OLLAMA_API_KEY # Can be any value for local Ollama
+     default_model: llama3.2
  ```
 
- Authentication tokens for Copilot are managed automatically after the initial login.
-
- ### OpenAI ChatGPT Plus/Pro (OAuth)
-
- Keystone supports using your ChatGPT Plus/Pro subscription (OAuth) instead of an API key:
-
- ```bash
- keystone auth login openai-chatgpt
- ```
-
- Then map models to the `openai-chatgpt` provider in your config.
-
- ### Anthropic Claude Pro/Max (OAuth)
-
- Keystone supports using your Claude Pro/Max subscription (OAuth) instead of an API key:
-
- ```bash
- keystone auth login anthropic-claude
- ```
+ ### API Key Management
 
- Then map models to the `anthropic-claude` provider in your config. This flow uses the Claude web auth code and refreshes tokens automatically.
+ For other providers, store API keys in a `.env` file in your project root:
+ - `OPENAI_API_KEY`
+ - `ANTHROPIC_API_KEY`
 
- ### Google Gemini (OAuth)
+ ### Context Injection (Opt-in)
 
- Keystone supports using your Google Gemini subscription (OAuth) instead of an API key:
+ Keystone can automatically inject project context files (`README.md`, `AGENTS.md`, `.cursor/rules`, `.claude/rules`) into LLM system prompts. This helps agents understand your project's conventions and guidelines.
 
- ```bash
- keystone auth login gemini
+ ```yaml
+ features:
+   context_injection:
+     enabled: true # Opt-in feature (default: false)
+     search_depth: 3 # How many directories up to search (default: 3)
+     sources: # Which context sources to include
+       - readme # README.md files
+       - agents_md # AGENTS.md files
+       - cursor_rules # .cursor/rules or .claude/rules
  ```
 
- Then map models to the `google-gemini` provider in your config.
-
- ### API Key Management
+ When enabled, Keystone will:
+ 1. Search from the workflow directory up to the project root
+ 2. Find the nearest `README.md` and `AGENTS.md` files
+ 3. Parse rules from `.cursor/rules` or `.claude/rules` directories
+ 4. Prepend this context to the LLM system prompt
 
- For other providers, you can either store API keys in a `.env` file in your project root:
- - `OPENAI_API_KEY`
- - `ANTHROPIC_API_KEY`
+ Context is cached for 1 minute to avoid redundant file reads.
 
- Or use the `keystone auth login` command to securely store them in your local machine's configuration:
- - `keystone auth login openai`
- - `keystone auth login anthropic`
 
  ---
 
- ## 📝 Workflow Example
+ ## <a id="workflow-example">📝 Workflow Example</a>
 
  Workflows are defined in YAML. Dependencies are automatically resolved based on the `needs` field, and **Keystone also automatically detects implicit dependencies** from your `${{ }}` expressions.
 
@@ -441,7 +418,7 @@ expression:
 
  ---
 
- ## 🏗️ Step Types
+ ## <a id="step-types">🏗️ Step Types</a>
 
  Keystone supports several specialized step types:
 
@@ -521,6 +498,54 @@ Keystone supports several specialized step types:
    - `env` and `cwd` are required and must be explicit.
    - `input` is sent to stdin (objects/arrays are JSON-encoded).
    - Summary is parsed from stdout or a file at `KEYSTONE_ENGINE_SUMMARY_PATH` and stored as an artifact.
+ - `git`: Execute git operations with automatic worktree management.
+   - Operations: `clone`, `checkout`, `pull`, `push`, `commit`, `worktree_add`, `worktree_remove`.
+   - `cleanup: true` automatically removes worktrees at workflow end.
+   ```yaml
+   - id: clone_repo
+     type: git
+     op: clone
+     url: https://github.com/example/repo.git
+     path: ./repo
+     branch: main
+     cleanup: true
+   ```
+ - `dynamic`: LLM-driven workflow orchestration where a supervisor agent generates steps at runtime.
+   - The supervisor LLM creates a plan of steps that are then executed dynamically.
+   - Supports resumability - state is persisted after each generated step.
+   - Generated steps can be: `llm`, `shell`, `workflow`, `file`, or `request`.
+   - `goal`: High-level goal for the supervisor to accomplish (required).
+   - `context`: Additional context for planning.
+   - `prompt`: Custom supervisor prompt (overrides default).
+   - `supervisor`: Agent for planning (defaults to `keystone-architect`).
+   - `agent`: Default agent for generated LLM steps.
+   - `templates`: Role-to-agent mapping for specialized tasks.
+   - `maxSteps`: Maximum number of steps to generate.
+   - `concurrency`: Maximum number of steps to run in parallel (default: `1`).
+   - `confirmPlan`: Review and approve/modify the plan before execution (default: `false`).
+   - `maxReplans`: Number of automatic recovery attempts if the plan fails (default: `3`).
+   - `allowStepFailure`: Continue execution even if individual generated steps fail.
+   - `library`: A list of pre-defined step patterns available to the supervisor.
+   ```yaml
+   - id: implement_feature
+     type: dynamic
+     goal: "Implement user authentication with JWT"
+     context: "This is a Node.js Express application"
+     agent: keystone-architect
+     templates:
+       planner: "keystone-architect"
+       developer: "software-engineer"
+     maxSteps: 10
+     allowStepFailure: false
+   ```
+
+ #### Dynamic Orchestration vs. Rigid Pipelines
+ Traditional workflows often require complex multi-file decomposition (e.g., `decompose-problem.yaml` calling separate research, implementation, and review workflows). The `dynamic` step type replaces these rigid patterns with **Agentic Orchestration**:
+ - **Simplified Structure**: A single `dynamic` step can replace multiple nested pipelines.
+ - **Adaptive Execution**: The agent adjusts its plan based on real-time feedback and results from previous steps.
+ - **Improved Resumability**: Each sub-step generated by the agent is persisted, allowing seamless resumption even inside long-running dynamic tasks.
+
+ Use **Deterministic Workflows** (standard steps) for predictable, repeatable processes. Use **Dynamic Orchestration** for open-ended tasks where the specific steps cannot be known in advance.
 
  ### Human Steps in Non-Interactive Mode
  If stdin is not a TTY (CI, piped input), `human` steps suspend. Resume by providing an answer via inputs using the step id and `__answer`:
@@ -726,7 +751,7 @@ Until `strategy.matrix` is wired end-to-end, use explicit `foreach` with an arra
 
  ---
 
- ## 🔧 Advanced Features
+ ## <a id="advanced-features">🔧 Advanced Features</a>
 
  ### Idempotency Keys
 
@@ -939,7 +964,7 @@ You can also define a workflow-level `compensate` step to handle overall cleanup
 
  ---
 
- ## 🤖 Agent Definitions
+ ## <a id="agent-definitions">🤖 Agent Definitions</a>
 
  Agents are defined in Markdown files with YAML frontmatter, making them easy to read and version control.
 
@@ -1123,7 +1148,7 @@ In these examples, the agent will have access to all tools provided by the MCP s
 
  ---
 
- ## 🛠️ CLI Commands
+ ## <a id="cli-commands">🛠️ CLI Commands</a>
 
  | Command | Description |
  | :--- | :--- |
@@ -1146,9 +1171,6 @@ In these examples, the agent will have access to all tools provided by the MCP s
  | `dev <task>` | Run the self-bootstrapping DevMode workflow |
  | `manifest` | Show embedded assets manifest |
  | `config show` | Show current configuration and discovery paths (alias: `list`) |
- | `auth status [provider]` | Show authentication status |
- | `auth login [provider]` | Login to an authentication provider (github, openai, anthropic, openai-chatgpt, anthropic-claude, gemini/google-gemini) |
- | `auth logout [provider]` | Logout and clear authentication tokens |
  | `ui` | Open the interactive TUI dashboard |
  | `mcp start` | Start the Keystone MCP server |
  | `mcp login <server>` | Login to a remote MCP server |
@@ -1187,19 +1209,22 @@ Input keys passed via `-i key=val` must be alphanumeric/underscore and cannot be
  ### Dry Run
  `keystone run --dry-run` prints shell commands without executing them and skips non-shell steps (including human prompts). Outputs from skipped steps are empty, so conditional branches may differ from a real run.
 
- ## 🛡️ Security
+ ## <a id="security">🛡️ Security</a>
 
  ### Shell Execution
  Keystone blocks shell commands that match common injection/destructive patterns (like `rm -rf /` or pipes to shells). To run them, set `allowInsecure: true` on the step. Prefer `${{ escape(...) }}` when interpolating user input.
 
- You can bypass this check if you trust the command:
- ```yaml
  - id: deploy
    type: shell
    run: ./deploy.sh ${{ inputs.env }}
    allowInsecure: true
  ```
 
+ #### Troubleshooting Security Errors
+ If you see a `Security Error: Evaluated command contains shell metacharacters`, it means your command contains characters like `\n`, `|`, or `&` that were not explicitly escaped or are not in the safe whitelist.
+ - **Fix 1**: Use `${{ escape(steps.id.output) }}` for any dynamic values.
+ - **Fix 2**: Set `allowInsecure: true` if the command naturally uses special characters (like `echo "line1\nline2"`).
+
  ### Expression Safety
  Expressions `${{ }}` are evaluated using a safe AST parser (`jsep`) which:
  - Prevents arbitrary code execution (no `eval` or `Function`).
@@ -1215,7 +1240,7 @@ Request steps enforce SSRF protections and require HTTPS by default. Cross-origi
 
  ---
 
- ## 🏗️ Architecture
+ ## <a id="architecture">🏗️ Architecture</a>
 
  ```mermaid
  graph TD
@@ -1251,12 +1276,12 @@ graph TD
  EX --> Join[Join Step]
  EX --> Blueprint[Blueprint Step]
 
- LLM --> Adapters[LLM Adapters]
- Adapters --> Providers[OpenAI, Anthropic, Gemini, Copilot, etc.]
+ LLM --> Adapter[LLM Adapter (AI SDK)]
+ Adapter --> Providers[OpenAI, Anthropic, Gemini, Copilot, etc.]
  LLM --> MCPClient[MCP Client]
  ```
 
- ## 📂 Project Structure
+ ## <a id="project-structure">📂 Project Structure</a>
 
  - `src/cli.ts`: CLI entry point.
  - `src/db/`: SQLite persistence layer.
@@ -1271,6 +1296,6 @@ graph TD
 
  ---
 
- ## 📄 License
+ ## <a id="license">📄 License</a>
 
  MIT
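The "Model & Provider Resolution" hunks in the README diff above document a four-step order: explicit `provider` field, `provider:model` prefix syntax, `model_mappings` with a trailing `*` for prefix matching, then `default_provider`. As an illustration only, that order could be implemented roughly like this; it is a hedged reimplementation of the documented behavior, not the package's actual code:

```typescript
// Illustrative sketch of the documented resolution order; the real logic lives
// in the package's config loader and may differ in detail.

type Config = {
  default_provider: string;
  model_mappings: Record<string, string>; // e.g. { "gpt-*": "openai" }
};

function resolveProvider(
  model: string,
  config: Config,
  explicit?: string,
): { provider: string; model: string } {
  // 1. Explicit `provider` field on the agent or step wins.
  if (explicit) return { provider: explicit, model };

  // 2. `provider:model` prefix syntax.
  const colon = model.indexOf(":");
  if (colon > 0) {
    return { provider: model.slice(0, colon), model: model.slice(colon + 1) };
  }

  // 3. model_mappings, where a trailing `*` means prefix match.
  for (const [pattern, provider] of Object.entries(config.model_mappings)) {
    const matches = pattern.endsWith("*")
      ? model.startsWith(pattern.slice(0, -1))
      : model === pattern;
    if (matches) return { provider, model };
  }

  // 4. Fall back to default_provider.
  return { provider: config.default_provider, model };
}

const config: Config = {
  default_provider: "openai",
  model_mappings: { "gpt-*": "openai", "claude-*": "anthropic" },
};
console.log(resolveProvider("claude-3-5-sonnet-latest", config).provider); // → "anthropic"
console.log(resolveProvider("anthropic:claude-3-5-sonnet-latest", config).model); // prefix stripped
```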
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "keystone-cli",
3
- "version": "1.2.0",
3
+ "version": "2.0.0",
4
4
  "description": "A local-first, declarative, agentic workflow orchestrator built on Bun",
5
5
  "type": "module",
6
6
  "bin": {
@@ -8,7 +8,9 @@
8
8
  },
9
9
  "scripts": {
10
10
  "dev": "bun run src/cli.ts",
11
- "test": "bun test",
11
+ "test": "bun test --timeout 60000",
12
+ "test:adapter": "SKIP_LLM_MOCK=1 bun test ./src/runner/llm-adapter.integration.test.ts --timeout 60000",
13
+ "test:unit": "bun test --timeout 60000 --filter '!llm-adapter.integration.test.ts'",
12
14
  "lint": "biome check .",
13
15
  "lint:fix": "biome check --write .",
14
16
  "format": "biome format --write .",
@@ -30,6 +32,7 @@
30
32
  "@jsep-plugin/object": "^1.2.2",
31
33
  "@types/react": "^19.0.0",
32
34
  "@xenova/transformers": "^2.17.2",
35
+ "ai": "^6.0.3",
33
36
  "ajv": "^8.12.0",
34
37
  "commander": "^12.1.0",
35
38
  "dagre": "^0.8.5",
@@ -41,7 +44,7 @@
41
44
  "jsep": "^1.4.0",
42
45
  "react": "^19.0.0",
43
46
  "sqlite-vec": "0.1.6",
44
- "zod": "^3.23.8",
47
+ "zod": "^3.25.76",
45
48
  "zod-to-json-schema": "^3.25.1"
46
49
  },
47
50
  "optionalDependencies": {
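The Security section in the README diff describes blocking shell commands that contain unescaped metacharacters (pipes, `&`, newlines) unless the step sets `allowInsecure: true`. A rough sketch of such a guard, with a deliberately simplified pattern list (the package's real checks are more extensive and its patterns are not reproduced here):

```typescript
// Hedged sketch of a shell-command guard like the one the README's Security
// section describes. The pattern list below is illustrative, not the package's.

const METACHARACTERS = /[|&;`$><\n]/; // simplified: real whitelist is richer
const DESTRUCTIVE = /rm\s+-rf\s+\//; // e.g. `rm -rf /`

function checkShellCommand(run: string, allowInsecure = false): void {
  if (allowInsecure) return; // step opted out of the guard
  if (DESTRUCTIVE.test(run) || METACHARACTERS.test(run)) {
    throw new Error("Security Error: Evaluated command contains shell metacharacters");
  }
}

checkShellCommand("./deploy.sh staging"); // passes: no metacharacters
checkShellCommand('echo "line1\nline2"', true); // passes only via allowInsecure
```

A command like `cat /etc/passwd | sh` would throw here, mirroring the documented behavior where the fix is either `${{ escape(...) }}` on dynamic values or an explicit `allowInsecure: true`.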