@oscharko-dev/keiko 0.1.0-beta.0 → 0.1.0-beta.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (59)
  1. package/README.md +98 -570
  2. package/dist/cli/gen-tests.js +8 -3
  3. package/dist/cli/init.d.ts +8 -0
  4. package/dist/cli/init.js +122 -0
  5. package/dist/cli/investigate.js +6 -2
  6. package/dist/cli/lifecycle.d.ts +18 -0
  7. package/dist/cli/lifecycle.js +289 -0
  8. package/dist/cli/models.js +2 -2
  9. package/dist/cli/runner.js +21 -28
  10. package/dist/gateway/capabilities.d.ts +1 -0
  11. package/dist/gateway/capabilities.data.js +5 -203
  12. package/dist/gateway/capabilities.js +18 -0
  13. package/dist/gateway/config.d.ts +2 -1
  14. package/dist/gateway/config.js +98 -9
  15. package/dist/gateway/gateway.js +3 -3
  16. package/dist/gateway/index.d.ts +2 -2
  17. package/dist/gateway/index.js +2 -2
  18. package/dist/gateway/model-selection.d.ts +3 -1
  19. package/dist/gateway/model-selection.js +15 -4
  20. package/dist/gateway/types.d.ts +1 -0
  21. package/dist/harness/session.d.ts +1 -1
  22. package/dist/harness/session.js +1 -1
  23. package/dist/sdk/index.d.ts +1 -1
  24. package/dist/sdk/index.js +1 -1
  25. package/dist/tools/patch-normalize.js +1 -2
  26. package/dist/tools/terminal-policy.js +1 -8
  27. package/dist/ui/chat-handlers.js +26 -12
  28. package/dist/ui/csp-hashes.json +6 -6
  29. package/dist/ui/deps.d.ts +14 -0
  30. package/dist/ui/deps.js +92 -20
  31. package/dist/ui/gateway-setup.d.ts +3 -0
  32. package/dist/ui/gateway-setup.js +235 -0
  33. package/dist/ui/read-handlers.js +14 -7
  34. package/dist/ui/routes.js +6 -4
  35. package/dist/ui/run-handlers.js +3 -2
  36. package/dist/ui/server.d.ts +1 -1
  37. package/dist/ui/server.js +1 -1
  38. package/dist/ui/static/404.html +1 -1
  39. package/dist/ui/static/_next/static/chunks/44-17c259c8e72fb82f.js +1 -0
  40. package/dist/ui/static/_next/static/chunks/app/_not-found/{page-75825b09bcecad97.js → page-7bd871301b874ae0.js} +1 -1
  41. package/dist/ui/static/_next/static/chunks/app/launch/{page-9c86a13c29884245.js → page-3bd098d60d6df513.js} +1 -1
  42. package/dist/ui/static/_next/static/chunks/app/layout-091bb8be985f5c03.js +1 -0
  43. package/dist/ui/static/_next/static/chunks/app/{page-4168c12c68b7a853.js → page-2006f21df58c2bb9.js} +1 -1
  44. package/dist/ui/static/_next/static/chunks/{main-app-30679af7240d63e9.js → main-app-e8144a306630b76d.js} +1 -1
  45. package/dist/ui/static/_next/static/css/{be7cb54d5c5673b6.css → 3d68155c8db012f4.css} +1 -1
  46. package/dist/ui/static/index.html +1 -1
  47. package/dist/ui/static/index.txt +3 -3
  48. package/dist/ui/static/launch.html +1 -1
  49. package/dist/ui/static/launch.txt +3 -3
  50. package/dist/ui/store-handlers.js +16 -12
  51. package/dist/workflows/bug-investigation/model-loop.js +1 -4
  52. package/dist/workflows/bug-investigation/parse.js +5 -3
  53. package/dist/workflows/unit-tests/model-loop.js +1 -1
  54. package/dist/workspace/retrieval.js +1 -1
  55. package/package.json +1 -1
  56. package/dist/ui/static/_next/static/chunks/4-be1fef693af8e088.js +0 -1
  57. package/dist/ui/static/_next/static/chunks/app/layout-bdea63fe87947d50.js +0 -1
  58. package/dist/ui/static/_next/static/{ca-A01hy9W98aRvMZKdAw → VbDWcDBTN0u8CNeSDaz0o}/_buildManifest.js +0 -0
  59. package/dist/ui/static/_next/static/{ca-A01hy9W98aRvMZKdAw → VbDWcDBTN0u8CNeSDaz0o}/_ssgManifest.js +0 -0
package/README.md CHANGED
@@ -1,621 +1,149 @@
1
1
  # Keiko
2
2
 
3
- Keiko is an enterprise, model-agnostic developer-assist coding agent for regulated engineering teams.
3
+ Keiko is a local enterprise coding assistant for regulated engineering teams. It helps inspect a repository, chat with configured language models, generate reviewable unit tests, investigate bugs, run verification, and keep redacted evidence for human review.
4
4
 
5
- It runs bounded, reviewable coding workflows against a configurable gateway of language models, across three surfaces: a command-line tool (`keiko`), a programmatic SDK, and a local web UI. Dry-run workflows are the default, and the manifest-producing surfaces emit redacted evidence for audit. Keiko assists a developer; it does not merge code on its own.
6
-
7
- This README is the package's primary shipped guide. It contains the package-facing essentials and links to the repository [`docs/`](https://github.com/oscharko-dev/Keiko/tree/dev/docs) for deeper operational guidance.
8
-
9
- ---
10
-
11
- ## Table of contents
12
-
13
- - [What Keiko is](#what-keiko-is)
14
- - [Wave 1 scope](#wave-1-scope)
15
- - [Requirements](#requirements)
16
- - [Install](#install)
17
- - [Quick start](#quick-start)
18
- - [Build and test](#build-and-test)
19
- - [Configuration and secrets](#configuration-and-secrets)
20
- - [CLI usage](#cli-usage)
21
- - [SDK usage](#sdk-usage)
22
- - [Evidence output](#evidence-output)
23
- - [Local UI](#local-ui)
24
- - [Security and audit boundaries](#security-and-audit-boundaries)
25
- - [Evaluation and Go/No-Go](#evaluation-and-gono-go)
26
- - [Packaging](#packaging)
27
- - [Future architecture path](#future-architecture-path)
28
- - [Documentation index](#documentation-index)
29
- - [Development](#development)
30
- - [License and attribution](#license-and-attribution)
31
-
32
- ---
33
-
34
- ## What Keiko is
35
-
36
- Keiko is a coding agent for teams who must show their work. It targets regulated engineering — banking, insurance, and similar — where every automated change needs a human reviewer and an audit trail.
37
-
38
- Three properties define it:
39
-
40
- - **Model-agnostic.** Route each task to a model that fits the work and the budget, from one config file. The gateway exposes each model's declared capabilities; the caller chooses.
41
- - **Bounded and reviewable.** Workflows run as deterministic pipelines, not open-ended autonomy. Changes are dry-run by default and returned as a diff for human review. No change reaches a branch without a person.
42
- - **Auditable.** Manifest-producing surfaces emit structured, redacted evidence. Credentials never enter logs, events, or evidence.
43
-
44
- Keiko provides bounded developer assistance with measurable output and regulated reviewability. It is not a replacement for engineering judgment, and it does not claim parity with general-purpose autonomous agents.
45
-
46
- ---
47
-
48
- ## Wave 1 scope
49
-
50
- Wave 1 is feature-complete for its defined scope. The shipped capabilities are:
51
-
52
- - **Bounded repository context** — a redacted, byte-budgeted view of a workspace.
53
- - **Unit-test generation** — generate a reviewable test patch for an existing source file.
54
- - **Bug investigation** — propose a fix and a regression test for a reported symptom.
55
- - **Safe tool and command execution** — an allowlisted, bounded command runner.
56
- - **Verification** — run the project's gates (lint, typecheck, test, build) under resource limits.
57
- - **Audit evidence** — redacted, durable evidence manifests with retention.
58
- - **Local UI** — a single-user, local-only web surface for the workflows and evidence.
59
- - **Evaluation harness** — an offline (default) or live scorecard for pilot decisions.
60
-
61
- Surface coverage is intentionally not identical. The CLI exposes the full command set; the SDK exposes programmatic workflows, workspace, verification, gateway, evaluation, and evidence APIs; the local UI exposes workflow launch/review/apply, live run observation, evidence browsing, config/model inspection, and workspace summary.
62
-
63
- Two model kinds in the portfolio are registered but not yet callable: OCR-vision (`callOcr`) and embedding (`callEmbedding`) methods are Wave 2. See the [model capability guide](https://github.com/oscharko-dev/Keiko/blob/dev/docs/pilot/model-capability-guide.md).
64
-
65
- ---
5
+ Keiko is developer-controlled by design. It does not commit, push, open pull requests, merge code, or apply changes without an explicit local action. The manifest-producing surfaces emit redacted evidence for audit.
66
6
 
67
7
  ## Requirements
68
8
 
69
- - Node.js >= 22 (ESM-only package)
70
- - npm >= 10
9
+ - Node.js 22 or newer.
10
+ - npm 10 or newer.
11
+ - A model gateway with an OpenAI-compatible chat-completions API and an API token for model-backed work.
71
12
 
72
- ---
13
+ ## Install and Start
73
14
 
74
- ## Install
15
+ Install Keiko in the project where you want to use it:
75
16
 
76
17
  ```bash
77
18
  npm install @oscharko-dev/keiko
19
+ npx keiko init
20
+ npm run keiko:start
78
21
  ```
79
22
 
80
- Keiko ships ESM only with a minimal runtime dependency set. Use `import`, not `require`.
81
-
82
- ---
83
-
84
- ## Quick start
23
+ Open the local UI:
85
24
 
86
- A dry-run pass that writes nothing: inspect context, generate a test patch, review the diff.
87
-
88
- ```bash
89
- # 1. List the models your gateway knows about (no credentials needed)
90
- keiko models list
91
-
92
- # 2. Print a redacted summary of what the workspace layer would read
93
- keiko context --dir .
94
-
95
- # 3. Generate a unit-test patch for a source file (dry-run by default)
96
- keiko gen-tests --file src/foo.ts
25
+ ```text
26
+ http://127.0.0.1:1983
97
27
  ```
98
28
 
99
- Step 3 prints the proposed diff and writes nothing. Review it, then re-run with `--apply` to write the test file, which triggers verification.
100
-
101
- ---
102
-
103
- ## Build and test
104
-
105
- From a clone of the repository:
29
+ Stop Keiko when you are done:
106
30
 
107
31
  ```bash
108
- npm install
109
- npm run build # compile TypeScript to dist/
110
- npm test # run the test suite (vitest)
111
- npm run lint # eslint
112
- npm run typecheck # tsc --noEmit
113
- npm run format # prettier --write
114
- npm --prefix ui ci --ignore-scripts # install UI build tooling when packaging or testing the UI
115
- ```
116
-
117
- ---
118
-
119
- ## Configuration and secrets
120
-
121
- Keiko reads model credentials from **environment variables** or a **JSON config file** — never from CLI flags. This keeps credentials out of shell history and process listings.
122
-
123
- ### Precedence
124
-
125
- The first match wins:
126
-
127
- 1. Per-model environment variables: `KEIKO_MODEL_<UPPER_MODEL_ID>_API_KEY` / `_BASE_URL`
128
- 2. Config-file value for that model's `apiKey` / `baseUrl`
129
- 3. Global environment variables: `KEIKO_DEFAULT_API_KEY` / `_BASE_URL`
130
-
131
- Live model CLI surfaces (`keiko models validate`, `keiko gen-tests`, `keiko investigate`, and `keiko evaluate --live`) read a config only from `--config PATH` or `KEIKO_CONFIG_FILE`. `keiko ui` requires `--config PATH` for model-backed runs. Keiko does not implicitly trust `./keiko.config.json` from the target repository.
132
-
133
- Provider `baseUrl` values must use `https:` unless they target `localhost` or loopback for local development.
134
-
135
- ### Per-model variables
136
-
137
- Derive the variable name from the model id: uppercase it, then replace every non-alphanumeric character with `_`. Suffix with `_API_KEY` or `_BASE_URL`.
138
-
139
- ```
140
- gpt-oss-120b → KEIKO_MODEL_GPT_OSS_120B_API_KEY
141
- KEIKO_MODEL_GPT_OSS_120B_BASE_URL
32
+ npm run keiko:stop
142
33
  ```
143
34
 
144
- ### Global fallback
35
+ `npx keiko init` adds these local scripts to `package.json`:
145
36
 
146
- Used when neither a per-model environment variable nor a config-file value supplies the secret:
37
+ | Script | Purpose |
38
+ | --------------------- | ---------------------------------------------- |
39
+ | `npm run keiko:start` | Starts the local Keiko UI on the default port. |
40
+ | `npm run keiko:stop` | Stops the local Keiko UI process. |
147
41
 
148
- ```
149
- KEIKO_DEFAULT_API_KEY
150
- KEIKO_DEFAULT_BASE_URL
151
- ```
42
+ ## First-Run Setup
152
43
 
153
- Credentials are held in memory for the duration of a call and are never logged or serialized. See [`.env.example`](https://github.com/oscharko-dev/Keiko/blob/dev/.env.example) for a template and [ADR-0003](https://github.com/oscharko-dev/Keiko/blob/dev/docs/adr/README.md#adr-0003) for the rationale.
44
+ If no model gateway is configured, the UI asks for:
154
45
 
155
- ---
46
+ - Base URL, for example `https://llm-gateway.example.com/v1`
47
+ - API token
156
48
 
157
- ## CLI usage
49
+ Keiko calls the gateway model list endpoint, tests discovered chat models with a small chat-completions request, and stores only callable chat models in the local runtime configuration. Credentials stay on the local machine and are not returned to the browser.
158
50
 
159
- The CLI provides nine subcommands (`models`, `run`, `context`, `verify`, `gen-tests`, `investigate`, `evidence`, `evaluate`, `ui`); `models` and `evidence` each take a sub-action. Top-level `keiko --help` and `keiko --version` print usage; `keiko evaluate --help` prints its own usage. Global options:
51
+ The UI runs on loopback only. The `--host` option can validate a loopback host value; the server always binds `127.0.0.1`.
160
52
 
161
- | Option | Effect |
162
- | ----------------- | -------------------- |
163
- | `-h`, `--help` | Show help text |
164
- | `-v`, `--version` | Show the CLI version |
53
+ ## Daily Use
165
54
 
166
- Exit codes are consistent across commands unless noted:
55
+ 1. Add a local project path.
56
+ 2. Select one of the configured chat models.
57
+ 3. Use chat or a workflow: Generate Tests, Investigate Bug, Explain Plan, or Verify.
58
+ 4. Review proposed diffs and evidence before applying any change.
59
+ 5. Keep generated evidence with the project review material when required by your delivery process.
167
60
 
168
- | Code | Meaning |
169
- | ---- | ------------- |
170
- | `0` | success |
171
- | `1` | runtime error |
172
- | `2` | usage error |
61
+ Surface coverage is intentionally not identical. The UI is the primary surface for day-to-day use; the CLI remains available for focused inspection, verification, and automation.
173
62
 
174
- ### `keiko models list`
63
+ ## CLI Essentials
175
64
 
176
- List all registered model capabilities as a table. No credentials required.
177
-
178
- ```bash
179
- keiko models list
180
- ```
181
-
182
- Takes no options. Prints one row per model: id, kind, cost class, latency class, tool-calling, structured-output, and use cases.
183
-
184
- ### `keiko models validate`
185
-
186
- Validate the gateway configuration from `--config` or `KEIKO_CONFIG_FILE`. Reports structural errors without printing any configured value. Exit `0` when valid, `1` when invalid or no source is given, `2` when `--config` has no path.
187
-
188
- ```bash
189
- keiko models validate --config ./keiko.config.json
190
- ```
191
-
192
- | Option | Description |
193
- | --------------- | -------------------------------------------- |
194
- | `--config PATH` | Gateway config file (or `KEIKO_CONFIG_FILE`) |
195
-
196
- ### `keiko run`
197
-
198
- Run a bounded, dry-run task through the agent harness against deterministic fixtures (no provider call). The task type selects the harness pipeline. A redacted evidence manifest is written by default.
199
-
200
- ```bash
201
- keiko run explain-plan --file src/auth.ts --question "what does this do?"
202
- keiko run generate-unit-tests --file src/add.ts --function add
203
- keiko run investigate-bug --description "login 500 on empty password"
204
- ```
205
-
206
- | Option | Description |
207
- | --------------------- | ------------------------------------------------------------------------- |
208
- | `<task-type>` | `explain-plan`, `generate-unit-tests`, or `investigate-bug` |
209
- | `--file PATH` | Target file (required for the first two task types) |
210
- | `--question TEXT` | Question for `explain-plan` |
211
- | `--function NAME` | Focus function for `generate-unit-tests` |
212
- | `--description TEXT` | Bug description (required for `investigate-bug`) |
213
- | `--no-evidence` | Do not write an evidence manifest |
214
- | `--evidence-dir PATH` | Evidence directory (or `KEIKO_EVIDENCE_DIR`; default `./.keiko/evidence`) |
215
- | `--include-reasoning` | Include redacted reasoning entries in the manifest |
216
- | `--include-diff` | Include the redacted proposed diff in the manifest |
217
-
218
- For real model-backed generation and investigation, use `keiko gen-tests` and `keiko investigate`.
219
-
220
- ### `keiko context`
221
-
222
- Print a redacted workspace context summary. Dry-run by construction: no model is called and nothing is written.
223
-
224
- ```bash
225
- keiko context --dir .
226
- keiko context --dir . --task "add tests" --budget 65536
227
- ```
228
-
229
- | Option | Description |
230
- | ---------------- | ------------------------------------------- |
231
- | `--dir PATH` | Workspace root (default: cwd) |
232
- | `--task TEXT` | Build a context pack scoped to this task |
233
- | `--budget BYTES` | Context-pack byte budget (positive integer) |
234
- | `--json` | Emit the summary as JSON |
235
-
236
- ### `keiko verify`
237
-
238
- Run the project's gates through the safe tool layer under per-command resource limits, and print a redacted summary. Exit `0` when every gate passes, `1` when a gate fails or a workspace error occurs.
239
-
240
- ```bash
241
- keiko verify --dir .
242
- keiko verify --only typecheck,lint --changed src/a.ts
243
- ```
244
-
245
- | Option | Description |
246
- | ----------------------- | --------------------------------------------------------------------------- |
247
- | `--dir PATH` | Workspace root (default: cwd) |
248
- | `--only KIND[,KIND]` | Run only these gates: `test`, `targeted-test`, `typecheck`, `lint`, `build` |
249
- | `--changed FILE[,FILE]` | Restrict targeted tests to these changed files |
250
- | `--json` | Emit the verification report as JSON |
251
-
252
- ### `keiko gen-tests`
253
-
254
- Generate a reviewable unit-test patch. Dry-run by default; `--apply` writes the tests and runs verification. The patch may only create or modify test files (a production-code guard rejects anything else). The model provider comes from config, never a flag. Exit `0` on a successful dry-run or apply, `1` on a rejected/cancelled/failed run or workspace error, `2` on a usage error.
255
-
256
- ```bash
257
- keiko gen-tests --file src/add.ts --config ~/keiko/config.json
258
- keiko gen-tests --file src/add.ts --function add --apply
259
- keiko gen-tests --dir src/math --changed src/math/sum.ts
260
- ```
261
-
262
- | Option | Description |
263
- | ----------------------- | --------------------------------------------------------------------------- |
264
- | `--file PATH` | Source file to test (exactly one of `--file` / `--dir`) |
265
- | `--dir PATH` | Module directory to test (exactly one of `--file` / `--dir`) |
266
- | `--function NAME` | Focus on one function (with `--file`) |
267
- | `--changed FILE[,FILE]` | Authoritative changed-file target set |
268
- | `--apply` | Write the patch and run verification (default: dry-run) |
269
- | `--model ID` | Registered configured model id (default: cheapest capable configured model) |
270
- | `--config PATH` | Gateway config file (or `KEIKO_CONFIG_FILE`) |
271
- | `--json` | Emit the workflow report as JSON |
272
- | `--dir-root PATH` | Workspace root (default: cwd) |
273
-
274
- ### `keiko investigate`
275
-
276
- Investigate a bounded bug report, then propose a minimal fix and a regression test, separating verified facts from the model's unverified hypothesis. Dry-run by default; `--apply` writes the fix and runs verification. A scope guard rejects edits to sensitive paths (version-control internals, CI config, git hooks, lockfiles). At least one evidence source is required. Exit `0` on `fix-applied`/`fix-proposed`/`investigation-only`, `1` on a rejected/cancelled/failed run or read error, `2` on a usage error.
277
-
278
- ```bash
279
- keiko investigate --description "login returns 500 on empty password" --config ~/keiko/config.json
280
- keiko investigate --output-file ./fail.txt --file src/auth.ts --apply
281
- ```
65
+ | Command | Purpose |
66
+ | ----------------------------- | ---------------------------------------------------------------- |
67
+ | `keiko init` | Adds local start and stop scripts. |
68
+ | `keiko start` | Starts the local UI in the background. |
69
+ | `keiko stop` | Stops the local UI. |
70
+ | `keiko status` | Prints the local UI status. |
71
+ | `keiko ui` | Runs the UI in the foreground. Port to bind (default: 1983). |
72
+ | `keiko models validate` | Validates gateway configuration. |
73
+ | `keiko context` | Prints a redacted workspace context summary. |
74
+ | `keiko gen-tests` | Generates a reviewable unit-test patch. |
75
+ | `keiko investigate` | Investigates a bug and proposes a fix plus regression test. |
76
+ | `keiko verify` | Runs configured verification gates and writes redacted evidence. |
77
+ | `keiko evidence list` | Lists local evidence manifests. |
78
+ | `keiko evidence show <runId>` | Shows one redacted evidence manifest. |
282
79
 
283
- | Option | Description |
284
- | -------------------- | --------------------------------------------------------------------------- |
285
- | `--description TEXT` | Free-text bug description |
286
- | `--output TEXT` | Failing command/test output (inline) |
287
- | `--output-file PATH` | Failing output read from a file |
288
- | `--stack TEXT` | Stack trace (inline) |
289
- | `--stack-file PATH` | Stack trace read from a file |
290
- | `--file PATH[,PATH]` | Suspected target file(s) |
291
- | `--apply` | Apply the fix and run verification (default: dry-run) |
292
- | `--model ID` | Registered configured model id (default: cheapest capable configured model) |
293
- | `--config PATH` | Gateway config file (or `KEIKO_CONFIG_FILE`) |
294
- | `--json` | Emit the investigation report as JSON |
295
- | `--dir-root PATH` | Workspace root (default: cwd) |
80
+ `keiko gen-tests` and `keiko investigate` print a reviewable report but do not persist an evidence manifest. Use `keiko run`, `keiko verify`, or the UI evidence view when a stored manifest is required.
296
81
 
297
- ### `keiko evidence`
82
+ ## Configuration
298
83
 
299
- Inspect redacted evidence manifests written by `keiko run`, the local UI, and `keiko evaluate`. Reads only the evidence base directory. Exit `0` on success, `1` on a run id not found in the store or a read error, `2` on a usage error (including `show` with no run id).
84
+ The UI can create a local runtime config during first-run setup. For scripted use, provide a JSON config file through `KEIKO_CONFIG_FILE` or `--config`:
300
85
 
301
- ```bash
302
- keiko evidence list
303
- keiko evidence show <runId>
304
- ```
305
-
306
- | Option | Description |
307
- | --------------------- | ------------------------------------------------------------------------- |
308
- | `list` | List stored manifests |
309
- | `show <runId>` | Show one manifest by run id |
310
- | `--evidence-dir PATH` | Evidence directory (or `KEIKO_EVIDENCE_DIR`; default `./.keiko/evidence`) |
311
- | `--json` | Emit as JSON |
312
-
313
- ### `keiko evaluate`
314
-
315
- Run the evaluation harness against the built-in fixtures. Offline (deterministic, no network) by default; `--live` evaluates against a configured model and fails closed when no credentials resolve. Exit `0` when every applicable dimension and surface-parity pass, `1` on a failure or runtime error, `2` on a usage error.
316
-
317
- ```bash
318
- keiko evaluate
319
- keiko evaluate --suite unit-tests --json
320
- keiko evaluate --live --model gpt-oss-120b --config ~/keiko/config.json
321
- ```
322
-
323
- | Option | Description |
324
- | ---------------- | ----------------------------------------------------------- |
325
- | `--suite NAME` | `unit-tests`, `bug-investigation`, or `all` (default `all`) |
326
- | `--fixture NAME` | Run one fixture by name (mutually exclusive with `--suite`) |
327
- | `--live` | Evaluate against a configured model (default: offline) |
328
- | `--model ID` | Override the model id for all fixtures (live mode) |
329
- | `--config PATH` | Gateway config file (or `KEIKO_CONFIG_FILE`) |
330
- | `--json` | Emit the scorecard as JSON |
331
- | `--output PATH` | Write the scorecard JSON to a file |
332
-
333
- The offline suite checks workflow plumbing deterministically. It does not measure model quality. See [Evaluation and Go/No-Go](#evaluation-and-gono-go).
334
-
335
- ### `keiko ui`
336
-
337
- Launch the local UI. The server binds to `127.0.0.1` (loopback only), prints its URL, and runs until interrupted (Ctrl+C). It serves prebuilt UI assets. The published npm package ships these assets, so `keiko ui` works immediately after install; from a source checkout, run `npm run build && npm run ui:ci && npm run build:ui` first.
338
-
339
- ```bash
340
- keiko ui
341
- keiko ui --port 4319
342
- ```
343
-
344
- | Option | Description |
345
- | --------------------- | ------------------------------------------------------------------- |
346
- | `--port PORT` | Port to bind (default: 4319) |
347
- | `--host HOST` | Validate a loopback host value; the server always binds `127.0.0.1` |
348
- | `--evidence-dir PATH` | Evidence directory for UI-run evidence |
349
- | `--config PATH` | Gateway config file required for model-backed UI runs |
350
-
351
- See [Local UI](#local-ui) and the [local UI runbook](https://github.com/oscharko-dev/Keiko/blob/dev/docs/ui-runbook.md).
352
-
353
- ---
354
-
355
- ## SDK usage
356
-
357
- Keiko ships ESM-only with full type definitions. The package entry point re-exports the public surface; import named values from `keiko`.
358
-
359
- `detectWorkspace` and `loadConfigFromFile` are synchronous and take a path string. The workflow functions take a `workspaceRoot` path (not a workspace object) plus a `deps` object carrying the model port.
360
-
361
- ### Workspace summary
362
-
363
- ```typescript
364
- import { detectWorkspace, buildWorkspaceSummary } from "@oscharko-dev/keiko";
365
-
366
- const workspace = detectWorkspace(process.cwd());
367
- const summary = buildWorkspaceSummary(workspace);
368
- console.log(summary.name, summary.counts);
369
- ```
370
-
371
- ### Generate unit tests
372
-
373
- ```typescript
374
- import {
375
- generateUnitTests,
376
- renderUnitTestReport,
377
- Gateway,
378
- GatewayModelPort,
379
- loadConfigFromFile,
380
- } from "@oscharko-dev/keiko";
381
-
382
- const config = loadConfigFromFile("./keiko.config.json", process.env);
383
- const model = new GatewayModelPort(new Gateway(config));
384
-
385
- const report = await generateUnitTests(
386
- {
387
- workspaceRoot: ".",
388
- target: { kind: "file", filePath: "src/add.ts" },
389
- modelId: config.providers[0].modelId,
390
- // apply defaults to false: a reviewable diff, no files written
391
- },
392
- { model },
393
- );
394
-
395
- console.log(report.status, report.proposedDiff);
396
- console.log(renderUnitTestReport(report));
397
- ```
398
-
399
- ### Investigate a bug
400
-
401
- ```typescript
402
- import {
403
- investigateBug,
404
- renderBugInvestigationReport,
405
- Gateway,
406
- GatewayModelPort,
407
- loadConfigFromFile,
408
- } from "@oscharko-dev/keiko";
409
-
410
- const config = loadConfigFromFile("./keiko.config.json", process.env);
411
- const model = new GatewayModelPort(new Gateway(config));
412
-
413
- const report = await investigateBug(
414
- {
415
- workspaceRoot: ".",
416
- report: { description: "login returns 500 on empty password" },
417
- modelId: config.providers[0].modelId,
418
- // apply defaults to false (dry-run)
419
- },
420
- { model },
421
- );
422
-
423
- // The report separates established facts from the model's unverified hypothesis.
424
- console.log(report.verified, report.hypothesis);
425
- console.log(renderBugInvestigationReport(report));
426
- ```
427
-
428
- ### Run verification
429
-
430
- `runVerification` takes a plan. Build it from the detected workspace and its script catalog.
431
-
432
- ```typescript
433
- import {
434
- detectWorkspace,
435
- detectScripts,
436
- buildVerificationPlan,
437
- runVerification,
438
- buildVerificationSummary,
439
- } from "@oscharko-dev/keiko";
440
-
441
- const workspace = detectWorkspace(process.cwd());
442
- const catalog = detectScripts(workspace);
443
- const plan = buildVerificationPlan(workspace, catalog, {});
444
-
445
- const report = await runVerification(plan, { workspace });
446
- console.log(buildVerificationSummary(report));
447
- console.log(report.overallStatus); // "passed" when every gate passed
448
- ```
449
-
450
- ### Inspect evidence
451
-
452
- `listEvidence` and `loadEvidence` are synchronous. The loaded data is redacted by construction.
453
-
454
- ```typescript
455
- import { createNodeEvidenceStore, listEvidence, loadEvidence } from "@oscharko-dev/keiko";
456
-
457
- const store = createNodeEvidenceStore("./.keiko/evidence");
458
-
459
- for (const entry of listEvidence(store)) {
460
- console.log(entry.runId, entry.taskType, entry.outcome, entry.finishedAt);
86
+ ```json
87
+ {
88
+ "providers": [
89
+ {
90
+ "modelId": "example-chat-model",
91
+ "baseUrl": "https://llm-gateway.example.com/v1",
92
+ "apiKey": "replace-me"
93
+ }
94
+ ]
461
95
  }
462
-
463
- const manifest = loadEvidence(store, "the-run-id");
464
- if (manifest !== undefined) {
465
- console.log(manifest.evidenceSchemaVersion);
466
- }
467
- ```
468
-
469
- ### Drive a workflow with a scripted model
470
-
471
- `createScriptedModelPort` builds a `ModelPort` that replays a fixed transcript, so you can exercise a workflow deterministically with no live model or credentials. It satisfies the same `deps.model` seam the workflows use.
472
-
473
- ```typescript
474
- import {
475
- createScriptedModelPort,
476
- generateUnitTests,
477
- type NormalizedResponse,
478
- } from "@oscharko-dev/keiko";
479
-
480
- const response: NormalizedResponse = {
481
- modelId: "scripted",
482
- content: "--- a/src/add.test.ts\n+++ b/src/add.test.ts\n+// generated test\n",
483
- finishReason: "stop",
484
- toolCalls: [],
485
- structuredOutput: null,
486
- usage: {
487
- requestId: "scripted",
488
- promptTokens: 0,
489
- completionTokens: 0,
490
- latencyMs: 1,
491
- costClass: "low",
492
- },
493
- };
494
-
495
- const model = createScriptedModelPort([response]);
496
-
497
- const report = await generateUnitTests(
498
- { workspaceRoot: ".", target: { kind: "file", filePath: "src/add.ts" }, modelId: "scripted" },
499
- { model },
500
- );
501
- console.log(report.status);
502
- ```
503
-
504
- For the full offline scorecard, run `keiko evaluate` (see [Evaluation and Go/No-Go](#evaluation-and-gono-go)).
505
-
506
- `SDK_VERSION` is exported for diagnostics. `--version` on the CLI reports the same value.
507
-
508
- ---
509
-
510
- ## Evidence output
511
-
512
- `keiko run`, workflow runs launched from the local UI, and `keiko evaluate` (offline and live) persist an `EvidenceManifest`. `keiko gen-tests` and `keiko investigate` print a reviewable report but do not persist an evidence manifest; `keiko verify` and `keiko context` are read-only summaries that persist nothing. Manifests are **redacted at construction** — secret-shaped strings, environment values, and known literal credentials are removed before anything is written. There is no code path that writes an unredacted manifest.
513
-
514
- Manifests are written with an exclusive-create (`O_EXCL`) open into a directory whose real path is verified to be inside the evidence root. The default location is `$KEIKO_EVIDENCE_DIR` or `.keiko/evidence` under the workspace.
-
- Retention keeps the newest runs up to a maximum (`DEFAULT_RETENTION`, 50 runs). Every manifest carries a stable `EVIDENCE_SCHEMA_VERSION`; readers reject unknown versions rather than guessing.
-
- Inspect manifests with `keiko evidence list` and `keiko evidence show <runId>`. See [ADR-0010](https://github.com/oscharko-dev/Keiko/blob/dev/docs/adr/README.md#adr-0010).
-
- ---
-
- ## Local UI
-
- `keiko ui` serves a single-user web surface for the workflows and evidence. It binds to `127.0.0.1` by default, checks `Host` and `Origin` headers to block DNS-rebinding, serves a strict Content-Security-Policy, and renders only redacted views. The apply action uses the same gated, dry-run-default path as the CLI.
-
- The server runs until you interrupt it (Ctrl+C). For setup, surfaces, and troubleshooting, see the [local UI runbook](https://github.com/oscharko-dev/Keiko/blob/dev/docs/ui-runbook.md).
-
- Multi-user access, authentication, and remote hosting are out of scope for Wave 1.
-
- ---
-
- ## Security and audit boundaries
-
- Keiko's boundaries are explicit, and so are their limits. In summary:
-
- - **Workspace access** is confined to the workspace root by a lexical and real-path check; secret-shaped files are always denied.
- - **Command execution** runs an allowlist with no shell interpretation, an ephemeral HOME, and resource ceilings.
- - **Patches** are dry-run by default and guarded by path scope; applying requires an explicit opt-in and is followed by verification.
- - **The UI** is local-only with DNS-rebinding defense and a strict CSP.
- - **No unattended merge.** A human reviews every change. This is a hard invariant of the pilot.
-
- Wave 1 is **not** OS-level isolation. Allowlisted project scripts (for example `npm test`) can run repository-authored code; the boundary protects the host outside the workspace, not the workspace from itself. For the full picture and the explicit limitations, read [Security and audit boundaries](https://github.com/oscharko-dev/Keiko/blob/dev/docs/security-and-audit-boundaries.md).
-
- ---
-
- ## Evaluation and Go/No-Go
-
- `keiko evaluate` produces a scorecard, not a verdict. The Wave 1 pilot decision is made by people, using the scorecard plus run evidence.
-
- - Offline (`keiko evaluate`) checks workflow plumbing deterministically against scripted responses. It does not measure model quality.
- - Live (`keiko evaluate --live`) runs the same suite against a configured model endpoint.
-
- See [Go/No-Go criteria](https://github.com/oscharko-dev/Keiko/blob/dev/docs/pilot/go-no-go.md) and the [model capability guide](https://github.com/oscharko-dev/Keiko/blob/dev/docs/pilot/model-capability-guide.md).
-
- ---
-
- ## Packaging
-
- The published tarball ships `dist/`, `README.md`, `LICENSE`, `NOTICE`, and `TRADEMARKS.md`. A surface check enforces that package boundary and rejects source, docs, source maps, and secret files. Runtime dependencies are intentionally minimal; the root package currently uses `ws` for the browser CDP transport. Supply-chain review is covered by CI dependency review, CodeQL, audit steps, and SBOM builds. Inspect the surface with:
-
- ```bash
- npm pack --dry-run
  ```
 
- Publishing the package is out of scope for Wave 1. See [npm packaging](https://github.com/oscharko-dev/Keiko/blob/dev/docs/npm-packaging.md) for the exact prepack chain and surface check.
-
- ---
-
- ## Future architecture path
+ Environment variables can override file values:
 
- Wave 1 is npm-first and TypeScript-first: a CLI, an SDK, and a local UI that run on a developer machine or a CI runner with no managed control plane. This keeps the pilot's footprint small and its trust boundary local.
+ | Variable | Purpose |
+ | --------------------------- | ------------------------------ |
+ | `KEIKO_CONFIG_FILE` | Path to a gateway config file. |
+ | `KEIKO_DEFAULT_BASE_URL` | Fallback gateway base URL. |
+ | `KEIKO_DEFAULT_API_KEY` | Fallback gateway API token. |
+ | `KEIKO_MODEL_<ID>_BASE_URL` | Per-model base URL override. |
+ | `KEIKO_MODEL_<ID>_API_KEY` | Per-model API token override. |
+ | `KEIKO_UI_PORT` | Local UI port override. |
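Editorial note on the added table above: one plausible way to layer these variables over file values is sketched below. The precedence order (per-model variable, then file value, then `KEIKO_DEFAULT_*` fallback), the `GatewaySettings` shape, and the rule for turning a model id into `<ID>` are all assumptions made for illustration; the README does not specify them, and Keiko's actual resolution may differ.

```typescript
// Hypothetical shapes for illustration; Keiko's real config types may differ.
interface GatewaySettings {
  baseUrl?: string;
  apiKey?: string;
}
type Env = Record<string, string | undefined>;

// Assumption: <ID> is the model id upper-cased, with each run of
// non-alphanumeric characters replaced by "_".
function envModelId(modelId: string): string {
  return modelId.toUpperCase().replace(/[^A-Z0-9]+/g, "_");
}

// Assumed precedence: per-model env override > file value > default env fallback.
function resolveGateway(modelId: string, file: GatewaySettings, env: Env): GatewaySettings {
  const id = envModelId(modelId);
  return {
    baseUrl: env[`KEIKO_MODEL_${id}_BASE_URL`] ?? file.baseUrl ?? env["KEIKO_DEFAULT_BASE_URL"],
    apiKey: env[`KEIKO_MODEL_${id}_API_KEY`] ?? file.apiKey ?? env["KEIKO_DEFAULT_API_KEY"],
  };
}

// Usage with placeholder values: the per-model URL wins over the file value,
// and the API key falls back to the default variable.
const resolved = resolveGateway(
  "gpt-4o",
  { baseUrl: "https://file.example/v1" },
  { KEIKO_MODEL_GPT_4O_BASE_URL: "https://env.example/v1", KEIKO_DEFAULT_API_KEY: "placeholder" },
);
console.log(resolved.baseUrl);
```

Passing `env` as a parameter instead of reading `process.env` directly keeps the function easy to test.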
 
- A later phase may add a cloud-native backend for teams that want shared evaluation, central evidence, or larger workloads. If it does, the CLI and UI stay lightweight local clients; the local-first path remains supported. Multi-user access, authentication, and a hosted UI are explicitly out of scope for Wave 1.
+ Do not commit gateway config files, API tokens, `.keiko/`, or evidence that contains project-specific review material unless your process explicitly requires it.
 
- ---
+ ## Security Boundaries
 
- ## Documentation index
+ Keiko is a local tool, not a remote service.
 
- Repository documentation (not shipped in the package):
+ - The UI binds to `127.0.0.1`.
+ - API keys are accepted from local config, local environment, or the first-run UI flow.
+ - Credentials are redacted from logs, evidence, and browser responses.
+ - Workspace reads are bounded by the selected local project path.
+ - Commands are allowlisted and run without a shell.
+ - Generated patches are dry-run by default and must be reviewed before application.
+ - Evidence is redacted before it is written.
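Editorial note on the added list above: the "allowlisted and run without a shell" bullet can be illustrated with Node's `spawnSync`, which passes arguments directly to the program when `shell` is `false`. This is a sketch only; `runAllowlisted` and its allowlist are hypothetical and do not describe Keiko's actual policy.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical helper: run `command` only if it is on the allowlist, and
// never through a shell, so metacharacters in `args` are not interpreted.
function runAllowlisted(
  allowlist: ReadonlySet<string>,
  command: string,
  args: readonly string[],
): string {
  if (!allowlist.has(command)) {
    throw new Error(`command not allowlisted: ${command}`);
  }
  const result = spawnSync(command, [...args], {
    shell: false, // arguments go to the program verbatim
    encoding: "utf8",
    timeout: 10_000, // illustrative ceiling, in milliseconds
  });
  if (result.error) throw result.error;
  if (result.status !== 0) {
    throw new Error(`command exited with status ${String(result.status)}`);
  }
  return result.stdout;
}

// Usage: allow only the current Node binary and run a trivial script with it.
const allowed = new Set([process.execPath]);
const output = runAllowlisted(allowed, process.execPath, ["-e", "process.stdout.write('ok')"]);
console.log(output);
```

An allowlist of this kind limits which programs start; as the known limits state, it does not sandbox what an allowed project script does once it runs.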
 
- | Document | Audience |
- | --------------------------------------------------------------------------------------------------------------------- | ----------------------------------- |
- | [Customer pilot runbook](https://github.com/oscharko-dev/Keiko/blob/dev/docs/pilot/runbook.md) | Pilot teams, evaluators, reviewers |
- | [Go/No-Go criteria](https://github.com/oscharko-dev/Keiko/blob/dev/docs/pilot/go-no-go.md) | Pilot sponsors, leads, review board |
- | [Model capability guide](https://github.com/oscharko-dev/Keiko/blob/dev/docs/pilot/model-capability-guide.md) | Pilot evaluators, operators |
- | [Security and audit boundaries](https://github.com/oscharko-dev/Keiko/blob/dev/docs/security-and-audit-boundaries.md) | Security and regulated reviewers |
- | [Local UI runbook](https://github.com/oscharko-dev/Keiko/blob/dev/docs/ui-runbook.md) | UI operators and reviewers |
- | [npm packaging](https://github.com/oscharko-dev/Keiko/blob/dev/docs/npm-packaging.md) | Release engineers |
+ Known limits:
 
- Architecture Decision Records live in [`docs/adr/`](https://github.com/oscharko-dev/Keiko/tree/dev/docs/adr).
-
- ---
-
- ## Development
-
- ```bash
- npm install
- npm run build
- npm test
- npm run lint
- npm run typecheck
- npm run format
- ```
+ - Keiko is not a sandbox or OS-level isolation layer.
+ - Evidence files are ordinary local files, not encrypted or tamper-evident records.
+ - Local project scripts can execute repository code when you run verification.
+ - Do not run Keiko against untrusted repositories.
 
- Contributions follow the delivery standard in [`CONTRIBUTING.md`](CONTRIBUTING.md): strict TypeScript, tested behavior, conventional commits with an issue number, and reviewable, evidence-backed changes.
+ ## Troubleshooting
 
- ---
+ | Symptom | Check |
+ | --------------------- | ------------------------------------------------------------------------------------------------------- |
+ | UI does not open | Run `npm run keiko:status`, then inspect `.keiko/ui.log`. |
+ | Port is busy | Start with `KEIKO_UI_PORT=1984 npm run keiko:start` or stop the process using the port. |
+ | No model appears | Reopen Settings, verify the base URL and token, then run the credential test again. |
+ | Credential test fails | Confirm the gateway accepts OpenAI-compatible chat-completions requests at the configured base URL. |
+ | Stale process state | Run `npm run keiko:stop`, delete `.keiko/ui.pid` if the process is no longer running, then start again. |
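Editorial note on the "Stale process state" row above: whether a PID file still points at a live process can be checked with signal `0`, which tests for existence without delivering a signal. The sketch below assumes the PID file holds a single decimal PID; that format, and the `pidFileState` helper, are assumptions for illustration rather than documented Keiko behavior.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Signal 0 performs an existence check only. EPERM means the process exists
// but belongs to another user, so it still counts as alive.
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

type PidState = "missing" | "running" | "stale";

// Assumption: the PID file contains one decimal PID and nothing else.
function pidFileState(pidFile: string): PidState {
  let raw: string;
  try {
    raw = fs.readFileSync(pidFile, "utf8");
  } catch {
    return "missing";
  }
  const pid = Number.parseInt(raw.trim(), 10);
  if (!Number.isInteger(pid) || pid <= 0) return "stale";
  return isProcessAlive(pid) ? "running" : "stale";
}

// Usage: a PID file that names the current process reports "running".
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "keiko-pid-"));
const demoPidFile = path.join(demoDir, "ui.pid");
fs.writeFileSync(demoPidFile, String(process.pid));
console.log(pidFileState(demoPidFile));
```

Only a "stale" result would justify deleting the file by hand, as the table row describes.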
 
- ## License and attribution
+ ## Further Reading
 
- Keiko is licensed under Apache-2.0. See [`LICENSE`](LICENSE).
+ - [Local UI guide](https://github.com/oscharko-dev/Keiko/blob/dev/docs/ui-runbook.md)
+ - [Security boundaries](https://github.com/oscharko-dev/Keiko/blob/dev/docs/security-and-audit-boundaries.md)
+ - [Pilot guide](https://github.com/oscharko-dev/Keiko/blob/dev/docs/pilot/runbook.md)
+ - [Pilot evaluation](https://github.com/oscharko-dev/Keiko/blob/dev/docs/pilot/go-no-go.md)
 
- The `NOTICE` file carries the package attribution for Keiko and oscharko-dev and
- ships with the npm package. Redistributors must preserve applicable copyright,
- license, and NOTICE attribution as required by Apache-2.0.
+ ## License
 
- The Keiko name, logo, visual identity, and oscharko-dev origin identifiers are
- covered by the repository's trademark and brand policy. Truthful attribution and
- compatibility references are permitted, but forks and derivative distributions
- must not imply that they are the official Keiko project or endorsed by
- oscharko-dev. See [`TRADEMARKS.md`](TRADEMARKS.md).
+ Apache-2.0. See `LICENSE`, `NOTICE`, and `TRADEMARKS.md`.