llm-cli-gateway 1.14.0 → 1.15.0
- package/CHANGELOG.md +225 -46
- package/dist/async-job-manager.js +9 -3
- package/dist/index.d.ts +101 -0
- package/dist/index.js +311 -26
- package/dist/session-manager.d.ts +20 -2
- package/dist/session-manager.js +28 -3
- package/dist/worktree-manager.d.ts +41 -0
- package/dist/worktree-manager.js +214 -0
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED
|
@@ -2,6 +2,148 @@
|
|
|
2
2
|
|
|
3
3
|
All notable changes to the llm-cli-gateway project.
|
|
4
4
|
|
|
5
|
+
## [1.15.0] - 2026-05-28 — Phase 4 slice λ (gateway-owned worktree lifecycle)
|
|
6
|
+
|
|
7
|
+
Ships the tenth Phase 4 slice: a new top-level `worktree` field on every
|
|
8
|
+
`*_request` and `*_request_async` tool lets a caller run the request
|
|
9
|
+
inside a dedicated git worktree owned and lifecycle-managed by the
|
|
10
|
+
gateway. The provider audit listed `-w/--worktree` as a per-CLI flag on
|
|
11
|
+
Claude / Gemini / Grok; this slice deliberately does **not** wire any
|
|
12
|
+
`-w` passthrough. Instead the gateway pre-creates a worktree via
|
|
13
|
+
`git worktree add`, spawns the child CLI with `cwd: <worktree-path>`,
|
|
14
|
+
and persists `worktreePath` on `session.metadata` for reuse. Five CLIs
|
|
15
|
+
× two transports (sync + async) = ten tools all share one resolver, so
|
|
16
|
+
the surface lands as one Zod schema + one helper per tool rather than
|
|
17
|
+
five-times-two per-CLI argv wirings.
|
|
18
|
+
|
|
19
|
+
### Added — gateway-owned worktree surface
|
|
20
|
+
|
|
21
|
+
- **`WORKTREE_SCHEMA`** (`src/index.ts`): top-level Zod field
|
|
22
|
+
registered on all ten tools — `claude_request`, `codex_request`,
|
|
23
|
+
`gemini_request`, `grok_request`, `mistral_request`, plus the five
|
|
24
|
+
`*_request_async` siblings. Accepts `true` (anonymous UUID worktree
|
|
25
|
+
at `<repoRoot>/.worktrees/<uuid>` branched from HEAD) or
|
|
26
|
+
`{ name?, ref? }` (sanitised name and/or explicit git ref).
|
|
27
|
+
- **`src/worktree-manager.ts`** (new file, 277 lines):
|
|
28
|
+
`sanitizeWorktreeName` (rejects path traversal — `..`, leading `/`,
|
|
29
|
+
control chars, length > 64), `createWorktree`
|
|
30
|
+
(`git rev-parse --verify <ref>` before `git worktree add`,
|
|
31
|
+
collision detection via `WorktreeCollisionError`, branch-namespaced
|
|
32
|
+
`gateway/<name>` worktrees), `removeWorktree`
|
|
33
|
+
(`git worktree remove --force`), and `createWorktreeSessionCleanupHook`
|
|
34
|
+
(hooks into session manager).
|
|
35
|
+
- **`resolveWorktreeForRequest`** (`src/index.ts`): single per-request
|
|
36
|
+
resolver consumed by every tool handler. When the request carries
|
|
37
|
+
a `sessionId` and the session already has `metadata.worktreePath`,
|
|
38
|
+
the worktree is reused (no second `git worktree add`); otherwise a
|
|
39
|
+
new worktree is created and persisted onto the session via
|
|
40
|
+
`updateSessionMetadata`. The resolved path is threaded to the
|
|
41
|
+
executor via the existing `cwd` plumbing.
|
|
42
|
+
- **`formatWorktreePrefix(path)`** (`src/index.ts:826`): every
|
|
43
|
+
successful tool result is prefixed with
|
|
44
|
+
`[gateway] worktree=<absolute-path>\n` so the caller can drive
|
|
45
|
+
`Bash(cd <path>)`, `Read <path>/...`, etc. Empty when the request
|
|
46
|
+
did not use a worktree (zero behaviour change for non-λ callers).
|
|
47
|
+
- **`Session.metadata` extension** (`src/session-manager.ts`):
|
|
48
|
+
`worktreePath` + `worktreeName` land on the existing `metadata`
|
|
49
|
+
bag — no `Session` interface changes. `FileSessionManager` accepts
|
|
50
|
+
a `cleanupHook` option that fires on `deleteSession` and on
|
|
51
|
+
TTL-driven eviction; the hook calls `git worktree remove --force`
|
|
52
|
+
before the session record is dropped.
|
|
53
|
+
- **`AsyncJobManager` cwd-aware dedup** (`src/async-job-manager.ts`):
|
|
54
|
+
the dedup key now includes the resolved `cwd`, so two
|
|
55
|
+
`*_request_async` calls with identical argv but different
|
|
56
|
+
worktree paths cannot collide (REGRESSIONS Lθ).
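The name checks listed for `sanitizeWorktreeName` above (path traversal via `..`, a leading `/`, control characters, length over 64) can be sketched as follows. This is a minimal illustration, not the code in `src/worktree-manager.ts`: the function name `sanitizeWorktreeNameSketch`, the error messages, the empty-name check, and the choice to throw instead of rewriting the name are all assumptions.

```typescript
// Hypothetical sketch of the validation rules named in the changelog.
// The real sanitizeWorktreeName may normalise or reject differently.
const MAX_WORKTREE_NAME_LENGTH = 64; // the changelog rejects "length > 64"

function sanitizeWorktreeNameSketch(name: string): string {
  if (name.length === 0) {
    // Assumption: an empty name is treated as invalid.
    throw new Error("worktree name must not be empty");
  }
  if (name.length > MAX_WORKTREE_NAME_LENGTH) {
    throw new Error(`worktree name is longer than ${MAX_WORKTREE_NAME_LENGTH} characters`);
  }
  if (name.startsWith("/")) {
    throw new Error("worktree name must not start with '/'");
  }
  if (name.includes("..")) {
    // Any ".." is refused so the name cannot climb out of .worktrees/.
    throw new Error("worktree name must not contain '..'");
  }
  if (/[\u0000-\u001f\u007f]/.test(name)) {
    throw new Error("worktree name must not contain control characters");
  }
  return name;
}
```

Under these assumptions, `sanitizeWorktreeNameSketch("feature-x")` returns the name unchanged, while `"../escape"`, `"/abs"`, and a 65-character name all throw before any `git` command would run.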
|
|
57
|
+
|
|
58
|
+
### Out of scope — explicitly deferred
|
|
59
|
+
|
|
60
|
+
- **Grok's `worktree` subcommand** (separate top-level subcommand
|
|
61
|
+
on the Grok CLI, distinct from `-w/--worktree`).
|
|
62
|
+
- **Claude's `--tmux`** (terminal-multiplexer integration).
|
|
63
|
+
- **Startup sweep of orphaned `.worktrees/*`** — left to future
|
|
64
|
+
housekeeping; the cleanup hook covers the happy path
|
|
65
|
+
(session_delete + TTL eviction).
|
|
66
|
+
- **Multi-repo / submodule semantics** — gateway assumes a single
|
|
67
|
+
primary repo at `<repoRoot>`; multi-root behaviour is undefined.
|
|
68
|
+
|
|
69
|
+
### Test surface
|
|
70
|
+
|
|
71
|
+
`940 → 989` tests pass (+49):
|
|
72
|
+
|
|
73
|
+
- **`src/__tests__/worktree-manager.test.ts`** (new, 26 tests) —
|
|
74
|
+
unit-tests for `sanitizeWorktreeName`, `createWorktree` (including
|
|
75
|
+
the rev-parse-before-add invariant + `WorktreeCollisionError`),
|
|
76
|
+
`removeWorktree`, and `createWorktreeSessionCleanupHook`.
|
|
77
|
+
- **`src/__tests__/test-veracity-regressions-slice-lambda.test.ts`**
|
|
78
|
+
(new, 23 tests across REGRESSIONS Lα–Lθ + Lψ):
|
|
79
|
+
- **Lα** — `sanitizeWorktreeName` path-traversal rejection.
|
|
80
|
+
- **Lβ** — `createWorktree` runs `git rev-parse --verify` BEFORE
|
|
81
|
+
`git worktree add`.
|
|
82
|
+
- **Lγ** — `resolveWorktreeForRequest` persists `worktreePath`
|
|
83
|
+
onto session metadata via `updateSessionMetadata`.
|
|
84
|
+
- **Lδ** — same-session reuse: the second request with the same
|
|
85
|
+
`sessionId` skips `git worktree add`.
|
|
86
|
+
- **Lε** — `FileSessionManager.deleteSession` invokes the cleanup
|
|
87
|
+
hook (and TTL eviction does too).
|
|
88
|
+
- **Lζ** — `executor.executeCli` honours the resolved `cwd`.
|
|
89
|
+
- **Lη** — contract-as-negative-oracle: no CLI receives
|
|
90
|
+
`-w`/`--worktree` in emitted argv across all five providers
|
|
91
|
+
(pairs with slice δ's contract-as-positive-oracle).
|
|
92
|
+
- **Lθ** — `AsyncJobManager` dedup key includes `cwd`.
|
|
93
|
+
- **Lψ** — `formatWorktreePrefix` envelope shape locked
|
|
94
|
+
(`[gateway] worktree=<abs>\n`; empty when path missing).
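The envelope shape locked by Lψ can be illustrated with a short sketch. Only the prefix format (`[gateway] worktree=<abs>\n`, empty when no path is present) comes from the changelog; the function names `formatWorktreePrefixSketch` and `parseWorktreePrefix` are hypothetical, and the parser is a caller-side convenience that the gateway is not claimed to ship.

```typescript
// Sketch of the documented prefix: "[gateway] worktree=<abs>\n", or "" without a path.
function formatWorktreePrefixSketch(worktreePath?: string): string {
  return worktreePath ? `[gateway] worktree=${worktreePath}\n` : "";
}

// Hypothetical caller-side helper: split a tool result into the worktree path
// (if the prefix is present) and the remaining body.
function parseWorktreePrefix(result: string): { worktreePath?: string; body: string } {
  const match = /^\[gateway\] worktree=([^\n]+)\n/.exec(result);
  if (!match) {
    return { body: result };
  }
  return { worktreePath: match[1], body: result.slice(match[0].length) };
}
```

A caller that receives a prefixed result could use the parsed path to run follow-up commands inside the worktree, and a result without the prefix passes through untouched.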
|
|
95
|
+
|
|
96
|
+
### Multi-LLM strict-evidence audit
|
|
97
|
+
|
|
98
|
+
Per the standing protocol (`feedback_test_veracity_audit_protocol`
|
|
99
|
+
|
|
100
|
+
and `feedback_multi_llm_review_gate`), the slice was audited in round 1
|
|
101
|
+
on 2026-05-28 against `docs/plans/slice-lambda.spec.md`.
|
|
102
|
+
|
|
103
|
+
**Round 1 outcomes:**
|
|
104
|
+
|
|
105
|
+
- Codex: UNCONDITIONAL APPROVE — 9/9 mutation probes RED as
|
|
106
|
+
predicted; per-probe verbatim assertion text and pre/post-revert
|
|
107
|
+
test counts. Worktree at `audit/codex-round-1`.
|
|
108
|
+
- Grok: UNCONDITIONAL APPROVE — 9/9 RED, per-probe verbatim
|
|
109
|
+
assertion text. Worktree at `audit/grok-round-1`.
|
|
110
|
+
- Mistral: UNCONDITIONAL APPROVE — 9/9 RED with per-probe failed-
|
|
111
|
+
count summaries. Worktree at `audit/mistral-round-1`
|
|
112
|
+
(`5d75099`).
|
|
113
|
+
- Gemini: **PARTIAL (quota-blocked)** — confirmed Lα–Lε RED (5/9)
|
|
114
|
+
with assertion text matching the substantive reviewers before
|
|
115
|
+
`TerminalQuotaError` (4h35m reset window > round budget) forced
|
|
116
|
+
a stop. No findings, no contradictions.
|
|
117
|
+
- Claude: **STRUCTURAL BLOCKER** — two `claude_request_async`
|
|
118
|
+
jobs (`135c05c3-…`, `e411e8cc-…`) stalled silently
|
|
119
|
+
(`stdoutBytes: 0` for ≥10 minutes); the second produced a
|
|
120
|
+
1126-byte fabricated meta-summary with no per-probe evidence,
|
|
121
|
+
rejected per the strict-evidence rule. Documented stall pattern,
|
|
122
|
+
not a defect in slice λ.
|
|
123
|
+
|
|
124
|
+
Four out of five independent vendor voices contributed evidence
|
|
125
|
+
(three full + one partial corroborating) with one documented
|
|
126
|
+
unfixable structural block, satisfying the slice-δ "4/5 minimum
|
|
127
|
+
with documented block" bar. The three full audits are unanimous;
|
|
128
|
+
the partial fourth corroborates without contradiction. Verdict:
|
|
129
|
+
slice λ passes the gate and ships as v1.15.0.
|
|
130
|
+
|
|
131
|
+
Full per-reviewer reports preserved at
|
|
132
|
+
`docs/reviews/slice-lambda/{README,round-1-{codex,grok,mistral,
|
|
133
|
+
gemini,claude}}.md`.
|
|
134
|
+
|
|
135
|
+
### Mechanical anchors (verify with `rg` before relying)
|
|
136
|
+
|
|
137
|
+
- `src/worktree-manager.ts` — new module, 277 lines.
|
|
138
|
+
- `src/index.ts` — `WORKTREE_SCHEMA` (`:419-444`),
|
|
139
|
+
`formatWorktreePrefix` (`:826-828`), `resolveWorktreeForRequest`
|
|
140
|
+
and per-tool prefix injection (search `formatWorktreePrefix(`),
|
|
141
|
+
10 × `worktree: WORKTREE_SCHEMA.optional()` registrations on
|
|
142
|
+
every `*_request` / `*_request_async` tool input.
|
|
143
|
+
- `src/session-manager.ts` — `cleanupHook` plumbing
|
|
144
|
+
(`:53-90, 318-342`).
|
|
145
|
+
- `src/async-job-manager.ts` — dedup-key cwd inclusion.
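The cwd-aware dedup key can be sketched as below. The changelog only states that the key now includes the resolved `cwd`; the key encoding (`JSON.stringify`), the class `DedupIndexSketch`, and its `start` method are assumptions made for illustration and do not describe the real `AsyncJobManager`.

```typescript
// Hypothetical key builder: identical argv in different working directories
// must produce different keys.
function dedupKeySketch(cli: string, argv: readonly string[], cwd?: string): string {
  return JSON.stringify([cli, argv, cwd ?? null]);
}

// Hypothetical in-memory index mapping a dedup key to the job already running for it.
class DedupIndexSketch {
  private readonly running = new Map<string, string>();

  // Returns the id of an identical in-flight job, or registers and returns newJobId.
  start(cli: string, argv: readonly string[], cwd: string | undefined, newJobId: string): string {
    const key = dedupKeySketch(cli, argv, cwd);
    const existing = this.running.get(key);
    if (existing !== undefined) {
      return existing;
    }
    this.running.set(key, newJobId);
    return newJobId;
  }
}
```

With this shape, two requests that share a CLI and argv but run in different worktrees register as separate jobs, while a repeat in the same worktree resolves to the job already in flight.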
|
|
146
|
+
|
|
5
147
|
## [1.14.0] - 2026-05-28 — Phase 4 slice κ (Claude explicit `cache_control` via `--input-format stream-json`)
|
|
6
148
|
|
|
7
149
|
Ships the ninth Phase 4 slice. Callers can now opt their stable
|
|
@@ -31,7 +173,7 @@ falsifiability-tightening commits driven by the multi-LLM review gate.
|
|
|
31
173
|
- **`prepareClaudeRequest` κ branch** (`src/index.ts`): when the
|
|
32
174
|
caller marks any block AND requests `outputFormat: "stream-json"`,
|
|
33
175
|
argv switches to `-p --input-format stream-json --output-format
|
|
34
|
-
|
|
176
|
+
stream-json --include-partial-messages --verbose` with NO positional
|
|
35
177
|
prompt; the prep result carries `stdinPayload` + `cacheControlBlocks`.
|
|
36
178
|
Mixing `cacheControl` with `text`/`json` output returns an
|
|
37
179
|
actionable error instead of silently coercing.
|
|
@@ -120,7 +262,7 @@ APPROVE) is preserved in commit history (`bea1aee` and `bbc3b5f`).
|
|
|
120
262
|
|
|
121
263
|
- κ adds caller-side reuse ON TOP of the irreducible ~10–12K
|
|
122
264
|
`cache_creation` token floor that every fresh `claude -p` session
|
|
123
|
-
rebuilds (Claude Code's session-wrap content). The
|
|
265
|
+
rebuilds (Claude Code's session-wrap content). The _added_ benefit
|
|
124
266
|
scales with the caller's stable block size, not the total prompt.
|
|
125
267
|
- The `ttl='1h'` hard-code is mandatory because Anthropic rejects a
|
|
126
268
|
`5m` block after Claude Code's own 1h-marked session blocks; the
|
|
@@ -160,7 +302,7 @@ Patch release. Single user-facing fix to `claude_request` /
|
|
|
160
302
|
- Claude CLI 2.x rejects `--print --output-format=stream-json` without
|
|
161
303
|
`--verbose` ("When using --print, --output-format=stream-json requires
|
|
162
304
|
--verbose"). The gateway was emitting `--output-format stream-json
|
|
163
|
-
|
|
305
|
+
--include-partial-messages` without `--verbose`, so every claude
|
|
164
306
|
request configured for stream-json (sync or async) was exiting 1.
|
|
165
307
|
- `prepareClaudeRequest` now pushes `--verbose` as part of the
|
|
166
308
|
stream-json arg group. `--verbose` only affects what claude writes to
|
|
@@ -174,7 +316,7 @@ Patch release. Single user-facing fix to `claude_request` /
|
|
|
174
316
|
recorded in the FR for the first time since the CLI started enforcing
|
|
175
317
|
`--verbose`.
|
|
176
318
|
- Direct CLI verification: `claude -p ... --output-format stream-json
|
|
177
|
-
|
|
319
|
+
--verbose --include-partial-messages` returned a clean NDJSON stream
|
|
178
320
|
with `cache_read_input_tokens: 17978` and
|
|
179
321
|
`cache_creation_input_tokens: 17435` on a 1-hour-cache-enabled
|
|
180
322
|
account. The parser path is correct; only the missing flag was
|
|
@@ -184,7 +326,7 @@ Patch release. Single user-facing fix to `claude_request` /
|
|
|
184
326
|
|
|
185
327
|
- New regression: `prepareClaudeRequest` emits `--verbose` when
|
|
186
328
|
`outputFormat: "stream-json"` and does NOT emit it for `text` / `json`
|
|
187
|
-
(src
|
|
329
|
+
(`src/__tests__/claude-handler.test.ts`).
|
|
188
330
|
- Updated `upstream-contracts.test.ts` "accepts a valid Claude argv
|
|
189
331
|
emitted by the gateway" to pin the three-flag combo so a future
|
|
190
332
|
removal of `--verbose` fails at the contract gate.
|
|
@@ -254,7 +396,7 @@ regressions) plus this release commit.
|
|
|
254
396
|
enumerate). Also settable via the `GROK_SANDBOX` env var. Caller
|
|
255
397
|
responsibility to pass a valid profile name. The slice deliberately
|
|
256
398
|
does **not** integrate `--sandbox` with `approvalStrategy:
|
|
257
|
-
|
|
399
|
+
"mcp_managed"` because the value is unbounded — Grok's approval
|
|
258
400
|
semantics are already covered by `permissionMode` + `alwaysApprove` +
|
|
259
401
|
`approvalStrategy`.
|
|
260
402
|
- **`rules`** → `--rules <RULES>`. Supports `@file` prefix per
|
|
@@ -320,7 +462,7 @@ parallel with mandatory mutation-probe execution against
|
|
|
320
462
|
|
|
321
463
|
- Codex: UNCONDITIONAL APPROVE — all 12 probes [as predicted], all
|
|
322
464
|
26 tests VERIFIED. Baseline (`npm test`: 55 files / 884 tests; build
|
|
323
|
-
|
|
465
|
+
and format:check clean; slice file 31/31).
|
|
324
466
|
- Grok: UNCONDITIONAL APPROVE — all 12 probes [as predicted]; ran in
|
|
325
467
|
an isolated worktree at `/tmp/theta-audit-grok` per the slice-ζ
|
|
326
468
|
reviewer-stomping lesson.
|
|
@@ -330,8 +472,8 @@ parallel with mandatory mutation-probe execution against
|
|
|
330
472
|
beyond the spec and closes the "enum-mistake stays silent if fixture
|
|
331
473
|
uses a listed value" gap.
|
|
332
474
|
- Gemini: **FAILED at 10s** with `TerminalQuotaError: You have
|
|
333
|
-
|
|
334
|
-
|
|
475
|
+
exhausted your capacity on this model. Your quota will reset after
|
|
476
|
+
52m10s.` (Google 429). Documented quota blocker per protocol clause
|
|
335
477
|
5+6 — counts as "concrete unfixable when documented". Four
|
|
336
478
|
substantive valid approves from independent vendor families (OpenAI,
|
|
337
479
|
xAI, Mistral, Anthropic) satisfy the gate.
|
|
@@ -500,7 +642,7 @@ this release commit.
|
|
|
500
642
|
so no extra gating required.
|
|
501
643
|
- Both tools accept a new `jsonSchema` field
|
|
502
644
|
(`string | Record<string, unknown>`). Per `claude --help`, the CLI
|
|
503
|
-
argument is the JSON Schema
|
|
645
|
+
argument is the JSON Schema _literal_ (not a path; contrast with Codex
|
|
504
646
|
`--output-schema`). Object values are `JSON.stringify`-d; string values
|
|
505
647
|
pass verbatim. Use with `outputFormat: "json"` for structured output
|
|
506
648
|
validation. Achieves Codex parity for structured-output validation
|
|
@@ -798,7 +940,7 @@ for the async tools and the codex CLI.
|
|
|
798
940
|
already terminated before the arm signal landed.
|
|
799
941
|
- `JobStore.markOrphanedOnStartup()` return shape extended from `number`
|
|
800
942
|
to `{ count, orphaned: Array<{ id, correlationId, startedAt, stdout,
|
|
801
|
-
|
|
943
|
+
stderr, exitCode }> }` so the manager constructor can write FR
|
|
802
944
|
`logComplete` rows for previously orphaned jobs with proper audit data
|
|
803
945
|
(durationMs from `startedAt`, response from `stderr || stdout`,
|
|
804
946
|
errorMessage `"orphaned after gateway restart"`). `SqliteJobStore`
|
|
@@ -930,8 +1072,9 @@ Pure documentation release; zero source-code changes since 1.6.0.
|
|
|
930
1072
|
### Fixed — `docs/launch/blog-cache-awareness.md` accuracy + voice
|
|
931
1073
|
|
|
932
1074
|
Technical corrections from the multi-LLM voice + technical review:
|
|
1075
|
+
|
|
933
1076
|
- Mutually-exclusive error-string quotation reformatted so the
|
|
934
|
-
``provide exactly one of `prompt`
|
|
1077
|
+
``provide exactly one of `prompt` or `promptParts` `` example renders
|
|
935
1078
|
correctly in markdown.
|
|
936
1079
|
- `lastWriteAt` references corrected to `lastRequestAt` (the actual
|
|
937
1080
|
public field name on `SessionCacheStats`).
|
|
@@ -1002,8 +1145,7 @@ Also includes (beyond cache-awareness):
|
|
|
1002
1145
|
The gateway concatenates in canonical order (`system → tools → context → task`)
|
|
1003
1146
|
so the stable prefix bytes precede the volatile task tail unchanged across
|
|
1004
1147
|
calls — raising implicit cache hit rate without calling provider cache APIs.
|
|
1005
|
-
The exact error strings `provide exactly one of \`prompt\` or \`promptParts\``
|
|
1006
|
-
and `one of \`prompt\` or \`promptParts\` is required` are stable API
|
|
1148
|
+
The exact error strings ``provide exactly one of `prompt` or `promptParts` `` and ``one of `prompt` or `promptParts` is required`` are stable API
|
|
1007
1149
|
contract.
|
|
1008
1150
|
- **Flight-recorder v3 migration**: new columns `stable_prefix_hash`
|
|
1009
1151
|
(sha256) and `stable_prefix_tokens` (integer bytes/4 heuristic) on
|
|
@@ -1034,9 +1176,9 @@ Also includes (beyond cache-awareness):
|
|
|
1034
1176
|
- `warn_on_ttl_expiry = false`
|
|
1035
1177
|
- `[cache_awareness.min_stable_tokens_for_cache_control]` per-family
|
|
1036
1178
|
table (sonnet=1024, opus=4096, haiku=4096, default=4096).
|
|
1037
|
-
|
|
1038
|
-
|
|
1039
|
-
|
|
1179
|
+
Validated by a separate Zod schema and loader (`loadCacheAwarenessConfig`);
|
|
1180
|
+
a malformed `[cache_awareness]` block does NOT break `loadPersistenceConfig`
|
|
1181
|
+
and vice versa. No env-var overrides.
|
|
1040
1182
|
|
|
1041
1183
|
### Decision: Branch B (prefix-discipline only) for slice 1
|
|
1042
1184
|
|
|
@@ -1356,6 +1498,7 @@ Lands DAG layers 6-12 — the personal-MCP MVP terminal plus all of Phase 0-3 pr
|
|
|
1356
1498
|
- **No self-update** — `cli_upgrade --cli mistral` detects pip / uv / brew via probes and dispatches to `pip install -U vibe-cli`, `uv tool upgrade vibe-cli`, or `brew upgrade mistral-vibe`. Unknown installations return an actionable error rather than running a non-existent `vibe update`.
|
|
1357
1499
|
|
|
1358
1500
|
Other surfaces extended: `SESSION_PROVIDER_VALUES` now includes `"mistral"`; `list_models`, `cli_versions`, `cli_upgrade`, `approval_list`, `session_create`, `session_list`, and `session_clear_all` accept the fifth provider; new MCP resources `sessions://mistral` and `models://mistral` are registered; `validate_with_models` / `consensus_check` / `red_team_review` can route to Mistral.
|
|
1501
|
+
|
|
1359
1502
|
- **U23 — JSON output + token/cost parity across providers.** New `src/codex-json-parser.ts` parses the Codex `--json` JSONL event stream (`thread.started`, `turn.started`/`completed`/`failed`, `item.*`, `error`); lenient against partial streams and garbage preamble. New `src/gemini-json-parser.ts` parses `gemini -o json` output and maps `usageMetadata.{promptTokenCount, candidatesTokenCount, cachedContentTokenCount}`. `extractUsageAndCost` is now a thin per-provider dispatcher returning `{inputTokens, outputTokens, cacheReadTokens?, cacheCreationTokens?, costUsd?}` for every provider that supports JSON; Claude `cache_read_input_tokens` / `cache_creation_input_tokens` are now plumbed through instead of being discarded. `codex_request`, `codex_request_async`, `gemini_request`, and `gemini_request_async` now expose `outputFormat: enum("text","json")` — set to `"json"` and the gateway emits `--json` (Codex) or `-o json` (Gemini) and forwards parsed usage/cost into the flight recorder. Flight-recorder schema gains `cache_read_tokens` and `cache_creation_tokens` columns via idempotent migration (`PRAGMA table_info` → `ALTER TABLE ADD COLUMN`); existing `logs.db` files are upgraded in place. 15 new tests.
|
|
1360
1503
|
- **U24 — Permission/approval-mode parity across providers.** Claude `permissionMode` enum (`default | acceptEdits | plan | auto | dontAsk | bypassPermissions`) replaces the boolean `dangerouslySkipPermissions` (the boolean still works and now maps to `permissionMode: "bypassPermissions"`; setting both logs a warning, `permissionMode` wins). Gemini `approvalMode` gains `plan`. Codex splits `--full-auto` into `sandboxMode: enum("read-only","workspace-write","danger-full-access")` and `askForApproval: enum("untrusted","on-request","never")`, emitting `--sandbox <mode>` and `--ask-for-approval <mode>` independently; legacy `fullAuto: true` still works and expands to `--sandbox workspace-write --ask-for-approval never` by default, with `useLegacyFullAutoFlag: true` as an explicit escape hatch to emit `--full-auto` directly. Codex resume mode filters all three flags (`--full-auto`, `--sandbox`, `--ask-for-approval`) since `codex exec resume` inherits the session's policy. 26 new tests.
|
|
1361
1504
|
- **U25 — Claude high-impact features.** `claude_request` / `claude_request_async` schemas gain `agent?: string` (single sub-agent dispatch), `agents?: Record<string, object>` (multi-agent JSON, validated against `CLAUDE_AGENT_DEFINITION_SCHEMA` before emit), `forkSession?: boolean`, `systemPrompt?: string`, `appendSystemPrompt?: string` (mutually exclusive at the schema + tool-callback boundary), `maxBudgetUsd?: number`, `maxTurns?: number`, `effort?: enum("low","medium","high","xhigh","max")`, and `excludeDynamicSystemPromptSections?: boolean`. Each emits the documented `--<flag>` form. 25 new tests in `src/__tests__/claude-handler.test.ts`.
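The Codex flag handling described under U24 can be sketched as a small argv builder. The flag names and enum values come from the entry above; the function name `codexPolicyArgsSketch`, the `resume` option, and the rule that an explicit `sandboxMode` or `askForApproval` overrides the `fullAuto` defaults are assumptions, since the changelog does not spell out that precedence.

```typescript
type SandboxMode = "read-only" | "workspace-write" | "danger-full-access";
type AskForApproval = "untrusted" | "on-request" | "never";

interface CodexPolicyOptions {
  sandboxMode?: SandboxMode;
  askForApproval?: AskForApproval;
  fullAuto?: boolean;
  useLegacyFullAutoFlag?: boolean;
  resume?: boolean; // hypothetical flag marking a `codex exec resume` call
}

function codexPolicyArgsSketch(opts: CodexPolicyOptions): string[] {
  // Resume mode emits none of the three flags; the session's policy is inherited.
  if (opts.resume) {
    return [];
  }
  // Explicit escape hatch: emit the legacy flag directly.
  if (opts.fullAuto && opts.useLegacyFullAutoFlag) {
    return ["--full-auto"];
  }
  // Assumption: explicit values win over the fullAuto defaults.
  const sandbox = opts.sandboxMode ?? (opts.fullAuto ? "workspace-write" : undefined);
  const approval = opts.askForApproval ?? (opts.fullAuto ? "never" : undefined);
  const args: string[] = [];
  if (sandbox) {
    args.push("--sandbox", sandbox);
  }
  if (approval) {
    args.push("--ask-for-approval", approval);
  }
  return args;
}
```

For example, `codexPolicyArgsSketch({ fullAuto: true })` yields `--sandbox workspace-write --ask-for-approval never`, matching the expansion the entry documents.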
|
|
@@ -1448,7 +1591,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1448
1591
|
|
|
1449
1592
|
### Fixed
|
|
1450
1593
|
|
|
1451
|
-
- **SIGTERM→SIGKILL escalation bug** — `proc.killed` becomes `true` after `.kill()` is
|
|
1594
|
+
- **SIGTERM→SIGKILL escalation bug** — `proc.killed` becomes `true` after `.kill()` is _called_, not after the process _exits_, so the SIGKILL guard (`if (!proc.killed)`) was always false. Replaced with an `exited` flag set by `close`/`error` events in both `executor.ts` and `async-job-manager.ts`
|
|
1452
1595
|
- **Timer priority race** — When both `timeout` and `idleTimeout` are set, idle timeout now clears the wall-clock timer to prevent `timedOut` from overriding `idledOut` in the close handler (which would misclassify code 125 as transient code 124)
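The SIGTERM→SIGKILL fix above relies on tracking exit separately from `proc.killed`, which in Node.js reports only that a signal was sent. A minimal sketch of that pattern follows; the `KillableProcess` interface and the `terminateWithEscalation` helper are hypothetical names, not the code in `executor.ts` or `async-job-manager.ts`. A Node `ChildProcess` is expected to fit the interface, but that is an assumption of this sketch.

```typescript
// Minimal shape needed from a child process (hypothetical interface).
interface KillableProcess {
  kill(signal: "SIGTERM" | "SIGKILL"): boolean;
  once(event: "close" | "error", listener: () => void): unknown;
}

// Sends SIGTERM, then SIGKILL after graceMs unless the process has really exited.
// Returns a function that reports whether exit has been observed.
function terminateWithEscalation(proc: KillableProcess, graceMs: number): () => boolean {
  let exited = false;
  let timer: ReturnType<typeof setTimeout> | undefined;

  const markExited = (): void => {
    exited = true; // set by close/error, not by the act of calling kill()
    if (timer !== undefined) {
      clearTimeout(timer);
    }
  };
  proc.once("close", markExited);
  proc.once("error", markExited);

  proc.kill("SIGTERM");
  if (!exited) {
    timer = setTimeout(() => {
      // Guard on the exited flag; a guard on proc.killed would never escalate.
      if (!exited) {
        proc.kill("SIGKILL");
      }
    }, graceMs);
  }
  return () => exited;
}
```

The design point is that the escalation guard reads a flag set only by `close` or `error`, so a process that ignores SIGTERM still receives SIGKILL once the grace period ends.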
|
|
1453
1596
|
|
|
1454
1597
|
### Added
|
|
@@ -1533,6 +1676,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1533
1676
|
## Core Features
|
|
1534
1677
|
|
|
1535
1678
|
### Multi-LLM Orchestration
|
|
1679
|
+
|
|
1536
1680
|
- **3 CLI tools supported**: Claude Code, Codex, Gemini
|
|
1537
1681
|
- **Unified MCP interface**: Single protocol for all LLMs
|
|
1538
1682
|
- **Cross-tool collaboration**: LLMs can use each other via MCP
|
|
@@ -1540,6 +1684,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1540
1684
|
- **Correlation ID tracking**: Full request tracing
|
|
1541
1685
|
|
|
1542
1686
|
### Token Optimization
|
|
1687
|
+
|
|
1543
1688
|
- **Auto-optimization middleware**: 44% reduction on prompts, 37% on responses
|
|
1544
1689
|
- **15+ optimization patterns**: Remove filler, compact types, arrow notation
|
|
1545
1690
|
- **Opt-in feature**: `optimizePrompt` and `optimizeResponse` flags
|
|
@@ -1547,6 +1692,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1547
1692
|
- **Research-backed**: 42 sources, best practices documented
|
|
1548
1693
|
|
|
1549
1694
|
### Reliability & Performance
|
|
1695
|
+
|
|
1550
1696
|
- **Retry logic**: Exponential backoff with circuit breaker
|
|
1551
1697
|
- **Atomic file writes**: Process-specific temp files with fsync
|
|
1552
1698
|
- **Memory limits**: 50MB cap on CLI output prevents DoS
|
|
@@ -1554,6 +1700,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1554
1700
|
- **Non-zero exit code handling**: Proper retry behavior
|
|
1555
1701
|
|
|
1556
1702
|
### Security Hardening
|
|
1703
|
+
|
|
1557
1704
|
- **No secret leakage**: Generic session descriptions only
|
|
1558
1705
|
- **File permissions**: 0o600 on sensitive files
|
|
1559
1706
|
- **No ReDoS vulnerabilities**: Bounded regex patterns
|
|
@@ -1562,6 +1709,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1562
1709
|
- **Custom storage paths**: Secure directory creation
|
|
1563
1710
|
|
|
1564
1711
|
### Testing & Quality
|
|
1712
|
+
|
|
1565
1713
|
- **114 tests**: 68 unit, 41 integration, 5 optimizer
|
|
1566
1714
|
- **Real CLI integration**: Not mocks
|
|
1567
1715
|
- **Regression tests**: ReDoS, schema validation, retry behavior
|
|
@@ -1569,6 +1717,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1569
1717
|
- **Edge case coverage**: Timeouts, errors, concurrency
|
|
1570
1718
|
|
|
1571
1719
|
### Documentation Excellence
|
|
1720
|
+
|
|
1572
1721
|
- **7 comprehensive guides**: 4,000+ lines total
|
|
1573
1722
|
- **Research-backed**: TOKEN_OPTIMIZATION_GUIDE.md with 42 sources
|
|
1574
1723
|
- **Real-world examples**: PROMPT_OPTIMIZATION_EXAMPLES.md with 5 examples
|
|
@@ -1580,6 +1729,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1580
1729
|
## Added
|
|
1581
1730
|
|
|
1582
1731
|
### Features
|
|
1732
|
+
|
|
1583
1733
|
- Multi-LLM CLI orchestration via MCP
|
|
1584
1734
|
- Session management with persistence
|
|
1585
1735
|
- Correlation ID tracking for request tracing
|
|
@@ -1591,6 +1741,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1591
1741
|
- Custom storage path support
|
|
1592
1742
|
|
|
1593
1743
|
### Tools (MCP)
|
|
1744
|
+
|
|
1594
1745
|
- `claude_request` - Execute Claude Code CLI
|
|
1595
1746
|
- `codex_request` - Execute Codex CLI
|
|
1596
1747
|
- `gemini_request` - Execute Gemini CLI
|
|
@@ -1604,6 +1755,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1604
1755
|
- `list_models` - List available models for each CLI
|
|
1605
1756
|
|
|
1606
1757
|
### Resources (MCP)
|
|
1758
|
+
|
|
1607
1759
|
- `sessions://all` - All sessions across CLIs
|
|
1608
1760
|
- `sessions://claude` - Claude-specific sessions
|
|
1609
1761
|
- `sessions://codex` - Codex-specific sessions
|
|
@@ -1612,6 +1764,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1612
1764
|
- `metrics://performance` - Performance metrics and stats
|
|
1613
1765
|
|
|
1614
1766
|
### Documentation
|
|
1767
|
+
|
|
1615
1768
|
- `README.md` - Installation and usage guide
|
|
1616
1769
|
- `BEST_PRACTICES.md` - Design and implementation patterns
|
|
1617
1770
|
- `TOKEN_OPTIMIZATION_GUIDE.md` - Research-backed optimization techniques (42 sources)
|
|
@@ -1625,6 +1778,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1625
1778
|
- `CROSS_TOOL_SUCCESS.md` - Cross-LLM collaboration validation
|
|
1626
1779
|
|
|
1627
1780
|
### Tests
|
|
1781
|
+
|
|
1628
1782
|
- 68 unit tests (executor, sessions, metrics, optimizer)
|
|
1629
1783
|
- 41 integration tests (full MCP with real CLIs)
|
|
1630
1784
|
- 5 optimizer tests (pattern validation, ReDoS prevention)
|
|
@@ -1637,6 +1791,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1637
1791
|
### First Review Round (8 bugs)
|
|
1638
1792
|
|
|
1639
1793
|
**Critical:**
|
|
1794
|
+
|
|
1640
1795
|
1. **session_set_active schema mismatch** (src/index.ts:430)
|
|
1641
1796
|
- Issue: Documentation said "null to clear" but z.string() rejected null
|
|
1642
1797
|
- Fix: Changed to z.string().nullable()
|
|
@@ -1652,12 +1807,12 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1652
1807
|
- Fix: Integrated withRetry + CircuitBreaker into executeCli
|
|
1653
1808
|
- Impact: Transient failures now retried automatically
|
|
1654
1809
|
|
|
1655
|
-
**Medium:**
|
|
1656
|
-
|
|
1657
|
-
|
|
1658
|
-
|
|
1810
|
+
**Medium:**

4. **Integration test brittleness**
|
|
1811
|
+
|
|
1812
|
+
- Issue: Tests failed without dist/ or CLIs installed
|
|
1813
|
+
- Fix: Tests properly skip when CLIs unavailable
|
|
1659
1814
|
|
|
1660
|
-
5. **Test timing issues** (src
|
|
1815
|
+
5. **Test timing issues** (src/__tests__/session-manager.test.ts:216,429)
|
|
1661
1816
|
- Issue: setTimeout not awaited → false positives
|
|
1662
1817
|
- Fix: Proper async/await patterns
|
|
1663
1818
|
|
|
@@ -1665,10 +1820,10 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1665
1820
|
- Issue: All stdout/stderr buffered in memory with no cap
|
|
1666
1821
|
- Fix: Added 50MB limit with early termination
|
|
1667
1822
|
|
|
1668
|
-
**Low:**
|
|
1669
|
-
|
|
1670
|
-
|
|
1671
|
-
|
|
1823
|
+
**Low:**

7. **Model data duplication** (src/index.ts:64, src/resources.ts:22)
|
|
1824
|
+
|
|
1825
|
+
- Issue: CLI_INFO defined in two places
|
|
1826
|
+
- Fix: Centralized in single location
|
|
1672
1827
|
|
|
1673
1828
|
8. **Unused code** (src/resources.ts:33)
|
|
1674
1829
|
- Issue: listResources() never called
|
|
@@ -1677,27 +1832,28 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1677
1832
|
### Second Review Round (8 bugs)
|
|
1678
1833
|
|
|
1679
1834
|
**Critical:**
|
|
1835
|
+
|
|
1680
1836
|
1. **Secret leakage via session descriptions** (src/index.ts + src/session-manager.ts)
|
|
1681
1837
|
- Issue: First 50 chars of prompts stored in plain text
|
|
1682
1838
|
- Fix: Generic descriptions ("Claude Session"), file permissions 0o600
|
|
1683
1839
|
- Impact: No user data exposed in session files
|
|
1684
1840
|
|
|
1685
|
-
**High:**
|
|
1686
|
-
|
|
1687
|
-
|
|
1688
|
-
|
|
1689
|
-
|
|
1841
|
+
**High:**

2. **ReDoS in optimizer regex** (src/optimizer.ts:241,244)
|
|
1842
|
+
|
|
1843
|
+
- Issue: Catastrophic backtracking with .+? patterns
|
|
1844
|
+
- Fix: Bounded character sets `[A-Za-z][\w-]*`
|
|
1845
|
+
- Impact: No DoS from malicious prompts
|
|
1690
1846
|
|
|
1691
1847
|
3. **Custom storage path directory not created** (src/session-manager.ts:36)
|
|
1692
1848
|
- Issue: ensureStorageDirectory only created default path
|
|
1693
1849
|
- Fix: Create dirname(storagePath) for custom paths
|
|
1694
1850
|
- Impact: Custom storage paths work without errors
|
|
1695
1851
|
|
|
1696
|
-
**Medium:**
|
|
1697
|
-
|
|
1698
|
-
|
|
1699
|
-
|
|
1700
|
-
|
|
1852
|
+
**Medium:**

4. **Atomic write temp filename collision** (src/session-manager.ts:57)
|
|
1853
|
+
|
|
1854
|
+
- Issue: All processes used same .tmp filename
|
|
1855
|
+
- Fix: Process-specific temp files (sessions.json.tmp.${process.pid})
|
|
1856
|
+
- Impact: Safe multi-process deployments
|
|
1701
1857
|
|
|
1702
1858
|
5. **Retry doesn't handle non-zero exit codes** (src/executor.ts:99)
|
|
1703
1859
|
- Issue: Only thrown errors triggered retry
|
|
@@ -1709,11 +1865,11 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1709
1865
|
- Fix: 50MB limit with process termination
|
|
1710
1866
|
- Impact: DoS prevention
|
|
1711
1867
|
|
|
1712
|
-
**Low:**
|
|
1713
|
-
|
|
1714
|
-
|
|
1715
|
-
|
|
1716
|
-
|
|
1868
|
+
**Low:**

7. **Performance overhead from NVM scanning** (src/executor.ts:41)
|
|
1869
|
+
|
|
1870
|
+
- Issue: Filesystem scan on every request
|
|
1871
|
+
- Fix: Cache NVM path at module load
|
|
1872
|
+
- Impact: Performance improvement
|
|
1717
1873
|
|
|
1718
1874
|
8. **Unused imports** (src/session-manager.ts:4, src/executor.ts:7)
|
|
1719
1875
|
- Issue: Dead code and unused parameters
|
|
@@ -1725,6 +1881,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1725
1881
|
## Security
|
|
1726
1882
|
|
|
1727
1883
|
### Vulnerabilities Fixed
|
|
1884
|
+
|
|
1728
1885
|
- ✅ **Secret leakage**: No user data in session descriptions
|
|
1729
1886
|
- ✅ **File permissions**: 0o600 on sessions.json
|
|
1730
1887
|
- ✅ **ReDoS**: Bounded regex patterns prevent DoS
|
|
@@ -1733,6 +1890,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1733
1890
|
- ✅ **Command injection**: Already prevented via spawn with args
|
|
1734
1891
|
|
|
1735
1892
|
### Security Best Practices
|
|
1893
|
+
|
|
1736
1894
|
- Input validation with Zod schemas
|
|
1737
1895
|
- No stack trace leakage in errors
|
|
1738
1896
|
- Atomic file writes with fsync
|
|
@@ -1744,6 +1902,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1744
1902
|
## Performance
|
|
1745
1903
|
|
|
1746
1904
|
### Optimizations Added
|
|
1905
|
+
|
|
1747
1906
|
- **Token optimization**: 44% reduction on prompts, 37% on responses
|
|
1748
1907
|
- **NVM path caching**: Eliminates I/O on every request
|
|
1749
1908
|
- **Circuit breaker**: Fast-fail during outages
|
|
@@ -1751,6 +1910,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1751
1910
|
- **Memory limits**: Prevents resource exhaustion
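The NVM path caching entry above amounts to moving a filesystem scan from the request path to module load. The sketch below is an assumption-laden illustration rather than the code in `src/executor.ts`: the function names are hypothetical, and the directory layout is NVM's default (`~/.nvm/versions/node/<version>/bin`, or under `NVM_DIR` when that variable is set).

```typescript
import { existsSync, readdirSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Hypothetical scan: lists the bin directories of Node versions installed by NVM.
function scanNvmBinDirs(): string[] {
  const nvmRoot = process.env.NVM_DIR ?? join(homedir(), ".nvm");
  const versionsDir = join(nvmRoot, "versions", "node");
  if (!existsSync(versionsDir)) return [];
  return readdirSync(versionsDir).map((version) => join(versionsDir, version, "bin"));
}

// Evaluated once, when the module is first loaded.
const CACHED_NVM_BIN_DIRS: readonly string[] = scanNvmBinDirs();

// Per-request lookups return the cached array and perform no filesystem I/O.
function getNvmBinDirs(): readonly string[] {
  return CACHED_NVM_BIN_DIRS;
}

console.log(getNvmBinDirs().length);
```

The trade-off of this design is that Node versions installed after the process starts are not picked up until it restarts.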
|
|
1752
1911
|
|
|
1753
1912
|
### Metrics
|
|
1913
|
+
|
|
1754
1914
|
- Request counts per CLI tool
|
|
1755
1915
|
- Response times with percentiles
|
|
1756
1916
|
- Success/failure rates
|
|
@@ -1762,6 +1922,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1762
1922
|
## Testing
|
|
1763
1923
|
|
|
1764
1924
|
### Test Growth
|
|
1925
|
+
|
|
1765
1926
|
- **Initial**: 104 tests
|
|
1766
1927
|
- **After first fixes**: 109 tests (+5 from retry integration)
|
|
1767
1928
|
- **After optimizer**: 113 tests (+4 from optimizer)
|
|
@@ -1769,6 +1930,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1769
1930
|
- **Growth**: +10 tests (9.6% increase)
|
|
1770
1931
|
|
|
1771
1932
|
### Coverage Areas
|
|
1933
|
+
|
|
1772
1934
|
- Unit: Executor, session manager, metrics, optimizer
|
|
1773
1935
|
- Integration: Full MCP protocol with real CLI execution
|
|
1774
1936
|
- Regression: Schema validation, ReDoS, retry behavior
|
|
@@ -1779,6 +1941,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1779
1941
|
## Documentation
|
|
1780
1942
|
|
|
1781
1943
|
### Guides Created
|
|
1944
|
+
|
|
1782
1945
|
1. **README.md** - Installation, usage, API reference
|
|
1783
1946
|
2. **BEST_PRACTICES.md** - Design patterns and architecture
|
|
1784
1947
|
3. **TOKEN_OPTIMIZATION_GUIDE.md** - Research (42 sources)
|
|
@@ -1792,6 +1955,7 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1792
1955
|
11. **CROSS_TOOL_SUCCESS.md** - Collaboration proof
|
|
1793
1956
|
|
|
1794
1957
|
### Total Documentation
|
|
1958
|
+
|
|
1795
1959
|
- **11 comprehensive files**
|
|
1796
1960
|
- **~8,000 lines** of documentation
|
|
1797
1961
|
- **Research-backed** with citations
|
|
@@ -1802,17 +1966,20 @@ Round-1 Codex review found 5 blockers across U22, U23, and U26; round-2 uncondit
|
|
|
1802
1966
|
## Dogfooding Validation
|
|
1803
1967
|
|
|
1804
1968
|
### Multi-LLM Review Process
|
|
1969
|
+
|
|
1805
1970
|
- **Claude Sonnet 4.5**: Strategic/product review (8.5/10 → 10/10)
|
|
1806
1971
|
- **Codex**: Bug finding and implementation (13 bugs found, 13 fixed)
|
|
1807
1972
|
- **Gemini 2.5 Pro**: Security analysis (3 critical issues found, 3 fixed)
|
|
1808
1973
|
|
|
1809
1974
|
### Self-Improvement Cycle
|
|
1975
|
+
|
|
1810
1976
|
1. ✅ Multi-LLM review found 16 bugs
|
|
1811
1977
|
2. ✅ Codex fixed all bugs via MCP
|
|
1812
1978
|
3. ✅ Gateway validated fixes via test suite
|
|
1813
1979
|
4. ✅ Complete autonomous improvement demonstrated
|
|
1814
1980
|
|
|
1815
1981
|
### Workflow Validated
|
|
1982
|
+
|
|
1816
1983
|
```
|
|
1817
1984
|
Implement (Codex) → Review (Gemini) → Fix (Codex) → Verify (Tests) → Iterate
|
|
1818
1985
|
```
|
|
@@ -1822,41 +1989,45 @@ Implement (Codex) → Review (Gemini) → Fix (Codex) → Verify (Tests) → Ite
|
|
|
1822
1989
|
## Migration Guide
|
|
1823
1990
|
|
|
1824
1991
|
### Breaking Changes
|
|
1992
|
+
|
|
1825
1993
|
None - This is the first release.
|
|
1826
1994
|
|
|
1827
1995
|
### New Features to Adopt
|
|
1828
1996
|
|
|
1829
1997
|
**1. Token Optimization** (Optional, Opt-in)
|
|
1998
|
+
|
|
1830
1999
|
```typescript
|
|
1831
2000
|
// Enable prompt optimization
|
|
1832
2001
|
await callTool("codex_request", {
|
|
1833
2002
|
prompt: "Your verbose prompt...",
|
|
1834
|
-
optimizePrompt: true
|
|
2003
|
+
optimizePrompt: true, // 44% token reduction
|
|
1835
2004
|
});
|
|
1836
2005
|
|
|
1837
2006
|
// Enable response optimization
|
|
1838
2007
|
await callTool("claude_request", {
|
|
1839
2008
|
prompt: "Generate docs...",
|
|
1840
|
-
optimizeResponse: true
|
|
2009
|
+
optimizeResponse: true, // 37% token reduction
|
|
1841
2010
|
});
|
|
1842
2011
|
```
|
|
1843
2012
|
|
|
1844
2013
|
**2. Session Management**
|
|
2014
|
+
|
|
1845
2015
|
```typescript
|
|
1846
2016
|
// Create and use sessions
|
|
1847
2017
|
const session = await callTool("session_create", {
|
|
1848
2018
|
cli: "claude",
|
|
1849
|
-
description: "My coding session"
|
|
2019
|
+
description: "My coding session",
|
|
1850
2020
|
});
|
|
1851
2021
|
|
|
1852
2022
|
// Continue conversations
|
|
1853
2023
|
await callTool("claude_request", {
|
|
1854
2024
|
prompt: "Continue from previous context",
|
|
1855
|
-
sessionId: session.id
|
|
2025
|
+
sessionId: session.id,
|
|
1856
2026
|
});
|
|
1857
2027
|
```
|
|
1858
2028
|
|
|
1859
2029
|
**3. Correlation IDs** (Automatic)
|
|
2030
|
+
|
|
1860
2031
|
```typescript
|
|
1861
2032
|
// Automatically generated for tracing
|
|
1862
2033
|
// Check logs: [corrId] prefix on all log lines
|
|
@@ -1867,6 +2038,7 @@ await callTool("claude_request", {
|
|
|
1867
2038
|
## Known Limitations
|
|
1868
2039
|
|
|
1869
2040
|
### Documented Constraints
|
|
2041
|
+
|
|
1870
2042
|
1. **Multi-level orchestration unsupported**
|
|
1871
2043
|
- Nested MCP connections fail
|
|
1872
2044
|
- LLMs can't spawn sub-LLMs via gateway
|
|
@@ -1881,6 +2053,7 @@ await callTool("claude_request", {
|
|
|
1881
2053
|
- Consider encryption for sensitive data (future)
|
|
1882
2054
|
|
|
1883
2055
|
### Future Enhancements
|
|
2056
|
+
|
|
1884
2057
|
- Session encryption at rest
|
|
1885
2058
|
- Session TTL and automatic cleanup
|
|
1886
2059
|
- Redis/DynamoDB backend for horizontal scaling
|
|
@@ -1893,16 +2066,19 @@ await callTool("claude_request", {
|
|
|
1893
2066
|
## Credits
|
|
1894
2067
|
|
|
1895
2068
|
### Development
|
|
2069
|
+
|
|
1896
2070
|
- **Architecture & Orchestration**: Claude Sonnet 4.5
|
|
1897
2071
|
- **Implementation & Bug Fixes**: Codex via llm-cli-gateway MCP
|
|
1898
2072
|
- **Security Analysis**: Gemini 2.5 Pro via llm-cli-gateway MCP
|
|
1899
2073
|
|
|
1900
2074
|
### Research
|
|
2075
|
+
|
|
1901
2076
|
- Token optimization: 42 research sources (2025-2026)
|
|
1902
2077
|
- Compression validation: Compel paper (OpenReview 2025)
|
|
1903
2078
|
- Best practices: Industry standards + dogfooding
|
|
1904
2079
|
|
|
1905
2080
|
### Validation
|
|
2081
|
+
|
|
1906
2082
|
- **Self-dogfooding**: Gateway reviewed and fixed itself
|
|
1907
2083
|
- **Multi-LLM collaboration**: 3 LLMs working via MCP
|
|
1908
2084
|
- **Iterative quality**: 2 review rounds, 16 bugs found and fixed
|
|
@@ -1912,6 +2088,7 @@ await callTool("claude_request", {
|
|
|
1912
2088
|
## Statistics
|
|
1913
2089
|
|
|
1914
2090
|
### Development Timeline
|
|
2091
|
+
|
|
1915
2092
|
- **Total time**: ~2.5 hours (from first review to 100% bug-free)
|
|
1916
2093
|
- **Review rounds**: 2 comprehensive multi-LLM reviews
|
|
1917
2094
|
- **Bugs found**: 16 total
|
|
@@ -1919,12 +2096,14 @@ await callTool("claude_request", {
|
|
|
1919
2096
|
- **Test growth**: 104 → 114 tests (+9.6%)
|
|
1920
2097
|
|
|
1921
2098
|
### Code Metrics
|
|
2099
|
+
|
|
1922
2100
|
- **Files modified**: 12 files
|
|
1923
2101
|
- **Lines added**: ~2,500 lines
|
|
1924
2102
|
- **Documentation**: ~8,000 lines (11 files)
|
|
1925
2103
|
- **Test coverage**: 114 tests across unit/integration/regression
|
|
1926
2104
|
|
|
1927
2105
|
### Quality Metrics
|
|
2106
|
+
|
|
1928
2107
|
- **Bug-free rate**: 100%
|
|
1929
2108
|
- **Test pass rate**: 100%
|
|
1930
2109
|
- **Build success**: ✅
|