@kodax-ai/kodax 0.7.43 → 0.7.45

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (45) hide show
  1. package/CHANGELOG.md +80 -0
  2. package/README.md +6 -5
  3. package/README_CN.md +6 -5
  4. package/dist/chunks/chunk-CZHIUJQS.js +535 -0
  5. package/dist/chunks/{chunk-IYSK7LUK.js → chunk-FKB7BWQT.js} +1 -1
  6. package/dist/chunks/chunk-FT2XFFNP.js +2 -0
  7. package/dist/chunks/chunk-IJUB7QXG.js +425 -0
  8. package/dist/chunks/chunk-PGF5EZ7C.js +31 -0
  9. package/dist/chunks/chunk-X6EHEQWP.js +849 -0
  10. package/dist/chunks/{compaction-config-3E57ABCT.js → compaction-config-WCNGYWT3.js} +1 -1
  11. package/dist/chunks/{construction-bootstrap-JR63KI5N.js → construction-bootstrap-OB5SDNBD.js} +1 -1
  12. package/dist/chunks/dist-C2VOGY5Z.js +2 -0
  13. package/dist/chunks/{dist-XANXEVTU.js → dist-Q2PQM7U7.js} +1 -1
  14. package/dist/chunks/{utils-HQ2QCKJA.js → utils-CHXCBR3Q.js} +1 -1
  15. package/dist/index.d.ts +8 -8
  16. package/dist/index.js +2 -2
  17. package/dist/kodax_cli.js +764 -709
  18. package/dist/provider-capabilities.json +181 -0
  19. package/dist/sdk-agent.d.ts +108 -8
  20. package/dist/sdk-agent.js +1 -1
  21. package/dist/sdk-coding.d.ts +385 -13
  22. package/dist/sdk-coding.js +1 -1
  23. package/dist/sdk-llm.d.ts +1 -1
  24. package/dist/sdk-llm.js +1 -1
  25. package/dist/sdk-mcp.js +1 -1
  26. package/dist/sdk-repl.d.ts +7 -6
  27. package/dist/sdk-repl.js +1 -1
  28. package/dist/sdk-session.d.ts +2 -2
  29. package/dist/sdk-session.js +1 -1
  30. package/dist/sdk-skills.js +1 -1
  31. package/dist/types-chunks/{bash-prefix-extractor.d-DMrGImMl.d.ts → bash-prefix-extractor.d-HrTUwtV7.d.ts} +597 -142
  32. package/dist/types-chunks/file-tracker.d-DOfaoCbJ.d.ts +633 -0
  33. package/dist/types-chunks/{resolver.d-CA68_NeH.d.ts → resolver.d-OMwxURit.d.ts} +17 -14
  34. package/dist/types-chunks/{storage.d-DPAEX7zS.d.ts → storage.d-BvTdjYQF.d.ts} +13 -1
  35. package/dist/types-chunks/{file-tracker.d-zaLZeNBK.d.ts → types.d-DM8zEJgF.d.ts} +1029 -535
  36. package/dist/types-chunks/{types.d-mM8vqvhT.d.ts → types.d-HBbWT-iA.d.ts} +41 -3
  37. package/dist/types-chunks/{utils.d-DkLZD_wa.d.ts → utils.d-DSEX6Rq1.d.ts} +15 -3
  38. package/package.json +2 -2
  39. package/dist/chunks/chunk-7G5PSL6C.js +0 -830
  40. package/dist/chunks/chunk-K75O2CAE.js +0 -31
  41. package/dist/chunks/chunk-UG4262JI.js +0 -502
  42. package/dist/chunks/chunk-VHKAJDQD.js +0 -425
  43. package/dist/chunks/chunk-YMRZBS4G.js +0 -2
  44. package/dist/chunks/dist-KWHUKXEL.js +0 -2
  45. package/dist/types-chunks/types.d-CKJtjo-6.d.ts +0 -1127
package/CHANGELOG.md CHANGED
@@ -4,6 +4,86 @@ All notable changes to this project will be documented in this file.
4
4
 
5
5
  > Full history for versions prior to v0.7.0: [CHANGELOG_ARCHIVE.md](docs/CHANGELOG_ARCHIVE.md)
6
6
 
7
+ ## [0.7.45] - 2026-06-01
8
+
9
+ ### ⚠️ BREAKING — coding-plan providers now use dedicated API-key env vars
10
+
11
+ The coding-plan providers used to read the **same** env var as their regular-API
12
+ sibling (e.g. both `zhipu` and `zhipu-coding` read `ZHIPU_API_KEY`), which meant
13
+ you couldn't enable the two providers independently — and the regular key could
14
+ get handed to a coding-plan provider you hadn't actually subscribed to. Each
15
+ coding-plan provider now reads its **own** key, so setting both keys lets you use
16
+ both providers' models at once.
17
+
18
+ | Provider | Old env var | **New env var** |
19
+ |---|---|---|
20
+ | `zhipu-coding` | `ZHIPU_API_KEY` | `ZHIPU_CODING_API_KEY` |
21
+ | `kimi-code` | `KIMI_API_KEY` | `KIMI_CODE_API_KEY` |
22
+ | `minimax-coding` | `MINIMAX_API_KEY` | `MINIMAX_CODING_API_KEY` |
23
+ | `mimo-coding` | `MIMO_API_KEY` | `MIMO_CODING_API_KEY` |
24
+ | `ark-coding` | `ARK_API_KEY` | `ARK_CODING_API_KEY` |
25
+
26
+ **Migration**: set the new env var(s) for whichever coding plans you use. Regular
27
+ providers (`zhipu`, `kimi`, …) are unchanged. There is no fallback to the old
28
+ shared names — the new name is required for the coding-plan provider to start.
29
+
30
+ ### Added
31
+
32
+ - **FEATURE_207 — `@` picker: recent files**: typing `@` now surfaces your current git working set (modified + untracked files) at the top of the completion list, before the plain directory listing — these are the files you're actively changing, which is overwhelmingly what you'll reference next. Cross-directory (nested files appear without navigating into them), filtered by basename prefix as you type, and suppressed once you navigate into a specific path (`@src/` keeps the normal listing). Non-git workspaces degrade gracefully to the directory listing. Directories were already completable; skill (`/skill:`) / MCP / plugin entries are untouched. (The recency source is the git working set rather than the originally-planned session-lineage tool events — chosen for cold-start usefulness + layer independence; see docs/features/v0.7.45.md#feature_207.)
33
+ - **FEATURE_102 — Adaptive Multi-Provider Orchestration (P1-auto + P2 + P3)**: the main agent can now run a child task on a *different* provider/model than its own. Three layers ship; P4's adaptive-scoring *mechanism* (bandit/bayesian auto-selection) is intentionally **not** built — its only actionable output, the gating eval below, shipped, and the mechanism itself is speculative with no current consumer (YAGNI per `CLAUDE.md`), so it is **deferred to 0.9.x as an independent feature** if a real auto-selection need arises. Static tier routing (P1-auto) already covers the concrete need, and is **off by default** — it activates only when you point a tier env var at a model.
34
+ - **P2 — explicit per-dispatch override.** `dispatch_child_task` gains optional `provider` / `model` params, so the agent can deliberately send a child to another model family — e.g. a second independent review of the same change by a different family, to catch blind spots a single family would share. Resolution priority in `child-executor`: `bundle.provider/model` > specialist's declared model (FEATURE_191) > parent default. Omitting both is byte-identical to the prior behavior. Tolerant parse (empty/whitespace → undefined) so a misuse never fails the dispatch.
35
+ - **P3 — cross-provider fallback + `doctor --ping`.** When a child's primary provider is *exhausted or down* (the LLM layer's same-provider `withRateLimit` retries gave up, a 5xx, or a network error), KodaX re-runs the child once on the next provider in an operator-configured chain instead of failing the whole child. Configured via `/fallback ark-coding,kimi-code` (persists to `~/.kodax/config.json` + mirrors to `KODAX_FALLBACK_PROVIDERS`; `/fallback off` to clear, `/fallback status` to inspect). Empty chain = OFF. Scope is deliberately minimal (per user direction): only hard availability errors trigger fallback — a returned `success:false` is a task outcome (not retried elsewhere) and aborts are never faked over; the speculative tool-call-fidelity / context-overflow / quality-anomaly triggers are unbuilt (YAGNI). `kodax doctor --ping` completes the health check: it sends one minimal request per configured provider (10s timeout, concurrent) to prove the key actually works and the subscription is active — opt-in, small token cost, never on the default `doctor`. 16 tests (11 fallback core + 5 `/fallback` command).
36
+ - **P1-auto — `model_hint` → tier routing.** The previously dormant `model_hint` field (`fast`/`balanced`/`deep`, FEATURE_120) now selects an operator-configured tier: `fast` → `KODAX_FAST_PROVIDER`/`KODAX_FAST_MODEL` (**read-only children only** — the gating eval validated cheap-model quality on read-only investigation but not write/codegen, so write children stay on the parent tier), `deep` → `KODAX_DEEP_PROVIDER`/`KODAX_DEEP_MODEL` (read or write). `balanced`/unset, or any unconfigured tier, falls back to the parent — **routing is OFF by default** and turns on only when you point a tier env var at a model (no separate toggle). Specialist and explicit-P2 overrides both win over the hint. Per KodaX minimalism the original 5-name capability-alias layer (`vision`/`long-context`/…) was descoped to this env-tier form — no consumer exists for the rest yet (YAGNI). Pure routing wiring; the Worker prompt is unchanged (an eval non-trigger per `CLAUDE.md`), so $0 / no panel. Gating eval (`tests/feature-102-model-tier-quality.eval.ts`, canonical 5-alias × 3 read-only investigation × 5 run) kept as permanent regression: cheap floor `ark/v4flash` = 15/15, ≥ strong-mean 92% → cheap PRESERVES read-only quality. New unit + integration tests (7 `model-hint-routing` + 5 child-executor P2/P1-auto priority); also de-flaked the pre-existing `merges findings` child-executor test (parallel children made `mergeChildResults` emit in completion order — now deterministic dispatch order). Design: [docs/features/v0.7.45.md](docs/features/v0.7.45.md#phase-1--model_hint--真实跨-provider-子派发mvp).
37
+
38
+ ### Performance
39
+
40
+ - **FEATURE_212 — fullscreen render no longer lags as history grows.** In the fullscreen (alt-screen) UI, every render previously rewrote the whole viewport (~6 KB of ANSI) on each frame; on Windows/ConPTY each synchronous write blocks the event loop, so typing, streaming output, scrolling, and the spinner all got progressively choppier the longer a session ran. The renderer now takes a **cell-diff fast path** in fullscreen — it writes only the cells that actually changed (default ON; escape hatch `KODAX_FULLSCREEN_CELLDIFF=0`). Typing stays smooth regardless of transcript length.
41
+ - **FEATURE_212 — DECSTBM hardware-scroll fast path.** When the transcript scrolls, the renderer now emits a single terminal scroll-region command and repaints only the rows that scrolled *in*, instead of repainting every shifted row. Reduces per-scroll-frame write volume by an order of magnitude on full screens (default ON; escape hatch `KODAX_SCROLL_DECSTBM=0`). Proven correct by a calibrated cursor+grid terminal-model differential gate.
42
+
43
+ ### Fixed
44
+
45
+ - **FEATURE_212 — fullscreen viewport drifted up one row.** The cell-diff fast path above shifted the entire managed viewport up by one row (the banner's top line was clipped and a blank line appeared under the status bar). Root cause was two scroll-inducing trailing newlines that fire when the frame fills the viewport: the per-row line-feed `renderFrameSlice` emits after the *last* row, and the `restoreCursor` line-feed that moves the cursor one row past the last content row. Both are now suppressed for a viewport-filling frame (the resting cursor is clamped to the last visible row), so nothing scrolls. The cell-diff perf win is fully preserved. Reproduced and verified by an offline terminal-model gate that faithfully scrolls on a bottom-row line-feed.
46
+ - **FEATURE_213 — a follow-up typed while waiting for a sub-agent was answered but never shown.** When you queued a message while the agent was idle-yielding (waiting for a `dispatch_child_task` child), the agent received and answered it, but it never appeared in the transcript. There are two mid-turn drain paths and only one notified the UI: the `beforeNextTurn` mid-turn drain routes through `onMidTurnUserMessages` (recorded), but the idle-yield **wake** drain (`composeIdleYieldUserMessage`) spliced the prompt straight into the agent transcript with no UI signal. Fixed by giving the wake path the same UI sink: `composeIdleYieldUserMessage` reports the drained prompt(s) via a new `onUserPrompts` callback, surfaced through `runWithIdleYield`'s `onResumedUserPrompts` option (both new, optional, non-breaking SDK additions on `@kodax-ai/agent`), which coding wires to `onMidTurnUserMessages`. A message is dequeued exactly once, so no duplication. Also hardened the recorded path: `clearManagedForegroundTurnHistory` now rescues any not-yet-committed mid-turn user message before wiping the foreground ledger (id-deduped against the round-end / fresh-submit / interrupt commits), so a premature clear can't drop it either.
47
+ - **FEATURE_173 — `kodax -r` only restored the first round of a resumed session.** After FEATURE_173 consolidated session ids, the runner's snapshot writer (`saveSessionSnapshot`, a flat full-rewrite) and the REPL's incremental writer raced on the same `<id>.jsonl`: a stale runner save whose messages were a *prefix* of what the REPL had already written rebuilt the lineage and reset `activeEntryId` back to round 1 — so resume walked the active path only as far as the first round (entries were never lost, just the active pointer regressed). Fixed in two layers: (1) a new `KodaXSessionOptions.persistedByHost?: boolean` SDK option marks host-owned sessions so the runner skips routine snapshot writes and stops racing the REPL — error/crash-recovery saves (which carry `errorMetadata`) still bypass the gate and persist, so recovery is unchanged; headless print/SDK/ACP callers leave the flag unset and remain the sole writer. (2) `resolveSnapshotLineage` now reuses the existing lineage verbatim when a lineage-less snapshot's messages are a prefix of (or empty vs) the persisted active path, so `activeEntryId` can never regress structurally. New tests cover 5 writer interleavings including the exact repro and the empty-message error save.
48
+ - **archived sessions no longer appear in the session picker.** `session/public-api.ts` documents an `archived-` filename-prefix archive mechanism, and the public-api *slow* path already filtered it out, but `FileSessionStorage.list()` — used by the interactive picker and the public-api fast path — did not, so archived sessions still showed up there. Added the `!startsWith('archived-')` filter so the archive mechanism is consistent end-to-end.
49
+ - **repo-intelligence atomic writes no longer leak `.tmp` files.** `writeJsonFileAtomic` wrote to a uniquely-named `<file>.<pid>.<ts>.tmp` and renamed it into place, but had no cleanup path: any failed `rename` (common on Windows as `EPERM` when the target is briefly locked by a concurrent reader) or a hard kill between write and rename left the temp behind, and the unique naming meant each failure accumulated a fresh orphan — over a thousand could pile up in `.agent/repo-intelligence/` (hundreds of MB) over weeks. The write now removes its own temp on failure (try/catch) and best-effort sweeps stale sibling orphans (older than 1h, same base file — never a concurrent writer's in-flight temp) on each successful write, so any existing backlog self-heals on normal use.
50
+ - **empty provider completions now retry instead of silently ending the task.** A `finish_reason`-complete turn with no text, no tool calls, and no thinking is a degraded response — common on budget OpenAI-compatible providers (e.g. zhipu-coding) under load or right after a 429. The runner's no-tool terminal branch misread it as a clean text-only completion and exited the task silently (the user saw the task stop right after a `[Rate Limit] Retrying…` line, with no error). The managed-task LLM adapter now re-streams such a turn up to `KODAX_MAX_EMPTY_COMPLETION_RETRIES` times (short linear backoff) on an independent counter before falling through to the existing terminal behavior, so it does not consume the resilience error budget. Canonical text-only termination (text present, no tool) is untouched.
51
+
52
+ ### Known issues
53
+
54
+ - **Issue 136 (open) — spinner stutter during streaming/scroll.** The spinner animation can still stutter while output is streaming or while scrolling a long transcript. This is **not** terminal-write volume (the two fixes above ruled that out) — the bottleneck is CPU-side per-frame work (React reconciliation + full screen-grid rebuild). Cosmetic only; tracked for a dedicated fix. See `docs/KNOWN_ISSUES.md` #136.
55
+
56
+ ## [0.7.44] - 2026-05-28
57
+
58
+ ### Theme
59
+
60
+ **Peer-to-Peer SendMessage + `/goal` Persistent Goal + Provider Capability JSON SoT + Sibling-Aware Child Dispatch** — FEATURE_123 extends FEATURE_120 with full child↔child + child↔Worker peer routing + `to: '*'` broadcast; FEATURE_192 targets OpenAI Codex `/goal` parity (3 tools + 3 prompts + Sidecar Verifier strong-bind on `update_goal complete`); FEATURE_198 splits `KODAX_PROVIDER_SNAPSHOTS` to JSON + runtime loader (dist-patch update path; closes v0.7.43 SDK-MODEL-CAPS architectural debt); FEATURE_199 adds `evidence_refs: ["task_id:<id>"]` prefix so the parent Worker can forward a completed sibling child's output verbatim into the next dispatch (reuses FEATURE_177 snapshot substrate — zero new state) + flips `resolveEvidenceRef` unknown-prefix from silent fallthrough to a visible `[evidence_refs error]` string the Worker can self-correct on.
61
+
62
+ ### Added
63
+
64
+ - **FEATURE_199 — Sibling-Aware Child Dispatch: `task_id:<id>` Evidence Refs + Unknown-Prefix Visible Error** (2 commits, shipped 2026-05-26 + finalText injection harden post-architect/security review 2026-05-28). Adds a fourth shape to the `dispatch_child_task` `evidence_refs[]` schema: `"task_id:<child_id>"` looks up a completed sibling child's `finalText` from the FEATURE_177 `childProgressSnapshots` ring buffer (cap=200; finalized in the dispatch tool's inner-IIFE `.finally`) and inlines it verbatim into the new child's briefing. Replaces the pre-F199 path where the parent Worker had to copy-paste the sibling's report into `evidence_refs: ["finding:..."]` or re-narrate it in `objective` — both lossy and costing an extra LLM-消化 turn. **Same change ships a sink-hole fix**: [`resolveEvidenceRef`](packages/coding/src/child-executor.ts) used to silently fall through unknown prefixes (`return \`- ${ref}\``) so a floor-LLM typo like `"path:packages/x"` (missing `file:`) or `"diff packages/x"` (missing colon) produced a useless literal in the child briefing while the parent believed it had forwarded evidence. Post-F199 the fallthrough emits `- [evidence_refs error] unrecognized prefix in "..." — valid prefixes: file:, diff:, finding:, task_id:` so the Worker sees the failure in the next dispatch tool_result and can self-correct. **Boundary contract** (every state has a visible briefing output, no silent miss): completed → inject `finalText`; failed/aborted → inject `finalText` carrying the diagnostic envelope (mode= iterations= ...); running → friendly `(still running — use \`task_output\` to poll)`; not-found / cap-pruned → friendly `(child unknown ...)`; sync-dispatch (`KODAX_ASYNC_DISPATCH=0` where snapshots map is undefined) → same not-found stub. **Zero new substrate**: reuses `ChildProgressSnapshot.finalText` + `ctx.childProgressSnapshots` already provisioned by FEATURE_177; zero cross-package plumbing; ~20 LoC of resolver logic + 1-line schema description append. **Three rejected alternatives** (per 3-agent design discussion 2026-05-25/26): (a) `tool_result:<call_id>` prefix — DROPPED because ACP-based providers (Gemini CLI / Codex CLI) emit `toolBlocks=[]` permanently and 2/12 providers thus can't expose a `tool_use_id`, while `KodaXToolExecutionContext` doesn't carry parent message history; (b) typed-object schema replacing the string-prefix shape — DROPPED because [[project_tool_schema_slim_eval_v0_7_41_defer]] shows floor LLMs (zhipu/glm51, kimi) regress −20 to −40pp on nested-JSON schemas vs string prefixes, and the goal of "make prefix typos visible" is achieved more cheaply by the fallthrough flip alone; (c) automatic relevance-ranked transcript injection — DROPPED because industry consensus (Anthropic *Seeing like an agent*, Cursor, OpenHands) is explicit-parent-write per [Princeton NLP "single agent matched/outperformed multi-agent on 64% of benchmarks"]. **Eval per [EVAL_GUIDELINES.md](benchmark/EVAL_GUIDELINES.md)**: Layer 1 unit tests (9 cases — 3 regression for `file:` / `diff:` / `finding:` + 5 new for `task_id:` lifecycle terminals + 1 unknown-prefix visible-error guard) all green. Layer 2 **full canonical 5-alias panel** × C1 × 3 runs = 15 probe calls + 3-judge majority audit (zhipu/glm51 + ark/v4pro + kimi panel-internal, per Judge model selection constraint — NEVER anthropic/openai) on every cell = 45 audit calls, total 60 LLM calls. **First panel run used canned sibling `task_id="scout"`**; reader flagged the choice as a hygiene issue (FEATURE_193 v0.7.43 retired the V1 Scout role; using its name in a canned `<task-completed>` block risks the model emitting `task_id:scout` from training-data muscle memory rather than reading the block). Panel re-run with canned id renamed to `"hooks-audit"` (descriptive, non-V1, low training-data prior) in ~447s confirms result holds and adds a strict ID-transfer hygiene assertion. **Result (post-rename canonical run): probe 11/15 aggregate regex PASS, per-alias breakdown `kimi=3/3 (100%) / ark/v4flash=3/3 (100%) / ark/v4pro=3/3 (100%) / mmx/m27=2/3 (67%) / zhipu/glm51=0/3 (0%)`** → 4/5 aliases trigger ≥1/3 (canonical pre-registered SHIP gate threshold `4-of-5 alias DEFER single floor` per [`feedback_pre_registered_gate_saturation`](memory/feedback_pre_registered_gate_saturation.md) + [`feedback_model_structural_floor_not_prompt_tunable`](memory/feedback_model_structural_floor_not_prompt_tunable.md)). **ID transfer correctness 11/11 PASS runs** — every adopting model read the canned id literally `task_id:hooks-audit`, proving the prefix adoption is driven by the block content, not by familiarity with a V1 role name. **Audit 0/15 cells regex/majority disagreement DATA VALID** per anti-pattern 7 §3. **SHIP gate (a) aggregate ≥1/3 + (a') panel ≥4/5 aliases + (b) audit ≤1/3 + (hygiene) id-transfer 11/11 all MET** → SHIP. **zhipu/glm51 0/3 failure mode (from raw dump inspection)**: model DID call `dispatch_child_task`, but inlined the full 5-file list into the `objective` string instead of using `evidence_refs: ["task_id:scout"]` — the exact "father is information broker" anti-pattern F199 was designed to eliminate. This is the structural floor that prompt-level changes don't fix per [`feedback_model_structural_floor_not_prompt_tunable`](memory/feedback_model_structural_floor_not_prompt_tunable.md); same family as kimi's prior single-alias DEFERs (e.g. FEATURE_191 panel kimi C1 `feedback_model_structural_floor_not_prompt_tunable`). No worker-role-prompt teaching block added — 4 of 5 canonical aliases discover the new prefix from the tool schema description alone, which is the design contract; tightening to zhipu/glm51 would require either prompt-level teaching (risk: cross-case regression per [`feedback_prompt_strengthening_cross_case_regression`](memory/feedback_prompt_strengthening_cross_case_regression.md)) or a Layer 3 multi-turn driver, both deferred until a second-feature gap motivates the cost. Eval dump artefacts live at `os.tmpdir()/kodax-eval-dumps/feature-199-task-id-evidence-ref/` (per §Raw output preservation — runtime artefact, MUST NOT enter the repo working tree); eval drivers retained as permanent regression sweep at `tests/feature-199-task-id-evidence-ref.eval.ts` + `benchmark/datasets/feature-199-task-id-evidence-ref/cases.ts` with the production-byte dispatch tool description embedded inline (per anti-pattern 8 — synthetic eval MUST use production `KodaXToolDefinition.description` bytes, not a brief stub). **Cost**: ~$0.5-1 actual (12 calls × ~$0.04/avg under ark/zhipu/kimi rates in 106s wall time). **0 cross-package change**, **0 prompt-eval baseline broken** (existing F123/F168/F184 etc. probes consume `evidence_refs` shape unchanged — the new prefix is additive vocabulary). **6 pre-existing F168 schema-parity test failures from v0.7.43** are unchanged (still tracked, still not block-shipping per [memory/feedback_eval_driver_self_stubs_schema.md](memory/feedback_eval_driver_self_stubs_schema.md)). **finalText injection harden ships in v0.7.44** (this commit, post-architect/security review 2026-05-28): `finalText` from completed/failed/aborted children is now wrapped in a ` ``` ` code-fence block + capped at 10000 chars with a truncation marker + literal ` ``` ` sequences in the body are defanged with zero-width separators. Without these guards a compromised child agent (operating on untrusted external data — web results, file content, user input) could craft `finalText` containing `### file: /injected` or other Markdown-/XML-mimicking sequences that break the briefing framing on the next sibling, injecting forged briefing sections — a multi-hop prompt-injection vector. The fix mirrors the `diff:` branch's existing `slice(0, 4000)` pattern. 3 new child-executor tests (fence-wrap structural / 10000-char cap with truncation marker / literal ``` fence-defang). Existing F199 tests use `.toContain()` so the header + body content checks survive the fence wrapping unchanged. Design doc: [docs/features/v0.7.44.md#feature_199](docs/features/v0.7.44.md#feature_199-sibling-aware-child-dispatch--task_idid-evidence-refs--unknown-prefix-visible-error).
65
+
66
+ - **FEATURE_198 — Provider Capability JSON-backed single source of truth** (1 commit `dd459e56` feat). Splits the previously-inline `KODAX_PROVIDER_SNAPSHOTS` const literal in `packages/llm/src/providers/registry.ts` into `provider-capabilities.json` (data) + `provider-capabilities.loader.ts` (logic) + a hand-rolled `validateProviderCapabilitiesJson` validator (no zod — aligns with KodaX 极致轻量化 + no-new-deps). 13 provider entries (anthropic / openai / deepseek / kimi / kimi-code / qwen / zhipu / zhipu-coding / minimax-coding / mimo-coding / ark-coding / gemini-cli / codex-cli); CLI bridges use `cliBridge: true` and omit model/models. Loader supports 4 resolution modes (dev/npm, SDK bundle root, SDK bundle chunk parent-dir fallback, Bun `--compile` binary sidecar via `KODAX_BUNDLED` + `process.execPath`). `deepFreezeSnapshot` recursively freezes models[] + per-descriptor + modelReasoningCapabilities so SDK consumers cannot mutate the cache. `packages/llm/package.json` build script + `scripts/build-bundle.mjs` + `scripts/build-binary.mjs` copy the JSON next to the artifact. Closes v0.7.43 FEATURE-SDK-MODEL-CAPS architectural debt — capability metadata can now be hot-patched in `dist/` without `npm publish + consumer npm update`. Tests: 30 cases (basic loading, profile-name resolution, CLI-bridge dynamic fill, frozen-snapshot guard, registry KODAX_PROVIDER_SNAPSHOTS export, field-level cross-check for 5 providers, validator failure modes) — all green. Design doc: [docs/features/v0.7.44.md#feature_198](docs/features/v0.7.44.md). Hot-update-over-network deferred to v0.7.46+.
67
+
68
+ - **FEATURE_192 — `/goal` Persistent Session Goal** (11 commits `3add3fe0` Phase A + `43a9b4a5` Phase B + `06ed8bef` Phase C + `5bc75f09` Phase D + `ab504c1c` Phase E eval scaffolding + `88e43a7c` Phase F runtime wire + `dce02763` eval pilot fallback + `510ab185` continuation prompt Codex-faithful rewrite + `c8be32d0` remove KODAX_GOAL_ENABLED env flag (default ON) + `43655565` extract runner-goal-adapter module + `94472d2f` wire real verifyComplete to F184 Sidecar Verifier). OpenAI Codex `/goal` parity — fills the gap left by retired `/project` (FEATURE_024). Phase A `packages/agent`: `KodaXGoalStatus` / `KodaXGoalState` / `KodaXGoalEventType` / `KodaXSessionGoalEntry` types added; goal entries live in `lineage.entries` as non-navigable records (label-pattern parity); `readLatestGoalFromBranch` walks the active branch and resolves ties by insertion order; `appendGoalEntry` enforces `goal=null ⟺ event='cleared'`; `forkSessionLineage` carries the active goal across forks. Phase B `packages/coding/goal/`: `goalTokenDelta` (cachedReadTokens deductible, cachedWriteTokens NOT — Codex parity); `turnWallTimeDelta` (whole-second clamp); `recordBlockerAttempt` runtime counter (3 consecutive same-`blocker_kind` turns required before `update_goal({blocked})` accepts — ADR-033 §1 physical-state anchor exception); `applyAccountingDelta` returns `{nextState, budgetLimited}`; `buildCreatedGoal` / `buildPausedGoal` / `buildResumedGoal` / `buildBlockedGoal` / `buildCompleteGoal` with strict status guards; `withGoalBeforeNextTurn` + `withGoalStopHook` lifecycle composers (static-import — no stale-snapshot window). Phase C tools: `get_goal` (readonly), `create_goal` + `update_goal` (mutates-state); registered in `packages/coding/src/tools/registry.ts` with ADR-033-compliant descriptions (qualitative criteria, single-concept, sparse ✗ with WHY); `DEFERRED_TOOL_HINTS` entries for FEATURE_189 progressive disclosure; `verifyGoalCompletion` reuses F184 Sidecar Verifier public surface (`invokeSidecarVerifier`) — `update_goal({complete})` is verifier-gated. Phase D REPL `/goal` slash command (in `packages/repl/src/commands/goal-command.ts`): subcommands `<objective> [--tokens N]` / `status` / `pause` / `resume` / `clear` / `help`; bare `/goal` defaults to status. Default ON — the binding is built for every REPL session with a lineage; the `withGoalBeforeNextTurn` continuation prompt only injects when an active goal exists, so non-goal users see zero behavioral change. Bare-args create-mode emits explicit `cleared` event before the new `created` when the prior goal had status `complete` (transition observability — `complete → cleared → created`); `appendGoalEntry` mutations flush via `callbacks.saveSession()`. Phase E eval driver (`benchmark/datasets/feature-192-goal-lifecycle/cases.ts` + `tests/feature-192-goal-lifecycle.eval.ts`): 4 cases (C1 simple-continuation / C2 weak-evidence-complete / C3 repeated-blocker / C4 budget-approaching) + driver with pilot/scale modes; `KODAX_F192_PILOT_ALIAS` env override defaults pilot to `kimi` (ark-coding CodingPlan subscription periodically lapses — `dce02763`). **Phase F runtime wire ships in v0.7.44** (`88e43a7c`) — new `packages/coding/src/goal/runtime-wiring.ts` factory (~210 LoC) distils codex `ext/goal/extension.rs` shape into a single `buildGoalRuntimeBinding(deps)` returning `{goalContext, lifecycleCtx, defaultContinuationPrompt}`; per ADR-033 the continuation prompt keeps codex's 4 load-bearing concepts (continue, work from evidence, completion audit, blocked audit) but drops codex's enumerated lists; `createGoal` emits codex-parity `complete → cleared → created` transition when prior goal was complete; `requestBlocked` persists in-progress counter (`event='updated'`) even on 3-turn-rule reject so the counter survives across turns. `runner-driven.ts` wire is minimal (~30 net LoC) — `goalLifecycleCtx` composed from binding + per-call `tokenStateRef.current.lastUsage` + `turnStartMsRef`; `wrappedBeforeNextTurn` wraps the extracted `baseBeforeNextTurn` via `withGoalBeforeNextTurn`; `stopHook` wraps `composedStopHook` via `withGoalStopHook`; per ADR-029 [`feedback_pre_registered_gate_saturation`](memory/feedback_pre_registered_gate_saturation.md)-style file-size discipline, no further inflation of runner-driven.ts. REPL wire (`packages/repl/src/interactive/repl.ts`) constructs the binding before `runManagedTask` for every session with a lineage (no env flag — feature ships default ON per project convention). **Tool-layer verifier strong-bind** (`94472d2f` 2026-05-28): `update_goal({status:"complete"})` now calls F184 invokeSidecarVerifier with a synthetic "Pursue this goal until complete: <objective>" query + the runner's current transcript snapshot + mutationTracker fileEdit summary. Verdict map: `accept` → goal flipped + persisted; `revise` / `blocked` → tool returns `[Tool Error] update_goal: <verifier reason> Suggested next step: <suggestedFix>` so the model self-corrects on the next turn. Implementation strategy = pluggable verifier slot via new `binding.installVerifyComplete(fn)` (REPL constructs binding eagerly with stub before runner exists; runner-driven adapter has runner-local state REPL doesn't, so adapter swaps slot via `installVerifyComplete`). Goal wiring composition extracted from runner-driven.ts into new `packages/coding/src/task-engine/runner-goal-adapter.ts` (~190 LoC) per user directive ("runner-driven.ts 大了就做结构化拆分") — runner-driven net -53 LoC. Removed `KODAX_GOAL_ENABLED` env flag entirely (`c8be32d0`) — feature ships default ON consistent with all 12+ other KodaX features; model autonomous create_goal use already gated by ADR-033 §1 prompt design ("Create a goal only when explicitly requested..."); `withGoalBeforeNextTurn` is no-op when no active goal exists, so non-/goal users see zero behavioral change. **Phase B lifecycle.ts bug fix included**: pre-fix only persisted goal state on `budget_limited` flip, losing per-turn token/wall deltas (`/goal status` showed 0/0 until budget tripped). Post-fix: persist `'updated'` event whenever `nextState !== goal`; zero-delta turns short-circuit. **Layer 2 panel** (5 alias × 4 case × 5 run = 100 probe; ark-coding subscription lapsed mid-panel → 3 alias active = 60 probe + 3-judge audit zhipu/glm51 + ark/v4pro + kimi per Judge constraint NEVER anthropic/openai = 180 audit calls): C1 simple-continuation 53% (8/15) / C2 weak-evidence-complete 100% (15/15) / C3 repeated-blocker 73% (11/15) / C4 budget-approaching 67% (10/15). Aggregate 44/60 = 73%. SHIP gate (a) ≥1/3 trigger ratio MET (every case ≥50%); (b) audit ≤1/3 disagreement MET (audit 4.4% disagreement DATA VALID); (c) per-alias ≥4/5 ≥60% MET by 3-of-3 active aliases per [`feedback_pre_registered_gate_saturation`](memory/feedback_pre_registered_gate_saturation.md) (ark absence is provider-side subscription lapse not eval failure; scale panel rerun with restored subscription deferred to next prompt-iteration window). Tests: 108 cases (Phase A goal-helpers 18 / Phase B accounting + blocker-tracker + state + sidecar-bind + lifecycle 11+7+22+4+13 / Phase D goal-command 22 / Phase F runtime-wiring 11) — all green. **Continuation prompt Codex-faithful rewrite** (post-Phase-F follow-up, same release window): the initial Phase F draft trimmed Codex's `continuation.md` from 51 lines / 7 named sections down to 17 lines / 4 paragraphs by mechanically applying ADR-033 §4 "no enumerated taxonomies". That was a misapplication — Codex's enumerated list names AUDIT DIMENSIONS (requirements / artifacts / commands / tests / gates / invariants / deliverables), not the classification taxonomies §4 was written against ("RULE A/B/C/D" labels) — and the trim correlated with a Layer 2 C1 simple-continuation panel rate of only 53%. The rewrite restores all 7 Codex sections verbatim (Continuation behavior / Budget / Work from evidence / Progress visibility / Fidelity / Completion audit / Blocked audit), substitutes KodaX's `todo_*` tools for Codex's `update_plan` in Progress visibility, HTML-escapes the user-supplied objective body for prompt-injection harden, gracefully renders `tokenBudget === null` (Codex's template assumes non-null budget), and appends two KodaX-specific "Runtime enforcement" paragraphs (on Completion audit: Sidecar Verifier hard gate; on Blocked audit: 3-turn `blocker_kind` counter) so the model knows the audits are not just teaching but actually enforced — saving a turn on rejected `update_goal` attempts. All 69 goal tests stayed green (tests assert mechanics, not prompt body strings). **A/B panel rerun completed 2026-05-28** on the canonical 3-active-alias panel (ark/v4pro + ark/v4flash both InvalidSubscription, panel collapsed to zhipu/glm51 + kimi + mmx/m27 × 4 case × 5 run = 60 cells effective). Aggregate held flat at 73% (44/60 regex view) vs initial-trim baseline — but per-case showed: **C1 simple-continuation +14pp (53% → 67%)** real lift from restored Continuation behavior + Fidelity anti-shrink-scope teaching; **C4 budget-approaching +13pp (67% → 80%)** real lift from same teaching applied to budget-pressure case; C2 weak-evidence-complete unchanged at 100% (saturated); **C3 repeated-blocker -26pp (73% → 47%) is a judge artifact, NOT a real regression**. Raw-dump inspection of the 8 zhipu+kimi C3 failure cells shows model calling `get_goal` first to verify visible state ("Let me check the current goal status first") before issuing `update_goal({blocked})` — production-correct verification step, but the eval regex matches only `update_goal` + `blocked` + `awaiting-staging-credentials` and doesn't credit the get_goal verification. **Real-verifier-wire Layer 2 rerun 2026-05-28** (post-`94472d2f` F184 tool-layer strong-bind): full 5-alias panel × 4 case × 5 run = 100 cells (ark subscription restored this run). C1 simple-continuation 68% (17/25) — held; C2 weak-evidence-complete **100% (25/25) — the core promise of the verifier wire confirmed**; C3 repeated-blocker 84% (21/25) — **+37pp vs the stub-run artifact**; C4 budget-approaching 56% (14/25) — same get_goal-first judge artifact pattern as C3 had previously, now amplified to C4 (raw-dump confirms mmx run 0/1/3/4 all silently call `get_goal` to verify before deciding budget-wrap-up action). Aggregate 77/100 = **77%** (+4pp vs stub-run 73%/60-cell). The Codex-faithful Blocked audit's expanded nuance ("verify against actual current state" + "if user resumes blocked goal, treat as fresh audit" + "once threshold satisfied, call update_goal") is what teaches this verification — model intent in all 10 zhipu+kimi C3 cells is identical (verify→update_goal); regex just misses 8/10 of them. Judge-corrected aggregate likely ~85-90%. Memory entry: [memory/project_feature_192_codex_faithful_panel_ab.md](memory/project_feature_192_codex_faithful_panel_ab.md). SHIP decision: keep Codex-faithful version — C1/C4 production UX wins outweigh C3 regex-only loss, and the C3 verification-step pattern is the more correct production behavior. Future LLM-judge re-evaluation of C3 (or eval case redesign to allow 2-tool get_goal→update_goal path) deferred to a v0.7.45 prompt-iteration window. ADR-033 §4 scope clarification recorded at [memory/feedback_adr_033_scope_clarification_new_feature.md](memory/feedback_adr_033_scope_clarification_new_feature.md); ADR-033 §4 scope clarification recorded at [memory/feedback_adr_033_scope_clarification_new_feature.md](memory/feedback_adr_033_scope_clarification_new_feature.md) ("apply ADR-033 trim to brand-new prompts only with empirical A/B evidence — never delete industry-validated prompt content under ADR fiat alone"). Design doc: [docs/features/v0.7.44.md#feature_192](docs/features/v0.7.44.md).
69
+
70
+ - **FEATURE_123 — Peer-to-Peer SendMessage** (5 commits `194465f2` base routing + `88e43a7c` per-turn flood throttle + `dce02763` eval pilot fallback + `ffc93166` seen_by multi-hop cycle list + (this commit) prompt-injection escape harden). Lifts `send_message` from the FEATURE_120 coordinator-only form into a routing-agnostic surface. Worker → child (priority='user', `<coordinator-instruction>`) is preserved byte-for-byte; three new target shapes ship: child → child peer (priority='background', `<peer-message from=A>`); child → parent Worker via `to: "worker"` (`<child-notification from=A>`); broadcast `to: "*"` capped at 20 recipients (`<peer-broadcast from=A>`). Wiring: `KodaXToolExecutionContext` + `KodaXContextOptions` gain `currentAgentId` / `parentAgentId` / `inheritedChildTaskRegistry` so child runtimes inherit the parent's sibling registry and can self-identify; `child-executor.ts` propagates the fields and `tool-execution-context.ts` reuses the parent registry when set (children still cannot mutate it — `dispatch_child_task` stays excluded). `send_message` rewritten with target-shape branching, self-send rejection (1-hop cycle guard), broadcast cap, and grand-child parent-de-dup (a grand-child broadcast never double-enqueues to its immediate parent on both the peer channel and the worker channel). `send_message` REMOVED from `CHILD_EXCLUDE_TOOLS_BASE`; `CHILD_AGENT_SYSTEM_PROMPT` gains a Peer Communication section pointing at the three target shapes; Worker prompt's ASYNC CHILD STEERING section gains `to: "*"` broadcast guidance + a note about `<child-notification>` / `<peer-broadcast>` messages the Worker may receive at next yield. **Per-turn flood throttle ships in v0.7.44** (`88e43a7c`) — `KodaXToolExecutionContext` gains `sendMessageTurnCounter: { count: number }` (provisioned in `tool-execution-context.ts`); `send-message.ts` `chargeTurnCounter(ctx, additional)` charges 1 per `sendToWorker` / N per broadcast (where N = `targetCount`) / 1 per single-target peer; cap is `WORKER_PER_TURN_CAP=20` for the Worker (`currentAgentId===undefined`) and `CHILD_PER_TURN_CAP=5` for any child (matching the v0.7.44 design doc thresholds — sane defaults, no config knobs per ADR-029); over-cap returns `[Tool Error] send_message: per-turn ... limit reached for this Worker|child (limit=N)`; counter resets at every `beforeNextTurn` boundary (runner-driven `wrappedBeforeNextTurn` zero-resets `baseCtx.sendMessageTurnCounter.count` after the goal hook runs); counter is no-op when the field is unset (backward-safe for hosts that haven't wired it). Tests: 28 cases — 22 base routing + 6 throttle (child cap=5, Worker cap=20, broadcast charges N, mixed peer+broadcast, bypass when counter unset, counter reset observable). Eval scaffolding (`benchmark/datasets/feature-123-peer-messaging/cases.ts` + `tests/feature-123-peer-messaging.eval.ts`): 4 cases (C1 peer-conflict / C2 worker-notify / C3 broadcast / C4 no-spam guard) + KODAX_F123_MODE driver (pilot = `kimi` × C1 × 1 per `KODAX_F123_PILOT_ALIAS` env override defaulting to `kimi` — ark-coding CodingPlan subscription lapses periodically; scale = 5 alias × 4 case × 5 run = 100; default SKIP). **Layer 2 panel** (5 alias × 4 case × 5 run = 100 probe; ark-coding subscription lapsed mid-panel → 3 alias active = 60 probe + 3-judge audit zhipu/glm51 + ark/v4pro + kimi per Judge constraint = 180 audit calls): C1 peer-conflict 93% (14/15) / C2 worker-notify 100% (15/15) / C3 broadcast-scope-shift 0% (0/15 — eval case design issue: all 3 alias correctly identified their task was already within allowed scope so no broadcast needed, not a routing failure) / C4 no-spam-guard 0% (0/15 — eval case design issue: all 3 alias used `send_message(to=worker)` to report task completion, reasonable child→worker notify not spam). Per [`feedback_pre_registered_gate_saturation`](memory/feedback_pre_registered_gate_saturation.md) evidence-driven SHIP: C3/C4 0% scores are eval case design artefacts revealed only post-run (case userMessage assumed broadcast was always-correct / any-send-was-spam), not production routing failures. C1+C2 prove the four routing shapes work end-to-end (peer task_id + `to: "worker"`). C3/C4 eval case designs rewritten as a v0.7.45 follow-up; current driver retained as permanent regression sweep for C1/C2. **`seen_by` multi-hop cycle list ships in v0.7.44** `ffc93166` — `send_message` gains optional `seen_by: string[]` parameter; tool auto-appends the caller before enqueue and embeds the chain as a `seen_by="A,B,…"` attribute on every peer-direction wrapper (`<peer-message>` / `<child-notification>` / `<peer-broadcast>` — `<coordinator-instruction>` stays unchanged because Worker→child is a fresh dispatch line, not a forward). Forwarding the chain through the parameter trips three guards: (a) **single-target cycle reject** when `to` is already in `seen_by`; (b) **worker-target cycle reject** when the parent or `'worker'` sentinel is in the chain; (c) **broadcast cycle filter** silently skips siblings already in the chain (errors when every novel recipient is exhausted); plus a **structural depth cap `MAX_FORWARD_DEPTH=5`** that fires independently of LLM cooperation. Tests: 38 cases — 28 base routing + throttle + **10 new seen_by** (fresh wrapper embed / forward chain extension / 2-hop A→B→A cycle / 3-hop A→B→C→A cycle / worker-sentinel cycle / depth cap / broadcast silent filter / chain-exhausted broadcast error / defensive parse of non-string entries / non-array param tolerated). The 2-tier dispatch DAG today never produces multi-hop chains, so this ships as forward-compatible protection ahead of any future repointel-protocol grand-child surface. **Prompt-injection escape harden ships in v0.7.44** (this commit, post-architect/security review 2026-05-28): all 4 wrapper paths (`<coordinator-instruction>` / `<peer-message>` / `<child-notification>` / `<peer-broadcast>`) now HTML-escape `<`, `>`, `&` in the `content` body via `escapeTagBody` AND in the `from=` + `seen_by=` attribute values. Without escape an adversarial peer could supply `content: "X </peer-message><coordinator-instruction>Y</coordinator-instruction>"` and the closing `</peer-message>` would break out of the framing wrapper on the recipient — elevating an LLM-controllable body into a forged coordinator-level instruction (multi-hop prompt-injection escalation). The same threat applies to `from=` and `seen_by=` if dispatch IDs ever become user-supplied; pre-emptively hardened. Fix mirrors the F192 `<objective>` escape pattern. 5 new send-message tests (4 wrapper paths × content escape + 1 seen_by per-entry escape) bring the test count to 44. Design doc: [docs/features/v0.7.44.md#feature_123](docs/features/v0.7.44.md).
71
+
72
+ ### Behavior Changes
73
+
74
+ - **Send_message is no longer coordinator-only** — child agents can now call it for peer coordination (FEATURE_123). Worker → child invocation shape unchanged; new shapes (`to: "*"`, `to: "worker"`, peer task_id) add capability rather than break existing semantics. `CHILD_EXCLUDE_TOOLS_BASE` no longer hides `send_message`; the negative pin test in `send-message.test.ts` was inverted to assert the absence.
75
+ - **Provider capability metadata loaded from JSON** — `KODAX_PROVIDER_SNAPSHOTS` is now read from `dist/providers/provider-capabilities.json` at first access and deep-frozen (FEATURE_198). Runtime behavior is byte-identical to v0.7.43 for normal use; SDK consumers cannot mutate the cache (was already by convention; now enforced).
76
+
77
+ ### Known Baseline Failures (unchanged from v0.7.43)
78
+
79
+ - `packages/coding/src/task-engine/feature-168-pull-tool-schema-parity.test.ts` — 6 byte-identity description checks fail vs FEATURE_161 mocked schema after the v0.7.43 FEATURE_189 prompt-cleanup waves rephrased the canonical pull-tool descriptions. Not a regression — same 6 failures observed on the v0.7.43 release commit. Mocked schema lift is a strict lower bound on production lift per [memory/feedback_eval_driver_self_stubs_schema.md](memory/feedback_eval_driver_self_stubs_schema.md); rewriting the mocks to match v0.7.43+ wording is deferred to a v0.7.45 cleanup pass.
80
+ - `packages/coding/src/child-executor.test.ts > merges findings with anchored incremental approach` — 1 test failing pre-v0.7.43; tracked but not block-shipping (test-fixture/path-policy drift, no production code at risk).
81
+ - `benchmark/datasets/feature-114-scout-trivial-exemption/cases.test.ts` — 3 Slice 8b drift-guard tests (TRIVIAL-EXEMPTION / EMIT TIMING / executionObligations anchors) fail because v0.7.43 commit `d71b4257` (F189 Tier 3 SAFE batch) added a `write` tool / `mkdir -p` advisory line to the runtime Scout role prompt at [packages/coding/src/task-engine/_internal/managed-task/role-prompt.ts:191](packages/coding/src/task-engine/_internal/managed-task/role-prompt.ts#L191) and the drift-guard expected-anchor snapshot wasn't refreshed in the same commit. Same 3 failures observed on v0.7.43 release commit. Not a regression — drift-guard test purpose is anchor-presence not byte-identity-fence; refreshing the expected-anchor list is deferred to v0.7.45.
82
+ - `tests/tracker-consistency.test.ts > tracker consistency` — fails because the v0.7.42-vintage `FEATURE_174` table row at [docs/FEATURE_LIST.md:116](docs/FEATURE_LIST.md#L116) uses the placeholder `_design pending_` literal in place of a markdown link in the design-doc column. Pre-existing v0.7.42 + earlier baseline. Tracker parser is too strict; either the parser should accept the placeholder OR FEATURE_174 should get a design doc — both deferred to a v0.7.45 tracker hygiene pass.
83
+ - `tests/kodax_cli.test.ts > CLI Entry Point > should have correct CLI entry in package.json` — expects `pkg.bin.kodax === './scripts/kodax-bin.cjs'` but the actual value at [package.json](package.json) is `'scripts/kodax-bin.cjs'` (without the `./` prefix). Both forms are valid npm `bin` shapes; the test is stale relative to the `./`-less variant which has been in `package.json` since before v0.7.43. Not a regression. Refresh deferred to v0.7.45 housekeeping.
84
+
85
+ ---
86
+
7
87
  ## [0.7.43] - 2026-05-25
8
88
 
9
89
  ### Breaking Changes
package/README.md CHANGED
@@ -909,13 +909,14 @@ import { InkREPL } from '@kodax-ai/kodax/repl';
909
909
  | anthropic | `ANTHROPIC_API_KEY` | Native | claude-sonnet-4-6 |
910
910
  | openai | `OPENAI_API_KEY` | Native | gpt-5.3-codex |
911
911
  | kimi | `KIMI_API_KEY` | Native | kimi-k2.6 |
912
- | kimi-code | `KIMI_API_KEY` | Native | kimi-for-coding |
912
+ | kimi-code | `KIMI_CODE_API_KEY` | Native | kimi-for-coding |
913
913
  | qwen | `QWEN_API_KEY` | Native | qwen3.5-plus |
914
914
  | zhipu | `ZHIPU_API_KEY` | Native | glm-5 |
915
- | zhipu-coding | `ZHIPU_API_KEY` | Native | glm-5 |
916
- | minimax-coding | `MINIMAX_API_KEY` | Native | MiniMax-M2.7 |
917
- | mimo-coding | `MIMO_API_KEY` | Native | mimo-v2.5-pro (Xiaomi Token Plan, Anthropic-compat) |
918
- | ark-coding | `ARK_API_KEY` | Native | glm-5.1 (Volcengine Ark Coding Plan, multi-model gateway, Anthropic-compat) |
915
+ | zhipu-coding | `ZHIPU_CODING_API_KEY` | Native | glm-5 |
916
+ | minimax-coding | `MINIMAX_CODING_API_KEY` | Native | MiniMax-M2.7 |
917
+ | mimo | `MIMO_API_KEY` | Native | mimo-v2.5-pro (Xiaomi MiMo pay-per-token, Anthropic-compat) |
918
+ | mimo-coding | `MIMO_CODING_API_KEY` | Native | mimo-v2.5-pro (Xiaomi Token Plan, Anthropic-compat) |
919
+ | ark-coding | `ARK_CODING_API_KEY` | Native | glm-5.1 (Volcengine Ark Coding Plan, multi-model gateway, Anthropic-compat) |
919
920
  | deepseek | `DEEPSEEK_API_KEY` | Native | deepseek-v4-flash |
920
921
  | gemini-cli | `GEMINI_API_KEY` | Prompt-only / CLI bridge | (via gemini CLI) |
921
922
  | codex-cli | `OPENAI_API_KEY` | Prompt-only / CLI bridge | (via codex CLI) |
package/README_CN.md CHANGED
@@ -327,13 +327,14 @@ dist/binary/linux-x64/
327
327
  | anthropic | `ANTHROPIC_API_KEY` | Native | claude-sonnet-4-6 |
328
328
  | openai | `OPENAI_API_KEY` | Native | gpt-5.3-codex |
329
329
  | kimi | `KIMI_API_KEY` | Native | kimi-k2.6 |
330
- | kimi-code | `KIMI_API_KEY` | Native | kimi-for-coding |
330
+ | kimi-code | `KIMI_CODE_API_KEY` | Native | kimi-for-coding |
331
331
  | qwen | `QWEN_API_KEY` | Native | qwen3.5-plus |
332
332
  | zhipu | `ZHIPU_API_KEY` | Native | glm-5 |
333
- | zhipu-coding | `ZHIPU_API_KEY` | Native | glm-5(GLM Coding Plan 端点) |
334
- | minimax-coding | `MINIMAX_API_KEY` | Native | MiniMax-M2.7 |
335
- | mimo-coding | `MIMO_API_KEY` | Native | mimo-v2.5-pro(小米 MiMo Token Plan,Anthropic 协议) |
336
- | ark-coding | `ARK_API_KEY` | Native | glm-5.1(火山方舟 Coding Plan,多模型网关,Anthropic 协议) |
333
+ | zhipu-coding | `ZHIPU_CODING_API_KEY` | Native | glm-5(GLM Coding Plan 端点) |
334
+ | minimax-coding | `MINIMAX_CODING_API_KEY` | Native | MiniMax-M2.7 |
335
+ | mimo | `MIMO_API_KEY` | Native | mimo-v2.5-pro(小米 MiMo 按量计费,Anthropic 协议) |
336
+ | mimo-coding | `MIMO_CODING_API_KEY` | Native | mimo-v2.5-pro(小米 MiMo Token PlanAnthropic 协议) |
337
+ | ark-coding | `ARK_CODING_API_KEY` | Native | glm-5.1(火山方舟 Coding Plan,多模型网关,Anthropic 协议) |
337
338
  | deepseek | `DEEPSEEK_API_KEY` | Native | deepseek-v4-flash |
338
339
  | gemini-cli | `GEMINI_API_KEY` | Prompt-only / CLI bridge | (通过 gemini CLI) |
339
340
  | codex-cli | `OPENAI_API_KEY` | Prompt-only / CLI bridge | (通过 codex CLI) |