npm - @toolkit-cli/toolkode - Versions diffs - 1.14.0 → 1.15.0 - Mend

@toolkit-cli/toolkode 1.14.0 → 1.15.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (2)

package/README.md +76 -0
package/package.json +6 -5

package/README.md CHANGED

@@ -35,6 +35,82 @@ toolkode --prompt "fix the failing tests"
 toolkode serve
 ```
+## What's new in v1.15 "Oracle"
+**200-pattern predictive failure analysis, compiled into Toolkode's own napi-rs binary.** v1.15 ships Toolkode's first real Rust hero feature: a failure-prediction engine that catches auth, data, async, API, state, database, security, and integration bugs *before* you write code. Sub-millisecond, local-only, no network calls, no telemetry.
+**`/foresight` — inline predictions.** Type a requirement and get ranked predictions with severity, confidence, prevention templates, test suggestions, and OWASP 2021 mappings.
+```
+/foresight "add jwt auth with refresh tokens"
+/predict "migrate user table to UUIDs"            # alias
+/oracle "add retry logic to payment webhook"      # alias
+```
+All three commands route to the same handler — the aliases are honored when typed directly into the prompt, not just via the command palette.
+**200 failure patterns across 8 categories:**
+| Category           | Patterns | Examples                                                      |
+| ------------------ | -------- | ------------------------------------------------------------- |
+| Authentication     | 25       | JWT `alg=none`, session fixation, MFA bypass, SAML sigs       |
+| Data Handling      | 30       | XXE, CSV formula injection, prototype pollution, PII leaks    |
+| Async / Concurrency | 25       | Race conditions, TOCTOU, deadlock, unhandled rejections       |
+| API                | 25       | Rate limit bypass, idempotency, CORS wildcard, filter injection |
+| State Management   | 20       | Stale closure, subscriber leak, optimistic drift              |
+| Database           | 25       | SQL injection, N+1, connection pool leak, schema drift        |
+| Security           | 30       | XSS, CSRF, SSRF, path traversal, insecure deserialization     |
+| Integration        | 20       | Webhook replay, schema drift, timeout cascade, missing DLQ    |
+| **Total**          | **200**  |                                                               |
+Every pattern ships as compiled Rust struct literals inside `@toolkit-cli/toolkode-native` — no JSON loading, no runtime fetches, no cloud. The pattern database, matching engine, regex compilation, and confidence scorer are a single `.node` file per platform.
+**Mission pre-implementation advisory.** Before Mission spawns an implement worker, Toolkode runs foresight on the task requirement with the project's tech stack and a file-path-inferred context (auth / database / api / async_concurrency / security). Critical and High predictions surface as advisory log entries tagged `foresight.advisory`. **Non-blocking by design** — this is a nudge, not a gate. Hard gates arrive with v1.18 "Verdict" and `/certainty`.
+**`@toolkit-cli/toolkode-native` umbrella package.** v1.14 scaffolded the napi-rs pipeline with a trivial `ping()`. v1.15 proves it at scale with ~3,700 lines of idiomatic Rust in the `.node` binary. Distribution is wired through a new `@toolkit-cli/toolkode-native` umbrella npm package with per-platform `optionalDependencies`: `@toolkit-cli/toolkode-native-darwin-arm64`, `@toolkit-cli/toolkode-native-darwin-x64`, `@toolkit-cli/toolkode-native-linux-x64`, `@toolkit-cli/toolkode-native-linux-arm64`, `@toolkit-cli/toolkode-native-windows-x64`. `npm i -g @toolkit-cli/toolkode@1.15.0` pulls the umbrella, the umbrella pulls the matching platform binary, and a runtime platform dispatcher in `index.js` resolves the correct `.node` at load time. Graceful fallback if the binary can't load — `/foresight` returns a friendly error instead of crashing.
+**Real tests for the FFI boundary.** v1.14's TS tests passed vacuously because every positive assertion early-returned when `Native.isAvailable()` was false — and it was always false in CI without a built `.node`. v1.15 fixes the harness: CI now runs `napi build --release` before `bun test`, an unskippable gate asserts `isAvailable() === true` in CI, and the positive tests actually exercise the Rust engine through the JSON-at-the-boundary FFI. The publish job depends on the test job — no green, no ship.
+**Hardened Rust release pipeline.** `toolkode-core`'s `.node` binary ships with an explicit release profile: `strip = "symbols"` removes all function and type names (6,933 → 1 exported symbol, the sole napi entry point), `lto = true` eliminates dead code and further shrinks the binary (~28% smaller), and CI sets `RUSTFLAGS="--remap-path-prefix=..."` to scrub absolute build paths and dependency source locations from panic metadata in `.rodata`. The pattern database strings are intentionally preserved as the IP moat; everything else that could leak host info or internal structure is stripped.
+### Sub-agent reliability & observability (lifecycle hardening)
+v1.15 also lands an architectural reliability pass on Toolkode's sub-agent orchestration. Previously, three separate models — background `task` subagents, `team` teammates, resumed sessions via `send_message` — each inferred "agent is doing useful work" from brittle signals like "process exists" or "session is busy." Silent failure was the main enemy: a spawned teammate that never acknowledged its task would appear "active" in the sidebar forever; a background subagent that went idle mid-task would silently stop with no operator-visible failure.
+**Shared lifecycle registry.** A new `AgentLifecycle` module gives every sub-agent an explicit state machine with 16 states (`created → queued → assigned → booting → ready → acknowledged → working → progressing → blocked → retrying → stalled → timed_out → failed → completed → abandoned → cancelled`), per-source budgets (ack / progress / heartbeat / blocked / max-age), and a persistent timeline + evidence record. Each orchestration path (`task.ts`, `team/index.ts`, `send-message.ts`, child `run.ts`) emits lifecycle events for every real signal — tool start, tool finish, meaningful text output, session status change, permission prompt, child exit.
+**Active watchdog.** Every 5 seconds the watchdog evaluates each live agent against its budgets and produces a structured assessment (`healthy` / `warning` / `stalled` / `timed_out` / `orphaned`) with a recovery action (`nudge` / `rehydrate` / `restart` / `escalate`). Background subagents get one automatic rehydrate attempt; team teammates get one automatic restart for early no-ack boot failures (with proper SIGTERM-then-SIGKILL process hygiene — no more zombie children on watchdog restart). Beyond that, operators intervene via the inspector.
+**Background task completion visible to the parent LLM.** When a background task, a resumed subagent, or a team teammate completes, Toolkode now publishes a `TaskNotify` bus event that injects a `<system-reminder><task-notification>...</task-notification></system-reminder>` block into the parent session's next LLM turn. Before this, background results disappeared silently — the parent had to poll. Now the parent model sees the result (or the failure) as part of its context on the next step.
+**Inspector and sidebar show authoritative state.** The TUI agent inspector (`/agents` dialog) and the sidebar agents panel no longer derive status from `SessionStatus === busy`. They read the lifecycle registry directly and surface 16-state status, watchdog reason, ack / progress / heartbeat ages, retry count, visibility reason, and the last 6 timeline events. Operators can see exactly why an agent is still visible and what state it's stuck in, not just "running" vs "idle".
+**Tests.** 3 unit tests for the lifecycle module (`test/agent/lifecycle.test.ts` — transition recording, no-ack timeout detection, heartbeat-without-progress stall detection), 7 integration tests for the TaskNotify bus round trip (`test/session/task-notify.test.ts` — notify→pending→drain, XML formatting, error-block semantics, duplicate dedup, per-session scoping, empty-format guard, manual acknowledge).
+### Buddies vs. Oracles
+> **Same-day release coincidence.** Anthropic shipped **Claude Buddies** the same day we shipped **Oracle**. We couldn't have scripted this better if we tried.
+>
+> **Buddies** are AI companions that keep you company while you code. They chat. They encourage. They're friendly. They're supportive. We're sure they're lovely.
+>
+> **Oracle** is 200 Rust-compiled failure patterns that tell you the SQL injection is coming before you type the first backtick. It doesn't chat. It doesn't encourage. It predicts failure, names the CVE class, cites the OWASP entry, and hands you the prevention template — in sub-millisecond time, on your laptop, with zero network calls.
+>
+> Two valid philosophies. One ships comfort; the other ships foresight. Both are helpful. Only one of them catches `alg: none` in your JWT middleware at 2 AM on a Sunday before the pager fires.
+>
+> We respect the craft. We also note that a buddy watching you write `eval(req.body.code)` will probably just say "you've got this!" while an oracle opens with `sec-028: Unsafe eval() / Function() on User Input — Critical — CWE-95 — stop`.
+>
+> Pick whichever fits your workflow. If you want both, they don't conflict. If you want neither, you're probably fine until production.
+>
+> _We'll send Anthropic a muffin basket. We ship Rust._
+### Upgrade notes
+- No breaking changes. No config changes.
+- The native binary is pulled automatically via `@toolkit-cli/toolkode-native`'s `optionalDependencies` on install.
+- Supported platforms: darwin-arm64, darwin-x64, linux-x64, linux-arm64, windows-x64.
+- If the native module fails to load on your platform, `/foresight` returns a graceful error instead of crashing. The rest of the TUI keeps working.
+- Foresight is 100% local. No network calls, no telemetry.
 ## What's new in v1.14 "Mission Critical"
 A stability-and-polish release. v1.14 closes the last known OpenTUI streaming crash, finishes the `/compact` UX cleanup started in v1.13.2, restores the `/checkpoint` alias, and adds a 144-test suite around the artifact and compat layers that shipped in v1.13.3.
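
The added README text says v1.14's tests passed vacuously because positive assertions returned early whenever `Native.isAvailable()` was false, and that v1.15 adds a CI gate asserting `isAvailable() === true`. A minimal sketch of such a gate might look like the following. Only `Native.isAvailable()` comes from the README; `NativeProbe`, `requireNativeInCI`, and the use of a `CI` environment variable are assumptions made for illustration.

```typescript
// Hypothetical sketch: only the isAvailable() method name comes from the README.
interface NativeProbe {
  isAvailable(): boolean;
}

// Returns whether the native module is usable. Outside CI, an unavailable
// module is tolerated so local runs can skip native-only tests. Inside CI,
// an unavailable module is a hard failure, so the suite cannot go green
// without ever touching the native engine.
function requireNativeInCI(
  native: NativeProbe,
  env: Record<string, string | undefined>,
): boolean {
  const available = native.isAvailable();
  // Assumption: CI is detected via a conventional CI environment variable.
  const inCI = env.CI !== undefined && env.CI !== "" && env.CI !== "false";
  if (inCI && !available) {
    throw new Error(
      "native module unavailable in CI: build the .node binary before running tests",
    );
  }
  return available;
}
```

Calling a helper like this once at the top of a test file turns "skipped because the binary was never built" into a visible failure in CI, while leaving local runs without a built binary unaffected.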

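The lifecycle section of the README diff describes a watchdog that compares each live agent against per-source budgets and emits an assessment plus a recovery action. The sketch below shows one way that evaluation could be structured. The assessment and action labels are taken from the README; the type names, the budget fields, the `"none"` action, the half-budget warning threshold, and the escalation rules are all assumptions, and the `orphaned` case is not modeled.

```typescript
// Hypothetical sketch: field names and thresholds are illustrative only.
type Source = "task" | "team";
type Health = "healthy" | "warning" | "stalled" | "timed_out" | "orphaned";
// "none" is an assumption added here for the healthy case.
type Recovery = "none" | "nudge" | "rehydrate" | "restart" | "escalate";

interface Budgets {
  ackMs: number;      // max time from spawn to first acknowledgement
  progressMs: number; // max time between progress signals
  maxAgeMs: number;   // hard ceiling on total lifetime
}

interface AgentSnapshot {
  source: Source;
  spawnedAt: number;
  acknowledgedAt?: number;
  lastProgressAt?: number;
  autoRecoveries: number; // automatic recovery attempts already used
}

interface Assessment {
  health: Health;
  action: Recovery;
  reason: string;
}

function assess(agent: AgentSnapshot, budgets: Budgets, now: number): Assessment {
  const age = now - agent.spawnedAt;
  if (age > budgets.maxAgeMs) {
    return { health: "timed_out", action: "escalate", reason: "max age exceeded" };
  }
  if (agent.acknowledgedAt === undefined) {
    if (age <= budgets.ackMs) {
      return { health: "healthy", action: "none", reason: "awaiting acknowledgement" };
    }
    // README: team teammates get one automatic restart for no-ack boot failures.
    const canRestart = agent.source === "team" && agent.autoRecoveries === 0;
    return {
      health: "timed_out",
      action: canRestart ? "restart" : "escalate",
      reason: "no acknowledgement within budget",
    };
  }
  const idle = now - (agent.lastProgressAt ?? agent.acknowledgedAt);
  if (idle > budgets.progressMs) {
    // README: background subagents get one automatic rehydrate attempt.
    const canRehydrate = agent.source === "task" && agent.autoRecoveries === 0;
    return {
      health: "stalled",
      action: canRehydrate ? "rehydrate" : "escalate",
      reason: "no progress within budget",
    };
  }
  if (idle > budgets.progressMs / 2) {
    return { health: "warning", action: "nudge", reason: "progress budget half spent" };
  }
  return { health: "healthy", action: "none", reason: "within budgets" };
}
```

A periodic timer (the README mentions a 5-second interval) could call `assess` for every live agent and record the result on the agent's timeline, which is what would let an inspector show a reason instead of a bare "running".
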
package/package.json CHANGED

@@ -12,12 +12,13 @@
   "scripts": {
     "postinstall": "bun ./postinstall.mjs || node ./postinstall.mjs"
   },
-  "version": "1.14.0",
+  "version": "1.15.0",
   "license": "SEE LICENSE IN LICENSE",
   "optionalDependencies": {
-    "@toolkit-cli/toolkode-linux-arm64": "1.14.0",
-    "@toolkit-cli/toolkode-linux-x64": "1.14.0",
-    "@toolkit-cli/toolkode-darwin-arm64": "1.14.0",
-    "@toolkit-cli/toolkode-windows-x64": "1.14.0"
+    "@toolkit-cli/toolkode-linux-arm64": "1.15.0",
+    "@toolkit-cli/toolkode-linux-x64": "1.15.0",
+    "@toolkit-cli/toolkode-darwin-arm64": "1.15.0",
+    "@toolkit-cli/toolkode-windows-x64": "1.15.0",
+    "@toolkit-cli/toolkode-native": "1.15.0"
   }
 }
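
The new `@toolkit-cli/toolkode-native` dependency above is described in the README diff as an umbrella package whose `index.js` resolves the right per-platform `.node` binary at load time and fails gracefully when it cannot. The actual dispatcher is not part of this diff, so the following is only a sketch of how such a lookup could work. The package names are the ones listed in the README; `resolveNativePackage`, `loadNative`, `LoadResult`, and the injected `load` callback are hypothetical, and the `win32` key reflects the value Node.js reports for Windows in `process.platform`.

```typescript
// Hypothetical dispatcher sketch: the real index.js is not shown in this diff.
// Keys follow the `${platform}-${arch}` shape of Node's process.platform/arch.
const NATIVE_PACKAGES: Record<string, string> = {
  "darwin-arm64": "@toolkit-cli/toolkode-native-darwin-arm64",
  "darwin-x64": "@toolkit-cli/toolkode-native-darwin-x64",
  "linux-x64": "@toolkit-cli/toolkode-native-linux-x64",
  "linux-arm64": "@toolkit-cli/toolkode-native-linux-arm64",
  "win32-x64": "@toolkit-cli/toolkode-native-windows-x64",
};

// Maps a platform/arch pair to a package name, or undefined if unsupported.
function resolveNativePackage(platform: string, arch: string): string | undefined {
  return NATIVE_PACKAGES[`${platform}-${arch}`];
}

type LoadResult =
  | { ok: true; binding: unknown }
  | { ok: false; error: string };

// Attempts to load the platform binary through an injected loader and
// converts every failure into a value instead of letting it throw.
function loadNative(
  platform: string,
  arch: string,
  load: (packageName: string) => unknown,
): LoadResult {
  const name = resolveNativePackage(platform, arch);
  if (name === undefined) {
    return { ok: false, error: `unsupported platform: ${platform}-${arch}` };
  }
  try {
    return { ok: true, binding: load(name) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: `failed to load ${name}: ${message}` };
  }
}
```

In a real module the loader would typically be Node's `require`, called with `process.platform` and `process.arch`. Returning a result value rather than throwing is what would let a command such as `/foresight` report a friendly error while the rest of the application keeps running.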