npm - loki-mode - Versions diffs - 7.30.0 → 7.32.0 - Mend

loki-mode 7.30.0 → 7.32.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (23) hide show

package/README.md +4 -2
package/SKILL.md +2 -2
package/VERSION +1 -1
package/autonomy/context-tracker.py +8 -0
package/autonomy/loki +799 -123
package/autonomy/mcp-launch.sh +149 -36
package/autonomy/run.sh +168 -4
package/bin/loki +71 -0
package/dashboard/__init__.py +1 -1
package/dashboard/server.py +326 -1
package/dashboard/static/index.html +105 -39
package/docs/INSTALLATION.md +1 -1
package/docs/competitive/replit-lovable-analysis.md +1 -1
package/loki-ts/data/model-pricing.json +1 -0
package/loki-ts/dist/loki.js +233 -231
package/mcp/__init__.py +1 -1
package/mcp/_sdk_loader.py +157 -0
package/mcp/lsp_proxy.py +61 -61
package/mcp/server.py +35 -129
package/package.json +1 -1
package/providers/claude.sh +76 -19
package/providers/model_catalog.json +9 -0
package/skills/model-selection.md +49 -1

package/skills/model-selection.md CHANGED Viewed

@@ -5,7 +5,7 @@
 Since v6.81.0, the main RARV loop uses **one pinned model per session** instead of rotating models per RARV phase. This eliminates per-iteration tier churn and keeps the main loop behavior predictable.
 - Controlled by `LOKI_SESSION_MODEL` (default: `sonnet`).
-- Accepted values: `opus`, `sonnet`, `haiku`, or a raw tier name (`planning`, `development`, `fast`). Model names are mapped to abstract tiers (opus -> planning, sonnet -> development, haiku -> fast) so provider helpers resolve correctly.
+- Accepted values: `opus`, `sonnet`, `haiku`, or a raw tier name (`planning`, `development`, `fast`). The value names a **tier**, not a fixed model: it is mapped to an abstract tier (opus -> planning, sonnet -> development, haiku -> fast) which is then resolved to a concrete model through the provider config (`PROVIDER_MODEL_PLANNING/DEVELOPMENT/FAST`). On stock config the default `sonnet` pin = the **development** tier = `PROVIDER_MODEL_DEVELOPMENT` = **opus** (the model the runner actually dispatches). Under `LOKI_ALLOW_HAIKU=true` the development tier lowers to sonnet (and fast to haiku). This tier-to-model resolution is why `loki plan` quotes Opus on the stock default path: the quote names the dispatched model, not the pin alias. A mid-flight `.loki/state/model-override` is different: that alias is fed straight to `--model` (a `sonnet` override dispatches sonnet, no tier route).
 - `get_rarv_tier()` is retained and still used for subagent Task-tool dispatches (planning vs implementation vs fast work), not for the main loop.
 **Rollback (legacy RARV tier rotation in the main loop):**
@@ -18,6 +18,54 @@ Set `LOKI_LEGACY_TIER_SWITCHING=true` to restore the previous per-iteration `get
 ---
+## Fable 5 (top-tier advisory model)
+Claude Fable 5 (alias `fable`, full id `claude-fable-5`) is Anthropic's most capable widely released model. It is NOT a default workhorse: it is priced at exactly **2x Opus** ($10 / $50 per MTok in/out vs Opus $5 / $25), uses always-on adaptive thinking, and is reserved for the few decision points where its extra investigation and self-verification pay off.
+**When Fable is documented to help (route here, opt-in, cost shown):**
+- Architecture decisions, root-cause investigations, outage debugging.
+- Long-horizon tasks you would otherwise decompose.
+**When NOT to route to Fable:**
+- Security review. Fable's safety classifiers refuse cybersecurity content, and in non-interactive (`-p`) mode a flagged request ends the turn with a refusal instead of a transparent Opus re-run, which would break the unanimous-council gate. Security review stays on Opus, always. (Defensive-cyber capability lives in Mythos 5 via Project Glasswing, not Fable.)
+- Default planning/development/fast work. The 2x cost is only worth it for the cases above.
+### Mid-flight model switching
+You can change the model a **live run** uses, from the dashboard or by writing a state file. The switch applies at the **next iteration boundary** (each iteration spawns a fresh `claude -p`, which fixes the model per invocation, so it never changes mid-invocation). The override applies to the **current run only**: the runner clears a leftover override at the start of a fresh run, so a switch never silently carries into future runs. The override is also clamped by `LOKI_MAX_TIER`: if the operator set a cost ceiling, an override above it is clamped down with one honest log line, and the dashboard reports the clamped effective model. The clamp is scoped, not blanket: on the override path a `sonnet` ceiling downgrades only `fable` (to `PROVIDER_MODEL_DEVELOPMENT`, opus by default), while a plain `opus` override stays opus; a `haiku` ceiling pins to `PROVIDER_MODEL_FAST`; an `opus` ceiling caps only `fable` back to opus. (The clamp is enforced on the bash runner, which is the live `start` route; the experimental Bun runner does not yet port it.)
+- Dashboard: the Model selector in the session-control panel. The Fable option shows its 2x-Opus cost; an inline notice discloses the iteration-boundary timing. It calls `POST /api/session/model`.
+- File / CLI: write an allowlisted alias (`haiku`, `sonnet`, `opus`, `fable`) to `.loki/state/model-override`. Empty or absent file reverts to the tier mapping. Invalid content is ignored (the runtime warns once). The value is allowlist-validated because it is fed straight into `claude --model`.
+```bash
+# Switch a live run to Fable for the next iteration
+echo fable > .loki/state/model-override
+# Revert to the tier mapping
+rm -f .loki/state/model-override
+```
+### Architecture-tier opt-in: `LOKI_FABLE_ARCHITECT`
+`LOKI_FABLE_ARCHITECT=1` routes ONLY the **first iteration** (the architecture / REASON pass) to Fable; every later iteration uses the session tier (the default pin, e.g. Sonnet/Opus). Default OFF because of the 2x cost. An explicit `LOKI_CLAUDE_MODEL_PLANNING` / `LOKI_MODEL_PLANNING` still wins, and the `LOKI_MAX_TIER` ceiling still clamps Fable down. This first-iteration scope is deliberate: under the default session pinning there is no recurring "planning tier", so scoping to the architecture iteration is the only honest way to give the architecture pass Fable without converting the whole session. `loki plan` discloses the extra Fable architecture iteration in its quote.
+```bash
+LOKI_FABLE_ARCHITECT=1 loki start ./prd.md
+```
+### Financial / physical-stakes work (manual override, by design)
+For diffs that move money or mutate infrastructure (payments, billing, spend authorization, Terraform/k8s/infra changes), routing to Fable is **founder-directed and not doc-evidenced**, so Loki does NOT auto-route based on file-path heuristics in this release. Use the mid-flight switch (set the model to `fable` for those iterations, then clear it) as the deliberate, consent-based mechanism. This keeps the high-cost model an explicit human choice, not an implicit one.
+### Cost estimate honesty
+`loki plan` quotes the model the runner actually dispatches, on **every** path, for the levers the runner honors. This includes the stock no-lever default: the `sonnet` session pin resolves through the development tier to opus, so the quote names Opus (not Sonnet) on the default path. Set `LOKI_SESSION_MODEL=fable`, or have a pending `.loki/state/model-override` of `fable`, and the estimate uses Fable's $10/$50 pricing with a 2x-Opus note. The quote, the dashboard `effective` model, and the actual `claude --model` argument always agree on both routes: the session-pin tier route (no override) and the override-path clamp (override present) are each mirrored exactly in the estimator and the dashboard, so the same lever that changes the quote changes the run. `LOKI_MAX_TIER` is applied to the quote too, so the estimate never quotes a model above the operator's cost ceiling. (`LOKI_MODEL` is not a session lever and does not affect the run; use `LOKI_SESSION_MODEL`.)
+**Where the dispatched model is disclosed (precisely):**
+- Override path (a `.loki/state/model-override` alias): the runner logs the override and the clamp line (`model override: <model>`, plus the one honest clamp-down log line when `LOKI_MAX_TIER` reduces it); the dashboard `GET /api/session/model` `effective` field reports the clamped model.
+- Session pin / architect path (no override, the default): the runner logs the RARV effective-model line each iteration (`RARV Phase: <phase> -> Tier: <tier> (<model>)`), and `loki plan` discloses the provenance (`pinned via LOKI_SESSION_MODEL=<pin> -> <tier> tier -> <model>`) so the quoted model is legible against the pin alias. The `LOKI_FABLE_ARCHITECT=1` first-iteration Fable opt-in is disclosed as an extra architecture iteration in the quote.
+---
 ## Multi-Provider Support (v5.0.0)
 Loki Mode supports five AI providers. Claude has full features; all others run in **degraded mode** (sequential execution only, no Task tool, no parallel agents).