npm - ultracost - Versions diffs - 0.2.0 - Mend

ultracost 0.2.0

Files changed (24)

package/CHANGELOG.md +81 -0
package/LICENSE +21 -0
package/NOTICE +18 -0
package/README.md +306 -0
package/bin/cli.js +264 -0
package/docs/ESTIMATES.md +191 -0
package/docs/PUBLISHING.md +164 -0
package/docs/TESTING.md +260 -0
package/docs/architecture.md +166 -0
package/docs/policy.md +42 -0
package/docs/ultracode.md +37 -0
package/package.json +52 -0
package/src/estimate.js +101 -0
package/src/guard.js +300 -0
package/src/index.js +7 -0
package/src/install.js +113 -0
package/src/log.js +18 -0
package/src/paths.js +27 -0
package/src/policy.js +80 -0
package/src/pricing.js +82 -0
package/src/rules.js +84 -0
package/templates/hooks/reinject.mjs +41 -0
package/templates/hooks/workflow-gate.mjs +126 -0
package/templates/policy.default.json +49 -0

package/CHANGELOG.md ADDED

@@ -0,0 +1,81 @@
+# Changelog
+All notable changes to this project are documented here. The format is based on
+[Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to
+[Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [Unreleased]
+### Changed
+- **The cost gate is now mode-aware and hard in every permission mode.** It reads
+  `permission_mode` from the `PreToolUse` event: it asks (with the estimate) in
+  `default`/`acceptEdits`/`auto`, and **auto-denies** an unpinned/banned/`inherit` workflow
+  in `bypassPermissions`/`dontAsk` — where an `ask` is auto-approved and wouldn't pause. A
+  `PreToolUse` `deny` is honored in every mode, so this closes the bypass gap without an env
+  var. `ULTRACOST_GATE=strict` denies on any problem in all modes; new `=ask` opts out of the
+  escalation (always ask); `=off` disables. Documented the residual upstream limitation that
+  Claude Code skips `PreToolUse` hooks for subagents dispatched under `bypassPermissions`
+  ([#43772](https://github.com/anthropics/claude-code/issues/43772)).
+### Added
+- **`pipeline(items, ...stages)` fan-out detection.** The guard and estimate now recognize
+  the Workflow API's `pipeline()` primitive: every stage's `agent()` runs once per item, so
+  those stages are counted as fan-out (like `.map`). A live test exposed that an `ultracode`
+  build/verify/fix workflow uses `pipeline()`, which the old detector counted as a few fixed
+  agents — badly under-reporting both the agent count and the cost.
+- **The cost gate now enforces, not just estimates.** `workflow-gate.mjs` runs the static
+  guard and leads the prompt with `⚠ N/M stage(s) NOT pinned -> will inherit <session model>`
+  when any stage is unpinned/banned/`inherit`. New `ULTRACOST_GATE=strict` mode **denies**
+  such launches outright (the model must pin every stage and relaunch).
+- **The estimate is surfaced via `systemMessage`** so it's actually visible to the user.
+  Claude Code does not render `permissionDecisionReason` for `"ask"` decisions in the TUI
+  ([#24059](https://github.com/anthropics/claude-code/issues/24059)); the gate now sends the
+  numbers through the documented `systemMessage` channel (and still sets the reason).
+### Changed
+- **The pre-flight cost gate is now ON by default and deterministic.** The plugin
+  registers the `PreToolUse` hook on the `Workflow` tool (`hooks/hooks.json` →
+  `templates/hooks/workflow-gate.mjs`), so every dynamic-workflow launch hard-stops with a
+  cost estimate and an approve/deny prompt — no longer reliant on the model invoking
+  `AskUserQuestion`. Set `ULTRACOST_GATE=off` to disable for non-interactive runs
+  (`claude -p`, Auto Mode, CI); bypass-permissions mode auto-approves it. The gate now
+  always pauses on a `Workflow` launch (even if the script can't be read for an estimate)
+  and fails closed rather than letting an unpriced fan-out through.
+## [0.2.0] - 2026-06-14
+### Added
+- **`ultracost estimate <script>`** — static cost estimate for a workflow: agent count
+  (fan-outs as `N x`), model mix, and tiered-vs-all-opus cost with savings. `--json` supported.
+- **Dynamic per-stage effort.** The policy now has Claude pick an `effort` per stage
+  (`low`..`xhigh`, bounded by model) instead of a fixed per-tier effort, and the estimate
+  factors effort into output-token cost. New `effort` block in `policy.json`.
+- **Official-sourced pricing.** `pricing` block in `policy.json` carries `_source`/`_asOf`
+  provenance; `ultracost pricing` shows it and `ultracost pricing refresh` re-fetches
+  Anthropic's official pricing page and rewrites it. The estimate stays offline.
+- **Pre-flight cost gate.** The injected policy + skill have Claude estimate a workflow and
+  offer Approve / Cancel / Modify via `AskUserQuestion` before launching. Ships an opt-in
+  deterministic `PreToolUse` gate (`templates/hooks/workflow-gate.mjs`) on the `Workflow` tool.
+### Changed
+- The `SessionStart` hook now injects the routing policy as `additionalContext` on every
+  source (`startup|resume|clear|compact`), not just `compact`. A live end-to-end test showed
+  the model-invoked skill alone was only *offered* and never opened during workflow authoring,
+  so stages stayed unpinned; injecting the policy at session start makes the same prompt pin
+  every stage with correct tiers. The skill now plays a secondary, explicit-reference role.
+## [0.1.0] - 2026-06-14
+### Added
+- `ultracost init` — install a quality-first routing policy, compile it into
+  `~/.claude/CLAUDE.md`, and register a `SessionStart` policy-injection hook.
+- **Workflow Guard** (`ultracost check`) — static analysis that flags `agent()` stages
+  missing a model pin, with codes `UC001`–`UC005`, `--json`, and `--fix`.
+- Data-driven `policy.json` with load-time validation (rejects undefined default tiers
+  and tiers whose model is in `neverUse`).
+- `ultracost status`, `ultracost doctor`, `ultracost uninstall`.
+- Docs: architecture, policy reference, and the ultracode rationale.
+[Unreleased]: https://github.com/danielkremen818/ultracost/compare/v0.2.0...HEAD
+[0.2.0]: https://github.com/danielkremen818/ultracost/compare/v0.1.0...v0.2.0
+[0.1.0]: https://github.com/danielkremen818/ultracost/releases/tag/v0.1.0

package/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Daniel Kremen
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/NOTICE ADDED

@@ -0,0 +1,18 @@
+ultracost
+Copyright (c) 2026 Daniel Kremen
+This project is an original, clean-room implementation. It was informed by
+prior art in the Claude Code model-routing ecosystem. None of the projects
+below contributed code to ultracost; they are credited as inspiration for the
+problem space (routing Claude Code subagents to cost-appropriate models).
+Prior art / inspiration:
+  - JacobiusMakes/claude-budget   — static model-routed agents + CLAUDE.md rules + post-compact reminder
+  - 0xrdan/claude-router          — automatic per-prompt routing via hooks
+  - gacabartosz/claude-smart-router — research-inspired complexity scoring
+  - R4CK/claude-model-changer     — quota-aware complexity routing
+ultracost's distinct contribution is per-stage routing for dynamic workflows
+(ultracode): a quality-first policy plus a static-analysis guard that inspects
+the workflow scripts Claude Code authors and flags subagent stages that would
+silently inherit the session model.

package/README.md ADDED

@@ -0,0 +1,306 @@
+<div align="center">
+# ultracost
+**Per-stage model routing for Claude Code dynamic workflows.**
+Stop a single `ultracode` fan-out from running 40 subagents on Opus 4.8 by accident.
+[![npm](https://img.shields.io/npm/v/ultracost.svg)](https://www.npmjs.com/package/ultracost)
+[![CI](https://github.com/danielkremen818/ultracost/actions/workflows/ci.yml/badge.svg)](https://github.com/danielkremen818/ultracost/actions/workflows/ci.yml)
+[![license](https://img.shields.io/badge/license-MIT-blue.svg)](./LICENSE)
+[![node](https://img.shields.io/badge/node-%3E%3D24-brightgreen.svg)](https://nodejs.org)
+</div>
+---
+## About
+**ultracost keeps Claude Code's `ultracode` mode from silently running every
+subagent on Opus 4.8.** When `ultracode` is on, the session is pinned to Opus 4.8
+@ `xhigh` and a single dynamic workflow fans out to dozens of subagents that
+*inherit that model* unless every stage is pinned. ultracost makes the per-stage
+routing explicit, injects the policy at the start of every session, and ships a
+guard that fails any unpinned stage.
+> Built for `ultracode` (Opus 4.8 @ `xhigh` dynamic workflows) — that is the only
+> place the multi-agent fan-out it guards against happens.
+No telemetry. No network on the hot path. MIT.
+**Setup (plugin):**
+```text
+/plugin marketplace add danielkremen818/ultracost
+/plugin install ultracost@ultracost
+```
+**Built-in command:** `/ultracost:check [script]` — flag any `agent()` stage missing a model pin.
+**CLI verbs:** `init · check · audit · estimate · pricing · status · doctor · uninstall`.
+<div align="center">
+![ultracost — estimate, check, and audit on a real workflow script](./assets/demo.svg)
+</div>
+## The problem
+When `ultracode` is on, Claude Code runs the session on **Opus 4.8 @ xhigh** (the only model that supports `xhigh`) and auto-orchestrates **dynamic workflows** that fan out to dozens of subagents (up to 1,000). Two defaults compound:
+1. **Subagents inherit the session model.** No per-stage override → every stage is Opus 4.8.
+2. **The built-in workflow guidance tells Claude to _omit_ the per-agent model.** So inheritance wins.
+The documented result: [one prompt spawning 46 Opus subagents and ~3M tokens with no warning](https://github.com/anthropics/claude-code/issues/66023). A grep sweep and a per-file verifier do not need Opus.
+## The evidence: nobody pins a stage
+This is the default behavior, not user error. In a scan of ~22 real `ultracode` workflow scripts authored across `~/.claude/projects/**/workflows/scripts/`, **almost none pinned `model:` on any stage** — every stage inherited the session model (Opus 4.8 @ xhigh). Even Anthropic's own bundled `deep-research` workflow pins **zero** stages. Left to its defaults, Claude Code writes fan-outs that silently run everything on the most expensive model.
+You can reproduce this on your own history in one command:
+```bash
+npx ultracost audit ~/.claude/projects
+```
+## What ultracost does
+ultracost makes the routing **explicit, policy-driven, and verifiable** — without giving up quality on the work that matters.
+- **A quality-first policy.** Coding and reasoning stay on **Opus @ xhigh**. Pre-planned mechanical work and search/collection drop to **Sonnet**. **Haiku is never used.** You own the policy in one JSON file.
+- **Always-on routing guidance.** As a plugin, a `SessionStart` hook injects the policy as context at the start of every session (and re-injects it after compaction) — so it is present when Claude authors a workflow, without relying on the model choosing to open a skill. As the npm CLI, the same policy compiles into your `~/.claude/CLAUDE.md`. A routing skill ships alongside for explicit `/`-reference.
+- **The Workflow Guard.** A static analyzer that scans the workflow scripts Claude authors and flags any `agent()` stage missing a `model:` pin — so a fan-out can't silently inherit Opus. Run it by hand, via `/ultracost:check`, or in CI. **No other tool does this.**
+## Architecture
+One shared core in `src/`, two delivery surfaces: a Claude Code **plugin** (primary) and an **npm CLI** (secondary). Both compile from the same `policy.json`.
+```mermaid
+flowchart TD
+    subgraph core["src/ — shared core"]
+        POL["policy.js<br/>(policy.json — you own it)"]
+        RUL["rules.js<br/>(rule compiler)"]
+        GRD["guard.js<br/>(static analysis)"]
+    end
+    subgraph plugin["Claude Code plugin — PRIMARY"]
+        SK["skills/ultracost<br/>routing policy (always-relevant)"]
+        CMD["/ultracost:check command"]
+        HK["hooks.json<br/>SessionStart (all sources)"]
+    end
+    subgraph cli["npm CLI — secondary"]
+        BIN["bin/cli.js<br/>init · check · audit · estimate · pricing · status · doctor · uninstall"]
+    end
+    POL --> RUL
+    RUL --> SK
+    RUL --> BIN
+    GRD --> CMD
+    GRD --> BIN
+    HK --> RE["reinject.mjs<br/>(node, no bash/jq)"]
+    BIN --> RE
+    classDef ft fill:#1f6feb,stroke:#0b3d91,color:#fff;
+    class POL,RUL,GRD ft;
+```
+The plan lives in **data** (`policy.json`), not in prose buried in a prompt. The guard is the enforcement layer the model can't talk its way out of. See [`docs/architecture.md`](./docs/architecture.md) for the full picture.
+## Install
+### Plugin (recommended)
+Inside Claude Code, add the marketplace and install the plugin:
+```text
+/plugin marketplace add danielkremen818/ultracost
+/plugin install ultracost@ultracost
+```
+Then verify a workflow script at any time:
+```text
+/ultracost:check ./path/to/workflow.js
+```
+The plugin bundles four things and **touches none of your existing files**:
+- a **`SessionStart`** hook that injects the routing policy as context at session start and after compaction (the always-on guidance),
+- a **`PreToolUse` cost gate** on the `Workflow` tool that hard-stops every dynamic-workflow launch with an estimate (set `ULTRACOST_GATE=off` to disable; see [`docs/ESTIMATES.md`](./docs/ESTIMATES.md)),
+- the **`/ultracost:check`** command (the Workflow Guard), and
+- a **routing-policy skill** for explicit reference when Claude authors workflow/ultracode/subagent scripts.
+> Requires Claude Code with the `/plugin` command (run `/help` to confirm) and dynamic workflows enabled.
+### npm CLI (secondary)
+For CI, scripting, or if you prefer the `~/.claude/CLAUDE.md` injection path:
+```bash
+npx ultracost init
+```
+This writes `~/.claude/ultracost/policy.json`, injects the routing block into `~/.claude/CLAUDE.md`, installs the re-inject hook (`~/.claude/ultracost/reinject.mjs`), and registers it in `~/.claude/settings.json`. New sessions pick it up immediately. Paths honor `CLAUDE_CONFIG_DIR` if you've relocated your config.
+> Requires Node ≥ 24.
+## Uninstall
+### Plugin
+```text
+/plugin uninstall ultracost@ultracost
+/plugin marketplace remove ultracost
+```
+The plugin touches none of your own files — its hook, command, and skill live inside the plugin package, so removing it removes everything; nothing is left in your `~/.claude/CLAUDE.md` or `settings.json`. If Claude Code offers to "disable in `settings.local.json`" instead (because the plugin was enabled in a shared `settings.json`), that has the same effect — accept it, or remove the marketplace as above.
+### npm CLI
+```bash
+ultracost uninstall
+```
+Reverses everything `init` did: removes the routing block from `~/.claude/CLAUDE.md`, deletes `~/.claude/ultracost/`, and unregisters the hook from `~/.claude/settings.json` (an invalid `settings.json` is reported, never overwritten).
+## Quickstart (CLI)
+```bash
+ultracost init                      # install policy + rules + hook
+ultracost status                    # see the active policy and install state
+ultracost audit ~/.claude/projects  # pin stats across your real workflow scripts
+ultracost check ./path/to/workflow  # scan a workflow script (or a directory)
+ultracost check . --fix             # auto-pin the default model on unpinned stages
+ultracost estimate ./workflow.js    # agents, model mix, and cost vs all-opus baseline
+ultracost pricing refresh           # update prices from Anthropic's official page
+```
+Point `check` at the script Claude wrote (its path is printed when a run starts, under `~/.claude/projects/`), or wire it into CI.
+## Cost estimate + dynamic effort + pre-flight gate
+Beyond routing, ultracost estimates a workflow's cost before it runs, has Claude pick a per-stage **effort** level (low to xhigh), and gates the launch so you can approve, cancel, or restructure it.
+```text
+$ ultracost estimate ./workflow.js
+  agents      4 fixed + 1 fan-out group(s) x ~5 = ~9
+  model mix   3x opus, 6x sonnet
+  baseline (all opus)   $0.9000
+  tiered (ultracost)        $0.5304
+  savings                   $0.3696  (41%)
+```
+- **Pricing is official-sourced.** Prices live in `policy.json` with a `_source` URL and `_asOf` date; `ultracost pricing refresh` re-fetches Anthropic's official pricing page and updates them. The estimate itself runs offline (no network on the hot path).
+- **Dynamic effort.** Each stage gets the lowest effort that fits (`low`/`medium`/`high`/`xhigh`), bounded by model (`sonnet` up to `high`, `opus` up to `xhigh`). Effort feeds the estimate.
+- **Pre-flight gate (on by default, hard in every mode).** The plugin ships a deterministic `PreToolUse` hook on the `Workflow` tool that **hard-stops every dynamic-workflow launch** — it runs the guard + estimate and leads with `⚠ N/M stage(s) NOT pinned -> will inherit Opus` when stages are unpinned, so an accidental all-Opus fan-out can't slip by. It is **mode-aware**: it asks (with the estimate) in `default`/`acceptEdits`/`auto`, and **auto-denies** an unpinned workflow in `bypassPermissions`/`dontAsk` where an ask wouldn't pause — so it holds in every permission mode, not just when the model chooses to ask. `ULTRACOST_GATE=strict` denies on any problem everywhere; `=ask` never escalates; `=off` disables it (headless/CI). On top of that, the policy has Claude run `ultracost estimate` and offer the **Approve / Cancel / Modify** menu via `AskUserQuestion`.
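The mode-aware behavior in the last bullet can be sketched as a small pure function. This is an illustration, not the code in `workflow-gate.mjs`: `decideGate`, `toHookOutput`, and their parameter names are hypothetical, and two behaviors are assumptions (that `strict` still asks when nothing is wrong, and that `off` emits no decision at all). The output shape uses the `hookSpecificOutput.permissionDecision` and `systemMessage` fields from Claude Code's hooks reference.

```javascript
// Permission modes in which an "ask" would be auto-approved (per the text above).
const BYPASS_MODES = new Set(["bypassPermissions", "dontAsk"]);

// Hypothetical helper: returns "ask", "deny", or null (no decision).
// gateSetting mirrors the ULTRACOST_GATE value; problemCount is the number of
// unpinned/banned/inherit stages reported by the guard.
function decideGate({ permissionMode, gateSetting, problemCount }) {
  if (gateSetting === "off") return null; // assumption: disabled gate stays silent
  const hasProblems = problemCount > 0;
  if (gateSetting === "strict") return hasProblems ? "deny" : "ask"; // assumption for the clean case
  if (gateSetting === "ask") return "ask"; // never escalate
  if (hasProblems && BYPASS_MODES.has(permissionMode)) return "deny";
  return "ask";
}

// Hypothetical helper: wraps a decision in the PreToolUse hook output shape.
// The message is sent through systemMessage as well as the reason field.
function toHookOutput(decision, message) {
  if (decision === null) return {};
  return {
    systemMessage: message,
    hookSpecificOutput: {
      hookEventName: "PreToolUse",
      permissionDecision: decision,
      permissionDecisionReason: message,
    },
  };
}
```

Keeping the decision separate from the JSON envelope makes the mode table easy to unit test without spawning a hook process.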
+Estimates are relative (tiered vs all-opus), not a bill; fan-outs are ranges; the interactive 3-option menu needs a TUI. Full detail, assumptions, and the gate's [#52343](https://github.com/anthropics/claude-code/issues/52343) limitation are in [`docs/ESTIMATES.md`](./docs/ESTIMATES.md).
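Because the estimate is a relative comparison, the savings line is simple arithmetic over the two totals. The sketch below reproduces only that step using the figures from the sample output; `compareCosts` is a hypothetical name, and how ultracost derives each per-agent cost is not shown here.

```javascript
// Hypothetical helper: compares the all-opus baseline with the tiered total.
// Both inputs are USD amounts; the result is rounded the way the sample
// output above displays it (4 decimal places, whole percent).
function compareCosts(baselineUsd, tieredUsd) {
  const savingsUsd = baselineUsd - tieredUsd;
  const percent = baselineUsd > 0 ? Math.round((savingsUsd / baselineUsd) * 100) : 0;
  return { savingsUsd: Number(savingsUsd.toFixed(4)), percent };
}

// Figures from the sample `ultracost estimate` output.
const sample = compareCosts(0.9, 0.5304);
console.log(sample); // { savingsUsd: 0.3696, percent: 41 }
```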
+## How routing is decided
+| Tier | Model | Use for |
+|------|-------|---------|
+| **opus** | `claude-opus-4-8` @ `xhigh` | writing/refactoring/debugging code, design & architecture, security/perf, tests that need judgment, planning, synthesis |
+| **sonnet** | `claude-sonnet-4-6` @ `high` | applying a *decided* edit across files, search/grep, running tests, git ops, docs, gathering context |
+**Decision rule:** if the stage must *decide how* to change code → opus. If the *how* is already planned and it just executes → sonnet. When in doubt → opus. **Never haiku.**
+This is opinionated and quality-first by design. If you want a cost-first split, edit the policy (below).
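As a rough illustration, the decision rule and the per-model effort ceiling fit in a few lines. This is a sketch under stated assumptions: `pickTier`, `clampEffort`, and the `decidesHow`/`planned` flags are hypothetical names, not part of ultracost's API. The effort ceilings (`sonnet` up to `high`, `opus` up to `xhigh`) come from the section above.

```javascript
const EFFORTS = ["low", "medium", "high", "xhigh"];
const MAX_EFFORT = { sonnet: "high", opus: "xhigh" };

// Hypothetical helper: only a stage that is explicitly planned and does not
// have to decide how to change code drops to sonnet. Anything uncertain
// stays on opus ("when in doubt, opus").
function pickTier({ decidesHow, planned } = {}) {
  return decidesHow === false && planned === true ? "sonnet" : "opus";
}

// Hypothetical helper: caps a requested effort at the model's ceiling.
function clampEffort(model, requested) {
  const cap = EFFORTS.indexOf(MAX_EFFORT[model]);
  const want = EFFORTS.indexOf(requested);
  if (cap === -1 || want === -1) {
    throw new Error(`unknown model or effort: ${model} / ${requested}`);
  }
  return EFFORTS[Math.min(want, cap)];
}
```

Defaulting to opus whenever a flag is missing mirrors the quality-first bias: a stage has to be positively identified as mechanical before it is downgraded.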
+## The Workflow Guard
+```text
+$ ultracost check ./wf.js
+wf.js:2:15  UC001  stage has no options object — add { model: ... } so it does not inherit the session model
+wf.js:3:14  UC002  stage options object has no model — will inherit the session model
+wf.js:4:13  UC003  stage pins banned model "haiku" (policy.neverUse)
+3 error(s), 0 warning(s) in 1 file(s).
+```
+| Code | Meaning |
+|------|---------|
+| `UC001` | `agent(x)` with no options object |
+| `UC002` | options object present, no `model` |
+| `UC003` | model resolves to a banned model (e.g. haiku) |
+| `UC004` | `model: 'inherit'` while `allowInherit` is false |
+| `UC005` | options passed as a variable — can't verify (warning) |
+The scanner is string- and comment-aware: an `agent(` that appears inside a prompt string or a comment is prose, not a call, and is never flagged. Use `--json` for CI, `--fix` to auto-insert the default model in the unambiguous cases (`UC001`/`UC002`), and `--quiet` to print only the problems. The exit code is non-zero when errors are found.
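To make "string- and comment-aware" concrete, here is a minimal sketch of the idea: a single pass that tracks whether the cursor is in code, a comment, or a string, and records `agent(` only in code. It is not the implementation in `src/guard.js`; `findAgentCalls` is a hypothetical name, and the sketch ignores regex literals and `${...}` interpolation inside template literals.

```javascript
// Hypothetical helper: returns the 1-based line numbers of `agent(` calls that
// appear in real code, skipping string literals and comments.
function findAgentCalls(source) {
  const hits = [];
  let line = 1;
  let state = "code"; // "code", "line" (// comment), "block" (/* */), or a quote char
  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    const next = source[i + 1];
    if (ch === "\n") {
      line++;
      if (state === "line") state = "code"; // a line comment ends at the newline
      continue;
    }
    if (state === "line") continue;
    if (state === "block") {
      if (ch === "*" && next === "/") { state = "code"; i++; }
      continue;
    }
    if (state !== "code") {
      // Inside a string: skip escaped characters, close on the matching quote.
      if (ch === "\\") {
        if (next === "\n") line++;
        i++;
      } else if (ch === state) {
        state = "code";
      }
      continue;
    }
    if (ch === "/" && next === "/") { state = "line"; i++; continue; }
    if (ch === "/" && next === "*") { state = "block"; i++; continue; }
    if (ch === '"' || ch === "'" || ch === "`") { state = ch; continue; }
    // Require a non-identifier character before `agent(` so `myagent(` is ignored.
    if (source.startsWith("agent(", i) && !/[\w$]/.test(source[i - 1] ?? "")) {
      hits.push(line);
    }
  }
  return hits;
}
```

A real guard would go on to parse each call's arguments to tell `UC001` from `UC002`; this sketch stops at locating the call sites.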
+## Audit your history
+`ultracost audit [dir]` scans `<dir>/**/workflows/scripts/*.js` (default `~/.claude/projects`) and reports the totals — how many stages exist and how many would silently inherit the session model:
+```text
+$ ultracost audit ~/.claude/projects
+ultracost audit
+scanned 22 script(s) under ~/.claude/projects
+  agent() stages   137
+  pinned           4
+  unpinned         128   (UC001/UC002 — inherit the session model)
+  banned           0     (UC003)
+  inherit          1     (UC004)
+  dynamic          4     (UC005 — options is a variable)
+  unpinned ratio   93.4%
+```
+> Numbers above are illustrative; run it to see your own. `--json` emits the totals for dashboards or CI.
+## Customizing the policy
+Edit `~/.claude/ultracost/policy.json`, then re-run `ultracost init` to recompile the rules:
+```json
+{
+  "neverUse": ["haiku"],
+  "allowInherit": false,
+  "default": "opus",
+  "tieBreaker": "opus",
+  "tiers": {
+    "opus": { "model": "opus", "effort": "xhigh" },
+    "sonnet": { "model": "sonnet", "effort": "high" }
+  },
+  "alwaysOpus": ["orchestrator", "planner", "final-synthesis"]
+}
+```
+See [`docs/policy.md`](./docs/policy.md) for the full reference.
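The changelog says the policy is validated at load time, rejecting an undefined default tier and any tier whose model is listed in `neverUse`. A minimal sketch of those two checks follows; `validatePolicy` and its error strings are hypothetical, and the shipped validator in `src/policy.js` may check more than this.

```javascript
// Hypothetical helper: returns a list of human-readable problems
// (an empty list means the policy passed both checks).
function validatePolicy(policy) {
  const errors = [];
  const tiers = policy.tiers ?? {};
  const banned = new Set(policy.neverUse ?? []);

  // Check 1: the default tier must be defined under "tiers".
  if (!Object.hasOwn(tiers, policy.default)) {
    errors.push(`default tier "${policy.default}" is not defined in tiers`);
  }

  // Check 2: no tier may resolve to a model listed in "neverUse".
  for (const [name, tier] of Object.entries(tiers)) {
    if (banned.has(tier.model)) {
      errors.push(`tier "${name}" uses banned model "${tier.model}"`);
    }
  }
  return errors;
}
```

Collecting every problem instead of throwing on the first lets a single run report all the edits a hand-modified policy still needs.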
+## Use in CI
+```yaml
+- run: npx ultracost check . --json
+```
+Fails the build if any committed workflow script has a stage that would inherit the session model.
+## How it compares
+ultracost is intentionally narrow. General-purpose routers ([claude-router](https://github.com/0xrdan/claude-router), [claude-smart-router](https://github.com/gacabartosz/claude-smart-router), [claude-model-changer](https://github.com/R4CK/claude-model-changer)) score every prompt and route the *main loop*. ultracost targets the **dynamic-workflow / ultracode** path and adds a **guard** that statically verifies stage-level pins — which none of them do. See [`NOTICE`](./NOTICE) for prior-art credits.
+## Documentation
+- [Architecture](./docs/architecture.md)
+- [Policy reference](./docs/policy.md)
+- [Why ultracode needs this](./docs/ultracode.md)
+- [Testing guide](./docs/TESTING.md) — sandbox, plugin, npm, and live Claude Code CLI checks
+- [Publishing & recognition](./docs/PUBLISHING.md) — marketplaces, awesome lists, launch
+## Versioning & releases
+Semantic versioning. See [`CHANGELOG.md`](./CHANGELOG.md). Tagged releases (`vX.Y.Z`) publish to npm and GitHub Releases via CI.
+> Configured for GitHub `danielkremen818/ultracost`. If you fork it, update the handle in the install commands and badges, `package.json`, `CHANGELOG.md`, and `.claude-plugin/plugin.json`. See [`docs/PUBLISHING.md`](./docs/PUBLISHING.md) for the full pre-publish checklist.
+## License
+MIT © Daniel Kremen. Clean-room implementation; prior art credited in [`NOTICE`](./NOTICE).