PyPI - scorchmark - Versions diffs - 0.6.0__tar.gz - Mend

scorchmark 0.6.0__tar.gz


Files changed (21)

scorchmark-0.6.0/.gitignore +20 -0
scorchmark-0.6.0/LICENSE +21 -0
scorchmark-0.6.0/PAYMENTS.md +98 -0
scorchmark-0.6.0/PKG-INFO +305 -0
scorchmark-0.6.0/README.md +274 -0
scorchmark-0.6.0/SECURITY.md +70 -0
scorchmark-0.6.0/adapters.py +131 -0
scorchmark-0.6.0/alerts.py +136 -0
scorchmark-0.6.0/cli.py +318 -0
scorchmark-0.6.0/conftest.py +5 -0
scorchmark-0.6.0/detectors.py +520 -0
scorchmark-0.6.0/examples/sample_cost_log.jsonl +9 -0
scorchmark-0.6.0/ingest.py +157 -0
scorchmark-0.6.0/licensing.py +141 -0
scorchmark-0.6.0/pricing.py +125 -0
scorchmark-0.6.0/pricing_drift.py +144 -0
scorchmark-0.6.0/pyproject.toml +63 -0
scorchmark-0.6.0/scripts/issue_license.py +66 -0
scorchmark-0.6.0/scripts/stripe_webhook.py +106 -0
scorchmark-0.6.0/server.py +392 -0
scorchmark-0.6.0/tests/test_agentspend.py +901 -0

scorchmark-0.6.0/.gitignore ADDED

@@ -0,0 +1,20 @@
+# Python
+__pycache__/
+*.py[cod]
+.venv/
+venv/
+.pytest_cache/
+*.egg-info/
+build/
+dist/
+# Local runtime state (pricing-drift history, logs, ingested data)
+*.tmp
+*.log
+pricing_history.json
+.DS_Store
+# Secrets — NEVER commit the license signing private key (defense in depth)
+.secrets/
+*signing_key*
+*.pem

scorchmark-0.6.0/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Anas (github.com/Nas01010101)
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

scorchmark-0.6.0/PAYMENTS.md ADDED

@@ -0,0 +1,98 @@
+# Getting paid — Scorchmark
+Two channels. Use either or both. The **MCPize** path is the fastest to revenue;
+the **self-serve Stripe + license** path keeps ~97% and works for direct GitHub/PyPI users.
+| | MCPize (managed) | Self-serve (Stripe + license key) |
+|---|---|---|
+| You keep | 80% (MCPize takes 20%) | ~97% (Stripe ~2.9%+30¢) |
+| Hosting | MCPize hosts the server | user self-hosts (or MCPize too) |
+| Billing/gating | MCPize meters + gates | this repo's license system |
+| Setup effort | ~90 min + Stripe KYC | keypair (done) + Stripe Payment Links |
+| Best for | discovery + hands-off | power users, teams, max margin |
+---
+## How the license system works
+Free is the entire MIT core. The **Pro** and **Team** tools unlock with a signed
+license key in `SCORCHMARK_LICENSE`. Verification is **offline** — an Ed25519
+signature checked against the public key embedded in `licensing.py`. No phone-home,
+so the "no outbound calls" security promise holds. Keys are unforgeable without the
+private key (which only you hold).
+**What each tier unlocks** (same in the MCP server and the `scorchmark` CLI):
+| Tier | Tools |
+|---|---|
+| Free $0 | `ingest_run`, `check_budget`, `find_spend_anomalies`, `detect_cache_waste`, `predict_rate_limit`, `detect_stuck_agent`, `build_alert_payload`, pricing resource |
+| Pro $19/mo | + `detect_spend_acceleration`, + webhook **alerting** (`check_budget(alert=True)`) |
+| Team $49/mo | + `cost_by_agent`, + `simulate_model_swap`, + `detect_pricing_drift` |
+A Team key satisfies a Pro requirement. Gated tools return an `upgrade_required`
+object (not an error) pointing at the buy URL. Honest open-core: the source is MIT,
+so a determined user can patch the gate out — the license keeps the tiers official
+and funds the maintained pricing model. Most users are honest.
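The tier rule described above (a Team key satisfies a Pro requirement, and a gated tool returns an `upgrade_required` object rather than raising) could be sketched roughly as follows. This is an illustration only, not the code in `licensing.py`: the `gated` helper, the `TIER_RANK` table, the result keys, and the buy URL are all hypothetical names chosen for the example.

```python
from typing import Any, Callable

# Rank order follows the Free < Pro < Team table; the mapping itself is an assumption.
TIER_RANK = {"free": 0, "pro": 1, "team": 2}
BUY_URL = "https://example.com/buy"  # hypothetical placeholder, not the real buy URL


def gated(required: str, current: str, tool: Callable[[], dict[str, Any]]) -> dict[str, Any]:
    """Run `tool` when the current tier meets the requirement, else return a notice."""
    if TIER_RANK[current] >= TIER_RANK[required]:
        return tool()
    # A plain result object, not an exception, so an agent can read it and carry on.
    return {"upgrade_required": True, "required_tier": required, "buy_url": BUY_URL}
```

Returning data instead of raising keeps the calling agent's loop alive, which seems to be the intent of the "object (not an error)" wording.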
+---
+## Channel A — MCPize (managed)
+1. List the repo on https://mcpize.com (reads `mcpize.yaml`).
+2. Set the three tiers in the MCPize dashboard: **Free $0 / Pro $19 / Team $49**
+   (yearly $190 / $490). MCPize connects Stripe and gates by subscription itself —
+   the license system below is **not needed** on this path.
+3. Stripe Connect KYC → payouts the 1st, 80% to you.
+## Channel B — self-serve (Stripe Payment Links + license keys)
+Generate an Ed25519 keypair. The **public** key is embedded in `licensing.py`; keep
+the **private** key in a secure location *outside* the repo (e.g. `~/.secrets/`,
+`chmod 600`, never committed — it's gitignored). Keep it safe — losing it means
+re-issuing every key; leaking it means anyone can mint keys.
+**One-time Stripe setup (dashboard, no code):**
+1. Create two **Products/Prices**: Pro $19/mo, Team $49/mo (add yearly if you want).
+2. Create a **Payment Link** for each. On each link set **metadata**:
+   `tier=pro` (or `team`) and `days=0` for a subscription the webhook re-issues, or
+   `days=365` for a fixed-term key. Collect the customer's email on the link.
+3. Add a **webhook endpoint** → event `checkout.session.completed` → copy the
+   signing secret (`whsec_...`).
+**Run the fulfilment webhook** (turns a payment into a delivered key):
+```bash
+export STRIPE_WEBHOOK_SECRET=whsec_...
+export SCORCHMARK_SIGNING_KEY="$(cat ~/.secrets/scorchmark_signing_key.b64)"
+# optional auto-email; without SMTP it just logs the key for you to send:
+export SMTP_HOST=smtp.you.com SMTP_USER=... SMTP_PASS=... FROM_EMAIL=you@you.com
+python3 scripts/stripe_webhook.py            # listens on :8787, behind HTTPS in prod
+```
+On purchase it verifies Stripe's signature (HMAC-SHA256, stdlib), mints a signed
+key, and emails it (or logs it).
+**Or mint keys by hand** (manual sales, comps, testing):
+```bash
+SCORCHMARK_SIGNING_KEY="$(cat .../scorchmark_signing_key.b64)" \
+  python3 scripts/issue_license.py --tier team --email buyer@co.com --days 365
+```
+**What the customer does:**
+```bash
+pip install 'scorchmark[pro]'          # adds offline-verify (cryptography)
+export SCORCHMARK_LICENSE=SCM1.....           # the key you sent
+scorchmark license                            # → tier: TEAM · active
+```
+---
+## Key operations
+- **Renewal/subscriptions:** issue fixed-term keys (`--days 365`) and re-issue on
+  Stripe's renewal webhook, OR issue perpetual keys (`--days 0`) and rely on Stripe
+  to stop billing on cancel (the key keeps working — fine for low-priced honor-model
+  tiers; use short terms if you need hard expiry).
+- **Revocation:** offline keys can't be individually revoked without rotating the
+  keypair (which invalidates *all* keys). For hard revocation use short terms +
+  renewal, or the MCPize managed path. Document this; don't pretend otherwise.
+- **Rotation:** generate a new keypair, update `PUBLIC_KEY_B64` in `licensing.py`,
+  ship a release, re-issue active keys. Only do this if the private key leaks.

scorchmark-0.6.0/PKG-INFO ADDED Viewed

@@ -0,0 +1,305 @@
+Metadata-Version: 2.4
+Name: scorchmark
+Version: 0.6.0
+Summary: Find the cache tax draining your AI bill: a cross-provider cache-TTL-waste detector + model-swap savings simulator + pricing-drift + per-agent attribution. MCP server + CLI.
+Project-URL: Homepage, https://github.com/Nas01010101/scorchmark
+Project-URL: Repository, https://github.com/Nas01010101/scorchmark
+Project-URL: Issues, https://github.com/Nas01010101/scorchmark/issues
+Author: Anas
+License-Expression: MIT
+License-File: LICENSE
+Keywords: agents,anthropic,claude,cost,finops,llm,mcp,observability,openai,prompt-caching
+Classifier: Development Status :: 4 - Beta
+Classifier: Environment :: Console
+Classifier: Intended Audience :: Developers
+Classifier: Programming Language :: Python :: 3
+Classifier: Topic :: Software Development :: Libraries
+Classifier: Topic :: System :: Monitoring
+Requires-Python: >=3.11
+Provides-Extra: all
+Requires-Dist: cryptography>=42.0; extra == 'all'
+Requires-Dist: fastmcp>=3.0; extra == 'all'
+Provides-Extra: dev
+Requires-Dist: cryptography>=42.0; extra == 'dev'
+Requires-Dist: fastmcp>=3.0; extra == 'dev'
+Requires-Dist: pytest>=8.0; extra == 'dev'
+Provides-Extra: mcp
+Requires-Dist: fastmcp>=3.0; extra == 'mcp'
+Provides-Extra: pro
+Requires-Dist: cryptography>=42.0; extra == 'pro'
+Description-Content-Type: text/markdown
+<p align="center">
+  <picture>
+    <source media="(prefers-color-scheme: dark)" srcset="assets/logo-dark.svg">
+    <img src="assets/logo.svg" alt="Scorchmark" width="460">
+  </picture>
+</p>
+<p align="center"><strong>The scorch mark on your AI bill. Scorchmark finds the cache-rebuild waste, model-swap savings, and silent price hikes your provider dashboard won't show you — as an MCP tool + CLI.</strong></p>
+<p align="center">
+  <img src="https://github.com/Nas01010101/scorchmark/actions/workflows/ci.yml/badge.svg" alt="CI">
+  <img src="https://img.shields.io/badge/license-MIT-blue.svg" alt="License: MIT">
+  <img src="https://img.shields.io/badge/MCP-server-7c3aed.svg" alt="MCP server">
+  <img src="https://img.shields.io/badge/tests-66%20passing-brightgreen.svg" alt="tests: 66 passing">
+  <img src="https://img.shields.io/badge/runtime%20deps-0-success.svg" alt="runtime deps: 0">
+</p>
+<p align="center">
+  <img src="assets/demo.gif" alt="Scorchmark catching runaway cache-rebuild spend live" width="760">
+</p>
+Your provider dashboard tells you *what* you spent — a day late. It never tells you *what you
+wasted*: the cache rebuilds you re-paid for, the model that would have done the job for 40% less,
+the silent price hike. Scorchmark reads a cost log you already have and finds that waste — starting
+with the **cache-TTL waste behind the documented $6,000 overnight burn**, which no observability
+tool we surveyed detects.
+It answers the questions the dashboard can't: *where did the cache money go, which agent burned it,
+and what would a cheaper model have saved?* Read-only and local — no proxy in your request path —
+as an MCP tool your agent can call mid-run, or a one-line CLI.
+| ❌ Without Scorchmark | ✅ With it |
+|---|---|
+| You find out from the limit email, a day late | The agent sees `warn` at wake-up #2, mid-run |
+| Cache-TTL waste silently eats up to 90% of spend | `detect_cache_waste` flags it and prices the loss |
+| No idea which agent burned the money | `cost_by_agent` attributes every dollar |
+| "Should I have used a cheaper model?" stays unknowable | `simulate_model_swap` gives the exact saved % |
+## Why this exists
+Someone left Claude Code looping overnight to check PRs and woke up to a **$6,000 bill**. Not a bug
+in their code. A cache-TTL change (1 hour down to 5 minutes) meant every 30-minute wake-up rebuilt
+an 800k-token history at the cache *write* rate instead of the cheap cache *read*. The dashboard
+showed nothing for days. The first warning was the limit email, after the money was gone.
+The post got 1,400+ upvotes because everyone running unattended agents felt the cold sweat. And
+there was no tripwire — every tool, including the provider's own dashboard, is retrospective.
+Replaying that incident's shape through these tools:
+```text
+ingested 46 wake-ups          total spend $265
+detect_cache_waste            45 rebuilds, $237 wasted (90% of spend)
+check_budget($50 cap)         first 'warn' at wake-up #2 (burn $20/hr)
+simulate_model_swap→Sonnet    $265 would have been $159 (save 40%)
+```
+90% of that spend was avoidable, and a $50 cap would have tripped on the **second** wake-up, not the
+46th. (Absolute dollars scale with your context size and loop length; the waste fraction and the
+early catch are the point.)
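To make the "trips on the second wake-up" claim concrete, here is a rough sketch of a burn-rate tripwire. It is not the package's `check_budget`: the function name, the 80% warn fraction, and the rule of warning whenever projected spend reaches the cap are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def check_budget_sketch(events, cap_usd, reset_at, warn_fraction=0.8):
    """events: time-ordered list of (datetime, cost_usd) pairs."""
    if not events:
        return {"status": "ok", "spent_usd": 0.0, "burn_usd_per_hr": 0.0, "projected_usd": 0.0}
    spent = sum(cost for _, cost in events)
    elapsed_hr = (events[-1][0] - events[0][0]).total_seconds() / 3600
    # With a single event there is no interval yet, so no burn rate can be estimated.
    burn = spent / elapsed_hr if elapsed_hr > 0 else 0.0
    remaining_hr = max((reset_at - events[-1][0]).total_seconds() / 3600, 0.0)
    projected = spent + burn * remaining_hr
    if spent >= cap_usd:
        status = "breach"
    elif projected >= cap_usd or spent >= warn_fraction * cap_usd:
        status = "warn"
    else:
        status = "ok"
    return {"status": status, "spent_usd": spent,
            "burn_usd_per_hr": burn, "projected_usd": projected}


# Two wake-ups 30 minutes apart at a made-up $5 each, against a $50 cap.
t0 = datetime(2026, 6, 21, 3, 0, tzinfo=timezone.utc)
sample = [(t0, 5.0), (t0 + timedelta(minutes=30), 5.0)]
result = check_budget_sketch(sample, 50.0, datetime(2026, 7, 1, tzinfo=timezone.utc))
```

In this toy run only $10 has been spent, but the projection to the reset date already exceeds the cap, so the status is `warn` on the second event.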
+## How it's different from Helicone / Langfuse / LangSmith
+Those are good tools. They are also a different shape of thing.
+| | Helicone | Langfuse / LangSmith | **Scorchmark** |
+|---|---|---|---|
+| Form factor | Proxy in your request path | SDK / OTel tracing + dashboard | **MCP tool the agent calls** |
+| When you learn | Dashboard, after the call | Dashboard, after the run | **In the loop, before the next call** |
+| Who acts on it | A human reading a chart | A human reading a trace | **The agent itself** |
+| Cache-TTL waste | Not detected | Not detected | **Detected and priced** |
+| In your critical path | Yes (all traffic routed) | No | No (read-only on your logs) |
+| Setup | Swap base URL | Instrument SDK | Point it at a log file |
+The wedge: those tools tell *you* what happened. Scorchmark tells the *agent* what's about to
+happen, in a form it can act on without a human in the loop. It adds nothing to your request path —
+it reads a cost log you already have.
+And it is **not** a spend *cap*. Hard budget enforcement is a commodity now — Cloudflare AI Gateway,
+LiteLLM, and Portkey all block-before-the-call. Scorchmark does the part they don't: it tells you
+*where the money leaked and what to change*. Cap your spend with a gateway; find the cache tax with
+this. It is also not a full observability platform — if you want flame-graph traces and prompt
+evals, run Langfuse. Scorchmark is the cost-intelligence layer, and it composes fine with both.
+## Quickstart
+The CLI core is pure stdlib (install pulls nothing). The MCP server adds one extra:
+```bash
+uv sync --extra mcp                                                   # MCP server deps (FastMCP)
+uv run fastmcp run server.py                                          # stdio (Claude Desktop / Inspector)
+uv run fastmcp run server.py --transport streamable-http --port 8000  # remote / MCPize
+```
+Add it to Claude Desktop / Cursor (`claude_desktop_config.json` or `.cursor/mcp.json`):
+```json
+{
+  "mcpServers": {
+    "scorchmark": {
+      "command": "uv",
+      "args": ["run", "fastmcp", "run", "/path/to/scorchmark/server.py"]
+    }
+  }
+}
+```
+Then, in the loop:
+```python
+ingest_run(open("examples/sample_cost_log.jsonl").read())
+check_budget(monthly_cap_usd=100, reset_day=1)   # ok | warn | breach, with ETA to the reset
+detect_cache_waste()                             # the $6k pattern
+simulate_model_swap(to_model="claude-haiku-4-5") # exact per-row savings, cross-provider OK
+```
+## Try it in your terminal (no MCP client)
+**Run it on your own Claude Code usage — zero setup, a log you already have:**
+```bash
+uvx --from scorchmark scorchmark report --claude-code
+# reads ~/.claude/projects/**/*.jsonl directly and prices every request
+```
+That auto-adapts Claude Code's session transcripts — no reformatting. Or point it at any cost log:
+```bash
+uvx --from scorchmark scorchmark report mylog.jsonl --cap 100   # auto-detects the format
+uv run scorchmark report examples/sample_cost_log.jsonl --cap 50          # from a clone
+cat mylog.jsonl | uv run scorchmark swap - --to claude-haiku-4-5          # reads stdin
+```
+Subcommands: `report` (all checks), `budget`, `cache-waste`, `by-agent`, `anomalies`, `swap`.
+Add `--json` for the raw result, `--from {auto,scorchmark,claude-code}` to force a format, or point
+the log argument at a **directory** of `.jsonl` files. Same engine as the MCP server.
+## See it run
+Real output from `examples/sample_cost_log.jsonl` — the live `warn`, the cache-waste dollars, and
+the per-agent breakdown the provider dashboard never shows you (these are actual tool results, not
+mockups):
+![Live budget tripwire — warns mid-run](assets/screenshot-check-budget.png)
+![Cache-TTL waste detection](assets/screenshot-cache-waste.png)
+![Per-agent attribution](assets/screenshot-cost-by-agent.png)
+## Tools
+### Core
+| Tool | What it does |
+|---|---|
+| `ingest_run` | Load a cost-log JSONL. Computes cost from the bundled pricing model when absent. |
+| `check_budget` | Live tripwire: spend, burn rate, projected spend to the next reset, `ok`/`warn`/`breach`, ETA to cap. |
+| `find_spend_anomalies` | Flags requests costing N× the agent's median — the loop-spike signature. |
+| `detect_cache_waste` | Detects cache waste, modeling each provider's real economics (see below) and pricing it. |
+| `cost_by_agent` | Per-agent attribution: cost, share, requests, average per request. |
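The "N× the agent's median" rule in the table can be illustrated with a few lines of stdlib Python. This is a sketch under assumptions, not the implementation in `detectors.py`: the function name, the default multiple of 5, and the requirement that each row already carries `agent_id` and `cost_usd` are choices made for the example.

```python
from collections import defaultdict
from statistics import median


def find_spend_anomalies_sketch(rows, multiple=5.0):
    """Return rows whose cost is at least `multiple` times their agent's median cost."""
    costs_by_agent = defaultdict(list)
    for row in rows:
        costs_by_agent[row["agent_id"]].append(row["cost_usd"])
    medians = {agent: median(costs) for agent, costs in costs_by_agent.items()}
    return [
        row for row in rows
        # Skip agents with a zero median so free requests never trigger a flag.
        if medians[row["agent_id"]] > 0
        and row["cost_usd"] >= multiple * medians[row["agent_id"]]
    ]
```

Using a per-agent median rather than a global mean keeps one expensive agent from masking spikes in a cheap one.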
+### Edge (not offered by any tool we surveyed)
+| Tool | What it does |
+|---|---|
+| `detect_spend_acceleration` | Flags a burn rate that doubles across consecutive windows (runaway context growth). |
+| `simulate_model_swap` | Recomputes every past request at another model's price for the row-exact savings. Cross-provider. |
+| `detect_pricing_drift` | Snapshots provider rates and surfaces any silent change — the root cause of the $6k burn. |
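One way to read "a burn rate that doubles across consecutive windows" is shown below. The one-hour window, the 2× factor, and the function name are assumptions for illustration; the shipped `detect_spend_acceleration` may window and threshold differently.

```python
from collections import defaultdict


def detect_spend_acceleration_sketch(events, window_s=3600, factor=2.0):
    """events: list of (epoch_seconds, cost_usd). Flags windows costing >= factor x the previous one."""
    if not events:
        return []
    start = min(ts for ts, _ in events)
    totals = defaultdict(float)
    for ts, cost in events:
        totals[int((ts - start) // window_s)] += cost
    flagged = []
    for i in range(max(totals)):
        prev, cur = totals.get(i, 0.0), totals.get(i + 1, 0.0)
        # An empty previous window gives no baseline, so it is never compared.
        if prev > 0 and cur >= factor * prev:
            flagged.append({"window": i + 1, "prev_usd": prev, "cur_usd": cur})
    return flagged
```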
+### Match (parity with the heavy gateways)
+| Tool | What it does |
+|---|---|
+| `predict_rate_limit` | Projects ETA to a 429 from rate-limit headers, per dimension. |
+| `detect_stuck_agent` | Flags an agent repeating the same tool call — the stuck-loop signature. |
+| `build_alert_payload` | Turns any result into a Slack, ntfy, or PagerDuty webhook payload. |
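The stuck-loop signature (an agent repeating the same tool call) can be approximated by counting consecutive identical `(tool_name, tool_args_hash)` pairs per agent. The sketch below assumes time-ordered rows and a threshold of three repeats; both are illustrative choices, not values confirmed by the source.

```python
def detect_stuck_agent_sketch(rows, min_repeats=3):
    """Return {agent_id: info} for agents with >= min_repeats identical consecutive tool calls."""
    streaks = {}  # agent_id -> (last signature, current run length)
    stuck = {}
    for row in rows:
        agent = row["agent_id"]
        signature = (row["tool_name"], row["tool_args_hash"])
        last_signature, run = streaks.get(agent, (None, 0))
        run = run + 1 if signature == last_signature else 1
        streaks[agent] = (signature, run)
        if run >= min_repeats:
            stuck[agent] = {"tool_name": signature[0], "repeats": run}
    return stuck
```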
+Resource `scorchmark://pricing/current` exposes the curated cross-provider pricing model.
+## Log format
+One JSON object per line — the de-facto schema the cost trackers, and Claude's own usage fields,
+already emit:
+```json
+{"request_id": "r1", "ts": "2026-06-21T03:00:00Z", "provider": "anthropic",
+ "model": "claude-opus-4-8", "agent_id": "pr-loop", "input_tokens": 2000,
+ "output_tokens": 1500, "cache_write_tokens": 800000, "cache_read_tokens": 0}
+```
+`ts` accepts ISO-8601 (naive timestamps are read as UTC), epoch seconds, or epoch milliseconds.
+`cost_usd` is optional and computed when absent. Anthropic's native `cache_creation_input_tokens`
+and `cache_read_input_tokens` are accepted too. Two optional field groups unlock extra tools:
+| Field group | Unlocks |
+|---|---|
+| `rate_limit_remaining_tokens`, `rate_limit_limit_tokens`, `rate_limit_reset_s` (and `*_requests`) | `predict_rate_limit` |
+| `tool_name`, `tool_args_hash` | `detect_stuck_agent` |
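As an illustration of the timestamp and field-alias rules above, a row normaliser might look like the following. The function names and the `1e11` cut-off used to tell epoch seconds from epoch milliseconds are assumptions; `ingest.py` may use a different heuristic.

```python
from datetime import datetime, timezone


def parse_ts(value):
    """Accept ISO-8601 text, epoch seconds, or epoch milliseconds; return an aware UTC datetime."""
    if isinstance(value, (int, float)):
        # Heuristic (assumed): values above 1e11 are treated as milliseconds.
        seconds = value / 1000 if value > 1e11 else value
        return datetime.fromtimestamp(seconds, tz=timezone.utc)
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    # Naive timestamps are read as UTC, as the README states.
    if parsed.tzinfo is None:
        return parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def normalize_row(row):
    """Map Anthropic's native cache field names onto the short names used in the log schema."""
    out = dict(row)
    out["ts"] = parse_ts(row["ts"])
    out["cache_write_tokens"] = row.get(
        "cache_write_tokens", row.get("cache_creation_input_tokens", 0))
    out["cache_read_tokens"] = row.get(
        "cache_read_tokens", row.get("cache_read_input_tokens", 0))
    return out
```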
+## Alerting
+Set `SCORCHMARK_WEBHOOK_URL` to a JSON webhook (Slack, ntfy, PagerDuty), then call
+`check_budget(..., alert=True)` to POST a payload on `warn` or `breach`. Or call
+`build_alert_payload(result)` on any tool's output and route it yourself.
+The webhook URL is yours to supply; if you self-host this for others, validate/allowlist it (an
+attacker-controlled URL is an SSRF vector).
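If you do self-host for others, one simple form of the allowlisting suggested above is to accept only HTTPS URLs whose host is on a fixed list. The host set and function name below are illustrative assumptions; adjust them to the webhook providers you actually use.

```python
from urllib.parse import urlsplit

# Example hosts only; this list is an assumption, not something the package ships.
ALLOWED_WEBHOOK_HOSTS = frozenset({"hooks.slack.com", "ntfy.sh", "events.pagerduty.com"})


def is_allowed_webhook(url, allowed=ALLOWED_WEBHOOK_HOSTS):
    """Accept only https URLs whose parsed hostname is explicitly allowlisted."""
    parts = urlsplit(url)
    # `hostname` ignores any userinfo prefix, so "https://hooks.slack.com@evil.example/"
    # resolves to "evil.example" and is rejected.
    return parts.scheme == "https" and parts.hostname in allowed
```

An allowlist is stricter than blocking private IP ranges, and it sidesteps DNS-rebinding tricks that a blocklist would have to handle separately.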
+## Pricing
+| Tier | Price | For |
+|---|---|---|
+| Free | $0 | find the cache tax on your own logs — the solo dev who got burned once |
+| Pro | $19/mo | unattended loops: cache-waste + burn-acceleration early warning, webhook alerting |
+| Team | $49/mo | per-agent attribution, model-swap savings simulator, pricing-drift, audit-trail export |
+Catch one runaway loop and it has paid for itself many times over — and the free tier alone catches
+the $6k cache-TTL pattern.
+**Unlocking Pro/Team.** The paid tools (`detect_spend_acceleration`, `cost_by_agent`,
+`simulate_model_swap`, `detect_pricing_drift`, and webhook alerting) unlock with a signed license
+key, verified **offline** (Ed25519 — no phone-home, so the no-outbound-calls guarantee holds):
+```bash
+pip install 'scorchmark[pro]'      # adds the offline verifier
+export SCORCHMARK_LICENSE=SCM1.....       # the key from your purchase
+scorchmark license                        # confirm → tier: TEAM · active
+```
+Buy via MCPize (managed billing) or direct Stripe — the full get-paid setup is in
+[PAYMENTS.md](PAYMENTS.md). The free tier needs none of this and stays pure-stdlib.
+## Security
+Read-only, local-only, no credential storage, no outbound calls from core logic. Core modules have
+zero runtime dependencies beyond the Python standard library. See [SECURITY.md](SECURITY.md).
+## Cross-provider cache economics
+The three providers price caching three different ways, and `detect_cache_waste` models each —
+this is why a generic "cache miss" tool gets the dollars wrong.
+| Provider | Cache model | Where the waste is | Same slow loop* |
+|---|---|---|---|
+| Anthropic | Write **premium** (write = 1.25× input, read = 0.1×) | Re-paying the write premium when the loop interval exceeds the TTL | **$237** |
+| OpenAI | Automatic, **no write premium** (read = 0.1× input) | A stable prefix that never cache-hits, losing the ~90% read discount | $46 (gpt-5) |
+| Google Gemini | Read discount **plus hourly storage** ($4.50/M-tok/hr Pro) | Missed discount, and storage billed on idle explicit caches | $46 (2.5-pro) |
+*Same 46-wake-up, 800k-context, 30-min loop, priced per provider. The catastrophic version is
+Anthropic-specific — the write premium is what turned that loop into a $6k bill. On OpenAI/Gemini the
+identical loop "only" forfeits the read discount, which the detector reports honestly as a smaller number.
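The multipliers in the table translate into two small formulas. The sketch below uses them with made-up input prices ($5.00 and $1.25 per million tokens are assumptions, not rates taken from `pricing.py`), so the dollar results illustrate the shape of the calculation and are not meant to reproduce the figures in the table.

```python
def anthropic_rebuild_waste(cache_tokens, input_usd_per_mtok,
                            write_mult=1.25, read_mult=0.10):
    """Extra cost of re-writing a cache that could have been read (write premium minus read rate)."""
    base = cache_tokens / 1_000_000 * input_usd_per_mtok
    return base * (write_mult - read_mult)


def missed_read_discount(prefix_tokens, input_usd_per_mtok, read_mult=0.10):
    """Cost of paying full input price for a prefix that could have been a cache read."""
    base = prefix_tokens / 1_000_000 * input_usd_per_mtok
    return base * (1.0 - read_mult)


# One 800k-token rebuild at a hypothetical $5.00/M input price.
per_rebuild = anthropic_rebuild_waste(800_000, 5.00)
# The same prefix missing an automatic cache at a hypothetical $1.25/M input price.
per_miss = missed_read_discount(800_000, 1.25)
```

The first formula is larger per token because a rebuild pays 1.25× instead of 0.1×, while a plain miss pays 1.0× instead of 0.1×; that gap is the write premium the section describes.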
+## Accuracy
+Anthropic, OpenAI, and Google rates in `pricing.py` were re-verified on 2026-06-22 against each
+provider's official pricing page (platform.claude.com, developers.openai.com/api/docs/pricing,
+ai.google.dev/gemini-api/docs/pricing) — every row confirmed, and the current OpenAI tiers
+(incl. `gpt-5.4-mini` / `-nano`) added. The tables are the standard context tier; very-large-context
+pricing (OpenAI >272K, Gemini >200K) and Gemini's hourly cache-*storage* fee are noted but not priced
+per row, since the cost log carries no context-tier or cache-lifetime field. `detect_pricing_drift`
+exists because providers change rates without notice — run it regularly.
+## License
+**Code: MIT** (see [LICENSE](LICENSE)) — the entire source, including the gated tools, is free
+to read, fork, and modify. This is honest open-core, not DRM.
+**Pro/Team license keys** are a separate commercial purchase: a signed key (verified offline,
+Ed25519) that activates the paid tools in official builds and funds the maintained, cross-provider
+pricing model. Buying a key supports the project and gets you the official tier — the MIT license
+means you *could* edit the gate out, but the key is what keeps the pricing data and the audit-trail
+tier maintained. Keys are per-purchaser and non-transferable; sold with no warranty (the MIT terms
+govern the software itself). Buy via MCPize (managed billing) or direct Stripe — see
+[PAYMENTS.md](PAYMENTS.md).