prompt-analytics-for-claude-code 0.3.0__tar.gz

This diff shows the content of publicly available package versions released to a supported registry. It is provided for informational purposes only and reflects the changes between package versions as they appear in their public registries.

Files changed (86)

prompt_analytics_for_claude_code-0.3.0/.env.example ADDED

@@ -0,0 +1,6 @@
+# LLM providers for optional categorization (only one required)
+# Option 1: Anthropic direct (recommended — you already have a Claude account)
+ANTHROPIC_API_KEY=
+# Option 2: OpenRouter (fallback)
+OPENROUTER_API_KEY=

prompt_analytics_for_claude_code-0.3.0/.gitattributes ADDED

@@ -0,0 +1,13 @@
+# Lock line endings to LF in the repository regardless of each contributor's
+# core.autocrlf: fixtures (.jsonl) and demo CSVs are test *data* whose bytes
+# must be stable across checkouts (cross-platform audit 2026-06-11, M1).
+* text=auto eol=lf
+*.py    text eol=lf
+*.md    text eol=lf
+*.yml   text eol=lf
+*.yaml  text eol=lf
+*.toml  text eol=lf
+*.json  text eol=lf
+*.jsonl text eol=lf
+*.csv   text eol=lf
+*.png   binary

prompt_analytics_for_claude_code-0.3.0/.github/workflows/ci.yml ADDED

@@ -0,0 +1,100 @@
+name: CI
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+concurrency:
+  group: ci-${{ github.ref }}
+  cancel-in-progress: true
+jobs:
+  test:
+    name: test (py${{ matrix.python-version }} · ${{ matrix.os }})
+    runs-on: ${{ matrix.os }}
+    strategy:
+      fail-fast: false
+      matrix:
+        os: [ubuntu-latest, windows-latest]
+        python-version: ["3.10", "3.11", "3.12", "3.13", "3.14"]
+        include:
+          - os: macos-latest
+            python-version: "3.12"
+    steps:
+      - uses: actions/checkout@v4
+      - name: Install uv
+        uses: astral-sh/setup-uv@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+          enable-cache: true
+          cache-dependency-glob: "pyproject.toml"
+      - name: Install project (all extras)
+        run: uv sync --all-extras
+      - name: Ruff lint
+        run: uv run ruff check .
+      - name: Ruff format check
+        run: uv run ruff format --check .
+      - name: Type check (package + dashboard + tests)
+        run: uv run mypy prompt_analytics tests
+      - name: Tests with coverage (fails under 85%)
+        run: uv run pytest tests/ -q
+  package:
+    name: build · twine check · clean-venv smoke test (${{ matrix.os }})
+    runs-on: ${{ matrix.os }}
+    strategy:
+      fail-fast: false
+      matrix:
+        os: [ubuntu-latest, windows-latest]
+    steps:
+      - uses: actions/checkout@v4
+      - name: Install uv
+        uses: astral-sh/setup-uv@v5
+        with:
+          python-version: "3.12"
+          enable-cache: true
+          cache-dependency-glob: "pyproject.toml"
+      - name: Build sdist + wheel
+        run: uv build
+      - name: twine check
+        run: uvx twine check dist/*
+      - name: Install the wheel into a clean virtualenv and smoke-test it
+        shell: bash
+        run: |
+          set -euxo pipefail
+          python -m venv clean
+          # bin/ on POSIX, Scripts/ on Windows (Git Bash on the Windows runner)
+          if [ -d clean/bin ]; then BIN=clean/bin; else BIN=clean/Scripts; fi
+          "$BIN/pip" install dist/*.whl
+          "$BIN/prompt-analytics" --help
+          # Analysis commands must not crash on an empty history; exit 1 = "no
+          # data" is expected here.  Assert the exact code: exit 2 would be a
+          # traceback, which is a bug.
+          rc=0
+          "$BIN/prompt-analytics" summary --output-dir empty_out || rc=$?
+          if [ "$rc" -ne 1 ]; then
+            echo "Expected exit 1 from summary on empty dir, got $rc" >&2
+            exit 1
+          fi
+          # Timezone canary: tzdata must resolve Europe/Paris on the bare wheel
+          # (m5 — the Windows runner needs tzdata bundled as a dep).
+          # Exit 0 (empty data written) or 1 (no data) are both OK; exit 2
+          # means a ValueError / ImportError and is a bug.
+          tz=0
+          "$BIN/prompt-analytics" extract --timezone Europe/Paris --output-dir tz_out || tz=$?
+          if [ "$tz" -eq 2 ]; then
+            echo "extract --timezone Europe/Paris exited 2 (timezone not found on bare wheel)" >&2
+            exit 1
+          fi

prompt_analytics_for_claude_code-0.3.0/.github/workflows/pricing-drift.yml ADDED

@@ -0,0 +1,157 @@
+name: Pricing drift check
+on:
+  schedule:
+    # Every Monday at 08:00 UTC
+    - cron: "0 8 * * 1"
+  workflow_dispatch:
+permissions:
+  contents: write
+  pull-requests: write
+jobs:
+  drift:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install pyyaml requests
+      - name: Fetch LiteLLM model prices
+        run: |
+          python - << 'EOF'
+          import json, urllib.request, sys, yaml, pathlib
+          LITELLM_URL = (
+              "https://raw.githubusercontent.com/BerriAI/litellm/main/"
+              "model_prices_and_context_window.json"
+          )
+          with urllib.request.urlopen(LITELLM_URL, timeout=30) as r:
+              litellm: dict = json.loads(r.read().decode())
+          # Claude model keys in LiteLLM use the form "claude/claude-opus-4-8" or
+          # just "claude-opus-4-8"; we normalise to the bare model ID.
+          def price_per_mtok(v, key):
+              val = v.get(key)
+              return round(float(val) * 1_000_000, 6) if val is not None else None
+          drift = []
+          pricing_yml = pathlib.Path("prompt_analytics/data/pricing.yml")
+          data = yaml.safe_load(pricing_yml.read_text(encoding="utf-8"))
+          embedded = data.get("providers", {}).get("anthropic", {}).get("models", {})
+          for model_id, entry in embedded.items():
+              # Try common LiteLLM key forms
+              candidates = [model_id, f"claude/{model_id}", f"anthropic/{model_id}"]
+              litellm_entry = None
+              for c in candidates:
+                  if c in litellm:
+                      litellm_entry = litellm[c]
+                      break
+              if litellm_entry is None:
+                  print(f"  skip {model_id}: not in LiteLLM", flush=True)
+                  continue
+              checks = [
+                  ("input",          "input_cost_per_token"),
+                  ("output",         "output_cost_per_token"),
+                  ("cache_read",     "cache_read_input_token_cost"),
+                  ("cache_write_5m", "cache_creation_input_token_cost"),
+              ]
+              for our_key, ll_key in checks:
+                  ll_val = price_per_mtok(litellm_entry, ll_key)
+                  if ll_val is None:
+                      continue
+                  our_val = entry.get(our_key)
+                  if our_val is None:
+                      continue
+                  # Allow 1 % tolerance for rounding
+                  if abs(ll_val - our_val) / max(abs(ll_val), 1e-9) > 0.01:
+                      drift.append(
+                          f"  {model_id}/{our_key}: ours={our_val}, LiteLLM={ll_val}"
+                      )
+          if drift:
+              print("DRIFT DETECTED (Anthropic vs LiteLLM):", flush=True)
+              for line in drift:
+                  print(line, flush=True)
+              pathlib.Path("drift_litellm.txt").write_text("\n".join(drift) + "\n", encoding="utf-8")
+          else:
+              print("No drift vs LiteLLM.", flush=True)
+          EOF
+      - name: Fetch Copilot pricing
+        run: |
+          pip install pyyaml  # already installed but be explicit
+          python scripts/fetch_copilot_pricing.py --output /tmp/copilot_fresh.yml
+      - name: Compare Copilot pricing
+        run: |
+          python - << 'EOF'
+          import yaml, pathlib, sys
+          fresh = yaml.safe_load(
+              pathlib.Path("/tmp/copilot_fresh.yml").read_text(encoding="utf-8")
+          )
+          pricing_yml = pathlib.Path("prompt_analytics/data/pricing.yml")
+          data = yaml.safe_load(pricing_yml.read_text(encoding="utf-8"))
+          embedded = data.get("providers", {}).get("copilot", {}).get("models", {})
+          fresh_claude = fresh.get("vendors", {}).get("anthropic", {})
+          drift = []
+          for model_id, fresh_entry in fresh_claude.items():
+              our_entry = embedded.get(model_id)
+              if our_entry is None:
+                  drift.append(f"  NEW model {model_id}: not in embedded copilot grid")
+                  continue
+              for key in ("input", "output", "cache_read"):
+                  fv = fresh_entry.get(key)
+                  ov = our_entry.get(key)
+                  if fv is None or ov is None:
+                      continue
+                  if abs(fv - ov) / max(abs(fv), 1e-9) > 0.01:
+                      drift.append(f"  {model_id}/{key}: ours={ov}, Copilot page={fv}")
+              # cache_write in fresh → compare against cache_write_5m
+              fw = fresh_entry.get("cache_write")
+              ow = our_entry.get("cache_write_5m")
+              if fw is not None and ow is not None:
+                  if abs(fw - ow) / max(abs(fw), 1e-9) > 0.01:
+                      drift.append(f"  {model_id}/cache_write_5m: ours={ow}, Copilot page={fw}")
+          if drift:
+              print("DRIFT DETECTED (Copilot):", flush=True)
+              for line in drift:
+                  print(line, flush=True)
+              with open("drift_litellm.txt", "a", encoding="utf-8") as f:
+                  f.write("\n".join(drift) + "\n")
+          else:
+              print("No drift vs Copilot page.", flush=True)
+          EOF
+      - name: Open PR if drift detected
+        if: ${{ hashFiles('drift_litellm.txt') != '' }}
+        env:
+          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+        run: |
+          BRANCH="pricing-drift-$(date +%Y%m%d)"
+          git config user.name "github-actions[bot]"
+          git config user.email "github-actions[bot]@users.noreply.github.com"
+          git checkout -b "$BRANCH"
+          git add -A
+          git commit -m "chore: pricing drift detected $(date +%Y-%m-%d)" --allow-empty
+          git push origin "$BRANCH"
+          DRIFT=$(cat drift_litellm.txt)
+          gh pr create \
+            --title "Pricing drift detected $(date +%Y-%m-%d)" \
+            --body "$(printf '## Pricing drift\n\nThe weekly pricing check found the following differences between \`pricing.yml\` and the upstream sources (LiteLLM / Copilot page).\n\n```\n%s\n```\n\nPlease update \`prompt_analytics/data/pricing.yml\` accordingly.' "$DRIFT")" \
+            --base main \
+            --head "$BRANCH"
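
Both comparison steps in this workflow apply the same rule: convert upstream prices to USD per million tokens, then flag any value whose relative gap exceeds 1 %. The sketch below isolates that rule for reading alongside the hunk. It is an illustration, not code from the package: `drifted` and `find_drift` are hypothetical names, and the sample prices are made up.

```python
from __future__ import annotations


def price_per_mtok(per_token: float | None) -> float | None:
    """Convert a per-token price to USD per 1,000,000 tokens (None passes through)."""
    return round(float(per_token) * 1_000_000, 6) if per_token is not None else None


def drifted(ours: float, upstream: float, tolerance: float = 0.01) -> bool:
    """True when the gap, relative to the upstream value, exceeds the tolerance."""
    return abs(upstream - ours) / max(abs(upstream), 1e-9) > tolerance


def find_drift(embedded: dict[str, float], upstream_per_token: dict[str, float | None]) -> list[str]:
    """Hypothetical helper: compare one model's embedded grid with upstream prices."""
    report = []
    for key, ours in embedded.items():
        upstream = price_per_mtok(upstream_per_token.get(key))
        if upstream is None:
            continue  # upstream publishes no price for this key: nothing to compare
        if drifted(ours, upstream):
            report.append(f"{key}: ours={ours}, upstream={upstream}")
    return report


# Made-up numbers for illustration only.
embedded = {"input": 3.0, "output": 15.0, "cache_read": 0.3}
upstream = {"input": 0.000003, "output": 0.000016, "cache_read": None}
print(find_drift(embedded, upstream))
```

Dividing by the upstream value rather than the embedded one mirrors the workflow's own expression; the `1e-9` floor only guards against a zero upstream price.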

prompt_analytics_for_claude_code-0.3.0/.github/workflows/release.yml ADDED

@@ -0,0 +1,32 @@
+name: Release
+# Build and publish to PyPI on a version tag, using PyPI trusted publishing
+# (OIDC — no API token stored). The PyPI project's trusted publisher must match:
+#   owner:        romainfjgaspard
+#   repository:   prompt-analytics-for-claude-code
+#   workflow:     release.yml
+#   environment:  pypi
+on:
+  push:
+    tags: ["v*"]
+jobs:
+  pypi:
+    name: Build & publish to PyPI
+    runs-on: ubuntu-latest
+    environment: pypi
+    permissions:
+      id-token: write # required for trusted publishing
+    steps:
+      - uses: actions/checkout@v4
+      - name: Install uv
+        uses: astral-sh/setup-uv@v5
+        with:
+          python-version: "3.12"
+      - name: Build sdist + wheel
+        run: uv build
+      - name: Publish to PyPI (trusted publishing)
+        uses: pypa/gh-action-pypi-publish@release/v1

prompt_analytics_for_claude_code-0.3.0/.gitignore ADDED

@@ -0,0 +1,25 @@
+output/
+!output/.gitkeep
+.env
+__pycache__/
+*.pyc
+*.log
+.mypy_cache/
+.ruff_cache/
+.pytest_cache/
+.coverage
+.coverage.*
+htmlcov/
+.idea/
+.claude/
+dist/
+*.egg-info/
+.venv/
+build/
+# Not committed: Streamlit Community Cloud prioritizes uv.lock over
+# requirements.txt and `uv sync` skips the `dashboard` extra, so the hosted demo
+# would miss streamlit-echarts. Cloud uses requirements.txt (.[dashboard]); CI
+# and local dev resolve from pyproject.toml.
+uv.lock
+.streamlit/secrets.toml
+scripts/reconcile_dump.json

prompt_analytics_for_claude_code-0.3.0/.streamlit/config.toml ADDED

@@ -0,0 +1,18 @@
+# Dashboard theme — dark, forced (no light/dark toggle).
+#
+# A single dark theme (deep navy, Anthropic coral accent, Space Grotesk). Defining
+# only [theme] (no [theme.light]/[theme.dark] pair) forces dark for every visitor
+# and removes the in-app theme switch — deliberate: this is a dark-first showcase
+# (a [theme.light]/[theme.dark] pair would instead follow the visitor's OS).
+[theme]
+base = "dark"
+primaryColor = "#D97757"
+backgroundColor = "#0B1220"
+secondaryBackgroundColor = "#111827"
+textColor = "#F8FAFC"
+borderColor = "#2B3954"
+font = "'Space Grotesk':https://fonts.googleapis.com/css2?family=Space+Grotesk:wght@300..700&display=swap"
+headingFont = "Space Grotesk"
+codeFont = "'JetBrains Mono':https://fonts.googleapis.com/css2?family=JetBrains+Mono:wght@400..600&display=swap"
+baseFontSize = 14
+showWidgetBorder = true

prompt_analytics_for_claude_code-0.3.0/CHANGELOG.md ADDED

@@ -0,0 +1,50 @@
+# Changelog
+All notable changes to this project are documented here. The format is based on
+[Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project
+adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html) — kept at
+`0.x` on purpose: the upstream Claude Code JSONL format is unstable, so parsing
+breakage is treated as expected and reflected in the version.
+## [0.3.0] — 2026-06-15 — Initial public release
+Prompt-level analytics for Claude Code: every prompt in your local
+`~/.claude/projects/**/*.jsonl` logs becomes one priced row — no account, no API
+key, nothing leaves your machine. Two surfaces on the same data: a terminal CLI
+and a Streamlit dashboard.
+### Highlights
+- **Per-prompt dataset** — tokens and cost per prompt, project, model, category
+  and session, with a Pareto view of where the spend concentrates.
+- **Power-user analyses at the request grain** — `context` (accumulated context
+  by session depth), `ttl` (cache-TTL expiry losses), `compactions`, `overhead`,
+  `by-token-type` (the **context-rent** share of the bill), `model-category
+  --whatif`, `recommend`, `burn-rate`, and `break-even` (is a Pro/Max plan worth
+  it vs the API?).
+- **Automatic categorization** — a local, zero-dependency FR+EN heuristic labels
+  every prompt across eleven categories and scores its *observed* complexity 1–5;
+  an optional LLM pass (Anthropic / OpenRouter / local Ollama) refines it.
+- **Streamlit dashboard** — Apache ECharts on a dark-by-default theme, global
+  cross-filtering (click to filter, brush a date range), an Explorer drill-down
+  (day → session → prompt), and a public synthetic-data demo.
+- **Accurate by construction** — global cross-file deduplication on
+  `message.id + requestId` (fixes the ~2.5× token inflation and double-counted
+  resumed / `--resume` sessions), fake-prompt filtering, an explicit subagent
+  policy, and cache writes split by TTL (5m vs 1h). Totals reconcile
+  bucket-for-bucket with [ccusage](https://github.com/ryoppippi/ccusage) on real
+  history.
+- **Generic multi-provider pricing** — `pricing.yml` ships `anthropic` and
+  `copilot` grids and accepts any rate card; costs are computed at read time from
+  raw counts, so a pricing change never needs a re-extract.
+- **Inspectable exports** — relational CSVs keyed by `prompt_id` / `session_id`,
+  `--format table|csv|json` on every command, and `export --flat` for Excel/BI.
+- **`snapshot`** — records plan quota utilization over time via the OAuth usage
+  endpoint Claude Code already uses (kept out of any public API; fails
+  gracefully).
+### Privacy
+Fully local by default: `extract` and every analysis command touch only your
+local logs. `snapshot` calls Anthropic's own OAuth usage endpoint with your
+existing token; `categorize --llm` sends prompt excerpts only to the provider you
+choose (OpenRouter is a third party). See the README's Privacy section for the
+full read/write/network breakdown.
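
The changelog's "global cross-file deduplication on `message.id + requestId`" can be pictured with a short sketch. This is not the package's extractor: the record shape (a top-level `requestId` and a nested `message.id` / `message.usage`) is an assumption about the JSONL logs, `dedupe_requests` is a hypothetical name, and keeping the first occurrence is a choice made for this example.

```python
from __future__ import annotations

import json
from collections.abc import Iterable, Iterator


def dedupe_requests(lines: Iterable[str]) -> Iterator[dict]:
    """Yield each (message.id, requestId) pair once, across all input lines."""
    seen: set[tuple[str, str]] = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        message = record.get("message")
        message_id = message.get("id") if isinstance(message, dict) else None
        request_id = record.get("requestId")
        if message_id is None or request_id is None:
            continue  # not an API response record; this sketch ignores it
        key = (message_id, request_id)
        if key in seen:
            continue  # same response seen again, e.g. replayed in a resumed session
        seen.add(key)
        yield record


# Made-up log lines: the first response appears twice.
sample = [
    '{"requestId": "req_1", "message": {"id": "msg_a", "usage": {"output_tokens": 10}}}',
    '{"requestId": "req_1", "message": {"id": "msg_a", "usage": {"output_tokens": 10}}}',
    '{"requestId": "req_2", "message": {"id": "msg_b", "usage": {"output_tokens": 5}}}',
]
total = sum(r["message"]["usage"]["output_tokens"] for r in dedupe_requests(sample))
print(total)
```

Feeding the lines of every log file through one `seen` set is what makes the deduplication global rather than per file.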

prompt_analytics_for_claude_code-0.3.0/CONTRIBUTING.md ADDED

@@ -0,0 +1,111 @@
+# Contributing
+Thanks for your interest in `prompt-analytics-for-claude-code`. This is a small,
+quasi-stdlib tool with a strict quality bar (typed, tested, linted). PRs are
+welcome — please run the checks below before submitting.
+## Dev setup
+The project uses [uv](https://docs.astral.sh/uv/). Clone, then sync all extras
+(core + `categorize` + `dashboard` + `dev`):
+```bash
+uv sync --all-extras
+```
+Run the full local CI — the same steps the GitHub workflow runs:
+```bash
+uv run ruff check .
+uv run ruff format --check .
+uv run mypy prompt_analytics tests
+uv run pytest                       # coverage gate: 85%
+```
+Architecture and the data flow (`extract → analytics → {cli, dashboard, csv}`)
+are documented in [`docs/architecture.md`](docs/architecture.md). The CSV column
+contract lives in one place, [`prompt_analytics/schema.py`](prompt_analytics/schema.py)
+— change a column there and every writer/reader follows.
+A few ground rules:
+- **The core stays light.** Core depends only on `python-dotenv`, `pyyaml`, and
+  `rich`. Heavy deps belong in extras: `anthropic`/`openai` in `categorize`,
+  `streamlit`/`plotly`/`pandas`/`numpy` in `dashboard`. Don't import an extra
+  from core code.
+- **`tokens.csv` stores raw counts, never costs.** Costs are computed at read
+  time in `analytics.py` from the pricing grid, so a pricing change never
+  requires a re-extract. Keep it that way.
+- **Every bug fix gets a regression test.** The happy path is not enough.
+## Adding a pricing provider
+Pricing is a generic multi-provider grid in
+[`prompt_analytics/data/pricing.yml`](prompt_analytics/data/pricing.yml). To add
+a provider (your company's internal rates, a Bedrock tier, …), add a key under
+`providers:`. All prices are **USD per 1,000,000 tokens**:
+```yaml
+providers:
+  my-company:
+    models:
+      claude-opus-4-8:
+        input: 4.50          # negotiated rate
+        output: 22.50
+        cache_read: 0.45
+        cache_write_5m: 5.625
+        cache_write_1h: 9.00
+    fallbacks:               # matched by longest model-name prefix
+      claude-opus:
+        input: 4.50
+        output: 22.50
+        cache_read: 0.45
+        cache_write_5m: 5.625
+        cache_write_1h: 9.00
+```
+Then use it with `--providers anthropic,my-company` on `compare`, or `--provider
+my-company` on the other commands. Users can keep their grid out of the repo and
+pass it with `--pricing ./my-pricing.yml`.
+Lookup rules (see `pricing.get_model_pricing`): an exact model id wins; otherwise
+the **longest matching prefix** under `fallbacks` is used; `[1m]` and
+long-context suffixes are stripped before lookup. An unpriced model is never
+silently zeroed — it surfaces in the extraction/analytics report so you know to
+add an entry.
+The bundled `anthropic` and `copilot` grids are validated weekly by CI
+(`.github/workflows/pricing-drift.yml`): the job diffs `anthropic` against
+LiteLLM's `model_prices_and_context_window.json` and re-runs
+`scripts/fetch_copilot_pricing.py` against the live Copilot pricing page, opening
+a PR if either drifts. Update those two grids through that job, not by hand.
+## Capturing a test fixture
+Claude Code's JSONL format changes without notice, so parsing is pinned against
+fixture files **per Claude Code version**, under
+`tests/fixtures/claude-code-<version>/`. When you hit a new format, capture one
+from your own logs — anonymized, so it is safe to commit:
+```bash
+uv run python scripts/capture_fixture.py path/to/real/session.jsonl
+# --version 2.1.180   override the auto-detected version label
+# --project demo-app   generic project folder name in the fixture
+```
+`capture_fixture.py` rewrites the log **character by character** (letters → `x`,
+digits → `0`, punctuation and length preserved) while keeping everything the
+parser counts: structure, ids, the `uuid`/`parentUuid` attribution chain,
+`message.usage`, `model`, `timestamp`, `version`, and the filtering markers
+(`<command-name>`, `[Request interrupted…`). It scrubs your username, paths and
+prompt text — verify the result before committing. `test_fixtures_versioned.py`
+then acts as a drift canary: it asserts the fixture parses cleanly (no invalid
+lines, no unknown event types) and deterministically.
+## Submitting
+- Branch off `main`, keep commits focused, and describe the *why* in the PR.
+- Add or update tests and docs (README / `docs/`) alongside the code.
+- Update [`CHANGELOG.md`](CHANGELOG.md) under `[Unreleased]`.
+- Make sure the four checks above pass locally — CI runs them on Python
+  3.10–3.14 across Ubuntu and Windows.
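
The lookup rules stated in CONTRIBUTING.md above (exact model id first, then the longest matching prefix under `fallbacks`, with `[1m]` stripped beforehand) can be sketched as follows. This is a reading aid, not the body of `pricing.get_model_pricing`: `lookup_rates` is a hypothetical name, only the `[1m]` suffix is handled, and the rates are invented.

```python
from __future__ import annotations


def lookup_rates(provider: dict, model_id: str) -> dict | None:
    """Exact id first, then the longest matching `fallbacks` prefix, else None."""
    # Assumption: only "[1m]" is stripped here; the document also mentions
    # other long-context suffixes that this sketch does not model.
    bare = model_id.removesuffix("[1m]")
    models = provider.get("models", {})
    if bare in models:
        return models[bare]
    fallbacks = provider.get("fallbacks", {})
    matches = [prefix for prefix in fallbacks if bare.startswith(prefix)]
    if not matches:
        return None  # unpriced: the caller should report it, not price it at zero
    return fallbacks[max(matches, key=len)]


# Invented rate card, same shape as the YAML example in CONTRIBUTING.md.
provider = {
    "models": {"claude-opus-4-8": {"input": 4.50, "output": 22.50}},
    "fallbacks": {
        "claude": {"input": 1.00, "output": 5.00},
        "claude-opus": {"input": 4.00, "output": 20.00},
    },
}
print(lookup_rates(provider, "claude-opus-4-8[1m]"))  # exact match after stripping
print(lookup_rates(provider, "claude-opus-5"))        # falls back to "claude-opus"
print(lookup_rates(provider, "gpt-x"))                # no entry at all
```

Returning `None` instead of a zeroed rate keeps an unpriced model visible to the caller, in line with the "never silently zeroed" rule.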

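The character-by-character rewrite that CONTRIBUTING.md attributes to `capture_fixture.py` (letters to `x`, digits to `0`, punctuation and length preserved) can be sketched in a few lines. This covers only the text-scrubbing rule; the real script also preserves ids, usage counters and filtering markers, which are not modeled here, and `scrub` is a hypothetical name.

```python
def scrub(text: str) -> str:
    """Replace letters with 'x' and digits with '0'; keep every other character."""
    out = []
    for ch in text:
        if ch.isalpha():
            out.append("x")
        elif ch.isdigit():
            out.append("0")
        else:
            out.append(ch)  # punctuation and whitespace survive unchanged
    return "".join(out)


original = "Fix bug #42 in /home/alice/app.py!"
masked = scrub(original)
print(masked)
```

Because every input character maps to exactly one output character, the masked text keeps the length and punctuation layout of the original.
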
prompt_analytics_for_claude_code-0.3.0/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Romain Gaspard
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.