llm_cost_tracker 0.2.0.alpha2 → 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (83)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +48 -1
  3. data/README.md +114 -70
  4. data/Rakefile +2 -0
  5. data/app/assets/llm_cost_tracker/application.css +760 -0
  6. data/app/controllers/llm_cost_tracker/application_controller.rb +1 -7
  7. data/app/controllers/llm_cost_tracker/assets_controller.rb +12 -0
  8. data/app/controllers/llm_cost_tracker/calls_controller.rb +29 -12
  9. data/app/controllers/llm_cost_tracker/dashboard_controller.rb +5 -1
  10. data/app/helpers/llm_cost_tracker/application_helper.rb +46 -5
  11. data/app/helpers/llm_cost_tracker/chart_helper.rb +133 -0
  12. data/app/helpers/llm_cost_tracker/dashboard_filter_helper.rb +47 -0
  13. data/app/helpers/llm_cost_tracker/dashboard_filter_options_helper.rb +34 -0
  14. data/app/helpers/llm_cost_tracker/dashboard_query_helper.rb +58 -0
  15. data/app/helpers/llm_cost_tracker/pagination_helper.rb +18 -0
  16. data/app/services/llm_cost_tracker/dashboard/data_quality.rb +16 -1
  17. data/app/services/llm_cost_tracker/dashboard/filter.rb +22 -3
  18. data/app/services/llm_cost_tracker/dashboard/overview_stats.rb +16 -1
  19. data/app/services/llm_cost_tracker/dashboard/spend_anomaly.rb +79 -0
  20. data/app/services/llm_cost_tracker/dashboard/tag_key_explorer.rb +19 -46
  21. data/app/services/llm_cost_tracker/dashboard/top_models.rb +17 -8
  22. data/app/services/llm_cost_tracker/pagination.rb +6 -0
  23. data/app/views/layouts/llm_cost_tracker/application.html.erb +35 -333
  24. data/app/views/llm_cost_tracker/calls/index.html.erb +116 -74
  25. data/app/views/llm_cost_tracker/calls/show.html.erb +58 -1
  26. data/app/views/llm_cost_tracker/dashboard/index.html.erb +211 -111
  27. data/app/views/llm_cost_tracker/data_quality/index.html.erb +224 -78
  28. data/app/views/llm_cost_tracker/errors/database.html.erb +3 -3
  29. data/app/views/llm_cost_tracker/errors/invalid_filter.html.erb +3 -3
  30. data/app/views/llm_cost_tracker/errors/not_found.html.erb +3 -3
  31. data/app/views/llm_cost_tracker/models/index.html.erb +66 -58
  32. data/app/views/llm_cost_tracker/shared/_active_filters.html.erb +16 -0
  33. data/app/views/llm_cost_tracker/shared/_metric_stack.html.erb +23 -0
  34. data/app/views/llm_cost_tracker/shared/_spend_chart.html.erb +18 -0
  35. data/app/views/llm_cost_tracker/shared/_tag_chips.html.erb +15 -0
  36. data/app/views/llm_cost_tracker/shared/setup_required.html.erb +3 -2
  37. data/app/views/llm_cost_tracker/tags/index.html.erb +55 -12
  38. data/app/views/llm_cost_tracker/tags/show.html.erb +88 -39
  39. data/config/routes.rb +3 -0
  40. data/lib/llm_cost_tracker/assets.rb +19 -0
  41. data/lib/llm_cost_tracker/configuration.rb +78 -42
  42. data/lib/llm_cost_tracker/engine.rb +2 -0
  43. data/lib/llm_cost_tracker/event.rb +2 -0
  44. data/lib/llm_cost_tracker/generators/llm_cost_tracker/add_streaming_generator.rb +29 -0
  45. data/lib/llm_cost_tracker/generators/llm_cost_tracker/templates/add_streaming_to_llm_api_calls.rb.erb +25 -0
  46. data/lib/llm_cost_tracker/generators/llm_cost_tracker/templates/create_llm_api_calls.rb.erb +4 -0
  47. data/lib/llm_cost_tracker/generators/llm_cost_tracker/templates/llm_cost_tracker_prices.yml.erb +8 -1
  48. data/lib/llm_cost_tracker/llm_api_call.rb +9 -1
  49. data/lib/llm_cost_tracker/middleware/faraday.rb +57 -9
  50. data/lib/llm_cost_tracker/parsed_usage.rb +7 -3
  51. data/lib/llm_cost_tracker/parsers/anthropic.rb +79 -1
  52. data/lib/llm_cost_tracker/parsers/base.rb +17 -5
  53. data/lib/llm_cost_tracker/parsers/gemini.rb +59 -6
  54. data/lib/llm_cost_tracker/parsers/openai.rb +8 -0
  55. data/lib/llm_cost_tracker/parsers/openai_compatible.rb +8 -0
  56. data/lib/llm_cost_tracker/parsers/openai_usage.rb +55 -1
  57. data/lib/llm_cost_tracker/parsers/registry.rb +15 -3
  58. data/lib/llm_cost_tracker/parsers/sse.rb +81 -0
  59. data/lib/llm_cost_tracker/price_registry.rb +18 -7
  60. data/lib/llm_cost_tracker/price_sync/fetcher.rb +72 -0
  61. data/lib/llm_cost_tracker/price_sync/merger.rb +72 -0
  62. data/lib/llm_cost_tracker/price_sync/model_catalog.rb +77 -0
  63. data/lib/llm_cost_tracker/price_sync/raw_price.rb +35 -0
  64. data/lib/llm_cost_tracker/price_sync/source.rb +29 -0
  65. data/lib/llm_cost_tracker/price_sync/source_result.rb +7 -0
  66. data/lib/llm_cost_tracker/price_sync/sources/litellm.rb +91 -0
  67. data/lib/llm_cost_tracker/price_sync/sources/open_router.rb +94 -0
  68. data/lib/llm_cost_tracker/price_sync/validator.rb +66 -0
  69. data/lib/llm_cost_tracker/price_sync.rb +310 -0
  70. data/lib/llm_cost_tracker/pricing.rb +19 -6
  71. data/lib/llm_cost_tracker/retention.rb +34 -0
  72. data/lib/llm_cost_tracker/storage/active_record_store.rb +3 -1
  73. data/lib/llm_cost_tracker/stream_collector.rb +158 -0
  74. data/lib/llm_cost_tracker/tag_query.rb +7 -2
  75. data/lib/llm_cost_tracker/tags_column.rb +21 -1
  76. data/lib/llm_cost_tracker/tracker.rb +15 -12
  77. data/lib/llm_cost_tracker/value_helpers.rb +40 -0
  78. data/lib/llm_cost_tracker/version.rb +1 -1
  79. data/lib/llm_cost_tracker.rb +51 -29
  80. data/lib/tasks/llm_cost_tracker.rake +124 -0
  81. data/llm_cost_tracker.gemspec +9 -8
  82. metadata +40 -12
  83. data/PLAN_0.2.md +0 -488
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: '043255892f8db58b53a84d26d25ee09561b8349b72f463429f0f31691c6449a5'
4
- data.tar.gz: a4577ed5935b1f65e85c4d73700bde447715875d40846d1bf6a8575b0588e2fc
3
+ metadata.gz: 8b20da957651521f022866af9d4735a4ef53d52a2dc3c278b8b2a90e1d7a7f98
4
+ data.tar.gz: ea98b2a7505d99c5f78d7756d0adc50224c4fdc88000fa5ec81be4450c9200f1
5
5
  SHA512:
6
- metadata.gz: 0a7d8c3454cb88b93da2ef83d4b8cda330060200e3cb5073017f2d9c927d2d6e6ceef6e8d3cbc29120d8e44d955c8c1800126271ebb92f981253f9315d9abde8
7
- data.tar.gz: 239c293ddf252d5933adf294329f20b9f5df2e1322bb5413aa4392b43077f26385ad89bbf382f786e1bc79a94252b47ed157d19464973beb4467a312cef368eb
6
+ metadata.gz: 9ca709080d46395ac32b9a2931b4b3cb7d4df6016b73bad3579cb1decdd046be21a2fb67c06e96876013a754a113e9ce5987ed0e27792b312716324bdb5f9adb
7
+ data.tar.gz: 445b77222180802f208246a2e25b30e5e0a5679d2d5b84a2ba00d1e2fc97a5cf3127521be13f415c6d76bbcc056dd0bdfe6ade937eb9d67d737ce6b6548665fa
data/CHANGELOG.md CHANGED
@@ -2,6 +2,53 @@
2
2
 
3
3
  Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). Versioning: [SemVer](https://semver.org/spec/v2.0.0.html).
4
4
 
5
+ ## [Unreleased]
6
+
7
+ ## [0.3.0] - 2026-04-22
8
+
9
+ ### Added
10
+
11
+ - Streaming capture across OpenAI, Anthropic, and Gemini, including `LlmCostTracker.track_stream` for non-Faraday clients.
12
+ - `stream` / `usage_source` persistence and dashboard coverage for streamed calls.
13
+ - `llm_cost_tracker:prices:sync` and `llm_cost_tracker:prices:check` for keeping local price snapshots current.
14
+ - `LlmCostTracker.enforce_budget!` and opt-in `enforce_budget:` keyword for `track` / `track_stream`.
15
+
16
+ ### Changed
17
+
18
+ - Price refresh now uses structured JSON sources (LiteLLM primary, OpenRouter secondary) instead of scraping provider HTML pages.
19
+ - Synced price entries now carry source provenance (`_source`, `_source_version`, `_fetched_at`), while `_source: "manual"` entries remain untouched.
20
+ - Manual stream parsing now resolves parsers through the shared registry, so configured OpenAI-compatible providers work the same way as built-in ones.
21
+ - `LlmCostTracker.configure` now treats configuration as an immutable snapshot after the block returns; mutating or replacing shared fields through `LlmCostTracker.configuration` raises `FrozenError`.
22
+
23
+ ### Removed
24
+
25
+ - Public `LlmCostTracker.configuration=` writer; use `LlmCostTracker.configure` to replace configuration snapshots.
26
+
27
+ ## [0.2.0] - 2026-04-20
28
+
29
+ ### Added
30
+
31
+ - `LlmCostTracker::Retention.prune(older_than:)` and `llm_cost_tracker:prune` rake task.
32
+ - Overview: budget projection, previous-period daily spend comparison, spend anomaly alerts.
33
+ - Call details: token and cost mix breakdowns.
34
+ - Dashboard CSS served as a fingerprinted, immutably-cached file via `LlmCostTracker::AssetsController`.
35
+ - Filter dropdowns for Provider and Model, scoped to the current slice.
36
+ - Pagination with per-page selector and Stripe-style page window.
37
+
38
+ ### Changed
39
+
40
+ - Dashboard UI aligned to Tailwind UI Application UI: dot-indicator badges, value-first stat tiles, inset-shadow form inputs, white secondary buttons with `shadow-sm`.
41
+ - CSS fully namespaced under `lct-*`; removed bare `body` selector to avoid host-app leakage.
42
+
43
+ ### Fixed
44
+
45
+ - Thread-safe price memoization (regression from 0.1.3).
46
+ - `by_tag` on MySQL JSON columns.
47
+ - CSV export escapes formula-prefixed values.
48
+ - Portable dashboard sorting across adapters.
49
+ - Dashboard shows database errors instead of install/setup guidance when the DB is unavailable.
50
+ - Tag key explorer uses SQL discovery on MySQL 8.0+.
51
+
5
52
  ## [0.2.0.alpha1, 0.2.0.alpha2] - 2026-04-20
6
53
 
7
54
  ### Breaking
@@ -15,7 +62,7 @@ Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). Versioning: [S
15
62
  ### Added
16
63
 
17
64
  - `LlmApiCall.group_by_period(:day/:month)` — SQL-side period grouping.
18
- - Opt-in `LlmCostTracker::Engine` dashboard (Rails 7.1+): overview with delta-vs-previous-period, provider rollup, models, filterable call list with CSV export and outlier sort modes, call details, tag key explorer, per-key tag breakdown, data quality. PostgreSQL/SQLite use adapter-specific SQL; MySQL falls back to an in-Ruby scan capped at 50k rows. Core middleware still works without Rails.
65
+ - Opt-in `LlmCostTracker::Engine` dashboard (Rails 7.1+): overview with delta-vs-previous-period, provider rollup, models, filterable call list with CSV export and outlier sort modes, call details, tag key explorer, per-key tag breakdown, data quality. PostgreSQL/SQLite use adapter-specific SQL; MySQL 8.0+ uses JSON_TABLE-based tag discovery. Core middleware still works without Rails.
19
66
 
20
67
  ## [0.1.4] - 2026-04-18
21
68
 
data/README.md CHANGED
@@ -1,35 +1,17 @@
1
- # LlmCostTracker
1
+ # LLM Cost Tracker
2
2
 
3
- **Self-hosted LLM cost tracking for Ruby and Rails.** Intercepts Faraday LLM responses, prices them locally, stores events in your database. No proxy, no SaaS.
3
+ **Self-hosted LLM cost tracking for Ruby and Rails.** Intercepts Faraday LLM responses or records usage explicitly, prices events locally, and stores them in your database. No proxy, no SaaS.
4
4
 
5
5
  [![Gem Version](https://img.shields.io/gem/v/llm_cost_tracker.svg)](https://rubygems.org/gems/llm_cost_tracker)
6
6
  [![CI](https://github.com/sergey-homenko/llm_cost_tracker/actions/workflows/ruby.yml/badge.svg)](https://github.com/sergey-homenko/llm_cost_tracker/actions)
7
7
 
8
- ```text
9
- LLM Cost Report (last 30 days)
10
-
11
- Total cost: $127.420000
12
- Requests: 4,218
13
- Avg latency: 812ms
14
- Unknown pricing: 0
15
-
16
- By model:
17
- gpt-4o $82.100000
18
- claude-sonnet-4-6 $31.200000
19
- gemini-2.5-flash $14.120000
20
-
21
- By tag key "env":
22
- production $119.300000
23
- staging $8.120000
24
- ```
25
-
26
8
  ## Why
27
9
 
28
- Every Rails app with LLM integrations eventually runs into the same question: where did that invoice come from? Full observability platforms like Langfuse and Helicone cover a lot more than cost, and sometimes you just want a small Rails-native ledger that lives in your own database.
10
+ Every Rails app with LLM integrations eventually runs into the same question: where did that invoice come from? Full observability platforms like Langfuse and Helicone solve a broader set of problems; sometimes you just need a small Rails-native ledger in your own database.
29
11
 
30
- `llm_cost_tracker` is scoped to that. It plugs into Faraday, parses provider usage out of the response, looks up pricing locally, and writes an event. You end up with a ledger you can query with plain ActiveRecord, slice by any tag dimension, and optionally surface on a built-in dashboard. No proxy, no SaaS, no separate service to run.
12
+ `llm_cost_tracker` is built for that. It plugs into Faraday or lets you record usage explicitly with `track` / `track_stream`, looks up pricing locally, and writes an event. You end up with a ledger you can query with plain ActiveRecord, slice by any tag dimension, and optionally surface on a built-in dashboard. No proxy, no SaaS, no separate service to run.
31
13
 
32
- It's not a tracing platform, prompt CMS, eval system, or gateway — and doesn't want to be. The goal is answering _"what did this app spend on LLM APIs, and where did that spend come from?"_ well enough that you stop worrying about it.
14
+ It is not a tracing platform, prompt CMS, eval system, or gateway. The goal is to answer _"what did this app spend on LLM APIs, and where did that spend come from?"_ clearly enough to make spend review routine.
33
15
 
34
16
  ## Installation
35
17
 
@@ -44,23 +26,6 @@ bin/rails generate llm_cost_tracker:install
44
26
  bin/rails db:migrate
45
27
  ```
46
28
 
47
- ## Quick try (no database)
48
-
49
- ```ruby
50
- require "llm_cost_tracker"
51
-
52
- LlmCostTracker.configure { |c| c.storage_backend = :log }
53
-
54
- LlmCostTracker.track(
55
- provider: :openai,
56
- model: "gpt-4o",
57
- input_tokens: 1000,
58
- output_tokens: 200,
59
- feature: "demo"
60
- )
61
- # => [LlmCostTracker] openai/gpt-4o tokens=1000+200 cost=$0.004500 tags={:feature=>"demo"}
62
- ```
63
-
64
29
  ## Usage
65
30
 
66
31
  ### Patch an existing client's Faraday connection
@@ -78,19 +43,7 @@ OpenAI.configure do |config|
78
43
  end
79
44
  ```
80
45
 
81
- `tags:` can be a callable so `Current` attributes are evaluated per request:
82
-
83
- ```ruby
84
- class Current < ActiveSupport::CurrentAttributes
85
- attribute :user, :tenant, :workflow
86
- end
87
-
88
- # application_controller.rb
89
- before_action do
90
- Current.user = current_user
91
- Current.workflow = "chat"
92
- end
93
- ```
46
+ `tags:` can be a callable and is evaluated on each request.
94
47
 
95
48
  ### Raw Faraday
96
49
 
@@ -105,7 +58,41 @@ end
105
58
  conn.post("/v1/responses", { model: "gpt-5-mini", input: "Hello!" })
106
59
  ```
107
60
 
108
- Place `llm_cost_tracker` inside the Faraday stack where it can see the final response body. For streaming APIs, tracking requires the final body to expose provider usage; otherwise the gem warns and skips — use manual tracking there.
61
+ Place `llm_cost_tracker` inside the Faraday stack where it can see the final response body.
62
+
63
+ ### Streaming
64
+
65
+ Streaming is captured automatically for OpenAI, Anthropic, and Gemini when the request goes through the Faraday middleware. The middleware tees the `on_data` callback, keeps the stream flowing to your code, and records the final usage block once the response completes.
66
+
67
+ ```ruby
68
+ # OpenAI: include usage in the final chunk
69
+ client.chat(parameters: {
70
+ model: "gpt-4o",
71
+ messages: [...],
72
+ stream: proc { |chunk| ... },
73
+ stream_options: { include_usage: true }
74
+ })
75
+ ```
76
+
77
+ Anthropic emits usage in `message_start` + `message_delta` events. Gemini's `:streamGenerateContent` endpoint includes `usageMetadata`; usage from the final chunk is used.
78
+
79
+ Streamed calls are stored with `stream: true` and `usage_source: "stream_final"`. If the provider never sends final usage, the call is still recorded with `usage_source: "unknown"` so those calls surface on the Data Quality page.
80
+
81
+ For non-Faraday clients (raw `Net::HTTP`, custom SSE code, Azure OpenAI), use the explicit helper:
82
+
83
+ ```ruby
84
+ LlmCostTracker.track_stream(provider: "openai", model: "gpt-4o") do |stream|
85
+ my_client.stream(...) { |chunk| stream.event(chunk) }
86
+ end
87
+
88
+ # Or skip the chunk parsing entirely if you already know the totals:
89
+ LlmCostTracker.track_stream(provider: "openai", model: "gpt-4o") do |stream|
90
+ # ... your streaming loop ...
91
+ stream.usage(input_tokens: 120, output_tokens: 45)
92
+ end
93
+ ```
94
+
95
+ Run `bin/rails g llm_cost_tracker:add_streaming` once on existing installs to add the `stream` and `usage_source` columns.
109
96
 
110
97
  ### Manual tracking
111
98
 
@@ -148,7 +135,7 @@ LlmCostTracker.configure do |config|
148
135
  end
149
136
  ```
150
137
 
151
- Pricing is best-effort. OpenRouter-style IDs like `openai/gpt-4o-mini` are normalized to built-in names when possible. Use `prices_file` / `pricing_overrides` for fine-tunes, gateway-specific IDs, enterprise discounts, batch pricing, or models the gem doesn't know.
138
+ Pricing is best effort. OpenRouter-style IDs like `openai/gpt-4o-mini` are normalized to built-in names when possible. Use `prices_file` / `pricing_overrides` for fine-tunes, gateway-specific IDs, enterprise discounts, batch pricing, or models the gem does not know.
152
139
 
153
140
  `storage_error_behavior = :warn` (default) lets LLM responses continue if storage fails; `:raise` exposes `StorageError#original_error`.
154
141
 
@@ -160,7 +147,7 @@ LlmCostTracker::LlmApiCall.unknown_pricing.group(:model).count
160
147
 
161
148
  ### Keeping prices current
162
149
 
163
- Built-in prices are in `lib/llm_cost_tracker/prices.json`. The gem never fetches pricing on boot. For production, generate a local overrides file and point the gem at it:
150
+ Built-in prices live in `lib/llm_cost_tracker/prices.json`. The gem never fetches pricing on boot. For production, keep a local snapshot under `config/` and point the gem at it:
164
151
 
165
152
  ```bash
166
153
  bin/rails generate llm_cost_tracker:prices
@@ -175,7 +162,26 @@ bin/rails generate llm_cost_tracker:prices
175
162
  }
176
163
  ```
177
164
 
178
- `pricing_overrides` has the highest precedence; use it for small Ruby-only tweaks, `prices_file` for broader tables.
165
+ `pricing_overrides` has the highest precedence. Use it for a handful of Ruby-side overrides; use `prices_file` when you want a local pricing table under source control.
166
+
167
+ To refresh prices on demand:
168
+
169
+ ```bash
170
+ bin/rails llm_cost_tracker:prices:sync
171
+ ```
172
+
173
+ `llm_cost_tracker:prices:sync` refreshes the current registry from two structured sources: LiteLLM first, OpenRouter second. LiteLLM is the primary source; OpenRouter fills gaps and helps surface discrepancies.
174
+
175
+ `llm_cost_tracker:prices:sync` / `llm_cost_tracker:prices:check` perform HTTP GET requests to:
176
+
177
+ - LiteLLM pricing JSON: `https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json`
178
+ - OpenRouter Models API: `https://openrouter.ai/api/v1/models`
179
+
180
+ If `config.prices_file` is configured, the task syncs that file automatically; otherwise it works from the built-in snapshot. `_source: "manual"` entries are never touched. Models that are still in your file but missing from both upstream sources are left alone and reported as orphaned. For intentional custom entries, mark them as manual so they stop showing up in orphaned warnings.
181
+
182
+ Use `PREVIEW=1` to see the diff without writing. Use `STRICT=1` to fail instead of applying a partial refresh when a source fails or the validator rejects a price. Use `bin/rails llm_cost_tracker:prices:check` in CI to print the current diff and exit non-zero when the snapshot has drifted or refresh fails.
183
+
184
+ Large price changes are flagged during sync. If a specific entry is expected to move by more than 3x, add `_validator_override: ["skip_relative_change"]` to that entry in your local price file.
179
185
 
180
186
  ## Budget enforcement
181
187
 
@@ -194,7 +200,25 @@ rescue LlmCostTracker::BudgetExceededError => e
194
200
  # e.monthly_total, e.budget, e.last_event
195
201
  ```
196
202
 
197
- `:block_requests` is best-effort under concurrency, not a transactional cap. Use provider/gateway-level limits for strict quotas.
203
+ `:block_requests` is a **guardrail, not a hard cap**. The preflight and the spend-recording write are separate statements, so under Puma / Sidekiq concurrency multiple workers can all pass the preflight and then collectively overshoot the budget. The setting reliably *stops new requests after the overshoot is visible* — it does not prevent the overshoot itself. For strict quotas use a provider- or gateway-level limit, or a database-backed counter outside this gem.
204
+
205
+ Preflight is wired into the Faraday middleware automatically. When you record events via `LlmCostTracker.track` / `track_stream` and also want the same preflight, opt in:
206
+
207
+ ```ruby
208
+ LlmCostTracker.track(
209
+ provider: "openai",
210
+ model: "gpt-4o",
211
+ input_tokens: 120,
212
+ output_tokens: 45,
213
+ enforce_budget: true
214
+ )
215
+
216
+ LlmCostTracker.track_stream(provider: "openai", model: "gpt-4o", enforce_budget: true) do |stream|
217
+ # raises BudgetExceededError before the block runs when over budget
218
+ end
219
+
220
+ LlmCostTracker.enforce_budget! # standalone preflight
221
+ ```
198
222
 
199
223
  ## Querying costs
200
224
 
@@ -229,7 +253,15 @@ LlmCostTracker::LlmApiCall.by_tags(user_id: 42, feature: "chat").this_month.tota
229
253
  LlmCostTracker::LlmApiCall.between(1.week.ago, Time.current).cost_by_model
230
254
  ```
231
255
 
232
- ### Tag storage
256
+ ## Retention
257
+
258
+ Retention is not enforced automatically. Use the rake task below if you need to delete older records in batches.
259
+
260
+ ```bash
261
+ DAYS=90 bin/rails llm_cost_tracker:prune # delete calls older than N days in batches
262
+ ```
263
+
264
+ ## Tag storage
233
265
 
234
266
  New installs use `jsonb` + GIN on PostgreSQL:
235
267
 
@@ -251,7 +283,7 @@ bin/rails db:migrate
251
283
 
252
284
  ## Dashboard (optional)
253
285
 
254
- Opt-in Rails Engine. Plain ERB, inline CSS, no JS. Requires Rails 7.1+; the core middleware works without Rails.
286
+ Optional Rails Engine. Plain ERB, no JavaScript framework, no asset pipeline required. Requires Rails 7.1+; the core middleware works without Rails.
255
287
 
256
288
  ```ruby
257
289
  # config/application.rb (or an initializer)
@@ -263,15 +295,15 @@ mount LlmCostTracker::Engine => "/llm-costs"
263
295
 
264
296
  Routes (GET-only; CSV export included):
265
297
 
266
- - `/llm-costs` — overview: spend (with delta vs previous period), calls, avg cost/call, avg latency, unknown pricing, budget, daily trend, provider rollup, top models
298
+ - `/llm-costs` — overview: spend with delta vs previous period, budget projection, spend anomaly banner, daily trend vs previous slice, provider rollup, top models
267
299
  - `/llm-costs/models` — by provider + model; sortable by spend, volume, avg cost, latency
268
300
  - `/llm-costs/calls` — filterable + paginated; outlier sort modes (expensive, largest input/output, slowest, unknown pricing); CSV export
269
- - `/llm-costs/calls/:id` — details
270
- - `/llm-costs/tags` — tag keys present in the dataset (PG/SQLite native, MySQL via in-Ruby fallback)
301
+ - `/llm-costs/calls/:id` — details with token mix and cost mix breakdowns
302
+ - `/llm-costs/tags` — tag keys present in the dataset (PG/SQLite native; MySQL 8.0+ via JSON_TABLE)
271
303
  - `/llm-costs/tags/:key` — breakdown by values of a given tag key
272
304
  - `/llm-costs/data_quality` — unknown pricing share, untagged calls, missing latency
273
305
 
274
- > ⚠️ **No built-in auth.** Tags carry whatever your app puts in them. Protect the mount point with your app's auth.
306
+ > ⚠️ **No built-in auth.** Tags carry whatever your app puts in them. Protect the mount point with your application's authentication.
275
307
 
276
308
  ### Basic auth
277
309
 
@@ -330,7 +362,7 @@ config.custom_storage = ->(event) {
330
362
  config.openai_compatible_providers["gateway.example.com"] = "internal_gateway"
331
363
  ```
332
364
 
333
- Configured hosts are parsed with the OpenAI-compatible usage shape (`prompt_tokens` / `completion_tokens` / `total_tokens`, `input_tokens` / `output_tokens`, and optional cached-input details). Covers OpenRouter, DeepSeek, and private gateways exposing Chat Completions / Responses / Completions / Embeddings.
365
+ Configured hosts are parsed using the OpenAI-compatible usage shape (`prompt_tokens` / `completion_tokens` / `total_tokens`, `input_tokens` / `output_tokens`, and optional cached-input details). This covers OpenRouter, DeepSeek, and private gateways exposing Chat Completions / Responses / Completions / Embeddings.
334
366
 
335
367
  ## Custom parser
336
368
 
@@ -372,20 +404,32 @@ LlmCostTracker::Parsers::Registry.register(AcmeParser.new)
372
404
  | Google Gemini | ✅ | Gemini 2.5 Pro/Flash/Flash-Lite, 2.0 Flash/Flash-Lite, 1.5 Pro/Flash |
373
405
  | Any other | 🔧 | Custom parser |
374
406
 
375
- Endpoints: OpenAI Chat Completions / Responses / Completions / Embeddings; OpenAI-compatible equivalents; Anthropic Messages; Gemini `generateContent` with `usageMetadata`.
407
+ Endpoints: OpenAI Chat Completions / Responses / Completions / Embeddings; OpenAI-compatible equivalents; Anthropic Messages; Gemini `generateContent` and `streamGenerateContent`. All endpoints support streaming capture.
376
408
 
377
409
  ## Safety
378
410
 
379
- - No external HTTP calls.
411
+ - No external HTTP calls at request-tracking time.
380
412
  - No prompt or response bodies stored.
381
413
  - Faraday responses not modified.
382
414
  - Storage failures non-fatal by default (`storage_error_behavior = :warn`).
383
- - Budget / unknown-pricing errors are raised only when you opt in.
415
+ - Budget and unknown-pricing errors are raised only when you opt in.
416
+
417
+ ## Thread safety (Puma, Sidekiq)
418
+
419
+ The gem is designed for multi-threaded hosts — Puma with `max_threads > 1` and Sidekiq with `concurrency > 1` are both supported. A few rules:
420
+
421
+ - **Configure once at boot.** `LlmCostTracker.configure` deep-freezes `default_tags`, `pricing_overrides`, `report_tag_breakdowns`, and `openai_compatible_providers` when the block returns. Mutating or replacing shared fields through `LlmCostTracker.configuration` raises `FrozenError`.
422
+ - **Use `:active_record` storage for shared ledgers.** Puma workers and Sidekiq processes do not share memory; `:log` and `:custom` backends see per-process state only. `:active_record` writes to a single table and is the right choice for dashboards and budget checks across processes.
423
+ - **Size your connection pool.** Each tracked call on the middleware path issues up to three SQL queries (preflight `SUM`, `INSERT`, post-check `SUM`). Make sure the AR pool covers `puma max_threads + sidekiq concurrency` plus your app's own usage.
424
+ - **Don't share a `StreamCollector` across threads you don't own.** The collector itself is thread-safe — `event`, `usage`, and `finish!` synchronize internally and `finish!` is idempotent — but the documented pattern is one collector per stream.
425
+ - **`finish!` is a barrier.** Once a stream is finished, later `event`, `usage`, or `model=` calls raise `FrozenError` instead of mutating a closed collector.
426
+ - **`ActiveSupport::Notifications` subscribers run synchronously** in the caller's thread. Keep them fast or hand off to a background job; otherwise they add latency to every tracked call.
427
+ - **`storage_error_behavior = :raise` inside Sidekiq** will retry the job, which can duplicate an expensive LLM call. Prefer `:warn` plus a Notifications subscriber, or `:ignore`, for worker contexts.
384
428
 
385
429
  ## Known limitations
386
430
 
387
- - `:block_requests` is best-effort under concurrency; use an external quota system for hard caps.
388
- - Streaming/SSE tracked only when Faraday exposes a final body with usage.
431
+ - `:block_requests` is a best-effort guardrail, not a hard cap. Concurrent workers can pass preflight simultaneously and collectively overshoot the budget. Use an external quota system if you need a transactional cap.
432
+ - Streaming capture relies on the provider emitting a final-usage event (OpenAI needs `stream_options: { include_usage: true }`); missing events are recorded with `usage_source: "unknown"` so they surface on the Data Quality page.
389
433
  - Anthropic cache TTL variants (1h vs 5min writes) not modeled separately.
390
434
  - OpenAI reasoning tokens included in output totals; separate reasoning-token attribution not stored.
391
435
 
data/Rakefile CHANGED
@@ -4,6 +4,8 @@ require "bundler/gem_tasks"
4
4
  require "rspec/core/rake_task"
5
5
  require "rubocop/rake_task"
6
6
 
7
+ Dir[File.expand_path("lib/tasks/**/*.rake", __dir__)].each { |path| load path }
8
+
7
9
  RSpec::Core::RakeTask.new(:spec)
8
10
  RuboCop::RakeTask.new(:rubocop)
9
11