llm_cost_tracker 0.7.3 → 0.8.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/.ruby-version +1 -0
- data/CHANGELOG.md +66 -1
- data/README.md +58 -225
- data/app/assets/llm_cost_tracker/application.css +218 -41
- data/app/controllers/llm_cost_tracker/application_controller.rb +30 -17
- data/app/controllers/llm_cost_tracker/assets_controller.rb +11 -1
- data/app/controllers/llm_cost_tracker/calls_controller.rb +19 -14
- data/app/controllers/llm_cost_tracker/data_quality_controller.rb +10 -2
- data/app/helpers/llm_cost_tracker/application_helper.rb +11 -24
- data/app/helpers/llm_cost_tracker/dashboard_filter_helper.rb +3 -21
- data/app/helpers/llm_cost_tracker/dashboard_filter_options_helper.rb +4 -4
- data/app/helpers/llm_cost_tracker/dashboard_query_helper.rb +1 -1
- data/app/helpers/llm_cost_tracker/token_usage_helper.rb +20 -7
- data/app/models/llm_cost_tracker/call.rb +169 -0
- data/app/models/llm_cost_tracker/call_line_item.rb +22 -0
- data/app/models/llm_cost_tracker/call_rollup.rb +9 -0
- data/app/models/llm_cost_tracker/call_tag.rb +16 -0
- data/app/models/llm_cost_tracker/ingestion/inbox_entry.rb +13 -0
- data/app/models/llm_cost_tracker/ingestion/lease.rb +1 -1
- data/app/models/llm_cost_tracker/provider_invoice.rb +9 -0
- data/app/services/llm_cost_tracker/dashboard/data_quality.rb +121 -30
- data/app/services/llm_cost_tracker/dashboard/date_range.rb +1 -1
- data/app/services/llm_cost_tracker/dashboard/filter.rb +2 -2
- data/app/services/llm_cost_tracker/dashboard/overview_stats.rb +74 -21
- data/app/services/llm_cost_tracker/dashboard/pagination.rb +6 -4
- data/app/services/llm_cost_tracker/dashboard/params.rb +8 -2
- data/app/services/llm_cost_tracker/dashboard/provider_breakdown.rb +1 -1
- data/app/services/llm_cost_tracker/dashboard/spend_anomaly.rb +4 -3
- data/app/services/llm_cost_tracker/dashboard/tag_breakdown.rb +42 -9
- data/app/services/llm_cost_tracker/dashboard/tag_key_explorer.rb +14 -37
- data/app/services/llm_cost_tracker/dashboard/time_series.rb +1 -1
- data/app/services/llm_cost_tracker/dashboard/top_models.rb +1 -1
- data/app/views/llm_cost_tracker/calls/index.html.erb +33 -75
- data/app/views/llm_cost_tracker/calls/show.html.erb +62 -7
- data/app/views/llm_cost_tracker/dashboard/index.html.erb +9 -50
- data/app/views/llm_cost_tracker/data_quality/index.html.erb +103 -126
- data/app/views/llm_cost_tracker/errors/database.html.erb +1 -1
- data/app/views/llm_cost_tracker/models/index.html.erb +18 -50
- data/app/views/llm_cost_tracker/shared/_filters.html.erb +63 -0
- data/app/views/llm_cost_tracker/shared/_sort.html.erb +13 -0
- data/app/views/llm_cost_tracker/shared/setup_required.html.erb +1 -1
- data/app/views/llm_cost_tracker/tags/index.html.erb +3 -34
- data/app/views/llm_cost_tracker/tags/show.html.erb +5 -37
- data/lib/llm_cost_tracker/billing/components.rb +53 -0
- data/lib/llm_cost_tracker/billing/components.yml +117 -0
- data/lib/llm_cost_tracker/billing/cost_status.rb +45 -0
- data/lib/llm_cost_tracker/billing/line_item.rb +189 -0
- data/lib/llm_cost_tracker/budget.rb +23 -35
- data/lib/llm_cost_tracker/capture/stream_collector.rb +47 -33
- data/lib/llm_cost_tracker/configuration.rb +36 -19
- data/lib/llm_cost_tracker/doctor/cost_drift_check.rb +54 -0
- data/lib/llm_cost_tracker/doctor/ingestion_check.rb +24 -32
- data/lib/llm_cost_tracker/doctor/legacy_audit_check.rb +36 -0
- data/lib/llm_cost_tracker/doctor/legacy_billing_status_check.rb +22 -0
- data/lib/llm_cost_tracker/doctor/price_check.rb +2 -2
- data/lib/llm_cost_tracker/doctor/pricing_snapshot_drift_check.rb +85 -0
- data/lib/llm_cost_tracker/doctor/probe.rb +17 -0
- data/lib/llm_cost_tracker/doctor/schema_check.rb +31 -0
- data/lib/llm_cost_tracker/doctor.rb +43 -45
- data/lib/llm_cost_tracker/errors.rb +5 -19
- data/lib/llm_cost_tracker/event.rb +10 -2
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/install_generator.rb +4 -2
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/prices_generator.rb +2 -6
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/templates/create_llm_cost_tracker_calls.rb.erb +157 -0
- data/lib/llm_cost_tracker/ingestion/batch.rb +11 -12
- data/lib/llm_cost_tracker/ingestion/inbox.rb +39 -23
- data/lib/llm_cost_tracker/ingestion/worker.rb +14 -5
- data/lib/llm_cost_tracker/ingestion.rb +28 -22
- data/lib/llm_cost_tracker/integrations/anthropic.rb +45 -38
- data/lib/llm_cost_tracker/integrations/base.rb +36 -29
- data/lib/llm_cost_tracker/integrations/openai.rb +85 -40
- data/lib/llm_cost_tracker/integrations/ruby_llm.rb +5 -5
- data/lib/llm_cost_tracker/integrations.rb +2 -2
- data/lib/llm_cost_tracker/ledger/period/totals.rb +12 -9
- data/lib/llm_cost_tracker/ledger/period.rb +5 -5
- data/lib/llm_cost_tracker/ledger/rollups/upsert_sql.rb +2 -2
- data/lib/llm_cost_tracker/ledger/rollups.rb +76 -25
- data/lib/llm_cost_tracker/ledger/schema/adapter.rb +18 -0
- data/lib/llm_cost_tracker/ledger/schema/call_line_items.rb +50 -0
- data/lib/llm_cost_tracker/ledger/schema/call_rollups.rb +37 -0
- data/lib/llm_cost_tracker/ledger/schema/call_tags.rb +26 -0
- data/lib/llm_cost_tracker/ledger/schema/calls.rb +34 -23
- data/lib/llm_cost_tracker/ledger/schema/provider_invoices.rb +57 -0
- data/lib/llm_cost_tracker/ledger/store.rb +96 -13
- data/lib/llm_cost_tracker/ledger/tags/query.rb +4 -10
- data/lib/llm_cost_tracker/ledger/tags/sql.rb +27 -15
- data/lib/llm_cost_tracker/ledger.rb +4 -2
- data/lib/llm_cost_tracker/logging.rb +2 -5
- data/lib/llm_cost_tracker/middleware/faraday.rb +7 -6
- data/lib/llm_cost_tracker/parsers/anthropic.rb +52 -7
- data/lib/llm_cost_tracker/parsers/base.rb +8 -3
- data/lib/llm_cost_tracker/parsers/gemini.rb +101 -15
- data/lib/llm_cost_tracker/parsers/openai_compatible.rb +10 -2
- data/lib/llm_cost_tracker/parsers/openai_service_charges.rb +87 -0
- data/lib/llm_cost_tracker/parsers/openai_usage.rb +48 -21
- data/lib/llm_cost_tracker/parsers/sse.rb +1 -1
- data/lib/llm_cost_tracker/parsers.rb +1 -1
- data/lib/llm_cost_tracker/prices.json +105 -20
- data/lib/llm_cost_tracker/pricing/effective_prices.rb +57 -19
- data/lib/llm_cost_tracker/pricing/explainer.rb +4 -5
- data/lib/llm_cost_tracker/pricing/lookup.rb +38 -34
- data/lib/llm_cost_tracker/pricing/registry.rb +65 -45
- data/lib/llm_cost_tracker/pricing/service_charges.rb +204 -0
- data/lib/llm_cost_tracker/pricing/sync/fetcher.rb +26 -17
- data/lib/llm_cost_tracker/pricing/sync/registry_diff.rb +6 -15
- data/lib/llm_cost_tracker/pricing/sync.rb +57 -10
- data/lib/llm_cost_tracker/pricing/sync_change_printer.rb +32 -0
- data/lib/llm_cost_tracker/pricing.rb +190 -26
- data/lib/llm_cost_tracker/railtie.rb +0 -8
- data/lib/llm_cost_tracker/report/data.rb +16 -8
- data/lib/llm_cost_tracker/report.rb +0 -4
- data/lib/llm_cost_tracker/retention.rb +8 -8
- data/lib/llm_cost_tracker/tags/context.rb +2 -4
- data/lib/llm_cost_tracker/tags/key.rb +4 -0
- data/lib/llm_cost_tracker/tags/sanitizer.rb +12 -17
- data/lib/llm_cost_tracker/timing.rb +15 -0
- data/lib/llm_cost_tracker/token_usage.rb +56 -42
- data/lib/llm_cost_tracker/tracker.rb +67 -24
- data/lib/llm_cost_tracker/usage_capture.rb +29 -8
- data/lib/llm_cost_tracker/version.rb +1 -1
- data/lib/llm_cost_tracker.rb +36 -35
- data/lib/tasks/llm_cost_tracker.rake +22 -17
- metadata +36 -41
- data/app/models/llm_cost_tracker/ingestion/event.rb +0 -13
- data/app/models/llm_cost_tracker/ledger/call.rb +0 -45
- data/app/models/llm_cost_tracker/ledger/call_metrics.rb +0 -66
- data/app/models/llm_cost_tracker/ledger/period/grouping.rb +0 -71
- data/app/models/llm_cost_tracker/ledger/period/total.rb +0 -13
- data/app/models/llm_cost_tracker/ledger/tags/accessors.rb +0 -19
- data/lib/llm_cost_tracker/configuration/instrumentation.rb +0 -33
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/add_ingestion_generator.rb +0 -29
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/add_latency_ms_generator.rb +0 -29
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/add_period_totals_generator.rb +0 -29
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/add_provider_response_id_generator.rb +0 -29
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/add_streaming_generator.rb +0 -29
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/add_token_usage_generator.rb +0 -42
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/templates/add_ingestion_to_llm_cost_tracker.rb.erb +0 -33
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/templates/add_latency_ms_to_llm_api_calls.rb.erb +0 -9
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/templates/add_period_totals_to_llm_cost_tracker.rb.erb +0 -104
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/templates/add_provider_response_id_to_llm_api_calls.rb.erb +0 -15
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/templates/add_streaming_to_llm_api_calls.rb.erb +0 -21
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/templates/add_token_usage_to_llm_api_calls.rb.erb +0 -22
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/templates/create_llm_api_calls.rb.erb +0 -83
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/templates/upgrade_llm_api_call_cost_precision.rb.erb +0 -26
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/templates/upgrade_llm_api_call_tags_to_jsonb.rb.erb +0 -44
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/upgrade_cost_precision_generator.rb +0 -29
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/upgrade_tags_to_jsonb_generator.rb +0 -29
- data/lib/llm_cost_tracker/ledger/rollups/batch.rb +0 -43
- data/lib/llm_cost_tracker/ledger/schema/period_totals.rb +0 -32
- data/lib/llm_cost_tracker/pricing/components.rb +0 -37
- data/lib/llm_cost_tracker/pricing/sync/registry_loader.rb +0 -63
checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 357a557efba70db48baf47dde01b0aec7ee8f17787c09276a608572037e1b405
+  data.tar.gz: dd91244e99ecbc531d592572ea419a6567d51980f1f95f9eedb0a929f52ffb71
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: a9ef38be029273bb2df21e9837e92d166ccb3c931c859437a3e5ace57daebbfaf78db182b93a87aedd9be1df18ef26937cae47b3382a61489fe1e4b1cdd0b669
+  data.tar.gz: 852695ad7b07962ad540b16669e285324b1af5059e975ca645aa2670f8ed5069640e06207a06f170edcdfc19a22e9aacdab684caf831feaa7c334a3d52f41a67
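The recorded values above are plain hex digests over the archive bytes inside the gem. A minimal Ruby sketch of how such an entry is recomputed (the bytes below are illustrative, not the real archives):

```ruby
require "digest"

# Illustrative bytes -- in a real check you'd read data.tar.gz out of the
# unpacked .gem archive and compare against the value in checksums.yaml.
bytes = "placeholder archive bytes"

# checksums.yaml records hex SHA256 and SHA512 of the raw archive bytes.
sha256 = Digest::SHA256.hexdigest(bytes)
sha512 = Digest::SHA512.hexdigest(bytes)

puts sha256.length # => 64
puts sha512.length # => 128
```

A match against the recorded value means the archive is byte-identical to what was published.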
data/.ruby-version ADDED

@@ -0,0 +1 @@
+3.4.5
data/CHANGELOG.md CHANGED

@@ -2,7 +2,72 @@
 
 Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). Versioning: [SemVer](https://semver.org/spec/v2.0.0.html).
 
-## [
+## [0.8.0] - 2026-05-07
+
+0.8 is a storage rebuild. Tokens and tool/runtime charges share one shape
+(`Billing::LineItem`) and live in a dedicated line items table. Per-component
+cost columns and the standalone service charges table are gone. Several tables
+were also renamed during the cycle. See [Upgrading](docs/upgrading.md) for the
+migration path — there is no rolling-deploy upgrade.
+
+### Added
+
+- `llm_cost_tracker_call_line_items` — one row per priced component (text/audio/cached tokens, web search, code execution, grounding, container sessions, file search). Tokens and tool charges share one shape and one `cost_status` semantics.
+- `llm_cost_tracker_call_tags` — normalized attribution. Tag filters and aggregations now JOIN through this table on PostgreSQL and MySQL alike.
+- `llm_cost_tracker_provider_invoices` — placeholder table reserved for v0.9 invoice reconciliation.
+- `Billing::LineItem` value object covering both token and service charges. `LineItem.from_token_usage` and explicit `component_key:` builders price token and tool/runtime quantities through the same path.
+- `Pricing.price_line_items` — single pricing pass for token + tool/runtime line items, used by `Tracker.build_event`.
+- Doctor schema checks for `llm_cost_tracker_call_line_items`, `llm_cost_tracker_call_tags`, and `llm_cost_tracker_provider_invoices`.
+- Doctor sample-based drift checks: header `total_cost` vs `SUM(line_items.cost)` and stored line item cost vs `pricing_snapshot.rates` (RFC §Doctor).
+- `currency` column on `llm_cost_tracker_call_rollups` (default `USD`) with a `(period, period_start, currency)` unique index. v0.8 stays single-currency; the schema is in place so v0.9 multi-currency rollups don't need another migration.
+- `Billing::Components::REGISTRY` now loads from `lib/llm_cost_tracker/billing/components.yml`. Adding a billable component is one YAML row plus a price entry — no more 11-line `Component.new(...)` literals.
+- Anthropic web search and code execution usage emitted as line items with `component_key: :web_search_request` / `:code_execution_request`. SDK integration emits the same line items from native SDK responses, not just Faraday-wrapped ones.
+- OpenAI hosted web search, file search, and Code Interpreter container sessions emitted as line items via both Faraday and SDK integration paths.
+- Gemini grounding queries emitted as line items.
+- `provider_project_id`, `provider_api_key_id`, `provider_workspace_id`, `batch` capture dimensions on `LlmCostTracker.track` and the `Event` payload, persisted as columns on `llm_cost_tracker_calls`.
+- `Pricing::EffectivePrices` permutes compound pricing modes (e.g. `priority_batch_data_residency`) when matching rates, so combined modes resolve correctly.
+- `Pricing::Sync` registry-diff compares `service_charges` rates in addition to model rates.
+- Dashboard polish pass: shared `_filters.html.erb` and `_sort.html.erb` partials, sticky table headers, button hover/active states, spacing/shadow scales, and a full `prefers-color-scheme` dark palette.
+- Bundled audio and tool rates refreshed from current provider pricing.
+
+### Changed
+
+- BREAKING: Renamed `llm_api_calls` → `llm_cost_tracker_calls`, `llm_cost_tracker_period_totals` → `llm_cost_tracker_call_rollups`, `llm_cost_tracker_inbox_events` → `llm_cost_tracker_ingestion_inbox_entries`, `llm_cost_tracker_ingestor_leases` → `llm_cost_tracker_ingestion_leases`. Corresponding model `LlmCostTracker::PeriodTotal` renamed to `LlmCostTracker::CallRollup`; ingestion models live under `LlmCostTracker::Ingestion::InboxEntry` and `LlmCostTracker::Ingestion::Lease`.
+- BREAKING: Per-component cost columns removed from `llm_cost_tracker_calls` (`input_cost`, `output_cost`, `cache_read_input_cost`, `cache_write_input_cost`, `cache_write_extended_input_cost`, `cache_write_1h_input_cost`, `audio_input_cost`, `audio_output_cost`). The header keeps `total_cost` only; per-component costs live in line items.
+- BREAKING: `llm_cost_tracker_calls.tags` JSONB column removed in favor of `llm_cost_tracker_call_tags`. `Call#parsed_tags`, `Call.by_tags`, `Call.cost_by_tag`, `Call.group_by_tag`, and the dashboard tag explorer now read the normalized table.
+- BREAKING: `llm_cost_tracker_service_charges` table removed. Tool/runtime rows are stored in `llm_cost_tracker_call_line_items` with `unit != 'token'`.
+- BREAKING: `Billing::ServiceCharge` value object and `LlmCostTracker::ServiceCharge` AR model removed. Use `Billing::LineItem` and `LlmCostTracker::CallLineItem`.
+- BREAKING: `Event#service_charges` removed. Filter `event.line_items` by `unit != :token` instead.
+- BREAKING: `Call#service_charges` association removed. Use `call.line_items.where.not(unit: "token")`.
+- BREAKING: `LlmCostTracker.track(service_charges:)` keyword renamed to `service_line_items:`. Hash keys: `component:` → `component_key:`, `source_key:` → `provider_field:`, `pricing_basis: PROVIDER_USAGE_BASIS` → `pricing_basis: :provider_usage`.
+- BREAKING: `Billing::CostStatus.call(service_charges:)` keyword renamed to `service_line_items:`.
+- BREAKING: `Pricing.cost_with_service_charges` public API removed; replaced internally by `Pricing.price_line_items`.
+- BREAKING: Top-level delegators `LlmCostTracker.flush!`, `LlmCostTracker.shutdown!`, `LlmCostTracker.enforce_budget!` removed. Use `LlmCostTracker::Ingestion::Worker.flush!` / `.shutdown!` directly; budget enforcement is internal.
+- BREAKING: `LlmCostTracker.track` requires explicit `tokens:` and accepts `tags:` as a hash; the previous keyword shape is no longer supported.
+- BREAKING: Notification payload (`llm_request.llm_cost_tracker`) no longer carries `service_charges`. Subscribers read `line_items`.
+- BREAKING: Inbox payload v0/v1 compatibility dropped; only v2 is accepted. Drain any pre-v2 entries on the prior gem version before bumping.
+- BREAKING: Ruby 3.4+ required.
+- BREAKING: Legacy upgrade generators removed (`add_billing`, `add_ingestion`, `add_call_rollups`, `add_capture_dimensions`, `add_latency_ms`, `add_provider_response_id`, `add_streaming`, `add_token_usage`, `upgrade_cost_precision`, `upgrade_schema_foundation`, `upgrade_tags_to_jsonb`). Doctor no longer suggests them.
+- `llm_cost_tracker_call_tags.value` widened to TEXT (was VARCHAR), and the `[:key, :value]` composite index dropped in favor of `:key` only — value-equality filters scan the per-key bucket.
+- `Configuration#pricing_overrides` validates shape at assignment time rather than at first read.
+- Pricing computes a partial `total_cost` (with `cost_status: :partial`) when only some token components have rates; previously `total_cost` was nil whenever any component lacked a rate.
+- `TokenUsage.build` clamps negative token counts to zero so anomalous provider payloads don't poison rollups.
+- Stream collector buffer overflow keeps already-accumulated events instead of dropping them.
+- Budget guardrail preflight time is excluded from SDK call latency measurements.
+- Dashboard data-quality breakdown computes per-component cost from line items via JOIN; usage_rows accepts `component_costs:` hash.
+- CSV export pulls tag JSON from `tag_records` instead of the dropped JSONB column.
+- The fingerprinted dashboard stylesheet is served with `Cache-Control: no-store` in development so edits show up without a hard reload; production keeps the immutable cache.
+
+### Fixed
+
+- Railtie no longer requires removed legacy upgrade generators at boot, so installs on a clean app don't crash during eager-load.
+- `Tracker` only flags unknown pricing when token quantities are positive — service-only events with zero tokens no longer raise `Pricing::Unknown`.
+- `Billing::CostStatus.cost_status_for` coerces symbol/string status values consistently when building line items.
+- Gemini `thoughtsTokenCount` is billed at the output token rate (already present in 0.7.3, kept for clarity given the rebuild).
+
+### Removed
+
+- Dead `Billing::LineItem.from_service_charge` constructor and the unused `Call.with_json_tags` scope.
 
 ## [0.7.3] - 2026-05-01
 
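The `Event#service_charges` removal noted in the changelog reduces to a filter over line items. A plain-Ruby sketch using a hypothetical `LineItem` stand-in for `Billing::LineItem` (field names assumed from the changelog, not the gem's actual API):

```ruby
# Hypothetical stand-in for Billing::LineItem; the real object carries more
# fields, but only `unit` and `cost` matter for this sketch.
LineItem = Struct.new(:component_key, :unit, :cost, keyword_init: true)

line_items = [
  LineItem.new(component_key: :input_tokens,       unit: :token,   cost: 0.0015),
  LineItem.new(component_key: :output_tokens,      unit: :token,   cost: 0.0060),
  LineItem.new(component_key: :web_search_request, unit: :request, cost: 0.01)
]

# Post-0.8 replacement for the removed event.service_charges:
# service charges are simply the line items whose unit is not :token.
service_items = line_items.reject { |li| li.unit == :token }

# The call header keeps only total_cost, the sum over all line items.
total_cost = line_items.sum(&:cost)

puts service_items.map(&:component_key).inspect # => [:web_search_request]
puts format("%.4f", total_cost)                 # => 0.0175
```

The same shape covers the ActiveRecord side, where the changelog's suggested replacement is `call.line_items.where.not(unit: "token")`.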
data/README.md
CHANGED
|
@@ -1,50 +1,46 @@
|
|
|
1
1
|
# LLM Cost Tracker
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
Self-hosted LLM cost tracking for Rails.
|
|
4
4
|
|
|
5
5
|
[](https://rubygems.org/gems/llm_cost_tracker)
|
|
6
6
|
[](https://github.com/sergey-homenko/llm_cost_tracker/actions)
|
|
7
7
|
[](https://codecov.io/gh/sergey-homenko/llm_cost_tracker)
|
|
8
8
|
|
|
9
|
-
|
|
9
|
+
Every call your app makes to OpenAI, Anthropic, Gemini, RubyLLM, or any
|
|
10
|
+
OpenAI-compatible API gets logged: tokens, cost, latency, tags. Calls go
|
|
11
|
+
app → provider direct. No proxy.
|
|
10
12
|
|
|
11
|
-
|
|
13
|
+
Not Langfuse, Helicone, or LiteLLM. No prompts, no traces, no replay. Spend
|
|
14
|
+
attribution only.
|
|
12
15
|
|
|
13
|
-
Requires Ruby 3.
|
|
16
|
+
Requires Ruby 3.4+, Rails 7.1+, PostgreSQL or MySQL.
|
|
14
17
|
|
|
15
18
|

|
|
16
19
|
|
|
17
|
-
## Accuracy model
|
|
18
|
-
|
|
19
|
-
LLM Cost Tracker estimates spend from provider-reported usage and configured prices. It is useful for explaining spend by provider, model, and tags, but it is not invoice-grade billing. For reconciliation, each call keeps `provider_response_id`, `usage_source`, token breakdowns, and `pricing_mode`.
|
|
20
|
-
|
|
21
20
|
## Quickstart
|
|
22
21
|
|
|
23
|
-
Add to your Gemfile alongside whatever LLM client you already use:
|
|
24
|
-
|
|
25
22
|
```ruby
|
|
23
|
+
# Gemfile
|
|
26
24
|
gem "llm_cost_tracker"
|
|
27
|
-
gem "openai"
|
|
25
|
+
gem "openai"
|
|
28
26
|
```
|
|
29
27
|
|
|
30
|
-
Install, migrate, verify:
|
|
31
|
-
|
|
32
28
|
```bash
|
|
33
|
-
bin/rails
|
|
34
|
-
bin/rails db:migrate
|
|
35
|
-
bin/rails llm_cost_tracker:doctor
|
|
29
|
+
bin/rails llm_cost_tracker:setup
|
|
36
30
|
```
|
|
37
31
|
|
|
38
|
-
|
|
32
|
+
That runs the install generator with the dashboard and pricing snapshot,
|
|
33
|
+
migrates the database, then verifies via `llm_cost_tracker:doctor`.
|
|
39
34
|
|
|
40
35
|
```ruby
|
|
36
|
+
# config/initializers/llm_cost_tracker.rb
|
|
41
37
|
LlmCostTracker.configure do |config|
|
|
42
38
|
config.default_tags = -> { { environment: Rails.env } }
|
|
43
39
|
config.instrument :openai
|
|
44
40
|
end
|
|
45
41
|
```
|
|
46
42
|
|
|
47
|
-
|
|
43
|
+
Tag your calls — that's how you find out who burned the money:
|
|
48
44
|
|
|
49
45
|
```ruby
|
|
50
46
|
LlmCostTracker.with_tags(user_id: Current.user&.id, feature: "chat") do
|
|
@@ -53,240 +49,77 @@ LlmCostTracker.with_tags(user_id: Current.user&.id, feature: "chat") do
|
|
|
53
49
|
end
|
|
54
50
|
```
|
|
55
51
|
|
|
56
|
-
|
|
57
|
-
|
|
58
|
-
## What you get
|
|
59
|
-
|
|
60
|
-
- Local ActiveRecord ledger of every call: provider, model, token breakdown, cost, latency, tags, response IDs
|
|
61
|
-
- Auto-capture for RubyLLM and the official `openai` and `anthropic` Ruby SDKs, plus Faraday middleware for `ruby-openai`, the Gemini REST API, and any client you can inject middleware into
|
|
62
|
-
- Server-rendered dashboard (plain ERB, zero JavaScript) with overview, models, calls, tags, CSV export, and a data-quality page
|
|
63
|
-
- Local pricing snapshots refreshed daily from the official provider pricing pages, applied with `bin/rails llm_cost_tracker:prices:refresh`
|
|
64
|
-
- Monthly / daily / per-call budget guardrails with notify, raise, or block-requests behaviour
|
|
65
|
-
- Tag-based attribution that survives concurrency — Puma threads and Sidekiq fibers don't bleed into each other
|
|
66
|
-
|
|
67
|
-
## What it deliberately doesn't do
|
|
68
|
-
|
|
69
|
-
- **Doesn't run as a proxy.** Calls go directly from your app to the provider.
|
|
70
|
-
- **Doesn't store prompts or completions.** Token counts, model, cost, tags, response IDs only. Nothing else.
|
|
71
|
-
- **Doesn't promise invoice-grade accuracy.** It uses official provider pricing pages, but enterprise rates, batch discounts on unsupported endpoints, and modality tiers are not always modeled. `provider_response_id` is stored as a join key for whoever does that reconciliation.
|
|
72
|
-
- **Doesn't ship with auth on the dashboard.** It's a Rails Engine; mount it behind whatever your app already uses (Devise, basic auth, Cloudflare Access, your own session middleware).
|
|
73
|
-
- **Doesn't centralize multi-service visibility.** One Rails monolith — perfect fit. Six services in four languages — wrong tool, look at a proxy or API-layer gateway.
|
|
74
|
-
|
|
75
|
-
## Capturing calls
|
|
76
|
-
|
|
77
|
-
Three paths, in order of preference. Use the first one that fits your stack.
|
|
78
|
-
|
|
79
|
-
### 1. SDK integrations
|
|
80
|
-
|
|
81
|
-
Drop-in for RubyLLM and the official `openai` and `anthropic` gems. `config.instrument` patches tested SDK methods so you don't change a single call site:
|
|
82
|
-
|
|
83
|
-
```ruby
|
|
84
|
-
LlmCostTracker.configure do |config|
|
|
85
|
-
config.instrument :openai # or :anthropic / :ruby_llm
|
|
86
|
-
end
|
|
87
|
-
|
|
88
|
-
LlmCostTracker.with_tags(feature: "support_chat") do
|
|
89
|
-
Anthropic::Client.new.messages.create(
|
|
90
|
-
model: "claude-sonnet-4-6",
|
|
91
|
-
max_tokens: 1024,
|
|
92
|
-
messages: [{ role: "user", content: "Hello" }]
|
|
93
|
-
)
|
|
94
|
-
end
|
|
95
|
-
```
|
|
96
|
-
|
|
97
|
-
Captures usage, model, latency, response ID, pricing mode, cache tokens, Anthropic cache-write TTLs, and reasoning tokens whenever the SDK exposes them. Provider SDKs are not added as gem dependencies — you install whichever you actually use.
|
|
52
|
+
Mount the dashboard at `/llm-costs` and put it behind your app's auth — it
|
|
53
|
+
ships without one.
|
|
98
54
|
|
|
99
|
-
|
|
55
|
+
## What lands in the ledger
|
|
100
56
|
|
|
101
|
-
|
|
57
|
+
- **Calls.** Provider, model, total tokens, total cost, latency, status.
|
|
58
|
+
- **Line items.** Per-component breakdown — text/audio/cached tokens, tool
|
|
59
|
+
charges (web search, code execution, grounding, container sessions).
|
|
60
|
+
- **Tags.** Whatever attribution you pass — user, feature, tenant, env.
|
|
61
|
+
- **Provider IDs.** Response, project, API key, workspace — for downstream
|
|
62
|
+
audits.
|
|
63
|
+
- **Pricing snapshot.** So historical numbers don't drift when prices change.
|
|
102
64
|
|
|
103
|
-
|
|
65
|
+
## Capture surfaces
|
|
104
66
|
|
|
105
|
-
|
|
106
|
-
|
|
107
|
-
|
|
108
|
-
|
|
109
|
-
|
|
110
|
-
|
|
111
|
-
|
|
112
|
-
|
|
113
|
-
|
|
114
|
-
```
|
|
67
|
+
| Surface | Path |
|
|
68
|
+
| --- | --- |
|
|
69
|
+
| OpenAI | Official SDK or Faraday |
|
|
70
|
+
| Anthropic | Official SDK or Faraday |
|
|
71
|
+
| Google Gemini | Faraday |
|
|
72
|
+
| RubyLLM | Provider layer |
|
|
73
|
+
| `ruby-openai` | Faraday |
|
|
74
|
+
| OpenRouter, DeepSeek, Groq, LiteLLM-style gateways | OpenAI-compatible Faraday |
|
|
75
|
+
| Anything else | `LlmCostTracker.track` |
|
|
115
76
|
|
|
116
|
-
|
|
77
|
+
Streams capture when the provider emits final usage. OpenAI Faraday streams
|
|
78
|
+
need `stream_options: { include_usage: true }`.
|
|
117
79
|
|
|
118
|
-
|
|
80
|
+
## What it isn't
|
|
119
81
|
|
|
120
|
-
|
|
82
|
+
- No proxy. Direct calls only.
|
|
83
|
+
- No prompts. Token counts and metadata only.
|
|
84
|
+
- Not invoice-grade. Provider response IDs are stored for reconciliation.
|
|
85
|
+
- Not multi-service. Built for a Rails monolith.
|
|
121
86
|
|
|
122
|
-
|
|
87
|
+
## Manual tracking
|
|
123
88
|
|
|
124
|
-
|
|
89
|
+
For batch jobs, internal gateways, or anything without an SDK/Faraday hook:
|
|
125
90
|
|
|
126
91
|
```ruby
|
|
127
92
|
LlmCostTracker.track(
|
|
128
93
|
provider: :anthropic,
|
|
129
94
|
model: "claude-sonnet-4-6",
|
|
130
|
-
|
|
131
|
-
|
|
132
|
-
feature: "summarizer",
|
|
133
|
-
user_id: current_user.id
|
|
95
|
+
tokens: { input: 1500, output: 320 },
|
|
96
|
+
tags: { feature: "summarizer", user_id: current_user.id }
|
|
134
97
|
)
|
|
135
98
|
```
|
|
136
99
|
|
|
137
|
-
|
|
138
|
-
|
|
139
|
-
## Tags: who burned this money
|
|
140
|
-
|
|
141
|
-
Tags answer the only question that matters in attribution: which feature, which user, which job, which tenant. They're free-form strings, stored as JSONB on PostgreSQL or JSON on MySQL, and queryable from both Ruby and the dashboard.
|
|
142
|
-
|
|
143
|
-
```ruby
|
|
144
|
-
LlmCostTracker.with_tags(user_id: current_user.id, feature: "support_chat") do
|
|
145
|
-
client.chat(parameters: { model: "gpt-4o", messages: [...] })
|
|
146
|
-
end
|
|
147
|
-
```
|
|
148
|
-
|
|
149
|
-
`with_tags` is thread- and fiber-isolated, so concurrent requests in Puma or jobs in Sidekiq don't bleed into each other. A `default_tags` callable on configuration runs on every event for things you always want — `environment`, `region`, deployment SHA. Explicit tags passed to `track` win over scoped tags, scoped tags win over defaults.
|
|
150
|
-
|
|
151
|
-
Streaming capture snapshots tags when the stream starts, so attribution survives delayed or cross-thread stream consumption.
|
|
152
|
-
|
|
153
|
-
What you put in tags is **your** input — they're queryable strings. Don't put prompts, completions, emails, or secrets there. Use IDs.
|
|
154
|
-
|
|
155
|
-
## Pricing
|
|
156
|
-
|
|
157
|
-
Built-in prices live in `lib/llm_cost_tracker/prices.json` and are refreshed daily from official provider pricing pages by an automated CI workflow that opens a PR on every change. Most apps run on bundled prices and never think about this.
|
|
158
|
-
|
|
159
|
-
When you want to control updates yourself — for negotiated rates, gateway-specific model IDs, or pinned reviews — generate a local snapshot:
|
|
160
|
-
|
|
161
|
-
```bash
|
|
162
|
-
bin/rails generate llm_cost_tracker:prices
|
|
163
|
-
```
|
|
164
|
-
|
|
165
|
-
```ruby
|
|
166
|
-
config.prices_file = Rails.root.join("config/llm_cost_tracker_prices.yml")
|
|
167
|
-
```
|
|
168
|
-
|
|
169
|
-
Refresh on demand from the maintained snapshot:
|
|
170
|
-
|
|
171
|
-
```bash
|
|
172
|
-
bin/rails llm_cost_tracker:prices:refresh
|
|
173
|
-
```
|
|
174
|
-
|
|
175
|
-
Explain why a model is priced or unknown:
|
|
176
|
-
|
|
177
|
-
```bash
|
|
178
|
-
PROVIDER=openai MODEL=gpt-4o bin/rails llm_cost_tracker:prices:explain
|
|
179
|
-
```
|
|
180
|
-
|
|
181
|
-
Precedence is `pricing_overrides` → `prices_file` → bundled. Provider-qualified keys like `openai/gpt-4o-mini` win over model-only keys.
|
|
182
|
-
|
|
183
|
-
`pricing_mode` selects mode-prefixed rates such as `batch_input` or `priority_output`. Built-in capture fills it from provider tier fields when available; explicit `track` calls can pass it directly for batch jobs or gateway-specific modes. Full pricing reference: [`docs/pricing.md`](docs/pricing.md).
|
|
184
|
-
|
|
185
|
-
## Budgets
|
|
186
|
-
|
|
187
|
-
Budgets are guardrails, not transactional caps:
|
|
188
|
-
|
|
189
|
-
```ruby
|
|
190
|
-
config.monthly_budget = 500.00
|
|
191
|
-
config.daily_budget = 50.00
|
|
192
|
-
config.per_call_budget = 2.00
|
|
193
|
-
config.budget_exceeded_behavior = :block_requests # or :notify, :raise
|
|
194
|
-
config.on_budget_exceeded = ->(data) { SlackNotifier.notify("#alerts", "...") }
|
|
195
|
-
```
|
|
196
|
-
|
|
197
|
-
`:block_requests` reads ledger totals before a call goes out and stops it if you're already over. Under concurrency multiple workers can pass preflight at the same time and collectively overshoot — this catches the next call after the overshoot becomes visible, not the overshoot itself. For a strict cap, use a provider-side limit or a transactional counter outside the gem.
|
|
198
|
-
|
|
199
|
-
Full behavior, error class, and preflight details: [`docs/budgets.md`](docs/budgets.md).
|
|
200
|
-
|
|
201
|
-
## Querying
|
|
202
|
-
|
|
203
|
-
When you want to slice spend from a console, scheduled job, or your own admin page:
|
|
204
|
-
|
|
205
|
-
```ruby
|
|
206
|
-
LlmCostTracker::Ledger::Call.this_month.cost_by_model
|
|
207
|
-
LlmCostTracker::Ledger::Call.this_month.cost_by_tag("feature")
|
|
208
|
-
LlmCostTracker::Ledger::Call.daily_costs(days: 7)
|
|
209
|
-
LlmCostTracker::Ledger::Call.by_tags(user_id: 42, feature: "chat").this_month.total_cost
|
|
210
|
-
```

A text report is also one rake task away:

```bash
DAYS=7 bin/rails llm_cost_tracker:report
```

Full scope and helper reference: [`docs/querying.md`](docs/querying.md).

## Dashboard

Mount the engine wherever you want — it's plain ERB, no JavaScript bundle, no asset pipeline gymnastics:

```ruby
# config/routes.rb
mount LlmCostTracker::Engine => "/llm-costs"
```

Pages: overview (spend trend, budget status, anomaly banner), models, calls (filterable, paginated, CSV export), tags, data quality. Reads the ActiveRecord ledger in `llm_api_calls`.

Auth is your job. Examples for basic auth and Devise: [`docs/dashboard.md`](docs/dashboard.md).
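
With Devise, for instance, you could gate the mount behind an authenticated scope using Devise's `authenticate` routes helper; the `:user` scope and `admin?` predicate below are stand-ins for your app's own setup:

```ruby
# config/routes.rb: the engine is only reachable by signed-in users
# passing the constraint lambda.
authenticate :user, ->(user) { user.admin? } do
  mount LlmCostTracker::Engine => "/llm-costs"
end
```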

## Supported providers

| Provider | Capture | Notes |
| --- | --- | --- |
| Other OpenAI-compatible hosts | Configurable | Register the host via `config.openai_compatible_providers` |
| Anything else | Manual | Use `LlmCostTracker.track` / `track_stream` |

RubyLLM chat, embedding, and transcription calls are captured through RubyLLM's provider layer when `config.instrument :ruby_llm` is enabled.

Endpoints covered end-to-end: OpenAI Chat Completions / Responses / Completions / Embeddings, Anthropic Messages, Gemini `generateContent` and `streamGenerateContent`, plus their OpenAI-compatible equivalents. Streaming is captured for Faraday paths and official OpenAI / Anthropic SDK stream helpers whenever the provider emits final-usage events.

## Privacy

By design, **no prompt or response content is ever stored.** Per call, the ledger holds: provider, model, token counts, cost, latency, tags, response ID, timestamp. That's it. No request bodies, no headers, no completions. Warning logs strip query strings before logging URLs.

Tags carry whatever your app passes; they are application-controlled input, so treat them accordingly. Use `user_id`, not the user's email; use a feature key, not the input prompt.
## Documentation

Deeper guides live in `docs/`. Reference pages are being filled out as content moves out of this README; the inline sections above remain canonical where a page is still brief.

- [Configuration reference](docs/configuration.md)
- [Pricing & price refresh](docs/pricing.md)
- [Budgets & guardrails](docs/budgets.md)
- [Data model](docs/data-model.md)
- [Querying & reports](docs/querying.md)
- [Dashboard mounting](docs/dashboard.md)
- [Streaming capture](docs/streaming.md)
- [Cookbook](docs/cookbook.md)
- [Extending](docs/extending.md)
- [Operations](docs/operations.md)
- [Architecture & design rules](docs/architecture.md)
- [Upgrading](docs/upgrading.md)
- [Changelog](CHANGELOG.md)

## Known limitations

- `:block_requests` is best-effort under concurrency, not a transactional cap.
- Streaming usage capture relies on the provider emitting a final-usage event. Missing events are stored with `usage_source: "unknown"` so they appear on the data-quality page rather than vanishing.
- Non-token line items such as Gemini explicit-cache storage duration, provider tool calls, and modality-specific surcharges are not folded into token cost.
- `provider_response_id` is stored only when the provider exposes a stable ID. Gemini is best-effort and varies by endpoint.

## Development

```bash
bundle install
bin/check
```

Architecture rules and conventions for contributions live in [`docs/architecture.md`](docs/architecture.md).

## License

MIT — see [LICENSE.txt](LICENSE.txt).