llm_cost_tracker 0.1.3 → 0.2.0.alpha1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (71)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +64 -81
  3. data/PLAN_0.2.md +488 -0
  4. data/README.md +141 -316
  5. data/app/controllers/llm_cost_tracker/application_controller.rb +42 -0
  6. data/app/controllers/llm_cost_tracker/calls_controller.rb +77 -0
  7. data/app/controllers/llm_cost_tracker/dashboard_controller.rb +54 -0
  8. data/app/controllers/llm_cost_tracker/data_quality_controller.rb +10 -0
  9. data/app/controllers/llm_cost_tracker/models_controller.rb +12 -0
  10. data/app/controllers/llm_cost_tracker/tags_controller.rb +21 -0
  11. data/app/helpers/llm_cost_tracker/application_helper.rb +113 -0
  12. data/app/services/llm_cost_tracker/dashboard/data_quality.rb +38 -0
  13. data/app/services/llm_cost_tracker/dashboard/filter.rb +109 -0
  14. data/app/services/llm_cost_tracker/dashboard/overview_stats.rb +87 -0
  15. data/app/services/llm_cost_tracker/dashboard/provider_breakdown.rb +44 -0
  16. data/app/services/llm_cost_tracker/dashboard/tag_breakdown.rb +58 -0
  17. data/app/services/llm_cost_tracker/dashboard/tag_key_explorer.rb +125 -0
  18. data/app/services/llm_cost_tracker/dashboard/time_series.rb +44 -0
  19. data/app/services/llm_cost_tracker/dashboard/top_models.rb +89 -0
  20. data/app/services/llm_cost_tracker/pagination.rb +59 -0
  21. data/app/views/layouts/llm_cost_tracker/application.html.erb +342 -0
  22. data/app/views/llm_cost_tracker/calls/index.html.erb +127 -0
  23. data/app/views/llm_cost_tracker/calls/show.html.erb +67 -0
  24. data/app/views/llm_cost_tracker/dashboard/index.html.erb +145 -0
  25. data/app/views/llm_cost_tracker/data_quality/index.html.erb +110 -0
  26. data/app/views/llm_cost_tracker/errors/database.html.erb +8 -0
  27. data/app/views/llm_cost_tracker/errors/invalid_filter.html.erb +4 -0
  28. data/app/views/llm_cost_tracker/errors/not_found.html.erb +5 -0
  29. data/app/views/llm_cost_tracker/models/index.html.erb +95 -0
  30. data/app/views/llm_cost_tracker/shared/_bar.html.erb +5 -0
  31. data/app/views/llm_cost_tracker/shared/setup_required.html.erb +6 -0
  32. data/app/views/llm_cost_tracker/tags/index.html.erb +34 -0
  33. data/app/views/llm_cost_tracker/tags/show.html.erb +69 -0
  34. data/config/routes.rb +10 -0
  35. data/lib/llm_cost_tracker/budget.rb +16 -38
  36. data/lib/llm_cost_tracker/configuration.rb +3 -1
  37. data/lib/llm_cost_tracker/cost.rb +1 -3
  38. data/lib/llm_cost_tracker/engine.rb +13 -0
  39. data/lib/llm_cost_tracker/engine_compatibility.rb +15 -0
  40. data/lib/llm_cost_tracker/errors.rb +2 -0
  41. data/lib/llm_cost_tracker/event.rb +1 -3
  42. data/lib/llm_cost_tracker/event_metadata.rb +9 -18
  43. data/lib/llm_cost_tracker/llm_api_call.rb +43 -9
  44. data/lib/llm_cost_tracker/middleware/faraday.rb +4 -4
  45. data/lib/llm_cost_tracker/parsed_usage.rb +5 -9
  46. data/lib/llm_cost_tracker/parsers/anthropic.rb +4 -5
  47. data/lib/llm_cost_tracker/parsers/base.rb +3 -8
  48. data/lib/llm_cost_tracker/parsers/gemini.rb +3 -3
  49. data/lib/llm_cost_tracker/parsers/openai_usage.rb +3 -3
  50. data/lib/llm_cost_tracker/parsers/registry.rb +5 -12
  51. data/lib/llm_cost_tracker/period_grouping.rb +68 -0
  52. data/lib/llm_cost_tracker/price_registry.rb +22 -30
  53. data/lib/llm_cost_tracker/pricing.rb +10 -19
  54. data/lib/llm_cost_tracker/report.rb +4 -4
  55. data/lib/llm_cost_tracker/report_data.rb +23 -29
  56. data/lib/llm_cost_tracker/report_formatter.rb +11 -3
  57. data/lib/llm_cost_tracker/storage/active_record_store.rb +1 -3
  58. data/lib/llm_cost_tracker/tag_accessors.rb +0 -8
  59. data/lib/llm_cost_tracker/tag_key.rb +16 -0
  60. data/lib/llm_cost_tracker/tracker.rb +35 -1
  61. data/lib/llm_cost_tracker/unknown_pricing.rb +1 -1
  62. data/lib/llm_cost_tracker/version.rb +1 -1
  63. data/lib/llm_cost_tracker.rb +3 -6
  64. data/llm_cost_tracker.gemspec +13 -9
  65. metadata +92 -21
  66. data/.rubocop.yml +0 -44
  67. data/lib/llm_cost_tracker/storage/active_record_backend.rb +0 -19
  68. data/lib/llm_cost_tracker/storage/backends.rb +0 -26
  69. data/lib/llm_cost_tracker/storage/custom_backend.rb +0 -16
  70. data/lib/llm_cost_tracker/storage/log_backend.rb +0 -28
  71. data/lib/llm_cost_tracker/value_object.rb +0 -45
data/README.md CHANGED
@@ -1,8 +1,6 @@
  # LlmCostTracker
 
- **See where your Rails app spends money on LLM APIs.**
-
- Track cost by user, tenant, feature, provider, and model, all in your own database. No proxy. No SaaS required.
+ **Self-hosted LLM cost tracking for Ruby and Rails.** Intercepts Faraday LLM responses, prices them locally, stores events in your database. No proxy, no SaaS.
 
  [![Gem Version](https://img.shields.io/gem/v/llm_cost_tracker.svg)](https://rubygems.org/gems/llm_cost_tracker)
  [![CI](https://github.com/sergey-homenko/llm_cost_tracker/actions/workflows/ruby.yml/badge.svg)](https://github.com/sergey-homenko/llm_cost_tracker/actions)
@@ -20,53 +18,38 @@ By model:
  claude-sonnet-4-6 $31.200000
  gemini-2.5-flash $14.120000
 
- By feature:
- chat $73.500000
- summarizer $29.220000
- translate $24.700000
+ By tag key "env":
+ production $119.300000
+ staging $8.120000
  ```
 
- ## Why?
-
- Every Rails app integrating LLMs faces the same problem: **you don't know how much AI is costing you** until the invoice arrives. Full observability platforms like Langfuse and Helicone are powerful, but sometimes you just need a small Rails-native cost ledger that lives in your app database.
+ ## Why
 
- `llm_cost_tracker` takes a different approach:
+ Every Rails app with LLM integrations eventually runs into the same question: where did that invoice come from? Full observability platforms like Langfuse and Helicone cover a lot more than cost, and sometimes you just want a small Rails-native ledger that lives in your own database.
 
- - 🔌 **Faraday-native** intercepts LLM HTTP responses without changing the response
- - 🏠 **Self-hosted** — your data stays in your database
- - 🧩 **Client-light** — works with raw Faraday and LLM gems that expose their Faraday connection
- - 🏷️ **Attribution-first** — tag spend by feature, tenant, user, job, or environment
- - 🌐 **OpenAI-compatible** — auto-detect OpenRouter and DeepSeek, with custom compatible hosts configurable
- - 🛑 **Budget guardrails** — notify, raise, or block requests when monthly spend is exhausted
- - 📊 **Quick reports** — print a terminal cost report with one rake task
+ `llm_cost_tracker` is scoped to that. It plugs into Faraday, parses provider usage out of the response, looks up pricing locally, and writes an event. You end up with a ledger you can query with plain ActiveRecord, slice by any tag dimension, and optionally surface on a built-in dashboard. No proxy, no SaaS, no separate service to run.
 
- This gem is intentionally not a tracing platform, prompt CMS, eval system, or gateway. It focuses on the boring but valuable question: "What did this app spend on LLM APIs, and where did that spend come from?"
+ It's not a tracing platform, prompt CMS, eval system, or gateway and doesn't want to be. The goal is answering _"what did this app spend on LLM APIs, and where did that spend come from?"_ well enough that you stop worrying about it.
 
  ## Installation
 
- Add to your Gemfile:
-
  ```ruby
  gem "llm_cost_tracker"
  ```
 
- For ActiveRecord storage (recommended for production):
+ For ActiveRecord storage:
 
  ```bash
  bin/rails generate llm_cost_tracker:install
  bin/rails db:migrate
  ```
 
- ## Try It In 30 Seconds
-
- Try cost calculation without a database or migration:
+ ## Quick try (no database)
 
  ```ruby
  require "llm_cost_tracker"
 
- LlmCostTracker.configure do |config|
- config.storage_backend = :log
- end
+ LlmCostTracker.configure { |c| c.storage_backend = :log }
 
  LlmCostTracker.track(
  provider: :openai,
@@ -75,25 +58,12 @@ LlmCostTracker.track(
  output_tokens: 200,
  feature: "demo"
  )
+ # => [LlmCostTracker] openai/gpt-4o tokens=1000+200 cost=$0.004500 tags={:feature=>"demo"}
  ```
 
- Output:
-
- ```text
- [LlmCostTracker] openai/gpt-4o tokens=1000+200 cost=$0.004500 tags={:feature=>"demo"}
- ```
-
- ## Quick Start
-
- Use the path that matches your app:
-
- - Using `ruby-openai`, `ruby_llm`, or another client that exposes Faraday? Patch that client's Faraday connection.
- - Using raw Faraday? Add the middleware directly.
- - Using a client without Faraday access? Use manual tracking.
-
- ### Option 1: Patch An Existing Client
+ ## Usage
 
- Some LLM gems expose their Faraday connection. For example, with `ruby-openai`:
+ ### Patch an existing client's Faraday connection
 
  ```ruby
  # config/initializers/openai.rb
@@ -102,34 +72,27 @@ OpenAI.configure do |config|
 
  config.faraday do |f|
  f.use :llm_cost_tracker, tags: -> {
- {
- user_id: Current.user&.id,
- feature: Current.llm_feature || "openai"
- }
+ { user_id: Current.user&.id, workflow: Current.workflow, env: Rails.env }
  }
  end
  end
  ```
 
- For Rails apps, `tags:` can be a callable so request-local values are evaluated per request:
+ `tags:` can be a callable so `Current` attributes are evaluated per request:
 
  ```ruby
- # app/models/current.rb
  class Current < ActiveSupport::CurrentAttributes
- attribute :user, :tenant, :llm_feature
+ attribute :user, :tenant, :workflow
  end
 
- # app/controllers/application_controller.rb
+ # application_controller.rb
  before_action do
  Current.user = current_user
- Current.tenant = current_tenant if respond_to?(:current_tenant, true)
- Current.llm_feature = "chat"
+ Current.workflow = "chat"
  end
  ```
 
- ### Option 2: Faraday Middleware
-
- If your LLM client uses Faraday, add the middleware to that connection:
+ ### Raw Faraday
 
  ```ruby
  conn = Faraday.new(url: "https://api.openai.com") do |f|
@@ -139,18 +102,12 @@ conn = Faraday.new(url: "https://api.openai.com") do |f|
  f.adapter Faraday.default_adapter
  end
 
- # Every supported LLM request through this connection is tracked
- response = conn.post("/v1/responses", {
- model: "gpt-5-mini",
- input: "Hello!"
- })
+ conn.post("/v1/responses", { model: "gpt-5-mini", input: "Hello!" })
  ```
 
- If a client does not expose its HTTP connection, use manual tracking or register a custom parser around the HTTP layer you control.
-
- ### Option 3: Manual tracking
+ Place `llm_cost_tracker` inside the Faraday stack where it can see the final response body. For streaming APIs, tracking requires the final body to expose provider usage; otherwise the gem warns and skips — use manual tracking there.
 
- For non-Faraday clients, track manually:
+ ### Manual tracking
 
  ```ruby
  LlmCostTracker.track(
@@ -169,332 +126,215 @@ LlmCostTracker.track(
  ```ruby
  # config/initializers/llm_cost_tracker.rb
  LlmCostTracker.configure do |config|
- # Storage: :log (default), :active_record, or :custom
- config.storage_backend = :active_record
-
- # Default tags on every event
+ config.storage_backend = :active_record # :log (default), :active_record, :custom
  config.default_tags = { app: "my_app", environment: Rails.env }
 
- # Monthly budget in USD
  config.monthly_budget = 500.00
- config.budget_exceeded_behavior = :notify # :notify, :raise, or :block_requests
- config.storage_error_behavior = :warn # :ignore, :warn, or :raise
- config.unknown_pricing_behavior = :warn # :ignore, :warn, or :raise
+ config.budget_exceeded_behavior = :notify # :notify, :raise, :block_requests
+ config.storage_error_behavior = :warn # :ignore, :warn, :raise
+ config.unknown_pricing_behavior = :warn # :ignore, :warn, :raise
 
- # Alert callback
  config.on_budget_exceeded = ->(data) {
- SlackNotifier.notify(
- "#alerts",
- "🚨 LLM budget exceeded! $#{data[:monthly_total].round(2)} / $#{data[:budget]}"
- )
+ SlackNotifier.notify("#alerts", "🚨 LLM budget $#{data[:monthly_total].round(2)} / $#{data[:budget]}")
  }
 
- # Override pricing for custom/fine-tuned models (per 1M tokens)
  config.prices_file = Rails.root.join("config/llm_cost_tracker_prices.yml")
  config.pricing_overrides = {
  "ft:gpt-4o-mini:my-org" => { input: 0.30, cached_input: 0.15, output: 1.20 }
  }
 
- # OpenAI-compatible APIs. OpenRouter and DeepSeek are included by default.
+ # Built-in: openrouter.ai, api.deepseek.com
  config.openai_compatible_providers["llm.my-company.com"] = "internal_gateway"
  end
  ```
 
- Pricing is best-effort and based on public provider pricing for standard token usage. Providers change pricing frequently, and some features have extra charges or tiered pricing. OpenRouter-style model IDs such as `openai/gpt-4o-mini` are normalized to built-in model names when possible. Use `prices_file` or `pricing_overrides` for fine-tunes, gateway-specific model IDs, enterprise discounts, batch pricing, long-context premiums, and any model this gem does not know yet.
+ Pricing is best-effort. OpenRouter-style IDs like `openai/gpt-4o-mini` are normalized to built-in names when possible. Use `prices_file` / `pricing_overrides` for fine-tunes, gateway-specific IDs, enterprise discounts, batch pricing, or models the gem doesn't know.
 
- Storage errors are non-fatal by default:
-
- ```ruby
- config.storage_error_behavior = :warn # default
- config.storage_error_behavior = :raise # fail fast with StorageError
- config.storage_error_behavior = :ignore # skip storage failures silently
- ```
+ `storage_error_behavior = :warn` (default) lets LLM responses continue if storage fails; `:raise` exposes `StorageError#original_error`.
 
- With the default `:warn` behavior, tracking emits a warning and lets the LLM response continue if ActiveRecord or custom storage fails. `LlmCostTracker::StorageError` exposes `original_error` when `:raise` is enabled.
-
- Unknown model pricing is visible by default:
-
- ```ruby
- config.unknown_pricing_behavior = :warn # default
- config.unknown_pricing_behavior = :raise # fail fast with UnknownPricingError
- config.unknown_pricing_behavior = :ignore # keep tracking tokens silently
- ```
-
- When pricing is unknown, the event can still be recorded with token counts, but `cost` is `nil` and budget enforcement is skipped for that event. Use `prices_file` or `pricing_overrides` to ensure all production models are priced. Check this ActiveRecord query for a list of unpriced models in your data:
+ Unknown pricing still records token counts, but `cost` is `nil` and budget guardrails skip that event. Find unpriced models:
 
  ```ruby
  LlmCostTracker::LlmApiCall.unknown_pricing.group(:model).count
  ```
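For reference, per-1M-token pricing reduces to simple arithmetic. A standalone sketch (the `2.50` / `10.00` rates are illustrative assumptions, not the gem's built-in price table) that reproduces the `$0.004500` figure from the demo above:

```ruby
# Per-1M-token cost arithmetic. The rates passed in below are
# illustrative assumptions, not the gem's built-in price table.
def cost_usd(input_tokens, output_tokens, input_per_1m:, output_per_1m:)
  (input_tokens * input_per_1m + output_tokens * output_per_1m) / 1_000_000.0
end

# 1000 input + 200 output tokens at $2.50 / $10.00 per 1M tokens:
cost_usd(1000, 200, input_per_1m: 2.50, output_per_1m: 10.00)
# => 0.0045
```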
 
- ### Keeping Prices Current
-
- Built-in prices live in `lib/llm_cost_tracker/prices.json`, with `updated_at`, `unit`, `currency`, and source URLs in the file metadata. The gem does not fetch pricing on boot; that keeps it self-hosted and avoids hidden external dependencies.
+ ### Keeping prices current
 
- For production apps, keep a local JSON or YAML price file and point the gem at it:
+ Built-in prices are in `lib/llm_cost_tracker/prices.json`. The gem never fetches pricing on boot. For production, generate a local overrides file and point the gem at it:
 
  ```bash
  bin/rails generate llm_cost_tracker:prices
  ```
 
- ```ruby
- config.prices_file = Rails.root.join("config/llm_cost_tracker_prices.yml")
- ```
-
- Example JSON:
-
  ```json
  {
- "metadata": {
- "updated_at": "2026-04-18",
- "currency": "USD",
- "unit": "1M tokens"
- },
+ "metadata": { "updated_at": "2026-04-18", "currency": "USD", "unit": "1M tokens" },
  "models": {
- "my-gateway/gpt-4o-mini": {
- "input": 0.20,
- "cached_input": 0.10,
- "output": 0.80
- }
+ "my-gateway/gpt-4o-mini": { "input": 0.20, "cached_input": 0.10, "output": 0.80 }
  }
  }
  ```
 
- `pricing_overrides` still has the highest precedence, so you can use it for small Ruby-only overrides and keep broader provider tables in the file. A practical release rhythm is to refresh built-in `prices.json` quarterly and use `prices_file` for urgent provider changes between gem releases.
+ `pricing_overrides` has the highest precedence; use it for small Ruby-only tweaks, `prices_file` for broader tables.
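The precedence chain (built-in `prices.json`, then `prices_file`, then `pricing_overrides`) behaves like successive hash merges. A toy sketch with hypothetical model entries:

```ruby
# Toy sketch of pricing precedence: pricing_overrides beats prices_file,
# which beats the built-in table. Model names and rates are hypothetical.
builtin   = { "gpt-4o-mini" => { input: 0.15, output: 0.60 } }
from_file = { "gpt-4o-mini" => { input: 0.20, output: 0.80 },
              "my-gateway/special" => { input: 1.00, output: 2.00 } }
overrides = { "gpt-4o-mini" => { input: 0.10, output: 0.40 } }

prices = builtin.merge(from_file).merge(overrides) # later merges win

prices.fetch("gpt-4o-mini")        # => { input: 0.10, output: 0.40 }
prices.fetch("my-gateway/special") # => { input: 1.00, output: 2.00 }
```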
 
- ## Budget Enforcement
+ ## Budget enforcement
 
  ```ruby
- LlmCostTracker.configure do |config|
- config.storage_backend = :active_record
- config.monthly_budget = 100.00
- config.budget_exceeded_behavior = :block_requests
- end
+ config.storage_backend = :active_record
+ config.monthly_budget = 100.00
+ config.budget_exceeded_behavior = :block_requests
  ```
 
- Budget behavior options:
-
- - `:notify` — default. Calls `on_budget_exceeded` after a tracked event pushes the month over budget.
- - `:raise` — records the event, then raises `LlmCostTracker::BudgetExceededError` when the month is over budget.
- - `:block_requests` — blocks Faraday LLM requests before the HTTP call when the ActiveRecord monthly total has already reached the budget. If a request pushes the month over budget, it also raises after recording the event.
-
- `BudgetExceededError` exposes `monthly_total`, `budget`, and `last_event`:
+ - `:notify` — fire `on_budget_exceeded` after an event pushes the month over budget.
+ - `:raise` — record the event, then raise `BudgetExceededError`.
+ - `:block_requests` — block preflight when the stored monthly total is already over budget; still raises post-response on the event that crosses the line. Needs `:active_record` storage.
 
  ```ruby
- begin
- client.chat(...)
  rescue LlmCostTracker::BudgetExceededError => e
- Rails.logger.warn("LLM budget exhausted: #{e.monthly_total} / #{e.budget}")
- end
+ # e.monthly_total, e.budget, e.last_event
  ```
 
- Pre-request blocking needs `storage_backend = :active_record` because the middleware must query your stored monthly total before sending the request. With `:log` or `:custom` storage, `:raise` and the post-response part of `:block_requests` still work for the event being tracked.
+ `:block_requests` is best-effort under concurrency, not a transactional cap. Use provider/gateway-level limits for strict quotas.
 
- `:block_requests` is a best-effort guardrail, not a transactional hard quota. In highly concurrent deployments, multiple workers can pass the preflight check at the same time before any of them records its final cost. The request that first pushes the month over budget is stored before the post-response `BudgetExceededError` is raised; later Faraday requests are blocked during preflight once the stored monthly total is exhausted. Use provider-side limits or a gateway-level quota if you need strict cross-process enforcement.
-
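The preflight behind `:block_requests` can be pictured as a plain comparison against the stored monthly total before the HTTP call goes out. A standalone sketch (class and parameter names are hypothetical, not the gem's internals) that also shows why it is only best-effort:

```ruby
# Standalone sketch of a best-effort budget preflight; names are
# hypothetical, not the gem's internals.
class BudgetPreflight
  def initialize(budget:, monthly_total:)
    @budget = budget
    @monthly_total = monthly_total # callable returning the stored monthly spend
  end

  # True when the request may go out. The race that makes this best-effort:
  # two workers can both read an under-budget total here before either
  # records its own cost after the response.
  def allow_request?
    @monthly_total.call < @budget
  end
end

BudgetPreflight.new(budget: 100.0, monthly_total: -> { 99.5 }).allow_request?  # => true
BudgetPreflight.new(budget: 100.0, monthly_total: -> { 100.0 }).allow_request? # => false
```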
- ## Querying Costs (ActiveRecord)
-
- Print a quick terminal report:
+ ## Querying costs
 
  ```bash
  bin/rails llm_cost_tracker:report
-
- # Optional: change the window
  DAYS=7 bin/rails llm_cost_tracker:report
  ```
 
- Example:
-
- ```text
- LLM Cost Report (last 30 days)
-
- Total cost: $127.420000
- Requests: 4,218
- Avg latency: 812ms
- Unknown pricing: 0
-
- By provider:
- openai $96.220000
- anthropic $31.200000
- ```
-
- Or query the ledger directly:
-
  ```ruby
- # Today's total spend
  LlmCostTracker::LlmApiCall.today.total_cost
- # => 12.45
-
- # Cost breakdown by model this month
  LlmCostTracker::LlmApiCall.this_month.cost_by_model
- # => { "gpt-4o" => 8.20, "claude-sonnet-4-6" => 4.25 }
-
- # Cost by provider
  LlmCostTracker::LlmApiCall.this_month.cost_by_provider
- # => { "openai" => 8.20, "anthropic" => 4.25 }
 
- # Daily cost trend
+ # Group / sum by any tag
+ LlmCostTracker::LlmApiCall.this_month.group_by_tag("feature").sum(:total_cost)
+ LlmCostTracker::LlmApiCall.this_month.cost_by_tag("feature") # with "(untagged)" bucket
+
+ # Period grouping (SQL-side)
+ LlmCostTracker::LlmApiCall.this_month.group_by_period(:day).sum(:total_cost)
+ LlmCostTracker::LlmApiCall.group_by_period(:month).sum(:total_cost)
  LlmCostTracker::LlmApiCall.daily_costs(days: 7)
- # => { "2026-04-10" => 1.5, "2026-04-11" => 2.3, ... }
 
- # Latency overview
+ # Latency
  LlmCostTracker::LlmApiCall.with_latency.average_latency_ms
  LlmCostTracker::LlmApiCall.this_month.latency_by_model
 
- # Filter by feature
+ # Tag filters
  LlmCostTracker::LlmApiCall.by_tag("feature", "chat").this_month.total_cost
-
- # Filter by user
- LlmCostTracker::LlmApiCall.by_tag("user_id", "42").today.total_cost
- LlmCostTracker::LlmApiCall.by_user(42).today.total_cost
-
- # Filter by multiple tags
  LlmCostTracker::LlmApiCall.by_tags(user_id: 42, feature: "chat").this_month.total_cost
 
- # Feature shortcut
- LlmCostTracker::LlmApiCall.by_feature("summarizer").this_month.total_cost
-
- # Find models without pricing
- LlmCostTracker::LlmApiCall.unknown_pricing.group(:model).count
- LlmCostTracker::LlmApiCall.with_cost.this_month.total_cost
-
- # Custom date range
+ # Range
  LlmCostTracker::LlmApiCall.between(1.week.ago, Time.current).cost_by_model
  ```
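For intuition, a daily rollup like `group_by_period(:day).sum(:total_cost)` amounts to this plain-Ruby grouping over in-memory events (hypothetical data; the gem does the aggregation in SQL instead):

```ruby
require "date"

# Plain-Ruby picture of a daily cost rollup over hypothetical events.
events = [
  { tracked_at: Date.new(2026, 4, 10), total_cost: 1.5 },
  { tracked_at: Date.new(2026, 4, 10), total_cost: 0.5 },
  { tracked_at: Date.new(2026, 4, 11), total_cost: 2.3 }
]

daily = events
  .group_by { |e| e[:tracked_at].iso8601 }
  .transform_values { |day| day.sum { |e| e[:total_cost] } }
# => {"2026-04-10"=>2.0, "2026-04-11"=>2.3}
```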
 
- ### Tag Storage
+ ### Tag storage
 
- The install generator uses `jsonb` tags with a GIN index on PostgreSQL:
+ New installs use `jsonb` + GIN on PostgreSQL:
 
  ```ruby
  t.jsonb :tags, null: false, default: {}
  add_index :llm_api_calls, :tags, using: :gin
  ```
 
- On SQLite, MySQL, and other adapters, tags fall back to JSON stored in a text column. The `by_tag` scope automatically uses PostgreSQL JSONB containment when the column supports it, and the text fallback otherwise. This works, but tag queries are less efficient than PostgreSQL JSONB containment.
+ On other adapters tags fall back to JSON in a text column. `by_tag` uses JSONB containment on PG, text matching elsewhere.
 
- If you installed `llm_cost_tracker` before JSONB tags were available and your app uses PostgreSQL, generate an upgrade migration:
+ Upgrade an existing install:
 
  ```bash
- bin/rails generate llm_cost_tracker:upgrade_tags_to_jsonb
+ bin/rails generate llm_cost_tracker:upgrade_tags_to_jsonb # PG: text → jsonb + GIN
+ bin/rails generate llm_cost_tracker:upgrade_cost_precision # widen cost columns
+ bin/rails generate llm_cost_tracker:add_latency_ms
  bin/rails db:migrate
  ```
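Conceptually, the PostgreSQL path compiles a tag filter to a JSONB containment predicate the GIN index can serve, while the text fallback matches the serialized pair. A rough sketch of the two WHERE shapes (illustrative only, not the gem's exact generated SQL, and unescaped for brevity):

```ruby
require "json"

# Illustrative WHERE fragments for a tag filter; not the gem's exact SQL.
# Real code must use bind parameters, never string interpolation.
def tag_where_fragment(key, value, jsonb:)
  pair = JSON.generate(key => value) # e.g. {"feature":"chat"}
  if jsonb
    "tags @> '#{pair}'::jsonb"      # PostgreSQL containment; GIN-indexable
  else
    "tags LIKE '%#{pair[1..-2]}%'"  # text fallback: match the serialized pair
  end
end

tag_where_fragment("feature", "chat", jsonb: true)
# => tags @> '{"feature":"chat"}'::jsonb
```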
 
- This converts the existing `tags` text column to `jsonb`, keeps existing tag data, and adds the GIN index.
+ ## Dashboard (optional)
 
- If you installed an earlier version with `precision: 12, scale: 8` cost columns, widen them for larger production ledgers:
+ Opt-in Rails Engine. Plain ERB, inline CSS, no JS. Requires Rails 7.1+; the core middleware works without Rails.
 
- ```bash
- bin/rails generate llm_cost_tracker:upgrade_cost_precision
- bin/rails db:migrate
+ ```ruby
+ # config/application.rb (or an initializer)
+ require "llm_cost_tracker/engine"
+
+ # config/routes.rb
+ mount LlmCostTracker::Engine => "/llm-costs"
  ```
 
- If you installed before `latency_ms` was available, add the latency column:
+ Routes (GET-only; CSV export included):
 
- ```bash
- bin/rails generate llm_cost_tracker:add_latency_ms
- bin/rails db:migrate
+ - `/llm-costs` — overview: spend (with delta vs previous period), calls, avg cost/call, avg latency, unknown pricing, budget, daily trend, provider rollup, top models
+ - `/llm-costs/models` — by provider + model; sortable by spend, volume, avg cost, latency
+ - `/llm-costs/calls` — filterable + paginated; outlier sort modes (expensive, largest input/output, slowest, unknown pricing); CSV export
+ - `/llm-costs/calls/:id` — details
+ - `/llm-costs/tags` — tag keys present in the dataset (PG/SQLite native, MySQL via in-Ruby fallback)
+ - `/llm-costs/tags/:key` — breakdown by values of a given tag key
+ - `/llm-costs/data_quality` — unknown pricing share, untagged calls, missing latency
+
+ > ⚠️ **No built-in auth.** Tags carry whatever your app puts in them. Protect the mount point with your app's auth.
+
+ ### Basic auth
+
+ ```ruby
+ authenticated = ->(req) {
+ ActionController::HttpAuthentication::Basic.authenticate(req) do |name, password|
+ ActiveSupport::SecurityUtils.secure_compare(name, ENV.fetch("LLM_DASHBOARD_USER")) &
+ ActiveSupport::SecurityUtils.secure_compare(password, ENV.fetch("LLM_DASHBOARD_PASSWORD"))
+ end
+ }
+ constraints(authenticated) { mount LlmCostTracker::Engine => "/llm-costs" }
  ```
 
- ## ActiveSupport::Notifications
+ ### Devise
+
+ ```ruby
+ authenticate :user, ->(user) { user.admin? } do
+ mount LlmCostTracker::Engine => "/llm-costs"
+ end
+ ```
 
- Every tracked call emits an `llm_request.llm_cost_tracker` event:
+ ## ActiveSupport::Notifications
 
  ```ruby
  ActiveSupport::Notifications.subscribe("llm_request.llm_cost_tracker") do |*, payload|
  # payload =>
  # {
- # provider: "openai",
- # model: "gpt-4o",
- # input_tokens: 150,
- # output_tokens: 42,
- # total_tokens: 192,
- # latency_ms: 248,
+ # provider: "openai", model: "gpt-4o",
+ # input_tokens: 150, output_tokens: 42, total_tokens: 192, latency_ms: 248,
  # cost: {
- # input_cost: 0.000375,
- # cached_input_cost: 0.0,
- # cache_read_input_cost: 0.0,
- # cache_creation_input_cost: 0.0,
- # output_cost: 0.00042,
- # total_cost: 0.000795,
- # currency: "USD"
+ # input_cost: 0.000375, cached_input_cost: 0.0,
+ # cache_read_input_cost: 0.0, cache_creation_input_cost: 0.0,
+ # output_cost: 0.00042, total_cost: 0.000795, currency: "USD"
  # },
  # tags: { feature: "chat", user_id: 42 },
  # tracked_at: 2026-04-16 14:30:00 UTC
  # }
-
- StatsD.increment("llm.requests", tags: ["provider:#{payload[:provider]}"])
- StatsD.histogram("llm.cost", payload[:cost][:total_cost])
  end
  ```
 
- ## Custom Storage Backend
+ ## Custom storage backend
 
  ```ruby
- LlmCostTracker.configure do |config|
- config.storage_backend = :custom
- config.custom_storage = ->(event) {
- InfluxDB.write("llm_costs", {
- values: {
- cost: event[:cost]&.fetch(:total_cost, nil),
- tokens: event[:total_tokens],
- latency_ms: event[:latency_ms]
- },
- tags: { provider: event[:provider], model: event[:model] }
- })
- }
- end
+ config.storage_backend = :custom
+ config.custom_storage = ->(event) {
+ InfluxDB.write("llm_costs",
+ values: { cost: event.cost&.total_cost, tokens: event.total_tokens, latency_ms: event.latency_ms },
+ tags: { provider: event.provider, model: event.model }
+ )
+ }
  ```
 
- ## OpenAI-Compatible Providers
+ ## OpenAI-compatible providers
 
  ```ruby
- LlmCostTracker.configure do |config|
- # Built in:
- # "openrouter.ai" => "openrouter"
- # "api.deepseek.com" => "deepseek"
- config.openai_compatible_providers["gateway.example.com"] = "internal_gateway"
- end
+ config.openai_compatible_providers["gateway.example.com"] = "internal_gateway"
  ```
 
459
- Any configured host is parsed with the OpenAI-compatible usage shape:
460
-
461
- - `prompt_tokens` / `completion_tokens` / `total_tokens`
462
- - `input_tokens` / `output_tokens` / `total_tokens`
463
- - optional cached input details when the response includes them
464
-
465
- This covers OpenRouter, DeepSeek, and private gateways that expose OpenAI-style Chat Completions, Responses, Completions, or Embeddings endpoints.
466
-
467
- ## Safety Guarantees
333
+ Configured hosts are parsed with the OpenAI-compatible usage shape (`prompt_tokens` / `completion_tokens` / `total_tokens`, `input_tokens` / `output_tokens`, and optional cached-input details). Covers OpenRouter, DeepSeek, and private gateways exposing Chat Completions / Responses / Completions / Embeddings.
468
334
 
469
- - `llm_cost_tracker` does not make external HTTP calls.
470
- - It does not store prompt or response bodies.
471
- - Faraday responses are not modified.
472
- - Storage failures are non-fatal by default via `storage_error_behavior = :warn`.
473
- - Budget and unknown-pricing errors are raised only when you opt into `:raise` or `:block_requests`.
474
- - Pricing is local and best-effort; use `prices_file` or `pricing_overrides` for production-specific rates.
475
- - Streaming/SSE calls are skipped with a warning when the final usage payload is not readable by Faraday.
335
+ ## Custom parser
476
336
 
477
- ## Production Checklist
-
- - Use `storage_backend = :active_record` in production.
- - Set `monthly_budget` and choose `budget_exceeded_behavior`.
- - Treat `:block_requests` as best-effort in concurrent systems, not a strict quota.
- - Keep `unknown_pricing_behavior = :warn` or `:raise` until pricing overrides are complete.
- - Add `pricing_overrides` for custom, fine-tuned, gateway-specific, or newly released models.
- - Tag calls with `tenant_id`, `user_id`, and `feature` where possible.
- - Check `LlmCostTracker::LlmApiCall.unknown_pricing.group(:model).count` after deploys.
- - Track `latency_ms` and watch `latency_by_model` for slow or degraded providers.
-
- ## Known Limitations
-
- - `:block_requests` is best-effort under concurrency. For hard caps, use an external quota system, provider-side limits, or a gateway-level budget.
- - Streaming/SSE calls are tracked only when Faraday exposes a final response body with usage data. Otherwise the gem warns and skips automatic tracking.
- - Anthropic cache creation TTL variants are not modeled separately yet; 1-hour cache writes may be underestimated compared with the default 5-minute cache write rate.
- - OpenAI reasoning tokens are included in output-token totals when providers report them that way, but separate reasoning-token attribution is not stored yet.
-
- ## Adding a Custom Provider Parser
-
- Use this for providers that are not OpenAI-compatible and return a different usage shape.
+ For providers with a non-OpenAI usage shape:

  ```ruby
  class AcmeParser < LlmCostTracker::Parsers::Base
@@ -505,73 +345,58 @@ class AcmeParser < LlmCostTracker::Parsers::Base
  def parse(request_url, request_body, response_status, response_body)
  return nil unless response_status == 200

- response = safe_json_parse(response_body)
- usage = response["usage"]
+ usage = safe_json_parse(response_body)&.dig("usage")
  return nil unless usage

- {
+ LlmCostTracker::ParsedUsage.build(
  provider: "acme",
- model: response["model"],
+ model: safe_json_parse(response_body)["model"],
  input_tokens: usage["input"] || 0,
  output_tokens: usage["output"] || 0
- }
+ )
  end
  end

- # Register it
  LlmCostTracker::Parsers::Registry.register(AcmeParser.new)
  ```
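The parsing logic above can be exercised standalone. In this sketch the gem's helpers are stubbed out so it runs outside a LlmCostTracker install: `safe_json_parse` mimics the helper from `Parsers::Base` (an assumption), and a plain hash stands in for `ParsedUsage.build`.

```ruby
require "json"

# Stub of the gem's safe_json_parse helper (assumed behavior):
# returns a parsed hash, or nil on malformed JSON.
def safe_json_parse(body)
  JSON.parse(body)
rescue JSON::ParserError
  nil
end

# Same control flow as the AcmeParser#parse example above,
# returning a plain hash instead of ParsedUsage.build.
def parse_acme(response_status, response_body)
  return nil unless response_status == 200

  usage = safe_json_parse(response_body)&.dig("usage")
  return nil unless usage

  {
    provider: "acme",
    model: safe_json_parse(response_body)["model"],
    input_tokens: usage["input"] || 0,
    output_tokens: usage["output"] || 0
  }
end

body = '{"model":"acme-1","usage":{"input":12,"output":34}}'
parse_acme(200, body)   # hash with provider "acme", 12 input / 34 output tokens
parse_acme(500, body)   # nil: non-200 responses are skipped
parse_acme(200, "oops") # nil: unparseable bodies are skipped
```

Returning `nil` at each guard is what tells the registry the call should not be tracked.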

- ## Supported Providers
+ ## Supported providers

  | Provider | Auto-detected | Models with pricing |
- |----------|:---:|---|
+ |---|:---:|---|
  | OpenAI | ✅ | GPT-5.2/5.1/5, GPT-5 mini/nano, GPT-4.1, GPT-4o, o1/o3/o4-mini |
- | OpenRouter | ✅ | Uses OpenAI-compatible usage; provider-prefixed OpenAI model IDs are normalized when possible |
- | DeepSeek | ✅ | Uses OpenAI-compatible usage; add `pricing_overrides` for DeepSeek model pricing |
+ | OpenRouter | ✅ | OpenAI-compatible usage; provider-prefixed OpenAI model IDs normalized when possible |
+ | DeepSeek | ✅ | OpenAI-compatible usage; add `pricing_overrides` for DeepSeek models |
  | OpenAI-compatible hosts | 🔧 | Configure `openai_compatible_providers` |
  | Anthropic | ✅ | Claude Opus 4.6/4.1/4, Sonnet 4.6/4.5/4, Haiku 4.5, Claude 3.x |
  | Google Gemini | ✅ | Gemini 2.5 Pro/Flash/Flash-Lite, 2.0 Flash/Flash-Lite, 1.5 Pro/Flash |
- | Any other | 🔧 | Via custom parser (see above) |
-
- Supported endpoint families:
+ | Any other | 🔧 | Custom parser |

- - OpenAI: Chat Completions, Responses, Completions, Embeddings
- - OpenAI-compatible: Chat Completions, Responses, Completions, Embeddings
- - Anthropic: Messages
- - Google Gemini: `generateContent` responses with `usageMetadata`
+ Endpoints: OpenAI Chat Completions / Responses / Completions / Embeddings; OpenAI-compatible equivalents; Anthropic Messages; Gemini `generateContent` with `usageMetadata`.

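For models the gem cannot price (such as DeepSeek above), `pricing_overrides` supplies local rates. A hedged sketch — the option name comes from this README, but the key shape and the example rates are assumptions, not the gem's documented format:

```ruby
LlmCostTracker.configure do |config|
  # Rates are illustrative only; check the gem's pricing docs for
  # the canonical override shape and current provider prices.
  config.pricing_overrides = {
    "deepseek-chat" => { input: 0.27, output: 1.10 } # USD per 1M tokens (example)
  }
end
```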
- ## How It Works
+ ## Safety

- ```
- Your App → Faraday → [LlmCostTracker Middleware] → LLM API
-
- Parses response body
- Extracts token usage
- Calculates cost
-
- ActiveSupport::Notifications
- ActiveRecord / Log / Custom
- ```
+ - No external HTTP calls.
+ - No prompt or response bodies stored.
+ - Faraday responses not modified.
+ - Storage failures non-fatal by default (`storage_error_behavior = :warn`).
+ - Budget / unknown-pricing errors are raised only when you opt in.
 
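The behaviors in the safety list map onto initializer options. A minimal sketch, assuming the gem exposes a `configure` block (the option names and `:warn` / `:raise` / `:block_requests` values come from this README; the block syntax is an assumption):

```ruby
LlmCostTracker.configure do |config|
  config.storage_error_behavior   = :warn   # keep storage failures non-fatal (default)
  config.monthly_budget           = 500.0   # hypothetical USD cap
  config.budget_exceeded_behavior = :warn   # opt into :raise or :block_requests for errors
  config.unknown_pricing_behavior = :warn   # or :raise until pricing overrides are complete
end
```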
- The middleware intercepts **outgoing** HTTP responses (not incoming Rails requests), parses the provider usage object, looks up pricing, and records the event. It never modifies requests or responses. Put `llm_cost_tracker` inside the Faraday stack where it can see the final response body; if another middleware consumes or transforms streaming bodies, use manual tracking.
+ ## Known limitations

- For streaming APIs, tracking depends on the final response body including provider usage data. If the client consumes server-sent events without exposing the final usage payload to Faraday, the gem logs a warning and skips tracking; use manual tracking for those calls.
+ - `:block_requests` is best-effort under concurrency; use an external quota system for hard caps.
+ - Streaming/SSE tracked only when Faraday exposes a final body with usage.
+ - Anthropic cache TTL variants (1h vs 5min writes) not modeled separately.
+ - OpenAI reasoning tokens included in output totals; separate reasoning-token attribution not stored.
 
  ## Development

  ```bash
- git clone https://github.com/sergey-homenko/llm_cost_tracker.git
- cd llm_cost_tracker
  bundle install
  bundle exec rspec
  bundle exec rubocop
  ```

- ## Contributing
-
- Bug reports and pull requests are welcome on [GitHub](https://github.com/sergey-homenko/llm_cost_tracker).
-
  ## License

- The gem is available as open source under the terms of the [MIT License](LICENSE.txt).
+ MIT. See [LICENSE.txt](LICENSE.txt).