llm_cost_tracker 0.1.3 → 0.2.0.alpha1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +64 -81
- data/PLAN_0.2.md +488 -0
- data/README.md +141 -316
- data/app/controllers/llm_cost_tracker/application_controller.rb +42 -0
- data/app/controllers/llm_cost_tracker/calls_controller.rb +77 -0
- data/app/controllers/llm_cost_tracker/dashboard_controller.rb +54 -0
- data/app/controllers/llm_cost_tracker/data_quality_controller.rb +10 -0
- data/app/controllers/llm_cost_tracker/models_controller.rb +12 -0
- data/app/controllers/llm_cost_tracker/tags_controller.rb +21 -0
- data/app/helpers/llm_cost_tracker/application_helper.rb +113 -0
- data/app/services/llm_cost_tracker/dashboard/data_quality.rb +38 -0
- data/app/services/llm_cost_tracker/dashboard/filter.rb +109 -0
- data/app/services/llm_cost_tracker/dashboard/overview_stats.rb +87 -0
- data/app/services/llm_cost_tracker/dashboard/provider_breakdown.rb +44 -0
- data/app/services/llm_cost_tracker/dashboard/tag_breakdown.rb +58 -0
- data/app/services/llm_cost_tracker/dashboard/tag_key_explorer.rb +125 -0
- data/app/services/llm_cost_tracker/dashboard/time_series.rb +44 -0
- data/app/services/llm_cost_tracker/dashboard/top_models.rb +89 -0
- data/app/services/llm_cost_tracker/pagination.rb +59 -0
- data/app/views/layouts/llm_cost_tracker/application.html.erb +342 -0
- data/app/views/llm_cost_tracker/calls/index.html.erb +127 -0
- data/app/views/llm_cost_tracker/calls/show.html.erb +67 -0
- data/app/views/llm_cost_tracker/dashboard/index.html.erb +145 -0
- data/app/views/llm_cost_tracker/data_quality/index.html.erb +110 -0
- data/app/views/llm_cost_tracker/errors/database.html.erb +8 -0
- data/app/views/llm_cost_tracker/errors/invalid_filter.html.erb +4 -0
- data/app/views/llm_cost_tracker/errors/not_found.html.erb +5 -0
- data/app/views/llm_cost_tracker/models/index.html.erb +95 -0
- data/app/views/llm_cost_tracker/shared/_bar.html.erb +5 -0
- data/app/views/llm_cost_tracker/shared/setup_required.html.erb +6 -0
- data/app/views/llm_cost_tracker/tags/index.html.erb +34 -0
- data/app/views/llm_cost_tracker/tags/show.html.erb +69 -0
- data/config/routes.rb +10 -0
- data/lib/llm_cost_tracker/budget.rb +16 -38
- data/lib/llm_cost_tracker/configuration.rb +3 -1
- data/lib/llm_cost_tracker/cost.rb +1 -3
- data/lib/llm_cost_tracker/engine.rb +13 -0
- data/lib/llm_cost_tracker/engine_compatibility.rb +15 -0
- data/lib/llm_cost_tracker/errors.rb +2 -0
- data/lib/llm_cost_tracker/event.rb +1 -3
- data/lib/llm_cost_tracker/event_metadata.rb +9 -18
- data/lib/llm_cost_tracker/llm_api_call.rb +43 -9
- data/lib/llm_cost_tracker/middleware/faraday.rb +4 -4
- data/lib/llm_cost_tracker/parsed_usage.rb +5 -9
- data/lib/llm_cost_tracker/parsers/anthropic.rb +4 -5
- data/lib/llm_cost_tracker/parsers/base.rb +3 -8
- data/lib/llm_cost_tracker/parsers/gemini.rb +3 -3
- data/lib/llm_cost_tracker/parsers/openai_usage.rb +3 -3
- data/lib/llm_cost_tracker/parsers/registry.rb +5 -12
- data/lib/llm_cost_tracker/period_grouping.rb +68 -0
- data/lib/llm_cost_tracker/price_registry.rb +22 -30
- data/lib/llm_cost_tracker/pricing.rb +10 -19
- data/lib/llm_cost_tracker/report.rb +4 -4
- data/lib/llm_cost_tracker/report_data.rb +23 -29
- data/lib/llm_cost_tracker/report_formatter.rb +11 -3
- data/lib/llm_cost_tracker/storage/active_record_store.rb +1 -3
- data/lib/llm_cost_tracker/tag_accessors.rb +0 -8
- data/lib/llm_cost_tracker/tag_key.rb +16 -0
- data/lib/llm_cost_tracker/tracker.rb +35 -1
- data/lib/llm_cost_tracker/unknown_pricing.rb +1 -1
- data/lib/llm_cost_tracker/version.rb +1 -1
- data/lib/llm_cost_tracker.rb +3 -6
- data/llm_cost_tracker.gemspec +13 -9
- metadata +92 -21
- data/.rubocop.yml +0 -44
- data/lib/llm_cost_tracker/storage/active_record_backend.rb +0 -19
- data/lib/llm_cost_tracker/storage/backends.rb +0 -26
- data/lib/llm_cost_tracker/storage/custom_backend.rb +0 -16
- data/lib/llm_cost_tracker/storage/log_backend.rb +0 -28
- data/lib/llm_cost_tracker/value_object.rb +0 -45
data/README.md CHANGED

@@ -1,8 +1,6 @@
 # LlmCostTracker
 
-**
-
-Track cost by user, tenant, feature, provider, and model, all in your own database. No proxy. No SaaS required.
+**Self-hosted LLM cost tracking for Ruby and Rails.** Intercepts Faraday LLM responses, prices them locally, stores events in your database. No proxy, no SaaS.
 
 [](https://rubygems.org/gems/llm_cost_tracker)
 [](https://github.com/sergey-homenko/llm_cost_tracker/actions)

@@ -20,53 +18,38 @@ By model:
   claude-sonnet-4-6        $31.200000
   gemini-2.5-flash         $14.120000
 
-By
-
-
-  translate                $24.700000
+By tag key "env":
+  production               $119.300000
+  staging                  $8.120000
 ```
 
-## Why
-
-Every Rails app integrating LLMs faces the same problem: **you don't know how much AI is costing you** until the invoice arrives. Full observability platforms like Langfuse and Helicone are powerful, but sometimes you just need a small Rails-native cost ledger that lives in your app database.
+## Why
 
-
+Every Rails app with LLM integrations eventually runs into the same question: where did that invoice come from? Full observability platforms like Langfuse and Helicone cover a lot more than cost, and sometimes you just want a small Rails-native ledger that lives in your own database.
 
-
-- 🏠 **Self-hosted** — your data stays in your database
-- 🧩 **Client-light** — works with raw Faraday and LLM gems that expose their Faraday connection
-- 🏷️ **Attribution-first** — tag spend by feature, tenant, user, job, or environment
-- 🌐 **OpenAI-compatible** — auto-detect OpenRouter and DeepSeek, with custom compatible hosts configurable
-- 🛑 **Budget guardrails** — notify, raise, or block requests when monthly spend is exhausted
-- 📊 **Quick reports** — print a terminal cost report with one rake task
+`llm_cost_tracker` is scoped to that. It plugs into Faraday, parses provider usage out of the response, looks up pricing locally, and writes an event. You end up with a ledger you can query with plain ActiveRecord, slice by any tag dimension, and optionally surface on a built-in dashboard. No proxy, no SaaS, no separate service to run.
 
-
+It's not a tracing platform, prompt CMS, eval system, or gateway — and doesn't want to be. The goal is answering _"what did this app spend on LLM APIs, and where did that spend come from?"_ well enough that you stop worrying about it.
 
 ## Installation
 
-Add to your Gemfile:
-
 ```ruby
 gem "llm_cost_tracker"
 ```
 
-For ActiveRecord storage
+For ActiveRecord storage:
 
 ```bash
 bin/rails generate llm_cost_tracker:install
 bin/rails db:migrate
 ```
 
-##
-
-Try cost calculation without a database or migration:
+## Quick try (no database)
 
 ```ruby
 require "llm_cost_tracker"
 
-LlmCostTracker.configure
-  config.storage_backend = :log
-end
+LlmCostTracker.configure { |c| c.storage_backend = :log }
 
 LlmCostTracker.track(
   provider: :openai,

@@ -75,25 +58,12 @@ LlmCostTracker.track(
   output_tokens: 200,
   feature: "demo"
 )
+# => [LlmCostTracker] openai/gpt-4o tokens=1000+200 cost=$0.004500 tags={:feature=>"demo"}
 ```
 
-
-
-```text
-[LlmCostTracker] openai/gpt-4o tokens=1000+200 cost=$0.004500 tags={:feature=>"demo"}
-```
-
-## Quick Start
-
-Use the path that matches your app:
-
-- Using `ruby-openai`, `ruby_llm`, or another client that exposes Faraday? Patch that client's Faraday connection.
-- Using raw Faraday? Add the middleware directly.
-- Using a client without Faraday access? Use manual tracking.
-
-### Option 1: Patch An Existing Client
+## Usage
 
-
+### Patch an existing client's Faraday connection
 
 ```ruby
 # config/initializers/openai.rb

@@ -102,34 +72,27 @@ OpenAI.configure do |config|
 
   config.faraday do |f|
     f.use :llm_cost_tracker, tags: -> {
-      {
-        user_id: Current.user&.id,
-        feature: Current.llm_feature || "openai"
-      }
+      { user_id: Current.user&.id, workflow: Current.workflow, env: Rails.env }
     }
   end
 end
 ```
 
-
+`tags:` can be a callable so `Current` attributes are evaluated per request:
 
 ```ruby
-# app/models/current.rb
 class Current < ActiveSupport::CurrentAttributes
-  attribute :user, :tenant, :
+  attribute :user, :tenant, :workflow
 end
 
-#
+# application_controller.rb
 before_action do
   Current.user = current_user
-  Current.
-  Current.llm_feature = "chat"
+  Current.workflow = "chat"
 end
 ```
 
-###
-
-If your LLM client uses Faraday, add the middleware to that connection:
+### Raw Faraday
 
 ```ruby
 conn = Faraday.new(url: "https://api.openai.com") do |f|

@@ -139,18 +102,12 @@ conn = Faraday.new(url: "https://api.openai.com") do |f|
   f.adapter Faraday.default_adapter
 end
 
-
-response = conn.post("/v1/responses", {
-  model: "gpt-5-mini",
-  input: "Hello!"
-})
+conn.post("/v1/responses", { model: "gpt-5-mini", input: "Hello!" })
 ```
 
-
-
-### Option 3: Manual tracking
+Place `llm_cost_tracker` inside the Faraday stack where it can see the final response body. For streaming APIs, tracking requires the final body to expose provider usage; otherwise the gem warns and skips — use manual tracking there.
 
-
+### Manual tracking
 
 ```ruby
 LlmCostTracker.track(

@@ -169,332 +126,215 @@ LlmCostTracker.track(
 ```ruby
 # config/initializers/llm_cost_tracker.rb
 LlmCostTracker.configure do |config|
-
-  config.storage_backend = :active_record
-
-  # Default tags on every event
+  config.storage_backend = :active_record # :log (default), :active_record, :custom
   config.default_tags = { app: "my_app", environment: Rails.env }
 
-  # Monthly budget in USD
   config.monthly_budget = 500.00
-  config.budget_exceeded_behavior = :notify
-  config.storage_error_behavior
-  config.unknown_pricing_behavior = :warn
+  config.budget_exceeded_behavior = :notify # :notify, :raise, :block_requests
+  config.storage_error_behavior = :warn # :ignore, :warn, :raise
+  config.unknown_pricing_behavior = :warn # :ignore, :warn, :raise
 
-  # Alert callback
   config.on_budget_exceeded = ->(data) {
-    SlackNotifier.notify(
-      "#alerts",
-      "🚨 LLM budget exceeded! $#{data[:monthly_total].round(2)} / $#{data[:budget]}"
-    )
+    SlackNotifier.notify("#alerts", "🚨 LLM budget $#{data[:monthly_total].round(2)} / $#{data[:budget]}")
   }
 
-  # Override pricing for custom/fine-tuned models (per 1M tokens)
   config.prices_file = Rails.root.join("config/llm_cost_tracker_prices.yml")
   config.pricing_overrides = {
     "ft:gpt-4o-mini:my-org" => { input: 0.30, cached_input: 0.15, output: 1.20 }
   }
 
-  #
+  # Built-in: openrouter.ai, api.deepseek.com
   config.openai_compatible_providers["llm.my-company.com"] = "internal_gateway"
 end
 ```
 
-Pricing is best-effort
+Pricing is best-effort. OpenRouter-style IDs like `openai/gpt-4o-mini` are normalized to built-in names when possible. Use `prices_file` / `pricing_overrides` for fine-tunes, gateway-specific IDs, enterprise discounts, batch pricing, or models the gem doesn't know.
 
-
-
-```ruby
-config.storage_error_behavior = :warn   # default
-config.storage_error_behavior = :raise  # fail fast with StorageError
-config.storage_error_behavior = :ignore # skip storage failures silently
-```
+`storage_error_behavior = :warn` (default) lets LLM responses continue if storage fails; `:raise` exposes `StorageError#original_error`.
 
-
-
-Unknown model pricing is visible by default:
-
-```ruby
-config.unknown_pricing_behavior = :warn   # default
-config.unknown_pricing_behavior = :raise  # fail fast with UnknownPricingError
-config.unknown_pricing_behavior = :ignore # keep tracking tokens silently
-```
-
-When pricing is unknown, the event can still be recorded with token counts, but `cost` is `nil` and budget enforcement is skipped for that event. Use `prices_file` or `pricing_overrides` to ensure all production models are priced. Check this ActiveRecord query for a list of unpriced models in your data:
+Unknown pricing still records token counts, but `cost` is `nil` and budget guardrails skip that event. Find unpriced models:
 
 ```ruby
 LlmCostTracker::LlmApiCall.unknown_pricing.group(:model).count
 ```
 
-### Keeping
-
-Built-in prices live in `lib/llm_cost_tracker/prices.json`, with `updated_at`, `unit`, `currency`, and source URLs in the file metadata. The gem does not fetch pricing on boot; that keeps it self-hosted and avoids hidden external dependencies.
+### Keeping prices current
 
-For production
+Built-in prices are in `lib/llm_cost_tracker/prices.json`. The gem never fetches pricing on boot. For production, generate a local overrides file and point the gem at it:
 
 ```bash
bin/rails generate llm_cost_tracker:prices
 ```
 
-```ruby
-config.prices_file = Rails.root.join("config/llm_cost_tracker_prices.yml")
-```
-
-Example JSON:
-
 ```json
 {
-  "metadata": {
-    "updated_at": "2026-04-18",
-    "currency": "USD",
-    "unit": "1M tokens"
-  },
+  "metadata": { "updated_at": "2026-04-18", "currency": "USD", "unit": "1M tokens" },
   "models": {
-    "my-gateway/gpt-4o-mini": {
-      "input": 0.20,
-      "cached_input": 0.10,
-      "output": 0.80
-    }
+    "my-gateway/gpt-4o-mini": { "input": 0.20, "cached_input": 0.10, "output": 0.80 }
   }
 }
 ```
 
-`pricing_overrides`
+`pricing_overrides` has the highest precedence; use it for small Ruby-only tweaks, `prices_file` for broader tables.
 
-## Budget
+## Budget enforcement
 
 ```ruby
-
-
-
-config.budget_exceeded_behavior = :block_requests
-end
+config.storage_backend = :active_record
+config.monthly_budget = 100.00
+config.budget_exceeded_behavior = :block_requests
 ```
 
-
-
-- `:
-- `:raise` — records the event, then raises `LlmCostTracker::BudgetExceededError` when the month is over budget.
-- `:block_requests` — blocks Faraday LLM requests before the HTTP call when the ActiveRecord monthly total has already reached the budget. If a request pushes the month over budget, it also raises after recording the event.
-
-`BudgetExceededError` exposes `monthly_total`, `budget`, and `last_event`:
+- `:notify` — fire `on_budget_exceeded` after an event pushes the month over budget.
+- `:raise` — record the event, then raise `BudgetExceededError`.
+- `:block_requests` — block preflight when the stored monthly total is already over budget; still raises post-response on the event that crosses the line. Needs `:active_record` storage.
 
 ```ruby
-begin
-  client.chat(...)
 rescue LlmCostTracker::BudgetExceededError => e
-
-end
+  # e.monthly_total, e.budget, e.last_event
 ```
 
-
+`:block_requests` is best-effort under concurrency, not a transactional cap. Use provider/gateway-level limits for strict quotas.
 
-
-
-## Querying Costs (ActiveRecord)
-
-Print a quick terminal report:
+## Querying costs
 
 ```bash
 bin/rails llm_cost_tracker:report
-
-# Optional: change the window
 DAYS=7 bin/rails llm_cost_tracker:report
 ```
 
-Example:
-
-```text
-LLM Cost Report (last 30 days)
-
-Total cost: $127.420000
-Requests: 4,218
-Avg latency: 812ms
-Unknown pricing: 0
-
-By provider:
-  openai     $96.220000
-  anthropic  $31.200000
-```
-
-Or query the ledger directly:
-
 ```ruby
-# Today's total spend
 LlmCostTracker::LlmApiCall.today.total_cost
-# => 12.45
-
-# Cost breakdown by model this month
 LlmCostTracker::LlmApiCall.this_month.cost_by_model
-# => { "gpt-4o" => 8.20, "claude-sonnet-4-6" => 4.25 }
-
-# Cost by provider
 LlmCostTracker::LlmApiCall.this_month.cost_by_provider
-# => { "openai" => 8.20, "anthropic" => 4.25 }
 
-#
+# Group / sum by any tag
+LlmCostTracker::LlmApiCall.this_month.group_by_tag("feature").sum(:total_cost)
+LlmCostTracker::LlmApiCall.this_month.cost_by_tag("feature") # with "(untagged)" bucket
+
+# Period grouping (SQL-side)
+LlmCostTracker::LlmApiCall.this_month.group_by_period(:day).sum(:total_cost)
+LlmCostTracker::LlmApiCall.group_by_period(:month).sum(:total_cost)
 LlmCostTracker::LlmApiCall.daily_costs(days: 7)
-# => { "2026-04-10" => 1.5, "2026-04-11" => 2.3, ... }
 
-# Latency
+# Latency
 LlmCostTracker::LlmApiCall.with_latency.average_latency_ms
 LlmCostTracker::LlmApiCall.this_month.latency_by_model
 
-#
+# Tag filters
 LlmCostTracker::LlmApiCall.by_tag("feature", "chat").this_month.total_cost
-
-# Filter by user
-LlmCostTracker::LlmApiCall.by_tag("user_id", "42").today.total_cost
-LlmCostTracker::LlmApiCall.by_user(42).today.total_cost
-
-# Filter by multiple tags
 LlmCostTracker::LlmApiCall.by_tags(user_id: 42, feature: "chat").this_month.total_cost
 
-#
-LlmCostTracker::LlmApiCall.by_feature("summarizer").this_month.total_cost
-
-# Find models without pricing
-LlmCostTracker::LlmApiCall.unknown_pricing.group(:model).count
-LlmCostTracker::LlmApiCall.with_cost.this_month.total_cost
-
-# Custom date range
+# Range
 LlmCostTracker::LlmApiCall.between(1.week.ago, Time.current).cost_by_model
 ```
 
-### Tag
+### Tag storage
 
-
+New installs use `jsonb` + GIN on PostgreSQL:
 
 ```ruby
 t.jsonb :tags, null: false, default: {}
 add_index :llm_api_calls, :tags, using: :gin
 ```
 
-On
+On other adapters tags fall back to JSON in a text column. `by_tag` uses JSONB containment on PG, text matching elsewhere.
 
-
+Upgrade an existing install:
 
 ```bash
-bin/rails generate llm_cost_tracker:upgrade_tags_to_jsonb
+bin/rails generate llm_cost_tracker:upgrade_tags_to_jsonb # PG: text → jsonb + GIN
+bin/rails generate llm_cost_tracker:upgrade_cost_precision # widen cost columns
+bin/rails generate llm_cost_tracker:add_latency_ms
 bin/rails db:migrate
 ```
 
-
+## Dashboard (optional)
 
-
+Opt-in Rails Engine. Plain ERB, inline CSS, no JS. Requires Rails 7.1+; the core middleware works without Rails.
 
-```
-
-
+```ruby
+# config/application.rb (or an initializer)
+require "llm_cost_tracker/engine"
+
+# config/routes.rb
+mount LlmCostTracker::Engine => "/llm-costs"
 ```
 
-
+Routes (GET-only; CSV export included):
 
-
-
-
+- `/llm-costs` — overview: spend (with delta vs previous period), calls, avg cost/call, avg latency, unknown pricing, budget, daily trend, provider rollup, top models
+- `/llm-costs/models` — by provider + model; sortable by spend, volume, avg cost, latency
+- `/llm-costs/calls` — filterable + paginated; outlier sort modes (expensive, largest input/output, slowest, unknown pricing); CSV export
+- `/llm-costs/calls/:id` — details
+- `/llm-costs/tags` — tag keys present in the dataset (PG/SQLite native, MySQL via in-Ruby fallback)
+- `/llm-costs/tags/:key` — breakdown by values of a given tag key
+- `/llm-costs/data_quality` — unknown pricing share, untagged calls, missing latency
+
+> ⚠️ **No built-in auth.** Tags carry whatever your app puts in them. Protect the mount point with your app's auth.
+
+### Basic auth
+
+```ruby
+authenticated = ->(req) {
+  ActionController::HttpAuthentication::Basic.authenticate(req) do |name, password|
+    ActiveSupport::SecurityUtils.secure_compare(name, ENV.fetch("LLM_DASHBOARD_USER")) &
+      ActiveSupport::SecurityUtils.secure_compare(password, ENV.fetch("LLM_DASHBOARD_PASSWORD"))
+  end
+}
+constraints(authenticated) { mount LlmCostTracker::Engine => "/llm-costs" }
 ```
 
-
+### Devise
+
+```ruby
+authenticate :user, ->(user) { user.admin? } do
+  mount LlmCostTracker::Engine => "/llm-costs"
+end
+```
 
-
+## ActiveSupport::Notifications
 
 ```ruby
 ActiveSupport::Notifications.subscribe("llm_request.llm_cost_tracker") do |*, payload|
   # payload =>
   # {
-  #   provider: "openai",
-  #
-  #   input_tokens: 150,
-  #   output_tokens: 42,
-  #   total_tokens: 192,
-  #   latency_ms: 248,
+  #   provider: "openai", model: "gpt-4o",
+  #   input_tokens: 150, output_tokens: 42, total_tokens: 192, latency_ms: 248,
   #   cost: {
-  #     input_cost: 0.000375,
-  #
-  #
-  #     cache_creation_input_cost: 0.0,
-  #     output_cost: 0.00042,
-  #     total_cost: 0.000795,
-  #     currency: "USD"
+  #     input_cost: 0.000375, cached_input_cost: 0.0,
+  #     cache_read_input_cost: 0.0, cache_creation_input_cost: 0.0,
+  #     output_cost: 0.00042, total_cost: 0.000795, currency: "USD"
   #   },
   #   tags: { feature: "chat", user_id: 42 },
   #   tracked_at: 2026-04-16 14:30:00 UTC
   # }
-
-  StatsD.increment("llm.requests", tags: ["provider:#{payload[:provider]}"])
-  StatsD.histogram("llm.cost", payload[:cost][:total_cost])
 end
 ```
 
-## Custom
+## Custom storage backend
 
 ```ruby
-
-
-
-
-
-
-
-    latency_ms: event[:latency_ms]
-  },
-  tags: { provider: event[:provider], model: event[:model] }
-})
-}
-end
+config.storage_backend = :custom
+config.custom_storage = ->(event) {
+  InfluxDB.write("llm_costs",
+    values: { cost: event.cost&.total_cost, tokens: event.total_tokens, latency_ms: event.latency_ms },
+    tags: { provider: event.provider, model: event.model }
+  )
+}
 ```
 
-## OpenAI-
+## OpenAI-compatible providers
 
 ```ruby
-
-# Built in:
-# "openrouter.ai" => "openrouter"
-# "api.deepseek.com" => "deepseek"
-config.openai_compatible_providers["gateway.example.com"] = "internal_gateway"
-end
+config.openai_compatible_providers["gateway.example.com"] = "internal_gateway"
 ```
 
-
-
-- `prompt_tokens` / `completion_tokens` / `total_tokens`
-- `input_tokens` / `output_tokens` / `total_tokens`
-- optional cached input details when the response includes them
-
-This covers OpenRouter, DeepSeek, and private gateways that expose OpenAI-style Chat Completions, Responses, Completions, or Embeddings endpoints.
-
-## Safety Guarantees
+Configured hosts are parsed with the OpenAI-compatible usage shape (`prompt_tokens` / `completion_tokens` / `total_tokens`, `input_tokens` / `output_tokens`, and optional cached-input details). Covers OpenRouter, DeepSeek, and private gateways exposing Chat Completions / Responses / Completions / Embeddings.
 
-
-- It does not store prompt or response bodies.
-- Faraday responses are not modified.
-- Storage failures are non-fatal by default via `storage_error_behavior = :warn`.
-- Budget and unknown-pricing errors are raised only when you opt into `:raise` or `:block_requests`.
-- Pricing is local and best-effort; use `prices_file` or `pricing_overrides` for production-specific rates.
-- Streaming/SSE calls are skipped with a warning when the final usage payload is not readable by Faraday.
+## Custom parser
 
-
-
-- Use `storage_backend = :active_record` in production.
-- Set `monthly_budget` and choose `budget_exceeded_behavior`.
-- Treat `:block_requests` as best-effort in concurrent systems, not a strict quota.
-- Keep `unknown_pricing_behavior = :warn` or `:raise` until pricing overrides are complete.
-- Add `pricing_overrides` for custom, fine-tuned, gateway-specific, or newly released models.
-- Tag calls with `tenant_id`, `user_id`, and `feature` where possible.
-- Check `LlmCostTracker::LlmApiCall.unknown_pricing.group(:model).count` after deploys.
-- Track `latency_ms` and watch `latency_by_model` for slow or degraded providers.
-
-## Known Limitations
-
-- `:block_requests` is best-effort under concurrency. For hard caps, use an external quota system, provider-side limits, or a gateway-level budget.
-- Streaming/SSE calls are tracked only when Faraday exposes a final response body with usage data. Otherwise the gem warns and skips automatic tracking.
-- Anthropic cache creation TTL variants are not modeled separately yet; 1-hour cache writes may be underestimated compared with the default 5-minute cache write rate.
-- OpenAI reasoning tokens are included in output-token totals when providers report them that way, but separate reasoning-token attribution is not stored yet.
-
-## Adding a Custom Provider Parser
-
-Use this for providers that are not OpenAI-compatible and return a different usage shape.
+For providers with a non-OpenAI usage shape:
 
 ```ruby
 class AcmeParser < LlmCostTracker::Parsers::Base

@@ -505,73 +345,58 @@ class AcmeParser < LlmCostTracker::Parsers::Base
   def parse(request_url, request_body, response_status, response_body)
     return nil unless response_status == 200
 
-
-    usage = response["usage"]
+    usage = safe_json_parse(response_body)&.dig("usage")
     return nil unless usage
 
-
+    LlmCostTracker::ParsedUsage.build(
       provider: "acme",
-      model:
+      model: safe_json_parse(response_body)["model"],
       input_tokens: usage["input"] || 0,
       output_tokens: usage["output"] || 0
-
+    )
   end
 end
 
-# Register it
 LlmCostTracker::Parsers::Registry.register(AcmeParser.new)
 ```
 
-## Supported
+## Supported providers
 
 | Provider | Auto-detected | Models with pricing |
-
+|---|:---:|---|
 | OpenAI | ✅ | GPT-5.2/5.1/5, GPT-5 mini/nano, GPT-4.1, GPT-4o, o1/o3/o4-mini |
-| OpenRouter | ✅ |
-| DeepSeek | ✅ |
+| OpenRouter | ✅ | OpenAI-compatible usage; provider-prefixed OpenAI model IDs normalized when possible |
+| DeepSeek | ✅ | OpenAI-compatible usage; add `pricing_overrides` for DeepSeek models |
 | OpenAI-compatible hosts | 🔧 | Configure `openai_compatible_providers` |
 | Anthropic | ✅ | Claude Opus 4.6/4.1/4, Sonnet 4.6/4.5/4, Haiku 4.5, Claude 3.x |
 | Google Gemini | ✅ | Gemini 2.5 Pro/Flash/Flash-Lite, 2.0 Flash/Flash-Lite, 1.5 Pro/Flash |
-| Any other | 🔧 |
-
-Supported endpoint families:
+| Any other | 🔧 | Custom parser |
 
-
-- OpenAI-compatible: Chat Completions, Responses, Completions, Embeddings
-- Anthropic: Messages
-- Google Gemini: `generateContent` responses with `usageMetadata`
+Endpoints: OpenAI Chat Completions / Responses / Completions / Embeddings; OpenAI-compatible equivalents; Anthropic Messages; Gemini `generateContent` with `usageMetadata`.
 
-##
+## Safety
 
-
-
-
-
-
-Calculates cost
-  ↓
-ActiveSupport::Notifications
-ActiveRecord / Log / Custom
-```
+- No external HTTP calls.
+- No prompt or response bodies stored.
+- Faraday responses not modified.
+- Storage failures non-fatal by default (`storage_error_behavior = :warn`).
+- Budget / unknown-pricing errors are raised only when you opt in.
 
-
+## Known limitations
 
-
+- `:block_requests` is best-effort under concurrency; use an external quota system for hard caps.
+- Streaming/SSE tracked only when Faraday exposes a final body with usage.
+- Anthropic cache TTL variants (1h vs 5min writes) not modeled separately.
+- OpenAI reasoning tokens included in output totals; separate reasoning-token attribution not stored.
 
 ## Development
 
 ```bash
-git clone https://github.com/sergey-homenko/llm_cost_tracker.git
-cd llm_cost_tracker
 bundle install
 bundle exec rspec
 bundle exec rubocop
 ```
 
-## Contributing
-
-Bug reports and pull requests are welcome on [GitHub](https://github.com/sergey-homenko/llm_cost_tracker).
-
 ## License
 
-
+MIT. See [LICENSE.txt](LICENSE.txt).