llm_cost_tracker 0.1.0 → 0.1.1
- checksums.yaml +4 -4
- data/.rubocop.yml +44 -0
- data/CHANGELOG.md +25 -0
- data/README.md +45 -19
- data/Rakefile +3 -1
- data/lib/llm_cost_tracker/llm_api_call.rb +3 -1
- data/lib/llm_cost_tracker/middleware/faraday.rb +3 -4
- data/lib/llm_cost_tracker/parsers/anthropic.rb +6 -4
- data/lib/llm_cost_tracker/parsers/gemini.rb +4 -3
- data/lib/llm_cost_tracker/parsers/openai.rb +15 -7
- data/lib/llm_cost_tracker/pricing.rb +54 -19
- data/lib/llm_cost_tracker/storage/active_record_store.rb +13 -1
- data/lib/llm_cost_tracker/tracker.rb +45 -8
- data/lib/llm_cost_tracker/version.rb +1 -1
- data/llm_cost_tracker.gemspec +8 -7
- metadata +24 -23
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: f7b40f1010c79358da89ffdd10637f59fa90e24aa0f50aec364828d2e2cbf5b9
+  data.tar.gz: d12d1cf407b87afd6e1084c22ceda143c7ab9bf5e6ea6825d70a8e24969cafa5
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 949157f0a6718bc03f8f0d825982ed732df2754ddf1e4ee07b18522b0e20cc4367a97c599071bcda95bbdda4dde0e160f5d586a9b42a0dd8b1f3c89910286547
+  data.tar.gz: 9ea9007142d157446271bcf81bc4786e4b22a00f6e353dc2e3dc26c1be12d9abf88aed8d8852da37778b5fe3f71fcd4422c6153d8a531f195adaf0d0b9bb8dd2
data/.rubocop.yml
ADDED
@@ -0,0 +1,44 @@
+AllCops:
+  NewCops: enable
+  TargetRubyVersion: 3.1
+  SuggestExtensions: false
+  UseCache: false
+  Exclude:
+    - "tmp/**/*"
+    - "vendor/**/*"
+    - "pkg/**/*"
+
+Style/Documentation:
+  Enabled: false
+
+Style/StringLiterals:
+  EnforcedStyle: double_quotes
+
+Metrics/BlockLength:
+  Exclude:
+    - "*.gemspec"
+    - "spec/**/*.rb"
+
+Metrics/MethodLength:
+  Max: 25
+
+Metrics/AbcSize:
+  Max: 45
+
+Metrics/ClassLength:
+  Max: 130
+
+Metrics/CyclomaticComplexity:
+  Max: 10
+
+Metrics/ParameterLists:
+  Max: 6
+
+Metrics/PerceivedComplexity:
+  Max: 10
+
+Gemspec/DevelopmentDependencies:
+  Enabled: false
+
+Layout/HashAlignment:
+  Enabled: false
data/CHANGELOG.md
CHANGED
@@ -5,6 +5,31 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.1.1] - 2026-04-17
+
+### Fixed
+
+- Lazy-load ActiveRecord storage so `storage_backend = :active_record` persists events reliably.
+- Avoid double-counting the latest ActiveRecord event in monthly budget callbacks.
+- Track OpenAI Responses API usage via `/v1/responses`.
+- Parse OpenAI cached input token details for cache-aware cost estimates.
+- Parse Anthropic cache read and cache creation token usage under canonical metadata keys.
+- Parse Gemini cached content token usage when present.
+- Store ActiveRecord tag values as strings so `by_tag("user_id", "42")` works for numeric IDs.
+
+### Changed
+
+- Refresh built-in pricing for current OpenAI, Anthropic, and Gemini models.
+- Add cache-aware cost calculation fields for cached input, cache reads, and cache creation.
+- Tighten OpenAI URL matching to supported endpoint families only.
+- Reposition README around self-hosted Rails/Ruby cost tracking for Faraday-based clients.
+
+### Added
+
+- Add ActiveRecord integration specs for persistence, tag querying, and budget callbacks.
+- Add RuboCop configuration, rake task, and CI lint step.
+- Require MFA metadata for RubyGems publishing.
+
 ## [0.1.0] - 2026-04-16
 
 ### Added
data/README.md
CHANGED
@@ -1,21 +1,25 @@
 # LlmCostTracker
 
-**
+**Self-hosted LLM API cost tracking for Ruby and Rails apps.**
 
-Track token usage and costs for
+Track token usage and estimated costs for OpenAI, Anthropic, and Google Gemini calls from Faraday-based Ruby clients. Store the data in your own database, tag calls by user or feature, and get budget alerts without adding an external SaaS or proxy.
 
 [](https://rubygems.org/gems/llm_cost_tracker)
+[](https://github.com/sergey-homenko/llm_cost_tracker/actions)
 
 ## Why?
 
-Every Rails app integrating LLMs faces the same problem: **you don't know how much AI is costing you** until the invoice arrives.
+Every Rails app integrating LLMs faces the same problem: **you don't know how much AI is costing you** until the invoice arrives. Full observability platforms like Langfuse and Helicone are powerful, but sometimes you just need a small Rails-native cost ledger that lives in your app database.
 
 `llm_cost_tracker` takes a different approach:
 
-- 🔌 **
+- 🔌 **Faraday-native** — intercepts LLM HTTP responses without changing the response
 - 🏠 **Self-hosted** — your data stays in your database
-- 🧩 **
-- 
+- 🧩 **Client-light** — works with raw Faraday and LLM gems that expose their Faraday connection
+- 🏷️ **Attribution-first** — tag spend by feature, tenant, user, job, or environment
+- 💸 **Budget-aware** — emit notifications and callbacks before spend surprises you
+
+This gem is intentionally not a tracing platform, prompt CMS, eval system, or gateway. It focuses on the boring but valuable question: "What did this app spend on LLM APIs, and where did that spend come from?"
 
 ## Installation
 
@@ -34,9 +38,9 @@ bin/rails db:migrate
 
 ## Quick Start
 
-### Option 1: Faraday Middleware
+### Option 1: Faraday Middleware
 
-If your LLM client uses Faraday
+If your LLM client uses Faraday, add the middleware to that connection:
 
 ```ruby
 conn = Faraday.new(url: "https://api.openai.com") do |f|
@@ -46,16 +50,16 @@ conn = Faraday.new(url: "https://api.openai.com") do |f|
   f.adapter Faraday.default_adapter
 end
 
-# Every request through this connection is
-response = conn.post("/v1/
-  model: "gpt-
-
+# Every supported LLM request through this connection is tracked
+response = conn.post("/v1/responses", {
+  model: "gpt-5-mini",
+  input: "Hello!"
 })
 ```
 
 ### Option 2: Patch an existing client
 
-
+Some LLM gems expose their Faraday connection. For example, with `ruby-openai`:
 
 ```ruby
 # config/initializers/openai.rb
@@ -68,6 +72,8 @@ OpenAI.configure do |config|
 end
 ```
 
+If a client does not expose its HTTP connection, use manual tracking or register a custom parser around the HTTP layer you control.
+
 ### Option 3: Manual tracking
 
 For non-Faraday clients, track manually:
@@ -78,6 +84,7 @@ LlmCostTracker.track(
   model: "claude-sonnet-4-6",
   input_tokens: 1500,
   output_tokens: 320,
+  cache_read_input_tokens: 1200,
   feature: "summarizer",
   user_id: current_user.id
 )
@@ -107,11 +114,13 @@ LlmCostTracker.configure do |config|
 
   # Override pricing for custom/fine-tuned models (per 1M tokens)
   config.pricing_overrides = {
-    "ft:gpt-4o-mini:my-org" => { input: 0.30, output: 1.20 }
+    "ft:gpt-4o-mini:my-org" => { input: 0.30, cached_input: 0.15, output: 1.20 }
   }
 end
 ```
 
+Pricing is best-effort and based on public provider pricing for standard token usage. Providers change pricing frequently, and some features have extra charges or tiered pricing. Use `pricing_overrides` for fine-tunes, gateway-specific model IDs, enterprise discounts, batch pricing, long-context premiums, and any model this gem does not know yet.
+
 ## Querying Costs (ActiveRecord)
 
 ```ruby
@@ -154,7 +163,15 @@ ActiveSupport::Notifications.subscribe("llm_request.llm_cost_tracker") do |*, pa
 #   input_tokens: 150,
 #   output_tokens: 42,
 #   total_tokens: 192,
-#   cost: {
+#   cost: {
+#     input_cost: 0.000375,
+#     cached_input_cost: 0.0,
+#     cache_read_input_cost: 0.0,
+#     cache_creation_input_cost: 0.0,
+#     output_cost: 0.00042,
+#     total_cost: 0.000795,
+#     currency: "USD"
+#   },
 #   tags: { feature: "chat", user_id: 42 },
 #   tracked_at: 2026-04-16 14:30:00 UTC
 # }
@@ -210,11 +227,17 @@ LlmCostTracker::Parsers::Registry.register(DeepSeekParser.new)
 
 | Provider | Auto-detected | Models with pricing |
 |----------|:---:|---|
-| OpenAI | ✅ | GPT-
-| Anthropic | ✅ | Claude Opus 4.6, Sonnet 4.6, Haiku 4.5, Claude 3.
-| Google Gemini | ✅ | Gemini 2.5 Pro/Flash, 2.0 Flash, 1.5 Pro/Flash |
+| OpenAI | ✅ | GPT-5.2/5.1/5, GPT-5 mini/nano, GPT-4.1, GPT-4o, o1/o3/o4-mini |
+| Anthropic | ✅ | Claude Opus 4.6/4.1/4, Sonnet 4.6/4.5/4, Haiku 4.5, Claude 3.x |
+| Google Gemini | ✅ | Gemini 2.5 Pro/Flash/Flash-Lite, 2.0 Flash/Flash-Lite, 1.5 Pro/Flash |
 | Any other | 🔧 | Via custom parser (see above) |
 
+Supported endpoint families:
+
+- OpenAI: Chat Completions, Responses, Completions, Embeddings
+- Anthropic: Messages
+- Google Gemini: `generateContent` responses with `usageMetadata`
+
 ## How It Works
 
 ```
@@ -228,7 +251,9 @@ Your App → Faraday → [LlmCostTracker Middleware] → LLM API
          ActiveRecord / Log / Custom
 ```
 
-The middleware intercepts **outgoing** HTTP responses (not incoming requests), parses the
+The middleware intercepts **outgoing** HTTP responses (not incoming Rails requests), parses the provider usage object, looks up pricing, and records the event. It never modifies requests or responses.
+
+For streaming APIs, tracking depends on the final response body including provider usage data. If the client consumes server-sent events without exposing the final usage payload to Faraday, use manual tracking.
 
 ## Development
 
@@ -237,6 +262,7 @@ git clone https://github.com/sergey-homenko/llm_cost_tracker.git
 cd llm_cost_tracker
 bundle install
 bundle exec rspec
+bundle exec rubocop
 ```
 
 ## Contributing
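The cost fields in the notification payload above follow from per-million-token arithmetic; a minimal sketch, assuming the $2.50 input / $10.00 output per-1M rates that the pricing table lists for `gpt-4o`:

```ruby
# Reproduce the example payload's cost fields from per-1M-token rates.
input_rate = 2.50    # USD per 1M input tokens (assumed gpt-4o rate)
output_rate = 10.00  # USD per 1M output tokens

input_tokens = 150
output_tokens = 42

input_cost = (input_tokens / 1_000_000.0) * input_rate    # 0.000375
output_cost = (output_tokens / 1_000_000.0) * output_rate # 0.00042
total_cost = input_cost + output_cost                     # 0.000795

puts total_cost.round(8)
```

The cache-related cost fields are zero here because the example call reports no cached or cache-created tokens.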
data/Rakefile
CHANGED

data/lib/llm_cost_tracker/llm_api_call.rb
CHANGED
@@ -9,7 +9,9 @@ module LlmCostTracker
     # Scopes for querying
     scope :by_provider, ->(provider) { where(provider: provider) }
     scope :by_model, ->(model) { where(model: model) }
-    scope :by_tag,
+    scope :by_tag, lambda { |key, value|
+      where("tags LIKE ?", "%\"#{key}\":\"#{value}\"%")
+    }
 
     scope :today, -> { where(tracked_at: Time.now.utc.beginning_of_day..) }
     scope :this_week, -> { where(tracked_at: Time.now.utc.beginning_of_week..) }
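The rewritten `by_tag` scope matches a substring of the JSON-serialized `tags` column. A minimal sketch of the pattern it builds, with `tag_match?` as a hypothetical stand-in for the SQL `LIKE`:

```ruby
require "json"

# Hypothetical stand-in for the LIKE pattern: %"key":"value"%
def tag_match?(tags_json, key, value)
  tags_json.include?("\"#{key}\":\"#{value}\"")
end

# Tag values are stored as strings (see the storage change in this release),
# so the serialized column looks like this:
stored = JSON.generate({ "feature" => "chat", "user_id" => "42" })

puts tag_match?(stored, "user_id", "42") # string value matches
puts tag_match?(stored, "user_id", 42)   # integer interpolates to "42", so it matches too
```

Because both the column and the pattern stringify values, `by_tag("user_id", 42)` and `by_tag("user_id", "42")` hit the same rows.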
data/lib/llm_cost_tracker/middleware/faraday.rb
CHANGED
@@ -19,9 +19,6 @@ module LlmCostTracker
       @app.call(request_env).on_complete do |response_env|
         process(request_url, request_body, response_env)
       end
-    rescue StandardError => e
-      # Never break the actual request — log and re-raise
-      raise e
     end
 
     private
@@ -46,7 +43,9 @@ module LlmCostTracker
         metadata: @tags.merge(parsed.except(:provider, :model, :input_tokens, :output_tokens, :total_tokens))
       )
     rescue StandardError => e
-
+      return unless LlmCostTracker.configuration.log_level == :debug
+
+      warn "[LlmCostTracker] Error processing response: #{e.message}"
     end
 
     def read_body(body)
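The rescue change above means tracking failures never propagate to the caller and are only logged at debug level. A standalone sketch of that behavior, with `Config` as a hypothetical stand-in for `LlmCostTracker.configuration`:

```ruby
require "stringio"

# Hypothetical minimal config standing in for LlmCostTracker.configuration.
Config = Struct.new(:log_level)

def process_safely(config, out)
  raise "boom" # simulate a parser failure inside post-processing
rescue StandardError => e
  return unless config.log_level == :debug

  out.puts "[LlmCostTracker] Error processing response: #{e.message}"
end

quiet = StringIO.new
process_safely(Config.new(:info), quiet)  # swallowed silently

noisy = StringIO.new
process_safely(Config.new(:debug), noisy) # logged
puts noisy.string
```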
data/lib/llm_cost_tracker/parsers/anthropic.rb
CHANGED
@@ -14,7 +14,7 @@ module LlmCostTracker
         false
       end
 
-      def parse(
+      def parse(_request_url, request_body, response_status, response_body)
         return nil unless response_status == 200
 
         response = safe_json_parse(response_body)
@@ -28,9 +28,11 @@ module LlmCostTracker
           model: response["model"] || request["model"],
           input_tokens: usage["input_tokens"] || 0,
           output_tokens: usage["output_tokens"] || 0,
-          total_tokens: (usage["input_tokens"] || 0) + (usage["output_tokens"] || 0)
-
-
+          total_tokens: (usage["input_tokens"] || 0) + (usage["output_tokens"] || 0) +
+                        (usage["cache_read_input_tokens"] || 0) +
+                        (usage["cache_creation_input_tokens"] || 0),
+          cache_read_input_tokens: usage["cache_read_input_tokens"],
+          cache_creation_input_tokens: usage["cache_creation_input_tokens"]
         }.compact
       end
     end
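The Anthropic total now folds in cache reads and cache writes, which the Messages API reports separately from billed `input_tokens`. A sketch with illustrative counts:

```ruby
# Example Anthropic usage hash (field names as in the diff; numbers illustrative).
usage = {
  "input_tokens" => 20,
  "output_tokens" => 50,
  "cache_read_input_tokens" => 1200,
  "cache_creation_input_tokens" => 0
}

# Mirror the parser's total: fresh input + output + cache reads + cache writes.
total_tokens = (usage["input_tokens"] || 0) + (usage["output_tokens"] || 0) +
               (usage["cache_read_input_tokens"] || 0) +
               (usage["cache_creation_input_tokens"] || 0)

puts total_tokens # 1270
```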
data/lib/llm_cost_tracker/parsers/gemini.rb
CHANGED
@@ -14,7 +14,7 @@ module LlmCostTracker
         false
       end
 
-      def parse(request_url,
+      def parse(request_url, _request_body, response_status, response_body)
         return nil unless response_status == 200
 
         response = safe_json_parse(response_body)
@@ -29,8 +29,9 @@ module LlmCostTracker
           model: model,
           input_tokens: usage["promptTokenCount"] || 0,
           output_tokens: usage["candidatesTokenCount"] || 0,
-          total_tokens: usage["totalTokenCount"] || 0
-
+          total_tokens: usage["totalTokenCount"] || 0,
+          cached_input_tokens: usage["cachedContentTokenCount"]
+        }.compact
       end
 
       private
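The `.compact` on the Gemini parser's result means `cached_input_tokens` only appears when `usageMetadata` actually carries `cachedContentTokenCount`. A quick sketch of that nil-dropping behavior:

```ruby
# Gemini usageMetadata without cached content (field names as in the diff).
usage = { "promptTokenCount" => 100, "candidatesTokenCount" => 30, "totalTokenCount" => 130 }

parsed = {
  input_tokens: usage["promptTokenCount"] || 0,
  output_tokens: usage["candidatesTokenCount"] || 0,
  total_tokens: usage["totalTokenCount"] || 0,
  cached_input_tokens: usage["cachedContentTokenCount"] # nil when absent
}.compact

puts parsed.key?(:cached_input_tokens) # false: compact removes nil entries
```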
data/lib/llm_cost_tracker/parsers/openai.rb
CHANGED
@@ -6,16 +6,16 @@ module LlmCostTracker
   module Parsers
     class Openai < Base
       HOSTS = %w[api.openai.com].freeze
-      TRACKED_PATHS = %w[/v1/chat/completions /v1/completions /v1/embeddings].freeze
+      TRACKED_PATHS = %w[/v1/chat/completions /v1/completions /v1/embeddings /v1/responses].freeze
 
       def match?(url)
         uri = URI.parse(url.to_s)
-        HOSTS.include?(uri.host) && TRACKED_PATHS.
+        HOSTS.include?(uri.host) && TRACKED_PATHS.include?(uri.path)
       rescue URI::InvalidURIError
         false
       end
 
-      def parse(
+      def parse(_request_url, request_body, response_status, response_body)
         return nil unless response_status == 200
 
         response = safe_json_parse(response_body)
@@ -27,10 +27,18 @@ module LlmCostTracker
         {
           provider: "openai",
           model: response["model"] || request["model"],
-          input_tokens: usage["prompt_tokens"] || 0,
-          output_tokens: usage["completion_tokens"] || 0,
-          total_tokens: usage["total_tokens"] || 0
-
+          input_tokens: usage["prompt_tokens"] || usage["input_tokens"] || 0,
+          output_tokens: usage["completion_tokens"] || usage["output_tokens"] || 0,
+          total_tokens: usage["total_tokens"] || 0,
+          cached_input_tokens: cached_input_tokens(usage)
+        }.compact
+      end
+
+      private
+
+      def cached_input_tokens(usage)
+        details = usage["prompt_tokens_details"] || usage["input_tokens_details"] || {}
+        details["cached_tokens"]
       end
     end
   end
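The new `cached_input_tokens` helper covers both OpenAI usage shapes: Chat Completions reports `prompt_tokens_details`, while the Responses API reports `input_tokens_details`. A sketch over both shapes (token counts illustrative):

```ruby
# Same helper as in the diff above.
def cached_input_tokens(usage)
  details = usage["prompt_tokens_details"] || usage["input_tokens_details"] || {}
  details["cached_tokens"]
end

chat_usage      = { "prompt_tokens" => 900, "prompt_tokens_details" => { "cached_tokens" => 800 } }
responses_usage = { "input_tokens" => 900, "input_tokens_details" => { "cached_tokens" => 640 } }
no_cache_usage  = { "prompt_tokens" => 10 }

puts cached_input_tokens(chat_usage)      # 800
puts cached_input_tokens(responses_usage) # 640
p cached_input_tokens(no_cache_usage)     # nil, dropped later by .compact
```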
data/lib/llm_cost_tracker/pricing.rb
CHANGED
@@ -6,43 +6,78 @@ module LlmCostTracker
   module Pricing
     PRICES = {
       # OpenAI
-      "gpt-
-      "gpt-
+      "gpt-5.2" => { input: 1.75, cached_input: 0.175, output: 14.00 },
+      "gpt-5.1" => { input: 1.25, cached_input: 0.125, output: 10.00 },
+      "gpt-5" => { input: 1.25, cached_input: 0.125, output: 10.00 },
+      "gpt-5-mini" => { input: 0.25, cached_input: 0.025, output: 2.00 },
+      "gpt-5-nano" => { input: 0.05, cached_input: 0.005, output: 0.40 },
+      "gpt-4.1" => { input: 2.00, cached_input: 0.50, output: 8.00 },
+      "gpt-4.1-mini" => { input: 0.40, cached_input: 0.10, output: 1.60 },
+      "gpt-4.1-nano" => { input: 0.10, cached_input: 0.025, output: 0.40 },
+      "gpt-4o-2024-05-13" => { input: 5.00, output: 15.00 },
+      "gpt-4o" => { input: 2.50, cached_input: 1.25, output: 10.00 },
+      "gpt-4o-mini" => { input: 0.15, cached_input: 0.075, output: 0.60 },
       "gpt-4-turbo" => { input: 10.00, output: 30.00 },
       "gpt-4" => { input: 30.00, output: 60.00 },
       "gpt-3.5-turbo" => { input: 0.50, output: 1.50 },
-      "o1" => { input: 15.00, output: 60.00 },
-      "o1-mini" => { input:
-      "o3
+      "o1" => { input: 15.00, cached_input: 7.50, output: 60.00 },
+      "o1-mini" => { input: 1.10, cached_input: 0.55, output: 4.40 },
+      "o3" => { input: 2.00, cached_input: 0.50, output: 8.00 },
+      "o3-mini" => { input: 1.10, cached_input: 0.55, output: 4.40 },
+      "o4-mini" => { input: 1.10, cached_input: 0.275, output: 4.40 },
 
       # Anthropic
-      "claude-sonnet-4-6" => { input: 3.00, output: 15.00 },
-      "claude-opus-4-6" => { input:
-      "claude-
-      "claude-
-      "claude-
-      "claude-
+      "claude-sonnet-4-6" => { input: 3.00, output: 15.00, cache_read_input: 0.30, cache_creation_input: 3.75 },
+      "claude-opus-4-6" => { input: 5.00, output: 25.00, cache_read_input: 0.50, cache_creation_input: 6.25 },
+      "claude-opus-4-1" => { input: 15.00, output: 75.00, cache_read_input: 1.50, cache_creation_input: 18.75 },
+      "claude-opus-4" => { input: 15.00, output: 75.00, cache_read_input: 1.50, cache_creation_input: 18.75 },
+      "claude-sonnet-4-5" => { input: 3.00, output: 15.00, cache_read_input: 0.30, cache_creation_input: 3.75 },
+      "claude-sonnet-4" => { input: 3.00, output: 15.00, cache_read_input: 0.30, cache_creation_input: 3.75 },
+      "claude-haiku-4-5" => { input: 1.00, output: 5.00, cache_read_input: 0.10, cache_creation_input: 1.25 },
+      "claude-3-7-sonnet" => { input: 3.00, output: 15.00, cache_read_input: 0.30, cache_creation_input: 3.75 },
+      "claude-3-5-sonnet" => { input: 3.00, output: 15.00, cache_read_input: 0.30, cache_creation_input: 3.75 },
+      "claude-3-5-haiku" => { input: 0.80, output: 4.00, cache_read_input: 0.08, cache_creation_input: 1.00 },
+      "claude-3-opus" => { input: 15.00, output: 75.00, cache_read_input: 1.50, cache_creation_input: 18.75 },
 
       # Google Gemini
-      "gemini-2.5-pro" => { input: 1.25, output: 10.00 },
-      "gemini-2.5-flash" => { input: 0.
-      "gemini-2.
+      "gemini-2.5-pro" => { input: 1.25, cached_input: 0.125, output: 10.00 },
+      "gemini-2.5-flash" => { input: 0.30, cached_input: 0.03, output: 2.50 },
+      "gemini-2.5-flash-lite" => { input: 0.10, cached_input: 0.01, output: 0.40 },
+      "gemini-2.0-flash" => { input: 0.10, cached_input: 0.025, output: 0.40 },
+      "gemini-2.0-flash-lite" => { input: 0.075, output: 0.30 },
       "gemini-1.5-pro" => { input: 1.25, output: 5.00 },
-      "gemini-1.5-flash" => { input: 0.075, output: 0.30 }
+      "gemini-1.5-flash" => { input: 0.075, output: 0.30 }
     }.freeze
 
     class << self
-      def cost_for(model:, input_tokens:, output_tokens:
+      def cost_for(model:, input_tokens:, output_tokens:, cached_input_tokens: 0,
+                   cache_read_input_tokens: 0, cache_creation_input_tokens: 0)
         prices = lookup(model)
         return nil unless prices
 
-
+        cached_input_tokens = cached_input_tokens.to_i
+        cache_read_input_tokens = cache_read_input_tokens.to_i
+        cache_creation_input_tokens = cache_creation_input_tokens.to_i
+        uncached_input_tokens = [input_tokens.to_i - cached_input_tokens, 0].max
+
+        input_cost = (uncached_input_tokens.to_f / 1_000_000) * prices[:input]
+        cached_input_cost = (cached_input_tokens.to_f / 1_000_000) *
+                            (prices[:cached_input] || prices[:input])
+        cache_read_input_cost = (cache_read_input_tokens.to_f / 1_000_000) *
+                                (prices[:cache_read_input] || prices[:cached_input] || prices[:input])
+        cache_creation_input_cost = (cache_creation_input_tokens.to_f / 1_000_000) *
+                                    (prices[:cache_creation_input] || prices[:input])
         output_cost = (output_tokens.to_f / 1_000_000) * prices[:output]
+        total_cost = input_cost + cached_input_cost + cache_read_input_cost +
+                     cache_creation_input_cost + output_cost
 
         {
           input_cost: input_cost.round(8),
+          cached_input_cost: cached_input_cost.round(8),
+          cache_read_input_cost: cache_read_input_cost.round(8),
+          cache_creation_input_cost: cache_creation_input_cost.round(8),
           output_cost: output_cost.round(8),
-          total_cost:
+          total_cost: total_cost.round(8),
           currency: "USD"
         }
       end
@@ -62,7 +97,7 @@ module LlmCostTracker
       def fuzzy_match(model)
         return nil unless model
 
-        PRICES.each do |key, value|
+        PRICES.sort_by { |key, _value| -key.length }.each do |key, value|
          return value if model.start_with?(key)
        end
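Two behaviors above are worth pinning down with numbers: cached input tokens are priced at the discounted rate and subtracted from the billed input, and `fuzzy_match` now prefers the longest matching key prefix (so a dated `gpt-4o-mini-*` model resolves to `gpt-4o-mini`, not `gpt-4o`). A sketch over a two-entry excerpt of the table:

```ruby
# Excerpt of the pricing table above (USD per 1M tokens).
PRICES = {
  "gpt-4o" => { input: 2.50, cached_input: 1.25, output: 10.00 },
  "gpt-4o-mini" => { input: 0.15, cached_input: 0.075, output: 0.60 }
}.freeze

# Longest-prefix match, as in the fuzzy_match change.
def fuzzy_match(model)
  PRICES.sort_by { |key, _value| -key.length }.each do |key, value|
    return value if model.start_with?(key)
  end
  nil
end

prices = fuzzy_match("gpt-4o-2024-11-20")

# 10_000 input tokens, 4_000 of them served from the prompt cache.
uncached = 10_000 - 4_000
input_cost = (uncached / 1_000_000.0) * prices[:input]      # 0.015
cached_cost = (4_000 / 1_000_000.0) * prices[:cached_input] # 0.005
output_cost = (1_000 / 1_000_000.0) * prices[:output]       # 0.01

puts fuzzy_match("gpt-4o-mini-2024-07-18")[:input] # 0.15, not 2.50
puts (input_cost + cached_cost + output_cost).round(8)
```

Without the length sort, `"gpt-4o-mini-2024-07-18".start_with?("gpt-4o")` would win first in hash order and apply the wrong rates.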
data/lib/llm_cost_tracker/storage/active_record_store.rb
CHANGED
@@ -14,7 +14,7 @@ module LlmCostTracker
           input_cost: event.dig(:cost, :input_cost),
           output_cost: event.dig(:cost, :output_cost),
           total_cost: event.dig(:cost, :total_cost),
-          tags: event[:tags].to_json,
+          tags: stringify_tags(event[:tags]).to_json,
           tracked_at: event[:tracked_at]
         )
       end
@@ -31,6 +31,18 @@ module LlmCostTracker
       def model_class
         LlmCostTracker::LlmApiCall
       end
+
+      private
+
+      def stringify_tags(tags)
+        tags.transform_keys(&:to_s).transform_values { |value| stringify_tag_value(value) }
+      end
+
+      def stringify_tag_value(value)
+        return value.transform_values { |nested| stringify_tag_value(nested) } if value.is_a?(Hash)
+
+        value.to_s
+      end
     end
   end
 end
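`stringify_tags` normalizes top-level keys and all values to strings before the hash hits `to_json`, which is what makes `by_tag("user_id", "42")` find rows tagged with the integer `42`. A standalone sketch of the two helpers:

```ruby
# Same helpers as in the diff above.
def stringify_tag_value(value)
  return value.transform_values { |nested| stringify_tag_value(nested) } if value.is_a?(Hash)

  value.to_s
end

def stringify_tags(tags)
  tags.transform_keys(&:to_s).transform_values { |value| stringify_tag_value(value) }
end

result = stringify_tags({ feature: "chat", user_id: 42, meta: { plan: :pro } })

puts result["user_id"]     # "42" — numeric IDs become strings
puts result["meta"][:plan] # "pro" — nested values are stringified; nested keys are left for to_json
```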
data/lib/llm_cost_tracker/tracker.rb
CHANGED
@@ -6,18 +6,23 @@ module LlmCostTracker
 
     class << self
       def record(provider:, model:, input_tokens:, output_tokens:, metadata: {})
+        usage = usage_data(input_tokens, output_tokens, metadata)
+
         cost_data = Pricing.cost_for(
           model: model,
-          input_tokens: input_tokens,
-          output_tokens: output_tokens
+          input_tokens: usage[:input_tokens],
+          output_tokens: usage[:output_tokens],
+          cached_input_tokens: usage[:cached_input_tokens],
+          cache_read_input_tokens: usage[:cache_read_input_tokens],
+          cache_creation_input_tokens: usage[:cache_creation_input_tokens]
         )
 
         event = {
           provider: provider,
           model: model,
-          input_tokens: input_tokens,
-          output_tokens: output_tokens,
-          total_tokens:
+          input_tokens: usage[:input_tokens],
+          output_tokens: usage[:output_tokens],
+          total_tokens: usage[:total_tokens],
           cost: cost_data,
           tags: LlmCostTracker.configuration.default_tags.merge(metadata),
           tracked_at: Time.now.utc
@@ -51,7 +56,7 @@ module LlmCostTracker
       end
 
       def log_event(event)
-        cost_str = event[:cost] ? "$#{'%.6f'
+        cost_str = event[:cost] ? "$#{format('%.6f', event[:cost][:total_cost])}" : "unknown"
 
         message = "[LlmCostTracker] #{event[:provider]}/#{event[:model]} " \
                   "tokens=#{event[:input_tokens]}+#{event[:output_tokens]} " \
@@ -72,9 +77,12 @@ module LlmCostTracker
       end
 
       def store_active_record(event)
-
+        require_relative "llm_api_call" unless defined?(LlmCostTracker::LlmApiCall)
+        require_relative "storage/active_record_store" unless defined?(LlmCostTracker::Storage::ActiveRecordStore)
 
         LlmCostTracker::Storage::ActiveRecordStore.save(event)
+      rescue LoadError => e
+        raise Error, "ActiveRecord storage requires the active_record gem: #{e.message}"
       end
 
       def check_budget(event)
@@ -96,12 +104,41 @@ module LlmCostTracker
         # For :active_record backend, query the DB
         if LlmCostTracker.configuration.active_record? &&
            defined?(LlmCostTracker::Storage::ActiveRecordStore)
-          LlmCostTracker::Storage::ActiveRecordStore.monthly_total
+          LlmCostTracker::Storage::ActiveRecordStore.monthly_total
         else
           # For other backends, we can only report the latest cost
           latest_cost
         end
       end
+
+      def usage_data(input_tokens, output_tokens, metadata)
+        cache_read_input_tokens = integer_metadata(metadata, :cache_read_input_tokens, :cache_read_tokens)
+        cache_creation_input_tokens = integer_metadata(
+          metadata,
+          :cache_creation_input_tokens,
+          :cache_creation_tokens
+        )
+        cached_input_tokens = integer_metadata(metadata, :cached_input_tokens)
+
+        {
+          input_tokens: input_tokens.to_i,
+          output_tokens: output_tokens.to_i,
+          cached_input_tokens: cached_input_tokens,
+          cache_read_input_tokens: cache_read_input_tokens,
+          cache_creation_input_tokens: cache_creation_input_tokens,
+          total_tokens: input_tokens.to_i + output_tokens.to_i +
+                        cache_read_input_tokens + cache_creation_input_tokens
+        }
+      end
+
+      def integer_metadata(metadata, *keys)
+        keys.each do |key|
+          value = metadata[key] || metadata[key.to_s]
+          return value.to_i unless value.nil?
+        end
+
+        0
+      end
    end
  end
end
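`usage_data` accepts cache counts from metadata under either symbol or string keys (and a couple of aliases) and folds them into the token total. A condensed sketch of the two helpers, keeping only the total-tokens fields for brevity:

```ruby
# Same key-lookup logic as integer_metadata in the diff above.
def integer_metadata(metadata, *keys)
  keys.each do |key|
    value = metadata[key] || metadata[key.to_s]
    return value.to_i unless value.nil?
  end

  0
end

# Condensed usage_data: cache counts feed the total alongside input/output.
def usage_data(input_tokens, output_tokens, metadata)
  cache_read = integer_metadata(metadata, :cache_read_input_tokens, :cache_read_tokens)
  cache_creation = integer_metadata(metadata, :cache_creation_input_tokens, :cache_creation_tokens)

  {
    input_tokens: input_tokens.to_i,
    output_tokens: output_tokens.to_i,
    total_tokens: input_tokens.to_i + output_tokens.to_i + cache_read + cache_creation
  }
end

# String keys work as well as symbols.
usage = usage_data(1500, 320, { "cache_read_input_tokens" => 1200 })
puts usage[:total_tokens] # 3020
```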
data/llm_cost_tracker.gemspec
CHANGED
@@ -8,10 +8,10 @@ Gem::Specification.new do |spec|
   spec.authors = ["Sergii Khomenko"]
   spec.email = ["sergey@mm.st"]
 
-  spec.summary = "
-  spec.description = "
-
-
+  spec.summary = "Self-hosted LLM API cost tracking for Ruby and Rails"
+  spec.description = "Tracks token usage and estimated costs for OpenAI, Anthropic, and Google Gemini calls. " \
+                     "Works as Faraday middleware for Ruby clients, with ActiveRecord storage, " \
+                     "per-user/per-feature attribution, and budget alerts."
   spec.homepage = "https://github.com/sergey-homenko/llm_cost_tracker"
   spec.license = "MIT"
 
@@ -19,6 +19,7 @@ Gem::Specification.new do |spec|
 
   spec.metadata["homepage_uri"] = spec.homepage
   spec.metadata["changelog_uri"] = "#{spec.homepage}/blob/main/CHANGELOG.md"
+  spec.metadata["rubygems_mfa_required"] = "true"
 
   spec.files = Dir.chdir(__dir__) do
     `git ls-files -z`.split("\x0").reject do |f|
@@ -29,13 +30,13 @@ Gem::Specification.new do |spec|
 
   spec.require_paths = ["lib"]
 
-  spec.add_dependency "faraday", ">= 1.0", "< 3.0"
   spec.add_dependency "activesupport", ">= 7.0", "< 9.0"
+  spec.add_dependency "faraday", ">= 1.0", "< 3.0"
 
   spec.add_development_dependency "activerecord", ">= 7.0", "< 9.0"
   spec.add_development_dependency "rake", "~> 13.0"
   spec.add_development_dependency "rspec", "~> 3.0"
-  spec.add_development_dependency "webmock", "~> 3.0"
-  spec.add_development_dependency "sqlite3", "~> 2.0"
   spec.add_development_dependency "rubocop", "~> 1.0"
+  spec.add_development_dependency "sqlite3", "~> 2.0"
+  spec.add_development_dependency "webmock", "~> 3.0"
 end
metadata
CHANGED
@@ -1,55 +1,55 @@
 --- !ruby/object:Gem::Specification
 name: llm_cost_tracker
 version: !ruby/object:Gem::Version
-  version: 0.1.
+  version: 0.1.1
 platform: ruby
 authors:
 - Sergii Khomenko
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2026-04-
+date: 2026-04-17 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
-  name:
+  name: activesupport
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: '
+        version: '7.0'
     - - "<"
       - !ruby/object:Gem::Version
-        version: '
+        version: '9.0'
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: '
+        version: '7.0'
    - - "<"
      - !ruby/object:Gem::Version
-        version: '
+        version: '9.0'
 - !ruby/object:Gem::Dependency
-  name:
+  name: faraday
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: '
+        version: '1.0'
     - - "<"
       - !ruby/object:Gem::Version
-        version: '
+        version: '3.0'
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: '
+        version: '1.0'
     - - "<"
       - !ruby/object:Gem::Version
-        version: '
+        version: '3.0'
 - !ruby/object:Gem::Dependency
   name: activerecord
   requirement: !ruby/object:Gem::Requirement
@@ -99,19 +99,19 @@ dependencies:
     - !ruby/object:Gem::Version
       version: '3.0'
 - !ruby/object:Gem::Dependency
-  name:
+  name: rubocop
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '1.0'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '1.0'
 - !ruby/object:Gem::Dependency
   name: sqlite3
   requirement: !ruby/object:Gem::Requirement
@@ -127,23 +127,22 @@ dependencies:
     - !ruby/object:Gem::Version
       version: '2.0'
 - !ruby/object:Gem::Dependency
-  name:
+  name: webmock
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '3.0'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '
-description: 
-
-
-  and budget alerts.
+        version: '3.0'
+description: Tracks token usage and estimated costs for OpenAI, Anthropic, and Google
+  Gemini calls. Works as Faraday middleware for Ruby clients, with ActiveRecord storage,
+  per-user/per-feature attribution, and budget alerts.
 email:
 - sergey@mm.st
 executables: []
@@ -151,6 +150,7 @@ extensions: []
 extra_rdoc_files: []
 files:
 - ".rspec"
+- ".rubocop.yml"
 - CHANGELOG.md
 - LICENSE.txt
 - README.md
@@ -179,6 +179,7 @@ licenses:
 metadata:
   homepage_uri: https://github.com/sergey-homenko/llm_cost_tracker
   changelog_uri: https://github.com/sergey-homenko/llm_cost_tracker/blob/main/CHANGELOG.md
+  rubygems_mfa_required: 'true'
 post_install_message:
 rdoc_options: []
 require_paths:
@@ -197,5 +198,5 @@ requirements: []
 rubygems_version: 3.5.9
 signing_key:
 specification_version: 4
-summary: 
+summary: Self-hosted LLM API cost tracking for Ruby and Rails
 test_files: []