llm_cost_tracker 0.1.0 → 0.1.1

This diff compares the published contents of two package versions as they appear in their public registry. It is provided for informational purposes only.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 16c6c4c230300b2caebc69e1f0aeb9a7af232278a6e77401aba0d490bb16b4a2
- data.tar.gz: 4b32fb25d22c645e4d66767264130bc44ce730e553636cefca8e4884dede7b94
+ metadata.gz: f7b40f1010c79358da89ffdd10637f59fa90e24aa0f50aec364828d2e2cbf5b9
+ data.tar.gz: d12d1cf407b87afd6e1084c22ceda143c7ab9bf5e6ea6825d70a8e24969cafa5
  SHA512:
- metadata.gz: f403ebeeb6a98164bc2318b9f3b8f49e03750fd5c5d328ac0b0f3c0557f16c39d8f247aa663bdd15b605df779be7d4b4db7445fb79d10eb4c89ef17a687a77d7
- data.tar.gz: fd04a24708901e998127d6582290da90b35f62f7a36d794805a4677656be9191eed14235b634b72924ac4178837aa9dbace3a57fde18829d47b6a8208e295af2
+ metadata.gz: 949157f0a6718bc03f8f0d825982ed732df2754ddf1e4ee07b18522b0e20cc4367a97c599071bcda95bbdda4dde0e160f5d586a9b42a0dd8b1f3c89910286547
+ data.tar.gz: 9ea9007142d157446271bcf81bc4786e4b22a00f6e353dc2e3dc26c1be12d9abf88aed8d8852da37778b5fe3f71fcd4422c6153d8a531f195adaf0d0b9bb8dd2
data/.rubocop.yml ADDED
@@ -0,0 +1,44 @@
+ AllCops:
+ NewCops: enable
+ TargetRubyVersion: 3.1
+ SuggestExtensions: false
+ UseCache: false
+ Exclude:
+ - "tmp/**/*"
+ - "vendor/**/*"
+ - "pkg/**/*"
+
+ Style/Documentation:
+ Enabled: false
+
+ Style/StringLiterals:
+ EnforcedStyle: double_quotes
+
+ Metrics/BlockLength:
+ Exclude:
+ - "*.gemspec"
+ - "spec/**/*.rb"
+
+ Metrics/MethodLength:
+ Max: 25
+
+ Metrics/AbcSize:
+ Max: 45
+
+ Metrics/ClassLength:
+ Max: 130
+
+ Metrics/CyclomaticComplexity:
+ Max: 10
+
+ Metrics/ParameterLists:
+ Max: 6
+
+ Metrics/PerceivedComplexity:
+ Max: 10
+
+ Gemspec/DevelopmentDependencies:
+ Enabled: false
+
+ Layout/HashAlignment:
+ Enabled: false
data/CHANGELOG.md CHANGED
@@ -5,6 +5,31 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+ ## [0.1.1] - 2026-04-17
+
+ ### Fixed
+
+ - Lazy-load ActiveRecord storage so `storage_backend = :active_record` persists events reliably.
+ - Avoid double-counting the latest ActiveRecord event in monthly budget callbacks.
+ - Track OpenAI Responses API usage via `/v1/responses`.
+ - Parse OpenAI cached input token details for cache-aware cost estimates.
+ - Parse Anthropic cache read and cache creation token usage under canonical metadata keys.
+ - Parse Gemini cached content token usage when present.
+ - Store ActiveRecord tag values as strings so `by_tag("user_id", "42")` works for numeric IDs.
+
+ ### Changed
+
+ - Refresh built-in pricing for current OpenAI, Anthropic, and Gemini models.
+ - Add cache-aware cost calculation fields for cached input, cache reads, and cache creation.
+ - Tighten OpenAI URL matching to supported endpoint families only.
+ - Reposition README around self-hosted Rails/Ruby cost tracking for Faraday-based clients.
+
+ ### Added
+
+ - Add ActiveRecord integration specs for persistence, tag querying, and budget callbacks.
+ - Add RuboCop configuration, rake task, and CI lint step.
+ - Require MFA metadata for RubyGems publishing.
+
  ## [0.1.0] - 2026-04-16

  ### Added
data/README.md CHANGED
@@ -1,21 +1,25 @@
  # LlmCostTracker

- **Provider-agnostic LLM API cost tracking for Ruby.**
+ **Self-hosted LLM API cost tracking for Ruby and Rails apps.**

- Track token usage and costs for every LLM API call your app makes — OpenAI, Anthropic, Google Gemini, and any OpenAI-compatible provider. Works as Faraday middleware, so it plugs into **any** Ruby LLM client without code changes.
+ Track token usage and estimated costs for OpenAI, Anthropic, and Google Gemini calls from Faraday-based Ruby clients. Store the data in your own database, tag calls by user or feature, and get budget alerts without adding an external SaaS or proxy.

  [![Gem Version](https://badge.fury.io/rb/llm_cost_tracker.svg)](https://rubygems.org/gems/llm_cost_tracker)
+ [![CI](https://github.com/sergey-homenko/llm_cost_tracker/actions/workflows/ruby.yml/badge.svg)](https://github.com/sergey-homenko/llm_cost_tracker/actions)

  ## Why?

- Every Rails app integrating LLMs faces the same problem: **you don't know how much AI is costing you** until the invoice arrives. Existing solutions either lock you into a specific LLM gem (like `ruby_llm-monitoring`) or require external SaaS (Langfuse, Helicone).
+ Every Rails app integrating LLMs faces the same problem: **you don't know how much AI is costing you** until the invoice arrives. Full observability platforms like Langfuse and Helicone are powerful, but sometimes you just need a small Rails-native cost ledger that lives in your app database.

  `llm_cost_tracker` takes a different approach:

- - 🔌 **Provider-agnostic** — intercepts HTTP responses at the Faraday level
+ - 🔌 **Faraday-native** — intercepts LLM HTTP responses without changing the response
  - 🏠 **Self-hosted** — your data stays in your database
- - 🧩 **Zero coupling** — works with `ruby-openai`, `anthropic-rb`, `ruby_llm`, or raw Faraday
- - **Zero config** — add the middleware, done
+ - 🧩 **Client-light** — works with raw Faraday and LLM gems that expose their Faraday connection
+ - 🏷️ **Attribution-first** — tag spend by feature, tenant, user, job, or environment
+ - 💸 **Budget-aware** — emit notifications and callbacks before spend surprises you
+
+ This gem is intentionally not a tracing platform, prompt CMS, eval system, or gateway. It focuses on the boring but valuable question: "What did this app spend on LLM APIs, and where did that spend come from?"

  ## Installation

@@ -34,9 +38,9 @@ bin/rails db:migrate

  ## Quick Start

- ### Option 1: Faraday Middleware (automatic)
+ ### Option 1: Faraday Middleware

- If your LLM client uses Faraday (most do), just add the middleware:
+ If your LLM client uses Faraday, add the middleware to that connection:

  ```ruby
  conn = Faraday.new(url: "https://api.openai.com") do |f|
@@ -46,16 +50,16 @@ conn = Faraday.new(url: "https://api.openai.com") do |f|
  f.adapter Faraday.default_adapter
  end

- # Every request through this connection is now tracked automatically
- response = conn.post("/v1/chat/completions", {
- model: "gpt-4o",
- messages: [{ role: "user", content: "Hello!" }]
+ # Every supported LLM request through this connection is tracked
+ response = conn.post("/v1/responses", {
+ model: "gpt-5-mini",
+ input: "Hello!"
  })
  ```

  ### Option 2: Patch an existing client

- Most LLM gems expose their Faraday connection. For example, with `ruby-openai`:
+ Some LLM gems expose their Faraday connection. For example, with `ruby-openai`:

  ```ruby
  # config/initializers/openai.rb
@@ -68,6 +72,8 @@ OpenAI.configure do |config|
  end
  ```

+ If a client does not expose its HTTP connection, use manual tracking or register a custom parser around the HTTP layer you control.
+
  ### Option 3: Manual tracking

  For non-Faraday clients, track manually:
@@ -78,6 +84,7 @@ LlmCostTracker.track(
  model: "claude-sonnet-4-6",
  input_tokens: 1500,
  output_tokens: 320,
+ cache_read_input_tokens: 1200,
  feature: "summarizer",
  user_id: current_user.id
  )
@@ -107,11 +114,13 @@ LlmCostTracker.configure do |config|

  # Override pricing for custom/fine-tuned models (per 1M tokens)
  config.pricing_overrides = {
- "ft:gpt-4o-mini:my-org" => { input: 0.30, output: 1.20 }
+ "ft:gpt-4o-mini:my-org" => { input: 0.30, cached_input: 0.15, output: 1.20 }
  }
  end
  ```

+ Pricing is best-effort and based on public provider pricing for standard token usage. Providers change pricing frequently, and some features have extra charges or tiered pricing. Use `pricing_overrides` for fine-tunes, gateway-specific model IDs, enterprise discounts, batch pricing, long-context premiums, and any model this gem does not know yet.
+
  ## Querying Costs (ActiveRecord)

  ```ruby
@@ -154,7 +163,15 @@ ActiveSupport::Notifications.subscribe("llm_request.llm_cost_tracker") do |*, pa
  # input_tokens: 150,
  # output_tokens: 42,
  # total_tokens: 192,
- # cost: { input_cost: 0.000375, output_cost: 0.00042, total_cost: 0.000795, currency: "USD" },
+ # cost: {
+ # input_cost: 0.000375,
+ # cached_input_cost: 0.0,
+ # cache_read_input_cost: 0.0,
+ # cache_creation_input_cost: 0.0,
+ # output_cost: 0.00042,
+ # total_cost: 0.000795,
+ # currency: "USD"
+ # },
  # tags: { feature: "chat", user_id: 42 },
  # tracked_at: 2026-04-16 14:30:00 UTC
  # }
@@ -210,11 +227,17 @@ LlmCostTracker::Parsers::Registry.register(DeepSeekParser.new)

  | Provider | Auto-detected | Models with pricing |
  |----------|:---:|---|
- | OpenAI | ✅ | GPT-4o, GPT-4o-mini, GPT-4-turbo, GPT-4, GPT-3.5-turbo, o1, o1-mini, o3-mini |
- | Anthropic | ✅ | Claude Opus 4.6, Sonnet 4.6, Haiku 4.5, Claude 3.5 Sonnet, Claude 3 Opus |
- | Google Gemini | ✅ | Gemini 2.5 Pro/Flash, 2.0 Flash, 1.5 Pro/Flash |
+ | OpenAI | ✅ | GPT-5.2/5.1/5, GPT-5 mini/nano, GPT-4.1, GPT-4o, o1/o3/o4-mini |
+ | Anthropic | ✅ | Claude Opus 4.6/4.1/4, Sonnet 4.6/4.5/4, Haiku 4.5, Claude 3.x |
+ | Google Gemini | ✅ | Gemini 2.5 Pro/Flash/Flash-Lite, 2.0 Flash/Flash-Lite, 1.5 Pro/Flash |
  | Any other | 🔧 | Via custom parser (see above) |

+ Supported endpoint families:
+
+ - OpenAI: Chat Completions, Responses, Completions, Embeddings
+ - Anthropic: Messages
+ - Google Gemini: `generateContent` responses with `usageMetadata`
+
  ## How It Works

  ```
@@ -228,7 +251,9 @@ Your App → Faraday → [LlmCostTracker Middleware] → LLM API
  ActiveRecord / Log / Custom
  ```

- The middleware intercepts **outgoing** HTTP responses (not incoming requests), parses the `usage` object from the LLM provider's response body, looks up pricing, and records the event. It never modifies requests or responses — it's read-only.
+ The middleware intercepts **outgoing** HTTP responses (not incoming Rails requests), parses the provider usage object, looks up pricing, and records the event. It never modifies requests or responses.
+
+ For streaming APIs, tracking depends on the final response body including provider usage data. If the client consumes server-sent events without exposing the final usage payload to Faraday, use manual tracking.

@@ -237,6 +262,7 @@ git clone https://github.com/sergey-homenko/llm_cost_tracker.git
  cd llm_cost_tracker
  bundle install
  bundle exec rspec
+ bundle exec rubocop
  ```

  ## Contributing
data/Rakefile CHANGED
@@ -2,7 +2,9 @@

  require "bundler/gem_tasks"
  require "rspec/core/rake_task"
+ require "rubocop/rake_task"

  RSpec::Core::RakeTask.new(:spec)
+ RuboCop::RakeTask.new(:rubocop)

- task default: :spec
+ task default: %i[spec rubocop]
@@ -9,7 +9,9 @@ module LlmCostTracker
  # Scopes for querying
  scope :by_provider, ->(provider) { where(provider: provider) }
  scope :by_model, ->(model) { where(model: model) }
- scope :by_tag, ->(key, value) { where("tags LIKE ?", "%\"#{key}\":\"#{value}\"%") }
+ scope :by_tag, lambda { |key, value|
+ where("tags LIKE ?", "%\"#{key}\":\"#{value}\"%")
+ }

  scope :today, -> { where(tracked_at: Time.now.utc.beginning_of_day..) }
  scope :this_week, -> { where(tracked_at: Time.now.utc.beginning_of_week..) }
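Reviewer's note: the `by_tag` scope above matches a JSON substring of the form `"key":"value"`, which only works when tag values are serialized as strings. A minimal sketch of that matching (the `tag_match?` helper is illustrative, not part of the gem):

```ruby
require "json"

# Simulates the LIKE pattern used by the by_tag scope: it looks for the
# exact substring "key":"value" inside the serialized tags column.
def tag_match?(tags_json, key, value)
  tags_json.include?("\"#{key}\":\"#{value}\"")
end

tag_match?({ user_id: "42" }.to_json, "user_id", "42") # matches
tag_match?({ user_id: 42 }.to_json, "user_id", "42")   # misses: 42 serialized without quotes
```

This is why the release also stringifies tag values before storage.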
@@ -19,9 +19,6 @@ module LlmCostTracker
  @app.call(request_env).on_complete do |response_env|
  process(request_url, request_body, response_env)
  end
- rescue StandardError => e
- # Never break the actual request — log and re-raise
- raise e
  end

  private
@@ -46,7 +43,9 @@ module LlmCostTracker
  metadata: @tags.merge(parsed.except(:provider, :model, :input_tokens, :output_tokens, :total_tokens))
  )
  rescue StandardError => e
- warn "[LlmCostTracker] Error processing response: #{e.message}" if LlmCostTracker.configuration.log_level == :debug
+ return unless LlmCostTracker.configuration.log_level == :debug
+
+ warn "[LlmCostTracker] Error processing response: #{e.message}"
  end

  def read_body(body)
@@ -14,7 +14,7 @@ module LlmCostTracker
  false
  end

- def parse(request_url, request_body, response_status, response_body)
+ def parse(_request_url, request_body, response_status, response_body)
  return nil unless response_status == 200

  response = safe_json_parse(response_body)
@@ -28,9 +28,11 @@ module LlmCostTracker
  model: response["model"] || request["model"],
  input_tokens: usage["input_tokens"] || 0,
  output_tokens: usage["output_tokens"] || 0,
- total_tokens: (usage["input_tokens"] || 0) + (usage["output_tokens"] || 0),
- cache_read_tokens: usage["cache_read_input_tokens"],
- cache_creation_tokens: usage["cache_creation_input_tokens"]
+ total_tokens: (usage["input_tokens"] || 0) + (usage["output_tokens"] || 0) +
+ (usage["cache_read_input_tokens"] || 0) +
+ (usage["cache_creation_input_tokens"] || 0),
+ cache_read_input_tokens: usage["cache_read_input_tokens"],
+ cache_creation_input_tokens: usage["cache_creation_input_tokens"]
  }.compact
  end
  end
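Reviewer's note: the new `total_tokens` math above folds cache reads and cache creation into the total, reflecting that Anthropic reports `input_tokens` net of cache activity. A standalone sketch of that calculation (the `anthropic_total_tokens` helper is illustrative, not the gem's API):

```ruby
require "json"

# Sums all four usage counters from an Anthropic Messages response body;
# missing keys count as zero.
def anthropic_total_tokens(response_body)
  usage = JSON.parse(response_body).fetch("usage", {})
  (usage["input_tokens"] || 0) + (usage["output_tokens"] || 0) +
    (usage["cache_read_input_tokens"] || 0) +
    (usage["cache_creation_input_tokens"] || 0)
end

body = '{"usage":{"input_tokens":10,"output_tokens":5,"cache_read_input_tokens":100}}'
anthropic_total_tokens(body) # => 115
```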
@@ -14,7 +14,7 @@ module LlmCostTracker
  false
  end

- def parse(request_url, request_body, response_status, response_body)
+ def parse(request_url, _request_body, response_status, response_body)
  return nil unless response_status == 200

  response = safe_json_parse(response_body)
@@ -29,8 +29,9 @@ module LlmCostTracker
  model: model,
  input_tokens: usage["promptTokenCount"] || 0,
  output_tokens: usage["candidatesTokenCount"] || 0,
- total_tokens: usage["totalTokenCount"] || 0
- }
+ total_tokens: usage["totalTokenCount"] || 0,
+ cached_input_tokens: usage["cachedContentTokenCount"]
+ }.compact

  private
@@ -6,16 +6,16 @@ module LlmCostTracker
  module Parsers
  class Openai < Base
  HOSTS = %w[api.openai.com].freeze
- TRACKED_PATHS = %w[/v1/chat/completions /v1/completions /v1/embeddings].freeze
+ TRACKED_PATHS = %w[/v1/chat/completions /v1/completions /v1/embeddings /v1/responses].freeze

  def match?(url)
  uri = URI.parse(url.to_s)
- HOSTS.include?(uri.host) && TRACKED_PATHS.any? { |p| uri.path.start_with?(p) }
+ HOSTS.include?(uri.host) && TRACKED_PATHS.include?(uri.path)
  rescue URI::InvalidURIError
  false
  end

- def parse(request_url, request_body, response_status, response_body)
+ def parse(_request_url, request_body, response_status, response_body)
  return nil unless response_status == 200

  response = safe_json_parse(response_body)
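Reviewer's note: the `match?` change above swaps prefix matching for exact path matching, so sub-resource URLs (for example, a `/v1/responses/resp_123` retrieval, which carries no new usage) are no longer tracked. A runnable sketch of the new predicate (standalone `tracked?` helper, mirroring the parser's logic):

```ruby
require "uri"

HOSTS = %w[api.openai.com].freeze
TRACKED_PATHS = %w[/v1/chat/completions /v1/completions /v1/embeddings /v1/responses].freeze

# Exact path membership: prefix matching would also have caught
# retrieval and sub-resource URLs under the tracked endpoints.
def tracked?(url)
  uri = URI.parse(url.to_s)
  HOSTS.include?(uri.host) && TRACKED_PATHS.include?(uri.path)
rescue URI::InvalidURIError
  false
end

tracked?("https://api.openai.com/v1/responses")          # tracked
tracked?("https://api.openai.com/v1/responses/resp_123") # no longer tracked
```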
@@ -27,10 +27,18 @@ module LlmCostTracker
  {
  provider: "openai",
  model: response["model"] || request["model"],
- input_tokens: usage["prompt_tokens"] || 0,
- output_tokens: usage["completion_tokens"] || 0,
- total_tokens: usage["total_tokens"] || 0
- }
+ input_tokens: usage["prompt_tokens"] || usage["input_tokens"] || 0,
+ output_tokens: usage["completion_tokens"] || usage["output_tokens"] || 0,
+ total_tokens: usage["total_tokens"] || 0,
+ cached_input_tokens: cached_input_tokens(usage)
+ }.compact
+ end
+
+ private
+
+ def cached_input_tokens(usage)
+ details = usage["prompt_tokens_details"] || usage["input_tokens_details"] || {}
+ details["cached_tokens"]
  end
  end
  end
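Reviewer's note: the fallbacks above let one parser cover both OpenAI usage shapes, since Chat Completions reports `prompt_tokens` with `prompt_tokens_details` while the Responses API reports `input_tokens` with `input_tokens_details`. A standalone sketch of the cached-token lookup:

```ruby
require "json"

# Tries the Chat Completions details key first, then the Responses API
# key; returns nil when no cached-token details are present.
def cached_input_tokens(usage)
  details = usage["prompt_tokens_details"] || usage["input_tokens_details"] || {}
  details["cached_tokens"]
end

chat      = JSON.parse('{"prompt_tokens":100,"prompt_tokens_details":{"cached_tokens":80}}')
responses = JSON.parse('{"input_tokens":100,"input_tokens_details":{"cached_tokens":60}}')
cached_input_tokens(chat)      # => 80
cached_input_tokens(responses) # => 60
```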
@@ -6,43 +6,78 @@ module LlmCostTracker
  module Pricing
  PRICES = {
  # OpenAI
- "gpt-4o" => { input: 2.50, output: 10.00 },
- "gpt-4o-mini" => { input: 0.15, output: 0.60 },
+ "gpt-5.2" => { input: 1.75, cached_input: 0.175, output: 14.00 },
+ "gpt-5.1" => { input: 1.25, cached_input: 0.125, output: 10.00 },
+ "gpt-5" => { input: 1.25, cached_input: 0.125, output: 10.00 },
+ "gpt-5-mini" => { input: 0.25, cached_input: 0.025, output: 2.00 },
+ "gpt-5-nano" => { input: 0.05, cached_input: 0.005, output: 0.40 },
+ "gpt-4.1" => { input: 2.00, cached_input: 0.50, output: 8.00 },
+ "gpt-4.1-mini" => { input: 0.40, cached_input: 0.10, output: 1.60 },
+ "gpt-4.1-nano" => { input: 0.10, cached_input: 0.025, output: 0.40 },
+ "gpt-4o-2024-05-13" => { input: 5.00, output: 15.00 },
+ "gpt-4o" => { input: 2.50, cached_input: 1.25, output: 10.00 },
+ "gpt-4o-mini" => { input: 0.15, cached_input: 0.075, output: 0.60 },
  "gpt-4-turbo" => { input: 10.00, output: 30.00 },
  "gpt-4" => { input: 30.00, output: 60.00 },
  "gpt-3.5-turbo" => { input: 0.50, output: 1.50 },
- "o1" => { input: 15.00, output: 60.00 },
- "o1-mini" => { input: 3.00, output: 12.00 },
- "o3-mini" => { input: 1.10, output: 4.40 },
+ "o1" => { input: 15.00, cached_input: 7.50, output: 60.00 },
+ "o1-mini" => { input: 1.10, cached_input: 0.55, output: 4.40 },
+ "o3" => { input: 2.00, cached_input: 0.50, output: 8.00 },
+ "o3-mini" => { input: 1.10, cached_input: 0.55, output: 4.40 },
+ "o4-mini" => { input: 1.10, cached_input: 0.275, output: 4.40 },

  # Anthropic
- "claude-sonnet-4-6" => { input: 3.00, output: 15.00 },
- "claude-opus-4-6" => { input: 15.00, output: 75.00 },
- "claude-haiku-4-5" => { input: 0.80, output: 4.00 },
- "claude-3-5-sonnet-20241022" => { input: 3.00, output: 15.00 },
- "claude-3-5-haiku-20241022" => { input: 0.80, output: 4.00 },
- "claude-3-opus-20240229" => { input: 15.00, output: 75.00 },
+ "claude-sonnet-4-6" => { input: 3.00, output: 15.00, cache_read_input: 0.30, cache_creation_input: 3.75 },
+ "claude-opus-4-6" => { input: 5.00, output: 25.00, cache_read_input: 0.50, cache_creation_input: 6.25 },
+ "claude-opus-4-1" => { input: 15.00, output: 75.00, cache_read_input: 1.50, cache_creation_input: 18.75 },
+ "claude-opus-4" => { input: 15.00, output: 75.00, cache_read_input: 1.50, cache_creation_input: 18.75 },
+ "claude-sonnet-4-5" => { input: 3.00, output: 15.00, cache_read_input: 0.30, cache_creation_input: 3.75 },
+ "claude-sonnet-4" => { input: 3.00, output: 15.00, cache_read_input: 0.30, cache_creation_input: 3.75 },
+ "claude-haiku-4-5" => { input: 1.00, output: 5.00, cache_read_input: 0.10, cache_creation_input: 1.25 },
+ "claude-3-7-sonnet" => { input: 3.00, output: 15.00, cache_read_input: 0.30, cache_creation_input: 3.75 },
+ "claude-3-5-sonnet" => { input: 3.00, output: 15.00, cache_read_input: 0.30, cache_creation_input: 3.75 },
+ "claude-3-5-haiku" => { input: 0.80, output: 4.00, cache_read_input: 0.08, cache_creation_input: 1.00 },
+ "claude-3-opus" => { input: 15.00, output: 75.00, cache_read_input: 1.50, cache_creation_input: 18.75 },

  # Google Gemini
- "gemini-2.5-pro" => { input: 1.25, output: 10.00 },
- "gemini-2.5-flash" => { input: 0.15, output: 0.60 },
- "gemini-2.0-flash" => { input: 0.10, output: 0.40 },
+ "gemini-2.5-pro" => { input: 1.25, cached_input: 0.125, output: 10.00 },
+ "gemini-2.5-flash" => { input: 0.30, cached_input: 0.03, output: 2.50 },
+ "gemini-2.5-flash-lite" => { input: 0.10, cached_input: 0.01, output: 0.40 },
+ "gemini-2.0-flash" => { input: 0.10, cached_input: 0.025, output: 0.40 },
+ "gemini-2.0-flash-lite" => { input: 0.075, output: 0.30 },
  "gemini-1.5-pro" => { input: 1.25, output: 5.00 },
- "gemini-1.5-flash" => { input: 0.075, output: 0.30 },
+ "gemini-1.5-flash" => { input: 0.075, output: 0.30 }
  }.freeze

  class << self
- def cost_for(model:, input_tokens:, output_tokens:)
+ def cost_for(model:, input_tokens:, output_tokens:, cached_input_tokens: 0,
+ cache_read_input_tokens: 0, cache_creation_input_tokens: 0)
  prices = lookup(model)
  return nil unless prices

- input_cost = (input_tokens.to_f / 1_000_000) * prices[:input]
+ cached_input_tokens = cached_input_tokens.to_i
+ cache_read_input_tokens = cache_read_input_tokens.to_i
+ cache_creation_input_tokens = cache_creation_input_tokens.to_i
+ uncached_input_tokens = [input_tokens.to_i - cached_input_tokens, 0].max
+
+ input_cost = (uncached_input_tokens.to_f / 1_000_000) * prices[:input]
+ cached_input_cost = (cached_input_tokens.to_f / 1_000_000) *
+ (prices[:cached_input] || prices[:input])
+ cache_read_input_cost = (cache_read_input_tokens.to_f / 1_000_000) *
+ (prices[:cache_read_input] || prices[:cached_input] || prices[:input])
+ cache_creation_input_cost = (cache_creation_input_tokens.to_f / 1_000_000) *
+ (prices[:cache_creation_input] || prices[:input])
  output_cost = (output_tokens.to_f / 1_000_000) * prices[:output]
+ total_cost = input_cost + cached_input_cost + cache_read_input_cost +
+ cache_creation_input_cost + output_cost

  {
  input_cost: input_cost.round(8),
+ cached_input_cost: cached_input_cost.round(8),
+ cache_read_input_cost: cache_read_input_cost.round(8),
+ cache_creation_input_cost: cache_creation_input_cost.round(8),
  output_cost: output_cost.round(8),
- total_cost: (input_cost + output_cost).round(8),
+ total_cost: total_cost.round(8),
  currency: "USD"
  }
  end
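Reviewer's note: the reworked `cost_for` above bills cached input at the cached rate and subtracts it from the standard input count. A condensed, runnable sketch of that math for a single illustrative price entry (prices are per 1M tokens; the real method also handles cache read and cache creation rates):

```ruby
# One hypothetical price entry, per 1M tokens.
PRICES = { input: 1.25, cached_input: 0.125, output: 10.00 }.freeze

def cost_for(input_tokens:, output_tokens:, cached_input_tokens: 0)
  # Cached tokens come out of the standard input count, clamped at zero.
  uncached = [input_tokens - cached_input_tokens, 0].max
  input_cost  = (uncached.to_f / 1_000_000) * PRICES[:input]
  cached_cost = (cached_input_tokens.to_f / 1_000_000) * (PRICES[:cached_input] || PRICES[:input])
  output_cost = (output_tokens.to_f / 1_000_000) * PRICES[:output]
  (input_cost + cached_cost + output_cost).round(8)
end

cost_for(input_tokens: 10_000, output_tokens: 1_000, cached_input_tokens: 8_000)
# 2,000 uncached in + 8,000 cached in + 1,000 out
```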
@@ -62,7 +97,7 @@ module LlmCostTracker
  def fuzzy_match(model)
  return nil unless model

- PRICES.each do |key, value|
+ PRICES.sort_by { |key, _value| -key.length }.each do |key, value|
  return value if model.start_with?(key)
  end

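Reviewer's note: sorting price keys longest-first matters because models are matched by prefix: without it, a dated id like `gpt-5-mini-2025-01-01` could resolve to the shorter `gpt-5` entry, depending on hash order. A two-key sketch (hypothetical price table, illustrative rates):

```ruby
PRICES = {
  "gpt-5"      => { input: 1.25 },
  "gpt-5-mini" => { input: 0.25 }
}.freeze

def fuzzy_match(model)
  return nil unless model

  # Longest keys first, so the most specific prefix wins.
  PRICES.sort_by { |key, _value| -key.length }.each do |key, value|
    return value if model.start_with?(key)
  end
  nil
end

fuzzy_match("gpt-5-mini-2025-01-01") # resolves to the "gpt-5-mini" entry
```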
@@ -14,7 +14,7 @@ module LlmCostTracker
  input_cost: event.dig(:cost, :input_cost),
  output_cost: event.dig(:cost, :output_cost),
  total_cost: event.dig(:cost, :total_cost),
- tags: event[:tags].to_json,
+ tags: stringify_tags(event[:tags]).to_json,
  tracked_at: event[:tracked_at]
  )
  end
@@ -31,6 +31,18 @@ module LlmCostTracker
  def model_class
  LlmCostTracker::LlmApiCall
  end
+
+ private
+
+ def stringify_tags(tags)
+ tags.transform_keys(&:to_s).transform_values { |value| stringify_tag_value(value) }
+ end
+
+ def stringify_tag_value(value)
+ return value.transform_values { |nested| stringify_tag_value(nested) } if value.is_a?(Hash)
+
+ value.to_s
+ end
  end
  end
  end
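Reviewer's note: `stringify_tags` above normalizes keys and leaf values to strings before JSON serialization, which is what lets `by_tag("user_id", "42")` find events recorded with a numeric id. Extracted as a runnable sketch:

```ruby
require "json"

# Keys become strings, leaf values become strings, nested hashes recurse.
def stringify_tags(tags)
  tags.transform_keys(&:to_s).transform_values { |value| stringify_tag_value(value) }
end

def stringify_tag_value(value)
  return value.transform_values { |nested| stringify_tag_value(nested) } if value.is_a?(Hash)

  value.to_s
end

stringify_tags(feature: "chat", user_id: 42).to_json
# serializes user_id as the quoted string "42"
```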
@@ -6,18 +6,23 @@ module LlmCostTracker

  class << self
  def record(provider:, model:, input_tokens:, output_tokens:, metadata: {})
+ usage = usage_data(input_tokens, output_tokens, metadata)
+
  cost_data = Pricing.cost_for(
  model: model,
- input_tokens: input_tokens,
- output_tokens: output_tokens
+ input_tokens: usage[:input_tokens],
+ output_tokens: usage[:output_tokens],
+ cached_input_tokens: usage[:cached_input_tokens],
+ cache_read_input_tokens: usage[:cache_read_input_tokens],
+ cache_creation_input_tokens: usage[:cache_creation_input_tokens]
  )

  event = {
  provider: provider,
  model: model,
- input_tokens: input_tokens,
- output_tokens: output_tokens,
- total_tokens: input_tokens + output_tokens,
+ input_tokens: usage[:input_tokens],
+ output_tokens: usage[:output_tokens],
+ total_tokens: usage[:total_tokens],
  cost: cost_data,
  tags: LlmCostTracker.configuration.default_tags.merge(metadata),
  tracked_at: Time.now.utc
@@ -51,7 +56,7 @@ module LlmCostTracker
  end

  def log_event(event)
- cost_str = event[:cost] ? "$#{'%.6f' % event[:cost][:total_cost]}" : "unknown"
+ cost_str = event[:cost] ? "$#{format('%.6f', event[:cost][:total_cost])}" : "unknown"

  message = "[LlmCostTracker] #{event[:provider]}/#{event[:model]} " \
  "tokens=#{event[:input_tokens]}+#{event[:output_tokens]} " \
@@ -72,9 +77,12 @@ module LlmCostTracker
  end

  def store_active_record(event)
- return unless defined?(LlmCostTracker::Storage::ActiveRecordStore)
+ require_relative "llm_api_call" unless defined?(LlmCostTracker::LlmApiCall)
+ require_relative "storage/active_record_store" unless defined?(LlmCostTracker::Storage::ActiveRecordStore)

  LlmCostTracker::Storage::ActiveRecordStore.save(event)
+ rescue LoadError => e
+ raise Error, "ActiveRecord storage requires the active_record gem: #{e.message}"
  end

  def check_budget(event)
@@ -96,12 +104,41 @@ module LlmCostTracker
  # For :active_record backend, query the DB
  if LlmCostTracker.configuration.active_record? &&
  defined?(LlmCostTracker::Storage::ActiveRecordStore)
- LlmCostTracker::Storage::ActiveRecordStore.monthly_total + latest_cost
+ LlmCostTracker::Storage::ActiveRecordStore.monthly_total
  else
  # For other backends, we can only report the latest cost
  latest_cost
  end
  end
+
+ def usage_data(input_tokens, output_tokens, metadata)
+ cache_read_input_tokens = integer_metadata(metadata, :cache_read_input_tokens, :cache_read_tokens)
+ cache_creation_input_tokens = integer_metadata(
+ metadata,
+ :cache_creation_input_tokens,
+ :cache_creation_tokens
+ )
+ cached_input_tokens = integer_metadata(metadata, :cached_input_tokens)
+
+ {
+ input_tokens: input_tokens.to_i,
+ output_tokens: output_tokens.to_i,
+ cached_input_tokens: cached_input_tokens,
+ cache_read_input_tokens: cache_read_input_tokens,
+ cache_creation_input_tokens: cache_creation_input_tokens,
+ total_tokens: input_tokens.to_i + output_tokens.to_i +
+ cache_read_input_tokens + cache_creation_input_tokens
+ }
+ end
+
+ def integer_metadata(metadata, *keys)
+ keys.each do |key|
+ value = metadata[key] || metadata[key.to_s]
+ return value.to_i unless value.nil?
+ end
+
+ 0
+ end
  end
  end
  end
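Reviewer's note: `integer_metadata` above accepts both symbol and string keys and a list of aliases (canonical name first), so metadata recorded under the older `:cache_read_tokens` key keeps working. A standalone sketch:

```ruby
# First present alias wins; the value is coerced to an integer, and
# wholly absent keys fall back to 0.
def integer_metadata(metadata, *keys)
  keys.each do |key|
    value = metadata[key] || metadata[key.to_s]
    return value.to_i unless value.nil?
  end

  0
end

integer_metadata({ "cache_read_tokens" => "1200" }, :cache_read_input_tokens, :cache_read_tokens)
# picks up the legacy string key and coerces "1200" to 1200
```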
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module LlmCostTracker
- VERSION = "0.1.0"
+ VERSION = "0.1.1"
  end
@@ -8,10 +8,10 @@ Gem::Specification.new do |spec|
  spec.authors = ["Sergii Khomenko"]
  spec.email = ["sergey@mm.st"]

- spec.summary = "Provider-agnostic LLM API cost tracking for Ruby"
- spec.description = "Automatically tracks token usage and costs for LLM API calls (OpenAI, Anthropic, Google Gemini, and more). " \
- "Works as Faraday middleware plugs into any Ruby HTTP client. " \
- "Provides ActiveRecord storage, per-user/per-feature attribution, and budget alerts."
+ spec.summary = "Self-hosted LLM API cost tracking for Ruby and Rails"
+ spec.description = "Tracks token usage and estimated costs for OpenAI, Anthropic, and Google Gemini calls. " \
+ "Works as Faraday middleware for Ruby clients, with ActiveRecord storage, " \
+ "per-user/per-feature attribution, and budget alerts."
  spec.homepage = "https://github.com/sergey-homenko/llm_cost_tracker"
  spec.license = "MIT"

@@ -19,6 +19,7 @@ Gem::Specification.new do |spec|

  spec.metadata["homepage_uri"] = spec.homepage
  spec.metadata["changelog_uri"] = "#{spec.homepage}/blob/main/CHANGELOG.md"
+ spec.metadata["rubygems_mfa_required"] = "true"

  spec.files = Dir.chdir(__dir__) do
  `git ls-files -z`.split("\x0").reject do |f|
@@ -29,13 +30,13 @@ Gem::Specification.new do |spec|

  spec.require_paths = ["lib"]

- spec.add_dependency "faraday", ">= 1.0", "< 3.0"
  spec.add_dependency "activesupport", ">= 7.0", "< 9.0"
+ spec.add_dependency "faraday", ">= 1.0", "< 3.0"

  spec.add_development_dependency "activerecord", ">= 7.0", "< 9.0"
  spec.add_development_dependency "rake", "~> 13.0"
  spec.add_development_dependency "rspec", "~> 3.0"
- spec.add_development_dependency "webmock", "~> 3.0"
- spec.add_development_dependency "sqlite3", "~> 2.0"
  spec.add_development_dependency "rubocop", "~> 1.0"
+ spec.add_development_dependency "sqlite3", "~> 2.0"
+ spec.add_development_dependency "webmock", "~> 3.0"
  end
metadata CHANGED
@@ -1,55 +1,55 @@
  --- !ruby/object:Gem::Specification
  name: llm_cost_tracker
  version: !ruby/object:Gem::Version
- version: 0.1.0
+ version: 0.1.1
  platform: ruby
  authors:
  - Sergii Khomenko
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2026-04-16 00:00:00.000000000 Z
+ date: 2026-04-17 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
- name: faraday
+ name: activesupport
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '1.0'
+ version: '7.0'
  - - "<"
  - !ruby/object:Gem::Version
- version: '3.0'
+ version: '9.0'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '1.0'
+ version: '7.0'
  - - "<"
  - !ruby/object:Gem::Version
- version: '3.0'
+ version: '9.0'
  - !ruby/object:Gem::Dependency
- name: activesupport
+ name: faraday
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '7.0'
+ version: '1.0'
  - - "<"
  - !ruby/object:Gem::Version
- version: '9.0'
+ version: '3.0'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '7.0'
+ version: '1.0'
  - - "<"
  - !ruby/object:Gem::Version
- version: '9.0'
+ version: '3.0'
  - !ruby/object:Gem::Dependency
  name: activerecord
  requirement: !ruby/object:Gem::Requirement
@@ -99,19 +99,19 @@ dependencies:
  - !ruby/object:Gem::Version
  version: '3.0'
  - !ruby/object:Gem::Dependency
- name: webmock
+ name: rubocop
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '3.0'
+ version: '1.0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '3.0'
+ version: '1.0'
  - !ruby/object:Gem::Dependency
  name: sqlite3
  requirement: !ruby/object:Gem::Requirement
@@ -127,23 +127,22 @@ dependencies:
  - !ruby/object:Gem::Version
  version: '2.0'
  - !ruby/object:Gem::Dependency
- name: rubocop
+ name: webmock
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '1.0'
+ version: '3.0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '1.0'
- description: Automatically tracks token usage and costs for LLM API calls (OpenAI,
- Anthropic, Google Gemini, and more). Works as Faraday middleware plugs into any
- Ruby HTTP client. Provides ActiveRecord storage, per-user/per-feature attribution,
- and budget alerts.
+ version: '3.0'
+ description: Tracks token usage and estimated costs for OpenAI, Anthropic, and Google
+ Gemini calls. Works as Faraday middleware for Ruby clients, with ActiveRecord storage,
+ per-user/per-feature attribution, and budget alerts.
  email:
  - sergey@mm.st
  executables: []
@@ -151,6 +150,7 @@ extensions: []
  extra_rdoc_files: []
  files:
  - ".rspec"
+ - ".rubocop.yml"
  - CHANGELOG.md
  - LICENSE.txt
  - README.md
@@ -179,6 +179,7 @@ licenses:
  metadata:
  homepage_uri: https://github.com/sergey-homenko/llm_cost_tracker
  changelog_uri: https://github.com/sergey-homenko/llm_cost_tracker/blob/main/CHANGELOG.md
+ rubygems_mfa_required: 'true'
  post_install_message:
  rdoc_options: []
  require_paths:
@@ -197,5 +198,5 @@ requirements: []
  rubygems_version: 3.5.9
  signing_key:
  specification_version: 4
- summary: Provider-agnostic LLM API cost tracking for Ruby
+ summary: Self-hosted LLM API cost tracking for Ruby and Rails
  test_files: []