llm_cost_tracker 0.1.0 → 0.1.1

This diff compares the published contents of two package versions as they appear in their public registry. It is provided for informational purposes only.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 16c6c4c230300b2caebc69e1f0aeb9a7af232278a6e77401aba0d490bb16b4a2
- data.tar.gz: 4b32fb25d22c645e4d66767264130bc44ce730e553636cefca8e4884dede7b94
+ metadata.gz: f7b40f1010c79358da89ffdd10637f59fa90e24aa0f50aec364828d2e2cbf5b9
+ data.tar.gz: d12d1cf407b87afd6e1084c22ceda143c7ab9bf5e6ea6825d70a8e24969cafa5
  SHA512:
- metadata.gz: f403ebeeb6a98164bc2318b9f3b8f49e03750fd5c5d328ac0b0f3c0557f16c39d8f247aa663bdd15b605df779be7d4b4db7445fb79d10eb4c89ef17a687a77d7
- data.tar.gz: fd04a24708901e998127d6582290da90b35f62f7a36d794805a4677656be9191eed14235b634b72924ac4178837aa9dbace3a57fde18829d47b6a8208e295af2
+ metadata.gz: 949157f0a6718bc03f8f0d825982ed732df2754ddf1e4ee07b18522b0e20cc4367a97c599071bcda95bbdda4dde0e160f5d586a9b42a0dd8b1f3c89910286547
+ data.tar.gz: 9ea9007142d157446271bcf81bc4786e4b22a00f6e353dc2e3dc26c1be12d9abf88aed8d8852da37778b5fe3f71fcd4422c6153d8a531f195adaf0d0b9bb8dd2
data/.rubocop.yml ADDED
@@ -0,0 +1,44 @@
+ AllCops:
+ NewCops: enable
+ TargetRubyVersion: 3.1
+ SuggestExtensions: false
+ UseCache: false
+ Exclude:
+ - "tmp/**/*"
+ - "vendor/**/*"
+ - "pkg/**/*"
+
+ Style/Documentation:
+ Enabled: false
+
+ Style/StringLiterals:
+ EnforcedStyle: double_quotes
+
+ Metrics/BlockLength:
+ Exclude:
+ - "*.gemspec"
+ - "spec/**/*.rb"
+
+ Metrics/MethodLength:
+ Max: 25
+
+ Metrics/AbcSize:
+ Max: 45
+
+ Metrics/ClassLength:
+ Max: 130
+
+ Metrics/CyclomaticComplexity:
+ Max: 10
+
+ Metrics/ParameterLists:
+ Max: 6
+
+ Metrics/PerceivedComplexity:
+ Max: 10
+
+ Gemspec/DevelopmentDependencies:
+ Enabled: false
+
+ Layout/HashAlignment:
+ Enabled: false
data/CHANGELOG.md CHANGED
@@ -5,6 +5,31 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+ ## [0.1.1] - 2026-04-17
+
+ ### Fixed
+
+ - Lazy-load ActiveRecord storage so `storage_backend = :active_record` persists events reliably.
+ - Avoid double-counting the latest ActiveRecord event in monthly budget callbacks.
+ - Track OpenAI Responses API usage via `/v1/responses`.
+ - Parse OpenAI cached input token details for cache-aware cost estimates.
+ - Parse Anthropic cache read and cache creation token usage under canonical metadata keys.
+ - Parse Gemini cached content token usage when present.
+ - Store ActiveRecord tag values as strings so `by_tag("user_id", "42")` works for numeric IDs.
+
+ ### Changed
+
+ - Refresh built-in pricing for current OpenAI, Anthropic, and Gemini models.
+ - Add cache-aware cost calculation fields for cached input, cache reads, and cache creation.
+ - Tighten OpenAI URL matching to supported endpoint families only.
+ - Reposition README around self-hosted Rails/Ruby cost tracking for Faraday-based clients.
+
+ ### Added
+
+ - Add ActiveRecord integration specs for persistence, tag querying, and budget callbacks.
+ - Add RuboCop configuration, rake task, and CI lint step.
+ - Require MFA metadata for RubyGems publishing.
+
  ## [0.1.0] - 2026-04-16

  ### Added
data/README.md CHANGED
@@ -1,21 +1,25 @@
  # LlmCostTracker

- **Provider-agnostic LLM API cost tracking for Ruby.**
+ **Self-hosted LLM API cost tracking for Ruby and Rails apps.**

- Track token usage and costs for every LLM API call your app makes — OpenAI, Anthropic, Google Gemini, and any OpenAI-compatible provider. Works as Faraday middleware, so it plugs into **any** Ruby LLM client without code changes.
+ Track token usage and estimated costs for OpenAI, Anthropic, and Google Gemini calls from Faraday-based Ruby clients. Store the data in your own database, tag calls by user or feature, and get budget alerts without adding an external SaaS or proxy.

  [![Gem Version](https://badge.fury.io/rb/llm_cost_tracker.svg)](https://rubygems.org/gems/llm_cost_tracker)
+ [![CI](https://github.com/sergey-homenko/llm_cost_tracker/actions/workflows/ruby.yml/badge.svg)](https://github.com/sergey-homenko/llm_cost_tracker/actions)

  ## Why?

- Every Rails app integrating LLMs faces the same problem: **you don't know how much AI is costing you** until the invoice arrives. Existing solutions either lock you into a specific LLM gem (like `ruby_llm-monitoring`) or require external SaaS (Langfuse, Helicone).
+ Every Rails app integrating LLMs faces the same problem: **you don't know how much AI is costing you** until the invoice arrives. Full observability platforms like Langfuse and Helicone are powerful, but sometimes you just need a small Rails-native cost ledger that lives in your app database.

  `llm_cost_tracker` takes a different approach:

- - 🔌 **Provider-agnostic** — intercepts HTTP responses at the Faraday level
+ - 🔌 **Faraday-native** — intercepts LLM HTTP responses without changing the response
  - 🏠 **Self-hosted** — your data stays in your database
- - 🧩 **Zero coupling** — works with `ruby-openai`, `anthropic-rb`, `ruby_llm`, or raw Faraday
- - **Zero config** — add the middleware, done
+ - 🧩 **Client-light** — works with raw Faraday and LLM gems that expose their Faraday connection
+ - 🏷️ **Attribution-first** — tag spend by feature, tenant, user, job, or environment
+ - 💸 **Budget-aware** — emit notifications and callbacks before spend surprises you
+
+ This gem is intentionally not a tracing platform, prompt CMS, eval system, or gateway. It focuses on the boring but valuable question: "What did this app spend on LLM APIs, and where did that spend come from?"

  ## Installation

@@ -34,9 +38,9 @@ bin/rails db:migrate

  ## Quick Start

- ### Option 1: Faraday Middleware (automatic)
+ ### Option 1: Faraday Middleware

- If your LLM client uses Faraday (most do), just add the middleware:
+ If your LLM client uses Faraday, add the middleware to that connection:

  ```ruby
  conn = Faraday.new(url: "https://api.openai.com") do |f|
@@ -46,16 +50,16 @@ conn = Faraday.new(url: "https://api.openai.com") do |f|
  f.adapter Faraday.default_adapter
  end

- # Every request through this connection is now tracked automatically
- response = conn.post("/v1/chat/completions", {
- model: "gpt-4o",
- messages: [{ role: "user", content: "Hello!" }]
+ # Every supported LLM request through this connection is tracked
+ response = conn.post("/v1/responses", {
+ model: "gpt-5-mini",
+ input: "Hello!"
  })
  ```

  ### Option 2: Patch an existing client

- Most LLM gems expose their Faraday connection. For example, with `ruby-openai`:
+ Some LLM gems expose their Faraday connection. For example, with `ruby-openai`:

  ```ruby
  # config/initializers/openai.rb
@@ -68,6 +72,8 @@ OpenAI.configure do |config|
  end
  ```

+ If a client does not expose its HTTP connection, use manual tracking or register a custom parser around the HTTP layer you control.
+
  ### Option 3: Manual tracking

  For non-Faraday clients, track manually:
@@ -78,6 +84,7 @@ LlmCostTracker.track(
  model: "claude-sonnet-4-6",
  input_tokens: 1500,
  output_tokens: 320,
+ cache_read_input_tokens: 1200,
  feature: "summarizer",
  user_id: current_user.id
  )
@@ -107,11 +114,13 @@ LlmCostTracker.configure do |config|

  # Override pricing for custom/fine-tuned models (per 1M tokens)
  config.pricing_overrides = {
- "ft:gpt-4o-mini:my-org" => { input: 0.30, output: 1.20 }
+ "ft:gpt-4o-mini:my-org" => { input: 0.30, cached_input: 0.15, output: 1.20 }
  }
  end
  ```

+ Pricing is best-effort and based on public provider pricing for standard token usage. Providers change pricing frequently, and some features have extra charges or tiered pricing. Use `pricing_overrides` for fine-tunes, gateway-specific model IDs, enterprise discounts, batch pricing, long-context premiums, and any model this gem does not know yet.
+
  ## Querying Costs (ActiveRecord)

  ```ruby
@@ -154,7 +163,15 @@ ActiveSupport::Notifications.subscribe("llm_request.llm_cost_tracker") do |*, pa
  # input_tokens: 150,
  # output_tokens: 42,
  # total_tokens: 192,
- # cost: { input_cost: 0.000375, output_cost: 0.00042, total_cost: 0.000795, currency: "USD" },
+ # cost: {
+ # input_cost: 0.000375,
+ # cached_input_cost: 0.0,
+ # cache_read_input_cost: 0.0,
+ # cache_creation_input_cost: 0.0,
+ # output_cost: 0.00042,
+ # total_cost: 0.000795,
+ # currency: "USD"
+ # },
  # tags: { feature: "chat", user_id: 42 },
  # tracked_at: 2026-04-16 14:30:00 UTC
  # }
@@ -210,11 +227,17 @@ LlmCostTracker::Parsers::Registry.register(DeepSeekParser.new)

  | Provider | Auto-detected | Models with pricing |
  |----------|:---:|---|
- | OpenAI | ✅ | GPT-4o, GPT-4o-mini, GPT-4-turbo, GPT-4, GPT-3.5-turbo, o1, o1-mini, o3-mini |
- | Anthropic | ✅ | Claude Opus 4.6, Sonnet 4.6, Haiku 4.5, Claude 3.5 Sonnet, Claude 3 Opus |
- | Google Gemini | ✅ | Gemini 2.5 Pro/Flash, 2.0 Flash, 1.5 Pro/Flash |
+ | OpenAI | ✅ | GPT-5.2/5.1/5, GPT-5 mini/nano, GPT-4.1, GPT-4o, o1/o3/o4-mini |
+ | Anthropic | ✅ | Claude Opus 4.6/4.1/4, Sonnet 4.6/4.5/4, Haiku 4.5, Claude 3.x |
+ | Google Gemini | ✅ | Gemini 2.5 Pro/Flash/Flash-Lite, 2.0 Flash/Flash-Lite, 1.5 Pro/Flash |
  | Any other | 🔧 | Via custom parser (see above) |

+ Supported endpoint families:
+
+ - OpenAI: Chat Completions, Responses, Completions, Embeddings
+ - Anthropic: Messages
+ - Google Gemini: `generateContent` responses with `usageMetadata`
+
  ## How It Works

  ```
@@ -228,7 +251,9 @@ Your App → Faraday → [LlmCostTracker Middleware] → LLM API
  ActiveRecord / Log / Custom
  ```

- The middleware intercepts **outgoing** HTTP responses (not incoming requests), parses the `usage` object from the LLM provider's response body, looks up pricing, and records the event. It never modifies requests or responses — it's read-only.
+ The middleware intercepts **outgoing** HTTP responses (not incoming Rails requests), parses the provider usage object, looks up pricing, and records the event. It never modifies requests or responses.
+
+ For streaming APIs, tracking depends on the final response body including provider usage data. If the client consumes server-sent events without exposing the final usage payload to Faraday, use manual tracking.

@@ -237,6 +262,7 @@ git clone https://github.com/sergey-homenko/llm_cost_tracker.git
  cd llm_cost_tracker
  bundle install
  bundle exec rspec
+ bundle exec rubocop
  ```

  ## Contributing
data/Rakefile CHANGED
@@ -2,7 +2,9 @@

  require "bundler/gem_tasks"
  require "rspec/core/rake_task"
+ require "rubocop/rake_task"

  RSpec::Core::RakeTask.new(:spec)
+ RuboCop::RakeTask.new(:rubocop)

- task default: :spec
+ task default: %i[spec rubocop]
@@ -9,7 +9,9 @@ module LlmCostTracker
  # Scopes for querying
  scope :by_provider, ->(provider) { where(provider: provider) }
  scope :by_model, ->(model) { where(model: model) }
- scope :by_tag, ->(key, value) { where("tags LIKE ?", "%\"#{key}\":\"#{value}\"%") }
+ scope :by_tag, lambda { |key, value|
+ where("tags LIKE ?", "%\"#{key}\":\"#{value}\"%")
+ }

  scope :today, -> { where(tracked_at: Time.now.utc.beginning_of_day..) }
  scope :this_week, -> { where(tracked_at: Time.now.utc.beginning_of_week..) }
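Reviewer's note: the `by_tag` scope above matches a JSON substring of the form `"key":"value"`, which only works when tag values are serialized as strings. A minimal sketch of that matching (the `tag_match?` helper is illustrative, not part of the gem):

```ruby
require "json"

# Simulates the LIKE pattern used by the by_tag scope: it looks for the
# exact substring "key":"value" inside the serialized tags column.
def tag_match?(tags_json, key, value)
  tags_json.include?("\"#{key}\":\"#{value}\"")
end

tag_match?({ user_id: "42" }.to_json, "user_id", "42") # matches
tag_match?({ user_id: 42 }.to_json, "user_id", "42")   # misses: 42 serialized without quotes
```

This is why the release also stringifies tag values before storage.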
@@ -19,9 +19,6 @@ module LlmCostTracker
  @app.call(request_env).on_complete do |response_env|
  process(request_url, request_body, response_env)
  end
- rescue StandardError => e
- # Never break the actual request — log and re-raise
- raise e
  end

  private
@@ -46,7 +43,9 @@ module LlmCostTracker
  metadata: @tags.merge(parsed.except(:provider, :model, :input_tokens, :output_tokens, :total_tokens))
  )
  rescue StandardError => e
- warn "[LlmCostTracker] Error processing response: #{e.message}" if LlmCostTracker.configuration.log_level == :debug
+ return unless LlmCostTracker.configuration.log_level == :debug
+
+ warn "[LlmCostTracker] Error processing response: #{e.message}"
  end

  def read_body(body)
@@ -14,7 +14,7 @@ module LlmCostTracker
  false
  end

- def parse(request_url, request_body, response_status, response_body)
+ def parse(_request_url, request_body, response_status, response_body)
  return nil unless response_status == 200

  response = safe_json_parse(response_body)
@@ -28,9 +28,11 @@ module LlmCostTracker
  model: response["model"] || request["model"],
  input_tokens: usage["input_tokens"] || 0,
  output_tokens: usage["output_tokens"] || 0,
- total_tokens: (usage["input_tokens"] || 0) + (usage["output_tokens"] || 0),
- cache_read_tokens: usage["cache_read_input_tokens"],
- cache_creation_tokens: usage["cache_creation_input_tokens"]
+ total_tokens: (usage["input_tokens"] || 0) + (usage["output_tokens"] || 0) +
+ (usage["cache_read_input_tokens"] || 0) +
+ (usage["cache_creation_input_tokens"] || 0),
+ cache_read_input_tokens: usage["cache_read_input_tokens"],
+ cache_creation_input_tokens: usage["cache_creation_input_tokens"]
  }.compact
  end
  end
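Reviewer's note: the new `total_tokens` math above folds cache reads and cache creation into the total, reflecting that Anthropic reports `input_tokens` net of cache activity. A standalone sketch of that calculation (the `anthropic_total_tokens` helper is illustrative, not the gem's API):

```ruby
require "json"

# Sums all four usage counters from an Anthropic Messages response body;
# missing keys count as zero.
def anthropic_total_tokens(response_body)
  usage = JSON.parse(response_body).fetch("usage", {})
  (usage["input_tokens"] || 0) + (usage["output_tokens"] || 0) +
    (usage["cache_read_input_tokens"] || 0) +
    (usage["cache_creation_input_tokens"] || 0)
end

body = '{"usage":{"input_tokens":10,"output_tokens":5,"cache_read_input_tokens":100}}'
anthropic_total_tokens(body) # => 115
```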
@@ -14,7 +14,7 @@ module LlmCostTracker
  false
  end

- def parse(request_url, request_body, response_status, response_body)
+ def parse(request_url, _request_body, response_status, response_body)
  return nil unless response_status == 200

  response = safe_json_parse(response_body)
@@ -29,8 +29,9 @@ module LlmCostTracker
  model: model,
  input_tokens: usage["promptTokenCount"] || 0,
  output_tokens: usage["candidatesTokenCount"] || 0,
- total_tokens: usage["totalTokenCount"] || 0
- }
+ total_tokens: usage["totalTokenCount"] || 0,
+ cached_input_tokens: usage["cachedContentTokenCount"]
+ }.compact

  private
@@ -6,16 +6,16 @@ module LlmCostTracker
  module Parsers
  class Openai < Base
  HOSTS = %w[api.openai.com].freeze
- TRACKED_PATHS = %w[/v1/chat/completions /v1/completions /v1/embeddings].freeze
+ TRACKED_PATHS = %w[/v1/chat/completions /v1/completions /v1/embeddings /v1/responses].freeze

  def match?(url)
  uri = URI.parse(url.to_s)
- HOSTS.include?(uri.host) && TRACKED_PATHS.any? { |p| uri.path.start_with?(p) }
+ HOSTS.include?(uri.host) && TRACKED_PATHS.include?(uri.path)
  rescue URI::InvalidURIError
  false
  end

- def parse(request_url, request_body, response_status, response_body)
+ def parse(_request_url, request_body, response_status, response_body)
  return nil unless response_status == 200

  response = safe_json_parse(response_body)
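Reviewer's note: the `match?` change above swaps prefix matching for exact path matching, so sub-resource URLs (for example, a `/v1/responses/resp_123` retrieval, which carries no new usage) are no longer tracked. A runnable sketch of the new predicate (standalone `tracked?` helper, mirroring the parser's logic):

```ruby
require "uri"

HOSTS = %w[api.openai.com].freeze
TRACKED_PATHS = %w[/v1/chat/completions /v1/completions /v1/embeddings /v1/responses].freeze

# Exact path membership: prefix matching would also have caught
# retrieval and sub-resource URLs under the tracked endpoints.
def tracked?(url)
  uri = URI.parse(url.to_s)
  HOSTS.include?(uri.host) && TRACKED_PATHS.include?(uri.path)
rescue URI::InvalidURIError
  false
end

tracked?("https://api.openai.com/v1/responses")          # tracked
tracked?("https://api.openai.com/v1/responses/resp_123") # no longer tracked
```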
@@ -27,10 +27,18 @@ module LlmCostTracker
  {
  provider: "openai",
  model: response["model"] || request["model"],
- input_tokens: usage["prompt_tokens"] || 0,
- output_tokens: usage["completion_tokens"] || 0,
- total_tokens: usage["total_tokens"] || 0
- }
+ input_tokens: usage["prompt_tokens"] || usage["input_tokens"] || 0,
+ output_tokens: usage["completion_tokens"] || usage["output_tokens"] || 0,
+ total_tokens: usage["total_tokens"] || 0,
+ cached_input_tokens: cached_input_tokens(usage)
+ }.compact
+ end
+
+ private
+
+ def cached_input_tokens(usage)
+ details = usage["prompt_tokens_details"] || usage["input_tokens_details"] || {}
+ details["cached_tokens"]
  end
  end
  end
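Reviewer's note: the fallbacks above let one parser cover both OpenAI usage shapes, since Chat Completions reports `prompt_tokens` with `prompt_tokens_details` while the Responses API reports `input_tokens` with `input_tokens_details`. A standalone sketch of the cached-token lookup:

```ruby
require "json"

# Tries the Chat Completions details key first, then the Responses API
# key; returns nil when no cached-token details are present.
def cached_input_tokens(usage)
  details = usage["prompt_tokens_details"] || usage["input_tokens_details"] || {}
  details["cached_tokens"]
end

chat      = JSON.parse('{"prompt_tokens":100,"prompt_tokens_details":{"cached_tokens":80}}')
responses = JSON.parse('{"input_tokens":100,"input_tokens_details":{"cached_tokens":60}}')
cached_input_tokens(chat)      # => 80
cached_input_tokens(responses) # => 60
```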
@@ -6,43 +6,78 @@ module LlmCostTracker
  module Pricing
  PRICES = {
  # OpenAI
- "gpt-4o" => { input: 2.50, output: 10.00 },
- "gpt-4o-mini" => { input: 0.15, output: 0.60 },
+ "gpt-5.2" => { input: 1.75, cached_input: 0.175, output: 14.00 },
+ "gpt-5.1" => { input: 1.25, cached_input: 0.125, output: 10.00 },
+ "gpt-5" => { input: 1.25, cached_input: 0.125, output: 10.00 },
+ "gpt-5-mini" => { input: 0.25, cached_input: 0.025, output: 2.00 },
+ "gpt-5-nano" => { input: 0.05, cached_input: 0.005, output: 0.40 },
+ "gpt-4.1" => { input: 2.00, cached_input: 0.50, output: 8.00 },
+ "gpt-4.1-mini" => { input: 0.40, cached_input: 0.10, output: 1.60 },
+ "gpt-4.1-nano" => { input: 0.10, cached_input: 0.025, output: 0.40 },
+ "gpt-4o-2024-05-13" => { input: 5.00, output: 15.00 },
+ "gpt-4o" => { input: 2.50, cached_input: 1.25, output: 10.00 },
+ "gpt-4o-mini" => { input: 0.15, cached_input: 0.075, output: 0.60 },
  "gpt-4-turbo" => { input: 10.00, output: 30.00 },
  "gpt-4" => { input: 30.00, output: 60.00 },
  "gpt-3.5-turbo" => { input: 0.50, output: 1.50 },
- "o1" => { input: 15.00, output: 60.00 },
- "o1-mini" => { input: 3.00, output: 12.00 },
- "o3-mini" => { input: 1.10, output: 4.40 },
+ "o1" => { input: 15.00, cached_input: 7.50, output: 60.00 },
+ "o1-mini" => { input: 1.10, cached_input: 0.55, output: 4.40 },
+ "o3" => { input: 2.00, cached_input: 0.50, output: 8.00 },
+ "o3-mini" => { input: 1.10, cached_input: 0.55, output: 4.40 },
+ "o4-mini" => { input: 1.10, cached_input: 0.275, output: 4.40 },

  # Anthropic
- "claude-sonnet-4-6" => { input: 3.00, output: 15.00 },
- "claude-opus-4-6" => { input: 15.00, output: 75.00 },
- "claude-haiku-4-5" => { input: 0.80, output: 4.00 },
- "claude-3-5-sonnet-20241022" => { input: 3.00, output: 15.00 },
- "claude-3-5-haiku-20241022" => { input: 0.80, output: 4.00 },
- "claude-3-opus-20240229" => { input: 15.00, output: 75.00 },
+ "claude-sonnet-4-6" => { input: 3.00, output: 15.00, cache_read_input: 0.30, cache_creation_input: 3.75 },
+ "claude-opus-4-6" => { input: 5.00, output: 25.00, cache_read_input: 0.50, cache_creation_input: 6.25 },
+ "claude-opus-4-1" => { input: 15.00, output: 75.00, cache_read_input: 1.50, cache_creation_input: 18.75 },
+ "claude-opus-4" => { input: 15.00, output: 75.00, cache_read_input: 1.50, cache_creation_input: 18.75 },
+ "claude-sonnet-4-5" => { input: 3.00, output: 15.00, cache_read_input: 0.30, cache_creation_input: 3.75 },
+ "claude-sonnet-4" => { input: 3.00, output: 15.00, cache_read_input: 0.30, cache_creation_input: 3.75 },
+ "claude-haiku-4-5" => { input: 1.00, output: 5.00, cache_read_input: 0.10, cache_creation_input: 1.25 },
+ "claude-3-7-sonnet" => { input: 3.00, output: 15.00, cache_read_input: 0.30, cache_creation_input: 3.75 },
+ "claude-3-5-sonnet" => { input: 3.00, output: 15.00, cache_read_input: 0.30, cache_creation_input: 3.75 },
+ "claude-3-5-haiku" => { input: 0.80, output: 4.00, cache_read_input: 0.08, cache_creation_input: 1.00 },
+ "claude-3-opus" => { input: 15.00, output: 75.00, cache_read_input: 1.50, cache_creation_input: 18.75 },

  # Google Gemini
- "gemini-2.5-pro" => { input: 1.25, output: 10.00 },
- "gemini-2.5-flash" => { input: 0.15, output: 0.60 },
- "gemini-2.0-flash" => { input: 0.10, output: 0.40 },
+ "gemini-2.5-pro" => { input: 1.25, cached_input: 0.125, output: 10.00 },
+ "gemini-2.5-flash" => { input: 0.30, cached_input: 0.03, output: 2.50 },
+ "gemini-2.5-flash-lite" => { input: 0.10, cached_input: 0.01, output: 0.40 },
+ "gemini-2.0-flash" => { input: 0.10, cached_input: 0.025, output: 0.40 },
+ "gemini-2.0-flash-lite" => { input: 0.075, output: 0.30 },
  "gemini-1.5-pro" => { input: 1.25, output: 5.00 },
- "gemini-1.5-flash" => { input: 0.075, output: 0.30 },
+ "gemini-1.5-flash" => { input: 0.075, output: 0.30 }
  }.freeze

  class << self
- def cost_for(model:, input_tokens:, output_tokens:)
+ def cost_for(model:, input_tokens:, output_tokens:, cached_input_tokens: 0,
+ cache_read_input_tokens: 0, cache_creation_input_tokens: 0)
  prices = lookup(model)
  return nil unless prices

- input_cost = (input_tokens.to_f / 1_000_000) * prices[:input]
+ cached_input_tokens = cached_input_tokens.to_i
+ cache_read_input_tokens = cache_read_input_tokens.to_i
+ cache_creation_input_tokens = cache_creation_input_tokens.to_i
+ uncached_input_tokens = [input_tokens.to_i - cached_input_tokens, 0].max
+
+ input_cost = (uncached_input_tokens.to_f / 1_000_000) * prices[:input]
+ cached_input_cost = (cached_input_tokens.to_f / 1_000_000) *
+ (prices[:cached_input] || prices[:input])
+ cache_read_input_cost = (cache_read_input_tokens.to_f / 1_000_000) *
+ (prices[:cache_read_input] || prices[:cached_input] || prices[:input])
+ cache_creation_input_cost = (cache_creation_input_tokens.to_f / 1_000_000) *
+ (prices[:cache_creation_input] || prices[:input])
  output_cost = (output_tokens.to_f / 1_000_000) * prices[:output]
+ total_cost = input_cost + cached_input_cost + cache_read_input_cost +
+ cache_creation_input_cost + output_cost

  {
  input_cost: input_cost.round(8),
+ cached_input_cost: cached_input_cost.round(8),
+ cache_read_input_cost: cache_read_input_cost.round(8),
+ cache_creation_input_cost: cache_creation_input_cost.round(8),
  output_cost: output_cost.round(8),
- total_cost: (input_cost + output_cost).round(8),
+ total_cost: total_cost.round(8),
  currency: "USD"
  }
  end
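Reviewer's note: the reworked `cost_for` above bills cached input at the cached rate and subtracts it from the standard input count. A condensed, runnable sketch of that math for a single illustrative price entry (prices are per 1M tokens; the real method also handles cache read and cache creation rates):

```ruby
# One hypothetical price entry, per 1M tokens.
PRICES = { input: 1.25, cached_input: 0.125, output: 10.00 }.freeze

def cost_for(input_tokens:, output_tokens:, cached_input_tokens: 0)
  # Cached tokens come out of the standard input count, clamped at zero.
  uncached = [input_tokens - cached_input_tokens, 0].max
  input_cost  = (uncached.to_f / 1_000_000) * PRICES[:input]
  cached_cost = (cached_input_tokens.to_f / 1_000_000) * (PRICES[:cached_input] || PRICES[:input])
  output_cost = (output_tokens.to_f / 1_000_000) * PRICES[:output]
  (input_cost + cached_cost + output_cost).round(8)
end

cost_for(input_tokens: 10_000, output_tokens: 1_000, cached_input_tokens: 8_000)
# 2,000 uncached in + 8,000 cached in + 1,000 out
```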
@@ -62,7 +97,7 @@ module LlmCostTracker
  def fuzzy_match(model)
  return nil unless model

- PRICES.each do |key, value|
+ PRICES.sort_by { |key, _value| -key.length }.each do |key, value|
  return value if model.start_with?(key)
  end

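Reviewer's note: sorting price keys longest-first matters because models are matched by prefix: without it, a dated id like `gpt-5-mini-2025-01-01` could resolve to the shorter `gpt-5` entry, depending on hash order. A two-key sketch (hypothetical price table, illustrative rates):

```ruby
PRICES = {
  "gpt-5"      => { input: 1.25 },
  "gpt-5-mini" => { input: 0.25 }
}.freeze

def fuzzy_match(model)
  return nil unless model

  # Longest keys first, so the most specific prefix wins.
  PRICES.sort_by { |key, _value| -key.length }.each do |key, value|
    return value if model.start_with?(key)
  end
  nil
end

fuzzy_match("gpt-5-mini-2025-01-01") # resolves to the "gpt-5-mini" entry
```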
@@ -14,7 +14,7 @@ module LlmCostTracker
  input_cost: event.dig(:cost, :input_cost),
  output_cost: event.dig(:cost, :output_cost),
  total_cost: event.dig(:cost, :total_cost),
- tags: event[:tags].to_json,
+ tags: stringify_tags(event[:tags]).to_json,
  tracked_at: event[:tracked_at]
  )
  end
@@ -31,6 +31,18 @@ module LlmCostTracker
  def model_class
  LlmCostTracker::LlmApiCall
  end
+
+ private
+
+ def stringify_tags(tags)
+ tags.transform_keys(&:to_s).transform_values { |value| stringify_tag_value(value) }
+ end
+
+ def stringify_tag_value(value)
+ return value.transform_values { |nested| stringify_tag_value(nested) } if value.is_a?(Hash)
+
+ value.to_s
+ end
  end
  end
  end
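Reviewer's note: `stringify_tags` above normalizes keys and leaf values to strings before JSON serialization, which is what lets `by_tag("user_id", "42")` find events recorded with a numeric id. Extracted as a runnable sketch:

```ruby
require "json"

# Keys become strings, leaf values become strings, nested hashes recurse.
def stringify_tags(tags)
  tags.transform_keys(&:to_s).transform_values { |value| stringify_tag_value(value) }
end

def stringify_tag_value(value)
  return value.transform_values { |nested| stringify_tag_value(nested) } if value.is_a?(Hash)

  value.to_s
end

stringify_tags(feature: "chat", user_id: 42).to_json
# serializes user_id as the quoted string "42"
```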
@@ -6,18 +6,23 @@ module LlmCostTracker

  class << self
  def record(provider:, model:, input_tokens:, output_tokens:, metadata: {})
+ usage = usage_data(input_tokens, output_tokens, metadata)
+
  cost_data = Pricing.cost_for(
  model: model,
- input_tokens: input_tokens,
- output_tokens: output_tokens
+ input_tokens: usage[:input_tokens],
+ output_tokens: usage[:output_tokens],
+ cached_input_tokens: usage[:cached_input_tokens],
+ cache_read_input_tokens: usage[:cache_read_input_tokens],
+ cache_creation_input_tokens: usage[:cache_creation_input_tokens]
  )

  event = {
  provider: provider,
  model: model,
- input_tokens: input_tokens,
- output_tokens: output_tokens,
- total_tokens: input_tokens + output_tokens,
+ input_tokens: usage[:input_tokens],
+ output_tokens: usage[:output_tokens],
+ total_tokens: usage[:total_tokens],
  cost: cost_data,
  tags: LlmCostTracker.configuration.default_tags.merge(metadata),
  tracked_at: Time.now.utc
@@ -51,7 +56,7 @@ module LlmCostTracker
  end

  def log_event(event)
- cost_str = event[:cost] ? "$#{'%.6f' % event[:cost][:total_cost]}" : "unknown"
+ cost_str = event[:cost] ? "$#{format('%.6f', event[:cost][:total_cost])}" : "unknown"

  message = "[LlmCostTracker] #{event[:provider]}/#{event[:model]} " \
  "tokens=#{event[:input_tokens]}+#{event[:output_tokens]} " \
@@ -72,9 +77,12 @@ module LlmCostTracker
  end

  def store_active_record(event)
- return unless defined?(LlmCostTracker::Storage::ActiveRecordStore)
+ require_relative "llm_api_call" unless defined?(LlmCostTracker::LlmApiCall)
+ require_relative "storage/active_record_store" unless defined?(LlmCostTracker::Storage::ActiveRecordStore)

  LlmCostTracker::Storage::ActiveRecordStore.save(event)
+ rescue LoadError => e
+ raise Error, "ActiveRecord storage requires the active_record gem: #{e.message}"
  end

  def check_budget(event)
@@ -96,12 +104,41 @@ module LlmCostTracker
  # For :active_record backend, query the DB
  if LlmCostTracker.configuration.active_record? &&
  defined?(LlmCostTracker::Storage::ActiveRecordStore)
- LlmCostTracker::Storage::ActiveRecordStore.monthly_total + latest_cost
+ LlmCostTracker::Storage::ActiveRecordStore.monthly_total
  else
  # For other backends, we can only report the latest cost
  latest_cost
  end
  end
+
+ def usage_data(input_tokens, output_tokens, metadata)
+ cache_read_input_tokens = integer_metadata(metadata, :cache_read_input_tokens, :cache_read_tokens)
+ cache_creation_input_tokens = integer_metadata(
+ metadata,
+ :cache_creation_input_tokens,
+ :cache_creation_tokens
+ )
+ cached_input_tokens = integer_metadata(metadata, :cached_input_tokens)
+
+ {
+ input_tokens: input_tokens.to_i,
+ output_tokens: output_tokens.to_i,
+ cached_input_tokens: cached_input_tokens,
+ cache_read_input_tokens: cache_read_input_tokens,
+ cache_creation_input_tokens: cache_creation_input_tokens,
+ total_tokens: input_tokens.to_i + output_tokens.to_i +
+ cache_read_input_tokens + cache_creation_input_tokens
+ }
+ end
+
+ def integer_metadata(metadata, *keys)
+ keys.each do |key|
+ value = metadata[key] || metadata[key.to_s]
+ return value.to_i unless value.nil?
+ end
+
+ 0
+ end
  end
  end
  end
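Reviewer's note: `integer_metadata` above accepts both symbol and string keys and a list of aliases (canonical name first), so metadata recorded under the older `:cache_read_tokens` key keeps working. A standalone sketch:

```ruby
# First present alias wins; the value is coerced to an integer, and
# wholly absent keys fall back to 0.
def integer_metadata(metadata, *keys)
  keys.each do |key|
    value = metadata[key] || metadata[key.to_s]
    return value.to_i unless value.nil?
  end

  0
end

integer_metadata({ "cache_read_tokens" => "1200" }, :cache_read_input_tokens, :cache_read_tokens)
# picks up the legacy string key and coerces "1200" to 1200
```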
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module LlmCostTracker
- VERSION = "0.1.0"
+ VERSION = "0.1.1"
  end
@@ -8,10 +8,10 @@ Gem::Specification.new do |spec|
  spec.authors = ["Sergii Khomenko"]
  spec.email = ["sergey@mm.st"]

- spec.summary = "Provider-agnostic LLM API cost tracking for Ruby"
- spec.description = "Automatically tracks token usage and costs for LLM API calls (OpenAI, Anthropic, Google Gemini, and more). " \
- "Works as Faraday middleware plugs into any Ruby HTTP client. " \
- "Provides ActiveRecord storage, per-user/per-feature attribution, and budget alerts."
+ spec.summary = "Self-hosted LLM API cost tracking for Ruby and Rails"
+ spec.description = "Tracks token usage and estimated costs for OpenAI, Anthropic, and Google Gemini calls. " \
+ "Works as Faraday middleware for Ruby clients, with ActiveRecord storage, " \
+ "per-user/per-feature attribution, and budget alerts."
  spec.homepage = "https://github.com/sergey-homenko/llm_cost_tracker"
  spec.license = "MIT"

@@ -19,6 +19,7 @@ Gem::Specification.new do |spec|

  spec.metadata["homepage_uri"] = spec.homepage
  spec.metadata["changelog_uri"] = "#{spec.homepage}/blob/main/CHANGELOG.md"
+ spec.metadata["rubygems_mfa_required"] = "true"

  spec.files = Dir.chdir(__dir__) do
  `git ls-files -z`.split("\x0").reject do |f|
@@ -29,13 +30,13 @@ Gem::Specification.new do |spec|

  spec.require_paths = ["lib"]

- spec.add_dependency "faraday", ">= 1.0", "< 3.0"
  spec.add_dependency "activesupport", ">= 7.0", "< 9.0"
+ spec.add_dependency "faraday", ">= 1.0", "< 3.0"

  spec.add_development_dependency "activerecord", ">= 7.0", "< 9.0"
  spec.add_development_dependency "rake", "~> 13.0"
  spec.add_development_dependency "rspec", "~> 3.0"
- spec.add_development_dependency "webmock", "~> 3.0"
- spec.add_development_dependency "sqlite3", "~> 2.0"
  spec.add_development_dependency "rubocop", "~> 1.0"
+ spec.add_development_dependency "sqlite3", "~> 2.0"
+ spec.add_development_dependency "webmock", "~> 3.0"
  end
metadata CHANGED
@@ -1,55 +1,55 @@
  --- !ruby/object:Gem::Specification
  name: llm_cost_tracker
  version: !ruby/object:Gem::Version
- version: 0.1.0
+ version: 0.1.1
  platform: ruby
  authors:
  - Sergii Khomenko
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2026-04-16 00:00:00.000000000 Z
+ date: 2026-04-17 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
- name: faraday
+ name: activesupport
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '1.0'
+ version: '7.0'
  - - "<"
  - !ruby/object:Gem::Version
- version: '3.0'
+ version: '9.0'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '1.0'
+ version: '7.0'
  - - "<"
  - !ruby/object:Gem::Version
- version: '3.0'
+ version: '9.0'
  - !ruby/object:Gem::Dependency
- name: activesupport
+ name: faraday
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '7.0'
+ version: '1.0'
  - - "<"
  - !ruby/object:Gem::Version
- version: '9.0'
+ version: '3.0'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '7.0'
+ version: '1.0'
  - - "<"
  - !ruby/object:Gem::Version
- version: '9.0'
+ version: '3.0'
  - !ruby/object:Gem::Dependency
  name: activerecord
  requirement: !ruby/object:Gem::Requirement
@@ -99,19 +99,19 @@ dependencies:
  - !ruby/object:Gem::Version
  version: '3.0'
  - !ruby/object:Gem::Dependency
- name: webmock
+ name: rubocop
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '3.0'
+ version: '1.0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '3.0'
+ version: '1.0'
  - !ruby/object:Gem::Dependency
  name: sqlite3
  requirement: !ruby/object:Gem::Requirement
@@ -127,23 +127,22 @@ dependencies:
  - !ruby/object:Gem::Version
  version: '2.0'
  - !ruby/object:Gem::Dependency
- name: rubocop
+ name: webmock
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '1.0'
+ version: '3.0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '1.0'
- description: Automatically tracks token usage and costs for LLM API calls (OpenAI,
- Anthropic, Google Gemini, and more). Works as Faraday middleware plugs into any
- Ruby HTTP client. Provides ActiveRecord storage, per-user/per-feature attribution,
- and budget alerts.
+ version: '3.0'
+ description: Tracks token usage and estimated costs for OpenAI, Anthropic, and Google
+ Gemini calls. Works as Faraday middleware for Ruby clients, with ActiveRecord storage,
+ per-user/per-feature attribution, and budget alerts.
  email:
  - sergey@mm.st
  executables: []
@@ -151,6 +150,7 @@ extensions: []
  extra_rdoc_files: []
  files:
  - ".rspec"
+ - ".rubocop.yml"
  - CHANGELOG.md
  - LICENSE.txt
  - README.md
@@ -179,6 +179,7 @@ licenses:
  metadata:
  homepage_uri: https://github.com/sergey-homenko/llm_cost_tracker
  changelog_uri: https://github.com/sergey-homenko/llm_cost_tracker/blob/main/CHANGELOG.md
+ rubygems_mfa_required: 'true'
  post_install_message:
  rdoc_options: []
  require_paths:
@@ -197,5 +198,5 @@ requirements: []
  rubygems_version: 3.5.9
  signing_key:
  specification_version: 4
- summary: Provider-agnostic LLM API cost tracking for Ruby
+ summary: Self-hosted LLM API cost tracking for Ruby and Rails
  test_files: []