RubyGems - llm_cost_tracker - Versions diffs - 0.3.3 → 0.4.1 - Mend

llm_cost_tracker 0.3.3 → 0.4.1


Files changed (47)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: b966913d302d5c5c3466615d1fa3983855c241f6cd9e3e26558c0fcc5fc4e7d5
-  data.tar.gz: 52804e702d5f01e5a4d247e8b50e601dede2b328bd7075c68ffd5f472b3b0d58
+  metadata.gz: d2cdd5f30c6fbd8c0168549b0853e9d8bc54586e60921733ce11a89a1d86078c
+  data.tar.gz: c91384579df6acdeb04d24b62f8bf916040f98156fd2bc882c94afc534f7dba5
 SHA512:
-  metadata.gz: 609ba1a18be86dce0b567b2ea33b3f3123da88683f0c65d9aef780f2e4854d1dde6686adfa505fc154d13da6dd6cb2b31d9f38c303de5fb22f6fda65c7f44aa7
-  data.tar.gz: de372e0940b4cfc400dacfc6dbf9e00f256c6944209da8cceaadd20a318b8c7aa8982d5190e21d38640ad20d04cc86400a4e872ec640189796e045acf1f7dfad
+  metadata.gz: 88d61d6714101ee9e8162814f5527bde487eced83663d86a1f938b77bcee1e4fcb4db2c4dde763720a828368a779a8677da57ecf10b2fadec78a959e6fdce6a7
+  data.tar.gz: d2d2bb097058507c06c1ea330a0c7d5a63d2e92b824fd8e4e7640a052dff100038eadb2e097edb4500457c382368080414c5261a2cc0a4f69436ae2234fd420d
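Note on the checksum hunk above: the values can be checked locally against the files unpacked from the `.gem` archive. The following is a minimal sketch, assuming the `checksums.yaml` layout shown in this diff (a top-level `SHA256` key mapping entry names to hex digests); `checksum_matches?` is a hypothetical helper, not part of RubyGems or this gem.

```ruby
require "digest"
require "yaml"

# Hypothetical helper: returns true when the SHA256 digest of `bytes`
# equals the value recorded for `entry_name` under the "SHA256" key.
def checksum_matches?(checksums_yaml, entry_name, bytes)
  expected = YAML.safe_load(checksums_yaml).fetch("SHA256").fetch(entry_name)
  Digest::SHA256.hexdigest(bytes) == expected
end
```

In practice `bytes` would come from something like `File.binread("data.tar.gz")` after extracting the gem archive; the same pattern applies to the `SHA512` section with `Digest::SHA512`.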

data/CHANGELOG.md CHANGED

@@ -4,6 +4,37 @@ Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). Versioning: [S
 ## [Unreleased]
+## [0.4.1] - 2026-04-24
+### Changed
+- Batched ActiveRecord period rollup writes and budget total reads.
+- Memoized schema capability checks and refreshed them on `reset_column_information`.
+- Install migration adds `[:model, :tracked_at]` composite index and drops redundant single-column `:provider` / `:model` indexes.
+- Data Quality now reads counters and usage sums through one aggregate query.
+- Parser URL matching, stream-event extraction, and custom parser registration now share a smaller base/registry extension surface.
+- Added cookbook recipes for `ruby-openai`, `anthropic-sdk-ruby`, `gemini-ai`, `langchainrb`, Azure OpenAI, and LiteLLM proxy setups.
+### Fixed
+- `llm_cost_tracker:add_period_totals` now imports legacy monthly rollups and backfills before adding the unique index.
+- Budget docs now describe `:notify` across monthly, daily, and per-call budgets.
+## [0.4.0] - 2026-04-24
+### Changed
+- BREAKING: Canonical usage and pricing now use `cache_read_input` / `cache_write_input` instead of `cached_input` / `cache_creation_input`.
+- BREAKING: `Pricing.cost_for` now requires `provider:` and prefers provider-specific price entries before model-only entries.
+- BREAKING: Fresh ActiveRecord installs include cache-read, cache-write, and hidden-output token/cost breakdown columns.
+- BREAKING: ActiveRecord budget rollups now use `llm_cost_tracker_period_totals`.
+- BREAKING: `llm_cost_tracker:add_monthly_totals` was replaced by `llm_cost_tracker:add_period_totals`.
+- `llm_cost_tracker:add_usage_breakdown` generator for upgrading existing ActiveRecord installs.
+- `llm_cost_tracker:add_period_totals` generator for upgrading existing ActiveRecord installs.
+- Generic `pricing_mode` support with mode-prefixed local price keys.
+- Data Quality now shows usage bucket totals and hidden-output share.
+- Daily budget and per-call budget guardrails.
 ## [0.3.3] - 2026-04-24
 ### Added

data/README.md CHANGED

@@ -15,7 +15,7 @@ Every Rails app with LLM integrations eventually runs into the same question: wh
 ## What You Get
-- A local ActiveRecord ledger of provider, model, tokens, cost, latency, tags, streaming usage, and provider response IDs
+- A local ActiveRecord ledger of provider, model, usage breakdown, cost, latency, tags, streaming usage, and provider response IDs
 - Faraday middleware plus explicit `track` / `track_stream` helpers for non-Faraday clients
 - Server-rendered Rails dashboard with overview, calls, tags, CSV export, and data-quality pages
 - Local pricing snapshots, price sync tasks, and budget guardrails
@@ -159,7 +159,9 @@ LlmCostTracker.track_stream(provider: "anthropic", model: "claude-sonnet-4-6") d
 end
 ```
-Run `bin/rails g llm_cost_tracker:add_streaming` once on existing installs to add the `stream` and `usage_source` columns. Run `bin/rails g llm_cost_tracker:add_provider_response_id` to persist provider-issued response IDs.
+Run `bin/rails g llm_cost_tracker:add_streaming` once on existing installs to add the `stream` and `usage_source` columns. Run `bin/rails g llm_cost_tracker:add_provider_response_id` to persist provider-issued response IDs. Run `bin/rails g llm_cost_tracker:add_usage_breakdown` to add cache-read, cache-write, hidden-output, and pricing-mode columns.
+More client-specific snippets live in [`docs/cookbook.md`](docs/cookbook.md).
 ### Manual tracking
@@ -176,6 +178,10 @@ LlmCostTracker.track(
 )
 ```
+`input_tokens` is regular non-cache input. Put cache hits in
+`cache_read_input_tokens` and cache writes in `cache_write_input_tokens`; total
+tokens are calculated from the canonical billing breakdown.
 ## Configuration
 ```ruby
@@ -185,17 +191,19 @@ LlmCostTracker.configure do |config|
   config.default_tags = { app: "my_app", environment: Rails.env }
   config.monthly_budget = 500.00
+  config.daily_budget = 50.00
+  config.per_call_budget = 2.00
   config.budget_exceeded_behavior = :notify  # :notify, :raise, :block_requests
   config.storage_error_behavior   = :warn    # :ignore, :warn, :raise
   config.unknown_pricing_behavior = :warn    # :ignore, :warn, :raise
   config.on_budget_exceeded = ->(data) {
-    SlackNotifier.notify("#alerts", "🚨 LLM budget $#{data[:monthly_total].round(2)} / $#{data[:budget]}")
+    SlackNotifier.notify("#alerts", "🚨 LLM #{data[:budget_type]} budget $#{data[:total].round(2)} / $#{data[:budget]}")
   }
   config.prices_file = Rails.root.join("config/llm_cost_tracker_prices.yml")
   config.pricing_overrides = {
-    "ft:gpt-4o-mini:my-org" => { input: 0.30, cached_input: 0.15, output: 1.20 }
+    "ft:gpt-4o-mini:my-org" => { input: 0.30, cache_read_input: 0.15, output: 1.20 }
   }
   # Built-in: openrouter.ai, api.deepseek.com
@@ -203,7 +211,9 @@ LlmCostTracker.configure do |config|
 end
 ```
-Pricing is best effort. OpenRouter-style IDs like `openai/gpt-4o-mini` are normalized to built-in names when possible. Use `prices_file` / `pricing_overrides` for fine-tunes, gateway-specific IDs, enterprise discounts, batch pricing, or models the gem does not know.
+Pricing is best effort. OpenRouter-style IDs like `openai/gpt-4o-mini` are normalized to built-in names when possible. Use `prices_file` / `pricing_overrides` for fine-tunes, gateway-specific IDs, enterprise discounts, alternate pricing modes, or models the gem does not know.
+Provider-specific entries like `openai/gpt-4o-mini` win over model-only entries like `gpt-4o-mini`.
+Pass `pricing_mode: :batch` to use optional mode-specific keys such as `batch_input` / `batch_output`; missing mode-specific keys fall back to standard `input` / `output` rates. The same pattern works for custom modes, for example `contract_input`.
 `storage_error_behavior = :warn` (default) lets LLM responses continue if storage fails; `:raise` exposes `StorageError#original_error`.
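Note on the pricing hunk above: the README text describes two lookup rules, provider-specific entries winning over model-only entries, and mode-prefixed keys falling back to the standard rate. The sketch below illustrates that ordering only. `rate_for`, the `PRICES` table, and its numbers are hypothetical and are not taken from the gem's implementation.

```ruby
# Illustrative price table; the rates are made-up values per 1M tokens.
PRICES = {
  "openai/gpt-4o-mini" => { input: 0.15, output: 0.60, batch_input: 0.075 },
  "gpt-4o-mini" => { input: 0.20, output: 0.80 }
}.freeze

# Hypothetical lookup: prefer "provider/model", then the bare model name.
# With a pricing mode, try the mode-prefixed key first (e.g. :batch_input)
# and fall back to the standard bucket key when it is missing.
def rate_for(prices, provider:, model:, bucket:, pricing_mode: nil)
  entry = prices["#{provider}/#{model}"] || prices[model]
  return nil unless entry

  mode_key = pricing_mode ? :"#{pricing_mode}_#{bucket}" : nil
  (mode_key && entry[mode_key]) || entry[bucket]
end
```

For example, `rate_for(PRICES, provider: "openai", model: "gpt-4o-mini", bucket: :output, pricing_mode: :batch)` finds no `batch_output` key in the provider-specific entry and returns that entry's standard `output` rate.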
@@ -225,7 +235,7 @@ bin/rails generate llm_cost_tracker:prices
 {
   "metadata": { "updated_at": "2026-04-18", "currency": "USD", "unit": "1M tokens" },
   "models": {
-    "my-gateway/gpt-4o-mini": { "input": 0.20, "cached_input": 0.10, "output": 0.80 }
+    "my-gateway/gpt-4o-mini": { "input": 0.20, "cache_read_input": 0.10, "output": 0.80, "batch_input": 0.10, "batch_output": 0.40 }
   }
 }
 ```
@@ -256,16 +266,22 @@ Large price changes are flagged during sync. If a specific entry is expected to
 ```ruby
 config.storage_backend = :active_record
 config.monthly_budget = 100.00
+config.daily_budget = 10.00
+config.per_call_budget = 1.00
 config.budget_exceeded_behavior = :block_requests
 ```
-- `:notify` — fire `on_budget_exceeded` after an event pushes the month over budget.
+- `:notify` — fire `on_budget_exceeded` after an event pushes the monthly, daily, or per-call budget over the limit.
 - `:raise` — record the event, then raise `BudgetExceededError`.
-- `:block_requests` — block preflight when the stored monthly total is already over budget; still raises post-response on the event that crosses the line. Needs `:active_record` storage.
+- `:block_requests` — block preflight when the stored monthly or daily total is already over budget; still raises post-response on the event that crosses the line. Needs `:active_record` storage for preflight.
+`monthly_budget` and `daily_budget` are cumulative ledger limits. `per_call_budget` is a ceiling for a single priced event and runs after the response cost is known.
+ActiveRecord installs keep `llm_cost_tracker_period_totals` in sync with atomic upserts. Budget preflight reads period rollups instead of scanning `llm_api_calls`.
 ```ruby
 rescue LlmCostTracker::BudgetExceededError => e
-  # e.monthly_total, e.budget, e.last_event
+  # e.budget_type, e.total, e.budget, e.monthly_total, e.daily_total, e.call_cost, e.last_event
 ```
 `:block_requests` is a **guardrail, not a hard cap**. The preflight and the spend-recording write are separate statements, so under Puma / Sidekiq concurrency multiple workers can all pass the preflight and then collectively overshoot the budget. The setting reliably *stops new requests after the overshoot is visible* — it does not prevent the overshoot itself. For strict quotas use a provider- or gateway-level limit, or a database-backed counter outside this gem.
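Note on the budget hunk above: the README distinguishes cumulative ledger limits (`monthly_budget`, `daily_budget`) from a single-event ceiling (`per_call_budget`). The sketch below shows one way such a post-response check could be expressed. `exceeded_budgets` is a hypothetical function, and the strict `>` comparison is an assumption; the diff does not show how the gem compares totals to limits.

```ruby
# Hypothetical check run after the cost of a call is known.
# `limits` maps budget types to amounts; a nil or missing limit is skipped.
# Returns the budget types whose observed value is above the limit.
def exceeded_budgets(limits, monthly_total:, daily_total:, call_cost:)
  observed = { monthly: monthly_total, daily: daily_total, per_call: call_cost }
  over = observed.select do |type, total|
    limit = limits[type]
    limit && total > limit # assumption: strictly over, not equal
  end
  over.keys
end
```

With `limits = { monthly: 100.0, daily: 10.0, per_call: 1.0 }`, a call that brings the daily total to 10.5 while the month stands at 50.0 would report only `:daily`. As the paragraph above notes, a check like this cannot prevent concurrent workers from overshooting; it only reports the overshoot once it is visible.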
@@ -343,7 +359,7 @@ On other adapters tags fall back to JSON in a text column. `by_tag` uses JSONB c
 Upgrade an existing install:
 ```bash
-bin/rails generate llm_cost_tracker:add_monthly_totals   # shared monthly budget rollups
+bin/rails generate llm_cost_tracker:add_period_totals    # shared budget rollups
 bin/rails generate llm_cost_tracker:upgrade_tags_to_jsonb   # PG: text → jsonb + GIN
 bin/rails generate llm_cost_tracker:upgrade_cost_precision  # widen cost columns
 bin/rails generate llm_cost_tracker:add_latency_ms
@@ -403,12 +419,14 @@ ActiveSupport::Notifications.subscribe("llm_request.llm_cost_tracker") do |*, pa
   # payload =>
   # {
   #   provider: "openai", model: "gpt-4o",
-  #   input_tokens: 150, output_tokens: 42, total_tokens: 192, latency_ms: 248,
+  #   input_tokens: 150, cache_read_input_tokens: 0, cache_write_input_tokens: 0,
+  #   hidden_output_tokens: 0, output_tokens: 42, total_tokens: 192, latency_ms: 248,
   #   cost: {
-  #     input_cost: 0.000375, cached_input_cost: 0.0,
-  #     cache_read_input_cost: 0.0, cache_creation_input_cost: 0.0,
-  #     output_cost: 0.00042, total_cost: 0.000795, currency: "USD"
+  #     input_cost: 0.000375, cache_read_input_cost: 0.0,
+  #     cache_write_input_cost: 0.0, output_cost: 0.00042,
+  #     total_cost: 0.000795, currency: "USD"
   #   },
+  #   pricing_mode: "batch",
   #   tags: { feature: "chat", user_id: 42 },
   #   tracked_at: 2026-04-16 14:30:00 UTC
   # }
@@ -440,21 +458,23 @@ Configured hosts are parsed using the OpenAI-compatible usage shape (`prompt_tok
 For providers with a non-OpenAI usage shape:
 ```ruby
-require "uri"
 class AcmeParser < LlmCostTracker::Parsers::Base
+  HOSTS = %w[api.acme-llm.example].freeze
+  TRACKED_PATHS = %w[/v1/generate].freeze
+  def provider_names
+    %w[acme]
+  end
   def match?(url)
-    uri = URI.parse(url.to_s)
-    uri.host == "api.acme-llm.example" && uri.path == "/v1/generate"
-  rescue URI::InvalidURIError
-    false
+    match_uri?(url, hosts: HOSTS, exact_paths: TRACKED_PATHS)
   end
-  def parse(request_url, request_body, response_status, response_body)
+  def parse(_request_url, _request_body, response_status, response_body)
     return nil unless response_status == 200
     payload = safe_json_parse(response_body)
-    usage = payload&.dig("usage")
+    usage = payload.dig("usage")
     return nil unless usage
     LlmCostTracker::ParsedUsage.build(
@@ -466,7 +486,7 @@ class AcmeParser < LlmCostTracker::Parsers::Base
   end
 end
-LlmCostTracker::Parsers::Registry.register(AcmeParser.new)
+LlmCostTracker::Parsers::Registry.register(AcmeParser)
 ```
 ## Supported providers
@@ -511,11 +531,12 @@ The gem is designed for multi-threaded hosts — Puma with `max_threads > 1` and
 - `:block_requests` is a best-effort guardrail, not a hard cap. Concurrent workers can pass preflight simultaneously and collectively overshoot the budget. Use an external quota system if you need a transactional cap.
 - Streaming capture relies on the provider emitting a final-usage event (OpenAI needs `stream_options: { include_usage: true }`); missing events are recorded with `usage_source: "unknown"` so they surface on the Data Quality page.
 - `provider_response_id` is stored only when the provider exposes a stable response object ID. Missing IDs stay `nil` and surface on the Data Quality page.
-- Anthropic cache TTL variants (1h vs 5min writes) not modeled separately.
-- OpenAI reasoning tokens included in output totals; separate reasoning-token attribution not stored.
+- Cache write TTL variants (1h vs 5min writes) not modeled separately.
 ## Development
+Architecture rules for future changes live in [`docs/architecture.md`](docs/architecture.md). Integration recipes live in [`docs/cookbook.md`](docs/cookbook.md).
 ```bash
 bundle install
 bundle exec rspec

data/app/services/llm_cost_tracker/dashboard/data_quality.rb CHANGED

@@ -13,40 +13,113 @@ module LlmCostTracker
       :stream_column_present,
       :missing_provider_response_id_count,
       :provider_response_id_column_present,
+      :usage_breakdown_column_present,
+      :input_tokens,
+      :cache_read_input_tokens,
+      :cache_write_input_tokens,
+      :output_tokens,
+      :hidden_output_tokens,
+      :input_cost,
+      :cache_read_input_cost,
+      :cache_write_input_cost,
+      :output_cost,
       :unknown_pricing_by_model
     )
     class DataQuality
       class << self
         def call(scope: LlmCostTracker::LlmApiCall.all)
-          total = scope.count
-          latency_present = LlmCostTracker::LlmApiCall.latency_column?
-          stream_present = LlmCostTracker::LlmApiCall.stream_column?
-          provider_response_id_present = LlmCostTracker::LlmApiCall.provider_response_id_column?
+          model = scope.klass
+          aggregates = DataQualityAggregate.call(scope: scope)
+          total = aggregates.fetch(:total_calls).to_i
           DataQualityStats.new(
             total_calls: total,
-            unknown_pricing_count: scope.unknown_pricing.count,
-            untagged_calls_count: total - scope.with_json_tags.count,
-            missing_latency_count: latency_present ? scope.where(latency_ms: nil).count : nil,
-            latency_column_present: latency_present,
-            streaming_count: stream_present ? scope.streaming.count : nil,
-            streaming_missing_usage_count: if stream_present && LlmCostTracker::LlmApiCall.usage_source_column?
-                                             scope.streaming_missing_usage.count
-                                           end,
-            stream_column_present: stream_present,
-            missing_provider_response_id_count: (
-              provider_response_id_present ? scope.missing_provider_response_id.count : nil
-            ),
-            provider_response_id_column_present: provider_response_id_present,
-            unknown_pricing_by_model: scope.unknown_pricing
-                                      .group(:model)
-                                      .order(Arel.sql("COUNT(*) DESC"))
-                                      .count
-                                      .first(10)
-                                      .to_h
+            unknown_pricing_count: aggregates.fetch(:unknown_pricing_count).to_i,
+            untagged_calls_count: total - aggregates.fetch(:tagged_calls_count).to_i,
+            **latency_stats(aggregates, model:),
+            **stream_stats(aggregates, model:),
+            **provider_response_id_stats(aggregates, model:),
+            **usage_stats(aggregates, model:),
+            unknown_pricing_by_model: unknown_pricing_by_model(scope)
           )
         end
+        private
+        def latency_stats(aggregates, model:)
+          latency_present = model.latency_column?
+          {
+            missing_latency_count: latency_present ? aggregates.fetch(:missing_latency_count).to_i : nil,
+            latency_column_present: latency_present
+          }
+        end
+        def stream_stats(aggregates, model:)
+          stream_present = model.stream_column?
+          usage_source_present = model.usage_source_column?
+          streaming_missing_usage_count = nil
+          if stream_present && usage_source_present
+            streaming_missing_usage_count = aggregates.fetch(:streaming_missing_usage_count).to_i
+          end
+          {
+            streaming_count: stream_present ? aggregates.fetch(:streaming_count).to_i : nil,
+            streaming_missing_usage_count: streaming_missing_usage_count,
+            stream_column_present: stream_present
+          }
+        end
+        def provider_response_id_stats(aggregates, model:)
+          column_present = model.provider_response_id_column?
+          missing_provider_response_id_count = nil
+          if column_present
+            missing_provider_response_id_count = aggregates.fetch(:missing_provider_response_id_count).to_i
+          end
+          {
+            missing_provider_response_id_count: missing_provider_response_id_count,
+            provider_response_id_column_present: column_present
+          }
+        end
+        def usage_stats(aggregates, model:)
+          usage_breakdown_present = model.usage_breakdown_columns?
+          usage_breakdown_cost_present = model.usage_breakdown_cost_columns?
+          cache_read_input_cost = nil
+          cache_write_input_cost = nil
+          if usage_breakdown_cost_present
+            cache_read_input_cost = decimal_sum(aggregates.fetch(:cache_read_input_cost))
+            cache_write_input_cost = decimal_sum(aggregates.fetch(:cache_write_input_cost))
+          end
+          {
+            usage_breakdown_column_present: usage_breakdown_present,
+            input_tokens: aggregates.fetch(:input_tokens).to_i,
+            cache_read_input_tokens: usage_breakdown_present ? aggregates.fetch(:cache_read_input_tokens).to_i : nil,
+            cache_write_input_tokens: usage_breakdown_present ? aggregates.fetch(:cache_write_input_tokens).to_i : nil,
+            output_tokens: aggregates.fetch(:output_tokens).to_i,
+            hidden_output_tokens: usage_breakdown_present ? aggregates.fetch(:hidden_output_tokens).to_i : nil,
+            input_cost: decimal_sum(aggregates.fetch(:input_cost)),
+            cache_read_input_cost: cache_read_input_cost,
+            cache_write_input_cost: cache_write_input_cost,
+            output_cost: decimal_sum(aggregates.fetch(:output_cost))
+          }
+        end
+        def unknown_pricing_by_model(scope)
+          scope.unknown_pricing
+               .group(:model)
+               .order(Arel.sql("COUNT(*) DESC"))
+               .count
+               .first(10)
+               .to_h
+        end
+        def decimal_sum(value)
+          value.to_f.round(8)
+        end
       end
     end
   end

data/app/services/llm_cost_tracker/dashboard/data_quality_aggregate.rb ADDED

@@ -0,0 +1,81 @@
+# frozen_string_literal: true
+module LlmCostTracker
+  module Dashboard
+    class DataQualityAggregate
+      class << self
+        def call(scope:)
+          model = scope.klass
+          expressions = aggregate_expressions(scope, model:)
+          values = Array(scope.unscope(:order).pick(*expressions.values))
+          expressions.keys.zip(values).to_h
+        end
+        private
+        def aggregate_expressions(scope, model:)
+          usage_breakdown_present = model.usage_breakdown_columns?
+          usage_breakdown_cost_present = model.usage_breakdown_cost_columns?
+          expressions = {
+            total_calls: Arel.sql("COUNT(*)"),
+            unknown_pricing_count: conditional_count_expression("total_cost IS NULL"),
+            tagged_calls_count: tagged_calls_expression(model)
+          }
+          if model.latency_column?
+            expressions[:missing_latency_count] = conditional_count_expression("latency_ms IS NULL")
+          end
+          expressions[:streaming_count] = conditional_count_expression("stream") if model.stream_column?
+          if model.stream_column? && model.usage_source_column?
+            expressions[:streaming_missing_usage_count] =
+              conditional_count_expression("stream AND (usage_source = 'unknown' OR usage_source IS NULL)")
+          end
+          if model.provider_response_id_column?
+            expressions[:missing_provider_response_id_count] =
+              conditional_count_expression("provider_response_id IS NULL OR provider_response_id = ''")
+          end
+          usage_sum_columns(usage_breakdown_present, usage_breakdown_cost_present).each do |column|
+            expressions[column] = sum_expression(scope, column)
+          end
+          expressions
+        end
+        def usage_sum_columns(usage_breakdown_present, usage_breakdown_cost_present)
+          columns = %i[input_tokens output_tokens input_cost output_cost]
+          if usage_breakdown_present
+            columns += %i[cache_read_input_tokens cache_write_input_tokens hidden_output_tokens]
+          end
+          columns += %i[cache_read_input_cost cache_write_input_cost] if usage_breakdown_cost_present
+          columns
+        end
+        def conditional_count_expression(predicate)
+          Arel.sql("COALESCE(SUM(CASE WHEN #{predicate} THEN 1 ELSE 0 END), 0)")
+        end
+        def tagged_calls_expression(model)
+          table = model.quoted_table_name
+          column = "#{table}.#{model.connection.quote_column_name('tags')}"
+          Arel.sql(case
+                   when model.tags_jsonb_column?
+                     "COALESCE(SUM(CASE WHEN #{column} <> '{}'::jsonb THEN 1 ELSE 0 END), 0)"
+                   when model.tags_mysql_json_column?
+                     "COALESCE(SUM(CASE WHEN JSON_LENGTH(#{column}) > 0 THEN 1 ELSE 0 END), 0)"
+                   else
+                     "COALESCE(SUM(CASE WHEN #{column} IS NOT NULL AND #{column} <> '' " \
+                     "AND #{column} <> '{}' THEN 1 ELSE 0 END), 0)"
+                   end)
+        end
+        def sum_expression(scope, column)
+          Arel.sql("COALESCE(SUM(#{scope.connection.quote_column_name(column)}), 0)")
+        end
+      end
+    end
+  end
+end
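Note on the file above: `DataQualityAggregate` replaces several separate `COUNT` queries with one `SELECT` whose columns are conditional sums, then zips the returned row back onto the expression names. The database-free sketch below shows that pattern in isolation. `conditional_count` mirrors the SQL shape used in the diff, while `aggregate_select` and `label_row` are hypothetical helpers for illustration (the gem itself goes through ActiveRecord's `pick`).

```ruby
# Builds a conditional-count column: rows matching the predicate add 1,
# others add 0, and COALESCE turns the NULL of an empty table into 0.
def conditional_count(predicate)
  "COALESCE(SUM(CASE WHEN #{predicate} THEN 1 ELSE 0 END), 0)"
end

# Hypothetical helper: one SELECT carrying every aggregate expression.
def aggregate_select(table, expressions)
  "SELECT #{expressions.values.join(', ')} FROM #{table}"
end

# Hypothetical helper: pairs expression names with the single result row.
def label_row(expressions, row)
  expressions.keys.zip(row).to_h
end
```

Because Ruby hashes preserve insertion order, the order of `expressions.values` in the `SELECT` matches the order of `expressions.keys` used to label the row, which is what makes the zip step safe.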

data/app/views/llm_cost_tracker/data_quality/index.html.erb CHANGED

@@ -2,6 +2,8 @@
 <% streaming_count = @stats.streaming_count %>
 <% streaming_missing_usage = @stats.streaming_missing_usage_count %>
 <% calls_with_provider_response_id = @stats.provider_response_id_column_present ? total - @stats.missing_provider_response_id_count : nil %>
+<% billable_tokens = @stats.input_tokens + @stats.output_tokens + @stats.cache_read_input_tokens.to_i + @stats.cache_write_input_tokens.to_i %>
+<% hidden_output_share = coverage_percent(@stats.hidden_output_tokens.to_i, @stats.output_tokens) %>
 <section class="lct-panel lct-toolbar">
   <div class="lct-toolbar-head">
@@ -118,6 +120,14 @@
             <p class="lct-stat-sub"><%= percent(coverage_percent(calls_with_provider_response_id, total)) %> of calls</p>
           </article>
         <% end %>
+        <% if @stats.usage_breakdown_column_present && @stats.output_tokens.positive? %>
+          <article class="lct-stat">
+            <p class="lct-stat-label">Hidden output share</p>
+            <p class="lct-stat-value"><%= percent(hidden_output_share) %></p>
+            <p class="lct-stat-sub"><%= number(@stats.hidden_output_tokens) %> of <%= number(@stats.output_tokens) %> output tokens</p>
+          </article>
+        <% end %>
       </div>
     </div>
   </section>
@@ -243,6 +253,61 @@
     </section>
   </section>
+  <% if @stats.usage_breakdown_column_present %>
+    <section class="lct-panel">
+      <div class="lct-section-head">
+        <div>
+          <h2 class="lct-section-title">Usage breakdown</h2>
+        </div>
+      </div>
+      <div class="lct-table-wrap">
+        <table class="lct-table lct-table-compact">
+          <thead>
+            <tr>
+              <th>Bucket</th>
+              <th class="lct-num">Tokens</th>
+              <th class="lct-num">Share</th>
+              <th class="lct-num">Cost</th>
+            </tr>
+          </thead>
+          <tbody>
+            <tr>
+              <td>Regular input</td>
+              <td class="lct-num"><%= number(@stats.input_tokens) %></td>
+              <td class="lct-num"><%= percent(coverage_percent(@stats.input_tokens, billable_tokens)) %></td>
+              <td class="lct-num"><%= money(@stats.input_cost) %></td>
+            </tr>
+            <tr>
+              <td>Cache read input</td>
+              <td class="lct-num"><%= number(@stats.cache_read_input_tokens) %></td>
+              <td class="lct-num"><%= percent(coverage_percent(@stats.cache_read_input_tokens, billable_tokens)) %></td>
+              <td class="lct-num<%= ' lct-num-muted' if @stats.cache_read_input_cost.nil? %>"><%= optional_money(@stats.cache_read_input_cost) %></td>
+            </tr>
+            <tr>
+              <td>Cache write input</td>
+              <td class="lct-num"><%= number(@stats.cache_write_input_tokens) %></td>
+              <td class="lct-num"><%= percent(coverage_percent(@stats.cache_write_input_tokens, billable_tokens)) %></td>
+              <td class="lct-num<%= ' lct-num-muted' if @stats.cache_write_input_cost.nil? %>"><%= optional_money(@stats.cache_write_input_cost) %></td>
+            </tr>
+            <tr>
+              <td>Output</td>
+              <td class="lct-num"><%= number(@stats.output_tokens) %></td>
+              <td class="lct-num"><%= percent(coverage_percent(@stats.output_tokens, billable_tokens)) %></td>
+              <td class="lct-num"><%= money(@stats.output_cost) %></td>
+            </tr>
+            <tr>
+              <td>Hidden output</td>
+              <td class="lct-num"><%= number(@stats.hidden_output_tokens) %></td>
+              <td class="lct-num"><%= percent(hidden_output_share) %> of output</td>
+              <td class="lct-num lct-num-muted">n/a</td>
+            </tr>
+          </tbody>
+        </table>
+      </div>
+    </section>
+  <% end %>
   <% unless @stats.unknown_pricing_by_model.empty? %>
     <section class="lct-panel">
       <div class="lct-section-head">