llm_cost_tracker 0.4.0 → 0.5.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +35 -0
- data/README.md +195 -109
- data/app/services/llm_cost_tracker/dashboard/data_quality.rb +46 -55
- data/app/services/llm_cost_tracker/dashboard/data_quality_aggregate.rb +81 -0
- data/lib/llm_cost_tracker/budget.rb +34 -37
- data/lib/llm_cost_tracker/configuration/instrumentation.rb +37 -0
- data/lib/llm_cost_tracker/configuration.rb +10 -5
- data/lib/llm_cost_tracker/doctor.rb +166 -0
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/install_generator.rb +33 -0
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/prices_generator.rb +12 -6
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/templates/add_period_totals_to_llm_cost_tracker.rb.erb +38 -8
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/templates/create_llm_api_calls.rb.erb +1 -2
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/templates/initializer.rb.erb +53 -21
- data/lib/llm_cost_tracker/integrations/anthropic.rb +75 -0
- data/lib/llm_cost_tracker/integrations/base.rb +72 -0
- data/lib/llm_cost_tracker/integrations/object_reader.rb +56 -0
- data/lib/llm_cost_tracker/integrations/openai.rb +95 -0
- data/lib/llm_cost_tracker/integrations/registry.rb +41 -0
- data/lib/llm_cost_tracker/middleware/faraday.rb +4 -3
- data/lib/llm_cost_tracker/parsed_usage.rb +8 -1
- data/lib/llm_cost_tracker/parsers/anthropic.rb +17 -49
- data/lib/llm_cost_tracker/parsers/base.rb +80 -0
- data/lib/llm_cost_tracker/parsers/gemini.rb +12 -35
- data/lib/llm_cost_tracker/parsers/openai.rb +1 -6
- data/lib/llm_cost_tracker/parsers/openai_compatible.rb +6 -15
- data/lib/llm_cost_tracker/parsers/openai_usage.rb +8 -30
- data/lib/llm_cost_tracker/parsers/registry.rb +17 -2
- data/lib/llm_cost_tracker/price_freshness.rb +38 -0
- data/lib/llm_cost_tracker/price_registry.rb +14 -0
- data/lib/llm_cost_tracker/price_sync/fetcher.rb +2 -1
- data/lib/llm_cost_tracker/price_sync/refresh_plan_builder.rb +4 -2
- data/lib/llm_cost_tracker/price_sync.rb +10 -0
- data/lib/llm_cost_tracker/prices.json +394 -41
- data/lib/llm_cost_tracker/pricing.rb +8 -1
- data/lib/llm_cost_tracker/request_url.rb +20 -0
- data/lib/llm_cost_tracker/storage/active_record_rollups.rb +47 -27
- data/lib/llm_cost_tracker/storage/active_record_store.rb +4 -0
- data/lib/llm_cost_tracker/stream_collector.rb +3 -3
- data/lib/llm_cost_tracker/tag_context.rb +52 -0
- data/lib/llm_cost_tracker/tags_column.rb +62 -24
- data/lib/llm_cost_tracker/tracker.rb +5 -2
- data/lib/llm_cost_tracker/version.rb +1 -1
- data/lib/llm_cost_tracker.rb +14 -4
- data/lib/tasks/llm_cost_tracker.rake +21 -3
- metadata +13 -3
- data/lib/llm_cost_tracker/generators/llm_cost_tracker/templates/llm_cost_tracker_prices.yml.erb +0 -51
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 6ee180a9d6ead4b84965b3ff96f87b31c6ce8982a8e13383f936d3031e8f6f5f
+  data.tar.gz: fda6d61c9f86b4e2a4dbdc7a7852f6f4f22bcf43f76b6cfbdd4f438c325e8d8c
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 8e341c007ff3380459a07890a45bc5e05010c12ffc52a1f805492eb6c643e9637529b02e4d6ae12a7f35c1e25ea819336544bd007f7cfc5efa9c7999559f5d83
+  data.tar.gz: 44c912532194be0f239c6950f1f91317329bb5b8c3afbf33e430b4f9006377a8729ad0ed7f3c2c98528983d218fabef136774088018203426749532eb01627ef
data/CHANGELOG.md
CHANGED
@@ -4,6 +4,41 @@ Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). Versioning: [S
 
 ## [Unreleased]
 
+## [0.5.0] - 2026-04-25
+
+### Added
+
+- Optional SDK integrations: `config.instrument :openai`, `:anthropic`, or `:all` patches the official `openai` and `anthropic` gems' resource methods to record usage automatically. Provider SDKs are not added as hard dependencies.
+- `LlmCostTracker.with_tags` plus `TagContext` for thread- and fiber-isolated request-scoped tags that flow through middleware, SDK integrations, and `track` / `track_stream`.
+- `LlmCostTracker::Doctor` and the `llm_cost_tracker:doctor` rake task for diagnosing storage, schema, optional columns, period totals, integrations, prices, and recent calls.
+- `LlmCostTracker::PriceFreshness` helper plus a price-freshness doctor check that warns when bundled or local prices are stale.
+- Technical documentation under `docs/technical/` covering architecture, data flow, extension points, module map, and operational notes.
+
+### Changed
+
+- Pricing fuzzy matching now only accepts dated snapshot suffixes instead of guessing new model families.
+- Built-in prices include GPT-5.5 and GPT-5.4 variants and drop retired Claude and Gemini entries.
+- Missing model identifiers now normalize to `unknown` instead of leaking nil into tracked events.
+- `llm_cost_tracker:prices` now generates a full local price snapshot instead of an empty override file.
+- Price sync workflow surfaces clearer error context for fetcher failures and skips refresh-plan entries with malformed pricing.
+- README, cookbook, and technical docs clarify that `config.instrument` patches official SDKs only; `ruby-openai` (alexrudall) routes through the Faraday middleware via its constructor block, and `ruby_llm` is not auto-captured today because the gem does not expose a Faraday middleware hook.
+
+## [0.4.1] - 2026-04-24
+
+### Changed
+
+- Batched ActiveRecord period rollup writes and budget total reads.
+- Memoized schema capability checks and refreshed them on `reset_column_information`.
+- Install migration adds `[:model, :tracked_at]` composite index and drops redundant single-column `:provider` / `:model` indexes.
+- Data Quality now reads counters and usage sums through one aggregate query.
+- Parser URL matching, stream-event extraction, and custom parser registration now share a smaller base/registry extension surface.
+- Added cookbook recipes for `ruby-openai`, `anthropic-sdk-ruby`, `gemini-ai`, `langchainrb`, Azure OpenAI, and LiteLLM proxy setups.
+
+### Fixed
+
+- `llm_cost_tracker:add_period_totals` now imports legacy monthly rollups and backfills before adding the unique index.
+- Budget docs now describe `:notify` across monthly, daily, and per-call budgets.
+
 ## [0.4.0] - 2026-04-24
 
 ### Changed
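The `with_tags` / `TagContext` addition listed above can be sketched in plain Ruby. This is an illustrative model of fiber-local tag scoping only; `TagContextSketch` and its storage key are invented for the example and are not the gem's internals.

```ruby
# Illustrative sketch (not gem internals): request-scoped tags kept in
# fiber-local state (Thread.current[] is fiber-local in Ruby) and
# restored when the block exits, so scopes nest and stay isolated.
module TagContextSketch
  KEY = :llm_cost_tracker_tags_sketch

  def self.with_tags(tags)
    previous = Thread.current[KEY]
    Thread.current[KEY] = (previous || {}).merge(tags)
    yield
  ensure
    Thread.current[KEY] = previous
  end

  def self.current_tags
    Thread.current[KEY] || {}
  end
end

captured = nil
TagContextSketch.with_tags(user_id: 42) do
  TagContextSketch.with_tags(feature: "chat") do
    captured = TagContextSketch.current_tags
  end
end
captured # => { user_id: 42, feature: "chat" }
```

Because the previous value is restored in `ensure`, tags leak neither out of the block nor across threads or fibers.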
data/README.md
CHANGED
@@ -1,13 +1,13 @@
 # LLM Cost Tracker
 
-**Self-hosted LLM cost tracking for Ruby and Rails.**
+**Self-hosted LLM cost tracking for Ruby and Rails.** Instruments common Ruby SDKs, intercepts Faraday LLM responses, prices events locally, and can store them in your database. No proxy, no SaaS.
 
 [](https://rubygems.org/gems/llm_cost_tracker)
 [](https://github.com/sergey-homenko/llm_cost_tracker/actions)
 [](https://codecov.io/gh/sergey-homenko/llm_cost_tracker)
 
-Requires Ruby 3.3+,
-
+Requires Ruby 3.3+, ActiveSupport 7.1+, and Faraday 2.0+.
+ActiveRecord storage requires ActiveRecord 7.1+. The mounted dashboard requires Rails 7.1+.
 
 ## Why
 
@@ -16,48 +16,48 @@ Every Rails app with LLM integrations eventually runs into the same question: wh
 ## What You Get
 
 - A local ActiveRecord ledger of provider, model, usage breakdown, cost, latency, tags, streaming usage, and provider response IDs
-- 
-- 
+- Optional official OpenAI and Anthropic SDK integrations, plus Faraday middleware for custom clients
+- Explicit `track` / `track_stream` helpers as a fallback for unsupported clients
+- Server-rendered Rails dashboard with overview, models, calls, tags, CSV export, and data-quality pages
 - Local pricing snapshots, price sync tasks, and budget guardrails
 - Prompt and response bodies are never persisted
 
 ## Dashboard
 
-LLM Cost Tracker ships with
+LLM Cost Tracker ships with a server-rendered Rails Engine dashboard for spend review, attribution, and data quality checks.
 
 
-The overview page includes spend trend, budget status, provider breakdown, top models, and filterable slices. The engine also includes Calls, Tags, and Data Quality pages. Plain ERB, no JavaScript bundle.
+The overview page includes spend trend, budget status, provider breakdown, top models, and filterable slices. The engine also includes Models, Calls, Tags, and Data Quality pages. Plain ERB, no JavaScript bundle.
 
 ## Quickstart
 
 ```ruby
 gem "llm_cost_tracker"
+gem "openai"
 ```
 
 ```bash
-bin/rails generate llm_cost_tracker:install
+bin/rails generate llm_cost_tracker:install --dashboard --prices
 bin/rails db:migrate
+bin/rails llm_cost_tracker:doctor
 ```
 
+Skip `--dashboard` if you only want the ledger. Skip `--prices` if you do not want a local pricing file yet.
+
 ```ruby
 LlmCostTracker.configure do |config|
   config.storage_backend = :active_record
-  config.default_tags = {
+  config.default_tags = -> { { environment: Rails.env } }
+  config.instrument :openai
 end
 
-
-
-
-  f.use :llm_cost_tracker, tags: -> { { user_id: Current.user&.id, feature: "chat" } }
-end
+LlmCostTracker.with_tags(user_id: Current.user&.id, feature: "chat") do
+  client = OpenAI::Client.new(api_key: ENV["OPENAI_API_KEY"])
+  client.responses.create(model: "gpt-4o", input: "Hello")
 end
 ```
 
-```ruby
-mount LlmCostTracker::Engine => "/llm-costs"
-```
-
 After that, LLM Cost Tracker starts recording calls into `llm_api_calls` and the dashboard becomes available at `/llm-costs`.
 Protect the mounted engine with your application's authentication before exposing it outside development.
 
@@ -69,39 +69,43 @@ Protect the mounted engine with your application's authentication before exposin
 - No built-in auth on the mounted dashboard
 - Use `:active_record` when you want shared dashboards and budget checks across Puma workers and Sidekiq processes
 
-## 
+## Technical Docs
 
-
-gem "llm_cost_tracker"
-```
+- [Architecture](docs/architecture.md)
 
-
+## Usage
 
-
-bin/rails generate llm_cost_tracker:install
-bin/rails db:migrate
-```
+### Official SDK integrations
 
-
+`config.instrument` patches **official** provider SDKs only — currently the official `openai` and `anthropic` gems. SDK integrations are optional and do not add provider SDKs as gem dependencies. Install the provider SDK you already use, then enable its integration.
 
-
+```ruby
+LlmCostTracker.configure do |config|
+  config.instrument :openai
+  config.instrument :anthropic
+end
+```
+
+The OpenAI integration records non-streaming calls through the official `openai` gem's `responses.create` and `chat.completions.create`. The Anthropic integration records non-streaming calls through the official `anthropic` gem's `messages.create`. Both integrations extract usage, model, latency, provider response ID, cache tokens, and hidden/reasoning tokens when the SDK response exposes them.
 
 ```ruby
-
-
-
-
-
-
-
-}
-end
+LlmCostTracker.with_tags(feature: "support_chat", user_id: Current.user&.id) do
+  anthropic = Anthropic::Client.new(api_key: ENV["ANTHROPIC_API_KEY"])
+  anthropic.messages.create(
+    model: "claude-sonnet-4-5-20250929",
+    max_tokens: 1024,
+    messages: [{ role: "user", content: "Hello" }]
+  )
 end
 ```
 
-`
+Community clients such as `ruby-openai` are not patched by `instrument`. `ruby-openai` exposes a Faraday block on its constructor and is covered by the middleware below.
+
+Google's official Gemini SDKs do not include Ruby. Use the Faraday middleware against Gemini's REST API, or keep custom clients behind the fallback helpers until a stable SDK integration exists.
+
+### Faraday middleware
 
-
+`tags:` can be a hash or callable. Callables are evaluated on each request and may accept the Faraday request env.
 
 ```ruby
 conn = Faraday.new(url: "https://api.openai.com") do |f|
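The patch-the-SDK approach described in the hunk above can be illustrated with `Module#prepend` around a fake client. `FakeClient`, `CreateInstrumentation`, and the recorded event shape are assumptions for the sketch, not the gem's actual integration code.

```ruby
# Sketch only: record usage around a client call by prepending a module,
# in the spirit of config.instrument. FakeClient stands in for an SDK client.
class FakeClient
  def create(model:)
    { "model" => model, "usage" => { "input_tokens" => 10, "output_tokens" => 3 } }
  end
end

RECORDED = [] # stand-in for the tracker's storage backend

module CreateInstrumentation
  def create(**kwargs)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    response = super(**kwargs) # the SDK's original method still runs
    RECORDED << {
      model: response["model"],
      usage: response["usage"],
      latency_ms: ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000.0).round
    }
    response # caller sees the untouched SDK response
  end
end

FakeClient.prepend(CreateInstrumentation)
FakeClient.new.create(model: "gpt-4o")
```

Prepending (rather than alias-chaining) keeps `super` available and makes the patch visible in `FakeClient.ancestors`, which is the usual reason instrumentation libraries prefer it.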
@@ -116,9 +120,11 @@ conn.post("/v1/responses", { model: "gpt-5-mini", input: "Hello!" })
 
 Place `llm_cost_tracker` inside the Faraday stack where it can see the final response body.
 
+The same middleware covers `ruby-openai` through its constructor block.
+
 ### Streaming
 
-Streaming is captured automatically for OpenAI, Anthropic, and Gemini when the request goes through the Faraday middleware. The middleware tees the `on_data` callback, keeps the stream flowing to your code, and records
+Streaming is captured automatically for OpenAI, Anthropic, and Gemini when the request goes through the Faraday middleware. The middleware tees the `on_data` callback, keeps the stream flowing to your code, and records provider-reported usage once the response completes.
 
 ```ruby
 # OpenAI: include usage in the final chunk
@@ -130,20 +136,22 @@ client.chat(parameters: {
 })
 ```
 
-Anthropic emits usage in `message_start` + `message_delta` events. Gemini's `:streamGenerateContent` endpoint includes `usageMetadata`;
+Anthropic emits usage in `message_start` + `message_delta` events. Gemini's `:streamGenerateContent` endpoint includes `usageMetadata`; the latest usage block is used.
 
 Streamed calls are stored with `stream: true` and `usage_source: "stream_final"`. If the provider never sends final usage, the call is still recorded with `usage_source: "unknown"` so those calls surface on the Data Quality page.
 
 When the provider emits a stable response object ID, LLM Cost Tracker stores it as `provider_response_id`. OpenAI and Anthropic are covered end-to-end; Gemini is best effort and may vary by endpoint or API version.
 
-
+Model identifiers are extracted from the provider response, request body, stream events, or URL path depending on the provider. If no source carries a model, the event is stored under `model: "unknown"` and shows up as unknown pricing instead of being guessed.
+
+For non-Faraday clients without an SDK integration, prefer adding a supported adapter. Use the explicit helper only as a fallback while wiring a client that does not expose a stable hook yet:
 
 ```ruby
 LlmCostTracker.track_stream(provider: "openai", model: "gpt-4o") do |stream|
-  my_client.stream(...) { |
+  my_client.stream(...) { |event| stream.event(event.to_h) }
 end
 
-# Or skip
+# Or skip provider event parsing entirely if you already know the totals:
 LlmCostTracker.track_stream(provider: "openai", model: "gpt-4o") do |stream|
   # ... your streaming loop ...
   stream.usage(input_tokens: 120, output_tokens: 45)
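The Anthropic-style final-usage accumulation mentioned above (`message_start` carries input tokens, `message_delta` carries output tokens, and the latest value wins) can be modeled like this. The event hashes are simplified stand-ins for real SSE payloads, and `StreamUsageSketch` is not the gem's collector.

```ruby
# Sketch: accumulate final usage from Anthropic-style stream events.
class StreamUsageSketch
  attr_reader :usage

  def initialize
    @usage = {}
  end

  def event(evt)
    case evt[:type]
    when "message_start"
      # Input tokens arrive once, at the start of the message.
      @usage[:input_tokens] = evt.dig(:message, :usage, :input_tokens)
    when "message_delta"
      # Output tokens are cumulative; each delta supersedes the last.
      @usage[:output_tokens] = evt.dig(:usage, :output_tokens)
    end
  end
end

collector = StreamUsageSketch.new
collector.event(type: "message_start", message: { usage: { input_tokens: 25 } })
collector.event(type: "message_delta", usage: { output_tokens: 7 })
collector.event(type: "message_delta", usage: { output_tokens: 12 }) # latest wins
collector.usage # => { input_tokens: 25, output_tokens: 12 }
```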
@@ -161,7 +169,11 @@ end
 
 Run `bin/rails g llm_cost_tracker:add_streaming` once on existing installs to add the `stream` and `usage_source` columns. Run `bin/rails g llm_cost_tracker:add_provider_response_id` to persist provider-issued response IDs. Run `bin/rails g llm_cost_tracker:add_usage_breakdown` to add cache-read, cache-write, hidden-output, and pricing-mode columns.
 
-
+More client-specific snippets live in [`docs/cookbook.md`](docs/cookbook.md).
+
+### Fallback tracking
+
+Automatic capture should be the default integration path. `track` exists for custom clients, internal gateways, migrations, and SDKs that do not expose a stable middleware or instrumentation hook yet.
 
 ```ruby
 LlmCostTracker.track(
@@ -180,42 +192,72 @@ LlmCostTracker.track(
 `cache_read_input_tokens` and cache writes in `cache_write_input_tokens`; total
 tokens are calculated from the canonical billing breakdown.
 
+For manual tracking, pass the real upstream model when you know it. If a gateway only exposes a deployment or router name, use that stable identifier and add a matching `prices_file` / `pricing_overrides` entry.
+
+### Tags
+
+Tags are application context, not provider metadata. LLM Cost Tracker detects provider/model from the response when a parser is available; tags tell you who or what caused the call.
+
+```ruby
+LlmCostTracker.with_tags(user_id: current_user.id, feature: "support_chat", trace_id: request.uuid) do
+  client.chat(parameters: { model: "gpt-4o", messages: [...] })
+end
+```
+
+`default_tags` can be a hash or callable. Scoped tags from `with_tags` apply only inside the block and are isolated per thread/fiber. Explicit tags passed to `track`, `track_stream`, or middleware metadata win over scoped/default tags.
+
 ## Configuration
 
 ```ruby
-# config/initializers/llm_cost_tracker.rb
 LlmCostTracker.configure do |config|
-  config.storage_backend = :active_record
-  config.default_tags = {
-
+  config.storage_backend = :active_record
+  config.default_tags = -> { { environment: Rails.env } }
+  config.instrument :openai
+  config.instrument :anthropic
+  config.prices_file = Rails.root.join("config/llm_cost_tracker_prices.yml")
   config.monthly_budget = 500.00
   config.daily_budget = 50.00
   config.per_call_budget = 2.00
-  config.budget_exceeded_behavior = :notify
-  config.storage_error_behavior = :warn # :ignore, :warn, :raise
-  config.unknown_pricing_behavior = :warn # :ignore, :warn, :raise
-
+  config.budget_exceeded_behavior = :notify
   config.on_budget_exceeded = ->(data) {
-    SlackNotifier.notify("#alerts", "
+    SlackNotifier.notify("#alerts", "LLM #{data[:budget_type]} budget $#{data[:total].round(2)} / $#{data[:budget]}")
   }
-
-  config.prices_file = Rails.root.join("config/llm_cost_tracker_prices.yml")
-  config.pricing_overrides = {
-    "ft:gpt-4o-mini:my-org" => { input: 0.30, cache_read_input: 0.15, output: 1.20 }
-  }
-
-  # Built-in: openrouter.ai, api.deepseek.com
-  config.openai_compatible_providers["llm.my-company.com"] = "internal_gateway"
 end
 ```
 
+Storage backends: `:log` (default), `:active_record`, `:custom`. Error behaviors: `:ignore`, `:warn`, `:raise`; budget behavior also supports `:block_requests`.
+
+Configuration reference:
+
+| Option | Default | Purpose |
+|---|---:|---|
+| `enabled` | `true` | Turns tracking on/off. |
+| `storage_backend` | `:log` | `:log`, `:active_record`, or `:custom`. |
+| `custom_storage` | `nil` | Callable storage hook for `:custom`. |
+| `default_tags` | `{}` | Hash or callable merged into every event. |
+| `prices_file` | `nil` | Local JSON/YAML price table. |
+| `pricing_overrides` | `{}` | Ruby-side model price overrides. |
+| `instrument` | none | Enables optional SDK integrations such as `:openai`, `:anthropic`, or `:all`. |
+| `monthly_budget` | `nil` | Monthly spend guardrail. |
+| `daily_budget` | `nil` | Daily spend guardrail. |
+| `per_call_budget` | `nil` | Single-event spend guardrail. |
+| `budget_exceeded_behavior` | `:notify` | `:notify`, `:raise`, or `:block_requests`. |
+| `on_budget_exceeded` | `nil` | Callback for budget events. |
+| `storage_error_behavior` | `:warn` | `:ignore`, `:warn`, or `:raise`. |
+| `unknown_pricing_behavior` | `:warn` | `:ignore`, `:warn`, or `:raise`. |
+| `log_level` | `:info` | Log level used by `:log` storage. |
+| `openai_compatible_providers` | OpenRouter + DeepSeek | Host-to-provider map for compatible APIs. |
+| `report_tag_breakdowns` | `[]` | Tag keys included in text reports. |
+
+LLM Cost Tracker estimates cost from recorded usage and a versioned price registry. Providers usually return token usage, not a stable per-request price, so request costs are calculated locally and stored with the call. Historical rows do not change when prices update.
+
 Pricing is best effort. OpenRouter-style IDs like `openai/gpt-4o-mini` are normalized to built-in names when possible. Use `prices_file` / `pricing_overrides` for fine-tunes, gateway-specific IDs, enterprise discounts, alternate pricing modes, or models the gem does not know.
 Provider-specific entries like `openai/gpt-4o-mini` win over model-only entries like `gpt-4o-mini`.
 Pass `pricing_mode: :batch` to use optional mode-specific keys such as `batch_input` / `batch_output`; missing mode-specific keys fall back to standard `input` / `output` rates. The same pattern works for custom modes, for example `contract_input`.
 
 `storage_error_behavior = :warn` (default) lets LLM responses continue if storage fails; `:raise` exposes `StorageError#original_error`.
 
-
+With `unknown_pricing_behavior = :ignore` or `:warn`, unknown pricing still records token counts, but `cost` is `nil` and budget guardrails skip that event. With `:raise`, the event raises before storage. Find unpriced models:
 
 ```ruby
 LlmCostTracker::LlmApiCall.unknown_pricing.group(:model).count
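Local cost estimation from per-1M-token rates, including a cheaper cache-read rate, works out as follows. The price entry and `estimate_cost` are invented for the illustration; they mirror the documented behavior (unknown model yields a `nil` cost, cache reads fall back to the standard input rate when no cache rate exists) but are not the gem's code.

```ruby
# Sketch: prices are USD per 1M tokens; cache-read tokens bill at the
# cache_read_input rate when the entry defines one. Numbers are illustrative.
PRICES = {
  "gpt-4o-mini" => { input: 0.60, cache_read_input: 0.30, output: 2.40 }
}.freeze

def estimate_cost(model:, input_tokens:, output_tokens:, cache_read_input_tokens: 0)
  prices = PRICES[model]
  return nil unless prices # unknown pricing: cost stays nil

  fresh_input = input_tokens - cache_read_input_tokens
  (fresh_input * prices[:input] +
   cache_read_input_tokens * prices.fetch(:cache_read_input, prices[:input]) +
   output_tokens * prices[:output]) / 1_000_000.0
end

cost = estimate_cost(model: "gpt-4o-mini", input_tokens: 1_000_000,
                     output_tokens: 500_000, cache_read_input_tokens: 400_000)
cost # ≈ 1.68 (0.36 fresh input + 0.12 cache reads + 1.20 output)
```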
@@ -223,22 +265,33 @@ LlmCostTracker::LlmApiCall.unknown_pricing.group(:model).count
 
 ### Keeping prices current
 
-Built-in prices live in `lib/llm_cost_tracker/prices.json`. The gem never fetches pricing on boot. For production,
+Built-in prices live in `lib/llm_cost_tracker/prices.json`. The gem never fetches pricing on boot. For production, generate a local snapshot from the bundled registry, keep it under source control, and point the gem at it:
 
 ```bash
 bin/rails generate llm_cost_tracker:prices
 ```
 
-```
-
-
-
-
-
-
+```ruby
+config.prices_file = Rails.root.join("config/llm_cost_tracker_prices.yml")
+```
+
+The generated file has the same shape as the bundled registry:
+
+```yaml
+metadata:
+  updated_at: "2026-04-25"
+  currency: USD
+  unit: 1M tokens
+models:
+  my-gateway/gpt-4o-mini:
+    input: 0.20
+    cache_read_input: 0.10
+    output: 0.80
+    batch_input: 0.10
+    batch_output: 0.40
 ```
 
-`pricing_overrides`
+Pricing precedence is `pricing_overrides`, then `prices_file`, then bundled prices. Use `prices_file` for the app's source-controlled snapshot and `pricing_overrides` only for a handful of Ruby-side emergency overrides.
 
 To refresh prices on demand:
 
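The 30-day staleness rule the doctor applies to `metadata.updated_at` comes down to a date comparison. The helper below is a sketch with illustrative dates, not the gem's `PriceFreshness` implementation; a missing timestamp is treated as stale, matching the documented warning.

```ruby
require "date"

# Sketch: a price snapshot is stale when metadata.updated_at is missing
# or older than max_age_days.
def stale_prices?(metadata, today: Date.today, max_age_days: 30)
  updated_at = metadata["updated_at"]
  return true if updated_at.nil? # no timestamp: treat as stale

  (today - Date.parse(updated_at)).to_i > max_age_days
end

fresh = stale_prices?({ "updated_at" => "2026-04-25" }, today: Date.new(2026, 5, 1))
old   = stale_prices?({ "updated_at" => "2026-01-01" }, today: Date.new(2026, 5, 1))
```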
@@ -246,19 +299,30 @@ To refresh prices on demand:
 bin/rails llm_cost_tracker:prices:sync
 ```
 
-`llm_cost_tracker:prices:sync` refreshes
+`llm_cost_tracker:prices:sync` refreshes a pricing file from two structured sources: LiteLLM first, OpenRouter second. LiteLLM is the primary source; OpenRouter fills gaps and helps surface discrepancies.
 
 `llm_cost_tracker:prices:sync` / `llm_cost_tracker:prices:check` perform HTTP GET requests to:
 
 - LiteLLM pricing JSON: `https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json`
 - OpenRouter Models API: `https://openrouter.ai/api/v1/models`
 
-
+The task writes to `ENV["OUTPUT"]`, then `config.prices_file`, in that order. It aborts if neither is present. The gem's bundled `prices.json` is only updated when you explicitly pass it through `OUTPUT=` while developing the gem. `_source: "manual"` entries are never touched. Models that are still in your file but missing from both upstream sources are left alone and reported as orphaned. For intentional custom entries, mark them as manual so they stop showing up in orphaned warnings.
 
-Use `PREVIEW=1` to see the diff without writing. Use `STRICT=1` to fail instead of applying a partial refresh when a source fails or the validator rejects a price. Use `bin/rails llm_cost_tracker:prices:check` in CI to print the current diff and exit non-zero when the snapshot has drifted or refresh fails.
+Use `OUTPUT=config/llm_cost_tracker_prices.yml` to choose a target file explicitly. Use `PREVIEW=1` to see the diff without writing. Use `STRICT=1` to fail instead of applying a partial refresh when a source fails or the validator rejects a price. Use `bin/rails llm_cost_tracker:prices:check` in CI to print the current diff and exit non-zero when the snapshot has drifted or refresh fails.
 
 Large price changes are flagged during sync. If a specific entry is expected to move by more than 3x, add `_validator_override: ["skip_relative_change"]` to that entry in your local price file.
 
+If sync reports `certificate verify failed`, fix the host Ruby/OpenSSL trust store rather than disabling TLS verification. Common fixes are installing `ca-certificates` in Docker/Linux images, configuring the corporate proxy CA, setting `SSL_CERT_FILE` to the system CA bundle, or rebuilding rbenv/asdf Ruby after an OpenSSL upgrade.
+
+For unattended updates, run the check daily and sync through review:
+
+```bash
+bin/rails llm_cost_tracker:prices:check
+STRICT=1 bin/rails llm_cost_tracker:prices:sync
+```
+
+`bin/rails llm_cost_tracker:doctor` warns when the configured price file has no `metadata.updated_at` or when it is older than 30 days.
+
 ## Budget enforcement
 
 ```ruby
@@ -269,13 +333,13 @@ config.per_call_budget = 1.00
 config.budget_exceeded_behavior = :block_requests
 ```
 
-- `:notify` — fire `on_budget_exceeded` after an event pushes the
+- `:notify` — fire `on_budget_exceeded` after an event pushes the monthly, daily, or per-call budget over the limit.
 - `:raise` — record the event, then raise `BudgetExceededError`.
 - `:block_requests` — block preflight when the stored monthly or daily total is already over budget; still raises post-response on the event that crosses the line. Needs `:active_record` storage for preflight.
 
 `monthly_budget` and `daily_budget` are cumulative ledger limits. `per_call_budget` is a ceiling for a single priced event and runs after the response cost is known.
 
-ActiveRecord installs keep `llm_cost_tracker_period_totals` in sync with atomic upserts. Budget preflight reads period rollups instead of scanning `llm_api_calls`.
+ActiveRecord installs keep `llm_cost_tracker_period_totals` in sync with atomic upserts. Budget preflight reads period rollups when they are available instead of scanning `llm_api_calls`.
 
 ```ruby
 rescue LlmCostTracker::BudgetExceededError => e
@@ -284,7 +348,7 @@ rescue LlmCostTracker::BudgetExceededError => e
 
 `:block_requests` is a **guardrail, not a hard cap**. The preflight and the spend-recording write are separate statements, so under Puma / Sidekiq concurrency multiple workers can all pass the preflight and then collectively overshoot the budget. The setting reliably *stops new requests after the overshoot is visible* — it does not prevent the overshoot itself. For strict quotas use a provider- or gateway-level limit, or a database-backed counter outside this gem.
 
-Preflight is wired into the Faraday middleware automatically. When you record events via `LlmCostTracker.track` / `track_stream` and also want the same preflight, opt in:
+Preflight is wired into the Faraday middleware and SDK integrations automatically. When you record events via `LlmCostTracker.track` / `track_stream` and also want the same preflight, opt in:
 
 ```ruby
 LlmCostTracker.track(
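The "guardrail, not a hard cap" race described in this hunk can be reproduced with a toy ledger: both callers pass preflight before either records spend. `LedgerSketch` and the amounts are invented for the illustration, not the gem's storage code.

```ruby
# Sketch: preflight reads and spend writes are separate steps, so two
# concurrent callers can both pass preflight and jointly overshoot.
class LedgerSketch
  attr_reader :total

  def initialize(budget)
    @budget = budget
    @total = 0.0
  end

  def preflight_ok?
    @total < @budget # only sees spend that has already been written
  end

  def record(cost)
    @total += cost
  end
end

ledger = LedgerSketch.new(1.00)

# Two workers preflight before either has recorded its spend:
a_ok = ledger.preflight_ok?
b_ok = ledger.preflight_ok?

ledger.record(0.80)
ledger.record(0.80) # total is now 1.60, over the 1.00 budget

blocked_next = !ledger.preflight_ok? # only the *next* request is blocked
```

The overshoot becomes visible only after both writes land, which is exactly why the README recommends provider- or gateway-level limits for strict quotas.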
@@ -302,8 +366,20 @@ end
 LlmCostTracker.enforce_budget! # standalone preflight
 ```
 
+## Doctor
+
+Run the setup check after install, deploy, or upgrades:
+
+```bash
+bin/rails llm_cost_tracker:doctor
+```
+
+It checks storage mode, ActiveRecord availability, table/column coverage, period rollups, pricing file loading, and whether calls are being recorded. Setup errors exit non-zero; warnings point at optional production hardening.
+
 ## Querying costs
 
+These helpers and rake tasks require ActiveRecord storage.
+
 ```bash
 bin/rails llm_cost_tracker:report
 DAYS=7 bin/rails llm_cost_tracker:report
@@ -337,7 +413,7 @@ LlmCostTracker::LlmApiCall.between(1.week.ago, Time.current).cost_by_model
|
|
|
337
413
|
|
|
338
414
|
## Retention
|
|
339
415
|
|
|
340
|
-
Retention is not enforced automatically.
|
|
416
|
+
Retention is not enforced automatically. With ActiveRecord storage, use the rake task below if you need to delete older records in batches.
```bash
DAYS=90 bin/rails llm_cost_tracker:prune # delete calls older than N days in batches
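
# Scheduling is up to you; a crontab entry could look like this
# (hypothetical deploy path; adjust DAYS to your retention policy):
#   0 3 * * * cd /srv/app && DAYS=90 bin/rails llm_cost_tracker:prune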
```

On PostgreSQL, tags can be indexed with `add_index :llm_api_calls, :tags, using: :gin`.

On other adapters tags fall back to JSON in a text column. `by_tag` uses JSONB containment on PG, text matching elsewhere.
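
The practical difference between the two strategies can be sketched in plain Ruby: containment compares structured values, while text matching searches the serialized JSON and can false-positive on prefixes. This is an illustration of the semantics, not the gem's query code:

```ruby
require "json"

# Serialized tags as they might sit in a text column (illustrative).
tags_column = JSON.generate({ "feature" => "chat", "user_id" => 42 })

# JSONB-containment-style match: parse, then compare structured values.
def contains?(serialized, key, value)
  JSON.parse(serialized)[key] == value
end

# Text-fallback-style match: substring search over the serialized JSON.
def text_match?(serialized, key, value)
  serialized.include?("\"#{key}\":#{value.to_json}")
end

p contains?(tags_column, "user_id", 42)   # => true
p text_match?(tags_column, "user_id", 42) # => true
p contains?(tags_column, "user_id", 4)    # => false
p text_match?(tags_column, "user_id", 4)  # => true ("42" starts with "4")
```

On PG the database does the structured comparison for you; on the text-fallback path, prefer exact, unambiguous tag values.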
## Upgrading existing installs
Run the generators that match columns missing from older versions:
```bash
bin/rails generate llm_cost_tracker:add_period_totals # shared budget rollups
bin/rails generate llm_cost_tracker:add_streaming # stream + usage_source
bin/rails generate llm_cost_tracker:add_provider_response_id
bin/rails generate llm_cost_tracker:add_usage_breakdown
bin/rails generate llm_cost_tracker:upgrade_tags_to_jsonb # PG: text → jsonb + GIN
bin/rails generate llm_cost_tracker:upgrade_cost_precision # widen cost columns
bin/rails generate llm_cost_tracker:add_latency_ms
```

## Mounting the dashboard
Optional Rails Engine. Plain ERB, no JavaScript framework, no asset pipeline required. Requires Rails 7.1+; the core middleware works without Rails. The dashboard reads `llm_api_calls`, so use `storage_backend = :active_record` for apps that mount it.
`bin/rails generate llm_cost_tracker:install --dashboard` adds the require and route for you. Manual setup:
```ruby
# config/application.rb (or an initializer)
# ...
```

Routes (GET-only; CSV export included):

- `/llm-costs` — overview: spend with delta vs previous period, budget projection, spend anomaly banner, daily trend vs previous slice, provider rollup, top models
- `/llm-costs/models` — by provider + model; sortable by spend, volume, avg cost, latency
- `/llm-costs/calls` — filterable + paginated; sort modes for recency, spend, input tokens, output tokens, latency, and unknown pricing; CSV export
- `/llm-costs/calls/:id` — details with token mix and cost mix breakdowns
- `/llm-costs/tags` — tag keys present in the dataset (PG/SQLite native; MySQL 8.0+ via JSON_TABLE)
- `/llm-costs/tags/:key` — breakdown by values of a given tag key
- `/llm-costs/data_quality` — unknown pricing, untagged calls, missing latency, incomplete stream usage, and missing provider response IDs
No built-in auth is included. Tags carry whatever your app puts in them, so protect the mount point with your application's authentication.

Payload excerpt from an `llm_request.llm_cost_tracker` notification subscriber:

```ruby
# total_cost: 0.000795, currency: "USD"
# },
# pricing_mode: "batch",
# stream: false, usage_source: "response", provider_response_id: "chatcmpl_123",
# tags: { feature: "chat", user_id: 42 },
# tracked_at: 2026-04-16 14:30:00 UTC
# }
```

For providers with a non-OpenAI usage shape:
```ruby
class AcmeParser < LlmCostTracker::Parsers::Base
  HOSTS = %w[api.acme-llm.example].freeze
  TRACKED_PATHS = %w[/v1/generate].freeze

  def provider_names
    %w[acme]
  end

  def match?(url)
    match_uri?(url, hosts: HOSTS, exact_paths: TRACKED_PATHS)
  end

  def parse(_request_url, _request_body, response_status, response_body)
    return nil unless response_status == 200

    payload = safe_json_parse(response_body)
    usage = payload.dig("usage")
    return nil unless usage

    LlmCostTracker::ParsedUsage.build(
      # ...
    )
  end
end

LlmCostTracker::Parsers::Registry.register(AcmeParser)
```
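
The dispatch the registry performs can be sketched conceptually. The stand-in classes below are not the gem's internals; they just show the shape of the flow: each registered parser is asked `match?`, and the first match parses the response.

```ruby
require "json"

# Conceptual stand-ins, not llm_cost_tracker's classes.
class TinyRegistry
  def initialize
    @parsers = []
  end

  def register(parser_class)
    @parsers << parser_class.new
  end

  # First parser whose match? accepts the URL gets to parse the response.
  def parse(url, status, body)
    parser = @parsers.find { |p| p.match?(url) }
    parser && parser.parse(url, status, body)
  end
end

class TinyAcmeParser
  def match?(url)
    url.include?("api.acme-llm.example/v1/generate")
  end

  def parse(_url, status, body)
    return nil unless status == 200

    JSON.parse(body)["usage"]
  end
end

registry = TinyRegistry.new
registry.register(TinyAcmeParser)

usage = registry.parse(
  "https://api.acme-llm.example/v1/generate",
  200,
  '{"usage":{"input_tokens":12,"output_tokens":34}}'
)
# usage => {"input_tokens"=>12, "output_tokens"=>34}
```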
## Supported providers
| Provider | Auto-detected | Models with pricing |
|---|:---:|---|
| OpenAI | Yes | GPT-5.5/5.4/5.2/5.1/5, GPT-5.5/5.4/5.2/5 pro, GPT-5.4 mini/nano, GPT-5 mini/nano, GPT-4.1, GPT-4o, o1/o3/o4-mini |
| OpenRouter | Yes | OpenAI-compatible usage; provider-prefixed OpenAI model IDs normalized when possible |
| DeepSeek | Yes | OpenAI-compatible usage; add `pricing_overrides` for DeepSeek models |
| OpenAI-compatible hosts | Config | Configure `openai_compatible_providers` |
| Anthropic | Yes | Claude Opus 4.7/4.6/4.5/4.1/4, Sonnet 4.6/4.5/4, Haiku 4.5 |
| Google Gemini | Yes | Gemini 2.5 Pro/Flash/Flash-Lite, 2.0 Flash/Flash-Lite |
| Any other | Config | Custom parser |
Endpoints: OpenAI Chat Completions / Responses / Completions / Embeddings; OpenAI-compatible equivalents; Anthropic Messages; Gemini `generateContent` and `streamGenerateContent`. Official SDK integrations currently cover non-streaming OpenAI Responses / Chat Completions and Anthropic Messages. Streaming capture is supported for Faraday endpoints that emit stream events with final usage.
## Safety
**By design, `llm_cost_tracker` never persists prompt or response content.** The only data stored per call is the metadata needed for a cost ledger (provider, model, token counts, cost, latency, tags, provider response ID, and timestamp). Tags carry whatever your application passes in — treat them as user-controlled input and avoid putting request bodies, completions, or secrets into them.
- No external HTTP calls at request-tracking time.
- No prompt or response bodies stored.
- Faraday responses not modified.
- Request headers are never stored. Warning logs strip query strings from URLs before logging.
- Storage failures non-fatal by default (`storage_error_behavior = :warn`).
- Budget and unknown-pricing errors are raised only when you opt in.
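
One defensive pattern, since tags are user-controlled: filter them through an allowlist and cap values to short scalars before handing them to the tracker. This helper is hypothetical, not a gem API:

```ruby
# Hypothetical app-side helper, not part of llm_cost_tracker: keep tags to
# known keys and short scalar values so payloads and secrets never reach
# the ledger.
ALLOWED_TAG_KEYS = %i[feature user_id team].freeze

def safe_tags(raw)
  raw.each_with_object({}) do |(key, value), out|
    next unless ALLOWED_TAG_KEYS.include?(key.to_sym)
    next unless value.is_a?(Numeric) || value.is_a?(String) || value.is_a?(Symbol)

    out[key.to_sym] = value.to_s[0, 64] # coerce to a capped string
  end
end

safe_tags(feature: "chat", user_id: 42, prompt: "the whole user prompt")
# => { feature: "chat", user_id: "42" }
```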
The gem is designed for multi-threaded hosts — Puma with `max_threads > 1` and Sidekiq with `concurrency > 1` are both supported. A few rules:
- **Configure once at boot.** `LlmCostTracker.configure` freezes mutable shared configuration when the block returns, and replacing shared fields through `LlmCostTracker.configuration` raises `FrozenError`. If `default_tags` is callable, keep it fast and thread-safe.
- **Use `:active_record` storage for the built-in shared ledger.** Puma workers and Sidekiq processes do not share memory; `:log` is process-local, and `:custom` is only as shared as the sink you write to. `:active_record` writes to a single table and is the right choice for the bundled dashboard and budget checks across processes.
- **Size your connection pool.** Each tracked call on the middleware path uses the host app's ActiveRecord connection for ledger writes, period rollups, and optional budget checks. Make sure the AR pool covers `puma max_threads + sidekiq concurrency` plus your app's own usage.
- **Don't share a `StreamCollector` across threads you don't own.** The collector itself is thread-safe — `event`, `usage`, and `finish!` synchronize internally and `finish!` is idempotent — but the documented pattern is one collector per stream.
- **`finish!` is a barrier.** Once a stream is finished, later `event`, `usage`, or `model=` calls raise `FrozenError` instead of mutating a closed collector.
- **`ActiveSupport::Notifications` subscribers run synchronously** in the caller's thread. Keep them fast or hand off to a background job; otherwise they add latency to every tracked call.
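
The hand-off in the last rule can be sketched with stdlib primitives: the subscriber only enqueues, and a single background consumer does the slow work. In a Rails app you would enqueue a job instead; the queue here is just illustrative:

```ruby
# Illustrative hand-off: keep the synchronous subscriber O(1) by pushing
# payloads onto a queue drained off the request thread.
events = Thread::Queue.new
processed = 0

consumer = Thread.new do
  while (payload = events.pop)   # nil sentinel ends the loop
    # Slow downstream work (exports, alerting, ...) happens here,
    # without adding latency to the tracked call.
    processed += 1
  end
end

# Stand-in for the body of an ActiveSupport::Notifications subscriber:
subscriber = ->(payload) { events << payload }

5.times { |i| subscriber.call({ total_cost: 0.001 * i }) }
events << nil                    # tell the consumer to shut down
consumer.join
processed # => 5
```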
## Known limitations
- `:block_requests` is a best-effort guardrail, not a hard cap. Concurrent workers can pass preflight simultaneously and collectively overshoot the budget. Use an external quota system if you need a transactional cap.
- Official SDK integrations currently cover non-streaming calls. Use Faraday middleware or `track_stream` for SDK streaming until stable stream wrappers are added.
- Streaming capture relies on the provider emitting a final-usage event (OpenAI needs `stream_options: { include_usage: true }`); missing events are recorded with `usage_source: "unknown"` so they surface on the Data Quality page.
- `provider_response_id` is stored only when the provider exposes a stable response object ID. Missing IDs stay `nil` and surface on the Data Quality page.
- Cache write TTL variants (1h vs 5min writes) are not modeled separately.
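
For the streaming caveat, the final-usage chunk on OpenAI-style Chat Completions has to be requested explicitly in the request body. Everything except `stream` and `stream_options` below is a placeholder:

```ruby
# Request-body sketch for a streamed chat completion. The point is
# stream_options.include_usage; model and messages are placeholders.
request_body = {
  model: "gpt-4o",
  messages: [{ role: "user", content: "hi" }],
  stream: true,
  stream_options: { include_usage: true } # ask for a final usage chunk
}
```

Without `include_usage: true`, OpenAI-style streams never carry token counts, and such calls land as `usage_source: "unknown"`.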
## Development

Architecture rules for future changes live in `docs/architecture.md`.

```bash
bundle install
|
|
|