axn-ruby_llm 0.1.0 → 0.1.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 5c8aae7528ce13d8b34d96781ea472dec9059aaadd029176c8442da9a828c88c
- data.tar.gz: 5d45458a6c80b9679fa3a59c12e9674264891788bbd586b26fb589763403bd19
+ metadata.gz: 5339909ae113dc742efe969fb25ae90294abef8090629c5fc9f452acdb98757d
+ data.tar.gz: 8d73239e08a8ab270fc6edb980557e6b1932e63e9fdbdcd6f26bfb431b2a5c47
  SHA512:
- metadata.gz: e42eb1a60816cf49c579ebe9343ffa53ff39eadb95a746cd624a4bc78a1209ef8f52f8c2f566566d8d47fc45b16da86161bb88e179814b77bd919c5af44b45af
- data.tar.gz: 12ca3f1d02a06c8c0e0d1120a9d4b600b37dde32fed917020cd50ff129958e9509cf4b87bb0037f79e50e227669bccec4ccf3bfb09bfb80ae371842ac66fd70a
+ metadata.gz: 75587b0c6a29875e2de143459b0b2ac1280581cb9e3b2a080777737ed803fda05008f0703a31522a4d5bcb47ea10998ef6236e1917bc9157f10fcaf278f97f3f
+ data.tar.gz: 21d44dcc5fc6bc41615cc472648737b2c6491d05dfc1d469b7c884bed2d7a260da574f83b5f14d8b0ff97694fdf60f2428d4db72a22123fe0764b47f0606b7c4
data/CHANGELOG.md CHANGED
@@ -1,5 +1,20 @@
  # Changelog
 
+ ## [0.1.2] - 2026-06-11
+
+ Requires RubyLLM >= 1.15 (minimum version bumped from 1.0).
+
+ RubyLLM 1.15 normalized token accounting: `input_tokens` now means non-cached input tokens only; cache activity is split into `cache_read_tokens` and `cache_write_tokens`. This release surfaces those fields and adds a convenience total.
+
+ - Add `cache_read_tokens` and `cache_write_tokens` exposures to `Ask`.
+ - Add `prompt_tokens` exposure — the sum of all three input token fields (`input_tokens + cache_read_tokens + cache_write_tokens`), matching OpenAI's `prompt_tokens` convention. Nil only if all three components are nil.
+ - Update `stub_axn_ruby_llm` helper to accept `cache_read_tokens:` and `cache_write_tokens:` params.
+ - Update `StubMessage` Data struct to include the new token fields (all zeroed in stub/disabled mode).
+
+ ## [0.1.1] - 2026-06-11
+
+ - Use `mount_axn` pattern for `Axn::RubyLLM.ask` / `.ask!` / `.ask_async` shortcuts (via `Axn::Mountable`), replacing hand-written delegation. Requires axn `>= 0.1.0-alpha.4.3`.
+
  ## [0.1.0] - 2026-05-21
 
  Initial release.
data/README.md CHANGED
@@ -12,7 +12,7 @@ Three things you'd otherwise build at every callsite:
 
  2. **Production gating.** A single `c.enabled = -> { Rails.env.production? }` in an initializer stubs every LLM call in non-prod environments — no per-callsite guards needed. The stub is typed (`stubbed: true`, `input_tokens: 0`, etc.) so downstream code doesn't need to branch on it either.
 
- 3. **Cost/token tracking, exposed automatically.** Every call exposes `input_tokens`, `output_tokens`, `cost`, and `cost_breakdown` without you doing the `RubyLLM.models.find` lookup manually. If your app uses OpenTelemetry, these values are also set as attributes on the existing `axn.call` span — no configuration required.
+ 3. **Cost/token tracking, exposed automatically.** Every call exposes `input_tokens`, `output_tokens`, `cache_read_tokens`, `cache_write_tokens`, `prompt_tokens` (the total), `cost`, and `cost_breakdown` without you doing the `RubyLLM.models.find` lookup manually. If your app uses OpenTelemetry, these values are also set as attributes on the existing `axn.call` span — no configuration required.
 
  > **Scope note:** This gem covers the subset of RubyLLM functionality that [Teamshares](https://github.com/teamshares) uses internally — single-turn chat, structured output, and basic observability. It is intentionally minimal rather than a full-featured wrapper. Feedback and pull requests to extend it are very welcome.
 
@@ -65,7 +65,6 @@ result = Axn::RubyLLM.ask(
  )
  ```
 
- The underlying action class is available as `Axn::RubyLLM::Ask` for cases where you need the full `Axn` interface (`call!`, `call_async`, instrumentation hooks, etc.).
 
  ### Structured output via schema
 
@@ -89,24 +88,26 @@ result.response # => { "company_id" => 42, "confidence" => 0.92, "reasoning" =>
 
  ### Token counts and cost
 
- Every successful result exposes token usage and cost in two tiers:
+ Every successful result exposes token usage and cost:
 
  ```ruby
  result = Axn::RubyLLM.ask(prompt: "...")
 
- # Flat (common case)
- result.input_tokens # => 412
- result.output_tokens # => 78
- result.cost # => 0.00056 (Float USD total; nil if RubyLLM has no pricing for the model)
+ result.input_tokens # => 412 (non-cached input tokens only)
+ result.cache_read_tokens # => 80 (tokens served from cache; nil if provider didn't return them)
+ result.cache_write_tokens # => 20 (tokens written to cache; nil if provider didn't return them)
+ result.prompt_tokens # => 512 (input_tokens + cache_read_tokens + cache_write_tokens; total request-side tokens, OpenAI-style)
+ result.output_tokens # => 78
+ result.cost # => 0.00056 (Float USD total; nil if RubyLLM has no pricing for the model)
 
- # Resolved breakdown — RubyLLM::Cost struct
+ # Full breakdown — RubyLLM::Cost struct with per-tier pricing
  result.cost_breakdown # => #<Cost input: 0.0004, output: 0.00016, cache_read: 0.0, ..., total: 0.00056>
 
- # Full escape hatch — the raw RubyLLM::Message for cache/thinking tokens, etc.
+ # Raw RubyLLM::Message for thinking tokens, raw provider data, etc.
  result.raw_message # => #<RubyLLM::Message ...>
  ```
 
- `cost` and `cost_breakdown` are both `nil` when RubyLLM lacks pricing for the model (e.g. unknown/custom endpoints). Token counts are nil only if the provider did not return them.
+ `cost` and `cost_breakdown` are both `nil` when RubyLLM lacks pricing for the model (e.g. unknown/custom endpoints). Token counts are nil only if the provider did not return them. `prompt_tokens` is nil only if all three input token fields are nil.
 
  Errors are handled via Axn's declarative `error` DSL:
  - `JSON::ParserError` → result fails with `"Failed to parse JSON from LLM response"`
@@ -136,7 +137,7 @@ If your app uses OpenTelemetry, `axn` already wraps every action in an `axn.call
  |---|---|
  | `gen_ai.request.model` | The model requested |
  | `gen_ai.response.model` | The model that responded |
- | `gen_ai.usage.input_tokens` | Prompt token count |
+ | `gen_ai.usage.input_tokens` | Non-cached input token count |
  | `gen_ai.usage.output_tokens` | Completion token count |
  | `gen_ai.usage.cost` | USD total (non-standard; useful for spend filtering) |
  | `axn.ruby_llm.stubbed` | `true` when production gating returned a stub |
@@ -155,13 +156,13 @@ Axn::RubyLLM.configure do |c|
  end
  ```
 
- When disabled, `Ask` returns a **success** result with obvious stub content, so callers don't need per-callsite branching:
+ When disabled, `Axn::RubyLLM.ask` returns a **success** result with obvious stub content, so callers don't need per-callsite branching:
 
  | Field | Stubbed value |
  |---|---|
  | `response` | `"stubbed response value"` (plain) / `{ "stubbed" => true }` (`json: true` or `schema:`) |
- | `raw_message` | `Ask::StubMessage` Data instance with `.content`, `.input_tokens`, `.output_tokens`, `.model_id` |
- | `input_tokens` / `output_tokens` | `0` |
+ | `raw_message` | Stub struct with `.content`, `.input_tokens`, `.output_tokens`, `.cache_read_tokens`, `.cache_write_tokens`, `.model_id` |
+ | `input_tokens` / `output_tokens` / `cache_read_tokens` / `cache_write_tokens` / `prompt_tokens` | `0` |
  | `cost` | `0.0` |
  | `cost_breakdown` | `nil` |
  | `stubbed` | `true` |
@@ -16,11 +16,14 @@ module Axn
  exposes :raw_message
  exposes :input_tokens, allow_nil: true
  exposes :output_tokens, allow_nil: true
+ exposes :cache_read_tokens, allow_nil: true
+ exposes :cache_write_tokens, allow_nil: true
+ exposes :prompt_tokens, allow_nil: true
  exposes :cost, allow_nil: true
  exposes :cost_breakdown, allow_nil: true
  exposes :stubbed, type: :boolean, default: false
 
- StubMessage = Data.define(:content, :input_tokens, :output_tokens, :model_id)
+ StubMessage = Data.define(:content, :input_tokens, :output_tokens, :cache_read_tokens, :cache_write_tokens, :model_id)
 
  error prefix: "LLM request failed: "
  error "Failed to parse JSON from LLM response", if: JSON::ParserError
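
The widened `StubMessage` definition in the hunk above can be tried in isolation. The following is a minimal sketch, assuming Ruby 3.2+ (for `Data.define`); `build_stub_message` is a hypothetical helper introduced only for illustration and is not part of the gem.

```ruby
# Standalone copy of the struct shape shown in the diff; it lives outside the gem here.
StubMessage = Data.define(:content, :input_tokens, :output_tokens,
                          :cache_read_tokens, :cache_write_tokens, :model_id)

# Hypothetical helper (not in the gem): builds a stub with every token field zeroed.
def build_stub_message(content)
  StubMessage.new(content:, input_tokens: 0, output_tokens: 0,
                  cache_read_tokens: 0, cache_write_tokens: 0, model_id: "stubbed")
end

stub = build_stub_message("stubbed response value")
stub.cache_read_tokens # zero in stub mode
stub.frozen?           # Data instances are immutable

# Omitting a declared member raises ArgumentError, so callers still building
# the old four-field struct fail loudly instead of silently reading nil.
missing_error = nil
begin
  StubMessage.new(content: "x", input_tokens: 0, output_tokens: 0, model_id: "stubbed")
rescue ArgumentError => e
  missing_error = e.message
end
```

One consequence worth noting: because `Data` requires every member at construction time, any external code that instantiates the stub struct directly would need to pass the two new cache fields after upgrading.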
@@ -45,6 +48,9 @@ module Axn
  raw_message: llm_response,
  input_tokens: llm_response.input_tokens,
  output_tokens: llm_response.output_tokens,
+ cache_read_tokens: llm_response.cache_read_tokens,
+ cache_write_tokens: llm_response.cache_write_tokens,
+ prompt_tokens: total_input_tokens,
  cost_breakdown:,
  cost: cost_breakdown&.total,
  stubbed: false,
@@ -68,9 +74,12 @@ module Axn
  content = schema || json ? { "stubbed" => true } : "stubbed response value"
  {
  response: content,
- raw_message: StubMessage.new(content:, input_tokens: 0, output_tokens: 0, model_id: "stubbed"),
+ raw_message: StubMessage.new(content:, input_tokens: 0, output_tokens: 0, cache_read_tokens: 0, cache_write_tokens: 0, model_id: "stubbed"),
  input_tokens: 0,
  output_tokens: 0,
+ cache_read_tokens: 0,
+ cache_write_tokens: 0,
+ prompt_tokens: 0,
  cost: 0.0,
  cost_breakdown: nil,
  stubbed: true,
@@ -87,6 +96,11 @@ module Axn
  json ? JSON.parse(llm_response.content) : llm_response.content
  end
 
+ def total_input_tokens
+ vals = [llm_response.input_tokens, llm_response.cache_read_tokens, llm_response.cache_write_tokens]
+ vals.all?(&:nil?) ? nil : vals.sum(&:to_i)
+ end
+
  def cost_breakdown
  return nil unless model_info
 
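
The nil handling in `total_input_tokens` is easy to misread, so here is a small standalone sketch of the same arithmetic. `prompt_tokens_for` is a hypothetical free function written for illustration; the gem itself uses the private method shown in the hunk above, reading the three values from the RubyLLM response.

```ruby
# Hypothetical free-function version of the diff's private helper.
# Returns nil only when all three components are nil; otherwise a missing
# component counts as zero because nil.to_i == 0.
def prompt_tokens_for(input_tokens, cache_read_tokens, cache_write_tokens)
  vals = [input_tokens, cache_read_tokens, cache_write_tokens]
  return nil if vals.all?(&:nil?)

  vals.sum(&:to_i)
end

prompt_tokens_for(412, 80, 20)   # => 512, the README example
prompt_tokens_for(412, nil, nil) # => 412, provider reported no cache activity
prompt_tokens_for(nil, nil, nil) # => nil, provider reported nothing at all
```

The practical upshot is that a nil `prompt_tokens` means "no usage data", while a partial report still yields an integer total.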
@@ -13,11 +13,14 @@ module Axn
  # stub_axn_ruby_llm(response: { "key" => "value" }) # auto-JSON-serialized for json: true calls
  # stub_axn_ruby_llm(response: { "k" => "v" }, schema: MySchema) # Hash passed through unparsed
  # stub_axn_ruby_llm(response: "...", input_tokens: 100, output_tokens: 50, cost: 0.0023)
+ # stub_axn_ruby_llm(response: "...", cache_read_tokens: 500, cache_write_tokens: 200)
  #
  # Returns the chat instance double for further assertions if needed.
- def stub_axn_ruby_llm(response:, model: nil, schema: nil, input_tokens: nil, output_tokens: nil, cost: nil)
+ def stub_axn_ruby_llm(response:, model: nil, schema: nil, input_tokens: nil, output_tokens: nil,
+ cache_read_tokens: nil, cache_write_tokens: nil, cost: nil)
  resolved_model_id = model || Axn::RubyLLM.configuration.default_model
- llm_message = _stub_axn_ruby_llm_message(response, resolved_model_id, input_tokens, output_tokens, schema:)
+ llm_message = _stub_axn_ruby_llm_message(response, resolved_model_id, input_tokens, output_tokens,
+ cache_read_tokens:, cache_write_tokens:, schema:)
  chat_instance = _stub_axn_ruby_llm_chat(model, llm_message, schema:)
  _stub_axn_ruby_llm_cost(llm_message, resolved_model_id, cost)
  chat_instance
@@ -25,7 +28,8 @@ module Axn
 
  private
 
- def _stub_axn_ruby_llm_message(response, model_id, input_tokens, output_tokens, schema:)
+ def _stub_axn_ruby_llm_message(response, model_id, input_tokens, output_tokens, cache_read_tokens:,
+ cache_write_tokens:, schema:)
  content = if schema
  response
  elsif response.is_a?(Hash)
@@ -33,7 +37,9 @@ module Axn
  else
  response.to_s
  end
- instance_double(::RubyLLM::Message, content:, input_tokens:, output_tokens:, model_id:)
+ instance_double(::RubyLLM::Message,
+ content:, input_tokens:, output_tokens:,
+ cache_read_tokens:, cache_write_tokens:, model_id:)
  end
 
  def _stub_axn_ruby_llm_chat(model, llm_message, schema:)
@@ -2,6 +2,6 @@
 
  module Axn
  module RubyLLM
- VERSION = "0.1.0"
+ VERSION = "0.1.2"
  end
  end
data/lib/axn/ruby_llm.rb CHANGED
@@ -9,6 +9,10 @@ require_relative "ruby_llm/ask"
 
  module Axn
  module RubyLLM
+ include Axn::Mountable
+
+ mount_axn :ask, Ask
+
  class << self
  def configuration
  @configuration ||= Configuration.new
@@ -21,9 +25,6 @@ module Axn
  def reset_configuration!
  @configuration = nil
  end
-
- def ask(**) = Ask.call(**)
- def ask!(**) = Ask.call!(**)
  end
  end
  end
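
To illustrate the general idea behind swapping hand-written delegators for a mount macro, here is a minimal sketch. It is not Axn's implementation: `MiniMountable`, `mount`, `EchoAction`, and `Shortcuts` are all hypothetical names, and the real `Axn::Mountable` / `mount_axn` internals are not visible in this diff. The sketch generates only the plain and bang variants, whereas the changelog says the real macro also provides `.ask_async`.

```ruby
# Hypothetical stand-in for a mount macro (NOT Axn::Mountable itself).
module MiniMountable
  # Defines `name` and `name!` on the extending module, delegating to the
  # action's `call` and `call!` respectively.
  def mount(name, action)
    define_singleton_method(name)        { |**kwargs| action.call(**kwargs) }
    define_singleton_method(:"#{name}!") { |**kwargs| action.call!(**kwargs) }
  end
end

# Toy action exposing the two entry points the removed code delegated to.
class EchoAction
  def self.call(**kwargs)
    { ok: true, args: kwargs }
  end

  def self.call!(**kwargs)
    call(**kwargs).fetch(:args)
  end
end

module Shortcuts
  extend MiniMountable
  mount :ask, EchoAction
end

Shortcuts.ask(prompt: "hi")  # delegates to EchoAction.call
Shortcuts.ask!(prompt: "hi") # delegates to EchoAction.call!
```

The design benefit is the one the changelog names: the shortcut methods are generated from a single declaration, so adding a new variant does not require touching every module that exposes an action.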
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: axn-ruby_llm
  version: !ruby/object:Gem::Version
- version: 0.1.0
+ version: 0.1.2
  platform: ruby
  authors:
  - Kali Donovan
@@ -15,7 +15,7 @@ dependencies:
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: 0.1.0.pre.alpha.4.2
+ version: 0.1.0.pre.alpha.4.3
  - - "<"
  - !ruby/object:Gem::Version
  version: 0.2.0
@@ -25,7 +25,7 @@ dependencies:
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: 0.1.0.pre.alpha.4.2
+ version: 0.1.0.pre.alpha.4.3
  - - "<"
  - !ruby/object:Gem::Version
  version: 0.2.0
@@ -35,7 +35,7 @@ dependencies:
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '1.0'
+ version: '1.15'
  - - "<"
  - !ruby/object:Gem::Version
  version: '2.0'
@@ -45,7 +45,7 @@ dependencies:
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '1.0'
+ version: '1.15'
  - - "<"
  - !ruby/object:Gem::Version
  version: '2.0'
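
The tightened version bounds in the metadata can be checked with RubyGems' own classes. This is a small sketch, assuming (per the changelog) that the `0.1.0.pre.alpha.4.3` bound belongs to axn and the `1.15` bound to RubyLLM; the dependency names themselves fall outside the hunks shown. The variable names are illustrative only.

```ruby
require "rubygems"

# Bounds copied from the metadata hunks above.
axn_req      = Gem::Requirement.new(">= 0.1.0.pre.alpha.4.3", "< 0.2.0")
ruby_llm_req = Gem::Requirement.new(">= 1.15", "< 2.0")

# RubyGems treats the changelog's dashed spelling and the metadata's
# ".pre." spelling as the same version.
same_version = Gem::Version.new("0.1.0-alpha.4.3") == Gem::Version.new("0.1.0.pre.alpha.4.3")

axn_old_ok  = axn_req.satisfied_by?(Gem::Version.new("0.1.0.pre.alpha.4.2")) # previous minimum
axn_new_ok  = axn_req.satisfied_by?(Gem::Version.new("0.1.0.pre.alpha.4.3"))
llm_1_14_ok = ruby_llm_req.satisfied_by?(Gem::Version.new("1.14.0"))
llm_1_15_ok = ruby_llm_req.satisfied_by?(Gem::Version.new("1.15.0"))
llm_2_0_ok  = ruby_llm_req.satisfied_by?(Gem::Version.new("2.0.0"))

puts "axn alpha.4.2 allowed: #{axn_old_ok}, alpha.4.3 allowed: #{axn_new_ok}"
puts "RubyLLM 1.14 allowed: #{llm_1_14_ok}, 1.15 allowed: #{llm_1_15_ok}, 2.0 allowed: #{llm_2_0_ok}"
```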