RubyGems - axn-ruby_llm - Versions diffs - 0.1.1 → 0.1.2 - Mend

axn-ruby_llm 0.1.1 → 0.1.2

This diff compares the published contents of the two package versions as they appear in the public registry. It is provided for informational purposes only.

Files changed (7)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: f5a715c75d2ee173307de9cdcb25b067c011f1540cca7211f4f0c58b55df0089
-  data.tar.gz: 69c0f2fbc6edfcad836da1b64ff362d82868b2eeefa4352f7aa52db6b3830de7
+  metadata.gz: 5339909ae113dc742efe969fb25ae90294abef8090629c5fc9f452acdb98757d
+  data.tar.gz: 8d73239e08a8ab270fc6edb980557e6b1932e63e9fdbdcd6f26bfb431b2a5c47
 SHA512:
-  metadata.gz: c2a483d82768f5dc38bdcbd1b013a40626418bbec481860d5b92066f5b847c100d725d41f34f9e57bf7686533f2d4f53999b8a6693bd5fc5064f566156dbeef3
-  data.tar.gz: 371df624c760aca167f3abc35965d10ffe3b6bf029851337c59a80c9b1dad1c6429b0eb58e6df024e5ecfedb65e53306f2afe81d1549e209a051e23f348121cb
+  metadata.gz: 75587b0c6a29875e2de143459b0b2ac1280581cb9e3b2a080777737ed803fda05008f0703a31522a4d5bcb47ea10998ef6236e1917bc9157f10fcaf278f97f3f
+  data.tar.gz: 21d44dcc5fc6bc41615cc472648737b2c6491d05dfc1d469b7c884bed2d7a260da574f83b5f14d8b0ff97694fdf60f2428d4db72a22123fe0764b47f0606b7c4

data/CHANGELOG.md CHANGED

@@ -1,5 +1,16 @@
 # Changelog
+## [0.1.2] - 2026-06-11
+Requires RubyLLM >= 1.15 (minimum version bumped from 1.0).
+RubyLLM 1.15 normalized token accounting: `input_tokens` now means non-cached input tokens only; cache activity is split into `cache_read_tokens` and `cache_write_tokens`. This release surfaces those fields and adds a convenience total.
+- Add `cache_read_tokens` and `cache_write_tokens` exposures to `Ask`.
+- Add `prompt_tokens` exposure — the sum of all three input token fields (`input_tokens + cache_read_tokens + cache_write_tokens`), matching OpenAI's `prompt_tokens` convention. Nil only if all three components are nil.
+- Update `stub_axn_ruby_llm` helper to accept `cache_read_tokens:` and `cache_write_tokens:` params.
+- Update `StubMessage` Data struct to include the new token fields (all zeroed in stub/disabled mode).
 ## [0.1.1] - 2026-06-11
 - Use `mount_axn` pattern for `Axn::RubyLLM.ask` / `.ask!` / `.ask_async` shortcuts (via `Axn::Mountable`), replacing hand-written delegation. Requires axn `>= 0.1.0-alpha.4.3`.

data/README.md CHANGED

@@ -12,7 +12,7 @@ Three things you'd otherwise build at every callsite:
 2. **Production gating.** A single `c.enabled = -> { Rails.env.production? }` in an initializer stubs every LLM call in non-prod environments — no per-callsite guards needed. The stub is typed (`stubbed: true`, `input_tokens: 0`, etc.) so downstream code doesn't need to branch on it either.
-3. **Cost/token tracking, exposed automatically.** Every call exposes `input_tokens`, `output_tokens`, `cost`, and `cost_breakdown` without you doing the `RubyLLM.models.find` lookup manually. If your app uses OpenTelemetry, these values are also set as attributes on the existing `axn.call` span — no configuration required.
+3. **Cost/token tracking, exposed automatically.** Every call exposes `input_tokens`, `output_tokens`, `cache_read_tokens`, `cache_write_tokens`, `prompt_tokens` (the total), `cost`, and `cost_breakdown` without you doing the `RubyLLM.models.find` lookup manually. If your app uses OpenTelemetry, these values are also set as attributes on the existing `axn.call` span — no configuration required.
 > **Scope note:** This gem covers the subset of RubyLLM functionality that [Teamshares](https://github.com/teamshares) uses internally — single-turn chat, structured output, and basic observability. It is intentionally minimal rather than a full-featured wrapper. Feedback and pull requests to extend it are very welcome.
@@ -88,24 +88,26 @@ result.response # => { "company_id" => 42, "confidence" => 0.92, "reasoning" =>
 ### Token counts and cost
-Every successful result exposes token usage and cost in two tiers:
+Every successful result exposes token usage and cost:
 ```ruby
 result = Axn::RubyLLM.ask(prompt: "...")
-# Flat (common case)
-result.input_tokens    # => 412
-result.output_tokens   # => 78
-result.cost            # => 0.00056 (Float USD total; nil if RubyLLM has no pricing for the model)
+result.input_tokens       # => 412  (non-cached input tokens only)
+result.cache_read_tokens  # => 80   (tokens served from cache; nil if provider didn't return them)
+result.cache_write_tokens # => 20   (tokens written to cache; nil if provider didn't return them)
+result.prompt_tokens      # => 512  (input_tokens + cache_read_tokens + cache_write_tokens — total request-side tokens, OpenAI-style)
+result.output_tokens      # => 78
+result.cost               # => 0.00056 (Float USD total; nil if RubyLLM has no pricing for the model)
-# Resolved breakdown — RubyLLM::Cost struct
+# Full breakdown — RubyLLM::Cost struct with per-tier pricing
 result.cost_breakdown  # => #<Cost input: 0.0004, output: 0.00016, cache_read: 0.0, ..., total: 0.00056>
-# Full escape hatch — the raw RubyLLM::Message for cache/thinking tokens, etc.
+# Raw RubyLLM::Message for thinking tokens, raw provider data, etc.
 result.raw_message     # => #<RubyLLM::Message ...>
 ```
-`cost` and `cost_breakdown` are both `nil` when RubyLLM lacks pricing for the model (e.g. unknown/custom endpoints). Token counts are nil only if the provider did not return them.
+`cost` and `cost_breakdown` are both `nil` when RubyLLM lacks pricing for the model (e.g. unknown/custom endpoints). Token counts are nil only if the provider did not return them. `prompt_tokens` is nil only if all three input token fields are nil.
 Errors are handled via Axn's declarative `error` DSL:
 - `JSON::ParserError` → result fails with `"Failed to parse JSON from LLM response"`
@@ -135,7 +137,7 @@ If your app uses OpenTelemetry, `axn` already wraps every action in an `axn.call
 |---|---|
 | `gen_ai.request.model` | The model requested |
 | `gen_ai.response.model` | The model that responded |
-| `gen_ai.usage.input_tokens` | Prompt token count |
+| `gen_ai.usage.input_tokens` | Non-cached input token count |
 | `gen_ai.usage.output_tokens` | Completion token count |
 | `gen_ai.usage.cost` | USD total (non-standard; useful for spend filtering) |
 | `axn.ruby_llm.stubbed` | `true` when production gating returned a stub |
@@ -159,8 +161,8 @@ When disabled, `Axn::RubyLLM.ask` returns a **success** result with obvious stub
 | Field | Stubbed value |
 |---|---|
 | `response` | `"stubbed response value"` (plain) / `{ "stubbed" => true }` (`json: true` or `schema:`) |
-| `raw_message` | Stub struct with `.content`, `.input_tokens`, `.output_tokens`, `.model_id` |
-| `input_tokens` / `output_tokens` | `0` |
+| `raw_message` | Stub struct with `.content`, `.input_tokens`, `.output_tokens`, `.cache_read_tokens`, `.cache_write_tokens`, `.model_id` |
+| `input_tokens` / `output_tokens` / `cache_read_tokens` / `cache_write_tokens` / `prompt_tokens` | `0` |
 | `cost` | `0.0` |
 | `cost_breakdown` | `nil` |
 | `stubbed` | `true` |
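
One thing the new fields make possible is measuring how much of a request was served from cache. The sketch below is only an illustration: `TokenUsage` and `cache_hit_ratio` are hypothetical names that are not part of the gem, and the struct merely stands in for a result object exposing the readers documented in the README hunk above. The figures are the ones from the README example.

```ruby
# Hypothetical stand-in for a result; only the token readers are modeled.
TokenUsage = Struct.new(
  :input_tokens, :cache_read_tokens, :cache_write_tokens, :prompt_tokens,
  keyword_init: true
)

# Hypothetical helper: fraction of request-side tokens served from cache.
# Returns nil when the total is unknown (nil) or zero, as in a stubbed result.
def cache_hit_ratio(usage)
  total = usage.prompt_tokens
  return nil if total.nil? || total.zero?

  # cache_read_tokens may be nil if the provider did not report it.
  usage.cache_read_tokens.to_i.fdiv(total)
end

usage = TokenUsage.new(
  input_tokens: 412, cache_read_tokens: 80, cache_write_tokens: 20, prompt_tokens: 512
)
puts cache_hit_ratio(usage) # 80 of 512 request-side tokens came from cache
```

Guarding on a zero total matters because the stubbed result reports `0` for every token field, so an unguarded division would yield `NaN`.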

data/lib/axn/ruby_llm/ask.rb CHANGED

@@ -16,11 +16,14 @@ module Axn
       exposes :raw_message
       exposes :input_tokens, allow_nil: true
       exposes :output_tokens, allow_nil: true
+      exposes :cache_read_tokens, allow_nil: true
+      exposes :cache_write_tokens, allow_nil: true
+      exposes :prompt_tokens, allow_nil: true
       exposes :cost, allow_nil: true
       exposes :cost_breakdown, allow_nil: true
       exposes :stubbed, type: :boolean, default: false
-      StubMessage = Data.define(:content, :input_tokens, :output_tokens, :model_id)
+      StubMessage = Data.define(:content, :input_tokens, :output_tokens, :cache_read_tokens, :cache_write_tokens, :model_id)
       error prefix: "LLM request failed: "
       error "Failed to parse JSON from LLM response", if: JSON::ParserError
@@ -45,6 +48,9 @@ module Axn
           raw_message: llm_response,
           input_tokens: llm_response.input_tokens,
           output_tokens: llm_response.output_tokens,
+          cache_read_tokens: llm_response.cache_read_tokens,
+          cache_write_tokens: llm_response.cache_write_tokens,
+          prompt_tokens: total_input_tokens,
           cost_breakdown:,
           cost: cost_breakdown&.total,
           stubbed: false,
@@ -68,9 +74,12 @@ module Axn
         content = schema || json ? { "stubbed" => true } : "stubbed response value"
         {
           response: content,
-          raw_message: StubMessage.new(content:, input_tokens: 0, output_tokens: 0, model_id: "stubbed"),
+          raw_message: StubMessage.new(content:, input_tokens: 0, output_tokens: 0, cache_read_tokens: 0, cache_write_tokens: 0, model_id: "stubbed"),
           input_tokens: 0,
           output_tokens: 0,
+          cache_read_tokens: 0,
+          cache_write_tokens: 0,
+          prompt_tokens: 0,
           cost: 0.0,
           cost_breakdown: nil,
           stubbed: true,
@@ -87,6 +96,11 @@ module Axn
         json ? JSON.parse(llm_response.content) : llm_response.content
       end
+      def total_input_tokens
+        vals = [llm_response.input_tokens, llm_response.cache_read_tokens, llm_response.cache_write_tokens]
+        vals.all?(&:nil?) ? nil : vals.sum(&:to_i)
+      end
       def cost_breakdown
         return nil unless model_info
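
The nil handling in `total_input_tokens` is easy to misread, so here is a standalone sketch of the same logic. `sum_input_tokens` is a hypothetical free function that takes the three counts as arguments instead of reading them from `llm_response`. It is meant to show the semantics, not to reproduce the gem's code.

```ruby
# Standalone sketch of the logic in `total_input_tokens` (hypothetical name).
# A missing component counts as 0; the total is nil only when all are missing.
def sum_input_tokens(input_tokens, cache_read_tokens, cache_write_tokens)
  vals = [input_tokens, cache_read_tokens, cache_write_tokens]
  return nil if vals.all?(&:nil?)

  vals.sum(&:to_i) # nil.to_i is 0, so partial data still produces a total
end

p sum_input_tokens(412, 80, 20)    # all three reported
p sum_input_tokens(412, nil, nil)  # provider reported no cache fields
p sum_input_tokens(nil, nil, nil)  # provider reported nothing
p sum_input_tokens(0, 0, 0)        # stub-style zeros
```

Note the distinction between the last two cases: all-nil input yields `nil` (unknown), while all-zero input yields `0` (known to be empty), which matches the stubbed result's `prompt_tokens: 0`.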

data/lib/axn/ruby_llm/rspec.rb CHANGED

@@ -13,11 +13,14 @@ module Axn
         #   stub_axn_ruby_llm(response: { "key" => "value" })  # auto-JSON-serialized for json: true calls
         #   stub_axn_ruby_llm(response: { "k" => "v" }, schema: MySchema) # Hash passed through unparsed
         #   stub_axn_ruby_llm(response: "...", input_tokens: 100, output_tokens: 50, cost: 0.0023)
+        #   stub_axn_ruby_llm(response: "...", cache_read_tokens: 500, cache_write_tokens: 200)
         #
         # Returns the chat instance double for further assertions if needed.
-        def stub_axn_ruby_llm(response:, model: nil, schema: nil, input_tokens: nil, output_tokens: nil, cost: nil)
+        def stub_axn_ruby_llm(response:, model: nil, schema: nil, input_tokens: nil, output_tokens: nil,
+                              cache_read_tokens: nil, cache_write_tokens: nil, cost: nil)
           resolved_model_id = model || Axn::RubyLLM.configuration.default_model
-          llm_message = _stub_axn_ruby_llm_message(response, resolved_model_id, input_tokens, output_tokens, schema:)
+          llm_message = _stub_axn_ruby_llm_message(response, resolved_model_id, input_tokens, output_tokens,
+                                                   cache_read_tokens:, cache_write_tokens:, schema:)
           chat_instance = _stub_axn_ruby_llm_chat(model, llm_message, schema:)
           _stub_axn_ruby_llm_cost(llm_message, resolved_model_id, cost)
           chat_instance
@@ -25,7 +28,8 @@ module Axn
         private
-        def _stub_axn_ruby_llm_message(response, model_id, input_tokens, output_tokens, schema:)
+        def _stub_axn_ruby_llm_message(response, model_id, input_tokens, output_tokens, cache_read_tokens:,
+                                       cache_write_tokens:, schema:)
           content = if schema
                       response
                     elsif response.is_a?(Hash)
@@ -33,7 +37,9 @@ module Axn
                     else
                       response.to_s
                     end
-          instance_double(::RubyLLM::Message, content:, input_tokens:, output_tokens:, model_id:)
+          instance_double(::RubyLLM::Message,
+                          content:, input_tokens:, output_tokens:,
+                          cache_read_tokens:, cache_write_tokens:, model_id:)
         end
         def _stub_axn_ruby_llm_chat(model, llm_message, schema:)
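
Because the two new keywords default to `nil`, call sites written against the 0.1.1 signature should keep working unchanged. The sketch below shows that property in isolation. `FakeMessage` and `build_fake_message` are hypothetical stand-ins and do not use RSpec or the gem; the real helper builds an `instance_double` instead, and the diff uses Ruby's shorthand keyword syntax where this sketch spells the keywords out.

```ruby
# Hypothetical stand-in for the message double the helper builds.
FakeMessage = Struct.new(
  :content, :input_tokens, :output_tokens, :cache_read_tokens, :cache_write_tokens,
  keyword_init: true
)

# Hypothetical builder with the same optional-keyword shape as the new signature.
def build_fake_message(response:, input_tokens: nil, output_tokens: nil,
                       cache_read_tokens: nil, cache_write_tokens: nil)
  FakeMessage.new(
    content: response.to_s,
    input_tokens: input_tokens,
    output_tokens: output_tokens,
    cache_read_tokens: cache_read_tokens,
    cache_write_tokens: cache_write_tokens
  )
end

# A 0.1.1-style call: the cache fields are simply nil.
old_style = build_fake_message(response: "hi", input_tokens: 100, output_tokens: 50)
# A 0.1.2-style call supplying cache activity.
new_style = build_fake_message(response: "hi", cache_read_tokens: 500, cache_write_tokens: 200)

p old_style.cache_read_tokens
p new_style.cache_read_tokens
```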

data/lib/axn/ruby_llm/version.rb CHANGED

@@ -2,6 +2,6 @@
 module Axn
   module RubyLLM
-    VERSION = "0.1.1"
+    VERSION = "0.1.2"
   end
 end

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: axn-ruby_llm
 version: !ruby/object:Gem::Version
-  version: 0.1.1
+  version: 0.1.2
 platform: ruby
 authors:
 - Kali Donovan
@@ -35,7 +35,7 @@ dependencies:
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: '1.0'
+        version: '1.15'
     - - "<"
       - !ruby/object:Gem::Version
         version: '2.0'
@@ -45,7 +45,7 @@ dependencies:
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: '1.0'
+        version: '1.15'
     - - "<"
       - !ruby/object:Gem::Version
         version: '2.0'
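
The effect of raising the lower bound from `1.0` to `1.15` can be checked with RubyGems' own `Gem::Requirement`. The version strings below are arbitrary illustrations chosen to exercise the boundaries, not a list of actual releases.

```ruby
require "rubygems"

# The constraint as it appears in the 0.1.2 metadata.
requirement = Gem::Requirement.new(">= 1.15", "< 2.0")

# Illustrative version strings only; segments compare numerically,
# so 1.9.0 sorts below 1.15 and is rejected.
%w[1.9.0 1.14.2 1.15.0 1.20.1 2.0.0].each do |v|
  ok = requirement.satisfied_by?(Gem::Version.new(v))
  puts "#{v}: #{ok ? 'allowed' : 'rejected'}"
end
```

Projects pinned to a dependency version below 1.15 would therefore need to upgrade it before Bundler can resolve axn-ruby_llm 0.1.2.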