RubyGems - llmemory - Versions diffs - 0.2.2 → 0.2.4 - Mend

llmemory 0.2.2 → 0.2.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (48)

checksums.yaml +4 -4
data/README.md +65 -1
data/lib/llmemory/cli/commands/stats.rb +5 -0
data/lib/llmemory/configuration.rb +22 -2
data/lib/llmemory/crypto/cipher.rb +147 -0
data/lib/llmemory/crypto/field_helpers.rb +110 -0
data/lib/llmemory/instrumentation.rb +4 -2
data/lib/llmemory/llm/anthropic.rb +10 -4
data/lib/llmemory/llm/base.rb +42 -0
data/lib/llmemory/llm/openai.rb +29 -13
data/lib/llmemory/llm/response.rb +18 -0
data/lib/llmemory/llm/tracking_client.rb +61 -0
data/lib/llmemory/llm/usage.rb +31 -0
data/lib/llmemory/llm/usage_ledger.rb +118 -0
data/lib/llmemory/llm/usage_recorder.rb +37 -0
data/lib/llmemory/llm.rb +5 -0
data/lib/llmemory/long_term/episodic/memory.rb +16 -4
data/lib/llmemory/long_term/episodic/storage.rb +11 -4
data/lib/llmemory/long_term/episodic/storages/active_record_storage.rb +19 -6
data/lib/llmemory/long_term/episodic/storages/database_storage.rb +25 -3
data/lib/llmemory/long_term/episodic/storages/file_storage.rb +22 -5
data/lib/llmemory/long_term/file_based/storage.rb +11 -4
data/lib/llmemory/long_term/file_based/storages/active_record_storage.rb +16 -10
data/lib/llmemory/long_term/file_based/storages/database_storage.rb +24 -8
data/lib/llmemory/long_term/file_based/storages/file_storage.rb +28 -14
data/lib/llmemory/long_term/graph_based/memory.rb +17 -3
data/lib/llmemory/long_term/graph_based/storage.rb +3 -2
data/lib/llmemory/long_term/graph_based/storages/active_record_storage.rb +47 -21
data/lib/llmemory/long_term/procedural/memory.rb +16 -4
data/lib/llmemory/long_term/procedural/storage.rb +11 -4
data/lib/llmemory/long_term/procedural/storages/active_record_storage.rb +33 -13
data/lib/llmemory/long_term/procedural/storages/database_storage.rb +25 -4
data/lib/llmemory/long_term/procedural/storages/file_storage.rb +23 -6
data/lib/llmemory/mcp/tools/memory_stats.rb +13 -0
data/lib/llmemory/memory.rb +66 -15
data/lib/llmemory/short_term/checkpoint.rb +5 -2
data/lib/llmemory/short_term/stores/active_record_store.rb +12 -10
data/lib/llmemory/short_term/stores/memory_store.rb +1 -1
data/lib/llmemory/short_term/stores/postgres_store.rb +11 -5
data/lib/llmemory/short_term/stores/redis_store.rb +7 -5
data/lib/llmemory/short_term/stores.rb +7 -6
data/lib/llmemory/vector_store/active_record_store.rb +30 -3
data/lib/llmemory/vector_store/memory_store.rb +29 -3
data/lib/llmemory/vector_store/openai_embeddings.rb +23 -2
data/lib/llmemory/vector_store.rb +4 -3
data/lib/llmemory/version.rb +1 -1
data/lib/llmemory.rb +2 -0
metadata +8 -1

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: fdcf202249038554cae18d79da76c261a9c7a80687081126ce985562ef8607ae
-  data.tar.gz: 9b28b0ba29d4444712c2592a808f6b08bbf38f804c031e6b8b246914f7b86699
+  metadata.gz: 296b9d61d6c474145ecaa607653b37438b2491c846aac602f65d5fd850dae9ef
+  data.tar.gz: 521fd05b577c6c17a7dbc5d3771ff9fb3f7cddeaeef31938efabdcfd40db74a5
 SHA512:
-  metadata.gz: 4bddb0f7e9a4bfe6cfd488a341efce98ab4194c36fd6ebcb22b024db338224d0191e3bfb5bbc9638ad43092cac66bb29d159e87d81b97f6d17c8ee82b400c716
-  data.tar.gz: 8e5fb5edddabce0b1b57903282bb868a076d1595c4d187c973ccb7a7fa61f11847c9889e185459438423553aee0bf52f216e3bbdf0d9ac870f9f8196c230c1f1
+  metadata.gz: edf6ee6b41cb366f2ddef5ac2deec9c3c03090d4920d2386f9578e31999d135b34ad7f995ee9a9e803cf43627f969a3fab9ee191c68e2362a4d62fa8bade0729
+  data.tar.gz: 83266918faeb2bb4f7b57e89007b642283d7d64600bf83513948d5378a625fa911aefa6823c54115961bafe34dfdd147f4fb297a8226caad49aaceb6a01372e0

data/README.md CHANGED

@@ -51,6 +51,39 @@ memory.clear_session!
 - **`prune!(mode: nil)`** — Prunes oversized tool results (soft-trim or hard-clear). Only when `prune_tool_results_enabled` is true.
 - **`check_context_window!`** — Triggers consolidate and compact when context exceeds configured thresholds.
 - **`clear_session!`** — Clears short-term only.
+- **`llm_usage`** — Returns cumulative LLM token usage for this `user_id` (chat/completions + embeddings), persisted in the short-term store.
+## LLM token usage
+llmemory captures **real token counts** from OpenAI and Anthropic API responses (chat and embeddings), accumulates them per `user_id`, and exposes them for cost monitoring.
+```ruby
+memory = Llmemory::Memory.new(user_id: "user_123")
+memory.consolidate!
+memory.maintain!
+usage = memory.llm_usage
+# => {
+#      invoke: { input_tokens: 1200, output_tokens: 400, total_tokens: 1600, calls: 3 },
+#      embed:  { total_tokens: 48, calls: 2 },
+#      updated_at: "2026-07-02T12:00:00Z"
+#    }
+```
+| What | Details |
+|------|---------|
+| **Counted** | `consolidate!`, reflection, skill mining, compaction summaries, iterative retrieval, graph/file extraction, OpenAI embeddings (index + search) |
+| **Scope** | Cumulative per `user_id` (not per session); stored under pseudo-session `__llm_usage__` |
+| **Not counted** | `context_tokens` (local byte estimate), retrieval context budget, MCP auth tokens |
+| **Cache** | Embedding cache hits record zero tokens |
+**Other surfaces:**
+- **CLI:** `llmemory stats USER_ID` prints an `LLM TOKEN USAGE` section.
+- **MCP:** `memory_stats` includes the same totals.
+- **Rails metrics:** subscribe to `llm_invoke.llmemory` and `llm_embed.llmemory` (payload includes `input_tokens`, `output_tokens`, `total_tokens`, `response_chars`).
+Dollar cost is not computed — multiply tokens by your model pricing externally. For lower-level access, `Llmemory::LLM::OpenAI#invoke` returns a `Response` with `#content` (via `#to_s`) and `#usage`.
 ## Configuration
@@ -65,6 +98,12 @@ Llmemory.configure do |config|
   config.long_term_store = :memory  # or :file, :postgres, :active_record
   config.long_term_storage_path = "./llmemory_data"  # for :file
   config.database_url = ENV["DATABASE_URL"]          # for :postgres
+  # Optional encryption at rest (AES-256-GCM). Requires a key; isolates data
+  # cryptographically per key (e.g. per agent/user). See "Encryption at rest".
+  config.encryption_enabled = false
+  config.encryption_key = ENV["LLMEMORY_ENCRYPTION_KEY"]
   config.time_decay_half_life_days = 30
   config.max_retrieval_tokens = 2000
   config.prune_after_days = 90
@@ -112,6 +151,31 @@ Llmemory.configure do |config|
 end
 ```
+## Encryption at rest
+Optional AES-256-GCM encryption protects persisted memory. Without the key, stored data is unreadable — useful for isolating agents or tenants.
+```ruby
+# Global default key (applies to all Memory instances)
+Llmemory.configure do |config|
+  config.encryption_enabled = true
+  config.encryption_key = ENV["LLMEMORY_ENCRYPTION_KEY"]
+end
+memory = Llmemory::Memory.new(user_id: "agent-1")
+# Per-instance key override (isolates this agent even if global config differs)
+memory = Llmemory::Memory.new(user_id: "agent-1", encryption_key: "tenant-specific-secret")
+```
+**What is encrypted:** conversation checkpoints (redis/postgres/active_record), file-based facts/resources/categories, episodic/procedural documents, graph node names/types/predicates (deterministic) and properties (random IV). **Vector embeddings are not encrypted** (required for pgvector search); associated `text_content` metadata is encrypted.
+**Trade-offs:**
+- Database keyword search (`LIKE`, BM25 on encrypted columns) no longer works on ciphertext; file backends still search in memory after decrypt.
+- `:memory` backends are in-process only and are **not** encrypted at rest.
+- Existing plaintext data remains readable (markers `enc:v1:` / `encd:v1:`); new writes are encrypted when enabled.
+- Deterministic encryption on graph identifiers leaks equality (same name ⇒ same ciphertext) but keeps graph traversal working.
 ## Long-Term Storage
 Long-term memory can use different backends:
@@ -654,7 +718,7 @@ MCP_TOKEN=your-secret-token llmemory mcp serve --http --port 443 \
 | `memory_timeline_context` | Get N items before/after a specific memory |
 | `memory_add_message` | Add message to short-term conversation (roles: user, assistant, system, tool, tool_result) |
 | `memory_consolidate` | Extract facts from conversation to long-term |
-| `memory_stats` | Get memory statistics for a user |
+| `memory_stats` | Get memory statistics for a user (includes LLM token usage) |
 | `memory_info` | Documentation on how to use the tools |
 | `memory_episode_record` / `memory_episodes` | Record / list episodic trajectories |
 | `memory_skill_register` / `memory_skill_report` / `memory_skills` | Register / outcome-track / list procedural skills |
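
The README change above states that dollar cost is not computed and should be derived externally from token counts. A minimal sketch of that step, assuming the hash shape shown in the README's `memory.llm_usage` example. The `PRICES_PER_MILLION` values and the `estimate_cost` helper are hypothetical and are not part of llmemory:

```ruby
# Hypothetical per-million-token prices in USD; substitute your provider's real rates.
PRICES_PER_MILLION = {
  input: 3.00,
  output: 15.00,
  embed: 0.02
}.freeze

# Estimates a dollar cost from a hash shaped like the README's llm_usage example.
# estimate_cost is an illustrative helper, not an llmemory API.
def estimate_cost(usage, prices = PRICES_PER_MILLION)
  invoke = usage.fetch(:invoke, {})
  embed = usage.fetch(:embed, {})
  # tokens * (USD per million tokens) gives millionths of a dollar
  micro_dollars =
    invoke.fetch(:input_tokens, 0) * prices[:input] +
    invoke.fetch(:output_tokens, 0) * prices[:output] +
    embed.fetch(:total_tokens, 0) * prices[:embed]
  (micro_dollars / 1_000_000.0).round(6)
end

usage = {
  invoke: { input_tokens: 1200, output_tokens: 400, total_tokens: 1600, calls: 3 },
  embed: { total_tokens: 48, calls: 2 },
  updated_at: "2026-07-02T12:00:00Z"
}
puts estimate_cost(usage)
```

Because the ledger is cumulative per `user_id`, a per-period cost would need two snapshots and a subtraction; the sketch only prices a single snapshot.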

data/lib/llmemory/cli/commands/stats.rb CHANGED

@@ -41,6 +41,11 @@ module Llmemory
             puts "Long-term (file) categories: #{storage.list_categories(user_id).size}"
             puts "Long-term (file) resources: #{storage.list_resources(user_id: user_id).size}"
           end
+          puts "---"
+          puts Llmemory::LLM::UsageLedger.format_text(
+            Llmemory::LLM::UsageLedger.new(store: short_store).totals(user_id)
+          )
         end
         def print_global_stats(short_store, long_type)

data/lib/llmemory/configuration.rb CHANGED

@@ -48,12 +48,14 @@ module Llmemory
                   :message_sanitizer_enabled,
                   :ttl_episodic_days,
                   :ttl_procedural_days,
-                  :skill_mining_enabled
+                  :skill_mining_enabled,
+                  :encryption_enabled,
+                  :encryption_key
     def initialize
       @llm_provider = :openai
       @llm_api_key = ENV["OPENAI_API_KEY"]
-      @llm_model = "gpt-4"
+      @llm_model = nil # falls back to the active provider's DEFAULT_MODEL
       @llm_base_url = nil
       @short_term_store = :memory
       @redis_url = ENV["REDIS_URL"] || "redis://localhost:6379/0"
@@ -98,6 +100,8 @@ module Llmemory
       @embedding_cache_max_entries = 10_000
       @max_message_chars = 32_000
       @message_sanitizer_enabled = false
+      @encryption_enabled = false
+      @encryption_key = ENV["LLMEMORY_ENCRYPTION_KEY"]
     end
   end
@@ -113,5 +117,21 @@ module Llmemory
     def reset_configuration!
       @configuration = Configuration.new
     end
+    # Builds a Crypto::Cipher when encryption is enabled and a key is present;
+    # otherwise returns Crypto::NullCipher. An explicit non-empty instance key
+    # enables encryption even when the global flag is off.
+    def build_cipher(key = nil)
+      explicit_key = !key.nil? && !key.to_s.empty?
+      resolved = key.nil? ? configuration.encryption_key : key
+      enabled = configuration.encryption_enabled || explicit_key
+      if enabled && !resolved.to_s.empty?
+        require_relative "crypto/cipher"
+        Crypto::Cipher.new(resolved)
+      else
+        require_relative "crypto/cipher"
+        Crypto::NullCipher.new
+      end
+    end
   end
 end
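
The branch logic of `build_cipher` above is easier to follow as a small truth table. The sketch below mirrors those conditions but returns a symbol instead of a cipher object; `cipher_kind` and its keyword arguments are hypothetical names used only for illustration:

```ruby
# Mirrors the conditions in Llmemory.build_cipher from the diff above.
# cipher_kind is a hypothetical helper, not part of llmemory.
def cipher_kind(global_enabled:, global_key:, instance_key: nil)
  explicit_key = !instance_key.nil? && !instance_key.to_s.empty?
  # Only a nil instance key falls back to the global key.
  resolved = instance_key.nil? ? global_key : instance_key
  enabled = global_enabled || explicit_key
  enabled && !resolved.to_s.empty? ? :cipher : :null_cipher
end

cases = {
  flag_off_no_instance_key: cipher_kind(global_enabled: false, global_key: "g"),
  flag_off_with_instance_key: cipher_kind(global_enabled: false, global_key: nil, instance_key: "tenant"),
  flag_on_with_global_key: cipher_kind(global_enabled: true, global_key: "g"),
  flag_on_without_any_key: cipher_kind(global_enabled: true, global_key: nil),
  flag_on_empty_instance_key: cipher_kind(global_enabled: true, global_key: "g", instance_key: "")
}
cases.each { |name, kind| puts "#{name}: #{kind}" }
```

One consequence visible in the last case: an empty (non-nil) instance key is used as-is rather than replaced by the global key, so under this logic the result is the no-op cipher even though the global flag and key are set.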

data/lib/llmemory/crypto/cipher.rb ADDED

@@ -0,0 +1,147 @@
+# frozen_string_literal: true
+require "openssl"
+require "json"
+module Llmemory
+  module Crypto
+    class DecryptionError < Llmemory::Error; end
+    # No-op cipher when encryption is disabled or no key is configured.
+    class NullCipher
+      def enabled?
+        false
+      end
+      def encrypt(str)
+        str.to_s
+      end
+      def encrypt_deterministic(str)
+        str.to_s
+      end
+      def decrypt(str)
+        str.to_s
+      end
+      def encrypt_json(obj)
+        JSON.generate(obj)
+      end
+      def decrypt_json(str)
+        JSON.parse(str.to_s, symbolize_names: true)
+      end
+      def encrypted?(str)
+        false
+      end
+    end
+    # AES-256-GCM encryption with separate content (random IV) and index
+    # (deterministic IV) subkeys derived from the master key via HMAC-SHA256.
+    class Cipher
+      MARKER = "enc:v1:"
+      DETERMINISTIC_MARKER = "encd:v1:"
+      IV_LENGTH = 12
+      TAG_LENGTH = 16
+      def initialize(key)
+        @master_key = derive_master_key(key)
+        @content_key = derive_subkey("content")
+        @index_key = derive_subkey("index")
+      end
+      def enabled?
+        true
+      end
+      def encrypt(plaintext)
+        str = plaintext.to_s
+        return str if str.empty?
+        encrypt_with_key(str, @content_key, iv: OpenSSL::Random.random_bytes(IV_LENGTH), marker: MARKER)
+      end
+      def encrypt_deterministic(plaintext)
+        str = plaintext.to_s
+        return str if str.empty?
+        iv = OpenSSL::HMAC.digest("SHA256", @index_key, str)[0, IV_LENGTH]
+        encrypt_with_key(str, @index_key, iv: iv, marker: DETERMINISTIC_MARKER)
+      end
+      def decrypt(ciphertext)
+        str = ciphertext.to_s
+        return str if str.empty?
+        return str unless encrypted?(str)
+        marker, key = if str.start_with?(DETERMINISTIC_MARKER)
+          [DETERMINISTIC_MARKER, @index_key]
+        else
+          [MARKER, @content_key]
+        end
+        payload = decode64(str.delete_prefix(marker))
+        iv = payload[0, IV_LENGTH]
+        tag = payload[IV_LENGTH, TAG_LENGTH]
+        ct = payload[(IV_LENGTH + TAG_LENGTH)..]
+        cipher = OpenSSL::Cipher.new("aes-256-gcm")
+        cipher.decrypt
+        cipher.key = key
+        cipher.iv = iv
+        cipher.auth_tag = tag
+        cipher.auth_data = ""
+        cipher.update(ct) + cipher.final
+      rescue OpenSSL::Cipher::CipherError, ArgumentError => e
+        raise DecryptionError, "Failed to decrypt data: #{e.message}"
+      end
+      def encrypt_json(obj)
+        encrypt(JSON.generate(obj))
+      end
+      def decrypt_json(str)
+        JSON.parse(decrypt(str), symbolize_names: true)
+      end
+      def encrypted?(str)
+        s = str.to_s
+        s.start_with?(MARKER) || s.start_with?(DETERMINISTIC_MARKER)
+      end
+      private
+      def encrypt_with_key(plaintext, key, iv:, marker:)
+        cipher = OpenSSL::Cipher.new("aes-256-gcm")
+        cipher.encrypt
+        cipher.key = key
+        cipher.iv = iv
+        cipher.auth_data = ""
+        ct = cipher.update(plaintext) + cipher.final
+        tag = cipher.auth_tag
+        marker + encode64(iv + tag + ct)
+      end
+      def encode64(bin)
+        [bin].pack("m0")
+      end
+      def decode64(str)
+        str.unpack1("m0")
+      end
+      def derive_master_key(key)
+        raw = key.to_s
+        raise ConfigurationError, "encryption_key cannot be empty when encryption is enabled" if raw.empty?
+        OpenSSL::Digest::SHA256.digest(raw)
+      end
+      def derive_subkey(label)
+        OpenSSL::HMAC.digest("SHA256", @master_key, "llmemory:#{label}")[0, 32]
+      end
+    end
+  end
+end
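
To make the random-IV versus deterministic-IV distinction in `cipher.rb` concrete, here is a standalone sketch that uses only Ruby's `openssl` standard library. It follows the same layout as the diff (Base64 of `iv + tag + ciphertext`) but omits the `enc:v1:` markers; `seal`, `open_sealed`, and the passphrase are hypothetical names chosen for this example:

```ruby
require "openssl"

IV_LENGTH = 12
TAG_LENGTH = 16

# Encrypts with AES-256-GCM and packs iv + tag + ciphertext as strict Base64.
# plaintext must be non-empty (OpenSSL rejects empty input to #update).
def seal(plaintext, key, iv)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.encrypt
  cipher.key = key
  cipher.iv = iv
  cipher.auth_data = ""
  ct = cipher.update(plaintext) + cipher.final
  [iv + cipher.auth_tag + ct].pack("m0")
end

# Reverses seal; raises OpenSSL::Cipher::CipherError if the tag does not verify.
def open_sealed(encoded, key)
  payload = encoded.unpack1("m0")
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.decrypt
  cipher.key = key
  cipher.iv = payload[0, IV_LENGTH]
  cipher.auth_tag = payload[IV_LENGTH, TAG_LENGTH]
  cipher.auth_data = ""
  plain = cipher.update(payload[(IV_LENGTH + TAG_LENGTH)..]) + cipher.final
  plain.force_encoding("UTF-8") # OpenSSL returns binary strings
end

# Key derivation as in the diff: SHA-256 of the secret, then HMAC-labelled subkeys.
master = OpenSSL::Digest::SHA256.digest("example-passphrase") # hypothetical secret
content_key = OpenSSL::HMAC.digest("SHA256", master, "llmemory:content")
index_key = OpenSSL::HMAC.digest("SHA256", master, "llmemory:index")

# Random IV: two encryptions of the same text are expected to differ.
a = seal("Alice", content_key, OpenSSL::Random.random_bytes(IV_LENGTH))
b = seal("Alice", content_key, OpenSSL::Random.random_bytes(IV_LENGTH))

# Deterministic IV: derived from the plaintext, so equal inputs give equal outputs.
det_iv = OpenSSL::HMAC.digest("SHA256", index_key, "Alice")[0, IV_LENGTH]
c = seal("Alice", index_key, det_iv)
d = seal("Alice", index_key, det_iv)

puts a == b            # random-IV ciphertexts compared
puts c == d            # deterministic ciphertexts compared
puts open_sealed(c, index_key)
```

The deterministic variant is what lets an encrypted graph node name be looked up by equality, at the cost of revealing which stored values are identical, as the README trade-offs note.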

data/lib/llmemory/crypto/field_helpers.rb ADDED

@@ -0,0 +1,110 @@
+# frozen_string_literal: true
+require "json"
+module Llmemory
+  module Crypto
+    # Shared encrypt/decrypt helpers for storage backends.
+    module FieldHelpers
+      private
+      def cipher
+        @cipher || Llmemory.build_cipher
+      end
+      def enc(str)
+        return str if str.nil?
+        return str.to_s unless cipher.enabled?
+        cipher.encrypt(str.to_s)
+      end
+      def dec(str)
+        return str if str.nil?
+        return str unless str.is_a?(String) && cipher.encrypted?(str)
+        cipher.decrypt(str)
+      end
+      def enc_det(str)
+        return str if str.nil?
+        return str.to_s unless cipher.enabled?
+        cipher.encrypt_deterministic(str.to_s)
+      end
+      def enc_json(obj)
+        return obj if obj.nil?
+        return obj unless cipher.enabled?
+        cipher.encrypt_json(obj)
+      end
+      def dec_json(value)
+        return value if value.nil?
+        return value.transform_keys(&:to_sym) if value.is_a?(Hash)
+        return value unless value.is_a?(String) && cipher.encrypted?(value)
+        cipher.decrypt_json(value)
+      end
+      def write_encrypted_file(path, data)
+        payload = JSON.generate(data)
+        File.write(path, cipher.enabled? ? cipher.encrypt(payload) : payload)
+      end
+      def read_encrypted_file(path)
+        raw = File.read(path)
+        json = cipher.enabled? && cipher.encrypted?(raw) ? cipher.decrypt(raw) : raw
+        JSON.parse(json, symbolize_names: true)
+      end
+      def write_encrypted_text_file(path, content, append: false)
+        text = content.to_s
+        if cipher.enabled?
+          if append && File.file?(path)
+            existing = read_encrypted_text_file(path)
+            text = existing + text
+          end
+          File.write(path, cipher.encrypt(text))
+        elsif append && File.file?(path)
+          File.write(path, File.read(path) + text)
+        else
+          File.write(path, text)
+        end
+      end
+      def read_encrypted_text_file(path)
+        raw = File.read(path)
+        cipher.enabled? && cipher.encrypted?(raw) ? cipher.decrypt(raw) : raw
+      end
+      def serialize_state(state)
+        json = JSON.generate(state)
+        return json unless cipher.enabled?
+        cipher.encrypt(json)
+      end
+      def deserialize_state(data)
+        if data.is_a?(Hash)
+          return data.transform_keys(&:to_sym)
+        end
+        str = data.to_s
+        json = cipher.enabled? && cipher.encrypted?(str) ? cipher.decrypt(str) : str
+        JSON.parse(json, symbolize_names: true)
+      end
+      def parse_provenance(value)
+        return nil if value.nil?
+        return value.transform_keys(&:to_sym) if value.is_a?(Hash)
+        return dec_json(value) if value.is_a?(String) && cipher.encrypted?(value)
+        JSON.parse(value.to_s, symbolize_names: true)
+      rescue JSON::ParserError
+        nil
+      end
+    end
+  end
+end
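
Two behaviours of the file helpers above are worth seeing in isolation: plaintext written before encryption was enabled is still readable (no marker means no decryption), and an encrypted append rewrites the whole file because ciphertext cannot be extended in place. The sketch below is a simplified illustration. `ToyCipher` is a hypothetical stand-in that only Base64-encodes behind a marker and provides no confidentiality; `read_text`, `append_text`, and `demo_append` are also illustrative names:

```ruby
require "tmpdir"

# Stand-in for Crypto::Cipher: Base64 plus a marker. NOT encryption; it only
# makes stored bytes opaque so the read/append flow is visible.
class ToyCipher
  MARKER = "enc:v1:"

  def encrypted?(str)
    str.to_s.start_with?(MARKER)
  end

  def encrypt(str)
    MARKER + [str.to_s].pack("m0")
  end

  def decrypt(str)
    str.delete_prefix(MARKER).unpack1("m0").force_encoding("UTF-8")
  end
end

# Marked content is decrypted; unmarked (legacy plaintext) content passes through.
def read_text(path, cipher)
  raw = File.read(path)
  cipher.encrypted?(raw) ? cipher.decrypt(raw) : raw
end

# Append = read everything, concatenate, re-encrypt, overwrite.
def append_text(path, text, cipher)
  existing = File.file?(path) ? read_text(path, cipher) : ""
  File.write(path, cipher.encrypt(existing + text))
end

def demo_append
  Dir.mktmpdir do |dir|
    path = File.join(dir, "notes.md")
    File.write(path, "legacy line\n") # plaintext from before encryption was enabled
    append_text(path, "new line\n", ToyCipher.new)
    [File.read(path).start_with?(ToyCipher::MARKER), read_text(path, ToyCipher.new)]
  end
end

marked, text = demo_append
puts marked
puts text
```

The first append after enabling encryption therefore also migrates the legacy content, and each later append costs time proportional to the file size.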

data/lib/llmemory/instrumentation.rb CHANGED

@@ -10,8 +10,10 @@ module Llmemory
   # Events (payload keys are best-effort; subscribers should treat them as
   # optional):
   #
-  #   llm_invoke.llmemory       provider:, model:, prompt_chars:, response_chars:
-  #   llm_embed.llmemory        provider:, model:, text_chars:, dimensions:
+  #   llm_invoke.llmemory       provider:, model:, prompt_chars:, response_chars:,
+  #                             input_tokens:, output_tokens:, total_tokens:
+  #   llm_embed.llmemory        provider:, model:, text_chars:, input_tokens:,
+  #                             output_tokens:, total_tokens:
   #   memory_write.llmemory     memory_type:, user_id:
   #   memory_forget.llmemory    memory_type:, user_id:, count:
   #   retrieve.llmemory         query_chars:, candidates:, results:

data/lib/llmemory/llm/anthropic.rb CHANGED

@@ -8,16 +8,19 @@ module Llmemory
   module LLM
     class Anthropic < Base
       DEFAULT_BASE_URL = "https://api.anthropic.com"
+      DEFAULT_MODEL = "claude-sonnet-4-6"
       def initialize(api_key: nil, model: nil, base_url: nil)
+        super()
         @api_key = api_key || config.llm_api_key || ENV["ANTHROPIC_API_KEY"]
-        @model = model || config.llm_model || "claude-3-sonnet-20240229"
+        @model = model || config.llm_model || DEFAULT_MODEL
         @base_url = base_url || config.llm_base_url || DEFAULT_BASE_URL
       end
       def invoke(prompt)
         result = nil
-        Llmemory::Instrumentation.instrument(:llm_invoke, provider: :anthropic, model: @model, prompt_chars: prompt.to_s.length) do
+        payload = { provider: :anthropic, model: @model, prompt_chars: prompt.to_s.length }
+        Llmemory::Instrumentation.instrument(:llm_invoke, payload) do
           response = connection.post("v1/messages") do |req|
             req.body = {
               model: @model,
@@ -32,8 +35,11 @@ module Llmemory
           raise Llmemory::LLMError, "Anthropic API error: #{response.body}" unless response.success?
           body = response.body.is_a?(Hash) ? response.body : JSON.parse(response.body.to_s)
-          content = body.dig("content", 0, "text")
-          result = content&.strip || ""
+          content = body.dig("content", 0, "text")&.strip || ""
+          usage = parse_anthropic_usage(body["usage"])
+          record_usage(usage)
+          payload.merge!(instrumentation_payload(usage, content))
+          result = Response.new(content, usage: usage)
         end
         result
       end

data/lib/llmemory/llm/base.rb CHANGED

@@ -1,8 +1,17 @@
 # frozen_string_literal: true
+require_relative "usage"
+require_relative "response"
 module Llmemory
   module LLM
     class Base
+      attr_reader :last_usage
+      def initialize(*)
+        @last_usage = Usage.zero
+      end
       def invoke(prompt)
         raise NotImplementedError, "#{self.class}#invoke must be implemented"
       end
@@ -18,6 +27,39 @@ module Llmemory
       def config
         Llmemory.configuration
       end
+      def parse_openai_chat_usage(raw)
+        return Usage.zero unless raw.is_a?(Hash)
+        Usage.new(
+          input_tokens: raw["prompt_tokens"] || raw[:prompt_tokens] || 0,
+          output_tokens: raw["completion_tokens"] || raw[:completion_tokens] || 0,
+          total_tokens: raw["total_tokens"] || raw[:total_tokens]
+        )
+      end
+      def parse_anthropic_usage(raw)
+        return Usage.zero unless raw.is_a?(Hash)
+        input = raw["input_tokens"] || raw[:input_tokens] || 0
+        output = raw["output_tokens"] || raw[:output_tokens] || 0
+        Usage.new(input_tokens: input, output_tokens: output)
+      end
+      def parse_openai_embed_usage(raw)
+        return Usage.zero unless raw.is_a?(Hash)
+        total = raw["total_tokens"] || raw[:total_tokens] || 0
+        Usage.new(input_tokens: 0, output_tokens: 0, total_tokens: total)
+      end
+      def record_usage(usage)
+        @last_usage = usage
+      end
+      def instrumentation_payload(usage, content, extra = {})
+        usage.to_h.merge(response_chars: content.to_s.length).merge(extra)
+      end
     end
   end
 end
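
The three `parse_*_usage` helpers above map provider-specific field names onto one shape. A compact sketch of that normalization follows. `UsageSketch` is a hypothetical stand-in for `Llmemory::LLM::Usage` (whose definition lives in `usage.rb` and is not shown here), and the assumption that a missing total defaults to input plus output is mine, inferred from the Anthropic parser passing no total:

```ruby
# Hypothetical stand-in for Llmemory::LLM::Usage.
UsageSketch = Struct.new(:input_tokens, :output_tokens, :total_tokens, keyword_init: true) do
  # Assumption: when no total is supplied, use input + output.
  def self.from(input_tokens: 0, output_tokens: 0, total_tokens: nil)
    new(input_tokens: input_tokens,
        output_tokens: output_tokens,
        total_tokens: total_tokens || input_tokens + output_tokens)
  end
end

# Accepts string or symbol keys, as the parsers in the diff do.
def fetch_count(raw, key)
  raw[key.to_s] || raw[key.to_sym] || 0
end

def normalize_usage(provider, raw)
  return UsageSketch.from unless raw.is_a?(Hash)

  case provider
  when :openai
    UsageSketch.from(input_tokens: fetch_count(raw, :prompt_tokens),
                     output_tokens: fetch_count(raw, :completion_tokens),
                     total_tokens: raw["total_tokens"] || raw[:total_tokens])
  when :anthropic
    UsageSketch.from(input_tokens: fetch_count(raw, :input_tokens),
                     output_tokens: fetch_count(raw, :output_tokens))
  else
    raise ArgumentError, "unknown provider: #{provider.inspect}"
  end
end

puts normalize_usage(:openai, { "prompt_tokens" => 10, "completion_tokens" => 5, "total_tokens" => 15 }).to_h
puts normalize_usage(:anthropic, { input_tokens: 7, output_tokens: 3 }).to_h
```

Returning a zero-valued object for a missing or malformed `usage` field, rather than raising, keeps token accounting from breaking an otherwise successful LLM call.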

data/lib/llmemory/llm/openai.rb CHANGED

@@ -8,16 +8,19 @@ module Llmemory
   module LLM
     class OpenAI < Base
       DEFAULT_BASE_URL = "https://api.openai.com/v1"
+      DEFAULT_MODEL = "gpt-4"
       def initialize(api_key: nil, model: nil, base_url: nil)
+        super()
         @api_key = api_key || config.llm_api_key
-        @model = model || config.llm_model
+        @model = model || config.llm_model || DEFAULT_MODEL
         @base_url = base_url || config.llm_base_url || DEFAULT_BASE_URL
       end
       def invoke(prompt)
         result = nil
-        Llmemory::Instrumentation.instrument(:llm_invoke, provider: :openai, model: @model, prompt_chars: prompt.to_s.length) do
+        payload = { provider: :openai, model: @model, prompt_chars: prompt.to_s.length }
+        Llmemory::Instrumentation.instrument(:llm_invoke, payload) do
           response = connection.post("chat/completions") do |req|
             req.body = {
               model: @model,
@@ -31,7 +34,11 @@ module Llmemory
           raise Llmemory::LLMError, "OpenAI API error: #{response.body}" unless response.success?
           body = response.body.is_a?(Hash) ? response.body : JSON.parse(response.body.to_s)
-          result = body.dig("choices", 0, "message", "content")&.strip || ""
+          content = body.dig("choices", 0, "message", "content")&.strip || ""
+          usage = parse_openai_chat_usage(body["usage"])
+          record_usage(usage)
+          payload.merge!(instrumentation_payload(usage, content))
+          result = Response.new(content, usage: usage)
         end
         result
       end
@@ -53,18 +60,27 @@ module Llmemory
             }
           }
         }
-        response = connection.post("chat/completions") do |req|
-          req.body = payload.to_json
-          req.headers["Content-Type"] = "application/json"
-          req.headers["Authorization"] = "Bearer #{@api_key}"
-        end
+        parsed = nil
+        instrument_payload = { provider: :openai, model: @model, prompt_chars: prompt.to_s.length }
+        Llmemory::Instrumentation.instrument(:llm_invoke, instrument_payload) do
+          response = connection.post("chat/completions") do |req|
+            req.body = payload.to_json
+            req.headers["Content-Type"] = "application/json"
+            req.headers["Authorization"] = "Bearer #{@api_key}"
+          end
-        raise Llmemory::LLMError, "OpenAI API error: #{response.body}" unless response.success?
+          raise Llmemory::LLMError, "OpenAI API error: #{response.body}" unless response.success?
+          body = response.body.is_a?(Hash) ? response.body : JSON.parse(response.body.to_s)
+          content = body.dig("choices", 0, "message", "content")&.strip
+          usage = parse_openai_chat_usage(body["usage"])
+          record_usage(usage)
+          instrument_payload.merge!(instrumentation_payload(usage, content.to_s))
+          return {} if content.nil? || content.empty?
-        body = response.body.is_a?(Hash) ? response.body : JSON.parse(response.body.to_s)
-        content = body.dig("choices", 0, "message", "content")&.strip
-        return {} if content.nil? || content.empty?
-        JSON.parse(content)
+          parsed = JSON.parse(content)
+        end
+        parsed
       rescue JSON::ParserError => e
         raise Llmemory::LLMError, "Failed to parse JSON response: #{e.message}"
       end

data/lib/llmemory/llm/response.rb ADDED

@@ -0,0 +1,18 @@
+# frozen_string_literal: true
+module Llmemory
+  module LLM
+    class Response
+      attr_reader :content, :usage
+      def initialize(content, usage: Usage.zero)
+        @content = content.to_s
+        @usage = usage
+      end
+      def to_s
+        @content
+      end
+    end
+  end
+end
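
Since `invoke` now returns a `Response` rather than a `String`, callers that relied on string behaviour may need an explicit `to_s`. The sketch below copies the shape of the class above into a hypothetical `ResponseSketch` (without the `Usage` default, so it runs without llmemory loaded) to show what does and does not carry over:

```ruby
# Same shape as the Response class in the diff, renamed to avoid implying
# this is the library's class.
class ResponseSketch
  attr_reader :content, :usage

  def initialize(content, usage: nil)
    @content = content.to_s
    @usage = usage
  end

  def to_s
    @content
  end
end

reply = ResponseSketch.new("hello", usage: { total_tokens: 3 })

puts "got: #{reply}"            # interpolation calls #to_s
puts reply.respond_to?(:strip)  # String methods are not defined on the wrapper
puts reply == "hello"           # not equal to a String; compare reply.to_s instead
puts reply.to_s.upcase
```

Code that interpolates or prints the result keeps working, while code that calls `String` methods or compares with `==` would need `reply.to_s` or `reply.content` first.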

data/lib/llmemory/llm/tracking_client.rb ADDED

@@ -0,0 +1,61 @@
+# frozen_string_literal: true
+require_relative "usage_recorder"
+module Llmemory
+  module LLM
+    # Transparent wrapper that records token usage to the per-user ledger.
+    class TrackingClient
+      def initialize(inner, user_id:, store: nil, api_key: nil)
+        @inner = inner
+        @user_id = user_id
+        @store = store
+        @api_key = api_key
+      end
+      def invoke(prompt)
+        response = inner_client.invoke(prompt)
+        usage = if response.respond_to?(:usage)
+                  response.usage
+                elsif inner_client.respond_to?(:last_usage)
+                  inner_client.last_usage
+                else
+                  Usage.zero
+                end
+        UsageRecorder.record(user_id: @user_id, usage: usage, operation: :invoke, store: @store)
+        response
+      end
+      def invoke_with_json_schema(prompt, json_schema)
+        result = inner_client.invoke_with_json_schema(prompt, json_schema)
+        usage = inner_client.respond_to?(:last_usage) ? inner_client.last_usage : Usage.zero
+        UsageRecorder.record(user_id: @user_id, usage: usage, operation: :invoke, store: @store)
+        result
+      end
+      def last_usage
+        return inner_client.last_usage if inner_client.respond_to?(:last_usage)
+        Usage.zero
+      end
+      def respond_to?(method, include_private = false)
+        inner_client.respond_to?(method, include_private) || super
+      end
+      def method_missing(method, *args, &block)
+        if inner_client.respond_to?(method)
+          inner_client.public_send(method, *args, &block)
+        else
+          super
+        end
+      end
+      private
+      def inner_client
+        @inner_client ||= @inner || Llmemory::LLM.client(api_key: @api_key)
+      end
+    end
+  end
+end
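
The wrapper above combines two ideas: intercept `invoke` to record usage, and forward everything else to the wrapped client. A self-contained sketch of the same pattern follows. `FakeClient`, `TrackingSketch`, and the plain Hash ledger are hypothetical stand-ins for a real LLM client, `TrackingClient`, and `UsageRecorder`. The sketch uses `respond_to_missing?`, the conventional companion to `method_missing`, whereas the diff overrides `respond_to?` directly:

```ruby
# Hypothetical inner client: counts whitespace-separated words as "tokens".
class FakeClient
  attr_reader :last_usage

  def invoke(prompt)
    @last_usage = { total_tokens: prompt.split.size }
    "echo: #{prompt}"
  end

  def model_name
    "fake-model"
  end
end

# Illustrative tracking wrapper; the ledger is a Hash of user_id => token total.
class TrackingSketch
  def initialize(inner, user_id:, ledger:)
    @inner = inner
    @user_id = user_id
    @ledger = ledger
  end

  def invoke(prompt)
    response = @inner.invoke(prompt)
    # Prefer usage carried on the response; fall back to the client's last_usage.
    usage = response.respond_to?(:usage) ? response.usage : @inner.last_usage
    @ledger[@user_id] = @ledger.fetch(@user_id, 0) + usage[:total_tokens]
    response
  end

  def respond_to_missing?(name, include_private = false)
    @inner.respond_to?(name, include_private) || super
  end

  def method_missing(name, *args, &block)
    return @inner.public_send(name, *args, &block) if @inner.respond_to?(name)

    super
  end
end

ledger = {}
client = TrackingSketch.new(FakeClient.new, user_id: "user_123", ledger: ledger)
client.invoke("one two three")
client.invoke("four five")
puts ledger.inspect
puts client.model_name
```

Reading usage from `last_usage` after the call, as the fallback does, assumes the inner client is not shared across threads between the call and the read; taking usage from the returned response avoids that dependency.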