llm_optimizer 0.1.1 → 0.1.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: d382e2ae48971edae81c24fa4e05bbacf9394c04dabed28b0277ca429e75a98d
-  data.tar.gz: d807840237cf09e8b271063660242ae1c460b682425a50389c60f396b086e2c4
+  metadata.gz: 6a0351ff5590228acf939201d0c7eee71e33ee39a0cd20df33e76187c827ab34
+  data.tar.gz: 6bc7df5aa71407be80ecd07104e1dbff9a25a9ecab7a95cb390261f97fa212e8
 SHA512:
-  metadata.gz: 8cac9e17c1f243c17d997e799daf25d886b329c09e83c84d9151f55abbb50d36a7e1b486171e401a645443022bb4de05e4430d0e303e05587dd1b244eda18cbe
-  data.tar.gz: 598b000eabc6a4c0000b3b9bd2162231c619d7618653ca6356948b623f0524db048c7b1f9a8a589905c6b611a10fe6bdc42e7490b432fe8f6047b75dcc35038a
+  metadata.gz: e6822ea254300a957c8aa5953d267695ebc2ae4c1e2fa478492d175534bc990eac9bb5bb39127b8418e7306a3f5de9e8124832f9aeac9f723979e92a68babdec
+  data.tar.gz: 4248569dc2a969518142ac2749b6b6e9dc6defedec71e67eeec2480ade3a1aea05261736a2438c994906c05a1d1a1a7991055d48eab9ac0cc33f1344a88a221c
data/CHANGELOG.md CHANGED
@@ -7,6 +7,17 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+## [0.1.2] - 2026-04-10
+
+### Fixed
+- `SemanticCache` used `pack("f*")` (32-bit) for both the Redis key hash and embedding serialization, causing precision loss on the round trip through MessagePack. Switched to `pack("G*")` / `unpack("G*")` (64-bit big-endian IEEE 754), so self-similarity is now exactly `1.0` and cache lookups work correctly with real embedding providers (Voyage AI, OpenAI, Cohere, etc.)
+- `HistoryManager` summarization failed with `ConfigurationError: No llm_caller configured` when invoked through the gateway pipeline. The internal `raw_llm_call` lambda was missing `config: call_config`, so it could not resolve the user's configured `llm_caller`
+- Updated the mock Redis helper in `test/unit/test_gateway.rb` to use `pack("G*")`, matching the corrected `SemanticCache` key format
+
+### Added
+- `bin/test_semantic_cache.rb`: runnable smoke test for the semantic cache using Voyage AI embeddings + Anthropic Claude
+- `bin/test_history_manager.rb`: runnable smoke test for the history manager sliding window using Anthropic Claude
+
 ## [0.1.1] - 2026-04-10
 
 ### Fixed
@@ -46,5 +57,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `OptimizeResult` struct with `response`, `model`, `model_tier`, `cache_status`, `original_tokens`, `compressed_tokens`, `latency_ms`, `messages`
 - Unit test suite covering all components, with positive and negative scenarios, using Minitest + Mocha
 
-[Unreleased]: https://github.com/arunkumarry/llm_optimizer/compare/v0.1.0...HEAD
+[Unreleased]: https://github.com/arunkumarry/llm_optimizer/compare/v0.1.2...HEAD
+[0.1.2]: https://github.com/arunkumarry/llm_optimizer/compare/v0.1.1...v0.1.2
+[0.1.1]: https://github.com/arunkumarry/llm_optimizer/compare/v0.1.0...v0.1.1
 [0.1.0]: https://github.com/arunkumarry/llm_optimizer/releases/tag/v0.1.0
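The precision fix described above can be reproduced in isolation. A minimal standalone sketch (the embedding vector is made up for illustration) showing why an `"f*"` pack/unpack round trip is lossy while `"G*"` is not:

```ruby
# A full-precision Ruby Float is generally not representable as a 32-bit
# float, so an "f*" pack/unpack round trip changes the value; "G*"
# (64-bit big-endian IEEE 754) round-trips bit-for-bit.
embedding = [0.12345678901234567, -0.9876543210987654, 0.5555555555555556]

lossy = embedding.pack("f*").unpack("f*")
exact = embedding.pack("G*").unpack("G*")

puts lossy == embedding # false: low-order bits were truncated
puts exact == embedding # true: lossless round trip

# Byte cost of the fix: 8 bytes per component instead of 4.
puts embedding.pack("f*").bytesize # 12
puts embedding.pack("G*").bytesize # 24
```

The truncated components also shift the cosine-similarity score during lookup, which is why self-similarity was not exactly `1.0` before the fix.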
@@ -15,7 +15,13 @@ module LlmOptimizer
 
     def store(embedding, response)
       key = cache_key(embedding)
-      payload = MessagePack.pack({ "embedding" => embedding, "response" => response })
+      # Serialize embedding as raw 64-bit big-endian doubles to preserve full
+      # Float precision. MessagePack silently downcasts Ruby Float to 32-bit,
+      # which corrupts cosine similarity on deserialization.
+      payload = MessagePack.pack({
+        "embedding" => embedding.pack("G*"), # binary string, lossless
+        "response" => response
+      })
       @redis.set(key, payload, ex: @ttl)
     rescue ::Redis::BaseError => e
       warn "[llm_optimizer] SemanticCache store failed: #{e.message}"
@@ -33,7 +39,8 @@ module LlmOptimizer
       next unless raw
 
       entry = MessagePack.unpack(raw)
-      stored_embedding = entry["embedding"]
+      # Unpack the binary string back to 64-bit doubles
+      stored_embedding = entry["embedding"].unpack("G*")
       score = cosine_similarity(embedding, stored_embedding)
 
       if score > best_score
@@ -60,7 +67,10 @@ module LlmOptimizer
     private
 
     def cache_key(embedding)
-      KEY_NAMESPACE + Digest::SHA256.hexdigest(embedding.pack("f*"))
+      # Use "G*" (64-bit big-endian double) to match Ruby's native Float precision.
+      # "f*" (32-bit) truncates precision and produces inconsistent hashes for the
+      # same embedding across serialize/deserialize round trips.
+      KEY_NAMESPACE + Digest::SHA256.hexdigest(embedding.pack("G*"))
     end
   end
 end
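The `cache_key` change can be checked independently of Redis: with `"G*"` bytes, an embedding hashes to the same key before and after a serialize/deserialize round trip. A minimal sketch; the namespace string below is assumed for illustration and is not necessarily the gem's actual `KEY_NAMESPACE` value:

```ruby
require "digest"

# Assumed namespace for illustration only.
KEY_NAMESPACE = "llm_optimizer:cache:"

# Same derivation shape as the corrected cache_key: hash the lossless
# 64-bit byte representation of the embedding.
def cache_key(embedding)
  KEY_NAMESPACE + Digest::SHA256.hexdigest(embedding.pack("G*"))
end

query  = [0.12345678901234567, -0.5, 0.25]
stored = query.pack("G*").unpack("G*") # what lookup sees after deserialization

puts cache_key(query) == cache_key(stored) # true: store and lookup agree
```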
@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 
 module LlmOptimizer
-  VERSION = "0.1.1"
+  VERSION = "0.1.2"
 end
data/lib/llm_optimizer.rb CHANGED
@@ -158,7 +158,7 @@ module LlmOptimizer
       # History management
       messages = options[:messages]
       if call_config.manage_history && messages
-        llm_caller = ->(p, model:) { raw_llm_call(p, model: model) }
+        llm_caller = ->(p, model:) { raw_llm_call(p, model: model, config: call_config) }
         history_mgr = HistoryManager.new(
           llm_caller: llm_caller,
           simple_model: call_config.simple_model,
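The one-keyword fix above is a general Ruby pitfall: a wrapper lambda that omits a keyword argument silently falls back to the callee's default, dropping per-call configuration. A minimal sketch of the bug class, with all names (`Config`, `raw_llm_call`, the error message) hypothetical rather than the gem's actual internals:

```ruby
# Hypothetical stand-ins for the gem's config and low-level call.
Config = Struct.new(:llm_caller, keyword_init: true)
DEFAULT_CONFIG = Config.new(llm_caller: nil)

def raw_llm_call(prompt, model:, config: DEFAULT_CONFIG)
  raise "No llm_caller configured" unless config.llm_caller
  config.llm_caller.call(prompt, model: model)
end

call_config = Config.new(llm_caller: ->(p, model:) { "#{model}: #{p}" })

# Buggy wrapper: `config:` is not forwarded, so DEFAULT_CONFIG is used.
buggy = ->(p, model:) { raw_llm_call(p, model: model) }
# Fixed wrapper: forwards the per-call config explicitly.
fixed = ->(p, model:) { raw_llm_call(p, model: model, config: call_config) }

puts fixed.call("hi", model: "haiku") # "haiku: hi"

begin
  buggy.call("hi", model: "haiku")
rescue RuntimeError => e
  puts e.message # "No llm_caller configured"
end
```

Ruby keyword defaults make this failure mode quiet: the call site type-checks and runs, and the misconfiguration only surfaces when the defaulted dependency is actually needed.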
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: llm_optimizer
 version: !ruby/object:Gem::Version
-  version: 0.1.1
+  version: 0.1.2
 platform: ruby
 authors:
 - arun kumar
@@ -79,20 +79,6 @@ dependencies:
   - - "~>"
     - !ruby/object:Gem::Version
       version: '0.65'
-- !ruby/object:Gem::Dependency
-  name: prop_check
-  requirement: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: '1.0'
-  type: :development
-  prerelease: false
-  version_requirements: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: '1.0'
 description: llm_optimizer reduces LLM API costs by up to 80% through semantic caching,
   intelligent model routing, token pruning, and conversation history summarization.
   Strictly opt-in and non-invasive.