reranker-ruby 0.1.0 → 0.1.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 0b72a0d8a77aa8b43a90df33ef1aca6e74f886795cedb9decbc2f412f6d788da
- data.tar.gz: 236483d06f4ca930556ef2bf90a757a7b171cb5175a479cf538086030dbd71a0
+ metadata.gz: bcd9ac5bc1d098f8780a6523e648505dbf0259f4335b07e909898203b9723866
+ data.tar.gz: 12297c4398121fe27b4f0ed148268197f44258dde3b9e5f7efcf305202a5e861
  SHA512:
- metadata.gz: 5bed85be6192b79df97d15732b17d688d596393099e0b8ef70d117f686151fef88a409254726fdd090af0f1f708e208b070ae55867fb2d8c7ca92f328a75b0be
- data.tar.gz: a1ed43eba939509a20439604d0783ebd49942c6b01a4934ca576d91d29aab156a4703b7ebafd00812386f0f80cb4d967c7e8184f429220074d7a8f52241f6d01
+ metadata.gz: 75a6a774a12b00ed332f5f401451a53164f8ef3d8fe23428d70cfc224fa951bf0fd7a655f72178a98194e0578698e9f1de826afcd09b59b2dee2ec2dd0ea494f
+ data.tar.gz: c46089ed7e3ebedc6aea3bd2455bac70cd3b1666b77519a970510a4d4790d2bb3857a5d295d8c35e88358234d39bee5fee8e276fa5b34d8511b47b47a10f87fb
data/README.md CHANGED
@@ -1,18 +1,32 @@
1
1
  # reranker-ruby
2
2
 
3
- Cross-encoder reranking for Ruby RAG pipelines. The single biggest quality improvement you can add to retrieval-augmented generation.
3
+ Cross-encoder reranking for Ruby RAG pipelines.
4
4
 
5
- Vector search finds candidates fast (approximate). Reranking makes them accurate (precise).
5
+ After vector search retrieves candidate documents, a reranker scores each candidate against the query using a cross-encoder model, producing far more accurate relevance rankings than embedding similarity alone. This is the single biggest quality improvement you can add to a RAG pipeline.
6
+
7
+ ```
8
+ Bi-encoder (embedding search): score = cosine(embed(query), embed(doc)) — fast, approximate
9
+ Cross-encoder (reranking): score = model(query + doc) — slow, precise
10
+ ```
11
+
12
+ The pattern: use bi-encoder for top-100 retrieval, then cross-encoder to rerank to top-10.
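
The two-stage pattern described above can be sketched in plain Ruby. This is a toy illustration under stated assumptions, not the gem's implementation: `cosine`, `retrieve`, `rerank_candidates`, and the `cross_score` lambda are hypothetical names, the embeddings are made-up 2-D vectors, and the word-overlap scorer only stands in for a real cross-encoder.

```ruby
# Toy two-stage retrieval: cheap vector similarity first, pairwise rescoring second.
# Every name here is hypothetical; the scorer is a stand-in for a real cross-encoder.

def cosine(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  norm.zero? ? 0.0 : dot / norm
end

# Stage 1: keep the `limit` documents whose embeddings are closest to the query embedding.
def retrieve(query_vec, docs, limit:)
  docs.max_by(limit) { |d| cosine(query_vec, d[:embedding]) }
end

# Stage 2: rescore each surviving candidate against the query text, keep the best `top_k`.
def rerank_candidates(query, candidates, top_k:, scorer:)
  candidates
    .map { |d| d.merge(score: scorer.call(query, d[:text])) }
    .sort_by { |d| -d[:score] }
    .first(top_k)
end

# Stand-in scorer: fraction of query words that also appear in the document.
cross_score = lambda do |query, text|
  q = query.downcase.scan(/\w+/)
  t = text.downcase.scan(/\w+/)
  (q & t).size.to_f / q.size
end

docs = [
  { text: "Paris is the capital of France.", embedding: [0.9, 0.1] },
  { text: "Lyon is a city in France.",       embedding: [0.8, 0.3] },
  { text: "Ruby is a programming language.", embedding: [0.1, 0.9] }
]

candidates = retrieve([1.0, 0.0], docs, limit: 2)
final = rerank_candidates("capital of France", candidates, top_k: 1, scorer: cross_score)
puts final.first[:text]
```

The first stage touches every document but does only vector arithmetic. The second stage runs the expensive scorer on the shortlist alone, which is why the shortlist size (100 in the README's wording) bounds the cost of reranking.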
6
13
 
7
14
  ## Installation
8
15
 
16
+ Add to your Gemfile:
17
+
9
18
  ```ruby
10
19
  gem "reranker-ruby"
11
20
  ```
12
21
 
13
- ## Usage
22
+ For local ONNX inference, also install:
14
23
 
15
- ### Cohere Rerank
24
+ ```ruby
25
+ gem "onnxruntime"
26
+ gem "tokenizers"
27
+ ```
28
+
29
+ ## Quick Start
16
30
 
17
31
  ```ruby
18
32
  require "reranker_ruby"
@@ -33,8 +47,22 @@ results = reranker.rerank(query, documents, top_k: 3)
33
47
  results.each do |r|
34
48
  puts "#{r.score.round(4)} | #{r.text[0..60]}"
35
49
  end
50
+ # 0.9987 | Paris is the capital and largest city of France.
51
+ # 0.8234 | The Eiffel Tower is located in Paris.
52
+ # 0.6123 | Lyon is the second-largest city in France.
36
53
  ```
37
54
 
55
+ ## Providers
56
+
57
+ ### Cohere Rerank
58
+
59
+ ```ruby
60
+ reranker = RerankerRuby::Cohere.new(api_key: ENV["COHERE_API_KEY"])
61
+ results = reranker.rerank(query, documents, top_k: 3)
62
+ ```
63
+
64
+ Uses [Cohere Rerank API v2](https://docs.cohere.com/reference/rerank) with the `rerank-v3.5` model by default.
65
+
38
66
  ### Jina Rerank
39
67
 
40
68
  ```ruby
@@ -42,7 +70,52 @@ reranker = RerankerRuby::Jina.new(api_key: ENV["JINA_API_KEY"])
42
70
  results = reranker.rerank(query, documents, top_k: 3)
43
71
  ```
44
72
 
45
- ### Hash Documents with Metadata
73
+ Uses `jina-reranker-v2-base-multilingual` by default.
74
+
75
+ ### Local ONNX Inference
76
+
77
+ Run cross-encoder models locally without API calls. Models are auto-downloaded from HuggingFace Hub.
78
+
79
+ ```ruby
80
+ reranker = RerankerRuby::Onnx.new(
81
+ model: "cross-encoder/ms-marco-MiniLM-L-6-v2"
82
+ )
83
+ results = reranker.rerank(query, documents, top_k: 3)
84
+ ```
85
+
86
+ Or use a local model file:
87
+
88
+ ```ruby
89
+ reranker = RerankerRuby::Onnx.new(
90
+ model_path: "/path/to/reranker.onnx",
91
+ tokenizer: "cross-encoder/ms-marco-MiniLM-L-6-v2"
92
+ )
93
+ ```
94
+
95
+ Supported models:
96
+ - `cross-encoder/ms-marco-MiniLM-L-6-v2`
97
+ - `cross-encoder/ms-marco-MiniLM-L-12-v2`
98
+ - `BAAI/bge-reranker-base`
99
+ - `BAAI/bge-reranker-large`
100
+ - `BAAI/bge-reranker-v2-m3`
101
+
102
+ Requires the `onnxruntime` and `tokenizers` gems.
103
+
104
+ ## Result Object
105
+
106
+ Every reranker returns an array of `Result` objects, sorted by relevance (highest first):
107
+
108
+ ```ruby
109
+ result.text # => "Paris is the capital..."
110
+ result.score # => 0.9987
111
+ result.index # => 0 (position in the original document array)
112
+ result.metadata # => {} (preserved from input)
113
+ result.to_h # => { text: "...", score: 0.9987, index: 0, metadata: {} }
114
+ ```
115
+
116
+ ## Structured Documents with Metadata
117
+
118
+ Pass hashes instead of strings. Metadata is preserved through reranking:
46
119
 
47
120
  ```ruby
48
121
  documents = [
@@ -54,38 +127,192 @@ results = reranker.rerank(query, documents, top_k: 3)
54
127
  results.first.metadata # => { source: "wiki", id: "doc1" }
55
128
  ```
56
129
 
57
- ### Reciprocal Rank Fusion
130
+ ## Reciprocal Rank Fusion
131
+
132
+ Combine results from multiple retrieval strategies before reranking:
133
+
134
+ ```ruby
135
+ vector_results = collection.search(embedding, top_k: 50)
136
+ keyword_results = Article.where("content LIKE ?", "%#{query}%").limit(50)
137
+
138
+ fused = RerankerRuby::RRF.fuse(
139
+ vector_results.map(&:id),
140
+ keyword_results.map(&:id),
141
+ k: 60
142
+ )
143
+ # => ranked array of IDs by combined relevance
144
+
145
+ # Then rerank the fused results for final precision
146
+ top_docs = fused.first(20).map { |id| Document.find(id) }
147
+ final = reranker.rerank(query, top_docs.map(&:content), top_k: 5)
148
+ ```
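
For intuition, Reciprocal Rank Fusion as it is commonly defined gives each ID a score of `1 / (k + rank)` in every list it appears in (rank starting at 1) and sums those scores across lists. The sketch below assumes that standard formula. `rrf_fuse` is a hypothetical helper, and the diff does not show whether `RerankerRuby::RRF.fuse` breaks ties the same way.

```ruby
# Standard RRF formula: score(id) = sum over lists of 1 / (k + rank), rank starting at 1.
# `rrf_fuse` is a hypothetical helper, not the gem's RRF.fuse.
def rrf_fuse(*rankings, k: 60)
  scores = Hash.new(0.0)
  rankings.each do |ranking|
    ranking.each_with_index do |id, i|
      scores[id] += 1.0 / (k + i + 1)
    end
  end

  # Highest fused score first; ties fall back to first-seen order (an assumption).
  scores.each_with_index
        .sort_by { |(_id, score), order| [-score, order] }
        .map { |(id, _score), _order| id }
end

vector_ids  = ["doc1", "doc3", "doc5"]
keyword_ids = ["doc2", "doc1", "doc4"]

fused = rrf_fuse(vector_ids, keyword_ids)
p fused
# => ["doc1", "doc2", "doc3", "doc5", "doc4"]
```

`doc1` wins because it is the only ID present in both lists. A larger `k` flattens the gap between adjacent ranks, so agreement across lists matters more than position within any single list.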
149
+
150
+ ## Ensemble Reranking
151
+
152
+ Combine multiple rerankers with weighted score aggregation:
153
+
154
+ ```ruby
155
+ cohere = RerankerRuby::Cohere.new(api_key: ENV["COHERE_API_KEY"])
156
+ jina = RerankerRuby::Jina.new(api_key: ENV["JINA_API_KEY"])
157
+
158
+ ensemble = RerankerRuby::Ensemble.new(
159
+ rerankers: [cohere, jina],
160
+ weights: [0.6, 0.4],
161
+ normalize: :min_max # :min_max, :softmax, :sigmoid, or :none
162
+ )
163
+
164
+ results = ensemble.rerank(query, documents, top_k: 5)
165
+ ```
166
+
167
+ ## Score Normalization
58
168
 
59
- Combine results from multiple retrieval strategies:
169
+ Different models produce scores on different scales. Normalize them for comparison:
60
170
 
61
171
  ```ruby
62
- vector_results = ["doc1", "doc3", "doc5"]
63
- keyword_results = ["doc2", "doc1", "doc4"]
172
+ results = reranker.rerank(query, documents)
173
+
174
+ # Min-max to [0, 1]
175
+ normalized = RerankerRuby::ScoreNormalizer.min_max(results)
64
176
 
65
- fused = RerankerRuby::RRF.fuse(vector_results, keyword_results, k: 60)
177
+ # Softmax (scores sum to 1.0)
178
+ normalized = RerankerRuby::ScoreNormalizer.softmax(results)
179
+
180
+ # Sigmoid (each score independently mapped to [0, 1])
181
+ normalized = RerankerRuby::ScoreNormalizer.sigmoid(results)
66
182
  ```
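
As a rough illustration of what the three normalizations compute, here are plain-array versions. These are hypothetical helpers that work on floats rather than on the gem's `Result` objects, and the handling of an all-equal input in `min_max` is an assumption.

```ruby
# Plain-array sketches of the three normalizations; hypothetical helpers, not the gem's API.

def min_max(scores)
  return [] if scores.empty?

  min, max = scores.minmax
  range = max - min
  return scores.map { 0.0 } if range.zero? # assumption: all-equal input maps to 0.0

  scores.map { |s| (s - min) / range }
end

def softmax(scores)
  return [] if scores.empty?

  max = scores.max
  exps = scores.map { |s| Math.exp(s - max) } # subtract max for numerical stability
  sum = exps.sum
  exps.map { |e| e / sum }
end

def sigmoid(scores)
  scores.map { |s| 1.0 / (1.0 + Math.exp(-s)) }
end

raw = [2.0, 0.0, -2.0]
p min_max(raw)                          # => [1.0, 0.5, 0.0]
p softmax(raw).sum.round(6)             # => 1.0
p sigmoid(raw).map { |s| s.round(3) }   # => [0.881, 0.5, 0.119]
```

Min-max and softmax depend on the whole result set, so the same document can receive a different normalized score when the candidate list changes. Sigmoid maps each score independently.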
67
183
 
68
- ### Caching
184
+ ## Batch Reranking
185
+
186
+ Rerank multiple queries concurrently:
69
187
 
70
188
  ```ruby
189
+ queries = ["capital of France?", "tallest building?", "largest ocean?"]
190
+
191
+ results = RerankerRuby::Batch.rerank(
192
+ reranker, queries, documents,
193
+ top_k: 5,
194
+ threads: 4
195
+ )
196
+
197
+ results[0] # => results for queries[0]
198
+ results[1] # => results for queries[1]
199
+ ```
200
+
201
+ ## Caching
202
+
203
+ Avoid duplicate API calls for the same query+documents:
204
+
205
+ ```ruby
206
+ # In-memory cache
71
207
  reranker = RerankerRuby::Cohere.new(
72
208
  api_key: ENV["COHERE_API_KEY"],
73
209
  cache: RerankerRuby::Cache::Memory.new(ttl: 3600)
74
210
  )
75
211
 
212
+ # Redis cache
213
+ require "redis"
214
+ reranker = RerankerRuby::Cohere.new(
215
+ api_key: ENV["COHERE_API_KEY"],
216
+ cache: RerankerRuby::Cache::Redis.new(redis: Redis.new, ttl: 3600)
217
+ )
218
+
76
219
  reranker.rerank(query, docs, top_k: 5) # API call
77
220
  reranker.rerank(query, docs, top_k: 5) # cache hit
78
221
  ```
79
222
 
80
- ## Result Object
223
+ ## Logging & Metrics
224
+
225
+ Every rerank call is automatically instrumented:
81
226
 
82
227
  ```ruby
83
- result.text # => "Paris is the capital..."
84
- result.score # => 0.9987
85
- result.index # => 0 (original position)
86
- result.metadata # => {} (preserved from input)
228
+ # Set log level
229
+ RerankerRuby::Logging.logger = Logger.new($stdout)
230
+ RerankerRuby::Logging.logger.level = Logger::INFO
231
+
232
+ # Subscribe to rerank events
233
+ RerankerRuby::Logging.on_rerank do |event|
234
+ puts "#{event[:reranker]} reranked #{event[:document_count]} docs in #{event[:duration_ms]}ms"
235
+ # event keys: :reranker, :query, :document_count, :top_k,
236
+ # :result_count, :duration_ms, :top_score
237
+ end
238
+ ```
239
+
240
+ ## Rails Integration
241
+
242
+ ### Configuration
243
+
244
+ Run the install generator:
245
+
246
+ ```bash
247
+ rails generate reranker_ruby:install
248
+ ```
249
+
250
+ This creates `config/initializers/reranker_ruby.rb`:
251
+
252
+ ```ruby
253
+ RerankerRuby.configure do |config|
254
+ config.default_provider = :cohere # :cohere, :jina, or :onnx
255
+ config.cohere_api_key = ENV["COHERE_API_KEY"]
256
+ config.default_top_k = 10
257
+ config.cache_store = :memory # :memory, :redis, or nil
258
+ config.cache_ttl = 3600
259
+ end
260
+ ```
261
+
262
+ Then use the global convenience method anywhere:
263
+
264
+ ```ruby
265
+ results = RerankerRuby.rerank("What is Ruby?", documents, top_k: 5)
87
266
  ```
88
267
 
268
+ ### ActiveJob for Async Reranking
269
+
270
+ For large result sets, run reranking in the background:
271
+
272
+ ```ruby
273
+ RerankerRuby::RerankJob.perform_later(
274
+ query: "What is Ruby?",
275
+ documents: ["doc1", "doc2", ...],
276
+ top_k: 5,
277
+ callback: "MyRerankCallback"
278
+ )
279
+
280
+ # Callback class
281
+ class MyRerankCallback
282
+ def self.on_rerank_complete(query, results)
283
+ # results is an array of hashes: [{ text:, score:, index:, metadata: }, ...]
284
+ end
285
+ end
286
+ ```
287
+
288
+ ### Pipeline Middleware
289
+
290
+ Plug into any RAG pipeline as a reranking step:
291
+
292
+ ```ruby
293
+ middleware = RerankerRuby::Middleware.new(
294
+ reranker: RerankerRuby::Cohere.new(api_key: "..."),
295
+ top_k: 5,
296
+ text_key: :content
297
+ )
298
+
299
+ # Works with hashes, strings, or objects
300
+ candidates = [
301
+ { content: "Paris is the capital...", source: "wiki" },
302
+ { content: "Berlin is the capital...", source: "wiki" },
303
+ ]
304
+
305
+ results = middleware.call(query: "capital of France?", candidates: candidates)
306
+ ```
307
+
308
+ ## Dependencies
309
+
310
+ **Runtime:** `net/http` (stdlib), `json` (stdlib), `logger`
311
+
312
+ **Optional:** `onnxruntime` and `tokenizers` (for local ONNX inference), `redis` (for Redis caching)
313
+
314
+ **Development:** `minitest`, `rake`, `webmock`
315
+
89
316
  ## License
90
317
 
91
318
  MIT
@@ -3,6 +3,7 @@
3
3
  require "net/http"
4
4
  require "json"
5
5
  require "uri"
6
+ require "digest"
6
7
 
7
8
  module RerankerRuby
8
9
  class Error < StandardError; end
@@ -20,6 +21,12 @@ module RerankerRuby
20
21
 
21
22
  private
22
23
 
24
+ def validate_inputs!(query, documents, top_k)
25
+ raise ArgumentError, "query cannot be nil or empty" if query.nil? || query.to_s.strip.empty?
26
+ raise ArgumentError, "documents cannot be nil or empty" if documents.nil? || documents.empty?
27
+ raise ArgumentError, "top_k must be positive" if top_k && top_k <= 0
28
+ end
29
+
23
30
  def instrument(query:, document_count:, top_k:, &block)
24
31
  Logging.instrument(
25
32
  reranker_class: self.class.name,
@@ -40,15 +47,14 @@ module RerankerRuby
40
47
  document.reject { |k, _| k == :text || k == "text" }
41
48
  end
42
49
 
43
- def cache_key(query, documents)
44
- require "digest"
45
- Digest::SHA256.hexdigest("#{query}:#{documents.map(&:to_s).join("|")}")
50
+ def cache_key(query, documents, top_k = nil)
51
+ Digest::SHA256.hexdigest("#{query}:#{top_k}:#{documents.map(&:to_s).join("|")}")
46
52
  end
47
53
 
48
- def with_cache(query, documents, &block)
54
+ def with_cache(query, documents, top_k: nil, &block)
49
55
  return yield unless @cache
50
56
 
51
- key = cache_key(query, documents)
57
+ key = cache_key(query, documents, top_k)
52
58
  cached = @cache.get(key)
53
59
  return cached if cached
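
The practical effect of adding `top_k` to the key can be shown in isolation. With the 0.1.0 key, two calls that differed only in `top_k` hashed to the same entry, so, reading the `with_cache` logic above, the second call would have been served the first call's cached result. The snippet below copies the new key format from the diff into a standalone method; the sample query and documents are made up.

```ruby
require "digest"

# Same key format as the 0.1.1 diff: query, top_k, and documents all feed the digest.
def cache_key(query, documents, top_k = nil)
  Digest::SHA256.hexdigest("#{query}:#{top_k}:#{documents.map(&:to_s).join("|")}")
end

docs = ["Paris is the capital of France.", "Lyon is in France."]
key_top3 = cache_key("capital of France?", docs, 3)
key_top5 = cache_key("capital of France?", docs, 5)

puts key_top3 == key_top5                                   # false: different top_k, different entry
puts key_top3 == cache_key("capital of France?", docs, 3)   # true: same inputs, same key
```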
54
60
 
@@ -59,21 +65,43 @@ module RerankerRuby
59
65
 
60
66
  def post(url, body, headers: {})
61
67
  uri = URI.parse(url)
62
- http = Net::HTTP.new(uri.host, uri.port)
63
- http.use_ssl = uri.scheme == "https"
64
-
65
- request = Net::HTTP::Post.new(uri.path)
66
- request["Content-Type"] = "application/json"
67
- headers.each { |k, v| request[k] = v }
68
- request.body = JSON.generate(body)
69
-
70
- response = http.request(request)
71
-
72
- unless response.is_a?(Net::HTTPSuccess)
73
- raise APIError, "HTTP #{response.code}: #{response.body}"
68
+ retries = 0
69
+ max_retries = 3
70
+
71
+ begin
72
+ http = Net::HTTP.new(uri.host, uri.port)
73
+ http.use_ssl = uri.scheme == "https"
74
+ http.open_timeout = 30
75
+ http.read_timeout = 30
76
+ http.write_timeout = 30
77
+
78
+ request = Net::HTTP::Post.new(uri.path)
79
+ request["Content-Type"] = "application/json"
80
+ request["User-Agent"] = "RerankerRuby/#{RerankerRuby::VERSION}"
81
+ headers.each { |k, v| request[k] = v }
82
+ request.body = JSON.generate(body)
83
+
84
+ response = http.request(request)
85
+
86
+ if response.code.to_i == 429 || response.code.to_i >= 500
87
+ raise APIError, "HTTP #{response.code}: #{response.body}"
88
+ end
89
+
90
+ unless response.is_a?(Net::HTTPSuccess)
91
+ raise APIError, "HTTP #{response.code}: #{response.body}"
92
+ end
93
+
94
+ JSON.parse(response.body)
95
+ rescue APIError => e
96
+ retries += 1
97
+ if retries <= max_retries && (e.message.include?("429") || e.message.include?("50"))
98
+ sleep(2 ** (retries - 1))
99
+ retry
100
+ end
101
+ raise
102
+ rescue JSON::ParserError => e
103
+ raise APIError, "Invalid JSON response: #{e.message}"
74
104
  end
75
-
76
- JSON.parse(response.body)
77
105
  end
78
106
  end
79
107
  end
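
One detail of the retry logic above is worth illustrating. The error message is built as `"HTTP #{response.code}: #{response.body}"`, so the substring check in the `rescue` also sees the response body, and a non-retryable error whose body happens to contain "429" or "50" would be retried too. The sketch below shows one possible alternative that decides from the numeric status instead. `HttpStatusError`, `RETRYABLE`, and `with_retries` are hypothetical names, not part of the gem, and the injectable `sleeper` exists only to make the example testable.

```ruby
# A hypothetical retry helper, not the gem's code. Retryability is decided from the
# numeric status carried on the exception rather than from the message text.
class HttpStatusError < StandardError
  attr_reader :status

  def initialize(status)
    @status = status
    super("HTTP #{status}")
  end
end

RETRYABLE = ->(status) { status == 429 || status >= 500 }

# Yields the attempt number (starting at 0). Waits 1s, 2s, 4s between attempts,
# the same 2 ** (retries - 1) schedule as in the diff.
def with_retries(max_retries: 3, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    yield attempt
  rescue HttpStatusError => e
    attempt += 1
    raise unless attempt <= max_retries && RETRYABLE.call(e.status)

    sleeper.call(2**(attempt - 1))
    retry
  end
end

delays = []
statuses = [503, 429, 200] # simulated responses: two transient failures, then success

result = with_retries(sleeper: ->(seconds) { delays << seconds }) do |attempt|
  status = statuses[attempt]
  raise HttpStatusError.new(status) unless status == 200

  "ok after #{attempt} retries"
end

puts result   # => ok after 2 retries
p delays      # => [1, 2]
```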
@@ -21,6 +21,7 @@ module RerankerRuby
  end
 
  results = Array.new(queries.length)
+ errors = []
  mutex = Mutex.new
  queue = Queue.new
 
@@ -30,14 +31,19 @@ module RerankerRuby
30
31
  workers = threads.times.map do
31
32
  Thread.new do
32
33
  while (item = queue.pop)
33
- query, idx = item
34
- result = reranker.rerank(query, documents, top_k: top_k)
35
- mutex.synchronize { results[idx] = result }
34
+ begin
35
+ query, idx = item
36
+ result = reranker.rerank(query, documents, top_k: top_k)
37
+ mutex.synchronize { results[idx] = result }
38
+ rescue => e
39
+ mutex.synchronize { errors << e }
40
+ end
36
41
  end
37
42
  end
38
43
  end
39
44
 
40
45
  workers.each(&:join)
46
+ raise errors.first if errors.any?
41
47
  results
42
48
  end
43
49
  end
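
The error-collection pattern in this hunk can be tried outside the gem with a small standalone worker pool. `parallel_map` is a hypothetical helper. The diff does not show how the gem's workers are told to stop, so closing the queue here is an assumption.

```ruby
# Standalone sketch of the pattern in the diff: results are stored by index, exceptions
# are collected under a mutex, and the first one is re-raised after all workers finish.
# `parallel_map` is a hypothetical name, not part of the gem.
def parallel_map(items, threads: 4)
  results = Array.new(items.length)
  errors = []
  mutex = Mutex.new
  queue = Queue.new
  items.each_with_index { |item, idx| queue << [item, idx] }
  queue.close # assumption: a closed, empty queue makes pop return nil and ends each loop

  workers = Array.new(threads) do
    Thread.new do
      while (pair = queue.pop)
        item, idx = pair
        begin
          value = yield(item)
          mutex.synchronize { results[idx] = value }
        rescue StandardError => e
          mutex.synchronize { errors << e }
        end
      end
    end
  end

  workers.each(&:join)
  raise errors.first if errors.any?

  results
end

queries = ["capital of France?", "tallest building?", "largest ocean?"]
lengths = parallel_map(queries, threads: 2) { |q| q.length }
p lengths # same order as `queries`, regardless of which thread finished first
```

Rescuing inside the loop keeps a worker alive after one failure, so the remaining items are still processed before the first error is raised. Only the first error surfaces, and the partial results are discarded.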
@@ -6,30 +6,39 @@ module RerankerRuby
6
6
  def initialize(ttl: 3600)
7
7
  @ttl = ttl
8
8
  @store = {}
9
+ @mutex = Mutex.new
9
10
  end
10
11
 
11
12
  def get(key)
12
- entry = @store[key]
13
- return nil unless entry
13
+ @mutex.synchronize do
14
+ entry = @store[key]
15
+ return nil unless entry
14
16
 
15
- if Time.now.to_f - entry[:time] > @ttl
16
- @store.delete(key)
17
- return nil
18
- end
17
+ if Time.now.to_f - entry[:time] > @ttl
18
+ @store.delete(key)
19
+ return nil
20
+ end
19
21
 
20
- entry[:value]
22
+ entry[:value]
23
+ end
21
24
  end
22
25
 
23
26
  def set(key, value)
24
- @store[key] = { value: value, time: Time.now.to_f }
27
+ @mutex.synchronize do
28
+ @store[key] = { value: value, time: Time.now.to_f }
29
+ end
25
30
  end
26
31
 
27
32
  def clear
28
- @store.clear
33
+ @mutex.synchronize do
34
+ @store.clear
35
+ end
29
36
  end
30
37
 
31
38
  def size
32
- @store.size
39
+ @mutex.synchronize do
40
+ @store.size
41
+ end
33
42
  end
34
43
  end
35
44
  end
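
A compact standalone version of this pattern makes the locking and expiry behaviour easy to check. `TtlCache` and its injectable `clock` are hypothetical additions for illustration; the gem's class reads `Time.now.to_f` directly, as shown above.

```ruby
# Hypothetical standalone TTL cache following the same structure as the diff.
# The `clock` parameter is an illustration-only addition so expiry can be simulated.
class TtlCache
  def initialize(ttl:, clock: -> { Time.now.to_f })
    @ttl = ttl
    @clock = clock
    @store = {}
    @mutex = Mutex.new
  end

  def get(key)
    @mutex.synchronize do
      entry = @store[key]
      # An early return from inside the block still releases the lock.
      return nil unless entry

      if @clock.call - entry[:time] > @ttl
        @store.delete(key)
        return nil
      end

      entry[:value]
    end
  end

  def set(key, value)
    @mutex.synchronize { @store[key] = { value: value, time: @clock.call } }
  end

  def size
    @mutex.synchronize { @store.size }
  end
end

now = 1000.0
cache = TtlCache.new(ttl: 60, clock: -> { now })
cache.set("k", "v")
p cache.get("k")   # => "v"

now += 61          # simulate 61 seconds passing
p cache.get("k")   # => nil
p cache.size       # => 0
```

Expired entries are removed lazily, on the next `get` for that key, so keys that are never read again stay in the store until `clear` is called.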
@@ -1,5 +1,7 @@
  # frozen_string_literal: true
 
+ require "json"
+
  module RerankerRuby
  module Cache
  class Redis
@@ -13,11 +15,20 @@ module RerankerRuby
13
15
  data = @redis.get("#{@prefix}#{key}")
14
16
  return nil unless data
15
17
 
16
- Marshal.load(data) # rubocop:disable Security/MarshalLoad
18
+ parsed = JSON.parse(data)
19
+ parsed.map do |h|
20
+ Result.new(
21
+ text: h["text"],
22
+ score: h["score"],
23
+ index: h["index"],
24
+ metadata: h["metadata"] || {}
25
+ )
26
+ end
17
27
  end
18
28
 
19
29
  def set(key, value)
20
- @redis.setex("#{@prefix}#{key}", @ttl, Marshal.dump(value))
30
+ serialized = JSON.generate(value.map(&:to_h))
31
+ @redis.setex("#{@prefix}#{key}", @ttl, serialized)
21
32
  end
22
33
 
23
34
  def clear
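
The switch from `Marshal` to JSON can be exercised without Redis. The sketch below uses a hypothetical `SketchResult` struct in place of the gem's `Result` class and repeats the serialize and deserialize steps from the diff. One consequence follows from JSON itself rather than from anything stated in the diff: symbol keys in `metadata` come back as strings, unless the gem's `Result` normalizes them.

```ruby
require "json"

# Hypothetical stand-in for the gem's Result class, just enough to show the round trip.
SketchResult = Struct.new(:text, :score, :index, :metadata, keyword_init: true)

original = [
  SketchResult.new(text: "Paris is the capital of France.", score: 0.9987, index: 0,
                   metadata: { source: "wiki", id: "doc1" })
]

# Serialize as the new `set` does: array of hashes to a JSON string.
serialized = JSON.generate(original.map(&:to_h))

# Deserialize as the new `get` does: string-keyed hashes back into result objects.
restored = JSON.parse(serialized).map do |h|
  SketchResult.new(text: h["text"], score: h["score"], index: h["index"],
                   metadata: h["metadata"] || {})
end

p restored.first.score      # => 0.9987
p restored.first.metadata   # => {"source"=>"wiki", "id"=>"doc1"} (string keys, not symbols)
```

Code that reads `metadata[:source]` after a Redis cache hit would therefore get `nil` in this sketch, while the in-memory cache returns the original objects unchanged.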
@@ -12,8 +12,9 @@ module RerankerRuby
  end
 
  def rerank(query, documents, top_k: 10, model: nil)
+ validate_inputs!(query, documents, top_k)
  instrument(query: query, document_count: documents.length, top_k: top_k) do
- with_cache(query, documents) do
+ with_cache(query, documents, top_k: top_k) do
  texts = extract_texts(documents)
 
  response = post(API_URL, {
@@ -66,7 +66,10 @@ module RerankerRuby
66
66
 
67
67
  class << self
68
68
  def configuration
69
- @configuration ||= Configuration.new
69
+ @config_mutex ||= Mutex.new
70
+ @config_mutex.synchronize do
71
+ @configuration ||= Configuration.new
72
+ end
70
73
  end
71
74
 
72
75
  def configure
@@ -74,13 +77,19 @@ module RerankerRuby
74
77
  end
75
78
 
76
79
  def reset_configuration!
77
- @configuration = Configuration.new
78
- @reranker = nil
80
+ @config_mutex ||= Mutex.new
81
+ @config_mutex.synchronize do
82
+ @configuration = Configuration.new
83
+ @reranker = nil
84
+ end
79
85
  end
80
86
 
81
87
  # Global reranker instance built from configuration
82
88
  def reranker
83
- @reranker ||= configuration.build_reranker
89
+ @config_mutex ||= Mutex.new
90
+ @config_mutex.synchronize do
91
+ @reranker ||= configuration.build_reranker
92
+ end
84
93
  end
85
94
 
86
95
  # Convenience method for quick reranking
@@ -27,7 +27,8 @@ module RerankerRuby
  end
 
  def rerank(query, documents, top_k: 10)
- with_cache(query, documents) do
+ validate_inputs!(query, documents, top_k)
+ with_cache(query, documents, top_k: top_k) do
  texts = extract_texts(documents)
 
  # Collect and normalize results from each reranker
@@ -12,8 +12,9 @@ module RerankerRuby
  end
 
  def rerank(query, documents, top_k: 10, model: nil)
+ validate_inputs!(query, documents, top_k)
  instrument(query: query, document_count: documents.length, top_k: top_k) do
- with_cache(query, documents) do
+ with_cache(query, documents, top_k: top_k) do
  texts = extract_texts(documents)
 
  response = post(API_URL, {
@@ -26,7 +26,8 @@ module RerankerRuby
  end
 
  def rerank(query, documents, top_k: 10)
- with_cache(query, documents) do
+ validate_inputs!(query, documents, top_k)
+ with_cache(query, documents, top_k: top_k) do
  texts = extract_texts(documents)
 
  scores = texts.map { |text| score_pair(query, text) }
@@ -9,6 +9,10 @@ module RerankerRuby
9
9
  return results if results.empty?
10
10
 
11
11
  scores = results.map(&:score)
12
+ if scores.any? { |s| s.nan? || s.infinite? }
13
+ return results.map { |r| with_score(r, 0.0) }
14
+ end
15
+
12
16
  min = scores.min
13
17
  max = scores.max
14
18
  range = max - min
@@ -23,6 +27,10 @@ module RerankerRuby
23
27
  return results if results.empty?
24
28
 
25
29
  scores = results.map(&:score)
30
+ if scores.any? { |s| s.nan? || s.infinite? }
31
+ return results.map { |r| with_score(r, 0.0) }
32
+ end
33
+
26
34
  max_score = scores.max
27
35
  exps = scores.map { |s| Math.exp(s - max_score) } # subtract max for numerical stability
28
36
  sum = exps.sum
@@ -34,6 +42,13 @@ module RerankerRuby
34
42
 
35
43
  # Sigmoid normalization — each score independently mapped to [0, 1]
36
44
  def self.sigmoid(results)
45
+ return results if results.empty?
46
+
47
+ scores = results.map(&:score)
48
+ if scores.any? { |s| s.nan? || s.infinite? }
49
+ return results.map { |r| with_score(r, 0.0) }
50
+ end
51
+
37
52
  results.map { |r| with_score(r, 1.0 / (1.0 + Math.exp(-r.score))) }
38
53
  end
39
54
 
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module RerankerRuby
- VERSION = "0.1.0"
+ VERSION = "0.1.1"
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: reranker-ruby
  version: !ruby/object:Gem::Version
- version: 0.1.0
+ version: 0.1.1
  platform: ruby
  authors:
  - Johannes Dwi Cahyo