fastembed 1.0.0 → 1.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 1bc6e2a53d0c7a8c4679f7af6a23af74b903b0e227c401307e3d936d74719291
-  data.tar.gz: 5664c43f8d7f0632719324b42805abe3806274c18113232f82bbddce918647e5
+  metadata.gz: 954cdd87ba985d20a8a5bf0e5676178ebc891032f275fb4cb8cf547c0378a476
+  data.tar.gz: 4c11fbd4894906ab46a99746a2efa16f8a56e494a4fcd64012d642b992914918
 SHA512:
-  metadata.gz: f7f821a8a8ee49fdbbde65eaf394d8b406b1aecfc53fa572ecbbf3ccaf9fd120771fa5e5d444d339aa9ec5b729c63f283e564d05ea1ce1db1427580ced41cc0d
-  data.tar.gz: a99fda1200c29abf2ea7bed10ab5e7eee30fc5238e5d13b5608f7c41014e98e1e880c67ea1da1e56eef10864697ebca11ed7a7bf6463fb437f4dc4fe80267bef
+  metadata.gz: 948c3d705493fe26f3fd4fcb8b14965c0223b7ddd119307502b9af8f8503552b7693846a2a004ea97aca3d3cc5cd39f77017acd82ee2b48be56b879e6eaa1c7b
+  data.tar.gz: 1ceb5a0de19bc743b215332706d20facc49cda03f7e82211ff4306de1fb443d3f508f2b4b7a59dbdc43f3bbb720fe7accb80f342f45e7cc7cd0bf1c65fbb105e
data/.rubocop.yml CHANGED
@@ -8,6 +8,7 @@ AllCops:
   Exclude:
     - 'bin/*'
     - 'vendor/**/*'
+    - 'benchmark/**/*'
 
 Style/Documentation:
   Enabled: false
data/.yardopts ADDED
@@ -0,0 +1,6 @@
+--markup markdown
+--title "Fastembed Ruby API Documentation"
+--output-dir doc
+--readme README.md
+--files CHANGELOG.md,BENCHMARKS.md
+lib/**/*.rb
data/BENCHMARKS.md CHANGED
@@ -1,6 +1,6 @@
 # Benchmarks
 
-Performance benchmarks on Apple M1 Max, Ruby 3.3.10.
+Performance benchmarks on Apple M1 Max, Ruby 3.3, Python 3.13 (January 2026).
 
 ## Single Document Latency
 
@@ -69,6 +69,98 @@ We tested CoreML execution provider to see if GPU/Neural Engine acceleration hel
 
 **Recommendation:** Stick with the default CPU provider.
 
+## Ruby vs Python FastEmbed
+
+Comprehensive comparison of fastembed-rb against Python FastEmbed (v0.7.4) on Apple M1 Max.
+
+### Text Embeddings (100 documents)
+
+| Model | Ruby (docs/sec) | Python (docs/sec) | Ratio |
+|-------|-----------------|-------------------|-------|
+| BAAI/bge-small-en-v1.5 | 566 | 629 | 0.90x |
+| BAAI/bge-base-en-v1.5 | 176 | 169 | **1.04x** |
+| all-MiniLM-L6-v2 | 922 | 1309 | 0.70x |
+
+Ruby is within 10-30% of Python for text embeddings. Both use the same ONNX Runtime backend.
+
+### Rerankers (100 query-document pairs)
+
+| Model | Ruby (pairs/sec) | Python (pairs/sec) | Ratio |
+|-------|------------------|-------------------|-------|
+| ms-marco-MiniLM-L-6-v2 | 986 | 982 | **1.00x** |
+| ms-marco-MiniLM-L-12-v2 | 398 | 512 | 0.78x |
+| BAAI/bge-reranker-base | 132 | 124 | **1.06x** |
+
+Ruby matches or beats Python on rerankers.
+
+### Sparse Embeddings - SPLADE (100 documents)
+
+| Model | Ruby (docs/sec) | Python (docs/sec) | Ratio |
+|-------|-----------------|-------------------|-------|
+| Splade_PP_en_v1 | 23 | 108 | 0.21x |
+
+Ruby's SPLADE implementation is slower due to post-processing overhead. Python uses optimized numpy operations for the log1p transformation.
+
+### Late Interaction - ColBERT (100 documents)
+
+| Model | Ruby (docs/sec) | Python (docs/sec) | Ratio |
+|-------|-----------------|-------------------|-------|
+| colbert-ir/colbertv2.0 | 191 | 184 | **1.04x** |
+
+Ruby slightly outperforms Python for ColBERT embeddings.
+
+### Image Embeddings (100 images)
+
+| Model | Ruby (imgs/sec) | Python (imgs/sec) | Ratio |
+|-------|-----------------|-------------------|-------|
+| clip-ViT-B-32-vision | 9 | 42 | 0.22x |
+
+Ruby's image embedding is slower due to MiniMagick subprocess overhead for image preprocessing. Python uses Pillow, which is more efficient for batch processing.
+
+### Summary
+
+| Category | Ruby vs Python |
+|----------|---------------|
+| Text Embeddings | ~90% of Python speed |
+| Rerankers | **Equal or faster** |
+| ColBERT | **Equal or faster** |
+| Sparse (SPLADE) | ~21% of Python speed |
+| Image | ~22% of Python speed |
+
+**Recommendation:** Ruby is excellent for text embeddings, reranking, and ColBERT. For heavy sparse or image embedding workloads, consider Python.
+
+### Why the Differences?
+
+Both implementations use the same ONNX Runtime for model inference. The differences come from:
+
+1. **Text/Reranker/ColBERT** - Hot path is tokenization (Rust) + inference (C++). Minimal language overhead. Ruby matches Python.
+
+2. **Sparse (SPLADE)** - Requires post-processing with log1p transformation. Python's numpy vectorization is faster than Ruby loops.
+
+3. **Image** - Requires image preprocessing (resize, normalize). Python's Pillow is faster than Ruby's MiniMagick (subprocess-based).
+
+### Memory Usage
+
+| State | Ruby |
+|-------|------|
+| Initial | 33 MB |
+| Model loaded | 277 MB |
+| +1000 embeddings | 359 MB |
+| After GC | 355 MB |
+
+Memory is stable across multiple embedding rounds - no leaks detected.
+
+### Embedding Quality
+
+Both implementations produce identical embeddings (same ONNX models), verified by cosine similarity tests:
+
+```
+'dog' vs 'puppy' = 0.855 (high - PASS)
+'dog' vs 'cat' = 0.688 (medium - PASS)
+'machine learning' vs 'artificial intelligence' = 0.718 (high - PASS)
+'machine learning' vs 'cooking recipes' = 0.426 (low - PASS)
+```
+
 ## Running Your Own Benchmarks
 
 ```ruby
@@ -81,3 +173,34 @@ texts = Array.new(1000) { "Sample text for benchmarking" }
 result = Benchmark.measure { embedding.embed(texts).to_a }
 puts "#{1000 / result.real} docs/sec"
 ```
+
+## Reranker Performance
+
+TextCrossEncoder (cross-encoder) performance:
+
+| Model | 100 pairs | Throughput |
+|-------|-----------|------------|
+| ms-marco-MiniLM-L-6-v2 | 102ms | **986 pairs/sec** |
+| ms-marco-MiniLM-L-12-v2 | 252ms | **398 pairs/sec** |
+| bge-reranker-base | 758ms | **132 pairs/sec** |
+
+Cross-encoders are slower than embedding models because they process query-document pairs together rather than encoding them independently.
+
+### Profiling Scripts
+
+The `benchmark/` directory contains:
+
+- `profile.rb` - Comprehensive embedding performance profiling
+- `reranker_benchmark.rb` - Reranker/cross-encoder performance
+- `memory_profile.rb` - Memory usage analysis
+- `compare_python.py` - Python FastEmbed comparison
+- `compare_all.rb` - Unified Ruby vs Python comparison
+
+Run with:
+```bash
+ruby benchmark/profile.rb
+ruby benchmark/reranker_benchmark.rb
+ruby benchmark/memory_profile.rb
+ruby benchmark/compare_all.rb
+python3 benchmark/compare_python.py
+```
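The cosine-similarity checks cited in the BENCHMARKS.md additions above can be sketched in plain Ruby. This is a minimal illustration of the metric only; the vectors below are tiny made-up stand-ins, not real model outputs, which would come from `embedding.embed(...)`.

```ruby
# Cosine similarity between two equal-length vectors, as used in the
# embedding-quality checks: dot product divided by the product of norms.
def cosine_similarity(a, b)
  dot    = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  dot / (norm_a * norm_b)
end

# Illustrative stand-in vectors (not real embeddings).
dog   = [0.9, 0.3, 0.1]
puppy = [0.8, 0.4, 0.2]
phone = [0.1, 0.2, 0.9]

# Related concepts should score higher than unrelated ones.
puts format('dog vs puppy: %.3f', cosine_similarity(dog, puppy))
puts format('dog vs phone: %.3f', cosine_similarity(dog, phone))
```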
data/CHANGELOG.md CHANGED
@@ -7,6 +7,20 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+### Added
+
+- `TextCrossEncoder` class for reranking query-document pairs
+- Support for cross-encoder/reranker models:
+  - cross-encoder/ms-marco-MiniLM-L-6-v2 (default)
+  - cross-encoder/ms-marco-MiniLM-L-12-v2
+  - BAAI/bge-reranker-base
+  - BAAI/bge-reranker-large
+  - jinaai/jina-reranker-v1-turbo-en
+- `rerank` method for scoring query-document pairs
+- `rerank_with_scores` method for sorted results with top_k support
+- CLI tool (`fastembed`) with `embed` and `list-models` commands
+- Comprehensive benchmark suite comparing Ruby vs Python performance
+
 ## [1.0.0] - 2025-01-08
 
 ### Added
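The rerank-with-scores behavior described in the changelog additions (score each query-document pair, sort descending, optionally keep only the top_k results) can be sketched conceptually in plain Ruby. The scorer here is a toy word-overlap stand-in, not the gem's actual `TextCrossEncoder`, which runs an ONNX cross-encoder model; the method name and result shape are illustrative assumptions.

```ruby
# Conceptual sketch of rerank-with-scores semantics: score each
# (query, document) pair, sort by score descending, truncate to top_k.
def rerank_with_scores(query, documents, top_k: nil, &score_pair)
  scored = documents.map { |doc| { document: doc, score: score_pair.call(query, doc) } }
  sorted = scored.sort_by { |r| -r[:score] }
  top_k ? sorted.first(top_k) : sorted
end

# Toy scorer: counts words shared between query and document.
overlap = ->(q, d) { (q.downcase.split & d.downcase.split).size.to_f }

docs = ["ruby embedding library", "python cooking recipes", "fast ruby gems"]
results = rerank_with_scores("ruby library", docs, top_k: 2, &overlap)

# Prints the two best-matching documents, highest score first.
results.each { |r| puts format('%.1f  %s', r[:score], r[:document]) }
```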