rag_embeddings 0.2.0 → 0.2.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 3bebe9dc8527ed47e0d7b48534911e97ee7549bb73763c1dd59db063e41558c8
-   data.tar.gz: fdb272ee4dd12f52f33fb2cd33b9ff22f5207911c68c09b976af59346f666039
+   metadata.gz: '0065892eb9dc58605d7a30a62de23dfc4a7609b7590cbcac70fc378109846024'
+   data.tar.gz: 143f9878b807ff6ad6d9db3011ece7cdb3c733af7c4cdb81373166d6e6c70f2f
  SHA512:
-   metadata.gz: a106f044b23d4438110516ee57dc6079b177861c8e09da4eeffc7e842c3aa9d506b96b6141d6e033a74de1ca61ecf1e53490682f906e14915d76fd5fe4d81103
-   data.tar.gz: 6be68129ca5338a99f3d55816cbb06785cff2530e39784d5202c4602bb73aec783e5180b7cf09a1f1ef962bb5d6a2070af3db080f677369b49a5463d7c466be3
+   metadata.gz: 46acb57d8f5467bafb2b48b594ab308a09097ed323a7958284615714616e01eecd0717256212df3372be2a0b34ae46315315d27cc323fbc2b2362dc56a71f47c
+   data.tar.gz: ca063e0963c0b95f95f5b611efdd239d13e8db6a566fb8882cb33eb08372c38af832000555420e82a11acce2865a23debcb4b5d60bb1847acff3f8c433584963
data/README.md CHANGED
@@ -22,42 +22,50 @@
 
  ---
 
- ## 🔧 Installation
+ ## 🌍 Real-world Use Cases
 
- Add to your Gemfile:
+ - **Question Answering over Documents:** Instantly search and retrieve the most relevant document snippets from thousands of articles, FAQs, or customer support logs in your Ruby app.
+ - **Semantic Search for E-commerce:** Power product search with semantic understanding, returning items similar in meaning, not just keywords.
+ - **Personalized Recommendations:** Find related content (articles, products, videos) by comparing user preferences and content embeddings.
+ - **Knowledge Base Augmentation:** Use with OpenAI or Ollama to enhance chatbots, letting them ground answers in your company’s internal documentation or wiki.
+ - **Fast Prototyping for AI Products:** Effortlessly build MVPs for RAG-enabled chatbots, semantic search tools, and AI-driven discovery apps, all in native Ruby.
 
- ```ruby
- gem "rag_embeddings"
- gem "langchainrb"
- gem "faraday"
- gem "sqlite3"
- ```
+ ---
 
- bundle install
- rake compile
+ ## 👷 Requirements
 
- (Requires a working C compiler!)
+ - Ruby >= 3.3
+ - `langchainrb` (for embedding)
+ - `ollama`: currently used as the LLM, so it must be installed and running (some workarounds exist)
+ - `sqlite3` (for storage)
 
- ## 🏁 Running the test suite
+ ## 🔧 Installation
 
- To run all specs (RSpec required):
+ Requires a working C compiler to build the native extension:
+
+ `gem install rag_embeddings`
+
+ If you'd rather install it using Bundler, add a line for it in your Gemfile (but set the require option to false, as it is a standalone tool):
+
+ ```ruby
+ gem "rag_embeddings", require: false
+ ```
 
- `bundle exec rspec`
 
  ## 🧪 Practical examples
 
  ### 1. Generate an embedding from text
 
  ```ruby
- text = "Hello world, this is RAG!"
- embedding = RagEmbeddings.embed(text)
+ require "rag_embeddings"
+ embedding = RagEmbeddings.embed("Hello world, this is RAG!")
  # embedding is a float array
  ```
 
  The default model is llama3.2, but you can set another one (restart the console afterwards, because the LLM client is memoized):
 
  ```ruby
- embedding = RagEmbeddings.embed(text, model: 'qwen3:0.6b')
+ embedding = RagEmbeddings.embed("Hello world, this is RAG!", model: 'qwen3:0.6b')
  ```
 
  ### 2. Create a C embedding object
@@ -94,46 +102,197 @@ result = db.top_k_similar("Hello!", k: 1)
  puts "Most similar text: #{result.first[1]}, score: #{result.first[2]}"
  ```
 
+ ### 5. Batch-index a folder of documents
+
+ ```ruby
+ # load all .txt files
+ files = Dir["./docs/*.txt"].map { |f| [File.basename(f), File.read(f)] }
+
+ db = RagEmbeddings::Database.new("knowledge_base.db")
+ files.each do |name, text|
+   vector = RagEmbeddings.embed(text)
+   db.insert(name, vector) # stores the file name as the retrievable text; pass `text` to store the content
+ end
+
+ puts "Indexed #{files.size} documents."
+ ```
+
+ ### 6. Simple Retrieval-Augmented Generation (RAG) loop
+
+ ```ruby
+ require "openai" # ruby-openai gem, or your favorite LLM client
+
+ # 1) build or open your vector store
+ db = RagEmbeddings::Database.new("knowledge_base.db")
+
+ # 2) embed your user question
+ client = OpenAI::Client.new(access_token: ENV.fetch("OPENAI_API_KEY"))
+ q_embedding = client.embeddings(
+   parameters: {
+     model: "text-embedding-ada-002",
+     input: "What are the benefits of retrieval-augmented generation?"
+   }
+ ).dig("data", 0, "embedding")
+
+ # 3) retrieve top-3 relevant passages (the store must be indexed with the same embedding model)
+ results = db.top_k_similar(q_embedding, k: 3)
+
+ # 4) build a prompt for your LLM
+ context = results.map { |_id, text, _score| text }.join("\n\n---\n\n")
+ prompt = <<~PROMPT
+   You are an expert.
+   Use the following context to answer the question:
+
+   CONTEXT:
+   #{context}
+
+   QUESTION:
+   What are the benefits of retrieval-augmented generation?
+ PROMPT
+
+ # 5) call the LLM for final answer
+ response = client.chat(
+   parameters: {
+     model: "gpt-4o",
+     messages: [{ role: "user", content: prompt }]
+   }
+ )
+ puts response.dig("choices", 0, "message", "content")
+
+ ```
+
+ ### 7. In-memory store for fast prototyping
+
+ ```ruby
+ # use SQLite :memory: for ephemeral experiments
+ db = RagEmbeddings::Database.new(":memory:")
+
+ # insert & search exactly as with a file-backed DB
+ db.insert("Quick test", RagEmbeddings.embed("Quick test"))
+ db.top_k_similar("Test", k: 1)
+ ```
+
+ ---
+
  ## 🏗️ How it works
 
- - Embeddings are managed as dynamic C objects for efficiency (variable dimension).
- - The only correct way to construct an embedding object is using .from_array.
- - Langchainrb integration lets you easily change the embedding provider (Ollama, OpenAI, etc).
- - Storage uses local SQLite with embeddings as BLOB, for maximum portability and simplicity.
+ **rag_embeddings** combines the simplicity of Ruby with the performance of C to deliver fast vector operations for RAG applications.
+
+ ### Architecture Overview
+
+ The library uses a **hybrid memory-storage approach**:
+
+ 1. **In-Memory Processing**: All vector operations (cosine similarity calculations, embedding manipulations) happen entirely in memory using optimized C code
+ 2. **Persistent Storage**: SQLite serves as a simple, portable storage layer for embeddings and associated text
+ 3. **Dynamic C Objects**: Embeddings are managed as native C structures with automatic memory management
+
+ ### Key Components
+
+ **C Extension (`embedding.c`)**
+ - Handles all computationally intensive operations
+ - Manages dynamic vector dimensions (adapts to any LLM output size)
+ - Performs cosine similarity calculations with optimized algorithms
+ - Ensures memory-safe operations with proper garbage collection integration
+
+ **Ruby Interface**
+ - Provides an intuitive API for vector operations
+ - Integrates seamlessly with LLM providers via langchainrb
+ - Handles database operations and query orchestration
+
+ **SQLite Storage**
+ - Stores embeddings as BLOBs alongside their associated text
+ - Provides persistent storage without requiring external databases
+ - Supports both file-based and in-memory (`:memory:`) databases
+ - Enables portable, self-contained applications
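
To illustrate the "embeddings as BLOBs" idea described in the README text above, here is a minimal sketch of turning a float vector into a byte string and back with Ruby's `Array#pack`. The helper names `embedding_to_blob` and `blob_to_embedding`, and the 32-bit little-endian layout, are assumptions made for this example; the diff does not show the gem's actual on-disk format.

```ruby
# Hypothetical helpers, not part of rag_embeddings.
# Assumed layout: each value stored as a 32-bit little-endian float.
def embedding_to_blob(vector)
  vector.pack("e*")
end

def blob_to_embedding(blob)
  blob.unpack("e*")
end

vector = [0.25, -1.5, 3.0] # values chosen to be exactly representable in 32 bits
blob = embedding_to_blob(vector)
restored = blob_to_embedding(blob)

puts blob.bytesize      # 4 bytes per value
puts restored.inspect
```

A binary string like this can be bound to a BLOB column directly, which is one reason a fixed-width float layout is a common choice for this kind of storage.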
+
+ ### Processing Flow
+
+ 1. **Text → Embedding**: Generate vectors using your preferred LLM (Ollama, OpenAI, etc.)
+ 2. **Memory Allocation**: Create C embedding objects with `Embedding.from_array()`
+ 3. **Storage**: Persist embeddings and text to SQLite for later retrieval
+ 4. **Query Processing**:
+    - Load query embedding into memory
+    - Compare against stored embeddings using fast C-based cosine similarity
+    - Return top-K most similar results ranked by similarity score
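
The query-processing step above can be sketched in plain Ruby. This is only an illustration of the technique (cosine similarity followed by a top-K selection), not the gem's C implementation; `cosine_similarity`, `top_k`, and the in-memory `rows` array are hypothetical names introduced for this example. The `[id, text, score]` result shape mirrors what the README examples read from `top_k_similar`.

```ruby
# Pure-Ruby stand-in for the C cosine similarity; illustration only.
def cosine_similarity(a, b)
  raise ArgumentError, "dimension mismatch" unless a.size == b.size

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Score every stored row against the query and keep the k best,
# highest score first. Rows are [id, text, vector] triples.
def top_k(query, rows, k:)
  rows.map { |id, text, vec| [id, text, cosine_similarity(query, vec)] }
      .max_by(k) { |_id, _text, score| score }
end

rows = [
  [1, "cats",    [1.0, 0.0]],
  [2, "dogs",    [0.0, 1.0]],
  [3, "kittens", [1.0, 1.0]]
]

best = top_k([1.0, 0.0], rows, k: 2)
best.each { |id, text, score| puts "#{id} #{text} #{score.round(3)}" }
```

A linear scan like this touches every stored vector per query, which fits the "thousands to tens of thousands of vectors" range the README mentions.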
+
+ ### Why This Design?
+
+ **Performance**: Critical operations run in optimized C code, delivering significant speed improvements over pure Ruby implementations.
+
+ **Memory Efficiency**: While embeddings are stored in SQLite, all vector computations happen in memory, avoiding I/O bottlenecks during similarity calculations.
+
+ **Simplicity**: SQLite eliminates the need for complex vector database setups while maintaining good performance for moderate-scale applications.
+
+ **Portability**: The entire knowledge base fits in a single SQLite file, making deployment and backup trivial.
+
+ ### Performance Characteristics
+
+ - **Embedding creation**: ~82ms for 10,000 operations
+ - **Cosine similarity**: ~107ms for 10,000 calculations
+ - **Memory usage**: ~34MB for 10,000 embeddings
+ - **Scalability**: Suitable for thousands to tens of thousands of vectors
+
+ For applications requiring millions of vectors, consider specialized vector databases (Faiss, sqlite-vss) while using this library for prototyping and smaller-scale production use.
 
  ## 🎛️ Customization
 
  - Embedding provider: switch model/provider in engine.rb (Ollama, OpenAI, etc)
  - Database: set the SQLite file path as desired
 
- ## 🔢 Embeddings dimension
+ If you need to customize the C part (`ext/rag_embeddings/embedding.c`), recompile it with:
+
+ `rake compile`
 
- The size of embeddings is dynamic and fits with what the LLM provides.
+ ---
+
+ ## 🏁 Running the test suite
+
+ To run all specs (RSpec required):
+
+ `bundle exec rspec`
 
  ## ⚡️ Performance
 
- Embedding creation (10000 times): 82 ms
- Cosine similarity (10000 times): 107 ms
- RSS: 186.7 MB
- .
- Memory usage delta: 33.97 MB for 10000 embeddings
- .
+ `bundle exec rspec spec/performance_spec.rb`
 
- Finished in 0.42577 seconds (files took 0.06832 seconds to load)
- 2 examples, 0 failures
+ You'll get something like this in random order:
 
- ## 👷 Requirements
+ ```bash
+ Performance test with embedding size: 768
+ Embedding creation (10000 times): 19 ms
+ Cosine similarity (10000 times): 27 ms
+ RSS: 132.3 MB
 
- - Ruby >= 3.3
- - langchainrb (for embedding)
- - sqlite3 (for storage)
- - A working C compiler
+ Memory usage test with embedding size: 768
+ Memory usage delta: 3.72 MB for 10000 embeddings
 
- ## 📑 Notes
 
- - Always create embeddings with .from_array
- - All memory management is idiomatic and safe
- - For millions of vectors, consider vector DBs (Faiss, sqlite-vss, etc.)
+ Performance test with embedding size: 2048
+ Embedding creation (10000 times): 69 ms
+ Cosine similarity (10000 times): 73 ms
+ RSS: 170.08 MB
+
+ Memory usage test with embedding size: 2048
+ Memory usage delta: 25.11 MB for 10000 embeddings
+
+
+ Performance test with embedding size: 3072
+ Embedding creation (10000 times): 98 ms
+ Cosine similarity (10000 times): 112 ms
+ RSS: 232.97 MB
+
+ Memory usage test with embedding size: 3072
+ Memory usage delta: 60.5 MB for 10000 embeddings
+
+
+ Performance test with embedding size: 4096
+ Embedding creation (10000 times): 96 ms
+ Cosine similarity (10000 times): 140 ms
+ RSS: 275.2 MB
+
+ Memory usage test with embedding size: 4096
+ Memory usage delta: 92.41 MB for 10000 embeddings
+ ```
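
For readers who want to reproduce timings of this shape without the spec file (which is not part of this diff), here is a minimal sketch using Ruby's standard `Benchmark` module. The `dot` method is a pure-Ruby stand-in chosen for this example, not the gem's C code, so absolute numbers will differ from the output above.

```ruby
require "benchmark"

# Pure-Ruby stand-in for the operation being timed (assumption for this sketch).
def dot(a, b)
  a.zip(b).sum { |x, y| x * y }
end

dim = 768
a = Array.new(dim) { rand }
b = Array.new(dim) { rand }

# Benchmark.realtime returns the elapsed wall-clock time in seconds as a Float.
elapsed = Benchmark.realtime do
  1_000.times { dot(a, b) }
end

puts "Dot product (1000 times, size #{dim}): #{(elapsed * 1000).round} ms"
```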
 
  ## 📬 Contact & Issues
  Open an issue or contact the maintainer for questions, suggestions, or bugs.
data/Rakefile CHANGED
@@ -1,5 +1,8 @@
  task :compile do
    Dir.chdir("ext/rag_embeddings") do
+     # Delete embedding.so or embedding.o
+     # Delete embedding.bundle and the folder embedding.bundle.*
+     FileUtils.rm_rf(Dir["embedding.so", "embedding.o", "embedding.bundle", "embedding.bundle.*"])
      ruby "extconf.rb"
      system("make")
    end
@@ -1,3 +1,3 @@
  module RagEmbeddings
-   VERSION = "0.2.0"
+   VERSION = "0.2.2".freeze
  end
@@ -1,4 +1,8 @@
  require_relative "rag_embeddings/version"
  require_relative "rag_embeddings/engine"
  require_relative "rag_embeddings/database"
- require_relative "../ext/rag_embeddings/embedding" # Loads the compiled C extension
+
+ # Loads the compiled C extension
+ require "rag_embeddings/embedding"
+
+ require "faraday"
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: rag_embeddings
  version: !ruby/object:Gem::Version
-   version: 0.2.0
+   version: 0.2.2
  platform: ruby
  authors:
  - Marco Mastrodonato
@@ -51,6 +51,76 @@ dependencies:
      - - ">="
        - !ruby/object:Gem::Version
          version: '0'
+ - !ruby/object:Gem::Dependency
+   name: rake
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+ - !ruby/object:Gem::Dependency
+   name: rspec
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+ - !ruby/object:Gem::Dependency
+   name: rubocop
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+ - !ruby/object:Gem::Dependency
+   name: dotenv
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+ - !ruby/object:Gem::Dependency
+   name: debug
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
  description: Manage AI vector embeddings in C with Ruby integration
  email:
  - m.mastrodonato@gmail.com
@@ -77,6 +147,7 @@ metadata:
  rdoc_options: []
  require_paths:
  - lib
+ - ext
  required_ruby_version: !ruby/object:Gem::Requirement
    requirements:
    - - ">="