vectra-client 1.1.0 → 1.1.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 70730299dab8475b05688f7017dccd1965b87a49589c96be7633a356af8d57f2
4
- data.tar.gz: 14999ebde62586578b444ba45d396cb38f1a4dd5dcbf2fd126e042ebb1c6fae5
3
+ metadata.gz: 6f59705f200c8a164cc9303e6761776af49e4a81a88f101eae2e371eccca8807
4
+ data.tar.gz: 217dd74d00151f3ba6e94ed80718999656f9616b84e843dd937197cee612540a
5
5
  SHA512:
6
- metadata.gz: 69c8fa722ee4abfe3ddf6b19f8bd12f46d09edefee50612021142c3951600666ed854018e8a5c1bc33249895fec42ae18ab1d821beffa989d2700da99c28bcf4
7
- data.tar.gz: f27a0df4bbcf618659297b1376ebe977588d32d5dfef76fe7c8f5f38c8780b37e055a0e8d91f6bc3ae8eb7cad88606c760d4ed5882c61756ac1495dedfc339a5
6
+ metadata.gz: 8e62f19e82dfb88a14ae50f7cee6afdd8d058ece9972de8f447ba3cc881420d401cfd0d3a093d299e8273795958085349a8d3e6416ecd429f50ea2a05103dea3
7
+ data.tar.gz: ace8ecc519dc588f917e6b29721ed9cfa3ff9b3f3fd7b3e82911233ed7d6e59a9f443e81ae9c0d20dd4891f904d8497cb506064964b00b3dc7587f12137e12bf
data/CHANGELOG.md CHANGED
@@ -1,5 +1,29 @@
1
1
  # Changelog
2
2
 
3
+ ## [v1.1.1](https://github.com/stokry/vectra/tree/v1.1.1) (2026-01-15)
4
+
5
+ [Full Changelog](https://github.com/stokry/vectra/compare/v1.1.0...v1.1.1)
6
+
7
+ ### Added
8
+ - **Text Search Support** - New `text_search` method for keyword-only search without requiring embeddings
9
+ - Qdrant: BM25 text search
10
+ - Weaviate: BM25 text search via GraphQL
11
+ - pgvector: PostgreSQL full-text search (`to_tsvector`, `plainto_tsquery`, `ts_rank`)
12
+ - Memory: Simple keyword matching (for testing)
13
+ - Raises `UnsupportedFeatureError` for Pinecone (use sparse vectors instead)
14
+ - **Documentation Search** - Added search functionality to documentation site with `simple-jekyll-search`
15
+ - Client-side search with fuzzy matching
16
+ - Search index auto-generated from all documentation pages
17
+ - Responsive search UI in navigation
18
+
19
+ ### Changed
20
+ - Updated API documentation to include `text_search` method in overview and cheatsheet
21
+ - Enhanced documentation with text search examples and use cases
22
+
23
+ ## [v1.1.0](https://github.com/stokry/vectra/tree/v1.1.0) (2026-01-15)
24
+
25
+ [Full Changelog](https://github.com/stokry/vectra/compare/v1.0.8...v1.1.0)
26
+
3
27
  ## [v1.0.8](https://github.com/stokry/vectra/tree/v1.0.8) (2026-01-14)
4
28
 
5
29
  [Full Changelog](https://github.com/stokry/vectra/compare/v1.0.7...v1.0.8)
data/README.md CHANGED
@@ -109,6 +109,14 @@ results = client.hybrid_search(
109
109
  text: 'ruby programming',
110
110
  alpha: 0.7 # 70% semantic, 30% keyword
111
111
  )
112
+
113
+ # Text-only search (keyword search without embeddings)
114
+ # Supported by: Qdrant, Weaviate, pgvector
115
+ results = client.text_search(
116
+ index: 'products',
117
+ text: 'iPhone 15 Pro',
118
+ top_k: 10
119
+ )
112
120
  ```
113
121
 
114
122
  ## Provider Examples
@@ -218,6 +226,8 @@ This will:
218
226
  - **Update the model** to include `ProductVector`
219
227
  - **Append to `config/vectra.yml`** with index metadata (no API keys)
220
228
 
229
+ When `config/vectra.yml` contains exactly one entry, a plain `Vectra::Client.new` in that Rails app will automatically use that entry's `index` (and `namespace` if present) as its defaults, so you can usually omit `index:` when calling `upsert` / `query` / `text_search`.
230
+
221
231
  ### Complete Rails Guide
222
232
 
223
233
  For a complete step-by-step guide including:
@@ -307,6 +317,20 @@ Vectra includes 7 production-ready patterns out of the box:
307
317
  - **Health Checks** - `healthy?`, `ping`, and `health_check` methods
308
318
  - **Instrumentation** - Datadog, New Relic, Sentry, Honeybadger support
309
319
 
320
+ ## Roadmap
321
+
322
+ High-level roadmap for `vectra-client`:
323
+
324
+ - **1.x (near term)**
325
+ - Reranking middleware built on top of the existing Rack-style middleware stack.
326
+ - Additional middleware building blocks (sampling, tracing, score normalization).
327
+ - Smoother Rails UX for multi-tenant setups and larger demos (e‑commerce, RAG, recommendations).
328
+ - **Mid term**
329
+ - Additional providers where it makes sense and stays maintainable.
330
+ - Deeper documentation and recipes around reranking and hybrid search.
331
+
332
+ For a more detailed, always-up-to-date version, see the online roadmap: https://vectra-docs.netlify.app/guides/roadmap/
333
+
310
334
  ## Development
311
335
 
312
336
  ```bash
@@ -55,7 +55,7 @@
55
55
  <!-- Hero Section -->
56
56
  <section class="tma-hero">
57
57
  <div class="tma-hero__container">
58
- <span class="tma-hero__badge">v1.1.0 — Hybrid Search, Rails Generator & Middleware</span>
58
+ <span class="tma-hero__badge">v1.1.1 — Hybrid Search, Rails Generator, Middleware & Text Search</span>
59
59
  <h1 class="tma-hero__title">
60
60
  Vector Databases,<br>
61
61
  <span class="tma-hero__title-gradient">Unified for Ruby.</span>
@@ -39,6 +39,12 @@
39
39
  <span class="tma-nav__toggle-line"></span>
40
40
  </button>
41
41
  <ul class="tma-nav__menu" id="nav-menu">
42
+ <li class="tma-nav__search-wrapper">
43
+ <div class="tma-search">
44
+ <input type="search" id="search-input" class="tma-search__input" placeholder="Search docs..." aria-label="Search documentation">
45
+ <div id="search-results" class="tma-search__results"></div>
46
+ </div>
47
+ </li>
42
48
  <li><a href="{{ site.baseurl }}/guides/getting-started" class="tma-nav__link">Getting Started</a></li>
43
49
  <li><a href="{{ site.baseurl }}/guides/recipes" class="tma-nav__link">Recipes</a></li>
44
50
  <li><a href="{{ site.baseurl }}/providers" class="tma-nav__link">Providers</a></li>
@@ -91,6 +97,13 @@
91
97
  <li><a href="https://github.com/stokry/vectra/issues" class="tma-sidebar__link" target="_blank">Report Issue ↗</a></li>
92
98
  </ul>
93
99
  </div>
100
+
101
+ <div class="tma-sidebar__section">
102
+ <h3 class="tma-sidebar__title">Resources</h3>
103
+ <ul class="tma-sidebar__list">
104
+ <li><a href="{{ site.baseurl }}/guides/roadmap" class="tma-sidebar__link {% if page.url == '/guides/roadmap/' %}tma-sidebar__link--active{% endif %}">Roadmap</a></li>
105
+ </ul>
106
+ </div>
94
107
  </aside>
95
108
 
96
109
  <!-- Main Content -->
@@ -115,6 +128,9 @@
115
128
  </div>
116
129
  </footer>
117
130
 
131
+ <!-- Simple Jekyll Search -->
132
+ <script src="https://cdn.jsdelivr.net/npm/simple-jekyll-search@1.10.0/dest/simple-jekyll-search.min.js"></script>
133
+
118
134
  <script>
119
135
  document.addEventListener('DOMContentLoaded', function() {
120
136
  const navToggle = document.querySelector('.tma-nav__toggle');
@@ -148,6 +164,19 @@
148
164
  }
149
165
  });
150
166
  }
167
+
168
+ // Initialize search
169
+ if (typeof SimpleJekyllSearch !== 'undefined') {
170
+ SimpleJekyllSearch({
171
+ searchInput: document.getElementById('search-input'),
172
+ resultsContainer: document.getElementById('search-results'),
173
+ json: '{{ site.baseurl }}/search.json',
174
+ searchResultTemplate: '<div class="tma-search__result"><a href="{url}" class="tma-search__result-link"><span class="tma-search__result-title">{title}</span><span class="tma-search__result-excerpt">{excerpt}</span></a></div>',
175
+ noResultsText: '<div class="tma-search__no-results">No results found</div>',
176
+ fuzzy: true,
177
+ limit: 10
178
+ });
179
+ }
151
180
  });
152
181
  </script>
153
182
  </body>
@@ -49,6 +49,8 @@ client.upsert(vectors: [...])
49
49
  client.query(vector: query_embedding, top_k: 10)
50
50
  ```
51
51
 
52
+ In a Rails app with `config/vectra.yml` generated by `rails generate vectra:index`, if that YAML file contains only one entry, `Vectra::Client.new` will automatically use that entry's `index` (and `namespace`, if present) as defaults.
53
+
52
54
  ### Upsert
53
55
 
54
56
  ```ruby
@@ -98,6 +100,23 @@ results = client.hybrid_search(
98
100
 
99
101
  Supported providers: Qdrant ✅, Weaviate ✅, pgvector ✅, Pinecone ⚠️
100
102
 
103
+ ### Text Search (keyword-only, no embeddings)
104
+
105
+ ```ruby
106
+ results = client.text_search(
107
+ index: 'products',
108
+ text: 'iPhone 15 Pro',
109
+ top_k: 10,
110
+ filter: { category: 'electronics' }
111
+ )
112
+
113
+ results.each do |match|
114
+ puts "#{match.id} (score=#{match.score.round(3)}): #{match.metadata['title']}"
115
+ end
116
+ ```
117
+
118
+ Supported providers: Qdrant ✅ (BM25), Weaviate ✅ (BM25), pgvector ✅ (PostgreSQL full-text)
119
+
101
120
  ### Fetch
102
121
 
103
122
  ```ruby
@@ -176,6 +195,14 @@ else
176
195
  end
177
196
  ```
178
197
 
198
+ For fast health checks you can temporarily lower the timeout:
199
+
200
+ ```ruby
201
+ status = client.with_timeout(0.5) do |c|
202
+ c.ping
203
+ end
204
+ ```
205
+
179
206
  ### Ping (with latency)
180
207
 
181
208
  ```ruby
data/docs/api/methods.md CHANGED
@@ -36,6 +36,8 @@ client = Vectra::Client.new(
36
36
  )
37
37
  ```
38
38
 
39
+ In a Rails app that uses the `vectra:index` generator, if `config/vectra.yml` contains exactly one entry, `Vectra::Client.new` will automatically use that entry's `index` (and `namespace` if present) as its defaults. This allows you to omit `index:` in most calls (`upsert`, `query`, `text_search`, etc.).
40
+
39
41
  ---
40
42
 
41
43
  ### `client.upsert(index:, vectors:, namespace: nil)`
@@ -147,6 +149,51 @@ results = client.hybrid_search(
147
149
 
148
150
  ---
149
151
 
152
+ ### `client.text_search(index:, text:, top_k: 10, namespace: nil, filter: nil, include_values: false, include_metadata: true)`
153
+
154
+ Text-only search (keyword search without requiring embeddings).
155
+
156
+ **Parameters:**
157
+ - `index` (String) - Index/collection name (uses client's default index when omitted)
158
+ - `text` (String) - Text query for keyword search
159
+ - `top_k` (Integer) - Number of results (default: 10)
160
+ - `namespace` (String, optional) - Namespace
161
+ - `filter` (Hash, optional) - Metadata filter
162
+ - `include_values` (Boolean) - Include vector values (default: false)
163
+ - `include_metadata` (Boolean) - Include metadata (default: true)
164
+
165
+ **Returns:** `Vectra::QueryResult`
166
+
167
+ **Provider Support:**
168
+ - ✅ Qdrant (BM25)
169
+ - ✅ Weaviate (BM25)
170
+ - ✅ pgvector (PostgreSQL full-text search)
171
+ - ✅ Memory (simple keyword matching - for testing only)
172
+ - ❌ Pinecone (not supported - use sparse vectors instead)
173
+
174
+ **Example:**
175
+ ```ruby
176
+ # Keyword search for exact matches
177
+ results = client.text_search(
178
+ index: 'products',
179
+ text: 'iPhone 15 Pro',
180
+ top_k: 10,
181
+ filter: { category: 'electronics' }
182
+ )
183
+
184
+ results.each do |match|
185
+ puts "#{match.id}: #{match.score} - #{match.metadata['title']}"
186
+ end
187
+ ```
188
+
189
+ **Use Cases:**
190
+ - Product name search (exact matches)
191
+ - Function/class name search in documentation
192
+ - Keyword-based filtering when semantic search is not needed
193
+ - Faster search when embeddings are not available
194
+
195
+ ---
196
+
150
197
  ### `client.fetch(index:, ids:, namespace: nil)`
151
198
 
152
199
  Fetch vectors by their IDs.
@@ -357,6 +404,28 @@ puts "Latency: #{status[:latency_ms]}ms"
357
404
 
358
405
  ---
359
406
 
407
+ ### `client.with_timeout(seconds) { ... }`
408
+
409
+ Temporarily override the client's request timeout inside a block.
410
+
411
+ **Parameters:**
412
+ - `seconds` (Float) - Temporary timeout in seconds
413
+
414
+ **Returns:** Block result
415
+
416
+ **Example (fast health check in Rails controller):**
417
+ ```ruby
418
+ status = client.with_timeout(0.5) do |c|
419
+ c.ping
420
+ end
421
+
422
+ render json: status, status: status[:healthy] ? :ok : :service_unavailable
423
+ ```
424
+
425
+ After the block finishes (even if it raises), the previous `config.timeout` value is restored.
426
+
427
+ ---
428
+
360
429
  ### `client.health_check`
361
430
 
362
431
  Detailed health check with provider-specific information.
data/docs/api/overview.md CHANGED
@@ -10,15 +10,15 @@ permalink: /api/overview/
10
10
 
11
11
  ```ruby
12
12
  client = Vectra::Client.new(
13
- provider: :pinecone, # Required: :pinecone, :qdrant, :weaviate, :pgvector
13
+ provider: :pinecone, # Required: :pinecone, :qdrant, :weaviate, :pgvector, :memory
14
14
  api_key: 'your-api-key', # Required for cloud providers
15
- index_name: 'my-index', # Optional, provider-dependent
16
- host: 'localhost', # For self-hosted providers
17
- port: 6333, # For self-hosted providers
18
- environment: 'us-west-4' # For Pinecone
15
+ index: 'my-index', # Optional default index
16
+ namespace: 'tenant-1' # Optional default namespace
19
17
  )
20
18
  ```
21
19
 
20
+ In Rails, if you use the `vectra:index` generator and `config/vectra.yml` contains exactly one entry, a plain `Vectra::Client.new` will automatically pick that entry's `index` (and `namespace` if present) as defaults.
21
+
22
22
  ## Core Methods
23
23
 
24
24
  ### `upsert(vectors:)`
@@ -138,6 +138,31 @@ results = client.hybrid_search(
138
138
 
139
139
  **Provider Support:** Qdrant ✅, Weaviate ✅, pgvector ✅, Pinecone ⚠️
140
140
 
141
+ ### `text_search(index:, text:, top_k:)`
142
+
143
+ Text-only search (keyword search without requiring embeddings).
144
+
145
+ **Parameters:**
146
+ - `index` (String) - Index/collection name (uses client's default index when omitted)
147
+ - `text` (String) - Text query for keyword search
148
+ - `top_k` (Integer) - Number of results (default: 10)
149
+ - `namespace` (String, optional) - Namespace
150
+ - `filter` (Hash, optional) - Metadata filter
151
+ - `include_values` (Boolean) - Include vector values (default: false)
152
+ - `include_metadata` (Boolean) - Include metadata (default: true)
153
+
154
+ **Example:**
155
+ ```ruby
156
+ results = client.text_search(
157
+ index: 'products',
158
+ text: 'iPhone 15 Pro',
159
+ top_k: 10,
160
+ filter: { category: 'electronics' }
161
+ )
162
+ ```
163
+
164
+ **Provider Support:** Qdrant ✅ (BM25), Weaviate ✅ (BM25), pgvector ✅ (PostgreSQL full-text), Memory ✅, Pinecone ❌
165
+
141
166
  ### `healthy?`
142
167
 
143
168
  Quick health check - returns true if provider connection is healthy.
@@ -151,6 +176,12 @@ if client.healthy?
151
176
  end
152
177
  ```
153
178
 
179
+ You can also run faster checks with a temporary timeout:
180
+
181
+ ```ruby
182
+ fast_ok = client.with_timeout(0.5) { |c| c.healthy? }
183
+ ```
184
+
154
185
  ### `ping`
155
186
 
156
187
  Ping provider and get connection health status with latency.
@@ -165,6 +165,118 @@ body {
165
165
  border-color: var(--tma-color-border-hover);
166
166
  }
167
167
 
168
+ .tma-nav__search-wrapper {
169
+ position: relative;
170
+ margin-right: var(--tma-spacing-md);
171
+ }
172
+
173
+ .tma-search {
174
+ position: relative;
175
+ }
176
+
177
+ .tma-search__input {
178
+ width: 240px;
179
+ padding: var(--tma-spacing-sm) var(--tma-spacing-md);
180
+ padding-left: 2.5rem;
181
+ background: var(--tma-color-bg-tertiary);
182
+ border: 1px solid var(--tma-color-border);
183
+ border-radius: var(--tma-radius-sm);
184
+ color: var(--tma-color-text-primary);
185
+ font-size: 0.9rem;
186
+ font-family: var(--tma-font-family-primary);
187
+ transition: all var(--tma-transition-fast);
188
+ outline: none;
189
+ }
190
+
191
+ .tma-search__input::placeholder {
192
+ color: var(--tma-color-text-muted);
193
+ }
194
+
195
+ .tma-search__input:focus {
196
+ width: 320px;
197
+ border-color: var(--tma-color-accent-primary);
198
+ background: var(--tma-color-bg-elevated);
199
+ box-shadow: 0 0 0 3px var(--tma-color-accent-muted);
200
+ }
201
+
202
+ .tma-search {
203
+ position: relative;
204
+ }
205
+
206
+ .tma-search::before {
207
+ content: '🔍';
208
+ position: absolute;
209
+ left: var(--tma-spacing-md);
210
+ top: 50%;
211
+ transform: translateY(-50%);
212
+ color: var(--tma-color-text-muted);
213
+ pointer-events: none;
214
+ z-index: 1;
215
+ font-size: 0.9rem;
216
+ }
217
+
218
+ .tma-search__results {
219
+ position: absolute;
220
+ top: calc(100% + var(--tma-spacing-xs));
221
+ left: 0;
222
+ right: 0;
223
+ max-width: 500px;
224
+ max-height: 400px;
225
+ overflow-y: auto;
226
+ background: var(--tma-color-bg-elevated);
227
+ border: 1px solid var(--tma-color-border);
228
+ border-radius: var(--tma-radius-md);
229
+ box-shadow: var(--tma-shadow-lg);
230
+ z-index: 1000;
231
+ display: none;
232
+ }
233
+
234
+ .tma-search__results:not(:empty) {
235
+ display: block;
236
+ }
237
+
238
+ .tma-search__result {
239
+ border-bottom: 1px solid var(--tma-color-border);
240
+ }
241
+
242
+ .tma-search__result:last-child {
243
+ border-bottom: none;
244
+ }
245
+
246
+ .tma-search__result-link {
247
+ display: block;
248
+ padding: var(--tma-spacing-md);
249
+ text-decoration: none;
250
+ color: var(--tma-color-text-primary);
251
+ transition: background var(--tma-transition-fast);
252
+ }
253
+
254
+ .tma-search__result-link:hover {
255
+ background: var(--tma-color-bg-hover);
256
+ }
257
+
258
+ .tma-search__result-title {
259
+ display: block;
260
+ font-weight: 600;
261
+ font-size: 0.95rem;
262
+ color: var(--tma-color-text-primary);
263
+ margin-bottom: var(--tma-spacing-xs);
264
+ }
265
+
266
+ .tma-search__result-excerpt {
267
+ display: block;
268
+ font-size: 0.85rem;
269
+ color: var(--tma-color-text-secondary);
270
+ line-height: 1.5;
271
+ }
272
+
273
+ .tma-search__no-results {
274
+ padding: var(--tma-spacing-lg);
275
+ text-align: center;
276
+ color: var(--tma-color-text-muted);
277
+ font-size: 0.9rem;
278
+ }
279
+
168
280
  .tma-nav__toggle {
169
281
  display: none;
170
282
  flex-direction: column;
@@ -1261,6 +1373,25 @@ code {
1261
1373
  justify-content: center;
1262
1374
  }
1263
1375
 
1376
+ .tma-nav__search-wrapper {
1377
+ width: 100%;
1378
+ margin-right: 0;
1379
+ margin-bottom: var(--tma-spacing-md);
1380
+ }
1381
+
1382
+ .tma-search__input {
1383
+ width: 100%;
1384
+ }
1385
+
1386
+ .tma-search__input:focus {
1387
+ width: 100%;
1388
+ }
1389
+
1390
+ .tma-search__results {
1391
+ max-width: 100%;
1392
+ right: 0;
1393
+ }
1394
+
1264
1395
  .tma-features,
1265
1396
  .tma-providers {
1266
1397
  padding: var(--tma-spacing-xl) var(--tma-spacing-md);
@@ -99,6 +99,8 @@ This will:
99
99
  - Update `app/models/product.rb` to include the concern
100
100
  - Add configuration to `config/vectra.yml`
101
101
 
102
+ When `config/vectra.yml` contains exactly one entry, a plain `Vectra::Client.new` in this Rails app will automatically use that entry's `index` (and `namespace` if present) as its defaults. That means you can usually omit `index:` when calling `upsert`, `query`, `hybrid_search`, or `text_search`.
103
+
102
104
  ### Run Migrations
103
105
 
104
106
  ```bash
@@ -0,0 +1,53 @@
1
+ ---
2
+ layout: page
3
+ title: Roadmap
4
+ permalink: /guides/roadmap/
5
+ ---
6
+
7
+ # Vectra Roadmap
8
+
9
+ This page outlines the high-level roadmap for **vectra-client**, the unified Ruby client for vector databases.
10
+
11
+ The roadmap is intentionally focused on **production features** that make AI workloads reliable, observable, and easy to operate in Ruby.
12
+
13
+ ## Near Term (1.x)
14
+
15
+ - **Reranking middleware**
16
+ - Middleware that can call external rerankers (e.g., Cohere, Jina, custom HTTP) and reorder search results after a `query`.
17
+ - Pluggable providers, configurable `top_n`, and safe fallbacks when reranking fails.
18
+ - **More middleware building blocks**
19
+ - Request sampling / tracing for debugging complex production issues.
20
+ - Response shaping (e.g., score normalization, custom thresholds) as reusable middleware.
21
+ - **Rails UX improvements**
22
+ - Convenience generators and helpers for multi-tenant setups.
23
+ - Better defaults and examples for 1k+ records demos (e‑commerce, blogs, RAG, recommendations).
24
+
25
+ ## Mid Term
26
+
27
+ - **Additional providers**
28
+ - Support for more hosted / self-hosted vector solutions where it makes sense and stays maintainable.
29
+ - **First-class reranking guides**
30
+ - End-to-end documentation for combining vectra-client with external LLMs / rerankers.
31
+ - **More recipes & patterns**
32
+ - Deeper recipes for analytics, recommendations, and hybrid search in large Rails apps.
33
+
34
+ ## Long Term Vision
35
+
36
+ Keep **vectra-client** the most **production-ready Ruby toolkit** for vector databases:
37
+
38
+ - Strong guarantees around retries, circuit breakers, and backpressure.
39
+ - Excellent observability out of the box.
40
+ - Stable, provider-agnostic API that lets you change infra without rewriting your app.
41
+
42
+ If you have ideas or needs that fit this direction, please open an issue on GitHub so we can prioritise the roadmap around real-world use cases.
43
+
44
+ {
45
+ "cells": [],
46
+ "metadata": {
47
+ "language_info": {
48
+ "name": "python"
49
+ }
50
+ },
51
+ "nbformat": 4,
52
+ "nbformat_minor": 2
53
+ }
data/docs/search.json ADDED
@@ -0,0 +1,26 @@
1
+ ---
2
+ layout: null
3
+ ---
4
+ [
5
+ {% assign first = true %}
6
+ {% for page in site.pages %}
7
+ {% unless page.url == '/' or page.url == '/search.json' or page.url contains '/assets/' or page.url contains '/404' or page.url contains '/feed' or page.url contains '/sitemap' or page.url contains '/robots' or page.url contains '/index.html' or page.url contains '/index.md' %}
8
+ {% unless first %},{% endunless %}
9
+ {
10
+ "title": {{ page.title | default: page.url | jsonify }},
11
+ "url": {{ page.url | jsonify }},
12
+ "excerpt": {{ page.content | strip_html | truncatewords: 30 | default: "" | jsonify }}
13
+ }
14
+ {% assign first = false %}
15
+ {% endunless %}
16
+ {% endfor %}
17
+ {% for post in site.posts %}
18
+ {% unless first %},{% endunless %}
19
+ {
20
+ "title": {{ post.title | jsonify }},
21
+ "url": {{ post.url | jsonify }},
22
+ "excerpt": {{ post.content | strip_html | truncatewords: 30 | default: "" | jsonify }}
23
+ }
24
+ {% assign first = false %}
25
+ {% endfor %}
26
+ ]
data/lib/vectra/client.rb CHANGED
@@ -1,5 +1,7 @@
1
1
  # frozen_string_literal: true
2
2
 
3
+ require "yaml"
4
+
3
5
  # Ensure HealthCheck is loaded before Client
4
6
  require_relative "health_check" unless defined?(Vectra::HealthCheck)
5
7
  require_relative "configuration" unless defined?(Vectra::Configuration)
@@ -89,6 +91,7 @@ module Vectra
89
91
  @provider = build_provider
90
92
  @default_index = options[:index]
91
93
  @default_namespace = options[:namespace]
94
+ apply_rails_vectra_defaults! if @default_index.nil? && @default_namespace.nil?
92
95
  @middleware = build_middleware_stack(options[:middleware])
93
96
  end
94
97
 
@@ -494,6 +497,67 @@ module Vectra
494
497
  )
495
498
  end
496
499
 
500
+ # Text-only search (keyword search without embeddings)
501
+ #
502
+ # Performs keyword/text search without requiring vector embeddings.
503
+ # Useful for exact matches, product names, function names, etc.
504
+ #
505
+ # @param index [String] the index/collection name
506
+ # @param text [String] text query for keyword search
507
+ # @param top_k [Integer] number of results to return (default: 10)
508
+ # @param namespace [String, nil] optional namespace
509
+ # @param filter [Hash, nil] metadata filter
510
+ # @param include_values [Boolean] include vector values in results
511
+ # @param include_metadata [Boolean] include metadata in results
512
+ # @return [QueryResult] search results
513
+ #
514
+ # @example Basic text search
515
+ # results = client.text_search(
516
+ # index: 'products',
517
+ # text: 'iPhone 15 Pro',
518
+ # top_k: 10
519
+ # )
520
+ #
521
+ # @example Text search with filter
522
+ # results = client.text_search(
523
+ # index: 'products',
524
+ # text: 'laptop',
525
+ # filter: { category: 'electronics', in_stock: true }
526
+ # )
527
+ #
528
+ # @raise [UnsupportedFeatureError] if provider doesn't support text search
529
+ def text_search(index:, text:, top_k: 10, namespace: nil, filter: nil,
530
+ include_values: false, include_metadata: true)
531
+ index ||= default_index
532
+ namespace ||= default_namespace
533
+ validate_index!(index)
534
+ raise ValidationError, "Text query cannot be nil or empty" if text.nil? || text.empty?
535
+
536
+ unless provider.respond_to?(:text_search)
537
+ raise UnsupportedFeatureError,
538
+ "Text search is not supported by #{provider_name} provider"
539
+ end
540
+
541
+ Instrumentation.instrument(
542
+ operation: :text_search,
543
+ provider: provider_name,
544
+ index: index,
545
+ metadata: { top_k: top_k }
546
+ ) do
547
+ @middleware.call(
548
+ :text_search,
549
+ index: index,
550
+ text: text,
551
+ top_k: top_k,
552
+ namespace: namespace,
553
+ filter: filter,
554
+ include_values: include_values,
555
+ include_metadata: include_metadata,
556
+ provider: provider_name
557
+ )
558
+ end
559
+ end
560
+
497
561
  # Get the provider name
498
562
  #
499
563
  # @return [Symbol]
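Reviewer note on the `text_search` hunk above: the method declares `index:` as a required keyword and then runs `index ||= default_index`. In plain Ruby, leaving out a required keyword raises `ArgumentError` before the method body executes, so the fallback can only apply when the caller passes `index: nil` explicitly. The sketch below illustrates this with a hypothetical `KeywordSketchClient` class (not the gem's `Vectra::Client`); the `index: nil` variant is one possible adjustment, not something the diff contains.

```ruby
# Hypothetical stand-in for the client; names here are illustrative only.
class KeywordSketchClient
  def initialize(default_index: nil)
    @default_index = default_index
  end

  # Mirrors the signature shape in the hunk: `index:` is a required keyword.
  def search_required(index:, text:)
    index ||= @default_index
    [index, text]
  end

  # Variant with an optional keyword, so the default can take effect
  # when the caller omits `index:` entirely.
  def search_optional(text:, index: nil)
    index ||= @default_index
    [index, text]
  end
end

client = KeywordSketchClient.new(default_index: "products")

begin
  client.search_required(text: "laptop")
rescue ArgumentError => e
  puts "required keyword: #{e.message}"
end

# Passing nil explicitly reaches the fallback even with a required keyword.
p client.search_required(index: nil, text: "laptop")

# With an optional keyword the call site can drop `index:` altogether.
p client.search_optional(text: "laptop")
```

This matters for the documentation lines in this diff that say `index:` can usually be omitted; whether other methods such as `upsert` and `query` already use an optional keyword is not visible in these hunks.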
@@ -684,6 +748,44 @@ module Vectra
684
748
  Middleware::Stack.new(@provider, all_middleware)
685
749
  end
686
750
 
751
+ def apply_rails_vectra_defaults!
752
+ return unless rails_root_available?
753
+
754
+ entry = load_single_vectra_entry
755
+ return unless entry
756
+
757
+ apply_vectra_defaults_from(entry)
758
+ rescue StandardError => e
759
+ log_error("Failed to infer default index/namespace from config/vectra.yml", e)
760
+ end
761
+
762
+ def rails_root_available?
763
+ defined?(Rails) && Rails.respond_to?(:root) && Rails.root
764
+ end
765
+
766
+ def vectra_config_path
767
+ File.join(Rails.root.to_s, "config", "vectra.yml")
768
+ end
769
+
770
+ def load_single_vectra_entry
771
+ path = vectra_config_path
772
+ return unless File.exist?(path)
773
+
774
+ raw = File.read(path)
775
+ data = YAML.safe_load(raw, permitted_classes: [], aliases: true) || {}
776
+ return unless data.is_a?(Hash) && data.size == 1
777
+
778
+ data.values.first || {}
779
+ end
780
+
781
+ def apply_vectra_defaults_from(entry)
782
+ index = entry["index"] || entry[:index]
783
+ namespace = entry["namespace"] || entry[:namespace]
784
+
785
+ @default_index = index if @default_index.nil? && index.is_a?(String) && !index.empty?
786
+ @default_namespace = namespace if @default_namespace.nil? && namespace.is_a?(String) && !namespace.empty?
787
+ end
788
+
687
789
  def validate_index!(index)
688
790
  raise ValidationError, "Index name cannot be nil" if index.nil?
689
791
  raise ValidationError, "Index name must be a string" unless index.is_a?(String)
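Reviewer note on `load_single_vectra_entry` and `apply_vectra_defaults_from`: the defaults are applied only when the parsed YAML is a Hash with exactly one top-level key. A minimal standalone sketch of that rule follows. The YAML layout (a top-level name mapping to `index` and `namespace`) is an assumption for illustration, since the generated `config/vectra.yml` is not shown in this diff, and the helper names `single_entry` and `defaults_from` are hypothetical.

```ruby
require "yaml"

# Returns the single top-level entry of a YAML document, or nil when the
# document does not contain exactly one entry.
def single_entry(raw_yaml)
  data = YAML.safe_load(raw_yaml, permitted_classes: [], aliases: true) || {}
  return unless data.is_a?(Hash) && data.size == 1

  data.values.first || {}
end

# Picks index/namespace from an entry, accepting only non-empty strings.
def defaults_from(entry)
  index = entry["index"]
  namespace = entry["namespace"]
  {
    index: (index if index.is_a?(String) && !index.empty?),
    namespace: (namespace if namespace.is_a?(String) && !namespace.empty?)
  }
end

# Assumed file shape: one named entry.
one_entry = <<~YAML
  products:
    index: products
    namespace: tenant-1
YAML

# Two entries: ambiguous, so no defaults are inferred.
two_entries = <<~YAML
  products:
    index: products
  articles:
    index: articles
YAML

p defaults_from(single_entry(one_entry))
p single_entry(two_entries)
```

Note that in the hunk at the top of `client.rb` the inference runs only when neither `index:` nor `namespace:` was passed to the constructor, so an explicit option always wins over the file.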
@@ -750,6 +852,22 @@ module Vectra
750
852
  config.logger.debug("[Vectra] #{data.inspect}") if data
751
853
  end
752
854
 
855
+ # Temporarily override request timeout within a block.
856
+ #
857
+ # This updates the client's configuration timeout for the duration
858
+ # of the block and then restores the previous value.
859
+ #
860
+ # @param seconds [Float] temporary timeout in seconds
861
+ # @yield [Client] yields self with overridden timeout
862
+ # @return [Object] block result
863
+ def with_timeout(seconds)
864
+ previous = config.timeout
865
+ config.timeout = seconds
866
+ yield self
867
+ ensure
868
+ config.timeout = previous
869
+ end
870
+
753
871
  # Temporarily override default index within a block.
754
872
  #
755
873
  # @param index [String] temporary index name
@@ -790,7 +908,7 @@ module Vectra
790
908
  end
791
909
  end
792
910
 
793
- public :with_index, :with_namespace, :with_index_and_namespace
911
+ public :with_index, :with_namespace, :with_index_and_namespace, :with_timeout
794
912
  end
795
913
  # rubocop:enable Metrics/ClassLength
796
914
  end
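Reviewer note on `with_timeout`: the method saves `config.timeout`, overrides it, yields, and restores the old value in an `ensure` clause, so the restore also runs when the block raises. The sketch below reproduces that pattern with hypothetical `SketchConfig` and `TimeoutSketchClient` types standing in for the gem's classes.

```ruby
# Hypothetical config object standing in for the client's configuration.
SketchConfig = Struct.new(:timeout)

class TimeoutSketchClient
  attr_reader :config

  def initialize(timeout:)
    @config = SketchConfig.new(timeout)
  end

  # Same shape as the method in the hunk: set, yield, restore in `ensure`.
  def with_timeout(seconds)
    previous = config.timeout
    config.timeout = seconds
    yield self
  ensure
    config.timeout = previous
  end
end

client = TimeoutSketchClient.new(timeout: 30)

inside = client.with_timeout(0.5) { |c| c.config.timeout }
puts "inside block: #{inside}, afterwards: #{client.config.timeout}"

begin
  client.with_timeout(0.5) { raise "provider down" }
rescue RuntimeError
  puts "after error: #{client.config.timeout}"
end
```

Because the override is written to the config object itself rather than passed per request, any other thread using the same config during the block would also see the temporary value. Whether that matters depends on how the client is shared, which this diff does not show.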
@@ -31,7 +31,7 @@ module Vectra
31
31
 
32
32
  # For health checks we bypass client middleware and call the provider
33
33
  # directly to avoid interference from custom stacks.
34
- indexes = with_timeout(timeout) { provider.list_indexes }
34
+ indexes = healthcheck_with_timeout(timeout) { provider.list_indexes }
35
35
  index_name = index || indexes.first&.dig(:name)
36
36
 
37
37
  result = base_result(start_time, indexes)
@@ -53,7 +53,7 @@ module Vectra
53
53
 
54
54
  private
55
55
 
56
- def with_timeout(seconds, &)
56
+ def healthcheck_with_timeout(seconds, &)
57
57
  Timeout.timeout(seconds, &)
58
58
  rescue Timeout::Error
59
59
  raise Vectra::TimeoutError, "Health check timed out after #{seconds}s"
@@ -72,7 +72,7 @@ module Vectra
72
72
  def add_index_stats(result, index_name, include_stats, timeout)
73
73
  return unless include_stats && index_name
74
74
 
75
- stats = with_timeout(timeout) { provider.stats(index: index_name) }
75
+ stats = healthcheck_with_timeout(timeout) { provider.stats(index: index_name) }
76
76
  result[:index] = index_name
77
77
  result[:stats] = {
78
78
  vector_count: stats[:total_vector_count],
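Reviewer note on the rename to `healthcheck_with_timeout`: the health check helper wraps the block in `Timeout.timeout` and converts expiry into `Vectra::TimeoutError`, whereas the new public `Client#with_timeout` only changes `config.timeout` for the duration of a block. The rename presumably keeps the two from colliding under one name, though the diff does not say so. A standalone sketch of the helper's behaviour, with a hypothetical `SketchTimeoutError` in place of the gem's error class:

```ruby
require "timeout"

# Hypothetical error class standing in for Vectra::TimeoutError.
class SketchTimeoutError < StandardError; end

# Runs the block under a deadline and re-raises expiry as a domain error.
def healthcheck_with_timeout(seconds, &block)
  Timeout.timeout(seconds, &block)
rescue Timeout::Error
  raise SketchTimeoutError, "Health check timed out after #{seconds}s"
end

# Fast block: the result is passed through unchanged.
puts healthcheck_with_timeout(1) { :ok }

# Slow block: the deadline expires and the domain error is raised.
begin
  healthcheck_with_timeout(0.05) { sleep 1 }
rescue SketchTimeoutError => e
  puts e.message
end
```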
@@ -55,7 +55,7 @@ module Vectra
55
55
  #
56
56
  # @return [Boolean]
57
57
  def read_operation?
58
- [:query, :fetch, :list_indexes, :describe_index, :stats].include?(operation)
58
+ [:query, :text_search, :hybrid_search, :fetch, :list_indexes, :describe_index, :stats].include?(operation)
59
59
  end
60
60
  end
61
61
  end
@@ -80,6 +80,32 @@ module Vectra
80
80
  QueryResult.from_response(matches: matches, namespace: namespace)
81
81
  end
82
82
 
83
+ # Text-only search using simple keyword matching in metadata
84
+ #
85
+ # For testing purposes only. Performs case-insensitive keyword matching
86
+ # in metadata values. Not a real BM25/full-text search implementation.
87
+ #
88
+ # @param index [String] index name
89
+ # @param text [String] text query for keyword search
90
+ # @param top_k [Integer] number of results
91
+ # @param namespace [String, nil] optional namespace
92
+ # @param filter [Hash, nil] metadata filter
93
+ # @param include_values [Boolean] include vector values
94
+ # @param include_metadata [Boolean] include metadata
95
+ # @return [QueryResult] search results
96
+ def text_search(index:, text:, top_k:, namespace: nil, filter: nil,
97
+ include_values: false, include_metadata: true)
98
+ ns = namespace || ""
99
+ candidates = filter_candidates(@storage[index][ns].values, filter)
100
+ text_lower = text.to_s.downcase
101
+
102
+ matches = find_text_matches(candidates, text_lower, include_values, include_metadata)
103
+ matches = matches.sort_by { |m| -m[:score] }.first(top_k)
104
+
105
+ log_debug("Text search returned #{matches.size} results")
106
+ QueryResult.from_response(matches: matches, namespace: namespace)
107
+ end
108
+
83
109
  # @see Base#fetch
84
110
  def fetch(index:, ids:, namespace: nil)
85
111
  ns = namespace || ""
@@ -293,6 +319,36 @@ module Vectra
         true
       end
       # rubocop:enable Naming/PredicateMethod
+
+      # Filter candidates by metadata filter
+      def filter_candidates(candidates, filter)
+        return candidates unless filter
+
+        candidates.select { |v| matches_filter?(v, filter) }
+      end
+
+      # Find text matches in candidates
+      def find_text_matches(candidates, text_lower, include_values, include_metadata)
+        candidates.map do |vec|
+          metadata_text = build_metadata_text(vec)
+          next unless metadata_text.include?(text_lower)
+
+          score = calculate_text_score(text_lower, metadata_text)
+          build_match(vec, score, include_values, include_metadata)
+        end.compact
+      end
+
+      # Build metadata text string for searching
+      def build_metadata_text(vector)
+        (vector.metadata || {}).values.map(&:to_s).join(" ").downcase
+      end
+
+      # Calculate text match score based on word matches
+      def calculate_text_score(query_text, metadata_text)
+        query_words = query_text.split(/\s+/)
+        matched_words = query_words.count { |word| metadata_text.include?(word) }
+        matched_words.to_f / query_words.size
+      end
     end
   end
 end
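The keyword matching added to the Memory provider above can be tried in isolation. The following is a minimal sketch, not the gem's code: `Doc`, `metadata_text`, `text_score`, and `keyword_search` are hypothetical names, and the empty-query guard is an addition (the diff divides by `query_words.size` directly, which appears to yield `NaN` for an empty or whitespace-only query). Because a candidate is kept only when the whole query is a substring of its metadata text, every word of a non-empty query also matches, so kept candidates score 1.0.

```ruby
# Standalone sketch of the in-memory keyword match; all names here are hypothetical.
Doc = Struct.new(:id, :metadata)

# Join all metadata values into one lowercase string, as build_metadata_text does.
def metadata_text(doc)
  (doc.metadata || {}).values.map(&:to_s).join(" ").downcase
end

# Fraction of query words that occur as substrings of the metadata text.
def text_score(query, text)
  words = query.split(/\s+/)
  return 0.0 if words.empty? # guard added in this sketch; not present in the diff

  words.count { |word| text.include?(word) }.to_f / words.size
end

def keyword_search(docs, query, top_k)
  needle = query.to_s.downcase
  hits = docs.filter_map do |doc|
    text = metadata_text(doc)
    next unless text.include?(needle) # the whole query must appear as a substring

    { id: doc.id, score: text_score(needle, text) }
  end
  hits.sort_by { |hit| -hit[:score] }.first(top_k)
end

docs = [
  Doc.new("a", { title: "Ruby vector search" }),
  Doc.new("b", { title: "Cooking with basil" }),
  Doc.new("c", nil)
]
puts keyword_search(docs, "Vector Search", 5).inspect
```

Only document `a` is returned here, since the other two do not contain the lowercased query as a substring.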
@@ -28,6 +28,7 @@ module Vectra
     # )
     # client.upsert(index: 'documents', vectors: [...])
     #
+    # rubocop:disable Metrics/ClassLength
     class Pgvector < Base
       include Connection
       include SqlHelpers
@@ -162,6 +163,54 @@ module Vectra
         )
       end
 
+      # Text-only search using PostgreSQL full-text search
+      #
+      # @param index [String] table name
+      # @param text [String] text query for full-text search
+      # @param top_k [Integer] number of results
+      # @param namespace [String, nil] optional namespace
+      # @param filter [Hash, nil] metadata filter
+      # @param include_values [Boolean] include vector values
+      # @param include_metadata [Boolean] include metadata
+      # @param text_column [String] column name for full-text search (default: 'content')
+      # @return [QueryResult] search results
+      #
+      # @note Your table should have a text column with a tsvector index:
+      # CREATE INDEX idx_content_fts ON my_index USING gin(to_tsvector('english', content));
+      def text_search(index:, text:, top_k:, namespace: nil, filter: nil,
+                      include_values: false, include_metadata: true,
+                      text_column: "content")
+        ensure_table_exists!(index)
+
+        select_cols = ["id"]
+        select_cols << "embedding" if include_values
+        select_cols << "metadata" if include_metadata
+
+        # Use ts_rank for scoring
+        text_score = "ts_rank(to_tsvector('english', COALESCE(#{quote_ident(text_column)}, '')), " \
+                     "plainto_tsquery('english', #{escape_literal(text)}))"
+        select_cols << "#{text_score} AS score"
+
+        where_clauses = build_where_clauses(namespace, filter)
+        where_clauses << "to_tsvector('english', COALESCE(#{quote_ident(text_column)}, '')) @@ " \
+                         "plainto_tsquery('english', #{escape_literal(text)})"
+
+        sql = "SELECT #{select_cols.join(', ')} FROM #{quote_ident(index)}"
+        sql += " WHERE #{where_clauses.join(' AND ')}" if where_clauses.any?
+        sql += " ORDER BY score DESC"
+        sql += " LIMIT #{top_k.to_i}"
+
+        result = execute(sql)
+        matches = result.map { |row| build_match_from_row(row, include_values, include_metadata) }
+
+        log_debug("Text search returned #{matches.size} results")
+
+        QueryResult.from_response(
+          matches: matches,
+          namespace: namespace
+        )
+      end
+
       # @see Base#fetch
       def fetch(index:, ids:, namespace: nil)
         ensure_table_exists!(index)
@@ -361,5 +410,6 @@ module Vectra
         raise ConfigurationError, "Host (connection URL or hostname) must be configured for pgvector"
       end
     end
+    # rubocop:enable Metrics/ClassLength
   end
 end
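To see the SQL that the pgvector `text_search` above assembles, the string building can be reproduced without a database connection. This is a rough sketch under stated assumptions: `build_text_search_sql` is a hypothetical name, and the `quote_ident` and `escape_literal` defined here are simplified stand-ins for the gem's helpers of the same name, whose implementations are not shown in this diff. The `@note` index is declared on `to_tsvector('english', content)` while the query wraps the column in `COALESCE(..., '')`, so it is worth confirming with `EXPLAIN` that the planner actually uses that index.

```ruby
# Sketch of the full-text query text; helper bodies are simplified stand-ins.

# Stand-in for identifier quoting: wrap in double quotes, doubling embedded ones.
def quote_ident(name)
  %("#{name.to_s.gsub('"', '""')}")
end

# Stand-in for literal escaping: wrap in single quotes, doubling embedded ones.
def escape_literal(value)
  "'#{value.to_s.gsub("'", "''")}'"
end

# Hypothetical builder mirroring the SELECT / WHERE / ORDER BY / LIMIT assembly.
def build_text_search_sql(table:, text:, top_k:, text_column: "content", extra_where: [])
  document = "to_tsvector('english', COALESCE(#{quote_ident(text_column)}, ''))"
  query    = "plainto_tsquery('english', #{escape_literal(text)})"

  where = extra_where + ["#{document} @@ #{query}"]
  [
    "SELECT id, metadata, ts_rank(#{document}, #{query}) AS score",
    "FROM #{quote_ident(table)}",
    "WHERE #{where.join(' AND ')}",
    "ORDER BY score DESC",
    "LIMIT #{top_k.to_i}" # to_i keeps non-numeric input out of the statement
  ].join(" ")
end

puts build_text_search_sql(table: "documents", text: "ruby's vectors", top_k: 5)
```

The same `to_tsvector`/`plainto_tsquery` pair appears twice by design: once inside `ts_rank` for scoring and once with `@@` for filtering.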
@@ -110,6 +110,45 @@ module Vectra
         handle_hybrid_search_response(response, alpha, namespace)
       end
 
+      # Text-only search using Qdrant's BM25 text search
+      #
+      # @param index [String] collection name
+      # @param text [String] text query for keyword search
+      # @param top_k [Integer] number of results
+      # @param namespace [String, nil] optional namespace
+      # @param filter [Hash, nil] metadata filter
+      # @param include_values [Boolean] include vector values
+      # @param include_metadata [Boolean] include metadata
+      # @return [QueryResult] search results
+      def text_search(index:, text:, top_k:, namespace: nil, filter: nil,
+                      include_values: false, include_metadata: true)
+        qdrant_filter = build_filter(filter, namespace)
+        body = {
+          query: { text: text },
+          limit: top_k,
+          with_vector: include_values,
+          with_payload: include_metadata
+        }
+
+        body[:filter] = qdrant_filter if qdrant_filter
+
+        response = with_error_handling do
+          connection.post("/collections/#{index}/points/query", body)
+        end
+
+        if response.success?
+          matches = transform_search_results(response.body["result"] || [])
+          log_debug("Text search returned #{matches.size} results")
+
+          QueryResult.from_response(
+            matches: matches,
+            namespace: namespace
+          )
+        else
+          handle_error(response)
+        end
+      end
+
       # @see Base#fetch
       def fetch(index:, ids:, namespace: nil) # rubocop:disable Lint/UnusedMethodArgument
         point_ids = ids.map { |id| generate_point_id(id) }
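The request body that the Qdrant `text_search` above posts to `/collections/<index>/points/query` can be inspected on its own. A small sketch, assuming nothing beyond what the hunk shows: `text_query_body` is a hypothetical helper, the filter is treated as an opaque hash (the gem's `build_filter` is not part of this diff), and whether the server accepts this exact `query` shape is not verified here.

```ruby
require "json"

# Hypothetical helper mirroring the body hash built in text_search.
def text_query_body(text:, top_k:, filter: nil, include_values: false, include_metadata: true)
  body = {
    query: { text: text },
    limit: top_k,
    with_vector: include_values,
    with_payload: include_metadata
  }
  body[:filter] = filter if filter # omit the key entirely when no filter applies
  body
end

puts JSON.pretty_generate(text_query_body(text: "ruby vectors", top_k: 3))
```

Leaving `filter` out of the hash, rather than sending `null`, matches the conditional assignment in the diff.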
@@ -139,6 +139,36 @@ module Vectra
           include_values, include_metadata)
       end
 
+      # Text-only search using Weaviate's BM25 text search
+      #
+      # @param index [String] class name
+      # @param text [String] text query for BM25 search
+      # @param top_k [Integer] number of results
+      # @param namespace [String, nil] optional namespace (not used in Weaviate)
+      # @param filter [Hash, nil] metadata filter
+      # @param include_values [Boolean] include vector values
+      # @param include_metadata [Boolean] include metadata
+      # @return [QueryResult] search results
+      def text_search(index:, text:, top_k:, namespace: nil, filter: nil,
+                      include_values: false, include_metadata: true)
+        where_filter = build_where(filter, namespace)
+        graphql = build_text_search_graphql(
+          index: index,
+          text: text,
+          top_k: top_k,
+          where_filter: where_filter,
+          include_values: include_values,
+          include_metadata: include_metadata
+        )
+        body = { "query" => graphql }
+
+        response = with_error_handling do
+          connection.post("#{API_BASE_PATH}/graphql", body)
+        end
+
+        handle_text_search_response(response, index, namespace, include_values, include_metadata)
+      end
+
       # rubocop:disable Metrics/PerceivedComplexity
       def fetch(index:, ids:, namespace: nil)
         body = {
@@ -337,6 +367,26 @@ module Vectra
         build_graphql_query(index, top_k, text, alpha, vector, where_filter, selection_block)
       end
 
+      def build_text_search_graphql(index:, text:, top_k:, where_filter:,
+                                    include_values:, include_metadata:)
+        selection_block = build_selection_fields(include_values, include_metadata).join(" ")
+        <<~GRAPHQL
+          {
+            Get {
+              #{index}(
+                limit: #{top_k}
+                bm25: {
+                  query: "#{text.gsub('"', '\\"')}"
+                }
+                #{"where: #{JSON.generate(where_filter)}" if where_filter}
+              ) {
+                #{selection_block}
+              }
+            }
+          }
+        GRAPHQL
+      end
+
       def build_graphql_query(index, top_k, text, alpha, vector, where_filter, selection_block)
         <<~GRAPHQL
           {
@@ -379,6 +429,20 @@ module Vectra
         end
       end
 
+      def handle_text_search_response(response, index, namespace, include_values, include_metadata)
+        if response.success?
+          matches = extract_query_matches(response.body, index, include_values, include_metadata)
+          log_debug("Text search returned #{matches.size} results")
+
+          QueryResult.from_response(
+            matches: matches,
+            namespace: namespace
+          )
+        else
+          handle_error(response)
+        end
+      end
+
       def validate_config!
         super
         raise ConfigurationError, "Host must be configured for Weaviate" if config.host.nil? || config.host.empty?
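The Weaviate hunks above interpolate the search text into a GraphQL document after escaping only double quotes, so a backslash or newline in the text would pass through unescaped. One possible alternative, shown here as a sketch and not as the gem's implementation, is to let `JSON.generate` produce the quoted string, since JSON string escapes are also valid GraphQL string escapes. `bm25_query` and the default selection `_additional { id score }` are illustrative assumptions.

```ruby
require "json"

# Hypothetical builder for a BM25-only Get query.
def bm25_query(class_name:, text:, top_k:, selection: "_additional { id score }")
  <<~GRAPHQL
    {
      Get {
        #{class_name}(
          limit: #{Integer(top_k)}
          bm25: { query: #{JSON.generate(text.to_s)} }
        ) {
          #{selection}
        }
      }
    }
  GRAPHQL
end

# The sample text contains double quotes and a single backslash.
puts bm25_query(class_name: "Document", text: 'say "hi" \\ bye', top_k: 3)
```

`Integer(top_k)` raises on non-numeric input instead of interpolating it, which is a stricter choice than the plain `#{top_k}` used in the diff.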
@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 
 module Vectra
-  VERSION = "1.1.0"
+  VERSION = "1.1.2"
 end
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: vectra-client
 version: !ruby/object:Gem::Version
-  version: 1.1.0
+  version: 1.1.2
 platform: ruby
 authors:
 - Mijo Kristo
@@ -274,6 +274,7 @@ files:
 - docs/guides/rails-integration.md
 - docs/guides/rails-troubleshooting.md
 - docs/guides/recipes.md
+- docs/guides/roadmap.md
 - docs/guides/runbooks/cache-issues.md
 - docs/guides/runbooks/high-error-rate.md
 - docs/guides/runbooks/high-latency.md
@@ -288,6 +289,7 @@ files:
 - docs/providers/qdrant.md
 - docs/providers/selection.md
 - docs/providers/weaviate.md
+- docs/search.json
 - examples/GRAFANA_QUICKSTART.md
 - examples/README.md
 - examples/active_record_demo.rb