vectra-client 1.1.0 → 1.1.2
- checksums.yaml +4 -4
- data/CHANGELOG.md +24 -0
- data/README.md +24 -0
- data/docs/_layouts/home.html +1 -1
- data/docs/_layouts/page.html +29 -0
- data/docs/api/cheatsheet.md +27 -0
- data/docs/api/methods.md +69 -0
- data/docs/api/overview.md +36 -5
- data/docs/assets/style.css +131 -0
- data/docs/guides/rails-integration.md +2 -0
- data/docs/guides/roadmap.md +53 -0
- data/docs/search.json +26 -0
- data/lib/vectra/client.rb +119 -1
- data/lib/vectra/health_check.rb +3 -3
- data/lib/vectra/middleware/request.rb +1 -1
- data/lib/vectra/providers/memory.rb +56 -0
- data/lib/vectra/providers/pgvector.rb +50 -0
- data/lib/vectra/providers/qdrant.rb +39 -0
- data/lib/vectra/providers/weaviate.rb +64 -0
- data/lib/vectra/version.rb +1 -1
- metadata +3 -1
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 6f59705f200c8a164cc9303e6761776af49e4a81a88f101eae2e371eccca8807
|
|
4
|
+
data.tar.gz: 217dd74d00151f3ba6e94ed80718999656f9616b84e843dd937197cee612540a
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 8e62f19e82dfb88a14ae50f7cee6afdd8d058ece9972de8f447ba3cc881420d401cfd0d3a093d299e8273795958085349a8d3e6416ecd429f50ea2a05103dea3
|
|
7
|
+
data.tar.gz: ace8ecc519dc588f917e6b29721ed9cfa3ff9b3f3fd7b3e82911233ed7d6e59a9f443e81ae9c0d20dd4891f904d8497cb506064964b00b3dc7587f12137e12bf
|
data/CHANGELOG.md
CHANGED
|
@@ -1,5 +1,29 @@
|
|
|
1
1
|
# Changelog
|
|
2
2
|
|
|
3
|
+
## [v1.1.1](https://github.com/stokry/vectra/tree/v1.1.1) (2026-01-15)
|
|
4
|
+
|
|
5
|
+
[Full Changelog](https://github.com/stokry/vectra/compare/v1.1.0...v1.1.1)
|
|
6
|
+
|
|
7
|
+
### Added
|
|
8
|
+
- **Text Search Support** - New `text_search` method for keyword-only search without requiring embeddings
|
|
9
|
+
- Qdrant: BM25 text search
|
|
10
|
+
- Weaviate: BM25 text search via GraphQL
|
|
11
|
+
- pgvector: PostgreSQL full-text search (`to_tsvector`, `plainto_tsquery`, `ts_rank`)
|
|
12
|
+
- Memory: Simple keyword matching (for testing)
|
|
13
|
+
- Raises `UnsupportedFeatureError` for Pinecone (use sparse vectors instead)
|
|
14
|
+
- **Documentation Search** - Added search functionality to documentation site with `simple-jekyll-search`
|
|
15
|
+
- Client-side search with fuzzy matching
|
|
16
|
+
- Search index auto-generated from all documentation pages
|
|
17
|
+
- Responsive search UI in navigation
|
|
18
|
+
|
|
19
|
+
### Changed
|
|
20
|
+
- Updated API documentation to include `text_search` method in overview and cheatsheet
|
|
21
|
+
- Enhanced documentation with text search examples and use cases
|
|
22
|
+
|
|
23
|
+
## [v1.1.0](https://github.com/stokry/vectra/tree/v1.1.0) (2026-01-15)
|
|
24
|
+
|
|
25
|
+
[Full Changelog](https://github.com/stokry/vectra/compare/v1.0.8...v1.1.0)
|
|
26
|
+
|
|
3
27
|
## [v1.0.8](https://github.com/stokry/vectra/tree/v1.0.8) (2026-01-14)
|
|
4
28
|
|
|
5
29
|
[Full Changelog](https://github.com/stokry/vectra/compare/v1.0.7...v1.0.8)
|
data/README.md
CHANGED
|
@@ -109,6 +109,14 @@ results = client.hybrid_search(
|
|
|
109
109
|
text: 'ruby programming',
|
|
110
110
|
alpha: 0.7 # 70% semantic, 30% keyword
|
|
111
111
|
)
|
|
112
|
+
|
|
113
|
+
# Text-only search (keyword search without embeddings)
|
|
114
|
+
# Supported by: Qdrant, Weaviate, pgvector
|
|
115
|
+
results = client.text_search(
|
|
116
|
+
index: 'products',
|
|
117
|
+
text: 'iPhone 15 Pro',
|
|
118
|
+
top_k: 10
|
|
119
|
+
)
|
|
112
120
|
```
|
|
113
121
|
|
|
114
122
|
## Provider Examples
|
|
@@ -218,6 +226,8 @@ This will:
|
|
|
218
226
|
- **Update the model** to include `ProductVector`
|
|
219
227
|
- **Append to `config/vectra.yml`** with index metadata (no API keys)
|
|
220
228
|
|
|
229
|
+
When `config/vectra.yml` contains exactly one entry, a plain `Vectra::Client.new` in that Rails app will automatically use that entry's `index` (and `namespace` if present) as its defaults, so you can usually omit `index:` when calling `upsert` / `query` / `text_search`.
|
|
230
|
+
|
|
221
231
|
### Complete Rails Guide
|
|
222
232
|
|
|
223
233
|
For a complete step-by-step guide including:
|
|
@@ -307,6 +317,20 @@ Vectra includes 7 production-ready patterns out of the box:
|
|
|
307
317
|
- **Health Checks** - `healthy?`, `ping`, and `health_check` methods
|
|
308
318
|
- **Instrumentation** - Datadog, New Relic, Sentry, Honeybadger support
|
|
309
319
|
|
|
320
|
+
## Roadmap
|
|
321
|
+
|
|
322
|
+
High-level roadmap for `vectra-client`:
|
|
323
|
+
|
|
324
|
+
- **1.x (near term)**
|
|
325
|
+
- Reranking middleware built on top of the existing Rack-style middleware stack.
|
|
326
|
+
- Additional middleware building blocks (sampling, tracing, score normalization).
|
|
327
|
+
- Smoother Rails UX for multi-tenant setups and larger demos (e‑commerce, RAG, recommendations).
|
|
328
|
+
- **Mid term**
|
|
329
|
+
- Additional providers where it makes sense and stays maintainable.
|
|
330
|
+
- Deeper documentation and recipes around reranking and hybrid search.
|
|
331
|
+
|
|
332
|
+
For a more detailed, always-up-to-date version, see the online roadmap: https://vectra-docs.netlify.app/guides/roadmap/
|
|
333
|
+
|
|
310
334
|
## Development
|
|
311
335
|
|
|
312
336
|
```bash
|
data/docs/_layouts/home.html
CHANGED
|
@@ -55,7 +55,7 @@
|
|
|
55
55
|
<!-- Hero Section -->
|
|
56
56
|
<section class="tma-hero">
|
|
57
57
|
<div class="tma-hero__container">
|
|
58
|
-
<span class="tma-hero__badge">v1.1.
|
|
58
|
+
<span class="tma-hero__badge">v1.1.1 — Hybrid Search, Rails Generator, Middleware & Text Search</span>
|
|
59
59
|
<h1 class="tma-hero__title">
|
|
60
60
|
Vector Databases,<br>
|
|
61
61
|
<span class="tma-hero__title-gradient">Unified for Ruby.</span>
|
data/docs/_layouts/page.html
CHANGED
|
@@ -39,6 +39,12 @@
|
|
|
39
39
|
<span class="tma-nav__toggle-line"></span>
|
|
40
40
|
</button>
|
|
41
41
|
<ul class="tma-nav__menu" id="nav-menu">
|
|
42
|
+
<li class="tma-nav__search-wrapper">
|
|
43
|
+
<div class="tma-search">
|
|
44
|
+
<input type="search" id="search-input" class="tma-search__input" placeholder="Search docs..." aria-label="Search documentation">
|
|
45
|
+
<div id="search-results" class="tma-search__results"></div>
|
|
46
|
+
</div>
|
|
47
|
+
</li>
|
|
42
48
|
<li><a href="{{ site.baseurl }}/guides/getting-started" class="tma-nav__link">Getting Started</a></li>
|
|
43
49
|
<li><a href="{{ site.baseurl }}/guides/recipes" class="tma-nav__link">Recipes</a></li>
|
|
44
50
|
<li><a href="{{ site.baseurl }}/providers" class="tma-nav__link">Providers</a></li>
|
|
@@ -91,6 +97,13 @@
|
|
|
91
97
|
<li><a href="https://github.com/stokry/vectra/issues" class="tma-sidebar__link" target="_blank">Report Issue ↗</a></li>
|
|
92
98
|
</ul>
|
|
93
99
|
</div>
|
|
100
|
+
|
|
101
|
+
<div class="tma-sidebar__section">
|
|
102
|
+
<h3 class="tma-sidebar__title">Resources</h3>
|
|
103
|
+
<ul class="tma-sidebar__list">
|
|
104
|
+
<li><a href="{{ site.baseurl }}/guides/roadmap" class="tma-sidebar__link {% if page.url == '/guides/roadmap/' %}tma-sidebar__link--active{% endif %}">Roadmap</a></li>
|
|
105
|
+
</ul>
|
|
106
|
+
</div>
|
|
94
107
|
</aside>
|
|
95
108
|
|
|
96
109
|
<!-- Main Content -->
|
|
@@ -115,6 +128,9 @@
|
|
|
115
128
|
</div>
|
|
116
129
|
</footer>
|
|
117
130
|
|
|
131
|
+
<!-- Simple Jekyll Search -->
|
|
132
|
+
<script src="https://cdn.jsdelivr.net/npm/simple-jekyll-search@1.10.0/dest/simple-jekyll-search.min.js"></script>
|
|
133
|
+
|
|
118
134
|
<script>
|
|
119
135
|
document.addEventListener('DOMContentLoaded', function() {
|
|
120
136
|
const navToggle = document.querySelector('.tma-nav__toggle');
|
|
@@ -148,6 +164,19 @@
|
|
|
148
164
|
}
|
|
149
165
|
});
|
|
150
166
|
}
|
|
167
|
+
|
|
168
|
+
// Initialize search
|
|
169
|
+
if (typeof SimpleJekyllSearch !== 'undefined') {
|
|
170
|
+
SimpleJekyllSearch({
|
|
171
|
+
searchInput: document.getElementById('search-input'),
|
|
172
|
+
resultsContainer: document.getElementById('search-results'),
|
|
173
|
+
json: '{{ site.baseurl }}/search.json',
|
|
174
|
+
searchResultTemplate: '<div class="tma-search__result"><a href="{url}" class="tma-search__result-link"><span class="tma-search__result-title">{title}</span><span class="tma-search__result-excerpt">{excerpt}</span></a></div>',
|
|
175
|
+
noResultsText: '<div class="tma-search__no-results">No results found</div>',
|
|
176
|
+
fuzzy: true,
|
|
177
|
+
limit: 10
|
|
178
|
+
});
|
|
179
|
+
}
|
|
151
180
|
});
|
|
152
181
|
</script>
|
|
153
182
|
</body>
|
data/docs/api/cheatsheet.md
CHANGED
|
@@ -49,6 +49,8 @@ client.upsert(vectors: [...])
|
|
|
49
49
|
client.query(vector: query_embedding, top_k: 10)
|
|
50
50
|
```
|
|
51
51
|
|
|
52
|
+
In a Rails app with `config/vectra.yml` generated by `rails generate vectra:index`, if that YAML file contains only one entry, `Vectra::Client.new` will automatically use that entry's `index` (and `namespace`, if present) as defaults.
|
|
53
|
+
|
|
52
54
|
### Upsert
|
|
53
55
|
|
|
54
56
|
```ruby
|
|
@@ -98,6 +100,23 @@ results = client.hybrid_search(
|
|
|
98
100
|
|
|
99
101
|
Supported providers: Qdrant ✅, Weaviate ✅, pgvector ✅, Pinecone ⚠️
|
|
100
102
|
|
|
103
|
+
### Text Search (keyword-only, no embeddings)
|
|
104
|
+
|
|
105
|
+
```ruby
|
|
106
|
+
results = client.text_search(
|
|
107
|
+
index: 'products',
|
|
108
|
+
text: 'iPhone 15 Pro',
|
|
109
|
+
top_k: 10,
|
|
110
|
+
filter: { category: 'electronics' }
|
|
111
|
+
)
|
|
112
|
+
|
|
113
|
+
results.each do |match|
|
|
114
|
+
puts "#{match.id} (score=#{match.score.round(3)}): #{match.metadata['title']}"
|
|
115
|
+
end
|
|
116
|
+
```
|
|
117
|
+
|
|
118
|
+
Supported providers: Qdrant ✅ (BM25), Weaviate ✅ (BM25), pgvector ✅ (PostgreSQL full-text)
|
|
119
|
+
|
|
101
120
|
### Fetch
|
|
102
121
|
|
|
103
122
|
```ruby
|
|
@@ -176,6 +195,14 @@ else
|
|
|
176
195
|
end
|
|
177
196
|
```
|
|
178
197
|
|
|
198
|
+
For fast health checks you can temporarily lower the timeout:
|
|
199
|
+
|
|
200
|
+
```ruby
|
|
201
|
+
status = client.with_timeout(0.5) do |c|
|
|
202
|
+
c.ping
|
|
203
|
+
end
|
|
204
|
+
```
|
|
205
|
+
|
|
179
206
|
### Ping (with latency)
|
|
180
207
|
|
|
181
208
|
```ruby
|
data/docs/api/methods.md
CHANGED
|
@@ -36,6 +36,8 @@ client = Vectra::Client.new(
|
|
|
36
36
|
)
|
|
37
37
|
```
|
|
38
38
|
|
|
39
|
+
In a Rails app that uses the `vectra:index` generator, if `config/vectra.yml` contains exactly one entry, `Vectra::Client.new` will automatically use that entry's `index` (and `namespace` if present) as its defaults. This allows you to omit `index:` in most calls (`upsert`, `query`, `text_search`, etc.).
|
|
40
|
+
|
|
39
41
|
---
|
|
40
42
|
|
|
41
43
|
### `client.upsert(index:, vectors:, namespace: nil)`
|
|
@@ -147,6 +149,51 @@ results = client.hybrid_search(
|
|
|
147
149
|
|
|
148
150
|
---
|
|
149
151
|
|
|
152
|
+
### `client.text_search(index:, text:, top_k: 10, namespace: nil, filter: nil, include_values: false, include_metadata: true)`
|
|
153
|
+
|
|
154
|
+
Text-only search (keyword search without requiring embeddings).
|
|
155
|
+
|
|
156
|
+
**Parameters:**
|
|
157
|
+
- `index` (String) - Index/collection name (uses client's default index when omitted)
|
|
158
|
+
- `text` (String) - Text query for keyword search
|
|
159
|
+
- `top_k` (Integer) - Number of results (default: 10)
|
|
160
|
+
- `namespace` (String, optional) - Namespace
|
|
161
|
+
- `filter` (Hash, optional) - Metadata filter
|
|
162
|
+
- `include_values` (Boolean) - Include vector values (default: false)
|
|
163
|
+
- `include_metadata` (Boolean) - Include metadata (default: true)
|
|
164
|
+
|
|
165
|
+
**Returns:** `Vectra::QueryResult`
|
|
166
|
+
|
|
167
|
+
**Provider Support:**
|
|
168
|
+
- ✅ Qdrant (BM25)
|
|
169
|
+
- ✅ Weaviate (BM25)
|
|
170
|
+
- ✅ pgvector (PostgreSQL full-text search)
|
|
171
|
+
- ✅ Memory (simple keyword matching - for testing only)
|
|
172
|
+
- ❌ Pinecone (not supported - use sparse vectors instead)
|
|
173
|
+
|
|
174
|
+
**Example:**
|
|
175
|
+
```ruby
|
|
176
|
+
# Keyword search for exact matches
|
|
177
|
+
results = client.text_search(
|
|
178
|
+
index: 'products',
|
|
179
|
+
text: 'iPhone 15 Pro',
|
|
180
|
+
top_k: 10,
|
|
181
|
+
filter: { category: 'electronics' }
|
|
182
|
+
)
|
|
183
|
+
|
|
184
|
+
results.each do |match|
|
|
185
|
+
puts "#{match.id}: #{match.score} - #{match.metadata['title']}"
|
|
186
|
+
end
|
|
187
|
+
```
|
|
188
|
+
|
|
189
|
+
**Use Cases:**
|
|
190
|
+
- Product name search (exact matches)
|
|
191
|
+
- Function/class name search in documentation
|
|
192
|
+
- Keyword-based filtering when semantic search is not needed
|
|
193
|
+
- Faster search when embeddings are not available
|
|
194
|
+
|
|
195
|
+
---
|
|
196
|
+
|
|
150
197
|
### `client.fetch(index:, ids:, namespace: nil)`
|
|
151
198
|
|
|
152
199
|
Fetch vectors by their IDs.
|
|
@@ -357,6 +404,28 @@ puts "Latency: #{status[:latency_ms]}ms"
|
|
|
357
404
|
|
|
358
405
|
---
|
|
359
406
|
|
|
407
|
+
### `client.with_timeout(seconds) { ... }`
|
|
408
|
+
|
|
409
|
+
Temporarily override the client's request timeout inside a block.
|
|
410
|
+
|
|
411
|
+
**Parameters:**
|
|
412
|
+
- `seconds` (Float) - Temporary timeout in seconds
|
|
413
|
+
|
|
414
|
+
**Returns:** Block result
|
|
415
|
+
|
|
416
|
+
**Example (fast health check in Rails controller):**
|
|
417
|
+
```ruby
|
|
418
|
+
status = client.with_timeout(0.5) do |c|
|
|
419
|
+
c.ping
|
|
420
|
+
end
|
|
421
|
+
|
|
422
|
+
render json: status, status: status[:healthy] ? :ok : :service_unavailable
|
|
423
|
+
```
|
|
424
|
+
|
|
425
|
+
After the block finishes (even if it raises), the previous `config.timeout` value is restored.
|
|
426
|
+
|
|
427
|
+
---
|
|
428
|
+
|
|
360
429
|
### `client.health_check`
|
|
361
430
|
|
|
362
431
|
Detailed health check with provider-specific information.
|
data/docs/api/overview.md
CHANGED
|
@@ -10,15 +10,15 @@ permalink: /api/overview/
|
|
|
10
10
|
|
|
11
11
|
```ruby
|
|
12
12
|
client = Vectra::Client.new(
|
|
13
|
-
provider: :pinecone, # Required: :pinecone, :qdrant, :weaviate, :pgvector
|
|
13
|
+
provider: :pinecone, # Required: :pinecone, :qdrant, :weaviate, :pgvector, :memory
|
|
14
14
|
api_key: 'your-api-key', # Required for cloud providers
|
|
15
|
-
|
|
16
|
-
|
|
17
|
-
port: 6333, # For self-hosted providers
|
|
18
|
-
environment: 'us-west-4' # For Pinecone
|
|
15
|
+
index: 'my-index', # Optional default index
|
|
16
|
+
namespace: 'tenant-1' # Optional default namespace
|
|
19
17
|
)
|
|
20
18
|
```
|
|
21
19
|
|
|
20
|
+
In Rails, if you use the `vectra:index` generator and `config/vectra.yml` contains exactly one entry, a plain `Vectra::Client.new` will automatically pick that entry's `index` (and `namespace` if present) as defaults.
|
|
21
|
+
|
|
22
22
|
## Core Methods
|
|
23
23
|
|
|
24
24
|
### `upsert(vectors:)`
|
|
@@ -138,6 +138,31 @@ results = client.hybrid_search(
|
|
|
138
138
|
|
|
139
139
|
**Provider Support:** Qdrant ✅, Weaviate ✅, pgvector ✅, Pinecone ⚠️
|
|
140
140
|
|
|
141
|
+
### `text_search(index:, text:, top_k:)`
|
|
142
|
+
|
|
143
|
+
Text-only search (keyword search without requiring embeddings).
|
|
144
|
+
|
|
145
|
+
**Parameters:**
|
|
146
|
+
- `index` (String) - Index/collection name (uses client's default index when omitted)
|
|
147
|
+
- `text` (String) - Text query for keyword search
|
|
148
|
+
- `top_k` (Integer) - Number of results (default: 10)
|
|
149
|
+
- `namespace` (String, optional) - Namespace
|
|
150
|
+
- `filter` (Hash, optional) - Metadata filter
|
|
151
|
+
- `include_values` (Boolean) - Include vector values (default: false)
|
|
152
|
+
- `include_metadata` (Boolean) - Include metadata (default: true)
|
|
153
|
+
|
|
154
|
+
**Example:**
|
|
155
|
+
```ruby
|
|
156
|
+
results = client.text_search(
|
|
157
|
+
index: 'products',
|
|
158
|
+
text: 'iPhone 15 Pro',
|
|
159
|
+
top_k: 10,
|
|
160
|
+
filter: { category: 'electronics' }
|
|
161
|
+
)
|
|
162
|
+
```
|
|
163
|
+
|
|
164
|
+
**Provider Support:** Qdrant ✅ (BM25), Weaviate ✅ (BM25), pgvector ✅ (PostgreSQL full-text), Memory ✅, Pinecone ❌
|
|
165
|
+
|
|
141
166
|
### `healthy?`
|
|
142
167
|
|
|
143
168
|
Quick health check - returns true if provider connection is healthy.
|
|
@@ -151,6 +176,12 @@ if client.healthy?
|
|
|
151
176
|
end
|
|
152
177
|
```
|
|
153
178
|
|
|
179
|
+
You can also run faster checks with a temporary timeout:
|
|
180
|
+
|
|
181
|
+
```ruby
|
|
182
|
+
fast_ok = client.with_timeout(0.5) { |c| c.healthy? }
|
|
183
|
+
```
|
|
184
|
+
|
|
154
185
|
### `ping`
|
|
155
186
|
|
|
156
187
|
Ping provider and get connection health status with latency.
|
data/docs/assets/style.css
CHANGED
|
@@ -165,6 +165,118 @@ body {
|
|
|
165
165
|
border-color: var(--tma-color-border-hover);
|
|
166
166
|
}
|
|
167
167
|
|
|
168
|
+
.tma-nav__search-wrapper {
|
|
169
|
+
position: relative;
|
|
170
|
+
margin-right: var(--tma-spacing-md);
|
|
171
|
+
}
|
|
172
|
+
|
|
173
|
+
.tma-search {
|
|
174
|
+
position: relative;
|
|
175
|
+
}
|
|
176
|
+
|
|
177
|
+
.tma-search__input {
|
|
178
|
+
width: 240px;
|
|
179
|
+
padding: var(--tma-spacing-sm) var(--tma-spacing-md);
|
|
180
|
+
padding-left: 2.5rem;
|
|
181
|
+
background: var(--tma-color-bg-tertiary);
|
|
182
|
+
border: 1px solid var(--tma-color-border);
|
|
183
|
+
border-radius: var(--tma-radius-sm);
|
|
184
|
+
color: var(--tma-color-text-primary);
|
|
185
|
+
font-size: 0.9rem;
|
|
186
|
+
font-family: var(--tma-font-family-primary);
|
|
187
|
+
transition: all var(--tma-transition-fast);
|
|
188
|
+
outline: none;
|
|
189
|
+
}
|
|
190
|
+
|
|
191
|
+
.tma-search__input::placeholder {
|
|
192
|
+
color: var(--tma-color-text-muted);
|
|
193
|
+
}
|
|
194
|
+
|
|
195
|
+
.tma-search__input:focus {
|
|
196
|
+
width: 320px;
|
|
197
|
+
border-color: var(--tma-color-accent-primary);
|
|
198
|
+
background: var(--tma-color-bg-elevated);
|
|
199
|
+
box-shadow: 0 0 0 3px var(--tma-color-accent-muted);
|
|
200
|
+
}
|
|
201
|
+
|
|
202
|
+
.tma-search {
|
|
203
|
+
position: relative;
|
|
204
|
+
}
|
|
205
|
+
|
|
206
|
+
.tma-search::before {
|
|
207
|
+
content: '🔍';
|
|
208
|
+
position: absolute;
|
|
209
|
+
left: var(--tma-spacing-md);
|
|
210
|
+
top: 50%;
|
|
211
|
+
transform: translateY(-50%);
|
|
212
|
+
color: var(--tma-color-text-muted);
|
|
213
|
+
pointer-events: none;
|
|
214
|
+
z-index: 1;
|
|
215
|
+
font-size: 0.9rem;
|
|
216
|
+
}
|
|
217
|
+
|
|
218
|
+
.tma-search__results {
|
|
219
|
+
position: absolute;
|
|
220
|
+
top: calc(100% + var(--tma-spacing-xs));
|
|
221
|
+
left: 0;
|
|
222
|
+
right: 0;
|
|
223
|
+
max-width: 500px;
|
|
224
|
+
max-height: 400px;
|
|
225
|
+
overflow-y: auto;
|
|
226
|
+
background: var(--tma-color-bg-elevated);
|
|
227
|
+
border: 1px solid var(--tma-color-border);
|
|
228
|
+
border-radius: var(--tma-radius-md);
|
|
229
|
+
box-shadow: var(--tma-shadow-lg);
|
|
230
|
+
z-index: 1000;
|
|
231
|
+
display: none;
|
|
232
|
+
}
|
|
233
|
+
|
|
234
|
+
.tma-search__results:not(:empty) {
|
|
235
|
+
display: block;
|
|
236
|
+
}
|
|
237
|
+
|
|
238
|
+
.tma-search__result {
|
|
239
|
+
border-bottom: 1px solid var(--tma-color-border);
|
|
240
|
+
}
|
|
241
|
+
|
|
242
|
+
.tma-search__result:last-child {
|
|
243
|
+
border-bottom: none;
|
|
244
|
+
}
|
|
245
|
+
|
|
246
|
+
.tma-search__result-link {
|
|
247
|
+
display: block;
|
|
248
|
+
padding: var(--tma-spacing-md);
|
|
249
|
+
text-decoration: none;
|
|
250
|
+
color: var(--tma-color-text-primary);
|
|
251
|
+
transition: background var(--tma-transition-fast);
|
|
252
|
+
}
|
|
253
|
+
|
|
254
|
+
.tma-search__result-link:hover {
|
|
255
|
+
background: var(--tma-color-bg-hover);
|
|
256
|
+
}
|
|
257
|
+
|
|
258
|
+
.tma-search__result-title {
|
|
259
|
+
display: block;
|
|
260
|
+
font-weight: 600;
|
|
261
|
+
font-size: 0.95rem;
|
|
262
|
+
color: var(--tma-color-text-primary);
|
|
263
|
+
margin-bottom: var(--tma-spacing-xs);
|
|
264
|
+
}
|
|
265
|
+
|
|
266
|
+
.tma-search__result-excerpt {
|
|
267
|
+
display: block;
|
|
268
|
+
font-size: 0.85rem;
|
|
269
|
+
color: var(--tma-color-text-secondary);
|
|
270
|
+
line-height: 1.5;
|
|
271
|
+
}
|
|
272
|
+
|
|
273
|
+
.tma-search__no-results {
|
|
274
|
+
padding: var(--tma-spacing-lg);
|
|
275
|
+
text-align: center;
|
|
276
|
+
color: var(--tma-color-text-muted);
|
|
277
|
+
font-size: 0.9rem;
|
|
278
|
+
}
|
|
279
|
+
|
|
168
280
|
.tma-nav__toggle {
|
|
169
281
|
display: none;
|
|
170
282
|
flex-direction: column;
|
|
@@ -1261,6 +1373,25 @@ code {
|
|
|
1261
1373
|
justify-content: center;
|
|
1262
1374
|
}
|
|
1263
1375
|
|
|
1376
|
+
.tma-nav__search-wrapper {
|
|
1377
|
+
width: 100%;
|
|
1378
|
+
margin-right: 0;
|
|
1379
|
+
margin-bottom: var(--tma-spacing-md);
|
|
1380
|
+
}
|
|
1381
|
+
|
|
1382
|
+
.tma-search__input {
|
|
1383
|
+
width: 100%;
|
|
1384
|
+
}
|
|
1385
|
+
|
|
1386
|
+
.tma-search__input:focus {
|
|
1387
|
+
width: 100%;
|
|
1388
|
+
}
|
|
1389
|
+
|
|
1390
|
+
.tma-search__results {
|
|
1391
|
+
max-width: 100%;
|
|
1392
|
+
right: 0;
|
|
1393
|
+
}
|
|
1394
|
+
|
|
1264
1395
|
.tma-features,
|
|
1265
1396
|
.tma-providers {
|
|
1266
1397
|
padding: var(--tma-spacing-xl) var(--tma-spacing-md);
|
data/docs/guides/rails-integration.md
CHANGED
|
@@ -99,6 +99,8 @@ This will:
|
|
|
99
99
|
- Update `app/models/product.rb` to include the concern
|
|
100
100
|
- Add configuration to `config/vectra.yml`
|
|
101
101
|
|
|
102
|
+
When `config/vectra.yml` contains exactly one entry, a plain `Vectra::Client.new` in this Rails app will automatically use that entry's `index` (and `namespace` if present) as its defaults. That means you can usually omit `index:` when calling `upsert`, `query`, `hybrid_search`, or `text_search`.
|
|
103
|
+
|
|
102
104
|
### Run Migrations
|
|
103
105
|
|
|
104
106
|
```bash
|
data/docs/guides/roadmap.md
ADDED
|
@@ -0,0 +1,53 @@
|
|
|
1
|
+
---
|
|
2
|
+
layout: page
|
|
3
|
+
title: Roadmap
|
|
4
|
+
permalink: /guides/roadmap/
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# Vectra Roadmap
|
|
8
|
+
|
|
9
|
+
This page outlines the high-level roadmap for **vectra-client**, the unified Ruby client for vector databases.
|
|
10
|
+
|
|
11
|
+
The roadmap is intentionally focused on **production features** that make AI workloads reliable, observable, and easy to operate in Ruby.
|
|
12
|
+
|
|
13
|
+
## Near Term (1.x)
|
|
14
|
+
|
|
15
|
+
- **Reranking middleware**
|
|
16
|
+
- Middleware that can call external rerankers (e.g., Cohere, Jina, custom HTTP) and reorder search results after a `query`.
|
|
17
|
+
- Pluggable providers, configurable `top_n`, and safe fallbacks when reranking fails.
|
|
18
|
+
- **More middleware building blocks**
|
|
19
|
+
- Request sampling / tracing for debugging complex production issues.
|
|
20
|
+
- Response shaping (e.g., score normalization, custom thresholds) as reusable middleware.
|
|
21
|
+
- **Rails UX improvements**
|
|
22
|
+
- Convenience generators and helpers for multi-tenant setups.
|
|
23
|
+
- Better defaults and examples for 1k+ records demos (e‑commerce, blogs, RAG, recommendations).
|
|
24
|
+
|
|
25
|
+
## Mid Term
|
|
26
|
+
|
|
27
|
+
- **Additional providers**
|
|
28
|
+
- Support for more hosted / self-hosted vector solutions where it makes sense and stays maintainable.
|
|
29
|
+
- **First-class reranking guides**
|
|
30
|
+
- End-to-end documentation for combining vectra-client with external LLMs / rerankers.
|
|
31
|
+
- **More recipes & patterns**
|
|
32
|
+
- Deeper recipes for analytics, recommendations, and hybrid search in large Rails apps.
|
|
33
|
+
|
|
34
|
+
## Long Term Vision
|
|
35
|
+
|
|
36
|
+
Keep **vectra-client** the most **production-ready Ruby toolkit** for vector databases:
|
|
37
|
+
|
|
38
|
+
- Strong guarantees around retries, circuit breakers, and backpressure.
|
|
39
|
+
- Excellent observability out of the box.
|
|
40
|
+
- Stable, provider-agnostic API that lets you change infra without rewriting your app.
|
|
41
|
+
|
|
42
|
+
If you have ideas or needs that fit this direction, please open an issue on GitHub so we can prioritise the roadmap around real-world use cases.
|
|
43
|
+
|
|
44
|
+
{
|
|
45
|
+
"cells": [],
|
|
46
|
+
"metadata": {
|
|
47
|
+
"language_info": {
|
|
48
|
+
"name": "python"
|
|
49
|
+
}
|
|
50
|
+
},
|
|
51
|
+
"nbformat": 4,
|
|
52
|
+
"nbformat_minor": 2
|
|
53
|
+
}
|
data/docs/search.json
ADDED
|
@@ -0,0 +1,26 @@
|
|
|
1
|
+
---
|
|
2
|
+
layout: null
|
|
3
|
+
---
|
|
4
|
+
[
|
|
5
|
+
{% assign first = true %}
|
|
6
|
+
{% for page in site.pages %}
|
|
7
|
+
{% unless page.url == '/' or page.url == '/search.json' or page.url contains '/assets/' or page.url contains '/404' or page.url contains '/feed' or page.url contains '/sitemap' or page.url contains '/robots' or page.url contains '/index.html' or page.url contains '/index.md' %}
|
|
8
|
+
{% unless first %},{% endunless %}
|
|
9
|
+
{
|
|
10
|
+
"title": {{ page.title | default: page.url | jsonify }},
|
|
11
|
+
"url": {{ page.url | jsonify }},
|
|
12
|
+
"excerpt": {{ page.content | strip_html | truncatewords: 30 | default: "" | jsonify }}
|
|
13
|
+
}
|
|
14
|
+
{% assign first = false %}
|
|
15
|
+
{% endunless %}
|
|
16
|
+
{% endfor %}
|
|
17
|
+
{% for post in site.posts %}
|
|
18
|
+
{% unless first %},{% endunless %}
|
|
19
|
+
{
|
|
20
|
+
"title": {{ post.title | jsonify }},
|
|
21
|
+
"url": {{ post.url | jsonify }},
|
|
22
|
+
"excerpt": {{ post.content | strip_html | truncatewords: 30 | default: "" | jsonify }}
|
|
23
|
+
}
|
|
24
|
+
{% assign first = false %}
|
|
25
|
+
{% endfor %}
|
|
26
|
+
]
|
data/lib/vectra/client.rb
CHANGED
|
@@ -1,5 +1,7 @@
|
|
|
1
1
|
# frozen_string_literal: true
|
|
2
2
|
|
|
3
|
+
require "yaml"
|
|
4
|
+
|
|
3
5
|
# Ensure HealthCheck is loaded before Client
|
|
4
6
|
require_relative "health_check" unless defined?(Vectra::HealthCheck)
|
|
5
7
|
require_relative "configuration" unless defined?(Vectra::Configuration)
|
|
@@ -89,6 +91,7 @@ module Vectra
|
|
|
89
91
|
@provider = build_provider
|
|
90
92
|
@default_index = options[:index]
|
|
91
93
|
@default_namespace = options[:namespace]
|
|
94
|
+
apply_rails_vectra_defaults! if @default_index.nil? && @default_namespace.nil?
|
|
92
95
|
@middleware = build_middleware_stack(options[:middleware])
|
|
93
96
|
end
|
|
94
97
|
|
|
@@ -494,6 +497,67 @@ module Vectra
|
|
|
494
497
|
)
|
|
495
498
|
end
|
|
496
499
|
|
|
500
|
+
# Text-only search (keyword search without embeddings)
|
|
501
|
+
#
|
|
502
|
+
# Performs keyword/text search without requiring vector embeddings.
|
|
503
|
+
# Useful for exact matches, product names, function names, etc.
|
|
504
|
+
#
|
|
505
|
+
# @param index [String] the index/collection name
|
|
506
|
+
# @param text [String] text query for keyword search
|
|
507
|
+
# @param top_k [Integer] number of results to return (default: 10)
|
|
508
|
+
# @param namespace [String, nil] optional namespace
|
|
509
|
+
# @param filter [Hash, nil] metadata filter
|
|
510
|
+
# @param include_values [Boolean] include vector values in results
|
|
511
|
+
# @param include_metadata [Boolean] include metadata in results
|
|
512
|
+
# @return [QueryResult] search results
|
|
513
|
+
#
|
|
514
|
+
# @example Basic text search
|
|
515
|
+
# results = client.text_search(
|
|
516
|
+
# index: 'products',
|
|
517
|
+
# text: 'iPhone 15 Pro',
|
|
518
|
+
# top_k: 10
|
|
519
|
+
# )
|
|
520
|
+
#
|
|
521
|
+
# @example Text search with filter
|
|
522
|
+
# results = client.text_search(
|
|
523
|
+
# index: 'products',
|
|
524
|
+
# text: 'laptop',
|
|
525
|
+
# filter: { category: 'electronics', in_stock: true }
|
|
526
|
+
# )
|
|
527
|
+
#
|
|
528
|
+
# @raise [UnsupportedFeatureError] if provider doesn't support text search
|
|
529
|
+
def text_search(index:, text:, top_k: 10, namespace: nil, filter: nil,
|
|
530
|
+
include_values: false, include_metadata: true)
|
|
531
|
+
index ||= default_index
|
|
532
|
+
namespace ||= default_namespace
|
|
533
|
+
validate_index!(index)
|
|
534
|
+
raise ValidationError, "Text query cannot be nil or empty" if text.nil? || text.empty?
|
|
535
|
+
|
|
536
|
+
unless provider.respond_to?(:text_search)
|
|
537
|
+
raise UnsupportedFeatureError,
|
|
538
|
+
"Text search is not supported by #{provider_name} provider"
|
|
539
|
+
end
|
|
540
|
+
|
|
541
|
+
Instrumentation.instrument(
|
|
542
|
+
operation: :text_search,
|
|
543
|
+
provider: provider_name,
|
|
544
|
+
index: index,
|
|
545
|
+
metadata: { top_k: top_k }
|
|
546
|
+
) do
|
|
547
|
+
@middleware.call(
|
|
548
|
+
:text_search,
|
|
549
|
+
index: index,
|
|
550
|
+
text: text,
|
|
551
|
+
top_k: top_k,
|
|
552
|
+
namespace: namespace,
|
|
553
|
+
filter: filter,
|
|
554
|
+
include_values: include_values,
|
|
555
|
+
include_metadata: include_metadata,
|
|
556
|
+
provider: provider_name
|
|
557
|
+
)
|
|
558
|
+
end
|
|
559
|
+
end
|
|
560
|
+
|
|
497
561
|
# Get the provider name
|
|
498
562
|
#
|
|
499
563
|
# @return [Symbol]
|
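Reviewer note: as rendered in this hunk, `text_search` declares `index:` as a required keyword, so the `index ||= default_index` fallback appears to run only when a caller passes `index: nil` explicitly. Omitting the keyword raises `ArgumentError` before the method body executes. The sketch below illustrates that difference with a hypothetical `SketchClient` class (not part of vectra-client); the `optional_index` variant is an assumed alternative, not something this release contains.

```ruby
# Hypothetical stand-in class; only the keyword signatures mirror the diff.
class SketchClient
  attr_reader :default_index

  def initialize(index: nil)
    @default_index = index
  end

  # Same shape as the signature in this hunk: `index:` is required.
  def required_index(index:, text:)
    index ||= default_index
    [index, text]
  end

  # Assumed alternative: `index: nil` lets callers omit the keyword.
  def optional_index(text:, index: nil)
    index ||= default_index
    [index, text]
  end
end

client = SketchClient.new(index: "products")

# Passing nil explicitly reaches the fallback to the default index.
explicit_nil = client.required_index(index: nil, text: "laptop")

# Omitting the keyword never reaches the method body.
omitted_error =
  begin
    client.required_index(text: "laptop")
  rescue ArgumentError => e
    e.message
  end

# With a nil default, omitting the keyword uses the client's default index.
omitted_ok = client.optional_index(text: "laptop")
```

If the documented behaviour ("uses client's default index when omitted") is the intent, a `nil` default on the keyword would be one way to get it.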
|
@@ -684,6 +748,44 @@ module Vectra
|
|
|
684
748
|
Middleware::Stack.new(@provider, all_middleware)
|
|
685
749
|
end
|
|
686
750
|
|
|
751
|
+
def apply_rails_vectra_defaults!
|
|
752
|
+
return unless rails_root_available?
|
|
753
|
+
|
|
754
|
+
entry = load_single_vectra_entry
|
|
755
|
+
return unless entry
|
|
756
|
+
|
|
757
|
+
apply_vectra_defaults_from(entry)
|
|
758
|
+
rescue StandardError => e
|
|
759
|
+
log_error("Failed to infer default index/namespace from config/vectra.yml", e)
|
|
760
|
+
end
|
|
761
|
+
|
|
762
|
+
def rails_root_available?
|
|
763
|
+
defined?(Rails) && Rails.respond_to?(:root) && Rails.root
|
|
764
|
+
end
|
|
765
|
+
|
|
766
|
+
def vectra_config_path
|
|
767
|
+
File.join(Rails.root.to_s, "config", "vectra.yml")
|
|
768
|
+
end
|
|
769
|
+
|
|
770
|
+
def load_single_vectra_entry
|
|
771
|
+
path = vectra_config_path
|
|
772
|
+
return unless File.exist?(path)
|
|
773
|
+
|
|
774
|
+
raw = File.read(path)
|
|
775
|
+
data = YAML.safe_load(raw, permitted_classes: [], aliases: true) || {}
|
|
776
|
+
return unless data.is_a?(Hash) && data.size == 1
|
|
777
|
+
|
|
778
|
+
data.values.first || {}
|
|
779
|
+
end
|
|
780
|
+
|
|
781
|
+
def apply_vectra_defaults_from(entry)
|
|
782
|
+
index = entry["index"] || entry[:index]
|
|
783
|
+
namespace = entry["namespace"] || entry[:namespace]
|
|
784
|
+
|
|
785
|
+
@default_index = index if @default_index.nil? && index.is_a?(String) && !index.empty?
|
|
786
|
+
@default_namespace = namespace if @default_namespace.nil? && namespace.is_a?(String) && !namespace.empty?
|
|
787
|
+
end
|
|
788
|
+
|
|
687
789
|
def validate_index!(index)
|
|
688
790
|
raise ValidationError, "Index name cannot be nil" if index.nil?
|
|
689
791
|
raise ValidationError, "Index name must be a string" unless index.is_a?(String)
|
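Reviewer note: the single-entry rule in `load_single_vectra_entry` can be tried out without a Rails app. The sketch below is a minimal illustration, assuming a hypothetical helper `single_entry_defaults` that takes YAML text directly instead of reading `config/vectra.yml`. The entry names (`product`, `article`) and the key layout are assumptions for illustration, not the generator's confirmed output.

```ruby
require "yaml"

# Hypothetical helper mirroring the single-entry rule from the hunk above;
# it parses YAML text instead of reading config/vectra.yml from disk.
def single_entry_defaults(yaml_text)
  data = YAML.safe_load(yaml_text, permitted_classes: [], aliases: true) || {}
  return nil unless data.is_a?(Hash) && data.size == 1

  entry = data.values.first || {}
  { index: entry["index"], namespace: entry["namespace"] }
end

one_entry = <<~YAML
  product:
    index: products
    namespace: tenant-1
YAML

two_entries = <<~YAML
  product:
    index: products
  article:
    index: articles
YAML

# Exactly one entry: its index and namespace become the defaults.
single = single_entry_defaults(one_entry)

# More than one entry: nothing is inferred.
multiple = single_entry_defaults(two_entries)
```

Note that in the hunk above the inference runs only when neither `index:` nor `namespace:` was passed to `Vectra::Client.new`.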
|
@@ -750,6 +852,22 @@ module Vectra
|
|
|
750
852
|
config.logger.debug("[Vectra] #{data.inspect}") if data
|
|
751
853
|
end
|
|
752
854
|
|
|
855
|
+
# Temporarily override request timeout within a block.
|
|
856
|
+
#
|
|
857
|
+
# This updates the client's configuration timeout for the duration
|
|
858
|
+
# of the block and then restores the previous value.
|
|
859
|
+
#
|
|
860
|
+
# @param seconds [Float] temporary timeout in seconds
|
|
861
|
+
# @yield [Client] yields self with overridden timeout
|
|
862
|
+
# @return [Object] block result
|
|
863
|
+
def with_timeout(seconds)
|
|
864
|
+
previous = config.timeout
|
|
865
|
+
config.timeout = seconds
|
|
866
|
+
yield self
|
|
867
|
+
ensure
|
|
868
|
+
config.timeout = previous
|
|
869
|
+
end
|
|
870
|
+
|
|
753
871
|
# Temporarily override default index within a block.
|
|
754
872
|
#
|
|
755
873
|
# @param index [String] temporary index name
|
|
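The `with_timeout` method added in this hunk follows a save, override, restore pattern. Below is a minimal sketch of that pattern with hypothetical stand-ins (`SketchConfig`, `SketchClient`); it is not the gem's `Vectra::Client`, and only the body of `with_timeout` is taken from the diff.

```ruby
# Hypothetical stand-in for the client's configuration object.
SketchConfig = Struct.new(:timeout)

# Hypothetical stand-in for the client; only with_timeout mirrors the diff.
class SketchClient
  attr_reader :config

  def initialize(timeout:)
    @config = SketchConfig.new(timeout)
  end

  # Override the timeout for the duration of the block, then restore it.
  # The ensure clause runs even when the block raises.
  def with_timeout(seconds)
    previous = config.timeout
    config.timeout = seconds
    yield self
  ensure
    config.timeout = previous
  end
end

client = SketchClient.new(timeout: 30)
inside = client.with_timeout(5) { |c| c.config.timeout }

begin
  client.with_timeout(1) { raise "boom" }
rescue RuntimeError
  # The previous timeout is restored despite the error.
end
after_error = client.config.timeout
```

Because the override is written to the client's own configuration object, callers that share one client across threads would presumably see each other's temporary value while a block is running.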
@@ -790,7 +908,7 @@ module Vectra
|
|
|
790
908
|
end
|
|
791
909
|
end
|
|
792
910
|
|
|
793
|
-
public :with_index, :with_namespace, :with_index_and_namespace
|
|
911
|
+
public :with_index, :with_namespace, :with_index_and_namespace, :with_timeout
|
|
794
912
|
end
|
|
795
913
|
# rubocop:enable Metrics/ClassLength
|
|
796
914
|
end
|
data/lib/vectra/health_check.rb
CHANGED
|
@@ -31,7 +31,7 @@ module Vectra
|
|
|
31
31
|
|
|
32
32
|
# For health checks we bypass client middleware and call the provider
|
|
33
33
|
# directly to avoid interference from custom stacks.
|
|
34
|
-
indexes =
|
|
34
|
+
indexes = healthcheck_with_timeout(timeout) { provider.list_indexes }
|
|
35
35
|
index_name = index || indexes.first&.dig(:name)
|
|
36
36
|
|
|
37
37
|
result = base_result(start_time, indexes)
|
|
@@ -53,7 +53,7 @@ module Vectra
|
|
|
53
53
|
|
|
54
54
|
private
|
|
55
55
|
|
|
56
|
-
def
|
|
56
|
+
def healthcheck_with_timeout(seconds, &)
|
|
57
57
|
Timeout.timeout(seconds, &)
|
|
58
58
|
rescue Timeout::Error
|
|
59
59
|
raise Vectra::TimeoutError, "Health check timed out after #{seconds}s"
|
|
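The helper renamed to `healthcheck_with_timeout` wraps a block in `Timeout.timeout` and translates the standard-library error into the library's own error class. A minimal sketch of that translation follows; `SketchTimeoutError` and `call_with_deadline` are hypothetical names, and a named block parameter is used instead of the anonymous `&` from the diff so the sketch also runs on Ruby versions before 3.1.

```ruby
require "timeout"

# Hypothetical error class standing in for Vectra::TimeoutError.
class SketchTimeoutError < StandardError; end

# Run the block with a deadline and re-raise Timeout::Error as a domain error.
def call_with_deadline(seconds, &block)
  Timeout.timeout(seconds, &block)
rescue Timeout::Error
  raise SketchTimeoutError, "Health check timed out after #{seconds}s"
end

fast = call_with_deadline(1) { :ok }

message =
  begin
    call_with_deadline(0.05) { sleep 1 }
  rescue SketchTimeoutError => e
    e.message
  end
```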
@@ -72,7 +72,7 @@ module Vectra
|
|
|
72
72
|
def add_index_stats(result, index_name, include_stats, timeout)
|
|
73
73
|
return unless include_stats && index_name
|
|
74
74
|
|
|
75
|
-
stats =
|
|
75
|
+
stats = healthcheck_with_timeout(timeout) { provider.stats(index: index_name) }
|
|
76
76
|
result[:index] = index_name
|
|
77
77
|
result[:stats] = {
|
|
78
78
|
vector_count: stats[:total_vector_count],
|
data/lib/vectra/middleware/request.rb
CHANGED
|
@@ -55,7 +55,7 @@ module Vectra
|
|
|
55
55
|
#
|
|
56
56
|
# @return [Boolean]
|
|
57
57
|
def read_operation?
|
|
58
|
-
[:query, :fetch, :list_indexes, :describe_index, :stats].include?(operation)
|
|
58
|
+
[:query, :text_search, :hybrid_search, :fetch, :list_indexes, :describe_index, :stats].include?(operation)
|
|
59
59
|
end
|
|
60
60
|
end
|
|
61
61
|
end
|
data/lib/vectra/providers/memory.rb
CHANGED
|
@@ -80,6 +80,32 @@ module Vectra
|
|
|
80
80
|
QueryResult.from_response(matches: matches, namespace: namespace)
|
|
81
81
|
end
|
|
82
82
|
|
|
83
|
+
# Text-only search using simple keyword matching in metadata
|
|
84
|
+
#
|
|
85
|
+
# For testing purposes only. Performs case-insensitive keyword matching
|
|
86
|
+
# in metadata values. Not a real BM25/full-text search implementation.
|
|
87
|
+
#
|
|
88
|
+
# @param index [String] index name
|
|
89
|
+
# @param text [String] text query for keyword search
|
|
90
|
+
# @param top_k [Integer] number of results
|
|
91
|
+
# @param namespace [String, nil] optional namespace
|
|
92
|
+
# @param filter [Hash, nil] metadata filter
|
|
93
|
+
# @param include_values [Boolean] include vector values
|
|
94
|
+
# @param include_metadata [Boolean] include metadata
|
|
95
|
+
# @return [QueryResult] search results
|
|
96
|
+
def text_search(index:, text:, top_k:, namespace: nil, filter: nil,
|
|
97
|
+
include_values: false, include_metadata: true)
|
|
98
|
+
ns = namespace || ""
|
|
99
|
+
candidates = filter_candidates(@storage[index][ns].values, filter)
|
|
100
|
+
text_lower = text.to_s.downcase
|
|
101
|
+
|
|
102
|
+
matches = find_text_matches(candidates, text_lower, include_values, include_metadata)
|
|
103
|
+
matches = matches.sort_by { |m| -m[:score] }.first(top_k)
|
|
104
|
+
|
|
105
|
+
log_debug("Text search returned #{matches.size} results")
|
|
106
|
+
QueryResult.from_response(matches: matches, namespace: namespace)
|
|
107
|
+
end
|
|
108
|
+
|
|
83
109
|
# @see Base#fetch
|
|
84
110
|
def fetch(index:, ids:, namespace: nil)
|
|
85
111
|
ns = namespace || ""
|
|
@@ -293,6 +319,36 @@ module Vectra
|
|
|
293
319
|
true
|
|
294
320
|
end
|
|
295
321
|
# rubocop:enable Naming/PredicateMethod
|
|
322
|
+
|
|
323
|
+
# Filter candidates by metadata filter
|
|
324
|
+
def filter_candidates(candidates, filter)
|
|
325
|
+
return candidates unless filter
|
|
326
|
+
|
|
327
|
+
candidates.select { |v| matches_filter?(v, filter) }
|
|
328
|
+
end
|
|
329
|
+
|
|
330
|
+
# Find text matches in candidates
|
|
331
|
+
def find_text_matches(candidates, text_lower, include_values, include_metadata)
|
|
332
|
+
candidates.map do |vec|
|
|
333
|
+
metadata_text = build_metadata_text(vec)
|
|
334
|
+
next unless metadata_text.include?(text_lower)
|
|
335
|
+
|
|
336
|
+
score = calculate_text_score(text_lower, metadata_text)
|
|
337
|
+
build_match(vec, score, include_values, include_metadata)
|
|
338
|
+
end.compact
|
|
339
|
+
end
|
|
340
|
+
|
|
341
|
+
# Build metadata text string for searching
|
|
342
|
+
def build_metadata_text(vector)
|
|
343
|
+
(vector.metadata || {}).values.map(&:to_s).join(" ").downcase
|
|
344
|
+
end
|
|
345
|
+
|
|
346
|
+
# Calculate text match score based on word matches
|
|
347
|
+
def calculate_text_score(query_text, metadata_text)
|
|
348
|
+
query_words = query_text.split(/\s+/)
|
|
349
|
+
matched_words = query_words.count { |word| metadata_text.include?(word) }
|
|
350
|
+
matched_words.to_f / query_words.size
|
|
351
|
+
end
|
|
296
352
|
end
|
|
297
353
|
end
|
|
298
354
|
end
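The keyword matching added to the memory provider can be reproduced in a few lines. This is a sketch under assumptions: `SketchVector`, `metadata_text`, `text_score`, and `keyword_search` are hypothetical names, matches are plain hashes rather than the provider's match objects, and the early return for an empty query is an addition of this sketch (the diffed `calculate_text_score` divides by `query_words.size`, which is zero for an empty query).

```ruby
# Hypothetical stand-in for a stored vector; only id and metadata matter here.
SketchVector = Struct.new(:id, :metadata)

# Join all metadata values into one lowercase string, as in the diff.
def metadata_text(vector)
  (vector.metadata || {}).values.map(&:to_s).join(" ").downcase
end

# Fraction of query words found in the metadata text.
def text_score(query, haystack)
  words = query.split(/\s+/).reject(&:empty?)
  return 0.0 if words.empty? # guard added in this sketch, not in the diff

  words.count { |word| haystack.include?(word) }.to_f / words.size
end

# Case-insensitive substring match over metadata, best score first.
def keyword_search(vectors, text, top_k:)
  query = text.to_s.downcase
  matches = vectors.map do |vec|
    haystack = metadata_text(vec)
    next unless haystack.include?(query)

    { id: vec.id, score: text_score(query, haystack) }
  end
  matches.compact.sort_by { |m| -m[:score] }.first(top_k)
end

vectors = [
  SketchVector.new("a", { title: "Ruby Vector Search" }),
  SketchVector.new("b", { title: "Cooking basics" }),
  SketchVector.new("c", nil)
]

results = keyword_search(vectors, "vector SEARCH", top_k: 5)
```

Since a candidate is kept only when the whole query is a substring of its metadata text, every word of the query is also found there, so each match appears to score 1.0 under this logic and the sort does not reorder results.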
|
data/lib/vectra/providers/pgvector.rb
CHANGED
|
@@ -28,6 +28,7 @@ module Vectra
|
|
|
28
28
|
# )
|
|
29
29
|
# client.upsert(index: 'documents', vectors: [...])
|
|
30
30
|
#
|
|
31
|
+
# rubocop:disable Metrics/ClassLength
|
|
31
32
|
class Pgvector < Base
|
|
32
33
|
include Connection
|
|
33
34
|
include SqlHelpers
|
|
@@ -162,6 +163,54 @@ module Vectra
|
|
|
162
163
|
)
|
|
163
164
|
end
|
|
164
165
|
|
|
166
|
+
# Text-only search using PostgreSQL full-text search
|
|
167
|
+
#
|
|
168
|
+
# @param index [String] table name
|
|
169
|
+
# @param text [String] text query for full-text search
|
|
170
|
+
# @param top_k [Integer] number of results
|
|
171
|
+
# @param namespace [String, nil] optional namespace
|
|
172
|
+
# @param filter [Hash, nil] metadata filter
|
|
173
|
+
# @param include_values [Boolean] include vector values
|
|
174
|
+
# @param include_metadata [Boolean] include metadata
|
|
175
|
+
# @param text_column [String] column name for full-text search (default: 'content')
|
|
176
|
+
# @return [QueryResult] search results
|
|
177
|
+
#
|
|
178
|
+
# @note Your table should have a text column with a tsvector index:
|
|
179
|
+
# CREATE INDEX idx_content_fts ON my_index USING gin(to_tsvector('english', content));
|
|
180
|
+
def text_search(index:, text:, top_k:, namespace: nil, filter: nil,
|
|
181
|
+
include_values: false, include_metadata: true,
|
|
182
|
+
text_column: "content")
|
|
183
|
+
ensure_table_exists!(index)
|
|
184
|
+
|
|
185
|
+
select_cols = ["id"]
|
|
186
|
+
select_cols << "embedding" if include_values
|
|
187
|
+
select_cols << "metadata" if include_metadata
|
|
188
|
+
|
|
189
|
+
# Use ts_rank for scoring
|
|
190
|
+
text_score = "ts_rank(to_tsvector('english', COALESCE(#{quote_ident(text_column)}, '')), " \
|
|
191
|
+
"plainto_tsquery('english', #{escape_literal(text)}))"
|
|
192
|
+
select_cols << "#{text_score} AS score"
|
|
193
|
+
|
|
194
|
+
where_clauses = build_where_clauses(namespace, filter)
|
|
195
|
+
where_clauses << "to_tsvector('english', COALESCE(#{quote_ident(text_column)}, '')) @@ " \
|
|
196
|
+
"plainto_tsquery('english', #{escape_literal(text)})"
|
|
197
|
+
|
|
198
|
+
sql = "SELECT #{select_cols.join(', ')} FROM #{quote_ident(index)}"
|
|
199
|
+
sql += " WHERE #{where_clauses.join(' AND ')}" if where_clauses.any?
|
|
200
|
+
sql += " ORDER BY score DESC"
|
|
201
|
+
sql += " LIMIT #{top_k.to_i}"
|
|
202
|
+
|
|
203
|
+
result = execute(sql)
|
|
204
|
+
matches = result.map { |row| build_match_from_row(row, include_values, include_metadata) }
|
|
205
|
+
|
|
206
|
+
log_debug("Text search returned #{matches.size} results")
|
|
207
|
+
|
|
208
|
+
QueryResult.from_response(
|
|
209
|
+
matches: matches,
|
|
210
|
+
namespace: namespace
|
|
211
|
+
)
|
|
212
|
+
end
|
|
213
|
+
|
|
165
214
|
# @see Base#fetch
|
|
166
215
|
def fetch(index:, ids:, namespace: nil)
|
|
167
216
|
ensure_table_exists!(index)
|
|
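The SQL assembled by the pgvector `text_search` above can be previewed without a database. The sketch below only builds the statement string. `text_search_sql` is a hypothetical name, and the two quoting helpers are simplified stand-ins written for this example (they double embedded quotes); the gem's own `quote_ident` and `escape_literal` are not shown in this diff and may behave differently. Namespace and metadata filters are left out.

```ruby
# Simplified stand-in: quote an identifier by doubling embedded double quotes.
def quote_ident(name)
  %("#{name.to_s.gsub('"', '""')}")
end

# Simplified stand-in: quote a literal by doubling embedded single quotes.
# Real code should prefer the database driver's quoting or bind parameters.
def escape_literal(value)
  "'#{value.to_s.gsub("'", "''")}'"
end

# Build a full-text query ranked by ts_rank, mirroring the shape in the diff.
def text_search_sql(table:, text:, top_k:, text_column: "content")
  document = "to_tsvector('english', COALESCE(#{quote_ident(text_column)}, ''))"
  query    = "plainto_tsquery('english', #{escape_literal(text)})"

  "SELECT id, ts_rank(#{document}, #{query}) AS score " \
    "FROM #{quote_ident(table)} " \
    "WHERE #{document} @@ #{query} " \
    "ORDER BY score DESC LIMIT #{top_k.to_i}"
end

sql = text_search_sql(table: "docs", text: "it's ruby", top_k: 3)
```

The same `to_tsvector` expression appears in both the ranking and the `WHERE` clause, which is why the diff's note recommends a GIN index on that exact expression.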
@@ -361,5 +410,6 @@ module Vectra
|
|
|
361
410
|
raise ConfigurationError, "Host (connection URL or hostname) must be configured for pgvector"
|
|
362
411
|
end
|
|
363
412
|
end
|
|
413
|
+
# rubocop:enable Metrics/ClassLength
|
|
364
414
|
end
|
|
365
415
|
end
|
data/lib/vectra/providers/qdrant.rb
CHANGED
|
@@ -110,6 +110,45 @@ module Vectra
|
|
|
110
110
|
handle_hybrid_search_response(response, alpha, namespace)
|
|
111
111
|
end
|
|
112
112
|
|
|
113
|
+
# Text-only search using Qdrant's BM25 text search
|
|
114
|
+
#
|
|
115
|
+
# @param index [String] collection name
|
|
116
|
+
# @param text [String] text query for keyword search
|
|
117
|
+
# @param top_k [Integer] number of results
|
|
118
|
+
# @param namespace [String, nil] optional namespace
|
|
119
|
+
# @param filter [Hash, nil] metadata filter
|
|
120
|
+
# @param include_values [Boolean] include vector values
|
|
121
|
+
# @param include_metadata [Boolean] include metadata
|
|
122
|
+
# @return [QueryResult] search results
|
|
123
|
+
def text_search(index:, text:, top_k:, namespace: nil, filter: nil,
|
|
124
|
+
include_values: false, include_metadata: true)
|
|
125
|
+
qdrant_filter = build_filter(filter, namespace)
|
|
126
|
+
body = {
|
|
127
|
+
query: { text: text },
|
|
128
|
+
limit: top_k,
|
|
129
|
+
with_vector: include_values,
|
|
130
|
+
with_payload: include_metadata
|
|
131
|
+
}
|
|
132
|
+
|
|
133
|
+
body[:filter] = qdrant_filter if qdrant_filter
|
|
134
|
+
|
|
135
|
+
response = with_error_handling do
|
|
136
|
+
connection.post("/collections/#{index}/points/query", body)
|
|
137
|
+
end
|
|
138
|
+
|
|
139
|
+
if response.success?
|
|
140
|
+
matches = transform_search_results(response.body["result"] || [])
|
|
141
|
+
log_debug("Text search returned #{matches.size} results")
|
|
142
|
+
|
|
143
|
+
QueryResult.from_response(
|
|
144
|
+
matches: matches,
|
|
145
|
+
namespace: namespace
|
|
146
|
+
)
|
|
147
|
+
else
|
|
148
|
+
handle_error(response)
|
|
149
|
+
end
|
|
150
|
+
end
|
|
151
|
+
|
|
113
152
|
# @see Base#fetch
|
|
114
153
|
def fetch(index:, ids:, namespace: nil) # rubocop:disable Lint/UnusedMethodArgument
|
|
115
154
|
point_ids = ids.map { |id| generate_point_id(id) }
|
data/lib/vectra/providers/weaviate.rb
CHANGED
|
@@ -139,6 +139,36 @@ module Vectra
|
|
|
139
139
|
include_values, include_metadata)
|
|
140
140
|
end
|
|
141
141
|
|
|
142
|
+
# Text-only search using Weaviate's BM25 text search
|
|
143
|
+
#
|
|
144
|
+
# @param index [String] class name
|
|
145
|
+
# @param text [String] text query for BM25 search
|
|
146
|
+
# @param top_k [Integer] number of results
|
|
147
|
+
# @param namespace [String, nil] optional namespace (not used in Weaviate)
|
|
148
|
+
# @param filter [Hash, nil] metadata filter
|
|
149
|
+
# @param include_values [Boolean] include vector values
|
|
150
|
+
# @param include_metadata [Boolean] include metadata
|
|
151
|
+
# @return [QueryResult] search results
|
|
152
|
+
def text_search(index:, text:, top_k:, namespace: nil, filter: nil,
|
|
153
|
+
include_values: false, include_metadata: true)
|
|
154
|
+
where_filter = build_where(filter, namespace)
|
|
155
|
+
graphql = build_text_search_graphql(
|
|
156
|
+
index: index,
|
|
157
|
+
text: text,
|
|
158
|
+
top_k: top_k,
|
|
159
|
+
where_filter: where_filter,
|
|
160
|
+
include_values: include_values,
|
|
161
|
+
include_metadata: include_metadata
|
|
162
|
+
)
|
|
163
|
+
body = { "query" => graphql }
|
|
164
|
+
|
|
165
|
+
response = with_error_handling do
|
|
166
|
+
connection.post("#{API_BASE_PATH}/graphql", body)
|
|
167
|
+
end
|
|
168
|
+
|
|
169
|
+
handle_text_search_response(response, index, namespace, include_values, include_metadata)
|
|
170
|
+
end
|
|
171
|
+
|
|
142
172
|
# rubocop:disable Metrics/PerceivedComplexity
|
|
143
173
|
def fetch(index:, ids:, namespace: nil)
|
|
144
174
|
body = {
|
|
@@ -337,6 +367,26 @@ module Vectra
|
|
|
337
367
|
build_graphql_query(index, top_k, text, alpha, vector, where_filter, selection_block)
|
|
338
368
|
end
|
|
339
369
|
|
|
370
|
+
def build_text_search_graphql(index:, text:, top_k:, where_filter:,
|
|
371
|
+
include_values:, include_metadata:)
|
|
372
|
+
selection_block = build_selection_fields(include_values, include_metadata).join(" ")
|
|
373
|
+
<<~GRAPHQL
|
|
374
|
+
{
|
|
375
|
+
Get {
|
|
376
|
+
#{index}(
|
|
377
|
+
limit: #{top_k}
|
|
378
|
+
bm25: {
|
|
379
|
+
query: "#{text.gsub('"', '\\"')}"
|
|
380
|
+
}
|
|
381
|
+
#{"where: #{JSON.generate(where_filter)}" if where_filter}
|
|
382
|
+
) {
|
|
383
|
+
#{selection_block}
|
|
384
|
+
}
|
|
385
|
+
}
|
|
386
|
+
}
|
|
387
|
+
GRAPHQL
|
|
388
|
+
end
|
|
389
|
+
|
|
340
390
|
def build_graphql_query(index, top_k, text, alpha, vector, where_filter, selection_block)
|
|
341
391
|
<<~GRAPHQL
|
|
342
392
|
{
|
|
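`build_text_search_graphql` interpolates the user's text into a GraphQL string literal and escapes only double quotes. The sketch below shows one possible alternative, not the gem's approach: it lets `JSON.generate` produce the quoted literal, on the assumption that JSON's escapes (quote, backslash, control characters) are also accepted inside GraphQL strings. `graphql_string` and `bm25_query` are hypothetical names, and the default `fields` selection is a placeholder chosen for illustration.

```ruby
require "json"

# Hypothetical helper: render text as a quoted, escaped string literal.
def graphql_string(text)
  JSON.generate(text.to_s)
end

# Hypothetical builder for a BM25 query with the same overall shape as the diff.
# The fields default is a placeholder for this sketch.
def bm25_query(class_name:, text:, top_k:, fields: "_additional { id score }")
  <<~GRAPHQL
    {
      Get {
        #{class_name}(
          limit: #{Integer(top_k)}
          bm25: { query: #{graphql_string(text)} }
        ) {
          #{fields}
        }
      }
    }
  GRAPHQL
end

# Input containing a double quote and a backslash.
tricky = "a" + '"' + "b" + "\\" + "c"
query  = bm25_query(class_name: "Document", text: tricky, top_k: 5)
```

With quote-only escaping, a backslash in the input would reach the query unescaped; generating the literal through a JSON encoder covers that case as well.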
@@ -379,6 +429,20 @@ module Vectra
|
|
|
379
429
|
end
|
|
380
430
|
end
|
|
381
431
|
|
|
432
|
+
def handle_text_search_response(response, index, namespace, include_values, include_metadata)
|
|
433
|
+
if response.success?
|
|
434
|
+
matches = extract_query_matches(response.body, index, include_values, include_metadata)
|
|
435
|
+
log_debug("Text search returned #{matches.size} results")
|
|
436
|
+
|
|
437
|
+
QueryResult.from_response(
|
|
438
|
+
matches: matches,
|
|
439
|
+
namespace: namespace
|
|
440
|
+
)
|
|
441
|
+
else
|
|
442
|
+
handle_error(response)
|
|
443
|
+
end
|
|
444
|
+
end
|
|
445
|
+
|
|
382
446
|
def validate_config!
|
|
383
447
|
super
|
|
384
448
|
raise ConfigurationError, "Host must be configured for Weaviate" if config.host.nil? || config.host.empty?
|
data/lib/vectra/version.rb
CHANGED
metadata
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: vectra-client
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 1.1.
|
|
4
|
+
version: 1.1.2
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Mijo Kristo
|
|
@@ -274,6 +274,7 @@ files:
|
|
|
274
274
|
- docs/guides/rails-integration.md
|
|
275
275
|
- docs/guides/rails-troubleshooting.md
|
|
276
276
|
- docs/guides/recipes.md
|
|
277
|
+
- docs/guides/roadmap.md
|
|
277
278
|
- docs/guides/runbooks/cache-issues.md
|
|
278
279
|
- docs/guides/runbooks/high-error-rate.md
|
|
279
280
|
- docs/guides/runbooks/high-latency.md
|
|
@@ -288,6 +289,7 @@ files:
|
|
|
288
289
|
- docs/providers/qdrant.md
|
|
289
290
|
- docs/providers/selection.md
|
|
290
291
|
- docs/providers/weaviate.md
|
|
292
|
+
- docs/search.json
|
|
291
293
|
- examples/GRAFANA_QUICKSTART.md
|
|
292
294
|
- examples/README.md
|
|
293
295
|
- examples/active_record_demo.rb
|