vectra-client 0.2.2 → 0.3.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,287 @@
+ ---
+ layout: page
+ title: "Runbook: High Latency"
+ permalink: /guides/runbooks/high-latency/
+ ---
+
+ # Runbook: High Latency
+
+ **Alert:** `VectraHighLatency`
+ **Severity:** Warning
+ **Threshold:** P95 latency >2s for 5 minutes
+
+ ## Symptoms
+
+ - Slow vector operations
+ - Request timeouts
+ - User-facing latency issues
+ - Queue backlog building up
+
+ ## Quick Diagnosis
+
+ ```promql
+ # Check current latency by operation
+ histogram_quantile(0.95,
+   sum(rate(vectra_request_duration_seconds_bucket[5m])) by (le, operation)
+ )
+ ```
+
+ ```ruby
+ # Test latency in console
+ require 'benchmark'
+
+ time = Benchmark.realtime do
+   client.query(index: "test", vector: [0.1] * 384, top_k: 10)
+ end
+ puts "Query latency: #{(time * 1000).round}ms"
+ ```
+
+ ## Investigation Steps
+
+ ### 1. Identify Slow Operations
+
+ ```promql
+ # Which operations are slow?
+ topk(5,
+   histogram_quantile(0.95,
+     sum(rate(vectra_request_duration_seconds_bucket[5m])) by (le, operation)
+   )
+ )
+ ```
+
+ | Operation | Expected P95 | Alert Threshold |
+ |-----------|--------------|-----------------|
+ | query | <500ms | >2s |
+ | upsert (single) | <200ms | >1s |
+ | upsert (batch 100) | <2s | >5s |
+ | fetch | <100ms | >500ms |
+ | delete | <200ms | >1s |
+
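The table above can be turned into a quick triage check. A sketch (the constant and method names here are illustrative, not part of the vectra-client API), using the single-operation alert thresholds in seconds:

```ruby
# Alert thresholds in seconds, from the table above (single-vector upsert).
# ALERT_THRESHOLDS and breaching_operations are our names, not Vectra API.
ALERT_THRESHOLDS = {
  "query"  => 2.0,
  "upsert" => 1.0,
  "fetch"  => 0.5,
  "delete" => 1.0
}.freeze

# Return the subset of measured P95s that breach their alert threshold
def breaching_operations(p95_by_op)
  p95_by_op.select { |op, p95| p95 > ALERT_THRESHOLDS.fetch(op, Float::INFINITY) }
end

puts breaching_operations("query" => 3.1, "fetch" => 0.05).keys.inspect  # => ["query"]
```

Feed it the P95 values from the PromQL query above to get the operations worth investigating first.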
+ ### 2. Check Provider Status
+
+ ```bash
+ # Create a timing template for curl (referenced as @curl-format.txt below)
+ cat > curl-format.txt <<'EOF'
+ time_namelookup:    %{time_namelookup}\n
+ time_connect:       %{time_connect}\n
+ time_starttransfer: %{time_starttransfer}\n
+ time_total:         %{time_total}\n
+ EOF
+
+ # Test provider connectivity with a timing breakdown
+ curl -w "@curl-format.txt" -o /dev/null -s https://api.pinecone.io/health
+ ```
+
+ ### 3. Check Network Latency
+
+ ```bash
+ # Ping provider endpoint
+ ping -c 10 api.pinecone.io
+
+ # Check for packet loss along the route (non-interactive report)
+ mtr --report --report-cycles 10 api.pinecone.io
+
+ # DNS resolution time
+ time nslookup api.pinecone.io
+ ```
+
+ ### 4. Check Vector Dimensions
+
+ ```ruby
+ # Large vectors = slower operations
+ client.describe_index(index: "my-index")
+ # => { dimension: 1536, ... }
+
+ # Consider using smaller embeddings:
+ # - text-embedding-3-small: 512-1536 dims
+ # - text-embedding-ada-002: 1536 dims
+ # - all-MiniLM-L6-v2: 384 dims (faster!)
+ ```
+
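To make the dimension comparison concrete, here is the back-of-envelope payload math (4 bytes per float32 component; a sketch only, ignoring provider-side index overhead):

```ruby
# Approximate wire/storage cost of one float32 vector (illustrative helper)
def vector_bytes(dimension)
  dimension * 4
end

[384, 1536].each do |dim|
  puts format("%4d dims = %.1f KB per vector", dim, vector_bytes(dim) / 1024.0)
end
# 384 dims = 1.5 KB, 1536 dims = 6.0 KB -- a 4x difference on every request
```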
+ ### 5. Check Index Size
+
+ ```ruby
+ stats = client.stats(index: "my-index")
+ puts "Vector count: #{stats[:total_vector_count]}"
+ puts "Index fullness: #{stats[:index_fullness]}"
+
+ # Large indexes may need optimization:
+ # - Pinecone: check pod type
+ # - pgvector: check IVFFlat parameters
+ # - Qdrant: check HNSW parameters
+ ```
+
+ ## Resolution Steps
+
+ ### Immediate: Increase Timeouts
+
+ ```ruby
+ Vectra.configure do |config|
+   config.timeout = 60       # Increase from 30
+   config.open_timeout = 20  # Increase from 10
+ end
+ ```
+
+ ### Enable Caching
+
+ ```ruby
+ cache = Vectra::Cache.new(ttl: 300, max_size: 1000)
+ cached_client = Vectra::CachedClient.new(client, cache: cache)
+
+ # Repeated queries are served from the cache
+ ```
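What the cache buys you is a read-through lookup keyed by query. A minimal sketch of that pattern (a toy class for illustration, not the `Vectra::Cache` implementation):

```ruby
# Toy read-through TTL cache illustrating the pattern (not the real Vectra::Cache)
class TinyTTLCache
  def initialize(ttl:)
    @ttl = ttl
    @store = {}
  end

  # Return the cached value if still fresh, otherwise compute and store it
  def fetch(key)
    entry = @store[key]
    return entry[:value] if entry && Time.now - entry[:at] < @ttl

    value = yield
    @store[key] = { value: value, at: Time.now }
    value
  end
end

cache = TinyTTLCache.new(ttl: 300)
calls = 0
2.times { cache.fetch("query:abc") { calls += 1; [:result] } }
puts "backend calls: #{calls}"  # => backend calls: 1
```

The second lookup never touches the backend, which is why repeat queries stop contributing to P95.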
+
+ ### Optimize Batch Operations
+
+ ```ruby
+ # Use smaller batches for faster responses
+ batch = Vectra::Batch.new(client, concurrency: 2)
+
+ result = batch.upsert_async(
+   index: "my-index",
+   vectors: vectors,
+   chunk_size: 50  # Smaller chunks = faster individual operations
+ )
+ ```
+
+ ### Reduce top_k
+
+ ```ruby
+ # Fewer results = faster query
+ results = client.query(
+   index: "my-index",
+   vector: query_vec,
+   top_k: 5  # Instead of 100
+ )
+ ```
+
+ ### Provider-Specific Optimizations
+
+ #### Pinecone
+
+ ```ruby
+ # Use serverless for auto-scaling
+ # Or upgrade pod type for more capacity
+ ```
+
+ #### pgvector
+
+ ```sql
+ -- Check if index exists
+ SELECT indexname FROM pg_indexes WHERE tablename = 'your_table';
+
+ -- Create IVFFlat index for faster queries
+ CREATE INDEX ON your_table
+ USING ivfflat (embedding vector_cosine_ops)
+ WITH (lists = 100);
+
+ -- Increase probes for accuracy vs speed trade-off
+ SET ivfflat.probes = 10; -- Default: 1
+ ```
+
+ #### Qdrant
+
+ ```ruby
+ # Optimize HNSW parameters
+ client.provider.create_index(
+   name: "optimized",
+   dimension: 384,
+   metric: "cosine",
+   hnsw_config: {
+     m: 16,             # Connections per node
+     ef_construct: 100  # Build-time accuracy
+   }
+ )
+ ```
+
+ ### Connection Pooling (pgvector)
+
+ ```ruby
+ # Warm up connections to avoid cold-start latency
+ client.provider.warmup_pool(5)
+
+ # Increase pool size for parallel queries
+ Vectra.configure do |config|
+   config.pool_size = 20
+ end
+ ```
+
+ ## Prevention
+
+ ### 1. Monitor Latency Trends
+
+ ```promql
+ # Alert when mean latency over the last hour exceeds 1s
+ rate(vectra_request_duration_seconds_sum[1h]) /
+   rate(vectra_request_duration_seconds_count[1h]) > 1
+ ```
+
+ ### 2. Implement Request Timeouts
+
+ ```ruby
+ # Fail fast instead of hanging
+ Vectra.configure do |config|
+   config.timeout = 10  # Strict timeout
+ end
+ ```
+
+ ### 3. Use Async Operations
+
+ ```ruby
+ # Don't block on upserts
+ Thread.new do
+   batch.upsert_async(index: "bg-index", vectors: vectors)
+ end
+ ```
+
+ ### 4. Index Maintenance
+
+ ```sql
+ -- pgvector: Reindex periodically
+ REINDEX INDEX your_ivfflat_index;
+
+ -- Analyze for query planner
+ ANALYZE your_table;
+ ```
+
+ ### 5. Geographic Optimization
+
+ ```ruby
+ # Use closest region to your servers
+ # Pinecone: us-east-1, us-west-2, eu-west-1
+ # Qdrant Cloud: Select nearest region
+ ```
+
+ ## Benchmarking
+
+ ```ruby
+ # Run benchmark to establish a baseline
+ require 'benchmark'
+
+ vec = [0.1] * 384  # sample query vector matching your index dimension
+ vectors_100 = Array.new(100) { |i| { id: "bench-#{i}", values: vec } }
+
+ results = Benchmark.bm do |x|
+   x.report("query") do
+     100.times { client.query(index: "test", vector: vec, top_k: 10) }
+   end
+
+   x.report("upsert") do
+     client.upsert(index: "test", vectors: vectors_100)
+   end
+
+   x.report("fetch") do
+     100.times { client.fetch(index: "test", ids: ["id1"]) }
+   end
+ end
+ ```
+
+ ## Escalation
+
+ | Time | Action |
+ |------|--------|
+ | 5 min | Enable caching, increase timeouts |
+ | 15 min | Check provider status, optimize queries |
+ | 30 min | Scale up provider resources |
+ | 1 hour | Engage provider support |
+
+ ## Related
+
+ - [High Error Rate Runbook]({{ site.baseurl }}/guides/runbooks/high-error-rate)
+ - [Performance Guide]({{ site.baseurl }}/guides/performance)
+ - [Monitoring Guide]({{ site.baseurl }}/guides/monitoring)
@@ -0,0 +1,216 @@
+ ---
+ layout: page
+ title: "Runbook: Pool Exhaustion"
+ permalink: /guides/runbooks/pool-exhausted/
+ ---
+
+ # Runbook: Pool Exhaustion
+
+ **Alert:** `VectraPoolExhausted`
+ **Severity:** Critical
+ **Threshold:** 0 available connections for 1 minute
+
+ ## Symptoms
+
+ - `Vectra::Pool::TimeoutError` exceptions
+ - Requests timing out waiting for connections
+ - Application threads blocked
+
+ ## Quick Diagnosis
+
+ ```ruby
+ # Check pool stats
+ client = Vectra::Client.new(provider: :pgvector, host: ENV['DATABASE_URL'])
+ puts client.provider.pool_stats
+ # => { available: 0, checked_out: 10, size: 10 }
+ ```
+
+ ```bash
+ # Check PostgreSQL connections
+ psql -c "SELECT count(*) FROM pg_stat_activity WHERE application_name LIKE '%vectra%';"
+ ```
+
+ ## Investigation Steps
+
+ ### 1. Check Current Pool State
+
+ ```ruby
+ stats = client.provider.pool_stats
+ puts "Available: #{stats[:available]}"
+ puts "Checked out: #{stats[:checked_out]}"
+ puts "Total size: #{stats[:size]}"
+ puts "Shutdown: #{stats[:shutdown]}"
+ ```
+
+ ### 2. Identify Connection Leaks
+
+ ```ruby
+ # Look for connections not being returned
+ # Common causes:
+ # - Missing ensure blocks
+ # - Exceptions before checkin
+ # - Long-running operations
+
+ # Bad:
+ conn = pool.checkout
+ do_something(conn)  # If this raises, connection is leaked!
+ pool.checkin(conn)
+
+ # Good:
+ pool.with_connection do |conn|
+   do_something(conn)
+ end  # Always returns connection
+ ```
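Why the block form cannot leak is easiest to see with a toy pool: the `ensure` clause runs on every exit path, including exceptions. This is an illustrative class, not the Vectra pool implementation:

```ruby
# Toy pool demonstrating ensure-based checkin (illustrative, not Vectra::Pool)
class TinyPool
  def initialize(size)
    @free = Array.new(size) { Object.new }
  end

  def checkout
    @free.pop || raise("pool exhausted")
  end

  def checkin(conn)
    @free.push(conn)
  end

  def with_connection
    conn = checkout
    yield conn
  ensure
    checkin(conn) if conn  # connection returned even if the block raised
  end

  def available
    @free.size
  end
end

pool = TinyPool.new(2)
pool.with_connection { |conn| }          # normal path: connection returned
begin
  pool.with_connection { raise "boom" }  # error path: still returned
rescue RuntimeError
end
puts "available: #{pool.available}"  # => available: 2
```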
+
+ ### 3. Check for Long-Running Queries
+
+ ```sql
+ -- PostgreSQL: Find long-running queries
+ SELECT pid, now() - pg_stat_activity.query_start AS duration, query
+ FROM pg_stat_activity
+ WHERE state != 'idle'
+   AND query NOT LIKE '%pg_stat_activity%'
+ ORDER BY duration DESC;
+
+ -- Kill a long-running query if needed
+ SELECT pg_terminate_backend(12345); -- replace 12345 with a pid from the query above
+ ```
+
+ ### 4. Check Application Thread Count
+
+ ```ruby
+ # If using Puma/Sidekiq
+ # Ensure pool_size >= max_threads
+ puts "Thread count: #{Thread.list.count}"
+ puts "Pool size: #{client.config.pool_size}"
+ ```
87
+
88
+ ## Resolution Steps
89
+
90
+ ### Immediate: Restart Connection Pool
91
+
92
+ ```ruby
93
+ # Force pool restart
94
+ client.provider.shutdown_pool
95
+ # Pool will be recreated on next operation
96
+ ```
97
+
98
+ ### Increase Pool Size
99
+
100
+ ```ruby
101
+ Vectra.configure do |config|
102
+ config.provider = :pgvector
103
+ config.host = ENV['DATABASE_URL']
104
+ config.pool_size = 20 # Increase from default 5
105
+ config.pool_timeout = 10 # Increase timeout
106
+ end
107
+ ```
108
+
109
+ ### Fix Connection Leaks
110
+
111
+ ```ruby
112
+ # Always use with_connection block
113
+ client.provider.with_pooled_connection do |conn|
114
+ # Your code here
115
+ # Connection automatically returned
116
+ end
117
+
118
+ # Or ensure checkin in rescue
119
+ begin
120
+ conn = pool.checkout
121
+ do_work(conn)
122
+ ensure
123
+ pool.checkin(conn) if conn
124
+ end
125
+ ```
+
+ ### Reduce Connection Hold Time
+
+ ```ruby
+ # Break up long operations
+ large_dataset.each_slice(100) do |batch|
+   client.provider.with_pooled_connection do |conn|
+     process_batch(batch, conn)
+   end
+   # Connection returned between batches
+ end
+ ```
+
+ ### Add Connection Warmup
+
+ ```ruby
+ # In application initializer
+ client = Vectra::Client.new(provider: :pgvector, host: ENV['DATABASE_URL'])
+ client.provider.warmup_pool(5)  # Pre-create 5 connections
+ ```
+
+ ## Prevention
+
+ ### 1. Right-size Pool
+
+ ```ruby
+ # Formula: pool_size = (max_threads * 1.5) + background_workers
+ # Example: Puma with 5 threads, 3 Sidekiq workers
+ pool_size = (5 * 1.5) + 3 # => 10.5; round up to 11 (12 adds headroom)
+ ```
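The rule of thumb above as a tiny helper (the method name is ours, not part of Vectra; round the result up and add headroom if you like):

```ruby
# pool_size = ceil(max_threads * 1.5 + background_workers) -- rule of thumb above
def recommended_pool_size(max_threads:, background_workers: 0)
  (max_threads * 1.5 + background_workers).ceil
end

puts recommended_pool_size(max_threads: 5, background_workers: 3)  # => 11
```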
+
+ ### 2. Monitor Pool Usage
+
+ ```promql
+ # Alert when pool is >80% utilized (checked out / total)
+ vectra_pool_connections{state="checked_out"}
+   / (vectra_pool_connections{state="checked_out"}
+      + vectra_pool_connections{state="available"}) > 0.8
+ ```
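The same check can be computed in-process from `pool_stats` (the helper name is ours, not Vectra API):

```ruby
# Fraction of pool connections currently checked out (0.0..1.0)
def pool_utilization(stats)
  total = stats[:checked_out] + stats[:available]
  return 0.0 if total.zero?

  stats[:checked_out].to_f / total
end

puts pool_utilization(checked_out: 9, available: 1)  # => 0.9
```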
+
+ ### 3. Implement Connection Timeout
+
+ ```ruby
+ Vectra.configure do |config|
+   config.pool_timeout = 5  # Fail fast instead of hanging
+ end
+ ```
+
+ ### 4. Use Connection Pool Metrics
+
+ ```ruby
+ # Log pool stats periodically (scheduler DSL shown; use any recurring job)
+ every(60.seconds) do
+   stats = client.provider.pool_stats
+   logger.info "Pool: avail=#{stats[:available]} out=#{stats[:checked_out]}"
+ end
+ ```
+
+ ## PostgreSQL-Specific
+
+ ### Check max_connections
+
+ ```sql
+ SHOW max_connections; -- Default: 100
+
+ -- Increase if needed (requires restart)
+ ALTER SYSTEM SET max_connections = 200;
+ ```
+
+ ### Monitor Connection Usage
+
+ ```sql
+ SELECT
+   count(*) as total,
+   count(*) FILTER (WHERE state = 'active') as active,
+   count(*) FILTER (WHERE state = 'idle') as idle
+ FROM pg_stat_activity;
+ ```
+
+ ## Escalation
+
+ | Time | Action |
+ |------|--------|
+ | 1 min | Restart pool, page on-call |
+ | 5 min | Increase pool size, restart app |
+ | 15 min | Check for connection leaks |
+ | 30 min | Escalate to DBA |
+
+ ## Related
+
+ - [High Error Rate Runbook]({{ site.baseurl }}/guides/runbooks/high-error-rate)
+ - [Performance Guide]({{ site.baseurl }}/guides/performance)
data/docs/index.md CHANGED
@@ -3,51 +3,35 @@ layout: home
  title: Vectra
  ---

- # Welcome to Vectra Documentation
-
- **Vectra** is a unified Ruby client for vector databases that allows you to write once and switch providers easily.
-
- ## Supported Vector Databases
-
- - **Pinecone** - Managed vector database in the cloud
- - **Qdrant** - Open-source vector database
- - **Weaviate** - Open-source vector search engine
- - **PostgreSQL with pgvector** - SQL database with vector support
-
- ## Quick Links
-
- - [Installation Guide]({{ site.baseurl }}/guides/installation)
- - [Getting Started]({{ site.baseurl }}/guides/getting-started)
- - [API Reference]({{ site.baseurl }}/api/overview)
- - [Examples]({{ site.baseurl }}/examples/basic-usage)
- - [Contributing]({{ site.baseurl }}/community/contributing)
-
- ## Key Features
-
- - 🔄 **Provider Agnostic** - Switch between different vector database providers with minimal code changes
- - 🚀 **Easy Integration** - Works seamlessly with Rails and other Ruby frameworks
- - 📊 **Vector Operations** - Create, search, update, and delete vectors
- - 🔌 **Multiple Providers** - Support for leading vector database platforms
- - 📈 **Instrumentation** - Built-in support for Datadog and New Relic monitoring
- - 🗄️ **ActiveRecord Integration** - Native support for Rails models
-
- ## Get Started
-
  ```ruby
  require 'vectra'

- # Initialize client
- client = Vectra::Client.new(provider: :pinecone, api_key: 'your-key')
+ # Initialize any provider with the same API
+ client = Vectra::Client.new(
+   provider: :pinecone, # or :qdrant, :weaviate, :pgvector
+   api_key: ENV['API_KEY'],
+   host: 'your-host.example.com'
+ )

- # Upsert vectors
+ # Store vectors with metadata
  client.upsert(
  vectors: [
- { id: '1', values: [0.1, 0.2, 0.3], metadata: { text: 'example' } }
+   {
+     id: 'doc-1',
+     values: [0.1, 0.2, 0.3, ...], # Your embedding
+     metadata: { title: 'Getting Started with AI' }
+   }
  ]
  )

- # Search
- results = client.query(vector: [0.1, 0.2, 0.3], top_k: 5)
- ```
+ # Search by similarity
+ results = client.query(
+   vector: [0.1, 0.2, 0.3, ...],
+   top_k: 10,
+   filter: { category: 'tutorials' }
+ )

- For more detailed examples, see [Basic Usage]({{ site.baseurl }}/examples/basic-usage).
+ results.each do |match|
+   puts "#{match['id']}: #{match['score']}"
+ end
+ ```
@@ -6,57 +6,76 @@ permalink: /providers/

  # Vector Database Providers

- Vectra supports multiple vector database providers. Choose the one that best fits your needs:
+ Vectra supports multiple vector database providers. Choose the one that best fits your needs.

  ## Supported Providers

- | Provider | Type | Best For | Documentation |
- |----------|------|----------|---|
- | **Pinecone** | Managed Cloud | Production, Fully managed | [Guide]({{ site.baseurl }}/providers/pinecone) |
- | **Qdrant** | Open Source | Self-hosted, High performance | [Guide]({{ site.baseurl }}/providers/qdrant) |
- | **Weaviate** | Open Source | Semantic search, GraphQL | [Guide]({{ site.baseurl }}/providers/weaviate) |
- | **PostgreSQL + pgvector** | SQL Database | SQL integration, ACID | [Guide]({{ site.baseurl }}/providers/pgvector) |
+ | Provider | Type | Best For |
+ |----------|------|----------|
+ | [**Pinecone**]({{ site.baseurl }}/providers/pinecone) | Managed Cloud | Production, Zero ops |
+ | [**Qdrant**]({{ site.baseurl }}/providers/qdrant) | Open Source | Self-hosted, Performance |
+ | [**Weaviate**]({{ site.baseurl }}/providers/weaviate) | Open Source | Semantic search, GraphQL |
+ | [**pgvector**]({{ site.baseurl }}/providers/pgvector) | PostgreSQL | SQL integration, ACID |

  ## Quick Comparison

- ### Pinecone
- - ✅ Fully managed service
- - ✅ Easy setup
- - ✅ Scalable
- - Cloud only
- - Paid service
-
- ### Qdrant
- - Open source
- - ✅ Self-hosted
- - ✅ High performance
- - ✅ Multiple deployment options
- - ❌ More configuration needed
-
- ### Weaviate
- - ✅ Open source
- - Semantic search
- - GraphQL API
- - Multi-model support
- - ❌ More complex
-
- ### PostgreSQL + pgvector
- - ✅ SQL database
- - ✅ ACID transactions
- - Existing infrastructure
- - Affordable
- - Not specialized for vectors
+ <div class="tma-comparison-grid">
+   <div class="tma-comparison-card">
+     <h4>Pinecone</h4>
+     <ul>
+       <li class="pro">Fully managed service</li>
+       <li class="pro">Easy setup</li>
+       <li class="pro">Highly scalable</li>
+       <li class="con">Cloud only</li>
+       <li class="con">Paid service</li>
+     </ul>
+   </div>
+   <div class="tma-comparison-card">
+     <h4>Qdrant</h4>
+     <ul>
+       <li class="pro">Open source</li>
+       <li class="pro">Self-hosted option</li>
+       <li class="pro">High performance</li>
+       <li class="pro">Cloud option available</li>
+       <li class="con">More configuration</li>
+     </ul>
+   </div>
+   <div class="tma-comparison-card">
+     <h4>Weaviate</h4>
+     <ul>
+       <li class="pro">Open source</li>
+       <li class="pro">Semantic search</li>
+       <li class="pro">GraphQL API</li>
+       <li class="pro">Multi-model support</li>
+       <li class="con">More complex setup</li>
+     </ul>
+   </div>
+   <div class="tma-comparison-card">
+     <h4>pgvector</h4>
+     <ul>
+       <li class="pro">SQL database</li>
+       <li class="pro">ACID transactions</li>
+       <li class="pro">Use existing Postgres</li>
+       <li class="pro">Very affordable</li>
+       <li class="con">Not vector-specialized</li>
+     </ul>
+   </div>
+ </div>

  ## Switching Providers

  One of Vectra's key features is easy provider switching:

  ```ruby
- # All it takes is changing one line!
- client = Vectra::Client.new(provider: :qdrant)
+ # Just change the provider - your code stays the same!
+ client = Vectra::Client.new(provider: :qdrant, host: 'localhost:6333')

- # All your code remains the same
- results = client.query(vector: [0.1, 0.2, 0.3])
+ # All operations work identically
+ client.upsert(vectors: [...])
+ results = client.query(vector: [...], top_k: 5)
  ```

- See the [Getting Started Guide]({{ site.baseurl }}/guides/getting-started) for more information.
+ ## Next Steps
+
+ - [Getting Started Guide]({{ site.baseurl }}/guides/getting-started)
+ - [API Reference]({{ site.baseurl }}/api/overview)