vectra-client 0.2.2 → 0.3.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,287 @@
+ ---
+ layout: page
+ title: "Runbook: High Latency"
+ permalink: /guides/runbooks/high-latency/
+ ---
+
+ # Runbook: High Latency
+
+ **Alert:** `VectraHighLatency`
+ **Severity:** Warning
+ **Threshold:** P95 latency >2s for 5 minutes
+
+ ## Symptoms
+
+ - Slow vector operations
+ - Request timeouts
+ - User-facing latency issues
+ - Queue backlog building up
+
+ ## Quick Diagnosis
+
+ ```promql
+ # Check current latency by operation
+ histogram_quantile(0.95,
+   sum(rate(vectra_request_duration_seconds_bucket[5m])) by (le, operation)
+ )
+ ```
+
+ ```ruby
+ # Test latency in console
+ require 'benchmark'
+
+ time = Benchmark.realtime do
+   client.query(index: "test", vector: [0.1] * 384, top_k: 10)
+ end
+ puts "Query latency: #{(time * 1000).round}ms"
+ ```
+
+ ## Investigation Steps
+
+ ### 1. Identify Slow Operations
+
+ ```promql
+ # Which operations are slow?
+ topk(5,
+   histogram_quantile(0.95,
+     sum(rate(vectra_request_duration_seconds_bucket[5m])) by (le, operation)
+   )
+ )
+ ```
+
+ | Operation | Expected P95 | Alert Threshold |
+ |-----------|--------------|-----------------|
+ | query | <500ms | >2s |
+ | upsert (single) | <200ms | >1s |
+ | upsert (batch 100) | <2s | >5s |
+ | fetch | <100ms | >500ms |
+ | delete | <200ms | >1s |
+
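The table above can be turned into a quick triage check. A sketch (the constant and method names here are illustrative, not part of the vectra-client API), using the single-operation alert thresholds in seconds:

```ruby
# Alert thresholds in seconds, from the table above (single-vector upsert).
# ALERT_THRESHOLDS and breaching_operations are our names, not Vectra API.
ALERT_THRESHOLDS = {
  "query"  => 2.0,
  "upsert" => 1.0,
  "fetch"  => 0.5,
  "delete" => 1.0
}.freeze

# Return the subset of measured P95s that breach their alert threshold
def breaching_operations(p95_by_op)
  p95_by_op.select { |op, p95| p95 > ALERT_THRESHOLDS.fetch(op, Float::INFINITY) }
end

puts breaching_operations("query" => 3.1, "fetch" => 0.05).keys.inspect  # => ["query"]
```

Feed it the P95 values from the PromQL query above to get the operations worth investigating first.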
+ ### 2. Check Provider Status
+
+ ```bash
+ # Create a timing template for curl (referenced as @curl-format.txt below)
+ cat > curl-format.txt <<'EOF'
+ time_namelookup:    %{time_namelookup}\n
+ time_connect:       %{time_connect}\n
+ time_starttransfer: %{time_starttransfer}\n
+ time_total:         %{time_total}\n
+ EOF
+
+ # Test provider connectivity with a timing breakdown
+ curl -w "@curl-format.txt" -o /dev/null -s https://api.pinecone.io/health
+ ```
+
+ ### 3. Check Network Latency
+
+ ```bash
+ # Ping provider endpoint
+ ping -c 10 api.pinecone.io
+
+ # Check for packet loss along the route (non-interactive report)
+ mtr --report --report-cycles 10 api.pinecone.io
+
+ # DNS resolution time
+ time nslookup api.pinecone.io
+ ```
+
+ ### 4. Check Vector Dimensions
+
+ ```ruby
+ # Large vectors = slower operations
+ client.describe_index(index: "my-index")
+ # => { dimension: 1536, ... }
+
+ # Consider using smaller embeddings:
+ # - text-embedding-3-small: 512-1536 dims
+ # - text-embedding-ada-002: 1536 dims
+ # - all-MiniLM-L6-v2: 384 dims (faster!)
+ ```
+
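To make the dimension comparison concrete, here is the back-of-envelope payload math (4 bytes per float32 component; a sketch only, ignoring provider-side index overhead):

```ruby
# Approximate wire/storage cost of one float32 vector (illustrative helper)
def vector_bytes(dimension)
  dimension * 4
end

[384, 1536].each do |dim|
  puts format("%4d dims = %.1f KB per vector", dim, vector_bytes(dim) / 1024.0)
end
# 384 dims = 1.5 KB, 1536 dims = 6.0 KB -- a 4x difference on every request
```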
+ ### 5. Check Index Size
+
+ ```ruby
+ stats = client.stats(index: "my-index")
+ puts "Vector count: #{stats[:total_vector_count]}"
+ puts "Index fullness: #{stats[:index_fullness]}"
+
+ # Large indexes may need optimization:
+ # - Pinecone: check pod type
+ # - pgvector: check IVFFlat parameters
+ # - Qdrant: check HNSW parameters
+ ```
+
+ ## Resolution Steps
+
+ ### Immediate: Increase Timeouts
+
+ ```ruby
+ Vectra.configure do |config|
+   config.timeout = 60       # Increase from 30
+   config.open_timeout = 20  # Increase from 10
+ end
+ ```
+
+ ### Enable Caching
+
+ ```ruby
+ cache = Vectra::Cache.new(ttl: 300, max_size: 1000)
+ cached_client = Vectra::CachedClient.new(client, cache: cache)
+
+ # Repeated queries are served from the cache
+ ```
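What the cache buys you is a read-through lookup keyed by query. A minimal sketch of that pattern (a toy class for illustration, not the `Vectra::Cache` implementation):

```ruby
# Toy read-through TTL cache illustrating the pattern (not the real Vectra::Cache)
class TinyTTLCache
  def initialize(ttl:)
    @ttl = ttl
    @store = {}
  end

  # Return the cached value if still fresh, otherwise compute and store it
  def fetch(key)
    entry = @store[key]
    return entry[:value] if entry && Time.now - entry[:at] < @ttl

    value = yield
    @store[key] = { value: value, at: Time.now }
    value
  end
end

cache = TinyTTLCache.new(ttl: 300)
calls = 0
2.times { cache.fetch("query:abc") { calls += 1; [:result] } }
puts "backend calls: #{calls}"  # => backend calls: 1
```

The second lookup never touches the backend, which is why repeat queries stop contributing to P95.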
+
+ ### Optimize Batch Operations
+
+ ```ruby
+ # Use smaller batches for faster responses
+ batch = Vectra::Batch.new(client, concurrency: 2)
+
+ result = batch.upsert_async(
+   index: "my-index",
+   vectors: vectors,
+   chunk_size: 50  # Smaller chunks = faster individual operations
+ )
+ ```
+
+ ### Reduce top_k
+
+ ```ruby
+ # Fewer results = faster query
+ results = client.query(
+   index: "my-index",
+   vector: query_vec,
+   top_k: 5  # Instead of 100
+ )
+ ```
+
+ ### Provider-Specific Optimizations
+
+ #### Pinecone
+
+ ```ruby
+ # Use serverless for auto-scaling
+ # Or upgrade pod type for more capacity
+ ```
+
+ #### pgvector
+
+ ```sql
+ -- Check if index exists
+ SELECT indexname FROM pg_indexes WHERE tablename = 'your_table';
+
+ -- Create IVFFlat index for faster queries
+ CREATE INDEX ON your_table
+ USING ivfflat (embedding vector_cosine_ops)
+ WITH (lists = 100);
+
+ -- Increase probes for accuracy vs speed trade-off
+ SET ivfflat.probes = 10; -- Default: 1
+ ```
+
+ #### Qdrant
+
+ ```ruby
+ # Optimize HNSW parameters
+ client.provider.create_index(
+   name: "optimized",
+   dimension: 384,
+   metric: "cosine",
+   hnsw_config: {
+     m: 16,             # Connections per node
+     ef_construct: 100  # Build-time accuracy
+   }
+ )
+ ```
+
+ ### Connection Pooling (pgvector)
+
+ ```ruby
+ # Warm up connections to avoid cold-start latency
+ client.provider.warmup_pool(5)
+
+ # Increase pool size for parallel queries
+ Vectra.configure do |config|
+   config.pool_size = 20
+ end
+ ```
+
+ ## Prevention
+
+ ### 1. Monitor Latency Trends
+
+ ```promql
+ # Alert when mean latency over the last hour exceeds 1s
+ rate(vectra_request_duration_seconds_sum[1h]) /
+   rate(vectra_request_duration_seconds_count[1h]) > 1
+ ```
+
+ ### 2. Implement Request Timeouts
+
+ ```ruby
+ # Fail fast instead of hanging
+ Vectra.configure do |config|
+   config.timeout = 10  # Strict timeout
+ end
+ ```
+
+ ### 3. Use Async Operations
+
+ ```ruby
+ # Don't block on upserts
+ Thread.new do
+   batch.upsert_async(index: "bg-index", vectors: vectors)
+ end
+ ```
+
+ ### 4. Index Maintenance
+
+ ```sql
+ -- pgvector: Reindex periodically
+ REINDEX INDEX your_ivfflat_index;
+
+ -- Analyze for query planner
+ ANALYZE your_table;
+ ```
+
+ ### 5. Geographic Optimization
+
+ ```ruby
+ # Use closest region to your servers
+ # Pinecone: us-east-1, us-west-2, eu-west-1
+ # Qdrant Cloud: Select nearest region
+ ```
+
+ ## Benchmarking
+
+ ```ruby
+ # Run benchmark to establish a baseline
+ require 'benchmark'
+
+ vec = [0.1] * 384  # sample query vector matching your index dimension
+ vectors_100 = Array.new(100) { |i| { id: "bench-#{i}", values: vec } }
+
+ results = Benchmark.bm do |x|
+   x.report("query") do
+     100.times { client.query(index: "test", vector: vec, top_k: 10) }
+   end
+
+   x.report("upsert") do
+     client.upsert(index: "test", vectors: vectors_100)
+   end
+
+   x.report("fetch") do
+     100.times { client.fetch(index: "test", ids: ["id1"]) }
+   end
+ end
+ ```
+
+ ## Escalation
+
+ | Time | Action |
+ |------|--------|
+ | 5 min | Enable caching, increase timeouts |
+ | 15 min | Check provider status, optimize queries |
+ | 30 min | Scale up provider resources |
+ | 1 hour | Engage provider support |
+
+ ## Related
+
+ - [High Error Rate Runbook]({{ site.baseurl }}/guides/runbooks/high-error-rate)
+ - [Performance Guide]({{ site.baseurl }}/guides/performance)
+ - [Monitoring Guide]({{ site.baseurl }}/guides/monitoring)
@@ -0,0 +1,216 @@
+ ---
+ layout: page
+ title: "Runbook: Pool Exhaustion"
+ permalink: /guides/runbooks/pool-exhausted/
+ ---
+
+ # Runbook: Pool Exhaustion
+
+ **Alert:** `VectraPoolExhausted`
+ **Severity:** Critical
+ **Threshold:** 0 available connections for 1 minute
+
+ ## Symptoms
+
+ - `Vectra::Pool::TimeoutError` exceptions
+ - Requests timing out waiting for connections
+ - Application threads blocked
+
+ ## Quick Diagnosis
+
+ ```ruby
+ # Check pool stats
+ client = Vectra::Client.new(provider: :pgvector, host: ENV['DATABASE_URL'])
+ puts client.provider.pool_stats
+ # => { available: 0, checked_out: 10, size: 10 }
+ ```
+
+ ```bash
+ # Check PostgreSQL connections
+ psql -c "SELECT count(*) FROM pg_stat_activity WHERE application_name LIKE '%vectra%';"
+ ```
+
+ ## Investigation Steps
+
+ ### 1. Check Current Pool State
+
+ ```ruby
+ stats = client.provider.pool_stats
+ puts "Available: #{stats[:available]}"
+ puts "Checked out: #{stats[:checked_out]}"
+ puts "Total size: #{stats[:size]}"
+ puts "Shutdown: #{stats[:shutdown]}"
+ ```
+
+ ### 2. Identify Connection Leaks
+
+ ```ruby
+ # Look for connections not being returned
+ # Common causes:
+ # - Missing ensure blocks
+ # - Exceptions before checkin
+ # - Long-running operations
+
+ # Bad:
+ conn = pool.checkout
+ do_something(conn)  # If this raises, connection is leaked!
+ pool.checkin(conn)
+
+ # Good:
+ pool.with_connection do |conn|
+   do_something(conn)
+ end  # Always returns connection
+ ```
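Why the block form cannot leak is easiest to see with a toy pool: the `ensure` clause runs on every exit path, including exceptions. This is an illustrative class, not the Vectra pool implementation:

```ruby
# Toy pool demonstrating ensure-based checkin (illustrative, not Vectra::Pool)
class TinyPool
  def initialize(size)
    @free = Array.new(size) { Object.new }
  end

  def checkout
    @free.pop || raise("pool exhausted")
  end

  def checkin(conn)
    @free.push(conn)
  end

  def with_connection
    conn = checkout
    yield conn
  ensure
    checkin(conn) if conn  # connection returned even if the block raised
  end

  def available
    @free.size
  end
end

pool = TinyPool.new(2)
pool.with_connection { |conn| }          # normal path: connection returned
begin
  pool.with_connection { raise "boom" }  # error path: still returned
rescue RuntimeError
end
puts "available: #{pool.available}"  # => available: 2
```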
+
+ ### 3. Check for Long-Running Queries
+
+ ```sql
+ -- PostgreSQL: Find long-running queries
+ SELECT pid, now() - pg_stat_activity.query_start AS duration, query
+ FROM pg_stat_activity
+ WHERE state != 'idle'
+   AND query NOT LIKE '%pg_stat_activity%'
+ ORDER BY duration DESC;
+
+ -- Kill a long-running query if needed
+ SELECT pg_terminate_backend(12345); -- replace 12345 with a pid from the query above
+ ```
+
+ ### 4. Check Application Thread Count
+
+ ```ruby
+ # If using Puma/Sidekiq
+ # Ensure pool_size >= max_threads
+ puts "Thread count: #{Thread.list.count}"
+ puts "Pool size: #{client.config.pool_size}"
+ ```
87
+
88
+ ## Resolution Steps
89
+
90
+ ### Immediate: Restart Connection Pool
91
+
92
+ ```ruby
93
+ # Force pool restart
94
+ client.provider.shutdown_pool
95
+ # Pool will be recreated on next operation
96
+ ```
97
+
98
+ ### Increase Pool Size
99
+
100
+ ```ruby
101
+ Vectra.configure do |config|
102
+ config.provider = :pgvector
103
+ config.host = ENV['DATABASE_URL']
104
+ config.pool_size = 20 # Increase from default 5
105
+ config.pool_timeout = 10 # Increase timeout
106
+ end
107
+ ```
108
+
109
+ ### Fix Connection Leaks
110
+
111
+ ```ruby
112
+ # Always use with_connection block
113
+ client.provider.with_pooled_connection do |conn|
114
+ # Your code here
115
+ # Connection automatically returned
116
+ end
117
+
118
+ # Or ensure checkin in rescue
119
+ begin
120
+ conn = pool.checkout
121
+ do_work(conn)
122
+ ensure
123
+ pool.checkin(conn) if conn
124
+ end
125
+ ```
+
+ ### Reduce Connection Hold Time
+
+ ```ruby
+ # Break up long operations
+ large_dataset.each_slice(100) do |batch|
+   client.provider.with_pooled_connection do |conn|
+     process_batch(batch, conn)
+   end
+   # Connection returned between batches
+ end
+ ```
+
+ ### Add Connection Warmup
+
+ ```ruby
+ # In application initializer
+ client = Vectra::Client.new(provider: :pgvector, host: ENV['DATABASE_URL'])
+ client.provider.warmup_pool(5)  # Pre-create 5 connections
+ ```
+
+ ## Prevention
+
+ ### 1. Right-size Pool
+
+ ```ruby
+ # Formula: pool_size = (max_threads * 1.5) + background_workers
+ # Example: Puma with 5 threads, 3 Sidekiq workers
+ pool_size = (5 * 1.5) + 3 # => 10.5; round up to 11 (12 adds headroom)
+ ```
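The rule of thumb above as a tiny helper (the method name is ours, not part of Vectra; round the result up and add headroom if you like):

```ruby
# pool_size = ceil(max_threads * 1.5 + background_workers) -- rule of thumb above
def recommended_pool_size(max_threads:, background_workers: 0)
  (max_threads * 1.5 + background_workers).ceil
end

puts recommended_pool_size(max_threads: 5, background_workers: 3)  # => 11
```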
+
+ ### 2. Monitor Pool Usage
+
+ ```promql
+ # Alert when pool is >80% utilized (checked out / total)
+ vectra_pool_connections{state="checked_out"}
+   / (vectra_pool_connections{state="checked_out"}
+      + vectra_pool_connections{state="available"}) > 0.8
+ ```
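The same check can be computed in-process from `pool_stats` (the helper name is ours, not Vectra API):

```ruby
# Fraction of pool connections currently checked out (0.0..1.0)
def pool_utilization(stats)
  total = stats[:checked_out] + stats[:available]
  return 0.0 if total.zero?

  stats[:checked_out].to_f / total
end

puts pool_utilization(checked_out: 9, available: 1)  # => 0.9
```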
+
+ ### 3. Implement Connection Timeout
+
+ ```ruby
+ Vectra.configure do |config|
+   config.pool_timeout = 5  # Fail fast instead of hanging
+ end
+ ```
+
+ ### 4. Use Connection Pool Metrics
+
+ ```ruby
+ # Log pool stats periodically (scheduler DSL shown; use any recurring job)
+ every(60.seconds) do
+   stats = client.provider.pool_stats
+   logger.info "Pool: avail=#{stats[:available]} out=#{stats[:checked_out]}"
+ end
+ ```
+
+ ## PostgreSQL-Specific
+
+ ### Check max_connections
+
+ ```sql
+ SHOW max_connections; -- Default: 100
+
+ -- Increase if needed (requires restart)
+ ALTER SYSTEM SET max_connections = 200;
+ ```
+
+ ### Monitor Connection Usage
+
+ ```sql
+ SELECT
+   count(*) as total,
+   count(*) FILTER (WHERE state = 'active') as active,
+   count(*) FILTER (WHERE state = 'idle') as idle
+ FROM pg_stat_activity;
+ ```
+
+ ## Escalation
+
+ | Time | Action |
+ |------|--------|
+ | 1 min | Restart pool, page on-call |
+ | 5 min | Increase pool size, restart app |
+ | 15 min | Check for connection leaks |
+ | 30 min | Escalate to DBA |
+
+ ## Related
+
+ - [High Error Rate Runbook]({{ site.baseurl }}/guides/runbooks/high-error-rate)
+ - [Performance Guide]({{ site.baseurl }}/guides/performance)
data/docs/index.md CHANGED
@@ -3,51 +3,35 @@ layout: home
  title: Vectra
  ---

- # Welcome to Vectra Documentation
-
- **Vectra** is a unified Ruby client for vector databases that allows you to write once and switch providers easily.
-
- ## Supported Vector Databases
-
- - **Pinecone** - Managed vector database in the cloud
- - **Qdrant** - Open-source vector database
- - **Weaviate** - Open-source vector search engine
- - **PostgreSQL with pgvector** - SQL database with vector support
-
- ## Quick Links
-
- - [Installation Guide]({{ site.baseurl }}/guides/installation)
- - [Getting Started]({{ site.baseurl }}/guides/getting-started)
- - [API Reference]({{ site.baseurl }}/api/overview)
- - [Examples]({{ site.baseurl }}/examples/basic-usage)
- - [Contributing]({{ site.baseurl }}/community/contributing)
-
- ## Key Features
-
- - 🔄 **Provider Agnostic** - Switch between different vector database providers with minimal code changes
- - 🚀 **Easy Integration** - Works seamlessly with Rails and other Ruby frameworks
- - 📊 **Vector Operations** - Create, search, update, and delete vectors
- - 🔌 **Multiple Providers** - Support for leading vector database platforms
- - 📈 **Instrumentation** - Built-in support for Datadog and New Relic monitoring
- - 🗄️ **ActiveRecord Integration** - Native support for Rails models
-
- ## Get Started
-
  ```ruby
  require 'vectra'

- # Initialize client
- client = Vectra::Client.new(provider: :pinecone, api_key: 'your-key')
+ # Initialize any provider with the same API
+ client = Vectra::Client.new(
+   provider: :pinecone, # or :qdrant, :weaviate, :pgvector
+   api_key: ENV['API_KEY'],
+   host: 'your-host.example.com'
+ )

- # Upsert vectors
+ # Store vectors with metadata
  client.upsert(
  vectors: [
- { id: '1', values: [0.1, 0.2, 0.3], metadata: { text: 'example' } }
+   {
+     id: 'doc-1',
+     values: [0.1, 0.2, 0.3, ...], # Your embedding
+     metadata: { title: 'Getting Started with AI' }
+   }
  ]
  )

- # Search
- results = client.query(vector: [0.1, 0.2, 0.3], top_k: 5)
- ```
+ # Search by similarity
+ results = client.query(
+   vector: [0.1, 0.2, 0.3, ...],
+   top_k: 10,
+   filter: { category: 'tutorials' }
+ )

- For more detailed examples, see [Basic Usage]({{ site.baseurl }}/examples/basic-usage).
+ results.each do |match|
+   puts "#{match['id']}: #{match['score']}"
+ end
+ ```
@@ -6,57 +6,76 @@ permalink: /providers/

  # Vector Database Providers

- Vectra supports multiple vector database providers. Choose the one that best fits your needs:
+ Vectra supports multiple vector database providers. Choose the one that best fits your needs.

  ## Supported Providers

- | Provider | Type | Best For | Documentation |
- |----------|------|----------|---|
- | **Pinecone** | Managed Cloud | Production, Fully managed | [Guide]({{ site.baseurl }}/providers/pinecone) |
- | **Qdrant** | Open Source | Self-hosted, High performance | [Guide]({{ site.baseurl }}/providers/qdrant) |
- | **Weaviate** | Open Source | Semantic search, GraphQL | [Guide]({{ site.baseurl }}/providers/weaviate) |
- | **PostgreSQL + pgvector** | SQL Database | SQL integration, ACID | [Guide]({{ site.baseurl }}/providers/pgvector) |
+ | Provider | Type | Best For |
+ |----------|------|----------|
+ | [**Pinecone**]({{ site.baseurl }}/providers/pinecone) | Managed Cloud | Production, Zero ops |
+ | [**Qdrant**]({{ site.baseurl }}/providers/qdrant) | Open Source | Self-hosted, Performance |
+ | [**Weaviate**]({{ site.baseurl }}/providers/weaviate) | Open Source | Semantic search, GraphQL |
+ | [**pgvector**]({{ site.baseurl }}/providers/pgvector) | PostgreSQL | SQL integration, ACID |

  ## Quick Comparison

- ### Pinecone
- - ✅ Fully managed service
- - ✅ Easy setup
- - ✅ Scalable
- - Cloud only
- - Paid service
-
- ### Qdrant
- - Open source
- - ✅ Self-hosted
- - ✅ High performance
- - ✅ Multiple deployment options
- - ❌ More configuration needed
-
- ### Weaviate
- - ✅ Open source
- - Semantic search
- - GraphQL API
- - Multi-model support
- - ❌ More complex
-
- ### PostgreSQL + pgvector
- - ✅ SQL database
- - ✅ ACID transactions
- - Existing infrastructure
- - Affordable
- - Not specialized for vectors
+ <div class="tma-comparison-grid">
+   <div class="tma-comparison-card">
+     <h4>Pinecone</h4>
+     <ul>
+       <li class="pro">Fully managed service</li>
+       <li class="pro">Easy setup</li>
+       <li class="pro">Highly scalable</li>
+       <li class="con">Cloud only</li>
+       <li class="con">Paid service</li>
+     </ul>
+   </div>
+   <div class="tma-comparison-card">
+     <h4>Qdrant</h4>
+     <ul>
+       <li class="pro">Open source</li>
+       <li class="pro">Self-hosted option</li>
+       <li class="pro">High performance</li>
+       <li class="pro">Cloud option available</li>
+       <li class="con">More configuration</li>
+     </ul>
+   </div>
+   <div class="tma-comparison-card">
+     <h4>Weaviate</h4>
+     <ul>
+       <li class="pro">Open source</li>
+       <li class="pro">Semantic search</li>
+       <li class="pro">GraphQL API</li>
+       <li class="pro">Multi-model support</li>
+       <li class="con">More complex setup</li>
+     </ul>
+   </div>
+   <div class="tma-comparison-card">
+     <h4>pgvector</h4>
+     <ul>
+       <li class="pro">SQL database</li>
+       <li class="pro">ACID transactions</li>
+       <li class="pro">Use existing Postgres</li>
+       <li class="pro">Very affordable</li>
+       <li class="con">Not vector-specialized</li>
+     </ul>
+   </div>
+ </div>

  ## Switching Providers

  One of Vectra's key features is easy provider switching:

  ```ruby
- # All it takes is changing one line!
- client = Vectra::Client.new(provider: :qdrant)
+ # Just change the provider - your code stays the same!
+ client = Vectra::Client.new(provider: :qdrant, host: 'localhost:6333')

- # All your code remains the same
- results = client.query(vector: [0.1, 0.2, 0.3])
+ # All operations work identically
+ client.upsert(vectors: [...])
+ results = client.query(vector: [...], top_k: 5)
  ```

- See the [Getting Started Guide]({{ site.baseurl }}/guides/getting-started) for more information.
+ ## Next Steps
+
+ - [Getting Started Guide]({{ site.baseurl }}/guides/getting-started)
+ - [API Reference]({{ site.baseurl }}/api/overview)