@llm-dev-ops/agentics-cli 2.7.38 → 2.7.40

Files changed (34)
  1. package/dist/mcp/mcp-server.js +5 -1
  2. package/dist/mcp/mcp-server.js.map +1 -1
  3. package/dist/pipeline/auto-chain.d.ts.map +1 -1
  4. package/dist/pipeline/auto-chain.js +82 -6
  5. package/dist/pipeline/auto-chain.js.map +1 -1
  6. package/dist/pipeline/exemplars.d.ts +34 -0
  7. package/dist/pipeline/exemplars.d.ts.map +1 -0
  8. package/dist/pipeline/exemplars.js +101 -0
  9. package/dist/pipeline/exemplars.js.map +1 -0
  10. package/dist/pipeline/output-validator.d.ts +38 -0
  11. package/dist/pipeline/output-validator.d.ts.map +1 -0
  12. package/dist/pipeline/output-validator.js +152 -0
  13. package/dist/pipeline/output-validator.js.map +1 -0
  14. package/dist/pipeline/phase2/phases/adr-generator.d.ts.map +1 -1
  15. package/dist/pipeline/phase2/phases/adr-generator.js +21 -3
  16. package/dist/pipeline/phase2/phases/adr-generator.js.map +1 -1
  17. package/dist/pipeline/phase4-adrs/phase4-adrs-coordinator.d.ts.map +1 -1
  18. package/dist/pipeline/phase4-adrs/phase4-adrs-coordinator.js +17 -5
  19. package/dist/pipeline/phase4-adrs/phase4-adrs-coordinator.js.map +1 -1
  20. package/dist/pipeline/ruflo-phase-executor.d.ts.map +1 -1
  21. package/dist/pipeline/ruflo-phase-executor.js +52 -60
  22. package/dist/pipeline/ruflo-phase-executor.js.map +1 -1
  23. package/dist/synthesis/ask-artifact-writer.d.ts.map +1 -1
  24. package/dist/synthesis/ask-artifact-writer.js +9 -6
  25. package/dist/synthesis/ask-artifact-writer.js.map +1 -1
  26. package/dist/synthesis/simulation-artifact-generator.d.ts.map +1 -1
  27. package/dist/synthesis/simulation-artifact-generator.js +26 -6
  28. package/dist/synthesis/simulation-artifact-generator.js.map +1 -1
  29. package/docs/templates/ADR-Good-Example.md +787 -0
  30. package/docs/templates/Implementation-Prompts-Good-Example.md +1158 -0
  31. package/docs/templates/promotion-changelog.md +94 -0
  32. package/docs/templates/regression-changelog.md +86 -0
  33. package/docs/templates/sparc-specification-good-example.md +600 -0
  34. package/package.json +2 -1
@@ -0,0 +1,787 @@
1
+ # ADR-001: Ruvector Core Architecture
2
+
3
+ **Status**: Proposed
4
+ **Date**: 2026-01-18
5
+ **Authors**: ruv.io, RuVector Team
6
+ **Deciders**: Architecture Review Board
7
+ **SDK**: Claude-Flow
8
+
9
+ **Note**: The storage layer described in this ADR is superseded by ADR-029 (RVF as Canonical Binary Format). All vector persistence now uses the RVF segment model.
10
+
11
+ ## Version History
12
+
13
+ | Version | Date | Author | Changes |
14
+ |---------|------|--------|---------|
15
+ | 0.1 | 2026-01-18 | ruv.io | Initial architecture proposal |
16
+
17
+ ---
18
+
19
+ ## Context
20
+
21
+ ### The Vector Database Challenge
22
+
23
+ Modern AI applications require vector databases that can:
24
+
25
+ 1. **Store high-dimensional embeddings** from LLMs and embedding models
26
+ 2. **Search with sub-millisecond latency** for real-time inference
27
+ 3. **Scale to billions of vectors** while maintaining performance
28
+ 4. **Deploy anywhere** - edge devices, browsers (WASM), cloud servers
29
+ 5. **Integrate seamlessly** with LLM inference pipelines
30
+
31
+ ### Current State of Vector Databases
32
+
33
+ Existing solutions fall into several categories:
34
+
35
+ | Category | Examples | Limitations |
36
+ |----------|----------|-------------|
37
+ | **Cloud-only** | Pinecone | No edge deployment, vendor lock-in |
38
+ | **Heavy native** | Milvus, Qdrant | Complex deployment, high memory |
39
+ | **Python-first** | ChromaDB, FAISS | Performance overhead, no WASM |
40
+ | **Learning-capable** | None | No existing solutions learn from usage |
41
+
42
+ ### The Ruvector Vision
43
+
44
+ Ruvector is designed as a **high-performance, learning-capable vector database** implemented in Rust that:
45
+
46
+ - Achieves **61us p50 latency** for k=10 search on 384-dim vectors
47
+ - Provides **2-32x memory compression** through tiered quantization
48
+ - Runs **anywhere** - native (x86_64, ARM64), WASM (browser, edge), PostgreSQL extension
49
+ - **Learns from usage** via GNN layers that improve search quality over time
50
+ - Integrates with **AI agent memory systems** for policy, session state, and audit logs
51
+
52
+ ---
53
+
54
+ ## Decision
55
+
56
+ ### Adopt a Layered, SIMD-Optimized Architecture
57
+
58
+ We implement ruvector-core as the foundational vector database engine with the following architecture:
59
+
60
+ ```
61
+ +-----------------------------------------------------------------------------+
62
+ | APPLICATION LAYER |
63
+ | AgenticDB | VectorDB API | Cypher Queries | REST/gRPC Server |
64
+ +-----------------------------------------------------------------------------+
65
+ |
66
+ +-----------------------------------------------------------------------------+
67
+ | INDEX LAYER |
68
+ | HNSW Index | Flat Index | Filtered Search | Hybrid Search | MMR |
69
+ +-----------------------------------------------------------------------------+
70
+ |
71
+ +-----------------------------------------------------------------------------+
72
+ | QUANTIZATION LAYER |
73
+ | Scalar (4x) | Product (8-16x) | Binary (32x) | Conformal Prediction |
74
+ +-----------------------------------------------------------------------------+
75
+ |
76
+ +-----------------------------------------------------------------------------+
77
+ | DISTANCE LAYER |
78
+ | Euclidean | Cosine | Dot Product | Manhattan | SIMD Dispatch |
79
+ +-----------------------------------------------------------------------------+
80
+ |
81
+ +-----------------------------------------------------------------------------+
82
+ | SIMD INTRINSICS LAYER |
83
+ | AVX2/AVX-512 (x86_64) | NEON (ARM64/Apple Silicon) | Scalar Fallback |
84
+ +-----------------------------------------------------------------------------+
85
+ |
86
+ +-----------------------------------------------------------------------------+
87
+ | STORAGE LAYER |
88
+ | REDB (native) | Memory-only (WASM) | PostgreSQL Extension |
89
+ +-----------------------------------------------------------------------------+
90
+ ```
91
+
92
+ ---
93
+
94
+ ## Key Components
95
+
96
+ ### 1. SIMD Intrinsics Layer (`simd_intrinsics.rs`)
97
+
98
+ The performance foundation of ruvector, providing hardware-accelerated distance calculations.
99
+
100
+ #### Architecture Dispatch
101
+
102
+ ```rust
103
+ pub fn euclidean_distance_simd(a: &[f32], b: &[f32]) -> f32 {
104
+ #[cfg(target_arch = "x86_64")]
105
+ {
106
+ if is_x86_feature_detected!("avx2") {
107
+ unsafe { euclidean_distance_avx2_impl(a, b) }
108
+ } else {
109
+ euclidean_distance_scalar(a, b)
110
+ }
111
+ }
112
+
113
+ #[cfg(target_arch = "aarch64")]
114
+ {
115
+ unsafe { euclidean_distance_neon_impl(a, b) }
116
+ }
117
+
118
+ #[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
119
+ {
120
+ euclidean_distance_scalar(a, b)
121
+ }
122
+ }
123
+ ```
124
+
125
+ #### Supported Operations
126
+
127
+ | Operation | AVX2 (x86_64) | NEON (ARM64) | Scalar Fallback |
128
+ |-----------|---------------|--------------|-----------------|
129
+ | Euclidean Distance | 8 floats/cycle | 4 floats/cycle | 1 float/cycle |
130
+ | Dot Product | 8 floats/cycle | 4 floats/cycle | 1 float/cycle |
131
+ | Cosine Similarity | 8 floats/cycle | 4 floats/cycle | 1 float/cycle |
132
+ | Manhattan Distance | N/A | 4 floats/cycle | 1 float/cycle |
133
+
134
+ #### Performance Characteristics
135
+
136
+ | Metric | AVX2 | NEON | Scalar |
137
+ |--------|------|------|--------|
138
+ | **512-dim Euclidean** | ~16M ops/sec | ~8M ops/sec | ~2M ops/sec |
139
+ | **1536-dim Cosine** | ~143ns | ~200ns | ~800ns |
140
+ | **384-dim Dot Product** | ~33ns | ~50ns | ~150ns |
141
+
142
+ #### Security Guarantees
143
+
144
+ - A length check via `assert_eq!(a.len(), b.len())` rejects mismatched inputs, preventing out-of-bounds reads
145
+ - Unaligned loads (`_mm256_loadu_ps`, `vld1q_f32`) handle arbitrary alignment
146
+ - Scalar fallback handles remainder elements after SIMD processing
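The length check and remainder handling listed above can be illustrated with a scalar-only sketch. It is not the crate's `euclidean_distance_scalar`; the function name `euclidean_distance_chunked` and the 8-lane width are assumptions chosen to mirror one AVX2 register of `f32` values.

```rust
/// Scalar sketch of the chunk-then-remainder pattern used by SIMD kernels.
/// `euclidean_distance_chunked` is a hypothetical name, not part of ruvector-core.
pub fn euclidean_distance_chunked(a: &[f32], b: &[f32]) -> f32 {
    // Mismatched lengths panic here instead of reading past the shorter slice.
    assert_eq!(a.len(), b.len(), "vectors must have equal length");

    // Assumed lane count: eight f32 values fill one 256-bit register.
    const LANES: usize = 8;
    let mut acc = [0.0f32; LANES];

    let a_chunks = a.chunks_exact(LANES);
    let b_chunks = b.chunks_exact(LANES);
    // Elements that do not fill a whole chunk are handled separately below.
    let a_rem = a_chunks.remainder();
    let b_rem = b_chunks.remainder();

    // Main loop: one accumulator per lane, as a SIMD register would hold.
    for (ca, cb) in a_chunks.zip(b_chunks) {
        for i in 0..LANES {
            let d = ca[i] - cb[i];
            acc[i] += d * d;
        }
    }

    // Horizontal sum of the lanes, then the scalar tail.
    let mut sum: f32 = acc.iter().sum();
    for (x, y) in a_rem.iter().zip(b_rem.iter()) {
        let d = x - y;
        sum += d * d;
    }
    sum.sqrt()
}
```

A real kernel would replace the inner lane loop with intrinsics; the tail loop stays scalar in either case.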
147
+
148
+ ### 2. Distance Metrics Layer (`distance.rs`)
149
+
150
+ High-level distance API with optional SimSIMD integration for additional acceleration.
151
+
152
+ #### Supported Metrics
153
+
154
+ ```rust
155
+ pub enum DistanceMetric {
156
+ Euclidean, // L2 distance: sqrt(sum((a[i] - b[i])^2))
157
+ Cosine, // 1 - cosine_similarity
158
+ DotProduct, // Negative dot product (for maximization)
159
+ Manhattan, // L1 distance: sum(|a[i] - b[i]|)
160
+ }
161
+ ```
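As a rough illustration of the formulas in the enum comments, the four metrics could be computed as below. This is a plain scalar sketch; `distance_sketch` is a hypothetical name, and returning `1.0` for a zero-norm cosine input is an assumption rather than documented behavior.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum DistanceMetric {
    Euclidean,
    Cosine,
    DotProduct,
    Manhattan,
}

/// Hypothetical scalar reference for the four metrics (not the crate's API).
pub fn distance_sketch(a: &[f32], b: &[f32], metric: DistanceMetric) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");

    // Shared helper: plain dot product of two equal-length slices.
    let dot = |x: &[f32], y: &[f32]| -> f32 { x.iter().zip(y).map(|(p, q)| p * q).sum() };

    match metric {
        // sqrt(sum((a[i] - b[i])^2))
        DistanceMetric::Euclidean => a
            .iter()
            .zip(b)
            .map(|(p, q)| (p - q) * (p - q))
            .sum::<f32>()
            .sqrt(),
        // 1 - cosine_similarity; zero-norm inputs map to 1.0 (assumption).
        DistanceMetric::Cosine => {
            let norm = (dot(a, a) * dot(b, b)).sqrt();
            if norm == 0.0 {
                1.0
            } else {
                1.0 - dot(a, b) / norm
            }
        }
        // Negated so that a larger dot product yields a smaller "distance".
        DistanceMetric::DotProduct => -dot(a, b),
        // sum(|a[i] - b[i]|)
        DistanceMetric::Manhattan => a.iter().zip(b).map(|(p, q)| (p - q).abs()).sum(),
    }
}
```

Negating the dot product lets a single "smaller is closer" ordering serve all four metrics.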
162
+
163
+ #### Feature Flags
164
+
165
+ | Feature | Description | Use Case |
166
+ |---------|-------------|----------|
167
+ | `simd` | SimSIMD acceleration | Native builds |
168
+ | `parallel` | Rayon batch processing | Multi-core systems |
169
+ | None | Pure Rust fallback | WASM builds |
170
+
171
+ #### Batch Distance API
172
+
173
+ ```rust
174
+ pub fn batch_distances(
175
+ query: &[f32],
176
+ vectors: &[Vec<f32>],
177
+ metric: DistanceMetric,
178
+ ) -> Result<Vec<f32>> {
179
+ #[cfg(all(feature = "parallel", not(target_arch = "wasm32")))]
180
+ {
181
+ use rayon::prelude::*;
182
+ vectors.par_iter()
183
+ .map(|v| distance(query, v, metric))
184
+ .collect()
185
+ }
186
+ // Sequential fallback for WASM...
187
+ }
188
+ ```
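The snippet above elides the sequential path. One possible shape for it is sketched here, assuming a per-pair function that returns a `Result`. The `DistanceError` type and both function names are hypothetical, and only Euclidean distance is shown to keep the example short.

```rust
/// Hypothetical error type; the crate's own error enum is not shown in this ADR.
#[derive(Debug, PartialEq)]
pub enum DistanceError {
    DimensionMismatch { expected: usize, actual: usize },
}

/// Squared L2 distance that reports a dimension mismatch instead of panicking.
fn squared_l2(a: &[f32], b: &[f32]) -> Result<f32, DistanceError> {
    if a.len() != b.len() {
        return Err(DistanceError::DimensionMismatch {
            expected: a.len(),
            actual: b.len(),
        });
    }
    Ok(a.iter().zip(b).map(|(p, q)| (p - q) * (p - q)).sum::<f32>())
}

/// Sequential counterpart of the Rayon path: same map-and-collect shape,
/// with `iter()` in place of `par_iter()`.
pub fn batch_distances_sequential(
    query: &[f32],
    vectors: &[Vec<f32>],
) -> Result<Vec<f32>, DistanceError> {
    vectors
        .iter()
        .map(|v| squared_l2(query, v).map(f32::sqrt))
        // Collecting into Result stops at the first error and returns it.
        .collect()
}
```

Collecting an iterator of `Result` values into `Result<Vec<_>, _>` is what allows the parallel and sequential branches to share one return type.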
189
+
190
+ ### 3. Index Structures (`index/`)
191
+
192
+ #### HNSW Index (`index/hnsw.rs`)
193
+
194
+ Hierarchical Navigable Small World graph for approximate nearest neighbor search.
195
+
196
+ **Configuration Parameters:**
197
+
198
+ | Parameter | Default | Description |
199
+ |-----------|---------|-------------|
200
+ | `m` | 32 | Connections per layer (higher = better recall, more memory) |
201
+ | `ef_construction` | 200 | Build-time search depth (higher = better graph, slower build) |
202
+ | `ef_search` | 100 | Query-time search depth (higher = better recall, slower query) |
203
+ | `max_elements` | 10M | Pre-allocated capacity |
204
+
205
+ **Complexity Analysis:**
206
+
207
+ | Operation | Time Complexity | Space Complexity |
208
+ |-----------|-----------------|------------------|
209
+ | Insert | O(log n * m * ef_construction) | O(m * log n) per vector |
210
+ | Search | O(log n * m * ef_search) | O(ef_search) |
211
+ | Delete | O(1)* | O(1) |
212
+
213
+ *Note: HNSW deletion marks vectors as removed but does not restructure the graph.
214
+
215
+ **Serialization:**
216
+
217
+ ```rust
218
+ pub struct HnswState {
219
+ vectors: Vec<(String, Vec<f32>)>,
220
+ id_to_idx: Vec<(String, usize)>,
221
+ idx_to_id: Vec<(usize, String)>,
222
+ next_idx: usize,
223
+ config: SerializableHnswConfig,
224
+ dimensions: usize,
225
+ metric: SerializableDistanceMetric,
226
+ }
227
+ ```
228
+
229
+ #### Flat Index
230
+
231
+ Linear scan index for small datasets or exact search.
232
+
233
+ **Use Cases:**
234
+ - Datasets < 10K vectors
235
+ - Exact k-NN required
236
+ - Benchmarking HNSW recall
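A linear-scan index of this kind can be sketched in a few lines. `FlatIndexSketch` is a hypothetical type, not the `index/flat` module itself, and it assumes Euclidean distance and vectors of equal dimension.

```rust
/// Hypothetical exact k-NN index: stores every vector and scans all of them.
pub struct FlatIndexSketch {
    entries: Vec<(String, Vec<f32>)>,
}

impl FlatIndexSketch {
    pub fn new() -> Self {
        Self { entries: Vec::new() }
    }

    pub fn insert(&mut self, id: &str, vector: Vec<f32>) {
        self.entries.push((id.to_string(), vector));
    }

    /// Returns up to `k` (id, distance) pairs, nearest first.
    pub fn search(&self, query: &[f32], k: usize) -> Vec<(String, f32)> {
        // Score every stored vector: O(n * d) work per query.
        let mut scored: Vec<(String, f32)> = self
            .entries
            .iter()
            .map(|(id, v)| {
                let d2: f32 = query.iter().zip(v).map(|(p, q)| (p - q) * (p - q)).sum();
                (id.clone(), d2.sqrt())
            })
            .collect();
        // total_cmp gives a total order on f32, so NaN cannot break the sort.
        scored.sort_by(|a, b| a.1.total_cmp(&b.1));
        scored.truncate(k);
        scored
    }
}
```

Because the scan is exact, its results can serve as ground truth when measuring HNSW recall on the same data.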
237
+
238
+ ### 4. Quantization Strategies (`quantization.rs`)
239
+
240
+ Memory compression techniques trading precision for storage efficiency.
241
+
242
+ #### Scalar Quantization (4x compression)
243
+
244
+ Quantizes f32 to u8 using min-max scaling.
245
+
246
+ ```rust
247
+ pub struct ScalarQuantized {
248
+ pub data: Vec<u8>, // Quantized values
249
+ pub min: f32, // Minimum for dequantization
250
+ pub scale: f32, // Scale factor
251
+ }
252
+ ```
253
+
254
+ **Characteristics:**
255
+ - Compression: 4x (f32 -> u8)
256
+ - Distance calculation: Uses average scale for symmetric distance
257
+ - Reconstruction error: < 0.4% for typical embedding distributions
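Min-max scaling to `u8` can be sketched as follows. The struct mirrors the fields shown above, but the type name `ScalarQuantizedSketch` and the handling of constant or empty input (a scale of `1.0`) are assumptions. Rounding to the nearest level bounds the per-element error by half a step, roughly 0.2% of the `max - min` range.

```rust
/// Hypothetical min-max scalar quantizer (f32 -> u8), for illustration only.
pub struct ScalarQuantizedSketch {
    pub data: Vec<u8>,
    pub min: f32,
    pub scale: f32,
}

impl ScalarQuantizedSketch {
    pub fn quantize(v: &[f32]) -> Self {
        let min = v.iter().copied().fold(f32::INFINITY, f32::min);
        let max = v.iter().copied().fold(f32::NEG_INFINITY, f32::max);
        // One step of the 256-level grid; fall back to 1.0 when max <= min.
        let scale = if max > min { (max - min) / 255.0 } else { 1.0 };
        let data = v
            .iter()
            // `as u8` saturates, so tiny float overshoot past 255 is safe.
            .map(|&x| ((x - min) / scale).round() as u8)
            .collect();
        Self { data, min, scale }
    }

    /// Maps each code back to the centre of its quantization level.
    pub fn reconstruct(&self) -> Vec<f32> {
        self.data
            .iter()
            .map(|&q| self.min + q as f32 * self.scale)
            .collect()
    }
}
```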
258
+
259
+ #### Product Quantization (8-16x compression)
260
+
261
+ Divides vectors into subspaces, each quantized independently via k-means codebooks.
262
+
263
+ ```rust
264
+ pub struct ProductQuantized {
265
+ pub codes: Vec<u8>, // One code per subspace
266
+ pub codebooks: Vec<Vec<Vec<f32>>>, // Learned centroids
267
+ }
268
+ ```
269
+
270
+ **Training:**
271
+ - K-means clustering on subspace vectors
272
+ - Codebook size typically 256 (fits in u8)
273
+ - Iterations: 10-100 for convergence
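Once codebooks are trained, encoding reduces to a nearest-centroid lookup per subspace. The sketch below assumes the codebooks already exist and uses the `codebooks[subspace][centroid]` layout implied by `Vec<Vec<Vec<f32>>>`. The function names `pq_encode` and `pq_decode` are hypothetical, and k-means training is omitted.

```rust
/// Encodes `vector` as one centroid index per subspace.
/// `codebooks[s][c]` is centroid `c` of subspace `s` (assumed layout).
pub fn pq_encode(vector: &[f32], codebooks: &[Vec<Vec<f32>>]) -> Vec<u8> {
    let m = codebooks.len();
    assert!(m > 0 && !vector.is_empty(), "need at least one subspace and one dimension");
    assert!(vector.len() % m == 0, "dimension must divide evenly into subspaces");
    let sub_dim = vector.len() / m;

    vector
        .chunks_exact(sub_dim)
        .zip(codebooks)
        .map(|(sub, book)| {
            // Codes are u8, so a codebook holds at most 256 centroids.
            assert!(!book.is_empty() && book.len() <= 256);
            let mut best = 0usize;
            let mut best_d = f32::INFINITY;
            for (c, centroid) in book.iter().enumerate() {
                let d: f32 = sub.iter().zip(centroid).map(|(p, q)| (p - q) * (p - q)).sum();
                if d < best_d {
                    best_d = d;
                    best = c;
                }
            }
            best as u8
        })
        .collect()
}

/// Approximate reconstruction: concatenate the selected centroids.
pub fn pq_decode(codes: &[u8], codebooks: &[Vec<Vec<f32>>]) -> Vec<f32> {
    codes
        .iter()
        .zip(codebooks)
        .flat_map(|(&c, book)| book[c as usize].iter().copied())
        .collect()
}
```

With 256 centroids per subspace, each subvector costs one byte, so the compression ratio is determined by the subspace width.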
274
+
275
+ #### Binary Quantization (32x compression)
276
+
277
+ Single-bit representation based on sign.
278
+
279
+ ```rust
280
+ pub struct BinaryQuantized {
281
+ pub bits: Vec<u8>, // Packed bits (8 dimensions per byte)
282
+ pub dimensions: usize,
283
+ }
284
+ ```
285
+
286
+ **Characteristics:**
287
+ - Compression: 32x (f32 -> 1 bit)
288
+ - Distance: Hamming distance (XOR + popcount)
289
+ - Best for: Filtering stage before exact distance on candidates
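Sign-based packing and Hamming distance can be sketched as below. The type name `BinaryQuantizedSketch`, the least-significant-bit-first packing order, and treating exactly `0.0` as a cleared bit are assumptions made for illustration.

```rust
/// Hypothetical sign-bit quantizer: one bit per dimension, 8 dimensions per byte.
pub struct BinaryQuantizedSketch {
    pub bits: Vec<u8>,
    pub dimensions: usize,
}

impl BinaryQuantizedSketch {
    pub fn quantize(v: &[f32]) -> Self {
        // Round up so a trailing partial byte is still allocated.
        let mut bits = vec![0u8; (v.len() + 7) / 8];
        for (i, &x) in v.iter().enumerate() {
            // Positive values set the bit; zero and negatives leave it clear.
            if x > 0.0 {
                bits[i / 8] |= 1 << (i % 8);
            }
        }
        Self { bits, dimensions: v.len() }
    }

    /// Hamming distance: XOR the packed bytes, then count the set bits.
    pub fn hamming(&self, other: &Self) -> u32 {
        assert_eq!(self.dimensions, other.dimensions, "dimension mismatch");
        self.bits
            .iter()
            .zip(&other.bits)
            .map(|(a, b)| (a ^ b).count_ones())
            .sum()
    }
}
```

Padding bits in the final byte are zero on both sides, so they never contribute to the distance.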
290
+
291
+ #### Tiered Compression Strategy
292
+
293
+ Ruvector automatically manages compression based on access patterns:
294
+
295
+ | Access Frequency | Format | Compression | Latency |
296
+ |-----------------|--------|-------------|---------|
297
+ | Hot (>80%) | f32 | 1x | Instant |
298
+ | Warm (40-80%) | f16 | 2x | ~1us |
299
+ | Cool (10-40%) | Scalar | 4x | ~10us |
300
+ | Cold (1-10%) | Product | 8-16x | ~100us |
301
+ | Archive (<1%) | Binary | 32x | ~1ms |
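The tier table can be read as a threshold function over an access-frequency fraction. The sketch below assumes the percentage is expressed as a fraction in `[0, 1]` and resolves the shared boundaries (80%, 40%, 10%, 1%) toward the lower bound of each range. Both choices, and the names `Tier` and `select_tier`, are assumptions rather than documented behavior.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Tier {
    Hot,     // f32, 1x
    Warm,    // f16, 2x
    Cool,    // scalar quantization, 4x
    Cold,    // product quantization, 8-16x
    Archive, // binary quantization, 32x
}

/// Maps an access-frequency fraction to a storage tier (hypothetical helper).
pub fn select_tier(access_fraction: f32) -> Tier {
    match access_fraction {
        f if f > 0.80 => Tier::Hot,
        f if f >= 0.40 => Tier::Warm,
        f if f >= 0.10 => Tier::Cool,
        f if f >= 0.01 => Tier::Cold,
        // Everything else, including NaN, falls through to the coldest tier.
        _ => Tier::Archive,
    }
}
```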
302
+
303
+ ### 5. Memory Management
304
+
305
+ #### Arena Allocator (`arena.rs`)
306
+
307
+ Bump allocator for batch operations reducing allocation overhead.
308
+
309
+ #### Lock-Free Structures (`lockfree.rs`)
310
+
311
+ - Crossbeam-based concurrent data structures
312
+ - Lock-free queues for batch ingestion
313
+ - Available only with the `parallel` feature enabled (not on WASM)
314
+
315
+ #### Cache-Optimized Operations (`cache_optimized.rs`)
316
+
317
+ - Prefetching hints for sequential access
318
+ - Cache-line aligned storage
319
+ - NUMA-aware allocation on supported platforms
320
+
321
+ ### 6. Storage Layer (`storage.rs`)
322
+
323
+ #### Native Storage (REDB)
324
+
325
+ - ACID transactions
326
+ - Memory-mapped vectors
327
+ - Configuration persistence
328
+ - Connection pooling for multiple VectorDB instances
329
+
330
+ ```rust
331
+ const VECTORS_TABLE: TableDefinition<&str, &[u8]> = TableDefinition::new("vectors");
332
+ const METADATA_TABLE: TableDefinition<&str, &str> = TableDefinition::new("metadata");
333
+ const CONFIG_TABLE: TableDefinition<&str, &str> = TableDefinition::new("config");
334
+ ```
335
+
336
+ **Security:**
337
+ - Path traversal protection
338
+ - Validates that relative paths do not escape the working directory
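One way to implement such a check is a purely lexical walk over the path components, sketched below with the standard library. `is_safe_relative_path` is a hypothetical helper. It does not resolve symlinks, so it would be only one part of a complete defence.

```rust
use std::path::{Component, Path};

/// Returns true when `path` is relative and never climbs above its starting
/// directory. Lexical only: symlinks are not resolved (assumption of this sketch).
pub fn is_safe_relative_path(path: &Path) -> bool {
    let mut depth: usize = 0;
    for component in path.components() {
        match component {
            // Entering a directory or naming a file goes one level deeper.
            Component::Normal(_) => depth += 1,
            // "." leaves the depth unchanged.
            Component::CurDir => {}
            // ".." is allowed only while it stays inside the starting directory.
            Component::ParentDir => {
                if depth == 0 {
                    return false;
                }
                depth -= 1;
            }
            // Absolute paths and Windows prefixes are rejected outright.
            Component::RootDir | Component::Prefix(_) => return false,
        }
    }
    true
}
```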
339
+
340
+ #### Memory-Only Storage (`storage_memory.rs`)
341
+
342
+ - Pure in-memory for WASM
343
+ - No persistence
344
+ - DashMap for concurrent access
345
+
346
+ ---
347
+
348
+ ## Integration Points
349
+
350
+ ### 1. Policy Memory Store
351
+
352
+ Ruvector serves as the backing store for AI agent policy memory:
353
+
354
+ ```
355
+ +-------------------+ +-------------------+ +-------------------+
356
+ | AI Agent | | Policy Memory | | ruvector-core |
357
+ | | ----> | (AgenticDB) | ----> | |
358
+ | "What action for | | Search similar | | HNSW search |
359
+ | this situation?" | | past situations | | with metadata |
360
+ +-------------------+ +-------------------+ +-------------------+
361
+ ```
362
+
363
+ **Use Cases:**
364
+ - Q-learning state-action lookups
365
+ - Contextual bandit policy retrieval
366
+ - Episodic memory for reasoning
367
+
368
+ ### 2. Session State Index
369
+
370
+ Real-time session context for conversational AI:
371
+
372
+ ```
373
+ +-------------------+ +-------------------+ +-------------------+
374
+ | Chat Session | | Session Index | | ruvector-core |
375
+ | | ----> | | ----> | |
376
+ | Current context | | Find relevant | | Cosine similarity |
377
+ | embedding | | past turns | | top-k search |
378
+ +-------------------+ +-------------------+ +-------------------+
379
+ ```
380
+
381
+ **Requirements:**
382
+ - < 10ms latency for interactive use
383
+ - Session isolation via namespaces
384
+ - TTL-based cleanup
385
+
386
+ ### 3. Witness Log for Audit
387
+
388
+ Cryptographically-linked audit trail:
389
+
390
+ ```
391
+ +-------------------+ +-------------------+ +-------------------+
392
+ | Agent Action | | Witness Log | | ruvector-core |
393
+ | | ----> | | ----> | |
394
+ | Action embedding | | Store with hash | | Append-only |
395
+ | + metadata | | chain reference | | with timestamps |
396
+ +-------------------+ +-------------------+ +-------------------+
397
+ ```
398
+
399
+ **Properties:**
400
+ - Immutable entries
401
+ - Hash-chain linking
402
+ - Semantic searchability
403
+
404
+ ---
405
+
406
+ ## Decision Drivers
407
+
408
+ ### 1. Performance (Sub-millisecond Latency)
409
+
410
+ | Requirement | Implementation |
411
+ |-------------|----------------|
412
+ | 61us p50 search | SIMD-optimized distance + HNSW |
413
+ | 16,400 QPS | Parallel search with Rayon |
414
+ | Batch ingestion | Lock-free queues + bulk insert |
415
+
416
+ ### 2. Memory Efficiency (Quantization Support)
417
+
418
+ | Requirement | Implementation |
419
+ |-------------|----------------|
420
+ | 4x compression | Scalar quantization |
421
+ | 8-16x compression | Product quantization |
422
+ | 32x compression | Binary quantization |
423
+ | Automatic tiering | Access pattern tracking |
424
+
425
+ ### 3. Cross-Platform Portability (WASM, Native)
426
+
427
+ | Platform | Features Available |
428
+ |----------|-------------------|
429
+ | x86_64 Linux/macOS | Full (SIMD, parallel, storage) |
430
+ | ARM64 macOS (Apple Silicon) | Full (NEON, parallel, storage) |
431
+ | WASM (browser) | Memory-only, scalar fallback |
432
+ | PostgreSQL extension | Full + SQL integration |
433
+
434
+ ### 4. LLM Integration
435
+
436
+ | Requirement | Implementation |
437
+ |-------------|----------------|
438
+ | Embedding ingestion | API-based and local providers |
439
+ | Semantic search | Cosine/dot product metrics |
440
+ | RAG pipeline | Hybrid search + metadata filtering |
441
+
442
+ ---
443
+
444
+ ## Alternatives Considered
445
+
446
+ ### Alternative 1: Pure Python Implementation (NumPy/FAISS)
447
+
448
+ **Rejected because:**
449
+ - 10-100x slower than Rust SIMD
450
+ - No WASM support
451
+ - GIL contention in concurrent workloads
452
+
453
+ ### Alternative 2: C++ with Bindings
454
+
455
+ **Rejected because:**
456
+ - Memory safety concerns
457
+ - Complex cross-compilation
458
+ - Build system complexity (CMake)
459
+
460
+ ### Alternative 3: Qdrant/Milvus Integration
461
+
462
+ **Rejected because:**
463
+ - External service dependency
464
+ - No WASM support
465
+ - Complex deployment for edge use cases
466
+
467
+ ### Alternative 4: GPU-Only Acceleration (CUDA/ROCm)
468
+
469
+ **Rejected because:**
470
+ - Not portable to edge/mobile
471
+ - Driver dependencies
472
+ - Unnecessary for datasets under 100M vectors
473
+
474
+ ---
475
+
476
+ ## Consequences
477
+
478
+ ### Benefits
479
+
480
+ 1. **Performance**: Sub-millisecond latency enables real-time AI applications
481
+ 2. **Portability**: Single codebase runs native, WASM, and PostgreSQL
482
+ 3. **Memory Efficiency**: 2-32x compression makes large datasets practical on edge
483
+ 4. **Integration**: Native Rust means zero-cost abstractions for embedding in other systems
484
+ 5. **Learning**: GNN layers can improve search quality without reindexing
485
+
486
+ ### Risks and Mitigations
487
+
488
+ | Risk | Probability | Impact | Mitigation |
489
+ |------|-------------|--------|------------|
490
+ | HNSW recall < 100% | High | Medium | ef_search tuning, hybrid with exact search |
491
+ | Quantization accuracy loss | Medium | Medium | Conformal prediction bounds |
492
+ | WASM performance gap | Medium | Low | Specialized WASM-optimized builds |
493
+ | API embeddings require external call | High | Low | Local embedding option via ONNX |
494
+
495
+ ### Performance Targets
496
+
497
+ | Metric | Target | Achieved |
498
+ |--------|--------|----------|
499
+ | HNSW Search (k=10, 384-dim) | < 100us p50 | 61us |
500
+ | HNSW Search (k=100, 384-dim) | < 200us p50 | 164us |
501
+ | Cosine Distance (1536-dim) | < 200ns | 143ns |
502
+ | Dot Product (384-dim) | < 50ns | 33ns |
503
+ | Batch Distance (1000 vectors) | < 500us | 237us |
504
+ | QPS (10K vectors, k=10) | > 10K | 16,400 |
505
+
506
+ ---
507
+
508
+ ## Implementation Status
509
+
510
+ ### Completed (v0.1.x)
511
+
512
+ | Module | Status | Description |
513
+ |--------|--------|-------------|
514
+ | `simd_intrinsics` | Complete | AVX2/NEON dispatch with scalar fallback |
515
+ | `distance` | Complete | All 4 metrics with SimSIMD integration |
516
+ | `index/hnsw` | Complete | Full HNSW with serialization |
517
+ | `index/flat` | Complete | Linear scan baseline |
518
+ | `quantization` | Complete | Scalar, Product, Binary |
519
+ | `storage` | Complete | REDB-based with connection pooling |
520
+ | `storage_memory` | Complete | In-memory for WASM |
521
+ | `types` | Complete | Core types with serde |
522
+ | `error` | Complete | Error types with thiserror |
523
+ | `vector_db` | Complete | High-level API |
524
+ | `agenticdb` | Complete | AI agent memory interface |
525
+
526
+ ### Advanced Features
527
+
528
+ | Module | Status | Description |
529
+ |--------|--------|-------------|
530
+ | `advanced_features/filtered_search` | Complete | Metadata-based filtering |
531
+ | `advanced_features/hybrid_search` | Complete | Dense + sparse (BM25) |
532
+ | `advanced_features/mmr` | Complete | Maximal Marginal Relevance |
533
+ | `advanced_features/conformal_prediction` | Complete | Uncertainty quantification |
534
+ | `advanced_features/product_quantization` | Complete | Enhanced PQ with training |
535
+
536
+ ### Research Features (`advanced/`)
537
+
538
+ | Module | Status | Description |
539
+ |--------|--------|-------------|
540
+ | `hypergraph` | Experimental | Hyperedge relationships |
541
+ | `learned_index` | Experimental | Neural index structures |
542
+ | `neural_hash` | Experimental | LSH with neural tuning |
543
+ | `tda` | Experimental | Topological data analysis |
544
+
545
+ ---
546
+
547
+ ## Feature Flags
548
+
549
+ | Feature | Default | Description |
550
+ |---------|---------|-------------|
551
+ | `default` | Yes | simd, storage, hnsw, api-embeddings, parallel |
552
+ | `simd` | Yes | SimSIMD acceleration |
553
+ | `parallel` | Yes | Rayon parallel processing |
554
+ | `storage` | Yes | REDB file-based storage |
555
+ | `hnsw` | Yes | HNSW index support |
556
+ | `api-embeddings` | Yes | HTTP-based embedding providers |
557
+ | `memory-only` | No | Pure in-memory (WASM) |
558
+ | `real-embeddings` | No | Deprecated, use api-embeddings |
559
+
560
+ ---
561
+
562
+ ## Dependencies
563
+
564
+ ### Core Dependencies
565
+
566
+ | Dependency | Version | Purpose |
567
+ |------------|---------|---------|
568
+ | `hnsw_rs` | workspace | HNSW implementation |
569
+ | `simsimd` | workspace | SIMD distance functions |
570
+ | `rayon` | workspace | Parallel iteration |
571
+ | `redb` | workspace | Embedded database |
572
+ | `bincode` | workspace | Binary serialization |
573
+ | `dashmap` | workspace | Concurrent hash map |
574
+ | `parking_lot` | workspace | Optimized locks |
575
+
576
+ ### Optional Dependencies
577
+
578
+ | Dependency | Feature | Purpose |
579
+ |------------|---------|---------|
580
+ | `reqwest` | api-embeddings | HTTP client for embedding APIs |
581
+ | `memmap2` | storage | Memory-mapped files |
582
+ | `crossbeam` | parallel | Lock-free data structures |
583
+
584
+ ---
585
+
586
+ ## API Examples
587
+
588
+ ### Basic Vector Search
589
+
590
+ ```rust
591
+ use ruvector_core::{VectorDB, DistanceMetric, HnswConfig};
592
+
593
+ // Create database
594
+ let config = HnswConfig {
595
+ m: 32,
596
+ ef_construction: 200,
597
+ ef_search: 100,
598
+ max_elements: 1_000_000,
599
+ };
600
+ let mut db = VectorDB::new(384, DistanceMetric::Cosine, config)?;
601
+
602
+ // Insert vectors
603
+ db.insert("doc_1".to_string(), vec![0.1; 384])?;
604
+ db.insert("doc_2".to_string(), vec![0.2; 384])?;
605
+
606
+ // Search
607
+ let query = vec![0.15; 384];
608
+ let results = db.search(&query, 10)?;
609
+ ```
610
+
611
+ ### Quantized Search
612
+
613
+ ```rust
614
+ use ruvector_core::quantization::{ScalarQuantized, QuantizedVector};
615
+
616
+ // Quantize vectors for storage
617
+ let quantized = ScalarQuantized::quantize(&vector);
618
+
619
+ // Distance in quantized space
620
+ let distance = quantized.distance(&other_quantized);
621
+
622
+ // Reconstruct if needed
623
+ let reconstructed = quantized.reconstruct();
624
+ ```
625
+
626
+ ### Batch Operations
627
+
628
+ ```rust
629
+ use ruvector_core::distance::batch_distances;
630
+
631
+ // Calculate distances to many vectors in parallel
632
+ let distances = batch_distances(
633
+ &query,
634
+ &corpus_vectors,
635
+ DistanceMetric::Cosine,
636
+ )?;
637
+ ```
638
+
639
+ ---
640
+
641
+ ## References
642
+
643
+ 1. Malkov, Y., & Yashunin, D. (2018). "Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs." arXiv:1603.09320.
644
+
645
+ 2. Jegou, H., Douze, M., & Schmid, C. (2011). "Product quantization for nearest neighbor search." IEEE TPAMI.
646
+
647
+ 3. RuVector Team. "ruvector-core Benchmarks." /crates/ruvector-core/benches/
648
+
649
+ 4. SimSIMD Documentation. https://github.com/ashvardanian/SimSIMD
650
+
651
+ ---
652
+
653
+ ## Appendix A: SIMD Register Usage
654
+
655
+ ### AVX2 (256-bit registers)
656
+
657
+ ```
658
+ +-------+-------+-------+-------+-------+-------+-------+-------+
659
+ | f32 | f32 | f32 | f32 | f32 | f32 | f32 | f32 |
660
+ +-------+-------+-------+-------+-------+-------+-------+-------+
661
+ [0] [1] [2] [3] [4] [5] [6] [7]
662
+
663
+ Operations per cycle:
664
+ - _mm256_loadu_ps: Load 8 floats
665
+ - _mm256_sub_ps: 8 subtractions
666
+ - _mm256_mul_ps: 8 multiplications
667
+ - _mm256_add_ps: 8 additions
668
+ ```
669
+
670
+ ### NEON (128-bit registers)
671
+
672
+ ```
673
+ +-------+-------+-------+-------+
674
+ | f32 | f32 | f32 | f32 |
675
+ +-------+-------+-------+-------+
676
+ [0] [1] [2] [3]
677
+
678
+ Operations per cycle:
679
+ - vld1q_f32: Load 4 floats
680
+ - vsubq_f32: 4 subtractions
681
+ - vfmaq_f32: 4 fused multiply-add
682
+ - vaddvq_f32: Horizontal sum
683
+ ```
684
+
685
+ ---
686
+
687
+ ## Appendix B: Memory Layout
688
+
689
+ ### VectorEntry
690
+
691
+ ```
692
+ +------------------+------------------+------------------+
693
+ | id: String | vector: Vec<f32>| metadata: JSON |
694
+ | (optional) | (required) | (optional) |
695
+ +------------------+------------------+------------------+
696
+ ```
697
+
698
+ ### HNSW Graph Structure
699
+
700
+ ```
701
+ Level 3: [v0] -------- [v5]
702
+ \ /
703
+ Level 2: [v0] -- [v3] -- [v5] -- [v9]
704
+ \ / \ / \
705
+ Level 1: [v0]-[v1]-[v3]-[v4]-[v5]-[v7]-[v9]
706
+ | | | | | | |
707
+ Level 0: [v0]-[v1]-[v2]-[v3]-[v4]-[v5]-[v6]-[v7]-[v8]-[v9]
708
+ ```
709
+
710
+ ---
711
+
712
+ ## Appendix C: Benchmark Results
713
+
714
+ ### Platform: Apple M2 (ARM64 NEON)
715
+
716
+ ```
717
+ HNSW Search k=10 (10K vectors, 384-dim):
718
+ p50: 61us
719
+ p95: 89us
720
+ p99: 112us
721
+ Throughput: 16,400 QPS
722
+
723
+ HNSW Search k=100 (10K vectors, 384-dim):
724
+ p50: 164us
725
+ p95: 203us
726
+ p99: 245us
727
+ Throughput: 6,100 QPS
728
+
729
+ Distance Operations (1536-dim):
730
+ Cosine: 143ns
731
+ Euclidean: 156ns
732
+ Dot Product: 33ns (384-dim)
733
+
734
+ Batch Distance (1000 vectors, 384-dim):
735
+ Parallel (Rayon): 237us
736
+ Sequential: 890us
737
+ ```
738
+
739
+ ### Platform: Intel i7 (AVX2)
740
+
741
+ ```
742
+ HNSW Search k=10 (10K vectors, 384-dim):
743
+ p50: 72us
744
+ p95: 105us
745
+ p99: 134us
746
+ Throughput: 13,900 QPS
747
+
748
+ Distance Operations (1536-dim):
749
+ Cosine: 128ns
750
+ Euclidean: 141ns
751
+ Dot Product: 29ns (384-dim)
752
+ ```
753
+
754
+ ---
755
+
756
+ ## Related Decisions
757
+
758
+ - **ADR-002**: RuvLLM Integration with Ruvector
759
+ - **ADR-003**: SIMD Optimization Strategy
760
+ - **ADR-004**: KV Cache Management
761
+ - **ADR-005**: WASM Runtime Integration
762
+ - **ADR-006**: Memory Management
763
+ - **ADR-007**: Security Review & Technical Debt
764
+
765
+ ---
766
+
767
+ ## Implementation Status (v2.1)
768
+
769
+ | Component | Status | Notes |
770
+ |-----------|--------|-------|
771
+ | HNSW Index | ✅ Implemented | M=32, ef_construction=256, 16K QPS |
772
+ | SIMD Distance | ✅ Implemented | AVX2/NEON with fallback |
773
+ | Scalar Quantization | ✅ Implemented | 8-bit with min/max scaling |
774
+ | Batch Operations | ✅ Implemented | Rayon parallel distances |
775
+ | Graph Store | ✅ Implemented | Adjacency list with metadata |
776
+ | Persistence | ✅ Implemented | Binary format with versioning |
777
+
778
+ **Security Status:** Core components reviewed. No critical vulnerabilities in ruvector-core. See ADR-007 for full audit (RuvLLM-specific issues).
779
+
780
+ ---
781
+
782
+ ## Revision History
783
+
784
+ | Version | Date | Author | Changes |
785
+ |---------|------|--------|---------|
786
+ | 1.0 | 2026-01-18 | Ruvector Architecture Team | Initial version |
787
+ | 1.1 | 2026-01-19 | Security Review Agent | Added implementation status, related decisions |