rvlite 0.2.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +270 -0
- package/bin/cli.js +1685 -0
- package/dist/wasm/README.md +216 -0
- package/dist/wasm/attention/LICENSE +21 -0
- package/dist/wasm/attention/README.md +193 -0
- package/dist/wasm/attention/package.json +17 -0
- package/dist/wasm/attention/ruvector_attention_wasm.d.ts +334 -0
- package/dist/wasm/attention/ruvector_attention_wasm.js +1470 -0
- package/dist/wasm/attention/ruvector_attention_wasm_bg.wasm +0 -0
- package/dist/wasm/attention/ruvector_attention_wasm_bg.wasm.d.ts +71 -0
- package/dist/wasm/package.json +24 -0
- package/dist/wasm/rvlite.d.ts +276 -0
- package/dist/wasm/rvlite.js +1504 -0
- package/dist/wasm/rvlite_bg.wasm +0 -0
- package/dist/wasm/rvlite_bg.wasm.d.ts +56 -0
- package/dist/wasm/sona/LICENSE-APACHE +103 -0
- package/dist/wasm/sona/LICENSE-MIT +21 -0
- package/dist/wasm/sona/README.md +1513 -0
- package/dist/wasm/sona/package.json +36 -0
- package/dist/wasm/sona/ruvector_sona.d.ts +513 -0
- package/dist/wasm/sona/ruvector_sona.js +1286 -0
- package/dist/wasm/sona/ruvector_sona_bg.wasm +0 -0
- package/dist/wasm/sona/ruvector_sona_bg.wasm.d.ts +53 -0
- package/dist/wasm/sona/sona.d.ts +281 -0
- package/dist/wasm/sona/sona.js +685 -0
- package/dist/wasm/sona/sona_bg.wasm +0 -0
- package/dist/wasm/sona/sona_bg.wasm.d.ts +26 -0
- package/package.json +81 -0
@@ -0,0 +1,1513 @@

# SONA - Self-Optimizing Neural Architecture

<div align="center">

**Runtime-adaptive learning for LLM routers and AI systems without expensive retraining.**

[crates.io](https://crates.io/crates/ruvector-sona)
[npm](https://www.npmjs.com/package/@ruvector/sona)
[docs.rs](https://docs.rs/ruvector-sona)
[License](LICENSE)

[Quick Start](#quick-start) | [Tutorials](#tutorials) | [API Reference](#api-reference) | [Benchmarks](#benchmarks)

</div>

---

## What is SONA?

SONA (Self-Optimizing Neural Architecture) is a **real-time learning system** that makes your AI applications smarter with every interaction. Instead of expensive model retraining that takes days and costs thousands of dollars, SONA learns from user feedback in **sub-millisecond time**.

### The Problem SONA Solves

Traditional AI systems have a critical limitation: they don't learn from their mistakes in production. When a user gives negative feedback, that information is typically lost or requires manual intervention to address.

| Traditional Approach | Time | Cost | Downtime |
|----------------------|------|------|----------|
| Fine-tune model | Days-Weeks | $1,000-$100,000+ | Yes |
| Retrain from scratch | Weeks-Months | $10,000-$1M+ | Yes |
| Manual prompt tuning | Hours-Days | Engineering time | No |
| **SONA** | **<1 millisecond** | **$0** | **No** |

### How It Works

```
User Query → [SONA Engine] → Model Response → User Feedback
     ↑                                             │
     └────────────── Learning Signal ──────────────┘
                    (< 1ms adaptation)
```

SONA uses three key innovations:

1. **Two-Tier LoRA**: Fast (MicroLoRA) and deep (BaseLoRA) adaptation layers
2. **EWC++**: Prevents forgetting previously learned patterns
3. **ReasoningBank**: Stores and retrieves successful interaction patterns

---

## Table of Contents

- [Installation](#installation)
- [Quick Start](#quick-start)
- [Core Concepts](#core-concepts)
- [Tutorials](#tutorials)
  - [Tutorial 1: Your First SONA Application](#tutorial-1-your-first-sona-application)
  - [Tutorial 2: Building an Adaptive Chatbot](#tutorial-2-building-an-adaptive-chatbot)
  - [Tutorial 3: LLM Router with Learning](#tutorial-3-llm-router-with-learning)
  - [Tutorial 4: Browser-Based Learning (WASM)](#tutorial-4-browser-based-learning-wasm)
  - [Tutorial 5: Node.js Backend Integration](#tutorial-5-nodejs-backend-integration)
  - [Tutorial 6: Production Deployment](#tutorial-6-production-deployment)
- [Configuration Guide](#configuration-guide)
- [API Reference](#api-reference)
- [Benchmarks](#benchmarks)
- [Troubleshooting](#troubleshooting)

---

## Installation

### Rust (Cargo)

```toml
[dependencies]
ruvector-sona = "0.1.1"

# With all features
ruvector-sona = { version = "0.1.1", features = ["serde-support"] }
```

### Node.js (npm)

```bash
npm install @ruvector/sona
# or
yarn add @ruvector/sona
# or
pnpm add @ruvector/sona
```

### Browser (WASM)

```bash
# Clone and build WASM package
git clone https://github.com/ruvnet/ruvector.git
cd ruvector/crates/sona
wasm-pack build --target web --features wasm

# Copy to your project
cp -r pkg/ your-project/sona/
```

---

## Quick Start

### 30-Second Example (Rust)

```rust
use ruvector_sona::SonaEngine;

fn main() {
    // 1. Create engine
    let engine = SonaEngine::builder()
        .hidden_dim(256)
        .build();

    // 2. Record a user interaction
    let query_embedding = vec![0.1f32; 256];
    let traj_id = engine.begin_trajectory(query_embedding);

    // 3. Record what happened (model selection, confidence, latency)
    engine.add_step(traj_id, vec![0.5; 256], vec![0.8; 64], 0.9);

    // 4. Record outcome quality (0.0 = bad, 1.0 = perfect)
    engine.end_trajectory(traj_id, 0.85);

    // 5. Apply learned optimizations to future queries
    let new_query = vec![0.2f32; 256];
    let optimized = engine.apply_micro_lora(&new_query);

    println!("SONA is learning! Stats: {}", engine.get_stats());
}
```

### 30-Second Example (Node.js)

```javascript
const { SonaEngine } = require('@ruvector/sona');

// 1. Create engine
const engine = new SonaEngine(256);

// 2. Record interaction
const queryEmbedding = Array(256).fill(0.1);
const trajId = engine.beginTrajectory(queryEmbedding);

// 3. Add step data
engine.addTrajectoryStep(trajId, Array(256).fill(0.5), Array(64).fill(0.8), 0.9);

// 4. Complete with quality score
engine.endTrajectory(trajId, 0.85);

// 5. Apply learning
const newQuery = Array(256).fill(0.2);
const optimized = engine.applyMicroLora(newQuery);

console.log('Stats:', engine.getStats());
```

---

## Core Concepts

### Understanding Embeddings

Embeddings are numerical representations of text. Every word, sentence, or query can be converted into a vector of numbers (typically 256-4096 dimensions). SONA works with these embeddings to learn patterns.

```
"How do I reset my password?" → [0.12, -0.45, 0.78, ..., 0.23]  (256 numbers)
"Password reset help"         → [0.11, -0.44, 0.79, ..., 0.22]  (similar!)
"What's the weather?"         → [0.89, 0.12, -0.34, ..., 0.67]  (different)
```
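
"Similar" here means the vectors point in nearly the same direction, which is usually measured with cosine similarity. A minimal sketch in plain Rust (an illustration, not part of the SONA API; Tutorial 3 below uses the same helper):

```rust
/// Cosine similarity: 1.0 = same direction, 0.0 = orthogonal, -1.0 = opposite.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a > 0.0 && norm_b > 0.0 { dot / (norm_a * norm_b) } else { 0.0 }
}

fn main() {
    // First three components of the toy embeddings above.
    let password_reset = [0.12, -0.45, 0.78];
    let reset_help = [0.11, -0.44, 0.79];
    let weather = [0.89, 0.12, -0.34];
    // The two password queries score much closer to 1.0 with each other
    // than either does with the weather query.
    println!("{:.3}", cosine_similarity(&password_reset, &reset_help));
    println!("{:.3}", cosine_similarity(&password_reset, &weather));
}
```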

### Trajectories: Recording What Happened

A **trajectory** is a complete record of one user interaction:

```
┌─────────────────────────────────────────────────────────────┐
│                         Trajectory                          │
├─────────────────────────────────────────────────────────────┤
│ Query Embedding: [0.12, -0.45, 0.78, ...]                   │
│                                                             │
│ Steps:                                                      │
│   Step 1: Selected Model A, confidence 0.82, latency 45ms   │
│   Step 2: Generated response, confidence 0.91, latency 120ms│
│   Step 3: Formatted output, confidence 0.95, latency 5ms    │
│                                                             │
│ Final Quality: 0.85 (user gave thumbs up)                   │
└─────────────────────────────────────────────────────────────┘
```

### Two-Tier LoRA: Fast and Deep Learning

SONA uses two types of adaptation:

| Tier | Rank | Speed | Purpose | When Used |
|------|------|-------|---------|-----------|
| **MicroLoRA** | 2 | ~45μs | Instant adjustments | Every request |
| **BaseLoRA** | 8-16 | ~1ms | Deep pattern learning | Background (hourly) |

**MicroLoRA** is like quick reflexes - it adapts immediately based on recent feedback.
**BaseLoRA** is like long-term memory - it consolidates patterns over time.
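
Both tiers share the same low-rank idea: instead of learning a full d×d weight update, LoRA learns two skinny matrices A (r×d) and B (d×r) and adds B·(A·x) as a residual correction. A rank-2 update costs roughly 2·r·d multiplies instead of d², which is why MicroLoRA is cheap enough to run on every request. A toy sketch of that arithmetic (hypothetical shapes and names, not the SONA internals - in SONA this happens inside `apply_micro_lora`):

```rust
/// Toy low-rank update: y = x + scale * B * (A * x).
/// `a` holds r rows of length d; `b` holds d rows of length r (row-major).
fn apply_lora(x: &[f32], a: &[Vec<f32>], b: &[Vec<f32>], scale: f32) -> Vec<f32> {
    let r = a.len();
    // Project down: h = A * x  (r values, the "bottleneck")
    let h: Vec<f32> = a.iter()
        .map(|row| row.iter().zip(x).map(|(w, xi)| w * xi).sum())
        .collect();
    // Project back up and add residually: y = x + scale * (B * h)
    x.iter().enumerate()
        .map(|(i, xi)| {
            let delta: f32 = (0..r).map(|j| b[i][j] * h[j]).sum();
            xi + scale * delta
        })
        .collect()
}

fn main() {
    let x = vec![1.0f32; 4];
    // Rank-1 adapter over a 4-dim embedding: A is 1x4, B is 4x1.
    let a = vec![vec![0.5, 0.5, 0.0, 0.0]];
    let b = vec![vec![0.1], vec![0.1], vec![0.0], vec![0.0]];
    let y = apply_lora(&x, &a, &b, 1.0);
    println!("{:?}", y); // approximately [1.1, 1.1, 1.0, 1.0]
}
```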

### EWC++: Remembering Without Forgetting

When learning new patterns, AI systems often "forget" old ones (catastrophic forgetting). EWC++ (Elastic Weight Consolidation) prevents this by:

1. Tracking which parameters are important for each task
2. Protecting important parameters when learning new tasks
3. Automatically detecting when a "new task" begins

```
Without EWC++:              With EWC++:
┌──────────────────────┐    ┌──────────────────────┐
│ Learn Task A: ✓      │    │ Learn Task A: ✓      │
│ Learn Task B: ✓      │    │ Learn Task B: ✓      │
│ Task A knowledge: ✗  │    │ Task A knowledge: ✓  │
└──────────────────────┘    └──────────────────────┘
```
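
Under the hood, classic EWC adds a quadratic penalty, (λ/2) · Σᵢ Fᵢ·(θᵢ - θᵢ*)², that pulls each parameter θᵢ back toward the value θᵢ* it had after the previous task, weighted by an importance estimate Fᵢ (the Fisher information). A minimal sketch of that penalty term (illustrative only; SONA's EWC++ variant tracks these statistics internally):

```rust
/// EWC penalty: (lambda / 2) * sum_i F_i * (theta_i - theta_star_i)^2.
/// Large F_i makes moving parameter i away from `theta_star` expensive,
/// so gradient descent prefers to change unimportant parameters instead.
fn ewc_penalty(theta: &[f32], theta_star: &[f32], fisher: &[f32], lambda: f32) -> f32 {
    let sum: f32 = theta.iter()
        .zip(theta_star)
        .zip(fisher)
        .map(|((t, t_star), f)| f * (t - t_star) * (t - t_star))
        .sum();
    0.5 * lambda * sum
}

fn main() {
    let theta_star = [0.4, -0.2]; // parameters after the old task
    let fisher = [5.0, 0.01];     // the first parameter mattered a lot
    // Moving the important parameter incurs a large penalty...
    let p1 = ewc_penalty(&[1.4, -0.2], &theta_star, &fisher, 2000.0);
    // ...while moving the unimportant one is nearly free.
    let p2 = ewc_penalty(&[0.4, 0.8], &theta_star, &fisher, 2000.0);
    println!("important move: {p1}, unimportant move: {p2}");
}
```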

### ReasoningBank: Pattern Library

ReasoningBank stores successful interaction patterns using K-means++ clustering:

```
┌─────────────────────────────────────────────────────────────┐
│                       ReasoningBank                         │
├─────────────────────────────────────────────────────────────┤
│ Cluster 1: "Password/Account Issues"                        │
│   - 847 trajectories, avg quality 0.89                      │
│   - Best response pattern: Empathetic + Step-by-step        │
│                                                             │
│ Cluster 2: "Technical Questions"                            │
│   - 1,234 trajectories, avg quality 0.92                    │
│   - Best response pattern: Detailed + Code examples         │
│                                                             │
│ Cluster 3: "General Conversation"                           │
│   - 2,156 trajectories, avg quality 0.78                    │
│   - Best response pattern: Friendly + Concise               │
└─────────────────────────────────────────────────────────────┘
```
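
At lookup time this reduces to a nearest-centroid search: embed the query, find the closest cluster, and reuse what worked there (Tutorial 3 does this through `engine.find_patterns`). A minimal sketch of the assignment step (illustrative, not the SONA internals):

```rust
/// Index of the cluster centroid closest to `query`
/// (squared Euclidean distance, as in the k-means assignment step).
fn nearest_centroid(query: &[f32], centroids: &[Vec<f32>]) -> usize {
    let mut best = 0;
    let mut best_dist = f32::MAX;
    for (i, c) in centroids.iter().enumerate() {
        let dist: f32 = query.iter().zip(c).map(|(q, ci)| (q - ci) * (q - ci)).sum();
        if dist < best_dist {
            best_dist = dist;
            best = i;
        }
    }
    best
}

fn main() {
    let centroids = vec![
        vec![0.9, 0.1], // e.g. "password/account" cluster
        vec![0.1, 0.9], // e.g. "technical" cluster
    ];
    let query = vec![0.8, 0.2];
    println!("closest cluster: {}", nearest_centroid(&query, &centroids)); // prints 0
}
```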

---

## Tutorials

### Tutorial 1: Your First SONA Application

Let's build a simple application that learns from user feedback.

**Goal**: Create a system that improves response quality based on thumbs up/down.

```rust
use ruvector_sona::{SonaEngine, SonaConfig};

fn main() {
    // Step 1: Configure SONA
    // Use optimized defaults (benchmark-validated)
    let config = SonaConfig::default();

    println!("Configuration:");
    println!("  MicroLoRA rank: {} (optimal for SIMD)", config.micro_lora_rank);
    println!("  Learning rate: {} (+55% quality)", config.micro_lora_lr);
    println!("  Pattern clusters: {} (2.3x faster)", config.pattern_clusters);
    println!("  EWC lambda: {} (anti-forgetting)", config.ewc_lambda);

    // Step 2: Create the engine
    let engine = SonaEngine::builder()
        .config(config)
        .build();

    // Step 3: Simulate 100 user interactions
    let mut positive_count = 0;
    let mut negative_count = 0;

    for i in 0..100 {
        // Simulate a query embedding (in a real app, use your embedding model)
        let query_embedding: Vec<f32> = (0..256)
            .map(|j| ((i * 256 + j) as f32 * 0.001).sin())
            .collect();

        // Start recording this interaction
        let traj_id = engine.begin_trajectory(query_embedding.clone());

        // Simulate processing steps
        let activations: Vec<f32> = query_embedding.iter()
            .map(|x| x.tanh())
            .collect();
        let attention: Vec<f32> = vec![1.0 / 64.0; 64];

        engine.add_step(traj_id, activations, attention, 0.8);

        // Simulate user feedback (70% positive in this example)
        let is_positive = (i % 10) < 7;
        let quality = if is_positive { 0.9 } else { 0.3 };

        if is_positive {
            positive_count += 1;
        } else {
            negative_count += 1;
        }

        // Complete the trajectory with quality score
        engine.end_trajectory(traj_id, quality);

        // Run learning tick (processes pending trajectories)
        engine.tick();
    }

    // Step 4: Check what we learned
    println!("\nResults after 100 interactions:");
    println!("  Positive feedback: {}", positive_count);
    println!("  Negative feedback: {}", negative_count);
    println!("  Engine stats: {}", engine.get_stats());

    // Step 5: Apply learning to a new query
    let new_query: Vec<f32> = vec![0.5; 256];
    let optimized = engine.apply_micro_lora(&new_query);

    // The optimized embedding now incorporates learned patterns!
    let diff: f32 = new_query.iter()
        .zip(optimized.iter())
        .map(|(a, b)| (a - b).abs())
        .sum();

    println!("\nLearning applied! Embedding change magnitude: {:.4}", diff);
}
```

**Expected Output:**
```
Configuration:
  MicroLoRA rank: 2 (optimal for SIMD)
  Learning rate: 0.002 (+55% quality)
  Pattern clusters: 100 (2.3x faster)
  EWC lambda: 2000 (anti-forgetting)

Results after 100 interactions:
  Positive feedback: 70
  Negative feedback: 30
  Engine stats: {"trajectories": 100, "patterns": 12, "micro_updates": 100}

Learning applied! Embedding change magnitude: 0.0847
```

---

### Tutorial 2: Building an Adaptive Chatbot

Let's build a chatbot that learns to give better responses.

```rust
use ruvector_sona::{SonaEngine, SonaConfig};
use std::collections::HashMap;

/// Adaptive chatbot that learns from user feedback
pub struct AdaptiveChatbot {
    engine: SonaEngine,
    response_templates: HashMap<String, Vec<String>>,
    active_trajectory: Option<u64>,
}

impl AdaptiveChatbot {
    pub fn new() -> Self {
        // Use the max_quality preset for a chatbot (we want the best responses)
        let config = SonaConfig::max_quality();

        let engine = SonaEngine::builder()
            .config(config)
            .build();

        // Simple response templates (in a real app, use an LLM)
        let mut templates = HashMap::new();
        templates.insert("greeting".to_string(), vec![
            "Hello! How can I help you today?".to_string(),
            "Hi there! What can I do for you?".to_string(),
            "Welcome! I'm here to assist you.".to_string(),
        ]);
        templates.insert("farewell".to_string(), vec![
            "Goodbye! Have a great day!".to_string(),
            "Take care! Feel free to come back anytime.".to_string(),
            "Bye! It was nice helping you.".to_string(),
        ]);
        templates.insert("unknown".to_string(), vec![
            "I'm not sure I understand. Could you rephrase that?".to_string(),
            "Let me think about that...".to_string(),
            "Interesting question! Let me help you with that.".to_string(),
        ]);

        Self {
            engine,
            response_templates: templates,
            active_trajectory: None,
        }
    }

    /// Process a user message
    pub fn respond(&mut self, message: &str) -> String {
        // Step 1: Create an embedding from the message
        let embedding = self.create_embedding(message);

        // Step 2: Start a trajectory
        let traj_id = self.engine.begin_trajectory(embedding.clone());
        self.active_trajectory = Some(traj_id);

        // Step 3: Apply learned optimizations
        let optimized = self.engine.apply_micro_lora(&embedding);

        // Step 4: Classify intent using the optimized embedding
        let intent = self.classify_intent(&optimized);

        // Step 5: Record the classification step
        let activations: Vec<f32> = optimized.iter().map(|x| x.tanh()).collect();
        let attention = vec![1.0 / 64.0; 64];
        self.engine.add_step(traj_id, activations, attention, 0.8);

        // Step 6: Select the best response template
        let responses = self.response_templates.get(&intent)
            .unwrap_or(&self.response_templates["unknown"]);

        // Use embedding similarity to pick the best response
        self.select_best_response(responses, &optimized)
    }

    /// Record user feedback (call after the response is shown)
    pub fn record_feedback(&mut self, was_helpful: bool) {
        if let Some(traj_id) = self.active_trajectory.take() {
            let quality = if was_helpful { 0.95 } else { 0.2 };
            self.engine.end_trajectory(traj_id, quality);

            // Force learning on negative feedback (learn faster from mistakes)
            if !was_helpful {
                self.engine.force_learn();
            }
        }
    }

    /// Create a simple embedding from text
    fn create_embedding(&self, text: &str) -> Vec<f32> {
        // Simple bag-of-characters embedding (use real embeddings in production!)
        let mut embedding = vec![0.0f32; 256];
        for (i, c) in text.chars().enumerate() {
            let idx = (c as usize + i) % 256;
            embedding[idx] += 0.1;
        }
        // Normalize
        let norm: f32 = embedding.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm > 0.0 {
            embedding.iter_mut().for_each(|x| *x /= norm);
        }
        embedding
    }

    /// Classify user intent
    fn classify_intent(&self, embedding: &[f32]) -> String {
        // Simple heuristic (use a classifier in production!)
        let sum: f32 = embedding.iter().take(10).sum();
        if sum > 0.5 {
            "greeting".to_string()
        } else if sum < -0.5 {
            "farewell".to_string()
        } else {
            "unknown".to_string()
        }
    }

    /// Select the best response based on the embedding
    fn select_best_response(&self, responses: &[String], embedding: &[f32]) -> String {
        // Use the embedding to deterministically select a response
        let idx = (embedding[0].abs() * responses.len() as f32) as usize % responses.len();
        responses[idx].clone()
    }

    /// Get learning statistics
    pub fn stats(&self) -> String {
        self.engine.get_stats()
    }
}

fn main() {
    let mut bot = AdaptiveChatbot::new();

    // Simulate a conversation
    let conversations = vec![
        ("Hello!", true),
        ("Hi there", true),
        ("What is AI?", false),              // Bad response
        ("Explain machine learning", false), // Bad response
        ("Thanks, goodbye!", true),
        ("Hello again!", true),
    ];

    for (message, was_helpful) in conversations {
        println!("User: {}", message);
        let response = bot.respond(message);
        println!("Bot: {}", response);
        bot.record_feedback(was_helpful);
        println!("  [Feedback: {}]", if was_helpful { "👍" } else { "👎" });
        println!();
    }

    println!("Final stats: {}", bot.stats());
}
```

---

### Tutorial 3: LLM Router with Learning

Build a router that learns which LLM to use for different query types.

```rust
use ruvector_sona::{SonaEngine, SonaConfig};

/// Represents an LLM model
#[derive(Clone)]
pub struct LLMModel {
    pub name: String,
    pub cost_per_token: f32,
    pub avg_quality: f32,
    pub avg_latency_ms: u32,
}

/// Adaptive LLM router that learns optimal model selection
pub struct AdaptiveLLMRouter {
    engine: SonaEngine,
    models: Vec<LLMModel>,
}

impl AdaptiveLLMRouter {
    pub fn new(models: Vec<LLMModel>) -> Self {
        // Use max_throughput for fast routing decisions
        let config = SonaConfig::max_throughput();

        let engine = SonaEngine::builder()
            .config(config)
            .build();

        Self { engine, models }
    }

    /// Route a query to the best model
    pub fn route(&self, query_embedding: Vec<f32>) -> (usize, &LLMModel) {
        // Apply learned optimizations
        let optimized = self.engine.apply_micro_lora(&query_embedding);

        // Find similar patterns
        let patterns = self.engine.find_patterns(&optimized, 3);

        // Score each model based on patterns and learned preferences
        let mut best_idx = 0;
        let mut best_score = f32::MIN;

        for (idx, model) in self.models.iter().enumerate() {
            let mut score = model.avg_quality;

            // Boost the score if patterns suggest this model works well
            for pattern in &patterns {
                // Pattern centroid similarity affects model preference
                let similarity = cosine_similarity(&optimized, &pattern.centroid);
                if similarity > 0.8 {
                    // High similarity to a successful pattern
                    score += pattern.avg_quality * similarity;
                }
            }

            // Penalize expensive models slightly
            score -= model.cost_per_token * 0.1;

            if score > best_score {
                best_score = score;
                best_idx = idx;
            }
        }

        (best_idx, &self.models[best_idx])
    }

    /// Record the outcome of a routing decision
    pub fn record_outcome(
        &self,
        query_embedding: Vec<f32>,
        selected_model: usize,
        quality: f32,
        latency_ms: u32,
    ) {
        // Start a trajectory
        let traj_id = self.engine.begin_trajectory(query_embedding);

        // Record the selection step
        let model = &self.models[selected_model];
        let activations = vec![
            model.avg_quality,
            model.cost_per_token,
            latency_ms as f32 / 1000.0,
        ];
        let activations_padded: Vec<f32> = activations.into_iter()
            .chain(std::iter::repeat(0.0))
            .take(256)
            .collect();

        let attention = vec![1.0 / 64.0; 64];
        self.engine.add_step(traj_id, activations_padded, attention, quality);

        // Set route info
        self.engine.set_trajectory_route(traj_id, model.name.clone());

        // Complete the trajectory
        self.engine.end_trajectory(traj_id, quality);
    }

    /// Force a background learning cycle
    pub fn learn(&self) -> String {
        self.engine.force_learn()
    }

    pub fn stats(&self) -> String {
        self.engine.get_stats()
    }
}

fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a > 0.0 && norm_b > 0.0 {
        dot / (norm_a * norm_b)
    } else {
        0.0
    }
}

fn main() {
    // Define available models
    let models = vec![
        LLMModel {
            name: "GPT-4".to_string(),
            cost_per_token: 0.03,
            avg_quality: 0.95,
            avg_latency_ms: 2000,
        },
        LLMModel {
            name: "GPT-3.5-Turbo".to_string(),
            cost_per_token: 0.002,
            avg_quality: 0.85,
            avg_latency_ms: 500,
        },
        LLMModel {
            name: "Claude-Instant".to_string(),
            cost_per_token: 0.001,
            avg_quality: 0.80,
            avg_latency_ms: 300,
        },
        LLMModel {
            name: "Local-LLaMA".to_string(),
            cost_per_token: 0.0001,
            avg_quality: 0.70,
            avg_latency_ms: 100,
        },
    ];

    let router = AdaptiveLLMRouter::new(models);

    // Simulate 1000 queries of different types
    println!("Training router with 1000 queries...\n");

    let query_types = vec![
        ("simple", vec![0.1f32; 256], 0.70, "Local-LLaMA"),   // Simple queries work fine locally
        ("medium", vec![0.5f32; 256], 0.85, "GPT-3.5-Turbo"), // Medium needs the cloud
        ("complex", vec![0.9f32; 256], 0.95, "GPT-4"),        // Complex needs the best
    ];

    for i in 0..1000 {
        let (_query_type, base_embedding, target_quality, expected_model) =
            &query_types[i % query_types.len()];

        // Add some variation to the embeddings
        let embedding: Vec<f32> = base_embedding.iter()
            .enumerate()
            .map(|(j, x)| x + (i as f32 * j as f32 * 0.0001).sin() * 0.1)
            .collect();

        // Route the query
        let (model_idx, model) = router.route(embedding.clone());

        // Simulate quality based on model fit
        let quality = if &model.name == *expected_model {
            *target_quality
        } else {
            target_quality - 0.2 // Penalty for the wrong model
        };

        // Record the outcome
        router.record_outcome(embedding, model_idx, quality, model.avg_latency_ms);

        // Periodic learning
        if i % 100 == 0 {
            router.learn();
        }
    }

    // Test the learned routing
    println!("Testing learned routing:\n");

    for (query_type, embedding, _, expected) in &query_types {
        let (_, model) = router.route(embedding.clone());
        let match_status = if &model.name == *expected { "✓" } else { "✗" };
        println!("  {} query → {} {} (expected: {})",
            query_type, model.name, match_status, expected);
    }

    println!("\nRouter stats: {}", router.stats());
}
```

---

### Tutorial 4: Browser-Based Learning (WASM)

Deploy SONA in the browser for client-side learning.

```html
<!DOCTYPE html>
<html>
<head>
  <title>SONA Browser Demo</title>
  <style>
    body { font-family: Arial, sans-serif; max-width: 800px; margin: 0 auto; padding: 20px; }
    .chat { border: 1px solid #ccc; padding: 20px; height: 400px; overflow-y: auto; }
    .message { margin: 10px 0; padding: 10px; border-radius: 5px; }
    .user { background: #e3f2fd; text-align: right; }
    .bot { background: #f5f5f5; }
    .feedback { margin-top: 5px; }
    .feedback button { margin-right: 10px; padding: 5px 15px; cursor: pointer; }
    input { width: 70%; padding: 10px; }
    button.send { padding: 10px 20px; }
    .stats { background: #fff3e0; padding: 10px; margin-top: 20px; font-family: monospace; }
  </style>
</head>
<body>
  <h1>🧠 SONA Browser Demo</h1>
  <p>This chatbot learns from your feedback in real-time, entirely in your browser!</p>

  <div class="chat" id="chat"></div>

  <div style="margin-top: 10px;">
    <input type="text" id="input" placeholder="Type a message..." onkeypress="if(event.key==='Enter')sendMessage()">
    <button class="send" onclick="sendMessage()">Send</button>
  </div>

  <div class="stats" id="stats">Loading SONA...</div>

  <script type="module">
    import init, { WasmSonaEngine } from './pkg/sona.js';

    let engine = null;
    let currentTrajId = null;
    let messageCount = 0;

    // Initialize SONA
    async function initSona() {
      await init();
      engine = new WasmSonaEngine(256);
      updateStats();
      document.getElementById('stats').textContent = 'SONA initialized! Start chatting to train it.';
    }

    // Create embedding from text (simple version)
    function createEmbedding(text) {
      const embedding = new Float32Array(256).fill(0);
      for (let i = 0; i < text.length; i++) {
        const idx = (text.charCodeAt(i) + i) % 256;
        embedding[idx] += 0.1;
      }
      // Normalize
      const norm = Math.sqrt(embedding.reduce((s, x) => s + x * x, 0));
      if (norm > 0) {
        for (let i = 0; i < embedding.length; i++) {
          embedding[i] /= norm;
        }
      }
      return Array.from(embedding);
    }

    // Generate response
    function generateResponse(input, optimizedEmbedding) {
      // Simple response logic (replace with an actual LLM call)
      const responses = {
        greeting: ["Hello! How can I help you?", "Hi there! Nice to meet you!", "Hey! What's on your mind?"],
        question: ["That's a great question!", "Let me think about that...", "Interesting! Here's what I know:"],
        thanks: ["You're welcome!", "Happy to help!", "Anytime!"],
        default: ["I see.", "Tell me more.", "Interesting perspective!"]
      };

      const inputLower = input.toLowerCase();
      let category = 'default';
      if (inputLower.includes('hello') || inputLower.includes('hi')) category = 'greeting';
      else if (inputLower.includes('?')) category = 'question';
      else if (inputLower.includes('thank')) category = 'thanks';

      // Use the optimized embedding to influence response selection
      const idx = Math.floor(Math.abs(optimizedEmbedding[0]) * responses[category].length);
      return responses[category][idx % responses[category].length];
    }

    // Add message to chat
    function addMessage(text, isUser, trajId = null) {
      const chat = document.getElementById('chat');
      const div = document.createElement('div');
      div.className = `message ${isUser ? 'user' : 'bot'}`;
      div.innerHTML = text;
|
|
817
|
+
|
|
818
|
+
if (!isUser && trajId !== null) {
|
|
819
|
+
const feedback = document.createElement('div');
|
|
820
|
+
feedback.className = 'feedback';
|
|
821
|
+
feedback.innerHTML = `
|
|
822
|
+
<button onclick="recordFeedback(${trajId}, true)">👍 Helpful</button>
|
|
823
|
+
<button onclick="recordFeedback(${trajId}, false)">👎 Not helpful</button>
|
|
824
|
+
`;
|
|
825
|
+
div.appendChild(feedback);
|
|
826
|
+
}
|
|
827
|
+
|
|
828
|
+
chat.appendChild(div);
|
|
829
|
+
chat.scrollTop = chat.scrollHeight;
|
|
830
|
+
}
|
|
831
|
+
|
|
832
|
+
// Send message
|
|
833
|
+
window.sendMessage = function() {
|
|
834
|
+
const input = document.getElementById('input');
|
|
835
|
+
const text = input.value.trim();
|
|
836
|
+
if (!text) return;
|
|
837
|
+
|
|
838
|
+
// Add user message
|
|
839
|
+
addMessage(text, true);
|
|
840
|
+
input.value = '';
|
|
841
|
+
|
|
842
|
+
// Start trajectory
|
|
843
|
+
const embedding = createEmbedding(text);
|
|
844
|
+
currentTrajId = engine.begin_trajectory(embedding);
|
|
845
|
+
|
|
846
|
+
// Apply learned optimizations
|
|
847
|
+
const optimized = engine.apply_micro_lora(embedding);
|
|
848
|
+
|
|
849
|
+
// Record step
|
|
850
|
+
const activations = optimized.map(x => Math.tanh(x));
|
|
851
|
+
const attention = new Array(64).fill(1/64);
|
|
852
|
+
engine.add_trajectory_step(currentTrajId, activations, attention, 0.8);
|
|
853
|
+
|
|
854
|
+
// Generate and display response
|
|
855
|
+
const response = generateResponse(text, optimized);
|
|
856
|
+
addMessage(response, false, currentTrajId);
|
|
857
|
+
|
|
858
|
+
messageCount++;
|
|
859
|
+
updateStats();
|
|
860
|
+
};
|
|
861
|
+
|
|
862
|
+
// Record feedback
|
|
863
|
+
window.recordFeedback = function(trajId, wasHelpful) {
|
|
864
|
+
const quality = wasHelpful ? 0.95 : 0.2;
|
|
865
|
+
engine.end_trajectory(trajId, quality);
|
|
866
|
+
|
|
867
|
+
// Run learning
|
|
868
|
+
const result = engine.tick();
|
|
869
|
+
if (result) {
|
|
870
|
+
console.log('Learning cycle:', result);
|
|
871
|
+
}
|
|
872
|
+
|
|
873
|
+
// Disable feedback buttons
|
|
874
|
+
event.target.parentElement.innerHTML = wasHelpful
|
|
875
|
+
? '<span style="color:green">✓ Thanks for the feedback!</span>'
|
|
876
|
+
: '<span style="color:orange">✓ I\'ll try to improve!</span>';
|
|
877
|
+
|
|
878
|
+
updateStats();
|
|
879
|
+
};
|
|
880
|
+
|
|
881
|
+
// Update stats display
|
|
882
|
+
function updateStats() {
|
|
883
|
+
const stats = JSON.parse(engine.get_stats());
|
|
884
|
+
document.getElementById('stats').innerHTML = `
|
|
885
|
+
<strong>SONA Stats:</strong><br>
|
|
886
|
+
Messages: ${messageCount} |
|
|
887
|
+
Patterns learned: ${stats.patterns_stored || 0} |
|
|
888
|
+
Learning cycles: ${stats.background_cycles || 0}
|
|
889
|
+
`;
|
|
890
|
+
}
|
|
891
|
+
|
|
892
|
+
// Initialize
|
|
893
|
+
initSona();
|
|
894
|
+
</script>
|
|
895
|
+
</body>
|
|
896
|
+
</html>
|
|
897
|
+
```
---
### Tutorial 5: Node.js Backend Integration

Production-ready Node.js integration with Express.

```javascript
const express = require('express');
const { SonaEngine } = require('@ruvector/sona');

const app = express();
app.use(express.json());

// Initialize SONA engine
const engine = SonaEngine.withConfig({
  hiddenDim: 256,
  microLoraRank: 2,      // Optimized for SIMD
  microLoraLr: 0.002,    // Optimal learning rate
  patternClusters: 100,  // Fast search
  ewcLambda: 2000,       // Anti-forgetting
  qualityThreshold: 0.3  // Learn from more samples
});

// Track active trajectories
const activeTrajectories = new Map();

// Helper to create embeddings (replace with your embedding service)
function createEmbedding(text) {
  // Simple embedding (use OpenAI/Cohere embeddings in production)
  const embedding = new Array(256).fill(0);
  for (let i = 0; i < text.length; i++) {
    const idx = (text.charCodeAt(i) + i) % 256;
    embedding[idx] += 0.1;
  }
  const norm = Math.sqrt(embedding.reduce((s, x) => s + x * x, 0));
  return embedding.map(x => x / (norm || 1));
}

// Start a new interaction
app.post('/api/query', (req, res) => {
  const { query, sessionId } = req.body;

  // Create embedding
  const embedding = createEmbedding(query);

  // Start trajectory
  const trajId = engine.beginTrajectory(embedding);
  activeTrajectories.set(sessionId, { trajId, embedding, startTime: Date.now() });

  // Apply learned optimizations
  const optimized = engine.applyMicroLora(embedding);

  // Find similar patterns for context
  const patterns = engine.findPatterns(optimized, 3);

  // Record step
  const activations = optimized.map(x => Math.tanh(x));
  const attention = new Array(64).fill(1/64);
  engine.addTrajectoryStep(trajId, activations, attention, 0.8);

  res.json({
    sessionId,
    optimizedEmbedding: optimized,
    similarPatterns: patterns.map(p => ({
      avgQuality: p.avgQuality,
      clusterSize: p.clusterSize,
      patternType: p.patternType
    })),
    message: 'Query processed. Send response quality via /api/feedback'
  });
});

// Record feedback
app.post('/api/feedback', (req, res) => {
  const { sessionId, quality, wasHelpful } = req.body;

  const session = activeTrajectories.get(sessionId);
  if (!session) {
    return res.status(404).json({ error: 'Session not found' });
  }

  // Calculate quality score
  const qualityScore = quality ?? (wasHelpful ? 0.9 : 0.2);

  // Complete trajectory
  engine.endTrajectory(session.trajId, qualityScore);

  // Run learning tick
  const learnResult = engine.tick();

  // Clean up
  activeTrajectories.delete(sessionId);

  res.json({
    success: true,
    quality: qualityScore,
    latencyMs: Date.now() - session.startTime,
    learned: learnResult !== null
  });
});

// Force learning cycle
app.post('/api/learn', (req, res) => {
  const result = engine.forceLearn();
  res.json({
    success: true,
    result,
    stats: JSON.parse(engine.getStats())
  });
});

// Get stats
app.get('/api/stats', (req, res) => {
  res.json(JSON.parse(engine.getStats()));
});

// Health check
app.get('/health', (req, res) => {
  res.json({
    status: 'healthy',
    engine: engine.isEnabled() ? 'active' : 'disabled'
  });
});

// Background learning (run hourly)
setInterval(() => {
  console.log('Running background learning cycle...');
  const result = engine.forceLearn();
  console.log('Learning complete:', result);
}, 60 * 60 * 1000); // Every hour

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`SONA server running on port ${PORT}`);
  console.log('Stats:', engine.getStats());
});
```
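
The `createEmbedding` stub above is only a placeholder. A hedged sketch of swapping in a real embedding service, assuming an OpenAI-style `/v1/embeddings` REST endpoint (the endpoint shape, model name, and `dimensions` field are assumptions about the provider's API; check its reference before using this). The `l2normalize` helper keeps vectors unit-length, matching what the stub produces:

```javascript
// L2-normalize a vector so cosine similarity behaves consistently
function l2normalize(v) {
  const norm = Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return v.map(x => x / (norm || 1));
}

// Hypothetical swap-in: fetch a 256-dim embedding from an external service.
// The URL, model, and `dimensions` parameter are assumptions - adapt them to
// whatever embedding provider you actually use.
async function embedText(text) {
  const res = await fetch('https://api.openai.com/v1/embeddings', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({ model: 'text-embedding-3-small', input: text, dimensions: 256 }),
  });
  const json = await res.json();
  return l2normalize(json.data[0].embedding);
}
```

Whatever service you use, keep the output dimension equal to the engine's `hiddenDim` (256 here), or the trajectory calls will reject the vectors.
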
**Usage:**

```bash
# Start server
node server.js

# Test endpoints
curl -X POST http://localhost:3000/api/query \
  -H "Content-Type: application/json" \
  -d '{"query": "How do I reset my password?", "sessionId": "abc123"}'

curl -X POST http://localhost:3000/api/feedback \
  -H "Content-Type: application/json" \
  -d '{"sessionId": "abc123", "wasHelpful": true}'

curl http://localhost:3000/api/stats
```

---
### Tutorial 6: Production Deployment

Best practices for deploying SONA in production.

```rust
use ruvector_sona::{SonaEngine, SonaConfig};
use std::sync::Arc;
use tokio::sync::RwLock;
use tokio::time::{interval, Duration};

/// Production-ready SONA wrapper
pub struct ProductionSona {
    engine: Arc<RwLock<SonaEngine>>,
    metrics: Arc<RwLock<Metrics>>,
}

#[derive(Default)]
pub struct Metrics {
    pub total_requests: u64,
    pub total_learning_cycles: u64,
    pub positive_feedback: u64,
    pub negative_feedback: u64,
    pub avg_latency_us: f64,
}

impl ProductionSona {
    pub async fn new() -> Self {
        // Use optimized defaults
        let config = SonaConfig::default();

        let engine = SonaEngine::builder()
            .config(config)
            .build();

        let instance = Self {
            engine: Arc::new(RwLock::new(engine)),
            metrics: Arc::new(RwLock::new(Metrics::default())),
        };

        // Start background tasks
        instance.start_background_tasks().await;

        instance
    }

    async fn start_background_tasks(&self) {
        let engine = self.engine.clone();
        let metrics = self.metrics.clone();

        // Hourly learning cycle
        tokio::spawn(async move {
            let mut interval = interval(Duration::from_secs(3600));
            loop {
                interval.tick().await;

                let mut engine = engine.write().await;
                let result = engine.force_learn();

                let mut m = metrics.write().await;
                m.total_learning_cycles += 1;

                tracing::info!("Background learning completed: {}", result);
            }
        });

        // Metrics logging (every 5 minutes)
        let metrics_clone = self.metrics.clone();
        tokio::spawn(async move {
            let mut interval = interval(Duration::from_secs(300));
            loop {
                interval.tick().await;
                let m = metrics_clone.read().await;
                tracing::info!(
                    "SONA Metrics - Requests: {}, Learning: {}, Positive: {}, Negative: {}",
                    m.total_requests,
                    m.total_learning_cycles,
                    m.positive_feedback,
                    m.negative_feedback
                );
            }
        });
    }

    /// Process a query with full observability
    pub async fn process(&self, embedding: Vec<f32>) -> ProcessResult {
        let start = std::time::Instant::now();

        let engine = self.engine.read().await;

        // Start trajectory
        let traj_id = engine.begin_trajectory(embedding.clone());

        // Apply optimizations
        let optimized = engine.apply_micro_lora(&embedding);

        // Find patterns
        let patterns = engine.find_patterns(&optimized, 5);

        // Update metrics
        let latency = start.elapsed().as_micros() as u64;
        {
            let mut m = self.metrics.write().await;
            m.total_requests += 1;
            m.avg_latency_us = (m.avg_latency_us * (m.total_requests - 1) as f64
                + latency as f64) / m.total_requests as f64;
        }

        ProcessResult {
            trajectory_id: traj_id,
            optimized_embedding: optimized,
            similar_patterns: patterns.into_iter().map(|p| PatternInfo {
                quality: p.avg_quality,
                cluster_size: p.cluster_size,
            }).collect(),
            latency_us: latency,
        }
    }

    /// Record a step in a trajectory
    pub async fn record_step(
        &self,
        traj_id: u64,
        activations: Vec<f32>,
        attention: Vec<f32>,
        reward: f32,
    ) {
        let engine = self.engine.read().await;
        engine.add_trajectory_step(traj_id, activations, attention, reward);
    }

    /// Complete a trajectory with feedback
    pub async fn complete(&self, traj_id: u64, quality: f32, was_positive: bool) {
        {
            let engine = self.engine.read().await;
            engine.end_trajectory(traj_id, quality);
        }

        // Update metrics
        let mut m = self.metrics.write().await;
        if was_positive {
            m.positive_feedback += 1;
        } else {
            m.negative_feedback += 1;
        }
    }

    /// Get current statistics
    pub async fn stats(&self) -> Stats {
        let engine = self.engine.read().await;
        let engine_stats = engine.get_stats();

        let m = self.metrics.read().await;

        Stats {
            engine_stats,
            total_requests: m.total_requests,
            total_learning_cycles: m.total_learning_cycles,
            positive_feedback: m.positive_feedback,
            negative_feedback: m.negative_feedback,
            avg_latency_us: m.avg_latency_us,
            feedback_ratio: if m.positive_feedback + m.negative_feedback > 0 {
                m.positive_feedback as f64 / (m.positive_feedback + m.negative_feedback) as f64
            } else {
                0.0
            },
        }
    }
}

pub struct ProcessResult {
    pub trajectory_id: u64,
    pub optimized_embedding: Vec<f32>,
    pub similar_patterns: Vec<PatternInfo>,
    pub latency_us: u64,
}

pub struct PatternInfo {
    pub quality: f32,
    pub cluster_size: usize,
}

pub struct Stats {
    pub engine_stats: String,
    pub total_requests: u64,
    pub total_learning_cycles: u64,
    pub positive_feedback: u64,
    pub negative_feedback: u64,
    pub avg_latency_us: f64,
    pub feedback_ratio: f64,
}
```

---
## Configuration Guide

### Optimized Defaults (v0.1.1)

The default configuration is optimized based on extensive benchmarks:

```rust
SonaConfig {
    hidden_dim: 256,
    embedding_dim: 256,
    micro_lora_rank: 2,              // 5% faster than rank-1 (better SIMD)
    base_lora_rank: 8,
    micro_lora_lr: 0.002,            // +55% quality improvement
    base_lora_lr: 0.0001,
    ewc_lambda: 2000.0,              // Better forgetting prevention
    pattern_clusters: 100,           // 2.3x faster search
    trajectory_capacity: 10000,
    background_interval_ms: 3600000, // 1 hour
    quality_threshold: 0.3,          // Learn from more samples
    enable_simd: true,
}
```

### Configuration Presets

```rust
// For real-time chat applications
let config = SonaConfig::max_throughput();

// For research/batch processing (best quality)
let config = SonaConfig::max_quality();

// For mobile/edge devices (<5MB memory)
let config = SonaConfig::edge_deployment();

// For high-throughput batch processing
let config = SonaConfig::batch_processing();
```

### Custom Configuration

```rust
let config = SonaConfig {
    // Embedding dimensions (match your model)
    hidden_dim: 512,
    embedding_dim: 512,

    // LoRA settings
    micro_lora_rank: 2,    // 1-2 for speed; keep at 2 for SIMD
    base_lora_rank: 16,    // 4-16 for expressiveness
    micro_lora_lr: 0.002,  // Higher = faster learning, risk of instability
    base_lora_lr: 0.0001,  // Lower = stable consolidation

    // Memory protection
    ewc_lambda: 2000.0,    // Higher = stronger protection against forgetting

    // Pattern storage
    pattern_clusters: 100, // More clusters = faster search, more memory
    trajectory_capacity: 20000,

    // Learning triggers
    background_interval_ms: 1800000, // 30 minutes
    quality_threshold: 0.2,          // Lower = learn from more trajectories

    // Performance
    enable_simd: true,
};
```
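
`trajectory_capacity` is the main memory knob. A quick way to sanity-check it against your memory budget, using the ~800 B-per-trajectory figure from the benchmarks below (the helper name is illustrative and the per-trajectory size is an approximation, not a guarantee):

```javascript
// Rough memory estimate for the trajectory buffer, in megabytes.
// 800 bytes/trajectory is the approximate figure from the benchmarks table.
function estimateTrajectoryBufferMB(capacity, bytesPerTrajectory = 800) {
  return (capacity * bytesPerTrajectory) / 1e6;
}

estimateTrajectoryBufferMB(10000); // default capacity -> about 8 MB
estimateTrajectoryBufferMB(20000); // the custom config above -> about 16 MB
```
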
---
## API Reference

### SonaEngine

| Method | Description | Typical Latency |
|--------|-------------|-----------------|
| `new(hidden_dim)` | Create with default config | - |
| `with_config(config)` | Create with custom config | - |
| `builder()` | Start building configuration | - |
| `begin_trajectory(embedding)` | Start recording interaction | ~50ns |
| `add_trajectory_step(id, activations, attention, reward)` | Add step | ~112ns |
| `set_trajectory_route(id, route)` | Set model route | ~20ns |
| `add_trajectory_context(id, context)` | Add context | ~20ns |
| `end_trajectory(id, quality)` | Complete with quality | ~100ns |
| `apply_micro_lora(input)` | Fast transformation | ~45μs |
| `apply_base_lora(layer, input)` | Deep transformation | ~25μs |
| `tick()` | Run learning if due | ~34μs |
| `force_learn()` | Force background cycle | ~5ms |
| `flush()` | Flush instant updates | ~10μs |
| `find_patterns(embedding, k)` | Find similar patterns | ~100μs |
| `get_stats()` | Get JSON statistics | ~1μs |
| `set_enabled(bool)` | Enable/disable engine | ~1ns |
| `is_enabled()` | Check if enabled | ~1ns |

### JsSonaConfig (Node.js)

```typescript
interface JsSonaConfig {
  hiddenDim: number;             // Required
  embeddingDim?: number;         // Default: hiddenDim
  microLoraRank?: number;        // Default: 2
  baseLoraRank?: number;         // Default: 8
  microLoraLr?: number;          // Default: 0.002
  baseLoraLr?: number;           // Default: 0.0001
  ewcLambda?: number;            // Default: 2000
  patternClusters?: number;      // Default: 100
  trajectoryCapacity?: number;   // Default: 10000
  backgroundIntervalMs?: number; // Default: 3600000
  qualityThreshold?: number;     // Default: 0.3
  enableSimd?: boolean;          // Default: true
}
```

### JsLearnedPattern (Node.js)

```typescript
interface JsLearnedPattern {
  id: string;
  centroid: number[];
  clusterSize: number;
  totalWeight: number;
  avgQuality: number;
  createdAt: string;
  lastAccessed: string;
  accessCount: number;
  patternType: string;
}
```

---
## Benchmarks

### Performance Results (v0.1.1)

| Operation | Target | Achieved | Improvement |
|-----------|--------|----------|-------------|
| MicroLoRA Forward (256d) | <100μs | **45μs** | 2.2x better |
| Trajectory Recording | <1μs | **112ns** | 9x better |
| Instant Learning Cycle | <1ms | **34μs** | 29x better |
| Pattern Search (100 clusters) | <5ms | **1.3ms** | 3.8x better |
| Background Learning | <10ms | **~5ms** | 2x better |
| Memory per Trajectory | <1KB | **~800B** | 20% better |

### Throughput Benchmarks

| Scenario | Ops/Second | Latency (p99) |
|----------|------------|---------------|
| MicroLoRA Rank-2 (SIMD) | 2,211 | 0.85ms |
| MicroLoRA Rank-1 | 2,100 | 0.90ms |
| Batch Size 32 | 2,236 | 0.45ms/vector |
| Pattern Search (k=5) | 770 | 1.5ms |

### Running Benchmarks

```bash
# Run all benchmarks
cargo bench -p ruvector-sona

# Run specific benchmark
cargo bench -p ruvector-sona -- micro_lora

# With detailed output
cargo bench -p ruvector-sona -- --verbose
```

---
## Troubleshooting

### Common Issues

**1. "MicroLoRA rank must be 1-2"**
```rust
// Wrong
let config = SonaConfig { micro_lora_rank: 4, ..Default::default() };

// Correct - MicroLoRA is limited to rank 1-2 for speed
let config = SonaConfig { micro_lora_rank: 2, ..Default::default() };

// For higher ranks, use BaseLoRA
let config = SonaConfig { base_lora_rank: 16, ..Default::default() };
```

**2. Embedding dimension mismatch**
```rust
// Engine expects 256-dim embeddings
let engine = SonaEngine::new(256);

// Wrong - 512-dim embedding
let embedding = vec![0.1f32; 512]; // Panic!

// Correct
let embedding = vec![0.1f32; 256];
let traj_id = engine.begin_trajectory(embedding);
```

**3. Low quality scores not learning**
```rust
// If quality_threshold is 0.5, scores below it won't trigger learning
let config = SonaConfig {
    quality_threshold: 0.5, // Only learns from quality >= 0.5
    ..Default::default()
};

// Lower the threshold to learn from more feedback
let config = SonaConfig {
    quality_threshold: 0.2, // Learns from quality >= 0.2
    ..Default::default()
};
```

**4. Memory growing unbounded**
```rust
// Limit the trajectory buffer
let config = SonaConfig {
    trajectory_capacity: 10000, // Max trajectories in memory
    ..Default::default()
};

// Force learning to clear the buffer
engine.force_learn();
```

### Performance Optimization Tips

1. **Use Rank-2 MicroLoRA** - 5% faster than rank-1 due to SIMD alignment
2. **Batch inputs when possible** - the optimal batch size is 32
3. **Use 100 pattern clusters** - 2.3x faster search than 50
4. **Enable SIMD** - ~10% speedup on supported CPUs
5. **Run background learning during low-traffic periods**
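
Tip 2 can be sketched concretely: split work into fixed-size batches before calling into the engine. The `chunk` helper below is plain JavaScript; the `engine.applyMicroLora` call in the usage comment refers to the Node API shown in Tutorial 5:

```javascript
// Split an array into fixed-size batches (the benchmarks suggest 32 is optimal).
function chunk(items, size = 32) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Usage sketch: process embeddings batch-by-batch instead of one at a time.
// for (const batch of chunk(embeddings)) {
//   const results = batch.map(e => engine.applyMicroLora(e));
// }
```
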
---
## License

Licensed under either of:

- Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE))
- MIT License ([LICENSE-MIT](LICENSE-MIT))

at your option.

## Contributing

Contributions welcome! Please see our [Contributing Guide](https://github.com/ruvnet/ruvector/blob/main/CONTRIBUTING.md).

## Acknowledgments

- [LoRA Paper](https://arxiv.org/abs/2106.09685) - Low-Rank Adaptation
- [EWC Paper](https://arxiv.org/abs/1612.00796) - Elastic Weight Consolidation
- [K-means++](https://theory.stanford.edu/~sergei/papers/kMeansPP-soda.pdf) - Initialization algorithm

---

<div align="center">

**[Documentation](https://docs.rs/ruvector-sona)** | **[GitHub](https://github.com/ruvnet/ruvector)** | **[npm](https://www.npmjs.com/package/@ruvector/sona)** | **[crates.io](https://crates.io/crates/ruvector-sona)**

Made with 🦀 Rust by the RuVector Team

</div>