@nxuss/lemma 0.3.2 → 0.4.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +40 -484
- package/dashboard/README.md +351 -0
- package/dashboard/dist/assets/index-CIx8ECj8.css +1 -0
- package/dashboard/dist/assets/index-zTlIPJOp.js +478 -0
- package/dashboard/dist/assets/index-zTlIPJOp.js.map +1 -0
- package/dashboard/dist/index.html +14 -0
- package/dist/cjs/cloud/TenantCache.d.ts +1 -0
- package/dist/cjs/cloud/TenantCache.d.ts.map +1 -1
- package/dist/cjs/cloud/TenantCache.js +25 -3
- package/dist/cjs/cloud/TenantCache.js.map +1 -1
- package/dist/cjs/config/index.d.ts.map +1 -1
- package/dist/cjs/config/index.js +4 -0
- package/dist/cjs/config/index.js.map +1 -1
- package/dist/cjs/index.d.ts +3 -1
- package/dist/cjs/index.d.ts.map +1 -1
- package/dist/cjs/index.js +29 -0
- package/dist/cjs/index.js.map +1 -1
- package/dist/cjs/observability/IdeContextSync.d.ts +39 -0
- package/dist/cjs/observability/IdeContextSync.d.ts.map +1 -0
- package/dist/cjs/observability/IdeContextSync.js +169 -0
- package/dist/cjs/observability/IdeContextSync.js.map +1 -0
- package/dist/cjs/types/index.d.ts +11 -0
- package/dist/cjs/types/index.d.ts.map +1 -1
- package/dist/cjs/types/index.js.map +1 -1
- package/dist/esm/cloud/TenantCache.d.ts +1 -0
- package/dist/esm/cloud/TenantCache.d.ts.map +1 -1
- package/dist/esm/cloud/TenantCache.js +25 -3
- package/dist/esm/cloud/TenantCache.js.map +1 -1
- package/dist/esm/config/index.d.ts.map +1 -1
- package/dist/esm/config/index.js +4 -0
- package/dist/esm/config/index.js.map +1 -1
- package/dist/esm/index.d.ts +3 -1
- package/dist/esm/index.d.ts.map +1 -1
- package/dist/esm/index.js +29 -0
- package/dist/esm/index.js.map +1 -1
- package/dist/esm/observability/IdeContextSync.d.ts +39 -0
- package/dist/esm/observability/IdeContextSync.d.ts.map +1 -0
- package/dist/esm/observability/IdeContextSync.js +162 -0
- package/dist/esm/observability/IdeContextSync.js.map +1 -0
- package/dist/esm/types/index.d.ts +11 -0
- package/dist/esm/types/index.d.ts.map +1 -1
- package/dist/esm/types/index.js.map +1 -1
- package/lemma-proxy.cjs +66 -12
- package/package.json +24 -14
- package/src/cloud/CloudSyncClient.js +35 -0
- package/src/protocol/README.md +576 -0
- package/src/proxy/ComplexityRouter.js +37 -0
- package/src/security/SemanticScrubber.js +54 -0
- package/src/speculative/worker.js +96 -0
package/README.md
CHANGED
|
@@ -1,522 +1,78 @@
|
|
|
1
|
-
# Lemma v0.
|
|
2
|
-
> **The
|
|
1
|
+
# Lemma v0.4.0
|
|
2
|
+
> **The Intelligent AI Gateway β Privacy, Performance, and Precision for the Agentic Era.**
|
|
3
3
|
|
|
4
|
-
Lemma is
|
|
5
|
-
|
|
6
|
-
### Why Lemma?
|
|
7
|
-
- 💰 **Stop paying twice**: Lemma caches redundant queries semantically. "Fix this bug" and "Solve this error" return the same cached answer.
|
|
8
|
-
- ⚡ **Instant responses**: 3ms cache hits vs 2000ms LLM calls.
|
|
9
|
-
- **Universal Gateway**: One endpoint for OpenAI, Anthropic, and Gemini.
|
|
10
|
-
- **Agent Swarms**: Orchestrate multiple agents with shared memory.
|
|
4
|
+
Lemma is a high-performance orchestration layer that sits between your development environment and LLM providers. It transforms the way you build with AI by providing **Shared Semantic Memory**, **Autonomous Cost Optimization**, and **Runtime Context Synchronization**.
|
|
11
5
|
|
|
12
6
|
---
|
|
13
7
|
|
|
14
|
-
## ⚡
|
|
15
|
-
|
|
16
|
-
Install and launch the proxy to start saving on your API bills immediately.
|
|
17
|
-
|
|
18
|
-
```bash
|
|
19
|
-
npm install -g @nxuss/lemma
|
|
20
|
-
lemma start
|
|
21
|
-
```
|
|
22
|
-
|
|
23
|
-
**Configure your IDE:**
|
|
24
|
-
- **Base URL:** `http://localhost:8080/v1`
|
|
25
|
-
- **Gemini Base:** `http://localhost:8080/v1beta`
|
|
26
|
-
|
|
27
|
-
- **Free Tier**: 300 queries/month + Exact Matching.
|
|
28
|
-
- **Pro**: Unlimited queries + **Semantic Caching** ($12/mo or $120/yr).
|
|
29
|
-
- ☁️ **Cloud**: Managed infrastructure (Coming Soon).
|
|
30
|
-
|
|
31
|
-
**[Get Lemma Pro](https://lemma.nxus.studio/upgrade)**
|
|
32
|
-
|
|
33
|
-
### Option 2: Multi-Agent System
|
|
34
|
-
|
|
35
|
-
For building coordinated AI agent systems:
|
|
8
|
+
## ⚡ Killer Features
|
|
36
9
|
|
|
37
|
-
|
|
38
|
-
|
|
39
|
-
```
|
|
40
|
-
|
|
41
|
-
**[Multi-Agent Guide](#quick-start)**
|
|
10
|
+
### 🛡️ Privacy Firewall (Semantic Scrubber)
|
|
11
|
+
**Zero-Trust Prompts.** Stop leaking sensitive data. Lemma automatically detects API keys, PII, and credentials in your prompts, masking them with secure tokens before they reach the cloud. Responses are seamlessly reconstructed locally, ensuring your secrets never leave your machine.
|
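As an illustration of the mask-then-restore flow described for the Privacy Firewall above, here is a minimal TypeScript sketch. It is not Lemma's `SemanticScrubber` code: the `scrub` and `restore` helpers, the `<SECRET_n>` token format, and the two regex patterns are assumptions chosen for the example.

```typescript
// Hypothetical sketch of mask-then-restore; not Lemma's actual implementation.
interface ScrubResult {
  masked: string;                       // prompt that is safe to send upstream
  vault: { [token: string]: string };   // token -> original secret, kept locally
}

// Illustrative patterns only; a real scrubber would need far broader coverage.
const SECRET_PATTERNS: RegExp[] = [
  /sk-[A-Za-z0-9]{20,}/g,               // OpenAI-style API keys
  /[\w.+-]+@[\w-]+\.[\w.-]+/g,          // email addresses
];

function scrub(prompt: string): ScrubResult {
  const vault: { [token: string]: string } = {};
  let counter = 0;
  let masked = prompt;
  for (const pattern of SECRET_PATTERNS) {
    // Replace every match with a placeholder and remember the original value.
    masked = masked.replace(pattern, (match: string) => {
      const token = "<SECRET_" + counter + ">";
      counter += 1;
      vault[token] = match;
      return token;
    });
  }
  return { masked, vault };
}

function restore(response: string, vault: { [token: string]: string }): string {
  let restored = response;
  // Put each original value back wherever its placeholder appears.
  for (const token of Object.keys(vault)) {
    restored = restored.split(token).join(vault[token]);
  }
  return restored;
}

const scrubbed = scrub("Email bob@example.com using key sk-abcdefghijklmnopqrstuvwx");
const roundTrip = restore(scrubbed.masked, scrubbed.vault);
```

In this sketch only `scrubbed.masked` would leave the machine. The `vault` stays in local memory and is used to rebuild the provider's response.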
|
42
12
|
|
|
43
|
-
|
|
13
|
+
### 🚦 Complexity Router (Cost-Optimizer)
|
|
14
|
+
**Intelligence Where it Matters.** Lemma analyzes the semantic complexity of every request. It autonomously routes lightweight tasks (like JSON formatting or translations) to hyper-efficient models like `gpt-4o-mini`, reserving premium models for high-reasoning challenges. Save up to 90% on simple tasks without losing quality.
|
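The routing idea in the Complexity Router entry above could look roughly like the following. This is a deliberately crude heuristic for illustration, not the logic in `ComplexityRouter.js`: the scoring function, the hint words, the `0.3` threshold, and `gpt-4o` as the premium model are all assumptions. Only `gpt-4o-mini` is named in the README.

```typescript
// Hypothetical heuristic; Lemma's real router is not documented in this README.
interface RouteDecision {
  model: string;
  score: number;
}

const CHEAP_MODEL = "gpt-4o-mini";  // named in the README
const PREMIUM_MODEL = "gpt-4o";     // assumption: placeholder premium model

// Words that loosely suggest a reasoning-heavy request (illustrative list).
const REASONING_HINTS = ["prove", "architecture", "debug", "optimize", "refactor", "why"];

function scoreComplexity(prompt: string): number {
  const lower = prompt.toLowerCase();
  const words = lower.split(/\s+/).filter((w) => w.length > 0);
  let score = Math.min(words.length / 200, 1);   // longer prompts score higher
  for (const hint of REASONING_HINTS) {
    if (lower.indexOf(hint) !== -1) score += 0.25;
  }
  return Math.min(score, 1);                     // clamp to [0, 1]
}

function route(prompt: string, threshold: number = 0.3): RouteDecision {
  const score = scoreComplexity(prompt);
  return { model: score >= threshold ? PREMIUM_MODEL : CHEAP_MODEL, score };
}

const simpleTask = route('Format this JSON: {"a":1}');
const hardTask = route("Debug why this recursive parser overflows the stack and refactor it");
```

Here `simpleTask.model` resolves to the cheap model and `hardTask.model` to the premium one. A production router would likely rely on embeddings or a classifier, not on keyword matching.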
|
44
15
|
|
|
45
|
-
|
|
16
|
+
### 🧠 Telepathic Context Injector (Runtime Sync)
|
|
17
|
+
**Bridge the Gap Between Code and Execution.** Lemma synchronizes your application's live runtime state, exceptions, and memory mutations directly with your IDE's consciousness. Your AI assistant gains immediate "situational awareness" of crashes and state changes, allowing for instant, context-aware debugging without manual intervention.
|
|
46
18
|
|
|
47
|
-
|
|
48
|
-
|
|
49
|
-
```
|
|
50
|
-
"How to implement JWT in Express?"
|
|
51
|
-
"Explain JWT authentication in Node.js" β Same answer, paid twice
|
|
52
|
-
"Show me JWT example for Express" β Same answer, paid three times
|
|
53
|
-
```
|
|
54
|
-
|
|
55
|
-
**Lemma Proxy** intercepts these calls and returns cached responses for similar prompts in 3ms instead of 600ms, saving you money and time.
|
|
19
|
+
### ⚡ Shared Semantic Cache
|
|
20
|
+
**Stop Paying for the Same Thought Twice.** Unlike traditional caches, Lemma understands meaning. It recognizes that *"Fix this CSS bug"* and *"Solve the styling error"* are functionally identical, returning instant (3ms) responses and saving 40-70% on total API expenditure.
|
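To make "understands meaning" concrete, here is a small sketch of a similarity-threshold lookup of the kind the Shared Semantic Cache entry above describes. It assumes prompts have already been turned into embedding vectors by some model. The `SemanticCacheSketch` class is hypothetical, and the `0.85` default mirrors the threshold shown in the previous README, not a confirmed 0.4.0 setting.

```typescript
// Hypothetical sketch: nearest-neighbour lookup with a cosine-similarity cutoff.
interface CacheEntry {
  embedding: number[];
  response: string;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;      // avoid division by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

class SemanticCacheSketch {
  private entries: CacheEntry[] = [];
  private threshold: number;

  constructor(threshold: number = 0.85) {
    this.threshold = threshold;
  }

  set(embedding: number[], response: string): void {
    this.entries.push({ embedding, response });
  }

  // Returns the closest stored response, or undefined when nothing is similar enough.
  get(embedding: number[]): string | undefined {
    let best: CacheEntry | undefined;
    let bestScore = -1;
    for (const entry of this.entries) {
      const score = cosineSimilarity(embedding, entry.embedding);
      if (score > bestScore) {
        bestScore = score;
        best = entry;
      }
    }
    return best !== undefined && bestScore >= this.threshold ? best.response : undefined;
  }
}

// Toy 3-dimensional vectors stand in for real embeddings.
const demoCache = new SemanticCacheSketch();
demoCache.set([1, 0, 0], "cached answer");
const nearHit = demoCache.get([0.9, 0.1, 0]);   // close to the stored vector
const farMiss = demoCache.get([0, 1, 0]);       // orthogonal to the stored vector
```

A linear scan is fine for a sketch. A vector store such as ChromaDB, which the tier table mentions, would replace it at scale.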
|
56
21
|
|
|
57
22
|
---
|
|
58
23
|
|
|
59
|
-
##
|
|
60
|
-
|
|
61
|
-
When you run multiple AI agents in parallel, they don't share context. Agent A solves a problem. Agent B gets the same problem 10 minutes later and solves it again. You pay twice, wait twice, and get the same answer.
|
|
24
|
+
## Quick Start
|
|
62
25
|
|
|
63
|
-
|
|
26
|
+
Launch the Lemma ecosystem in seconds:
|
|
64
27
|
|
|
65
|
-
```
|
|
66
|
-
|
|
28
|
+
```bash
|
|
29
|
+
npm install -g @nxuss/lemma
|
|
30
|
+
lemma start --stack
|
|
67
31
|
```
|
|
68
32
|
|
|
69
|
-
|
|
33
|
+
**Point your IDE or SDK to Lemma:**
|
|
34
|
+
* **Base URL:** `http://localhost:8085/v1` (OpenAI / Anthropic Compatible)
|
|
35
|
+
* **Gemini Base:** `http://localhost:8085/v1beta`
|
|
36
|
+
* **Dashboard:** `http://localhost:3000`
|
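For SDK-less use, the endpoints listed above can be targeted directly. The sketch below only builds the request and does not send it. It assumes Lemma mirrors the OpenAI REST layout, so that `/chat/completions` sits under the listed base URL, and that the provider key is passed through as a bearer token. Neither detail is confirmed by this README.

```typescript
// Hypothetical request builder; the path and auth header follow OpenAI conventions (assumption).
const LEMMA_BASE_URL = "http://localhost:8085/v1";   // from the Quick Start list

interface ChatRequest {
  url: string;
  method: string;
  headers: { [key: string]: string };
  body: string;
}

function buildChatRequest(apiKey: string, prompt: string, model: string = "gpt-4o-mini"): ChatRequest {
  return {
    url: LEMMA_BASE_URL + "/chat/completions",
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": "Bearer " + apiKey,
    },
    body: JSON.stringify({
      model: model,
      messages: [{ role: "user", content: prompt }],
    }),
  };
}

const chatRequest = buildChatRequest("YOUR_PROVIDER_KEY", "Fix this CSS bug");
// To send it: fetch(chatRequest.url, { method: chatRequest.method, headers: chatRequest.headers, body: chatRequest.body })
```

Official SDKs that accept a base URL option would only need that one setting changed.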
|
70
37
|
|
|
71
38
|
---
|
|
72
39
|
|
|
73
|
-
##
|
|
74
|
-
|
|
75
|
-
```
|
|
76
|
-
Agent A ──┐
|
|
77
|
-
Agent B ──┤──► SubconsciousHub ──► Semantic Cache (ChromaDB + embeddings)
|
|
78
|
-
Agent C ──┘ │
|
|
79
|
-
└──► Agent Registry + Capability Routing
|
|
80
|
-
```
|
|
81
|
-
|
|
82
|
-
1. Agents connect via WebSocket and register their capabilities
|
|
83
|
-
2. Every task request hits the semantic cache first
|
|
84
|
-
3. On a miss, the hub routes to a capable agent and stores the result
|
|
85
|
-
4. On a hit, the response returns in ~20ms: no agent invoked, no LLM called
|
|
40
|
+
## Tier Comparison
|
|
86
41
|
|
|
87
|
-
|
|
42
|
+
| Feature | Free (Open-Core) | Pro ($12/mo) |
|
|
43
|
+
| :--- | :--- | :--- |
|
|
44
|
+
| **Caching** | Exact Match | **Semantic Memory (ChromaDB)** |
|
|
45
|
+
| **Security** | Basic Logging | **Privacy Firewall (Auto-Masking)** |
|
|
46
|
+
| **Optimization** | Manual Routing | **Autonomous Complexity Router** |
|
|
47
|
+
| **IDE Sync** | Raw Log Stream | **Telepathic Context Injector** |
|
|
48
|
+
| **Continuity** | Local Only | **Cloud Sync (Team Memory)** |
|
|
49
|
+
| **Limits** | 300 requests/mo | **Unlimited Agentic Power** |
|
|
88
50
|
|
|
89
51
|
---
|
|
90
52
|
|
|
91
|
-
##
|
|
53
|
+
## 🛠️ Developer Integration
|
|
92
54
|
|
|
93
|
-
###
|
|
55
|
+
### Use as an Intelligent Proxy
|
|
56
|
+
Simply swap your OpenAI/Anthropic base URL in your favorite tools (Cursor, Copilot, AutoGPT). Lemma handles the caching, scrubbing, and routing transparently.
|
|
94
57
|
|
|
95
58
|
```bash
|
|
96
|
-
|
|
97
|
-
|
|
98
|
-
# For semantic mode (optional, lightweight embeddings):
|
|
99
|
-
npm install @xenova/transformers
|
|
100
|
-
|
|
101
|
-
# For persistent storage with ChromaDB (optional):
|
|
102
|
-
pip install chromadb
|
|
103
|
-
chroma run --path ./chroma_data --port 8000
|
|
104
|
-
|
|
105
|
-
# For ChromaDB embeddings (optional):
|
|
106
|
-
ollama pull nomic-embed-text
|
|
107
|
-
```
|
|
108
|
-
|
|
109
|
-
### 2. Choose your mode
|
|
110
|
-
|
|
111
|
-
#### Option A: Semantic Mode (Recommended) ⚡
|
|
112
|
-
|
|
113
|
-
Zero external dependencies, true semantic matching:
|
|
114
|
-
|
|
115
|
-
```typescript
|
|
116
|
-
import { Lemma } from '@nxuss/lemma/embed';
|
|
117
|
-
|
|
118
|
-
const lemma = await Lemma.create({
|
|
119
|
-
storage: 'semantic', // Uses transformers.js
|
|
120
|
-
threshold: 0.85, // Similarity threshold
|
|
121
|
-
});
|
|
122
|
-
|
|
123
|
-
const cachedLLM = lemma.wrap(async (query: string) => {
|
|
124
|
-
return await yourLLMCall(query);
|
|
125
|
-
});
|
|
126
|
-
|
|
127
|
-
await cachedLLM('weather in San Francisco'); // Calls LLM
|
|
128
|
-
await cachedLLM('SF weather forecast'); // Cache HIT! ⚡
|
|
129
|
-
await cachedLLM('San Francisco temperature'); // Cache HIT! ⚡
|
|
130
|
-
```
|
|
131
|
-
|
|
132
|
-
#### Option B: Memory Mode (Fastest)
|
|
133
|
-
|
|
134
|
-
Exact matching, zero dependencies:
|
|
135
|
-
|
|
136
|
-
```typescript
|
|
137
|
-
const lemma = await Lemma.create({
|
|
138
|
-
storage: 'memory', // Default, exact match only
|
|
139
|
-
});
|
|
59
|
+
# Point your configuration to:
|
|
60
|
+
http://localhost:8085/v1
|
|
140
61
|
```
|
|
141
62
|
|
|
142
|
-
|
|
143
|
-
|
|
144
|
-
#### Option C: Server Mode (Multi-Agent)
|
|
145
|
-
|
|
146
|
-
For multi-agent orchestration:
|
|
147
|
-
|
|
148
|
-
```typescript
|
|
149
|
-
import { SubconsciousHub } from '@nxuss/lemma';
|
|
150
|
-
|
|
151
|
-
const hub = new SubconsciousHub({
|
|
152
|
-
server: { port: 8080 }
|
|
153
|
-
});
|
|
154
|
-
|
|
155
|
-
await hub.start();
|
|
156
|
-
console.log('WebSocket hub listening on ws://localhost:8080');
|
|
157
|
-
```
|
|
158
|
-
|
|
159
|
-
### 3. Connect agents (Server Mode)
|
|
160
|
-
|
|
161
|
-
```typescript
|
|
162
|
-
import WebSocket from 'ws';
|
|
163
|
-
|
|
164
|
-
const ws = new WebSocket('ws://localhost:8080');
|
|
165
|
-
|
|
166
|
-
ws.on('open', () => {
|
|
167
|
-
// Register with the hub
|
|
168
|
-
ws.send(JSON.stringify({
|
|
169
|
-
type: 'handshake',
|
|
170
|
-
messageId: `msg-${Date.now()}`,
|
|
171
|
-
timestamp: Date.now(),
|
|
172
|
-
payload: {
|
|
173
|
-
agentId: 'my-agent-001',
|
|
174
|
-
capabilities: [{ name: 'code-generation', description: 'Writes code', version: '1.0.0' }],
|
|
175
|
-
metadata: { version: '1.0.0' }
|
|
176
|
-
}
|
|
177
|
-
}));
|
|
178
|
-
});
|
|
179
|
-
|
|
180
|
-
ws.on('message', (data) => {
|
|
181
|
-
const msg = JSON.parse(data.toString());
|
|
182
|
-
|
|
183
|
-
if (msg.type === 'handshake_ack') {
|
|
184
|
-
// Connected: send a task
|
|
185
|
-
ws.send(JSON.stringify({
|
|
186
|
-
type: 'task_request',
|
|
187
|
-
messageId: `msg-${Date.now()}`,
|
|
188
|
-
timestamp: Date.now(),
|
|
189
|
-
payload: {
|
|
190
|
-
taskId: `task-${Date.now()}`,
|
|
191
|
-
taskType: 'general',
|
|
192
|
-
description: 'Implement binary search in Python',
|
|
193
|
-
requiredCapabilities: ['code-generation'],
|
|
194
|
-
parameters: {}
|
|
195
|
-
}
|
|
196
|
-
}));
|
|
197
|
-
}
|
|
198
|
-
|
|
199
|
-
if (msg.type === 'task_response' || msg.type === 'TASK_RESPONSE') {
|
|
200
|
-
const { cached, executionTime, result } = msg.payload;
|
|
201
|
-
console.log(cached ? `⚡ Cache hit (${executionTime}ms)` : `Computed (${executionTime}ms)`);
|
|
202
|
-
console.log(result);
|
|
203
|
-
}
|
|
204
|
-
|
|
205
|
-
if (msg.type === 'task_assign') {
|
|
206
|
-
// Hub routed a task to us: process and respond
|
|
207
|
-
const { taskId, description } = msg.payload;
|
|
208
|
-
const result = yourLLM(description); // your actual LLM call
|
|
209
|
-
ws.send(JSON.stringify({
|
|
210
|
-
type: 'task_response',
|
|
211
|
-
messageId: `msg-${Date.now()}`,
|
|
212
|
-
timestamp: Date.now(),
|
|
213
|
-
payload: { taskId, success: true, result, executionTime: 1200, tokensUsed: 800 }
|
|
214
|
-
}));
|
|
215
|
-
}
|
|
216
|
-
});
|
|
217
|
-
```
|
|
218
|
-
|
|
219
|
-
### 4. See it in action
|
|
220
|
-
|
|
221
|
-
When multiple agents request similar tasks, you'll see the cache working:
|
|
222
|
-
|
|
223
|
-
```
|
|
224
|
-
[agent-001] COMPUTED - Calculate fibonacci(10)... (1000ms)
|
|
225
|
-
[agent-002] ⚡ CACHE HIT - compute the 10th fibonacci... (20ms)
|
|
226
|
-
[agent-003] ⚡ CACHE HIT - fibonacci sequence up to n=10... (22ms)
|
|
227
|
-
```
|
|
228
|
-
|
|
229
|
-
**Result:** 100% cache hit rate after first computation. ~20ms responses. Zero duplicate LLM calls.
|
|
230
|
-
|
|
231
|
-
---
|
|
232
|
-
|
|
233
|
-
## What's inside
|
|
234
|
-
|
|
235
|
-
### Embedded Mode: Zero-config semantic cache
|
|
236
|
-
|
|
237
|
-
The simplest way to add semantic caching to any project:
|
|
238
|
-
|
|
239
|
-
```typescript
|
|
240
|
-
import { Lemma } from '@nxuss/lemma/embed';
|
|
241
|
-
|
|
242
|
-
const lemma = await Lemma.create({
|
|
243
|
-
storage: 'semantic', // or 'memory', 'chroma', 'cloud'
|
|
244
|
-
threshold: 0.85,
|
|
245
|
-
ttl: 3600000, // 1 hour
|
|
246
|
-
cleanupInterval: 60000, // Auto-cleanup every minute
|
|
247
|
-
enableFallback: true, // Auto-fallback on failures
|
|
248
|
-
});
|
|
249
|
-
|
|
250
|
-
// Wrap any async function
|
|
251
|
-
const cached = lemma.wrap(yourExpensiveFunction);
|
|
252
|
-
|
|
253
|
-
// Use it
|
|
254
|
-
const result = await cached('your input');
|
|
255
|
-
console.log(result.fromCache); // true on cache hit
|
|
256
|
-
```
|
|
257
|
-
|
|
258
|
-
**Features:**
|
|
259
|
-
- **Semantic matching** with lightweight embeddings (transformers.js)
|
|
260
|
-
- **Automatic TTL cleanup** prevents memory leaks
|
|
261
|
-
- **Circuit breaker** with automatic fallbacks (Cloud → Chroma → Memory)
|
|
262
|
-
- **Health monitoring** with detailed metrics
|
|
263
|
-
- **Graceful shutdown** with `lemma.stop()`
|
|
264
|
-
|
|
265
|
-
**Storage options:**
|
|
266
|
-
- `memory`: Exact match, zero dependencies, fastest
|
|
267
|
-
- `semantic`: True semantic matching, lightweight (50MB)
|
|
268
|
-
- `chroma`: Persistent semantic cache (requires ChromaDB)
|
|
269
|
-
- `cloud`: Managed cache (requires API key)
|
|
270
|
-
|
|
271
|
-
### SubconsciousHub: the orchestration layer
|
|
272
|
-
|
|
273
|
-
The core of Lemma. A WebSocket server that manages agent connections, routes tasks by capability, and maintains the shared semantic cache.
|
|
274
|
-
|
|
63
|
+
### Use as a Multi-Agent Hub
|
|
64
|
+
For complex agent swarms that need a "Hive Mind":
|
|
275
65
|
```typescript
|
|
276
66
|
import { SubconsciousHub } from '@nxuss/lemma';
|
|
277
|
-
|
|
278
67
|
const hub = new SubconsciousHub({ server: { port: 8080 } });
|
|
279
68
|
await hub.start();
|
|
280
69
|
```
|
|
281
70
|
|
|
282
|
-
**What it handles:**
|
|
283
|
-
- Agent registration and capability discovery
|
|
284
|
-
- Semantic cache lookup before every task (ChromaDB + `nomic-embed-text` embeddings)
|
|
285
|
-
- Task routing to capable agents on cache miss
|
|
286
|
-
- Response storage for future cache hits
|
|
287
|
-
- WebSocket heartbeat and connection lifecycle
|
|
288
|
-
- Rate limiting and message sanitization
|
|
289
|
-
|
|
290
|
-
### Semantic cache: the shared memory
|
|
291
|
-
|
|
292
|
-
Built on ChromaDB with Ollama embeddings. Catches paraphrases, not just exact matches.
|
|
293
|
-
|
|
294
|
-
```
|
|
295
|
-
"fibonacci up to n=10" βββΊ cache hit (similarity: 0.97)
|
|
296
|
-
"compute the 10th fibonacci" βββΊ cache hit (similarity: 0.91)
|
|
297
|
-
"fib sequence, first 10 terms" βββΊ cache hit (similarity: 0.88)
|
|
298
|
-
```
|
|
299
|
-
|
|
300
|
-
Threshold is configurable (`SEMANTIC_THRESHOLD=0.85` by default).
|
|
301
|
-
|
|
302
|
-
### Consensus engine: multi-model voting
|
|
303
|
-
|
|
304
|
-
For high-stakes decisions, route a query through multiple models and only return when they agree.
|
|
305
|
-
|
|
306
|
-
```typescript
|
|
307
|
-
import { ConsensusEngine } from '@nxuss/lemma';
|
|
308
|
-
|
|
309
|
-
const consensus = new ConsensusEngine({
|
|
310
|
-
minModels: 3,
|
|
311
|
-
minAgreement: 0.90,
|
|
312
|
-
maxRounds: 3,
|
|
313
|
-
});
|
|
314
|
-
|
|
315
|
-
const result = await consensus.requestConsensus({
|
|
316
|
-
query: 'Is this SQL query safe to run in production?',
|
|
317
|
-
models: ['llama3', 'gpt-4', 'claude-3'],
|
|
318
|
-
});
|
|
319
|
-
// Returns only when 3 models agree ≥90%
|
|
320
|
-
```
|
|
321
|
-
|
|
322
|
-
Supports Ollama (local), OpenAI, Anthropic, and Google models simultaneously.
|
|
323
|
-
|
|
324
|
-
---
|
|
325
|
-
|
|
326
|
-
## New in v0.2.0
|
|
327
|
-
|
|
328
|
-
### 1. Semantic Memory Backend
|
|
329
|
-
|
|
330
|
-
True semantic caching without external dependencies:
|
|
331
|
-
|
|
332
|
-
```typescript
|
|
333
|
-
const lemma = await Lemma.create({
|
|
334
|
-
storage: 'semantic',
|
|
335
|
-
embeddingModel: 'Xenova/all-MiniLM-L6-v2', // Lightweight!
|
|
336
|
-
});
|
|
337
|
-
|
|
338
|
-
// These all match semantically:
|
|
339
|
-
await lemma.run('weather in SF', fetchWeather);
|
|
340
|
-
await lemma.run('San Francisco weather', fetchWeather); // HIT!
|
|
341
|
-
await lemma.run('SF temperature forecast', fetchWeather); // HIT!
|
|
342
|
-
```
|
|
343
|
-
|
|
344
|
-
### 2. Automatic TTL Cleanup
|
|
345
|
-
|
|
346
|
-
No more memory leaks from expired entries:
|
|
347
|
-
|
|
348
|
-
```typescript
|
|
349
|
-
const lemma = await Lemma.create({
|
|
350
|
-
ttl: 3600000, // 1 hour expiry
|
|
351
|
-
cleanupInterval: 60000, // Check every minute
|
|
352
|
-
});
|
|
353
|
-
|
|
354
|
-
// Expired entries are automatically removed
|
|
355
|
-
// No manual cleanup needed!
|
|
356
|
-
```
|
|
357
|
-
|
|
358
|
-
### 3. Circuit Breaker & Fallbacks
|
|
359
|
-
|
|
360
|
-
Automatic resilience when backends fail:
|
|
361
|
-
|
|
362
|
-
```typescript
|
|
363
|
-
const lemma = await Lemma.create({
|
|
364
|
-
storage: 'cloud',
|
|
365
|
-
enableFallback: true, // Auto-fallback on failure
|
|
366
|
-
maxRetries: 3,
|
|
367
|
-
retryDelay: 1000,
|
|
368
|
-
});
|
|
369
|
-
|
|
370
|
-
// If cloud fails → falls back to chroma
|
|
371
|
-
// If chroma fails → falls back to memory
|
|
372
|
-
// Automatic recovery when backend comes back
|
|
373
|
-
|
|
374
|
-
lemma.on('backend-degraded', ({ from, to }) => {
|
|
375
|
-
console.log(`Degraded: ${from} → ${to}`);
|
|
376
|
-
});
|
|
377
|
-
|
|
378
|
-
lemma.on('backend-recovered', ({ backend }) => {
|
|
379
|
-
console.log(`Recovered: ${backend}`);
|
|
380
|
-
});
|
|
381
|
-
```
|
|
382
|
-
|
|
383
|
-
### 4. Enhanced Metrics & Health Monitoring
|
|
384
|
-
|
|
385
|
-
```typescript
|
|
386
|
-
const metrics = lemma.getMetrics();
|
|
387
|
-
console.log(metrics);
|
|
388
|
-
// {
|
|
389
|
-
// hits: 150,
|
|
390
|
-
// misses: 50,
|
|
391
|
-
// hitRate: 0.75,
|
|
392
|
-
// backendHealth: 'healthy',
|
|
393
|
-
// failureCount: 0,
|
|
394
|
-
// evictedCount: 23,
|
|
395
|
-
// lastCleanupAt: 1234567890
|
|
396
|
-
// }
|
|
397
|
-
|
|
398
|
-
const health = lemma.getBackendHealth();
|
|
399
|
-
console.log(health);
|
|
400
|
-
// {
|
|
401
|
-
// state: 'CLOSED',
|
|
402
|
-
// currentBackend: 'semantic',
|
|
403
|
-
// failureCount: 0,
|
|
404
|
-
// totalFailures: 0
|
|
405
|
-
// }
|
|
406
|
-
```
|
|
407
|
-
|
|
408
|
-
### 5. Dual Module Support (ESM + CJS)
|
|
409
|
-
|
|
410
|
-
```typescript
|
|
411
|
-
// ESM
|
|
412
|
-
import { Lemma } from '@nxuss/lemma/embed';
|
|
413
|
-
import { ConsensusEngine } from '@nxuss/lemma/consensus';
|
|
414
|
-
import { SpeculativeEngine } from '@nxuss/lemma/speculative';
|
|
415
|
-
|
|
416
|
-
// CJS
|
|
417
|
-
const { Lemma } = require('@nxuss/lemma/embed');
|
|
418
|
-
const { ConsensusEngine } = require('@nxuss/lemma/consensus');
|
|
419
|
-
```
|
|
420
|
-
|
|
421
|
-
**New exports:**
|
|
422
|
-
- `@nxuss/lemma/consensus` - Multi-model consensus
|
|
423
|
-
- `@nxuss/lemma/speculative` - Speculative execution
|
|
424
|
-
- `@nxuss/lemma/security` - Security utilities
|
|
425
|
-
- `@nxuss/lemma/protocol` - IAP protocol
|
|
426
|
-
- `@nxuss/lemma/langchain` - LangChain SDK
|
|
427
|
-
- `@nxuss/lemma/crewai` - CrewAI SDK
|
|
428
|
-
|
|
429
|
-
See [MIGRATION_GUIDE.md](docs/MIGRATION_GUIDE.md) for upgrade instructions.
|
|
430
|
-
|
|
431
|
-
---
|
|
432
|
-
|
|
433
|
-
## Install
|
|
434
|
-
|
|
435
|
-
```bash
|
|
436
|
-
npm install @nxuss/lemma
|
|
437
|
-
```
|
|
438
|
-
|
|
439
|
-
**Optional dependencies (install as needed):**
|
|
440
|
-
|
|
441
|
-
```bash
|
|
442
|
-
# For semantic mode (lightweight embeddings)
|
|
443
|
-
npm install @xenova/transformers
|
|
444
|
-
|
|
445
|
-
# For persistent storage with ChromaDB
|
|
446
|
-
pip install chromadb
|
|
447
|
-
chroma run --path ./chroma_data --port 8000
|
|
448
|
-
|
|
449
|
-
# For ChromaDB embeddings
|
|
450
|
-
ollama pull nomic-embed-text
|
|
451
|
-
```
|
|
452
|
-
|
|
453
|
-
**Zero dependencies required** for basic memory mode!
|
|
454
|
-
|
|
455
|
-
---
|
|
456
|
-
|
|
457
|
-
## Configuration
|
|
458
|
-
|
|
459
|
-
```bash
|
|
460
|
-
# .env
|
|
461
|
-
WS_PORT=8080
|
|
462
|
-
CHROMA_HOST=http://localhost
|
|
463
|
-
CHROMA_PORT=8000
|
|
464
|
-
OLLAMA_HOST=http://localhost:11434
|
|
465
|
-
OLLAMA_MODEL=nomic-embed-text
|
|
466
|
-
SEMANTIC_THRESHOLD=0.85 # similarity cutoff (0–1)
|
|
467
|
-
ENABLE_CACHING=true
|
|
468
|
-
AUTH_ENABLED=false # set true in production
|
|
469
|
-
```
|
|
470
|
-
|
|
471
|
-
---
|
|
472
|
-
|
|
473
|
-
## Examples & Documentation
|
|
474
|
-
|
|
475
|
-
For complete examples including:
|
|
476
|
-
- Single agent setup
|
|
477
|
-
- Multi-agent swarms
|
|
478
|
-
- Consensus voting
|
|
479
|
-
- Security & authentication
|
|
480
|
-
- LangChain/CrewAI integration
|
|
481
|
-
|
|
482
|
-
Visit [lemma.nxus.studio/docs](https://lemma.nxus.studio/docs)
|
|
483
|
-
|
|
484
|
-
---
|
|
485
|
-
|
|
486
|
-
## Who this is for
|
|
487
|
-
|
|
488
|
-
- Teams running **LangChain, CrewAI, or custom agent frameworks** who need shared memory across agents
|
|
489
|
-
- Systems where **multiple agents handle overlapping queries**: support bots, research pipelines, code assistants
|
|
490
|
-
- Anyone whose **LLM bill scales with agent count** rather than unique queries
|
|
491
|
-
|
|
492
|
-
Lemma is designed for multi-agent systems where coordination and shared memory provide immediate value.
|
|
493
|
-
|
|
494
71
|
---
|
|
495
72
|
|
|
496
|
-
##
|
|
497
|
-
|
|
498
|
-
Lemma can be deployed to any Node.js hosting environment. For production setup guides including:
|
|
499
|
-
- Docker deployment
|
|
500
|
-
- API key management
|
|
501
|
-
- Security configuration
|
|
502
|
-
- Monitoring & observability
|
|
503
|
-
|
|
504
|
-
Visit [lemma.nxus.studio/docs/deployment](https://lemma.nxus.studio/docs/deployment)
|
|
505
|
-
|
|
506
|
-
---
|
|
507
|
-
|
|
508
|
-
## Cloud hosting (coming soon)
|
|
509
|
-
|
|
510
|
-
Managed Lemma instances with zero infrastructure setup. Check pricing and availability at [lemma.nxus.studio](https://lemma.nxus.studio)
|
|
73
|
+
## Dashboard
|
|
74
|
+
Monitor your savings, visualize agent connections, and inspect your semantic memory through the integrated real-time dashboard.
|
|
511
75
|
|
|
512
76
|
---
|
|
513
77
|
|
|
514
|
-
|
|
515
|
-
|
|
516
|
-
Contributions are welcome! For development setup and guidelines, visit [lemma.nxus.studio](https://lemma.nxus.studio)
|
|
517
|
-
|
|
518
|
-
---
|
|
519
|
-
|
|
520
|
-
## License
|
|
521
|
-
|
|
522
|
-
MIT © [Nxus Studio](https://nxus.studio)
|
|
78
|
+
MIT © Nxus Studio | [Get Lemma Pro](https://lemma.nxus.studio/upgrade)
|