collective-memory-mcp 0.4.0 → 0.6.0
- package/README.md +116 -6
- package/package.json +2 -2
- package/src/embeddings.js +318 -0
- package/src/server.js +199 -39
- package/src/storage.js +188 -8
package/README.md
CHANGED
@@ -1,16 +1,23 @@
 # Collective Memory MCP Server
 
-A persistent, graph-based memory system that enables AI agents to document their work and learn from each other's experiences. This system transforms ephemeral agent interactions into a searchable knowledge base of structural patterns, solutions, and methodologies.
+A persistent, graph-based memory system with **semantic search** that enables AI agents to document their work and learn from each other's experiences. This system transforms ephemeral agent interactions into a searchable knowledge base of structural patterns, solutions, and methodologies.
 
 ## Overview
 
 The Collective Memory System is designed for multi-agent environments where agents need to:
 
 - Document their completed work for future reference
-- Discover how similar tasks were solved previously
+- Discover how similar tasks were solved previously using **semantic understanding**
 - Learn from the structural patterns and approaches of other agents
 - Coordinate across parallel executions without duplicating effort
 
+## Key Features
+
+- **Semantic Search** - Finds conceptually similar content even when keywords differ
+- **Knowledge Graph** - Entities and relations capture complex relationships
+- **Ranked Results** - Similarity scores help identify the most relevant past work
+- **Auto-Embeddings** - New content automatically gets semantic embeddings
+
 ## Installation
 
 ```bash
@@ -34,6 +41,69 @@ Add to your Claude Desktop MCP configuration (`~/.config/Claude/claude_desktop_c
 
 The `-y` flag suppresses the npx prompt and auto-installs the latest version.
 
+## System Prompt (Recommended)
+
+Add this to your Claude system prompt to ensure agents know about the Collective Memory:
+
+```
+You have access to a Collective Memory MCP Server that stores knowledge from previous tasks.
+
+BEFORE starting work, search for similar past tasks using:
+- search_collective_memory (semantic search - understands meaning, not just keywords)
+- find_similar_procedures (finds similar tasks with full implementation details)
+
+The search uses semantic embeddings, so it finds relevant content even when different
+terminology is used. Results are ranked by similarity score.
+
+AFTER completing any task, document it using:
+- record_task_completion
+
+When writing observations, be SPECIFIC and include facts like file paths, versions,
+metrics, and error messages. Avoid vague statements like "works well" or "fixed bugs".
+```
+
+## Setting Up Semantic Search
+
+For semantic search to work, you need to configure an embeddings provider:
+
+### Option 1: OpenAI (Recommended - Best Quality)
+
+```bash
+# Configure with your OpenAI API key
+# Use the manage_embeddings tool with:
+{
+  "action": "configure",
+  "provider": "openai",
+  "api_key": "sk-..."
+}
+```
+
+### Option 2: Ollama (Free - Local)
+
+```bash
+# Install Ollama first
+# Then pull the embedding model
+ollama pull nomic-embed-text
+
+# Configure the provider
+{
+  "action": "configure",
+  "provider": "ollama"
+}
+```
+
+### Generate Embeddings for Existing Data
+
+After configuring, generate embeddings for any existing entities:
+
+```json
+{
+  "action": "generate"
+}
+```
+
+**Note:** Without configuring embeddings, the system falls back to keyword-based search.
+
 ## Entity Types
 
 | Type | Description |
@@ -75,7 +145,7 @@ The `-y` flag suppresses the npx prompt and auto-installs the latest version.
 ### Query & Search
 
 - **read_graph** - Read entire knowledge graph
-- **search_collective_memory** -
+- **search_collective_memory** - Semantic search with ranked results
 - **open_nodes** - Retrieve specific nodes by name
 
 ### Agent Workflow
@@ -83,6 +153,10 @@ The `-y` flag suppresses the npx prompt and auto-installs the latest version.
 - **record_task_completion** - Primary tool for documenting completed work
 - **find_similar_procedures** - Find similar tasks with full implementation details
 
+### Embeddings Management
+
+- **manage_embeddings** - Configure semantic search and generate embeddings
+
 ## Example Usage
 
 ### Recording a Task Completion
@@ -108,22 +182,58 @@ await session.callTool("record_task_completion", {
 });
 ```
 
-### Finding Similar Procedures
+### Finding Similar Procedures (Semantic Search)
 
 ```javascript
 const result = await session.callTool("find_similar_procedures", {
   query: "authentication implementation"
 });
+
+// Returns ranked results with similarity scores:
+// {
+//   "similar_tasks": [
+//     { "task": {...}, "score": 0.89, "artifacts": [...], "structures": [...] },
+//     { "task": {...}, "score": 0.82, "artifacts": [...], "structures": [...] }
+//   ],
+//   "search_method": "semantic"
+// }
+```
+
+### Configuring Embeddings
+
+```javascript
+// Check status
+await session.callTool("manage_embeddings", { "action": "status" });
+
+// Configure OpenAI
+await session.callTool("manage_embeddings", {
+  "action": "configure",
+  "provider": "openai",
+  "api_key": "sk-..."
+});
+
+// Generate embeddings for existing data
+await session.callTool("manage_embeddings", { "action": "generate" });
 ```
 
 ## Database
 
-The server uses
+The server uses JSON file storage for persistence. Data is stored at:
 
 ```
-~/.collective-memory/memory.
+~/.collective-memory/memory.json  # Knowledge graph data
+~/.collective-memory/config.json  # Embeddings provider configuration
 ```
 
+## Semantic Search Benefits
+
+| Before (Keyword Search) | After (Semantic Search) |
+|------------------------|------------------------|
+| Query "login" misses "authentication" | Query "login" finds "authentication", "JWT", "OAuth" |
+| No relevance ranking | Results ranked by similarity score (0-1) |
+| Exact word matching required | Understands meaning and intent |
+| High false-positive rate | More precise, relevant results |
+
 ## Requirements
 
 - Node.js 18+
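The README's sample response for `find_similar_procedures` suggests that a caller can branch on `search_method` and filter on `score`. The following is a minimal sketch of that idea, not code from the package. It assumes each `task` object has a `name` field (the README elides the task shape as `{...}`), and the helper name `pickRelevantTasks` and the 0.85 cutoff are hypothetical. The tool result is unwrapped from the `content[0].text` JSON string, which matches how the handlers in `server.js` build their return value.

```javascript
// Hypothetical helper: not part of collective-memory-mcp.
// Assumes the response shape shown in the README comment, plus task.name.
function pickRelevantTasks(response, minScore = 0.85) {
  const tasks = response.similar_tasks ?? [];
  // Assumption: scores are only treated as comparable when the server
  // reports a semantic search; keyword results are returned unfiltered.
  if (response.search_method !== "semantic") {
    return tasks.map((t) => t.task.name);
  }
  return tasks.filter((t) => t.score >= minScore).map((t) => t.task.name);
}

// Sample payload modelled on the README comment (task names are invented).
const sample = {
  similar_tasks: [
    { task: { name: "Task_Add_JWT_Auth" }, score: 0.89, artifacts: [], structures: [] },
    { task: { name: "Task_Fix_CORS" }, score: 0.82, artifacts: [], structures: [] },
  ],
  search_method: "semantic",
};

// Tool handlers return the payload as a JSON string inside content[0].text.
const toolResult = { content: [{ type: "text", text: JSON.stringify(sample) }] };
const parsed = JSON.parse(toolResult.content[0].text);

console.log(pickRelevantTasks(parsed)); // [ 'Task_Add_JWT_Auth' ]
```

Checking `search_method` first keeps the cutoff from silently discarding everything when the server has fallen back to keyword search.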
package/package.json
CHANGED
@@ -1,7 +1,7 @@
 {
   "name": "collective-memory-mcp",
-  "version": "0.
-  "description": "A persistent, graph-based memory system for AI agents (MCP Server)",
+  "version": "0.6.0",
+  "description": "A persistent, graph-based memory system for AI agents with semantic search (MCP Server)",
   "type": "module",
   "main": "src/server.js",
   "bin": {
package/src/embeddings.js
ADDED
@@ -0,0 +1,318 @@
+/**
+ * Embeddings module for semantic search in the Collective Memory System.
+ * Supports multiple embedding providers with cosine similarity.
+ */
+
+import fs from "fs";
+import path from "path";
+import os from "os";
+
+const CONFIG_DIR = path.join(os.homedir(), ".collective-memory");
+const CONFIG_PATH = path.join(CONFIG_DIR, "config.json");
+
+/**
+ * Default configuration
+ */
+const DEFAULT_CONFIG = {
+  embedding_provider: "openai", // 'openai' or 'ollama'
+  openai_api_key: null,
+  openai_model: "text-embedding-3-small",
+  ollama_base_url: "http://localhost:11434",
+  ollama_model: "nomic-embed-text",
+  embedding_dimension: 1536,
+};
+
+/**
+ * Load configuration from file
+ */
+function loadConfig() {
+  try {
+    if (fs.existsSync(CONFIG_PATH)) {
+      const content = fs.readFileSync(CONFIG_PATH, "utf-8");
+      return { ...DEFAULT_CONFIG, ...JSON.parse(content) };
+    }
+  } catch (error) {
+    console.error("Failed to load config:", error.message);
+  }
+  return { ...DEFAULT_CONFIG };
+}
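`loadConfig` layers the parsed file over `DEFAULT_CONFIG` with object spread, so a partial `config.json` only overrides the keys it contains. The sketch below repeats that merge expression in isolation. The `DEFAULTS` object copies the values from the diff, while the file contents are a hypothetical example; no file is read.

```javascript
// Stand-in for DEFAULT_CONFIG, with values copied from the diff.
const DEFAULTS = {
  embedding_provider: "openai",
  openai_api_key: null,
  openai_model: "text-embedding-3-small",
  ollama_base_url: "http://localhost:11434",
  ollama_model: "nomic-embed-text",
  embedding_dimension: 1536,
};

// Hypothetical contents of ~/.collective-memory/config.json.
const fileContents = '{"embedding_provider":"ollama"}';

// Same merge expression as loadConfig: keys from the later spread win.
const merged = { ...DEFAULTS, ...JSON.parse(fileContents) };

console.log(merged.embedding_provider); // ollama
console.log(merged.openai_model); // text-embedding-3-small (default kept)
console.log(merged.embedding_dimension); // 1536 (default kept)
```

One consequence worth noting: switching the provider to `ollama` this way leaves `embedding_dimension` at 1536 in the merged config, even though the `OllamaEmbeddings` class later in this file hard-codes 768 for `nomic-embed-text`.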
+
+/**
+ * Save configuration to file
+ */
+function saveConfig(config) {
+  try {
+    if (!fs.existsSync(CONFIG_DIR)) {
+      fs.mkdirSync(CONFIG_DIR, { recursive: true });
+    }
+    fs.writeFileSync(CONFIG_PATH, JSON.stringify(config, null, 2));
+  } catch (error) {
+    console.error("Failed to save config:", error.message);
+  }
+}
+
+/**
+ * Calculate cosine similarity between two vectors
+ */
+function cosineSimilarity(a, b) {
+  if (a.length !== b.length) {
+    throw new Error("Vector dimensions must match");
+  }
+
+  let dotProduct = 0;
+  let normA = 0;
+  let normB = 0;
+
+  for (let i = 0; i < a.length; i++) {
+    dotProduct += a[i] * b[i];
+    normA += a[i] * a[i];
+    normB += b[i] * b[i];
+  }
+
+  normA = Math.sqrt(normA);
+  normB = Math.sqrt(normB);
+
+  if (normA === 0 || normB === 0) {
+    return 0;
+  }
+
+  return dotProduct / (normA * normB);
+}
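A few hand-checkable inputs make the behaviour of `cosineSimilarity` concrete. The function below is a compact copy of the one in the diff, repeated only so the example runs on its own; the vectors are toy values.

```javascript
// Compact copy of cosineSimilarity from the diff, for a standalone example.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vector dimensions must match");
  }
  let dotProduct = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dotProduct += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  normA = Math.sqrt(normA);
  normB = Math.sqrt(normB);
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dotProduct / (normA * normB);
}

console.log(cosineSimilarity([1, 0], [1, 0])); // 1 (same direction)
console.log(cosineSimilarity([3, 4], [6, 8])); // 1 (magnitude is ignored)
console.log(cosineSimilarity([1, 0], [0, 1])); // 0 (orthogonal)
console.log(cosineSimilarity([1, 0], [-1, 0])); // -1 (opposite direction)
console.log(cosineSimilarity([0, 0], [1, 2])); // 0 (zero-vector guard)
```

Two details follow from the code as written. The formula can return values down to -1, so the "0-1" range quoted in the README is a property of typical embedding vectors rather than of this function. And vectors of different lengths throw, which would matter if stored embeddings from one provider (1536 dimensions by default for OpenAI) were compared against a query embedded by another (768 for `nomic-embed-text`); whether the storage layer guards against that is not visible in this file.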
+
+/**
+ * OpenAI embeddings provider
+ */
+class OpenAIEmbeddings {
+  constructor(config) {
+    this.apiKey = config.openai_api_key || process.env.OPENAI_API_KEY;
+    this.model = config.openai_model || "text-embedding-3-small";
+    this.dimension = config.embedding_dimension || 1536;
+  }
+
+  isAvailable() {
+    return !!this.apiKey;
+  }
+
+  async embed(text) {
+    if (!this.isAvailable()) {
+      throw new Error("OpenAI API key not configured");
+    }
+
+    const response = await fetch("https://api.openai.com/v1/embeddings", {
+      method: "POST",
+      headers: {
+        "Content-Type": "application/json",
+        "Authorization": `Bearer ${this.apiKey}`,
+      },
+      body: JSON.stringify({
+        model: this.model,
+        input: text,
+      }),
+    });
+
+    if (!response.ok) {
+      const error = await response.text();
+      throw new Error(`OpenAI API error: ${error}`);
+    }
+
+    const data = await response.json();
+    return data.data[0].embedding;
+  }
+
+  async embedBatch(texts) {
+    if (!this.isAvailable()) {
+      throw new Error("OpenAI API key not configured");
+    }
+
+    const response = await fetch("https://api.openai.com/v1/embeddings", {
+      method: "POST",
+      headers: {
+        "Content-Type": "application/json",
+        "Authorization": `Bearer ${this.apiKey}`,
+      },
+      body: JSON.stringify({
+        model: this.model,
+        input: texts,
+      }),
+    });
+
+    if (!response.ok) {
+      const error = await response.text();
+      throw new Error(`OpenAI API error: ${error}`);
+    }
+
+    const data = await response.json();
+    return data.data.map((item) => item.embedding);
+  }
+}
+
+/**
+ * Ollama embeddings provider (local, free)
+ */
+class OllamaEmbeddings {
+  constructor(config) {
+    this.baseUrl = config.ollama_base_url || "http://localhost:11434";
+    this.model = config.ollama_model || "nomic-embed-text";
+    this.dimension = 768; // Default for nomic-embed-text
+  }
+
+  isAvailable() {
+    // Check if Ollama is running
+    return fetch(`${this.baseUrl}/api/tags`)
+      .then((res) => res.ok)
+      .catch(() => false);
+  }
+
+  async embed(text) {
+    const response = await fetch(`${this.baseUrl}/api/embeddings`, {
+      method: "POST",
+      headers: {
+        "Content-Type": "application/json",
+      },
+      body: JSON.stringify({
+        model: this.model,
+        prompt: text,
+      }),
+    });
+
+    if (!response.ok) {
+      const error = await response.text();
+      throw new Error(`Ollama API error: ${error}`);
+    }
+
+    const data = await response.json();
+    return data.embedding;
+  }
+
+  async embedBatch(texts) {
+    // Ollama doesn't support batch embeddings, so we do them sequentially
+    const embeddings = [];
+    for (const text of texts) {
+      embeddings.push(await this.embed(text));
+    }
+    return embeddings;
+  }
+}
+
+/**
+ * Main embeddings class that manages providers
+ */
+class Embeddings {
+  constructor(config) {
+    this.config = config || loadConfig();
+    this.providers = {
+      openai: new OpenAIEmbeddings(this.config),
+      ollama: new OllamaEmbeddings(this.config),
+    };
+    this.activeProvider = this.config.embedding_provider || "openai";
+  }
+
+  /**
+   * Get the active provider
+   */
+  getProvider() {
+    return this.providers[this.activeProvider];
+  }
+
+  /**
+   * Check if the active provider is available
+   */
+  async isAvailable() {
+    const provider = this.getProvider();
+
+    if (this.activeProvider === "ollama") {
+      return await provider.isAvailable();
+    }
+
+    return provider.isAvailable();
+  }
+
+  /**
+   * Generate embedding for a single text
+   */
+  async embed(text) {
+    const provider = this.getProvider();
+    return await provider.embed(text);
+  }
+
+  /**
+   * Generate embeddings for multiple texts
+   */
+  async embedBatch(texts) {
+    const provider = this.getProvider();
+    return await provider.embedBatch(texts);
+  }
+
+  /**
+   * Find most similar items using cosine similarity
+   */
+  findMostSimilar(queryEmbedding, items, topK = 10, threshold = 0.7) {
+    const results = items.map((item) => {
+      if (!item.embedding) {
+        return { ...item, score: 0 };
+      }
+      const score = cosineSimilarity(queryEmbedding, item.embedding);
+      return { ...item, score };
+    });
+
+    // Sort by score descending and filter by threshold
+    return results
+      .filter((r) => r.score >= threshold)
+      .sort((a, b) => b.score - a.score)
+      .slice(0, topK);
+  }
+
+  /**
+   * Create text representation for entity embedding
+   */
+  createEntityText(entity) {
+    const parts = [];
+
+    // Name carries important semantic weight
+    parts.push(`Name: ${entity.name}`);
+
+    // Entity type provides context
+    parts.push(`Type: ${entity.entityType}`);
+
+    // Observations contain the detailed information
+    if (entity.observations && entity.observations.length > 0) {
+      parts.push(`Observations:\n${entity.observations.join("\n")}`);
+    }
+
+    // Metadata if present
+    if (entity.metadata) {
+      parts.push(`Metadata: ${JSON.stringify(entity.metadata)}`);
+    }
+
+    return parts.join("\n\n");
+  }
+}
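The ranking in `Embeddings.findMostSimilar` is a score, filter, sort, slice pipeline. The sketch below reproduces those steps as a free function over toy 2-D vectors so the effect of `threshold` and `topK` can be seen directly; the item names and vectors are invented, and real embeddings have hundreds of dimensions.

```javascript
// Compact cosine similarity, equivalent to the function in the diff.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("Vector dimensions must match");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Same scoring, filtering, and ordering steps as Embeddings.findMostSimilar.
function findMostSimilar(queryEmbedding, items, topK = 10, threshold = 0.7) {
  return items
    .map((item) => ({
      ...item,
      score: item.embedding ? cosineSimilarity(queryEmbedding, item.embedding) : 0,
    }))
    .filter((r) => r.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Invented items: two related tasks, one unrelated, one without an embedding.
const items = [
  { name: "db-task", embedding: [0, 1] },
  { name: "login-task", embedding: [4, 3] },
  { name: "auth-task", embedding: [1, 0] },
  { name: "no-embedding" },
];

const query = [1, 0];
console.log(findMostSimilar(query, items).map((r) => r.name));
// [ 'auth-task', 'login-task' ]
console.log(findMostSimilar(query, items, 10, 0.9).map((r) => r.name));
// [ 'auth-task' ]
```

Items without an `embedding` receive a score of 0, so with the default threshold of 0.7 they never appear in semantic results until embeddings have been generated for them.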
+
+/**
+ * Singleton instance
+ */
+let embeddingsInstance = null;
+
+/**
+ * Get or create embeddings instance
+ */
+export function getEmbeddings(config) {
+  if (!embeddingsInstance) {
+    embeddingsInstance = new Embeddings(config);
+  }
+  return embeddingsInstance;
+}
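`getEmbeddings` follows a get-or-create pattern: the `config` argument is only used on the first call. The sketch below illustrates that behaviour with a simplified stand-in class (`FakeEmbeddings` is hypothetical and only records the provider name); it is meant to show the pattern, not to reproduce the real class.

```javascript
// Simplified stand-in for the Embeddings class: it only records the provider.
class FakeEmbeddings {
  constructor(config) {
    this.activeProvider = (config && config.embedding_provider) || "openai";
  }
}

let instance = null;

// Same get-or-create shape as getEmbeddings in the diff.
function getEmbeddings(config) {
  if (!instance) {
    instance = new FakeEmbeddings(config);
  }
  return instance;
}

const first = getEmbeddings({ embedding_provider: "openai" });
const second = getEmbeddings({ embedding_provider: "ollama" });

console.log(first === second); // true
console.log(second.activeProvider); // openai (the second config is ignored)
```

How `storage.initEmbeddings()` obtains its instance is not shown in this file, so whether this matters when the provider is changed through `manage_embeddings` cannot be determined from the visible code.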
+
+/**
+ * Export utilities
+ */
+export {
+  cosineSimilarity,
+  loadConfig,
+  saveConfig,
+  Embeddings,
+  OpenAIEmbeddings,
+  OllamaEmbeddings,
+};
+
+export default { getEmbeddings, cosineSimilarity, loadConfig, saveConfig };
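The text that gets embedded for each entity is built by `createEntityText` in the `Embeddings` class above. As a concrete illustration, the sketch below runs a standalone copy of that method on a hypothetical entity; the field values are borrowed from the examples in the tool descriptions and are not real data.

```javascript
// Standalone copy of Embeddings.createEntityText from the diff.
function createEntityText(entity) {
  const parts = [];
  parts.push(`Name: ${entity.name}`);
  parts.push(`Type: ${entity.entityType}`);
  if (entity.observations && entity.observations.length > 0) {
    parts.push(`Observations:\n${entity.observations.join("\n")}`);
  }
  if (entity.metadata) {
    parts.push(`Metadata: ${JSON.stringify(entity.metadata)}`);
  }
  return parts.join("\n\n");
}

// Hypothetical entity used only for illustration.
const text = createEntityText({
  name: "Task_Add_JWT_Auth_20241224",
  entityType: "task",
  observations: ["Added JWT with 1-hour expiry", "Fixed CORS by adding Origin header"],
});

console.log(text);
// Name: Task_Add_JWT_Auth_20241224
//
// Type: task
//
// Observations:
// Added JWT with 1-hour expiry
// Fixed CORS by adding Origin header
```

Because every observation ends up in the embedded text, specific observations give the embedding more to work with than generic ones, which fits the guidance in the `record_task_completion` description.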
package/src/server.js
CHANGED
@@ -14,13 +14,17 @@ import {
 } from "@modelcontextprotocol/sdk/types.js";
 import { getStorage } from "./storage.js";
 import { Entity, Relation, ENTITY_TYPES, RELATION_TYPES } from "./models.js";
+import { loadConfig } from "./embeddings.js";
 
 /**
  * Create and configure the MCP server
  */
-function createServer() {
+async function createServer() {
   const storage = getStorage();
 
+  // Initialize embeddings for semantic search
+  await storage.initEmbeddings();
+
   const server = new Server(
     {
       name: "collective-memory",
@@ -230,15 +234,17 @@ function createServer() {
         {
           name: "search_collective_memory",
           description:
-            "Search
-            "
-            "Returns
+            "**Search all past work using semantic search** - Use before starting a task to learn from previous solutions. " +
+            "Uses semantic embeddings to find conceptually similar content, even with different keywords. " +
+            "Returns ranked results with similarity scores. " +
+            "Automatically falls back to keyword search if embeddings aren't available. " +
+            "Use find_similar_procedures for more detailed results with artifacts.",
           inputSchema: {
             type: "object",
             properties: {
               query: {
                 type: "string",
-                description: "
+                description: "What are you looking for? Semantic search understands meaning. (e.g., 'authentication', 'CORS fix', 'database')",
               },
             },
             required: ["query"],
@@ -265,33 +271,48 @@ function createServer() {
         {
           name: "record_task_completion",
           description:
-            "
-            "Automatically creates
-            "
-            "
+            "**PRIMARY TOOL - Use this after completing any task** Document completed work with full context. " +
+            "Automatically creates task entity, agent entity, artifacts, and relations. " +
+            "" +
+            "**GUIDELINES FOR observations:** " +
+            "- Be SPECIFIC: 'Added JWT with 1-hour expiry' not 'Added auth' " +
+            "- Include FACTS: file paths, versions, metrics, error messages " +
+            "- Be ATOMIC: One fact per observation " +
+            "- BAD: 'Works well', 'Fixed bugs', 'Good code' " +
+            "- GOOD: 'API response time reduced from 500ms to 120ms', 'Fixed CORS by adding Origin header' " +
+            "" +
+            "**Parameters:** " +
+            "- agent_name: Your identifier (e.g., 'Agent_Backend_Developer') " +
+            "- task_name: Unique descriptive name (e.g., 'Task_Add_JWT_Auth_20241224') " +
+            "- task_type: Type (implementation/debugging/refactoring/testing) " +
+            "- description: High-level summary " +
+            "- observations: Array of specific facts (see guidelines above) " +
+            "- created_artifacts: Files/code created with their observations " +
+            "- modified_structures: Architectural changes " +
+            "- session_id: Optional - groups related tasks together",
           inputSchema: {
             type: "object",
             properties: {
               agent_name: {
                 type: "string",
-                description: "
+                description: "Your identifier (e.g., 'Agent_Backend_Developer')",
               },
               task_name: {
                 type: "string",
-                description: "Unique name
+                description: "Unique descriptive name (e.g., 'Task_Add_JWT_Auth_20241224')",
               },
               task_type: {
                 type: "string",
-                description: "Type
+                description: "Type: implementation, debugging, refactoring, testing, etc.",
               },
               description: {
                 type: "string",
-                description: "High-level
+                description: "High-level summary of what was accomplished",
               },
               observations: {
                 type: "array",
                 items: { type: "string" },
-                description: "
+                description: "Specific facts about the work. Be atomic and detailed. Include paths, versions, metrics.",
               },
               created_artifacts: {
                 type: "array",
@@ -306,7 +327,7 @@ function createServer() {
                   },
                   required: ["name"],
                 },
-                description: "
+                description: "Files, code, or configs you created",
               },
               modified_structures: {
                 type: "array",
@@ -321,11 +342,11 @@ function createServer() {
                   },
                   required: ["name"],
                 },
-                description: "
+                description: "Architectural patterns or structures that changed",
               },
               session_id: {
                 type: "string",
-                description: "Optional session identifier",
+                description: "Optional session identifier to group related tasks",
               },
             },
             required: ["agent_name", "task_name"],
@@ -334,20 +355,49 @@ function createServer() {
         {
           name: "find_similar_procedures",
           description:
-            "
-            "
-            "
+            "**Use BEFORE starting work** - Find how similar tasks were solved previously using semantic search. " +
+            "Returns complete implementation details including artifacts and structures, ranked by similarity. " +
+            "Understands meaning and intent, not just keywords. " +
+            "Learn from past solutions before implementing new features. " +
+            "Query examples: 'authentication', 'database migration', 'API design', 'error handling'.",
           inputSchema: {
             type: "object",
             properties: {
               query: {
                 type: "string",
-                description: "
+                description: "What are you trying to do? Semantic search finds conceptually similar work. (e.g., 'authentication implementation', 'database migration')",
               },
             },
             required: ["query"],
           },
         },
+        {
+          name: "manage_embeddings",
+          description:
+            "**Manage semantic search embeddings** - Generate embeddings for existing entities " +
+            "to enable semantic search. Run this once after setting up an embeddings provider. " +
+            "Embeddings enable finding similar content even when keywords don't match exactly.",
+          inputSchema: {
+            type: "object",
+            properties: {
+              action: {
+                type: "string",
+                enum: ["generate", "status", "configure"],
+                description: "Action: 'generate' creates embeddings for entities missing them, 'status' shows current state, 'configure' updates settings",
+              },
+              provider: {
+                type: "string",
+                enum: ["openai", "ollama"],
+                description: "Provider to use (only for 'configure' action)",
+              },
+              api_key: {
+                type: "string",
+                description: "API key for OpenAI (only for 'configure' action with provider='openai')",
+              },
+            },
+            required: ["action"],
+          },
+        },
       ],
     };
   });
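The `manage_embeddings` schema constrains `action` and `provider` to fixed enums. A client that wants to catch mistakes before making the call could mirror those constraints locally. The sketch below is one way to do that; `checkManageEmbeddingsArgs` is a hypothetical helper, not part of the package, and the server still applies its own handling.

```javascript
// Enum values copied from the manage_embeddings inputSchema in the diff.
const ACTIONS = ["generate", "status", "configure"];
const PROVIDERS = ["openai", "ollama"];

// Hypothetical client-side check; returns a list of problems (empty if none).
function checkManageEmbeddingsArgs(args) {
  const errors = [];
  if (!ACTIONS.includes(args.action)) {
    errors.push(`action must be one of: ${ACTIONS.join(", ")}`);
  }
  if (args.provider !== undefined && !PROVIDERS.includes(args.provider)) {
    errors.push(`provider must be one of: ${PROVIDERS.join(", ")}`);
  }
  if (args.action === "configure" && args.provider === undefined) {
    // Mirrors the server's "Provider is required for configure action" error.
    errors.push("provider is required for the configure action");
  }
  return errors;
}

console.log(checkManageEmbeddingsArgs({ action: "status" })); // []
console.log(checkManageEmbeddingsArgs({ action: "configure" }).length); // 1
```

Note that the schema itself marks only `action` as required; the requirement that `configure` needs a `provider` is enforced inside `manageEmbeddings`, not by the schema.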
@@ -686,6 +736,9 @@ Future agents will read your observations to learn. Write for them, not for your
         case "find_similar_procedures":
           return { content: [{ type: "text", text: JSON.stringify(findSimilarProcedures(args), null, 2) }] };
 
+        case "manage_embeddings":
+          return { content: [{ type: "text", text: JSON.stringify(await manageEmbeddings(args), null, 2) }] };
+
         default:
           throw new Error(`Unknown tool: ${name}`);
       }
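In this hunk the new `manage_embeddings` case awaits its handler, while the unchanged `find_similar_procedures` context line passes the handler's return value directly to `JSON.stringify`. A later hunk in this file changes `findSimilarProcedures` to an `async` function. The sketch below shows, with a stand-in handler (`fakeHandler` is hypothetical), what `JSON.stringify` produces for a Promise compared with the awaited value. It only illustrates the language behaviour; it does not establish how the released file behaves at runtime.

```javascript
// Stand-in for an async tool handler; the returned shape is illustrative.
async function fakeHandler() {
  return { similar_tasks: [], count: 0, search_method: "keyword" };
}

// Without await, JSON.stringify receives a Promise. A Promise has no
// enumerable own properties, so it serializes as an empty object.
const withoutAwait = JSON.stringify(fakeHandler(), null, 2);
console.log(withoutAwait); // {}

// Once resolved, the actual payload is serialized.
fakeHandler().then((value) => {
  console.log(JSON.stringify(value, null, 2));
});
```

If a handler is converted to `async`, every call site that serializes its result needs a matching `await`, as the `manage_embeddings` case already does.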
@@ -824,16 +877,19 @@ Future agents will read your observations to learn. Write for them, not for your
     };
   }
 
-  function searchCollectiveMemory({ query = "" }) {
-
+  async function searchCollectiveMemory({ query = "" }) {
+    // Use semantic search if available
+    const searchResult = await storage.semanticSearchEntities(query);
 
-    const results =
+    const results = searchResult.results.map((item) => {
+      const entity = item.entity;
       const related = storage.getRelatedEntities(entity.name);
       return {
         name: entity.name,
         entityType: entity.entityType,
         observations: entity.observations,
         createdAt: entity.createdAt,
+        score: item.score,
         related_entities: related.connected.map((e) => ({
           name: e.name,
           entityType: e.entityType,
@@ -841,7 +897,11 @@ Future agents will read your observations to learn. Write for them, not for your
       };
     });
 
-    return {
+    return {
+      matching_entities: results,
+      count: results.length,
+      search_method: searchResult.method,
+    };
   }
 
   function openNodes({ names = [] }) {
@@ -1015,20 +1075,13 @@ Future agents will read your observations to learn. Write for them, not for your
     };
   }
 
-  function findSimilarProcedures({ query = "" }) {
-
-
-    // Search for matching task entities
-    const allEntities = storage.getAllEntities();
-    const matchingTasks = allEntities.filter(
-      (e) =>
-        e.entityType === "task" &&
-        (e.name.toLowerCase().includes(searchQuery) ||
-          e.observations.some((obs) => obs.toLowerCase().includes(searchQuery)))
-    );
+  async function findSimilarProcedures({ query = "" }) {
+    // Use semantic search for tasks, falling back to keyword search
+    const searchResult = await storage.semanticSearchEntities(query, { entityType: "task" });
 
     const results = [];
-    for (const
+    for (const item of searchResult.results) {
+      const task = item.entity;
       const taskRelations = storage.getRelations({ fromEntity: task.name });
 
       const artifacts = [];
|
@@ -1066,10 +1119,117 @@ Future agents will read your observations to learn. Write for them, not for your
|
|
|
1066
1119
|
artifacts,
|
|
1067
1120
|
structures,
|
|
1068
1121
|
execution_context: executionContext,
|
|
1122
|
+
score: item.score,
|
|
1069
1123
|
});
|
|
1070
1124
|
}
|
|
1071
1125
|
|
|
1072
|
-
return { similar_tasks: results, count: results.length };
+    return { similar_tasks: results, count: results.length, search_method: searchResult.method };
+  }
+
+  async function manageEmbeddings({ action = "status", provider = null, api_key = null }) {
+    const { saveConfig } = await import("./embeddings.js");
+
+    switch (action) {
+      case "status": {
+        const allEntities = storage.getAllEntities();
+        const withEmbeddings = allEntities.filter(
+          (e) => storage.data.entities[e.name]?.embedding
+        ).length;
+
+        const config = loadConfig();
+        const isReady = storage.embeddingsReady;
+
+        return {
+          status: "success",
+          action: "status",
+          embeddings_ready: isReady,
+          provider: config.embedding_provider,
+          entities_with_embeddings: withEmbeddings,
+          total_entities: allEntities.length,
+          coverage_percent: allEntities.length > 0
+            ? Math.round((withEmbeddings / allEntities.length) * 100)
+            : 0,
+          message: isReady
+            ? `Embeddings enabled using ${config.embedding_provider}. ${withEmbeddings}/${allEntities.length} entities have embeddings.`
+            : "Embeddings not configured. Use 'configure' action to set up a provider.",
+        };
+      }
+
+      case "configure": {
+        if (!provider) {
+          return {
+            status: "error",
+            message: "Provider is required for configure action",
+          };
+        }
+
+        const config = loadConfig();
+        config.embedding_provider = provider;
+
+        if (provider === "openai" && api_key) {
+          config.openai_api_key = api_key;
+        }
+
+        saveConfig(config);
+
+        // Re-initialize embeddings with new config
+        storage.embeddingsReady = false;
+        await storage.initEmbeddings();
+
+        return {
+          status: "success",
+          action: "configure",
+          provider,
+          embeddings_ready: storage.embeddingsReady,
+          message: storage.embeddingsReady
+            ? `Successfully configured ${provider} for embeddings`
+            : `Configured ${provider} but provider is not available. Check API keys or provider status.`,
+        };
+      }
+
+      case "generate": {
+        // Generate embeddings for entities that don't have them
+        const allEntities = storage.getAllEntities();
+        const missing = allEntities.filter(
+          (e) => !storage.data.entities[e.name]?.embedding
+        );
+
+        if (missing.length === 0) {
+          return {
+            status: "success",
+            action: "generate",
+            message: "All entities already have embeddings",
+            processed: 0,
+          };
+        }
+
+        try {
+          const result = await storage.generateMissingEmbeddings((current, total, name) => {
+            // Optional: could emit progress events
+          });
+
+          return {
+            status: "success",
+            action: "generate",
+            processed: result.processed,
+            total_entities: result.total,
+            message: `Generated embeddings for ${result.processed} entities`,
+          };
+        } catch (error) {
+          return {
+            status: "error",
+            action: "generate",
+            message: error.message,
+          };
+        }
+      }
+
+      default:
+        return {
+          status: "error",
+          message: `Unknown action: ${action}`,
+        };
+    }
   }
 
   return server;
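The `status` action above reports how many stored entities already carry an embedding. A minimal sketch of that coverage calculation follows, assuming entities are kept in a plain name-to-record map shaped like `storage.data.entities`; the helper name `summarizeCoverage` and the sample records are hypothetical and not part of the package.

```javascript
// Hypothetical helper (not in the package): summarize embedding coverage
// for a name -> record map shaped like storage.data.entities.
function summarizeCoverage(entities) {
  const records = Object.values(entities);
  // A record counts as covered when its `embedding` field is truthy,
  // mirroring the optional-chaining check used in the diff.
  const withEmbeddings = records.filter((r) => Boolean(r && r.embedding)).length;
  const total = records.length;
  return {
    entities_with_embeddings: withEmbeddings,
    total_entities: total,
    // Guard against division by zero on an empty store.
    coverage_percent: total > 0 ? Math.round((withEmbeddings / total) * 100) : 0,
  };
}

// Illustrative data only: two of three records have an embedding.
const sampleEntities = {
  "task-a": { observations: ["first"], embedding: [0.1, 0.2] },
  "task-b": { observations: ["second"] },
  "task-c": { observations: ["third"], embedding: [0.3, 0.4] },
};

console.log(summarizeCoverage(sampleEntities));
// coverage_percent is 67 (2 of 3, rounded)
```

The empty-store guard matters because a fresh database has zero entities, and an unguarded division would report `NaN` instead of `0`.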
@@ -1079,7 +1239,7 @@ Future agents will read your observations to learn. Write for them, not for your
  * Main entry point
  */
 async function main() {
-  const server = createServer();
+  const server = await createServer();
   const transport = new StdioServerTransport();
   await server.connect(transport);
 }
package/src/storage.js
CHANGED
@@ -8,6 +8,7 @@ import { existsSync, mkdirSync, readFileSync, writeFileSync } from "fs";
 import path from "path";
 import os from "os";
 import { Entity, Relation } from "./models.js";
+import { getEmbeddings } from "./embeddings.js";
 
 const DB_DIR = path.join(os.homedir(), ".collective-memory");
 const DB_PATH = path.join(DB_DIR, "memory.json");
@@ -16,9 +17,12 @@ const DB_PATH = path.join(DB_DIR, "memory.json");
  * Simple file-based storage
  */
 export class Storage {
-  constructor(dbPath = DB_PATH) {
+  constructor(dbPath = DB_PATH, embeddingsEnabled = true) {
     this.dbPath = dbPath;
     this.data = null;
+    this.embeddingsEnabled = embeddingsEnabled;
+    this.embeddings = null;
+    this.embeddingsReady = false;
     // Initialize synchronously
     this.init();
   }
@@ -41,7 +45,7 @@ export class Storage {
       this.data = {
         entities: {},
         relations: [],
-        version: "
+        version: "2.0", // Version bump for embeddings support
       };
       this.saveSync();
     }
@@ -50,11 +54,61 @@ export class Storage {
       this.data = {
         entities: {},
         relations: [],
-        version: "
+        version: "2.0",
       };
     }
   }
 
+  /**
+   * Initialize embeddings asynchronously
+   */
+  async initEmbeddings() {
+    if (!this.embeddingsEnabled || this.embeddingsReady) {
+      return this.embeddingsReady;
+    }
+
+    try {
+      this.embeddings = getEmbeddings();
+      this.embeddingsReady = await this.embeddings.isAvailable();
+    } catch (error) {
+      console.warn("Embeddings not available:", error.message);
+      this.embeddingsReady = false;
+    }
+
+    return this.embeddingsReady;
+  }
+
+  /**
+   * Generate embedding for an entity
+   */
+  async generateEmbedding(entity) {
+    if (!this.embeddingsReady || !this.embeddings) {
+      return null;
+    }
+
+    try {
+      const text = this.embeddings.createEntityText(entity);
+      return await this.embeddings.embed(text);
+    } catch (error) {
+      console.warn("Failed to generate embedding:", error.message);
+      return null;
+    }
+  }
+
+  /**
+   * Store embedding in entity data
+   */
+  async updateEntityEmbedding(entityName, entity) {
+    if (!this.embeddingsReady) {
+      return;
+    }
+
+    const embedding = await this.generateEmbedding(entity);
+    if (embedding) {
+      this.data.entities[entityName].embedding = embedding;
+    }
+  }
+
   /**
    * Save data synchronously
    */
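The three methods added in this hunk share one pattern: the embedding provider is probed lazily, and any failure leaves the store in a keyword-only mode instead of throwing. The sketch below isolates that pattern under stated assumptions. The class name `EmbeddingGate`, the `providerFactory` argument, and the fake provider are hypothetical; only the `isAvailable()` and `embed(text)` method names are taken from the calls visible in the diff.

```javascript
// Hypothetical stand-in for the storage wrapper shown in the hunk above.
// The provider is created on first use; failures degrade to "not ready".
class EmbeddingGate {
  constructor(providerFactory, enabled = true) {
    // Assumed shape: providerFactory() returns { isAvailable(), embed(text) }.
    this.providerFactory = providerFactory;
    this.enabled = enabled;
    this.provider = null;
    this.ready = false;
  }

  // Probe the provider once; repeated calls return the cached flag when ready.
  async init() {
    if (!this.enabled || this.ready) return this.ready;
    try {
      this.provider = this.providerFactory();
      this.ready = await this.provider.isAvailable();
    } catch (error) {
      console.warn("Embeddings not available:", error.message);
      this.ready = false;
    }
    return this.ready;
  }

  // Return a vector, or null when the provider is missing or fails.
  async embedOrNull(text) {
    if (!this.ready || !this.provider) return null;
    try {
      return await this.provider.embed(text);
    } catch (error) {
      console.warn("Failed to generate embedding:", error.message);
      return null;
    }
  }
}

// Usage with a fake provider (illustrative only).
const fakeProviderFactory = () => ({
  isAvailable: async () => true,
  embed: async (text) => [text.length, 1],
});

const demoGate = new EmbeddingGate(fakeProviderFactory);
demoGate.init().then(async (ready) => {
  console.log("ready:", ready, "vector:", await demoGate.embedOrNull("hello"));
});
```

Returning `null` rather than throwing lets callers such as `updateEntityEmbedding` skip the write and still persist the entity.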
@@ -90,7 +144,12 @@ export class Storage {
     if (this.data.entities[entity.name]) {
       return false;
     }
-
+    const entityData = entity.toJSON();
+    this.data.entities[entity.name] = entityData;
+
+    // Generate embedding if available
+    await this.updateEntityEmbedding(entity.name, entity);
+
     await this.save();
     return true;
   }
@@ -127,6 +186,7 @@ export class Storage {
     const entity = this.data.entities[name];
     if (!entity) return false;
 
+    const updated = false;
     if (observations !== undefined) {
       entity.observations = observations;
     }
@@ -134,6 +194,12 @@ export class Storage {
       entity.metadata = metadata;
     }
 
+    // Regenerate embedding if observations changed
+    if (observations !== undefined && this.embeddingsReady) {
+      const entityObj = new Entity(entity);
+      await this.updateEntityEmbedding(name, entityObj);
+    }
+
     await this.save();
     return true;
   }
@@ -257,17 +323,131 @@ export class Storage {
 
   /**
    * Search entities by name, type, or observations
+   * Uses word-based matching - any word in the query that matches returns the entity
    */
   searchEntities(query) {
-
+    // Split query into words, remove common stop words
+    const stopWords = new Set(["the", "a", "an", "and", "or", "but", "in", "on", "at", "to", "for", "of", "with", "by"]);
+    const words = query
+      .toLowerCase()
+      .split(/\s+/)
+      .filter(w => w.length > 2 && !stopWords.has(w));
+
+    if (words.length === 0) {
+      // Fallback to original query if all words were filtered
+      const lowerQuery = query.toLowerCase();
+      return this.getAllEntities().filter(e => {
+        if (e.name.toLowerCase().includes(lowerQuery)) return true;
+        if (e.entityType.toLowerCase().includes(lowerQuery)) return true;
+        if (e.observations.some(o => o.toLowerCase().includes(lowerQuery))) return true;
+        return false;
+      });
+    }
+
     return this.getAllEntities().filter(e => {
-      if
-
-
+      // Check if ANY word matches in name, type, or observations
+      for (const word of words) {
+        if (e.name.toLowerCase().includes(word)) return true;
+        if (e.entityType.toLowerCase().includes(word)) return true;
+        if (e.observations.some(o => o.toLowerCase().includes(word))) return true;
+      }
       return false;
     });
   }
 
+  /**
+   * Semantic search using embeddings
+   * Returns entities ranked by similarity score
+   */
+  async semanticSearchEntities(query, options = {}) {
+    const {
+      topK = 10,
+      threshold = 0.65, // Lower threshold for more matches
+      entityType = null,
+    } = options;
+
+    // Initialize embeddings if not ready
+    if (!this.embeddingsReady) {
+      await this.initEmbeddings();
+    }
+
+    // Fall back to keyword search if embeddings not available
+    if (!this.embeddingsReady) {
+      const results = this.searchEntities(query);
+      return {
+        results: results.map(e => ({ entity: e, score: 0 })),
+        method: "keyword",
+        count: results.length,
+      };
+    }
+
+    try {
+      // Generate embedding for the query
+      const queryEmbedding = await this.embeddings.embed(query);
+
+      // Get all entities with their embeddings
+      const allEntities = this.getAllEntities();
+      const items = allEntities
+        .filter(e => !entityType || e.entityType === entityType)
+        .map(e => ({
+          entity: e,
+          embedding: this.data.entities[e.name]?.embedding || null,
+        }))
+        .filter(item => item.embedding !== null);
+
+      // Find most similar
+      const scoredResults = this.embeddings.findMostSimilar(
+        queryEmbedding,
+        items,
+        topK,
+        threshold
+      );
+
+      return {
+        results: scoredResults.map(r => ({ entity: r.entity, score: r.score })),
+        method: "semantic",
+        count: scoredResults.length,
+      };
+    } catch (error) {
+      console.warn("Semantic search failed, falling back to keyword:", error.message);
+      const results = this.searchEntities(query);
+      return {
+        results: results.map(e => ({ entity: e, score: 0 })),
+        method: "keyword",
+        count: results.length,
+      };
+    }
+  }
+
+  /**
+   * Generate embeddings for all entities that don't have them
+   * Useful for migrating existing data to semantic search
+   */
+  async generateMissingEmbeddings(progressCallback = null) {
+    // Initialize embeddings if not ready
+    if (!this.embeddingsReady) {
+      const ready = await this.initEmbeddings();
+      if (!ready) {
+        throw new Error("Embeddings provider not available");
+      }
+    }
+
+    const entities = this.getAllEntities();
+    const missing = entities.filter(e => !this.data.entities[e.name]?.embedding);
+
+    let processed = 0;
+    for (const entity of missing) {
+      await this.updateEntityEmbedding(entity.name, entity);
+      processed++;
+      if (progressCallback) {
+        progressCallback(processed, missing.length, entity.name);
+      }
+    }
+
+    await this.save();
+    return { processed, total: entities.length };
+  }
+
   /**
    * Get entities related to a given entity
    */
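The hunk above pairs a ranked semantic search with a keyword fallback. The sketch below shows one way the two paths could fit together. It assumes cosine similarity as the scoring function, which the diff does not confirm because the body of `findMostSimilar` lives in `embeddings.js` and is not shown here. The names `cosineSimilarity`, `rankBySimilarity`, `keywordMatch`, and `searchWithFallback` are hypothetical, and passing `null` as the query embedding stands in for an unavailable provider.

```javascript
// Assumption: cosine similarity. The diff does not show how findMostSimilar
// scores items, so this is only one plausible reading.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every item, drop those below the threshold, keep the topK best.
function rankBySimilarity(queryEmbedding, items, topK = 10, threshold = 0.65) {
  return items
    .map((item) => ({
      entity: item.entity,
      score: cosineSimilarity(queryEmbedding, item.embedding),
    }))
    .filter((r) => r.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

const STOP_WORDS = new Set(["the", "a", "an", "and", "or", "but", "in", "on", "at", "to", "for", "of", "with", "by"]);

// Keyword path: match if ANY surviving query word appears in name, type, or observations.
function keywordMatch(query, entities) {
  const words = query
    .toLowerCase()
    .split(/\s+/)
    .filter((w) => w.length > 2 && !STOP_WORDS.has(w));
  // If every word was filtered out, fall back to the whole lowercased query.
  const needles = words.length > 0 ? words : [query.toLowerCase()];
  return entities.filter((e) =>
    needles.some(
      (n) =>
        e.name.toLowerCase().includes(n) ||
        e.entityType.toLowerCase().includes(n) ||
        e.observations.some((o) => o.toLowerCase().includes(n))
    )
  );
}

// Hypothetical combined entry point: a null queryEmbedding simulates
// "embeddings not available" and triggers the keyword path with score 0.
function searchWithFallback(query, queryEmbedding, entities) {
  if (queryEmbedding) {
    const items = entities
      .filter((e) => e.embedding)
      .map((e) => ({ entity: e, embedding: e.embedding }));
    const results = rankBySimilarity(queryEmbedding, items);
    return { results, method: "semantic", count: results.length };
  }
  const results = keywordMatch(query, entities).map((e) => ({ entity: e, score: 0 }));
  return { results, method: "keyword", count: results.length };
}

// Illustrative data with toy two-dimensional vectors.
const demoEntities = [
  { name: "fix-login-bug", entityType: "task", observations: ["Patched session handling"], embedding: [1, 0] },
  { name: "db-migration", entityType: "task", observations: ["Moved schema to v2"], embedding: [0, 1] },
];

console.log(searchWithFallback("auth problems", [0.9, 0.1], demoEntities).method); // "semantic"
console.log(searchWithFallback("the migration", null, demoEntities).method); // "keyword"
```

Reporting `method` alongside the results lets a caller tell ranked semantic hits from unranked keyword hits, which all carry a score of `0`.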