audrey 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/LICENSE ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 Tyler Eveland

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,493 @@
# Audrey

Biological memory architecture for AI agents. Gives agents cognitive memory that decays, consolidates, self-validates, and learns from experience — not just a database.

## Why Audrey Exists

Every AI memory tool today (Mem0, Zep, LangChain Memory) is a filing cabinet. Store stuff, retrieve stuff. None of them do what biological memory actually does:

- Memories don't decay. A fact from 6 months ago has the same weight as one from today.
- No consolidation. Raw events never become general principles.
- No contradiction detection. Conflicting facts coexist silently.
- No self-defense. If an agent hallucinates and encodes the hallucination, it becomes "truth."

Audrey fixes all of this by modeling memory the way the brain does:

| Brain Structure | Audrey Component | What It Does |
|---|---|---|
| Hippocampus | Episodic Memory | Fast capture of raw events and observations |
| Neocortex | Semantic Memory | Consolidated principles and patterns |
| Sleep Replay | Consolidation Engine | Extracts patterns from episodes, promotes to principles |
| Prefrontal Cortex | Validation Engine | Truth-checking, contradiction detection |
| Amygdala | Salience Scorer | Importance weighting for retention priority |

## Install

```bash
npm install audrey
```

Zero external infrastructure. One SQLite file. That's it.

## Quick Start

```js
import { Audrey } from 'audrey';

const brain = new Audrey({
  dataDir: './agent-memory',
  agent: 'my-agent',
  embedding: { provider: 'openai', model: 'text-embedding-3-small' },
});

// Agent observes something
await brain.encode({
  content: 'Stripe API returns 429 above 100 req/s',
  source: 'direct-observation',
  salience: 0.9,
  causal: { trigger: 'batch-payment-job', consequence: 'queue-stalled' },
  tags: ['stripe', 'rate-limit'],
});

// Later — agent encounters Stripe again
const memories = await brain.recall('stripe rate limits', {
  minConfidence: 0.5,
  types: ['semantic', 'procedural'],
  limit: 5,
});

// Run consolidation (the "sleep" cycle)
await brain.consolidate();

// Check brain health
const stats = brain.introspect();
// { episodic: 47, semantic: 12, procedural: 3, dormant: 8, ... }

brain.close();
```

## Core Concepts

### Four Memory Types

**Episodic** (hot, fast decay) — Raw events. "Stripe returned 429 at 3pm." Immutable. Append-only. Never modified.

**Semantic** (warm, slow decay) — Consolidated principles. "Stripe enforces 100 req/s rate limit." Extracted automatically from clusters of episodic memories.

**Procedural** (cold, slowest decay) — Learned workflows. "When Stripe rate-limits, implement exponential backoff." Skills the agent has acquired.

**Causal** — Why things happened. Not just "A then B" but "A caused B because of mechanism C." Prevents correlation-as-causation.

### Confidence Formula

Every memory has a compositional confidence score:

```
C(m, t) = w_s * S + w_e * E + w_r * R(t) + w_ret * Ret(t)
```

| Component | What It Measures | Default Weight |
|---|---|---|
| **S** — Source reliability | How trustworthy is the origin? | 0.30 |
| **E** — Evidence agreement | Do observations agree or contradict? | 0.35 |
| **R(t)** — Recency decay | How old is the memory? (Ebbinghaus curve) | 0.20 |
| **Ret(t)** — Retrieval reinforcement | How often is this memory accessed? | 0.15 |

Source reliability hierarchy:

| Source Type | Reliability |
|---|---|
| `direct-observation` | 0.95 |
| `told-by-user` | 0.90 |
| `tool-result` | 0.85 |
| `inference` | 0.60 |
| `model-generated` | 0.40 (capped at 0.6 confidence) |

The `model-generated` cap prevents circular self-confirmation — an agent can't boost its own hallucinations into high-confidence "facts."
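
As a sketch (not Audrey's internal implementation), the formula and the two tables above combine like this. Weights, reliabilities, and the `model-generated` cap are taken straight from this README; everything else is illustrative:

```js
// Illustrative sketch of the compositional confidence score.
// Weights and source reliabilities are the defaults from the tables above.
const WEIGHTS = { source: 0.30, evidence: 0.35, recency: 0.20, retrieval: 0.15 };
const RELIABILITY = {
  'direct-observation': 0.95,
  'told-by-user': 0.90,
  'tool-result': 0.85,
  'inference': 0.60,
  'model-generated': 0.40,
};

function confidence({ source, evidenceAgreement, ageDays, halfLifeDays, retrievalScore }) {
  const S = RELIABILITY[source];                    // source reliability
  const E = evidenceAgreement;                      // 0..1, agreement among observations
  const R = Math.pow(0.5, ageDays / halfLifeDays);  // recency decay (Ebbinghaus-style)
  const Ret = retrievalScore;                       // 0..1, normalized retrieval reinforcement
  const c =
    WEIGHTS.source * S +
    WEIGHTS.evidence * E +
    WEIGHTS.recency * R +
    WEIGHTS.retrieval * Ret;
  // the cap: model-generated memories can never exceed 0.6
  return source === 'model-generated' ? Math.min(c, 0.6) : c;
}

// A fresh, fully-agreed direct observation:
confidence({
  source: 'direct-observation',
  evidenceAgreement: 1,
  ageDays: 0,
  halfLifeDays: 7,
  retrievalScore: 0,
}); // → 0.835
```

Note that even a perfectly reinforced `model-generated` memory (raw score 0.82) comes back as 0.6, which is what keeps hallucinations out of high-confidence territory.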

### Decay (Forgetting Curves)

Unreinforced memories lose confidence over time following Ebbinghaus exponential decay:

| Memory Type | Half-Life | Rationale |
|---|---|---|
| Episodic | 7 days | Raw events go stale fast |
| Semantic | 30 days | Principles are hard-won |
| Procedural | 90 days | Skills are slowest to forget |

Retrieval resets the decay clock. Frequently accessed memories persist. Memories below the dormant threshold (0.1) become dormant — still searchable with `includeDormant: true`, but excluded from default recall.
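
A minimal sketch of the half-life math, using the table above (illustrative, not the library's internals):

```js
// Confidence after d days without reinforcement, per memory type.
// Half-lives are the defaults from the table above.
const HALF_LIFE_DAYS = { episodic: 7, semantic: 30, procedural: 90 };

function decayedConfidence(confidence, type, daysSinceReinforced) {
  return confidence * Math.pow(0.5, daysSinceReinforced / HALF_LIFE_DAYS[type]);
}

decayedConfidence(0.8, 'episodic', 7);  // → 0.4 (one half-life)
decayedConfidence(0.8, 'episodic', 35); // → 0.025 (five half-lives, below the 0.1 dormant threshold)
```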

### Consolidation (The "Sleep" Cycle)

Audrey's consolidation engine periodically clusters similar episodic memories and extracts general principles:

```
3 episodes about Stripe 429 errors
  → 1 semantic principle: "Stripe enforces ~100 req/s rate limit"
```

The pipeline: **Cluster** (embedding similarity) → **Extract** (LLM or callback) → **Validate** (check for contradictions) → **Promote** (write semantic memory) → **Audit** (log everything).

Consolidation is idempotent. Re-running on the same data produces no duplicates. Every run creates an audit record with input/output IDs for full traceability.
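
The Cluster step can be pictured as a greedy grouping by cosine similarity. This plain-JS version is a sketch of the idea only; Audrey's actual engine runs the vector math through sqlite-vec KNN queries, and the greedy seed-comparison strategy here is an assumption for illustration:

```js
// Cosine similarity between two embedding vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Greedy clustering: each episode joins the first cluster whose seed episode
// is similar enough, otherwise it seeds a new cluster.
function clusterEpisodes(episodes, similarityThreshold = 0.8) {
  const clusters = [];
  for (const ep of episodes) {
    const home = clusters.find(c => cosine(c[0].embedding, ep.embedding) >= similarityThreshold);
    if (home) home.push(ep);
    else clusters.push([ep]);
  }
  // Clusters of size >= minClusterSize move on to principle extraction.
  return clusters;
}
```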

### Contradiction Handling

When memories conflict, Audrey doesn't force a winner. Contradictions have a lifecycle:

```
open → resolved | context_dependent | reopened
```

Context-dependent truths are modeled explicitly:

```js
// "Stripe rate limit is 100 req/s" (live keys)
// "Stripe rate limit is 25 req/s" (test keys)
// Both true — under different conditions
```

New high-confidence evidence can reopen resolved disputes.

### Rollback

Bad consolidation? Undo it:

```js
const history = brain.consolidationHistory();
brain.rollback(history[0].id);
// Semantic memories → rolled_back state
// Source episodes → un-consolidated
// Full audit trail preserved
```

### Circular Self-Confirmation Defense

The most dangerous exploit in AI memory: agent hallucinates X, encodes it, later retrieves it, "reinforcement" boosts confidence, X eventually consolidates as "established truth."

Audrey's defenses:

1. **Source diversity requirement** — Consolidation requires evidence from 2+ distinct source types
2. **Model-generated cap** — Memories from `model-generated` sources are capped at 0.6 confidence
3. **Source lineage tracking** — Provenance chains detect when all evidence traces back to a single inference
4. **Source diversity score** — Every semantic memory tracks how many different source types contributed

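Defenses 1 and 4 boil down to counting distinct source types across a cluster's evidence. A hypothetical sketch (function names are illustrative, not Audrey's API):

```js
// Defense 4: how many distinct source types back this cluster?
function sourceDiversity(episodes) {
  return new Set(episodes.map(e => e.source)).size;
}

// Defense 1: refuse to promote a cluster whose evidence all comes
// from a single source type.
function canPromote(episodes) {
  return sourceDiversity(episodes) >= 2;
}

canPromote([{ source: 'model-generated' }, { source: 'model-generated' }]); // → false
canPromote([{ source: 'model-generated' }, { source: 'tool-result' }]);     // → true
```

Two model-generated episodes can never bootstrap themselves into a semantic principle, no matter how similar their embeddings are.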
## API Reference

### `new Audrey(config)`

```js
const brain = new Audrey({
  dataDir: './audrey-data',        // Where the SQLite DB lives
  agent: 'my-agent',               // Agent identifier
  embedding: {
    provider: 'openai',            // 'openai' | 'mock'
    model: 'text-embedding-3-small',
    apiKey: process.env.OPENAI_API_KEY,
  },
  consolidation: {
    interval: '1h',                // Auto-consolidation interval
    minEpisodes: 3,                // Minimum cluster size
    confidenceTarget: 2.0,         // Adaptive threshold multiplier
  },
  decay: {
    dormantThreshold: 0.1,         // Below this → dormant
  },
});
```

### `brain.encode(params)` → `Promise<string>`

Encode an episodic memory. Returns the memory ID.

```js
const id = await brain.encode({
  content: 'What happened',        // Required. Non-empty string.
  source: 'direct-observation',    // Required. See source types above.
  salience: 0.8,                   // Optional. 0-1. Default: 0.5
  causal: {                        // Optional. What caused this / what it caused.
    trigger: 'batch-processing',
    consequence: 'queue-backed-up',
  },
  tags: ['stripe', 'production'],  // Optional. Array of strings.
  supersedes: 'previous-id',       // Optional. ID of episode this corrects.
});
```

Episodes are **immutable**. Corrections create new records with `supersedes` links. The original is preserved.

### `brain.recall(query, options)` → `Promise<Memory[]>`

Retrieve memories ranked by `similarity * confidence`.

```js
const memories = await brain.recall('stripe rate limits', {
  minConfidence: 0.5,        // Filter below this confidence
  types: ['semantic'],       // Filter by memory type
  limit: 5,                  // Max results
  includeProvenance: true,   // Include evidence chains
  includeDormant: false,     // Include dormant memories
});
```

Each result:

```js
{
  id: '01ABC...',
  content: 'Stripe enforces ~100 req/s rate limit',
  type: 'semantic',
  confidence: 0.87,
  score: 0.74,               // similarity * confidence
  source: 'consolidation',
  state: 'active',
  provenance: {              // When includeProvenance: true
    evidenceEpisodeIds: ['01XYZ...', '01DEF...'],
    evidenceCount: 3,
    supportingCount: 3,
    contradictingCount: 0,
  },
}
```

Retrieval automatically reinforces matched memories (boosts confidence, resets decay clock).

### `brain.consolidate(options)` → `Promise<ConsolidationResult>`

Run the consolidation engine manually.

```js
const result = await brain.consolidate({
  minClusterSize: 3,
  similarityThreshold: 0.80,
  extractPrinciple: (episodes) => ({  // Optional LLM callback
    content: 'Extracted principle text',
    type: 'semantic',
  }),
});
// { runId, status, episodesEvaluated, clustersFound, principlesExtracted }
```

### `brain.decay(options)` → `DecayResult`

Apply forgetting curves. Transitions low-confidence memories to dormant.

```js
const result = brain.decay({ dormantThreshold: 0.1 });
// { totalEvaluated, transitionedToDormant, timestamp }
```

### `brain.rollback(runId)` → `RollbackResult`

Undo a consolidation run.

```js
brain.rollback('01ABC...');
// { rolledBackMemories: 3, restoredEpisodes: 9 }
```

### `brain.introspect()` → `Stats`

Get memory system health stats.

```js
brain.introspect();
// {
//   episodic: 247, semantic: 31, procedural: 8,
//   causalLinks: 42, dormant: 15,
//   contradictions: { open: 2, resolved: 7, context_dependent: 3, reopened: 0 },
//   lastConsolidation: '2026-02-18T22:00:00Z',
//   totalConsolidationRuns: 14,
// }
```

### `brain.consolidationHistory()` → `ConsolidationRun[]`

Full audit trail of all consolidation runs.

### Events

```js
brain.on('encode', ({ id, content, source }) => { ... });
brain.on('reinforcement', ({ episodeId, targetId, similarity }) => { ... });
brain.on('consolidation', ({ runId, principlesExtracted }) => { ... });
brain.on('decay', ({ totalEvaluated, transitionedToDormant }) => { ... });
brain.on('rollback', ({ runId, rolledBackMemories }) => { ... });
brain.on('error', (err) => { ... });
```

### `brain.close()`

Close the database connection and stop auto-consolidation.

## Architecture

```
audrey-data/
  audrey.db   ← Single SQLite file. WAL mode. That's your brain.
```

```
src/
  audrey.js        Main class. EventEmitter. Public API surface.
  confidence.js    Compositional confidence formula. Pure math.
  consolidate.js   "Sleep" cycle. Cluster → extract → promote.
  db.js            SQLite schema. 6 tables. CHECK constraints. Indexes.
  decay.js         Ebbinghaus forgetting curves.
  embedding.js     Pluggable providers (Mock, OpenAI).
  encode.js        Immutable episodic memory creation.
  introspect.js    Health dashboard queries.
  recall.js        Confidence-weighted vector retrieval.
  rollback.js      Undo consolidation runs.
  utils.js         Shared: cosine similarity, date math, safe JSON parse.
  validate.js      Reinforcement + contradiction lifecycle.
  index.js         Barrel export.
```

### Database Schema (6 tables)

| Table | Purpose | Key Columns |
|---|---|---|
| `episodes` | Immutable raw events | content, embedding, source, salience, causal_trigger/consequence, supersedes |
| `semantics` | Consolidated principles | content, embedding, state, evidence_episode_ids, source_type_diversity |
| `procedures` | Learned workflows | content, embedding, trigger_conditions, success/failure_count |
| `causal_links` | Why things happened | cause_id, effect_id, link_type (causal/correlational/temporal), mechanism |
| `contradictions` | Dispute tracking | claim_a/b_id, state (open/resolved/context_dependent/reopened), resolution |
| `consolidation_runs` | Audit trail | input_episode_ids, output_memory_ids, status, checkpoint_cursor |

All mutations use SQLite transactions for atomicity. CHECK constraints enforce valid states and source types.

## Running Tests

```bash
npm test             # 184 tests
npm run test:watch
```

## Running the Demo

```bash
node examples/stripe-demo.js
```

Demonstrates the full pipeline: encode 3 rate-limit observations → consolidate into a principle → recall proactively.

---

## Roadmap

### v0.1.0 — Foundation

- [x] Immutable episodic memory with append-only records
- [x] Compositional confidence formula (source + evidence + recency + retrieval)
- [x] Ebbinghaus-inspired forgetting curves with configurable half-lives
- [x] Dormancy transitions for low-confidence memories
- [x] Confidence-weighted recall across episodic/semantic/procedural types
- [x] Provenance chains (which episodes contributed to which principles)
- [x] Retrieval reinforcement (frequently accessed memories resist decay)
- [x] Consolidation engine with clustering and principle extraction
- [x] Idempotent consolidation with checkpoint cursors
- [x] Full consolidation audit trail (input/output IDs per run)
- [x] Consolidation rollback (undo bad runs, restore episodes)
- [x] Contradiction lifecycle (open/resolved/context_dependent/reopened)
- [x] Circular self-confirmation defense (model-generated cap at 0.6)
- [x] Source type diversity tracking on semantic memories
- [x] Supersedes links for correcting episodic memories
- [x] Pluggable embedding providers (Mock for tests, OpenAI for production)
- [x] Causal context storage (trigger/consequence per episode)
- [x] Introspection API (memory counts, contradiction stats, consolidation history)
- [x] EventEmitter lifecycle hooks (encode, reinforcement, consolidation, decay, rollback, error)
- [x] SQLite with WAL mode, CHECK constraints, indexes, foreign keys
- [x] Transaction safety on all multi-step mutations
- [x] Input validation on public API (content, salience, tags, source)
- [x] Shared utility extraction (cosine similarity, date math, safe JSON parse)
- [x] 104 tests across 12 test files
- [x] Proof-of-concept demo (Stripe rate limit scenario)

### v0.2.0 — LLM Integration

- [x] LLM-powered principle extraction (replace callback with Anthropic/OpenAI calls)
- [x] LLM-based contradiction detection during validation
- [x] Causal mechanism articulation via LLM (not just trigger/consequence)
- [x] Spurious correlation detection (require mechanistic explanation for causal links)
- [x] Context-dependent truth resolution via LLM
- [x] Configurable LLM provider for consolidation (Mock, Anthropic, OpenAI)
- [x] Structured prompt templates for all LLM operations
- [x] 142 tests across 15 test files

### v0.3.0 — Vector Performance

- [x] sqlite-vec native vector indexing (vec0 virtual tables with cosine distance)
- [x] KNN queries for recall, validation, and consolidation clustering (all vector math in C)
- [x] SQL-native metadata filtering in KNN (state, source, consolidated)
- [x] Batch encoding API (`encodeBatch` — encode N episodes in one call)
- [x] Streaming recall with async generators (`recallStream`)
- [x] Dimension configuration and mismatch validation
- [x] Automatic migration from v0.2.0 embedding BLOBs to vec0 tables
- [x] 168 tests across 16 test files

### v0.3.1 — MCP Server (current)

- [x] MCP tool server via `@modelcontextprotocol/sdk` with stdio transport
- [x] 5 tools: `memory_encode`, `memory_recall`, `memory_consolidate`, `memory_introspect`, `memory_resolve_truth`
- [x] Configuration via environment variables (data dir, embedding provider, LLM provider)
- [x] Registration script for Claude Code (`mcp-server/register.sh`)
- [x] 184 tests across 17 test files

### v0.3.5 — Embedding Migration (deferred from v0.3.0)

- [ ] Embedding migration pipeline (re-embed when models change)
- [ ] Re-consolidation queue (re-run consolidation with new embedding model)

### v0.4.0 — Type Safety & Developer Experience

- [ ] Full TypeScript conversion with strict mode
- [ ] JSDoc types on all exports (interim before TS conversion)
- [ ] Published type declarations (.d.ts)
- [ ] Schema versioning and migration system
- [ ] Structured logging (optional, pluggable)
- [ ] npm publish with proper package metadata

### v0.5.0 — Advanced Memory Features

- [ ] Adaptive consolidation threshold (learn optimal N per domain, not fixed N=3)
- [ ] Source-aware confidence for semantic memories (track strongest source composition)
- [ ] Configurable decay rates per Audrey instance
- [ ] Configurable confidence weights per Audrey instance
- [ ] PII detection and redaction (opt-in)
- [ ] Memory export/import (JSON snapshot)
- [ ] Auto-consolidation scheduling (setInterval with configurable interval)

### v0.6.0 — Scale

- [ ] pgvector adapter for PostgreSQL backend
- [ ] Redis adapter for distributed caching
- [ ] Connection pooling for concurrent agent access
- [ ] Pagination on recall queries (cursor-based)
- [ ] Benchmarks: encode throughput, recall latency at 10k/100k/1M memories

### v1.0.0 — Production Ready

- [ ] Comprehensive error handling at all boundaries
- [ ] Rate limiting on embedding API calls
- [ ] Memory usage profiling and optimization
- [ ] Security audit (injection, data isolation)
- [ ] Cross-agent knowledge sharing protocol (Hivemind)
- [ ] Documentation site
- [ ] Integration guides (LangChain, CrewAI, Claude Code, custom agents)

## Design Decisions

**Why SQLite, not Postgres?** Zero infrastructure. `npm install` and you have a brain. The adapter pattern means you can migrate to pgvector when you need to scale.

**Why append-only episodes?** Immutability creates a reliable audit trail. Corrections use `supersedes` links rather than mutations. You can always trace back to what actually happened.

**Why Ebbinghaus curves?** Biological forgetting is an adaptive feature, not a bug. It prevents cognitive overload, maintains relevance, and enables generalization. Audrey's forgetting works the same way.

**Why model-generated cap at 0.6?** Prevents the most dangerous exploit in AI memory: circular self-confirmation, where an agent's own inferences bootstrap themselves into high-confidence "facts" through repeated retrieval.

**Why no TypeScript yet?** Prototyping speed. TypeScript conversion is on the roadmap for v0.4.0. The pure-math modules (`confidence.js`, `utils.js`) are already type-safe in practice.

## License

MIT
package/mcp-server/index.js ADDED
@@ -0,0 +1,155 @@
#!/usr/bin/env node
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';
import { homedir } from 'node:os';
import { join } from 'node:path';
import { Audrey } from '../src/index.js';

const VALID_SOURCES = ['direct-observation', 'told-by-user', 'tool-result', 'inference', 'model-generated'];
const VALID_TYPES = ['episodic', 'semantic', 'procedural'];

function buildAudreyConfig() {
  const dataDir = process.env.AUDREY_DATA_DIR || join(homedir(), '.audrey', 'data');
  const agent = process.env.AUDREY_AGENT || 'claude-code';
  const embProvider = process.env.AUDREY_EMBEDDING_PROVIDER || 'mock';
  const embDimensions = parseInt(process.env.AUDREY_EMBEDDING_DIMENSIONS || '8', 10);
  const llmProvider = process.env.AUDREY_LLM_PROVIDER;

  const config = {
    dataDir,
    agent,
    embedding: { provider: embProvider, dimensions: embDimensions },
  };

  if (llmProvider === 'anthropic') {
    config.llm = { provider: 'anthropic', apiKey: process.env.ANTHROPIC_API_KEY };
  } else if (llmProvider === 'openai') {
    config.llm = { provider: 'openai', apiKey: process.env.OPENAI_API_KEY };
  } else if (llmProvider === 'mock') {
    config.llm = { provider: 'mock' };
  }

  return config;
}

function toolResult(data) {
  return { content: [{ type: 'text', text: JSON.stringify(data) }] };
}

function toolError(err) {
  return { isError: true, content: [{ type: 'text', text: `Error: ${err.message || String(err)}` }] };
}

async function main() {
  const config = buildAudreyConfig();
  const audrey = new Audrey(config);
  console.error(`[audrey-mcp] started — agent=${config.agent} dataDir=${config.dataDir}`);

  const server = new McpServer({
    name: 'audrey-memory',
    version: '0.3.0',
  });

  server.tool(
    'memory_encode',
    {
      content: z.string().describe('The memory content to encode'),
      source: z.enum(VALID_SOURCES).describe('Source type of the memory'),
      tags: z.array(z.string()).optional().describe('Optional tags for categorization'),
      salience: z.number().min(0).max(1).optional().describe('Importance weight 0-1'),
    },
    async ({ content, source, tags, salience }) => {
      try {
        const id = await audrey.encode({ content, source, tags, salience });
        return toolResult({ id, content, source });
      } catch (err) {
        return toolError(err);
      }
    },
  );

  server.tool(
    'memory_recall',
    {
      query: z.string().describe('Search query to match against memories'),
      limit: z.number().min(1).max(50).optional().describe('Max results (default 10)'),
      types: z.array(z.enum(VALID_TYPES)).optional().describe('Memory types to search'),
      min_confidence: z.number().min(0).max(1).optional().describe('Minimum confidence threshold'),
    },
    async ({ query, limit, types, min_confidence }) => {
      try {
        const results = await audrey.recall(query, {
          limit: limit ?? 10,
          types,
          minConfidence: min_confidence,
        });
        return toolResult(results);
      } catch (err) {
        return toolError(err);
      }
    },
  );

  server.tool(
    'memory_consolidate',
    {
      min_cluster_size: z.number().optional().describe('Minimum episodes per cluster'),
      similarity_threshold: z.number().optional().describe('Similarity threshold for clustering'),
    },
    async ({ min_cluster_size, similarity_threshold }) => {
      try {
        const consolidation = await audrey.consolidate({
          minClusterSize: min_cluster_size,
          similarityThreshold: similarity_threshold,
        });
        return toolResult(consolidation);
      } catch (err) {
        return toolError(err);
      }
    },
  );

  server.tool(
    'memory_introspect',
    {},
    async () => {
      try {
        const stats = audrey.introspect();
        return toolResult(stats);
      } catch (err) {
        return toolError(err);
      }
    },
  );

  server.tool(
    'memory_resolve_truth',
    {
      contradiction_id: z.string().describe('ID of the contradiction to resolve'),
    },
    async ({ contradiction_id }) => {
      try {
        const resolution = await audrey.resolveTruth(contradiction_id);
        return toolResult(resolution);
      } catch (err) {
        return toolError(err);
      }
    },
  );

  const transport = new StdioServerTransport();
  await server.connect(transport);
  console.error('[audrey-mcp] connected via stdio');

  process.on('SIGINT', () => {
    console.error('[audrey-mcp] shutting down');
    audrey.close();
    process.exit(0);
  });
}

main().catch(err => {
  console.error('[audrey-mcp] fatal:', err);
  process.exit(1);
});
package/mcp-server/register.sh ADDED
@@ -0,0 +1,30 @@
#!/usr/bin/env bash
# Register Audrey MCP server with Claude Code
# Usage: bash mcp-server/register.sh [--openai] [--anthropic]

SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
SERVER_PATH="$SCRIPT_DIR/index.js"

ARGS="--transport stdio --scope user"
ENV_ARGS=""

ENV_ARGS="$ENV_ARGS --env AUDREY_DATA_DIR=$HOME/.audrey/data"

if [[ "$*" == *"--openai"* ]]; then
  ENV_ARGS="$ENV_ARGS --env AUDREY_EMBEDDING_PROVIDER=openai --env AUDREY_EMBEDDING_DIMENSIONS=1536"
  [ -n "$OPENAI_API_KEY" ] && ENV_ARGS="$ENV_ARGS --env OPENAI_API_KEY=$OPENAI_API_KEY"
else
  ENV_ARGS="$ENV_ARGS --env AUDREY_EMBEDDING_PROVIDER=mock --env AUDREY_EMBEDDING_DIMENSIONS=8"
fi

if [[ "$*" == *"--anthropic"* ]]; then
  ENV_ARGS="$ENV_ARGS --env AUDREY_LLM_PROVIDER=anthropic"
  [ -n "$ANTHROPIC_API_KEY" ] && ENV_ARGS="$ENV_ARGS --env ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY"
fi

echo "Registering Audrey MCP server..."
echo "  Server: $SERVER_PATH"

claude mcp add $ARGS $ENV_ARGS audrey-memory -- node "$SERVER_PATH"

echo "Done. Run 'claude mcp list' to verify."