@wipcomputer/memory-crystal 0.7.28 → 0.7.30

Files changed (73)
  1. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/.env.example +20 -0
  2. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/.publish-skill.json +1 -0
  3. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/CHANGELOG.md +1297 -0
  4. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/CLA.md +19 -0
  5. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/LICENSE +52 -0
  6. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/README-ENTERPRISE.md +226 -0
  7. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/README.md +151 -0
  8. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/RELAY.md +199 -0
  9. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/SKILL.md +462 -0
  10. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/TECHNICAL.md +656 -0
  11. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/_trash/RELEASE-NOTES-v0-7-23.md +48 -0
  12. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/_trash/RELEASE-NOTES-v0-7-25.md +24 -0
  13. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/_trash/RELEASE-NOTES-v0-7-26.md +7 -0
  14. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/_trash/RELEASE-NOTES-v0-7-28.md +31 -0
  15. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/_trash/RELEASE-NOTES-v0-7-29.md +28 -0
  16. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/_trash/RELEASE-NOTES-v0-7-4.md +64 -0
  17. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/_trash/RELEASE-NOTES-v0-7-5.md +19 -0
  18. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/cloud/README.md +116 -0
  19. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/cloud/docs/gpt-system-instructions.md +69 -0
  20. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/cloud/migrations/0001_init.sql +52 -0
  21. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/migrations/0001_init.sql +51 -0
  22. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/migrations/0002_cloud_storage.sql +49 -0
  23. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/openclaw.plugin.json +11 -0
  24. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/package-lock.json +4169 -0
  25. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/package.json +61 -0
  26. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/scripts/crystal-capture.sh +29 -0
  27. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/scripts/deploy-cloud.sh +153 -0
  28. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/scripts/ldm-backup.sh +116 -0
  29. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/scripts/migrate-lance-to-sqlite.mjs +218 -0
  30. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/skills/memory/SKILL.md +438 -0
  31. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/wrangler-demo.toml +8 -0
  32. package/.worktrees/memory-crystal-private--cc-mini-fix-home-fallback/wrangler-mcp.toml +24 -0
  33. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/.env.example +20 -0
  34. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/.publish-skill.json +1 -0
  35. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/CHANGELOG.md +1297 -0
  36. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/CLA.md +19 -0
  37. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/LICENSE +52 -0
  38. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/README-ENTERPRISE.md +226 -0
  39. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/README.md +151 -0
  40. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/RELAY.md +199 -0
  41. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/RELEASE-NOTES-v0.7.30.md +29 -0
  42. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/SKILL.md +462 -0
  43. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/TECHNICAL.md +656 -0
  44. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/_trash/RELEASE-NOTES-v0-7-23.md +48 -0
  45. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/_trash/RELEASE-NOTES-v0-7-25.md +24 -0
  46. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/_trash/RELEASE-NOTES-v0-7-26.md +7 -0
  47. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/_trash/RELEASE-NOTES-v0-7-28.md +31 -0
  48. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/_trash/RELEASE-NOTES-v0-7-29.md +28 -0
  49. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/_trash/RELEASE-NOTES-v0-7-4.md +64 -0
  50. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/_trash/RELEASE-NOTES-v0-7-5.md +19 -0
  51. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/cloud/README.md +116 -0
  52. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/cloud/docs/gpt-system-instructions.md +69 -0
  53. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/cloud/migrations/0001_init.sql +52 -0
  54. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/migrations/0001_init.sql +51 -0
  55. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/migrations/0002_cloud_storage.sql +49 -0
  56. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/openclaw.plugin.json +11 -0
  57. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/package-lock.json +4169 -0
  58. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/package.json +61 -0
  59. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/scripts/crystal-capture.sh +29 -0
  60. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/scripts/deploy-cloud.sh +153 -0
  61. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/scripts/ldm-backup.sh +116 -0
  62. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/scripts/migrate-lance-to-sqlite.mjs +218 -0
  63. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/skills/memory/SKILL.md +438 -0
  64. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/wrangler-demo.toml +8 -0
  65. package/.worktrees/memory-crystal-private--cc-mini-release-notes-v0.7.30/wrangler-mcp.toml +24 -0
  66. package/CHANGELOG.md +63 -0
  67. package/SKILL.md +13 -3
  68. package/TECHNICAL.md +30 -2
  69. package/_trash/RELEASE-NOTES-v0-7-28.md +15 -8
  70. package/_trash/RELEASE-NOTES-v0-7-29.md +28 -0
  71. package/_trash/RELEASE-NOTES-v0.7.30.md +29 -0
  72. package/package.json +1 -1
  73. package/scripts/migrate-lance-to-sqlite.mjs +2 -1
@@ -0,0 +1,656 @@
1
+ ###### WIP Computer
2
+
3
+ # Technical Documentation
4
+
5
+ How Memory Crystal works. Architecture, design decisions, integrations, encryption, search, and everything else the open source community is going to ask about.
6
+
7
+ ## How Does It Work?
8
+
9
+ Memory Crystal captures every conversation you have with any AI tool, embeds it into a local SQLite database, and makes it searchable with hybrid search (keyword + semantic). One database file. Runs on your machine. Nothing leaves your device unless you set up multi-device sync.
10
+
11
+ ### Five-Layer Memory Stack
12
+
13
+ | Layer | What | How |
14
+ |-------|------|-----|
15
+ | L1: Raw Transcripts | Every conversation archived as JSONL | Automatic capture (cron, hooks, plugins) |
16
+ | L2: Vector Index | Chunks embedded into crystal.db | Automatic. Hybrid search (BM25 + vector + RRF) |
17
+ | L3: Structured Memory | Facts, preferences, decisions | `crystal_remember` / `crystal_forget` |
18
+ | L4: Narrative Consolidation | Dream Weaver journals, identity, soul | `crystal dream-weave` (via Dream Weaver Protocol) |
19
+ | L5: Active Working Context | Boot sequence files, shared context | Agent reads on startup |
20
+
21
+ Every conversation produces three artifacts:
22
+ 1. **JSONL transcript** ... the raw session, archived to disk
23
+ 2. **Markdown summary** ... title, summary, key topics (generated by LLM or simple extraction)
24
+ 3. **Vector embeddings** ... chunked, embedded, and stored in crystal.db for search
25
+
26
+ ## How Does It Work with Claude Code CLI?
27
+
28
+ Two capture paths work together. The poller is primary. The Stop hook is redundancy.
29
+
30
+ ### Continuous Capture (Primary)
31
+
32
+ A cron job runs `cc-poller.ts` every minute. It reads Claude Code's JSONL transcript files via byte-offset watermarking (only reads new data since last capture) and produces all three artifacts in a single pass:
33
+
34
+ 1. Extracts user, assistant, and thinking blocks
35
+ 2. Chunks them, embeds into sqlite-vec
36
+ 3. Archives the full JSONL transcript
37
+ 4. Generates a markdown session summary
38
+ 5. Appends a daily breadcrumb log
39
+
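The watermarking idea above can be sketched as follows. This is a minimal illustration, not the code in `cc-poller.ts`: the helper name `readSinceWatermark` and the `CaptureResult` shape are hypothetical, and a string offset stands in for the byte offset a real poller would keep for a file on disk.

```typescript
// Hypothetical result shape: the new lines plus the offset to persist for the next run.
interface CaptureResult {
  lines: string[];   // complete JSONL lines appended since the previous watermark
  watermark: number; // offset to store so the next run starts here
}

// Assumption: `content` is the whole transcript as a string; a real poller
// would seek to a byte offset in the file instead of slicing a string.
function readSinceWatermark(content: string, watermark: number): CaptureResult {
  // If the file shrank (rotated or truncated), start again from the beginning.
  const start = watermark > content.length ? 0 : watermark;
  const fresh = content.slice(start);
  // Consume only up to the last newline so a half-written line waits for the next run.
  const lastNewline = fresh.lastIndexOf("\n");
  if (lastNewline === -1) return { lines: [], watermark: start };
  const lines = fresh
    .slice(0, lastNewline)
    .split("\n")
    .filter((line) => line.length > 0);
  return { lines, watermark: start + lastNewline + 1 };
}
```

Stopping at the last newline is one way to avoid parsing a JSONL record that the writer has not finished yet.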
40
+ **Install:**
41
+ ```bash
42
+ crystal init # Scaffolds ~/.ldm/, deploys capture script, installs cron
43
+ ```
44
+
45
+ This copies `crystal-capture.sh` to `~/.ldm/bin/` and installs a cron entry:
46
+ ```
47
+ * * * * * ~/.ldm/bin/crystal-capture.sh >> ~/.ldm/logs/crystal-capture.log 2>&1
48
+ ```
49
+
50
+ The script calls `node ~/.ldm/extensions/memory-crystal/dist/cc-poller.js`. The poller fetches the OpenAI API key internally via `opRead()` (1Password SA token). No secrets in the shell script.
51
+
52
+ ### Stop Hook (Redundancy)
53
+
54
+ The Claude Code Stop hook runs `cc-hook.ts` after every response. It checks the watermark and flushes anything the poller missed. If the poller already captured everything, the Stop hook is a no-op.
55
+
56
+ ```json
57
+ {
58
+ "hooks": {
59
+ "Stop": [{ "hooks": [{ "type": "command", "command": "node ~/.ldm/extensions/memory-crystal/dist/cc-hook.js", "timeout": 30 }] }]
60
+ }
61
+ }
62
+ ```
63
+
64
+ ```bash
65
+ node dist/cc-hook.js --on # Enable capture
66
+ node dist/cc-hook.js --off # Pause capture
67
+ node dist/cc-hook.js --status # Check status
68
+ ```
69
+
70
+ Respects private mode. When capture is off, nothing is recorded.
71
+
72
+ ### Why Both?
73
+
74
+ The Stop hook only fires when a session ends. Long sessions, remote disconnects, and context compactions never trigger Stop. A 72-hour session produced zero captures with Stop-only. The poller decouples capture from the session lifecycle entirely.
75
+
76
+ ## How Does It Work with OpenClaw?
77
+
78
+ Memory Crystal is an OpenClaw plugin. It registers tools (`crystal_search`, `crystal_remember`, `crystal_forget`, `crystal_status`) and an `agent_end` hook that captures conversations after every agent turn.
79
+
80
+ After embedding, the plugin also syncs raw data to LDM:
81
+ - Session JSONLs from `~/.openclaw/agents/main/sessions/`
82
+ - Workspace .md files from `~/.openclaw/workspace/`
83
+ - Daily logs from `~/.openclaw/workspace/memory/`
84
+
85
+ This ensures LDM has a complete copy of all agent data, not just embeddings. Raw data is never modified. Copies are idempotent (skip if same size).
86
+
87
+ Deployed to `~/.openclaw/extensions/memory-crystal/`. The plugin uses the same `core.ts` as every other interface. Same search, same database, same embeddings.
88
+
89
+ ## How Does It Work with ChatGPT and Claude (iOS, web, macOS)?
90
+
91
+ Memory Crystal runs a remote MCP server (`worker-mcp.ts`) on Cloudflare Workers. ChatGPT and Claude connect via OAuth 2.1 on any surface: macOS app, iOS app, or web.
92
+
93
+ Four tools: `memory_search`, `memory_remember`, `memory_forget`, `memory_status`.
94
+
95
+ **Tier 1 (Sovereign):** Memories are encrypted and relayed to your Crystal Core. No cloud search. The cloud MCP tells the client "search is available on your local devices."
96
+
97
+ **Tier 2 (Convenience):** Memories are stored in D1 + Vectorize for cloud search. Same hybrid search algorithm as local Crystal (BM25 + vector + RRF). Your Crystal Core is still the source of truth.
98
+
99
+ **Source:** `src/worker-mcp.ts` (OAuth + MCP server), `src/cloud-crystal.ts` (D1 + Vectorize backend)
100
+
101
+ ## How Does It Work with Other Tools?
102
+
103
+ Any tool that can run shell commands or call an MCP server can use Memory Crystal.
104
+
105
+ - **MCP Server** ... `mcp-server.ts` exposes `crystal_search`, `crystal_remember`, `crystal_forget`, `crystal_status`, `crystal_sources_add`, `crystal_sources_sync`, `crystal_sources_status`. Works with Claude Desktop, Claude Code, or any MCP-compatible client.
106
+ - **CLI** ... `crystal search "query"` from any terminal. Any tool with shell access can call it.
107
+ - **Module** ... `import { MemoryCrystal } from 'memory-crystal'` for Node.js integration.
108
+
109
+ ## Crystal Core and Crystal Node
110
+
111
+ Memory Crystal uses a Core/Node architecture for multi-device setups:
112
+
113
+ - **Crystal Core** ... your master memory. All conversations, all embeddings, all memories. This is the database you cannot lose. Install it on something permanent: a desktop, a home server, a Mac mini
114
+ - **Crystal Node** ... a synced copy on any other device. Captures conversations, sends them to the Core via encrypted relay. Gets a mirror back for local search. If a node dies, nothing is lost. The Core has everything
115
+
116
+ One Core, many Nodes. The Core does embeddings. Nodes just capture and sync.
117
+
118
+ **Role management:**
119
+ - `crystal role` ... show current role (Core or Node) and what it's connected to
120
+ - `crystal promote` ... make this machine the Crystal Core
121
+ - `crystal demote` ... make this machine a Crystal Node (connects to an existing Core)
122
+
123
+ You can move the Core later. Start on a laptop, get a desktop, run `crystal promote` on the desktop. The old Core becomes a Node. No data loss.
124
+
125
+ **If you install the Core on a laptop:** set up automated backups. iCloud Drive, external drive, wherever you trust. Your Core is your memory. Back it up.
126
+
127
+ ## Architecture
128
+
129
+ One core, multiple interfaces. Two Workers. Three relay channels.
130
+
131
+ ```
132
+ Local:
133
+ sqlite-vec (vectors) + FTS5 (BM25) + SQLite (metadata)
134
+ | | |
135
+ core.ts ... pure logic, zero framework deps
136
+ |-- cli.ts -> crystal search, dream-weave, backfill, serve
137
+ |-- mcp-server.ts -> crystal_search (Claude Code, Claude Desktop) + MCP sampling
138
+ |-- openclaw.ts -> plugin (OpenClaw agents) + raw data sync to LDM
139
+ |-- llm.ts -> LLM provider cascade, query expansion, re-ranking
140
+ |-- search-pipeline.ts -> deep search pipeline (expand, search, RRF, rerank, blend)
141
+ |-- cc-hook.ts -> Claude Code Stop hook + relay commands
142
+ |-- crystal-serve.ts -> Crystal Core gateway (localhost:18790)
143
+ |-- dream-weaver.ts -> Dream Weaver integration (narrative consolidation)
144
+ |-- staging.ts -> New agent staging pipeline
145
+
146
+ Cloud:
147
+ D1 (SQL + FTS5) + Vectorize (vectors)
148
+ | |
149
+ cloud-crystal.ts ... same search algorithm, Cloudflare backends
150
+ |-- worker-mcp.ts -> OAuth 2.1 + MCP (ChatGPT, Claude)
151
+
152
+ Relay:
153
+ worker.ts -> Encrypted dead drop (R2, 3 channels)
154
+ poller.ts -> Crystal Core pickup + staging + commands
155
+ mirror-sync.ts -> DB mirror to devices
156
+ crypto.ts -> AES-256-GCM + HMAC-SHA256
157
+
158
+ Init + Backfill:
159
+ discover.ts -> Harness auto-detection (CC + OpenClaw)
160
+ bulk-copy.ts -> Raw file copy to LDM (idempotent)
161
+ oc-backfill.ts -> OpenClaw JSONL parser
162
+ ```
163
+
164
+ Every local interface calls the same `core.ts`. The cloud MCP calls `cloud-crystal.ts` which implements the same search algorithm against D1 + Vectorize. The relay Worker (`worker.ts`) is intentionally separate and blind.
165
+
166
+ ## Search: How Does It Work?
167
+
168
+ Two-tier search system. The fast path (hybrid search) underlies every query. Deep search, the default, adds LLM-powered query expansion and re-ranking for higher-quality results, and falls back to the fast path silently if no LLM provider is available.
169
+
170
+ ### Fast Path (Hybrid Search)
171
+
172
+ 1. Query goes to both FTS5 (keyword match) and sqlite-vec (vector similarity)
173
+ 2. FTS5 returns BM25-ranked results, normalized to [0..1) via `|score| / (1 + |score|)`
174
+ 3. sqlite-vec returns cosine-distance results via two-step query (MATCH first, then JOIN separately ... sqlite-vec hangs with JOINs in the same query)
175
+ 4. Reciprocal Rank Fusion merges both lists: `weight / (k + rank + 1)` with k=60, tiered weights (BM25 2x, vector 1x)
176
+ 5. Recency weighting applied on top: `max(0.3, exp(-age_days * 0.1))`
177
+ 6. Final results sorted by combined score
178
+
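Steps 2 and 4 above can be sketched as follows. This is an illustration under assumptions, not the code in `core.ts`: the names `normalizeBm25`, `fuseRrf`, and `RankedList` are hypothetical, and ranks are assumed to be 0-based, which is what the `+ 1` in the formula suggests.

```typescript
// Hypothetical input shape: one ranked list of chunk ids plus its tier weight.
type RankedList = { ids: string[]; weight: number };

// Maps a raw BM25 score into [0, 1) via |score| / (1 + |score|).
// SQLite FTS5 reports better matches as more negative values, hence the absolute value.
function normalizeBm25(score: number): number {
  const magnitude = Math.abs(score);
  return magnitude / (1 + magnitude);
}

// Weighted Reciprocal Rank Fusion: each list contributes weight / (k + rank + 1).
function fuseRrf(lists: RankedList[], k = 60): Map<string, number> {
  const fused = new Map<string, number>();
  for (const { ids, weight } of lists) {
    ids.forEach((id, rank) => {
      fused.set(id, (fused.get(id) ?? 0) + weight / (k + rank + 1));
    });
  }
  return fused;
}

const fusedScores = fuseRrf([
  { ids: ["a", "b", "c"], weight: 2 }, // BM25 list, weighted 2x
  { ids: ["b", "d"], weight: 1 },      // vector list, weighted 1x
]);
// Sort ids by fused score, best first.
const rankedIds = Array.from(fusedScores.entries())
  .sort((x, y) => y[1] - x[1])
  .map(([id]) => id);
```

In this toy input, `b` appears in both lists, so its two contributions add up and it overtakes `a`, which only the keyword list found.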
179
+ ### Deep Search (LLM-Powered, default)
180
+
181
+ Deep search wraps the fast path with LLM intelligence. Implemented in `search-pipeline.ts`:
182
+
183
+ 1. **Strong signal detection:** BM25 probe first. If top score >= 0.85 with gap >= 0.15 to #2, skip expansion (answer already found).
184
+ 2. **Query expansion:** LLM generates 3 variations ... lexical (keyword-focused), vector (semantic rephrase), HyDE (hypothetical answer document). Each variation runs through the fast path.
185
+ 3. **RRF merge:** All results from original + expanded queries fused via Reciprocal Rank Fusion.
186
+ 4. **LLM re-ranking:** Top 40 RRF candidates scored by LLM for relevance to the original query.
187
+ 5. **Position-aware blending:** Top 3: 75% RRF + 25% reranker. Results 4-10: 60/40. Results 11+: 40/60. Trusts RRF for top positions, lets the reranker fix ordering in the tail.
188
+
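Steps 1 and 5 above can be sketched as follows. This is a minimal illustration, not the code in `search-pipeline.ts`. The function names are hypothetical, and three assumptions are made: scores are already normalized to the same [0, 1) scale, a lone result is compared against a runner-up score of 0, and each ratio is read as RRF share first, reranker share second.

```typescript
// Skip expansion when the best BM25 hit is both strong and clearly ahead of #2.
function isStrongSignal(normalizedScores: number[], minTop = 0.85, minGap = 0.15): boolean {
  if (normalizedScores.length === 0) return false;
  const sorted = [...normalizedScores].sort((a, b) => b - a);
  const top = sorted[0];
  const second = sorted.length > 1 ? sorted[1] : 0; // assumption: no runner-up counts as 0
  return top >= minTop && top - second >= minGap;
}

// Share of the final score taken from RRF at a 1-based position; the reranker gets the rest.
function rrfShare(position: number): number {
  if (position <= 3) return 0.75;
  if (position <= 10) return 0.6;
  return 0.4;
}

function blendScore(position: number, rrfScore: number, rerankScore: number): number {
  const share = rrfShare(position);
  return share * rrfScore + (1 - share) * rerankScore;
}
```

The gap check is what separates a confident single answer from a cluster of equally good keyword hits, where expansion is still worth running.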
189
+ Deep search is the default. No flags needed. Falls back silently to fast path if no LLM provider is available.
190
+
191
+ ### LLM Provider Cascade
192
+
193
+ The deep search pipeline tries providers in order. First available wins:
194
+
195
+ | Priority | Provider | Cost | Speed |
196
+ |----------|----------|------|-------|
197
+ | 0 | **MCP Sampling** (if client supports it) | Included in Max subscription | Fast |
198
+ | 1 | **MLX** (local, Apple Silicon) | Free | Fastest |
199
+ | 2 | **Ollama** (local) | Free | Fast |
200
+ | 3 | **OpenAI API** | ~$0.001/search | Network-dependent |
201
+ | 4 | **Anthropic API** (direct key only, not OAuth) | ~$0.001/search | Network-dependent |
202
+ | 5 | **None** | Free | N/A (fast path only) |
203
+
204
+ Local-first by default. API keys are the fallback, not the primary path. MCP Sampling (priority 0) is designed and coded but waiting on Claude Code to implement it (Anthropic Issue #1785).
205
+
206
+ Provider detection in `llm.ts`:
207
+ - Check MCP sampling capability from connected client
208
+ - Check `http://localhost:8080/v1/models` for MLX server
209
+ - Check `http://localhost:11434/api/tags` for Ollama (filters out embedding-only models)
210
+ - Check env var or 1Password for OpenAI key
211
+ - Check env var for Anthropic key (skips OAuth tokens `sk-ant-oat01-`)
212
+ - None found: log once, use fast path
213
+
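The "first available wins" cascade can be sketched as follows. This is a simplified illustration, not the code in `llm.ts`. `ProviderProbe`, `pickProvider`, and `isDirectAnthropicKey` are hypothetical names, and the probes are synchronous stand-ins for what would really be HTTP and environment checks.

```typescript
type ProviderName = "mcp-sampling" | "mlx" | "ollama" | "openai" | "anthropic" | "none";

// Hypothetical probe shape: a name plus a check that reports availability.
interface ProviderProbe {
  name: ProviderName;
  isAvailable: () => boolean; // stand-in for the real HTTP / env / 1Password checks
}

// OAuth tokens (prefix sk-ant-oat01-) are skipped; only direct API keys count.
function isDirectAnthropicKey(key: string | undefined): boolean {
  return !!key && !key.startsWith("sk-ant-oat01-");
}

// Walks the probes in priority order and returns the first available provider.
function pickProvider(probes: ProviderProbe[]): ProviderName {
  for (const probe of probes) {
    if (probe.isAvailable()) return probe.name;
  }
  return "none"; // fast path only
}
```

Because the list order is the priority, preferring a different provider only means reordering the probes.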
214
+ ### Time-Filtered Search
215
+
216
+ Search can be filtered by time: `--since 24h`, `--since 7d`, `--since 30d`, or an ISO date. Applied as a SQL WHERE clause before search. Available on CLI and MCP (`time_filter` parameter).
217
+
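Turning a `--since` value into a cutoff timestamp might look like the sketch below. `parseSinceCutoff` is a hypothetical helper, and accepting any `<number>h` or `<number>d` (rather than only the three documented values) is an assumption.

```typescript
// Returns the cutoff instant for a --since value, or null if it cannot be parsed.
function parseSinceCutoff(value: string, now: Date): Date | null {
  const relative = /^(\d+)([hd])$/.exec(value.trim());
  if (relative) {
    const amount = Number(relative[1]);
    const unitMs = relative[2] === "h" ? 3600000 : 86400000; // hour or day in ms
    return new Date(now.getTime() - amount * unitMs);
  }
  // Otherwise treat the value as an ISO date.
  const absolute = new Date(value);
  return isNaN(absolute.getTime()) ? null : absolute;
}
```

The resulting instant is what would feed the SQL `WHERE` clause, for example as a lower bound on a chunk timestamp column.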
218
+ ### Intent Parameter
219
+
220
+ `--intent <description>` disambiguates queries without adding search terms. Example: `crystal search "security" --intent "1Password"` steers toward 1Password-specific context, not repo permissions or agent secrets.
221
+
222
+ Intent affects three stages: it is added to the expansion prompt (guiding the LLM variations), it disables the strong-signal bypass, and it is prepended to the rerank query for LLM relevance scoring.
223
+
224
+ ### Explain Mode
225
+
226
+ `--explain` shows per-result scoring breakdown: FTS (keyword) score, vector (semantic) score, RRF rank after fusion, reranker score from LLM, recency weight, final blended score. Makes search quality transparent and debuggable.
227
+
228
+ ### Candidate Limit
229
+
230
+ `--candidates N` tunes the rerank pool size (default 40). Higher values give the LLM more results to evaluate. Lower values are faster.
231
+
232
+ ### LLM Cache
233
+
234
+ Expansion + reranking results cached in `llm_cache` table with 7-day TTL. Same query returns instantly on second search. Cache is per-query, per-intent.
235
+
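The cache behaviour described above can be sketched with an in-memory map. The real cache lives in the `llm_cache` table. The `TtlCache` class and its key format are hypothetical, and time is passed in explicitly to keep the sketch deterministic.

```typescript
const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

interface CacheEntry<T> {
  value: T;
  storedAt: number; // epoch milliseconds
}

// In-memory stand-in for a cache keyed per query and per intent with a TTL.
class TtlCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private ttlMs: number;

  constructor(ttlMs: number = SEVEN_DAYS_MS) {
    this.ttlMs = ttlMs;
  }

  // Key on both query and intent so the same query with a different intent misses.
  private key(query: string, intent: string): string {
    return JSON.stringify([query, intent]);
  }

  get(query: string, intent: string, now: number): T | undefined {
    const k = this.key(query, intent);
    const entry = this.entries.get(k);
    if (!entry) return undefined;
    if (now - entry.storedAt >= this.ttlMs) {
      this.entries.delete(k); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(query: string, intent: string, value: T, now: number): void {
    this.entries.set(this.key(query, intent), { value, storedAt: now });
  }
}
```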
236
+ ### Recency Decay
237
+
238
+ Exponential decay from 1.0 to a floor of 0.3. Day 0: 1.0, Day 1: 0.90, Day 3: 0.74, Day 7: 0.50, Day 14+: 0.3 (the floor is reached at about day 12). Fresh context wins decisively. Old content still surfaces for strong matches but doesn't bury recent results.
239
+
240
+ Freshness flags: fresh (<3 days), recent (<7 days), aging (<14 days), stale (14+ days).
241
+
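The decay values and freshness flags above follow from the fast-path formula `max(0.3, exp(-age_days * 0.1))`. A minimal sketch, with hypothetical function names:

```typescript
// Recency weight: exponential decay with a floor of 0.3.
function recencyWeight(ageDays: number): number {
  return Math.max(0.3, Math.exp(-ageDays * 0.1));
}

type Freshness = "fresh" | "recent" | "aging" | "stale";

// Buckets an age in days into the four freshness flags.
function freshnessFlag(ageDays: number): Freshness {
  if (ageDays < 3) return "fresh";
  if (ageDays < 7) return "recent";
  if (ageDays < 14) return "aging";
  return "stale";
}
```

Because of the floor, a chunk that is a year old keeps the same weight as one that is two weeks old, which is what lets strong old matches still surface.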
242
+ Inspired by and partially ported from [QMD](https://github.com/tobi/qmd) by Tobi Lutke (MIT, 2024-2026).
243
+
244
+ ### Why Hybrid?
245
+
246
+ Vector search alone misses exact matches. Keyword search alone misses semantic similarity. Hybrid catches both. A search for "deployment process" will find conversations that use the word "deployment" (BM25) and conversations about "shipping code to production" (vector similarity).
247
+
248
+ ### Content Dedup
249
+
250
+ SHA-256 hash of chunk text before embedding. Duplicate content is never re-embedded. This matters when the same conversation is captured by multiple hooks (e.g., Claude Code hook and OpenClaw plugin running simultaneously).
251
+
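Hash-before-embed dedup can be sketched as follows. The names `chunkHash` and `filterNewChunks` are hypothetical, and an in-memory `Set` stands in for the hash column of the `chunks` table.

```typescript
import { createHash } from "node:crypto";

// SHA-256 of the chunk text, hex-encoded.
function chunkHash(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Returns only chunks whose hash has not been seen, recording new hashes as it goes.
// Only the returned chunks would be sent on to the embedding provider.
function filterNewChunks(chunks: string[], seen: Set<string>): string[] {
  const fresh: string[] = [];
  for (const text of chunks) {
    const hash = chunkHash(text);
    if (seen.has(hash)) continue; // already embedded, skip
    seen.add(hash);
    fresh.push(text);
  }
  return fresh;
}
```

Checking the hash before calling the embedding provider is what saves the API cost when two capture paths see the same conversation.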
252
+ ## Database
253
+
254
+ Everything lives in one file: `crystal.db`. Inspectable with any SQLite tool. Backupable with `cp`.
255
+
256
+ ### Schema
257
+
258
+ | Table | Purpose |
259
+ |-------|---------|
260
+ | `chunks` | Chunk text, metadata, SHA-256 hash, timestamps |
261
+ | `chunks_vec` | sqlite-vec virtual table (cosine distance vectors) |
262
+ | `chunks_fts` | FTS5 virtual table (Porter stemming, BM25 scoring) |
263
+ | `memories` | Explicit remember/forget facts |
264
+ | `entities` | Knowledge graph nodes |
265
+ | `relationships` | Knowledge graph edges |
266
+ | `capture_state` | Watermarks for incremental ingestion |
267
+ | `sources` | Ingestion source metadata |
268
+ | `source_collections` | Directory collections for file indexing |
269
+ | `source_files` | Indexed file records with content hashes |
270
+
271
+ ### Why SQLite?
272
+
273
+ One file. No server. No Docker. No connection strings. Works on every platform. Inspectable with standard tools. Backupable with `cp`. Ships with every OS.
274
+
275
+ sqlite-vec adds vector search as a virtual table. FTS5 adds full-text search. Both are SQLite extensions that work within the same database file.
276
+
277
+ ### DB Location
278
+
279
+ `resolveConfig()` in `core.ts` checks in order:
280
+
281
+ 1. Explicit override (programmatic)
282
+ 2. `CRYSTAL_DATA_DIR` env var
283
+ 3. `~/.ldm/memory/crystal.db` (if it exists)
284
+ 4. `~/.openclaw/memory-crystal/` (legacy fallback)
285
+
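The lookup order above can be sketched as a small pure function. This is an illustration, not the actual `resolveConfig()`. `ResolveInput` and `resolveDataDir` are hypothetical, the override is treated as a directory, and the file-existence check is injected so the sketch does not touch the disk.

```typescript
// Hypothetical inputs for the lookup; nothing here reads the real environment.
interface ResolveInput {
  override?: string;                        // explicit programmatic data directory
  env: Record<string, string | undefined>;  // e.g. process.env
  home: string;                             // user home directory
  exists: (path: string) => boolean;        // injected file-existence check
}

// Returns the data directory, checking the four sources in priority order.
function resolveDataDir(input: ResolveInput): string {
  if (input.override) return input.override;
  const fromEnv = input.env["CRYSTAL_DATA_DIR"];
  if (fromEnv) return fromEnv;
  const ldmDir = `${input.home}/.ldm/memory`;
  if (input.exists(`${ldmDir}/crystal.db`)) return ldmDir;
  return `${input.home}/.openclaw/memory-crystal`; // legacy fallback
}
```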
286
+ ## Directory Structure
287
+
288
+ Memory Crystal manages `~/.ldm/` ... the universal agent home directory for [LDM OS](https://github.com/wipcomputer/dream-weaver-protocol).
289
+
290
+ ```
291
+ ~/.ldm/
292
+ config.json version, registered agents array
293
+ bin/
294
+ crystal-capture.sh cron job script (deployed by crystal init)
295
+ logs/
296
+ crystal-capture.log cron output (persists across reboots)
297
+ ldm-backup.log backup script output
298
+ memory/
299
+ crystal.db shared vector DB (all agents)
300
+ extensions/
301
+ memory-crystal/dist/ deployed JS (cc-poller.js, cc-hook.js, etc.)
302
+ state/ watermarks, poller state, capture state
303
+ staging/{agent_id}/ new agent staging (before live ingest)
304
+ transcripts/ staged transcripts
305
+ READY marker file (triggers processing)
306
+ agents/{agent_id}/
307
+ SOUL.md agent soul (Dream Weaver L4)
308
+ IDENTITY.md agent identity (Dream Weaver L4)
309
+ CONTEXT.md agent context (Dream Weaver L4)
310
+ REFERENCE.md agent reference (Dream Weaver L4)
311
+ memory/
312
+ transcripts/ full JSONL session transcripts
313
+ sessions/ markdown session summaries
314
+ daily/ daily breadcrumb logs
315
+ journals/ Dream Weaver journals
316
+ workspace/ synced workspace files (OpenClaw .md)
317
+ ```
318
+
319
+ Path resolution is centralized in `src/ldm.ts`:
320
+ - `getAgentId()` ... resolves from `CRYSTAL_AGENT_ID` env var, default `cc-mini`
321
+ - `ldmPaths(agentId?)` ... returns all paths as an object (including workspace)
322
+ - `scaffoldLdm(agentId?)` ... creates the full directory tree
323
+ - `ensureLdm(agentId?)` ... idempotent check, scaffolds if needed
324
+ - `deployCaptureScript()` ... copies crystal-capture.sh to `~/.ldm/bin/`
325
+ - `installCron()` ... installs the every-minute cron entry (idempotent)
326
+ - `removeCron()` ... removes the crystal-capture cron entry
327
+ - `resolveStatePath()` ... resolves state file path (watermarks, poller state)
328
+ - `stateWritePath()` ... writable state file path
329
+
330
+ ## Encryption: How Does It Work?
331
+
332
+ For multi-device sync. All encryption happens on-device before anything touches the network.
333
+
334
+ - **AES-256-GCM** for encryption. Authenticated encryption ... ciphertext tampering is detected.
335
+ - **HMAC-SHA256** for signing. Integrity verification before decryption. If the signature doesn't match, the blob is rejected.
336
+ - **Shared symmetric key** generated locally with `openssl rand -hex 32`. Never transmitted to the relay.
337
+ - The relay stores and serves encrypted blobs. It has no decryption capability. Compromising the relay yields encrypted noise.
338
+
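The encrypt, sign, verify, decrypt flow can be sketched with Node's built-in `crypto` module. This is an illustration under assumptions, not the code in `crypto.ts`. The `SealedBlob` shape and function names are hypothetical. One 32-byte key is used for both AES and HMAC purely for brevity; a real design would typically derive separate keys.

```typescript
import { createCipheriv, createDecipheriv, createHmac, randomBytes, timingSafeEqual } from "node:crypto";

// Hypothetical wire shape for one encrypted blob.
interface SealedBlob {
  iv: Buffer;
  ciphertext: Buffer;
  tag: Buffer;       // AES-GCM authentication tag
  signature: Buffer; // HMAC-SHA256 over iv + ciphertext + tag
}

function hmacOver(key: Buffer, parts: Buffer[]): Buffer {
  const mac = createHmac("sha256", key);
  for (const part of parts) mac.update(part);
  return mac.digest();
}

// Encrypts with AES-256-GCM, then signs the result with HMAC-SHA256.
function sealBlob(key: Buffer, plaintext: Buffer): SealedBlob {
  const iv = randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  const tag = cipher.getAuthTag();
  return { iv, ciphertext, tag, signature: hmacOver(key, [iv, ciphertext, tag]) };
}

// Verifies the HMAC first and only then decrypts; a mismatch rejects the blob.
function openBlob(key: Buffer, blob: SealedBlob): Buffer {
  const expected = hmacOver(key, [blob.iv, blob.ciphertext, blob.tag]);
  if (expected.length !== blob.signature.length || !timingSafeEqual(expected, blob.signature)) {
    throw new Error("HMAC mismatch: blob rejected");
  }
  const decipher = createDecipheriv("aes-256-gcm", key, blob.iv);
  decipher.setAuthTag(blob.tag);
  return Buffer.concat([decipher.update(blob.ciphertext), decipher.final()]);
}
```

The relay only ever sees a `SealedBlob`; without the key it can neither read the ciphertext nor forge a signature that passes verification.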
339
+ ### Key Management
340
+
341
+ The same encryption key must be present on all devices. Options:
342
+ - **1Password** ... store the key, both machines pull from 1Password via SA token
343
+ - **AirDrop** ... direct transfer between Macs
344
+ - **Manual** ... copy the key securely between machines
345
+
346
+ ### Relay Architecture
347
+
348
+ 1. **Crystal Node** (`cc-hook.ts` relay mode): Encrypts JSONL with AES-256-GCM, signs with HMAC-SHA256, drops at Cloudflare Worker. Can also send commands via `sendCommand()`.
349
+ 2. **Worker** (`worker.ts`): Stores encrypted blobs in R2 across three channels (`conversations`, `mirror`, `commands`). Pure dead drop. No decryption. Auto-cleans after 24h.
350
+ 3. **Crystal Core** (`poller.ts`): Polls Worker, downloads blobs, verifies HMAC, decrypts, ingests into crystal.db. Reconstructs remote agent's file tree (JSONL, MD summary, daily breadcrumb). Detects new agent IDs and routes to staging pipeline. Also polls commands channel and delivers to Crystal Core gateway.
351
+ 4. **Mirror sync** (`mirror-sync.ts`): Pushes delta chunks (new embeddings since last sync) + file tree deltas from Crystal Core to relay. Crystal Nodes pull, decrypt, and insert. Cold start gets a full export; after that, delta only.
352
+ 5. **Staging** (`staging.ts`): New agents from relay are staged before live ingest. Transcripts are written to `~/.ldm/staging/{agent_id}/`, then backfill + Dream Weaver full mode runs before promoting to live capture.
353
+
354
+ ### Sync Model
355
+
356
+ **Core is the only embedder.** All embeddings happen on the Core machine. Nodes never embed locally. This prevents split-brain where a node has embeddings that Core doesn't due to network issues.
357
+
358
+ **Delta sync, not full mirror.** The mirror channel sends only new chunks since last sync, not the entire crystal.db (1.9 GB+). Payload size is proportional to activity, not corpus size. Cold start (new node) gets a one-time full export, then delta only.
359
+
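Delta selection can be sketched as a cursor over chunk ids. The `SyncChunk` shape and `deltaSince` helper are hypothetical, and the sketch assumes chunk ids increase monotonically on the Core.

```typescript
interface SyncChunk {
  id: number;   // assumption: monotonically increasing on the Core
  text: string;
}

// Returns chunks newer than the node's cursor, plus the cursor to store afterwards.
function deltaSince(all: SyncChunk[], lastSyncedId: number): { delta: SyncChunk[]; cursor: number } {
  const delta = all.filter((chunk) => chunk.id > lastSyncedId);
  const cursor = delta.reduce((max, chunk) => Math.max(max, chunk.id), lastSyncedId);
  return { delta, cursor };
}
```

With a cursor of 0, the same function yields the one-time full export for a cold start.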
360
+ **Full LDM tree sync.** The relay syncs the entire `~/.ldm/` file tree, not just the database. Embeddings are pointers to artifacts (files, images, videos). If the artifact isn't on the node, the embedding is an orphan. Every file that an embedding references must exist on every device.
361
+
362
+ **No cloud search.** Every node has the full database + full file tree. All search is local. The Cloud MCP server (D1 + Vectorize) exists as a demo/onboarding tool but is not the production architecture.
363
+
364
+ Three relay channels:
365
+ - **conversations** (Node -> Core) ... raw conversation chunks for Core to embed
366
+ - **mirror** (Core -> Nodes) ... delta chunks (pre-embedded) + file tree deltas
367
+ - **commands** (bidirectional) ... Nodes send commands ("run Dream Weaver"), Core sends results
368
+
369
+ ### Future: Native Apple Sync
370
+
371
+ For Apple-to-Apple devices, a native app replaces the relay entirely. CloudKit handles encrypted sync. MLX Swift runs the on-device LLM used for search quality. No Cloudflare Worker needed between Apple devices. Same delta model. The relay stays for non-Apple and cross-platform setups.
372
+
373
+ ## Session Summaries
374
+
375
+ `src/summarize.ts` generates markdown summaries. Two modes:
376
+
377
+ **LLM mode** (default): Calls gpt-4o-mini with a condensed transcript. Returns title, slug, summary, key topics.
378
+
379
+ **Simple mode**: First user message becomes the title. First 10 messages as preview. No API call.
380
+
381
+ Controlled by `CRYSTAL_SUMMARY_MODE` env var (`llm` or `simple`).
382
+
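Simple mode can be sketched in a few lines. This is an illustration, not the code in `src/summarize.ts`. The `TranscriptMessage` shape, the `simpleSummary` name, and the "Untitled session" fallback are assumptions.

```typescript
// Hypothetical transcript message shape.
interface TranscriptMessage {
  role: "user" | "assistant";
  text: string;
}

// First user message becomes the title; the first 10 messages form the preview.
function simpleSummary(messages: TranscriptMessage[]): { title: string; preview: string[] } {
  const firstUser = messages.find((message) => message.role === "user");
  const title = firstUser ? firstUser.text : "Untitled session"; // assumption: fallback title
  const preview = messages.slice(0, 10).map((message) => `${message.role}: ${message.text}`);
  return { title, preview };
}
```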
383
+ ## Embedding Providers
384
+
385
+ | Provider | Model | Dimensions | Cost |
386
+ |----------|-------|-----------|------|
387
+ | OpenAI (default) | text-embedding-3-small | 1536 | ~$0.02/1M tokens |
388
+ | Ollama | nomic-embed-text | 768 | Free (local) |
389
+ | Google | text-embedding-004 | 768 | Free tier available |
390
+
391
+ Set via `CRYSTAL_EMBEDDING_PROVIDER` env var or `--provider` flag.
392
+
393
+ ### Why These Three?
394
+
395
+ - **OpenAI** ... best quality, lowest friction. Most people already have an API key.
396
+ - **Ollama** ... fully offline. Zero cost. Privacy-first. No data leaves your machine.
397
+ - **Google** ... free tier is generous. Good alternative if you don't want OpenAI.
398
+
399
+ ## Source File Indexing
400
+
401
+ Add directories as "collections". Files are chunked, embedded, and tagged with file path + collection name. Searchable alongside conversations and memories.
402
+
403
+ ```bash
404
+ crystal sources add /path/to/project --name my-project
405
+ crystal sources sync my-project
406
+ crystal sources status
407
+ ```
408
+
409
+ Incremental sync detects changed files via SHA-256 content hashing. Only re-embeds what changed.
410
+
411
+ ## Environment Variables
412
+
413
+ | Variable | Default | Description |
414
+ |----------|---------|-------------|
415
+ | `CRYSTAL_EMBEDDING_PROVIDER` | `openai` | `openai`, `ollama`, or `google` |
416
+ | `CRYSTAL_AGENT_ID` | `cc-mini` | Agent identifier for LDM paths |
417
+ | `CRYSTAL_SUMMARY_MODE` | `llm` | `llm` or `simple` |
418
+ | `CRYSTAL_SUMMARY_PROVIDER` | `openai` | Summary LLM provider |
419
+ | `CRYSTAL_SUMMARY_MODEL` | `gpt-4o-mini` | Summary LLM model |
420
+ | `CRYSTAL_DATA_DIR` | (auto) | Override DB location |
421
+ | `CRYSTAL_RELAY_KEY` | ... | Shared encryption key for relay |
422
+ | `CRYSTAL_RELAY_URL` | ... | Cloudflare Worker URL |
423
+ | `CRYSTAL_REMOTE_URL` | ... | Remote Worker URL |
424
+ | `CRYSTAL_REMOTE_TOKEN` | ... | Worker auth token |
425
+ | `OPENAI_API_KEY` | ... | OpenAI key |
426
+ | `GOOGLE_API_KEY` | ... | Google AI key |
427
+ | `CRYSTAL_OLLAMA_HOST` | `http://localhost:11434` | Ollama server URL |
428
+ | `CRYSTAL_OLLAMA_MODEL` | `nomic-embed-text` | Ollama model |
429
+ | `CRYSTAL_SERVE_PORT` | `18790` | Crystal Core gateway port |
430
+ | `CRYSTAL_SERVE_TOKEN` | ... | Optional bearer token for gateway auth |
431
+
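As an illustration of how the defaults in the table could be applied, here is a sketch of a typed config loader. The `loadConfig` function, the `CrystalConfig` shape, and the validation rules are assumptions made for this example; only the variable names and default values come from the table.

```typescript
type EmbeddingProvider = "openai" | "ollama" | "google";

// Hypothetical config shape covering a subset of the variables above.
interface CrystalConfig {
  embeddingProvider: EmbeddingProvider;
  agentId: string;
  ollamaHost: string;
  ollamaModel: string;
  servePort: number;
}

type Env = { [name: string]: string | undefined };

function parseProvider(value: string): EmbeddingProvider {
  if (value === "openai" || value === "ollama" || value === "google") return value;
  throw new Error("Unknown CRYSTAL_EMBEDDING_PROVIDER: " + value);
}

// Reads values from an env-like object, falling back to the documented defaults.
function loadConfig(env: Env): CrystalConfig {
  const port = Number(env["CRYSTAL_SERVE_PORT"] || "18790");
  if (!(port >= 1 && port <= 65535) || port % 1 !== 0) {
    throw new Error("CRYSTAL_SERVE_PORT must be an integer between 1 and 65535");
  }
  return {
    embeddingProvider: parseProvider(env["CRYSTAL_EMBEDDING_PROVIDER"] || "openai"),
    agentId: env["CRYSTAL_AGENT_ID"] || "cc-mini",
    ollamaHost: env["CRYSTAL_OLLAMA_HOST"] || "http://localhost:11434",
    ollamaModel: env["CRYSTAL_OLLAMA_MODEL"] || "nomic-embed-text",
    servePort: port,
  };
}

const defaults = loadConfig({});
const custom = loadConfig({ CRYSTAL_EMBEDDING_PROVIDER: "ollama", CRYSTAL_SERVE_PORT: "9000" });
console.log(defaults.embeddingProvider, defaults.servePort); // openai 18790
console.log(custom.embeddingProvider, custom.servePort);     // ollama 9000
```

In a real process the argument would be `process.env`; passing it in keeps the function easy to test.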
432
+ ### API Key Resolution
433
+
434
+ 1. Explicit override (programmatic)
435
+ 2. `process.env` (set by plugin or manually)
436
+ 3. `.env` file (`~/.ldm/memory/.env`)
437
+ 4. 1Password CLI fallback
438
+
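The resolution order above can be sketched as an ordered list of lazy sources, where the first non-empty value wins. The `resolveApiKey` and `parseDotEnv` helpers are hypothetical, and the 1Password step is stubbed out rather than calling any real CLI.

```typescript
// One candidate source for a key. `read` is lazy so that expensive sources
// (such as a password manager) run only when earlier ones come up empty.
type KeySource = { name: string; read: () => string | undefined };

function resolveApiKey(sources: KeySource[]): { key: string; source: string } | undefined {
  for (const source of sources) {
    const value = source.read();
    if (value !== undefined && value.trim() !== "") {
      return { key: value.trim(), source: source.name };
    }
  }
  return undefined;
}

// Minimal KEY=value parser for .env text (no quoting or escapes handled).
function parseDotEnv(text: string): { [name: string]: string } {
  const out: { [name: string]: string } = {};
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue;
    out[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return out;
}

const dotEnv = parseDotEnv("# example file\nOPENAI_API_KEY=sk-from-dotenv\n");
let passwordManagerCalls = 0;
const resolved = resolveApiKey([
  { name: "override", read: () => undefined },          // 1. nothing passed programmatically
  { name: "process.env", read: () => "" },              // 2. unset in the environment
  { name: ".env file", read: () => dotEnv["OPENAI_API_KEY"] }, // 3. found here
  { name: "1Password CLI", read: () => { passwordManagerCalls++; return "sk-from-op"; } }, // 4. stub
]);
console.log(resolved);             // { key: "sk-from-dotenv", source: ".env file" }
console.log(passwordManagerCalls); // 0
```

Because the sources are evaluated lazily, the fallback in step 4 is never invoked when an earlier step succeeds.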
439
+ ## CLI Reference
440
+
441
+ ```bash
442
+ # Search
443
+ crystal search <query> [-n limit] [--agent <id>] [--since <24h|7d|30d>]
444
+ [--intent <description>] [--candidates N] [--explain]
445
+ [--provider <openai|ollama|google>]
446
+
447
+ # Remember / forget
448
+ crystal remember <text> [--category fact|preference|event|opinion|skill]
449
+ crystal forget <id>
450
+
451
+ # Status
452
+ crystal status [--provider <openai|ollama|google>]
453
+
454
+ # MLX local LLM (Apple Silicon)
455
+ crystal mlx setup # auto-install Qwen2.5-3B, create LaunchAgent
456
+ crystal mlx status # check server health
457
+ crystal mlx stop # stop the server
458
+
459
+ # Source file indexing
460
+ crystal sources add <path> --name <name>
461
+ crystal sources sync [name]
462
+ crystal sources status
463
+
464
+ # Pairing (relay key sharing)
465
+ crystal pair # Show QR code (generate key if none)
466
+ crystal pair --code mc1:<base64> # Receive key from another device
467
+
468
+ # LDM management
469
+ crystal init [--agent <id>] # Scaffold LDM, discover sessions, copy to LDM
470
+ crystal migrate-db
471
+
472
+ # Crystal Core / Node management
473
+ crystal role # Show current role (Core or Node) and connections
474
+ crystal promote # Make this machine the Crystal Core
475
+ crystal demote # Make this machine a Crystal Node
476
+
477
+ # Backfill + migration
478
+ crystal backfill [--agent <id>] [--dry-run] [--limit <n>] # Embed raw transcripts from LDM
479
+ crystal migrate-embeddings [--dry-run] # Migrate context-embeddings into crystal.db
480
+
481
+ # Dream Weaver (narrative consolidation)
482
+ crystal dream-weave [--agent <id>] [--mode full|incremental] [--dry-run]
483
+
484
+ # Crystal Core gateway
485
+ crystal serve [--port <n>] # Start HTTP gateway (default: 18790)
486
+
487
+ # Health + maintenance
488
+ crystal doctor # Health check
489
+ crystal backup # Backup crystal.db
490
+ ```
491
+
492
+ ## MCP Tools
493
+
494
+ | Tool | Description |
495
+ |------|-------------|
496
+ | `crystal_search` | Hybrid search across all memories |
497
+ | `crystal_remember` | Store a fact or observation |
498
+ | `crystal_forget` | Deprecate a memory by ID |
499
+ | `crystal_status` | Chunk count, provider, agents |
500
+ | `crystal_sources_add` | Add a directory for indexing |
501
+ | `crystal_sources_sync` | Re-index changed files |
502
+ | `crystal_sources_status` | Collection stats |
503
+
504
+ ## Migration
505
+
506
+ ### Legacy DB to LDM
507
+
508
+ ```bash
509
+ crystal migrate-db
510
+ ```
511
+
512
+ Copies the database to `~/.ldm/memory/crystal.db`. Verifies chunk count. Creates symlinks at the old path.
513
+
514
+ ### LanceDB to sqlite-vec
515
+
516
+ ```bash
517
+ node scripts/migrate-lance-to-sqlite.mjs --dry-run # check counts
518
+ node scripts/migrate-lance-to-sqlite.mjs # full migration
519
+ ```
520
+
521
+ Reads vectors directly from LanceDB. No re-embedding needed. ~5,000 chunks/sec on M4 Pro.
522
+
523
+ ### context-embeddings.sqlite (legacy migrate.ts)
524
+
525
+ ```bash
526
+ node dist/migrate.js [--dry-run] [--provider openai]
527
+ ```
528
+
529
+ Imports from the older context-embeddings format (requires re-embedding).
530
+
531
+ ### context-embeddings.sqlite (direct copy, v0.5.0+)
532
+
533
+ ```bash
534
+ crystal migrate-embeddings --dry-run # Show what would migrate
535
+ crystal migrate-embeddings # Copy embeddings directly ($0)
536
+ ```
537
+
538
+ Copies ~3,108 unique conversation chunks from context-embeddings.sqlite into crystal.db. Embeddings are directly compatible (same model: text-embedding-3-small, 1536d, float32). Zero API calls. SHA-256 dedup skips chunks already in crystal.db.
539
+
540
+ ### Backfill raw transcripts (v0.5.0+)
541
+
542
+ ```bash
543
+ crystal backfill --agent cc-mini --dry-run # Show file count, estimated tokens
544
+ crystal backfill --agent cc-mini # Embed all transcripts
545
+ crystal backfill --agent lesa-mini # Embed OpenClaw transcripts
546
+ ```
547
+
548
+ Scans `~/.ldm/agents/{agentId}/memory/transcripts/*.jsonl`, auto-detects the format (Claude Code vs OpenClaw), extracts messages, and embeds them into crystal.db. Watermark tracking prevents re-embedding. On Node devices, transcripts are relayed to the Core instead of being embedded locally.
549
+
550
+ ## Project Structure
551
+
552
+ ```
553
+ memory-crystal/
554
+ src/
555
+ core.ts Pure logic, zero framework deps
556
+ cli.ts CLI wrapper (crystal command)
557
+ mcp-server.ts MCP server (Claude Code, Claude Desktop)
558
+ openclaw.ts OpenClaw plugin wrapper + raw data sync to LDM
559
+ cc-poller.ts Continuous capture (cron job, primary)
560
+ cc-hook.ts Claude Code Stop hook (redundancy) + relay commands
561
+ ldm.ts LDM scaffolding, path resolution, script deployment, cron
562
+ summarize.ts Markdown session summary generation
563
+ crypto.ts AES-256-GCM + HMAC-SHA256 encryption
564
+ role.ts Core/Node role detection
565
+ doctor.ts Health check (crystal doctor)
566
+ bridge.ts Bridge detection (lesa-bridge, etc.)
567
+ discover.ts Harness auto-detection (Claude Code + OpenClaw)
568
+ bulk-copy.ts Raw file copy to LDM (idempotent)
569
+ oc-backfill.ts OpenClaw JSONL parser
570
+ dream-weaver.ts Dream Weaver integration (imports from dream-weaver-protocol)
571
+ crystal-serve.ts Crystal Core gateway (localhost:18790)
572
+ staging.ts New agent staging pipeline
573
+ llm.ts LLM provider cascade (MLX > Ollama > OpenAI > Anthropic), query expansion, re-ranking
574
+ search-pipeline.ts Deep search pipeline (expand, search, RRF, rerank, blend)
575
+ worker.ts Cloudflare Worker relay (encrypted dead drop, R2, 3 channels)
576
+ worker-mcp.ts Cloud MCP server (OAuth 2.1 + DCR, ChatGPT/Claude)
577
+ cloud-crystal.ts D1 + Vectorize backend (cloud search)
578
+ poller.ts Relay poller (Crystal Core side) + staging + commands
579
+ mirror-sync.ts DB mirror sync (device side)
580
+ migrate.ts Legacy migration tools
581
+ pair.ts QR code pairing logic
582
+ migrations/
583
+ 0001_init.sql OAuth tables (clients, codes, tokens, users)
584
+ 0002_cloud_storage.sql Cloud chunks + memories + FTS5
585
+ skills/
586
+ memory/SKILL.md Agent skill definition
587
+ scripts/
588
+ crystal-capture.sh Cron job script (source of truth, deployed to ~/.ldm/bin/)
589
+ ldm-backup.sh LDM backup script
590
+ deploy-cloud.sh 1Password-driven Cloudflare deployment
591
+ migrate-lance-to-sqlite.mjs
592
+ wrangler.toml Relay Worker config
593
+ wrangler-mcp.toml Cloud MCP Worker config
594
+ dist/ Built output
595
+ ai/ Plans, dev updates, todos (private repo only)
596
+ ```
597
+
598
+ ## Design Decisions
599
+
600
+ **Why sqlite-vec over pgvector, Pinecone, Weaviate, etc.?**
601
+ No server. No Docker. No cloud dependency. One file. Works offline. Back it up with `cp`. The tradeoff is scale ... sqlite-vec works well up to ~500K vectors. Beyond that, consider a dedicated vector store.
602
+
603
+ **Why FTS5 + vectors instead of just vectors?**
604
+ Vectors alone miss exact keyword matches. "error code 403" should match conversations containing "403", not just semantically similar conversations about HTTP errors. Hybrid search catches both.
605
+
606
+ **Why RRF for fusion?**
607
+ Reciprocal Rank Fusion is simple, robust, and doesn't require score calibration between the two engines. Each engine ranks results independently. RRF merges based on rank position, not raw scores.
608
+
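A small sketch of Reciprocal Rank Fusion over two ranked ID lists. Each list contributes `1 / (k + rank)` per result and the contributions are summed. The constant `k = 60` is the value commonly used for RRF and is an assumption here, as are the function name and the sample IDs.

```typescript
interface FusedResult { id: string; score: number }

// Merge any number of ranked lists (best result first) using rank positions only.
function reciprocalRankFusion(rankings: string[][], k: number = 60): FusedResult[] {
  const scores: { [id: string]: number } = Object.create(null);
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores[id] = (scores[id] || 0) + 1 / (k + rank);
    });
  }
  return Object.keys(scores)
    .map((id) => ({ id, score: scores[id] }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical result IDs from the keyword engine and the vector engine.
const keywordHits = ["c403", "c12", "c7"];
const vectorHits = ["c7", "c403", "c99"];
const fused = reciprocalRankFusion([keywordHits, vectorHits]);
console.log(fused.map((r) => r.id)); // ["c403", "c7", "c12", "c99"]
```

Results that appear in both lists rise to the top, and no raw BM25 or cosine scores are compared against each other.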
609
+ **Why recency weighting?**
610
+ Without it, old conversations dominate. A conversation from 3 days ago about your current project should outrank a conversation from 3 months ago about a different project, even if the old one is a slightly better semantic match.
611
+
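One way to express this weighting is an exponential decay with a half-life, blended into the relevance score. The roadmap mentions exponential recency decay, but the 30-day half-life, the 0.3 recency share, and the `blendScore` formula are assumptions chosen for illustration, not the package's actual parameters.

```typescript
// Weight halves every `halfLifeDays`: 1.0 for brand-new, 0.5 at one half-life.
function recencyWeight(ageDays: number, halfLifeDays: number): number {
  return Math.pow(0.5, Math.max(0, ageDays) / halfLifeDays);
}

// Hypothetical blend: part of the score is fixed, part is scaled by recency.
function blendScore(
  relevance: number,
  ageDays: number,
  halfLifeDays: number = 30,
  recencyShare: number = 0.3,
): number {
  const weight = recencyWeight(ageDays, halfLifeDays);
  return relevance * (1 - recencyShare + recencyShare * weight);
}

// A 3-day-old chunk with a slightly weaker semantic match...
const recent = blendScore(0.80, 3);
// ...versus a 90-day-old chunk with a slightly better one.
const old = blendScore(0.85, 90);
console.log(recent > old); // true
```

Keeping a fixed share of the relevance score means a very strong old match can still surface; it is demoted, not discarded.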
612
+ **Why AES-256-GCM for relay encryption?**
613
+ Authenticated encryption. Ciphertext tampering is detected. No padding oracle attacks. Standard, auditable, widely implemented. Combined with HMAC-SHA256 signing for belt-and-suspenders integrity verification.
614
+
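A sketch of the encrypt-then-sign idea using Node's built-in `crypto` module. It assumes two separate 32-byte keys (one for AES-256-GCM, one for HMAC-SHA256) and an in-memory blob shape; how Memory Crystal derives keys from `CRYSTAL_RELAY_KEY` and lays out blobs on the wire is not described here, so `seal`, `unseal`, and `SealedBlob` are hypothetical.

```typescript
import { createCipheriv, createDecipheriv, createHmac, randomBytes, timingSafeEqual } from "node:crypto";

interface SealedBlob { iv: Buffer; ciphertext: Buffer; tag: Buffer; mac: Buffer }

// HMAC-SHA256 over everything the receiver will later feed to the decipher.
function macOver(macKey: Buffer, iv: Buffer, ciphertext: Buffer, tag: Buffer): Buffer {
  return createHmac("sha256", macKey).update(iv).update(ciphertext).update(tag).digest();
}

function seal(plaintext: string, encKey: Buffer, macKey: Buffer): SealedBlob {
  const iv = randomBytes(12); // 96-bit nonce, must be unique per message
  const cipher = createCipheriv("aes-256-gcm", encKey, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return { iv, ciphertext, tag, mac: macOver(macKey, iv, ciphertext, tag) };
}

function unseal(blob: SealedBlob, encKey: Buffer, macKey: Buffer): string {
  // Check the outer signature first, in constant time.
  const expected = macOver(macKey, blob.iv, blob.ciphertext, blob.tag);
  if (expected.length !== blob.mac.length || !timingSafeEqual(expected, blob.mac)) {
    throw new Error("HMAC verification failed");
  }
  // GCM verifies its own tag in final() and throws if the data was altered.
  const decipher = createDecipheriv("aes-256-gcm", encKey, blob.iv);
  decipher.setAuthTag(blob.tag);
  return Buffer.concat([decipher.update(blob.ciphertext), decipher.final()]).toString("utf8");
}

const encKey = randomBytes(32);
const macKey = randomBytes(32);
const sealed = seal("session transcript chunk", encKey, macKey);
console.log(unseal(sealed, encKey, macKey)); // session transcript chunk
```

Either check alone would detect tampering; running both is the "belt-and-suspenders" arrangement described above.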
615
+ **Why a dead drop instead of direct device-to-device sync?**
616
+ Devices aren't always online at the same time. A dead drop decouples sender and receiver. Your laptop drops encrypted blobs whenever it captures. Your desktop picks them up whenever it polls. No NAT traversal, no port forwarding, no peer discovery.
617
+
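The decoupling can be illustrated with a tiny in-memory dead drop. This is a toy model under stated assumptions: the real relay is a Cloudflare Worker backed by R2, and the `DeadDrop` class, its cursor-based `collect` method, and the channel name are invented for this sketch.

```typescript
interface Drop { id: number; channel: string; blob: string }

// Toy dead drop: senders append opaque blobs, receivers poll with a cursor.
class DeadDrop {
  private drops: Drop[] = [];
  private nextId = 1;

  // Sender side: leave an (already encrypted) blob and walk away.
  put(channel: string, blob: string): number {
    const id = this.nextId++;
    this.drops.push({ id, channel, blob });
    return id;
  }

  // Receiver side: fetch everything on a channel newer than the last seen id.
  collect(channel: string, afterId: number): Drop[] {
    return this.drops.filter((d) => d.channel === channel && d.id > afterId);
  }
}

const relay = new DeadDrop();

// The laptop drops two blobs while the desktop is offline.
relay.put("conversations", "encrypted-blob-1");
relay.put("conversations", "encrypted-blob-2");

// Later, the desktop polls from its saved cursor (0 = nothing seen yet).
const firstPoll = relay.collect("conversations", 0);
const cursor = firstPoll[firstPoll.length - 1].id;

// A second poll with the updated cursor returns nothing new.
const secondPoll = relay.collect("conversations", cursor);
console.log(firstPoll.length, secondPoll.length); // 2 0
```

Neither side needs the other to be reachable; they only need to reach the drop, which is why no NAT traversal or peer discovery is involved.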
618
+ **Why Dream Weaver as a separate protocol library?**
619
+ The consolidation engine (prompts, parsing, orchestration) lives in the `dream-weaver-protocol` package. Memory Crystal imports it and provides hooks for crystal.db integration (embedding journals, extracting memories). This prevents bifurcation. The protocol repo is the canonical source for HOW to consolidate. Memory Crystal is WHERE the results go.
620
+
621
+ **Why D1 + Vectorize for the cloud instead of sqlite-vec?**
622
+ sqlite-vec runs inside a single SQLite file. Cloudflare Workers don't have persistent local filesystems. D1 provides serverless SQL with FTS5. Vectorize provides serverless vector search. Same search algorithm (BM25 + vector + RRF), different backends.
623
+
624
+ ## Roadmap
625
+
626
+ - **Phase 1** ... Complete. Local memory with CLI, MCP, OpenClaw plugin, Claude Code hook.
627
+ - **Phase 2a** ... Complete. Source file indexing + QMD hybrid search (sqlite-vec + FTS5 + RRF).
628
+ - **Phase 2b** ... Complete. Historical session backfill (159K+ chunks).
629
+ - **Phase 2c** ... Complete. LDM scaffolding, JSONL archive, markdown summaries, relay merge.
630
+ - **Phase 3** ... Complete. Encrypted relay (Cloudflare Worker + R2), poller, mirror sync, QR pairing (`crystal pair`).
631
+ - **Phase 4** ... Complete. Cloud MCP server (OAuth 2.1 + DCR, ChatGPT + Claude on all surfaces), D1 + Vectorize backend.
632
+ - **Phase 5** ... Complete. Core/Node architecture, crystal doctor, crystal backup, crystal bridge.
633
+ - **Phase 6** ... Complete. Init discovery, bulk copy, OpenClaw parser, backfill, CE migration.
634
+ - **Phase 7** ... Complete. Dream Weaver integration (via dream-weaver-protocol), Crystal Core gateway, staging pipeline, commands channel.
635
+ - **Phase 8** ... Complete. Search quality: exponential recency decay, time-filtered search, LLM query expansion + re-ranking (deep search), provider cascade (MLX > Ollama > OpenAI > Anthropic), MCP sampling integration (designed, waiting on Claude Code).
636
+ - **Next** ... MLX auto-install during `crystal init`, local embeddings (zero API key default), LanceDB retirement.
637
+
638
+ ## More Info
639
+
640
+ - [README.md](https://github.com/wipcomputer/memory-crystal/blob/main/README.md) ... What Memory Crystal is and how to install it.
641
+ - [RELAY.md](https://github.com/wipcomputer/memory-crystal/blob/main/RELAY.md) ... Relay: Memory Sync, QR pairing, delta sync, file tree sync.
642
+
643
+ ---
644
+
645
+ ## License
646
+
647
+ ```
648
+ src/core.ts, cli.ts, mcp-server.ts, skills/ MIT (use anywhere, no restrictions)
649
+ src/worker.ts, src/worker-mcp.ts AGPL (relay + cloud server)
650
+ ```
651
+
652
+ AGPL for personal use is free.
653
+
654
+ Built by Parker Todd Brooks, Lēsa (OpenClaw, Claude Opus 4.6), Claude Code CLI (Claude Opus 4.6).
655
+
656
+ Search architecture inspired by [QMD](https://github.com/tobi/qmd) by Tobi Lutke (MIT, 2024-2026).