recallmem 0.1.1 → 0.1.3
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +38 -36
- package/bin/commands/setup.js +26 -46
- package/package.json +1 -1
package/README.md
CHANGED
@@ -6,7 +6,15 @@
 </p>
 
 <p align="center">
-  <strong>Persistent
+  <strong>Persistent personal AI that actually remembers you.</strong>
+</p>
+
+<p align="center">
+  LLMs like ChatGPT, Claude.ai, and Gemini tend to forget you the moment you end your session. RecallMEM doesn't. It builds a profile of who you are, extracts facts after every conversation, and runs vector search across your entire history to find relevant context. By the time you've used it for a week, it knows you better than any AI ever will.
+</p>
+
+<p align="center">
+  Use it with Claude or OpenAI for fast responses and the best models (~5 minute setup). Or run everything locally with Gemma 4 for 100% privacy. You'll get the same memory framework either way. Your call.
 </p>
 
 <p align="center">
@@ -21,11 +29,15 @@
 
 ## What is this
 
-A personal AI
+A personal AI chatbot with REAL memory. Plug in any LLM you want and RecallMEM gives it persistent memory of who you are, what you've talked about, and what's currently true vs historical.
+
+The best part is that the LLM will never touch your memory in the database. Every retrieval is deterministic SQL + cosine similarity, assembled by TypeScript before the LLM ever sees it. The LLM only proposes new facts; a TypeScript validator decides what gets stored. Facts have timestamps and get auto-retired when you contradict them ("works at Acme" → "left Acme"). [Deep dive on the architecture →](./docs/ARCHITECTURE.md)
 
-
+You can run it three ways:
 
-
+- **Cloud LLMs (recommended for most people).** Add a Claude or OpenAI API key in Settings. Fast, smart, works on any computer. Your memory still stays local in your own Postgres database. Only the chat messages go to the provider.
+- **Local LLMs (recommended for privacy).** Run Gemma 4 via Ollama. Nothing leaves your machine, ever. Slower setup (~18 GB model download) and slower responses, but truly air-gappable.
+- **Both.** Use cloud for daily chat, switch to local for the sensitive stuff. The model dropdown lets you pick per-conversation.
 
 ## Features
 
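The deterministic-retrieval claim in the new README text (SQL + cosine similarity assembled in code, no LLM in the loop, retired facts excluded) can be sketched roughly as follows. This is an illustrative sketch only; the function names, fact shape, and scoring here are assumptions, not RecallMEM's actual schema or code:

```javascript
// Illustrative sketch of deterministic memory retrieval: rank stored
// facts by cosine similarity to the query embedding, keeping only
// facts that are still active (not retired). All names are hypothetical.

function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function retrieveFacts(queryEmbedding, facts, limit = 5) {
  return facts
    .filter((f) => f.retiredAt === null) // skip auto-retired facts
    .map((f) => ({ ...f, score: cosineSimilarity(queryEmbedding, f.embedding) }))
    .sort((a, b) => b.score - a.score)   // highest similarity first
    .slice(0, limit);
}

const facts = [
  { text: "works at Acme", embedding: [1, 0],      retiredAt: "2024-01-01" },
  { text: "left Acme",     embedding: [0.9, 0.1],  retiredAt: null },
  { text: "likes hiking",  embedding: [0, 1],      retiredAt: null },
];

console.log(retrieveFacts([1, 0], facts, 1)[0].text);
// "left Acme" (the retired contradicting fact is excluded)
```

In the real package this ranking would run inside Postgres via pgvector rather than in application code, but the observable behavior is the same: the context the LLM sees is a pure function of the query embedding and the stored, timestamped facts.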
@@ -42,49 +54,35 @@ The memory is the actual differentiator. Not the model. Not the UI. Memory reads
 
 ## Quick start (Mac)
 
-
+Two options. Pick whichever fits your priority.
 
-
+### Option A: Cloud LLM (Claude or OpenAI) — fastest, ~5 minutes
+
+You need Node.js 20+ and [Homebrew](https://brew.sh). Then:
 
 ```bash
 npx recallmem
 ```
 
-
+The installer sets up Postgres, pgvector, and Ollama (for the embedding model that powers memory). When the browser opens to `localhost:3000`:
 
-1.
-2.
-3.
-4.
-5.
-6. **It downloads EmbeddingGemma** (~600 MB, ~1-2 min). This is required for the memory system.
-7. **It asks which Gemma 4 model you want.** Three options:
-   - **1) Gemma 4 26B** — 18 GB, fast, recommended for most people
-   - **2) Gemma 4 31B** — 19 GB, slower, smartest answers
-   - **3) Gemma 4 E2B** — 2 GB, very fast, good for testing or older laptops
-8. **It downloads the model you picked.** E2B finishes in 2-3 min. The 18 GB option takes 10-30 min depending on your internet.
-9. **It runs database migrations** (~5 seconds).
-10. **It builds the app for production** (~30-60 seconds, first install only).
-11. **It starts the server.** Open `http://localhost:3000` in your browser and start chatting.
+1. Click **Settings** in the top right
+2. Click **Providers**
+3. Add your Claude or OpenAI API key
+4. Pick that model from the dropdown in the chat header
+5. Start chatting
 
-Total time:
+**Total time: ~5 minutes.** The AI remembers everything across every chat. Your memory stays in your local Postgres database. Only the chat messages go to the cloud provider.
 
-
+### Option B: Local Gemma 4 — 100% private, ~15-45 minutes
 
-
-<summary><strong>Just want cloud models? (Claude / GPT)</strong></summary>
+Same `npx recallmem` command. When the app opens, click **Settings → Manage models** and download one of these:
 
-
+- **Gemma 4 E4B** (4 GB, ~5 minute download) — fastest to test
+- **Gemma 4 26B** (18 GB, ~20-30 minute download) — recommended for daily use
+- **Gemma 4 31B** (19 GB, slower, best quality)
 
-
-brew install postgresql@17 pgvector
-brew services start postgresql@17
-npx recallmem
-```
-
-After the app starts, go to **Settings → Providers → Add a new provider**, paste your API key, and pick that model from the chat dropdown.
-
-</details>
+Then pick that model from the dropdown and chat. Nothing leaves your machine.
 
 <details>
 <summary><strong>Linux (not officially supported, manual install)</strong></summary>
@@ -175,6 +173,10 @@ Apache 2.0. See [LICENSE](./LICENSE) and [NOTICE](./NOTICE). Use it, modify it,
 
 ## Status
 
-v0.1. It works. I use it every day.
+v0.1.2. It works. I use it every day.
+
+I built RecallMEM because I wanted an AI that actually knows me. Not because I'm paranoid about privacy (though that's a nice bonus). The chat models you use today forget you the second you close the tab and that drives me crazy. So I fixed it.
+
+There's no CI, no error monitoring, no SLA. If you want to use it as your daily AI tool, fork it, make it yours, and expect to read the code if something breaks. That's the deal. If this is useful to you, that's cool. If not, no hard feelings.
 
 [github.com/RealChrisSean/RecallMEM](https://github.com/RealChrisSean/RecallMEM)
package/bin/commands/setup.js
CHANGED
@@ -95,35 +95,11 @@ async function waitFor(checkFn, timeoutMs = 15000, intervalMs = 500) {
   return false;
 }
 
-//
-
-
-
-
-  console.log(" 1) Gemma 4 26B");
-  console.log("    Size: 18 GB");
-  console.log("    Speed: Fast");
-  console.log("    Best for: Most people. Recommended.");
-  blank();
-  console.log(" 2) Gemma 4 31B");
-  console.log("    Size: 19 GB");
-  console.log("    Speed: Slower");
-  console.log("    Best for: People who want the smartest answers, even if it takes longer.");
-  blank();
-  console.log(" 3) Gemma 4 E2B");
-  console.log("    Size: 2 GB");
-  console.log("    Speed: Very fast");
-  console.log("    Best for: A quick test. Or older laptops.");
-  blank();
-
-  while (true) {
-    const answer = await ask("Type 1, 2, or 3 and press Enter [1]: ");
-    if (!answer || answer === "1") return { id: "gemma4:26b", label: "Gemma 4 26B" };
-    if (answer === "2") return { id: "gemma4:31b", label: "Gemma 4 31B" };
-    if (answer === "3") return { id: "gemma4:e2b", label: "Gemma 4 E2B" };
-    console.log("  Type 1, 2, or 3.");
-  }
-}
+// Model selection moved to the web UI. Setup only installs the embedder
+// (small, required for memory) and lets the user pick a chat model from
+// the Settings page in the running app. This dramatically shortens the
+// install time and gives users a real visual progress bar instead of
+// terminal output for the multi-GB chat model download.
 
 async function setupCommand(opts = {}) {
   const {
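The next hunk leans on a `detectOllamaModel` helper whose body isn't shown in this diff. One plausible shape for it, assuming Ollama's standard `GET /api/tags` endpoint (which lists locally installed models) and treating everything else here (function names, return shape) as illustrative guesses rather than the package's real code:

```javascript
// Hypothetical sketch of how an installed-model check against a local
// Ollama daemon could work. GET /api/tags returns a JSON body like
// { models: [{ name: "gemma4:26b", ... }, ...] }.

function matchModel(tagsResponse, wanted) {
  const names = (tagsResponse.models || []).map((m) => m.name);
  // Ollama appends ":latest" when no tag is given, so match loosely.
  const installed = names.some((n) => n === wanted || n.startsWith(wanted + ":"));
  return { installed };
}

async function detectOllamaModel(wanted, baseUrl = "http://localhost:11434") {
  try {
    const res = await fetch(`${baseUrl}/api/tags`);
    return matchModel(await res.json(), wanted);
  } catch {
    return { installed: false }; // Ollama not running or unreachable
  }
}
```

With a helper of this shape, the `hasAny` check in the hunk below is just an OR over three such calls, and a missing daemon degrades gracefully to "nothing installed" instead of crashing setup.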
@@ -363,27 +339,26 @@ async function setupCommand(opts = {}) {
     const hasE2 = await detectOllamaModel("gemma4:e2b");
     const hasAny = has26.installed || has31.installed || hasE2.installed;
 
-
-
-
-
-
-
-    } catch (err) {
-      fail(`Failed to pull ${choice.id}: ${err.message}`);
-      info(`You can pull it later with: ollama pull ${choice.id}`);
-    }
-  } else if (hasAny) {
+    // Skip the Gemma chat model download in the installer entirely.
+    // Users pick + download a model from the running web app (Settings →
+    // Manage models) where there's a real progress bar. The chat UI
+    // detects this state and shows an empty-state banner asking the user
+    // to either download a model or add a cloud provider before chatting.
+    if (hasAny) {
       success("A Gemma 4 chat model is already installed");
     }
   }
 
   // ─── Step 10: Write .env.local ─────────────────────────────────────────
+  // Note: we deliberately do NOT hardcode OLLAMA_FAST_MODEL anymore. Fact
+  // extraction now uses whichever model the user is actively chatting with
+  // (cloud or local), looked up from the chat row at runtime. Hardcoding
+  // gemma4:e4b here used to silently break extraction on machines that
+  // only pulled the 26B (or any other size).
   const finalEnv = {
     DATABASE_URL: env.DATABASE_URL || connectionString,
     OLLAMA_URL: env.OLLAMA_URL || "http://localhost:11434",
     OLLAMA_CHAT_MODEL: env.OLLAMA_CHAT_MODEL || "gemma4:26b",
-    OLLAMA_FAST_MODEL: env.OLLAMA_FAST_MODEL || "gemma4:e4b",
     OLLAMA_EMBED_MODEL: env.OLLAMA_EMBED_MODEL || "embeddinggemma",
   };
   writeEnv(ENV_PATH, finalEnv);
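The `env.X || default` pattern in the hunk above means a value the user already has in `.env.local` always wins over the installer's defaults. A minimal sketch of that merge-and-render step (the real `writeEnv` helper and full key set may differ; this is just the fallback idea in isolation):

```javascript
// Illustrative: merge existing env values over installer defaults.
// Any key the user already set is preserved; missing keys get defaults.

const DEFAULTS = {
  OLLAMA_URL: "http://localhost:11434",
  OLLAMA_CHAT_MODEL: "gemma4:26b",
  OLLAMA_EMBED_MODEL: "embeddinggemma",
};

function mergeEnv(existing, defaults = DEFAULTS) {
  const merged = { ...defaults };
  for (const [key, value] of Object.entries(existing)) {
    if (value) merged[key] = value; // user value wins over the default
  }
  return merged;
}

function renderEnvFile(env) {
  return Object.entries(env)
    .map(([k, v]) => `${k}=${v}`)
    .join("\n") + "\n";
}

console.log(renderEnvFile(mergeEnv({ OLLAMA_CHAT_MODEL: "gemma4:31b" })));
// OLLAMA_URL=http://localhost:11434
// OLLAMA_CHAT_MODEL=gemma4:31b
// OLLAMA_EMBED_MODEL=embeddinggemma
```

This also makes the removal of the hardcoded `OLLAMA_FAST_MODEL` line above easy to reason about: dropping a key from the defaults simply stops the installer from pinning it, without touching anything the user configured themselves.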
@@ -416,12 +391,17 @@ async function setupCommand(opts = {}) {
   blank();
   success(color.bold("Setup complete!"));
   blank();
-  console.log("
-  console.log("
-  console.log("
-  console.log("
+  console.log("One more thing before you can chat:");
+  console.log("");
+  console.log("  You need either a cloud API key OR a local Gemma 4 model.");
+  console.log("");
+  console.log("  When the app opens, click " + color.bold("Settings") + " in the top right.");
+  console.log("  Then pick ONE of these:");
+  console.log("");
+  console.log("  A) " + color.bold("Providers") + " — add a Claude or OpenAI API key (~30 sec, fastest)");
+  console.log("  B) " + color.bold("Manage models") + " — download Gemma 4 E4B for 100% local mode");
   console.log("");
-  console.log("
+  console.log("  Either one works. You can do both later.");
   blank();
 
   return { ok: true };