recallmem 0.1.1 → 0.1.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -6,7 +6,15 @@
  </p>
 
  <p align="center">
- <strong>Persistent Private AI.</strong> Powered by Gemma 4 running locally on your own machine.
+ <strong>Persistent personal AI that actually remembers you.</strong>
+ </p>
+
+ <p align="center">
+ LLMs like ChatGPT, Claude.ai, and Gemini forget you the moment you end your session. RecallMEM doesn't. It builds a profile of who you are, extracts facts after every conversation, and runs vector search across your entire history to find relevant context. By the time you've used it for a week, it knows you better than any session-bound AI ever could.
+ </p>
+
+ <p align="center">
+ Use it with Claude or OpenAI for fast responses and the best models (~5 minute setup). Or run everything locally with Gemma 4 for 100% privacy. You'll get the same memory framework either way. Your call.
  </p>
 
  <p align="center">
@@ -21,11 +29,15 @@
 
  ## What is this
 
- A personal AI chat app with real memory that runs 100% on your machine. Your conversations stay local. The AI builds a profile of who you are over time, extracts facts after every chat, and vector-searches across your entire history to find relevant context. By the time you've used it for a week, it knows you better than any cloud AI because it never forgets.
+ A personal AI chatbot with REAL memory. Plug in any LLM you want and RecallMEM gives it persistent memory of who you are, what you've talked about, and what's currently true vs historical.
+
+ The best part is that the LLM will never touch your memory in the database. Every retrieval is deterministic SQL + cosine similarity, assembled by TypeScript before the LLM ever sees it. The LLM only proposes new facts; a TypeScript validator decides what gets stored. Facts have timestamps and get auto-retired when you contradict them ("works at Acme" → "left Acme"). [Deep dive on the architecture →](./docs/ARCHITECTURE.md)
 
- The default model is **Gemma 4** (Apache 2.0) running locally via Ollama. Pick any size from E2B (runs on a phone) up to 31B Dense (best quality, needs a workstation). Or skip Ollama entirely and bring your own API key for Claude, GPT, Groq, Together, OpenRouter, or anything OpenAI-compatible.
+ You can run it three ways:
 
- The memory is the actual differentiator. Not the model. Not the UI. Memory reads are deterministic SQL + cosine similarity, not LLM tool calls. The chat model never touches your database. Facts are proposed by a local LLM but validated by TypeScript before storage. [Deep dive on the architecture →](./docs/ARCHITECTURE.md)
+ - **Cloud LLMs (recommended for most people).** Add a Claude or OpenAI API key in Settings. Fast, smart, works on any computer. Your memory still stays local in your own Postgres database. Only the chat messages go to the provider.
+ - **Local LLMs (recommended for privacy).** Run Gemma 4 via Ollama. Nothing leaves your machine, ever. Slower setup (~18 GB model download) and slower responses, but truly air-gappable.
+ - **Both.** Use cloud for daily chat, switch to local for the sensitive stuff. The model dropdown lets you pick per-conversation.
 
  ## Features
 
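The deterministic retrieval the README diff describes (SQL + cosine similarity assembled before the LLM runs) can be sketched roughly as below. `rankFacts` and the fact shape are illustrative assumptions for this note, not RecallMEM's actual API; in the real app the similarity ranking happens inside Postgres via pgvector rather than in JavaScript.

```javascript
// Plain cosine similarity over two embedding vectors.
function cosineSimilarity(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank stored facts against a query embedding; the top-k result is what
// gets pasted into the prompt before the chat model ever sees anything.
// No LLM tool calls are involved at any point in this path.
function rankFacts(queryEmbedding, facts, k = 5) {
  return facts
    .map((f) => ({ ...f, score: cosineSimilarity(queryEmbedding, f.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

The point of the design is visible in the shape of the code: the chat model receives the ranked context as plain text and never gets a handle to the database.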
@@ -42,49 +54,35 @@ The memory is the actual differentiator. Not the model. Not the UI. Memory reads
 
  ## Quick start (Mac)
 
- RecallMEM is built and tested on macOS. Mac is the supported platform.
+ Two options. Pick whichever fits your priority.
 
- **Prerequisites:** Node.js 20+ and [Homebrew](https://brew.sh).
+ ### Option A: Cloud LLM (Claude or OpenAI) — fastest, ~5 minutes
+
+ You need Node.js 20+ and [Homebrew](https://brew.sh). Then:
 
  ```bash
  npx recallmem
  ```
 
- That's the whole install. Here's what happens after you hit Enter:
+ The installer sets up Postgres, pgvector, and Ollama (for the embedding model that powers memory). When the browser opens to `localhost:3000`:
 
- 1. **It checks what you already have** on your Mac (Node, Postgres, Ollama). Anything already installed gets skipped.
- 2. **It shows you a list** of what's missing with ✓ and ✗ marks.
- 3. **It asks one question:** `Install everything now? [Y/n]`. Hit Enter to say yes.
- 4. **It runs `brew install`** for Postgres 17, pgvector, and Ollama. You'll see real-time progress in your terminal.
- 5. **It starts Postgres and Ollama as background services** so they keep running across reboots.
- 6. **It downloads EmbeddingGemma** (~600 MB, ~1-2 min). This is required for the memory system.
- 7. **It asks which Gemma 4 model you want.** Three options:
- - **1) Gemma 4 26B** — 18 GB, fast, recommended for most people
- - **2) Gemma 4 31B** — 19 GB, slower, smartest answers
- - **3) Gemma 4 E2B** — 2 GB, very fast, good for testing or older laptops
- 8. **It downloads the model you picked.** E2B finishes in 2-3 min. The 18 GB option takes 10-30 min depending on your internet.
- 9. **It runs database migrations** (~5 seconds).
- 10. **It builds the app for production** (~30-60 seconds, first install only).
- 11. **It starts the server.** Open `http://localhost:3000` in your browser and start chatting.
+ 1. Click **Settings** in the top right
+ 2. Click **Providers**
+ 3. Add your Claude or OpenAI API key
+ 4. Pick that model from the dropdown in the chat header
+ 5. Start chatting
 
- Total time: **5-45 minutes** depending on which model you picked and your internet speed. Most of that is the model download. You only have to interact with it twice — once to confirm install, once to pick a model. After that, walk away.
+ **Total time: ~5 minutes.** The AI remembers everything across every chat. Your memory stays in your local Postgres database. Only the chat messages go to the cloud provider.
 
- **Subsequent runs are instant.** Just `npx recallmem` and the chat opens.
+ ### Option B: Local Gemma 4 — 100% private, ~15-45 minutes
 
- <details>
- <summary><strong>Just want cloud models? (Claude / GPT)</strong></summary>
+ Same `npx recallmem` command. When the app opens, click **Settings → Manage models** and download one of these:
 
- You still need Postgres for local memory storage, but you can skip Ollama entirely:
+ - **Gemma 4 E4B** (4 GB, ~5 minute download) — fastest to test
+ - **Gemma 4 26B** (18 GB, ~20-30 minute download) — recommended for daily use
+ - **Gemma 4 31B** (19 GB, slower, best quality)
 
- ```bash
- brew install postgresql@17 pgvector
- brew services start postgresql@17
- npx recallmem
- ```
-
- After the app starts, go to **Settings → Providers → Add a new provider**, paste your API key, and pick that model from the chat dropdown.
-
- </details>
+ Then pick that model from the dropdown and chat. Nothing leaves your machine.
 
  <details>
  <summary><strong>Linux (not officially supported, manual install)</strong></summary>
@@ -175,6 +173,10 @@ Apache 2.0. See [LICENSE](./LICENSE) and [NOTICE](./NOTICE). Use it, modify it,
 
  ## Status
 
- v0.1. It works. I use it every day. There's no CI, no error monitoring, no SLA. If you want to use it as your daily AI tool, fork it, make it yours, and expect to read the code if something breaks. That's the deal.
+ v0.1.3. It works. I use it every day.
+
+ I built RecallMEM because I wanted an AI that actually knows me. Not because I'm paranoid about privacy (though that's a nice bonus). The chat models you use today forget you the second you close the tab, and that drives me crazy. So I fixed it.
+
+ There's no CI, no error monitoring, no SLA. If you want to use it as your daily AI tool, fork it, make it yours, and expect to read the code if something breaks. That's the deal. If this is useful to you, that's cool. If not, no hard feelings.
 
  [github.com/RealChrisSean/RecallMEM](https://github.com/RealChrisSean/RecallMEM)
@@ -95,35 +95,11 @@ async function waitFor(checkFn, timeoutMs = 15000, intervalMs = 500) {
  return false;
  }
 
- // Pretty model menu — short lines, plain words, dyslexia-friendly.
- async function pickGemmaModel() {
- blank();
- console.log(color.bold("Pick a Gemma 4 model."));
- blank();
- console.log(" 1) Gemma 4 26B");
- console.log(" Size: 18 GB");
- console.log(" Speed: Fast");
- console.log(" Best for: Most people. Recommended.");
- blank();
- console.log(" 2) Gemma 4 31B");
- console.log(" Size: 19 GB");
- console.log(" Speed: Slower");
- console.log(" Best for: People who want the smartest answers, even if it takes longer.");
- blank();
- console.log(" 3) Gemma 4 E2B");
- console.log(" Size: 2 GB");
- console.log(" Speed: Very fast");
- console.log(" Best for: A quick test. Or older laptops.");
- blank();
-
- while (true) {
- const answer = await ask("Type 1, 2, or 3 and press Enter [1]: ");
- if (!answer || answer === "1") return { id: "gemma4:26b", label: "Gemma 4 26B" };
- if (answer === "2") return { id: "gemma4:31b", label: "Gemma 4 31B" };
- if (answer === "3") return { id: "gemma4:e2b", label: "Gemma 4 E2B" };
- console.log(" Type 1, 2, or 3.");
- }
- }
+ // Model selection moved to the web UI. Setup only installs the embedder
+ // (small, required for memory) and lets the user pick a chat model from
+ // the Settings page in the running app. This dramatically shortens the
+ // install time and gives users a real visual progress bar instead of
+ // terminal output for the multi-GB chat model download.
 
  async function setupCommand(opts = {}) {
  const {
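The hunk above shows only the tail of `waitFor` (`return false; }`); its body is elided from the diff. Assuming it is a standard poll-until-true helper matching the visible signature, a minimal reconstruction might look like this; treat it as a sketch, not the package's actual code:

```javascript
// Poll checkFn every intervalMs until it returns truthy or the
// timeout elapses. Resolves true on success, false on timeout —
// matching the `return false;` visible at the end of the real function.
async function waitFor(checkFn, timeoutMs = 15000, intervalMs = 500) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await checkFn()) return true;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}
```

An installer typically uses a helper like this to wait for Postgres or Ollama to accept connections after `brew services start`, rather than sleeping for a fixed duration.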
@@ -363,27 +339,26 @@ async function setupCommand(opts = {}) {
  const hasE2 = await detectOllamaModel("gemma4:e2b");
  const hasAny = has26.installed || has31.installed || hasE2.installed;
 
- if (!hasAny && !skipIfDone) {
- const choice = await pickGemmaModel();
- step(`Downloading ${choice.label}... (this can take a while)`);
- try {
- execSync(`ollama pull ${choice.id}`, { stdio: "inherit" });
- success(`${choice.label} installed`);
- } catch (err) {
- fail(`Failed to pull ${choice.id}: ${err.message}`);
- info(`You can pull it later with: ollama pull ${choice.id}`);
- }
- } else if (hasAny) {
+ // Skip the Gemma chat model download in the installer entirely.
+ // Users pick + download a model from the running web app (Settings →
+ // Manage models) where there's a real progress bar. The chat UI
+ // detects this state and shows an empty-state banner asking the user
+ // to either download a model or add a cloud provider before chatting.
+ if (hasAny) {
  success("A Gemma 4 chat model is already installed");
  }
  }
 
  // ─── Step 10: Write .env.local ─────────────────────────────────────────
+ // Note: we deliberately do NOT hardcode OLLAMA_FAST_MODEL anymore. Fact
+ // extraction now uses whichever model the user is actively chatting with
+ // (cloud or local), looked up from the chat row at runtime. Hardcoding
+ // gemma4:e4b here used to silently break extraction on machines that
+ // only pulled the 26B (or any other size).
  const finalEnv = {
  DATABASE_URL: env.DATABASE_URL || connectionString,
  OLLAMA_URL: env.OLLAMA_URL || "http://localhost:11434",
  OLLAMA_CHAT_MODEL: env.OLLAMA_CHAT_MODEL || "gemma4:26b",
- OLLAMA_FAST_MODEL: env.OLLAMA_FAST_MODEL || "gemma4:e4b",
  OLLAMA_EMBED_MODEL: env.OLLAMA_EMBED_MODEL || "embeddinggemma",
  };
  writeEnv(ENV_PATH, finalEnv);
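For context on the `detectOllamaModel` calls in the hunk above: Ollama's HTTP API exposes a `GET /api/tags` endpoint that lists locally installed models, so a check of this shape would work. The body below is an assumption for illustration (the package's real implementation is not shown in this diff):

```javascript
// Hypothetical detectOllamaModel: ask the local Ollama daemon which
// models are pulled and look for the requested id. Returns the same
// { installed } shape the diff's call sites destructure.
async function detectOllamaModel(modelId, baseUrl = "http://localhost:11434") {
  try {
    const res = await fetch(`${baseUrl}/api/tags`);
    if (!res.ok) return { installed: false };
    const { models = [] } = await res.json();
    // /api/tags names look like "gemma4:26b"; match exactly or by prefix.
    const installed = models.some(
      (m) => m.name === modelId || m.name.startsWith(`${modelId}:`)
    );
    return { installed };
  } catch {
    // Ollama not running (or unreachable) counts as "not installed".
    return { installed: false };
  }
}
```

Treating an unreachable daemon as "not installed" is what lets the installer keep going and defer model setup to the web UI, as the comment in the diff describes.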
@@ -416,12 +391,17 @@ async function setupCommand(opts = {}) {
  blank();
  success(color.bold("Setup complete!"));
  blank();
- console.log("Want a different Gemma 4 model later? Run one of these:");
- console.log(" ollama pull gemma4:26b");
- console.log(" ollama pull gemma4:31b");
- console.log(" ollama pull gemma4:e2b");
+ console.log("One more thing before you can chat:");
+ console.log("");
+ console.log(" You need either a cloud API key OR a local Gemma 4 model.");
+ console.log("");
+ console.log(" When the app opens, click " + color.bold("Settings") + " in the top right.");
+ console.log(" Then pick ONE of these:");
+ console.log("");
+ console.log(" A) " + color.bold("Providers") + " — add a Claude or OpenAI API key (~30 sec, fastest)");
+ console.log(" B) " + color.bold("Manage models") + " — download Gemma 4 E4B for 100% local mode");
  console.log("");
- console.log("Then pick it from the dropdown at the top of the chat.");
+ console.log(" Either one works. You can do both later.");
  blank();
 
  return { ok: true };
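The installer comments earlier in this diff say the chat UI "detects this state and shows an empty-state banner" until the user downloads a model or adds a provider. That gate reduces to a simple predicate; the function and field names here are illustrative assumptions, not RecallMEM's actual code:

```javascript
// Hypothetical empty-state gate: chatting is possible once EITHER a
// local chat model is downloaded OR a cloud provider key is configured.
// Until then the UI would show the banner instead of the chat input.
function canChat({ localModels = [], providers = [] }) {
  const hasLocalModel = localModels.length > 0;
  const hasCloudKey = providers.some((p) => Boolean(p.apiKey));
  return hasLocalModel || hasCloudKey;
}
```

Note this matches the closing message of the installer: either option unblocks chat, and doing both later is fine.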
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "recallmem",
- "version": "0.1.1",
+ "version": "0.1.3",
  "description": "Private, local-first AI chatbot with persistent working memory. One command install via npx.",
  "license": "Apache-2.0",
  "author": "Chris Sean",