recallmem 0.1.1 → 0.1.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -6,7 +6,15 @@
  </p>
 
  <p align="center">
- <strong>Persistent Private AI.</strong> Powered by Gemma 4 running locally on your own machine.
+ <strong>Persistent personal AI that actually remembers you.</strong>
+ </p>
+
+ <p align="center">
+ LLMs like ChatGPT, Claude.ai, and Gemini forget you the moment you end your session. RecallMEM doesn't. It builds a profile of who you are, extracts facts after every conversation, and runs vector search across your entire history to find relevant context. By the time you've used it for a week, it knows you better than any session-bound AI ever could.
+ </p>
+
+ <p align="center">
+ Use it with Claude or OpenAI for fast responses and the best models (~5 minute setup). Or run everything locally with Gemma 4 for 100% privacy. You'll get the same memory framework either way. Your call.
  </p>
 
  <p align="center">
@@ -21,11 +29,15 @@
 
  ## What is this
 
- A personal AI chat app with real memory that runs 100% on your machine. Your conversations stay local. The AI builds a profile of who you are over time, extracts facts after every chat, and vector-searches across your entire history to find relevant context. By the time you've used it for a week, it knows you better than any cloud AI because it never forgets.
+ A personal AI chatbot with REAL memory. Plug in any LLM you want and RecallMEM gives it persistent memory of who you are, what you've talked about, and what's currently true vs historical.
+
+ The best part is that the LLM will never touch your memory in the database. Every retrieval is deterministic SQL + cosine similarity, assembled by TypeScript before the LLM ever sees it. The LLM only proposes new facts; a TypeScript validator decides what gets stored. Facts have timestamps and get auto-retired when you contradict them ("works at Acme" → "left Acme"). [Deep dive on the architecture →](./docs/ARCHITECTURE.md)
 
- The default model is **Gemma 4** (Apache 2.0) running locally via Ollama. Pick any size from E2B (runs on a phone) up to 31B Dense (best quality, needs a workstation). Or skip Ollama entirely and bring your own API key for Claude, GPT, Groq, Together, OpenRouter, or anything OpenAI-compatible.
+ You can run it three ways:
 
- The memory is the actual differentiator. Not the model. Not the UI. Memory reads are deterministic SQL + cosine similarity, not LLM tool calls. The chat model never touches your database. Facts are proposed by a local LLM but validated by TypeScript before storage. [Deep dive on the architecture →](./docs/ARCHITECTURE.md)
+ - **Cloud LLMs (recommended for most people).** Add a Claude or OpenAI API key in Settings. Fast, smart, works on any computer. Your memory still stays local in your own Postgres database. Only the chat messages go to the provider.
+ - **Local LLMs (recommended for privacy).** Run Gemma 4 via Ollama. Nothing leaves your machine, ever. Slower setup (~18 GB model download) and slower responses, but truly air-gappable.
+ - **Both.** Use cloud for daily chat, switch to local for the sensitive stuff. The model dropdown lets you pick per-conversation.
 
  ## Features
 
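The deterministic retrieval the README diff describes (SQL + cosine similarity assembled before the LLM runs) can be sketched roughly as below. `rankFacts` and the fact shape are illustrative assumptions for this note, not RecallMEM's actual API; in the real app the similarity ranking happens inside Postgres via pgvector rather than in JavaScript.

```javascript
// Plain cosine similarity over two embedding vectors.
function cosineSimilarity(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank stored facts against a query embedding; the top-k result is what
// gets pasted into the prompt before the chat model ever sees anything.
// No LLM tool calls are involved at any point in this path.
function rankFacts(queryEmbedding, facts, k = 5) {
  return facts
    .map((f) => ({ ...f, score: cosineSimilarity(queryEmbedding, f.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

The point of the design is visible in the shape of the code: the chat model receives the ranked context as plain text and never gets a handle to the database.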
@@ -42,49 +54,35 @@ The memory is the actual differentiator. Not the model. Not the UI. Memory reads
 
  ## Quick start (Mac)
 
- RecallMEM is built and tested on macOS. Mac is the supported platform.
+ Two options. Pick whichever fits your priority.
 
- **Prerequisites:** Node.js 20+ and [Homebrew](https://brew.sh).
+ ### Option A: Cloud LLM (Claude or OpenAI) — fastest, ~5 minutes
+
+ You need Node.js 20+ and [Homebrew](https://brew.sh). Then:
 
  ```bash
  npx recallmem
  ```
 
- That's the whole install. Here's what happens after you hit Enter:
+ The installer sets up Postgres, pgvector, and Ollama (for the embedding model that powers memory). When the browser opens to `localhost:3000`:
 
- 1. **It checks what you already have** on your Mac (Node, Postgres, Ollama). Anything already installed gets skipped.
- 2. **It shows you a list** of what's missing with ✓ and ✗ marks.
- 3. **It asks one question:** `Install everything now? [Y/n]`. Hit Enter to say yes.
- 4. **It runs `brew install`** for Postgres 17, pgvector, and Ollama. You'll see real-time progress in your terminal.
- 5. **It starts Postgres and Ollama as background services** so they keep running across reboots.
- 6. **It downloads EmbeddingGemma** (~600 MB, ~1-2 min). This is required for the memory system.
- 7. **It asks which Gemma 4 model you want.** Three options:
- - **1) Gemma 4 26B** — 18 GB, fast, recommended for most people
- - **2) Gemma 4 31B** — 19 GB, slower, smartest answers
- - **3) Gemma 4 E2B** — 2 GB, very fast, good for testing or older laptops
- 8. **It downloads the model you picked.** E2B finishes in 2-3 min. The 18 GB option takes 10-30 min depending on your internet.
- 9. **It runs database migrations** (~5 seconds).
- 10. **It builds the app for production** (~30-60 seconds, first install only).
- 11. **It starts the server.** Open `http://localhost:3000` in your browser and start chatting.
+ 1. Click **Settings** in the top right
+ 2. Click **Providers**
+ 3. Add your Claude or OpenAI API key
+ 4. Pick that model from the dropdown in the chat header
+ 5. Start chatting
 
- Total time: **5-45 minutes** depending on which model you picked and your internet speed. Most of that is the model download. You only have to interact with it twice — once to confirm install, once to pick a model. After that, walk away.
+ **Total time: ~5 minutes.** The AI remembers everything across every chat. Your memory stays in your local Postgres database. Only the chat messages go to the cloud provider.
 
- **Subsequent runs are instant.** Just `npx recallmem` and the chat opens.
+ ### Option B: Local Gemma 4 — 100% private, ~15-45 minutes
 
- <details>
- <summary><strong>Just want cloud models? (Claude / GPT)</strong></summary>
+ Same `npx recallmem` command. When the app opens, click **Settings → Manage models** and download one of these:
 
- You still need Postgres for local memory storage, but you can skip Ollama entirely:
+ - **Gemma 4 E4B** (4 GB, ~5 minute download) — fastest to test
+ - **Gemma 4 26B** (18 GB, ~20-30 minute download) — recommended for daily use
+ - **Gemma 4 31B** (19 GB, slower, best quality)
 
- ```bash
- brew install postgresql@17 pgvector
- brew services start postgresql@17
- npx recallmem
- ```
-
- After the app starts, go to **Settings → Providers → Add a new provider**, paste your API key, and pick that model from the chat dropdown.
-
- </details>
+ Then pick that model from the dropdown and chat. Nothing leaves your machine.
 
  <details>
  <summary><strong>Linux (not officially supported, manual install)</strong></summary>
@@ -175,6 +173,10 @@ Apache 2.0. See [LICENSE](./LICENSE) and [NOTICE](./NOTICE). Use it, modify it,
 
  ## Status
 
- v0.1. It works. I use it every day. There's no CI, no error monitoring, no SLA. If you want to use it as your daily AI tool, fork it, make it yours, and expect to read the code if something breaks. That's the deal.
+ v0.1.3. It works. I use it every day.
+
+ I built RecallMEM because I wanted an AI that actually knows me. Not because I'm paranoid about privacy (though that's a nice bonus). The chat models you use today forget you the second you close the tab, and that drives me crazy. So I fixed it.
+
+ There's no CI, no error monitoring, no SLA. If you want to use it as your daily AI tool, fork it, make it yours, and expect to read the code if something breaks. That's the deal. If this is useful to you, that's cool. If not, no hard feelings.
 
  [github.com/RealChrisSean/RecallMEM](https://github.com/RealChrisSean/RecallMEM)
@@ -95,35 +95,11 @@ async function waitFor(checkFn, timeoutMs = 15000, intervalMs = 500) {
  return false;
  }
 
- // Pretty model menu — short lines, plain words, dyslexia-friendly.
- async function pickGemmaModel() {
- blank();
- console.log(color.bold("Pick a Gemma 4 model."));
- blank();
- console.log(" 1) Gemma 4 26B");
- console.log(" Size: 18 GB");
- console.log(" Speed: Fast");
- console.log(" Best for: Most people. Recommended.");
- blank();
- console.log(" 2) Gemma 4 31B");
- console.log(" Size: 19 GB");
- console.log(" Speed: Slower");
- console.log(" Best for: People who want the smartest answers, even if it takes longer.");
- blank();
- console.log(" 3) Gemma 4 E2B");
- console.log(" Size: 2 GB");
- console.log(" Speed: Very fast");
- console.log(" Best for: A quick test. Or older laptops.");
- blank();
-
- while (true) {
- const answer = await ask("Type 1, 2, or 3 and press Enter [1]: ");
- if (!answer || answer === "1") return { id: "gemma4:26b", label: "Gemma 4 26B" };
- if (answer === "2") return { id: "gemma4:31b", label: "Gemma 4 31B" };
- if (answer === "3") return { id: "gemma4:e2b", label: "Gemma 4 E2B" };
- console.log(" Type 1, 2, or 3.");
- }
- }
+ // Model selection moved to the web UI. Setup only installs the embedder
+ // (small, required for memory) and lets the user pick a chat model from
+ // the Settings page in the running app. This dramatically shortens the
+ // install time and gives users a real visual progress bar instead of
+ // terminal output for the multi-GB chat model download.
 
  async function setupCommand(opts = {}) {
  const {
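The hunk above shows only the tail of `waitFor` (`return false; }`); its body is elided from the diff. Assuming it is a standard poll-until-true helper matching the visible signature, a minimal reconstruction might look like this; treat it as a sketch, not the package's actual code:

```javascript
// Poll checkFn every intervalMs until it returns truthy or the
// timeout elapses. Resolves true on success, false on timeout —
// matching the `return false;` visible at the end of the real function.
async function waitFor(checkFn, timeoutMs = 15000, intervalMs = 500) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await checkFn()) return true;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}
```

An installer typically uses a helper like this to wait for Postgres or Ollama to accept connections after `brew services start`, rather than sleeping for a fixed duration.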
@@ -363,27 +339,26 @@ async function setupCommand(opts = {}) {
  const hasE2 = await detectOllamaModel("gemma4:e2b");
  const hasAny = has26.installed || has31.installed || hasE2.installed;
 
- if (!hasAny && !skipIfDone) {
- const choice = await pickGemmaModel();
- step(`Downloading ${choice.label}... (this can take a while)`);
- try {
- execSync(`ollama pull ${choice.id}`, { stdio: "inherit" });
- success(`${choice.label} installed`);
- } catch (err) {
- fail(`Failed to pull ${choice.id}: ${err.message}`);
- info(`You can pull it later with: ollama pull ${choice.id}`);
- }
- } else if (hasAny) {
+ // Skip the Gemma chat model download in the installer entirely.
+ // Users pick + download a model from the running web app (Settings →
+ // Manage models) where there's a real progress bar. The chat UI
+ // detects this state and shows an empty-state banner asking the user
+ // to either download a model or add a cloud provider before chatting.
+ if (hasAny) {
  success("A Gemma 4 chat model is already installed");
  }
  }
 
  // ─── Step 10: Write .env.local ─────────────────────────────────────────
+ // Note: we deliberately do NOT hardcode OLLAMA_FAST_MODEL anymore. Fact
+ // extraction now uses whichever model the user is actively chatting with
+ // (cloud or local), looked up from the chat row at runtime. Hardcoding
+ // gemma4:e4b here used to silently break extraction on machines that
+ // only pulled the 26B (or any other size).
  const finalEnv = {
  DATABASE_URL: env.DATABASE_URL || connectionString,
  OLLAMA_URL: env.OLLAMA_URL || "http://localhost:11434",
  OLLAMA_CHAT_MODEL: env.OLLAMA_CHAT_MODEL || "gemma4:26b",
- OLLAMA_FAST_MODEL: env.OLLAMA_FAST_MODEL || "gemma4:e4b",
  OLLAMA_EMBED_MODEL: env.OLLAMA_EMBED_MODEL || "embeddinggemma",
  };
  writeEnv(ENV_PATH, finalEnv);
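For context on the `detectOllamaModel` calls in the hunk above: Ollama's HTTP API exposes a `GET /api/tags` endpoint that lists locally installed models, so a check of this shape would work. The body below is an assumption for illustration (the package's real implementation is not shown in this diff):

```javascript
// Hypothetical detectOllamaModel: ask the local Ollama daemon which
// models are pulled and look for the requested id. Returns the same
// { installed } shape the diff's call sites destructure.
async function detectOllamaModel(modelId, baseUrl = "http://localhost:11434") {
  try {
    const res = await fetch(`${baseUrl}/api/tags`);
    if (!res.ok) return { installed: false };
    const { models = [] } = await res.json();
    // /api/tags names look like "gemma4:26b"; match exactly or by prefix.
    const installed = models.some(
      (m) => m.name === modelId || m.name.startsWith(`${modelId}:`)
    );
    return { installed };
  } catch {
    // Ollama not running (or unreachable) counts as "not installed".
    return { installed: false };
  }
}
```

Treating an unreachable daemon as "not installed" is what lets the installer keep going and defer model setup to the web UI, as the comment in the diff describes.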
@@ -416,12 +391,17 @@ async function setupCommand(opts = {}) {
  blank();
  success(color.bold("Setup complete!"));
  blank();
- console.log("Want a different Gemma 4 model later? Run one of these:");
- console.log(" ollama pull gemma4:26b");
- console.log(" ollama pull gemma4:31b");
- console.log(" ollama pull gemma4:e2b");
+ console.log("One more thing before you can chat:");
+ console.log("");
+ console.log(" You need either a cloud API key OR a local Gemma 4 model.");
+ console.log("");
+ console.log(" When the app opens, click " + color.bold("Settings") + " in the top right.");
+ console.log(" Then pick ONE of these:");
+ console.log("");
+ console.log(" A) " + color.bold("Providers") + " — add a Claude or OpenAI API key (~30 sec, fastest)");
+ console.log(" B) " + color.bold("Manage models") + " — download Gemma 4 E4B for 100% local mode");
  console.log("");
- console.log("Then pick it from the dropdown at the top of the chat.");
+ console.log(" Either one works. You can do both later.");
  blank();
 
  return { ok: true };
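The installer comments earlier in this diff say the chat UI "detects this state and shows an empty-state banner" until the user downloads a model or adds a provider. That gate reduces to a simple predicate; the function and field names here are illustrative assumptions, not RecallMEM's actual code:

```javascript
// Hypothetical empty-state gate: chatting is possible once EITHER a
// local chat model is downloaded OR a cloud provider key is configured.
// Until then the UI would show the banner instead of the chat input.
function canChat({ localModels = [], providers = [] }) {
  const hasLocalModel = localModels.length > 0;
  const hasCloudKey = providers.some((p) => Boolean(p.apiKey));
  return hasLocalModel || hasCloudKey;
}
```

Note this matches the closing message of the installer: either option unblocks chat, and doing both later is fine.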
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "recallmem",
- "version": "0.1.1",
+ "version": "0.1.3",
  "description": "Private, local-first AI chatbot with persistent working memory. One command install via npx.",
  "license": "Apache-2.0",
  "author": "Chris Sean",