npm - @hmanlab/memo - Versions diffs - 0.5.0 → 0.5.2 - Mend

@hmanlab/memo 0.5.0 → 0.5.2

Files changed (5)

package/README.md +119 -3
package/claude/skill/memo/SKILL.md +88 -48
package/dist/cli.js +36966 -346
package/dist/memo-mcp-server.js +38305 -1373
package/package.json +23 -2

package/README.md CHANGED

@@ -2,7 +2,7 @@
 Local-first MCP server for persistent, persona-aware memory across projects.
-`memo` ships two surfaces: an MCP server (33 tools) for AI clients like
+`memo` ships two surfaces: an MCP server (35 tools) for AI clients like
 Claude Code, and a Node CLI (`hmanlab-memory`) for power users. Both
 share the same backend — the CLI is a thin wrapper, not a re-implementation.
@@ -10,9 +10,123 @@ Everything lives under `~/.hmanlab/`: one root SQLite DB + a `personas/`
 directory of YAML files + one DB per registered project. No cloud, no
 account, no telemetry.
+## What `npx -y @hmanlab/hl-plugins install memo` does
+It's the one-line path to a working setup. It runs five steps, in order:
+1. **Pre-flight.** Node ≥ 18, an `~/.opencode/` config dir. Auto-creates
+   the dir if missing.
+2. **Install Bun.** Memo is built with `--target=bun`, so Bun is a hard
+   requirement. The installer auto-installs it via
+   `curl -fsSL https://bun.sh/install | bash` if it isn't on PATH yet.
+3. **Stage the plugin CLI.** Copies `dist/cli.js` to
+   `~/.local/share/hl-plugins/memo/` so the next step can invoke plugin
+   subcommands by absolute path (no PATH dependency yet).
+4. **Prompt about MiniLM.** *See the section below.* Your answer is
+   persisted during install — there's no "run later" step.
+5. **Copy + register.** Ships the MCP server bundle to
+   `~/.local/share/hl-plugins/memo/memo-mcp-server.js`, drops the skill
+   markdown at `~/.claude/skills/memo/SKILL.md`, and registers the server
+   in your Claude Code config. Then prints
+   *"Restart opencode to use the new tools."*
+That's it. No auth, no account, no telemetry, no daemon. The server runs
+on stdio only when Claude Code invokes it.
+### Optional: the MiniLM embedder
+After Bun is confirmed and before any files are copied, the installer asks
+once whether you want the optional MiniLM-L6-v2 model. The model powers
+semantic search — paraphrase and typo queries still hit the right memory
+even when the words don't match the stored content literally.
+```
+? MiniLM-L6-v2 (~25 MB) powers semantic search so paraphrase and typo queries
+  still hit the right memory.
+  With it:    75.2% recall@5 (62.9% recall@1)
+  Without it: paraphrase queries drop to ~30%, typo queries to ~25%
+              (105-query eval across coding, glossary, and preferences)
+Enable? [Y/n]:
+```
+Your answer is committed during install — no follow-up step:
+- **Y (default):** writes `embedder_mode: minilm` to
+  `~/.hmanlab/config.yaml`. The model downloads lazily on the next
+  `memory_save` / `memory_search` call (~25 MB, ~2 s warmup, then ~50 ms
+  per query).
+- **n:** writes `embedder_mode: hash`. `loadExtractor()` short-circuits
+  on every subsequent call. The model is **never** downloaded or
+  referenced — the embedder uses the deterministic trigram fallback.
+Non-interactive installs (CI, scripts piped via `| sh`) treat the prompt
+as Yes so the install never blocks.
+Change your mind any time:
+```bash
+hmanlab-memory embedder status     # show current mode
+hmanlab-memory embedder install    # switch to minilm (lazy download on next memory call)
+hmanlab-memory embedder disable    # switch to hash (no download, ever)
+```
+The mode is stored under `embedder_mode` in `~/.hmanlab/config.yaml`. Three
+values: `minilm` (require the real model), `hash` (use the deterministic
+trigram fallback), `auto` (try MiniLM, fall back to hash on failure —
+default if the key is absent).
+### With MiniLM vs without — what actually changes
+Same 105 positive + 20 negative queries, same memory corpus. Two
+columns: **Hash** (no MiniLM, no model download) and **MiniLM +
+trigram** (what ships by default — semantic embedder + the trigram
+FTS5 mirror that catches 3-char substring overlap).
+**Headline metrics:**
+| Metric | Hash fallback | MiniLM + trigram | Δ |
+|---|---|---|---|
+| Recall@1 | 41.0% | **62.9%** | **+21.9 pp** |
+| Recall@5 | 68.6% | **75.2%** | +6.6 pp |
+| MRR | 0.516 | **0.679** | **+0.163** |
+The biggest win is Recall@1 — the trigram FTS5 mirror lifts it from
+45.7% (MiniLM alone) to 62.9% (MiniLM + trigram). When the query
+shares even one 3-char substring with the right memory, that memory
+now lands at rank 1 instead of being lost in the top-5 noise.
+**By domain (R@5):**
+| Domain | Hash | MiniLM + trigram | Δ |
+|---|---|---|---|
+| glossary | 64.5% | 100.0% | **+35.5** |
+| preferences | 97.4% | 100.0% | +2.6 |
+**By query kind (R@5):**
+| Kind | Hash | MiniLM + trigram | Δ |
+|---|---|---|---|
+| literal | 93.3% | 96.7% | +3.4 |
+| paraphrase | 60.0% | 66.7% | +6.7 |
+| typo | 53.3% | 66.7% | +13.3 |
+| negation | 70.0% | 60.0% | **−10.0** |
+| broad | 60.0% | 80.0% | +20.0 |
+If your memory is mostly short, literal preferences, hash fallback is
+competitive. If your memory is glossary definitions or fuzzy
+paraphrases, MiniLM + trigram dominates — particularly on
+broad queries where the user types a vague prompt and expects the
+right memory to surface.
+Raw eval data:
+- `~/Desktop/memo-eval/results-2026-06-25-bigeval.json` (MiniLM + trigram, current ship state)
+- `~/Desktop/memo-eval/results-2026-06-25-bigeval-hash.json` (hash fallback, what you get if you decline MiniLM at install)
 ## What's in the box (v1.0.0)
-### MCP tools (33)
+### MCP tools (35)
 - **Persona (11):** `persona_list`, `persona_get`, `persona_create`,
   `persona_update`, `persona_delete`, `persona_clone`,
   `persona_reload`, `user_persona_get`, `user_persona_update`
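
Reviewer note on the eval tables in the hunk above: the README reports Recall@1, Recall@5 and MRR, but the diff does not include the eval harness. Below is a minimal sketch of how these metrics are conventionally computed, assuming each query has exactly one expected memory and that its 1-based rank in the results is known (or `null` when it is missing). The function names `recallAtK` and `meanReciprocalRank` are hypothetical, not taken from the package.

```typescript
// Rank of the expected memory for one query: 1-based, or null if it was not returned.
type Rank = number | null;

// Fraction of queries whose expected memory appears within the top k results.
function recallAtK(ranks: Rank[], k: number): number {
  if (ranks.length === 0) return 0;
  const hits = ranks.filter((r) => r !== null && r <= k).length;
  return hits / ranks.length;
}

// Mean of 1/rank over all queries; a missing result contributes 0.
function meanReciprocalRank(ranks: Rank[]): number {
  if (ranks.length === 0) return 0;
  const sum = ranks.reduce<number>((acc, r) => acc + (r === null ? 0 : 1 / r), 0);
  return sum / ranks.length;
}

// Toy data (not from the package's eval): four queries, one of them a miss.
const exampleRanks: Rank[] = [1, 3, null, 2];
console.log(recallAtK(exampleRanks, 1), recallAtK(exampleRanks, 5), meanReciprocalRank(exampleRanks));
```

Under these definitions, and if the headline percentages are taken over the 105 positive queries, 75.2% Recall@5 would correspond to 79 hits and 62.9% Recall@1 to 66. That reading is an inference from the numbers, not something the README states.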
@@ -70,9 +184,11 @@ Full CLI reference: [`docs/USAGE.md`](./docs/USAGE.md).
 ```
 ~/.hmanlab/
-├── config.yaml          # cwd_auto_detect, persona_filter_mode, decay knobs
+├── config.yaml          # cwd_auto_detect, persona_filter_mode, embedder_mode
 ├── root.db              # user_persona, ai_personas, projects,
 │                        # global_memories (+ _fts + _edges), schema migrations
+├── models/              # MiniLM-L6-v2 q8 (~25 MB), lazy-downloaded on first use
+│   └── Xenova/all-MiniLM-L6-v2/...
 ├── personas/            # persona YAML files (built-in + user)
 │   ├── default.yaml
 │   ├── work.yaml        # parent: default
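
Reviewer note on `embedder_mode`: the README hunks above describe three values (`minilm`, `hash`, `auto`), a default of `auto` when the key is absent, and an install prompt that maps Y to `minilm` and n to `hash`, with non-interactive installs treated as Yes. The sketch below restates those rules as code. It assumes the YAML has already been parsed into a plain object. The names `resolveEmbedderMode` and `modeFromPromptAnswer` are hypothetical, and rejecting unknown values with an error is an assumption, since the README does not say how invalid values are handled.

```typescript
type EmbedderMode = "minilm" | "hash" | "auto";

const KNOWN_MODES: readonly EmbedderMode[] = ["minilm", "hash", "auto"];

// Resolve the mode from an already-parsed config object.
// Absent key -> "auto", as the README states. Unknown value -> error (assumption).
function resolveEmbedderMode(config: Record<string, unknown>): EmbedderMode {
  const raw = config["embedder_mode"];
  if (raw === undefined || raw === null) return "auto";
  const match = KNOWN_MODES.find((m) => m === raw);
  if (match === undefined) {
    throw new Error(`unknown embedder_mode: ${String(raw)}`);
  }
  return match;
}

// Map the install prompt answer to the value written to config.yaml.
// null models a non-interactive install, which the README says is treated as Yes.
// An empty answer takes the default (Y). Matching any answer that starts
// with "n" as No is an assumption.
function modeFromPromptAnswer(answer: string | null): "minilm" | "hash" {
  if (answer === null) return "minilm";
  return /^\s*n/i.test(answer) ? "hash" : "minilm";
}

console.log(resolveEmbedderMode({}), modeFromPromptAnswer("n"));
```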

package/claude/skill/memo/SKILL.md CHANGED

@@ -5,25 +5,82 @@ description: Use when the user wants persistent memory across projects, persona-
 # memo — hmanlab-memo (local-first MCP memory)
-The `memo` MCP server exposes 9 tools that give an AI coding assistant
-persistent, persona-aware memory on the user's machine. Everything lives under
-`~/.hmanlab/` (one root SQLite DB + a `personas/` directory of YAML files).
-No cloud, no account, no telemetry.
-This is the Phase 01 slice: persona + user-persona CRUD only. Projects,
-memories, embeddings, and hybrid search land in later phases.
-| Tool                  | What it does                                                |
-| --------------------- | ----------------------------------------------------------- |
-| `persona_list`        | List all personas (built-in + user)                         |
-| `persona_get`         | Read one persona (resolves `parent` chain)                  |
-| `persona_create`      | Write a new YAML persona + DB row                           |
-| `persona_update`      | Edit a persona, bump version                                |
-| `persona_delete`      | Soft-delete (archive) — YAML stays                          |
-| `persona_clone`       | Duplicate a persona as a starting point                     |
-| `persona_reload`      | Re-scan `~/.hmanlab/personas/` and resync the DB            |
-| `user_persona_get`    | Read the user's persona singleton                          |
-| `user_persona_update` | Edit the user's persona                                    |
+The `memo` MCP server exposes 35 tools that give an AI coding assistant
+persistent, persona-aware memory on the user's machine. Everything lives
+under `~/.hmanlab/` (one root SQLite DB + a `personas/` directory of YAML
+files + one DB per registered project). No cloud, no account, no telemetry.
+The MCP bundle is `--target=bun` and lives at
+`~/.local/share/hl-plugins/memo/memo-mcp-server.js`. Claude Code launches
+it on stdio.
+## Search strategy — normalize the query before calling `memory_search`
+`memory_search` is hybrid (FTS + recency + vector). Vector search lifts
+Recall@5 from 68.6% → 73.3% on the standard eval, but only if the query
+isn't too distorted. **You are better at query rewriting than the local
+embedder.** Before calling `memory_search`, normalize the query yourself:
+1. **Fix typos** before searching — vector similarity on `"indenation with
+   tabs"` lands on the right memory; `"indsnation with tabx"` might not.
+2. **Drop conversational filler** ("can you", "do you know", "what was
+   that thing about"). The filler dilutes the cosine signal.
+3. **Strip negation of the question, keep negation of the memory.** When
+   the user asks "should I commit secrets to a private repo", search for
+   `"Never commit secrets to git"` (the actual memory), not the literal
+   question. The embedder can't distinguish "should I commit" from
+   "Never commit" — but if your query contains the *memory's* words,
+   FTS catches it.
+4. **Prefer the user's own phrasing** when you remember it from the
+   conversation. If the user said "tabs not spaces" earlier and the
+   memory says "Use tabs for indentation in this project", search for
+   the latter — it matches both FTS and vector better.
+5. **One query, not several.** Don't fan out: a single, well-chosen query
+   outperforms three noisy ones.
+Concretely, before calling `memory_search("query")`:
+```
+raw user question: "wait what was my rule about not committing api keys"
+rewrite to:         "Never commit secrets to git"
+then call:          memory_search("Never commit secrets to git")
+```
+Don't rewrite for `memory_recent` — that's recency-only and your input
+doesn't matter.
+## When to use the tools
+- **User asks a question whose answer is in memory** → `memory_search`
+  with a normalized query. Skim top-5; if none match, fall back to
+  answering from your own knowledge (don't claim a memory hit when
+  there isn't one).
+- **User states a preference / rule / decision worth keeping** →
+  `memory_save`. Use `importance: 0.9` for durable rules, `0.5` for
+  context, `0.3` for one-off notes. Add a `category` (e.g. "preferences",
+  "code-style", "glossary").
+- **User asks "what do you know about me / this project"** → `memory_search`
+  with `scope: "project"` for project-specific, `scope: "all"` for
+  everything.
+- **User asks to switch hats / "talk like X"** → `persona_list` to see
+  options, `persona_get` to read the full prompt, then continue as that
+  persona.
+- **User asks to remember a global preference** → `user_persona_update`.
+- **Long conversation, context getting heavy** → `memory_compact_prep`
+  to get the pre-selected subset worth re-injecting after compaction.
+- **Storage getting messy** → `memory_hygiene all` for the stale/cold/
+  duplicate report, `memory_status` for the headline counts.
+- **Want to back up / move a project** → `project_export <name>` /
+  `project_import <archive>`.
+## Save rules
+- **Be specific.** "Use tabs for indentation" beats "code style matters".
+- **One fact per memory.** Splitting lets each one rank on its own.
+- **Use the user's own words** when possible. They're more searchable
+  later.
+- **Pick importance honestly.** `0.9` = durable rule, `0.5` = context,
+  `0.3` = ephemeral.
 ## Setup (one-time, on the machine)
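
Reviewer note on the search-strategy rules in the hunk above: rule 2 (drop conversational filler) is the only one that is purely mechanical, so it can be sketched in code. The phrase list below holds just the three examples SKILL.md gives, and the function name `stripFiller` is hypothetical. The skill text asks the assistant itself to do the rewriting, so this illustrates the idea rather than showing anything the package ships.

```typescript
// The three filler phrases quoted in SKILL.md, longest first so that a
// longer phrase is removed before any shorter one.
const FILLER_PHRASES: readonly string[] = [
  "what was that thing about",
  "do you know",
  "can you",
];

// Remove filler phrases (case-insensitive, whole words only) and tidy whitespace.
function stripFiller(query: string): string {
  let out = query;
  for (const phrase of FILLER_PHRASES) {
    // The phrases contain no regex metacharacters, so no escaping is needed.
    out = out.replace(new RegExp(`\\b${phrase}\\b`, "gi"), " ");
  }
  return out.replace(/\s+/g, " ").trim();
}

console.log(stripFiller("Can you tell me what was that thing about tabs"));
```

Typo fixing, negation handling and rewriting toward the stored memory's wording (rules 1, 3 and 4) depend on judgment and conversation context, which is why the skill delegates them to the assistant rather than to a filter like this.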
@@ -32,37 +89,20 @@ memories, embeddings, and hybrid search land in later phases.
    ```bash
    hl-plugins install memo
    ```
-3. Restart Claude Code. The 9 tools above appear under the `memo` MCP server.
-The CLI auto-installs Bun if missing and registers the MCP bundle under
-`~/.local/share/hl-plugins/memo/`, then wires it into `~/.claude.json`.
+3. The installer prompts once about MiniLM-L6-v2 (a local embedder that
+   powers semantic search). Default is Yes — ~25 MB download on first
+   memory call.
+4. Restart Claude Code. The 35 tools appear under the `memo` MCP server.
 ## On-disk layout
 ```
 ~/.hmanlab/
-├── config.yaml          # paths + embedding defaults (phase-01 reads/writes subset)
-├── root.db              # WAL-mode SQLite: user_persona, ai_personas
-└── personas/
-    ├── default.yaml     # built-in (warm, balanced)
-    ├── work.yaml        # built-in (parent: default)
-    ├── creative.yaml    # built-in (parent: default)
-    └── <user-defined>.yaml
-```
-YAML is the source of truth. Editing a file on disk and calling `persona_reload`
-updates the DB to match. The starter pack is extracted only on first boot;
-existing YAMLs are never overwritten.
-## When to use these tools
-- **User asks to switch hats / "talk like X" / use a persona** → `persona_list`
-  to see options, `persona_get` to read the full prompt, then continue the
-  conversation as that persona.
-- **User asks to remember a preference** → `user_persona_update` with the
-  preference text.
-- **User asks to create / edit a persona** → `persona_create` or
-  `persona_update`.
-- **User edits a persona YAML directly** → `persona_reload` to make the DB
-  match.
-- **User wants to fork an existing persona** → `persona_clone`.
+├── config.yaml          # paths, embedder_mode, persona_filter_mode
+├── root.db              # user_persona, ai_personas, projects, global_memories
+├── models/              # MiniLM-L6-v2 q8 (lazy-downloaded on first embed call)
+├── personas/            # persona YAML files (built-in + user)
+└── projects/<name>/
+    ├── project.yaml
+    └── hmanlab.db       # memories + FTS5
+```
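
Reviewer note on the on-disk layout in the hunk above: the tree can be restated as a small path helper. This is a sketch under stated assumptions. The function `memoPaths` and its field names are hypothetical, the home directory is passed in rather than looked up, and paths are joined with `/`, so it assumes a POSIX-style filesystem.

```typescript
interface MemoPaths {
  config: string;      // config.yaml
  rootDb: string;      // root.db
  models: string;      // models/ (MiniLM files, if downloaded)
  personas: string;    // personas/
  projectYaml: string; // projects/<name>/project.yaml
  projectDb: string;   // projects/<name>/hmanlab.db
}

// Build the paths shown in the SKILL.md layout for one registered project.
function memoPaths(home: string, project: string): MemoPaths {
  const base = `${home}/.hmanlab`;
  const projectDir = `${base}/projects/${project}`;
  return {
    config: `${base}/config.yaml`,
    rootDb: `${base}/root.db`,
    models: `${base}/models`,
    personas: `${base}/personas`,
    projectYaml: `${projectDir}/project.yaml`,
    projectDb: `${projectDir}/hmanlab.db`,
  };
}

console.log(memoPaths("/home/alice", "demo").projectDb);
```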