@clauderecallhq/cli 0.77.3 → 0.92.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -138,7 +138,7 @@ recall search "auth #auth-fix" -p Pest-Control
138
138
 
139
139
  Local **768-dimension embeddings** via `bge-base-en-v1.5` ONNX. Three-lane RRF fusion (BM25 + summary + vectors) finds sessions by *meaning*, not keywords. Your code never leaves your laptop.
140
140
 
141
- The embedder model (~110MB) auto-downloads on `npm install`. Skip with `RECALL_SKIP_MODEL_DOWNLOAD=1`.
141
+ The embedder model (~110MB) is opt-in — install on demand with `recall semantic install`. Keeps `npm install -g` quick and avoids forcing the download on users who never touch vector search.
142
142
 
143
143
  **One-click vectorization from the web UI.** Each project header has a `🧠 Vectorize` button. Click → pre-flight dialog shows scope (X of Y eligible sessions), three-✓ cost guarantees (free, on-device CPU only, no network), depth selector (Quick / Standard / Full / Custom chunks-per-session), and a **depth-aware ETA** computed server-side per cap level so Standard and Full give honest, different numbers. **Counters are per-repository:** the Queued, Throughput, and ETA tiles only count chunks from the project you opened the dialog from, with a separate small disclosure line surfacing how many chunks are queued from other projects ahead of yours (the worker drains FIFO globally). While running, the dialog flips to monitor mode with a live progress bar, ETA tick countdown, observed-throughput (drained-in-this-session) feeding the estimate, and visible **warm-up** (yellow) + **wind-down** (red) progress bars with timing context for the parts that have no deterministic ETA (embedder load is 5-60s by hardware; worker batch wind-down is 5-15s). **Three explicit confirmation states close every run:** green `✓ Vectorization complete — N chunks embedded in Xs` banner when the queue drains naturally, green `✓ Already vectorized — N{project} is already done` pre-flight callout (replacing the gray "Nothing to do" text) when there's nothing to vectorize, and the existing green `✓ Stopped — worker halted and queue cleared` badge for Stop. Stop itself is a **global kill switch** — clears every project's queue and halts the worker. The vector worker **does not auto-resume on daemon restart** by default; enable `semantic.autoResumeWorker: true` in `~/.recall/config.json` if you want background draining without a click. Free tier doesn't see the button — endpoint is Pro-gated.
144
144
 
@@ -146,13 +146,15 @@ The embedder model (~110MB) auto-downloads on `npm install`. Skip with `RECALL_S
146
146
 
147
147
  ```bash
148
148
  recall similar <session-id>
149
- recall semantic install # one-time model download (skipped if `npm install` already pulled it)
149
+ recall semantic install # one-time model download (~110MB, on-device)
150
150
  recall semantic reindex # vectorize on local CPU (idempotent, resumable)
151
151
  recall semantic verify-spawn # diagnostic — confirms claude CLI honors the no-persistence flag
152
152
  ```
153
153
 
154
154
  **Two lanes.** Tier-2 (commands above — `install`, `reindex`, `verify-spawn`) is free, local, and never sends anything anywhere. Tier-1 (`recall semantic on / backfill / auto-extract`) shells out to your local `claude` CLI to summarize sessions and **costs plan tokens at ~30 sessions/min** — opt-in for users who want LLM-generated summaries on top of vectors. Default-off.
155
155
 
156
+ **Switching backends in place.** Pro users running an older ONNX-based vector index who upgrade to the llama.cpp backend can run `recall semantic migrate` to re-embed the existing corpus under the new backend. The command is resumable (SIGINT-safe), atomic at switchover, and retains a 30-day rollback window via `recall semantic rollback-migration`.
157
+
156
158
  ### Cost Analytics
157
159
 
158
160
  Per-session and per-project token + dollar totals. Daily sparkline. Top-10 heaviest sessions. Know exactly where your Anthropic bill is going.
@@ -164,7 +166,7 @@ recall stats --project Tools --days 7
164
166
 
165
167
  <div align="center">
166
168
 
167
- <img src="https://cdn.clauderecall.com/marketing/cost-analytics.gif?v=hd3" alt="Cost analytics dashboard — daily spend bars, top sessions ranked by cost, primary model breakdown" width="540">
169
+ <img src="https://cdn.clauderecall.com/marketing/cost-analytics.gif?v=hd4" alt="Cost analytics dashboard — daily spend bars, top sessions ranked by cost, primary model breakdown" width="540">
168
170
 
169
171
  <sub>Hover any day. Exact cost. Exact tokens.</sub>
170
172
 
@@ -278,7 +280,7 @@ recall install-extension
278
280
 
279
281
  <div align="center">
280
282
 
281
- <img src="https://cdn.clauderecall.com/marketing/vs-obsidian.gif?v=hd3" alt="Side-by-side: Obsidian vault on the left, Claude Recall TUI on the right showing 247 sessions across 8 projects with live preview" width="900">
283
+ <img src="https://cdn.clauderecall.com/marketing/vs-obsidian.gif?v=hd4" alt="Side-by-side: Obsidian vault on the left, Claude Recall TUI on the right showing 247 sessions across 8 projects with live preview" width="900">
282
284
 
283
285
  <sub>People sometimes ask if this is "Obsidian for Claude Code." It is not. A general note-taking app cannot watch your filesystem for new sessions, index JSONLs into FTS5 + vector search, expose those sessions as MCP tools, or pipe a session back into Claude with one command. Claude Recall is built specifically for the loop you actually run.</sub>
284
286
 
@@ -504,7 +506,7 @@ All writes are rate-limited (default 60/min), zod-validated, and audited to `~/.
504
506
 
505
507
  <div align="center">
506
508
 
507
- <img src="https://cdn.clauderecall.com/marketing/privacy-trust.gif?v=hd3" alt="Local-first design — daemon binds to 127.0.0.1, session content stays on your machine" width="640">
509
+ <img src="https://cdn.clauderecall.com/marketing/privacy-trust.gif?v=hd4" alt="Local-first design — daemon binds to 127.0.0.1, session content stays on your machine" width="640">
508
510
 
509
511
  </div>
510
512
 
@@ -593,10 +595,13 @@ recall correlate # link sessions to commits
593
595
  recall blame <sha> # commit -> session reverse lookup
594
596
 
595
597
  # Semantic / vector search (Pro)
596
- # Embedder model auto-installs on `npm install`; manual install also available.
597
- recall semantic install # download on-device embedding model (auto-runs on npm install)
598
+ # Embedder model is opt-in `npm install` does NOT auto-download it.
599
+ recall semantic install # download on-device embedding model (~110MB)
598
600
  recall semantic status # model + backfill progress + auto-extract state
599
601
  recall semantic reindex # re-embed everything
602
+ recall semantic migrate # re-embed existing corpus under a different backend (Pro)
603
+ recall semantic rollback-migration --force # restore prior corpus within 30-day window
604
+ recall semantic prune-rollback --force # drop backup table early
600
605
  recall semantic auto-extract on # daemon nibbles un-extracted sessions through Claude (Pro)
601
606
  recall similar <id> # cosine kNN over session chunks
602
607
 
@@ -676,7 +681,7 @@ Claude Recall is a Node.js CLI with a few native dependencies (`better-sqlite3`,
676
681
 
677
682
  **Node:** 22 LTS or 24 LTS. Node 20 and earlier are unsupported (declared in `engines.node`).
678
683
 
679
- **Semantic search is optional.** The on-device embedder is the only feature that loads the native ONNX runtime. The model auto-downloads (~110MB) on `npm install`; skip with `RECALL_SKIP_MODEL_DOWNLOAD=1` and install later via `recall semantic install`. Core CLI features (search, list, context, daemon, MCP) work on every supported platform regardless of whether the embedder is installed. If the embedder fails to load on your platform, you get a clear error pointing here, and the rest of Claude Recall keeps working.
684
+ **Semantic search is optional.** The on-device embedder is the only feature that loads the native ONNX runtime. The model (~110MB) is opt-in install on demand with `recall semantic install`. Core CLI features (search, list, context, daemon, MCP) work on every supported platform regardless of whether the embedder is installed. If the embedder fails to load on your platform, you get a clear error pointing here, and the rest of Claude Recall keeps working.
680
685
 
681
686
  <br />
682
687