@totalreclaw/totalreclaw 1.1.0 → 1.2.0

Files changed (7)

package/README.md CHANGED

@@ -1,32 +1,52 @@
-# @totalreclaw/totalreclaw
+<p align="center">
+  <img src="../../docs/assets/logo.png" alt="TotalReclaw" width="80" />
+</p>
-Encrypted memory for your AI agent — zero-knowledge E2EE vault with automatic extraction, semantic search, and portable storage.
+<h1 align="center">@totalreclaw/totalreclaw</h1>
-Built for [OpenClaw](https://openclaw.ai). Your memories are encrypted on your device before leaving — no one can read them, not even us.
+<p align="center">
+  <strong>End-to-end encrypted memory for OpenClaw -- fully automatic, yours forever</strong>
+</p>
-**[totalreclaw.xyz](https://totalreclaw.xyz)**
+<p align="center">
+  <a href="https://totalreclaw.xyz">Website</a> &middot;
+  <a href="https://www.npmjs.com/package/@totalreclaw/totalreclaw">npm</a> &middot;
+  <a href="../../docs/guides/beta-tester-guide.md">Getting Started</a>
+</p>
+<p align="center">
+  <a href="https://www.npmjs.com/package/@totalreclaw/totalreclaw"><img src="https://img.shields.io/npm/v/@totalreclaw/totalreclaw?color=7B5CFF" alt="npm version"></a>
+  <a href="https://www.npmjs.com/package/@totalreclaw/totalreclaw"><img src="https://img.shields.io/npm/dm/@totalreclaw/totalreclaw" alt="npm downloads"></a>
+  <a href="../../LICENSE"><img src="https://img.shields.io/badge/license-MIT-blue" alt="License"></a>
+</p>
+---
+Your AI agent remembers everything -- preferences, decisions, facts -- encrypted so only you can read it. Built for [OpenClaw](https://openclaw.ai) with fully automatic memory extraction and recall.
 ## Install
+Ask your OpenClaw agent:
+> "Install the @totalreclaw/totalreclaw plugin"
+Or from the terminal:
 ```bash
 openclaw plugins install @totalreclaw/totalreclaw
 ```
-Or just ask your agent:
-> "Install the totalreclaw plugin"
-The agent handles setup: generates your encryption keys and registers you. You'll be asked to write down a 12-word recovery phrase — that's the only thing you need to keep safe.
+The agent handles setup: generates your encryption keys, asks you to save a 12-word recovery phrase, and registers you. After that, memory is fully automatic.
 ## How It Works
-After setup, memory is **fully automatic**:
+After setup, everything happens in the background:
-- **Start of conversation** — loads relevant memories from your vault
-- **End of conversation** — extracts and encrypts new facts before storing them
-- **Before context compaction** — saves everything important before the context window is trimmed
+- **Start of conversation** -- loads relevant memories from your encrypted vault
+- **During conversation** -- extracts facts, preferences, and decisions automatically
+- **Before context compaction** -- saves important context before the window is trimmed
-All encryption happens client-side using AES-256-GCM. Search uses blind indices (SHA-256 hashes) — the server never sees your queries or data. Your 12-word recovery phrase derives all keys via Argon2id + HKDF.
+All encryption happens client-side using AES-256-GCM. The server never sees your plaintext data.
 ## Tools
@@ -39,48 +59,45 @@ Your agent gets these tools automatically:
 | `totalreclaw_forget` | Delete a specific memory |
 | `totalreclaw_export` | Export all memories as plaintext |
 | `totalreclaw_status` | Check billing status and quota |
+| `totalreclaw_consolidate` | Merge duplicate memories |
+| `totalreclaw_import_from` | Import from Mem0 or MCP Memory Server |
-Most of the time you won't use these directly — the automatic hooks handle memory for you.
+Most of the time you won't use these directly -- the automatic hooks handle memory for you.
 ## Features
-- **Zero-knowledge E2EE** — AES-256-GCM encryption, blind index search, HKDF auth
-- **Semantic search** — Local embeddings (bge-small-en-v1.5) + BM25 + cosine reranking with RRF
-- **Automatic extraction** — LLM extracts facts from conversations, no manual input needed
-- **Dedup** — Cosine similarity catches paraphrases; LLM-guided dedup catches contradictions (Pro)
-- **On-chain storage** — Encrypted data stored on Gnosis Chain, indexed by The Graph
-- **Portable** — One 12-word phrase. Any device, same memories, no lock-in
-- **Import** — Migrate from Mem0 or MCP Memory Server
+- **End-to-end encrypted** -- AES-256-GCM encryption, blind index search, HKDF auth
+- **Automatic extraction** -- LLM extracts facts from conversations, no manual input needed
+- **Semantic search** -- Local embeddings + BM25 + cosine reranking with RRF fusion
+- **Smart dedup** -- Cosine similarity catches paraphrases; LLM-guided dedup catches contradictions (Pro)
+- **On-chain storage** -- Encrypted data stored on Gnosis Chain, indexed by The Graph
+- **Portable** -- One 12-word phrase. Any device, same memories, no lock-in
+- **Import** -- Migrate from Mem0 or MCP Memory Server
 ## Free Tier & Pricing
-| Tier | Writes | Reads | Price |
-|------|--------|-------|-------|
-| **Free** | 250/month | Unlimited | $0 |
-| **Pro** | 10,000/month | Unlimited | $2-5/month |
-Pay with card (Stripe) or crypto (Coinbase Commerce). Counter resets monthly.
-## Configuration
+| Tier | Memories | Reads | Storage | Price |
+|------|----------|-------|---------|-------|
+| **Free** | 500/month | Unlimited | Testnet (trial) | $0 |
+| **Pro** | Unlimited | Unlimited | Permanent on-chain (Gnosis) | $5/month |
-Set these environment variables before the agent starts:
-| Variable | Description | Default |
-|----------|-------------|---------|
-| `TOTALRECLAW_SERVER_URL` | Server URL | `https://api.totalreclaw.xyz` |
-| `TOTALRECLAW_CREDENTIALS_PATH` | Path to credentials file | `~/.totalreclaw/credentials.json` |
-| `TOTALRECLAW_SELF_HOSTED` | Set to `true` to use your own self-hosted server instead of the managed service | `false` (managed service) |
-| `TOTALRECLAW_EXTRACT_EVERY_TURNS` | Auto-extract interval (turns) | `5` (Free) / `2` (Pro min) |
+Pay with card via Stripe. Counter resets monthly.
 ## Using with Other Agents
 TotalReclaw also works outside OpenClaw:
-- **Claude Desktop / Cursor / Windsurf** — Use [@totalreclaw/mcp-server](https://www.npmjs.com/package/@totalreclaw/mcp-server)
-- **NanoClaw** — Lightweight skill with MCP bridge
+- **Claude Desktop / Cursor / Windsurf** -- Use [@totalreclaw/mcp-server](https://www.npmjs.com/package/@totalreclaw/mcp-server)
+- **NanoClaw** -- Built-in support via MCP bridge
 Same encryption, same recovery phrase, same memories across all agents.
+## Learn More
+- [Getting Started Guide](../../docs/guides/beta-tester-guide.md)
+- [totalreclaw.xyz](https://totalreclaw.xyz)
+- [Main Repository](https://github.com/p-diogo/totalreclaw)
 ## License
 MIT

package/embedding.ts CHANGED

@@ -1,73 +1,64 @@
 /**
  * TotalReclaw Plugin - Local Embedding via @huggingface/transformers
  *
- * Uses the Xenova/bge-small-en-v1.5 ONNX model to generate 384-dimensional
+ * Uses the Qwen3-Embedding-0.6B ONNX model to generate 1024-dimensional
  * text embeddings locally. No API key needed, no data leaves the machine.
+ * Supports 100+ languages (EN, PT, ES, ZH, etc.).
  *
- * This preserves the zero-knowledge guarantee: embeddings are generated
+ * This preserves the E2EE guarantee: embeddings are generated
  * CLIENT-SIDE before encryption, so no plaintext ever reaches an external API.
  *
  * Model details:
- *   - Quantized (int8) ONNX model: ~33.8MB download on first use
+ *   - Quantized (int8) ONNX model: ~600MB download on first use
  *   - Cached in ~/.cache/huggingface/ after first download
- *   - Lazy initialization: first call ~2-3s (model load), subsequent ~15ms
- *   - Output: 384-dimensional normalized embedding vector
- *   - For retrieval, queries should be prefixed with an instruction string
- *     (documents/passages should NOT be prefixed)
+ *   - Lazy initialization: first call ~3-5s (model load), subsequent ~100ms
+ *   - Output: 1024-dimensional normalized embedding vector
+ *   - No instruction prefix needed (bare queries perform better)
  *
- * Dependencies: @huggingface/transformers (handles model download, WordPiece
- * tokenization, ONNX inference, mean pooling, and normalization).
+ * Dependencies: @huggingface/transformers (handles model download,
+ * tokenization, ONNX inference, last-token pooling, and normalization).
  */
 // @ts-ignore - @huggingface/transformers types may not be perfect
 import { pipeline, type FeatureExtractionPipeline } from '@huggingface/transformers';
-/** ONNX-optimized bge-small-en-v1.5 from HuggingFace Hub. */
-const MODEL_ID = 'Xenova/bge-small-en-v1.5';
+/** ONNX-optimized Qwen3-Embedding-0.6B from HuggingFace Hub. */
+const MODEL_ID = 'onnx-community/Qwen3-Embedding-0.6B-ONNX';
-/** Fixed output dimensionality for bge-small-en-v1.5. */
-const EMBEDDING_DIM = 384;
-/**
- * Query instruction prefix for bge-small-en-v1.5 retrieval tasks.
- *
- * Per the BAAI model card: prepend this to short queries when searching
- * for relevant passages. Do NOT prepend for documents/passages being stored.
- */
-const QUERY_PREFIX = 'Represent this sentence for searching relevant passages: ';
+/** Fixed output dimensionality for Qwen3-Embedding-0.6B. */
+const EMBEDDING_DIM = 1024;
 /** Lazily initialized feature extraction pipeline. */
 let extractor: FeatureExtractionPipeline | null = null;
 /**
- * Generate a 384-dimensional embedding vector for the given text.
+ * Generate a 1024-dimensional embedding vector for the given text.
  *
- * On first call, downloads and loads the ONNX model (~33.8MB, cached).
- * Subsequent calls reuse the loaded model and run in ~15ms.
+ * On first call, downloads and loads the ONNX model (~600MB, cached).
+ * Subsequent calls reuse the loaded model and run in ~100ms.
  *
- * For bge-small-en-v1.5, queries should set `isQuery: true` to prepend the
- * retrieval instruction prefix. Documents being stored should use the default
- * (`isQuery: false`) so no prefix is added.
+ * The isQuery option is accepted for forward compatibility but does not
+ * change behavior -- Qwen3 performs better without instruction prefixes.
  *
  * @param text - The text to embed.
  * @param options - Optional settings.
- * @param options.isQuery - If true, prepend the BGE query instruction prefix
- *                          for improved retrieval accuracy (default: false).
- * @returns 384-dimensional normalized embedding as a number array.
+ * @param options.isQuery - Accepted for forward compatibility (no-op).
+ * @returns 1024-dimensional normalized embedding as a number array.
  */
 export async function generateEmbedding(
   text: string,
   options?: { isQuery?: boolean },
 ): Promise<number[]> {
   if (!extractor) {
+    console.log('Downloading embedding model (one-time setup, ~600MB)...');
     extractor = await pipeline('feature-extraction', MODEL_ID, {
-      // Use quantized (int8) model for smaller download (~33.8MB vs ~67MB)
       quantized: true,
     });
+    console.log('Embedding model ready.');
   }
-  const input = options?.isQuery ? QUERY_PREFIX + text : text;
-  const output = await extractor(input, { pooling: 'mean', normalize: true });
+  const input = text;
+  const output = await extractor(input, { pooling: 'last_token', normalize: true });
   // output.data is a Float32Array; convert to plain number[]
   return Array.from(output.data as Float32Array);
 }
@@ -75,7 +66,7 @@ export async function generateEmbedding(
 /**
  * Get the embedding vector dimensionality.
  *
- * Always returns 384 (fixed for bge-small-en-v1.5).
+ * Always returns 1024 (fixed for Qwen3-Embedding-0.6B).
  * This is needed by downstream code (e.g. LSH hasher) to know the vector
  * size without calling the embedding model.
  */
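
The hunk above switches the pooling option from `'mean'` to `'last_token'`. As a rough illustration of what that changes, here is a minimal sketch on a toy token matrix. The helper names (`meanPool`, `lastTokenPool`, `l2Normalize`) are hypothetical and not part of the package, and the sketch assumes no padding tokens, so the last row really is the last token.

```typescript
// Hypothetical helpers for illustration only; not taken from the package.
type Matrix = number[][]; // one row per token, one column per hidden dimension

/** L2-normalize a vector; a zero vector is returned as an unchanged copy. */
function l2Normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? v.slice() : v.map((x) => x / norm);
}

/** Mean pooling: average all token vectors (the 1.1.0 setting). */
function meanPool(tokens: Matrix): number[] {
  if (tokens.length === 0) throw new Error('meanPool: empty token matrix');
  const dim = tokens[0].length;
  const sum = new Array<number>(dim).fill(0);
  for (const row of tokens) {
    for (let i = 0; i < dim; i++) sum[i] += row[i];
  }
  return sum.map((x) => x / tokens.length);
}

/** Last-token pooling: keep only the final token vector (the 1.2.0 setting). */
function lastTokenPool(tokens: Matrix): number[] {
  if (tokens.length === 0) throw new Error('lastTokenPool: empty token matrix');
  return tokens[tokens.length - 1].slice();
}

// Toy input: three tokens in a two-dimensional hidden space.
const toy: Matrix = [
  [1, 0],
  [0, 1],
  [3, 4],
];

const meanVec = l2Normalize(meanPool(toy));
const lastVec = l2Normalize(lastTokenPool(toy));
console.log({ meanVec, lastVec });
```

The two strategies generally produce different vectors for the same text, so embeddings stored under 1.1.0 would not be directly comparable with 1.2.0 ones even if the dimensionality had stayed the same.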

package/index.ts CHANGED

@@ -126,7 +126,10 @@ const SEMANTIC_SKIP_THRESHOLD = parseFloat(process.env.TOTALRECLAW_SEMANTIC_SKIP
 // Auto-extract throttle (C3): only extract every N turns in agent_end hook
 let turnsSinceLastExtraction = 0;
-const AUTO_EXTRACT_EVERY_TURNS_ENV = parseInt(process.env.TOTALRECLAW_EXTRACT_EVERY_TURNS ?? '5', 10);
+const AUTO_EXTRACT_EVERY_TURNS_ENV = parseInt(process.env.TOTALRECLAW_EXTRACT_EVERY_TURNS ?? '3', 10);
+// Hard cap on facts per extraction to prevent LLM over-extraction from dense conversations
+const MAX_FACTS_PER_EXTRACTION = 15;
 // Store-time near-duplicate detection (consolidation module)
 const STORE_DEDUP_ENABLED = process.env.TOTALRECLAW_STORE_DEDUP !== 'false';
@@ -188,13 +191,11 @@ function isLlmDedupEnabled(): boolean {
 }
 /**
- * Get the effective extraction interval based on tier.
- * Pro users can set interval as low as 2 via env; Free users are clamped to minimum 5.
+ * Get the effective extraction interval.
+ * Unified to 3 turns for all tiers (quota is per-transaction, not per-memory).
  */
 function getExtractInterval(): number {
-  const cache = readBillingCache();
-  const minInterval = cache?.features?.min_extract_interval ?? 5;
-  return Math.max(AUTO_EXTRACT_EVERY_TURNS_ENV, minInterval);
+  return AUTO_EXTRACT_EVERY_TURNS_ENV;
 }
 /**
@@ -517,7 +518,7 @@ async function generateEmbeddingAndLSH(
     const hasher = getLSHHasher(logger);
     const lshBuckets = hasher ? hasher.hash(embedding) : [];
-    // Encrypt the embedding (JSON array of numbers) for zero-knowledge storage
+    // Encrypt the embedding (JSON array of numbers) for server-blind storage
     const encryptedEmbedding = encryptToHex(JSON.stringify(embedding), encryptionKey!);
     return { embedding, lshBuckets, encryptedEmbedding };
@@ -1177,7 +1178,7 @@ async function handlePluginImportFrom(
 const plugin = {
   id: 'totalreclaw',
   name: 'TotalReclaw',
-  description: 'Zero-knowledge encrypted memory vault for AI agents',
+  description: 'End-to-end encrypted memory vault for AI agents',
   kind: 'memory' as const,
   configSchema: {
     type: 'object',
@@ -2548,7 +2549,13 @@ const plugin = {
               ? await fetchExistingMemoriesForExtraction(api.logger, 20, evt.messages)
               : [];
             const rawFacts = await extractFacts(evt.messages, 'turn', existingMemories);
-            const { kept: facts } = filterByImportance(rawFacts, api.logger);
+            const { kept: importanceFiltered } = filterByImportance(rawFacts, api.logger);
+            if (importanceFiltered.length > MAX_FACTS_PER_EXTRACTION) {
+              api.logger.info(
+                `Capped extraction from ${importanceFiltered.length} to ${MAX_FACTS_PER_EXTRACTION} facts`,
+              );
+            }
+            const facts = importanceFiltered.slice(0, MAX_FACTS_PER_EXTRACTION);
             if (facts.length > 0) {
               await storeExtractedFacts(facts, api.logger);
             }
@@ -2584,7 +2591,13 @@ const plugin = {
             ? await fetchExistingMemoriesForExtraction(api.logger, 50, evt.messages)
             : [];
           const rawCompactFacts = await extractFacts(evt.messages, 'full', existingMemories);
-          const { kept: facts } = filterByImportance(rawCompactFacts, api.logger);
+          const { kept: compactImportanceFiltered } = filterByImportance(rawCompactFacts, api.logger);
+          if (compactImportanceFiltered.length > MAX_FACTS_PER_EXTRACTION) {
+            api.logger.info(
+              `Capped compaction extraction from ${compactImportanceFiltered.length} to ${MAX_FACTS_PER_EXTRACTION} facts`,
+            );
+          }
+          const facts = compactImportanceFiltered.slice(0, MAX_FACTS_PER_EXTRACTION);
           if (facts.length > 0) {
             await storeExtractedFacts(facts, api.logger);
           }
@@ -2619,7 +2632,13 @@ const plugin = {
             ? await fetchExistingMemoriesForExtraction(api.logger, 50, evt.messages)
             : [];
           const rawResetFacts = await extractFacts(evt.messages, 'full', existingMemories);
-          const { kept: facts } = filterByImportance(rawResetFacts, api.logger);
+          const { kept: resetImportanceFiltered } = filterByImportance(rawResetFacts, api.logger);
+          if (resetImportanceFiltered.length > MAX_FACTS_PER_EXTRACTION) {
+            api.logger.info(
+              `Capped reset extraction from ${resetImportanceFiltered.length} to ${MAX_FACTS_PER_EXTRACTION} facts`,
+            );
+          }
+          const facts = resetImportanceFiltered.slice(0, MAX_FACTS_PER_EXTRACTION);
           if (facts.length > 0) {
             await storeExtractedFacts(facts, api.logger);
           }
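
The same cap-and-log block appears three times in the hunks above (turn, compaction, and reset extraction). A minimal sketch of how it could be factored into one helper follows. `capFacts`, `Fact`, and `Logger` are hypothetical names introduced here; the real fact and logger types are not shown in the diff.

```typescript
// Hypothetical shapes for illustration; the package's real types are not shown in the diff.
interface Fact {
  text: string;
  importance: number;
}
interface Logger {
  info(message: string): void;
}

// Same value as the constant added in 1.2.0.
const MAX_FACTS_PER_EXTRACTION = 15;

/**
 * Keep at most `max` facts, logging when anything is dropped.
 * Like the diff, this keeps the first `max` items in array order.
 */
function capFacts<T>(
  facts: T[],
  label: string,
  logger: Logger,
  max: number = MAX_FACTS_PER_EXTRACTION,
): T[] {
  if (facts.length > max) {
    logger.info(`Capped ${label} from ${facts.length} to ${max} facts`);
  }
  return facts.slice(0, max);
}

// Usage with an in-memory logger so the effect is visible.
const logged: string[] = [];
const logger: Logger = { info: (m) => logged.push(m) };

const many: Fact[] = Array.from({ length: 20 }, (_, i) => ({
  text: `fact ${i}`,
  importance: 5,
}));

const capped = capFacts(many, 'extraction', logger);
const untouched = capFacts(many.slice(0, 3), 'extraction', logger);
console.log(capped.length, untouched.length, logged);
```

Because `slice(0, max)` keeps items in their existing order, the cap retains the most important facts only if the filtered list is already sorted by importance, which the diff does not show.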

package/lsh.ts CHANGED

@@ -1,7 +1,7 @@
 /**
  * TotalReclaw Plugin - LSH Hasher (Locality-Sensitive Hashing)
  *
- * Pure TypeScript implementation of Random Hyperplane LSH for zero-knowledge
+ * Pure TypeScript implementation of Random Hyperplane LSH for server-blind
  * semantic search. Generates deterministic hyperplane matrices from a seed
  * derived from the user's master key, so the same embedding always hashes to
  * the same buckets across sessions.
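
The header comment above describes random hyperplane LSH with hyperplanes generated deterministically from a seed. The sketch below shows the general technique under stated assumptions: it uses a small mulberry32-style PRNG with a numeric seed, whereas the package's actual PRNG, seed derivation from the master key, and bucket layout are not visible in this diff. All function names are hypothetical.

```typescript
// Illustrative only: not the package's implementation.

/** Small deterministic PRNG (mulberry32-style) returning floats in [0, 1). */
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

/** Standard normal sample via the Box-Muller transform. */
function gaussian(rand: () => number): number {
  const u1 = 1 - rand(); // in (0, 1], so Math.log never sees 0
  const u2 = rand();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

/** Build `bits` random hyperplanes of dimension `dim` from a numeric seed. */
function makeHyperplanes(seed: number, bits: number, dim: number): number[][] {
  const rand = mulberry32(seed);
  return Array.from({ length: bits }, () =>
    Array.from({ length: dim }, () => gaussian(rand)),
  );
}

/** One bit per hyperplane: '1' if the embedding lies on its non-negative side. */
function lshSignature(embedding: number[], planes: number[][]): string {
  return planes
    .map((plane) => {
      if (plane.length !== embedding.length) {
        throw new Error('lshSignature: dimension mismatch');
      }
      let dot = 0;
      for (let i = 0; i < embedding.length; i++) dot += plane[i] * embedding[i];
      return dot >= 0 ? '1' : '0';
    })
    .join('');
}

// Same seed, same planes, same signature across "sessions".
const planesA = makeHyperplanes(42, 16, 4);
const planesB = makeHyperplanes(42, 16, 4);
const vec = [0.5, -0.5, 0.5, 0.5];

const sigA = lshSignature(vec, planesA);
const sigB = lshSignature(vec, planesB);
const sigScaled = lshSignature(vec.map((x) => x * 2), planesA);
console.log(sigA, sigA === sigB, sigA === sigScaled);
```

The signature depends only on which side of each hyperplane the vector falls, so positive rescaling leaves it unchanged. The hyperplane dimension has to match the embedding size, which this release raises from 384 to 1024.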

package/openclaw.plugin.json CHANGED

@@ -2,7 +2,7 @@
   "id": "totalreclaw",
   "name": "TotalReclaw",
   "kind": "memory",
-  "description": "Zero-knowledge encrypted memory vault for AI agents",
+  "description": "End-to-end encrypted memory vault for AI agents",
   "configSchema": {
     "type": "object",
     "properties": {

package/package.json CHANGED

@@ -1,14 +1,14 @@
 {
   "name": "@totalreclaw/totalreclaw",
-  "version": "1.1.0",
-  "description": "Encrypted memory for your AI agent — zero-knowledge E2EE vault with automatic extraction, semantic search, and on-chain storage",
+  "version": "1.2.0",
+  "description": "End-to-end encrypted memory for AI agents — portable, yours forever. Automatic extraction, semantic search, and on-chain storage",
   "type": "module",
   "keywords": [
     "totalreclaw",
     "openclaw",
     "ai-memory",
     "ai-agent",
-    "zero-knowledge",
+    "e2e-encryption",
     "encryption",
     "e2ee",
     "lsh",

package/subgraph-store.ts CHANGED

@@ -13,7 +13,7 @@
 import { createPublicClient, http, type Hex, type Address, type Chain } from 'viem';
 import { entryPoint07Address } from 'viem/account-abstraction';
 import { mnemonicToAccount } from 'viem/accounts';
-import { gnosis, gnosisChiado } from 'viem/chains';
+import { gnosis, gnosisChiado, baseSepolia } from 'viem/chains';
 import { createSmartAccountClient } from 'permissionless';
 import { toSimpleSmartAccount } from 'permissionless/accounts';
 import { createPimlicoClient } from 'permissionless/clients/pimlico';
@@ -32,7 +32,7 @@ export interface SubgraphStoreConfig {
   relayUrl: string;           // TotalReclaw relay server URL (proxies bundler + subgraph)
   mnemonic: string;           // BIP-39 mnemonic for key derivation
   cachePath: string;          // Hot cache file path
-  chainId: number;            // 100 for Gnosis mainnet, 10200 for Chiado testnet
+  chainId: number;            // 100 for Gnosis mainnet, 10200 for Chiado testnet, 84532 for Base Sepolia
   dataEdgeAddress: string;    // EventfulDataEdge contract address
   entryPointAddress: string;  // ERC-4337 EntryPoint v0.7
   authKeyHex?: string;        // HKDF auth key for relay server Authorization header
@@ -151,8 +151,10 @@ function getChainFromId(chainId: number): Chain {
       return gnosis;
     case 10200:
       return gnosisChiado;
+    case 84532:
+      return baseSepolia;
     default:
-      return gnosisChiado;
+      return gnosis;
   }
 }
@@ -311,7 +313,7 @@ export function isSubgraphMode(): boolean {
  * This is the on-chain owner identity used in the subgraph.
  */
 export async function deriveSmartAccountAddress(mnemonic: string, chainId?: number): Promise<string> {
-  const chain: Chain = (chainId ?? 100) === 100 ? gnosis : gnosisChiado;
+  const chain: Chain = getChainFromId(chainId ?? 100);
   const ownerAccount = mnemonicToAccount(mnemonic);
   const entryPointAddr = (process.env.TOTALRECLAW_ENTRYPOINT_ADDRESS || DEFAULT_ENTRYPOINT_ADDRESS) as Address;
   const rpcUrl = process.env.TOTALRECLAW_RPC_URL;
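
The hunks above add Base Sepolia, change the fallback chain from Chiado to Gnosis mainnet, and route `deriveSmartAccountAddress` through `getChainFromId`. The sketch below compares the old and new chain-resolution behavior using plain string labels in place of the viem `Chain` objects, so it runs without dependencies. `chainNameFromId` and `legacyChainName` are hypothetical names for illustration.

```typescript
// String labels stand in for the viem Chain objects (gnosis, gnosisChiado, baseSepolia).
type ChainName = 'gnosis' | 'gnosisChiado' | 'baseSepolia';

const CHAINS_BY_ID: ReadonlyMap<number, ChainName> = new Map<number, ChainName>([
  [100, 'gnosis'],
  [10200, 'gnosisChiado'],
  [84532, 'baseSepolia'],
]);

/** Mirrors the 1.2.0 switch: unknown IDs fall back to Gnosis mainnet. */
function chainNameFromId(chainId: number): ChainName {
  return CHAINS_BY_ID.get(chainId) ?? 'gnosis';
}

/** Mirrors the ternary removed from deriveSmartAccountAddress in 1.1.0. */
function legacyChainName(chainId?: number): ChainName {
  return (chainId ?? 100) === 100 ? 'gnosis' : 'gnosisChiado';
}

// Compare old and new resolution for a few IDs (1 is an unmapped ID).
const comparison = [100, 10200, 84532, 1].map((id) => ({
  id,
  legacy: legacyChainName(id),
  current: chainNameFromId(id),
}));
console.log(comparison);
```

Under the old ternary, any ID other than 100 resolved to Chiado, so 84532 would have been treated as Chiado. With the new mapping, unrecognized IDs resolve to mainnet instead of a testnet.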