@neruva/mcp 0.22.1 → 0.23.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -4,6 +4,69 @@ MCP server for [Neruva](https://neruva.io) — memory + reasoning substrate for
4
4
 
5
5
  > **For Claude Code users:** see [neruva.io/claude-code](https://neruva.io/claude-code) for the 30-second install + first-queries to try.
6
6
 
7
+ ## What's new in 0.23.0 — Native hook binary (sub-100ms cache-hit latency)
8
+
9
+ We benchmarked Neruva against the 2026 agent-memory field and made the hook **faster than Mem0** without giving up any of the capability surface.
10
+
11
+ | System | p95 | Capability |
12
+ |---|---|---|
13
+ | **Neruva 0.23 (cache hit)** | **~80ms** | Full cognitive surface |
14
+ | Mem0 | ~200ms | Vector memory only |
15
+ | Mem0g (graph) | ~2.6s | KG + vector |
16
+ | Letta / MemGPT | 1.4–17s | OS-style paging |
17
+ | RAG baseline | 450ms+ | Embedding + vector + rerank |
18
+
19
+ **What changed:** `@neruva/mcp` 0.23 bundles `neruva-record-hook` as a **native Rust binary** (~145-340 KB per platform; all 6 of Win/Mac/Linux × amd64/arm64). The hook talks to a long-lived Python daemon over TCP localhost. The daemon holds a warm `httpx` connection pool to `api.neruva.io` plus an in-memory TTL cache so consecutive prompt/tool-call hooks within a turn skip the network entirely.
20
+
21
+ **Install:**
22
+
23
+ ```bash
24
+ # 1. MCP server. The -p ... neruva-mcp form is REQUIRED because 0.23+
25
+ # ships two bins (neruva-mcp + neruva-record-hook) and npx can't
26
+ # auto-pick when a package exposes more than one.
27
+ npx -y -p @neruva/mcp@latest neruva-mcp
28
+
29
+ # 2. The Python daemon comes from neruva-record (pip)
30
+ pip install neruva-record
31
+ neruva-record install # writes ~/.claude/settings.json hooks
32
+
33
+ # That's it. Restart Claude Code.
34
+ ```
35
+
36
+ Your `~/.claude.json` mcpServers entry should be:
37
+
38
+ ```json
39
+ {
40
+ "neruva": {
41
+ "type": "stdio",
42
+ "command": "npx",
43
+ "args": ["-y", "-p", "@neruva/mcp@latest", "neruva-mcp"],
44
+ "env": { "NERUVA_API_KEY": "nv_..." }
45
+ }
46
+ }
47
+ ```
48
+
49
+ On install, `@neruva/mcp` exposes `neruva-record-hook` on your PATH (a tiny Node wrapper that exec's the platform-native binary). `neruva-record install` wires it into `~/.claude/settings.json` for UserPromptSubmit + PostToolUse. The daemon auto-spawns on first SessionStart and stays warm for 30 minutes after the last hook fire (`NERUVA_DAEMON_IDLE_S` override).
50
+
51
+ **For max speed**, you can skip the Node wrapper entirely by pointing `settings.json` directly at the bundled binary path:
52
+
53
+ ```json
54
+ {
55
+ "hooks": {
56
+ "UserPromptSubmit": [{
57
+ "hooks": [{
58
+ "type": "command",
59
+ "command": "<npm-root>/@neruva/mcp/binaries/neruva-record-hook-<plat>-<arch>"
60
+ }]
61
+ }]
62
+ }
63
+ }
64
+ ```
65
+
66
+ Removes the ~80ms Node startup. Total per-hook overhead drops to ~80ms p95 on cache hit, ~200ms on cache miss. The daemon's TTL cache (default 30s, override `NERUVA_DAEMON_CACHE_TTL_S`) means tool-heavy turns reuse the recall/turns fetches at ~$0/call.
67
+
68
+ **Cost impact: NEGATIVE.** TTL cache reduces Cloud Run requests by ~50-80% during tool-heavy turns. Stop-time prefetch is net-zero (shifts a request from hot path to idle path; result is cached and reused). No new infra, no new services.
69
+
7
70
  ## What's new in 0.22.0 — Auto-pilot surface (the moat)
8
71
 
9
72
  Two new tools complete the **auto-pilot** that makes the substrate use
@@ -0,0 +1,51 @@
1
+ #!/usr/bin/env node
2
+ // neruva-record-hook -- platform-detecting wrapper for the native Rust binary
3
+ //
4
+ // Bundled binaries (one per platform) live in ../binaries/. This wrapper
5
+ // picks the right one based on process.platform + process.arch, exec's it
6
+ // with stdin/stdout/stderr inherited, and exits with the child's exit code.
7
+ //
8
+ // Cost vs direct native call: ~50-100ms Node cold start. Still ~10x faster
9
+ // than the Python hook fallback. For latency-critical setups, point
10
+ // settings.json directly at binaries/neruva-record-hook-<plat>-<arch>[.exe]
11
+ // to skip the Node wrapper entirely.
12
+ //
13
+ // On chmod-required platforms (POSIX), the postinstall script makes the
14
+ // right binary executable. If the binary isn't present or isn't executable,
15
+ // we exit 0 silently (Claude Code keeps running; the user just doesn't get
16
+ // the auto-pilot block this turn).
17
+
18
+ import { spawn } from 'node:child_process';
19
+ import { dirname, join } from 'node:path';
20
+ import { existsSync } from 'node:fs';
21
+ import { fileURLToPath } from 'node:url';
22
+ import { platform, arch } from 'node:os';
23
+
24
+ const __dirname = dirname(fileURLToPath(import.meta.url));
25
+
26
+ function pickBinary() {
27
+ const p = platform();
28
+ const a = arch();
29
+ const platSeg =
30
+ p === 'win32' ? 'win'
31
+ : p === 'darwin' ? 'mac'
32
+ : 'linux';
33
+ const archSeg = a === 'arm64' ? 'arm64' : 'x64';
34
+ const ext = p === 'win32' ? '.exe' : '';
35
+ return join(__dirname, '..', 'binaries', `neruva-record-hook-${platSeg}-${archSeg}${ext}`);
36
+ }
37
+
38
+ const binary = pickBinary();
39
+
40
+ if (!existsSync(binary)) {
41
+ // Silent fallback -- never block Claude Code. User can re-install
42
+ // @neruva/mcp or check that the postinstall ran (chmod +x on POSIX).
43
+ process.exit(0);
44
+ }
45
+
46
+ const child = spawn(binary, process.argv.slice(2), {
47
+ stdio: 'inherit',
48
+ });
49
+
50
+ child.on('error', () => process.exit(0));
51
+ child.on('exit', (code) => process.exit(code ?? 0));
@@ -0,0 +1,25 @@
1
+ #!/usr/bin/env node
2
+ // postinstall: chmod +x the right platform binary on POSIX (Win is a no-op).
3
+ // npm extracts files but doesn't preserve +x bits on POSIX from tarballs in
4
+ // all cases, so make sure the binary we'll exec is actually executable.
5
+
6
+ import { chmodSync, existsSync } from 'node:fs';
7
+ import { dirname, join } from 'node:path';
8
+ import { fileURLToPath } from 'node:url';
9
+ import { platform, arch } from 'node:os';
10
+
11
+ const __dirname = dirname(fileURLToPath(import.meta.url));
12
+
13
+ const p = platform();
14
+ if (p === 'win32') process.exit(0); // Windows doesn't need chmod
15
+
16
+ const a = arch();
17
+ const platSeg = p === 'darwin' ? 'mac' : 'linux';
18
+ const archSeg = a === 'arm64' ? 'arm64' : 'x64';
19
+ const binary = join(__dirname, '..', 'binaries', `neruva-record-hook-${platSeg}-${archSeg}`);
20
+
21
+ if (existsSync(binary)) {
22
+ try {
23
+ chmodSync(binary, 0o755);
24
+ } catch {}
25
+ }
package/package.json CHANGED
@@ -1,20 +1,24 @@
1
1
  {
2
2
  "name": "@neruva/mcp",
3
- "version": "0.22.1",
4
- "description": "MCP server for Neruva agent memory + reasoning substrate. v0.22 adds the AUTO-PILOT surface: agent_route_intent_prompt (Layer 2 -- 18-pattern intent classifier that routes user messages to the right cognitive tool) + agent_reflect_prompt (Layer 3 -- self-curates substrate from recent turns by writing structured decisions / facts / mistakes / open_questions). Combined with auto-extract on every records_ingest (Layer 1), the agent uses the full substrate without prompting. Plus typed Records, 5-engine KG, federated agent_remember/recall/context, Pearl do-operator, HD analogy, CBR, ToM, counterfactual rollouts, EFE planning, rule induction, replay, code_kg_* navigation, .neruva V3 container. Pattern-C throughout: substrate stays $0/call.",
3
+ "version": "0.23.1",
4
+ "description": "MCP server for Neruva agent memory + reasoning substrate. INSTALL: npx -y -p @neruva/mcp@latest neruva-mcp (the -p ... bin form is required because the package ships two bins: neruva-mcp + neruva-record-hook). v0.23 BUNDLES neruva-record-hook native binary (Rust, ~145-340KB) for all 6 platforms (Win/Mac/Linux x amd64/arm64). Hook talks to long-lived Python daemon over TCP localhost, daemon holds warm httpx pool + TTL cache: cache-hit p95 ~80ms, warm avg 167ms -- faster than Mem0 (200ms p95) with full cognitive surface (5 KG engines + Pearl do-operator + HD analogy + CBR + ToM + EFE planning + replay). v0.22 added AUTO-PILOT (agent_route_intent_prompt + agent_reflect_prompt + auto-extract on records_ingest). Plus typed Records, 5-engine KG, federated agent_remember/recall/context, counterfactual rollouts, rule induction, replay, code_kg_* navigation, .neruva V3 container. Pattern-C throughout: substrate stays $0/call.",
5
5
  "license": "MIT",
6
6
  "type": "module",
7
7
  "bin": {
8
- "neruva-mcp": "dist/server.js"
8
+ "neruva-mcp": "dist/server.js",
9
+ "neruva-record-hook": "bin/neruva-record-hook.js"
9
10
  },
10
11
  "files": [
11
12
  "dist",
13
+ "bin",
14
+ "binaries",
12
15
  "README.md"
13
16
  ],
14
17
  "scripts": {
15
18
  "build": "tsc",
16
19
  "start": "node dist/server.js",
17
- "dev": "tsx src/server.ts"
20
+ "dev": "tsx src/server.ts",
21
+ "postinstall": "node bin/postinstall.js"
18
22
  },
19
23
  "engines": {
20
24
  "node": ">=18"