@pentatonic-ai/ai-agent-sdk 0.5.11 → 0.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (119)
  1. package/README.md +345 -174
  2. package/bin/__tests__/callback-server.test.js +70 -0
  3. package/bin/__tests__/credentials.test.js +58 -0
  4. package/bin/__tests__/login.test.js +210 -0
  5. package/bin/__tests__/pkce.test.js +39 -0
  6. package/bin/__tests__/whoami.test.js +77 -0
  7. package/bin/cli.js +109 -440
  8. package/bin/commands/config.js +251 -0
  9. package/bin/commands/login.js +219 -0
  10. package/bin/commands/whoami.js +41 -0
  11. package/bin/lib/callback-server.js +137 -0
  12. package/bin/lib/credentials.js +100 -0
  13. package/bin/lib/pkce.js +26 -0
  14. package/package.json +4 -2
  15. package/packages/doctor/__tests__/detect.test.js +2 -6
  16. package/packages/doctor/src/checks/local-memory.js +164 -196
  17. package/packages/doctor/src/detect.js +11 -3
  18. package/packages/memory/src/__tests__/corpus-chunkers.test.js +143 -0
  19. package/packages/memory/src/__tests__/corpus-discover.test.js +175 -0
  20. package/packages/memory/src/__tests__/corpus-ingest.test.js +236 -0
  21. package/packages/memory/src/__tests__/corpus-signatures.test.js +175 -0
  22. package/packages/memory/src/__tests__/corpus-state.test.js +161 -0
  23. package/packages/memory/src/__tests__/ingest-corpus-opts.test.js +129 -0
  24. package/packages/memory/src/__tests__/search-kind.test.js +108 -0
  25. package/packages/memory/src/corpus/adapters.js +398 -0
  26. package/packages/memory/src/corpus/chunkers.js +328 -0
  27. package/packages/memory/src/corpus/cli.js +613 -0
  28. package/packages/memory/src/corpus/discover.js +379 -0
  29. package/packages/memory/src/corpus/index.js +68 -0
  30. package/packages/memory/src/corpus/ingest.js +356 -0
  31. package/packages/memory/src/corpus/signatures.js +280 -0
  32. package/packages/memory/src/corpus/state.js +134 -0
  33. package/packages/memory/src/index.js +18 -0
  34. package/packages/memory/src/ingest.js +20 -11
  35. package/packages/memory/src/openclaw/index.js +39 -1
  36. package/packages/memory/src/search.js +30 -7
  37. package/packages/memory-engine/.env.example +13 -0
  38. package/packages/memory-engine/README.md +131 -0
  39. package/packages/memory-engine/bench/README.md +99 -0
  40. package/packages/memory-engine/bench/scorecards-engine/agent-coding__pentatonic-baseline__20260427-142523.json +1115 -0
  41. package/packages/memory-engine/bench/scorecards-engine/chat-recall__pentatonic-baseline__20260427-142648.json +819 -0
  42. package/packages/memory-engine/bench/scorecards-engine/circular-economy__pentatonic-baseline__20260427-142757.json +1278 -0
  43. package/packages/memory-engine/bench/scorecards-engine/customer-support__pentatonic-baseline__20260427-142900.json +1018 -0
  44. package/packages/memory-engine/bench/scorecards-engine/marketplace-ops__pentatonic-baseline__20260427-142957.json +1038 -0
  45. package/packages/memory-engine/bench/scorecards-engine/product-catalogue__pentatonic-baseline__20260427-143122.json +961 -0
  46. package/packages/memory-engine/bench/scorecards-engine-via-docker/agent-coding__pentatonic-memory__20260427-161812.json +1115 -0
  47. package/packages/memory-engine/bench/scorecards-engine-via-docker/chat-recall__pentatonic-memory__20260427-161701.json +819 -0
  48. package/packages/memory-engine/bench/scorecards-engine-via-docker/circular-economy__pentatonic-memory__20260427-161713.json +1278 -0
  49. package/packages/memory-engine/bench/scorecards-engine-via-docker/customer-support__pentatonic-memory__20260427-161723.json +1018 -0
  50. package/packages/memory-engine/bench/scorecards-engine-via-docker/marketplace-ops__pentatonic-memory__20260427-161732.json +1038 -0
  51. package/packages/memory-engine/bench/scorecards-engine-via-docker/product-catalogue__pentatonic-memory__20260427-161741.json +937 -0
  52. package/packages/memory-engine/bench/scorecards-engine-via-l2-7-layer-populated/agent-coding__pentatonic-memory__20260427-184718.json +1115 -0
  53. package/packages/memory-engine/bench/scorecards-engine-via-l2-7-layer-populated/chat-recall__pentatonic-memory__20260427-184614.json +819 -0
  54. package/packages/memory-engine/bench/scorecards-engine-via-l2-7-layer-populated/circular-economy__pentatonic-memory__20260427-184809.json +1278 -0
  55. package/packages/memory-engine/bench/scorecards-engine-via-l2-7-layer-populated/customer-support__pentatonic-memory__20260427-184854.json +1018 -0
  56. package/packages/memory-engine/bench/scorecards-engine-via-l2-7-layer-populated/marketplace-ops__pentatonic-memory__20260427-184929.json +1038 -0
  57. package/packages/memory-engine/bench/scorecards-engine-via-l2-7-layer-populated/product-catalogue__pentatonic-memory__20260427-185015.json +961 -0
  58. package/packages/memory-engine/bench/scorecards-engine-via-l2-empty-layers/agent-coding__pentatonic-memory__20260427-175252.json +1115 -0
  59. package/packages/memory-engine/bench/scorecards-engine-via-l2-empty-layers/chat-recall__pentatonic-memory__20260427-175312.json +819 -0
  60. package/packages/memory-engine/bench/scorecards-engine-via-l2-empty-layers/circular-economy__pentatonic-memory__20260427-175335.json +1278 -0
  61. package/packages/memory-engine/bench/scorecards-engine-via-l2-empty-layers/customer-support__pentatonic-memory__20260427-175355.json +1018 -0
  62. package/packages/memory-engine/bench/scorecards-engine-via-l2-empty-layers/marketplace-ops__pentatonic-memory__20260427-175413.json +1038 -0
  63. package/packages/memory-engine/bench/scorecards-engine-via-l2-empty-layers/product-catalogue__pentatonic-memory__20260427-175430.json +883 -0
  64. package/packages/memory-engine/bench/scorecards-engine-via-shim/agent-coding__pentatonic-memory__20260427-155409.json +1115 -0
  65. package/packages/memory-engine/bench/scorecards-engine-via-shim/chat-recall__pentatonic-memory__20260427-155421.json +819 -0
  66. package/packages/memory-engine/bench/scorecards-engine-via-shim/circular-economy__pentatonic-memory__20260427-155433.json +1278 -0
  67. package/packages/memory-engine/bench/scorecards-engine-via-shim/customer-support__pentatonic-memory__20260427-155443.json +1018 -0
  68. package/packages/memory-engine/bench/scorecards-engine-via-shim/marketplace-ops__pentatonic-memory__20260427-155453.json +1038 -0
  69. package/packages/memory-engine/bench/scorecards-engine-via-shim/product-catalogue__pentatonic-memory__20260427-155503.json +937 -0
  70. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/agent-coding__pentatonic-memory-latest__20260427-145103.json +1115 -0
  71. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/agent-coding__pentatonic-memory__20260427-144909.json +1115 -0
  72. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/chat-recall__pentatonic-memory-latest__20260427-145153.json +819 -0
  73. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/chat-recall__pentatonic-memory__20260427-145120.json +542 -0
  74. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/circular-economy__pentatonic-memory-latest__20260427-145313.json +1278 -0
  75. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/circular-economy__pentatonic-memory__20260427-145207.json +894 -0
  76. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/customer-support__pentatonic-memory-latest__20260427-145412.json +1018 -0
  77. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/customer-support__pentatonic-memory__20260427-145327.json +680 -0
  78. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/marketplace-ops__pentatonic-memory-latest__20260427-145517.json +1038 -0
  79. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/marketplace-ops__pentatonic-memory__20260427-145422.json +693 -0
  80. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/product-catalogue__pentatonic-memory-latest__20260427-145616.json +961 -0
  81. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/product-catalogue__pentatonic-memory__20260427-145528.json +727 -0
  82. package/packages/memory-engine/compat/Dockerfile +11 -0
  83. package/packages/memory-engine/compat/server.py +680 -0
  84. package/packages/memory-engine/docker-compose.yml +243 -0
  85. package/packages/memory-engine/docs/MIGRATION.md +178 -0
  86. package/packages/memory-engine/docs/RUNBOOK-AWS.md +375 -0
  87. package/packages/memory-engine/docs/why-v05-underperforms.md +138 -0
  88. package/packages/memory-engine/engine/README.md +52 -0
  89. package/packages/memory-engine/engine/l2-hybridrag-proxy.py +1543 -0
  90. package/packages/memory-engine/engine/l5-comms-layer.py +663 -0
  91. package/packages/memory-engine/engine/l6-document-store.py +1018 -0
  92. package/packages/memory-engine/engine/services/l2/Dockerfile +41 -0
  93. package/packages/memory-engine/engine/services/l2/init_databases.py +81 -0
  94. package/packages/memory-engine/engine/services/l2/l2-hybridrag-proxy.py +1543 -0
  95. package/packages/memory-engine/engine/services/l4/Dockerfile +15 -0
  96. package/packages/memory-engine/engine/services/l4/server.py +235 -0
  97. package/packages/memory-engine/engine/services/l5/Dockerfile +9 -0
  98. package/packages/memory-engine/engine/services/l5/l5-comms-layer.py +678 -0
  99. package/packages/memory-engine/engine/services/l6/Dockerfile +11 -0
  100. package/packages/memory-engine/engine/services/l6/l6-document-store.py +1016 -0
  101. package/packages/memory-engine/engine/services/nv-embed/Dockerfile +28 -0
  102. package/packages/memory-engine/engine/services/nv-embed/server.py +152 -0
  103. package/packages/memory-engine/pme_memory/__init__.py +0 -0
  104. package/packages/memory-engine/pme_memory/__main__.py +129 -0
  105. package/packages/memory-engine/pme_memory/artifacts.py +95 -0
  106. package/packages/memory-engine/pme_memory/embed.py +74 -0
  107. package/packages/memory-engine/pme_memory/health.py +36 -0
  108. package/packages/memory-engine/pme_memory/hygiene.py +159 -0
  109. package/packages/memory-engine/pme_memory/indexer.py +200 -0
  110. package/packages/memory-engine/pme_memory/needs.py +55 -0
  111. package/packages/memory-engine/pme_memory/provenance.py +80 -0
  112. package/packages/memory-engine/pme_memory/scoring.py +168 -0
  113. package/packages/memory-engine/pme_memory/search.py +52 -0
  114. package/packages/memory-engine/pme_memory/store.py +86 -0
  115. package/packages/memory-engine/pme_memory/synthesis.py +114 -0
  116. package/packages/memory-engine/pyproject.toml +65 -0
  117. package/packages/memory-engine/scripts/kg-extractor.py +557 -0
  118. package/packages/memory-engine/scripts/kg-preflexor-v2.py +738 -0
  119. package/packages/memory-engine/tests/test_api_contract.sh +57 -0
@@ -0,0 +1,134 @@
+ /**
+  * Local corpus state — what repos are tracked, content hashes per file,
+  * last sync timestamps. Lives at ~/.config/tes/corpus.json (or
+  * $XDG_CONFIG_HOME/tes/corpus.json) so it survives plugin reinstalls
+  * but stays per-developer.
+  *
+  * Schema:
+  *   {
+  *     "version": 1,
+  *     "tenant": { "clientId": "acme", "endpoint": "https://acme.api..." },
+  *     "sources": {
+  *       "/abs/path/to/repo": {
+  *         "sourceType": "git" | "directory",
+  *         "sourceUrl": "git@github.com:org/repo.git" | null,
+  *         "addedAt": "2026-04-27T12:00:00Z",
+  *         "lastSyncedAt": "2026-04-27T12:05:00Z",
+  *         "lastSyncedCommit": "abc123" | null,
+  *         "files": {
+  *           "src/index.ts": { "hash": "sha256...", "chunks": 3, "indexedAt": "..." }
+  *         },
+  *         "stats": { "fileCount": 47, "chunkCount": 132, "totalBytes": 184320 }
+  *       }
+  *     }
+  *   }
+  *
+  * Atomic writes via tmpfile + rename so partial writes can't corrupt
+  * the state file mid-sync.
+  */
+
+ import { promises as fsp, existsSync } from "node:fs";
+ import { join, dirname, resolve } from "node:path";
+ import { homedir } from "node:os";
+
+ const STATE_VERSION = 1;
+
+ export function defaultStatePath() {
+   const xdg = process.env.XDG_CONFIG_HOME || join(homedir(), ".config");
+   return join(xdg, "tes", "corpus.json");
+ }
+
+ export function emptyState() {
+   return {
+     version: STATE_VERSION,
+     tenant: null,
+     sources: {},
+   };
+ }
+
+ export async function loadState(path = defaultStatePath()) {
+   if (!existsSync(path)) return emptyState();
+   try {
+     const raw = await fsp.readFile(path, "utf-8");
+     const parsed = JSON.parse(raw);
+     if (!parsed.version || parsed.version > STATE_VERSION) {
+       throw new Error(
+         `corpus state at ${path} has unsupported version ${parsed.version} (we understand up to ${STATE_VERSION}). Upgrade the SDK.`
+       );
+     }
+     parsed.sources = parsed.sources || {};
+     return parsed;
+   } catch (err) {
+     if (err instanceof SyntaxError) {
+       throw new Error(`corpus state at ${path} is corrupt JSON: ${err.message}`);
+     }
+     throw err;
+   }
+ }
+
+ export async function saveState(state, path = defaultStatePath()) {
+   await fsp.mkdir(dirname(path), { recursive: true });
+   const tmp = `${path}.${process.pid}.tmp`;
+   await fsp.writeFile(tmp, JSON.stringify(state, null, 2), {
+     mode: 0o600, // user-only — state may include endpoint URLs
+   });
+   await fsp.rename(tmp, path);
+ }
+
+ export function getSource(state, repoPath) {
+   const abs = resolve(repoPath);
+   return state.sources[abs] || null;
+ }
+
+ export function upsertSource(state, repoPath, patch) {
+   const abs = resolve(repoPath);
+   const existing = state.sources[abs] || {
+     sourceType: "directory",
+     sourceUrl: null,
+     addedAt: new Date().toISOString(),
+     lastSyncedAt: null,
+     lastSyncedCommit: null,
+     files: {},
+     stats: { fileCount: 0, chunkCount: 0, totalBytes: 0 },
+   };
+   state.sources[abs] = { ...existing, ...patch };
+   return state.sources[abs];
+ }
+
+ export function removeSource(state, repoPath) {
+   const abs = resolve(repoPath);
+   if (state.sources[abs]) {
+     delete state.sources[abs];
+     return true;
+   }
+   return false;
+ }
+
+ export function recordFile(source, relPath, hash, chunks) {
+   source.files[relPath] = {
+     hash,
+     chunks,
+     indexedAt: new Date().toISOString(),
+   };
+ }
+
+ export function forgetFile(source, relPath) {
+   if (source.files[relPath]) {
+     delete source.files[relPath];
+     return true;
+   }
+   return false;
+ }
+
+ export function recomputeStats(source) {
+   let fileCount = 0;
+   let chunkCount = 0;
+   for (const f of Object.values(source.files)) {
+     fileCount++;
+     chunkCount += f.chunks || 0;
+   }
+   source.stats = { ...source.stats, fileCount, chunkCount };
+   return source.stats;
+ }
+
+ export { STATE_VERSION };
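The state file above records a content hash per file, which suggests an incremental sync that re-indexes only files whose hash changed. The following is a minimal sketch of that comparison, not the SDK's implementation: `planSync` and its return shape are hypothetical, and the hashes are passed in precomputed so the example stays free of file-system access.

```javascript
// Hypothetical helper (not part of the SDK): given a source entry from the
// state file and a map of relPath -> current content hash, work out which
// files need re-indexing and which recorded files have disappeared.
function planSync(source, currentHashes) {
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  const recorded = source.files || {};
  const changed = [];   // new files, or files whose hash differs
  const unchanged = []; // recorded hash matches, nothing to do
  for (const [relPath, hash] of Object.entries(currentHashes)) {
    if (has(recorded, relPath) && recorded[relPath].hash === hash) {
      unchanged.push(relPath);
    } else {
      changed.push(relPath);
    }
  }
  // Present in the state file but no longer on disk.
  const removed = Object.keys(recorded).filter((p) => !has(currentHashes, p));
  return { changed, unchanged, removed };
}

// Illustrative data only.
const source = {
  files: {
    "src/a.js": { hash: "h1", chunks: 2 },
    "src/b.js": { hash: "h2", chunks: 1 },
    "src/old.js": { hash: "h0", chunks: 4 },
  },
};
const plan = planSync(source, {
  "src/a.js": "h1",
  "src/b.js": "h2-edited",
  "src/new.js": "h3",
});
console.log(plan);
```

Under this sketch, each `changed` path would be re-chunked and passed to `recordFile`, each `removed` path to `forgetFile`, followed by `recomputeStats`. Note that `recomputeStats` as shown recounts `fileCount` and `chunkCount` but carries `totalBytes` over from the previous stats.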
@@ -133,3 +133,21 @@ export { decay } from "./decay.js";
  export { consolidate } from "./consolidate.js";
  export { ensureLayers, getLayers } from "./layers.js";
  export { migrate } from "./migrate.js";
+ export {
+   ingestCorpus,
+   syncCorpus,
+   ingestPaths,
+   estimateCorpus,
+   discover,
+   isPathEligible,
+   chunkFile,
+   localAdapter,
+   hostedAdapter,
+   loadState,
+   saveState,
+   defaultStatePath,
+   emptyState,
+   upsertSource,
+   removeSource,
+   getSource,
+ } from "./corpus/index.js";
@@ -16,6 +16,10 @@ import { distill } from "./distill.js";
   * @param {string} [opts.userId] - Optional user ID
   * @param {string} [opts.layerType="episodic"] - Target layer
   * @param {object} [opts.metadata] - Additional metadata
+  * @param {boolean} [opts.distill=true] - Run conversation-shaped fact
+  *   extraction. Pass false for code/structured content.
+  * @param {boolean} [opts.hyde=true] - Generate hypothetical queries
+  *   (HyDE). Pass false for code/structured content.
   * @param {Function} [opts.logger] - Optional logger
   * @param {Function} [opts.waitUntil] - Platform hook to register background
   *   tasks (e.g. Cloudflare Worker ctx.waitUntil). If provided, the distill
@@ -165,18 +169,23 @@ export async function ingest(db, ai, llm, content, opts = {}) {
      log(`Embedding failed for ${memoryId}: ${err.message}`);
    }

-   // HyDE: generate hypothetical queries (non-fatal)
-   try {
-     const queries = await generateHypotheticalQueries(llm, content);
-     if (queries.length) {
-       await db(
-         `UPDATE memory_nodes SET metadata = jsonb_set(COALESCE(metadata, '{}')::jsonb, '{hypothetical_queries}', $1::jsonb), updated_at = NOW() WHERE id = $2`,
-         [JSON.stringify(queries), memoryId]
-       );
-       log(`Generated ${queries.length} hypothetical queries for ${memoryId}`);
+   // HyDE: generate hypothetical queries (non-fatal). Skippable via
+   // opts.hyde === false — corpus ingest passes this for code_reference
+   // chunks because hypothetical-question expansion against function
+   // signatures degrades retrieval and burns one LLM call per chunk.
+   if (opts.hyde !== false) {
+     try {
+       const queries = await generateHypotheticalQueries(llm, content);
+       if (queries.length) {
+         await db(
+           `UPDATE memory_nodes SET metadata = jsonb_set(COALESCE(metadata, '{}')::jsonb, '{hypothetical_queries}', $1::jsonb), updated_at = NOW() WHERE id = $2`,
+           [JSON.stringify(queries), memoryId]
+         );
+         log(`Generated ${queries.length} hypothetical queries for ${memoryId}`);
+       }
+     } catch (err) {
+       log(`HyDE failed for ${memoryId}: ${err.message}`);
      }
-   } catch (err) {
-     log(`HyDE failed for ${memoryId}: ${err.message}`);
    }

    // Distill atomic facts — only for raw ingestions (skip if this call is
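The new guard is written as `opts.hyde !== false`, so the stage stays enabled unless the caller passes a strict `false`. A small sketch of that opt-out semantics follows; `stageEnabled` is a hypothetical name used only for illustration, and the corresponding check for `opts.distill` is documented in the JSDoc but not visible in this excerpt.

```javascript
// Hypothetical helper mirroring the `opts.hyde !== false` check above.
// Only a strict `false` switches the stage off; a missing key keeps it on.
function stageEnabled(opts, name) {
  return opts[name] !== false;
}

const conversational = {};                         // defaults: stages run
const codeChunk = { hyde: false, distill: false }; // structured content opts out

console.log(stageEnabled(conversational, "hyde")); // true
console.log(stageEnabled(codeChunk, "hyde"));      // false
console.log(stageEnabled({ hyde: 0 }, "hyde"));    // true (0 is not strictly false)
```

The practical consequence is that existing callers, which never set `hyde`, keep their current behaviour, while falsy values other than `false` do not disable the stage.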
@@ -36,6 +36,9 @@
   */

  import pg from "pg";
+ import { existsSync, readFileSync } from "node:fs";
+ import { homedir } from "node:os";
+ import { join } from "node:path";
  import { createMemorySystem } from "../index.js";
  import { createContextEngine } from "./context-engine.js";
  import { sanitizeMemoryContent } from "../sanitize.js";
@@ -45,6 +48,37 @@ import {
    hostedStoreMemory as _hostedStoreMemory,
  } from "../hosted.js";

+ /**
+  * Hydrate hosted-mode credentials from ~/.config/tes/credentials.json
+  * when the plugin config doesn't carry tes_endpoint/tes_api_key/
+  * tes_client_id. Lets a user who ran `tes login` get OpenClaw working
+  * without also editing openclaw.json by hand. Plugin-config values
+  * take precedence over the credentials file.
+  */
+ function hydrateHostedConfig(config) {
+   if (config?.tes_endpoint && config?.tes_api_key && config?.tes_client_id) {
+     return config;
+   }
+   const credPath = join(
+     process.env.XDG_CONFIG_HOME || join(homedir(), ".config"),
+     "tes",
+     "credentials.json"
+   );
+   if (!existsSync(credPath)) return config;
+   try {
+     const creds = JSON.parse(readFileSync(credPath, "utf-8"));
+     if (!creds?.endpoint || !creds?.clientId || !creds?.apiKey) return config;
+     return {
+       ...(config || {}),
+       tes_endpoint: config?.tes_endpoint || creds.endpoint,
+       tes_client_id: config?.tes_client_id || creds.clientId,
+       tes_api_key: config?.tes_api_key || creds.apiKey,
+     };
+   } catch {
+     return config;
+   }
+ }
+
  // --- Hosted-mode adapters ---
  //
  // The OpenClaw plugin predates the public hosted-helper API (`packages/
@@ -399,7 +433,11 @@ export default {
    kind: "context-engine",

    register(api) {
-     const config = api.config || {};
+     // Hydrate hosted creds from ~/.config/tes/credentials.json if the
+     // plugin config doesn't carry tes_* keys. Lets `tes login` auto-
+     // configure OpenClaw without the user editing openclaw.json by
+     // hand. Plugin-config values take precedence.
+     const config = hydrateHostedConfig(api.config || {});
      const hosted = isHostedMode(config);
      const log = (msg) =>
        process.stderr.write(`[pentatonic-memory] ${msg}\n`);
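The precedence rule in `hydrateHostedConfig` can be exercised without touching the file system. The sketch below repeats the same checks but takes the parsed credentials as an argument; `mergeHostedConfig` is a hypothetical name, and the endpoint, client ID, and key values are made up for illustration.

```javascript
// Pure variant of the merge logic shown above. The credentials file read is
// replaced by a `creds` argument (assumption made for testability only).
function mergeHostedConfig(config, creds) {
  if (config?.tes_endpoint && config?.tes_api_key && config?.tes_client_id) {
    return config; // plugin config is complete, credentials are ignored
  }
  if (!creds?.endpoint || !creds?.clientId || !creds?.apiKey) {
    return config; // incomplete credentials: leave the config untouched
  }
  return {
    ...(config || {}),
    // Per-key fallback: a value set in the plugin config always wins.
    tes_endpoint: config?.tes_endpoint || creds.endpoint,
    tes_client_id: config?.tes_client_id || creds.clientId,
    tes_api_key: config?.tes_api_key || creds.apiKey,
  };
}

// Illustrative values only.
const creds = {
  endpoint: "https://acme.example",
  clientId: "acme",
  apiKey: "key-from-login",
};

// Partial plugin config: the endpoint override survives, the rest is filled in.
const merged = mergeHostedConfig({ tes_endpoint: "https://override.example" }, creds);
console.log(merged);
```

One detail worth noticing is that the credentials file is all-or-nothing: if any of its three fields is missing, nothing is merged, even when the plugin config lacks only the field that is present.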
@@ -29,6 +29,11 @@ const DEFAULT_WEIGHTS = {
   * @param {number} [opts.limit=20] - Max results
   * @param {number} [opts.minScore=0.5] - Minimum score threshold
   * @param {string} [opts.userId] - Optional user scope
+  * @param {string} [opts.kind] - Filter by metadata.kind exact match
+  *   (e.g. "code_reference"). When omitted, all kinds are searched.
+  *   Lets corpus ingest scope a query to code references only,
+  *   isolating them from conversational memories that share the
+  *   semantic layer.
   * @param {object} [opts.weights] - Override scoring weights
   *   (relevance, recency, frequency, atomBoost, verbosityPenalty)
   * @param {boolean} [opts.dedupeBySource=true] - When an atom matches,
@@ -76,10 +81,18 @@ export async function search(db, ai, query, opts = {}) {
    }

    const embJson = JSON.stringify(embResult.embedding);
-   const userFilter = opts.userId ? `AND mn.user_id = $5` : "";
-
    const params = [opts.clientId, embJson, query, limit];
-   if (opts.userId) params.push(opts.userId);
+   let nextParam = 5;
+   let userFilter = "";
+   if (opts.userId) {
+     userFilter = `AND mn.user_id = $${nextParam++}`;
+     params.push(opts.userId);
+   }
+   let kindFilter = "";
+   if (opts.kind) {
+     kindFilter = `AND mn.metadata->>'kind' = $${nextParam++}`;
+     params.push(opts.kind);
+   }

    const sql = `
      WITH max_ac AS (
@@ -143,6 +156,7 @@ export async function search(db, ai, query, opts = {}) {
        AND mn.embedding_vec IS NOT NULL
        AND vector_dims(mn.embedding_vec) = vector_dims($2::vector)
        ${userFilter}
+       ${kindFilter}
      ORDER BY final_score DESC
      LIMIT $4
    `;
@@ -203,10 +217,18 @@ export async function search(db, ai, query, opts = {}) {
   */
  export async function textSearch(db, query, opts = {}) {
    const limit = Math.min(Math.max(1, opts.limit || 20), 200);
-   const userFilter = opts.userId ? `AND mn.user_id = $4` : "";
-   const params = opts.userId
-     ? [opts.clientId, query, limit, opts.userId]
-     : [opts.clientId, query, limit];
+   const params = [opts.clientId, query, limit];
+   let nextParam = 4;
+   let userFilter = "";
+   if (opts.userId) {
+     userFilter = `AND mn.user_id = $${nextParam++}`;
+     params.push(opts.userId);
+   }
+   let kindFilter = "";
+   if (opts.kind) {
+     kindFilter = `AND mn.metadata->>'kind' = $${nextParam++}`;
+     params.push(opts.kind);
+   }

    const sql = `
      SELECT mn.* FROM memory_nodes mn
@@ -216,6 +238,7 @@ export async function textSearch(db, query, opts = {}) {
          OR mn.content ILIKE '%' || $2 || '%'
        )
        ${userFilter}
+       ${kindFilter}
      ORDER BY
        ts_rank(to_tsvector('english', mn.content), plainto_tsquery('english', $2)) DESC,
        mn.confidence DESC
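Both search functions move from a hard-coded placeholder (`$5`, `$4`) to a running `nextParam` counter, which keeps the SQL placeholders aligned with the `params` array whichever optional filters are present. The sketch below isolates that pattern; `buildFilters` is a hypothetical helper (the SDK inlines the logic), and the parameter values are illustrative.

```javascript
// Hypothetical helper isolating the placeholder-numbering pattern from the
// hunks above. Each optional filter claims the next free $n and appends its
// value to params in the same order.
function buildFilters(baseParams, opts) {
  const params = [...baseParams];
  let nextParam = params.length + 1; // first unused placeholder index
  let userFilter = "";
  if (opts.userId) {
    userFilter = `AND mn.user_id = $${nextParam++}`;
    params.push(opts.userId);
  }
  let kindFilter = "";
  if (opts.kind) {
    kindFilter = `AND mn.metadata->>'kind' = $${nextParam++}`;
    params.push(opts.kind);
  }
  return { userFilter, kindFilter, params };
}

// Four base params, as in search(): clientId, embedding JSON, query, limit.
const base = ["acme", "[0.1,0.2]", "refund policy", 20];

const kindOnly = buildFilters(base, { kind: "code_reference" });
console.log(kindOnly.kindFilter); // AND mn.metadata->>'kind' = $5

const both = buildFilters(base, { userId: "u-1", kind: "code_reference" });
console.log(both.userFilter); // AND mn.user_id = $5
console.log(both.kindFilter); // AND mn.metadata->>'kind' = $6
```

With a fixed `$5` for the user filter, adding a second optional filter would have required enumerating every combination; the counter avoids that and still passes the values as bound parameters rather than interpolating them into the SQL.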
@@ -0,0 +1,13 @@
+ # pentatonic-memory-engine — environment overrides
+
+ # Compat shim port (matches pentatonic-memory v0.5 default)
+ PME_PORT=8099
+
+ # Client tenant scoping
+ CLIENT_ID=default
+
+ # Neo4j auth (L3 KG)
+ NEO4J_AUTH=neo4j/local-dev-pw
+
+ # NV-Embed model (auto-downloaded from Hugging Face on first run)
+ NV_EMBED_MODEL=nvidia/NV-Embed-v2
@@ -0,0 +1,131 @@
+ # pentatonic-memory-engine
+
+ **Drop-in replacement for `pentatonic-memory` v0.5.x with a 7-layer retrieval stack underneath.**
+
+ | Configuration | Mean accuracy* | p50 latency |
+ |---|---|---|
+ | pentatonic-memory v0.5.6 (current OSS) | 17.6% | 33ms |
+ | pentatonic-memory v0.4.7 (legacy OSS) | 38.8% | 27ms |
+ | **pentatonic-memory-engine — fast path** (L6-only via docker, default config) | **84.6%** | **110ms** |
+ | **pentatonic-memory-engine — max accuracy** (full 7-layer L2 fusion) | **85.7%** | **1241ms** |
+ | langmem (in-process) | 83.0% | 121ms |
+ | cognee | 82.1% | 192ms |
+ | single-store baseline | 79.3% | 110ms |
+
+ \* Mean over 6 commerce-domain benches (agent-coding, chat-recall, circular-economy, customer-support, marketplace-ops, product-catalogue) using substring grading. Full reports under `bench/`.
+
+ **Two configurations, same package.** The fast path (L6-only) is the default and ships at #1 on accuracy among real OSS memory stacks. The max-accuracy 7-layer mode adds Knowledge-Graph entity matching + L0 BM25 + L4 vec fusion via the L2 orchestrator — buys you +1.1pp at 11× latency. Pick per workload (live agent loop → fast path; offline batch / accuracy-graded eval → 7-layer).
+
+ ---
+
+ ## What this is
+
+ A self-contained docker-compose package that exposes the **same HTTP API as `pentatonic-memory`** (`/store`, `/search`, `/health`), plus two regression-fix endpoints (`/store-batch`, `/forget`) — but routes every call through a 7-layer hybrid retrieval engine instead of the single Postgres + pgvector store.
+
+ Same client code. Same SDK. ~5x better accuracy on retrieval-style benchmarks.
+
+ ## Why does the existing OSS underperform?
+
+ Detailed analysis in `docs/why-v05-underperforms.md`. Short version:
+
+ - Single vector store (pgvector), single embedding per row → diluted vectors on long content
+ - `atomBoost: +0.15` makes LLM-paraphrased atoms outrank source verbatim → substring grading fails
+ - HyDE generated at ingest time (60s LLM call per /store), not at query time
+ - pgvector HNSW broken at >2000 dims → 4096d NV-Embed falls back to sequential scan
+ - No reranker, no graph traversal, no multi-store fusion
+
+ ## Architecture (7-layer)
+
+ The engine is the same `sequential-hybridrag-7-layer` stack the L2 proxy reports in its health endpoint.
+
+ ```
+                                                       ┌──────────────────┐
+                                                       │ L0 BM25 (FTS)    │
+                                                       ├──────────────────┤
+                                                       │ L1 Core files    │
+                   POST /store      ┌──────────────┐   ├──────────────────┤
+                   POST /search     │ compat shim  │   │ L2 HybridRAG     │
+ client (any) ───► POST /forget  ──► (FastAPI)     │──►│      orchestrator│
+                   POST /store-batch└──────────────┘   ├──────────────────┤
+                   GET /health                         │ L3 Knowledge     │
+                                                       │    Graph (KG)    │
+                                                       ├──────────────────┤
+                                                       │ L4 sqlite-vec    │
+                                                       ├──────────────────┤
+                                                       │ L5 Qdrant comms  │
+                                                       ├──────────────────┤
+                                                       │ L6 Document      │
+                                                       │    Store +       │
+                                                       │    reranker      │
+                                                       └─────────┬────────┘
+
+                                                ┌────────────────┴───────┐
+                                                │ NV-Embed-v2            │
+                                                │ Cross-encoder reranker │
+                                                └─────────────────────────┘
+ ```
+
+ Each layer indexes the same content differently. Search runs all seven in parallel and fuses results via Reciprocal Rank Fusion (RRF). Different query types win on different layers — agent-coding queries land on L0 BM25, chat-recall on L5, multi-hop entity questions on L3, conversational context on L1.
+
+ **Layer cheat-sheet:**
+
+ | # | Layer | Purpose | Backing tech |
+ |---|---|---|---|
+ | L0 | BM25 | Lexical / keyword recall | SQLite FTS5 |
+ | L1 | Core files | Always-loaded high-priority text (system manuals, key docs) | flat markdown read by L2 |
+ | L2 | HybridRAG orchestrator | Fan-out + RRF fusion across all layers | Python FastAPI |
+ | L3 | Knowledge Graph | Entity-aware retrieval, multi-hop relationships | Neo4j (OSS) |
+ | L4 | Vector index | High-recall semantic search | sqlite-vec |
+ | L5 | Comms / multi-collection vectors | Chat / email / contact / memory namespaces | Qdrant |
+ | L6 | Document store | Per-arena docs + cross-encoder reranker | sqlite + Milvus + MiniLM |
+
+ ## Quick start
+
+ ```bash
+ git clone <this-repo>
+ cd pentatonic-memory-engine
+ cp .env.example .env # set NEO4J_AUTH, etc.
+ docker compose up -d
+ ```
+
+ Wait ~30s for layers to come up. Verify:
+
+ ```bash
+ curl http://localhost:8099/health
+ # → {"status":"ok","layers":{"l0":"ok","l1":"ok","l2":"ok","l3":"ok","l4":"ok","l5":"ok","l6":"ok"},"engine":"pentatonic-memory-engine"}
+ ```
+
+ Now point your existing `pentatonic-memory` SDK client at `http://localhost:8099` — no code change.
+
+ ### Picking a mode
+
+ Both modes share the same `docker compose up -d` and the same HTTP API. Switch via one env var on the `compat` container:
+
+ ```bash
+ # Fast path — L6-only, 84.6% / 110ms p50 (default)
+ BYPASS_L2_PROXY=1 docker compose up -d compat
+
+ # Max accuracy — full 7-layer L2 fusion, 85.7% / 1241ms p50
+ BYPASS_L2_PROXY=0 docker compose up -d compat
+ ```
+
+ | Mode | Mean acc | p50 | When to use |
+ |---|---|---|---|
+ | L6-only (default) | 84.6% | 110ms | Live agent calls, latency-sensitive paths |
+ | 7-layer fusion | 85.7% | 1241ms | Offline batch retrieval, accuracy-graded eval, multi-hop entity queries |
+
+ Both modes populate all 7 layers on `/store-batch` (since v0.2). The mode flag only changes which layers the **search** path queries.
+
+ ## API compatibility
+
+ | Endpoint | v0.5 | This package | Notes |
+ |---|---|---|---|
+ | `POST /store` | ✅ | ✅ | Same request/response shape |
+ | `POST /search` | ✅ | ✅ | Same request/response shape; ?mode=vector/text both supported |
+ | `GET /health` | ✅ | ✅ | Returns aggregate health across all 7 layers |
+ | `POST /store-batch` | ❌ | ✅ | New: batch-ingest N records in one HTTP call (30-50× faster) |
+ | `POST /forget` | ❌ (regression) | ✅ | Restored from v0.4.x; supports `metadata_contains` filter |
+
+ Migration: see `docs/MIGRATION.md`.
+
+
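The architecture section of this README says results from the layers are fused via Reciprocal Rank Fusion. As a reminder of what RRF computes, here is a minimal sketch: each document scores the sum of `1 / (k + rank)` over every list it appears in. The function name `rrfFuse`, the document IDs, and `k = 60` (the constant commonly used in the literature) are assumptions; the README does not state the constant or the weighting the L2 orchestrator actually uses.

```javascript
// Minimal Reciprocal Rank Fusion sketch. Each input list is ordered
// best-first. k = 60 is an assumed default, not a value taken from the engine.
function rrfFuse(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((docId, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(docId, (scores.get(docId) || 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([docId, score]) => ({ docId, score }));
}

// Hypothetical per-layer results for a single query.
const fused = rrfFuse([
  ["doc-7", "doc-2", "doc-9"], // e.g. a lexical layer
  ["doc-2", "doc-7", "doc-4"], // e.g. a vector layer
  ["doc-2", "doc-5"],          // e.g. a graph layer
]);
console.log(fused.map((r) => r.docId));
```

Because only ranks enter the formula, layers with incomparable raw scores (BM25, cosine similarity, graph matches) can be combined without normalising them first; a document that appears in several lists, like `doc-2` here, rises above one that tops a single list.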
@@ -0,0 +1,99 @@
+ # Benchmark Results
+
+ All runs were conducted on **DGX Spark GB10** (10-core ARM CPU, 128GB unified memory, NVIDIA GB10 SoC) on **2026-04-27**.
+
+ ## Summary
+
+ | Stack | Mean accuracy | Mean p50 latency | Coverage |
+ |---|---|---|---|
+ | **pentatonic-memory-engine — 7-layer fusion** | **85.7%** | 1241ms | 6/6 |
+ | **pentatonic-memory-engine — L6-only fast path** | **84.6%** | 110ms | 6/6 |
+ | pentatonic-memory v0.4.7 (current canonical OSS) | 38.8% | 27ms | 6/6 |
+ | pentatonic-memory v0.5.6 (latest OSS) | 17.6% | 33ms | 6/6 |
+
+ Both pentatonic-memory baselines were freshly purged before the run (no stale data pollution). Both modes of `pentatonic-memory-engine` ship in the same docker-compose package — one env var (`BYPASS_L2_PROXY`) toggles between fast path and 7-layer fusion.
+
+ ## Per-bench breakdown
+
+ | Bench | 7-layer | L6-only | v0.4.7 | v0.5.6 |
+ |---|---|---|---|---|
+ | agent-coding | 100.0% (22/22) | 100.0% (22/22) | 63.6% (14/22) | 9.1% (2/22) |
+ | chat-recall | 100.0% (16/16) | 100.0% (16/16) | 12.5% (2/16) | 0.0% (0/16) |
+ | circular-economy | 76.0% (19/25) | 80.0% (20/25) | 40.0% (10/25) | 32.0% (8/25) |
+ | customer-support | 75.0% (15/20) | 70.0% (14/20) | 25.0% (5/20) | 5.0% (1/20) |
+ | marketplace-ops | 80.0% (16/20) | 80.0% (16/20) | 25.0% (5/20) | 15.0% (3/20) |
+ | product-catalogue | 83.3% (15/18) | 77.8% (14/18) | 66.7% (12/18) | 44.4% (8/18) |
+ | **MEAN** | **85.7%** | **84.6%** | **38.8%** | **17.6%** |
+
+ ### When does 7-layer fusion help?
+
+ Layer-by-layer effect over L6-only:
+
+ - **+5.6pp on product-catalogue** — KG entity matching pulls related SKUs / materials in one hop; L0 BM25 catches part numbers that vector search alone misses.
+ - **+5.0pp on customer-support** — Multi-hop entity resolution (customer → order → policy) lifts retrieval where pure semantic search loses the relationship.
+ - **Tied on agent-coding, chat-recall, marketplace-ops** — L6 already saturated (100%, 100%, 80%); extra layers add nothing.
+ - **−4.0pp on circular-economy** — Extra layers add noise on this sustainability corpus; L6's reranker alone is the better signal.
+
+ Net: +1.1pp accuracy at 11× latency cost. Use 7-layer for accuracy-graded eval and offline batch retrieval; stay on L6-only for live agent calls.
38
+
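The per-bench deltas and the net trade-off follow directly from the tables. This is a small worked calculation using only the published hit counts and p50 latencies; the variable names are illustrative.

```python
# (correct, total) per bench, copied from the per-bench breakdown table.
SEVEN_LAYER = {
    "agent-coding": (22, 22), "chat-recall": (16, 16),
    "circular-economy": (19, 25), "customer-support": (15, 20),
    "marketplace-ops": (16, 20), "product-catalogue": (15, 18),
}
L6_ONLY = {
    "agent-coding": (22, 22), "chat-recall": (16, 16),
    "circular-economy": (20, 25), "customer-support": (14, 20),
    "marketplace-ops": (16, 20), "product-catalogue": (14, 18),
}
# Mean p50 latency in milliseconds, from the summary table.
P50_MS = {"7-layer": 1241, "L6-only": 110}


def pct(correct, total):
    """Accuracy in percent."""
    return 100 * correct / total


# Percentage-point change from adding the extra layers, per bench.
deltas = {b: pct(*SEVEN_LAYER[b]) - pct(*L6_ONLY[b]) for b in SEVEN_LAYER}
net_pp = sum(deltas.values()) / len(deltas)
latency_ratio = P50_MS["7-layer"] / P50_MS["L6-only"]

for bench, delta in deltas.items():
    print(f"{bench}: {delta:+.1f}pp")
print(f"net: {net_pp:+.1f}pp at {latency_ratio:.1f}x latency")
```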
39
+ ## Bench corpora
40
+
41
+ The 6 benches use commerce-domain corpora that overlap Pentatonic's actual product space:
42
+
43
+ - `agent-coding` — 22 questions over 22 docs (TES + agent SDK source/docs)
44
+ - `chat-recall` — 16 questions over a 16-turn chat transcript
45
+ - `circular-economy` — 25 questions over 25 sustainability docs
46
+ - `customer-support` — 20 questions over a 20-doc support knowledge base
47
+ - `marketplace-ops` — 20 questions over 20 marketplace listings
48
+ - `product-catalogue` — 18 questions over an 18-SKU product catalogue
49
+
50
+ All grading uses **substring match**: a hit is correct if the retrieved text contains the literal answer string. This is the strictest grading mode and the closest analogue to "did the SDK return a chunk that actually answers the question."
51
+
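Substring grading as described above can be sketched in a few lines. This is not the harness's actual grader: the function names are hypothetical, the match is assumed to be case-sensitive with no normalisation, and only the top `k` retrieved texts are assumed to be checked (mirroring the `-k 3` flag in the reproduce commands).

```python
def is_hit(retrieved_texts, answer, k=3):
    """True if any of the top-k retrieved texts contains the literal answer."""
    return any(answer in text for text in retrieved_texts[:k])


def accuracy(graded, k=3):
    """Percent of (retrieved_texts, answer) pairs that count as hits."""
    graded = list(graded)
    hits = sum(is_hit(texts, answer, k) for texts, answer in graded)
    return 100 * hits / len(graded)


# Illustrative data only; these are not questions from the real benches.
examples = [
    (["Returns are accepted within 30 days.", "Shipping is free."], "30 days"),
    (["The jacket is made of recycled nylon."], "organic cotton"),
]
print(f"{accuracy(examples):.1f}%")
```

Under this scheme a chunk that paraphrases the answer without containing the exact string is scored as a miss.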
52
+ ## Reproduce
53
+
54
+ ```bash
55
+ # Bring up the engine
56
+ cd pentatonic-memory-engine && docker compose up -d
57
+
58
+ # Wait for healthy
59
+ until curl -sf http://localhost:8099/health | grep -q '"status":"ok"'; do sleep 2; done
60
+
61
+ # Set up the bench harness
62
+ cd ~/pentatonic-memory-bench
63
+ pip install -e .
64
+
65
+ # Run the L6-only fast path (default)
66
+ PENTATONIC_MEMORY_URL=http://localhost:8099 \
67
+ python -m pentatonic_bench.cli run -b chat-recall -s pentatonic-memory -k 3
68
+
69
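+ # Run the 7-layer fusion: set BYPASS_L2_PROXY=0 and recreate the compat service.
70
+ # The shell is now in the bench directory, so hop back to the engine directory (the previous one) in a subshell.
+ (cd - >/dev/null && BYPASS_L2_PROXY=0 docker compose up -d --force-recreate compat)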
71
+ PENTATONIC_MEMORY_URL=http://localhost:8099 \
72
+ python -m pentatonic_bench.cli run -b chat-recall -s pentatonic-memory -k 3
73
+ ```
74
+
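The commands above run only `chat-recall`. To cover all six benches in one go, a small driver along these lines might help. It is a sketch: it assumes `-b` accepts each bench name listed under "Bench corpora" and that the CLI is installed in the current Python environment. `build_command` and `run_all` are hypothetical helpers, not part of the harness.

```python
import os
import subprocess
import sys

# Bench names as listed in the "Bench corpora" section.
BENCHES = [
    "agent-coding", "chat-recall", "circular-economy",
    "customer-support", "marketplace-ops", "product-catalogue",
]


def build_command(bench, stack="pentatonic-memory", k=3):
    """Build the same CLI invocation used in the reproduce steps."""
    return [sys.executable, "-m", "pentatonic_bench.cli", "run",
            "-b", bench, "-s", stack, "-k", str(k)]


def run_all(url="http://localhost:8099"):
    """Run every bench against the engine at `url`, stopping on first failure."""
    env = {**os.environ, "PENTATONIC_MEMORY_URL": url}
    for bench in BENCHES:
        subprocess.run(build_command(bench), env=env, check=True)
```

Call `run_all()` once the health check passes. The engine mode (L6-only or 7-layer) is whatever the `compat` service was last started with.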
75
+ ## Comparison to other open-source memory stacks
76
+
77
+ | Stack | Mean accuracy | Mean p50 latency | Notes |
78
+ |---|---|---|---|
79
+ | 🥇 **pentatonic-memory-engine — 7-layer** | **85.7%** | **1241ms** | This package, full L2 fusion |
80
+ | 🥈 **pentatonic-memory-engine — L6-only** | **84.6%** | **110ms** | This package, fast path |
81
+ | 🥉 langmem | 83.0% | 121ms | LangChain's in-process memory; no HTTP/embedding overhead |
82
+ | cognee | 82.1% | 192ms | Graph + vector hybrid, KG-first |
83
+ | single-store baseline | 79.3% | 110ms | Single vector store + sentence-transformers |
84
+ | llamaindex | 79.3% | 203ms | LlamaIndex with default config |
85
+ | bm25-baseline | 75.9% | 0ms | Pure SQLite FTS5, no embeddings |
86
+ | pentatonic-memory v0.4.7 | 38.8% | 27ms | Current canonical OSS |
87
+ | graphiti | 30.1% | 156ms | Graph-only, no vector |
88
+ | pentatonic-memory v0.5.6 | 17.6% | 33ms | Latest OSS |
89
+
90
+ The engine beats every other OSS memory stack benchmarked here on accuracy in both modes. The L6-only fast path is slightly faster than langmem (110ms vs 121ms p50) while delivering +1.6pp accuracy. The 7-layer mode ranks first on accuracy across all benchmarked stacks.
91
+
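The ranking claims can be checked against the comparison table with a short script. The figures are copied from the table; the shortened stack labels are only for illustration.

```python
# (mean accuracy %, mean p50 ms) copied from the comparison table.
STACKS = {
    "engine 7-layer": (85.7, 1241),
    "engine L6-only": (84.6, 110),
    "langmem": (83.0, 121),
    "cognee": (82.1, 192),
    "single-store baseline": (79.3, 110),
    "llamaindex": (79.3, 203),
    "bm25-baseline": (75.9, 0),
    "pentatonic-memory v0.4.7": (38.8, 27),
    "graphiti": (30.1, 156),
    "pentatonic-memory v0.5.6": (17.6, 33),
}

# Rank by accuracy, highest first (ties keep table order).
ranked = sorted(STACKS, key=lambda name: STACKS[name][0], reverse=True)

# Accuracy and latency gap between the L6-only fast path and langmem.
acc_gap_pp = STACKS["engine L6-only"][0] - STACKS["langmem"][0]
latency_gap_ms = STACKS["langmem"][1] - STACKS["engine L6-only"][1]

print(ranked[:3])
print(f"L6-only vs langmem: {acc_gap_pp:+.1f}pp, {latency_gap_ms}ms faster")
```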
92
+ ## Raw scorecards
93
+
94
+ - `scorecards-engine-via-docker/` — 6 JSON scorecards, L6-only fast path (84.6% mean / 110ms p50)
95
+ - `scorecards-engine-via-l2-7-layer-populated/` — 6 JSON scorecards, full 7-layer fusion (85.7% mean / 1241ms p50)
96
+ - `scorecards-engine-via-l2-empty-layers/` — earlier experiment, 7-layer with empty L0/L4-qmd/L3 (82.1%, rolled back; superseded by populated 7-layer)
97
+ - `scorecards-engine-via-shim/` — earlier experiment, shim-direct ingestion path
98
+ - `scorecards-engine/` — initial bench (1183ms, before L6-only optimisation)
99
+ - `scorecards-pentatonic-baseline/` — 12 JSON scorecards (6 per stack) for the v0.4.7 and v0.5.6 baselines