@pentatonic-ai/ai-agent-sdk 0.5.11 → 0.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (119)
  1. package/README.md +345 -174
  2. package/bin/__tests__/callback-server.test.js +70 -0
  3. package/bin/__tests__/credentials.test.js +58 -0
  4. package/bin/__tests__/login.test.js +210 -0
  5. package/bin/__tests__/pkce.test.js +39 -0
  6. package/bin/__tests__/whoami.test.js +77 -0
  7. package/bin/cli.js +109 -440
  8. package/bin/commands/config.js +251 -0
  9. package/bin/commands/login.js +219 -0
  10. package/bin/commands/whoami.js +41 -0
  11. package/bin/lib/callback-server.js +137 -0
  12. package/bin/lib/credentials.js +100 -0
  13. package/bin/lib/pkce.js +26 -0
  14. package/package.json +4 -2
  15. package/packages/doctor/__tests__/detect.test.js +2 -6
  16. package/packages/doctor/src/checks/local-memory.js +164 -196
  17. package/packages/doctor/src/detect.js +11 -3
  18. package/packages/memory/src/__tests__/corpus-chunkers.test.js +143 -0
  19. package/packages/memory/src/__tests__/corpus-discover.test.js +175 -0
  20. package/packages/memory/src/__tests__/corpus-ingest.test.js +236 -0
  21. package/packages/memory/src/__tests__/corpus-signatures.test.js +175 -0
  22. package/packages/memory/src/__tests__/corpus-state.test.js +161 -0
  23. package/packages/memory/src/__tests__/ingest-corpus-opts.test.js +129 -0
  24. package/packages/memory/src/__tests__/search-kind.test.js +108 -0
  25. package/packages/memory/src/corpus/adapters.js +398 -0
  26. package/packages/memory/src/corpus/chunkers.js +328 -0
  27. package/packages/memory/src/corpus/cli.js +613 -0
  28. package/packages/memory/src/corpus/discover.js +379 -0
  29. package/packages/memory/src/corpus/index.js +68 -0
  30. package/packages/memory/src/corpus/ingest.js +356 -0
  31. package/packages/memory/src/corpus/signatures.js +280 -0
  32. package/packages/memory/src/corpus/state.js +134 -0
  33. package/packages/memory/src/index.js +18 -0
  34. package/packages/memory/src/ingest.js +20 -11
  35. package/packages/memory/src/openclaw/index.js +39 -1
  36. package/packages/memory/src/search.js +30 -7
  37. package/packages/memory-engine/.env.example +13 -0
  38. package/packages/memory-engine/README.md +131 -0
  39. package/packages/memory-engine/bench/README.md +99 -0
  40. package/packages/memory-engine/bench/scorecards-engine/agent-coding__pentatonic-baseline__20260427-142523.json +1115 -0
  41. package/packages/memory-engine/bench/scorecards-engine/chat-recall__pentatonic-baseline__20260427-142648.json +819 -0
  42. package/packages/memory-engine/bench/scorecards-engine/circular-economy__pentatonic-baseline__20260427-142757.json +1278 -0
  43. package/packages/memory-engine/bench/scorecards-engine/customer-support__pentatonic-baseline__20260427-142900.json +1018 -0
  44. package/packages/memory-engine/bench/scorecards-engine/marketplace-ops__pentatonic-baseline__20260427-142957.json +1038 -0
  45. package/packages/memory-engine/bench/scorecards-engine/product-catalogue__pentatonic-baseline__20260427-143122.json +961 -0
  46. package/packages/memory-engine/bench/scorecards-engine-via-docker/agent-coding__pentatonic-memory__20260427-161812.json +1115 -0
  47. package/packages/memory-engine/bench/scorecards-engine-via-docker/chat-recall__pentatonic-memory__20260427-161701.json +819 -0
  48. package/packages/memory-engine/bench/scorecards-engine-via-docker/circular-economy__pentatonic-memory__20260427-161713.json +1278 -0
  49. package/packages/memory-engine/bench/scorecards-engine-via-docker/customer-support__pentatonic-memory__20260427-161723.json +1018 -0
  50. package/packages/memory-engine/bench/scorecards-engine-via-docker/marketplace-ops__pentatonic-memory__20260427-161732.json +1038 -0
  51. package/packages/memory-engine/bench/scorecards-engine-via-docker/product-catalogue__pentatonic-memory__20260427-161741.json +937 -0
  52. package/packages/memory-engine/bench/scorecards-engine-via-l2-7-layer-populated/agent-coding__pentatonic-memory__20260427-184718.json +1115 -0
  53. package/packages/memory-engine/bench/scorecards-engine-via-l2-7-layer-populated/chat-recall__pentatonic-memory__20260427-184614.json +819 -0
  54. package/packages/memory-engine/bench/scorecards-engine-via-l2-7-layer-populated/circular-economy__pentatonic-memory__20260427-184809.json +1278 -0
  55. package/packages/memory-engine/bench/scorecards-engine-via-l2-7-layer-populated/customer-support__pentatonic-memory__20260427-184854.json +1018 -0
  56. package/packages/memory-engine/bench/scorecards-engine-via-l2-7-layer-populated/marketplace-ops__pentatonic-memory__20260427-184929.json +1038 -0
  57. package/packages/memory-engine/bench/scorecards-engine-via-l2-7-layer-populated/product-catalogue__pentatonic-memory__20260427-185015.json +961 -0
  58. package/packages/memory-engine/bench/scorecards-engine-via-l2-empty-layers/agent-coding__pentatonic-memory__20260427-175252.json +1115 -0
  59. package/packages/memory-engine/bench/scorecards-engine-via-l2-empty-layers/chat-recall__pentatonic-memory__20260427-175312.json +819 -0
  60. package/packages/memory-engine/bench/scorecards-engine-via-l2-empty-layers/circular-economy__pentatonic-memory__20260427-175335.json +1278 -0
  61. package/packages/memory-engine/bench/scorecards-engine-via-l2-empty-layers/customer-support__pentatonic-memory__20260427-175355.json +1018 -0
  62. package/packages/memory-engine/bench/scorecards-engine-via-l2-empty-layers/marketplace-ops__pentatonic-memory__20260427-175413.json +1038 -0
  63. package/packages/memory-engine/bench/scorecards-engine-via-l2-empty-layers/product-catalogue__pentatonic-memory__20260427-175430.json +883 -0
  64. package/packages/memory-engine/bench/scorecards-engine-via-shim/agent-coding__pentatonic-memory__20260427-155409.json +1115 -0
  65. package/packages/memory-engine/bench/scorecards-engine-via-shim/chat-recall__pentatonic-memory__20260427-155421.json +819 -0
  66. package/packages/memory-engine/bench/scorecards-engine-via-shim/circular-economy__pentatonic-memory__20260427-155433.json +1278 -0
  67. package/packages/memory-engine/bench/scorecards-engine-via-shim/customer-support__pentatonic-memory__20260427-155443.json +1018 -0
  68. package/packages/memory-engine/bench/scorecards-engine-via-shim/marketplace-ops__pentatonic-memory__20260427-155453.json +1038 -0
  69. package/packages/memory-engine/bench/scorecards-engine-via-shim/product-catalogue__pentatonic-memory__20260427-155503.json +937 -0
  70. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/agent-coding__pentatonic-memory-latest__20260427-145103.json +1115 -0
  71. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/agent-coding__pentatonic-memory__20260427-144909.json +1115 -0
  72. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/chat-recall__pentatonic-memory-latest__20260427-145153.json +819 -0
  73. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/chat-recall__pentatonic-memory__20260427-145120.json +542 -0
  74. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/circular-economy__pentatonic-memory-latest__20260427-145313.json +1278 -0
  75. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/circular-economy__pentatonic-memory__20260427-145207.json +894 -0
  76. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/customer-support__pentatonic-memory-latest__20260427-145412.json +1018 -0
  77. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/customer-support__pentatonic-memory__20260427-145327.json +680 -0
  78. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/marketplace-ops__pentatonic-memory-latest__20260427-145517.json +1038 -0
  79. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/marketplace-ops__pentatonic-memory__20260427-145422.json +693 -0
  80. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/product-catalogue__pentatonic-memory-latest__20260427-145616.json +961 -0
  81. package/packages/memory-engine/bench/scorecards-pentatonic-baseline/product-catalogue__pentatonic-memory__20260427-145528.json +727 -0
  82. package/packages/memory-engine/compat/Dockerfile +11 -0
  83. package/packages/memory-engine/compat/server.py +680 -0
  84. package/packages/memory-engine/docker-compose.yml +243 -0
  85. package/packages/memory-engine/docs/MIGRATION.md +178 -0
  86. package/packages/memory-engine/docs/RUNBOOK-AWS.md +375 -0
  87. package/packages/memory-engine/docs/why-v05-underperforms.md +138 -0
  88. package/packages/memory-engine/engine/README.md +52 -0
  89. package/packages/memory-engine/engine/l2-hybridrag-proxy.py +1543 -0
  90. package/packages/memory-engine/engine/l5-comms-layer.py +663 -0
  91. package/packages/memory-engine/engine/l6-document-store.py +1018 -0
  92. package/packages/memory-engine/engine/services/l2/Dockerfile +41 -0
  93. package/packages/memory-engine/engine/services/l2/init_databases.py +81 -0
  94. package/packages/memory-engine/engine/services/l2/l2-hybridrag-proxy.py +1543 -0
  95. package/packages/memory-engine/engine/services/l4/Dockerfile +15 -0
  96. package/packages/memory-engine/engine/services/l4/server.py +235 -0
  97. package/packages/memory-engine/engine/services/l5/Dockerfile +9 -0
  98. package/packages/memory-engine/engine/services/l5/l5-comms-layer.py +678 -0
  99. package/packages/memory-engine/engine/services/l6/Dockerfile +11 -0
  100. package/packages/memory-engine/engine/services/l6/l6-document-store.py +1016 -0
  101. package/packages/memory-engine/engine/services/nv-embed/Dockerfile +28 -0
  102. package/packages/memory-engine/engine/services/nv-embed/server.py +152 -0
  103. package/packages/memory-engine/pme_memory/__init__.py +0 -0
  104. package/packages/memory-engine/pme_memory/__main__.py +129 -0
  105. package/packages/memory-engine/pme_memory/artifacts.py +95 -0
  106. package/packages/memory-engine/pme_memory/embed.py +74 -0
  107. package/packages/memory-engine/pme_memory/health.py +36 -0
  108. package/packages/memory-engine/pme_memory/hygiene.py +159 -0
  109. package/packages/memory-engine/pme_memory/indexer.py +200 -0
  110. package/packages/memory-engine/pme_memory/needs.py +55 -0
  111. package/packages/memory-engine/pme_memory/provenance.py +80 -0
  112. package/packages/memory-engine/pme_memory/scoring.py +168 -0
  113. package/packages/memory-engine/pme_memory/search.py +52 -0
  114. package/packages/memory-engine/pme_memory/store.py +86 -0
  115. package/packages/memory-engine/pme_memory/synthesis.py +114 -0
  116. package/packages/memory-engine/pyproject.toml +65 -0
  117. package/packages/memory-engine/scripts/kg-extractor.py +557 -0
  118. package/packages/memory-engine/scripts/kg-preflexor-v2.py +738 -0
  119. package/packages/memory-engine/tests/test_api_contract.sh +57 -0
package/README.md CHANGED
@@ -6,11 +6,11 @@
  </picture>
  </p>

- <h3 align="center">AI Agent SDK</h3>
+ <h3 align="center">Pentatonic AI Agent SDK</h3>

  <p align="center">
- Observability, memory, and analytics for LLM applications.<br>
- Run locally or use hosted TES. JavaScript &amp; Python.
+ Memory and observability for AI agents.<br>
+ Two products on one platform (TES). One install. JavaScript &amp; Python.
  </p>

  <p align="center">
@@ -21,166 +21,321 @@

  ---

+ ## What's in this SDK
+
+ Two products that share one TES account, one install line, and one dashboard:
+
+ | Product | What it does | When you want it |
+ |---|---|---|
+ | **Memory** | Persistent, searchable memory for your AI agent — 7-layer hybrid retrieval (BM25 + vector + KG + reranker), repo onboarding via references. Runs locally (Docker) or hosted (TES). | You want your agent to remember conversations, preferences, and codebase context across sessions. |
+ | **Observability** | Wrap your LLM client and capture every call — tokens, tool calls, latency, content. Events flow to TES for the dashboard, analytics, and search attribution. | You want to know what your agent is actually doing in production. |
+
+ The two products are sold separately, and neither requires the other; adopt one or both. Plugins for **Claude Code** and **OpenClaw** install everything at once if you'd rather skip the SDK glue.
+
+ ## Pick your path
+
+ - 🧠 **I want memory in my agent** → [Memory](#memory)
+ - 📊 **I want to instrument my LLM calls** → [Observability](#observability)
+ - 🔌 **I'm using Claude Code or OpenClaw** → [Plugins](#plugins)
+ - 📂 **I want to seed memory from my codebase or docs** → [Repository onboarding](#repository-onboarding-corpus-ingest)
+ - 🩺 **I want to check my install** → [Health checks (`doctor`)](#health-checks-doctor)
+
  ## Table of Contents

- - [Overview](#overview)
- - [Local Memory (self-hosted)](#local-memory-self-hosted)
- - [Hosted TES](#hosted-tes)
- - [Claude Code Plugin](#claude-code-plugin)
- - [OpenClaw Plugin](#openclaw-plugin)
- - [SDK: Wrap Your LLM Client](#sdk-wrap-your-llm-client)
- - [Supported Providers](#supported-providers)
+ - [TES — the platform](#tes--the-platform)
+ - [Memory](#memory)
+   - [Local (self-hosted)](#local-self-hosted)
+   - [Hosted (cloud)](#hosted-cloud)
+   - [Use as a library](#use-as-a-library)
+ - [Observability](#observability)
+   - [Wrap your LLM client](#wrap-your-llm-client)
+   - [Supported providers](#supported-providers)
+ - [Plugins](#plugins)
+   - [Claude Code](#claude-code)
+   - [OpenClaw](#openclaw)
+ - [Repository Onboarding (corpus ingest)](#repository-onboarding-corpus-ingest)
  - [API Reference](#api-reference)
  - [Health Checks (`doctor`)](#health-checks-doctor)
  - [Architecture](#architecture)

- ## Overview
+ ---
+
+ ## TES — the platform
+
+ **TES** (Thing Event System) is Pentatonic's account-and-events backbone. Both products in this SDK run on it: memory writes/queries land in TES, observability events stream to it, and the dashboard reads from it.
+
+ You only need a TES account if you're using **hosted memory** or **observability** (observability always sends events to TES). **Local memory** runs entirely on your machine and needs no TES account.
+
+ ```bash
+ # One-time: open browser, sign in or sign up, get API keys
+ npx @pentatonic-ai/ai-agent-sdk login
+ ```
+
+ `login` opens your browser at the hosted sign-in page. New users click "Sign up" to create a tenant (clientId + region + email + password). After verification the CLI writes credentials to `~/.config/tes/credentials.json` (mode 0600). The Claude Code plugin, OpenClaw plugin, hooks, and corpus CLI all auto-discover this file — no manual paste step.
+
+ ```
+ ✓ Connected as you@example.com on tenant `your-clientid`
+ ✓ Credentials written to ~/.config/tes/credentials.json
+ ```
+
+ To check connection state later: `npx @pentatonic-ai/ai-agent-sdk whoami`. To point at a local TES dev instance: `npx @pentatonic-ai/ai-agent-sdk login --endpoint http://localhost:8788`.
+
+ (`init` still works as a deprecated alias for `login`; it is slated for removal after one major release.)
+
+ ---
+
+ ## Memory

- Two ways to use the SDK:
+ Persistent, searchable memory for AI agents. Backed by a 7-layer hybrid retrieval engine — BM25 keyword (L0), core files (L1), HybridRAG orchestrator (L2), Knowledge Graph entities (L3), vector index (L4), comms-namespace vectors (L5), and a document store with cross-encoder reranker (L6). Reciprocal Rank Fusion stitches them at query time.
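The fusion step named above can be sketched in a few lines. This is a generic illustration of Reciprocal Rank Fusion, not this engine's actual code, and `k = 60` is the conventional constant from the RRF literature, assumed here rather than taken from the engine's config.

```javascript
// Generic Reciprocal Rank Fusion: merge ranked lists of doc ids into one
// fused ranking. Each list contributes 1 / (k + rank) per document, so
// items that rank well in several layers rise to the top.
function rrfFuse(rankedLists, k = 60) {
  const scores = new Map();
  for (const ranking of rankedLists) {
    ranking.forEach((docId, i) => {
      // rank is 1-based: first place contributes 1 / (k + 1)
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + i + 1));
    });
  }
  // Highest fused score first
  return [...scores.keys()].sort((a, b) => scores.get(b) - scores.get(a));
}

// Docs ranked by two hypothetical layers (say BM25 and vector search):
console.log(rrfFuse([["a", "b", "c"], ["b", "d", "a"]])); // → ["b", "a", "d", "c"]
```

"b" wins because it appears near the top of both lists, which is exactly why RRF is a robust way to stitch heterogeneous retrievers without score normalization.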

- **Local Memory** -- Run a fully private memory system on your own machine. PostgreSQL + pgvector + Ollama in Docker. No API keys, no cloud. Your agent gets persistent, searchable memory backed by multi-signal retrieval and HyDE query expansion.
+ Same engine, same wire format (`/store`, `/search`, `/forget`, `/store-batch`, `/health`), two deployment modes:

- **Hosted TES** -- Connect to Pentatonic's Thing Event System for production-grade observability, higher-dimensional embeddings, conversation analytics, and team-wide shared memory.
+ ### Local (self-hosted)

- Both paths work with Claude Code and OpenClaw. The plugins auto-search on every prompt and auto-store every conversation turn.
+ Run the full engine stack on your own machine via Docker. No API keys, no cloud, fully offline. Embeddings come from your local Ollama; quality depends on the model you pull (768d `nomic-embed-text` is the default and works fine on a laptop).

- ## Local Memory (self-hosted)
+ **Prerequisites**

- Run the full memory stack locally. Requires Docker and ~4GB disk for models.
+ - Docker + Docker Compose v2
+ - Ollama installed on the host (https://ollama.com)
+ - A pulled embedding model: `ollama pull nomic-embed-text`

- ### 1. Set up
+ If you'll run Claude Code (or anything else) inside a Docker container that needs to reach the engine, **make Ollama listen on all interfaces** so containers can reach it via `host.docker.internal`:

  ```bash
- npx @pentatonic-ai/ai-agent-sdk memory
+ sudo mkdir -p /etc/systemd/system/ollama.service.d
+ echo -e '[Service]\nEnvironment="OLLAMA_HOST=0.0.0.0:11434"' \
+ | sudo tee /etc/systemd/system/ollama.service.d/override.conf
+ sudo systemctl daemon-reload
+ sudo systemctl restart ollama
  ```

- This starts PostgreSQL + pgvector, Ollama, and the memory server. It pulls embedding and chat models, and writes the local config.
-
- ### 2. Install the Claude Code plugin
+ **Bring up the engine**

+ ```bash
+ git clone https://github.com/Pentatonic-Ltd/ai-agent-sdk.git
+ cd ai-agent-sdk/packages/memory-engine
+
+ # Default .env points at Ollama on the host. Edit if your Ollama is
+ # elsewhere or you want to use a higher-quality model (e.g. mxbai-embed-large
+ # at 1024d → set EMBED_DIM=1024 and EMBED_MODEL_NAME=mxbai-embed-large).
+ cat > .env <<'EOF'
+ PME_NV_EMBED_ENABLED=false
+ NV_EMBED_URL=http://host.docker.internal:11434/v1/embeddings
+ EMBED_MODEL_NAME=nomic-embed-text
+ EMBED_DIM=768
+ OLLAMA_DIM=768
+ PME_OLLAMA_URL=http://host.docker.internal:11434/api/embeddings
+ PME_EMBED_MODEL=nomic-embed-text
+ L5_OLLAMA_EMBED_URL=http://host.docker.internal:11434/api/embed
+ L5_OLLAMA_EMBED_MODEL=nomic-embed-text
+ PME_HYDE_ENABLED=false
+ PME_RERANK_ENABLED=true
+ PME_PORT=8099
+ CLIENT_ID=local
+ NEO4J_AUTH=neo4j/local-dev-pw
+ NEO4J_PASSWORD=local-dev-pw
+ EOF
+
+ docker compose up -d --scale nv-embed=0
  ```
- /plugin marketplace add Pentatonic-Ltd/ai-agent-sdk
- /plugin install tes-memory@pentatonic-ai
+
+ First run pulls images and builds engine containers — ~10–15 min. Subsequent restarts are seconds.
+
+ **Verify**
+
+ ```bash
+ curl -s http://localhost:8099/health | jq
+ # Status should be "ok" or "degraded" with most layers reporting ok.
+
+ curl -sX POST http://localhost:8099/store \
+ -H "content-type: application/json" \
+ -d '{"content":"hello memory","metadata":{"arena":"local"}}' | jq
+
+ curl -sX POST http://localhost:8099/search \
+ -H "content-type: application/json" \
+ -d '{"query":"hello","limit":3,"min_score":0.001}' | jq
  ```

- That's it. The plugin hooks automatically search memories on every prompt and store every conversation turn. Fully local, fully private.
+ If `/search` returns the row from `/store`, the engine is live.

- ### What you get
+ **Connect Claude Code**

- - **Automatic memory** -- every conversation turn is stored with embeddings and HyDE query expansion
- - **Semantic search** -- multi-signal retrieval combining vector similarity, BM25 full-text, recency decay, and access frequency
- - **Memory layers** -- episodic (recent), semantic (consolidated), procedural (how-to), working (temporary)
- - **Distilled memory** -- a background LLM pass extracts atomic facts from each raw turn and stores each as its own node in the semantic layer, linked back to the source. A query like *"what does Phil drink?"* matches *"Phil drinks cortado"* more reliably than a mixed paragraph covering food, drinks, and hobbies. Default-on; the raw turn is still preserved.
- - **Decay and consolidation** -- memories fade over time; frequently accessed ones get promoted
+ The `tes-memory` plugin's hooks already speak the engine's wire format. Three steps:

- > **Store latency note (v0.5.4+):** on the local memory server, `store_memory` now awaits distillation before returning instead of running it fire-and-forget. This fixed a bug where distillation was being killed mid-flight (atoms never got embeddings, so they were unreachable by semantic search), but it means stores now take as long as your configured LLM takes to produce atoms — typically 5–30s on `llama3.2:3b`, up to the `chat()` timeout ceiling (60s default, overridable via `opts.timeout`). Cloudflare Worker deployments pass `ctx.waitUntil` and still return fast. Set `opts.distill: false` on the ingest call if you want the old fast-return behaviour at the cost of no atoms.
+ 1. Install the plugin (once):
+ ```
+ /plugin marketplace add Pentatonic-Ltd/ai-agent-sdk
+ /plugin install tes-memory@pentatonic-ai
+ ```
+ 2. Point it at your local engine. Edit `~/.claude-pentatonic/tes-memory.local.md` (create if missing):
+ ```yaml
+ ---
+ mode: local
+ memory_url: http://localhost:8099
+ ---
+ ```
+ 3. Reload: `/reload-plugins` (or restart Claude Code if status reports stale state — MCP server processes need a full restart to pick up plugin updates).

- ### Change models
+ Verify:
+
+ ```
+ /tes-memory:tes-status
+ ```
+
+ Should report `✓ Connected to local memory engine`. Now every prompt auto-searches engine memory and every turn auto-stores. The footer `🧠 Matched N memories from Pentatonic Memory` shows hits.
+
+ **Seed memory from your codebase or docs (optional)**
+
+ Avoid the cold-start problem on day one by pre-populating the engine with references to your code/docs:

  ```bash
- EMBEDDING_MODEL=mxbai-embed-large LLM_MODEL=qwen2.5:7b npx @pentatonic-ai/ai-agent-sdk memory
+ MEMORY_ENGINE_URL=http://localhost:8099 \
+ npx @pentatonic-ai/ai-agent-sdk ingest ~/code/my-project
  ```

- ### Raspberry Pi
+ References mode is the default — it stores path + signature pointers, not full file contents. See [Repository Onboarding](#repository-onboarding-corpus-ingest) for details.
+
+ **Tuning**
+
+ Change embedding model: pull a different one, edit `EMBED_MODEL_NAME` + `EMBED_DIM` in `.env`, then `docker compose down -v && docker compose up -d --scale nv-embed=0` (the `-v` is required because Milvus collections are dim-locked at creation; switching dims means recreating).
+
+ | Model | Dim | Notes |
+ |---|---|---|
+ | `nomic-embed-text` (default) | 768 | Smallest; works on any laptop |
+ | `mxbai-embed-large` | 1024 | Better recall; ~600 MB download |
+ | `nv-embed-v2` (via gateway) | 4096 | Production-grade; needs a hosted endpoint or GPU |
+
+ ### Hosted (cloud)

- Pi 5 with 8GB RAM runs the full stack. `nomic-embed-text` (~300MB) + `llama3.2:3b` (~2GB) leaves plenty of headroom.
+ Run on Pentatonic's infrastructure. NV-Embed-v2 (4096d) embeddings via the AI gateway, managed Postgres/Neo4j/Qdrant/Milvus, dashboard. The engine still ships in this repo; hosted mode just deploys it for you.
+
+ ```bash
+ # 1. Get a TES account
+ npx @pentatonic-ai/ai-agent-sdk login
+
+ # 2. Install the SDK
+ npm install @pentatonic-ai/ai-agent-sdk
+ # or: pip install pentatonic-ai-agent-sdk
+ ```
+
+ Memory operations route through TES → engine. No client-side change between local and hosted.

  ### Use as a library

  ```javascript
- import { createMemorySystem } from '@pentatonic-ai/ai-agent-sdk/memory';
+ import { engineAdapter, ingestCorpus } from '@pentatonic-ai/ai-agent-sdk/memory/corpus';

- const memory = createMemorySystem({
- db: pgPool,
- embedding: { url: 'http://localhost:11434/v1', model: 'nomic-embed-text' },
- llm: { url: 'http://localhost:11434/v1', model: 'llama3.2:3b' },
+ const adapter = engineAdapter({
+ engineUrl: 'http://localhost:8099',
+ arena: 'my-app',
  });
-
- await memory.migrate();
- await memory.ensureLayers('my-app');
- await memory.ingest('User prefers dark mode', { clientId: 'my-app' });
- const results = await memory.search('preferences', { clientId: 'my-app' });
+ await adapter.init();
+ await adapter.ingestChunk('User prefers dark mode', { kind: 'note' });
  ```

- ## Hosted TES
+ For raw `/search` and `/store`, just `fetch()` against `${engineUrl}/search` etc. The wire format is documented in `packages/memory-engine/docs/MIGRATION.md`.
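As a sketch of what those raw calls look like from JavaScript, the helpers below build `fetch()` arguments mirroring the curl examples in the local-setup section. Only the fields shown in those examples (`content`/`metadata` for `/store`, `query`/`limit`/`min_score` for `/search`) are used; anything else would be an assumption.

```javascript
// Build fetch() arguments for the engine's raw wire format. Payload
// fields mirror the curl examples in this README; no extra fields assumed.
function storeRequest(engineUrl, content, metadata = {}) {
  return [
    `${engineUrl}/store`,
    {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ content, metadata }),
    },
  ];
}

function searchRequest(engineUrl, query, limit = 3, minScore = 0.001) {
  return [
    `${engineUrl}/search`,
    {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ query, limit, min_score: minScore }),
    },
  ];
}

// usage: await fetch(...storeRequest("http://localhost:8099", "hello memory"));
//        await fetch(...searchRequest("http://localhost:8099", "hello"));
```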
 
- Connect to Pentatonic's hosted infrastructure for production use.
+ ---

- ### 1. Create an account
+ ## Observability

- ```bash
- npx @pentatonic-ai/ai-agent-sdk init
- ```
+ Wrap your LLM client and every call automatically emits a `CHAT_TURN` event to TES — input/output tokens, tool calls, model, latency, content. Events flow into the TES dashboard, where you get session metrics, search attribution, dead-end detection, and full-text + semantic search across conversations.

- This walks you through account creation, email verification, and API key generation. You'll get:
+ Observability requires a TES account (hosted or self-hosted Pentatonic platform). Events have nowhere to go without one.

- ```
- TES_ENDPOINT=https://your-company.api.pentatonic.com
- TES_CLIENT_ID=your-company
- TES_API_KEY=tes_your-company_xxxxx
- ```
+ ### Wrap your LLM client

- ### 2. Install
+ **JavaScript**

- ```bash
- npm install @pentatonic-ai/ai-agent-sdk
+ ```js
+ import { TESClient } from "@pentatonic-ai/ai-agent-sdk";
+
+ const tes = new TESClient({
+ clientId: process.env.TES_CLIENT_ID,
+ apiKey: process.env.TES_API_KEY,
+ endpoint: process.env.TES_ENDPOINT,
+ });
+
+ const ai = tes.wrap(new OpenAI(), { sessionId: "conv-123" });
+ const result = await ai.chat.completions.create({
+ model: "gpt-4o",
+ messages: [{ role: "user", content: "Hello!" }],
+ });
  ```

- ```bash
- pip install pentatonic-ai-agent-sdk
+ **Python**
+
+ ```python
+ from pentatonic_agent_events import TESClient
+
+ tes = TESClient(
+ client_id=os.environ["TES_CLIENT_ID"],
+ api_key=os.environ["TES_API_KEY"],
+ endpoint=os.environ["TES_ENDPOINT"],
+ )
+
+ ai = tes.wrap(OpenAI(), session_id="conv-123")
+ result = ai.chat.completions.create(
+ model="gpt-4o",
+ messages=[{"role": "user", "content": "Hello!"}],
+ )
  ```

- ### What you get (in addition to local features)
+ ### Supported providers

- - **Higher-dimensional embeddings** -- NV-Embed-v2 (4096d) for better retrieval accuracy
- - **Conversation analytics** -- session metrics, search attribution, dead-end detection
- - **Team-wide shared memory** -- semantic search across your team's AI interactions
- - **Admin dashboard** -- visualize conversations, token usage, and memory explorer
- - **Multi-tenancy** -- isolated databases per client
+ | Provider | Detection | Intercepted Method |
+ |----------|-----------|-------------------|
+ | OpenAI | `client.chat.completions.create` | `chat.completions.create()` |
+ | Anthropic | `client.messages.create` | `messages.create()` |
+ | Workers AI | `client.run` (JS only) | `run()` |

- ## Claude Code Plugin
+ All other methods pass through unchanged.
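The wrap-and-pass-through contract can be illustrated with a few lines of generic interception code. This is not the SDK's implementation, and the emitted event shape is invented for the example; it only shows the pattern: one known method is timed and reported, everything else on the client is left untouched.

```javascript
// Generic method-interception sketch: replace one async method with a
// timed wrapper that reports an event, and leave every other property
// of the client alone. Event fields here are illustrative, not the
// SDK's CHAT_TURN schema.
function instrument(client, methodName, onEvent) {
  const original = client[methodName].bind(client);
  client[methodName] = async (...args) => {
    const start = Date.now();
    const result = await original(...args);
    onEvent({ method: methodName, latencyMs: Date.now() - start });
    return result; // caller sees the untouched return value
  };
  return client; // all other properties pass through unchanged
}

// usage sketch with a fake client:
const events = [];
const fake = {
  run: async (input) => ({ echo: input }),
  other: () => "untouched",
};
instrument(fake, "run", (e) => events.push(e));
```

A real wrapper additionally walks nested paths such as `chat.completions.create` and attaches provider-specific token accounting, but the forwarding behavior is the same.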
 
- Works with both local and hosted setups. Install once, switch modes via config.
+ ---

- ### Install via marketplace
+ ## Plugins
+
+ If you use Claude Code or OpenClaw, the plugin gives you both products at once — every conversation turn is captured (observability) AND searched/stored as memory. No SDK glue to write.
+
+ ### Claude Code
+
+ Works with both local and hosted memory. Install once, switch modes via config.

  ```
  /plugin marketplace add Pentatonic-Ltd/ai-agent-sdk
  /plugin install tes-memory@pentatonic-ai
  ```

- ### Set up
+ **Local engine** — bring up the engine first ([Memory > Local](#local-self-hosted)), then point the plugin at it. Edit `~/.claude-pentatonic/tes-memory.local.md`:

- For hosted TES:
- ```
- /tes-memory:tes-setup
+ ```yaml
+ ---
+ mode: local
+ memory_url: http://localhost:8099
+ ---
  ```

- For local memory:
+ **Hosted TES** — run `login` once, the plugin auto-discovers `~/.config/tes/credentials.json`:
+
  ```bash
- npx @pentatonic-ai/ai-agent-sdk memory
+ npx @pentatonic-ai/ai-agent-sdk login
  ```

- ### What it tracks
-
- - **Every conversation turn** -- user messages, assistant responses, tool calls, duration
- - **Automatic memory search** -- relevant memories injected as context on every prompt
- - **Automatic memory storage** -- every turn stored with embeddings and HyDE queries
- - **Token usage** -- input, output, cache read, cache creation tokens per turn
+ Either way, verify with `/tes-memory:tes-status` in Claude Code. The plugin's MCP server, hooks, and tools all read the same config.

- ## OpenClaw Plugin
+ **What it tracks (auto, every turn):**
+ - Memory search at prompt time — relevant memories injected as context
+ - Memory store at turn end — every conversation turn persisted
+ - Token usage — input, output, cache read, cache creation tokens per turn

- Works with both local and hosted setups. Just tell OpenClaw to set it up.
-
- ### Install
+ ### OpenClaw

  ```bash
  openclaw plugins install @pentatonic-ai/openclaw-memory-plugin
  ```

- ### Set up
-
- Tell OpenClaw:
+ Then tell OpenClaw:

  ```
  Set up pentatonic memory
@@ -194,18 +349,7 @@ Or use the CLI directly:
  openclaw pentatonic-memory local
  ```

- ### What it does
-
- OpenClaw's context engine hooks fire on every lifecycle event:
-
- - **Ingest** -- every user and assistant message is stored with embeddings and HyDE query expansion, then distilled into atomic facts in the background (see [Distilled memory](#what-you-get))
- - **Assemble** -- relevant memories are injected as system prompt context before every model run
- - **Compact** -- decay cycle runs when the context window fills
- - **After turn** -- high-access memories get consolidated to the semantic layer
-
- Plus agent-callable tools: `memory_search`, `memory_store`, `memory_layers`.
-
- ### Configuration
+ **What it does:** OpenClaw's context engine hooks fire on every lifecycle event — `ingest` stores user/assistant messages via the engine's `/store` endpoint (BM25 + vector + KG indexing in parallel); `assemble` calls `/search` to inject relevant memories as system-prompt context; `compact` and `after-turn` are managed by the engine's own decay/consolidation. Plus agent-callable tools: `memory_search`, `memory_store`, `memory_layers`.

  After setup, config lives in `~/.openclaw/pentatonic-memory.json`. To switch modes, run setup again or edit directly.

@@ -219,11 +363,7 @@ You can also configure via `openclaw.json`:
  "pentatonic-memory": {
  "enabled": true,
  "config": {
- "database_url": "postgres://memory:memory@localhost:5433/memory",
- "embedding_url": "http://localhost:11435/v1",
- "embedding_model": "nomic-embed-text",
- "llm_url": "http://localhost:11435/v1",
- "llm_model": "llama3.2:3b"
+ "memory_url": "http://localhost:8099"
  }
  }
  }
@@ -241,57 +381,80 @@ For hosted mode, replace the config block with:
  }
  }

- ## SDK: Wrap Your LLM Client
+ ---

- **JavaScript**
+ ## Repository Onboarding (corpus ingest)

- ```js
- import { TESClient } from "@pentatonic-ai/ai-agent-sdk";
+ The memory layer starts empty. To avoid the cold-start problem where retrieval has nothing useful to return for the first days of use, you can ingest your repos (or any folder of docs) on day one:

- const tes = new TESClient({
- clientId: process.env.TES_CLIENT_ID,
- apiKey: process.env.TES_API_KEY,
- endpoint: process.env.TES_ENDPOINT,
- });
+ ```bash
+ # Interactive — picks paths, shows a cost preview, ingests, offers
+ # to install a git post-commit hook so memory stays current
+ npx @pentatonic-ai/ai-agent-sdk onboard

257
- const ai = tes.wrap(new OpenAI(), { sessionId: "conv-123" });
258
- const result = await ai.chat.completions.create({
259
- model: "gpt-4o",
260
- messages: [{ role: "user", content: "Hello!" }],
261
- });
395
+ # One-shot ingest of a single path
396
+ npx @pentatonic-ai/ai-agent-sdk ingest ~/code/my-app
397
+ npx @pentatonic-ai/ai-agent-sdk ingest ~/Documents/design-notes # any folder works
398
+
399
+ # See what's tracked and how big the corpus is
400
+ npx @pentatonic-ai/ai-agent-sdk status
401
+
402
+ # Delta-resync everything that's tracked (or one path)
403
+ npx @pentatonic-ai/ai-agent-sdk resync
404
+
405
+ # Manage the tracked-paths list
406
+ npx @pentatonic-ai/ai-agent-sdk corpus list
407
+ npx @pentatonic-ai/ai-agent-sdk corpus remove ~/code/old-project
408
+ npx @pentatonic-ai/ai-agent-sdk corpus reset
262
409
  ```
263
410
 
264
- **Python**
411
+ Tenant credentials come from env vars (`TES_ENDPOINT`, `TES_CLIENT_ID`, `TES_API_KEY`) or `~/.config/tes/credentials.json` if you used `npx @pentatonic-ai/ai-agent-sdk login`. To point at a TES instance running on `localhost`, set `TES_ENDPOINT=http://localhost:8788`.
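For a local setup, the three variables above can be exported together — the values here are placeholders, not real credentials:

```shell
# Illustrative placeholder values — substitute your own tenant credentials
export TES_ENDPOINT=http://localhost:8788   # local TES instance
export TES_CLIENT_ID=my-client-id
export TES_API_KEY=my-api-key
```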
265
412
 
266
- ```python
267
- from pentatonic_agent_events import TESClient
413
+ ### What gets stored: references, not content
268
414
 
269
- tes = TESClient(
270
- client_id=os.environ["TES_CLIENT_ID"],
271
- api_key=os.environ["TES_API_KEY"],
272
- endpoint=os.environ["TES_ENDPOINT"],
273
- )
415
+ By default, ingest stores **pointers to source content** (path + line range + a short signature/summary), not full chunk content. Per-language strategies:
274
416
 
275
- ai = tes.wrap(OpenAI(), session_id="conv-123")
276
- result = ai.chat.completions.create(
277
- model="gpt-4o",
278
- messages=[{"role": "user", "content": "Hello!"}],
279
- )
280
- ```
417
+ - **Markdown** — one reference per H1/H2 section
418
+ - **JS / TS** — one per top-level `function` / `class` / `const` / `export`
419
+ - **Python** — one per top-level `def` / `class`
420
+ - **JSON / YAML** — collapsed top-level keys
421
+ - **Other** — single file-level reference
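As a rough illustration — the field names below are hypothetical, not the SDK's actual schema — a stored reference for a TypeScript function might carry a path, a line range, and a short signature/summary:

```json
{
  "kind": "corpus_ref",
  "path": "src/auth/session.ts",
  "lines": [42, 87],
  "signature": "export async function refreshSession(token: string): Promise<Session>",
  "summary": "Rotates the session token and persists the new expiry."
}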
281
422
 
282
- ## Supported Providers
423
+ Why pointers? **Code mutates between ingests.** Embedded chunks of old source rot silently — the LLM keeps confidently citing functions you've since rewritten, with retrieval evidence to back it up. Pointers rot loudly: when a file moves or changes, `Read` fails or returns different content, and the agent observes and adjusts. Stale-but-confident is the worst class of memory bug; loud-and-self-correcting is qualitatively better for source code.
283
424
 
284
- | Provider | Detection | Intercepted Method |
285
- |----------|-----------|-------------------|
286
- | OpenAI | `client.chat.completions.create` | `chat.completions.create()` |
287
- | Anthropic | `client.messages.create` | `messages.create()` |
288
- | Workers AI | `client.run` (JS only) | `run()` |
425
+ It also means proprietary source never leaves your machine — only the index (path + summary) is sent to the hosted TES, and the agent reads actual file contents at query time on its own.
289
426
 
290
- All other methods pass through unchanged.
427
+ If you need a self-contained index (e.g. for air-gapped retrieval where the source isn't available at query time), opt into legacy chunk-content storage by passing `mode: "content"` to `ingestCorpus` when using the SDK as a library.
428
+
429
+ ### What gets ingested, what doesn't
430
+
431
+ Any folder works — git is not required. The walker honors `.gitignore` and `.tesignore` if present, plus a hard-exclude list for secrets and credentials that **cannot be overridden** even with `!pattern` rules:
432
+
433
+ - `.env*` (any environment file)
434
+ - `*.pem`, `*.key`, `*.crt`, `*.p12`, `*.pfx`, `*.jks`
435
+ - `id_rsa`, `id_ed25519`, `id_ecdsa`, `id_dsa` (SSH private keys)
436
+ - `.ssh/`, `.aws/`, `.gcp/`, `.azure/` (whole directories)
437
+ - `.npmrc`, `.pypirc`, `.netrc`
438
+ - `secrets/`, `credentials/`, `service-account.*`
439
+ - `*_secret*`, `*_token*`, `*_password*`
440
+
441
+ Plus directory-level skips: `.git`, `node_modules`, `dist`, `build`, `.next`, `venv`, `__pycache__`, `target`, `.terraform`, etc. And extension skips for binaries, lockfiles, and minified output. Files larger than 512 KB are skipped by default (override with adapter options if you need to).
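A `.tesignore` uses gitignore-style pattern syntax. For example, to keep an archive folder and generated snapshots out of the index while re-including one file (remembering that `!` rules still cannot resurrect the hard-excluded secret patterns above):

```
# .tesignore — gitignore-style patterns
docs/archive/
**/*.snap
!docs/archive/README.md
```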
442
+
443
+ ### How it stays current
444
+
445
+ For git repos, accepting the prompt during `onboard` installs a post-commit hook at `.git/hooks/post-commit` that re-ingests files changed in each commit. The hook is non-fatal — it never blocks a commit. Install manually any time with:
446
+
447
+ ```bash
448
+ npx @pentatonic-ai/ai-agent-sdk install-git-hook
449
+ ```
450
+
451
+ For non-git folders, re-run `ingest` or `resync` whenever the source changes. Re-ingest is cheap: the SDK keeps a content-hash per file and skips anything that hasn't changed since the last run.
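The skip logic is conceptually just a per-file content hash compared against the previous run. A minimal sketch of the idea (not the SDK's actual implementation):

```javascript
import { createHash } from "node:crypto";
import { readFileSync, writeFileSync, mkdtempSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";

// Returns true when a file's content differs from what `seen` recorded
// last run, updating `seen` so the next call can skip it.
function needsReingest(absPath, seen) {
  const hash = createHash("sha256").update(readFileSync(absPath)).digest("hex");
  if (seen.get(absPath) === hash) return false; // unchanged → skip
  seen.set(absPath, hash);
  return true;
}

// Demo against a throwaway file
const dir = mkdtempSync(join(tmpdir(), "tes-demo-"));
const file = join(dir, "a.txt");
const seen = new Map();
writeFileSync(file, "v1");
console.log(needsReingest(file, seen)); // true — first sight
console.log(needsReingest(file, seen)); // false — unchanged
writeFileSync(file, "v2");
console.log(needsReingest(file, seen)); // true — content changed
```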
452
+
453
+ ---
291
454
 
292
455
  ## API Reference
293
456
 
294
- ### `TESClient(config)`
457
+ ### `TESClient(config)` — Observability
295
458
 
296
459
  | Param | Type | Default | Description |
297
460
  |-------|------|---------|-------------|
@@ -329,6 +492,14 @@ import { normalizeResponse } from "@pentatonic-ai/ai-agent-sdk";
329
492
  const { content, model, usage, toolCalls } = normalizeResponse(openaiResponse);
330
493
  ```
331
494
 
495
+ ### `engineAdapter(config)` — Memory
496
+
497
+ Thin HTTP client for the memory engine. `config = { engineUrl, arena, apiKey? }`. Returns `{ ingestChunk(content, metadata), deleteByCorpusFile(repoAbs, relPath), init() }`. See [Use as a library](#use-as-a-library).
498
+
499
+ For raw `/store` / `/search` calls, just `fetch()` against `${engineUrl}` directly — the wire format is documented in `packages/memory-engine/docs/MIGRATION.md`.
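A raw call can be sketched with a small request builder — the payload shape below is an illustrative assumption, so check `packages/memory-engine/docs/MIGRATION.md` for the real wire format:

```javascript
// Hypothetical payload shape for illustration only — see MIGRATION.md
// for the engine's actual /store body.
function storeRequest(engineUrl, arena, content, metadata = {}) {
  return {
    url: `${engineUrl}/store`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ arena, content, metadata }),
    },
  };
}

// Usage (requires a running engine):
//   const { url, init } = storeRequest("http://localhost:8099", "default", "note text");
//   const res = await fetch(url, init);
console.log(storeRequest("http://localhost:8099", "default", "hi").url); // http://localhost:8099/store
```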
500
+
501
+ ---
502
+
332
503
  ## Health Checks (`doctor`)
333
504
 
334
505
  Run a full health check of your SDK install at any time:
@@ -337,9 +508,7 @@ Run a full health check of your SDK install at any time:
337
508
  npx @pentatonic-ai/ai-agent-sdk doctor
338
509
  ```
339
510
 
340
- `doctor` auto-detects which install path you're on (Local Memory, Hosted
341
- TES, or self-hosted Pentatonic platform) and runs only the checks that
342
- apply. Exit code is `0` for all-clear, `1` for warnings, `2` for critical.
511
+ `doctor` auto-detects which install path you're on (Local Memory, Hosted TES, or self-hosted Pentatonic platform) and runs only the checks that apply. Exit code is `0` for all-clear, `1` for warnings, `2` for critical.
343
512
 
344
513
  Common flags:
345
514
 
@@ -353,17 +522,13 @@ npx @pentatonic-ai/ai-agent-sdk doctor --path local
353
522
  What gets checked:
354
523
 
355
524
  - **Universal** — Node version, disk space, SDK config-file permissions
356
- - **Local Memory** — Postgres + pgvector + migrations, embedding/LLM
357
- endpoints, memory server port
525
+ - **Local engine** — engine `/health`, per-layer health (L0–L6), embedding endpoint reachability
358
526
  - **Hosted TES** — endpoint reachable, API key authenticates
359
- - **Self-hosted platform** — HybridRAG, Qdrant, Neo4j, vLLM (each
360
- optional, skipped when its env var is unset)
527
+ - **Plugin config** — `tes-memory.local.md` parses, `memory_url` reachable
361
528
 
362
529
  ### Plugins
363
530
 
364
- Drop a `.mjs` file into `~/.config/pentatonic-ai/doctor-plugins/` to add
365
- your own checks. Useful for app-specific things — internal APIs, ingest
366
- freshness, custom infrastructure — without forking the SDK.
531
+ Drop a `.mjs` file into `~/.config/pentatonic-ai/doctor-plugins/` to add your own checks. Useful for app-specific things — internal APIs, ingest freshness, custom infrastructure — without forking the SDK.
367
532
 
368
533
  ```js
369
534
  // ~/.config/pentatonic-ai/doctor-plugins/my-app.mjs
@@ -384,32 +549,38 @@ export default {
384
549
  };
385
550
  ```
386
551
 
387
- See [`packages/doctor/README.md`](packages/doctor/README.md) for the full
388
- plugin contract and programmatic API.
552
+ See [`packages/doctor/README.md`](packages/doctor/README.md) for the full plugin contract and programmatic API.
553
+
554
+ ---
389
555
 
390
556
  ## Architecture
391
557
 
392
558
  ```
393
- +-------------------+ +-------------------+
394
- | Claude Code Plugin| | OpenClaw Plugin |
395
- | (hooks: auto- | | (context engine: |
396
- | search + store) | | ingest, assemble, |
397
- +--------+----------+ | compact, tools) |
398
- | +--------+----------+
399
- | |
400
- +------------+------------+
401
- |
402
- +-----------+-----------+
403
- | |
404
- Local Memory Hosted TES
405
- (Docker) (Cloud)
406
- | |
407
- +----+----+----+ +---+----+---+
408
- | | | | | | | |
409
- PG Ollama MCP HTTP PG R2 Queue Workers
410
- pgvector API pgvector Modules
559
+ Your code / Claude Code plugin / OpenClaw plugin
560
+ |
561
+ +-------------------+--------------------+
562
+ | |
563
+ Memory product Observability product
564
+ (engine HTTP API) (TESClient.wrap)
565
+ | |
566
+ | POST /store /search /forget | CHAT_TURN events
567
+ ▼ ▼
568
+ +----------------+ +-----------------+
569
+ | memory engine | | TES |
570
+ | (compat shim) | | (Cloudflare) |
571
+ +----------------+ | Workers, R2, |
572
+ | | Queues, Pages |
573
+ +----------+----------+ +--------+--------+
574
+ | | |
575
+ Local Hosted ---------------------------+
576
+ (your machine) (Pentatonic-managed)
577
+ | |
578
+ docker compose AWS/GCP container cluster
579
+ + host Ollama + AI gateway (NV-Embed-v2)
411
580
  ```
412
581
 
582
+ Plugins (Claude Code, OpenClaw) are lightweight integrations on top of both products — they call into memory and emit observability events on the user's behalf.
583
+
413
584
  ## License
414
585
 
415
586
  MIT