pikuri-memory 0.0.4 → 0.0.5
- checksums.yaml +4 -4
- data/README.md +96 -13
- data/docker/README.md +50 -0
- data/docker/docker-compose.yml +113 -0
- data/docker/qdrant-default-config.patch +24 -0
- data/lib/pikuri/memory/extension.rb +293 -0
- data/lib/pikuri/memory/mem0_client.rb +264 -0
- data/lib/pikuri/memory/mem0_server.rb +551 -0
- data/lib/pikuri/memory/recall.rb +107 -0
- data/lib/pikuri/memory/record.rb +72 -0
- data/lib/pikuri/memory/recorder.rb +134 -0
- data/lib/pikuri-memory.rb +78 -5
- data/prompts/memory-extraction.txt +44 -0
- data/prompts/pikuri-memory.txt +7 -0
- metadata +50 -12
@@ -0,0 +1,551 @@
+# frozen_string_literal: true
+
+require 'faraday'
+require 'fileutils'
+require 'pathname'
+require 'uri'
+
+module Pikuri
+  module Memory
+    # Supervisor for a self-managed mem0 + Qdrant sidecar. Pairs with
+    # {Mem0Client}: this class owns the *stack* (clone + patch the mem0
+    # source, +docker compose up --build+, heartbeat-poll, tear down);
+    # {Mem0Client} owns the HTTP client that talks to it. {#client}
+    # returns a {Mem0Client} pre-pointed at the running server.
+    #
+    # Same split, and the same "let pikuri manage it" vs "bring your own"
+    # choice, as +pikuri-vectordb+'s {Pikuri::VectorDb::Server::Chroma} /
+    # {Pikuri::VectorDb::Backend::Chroma}. A host already running mem0
+    # elsewhere skips this class and wires {Mem0Client.new(endpoint:)}
+    # directly.
+    #
+    # == Why a compose stack, not a single +docker run+
+    #
+    # Unlike Chroma (one clean published image), mem0's REST server has
+    # no usable published image — the one on Docker Hub is stale
+    # (pre-v3) — so it is *built from source*, and it needs Qdrant as a
+    # second container. Two interdependent services with a build step is
+    # exactly what compose models, and it is mem0 upstream's own blessed
+    # run path. So this supervisor wraps +docker compose+ (through
+    # {Pikuri::Subprocess.spawn}) over the compose file shipped in the
+    # gem's +docker/+ directory, rather than reimplementing service
+    # ordering by hand.
+    #
+    # == No Postgres; Qdrant, patched in at build
+    #
+    # The upstream server's +DEFAULT_CONFIG+ hardcodes the **pgvector**
+    # vector store, whose mem0 provider has a top-k inversion bug (cosine
+    # *distance* ranked as a *similarity* — DESIGN.md §"Root cause: the
+    # pgvector top-k inversion"), *and* whose provider connects to
+    # Postgres eagerly at boot.
+    # {#prepare_checkout!} applies +docker/qdrant-default-config.patch+ to
+    # the pinned checkout, swapping that default to **Qdrant** (env-driven
+    # host/port/dims). Result: correct nearest-first ranking *and* no
+    # Postgres in the stack — verified end-to-end (server boots
+    # Postgres-free; add + search through the REST API rank correctly
+    # against the local llama.cpp router).
+    #
+    # == Local LLM + embedder via the router
+    #
+    # mem0's bundled provider validation accepts only +openai+ /
+    # +anthropic+ / +gemini+ for the LLM and +openai+ / +gemini+ for the
+    # embedder, so the local path keeps +provider: "openai"+ and points
+    # +OPENAI_BASE_URL+ at the llama.cpp router. The extraction model must
+    # be **non-reasoning** (e.g. +Qwen2.5-7B-Instruct+): a thinking model
+    # burns its budget on CoT and returns empty/truncated JSON
+    # (DESIGN.md §"Extraction model decision"). The vector
+    # store is not bundle-gated, so Qdrant is accepted.
+    #
+    # == The router relay (container → host loopback)
+    #
+    # The extraction LLM + embedder calls originate *inside* the mem0
+    # container, but the llama.cpp router conventionally binds the host's
+    # +127.0.0.1:8080+ — an address rootless docker deliberately refuses
+    # to route containers to (+--disable-host-loopback+; re-enabling it
+    # daemon-wide would hand *every* container, including deliberately
+    # untrusted MCP containers, a path to *every* loopback-bound service
+    # on the host: CUPS, trust-auth Postgres, an unauthenticated Redis…).
+    # Instead of punching that hole, the supervisor builds a scoped one:
+    #
+    # 1. The host side runs +socat UNIX-LISTEN:<sock_dir>/router.sock,…
+    #    TCP:<router host:port>+ — spawned via {Pikuri::Subprocess.spawn}
+    #    as a daemon child (never +#wait+ed; stopped with
+    #    +Subprocess#terminate+, and swept by the exit reaper as a
+    #    backstop).
+    # 2. The compose stack's +router-proxy+ sidecar (a pinned
+    #    {SOCAT_IMAGE}, ~5 MB) bind-mounts the socket *directory* and
+    #    relays it back onto the stack-internal network as
+    #    +http://router-proxy:8080+.
+    # 3. mem0's +OPENAI_BASE_URL+ points at that sidecar.
+    #
+    # The socket *file* is the capability: only a container that mounts
+    # it can reach the router, the host's loopback stays sealed for
+    # everything else, and the same wiring works identically on rootful
+    # and rootless daemons — so there is no daemon-flavour special case.
+    # The relay speaks plain TCP, so +router_url+ must be +http://+; an
+    # +https+ router needs +container_router_url:+ (which bypasses the
+    # relay and is then the caller's routing problem).
+    #
+    # == Pinned + patched checkout
+    #
+    # The mem0 server is built from {MEM0_REF} (a pinned commit), cloned
+    # into the cache dir once and reused. The patch is applied with
+    # +git apply+ and is **fail-loud**: if it neither applies cleanly nor
+    # is already applied, {#prepare_checkout!} raises rather than building
+    # a wrong image (same discipline as the no-think build patch in
+    # DESIGN.md §"The no-think patch (fallback): verified live").
+    #
+    # == Bind 127.0.0.1
+    #
+    # The shipped compose publishes both ports on +127.0.0.1+ only — the
+    # user's memory never listens on a routable interface, same posture as
+    # {Pikuri::VectorDb::Server::Chroma}.
+    #
+    # == Subprocess seam
+    #
+    # Every +git+ / +docker+ invocation routes through
+    # {Pikuri::Subprocess.spawn} per the subprocess seam. Errors at boot
+    # (missing docker/git, build failure, healthcheck timeout) raise
+    # +RuntimeError+ with the offending output; teardown failures are
+    # logged, not raised.
+    class Mem0Server
+      LOGGER = Pikuri.logger_for('Memory::Mem0Server')
+
+      # @return [String] mem0 git remote the server is built from.
+      MEM0_REPO_URL = 'https://github.com/mem0ai/mem0.git'
+
+      # @return [String] pinned mem0 commit the image is built from
+      #   (library 2.0.4, v3 token-efficient algorithm). Bumping this is
+      #   how the mem0 version is upgraded — and the shipped patch must be
+      #   regenerated against the new ref if +DEFAULT_CONFIG+ moved.
+      MEM0_REF = 'a3154d59e52386d4e1189c1f5f44819868f76514'
+
+      # @return [String] compose project name. Prefix +pikuri-internal-+
+      #   is the namespace pikuri squats for self-managed infra (same
+      #   convention as {Pikuri::VectorDb::Server::Chroma}).
+      COMPOSE_PROJECT = 'pikuri-internal-mem0'
+
+      # @return [String] absolute path to the shipped compose file.
+      COMPOSE_FILE = File.expand_path('../../../docker/docker-compose.yml', __dir__)
+
+      # @return [String] absolute path to the shipped DEFAULT_CONFIG
+      #   pgvector→qdrant patch.
+      PATCH_FILE = File.expand_path('../../../docker/qdrant-default-config.patch', __dir__)
+
+      # @return [Integer] default host port mem0's REST API binds (127.0.0.1).
+      DEFAULT_PORT = 8888
+
+      # @return [Integer] default host port Qdrant binds (127.0.0.1),
+      #   published for inspection only.
+      DEFAULT_QDRANT_PORT = 6333
+
+      # @return [Integer] default embedding dimension. 768 matches
+      #   +nomic-embed-text-v1.5+; must equal the embedder's output dim or
+      #   Qdrant rejects upserts.
+      DEFAULT_EMBEDDING_DIMS = 768
+
+      # @return [String] default Qdrant collection name.
+      DEFAULT_COLLECTION = 'pikuri_memory'
+
+      # @return [String] pinned Qdrant image. **Kept in lockstep
+      #   with +Pikuri::VectorDb::Server::Qdrant::IMAGE+** — same tag,
+      #   so a host running both stacks holds one image on disk. The
+      #   gems don't depend on each other, so the pin is a
+      #   convention, not a shared constant; bump both together. See
+      #   +pikuri-vectordb/DESIGN.md+ §"Verdict".
+      DEFAULT_QDRANT_IMAGE = 'qdrant/qdrant:v1.12.4'
+
+      # @return [Integer] seconds to wait for the REST API to answer after
+      #   +compose up+ returns. The first run also builds the image inside
+      #   +compose up --build+ (minutes), but that is covered by the
+      #   blocking build call, not this poll — this only covers
+      #   container-start readiness.
+      DEFAULT_HEALTHCHECK_TIMEOUT = 60
+
+      # @return [String] pinned socat image for the +router-proxy+
+      #   sidecar (see the class header's "router relay" section). A
+      #   ~5 MB single-binary image; bumped manually like
+      #   {DEFAULT_QDRANT_IMAGE}.
+      SOCAT_IMAGE = 'alpine/socat:1.8.0.3'
+
+      # @return [String] socket filename inside {#sock_dir}; the sidecar
+      #   sees it as +/sock/router.sock+.
+      RELAY_SOCKET = 'router.sock'
+
+      # @return [Integer] seconds to wait for the host-side socat to bind
+      #   the relay socket after spawn. Binding is immediate in practice;
+      #   the timeout exists to fail loud when socat dies on startup
+      #   (bad address, port typo) instead of surfacing minutes later as
+      #   silent extraction failures.
+      RELAY_BIND_TIMEOUT = 5
+
+      # Construct and immediately ensure the stack is running.
+      # Convenience factory — +new(...).tap(&:ensure_running!)+.
+      #
+      # @return [Mem0Server]
+      def self.ensure_running(**kwargs)
+        new(**kwargs).tap(&:ensure_running!)
+      end
+
+      # @param router_url [String] llama.cpp OpenAI-compatible base URL
+      #   *as seen from the host* (e.g. +"http://localhost:8080/v1"+) the
+      #   mem0 server routes its LLM + embedder calls to. Must be
+      #   +http://+ — the relay (class header) carries it into the
+      #   container, so a loopback-bound router is fine.
+      # @param llm_model [String] extraction model id on the router. Must
+      #   be **non-reasoning** (see the class header).
+      # @param embedder_model [String] embedder model id on the router.
+      # @param container_router_url [String, nil] escape hatch: router
+      #   base URL *as seen from inside the mem0 container*. When set,
+      #   the relay is not spawned and routing the container to this URL
+      #   is the caller's problem (e.g.
+      #   +"http://host.docker.internal:8080/v1"+ on a rootful daemon, or
+      #   an +https+ router on a routable address). Default +nil+ — use
+      #   the relay.
+      # @param port [Integer] host port for the REST API (127.0.0.1).
+      # @param qdrant_port [Integer] host port for Qdrant (127.0.0.1).
+      # @param embedding_dims [Integer] embedder output dimension.
+      # @param collection [String] Qdrant collection name.
+      # @param qdrant_image [String] Qdrant docker image.
+      # @param cache_dir [String, Pathname, nil] where the mem0 checkout
+      #   lives. +nil+ resolves to {#default_cache_dir}.
+      # @param healthcheck_timeout [Integer] seconds to poll readiness.
+      # @param connection [Faraday::Connection, nil] DI hook for tests.
+      # @return [Mem0Server]
+      # @raise [ArgumentError] when +router_url+ is not +http://+ and no
+      #   +container_router_url+ is given (the TCP relay can't originate
+      #   TLS).
+      def initialize(router_url:, llm_model:, embedder_model:, container_router_url: nil,
+                     port: DEFAULT_PORT, qdrant_port: DEFAULT_QDRANT_PORT,
+                     embedding_dims: DEFAULT_EMBEDDING_DIMS, collection: DEFAULT_COLLECTION,
+                     qdrant_image: DEFAULT_QDRANT_IMAGE, cache_dir: nil,
+                     healthcheck_timeout: DEFAULT_HEALTHCHECK_TIMEOUT, connection: nil)
+        @router_url = router_url
+        @container_router_url = container_router_url
+        if container_router_url.nil? && URI(router_url).scheme != 'http'
+          raise ArgumentError, "Mem0Server: the socat relay carries plain TCP, so router_url must be " \
+                               "http:// (got #{router_url.inspect}). For an https router, pass " \
+                               'container_router_url: with an address the container can route to.'
+        end
+        @llm_model = llm_model
+        @embedder_model = embedder_model
+        @port = port
+        @qdrant_port = qdrant_port
+        @embedding_dims = embedding_dims
+        @collection = collection
+        @qdrant_image = qdrant_image
+        @cache_dir = Pathname.new(cache_dir || default_cache_dir).expand_path
+        @healthcheck_timeout = healthcheck_timeout
+        @connection = connection
+        @closed = false
+        @finalizer_handle = nil
+        @relay = nil
+      end
+
+      # @return [Integer] host port the REST API binds.
+      attr_reader :port
+
+      # @return [Pathname] the mem0 source checkout (the image build
+      #   context). Under +temp/+ — it is a regenerable clone, not data.
+      def checkout_dir
+        @cache_dir.join('temp', 'git')
+      end
+
+      # @return [Pathname] host directory bind-mounted into the
+      #   containers for persistent state — +data/qdrant+ holds the
+      #   Qdrant corpus, +data/history+ the memory-history SQLite. The
+      #   containers are ephemeral; this survives them (same posture as
+      #   {Pikuri::VectorDb::Server::Chroma}).
+      def data_dir
+        @cache_dir.join('data')
+      end
+
+      # @return [Pathname] host directory holding the relay's unix socket,
+      #   bind-mounted into the +router-proxy+ sidecar. The *directory* is
+      #   what's mounted — a restarted host-side socat re-creates the
+      #   socket file, and a file bind-mount would pin the stale inode.
+      def sock_dir
+        @cache_dir.join('sock')
+      end
+
+      # @return [String] +"http://localhost:<port>"+.
+      def endpoint
+        "http://localhost:#{@port}"
+      end
+
+      # Build a {Mem0Client} pointed at the supervised server.
+      #
+      # @return [Mem0Client]
+      def client
+        Mem0Client.new(endpoint: endpoint, connection: @connection)
+      end
+
+      # Idempotent: ensure the checkout exists + is patched, bring the
+      # compose stack up (building the image on first run), heartbeat-poll
+      # the REST API until ready, then register {#close} with
+      # {Pikuri::Finalizers} so the stack is stopped at process exit.
+      #
+      # @return [void]
+      # @raise [RuntimeError] on missing docker/git, clone/patch failure,
+      #   +compose up+ failure, or healthcheck timeout.
+      def ensure_running!
+        prepare_checkout!
+        # Pre-create the bind-mount sources so docker doesn't create them
+        # root-owned (and so a missing dir isn't silently a fresh volume).
+        FileUtils.mkdir_p(data_dir.join('qdrant'))
+        FileUtils.mkdir_p(data_dir.join('history'))
+        FileUtils.mkdir_p(sock_dir, mode: 0o700)
+        start_relay! unless @container_router_url
+        begin
+          LOGGER.info("starting #{COMPOSE_PROJECT} (mem0 + Qdrant + relay) on 127.0.0.1:#{@port}; " \
+                      'first run builds the mem0 image and may take a few minutes')
+          compose!('up', '-d', '--build')
+          wait_for_healthy!
+        rescue StandardError
+          # Self-heal: don't leave the relay as a stray when the stack
+          # never came up (same discipline as Agent's build-phase rescue).
+          stop_relay!
+          raise
+        end
+        register_for_cleanup
+      end
+
+      # Stop the stack (+docker compose down+), leaving {#data_dir}'s
+      # bind-mounted host directories — the Qdrant corpus and memory
+      # history survive. Registered with
+      # {Pikuri::Finalizers} by {#ensure_running!}; safe to call directly.
+      # Best-effort and idempotent: a non-zero +down+ is logged, not
+      # raised (teardown shouldn't abort on an already-gone stack).
+      #
+      # @return [void]
+      def close
+        return if @closed
+
+        @closed = true
+        result = compose('down')
+        stop_relay!
+        return if result.status.success?
+
+        LOGGER.warn("docker compose down failed (exit #{result.status.exitstatus}): #{result.output.strip}")
+      end
+
+      # Default cache root for this supervisor: +<Pikuri::Paths.cache>/mem0+
+      # (i.e. +$XDG_CACHE_HOME/pikuri/mem0+ or +~/.cache/pikuri/mem0+).
+      # Holds +temp/git+ (the checkout) and +data/+ (the bind-mounted
+      # corpus + history). Shares the cache root with
+      # {Pikuri::VectorDb::Server::Chroma} via {Pikuri::Paths}. Public so
+      # tests and docs reference the same path the supervisor resolves.
+      #
+      # @return [String]
+      def default_cache_dir
+        Pikuri::Paths.cache.join('mem0').to_s
+      end
+
+      private
+
+      # Clone mem0 at {MEM0_REF} (once) and apply the Qdrant patch
+      # (idempotently, fail-loud). Reused across runs — a populated,
+      # already-patched checkout is a no-op.
+      def prepare_checkout!
+        ensure_clone!
+        apply_patch!
+      end
+
+      # Shallow-fetch the pinned commit into {#checkout_dir} if not already
+      # there. GitHub serves a fetch-by-SHA, so this stays a one-commit
+      # fetch rather than a full clone.
+      def ensure_clone!
+        dir = checkout_dir
+        return if dir.join('.git').directory? && git('rev-parse', 'HEAD', chdir: dir).output.strip == MEM0_REF
+
+        LOGGER.info("fetching mem0 @ #{MEM0_REF[0, 12]} into #{dir}")
+        FileUtils.mkdir_p(dir)
+        run_loud!(git('init', '-q', chdir: dir), 'git init')
+        # `remote add` is non-idempotent; ignore its failure (already added).
+        git('remote', 'add', 'origin', MEM0_REPO_URL, chdir: dir)
+        run_loud!(git('fetch', '--depth', '1', 'origin', MEM0_REF, chdir: dir),
+                  "git fetch #{MEM0_REF}")
+        run_loud!(git('checkout', '-q', '--detach', 'FETCH_HEAD', chdir: dir),
+                  'git checkout FETCH_HEAD')
+      end
+
+      # Apply {PATCH_FILE} unless it is already applied. Fail loud if it
+      # neither applies cleanly nor is already present (upstream drift
+      # past the pinned ref) — never build a wrong image silently.
+      def apply_patch!
+        dir = checkout_dir
+        return if git('apply', '--reverse', '--check', PATCH_FILE, chdir: dir).status.success?
+
+        result = git('apply', PATCH_FILE, chdir: dir)
+        return if result.status.success?
+
+        raise "Mem0Server: failed to apply #{PATCH_FILE} to mem0 @ #{MEM0_REF} " \
+              "(exit #{result.status.exitstatus}): #{result.output.strip}. " \
+              'The pinned ref or the patch may have drifted; regenerate the patch.'
+      end
+
+      # Run +docker compose+ with the supervisor's env, raising on
+      # failure. Used for +up+ (boot — loud).
+      def compose!(*args)
+        run_loud!(compose(*args), "docker compose #{args.join(' ')}")
+      end
+
+      # Run +docker compose+ with the supervisor's env, returning the
+      # {Pikuri::Subprocess::Result} (caller decides loud vs quiet). Used
+      # for +down+ (teardown — quiet).
+      def compose(*args)
+        docker('compose', '-p', COMPOSE_PROJECT, '-f', COMPOSE_FILE, *args, env: compose_env)
+      end
+
+      # Environment the compose file interpolates (+${PIKURI_*}+) — the
+      # checkout path as the build context plus the router + model wiring.
+      # +PIKURI_ROUTER_URL+ is the router URL *as the container sees it*:
+      # the relay sidecar's address, unless +container_router_url:+
+      # overrode the routing.
+      def compose_env
+        {
+          'PIKURI_MEM0_SRC' => checkout_dir.to_s,
+          'PIKURI_DATA_DIR' => data_dir.to_s,
+          'PIKURI_SOCK_DIR' => sock_dir.to_s,
+          'PIKURI_SOCAT_IMAGE' => SOCAT_IMAGE,
+          'PIKURI_ROUTER_URL' => container_router_url,
+          'PIKURI_LLM_MODEL' => @llm_model,
+          'PIKURI_EMBEDDER_MODEL' => @embedder_model,
+          'PIKURI_MEM0_PORT' => @port.to_s,
+          'PIKURI_QDRANT_PORT' => @qdrant_port.to_s,
+          'PIKURI_QDRANT_IMAGE' => @qdrant_image,
+          'PIKURI_COLLECTION' => @collection,
+          'PIKURI_EMBEDDING_DIMS' => @embedding_dims.to_s
+        }
+      end
+
+      # Router base URL as seen from inside the mem0 container: the
+      # explicit override, or the relay sidecar with +router_url+'s path
+      # (normally +/v1+) carried over.
+      def container_router_url
+        @container_router_url || "http://router-proxy:8080#{URI(@router_url).path}"
+      end
+
+      # Spawn the host-side half of the router relay (class header):
+      # +socat+ listening on the unix socket, forwarding each connection
+      # to the router's TCP address. A daemon child — never +#wait+ed; a
+      # drain thread keeps its (normally silent) combined-output pipe from
+      # filling, logging anything socat does say. Fails loud if the socket
+      # doesn't appear within {RELAY_BIND_TIMEOUT}.
+      def start_relay!
+        uri = URI(@router_url)
+        sock = sock_dir.join(RELAY_SOCKET)
+        LOGGER.info("relay: socat #{sock} → #{uri.host}:#{uri.port}")
+        @relay = Pikuri::Subprocess.spawn(
+          'socat',
+          "UNIX-LISTEN:#{sock},fork,unlink-early,mode=600",
+          "TCP:#{uri.host}:#{uri.port}",
+          chdir: '/'
+        )
+        drain_relay_output!
+        wait_for_relay_socket!(sock)
+      rescue Errno::ENOENT
+        raise 'Mem0Server: `socat` not found on PATH — the relay that lets the mem0 ' \
+              'container reach the host router needs it. Install it (sudo apt install ' \
+              'socat), or pass container_router_url: to route the container yourself.'
+      end
+
+      # Log whatever socat prints (errors only, in practice) and keep the
+      # pipe from blocking the child. Ends at EOF when the relay exits.
+      def drain_relay_output!
+        relay = @relay
+        Thread.new do
+          relay.io.each_line { |line| LOGGER.warn("relay: #{line.chomp}") }
+        rescue IOError
+          # pipe closed mid-read during teardown — fine
+        ensure
+          relay.io.close unless relay.io.closed?
+        end
+      end
+
+      # Poll until socat has bound the socket. Near-instant in practice;
+      # a timeout means socat died on startup — fail loud now rather than
+      # as silent extraction failures later.
+      def wait_for_relay_socket!(sock)
+        deadline = Time.now + RELAY_BIND_TIMEOUT
+        until File.socket?(sock)
+          if Time.now > deadline
+            raise "Mem0Server: relay socket #{sock} did not appear within " \
+                  "#{RELAY_BIND_TIMEOUT}s — socat failed to start (see 'relay:' log lines)"
+          end
+          sleep 0.05
+        end
+      end
+
+      # SIGTERM the relay's process group. Idempotent; nil-safe for the
+      # +container_router_url:+ path where no relay was spawned.
+      def stop_relay!
+        @relay&.terminate
+        @relay = nil
+      end
+
+      # Poll the REST API's +/docs+ (FastAPI serves it once startup —
+      # including Qdrant collection creation — completes) until 200 or the
+      # timeout elapses. Connection errors during the wait are normal
+      # (the container takes a moment to bind) and just feed the loop.
+      def wait_for_healthy!
+        deadline = Time.now + @healthcheck_timeout
+        last_error = 'no attempts'
+
+        until Time.now > deadline
+          begin
+            response = http.get('/docs')
+            return if response.status == 200
+
+            last_error = "HTTP #{response.status}"
+          rescue Faraday::Error => e
+            last_error = "#{e.class.name.split('::').last}: #{e.message}"
+          end
+          sleep 0.5
+        end
+
+        raise "Mem0Server: #{COMPOSE_PROJECT} did not become healthy within " \
+              "#{@healthcheck_timeout}s at #{endpoint} (last: #{last_error})"
+      end
+
+      # Heartbeat-poll connection. Uses the injected +connection:+ when
+      # given (DI for tests), else a fresh plain connection against the
+      # endpoint.
+      def http
+        @http ||= @connection || Faraday.new(url: endpoint) do |f|
+          f.adapter Faraday.default_adapter
+        end
+      end
+
+      # Register {#close} with the process-global teardown registry, once.
+      def register_for_cleanup
+        @finalizer_handle ||= Pikuri::Finalizers.register(self)
+      end
+
+      # +docker+ chokepoint — routes through {Pikuri::Subprocess.spawn}
+      # (subprocess seam). Surfaces a clearer message when docker is absent.
+      def docker(*argv, env: {})
+        Pikuri::Subprocess.spawn('docker', *argv, chdir: '/', env: env).wait
+      rescue Errno::ENOENT
+        raise 'Mem0Server: `docker` not found on PATH. Install docker (with the ' \
+              'compose plugin), or point Mem0Client at an existing mem0 endpoint ' \
+              'via Pikuri::Memory::Mem0Client.new(endpoint:).'
+      end
+
+      # +git+ chokepoint — routes through {Pikuri::Subprocess.spawn}.
+      def git(*argv, chdir:)
+        Pikuri::Subprocess.spawn('git', *argv, chdir: chdir.to_s).wait
+      rescue Errno::ENOENT
+        raise 'Mem0Server: `git` not found on PATH. Install git, or point ' \
+              'Mem0Client at an existing mem0 endpoint.'
+      end
+
+      # Raise with the command's output if +result+ is non-zero.
+      def run_loud!(result, what)
+        return result if result.status.success?
+
+        raise "Mem0Server: #{what} failed (exit #{result.status.exitstatus}): #{result.output.strip}"
+      end
+    end
+  end
+end
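The rule the supervisor above uses to pick the router URL handed to the mem0 container can be sketched in isolation. This is a minimal illustration, not the gem's code: `container_url_for` and `RELAY_BASE` are hypothetical names introduced here, and only the behaviour described in the diff is modelled (an explicit override wins, a non-http URL is rejected when the relay would be used, and otherwise only the path is carried onto the sidecar address).

```ruby
require 'uri'

# Sidecar address named in the class header of the diff; the constant name
# itself is an assumption made for this sketch.
RELAY_BASE = 'http://router-proxy:8080'

# Hypothetical helper: returns the router base URL as the container would see it.
def container_url_for(router_url, override: nil)
  # An explicit container-side URL bypasses the relay entirely.
  return override if override

  uri = URI(router_url)
  # The socat relay forwards raw TCP, so it cannot originate TLS.
  unless uri.scheme == 'http'
    raise ArgumentError, "router_url must be http:// (got #{router_url.inspect})"
  end

  # Host and port are replaced by the sidecar; only the path (normally /v1) survives.
  "#{RELAY_BASE}#{uri.path}"
end

puts container_url_for('http://localhost:8080/v1')
```

Keeping the host-side URL and the container-side URL as two separate values is what lets a loopback-bound router stay loopback-bound: the container is only ever given the sidecar's name.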
@@ -0,0 +1,107 @@
+# frozen_string_literal: true
+
+module Pikuri
+  module Memory
+    # The +recall+ {Pikuri::Tool}: explicit, topic-driven deepening
+    # beyond the automatic per-turn prefetch ({Extension#on_user_message}).
+    # The "mined, deep" retrieval mode — the model calls it when the
+    # prefetch slice hints there is more to find on a topic. Same
+    # two-step agentic-RAG shape as +pikuri-vectordb+'s
+    # +vectordb_search+ (cheap auto slice) + +vectordb_read+ (pull
+    # more) — here, prefetch is the slice and +recall+ is the dig.
+    #
+    # == Single-param surface
+    #
+    # +recall(topic:)+ — no +top_k+, no +user_id+. Retrieval depth is
+    # host policy baked into the {Extension}; the namespace is fixed at
+    # construction. The model shouldn't tune retrieval mid-conversation;
+    # a minimal surface is the deliberate choice (same rationale as
+    # +vectordb_search+).
+    #
+    # == Recall does not resolve contradictions
+    #
+    # mem0 returns the relevant memories ranked by similarity, *not* by
+    # recency, and keeps a stale fact alongside its correction. So the
+    # observation carries each memory's +created_at+ timestamp and the
+    # tool description tells the model to treat newer-about-the-same-
+    # thing as current — resolution lives in the model's reasoning, not
+    # in the store (DESIGN.md §"Supersede recall: resolution is the consumer's job").
+    class Recall < Pikuri::Tool
+      LOGGER = Pikuri.logger_for('Memory::Recall')
+
+      # @return [Integer] memories returned per recall. A handful —
+      #   enough to surface a topic's facts plus any correction, few
+      #   enough to stay cheap in the turn.
+      TOP_K = 7
+
+      # @return [String] static description shown to the LLM,
+      #   opencode-shape (summary + +Usage:+ bullets).
+      DESCRIPTION = <<~DESC
+        Search your durable memory of the user for facts relevant to a topic.
+
+        Usage:
+        - Use to recall what you know about the user or their work beyond what was automatically surfaced this turn — preferences, ongoing projects, people, decisions.
+        - Phrase `topic` as a natural-language statement or question, e.g. "the user's current main project" or "how the user likes test output".
+        - Each result carries a timestamp. Memory is append-only: if two results conflict, the more recent one is current truth — the older one is kept as history, not a contradiction to flag.
+        - Returns up to #{TOP_K} memories. If nothing relevant comes back, say you don't have a memory of it rather than guessing.
+      DESC
+
+      # @param client [Mem0Client] the mem0 client recall queries go to.
+      # @param user_id [String] the fixed mem0 namespace to search.
+      # @return [Recall]
+      def initialize(client:, user_id:)
+        super(
+          name: 'recall',
+          description: DESCRIPTION,
+          parameters: Pikuri::Tool::Parameters.build { |p|
+            p.required_string :topic,
+                              'Natural-language topic or question to recall about the user, e.g. ' \
+                              '"the user\'s dietary preferences" or "what project are we working on?".'
+          },
+          execute: lambda { |topic:|
+            Recall.execute(client: client, user_id: user_id, topic: topic)
+          }
+        )
+      end
+
+      # Public so specs can exercise recall without constructing a Tool
+      # wrapper. Catches {Mem0Client} failures and renders them as an
+      # +"Error: ..."+ observation the LLM can react to (a transient
+      # mem0 blip shouldn't crash the loop) — bugs in pikuri's own code
+      # still raise.
+      #
+      # @param client [Mem0Client]
+      # @param user_id [String]
+      # @param topic [String]
+      # @return [String] formatted observation.
+      def self.execute(client:, user_id:, topic:)
+        return 'Error: topic is empty' if topic.nil? || topic.strip.empty?
+
+        records = client.search(query: topic, user_id: user_id, top_k: TOP_K)
+        return 'No relevant memories found.' if records.empty?
+
+        format_observation(records)
+      rescue RuntimeError => e
+        LOGGER.warn("recall failed: #{e.message}")
+        "Error: memory recall failed: #{e.message}"
+      end
+
+      # Format the recalled memories as a header + one line per memory,
+      # each prefixed with its +created_at+ date so the model can apply
+      # recency when two memories about the same thing conflict.
+      #
+      # @param records [Array<Record>]
+      # @return [String]
+      def self.format_observation(records)
+        header = "Recalled #{records.length} " \
+                 "memor#{records.length == 1 ? 'y' : 'ies'}:\n"
+        body = records.map do |r|
+          when_ = r.created_label ? "(#{r.created_label}) " : ''
+          "- #{when_}#{r.text}"
+        end.join("\n")
+        header + body
+      end
+      private_class_method :format_observation
+    end
+  end
+end
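To see what the model receives from the formatter above, the observation shape can be reproduced outside the gem. This is a sketch under stated assumptions: `MemoryRecord` is a hypothetical stand-in for the gem's `Record` (only the `text` and `created_label` readers used in the diff are modelled), the sample memories are invented, and the date-style labels are an assumption because the diff does not show how `created_label` is rendered.

```ruby
# Hypothetical stand-in for the gem's Record class.
MemoryRecord = Struct.new(:text, :created_label, keyword_init: true)

# Header plus one bullet per memory; the creation label is prefixed when present
# so a reader can apply recency to conflicting entries.
def format_observation(records)
  noun = records.length == 1 ? 'memory' : 'memories'
  lines = records.map do |r|
    stamp = r.created_label ? "(#{r.created_label}) " : ''
    "- #{stamp}#{r.text}"
  end
  "Recalled #{records.length} #{noun}:\n#{lines.join("\n")}"
end

# Invented sample data: two conflicting preferences and one undated fact.
records = [
  MemoryRecord.new(text: 'Prefers RSpec documentation format', created_label: '2025-01-10'),
  MemoryRecord.new(text: 'Prefers RSpec progress format', created_label: '2025-03-02'),
  MemoryRecord.new(text: 'Works mostly in Ruby', created_label: nil)
]

puts format_observation(records)
```

Both conflicting preferences are returned; nothing in the formatter picks a winner. The labels are the only signal the model has for treating the later entry as current, which matches the "resolution is the consumer's job" stance in the class comment.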