@mnexium/core 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.env.example +37 -0
- package/README.md +37 -0
- package/docs/API.md +299 -0
- package/docs/BEHAVIOR.md +224 -0
- package/docs/OPERATIONS.md +116 -0
- package/docs/SETUP.md +239 -0
- package/package.json +22 -0
- package/scripts/e2e.lib.mjs +604 -0
- package/scripts/e2e.routes.mjs +32 -0
- package/scripts/e2e.sh +76 -0
- package/scripts/e2e.webapp.client.js +408 -0
- package/scripts/e2e.webapp.mjs +1065 -0
- package/sql/postgres/schema.sql +275 -0
- package/src/adapters/postgres/PostgresCoreStore.ts +1017 -0
- package/src/ai/memoryExtractionService.ts +265 -0
- package/src/ai/recallService.ts +442 -0
- package/src/ai/types.ts +11 -0
- package/src/contracts/storage.ts +137 -0
- package/src/contracts/types.ts +138 -0
- package/src/dev.ts +144 -0
- package/src/index.ts +15 -0
- package/src/providers/cerebras.ts +101 -0
- package/src/providers/openaiChat.ts +116 -0
- package/src/providers/openaiEmbedding.ts +52 -0
- package/src/server/createCoreServer.ts +1154 -0
- package/src/server/memoryEventBus.ts +57 -0
- package/tsconfig.json +14 -0
package/.env.example
ADDED
@@ -0,0 +1,37 @@
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=mnexium_core
POSTGRES_USER=postgres
POSTGRES_PASSWORD=change_me

# Optional
CORE_DEFAULT_PROJECT_ID=default-project
PORT=8080
CORE_DEBUG=false

# AI routing mode:
# - auto (default): cerebras -> openai -> simple fallback
# - cerebras: force Cerebras, fallback to simple if key missing
# - openai: force OpenAI ChatGPT model, fallback to simple if key missing
# - simple: no LLM calls, heuristic mode only
CORE_AI_MODE=auto

# Controls search-time retrieval expansion/reranking.
# true = LLM classify/expand/rerank path (when LLM is available)
# false = simple search mode only
USE_RETRIEVAL_EXPAND=true

# Model API keys
OPENAI_API_KEY=

# Optional: enables Cerebras-powered recall routing/reranking.
CEREBRAS_API=

# Shared retrieval model for whichever LLM provider is selected.
# - Cerebras example: gpt-oss-120b
# - OpenAI example: gpt-4o-mini or gpt-4.1-mini
RETRIEVAL_MODEL=gpt-oss-120b

# Embeddings model for semantic memory search and claim vectorization.
# Keep this aligned with schema vector dimension (schema is VECTOR(1536)).
OPENAI_EMBED_MODEL=text-embedding-3-small
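
For reference, the sketch below shows one way a host process might read these variables at startup. The variable names and defaults come from this file; the config object shape is illustrative only, not the package's actual bootstrap code.

```ts
// Illustrative only: reading the .env.example variables with the documented
// defaults applied. The config shape is an assumption, not @mnexium/core's API.
type AiMode = "auto" | "cerebras" | "openai" | "simple";

const config = {
  postgres: {
    host: process.env.POSTGRES_HOST ?? "localhost",
    port: Number(process.env.POSTGRES_PORT ?? 5432),
    database: process.env.POSTGRES_DB ?? "mnexium_core",
    user: process.env.POSTGRES_USER ?? "postgres",
    password: process.env.POSTGRES_PASSWORD ?? "",
  },
  port: Number(process.env.PORT ?? 8080),
  defaultProjectId: process.env.CORE_DEFAULT_PROJECT_ID, // optional fallback project
  debug: process.env.CORE_DEBUG === "true",
  aiMode: (process.env.CORE_AI_MODE ?? "auto") as AiMode,
  useRetrievalExpand: (process.env.USE_RETRIEVAL_EXPAND ?? "true") === "true",
  retrievalModel: process.env.RETRIEVAL_MODEL ?? "gpt-oss-120b",
  embedModel: process.env.OPENAI_EMBED_MODEL ?? "text-embedding-3-small",
  openaiApiKey: process.env.OPENAI_API_KEY,  // empty disables embeddings/OpenAI
  cerebrasApiKey: process.env.CEREBRAS_API,  // empty disables Cerebras routing
};
```
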
package/README.md
ADDED
@@ -0,0 +1,37 @@
# CORE

CORE is Mnexium's memory engine service: a Postgres-backed HTTP API for storing memories, extracting claims, resolving truth state, and retrieving relevant context for downstream applications.

It is designed as an integration-first core service that can run standalone and plug into existing auth, tenancy, and platform controls.

## What CORE does

- Stores subject-scoped memories and supports lifecycle operations (create, update, soft-delete, restore).
- Extracts structured claims from natural language and persists claim assertions.
- Maintains slot-based truth state (`slot_state`) to track active winners and retractions.
- Supports retrieval with vector + lexical fallback and optional LLM-powered query expansion/reranking.
- Streams memory lifecycle events over SSE for real-time consumers.

## Why it is powerful

- Better grounding for responses: LLMs can retrieve durable, user-specific memory instead of relying only on short chat context.
- Lower hallucination risk on known facts: retrieval and claim state give the model a concrete memory substrate to reference.
- Personalization that persists: preferences, history, and prior decisions survive across sessions and channels.
- Works beyond context-window limits: important memory is stored and recalled on demand instead of repeatedly reprompted.
- Faster LLM product development: app teams get a ready memory/truth backend rather than building custom memory pipelines from scratch.

## Intended use

CORE is intended to be the memory and truth substrate behind apps, agents, and workflows that need:

- long-lived user memory,
- auditable claim history,
- query-time recall,
- and deterministic APIs backed by Postgres.

## Documentation map

- Setup and initialization: [docs/SETUP.md](docs/SETUP.md)
- Runtime behavior and decision logic: [docs/BEHAVIOR.md](docs/BEHAVIOR.md)
- HTTP endpoints and contracts: [docs/API.md](docs/API.md)
- Production hardening checklist: [docs/OPERATIONS.md](docs/OPERATIONS.md)
package/docs/API.md
ADDED
@@ -0,0 +1,299 @@
# API Reference

All routes except `GET /health` require project context.

Project context resolution:

1. `x-project-id` header
2. fallback project id configured on server startup

If neither is available, the request fails with `400`:

```json
{
  "error": "project_id_required",
  "message": "Provide x-project-id header or configure defaultProjectId"
}
```
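
For illustration, a minimal client call that supplies project context explicitly, assuming a local instance on the default port `8080`; the project and subject ids are placeholders, and the endpoint used is `GET /api/v1/memories`, documented below.

```ts
// Illustrative request: list memories for a subject, scoping the call to a
// project via the x-project-id header (omit it to rely on defaultProjectId).
const res = await fetch(
  "http://localhost:8080/api/v1/memories?subject_id=user-123",
  { headers: { "x-project-id": "default-project" } },
);

if (res.status === 400) {
  console.error(await res.json()); // e.g. { error: "project_id_required", ... }
} else {
  const { data, count } = await res.json();
  console.log(`fetched ${count} memories`, data);
}
```
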
## Health

### GET `/health`

Returns service liveness:

```json
{
  "ok": true,
  "service": "mnexium-core",
  "timestamp": "..."
}
```

## Memory Events (SSE)

### GET `/api/v1/events/memories`

Query params:

- `subject_id` (optional)

Event types:

- `connected`
- `heartbeat`
- `memory.created`
- `memory.superseded`
- `memory.updated`
- `memory.deleted`
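
A browser-side sketch of consuming this stream with the standard `EventSource` API. `EventSource` cannot set custom headers, so this assumes the server runs with a configured default project id (or that a proxy injects `x-project-id`); the subject id is a placeholder.

```ts
// Illustrative SSE consumer; the subject_id filter is optional.
const source = new EventSource(
  "http://localhost:8080/api/v1/events/memories?subject_id=user-123",
);

source.addEventListener("connected", () => console.log("stream open"));
source.addEventListener("heartbeat", () => {
  // keep-alive; emitted periodically by the server
});

for (const type of [
  "memory.created",
  "memory.superseded",
  "memory.updated",
  "memory.deleted",
]) {
  source.addEventListener(type, (event) => {
    console.log(type, JSON.parse((event as MessageEvent).data));
  });
}
```
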
## Memories

### GET `/api/v1/memories`

Query params:

- `subject_id` (required)
- `limit` (default `50`, max `200`)
- `offset` (default `0`)
- `include_deleted` (`true|false`)
- `include_superseded` (`true|false`)

Returns:

- `{ data: Memory[], count: number }`

### POST `/api/v1/memories`

Body:

- `subject_id` (required)
- `text` (required, max length `10000`)
- `kind`, `visibility`, `importance`, `confidence`, `is_temporal`, `tags`, `metadata`, `source_type` (optional)
- `id` (optional)
- `extract_claims` (optional, default `true`)
- `no_supersede` (optional, default `false`)

Returns:

- `201` created:
  - `{ id, subject_id, text, kind, created: true, superseded_count, superseded_ids }`
- `200` duplicate skip:
  - `{ id: null, subject_id, text, kind, created: false, skipped: true, reason: "duplicate" }`
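
A sketch of the create call and its two documented outcomes (`201` created vs. `200` duplicate skip); base URL, header value, and payload contents are placeholders.

```ts
// Illustrative memory write; only subject_id and text are required.
const res = await fetch("http://localhost:8080/api/v1/memories", {
  method: "POST",
  headers: {
    "content-type": "application/json",
    "x-project-id": "default-project",
  },
  body: JSON.stringify({
    subject_id: "user-123",
    text: "Prefers window seats on long flights",
    tags: ["travel", "preference"],
    // extract_claims defaults to true; set false to skip async claim extraction
  }),
});

const body = await res.json();
if (res.status === 201) {
  console.log("created", body.id, "superseded:", body.superseded_ids);
} else if (res.status === 200 && body.skipped) {
  console.log("skipped near-duplicate memory");
}
```
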
### GET `/api/v1/memories/search`

Query params:

- `subject_id` (required)
- `q` (required)
- `limit` (default `25`, max `200`)
- `min_score` (default `30`)
- `distance` (alias of `min_score`)
- `context` (repeatable; optional conversation context items)

Returns:

- when recall service is configured (default in `src/dev.ts`):
  - `{ data, query, count, engine, mode, used_queries, predicates }`
- internal fallback path:
  - `{ data, query, count, engine }`
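
A search sketch; `context` is repeatable, so it is appended once per conversation item. All values are placeholders.

```ts
// Illustrative recall query with two recent conversation turns as context.
const params = new URLSearchParams({
  subject_id: "user-123",
  q: "where does the user like to sit on flights?",
  limit: "10",
});
params.append("context", "user: book me a flight to Tokyo");
params.append("context", "assistant: any seating preference?");

const res = await fetch(
  `http://localhost:8080/api/v1/memories/search?${params}`,
  { headers: { "x-project-id": "default-project" } },
);

// With the recall service configured the response also carries
// mode/used_queries/predicates; the fallback path returns only
// data/query/count/engine.
const { data, mode, engine } = await res.json();
console.log(engine, mode, data);
```
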
### POST `/api/v1/memories/extract`

Body:

- `subject_id` (required)
- `text` (required)
- `force` (`boolean`, optional)
- `learn` (`boolean`, optional)
- `conversation_context` (`string[]`, optional)

Query params:

- `learn=true|false` (optional)
- `force=true|false` (optional)

`learn`/`force` are enabled when either body or query sets them to `true`.

Returns (extraction only):

- `{ ok: true, learned: false, mode, extracted_count, memories }`

Returns (learn/write path):

- `{ ok: true, learned: true, mode, extracted_count, learned_memory_count, learned_claim_count, memories }`
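
A sketch of the learn/write path; as noted above, `learn` can be set in the body or the query string. Values are placeholders.

```ts
// Illustrative extraction call. With learn omitted (or false) nothing is
// persisted; with learn=true the extracted memories/claims are written and
// memory.created events are emitted.
const res = await fetch(
  "http://localhost:8080/api/v1/memories/extract?learn=true",
  {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-project-id": "default-project",
    },
    body: JSON.stringify({
      subject_id: "user-123",
      text: "I moved to Berlin last month and I'm vegetarian now.",
      conversation_context: ["assistant: anything new since we last spoke?"],
    }),
  },
);

const result = await res.json();
console.log(result.mode, result.extracted_count, result.learned_claim_count);
```
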
### GET `/api/v1/memories/superseded`

Query params:

- `subject_id` (required)
- `limit` (default `50`, max `200`)
- `offset` (default `0`)

Returns:

- `{ data: Memory[], count }`

### GET `/api/v1/memories/recalls`

Query params:

- `chat_id` OR `memory_id` (one required)
- `stats=true|false` (used only with `memory_id` path)
- `limit` (default `100`, max `1000`)

Response modes:

- by chat (`chat_id` provided): `{ data, count, chat_id }`
- by memory (`memory_id` + no stats): `{ data, count, memory_id }`
- memory stats (`memory_id` + `stats=true`): `{ memory_id, stats }`

Note:

- if both `chat_id` and `memory_id` are provided, the `chat_id` path is used.

### GET `/api/v1/memories/:id`

Returns:

- `{ data: Memory }`
- `404` with `memory_not_found` or `memory_deleted`

### PATCH `/api/v1/memories/:id`

Body (any subset):

- `text`, `kind`, `visibility`, `importance`, `confidence`, `is_temporal`, `tags`, `metadata`

Returns:

- `{ id, updated: true }`
- `404` with `memory_not_found` or `memory_deleted`

### DELETE `/api/v1/memories/:id`

Soft delete.

Returns:

- `{ ok: true, deleted: boolean }`

### GET `/api/v1/memories/:id/claims`

Returns assertion-centric claims linked to the memory:

- `{ data: [{ id, predicate, type, value, confidence, status, first_seen_at, last_seen_at }], count }`

Errors:

- `404` `memory_not_found`
- `404` `memory_deleted`

### POST `/api/v1/memories/:id/restore`

Returns:

- `{ ok: true, restored: true, id, subject_id, text }`
- `{ ok: true, restored: false, message: "Memory is already active" }`
- `400` `memory_deleted`
- `404` `memory_not_found`

## Claims

### POST `/api/v1/claims`

Body:

- required: `subject_id`, `predicate`, `object_value`
- optional: `claim_id`, `claim_type`, `slot`, `confidence`, `importance`, `tags`, `source_memory_id`, `source_observation_id`, `subject_entity`, `valid_from`, `valid_until`

Returns:

- `{ claim_id, subject_id, predicate, object_value, slot, claim_type, confidence, observation_id, linking_triggered }`
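
A direct claim write, sketched with placeholder predicate and value:

```ts
// Illustrative claim creation; subject_id, predicate, and object_value are
// the required fields, everything else is optional.
const res = await fetch("http://localhost:8080/api/v1/claims", {
  method: "POST",
  headers: {
    "content-type": "application/json",
    "x-project-id": "default-project",
  },
  body: JSON.stringify({
    subject_id: "user-123",
    predicate: "lives_in",
    object_value: "Berlin",
    confidence: 0.9,
    tags: ["profile"],
  }),
});

const claim = await res.json();
// e.g. { claim_id, slot, observation_id, linking_triggered, ... }
console.log(claim.claim_id, claim.slot);
```
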
### POST `/api/v1/claims/:id/retract`

Body:

- `reason` (optional, default `manual_retraction`)

Returns:

- `{ success, claim_id, slot, previous_claim_id, restored_previous, reason }`

### GET `/api/v1/claims/:id`

Returns:

- `{ claim, assertions, edges, supersession_chain }`

Notes:

- `claim` excludes embedding field
- `supersession_chain` is filtered from `edges` where `edge_type = supersedes`

### GET `/api/v1/claims/subject/:subjectId/truth`

Query params:

- `include_source=true|false` (default `true`)

Returns:

- `{ subject_id, project_id, slot_count, slots }`
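
A sketch of reading the resolved truth state for a subject; the subject id is a placeholder.

```ts
// Illustrative truth snapshot read; include_source defaults to true.
const res = await fetch(
  "http://localhost:8080/api/v1/claims/subject/user-123/truth",
  { headers: { "x-project-id": "default-project" } },
);

const { slot_count, slots } = await res.json();
console.log(`subject has ${slot_count} resolved slots`, slots);
```
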
### GET `/api/v1/claims/subject/:subjectId/slot/:slot`

Returns:

- `{ subject_id, project_id, slot, active_claim_id, predicate, object_value, claim_type, confidence, updated_at, tags, source }`
- `404` `slot_not_found`

### GET `/api/v1/claims/subject/:subjectId/slots`

Query params:

- `limit` (default `100`, max `500`)

Returns grouped slot states:

- `{ subject_id, total, active_count, slots: { active, superseded, other } }`

### GET `/api/v1/claims/subject/:subjectId/graph`

Query params:

- `limit` (default `50`, max `200`)

Returns:

- `{ subject_id, claims_count, edges_count, edges_by_type, claims, edges }`

### GET `/api/v1/claims/subject/:subjectId/history`

Query params:

- `slot` (optional)
- `limit` (default `100`, max `500`)

Returns:

- `{ subject_id, project_id, slot_filter, by_slot, edges, total_claims }`

## Error Conventions

Common error payloads:

- validation: `{ error: "subject_id_required" }`, `{ error: "q_required" }`, `{ error: "invalid_json_body" }`
- not found: `{ error: "memory_not_found" }`, `{ error: "claim_not_found" }`, `{ error: "slot_not_found" }`, `{ error: "not_found" }`
- server error: `{ error: "server_error", message: "..." }`

Status codes:

- `200` success
- `201` created
- `400` validation/input
- `404` not found
- `500` unexpected server error
package/docs/BEHAVIOR.md
ADDED
@@ -0,0 +1,224 @@
# Runtime Behavior

This document explains how CORE decides retrieval/extraction behavior at runtime and how memory + claim state changes flow through the system.

## Request Lifecycle

CORE server path:

1. Receive HTTP request
2. Resolve `project_id`
3. Dispatch route handler
4. Execute storage contract (`CoreStore`)
5. Return JSON (or SSE stream)

Primary implementation files:

- `src/server/createCoreServer.ts`
- `src/adapters/postgres/PostgresCoreStore.ts`
- `src/ai/recallService.ts`
- `src/ai/memoryExtractionService.ts`

## AI Provider Resolution

Configured by `CORE_AI_MODE`:

- `auto` (default)
- `cerebras`
- `openai`
- `simple`

Resolution behavior (`src/dev.ts`):

- `auto`
  - use Cerebras when `CEREBRAS_API` exists
  - else use OpenAI when `OPENAI_API_KEY` exists
  - else use `simple`
- `cerebras`
  - requires `CEREBRAS_API`
  - if missing, warns and falls back to `simple`
  - does not auto-switch to OpenAI
- `openai`
  - requires `OPENAI_API_KEY`
  - if missing, warns and falls back to `simple`
  - does not auto-switch to Cerebras
- `simple`
  - no LLM client

`RETRIEVAL_MODEL` is passed to the selected LLM client.
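
The rules above can be summarized as a small decision function. This is a paraphrase of the documented behavior, not the actual code in `src/dev.ts`:

```ts
// Paraphrase of the documented CORE_AI_MODE resolution (not the real dev.ts).
type Provider = "cerebras" | "openai" | "simple";

function resolveProvider(
  mode: "auto" | "cerebras" | "openai" | "simple",
  env: { CEREBRAS_API?: string; OPENAI_API_KEY?: string },
): Provider {
  switch (mode) {
    case "auto":
      if (env.CEREBRAS_API) return "cerebras";
      if (env.OPENAI_API_KEY) return "openai";
      return "simple";
    case "cerebras":
      return env.CEREBRAS_API ? "cerebras" : "simple"; // no OpenAI auto-switch
    case "openai":
      return env.OPENAI_API_KEY ? "openai" : "simple"; // no Cerebras auto-switch
    case "simple":
      return "simple";
  }
}
```
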
## Retrieval Behavior (`GET /api/v1/memories/search`)

### Retrieval toggle

`USE_RETRIEVAL_EXPAND=true|false`:

- `true`
  - with LLM client: classify + expand + rerank path
  - without LLM client: simple path
- `false`
  - always simple search path
  - extraction endpoint can still use LLM if an LLM client exists

### Query mode semantics (`broad|direct|indirect`)

Mode is produced by an LLM classifier in `src/ai/recallService.ts`.

- `broad`
  - intent: profile/summary requests
  - behavior: list active memories and sort by importance + recency
  - output pattern: wider profile set (`max(limit, 20)`)
- `direct`
  - intent: specific fact lookup
  - behavior: search with hints, then boost claim-backed memories when predicates match truth slots
  - output pattern: high-precision, usually narrow (often top 5)
- `indirect`
  - intent: advice/planning where personal context helps
  - behavior: query expansion + larger candidate pool + rerank when needed
  - output pattern: context-rich supporting memories

Classification fallback behavior:

- classify failure/invalid mode -> defaults to `indirect`
- empty query -> no memories returned

### LLM expanded pipeline

1. Classify query into `broad|direct|indirect` and extract hints/predicates.
2. Build query set:
   - `broad`: profile listing path (no multi-query expansion)
   - `direct`: original + search hints
   - `indirect`: original + search hints + expanded queries
3. For each query, search with embedding (if available) plus lexical fallback.
4. Merge and dedupe by memory ID.
5. Apply mode-specific ranking:
   - `direct`: claim/truth boosts, rerank only when needed
   - `indirect`: rerank via LLM when candidate set exceeds limit

Other runtime constraints:

- conversation context is capped to last 5 items
- classify timeout is 2s
- rerank timeout is 3s

Response fields include:

- `engine` (`provider:model` or `simple`)
- `mode` (`broad|direct|indirect|simple`)
- `used_queries`
- `predicates`
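
The merge/dedupe step and the timeout caps above can be pictured with helpers like the following; the names and result shape are illustrative, not the recall service's internals.

```ts
// Illustrative helpers mirroring the documented behavior: bounded LLM calls
// (classify ~2s, rerank ~3s) and merge/dedupe of per-query hits by memory id.
async function withTimeout<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  const timeout = new Promise<T>((resolve) => setTimeout(() => resolve(fallback), ms));
  return Promise.race([work, timeout]);
}

interface Hit {
  id: string;    // memory id
  score: number; // adapter or rerank score
}

function mergeHits(perQueryHits: Hit[][]): Hit[] {
  const byId = new Map<string, Hit>();
  for (const hit of perQueryHits.flat()) {
    const existing = byId.get(hit.id);
    if (!existing || hit.score > existing.score) byId.set(hit.id, hit); // keep best score
  }
  return [...byId.values()].sort((a, b) => b.score - a.score);
}
```
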
### Simple retrieval pipeline

1. Run single query search.
2. Use embedding if available, otherwise lexical-only path.
3. Rank via adapter scoring.

No LLM classify/expand/rerank is applied.

## Extraction Behavior (`POST /api/v1/memories/extract`)

This endpoint is always available.

- With LLM client:
  - uses extraction prompt
  - normalizes output
  - falls back to simple extraction on parse/failure
- Without LLM client:
  - uses simple heuristic extraction directly

Learn modes:

- `learn=false` (default): extraction-only response
- `learn=true`: writes memories/claims and emits `memory.created` per learned memory

## Memory Lifecycle Semantics

### Create memory (`POST /api/v1/memories`)

- writes one `memories` row
- optional embedding generation
- duplicate guard when embedding similarity >= 85
- conflict supersede when embedding similarity is in `[60, 85)`
- emits `memory.created`
- emits `memory.superseded` for superseded conflicting memories
- optional async claim extraction (`extract_claims=true` by default)

Notes:

- async extraction is skipped when `no_supersede=true`
- extracted claims link back via `source_memory_id`
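
The duplicate/supersede thresholds reduce to a three-way decision on embedding similarity. A sketch of that rule only (the function name is illustrative):

```ts
// Documented thresholds: >= 85 skips the write as a duplicate, [60, 85)
// supersedes the conflicting memory, anything lower is stored normally.
type WriteAction = "skip_duplicate" | "supersede_conflict" | "create_normally";

function classifyWrite(similarity: number): WriteAction {
  if (similarity >= 85) return "skip_duplicate";
  if (similarity >= 60) return "supersede_conflict";
  return "create_normally";
}
```
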
### Update memory (`PATCH /api/v1/memories/:id`)

- patch updates supported fields
- emits `memory.updated`

### Delete memory (`DELETE /api/v1/memories/:id`)

- soft delete (`is_deleted=true`)
- emits `memory.deleted` when delete actually occurs

### Restore memory (`POST /api/v1/memories/:id/restore`)

- sets `status='active'`, clears `superseded_by`
- emits `memory.updated`
- deleted memories cannot be restored

### Superseded listing (`GET /api/v1/memories/superseded`)

- returns non-deleted memories where `status='superseded'`

## Claim Semantics

### Create claim (`POST /api/v1/claims`)

- inserts into `claims`
- inserts assertion row in `claim_assertions`
- upserts `slot_state` active winner for claim slot

### Retract claim (`POST /api/v1/claims/:id/retract`)

- sets claim status to `retracted`
- restores previous active claim in same slot when available
- updates `slot_state`
- writes `retracts` edge when prior winner is restored

### Truth/slot reads

- truth + slot endpoints resolve from `slot_state` joined with active `claims`
- graph/history endpoints return claims + edges

## SSE Event Behavior

Endpoint:

- `GET /api/v1/events/memories`

Event types:

- `connected`
- `heartbeat` (every 30s)
- `memory.created`
- `memory.superseded`
- `memory.updated`
- `memory.deleted`

Topology:

- in-process event bus (`src/server/memoryEventBus.ts`)
- no cross-instance fanout by default
- for horizontal scale, replace bus with external pub/sub (Redis/NATS/Kafka)

## Error and Degradation Model

Status pattern:

- `400` validation/input issues
- `404` missing resource or route
- `500` unexpected server error

Graceful degradation:

- no embedding key -> non-vector retrieval path still works
- no LLM client or LLM failure -> simple retrieval/extraction fallback
package/docs/OPERATIONS.md
ADDED
@@ -0,0 +1,116 @@
# Operations Playbook

This guide focuses on production hardening for CORE deployments.

## Deployment Baseline

Recommended baseline:

- 1+ stateless CORE instances
- managed Postgres with `pgvector`
- reverse proxy/load balancer with tuned upstream timeouts
- centralized logs + metrics

For multi-instance event fanout, add external pub/sub.

## Platform Integrations to Add

CORE intentionally leaves platform controls to the host system. Typical production additions:

1. Auth (API keys/JWT) and route-level scopes
2. Tenant boundary checks
3. Idempotency keys for writes
4. External event transport for SSE fanout
5. Alerting + SLO dashboards

## Idempotency Guidance

Write routes can be retried by clients/proxies. Add:

- `Idempotency-Key` for `POST/PATCH/DELETE`
- persisted request fingerprint and response body
- replay of original success response for duplicate key
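
A minimal in-memory sketch of the key + fingerprint + replay idea; a production version would persist entries (for example in Postgres) with a TTL. Names and shapes are illustrative.

```ts
// Illustrative idempotency cache: the same key with the same request
// fingerprint replays the stored response instead of re-executing the write.
interface StoredResponse {
  fingerprint: string; // e.g. hash of method + path + body
  status: number;
  body: unknown;
}

const idempotencyCache = new Map<string, StoredResponse>();

async function withIdempotency(
  key: string | undefined,
  fingerprint: string,
  execute: () => Promise<{ status: number; body: unknown }>,
): Promise<{ status: number; body: unknown; replayed: boolean }> {
  if (key) {
    const hit = idempotencyCache.get(key);
    if (hit && hit.fingerprint === fingerprint) {
      return { status: hit.status, body: hit.body, replayed: true };
    }
  }
  const result = await execute();
  if (key) idempotencyCache.set(key, { fingerprint, ...result });
  return { ...result, replayed: false };
}
```
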
## SSE at Scale

Current behavior:

- memory events are emitted from an in-process bus
- subscribers connected to instance A will not receive events produced on instance B

Production recommendation:

- publish lifecycle events to Redis/NATS/Kafka
- fan out consistently across all API instances
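
One way to frame the replacement is a transport seam that both the in-process bus and an external broker can satisfy; the interface below is illustrative, not an API exported by the package.

```ts
// Illustrative transport seam for multi-instance fanout. The in-process bus
// satisfies it on a single node; a Redis/NATS/Kafka-backed implementation
// publishes to the broker and re-emits on every API instance.
type MemoryEventType =
  | "memory.created"
  | "memory.superseded"
  | "memory.updated"
  | "memory.deleted";

interface MemoryEvent {
  type: MemoryEventType;
  project_id: string;
  subject_id: string;
  payload: unknown;
}

interface EventTransport {
  publish(event: MemoryEvent): Promise<void>;
  subscribe(handler: (event: MemoryEvent) => void): () => void; // returns unsubscribe
}
```
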
## Database Operations

### Migrations

- keep `sql/postgres/schema.sql` as bootstrap baseline
- run versioned forward migrations in CI/CD
- avoid startup-time auto-migration in production paths

### Backup and recovery

- daily full backup + PITR
- regular restore drills
- documented recovery RTO/RPO targets

### Vector index hygiene

- run `ANALYZE` after significant ingest batches
- monitor query plans and latency
- tune ivfflat list settings as corpus grows

## Observability Model

Track:

- request rate and latency (`p50`, `p95`, `p99`)
- status code distribution
- DB query latency/error rate
- LLM provider latency/error rate
- SSE subscriber counts

Alert on:

- sustained 5xx error rate
- DB saturation / connection pool pressure
- high tail latency regressions

## Runtime Adaptation (Accuracy Notes)

Provider selection in `src/dev.ts`:

- `CORE_AI_MODE=auto`: Cerebras -> OpenAI -> simple
- `CORE_AI_MODE=cerebras` without key: simple fallback (no OpenAI auto-switch)
- `CORE_AI_MODE=openai` without key: simple fallback (no Cerebras auto-switch)

Retrieval behavior:

- `USE_RETRIEVAL_EXPAND=true`: LLM classify/expand/rerank if LLM exists
- `USE_RETRIEVAL_EXPAND=false`: simple retrieval path

Embedding behavior:

- missing `OPENAI_API_KEY` causes embedder to return empty vectors
- write/read APIs continue; retrieval uses lexical/non-vector scoring path

## Data Semantics in Production

- memory deletion is soft (`is_deleted=true`)
- supersession uses `status='superseded'` and `superseded_by`
- truth state is read from `slot_state` joined with active claims
- claim retraction may restore previous slot winner

## Hardening Checklist

- [ ] Auth + scope middleware in front of CORE routes
- [ ] Explicit tenant/project isolation strategy
- [ ] Idempotency-key storage and replay implemented
- [ ] Externalized SSE/event transport for multi-instance deployments
- [ ] Migration workflow in CI/CD
- [ ] Backup + restore drill cadence defined
- [ ] SLOs + alerts configured
- [ ] Load/perf test against expected traffic profile