@mnexium/core 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/.env.example ADDED

POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=mnexium_core
POSTGRES_USER=postgres
POSTGRES_PASSWORD=change_me

# Optional
CORE_DEFAULT_PROJECT_ID=default-project
PORT=8080
CORE_DEBUG=false

# AI routing mode:
# - auto (default): cerebras -> openai -> simple fallback
# - cerebras: force Cerebras, fallback to simple if key missing
# - openai: force OpenAI ChatGPT model, fallback to simple if key missing
# - simple: no LLM calls, heuristic mode only
CORE_AI_MODE=auto

# Controls search-time retrieval expansion/reranking.
# true = LLM classify/expand/rerank path (when LLM is available)
# false = simple search mode only
USE_RETRIEVAL_EXPAND=true

# Model API keys
OPENAI_API_KEY=

# Optional: enables Cerebras-powered recall routing/reranking.
CEREBRAS_API=

# Shared retrieval model for whichever LLM provider is selected.
# - Cerebras example: gpt-oss-120b
# - OpenAI example: gpt-4o-mini or gpt-4.1-mini
RETRIEVAL_MODEL=gpt-oss-120b

# Embeddings model for semantic memory search and claim vectorization.
# Keep this aligned with the schema vector dimension (schema is VECTOR(1536)).
OPENAI_EMBED_MODEL=text-embedding-3-small
package/README.md ADDED

# CORE

CORE is Mnexium's memory engine service: a Postgres-backed HTTP API for storing memories, extracting claims, resolving truth state, and retrieving relevant context for downstream applications.

It is designed as an integration-first core service that can run standalone and plug into existing auth, tenancy, and platform controls.

## What CORE does

- Stores subject-scoped memories and supports lifecycle operations (create, update, soft-delete, restore).
- Extracts structured claims from natural language and persists claim assertions.
- Maintains slot-based truth state (`slot_state`) to track active winners and retractions.
- Supports retrieval with vector + lexical fallback and optional LLM-powered query expansion/reranking.
- Streams memory lifecycle events over SSE for real-time consumers.

## Why it is powerful

- Better grounding for responses: LLMs can retrieve durable, user-specific memory instead of relying only on short chat context.
- Lower hallucination risk on known facts: retrieval and claim state give the model a concrete memory substrate to reference.
- Personalization that persists: preferences, history, and prior decisions survive across sessions and channels.
- Works beyond context-window limits: important memory is stored and recalled on demand instead of repeatedly reprompted.
- Faster LLM product development: app teams get a ready memory/truth backend rather than building custom memory pipelines from scratch.

## Intended use

CORE is intended to be the memory and truth substrate behind apps, agents, and workflows that need:

- long-lived user memory,
- auditable claim history,
- query-time recall,
- and deterministic APIs backed by Postgres.

## Documentation map

- Setup and initialization: [docs/SETUP.md](docs/SETUP.md)
- Runtime behavior and decision logic: [docs/BEHAVIOR.md](docs/BEHAVIOR.md)
- HTTP endpoints and contracts: [docs/API.md](docs/API.md)
- Production hardening checklist: [docs/OPERATIONS.md](docs/OPERATIONS.md)
package/docs/API.md ADDED

# 📘 API Reference

All routes except `GET /health` require project context.

Project context resolution:

1. `x-project-id` header
2. fallback project id configured on server startup

If neither is available, the request fails with `400`:

```json
{
  "error": "project_id_required",
  "message": "Provide x-project-id header or configure defaultProjectId"
}
```
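
As a minimal sketch of a client that satisfies this requirement (the base URL, project id, and helper name here are illustrative assumptions, not part of the API contract):

```typescript
// Hypothetical helper: attach the x-project-id header to every CORE request.
// "http://localhost:8080" and "default-project" are assumptions for illustration.
function coreRequest(
  path: string,
  projectId: string,
  extraHeaders: Record<string, string> = {}
): Request {
  return new Request(`http://localhost:8080${path}`, {
    headers: { ...extraHeaders, "x-project-id": projectId },
  });
}

const req = coreRequest("/api/v1/memories?subject_id=user-1", "default-project");
```

Requests built this way always carry project context, so they avoid the `project_id_required` error above.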

## 🩺 Health

### GET `/health`

Returns service liveness:

```json
{
  "ok": true,
  "service": "mnexium-core",
  "timestamp": "..."
}
```

## 📡 Memory Events (SSE)

### GET `/api/v1/events/memories`

Query params:

- `subject_id` (optional)

Event types:

- `connected`
- `heartbeat`
- `memory.created`
- `memory.superseded`
- `memory.updated`
- `memory.deleted`
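
A subscription sketch (the base URL is an assumption; `EventSource` is available in browsers, or via a polyfill in Node):

```typescript
// Build the SSE URL with the optional subject_id filter documented above.
function memoryEventsUrl(base: string, subjectId?: string): string {
  const url = new URL("/api/v1/events/memories", base);
  if (subjectId) url.searchParams.set("subject_id", subjectId);
  return url.toString();
}

const url = memoryEventsUrl("http://localhost:8080", "user-1");
// A consumer would then listen per documented event type, e.g.:
// const es = new EventSource(url);
// es.addEventListener("memory.created", (e) => console.log(JSON.parse(e.data)));
```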

## 🧠 Memories

### GET `/api/v1/memories`

Query params:

- `subject_id` (required)
- `limit` (default `50`, max `200`)
- `offset` (default `0`)
- `include_deleted` (`true|false`)
- `include_superseded` (`true|false`)

Returns:

- `{ data: Memory[], count: number }`

### POST `/api/v1/memories`

Body:

- `subject_id` (required)
- `text` (required, max length `10000`)
- `kind`, `visibility`, `importance`, `confidence`, `is_temporal`, `tags`, `metadata`, `source_type` (optional)
- `id` (optional)
- `extract_claims` (optional, default `true`)
- `no_supersede` (optional, default `false`)

Returns:

- `201` created:
  - `{ id, subject_id, text, kind, created: true, superseded_count, superseded_ids }`
- `200` duplicate skip:
  - `{ id: null, subject_id, text, kind, created: false, skipped: true, reason: "duplicate" }`
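
An illustrative payload for this route (field names follow the contract above; the values and the commented-out `fetch` call are assumptions):

```typescript
// Minimal create-memory body: subject_id and text are required, the rest optional.
const body = {
  subject_id: "user-1",
  text: "Prefers dark mode in all apps",
  kind: "preference",
  tags: ["ui"],
  extract_claims: true,
};

// await fetch("http://localhost:8080/api/v1/memories", {
//   method: "POST",
//   headers: { "content-type": "application/json", "x-project-id": "default-project" },
//   body: JSON.stringify(body),
// });
```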

### GET `/api/v1/memories/search`

Query params:

- `subject_id` (required)
- `q` (required)
- `limit` (default `25`, max `200`)
- `min_score` (default `30`)
- `distance` (alias of `min_score`)
- `context` (repeatable; optional conversation context items)

Returns:

- when the recall service is configured (default in `src/dev.ts`):
  - `{ data, query, count, engine, mode, used_queries, predicates }`
- internal fallback path:
  - `{ data, query, count, engine }`
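
A sketch that assembles the documented query params into a request URL (the base URL is an assumption; param names match the contract above):

```typescript
// Build a search URL from the documented params; only set what the caller provides.
function searchUrl(
  base: string,
  subjectId: string,
  q: string,
  opts: { limit?: number; min_score?: number } = {}
): string {
  const url = new URL("/api/v1/memories/search", base);
  url.searchParams.set("subject_id", subjectId);
  url.searchParams.set("q", q);
  if (opts.limit !== undefined) url.searchParams.set("limit", String(opts.limit));
  if (opts.min_score !== undefined) url.searchParams.set("min_score", String(opts.min_score));
  return url.toString();
}

const searchReq = searchUrl("http://localhost:8080", "user-1", "favorite editor", { limit: 10 });
```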

### POST `/api/v1/memories/extract`

Body:

- `subject_id` (required)
- `text` (required)
- `force` (`boolean`, optional)
- `learn` (`boolean`, optional)
- `conversation_context` (`string[]`, optional)

Query params:

- `learn=true|false` (optional)
- `force=true|false` (optional)

`learn`/`force` are enabled when either the body or the query string sets them to `true`.

Returns (extraction only):

- `{ ok: true, learned: false, mode, extracted_count, memories }`

Returns (learn/write path):

- `{ ok: true, learned: true, mode, extracted_count, learned_memory_count, learned_claim_count, memories }`

### GET `/api/v1/memories/superseded`

Query params:

- `subject_id` (required)
- `limit` (default `50`, max `200`)
- `offset` (default `0`)

Returns:

- `{ data: Memory[], count }`

### GET `/api/v1/memories/recalls`

Query params:

- `chat_id` OR `memory_id` (one required)
- `stats=true|false` (used only with the `memory_id` path)
- `limit` (default `100`, max `1000`)

Response modes:

- by chat (`chat_id` provided): `{ data, count, chat_id }`
- by memory (`memory_id`, no stats): `{ data, count, memory_id }`
- memory stats (`memory_id` + `stats=true`): `{ memory_id, stats }`

Note:

- if both `chat_id` and `memory_id` are provided, the `chat_id` path is used.

### GET `/api/v1/memories/:id`

Returns:

- `{ data: Memory }`
- `404` with `memory_not_found` or `memory_deleted`

### PATCH `/api/v1/memories/:id`

Body (any subset):

- `text`, `kind`, `visibility`, `importance`, `confidence`, `is_temporal`, `tags`, `metadata`

Returns:

- `{ id, updated: true }`
- `404` with `memory_not_found` or `memory_deleted`

### DELETE `/api/v1/memories/:id`

Soft delete.

Returns:

- `{ ok: true, deleted: boolean }`

### GET `/api/v1/memories/:id/claims`

Returns assertion-centric claims linked to the memory:

- `{ data: [{ id, predicate, type, value, confidence, status, first_seen_at, last_seen_at }], count }`

Errors:

- `404` `memory_not_found`
- `404` `memory_deleted`

### POST `/api/v1/memories/:id/restore`

Returns:

- `{ ok: true, restored: true, id, subject_id, text }`
- `{ ok: true, restored: false, message: "Memory is already active" }`
- `400` `memory_deleted`
- `404` `memory_not_found`

## 🧩 Claims

### POST `/api/v1/claims`

Body:

- required: `subject_id`, `predicate`, `object_value`
- optional: `claim_id`, `claim_type`, `slot`, `confidence`, `importance`, `tags`, `source_memory_id`, `source_observation_id`, `subject_entity`, `valid_from`, `valid_until`

Returns:

- `{ claim_id, subject_id, predicate, object_value, slot, claim_type, confidence, observation_id, linking_triggered }`
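
An illustrative claim payload (the required fields follow the contract above; the predicate, value, and optional fields shown are assumptions):

```typescript
// Minimal claim body: subject_id, predicate, and object_value are required.
const claimBody = {
  subject_id: "user-1",
  predicate: "favorite_color",
  object_value: "blue",
  claim_type: "preference", // optional
  confidence: 0.9,          // optional
};
```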

### POST `/api/v1/claims/:id/retract`

Body:

- `reason` (optional, default `manual_retraction`)

Returns:

- `{ success, claim_id, slot, previous_claim_id, restored_previous, reason }`

### GET `/api/v1/claims/:id`

Returns:

- `{ claim, assertions, edges, supersession_chain }`

Notes:

- `claim` excludes the embedding field
- `supersession_chain` is filtered from `edges` where `edge_type = supersedes`

### GET `/api/v1/claims/subject/:subjectId/truth`

Query params:

- `include_source=true|false` (default `true`)

Returns:

- `{ subject_id, project_id, slot_count, slots }`

### GET `/api/v1/claims/subject/:subjectId/slot/:slot`

Returns:

- `{ subject_id, project_id, slot, active_claim_id, predicate, object_value, claim_type, confidence, updated_at, tags, source }`
- `404` `slot_not_found`

### GET `/api/v1/claims/subject/:subjectId/slots`

Query params:

- `limit` (default `100`, max `500`)

Returns grouped slot states:

- `{ subject_id, total, active_count, slots: { active, superseded, other } }`

### GET `/api/v1/claims/subject/:subjectId/graph`

Query params:

- `limit` (default `50`, max `200`)

Returns:

- `{ subject_id, claims_count, edges_count, edges_by_type, claims, edges }`

### GET `/api/v1/claims/subject/:subjectId/history`

Query params:

- `slot` (optional)
- `limit` (default `100`, max `500`)

Returns:

- `{ subject_id, project_id, slot_filter, by_slot, edges, total_claims }`

## ⚠️ Error Conventions

Common error payloads:

- validation: `{ error: "subject_id_required" }`, `{ error: "q_required" }`, `{ error: "invalid_json_body" }`
- not found: `{ error: "memory_not_found" }`, `{ error: "claim_not_found" }`, `{ error: "slot_not_found" }`, `{ error: "not_found" }`
- server error: `{ error: "server_error", message: "..." }`

Status codes:

- `200` success
- `201` created
- `400` validation/input
- `404` not found
- `500` unexpected server error

package/docs/BEHAVIOR.md ADDED

# 🧠 Runtime Behavior

This document explains how CORE decides retrieval/extraction behavior at runtime and how memory and claim state changes flow through the system.

## 🏗️ Request Lifecycle

CORE server path:

1. Receive HTTP request
2. Resolve `project_id`
3. Dispatch route handler
4. Execute storage contract (`CoreStore`)
5. Return JSON (or SSE stream)

Primary implementation files:

- `src/server/createCoreServer.ts`
- `src/adapters/postgres/PostgresCoreStore.ts`
- `src/ai/recallService.ts`
- `src/ai/memoryExtractionService.ts`

## 🤖 AI Provider Resolution

Configured by `CORE_AI_MODE`:

- `auto` (default)
- `cerebras`
- `openai`
- `simple`

Resolution behavior (`src/dev.ts`):

- `auto`
  - use Cerebras when `CEREBRAS_API` exists
  - else use OpenAI when `OPENAI_API_KEY` exists
  - else use `simple`
- `cerebras`
  - requires `CEREBRAS_API`
  - if missing, warns and falls back to `simple`
  - does not auto-switch to OpenAI
- `openai`
  - requires `OPENAI_API_KEY`
  - if missing, warns and falls back to `simple`
  - does not auto-switch to Cerebras
- `simple`
  - no LLM client

`RETRIEVAL_MODEL` is passed to the selected LLM client.
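
The documented resolution order can be sketched as a pure function (this is an illustration of the rules above, not the actual `src/dev.ts` code):

```typescript
type AiMode = "auto" | "cerebras" | "openai" | "simple";
type Provider = "cerebras" | "openai" | "simple";

// Mirror of the documented precedence: auto tries Cerebras, then OpenAI, then simple;
// forced modes never auto-switch to the other provider.
function resolveProvider(
  mode: AiMode,
  env: { CEREBRAS_API?: string; OPENAI_API_KEY?: string }
): Provider {
  if (mode === "auto") {
    if (env.CEREBRAS_API) return "cerebras";
    if (env.OPENAI_API_KEY) return "openai";
    return "simple";
  }
  if (mode === "cerebras") return env.CEREBRAS_API ? "cerebras" : "simple"; // no OpenAI auto-switch
  if (mode === "openai") return env.OPENAI_API_KEY ? "openai" : "simple"; // no Cerebras auto-switch
  return "simple";
}
```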

## 🔎 Retrieval Behavior (`GET /api/v1/memories/search`)

### Retrieval toggle

`USE_RETRIEVAL_EXPAND=true|false`:

- `true`
  - with an LLM client: classify + expand + rerank path
  - without an LLM client: simple path
- `false`
  - always the simple search path
  - the extraction endpoint can still use the LLM if an LLM client exists

### Query mode semantics (`broad|direct|indirect`)

Mode is produced by an LLM classifier in `src/ai/recallService.ts`.

- `broad`
  - intent: profile/summary requests
  - behavior: list active memories and sort by importance + recency
  - output pattern: wider profile set (`max(limit, 20)`)
- `direct`
  - intent: specific fact lookup
  - behavior: search with hints, then boost claim-backed memories when predicates match truth slots
  - output pattern: high precision, usually narrow (often top 5)
- `indirect`
  - intent: advice/planning where personal context helps
  - behavior: query expansion + larger candidate pool + rerank when needed
  - output pattern: context-rich supporting memories

Classification fallback behavior:

- classify failure or invalid mode -> defaults to `indirect`
- empty query -> no memories returned

### LLM expanded pipeline

1. Classify the query into `broad|direct|indirect` and extract hints/predicates.
2. Build the query set:
   - `broad`: profile listing path (no multi-query expansion)
   - `direct`: original + search hints
   - `indirect`: original + search hints + expanded queries
3. For each query, search with embedding (if available) plus lexical fallback.
4. Merge and dedupe by memory ID.
5. Apply mode-specific ranking:
   - `direct`: claim/truth boosts, rerank only when needed
   - `indirect`: rerank via LLM when the candidate set exceeds the limit

Other runtime constraints:

- conversation context is capped to the last 5 items
- classify timeout is 2s
- rerank timeout is 3s
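
Steps 3 and 4 of the pipeline above can be sketched as follows (types and the best-score-wins merge policy are illustrative assumptions, not the actual adapter code):

```typescript
// Illustrative candidate shape: per-query search results carry a relevance score.
interface Candidate {
  id: string;
  score: number;
}

// Merge per-query result lists, dedupe by memory id keeping the best score,
// and return candidates sorted by descending score.
function mergeCandidates(perQuery: Candidate[][]): Candidate[] {
  const best = new Map<string, Candidate>();
  for (const results of perQuery) {
    for (const c of results) {
      const prev = best.get(c.id);
      if (!prev || c.score > prev.score) best.set(c.id, c);
    }
  }
  return [...best.values()].sort((a, b) => b.score - a.score);
}
```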

Response fields include:

- `engine` (`provider:model` or `simple`)
- `mode` (`broad|direct|indirect|simple`)
- `used_queries`
- `predicates`

### Simple retrieval pipeline

1. Run a single-query search.
2. Use the embedding if available, otherwise the lexical-only path.
3. Rank via adapter scoring.

No LLM classify/expand/rerank is applied.

## ✂️ Extraction Behavior (`POST /api/v1/memories/extract`)

This endpoint is always available.

- With an LLM client:
  - uses the extraction prompt
  - normalizes output
  - falls back to simple extraction on parse errors or other failures
- Without an LLM client:
  - uses simple heuristic extraction directly

Learn modes:

- `learn=false` (default): extraction-only response
- `learn=true`: writes memories/claims and emits `memory.created` per learned memory

## 🗂️ Memory Lifecycle Semantics

### Create memory (`POST /api/v1/memories`)

- writes one `memories` row
- optional embedding generation
- duplicate guard when embedding similarity >= 85
- conflict supersede when embedding similarity is in `[60, 85)`
- emits `memory.created`
- emits `memory.superseded` for superseded conflicting memories
- optional async claim extraction (`extract_claims=true` by default)

Notes:

- async extraction is skipped when `no_supersede=true`
- extracted claims link back via `source_memory_id`
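
The documented similarity thresholds on create can be sketched as a small decision function (the function and type names are illustrative; only the threshold values come from the rules above):

```typescript
type CreateAction = "duplicate_skip" | "supersede_conflict" | "create";

// similarity >= 85 -> skip as duplicate; 60 <= similarity < 85 -> supersede the
// conflicting memory; below 60 -> create normally.
function classifySimilarity(similarity: number): CreateAction {
  if (similarity >= 85) return "duplicate_skip";
  if (similarity >= 60) return "supersede_conflict";
  return "create";
}
```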

### Update memory (`PATCH /api/v1/memories/:id`)

- patches the supported fields
- emits `memory.updated`

### Delete memory (`DELETE /api/v1/memories/:id`)

- soft delete (`is_deleted=true`)
- emits `memory.deleted` when a delete actually occurs

### Restore memory (`POST /api/v1/memories/:id/restore`)

- sets `status='active'`, clears `superseded_by`
- emits `memory.updated`
- deleted memories cannot be restored

### Superseded listing (`GET /api/v1/memories/superseded`)

- returns non-deleted memories where `status='superseded'`

## 🧩 Claim Semantics

### Create claim (`POST /api/v1/claims`)

- inserts into `claims`
- inserts an assertion row into `claim_assertions`
- upserts the `slot_state` active winner for the claim slot

### Retract claim (`POST /api/v1/claims/:id/retract`)

- sets claim status to `retracted`
- restores the previous active claim in the same slot when available
- updates `slot_state`
- writes a `retracts` edge when the prior winner is restored
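
A toy model of the retraction rule above, where the slot winner falls back to the previous active claim when one exists (the `Slot` shape and field names are assumptions for illustration only, not the real `slot_state` schema):

```typescript
// Illustrative slot state: the current winner plus the most recent prior winner.
interface Slot {
  active: string | null;
  previous: string | null;
}

// Retracting the active claim restores the previous winner; retracting anything
// else leaves the slot state unchanged.
function retractActive(slot: Slot, claimId: string): Slot {
  if (slot.active !== claimId) return slot;
  return { active: slot.previous, previous: null };
}

const afterRetract = retractActive({ active: "c2", previous: "c1" }, "c2");
```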

### Truth/slot reads

- truth and slot endpoints resolve from `slot_state` joined with active `claims`
- graph/history endpoints return claims + edges

## 📡 SSE Event Behavior

Endpoint:

- `GET /api/v1/events/memories`

Event types:

- `connected`
- `heartbeat` (every 30s)
- `memory.created`
- `memory.superseded`
- `memory.updated`
- `memory.deleted`

Topology:

- in-process event bus (`src/server/memoryEventBus.ts`)
- no cross-instance fanout by default
- for horizontal scale, replace the bus with external pub/sub (Redis/NATS/Kafka)

## ⚠️ Error and Degradation Model

Status pattern:

- `400` validation/input issues
- `404` missing resource or route
- `500` unexpected server error

Graceful degradation:

- no embedding key -> the non-vector retrieval path still works
- no LLM client or LLM failure -> simple retrieval/extraction fallback

package/docs/OPERATIONS.md ADDED

# 🛠️ Operations Playbook

This guide focuses on production hardening for CORE deployments.

## 🧱 Deployment Baseline

Recommended baseline:

- 1+ stateless CORE instances
- managed Postgres with `pgvector`
- reverse proxy/load balancer with tuned upstream timeouts
- centralized logs + metrics

For multi-instance event fanout, add external pub/sub.

## 🔐 Platform Integrations to Add

CORE intentionally leaves platform controls to the host system. Typical production additions:

1. Auth (API keys/JWT) and route-level scopes
2. Tenant boundary checks
3. Idempotency keys for writes
4. External event transport for SSE fanout
5. Alerting + SLO dashboards

## 🔁 Idempotency Guidance

Write routes can be retried by clients/proxies. Add:

- an `Idempotency-Key` header for `POST/PATCH/DELETE`
- a persisted request fingerprint and response body
- replay of the original success response for a duplicate key
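
A minimal in-memory sketch of the replay rule (production would persist this in a durable store; the class and field names are assumptions):

```typescript
// Stored outcome of the first request seen for a given Idempotency-Key.
interface StoredResponse {
  status: number;
  body: string;
}

// First writer wins: duplicates never overwrite the recorded response,
// so retries replay the original outcome.
class IdempotencyCache {
  private seen = new Map<string, StoredResponse>();

  get(key: string): StoredResponse | undefined {
    return this.seen.get(key);
  }

  put(key: string, res: StoredResponse): void {
    if (!this.seen.has(key)) this.seen.set(key, res);
  }
}

const cache = new IdempotencyCache();
cache.put("abc", { status: 201, body: '{"id":"m1"}' });
cache.put("abc", { status: 500, body: "ignored" }); // duplicate: original kept
```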

## 📡 SSE at Scale

Current behavior:

- memory events are emitted from an in-process bus
- subscribers connected to instance A will not receive events produced on instance B

Production recommendation:

- publish lifecycle events to Redis/NATS/Kafka
- fan out consistently across all API instances
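
One way to prepare for this swap is to put an interface between the server and the bus, so a Redis/NATS/Kafka adapter can replace the in-process implementation later (the interface and the in-memory version below are illustrative assumptions, not the actual `memoryEventBus.ts` API):

```typescript
interface MemoryEvent {
  type: string;
  payload: unknown;
}

// Transport contract: an external pub/sub adapter would implement the same shape.
interface EventTransport {
  publish(event: MemoryEvent): Promise<void>;
  subscribe(handler: (event: MemoryEvent) => void): () => void;
}

// In-process stand-in: synchronous fanout to local subscribers only.
class InProcessTransport implements EventTransport {
  private handlers = new Set<(e: MemoryEvent) => void>();

  async publish(event: MemoryEvent): Promise<void> {
    for (const h of this.handlers) h(event);
  }

  subscribe(handler: (event: MemoryEvent) => void): () => void {
    this.handlers.add(handler);
    return () => this.handlers.delete(handler); // unsubscribe function
  }
}
```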

## 🗄️ Database Operations

### Migrations

- keep `sql/postgres/schema.sql` as the bootstrap baseline
- run versioned forward migrations in CI/CD
- avoid startup-time auto-migration in production paths

### Backup and recovery

- daily full backup + PITR
- regular restore drills
- documented recovery RTO/RPO targets

### Vector index hygiene

- run `ANALYZE` after significant ingest batches
- monitor query plans and latency
- tune ivfflat `lists` settings as the corpus grows

## 📈 Observability Model

Track:

- request rate and latency (`p50`, `p95`, `p99`)
- status code distribution
- DB query latency/error rate
- LLM provider latency/error rate
- SSE subscriber counts

Alert on:

- sustained 5xx error rate
- DB saturation / connection pool pressure
- high tail-latency regressions

## 🤖 Runtime Adaptation (Accuracy Notes)

Provider selection in `src/dev.ts`:

- `CORE_AI_MODE=auto`: Cerebras -> OpenAI -> simple
- `CORE_AI_MODE=cerebras` without a key: simple fallback (no OpenAI auto-switch)
- `CORE_AI_MODE=openai` without a key: simple fallback (no Cerebras auto-switch)

Retrieval behavior:

- `USE_RETRIEVAL_EXPAND=true`: LLM classify/expand/rerank if an LLM exists
- `USE_RETRIEVAL_EXPAND=false`: simple retrieval path

Embedding behavior:

- a missing `OPENAI_API_KEY` causes the embedder to return empty vectors
- write/read APIs continue; retrieval uses the lexical/non-vector scoring path

## 🧾 Data Semantics in Production

- memory deletion is soft (`is_deleted=true`)
- supersession uses `status='superseded'` and `superseded_by`
- truth state is read from `slot_state` joined with active claims
- claim retraction may restore the previous slot winner

## ✅ Hardening Checklist

- [ ] Auth + scope middleware in front of CORE routes
- [ ] Explicit tenant/project isolation strategy
- [ ] Idempotency-key storage and replay implemented
- [ ] Externalized SSE/event transport for multi-instance deployments
- [ ] Migration workflow in CI/CD
- [ ] Backup + restore drill cadence defined
- [ ] SLOs + alerts configured
- [ ] Load/perf tests against the expected traffic profile