engrm 0.1.0

Files changed (82)
  1. package/.mcp.json +9 -0
  2. package/AUTH-DESIGN.md +436 -0
  3. package/BRIEF.md +197 -0
  4. package/CLAUDE.md +44 -0
  5. package/COMPETITIVE.md +174 -0
  6. package/CONTEXT-OPTIMIZATION.md +305 -0
  7. package/INFRASTRUCTURE.md +252 -0
  8. package/LICENSE +105 -0
  9. package/MARKET.md +230 -0
  10. package/PLAN.md +278 -0
  11. package/README.md +121 -0
  12. package/SENTINEL.md +293 -0
  13. package/SERVER-API-PLAN.md +553 -0
  14. package/SPEC.md +843 -0
  15. package/SWOT.md +148 -0
  16. package/SYNC-ARCHITECTURE.md +294 -0
  17. package/VIBE-CODER-STRATEGY.md +250 -0
  18. package/bun.lock +375 -0
  19. package/hooks/post-tool-use.ts +144 -0
  20. package/hooks/session-start.ts +64 -0
  21. package/hooks/stop.ts +131 -0
  22. package/mem-page.html +1305 -0
  23. package/package.json +30 -0
  24. package/src/capture/dedup.test.ts +103 -0
  25. package/src/capture/dedup.ts +76 -0
  26. package/src/capture/extractor.test.ts +245 -0
  27. package/src/capture/extractor.ts +330 -0
  28. package/src/capture/quality.test.ts +168 -0
  29. package/src/capture/quality.ts +104 -0
  30. package/src/capture/retrospective.test.ts +115 -0
  31. package/src/capture/retrospective.ts +121 -0
  32. package/src/capture/scanner.test.ts +131 -0
  33. package/src/capture/scanner.ts +100 -0
  34. package/src/capture/scrubber.test.ts +144 -0
  35. package/src/capture/scrubber.ts +181 -0
  36. package/src/cli.ts +517 -0
  37. package/src/config.ts +238 -0
  38. package/src/context/inject.test.ts +940 -0
  39. package/src/context/inject.ts +382 -0
  40. package/src/embeddings/backfill.ts +50 -0
  41. package/src/embeddings/embedder.test.ts +76 -0
  42. package/src/embeddings/embedder.ts +139 -0
  43. package/src/lifecycle/aging.test.ts +103 -0
  44. package/src/lifecycle/aging.ts +36 -0
  45. package/src/lifecycle/compaction.test.ts +264 -0
  46. package/src/lifecycle/compaction.ts +190 -0
  47. package/src/lifecycle/purge.test.ts +100 -0
  48. package/src/lifecycle/purge.ts +37 -0
  49. package/src/lifecycle/scheduler.test.ts +120 -0
  50. package/src/lifecycle/scheduler.ts +101 -0
  51. package/src/provisioning/browser-auth.ts +172 -0
  52. package/src/provisioning/provision.test.ts +198 -0
  53. package/src/provisioning/provision.ts +94 -0
  54. package/src/register.test.ts +167 -0
  55. package/src/register.ts +178 -0
  56. package/src/server.ts +436 -0
  57. package/src/storage/migrations.test.ts +244 -0
  58. package/src/storage/migrations.ts +261 -0
  59. package/src/storage/outbox.test.ts +229 -0
  60. package/src/storage/outbox.ts +131 -0
  61. package/src/storage/projects.test.ts +137 -0
  62. package/src/storage/projects.ts +184 -0
  63. package/src/storage/sqlite.test.ts +798 -0
  64. package/src/storage/sqlite.ts +934 -0
  65. package/src/storage/vec.test.ts +198 -0
  66. package/src/sync/auth.test.ts +76 -0
  67. package/src/sync/auth.ts +68 -0
  68. package/src/sync/client.ts +183 -0
  69. package/src/sync/engine.test.ts +94 -0
  70. package/src/sync/engine.ts +127 -0
  71. package/src/sync/pull.test.ts +279 -0
  72. package/src/sync/pull.ts +170 -0
  73. package/src/sync/push.test.ts +117 -0
  74. package/src/sync/push.ts +230 -0
  75. package/src/tools/get.ts +34 -0
  76. package/src/tools/pin.ts +47 -0
  77. package/src/tools/save.test.ts +301 -0
  78. package/src/tools/save.ts +231 -0
  79. package/src/tools/search.test.ts +69 -0
  80. package/src/tools/search.ts +181 -0
  81. package/src/tools/timeline.ts +64 -0
  82. package/tsconfig.json +22 -0
package/.mcp.json ADDED
@@ -0,0 +1,9 @@
1
+ {
2
+ "mcpServers": {
3
+ "engrm": {
4
+ "type": "stdio",
5
+ "command": "/Users/david/.bun/bin/bun",
6
+ "args": ["run", "src/server.ts"]
7
+ }
8
+ }
9
+ }
package/AUTH-DESIGN.md ADDED
@@ -0,0 +1,436 @@
1
+ # Auth Design — Engrm
2
+
3
+ **Status**: Approved (Devstral review: 2026-03-10)
4
+ **Gates**: Phase 3 (Sync) — local features work without auth
5
+
6
+ ---
7
+
8
+ ## 1. Design Principles
9
+
10
+ 1. **One credential type for sync**: `cvk_` API key is the only credential the sync engine uses
11
+ 2. **Multiple ways to obtain it**: OAuth flow (interactive), device flow (headless), manual (CI/CD)
12
+ 3. **One config directory**: `~/.engrm/` — settings, auth, and database all in one place
13
+ 4. **Offline-first**: Auth failure pauses sync, never breaks local features
14
+
15
+ ---
16
+
17
+ ## 2. Auth Flows
18
+
19
+ ### Flow A: Interactive (Browser Callback)
20
+
21
+ Default for developers with a desktop environment.
22
+
23
+ ```
24
+ ┌─────────────────────────────────────────────────────────┐
25
+ │ engrm init │
26
+ │ │
27
+ │ 1. User runs: engrm init │
28
+ │ 2. CLI starts localhost callback server on random port │
29
+ │ 3. CLI opens browser to: │
30
+ │ https://candengo.com/connect/mem? │
31
+ │ redirect_uri=http://localhost:{port}/callback │
32
+ │ &state={random} │
33
+ │ 4. User logs in (or creates account) on candengo.com │
34
+ │ 5. User clicks "Authorize Engrm" │
35
+ │ 6. Candengo redirects to localhost callback: │
36
+ │ http://localhost:{port}/callback?code=ABC&state=XYZ │
37
+ │ 7. CLI exchanges code for credentials: │
38
+ │ POST /v1/mem/provision { "code": "ABC" } │
39
+ │ → { api_key: "cvk_...", site_id, namespace, ... } │
40
+ │ 8. CLI writes ~/.engrm/settings.json │
41
+ │ 9. CLI prints: "✓ Connected as david@example.com" │
42
+ └─────────────────────────────────────────────────────────┘
43
+ ```
44
+
45
+ The OAuth callback exchanges for a **permanent `cvk_` API key** — not a short-lived access token. This is the same credential type as the existing provisioning flow. The OAuth flow is simply a more convenient delivery mechanism.
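Steps 3 and 6 of the interactive flow can be sketched as two small helpers: one builds the browser URL, the other validates the redirect and extracts the one-time code. This is a minimal illustration, not taken from the package source. The function names are hypothetical; only the URL shape and the `redirect_uri`, `state`, and `code` parameters come from the diagram.

```typescript
// Hypothetical helpers; only the URL and parameter names come from the flow diagram.
const AUTHORIZE_URL = "https://candengo.com/connect/mem";

/** Build the browser URL for step 3 of the interactive flow. */
function buildAuthorizeUrl(port: number, state: string): string {
  const url = new URL(AUTHORIZE_URL);
  url.searchParams.set("redirect_uri", `http://localhost:${port}/callback`);
  url.searchParams.set("state", state);
  return url.toString();
}

/**
 * Validate the redirect received in step 6 and return the one-time code.
 * Returns null when the path or state does not match, or the code is missing.
 */
function extractCode(callbackUrl: string, expectedState: string): string | null {
  const url = new URL(callbackUrl);
  if (url.pathname !== "/callback") return null;
  if (url.searchParams.get("state") !== expectedState) return null;
  const code = url.searchParams.get("code");
  return code ? code : null;
}
```

Rejecting a mismatched `state` before touching the code is what ties the callback to the `engrm init` run that opened the browser.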
46
+
47
+ ### Flow B: Device Code (Headless / SSH)
48
+
49
+ Auto-detected when no browser can be launched, or via `--no-browser` flag. Implements RFC 8628 (OAuth 2.0 Device Authorization Grant).
50
+
51
+ ```
52
+ ┌─────────────────────────────────────────────────────────┐
53
+ │ engrm init --no-browser │
54
+ │ │
55
+ │ 1. CLI requests device code: │
56
+ │ POST /v1/auth/device/code │
57
+ │ → { device_code, user_code: "XXXX-YYYY", │
58
+ │ verification_uri, interval: 5 } │
59
+ │ │
60
+ │ 2. CLI prints: │
61
+ │ "Open this URL on any device: │
62
+ │ https://candengo.com/connect/mem/device │
63
+ │ Enter code: XXXX-YYYY" │
64
+ │ │
65
+ │ 3. User opens URL on phone/desktop browser │
66
+ │ 4. User logs in, enters code, clicks "Authorize" │
67
+ │ 5. CLI polls every 5 seconds: │
68
+ │ POST /v1/auth/device/token { device_code } │
69
+ │ → 202 (pending) | 200 { api_key, site_id, ... } │
70
+ │ 6. On success: writes settings.json │
71
+ │ 7. CLI prints: "✓ Connected as david@example.com" │
72
+ └─────────────────────────────────────────────────────────┘
73
+ ```
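The polling step (step 5) can be sketched as a pure function that interprets one response, plus a loop that repeats until the result is no longer pending. This is an assumption-laden sketch: the type and function names are hypothetical, and treating every status other than 200 and 202 as a failure is a guess, since the design does not list error responses.

```typescript
// Response shapes follow the diagram: 202 while pending, 200 with credentials.
type PollResult =
  | { kind: "pending" }
  | { kind: "authorized"; apiKey: string }
  | { kind: "failed"; reason: string };

/** Interpret a single response from POST /v1/auth/device/token. */
function interpretPoll(status: number, body: { api_key?: string }): PollResult {
  if (status === 202) return { kind: "pending" };
  if (status === 200 && typeof body.api_key === "string" && body.api_key.startsWith("cvk_")) {
    return { kind: "authorized", apiKey: body.api_key };
  }
  // Assumption: anything else ends the flow.
  return { kind: "failed", reason: `unexpected response (HTTP ${status})` };
}

/** Hypothetical driver: polls until the result is no longer pending. */
async function pollUntilDone(
  poll: () => Promise<{ status: number; body: { api_key?: string } }>,
  intervalMs: number,
  maxAttempts: number,
): Promise<PollResult> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const { status, body } = await poll();
    const result = interpretPoll(status, body);
    if (result.kind !== "pending") return result;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return { kind: "failed", reason: "timed out waiting for authorization" };
}
```

Passing `poll` in as a parameter keeps the HTTP client out of the loop, so the same logic can be exercised with canned responses.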
74
+
75
+ ### Flow C: Provisioning Token (Web Signup)
76
+
77
+ Existing SPEC flow: the user signs up on engrm.dev and copies a one-line install command.
78
+
79
+ ```
80
+ ┌─────────────────────────────────────────────────────────┐
81
+ │ Web: engrm.dev │
82
+ │ │
83
+ │ 1. User signs up (email + password, or GitHub OAuth) │
84
+ │ 2. Page shows install command: │
85
+ │ npx engrm init --token=cmt_abc123... │
86
+ │ 3. CLI exchanges provisioning token for credentials: │
87
+ │ POST /v1/mem/provision { "token": "cmt_..." } │
88
+ │ → { api_key: "cvk_...", site_id, namespace, ... } │
89
+ │ 4. CLI writes settings.json │
90
+ └─────────────────────────────────────────────────────────┘
91
+ ```
92
+
93
+ ### Flow D: Manual / CI/CD
94
+
95
+ For CI/CD pipelines, air-gapped environments, or self-hosted deployments.
96
+
97
+ ```bash
98
+ # Environment variable (CI/CD)
99
+ export ENGRM_TOKEN=cvk_...
100
+
101
+ # Manual configuration
102
+ engrm init --manual
103
+ # Prompts for: endpoint, api_key, site_id, namespace, user_id
104
+
105
+ # Self-hosted
106
+ engrm init --url=https://vector.internal.company.com --token=cmt_...
107
+ ```
108
+
109
+ The sync engine checks `ENGRM_TOKEN` env var before reading from settings.json. This allows CI/CD pipelines to use Engrm without writing config files.
110
+
111
+ ---
112
+
113
+ ## 3. Credential Types
114
+
115
+ | Prefix | Type | Lifetime | Use Case |
116
+ |--------|------|----------|----------|
117
+ | `cvk_` | API key | Permanent (revocable) | All sync operations. The ONE credential type for API access. |
118
+ | `cmt_` | Provisioning token | 1 hour, single-use | Web signup → exchange for `cvk_` key |
119
+ | `cm_` | Access token | 1 hour | Future: MCP-native OAuth 2.1 (Phase 4) |
120
+ | `cmr_` | Refresh token | 90 days sliding | Future: MCP-native OAuth 2.1 (Phase 4) |
121
+
122
+ **Key decision**: For Phase 3, only `cvk_` and `cmt_` exist. The `cm_`/`cmr_` token pair is reserved for Phase 4 when MCP-native OAuth is implemented. This avoids maintaining two auth models simultaneously.
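The prefix table lends itself to a small classifier. The sketch below is illustrative only: the function and type names are hypothetical, while the four prefixes and the Phase 3 rule (only `cvk_` authenticates sync) come from the section above.

```typescript
type CredentialKind = "api_key" | "provisioning_token" | "access_token" | "refresh_token";

// Prefixes from the credential table. None is a prefix of another
// ("cm_" does not match "cmt_..." or "cmr_..."), so order does not matter.
const PREFIXES: ReadonlyArray<[string, CredentialKind]> = [
  ["cvk_", "api_key"],
  ["cmt_", "provisioning_token"],
  ["cmr_", "refresh_token"],
  ["cm_", "access_token"],
];

/** Return the credential kind for a value, or null if the prefix is unknown. */
function classifyCredential(value: string): CredentialKind | null {
  for (const [prefix, kind] of PREFIXES) {
    if (value.startsWith(prefix)) return kind;
  }
  return null;
}

/** Phase 3 rule: only a cvk_ API key may authenticate sync. */
function isSyncCredential(value: string): boolean {
  return classifyCredential(value) === "api_key";
}
```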
123
+
124
+ ---
125
+
126
+ ## 4. Token Storage
127
+
128
+ ### Primary: `~/.engrm/settings.json`
129
+
130
+ The `cvk_` API key is stored in the existing settings file. No separate auth file needed.
131
+
132
+ ```json
133
+ {
134
+ "candengo_url": "https://www.candengo.com",
135
+ "candengo_api_key": "cvk_...",
136
+ "site_id": "unimpossible",
137
+ "namespace": "dev-memory",
138
+ "user_id": "david",
139
+ "user_email": "david@example.com",
140
+ "device_id": "macbook-a1b2c3d4",
141
+ "teams": [
142
+ { "id": "team_abc123", "name": "Unimpossible", "namespace": "dev-memory" }
143
+ ],
144
+ "sync": { ... },
145
+ "search": { ... },
146
+ "scrubbing": { ... }
147
+ }
148
+ ```
149
+
150
+ ### Secret Scrubber: `cvk_` Pattern
151
+
152
+ The existing scrubber already catches `cvk_` keys (see SPEC §6). This prevents API keys from leaking into observations.
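A redaction rule of this kind might look like the sketch below. The pattern is an assumption (letters, digits, `_`, and `-` after the prefix), as are the function name and the replacement marker; the scrubber shipped in the package may use a different expression.

```typescript
// Assumed key alphabet; the real scrubber pattern may differ.
const CVK_PATTERN = /\bcvk_[A-Za-z0-9_-]+/g;

/** Replace anything that looks like a cvk_ API key with a fixed marker. */
function scrubApiKeys(text: string): string {
  return text.replace(CVK_PATTERN, "[REDACTED_API_KEY]");
}
```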
153
+
154
+ ### OS Keychain (Phase 4)
155
+
156
+ When `cm_`/`cmr_` tokens are introduced in Phase 4:
157
+ - Refresh token (`cmr_`) → OS keychain (macOS Keychain / libsecret / Windows Credential Manager)
158
+ - Access token (`cm_`) → memory only (short-lived, not persisted)
159
+ - Fallback to file storage with logged warning if keychain unavailable
160
+
161
+ ---
162
+
163
+ ## 5. Team Membership
164
+
165
+ ### Data Model
166
+
167
+ `teams` is an **array** in both the auth response and settings.json. This supports multi-team membership from day one.
168
+
169
+ ```typescript
170
+ interface TeamMembership {
171
+ id: string; // "team_abc123"
172
+ name: string; // "Unimpossible"
173
+ namespace: string; // "dev-memory"
174
+ }
175
+ ```
176
+
177
+ ### Namespace Resolution
178
+
179
+ When syncing:
180
+ - Personal namespace: `{user_id}-personal` (always exists)
181
+ - Team namespace: from `teams[].namespace` (requires explicit join)
182
+
183
+ The `search` tool's `scope` parameter determines which namespaces to query:
184
+ - `personal` → personal namespace only
185
+ - `team` → team namespace(s) only
186
+ - `all` → all namespaces (default)
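The resolution rules above can be sketched as a single function. The names `SearchScope` and `resolveNamespaces` are hypothetical, and deduplicating repeated namespaces is an added assumption; the `{user_id}-personal` convention and the three scope values come from this section.

```typescript
interface TeamMembership {
  id: string;
  name: string;
  namespace: string;
}

type SearchScope = "personal" | "team" | "all";

/** Map a search scope to the list of namespaces to query. */
function resolveNamespaces(
  userId: string,
  teams: TeamMembership[],
  scope: SearchScope = "all",
): string[] {
  const personal = `${userId}-personal`;
  const team = teams.map((t) => t.namespace);
  const selected =
    scope === "personal" ? [personal] : scope === "team" ? team : [personal, ...team];
  // Assumption: two teams sharing a namespace should be queried once.
  return Array.from(new Set(selected));
}
```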
187
+
188
+ ### Team Provisioning
189
+
190
+ Teams are **not** auto-provisioned. Flow:
191
+
192
+ 1. **Personal namespace**: auto-provisioned on first auth (any flow)
193
+ 2. **Team namespace**: explicit create or join action
194
+ - Admin creates team at `engrm.dev/team`
195
+ - Members join via invite link: `engrm.dev/join/team_abc123`
196
+ - Or CLI: `engrm team join --code=INVITE_CODE`
197
+
198
+ ---
199
+
200
+ ## 6. Server-Side Requirements
201
+
202
+ ### Endpoints (Phase 3)
203
+
204
+ | Endpoint | Method | Purpose |
205
+ |----------|--------|---------|
206
+ | `/v1/mem/provision` | POST | Exchange `cmt_` token or OAuth code for `cvk_` API key |
207
+ | `/v1/auth/device/code` | POST | Request device authorization code (RFC 8628) |
208
+ | `/v1/auth/device/token` | POST | Poll for device authorization completion |
209
+ | `/v1/auth/revoke` | POST | Revoke an API key by value |
210
+ | `/v1/auth/keys` | GET | List active API keys for account (dashboard) |
211
+ | `/v1/auth/keys` | DELETE | Revoke specific API key (dashboard) |
212
+
213
+ ### Web Pages
214
+
215
+ | URL | Purpose |
216
+ |-----|---------|
217
+ | `engrm.dev` | Landing page + signup |
218
+ | `candengo.com/connect/mem` | OAuth authorization page |
219
+ | `candengo.com/connect/mem/device` | Device code entry page |
220
+ | `engrm.dev/team` | Team creation (admin) |
221
+ | `engrm.dev/join/{code}` | Team invite acceptance |
222
+ | `engrm.dev/dashboard` | Key management, usage, team settings |
223
+
224
+ ### Database Tables (Server)
225
+
226
+ ```sql
227
+ -- User accounts
228
+ CREATE TABLE mem_accounts (
229
+ id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
230
+ email TEXT UNIQUE NOT NULL,
231
+ created_at TIMESTAMPTZ DEFAULT now()
232
+ );
233
+
234
+ -- API keys (permanent, revocable)
235
+ CREATE TABLE mem_api_keys (
236
+ key_hash TEXT PRIMARY KEY, -- SHA-256 of cvk_ key (never store plaintext)
237
+ key_prefix TEXT NOT NULL, -- First 8 chars for identification
238
+ account_id UUID NOT NULL REFERENCES mem_accounts(id),
239
+ name TEXT, -- "MacBook Pro", "CI Pipeline"
240
+ scopes TEXT[] DEFAULT '{read,write}', -- read, write, admin
241
+ created_at TIMESTAMPTZ DEFAULT now(),
242
+ last_used_at TIMESTAMPTZ,
243
+ revoked_at TIMESTAMPTZ -- NULL = active
244
+ );
245
+
246
+ -- Provisioning tokens (short-lived, single-use)
247
+ CREATE TABLE mem_provision_tokens (
248
+ token TEXT PRIMARY KEY, -- cmt_abc123...
249
+ account_id UUID NOT NULL REFERENCES mem_accounts(id),
250
+ expires_at TIMESTAMPTZ NOT NULL, -- created_at + 1 hour
251
+ used_at TIMESTAMPTZ, -- NULL until redeemed
252
+ created_at TIMESTAMPTZ DEFAULT now()
253
+ );
254
+
255
+ -- Device authorization codes (RFC 8628)
256
+ CREATE TABLE mem_device_codes (
257
+ device_code TEXT PRIMARY KEY,
258
+ user_code TEXT UNIQUE NOT NULL, -- XXXX-YYYY (human-readable)
259
+ account_id UUID, -- NULL until user authorizes
260
+ expires_at TIMESTAMPTZ NOT NULL, -- created_at + 15 minutes
261
+ authorized_at TIMESTAMPTZ, -- NULL until authorized
262
+ created_at TIMESTAMPTZ DEFAULT now()
263
+ );
264
+
265
+ -- Team membership
266
+ CREATE TABLE mem_teams (
267
+ id TEXT PRIMARY KEY, -- team_abc123
268
+ name TEXT NOT NULL,
269
+ namespace TEXT NOT NULL,
270
+ owner_id UUID NOT NULL REFERENCES mem_accounts(id),
271
+ created_at TIMESTAMPTZ DEFAULT now()
272
+ );
273
+
274
+ CREATE TABLE mem_team_members (
275
+ team_id TEXT NOT NULL REFERENCES mem_teams(id),
276
+ account_id UUID NOT NULL REFERENCES mem_accounts(id),
277
+ role TEXT DEFAULT 'member', -- owner, admin, member
278
+ joined_at TIMESTAMPTZ DEFAULT now(),
279
+ PRIMARY KEY (team_id, account_id)
280
+ );
281
+ ```
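The `expires_at` and `used_at` columns of `mem_provision_tokens` imply a redemption check along these lines. This is an in-memory sketch with hypothetical names; a real server would need to perform the check and the update atomically in the database, which this example does not attempt.

```typescript
interface ProvisionToken {
  token: string;
  expiresAt: Date;
  usedAt: Date | null; // null until redeemed
}

type RedeemResult =
  | { ok: true; token: ProvisionToken }
  | { ok: false; reason: "expired" | "already_used" };

/** Enforce the single-use and expiry rules for a cmt_ provisioning token. */
function redeem(token: ProvisionToken, now: Date): RedeemResult {
  if (token.usedAt !== null) return { ok: false, reason: "already_used" };
  if (now.getTime() >= token.expiresAt.getTime()) return { ok: false, reason: "expired" };
  // Return an updated copy rather than mutating the input.
  return { ok: true, token: { ...token, usedAt: now } };
}
```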
282
+
283
+ ---
284
+
285
+ ## 7. Token Scopes
286
+
287
+ | Scope | Allows | Use Case |
288
+ |-------|--------|----------|
289
+ | `read` | search, get_observations, timeline, session_context | Read-only sync, CI/CD consumers |
290
+ | `write` | save_observation, pin_observation + all `read` | Normal agent usage |
291
+ | `admin` | team management, key revocation + all `write` | Team admins |
292
+
293
+ Default scopes for new keys: `read, write`.
294
+ CI/CD keys should be created with `read` only to limit blast radius.
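The scope hierarchy can be expressed as a lookup from each scope to the scopes it implies. The sketch below is illustrative: the names are hypothetical, and denying unknown tools is an assumption. The tool names and the read/write/admin inclusion rules come from the table.

```typescript
type KeyScope = "read" | "write" | "admin";

// Each scope implies the ones listed for it, per the scope table.
const IMPLIES: Record<KeyScope, KeyScope[]> = {
  read: ["read"],
  write: ["write", "read"],
  admin: ["admin", "write", "read"],
};

// Tool-to-scope mapping taken from the "Allows" column.
const REQUIRED: Record<string, KeyScope> = {
  search: "read",
  get_observations: "read",
  timeline: "read",
  session_context: "read",
  save_observation: "write",
  pin_observation: "write",
};

/** Check whether a key with the granted scopes may call the given tool. */
function isAllowed(granted: KeyScope[], tool: string): boolean {
  const required = REQUIRED[tool];
  if (required === undefined) return false; // assumption: unknown tools are denied
  return granted.some((g) => IMPLIES[g].indexOf(required) !== -1);
}
```

With this shape, a CI/CD key created with `read` only can search but cannot save or pin observations.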
295
+
296
+ ---
297
+
298
+ ## 8. Sync Engine Auth Integration
299
+
300
+ ```typescript
301
+ // src/sync/auth.ts
302
+
303
+ /**
304
+ * Get a valid API key for sync operations.
305
+ * Priority: env var → settings.json
306
+ */
307
+ export function getApiKey(config: Config): string | null {
308
+ // CI/CD: environment variable takes precedence
309
+ const envKey = process.env.ENGRM_TOKEN;
310
+ if (envKey && envKey.startsWith("cvk_")) return envKey;
311
+
312
+ // Interactive: from settings
313
+ if (config.candengo_api_key && config.candengo_api_key.startsWith("cvk_")) {
314
+ return config.candengo_api_key;
315
+ }
316
+
317
+ return null;
318
+ }
319
+
320
+ /**
321
+ * Check if sync is configured and authenticated.
322
+ */
323
+ export function isSyncReady(config: Config): boolean {
324
+ return getApiKey(config) !== null && config.candengo_url !== "";
325
+ }
326
+ ```
327
+
328
+ ### Auth Failure Handling
329
+
330
+ When the sync engine gets a 401 from the API:
331
+ 1. Set sync state to `paused_auth`
332
+ 2. Log: "Sync paused: API key invalid or revoked. Run 'engrm init' to re-authenticate."
333
+ 3. Surface warning in next MCP tool response via `_meta.warning` field
334
+ 4. Do not retry sync until user re-authenticates
335
+ 5. Local features continue working normally
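The failure handling above amounts to a two-state machine. In this sketch only `paused_auth` is a name from the document; the `active` state, the event shapes, and the function names are hypothetical.

```typescript
type SyncState = "active" | "paused_auth";

type SyncEvent =
  | { type: "http_response"; status: number }
  | { type: "reauthenticated" };

/** Compute the next sync state from the current state and an event. */
function nextSyncState(state: SyncState, event: SyncEvent): SyncState {
  if (event.type === "reauthenticated") return "active";
  if (event.status === 401) return "paused_auth";
  // Other responses never lift an auth pause; only re-authentication does.
  return state;
}

/** Sync is attempted only while active; local features are unaffected. */
function shouldAttemptSync(state: SyncState): boolean {
  return state === "active";
}
```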
336
+
337
+ ---
338
+
339
+ ## 9. Token Revocation
340
+
341
+ ### API Key Rotation
342
+
343
+ Users can rotate keys from the dashboard (`engrm.dev/dashboard`) or CLI:
344
+
345
+ ```bash
346
+ engrm auth rotate
347
+ # 1. Creates new cvk_ key on server
348
+ # 2. Updates settings.json with new key
349
+ # 3. Revokes old key
350
+ ```
351
+
352
+ ### Revocation Scenarios
353
+
354
+ | Scenario | Action |
355
+ |----------|--------|
356
+ | User runs `engrm auth revoke` | Revoke current key, clear from settings |
357
+ | Lost/stolen device | Revoke key from web dashboard |
358
+ | Team member removed | Admin revokes member's team-scoped keys |
359
+ | Account deletion | All keys revoked server-side |
360
+ | Suspected compromise | `engrm auth rotate` (atomic: new key, then revoke old) |
361
+
362
+ ### Server-Side
363
+
364
+ - Keys are stored as SHA-256 hashes (never plaintext)
365
+ - `key_prefix` (first 8 chars) allows identification in dashboard without exposing full key
366
+ - `last_used_at` updated on each API call for activity monitoring
367
+ - Revoked keys return 401 immediately
368
+
369
+ ---
370
+
371
+ ## 10. Cross-Agent Compatibility
372
+
373
+ | Agent | Init Flow | Runtime Auth |
374
+ |-------|-----------|--------------|
375
+ | Claude Code | `engrm init` (any flow) | MCP server reads `cvk_` from settings.json |
376
+ | Codex CLI | Same init + set `bearer_token_env_var` | Codex passes token via env var |
377
+ | Cursor | Same init | MCP server reads from settings.json |
378
+ | Windsurf | Same init | MCP server reads from settings.json |
379
+ | Cline | Same init | MCP server reads from settings.json |
380
+ | CI/CD | `ENGRM_TOKEN=cvk_...` | Sync engine reads env var |
381
+
382
+ All agents share the same `cvk_` API key and the same settings.json. The MCP server binary is identical across agents — agent detection is separate from auth.
383
+
384
+ ---
385
+
386
+ ## 11. Phase 4: MCP-Native OAuth 2.1
387
+
388
+ When the MCP OAuth 2.1 spec stabilises and agents implement it:
389
+
390
+ 1. MCP server returns `401 Unauthorized` with `WWW-Authenticate` header
391
+ 2. Agent handles browser flow automatically (no `engrm init` needed)
392
+ 3. Server issues short-lived `cm_` access token + `cmr_` refresh token
393
+ 4. Refresh token stored in OS keychain via `keytar` or equivalent
394
+ 5. Access token refreshed automatically by MCP client
395
+ 6. This becomes the **preferred** flow; `cvk_` keys remain for CI/CD and backwards compat
396
+
397
+ This is additive — Track A (`cvk_` keys) continues to work. Track B (MCP OAuth) is an optional upgrade path.
398
+
399
+ ---
400
+
401
+ ## 12. Implementation Timeline
402
+
403
+ | Phase | Auth Work | Depends On |
404
+ |-------|-----------|------------|
405
+ | Phase 2 (current) | No auth needed — local only | — |
406
+ | Phase 3 (Sync) | Implement Flows A-D, `cvk_` key auth, revocation endpoints, team model | Candengo web backend |
407
+ | Phase 3.5 | Web dashboard (key management, team admin) | Phase 3 |
408
+ | Phase 4 | MCP-native OAuth 2.1, keychain storage, `cm_`/`cmr_` tokens | MCP spec stabilisation |
409
+
410
+ ### Phase 3 Implementation Order
411
+
412
+ 1. `POST /v1/mem/provision` — exchange `cmt_` token for `cvk_` key (server)
413
+ 2. `engrm init --token=cmt_...` — CLI provisioning (client)
414
+ 3. `engrm init` — browser OAuth callback flow (client)
415
+ 4. `engrm init --no-browser` — device code flow (client + server)
416
+ 5. `ENGRM_TOKEN` env var support in sync engine (client)
417
+ 6. `POST /v1/auth/revoke` — key revocation (server)
418
+ 7. `engrm auth rotate` — key rotation (client)
419
+ 8. Team endpoints + `engrm team join` (client + server)
420
+
421
+ ---
422
+
423
+ ## Decisions Log
424
+
425
+ | # | Decision | Rationale | Review |
426
+ |---|----------|-----------|--------|
427
+ | 1 | `cvk_` API key is the single credential for sync | Avoids two auth models, two validation paths. OAuth is delivery, not credential type. | Devstral: approved |
428
+ | 2 | All config in `~/.engrm/` | Consolidate — no split between `~/.config/` and `~/.engrm/` | Devstral: approved |
429
+ | 3 | Device flow (RFC 8628) for headless/SSH | Localhost callback fails for remote dev. Device flow works everywhere. | Devstral: required |
430
+ | 4 | `teams` is an array, not scalar | Multi-team membership from day one. Cheap now, expensive to migrate later. | Devstral: required |
431
+ | 5 | Personal namespace auto-provisioned, team explicit | Prevents orphaned team namespaces from solo signups | Devstral: approved |
432
+ | 6 | Token revocation is Phase 3 blocker | Basic security requirement — cannot ship sync without revocation | Devstral: required |
433
+ | 7 | MCP-native OAuth deferred to Phase 4 | Spec still stabilising, no agent fully implements it | Devstral: approved |
434
+ | 8 | API keys stored as SHA-256 hashes server-side | Standard practice — never store plaintext credentials | Devstral: approved |
435
+ | 9 | `ENGRM_TOKEN` env var for CI/CD | Pipelines need stable credentials without config files | Devstral: approved |
436
+ | 10 | Scopes: read, write, admin | Limits blast radius of CI/CD tokens and leaked keys | Devstral: approved |
package/BRIEF.md ADDED
@@ -0,0 +1,197 @@
1
+ # Engrm — Product Brief
2
+
3
+ ## Executive Summary
4
+
5
+ Engrm is a **cross-device, team-shared memory layer for AI coding agents** — built on Candengo Vector's proven RAG infrastructure. It captures what developers learn, discover, fix, and decide during AI-assisted coding sessions and makes that knowledge instantly available across all their devices, team members, and future sessions.
6
+
7
+ **We're building this to solve our own problem first.** Our dev team works across multiple machines and projects (Candengo, Alchemy, AIMY). Every Claude Code session starts from zero — no memory of what was done yesterday, on another machine, or by another team member. Engrm fixes this with offline-first local storage that syncs to Candengo Vector, giving every developer's AI agent shared project context from day one.
8
+
9
+ The first integration targets **Claude Code** via its MCP and hooks system. The MCP interface is agent-agnostic, so future agents that support MCP can use the same memory backend.
10
+
11
+ **Not a fork.** Built from scratch, inspired by claude-mem's approach to Claude Code integration. No shared code, no AGPL dependency. Clean-room implementation designed around cross-device sync and team memory from the start.
12
+
13
+ **Built by Unimpossible Consultants** — the team behind the Candengo AI Knowledge Infrastructure platform, Alchemy, and AIMY.
14
+
15
+ ---
16
+
17
+ ## The Problem
18
+
19
+ ### Context Amnesia
20
+ Every new Claude Code session starts from zero. Yesterday's debugging insights, architectural decisions, and hard-won knowledge are gone.
21
+
22
+ ### Multi-Device Friction
23
+ Fix a bug on the laptop, continue on the desktop — no shared context. Our developers work across 2-3 machines and constantly re-explain the same codebase to the agent.
24
+
25
+ ### Team Knowledge Silos
26
+ Developer A discovers a critical gotcha on Monday. Developer B hits the same issue on Tuesday. There's no automatic knowledge transfer between team members' AI agents. New team members' agents have zero institutional knowledge.
27
+
28
+ ### Wasted Tokens and Time
29
+ AI agents re-discover the same patterns, make the same mistakes, ask the same clarifying questions — session after session, developer after developer.
30
+
31
+ ### No Cross-Device Team Solution Exists
32
+ - claude-mem: local SQLite + ChromaDB, single device only
33
+ - mem0: cloud-only SaaS, no self-hosted option
34
+ - Cognee: knowledge graphs, no agent memory focus
35
+ - IDE memory (Cursor, Windsurf): locked to one tool
36
+
37
+ None offer offline-first cross-device sync with team memory on self-hosted infrastructure.
38
+
39
+ ---
40
+
41
+ ## The Solution
42
+
43
+ ### Core Product
44
+ An MCP server + Claude Code hooks that:
45
+ 1. **Self-provisions in under 2 minutes** — sign up at engrm.dev, run one command, done
46
+ 2. **Captures observations automatically** from coding sessions (bugfixes, discoveries, decisions, patterns)
47
+ 3. **Stores locally** in SQLite (instant, always works, offline-first)
48
+ 4. **Syncs to Candengo Vector** when connected (cross-device search, semantic retrieval)
49
+ 5. **Injects relevant context** on session start — agent picks up where you (or a teammate) left off, on any machine
50
+ 6. **Scrubs secrets** before storage (API keys, tokens, passwords, connection strings)
51
+
52
+ ### Architecture
53
+
54
+ ```
55
+ Developer's Machine (any device)
56
+ ┌─────────────────────────────────────────────┐
57
+ │ Claude Code / Future MCP Agent │
58
+ │ ↕ MCP (stdio) │
59
+ │ ┌───────────────────────────────────┐ │
60
+ │ │ Engrm MCP Server │ │
61
+ │ │ - Observation capture (hooks) │ │
62
+ │ │ - Local SQLite + FTS5 │ │
63
+ │ │ - Sync outbox queue │ │
64
+ │ │ - Secret scrubbing │ │
65
+ │ └──────────────┬────────────────────┘ │
66
+ │ │ HTTPS (when available) │
67
+ └─────────────────┼───────────────────────────┘
68
+
69
+
70
+ ┌─────────────────────────────────────────────┐
71
+ │ Candengo Vector (self-hosted or cloud) │
72
+ │ - BGE-M3 hybrid dense+sparse search │
73
+ │ - Cross-encoder reranking │
74
+ │ - Multi-tenant (site_id/namespace) │
75
+ └─────────────────────────────────────────────┘
76
+ ```
77
+
78
+ ### Data Flow
79
+
80
+ ```
81
+ 1. Developer works with Claude Code
82
+ 2. Agent uses tools (reads files, runs commands, edits code)
83
+ 3. PostToolUse hook → observation extracted (title, narrative, facts, type)
84
+ 4. Secret scrubber strips sensitive content
85
+ 5. Observation saved to local SQLite (instant, always works)
86
+ 6. Observation added to sync_outbox
87
+ 7. Sync engine pushes to Candengo Vector (fire-and-forget)
88
+ - Online → pushed immediately
89
+ - Offline → stays in outbox, retried on timer
90
+ 8. Next session (any device) → search hits both local + Candengo Vector
91
+ 9. Agent has context from previous sessions
92
+ ```
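Steps 6 and 7 of the data flow describe an outbox: observations are queued locally and stay queued until a push succeeds. The sketch below keeps the queue in memory and uses a synchronous `push` callback purely for illustration; per the brief, the real outbox is a SQLite table and the push is a network call. All names here are hypothetical.

```typescript
interface Observation {
  id: number;
  title: string;
}

class Outbox {
  private pending: Observation[] = [];

  /** Queue an observation for a later push. */
  enqueue(obs: Observation): void {
    this.pending.push(obs);
  }

  /** Try to push everything; items whose push fails stay queued for the next retry. */
  flush(push: (obs: Observation) => boolean): number {
    const remaining: Observation[] = [];
    let sent = 0;
    for (const obs of this.pending) {
      if (push(obs)) {
        sent++;
      } else {
        remaining.push(obs);
      }
    }
    this.pending = remaining;
    return sent;
  }

  /** Number of observations still waiting to be pushed. */
  size(): number {
    return this.pending.length;
  }
}
```

Because a failed push leaves the item in place, going offline only delays delivery; the local save has already happened by the time the item reaches the outbox.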
93
+
94
+ ---
95
+
96
+ ## Target Users
97
+
98
+ ### Phase 1: Our Team (Dogfood)
99
+ - Unimpossible dev team working across multiple machines and projects (Candengo, Alchemy, AIMY)
100
+ - Shared project memory so every developer's agent has team context
101
+ - Self-hosted on our own Candengo Vector infrastructure
102
+
103
+ ### Phase 2: Individual Developers (Solo Plan)
104
+ - Power users who want persistent AI memory without cloud lock-in
105
+ - Privacy-conscious developers who want self-hosted infrastructure
106
+
107
+ ### Phase 3: External Teams (Team Plan)
108
+ - Small-to-medium dev teams wanting shared institutional knowledge
109
+ - New team member's AI agent instantly has access to team knowledge
110
+
111
+ ### Phase 4: Enterprise
112
+ - Large engineering organisations
113
+ - Compliance-sensitive environments requiring self-hosted data
114
+
115
+ ---
116
+
117
+ ## Key Differentiators
118
+
119
+ | Feature | Engrm | claude-mem | mem0 |
120
+ |---|---|---|---|
121
+ | Free cloud sync | Yes (generous free tier) | No (local only) | 10K memories free |
122
+ | Cross-device sync | Yes (offline-first) | No (local only) | Cloud only |
123
+ | Self-hosted option | Yes (Candengo Vector) | Local only | No (SaaS) |
124
+ | Offline-first | Yes (SQLite + outbox) | N/A (always local) | No |
125
+ | Team memory | Yes (shared namespace) | No | Limited |
126
+ | Multi-agent support | MCP standard | Claude Code only | Multiple (via API) |
127
+ | Vector search quality | BGE-M3 hybrid + reranking | ChromaDB default | Proprietary |
128
+ | Secret scrubbing | Yes | No | Unknown |
129
+ | License | FSL-1.1-ALv2 (source-available, Fair Source) | AGPL-3.0 | Proprietary |
130
+
131
+ ---
132
+
133
+ ## Licensing Strategy
134
+
135
+ **Split model** — core client published, premium features proprietary:
136
+
137
+ | Component | License | Published? |
138
+ |-----------|---------|-----------|
139
+ | Core client (MCP server, hooks, SQLite, search, sync) | FSL-1.1-ALv2 | Yes (GitHub) |
140
+ | Sentinel (real-time AI audit, config push, team standards) | Proprietary | No (private repo) |
141
+ | Server (Candengo Vector) | Proprietary | No (private repo) |
142
+
143
+ **FSL-1.1-ALv2 (Functional Source License)** — part of the [Fair Source](https://fair.io) movement. Used by Sentry, Codecov, GitButler, Keygen.
144
+
145
+ What it allows:
146
+ - Developers can read, modify, and run the code freely
147
+ - Companies can use it internally without restriction
148
+ - Each version automatically converts to Apache 2.0 after 2 years
149
+
150
+ What it restricts:
151
+ - Nobody can fork it and offer a competing hosted service
152
+
153
+ **Why not MIT/Apache**: Too permissive. A competitor could fork the plugin and offer a competing hosted service.
154
+
155
+ **Why not AGPL**: Too restrictive for adoption. Many companies have blanket AGPL bans.
156
+
157
+ **Why not ELv2**: FSL is better — the 2-year Apache 2.0 conversion is a trust signal, and FSL has growing ecosystem legitimacy via Fair Source.
158
+
159
+ **Why separate Sentinel**: Premium IP (audit LLM orchestration, team standards sync, dashboard config push) stays in a private repo. This is the GitLab CE/EE pattern — clean separation, no license gymnastics.
160
+
161
+ ---
162
+
163
+ ## Revenue Model
164
+
165
+ ### Free-First, Upgrade for More
166
+
167
+ The free tier is the product, not a demo. Developers get real cross-device sync with generous limits. Paid tiers unlock more storage, more devices, and team features.
168
+
169
+ | Tier | Price | Includes | Target |
170
+ |---|---|---|---|
171
+ | **Free** | $0 | Cloud sync, 10K observations, 2 devices, 1 user | Individual devs getting started |
172
+ | **Solo** | $9/mo | 50K observations, unlimited devices, priority sync | Power users, multi-machine devs |
173
+ | **Pro** | $19/mo | Unlimited observations, unlimited devices, advanced search | Heavy users |
174
+ | **Team** | $12/seat/mo (min 3) | Shared team memory, team analytics, admin controls | Dev teams (2-20) |
175
+ | **Enterprise** | Custom | Self-hosted Candengo Vector + support SLA, SSO, audit | Large orgs, compliance |
176
+
177
+ **Free tier rationale**: 10K observations is roughly 2-3 months of active daily use for a solo developer. Long enough to get hooked, natural upgrade when they hit the limit. Two devices covers laptop + desktop — the core cross-device use case.
178
+
179
+ **Self-hosted is always free**: Anyone can run their own Candengo Vector instance. The paid tiers are for the convenience of our hosted infrastructure.
180
+
181
+ ### Revenue Flywheel
182
+ ```
183
+ Free users (adoption) → Hit limits → Upgrade to Solo/Pro
184
+ → Tell teammates → Team plan → Enterprise interest
185
+ → More users → Justifies infrastructure investment → Better service → ...
186
+ ```
187
+
188
+ ---
189
+
190
+ ## Success Metrics
191
+
192
+ | Metric | Target (6 months) | Target (12 months) |
193
+ |---|---|---|
194
+ | GitHub stars (plugin) | 1,000 | 10,000 |
195
+ | Active installations | 500 | 5,000 |
196
+ | Candengo Vector signups via Mem | 100 | 1,000 |
197
+ | Cross-device sync events/day | 10,000 | 100,000 |
package/CLAUDE.md ADDED
@@ -0,0 +1,44 @@
1
+ # CLAUDE.md
2
+
3
+ ## Project Overview
4
+
5
+ Engrm (engrm.dev) is a cross-device, team-shared memory layer for AI coding agents. Built to let our dev team share project context across machines and developers. It captures observations (discoveries, bugfixes, decisions, patterns) from AI-assisted coding sessions and syncs them via Candengo Vector.
6
+
7
+ **Not a fork.** Built from scratch, inspired by claude-mem's approach to hooking into Claude Code. No shared code, no AGPL dependency.
8
+
9
+ **Branding**: Public-facing product name is "Engrm" (engrm.dev). Repo stays `candengo-mem`. Server-side internals stay `mem_*`. MCP server name is "engrm". Config dir is `~/.engrm/`. Env var is `ENGRM_TOKEN`.
10
+
11
+ ## Key Documents
12
+
13
+ - `BRIEF.md` — Product brief, architecture, revenue model, success metrics
14
+ - `SWOT.md` — Strengths, weaknesses, opportunities, threats analysis
15
+ - `PLAN.md` — Phased implementation plan with component architecture and effort estimates
16
+ - `SPEC.md` — Technical specification: schemas, MCP tools, sync engine, search pipeline
17
+ - `COMPETITIVE.md` — Competitive analysis vs claude-mem, mem0, Cognee, etc.
18
+ - `MARKET.md` — Market research, competitor pricing, influencer reach, growth projections
19
+ - `INFRASTRUCTURE.md` — Scaling roadmap, account-based routing, capacity planning, cost analysis
20
+ - `AUTH-DESIGN.md` — Authentication flows, credential types, team model, token revocation
21
+ - `SYNC-ARCHITECTURE.md` — Bidirectional sync protocol, change feed, multi-agent compatibility
22
+ - `SERVER-API-PLAN.md` — Server-side API plan: sync, teams, billing, usage (Devstral-reviewed)
23
+ - `SENTINEL.md` — Sentinel: real-time AI audit for coding agents (competitive research, architecture, implementation plan)
24
+
25
+ ## Technology Stack
26
+
27
+ - **MCP Server**: TypeScript + Bun, MCP SDK (stdio transport)
28
+ - **Local Storage**: SQLite (bun:sqlite) with FTS5 for offline search
29
+ - **Remote Backend**: Candengo Vector (BGE-M3, Qdrant, hybrid search)
30
+ - **Agent Support**: Claude Code (hooks + MCP), future MCP-compatible agents
31
+
32
+ ## Architecture
33
+
34
+ ```
35
+ Agent ↔ MCP Server ↔ Local SQLite ↔ Sync Engine ↔ Candengo Vector
36
+ ```
37
+
38
+ - SQLite is the source of truth (always available, offline-first)
39
+ - Sync outbox queues observations for push to Candengo Vector
40
+ - Search combines local FTS5 + remote vector search with result merging
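The document does not say how local and remote results are merged. One common option is reciprocal rank fusion (RRF), sketched below as an illustration only; the function name, the `Hit` shape, and the constant `k = 60` are assumptions, not the package's method.

```typescript
interface Hit {
  id: string;
}

/**
 * Merge two ranked lists with reciprocal rank fusion:
 * each list contributes 1 / (k + rank) per hit, with rank starting at 1.
 */
function mergeByReciprocalRank(local: Hit[], remote: Hit[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of [local, remote]) {
    list.forEach((hit, index) => {
      const previous = scores.get(hit.id) ?? 0;
      scores.set(hit.id, previous + 1 / (k + index + 1));
    });
  }
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

RRF needs only ranks, which makes it convenient when FTS5 scores and vector similarity scores are not on a comparable scale.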
41
+
42
+ ## Development
43
+
44
+ This project is in early development. See `PLAN.md` for implementation phases and `SPEC.md` for technical details.