engrm 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.mcp.json +9 -0
- package/AUTH-DESIGN.md +436 -0
- package/BRIEF.md +197 -0
- package/CLAUDE.md +44 -0
- package/COMPETITIVE.md +174 -0
- package/CONTEXT-OPTIMIZATION.md +305 -0
- package/INFRASTRUCTURE.md +252 -0
- package/LICENSE +105 -0
- package/MARKET.md +230 -0
- package/PLAN.md +278 -0
- package/README.md +121 -0
- package/SENTINEL.md +293 -0
- package/SERVER-API-PLAN.md +553 -0
- package/SPEC.md +843 -0
- package/SWOT.md +148 -0
- package/SYNC-ARCHITECTURE.md +294 -0
- package/VIBE-CODER-STRATEGY.md +250 -0
- package/bun.lock +375 -0
- package/hooks/post-tool-use.ts +144 -0
- package/hooks/session-start.ts +64 -0
- package/hooks/stop.ts +131 -0
- package/mem-page.html +1305 -0
- package/package.json +30 -0
- package/src/capture/dedup.test.ts +103 -0
- package/src/capture/dedup.ts +76 -0
- package/src/capture/extractor.test.ts +245 -0
- package/src/capture/extractor.ts +330 -0
- package/src/capture/quality.test.ts +168 -0
- package/src/capture/quality.ts +104 -0
- package/src/capture/retrospective.test.ts +115 -0
- package/src/capture/retrospective.ts +121 -0
- package/src/capture/scanner.test.ts +131 -0
- package/src/capture/scanner.ts +100 -0
- package/src/capture/scrubber.test.ts +144 -0
- package/src/capture/scrubber.ts +181 -0
- package/src/cli.ts +517 -0
- package/src/config.ts +238 -0
- package/src/context/inject.test.ts +940 -0
- package/src/context/inject.ts +382 -0
- package/src/embeddings/backfill.ts +50 -0
- package/src/embeddings/embedder.test.ts +76 -0
- package/src/embeddings/embedder.ts +139 -0
- package/src/lifecycle/aging.test.ts +103 -0
- package/src/lifecycle/aging.ts +36 -0
- package/src/lifecycle/compaction.test.ts +264 -0
- package/src/lifecycle/compaction.ts +190 -0
- package/src/lifecycle/purge.test.ts +100 -0
- package/src/lifecycle/purge.ts +37 -0
- package/src/lifecycle/scheduler.test.ts +120 -0
- package/src/lifecycle/scheduler.ts +101 -0
- package/src/provisioning/browser-auth.ts +172 -0
- package/src/provisioning/provision.test.ts +198 -0
- package/src/provisioning/provision.ts +94 -0
- package/src/register.test.ts +167 -0
- package/src/register.ts +178 -0
- package/src/server.ts +436 -0
- package/src/storage/migrations.test.ts +244 -0
- package/src/storage/migrations.ts +261 -0
- package/src/storage/outbox.test.ts +229 -0
- package/src/storage/outbox.ts +131 -0
- package/src/storage/projects.test.ts +137 -0
- package/src/storage/projects.ts +184 -0
- package/src/storage/sqlite.test.ts +798 -0
- package/src/storage/sqlite.ts +934 -0
- package/src/storage/vec.test.ts +198 -0
- package/src/sync/auth.test.ts +76 -0
- package/src/sync/auth.ts +68 -0
- package/src/sync/client.ts +183 -0
- package/src/sync/engine.test.ts +94 -0
- package/src/sync/engine.ts +127 -0
- package/src/sync/pull.test.ts +279 -0
- package/src/sync/pull.ts +170 -0
- package/src/sync/push.test.ts +117 -0
- package/src/sync/push.ts +230 -0
- package/src/tools/get.ts +34 -0
- package/src/tools/pin.ts +47 -0
- package/src/tools/save.test.ts +301 -0
- package/src/tools/save.ts +231 -0
- package/src/tools/search.test.ts +69 -0
- package/src/tools/search.ts +181 -0
- package/src/tools/timeline.ts +64 -0
- package/tsconfig.json +22 -0
package/.mcp.json
ADDED
package/AUTH-DESIGN.md
ADDED
|
@@ -0,0 +1,436 @@
# Auth Design — Engrm

**Status**: Approved (Devstral review: 2026-03-10)
**Gates**: Phase 3 (Sync) — local features work without auth

---

## 1. Design Principles

1. **One credential type for sync**: `cvk_` API key is the only credential the sync engine uses
2. **Multiple ways to obtain it**: OAuth flow (interactive), device flow (headless), manual (CI/CD)
3. **One config directory**: `~/.engrm/` — settings, auth, and database all in one place
4. **Offline-first**: Auth failure pauses sync, never breaks local features

---

## 2. Auth Flows

### Flow A: Interactive (Browser Callback)

Default for developers with a desktop environment.

```
┌─────────────────────────────────────────────────────────┐
│  engrm init                                             │
│                                                         │
│  1. User runs: engrm init                               │
│  2. CLI starts localhost callback server on random port │
│  3. CLI opens browser to:                               │
│     https://candengo.com/connect/mem?                   │
│       redirect_uri=http://localhost:{port}/callback     │
│       &state={random}                                   │
│  4. User logs in (or creates account) on candengo.com   │
│  5. User clicks "Authorize Engrm"                       │
│  6. Candengo redirects to localhost callback:           │
│     http://localhost:{port}/callback?code=ABC&state=XYZ │
│  7. CLI exchanges code for credentials:                 │
│     POST /v1/mem/provision { "code": "ABC" }            │
│     → { api_key: "cvk_...", site_id, namespace, ... }   │
│  8. CLI writes ~/.engrm/settings.json                   │
│  9. CLI prints: "✓ Connected as david@example.com"      │
└─────────────────────────────────────────────────────────┘
```

The OAuth callback exchanges the authorization code for a **permanent `cvk_` API key** — not a short-lived access token. This is the same credential type as the existing provisioning flow. The OAuth flow is simply a more convenient delivery mechanism.
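
The URL construction in step 3 and the callback check in step 6 can be sketched as follows. This is a minimal sketch, not the package's implementation: `buildAuthorizeUrl` and `readCallbackCode` are hypothetical names, and rejecting a callback whose `state` differs from the value sent is an assumption based on common OAuth practice rather than something the flow spells out.

```typescript
// Hypothetical helpers for Flow A; names are illustrative only.
const AUTHORIZE_URL = "https://candengo.com/connect/mem"; // taken from the flow diagram

/** Step 3: build the URL the CLI opens in the browser. */
function buildAuthorizeUrl(port: number, state: string): string {
  const url = new URL(AUTHORIZE_URL);
  url.searchParams.set("redirect_uri", `http://localhost:${port}/callback`);
  url.searchParams.set("state", state);
  return url.toString();
}

/**
 * Step 6: extract `code` from the callback request.
 * Returns null when the path is wrong, the state does not match
 * (assumed CSRF check), or no code is present.
 */
function readCallbackCode(callbackUrl: string, expectedState: string): string | null {
  const url = new URL(callbackUrl);
  if (url.pathname !== "/callback") return null;
  if (url.searchParams.get("state") !== expectedState) return null;
  return url.searchParams.get("code");
}
```

The returned code would then be sent to `POST /v1/mem/provision` as shown in step 7.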

### Flow B: Device Code (Headless / SSH)

Auto-detected when no browser can be launched, or via `--no-browser` flag. Implements RFC 8628 (OAuth 2.0 Device Authorization Grant).

```
┌─────────────────────────────────────────────────────────┐
│  engrm init --no-browser                                │
│                                                         │
│  1. CLI requests device code:                           │
│     POST /v1/auth/device/code                           │
│     → { device_code, user_code: "XXXX-YYYY",            │
│         verification_uri, interval: 5 }                 │
│                                                         │
│  2. CLI prints:                                         │
│     "Open this URL on any device:                       │
│      https://candengo.com/connect/mem/device            │
│      Enter code: XXXX-YYYY"                             │
│                                                         │
│  3. User opens URL on phone/desktop browser             │
│  4. User logs in, enters code, clicks "Authorize"       │
│  5. CLI polls every 5 seconds:                          │
│     POST /v1/auth/device/token { device_code }          │
│     → 202 (pending) | 200 { api_key, site_id, ... }     │
│  6. On success: writes settings.json                    │
│  7. CLI prints: "✓ Connected as david@example.com"      │
└─────────────────────────────────────────────────────────┘
```
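
The polling decision in step 5 can be sketched as a pure function over the HTTP status and response body. This is an illustrative sketch: `interpretPoll` and the `PollResult` shape are hypothetical, and treating any response other than 202 or a 200 carrying a `cvk_` key as a failure is an assumption.

```typescript
// Hypothetical interpretation of one poll of POST /v1/auth/device/token.
type PollResult =
  | { kind: "pending"; retryAfterMs: number }
  | { kind: "authorized"; apiKey: string }
  | { kind: "failed"; reason: string };

function interpretPoll(
  status: number,
  body: { api_key?: string } | null,
  intervalSec: number = 5, // `interval` from the device-code response
): PollResult {
  // 202 means the user has not authorized yet: wait one interval and retry.
  if (status === 202) return { kind: "pending", retryAfterMs: intervalSec * 1000 };

  // 200 is only accepted when it actually carries a cvk_ API key.
  const key = body?.api_key;
  if (status === 200 && typeof key === "string" && key.startsWith("cvk_")) {
    return { kind: "authorized", apiKey: key };
  }

  // Anything else (expired code, denial, server error) ends the loop.
  return { kind: "failed", reason: `unexpected response (HTTP ${status})` };
}
```

Note that RFC 8628 itself signals a pending authorization with an `authorization_pending` error response rather than a 202 status; the sketch follows the status codes shown in the diagram.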

### Flow C: Provisioning Token (Web Signup)

Existing SPEC flow — user signs up on candengo.com, copies a one-liner.

```
┌─────────────────────────────────────────────────────────┐
│  Web: engrm.dev                                         │
│                                                         │
│  1. User signs up (email + password, or GitHub OAuth)   │
│  2. Page shows install command:                         │
│     npx engrm init --token=cmt_abc123...                │
│  3. CLI exchanges provisioning token for credentials:   │
│     POST /v1/mem/provision { "token": "cmt_..." }       │
│     → { api_key: "cvk_...", site_id, namespace, ... }   │
│  4. CLI writes settings.json                            │
└─────────────────────────────────────────────────────────┘
```

### Flow D: Manual / CI/CD

For CI/CD pipelines, air-gapped environments, or self-hosted deployments.

```bash
# Environment variable (CI/CD)
export ENGRM_TOKEN=cvk_...

# Manual configuration
engrm init --manual
# Prompts for: endpoint, api_key, site_id, namespace, user_id

# Self-hosted
engrm init --url=https://vector.internal.company.com --token=cmt_...
```

The sync engine checks `ENGRM_TOKEN` env var before reading from settings.json. This allows CI/CD pipelines to use Engrm without writing config files.

---

## 3. Credential Types

| Prefix | Type | Lifetime | Use Case |
|--------|------|----------|----------|
| `cvk_` | API key | Permanent (revocable) | All sync operations. The ONE credential type for API access. |
| `cmt_` | Provisioning token | 1 hour, single-use | Web signup → exchange for `cvk_` key |
| `cm_` | Access token | 1 hour | Future: MCP-native OAuth 2.1 (Phase 4) |
| `cmr_` | Refresh token | 90 days sliding | Future: MCP-native OAuth 2.1 (Phase 4) |

**Key decision**: For Phase 3, only `cvk_` and `cmt_` exist. The `cm_`/`cmr_` token pair is reserved for Phase 4 when MCP-native OAuth is implemented. This avoids maintaining two auth models simultaneously.
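
Because each credential type has a distinct prefix, a client can classify a value before deciding what to do with it. The following is a small sketch under that assumption; `credentialKind` and `usableInPhase3` are hypothetical helper names, not functions from the package.

```typescript
// Hypothetical prefix-based classification of the credential types in the table.
type CredentialKind = "api_key" | "provisioning_token" | "access_token" | "refresh_token";

const PREFIXES: Array<[string, CredentialKind]> = [
  ["cvk_", "api_key"],
  ["cmt_", "provisioning_token"],
  ["cmr_", "refresh_token"],
  ["cm_", "access_token"],
];

/** Returns the credential kind for a value, or null if the prefix is unknown. */
function credentialKind(value: string): CredentialKind | null {
  for (const [prefix, kind] of PREFIXES) {
    if (value.startsWith(prefix)) return kind;
  }
  return null;
}

/** Only cvk_ and cmt_ exist in Phase 3; cm_/cmr_ are reserved for Phase 4. */
function usableInPhase3(value: string): boolean {
  const kind = credentialKind(value);
  return kind === "api_key" || kind === "provisioning_token";
}
```

A CLI could use such a check to reject, for example, a `cmt_` token supplied where a `cvk_` key is expected.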

---

## 4. Token Storage

### Primary: `~/.engrm/settings.json`

The `cvk_` API key is stored in the existing settings file. No separate auth file needed.

```json
{
  "candengo_url": "https://www.candengo.com",
  "candengo_api_key": "cvk_...",
  "site_id": "unimpossible",
  "namespace": "dev-memory",
  "user_id": "david",
  "user_email": "david@example.com",
  "device_id": "macbook-a1b2c3d4",
  "teams": [
    { "id": "team_abc123", "name": "Unimpossible", "namespace": "dev-memory" }
  ],
  "sync": { ... },
  "search": { ... },
  "scrubbing": { ... }
}
```

### Secret Scrubber: `cvk_` Pattern

The existing scrubber already catches `cvk_` keys (see SPEC §6). This prevents API keys from leaking into observations.
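
To illustrate what catching `cvk_` keys could look like, here is a minimal sketch. The actual pattern lives in SPEC §6 and is not reproduced in this document, so the regular expression, the placeholder text, and the function name `scrubApiKeys` are all assumptions.

```typescript
// Assumed pattern: "cvk_" followed by one or more key characters.
// The real scrubber's pattern (SPEC §6) may differ.
const CVK_PATTERN = /\bcvk_[A-Za-z0-9_-]+/g;

/** Replace anything that looks like a cvk_ API key with a placeholder. */
function scrubApiKeys(text: string): string {
  return text.replace(CVK_PATTERN, "[REDACTED_API_KEY]");
}
```

For example, `scrubApiKeys("export ENGRM_TOKEN=cvk_abc123")` would yield `export ENGRM_TOKEN=[REDACTED_API_KEY]` under this pattern.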

### OS Keychain (Phase 4)

When `cm_`/`cmr_` tokens are introduced in Phase 4:
- Refresh token (`cmr_`) → OS keychain (macOS Keychain / libsecret / Windows Credential Manager)
- Access token (`cm_`) → memory only (short-lived, not persisted)
- Fallback to file storage with logged warning if keychain unavailable

---

## 5. Team Membership

### Data Model

`teams` is an **array** in both the auth response and settings.json. This supports multi-team membership from day one.

```typescript
interface TeamMembership {
  id: string;        // "team_abc123"
  name: string;      // "Unimpossible"
  namespace: string; // "dev-memory"
}
```

### Namespace Resolution

When syncing:
- Personal namespace: `{user_id}-personal` (always exists)
- Team namespace: from `teams[].namespace` (requires explicit join)

The `search` tool's `scope` parameter determines which namespaces to query:
- `personal` → personal namespace only
- `team` → team namespace(s) only
- `all` → all namespaces (default)
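
The scope rules above can be sketched as a single function. This is an illustration under stated assumptions: `resolveNamespaces` is a hypothetical name, and de-duplicating team namespaces is a choice made here, not something the design specifies.

```typescript
interface TeamMembership {
  id: string;
  name: string;
  namespace: string;
}

type SearchScope = "personal" | "team" | "all";

/** Map a search scope to the list of namespaces to query. */
function resolveNamespaces(
  scope: SearchScope,
  userId: string,
  teams: TeamMembership[],
): string[] {
  const personal = `${userId}-personal`; // always exists
  // Assumption: several teams may share a namespace, so duplicates are removed.
  const teamNamespaces = Array.from(new Set(teams.map((t) => t.namespace)));

  if (scope === "personal") return [personal];
  if (scope === "team") return teamNamespaces;
  return [personal, ...teamNamespaces]; // "all" is the default scope
}
```

With the settings example from section 4, scope `all` for user `david` would resolve to `david-personal` plus `dev-memory`.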

### Team Provisioning

Teams are **not** auto-provisioned. Flow:

1. **Personal namespace**: auto-provisioned on first auth (any flow)
2. **Team namespace**: explicit create or join action
   - Admin creates team at `engrm.dev/team`
   - Members join via invite link: `engrm.dev/join/team_abc123`
   - Or CLI: `engrm team join --code=INVITE_CODE`

---

## 6. Server-Side Requirements

### Endpoints (Phase 3)

| Endpoint | Method | Purpose |
|----------|--------|---------|
| `/v1/mem/provision` | POST | Exchange `cmt_` token or OAuth code for `cvk_` API key |
| `/v1/auth/device/code` | POST | Request device authorization code (RFC 8628) |
| `/v1/auth/device/token` | POST | Poll for device authorization completion |
| `/v1/auth/revoke` | POST | Revoke an API key by value |
| `/v1/auth/keys` | GET | List active API keys for account (dashboard) |
| `/v1/auth/keys` | DELETE | Revoke specific API key (dashboard) |

### Web Pages

| URL | Purpose |
|-----|---------|
| `engrm.dev` | Landing page + signup |
| `candengo.com/connect/mem` | OAuth authorization page |
| `candengo.com/connect/mem/device` | Device code entry page |
| `engrm.dev/team` | Team creation (admin) |
| `engrm.dev/join/{code}` | Team invite acceptance |
| `engrm.dev/dashboard` | Key management, usage, team settings |

### Database Tables (Server)

```sql
-- User accounts
CREATE TABLE mem_accounts (
  id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  email       TEXT UNIQUE NOT NULL,
  created_at  TIMESTAMPTZ DEFAULT now()
);

-- API keys (permanent, revocable)
CREATE TABLE mem_api_keys (
  key_hash      TEXT PRIMARY KEY,              -- SHA-256 of cvk_ key (never store plaintext)
  key_prefix    TEXT NOT NULL,                 -- First 8 chars for identification
  account_id    UUID NOT NULL REFERENCES mem_accounts(id),
  name          TEXT,                          -- "MacBook Pro", "CI Pipeline"
  scopes        TEXT[] DEFAULT '{read,write}', -- read, write, admin
  created_at    TIMESTAMPTZ DEFAULT now(),
  last_used_at  TIMESTAMPTZ,
  revoked_at    TIMESTAMPTZ                    -- NULL = active
);

-- Provisioning tokens (short-lived, single-use)
CREATE TABLE mem_provision_tokens (
  token       TEXT PRIMARY KEY,      -- cmt_abc123...
  account_id  UUID NOT NULL REFERENCES mem_accounts(id),
  expires_at  TIMESTAMPTZ NOT NULL,  -- created_at + 1 hour
  used_at     TIMESTAMPTZ,           -- NULL until redeemed
  created_at  TIMESTAMPTZ DEFAULT now()
);

-- Device authorization codes (RFC 8628)
CREATE TABLE mem_device_codes (
  device_code    TEXT PRIMARY KEY,
  user_code      TEXT UNIQUE NOT NULL,  -- XXXX-YYYY (human-readable)
  account_id     UUID,                  -- NULL until user authorizes
  expires_at     TIMESTAMPTZ NOT NULL,  -- created_at + 15 minutes
  authorized_at  TIMESTAMPTZ,           -- NULL until authorized
  created_at     TIMESTAMPTZ DEFAULT now()
);

-- Team membership
CREATE TABLE mem_teams (
  id          TEXT PRIMARY KEY,  -- team_abc123
  name        TEXT NOT NULL,
  namespace   TEXT NOT NULL,
  owner_id    UUID NOT NULL REFERENCES mem_accounts(id),
  created_at  TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE mem_team_members (
  team_id     TEXT NOT NULL REFERENCES mem_teams(id),
  account_id  UUID NOT NULL REFERENCES mem_accounts(id),
  role        TEXT DEFAULT 'member',  -- owner, admin, member
  joined_at   TIMESTAMPTZ DEFAULT now(),
  PRIMARY KEY (team_id, account_id)
);
```

---

## 7. Token Scopes

| Scope | Allows | Use Case |
|-------|--------|----------|
| `read` | search, get_observations, timeline, session_context | Read-only sync, CI/CD consumers |
| `write` | save_observation, pin_observation + all `read` | Normal agent usage |
| `admin` | team management, key revocation + all `write` | Team admins |

Default scopes for new keys: `read, write`.
CI/CD keys should be created with `read` only to limit blast radius.
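
The inclusion rules in the table (`write` includes `read`, `admin` includes `write`) can be sketched as a scope check. The helper name `allows` and the `IMPLIES` table are hypothetical; how the server actually evaluates scopes is not described here.

```typescript
type ScopeName = "read" | "write" | "admin";

// Each scope and the scopes it implies, following the table above.
const IMPLIES: Record<ScopeName, ScopeName[]> = {
  read: ["read"],
  write: ["write", "read"],
  admin: ["admin", "write", "read"],
};

/** True when any granted scope covers the required one. */
function allows(granted: ScopeName[], required: ScopeName): boolean {
  return granted.some((scope) => IMPLIES[scope].indexOf(required) !== -1);
}
```

Under this sketch a CI/CD key created with `read` only would pass checks for `search` but fail them for `save_observation`, which needs `write`.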

---

## 8. Sync Engine Auth Integration

```typescript
// src/sync/auth.ts

// Assumption: src/config.ts exports the Config type used below.
import type { Config } from "../config";

/**
 * Get a valid API key for sync operations.
 * Priority: env var → settings.json
 */
export function getApiKey(config: Config): string | null {
  // CI/CD: environment variable takes precedence
  const envKey = process.env.ENGRM_TOKEN;
  if (envKey && envKey.startsWith("cvk_")) return envKey;

  // Interactive: from settings
  if (config.candengo_api_key && config.candengo_api_key.startsWith("cvk_")) {
    return config.candengo_api_key;
  }

  return null;
}

/**
 * Check if sync is configured and authenticated.
 */
export function isSyncReady(config: Config): boolean {
  return getApiKey(config) !== null && config.candengo_url !== "";
}
```

### Auth Failure Handling

When the sync engine gets a 401 from the API:
1. Set sync state to `paused_auth`
2. Log: "Sync paused: API key invalid or revoked. Run 'engrm init' to re-authenticate."
3. Surface warning in next MCP tool response via `_meta.warning` field
4. Do not retry sync until user re-authenticates
5. Local features continue working normally
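
The five steps above can be sketched as a small state transition. Only `paused_auth`, the log message, and the `_meta.warning` field come from this document; the `active` state name and the function names are assumptions made for illustration.

```typescript
type SyncState = "active" | "paused_auth"; // "active" is an assumed name

interface SyncStatus {
  state: SyncState;
  warning: string | null;
}

const AUTH_WARNING =
  "Sync paused: API key invalid or revoked. Run 'engrm init' to re-authenticate.";

/** Steps 1-2: a 401 pauses sync; any other status leaves the state unchanged. */
function onSyncResponse(current: SyncStatus, httpStatus: number): SyncStatus {
  if (httpStatus === 401) return { state: "paused_auth", warning: AUTH_WARNING };
  return current;
}

/** Step 4: no sync attempts while paused. */
function shouldAttemptSync(status: SyncStatus): boolean {
  return status.state !== "paused_auth";
}

/** Step 3: attach the warning to an MCP tool result under `_meta.warning`. */
function attachWarning(
  result: Record<string, unknown>,
  status: SyncStatus,
): Record<string, unknown> {
  if (status.warning === null) return result;
  return { ...result, _meta: { warning: status.warning } };
}
```

Local tools would keep returning results either way; the only visible difference while paused is the extra `_meta.warning` field.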

---

## 9. Token Revocation

### API Key Rotation

Users can rotate keys from the dashboard (`engrm.dev/dashboard`) or CLI:

```bash
engrm auth rotate
# 1. Creates new cvk_ key on server
# 2. Updates settings.json with new key
# 3. Revokes old key
```

### Revocation Scenarios

| Scenario | Action |
|----------|--------|
| User runs `engrm auth revoke` | Revoke current key, clear from settings |
| Lost/stolen device | Revoke key from web dashboard |
| Team member removed | Admin revokes member's team-scoped keys |
| Account deletion | All keys revoked server-side |
| Suspected compromise | `engrm auth rotate` (atomic: new key, then revoke old) |

### Server-Side

- Keys are stored as SHA-256 hashes (never plaintext)
- `key_prefix` (first 8 chars) allows identification in dashboard without exposing full key
- `last_used_at` updated on each API call for activity monitoring
- Revoked keys return 401 immediately
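
As a sketch of the first two bullets, the stored form of a key can be derived with Node's built-in `crypto` module. The function name `toStoredKey` and the hex encoding of the digest are assumptions; the column names follow the `mem_api_keys` table in section 6.

```typescript
import { createHash } from "node:crypto";

interface StoredKey {
  key_hash: string;   // SHA-256 of the full key, hex-encoded (encoding assumed)
  key_prefix: string; // first 8 characters, safe to show in the dashboard
}

/** Derive what the server persists for a cvk_ key; the plaintext is never stored. */
function toStoredKey(apiKey: string): StoredKey {
  return {
    key_hash: createHash("sha256").update(apiKey, "utf8").digest("hex"),
    key_prefix: apiKey.slice(0, 8),
  };
}
```

On each request the server would hash the presented key the same way and look it up by `key_hash`, so a revoked or unknown key can be rejected without the plaintext ever being stored.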

---

## 10. Cross-Agent Compatibility

| Agent | Init Flow | Runtime Auth |
|-------|-----------|--------------|
| Claude Code | `engrm init` (any flow) | MCP server reads `cvk_` from settings.json |
| Codex CLI | Same init + set `bearer_token_env_var` | Codex passes token via env var |
| Cursor | Same init | MCP server reads from settings.json |
| Windsurf | Same init | MCP server reads from settings.json |
| Cline | Same init | MCP server reads from settings.json |
| CI/CD | `ENGRM_TOKEN=cvk_...` | Sync engine reads env var |

All agents share the same `cvk_` API key and the same settings.json. The MCP server binary is identical across agents — agent detection is separate from auth.

---

## 11. Phase 4: MCP-Native OAuth 2.1

When the MCP OAuth 2.1 spec stabilises and agents implement it:

1. MCP server returns `401 Unauthorized` with `WWW-Authenticate` header
2. Agent handles browser flow automatically (no `engrm init` needed)
3. Server issues short-lived `cm_` access token + `cmr_` refresh token
4. Refresh token stored in OS keychain via `keytar` or equivalent
5. Access token refreshed automatically by MCP client
6. This becomes the **preferred** flow; `cvk_` keys remain for CI/CD and backwards compat

This is additive — Track A (`cvk_` keys) continues to work. Track B (MCP OAuth) is an optional upgrade path.

---

## 12. Implementation Timeline

| Phase | Auth Work | Depends On |
|-------|-----------|------------|
| Phase 2 (current) | No auth needed — local only | — |
| Phase 3 (Sync) | Implement Flows A-D, `cvk_` key auth, revocation endpoints, team model | Candengo web backend |
| Phase 3.5 | Web dashboard (key management, team admin) | Phase 3 |
| Phase 4 | MCP-native OAuth 2.1, keychain storage, `cm_`/`cmr_` tokens | MCP spec stabilisation |

### Phase 3 Implementation Order

1. `POST /v1/mem/provision` — exchange `cmt_` token for `cvk_` key (server)
2. `engrm init --token=cmt_...` — CLI provisioning (client)
3. `engrm init` — browser OAuth callback flow (client)
4. `engrm init --no-browser` — device code flow (client + server)
5. `ENGRM_TOKEN` env var support in sync engine (client)
6. `POST /v1/auth/revoke` — key revocation (server)
7. `engrm auth rotate` — key rotation (client)
8. Team endpoints + `engrm team join` (client + server)

---

## Decisions Log

| # | Decision | Rationale | Review |
|---|----------|-----------|--------|
| 1 | `cvk_` API key is the single credential for sync | Avoids two auth models, two validation paths. OAuth is delivery, not credential type. | Devstral: approved |
| 2 | All config in `~/.engrm/` | Consolidate — no split between `~/.config/` and `~/.engrm/` | Devstral: approved |
| 3 | Device flow (RFC 8628) for headless/SSH | Localhost callback fails for remote dev. Device flow works everywhere. | Devstral: required |
| 4 | `teams` is an array, not scalar | Multi-team membership from day one. Cheap now, expensive to migrate later. | Devstral: required |
| 5 | Personal namespace auto-provisioned, team explicit | Prevents orphaned team namespaces from solo signups | Devstral: approved |
| 6 | Token revocation is Phase 3 blocker | Basic security requirement — cannot ship sync without revocation | Devstral: required |
| 7 | MCP-native OAuth deferred to Phase 4 | Spec still stabilising, no agent fully implements it | Devstral: approved |
| 8 | API keys stored as SHA-256 hashes server-side | Standard practice — never store plaintext credentials | Devstral: approved |
| 9 | `ENGRM_TOKEN` env var for CI/CD | Pipelines need stable credentials without config files | Devstral: approved |
| 10 | Scopes: read, write, admin | Limits blast radius of CI/CD tokens and leaked keys | Devstral: approved |
|
package/BRIEF.md
ADDED
|
@@ -0,0 +1,197 @@
# Engrm — Product Brief

## Executive Summary

Engrm is a **cross-device, team-shared memory layer for AI coding agents** — built on Candengo Vector's proven RAG infrastructure. It captures what developers learn, discover, fix, and decide during AI-assisted coding sessions and makes that knowledge instantly available across all their devices, team members, and future sessions.

**We're building this to solve our own problem first.** Our dev team works across multiple machines and projects (Candengo, Alchemy, AIMY). Every Claude Code session starts from zero — no memory of what was done yesterday, on another machine, or by another team member. Engrm fixes this with offline-first local storage that syncs to Candengo Vector, giving every developer's AI agent shared project context from day one.

The first integration targets **Claude Code** via its MCP and hooks system. The MCP interface is agent-agnostic, so future agents that support MCP can use the same memory backend.

**Not a fork.** Built from scratch, inspired by claude-mem's approach to Claude Code integration. No shared code, no AGPL dependency. Clean-room implementation designed around cross-device sync and team memory from the start.

**Built by Unimpossible Consultants** — the team behind the Candengo AI Knowledge Infrastructure platform, Alchemy, and AIMY.

---

## The Problem

### Context Amnesia
Every new Claude Code session starts from zero. Yesterday's debugging insights, architectural decisions, and hard-won knowledge are gone.

### Multi-Device Friction
Fix a bug on the laptop, continue on the desktop — no shared context. Our developers work across 2-3 machines and constantly re-explain the same codebase to the agent.

### Team Knowledge Silos
Developer A discovers a critical gotcha on Monday. Developer B hits the same issue on Tuesday. There's no automatic knowledge transfer between team members' AI agents. New team members' agents have zero institutional knowledge.

### Wasted Tokens and Time
AI agents re-discover the same patterns, make the same mistakes, ask the same clarifying questions — session after session, developer after developer.

### No Cross-Device Team Solution Exists
- claude-mem: local SQLite + ChromaDB, single device only
- mem0: cloud-only SaaS, no self-hosted option
- Cognee: knowledge graphs, no agent memory focus
- IDE memory (Cursor, Windsurf): locked to one tool

None offer offline-first cross-device sync with team memory on self-hosted infrastructure.

---

## The Solution

### Core Product
An MCP server + Claude Code hooks that:
1. **Self-provisions in under 2 minutes** — sign up at engrm.dev, run one command, done
2. **Captures observations automatically** from coding sessions (bugfixes, discoveries, decisions, patterns)
3. **Stores locally** in SQLite (instant, always works, offline-first)
4. **Syncs to Candengo Vector** when connected (cross-device search, semantic retrieval)
5. **Injects relevant context** on session start — agent picks up where you (or a teammate) left off, on any machine
6. **Scrubs secrets** before storage (API keys, tokens, passwords, connection strings)

### Architecture

```
Developer's Machine (any device)
┌─────────────────────────────────────────────┐
│  Claude Code / Future MCP Agent             │
│        ↕ MCP (stdio)                        │
│  ┌───────────────────────────────────┐      │
│  │  Engrm MCP Server                 │      │
│  │  - Observation capture (hooks)    │      │
│  │  - Local SQLite + FTS5            │      │
│  │  - Sync outbox queue              │      │
│  │  - Secret scrubbing               │      │
│  └──────────────┬────────────────────┘      │
│                 │ HTTPS (when available)    │
└─────────────────┼───────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────┐
│  Candengo Vector (self-hosted or cloud)     │
│  - BGE-M3 hybrid dense+sparse search        │
│  - Cross-encoder reranking                  │
│  - Multi-tenant (site_id/namespace)         │
└─────────────────────────────────────────────┘
```

### Data Flow

```
1. Developer works with Claude Code
2. Agent uses tools (reads files, runs commands, edits code)
3. PostToolUse hook → observation extracted (title, narrative, facts, type)
4. Secret scrubber strips sensitive content
5. Observation saved to local SQLite (instant, always works)
6. Observation added to sync_outbox
7. Sync engine pushes to Candengo Vector (fire-and-forget)
   - Online → pushed immediately
   - Offline → stays in outbox, retried on timer
8. Next session (any device) → search hits both local + Candengo Vector
9. Agent has context from previous sessions
```
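
Steps 4 to 7 of the data flow can be sketched with an in-memory stand-in for the local store and outbox. This is only an illustration of the offline-first ordering: the class `LocalStore`, its methods, and the synchronous `push` callback are hypothetical, and the package itself persists to SQLite rather than arrays.

```typescript
interface Observation {
  title: string;
  narrative: string;
}

// In-memory stand-in for local SQLite plus the sync outbox (illustrative only).
class LocalStore {
  readonly observations: Observation[] = [];
  readonly outbox: Observation[] = [];

  /** Steps 4-6: scrub first, save locally, then queue for sync. */
  capture(obs: Observation, scrub: (text: string) => string): void {
    const clean: Observation = {
      title: scrub(obs.title),
      narrative: scrub(obs.narrative),
    };
    this.observations.push(clean); // local save always succeeds
    this.outbox.push(clean);       // sync happens later, if at all
  }

  /** Step 7: push queued items in order; stop at the first failure and keep the rest. */
  flush(push: (obs: Observation) => boolean): number {
    let sent = 0;
    while (this.outbox.length > 0 && push(this.outbox[0])) {
      this.outbox.shift();
      sent++;
    }
    return sent;
  }
}
```

When `push` fails (offline), the observation stays in the outbox and remains searchable locally; a later `flush` on a timer would deliver it.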
|
|
93
|
+
|
|
94
|
+
---
|
|
95
|
+
|
|
96
|
+
## Target Users
|
|
97
|
+
|
|
98
|
+
### Phase 1: Our Team (Dogfood)
|
|
99
|
+
- Unimpossible dev team working across multiple machines and projects (Candengo, Alchemy, AIMY)
|
|
100
|
+
- Shared project memory so every developer's agent has team context
|
|
101
|
+
- Self-hosted on our own Candengo Vector infrastructure
|
|
102
|
+
|
|
103
|
+
### Phase 2: Individual Developers (Solo Plan)
|
|
104
|
+
- Power users who want persistent AI memory without cloud lock-in
|
|
105
|
+
- Privacy-conscious developers who want self-hosted infrastructure
|
|
106
|
+
|
|
107
|
+
### Phase 3: External Teams (Team Plan)
|
|
108
|
+
- Small-to-medium dev teams wanting shared institutional knowledge
|
|
109
|
+
- New team member's AI agent instantly has access to team knowledge
|
|
110
|
+
|
|
111
|
+
### Phase 4: Enterprise
|
|
112
|
+
- Large engineering organisations
|
|
113
|
+
- Compliance-sensitive environments requiring self-hosted data
|
|
114
|
+
|
|
115
|
+
---
|
|
116
|
+
|
|
117
|
+
## Key Differentiators
|
|
118
|
+
|
|
119
|
+
| Feature | Engrm | claude-mem | mem0 |
|
|
120
|
+
|---|---|---|---|
|
|
121
|
+
| Free cloud sync | Yes (generous free tier) | No (local only) | 10K memories free |
|
|
122
|
+
| Cross-device sync | Yes (offline-first) | No (local only) | Cloud only |
|
|
123
|
+
| Self-hosted option | Yes (Candengo Vector) | Local only | No (SaaS) |
|
|
124
|
+
| Offline-first | Yes (SQLite + outbox) | N/A (always local) | No |
|
|
125
|
+
| Team memory | Yes (shared namespace) | No | Limited |
|
|
126
|
+
| Multi-agent support | MCP standard | Claude Code only | Multiple (via API) |
|
|
127
|
+
| Vector search quality | BGE-M3 hybrid + reranking | ChromaDB default | Proprietary |
|
|
128
|
+
| Secret scrubbing | Yes | No | Unknown |
|
|
129
|
+
| License | FSL-1.1-ALv2 (source-available, Fair Source) | AGPL-3.0 | Proprietary |
|
|
130
|
+
|
|
131
|
+
---
|
|
132
|
+
|
|
133
|
+
## Licensing Strategy

**Split model** (core client published, premium features proprietary):

| Component | License | Published? |
|-----------|---------|-----------|
| Core client (MCP server, hooks, SQLite, search, sync) | FSL-1.1-ALv2 | Yes (GitHub) |
| Sentinel (real-time AI audit, config push, team standards) | Proprietary | No (private repo) |
| Server (Candengo Vector) | Proprietary | No (private repo) |

**FSL-1.1-ALv2 (Functional Source License)** is part of the [Fair Source](https://fair.io) movement. It is used by Sentry, Codecov, GitButler, and Keygen.

What it allows:
- Developers can read, modify, and run the code freely
- Companies can use it internally without restriction
- Each version automatically converts to Apache 2.0 after 2 years

What it restricts:
- Nobody can fork it and offer a competing hosted service

**Why not MIT/Apache**: Too permissive. A competitor could fork the plugin and offer a competing hosted service.

**Why not AGPL**: Too restrictive for adoption. Many companies have blanket AGPL bans.

**Why not ELv2**: FSL is the better fit. The 2-year Apache 2.0 conversion is a trust signal, and FSL has growing ecosystem legitimacy via Fair Source.

**Why separate Sentinel**: Premium IP (audit LLM orchestration, team standards sync, dashboard config push) stays in a private repo. This is the GitLab CE/EE pattern: clean separation, no license gymnastics.

---

## Revenue Model

### Free-First, Upgrade for More

The free tier is the product, not a demo. Developers get real cross-device sync with generous limits. Paid tiers unlock more storage, more devices, and team features.

| Tier | Price | Includes | Target |
|---|---|---|---|
| **Free** | $0 | Cloud sync, 10K observations, 2 devices, 1 user | Individual devs getting started |
| **Solo** | $9/mo | 50K observations, unlimited devices, priority sync | Power users, multi-machine devs |
| **Pro** | $19/mo | Unlimited observations, unlimited devices, advanced search | Heavy users |
| **Team** | $12/seat/mo (min 3) | Shared team memory, team analytics, admin controls | Dev teams (2-20) |
| **Enterprise** | Custom | Self-hosted Candengo Vector + support SLA, SSO, audit | Large orgs, compliance |

**Free tier rationale**: 10K observations is roughly 2-3 months of active daily use for a solo developer. That is long enough to get hooked, and hitting the limit is a natural point to upgrade. Two devices covers laptop plus desktop, the core cross-device use case.

**Self-hosted is always free**: Anyone can run their own Candengo Vector instance. The paid tiers are for the convenience of our hosted infrastructure.
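
As a rough illustration of how the observation and device caps in the tier table might be enforced, the sketch below encodes the Free, Solo, and Pro limits and checks them against an account's usage. Every name here (`TIER_LIMITS`, `canStoreObservation`, `canAddDevice`, `daysUntilLimit`) is hypothetical and not taken from the Engrm codebase. Team and Enterprise are left out because the table does not state caps for them.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not Engrm's actual API.
type Tier = "free" | "solo" | "pro";

interface TierLimits {
  maxObservations: number; // Infinity means unlimited
  maxDevices: number; // Infinity means unlimited
}

// Values copied from the tier table above.
const TIER_LIMITS: Record<Tier, TierLimits> = {
  free: { maxObservations: 10000, maxDevices: 2 },
  solo: { maxObservations: 50000, maxDevices: Infinity },
  pro: { maxObservations: Infinity, maxDevices: Infinity },
};

interface Usage {
  observations: number; // observations currently stored
  devices: number; // devices currently registered
}

interface Check {
  ok: boolean;
  reason?: "observation_limit" | "device_limit";
}

/** Can this account store one more observation? */
function canStoreObservation(tier: Tier, usage: Usage): Check {
  return usage.observations < TIER_LIMITS[tier].maxObservations
    ? { ok: true }
    : { ok: false, reason: "observation_limit" };
}

/** Can this account register one more device? */
function canAddDevice(tier: Tier, usage: Usage): Check {
  return usage.devices < TIER_LIMITS[tier].maxDevices
    ? { ok: true }
    : { ok: false, reason: "device_limit" };
}

/** Days until the observation cap is reached at a steady capture rate. */
function daysUntilLimit(tier: Tier, usage: Usage, observationsPerDay: number): number {
  if (observationsPerDay <= 0) return Infinity;
  const remaining = Math.max(0, TIER_LIMITS[tier].maxObservations - usage.observations);
  return remaining / observationsPerDay;
}
```

Under an assumed capture rate of 125 observations per day, `daysUntilLimit("free", { observations: 0, devices: 1 }, 125)` gives 80 days, which falls inside the 2-3 month estimate in the free tier rationale.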

### Revenue Flywheel
```
Free users (adoption) → Hit limits → Upgrade to Solo/Pro
→ Tell teammates → Team plan → Enterprise interest
→ More users → Justifies infrastructure investment → Better service → ...
```

---

## Success Metrics

| Metric | Target (6 months) | Target (12 months) |
|---|---|---|
| GitHub stars (plugin) | 1,000 | 10,000 |
| Active installations | 500 | 5,000 |
| Candengo Vector signups via Mem | 100 | 1,000 |
| Cross-device sync events/day | 10,000 | 100,000 |

package/CLAUDE.md ADDED
@@ -0,0 +1,44 @@

# CLAUDE.md

## Project Overview

Engrm (engrm.dev) is a cross-device, team-shared memory layer for AI coding agents, built to let our dev team share project context across machines and developers. It captures observations (discoveries, bugfixes, decisions, patterns) from AI-assisted coding sessions and syncs them via Candengo Vector.

**Not a fork.** Built from scratch, inspired by claude-mem's approach to hooking into Claude Code. No shared code, no AGPL dependency.

**Branding**: Public-facing product name is "Engrm" (engrm.dev). Repo stays `candengo-mem`. Server-side internals stay `mem_*`. MCP server name is "engrm". Config dir is `~/.engrm/`. Env var is `ENGRM_TOKEN`.
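
To make the naming above concrete, here is a minimal sketch of how a client might resolve its config directory and token. Only the directory name `.engrm`, the env var `ENGRM_TOKEN`, and the server name "engrm" come from this document. The function `resolveConfig`, the `ResolvedConfig` shape, and the rule that the environment variable takes precedence over a token read from a file are assumptions made for illustration.

```typescript
// Hypothetical sketch: resolveConfig and ResolvedConfig are illustrative names.
const MCP_SERVER_NAME = "engrm";
const CONFIG_DIR_NAME = ".engrm";
const TOKEN_ENV_VAR = "ENGRM_TOKEN";

interface ResolvedConfig {
  serverName: string;
  configDir: string;
  token: string | null;
  tokenSource: "env" | "file" | "none";
}

/**
 * Pure function so it can be exercised without touching the real environment.
 * `fileToken` stands for a token the caller has already read from some file
 * inside the config dir (the file name is not specified in this document).
 */
function resolveConfig(
  env: Record<string, string | undefined>,
  homeDir: string,
  fileToken?: string,
): ResolvedConfig {
  // Strip trailing slashes so "/home/dev/" and "/home/dev" give the same path.
  const configDir = `${homeDir.replace(/\/+$/, "")}/${CONFIG_DIR_NAME}`;
  const base = { serverName: MCP_SERVER_NAME, configDir };

  // Assumption: the environment variable wins over the file.
  const envToken = env[TOKEN_ENV_VAR]?.trim();
  if (envToken) return { ...base, token: envToken, tokenSource: "env" };

  const stored = fileToken?.trim();
  if (stored) return { ...base, token: stored, tokenSource: "file" };

  return { ...base, token: null, tokenSource: "none" };
}
```

A caller running under Bun or Node would typically pass its real environment and home directory, for example `resolveConfig(process.env, os.homedir())`.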

## Key Documents

- `BRIEF.md`: Product brief, architecture, revenue model, success metrics
- `SWOT.md`: Strengths, weaknesses, opportunities, threats analysis
- `PLAN.md`: Phased implementation plan with component architecture and effort estimates
- `SPEC.md`: Technical specification covering schemas, MCP tools, sync engine, search pipeline
- `COMPETITIVE.md`: Competitive analysis vs claude-mem, mem0, Cognee, etc.
- `MARKET.md`: Market research, competitor pricing, influencer reach, growth projections
- `INFRASTRUCTURE.md`: Scaling roadmap, account-based routing, capacity planning, cost analysis
- `AUTH-DESIGN.md`: Authentication flows, credential types, team model, token revocation
- `SYNC-ARCHITECTURE.md`: Bidirectional sync protocol, change feed, multi-agent compatibility
- `SERVER-API-PLAN.md`: Server-side API plan covering sync, teams, billing, usage (Devstral-reviewed)
- `SENTINEL.md`: Sentinel, real-time AI audit for coding agents (competitive research, architecture, implementation plan)

## Technology Stack

- **MCP Server**: TypeScript + Bun, MCP SDK (stdio transport)
- **Local Storage**: SQLite (bun:sqlite) with FTS5 for offline search
- **Remote Backend**: Candengo Vector (BGE-M3, Qdrant, hybrid search)
- **Agent Support**: Claude Code (hooks + MCP), future MCP-compatible agents

## Architecture

```
Agent ↔ MCP Server ↔ Local SQLite ↔ Sync Engine ↔ Candengo Vector
```

- SQLite is the source of truth (always available, offline-first)
- Sync outbox queues observations for push to Candengo Vector
- Search combines local FTS5 + remote vector search with result merging
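
The result merging in the last bullet could look something like the sketch below. This document does not say how Engrm merges results, so the sketch swaps in reciprocal rank fusion (RRF), a common way to combine ranked lists whose scores are not directly comparable (such as FTS5 and vector similarity). The `Hit` shape, the `mergeResults` name, and the constant `k = 60` are assumptions. Passing `null` for the remote list models the offline-first case, where only local results are available.

```typescript
// Hypothetical sketch: RRF is an assumed merging strategy, not confirmed by the source.
interface Hit {
  id: string;
  title: string;
}

/**
 * Merge two ranked lists with reciprocal rank fusion.
 * Each list contributes 1 / (k + rank) to a hit's score, where rank starts at 1.
 * Hits found by both local and remote search therefore rise to the top.
 */
function mergeResults(
  local: Hit[],
  remote: Hit[] | null, // null when the remote backend is unreachable
  k: number = 60,
  limit: number = 10,
): Hit[] {
  const scored: Record<string, { hit: Hit; score: number }> = {};

  const addList = (hits: Hit[]): void => {
    hits.forEach((hit, index) => {
      const entry = scored[hit.id] ?? { hit, score: 0 };
      entry.score += 1 / (k + index + 1);
      scored[hit.id] = entry;
    });
  };

  addList(local);
  if (remote !== null) addList(remote);

  return Object.keys(scored)
    .map((id) => scored[id])
    .sort((a, b) => b.score - a.score) // highest fused score first
    .slice(0, limit)
    .map((entry) => entry.hit);
}
```

Rank-based fusion avoids having to normalise keyword scores against vector distances, at the cost of ignoring how large the score gaps within each list are.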

## Development

This project is in early development. See `PLAN.md` for implementation phases and `SPEC.md` for technical details.