engrm 0.1.0 → 0.2.0
- package/README.md +214 -73
- package/bin/build.mjs +97 -0
- package/bin/engrm.mjs +13 -0
- package/dist/cli.js +2712 -0
- package/dist/hooks/elicitation-result.js +1786 -0
- package/dist/hooks/post-tool-use.js +2357 -0
- package/dist/hooks/pre-compact.js +1321 -0
- package/dist/hooks/sentinel.js +1168 -0
- package/dist/hooks/session-start.js +1473 -0
- package/dist/hooks/stop.js +1834 -0
- package/dist/server.js +16628 -0
- package/package.json +29 -4
- package/packs/api-best-practices.json +182 -0
- package/packs/nextjs-patterns.json +68 -0
- package/packs/node-security.json +68 -0
- package/packs/python-django.json +68 -0
- package/packs/react-gotchas.json +182 -0
- package/packs/typescript-patterns.json +67 -0
- package/packs/web-security.json +182 -0
- package/.mcp.json +0 -9
- package/AUTH-DESIGN.md +0 -436
- package/BRIEF.md +0 -197
- package/CLAUDE.md +0 -44
- package/COMPETITIVE.md +0 -174
- package/CONTEXT-OPTIMIZATION.md +0 -305
- package/INFRASTRUCTURE.md +0 -252
- package/MARKET.md +0 -230
- package/PLAN.md +0 -278
- package/SENTINEL.md +0 -293
- package/SERVER-API-PLAN.md +0 -553
- package/SPEC.md +0 -843
- package/SWOT.md +0 -148
- package/SYNC-ARCHITECTURE.md +0 -294
- package/VIBE-CODER-STRATEGY.md +0 -250
- package/bun.lock +0 -375
- package/hooks/post-tool-use.ts +0 -144
- package/hooks/session-start.ts +0 -64
- package/hooks/stop.ts +0 -131
- package/mem-page.html +0 -1305
- package/src/capture/dedup.test.ts +0 -103
- package/src/capture/dedup.ts +0 -76
- package/src/capture/extractor.test.ts +0 -245
- package/src/capture/extractor.ts +0 -330
- package/src/capture/quality.test.ts +0 -168
- package/src/capture/quality.ts +0 -104
- package/src/capture/retrospective.test.ts +0 -115
- package/src/capture/retrospective.ts +0 -121
- package/src/capture/scanner.test.ts +0 -131
- package/src/capture/scanner.ts +0 -100
- package/src/capture/scrubber.test.ts +0 -144
- package/src/capture/scrubber.ts +0 -181
- package/src/cli.ts +0 -517
- package/src/config.ts +0 -238
- package/src/context/inject.test.ts +0 -940
- package/src/context/inject.ts +0 -382
- package/src/embeddings/backfill.ts +0 -50
- package/src/embeddings/embedder.test.ts +0 -76
- package/src/embeddings/embedder.ts +0 -139
- package/src/lifecycle/aging.test.ts +0 -103
- package/src/lifecycle/aging.ts +0 -36
- package/src/lifecycle/compaction.test.ts +0 -264
- package/src/lifecycle/compaction.ts +0 -190
- package/src/lifecycle/purge.test.ts +0 -100
- package/src/lifecycle/purge.ts +0 -37
- package/src/lifecycle/scheduler.test.ts +0 -120
- package/src/lifecycle/scheduler.ts +0 -101
- package/src/provisioning/browser-auth.ts +0 -172
- package/src/provisioning/provision.test.ts +0 -198
- package/src/provisioning/provision.ts +0 -94
- package/src/register.test.ts +0 -167
- package/src/register.ts +0 -178
- package/src/server.ts +0 -436
- package/src/storage/migrations.test.ts +0 -244
- package/src/storage/migrations.ts +0 -261
- package/src/storage/outbox.test.ts +0 -229
- package/src/storage/outbox.ts +0 -131
- package/src/storage/projects.test.ts +0 -137
- package/src/storage/projects.ts +0 -184
- package/src/storage/sqlite.test.ts +0 -798
- package/src/storage/sqlite.ts +0 -934
- package/src/storage/vec.test.ts +0 -198
- package/src/sync/auth.test.ts +0 -76
- package/src/sync/auth.ts +0 -68
- package/src/sync/client.ts +0 -183
- package/src/sync/engine.test.ts +0 -94
- package/src/sync/engine.ts +0 -127
- package/src/sync/pull.test.ts +0 -279
- package/src/sync/pull.ts +0 -170
- package/src/sync/push.test.ts +0 -117
- package/src/sync/push.ts +0 -230
- package/src/tools/get.ts +0 -34
- package/src/tools/pin.ts +0 -47
- package/src/tools/save.test.ts +0 -301
- package/src/tools/save.ts +0 -231
- package/src/tools/search.test.ts +0 -69
- package/src/tools/search.ts +0 -181
- package/src/tools/timeline.ts +0 -64
- package/tsconfig.json +0 -22
package/PLAN.md
DELETED
|
@@ -1,278 +0,0 @@
|
|
|
1
|
-
# Implementation Plan — Engrm
|
|
2
|
-
|
|
3
|
-
## Approach
|
|
4
|
-
|
|
5
|
-
**Internal tooling first.** We're building this so our dev team can share project context across machines and developers. The public product comes later — first it needs to work for us.
|
|
6
|
-
|
|
7
|
-
Built from scratch. claude-mem is a reference for how to hook into Claude Code (hooks, MCP registration, observation capture patterns) but no code is shared. This avoids AGPL licensing issues and lets us design the architecture around cross-device team memory from the start.
|
|
8
|
-
|
|
9
|
-
## Component Architecture
|
|
10
|
-
|
|
11
|
-
```
|
|
12
|
-
engrm/
|
|
13
|
-
├── src/
|
|
14
|
-
│ ├── server.ts # MCP protocol handler (entry point)
|
|
15
|
-
│ ├── tools/ # MCP tool implementations
|
|
16
|
-
│ │ ├── search.ts # search() — hybrid local + remote, project-scoped
|
|
17
|
-
│ │ ├── timeline.ts # timeline() — chronological context
|
|
18
|
-
│ │ ├── get.ts # get_observations() — fetch by ID
|
|
19
|
-
│ │ ├── save.ts # save_observation() — manual save with quality scoring
|
|
20
|
-
│ │ └── pin.ts # pin_observation() — prevent aging
|
|
21
|
-
│ ├── capture/ # Observation extraction
|
|
22
|
-
│ │ ├── extractor.ts # Extract observations from tool use
|
|
23
|
-
│ │ ├── scrubber.ts # Secret/PII scrubbing
|
|
24
|
-
│ │ ├── quality.ts # Quality scoring (0.0-1.0)
|
|
25
|
-
│ │ └── dedup.ts # Near-duplicate detection (title similarity)
|
|
26
|
-
│ ├── storage/ # Local storage layer
|
|
27
|
-
│ │ ├── sqlite.ts # SQLite database (source of truth)
|
|
28
|
-
│ │ ├── migrations.ts # Schema migrations
|
|
29
|
-
│ │ ├── outbox.ts # Sync outbox queue
|
|
30
|
-
│ │ └── projects.ts # Project identity (git remote → canonical ID)
|
|
31
|
-
│ ├── lifecycle/ # Observation lifecycle management
|
|
32
|
-
│ │ ├── aging.ts # Daily: active → aging after 30 days
|
|
33
|
-
│ │ ├── compaction.ts # Weekly: aging → archived, generate digests
|
|
34
|
-
│ │ └── purge.ts # Monthly: delete archived > 12 months
|
|
35
|
-
│ ├── sync/ # Remote sync layer
|
|
36
|
-
│ │ ├── client.ts # Candengo Vector REST client
|
|
37
|
-
│ │ └── engine.ts # Sync engine (outbox flush, backfill, archival cleanup)
|
|
38
|
-
│ ├── context/ # Context injection
|
|
39
|
-
│ │ └── inject.ts # Session start context builder
|
|
40
|
-
│ └── config.ts # Configuration management
|
|
41
|
-
│
|
|
42
|
-
├── hooks/ # Claude Code hooks
|
|
43
|
-
│ ├── post-tool-use.sh # Observation capture
|
|
44
|
-
│ └── stop.sh # Session summary + sync flush
|
|
45
|
-
│
|
|
46
|
-
├── package.json
|
|
47
|
-
├── tsconfig.json
|
|
48
|
-
├── BRIEF.md
|
|
49
|
-
├── SPEC.md
|
|
50
|
-
├── PLAN.md # This file
|
|
51
|
-
└── CLAUDE.md
|
|
52
|
-
```
|
|
53
|
-
|
|
54
|
-
---
|
|
55
|
-
|
|
56
|
-
## Phase 1: Local MCP Server + Provisioning (Weeks 1-2)
|
|
57
|
-
|
|
58
|
-
**Goal**: Working MCP server with local SQLite storage, and a self-service provisioning flow so any developer can go from zero to working memory in under 2 minutes.
|
|
59
|
-
|
|
60
|
-
### 1.1 MCP Server Core
|
|
61
|
-
|
|
62
|
-
| Task | Description | Effort |
|
|
63
|
-
|---|---|---|
|
|
64
|
-
| Project scaffolding | TypeScript + Bun, MCP SDK, bun:sqlite | S |
|
|
65
|
-
| SQLite schema + migrations | projects, observations, sessions, sync_outbox tables (see SPEC §1-2) | M |
|
|
66
|
-
| Project identity detection | Auto-detect canonical project ID from git remote URL, normalise, store in projects table | M |
|
|
67
|
-
| MCP tool: `save_observation` | Save to local SQLite with project FK, quality score, add to sync outbox | S |
|
|
68
|
-
| MCP tool: `search` | Local SQLite FTS5 search, project-scoped by default, quality-weighted ranking | M |
|
|
69
|
-
| MCP tool: `get_observations` | Fetch by IDs from local SQLite | S |
|
|
70
|
-
| MCP tool: `timeline` | Chronological context around an observation | M |
|
|
71
|
-
| MCP tool: `pin_observation` | Pin/unpin observations to prevent aging | XS |
|
|
72
|
-
| Quality scoring | Score observations at capture time (0.0-1.0) based on type, content signals (see SPEC §2) | M |
|
|
73
|
-
| Secret scrubber | Regex-based scrubbing of API keys, passwords, tokens before storage | M |
|
|
74
|
-
| Relative file paths | Store file paths relative to project root, resolve at capture time | S |
|
|
75
|
-
| Configuration | `~/.engrm/settings.json` — local paths, remote config | S |
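The "Project identity detection" task above can be sketched as a small normalisation function. This is only an illustration: the function name `canonicalProjectId` and the specific rules (strip scheme and credentials, drop `.git`, lowercase) are assumptions, since the plan only states that the ID comes from a normalised git remote URL.

```typescript
// Sketch only: the normalisation rules below are assumptions, not the
// package's actual implementation.
function canonicalProjectId(remoteUrl: string): string {
  let url = remoteUrl.trim();
  // scp-style SSH remote: git@host:owner/repo.git -> host/owner/repo.git
  const scp = url.match(/^[^@\/\s]+@([^:\/\s]+):(.+)$/);
  if (scp) {
    url = scp[1] + "/" + scp[2];
  } else {
    // Drop the scheme (https://, ssh://, ...) and any user@ prefix.
    url = url.replace(/^[a-z][a-z0-9+.-]*:\/\//i, "").replace(/^[^@\/]+@/, "");
  }
  // Drop trailing slashes and the .git suffix, then lowercase everything.
  url = url.replace(/\/+$/, "").replace(/\.git$/i, "");
  return url.toLowerCase();
}
```

With rules like these, SSH and HTTPS clones of the same repository resolve to the same ID, which is what lets observations match across machines.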
|
|
76
|
-
|
|
77
|
-
### 1.2 Self-Provisioning
|
|
78
|
-
|
|
79
|
-
| Task | Description | Effort |
|
|
80
|
-
|---|---|---|
|
|
81
|
-
| Engrm landing page | `www.engrm.dev` — product page + signup + install instructions | M |
|
|
82
|
-
| Account provisioning backend | Signup → create mem_accounts row, namespace, provision token | M |
|
|
83
|
-
| Provision API endpoint | `POST /v1/mem/provision` — exchange token for permanent credentials | S |
|
|
84
|
-
| `npx engrm init` | CLI command: redeem token, write settings, register MCP + hooks in Claude Code | M |
|
|
85
|
-
| Team invite flow | Admin creates team → invite URL → member joins with team namespace pre-configured | M |
|
|
86
|
-
| Self-hosted init path | `--url` flag for custom endpoints, `--manual` for air-gapped environments | S |
|
|
87
|
-
|
|
88
|
-
### 1.3 Provisioning Flow
|
|
89
|
-
|
|
90
|
-
```
|
|
91
|
-
1. Developer visits www.engrm.dev
|
|
92
|
-
2. Signs up (email or GitHub OAuth)
|
|
93
|
-
3. Backend provisions account + namespace
|
|
94
|
-
4. Page shows personalised install command:
|
|
95
|
-
npx engrm init --token=cmt_abc123...
|
|
96
|
-
5. Developer runs command in terminal
|
|
97
|
-
6. Plugin exchanges token → gets API key, endpoint, namespace
|
|
98
|
-
7. Plugin writes settings.json, registers MCP server + hooks in Claude Code
|
|
99
|
-
8. Next Claude Code session has memory
|
|
100
|
-
```
|
|
101
|
-
|
|
102
|
-
For teams: admin creates team at `www.engrm.dev/team`, shares invite link, team members get pre-configured for the shared namespace.
|
|
103
|
-
|
|
104
|
-
**Deliverable**: A working MCP server that Claude Code can call, with self-service provisioning from `www.engrm.dev`. Any developer can sign up and be running in under 2 minutes.
|
|
105
|
-
|
|
106
|
-
---
|
|
107
|
-
|
|
108
|
-
## Phase 2: Claude Code Hooks (Week 3)
|
|
109
|
-
|
|
110
|
-
**Goal**: Automatic observation capture from Claude Code sessions.
|
|
111
|
-
|
|
112
|
-
| Task | Description | Effort |
|
|
113
|
-
|---|---|---|
|
|
114
|
-
| PostToolUse hook | Shell script that extracts observations from tool results | L |
|
|
115
|
-
| Stop hook | Session summary generation, sync flush | M |
|
|
116
|
-
| MCP server registration | `.mcp.json` config for Claude Code | XS |
|
|
117
|
-
| Hooks registration | `hooks.json` for Claude Code | XS |
|
|
118
|
-
| Context injection | Inject relevant history on session start (via MCP tool call) | M |
|
|
119
|
-
| Observation quality filtering | Skip trivial tool uses (ls, cat of small files), focus on meaningful work | M |
|
|
120
|
-
|
|
121
|
-
**Deliverable**: Claude Code automatically captures observations as you work. Session summaries on exit. Relevant history injected on start.
|
|
122
|
-
|
|
123
|
-
### Observation Extraction Design
|
|
124
|
-
|
|
125
|
-
This is the hardest problem. What makes a good observation?
|
|
126
|
-
|
|
127
|
-
**Capture triggers** (PostToolUse):
|
|
128
|
-
- File edits → what changed and why
|
|
129
|
-
- Command execution with errors → what failed and how it was fixed
|
|
130
|
-
- Multiple file reads in sequence → likely investigating something
|
|
131
|
-
- Test runs → pass/fail context
|
|
132
|
-
|
|
133
|
-
**Skip** (low signal):
|
|
134
|
-
- Simple file reads (single `cat`)
|
|
135
|
-
- `ls`, `pwd`, `git status` and similar navigation
|
|
136
|
-
- Repeated identical tool calls
|
|
137
|
-
|
|
138
|
-
**Extraction approach**: The hook sends the tool name + result summary to the MCP server. The server decides whether it's worth capturing based on the tool type and content. Quality score is assigned at capture time. Observations are batched per-session and deduplicated (title similarity > 0.8 against last 24h → merge into existing).
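A minimal sketch of the deduplication rule described above (title similarity > 0.8 against the last 24h). The plan does not say how similarity is computed, so token-set Jaccard similarity is used here purely as a stand-in; `Observation`, `titleSimilarity`, and `findDuplicate` are hypothetical names.

```typescript
interface Observation {
  id: number;
  title: string;
  createdAtEpoch: number; // seconds since the Unix epoch
}

// Lowercased alphanumeric tokens of a title.
function tokens(title: string): Set<string> {
  return new Set(title.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean));
}

// Jaccard similarity on title tokens (an assumed metric, not the plan's).
function titleSimilarity(a: string, b: string): number {
  const ta = tokens(a);
  const tb = tokens(b);
  if (ta.size === 0 && tb.size === 0) return 1;
  let shared = 0;
  ta.forEach((t) => {
    if (tb.has(t)) shared++;
  });
  return shared / (ta.size + tb.size - shared);
}

// Returns the recent observation to merge into, or null to save a new one.
function findDuplicate(
  title: string,
  recent: Observation[],
  nowEpoch: number,
  threshold = 0.8,
  windowSeconds = 24 * 60 * 60
): Observation | null {
  for (const obs of recent) {
    const inWindow = nowEpoch - obs.createdAtEpoch <= windowSeconds;
    if (inWindow && titleSimilarity(title, obs.title) > threshold) return obs;
  }
  return null;
}
```

A caller would pass only observations from the same project, since the lifecycle table scopes the check to "same project".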
|
|
139
|
-
|
|
140
|
-
### Observation Lifecycle + Deduplication
|
|
141
|
-
|
|
142
|
-
| Task | Description | Effort |
|
|
143
|
-
|---|---|---|
|
|
144
|
-
| Deduplication on save | Check title similarity against last 24h for same project, merge if > 0.8 | M |
|
|
145
|
-
| Aging job | Daily: move active observations older than 30 days to aging (0.7x search weight) | S |
|
|
146
|
-
| Archival + compaction | Weekly: observations > 90 days grouped by session, summarised into digest | L |
|
|
147
|
-
| Purge job | Monthly: delete archived observations > 12 months (keep digests + pinned) | S |
|
|
148
|
-
| FTS5 index maintenance | Remove archived observations from FTS5 index during compaction | S |
|
|
149
|
-
| Quota check | Count active+aging observations for free tier enforcement | S |
|
|
150
|
-
|
|
151
|
-
**Why this matters now**: Without lifecycle management, a developer generating ~100 observations/day hits 10K in ~3 months. Search results degrade as old, irrelevant observations pollute rankings. Compaction turns 25 old observations from a debugging session into one useful digest. Aging reduces the weight of stale knowledge. The free tier stays usable because only active+aging observations count toward the 10K limit — compacted observations are free.
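The lifecycle thresholds from the table above (30 days, 90 days, 12 months, 0.7x weight) can be summarised in a few pure functions. This is a sketch under assumptions: the names are hypothetical, 12 months is approximated as 365 days, and pinned observations are treated as always active because the plan says pinning prevents aging.

```typescript
type LifecycleState = "active" | "aging" | "archived" | "purged";

// Maps an observation's age to its lifecycle state.
function lifecycleState(ageDays: number, pinned: boolean): LifecycleState {
  if (pinned) return "active"; // pinning prevents aging
  if (ageDays > 365) return "purged"; // assumed equivalent of "12 months"
  if (ageDays > 90) return "archived";
  if (ageDays > 30) return "aging";
  return "active";
}

// Multiplier applied to search scores; archived rows leave the FTS5 index.
function searchWeight(state: LifecycleState): number {
  if (state === "active") return 1.0;
  if (state === "aging") return 0.7;
  return 0;
}

// Only active and aging observations count toward the free-tier quota.
function countsTowardQuota(state: LifecycleState): boolean {
  return state === "active" || state === "aging";
}
```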
|
|
152
|
-
|
|
153
|
-
---
|
|
154
|
-
|
|
155
|
-
## Phase 3: Cross-Device Sync + Team Memory (Weeks 4-6)
|
|
156
|
-
|
|
157
|
-
**Goal**: Offline-first sync to Candengo Vector with team support from day one. Work on laptop, continue on desktop. Other developers' observations are searchable too.
|
|
158
|
-
|
|
159
|
-
Team memory isn't a separate phase — it's the reason we're building this. User identity, attribution, and shared namespaces are built into the sync layer from the start.
|
|
160
|
-
|
|
161
|
-
### 3.1 Candengo Vector API Prep
|
|
162
|
-
|
|
163
|
-
| Task | Description | Effort |
|
|
164
|
-
|---|---|---|
|
|
165
|
-
| Metadata filtering on search API | `metadata_filters` param on `/v1/search` — filter by `project_canonical`, `user_id`, etc. | S |
|
|
166
|
-
| Document listing by source_type | `GET /v1/documents?source_type=X` with pagination | S |
|
|
167
|
-
| Document deletion by source_id | `DELETE /v1/documents/{source_id}` — needed for archival/compaction cleanup | S |
|
|
168
|
-
| Device/user ID tracking in metadata | Accept `device_id`, `user_id` in metadata | XS |
|
|
169
|
-
|
|
170
|
-
### 3.2 Sync Engine
|
|
171
|
-
|
|
172
|
-
| Task | Description | Effort |
|
|
173
|
-
|---|---|---|
|
|
174
|
-
| Candengo Vector REST client | TypeScript HTTP client for `/v1/ingest`, `/v1/search`, `/v1/ingest/batch`, `/v1/documents/{id}` | M |
|
|
175
|
-
| Fire-and-forget sync | On observation save → attempt immediate push | S |
|
|
176
|
-
| Background sync timer | Every 30s → flush pending outbox items (batch of 50) | S |
|
|
177
|
-
| Startup backfill | On boot → sync observations saved while offline (high-water-mark) | M |
|
|
178
|
-
| Connectivity detection | Skip sync when offline, resume when connected | S |
|
|
179
|
-
| Retry with exponential backoff | Failed syncs retry 30s, 60s, 120s, max 5min | S |
|
|
180
|
-
| Observation → Candengo mapping | Map to ingest format with `project_canonical` in metadata, source_id = `{user}-{device}-obs-{id}` | M |
|
|
181
|
-
| Archival sync | When compaction runs: delete archived source_ids from Vector, ingest digest | M |
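The retry schedule in the table above (30s, 60s, 120s, capped at 5 minutes) is a doubling backoff. A minimal sketch, with the function name and zero-based `attempt` convention as assumptions:

```typescript
// Delay before retry number `attempt` (0 = first retry), in seconds.
function retryDelaySeconds(attempt: number, baseSeconds = 30, maxSeconds = 300): number {
  return Math.min(baseSeconds * Math.pow(2, attempt), maxSeconds);
}
```

For attempts 0 through 4 this yields 30, 60, 120, 240, and 300 seconds; every later attempt stays at the 5 minute cap.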
|
|
182
|
-
|
|
183
|
-
### 3.3 Team + Hybrid Search
|
|
184
|
-
|
|
185
|
-
| Task | Description | Effort |
|
|
186
|
-
|---|---|---|
|
|
187
|
-
| Hybrid search orchestrator | Query local FTS5 + Candengo `/v1/search` in parallel, scoped by `project_canonical` | M |
|
|
188
|
-
| Result merging + deduplication | Merge by source_id, weighted scoring (semantic × quality × lifecycle) | M |
|
|
189
|
-
| Graceful degradation | Candengo unreachable → local-only search (transparent) | S |
|
|
190
|
-
| Device ID generation | Auto-generate stable device ID on first run | XS |
|
|
191
|
-
| User identity + attribution | `user_id` in all observations, "david/laptop" in results | S |
|
|
192
|
-
| Source ID namespacing | `{user_id}-{device_id}-obs-{local_id}` prevents all collisions | S |
|
|
193
|
-
| Visibility controls | `shared` / `personal` / `secret` flags | M |
|
|
194
|
-
| Team search scope | Search own + team observations by default, filtered by `project_canonical` | M |
|
|
195
|
-
| Cross-project search | Support `project: "*"` to search across all projects | S |
|
|
196
|
-
|
|
197
|
-
**Deliverable**: Full cross-device team sync. Observations from any team member appear on any device within 30 seconds. Works offline, syncs when reconnected. Projects are matched across machines by git remote URL. New developer installs, connects to the shared namespace, and their agent has the full team knowledge base.
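To make the source ID format and the merge step from the table above concrete, here is a rough sketch. The field names and the choice to keep the higher-scoring copy of a duplicate are assumptions; the plan only specifies the ID format and the semantic × quality × lifecycle weighting.

```typescript
interface SearchHit {
  sourceId: string;
  semantic: number; // relevance from FTS5 or the remote search
  quality: number; // 0.0-1.0 quality score assigned at capture time
  lifecycleWeight: number; // e.g. 1.0 for active, 0.7 for aging
}

interface RankedHit {
  sourceId: string;
  score: number;
}

// Builds the namespaced ID: {user_id}-{device_id}-obs-{local_id}
function makeSourceId(userId: string, deviceId: string, localId: number): string {
  return `${userId}-${deviceId}-obs-${localId}`;
}

// Merges local and remote hits by sourceId and ranks by weighted score.
function mergeHits(local: SearchHit[], remote: SearchHit[]): RankedHit[] {
  const best: Record<string, number> = {};
  for (const hit of local.concat(remote)) {
    const score = hit.semantic * hit.quality * hit.lifecycleWeight;
    // Assumed policy: when both sides return the same ID, keep the higher score.
    if (!(hit.sourceId in best) || score > best[hit.sourceId]) {
      best[hit.sourceId] = score;
    }
  }
  return Object.keys(best)
    .map((id) => ({ sourceId: id, score: best[id] }))
    .sort((a, b) => b.score - a.score);
}
```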
|
|
198
|
-
|
|
199
|
-
### Backfill Strategy
|
|
200
|
-
|
|
201
|
-
Instead of diffing all IDs on every startup (expensive at scale), use a high-water-mark:
|
|
202
|
-
|
|
203
|
-
```
|
|
204
|
-
1. Store last_synced_epoch locally
|
|
205
|
-
2. On startup: SELECT * FROM observations WHERE created_at_epoch > last_synced_epoch
|
|
206
|
-
3. Batch push missing observations
|
|
207
|
-
4. Update last_synced_epoch
|
|
208
|
-
```
|
|
209
|
-
|
|
210
|
-
Simple, efficient, scales to any observation count.
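The four steps above could look roughly like this in memory. It is a sketch under assumptions: the names are hypothetical, the array stands in for the SQLite query, the push is synchronous for brevity, and the batch size of 50 is borrowed from the background sync timer row.

```typescript
interface StoredObservation {
  id: number;
  createdAtEpoch: number;
}

// Pushes everything newer than the high-water mark and returns the new mark.
function backfill(
  all: StoredObservation[],
  lastSyncedEpoch: number,
  pushBatch: (batch: StoredObservation[]) => void,
  batchSize = 50
): number {
  // Step 2: equivalent of WHERE created_at_epoch > last_synced_epoch
  const pending = all
    .filter((o) => o.createdAtEpoch > lastSyncedEpoch)
    .sort((a, b) => a.createdAtEpoch - b.createdAtEpoch);

  let mark = lastSyncedEpoch;
  for (let i = 0; i < pending.length; i += batchSize) {
    const batch = pending.slice(i, i + batchSize);
    try {
      pushBatch(batch); // Step 3: batch push
    } catch {
      break; // keep the mark at the last successful batch
    }
    mark = batch[batch.length - 1].createdAtEpoch; // Step 4
  }
  return mark;
}
```

Advancing the mark only after a batch succeeds means a failed push is retried on the next startup rather than silently skipped.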
|
|
211
|
-
|
|
212
|
-
---
|
|
213
|
-
|
|
214
|
-
## Phase 4: Dogfood (Weeks 7-8)
|
|
215
|
-
|
|
216
|
-
**Goal**: Run it internally on our projects (Candengo, Alchemy, AIMY). Fix what hurts.
|
|
217
|
-
|
|
218
|
-
| Task | Description | Effort |
|
|
219
|
-
|---|---|---|
|
|
220
|
-
| Team onboarding | Install on all dev machines, shared Candengo Vector namespace | S |
|
|
221
|
-
| Observation quality tuning | Adjust capture filters based on real usage — too noisy? too quiet? | M |
|
|
222
|
-
| Search relevance tuning | Adjust scoring weights based on real queries | M |
|
|
223
|
-
| Bug fixes from dogfooding | Whatever breaks | M |
|
|
224
|
-
| Automated testing | Unit tests, sync integration tests | L |
|
|
225
|
-
| Performance benchmarking | <50ms local search, <200ms remote search | M |
|
|
226
|
-
|
|
227
|
-
---
|
|
228
|
-
|
|
229
|
-
## Phase 5: Public Launch (Weeks 9-10)
|
|
230
|
-
|
|
231
|
-
| Task | Description | Effort |
|
|
232
|
-
|---|---|---|
|
|
233
|
-
| One-line installer | `npx engrm install` or similar | M |
|
|
234
|
-
| CLI tool | `engrm status`, `search`, `sync` commands | M |
|
|
235
|
-
| Documentation | Installation, configuration, usage guide | M |
|
|
236
|
-
| GitHub repo (FSL-1.1-ALv2 license) | README, examples, contributing guide, LICENSE file | M |
|
|
237
|
-
| Free tier limits enforcement | Observation count, device count checks against account tier | M |
|
|
238
|
-
| Upgrade flow | In-plugin nudge when approaching limits, link to engrm.dev/upgrade | S |
|
|
239
|
-
|
|
240
|
-
**Licensing**: Core client released under FSL-1.1-ALv2 (Functional Source License, Fair Source). Source-available — developers can read, modify, and self-host freely. The restriction: nobody can fork it and offer a competing hosted service. Each version converts to Apache 2.0 after 2 years. Sentinel (real-time AI audit) is proprietary, delivered from a separate private repo to paying customers only.
|
|
241
|
-
|
|
242
|
-
---
|
|
243
|
-
|
|
244
|
-
## Effort Key
|
|
245
|
-
|
|
246
|
-
| Size | Estimated Effort | Description |
|
|
247
|
-
|---|---|---|
|
|
248
|
-
| XS | < 2 hours | Trivial change, config, or wrapper |
|
|
249
|
-
| S | 2-4 hours | Straightforward implementation |
|
|
250
|
-
| M | 4-8 hours | Moderate complexity, some design decisions |
|
|
251
|
-
| L | 1-2 days | Significant feature, requires careful design |
|
|
252
|
-
|
|
253
|
-
---
|
|
254
|
-
|
|
255
|
-
## Dependencies & Critical Path
|
|
256
|
-
|
|
257
|
-
```
|
|
258
|
-
Phase 1 (Local MCP) ──→ Phase 2 (Hooks) ──→ Phase 3 (Sync + Team) ──→ Phase 4 (Dogfood) ──→ Phase 5 (Launch)
|
|
259
|
-
```
|
|
260
|
-
|
|
261
|
-
**Phase 1+2 are usable standalone** — local-only memory is already valuable.
|
|
262
|
-
**Phase 3 is the whole point** — cross-device team sync is why we're building this.
|
|
263
|
-
**Phase 4 is essential** — dogfooding on our own projects before releasing externally.
|
|
264
|
-
|
|
265
|
-
---
|
|
266
|
-
|
|
267
|
-
## Risk Register
|
|
268
|
-
|
|
269
|
-
| Risk | Impact | Likelihood | Mitigation |
|
|
270
|
-
|---|---|---|---|
|
|
271
|
-
| Observation quality too noisy | High | Medium | Quality scoring (0.0-1.0), skip below 0.1, deduplication on save, compaction at 90 days |
|
|
272
|
-
| Observation volume exceeds quota | Medium | High | Lifecycle management: aging → archival → purge. Compaction summarises old sessions into digests. Only active+aging counts toward quota |
|
|
273
|
-
| Project identity mismatch across machines | High | Medium | Canonical ID from normalised git remote URL. Fallback: `.engrm.json` in project root |
|
|
274
|
-
| Search relevance degrades over time | High | Medium | Quality-weighted ranking, lifecycle scoring (aging=0.7x), project scoping, compaction removes noise |
|
|
275
|
-
| Source ID collisions across devices | Medium | High | Source ID = `{user_id}-{device_id}-obs-{local_id}` — unique across all dimensions |
|
|
276
|
-
| MCP protocol breaking changes | High | Low | Pin MCP SDK version, abstract protocol layer |
|
|
277
|
-
| Secret leakage in observations | Critical | Medium | Multi-layer scrubbing, sensitivity classification, relative file paths only |
|
|
278
|
-
| Sync conflicts | Medium | Low | Source ID namespacing — structurally impossible for two users to overwrite each other |
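As an illustration of the regex-based scrubbing mentioned under "Secret leakage in observations", a minimal sketch follows. The patterns are examples chosen for this sketch, not the package's actual rule set, and real scrubbing would need a much broader list.

```typescript
// Illustrative patterns only; each entry pairs a regex with its replacement.
const SECRET_PATTERNS: { name: string; regex: RegExp; replacement: string }[] = [
  // AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
  { name: "aws_access_key", regex: /\bAKIA[0-9A-Z]{16}\b/g, replacement: "[REDACTED]" },
  // HTTP bearer tokens.
  { name: "bearer_token", regex: /\bBearer\s+[A-Za-z0-9._~+\/-]+=*/g, replacement: "Bearer [REDACTED]" },
  // key = value style assignments such as password=..., api_key: "..."
  {
    name: "assignment",
    regex: /\b(password|passwd|secret|token|api[_-]?key)(\s*[:=]\s*)(["']?)[^\s"']+\3/gi,
    replacement: "$1$2$3[REDACTED]$3",
  },
];

// Replaces anything matching a known secret pattern before storage.
function scrub(text: string): string {
  let result = text;
  for (const pattern of SECRET_PATTERNS) {
    result = result.replace(pattern.regex, pattern.replacement);
  }
  return result;
}
```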
|
package/SENTINEL.md
DELETED
|
@@ -1,293 +0,0 @@
|
|
|
1
|
-
# Engrm Sentinel — Real-Time AI Audit for Coding Agents
|
|
2
|
-
|
|
3
|
-
**Status**: Planned (Phase 5)
|
|
4
|
-
**Target Launch**: April 22, 2026 (6 weeks from 2026-03-11)
|
|
5
|
-
**Tier**: Pro + Team (paid upsell feature)
|
|
6
|
-
|
|
7
|
-
## Executive Summary
|
|
8
|
-
|
|
9
|
-
Sentinel is a real-time code validation layer that intercepts AI agent tool calls (file writes, edits) **before execution**, retrieves team-specific coding standards from Engrm's vector memory, and routes the diff through a configurable audit LLM for judgment. If the code violates team standards, Sentinel blocks the write and tells the agent exactly what to fix — the agent self-corrects automatically.
|
|
10
|
-
|
|
11
|
-
No competitor offers this. Every existing AI code review tool (CodeRabbit, Qodo, Greptile, Ellipsis) operates at PR level — **after** code is written. Sentinel operates at the pre-execution level, preventing mistakes before they happen.
|
|
12
|
-
|
|
13
|
-
## Market Gap
|
|
14
|
-
|
|
15
|
-
```
|
|
16
|
-
Static Standards Dynamic RAG Standards
|
|
17
|
-
───────────────── ──────────────────────
|
|
18
|
-
PR-Level Review │ CodeRabbit ($24) │ Qodo ($30)
|
|
19
|
-
│ Ellipsis ($20) │ Greptile ($30)
|
|
20
|
-
│ Sourcery ($12) │
|
|
21
|
-
│ Copilot ($19-39) │
|
|
22
|
-
─────────────────┼───────────────────────┼─────────────────────────
|
|
23
|
-
Real-Time │ decider/claude-hooks │
|
|
24
|
-
Pre-Execution │ trailofbits config │ ← ENGRM SENTINEL
|
|
25
|
-
Interception │ (local-only, no RAG) │ (UNOCCUPIED)
|
|
26
|
-
```
|
|
27
|
-
|
|
28
|
-
### Competitive Research (March 2026)
|
|
29
|
-
|
|
30
|
-
| Tool | Pricing | Timing | Custom Standards | RAG/Vector | Agent Hooks |
|
|
31
|
-
|------|---------|--------|-----------------|------------|-------------|
|
|
32
|
-
| CodeRabbit | $0-24/dev/mo | PR-level + IDE inline | Yes (.coderabbit.yml) | No | No |
|
|
33
|
-
| Qodo | $0-30/dev/mo | PR-level + IDE | Yes (auto-generated) | Yes (proprietary) | No |
|
|
34
|
-
| Greptile | $30/dev/mo | PR-level | Learns from PRs | Yes (AST + vector) | No |
|
|
35
|
-
| Ellipsis | ~$20/dev/mo | PR-level | Yes (natural language) | No | No |
|
|
36
|
-
| Cursor Bugbot | Included ($20-40/mo) | PR-level (background) | .cursor/rules | Proprietary | Cursor-only |
|
|
37
|
-
| Copilot Review | $19-39/user/mo | PR-level | Repository rules | Proprietary | Copilot-only |
|
|
38
|
-
| **Engrm Sentinel** | **$15-25/dev/mo** | **Pre-execution** | **Dynamic RAG** | **Hybrid FTS5+vec** | **Any MCP agent** |
|
|
39
|
-
|
|
40
|
-
### GitHub Reference Implementations
|
|
41
|
-
|
|
42
|
-
| Repo | Stars | Pattern | What We Learn |
|
|
43
|
-
|------|-------|---------|--------------|
|
|
44
|
-
| disler/claude-code-hooks-mastery | 3.3k | Builder/Validator agents, PostToolUse linting | Builder/Validator separation pattern |
|
|
45
|
-
| trailofbits/claude-code-config | 1.6k | Security blocking hooks, anti-rationalization gate | Stop hook prompt checking for incomplete work |
|
|
46
|
-
| qodo-ai/pr-agent | 10.5k | PR review tools, AGPL | PR compression for large diffs |
|
|
47
|
-
| ChrisWiles/claude-code-showcase | 5.5k | Skills, agents, GitHub Actions | Skill evaluation hook as pattern |
|
|
48
|
-
| decider/claude-hooks | 67 | Static rule enforcement | Hierarchical config (root + dir overrides) |
|
|
49
|
-
| praneybehl/code-review-mcp | 29 | MCP server for multi-provider review | Stateless — no memory, no team sharing |
|
|
50
|
-
|
|
51
|
-
**Key insight**: Every existing implementation is stateless. None retrieves project-specific standards from vector memory. None syncs findings across a team.
|
|
52
|
-
|
|
53
|
-
## Architecture
|
|
54
|
-
|
|
55
|
-
### Flow
|
|
56
|
-
|
|
57
|
-
```
|
|
58
|
-
Developer working with Claude Code...
|
|
59
|
-
|
|
60
|
-
Claude tries to write a file
|
|
61
|
-
│
|
|
62
|
-
▼
|
|
63
|
-
PreToolUse(Write|Edit) fires → hooks/sentinel.ts
|
|
64
|
-
│
|
|
65
|
-
├─ 1. SKIP CHECK
|
|
66
|
-
│ Is sentinel enabled? Is this file in skip_patterns?
|
|
67
|
-
│
|
|
68
|
-
├─ 2. RETRIEVE STANDARDS
|
|
69
|
-
│ engrm search("auth middleware security")
|
|
70
|
-
│ → "Decision: all auth must use bcrypt, not MD5"
|
|
71
|
-
│ → "Bugfix: session tokens were stored unencrypted"
|
|
72
|
-
│ → "Standard: never log auth credentials"
|
|
73
|
-
│
|
|
74
|
-
├─ 3. AUDIT LLM CALL
|
|
75
|
-
│ POST base_url/chat/completions
|
|
76
|
-
│ { model, messages: [system + standards + diff] }
|
|
77
|
-
│ temperature: 0, max_tokens: 150
|
|
78
|
-
│
|
|
79
|
-
├─ 4a. PASS → exit 0 (Claude proceeds)
|
|
80
|
-
├─ 4b. WARN → exit 0 + log observation
|
|
81
|
-
└─ 4c. BLOCK → exit 2 + stderr reason
|
|
82
|
-
(Claude receives error, self-corrects, retries)
|
|
83
|
-
│
|
|
84
|
-
▼
|
|
85
|
-
Finding saved as observation → syncs to team → future audits are smarter
|
|
86
|
-
```
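Step 4 of the flow above maps the audit verdict to a hook exit code. The sketch below uses hypothetical names, and it assumes that advisory mode downgrades a BLOCK to a logged warning, based on the "WARN-only vs BLOCK+WARN" comment in the config schema.

```typescript
type Verdict = "PASS" | "WARN" | "BLOCK";
type SentinelMode = "advisory" | "blocking";

interface HookOutcome {
  exitCode: 0 | 2; // 0 lets the tool call proceed, 2 blocks it
  stderr?: string; // reason reported back to the agent on a block
  logObservation: boolean; // whether the finding is saved as an observation
}

function resolveOutcome(verdict: Verdict, reason: string, mode: SentinelMode): HookOutcome {
  if (verdict === "PASS") {
    return { exitCode: 0, logObservation: false };
  }
  if (verdict === "BLOCK" && mode === "blocking") {
    return { exitCode: 2, stderr: reason, logObservation: true };
  }
  // WARN, or BLOCK in advisory mode (assumed): allow the write but log it.
  return { exitCode: 0, logObservation: true };
}
```

A hook script would write `stderr` to standard error and exit with `exitCode`.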
|
|
87
|
-
|
|
88
|
-
### Dashboard → Server → Client Config Push
|
|
89
|
-
|
|
90
|
-
```
|
|
91
|
-
Dashboard (engrm.dev/sentinel)
|
|
92
|
-
│ POST /v1/mem/sentinel/config
|
|
93
|
-
▼
|
|
94
|
-
Candengo Vector (sync_events, record_type="sentinel_config")
|
|
95
|
-
│ GET /v1/sync/changes (existing pull loop, every 60s)
|
|
96
|
-
▼
|
|
97
|
-
Client (~/.engrm/settings.json → sentinel config merged)
|
|
98
|
-
│
|
|
99
|
-
▼
|
|
100
|
-
PreToolUse hook reads config on each invocation
|
|
101
|
-
```
|
|
102
|
-
|
|
103
|
-
### Provider Agnostic (OpenAI-Compatible API)
|
|
104
|
-
|
|
105
|
-
Most major LLM providers expose an OpenAI-compatible `POST /v1/chat/completions` endpoint; Anthropic is reached through a proxy that translates to the same format:
|
|
106
|
-
|
|
107
|
-
| Provider | Base URL | Models | Cost/1K audits |
|
|
108
|
-
|----------|----------|--------|---------------|
|
|
109
|
-
| OpenAI | api.openai.com/v1 | gpt-4o-mini | ~$0.40 |
|
|
110
|
-
| xAI/Grok | api.x.ai/v1 | grok-3-mini | ~$0.30 |
|
|
111
|
-
| Mistral | api.mistral.ai/v1 | mistral-small | ~$0.20 |
|
|
112
|
-
| Anthropic | via proxy | haiku-4.5 | ~$0.50 |
|
|
113
|
-
| Local vLLM | 192.168.5.5:8000/v1 | devstral-24b | $0 |
|
|
114
|
-
| Ollama | localhost:11434/v1 | llama3-8b | $0 |
|
|
115
|
-
|
|
116
|
-
One client function, ~40 lines. No provider-specific code.
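A sketch of how such a provider-agnostic request might be assembled. The function and type names and the system prompt wording are hypothetical; `temperature: 0` and `max_tokens: 150` come from the flow diagram, and the `Authorization: Bearer` header follows the OpenAI API convention.

```typescript
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface AuditRequest {
  url: string;
  headers: Record<string, string>;
  body: { model: string; messages: ChatMessage[]; temperature: number; max_tokens: number };
}

// Builds one OpenAI-compatible chat completion request for an audit.
function buildAuditRequest(
  baseUrl: string,
  model: string,
  apiKey: string,
  standards: string[],
  diff: string
): AuditRequest {
  // Hypothetical prompt: the real wording lives in the planned prompts module.
  const system =
    "You are a code auditor. Reply with PASS, WARN, or BLOCK and a short reason.\n" +
    "Team standards:\n" +
    standards.map((s) => "- " + s).join("\n");
  return {
    url: baseUrl.replace(/\/+$/, "") + "/chat/completions",
    headers: {
      "Content-Type": "application/json",
      Authorization: "Bearer " + apiKey,
    },
    body: {
      model,
      messages: [
        { role: "system", content: system },
        { role: "user", content: diff },
      ],
      temperature: 0,
      max_tokens: 150,
    },
  };
}
```

The result can be sent with any HTTP client, for example `fetch(req.url, { method: "POST", headers: req.headers, body: JSON.stringify(req.body) })`; only `baseUrl` and `model` change between providers.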
|
|
117
|
-
|
|
118
|
-
## Config Schema
|
|
119
|
-
|
|
120
|
-
```typescript
|
|
121
|
-
interface SentinelConfig {
|
|
122
|
-
enabled: boolean;
|
|
123
|
-
mode: "advisory" | "blocking"; // WARN-only vs BLOCK+WARN
|
|
124
|
-
provider: "openai" | "xai" | "mistral" | "anthropic" | "custom";
|
|
125
|
-
base_url: string; // OpenAI-compatible endpoint
|
|
126
|
-
model: string; // e.g. "gpt-4o-mini"
|
|
127
|
-
api_key_env?: string; // Client-side env var name
|
|
128
|
-
encrypted_api_key?: string; // Server-pushed, decrypted client-side
|
|
129
|
-
match_tools: string[]; // ["Write", "Edit"] default
|
|
130
|
-
timeout_ms: number; // Max wait (default 8000)
|
|
131
|
-
skip_patterns: string[]; // e.g. ["*.test.ts", "*.md"]
|
|
132
|
-
max_diff_lines: number; // Truncate large diffs (default 200)
|
|
133
|
-
}
|
|
134
|
-
```
|
|
135
|
-
|
|
136
|
-
Stored in `~/.engrm/settings.json` under `sentinel` key. Pushed from dashboard via sync_events.
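Two of the config fields, `skip_patterns` and `max_diff_lines`, imply small helpers like the ones below. This is a sketch under assumptions: the names are hypothetical, patterns are matched against the file's basename, and only the `*` wildcard is supported.

```typescript
// Converts a simple glob ("*" wildcard only) into an anchored RegExp.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+^${}()|[\]\\?]/g, "\\$&") // escape regex metacharacters
    .replace(/\*/g, ".*"); // "*" matches any run of characters
  return new RegExp("^" + escaped + "$");
}

// True when the file's basename matches any skip pattern.
function shouldSkip(filePath: string, skipPatterns: string[]): boolean {
  const parts = filePath.split("/");
  const base = parts[parts.length - 1];
  return skipPatterns.some((p) => globToRegExp(p).test(base));
}

// Keeps at most maxLines lines of a diff (default 200, as in the schema).
function truncateDiff(diff: string, maxLines = 200): string {
  const lines = diff.split("\n");
  return lines.length <= maxLines ? diff : lines.slice(0, maxLines).join("\n");
}
```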
|
|
137
|
-
|
|
138
|
-
## Standards
|
|
139
|
-
|
|
140
|
-
Standards are **observations tagged as audit-relevant**. No separate schema needed.
|
|
141
|
-
|
|
142
|
-
- Add `"standard"` to the observation type enum
|
|
143
|
-
- Tag with `sentinel-standard` in concepts
|
|
144
|
-
- Standards sync through the existing push/pull pipeline
|
|
145
|
-
- Dashboard provides UI for creating/managing them
|
|
146
|
-
- Every past decision, bugfix, and pattern is a potential standard — just tag it
|
|
147
|
-
|
|
148
|
-
## Graceful Degradation
|
|
149
|
-
|
|
150
|
-
Following the sqlite-vec precedent:
|
|
151
|
-
|
|
152
|
-
| Failure | Behavior |
|
|
153
|
-
|---------|----------|
|
|
154
|
-
| LLM API down/timeout | exit 0 (allow), log warning |
|
|
155
|
-
| No standards found | Skip audit, exit 0 |
|
|
156
|
-
| Config not synced yet | Sentinel disabled by default |
|
|
157
|
-
| API key missing | Skip audit, log once |
|
|
158
|
-
| Free tier user | Sentinel hooks not registered |
|
|
159
|
-
|
|
160
|
-
## Feedback Loop (The Moat)
|
|
161
|
-
|
|
162
|
-
```
|
|
163
|
-
Sentinel blocks a write
|
|
164
|
-
→ Claude self-corrects
|
|
165
|
-
→ Corrected code passes
|
|
166
|
-
→ Block + correction saved as observation
|
|
167
|
-
→ Observation syncs to all team members
|
|
168
|
-
→ Future audits retrieve it as context
|
|
169
|
-
→ Sentinel gets smarter over time
|
|
170
|
-
```
|
|
171
|
-
|
|
172
|
-
Static rules don't learn. Sentinel does. This is the competitive moat.
|
|
173
|
-
|
|
174
|
-
## Pricing
|
|
175
|
-
|
|
176
|
-
| Tier | Sentinel | Observations | Price |
|
|
177
|
-
|------|----------|-------------|-------|
|
|
178
|
-
| Free | Not available | 10K, 2 devices | $0 |
|
|
179
|
-
| Pro | Advisory mode, 100 audits/day, own API keys | 50K, unlimited devices | $15/dev/mo |
|
|
180
|
-
| Team | Full blocking + advisory, unlimited, dashboard config push, shared standards, audit heatmap | 100K, unlimited devices | $25/dev/mo |
|
|
181
|
-
|
|
182
|
-
Users bring their own LLM API keys. Engrm's marginal cost per audit is near-zero (one vector search + config lookup).
|
|
183
|
-
|
|
184
|
-
## Implementation Plan
|
|
185
|
-
|
|
186
|
-
### Phase 1: Core Hook + Local Audit (Week 1)
|
|
187
|
-
|
|
188
|
-
| Task | File | Effort |
|
|
189
|
-
|------|------|--------|
|
|
190
|
-
| Add `SentinelConfig` to config interface | `src/config.ts` | 1h |
|
|
191
|
-
| Add `"standard"` to observation type enum | `src/types.ts` | 30m |
|
|
192
|
-
| Create `src/sentinel/types.ts` | New | 30m |
|
|
193
|
-
| Create `src/sentinel/llm-client.ts` (OpenAI-compatible) | New (~40 lines) | 1h |
|
|
194
|
-
| Create `src/sentinel/prompts.ts` | New | 2h |
|
|
195
|
-
| Create `src/sentinel/audit.ts` (orchestrator) | New | 3h |
|
|
196
|
-
| Create `hooks/sentinel.ts` (PreToolUse hook) | New | 2h |
|
|
197
|
-
| Register sentinel hook in `registerHooks()` | `src/register.ts` | 30m |
|
|
198
|
-
| Tests | `src/sentinel/*.test.ts` | 3h |
|
|
199
|
-
| Integration test (hook → local LLM) | Manual | 2h |
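For a sense of scale, an OpenAI-compatible client of roughly the size planned for `src/sentinel/llm-client.ts` could look like the sketch below. It targets the standard `/chat/completions` route and reads `choices[0].message.content` from the response. The function names, the `LlmOptions` shape, and the `temperature: 0` setting are assumptions made for illustration; the HTTP function is passed in so the pure parts can be tested without a network.

```typescript
interface ChatMessage { role: "system" | "user" | "assistant"; content: string }
interface LlmOptions { baseUrl: string; apiKey: string; model: string }
interface HttpInit { method: string; headers: Record<string, string>; body: string }

// Minimal fetch shape, so the sketch does not depend on DOM or Node typings.
type FetchLike = (
  url: string,
  init: HttpInit,
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Build the request for an OpenAI-compatible chat completion.
function buildChatRequest(opts: LlmOptions, messages: ChatMessage[]): { url: string; init: HttpInit } {
  return {
    url: `${opts.baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json", Authorization: `Bearer ${opts.apiKey}` },
      body: JSON.stringify({ model: opts.model, messages, temperature: 0 }),
    },
  };
}

// Pull the assistant text out of the response body, or throw if it is absent.
function parseChatResponse(data: unknown): string {
  const body = data as { choices?: { message?: { content?: unknown } }[] } | null;
  const content = body?.choices?.[0]?.message?.content;
  if (typeof content !== "string") throw new Error("LLM response had no message content");
  return content;
}

async function chatComplete(opts: LlmOptions, messages: ChatMessage[], fetchImpl: FetchLike): Promise<string> {
  const { url, init } = buildChatRequest(opts, messages);
  const res = await fetchImpl(url, init);
  if (!res.ok) throw new Error(`LLM request failed with status ${res.status}`);
  return parseChatResponse(await res.json());
}
```

In real use the global `fetch` can be passed as `fetchImpl`. The caller in the hook would wrap `chatComplete` in a try/catch and exit 0 on failure, in line with the silent error handling the rest of the plan calls for.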
|
|
200
|
-
|
|
201
|
-
### Phase 2: Dashboard Config Push (Week 2)
|
|
202
|
-
|
|
203
|
-
| Task | Location | Effort |
|
|
204
|
-
|------|----------|--------|
|
|
205
|
-
| `POST /v1/mem/sentinel/config` endpoint | candengo-vector | 3h |
|
|
206
|
-
| `GET /v1/mem/sentinel/config` endpoint | candengo-vector | 1h |
|
|
207
|
-
| Store config in sync_events (record_type="sentinel_config") | candengo-vector | 2h |
|
|
208
|
-
| Handle sentinel_config in pull loop | `src/sync/pull.ts` | 2h |
|
|
209
|
-
| Dashboard: LLM provider config page | website/mem/sentinel.html | 4h |
|
|
210
|
-
| Dashboard: standards manager (CRUD) | website/mem/sentinel.html | 4h |
|
|
211
|
-
| API key encryption (server→client) | Both | 3h |
|
|
212
|
-
|
|
213
|
-
### Phase 3: Standards Library + Feedback Loop (Week 3)
|
|
214
|
-
|
|
215
|
-
| Task | Location | Effort |
|
|
216
|
-
|------|----------|--------|
|
|
217
|
-
| `POST /v1/mem/sentinel/standards` CRUD | candengo-vector | 3h |
|
|
218
|
-
| Standards sync via pull loop | `src/sync/pull.ts` | 2h |
|
|
219
|
-
| Save audit findings as observations | `src/sentinel/audit.ts` | 2h |
|
|
220
|
-
| Dashboard: audit results + heatmap | candengo-vector website | 4h |
|
|
221
|
-
| `engrm sentinel status` CLI | `src/cli.ts` | 1h |
|
|
222
|
-
| `engrm sentinel test` CLI | `src/cli.ts` | 2h |
|
|
223
|
-
|
|
224
|
-
### Phase 4: Polish + Tier Enforcement (Week 4)
|
|
225
|
-
|
|
226
|
-
| Task | Effort |
|
|
227
|
-
|------|--------|
|
|
228
|
-
| Rate limiting (100 audits/day on Pro, unlimited on Team) | 2h |
|
|
229
|
-
| Tier check on hook registration | 1h |
|
|
230
|
-
| Anti-rationalization gate (Stop hook, from TrailOfBits) | 3h |
|
|
231
|
-
| Skip patterns, file-type filtering | 2h |
|
|
232
|
-
| Docs + onboarding in dashboard | 3h |
|
|
233
|
-
| Performance profiling (target: <3s/audit) | 2h |
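The rate-limiting task could be as small as a per-day counter keyed on tier. The sketch below takes its caps from the pricing table (Free: no Sentinel, Pro: 100 audits/day, Team: unlimited). Resetting on the UTC calendar day and keeping the counter in memory are assumptions; a real implementation would likely persist the count.

```typescript
type Tier = "free" | "pro" | "team";

// Daily audit caps; `null` means unlimited.
const DAILY_AUDIT_LIMIT: Record<Tier, number | null> = { free: 0, pro: 100, team: null };

class DailyAuditLimiter {
  private tier: Tier;
  private day = "";
  private used = 0;

  constructor(tier: Tier) {
    this.tier = tier;
  }

  // Returns true and counts the audit if it fits in today's budget.
  tryConsume(now: Date = new Date()): boolean {
    const today = now.toISOString().slice(0, 10); // UTC day, e.g. "2025-01-01"
    if (today !== this.day) {
      // First audit of a new day: reset the counter.
      this.day = today;
      this.used = 0;
    }
    const limit = DAILY_AUDIT_LIMIT[this.tier];
    if (limit !== null && this.used >= limit) return false;
    this.used += 1;
    return true;
  }
}
```

When `tryConsume` returns false, the hook would skip the audit and let the write through instead of blocking it.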
|
|
234
|
-
|
|
235
|
-
### Phase 5: Beta + Launch (Weeks 5-6)
|
|
236
|
-
|
|
237
|
-
| Task | Effort |
|
|
238
|
-
|------|--------|
|
|
239
|
-
| Internal dogfood (Unimpossible team) | 1 week |
|
|
240
|
-
| Bug fixes from dogfood | Variable |
|
|
241
|
-
| Launch blog post + HN announcement | 1 day |
|
|
242
|
-
| Waitlist conversion emails | 1 day |
|
|
243
|
-
|
|
244
|
-
## Files to Create
|
|
245
|
-
|
|
246
|
-
```
|
|
247
|
-
src/sentinel/
|
|
248
|
-
├── types.ts # SentinelConfig, AuditResult, etc.
|
|
249
|
-
├── llm-client.ts # OpenAI-compatible API client (~40 lines)
|
|
250
|
-
├── prompts.ts # System prompt, audit request formatter, response parser
|
|
251
|
-
├── audit.ts # Orchestrator: search → LLM → decision → save finding
|
|
252
|
-
└── *.test.ts # Tests
|
|
253
|
-
|
|
254
|
-
hooks/
|
|
255
|
-
└── sentinel.ts # PreToolUse hook (follows post-tool-use.ts pattern)
|
|
256
|
-
```
|
|
257
|
-
|
|
258
|
-
## Files to Modify
|
|
259
|
-
|
|
260
|
-
```
|
|
261
|
-
src/config.ts # Add SentinelConfig to Config interface + defaults
|
|
262
|
-
src/register.ts # Register PreToolUse sentinel hook
|
|
263
|
-
src/sync/pull.ts # Handle record_type="sentinel_config" | "sentinel_standard"
|
|
264
|
-
src/cli.ts # Add `engrm sentinel status|test` commands
|
|
265
|
-
```
|
|
266
|
-
|
|
267
|
-
## Key Design Patterns to Follow
|
|
268
|
-
|
|
269
|
-
| Pattern | Source | Application |
|
|
270
|
-
|---------|--------|-------------|
|
|
271
|
-
| Silent error handling | hooks/post-tool-use.ts | Never crash; exit 0 on any error |
|
|
272
|
-
| Config merging | src/config.ts | Defaults → disk → sync override |
|
|
273
|
-
| Reentrancy guards | src/sync/engine.ts | Prevent concurrent audits |
|
|
274
|
-
| Graceful degradation | sqlite-vec integration | If unavailable, skip silently |
|
|
275
|
-
| Observation pipeline | src/capture/ | Findings go through same scrub→quality→dedup→save flow |
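Two of the patterns in this table, config merging and reentrancy guards, are sketched below. The `SentinelConfig` fields are hypothetical stand-ins (the real interface is defined elsewhere in the spec); only the merge order (defaults, then disk, then sync override) and the disabled-by-default behaviour come from this document.

```typescript
// Hypothetical subset of the Sentinel config.
interface SentinelConfig {
  enabled: boolean;
  mode: "advisory" | "blocking";
}

const DEFAULTS: SentinelConfig = { enabled: false, mode: "advisory" };

// Later sources win: defaults, then on-disk config, then synced override.
function mergeConfig(disk: Partial<SentinelConfig>, synced: Partial<SentinelConfig>): SentinelConfig {
  return { ...DEFAULTS, ...disk, ...synced };
}

// Reentrancy guard: at most one audit in flight per process.
class AuditGuard {
  private busy = false;

  tryAcquire(): boolean {
    if (this.busy) return false;
    this.busy = true;
    return true;
  }

  release(): void {
    this.busy = false;
  }

  // Runs the task only if no other audit is running; otherwise skips it.
  async run<T>(task: () => Promise<T>): Promise<T | undefined> {
    if (!this.tryAcquire()) return undefined;
    try {
      return await task();
    } finally {
      this.release();
    }
  }
}
```

Skipping a concurrent audit, instead of queueing it, keeps the hook within its latency target at the cost of occasionally leaving a write unaudited.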
|
|
276
|
-
|
|
277
|
-
## References
|
|
278
|
-
|
|
279
|
-
### Competitors
|
|
280
|
-
- [CodeRabbit](https://www.coderabbit.ai/) — PR-level, $0-24/dev/mo
|
|
281
|
-
- [Qodo](https://www.qodo.ai/) — Best RAG, PR-level, $0-30/dev/mo
|
|
282
|
-
- [Greptile](https://www.greptile.com/) — Learns from PRs, $30/dev/mo
|
|
283
|
-
- [Ellipsis](https://www.ellipsis.dev/) — PR-level, ~$20/dev/mo
|
|
284
|
-
|
|
285
|
-
### GitHub Repos
|
|
286
|
-
- [disler/claude-code-hooks-mastery](https://github.com/disler/claude-code-hooks-mastery) (3.3k★)
|
|
287
|
-
- [trailofbits/claude-code-config](https://github.com/trailofbits/claude-code-config) (1.6k★)
|
|
288
|
-
- [qodo-ai/pr-agent](https://github.com/qodo-ai/pr-agent) (10.5k★)
|
|
289
|
-
- [praneybehl/code-review-mcp](https://github.com/praneybehl/code-review-mcp)
|
|
290
|
-
|
|
291
|
-
### Claude Code Docs
|
|
292
|
-
- [Hooks Reference](https://code.claude.com/docs/en/hooks)
|
|
293
|
-
- [Hooks Guide](https://code.claude.com/docs/en/hooks-guide)
|