purecontext-mcp 1.1.1 → 1.1.2

Files changed (37)
  1. package/package.json +1 -1
  2. package/docs/dev/API_STABILITY.md +0 -319
  3. package/docs/dev/DECISIONS.md +0 -22
  4. package/docs/dev/DOCUMENTATION_PLAN.md +0 -113
  5. package/docs/dev/PHASE10_TASKS.md +0 -476
  6. package/docs/dev/PHASE11_TASKS.md +0 -385
  7. package/docs/dev/PHASE12_TASKS.md +0 -335
  8. package/docs/dev/PHASE13_TASKS.md +0 -381
  9. package/docs/dev/PHASE14_TASKS.md +0 -371
  10. package/docs/dev/PHASE15_TASKS.md +0 -256
  11. package/docs/dev/PHASE16_TASKS.md +0 -314
  12. package/docs/dev/PHASE17_TASKS.md +0 -321
  13. package/docs/dev/PHASE18_TASKS.md +0 -345
  14. package/docs/dev/PHASE19_TASKS.md +0 -261
  15. package/docs/dev/PHASE1_TASKS.md +0 -443
  16. package/docs/dev/PHASE20_TASKS.md +0 -280
  17. package/docs/dev/PHASE21_TASKS.md +0 -355
  18. package/docs/dev/PHASE22_TASKS.md +0 -371
  19. package/docs/dev/PHASE23_TASKS.md +0 -274
  20. package/docs/dev/PHASE24_TASKS.md +0 -326
  21. package/docs/dev/PHASE25_TASKS.md +0 -452
  22. package/docs/dev/PHASE26_TASKS.md +0 -253
  23. package/docs/dev/PHASE27_TASKS.md +0 -410
  24. package/docs/dev/PHASE2_TASKS.md +0 -328
  25. package/docs/dev/PHASE3_TASKS.md +0 -571
  26. package/docs/dev/PHASE4_TASKS.md +0 -531
  27. package/docs/dev/PHASE5_TASKS.md +0 -835
  28. package/docs/dev/PHASE6_TASKS.md +0 -347
  29. package/docs/dev/PHASE7_TASKS.md +0 -257
  30. package/docs/dev/PHASE8_TASKS.md +0 -299
  31. package/docs/dev/PHASE9_TASKS.md +0 -320
  32. package/docs/dev/PureContext_MCP_PRD_v1.0.docx +0 -0
  33. package/docs/dev/SELF_HOSTING.md +0 -142
  34. package/docs/dev/TEAM_SETUP.md +0 -316
  35. package/docs/dev/TELEMETRY.md +0 -99
  36. package/docs/dev/feature-analysis.md +0 -305
  37. package/docs/dev/phase-1-notes.md +0 -3
@@ -1,335 +0,0 @@
1
- # Phase 12 — Task Breakdown
2
-
3
- **Goal**: Add rate limiting and multi-tenant authentication for hosted/shared PureContext deployments. This enables running PureContext as a service for teams or organizations.
4
-
5
- **Scope rationale**: When PureContext is deployed as a shared service (HTTP/SSE transport), it needs protection against abuse and support for multiple users/organizations with isolated data. This phase transforms PureContext from a single-user CLI tool into a multi-tenant service.
6
-
7
- **Approach**: Tasks build from basic rate limiting through API key authentication to full tenant isolation. Each layer provides value on its own, although later tasks build on earlier ones.
8
-
9
- ---
10
-
11
- ## Task 96: Token Bucket Rate Limiter
12
-
13
- Implement a token bucket rate limiter for controlling request throughput.
14
-
15
- **Deliverables:**
16
-
17
- - `src/server/rate-limiter.ts`
18
- - `TokenBucketLimiter` class:
19
- ```typescript
20
- declare class TokenBucketLimiter {
21
- constructor(options: LimiterOptions);
22
- tryConsume(key: string, tokens?: number): LimitResult;
23
- getRemainingTokens(key: string): number;
24
- reset(key: string): void;
25
- }
26
-
27
- interface LimiterOptions {
28
- maxTokens: number; // Bucket capacity (default: 100)
29
- refillRate: number; // Tokens per second (default: 10)
30
- refillInterval: number; // Ms between refills (default: 100)
31
- }
32
-
33
- interface LimitResult {
34
- allowed: boolean;
35
- remainingTokens: number;
36
- retryAfterMs: number; // 0 if allowed
37
- }
38
- ```
39
- - Token bucket algorithm:
40
- - Each client key has a bucket with capacity `maxTokens`
41
- - Tokens refill at `refillRate` per second
42
- - Requests consume tokens (1 by default, more for expensive operations)
43
- - When bucket empty, requests are rejected with retry time
44
-
45
- - `src/server/rate-limit-store.ts`
46
- - In-memory store with LRU eviction for bucket state
47
- - `LRUCache<string, BucketState>` with 10,000 entry limit
48
- - Handles concurrent access with minimal locking
49
-
50
- - Integrate into HTTP server:
51
- - Apply rate limiter to all MCP tool calls
52
- - Key by: IP address (anonymous) or API key (authenticated)
53
- - Return `429 Too Many Requests` when limited
54
- - Include `Retry-After` header with retry time
55
-
56
- - Configuration:
57
- - `rateLimit.enabled`: boolean (default: `true` for HTTP, `false` for stdio)
58
- - `rateLimit.maxTokens`: number (default: `100`)
59
- - `rateLimit.refillRate`: number (default: `10`)
60
- - `rateLimit.perToolLimits`: map of tool names to token costs
61
- - `index-folder`: 10 tokens (expensive)
62
- - `search-symbols`: 1 token
63
- - `get-symbol-source`: 1 token
64
- - etc.
65
-
66
- **Key technical notes:**
67
- - Token bucket is more flexible than fixed window — allows bursts while maintaining average rate
68
- - Per-tool costs reflect operation expense (indexing >> retrieval)
69
- - LRU eviction prevents memory exhaustion from many unique clients
70
- - IP-based limiting is the fallback; API key limiting is preferred
71
-
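The token bucket behaviour described in Task 96 can be sketched as follows. This is a minimal illustration, not the package's implementation: it refills continuously from elapsed time (so `refillInterval` is omitted), keeps state in a plain `Map` rather than the LRU store, and takes an injected `now` clock, which is an assumption added here so the behaviour can be checked deterministically.

```typescript
interface LimiterOptions {
  maxTokens: number;  // bucket capacity
  refillRate: number; // tokens per second
}

interface LimitResult {
  allowed: boolean;
  remainingTokens: number;
  retryAfterMs: number; // 0 if allowed
}

interface BucketState {
  tokens: number;
  lastRefillMs: number;
}

class TokenBucketLimiter {
  private buckets = new Map<string, BucketState>();
  private options: LimiterOptions;
  private now: () => number; // injected clock (assumption, for testability)

  constructor(options: LimiterOptions, now: () => number = () => Date.now()) {
    this.options = options;
    this.now = now;
  }

  // Credit tokens for the time elapsed since the last call, capped at capacity.
  private refill(key: string): BucketState {
    const t = this.now();
    const state = this.buckets.get(key) ?? { tokens: this.options.maxTokens, lastRefillMs: t };
    const elapsedSec = (t - state.lastRefillMs) / 1000;
    state.tokens = Math.min(this.options.maxTokens, state.tokens + elapsedSec * this.options.refillRate);
    state.lastRefillMs = t;
    this.buckets.set(key, state);
    return state;
  }

  tryConsume(key: string, tokens: number = 1): LimitResult {
    const state = this.refill(key);
    if (state.tokens >= tokens) {
      state.tokens -= tokens;
      return { allowed: true, remainingTokens: Math.floor(state.tokens), retryAfterMs: 0 };
    }
    // Time needed to refill the missing tokens at the configured rate.
    const deficit = tokens - state.tokens;
    const retryAfterMs = Math.ceil((deficit / this.options.refillRate) * 1000);
    return { allowed: false, remainingTokens: Math.floor(state.tokens), retryAfterMs };
  }

  getRemainingTokens(key: string): number {
    return Math.floor(this.refill(key).tokens);
  }

  reset(key: string): void {
    this.buckets.delete(key);
  }
}
```

Refilling lazily on each call avoids a background timer per bucket; a timer-driven refill using `refillInterval` would be an equally valid reading of the task.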
72
- **Verify:** Send 150 requests in rapid succession. Verify roughly the first 100 succeed (the bucket capacity, plus any tokens refilled during the burst) and the rest get 429. Wait for refill, then verify requests succeed again. Verify the `Retry-After` header is present on 429 responses.
73
-
74
- **Tests:** Token consumption: verify count decrements. Refill: verify tokens restore over time. LRU eviction: verify oldest keys evicted. Per-tool costs: indexing consumes more tokens than search. Concurrent access: no race conditions.
75
-
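As a rough sketch of the HTTP integration, the per-tool costs and the 429 response could be derived like this. The helper names `tokenCost` and `rateLimitReply` and the `HttpReply` shape are hypothetical; the cost values come from the configuration listed in Task 96, and `Retry-After` is expressed in whole seconds, as the HTTP header expects.

```typescript
interface LimitResult {
  allowed: boolean;
  remainingTokens: number;
  retryAfterMs: number; // 0 if allowed
}

// Hypothetical response shape, independent of any HTTP framework.
interface HttpReply {
  status: number;
  headers: Record<string, string>;
}

// Token costs per tool, taken from the `rateLimit.perToolLimits` example.
const perToolLimits: Record<string, number> = {
  "index-folder": 10,
  "search-symbols": 1,
  "get-symbol-source": 1,
};

// Tools without an explicit entry cost 1 token.
function tokenCost(tool: string): number {
  return perToolLimits[tool] ?? 1;
}

// Returns null when the request may proceed, otherwise a 429 reply.
function rateLimitReply(result: LimitResult): HttpReply | null {
  if (result.allowed) return null;
  // Retry-After takes whole seconds, so round the millisecond delay up.
  const seconds = Math.max(1, Math.ceil(result.retryAfterMs / 1000));
  return { status: 429, headers: { "Retry-After": String(seconds) } };
}
```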
76
- ---
77
-
78
- ## Task 97: API Key Authentication
79
-
80
- Implement API-key-based authentication for tenant identification.
81
-
82
- **Deliverables:**
83
-
84
- - `src/server/auth/api-key.ts`
85
- - `ApiKeyValidator` class:
86
- ```typescript
87
- declare class ApiKeyValidator {
88
- validate(apiKey: string): Promise<AuthResult>;
89
- generate(tenantId: string, permissions: Permission[]): string;
90
- revoke(apiKey: string): void;
91
- }
92
-
93
- interface AuthResult {
94
- valid: boolean;
95
- tenantId?: string;
96
- permissions?: Permission[];
97
- rateLimitTier?: string;
98
- }
99
-
100
- type Permission = 'read' | 'write' | 'admin';
101
- ```
102
- - API key format: `cl_live_<tenantId>_<random>_<checksum>`
103
- - Prefix: `cl_live_` for production, `cl_test_` for test keys
104
- - TenantId: 8-char hex (short identifier)
105
- - Random: 24-char base62 random string
106
- - Checksum: 4-char CRC for validation without DB lookup
107
- - Fast validation: checksum verifies format without DB hit
108
- - Full validation: DB lookup for permissions and revocation check
109
-
110
- - `src/core/db/api-keys.ts`
111
- - SQLite table `api_keys`:
112
- ```sql
113
- CREATE TABLE api_keys (
114
- key_hash TEXT PRIMARY KEY, -- SHA-256 of full key
115
- tenant_id TEXT NOT NULL,
116
- permissions TEXT NOT NULL, -- JSON array
117
- rate_limit_tier TEXT NOT NULL,
118
- created_at TEXT NOT NULL,
119
- last_used_at TEXT,
120
- revoked_at TEXT,
121
- FOREIGN KEY (tenant_id) REFERENCES tenants(id)
122
- );
123
- CREATE INDEX idx_api_keys_tenant ON api_keys(tenant_id);
124
- ```
125
- - Never store raw API keys — only hashes
126
- - Track last usage for auditing
127
-
128
- - HTTP middleware:
129
- - Extract API key from `Authorization: Bearer <key>` header
130
- - Or from `X-API-Key: <key>` header (alternative)
131
- - Validate and attach `AuthResult` to request context
132
- - Return `401 Unauthorized` for missing/invalid keys
133
- - Return `403 Forbidden` for insufficient permissions
134
-
135
- - CLI tool for key management:
136
- - `purecontext-mcp keys create --tenant <id> --permissions read,write`
137
- - `purecontext-mcp keys list --tenant <id>`
138
- - `purecontext-mcp keys revoke <key-prefix>`
139
- - The full key is shown only once, at creation; the raw key is never stored
140
-
141
- **Key technical notes:**
142
- - API keys are hashed before storage — raw keys cannot be recovered
143
- - Checksum allows fast format validation without DB access
144
- - Permissions are simple RBAC: read (query), write (index), admin (manage)
145
- - Rate limit tiers allow different limits per subscription level
146
-
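The fast, DB-free format check could look roughly like the sketch below. The task only says "4-char CRC", so the choice of CRC-16/CCITT rendered as four hex characters is an assumption, as are the helper names `crc16`, `buildKey`, and `checkFormat`. The random part is passed in rather than generated, since a real key would need a cryptographically secure source.

```typescript
// CRC-16/CCITT (polynomial 0x1021, initial value 0xFFFF) over an ASCII
// string, rendered as 4 hex chars. The CRC variant is an assumption.
function crc16(input: string): string {
  let crc = 0xffff;
  for (let i = 0; i < input.length; i++) {
    crc ^= (input.charCodeAt(i) & 0xff) << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = (crc & 0x8000) !== 0 ? ((crc << 1) ^ 0x1021) & 0xffff : (crc << 1) & 0xffff;
    }
  }
  return crc.toString(16).padStart(4, "0");
}

// cl_live_<tenantId>_<random>_<checksum>: 8 hex, 24 base62, 4 hex.
const KEY_PATTERN = /^(cl_(?:live|test))_([0-9a-f]{8})_([0-9A-Za-z]{24})_([0-9a-f]{4})$/;

// Assembles a key from its parts; `random` must come from a secure source.
function buildKey(env: "live" | "test", tenantId: string, random: string): string {
  const body = `cl_${env}_${tenantId}_${random}`;
  return `${body}_${crc16(body)}`;
}

// Verifies shape and checksum only. Permissions and revocation still
// require the full DB lookup described in the task.
function checkFormat(key: string): { ok: boolean; tenantId?: string } {
  const m = KEY_PATTERN.exec(key);
  if (!m) return { ok: false };
  const body = `${m[1]}_${m[2]}_${m[3]}`;
  return crc16(body) === m[4] ? { ok: true, tenantId: m[2] } : { ok: false };
}
```

A checksum of this kind only catches typos and casual tampering; it is not a security boundary, which is why the hashed-key lookup remains the authoritative check.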
147
- **Verify:** Generate API key. Use it in request header. Verify authentication succeeds. Revoke key. Verify subsequent requests fail. Verify invalid key returns 401.
148
-
149
- **Tests:** Key generation: verify format. Checksum validation: valid key passes, tampered key fails. DB validation: revoked key fails. Permissions: read-only key cannot index. Rate limit tier: different limits applied.
150
-
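The middleware steps above (extract the key, then map the outcome to 401 or 403) might be sketched as two small pure functions. `extractApiKey`, `authStatus`, and `HeaderMap` are hypothetical names, and header names are assumed to be lower-cased already, as Node's HTTP server delivers them. The sketch does not assume that `admin` implies `read` or `write`, since the task does not say so.

```typescript
type Permission = "read" | "write" | "admin";

interface AuthResult {
  valid: boolean;
  tenantId?: string;
  permissions?: Permission[];
  rateLimitTier?: string;
}

// Hypothetical header map with lower-cased header names.
type HeaderMap = Record<string, string | undefined>;

// Prefer `Authorization: Bearer <key>`, fall back to `X-API-Key: <key>`.
function extractApiKey(headers: HeaderMap): string | null {
  const auth = headers["authorization"];
  if (auth) {
    const m = /^Bearer\s+(\S+)$/i.exec(auth.trim());
    if (m) return m[1] ?? null;
  }
  const alt = headers["x-api-key"];
  return alt && alt.trim() !== "" ? alt.trim() : null;
}

// 401 for a missing or invalid key, 403 when the key lacks the permission.
function authStatus(auth: AuthResult | null, required: Permission): 200 | 401 | 403 {
  if (!auth || !auth.valid) return 401;
  const perms = auth.permissions ?? [];
  return perms.includes(required) ? 200 : 403;
}
```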
151
- ---
152
-
153
- ## Task 98: Multi-Tenant Data Isolation
154
-
155
- Implement tenant-level data isolation for shared deployments.
156
-
157
- **Deliverables:**
158
-
159
- - `src/core/db/tenants.ts`
160
- - SQLite table `tenants`:
161
- ```sql
162
- CREATE TABLE tenants (
163
- id TEXT PRIMARY KEY,
164
- name TEXT NOT NULL,
165
- created_at TEXT NOT NULL,
166
- settings TEXT, -- JSON for tenant-specific config
167
- storage_quota_bytes INTEGER, -- Max storage per tenant
168
- storage_used_bytes INTEGER -- Current usage
169
- );
170
- ```
171
- - `TenantStore` class:
172
- ```typescript
173
- declare class TenantStore {
174
- create(name: string): Tenant;
175
- get(id: string): Tenant | null;
176
- update(id: string, updates: Partial<Tenant>): void;
177
- delete(id: string): void;
178
- list(): Tenant[];
179
- }
180
- ```
181
-
182
- - Update all data tables with tenant isolation:
183
- - Add `tenant_id` column to `repos`, `symbols`, `files`, `dep_edges`, `embeddings`
184
- - Add foreign key constraint to `tenants.id`
185
- - Create indexes on `tenant_id` for efficient filtering
186
- - Migration script for existing single-tenant data
187
-
188
- - Update all data access:
189
- - `SymbolStore`: all queries filtered by `tenant_id`
190
- - `FileStore`: all queries filtered by `tenant_id`
191
- - `DepStore`: all queries filtered by `tenant_id`
192
- - `VectorStore`: separate HNSW indexes per tenant
193
-
194
- - Tenant context:
195
- - `TenantContext` object attached to request
196
- - All service methods receive tenant context
197
- - Prevents cross-tenant data access
198
-
199
- - Storage quotas:
200
- - Track storage per tenant (files + embeddings)
201
- - Reject indexing when quota exceeded
202
- - Return `507 Insufficient Storage` with quota details
203
-
204
- **Key technical notes:**
205
- - Tenant isolation is enforced at the data layer, not just API layer
206
- - Separate HNSW indexes prevent cross-tenant semantic search leakage
207
- - Storage quotas prevent runaway usage
208
- - Existing single-tenant data migrates to a default "local" tenant
209
-
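To illustrate enforcing isolation at the data layer, here is a small in-memory stand-in for a tenant-scoped store. The real stores are SQLite-backed, so this only mirrors the idea of a mandatory `tenant_id` filter on every query; `TenantScopedSymbolStore`, `SymbolRow`, and the fields of `TenantContext` are hypothetical.

```typescript
// Hypothetical shape of the context attached to each request.
interface TenantContext {
  tenantId: string;
}

interface SymbolRow {
  tenantId: string;
  repo: string;
  name: string;
}

class TenantScopedSymbolStore {
  private rows: SymbolRow[] = [];

  // Every write is stamped with the caller's tenant.
  add(ctx: TenantContext, repo: string, name: string): void {
    this.rows.push({ tenantId: ctx.tenantId, repo, name });
  }

  // Every read filters on the caller's tenant first, the in-memory
  // equivalent of adding `WHERE tenant_id = ?` to each SQL query.
  search(ctx: TenantContext, query: string): SymbolRow[] {
    return this.rows.filter(
      (row) => row.tenantId === ctx.tenantId && row.name.includes(query),
    );
  }
}
```

Because the store has no method that omits the context argument, a caller cannot issue an unscoped query by accident.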
210
- **Verify:** Create two tenants. Index different repos under each. Verify searches return only each tenant's own symbols. Verify one tenant cannot access another's repos.
211
-
212
- **Tests:** Tenant creation/deletion. Cross-tenant query: tenant A cannot see tenant B's symbols. Storage quota: reject indexing when exceeded. Migration: existing data belongs to default tenant.
213
-
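The storage quota check could be as simple as the sketch below. The field names follow the `tenants` table; treating a `NULL` quota as "unlimited" is an assumption, as are the `checkQuota` helper and the shape of the 507 payload.

```typescript
interface TenantUsage {
  storage_quota_bytes: number | null; // null is treated as "no quota" (assumption)
  storage_used_bytes: number;
}

type QuotaDecision =
  | { status: 200 }
  | { status: 507; quotaBytes: number; usedBytes: number; requestedBytes: number };

// Decide whether an indexing request that would add `incomingBytes`
// fits within the tenant's remaining quota.
function checkQuota(tenant: TenantUsage, incomingBytes: number): QuotaDecision {
  const quota = tenant.storage_quota_bytes;
  if (quota === null) return { status: 200 };
  if (tenant.storage_used_bytes + incomingBytes <= quota) return { status: 200 };
  return {
    status: 507,
    quotaBytes: quota,
    usedBytes: tenant.storage_used_bytes,
    requestedBytes: incomingBytes,
  };
}
```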
214
- ---
215
-
216
- ## Task 99: Admin API
217
-
218
- Add administrative endpoints for managing tenants and monitoring.
219
-
220
- **Deliverables:**
221
-
222
- - `src/server/admin-api.ts`
223
- - Admin routes (require `admin` permission):
224
- - `POST /admin/tenants` — create tenant
225
- - `GET /admin/tenants` — list tenants
226
- - `GET /admin/tenants/:id` — get tenant details
227
- - `PATCH /admin/tenants/:id` — update tenant
228
- - `DELETE /admin/tenants/:id` — delete tenant (and all data)
229
- - `POST /admin/tenants/:id/keys` — create API key for tenant
230
- - `GET /admin/tenants/:id/keys` — list API keys
231
- - `DELETE /admin/keys/:prefix` — revoke API key
232
-
233
- - Monitoring endpoints:
234
- - `GET /admin/stats`
235
- ```json
236
- {
237
- "tenants": 15,
238
- "total_repos": 142,
239
- "total_symbols": 2450000,
240
- "storage_used_bytes": 1073741824,
241
- "requests_24h": 50000,
242
- "rate_limit_hits_24h": 120,
243
- "uptime_seconds": 86400
244
- }
245
- ```
246
- - `GET /admin/tenants/:id/stats`
247
- ```json
248
- {
249
- "repos": 12,
250
- "symbols": 180000,
251
- "storage_used_bytes": 52428800,
252
- "requests_24h": 3000,
253
- "rate_limit_hits_24h": 5
254
- }
255
- ```
256
-
257
- - Request logging:
258
- - Log all requests to SQLite table `request_log`:
259
- ```sql
260
- CREATE TABLE request_log (
261
- id INTEGER PRIMARY KEY,
262
- timestamp TEXT NOT NULL,
263
- tenant_id TEXT,
264
- tool TEXT NOT NULL,
265
- duration_ms INTEGER,
266
- status TEXT, -- 'success', 'error', 'rate_limited'
267
- error_message TEXT
268
- );
269
- CREATE INDEX idx_request_log_tenant ON request_log(tenant_id, timestamp);
270
- ```
271
- - Prune logs older than 30 days (configurable)
272
- - Aggregate stats computed from logs
273
-
274
- **Key technical notes:**
275
- - Admin API is separate from MCP tools — REST endpoints only
276
- - Request logging enables usage analytics and debugging
277
- - Log pruning prevents unbounded growth
278
- - Stats are computed from logs, not maintained separately (simpler)
279
-
280
- **Verify:** Create tenant via admin API. Generate API key. Use key to index repo. View tenant stats — verify request counted. Delete tenant — verify data deleted.
281
-
282
- **Tests:** Tenant CRUD: create, read, update, delete. API key lifecycle: create, list, revoke. Stats aggregation: requests counted correctly. Log pruning: old entries removed. Admin permission: non-admin cannot access.
283
-
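Since stats are computed from `request_log` rather than maintained separately, the aggregation and pruning might look like this over rows already loaded into memory. The functions `tenantStats24h` and `pruneOlderThan` are hypothetical, timestamps are assumed to be ISO 8601 strings, and rate-limited requests are assumed to count toward `requests_24h`; in practice the same logic would more likely be a SQL query against the indexed `(tenant_id, timestamp)` columns.

```typescript
interface RequestLogRow {
  timestamp: string; // assumed ISO 8601, e.g. "2024-01-01T12:00:00Z"
  tenant_id: string | null;
  tool: string;
  duration_ms: number | null;
  status: "success" | "error" | "rate_limited";
}

interface TenantStats {
  requests_24h: number;
  rate_limit_hits_24h: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Count one tenant's requests and rate-limit hits in the last 24 hours.
function tenantStats24h(rows: RequestLogRow[], tenantId: string, nowMs: number): TenantStats {
  const cutoff = nowMs - DAY_MS;
  let requests = 0;
  let hits = 0;
  for (const row of rows) {
    if (row.tenant_id !== tenantId) continue;
    if (Date.parse(row.timestamp) < cutoff) continue;
    requests += 1;
    if (row.status === "rate_limited") hits += 1;
  }
  return { requests_24h: requests, rate_limit_hits_24h: hits };
}

// Keep only rows newer than the retention window (30 days by default).
function pruneOlderThan(rows: RequestLogRow[], nowMs: number, retentionDays: number = 30): RequestLogRow[] {
  const cutoff = nowMs - retentionDays * DAY_MS;
  return rows.filter((row) => Date.parse(row.timestamp) >= cutoff);
}
```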
284
- ---
285
-
286
- ## Task 100: Phase 12 Integration Tests
287
-
288
- Validate the complete rate limiting and multi-tenant system.
289
-
290
- **Deliverables:**
291
-
292
- - Integration tests `test/integration/phase12.test.ts`:
293
- 1. Rate limiting: with a full bucket and no refill between requests, 100 requests succeed and the 101st fails with 429
294
- 2. Rate limiting: verify `Retry-After` header present
295
- 3. Rate limiting: verify per-tool costs (indexing uses more tokens)
296
- 4. Rate limiting: verify refill restores capacity
297
- 5. API key: valid key authenticates successfully
298
- 6. API key: invalid key returns 401
299
- 7. API key: revoked key returns 401
300
- 8. API key: checksum validation catches tampering
301
- 9. Tenant isolation: tenant A cannot see tenant B's data
302
- 10. Tenant isolation: search returns only tenant's symbols
303
- 11. Storage quota: indexing rejected when quota exceeded
304
- 12. Admin API: create tenant, generate key, index, query stats
305
- 13. Admin API: delete tenant removes all data
306
- 14. Full suite regression: all Phase 1–11 tests still green
307
-
308
- - Load testing:
309
- - Simulate 100 concurrent clients
310
- - Verify rate limiting applies correctly under load
311
- - Verify no data leakage under concurrent multi-tenant access
312
-
313
- **Verify:** `npm run test` passes. Load tests verify system stability under concurrent access.
314
-
315
- ---
316
-
317
- ## Order of Execution
318
-
319
- ```
320
- Task 96: Token bucket rate limiter ███░░░░░░░ Rate Limiting
321
- Task 97: API key authentication █████░░░░░ Auth
322
- Task 98: Multi-tenant data isolation ███████░░░ Isolation
323
- Task 99: Admin API █████████░ Admin
324
- Task 100: Integration tests ██████████ Polish
325
- ```
326
-
327
- Tasks proceed in order: rate limiting (96) provides basic protection, API keys (97) enable tenant identification, isolation (98) secures data, the admin API (99) enables management, and integration tests (100) validate the whole system.
328
-
329
- ---
330
-
331
- ## Post-Phase 12: What Comes Next
332
-
333
- Phase 12 enables PureContext as a hosted service. The next phase adds visualization:
334
-
335
- - **Phase 13**: Web UI for exploring the symbol graph visually