sourcebook 0.5.1 → 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -4,25 +4,27 @@
4
4
 
5
5
  # sourcebook
6
6
 
7
- Generate AI context files from your codebase's actual conventions. Not what agents already know what they keep missing.
7
+ **AI can read your code. It still doesn't know how your project works.**
8
+
9
+ sourcebook captures the project knowledge your team carries in its head — conventions, patterns, traps, and where things actually go — and turns it into context your coding agent can use.
8
10
 
9
11
  ```bash
10
12
  npx sourcebook init
11
13
  ```
12
14
 
13
- One command. Analyzes your codebase. Outputs a `CLAUDE.md` tuned for how your project actually works.
14
-
15
15
  <p align="center">
16
16
  <img src="demo.svg" alt="sourcebook demo" width="820" />
17
17
  </p>
18
18
 
19
+ > Tools like Repomix give AI your entire codebase. sourcebook gives it your project knowledge.
20
+
19
21
  ## Why
20
22
 
21
- AI coding agents spend most of their context window just orienting — reading files to build a mental model before doing real work. Developers manually write context files (`CLAUDE.md`, `.cursorrules`, `copilot-instructions.md`), but most are generic and go stale fast.
23
+ AI coding agents spend most of their context window orienting — reading files to build a mental model before doing real work. Most context files (`CLAUDE.md`, `.cursorrules`) are generic and go stale fast.
22
24
 
23
- Research shows auto-generated context that restates obvious information (tech stack, directory structure) actually makes agents [worse by 2-3%](https://arxiv.org/abs/2502.09601). The only context that helps is **non-discoverable information** — things agents can't figure out by reading the code alone.
25
+ Research shows auto-generated context that restates obvious information actually makes agents [worse by 2-3%](https://arxiv.org/abs/2502.09601). The only context that helps is **non-discoverable information** — the project knowledge agents can't figure out by reading code alone.
24
26
 
25
- sourcebook inverts the typical approach: instead of dumping everything, it extracts only what agents keep missing, filtered through a discoverability test.
27
+ sourcebook extracts only what agents keep missing: the conventions, hidden dependencies, fragile areas, and dominant patterns that live in your team's heads — not in the code.
26
28
 
27
29
  ## What It Finds
28
30
 
@@ -62,6 +64,7 @@ npx sourcebook init --budget 1000
62
64
  | `sourcebook init` | Analyze codebase and generate context files |
63
65
  | `sourcebook update` | Re-analyze while preserving sections you added manually |
64
66
  | `sourcebook diff` | Show what would change without writing files (exit code 1 if changes found — useful for CI) |
67
+ | `sourcebook serve` | Start an MCP server exposing live codebase intelligence (Pro) |
65
68
 
66
69
  ### Options
67
70
 
@@ -142,6 +145,59 @@ Then applies a **discoverability filter**: for every finding, asks "can an agent
142
145
 
143
146
  Output is formatted for **context-rot resistance** — critical constraints go at the top and bottom of the file (where LLMs pay the most attention), lightweight reference info goes in the middle.
144
147
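The placement rule is mechanical enough to sketch. A toy illustration of the top-and-bottom split (a hypothetical helper, not sourcebook's actual formatter):

```javascript
// Context-rot-resistant ordering: split the critical sections across the
// start and end of the file, and park lightweight reference info between.
function orderForContextRot(critical, reference) {
  const mid = Math.ceil(critical.length / 2);
  return [...critical.slice(0, mid), ...reference, ...critical.slice(mid)];
}
```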
 
148
+ ## MCP Server Mode
149
+
150
+ > **Pro feature** — requires a sourcebook Pro license.
151
+
152
+ `sourcebook serve` starts a local MCP (Model Context Protocol) server that exposes live codebase intelligence to any MCP-compatible AI client — Claude Desktop, Cursor, and others.
153
+
154
+ Instead of a static context file, your AI agent can query your project's architecture on demand: look up blast radius before editing, check conventions before writing code, mine git history for anti-patterns.
155
+
156
+ ### Setup
157
+
158
+ **Claude Desktop** — add to `~/Library/Application Support/Claude/claude_desktop_config.json`:
159
+
160
+ ```json
161
+ {
162
+   "mcpServers": {
163
+     "sourcebook": {
164
+       "command": "npx",
165
+       "args": ["sourcebook", "serve", "--dir", "/path/to/your/project"]
166
+     }
167
+   }
168
+ }
169
+ ```
170
+
171
+ **Cursor** — add to `.cursor/mcp.json` in your project or `~/.cursor/mcp.json` globally:
172
+
173
+ ```json
174
+ {
175
+   "mcpServers": {
176
+     "sourcebook": {
177
+       "command": "npx",
178
+       "args": ["sourcebook", "serve", "--dir", "/path/to/your/project"]
179
+     }
180
+   }
181
+ }
182
+ ```
183
+
184
+ Restart your client after updating the config.
185
+
186
+ ### Available Tools
187
+
188
+ | Tool | What it does |
189
+ |------|-------------|
190
+ | `analyze_codebase` | Full analysis: languages, frameworks, findings, top files by PageRank importance |
191
+ | `get_file_context` | File-level context: importance score, hub status, co-change partners, applicable conventions |
192
+ | `get_blast_radius` | Risk assessment for editing a file: dependents, co-change coupling, fragility, circular deps |
193
+ | `query_conventions` | All detected project conventions: import style, error handling, naming, commit format |
194
+ | `get_import_graph` | Dependency architecture: hub files, circular deps, dead code, PageRank rankings |
195
+ | `get_git_insights` | Git history mining: fragile files, reverted commits, anti-patterns, active dev areas |
196
+ | `get_pressing_questions` | Pre-edit briefing: everything important to know before touching a specific file |
197
+ | `search_codebase_context` | Keyword search across all findings, conventions, structure, and frameworks |
198
+
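+
+ Each tool in the table is invoked through a standard MCP `tools/call` request over STDIO. Assuming stock MCP JSON-RPC framing, a `get_blast_radius` call looks roughly like:
+
+ ```json
+ {
+   "jsonrpc": "2.0",
+   "id": 1,
+   "method": "tools/call",
+   "params": {
+     "name": "get_blast_radius",
+     "arguments": { "file": "src/lib/db.ts" }
+   }
+ }
+ ```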
199
+ The server caches the scan in memory — subsequent tool calls are fast. Pass `refresh: true` to `analyze_codebase` to force a re-scan.
200
+
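The caching behavior can be pictured as a simple per-directory memo. A simplified sketch of the pattern (scanning is injected as `scanFn` here for illustration; the real server calls its own scanner and resolves the path first):

```javascript
// One scan is memoized per directory; invalidating the cache (what
// `refresh: true` triggers) forces the next call to re-scan.
let cachedScan = null;
let cachedDir = null;

async function getScan(dir, scanFn) {
  if (cachedScan && cachedDir === dir) return cachedScan;
  cachedScan = await scanFn(dir);
  cachedDir = dir;
  return cachedScan;
}

function invalidateCache() {
  cachedScan = null;
  cachedDir = null;
}
```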
145
201
  ## Roadmap
146
202
 
147
203
  - [x] `.cursor/rules/sourcebook.mdc` + legacy `.cursorrules` output
@@ -153,9 +209,9 @@ Output is formatted for **context-rot resistance** — critical constraints go a
153
209
  - [x] Python support (Django, FastAPI, Flask, pytest)
154
210
  - [x] Go support (Gin, Echo, Fiber, module layout)
155
211
  - [x] GitHub Action for CI
212
+ - [x] `sourcebook serve` — MCP server mode
156
213
  - [ ] Framework knowledge packs (community-contributed)
157
214
  - [ ] Tree-sitter AST parsing for deeper convention detection
158
- - [ ] `sourcebook serve` — MCP server mode
159
215
  - [ ] Hosted dashboard with context quality scores
160
216
 
161
217
  ## Research Foundation
package/dist/cli.js CHANGED
@@ -4,6 +4,7 @@ import { init } from "./commands/init.js";
4
4
  import { update } from "./commands/update.js";
5
5
  import { diff } from "./commands/diff.js";
6
6
  import { activate } from "./commands/activate.js";
7
+ import { serve } from "./commands/serve.js";
7
8
  const program = new Command();
8
9
  program
9
10
  .name("sourcebook")
@@ -35,4 +36,9 @@ program
35
36
  .command("activate <key>")
36
37
  .description("Activate a Pro or Team license key")
37
38
  .action(activate);
39
+ program
40
+ .command("serve")
41
+ .description("Start an MCP server over STDIO for AI tool integration")
42
+ .option("-d, --dir <path>", "Target directory to analyze", ".")
43
+ .action(serve);
38
44
  program.parse();
@@ -0,0 +1,5 @@
1
+ interface ServeOptions {
2
+ dir: string;
3
+ }
4
+ export declare function serve(options: ServeOptions): Promise<void>;
5
+ export {};
@@ -0,0 +1,556 @@
1
+ import { Server } from "@modelcontextprotocol/sdk/server";
2
+ import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio";
3
+ import { CallToolRequestSchema, ListToolsRequestSchema, } from "@modelcontextprotocol/sdk/types";
4
+ import path from "node:path";
5
+ import { requirePro } from "../auth/license.js";
6
+ import { scanProject } from "../scanner/index.js";
7
+ import { analyzeImportGraph } from "../scanner/graph.js";
8
+ // Cache the scan to avoid re-running on every tool call
9
+ let cachedScan = null;
10
+ let cachedDir = null;
11
+ async function getScan(dir) {
12
+ const resolved = path.resolve(dir);
13
+ if (cachedScan && cachedDir === resolved) {
14
+ return cachedScan;
15
+ }
16
+ cachedScan = await scanProject(resolved);
17
+ cachedDir = resolved;
18
+ return cachedScan;
19
+ }
20
+ function invalidateCache() {
21
+ cachedScan = null;
22
+ cachedDir = null;
23
+ }
24
+ const TOOLS = [
25
+ {
26
+ name: "analyze_codebase",
27
+ description: "Run a full sourcebook analysis on the codebase. Returns the complete ProjectScan including detected languages, frameworks, build commands, project structure, architectural findings, file importance rankings, and repo mode. Use this for a comprehensive overview before making changes.",
28
+ inputSchema: {
29
+ type: "object",
30
+ properties: {
31
+ refresh: {
32
+ type: "boolean",
33
+ description: "Force a fresh scan instead of using cached results. Default: false.",
34
+ },
35
+ },
36
+ },
37
+ },
38
+ {
39
+ name: "get_file_context",
40
+ description: "Get context for a specific file: its importance score (PageRank), what imports it, what it imports, which conventions apply to it, and whether it appears in co-change clusters. Use this before editing a file to understand its role and impact.",
41
+ inputSchema: {
42
+ type: "object",
43
+ properties: {
44
+ file: {
45
+ type: "string",
46
+ description: "Relative file path from the project root (e.g. 'src/utils/auth.ts').",
47
+ },
48
+ },
49
+ required: ["file"],
50
+ },
51
+ },
52
+ {
53
+ name: "get_blast_radius",
54
+ description: "Determine what could break if you edit a given file. Returns direct dependents (files that import it), co-change partners (files historically modified together), and whether the file is a hub module. Use this to assess risk before modifying critical code.",
55
+ inputSchema: {
56
+ type: "object",
57
+ properties: {
58
+ file: {
59
+ type: "string",
60
+ description: "Relative file path from the project root (e.g. 'src/lib/db.ts').",
61
+ },
62
+ },
63
+ required: ["file"],
64
+ },
65
+ },
66
+ {
67
+ name: "query_conventions",
68
+ description: "Return all detected conventions and patterns in the codebase: import styles, error handling, naming conventions, framework-specific patterns, and commit conventions. Use this to ensure new code follows established project patterns.",
69
+ inputSchema: {
70
+ type: "object",
71
+ properties: {
72
+ category: {
73
+ type: "string",
74
+ description: "Optional filter by category (e.g. 'Import conventions', 'Error handling', 'Commit conventions'). Returns all conventions if omitted.",
75
+ },
76
+ },
77
+ },
78
+ },
79
+ {
80
+ name: "get_import_graph",
81
+ description: "Get import relationship data: hub files (most depended-on), circular dependencies, dead code candidates, and file importance rankings from PageRank analysis. Use this to understand the dependency architecture.",
82
+ inputSchema: {
83
+ type: "object",
84
+ properties: {
85
+ file: {
86
+ type: "string",
87
+ description: "Optional file path to focus on. If provided, returns only edges involving this file. If omitted, returns the full graph summary.",
88
+ },
89
+ },
90
+ },
91
+ },
92
+ {
93
+ name: "get_git_insights",
94
+ description: "Get insights mined from git history: fragile files (high churn), reverted commits (failed approaches to avoid), active development areas, co-change coupling (invisible dependencies), and commit conventions. Use this to avoid repeating past mistakes.",
95
+ inputSchema: {
96
+ type: "object",
97
+ properties: {},
98
+ },
99
+ },
100
+ {
101
+ name: "get_pressing_questions",
102
+ description: "Get the most important things to know before editing a specific file or area of the codebase. Combines blast radius, conventions, git history, and structural context into prioritized guidance. This is the 'what should I know?' briefing.",
103
+ inputSchema: {
104
+ type: "object",
105
+ properties: {
106
+ file: {
107
+ type: "string",
108
+ description: "Relative file path you're about to edit (e.g. 'src/api/routes.ts').",
109
+ },
110
+ },
111
+ required: ["file"],
112
+ },
113
+ },
114
+ {
115
+ name: "search_codebase_context",
116
+ description: "Search across all analyzed context (findings, conventions, structure, frameworks) by keyword. Returns matching findings with their category, confidence, and rationale. Use this when looking for specific architectural knowledge.",
117
+ inputSchema: {
118
+ type: "object",
119
+ properties: {
120
+ query: {
121
+ type: "string",
122
+ description: "Keyword or phrase to search for across all findings and context (e.g. 'authentication', 'circular', 'migration').",
123
+ },
124
+ },
125
+ required: ["query"],
126
+ },
127
+ },
128
+ ];
129
+ // --- Tool Handlers ---
130
+ async function handleAnalyzeCodebase(dir, args) {
131
+ if (args.refresh)
132
+ invalidateCache();
133
+ const scan = await getScan(dir);
134
+ return {
135
+ dir: scan.dir,
136
+ languages: scan.languages,
137
+ frameworks: scan.frameworks,
138
+ repoMode: scan.repoMode,
139
+ commands: scan.commands,
140
+ structure: {
141
+ layout: scan.structure.layout,
142
+ entryPoints: scan.structure.entryPoints,
143
+ directories: scan.structure.directories,
144
+ },
145
+ fileCount: scan.files.length,
146
+ findingCount: scan.findings.length,
147
+ findings: scan.findings.map((f) => ({
148
+ category: f.category,
149
+ description: f.description,
150
+ rationale: f.rationale,
151
+ confidence: f.confidence,
152
+ })),
153
+ topFiles: (scan.rankedFiles || []).slice(0, 15).map((f) => ({
154
+ file: f.file,
155
+ score: Math.round(f.score * 10000) / 10000,
156
+ })),
157
+ };
158
+ }
159
+ async function handleGetFileContext(dir, args) {
160
+ const scan = await getScan(dir);
161
+ const file = args.file;
162
+ // Find importance score
163
+ const ranked = scan.rankedFiles || [];
164
+ const fileRank = ranked.find((r) => r.file === file);
165
+ const rank = ranked.findIndex((r) => r.file === file);
166
+ // Find findings mentioning this file
167
+ const relevantFindings = scan.findings.filter((f) => f.evidence?.includes(file) ||
168
+ f.description.includes(file) ||
169
+ f.description.includes(path.basename(file)));
170
+ // Get conventions that apply (category-based)
171
+ const conventionCategories = new Set([
172
+ "Import conventions",
173
+ "Error handling",
174
+ "TypeScript",
175
+ "TypeScript imports",
176
+ "Commit conventions",
177
+ ]);
178
+ const conventions = scan.findings.filter((f) => conventionCategories.has(f.category));
179
+ // Check if it's a hub file
180
+ const hubFinding = scan.findings.find((f) => f.category === "Core modules" && f.description.includes(file));
181
+ // Check co-change clusters
182
+ const coChangeFinding = scan.findings.find((f) => f.category === "Hidden dependencies" &&
183
+ (f.description.includes(path.basename(file)) ||
184
+ f.description.includes(file)));
185
+ return {
186
+ file,
187
+ exists: scan.files.includes(file),
188
+ importance: fileRank
189
+ ? {
190
+ score: Math.round(fileRank.score * 10000) / 10000,
191
+ rank: rank + 1,
192
+ totalFiles: ranked.length,
193
+ }
194
+ : null,
195
+ isHub: !!hubFinding,
196
+ hubDetail: hubFinding?.description || null,
197
+ coChangePartners: coChangeFinding?.description || null,
198
+ relevantFindings: relevantFindings.map((f) => ({
199
+ category: f.category,
200
+ description: f.description,
201
+ confidence: f.confidence,
202
+ })),
203
+ applicableConventions: conventions.map((f) => ({
204
+ category: f.category,
205
+ description: f.description,
206
+ })),
207
+ };
208
+ }
209
+ async function handleGetBlastRadius(dir, args) {
210
+ const scan = await getScan(dir);
211
+ const file = args.file;
212
+ // Re-run import graph to get edge-level data
213
+ const graphAnalysis = await analyzeImportGraph(path.resolve(dir), scan.files);
214
+ // Find files that import this file (dependents)
215
+ // We need to look at the graph findings for hub info
216
+ const hubFinding = scan.findings.find((f) => f.category === "Core modules" && f.description.includes(file));
217
+ // Co-change partners from git analysis
218
+ const coChangeFinding = scan.findings.find((f) => f.category === "Hidden dependencies" &&
219
+ (f.description.includes(path.basename(file)) ||
220
+ f.description.includes(file)));
221
+ // Fragile code mentions
222
+ const fragileFinding = scan.findings.find((f) => f.category === "Fragile code" && f.description.includes(file));
223
+ // Circular dependency involvement
224
+ const circularFinding = scan.findings.find((f) => f.category === "Circular dependencies" &&
225
+ f.description.includes(path.basename(file)));
226
+ // Importance rank
227
+ const ranked = scan.rankedFiles || [];
228
+ const fileRank = ranked.find((r) => r.file === file);
229
+ return {
230
+ file,
231
+ importance: fileRank
232
+ ? Math.round(fileRank.score * 10000) / 10000
233
+ : null,
234
+ isHub: !!hubFinding,
235
+ hubDetail: hubFinding?.description || null,
236
+ coChangePartners: coChangeFinding?.description || null,
237
+ isFragile: !!fragileFinding,
238
+ fragileDetail: fragileFinding?.description || null,
239
+ inCircularDep: !!circularFinding,
240
+ circularDetail: circularFinding?.description || null,
241
+ graphFindings: graphAnalysis.findings.map((f) => ({
242
+ category: f.category,
243
+ description: f.description,
244
+ confidence: f.confidence,
245
+ })),
246
+ riskLevel: hubFinding
247
+ ? "high"
248
+ : circularFinding || fragileFinding
249
+ ? "medium"
250
+ : "low",
251
+ };
252
+ }
253
+ async function handleQueryConventions(dir, args) {
254
+ const scan = await getScan(dir);
255
+ // Convention-related categories
256
+ const conventionCategories = new Set([
257
+ "Import conventions",
258
+ "Error handling",
259
+ "TypeScript",
260
+ "TypeScript imports",
261
+ "Commit conventions",
262
+ "Tailwind",
263
+ "Next.js routing",
264
+ "Next.js deployment",
265
+ "Next.js images",
266
+ "Expo routing",
267
+ "Expo builds",
268
+ "Expo deep linking",
269
+ "Supabase",
270
+ "Django",
271
+ "FastAPI",
272
+ "Go module",
273
+ "Go layout",
274
+ "Go visibility",
275
+ "Testing",
276
+ "Python environment",
277
+ "Dominant patterns",
278
+ ]);
279
+ let conventions = scan.findings.filter((f) => conventionCategories.has(f.category) ||
280
+ f.category.includes("convention") ||
281
+ f.category.includes("pattern"));
282
+ if (args.category) {
283
+ const cat = args.category.toLowerCase();
284
+ conventions = conventions.filter((f) => f.category.toLowerCase().includes(cat));
285
+ }
286
+ return {
287
+ conventions: conventions.map((f) => ({
288
+ category: f.category,
289
+ description: f.description,
290
+ rationale: f.rationale,
291
+ confidence: f.confidence,
292
+ })),
293
+ frameworks: scan.frameworks,
294
+ repoMode: scan.repoMode,
295
+ };
296
+ }
297
+ async function handleGetImportGraph(dir, args) {
298
+ const scan = await getScan(dir);
299
+ const graphFindings = scan.findings.filter((f) => ["Core modules", "Circular dependencies", "Dead code candidates"].includes(f.category));
300
+ const ranked = scan.rankedFiles || [];
301
+ if (args.file) {
302
+ const fileRank = ranked.find((r) => r.file === args.file);
303
+ const rank = ranked.findIndex((r) => r.file === args.file);
304
+ return {
305
+ file: args.file,
306
+ importance: fileRank
307
+ ? {
308
+ score: Math.round(fileRank.score * 10000) / 10000,
309
+ rank: rank + 1,
310
+ totalFiles: ranked.length,
311
+ }
312
+ : null,
313
+ graphFindings: graphFindings
314
+ .filter((f) => f.description.includes(args.file) ||
315
+ f.description.includes(path.basename(args.file)))
316
+ .map((f) => ({
317
+ category: f.category,
318
+ description: f.description,
319
+ confidence: f.confidence,
320
+ })),
321
+ };
322
+ }
323
+ return {
324
+ topFiles: ranked.slice(0, 20).map((f) => ({
325
+ file: f.file,
326
+ score: Math.round(f.score * 10000) / 10000,
327
+ })),
328
+ findings: graphFindings.map((f) => ({
329
+ category: f.category,
330
+ description: f.description,
331
+ confidence: f.confidence,
332
+ })),
333
+ };
334
+ }
335
+ async function handleGetGitInsights(dir) {
336
+ const scan = await getScan(dir);
337
+ const gitCategories = new Set([
338
+ "Git history",
339
+ "Anti-patterns",
340
+ "Active development",
341
+ "Hidden dependencies",
342
+ "Fragile code",
343
+ "Commit conventions",
344
+ ]);
345
+ const gitFindings = scan.findings.filter((f) => gitCategories.has(f.category));
346
+ return {
347
+ findings: gitFindings.map((f) => ({
348
+ category: f.category,
349
+ description: f.description,
350
+ rationale: f.rationale,
351
+ confidence: f.confidence,
352
+ })),
353
+ };
354
+ }
355
+ async function handleGetPressingQuestions(dir, args) {
356
+ const scan = await getScan(dir);
357
+ const file = args.file;
358
+ const basename = path.basename(file);
359
+ const questions = [];
360
+ // Check if it's a hub file
361
+ const hubFinding = scan.findings.find((f) => f.category === "Core modules" && f.description.includes(file));
362
+ if (hubFinding) {
363
+ questions.push({
364
+ priority: 1,
365
+ question: "This is a hub file with wide blast radius",
366
+ detail: hubFinding.description,
367
+ });
368
+ }
369
+ // Check circular dependencies
370
+ const circularFinding = scan.findings.find((f) => f.category === "Circular dependencies" &&
371
+ f.description.includes(basename));
372
+ if (circularFinding) {
373
+ questions.push({
374
+ priority: 2,
375
+ question: "This file is involved in a circular dependency",
376
+ detail: circularFinding.description,
377
+ });
378
+ }
379
+ // Check fragile code
380
+ const fragileFinding = scan.findings.find((f) => f.category === "Fragile code" && f.description.includes(file));
381
+ if (fragileFinding) {
382
+ questions.push({
383
+ priority: 3,
384
+ question: "This file has high recent churn (hard to get right)",
385
+ detail: fragileFinding.description,
386
+ });
387
+ }
388
+ // Check co-change coupling
389
+ const coChangeFinding = scan.findings.find((f) => f.category === "Hidden dependencies" &&
390
+ (f.description.includes(basename) || f.description.includes(file)));
391
+ if (coChangeFinding) {
392
+ questions.push({
393
+ priority: 4,
394
+ question: "This file has hidden dependencies (co-change partners)",
395
+ detail: coChangeFinding.description,
396
+ });
397
+ }
398
+ // Check anti-patterns
399
+ const antiPatterns = scan.findings.filter((f) => f.category === "Anti-patterns");
400
+ if (antiPatterns.length > 0) {
401
+ questions.push({
402
+ priority: 5,
403
+ question: "There are known anti-patterns in this project",
404
+ detail: antiPatterns.map((f) => f.description).join("; "),
405
+ });
406
+ }
407
+ // Applicable conventions
408
+ const conventions = scan.findings.filter((f) => f.category.includes("convention") ||
409
+ f.category.includes("Convention") ||
410
+ f.category.includes("Import") ||
411
+ f.category.includes("TypeScript") ||
412
+ f.category.includes("pattern") ||
413
+ f.category.includes("Pattern"));
414
+ if (conventions.length > 0) {
415
+ questions.push({
416
+ priority: 6,
417
+ question: "Follow these project conventions",
418
+ detail: conventions.map((f) => f.description).join("; "),
419
+ });
420
+ }
421
+ // Active development area?
422
+ const activeFinding = scan.findings.find((f) => f.category === "Active development" &&
423
+ f.description.includes(file.split("/")[0]));
424
+ if (activeFinding) {
425
+ questions.push({
426
+ priority: 7,
427
+ question: "This area is under active development",
428
+ detail: activeFinding.description,
429
+ });
430
+ }
431
+ questions.sort((a, b) => a.priority - b.priority);
432
+ return {
433
+ file,
434
+ questions,
435
+ summary: questions.length > 0
436
+ ? `${questions.length} things to know before editing ${file}`
437
+ : `No special concerns found for ${file}`,
438
+ };
439
+ }
440
+ async function handleSearchCodebaseContext(dir, args) {
441
+ const scan = await getScan(dir);
442
+ const query = args.query.toLowerCase();
443
+ const matches = scan.findings.filter((f) => f.description.toLowerCase().includes(query) ||
444
+ f.category.toLowerCase().includes(query) ||
445
+ (f.rationale && f.rationale.toLowerCase().includes(query)) ||
446
+ (f.evidence && f.evidence.toLowerCase().includes(query)));
447
+ // Also search structure
448
+ const structureMatches = [];
449
+ for (const [dir, purpose] of Object.entries(scan.structure.directories)) {
450
+ if (dir.toLowerCase().includes(query) ||
451
+ purpose.toLowerCase().includes(query)) {
452
+ structureMatches.push({ key: dir, value: purpose });
453
+ }
454
+ }
455
+ // Search frameworks
456
+ const frameworkMatches = scan.frameworks.filter((f) => f.toLowerCase().includes(query));
457
+ return {
458
+ query: args.query,
459
+ findings: matches.map((f) => ({
460
+ category: f.category,
461
+ description: f.description,
462
+ rationale: f.rationale,
463
+ confidence: f.confidence,
464
+ })),
465
+ structureMatches,
466
+ frameworkMatches,
467
+ totalResults: matches.length + structureMatches.length + frameworkMatches.length,
468
+ };
469
+ }
470
+ // --- Main ---
471
+ export async function serve(options) {
472
+ await requirePro("sourcebook serve");
473
+ const dir = path.resolve(options.dir);
474
+ // Suppress all console output — STDIO transport uses stdout for JSON-RPC
475
+ const originalLog = console.log;
476
+ const originalError = console.error;
477
+ console.log = () => { };
478
+ console.error = () => { };
479
+ const server = new Server({
480
+ name: "sourcebook",
481
+ version: "0.6.0",
482
+ }, {
483
+ capabilities: {
484
+ tools: {},
485
+ },
486
+ });
487
+ server.setRequestHandler(ListToolsRequestSchema, async () => ({
488
+ tools: TOOLS,
489
+ }));
490
+ server.setRequestHandler(CallToolRequestSchema, async (request) => {
491
+ const { name, arguments: args } = request.params;
492
+ try {
493
+ let result;
494
+ switch (name) {
495
+ case "analyze_codebase":
496
+ result = await handleAnalyzeCodebase(dir, args);
497
+ break;
498
+ case "get_file_context":
499
+ result = await handleGetFileContext(dir, args);
500
+ break;
501
+ case "get_blast_radius":
502
+ result = await handleGetBlastRadius(dir, args);
503
+ break;
504
+ case "query_conventions":
505
+ result = await handleQueryConventions(dir, args);
506
+ break;
507
+ case "get_import_graph":
508
+ result = await handleGetImportGraph(dir, args);
509
+ break;
510
+ case "get_git_insights":
511
+ result = await handleGetGitInsights(dir);
512
+ break;
513
+ case "get_pressing_questions":
514
+ result = await handleGetPressingQuestions(dir, args);
515
+ break;
516
+ case "search_codebase_context":
517
+ result = await handleSearchCodebaseContext(dir, args);
518
+ break;
519
+ default:
520
+ return {
521
+ content: [
522
+ {
523
+ type: "text",
524
+ text: JSON.stringify({ error: `Unknown tool: ${name}` }),
525
+ },
526
+ ],
527
+ isError: true,
528
+ };
529
+ }
530
+ return {
531
+ content: [
532
+ {
533
+ type: "text",
534
+ text: JSON.stringify(result, null, 2),
535
+ },
536
+ ],
537
+ };
538
+ }
539
+ catch (err) {
540
+ const message = err instanceof Error ? err.message : String(err);
541
+ return {
542
+ content: [
543
+ {
544
+ type: "text",
545
+ text: JSON.stringify({ error: message }),
546
+ },
547
+ ],
548
+ isError: true,
549
+ };
550
+ }
551
+ });
552
+ const transport = new StdioServerTransport();
553
+ await server.connect(transport);
554
+ // Restore console for cleanup messages on stderr
555
+ console.error = originalError;
556
+ }
@@ -28,6 +28,8 @@ const SOURCEBOOK_HEADERS = new Set([
28
28
  "High-Impact Files",
29
29
  "Code Conventions",
30
30
  "Constraints",
31
+ "Quick Reference",
32
+ "Dominant Patterns",
31
33
  ]);
32
34
  /**
33
35
  * Re-analyze and regenerate context files while preserving manual edits.
@@ -218,9 +218,7 @@ export async function detectFrameworks(dir, files) {
218
218
  }
219
219
  const paths = tsconfig?.compilerOptions?.paths;
220
220
  if (paths) {
221
- const aliases = Object.keys(paths)
222
- .map((k) => k.replace("/*", ""))
223
- .join(", ");
221
+ const aliases = [...new Set(Object.keys(paths).map((k) => k.replace("/*", "")))].join(", ");
224
222
  findings.push({
225
223
  category: "TypeScript imports",
226
224
  description: `Path aliases configured: ${aliases}. Use these instead of relative imports.`,
@@ -67,9 +67,14 @@ function detectRevertedPatterns(dir, revertedPatterns) {
67
67
  if (reverts.length >= 2) {
68
68
  // Extract what was reverted
69
69
  const revertDescriptions = [];
70
+ const REVERT_NOISE = [
71
+ /\.yml$/i, /\.yaml$/i, /scorecard/i, /dependabot/i,
72
+ /^update /i, /^bump /i, /^deps/i, /^ci:/i, /^build:/i,
73
+ /^chore\(deps\)/i, /^chore\(release\)/i,
74
+ ];
70
75
  for (const line of reverts.slice(0, 10)) {
71
76
  const match = line.match(/^[a-f0-9]+ Revert "(.+)"/);
72
- if (match) {
77
+ if (match && !REVERT_NOISE.some(n => n.test(match[1]))) {
73
78
  revertDescriptions.push(match[1]);
74
79
  revertedPatterns.push(match[1]);
75
80
  }
@@ -103,8 +108,15 @@ function detectAntiPatterns(dir) {
103
108
  antiPatterns.push(match[1]);
104
109
  }
105
110
  }
106
- if (antiPatterns.length > 0) {
107
- for (const pattern of antiPatterns.slice(0, 5)) {
111
+ // Filter out noise: CI config, deps, version bumps
112
+ const REVERT_NOISE = [
113
+ /\.yml$/i, /\.yaml$/i, /scorecard/i, /dependabot/i,
114
+ /^update /i, /^bump /i, /^deps/i, /^ci:/i, /^build:/i,
115
+ /^chore\(deps\)/i, /^chore\(release\)/i,
116
+ ];
117
+ const meaningful = antiPatterns.filter(p => !REVERT_NOISE.some(n => n.test(p)));
118
+ if (meaningful.length > 0) {
119
+ for (const pattern of meaningful.slice(0, 5)) {
108
120
  findings.push({
109
121
  category: "Anti-patterns",
110
122
  description: `Tried and reverted: "${pattern}". This approach was explicitly rejected.`,
@@ -137,8 +149,22 @@ function detectAntiPatterns(dir) {
137
149
  if (currentFiles.length >= 3) {
138
150
  deletionBatches.push({ message: currentMessage, files: currentFiles });
139
151
  }
152
+ // Filter out release/changeset/version commits and revert-of-revert noise
153
+ const NOISE_PATTERNS = [
154
+ /^chore\(release\)/i,
155
+ /^\[ci\] release/i,
156
+ /^version packages/i,
157
+ /^changeset/i,
158
+ /^bump/i,
159
+ /^release/i,
160
+ /^Revert "Revert/i,
161
+ /^merge/i,
162
+ /^ci:/i,
163
+ /^build:/i,
164
+ /^Revert /i,
165
+ ];
140
166
  // Only report significant deletions (3+ files in one commit = abandoned feature)
141
- for (const batch of deletionBatches.slice(0, 3)) {
167
+ for (const batch of deletionBatches.filter(b => !NOISE_PATTERNS.some(p => p.test(b.message))).slice(0, 3)) {
142
168
  if (batch.files.length >= 3) {
143
169
  const fileList = batch.files.slice(0, 3).map((f) => path.basename(f)).join(", ");
144
170
  findings.push({
@@ -313,9 +339,17 @@ function detectRapidReEdits(dir) {
313
339
  }
314
340
  // Find files edited 5+ times within a 7-day window
315
341
  const churnyFiles = [];
342
+ // Filter out non-source files that naturally churn
343
+ const NON_SOURCE_PATTERNS = [
344
+ /\.md$/i, /\.mdx$/i, /\.rst$/i, /\.txt$/i, /\.json$/i, /\.ya?ml$/i, /\.lock$/i, /\.log$/i,
345
+ /CHANGELOG/i, /\.env/, /\.generated\./, /\.config\./,
346
+ /\.github\//, /\.claude\//, /dashboard\//, /ops\//,
347
+ ];
316
348
  for (const [file, dates] of fileEdits) {
317
349
  if (dates.length < 5)
318
350
  continue;
351
+ if (NON_SOURCE_PATTERNS.some((p) => p.test(file)))
352
+ continue;
319
353
  // Sort dates
320
354
  dates.sort((a, b) => a.getTime() - b.getTime());
321
355
  // Sliding window: find any 7-day window with 5+ edits
@@ -377,7 +411,7 @@ function detectCommitPatterns(dir) {
    .map(([scope]) => scope);
  findings.push({
    category: "Commit conventions",
-   description: `Uses Conventional Commits (feat/fix/docs/etc). ${topScopes.length > 0 ? `Common scopes: ${topScopes.join(", ")}` : ""}. Follow this pattern for new commits.`,
+   description: `Uses Conventional Commits (feat/fix/docs/etc).${topScopes.length > 0 ? ` Common scopes: ${topScopes.join(", ")}.` : ""} Follow this pattern for new commits.`,
    confidence: "high",
    discoverable: false,
  });
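The template change above is subtle: with no common scopes, the old string interpolated an empty ternary between two hard-coded separators, leaving a stray `". "` in the middle of the sentence. A before/after sketch with a hypothetical empty scope list:

```javascript
const topScopes = [];
// Old template: both separators sit outside the ternary, so an empty result leaks ". ."
const before = `Uses Conventional Commits (feat/fix/docs/etc). ${topScopes.length > 0 ? `Common scopes: ${topScopes.join(", ")}` : ""}. Follow this pattern for new commits.`;
// New template: the separators move inside the ternary, so nothing leaks when it is empty
const after = `Uses Conventional Commits (feat/fix/docs/etc).${topScopes.length > 0 ? ` Common scopes: ${topScopes.join(", ")}.` : ""} Follow this pattern for new commits.`;

before; // "Uses Conventional Commits (feat/fix/docs/etc). . Follow this pattern for new commits."
after;  // "Uses Conventional Commits (feat/fix/docs/etc). Follow this pattern for new commits."
```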
@@ -64,8 +64,11 @@ function sampleFiles(files, maxCount) {
    f.includes("layout.") ||
    f.includes("middleware."));
  const rest = files.filter((f) => !priority.includes(f));
- const shuffled = rest.sort(() => Math.random() - 0.5);
- return [...priority, ...shuffled].slice(0, maxCount);
+ // Deterministic sampling: sort by path, take evenly spaced files
+ const sorted = rest.sort();
+ const step = Math.max(1, Math.floor(sorted.length / Math.max(1, maxCount - priority.length)));
+ const sampled = sorted.filter((_, i) => i % step === 0);
+ return [...priority, ...sampled].slice(0, maxCount);
}
function detectBarrelExports(files, contents) {
  const indexFiles = files.filter((f) => path.basename(f).startsWith("index.") && !f.includes("node_modules"));
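Replacing the `Math.random()` shuffle makes sampling reproducible: the same repo now yields the same file sample on every run. A self-contained sketch of the new strategy — the file names are hypothetical, and `priority` is simplified to a parameter here rather than computed from framework entry points as in the real function:

```javascript
// Mirrors the sampling logic from the diff above (priority handling simplified).
function sampleFiles(files, maxCount, priority = []) {
  const rest = files.filter((f) => !priority.includes(f));
  // Deterministic sampling: sort by path, take evenly spaced files
  const sorted = rest.sort();
  const step = Math.max(1, Math.floor(sorted.length / Math.max(1, maxCount - priority.length)));
  const sampled = sorted.filter((_, i) => i % step === 0);
  return [...priority, ...sampled].slice(0, maxCount);
}

sampleFiles(["h.ts", "c.ts", "a.ts", "f.ts", "e.ts", "g.ts", "b.ts", "d.ts"], 4);
// -> ["a.ts", "c.ts", "e.ts", "g.ts"] on every run, with no Math.random() involved
```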
@@ -318,7 +321,25 @@ function detectDominantPatterns(dir, files, contents, frameworks) {
      }
    }
  }
- const dominantI18n = i18nPatterns.filter((p) => p.count >= 3).sort((a, b) => b.count - a.count);
+ // Filter: if only t() matched, require corroborating evidence (i18n files or packages)
+ const hasI18nFiles = files.some((f) => f.includes("locale") || f.includes("i18n") || f.includes("translations") || f.includes("messages/"));
+ let hasI18nPackage = false;
+ for (const [f, c] of allContents) {
+   if (f.endsWith("package.json") && (c.includes("i18next") || c.includes("react-intl") || c.includes("next-intl") || c.includes("@lingui"))) {
+     hasI18nPackage = true;
+     break;
+   }
+ }
+ const dominantI18n = i18nPatterns
+   .filter((p) => {
+     if (p.count < 3)
+       return false;
+     // t() alone is too generic — require corroborating evidence
+     if (p.hook === 't("key")' && !hasI18nFiles && !hasI18nPackage)
+       return false;
+     return true;
+   })
+   .sort((a, b) => b.count - a.count);
  if (dominantI18n.length > 0) {
    const primary = dominantI18n[0];
    let desc = `User-facing strings use ${primary.hook} for internationalization.`;
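The i18n change above targets a false positive: `t(...)` is a common variable or function name outside i18n codebases, so bare matches now only count when locale files or a known i18n package corroborate them. A condensed sketch of the rule — the `keepI18nPattern` helper and its inputs are illustrative:

```javascript
// Condensed version of the filter from the diff above.
function keepI18nPattern(p, hasI18nFiles, hasI18nPackage) {
  if (p.count < 3) return false;
  // t() alone is too generic; require corroborating evidence
  if (p.hook === 't("key")' && !hasI18nFiles && !hasI18nPackage) return false;
  return true;
}

keepI18nPattern({ hook: 't("key")', count: 12 }, false, false); // false: no corroboration
keepI18nPattern({ hook: 't("key")', count: 12 }, false, true);  // true: i18n package present
```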
@@ -345,10 +366,10 @@ function detectDominantPatterns(dir, files, contents, frameworks) {
  // 2. ROUTING / API PATTERNS
  // ========================================
  const routerPatterns = [
-   { pattern: "trpc\\.router|createTRPCRouter|t\\.router", name: "tRPC routers", count: 0 },
+   { pattern: "trpc\\.router|createTRPCRouter|from ['\"]@trpc", name: "tRPC routers", count: 0 },
    { pattern: "express\\.Router|router\\.get|router\\.post", name: "Express routers", count: 0 },
    { pattern: "app\\.get\\(|app\\.post\\(|app\\.put\\(", name: "Express app routes", count: 0 },
-   { pattern: "Hono|app\\.route\\(|c\\.json\\(", name: "Hono routes", count: 0 },
+   { pattern: "new Hono|from ['\"]hono['\"]", name: "Hono routes", count: 0 },
    { pattern: "FastAPI|@app\\.(get|post|put|delete)", name: "FastAPI endpoints", count: 0 },
    { pattern: "flask\\.route|@app\\.route", name: "Flask routes", count: 0 },
    { pattern: "gin\\.Engine|r\\.GET|r\\.POST", name: "Gin routes", count: 0 },
@@ -377,7 +398,7 @@ function detectDominantPatterns(dir, files, contents, frameworks) {
  // ========================================
  const schemaPatterns = [
    { pattern: "z\\.object|z\\.string|z\\.number", name: "Zod", usage: "Use Zod schemas for validation", count: 0 },
-   { pattern: "BaseModel|Field\\(", name: "Pydantic", usage: "Use Pydantic BaseModel for data classes", count: 0 },
+   { pattern: "class\\s+\\w+\\(BaseModel\\)|from pydantic", name: "Pydantic", usage: "Use Pydantic BaseModel for data classes", count: 0 },
    { pattern: "Joi\\.object|Joi\\.string", name: "Joi", usage: "Use Joi schemas for validation", count: 0 },
    { pattern: "yup\\.object|yup\\.string", name: "Yup", usage: "Use Yup schemas for validation", count: 0 },
    { pattern: "class.*Serializer.*:|serializers\\.Serializer", name: "Django serializers", usage: "Use Django REST serializers for API data", count: 0 },
@@ -433,7 +454,7 @@ function detectDominantPatterns(dir, files, contents, frameworks) {
  // 5. TESTING PATTERNS
  // ========================================
  const testPatterns = [
-   { pattern: "describe\\(|it\\(|test\\(", name: "Jest/Vitest", count: 0 },
+   { pattern: "describe\\(|it\\(|test\\(", name: "_generic_test", count: 0 },
    { pattern: "def test_|class Test|pytest", name: "pytest", count: 0 },
    { pattern: "func Test.*\\(t \\*testing\\.T\\)", name: "Go testing", count: 0 },
    { pattern: "expect\\(.*\\)\\.to", name: "Chai/expect", count: 0 },
@@ -466,7 +487,46 @@ function detectDominantPatterns(dir, files, contents, frameworks) {
    }
  }
  const dominantTest = testPatterns.filter((p) => p.count >= 2).sort((a, b) => b.count - a.count);
  if (dominantTest.length > 0) {
-   const primary = dominantTest[0];
+   let primary = dominantTest[0];
+   // Disambiguate generic test pattern by checking package.json devDependencies
+   if (primary.name === "_generic_test") {
+     let pkgContent = allContents.get("package.json") || "";
+     if (!pkgContent) {
+       const pkgPath = safePath(dir, "package.json");
+       if (pkgPath) {
+         try {
+           pkgContent = fs.readFileSync(pkgPath, "utf-8");
+         }
+         catch { /* skip */ }
+       }
+     }
+     if (pkgContent.includes('"vitest"')) {
+       primary = { ...primary, name: "Vitest" };
+     }
+     else if (pkgContent.includes('"jest"') || pkgContent.includes('"@jest/')) {
+       primary = { ...primary, name: "Jest" };
+     }
+     else if (pkgContent.includes('"mocha"')) {
+       primary = { ...primary, name: "Mocha" };
+     }
+     else if (pkgContent.includes('"jasmine"')) {
+       primary = { ...primary, name: "Jasmine" };
+     }
+     else {
+       // Check for Deno (deno.json/deno.jsonc) or Bun (bun.lockb)
+       const hasDeno = files.some(f => f === "deno.json" || f === "deno.jsonc" || f === "deno.lock");
+       const hasBun = files.some(f => f === "bun.lockb" || f === "bunfig.toml");
+       if (hasDeno) {
+         primary = { ...primary, name: "Deno test" };
+       }
+       else if (hasBun) {
+         primary = { ...primary, name: "Bun test" };
+       }
+       else {
+         primary = { ...primary, name: "Jest" }; // default for JS/TS projects
+       }
+     }
+   }
    // Also detect common test utilities/helpers
    const testHelperFiles = files.filter((f) => (f.includes("test-utils") || f.includes("testUtils") || f.includes("fixtures") || f.includes("helpers")) &&
      (f.includes("test") || f.includes("spec")));
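`describe`/`it`/`test` are shared by Jest, Vitest, Mocha, and others, so the generic match is now resolved against `package.json`. A reduced sketch of that lookup chain — the `resolveTestFramework` helper is illustrative, and the real code additionally reads the file from disk and falls back to Deno/Bun markers:

```javascript
// Condensed version of the devDependency check from the diff above.
function resolveTestFramework(pkgContent) {
  if (pkgContent.includes('"vitest"')) return "Vitest";
  if (pkgContent.includes('"jest"') || pkgContent.includes('"@jest/')) return "Jest";
  if (pkgContent.includes('"mocha"')) return "Mocha";
  if (pkgContent.includes('"jasmine"')) return "Jasmine";
  return "Jest"; // default for JS/TS projects
}

resolveTestFramework('{ "devDependencies": { "vitest": "^3.0.0" } }'); // "Vitest"
resolveTestFramework('{ "devDependencies": {} }');                     // "Jest" (fallback)
```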
@@ -527,9 +587,9 @@ function detectDominantPatterns(dir, files, contents, frameworks) {
  // 7. STYLING CONVENTIONS
  // ========================================
  const stylePatterns = [
-   { pattern: "className=|class=.*tw-", name: "Tailwind CSS", desc: "Styling uses Tailwind CSS utility classes", count: 0 },
-   { pattern: "styled\\.|styled\\(|css`", name: "styled-components/Emotion", desc: "Styling uses CSS-in-JS (styled-components or Emotion)", count: 0 },
-   { pattern: "styles\\.\\w+|from.*\\.module\\.(css|scss)", name: "CSS Modules", desc: "Styling uses CSS Modules (*.module.css)", count: 0 },
+   { pattern: "class=.*tw-|className=[\"'](?:flex |grid |p-|m-|text-|bg-|border-|rounded-|shadow-|w-|h-)", name: "Tailwind CSS", desc: "Styling uses Tailwind CSS utility classes", count: 0 },
+   { pattern: "from ['\"]styled-components|from ['\"]@emotion|styled\\.|styled\\(", name: "styled-components/Emotion", desc: "Styling uses CSS-in-JS (styled-components or Emotion)", count: 0 },
+   { pattern: "from.*\\.module\\.(css|scss)", name: "CSS Modules", desc: "Styling uses CSS Modules (*.module.css)", count: 0 },
  ];
  for (const [f, content] of allContents) {
    for (const p of stylePatterns) {
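The old Tailwind pattern matched any `className=`, which tagged every React project as Tailwind; the new one requires a recognizable utility-class prefix. A sketch of the difference — the sample JSX strings are hypothetical, and the real detector compiles this pattern from a string, so its flags may differ from the bare literal used here:

```javascript
// Same alternation as the new Tailwind pattern in the diff above.
const tailwind = /class=.*tw-|className=["'](?:flex |grid |p-|m-|text-|bg-|border-|rounded-|shadow-|w-|h-)/;

tailwind.test('className="flex items-center p-4"'); // true: utility-class prefixes
tailwind.test('className="AppHeader-root"');        // false: plain CSS class name
```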
@@ -645,13 +705,15 @@ function detectDominantPatterns(dir, files, contents, frameworks) {
  if (dominantRouter.length > 0) {
    const routeDirs = files
      .filter((f) => (f.includes("routes") || f.includes("routers") || f.includes("api/") || f.includes("app/api/")) &&
-       !f.includes("node_modules") && !f.includes(".test.") &&
+       !f.includes("node_modules") && !f.includes(".test.") && !f.includes(".spec.") &&
+       !f.includes("test/") && !f.includes("tests/") && !f.includes("__test") &&
+       !f.includes("fixture") && !f.includes("mock") &&
        (f.endsWith(".ts") || f.endsWith(".js") || f.endsWith(".py") || f.endsWith(".go")))
      .map((f) => {
        const parts = f.split("/");
-       // Get the directory containing route files
        return parts.slice(0, -1).join("/");
      })
+     .filter((v) => v && v !== "." && v.length > 0) // filter empty/root paths
      .filter((v, i, a) => a.indexOf(v) === i)
      .slice(0, 3);
    if (routeDirs.length > 0) {
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "sourcebook",
-   "version": "0.5.1",
+   "version": "0.6.0",
    "description": "Extract the conventions, constraints, and architectural truths your AI coding agents keep missing.",
    "type": "module",
    "bin": {
@@ -41,14 +41,15 @@
    "LICENSE"
  ],
  "dependencies": {
+   "@modelcontextprotocol/sdk": "^1.29.0",
+   "chalk": "^5.4.0",
    "commander": "^13.0.0",
-   "glob": "^11.0.0",
-   "chalk": "^5.4.0"
+   "glob": "^11.0.0"
  },
  "devDependencies": {
-   "typescript": "^5.7.0",
+   "@types/node": "^22.0.0",
    "tsx": "^4.19.0",
-   "vitest": "^3.0.0",
-   "@types/node": "^22.0.0"
+   "typescript": "^5.7.0",
+   "vitest": "^3.0.0"
  }
}