sourcebook 0.1.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 maroond
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,111 @@
1
+ # sourcebook
2
+
3
+ Generate AI context files from your codebase's actual conventions. Not what agents already know — what they keep missing.
4
+
5
+ ```bash
6
+ npx sourcebook init
7
+ ```
8
+
9
+ One command. Analyzes your codebase. Outputs a `CLAUDE.md` tuned for how your project actually works.
10
+
11
+ <p align="center">
12
+ <img src="demo.svg" alt="sourcebook demo" width="820" />
13
+ </p>
14
+
15
+ ## Why
16
+
17
+ AI coding agents spend most of their context window just orienting — reading files to build a mental model before doing real work. Developers manually write context files (`CLAUDE.md`, `.cursorrules`, `copilot-instructions.md`), but most are generic and go stale fast.
18
+
19
+ Research shows auto-generated context that restates obvious information (tech stack, directory structure) actually makes agents [worse by 2-3%](https://arxiv.org/abs/2502.09601). The only context that helps is **non-discoverable information** — things agents can't figure out by reading the code alone.
20
+
21
+ sourcebook inverts the typical approach: instead of dumping everything, it extracts only what agents keep missing, filtered through a discoverability test.
22
+
23
+ ## What It Finds
24
+
25
+ - **Import graph + PageRank** — ranks files by structural importance, identifies hub files with the widest blast radius
26
+ - **Git history forensics** — reverted commits (literal "don't do this" signals), co-change coupling (invisible dependencies), rapid re-edits (code that was hard to get right)
27
+ - **Convention detection** — naming patterns, export style, import organization, barrel exports, path aliases
28
+ - **Framework detection** — Next.js, Expo, Supabase, Tailwind, Express, TypeScript configs
29
+ - **Context-rot-aware formatting** — critical constraints at the top, reference info in the middle, action prompts at the bottom (optimized for LLM attention patterns)
30
+
31
+ ## Quick Start
32
+
33
+ ```bash
34
+ # Generate CLAUDE.md for your project
35
+ npx sourcebook init
36
+
37
+ # Specify output format
38
+ npx sourcebook init --format claude # CLAUDE.md (default)
39
+ npx sourcebook init --format cursor # .cursor/rules/sourcebook.mdc + .cursorrules
40
+ npx sourcebook init --format copilot # .github/copilot-instructions.md
41
+ npx sourcebook init --format all # All of the above
42
+ ```
43
+
44
+ ## Example Output
45
+
46
+ Running `npx sourcebook init` on a real Expo + Supabase project (3,467 files):
47
+
48
+ ```
49
+ sourcebook v0.1.0
50
+
51
+ Scanning project...
52
+ Detected: Expo, Supabase, TypeScript, EAS Build
53
+ Files: 3,467 across 847 directories
54
+ Build: npx expo start | eas build
55
+
56
+ Analyzing import graph...
57
+ Hub files: ThemeContext.tsx (684 importers), brain-api.ts (42 importers)
58
+ Circular: brain-api.ts ↔ chat.ts
59
+ Orphans: 23 potentially dead files
60
+
61
+ Mining git history (287 commits)...
62
+ Reverts: 2 found
63
+ Co-change coupling: useTodayBrain.ts ↔ brain-api.ts (89% correlation)
64
+ Rapid edits: profile.tsx (18 edits in one week)
65
+ Active areas: src/ (265 changes in 30 days)
66
+
67
+ Detecting conventions...
68
+ Barrel exports: 35 index files
69
+ Path aliases: @/ prefix
70
+ Named exports preferred (25:6 ratio)
71
+ Conventional Commits: yes
72
+
73
+ Generated: CLAUDE.md (15 findings, 1.2K tokens)
74
+ Done in 2.8s
75
+ ```
76
+
77
+ ## How It Works
78
+
79
+ sourcebook runs four analysis passes, all deterministic and local — no LLM, no API keys, no network calls:
80
+
81
+ 1. **Static analysis** — framework detection, build commands, project structure, environment variables
82
+ 2. **Import graph** — builds a directed graph of all imports, runs PageRank to find the most structurally important files
83
+ 3. **Git forensics** — mines commit history for reverts, co-change patterns, churn hotspots, and development velocity
84
+ 4. **Convention inference** — samples source files to detect naming, import, export, and error handling patterns
85
+
86
+ It then applies a **discoverability filter**: for every finding, it asks "can an agent figure this out by reading the code?" If yes, the finding is dropped. Only non-discoverable information makes it to the output.
87
+
88
+ Output is formatted for **context-rot resistance**: critical constraints go at the top of the file and action prompts at the bottom (the positions where LLMs pay the most attention), while lightweight reference info goes in the middle.
89
+
90
+ ## Roadmap
91
+
92
+ - [x] `.cursor/rules/sourcebook.mdc` + legacy `.cursorrules` output format
93
+ - [x] `.github/copilot-instructions.md` output format
94
+ - [ ] `sourcebook update` — re-analyze while preserving manual edits
95
+ - [ ] `--budget <tokens>` — PageRank-based prioritization within a token limit
96
+ - [ ] Framework knowledge packs (community-contributed)
97
+ - [ ] Tree-sitter AST parsing for deeper convention detection
98
+ - [ ] GitHub Action for CI (auto-update context on merge)
99
+ - [ ] `sourcebook serve` — MCP server mode
100
+
101
+ ## Research Foundation
102
+
103
+ Built on findings from:
104
+ - [ETH Zurich AGENTS.md study](https://arxiv.org/abs/2502.09601) — auto-generated obvious context hurts agent performance
105
+ - [Karpathy's autoresearch](https://github.com/karpathy/autoresearch) — curated context (`program.md`) is the #1 lever for agent effectiveness
106
+ - [Aider's repo-map](https://aider.chat/docs/repomap.html) — PageRank on import graphs for structural importance
107
+ - Chroma's context-rot research — LLMs show 30%+ accuracy drops for middle-of-context information
108
+
109
+ ## License
110
+
111
+ MIT
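Review note on the README's "Import graph" pass: the scanner that builds and ranks the graph is not included in this diff, so the following is only a minimal sketch of textbook power-iteration PageRank over an import graph. The names `ImportGraph`, `pageRank`, and `topFiles`, the damping factor of 0.85, and the iteration count are all assumptions made for illustration. Rank flows from the importing file to the imported file, so a file imported by many others ends up near the top.

```typescript
// Hypothetical sketch; not the package's actual implementation.
// Key: a source file. Value: the files it imports.
type ImportGraph = Map<string, string[]>;

function pageRank(
  graph: ImportGraph,
  damping = 0.85,
  iterations = 50,
): Map<string, number> {
  // Collect every node, including files that are imported but import nothing.
  const nodes = new Set<string>();
  for (const [file, imports] of graph) {
    nodes.add(file);
    for (const target of imports) nodes.add(target);
  }
  const n = nodes.size;

  // Start from a uniform distribution.
  let rank = new Map<string, number>();
  for (const node of nodes) rank.set(node, 1 / n);

  for (let i = 0; i < iterations; i++) {
    const next = new Map<string, number>();
    for (const node of nodes) next.set(node, (1 - damping) / n);

    let danglingMass = 0;
    for (const node of nodes) {
      const out = graph.get(node) ?? [];
      const share = rank.get(node) ?? 0;
      if (out.length === 0) {
        // Files with no imports have nowhere to send rank.
        danglingMass += share;
        continue;
      }
      for (const target of out) {
        next.set(target, (next.get(target) ?? 0) + (damping * share) / out.length);
      }
    }

    // Redistribute the dangling mass evenly so the ranks keep summing to 1.
    for (const node of nodes) {
      next.set(node, (next.get(node) ?? 0) + (damping * danglingMass) / n);
    }
    rank = next;
  }
  return rank;
}

// Return the k highest-ranked files, highest first.
function topFiles(rank: Map<string, number>, k: number): string[] {
  return Array.from(rank.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, k)
    .map(([file]) => file);
}

// Usage with a made-up four-file project.
const exampleGraph: ImportGraph = new Map([
  ["a.ts", ["theme.ts"]],
  ["b.ts", ["theme.ts"]],
  ["c.ts", ["theme.ts", "a.ts"]],
]);
console.log(topFiles(pageRank(exampleGraph), 2));
```

The generators later in this diff read `scan.rankedFiles.slice(0, 5)`, which is consistent with a ranked list like the one `topFiles` returns, but the exact shape of `rankedFiles` entries beyond their `file` field is not shown here.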
package/dist/cli.d.ts ADDED
@@ -0,0 +1,2 @@
1
+ #!/usr/bin/env node
2
+ export {};
package/dist/cli.js ADDED
@@ -0,0 +1,17 @@
1
+ #!/usr/bin/env node
2
+ import { Command } from "commander";
3
+ import { init } from "./commands/init.js";
4
+ const program = new Command();
5
+ program
6
+ .name("sourcebook")
7
+ .description("Extract the conventions, constraints, and architectural truths your AI coding agents keep missing.")
8
+ .version("0.1.0");
9
+ program
10
+ .command("init")
11
+ .description("Analyze a codebase and generate agent context files")
12
+ .option("-d, --dir <path>", "Target directory to analyze", ".")
13
+ .option("-f, --format <formats>", "Comma-separated output formats (claude,cursor,copilot,all)", "claude")
14
+ .option("--budget <tokens>", "Max token budget for generated context", "4000")
15
+ .option("--dry-run", "Preview findings without writing files")
16
+ .action(init);
17
+ program.parse();
@@ -0,0 +1,8 @@
1
+ interface InitOptions {
2
+ dir: string;
3
+ format: string;
4
+ budget: string;
5
+ dryRun?: boolean;
6
+ }
7
+ export declare function init(options: InitOptions): Promise<void>;
8
+ export {};
@@ -0,0 +1,91 @@
1
+ import path from "node:path";
2
+ import chalk from "chalk";
3
+ import { scanProject } from "../scanner/index.js";
4
+ import { generateClaude } from "../generators/claude.js";
5
+ import { generateCursor, generateCursorLegacy } from "../generators/cursor.js";
6
+ import { generateCopilot } from "../generators/copilot.js";
7
+ import { writeOutput } from "../utils/output.js";
8
+ export async function init(options) {
9
+ const targetDir = path.resolve(options.dir);
10
+ const formats = options.format.split(",").map((f) => f.trim());
11
+ const budget = parseInt(options.budget, 10);
12
+ console.log(chalk.bold("\nsourcebook"));
13
+ console.log(chalk.dim("Extracting repo truths...\n"));
14
+ // Phase 1: Scan the project
15
+ const scan = await scanProject(targetDir);
16
+ console.log(chalk.green("✓") + " Scanned project structure");
17
+ console.log(chalk.dim(` ${scan.files.length} files, ${scan.frameworks.length} frameworks detected`));
18
+ // Phase 2: Generate findings
19
+ const findings = scan.findings;
20
+ if (findings.length === 0) {
21
+ console.log(chalk.yellow("\n⚠ No non-obvious findings detected.") +
22
+ chalk.dim("\n This may mean the project is small or follows standard conventions."));
23
+ }
24
+ else {
25
+ console.log(chalk.green("✓") +
26
+ ` Extracted ${findings.length} findings\n`);
27
+ // Show findings preview
28
+ for (const finding of findings) {
29
+ const icon = finding.confidence === "high"
30
+ ? chalk.green("●")
31
+ : finding.confidence === "medium"
32
+ ? chalk.yellow("●")
33
+ : chalk.dim("●");
34
+ console.log(` ${icon} ${chalk.bold(finding.category)}: ${finding.description}`);
35
+ if (finding.evidence) {
36
+ console.log(chalk.dim(` evidence: ${finding.evidence}`));
37
+ }
38
+ }
39
+ }
40
+ // Phase 3: Generate output
41
+ if (options.dryRun) {
42
+ console.log(chalk.dim("\n--dry-run: no files written."));
43
+ return;
44
+ }
45
+ console.log("");
46
+ for (const format of formats) {
47
+ switch (format) {
48
+ case "claude": {
49
+ const content = generateClaude(scan, budget);
50
+ await writeOutput(targetDir, "CLAUDE.md", content);
51
+ console.log(chalk.green("✓") + " Wrote CLAUDE.md");
52
+ break;
53
+ }
54
+ case "cursor": {
55
+ const cursorContent = generateCursor(scan, budget);
56
+ await writeOutput(targetDir, ".cursor/rules/sourcebook.mdc", cursorContent);
57
+ console.log(chalk.green("✓") + " Wrote .cursor/rules/sourcebook.mdc");
58
+ // Also write legacy .cursorrules for older Cursor versions
59
+ const legacyContent = generateCursorLegacy(scan, budget);
60
+ await writeOutput(targetDir, ".cursorrules", legacyContent);
61
+ console.log(chalk.green("✓") + " Wrote .cursorrules (legacy)");
62
+ break;
63
+ }
64
+ case "copilot": {
65
+ const copilotContent = generateCopilot(scan, budget);
66
+ await writeOutput(targetDir, ".github/copilot-instructions.md", copilotContent);
67
+ console.log(chalk.green("✓") + " Wrote .github/copilot-instructions.md");
68
+ break;
69
+ }
70
+ case "all": {
71
+ const claudeAll = generateClaude(scan, budget);
72
+ await writeOutput(targetDir, "CLAUDE.md", claudeAll);
73
+ console.log(chalk.green("✓") + " Wrote CLAUDE.md");
74
+ const cursorAll = generateCursor(scan, budget);
75
+ await writeOutput(targetDir, ".cursor/rules/sourcebook.mdc", cursorAll);
76
+ console.log(chalk.green("✓") + " Wrote .cursor/rules/sourcebook.mdc");
77
+ const legacyAll = generateCursorLegacy(scan, budget);
78
+ await writeOutput(targetDir, ".cursorrules", legacyAll);
79
+ console.log(chalk.green("✓") + " Wrote .cursorrules (legacy)");
80
+ const copilotAll = generateCopilot(scan, budget);
81
+ await writeOutput(targetDir, ".github/copilot-instructions.md", copilotAll);
82
+ console.log(chalk.green("✓") + " Wrote .github/copilot-instructions.md");
83
+ break;
84
+ }
85
+ default:
86
+ console.log(chalk.yellow(`⚠ Format "${format}" not yet supported`));
87
+ }
88
+ }
89
+ console.log(chalk.dim("\nReview the generated files and edit to add context only you know."));
90
+ console.log(chalk.dim("The best repo truths come from human + machine together.\n"));
91
+ }
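Review note on the `--format` handling above: `init` splits the option on commas and runs one `switch` case per token, with `all` repeating the bodies of the other three cases. As a hedged illustration of that parsing step only, the sketch below expands `all`, removes duplicates, and separates unrecognized names. `expandFormats` and `SUPPORTED` are hypothetical names that do not exist in the package, and the deduplication is a deliberate difference: in the published code, `claude,all` would write `CLAUDE.md` twice.

```typescript
// Hypothetical helper; not part of the published package.
const SUPPORTED = ["claude", "cursor", "copilot"] as const;
type Format = (typeof SUPPORTED)[number];

function expandFormats(raw: string): { known: Format[]; unknown: string[] } {
  const known: Format[] = [];
  const unknown: string[] = [];

  // Same split-and-trim as init(), plus dropping empty tokens such as a trailing comma.
  const tokens = raw
    .split(",")
    .map((f) => f.trim())
    .filter((f) => f.length > 0);

  for (const token of tokens) {
    // "all" stands for every supported format.
    const candidates: readonly string[] = token === "all" ? SUPPORTED : [token];
    for (const name of candidates) {
      const match = SUPPORTED.find((s) => s === name);
      if (match === undefined) {
        if (unknown.indexOf(name) === -1) unknown.push(name);
      } else if (known.indexOf(match) === -1) {
        known.push(match);
      }
    }
  }
  return { known, unknown };
}

// Usage: unknown names could be reported with the same "not yet supported" warning.
console.log(expandFormats("claude, all,json"));
```

With a list like `known`, the `switch` could loop over one entry per format instead of duplicating the write-and-log steps under `all`; that restructuring is a suggestion, not something the diff contains.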
@@ -0,0 +1,11 @@
1
+ import type { ProjectScan } from "../types.js";
2
+ /**
3
+ * Generate a CLAUDE.md file from scan results.
4
+ *
5
+ * Design principles (from research):
6
+ * 1. ONLY non-discoverable information (ETH Zurich: auto-generated obvious context hurts by 2-3%)
7
+ * 2. Context-rot-aware formatting (Chroma Research: 30%+ accuracy drop for info in the middle)
8
+ * → Critical info at BEGINNING and END of file
9
+ * 3. Karpathy's program.md pattern: constraints, gotchas, and autonomy boundaries
10
+ */
11
+ export declare function generateClaude(scan: ProjectScan, budget: number): string;
@@ -0,0 +1,191 @@
1
+ /**
2
+ * Generate a CLAUDE.md file from scan results.
3
+ *
4
+ * Design principles (from research):
5
+ * 1. ONLY non-discoverable information (ETH Zurich: auto-generated obvious context hurts by 2-3%)
6
+ * 2. Context-rot-aware formatting (Chroma Research: 30%+ accuracy drop for info in the middle)
7
+ * → Critical info at BEGINNING and END of file
8
+ * 3. Karpathy's program.md pattern: constraints, gotchas, and autonomy boundaries
9
+ */
10
+ export function generateClaude(scan, budget) {
11
+ // Separate findings by importance for context-rot-aware placement
12
+ const critical = scan.findings.filter((f) => f.confidence === "high" && isCritical(f));
13
+ const important = scan.findings.filter((f) => f.confidence === "high" && !isCritical(f));
14
+ const supplementary = scan.findings.filter((f) => f.confidence === "medium");
15
+ const sections = [];
16
+ // ============================================
17
+ // BEGINNING: Most critical info goes here
18
+ // (LLMs retain start of context best)
19
+ // ============================================
20
+ sections.push("# CLAUDE.md");
21
+ sections.push("");
22
+ sections.push("This file provides guidance to Claude Code when working with this codebase.");
23
+ sections.push("Generated by [sourcebook](https://github.com/maroondlabs/sourcebook). Review and edit — the best context comes from human + machine together.");
24
+ sections.push("");
25
+ // Commands first -- most immediately actionable
26
+ if (hasCommands(scan.commands)) {
27
+ sections.push("## Commands");
28
+ sections.push("");
29
+ if (scan.commands.dev)
30
+ sections.push(`- **Dev:** \`${scan.commands.dev}\``);
31
+ if (scan.commands.build)
32
+ sections.push(`- **Build:** \`${scan.commands.build}\``);
33
+ if (scan.commands.test)
34
+ sections.push(`- **Test:** \`${scan.commands.test}\``);
35
+ if (scan.commands.lint)
36
+ sections.push(`- **Lint:** \`${scan.commands.lint}\``);
37
+ for (const [name, cmd] of Object.entries(scan.commands)) {
38
+ if (cmd && !["dev", "build", "test", "lint", "start"].includes(name)) {
39
+ sections.push(`- **${name}:** \`${cmd}\``);
40
+ }
41
+ }
42
+ sections.push("");
43
+ }
44
+ // Critical warnings/constraints near the top (danger zone, fragile code, hidden deps)
45
+ if (critical.length > 0) {
46
+ sections.push("## Critical Constraints");
47
+ sections.push("");
48
+ for (const finding of critical) {
49
+ sections.push(`- **${finding.category}:** ${finding.description}`);
50
+ }
51
+ sections.push("");
52
+ }
53
+ // ============================================
54
+ // MIDDLE: Less critical but useful info
55
+ // (LLMs retain this worst -- keep it short)
56
+ // ============================================
57
+ // Stack (brief)
58
+ if (scan.frameworks.length > 0) {
59
+ sections.push("## Stack");
60
+ sections.push("");
61
+ sections.push(scan.frameworks.join(", "));
62
+ sections.push("");
63
+ }
64
+ // Key directories (only non-obvious ones)
65
+ if (Object.keys(scan.structure.directories).length > 0) {
66
+ const nonObvious = Object.entries(scan.structure.directories).filter(([dir]) => !["src", "public", "node_modules", "dist", "build"].includes(dir));
67
+ if (nonObvious.length > 0) {
68
+ sections.push("## Project Structure");
69
+ sections.push("");
70
+ for (const [dir, purpose] of nonObvious) {
71
+ sections.push(`- \`${dir}/\` — ${purpose}`);
72
+ }
73
+ sections.push("");
74
+ }
75
+ }
76
+ // Core modules (from PageRank)
77
+ if (scan.rankedFiles && scan.rankedFiles.length > 0) {
78
+ const top5 = scan.rankedFiles.slice(0, 5);
79
+ sections.push("## Core Modules (by structural importance)");
80
+ sections.push("");
81
+ for (const { file } of top5) {
82
+ sections.push(`- \`${file}\``);
83
+ }
84
+ sections.push("");
85
+ }
86
+ // Important findings (high confidence, non-critical)
87
+ if (important.length > 0) {
88
+ sections.push("## Conventions & Patterns");
89
+ sections.push("");
90
+ const grouped = groupByCategory(important);
91
+ for (const [category, findings] of grouped) {
92
+ if (findings.length === 1) {
93
+ sections.push(`- **${category}:** ${findings[0].description}`);
94
+ }
95
+ else {
96
+ sections.push(`- **${category}:**`);
97
+ for (const f of findings) {
98
+ sections.push(` - ${f.description}`);
99
+ }
100
+ }
101
+ }
102
+ sections.push("");
103
+ }
104
+ // Supplementary findings (medium confidence)
105
+ if (supplementary.length > 0) {
106
+ sections.push("## Additional Context");
107
+ sections.push("");
108
+ const grouped = groupByCategory(supplementary);
109
+ for (const [category, findings] of grouped) {
110
+ if (findings.length === 1) {
111
+ sections.push(`- **${category}:** ${findings[0].description}`);
112
+ }
113
+ else {
114
+ sections.push(`- **${category}:**`);
115
+ for (const f of findings) {
116
+ sections.push(` - ${f.description}`);
117
+ }
118
+ }
119
+ }
120
+ sections.push("");
121
+ }
122
+ // ============================================
123
+ // END: Important reminders go here
124
+ // (LLMs retain end of context second-best)
125
+ // ============================================
126
+ // "What to add" section -- prompts human to add non-discoverable context
127
+ sections.push("## What to Add Manually");
128
+ sections.push("");
129
+ sections.push("The most valuable context is what only you know. Add:");
130
+ sections.push("");
131
+ sections.push("- Architectural decisions and why they were made");
132
+ sections.push("- Past incidents that shaped current conventions");
133
+ sections.push("- Deprecated patterns to avoid in new code");
134
+ sections.push("- Domain-specific rules or terminology");
135
+ sections.push("- Environment setup beyond what .env.example shows");
136
+ sections.push("");
137
+ let output = sections.join("\n");
138
+ // Token budget enforcement (rough: 1 token ≈ 4 chars)
139
+ const charBudget = budget * 4;
140
+ if (output.length > charBudget) {
141
+ output = output.slice(0, charBudget);
142
+ const lastNewline = output.lastIndexOf("\n");
143
+ output =
144
+ output.slice(0, lastNewline) +
145
+ "\n\n<!-- truncated to fit token budget -->\n";
146
+ }
147
+ return output;
148
+ }
149
+ /**
150
+ * Determine if a finding is "critical" -- things that can cause real damage
151
+ * if an agent gets them wrong. These go at the TOP of the file.
152
+ */
153
+ function isCritical(finding) {
154
+ const criticalCategories = new Set([
155
+ "Hidden dependencies",
156
+ "Circular dependencies",
157
+ "Core modules",
158
+ "Fragile code",
159
+ "Git history",
160
+ "Commit conventions",
161
+ ]);
162
+ const criticalKeywords = [
163
+ "breaking",
164
+ "blast radius",
165
+ "deprecated",
166
+ "don't",
167
+ "must",
168
+ "never",
169
+ "revert",
170
+ "fragile",
171
+ "hidden",
172
+ "invisible",
173
+ "coupling",
174
+ ];
175
+ if (criticalCategories.has(finding.category))
176
+ return true;
177
+ const desc = finding.description.toLowerCase();
178
+ return criticalKeywords.some((kw) => desc.includes(kw));
179
+ }
180
+ function groupByCategory(findings) {
181
+ const grouped = new Map();
182
+ for (const finding of findings) {
183
+ const existing = grouped.get(finding.category) || [];
184
+ existing.push(finding);
185
+ grouped.set(finding.category, existing);
186
+ }
187
+ return grouped;
188
+ }
189
+ function hasCommands(commands) {
190
+ return Object.values(commands).some((v) => v !== undefined);
191
+ }
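Review note on the token budget step in `generateClaude`: the output is cut at `budget * 4` characters and then trimmed back to the last newline. Two edge cases are worth spelling out. If the kept slice contains no newline, `lastIndexOf` returns -1 and the published code calls `slice(0, -1)`, which keeps a partial line minus its last character. If `--budget` is not numeric, `parseInt` yields `NaN` and the comparison is simply false, so nothing is truncated. The sketch below is one possible way to make both cases explicit. `truncateToBudget` is a hypothetical name, and treating a non-positive budget as "no limit" is an assumption, not the package's behavior.

```typescript
// Hypothetical helper; mirrors the 4-characters-per-token heuristic used above.
const CHARS_PER_TOKEN = 4;
const TRUNCATION_MARKER = "\n\n<!-- truncated to fit token budget -->\n";

function truncateToBudget(text: string, budgetTokens: number): string {
  // Assumption: NaN, Infinity, zero, and negative budgets all mean "no limit".
  if (!Number.isFinite(budgetTokens) || budgetTokens <= 0) return text;

  const charBudget = Math.floor(budgetTokens * CHARS_PER_TOKEN);
  if (text.length <= charBudget) return text;

  const head = text.slice(0, charBudget);
  const lastNewline = head.lastIndexOf("\n");

  // With no newline in the kept slice there is no complete line to keep.
  const kept = lastNewline === -1 ? "" : head.slice(0, lastNewline);

  // As in the published code, the marker is appended after the cut,
  // so the result can be slightly longer than charBudget.
  return kept + TRUNCATION_MARKER;
}

// Usage: a budget of 2 tokens allows 8 characters, which covers only the first full line.
console.log(JSON.stringify(truncateToBudget("aaaa\nbbbb\ncccc", 2)));
```

Because the cut always removes text from the end, a tight budget drops the closing "What to Add Manually" section first, even though the comments above describe the end of the file as a high-attention position.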
@@ -0,0 +1,12 @@
1
+ import type { ProjectScan } from "../types.js";
2
+ /**
3
+ * Generate GitHub Copilot instructions from scan results.
4
+ *
5
+ * Copilot supports:
6
+ * - `.github/copilot-instructions.md`: repo-wide instructions (always loaded)
7
+ * - `.github/instructions/*.instructions.md`: path-specific instructions (applied to files matching the `applyTo` glob in their frontmatter)
8
+ *
9
+ * We generate the repo-level file. Copilot's format is plain markdown with
10
+ * natural language instructions — more conversational than Cursor's directive style.
11
+ */
12
+ export declare function generateCopilot(scan: ProjectScan, budget: number): string;
@@ -0,0 +1,119 @@
1
+ /**
2
+ * Generate GitHub Copilot instructions from scan results.
3
+ *
4
+ * Copilot supports:
5
+ * - `.github/copilot-instructions.md`: repo-wide instructions (always loaded)
6
+ * - `.github/instructions/*.instructions.md`: path-specific instructions (applied to files matching the `applyTo` glob in their frontmatter)
7
+ *
8
+ * We generate the repo-level file. Copilot's format is plain markdown with
9
+ * natural language instructions — more conversational than Cursor's directive style.
10
+ */
11
+ export function generateCopilot(scan, budget) {
12
+ const critical = scan.findings.filter((f) => f.confidence === "high" && isCritical(f));
13
+ const important = scan.findings.filter((f) => f.confidence === "high" && !isCritical(f));
14
+ const supplementary = scan.findings.filter((f) => f.confidence === "medium");
15
+ const sections = [];
16
+ sections.push("# Copilot Instructions");
17
+ sections.push("");
18
+ sections.push("These instructions were generated by [sourcebook](https://github.com/maroondlabs/sourcebook). Review and edit — the best context comes from human + machine together.");
19
+ sections.push("");
20
+ // Commands
21
+ if (hasCommands(scan.commands)) {
22
+ sections.push("## Development Commands");
23
+ sections.push("");
24
+ if (scan.commands.dev)
25
+ sections.push(`- Dev server: \`${scan.commands.dev}\``);
26
+ if (scan.commands.build)
27
+ sections.push(`- Build: \`${scan.commands.build}\``);
28
+ if (scan.commands.test)
29
+ sections.push(`- Tests: \`${scan.commands.test}\``);
30
+ if (scan.commands.lint)
31
+ sections.push(`- Lint: \`${scan.commands.lint}\``);
32
+ for (const [name, cmd] of Object.entries(scan.commands)) {
33
+ if (cmd && !["dev", "build", "test", "lint", "start"].includes(name)) {
34
+ sections.push(`- ${name}: \`${cmd}\``);
35
+ }
36
+ }
37
+ sections.push("");
38
+ }
39
+ // Critical constraints
40
+ if (critical.length > 0) {
41
+ sections.push("## Important Constraints");
42
+ sections.push("");
43
+ sections.push("Follow these rules when modifying this codebase:");
44
+ sections.push("");
45
+ for (const finding of critical) {
46
+ sections.push(`- ${finding.description}`);
47
+ }
48
+ sections.push("");
49
+ }
50
+ // Stack
51
+ if (scan.frameworks.length > 0) {
52
+ sections.push("## Technology Stack");
53
+ sections.push("");
54
+ sections.push(`This project uses: ${scan.frameworks.join(", ")}.`);
55
+ sections.push("");
56
+ }
57
+ // Core modules
58
+ if (scan.rankedFiles && scan.rankedFiles.length > 0) {
59
+ const top5 = scan.rankedFiles.slice(0, 5);
60
+ sections.push("## High-Impact Files");
61
+ sections.push("");
62
+ sections.push("These files are imported by many others. Changes here have wide blast radius:");
63
+ sections.push("");
64
+ for (const { file } of top5) {
65
+ sections.push(`- \`${file}\``);
66
+ }
67
+ sections.push("");
68
+ }
69
+ // Conventions
70
+ if (important.length > 0) {
71
+ sections.push("## Code Conventions");
72
+ sections.push("");
73
+ sections.push("This project follows these patterns:");
74
+ sections.push("");
75
+ for (const finding of important) {
76
+ sections.push(`- ${finding.description}`);
77
+ }
78
+ sections.push("");
79
+ }
80
+ // Additional context
81
+ if (supplementary.length > 0) {
82
+ sections.push("## Additional Notes");
83
+ sections.push("");
84
+ for (const finding of supplementary) {
85
+ sections.push(`- ${finding.description}`);
86
+ }
87
+ sections.push("");
88
+ }
89
+ let output = sections.join("\n");
90
+ // Token budget enforcement
91
+ const charBudget = budget * 4;
92
+ if (output.length > charBudget) {
93
+ output = output.slice(0, charBudget);
94
+ const lastNewline = output.lastIndexOf("\n");
95
+ output = output.slice(0, lastNewline) + "\n";
96
+ }
97
+ return output;
98
+ }
99
+ function isCritical(finding) {
100
+ const criticalCategories = new Set([
101
+ "Hidden dependencies",
102
+ "Circular dependencies",
103
+ "Core modules",
104
+ "Fragile code",
105
+ "Git history",
106
+ "Commit conventions",
107
+ ]);
108
+ const criticalKeywords = [
109
+ "breaking", "blast radius", "deprecated", "don't", "must",
110
+ "never", "revert", "fragile", "hidden", "invisible", "coupling",
111
+ ];
112
+ if (criticalCategories.has(finding.category))
113
+ return true;
114
+ const desc = finding.description.toLowerCase();
115
+ return criticalKeywords.some((kw) => desc.includes(kw));
116
+ }
117
+ function hasCommands(commands) {
118
+ return Object.values(commands).some((v) => v !== undefined);
119
+ }
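Review note on how both generators bucket findings: each one filters `scan.findings` three times, and `isCritical` is repeated verbatim in the Claude and Copilot hunks. The sketch below restates that logic as a single pass to make two consequences visible. Medium-confidence findings always land in the supplementary bucket, even when their category is in the critical set, and any finding whose confidence is neither `"high"` nor `"medium"` is left out of the generated file. The `Finding` shape is inferred from how the code reads it (the real type lives in `types.js`, which this diff does not show), `partitionFindings` is a hypothetical name, and `"low"` in the usage is an assumed value.

```typescript
// Shape inferred from usage in the hunks above; the real type is not shown in this diff.
interface Finding {
  category: string;
  description: string;
  confidence: string; // "high" and "medium" appear in the diff; other values are assumptions
}

// Copied from the isCritical() functions in the hunks above.
const CRITICAL_CATEGORIES = new Set<string>([
  "Hidden dependencies",
  "Circular dependencies",
  "Core modules",
  "Fragile code",
  "Git history",
  "Commit conventions",
]);
const CRITICAL_KEYWORDS = [
  "breaking", "blast radius", "deprecated", "don't", "must",
  "never", "revert", "fragile", "hidden", "invisible", "coupling",
];

function isCritical(finding: Finding): boolean {
  if (CRITICAL_CATEGORIES.has(finding.category)) return true;
  const desc = finding.description.toLowerCase();
  return CRITICAL_KEYWORDS.some((kw) => desc.includes(kw));
}

interface Buckets {
  critical: Finding[];
  important: Finding[];
  supplementary: Finding[];
  omitted: Finding[];
}

// Hypothetical single-pass equivalent of the three filter() calls.
function partitionFindings(findings: Finding[]): Buckets {
  const buckets: Buckets = { critical: [], important: [], supplementary: [], omitted: [] };
  for (const f of findings) {
    if (f.confidence === "high") {
      (isCritical(f) ? buckets.critical : buckets.important).push(f);
    } else if (f.confidence === "medium") {
      buckets.supplementary.push(f);
    } else {
      buckets.omitted.push(f);
    }
  }
  return buckets;
}

// Usage with made-up findings.
const sampleBuckets = partitionFindings([
  { category: "Import conventions", description: "Use @/ path aliases", confidence: "high" },
  { category: "Naming", description: "Never rename exported hooks", confidence: "high" },
  { category: "Git history", description: "Two reverts found", confidence: "medium" },
  { category: "Naming", description: "Mostly camelCase", confidence: "low" },
]);
console.log(sampleBuckets.critical.length, sampleBuckets.omitted.length);
```

Keyword matching is by substring, so a description containing a word such as "mustache" would also count as critical under the `must` keyword.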
@@ -0,0 +1,17 @@
1
+ import type { ProjectScan } from "../types.js";
2
+ /**
3
+ * Generate Cursor rules from scan results.
4
+ *
5
+ * Cursor deprecated `.cursorrules` in favor of modular `.cursor/rules/*.mdc` files.
6
+ * Each .mdc file has YAML frontmatter (description, globs, alwaysApply) + markdown body.
7
+ *
8
+ * We generate a single `sourcebook.mdc` with alwaysApply: true containing
9
+ * the same non-discoverable findings as the Claude generator, formatted for
10
+ * Cursor's conventions (shorter, more directive).
11
+ */
12
+ export declare function generateCursor(scan: ProjectScan, budget: number): string;
13
+ /**
14
+ * Also generate the legacy .cursorrules format for backwards compatibility.
15
+ * Same content as the .mdc but without the frontmatter.
16
+ */
17
+ export declare function generateCursorLegacy(scan: ProjectScan, budget: number): string;
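Review note on the Cursor declarations above: the doc comments describe an `.mdc` file as YAML frontmatter (`description`, `globs`, `alwaysApply`) followed by a markdown body, with the legacy `.cursorrules` file carrying the same content minus the frontmatter. The implementation is not part of this section, so the following is only a sketch of that relationship. `wrapAsMdc` and `stripFrontmatter` are hypothetical names, and leaving `globs` empty when `alwaysApply` is true is an assumption.

```typescript
// Hypothetical helpers; the package's cursor generator is not shown in this section.
function wrapAsMdc(body: string, description: string): string {
  const frontmatter = [
    "---",
    // JSON.stringify yields a double-quoted string, which YAML also accepts.
    `description: ${JSON.stringify(description)}`,
    "globs:",
    "alwaysApply: true",
    "---",
    "",
  ].join("\n");
  return `${frontmatter}\n${body}`;
}

// Recover the legacy-style body by removing a leading frontmatter block, if any.
function stripFrontmatter(mdc: string): string {
  if (!mdc.startsWith("---\n")) return mdc;
  const close = mdc.indexOf("\n---\n", 3);
  if (close === -1) return mdc; // unterminated frontmatter: leave the text untouched
  // Skip the closing delimiter (5 characters) and one blank separator line.
  return mdc.slice(close + 5).replace(/^\n/, "");
}

// Usage with a made-up rules body.
const rulesBody = "# Project rules\n\n- Prefer named exports\n";
const mdcFile = wrapAsMdc(rulesBody, "Conventions extracted by sourcebook");
console.log(mdcFile);
console.log(stripFrontmatter(mdcFile) === rulesBody);
```

Generating the body once and deriving both files from it would keep the two outputs in step; whether the package does this or builds each string separately cannot be determined from the declarations alone.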