code-context-extractor 0.1.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2024
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,227 @@
1
+ # CodeContextExtractor
2
+
3
+ CodeContextExtractor is a local-only CLI that captures a deterministic snapshot of a codebase and writes it to a single text or Markdown file. It exists for one purpose: safely sharing repository context with an AI assistant or teammate without uploading the repo itself or exposing sensitive files.
4
+
5
+ ## Why it exists
6
+ - Developers need a clean, reliable way to send project context to AI tools
7
+ - Repositories contain secrets, binaries, and noise that should be skipped by default
8
+
9
+ ## Who it is for
10
+ - Engineers who want fast, offline context exports
11
+ - Teams who need predictable, reproducible outputs
12
+ - Anyone who shares code with AI systems and wants safe defaults
13
+
14
+ ## Quick start
15
+ ```bash
16
+ npm install
17
+ npm run build
18
+ npx --no-install code-context-extractor extract . --verbose
19
+ ```
20
+
21
+ The output is written to `.code-context/` with a timestamped filename like:
22
+ ```
23
+ .code-context/<root>_context_YYYY-MM-DD_HHMMSS.txt
24
+ ```
25
+
26
+ ## Important: .gitignore your outputs
27
+ > **Warning ⚠️**
28
+ > CodeContextExtractor writes files into a `.code-context/` folder inside your project root. If that folder is not ignored, you may accidentally commit and push your context exports.
29
+
30
+ Add this line to your project's `.gitignore`:
31
+ ```
32
+ .code-context/
33
+ ```
34
+
35
+ ## npm and npx (for beginners)
36
+ - `npm` is the package manager that installs tools and libraries.
37
+ - `npx` runs a package's command-line tool. If the package is not installed locally, `npx` tries to download it from the npm registry.
38
+ - `--no-install` tells `npx` to only use what is already installed and skip downloads.
39
+
40
+ In this repo, `code-context-extractor` is not published to npm yet, so `npx` needs `--no-install` or a direct `node dist/cli.js` call.
41
+
42
+ ## User journey (example)
43
+ You want to generate a context file for a project at `C:\projects\MyNewProject`.
44
+
45
+ ### Step 1: Install CodeContextExtractor (one time)
46
+ ```bash
47
+ git clone https://github.com/Rwstrobe/CodeContextExtractor.git
48
+ cd CodeContextExtractor
49
+ npm install
50
+ npm run build
51
+ ```
52
+
53
+ ### Step 2: Go to your project folder
54
+ ```bash
55
+ cd C:\projects\MyNewProject
56
+ ```
57
+
58
+ ### Step 3: Generate the context file
59
+ ```bash
60
+ npx --no-install code-context-extractor extract . --verbose
61
+ ```
62
+
63
+ The output will be saved to:
64
+ ```
65
+ .\.code-context\MyNewProject_context_YYYY-MM-DD_HHMMSS.txt
66
+ ```
67
+
68
+ ### macOS/Linux variant
69
+ You can follow the same steps with a Unix-style path:
70
+ ```bash
71
+ cd ~/projects/MyNewProject
72
+ npx --no-install code-context-extractor extract . --verbose
73
+ ```
74
+ The output will be saved to:
75
+ ```
76
+ ./.code-context/MyNewProject_context_YYYY-MM-DD_HHMMSS.txt
77
+ ```
78
+
79
+ ## Install from GitHub (beginner-friendly)
80
+ 1. Clone the repository:
81
+ ```bash
82
+ git clone https://github.com/Rwstrobe/CodeContextExtractor.git
83
+ ```
84
+ 2. Enter the project folder:
85
+ ```bash
86
+ cd CodeContextExtractor
87
+ ```
88
+ 3. Install dependencies:
89
+ ```bash
90
+ npm install
91
+ ```
92
+ 4. Build the CLI:
93
+ ```bash
94
+ npm run build
95
+ ```
96
+ 5. Run the extractor:
97
+ ```bash
98
+ npx --no-install code-context-extractor extract . --verbose
99
+ ```
100
+
101
+ If you prefer, you can run the built CLI directly:
102
+ ```bash
103
+ node dist/cli.js extract . --verbose
104
+ ```
105
+
106
+ ## Features
107
+ - Local-only operation (no network calls, telemetry, or analytics)
108
+ - Cross-platform: Windows, macOS, Linux
109
+ - Safe defaults that exclude common sensitive files
110
+ - Configurable include/exclude globs and max file size
111
+ - Deterministic ordering and normalized paths/line endings
112
+ - Streaming output (does not load large files into memory)
113
+ - Conservative redaction enabled by default
114
+ - Optional Markdown output
115
+ - Automatic output folder `.code-context/`
116
+
117
+ ## Commands
118
+ ### Extract
119
+ ```bash
120
+ npx --no-install code-context-extractor extract [path]
121
+ ```
122
+ Example:
123
+ ```bash
124
+ npx --no-install code-context-extractor extract ./project --format md --depth 3 --verbose
125
+ ```
126
+ If you installed the CLI globally (for example via `npm link` or a future npm publish), you can run:
127
+ ```bash
128
+ code-context-extractor extract [path]
129
+ ```
130
+
131
+ ## Options
132
+ ```bash
133
+ --out <file> Output file (default: .code-context/<root>_context_YYYY-MM-DD_HHMMSS.txt, or .md with --format md)
134
+ --format <text|md> Output format (default: text)
135
+ --depth <n> Tree depth (default: 4)
136
+ --include <glob> Include glob (repeatable)
137
+ --exclude <glob> Exclude glob (repeatable)
138
+ --max-bytes <n> Max file size in bytes (default: 500000)
139
+ --redact Enable redaction (default: true)
140
+ --no-redact Disable redaction
141
+ --respect-gitignore Honor .gitignore (default: true)
142
+ --no-gitignore Ignore .gitignore rules
143
+ --config <file> Optional JSON config file
144
+ --verbose Verbose logging
145
+ ```
146
+
147
+ ## Output format
148
+ Each output file contains:
149
+ - LLM context header (how to interpret the file)
150
+ - Metadata header (root path, timestamp, tool version, command, config summary)
151
+ - Folder tree (depth-limited)
152
+ - Included files summary (count and total size)
153
+ - Skipped files with reasons (excluded, too large, binary, unreadable)
154
+ - For each included file: path, size, last modified, and contents in fenced blocks
155
+
156
+ ## Redaction
157
+ Redaction is enabled by default to reduce accidental secret leakage. It targets:
158
+ - Common key/value patterns (tokens, passwords, API keys)
159
+ - Private key blocks
160
+ - Common credential formats (JWTs, GitHub tokens, AWS access key IDs, Slack tokens, Stripe secret keys)
161
+
162
+ Disable redaction only when you are sure the output is safe:
163
+ ```bash
164
+ code-context-extractor extract . --no-redact
165
+ ```
166
+
167
+ ## Default excludes
168
+ The following are excluded by default:
169
+ ```
170
+ node_modules
171
+ dist
172
+ build
173
+ out
174
+ .code-context
175
+ .next
176
+ .turbo
177
+ .git
178
+ .idea
179
+ .vscode
180
+ coverage
181
+ *.lock
182
+ *.log
183
+ *.pem
184
+ *.key
185
+ *.p12
186
+ *.env
187
+ .env.*
188
+ secrets.*
189
+ id_rsa
190
+ id_ed25519
191
+ ```
192
+
193
+ ## Configuration file
194
+ Use an optional JSON file (example: `example-config.json`) to set defaults.
195
+ ```json
196
+ {
197
+ "outFile": ".code-context/project_context_2026-01-19_145227.txt",
198
+ "format": "text",
199
+ "depth": 4,
200
+ "include": [],
201
+ "exclude": [],
202
+ "maxBytes": 500000,
203
+ "redact": true,
204
+ "respectGitignore": true
205
+ }
206
+ ```
207
+
208
+ Use it like this:
209
+ ```bash
210
+ code-context-extractor extract . --config example-config.json
211
+ ```
212
+
213
+ ## Safety posture
214
+ - No network access or external services
215
+ - Does not execute or evaluate code from your repository
216
+ - Skips common sensitive files by default
217
+ - Redacts likely secrets unless `--no-redact` is specified
218
+
219
+ ## Development
220
+ ```bash
221
+ npm run build
222
+ npm test
223
+ ```
224
+
225
+ ## License
226
+ MIT
227
+
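The timestamped default filename described in the README can be sketched as a small pure function. This is a minimal illustration, assuming local time and zero-padded fields as in `buildAutoOutFile` in `dist/config.js`; the name `defaultOutFile` and the injected `now` parameter are hypothetical and exist only to make the result reproducible.

```javascript
// Hypothetical helper: builds ".code-context/<root>_context_YYYY-MM-DD_HHMMSS.<ext>".
// `now` is injected so the result is reproducible; the CLI itself uses the current time.
function defaultOutFile(rootName, format, now) {
  const pad = (n) => String(n).padStart(2, '0');
  const date = [now.getFullYear(), pad(now.getMonth() + 1), pad(now.getDate())].join('-');
  const time = [now.getHours(), now.getMinutes(), now.getSeconds()].map(pad).join('');
  // Anything other than 'md' falls back to the plain-text extension.
  const ext = format === 'md' ? 'md' : 'txt';
  return `.code-context/${rootName}_context_${date}_${time}.${ext}`;
}

// Local-time constructor: 19 January 2026, 14:52:27.
console.log(defaultOutFile('project', 'text', new Date(2026, 0, 19, 14, 52, 27)));
```

The sketch always joins with forward slashes, whereas the package uses `path.join`, so the separator on Windows would differ.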
package/dist/cli.js ADDED
@@ -0,0 +1,109 @@
1
+ #!/usr/bin/env node
2
+ "use strict";
3
+ var __importDefault = (this && this.__importDefault) || function (mod) {
4
+ return (mod && mod.__esModule) ? mod : { "default": mod };
5
+ };
6
+ Object.defineProperty(exports, "__esModule", { value: true });
7
+ const commander_1 = require("commander");
8
+ const fs_1 = require("fs");
9
+ const path_1 = __importDefault(require("path"));
10
+ const config_1 = require("./config");
11
+ const normalizePath_1 = require("./utils/normalizePath");
12
+ const scanner_1 = require("./scanner");
13
+ const text_1 = require("./exporters/text");
14
+ const markdown_1 = require("./exporters/markdown");
15
+ async function getVersion() {
16
+ try {
17
+ const packagePath = path_1.default.resolve(__dirname, '..', 'package.json');
18
+ const raw = await fs_1.promises.readFile(packagePath, 'utf8');
19
+ return JSON.parse(raw).version ?? '0.0.0';
20
+ }
21
+ catch {
22
+ return '0.0.0';
23
+ }
24
+ }
25
+ async function main() {
26
+ const program = new commander_1.Command();
27
+ program
28
+ .name('code-context-extractor')
29
+ .description('Create a deterministic context file from a local codebase (folder tree + file contents).');
30
+ const version = await getVersion();
31
+ program.version(version);
32
+ program.addHelpText('after', `
33
+ Examples:
34
+ code-context-extractor extract . --verbose
35
+ code-context-extractor extract . --format md --depth 3
36
+ code-context-extractor extract C:\\projects\\MyNewProject --max-bytes 200000
37
+ `);
38
+ program
39
+ .command('extract')
40
+ .argument('[path]', 'Root path', '.')
41
+ .description('Generate a single context file from the target folder.')
42
+ .option('--out <file>', 'Output file path (defaults to .code-context/<root>_context_...)')
43
+ .option('--format <text|md>', 'Output format', 'text')
44
+ .option('--depth <n>', 'Folder tree depth', (value) => parseInt(value, 10), 4)
45
+ .option('--include <glob>', 'Include glob', collect, [])
46
+ .option('--exclude <glob>', 'Exclude glob', collect, [])
47
+ .option('--max-bytes <n>', 'Max file size in bytes', (value) => parseInt(value, 10), 500000)
48
+ .option('--redact', 'Enable redaction', true)
49
+ .option('--no-redact', 'Disable redaction')
50
+ .option('--respect-gitignore', 'Respect .gitignore', true)
51
+ .option('--no-gitignore', 'Ignore .gitignore')
52
+ .option('--config <file>', 'Optional JSON config file')
53
+ .option('--verbose', 'Verbose logging', false)
54
+ .action(async (rootArg, options) => {
55
+ const rootPath = path_1.default.resolve(process.cwd(), rootArg);
56
+ const config = await (0, config_1.resolveConfig)(rootPath, {
57
+ outFile: options.out,
58
+ format: options.format,
59
+ depth: options.depth,
60
+ include: options.include,
61
+ exclude: options.exclude,
62
+ maxBytes: options.maxBytes,
63
+ redact: options.redact,
64
+ respectGitignore: options.gitignore ?? options.respectGitignore,
65
+ verbose: options.verbose,
66
+ configPath: options.config
67
+ }, process.cwd());
68
+ const outputPath = path_1.default.resolve(process.cwd(), config.outFile);
69
+ const relativeOut = (0, normalizePath_1.normalizePath)(path_1.default.relative(rootPath, outputPath));
70
+ const shouldExcludeOut = relativeOut !== '' && !relativeOut.startsWith('..') && !path_1.default.isAbsolute(relativeOut);
71
+ const scanExcludes = shouldExcludeOut
72
+ ? [...config.exclude, relativeOut]
73
+ : config.exclude;
74
+ const result = await (0, scanner_1.scan)(rootPath, {
75
+ include: config.include,
76
+ exclude: scanExcludes,
77
+ maxBytes: config.maxBytes,
78
+ respectGitignore: config.respectGitignore
79
+ });
80
+ await fs_1.promises.mkdir(path_1.default.dirname(outputPath), { recursive: true });
81
+ const output = (0, fs_1.createWriteStream)(outputPath, { encoding: 'utf8' });
82
+ const metadata = {
83
+ version,
84
+ command: process.argv.slice(2).join(' ')
85
+ };
86
+ if (config.format === 'md') {
87
+ await (0, markdown_1.exportMarkdown)(result, config, output, metadata);
88
+ }
89
+ else {
90
+ await (0, text_1.exportText)(result, config, output, metadata);
91
+ }
92
+ await new Promise((resolve, reject) => {
93
+ output.on('finish', resolve);
94
+ output.on('error', reject);
95
+ output.end();
96
+ });
97
+ if (config.verbose) {
98
+ console.log(`Wrote ${result.files.length} files to ${outputPath}`);
99
+ }
100
+ });
101
+ await program.parseAsync(process.argv);
102
+ }
103
+ function collect(value, previous) {
104
+ return previous.concat([value]);
105
+ }
106
+ main().catch((error) => {
107
+ console.error(error);
108
+ process.exit(1);
109
+ });
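The `extract` action above adds the output file to the scan excludes only when that file lives inside the scanned root. The check can be isolated as follows. This is a sketch under stated assumptions: `outputExcludeGlob` and its `pathApi` parameter are hypothetical names, introduced so the same logic can be exercised with POSIX and Windows path rules on any machine.

```javascript
const path = require('node:path');

// Hypothetical helper mirroring the relativeOut / shouldExcludeOut logic in the action.
// Returns the root-relative path to exclude, or null when the output is outside the root.
function outputExcludeGlob(rootPath, outputPath, pathApi = path) {
  // Normalize separators the same way normalizePath does.
  const relative = pathApi.relative(rootPath, outputPath).replace(/\\/g, '/');
  const insideRoot =
    relative !== '' && !relative.startsWith('..') && !pathApi.isAbsolute(relative);
  return insideRoot ? relative : null;
}

console.log(outputExcludeGlob('/repo', '/repo/.code-context/a.txt', path.posix));
console.log(outputExcludeGlob('/repo', '/tmp/a.txt', path.posix));
```

The `isAbsolute` test matters on Windows, where a target on a different drive comes back from `path.relative` as an absolute path rather than one starting with `..`.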
package/dist/config.js ADDED
@@ -0,0 +1,97 @@
1
+ "use strict";
2
+ var __importDefault = (this && this.__importDefault) || function (mod) {
3
+ return (mod && mod.__esModule) ? mod : { "default": mod };
4
+ };
5
+ Object.defineProperty(exports, "__esModule", { value: true });
6
+ exports.DEFAULT_CONFIG = exports.DEFAULT_EXCLUDES = void 0;
7
+ exports.loadConfigFile = loadConfigFile;
8
+ exports.resolveConfig = resolveConfig;
9
+ const fs_1 = require("fs");
10
+ const path_1 = __importDefault(require("path"));
11
+ exports.DEFAULT_EXCLUDES = [
12
+ 'node_modules/**',
13
+ 'dist/**',
14
+ 'build/**',
15
+ 'out/**',
16
+ '.code-context/**',
17
+ '.next/**',
18
+ '.turbo/**',
19
+ '.git/**',
20
+ '.idea/**',
21
+ '.vscode/**',
22
+ 'coverage/**',
23
+ '*.lock',
24
+ '*.log',
25
+ '*.pem',
26
+ '*.key',
27
+ '*.p12',
28
+ '*.env',
29
+ '.env.*',
30
+ 'secrets.*',
31
+ 'id_rsa',
32
+ 'id_ed25519'
33
+ ];
34
+ exports.DEFAULT_CONFIG = {
35
+ format: 'text',
36
+ depth: 4,
37
+ include: [],
38
+ exclude: [],
39
+ maxBytes: 500000,
40
+ redact: true,
41
+ respectGitignore: true
42
+ };
43
+ async function loadConfigFile(configPath, cwd) {
44
+ if (!configPath) {
45
+ return {};
46
+ }
47
+ const resolved = path_1.default.resolve(cwd, configPath);
48
+ const contents = await fs_1.promises.readFile(resolved, 'utf8');
49
+ return JSON.parse(contents);
50
+ }
51
+ function pickArray(primary, fallback) {
52
+ if (primary && primary.length > 0) {
53
+ return primary;
54
+ }
55
+ if (fallback && fallback.length > 0) {
56
+ return fallback;
57
+ }
58
+ return [];
59
+ }
60
+ async function resolveConfig(rootPath, cliOptions, cwd) {
61
+ const fileConfig = await loadConfigFile(cliOptions.configPath, cwd);
62
+ const format = normalizeFormat(cliOptions.format ?? fileConfig.format ?? exports.DEFAULT_CONFIG.format);
63
+ const explicitOutFile = cliOptions.outFile ?? fileConfig.outFile;
64
+ const outFile = explicitOutFile ?? buildAutoOutFile(rootPath, format);
65
+ return {
66
+ rootPath,
67
+ outFile,
68
+ format,
69
+ depth: cliOptions.depth ?? fileConfig.depth ?? exports.DEFAULT_CONFIG.depth,
70
+ include: pickArray(cliOptions.include, fileConfig.include),
71
+ exclude: pickArray(cliOptions.exclude, fileConfig.exclude),
72
+ maxBytes: cliOptions.maxBytes ?? fileConfig.maxBytes ?? exports.DEFAULT_CONFIG.maxBytes,
73
+ redact: cliOptions.redact ?? fileConfig.redact ?? exports.DEFAULT_CONFIG.redact,
74
+ respectGitignore: cliOptions.respectGitignore ??
75
+ fileConfig.respectGitignore ??
76
+ exports.DEFAULT_CONFIG.respectGitignore,
77
+ verbose: cliOptions.verbose ?? false
78
+ };
79
+ }
80
+ function buildAutoOutFile(rootPath, format) {
81
+ const rootName = path_1.default.basename(rootPath) || 'project';
82
+ const ext = format === 'md' ? 'md' : 'txt';
83
+ const now = new Date();
84
+ const stamp = [
85
+ now.getFullYear(),
86
+ pad(now.getMonth() + 1),
87
+ pad(now.getDate())
88
+ ].join('-');
89
+ const time = [pad(now.getHours()), pad(now.getMinutes()), pad(now.getSeconds())].join('');
90
+ return path_1.default.join('.code-context', `${rootName}_context_${stamp}_${time}.${ext}`);
91
+ }
92
+ function normalizeFormat(value) {
93
+ return value === 'md' ? 'md' : 'text';
94
+ }
95
+ function pad(value) {
96
+ return value.toString().padStart(2, '0');
97
+ }
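`resolveConfig` layers three sources with the nullish coalescing operator: CLI value, then config file value, then built-in default. The following sketch shows that precedence on a reduced set of fields. `DEFAULTS` and `resolve` are hypothetical names for illustration; only the field names are taken from `DEFAULT_CONFIG`.

```javascript
// Hypothetical defaults for this sketch; field names follow DEFAULT_CONFIG.
const DEFAULTS = { depth: 4, maxBytes: 500000, redact: true };

// `??` falls through only on null or undefined, so an explicit `false` or `0` still wins.
function resolve(cli, file) {
  return {
    depth: cli.depth ?? file.depth ?? DEFAULTS.depth,
    maxBytes: cli.maxBytes ?? file.maxBytes ?? DEFAULTS.maxBytes,
    redact: cli.redact ?? file.redact ?? DEFAULTS.redact
  };
}

// No CLI values: the file fills in what it defines, defaults cover the rest.
const fromFile = resolve({}, { depth: 2, redact: false });
// CLI values present: they override the file.
const fromCli = resolve({ depth: 6, redact: true }, { depth: 2, redact: false });
console.log(fromFile, fromCli);
```

Because `??` treats any defined value as final, a caller that always supplies a value never lets the file value through. Several options in `dist/cli.js` are declared with Commander default values, so it may be worth checking whether those defaults arrive as defined CLI values and shadow the config file.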
@@ -0,0 +1,68 @@
1
+ "use strict";
2
+ var __importDefault = (this && this.__importDefault) || function (mod) {
3
+ return (mod && mod.__esModule) ? mod : { "default": mod };
4
+ };
5
+ Object.defineProperty(exports, "__esModule", { value: true });
6
+ exports.exportMarkdown = exportMarkdown;
7
+ const path_1 = __importDefault(require("path"));
8
+ const tree_1 = require("./tree");
9
+ const writer_1 = require("./writer");
10
+ function formatBytes(size) {
11
+ return `${size} bytes`;
12
+ }
13
+ function formatDate(ms) {
14
+ return new Date(ms).toISOString();
15
+ }
16
+ function languageFromPath(filePath) {
17
+ const ext = path_1.default.extname(filePath).replace('.', '');
18
+ return ext || '';
19
+ }
20
+ async function writeFileSection(stream, file, redact) {
21
+ const language = languageFromPath(file.relativePath);
22
+ await (0, writer_1.writeText)(stream, `### ${file.relativePath}\n\n`);
23
+ await (0, writer_1.writeText)(stream, `- Size: ${formatBytes(file.size)}\n`);
24
+ await (0, writer_1.writeText)(stream, `- Modified: ${formatDate(file.mtimeMs)}\n\n`);
25
+ await (0, writer_1.writeText)(stream, `\`\`\`${language}\n`);
26
+ await (0, writer_1.streamFileContents)(stream, file.absolutePath, redact);
27
+ await (0, writer_1.writeText)(stream, '\n```\n\n');
28
+ }
29
+ async function exportMarkdown(result, config, output, metadata) {
30
+ await (0, writer_1.writeText)(output, '# LLM Context\n\n');
31
+ await (0, writer_1.writeText)(output, 'This file contains a deterministic snapshot of the repository: a folder tree and selected file contents.\n');
32
+ await (0, writer_1.writeText)(output, 'Use it for analysis, debugging, and planning changes.\n');
33
+ await (0, writer_1.writeText)(output, 'Treat redacted values as unavailable.\n\n');
34
+ const treeLines = (0, tree_1.buildTreeLines)(result.files.map((file) => file.relativePath), config.depth);
35
+ const totalBytes = result.files.reduce((sum, file) => sum + file.size, 0);
36
+ await (0, writer_1.writeText)(output, '# Code Context\n\n');
37
+ await (0, writer_1.writeText)(output, '## Metadata\n');
38
+ await (0, writer_1.writeText)(output, `- Root: ${result.rootPath}\n`);
39
+ await (0, writer_1.writeText)(output, `- Timestamp: ${new Date().toISOString()}\n`);
40
+ await (0, writer_1.writeText)(output, `- Version: ${metadata.version}\n`);
41
+ await (0, writer_1.writeText)(output, `- Command: ${metadata.command}\n`);
42
+ await (0, writer_1.writeText)(output, `- Config: format=${config.format} depth=${config.depth} maxBytes=${config.maxBytes} redact=${config.redact} respectGitignore=${config.respectGitignore}\n`);
43
+ await (0, writer_1.writeText)(output, `- Includes: ${config.include.length > 0 ? config.include.join(', ') : '(all)'}\n`);
44
+ await (0, writer_1.writeText)(output, `- Excludes: ${config.exclude.length > 0 ? config.exclude.join(', ') : '(defaults only)'}\n\n`);
45
+ await (0, writer_1.writeText)(output, '## Folder Tree\n\n');
46
+ await (0, writer_1.writeText)(output, '```\n');
47
+ for (const line of treeLines) {
48
+ await (0, writer_1.writeText)(output, `${line}\n`);
49
+ }
50
+ await (0, writer_1.writeText)(output, '```\n\n');
51
+ await (0, writer_1.writeText)(output, '## Included Files Summary\n');
52
+ await (0, writer_1.writeText)(output, `- Count: ${result.files.length}\n`);
53
+ await (0, writer_1.writeText)(output, `- Total Size: ${formatBytes(totalBytes)}\n\n`);
54
+ await (0, writer_1.writeText)(output, '## Skipped Files\n');
55
+ if (result.skipped.length === 0) {
56
+ await (0, writer_1.writeText)(output, '- (none)\n\n');
57
+ }
58
+ else {
59
+ for (const skipped of result.skipped) {
60
+ await (0, writer_1.writeText)(output, `- ${skipped.relativePath} (${skipped.reason})\n`);
61
+ }
62
+ await (0, writer_1.writeText)(output, '\n');
63
+ }
64
+ await (0, writer_1.writeText)(output, '## Files\n\n');
65
+ for (const file of result.files) {
66
+ await writeFileSection(output, file, config.redact);
67
+ }
68
+ }
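The Markdown exporter labels each fenced block with the file's raw extension. A short sketch of that rule follows, using the hypothetical name `fenceLanguage` in place of the module-private `languageFromPath`; the sample file names are made up.

```javascript
const path = require('node:path');

// Same rule as languageFromPath: the extension without its leading dot, or '' if there is none.
function fenceLanguage(filePath) {
  return path.extname(filePath).replace('.', '');
}

for (const name of ['src/cli.ts', 'README.md', 'Dockerfile', '.gitignore']) {
  console.log(JSON.stringify(name), '->', JSON.stringify(fenceLanguage(name)));
}
```

Files without an extension, and dotfiles such as `.gitignore`, get an unlabeled fence, since `path.extname` returns an empty string for both.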
@@ -0,0 +1,56 @@
1
+ "use strict";
2
+ Object.defineProperty(exports, "__esModule", { value: true });
3
+ exports.exportText = exportText;
4
+ const tree_1 = require("./tree");
5
+ const writer_1 = require("./writer");
6
+ function formatBytes(size) {
7
+ return `${size} bytes`;
8
+ }
9
+ function formatDate(ms) {
10
+ return new Date(ms).toISOString();
11
+ }
12
+ async function writeFileSection(stream, file, redact) {
13
+ await (0, writer_1.writeText)(stream, `---\n`);
14
+ await (0, writer_1.writeText)(stream, `File: ${file.relativePath}\n`);
15
+ await (0, writer_1.writeText)(stream, `Size: ${formatBytes(file.size)}\n`);
16
+ await (0, writer_1.writeText)(stream, `Modified: ${formatDate(file.mtimeMs)}\n`);
17
+ await (0, writer_1.writeText)(stream, '```\n');
18
+ await (0, writer_1.streamFileContents)(stream, file.absolutePath, redact);
19
+ await (0, writer_1.writeText)(stream, '\n```\n');
20
+ }
21
+ async function exportText(result, config, output, metadata) {
22
+ await (0, writer_1.writeText)(output, 'LLM Context\n');
23
+ await (0, writer_1.writeText)(output, 'This file contains a deterministic snapshot of the repository: a folder tree and selected file contents.\n');
24
+ await (0, writer_1.writeText)(output, 'Use it for analysis, debugging, and planning changes.\n');
25
+ await (0, writer_1.writeText)(output, 'Treat redacted values as unavailable.\n\n');
26
+ const treeLines = (0, tree_1.buildTreeLines)(result.files.map((file) => file.relativePath), config.depth);
27
+ const totalBytes = result.files.reduce((sum, file) => sum + file.size, 0);
28
+ await (0, writer_1.writeText)(output, 'Code Context\n');
29
+ await (0, writer_1.writeText)(output, `Root: ${result.rootPath}\n`);
30
+ await (0, writer_1.writeText)(output, `Timestamp: ${new Date().toISOString()}\n`);
31
+ await (0, writer_1.writeText)(output, `Version: ${metadata.version}\n`);
32
+ await (0, writer_1.writeText)(output, `Command: ${metadata.command}\n`);
33
+ await (0, writer_1.writeText)(output, `Config: format=${config.format} depth=${config.depth} maxBytes=${config.maxBytes} redact=${config.redact} respectGitignore=${config.respectGitignore}\n`);
34
+ await (0, writer_1.writeText)(output, `Includes: ${config.include.length > 0 ? config.include.join(', ') : '(all)'}\n`);
35
+ await (0, writer_1.writeText)(output, `Excludes: ${config.exclude.length > 0 ? config.exclude.join(', ') : '(defaults only)'}\n`);
36
+ await (0, writer_1.writeText)(output, '\nFolder Tree\n');
37
+ for (const line of treeLines) {
38
+ await (0, writer_1.writeText)(output, `${line}\n`);
39
+ }
40
+ await (0, writer_1.writeText)(output, '\nIncluded Files Summary\n');
41
+ await (0, writer_1.writeText)(output, `Count: ${result.files.length}\n`);
42
+ await (0, writer_1.writeText)(output, `Total Size: ${formatBytes(totalBytes)}\n`);
43
+ await (0, writer_1.writeText)(output, '\nSkipped Files\n');
44
+ if (result.skipped.length === 0) {
45
+ await (0, writer_1.writeText)(output, '(none)\n');
46
+ }
47
+ else {
48
+ for (const skipped of result.skipped) {
49
+ await (0, writer_1.writeText)(output, `${skipped.relativePath} - ${skipped.reason}\n`);
50
+ }
51
+ }
52
+ await (0, writer_1.writeText)(output, '\nFiles\n');
53
+ for (const file of result.files) {
54
+ await writeFileSection(output, file, config.redact);
55
+ }
56
+ }
@@ -0,0 +1,39 @@
1
+ "use strict";
2
+ Object.defineProperty(exports, "__esModule", { value: true });
3
+ exports.buildTreeLines = buildTreeLines;
4
+ const sort_1 = require("../utils/sort");
5
+ function buildTreeLines(paths, depth) {
6
+ const root = { name: '.', isFile: false, children: new Map() };
7
+ for (const filePath of paths) {
8
+ const parts = filePath.split('/').filter(Boolean);
9
+ let node = root;
10
+ parts.forEach((part, index) => {
11
+ const isFile = index === parts.length - 1;
12
+ if (!node.children.has(part)) {
13
+ node.children.set(part, {
14
+ name: part,
15
+ isFile,
16
+ children: new Map()
17
+ });
18
+ }
19
+ node = node.children.get(part);
20
+ });
21
+ }
22
+ const lines = [];
23
+ const maxDepth = depth;
24
+ function render(current, prefix, remainingDepth) {
25
+ if (remainingDepth <= 0) {
26
+ return;
27
+ }
28
+ const entries = Array.from(current.children.values()).sort((a, b) => (0, sort_1.compareEntries)({ name: a.name, isFile: a.isFile }, { name: b.name, isFile: b.isFile }));
29
+ for (const entry of entries) {
30
+ const suffix = entry.isFile ? '' : '/';
31
+ lines.push(`${prefix}${entry.name}${suffix}`);
32
+ if (!entry.isFile) {
33
+ render(entry, `${prefix} `, remainingDepth - 1);
34
+ }
35
+ }
36
+ }
37
+ render(root, '', maxDepth);
38
+ return lines;
39
+ }
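To see what the depth limit does to the rendered tree, here is a compact re-implementation of the same idea: build a nested map from slash-separated paths, sort directories before files, and stop descending once the depth budget is used up. `renderTree` is a hypothetical name, and the two-space indent is an assumption, since the exact indent width is not clear from the diff.

```javascript
// Simplified re-implementation for illustration; not the package's exported API.
function renderTree(paths, depth) {
  const root = new Map();
  for (const p of paths) {
    let node = root;
    const parts = p.split('/').filter(Boolean);
    parts.forEach((part, i) => {
      if (!node.has(part)) {
        // The last segment of a path is the file; everything before it is a directory.
        node.set(part, { isFile: i === parts.length - 1, children: new Map() });
      }
      node = node.get(part).children;
    });
  }
  const lines = [];
  const walk = (children, prefix, remaining) => {
    if (remaining <= 0) return;
    // Directories first, then files; names compared with localeCompare.
    const entries = [...children.entries()].sort(([aName, a], [bName, b]) =>
      a.isFile !== b.isFile ? (a.isFile ? 1 : -1) : aName.localeCompare(bName));
    for (const [name, entry] of entries) {
      lines.push(`${prefix}${name}${entry.isFile ? '' : '/'}`);
      if (!entry.isFile) walk(entry.children, `${prefix}  `, remaining - 1);
    }
  };
  walk(root, '', depth);
  return lines;
}

console.log(renderTree(['src/cli.js', 'src/utils/sort.js', 'README.md'], 2).join('\n'));
```

With a depth of 2 the `src/utils/` directory is listed but its contents are not, which matches the early return on `remainingDepth <= 0`.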
@@ -0,0 +1,54 @@
1
+ "use strict";
2
+ Object.defineProperty(exports, "__esModule", { value: true });
3
+ exports.writeText = writeText;
4
+ exports.streamFileContents = streamFileContents;
5
+ const events_1 = require("events");
6
+ const fs_1 = require("fs");
7
+ const stream_1 = require("stream");
8
+ const redaction_1 = require("../redaction");
9
+ class NormalizeTransform extends stream_1.Transform {
10
+ constructor(redact) {
11
+ super();
12
+ this.carry = '';
13
+ this.carrySize = 200;
14
+ this.redact = redact;
15
+ }
16
+ _transform(chunk, _encoding, callback) {
17
+ let text = this.carry + chunk.toString('utf8');
18
+ text = text.replace(/\r\n/g, '\n').replace(/\r/g, '\n');
19
+ if (this.redact) {
20
+ text = (0, redaction_1.redactText)(text);
21
+ }
22
+ const emitLength = Math.max(0, text.length - this.carrySize);
23
+ this.carry = text.slice(emitLength);
24
+ if (emitLength > 0) {
25
+ this.push(text.slice(0, emitLength));
26
+ }
27
+ callback();
28
+ }
29
+ _flush(callback) {
30
+ let text = this.carry;
31
+ if (this.redact) {
32
+ text = (0, redaction_1.redactText)(text);
33
+ }
34
+ if (text.length > 0) {
35
+ this.push(text);
36
+ }
37
+ callback();
38
+ }
39
+ }
40
+ async function writeText(stream, text) {
41
+ if (!stream.write(text)) {
42
+ await (0, events_1.once)(stream, 'drain');
43
+ }
44
+ }
45
+ async function streamFileContents(stream, filePath, redact) {
46
+ await new Promise((resolve, reject) => {
47
+ const reader = (0, fs_1.createReadStream)(filePath);
48
+ const transformer = new NormalizeTransform(redact);
49
+ reader.on('error', reject);
50
+ transformer.on('error', reject);
51
+ transformer.on('end', resolve);
52
+ reader.pipe(transformer).pipe(stream, { end: false });
53
+ });
54
+ }
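`NormalizeTransform` holds back the last `carrySize` characters of each chunk, which appears intended to let a pattern that straddles two reads still match on the next pass. The carry-over idea can be shown without stream plumbing. `createCarryProcessor` and `redactAkia` are hypothetical names, and the key below is the placeholder value from AWS documentation, not a real credential.

```javascript
// Hypothetical synchronous stand-in for the Transform stream: same carry-over idea.
function createCarryProcessor(carrySize, rewrite) {
  let carry = '';
  return {
    // Returns the text that is safe to emit after seeing this chunk.
    push(chunk) {
      const text = rewrite(carry + chunk);
      const emitLength = Math.max(0, text.length - carrySize);
      carry = text.slice(emitLength);
      return text.slice(0, emitLength);
    },
    // Returns whatever is still held back at end of input.
    flush() {
      const rest = rewrite(carry);
      carry = '';
      return rest;
    }
  };
}

const redactAkia = (text) => text.replace(/\bAKIA[0-9A-Z]{16}\b/g, '[REDACTED]');
const processor = createCarryProcessor(200, redactAkia);
// The key is split across two chunks, yet it is still redacted.
const out =
  processor.push('id = AKIAIOSF') + processor.push('ODNN7EXAMPLE done') + processor.flush();
console.log(out);
```

A match longer than the carry window, such as a large private key block, would not be protected by this approach if it spans a chunk boundary.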
@@ -0,0 +1,23 @@
1
+ "use strict";
2
+ Object.defineProperty(exports, "__esModule", { value: true });
3
+ exports.redactText = redactText;
4
+ const BLOCK_PATTERNS = [
5
+ /-----BEGIN [^-]+-----[\s\S]*?-----END [^-]+-----/g,
6
+ /\bAKIA[0-9A-Z]{16}\b/g,
7
+ /\bASIA[0-9A-Z]{16}\b/g,
8
+ /\b(?:xox[baprs]-[A-Za-z0-9-]{10,})\b/g,
9
+ /\b(?:ghp|gho|ghu|ghs|ghr)_[A-Za-z0-9]{36}\b/g,
10
+ /\b(?:sk_live|sk_test)_[A-Za-z0-9]{20,}\b/g,
11
+ /\beyJ[A-Za-z0-9_-]+?\.[A-Za-z0-9_-]+?\.[A-Za-z0-9_-]+?\b/g
12
+ ];
13
+ const KEY_VALUE_PATTERN = /\b(api_key|apikey|token|secret|password|passphrase|key)\b(\s*[:=]\s*)['"]?([^'"\s]+)['"]?/gi;
14
+ function redactText(input) {
15
+ let output = input;
16
+ output = output.replace(KEY_VALUE_PATTERN, (_match, key, separator) => {
17
+ return `${key}${separator}[REDACTED]`;
18
+ });
19
+ for (const pattern of BLOCK_PATTERNS) {
20
+ output = output.replace(pattern, '[REDACTED]');
21
+ }
22
+ return output;
23
+ }
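A quick way to understand what the key/value pattern covers is to run it on a few lines. The snippet below copies `KEY_VALUE_PATTERN` from the module so it is self-contained; `redactKeyValues` and the sample strings are hypothetical.

```javascript
// Copied from the redaction module for a self-contained demo.
const KEY_VALUE_PATTERN = /\b(api_key|apikey|token|secret|password|passphrase|key)\b(\s*[:=]\s*)['"]?([^'"\s]+)['"]?/gi;

function redactKeyValues(input) {
  // Keep the key and separator, replace only the value (and any surrounding quotes).
  return input.replace(KEY_VALUE_PATTERN, (_match, key, separator) => `${key}${separator}[REDACTED]`);
}

const samples = [
  'password = "hunter2"',
  'API_KEY=abc123',
  'DB_PASSWORD=abc123'
];
for (const line of samples) console.log(redactKeyValues(line));
```

The third sample is left untouched: `_` counts as a word character, so there is no `\b` boundary before `PASSWORD` in `DB_PASSWORD`. Prefixed variable names like this are outside what the pattern matches, which is consistent with the README's description of redaction as conservative.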
@@ -0,0 +1,103 @@
1
+ "use strict";
2
+ var __importDefault = (this && this.__importDefault) || function (mod) {
3
+ return (mod && mod.__esModule) ? mod : { "default": mod };
4
+ };
5
+ Object.defineProperty(exports, "__esModule", { value: true });
6
+ exports.scan = scan;
7
+ const fs_1 = require("fs");
8
+ const path_1 = __importDefault(require("path"));
9
+ const fast_glob_1 = __importDefault(require("fast-glob"));
10
+ const ignore_1 = __importDefault(require("ignore"));
11
+ const config_1 = require("./config");
12
+ const isBinary_1 = require("./utils/isBinary");
13
+ const normalizePath_1 = require("./utils/normalizePath");
14
+ async function loadGitignore(rootPath) {
15
+ const ig = (0, ignore_1.default)();
16
+ try {
17
+ const contents = await fs_1.promises.readFile(path_1.default.join(rootPath, '.gitignore'), 'utf8');
18
+ ig.add(contents);
19
+ }
20
+ catch {
21
+ return ig;
22
+ }
23
+ return ig;
24
+ }
25
+ async function scan(rootPath, options) {
26
+ const includeGlobs = options.include.length > 0 ? options.include : ['**/*'];
27
+ const excludeGlobs = [...config_1.DEFAULT_EXCLUDES, ...options.exclude];
28
+ const candidates = await (0, fast_glob_1.default)(includeGlobs, {
29
+ cwd: rootPath,
30
+ dot: true,
31
+ onlyFiles: true,
32
+ followSymbolicLinks: false,
33
+ unique: true,
34
+ absolute: true
35
+ });
36
+ const excludedSet = new Set();
37
+ if (excludeGlobs.length > 0) {
38
+ const excluded = await (0, fast_glob_1.default)(excludeGlobs, {
39
+ cwd: rootPath,
40
+ dot: true,
41
+ onlyFiles: true,
42
+ followSymbolicLinks: false,
43
+ unique: true,
44
+ absolute: false
45
+ });
46
+ for (const rel of excluded) {
47
+ excludedSet.add((0, normalizePath_1.normalizePath)(rel));
48
+ }
49
+ }
50
+ const ig = options.respectGitignore ? await loadGitignore(rootPath) : null;
51
+ const skipped = [];
52
+ const files = [];
53
+ for (const absPath of candidates) {
54
+ const relative = (0, normalizePath_1.normalizePath)(path_1.default.relative(rootPath, absPath));
55
+ if (excludedSet.has(relative)) {
56
+ skipped.push({ relativePath: relative, reason: 'excluded' });
57
+ continue;
58
+ }
59
+ if (ig && ig.ignores(relative)) {
60
+ skipped.push({ relativePath: relative, reason: 'excluded' });
61
+ continue;
62
+ }
63
+ let stats;
64
+ try {
65
+ stats = await fs_1.promises.stat(absPath);
66
+ }
67
+ catch (error) {
68
+ skipped.push({
69
+ relativePath: relative,
70
+ reason: 'unreadable',
71
+ detail: error.message
72
+ });
73
+ continue;
74
+ }
75
+ if (stats.size > options.maxBytes) {
76
+ skipped.push({ relativePath: relative, reason: 'too large' });
77
+ continue;
78
+ }
79
+ try {
80
+ if (await (0, isBinary_1.isBinaryFile)(absPath)) {
81
+ skipped.push({ relativePath: relative, reason: 'binary' });
82
+ continue;
83
+ }
84
+ }
85
+ catch (error) {
86
+ skipped.push({
87
+ relativePath: relative,
88
+ reason: 'unreadable',
89
+ detail: error.message
90
+ });
91
+ continue;
92
+ }
93
+ files.push({
94
+ relativePath: relative,
95
+ absolutePath: absPath,
96
+ size: stats.size,
97
+ mtimeMs: stats.mtimeMs
98
+ });
99
+ }
100
+ files.sort((a, b) => a.relativePath.localeCompare(b.relativePath));
101
+ skipped.sort((a, b) => a.relativePath.localeCompare(b.relativePath));
102
+ return { rootPath, files, skipped };
103
+ }
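The loop in `scan` applies its checks in a fixed order, and the first one that fails determines the skip reason. A pure sketch of that ordering follows. `classify` is a hypothetical function, and `excluded`, `ignored`, `size`, and `binary` are stand-ins for the glob results, the `.gitignore` matcher, `fs.stat`, and the binary check.

```javascript
// Hypothetical pure version of the per-file decision; no filesystem access.
function classify(entry, { excluded, ignored, maxBytes }) {
  if (excluded.has(entry.relativePath)) return 'excluded'; // default or user exclude globs
  if (ignored(entry.relativePath)) return 'excluded';      // .gitignore rules, same reason label
  if (entry.size > maxBytes) return 'too large';
  if (entry.binary) return 'binary';
  return 'included';
}

const rules = {
  excluded: new Set(['dist/cli.js']),
  ignored: (p) => p.endsWith('.tmp'),
  maxBytes: 500000
};
console.log(classify({ relativePath: 'src/cli.ts', size: 1200, binary: false }, rules));
console.log(classify({ relativePath: 'assets/logo.png', size: 900000, binary: true }, rules));
```

The second call reports `too large` rather than `binary` because the size check runs first, so an oversized binary is never opened.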
@@ -0,0 +1,28 @@
1
+ "use strict";
2
+ Object.defineProperty(exports, "__esModule", { value: true });
3
+ exports.isBinaryFile = isBinaryFile;
4
+ const fs_1 = require("fs");
5
+ async function isBinaryFile(filePath) {
6
+ const handle = await fs_1.promises.open(filePath, 'r');
7
+ try {
8
+ const buffer = Buffer.alloc(8000);
9
+ const { bytesRead } = await handle.read(buffer, 0, buffer.length, 0);
10
+ if (bytesRead === 0) {
11
+ return false;
12
+ }
13
+ let suspicious = 0;
14
+ for (let i = 0; i < bytesRead; i += 1) {
15
+ const byte = buffer[i];
16
+ if (byte === 0) {
17
+ return true;
18
+ }
19
+ if (byte < 7 || (byte > 14 && byte < 32)) {
20
+ suspicious += 1;
21
+ }
22
+ }
23
+ return suspicious / bytesRead > 0.3;
24
+ }
25
+ finally {
26
+ await handle.close();
27
+ }
28
+ }
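The binary check boils down to a byte heuristic over the first bytes of the file. It can be tried on in-memory buffers with the sketch below. `looksBinary` is a hypothetical name, and unlike `isBinaryFile` it takes a `Buffer` directly instead of reading up to 8000 bytes from disk.

```javascript
// Pure version of the byte heuristic for experimentation.
function looksBinary(buffer) {
  if (buffer.length === 0) return false; // empty files are treated as text
  let suspicious = 0;
  for (const byte of buffer) {
    if (byte === 0) return true; // a NUL byte is treated as conclusive
    // Control characters other than 7..14 (bell through shift-out, which includes tab, LF, CR).
    if (byte < 7 || (byte > 14 && byte < 32)) suspicious += 1;
  }
  return suspicious / buffer.length > 0.3;
}

console.log(looksBinary(Buffer.from('hello\n')));
console.log(looksBinary(Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0x00])));
```

A few stray control characters, such as ANSI escape sequences in a log, stay under the 30% threshold and are still classified as text.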
@@ -0,0 +1,6 @@
1
+ "use strict";
2
+ Object.defineProperty(exports, "__esModule", { value: true });
3
+ exports.normalizePath = normalizePath;
4
+ function normalizePath(inputPath) {
5
+ return inputPath.replace(/\\/g, '/');
6
+ }
@@ -0,0 +1,9 @@
1
+ "use strict";
2
+ Object.defineProperty(exports, "__esModule", { value: true });
3
+ exports.compareEntries = compareEntries;
4
+ function compareEntries(a, b) {
5
+ if (a.isFile !== b.isFile) {
6
+ return a.isFile ? 1 : -1;
7
+ }
8
+ return a.name.localeCompare(b.name);
9
+ }
package/package.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "name": "code-context-extractor",
3
+ "version": "0.1.0",
4
+ "description": "Local-only CLI tool to extract codebase context into a single text or Markdown file.",
5
+ "license": "MIT",
6
+ "author": "Open Source Contributors",
7
+ "type": "commonjs",
8
+ "bin": {
9
+ "code-context-extractor": "dist/cli.js"
10
+ },
11
+ "main": "dist/cli.js",
12
+ "files": [
13
+ "dist"
14
+ ],
15
+ "scripts": {
16
+ "build": "tsc -p tsconfig.json",
17
+ "dev": "tsx src/cli.ts",
18
+ "test": "vitest run",
19
+ "lint": "eslint .",
20
+ "format": "prettier --write ."
21
+ },
22
+ "dependencies": {
23
+ "commander": "^11.1.0",
24
+ "fast-glob": "^3.3.2",
25
+ "ignore": "^5.3.1"
26
+ },
27
+ "devDependencies": {
28
+ "@types/node": "^20.12.12",
29
+ "@typescript-eslint/eslint-plugin": "^7.11.0",
30
+ "@typescript-eslint/parser": "^7.11.0",
31
+ "eslint": "^8.57.0",
32
+ "prettier": "^3.2.5",
33
+ "tsx": "^4.11.0",
34
+ "typescript": "^5.4.5",
35
+ "vitest": "^1.6.0"
36
+ }
37
+ }