codebase-extractor 1.1.0__tar.gz → 1.2.0__tar.gz

Files changed (19)
  1. {codebase_extractor-1.1.0/src/codebase_extractor.egg-info → codebase_extractor-1.2.0}/PKG-INFO +40 -36
  2. {codebase_extractor-1.1.0 → codebase_extractor-1.2.0}/README.md +38 -34
  3. {codebase_extractor-1.1.0 → codebase_extractor-1.2.0}/pyproject.toml +3 -3
  4. codebase_extractor-1.2.0/src/codebase_extractor/__init__.py +1 -0
  5. {codebase_extractor-1.1.0 → codebase_extractor-1.2.0}/src/codebase_extractor/cli.py +3 -2
  6. {codebase_extractor-1.1.0 → codebase_extractor-1.2.0}/src/codebase_extractor/config.py +35 -3
  7. {codebase_extractor-1.1.0 → codebase_extractor-1.2.0}/src/codebase_extractor/file_handler.py +61 -21
  8. {codebase_extractor-1.1.0 → codebase_extractor-1.2.0}/src/codebase_extractor/main_logic.py +36 -13
  9. {codebase_extractor-1.1.0 → codebase_extractor-1.2.0}/src/codebase_extractor/ui.py +12 -26
  10. {codebase_extractor-1.1.0 → codebase_extractor-1.2.0/src/codebase_extractor.egg-info}/PKG-INFO +40 -36
  11. codebase_extractor-1.2.0/src/codebase_extractor.egg-info/entry_points.txt +2 -0
  12. codebase_extractor-1.1.0/src/codebase_extractor/__init__.py +0 -1
  13. codebase_extractor-1.1.0/src/codebase_extractor.egg-info/entry_points.txt +0 -2
  14. {codebase_extractor-1.1.0 → codebase_extractor-1.2.0}/LICENCE +0 -0
  15. {codebase_extractor-1.1.0 → codebase_extractor-1.2.0}/setup.cfg +0 -0
  16. {codebase_extractor-1.1.0 → codebase_extractor-1.2.0}/src/codebase_extractor.egg-info/SOURCES.txt +0 -0
  17. {codebase_extractor-1.1.0 → codebase_extractor-1.2.0}/src/codebase_extractor.egg-info/dependency_links.txt +0 -0
  18. {codebase_extractor-1.1.0 → codebase_extractor-1.2.0}/src/codebase_extractor.egg-info/requires.txt +0 -0
  19. {codebase_extractor-1.1.0 → codebase_extractor-1.2.0}/src/codebase_extractor.egg-info/top_level.txt +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: codebase-extractor
- Version: 1.1.0
+ Version: 1.2.0
  Summary: A CLI tool to extract project source code into structured Markdown files for LLM & AI context.
  Author: Lukasz Lekowski
  Project-URL: Homepage, https://github.com/lukaszlekowski/codebase-extractor
@@ -10,7 +10,7 @@ Classifier: License :: OSI Approved :: MIT License
  Classifier: Operating System :: OS Independent
  Classifier: Topic :: Software Development :: Documentation
  Classifier: Topic :: Utilities
- Requires-Python: >=3.9
+ Requires-Python: >=3.14
  Description-Content-Type: text/markdown
  License-File: LICENCE
  Requires-Dist: questionary
@@ -75,12 +75,12 @@ The tool is highly configurable, allowing you to select specific folders, exclud
  ## ✨ Key Features

  - **Interactive & User-Friendly:** A guided, multi-step CLI experience that makes selecting options simple and clear.
+ - **Quick Start by Default:** The tool starts without delay. Detailed instructions are available via an `--instructions` flag when you need a reminder.
  - **Smart Filtering:** Automatically excludes common dependency folders, build artifacts, version control directories, and IDE configuration files. The exact filters are configurable.
  - **Flexible Selection Modes:** Choose to extract the entire project with one command, or dive into a specific selection mode.
- - **🌳 Nested Folder Selection:** Interactively browse and select specific sub-folders from a tree-like view.
+ - **🌳 Visual Tree Selection:** Interactively browse and select specific sub-folders from a clear, pipe-based tree structure.
  - **🔢 Configurable Scan Depth:** You decide how many levels deep the script should look for folders when building the selection tree.
- - **YAML Metadata:** Each generated Markdown file is prepended with a YAML front matter block containing useful metadata like a unique run ID, timestamp, and file count for easy tracking and parsing.
- - **🚀 Quick Start Mode:** Use the `--no-instructions` flag to skip the detailed intro guide on subsequent runs.
+ - **Rich YAML Metadata:** Each generated Markdown file is prepended with a YAML front matter block containing useful metadata like a unique run ID, timestamp, file count, character count, and word count.
  - **Safe & Robust:** Features graceful exit handling (`Ctrl+C`) and provides clear feedback during the extraction process.

  ---
@@ -165,15 +165,15 @@ pipx install codebase-extractor
  Once installed, you can run the tool from any terminal window. Navigate to your project's root directory and run the command:

  ```bash
- code-extractor
+ codebase-extractor
  ```

- The script will then guide you through the extraction process.
+ The script will launch immediately and guide you through the extraction process.

- For repeat usage, you can skip the detailed introductory guide by using the `--no-instructions` or `-ni` flag:
+ For a detailed guide on how the script works, you can use the `--instructions` flag:

  ```bash
- code-extractor --no-instructions
+ codebase-extractor --instructions
  ```

  ### The Process
@@ -193,25 +193,25 @@ The tool will guide you through a series of prompts:

  ### Output Details

- All output files are saved in a `CODEBASE_EXTRACTS` directory within your project folder. Each generated Markdown file includes a YAML metadata header with a unique reference ID, timestamp, and file count for easy tracking and parsing.
+ All output files are saved in a `CODEBASE_EXTRACTS` directory within your project folder. Each generated Markdown file includes a YAML metadata header with a unique reference ID, timestamp, file count, character count, and word count for easy tracking and parsing.

  ### ⚡ CLI Command Reference

  For non-interactive use and automation, you can control the script entirely with these arguments.

- | Argument | Description | Default Value |
- | :------------------------- | :--------------------------------------------------------------------------- | :-------------------------- |
- | `-ni`, `--no-instructions` | Run the script without printing the detailed instruction banner. | `False` |
- | `--root <path>` | The root directory of the project to extract. | The current directory |
- | `--output-dir <name>` | Custom name for the output directory. | `CODEBASE_EXTRACTS` |
- | `--dry-run` | Simulate the extraction process without writing any files. | `False` |
- | `-v`, `--verbose` | Enable verbose logging for debugging. | `False` |
- | `--log-file <path>` | Path to save the log file. | `None` |
- | `--exclude-large-files` | Non-interactive: Exclude files larger than 1MB. | `False` |
- | `--mode <mode>` | Non-interactive: Set the extraction mode. Choices: `everything`, `specific`. | `None` (Interactive prompt) |
- | `--depth <number>` | Non-interactive: Set the folder scan depth for 'specific' mode. | `3` |
- | `--select-folders <list>` | Non-interactive: A space-separated list of folders/sub-folders to extract. | `[]` |
- | `--select-root` | Non-interactive: Include files from the root directory in the extraction. | `False` |
+ | Argument | Description | Default Value |
+ | :------------------------ | :--------------------------------------------------------------------------- | :-------------------------- |
+ | `--instructions` | Show the detailed instruction guide on startup. | `False` |
+ | `--root <path>` | The root directory of the project to extract. | The current directory |
+ | `--output-dir <name>` | Custom name for the output directory. | `CODEBASE_EXTRACTS` |
+ | `--dry-run` | Simulate the extraction process without writing any files. | `False` |
+ | `-v`, `--verbose` | Enable verbose logging for debugging. | `False` |
+ | `--log-file <path>` | Path to save the log file. | `None` |
+ | `--exclude-large-files` | Non-interactive: Exclude files larger than 1MB. | `False` |
+ | `--mode <mode>` | Non-interactive: Set the extraction mode. Choices: `everything`, `specific`. | `None` (Interactive prompt) |
+ | `--depth <number>` | Non-interactive: Set the folder scan depth for 'specific' mode. | `3` |
+ | `--select-folders <list>` | Non-interactive: A space-separated list of folders/sub-folders to extract. | `[]` |
+ | `--select-root` | Non-interactive: Include files from the root directory in the extraction. | `False` |

  ---

@@ -224,7 +224,7 @@ Here are a few practical examples of how to use the tool from your command line.
  A common command for quick, automated runs.

  ```bash
- code-extractor --no-instructions --mode everything
+ codebase-extractor --mode everything
  ```

  - #### Extract specific sub-folders non-interactively
@@ -232,7 +232,7 @@ Here are a few practical examples of how to use the tool from your command line.
  This command extracts only the `src/components` and `src/hooks` directories, plus any files in the root.

  ```bash
- code-extractor --ni --mode specific --select-folders src/components src/hooks --select-root
+ codebase-extractor --mode specific --select-folders src/components src/hooks --select-root
  ```

  - #### Perform a safe dry run
@@ -240,13 +240,13 @@ Here are a few practical examples of how to use the tool from your command line.
  This will simulate a full extraction and print what it _would_ have done, without creating any files.

  ```bash
- code-extractor --dry-run --mode everything
+ codebase-extractor --dry-run --mode everything
  ```

  - #### Run on a different project and save to a custom folder
  This targets a completely different directory and specifies a custom output folder name.
  ```bash
- code-extractor --root /path/to/another/project --output-dir MyProject_Extraction
+ codebase-extractor --root /path/to/another/project --output-dir MyProject_Extraction
  ```

  ---
@@ -256,10 +256,11 @@ Here are a few practical examples of how to use the tool from your command line.
  The tool uses a set of rules to determine which files and folders to include in the extraction. Here are the default settings found in the `config.py` file.

  <details>
- <summary><strong>Click to view Excluded Directories</strong></summary>
+ <summary><strong>Click to view Excluded Directories</strong></summary>

  - `node_modules`, `vendor`, `__pycache__`, `dist`, `build`, `target`, `.next`
  - `.git`, `.svn`, `.hg`, `.vscode`, `.idea`, `venv`, `.venv`
+ - `.dart_tool`, `.gradle`, `Pods`, `DerivedData`

  </details>

@@ -273,17 +274,20 @@ The tool uses a set of rules to determine which files and folders to include in
  <details>
  <summary><strong>Click to view Allowed Filenames & Extensions</strong></summary>

- The script will process any file with one of the following extensions. It also explicitly allows common configuration files that may not have an extension.
+ The script will process any file with one of the following extensions. It also explicitly allows common configuration files that may not have an extension.

  **Allowed Filenames:**
- - `dockerfile`, `.gitignore`, `.htaccess`, `makefile`
+ - `dockerfile`, `.gitignore`, `.htaccess`, `makefile`, `.dockerignore`, `.env.example`
+ - `podfile`, `gemfile`, `jenkinsfile`, `gradlew`

  **Allowed Extensions:**
- - `.php`, `.html`, `.css`, `.js`, `.jsx`, `.ts`, `.tsx`, `.vue`, `.svelte`
- - `.py`, `.rb`, `.java`, `.c`, `.cpp`, `.cs`, `.go`, `.rs`
- - `.json`, `.xml`, `.yaml`, `.yml`, `.toml`, `.ini`, `.conf`
- - `.md`, `.txt`, `.rst`, `.twig`, `.blade`, `.handlebars`, `.mustache`, `.ejs`
- - `.sql`, `.graphql`, `.gql`, `.tf`
+ - **Web & General:** `.php`, `.html`, `.css`, `.js`, `.jsx`, `.ts`, `.tsx`, `.vue`, `.svelte`
+ - **Backend & Systems:** `.py`, `.rb`, `.java`, `.c`, `.cpp`, `.cs`, `.go`, `.rs`
+ - **Config & Data:** `.json`, `.xml`, `.yaml`, `.yml`, `.toml`, `.ini`, `.conf`
+ - **Docs & Templates:** `.md`, `.txt`, `.rst`, `.twig`, `.blade`, `.handlebars`, `.mustache`, `.ejs`
+ - **Database & IaC:** `.sql`, `.graphql`, `.gql`, `.tf`
+ - **Mobile (Flutter, Android, iOS):** `.dart`, `.arb`, `.gradle`, `.properties`, `.plist`, `.xcconfig`
+ - **Scripts:** `.sh`, `.bat`

  </details>

@@ -291,7 +295,7 @@ The tool uses a set of rules to determine which files and folders to include in

  ## 🤔 Troubleshooting

- - **Problem:** After installation, I run `code-extractor` and my terminal says `command not found`.
+ - **Problem:** After installation, I run `codebase-extractor` and my terminal says `command not found`.
  - **Solution:** This is usually a `PATH` issue. It means your system's shell doesn't know where to find the installed script. The `pip install --user` command sometimes requires you to add a local scripts directory to your `PATH`. Please refer to your operating system's documentation for instructions on how to modify your `PATH` environment variable.

  - **Problem:** The tool ran, but a specific folder or file I expected to see is missing from the output.
@@ -55,12 +55,12 @@ The tool is highly configurable, allowing you to select specific folders, exclud
  ## ✨ Key Features

  - **Interactive & User-Friendly:** A guided, multi-step CLI experience that makes selecting options simple and clear.
+ - **Quick Start by Default:** The tool starts without delay. Detailed instructions are available via an `--instructions` flag when you need a reminder.
  - **Smart Filtering:** Automatically excludes common dependency folders, build artifacts, version control directories, and IDE configuration files. The exact filters are configurable.
  - **Flexible Selection Modes:** Choose to extract the entire project with one command, or dive into a specific selection mode.
- - **🌳 Nested Folder Selection:** Interactively browse and select specific sub-folders from a tree-like view.
+ - **🌳 Visual Tree Selection:** Interactively browse and select specific sub-folders from a clear, pipe-based tree structure.
  - **🔢 Configurable Scan Depth:** You decide how many levels deep the script should look for folders when building the selection tree.
- - **YAML Metadata:** Each generated Markdown file is prepended with a YAML front matter block containing useful metadata like a unique run ID, timestamp, and file count for easy tracking and parsing.
- - **🚀 Quick Start Mode:** Use the `--no-instructions` flag to skip the detailed intro guide on subsequent runs.
+ - **Rich YAML Metadata:** Each generated Markdown file is prepended with a YAML front matter block containing useful metadata like a unique run ID, timestamp, file count, character count, and word count.
  - **Safe & Robust:** Features graceful exit handling (`Ctrl+C`) and provides clear feedback during the extraction process.

  ---
@@ -145,15 +145,15 @@ pipx install codebase-extractor
  Once installed, you can run the tool from any terminal window. Navigate to your project's root directory and run the command:

  ```bash
- code-extractor
+ codebase-extractor
  ```

- The script will then guide you through the extraction process.
+ The script will launch immediately and guide you through the extraction process.

- For repeat usage, you can skip the detailed introductory guide by using the `--no-instructions` or `-ni` flag:
+ For a detailed guide on how the script works, you can use the `--instructions` flag:

  ```bash
- code-extractor --no-instructions
+ codebase-extractor --instructions
  ```

  ### The Process
@@ -173,25 +173,25 @@ The tool will guide you through a series of prompts:

  ### Output Details

- All output files are saved in a `CODEBASE_EXTRACTS` directory within your project folder. Each generated Markdown file includes a YAML metadata header with a unique reference ID, timestamp, and file count for easy tracking and parsing.
+ All output files are saved in a `CODEBASE_EXTRACTS` directory within your project folder. Each generated Markdown file includes a YAML metadata header with a unique reference ID, timestamp, file count, character count, and word count for easy tracking and parsing.

  ### ⚡ CLI Command Reference

  For non-interactive use and automation, you can control the script entirely with these arguments.

- | Argument | Description | Default Value |
- | :------------------------- | :--------------------------------------------------------------------------- | :-------------------------- |
- | `-ni`, `--no-instructions` | Run the script without printing the detailed instruction banner. | `False` |
- | `--root <path>` | The root directory of the project to extract. | The current directory |
- | `--output-dir <name>` | Custom name for the output directory. | `CODEBASE_EXTRACTS` |
- | `--dry-run` | Simulate the extraction process without writing any files. | `False` |
- | `-v`, `--verbose` | Enable verbose logging for debugging. | `False` |
- | `--log-file <path>` | Path to save the log file. | `None` |
- | `--exclude-large-files` | Non-interactive: Exclude files larger than 1MB. | `False` |
- | `--mode <mode>` | Non-interactive: Set the extraction mode. Choices: `everything`, `specific`. | `None` (Interactive prompt) |
- | `--depth <number>` | Non-interactive: Set the folder scan depth for 'specific' mode. | `3` |
- | `--select-folders <list>` | Non-interactive: A space-separated list of folders/sub-folders to extract. | `[]` |
- | `--select-root` | Non-interactive: Include files from the root directory in the extraction. | `False` |
+ | Argument | Description | Default Value |
+ | :------------------------ | :--------------------------------------------------------------------------- | :-------------------------- |
+ | `--instructions` | Show the detailed instruction guide on startup. | `False` |
+ | `--root <path>` | The root directory of the project to extract. | The current directory |
+ | `--output-dir <name>` | Custom name for the output directory. | `CODEBASE_EXTRACTS` |
+ | `--dry-run` | Simulate the extraction process without writing any files. | `False` |
+ | `-v`, `--verbose` | Enable verbose logging for debugging. | `False` |
+ | `--log-file <path>` | Path to save the log file. | `None` |
+ | `--exclude-large-files` | Non-interactive: Exclude files larger than 1MB. | `False` |
+ | `--mode <mode>` | Non-interactive: Set the extraction mode. Choices: `everything`, `specific`. | `None` (Interactive prompt) |
+ | `--depth <number>` | Non-interactive: Set the folder scan depth for 'specific' mode. | `3` |
+ | `--select-folders <list>` | Non-interactive: A space-separated list of folders/sub-folders to extract. | `[]` |
+ | `--select-root` | Non-interactive: Include files from the root directory in the extraction. | `False` |

  ---

@@ -204,7 +204,7 @@ Here are a few practical examples of how to use the tool from your command line.
  A common command for quick, automated runs.

  ```bash
- code-extractor --no-instructions --mode everything
+ codebase-extractor --mode everything
  ```

  - #### Extract specific sub-folders non-interactively
@@ -212,7 +212,7 @@ Here are a few practical examples of how to use the tool from your command line.
  This command extracts only the `src/components` and `src/hooks` directories, plus any files in the root.

  ```bash
- code-extractor --ni --mode specific --select-folders src/components src/hooks --select-root
+ codebase-extractor --mode specific --select-folders src/components src/hooks --select-root
  ```

  - #### Perform a safe dry run
@@ -220,13 +220,13 @@ Here are a few practical examples of how to use the tool from your command line.
  This will simulate a full extraction and print what it _would_ have done, without creating any files.

  ```bash
- code-extractor --dry-run --mode everything
+ codebase-extractor --dry-run --mode everything
  ```

  - #### Run on a different project and save to a custom folder
  This targets a completely different directory and specifies a custom output folder name.
  ```bash
- code-extractor --root /path/to/another/project --output-dir MyProject_Extraction
+ codebase-extractor --root /path/to/another/project --output-dir MyProject_Extraction
  ```

  ---
@@ -236,10 +236,11 @@ Here are a few practical examples of how to use the tool from your command line.
  The tool uses a set of rules to determine which files and folders to include in the extraction. Here are the default settings found in the `config.py` file.

  <details>
- <summary><strong>Click to view Excluded Directories</strong></summary>
+ <summary><strong>Click to view Excluded Directories</strong></summary>

  - `node_modules`, `vendor`, `__pycache__`, `dist`, `build`, `target`, `.next`
  - `.git`, `.svn`, `.hg`, `.vscode`, `.idea`, `venv`, `.venv`
+ - `.dart_tool`, `.gradle`, `Pods`, `DerivedData`

  </details>

@@ -253,17 +254,20 @@ The tool uses a set of rules to determine which files and folders to include in
  <details>
  <summary><strong>Click to view Allowed Filenames & Extensions</strong></summary>

- The script will process any file with one of the following extensions. It also explicitly allows common configuration files that may not have an extension.
+ The script will process any file with one of the following extensions. It also explicitly allows common configuration files that may not have an extension.

  **Allowed Filenames:**
- - `dockerfile`, `.gitignore`, `.htaccess`, `makefile`
+ - `dockerfile`, `.gitignore`, `.htaccess`, `makefile`, `.dockerignore`, `.env.example`
+ - `podfile`, `gemfile`, `jenkinsfile`, `gradlew`

  **Allowed Extensions:**
- - `.php`, `.html`, `.css`, `.js`, `.jsx`, `.ts`, `.tsx`, `.vue`, `.svelte`
- - `.py`, `.rb`, `.java`, `.c`, `.cpp`, `.cs`, `.go`, `.rs`
- - `.json`, `.xml`, `.yaml`, `.yml`, `.toml`, `.ini`, `.conf`
- - `.md`, `.txt`, `.rst`, `.twig`, `.blade`, `.handlebars`, `.mustache`, `.ejs`
- - `.sql`, `.graphql`, `.gql`, `.tf`
+ - **Web & General:** `.php`, `.html`, `.css`, `.js`, `.jsx`, `.ts`, `.tsx`, `.vue`, `.svelte`
+ - **Backend & Systems:** `.py`, `.rb`, `.java`, `.c`, `.cpp`, `.cs`, `.go`, `.rs`
+ - **Config & Data:** `.json`, `.xml`, `.yaml`, `.yml`, `.toml`, `.ini`, `.conf`
+ - **Docs & Templates:** `.md`, `.txt`, `.rst`, `.twig`, `.blade`, `.handlebars`, `.mustache`, `.ejs`
+ - **Database & IaC:** `.sql`, `.graphql`, `.gql`, `.tf`
+ - **Mobile (Flutter, Android, iOS):** `.dart`, `.arb`, `.gradle`, `.properties`, `.plist`, `.xcconfig`
+ - **Scripts:** `.sh`, `.bat`

  </details>

@@ -271,7 +275,7 @@ The tool uses a set of rules to determine which files and folders to include in

  ## 🤔 Troubleshooting

- - **Problem:** After installation, I run `code-extractor` and my terminal says `command not found`.
+ - **Problem:** After installation, I run `codebase-extractor` and my terminal says `command not found`.
  - **Solution:** This is usually a `PATH` issue. It means your system's shell doesn't know where to find the installed script. The `pip install --user` command sometimes requires you to add a local scripts directory to your `PATH`. Please refer to your operating system's documentation for instructions on how to modify your `PATH` environment variable.

  - **Problem:** The tool ran, but a specific folder or file I expected to see is missing from the output.
@@ -4,14 +4,14 @@ build-backend = "setuptools.build_meta"

  [project]
  name = "codebase-extractor"
- version = "1.1.0"
+ version = "1.2.0"
  authors = [
      { name="Lukasz Lekowski" },
  ]
  description = "A CLI tool to extract project source code into structured Markdown files for LLM & AI context."
  readme = "README.md"
  license = { file="LICENSE" }
- requires-python = ">=3.9"
+ requires-python = ">=3.14"
  classifiers = [
      "Programming Language :: Python :: 3",
      "License :: OSI Approved :: MIT License",
@@ -27,7 +27,7 @@ dependencies = [

  # This creates the `code-extractor` command in the user's terminal
  [project.scripts]
- code-extractor = "codebase_extractor.main_logic:main"
+ codebase-extractor = "codebase_extractor.main_logic:main"

  [project.urls]
  "Homepage" = "https://github.com/lukaszlekowski/codebase-extractor"
@@ -0,0 +1 @@
+ __version__ = "1.2.0"
@@ -10,9 +10,10 @@ def parse_arguments():

      # General Flags
      parser.add_argument(
-         '-ni', '--no-instructions',
+         '--instructions',
          action='store_true',
-         help="Run the script without printing the detailed instruction banner."
+         default=False, # ADDED: This ensures the attribute always exists.
+         help="Show the detailed instruction guide on startup."
      )
      parser.add_argument(
          '--root',
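The flag inversion in this hunk can be illustrated with a minimal, self-contained sketch. It is not the package's `parse_arguments()`: the helper name `build_parser` is hypothetical, and only the `--instructions` and `--mode` options (as described in the README table) are reproduced.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the parser in cli.py (only two options shown)."""
    parser = argparse.ArgumentParser(prog="codebase-extractor")
    # Opt-in flag: absent means False, so the tool starts without the guide.
    parser.add_argument(
        "--instructions",
        action="store_true",
        default=False,
        help="Show the detailed instruction guide on startup.",
    )
    # Assumed to mirror the documented choices for non-interactive runs.
    parser.add_argument("--mode", choices=["everything", "specific"], default=None)
    return parser


# Passing an explicit argument list keeps the example independent of sys.argv.
args = build_parser().parse_args(["--mode", "everything"])
print(args.instructions, args.mode)
```

With `action='store_true'`, argparse already defaults the attribute to `False`, so the explicit `default=False` is redundant but harmless.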
@@ -10,30 +10,62 @@ OUTPUT_DIR_NAME = "CODEBASE_EXTRACTS"

  # --- FILE/FOLDER LISTS ---
  EXCLUDED_DIRS = {
+     # Standard exclusions
      "node_modules", "vendor", "__pycache__", "dist", "build", "target", ".next",
-     ".git", ".svn", ".hg", ".vscode", ".idea", "venv", ".venv",
+     ".git", ".svn", ".hg", ".vscode", ".idea", "venv", ".venv", ".dart_tool",
+     # Flutter & Mobile specific exclusions
+     ".dart_tool", # Critical: Contains noisy build config
+     ".gradle", # Internal Gradle cache
+     "Pods", # iOS external dependencies
+     "DerivedData", # iOS build artifacts
  }
  EXCLUDED_FILENAMES = {
-     "package-lock.json", "yarn.lock", "composer.lock", ".env"
+     "package-lock.json", "yarn.lock", "composer.lock", ".env", "Podfile.lock",
  }
  ALLOWED_FILENAMES = {
-     "dockerfile", ".gitignore", ".htaccess", "makefile"
+     # General
+     "dockerfile", ".gitignore", ".htaccess", "makefile", ".dockerignore", ".env.example",
+     # Mobile
+     "podfile", "gemfile", "jenkinsfile", "gradlew",
  }
  ALLOWED_EXTENSIONS = {
+     # Web & General
      ".php", ".html", ".css", ".js", ".jsx", ".ts", ".tsx", ".vue", ".svelte",
      ".py", ".rb", ".java", ".c", ".cpp", ".cs", ".go", ".rs", ".json", ".xml",
      ".yaml", ".yml", ".toml", ".ini", ".conf", ".md", ".txt", ".rst", ".twig",
      ".blade", ".handlebars", ".mustache", ".ejs", ".sql", ".graphql", ".gql", ".tf",
+
+     # Flutter / Dart
+     ".dart", ".arb",
+
+     # Android
+     ".gradle", ".properties",
+
+     # iOS
+     ".plist", ".xcconfig",
+
+     # Scripts
+     ".sh", ".bat",
  }

  # --- MAPPINGS & CONSTANTS ---
  EXTENSION_LANG_MAP = {
+     # Web & General
      ".js": "javascript", ".ts": "typescript", ".tsx": "tsx", ".py": "python",
      ".html": "html", ".css": "css", ".json": "json", ".md": "markdown", ".txt": "",
      ".sh": "bash", ".yml": "yaml", ".yaml": "yaml", ".php": "php", ".rb": "ruby",
      ".java": "java", ".c": "c", ".cpp": "cpp", ".cs": "csharp", ".go": "go",
      ".rs": "rust", ".vue": "vue", ".svelte": "svelte", ".sql": "sql",
      ".graphql": "graphql", ".gql": "graphql",
+
+     # Mobile Specific
+     ".dart": "dart",
+     ".gradle": "groovy",
+     ".plist": "xml",
+     ".xcconfig": "properties",
+     ".properties": "properties",
+     ".arb": "json",
+     ".bat": "batch",
  }
  MAX_FILE_SIZE_MB = 1
  FILE_COUNT_WARNING_THRESHOLD = 1000
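As a rough illustration of how a mapping like `EXTENSION_LANG_MAP` might be consumed, the sketch below looks up a code-fence language from a file suffix. The helper `fence_language` is hypothetical and not shown anywhere in this diff; the dictionary is a small excerpt of the entries added above.

```python
from pathlib import Path

# Excerpt of the mapping from config.py, limited to entries shown in the hunk.
EXTENSION_LANG_MAP = {
    ".py": "python",
    ".dart": "dart",
    ".gradle": "groovy",
    ".plist": "xml",
    ".arb": "json",
    ".bat": "batch",
}


def fence_language(path: Path) -> str:
    """Return the fence language for a file, or '' when the suffix is unmapped.

    Hypothetical helper: lower-casing the suffix is an assumption borrowed
    from the case-insensitive extension check in file_handler.py.
    """
    return EXTENSION_LANG_MAP.get(path.suffix.lower(), "")


print(fence_language(Path("lib/main.dart")))
print(fence_language(Path("ios/Runner/Info.PLIST")))
```

Falling back to an empty string keeps the fence valid for unmapped suffixes, which matches how `.txt` is mapped in the original table.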
@@ -10,29 +10,46 @@ import questionary


  def get_folder_choices(root_path: Path, max_depth: int) -> list:
-     """Recursively finds folders up to a max depth and prepares them for questionary."""
+     """Recursively finds folders up to a max depth and prepares them for questionary with a visual tree."""
      choices = []
-
-     def scanner(current_path: Path, depth: int):
+
+     def scanner(current_path: Path, prefix: str, depth: int):
+         """A recursive helper to build the folder tree."""
+         # Stop scanning if the maximum depth is reached
          if depth > max_depth:
              return

-         relative_path = current_path.relative_to(root_path)
-         prefix = " " * (depth - 1)
-         display_name = f"{prefix}{current_path.name}"
-         choices.append(questionary.Choice(title=display_name, value=relative_path))
-
          try:
-             subdirs = sorted([p for p in current_path.iterdir() if p.is_dir() and p.name not in config.EXCLUDED_DIRS])
-             for subdir in subdirs:
-                 scanner(subdir, depth + 1)
+             # Get a sorted list of valid subdirectories
+             subdirs = sorted([
+                 p for p in current_path.iterdir()
+                 if p.is_dir() and p.name not in config.EXCLUDED_DIRS
+             ])
+
+             # Iterate through the subdirectories to build the tree display
+             for i, subdir in enumerate(subdirs):
+                 is_last = (i == len(subdirs) - 1)
+
+                 # Use '└─' for the last item and '├─' for others
+                 connector = "└─ " if is_last else "├─ "
+                 display_name = f"{prefix}{connector}{subdir.name}"
+
+                 relative_path = subdir.relative_to(root_path)
+                 choices.append(questionary.Choice(title=display_name, value=relative_path))
+
+                 # Prepare the prefix for the next level of recursion
+                 # Use a blank prefix for children of the last item, and a pipe for others
+                 child_prefix = prefix + (" " if is_last else "│ ")
+                 scanner(subdir, child_prefix, depth + 1)
+
          except PermissionError:
+             # Silently ignore directories that the user doesn't have permission to read
              pass

-     top_level_folders = sorted([p for p in root_path.iterdir() if p.is_dir() and p.name not in config.EXCLUDED_DIRS])
-     for folder in top_level_folders:
-         scanner(folder, 1)
+     # Start the recursive scan from the project's root directory
+     scanner(root_path, prefix="", depth=1)

+     # Add the special option to select files in the root folder itself
      root_option_name = f"root [{root_path.name}] (files in root folder only, excl. sub-folders)"
      choices.insert(0, questionary.Choice(title=root_option_name, value="ROOT_SENTINEL"))
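The connector and prefix logic in the new `scanner` can be tried without touching the filesystem. The following sketch walks a nested dictionary instead of directories; `render_tree` and the sample tree are hypothetical, and the three-character child prefixes are an assumption chosen to line up with the three-character connectors.

```python
def render_tree(tree: dict, prefix: str = "") -> list[str]:
    """Render a nested dict of folder names as pipe-based tree lines.

    Hypothetical stand-in for scanner(): dict keys play the role of
    sub-directory names and values hold their children.
    """
    lines = []
    names = sorted(tree)
    for i, name in enumerate(names):
        is_last = i == len(names) - 1
        # '└─' closes a branch, '├─' signals that more siblings follow.
        connector = "└─ " if is_last else "├─ "
        lines.append(f"{prefix}{connector}{name}")
        # Children of the last sibling get blank padding; others keep a pipe.
        child_prefix = prefix + ("   " if is_last else "│  ")
        lines.extend(render_tree(tree[name], child_prefix))
    return lines


sample = {"src": {"components": {}, "hooks": {}}, "tests": {}}
print("\n".join(render_tree(sample)))
```

The key design point is that the prefix is threaded through the recursion, so each level only needs to know whether it is the last sibling.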
38
55
 
@@ -49,15 +66,20 @@ def is_allowed_file(path: Path, exclude_large: bool) -> bool:
49
66
  return False
50
67
  if path.name.lower() in config.EXCLUDED_FILENAMES:
51
68
  return False
52
- if path.suffix not in config.ALLOWED_EXTENSIONS:
69
+ if path.suffix.lower() not in config.ALLOWED_EXTENSIONS:
53
70
  return False
54
71
  if exclude_large and path.stat().st_size > config.MAX_FILE_SIZE_MB * 1024 * 1024:
55
72
  return False
56
73
  return True
57
74
 
58
75
 
59
- def extract_code_from_folder(folder: Path, exclude_large: bool) -> (str, int):
60
- """Extracts code from a given folder, respecting EXCLUDED_DIRS at all depths."""
76
+ def extract_code_from_folder(folder: Path, exclude_large: bool) -> tuple[str, int, int, int]:
77
+ """
78
+ Extracts code from a given folder, respecting EXCLUDED_DIRS at all depths.
79
+
80
+ Returns:
81
+ A tuple containing the content string, file count, char count, and word count.
82
+ """
61
83
  content = f"# Folder: {folder.relative_to(Path.cwd())}\n\n"
62
84
  extracted_files = 0
63
85
  dirs_to_visit = [folder]
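Note on the hunk above: lower-casing the suffix means files such as `README.MD` or `Main.PY` are no longer rejected by the extension filter. A small sketch of the two guards follows. The constants are stand-ins for the values in `config.py` (the real sets are larger), and the helper names `has_allowed_extension` and `is_too_large` are hypothetical.

```python
from pathlib import Path

# Stand-ins for config.ALLOWED_EXTENSIONS and config.MAX_FILE_SIZE_MB.
ALLOWED_EXTENSIONS = {".py", ".md", ".json"}
MAX_FILE_SIZE_MB = 1


def has_allowed_extension(path: Path) -> bool:
    """Case-insensitive extension check, in the style of the 1.2.0 filter."""
    return path.suffix.lower() in ALLOWED_EXTENSIONS


def is_too_large(size_bytes: int) -> bool:
    """Size guard: only files strictly larger than the limit are rejected."""
    return size_bytes > MAX_FILE_SIZE_MB * 1024 * 1024
```

The comparison only works if the entries in the allowed set are themselves lower-case, which is how the README lists them.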
@@ -80,11 +102,21 @@ def extract_code_from_folder(folder: Path, exclude_large: bool) -> (str, int):
80
102
  content += f"\n\n"
81
103
  if extracted_files > config.FILE_COUNT_WARNING_THRESHOLD:
82
104
  logging.warning(colored(f"> Caution: Large file count in '{folder.name}' ({extracted_files} files).", "yellow"))
83
- return content, extracted_files
105
+
106
+ # ADDED: Calculate character and word counts
107
+ char_count = len(content)
108
+ word_count = len(content.split())
109
+
110
+ return content, extracted_files, char_count, word_count
84
111
 
85
112
 
86
- def extract_code_from_root(root_path: Path, exclude_large: bool) -> (str, int):
87
- """Extracts code only from files present in the root directory."""
113
+ def extract_code_from_root(root_path: Path, exclude_large: bool) -> tuple[str, int, int, int]:
114
+ """
115
+ Extracts code only from files present in the root directory.
116
+
117
+ Returns:
118
+ A tuple containing the content string, file count, char count, and word count.
119
+ """
88
120
  content = f"# Root Files: {root_path.name}\n\n"
89
121
  extracted_files = 0
90
122
  for filepath in sorted(root_path.iterdir()):
@@ -97,7 +129,12 @@ def extract_code_from_root(root_path: Path, exclude_large: bool) -> (str, int):
97
129
  extracted_files += 1
98
130
  if extracted_files > config.FILE_COUNT_WARNING_THRESHOLD:
99
131
  logging.warning(colored(f"> Caution: Large file count in root ({extracted_files} files).", "yellow"))
100
- return content, extracted_files
132
+
133
+ # ADDED: Calculate character and word counts
134
+ char_count = len(content)
135
+ word_count = len(content.split())
136
+
137
+ return content, extracted_files, char_count, word_count
101
138
 
102
139
 
103
140
  def write_to_markdown_file(content: str, metadata: dict, root_path: Path, output_dir_name: str):
@@ -116,12 +153,15 @@ def write_to_markdown_file(content: str, metadata: dict, root_path: Path, output
116
153
  filename = f"{file_base_name}_{timestamp}.md"
117
154
  full_filepath = output_dir / filename
118
155
 
156
+ # CHANGED: Added char_count and word_count to the YAML header
119
157
  yaml_header = f"""---
120
158
  extraction_details:
121
159
  reference: {metadata['run_ref']}
122
160
  timestamp_utc: "{metadata['run_timestamp']}"
123
161
  source_folder: "{metadata['folder_name']}"
124
162
  file_count: {metadata['file_count']}
163
+ char_count: {metadata['char_count']}
164
+ word_count: {metadata['word_count']}
125
165
  tool_details:
126
166
  name: "Codebase Extractor"
127
167
  version: "{__version__}"
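Note on the hunks above: the character and word counts are taken from the assembled Markdown body, so they include the generated `# Folder:` heading and any other scaffolding, while the YAML header itself appears to be added afterwards and is not counted. The sketch below shows the same arithmetic on a toy string. `count_metrics` and `render_header` are hypothetical helpers, and the two-space nesting in the header is an assumption because this diff view flattens indentation.

```python
def count_metrics(content: str) -> tuple[int, int]:
    """Return (char_count, word_count) computed the way the diff does."""
    return len(content), len(content.split())


def render_header(metadata: dict) -> str:
    """Build a reduced YAML front-matter block from the metadata dictionary."""
    return (
        "---\n"
        "extraction_details:\n"
        f"  file_count: {metadata['file_count']}\n"
        f"  char_count: {metadata['char_count']}\n"
        f"  word_count: {metadata['word_count']}\n"
        "---\n"
    )


content = "# Folder: src\n\nprint('hello world')\n"
chars, words = count_metrics(content)
header = render_header({"file_count": 1, "char_count": chars, "word_count": words})
# Same thousands-separated format as the new log line in main_logic.py.
summary = f"{chars:,} character(s), {words:,} word(s)"
```

`str.split()` with no arguments splits on any run of whitespace, so the word count is a rough size indicator rather than a token count.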
@@ -35,7 +35,7 @@ class NumberValidator(Validator):
35
35
  message="Please enter a valid number.",
36
36
  cursor_position=len(document.text))
37
37
 
38
- def setup_logging(verbose: bool, log_file: str = None):
38
+ def setup_logging(verbose: bool, log_file: Optional[str] = None):
39
39
  """Configures the logging system."""
40
40
  log_level = logging.DEBUG if verbose else logging.INFO
41
41
  log_format = logging.Formatter('%(message)s')
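Note on the hunk above: `log_file: str = None` relied on an implicit optional type, which current type checkers flag; `Optional[str]` states the intent explicitly and needs `from typing import Optional`, an import that is presumably added elsewhere in the file since it is not visible in this hunk. The sketch below shows one way such a function could look. The logger name and the handler layout are assumptions, not the package's implementation.

```python
import logging
from typing import Optional


def setup_logging(verbose: bool, log_file: Optional[str] = None) -> logging.Logger:
    """Configure a logger; attach a file handler only when a path is given."""
    logger = logging.getLogger("extractor_sketch")  # hypothetical logger name
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)
    logger.handlers.clear()  # keep repeated calls from stacking handlers

    formatter = logging.Formatter("%(message)s")
    console = logging.StreamHandler()
    console.setFormatter(formatter)
    logger.addHandler(console)

    if log_file is not None:
        file_handler = logging.FileHandler(log_file, encoding="utf-8")
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)
    return logger
```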
@@ -79,14 +79,14 @@ def main():
79
79
  # --- Startup Sequence ---
80
80
  if not is_fully_automated:
81
81
  ui.clear_screen()
82
- ui.print_banner(no_instructions=args.no_instructions)
83
- if not args.no_instructions:
82
+ # CHANGED: Pass the new 'instructions' flag to the banner function
83
+ ui.print_banner(show_instructions=args.instructions)
84
+ # CHANGED: Logic is now inverted to show instructions only when the flag is present
85
+ if args.instructions:
84
86
  ui.show_instructions(output_dir_name)
85
- else:
86
- input(colored("\nPress Enter to begin...", "green"))
87
- ui.clear_screen()
88
87
  else:
89
- ui.print_banner(no_instructions=True)
88
+ # NOTE: For automated runs, the banner is always minimal. This is correct.
89
+ ui.print_banner(show_instructions=False)
90
90
 
91
91
  # --- Collect Settings (Interactively or from Args) ---
92
92
  select_style = Style([('qmark', 'fg:#FFA500'), ('pointer', 'fg:#FFA500'), ('highlighted', 'fg:black bg:#FFA500'), ('selected', 'fg:black bg:#FFA500')])
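Note on the hunk above: the startup logic moves from an opt-out flag (`--no-instructions`) to an opt-in flag (`--instructions`), so a plain invocation now goes straight to the prompts. The `cli.py` change is not part of this section, so the sketch below is an assumption about how such a flag is typically declared with `argparse`; only the flag name, its `False` default and the `--mode` choices come from the README table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical reduced parser showing the opt-in instructions flag."""
    parser = argparse.ArgumentParser(prog="codebase-extractor")
    # Opt-in flag: absent -> False (quick start), present -> True (show guide).
    parser.add_argument(
        "--instructions",
        action="store_true",
        help="Show the detailed instruction guide on startup.",
    )
    parser.add_argument("--mode", choices=["everything", "specific"], default=None)
    return parser
```

With `store_true`, `args.instructions` can be passed directly as `show_instructions`, which is why the `not` disappears from the call site in the diff.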
@@ -94,7 +94,7 @@ def main():
94
94
  exclude_large = args.exclude_large_files
95
95
  if not is_fully_automated:
96
96
  logging.info("=== Extraction Settings ===")
97
- exclude_large_choice = questionary.select("[1/2] -- Exclude files larger than 1MB?", choices=["yes", "no"], style=select_style, instruction=" ").ask()
97
+ exclude_large_choice = questionary.select("[1/2] -- Exclude files larger than 1MB?", choices=["no", "yes"], style=select_style, instruction=" ").ask()
98
98
  if exclude_large_choice is None: raise KeyboardInterrupt
99
99
  exclude_large = exclude_large_choice == "yes"
100
100
  print()
@@ -148,13 +148,23 @@ def main():
148
148
  for folder_path in sorted(list(folders_to_process)):
149
149
  with Halo(text=f"Extracting {folder_path.relative_to(root_path)}...", spinner="dots"):
150
150
  time.sleep(0.1)
151
- folder_md, folder_count = file_handler.extract_code_from_folder(folder_path, exclude_large)
151
+ # CHANGED: Unpack the new char_count and word_count values
152
+ folder_md, folder_count, char_count, word_count = file_handler.extract_code_from_folder(folder_path, exclude_large)
152
153
 
153
154
  if folder_count > 0:
154
- metadata = {"run_ref": run_ref, "run_timestamp": run_timestamp, "folder_name": str(folder_path.relative_to(root_path)), "file_count": folder_count}
155
+ # CHANGED: Add new metrics to the metadata dictionary
156
+ metadata = {
157
+ "run_ref": run_ref,
158
+ "run_timestamp": run_timestamp,
159
+ "folder_name": str(folder_path.relative_to(root_path)),
160
+ "file_count": folder_count,
161
+ "char_count": char_count,
162
+ "word_count": word_count
163
+ }
155
164
  if not args.dry_run:
156
165
  file_handler.write_to_markdown_file(folder_md, metadata, root_path, output_dir_name)
157
166
  logging.info(f"✅ Extracted {folder_count} file(s) from: {folder_path.relative_to(root_path)}")
167
+ logging.info(f"📜 {char_count:,} character(s), {word_count:,} word(s)")
158
168
  if args.dry_run: logging.info(colored(" (Dry Run: No file written)", "yellow"))
159
169
  total_files_extracted += folder_count
160
170
  else:
@@ -165,14 +175,24 @@ def main():
165
175
  root_display_name = f"root [{root_path.name}] (files in root folder only, excl. sub-folders)"
166
176
  with Halo(text=f"Extracting {root_display_name}...", spinner="dots"):
167
177
  time.sleep(0.1)
168
- root_md, root_count = file_handler.extract_code_from_root(root_path, exclude_large)
178
+ # CHANGED: Unpack the new char_count and word_count values
179
+ root_md, root_count, char_count, word_count = file_handler.extract_code_from_root(root_path, exclude_large)
169
180
 
170
181
  if root_count > 0:
171
- metadata = {"run_ref": run_ref, "run_timestamp": run_timestamp, "folder_name": root_display_name, "file_count": root_count}
182
+ # CHANGED: Add new metrics to the metadata dictionary
183
+ metadata = {
184
+ "run_ref": run_ref,
185
+ "run_timestamp": run_timestamp,
186
+ "folder_name": root_display_name,
187
+ "file_count": root_count,
188
+ "char_count": char_count,
189
+ "word_count": word_count
190
+ }
172
191
  if not args.dry_run:
173
192
  file_handler.write_to_markdown_file(root_md, metadata, root_path, output_dir_name)
174
193
  total_files_extracted += root_count
175
194
  logging.info(f"✅ Extracted {root_count} file(s) from the root directory")
195
+ logging.info(f"📜 {char_count:,} character(s), {word_count:,} word(s)")
176
196
  if args.dry_run: logging.info(colored(" (Dry Run: No file written)", "yellow"))
177
197
  else:
178
198
  logging.warning("‼️ No extractable files in the root directory")
@@ -196,4 +216,7 @@ def main():
196
216
  logging.error(colored(f"\n[!] An unexpected error occurred: {e}", "red"))
197
217
  import traceback
198
218
  traceback.print_exc()
199
- sys.exit(1)
219
+ sys.exit(1)
220
+
221
+ if __name__ == "__main__":
222
+ main()
@@ -4,8 +4,6 @@ from . import config
4
4
  from . import __version__
5
5
  from termcolor import colored
6
6
 
7
- # ... (LOGO_LARGE and LOGO_SMALL strings remain the same) ...
8
-
9
7
  LOGO_LARGE = """
10
8
  ██████╗ ██████╗ ██████╗ ███████╗██████╗ █████╗ ███████╗███████╗ ███████╗██╗ ██╗████████╗██████╗ █████╗ ██████╗████████╗ ██████╗ ██████╗
11
9
  ██╔════╝██╔═══██╗██╔══██╗██╔════╝██╔══██╗██╔══██╗██╔════╝██╔════╝ ██╔════╝╚██╗██╔╝╚══██╔══╝██╔══██╗██╔══██╗██╔════╝╚══██╔══╝██╔═══██╗██╔══██╗
@@ -16,26 +14,16 @@ LOGO_LARGE = """
16
14
  """
17
15
 
18
16
  LOGO_SMALL = """
19
- ██████╗ ██████╗ ██████╗ ███████╗██████╗ █████╗ ███████╗███████╗
20
- ██╔════╝██╔═══██╗██╔══██╗██╔════╝██╔══██╗██╔══██╗██╔════╝██╔════╝
21
- ██║ ██║ ██║██║ ██║█████╗ ██████╔╝███████║███████╗█████╗
22
- ██║ ██║ ██║██║ ██║██╔══╝ ██╔══██╗██╔══██║╚════██║██╔══╝
23
- ╚██████╗╚██████╔╝██████╔╝███████╗██████╔╝██║ ██║███████║███████╗
24
- ╚═════╝ ╚═════╝ ╚═════╝ ╚══════╝╚═════╝ ╚═╝ ╚═╝╚══════╝╚══════╝
25
-
26
- ███████╗██╗ ██╗████████╗██████╗ █████╗ ██████╗████████╗ ██████╗ ██████╗
27
- ██╔════╝╚██╗██╔╝╚══██╔══╝██╔══██╗██╔══██╗██╔════╝╚══██╔══╝██╔═══██╗██╔══██╗
28
- █████╗ ╚███╔╝ ██║ ██████╔╝███████║██║ ██║ ██║ ██║██████╔╝
29
- ██╔══╝ ██╔██╗ ██║ ██╔══██╗██╔══██║██║ ██║ ██║ ██║██╔══██╗
30
- ███████╗██╔╝ ██╗ ██║ ██║ ██║██║ ██║╚██████╗ ██║ ╚██████╔╝██║ ██║
31
- ╚══════╝╚═╝ ╚═╝ ╚═╝ ╚═╝ ╚═╝╚═╝ ╚═╝ ╚═════╝ ╚═╝ ╚═════╝ ╚═╝ ╚═╝
17
+ ░█▀▀░█▀█░█▀▄░█▀▀░█▀▄░█▀█░█▀▀░█▀▀░░░█▀▀░█░█░▀█▀░█▀▄░█▀█░█▀▀░▀█▀░█▀█░█▀▄
18
+ ░█░░░█░█░█░█░█▀▀░█▀▄░█▀█░▀▀█░█▀▀░░░█▀▀░▄▀▄░░█░░█▀▄░█▀█░█░░░░█░░█░█░█▀▄
19
+ ░▀▀▀░▀▀▀░▀▀░░▀▀▀░▀▀░░▀░▀░▀▀▀░▀▀▀░░░▀▀▀░▀░▀░░▀░░▀░▀░▀░▀░▀▀▀░░▀░░▀▀▀░▀░▀
32
20
  """
33
21
 
34
22
  def clear_screen():
35
23
  """Clears the terminal screen."""
36
24
  os.system('cls' if os.name == 'nt' else 'clear')
37
25
 
38
- def print_banner(no_instructions: bool = False):
26
+ def print_banner(show_instructions: bool = False):
39
27
  """Prints a banner that adjusts to the terminal width."""
40
28
  try:
41
29
  width = shutil.get_terminal_size((80, 20)).columns
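Note on the hunk above: `print_banner` reads the terminal width with a fallback of 80 columns and then picks a logo that fits, and 1.2.0 replaces the tall two-block small logo with a three-line one. The selection step is outside this hunk, so the sketch below is only an illustration. The threshold `LOGO_LARGE_WIDTH`, the helper names and the shortened title text are all assumptions.

```python
import shutil

LOGO_LARGE_WIDTH = 150  # hypothetical: columns needed by the wide logo


def pick_logo(width: int, large: str, small: str) -> str:
    """Use the wide logo only when the terminal has room for it."""
    return large if width >= LOGO_LARGE_WIDTH else small


def banner_title(version: str, width: int) -> str:
    """Centre the title and pad it with '=' to the terminal width."""
    return f" Codebase Extractor v{version} ".center(width, "=")


# Falls back to 80x20 when the size cannot be determined (e.g. output is piped).
width = shutil.get_terminal_size((80, 20)).columns
```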
@@ -48,11 +36,10 @@ def print_banner(no_instructions: bool = False):
48
36
  print(LOGO_SMALL)
49
37
 
50
38
  # Use the imported __version__ variable instead of config.SCRIPT_VERSION
51
- print(colored(f" Welcome to Code Extractor v{__version__} by Lukasz Lekowski ".center(width, "="), "white", "on_magenta"))
52
-
53
- if not no_instructions:
54
- print("\nThis tool consolidates your project's code into structured Markdown files.")
55
- print("It's ideal for providing context to AI models, archiving projects, or generating documentation.")
39
+ print(colored(f" Welcome to Codebase Extractor v{__version__} by Lukasz Lekowski ".center(width, "="), "white", "on_magenta"))
40
+ print("\nThis tool consolidates your project's codebase into structured Markdown files.")
41
+ print("It's ideal for providing context to AI models, archiving projects, or generating documentation.\n")
42
+
56
43
 
57
44
  def show_instructions(output_dir_name: str):
58
45
  """Clears screen and shows detailed instructions, pausing for user input."""
@@ -80,12 +67,13 @@ def show_instructions(output_dir_name: str):
80
67
  print(" - Selection Tree: You'll see a tree-like list of your project's folders. The script handles parent/child selections intelligently:")
81
68
  print(" - If you select a parent folder, all of its sub-folders are automatically included. You don't need to check them individually.")
82
69
  print(" - To get a file for *only* a sub-folder, select the sub-folder but *not* its parent.")
83
- print(" - The 'root [...]' option specifically extracts *only* the files in your project's main directory.\n")
70
+ print(" - The 'root [...]' option specifically extracts *only* the files (not files in sub-folders) in your project's main directory.\n")
84
71
 
85
72
  print(colored("--- Output Details ---", "yellow"))
86
73
  print(f"All extracted content is saved into the '{output_dir_name}' directory. Each Markdown file generated will contain a YAML metadata header at the top with a unique reference ID, a timestamp, and more.\n")
87
74
 
88
- tip = "TIP: Run this script with the --no-instructions or -ni flag to skip this guide."
75
+ # CHANGED: Updated the tip to reflect the new '--instructions' flag
76
+ tip = "TIP: To see this guide again, run the script with the --instructions flag."
89
77
  print(colored(tip, "black", "on_yellow"))
90
78
 
91
79
  input(colored("\nReady? Press Enter to begin...", "green"))
@@ -105,6 +93,4 @@ def print_footer():
105
93
  print("💡 Love this tool? Found a bug? Share your feedback on GitHub:")
106
94
  print(config.GITHUB_URL + "\n")
107
95
  print("🤝 Connect with the author on LinkedIn:")
108
- print(config.LINKEDIN_URL + "\n")
109
- print("☕ Enjoying this tool? You can support its development with a coffee!")
110
- print("https://www.buymeacoffee.com/lukaszlekowski\n")
96
+ print(config.LINKEDIN_URL + "\n")
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: codebase-extractor
3
- Version: 1.1.0
3
+ Version: 1.2.0
4
4
  Summary: A CLI tool to extract project source code into structured Markdown files for LLM & AI context.
5
5
  Author: Lukasz Lekowski
6
6
  Project-URL: Homepage, https://github.com/lukaszlekowski/codebase-extractor
@@ -10,7 +10,7 @@ Classifier: License :: OSI Approved :: MIT License
10
10
  Classifier: Operating System :: OS Independent
11
11
  Classifier: Topic :: Software Development :: Documentation
12
12
  Classifier: Topic :: Utilities
13
- Requires-Python: >=3.9
13
+ Requires-Python: >=3.14
14
14
  Description-Content-Type: text/markdown
15
15
  License-File: LICENCE
16
16
  Requires-Dist: questionary
@@ -75,12 +75,12 @@ The tool is highly configurable, allowing you to select specific folders, exclud
75
75
  ## ✨ Key Features
76
76
 
77
77
  - **Interactive & User-Friendly:** A guided, multi-step CLI experience that makes selecting options simple and clear.
78
+ - **Quick Start by Default:** The tool starts without delay. Detailed instructions are available via an `--instructions` flag when you need a reminder.
78
79
  - **Smart Filtering:** Automatically excludes common dependency folders, build artifacts, version control directories, and IDE configuration files. The exact filters are configurable.
79
80
  - **Flexible Selection Modes:** Choose to extract the entire project with one command, or dive into a specific selection mode.
80
- - **🌳 Nested Folder Selection:** Interactively browse and select specific sub-folders from a tree-like view.
81
+ - **🌳 Visual Tree Selection:** Interactively browse and select specific sub-folders from a clear, pipe-based tree structure.
81
82
  - **🔢 Configurable Scan Depth:** You decide how many levels deep the script should look for folders when building the selection tree.
82
- - **YAML Metadata:** Each generated Markdown file is prepended with a YAML front matter block containing useful metadata like a unique run ID, timestamp, and file count for easy tracking and parsing.
83
- - **🚀 Quick Start Mode:** Use the `--no-instructions` flag to skip the detailed intro guide on subsequent runs.
83
+ - **Rich YAML Metadata:** Each generated Markdown file is prepended with a YAML front matter block containing useful metadata like a unique run ID, timestamp, file count, character count, and word count.
84
84
  - **Safe & Robust:** Features graceful exit handling (`Ctrl+C`) and provides clear feedback during the extraction process.
85
85
 
86
86
  ---
@@ -165,15 +165,15 @@ pipx install codebase-extractor
165
165
  Once installed, you can run the tool from any terminal window. Navigate to your project's root directory and run the command:
166
166
 
167
167
  ```bash
168
- code-extractor
168
+ codebase-extractor
169
169
  ```
170
170
 
171
- The script will then guide you through the extraction process.
171
+ The script will launch immediately and guide you through the extraction process.
172
172
 
173
- For repeat usage, you can skip the detailed introductory guide by using the `--no-instructions` or `-ni` flag:
173
+ For a detailed guide on how the script works, you can use the `--instructions` flag:
174
174
 
175
175
  ```bash
176
- code-extractor --no-instructions
176
+ codebase-extractor --instructions
177
177
  ```
178
178
 
179
179
  ### The Process
@@ -193,25 +193,25 @@ The tool will guide you through a series of prompts:
193
193
 
194
194
  ### Output Details
195
195
 
196
- All output files are saved in a `CODEBASE_EXTRACTS` directory within your project folder. Each generated Markdown file includes a YAML metadata header with a unique reference ID, timestamp, and file count for easy tracking and parsing.
196
+ All output files are saved in a `CODEBASE_EXTRACTS` directory within your project folder. Each generated Markdown file includes a YAML metadata header with a unique reference ID, timestamp, file count, character count, and word count for easy tracking and parsing.
197
197
 
198
198
  ### ⚡ CLI Command Reference
199
199
 
200
200
  For non-interactive use and automation, you can control the script entirely with these arguments.
201
201
 
202
- | Argument | Description | Default Value |
203
- | :------------------------- | :--------------------------------------------------------------------------- | :-------------------------- |
204
- | `-ni`, `--no-instructions` | Run the script without printing the detailed instruction banner. | `False` |
205
- | `--root <path>` | The root directory of the project to extract. | The current directory |
206
- | `--output-dir <name>` | Custom name for the output directory. | `CODEBASE_EXTRACTS` |
207
- | `--dry-run` | Simulate the extraction process without writing any files. | `False` |
208
- | `-v`, `--verbose` | Enable verbose logging for debugging. | `False` |
209
- | `--log-file <path>` | Path to save the log file. | `None` |
210
- | `--exclude-large-files` | Non-interactive: Exclude files larger than 1MB. | `False` |
211
- | `--mode <mode>` | Non-interactive: Set the extraction mode. Choices: `everything`, `specific`. | `None` (Interactive prompt) |
212
- | `--depth <number>` | Non-interactive: Set the folder scan depth for 'specific' mode. | `3` |
213
- | `--select-folders <list>` | Non-interactive: A space-separated list of folders/sub-folders to extract. | `[]` |
214
- | `--select-root` | Non-interactive: Include files from the root directory in the extraction. | `False` |
202
+ | Argument | Description | Default Value |
203
+ | :------------------------ | :--------------------------------------------------------------------------- | :-------------------------- |
204
+ | `--instructions` | Show the detailed instruction guide on startup. | `False` |
205
+ | `--root <path>` | The root directory of the project to extract. | The current directory |
206
+ | `--output-dir <name>` | Custom name for the output directory. | `CODEBASE_EXTRACTS` |
207
+ | `--dry-run` | Simulate the extraction process without writing any files. | `False` |
208
+ | `-v`, `--verbose` | Enable verbose logging for debugging. | `False` |
209
+ | `--log-file <path>` | Path to save the log file. | `None` |
210
+ | `--exclude-large-files` | Non-interactive: Exclude files larger than 1MB. | `False` |
211
+ | `--mode <mode>` | Non-interactive: Set the extraction mode. Choices: `everything`, `specific`. | `None` (Interactive prompt) |
212
+ | `--depth <number>` | Non-interactive: Set the folder scan depth for 'specific' mode. | `3` |
213
+ | `--select-folders <list>` | Non-interactive: A space-separated list of folders/sub-folders to extract. | `[]` |
214
+ | `--select-root` | Non-interactive: Include files from the root directory in the extraction. | `False` |
215
215
 
216
216
  ---
217
217
 
@@ -224,7 +224,7 @@ Here are a few practical examples of how to use the tool from your command line.
224
224
  A common command for quick, automated runs.
225
225
 
226
226
  ```bash
227
- code-extractor --no-instructions --mode everything
227
+ codebase-extractor --mode everything
228
228
  ```
229
229
 
230
230
  - #### Extract specific sub-folders non-interactively
@@ -232,7 +232,7 @@ Here are a few practical examples of how to use the tool from your command line.
232
232
  This command extracts only the `src/components` and `src/hooks` directories, plus any files in the root.
233
233
 
234
234
  ```bash
235
- code-extractor --ni --mode specific --select-folders src/components src/hooks --select-root
235
+ codebase-extractor --mode specific --select-folders src/components src/hooks --select-root
236
236
  ```
237
237
 
238
238
  - #### Perform a safe dry run
@@ -240,13 +240,13 @@ Here are a few practical examples of how to use the tool from your command line.
240
240
  This will simulate a full extraction and print what it _would_ have done, without creating any files.
241
241
 
242
242
  ```bash
243
- code-extractor --dry-run --mode everything
243
+ codebase-extractor --dry-run --mode everything
244
244
  ```
245
245
 
246
246
  - #### Run on a different project and save to a custom folder
247
247
  This targets a completely different directory and specifies a custom output folder name.
248
248
  ```bash
249
- code-extractor --root /path/to/another/project --output-dir MyProject_Extraction
249
+ codebase-extractor --root /path/to/another/project --output-dir MyProject_Extraction
250
250
  ```
251
251
 
252
252
  ---
@@ -256,10 +256,11 @@ Here are a few practical examples of how to use the tool from your command line.
256
256
  The tool uses a set of rules to determine which files and folders to include in the extraction. Here are the default settings found in the `config.py` file.
257
257
 
258
258
  <details>
259
- <summary><strong>Click to view Excluded Directories</strong></summary>
259
+ <summary><strong>Click to view Excluded Directories</strong></summary>
260
260
 
261
261
  - `node_modules`, `vendor`, `__pycache__`, `dist`, `build`, `target`, `.next`
262
262
  - `.git`, `.svn`, `.hg`, `.vscode`, `.idea`, `venv`, `.venv`
263
+ - `.dart_tool`, `.gradle`, `Pods`, `DerivedData`
263
264
 
264
265
  </details>
265
266
 
@@ -273,17 +274,20 @@ The tool uses a set of rules to determine which files and folders to include in
273
274
  <details>
274
275
  <summary><strong>Click to view Allowed Filenames & Extensions</strong></summary>
275
276
 
276
- The script will process any file with one of the following extensions. It also explicitly allows common configuration files that may not have an extension.
277
+ The script will process any file with one of the following extensions. It also explicitly allows common configuration files that may not have an extension.
277
278
 
278
279
  **Allowed Filenames:**
279
- - `dockerfile`, `.gitignore`, `.htaccess`, `makefile`
280
+ - `dockerfile`, `.gitignore`, `.htaccess`, `makefile`, `.dockerignore`, `.env.example`
281
+ - `podfile`, `gemfile`, `jenkinsfile`, `gradlew`
280
282
 
281
283
  **Allowed Extensions:**
282
- - `.php`, `.html`, `.css`, `.js`, `.jsx`, `.ts`, `.tsx`, `.vue`, `.svelte`
283
- - `.py`, `.rb`, `.java`, `.c`, `.cpp`, `.cs`, `.go`, `.rs`
284
- - `.json`, `.xml`, `.yaml`, `.yml`, `.toml`, `.ini`, `.conf`
285
- - `.md`, `.txt`, `.rst`, `.twig`, `.blade`, `.handlebars`, `.mustache`, `.ejs`
286
- - `.sql`, `.graphql`, `.gql`, `.tf`
284
+ - **Web & General:** `.php`, `.html`, `.css`, `.js`, `.jsx`, `.ts`, `.tsx`, `.vue`, `.svelte`
285
+ - **Backend & Systems:** `.py`, `.rb`, `.java`, `.c`, `.cpp`, `.cs`, `.go`, `.rs`
286
+ - **Config & Data:** `.json`, `.xml`, `.yaml`, `.yml`, `.toml`, `.ini`, `.conf`
287
+ - **Docs & Templates:** `.md`, `.txt`, `.rst`, `.twig`, `.blade`, `.handlebars`, `.mustache`, `.ejs`
288
+ - **Database & IaC:** `.sql`, `.graphql`, `.gql`, `.tf`
289
+ - **Mobile (Flutter, Android, iOS):** `.dart`, `.arb`, `.gradle`, `.properties`, `.plist`, `.xcconfig`
290
+ - **Scripts:** `.sh`, `.bat`
287
291
 
288
292
  </details>
289
293
 
@@ -291,7 +295,7 @@ The tool uses a set of rules to determine which files and folders to include in
291
295
 
292
296
  ## 🤔 Troubleshooting
293
297
 
294
- - **Problem:** After installation, I run `code-extractor` and my terminal says `command not found`.
298
+ - **Problem:** After installation, I run `codebase-extractor` and my terminal says `command not found`.
295
299
  - **Solution:** This is usually a `PATH` issue. It means your system's shell doesn't know where to find the installed script. The `pip install --user` command sometimes requires you to add a local scripts directory to your `PATH`. Please refer to your operating system's documentation for instructions on how to modify your `PATH` environment variable.
296
300
 
297
301
  - **Problem:** The tool ran, but a specific folder or file I expected to see is missing from the output.
@@ -0,0 +1,2 @@
1
+ [console_scripts]
2
+ codebase-extractor = codebase_extractor.main_logic:main
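Note on the hunk above: the console script is renamed from `code-extractor` to `codebase-extractor` while the target stays `codebase_extractor.main_logic:main`, so shell scripts or aliases that still call the old name will presumably stop resolving after an upgrade. The sketch below parses the same mapping with the standard library to show which module and function the command resolves to. It constructs the entry point by hand and does not require the package to be installed.

```python
from importlib.metadata import EntryPoint

# The same mapping the new entry_points.txt declares.
ep = EntryPoint(
    name="codebase-extractor",
    value="codebase_extractor.main_logic:main",
    group="console_scripts",
)

# The part before ':' is the module to import, the part after is the callable.
module_name, function_name = ep.module, ep.attr
```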
@@ -1 +0,0 @@
1
- __version__ = "1.1.0"
@@ -1,2 +0,0 @@
1
- [console_scripts]
2
- code-extractor = codebase_extractor.main_logic:main