codebase-extractor 1.1.0__tar.gz → 1.1.1__tar.gz
- {codebase_extractor-1.1.0/src/codebase_extractor.egg-info → codebase_extractor-1.1.1}/PKG-INFO +23 -23
- {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/README.md +22 -22
- {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/pyproject.toml +1 -1
- codebase_extractor-1.1.1/src/codebase_extractor/__init__.py +1 -0
- {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor/cli.py +3 -2
- {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor/file_handler.py +60 -20
- {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor/main_logic.py +30 -10
- {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor/ui.py +10 -22
- {codebase_extractor-1.1.0 → codebase_extractor-1.1.1/src/codebase_extractor.egg-info}/PKG-INFO +23 -23
- codebase_extractor-1.1.0/src/codebase_extractor/__init__.py +0 -1
- {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/LICENCE +0 -0
- {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/setup.cfg +0 -0
- {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor/config.py +0 -0
- {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor.egg-info/SOURCES.txt +0 -0
- {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor.egg-info/dependency_links.txt +0 -0
- {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor.egg-info/entry_points.txt +0 -0
- {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor.egg-info/requires.txt +0 -0
- {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor.egg-info/top_level.txt +0 -0
{codebase_extractor-1.1.0/src/codebase_extractor.egg-info → codebase_extractor-1.1.1}/PKG-INFO
RENAMED
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: codebase-extractor
-Version: 1.1.
+Version: 1.1.1
 Summary: A CLI tool to extract project source code into structured Markdown files for LLM & AI context.
 Author: Lukasz Lekowski
 Project-URL: Homepage, https://github.com/lukaszlekowski/codebase-extractor
@@ -75,12 +75,12 @@ The tool is highly configurable, allowing you to select specific folders, exclud
 ## ✨ Key Features

 - **Interactive & User-Friendly:** A guided, multi-step CLI experience that makes selecting options simple and clear.
+- **Quick Start by Default:** The tool starts without delay. Detailed instructions are available via an `--instructions` flag when you need a reminder.
 - **Smart Filtering:** Automatically excludes common dependency folders, build artifacts, version control directories, and IDE configuration files. The exact filters are configurable.
 - **Flexible Selection Modes:** Choose to extract the entire project with one command, or dive into a specific selection mode.
-- **🌳
+- **🌳 Visual Tree Selection:** Interactively browse and select specific sub-folders from a clear, pipe-based tree structure.
 - **🔢 Configurable Scan Depth:** You decide how many levels deep the script should look for folders when building the selection tree.
-- **YAML Metadata:** Each generated Markdown file is prepended with a YAML front matter block containing useful metadata like a unique run ID, timestamp,
-- **🚀 Quick Start Mode:** Use the `--no-instructions` flag to skip the detailed intro guide on subsequent runs.
+- **Rich YAML Metadata:** Each generated Markdown file is prepended with a YAML front matter block containing useful metadata like a unique run ID, timestamp, file count, character count, and word count.
 - **Safe & Robust:** Features graceful exit handling (`Ctrl+C`) and provides clear feedback during the extraction process.

 ---
@@ -168,12 +168,12 @@ Once installed, you can run the tool from any terminal window. Navigate to your
 code-extractor
 ```

-The script will
+The script will launch immediately and guide you through the extraction process.

-For
+For a detailed guide on how the script works, you can use the `--instructions` flag:

 ```bash
-code-extractor --
+code-extractor --instructions
 ```

 ### The Process
@@ -193,25 +193,25 @@ The tool will guide you through a series of prompts:

 ### Output Details

-All output files are saved in a `CODEBASE_EXTRACTS` directory within your project folder. Each generated Markdown file includes a YAML metadata header with a unique reference ID, timestamp, and
+All output files are saved in a `CODEBASE_EXTRACTS` directory within your project folder. Each generated Markdown file includes a YAML metadata header with a unique reference ID, timestamp, file count, character count, and word count for easy tracking and parsing.

 ### ⚡ CLI Command Reference

 For non-interactive use and automation, you can control the script entirely with these arguments.

-| Argument
-|
-|
-| `--root <path>`
-| `--output-dir <name>`
-| `--dry-run`
-| `-v`, `--verbose`
-| `--log-file <path>`
-| `--exclude-large-files`
-| `--mode <mode>`
-| `--depth <number>`
-| `--select-folders <list>`
-| `--select-root`
+| Argument | Description | Default Value |
+| :------------------------ | :--------------------------------------------------------------------------- | :-------------------------- |
+| `--instructions` | Show the detailed instruction guide on startup. | `False` |
+| `--root <path>` | The root directory of the project to extract. | The current directory |
+| `--output-dir <name>` | Custom name for the output directory. | `CODEBASE_EXTRACTS` |
+| `--dry-run` | Simulate the extraction process without writing any files. | `False` |
+| `-v`, `--verbose` | Enable verbose logging for debugging. | `False` |
+| `--log-file <path>` | Path to save the log file. | `None` |
+| `--exclude-large-files` | Non-interactive: Exclude files larger than 1MB. | `False` |
+| `--mode <mode>` | Non-interactive: Set the extraction mode. Choices: `everything`, `specific`. | `None` (Interactive prompt) |
+| `--depth <number>` | Non-interactive: Set the folder scan depth for 'specific' mode. | `3` |
+| `--select-folders <list>` | Non-interactive: A space-separated list of folders/sub-folders to extract. | `[]` |
+| `--select-root` | Non-interactive: Include files from the root directory in the extraction. | `False` |

 ---

@@ -224,7 +224,7 @@ Here are a few practical examples of how to use the tool from your command line.
 A common command for quick, automated runs.

 ```bash
-code-extractor --
+code-extractor --mode everything
 ```

 - #### Extract specific sub-folders non-interactively
@@ -232,7 +232,7 @@ Here are a few practical examples of how to use the tool from your command line.
 This command extracts only the `src/components` and `src/hooks` directories, plus any files in the root.

 ```bash
-code-extractor --
+code-extractor --mode specific --select-folders src/components src/hooks --select-root
 ```

 - #### Perform a safe dry run
{codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/README.md
@@ -55,12 +55,12 @@ The tool is highly configurable, allowing you to select specific folders, exclud
 ## ✨ Key Features

 - **Interactive & User-Friendly:** A guided, multi-step CLI experience that makes selecting options simple and clear.
+- **Quick Start by Default:** The tool starts without delay. Detailed instructions are available via an `--instructions` flag when you need a reminder.
 - **Smart Filtering:** Automatically excludes common dependency folders, build artifacts, version control directories, and IDE configuration files. The exact filters are configurable.
 - **Flexible Selection Modes:** Choose to extract the entire project with one command, or dive into a specific selection mode.
-- **🌳
+- **🌳 Visual Tree Selection:** Interactively browse and select specific sub-folders from a clear, pipe-based tree structure.
 - **🔢 Configurable Scan Depth:** You decide how many levels deep the script should look for folders when building the selection tree.
-- **YAML Metadata:** Each generated Markdown file is prepended with a YAML front matter block containing useful metadata like a unique run ID, timestamp,
-- **🚀 Quick Start Mode:** Use the `--no-instructions` flag to skip the detailed intro guide on subsequent runs.
+- **Rich YAML Metadata:** Each generated Markdown file is prepended with a YAML front matter block containing useful metadata like a unique run ID, timestamp, file count, character count, and word count.
 - **Safe & Robust:** Features graceful exit handling (`Ctrl+C`) and provides clear feedback during the extraction process.

 ---
@@ -148,12 +148,12 @@ Once installed, you can run the tool from any terminal window. Navigate to your
 code-extractor
 ```

-The script will
+The script will launch immediately and guide you through the extraction process.

-For
+For a detailed guide on how the script works, you can use the `--instructions` flag:

 ```bash
-code-extractor --
+code-extractor --instructions
 ```

 ### The Process
@@ -173,25 +173,25 @@ The tool will guide you through a series of prompts:

 ### Output Details

-All output files are saved in a `CODEBASE_EXTRACTS` directory within your project folder. Each generated Markdown file includes a YAML metadata header with a unique reference ID, timestamp, and
+All output files are saved in a `CODEBASE_EXTRACTS` directory within your project folder. Each generated Markdown file includes a YAML metadata header with a unique reference ID, timestamp, file count, character count, and word count for easy tracking and parsing.

 ### ⚡ CLI Command Reference

 For non-interactive use and automation, you can control the script entirely with these arguments.

-| Argument
-|
-|
-| `--root <path>`
-| `--output-dir <name>`
-| `--dry-run`
-| `-v`, `--verbose`
-| `--log-file <path>`
-| `--exclude-large-files`
-| `--mode <mode>`
-| `--depth <number>`
-| `--select-folders <list>`
-| `--select-root`
+| Argument | Description | Default Value |
+| :------------------------ | :--------------------------------------------------------------------------- | :-------------------------- |
+| `--instructions` | Show the detailed instruction guide on startup. | `False` |
+| `--root <path>` | The root directory of the project to extract. | The current directory |
+| `--output-dir <name>` | Custom name for the output directory. | `CODEBASE_EXTRACTS` |
+| `--dry-run` | Simulate the extraction process without writing any files. | `False` |
+| `-v`, `--verbose` | Enable verbose logging for debugging. | `False` |
+| `--log-file <path>` | Path to save the log file. | `None` |
+| `--exclude-large-files` | Non-interactive: Exclude files larger than 1MB. | `False` |
+| `--mode <mode>` | Non-interactive: Set the extraction mode. Choices: `everything`, `specific`. | `None` (Interactive prompt) |
+| `--depth <number>` | Non-interactive: Set the folder scan depth for 'specific' mode. | `3` |
+| `--select-folders <list>` | Non-interactive: A space-separated list of folders/sub-folders to extract. | `[]` |
+| `--select-root` | Non-interactive: Include files from the root directory in the extraction. | `False` |

 ---

@@ -204,7 +204,7 @@ Here are a few practical examples of how to use the tool from your command line.
 A common command for quick, automated runs.

 ```bash
-code-extractor --
+code-extractor --mode everything
 ```

 - #### Extract specific sub-folders non-interactively
@@ -212,7 +212,7 @@ Here are a few practical examples of how to use the tool from your command line.
 This command extracts only the `src/components` and `src/hooks` directories, plus any files in the root.

 ```bash
-code-extractor --
+code-extractor --mode specific --select-folders src/components src/hooks --select-root
 ```

 - #### Perform a safe dry run
codebase_extractor-1.1.1/src/codebase_extractor/__init__.py
@@ -0,0 +1 @@
+__version__ = "1.1.1"
{codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor/cli.py
@@ -10,9 +10,10 @@ def parse_arguments():

     # General Flags
     parser.add_argument(
-        '
+        '--instructions',
         action='store_true',
-
+        default=False, # ADDED: This ensures the attribute always exists.
+        help="Show the detailed instruction guide on startup."
     )
     parser.add_argument(
         '--root',
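The behaviour of the new opt-in flag can be sketched in isolation. This is a minimal illustration that assumes only the standard-library `argparse` module; the `build_parser` helper is hypothetical and is not part of the package.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with an opt-in --instructions flag (illustrative only)."""
    parser = argparse.ArgumentParser(prog="code-extractor")
    parser.add_argument(
        "--instructions",
        action="store_true",  # the attribute is False unless the flag is passed
        default=False,
        help="Show the detailed instruction guide on startup.",
    )
    return parser


if __name__ == "__main__":
    parser = build_parser()
    print(parser.parse_args([]).instructions)                  # False
    print(parser.parse_args(["--instructions"]).instructions)  # True
```

Note that `action="store_true"` already implies a default of `False`, so the explicit `default=False` is redundant but harmless.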
{codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor/file_handler.py
RENAMED
@@ -10,29 +10,46 @@ import questionary


 def get_folder_choices(root_path: Path, max_depth: int) -> list:
-    """Recursively finds folders up to a max depth and prepares them for questionary."""
+    """Recursively finds folders up to a max depth and prepares them for questionary with a visual tree."""
     choices = []
-
-    def scanner(current_path: Path, depth: int):
+
+    def scanner(current_path: Path, prefix: str, depth: int):
+        """A recursive helper to build the folder tree."""
+        # Stop scanning if the maximum depth is reached
         if depth > max_depth:
             return

-        relative_path = current_path.relative_to(root_path)
-        prefix = " " * (depth - 1)
-        display_name = f"{prefix}{current_path.name}"
-        choices.append(questionary.Choice(title=display_name, value=relative_path))
-
         try:
-
-
-
+            # Get a sorted list of valid subdirectories
+            subdirs = sorted([
+                p for p in current_path.iterdir()
+                if p.is_dir() and p.name not in config.EXCLUDED_DIRS
+            ])
+
+            # Iterate through the subdirectories to build the tree display
+            for i, subdir in enumerate(subdirs):
+                is_last = (i == len(subdirs) - 1)
+
+                # Use '└─' for the last item and '├─' for others
+                connector = "└─ " if is_last else "├─ "
+                display_name = f"{prefix}{connector}{subdir.name}"
+
+                relative_path = subdir.relative_to(root_path)
+                choices.append(questionary.Choice(title=display_name, value=relative_path))
+
+                # Prepare the prefix for the next level of recursion
+                # Use a blank prefix for children of the last item, and a pipe for others
+                child_prefix = prefix + (" " if is_last else "│ ")
+                scanner(subdir, child_prefix, depth + 1)
+
         except PermissionError:
+            # Silently ignore directories that the user doesn't have permission to read
             pass

-
-
-        scanner(folder, 1)
+    # Start the recursive scan from the project's root directory
+    scanner(root_path, prefix="", depth=1)

+    # Add the special option to select files in the root folder itself
     root_option_name = f"root [{root_path.name}] (files in root folder only, excl. sub-folders)"
     choices.insert(0, questionary.Choice(title=root_option_name, value="ROOT_SENTINEL"))

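The prefix-passing recursion in the hunk above can be tried without `questionary`. The following is a minimal sketch rather than the package's code: `build_tree`, `demo`, and the `EXCLUDED_DIRS` set are hypothetical stand-ins, plain tuples replace `questionary.Choice`, and the three-character padding widths are an assumption chosen to line up with the connectors.

```python
import tempfile
from pathlib import Path

# Assumption: a small stand-in for the package's config.EXCLUDED_DIRS.
EXCLUDED_DIRS = {".git", "node_modules"}


def build_tree(root: Path, max_depth: int) -> list:
    """Return (display_name, relative_path) pairs for the sub-folders of root."""
    rows = []

    def scan(current: Path, prefix: str, depth: int) -> None:
        if depth > max_depth:
            return
        try:
            subdirs = sorted(
                p for p in current.iterdir()
                if p.is_dir() and p.name not in EXCLUDED_DIRS
            )
        except PermissionError:
            return  # skip directories that cannot be read
        for i, subdir in enumerate(subdirs):
            is_last = i == len(subdirs) - 1
            connector = "└─ " if is_last else "├─ "
            rows.append((f"{prefix}{connector}{subdir.name}", subdir.relative_to(root)))
            # Children of the last entry get blank padding; others keep a pipe.
            scan(subdir, prefix + ("   " if is_last else "│  "), depth + 1)

    scan(root, "", 1)
    return rows


def demo() -> list:
    """Build a throwaway folder layout and return the rendered tree lines."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for rel in ("src/components", "src/hooks", "docs", ".git"):
            (root / rel).mkdir(parents=True)
        return [title for title, _ in build_tree(root, max_depth=2)]


if __name__ == "__main__":
    print("\n".join(demo()))
```

Because each call receives the prefix its parent computed, a row never needs to know its absolute depth to draw the correct pipes; the depth counter is only used to stop the scan.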
@@ -56,8 +73,13 @@ def is_allowed_file(path: Path, exclude_large: bool) -> bool:
     return True


-def extract_code_from_folder(folder: Path, exclude_large: bool) -> (str, int):
-    """
+def extract_code_from_folder(folder: Path, exclude_large: bool) -> (str, int, int, int):
+    """
+    Extracts code from a given folder, respecting EXCLUDED_DIRS at all depths.
+
+    Returns:
+        A tuple containing the content string, file count, char count, and word count.
+    """
     content = f"# Folder: {folder.relative_to(Path.cwd())}\n\n"
     extracted_files = 0
     dirs_to_visit = [folder]
@@ -80,11 +102,21 @@ def extract_code_from_folder(folder: Path, exclude_large: bool) -> (str, int):
             content += f"\n\n"
     if extracted_files > config.FILE_COUNT_WARNING_THRESHOLD:
         logging.warning(colored(f"> Caution: Large file count in '{folder.name}' ({extracted_files} files).", "yellow"))
-
+
+    # ADDED: Calculate character and word counts
+    char_count = len(content)
+    word_count = len(content.split())
+
+    return content, extracted_files, char_count, word_count


-def extract_code_from_root(root_path: Path, exclude_large: bool) -> (str, int):
-    """
+def extract_code_from_root(root_path: Path, exclude_large: bool) -> (str, int, int, int):
+    """
+    Extracts code only from files present in the root directory.
+
+    Returns:
+        A tuple containing the content string, file count, char count, and word count.
+    """
     content = f"# Root Files: {root_path.name}\n\n"
     extracted_files = 0
     for filepath in sorted(root_path.iterdir()):
@@ -97,7 +129,12 @@ def extract_code_from_root(root_path: Path, exclude_large: bool) -> (str, int):
             extracted_files += 1
     if extracted_files > config.FILE_COUNT_WARNING_THRESHOLD:
         logging.warning(colored(f"> Caution: Large file count in root ({extracted_files} files).", "yellow"))
-
+
+    # ADDED: Calculate character and word counts
+    char_count = len(content)
+    word_count = len(content.split())
+
+    return content, extracted_files, char_count, word_count


 def write_to_markdown_file(content: str, metadata: dict, root_path: Path, output_dir_name: str):
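The two metrics added in these hunks are plain string operations. A minimal sketch of what they measure follows; the `count_metrics` helper and the sample text are hypothetical and only illustrate the `len(content)` and `len(content.split())` calls shown above.

```python
def count_metrics(content: str) -> tuple:
    """Return (char_count, word_count) for a block of extracted text."""
    char_count = len(content)          # counts Unicode code points, not bytes
    word_count = len(content.split())  # splits on any run of whitespace
    return char_count, word_count


sample = "# Folder: src\n\nprint('hello world')\n"
print(count_metrics(sample))  # (36, 5)
```

Since the counts are taken over the whole `content` string, they include the generated Markdown headings as well as the source files themselves, and "words" here means whitespace-separated tokens such as `print('hello`.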
@@ -116,12 +153,15 @@ def write_to_markdown_file(content: str, metadata: dict, root_path: Path, output
     filename = f"{file_base_name}_{timestamp}.md"
     full_filepath = output_dir / filename

+    # CHANGED: Added char_count and word_count to the YAML header
     yaml_header = f"""---
 extraction_details:
   reference: {metadata['run_ref']}
   timestamp_utc: "{metadata['run_timestamp']}"
   source_folder: "{metadata['folder_name']}"
   file_count: {metadata['file_count']}
+  char_count: {metadata['char_count']}
+  word_count: {metadata['word_count']}
 tool_details:
   name: "Codebase Extractor"
   version: "{__version__}"
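To see how the extended header might render, here is a small sketch under stated assumptions: `build_yaml_header` is a hypothetical helper, the two-space indentation and the closing `---` line are assumptions (the hunk does not show the end of the template), and the sample metadata values are invented for illustration.

```python
def build_yaml_header(metadata: dict, version: str) -> str:
    """Render a YAML front matter block from a metadata dictionary."""
    lines = [
        "---",
        "extraction_details:",
        f"  reference: {metadata['run_ref']}",
        f"  timestamp_utc: \"{metadata['run_timestamp']}\"",
        f"  source_folder: \"{metadata['folder_name']}\"",
        f"  file_count: {metadata['file_count']}",
        f"  char_count: {metadata['char_count']}",
        f"  word_count: {metadata['word_count']}",
        "tool_details:",
        "  name: \"Codebase Extractor\"",
        f"  version: \"{version}\"",
        "---",  # assumption: the block is closed with a second '---' line
    ]
    return "\n".join(lines) + "\n"


# Hypothetical sample values, not taken from a real run.
sample_metadata = {
    "run_ref": "abc123",
    "run_timestamp": "2025-01-01T00:00:00Z",
    "folder_name": "src/components",
    "file_count": 3,
    "char_count": 36,
    "word_count": 5,
}

if __name__ == "__main__":
    print(build_yaml_header(sample_metadata, "1.1.1"))
```

Every key read by the template must be present in the dictionary, which is why the caller has to start passing `char_count` and `word_count` once the template references them.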
{codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor/main_logic.py
@@ -79,14 +79,14 @@ def main():
     # --- Startup Sequence ---
     if not is_fully_automated:
         ui.clear_screen()
-
-
+        # CHANGED: Pass the new 'instructions' flag to the banner function
+        ui.print_banner(show_instructions=args.instructions)
+        # CHANGED: Logic is now inverted to show instructions only when the flag is present
+        if args.instructions:
             ui.show_instructions(output_dir_name)
-        else:
-            input(colored("\nPress Enter to begin...", "green"))
-            ui.clear_screen()
     else:
-
+        # NOTE: For automated runs, the banner is always minimal. This is correct.
+        ui.print_banner(show_instructions=False)

     # --- Collect Settings (Interactively or from Args) ---
     select_style = Style([('qmark', 'fg:#FFA500'), ('pointer', 'fg:#FFA500'), ('highlighted', 'fg:black bg:#FFA500'), ('selected', 'fg:black bg:#FFA500')])
@@ -148,13 +148,23 @@ def main():
         for folder_path in sorted(list(folders_to_process)):
             with Halo(text=f"Extracting {folder_path.relative_to(root_path)}...", spinner="dots"):
                 time.sleep(0.1)
-
+                # CHANGED: Unpack the new char_count and word_count values
+                folder_md, folder_count, char_count, word_count = file_handler.extract_code_from_folder(folder_path, exclude_large)

             if folder_count > 0:
-
+                # CHANGED: Add new metrics to the metadata dictionary
+                metadata = {
+                    "run_ref": run_ref,
+                    "run_timestamp": run_timestamp,
+                    "folder_name": str(folder_path.relative_to(root_path)),
+                    "file_count": folder_count,
+                    "char_count": char_count,
+                    "word_count": word_count
+                }
                 if not args.dry_run:
                     file_handler.write_to_markdown_file(folder_md, metadata, root_path, output_dir_name)
                 logging.info(f"✅ Extracted {folder_count} file(s) from: {folder_path.relative_to(root_path)}")
+                logging.info(f"📜 {char_count:,} character(s), {word_count:,} word(s)")
                 if args.dry_run: logging.info(colored(" (Dry Run: No file written)", "yellow"))
                 total_files_extracted += folder_count
             else:
@@ -165,14 +175,24 @@ def main():
         root_display_name = f"root [{root_path.name}] (files in root folder only, excl. sub-folders)"
         with Halo(text=f"Extracting {root_display_name}...", spinner="dots"):
             time.sleep(0.1)
-
+            # CHANGED: Unpack the new char_count and word_count values
+            root_md, root_count, char_count, word_count = file_handler.extract_code_from_root(root_path, exclude_large)

         if root_count > 0:
-
+            # CHANGED: Add new metrics to the metadata dictionary
+            metadata = {
+                "run_ref": run_ref,
+                "run_timestamp": run_timestamp,
+                "folder_name": root_display_name,
+                "file_count": root_count,
+                "char_count": char_count,
+                "word_count": word_count
+            }
             if not args.dry_run:
                 file_handler.write_to_markdown_file(root_md, metadata, root_path, output_dir_name)
             total_files_extracted += root_count
             logging.info(f"✅ Extracted {root_count} file(s) from the root directory")
+            logging.info(f"📜 {char_count:,} character(s), {word_count:,} word(s)")
             if args.dry_run: logging.info(colored(" (Dry Run: No file written)", "yellow"))
         else:
             logging.warning("‼️ No extractable files in the root directory")
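The new log lines rely on the `,` option of Python's format specification to group digits. A minimal sketch of that formatting, using a hypothetical `format_metrics` helper and made-up numbers, with the emoji omitted:

```python
def format_metrics(char_count: int, word_count: int) -> str:
    """Format the two counts with thousands separators, as in the new log line."""
    return f"{char_count:,} character(s), {word_count:,} word(s)"


print(format_metrics(1234567, 89012))  # 1,234,567 character(s), 89,012 word(s)
```

The `,` option always uses a comma as the separator regardless of locale, so the log output stays the same across machines.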
{codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor/ui.py
@@ -16,26 +16,16 @@ LOGO_LARGE = """
 """

 LOGO_SMALL = """
-
-
-
-██║ ██║ ██║██║ ██║██╔══╝ ██╔══██╗██╔══██║╚════██║██╔══╝
-╚██████╗╚██████╔╝██████╔╝███████╗██████╔╝██║ ██║███████║███████╗
-╚═════╝ ╚═════╝ ╚═════╝ ╚══════╝╚═════╝ ╚═╝ ╚═╝╚══════╝╚══════╝
-
-███████╗██╗ ██╗████████╗██████╗ █████╗ ██████╗████████╗ ██████╗ ██████╗
-██╔════╝╚██╗██╔╝╚══██╔══╝██╔══██╗██╔══██╗██╔════╝╚══██╔══╝██╔═══██╗██╔══██╗
-█████╗ ╚███╔╝ ██║ ██████╔╝███████║██║ ██║ ██║ ██║██████╔╝
-██╔══╝ ██╔██╗ ██║ ██╔══██╗██╔══██║██║ ██║ ██║ ██║██╔══██╗
-███████╗██╔╝ ██╗ ██║ ██║ ██║██║ ██║╚██████╗ ██║ ╚██████╔╝██║ ██║
-╚══════╝╚═╝ ╚═╝ ╚═╝ ╚═╝ ╚═╝╚═╝ ╚═╝ ╚═════╝ ╚═╝ ╚═════╝ ╚═╝ ╚═╝
+░█▀▀░█▀█░█▀▄░█▀▀░█▀▄░█▀█░█▀▀░█▀▀░░░█▀▀░█░█░▀█▀░█▀▄░█▀█░█▀▀░▀█▀░█▀█░█▀▄
+░█░░░█░█░█░█░█▀▀░█▀▄░█▀█░▀▀█░█▀▀░░░█▀▀░▄▀▄░░█░░█▀▄░█▀█░█░░░░█░░█░█░█▀▄
+░▀▀▀░▀▀▀░▀▀░░▀▀▀░▀▀░░▀░▀░▀▀▀░▀▀▀░░░▀▀▀░▀░▀░░▀░░▀░▀░▀░▀░▀▀▀░░▀░░▀▀▀░▀░▀
 """

 def clear_screen():
     """Clears the terminal screen."""
     os.system('cls' if os.name == 'nt' else 'clear')

-def print_banner(
+def print_banner(show_instructions: bool = False): # CHANGED: Parameter name for clarity
     """Prints a banner that adjusts to the terminal width."""
     try:
         width = shutil.get_terminal_size((80, 20)).columns
@@ -49,10 +39,9 @@ def print_banner(no_instructions: bool = False):

     # Use the imported __version__ variable instead of config.SCRIPT_VERSION
     print(colored(f" Welcome to Code Extractor v{__version__} by Lukasz Lekowski ".center(width, "="), "white", "on_magenta"))
-
-
-
-    print("It's ideal for providing context to AI models, archiving projects, or generating documentation.")
+    print("\nThis tool consolidates your project's code into structured Markdown files.")
+    print("It's ideal for providing context to AI models, archiving projects, or generating documentation.\n")
+

 def show_instructions(output_dir_name: str):
     """Clears screen and shows detailed instructions, pausing for user input."""
@@ -85,7 +74,8 @@ def show_instructions(output_dir_name: str):
     print(colored("--- Output Details ---", "yellow"))
     print(f"All extracted content is saved into the '{output_dir_name}' directory. Each Markdown file generated will contain a YAML metadata header at the top with a unique reference ID, a timestamp, and more.\n")

-
+    # CHANGED: Updated the tip to reflect the new '--instructions' flag
+    tip = "TIP: To see this guide again, run the script with the --instructions flag."
     print(colored(tip, "black", "on_yellow"))

     input(colored("\nReady? Press Enter to begin...", "green"))
@@ -105,6 +95,4 @@ def print_footer():
     print("💡 Love this tool? Found a bug? Share your feedback on GitHub:")
     print(config.GITHUB_URL + "\n")
     print("🤝 Connect with the author on LinkedIn:")
-    print(config.LINKEDIN_URL + "\n")
-    print("☕ Enjoying this tool? You can support its development with a coffee!")
-    print("https://www.buymeacoffee.com/lukaszlekowski\n")
+    print(config.LINKEDIN_URL + "\n")
{codebase_extractor-1.1.0 → codebase_extractor-1.1.1/src/codebase_extractor.egg-info}/PKG-INFO
RENAMED
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: codebase-extractor
-Version: 1.1.
+Version: 1.1.1
 Summary: A CLI tool to extract project source code into structured Markdown files for LLM & AI context.
 Author: Lukasz Lekowski
 Project-URL: Homepage, https://github.com/lukaszlekowski/codebase-extractor
@@ -75,12 +75,12 @@ The tool is highly configurable, allowing you to select specific folders, exclud
 ## ✨ Key Features

 - **Interactive & User-Friendly:** A guided, multi-step CLI experience that makes selecting options simple and clear.
+- **Quick Start by Default:** The tool starts without delay. Detailed instructions are available via an `--instructions` flag when you need a reminder.
 - **Smart Filtering:** Automatically excludes common dependency folders, build artifacts, version control directories, and IDE configuration files. The exact filters are configurable.
 - **Flexible Selection Modes:** Choose to extract the entire project with one command, or dive into a specific selection mode.
-- **🌳
+- **🌳 Visual Tree Selection:** Interactively browse and select specific sub-folders from a clear, pipe-based tree structure.
 - **🔢 Configurable Scan Depth:** You decide how many levels deep the script should look for folders when building the selection tree.
-- **YAML Metadata:** Each generated Markdown file is prepended with a YAML front matter block containing useful metadata like a unique run ID, timestamp,
-- **🚀 Quick Start Mode:** Use the `--no-instructions` flag to skip the detailed intro guide on subsequent runs.
+- **Rich YAML Metadata:** Each generated Markdown file is prepended with a YAML front matter block containing useful metadata like a unique run ID, timestamp, file count, character count, and word count.
 - **Safe & Robust:** Features graceful exit handling (`Ctrl+C`) and provides clear feedback during the extraction process.

 ---
@@ -168,12 +168,12 @@ Once installed, you can run the tool from any terminal window. Navigate to your
 code-extractor
 ```

-The script will
+The script will launch immediately and guide you through the extraction process.

-For
+For a detailed guide on how the script works, you can use the `--instructions` flag:

 ```bash
-code-extractor --
+code-extractor --instructions
 ```

 ### The Process
@@ -193,25 +193,25 @@ The tool will guide you through a series of prompts:

 ### Output Details

-All output files are saved in a `CODEBASE_EXTRACTS` directory within your project folder. Each generated Markdown file includes a YAML metadata header with a unique reference ID, timestamp, and
+All output files are saved in a `CODEBASE_EXTRACTS` directory within your project folder. Each generated Markdown file includes a YAML metadata header with a unique reference ID, timestamp, file count, character count, and word count for easy tracking and parsing.

 ### ⚡ CLI Command Reference

 For non-interactive use and automation, you can control the script entirely with these arguments.

-| Argument
-|
-|
-| `--root <path>`
-| `--output-dir <name>`
-| `--dry-run`
-| `-v`, `--verbose`
-| `--log-file <path>`
-| `--exclude-large-files`
-| `--mode <mode>`
-| `--depth <number>`
-| `--select-folders <list>`
-| `--select-root`
+| Argument | Description | Default Value |
+| :------------------------ | :--------------------------------------------------------------------------- | :-------------------------- |
+| `--instructions` | Show the detailed instruction guide on startup. | `False` |
+| `--root <path>` | The root directory of the project to extract. | The current directory |
+| `--output-dir <name>` | Custom name for the output directory. | `CODEBASE_EXTRACTS` |
+| `--dry-run` | Simulate the extraction process without writing any files. | `False` |
+| `-v`, `--verbose` | Enable verbose logging for debugging. | `False` |
+| `--log-file <path>` | Path to save the log file. | `None` |
+| `--exclude-large-files` | Non-interactive: Exclude files larger than 1MB. | `False` |
+| `--mode <mode>` | Non-interactive: Set the extraction mode. Choices: `everything`, `specific`. | `None` (Interactive prompt) |
+| `--depth <number>` | Non-interactive: Set the folder scan depth for 'specific' mode. | `3` |
+| `--select-folders <list>` | Non-interactive: A space-separated list of folders/sub-folders to extract. | `[]` |
+| `--select-root` | Non-interactive: Include files from the root directory in the extraction. | `False` |

 ---

@@ -224,7 +224,7 @@ Here are a few practical examples of how to use the tool from your command line.
 A common command for quick, automated runs.

 ```bash
-code-extractor --
+code-extractor --mode everything
 ```

 - #### Extract specific sub-folders non-interactively
@@ -232,7 +232,7 @@ Here are a few practical examples of how to use the tool from your command line.
 This command extracts only the `src/components` and `src/hooks` directories, plus any files in the root.

 ```bash
-code-extractor --
+code-extractor --mode specific --select-folders src/components src/hooks --select-root
 ```

 - #### Perform a safe dry run
codebase_extractor-1.1.0/src/codebase_extractor/__init__.py
@@ -1 +0,0 @@
-__version__ = "1.1.0"