codebase-extractor 1.1.0__tar.gz → 1.1.1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (18)
  1. {codebase_extractor-1.1.0/src/codebase_extractor.egg-info → codebase_extractor-1.1.1}/PKG-INFO +23 -23
  2. {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/README.md +22 -22
  3. {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/pyproject.toml +1 -1
  4. codebase_extractor-1.1.1/src/codebase_extractor/__init__.py +1 -0
  5. {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor/cli.py +3 -2
  6. {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor/file_handler.py +60 -20
  7. {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor/main_logic.py +30 -10
  8. {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor/ui.py +10 -22
  9. {codebase_extractor-1.1.0 → codebase_extractor-1.1.1/src/codebase_extractor.egg-info}/PKG-INFO +23 -23
  10. codebase_extractor-1.1.0/src/codebase_extractor/__init__.py +0 -1
  11. {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/LICENCE +0 -0
  12. {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/setup.cfg +0 -0
  13. {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor/config.py +0 -0
  14. {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor.egg-info/SOURCES.txt +0 -0
  15. {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor.egg-info/dependency_links.txt +0 -0
  16. {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor.egg-info/entry_points.txt +0 -0
  17. {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor.egg-info/requires.txt +0 -0
  18. {codebase_extractor-1.1.0 → codebase_extractor-1.1.1}/src/codebase_extractor.egg-info/top_level.txt +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: codebase-extractor
- Version: 1.1.0
+ Version: 1.1.1
  Summary: A CLI tool to extract project source code into structured Markdown files for LLM & AI context.
  Author: Lukasz Lekowski
  Project-URL: Homepage, https://github.com/lukaszlekowski/codebase-extractor
@@ -75,12 +75,12 @@ The tool is highly configurable, allowing you to select specific folders, exclud
  ## ✨ Key Features

  - **Interactive & User-Friendly:** A guided, multi-step CLI experience that makes selecting options simple and clear.
+ - **Quick Start by Default:** The tool starts without delay. Detailed instructions are available via an `--instructions` flag when you need a reminder.
  - **Smart Filtering:** Automatically excludes common dependency folders, build artifacts, version control directories, and IDE configuration files. The exact filters are configurable.
  - **Flexible Selection Modes:** Choose to extract the entire project with one command, or dive into a specific selection mode.
- - **🌳 Nested Folder Selection:** Interactively browse and select specific sub-folders from a tree-like view.
+ - **🌳 Visual Tree Selection:** Interactively browse and select specific sub-folders from a clear, pipe-based tree structure.
  - **🔢 Configurable Scan Depth:** You decide how many levels deep the script should look for folders when building the selection tree.
- - **YAML Metadata:** Each generated Markdown file is prepended with a YAML front matter block containing useful metadata like a unique run ID, timestamp, and file count for easy tracking and parsing.
- - **🚀 Quick Start Mode:** Use the `--no-instructions` flag to skip the detailed intro guide on subsequent runs.
+ - **Rich YAML Metadata:** Each generated Markdown file is prepended with a YAML front matter block containing useful metadata like a unique run ID, timestamp, file count, character count, and word count.
  - **Safe & Robust:** Features graceful exit handling (`Ctrl+C`) and provides clear feedback during the extraction process.

  ---
@@ -168,12 +168,12 @@ Once installed, you can run the tool from any terminal window. Navigate to your
  code-extractor
  ```

- The script will then guide you through the extraction process.
+ The script will launch immediately and guide you through the extraction process.

- For repeat usage, you can skip the detailed introductory guide by using the `--no-instructions` or `-ni` flag:
+ For a detailed guide on how the script works, you can use the `--instructions` flag:

  ```bash
- code-extractor --no-instructions
+ code-extractor --instructions
  ```

  ### The Process
@@ -193,25 +193,25 @@ The tool will guide you through a series of prompts:

  ### Output Details

- All output files are saved in a `CODEBASE_EXTRACTS` directory within your project folder. Each generated Markdown file includes a YAML metadata header with a unique reference ID, timestamp, and file count for easy tracking and parsing.
+ All output files are saved in a `CODEBASE_EXTRACTS` directory within your project folder. Each generated Markdown file includes a YAML metadata header with a unique reference ID, timestamp, file count, character count, and word count for easy tracking and parsing.

  ### ⚡ CLI Command Reference

  For non-interactive use and automation, you can control the script entirely with these arguments.

- | Argument | Description | Default Value |
- | :------------------------- | :--------------------------------------------------------------------------- | :-------------------------- |
- | `-ni`, `--no-instructions` | Run the script without printing the detailed instruction banner. | `False` |
- | `--root <path>` | The root directory of the project to extract. | The current directory |
- | `--output-dir <name>` | Custom name for the output directory. | `CODEBASE_EXTRACTS` |
- | `--dry-run` | Simulate the extraction process without writing any files. | `False` |
- | `-v`, `--verbose` | Enable verbose logging for debugging. | `False` |
- | `--log-file <path>` | Path to save the log file. | `None` |
- | `--exclude-large-files` | Non-interactive: Exclude files larger than 1MB. | `False` |
- | `--mode <mode>` | Non-interactive: Set the extraction mode. Choices: `everything`, `specific`. | `None` (Interactive prompt) |
- | `--depth <number>` | Non-interactive: Set the folder scan depth for 'specific' mode. | `3` |
- | `--select-folders <list>` | Non-interactive: A space-separated list of folders/sub-folders to extract. | `[]` |
- | `--select-root` | Non-interactive: Include files from the root directory in the extraction. | `False` |
+ | Argument | Description | Default Value |
+ | :------------------------ | :--------------------------------------------------------------------------- | :-------------------------- |
+ | `--instructions` | Show the detailed instruction guide on startup. | `False` |
+ | `--root <path>` | The root directory of the project to extract. | The current directory |
+ | `--output-dir <name>` | Custom name for the output directory. | `CODEBASE_EXTRACTS` |
+ | `--dry-run` | Simulate the extraction process without writing any files. | `False` |
+ | `-v`, `--verbose` | Enable verbose logging for debugging. | `False` |
+ | `--log-file <path>` | Path to save the log file. | `None` |
+ | `--exclude-large-files` | Non-interactive: Exclude files larger than 1MB. | `False` |
+ | `--mode <mode>` | Non-interactive: Set the extraction mode. Choices: `everything`, `specific`. | `None` (Interactive prompt) |
+ | `--depth <number>` | Non-interactive: Set the folder scan depth for 'specific' mode. | `3` |
+ | `--select-folders <list>` | Non-interactive: A space-separated list of folders/sub-folders to extract. | `[]` |
+ | `--select-root` | Non-interactive: Include files from the root directory in the extraction. | `False` |

  ---

@@ -224,7 +224,7 @@ Here are a few practical examples of how to use the tool from your command line.
  A common command for quick, automated runs.

  ```bash
- code-extractor --no-instructions --mode everything
+ code-extractor --mode everything
  ```

  - #### Extract specific sub-folders non-interactively
@@ -232,7 +232,7 @@ Here are a few practical examples of how to use the tool from your command line.
  This command extracts only the `src/components` and `src/hooks` directories, plus any files in the root.

  ```bash
- code-extractor --ni --mode specific --select-folders src/components src/hooks --select-root
+ code-extractor --mode specific --select-folders src/components src/hooks --select-root
  ```

  - #### Perform a safe dry run
@@ -55,12 +55,12 @@ The tool is highly configurable, allowing you to select specific folders, exclud
  ## ✨ Key Features

  - **Interactive & User-Friendly:** A guided, multi-step CLI experience that makes selecting options simple and clear.
+ - **Quick Start by Default:** The tool starts without delay. Detailed instructions are available via an `--instructions` flag when you need a reminder.
  - **Smart Filtering:** Automatically excludes common dependency folders, build artifacts, version control directories, and IDE configuration files. The exact filters are configurable.
  - **Flexible Selection Modes:** Choose to extract the entire project with one command, or dive into a specific selection mode.
- - **🌳 Nested Folder Selection:** Interactively browse and select specific sub-folders from a tree-like view.
+ - **🌳 Visual Tree Selection:** Interactively browse and select specific sub-folders from a clear, pipe-based tree structure.
  - **🔢 Configurable Scan Depth:** You decide how many levels deep the script should look for folders when building the selection tree.
- - **YAML Metadata:** Each generated Markdown file is prepended with a YAML front matter block containing useful metadata like a unique run ID, timestamp, and file count for easy tracking and parsing.
- - **🚀 Quick Start Mode:** Use the `--no-instructions` flag to skip the detailed intro guide on subsequent runs.
+ - **Rich YAML Metadata:** Each generated Markdown file is prepended with a YAML front matter block containing useful metadata like a unique run ID, timestamp, file count, character count, and word count.
  - **Safe & Robust:** Features graceful exit handling (`Ctrl+C`) and provides clear feedback during the extraction process.

  ---
@@ -148,12 +148,12 @@ Once installed, you can run the tool from any terminal window. Navigate to your
  code-extractor
  ```

- The script will then guide you through the extraction process.
+ The script will launch immediately and guide you through the extraction process.

- For repeat usage, you can skip the detailed introductory guide by using the `--no-instructions` or `-ni` flag:
+ For a detailed guide on how the script works, you can use the `--instructions` flag:

  ```bash
- code-extractor --no-instructions
+ code-extractor --instructions
  ```

  ### The Process
@@ -173,25 +173,25 @@ The tool will guide you through a series of prompts:

  ### Output Details

- All output files are saved in a `CODEBASE_EXTRACTS` directory within your project folder. Each generated Markdown file includes a YAML metadata header with a unique reference ID, timestamp, and file count for easy tracking and parsing.
+ All output files are saved in a `CODEBASE_EXTRACTS` directory within your project folder. Each generated Markdown file includes a YAML metadata header with a unique reference ID, timestamp, file count, character count, and word count for easy tracking and parsing.

  ### ⚡ CLI Command Reference

  For non-interactive use and automation, you can control the script entirely with these arguments.

- | Argument | Description | Default Value |
- | :------------------------- | :--------------------------------------------------------------------------- | :-------------------------- |
- | `-ni`, `--no-instructions` | Run the script without printing the detailed instruction banner. | `False` |
- | `--root <path>` | The root directory of the project to extract. | The current directory |
- | `--output-dir <name>` | Custom name for the output directory. | `CODEBASE_EXTRACTS` |
- | `--dry-run` | Simulate the extraction process without writing any files. | `False` |
- | `-v`, `--verbose` | Enable verbose logging for debugging. | `False` |
- | `--log-file <path>` | Path to save the log file. | `None` |
- | `--exclude-large-files` | Non-interactive: Exclude files larger than 1MB. | `False` |
- | `--mode <mode>` | Non-interactive: Set the extraction mode. Choices: `everything`, `specific`. | `None` (Interactive prompt) |
- | `--depth <number>` | Non-interactive: Set the folder scan depth for 'specific' mode. | `3` |
- | `--select-folders <list>` | Non-interactive: A space-separated list of folders/sub-folders to extract. | `[]` |
- | `--select-root` | Non-interactive: Include files from the root directory in the extraction. | `False` |
+ | Argument | Description | Default Value |
+ | :------------------------ | :--------------------------------------------------------------------------- | :-------------------------- |
+ | `--instructions` | Show the detailed instruction guide on startup. | `False` |
+ | `--root <path>` | The root directory of the project to extract. | The current directory |
+ | `--output-dir <name>` | Custom name for the output directory. | `CODEBASE_EXTRACTS` |
+ | `--dry-run` | Simulate the extraction process without writing any files. | `False` |
+ | `-v`, `--verbose` | Enable verbose logging for debugging. | `False` |
+ | `--log-file <path>` | Path to save the log file. | `None` |
+ | `--exclude-large-files` | Non-interactive: Exclude files larger than 1MB. | `False` |
+ | `--mode <mode>` | Non-interactive: Set the extraction mode. Choices: `everything`, `specific`. | `None` (Interactive prompt) |
+ | `--depth <number>` | Non-interactive: Set the folder scan depth for 'specific' mode. | `3` |
+ | `--select-folders <list>` | Non-interactive: A space-separated list of folders/sub-folders to extract. | `[]` |
+ | `--select-root` | Non-interactive: Include files from the root directory in the extraction. | `False` |

  ---

@@ -204,7 +204,7 @@ Here are a few practical examples of how to use the tool from your command line.
  A common command for quick, automated runs.

  ```bash
- code-extractor --no-instructions --mode everything
+ code-extractor --mode everything
  ```

  - #### Extract specific sub-folders non-interactively
@@ -212,7 +212,7 @@ Here are a few practical examples of how to use the tool from your command line.
  This command extracts only the `src/components` and `src/hooks` directories, plus any files in the root.

  ```bash
- code-extractor --ni --mode specific --select-folders src/components src/hooks --select-root
+ code-extractor --mode specific --select-folders src/components src/hooks --select-root
  ```

  - #### Perform a safe dry run
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

  [project]
  name = "codebase-extractor"
- version = "1.1.0"
+ version = "1.1.1"
  authors = [
      { name="Lukasz Lekowski" },
  ]
@@ -0,0 +1 @@
+ __version__ = "1.1.1"
@@ -10,9 +10,10 @@ def parse_arguments():

      # General Flags
      parser.add_argument(
-         '-ni', '--no-instructions',
+         '--instructions',
          action='store_true',
-         help="Run the script without printing the detailed instruction banner."
+         default=False, # ADDED: This ensures the attribute always exists.
+         help="Show the detailed instruction guide on startup."
      )
      parser.add_argument(
          '--root',
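The flag inversion in this hunk can be tried in isolation. The following is a minimal sketch, not the package's actual `parse_arguments()`; the `build_parser` helper name is hypothetical. It also illustrates that `action='store_true'` already defaults to `False` in `argparse`, so the explicit `default=False` is a readability choice rather than a requirement.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the parser built in cli.py."""
    parser = argparse.ArgumentParser(prog="code-extractor")
    # Opt-in flag: absent means False, present means True.
    parser.add_argument(
        "--instructions",
        action="store_true",
        default=False,  # redundant with store_true, kept to mirror the diff
        help="Show the detailed instruction guide on startup.",
    )
    return parser


if __name__ == "__main__":
    parser = build_parser()
    print(parser.parse_args([]).instructions)
    print(parser.parse_args(["--instructions"]).instructions)
```

With this shape, scripts that still pass the removed `--no-instructions` or `-ni` flag would be rejected by `argparse` as unrecognized arguments.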
@@ -10,29 +10,46 @@ import questionary


  def get_folder_choices(root_path: Path, max_depth: int) -> list:
-     """Recursively finds folders up to a max depth and prepares them for questionary."""
+     """Recursively finds folders up to a max depth and prepares them for questionary with a visual tree."""
      choices = []
-
-     def scanner(current_path: Path, depth: int):
+
+     def scanner(current_path: Path, prefix: str, depth: int):
+         """A recursive helper to build the folder tree."""
+         # Stop scanning if the maximum depth is reached
          if depth > max_depth:
              return

-         relative_path = current_path.relative_to(root_path)
-         prefix = " " * (depth - 1)
-         display_name = f"{prefix}{current_path.name}"
-         choices.append(questionary.Choice(title=display_name, value=relative_path))
-
          try:
-             subdirs = sorted([p for p in current_path.iterdir() if p.is_dir() and p.name not in config.EXCLUDED_DIRS])
-             for subdir in subdirs:
-                 scanner(subdir, depth + 1)
+             # Get a sorted list of valid subdirectories
+             subdirs = sorted([
+                 p for p in current_path.iterdir()
+                 if p.is_dir() and p.name not in config.EXCLUDED_DIRS
+             ])
+
+             # Iterate through the subdirectories to build the tree display
+             for i, subdir in enumerate(subdirs):
+                 is_last = (i == len(subdirs) - 1)
+
+                 # Use '└─' for the last item and '├─' for others
+                 connector = "└─ " if is_last else "├─ "
+                 display_name = f"{prefix}{connector}{subdir.name}"
+
+                 relative_path = subdir.relative_to(root_path)
+                 choices.append(questionary.Choice(title=display_name, value=relative_path))
+
+                 # Prepare the prefix for the next level of recursion
+                 # Use a blank prefix for children of the last item, and a pipe for others
+                 child_prefix = prefix + (" " if is_last else "│ ")
+                 scanner(subdir, child_prefix, depth + 1)
+
          except PermissionError:
+             # Silently ignore directories that the user doesn't have permission to read
              pass

-     top_level_folders = sorted([p for p in root_path.iterdir() if p.is_dir() and p.name not in config.EXCLUDED_DIRS])
-     for folder in top_level_folders:
-         scanner(folder, 1)
+     # Start the recursive scan from the project's root directory
+     scanner(root_path, prefix="", depth=1)

+     # Add the special option to select files in the root folder itself
      root_option_name = f"root [{root_path.name}] (files in root folder only, excl. sub-folders)"
      choices.insert(0, questionary.Choice(title=root_option_name, value="ROOT_SENTINEL"))

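The connector and prefix logic added to `scanner` can be sketched without `questionary`. This is an illustrative reduction, not the package's code: `build_tree_lines` and its `excluded` parameter are hypothetical names, and the three-column prefix widths (`"   "` and `"│  "`) are an assumption, since whitespace inside the string literals is not reliably preserved in this diff view.

```python
from pathlib import Path
import tempfile


def build_tree_lines(root: Path, max_depth: int, excluded: frozenset = frozenset()) -> list:
    """Return (display_name, relative_path) pairs for the sub-folders of root."""
    lines = []

    def scan(current: Path, prefix: str, depth: int) -> None:
        if depth > max_depth:
            return
        try:
            subdirs = sorted(
                p for p in current.iterdir()
                if p.is_dir() and p.name not in excluded
            )
        except PermissionError:
            return  # skip directories that cannot be read
        for i, subdir in enumerate(subdirs):
            is_last = i == len(subdirs) - 1
            connector = "└─ " if is_last else "├─ "
            lines.append((f"{prefix}{connector}{subdir.name}", subdir.relative_to(root)))
            # Assumed widths: three columns per level, so children line up
            # under their parent's connector.
            scan(subdir, prefix + ("   " if is_last else "│  "), depth + 1)

    scan(root, "", 1)
    return lines


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for rel in ("src/components", "src/hooks", "tests"):
            (root / rel).mkdir(parents=True)
        for display, _ in build_tree_lines(root, max_depth=2):
            print(display)
```

Passing the accumulated prefix down the recursion is what lets each line know whether its ancestors were the last entry at their level, which a depth-based indent alone cannot express.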
@@ -56,8 +73,13 @@ def is_allowed_file(path: Path, exclude_large: bool) -> bool:
      return True


- def extract_code_from_folder(folder: Path, exclude_large: bool) -> (str, int):
-     """Extracts code from a given folder, respecting EXCLUDED_DIRS at all depths."""
+ def extract_code_from_folder(folder: Path, exclude_large: bool) -> (str, int, int, int):
+     """
+     Extracts code from a given folder, respecting EXCLUDED_DIRS at all depths.
+
+     Returns:
+         A tuple containing the content string, file count, char count, and word count.
+     """
      content = f"# Folder: {folder.relative_to(Path.cwd())}\n\n"
      extracted_files = 0
      dirs_to_visit = [folder]
@@ -80,11 +102,21 @@ def extract_code_from_folder(folder: Path, exclude_large: bool) -> (str, int):
              content += f"\n\n"
      if extracted_files > config.FILE_COUNT_WARNING_THRESHOLD:
          logging.warning(colored(f"> Caution: Large file count in '{folder.name}' ({extracted_files} files).", "yellow"))
-     return content, extracted_files
+
+     # ADDED: Calculate character and word counts
+     char_count = len(content)
+     word_count = len(content.split())
+
+     return content, extracted_files, char_count, word_count


- def extract_code_from_root(root_path: Path, exclude_large: bool) -> (str, int):
-     """Extracts code only from files present in the root directory."""
+ def extract_code_from_root(root_path: Path, exclude_large: bool) -> (str, int, int, int):
+     """
+     Extracts code only from files present in the root directory.
+
+     Returns:
+         A tuple containing the content string, file count, char count, and word count.
+     """
      content = f"# Root Files: {root_path.name}\n\n"
      extracted_files = 0
      for filepath in sorted(root_path.iterdir()):
@@ -97,7 +129,12 @@ def extract_code_from_root(root_path: Path, exclude_large: bool) -> (str, int):
              extracted_files += 1
      if extracted_files > config.FILE_COUNT_WARNING_THRESHOLD:
          logging.warning(colored(f"> Caution: Large file count in root ({extracted_files} files).", "yellow"))
-     return content, extracted_files
+
+     # ADDED: Calculate character and word counts
+     char_count = len(content)
+     word_count = len(content.split())
+
+     return content, extracted_files, char_count, word_count


  def write_to_markdown_file(content: str, metadata: dict, root_path: Path, output_dir_name: str):
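Both extract functions now compute the same two metrics from the assembled Markdown string. A minimal sketch of that calculation follows; the `count_chars_and_words` helper is a hypothetical name, not something the package defines. It uses `tuple[int, int]` (Python 3.9+) as the return annotation, since a parenthesized form such as `(str, int, int, int)` is a plain tuple of classes that type checkers do not treat as a tuple type.

```python
def count_chars_and_words(content: str) -> tuple[int, int]:
    """Return (character count, whitespace-delimited word count)."""
    # len() counts Unicode code points, not bytes on disk.
    char_count = len(content)
    # split() with no argument splits on any run of whitespace,
    # so blank lines and indentation do not produce empty words.
    word_count = len(content.split())
    return char_count, word_count


if __name__ == "__main__":
    sample = "# Folder: src\n\nprint('hello world')\n"
    print(count_chars_and_words(sample))  # (36, 5)
```

Note that the counts cover the whole generated document, including the `# Folder:` heading and any fence markers, not only the source files' own text.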
@@ -116,12 +153,15 @@ def write_to_markdown_file(content: str, metadata: dict, root_path: Path, output
      filename = f"{file_base_name}_{timestamp}.md"
      full_filepath = output_dir / filename

+     # CHANGED: Added char_count and word_count to the YAML header
      yaml_header = f"""---
  extraction_details:
    reference: {metadata['run_ref']}
    timestamp_utc: "{metadata['run_timestamp']}"
    source_folder: "{metadata['folder_name']}"
    file_count: {metadata['file_count']}
+   char_count: {metadata['char_count']}
+   word_count: {metadata['word_count']}
  tool_details:
    name: "Codebase Extractor"
    version: "{__version__}"
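To see what the extended header looks like once rendered, here is a self-contained sketch. The `build_yaml_header` function, the sample metadata values, the two-space nesting, and the closing `---` line are all assumptions for illustration; the hunk above is cut off before the end of the f-string, so the real header's tail is not shown in this diff.

```python
def build_yaml_header(metadata: dict, version: str) -> str:
    """Render a front matter block shaped like the one in the diff (assumed layout)."""
    return (
        "---\n"
        "extraction_details:\n"
        f"  reference: {metadata['run_ref']}\n"
        f"  timestamp_utc: \"{metadata['run_timestamp']}\"\n"
        f"  source_folder: \"{metadata['folder_name']}\"\n"
        f"  file_count: {metadata['file_count']}\n"
        f"  char_count: {metadata['char_count']}\n"
        f"  word_count: {metadata['word_count']}\n"
        "tool_details:\n"
        "  name: \"Codebase Extractor\"\n"
        f"  version: \"{version}\"\n"
        "---\n"  # assumed terminator; not visible in the hunk
    )


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the rendered shape.
    sample = {
        "run_ref": "example-ref",
        "run_timestamp": "2025-01-01T00:00:00Z",
        "folder_name": "src/components",
        "file_count": 3,
        "char_count": 36,
        "word_count": 5,
    }
    print(build_yaml_header(sample, "1.1.1"))
```

Because the numeric fields are written unquoted, a YAML parser would likely read them back as integers, which suits downstream filtering by size.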
@@ -79,14 +79,14 @@ def main():
      # --- Startup Sequence ---
      if not is_fully_automated:
          ui.clear_screen()
-         ui.print_banner(no_instructions=args.no_instructions)
-         if not args.no_instructions:
+         # CHANGED: Pass the new 'instructions' flag to the banner function
+         ui.print_banner(show_instructions=args.instructions)
+         # CHANGED: Logic is now inverted to show instructions only when the flag is present
+         if args.instructions:
              ui.show_instructions(output_dir_name)
-         else:
-             input(colored("\nPress Enter to begin...", "green"))
-             ui.clear_screen()
      else:
-         ui.print_banner(no_instructions=True)
+         # NOTE: For automated runs, the banner is always minimal. This is correct.
+         ui.print_banner(show_instructions=False)

      # --- Collect Settings (Interactively or from Args) ---
      select_style = Style([('qmark', 'fg:#FFA500'), ('pointer', 'fg:#FFA500'), ('highlighted', 'fg:black bg:#FFA500'), ('selected', 'fg:black bg:#FFA500')])
@@ -148,13 +148,23 @@ def main():
      for folder_path in sorted(list(folders_to_process)):
          with Halo(text=f"Extracting {folder_path.relative_to(root_path)}...", spinner="dots"):
              time.sleep(0.1)
-             folder_md, folder_count = file_handler.extract_code_from_folder(folder_path, exclude_large)
+             # CHANGED: Unpack the new char_count and word_count values
+             folder_md, folder_count, char_count, word_count = file_handler.extract_code_from_folder(folder_path, exclude_large)

          if folder_count > 0:
-             metadata = {"run_ref": run_ref, "run_timestamp": run_timestamp, "folder_name": str(folder_path.relative_to(root_path)), "file_count": folder_count}
+             # CHANGED: Add new metrics to the metadata dictionary
+             metadata = {
+                 "run_ref": run_ref,
+                 "run_timestamp": run_timestamp,
+                 "folder_name": str(folder_path.relative_to(root_path)),
+                 "file_count": folder_count,
+                 "char_count": char_count,
+                 "word_count": word_count
+             }
              if not args.dry_run:
                  file_handler.write_to_markdown_file(folder_md, metadata, root_path, output_dir_name)
              logging.info(f"✅ Extracted {folder_count} file(s) from: {folder_path.relative_to(root_path)}")
+             logging.info(f"📜 {char_count:,} character(s), {word_count:,} word(s)")
              if args.dry_run: logging.info(colored(" (Dry Run: No file written)", "yellow"))
              total_files_extracted += folder_count
          else:
@@ -165,14 +175,24 @@ def main():
          root_display_name = f"root [{root_path.name}] (files in root folder only, excl. sub-folders)"
          with Halo(text=f"Extracting {root_display_name}...", spinner="dots"):
              time.sleep(0.1)
-             root_md, root_count = file_handler.extract_code_from_root(root_path, exclude_large)
+             # CHANGED: Unpack the new char_count and word_count values
+             root_md, root_count, char_count, word_count = file_handler.extract_code_from_root(root_path, exclude_large)

          if root_count > 0:
-             metadata = {"run_ref": run_ref, "run_timestamp": run_timestamp, "folder_name": root_display_name, "file_count": root_count}
+             # CHANGED: Add new metrics to the metadata dictionary
+             metadata = {
+                 "run_ref": run_ref,
+                 "run_timestamp": run_timestamp,
+                 "folder_name": root_display_name,
+                 "file_count": root_count,
+                 "char_count": char_count,
+                 "word_count": word_count
+             }
              if not args.dry_run:
                  file_handler.write_to_markdown_file(root_md, metadata, root_path, output_dir_name)
              total_files_extracted += root_count
              logging.info(f"✅ Extracted {root_count} file(s) from the root directory")
+             logging.info(f"📜 {char_count:,} character(s), {word_count:,} word(s)")
              if args.dry_run: logging.info(colored(" (Dry Run: No file written)", "yellow"))
          else:
              logging.warning("‼️ No extractable files in the root directory")
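The new log lines rely on the `,` format specifier to group digits by thousands. A small sketch of that formatting, using a hypothetical `format_metrics` helper that is not part of the package and omitting the emoji prefix:

```python
def format_metrics(char_count: int, word_count: int) -> str:
    """Format the two counts the way the added logging.info calls do."""
    # The ',' option in the format spec inserts a comma every three digits.
    return f"{char_count:,} character(s), {word_count:,} word(s)"


if __name__ == "__main__":
    print(format_metrics(1234567, 89012))  # 1,234,567 character(s), 89,012 word(s)
```

Since the metrics are computed before the `args.dry_run` check, a dry run would still report them even though no file is written.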
@@ -16,26 +16,16 @@ LOGO_LARGE = """
  """

  LOGO_SMALL = """
- ██████╗ ██████╗ ██████╗ ███████╗██████╗ █████╗ ███████╗███████╗
- ██╔════╝██╔═══██╗██╔══██╗██╔════╝██╔══██╗██╔══██╗██╔════╝██╔════╝
- ██║ ██║ ██║██║ ██║█████╗ ██████╔╝███████║███████╗█████╗
- ██║ ██║ ██║██║ ██║██╔══╝ ██╔══██╗██╔══██║╚════██║██╔══╝
- ╚██████╗╚██████╔╝██████╔╝███████╗██████╔╝██║ ██║███████║███████╗
- ╚═════╝ ╚═════╝ ╚═════╝ ╚══════╝╚═════╝ ╚═╝ ╚═╝╚══════╝╚══════╝
-
- ███████╗██╗ ██╗████████╗██████╗ █████╗ ██████╗████████╗ ██████╗ ██████╗
- ██╔════╝╚██╗██╔╝╚══██╔══╝██╔══██╗██╔══██╗██╔════╝╚══██╔══╝██╔═══██╗██╔══██╗
- █████╗ ╚███╔╝ ██║ ██████╔╝███████║██║ ██║ ██║ ██║██████╔╝
- ██╔══╝ ██╔██╗ ██║ ██╔══██╗██╔══██║██║ ██║ ██║ ██║██╔══██╗
- ███████╗██╔╝ ██╗ ██║ ██║ ██║██║ ██║╚██████╗ ██║ ╚██████╔╝██║ ██║
- ╚══════╝╚═╝ ╚═╝ ╚═╝ ╚═╝ ╚═╝╚═╝ ╚═╝ ╚═════╝ ╚═╝ ╚═════╝ ╚═╝ ╚═╝
+ ░█▀▀░█▀█░█▀▄░█▀▀░█▀▄░█▀█░█▀▀░█▀▀░░░█▀▀░█░█░▀█▀░█▀▄░█▀█░█▀▀░▀█▀░█▀█░█▀▄
+ ░█░░░█░█░█░█░█▀▀░█▀▄░█▀█░▀▀█░█▀▀░░░█▀▀░▄▀▄░░█░░█▀▄░█▀█░█░░░░█░░█░█░█▀▄
+ ░▀▀▀░▀▀▀░▀▀░░▀▀▀░▀▀░░▀░▀░▀▀▀░▀▀▀░░░▀▀▀░▀░▀░░▀░░▀░▀░▀░▀░▀▀▀░░▀░░▀▀▀░▀░▀
  """

  def clear_screen():
      """Clears the terminal screen."""
      os.system('cls' if os.name == 'nt' else 'clear')

- def print_banner(no_instructions: bool = False):
+ def print_banner(show_instructions: bool = False): # CHANGED: Parameter name for clarity
      """Prints a banner that adjusts to the terminal width."""
      try:
          width = shutil.get_terminal_size((80, 20)).columns
@@ -49,10 +39,9 @@ def print_banner(no_instructions: bool = False):

      # Use the imported __version__ variable instead of config.SCRIPT_VERSION
      print(colored(f" Welcome to Code Extractor v{__version__} by Lukasz Lekowski ".center(width, "="), "white", "on_magenta"))
-
-     if not no_instructions:
-         print("\nThis tool consolidates your project's code into structured Markdown files.")
-         print("It's ideal for providing context to AI models, archiving projects, or generating documentation.")
+     print("\nThis tool consolidates your project's code into structured Markdown files.")
+     print("It's ideal for providing context to AI models, archiving projects, or generating documentation.\n")
+

  def show_instructions(output_dir_name: str):
      """Clears screen and shows detailed instructions, pausing for user input."""
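The width-adaptive welcome line in `print_banner` combines `shutil.get_terminal_size` with `str.center`. A standalone sketch of that technique, with a hypothetical `banner_line` helper and without the `colored` call from the third-party `termcolor` package:

```python
import shutil


def banner_line(text: str, fill: str = "=") -> str:
    """Center text in a line as wide as the terminal, padded with fill."""
    # The (80, 20) fallback is used when the size cannot be determined,
    # for example when output is redirected to a file.
    width = shutil.get_terminal_size((80, 20)).columns
    # center() returns the string unchanged if it is already wider than width.
    return f" {text} ".center(width, fill)


if __name__ == "__main__":
    print(banner_line("Welcome to Code Extractor"))
```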
@@ -85,7 +74,8 @@ def show_instructions(output_dir_name: str):
      print(colored("--- Output Details ---", "yellow"))
      print(f"All extracted content is saved into the '{output_dir_name}' directory. Each Markdown file generated will contain a YAML metadata header at the top with a unique reference ID, a timestamp, and more.\n")

-     tip = "TIP: Run this script with the --no-instructions or -ni flag to skip this guide."
+     # CHANGED: Updated the tip to reflect the new '--instructions' flag
+     tip = "TIP: To see this guide again, run the script with the --instructions flag."
      print(colored(tip, "black", "on_yellow"))

      input(colored("\nReady? Press Enter to begin...", "green"))
@@ -105,6 +95,4 @@ def print_footer():
      print("💡 Love this tool? Found a bug? Share your feedback on GitHub:")
      print(config.GITHUB_URL + "\n")
      print("🤝 Connect with the author on LinkedIn:")
-     print(config.LINKEDIN_URL + "\n")
-     print("☕ Enjoying this tool? You can support its development with a coffee!")
-     print("https://www.buymeacoffee.com/lukaszlekowski\n")
+     print(config.LINKEDIN_URL + "\n")
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: codebase-extractor
- Version: 1.1.0
+ Version: 1.1.1
  Summary: A CLI tool to extract project source code into structured Markdown files for LLM & AI context.
  Author: Lukasz Lekowski
  Project-URL: Homepage, https://github.com/lukaszlekowski/codebase-extractor
@@ -75,12 +75,12 @@ The tool is highly configurable, allowing you to select specific folders, exclud
  ## ✨ Key Features

  - **Interactive & User-Friendly:** A guided, multi-step CLI experience that makes selecting options simple and clear.
+ - **Quick Start by Default:** The tool starts without delay. Detailed instructions are available via an `--instructions` flag when you need a reminder.
  - **Smart Filtering:** Automatically excludes common dependency folders, build artifacts, version control directories, and IDE configuration files. The exact filters are configurable.
  - **Flexible Selection Modes:** Choose to extract the entire project with one command, or dive into a specific selection mode.
- - **🌳 Nested Folder Selection:** Interactively browse and select specific sub-folders from a tree-like view.
+ - **🌳 Visual Tree Selection:** Interactively browse and select specific sub-folders from a clear, pipe-based tree structure.
  - **🔢 Configurable Scan Depth:** You decide how many levels deep the script should look for folders when building the selection tree.
- - **YAML Metadata:** Each generated Markdown file is prepended with a YAML front matter block containing useful metadata like a unique run ID, timestamp, and file count for easy tracking and parsing.
- - **🚀 Quick Start Mode:** Use the `--no-instructions` flag to skip the detailed intro guide on subsequent runs.
+ - **Rich YAML Metadata:** Each generated Markdown file is prepended with a YAML front matter block containing useful metadata like a unique run ID, timestamp, file count, character count, and word count.
  - **Safe & Robust:** Features graceful exit handling (`Ctrl+C`) and provides clear feedback during the extraction process.

  ---
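The "Rich YAML Metadata" entry in the hunk above adds character and word counts to the front matter. A minimal sketch of how such a header could be assembled follows. The function name `build_front_matter` and the key names (`run_id`, `char_count`, and so on) are assumptions for illustration only and are not taken from the package source.

```python
import uuid
from datetime import datetime, timezone


def build_front_matter(content: str, file_count: int) -> str:
    """Return a YAML front matter block describing `content`.

    Hypothetical helper: the key names are illustrative assumptions,
    not the keys codebase-extractor actually writes.
    """
    fields = {
        "run_id": str(uuid.uuid4()),                          # unique per run
        "timestamp": datetime.now(timezone.utc).isoformat(),  # UTC, ISO 8601
        "file_count": file_count,
        "char_count": len(content),                           # characters, not bytes
        "word_count": len(content.split()),                   # whitespace-delimited words
    }
    lines = [f"{key}: {value}" for key, value in fields.items()]
    return "---\n" + "\n".join(lines) + "\n---\n"


if __name__ == "__main__":
    # Prepend the header to the extracted Markdown body.
    body = "# src/app.py\n\nprint('hello')\n"
    print(build_front_matter(body, file_count=1) + body)
```

Counting words by splitting on whitespace is the simplest choice; the real tool may count differently.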
@@ -168,12 +168,12 @@ Once installed, you can run the tool from any terminal window. Navigate to your
  code-extractor
  ```

- The script will then guide you through the extraction process.
+ The script will launch immediately and guide you through the extraction process.

- For repeat usage, you can skip the detailed introductory guide by using the `--no-instructions` or `-ni` flag:
+ For a detailed guide on how the script works, you can use the `--instructions` flag:

  ```bash
- code-extractor --no-instructions
+ code-extractor --instructions
  ```

  ### The Process
@@ -193,25 +193,25 @@ The tool will guide you through a series of prompts:

  ### Output Details

- All output files are saved in a `CODEBASE_EXTRACTS` directory within your project folder. Each generated Markdown file includes a YAML metadata header with a unique reference ID, timestamp, and file count for easy tracking and parsing.
+ All output files are saved in a `CODEBASE_EXTRACTS` directory within your project folder. Each generated Markdown file includes a YAML metadata header with a unique reference ID, timestamp, file count, character count, and word count for easy tracking and parsing.

  ### ⚡ CLI Command Reference

  For non-interactive use and automation, you can control the script entirely with these arguments.

- | Argument | Description | Default Value |
- | :------------------------- | :--------------------------------------------------------------------------- | :-------------------------- |
- | `-ni`, `--no-instructions` | Run the script without printing the detailed instruction banner. | `False` |
- | `--root <path>` | The root directory of the project to extract. | The current directory |
- | `--output-dir <name>` | Custom name for the output directory. | `CODEBASE_EXTRACTS` |
- | `--dry-run` | Simulate the extraction process without writing any files. | `False` |
- | `-v`, `--verbose` | Enable verbose logging for debugging. | `False` |
- | `--log-file <path>` | Path to save the log file. | `None` |
- | `--exclude-large-files` | Non-interactive: Exclude files larger than 1MB. | `False` |
- | `--mode <mode>` | Non-interactive: Set the extraction mode. Choices: `everything`, `specific`. | `None` (Interactive prompt) |
- | `--depth <number>` | Non-interactive: Set the folder scan depth for 'specific' mode. | `3` |
- | `--select-folders <list>` | Non-interactive: A space-separated list of folders/sub-folders to extract. | `[]` |
- | `--select-root` | Non-interactive: Include files from the root directory in the extraction. | `False` |
+ | Argument | Description | Default Value |
+ | :------------------------ | :--------------------------------------------------------------------------- | :-------------------------- |
+ | `--instructions` | Show the detailed instruction guide on startup. | `False` |
+ | `--root <path>` | The root directory of the project to extract. | The current directory |
+ | `--output-dir <name>` | Custom name for the output directory. | `CODEBASE_EXTRACTS` |
+ | `--dry-run` | Simulate the extraction process without writing any files. | `False` |
+ | `-v`, `--verbose` | Enable verbose logging for debugging. | `False` |
+ | `--log-file <path>` | Path to save the log file. | `None` |
+ | `--exclude-large-files` | Non-interactive: Exclude files larger than 1MB. | `False` |
+ | `--mode <mode>` | Non-interactive: Set the extraction mode. Choices: `everything`, `specific`. | `None` (Interactive prompt) |
+ | `--depth <number>` | Non-interactive: Set the folder scan depth for 'specific' mode. | `3` |
+ | `--select-folders <list>` | Non-interactive: A space-separated list of folders/sub-folders to extract. | `[]` |
+ | `--select-root` | Non-interactive: Include files from the root directory in the extraction. | `False` |

  ---
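The table change above inverts the old opt-out flag (`--no-instructions`) into an opt-in flag (`--instructions`) that defaults to `False`. A minimal sketch of that pattern with the standard-library `argparse` module follows, assuming the tool uses `argparse`. The `build_parser` helper is hypothetical, only a subset of the documented flags is reproduced, and the package's actual `cli.py` may be structured differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring part of the documented flag table."""
    parser = argparse.ArgumentParser(prog="code-extractor")
    # Opt-in flag: absent means False, so the tool starts without the guide.
    parser.add_argument(
        "--instructions",
        action="store_true",
        help="Show the detailed instruction guide on startup.",
    )
    parser.add_argument("--mode", choices=["everything", "specific"], default=None)
    parser.add_argument("--depth", type=int, default=3)
    # Space-separated list of folders; stops consuming at the next option.
    parser.add_argument("--select-folders", nargs="*", default=[])
    parser.add_argument("--select-root", action="store_true")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    if args.instructions:
        print("(detailed guide would be printed here)")
```

With `action="store_true"`, automated invocations such as `code-extractor --mode everything` no longer need an extra flag to suppress the banner, which matches the simplified examples later in the README diff.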
 
@@ -224,7 +224,7 @@ Here are a few practical examples of how to use the tool from your command line.
  A common command for quick, automated runs.

  ```bash
- code-extractor --no-instructions --mode everything
+ code-extractor --mode everything
  ```

  - #### Extract specific sub-folders non-interactively
@@ -232,7 +232,7 @@ Here are a few practical examples of how to use the tool from your command line.
  This command extracts only the `src/components` and `src/hooks` directories, plus any files in the root.

  ```bash
- code-extractor --ni --mode specific --select-folders src/components src/hooks --select-root
+ code-extractor --mode specific --select-folders src/components src/hooks --select-root
  ```

  - #### Perform a safe dry run
@@ -1 +0,0 @@
- __version__ = "1.1.0"