code-to-txt 0.1.0__tar.gz → 0.3.0__tar.gz

@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) Andrii Sonsiadlo
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,466 @@
1
+ Metadata-Version: 2.4
2
+ Name: code-to-txt
3
+ Version: 0.3.0
4
+ Summary: Convert code files to a single text file for LLM consumption
5
+ License: MIT
6
+ License-File: LICENSE
7
+ Author: Andrii Sonsiadlo
8
+ Author-email: andrii.sonsiadlo@gmail.com
9
+ Requires-Python: >=3.10
10
+ Classifier: License :: OSI Approved :: MIT License
11
+ Classifier: Programming Language :: Python :: 3
12
+ Classifier: Programming Language :: Python :: 3.10
13
+ Classifier: Programming Language :: Python :: 3.11
14
+ Classifier: Programming Language :: Python :: 3.12
15
+ Classifier: Programming Language :: Python :: 3.13
16
+ Classifier: Programming Language :: Python :: 3.14
17
+ Requires-Dist: click (>=8.3.1,<9.0.0)
18
+ Requires-Dist: gitpython (>=3.1.46,<4.0.0)
19
+ Requires-Dist: pathspec (>=1.0.4,<2.0.0)
20
+ Requires-Dist: pyperclip (>=1.8.2,<2.0.0)
21
+ Requires-Dist: pyyaml (>=6.0.0,<7.0.0)
22
+ Description-Content-Type: text/markdown
23
+
24
+ # CodeToTxt
25
+
26
+ A Python package that converts code files into a single text file, suitable for feeding into Large Language
27
+ Models (LLMs) or for code review and documentation.
28
+
29
+ ## Features
30
+
31
+ **Core Features:**
32
+
33
+ - 📁 Convert entire directories of code into a single text file
34
+ - 🌳 Optional directory tree visualization
35
+ - 🚫 Respects `.gitignore` patterns automatically
36
+ - 🎨 Customizable file separators and output format
37
+ - 🔧 Flexible file filtering by extension or glob patterns
38
+ - 📦 Easy to use CLI and Python API
39
+
40
+ ## Installation
41
+
42
+ ```bash
43
+ pip install code-to-txt
44
+ ```
45
+
46
+ Or with Poetry:
47
+
48
+ ```bash
49
+ poetry add code-to-txt
50
+ ```
51
+
52
+ ## Quick Start
53
+
54
+ ### Basic Usage
55
+
56
+ ```bash
57
+ # Show version
58
+ code-to-txt --version
59
+
60
+ # Convert all code files with timestamp
61
+ code-to-txt -t
62
+
63
+ # Preview what would be processed
64
+ code-to-txt --dry-run
65
+
66
+ # Get codebase statistics
67
+ code-to-txt --stats
68
+
69
+ # Convert specific directory
70
+ code-to-txt ./my-project -o project.txt
71
+
72
+ # Copy to clipboard instead of saving
73
+ code-to-txt --clipboard-only
74
+ ```
75
+
76
+ ### Specify File Types
77
+
78
+ ```bash
79
+ # Multiple extensions (space or comma separated)
80
+ code-to-txt -e ".py .js .ts"
81
+ code-to-txt -e ".py,.js,.ts"
82
+
83
+ # Using glob patterns
84
+ code-to-txt -g "*.py" -g "src/**/*.js"
85
+ code-to-txt -g "*.py" -g "*.md"
86
+ ```
87
+
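As a rough illustration of how a space- or comma-separated `-e` value could be normalized, here is a minimal sketch. The function name `parse_extensions`, the lowercasing, and the automatic leading dot are assumptions made for illustration, not the package's confirmed behavior.

```python
import re


def parse_extensions(raw: str) -> set:
    """Split a space- or comma-separated string into a set of extensions."""
    parts = [p for p in re.split(r"[,\s]+", raw.strip()) if p]
    # Assumption: add a missing leading dot and lowercase each entry.
    return {p.lower() if p.startswith(".") else "." + p.lower() for p in parts}


print(sorted(parse_extensions(".py .js,.ts")))
```

Both separator styles shown above produce the same set under this sketch, which is why the two CLI forms can be treated as equivalent.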
88
+ ### Advanced Usage
89
+
90
+ ```bash
91
+ # Limit file sizes (useful for LLM token limits)
92
+ code-to-txt --max-file-size 500
93
+
94
+ # Exclude patterns
95
+ code-to-txt -x "tests/*" -x "*.test.js"
96
+
97
+ # Don't use .gitignore
98
+ code-to-txt --no-gitignore
99
+
100
+ # Don't show directory tree
101
+ code-to-txt --no-tree
102
+
103
+ # Custom separator
104
+ code-to-txt --separator "---"
105
+
106
+ # Combine options
107
+ code-to-txt -t -c -e ".py .js" -x "tests/*"
108
+ ```
109
+
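The size limit can be pictured as a simple check on each file before it is read. The sketch below is hypothetical: the helper name `within_size_limit`, the 1 KB = 1024 bytes convention, and the inclusive comparison are assumptions, not taken from the package source.

```python
import tempfile
from pathlib import Path
from typing import Optional


def within_size_limit(path: Path, max_file_size_kb: Optional[int]) -> bool:
    """Return True if the file should be kept under the given KB limit."""
    if max_file_size_kb is None:  # no limit configured
        return True
    # Assumption: 1 KB = 1024 bytes and the limit is inclusive.
    return path.stat().st_size <= max_file_size_kb * 1024


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.py"
    sample.write_bytes(b"x" * 2048)  # exactly 2 KB
    print(within_size_limit(sample, 500), within_size_limit(sample, 1))
```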
110
+ ## Configuration File
111
+
112
+ Create a default configuration file:
113
+
114
+ ```bash
115
+ code-to-txt --init-config
116
+ ```
117
+
118
+ This creates `.code-to-txt.yml` with default settings:
119
+
120
+ ```yaml
121
+ # Output file name
122
+ output: code-to-txt.txt
123
+
124
+ # File extensions to include (null = use defaults)
125
+ extensions: null
126
+
127
+ # Patterns to exclude
128
+ exclude:
129
+ - "tests/*"
130
+ - "*.test.js"
131
+ - "*.test.ts"
132
+ - "*.spec.js"
133
+ - "*.spec.ts"
134
+ - "node_modules/*"
135
+ - "__pycache__/*"
136
+ - "*.pyc"
137
+
138
+ # Glob patterns (alternative to extensions)
139
+ glob: [ ]
140
+
141
+ # Options
142
+ no_gitignore: false
143
+ no_tree: false
144
+ separator: "================"
145
+ clipboard: false
146
+ clipboard_only: false
147
+ timestamp: false
148
+ max_file_size: null
149
+ ```
150
+
151
+ Use the config file:
152
+
153
+ ```bash
154
+ code-to-txt --config .code-to-txt.yml
155
+ ```
156
+
157
+ **Note:** CLI arguments override config file settings.
158
+
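One common way to get this precedence is to start from the config values and overwrite them with any CLI option the user actually passed. The sketch below assumes that unset CLI options arrive as `None`; the function name `merge_settings` is hypothetical and the package may resolve precedence differently.

```python
def merge_settings(config: dict, cli_args: dict) -> dict:
    """Overlay explicitly provided CLI values on top of config file values."""
    merged = dict(config)  # copy so the loaded config is not mutated
    for key, value in cli_args.items():
        if value is not None:  # assumption: None means "not given on the CLI"
            merged[key] = value
    return merged


config = {"output": "code-to-txt.txt", "max_file_size": None, "timestamp": False}
cli_args = {"output": None, "max_file_size": 200, "timestamp": True}
print(merge_settings(config, cli_args))
```

Boolean flags need care with this approach: a flag that defaults to `False` rather than `None` would always override the config value.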
159
+ ### Example Configurations
160
+
161
+ **Python Project:**
162
+
163
+ ```yaml
164
+ extensions: [ .py ]
165
+ exclude: [ "tests/*", "*.pyc", "__pycache__/*", "venv/*", ".venv/*" ]
166
+ timestamp: true
167
+ max_file_size: 500
168
+ ```
169
+
170
+ **JavaScript/TypeScript Project:**
171
+
172
+ ```yaml
173
+ extensions: [ .js, .ts, .jsx, .tsx ]
174
+ exclude: [ "node_modules/*", "dist/*", "build/*", "*.test.js", "*.spec.ts" ]
175
+ no_tree: false
176
+ max_file_size: 1000
177
+ ```
178
+
179
+ **LLM-Optimized:**
180
+
181
+ ```yaml
182
+ extensions: [ .py, .js, .md ]
183
+ exclude: [ "tests/*", "*.test.*", "node_modules/*", "dist/*", "build/*" ]
184
+ timestamp: true
185
+ clipboard: true
186
+ max_file_size: 200
187
+ no_tree: false
188
+ ```
189
+
190
+ ## Command Line Options
191
+
192
+ ```
193
+ Usage: code-to-txt [OPTIONS] [PATH]
194
+
195
+ Arguments:
196
+   PATH  Directory to scan (default: current directory)
197
+
198
+ Options:
199
+   -o, --output PATH      Output file path (default: codetotxt_YYYYMMDD_HHMMSS.txt)
200
+   -e, --extensions TEXT  File extensions to include (space or comma separated)
201
+   -x, --exclude TEXT     Patterns to exclude (can be used multiple times)
202
+   -g, --glob TEXT        Glob patterns to include (can be used multiple times)
203
+   --no-gitignore         Don't respect .gitignore files
204
+   --no-tree              Don't include directory tree in output
205
+   --separator TEXT       Separator between files
206
+   -c, --clipboard        Copy output to clipboard in addition to file
207
+   --clipboard-only       Copy to clipboard only (don't save file)
208
+   --config PATH          Path to config file (.yml or .yaml)
209
+   --init-config          Create default configuration file
210
+   -t, --timestamp        Add timestamp to output filename
211
+   -v, --version          Show version and exit
212
+   --dry-run              Show which files would be processed
213
+   --stats                Show detailed statistics
214
+   --max-file-size INT    Skip files larger than N KB
215
+   --help                 Show this message and exit
216
+ ```
217
+
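The `codetotxt_YYYYMMDD_HHMMSS.txt` pattern shown for `--output` can be reproduced with `datetime.strftime`. This is only a sketch of the naming scheme; the helper `timestamped_name` and its parameters are hypothetical.

```python
from datetime import datetime
from typing import Optional


def timestamped_name(prefix: str = "codetotxt", now: Optional[datetime] = None) -> str:
    """Build a file name of the form <prefix>_YYYYMMDD_HHMMSS.txt."""
    now = now or datetime.now()
    return "%s_%s.txt" % (prefix, now.strftime("%Y%m%d_%H%M%S"))


print(timestamped_name(now=datetime(2025, 1, 2, 3, 4, 5)))
```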
218
+ ## Python API
219
+
220
+ ### Basic Usage
221
+
222
+ ```python
223
+ from code_to_txt import CodeToText
224
+
225
+ code_to_txt = CodeToText(
226
+     root_path="./my-project",
227
+     output_file="output.txt",
228
+     include_extensions={".py", ".js"},
229
+ )
230
+
231
+ num_files = code_to_txt.convert(add_tree=True)
232
+ print(f"Processed {num_files} files")
233
+ ```
234
+
235
+ ### Generate Content for Clipboard
236
+
237
+ ```python
238
+ from code_to_txt import CodeToText
239
+ import pyperclip
240
+
241
+ code_to_txt = CodeToText(
242
+     root_path="./my-project",
242
+     output_file=None,
243
+     include_extensions={".py"},
245
+ )
246
+
247
+ content = code_to_txt.generate_content(add_tree=True)
248
+ pyperclip.copy(content)
249
+ ```
250
+
251
+ ### Get Statistics
252
+
253
+ ```python
254
+ from code_to_txt import CodeToText
255
+
256
+ code_to_txt = CodeToText(
257
+     root_path="./my-project",
257
+     output_file=None,
258
+     max_file_size_kb=500,
260
+ )
261
+
262
+ stats = code_to_txt.calculate_statistics()
263
+ print(f"Total files: {stats['total_files']}")
264
+ print(f"Total size: {stats['total_size_bytes'] / 1024 / 1024:.2f} MB")
265
+ print(f"Total lines: {stats['total_lines']:,}")
266
+ ```
267
+
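To make the three keys used above concrete, here is a standard-library sketch of how such totals could be computed for a list of files. The function `sketch_statistics` is hypothetical and is not the package's `calculate_statistics`; only the key names mirror the example.

```python
import tempfile
from pathlib import Path


def sketch_statistics(files: list) -> dict:
    """Compute file count, total bytes, and total lines for the given paths."""
    total_size = 0
    total_lines = 0
    for path in files:
        total_size += path.stat().st_size
        # Assumption: decode as UTF-8 and replace undecodable bytes.
        text = path.read_text(encoding="utf-8", errors="replace")
        total_lines += len(text.splitlines())
    return {
        "total_files": len(files),
        "total_size_bytes": total_size,
        "total_lines": total_lines,
    }


with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.py"
    demo.write_bytes(b"a = 1\nb = 2\n")
    print(sketch_statistics([demo]))
```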
268
+ ### Using Glob Patterns
269
+
270
+ ```python
271
+ from code_to_txt import CodeToText
272
+
273
+ code_to_txt = CodeToText(
274
+     root_path="./my-project",
274
+     output_file="output.txt",
275
+     glob_patterns=["*.py", "src/**/*.js", "**/*.md"],
277
+ )
278
+
279
+ num_files = code_to_txt.convert()
280
+ ```
281
+
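To see what these three patterns would select, the sketch below evaluates them with `pathlib.Path.glob`. Whether the package matches patterns with the same semantics is an assumption, and `collect_by_globs` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def collect_by_globs(root: Path, patterns: list) -> list:
    """Return sorted relative paths of files matching any of the patterns."""
    found = set()
    for pattern in patterns:
        for path in root.glob(pattern):
            if path.is_file():
                found.add(path.relative_to(root).as_posix())
    return sorted(found)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in ["a.py", "src/d.py", "src/lib/b.js", "docs/c.md"]:
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("x")
    print(collect_by_globs(root, ["*.py", "src/**/*.js", "**/*.md"]))
```

Under `pathlib` semantics, `*.py` matches only files directly in the root, so `src/d.py` is not selected; a recursive match would need `**/*.py`.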
282
+ ## Default File Extensions
283
+
284
+ When no extensions are specified, CodeToTxt includes these file types by default:
285
+
286
+ - **Python:** `.py`
287
+ - **JavaScript/TypeScript:** `.js`, `.ts`, `.jsx`, `.tsx`
288
+ - **Systems:** `.c`, `.cpp`, `.h`, `.hpp`, `.java`, `.cs`, `.go`, `.rs`
289
+ - **Web:** `.html`, `.css`, `.scss`
290
+ - **Config:** `.yaml`, `.yml`, `.json`, `.toml`, `.xml`
291
+ - **Documentation:** `.md`, `.txt`, `.rst`
292
+ - **Scripts:** `.sh`, `.bash`, `.zsh`
293
+ - **Other:** `.rb`, `.php`, `.swift`, `.kt`, `.scala`, `.r`, `.sql`
294
+
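Filtering by extension amounts to a set-membership test on each file's suffix. The sketch below uses an abbreviated subset of the list above; the name `is_included` and the case-insensitive comparison are assumptions.

```python
from pathlib import PurePosixPath

# Abbreviated, illustrative subset of the default extensions listed above.
DEFAULT_EXTENSIONS = {".py", ".js", ".ts", ".md", ".yaml", ".yml", ".json"}


def is_included(rel_path: str, extensions: set = DEFAULT_EXTENSIONS) -> bool:
    """Return True if the file's suffix is in the allowed set."""
    # Assumption: compare suffixes case-insensitively.
    return PurePosixPath(rel_path).suffix.lower() in extensions


print(is_included("src/main.py"), is_included("assets/logo.png"))
```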
295
+ ## Default Ignore Patterns
296
+
297
+ CodeToTxt automatically ignores common build artifacts and dependencies:
298
+
299
+ - `__pycache__`, `*.pyc`, `*.pyo`, `*.pyd`
300
+ - `.git`, `.svn`, `.hg`
301
+ - `node_modules`
302
+ - `.venv`, `venv`, `.env`
303
+ - `*.egg-info`, `dist`, `build`
304
+ - `.pytest_cache`, `.mypy_cache`, `.ruff_cache`
305
+ - `*.so`, `*.dylib`, `*.dll`
306
+
307
+ Patterns from `.gitignore` files, including those in parent directories, are also applied.
308
+
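The two ideas in this section, matching default ignore patterns and looking for `.gitignore` files in parent directories, can be sketched with the standard library alone. This is a simplification: it matches each path component with `fnmatch` rather than implementing full gitignore semantics, and both helper names are hypothetical.

```python
import tempfile
from fnmatch import fnmatch
from pathlib import Path, PurePosixPath

# Abbreviated, illustrative subset of the default ignore patterns listed above.
DEFAULT_IGNORES = ["__pycache__", "*.pyc", ".git", "node_modules", ".venv", "dist"]


def is_ignored(rel_path: str, patterns: list = DEFAULT_IGNORES) -> bool:
    """Return True if any path component matches any ignore pattern."""
    parts = PurePosixPath(rel_path).parts
    return any(fnmatch(part, pat) for part in parts for pat in patterns)


def find_gitignores(start: Path) -> list:
    """List .gitignore files from start up through its parents, nearest first."""
    start = start.resolve()
    candidates = [d / ".gitignore" for d in (start, *start.parents)]
    return [c for c in candidates if c.is_file()]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / ".gitignore").write_text("*.log\n")
    nested = root / "pkg" / "sub"
    nested.mkdir(parents=True)
    print(root / ".gitignore" in find_gitignores(nested))
    print(is_ignored("pkg/__pycache__/mod.pyc"), is_ignored("pkg/mod.py"))
```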
309
+ ## Output Format
310
+
311
+ The generated file includes:
312
+
313
+ 1. **Header:** Source directory and file count
314
+ 2. **Directory Tree:** Visual representation of the file structure (optional)
315
+ 3. **File Contents:** Each file with its relative path and content
316
+
317
+ Example output:
318
+
319
+ ```
320
+ Code Export from: /path/to/project
321
+ Total files: 4
322
+ ================================================================================
323
+
324
+ DIRECTORY TREE:
325
+ ================================================================================
326
+ my-project/
327
+ ├── src/
328
+ │   ├── main.py
329
+ │   └── utils.py
330
+ ├── tests/
331
+ │   └── test_main.py
332
+ └── README.md
333
+
334
+ ================================================================================
335
+
336
+ FILE 1/4: src/main.py
337
+ ================================================================================
338
+ def main():
339
+     print("Hello, World!")
340
+
341
+ if __name__ == "__main__":
342
+     main()
343
+
344
+ ================================================================================
345
+ ...
346
+ ```
347
+
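The layout above can be approximated in a few lines. The sketch below builds the tree from relative file paths and then assembles the header, tree, and per-file sections. The function names, the directories-first ordering, and the exact blank-line placement are assumptions inferred from the example, not the package's implementation.

```python
def render_tree(rel_paths: list, root_name: str) -> str:
    """Render relative file paths as a box-drawing directory tree."""
    tree = {}
    for rel in rel_paths:
        node = tree
        for part in rel.split("/"):
            node = node.setdefault(part, {})  # files end up as empty dicts
    lines = [root_name + "/"]

    def walk(node: dict, prefix: str) -> None:
        # Assumption: directories first, then files, each sorted by name.
        entries = sorted(node.items(), key=lambda kv: (not kv[1], kv[0]))
        for i, (name, child) in enumerate(entries):
            last = i == len(entries) - 1
            label = name + ("/" if child else "")
            lines.append(prefix + ("└── " if last else "├── ") + label)
            if child:
                walk(child, prefix + ("    " if last else "│   "))

    walk(tree, "")
    return "\n".join(lines)


def render_export(root: str, tree_name: str, files: dict, separator: str = "=" * 80) -> str:
    """Assemble header, directory tree, and file sections into one string."""
    out = ["Code Export from: " + root, "Total files: %d" % len(files), separator, ""]
    out += ["DIRECTORY TREE:", separator, render_tree(list(files), tree_name), "", separator, ""]
    for i, (rel, content) in enumerate(files.items(), start=1):
        out += ["FILE %d/%d: %s" % (i, len(files), rel), separator, content, "", separator]
    return "\n".join(out)


files = {"src/main.py": "print('hi')", "README.md": "# Demo"}
print(render_export("/path/to/project", "my-project", files))
```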
348
+ ## Use Cases
349
+
350
+ - 📚 **Code Review:** Share an entire codebase in a single file
351
+ - 🤖 **LLM Input:** Feed code to ChatGPT, Claude, or other AI assistants
352
+ - 📖 **Documentation:** Create comprehensive code documentation
353
+ - 🔍 **Code Search:** Easy text-based search across an entire project
354
+ - 📊 **Analysis:** Input for code analysis tools
355
+ - 💾 **Archival:** Simple code backup format
356
+
357
+ ## Tips & Tricks
358
+
359
+ ### For LLM Consumption
360
+
361
+ ```bash
362
+ # Step 1: Check what you're working with
363
+ code-to-txt --stats
364
+
365
+ # Step 2: Preview files
366
+ code-to-txt --dry-run --max-file-size 200
367
+
368
+ # Step 3: Copy to clipboard with size limit
369
+ code-to-txt --clipboard-only --max-file-size 200 -e ".py .md"
370
+
371
+ # See token estimate:
372
+ # Estimated tokens: ~95,000
373
+ ```
374
+
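The README does not say how the token estimate is computed. A common rule of thumb is roughly four characters per token for English text and code, which the sketch below uses purely as an assumption; the function `estimate_tokens` is hypothetical and real tokenizers will give different counts.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Roughly estimate token count from character count."""
    # Assumption: ~4 characters per token; this is a heuristic, not a tokenizer.
    return round(len(text) / chars_per_token)


sample = "def main():\n    print('Hello, World!')\n"
print("Estimated tokens: ~{:,}".format(estimate_tokens(sample)))
```

An estimate like this is useful for deciding whether to tighten `--max-file-size` or narrow the extension list before pasting the output into a model with a fixed context window.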
375
+ ### For Large Projects
376
+
377
+ ```bash
378
+ # Use specific extensions to reduce size
379
+ code-to-txt -e ".py" -t --max-file-size 500
380
+
381
+ # Exclude heavy directories
382
+ code-to-txt -x "node_modules/*" -x "venv/*" -x "dist/*"
383
+
384
+ # Get statistics first
385
+ code-to-txt --stats --max-file-size 300
386
+ ```
387
+
388
+ ### Debug Ignore Patterns
389
+
390
+ ```bash
391
+ # See which files are being skipped and why
392
+ code-to-txt --dry-run
393
+
394
+ # Compare with and without gitignore
395
+ code-to-txt --dry-run --no-gitignore
396
+ ```
397
+
398
+ ## Requirements
399
+
400
+ - Python 3.10+
401
+ - Dependencies: `click`, `gitpython`, `pathspec`, `pyperclip`, `pyyaml`
402
+
403
+ ## Development
404
+
405
+ ```bash
406
+ # Clone repository
407
+ git clone https://github.com/AndriiSonsiadlo/code-to-txt.git
408
+ cd code-to-txt
409
+
410
+ # Install with Poetry
411
+ poetry install
412
+
413
+ # Run tests
414
+ poetry run pytest
415
+
416
+ # Run linting
417
+ poetry run ruff check .
418
+ poetry run mypy src/
419
+ ```
420
+
421
+ ## Contributing
422
+
423
+ Contributions are welcome! Please feel free to submit a Pull Request.
424
+
425
+ ## License
426
+
427
+ MIT License - see LICENSE file for details.
428
+
429
+ ## Changelog
430
+
431
+ ### v0.3.0
432
+
433
+ - 🔧 Refactored codebase for better maintainability
434
+ - 📁 Externalized default extensions and ignore patterns to separate files
435
+ - 🐛 Fixed critical gitignore bug (now checks parent directories)
436
+ - 🔍 Improved cross-platform path handling
437
+ - 📊 Added `--stats` flag for detailed codebase statistics
438
+ - 🎯 Added `--dry-run` mode to preview without processing
439
+ - 📏 Added `--max-file-size` to skip large files
440
+ - 🔢 Added token estimation for LLM consumption
441
+ - 📝 Added skip tracking to see which files were excluded
442
+ - 🚀 Improved method naming and code structure
443
+ - ✅ Enhanced test coverage
444
+
445
+ ### v0.2.0
446
+
447
+ - ✨ Added automatic timestamp generation for output files
448
+ - 📋 Added clipboard support (`--clipboard` and `--clipboard-only`)
449
+ - 🎯 Improved extension handling (space/comma separated)
450
+ - 🔍 Added glob pattern support
451
+ - ⚙️ Added configuration file support (`.code-to-txt.yml`)
452
+ - 🚀 Expanded default file extensions and ignore patterns
453
+ - 🐛 Various bug fixes and improvements
454
+
455
+ ### v0.1.0
456
+
457
+ - 🎉 Initial release
458
+ - 📁 Basic directory to text conversion
459
+ - 🌳 Directory tree generation
460
+ - 🚫 .gitignore support
461
+ - 🎨 Customizable separators
462
+
463
+ ## Acknowledgments
464
+
465
+ Created by Andrii Sonsiadlo
466
+