splurge-dsv 2025.1.3__tar.gz → 2025.1.5__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (19)
  1. {splurge_dsv-2025.1.3/splurge_dsv.egg-info → splurge_dsv-2025.1.5}/PKG-INFO +12 -87
  2. {splurge_dsv-2025.1.3 → splurge_dsv-2025.1.5}/README.md +11 -86
  3. {splurge_dsv-2025.1.3 → splurge_dsv-2025.1.5}/pyproject.toml +1 -1
  4. {splurge_dsv-2025.1.3 → splurge_dsv-2025.1.5}/splurge_dsv/__init__.py +1 -1
  5. {splurge_dsv-2025.1.3 → splurge_dsv-2025.1.5}/splurge_dsv/__main__.py +3 -3
  6. {splurge_dsv-2025.1.3 → splurge_dsv-2025.1.5}/splurge_dsv/cli.py +53 -12
  7. {splurge_dsv-2025.1.3 → splurge_dsv-2025.1.5/splurge_dsv.egg-info}/PKG-INFO +12 -87
  8. {splurge_dsv-2025.1.3 → splurge_dsv-2025.1.5}/LICENSE +0 -0
  9. {splurge_dsv-2025.1.3 → splurge_dsv-2025.1.5}/setup.cfg +0 -0
  10. {splurge_dsv-2025.1.3 → splurge_dsv-2025.1.5}/splurge_dsv/dsv_helper.py +0 -0
  11. {splurge_dsv-2025.1.3 → splurge_dsv-2025.1.5}/splurge_dsv/exceptions.py +0 -0
  12. {splurge_dsv-2025.1.3 → splurge_dsv-2025.1.5}/splurge_dsv/path_validator.py +0 -0
  13. {splurge_dsv-2025.1.3 → splurge_dsv-2025.1.5}/splurge_dsv/resource_manager.py +0 -0
  14. {splurge_dsv-2025.1.3 → splurge_dsv-2025.1.5}/splurge_dsv/string_tokenizer.py +0 -0
  15. {splurge_dsv-2025.1.3 → splurge_dsv-2025.1.5}/splurge_dsv/text_file_helper.py +0 -0
  16. {splurge_dsv-2025.1.3 → splurge_dsv-2025.1.5}/splurge_dsv.egg-info/SOURCES.txt +0 -0
  17. {splurge_dsv-2025.1.3 → splurge_dsv-2025.1.5}/splurge_dsv.egg-info/dependency_links.txt +0 -0
  18. {splurge_dsv-2025.1.3 → splurge_dsv-2025.1.5}/splurge_dsv.egg-info/requires.txt +0 -0
  19. {splurge_dsv-2025.1.3 → splurge_dsv-2025.1.5}/splurge_dsv.egg-info/top_level.txt +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: splurge-dsv
-Version: 2025.1.3
+Version: 2025.1.5
 Summary: A utility library for working with DSV (Delimited String Values) files
 Author: Jim Schilling
 License-Expression: MIT
@@ -29,6 +29,11 @@ Dynamic: license-file
 
 # splurge-dsv
 
+[![PyPI version](https://badge.fury.io/py/splurge-dsv.svg)](https://pypi.org/project/splurge-dsv/)
+[![Python versions](https://img.shields.io/pypi/pyversions/splurge-dsv.svg)](https://pypi.org/project/splurge-dsv/)
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+[![Coverage](https://img.shields.io/badge/coverage-96%25-brightgreen.svg)](https://github.com/jim-schilling/splurge-dsv)
+
 A robust Python library for parsing and processing delimited-separated value (DSV) files with advanced features for data validation, streaming, and error handling.
 
 ## Features
@@ -243,97 +248,17 @@ The project follows strict coding standards:
243
248
 
244
249
  ## Changelog
245
250
 
246
- ### 2025.1.3 (2025-09-03)
247
-
248
- #### 🔧 Maintenance & Consistency
249
- - **Version Alignment**: Bumped `__version__` and CLI `--version` to `2025.1.3` to match `pyproject.toml`.
250
- - **CLI Path Validation**: Centralized validation using `PathValidator.validate_path(...)` for consistent error handling.
251
- - **Type Correctness**: Fixed `PathValidator._is_valid_windows_drive_pattern` to return `bool` explicitly.
252
- - **Docs Alignment**: Updated README coverage claims to reflect the `>=85%` coverage gate configured in CI.
253
-
254
- ### 2025.1.2 (2025-09-02)
255
-
256
- #### 🧪 Comprehensive End-to-End Testing
257
- - **Complete E2E Test Suite**: Implemented 25 comprehensive end-to-end workflow tests covering all major CLI functionality
258
- - **Real CLI Execution**: Tests run actual `splurge-dsv` commands with real files, not just mocked components
259
- - **Workflow Coverage**: Tests cover CSV/TSV parsing, file operations, data processing, error handling, and performance scenarios
260
- - **Cross-Platform Compatibility**: Handles Windows-specific encoding issues and platform differences gracefully
261
- - **Performance Testing**: Large file processing tests (1,000+ and 10,000+ rows) with streaming and chunking validation
262
-
263
- #### 📊 Test Coverage Improvements
264
- - **Integration Testing**: Added real file system operations and complete pipeline validation
265
-
266
- #### 🔄 Test Categories
267
- - **CLI Workflows**: 19 tests covering basic parsing, custom delimiters, header/footer skipping, streaming, and error scenarios
268
- - **Error Handling**: 3 tests for invalid arguments, missing parameters, and CLI error conditions
269
- - **Integration Scenarios**: 3 tests for data analysis, transformation, and multi-format workflows
270
-
271
- #### 📚 Documentation & Examples
272
- - **E2E Testing Guide**: Created comprehensive documentation (`docs/e2e_testing_coverage.md`) explaining test coverage and usage
273
- - **Real-World Examples**: Tests serve as practical examples of library usage patterns
274
- - **Error Scenario Coverage**: Comprehensive testing of edge cases and failure conditions
275
-
276
- ### 2025.1.1 (2025-08-XX)
277
-
278
- #### 🔧 Code Quality Improvements
279
- - **Refactored Complex Regex Logic**: Extracted Windows drive letter validation logic from `_check_dangerous_characters` into a dedicated `_is_valid_windows_drive_pattern` helper method in `PathValidator` for better readability and maintainability
280
- - **Exception Handling Consistency**: Fixed inconsistency in `ResourceManager.acquire()` method to properly re-raise `NotImplementedError` without wrapping it in `SplurgeResourceAcquisitionError`
281
- - **Import Organization**: Moved all imports to the top of modules across the entire codebase for better code structure and PEP 8 compliance
282
-
283
- #### 🧪 Testing Enhancements
284
- - **Public API Focus**: Removed all tests that validated private implementation details, focusing exclusively on public API behavior validation
285
- - **Comprehensive Resource Manager Tests**: Added extensive test suite for `ResourceManager` module covering all public methods, edge cases, error scenarios, and context manager behavior
286
- - **Bookend Logic Clarification**: Updated and corrected all tests related to `StringTokenizer.remove_bookends` to properly reflect its single-character, symmetric bookend matching behavior
287
- - **Path Validation Test Clarity**: Clarified test expectations and comments for Windows drive-relative paths (e.g., "C:file.txt") to reflect the validator's intentionally strict security design
288
-
289
- #### 🐛 Bug Fixes
290
- - **Test Reliability**: Fixed failing tests in `ResourceManager` context manager scenarios by properly handling file truncation and line ending normalization
291
- - **Ruff Compliance**: Resolved all linting warnings including unused variables and imports
292
-
293
- #### 📚 Documentation Updates
294
- - **Method Documentation**: Updated `ResourceManager.acquire()` docstring to include `NotImplementedError` in the Raises section
295
- - **Test Comments**: Enhanced test documentation with clearer explanations of expected behaviors and edge cases
296
-
297
- ### 2025.1.0 (2025-08-25)
298
-
299
- #### 🎉 Major Features
300
- - **Complete DSV Parser**: Full-featured delimited-separated value parser with support for CSV, TSV, and custom delimiters
301
- - **Streaming Support**: Memory-efficient streaming for large files with configurable chunk sizes
302
- - **Advanced Parsing Options**: Bookend removal, whitespace handling, and encoding support
303
- - **Header/Footer Skipping**: Skip specified numbers of rows from start or end of files
304
-
305
- #### 🛡️ Security Enhancements
306
- - **Path Validation System**: Comprehensive file path security validation with traversal attack prevention
307
- - **File Permission Checks**: Automatic file accessibility and permission validation
308
- - **Encoding Validation**: Robust encoding error detection and handling
309
-
310
- #### 🔧 Core Components
311
- - **DsvHelper**: Main DSV parsing class with parse, parses, parse_file, and parse_stream methods
312
- - **TextFileHelper**: Utility class for text file operations (line counting, preview, reading, streaming)
313
- - **PathValidator**: Security-focused path validation utilities
314
- - **ResourceManager**: Context managers for safe resource handling
315
- - **StringTokenizer**: Core string parsing functionality
316
-
317
- #### 🧪 Testing & Quality
318
- - **Comprehensive Test Suite**: 250+ tests with 85%+ coverage gate
319
- - **Cross-Platform Testing**: Tested on Windows, Linux, and macOS
320
- - **Type Safety**: Full type annotations throughout the codebase
321
- - **Error Handling**: Custom exception hierarchy with detailed error messages
322
-
323
- #### 📚 Documentation
324
- - **Complete API Documentation**: Google-style docstrings for all public methods
325
- - **Usage Examples**: Comprehensive examples for all major features
326
- - **Error Documentation**: Detailed error handling documentation
327
-
328
- #### 🚀 Performance
329
- - **Memory Efficiency**: Streaming support for large files
330
- - **Optimized Parsing**: Efficient string tokenization and processing
331
- - **Resource Management**: Automatic cleanup and resource management
251
+ See the [CHANGELOG](CHANGELOG.md) for full release notes.
332
252
 
333
253
  ## License
334
254
 
335
255
  This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
336
256
 
257
+ ## More Documentation
258
+
259
+ - Detailed docs: [docs/README-details.md](docs/README-details.md)
260
+ - E2E testing coverage: [docs/e2e_testing_coverage.md](docs/e2e_testing_coverage.md)
261
+
337
262
  ## Contributing
338
263
 
339
264
  Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
@@ -1,5 +1,10 @@
 # splurge-dsv
 
+[![PyPI version](https://badge.fury.io/py/splurge-dsv.svg)](https://pypi.org/project/splurge-dsv/)
+[![Python versions](https://img.shields.io/pypi/pyversions/splurge-dsv.svg)](https://pypi.org/project/splurge-dsv/)
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+[![Coverage](https://img.shields.io/badge/coverage-96%25-brightgreen.svg)](https://github.com/jim-schilling/splurge-dsv)
+
 A robust Python library for parsing and processing delimited-separated value (DSV) files with advanced features for data validation, streaming, and error handling.
 
 ## Features
@@ -214,97 +219,17 @@ The project follows strict coding standards:
214
219
 
215
220
  ## Changelog
216
221
 
217
- ### 2025.1.3 (2025-09-03)
218
-
219
- #### 🔧 Maintenance & Consistency
220
- - **Version Alignment**: Bumped `__version__` and CLI `--version` to `2025.1.3` to match `pyproject.toml`.
221
- - **CLI Path Validation**: Centralized validation using `PathValidator.validate_path(...)` for consistent error handling.
222
- - **Type Correctness**: Fixed `PathValidator._is_valid_windows_drive_pattern` to return `bool` explicitly.
223
- - **Docs Alignment**: Updated README coverage claims to reflect the `>=85%` coverage gate configured in CI.
224
-
225
- ### 2025.1.2 (2025-09-02)
226
-
227
- #### 🧪 Comprehensive End-to-End Testing
228
- - **Complete E2E Test Suite**: Implemented 25 comprehensive end-to-end workflow tests covering all major CLI functionality
229
- - **Real CLI Execution**: Tests run actual `splurge-dsv` commands with real files, not just mocked components
230
- - **Workflow Coverage**: Tests cover CSV/TSV parsing, file operations, data processing, error handling, and performance scenarios
231
- - **Cross-Platform Compatibility**: Handles Windows-specific encoding issues and platform differences gracefully
232
- - **Performance Testing**: Large file processing tests (1,000+ and 10,000+ rows) with streaming and chunking validation
233
-
234
- #### 📊 Test Coverage Improvements
235
- - **Integration Testing**: Added real file system operations and complete pipeline validation
236
-
237
- #### 🔄 Test Categories
238
- - **CLI Workflows**: 19 tests covering basic parsing, custom delimiters, header/footer skipping, streaming, and error scenarios
239
- - **Error Handling**: 3 tests for invalid arguments, missing parameters, and CLI error conditions
240
- - **Integration Scenarios**: 3 tests for data analysis, transformation, and multi-format workflows
241
-
242
- #### 📚 Documentation & Examples
243
- - **E2E Testing Guide**: Created comprehensive documentation (`docs/e2e_testing_coverage.md`) explaining test coverage and usage
244
- - **Real-World Examples**: Tests serve as practical examples of library usage patterns
245
- - **Error Scenario Coverage**: Comprehensive testing of edge cases and failure conditions
246
-
247
- ### 2025.1.1 (2025-08-XX)
248
-
249
- #### 🔧 Code Quality Improvements
250
- - **Refactored Complex Regex Logic**: Extracted Windows drive letter validation logic from `_check_dangerous_characters` into a dedicated `_is_valid_windows_drive_pattern` helper method in `PathValidator` for better readability and maintainability
251
- - **Exception Handling Consistency**: Fixed inconsistency in `ResourceManager.acquire()` method to properly re-raise `NotImplementedError` without wrapping it in `SplurgeResourceAcquisitionError`
252
- - **Import Organization**: Moved all imports to the top of modules across the entire codebase for better code structure and PEP 8 compliance
253
-
254
- #### 🧪 Testing Enhancements
255
- - **Public API Focus**: Removed all tests that validated private implementation details, focusing exclusively on public API behavior validation
256
- - **Comprehensive Resource Manager Tests**: Added extensive test suite for `ResourceManager` module covering all public methods, edge cases, error scenarios, and context manager behavior
257
- - **Bookend Logic Clarification**: Updated and corrected all tests related to `StringTokenizer.remove_bookends` to properly reflect its single-character, symmetric bookend matching behavior
258
- - **Path Validation Test Clarity**: Clarified test expectations and comments for Windows drive-relative paths (e.g., "C:file.txt") to reflect the validator's intentionally strict security design
259
-
260
- #### 🐛 Bug Fixes
261
- - **Test Reliability**: Fixed failing tests in `ResourceManager` context manager scenarios by properly handling file truncation and line ending normalization
262
- - **Ruff Compliance**: Resolved all linting warnings including unused variables and imports
263
-
264
- #### 📚 Documentation Updates
265
- - **Method Documentation**: Updated `ResourceManager.acquire()` docstring to include `NotImplementedError` in the Raises section
266
- - **Test Comments**: Enhanced test documentation with clearer explanations of expected behaviors and edge cases
267
-
268
- ### 2025.1.0 (2025-08-25)
269
-
270
- #### 🎉 Major Features
271
- - **Complete DSV Parser**: Full-featured delimited-separated value parser with support for CSV, TSV, and custom delimiters
272
- - **Streaming Support**: Memory-efficient streaming for large files with configurable chunk sizes
273
- - **Advanced Parsing Options**: Bookend removal, whitespace handling, and encoding support
274
- - **Header/Footer Skipping**: Skip specified numbers of rows from start or end of files
275
-
276
- #### 🛡️ Security Enhancements
277
- - **Path Validation System**: Comprehensive file path security validation with traversal attack prevention
278
- - **File Permission Checks**: Automatic file accessibility and permission validation
279
- - **Encoding Validation**: Robust encoding error detection and handling
280
-
281
- #### 🔧 Core Components
282
- - **DsvHelper**: Main DSV parsing class with parse, parses, parse_file, and parse_stream methods
283
- - **TextFileHelper**: Utility class for text file operations (line counting, preview, reading, streaming)
284
- - **PathValidator**: Security-focused path validation utilities
285
- - **ResourceManager**: Context managers for safe resource handling
286
- - **StringTokenizer**: Core string parsing functionality
287
-
288
- #### 🧪 Testing & Quality
289
- - **Comprehensive Test Suite**: 250+ tests with 85%+ coverage gate
290
- - **Cross-Platform Testing**: Tested on Windows, Linux, and macOS
291
- - **Type Safety**: Full type annotations throughout the codebase
292
- - **Error Handling**: Custom exception hierarchy with detailed error messages
293
-
294
- #### 📚 Documentation
295
- - **Complete API Documentation**: Google-style docstrings for all public methods
296
- - **Usage Examples**: Comprehensive examples for all major features
297
- - **Error Documentation**: Detailed error handling documentation
298
-
299
- #### 🚀 Performance
300
- - **Memory Efficiency**: Streaming support for large files
301
- - **Optimized Parsing**: Efficient string tokenization and processing
302
- - **Resource Management**: Automatic cleanup and resource management
222
+ See the [CHANGELOG](CHANGELOG.md) for full release notes.
303
223
 
304
224
  ## License
305
225
 
306
226
  This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
307
227
 
228
+ ## More Documentation
229
+
230
+ - Detailed docs: [docs/README-details.md](docs/README-details.md)
231
+ - E2E testing coverage: [docs/e2e_testing_coverage.md](docs/e2e_testing_coverage.md)
232
+
308
233
  ## Contributing
309
234
 
310
235
  Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "splurge-dsv"
-version = "2025.1.3"
+version = "2025.1.5"
 description = "A utility library for working with DSV (Delimited String Values) files"
 readme = "README.md"
 requires-python = ">=3.10"
@@ -44,7 +44,7 @@ from splurge_dsv.resource_manager import (
 from splurge_dsv.string_tokenizer import StringTokenizer
 from splurge_dsv.text_file_helper import TextFileHelper
 
-__version__ = "2025.1.3"
+__version__ = "2025.1.5"
 __author__ = "Jim Schilling"
 __license__ = "MIT"
 
@@ -2,14 +2,14 @@
 Command-line interface entry point for splurge-dsv.
 
 This module serves as the entry point when running the package as a module.
-It imports and calls the main CLI function from the cli module.
+It imports and calls the run_cli function from the cli module.
 """
 
 # Standard library imports
 import sys
 
 # Local imports
-from splurge_dsv.cli import main
+from splurge_dsv.cli import run_cli
 
 if __name__ == "__main__":
-    sys.exit(main())
+    sys.exit(run_cli())
@@ -13,10 +13,12 @@ Please preserve this header and all related material when sharing!
 
 # Standard library imports
 import argparse
+import json
 import sys
 from pathlib import Path
 
 # Local imports
+from splurge_dsv import __version__
 from splurge_dsv.dsv_helper import DsvHelper
 from splurge_dsv.exceptions import SplurgeDsvError
 
@@ -56,7 +58,14 @@ Examples:
 
     parser.add_argument("--chunk-size", type=int, default=500, help="Chunk size for streaming (default: 500)")
 
-    parser.add_argument("--version", action="version", version="%(prog)s 2025.1.3")
+    parser.add_argument(
+        "--output-format",
+        choices=["table", "json", "ndjson"],
+        default="table",
+        help="Output format for results (default: table)",
+    )
+
+    parser.add_argument("--version", action="version", version=f"%(prog)s {__version__}")
 
     return parser.parse_args()
 
@@ -90,8 +99,25 @@ def print_results(rows: list[list[str]], delimiter: str) -> None:
90
99
  print("-" * (sum(max_widths) + len(max_widths) * 3 - 1))
91
100
 
92
101
 
-def main() -> int:
-    """Main entry point for the command-line interface."""
+def run_cli() -> int:
+    """Run the command-line interface for DSV file parsing.
+
+    This function serves as the main entry point for the splurge-dsv CLI tool.
+    It parses command-line arguments, validates the input file, and processes
+    DSV files according to the specified options. Supports both regular parsing
+    and streaming modes for large files.
+
+    Returns:
+        int: Exit code indicating success or failure:
+            - 0: Success
+            - 1: Generic error (file not found, parsing error, etc.)
+            - 2: Invalid arguments
+            - 130: Operation interrupted (Ctrl+C)
+
+    Raises:
+        SystemExit: Terminates the program with the appropriate exit code.
+            This is handled internally and should not be caught by callers.
+    """
     try:
         args = parse_arguments()
97
123
 
@@ -107,7 +133,8 @@ def main() -> int:
 
         # Parse the file
         if args.stream:
-            print(f"Streaming file '{args.file_path}' with delimiter '{args.delimiter}'...")
+            if args.output_format != "json":
+                print(f"Streaming file '{args.file_path}' with delimiter '{args.delimiter}'...")
             chunk_count = 0
             total_rows = 0
 
@@ -124,13 +151,21 @@ def main() -> int:
             ):
                 chunk_count += 1
                 total_rows += len(chunk)
-                print(f"Chunk {chunk_count}: {len(chunk)} rows")
-                print_results(chunk, args.delimiter)
-                print()
-
-            print(f"Total: {total_rows} rows in {chunk_count} chunks")
+                if args.output_format == "json":
+                    print(json.dumps(chunk, ensure_ascii=False))
+                elif args.output_format == "ndjson":
+                    for row in chunk:
+                        print(json.dumps(row, ensure_ascii=False))
+                else:
+                    print(f"Chunk {chunk_count}: {len(chunk)} rows")
+                    print_results(chunk, args.delimiter)
+                    print()
+
+            if args.output_format not in ["json", "ndjson"]:
+                print(f"Total: {total_rows} rows in {chunk_count} chunks")
         else:
-            print(f"Parsing file '{args.file_path}' with delimiter '{args.delimiter}'...")
+            if args.output_format not in ["json", "ndjson"]:
+                print(f"Parsing file '{args.file_path}' with delimiter '{args.delimiter}'...")
             rows = DsvHelper.parse_file(
                 file_path,
                 delimiter=args.delimiter,
@@ -142,8 +177,14 @@ def main() -> int:
                 skip_footer_rows=args.skip_footer,
             )
 
-            print(f"Parsed {len(rows)} rows")
-            print_results(rows, args.delimiter)
+            if args.output_format == "json":
+                print(json.dumps(rows, ensure_ascii=False))
+            elif args.output_format == "ndjson":
+                for row in rows:
+                    print(json.dumps(row, ensure_ascii=False))
+            else:
+                print(f"Parsed {len(rows)} rows")
+                print_results(rows, args.delimiter)
 
         return 0
 
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: splurge-dsv
-Version: 2025.1.3
+Version: 2025.1.5
 Summary: A utility library for working with DSV (Delimited String Values) files
 Author: Jim Schilling
 License-Expression: MIT
@@ -29,6 +29,11 @@ Dynamic: license-file
 
 # splurge-dsv
 
+[![PyPI version](https://badge.fury.io/py/splurge-dsv.svg)](https://pypi.org/project/splurge-dsv/)
+[![Python versions](https://img.shields.io/pypi/pyversions/splurge-dsv.svg)](https://pypi.org/project/splurge-dsv/)
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+[![Coverage](https://img.shields.io/badge/coverage-96%25-brightgreen.svg)](https://github.com/jim-schilling/splurge-dsv)
+
 A robust Python library for parsing and processing delimited-separated value (DSV) files with advanced features for data validation, streaming, and error handling.
 
 ## Features
@@ -243,97 +248,17 @@ The project follows strict coding standards:
243
248
 
244
249
  ## Changelog
245
250
 
246
- ### 2025.1.3 (2025-09-03)
247
-
248
- #### 🔧 Maintenance & Consistency
249
- - **Version Alignment**: Bumped `__version__` and CLI `--version` to `2025.1.3` to match `pyproject.toml`.
250
- - **CLI Path Validation**: Centralized validation using `PathValidator.validate_path(...)` for consistent error handling.
251
- - **Type Correctness**: Fixed `PathValidator._is_valid_windows_drive_pattern` to return `bool` explicitly.
252
- - **Docs Alignment**: Updated README coverage claims to reflect the `>=85%` coverage gate configured in CI.
253
-
254
- ### 2025.1.2 (2025-09-02)
255
-
256
- #### 🧪 Comprehensive End-to-End Testing
257
- - **Complete E2E Test Suite**: Implemented 25 comprehensive end-to-end workflow tests covering all major CLI functionality
258
- - **Real CLI Execution**: Tests run actual `splurge-dsv` commands with real files, not just mocked components
259
- - **Workflow Coverage**: Tests cover CSV/TSV parsing, file operations, data processing, error handling, and performance scenarios
260
- - **Cross-Platform Compatibility**: Handles Windows-specific encoding issues and platform differences gracefully
261
- - **Performance Testing**: Large file processing tests (1,000+ and 10,000+ rows) with streaming and chunking validation
262
-
263
- #### 📊 Test Coverage Improvements
264
- - **Integration Testing**: Added real file system operations and complete pipeline validation
265
-
266
- #### 🔄 Test Categories
267
- - **CLI Workflows**: 19 tests covering basic parsing, custom delimiters, header/footer skipping, streaming, and error scenarios
268
- - **Error Handling**: 3 tests for invalid arguments, missing parameters, and CLI error conditions
269
- - **Integration Scenarios**: 3 tests for data analysis, transformation, and multi-format workflows
270
-
271
- #### 📚 Documentation & Examples
272
- - **E2E Testing Guide**: Created comprehensive documentation (`docs/e2e_testing_coverage.md`) explaining test coverage and usage
273
- - **Real-World Examples**: Tests serve as practical examples of library usage patterns
274
- - **Error Scenario Coverage**: Comprehensive testing of edge cases and failure conditions
275
-
276
- ### 2025.1.1 (2025-08-XX)
277
-
278
- #### 🔧 Code Quality Improvements
279
- - **Refactored Complex Regex Logic**: Extracted Windows drive letter validation logic from `_check_dangerous_characters` into a dedicated `_is_valid_windows_drive_pattern` helper method in `PathValidator` for better readability and maintainability
280
- - **Exception Handling Consistency**: Fixed inconsistency in `ResourceManager.acquire()` method to properly re-raise `NotImplementedError` without wrapping it in `SplurgeResourceAcquisitionError`
281
- - **Import Organization**: Moved all imports to the top of modules across the entire codebase for better code structure and PEP 8 compliance
282
-
283
- #### 🧪 Testing Enhancements
284
- - **Public API Focus**: Removed all tests that validated private implementation details, focusing exclusively on public API behavior validation
285
- - **Comprehensive Resource Manager Tests**: Added extensive test suite for `ResourceManager` module covering all public methods, edge cases, error scenarios, and context manager behavior
286
- - **Bookend Logic Clarification**: Updated and corrected all tests related to `StringTokenizer.remove_bookends` to properly reflect its single-character, symmetric bookend matching behavior
287
- - **Path Validation Test Clarity**: Clarified test expectations and comments for Windows drive-relative paths (e.g., "C:file.txt") to reflect the validator's intentionally strict security design
288
-
289
- #### 🐛 Bug Fixes
290
- - **Test Reliability**: Fixed failing tests in `ResourceManager` context manager scenarios by properly handling file truncation and line ending normalization
291
- - **Ruff Compliance**: Resolved all linting warnings including unused variables and imports
292
-
293
- #### 📚 Documentation Updates
294
- - **Method Documentation**: Updated `ResourceManager.acquire()` docstring to include `NotImplementedError` in the Raises section
295
- - **Test Comments**: Enhanced test documentation with clearer explanations of expected behaviors and edge cases
296
-
297
- ### 2025.1.0 (2025-08-25)
298
-
299
- #### 🎉 Major Features
300
- - **Complete DSV Parser**: Full-featured delimited-separated value parser with support for CSV, TSV, and custom delimiters
301
- - **Streaming Support**: Memory-efficient streaming for large files with configurable chunk sizes
302
- - **Advanced Parsing Options**: Bookend removal, whitespace handling, and encoding support
303
- - **Header/Footer Skipping**: Skip specified numbers of rows from start or end of files
304
-
305
- #### 🛡️ Security Enhancements
306
- - **Path Validation System**: Comprehensive file path security validation with traversal attack prevention
307
- - **File Permission Checks**: Automatic file accessibility and permission validation
308
- - **Encoding Validation**: Robust encoding error detection and handling
309
-
310
- #### 🔧 Core Components
311
- - **DsvHelper**: Main DSV parsing class with parse, parses, parse_file, and parse_stream methods
312
- - **TextFileHelper**: Utility class for text file operations (line counting, preview, reading, streaming)
313
- - **PathValidator**: Security-focused path validation utilities
314
- - **ResourceManager**: Context managers for safe resource handling
315
- - **StringTokenizer**: Core string parsing functionality
316
-
317
- #### 🧪 Testing & Quality
318
- - **Comprehensive Test Suite**: 250+ tests with 85%+ coverage gate
319
- - **Cross-Platform Testing**: Tested on Windows, Linux, and macOS
320
- - **Type Safety**: Full type annotations throughout the codebase
321
- - **Error Handling**: Custom exception hierarchy with detailed error messages
322
-
323
- #### 📚 Documentation
324
- - **Complete API Documentation**: Google-style docstrings for all public methods
325
- - **Usage Examples**: Comprehensive examples for all major features
326
- - **Error Documentation**: Detailed error handling documentation
327
-
328
- #### 🚀 Performance
329
- - **Memory Efficiency**: Streaming support for large files
330
- - **Optimized Parsing**: Efficient string tokenization and processing
331
- - **Resource Management**: Automatic cleanup and resource management
251
+ See the [CHANGELOG](CHANGELOG.md) for full release notes.
332
252
 
333
253
  ## License
334
254
 
335
255
  This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
336
256
 
257
+ ## More Documentation
258
+
259
+ - Detailed docs: [docs/README-details.md](docs/README-details.md)
260
+ - E2E testing coverage: [docs/e2e_testing_coverage.md](docs/e2e_testing_coverage.md)
261
+
337
262
  ## Contributing
338
263
 
339
264
  Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
File without changes
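In this release the `cli.py` diff stops hard-coding the version in the `--version` flag and formats it from the package's `__version__` instead, which removes the need to bump the string in two places. The sketch below illustrates that pattern in isolation. It assumes a placeholder `__version__` value and a made-up program name `demo-cli`; `build_parser` and `capture_version_output` are hypothetical helpers, not functions from splurge-dsv.

```python
import argparse
import contextlib
import io

# Placeholder standing in for a package-level __version__ (assumption).
__version__ = "0.0.0"


def build_parser() -> argparse.ArgumentParser:
    """Build a small parser that single-sources its version string."""
    parser = argparse.ArgumentParser(prog="demo-cli")
    parser.add_argument(
        "--output-format",
        choices=["table", "json", "ndjson"],
        default="table",
        help="Output format for results (default: table)",
    )
    # The f-string fills in the version; argparse later expands %(prog)s.
    parser.add_argument("--version", action="version", version=f"%(prog)s {__version__}")
    return parser


def capture_version_output() -> tuple[str, object]:
    """Run the parser with --version and return (printed text, exit code)."""
    buffer = io.StringIO()
    exit_code = None
    with contextlib.redirect_stdout(buffer):
        try:
            build_parser().parse_args(["--version"])
        except SystemExit as exc:
            # The version action prints and then exits the parser.
            exit_code = exc.code
    return buffer.getvalue().strip(), exit_code
```

Calling `capture_version_output()` shows the text that a user would see for `demo-cli --version`, along with the status the version action exits with.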
File without changes
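The `cli.py` changes add an `--output-format` option with `table`, `json`, and `ndjson` choices. Going by the diff, `json` prints the whole row list (or each streamed chunk) as a single JSON array, while `ndjson` prints one JSON array per row. The sketch below reproduces only that serialization difference, using the same `json.dumps(..., ensure_ascii=False)` call seen in the diff. `render_rows` and the sample data are hypothetical and introduced for illustration.

```python
import json


def render_rows(rows: list[list[str]], output_format: str) -> list[str]:
    """Return the lines a CLI might print for the given format (hypothetical helper)."""
    if output_format == "json":
        # One line holding the entire list of rows.
        return [json.dumps(rows, ensure_ascii=False)]
    if output_format == "ndjson":
        # One line per row, each line a standalone JSON document.
        return [json.dumps(row, ensure_ascii=False) for row in rows]
    raise ValueError(f"unsupported format: {output_format}")


# Sample data with non-ASCII text, which ensure_ascii=False leaves unescaped.
sample_rows = [["name", "city"], ["Zoë", "Kraków"]]
json_lines = render_rows(sample_rows, "json")
ndjson_lines = render_rows(sample_rows, "ndjson")
```

A consumer can parse the `json` output with a single `json.loads` call, whereas `ndjson` output can be processed line by line, which suits streamed input.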