PyPI: nm-tool-forge 0.2.3 → 0.2.5 (tar.gz)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (39)

{nm_tool_forge-0.2.3 → nm_tool_forge-0.2.5}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: nm-tool-forge
-Version: 0.2.3
+Version: 0.2.5
 Summary: Analyze MigMan log files and generate aggregated CSV, Markdown, HTML, and optional PDF reports.
 Author-email: Stefan Ewald <s.ew@outlook.de>
 License-Expression: MIT
@@ -31,7 +31,7 @@ Dynamic: license-file
 # nm-tool-forge
-`nm-tool-forge` analyzes MigMan text log files with severity tokens such as `INFO`, `ERROR`, and `WARNING` and generates aggregated CSV, Markdown, HTML, and optional PDF reports.
+`nm-tool-forge` analyzes MigMan text log files with severity tokens such as `INFO`, `ERROR`, and `WARNING` and generates aggregated CSV, Markdown, HTML, and optional PDF reports. The package also includes `csvchunking`, a small helper for splitting large CSV files into migration-friendly chunks.
 The project uses a package-ready `src` layout. The legacy `log_analysis.py` file remains available as a thin compatibility entry point for older local setups.
@@ -43,6 +43,7 @@ The project uses a package-ready `src` layout. The legacy `log_analysis.py` file
 - Generate Markdown summary reports
 - Optionally convert reports to HTML and PDF
 - Keep a backup copy of analyzed log files
+- Split large CSV files into numbered chunks while preserving the header row
 - Run built-in self-tests from the CLI
 ## Installation
@@ -61,12 +62,14 @@ python -m pip install .[pdf,dev]
 ## Command-line usage
-After installation, both entry points are available:
+After installation, the CLI entry points are available:
 ```powershell
 python -m loganalysis --help
+python -m csvchunking --help
 loganalysis --help
 nm-tool-forge --help
+csvchunking --help
 ```
 Typical analysis run:
@@ -89,29 +92,30 @@ python -m loganalysis --self-test
 Legacy compatibility call:
+```powershell
+python .\log_analysis.py --convert
+```
-## Release process
-To publish a new release, always test on TestPyPI first, then upload to PyPI only after successful Conda/Smoke-Tests:
-```bash
-export TWINE_USERNAME="__token__"
-export TWINE_PASSWORD="pypi-..."
+CSV chunking run:
-bash scripts/release_testpypi.sh --bump patch
-bash scripts/release_pypi.sh --yes
+```powershell
+csvchunking "data\large_export.csv" --chunk-size 5000
 ```
-**Hinweise:**
-- Erst TestPyPI ausführen und testen, dann final nach PyPI hochladen.
-- Versionen auf PyPI können nicht überschrieben oder erneut verwendet werden.
+The command creates an output directory next to the input file named after the CSV stem. For example, `data\large_export.csv` is split into files such as `data\large_export\large_export_01.csv`, `data\large_export\large_export_02.csv`, and so on.
+CSV chunking with an explicit encoding:
 ```powershell
-python .\log_analysis.py --convert
+python -m csvchunking "data\large_export.csv" --chunk-size 5000 --encoding utf-8-sig
 ```
+Each chunk contains the original header row plus up to `--chunk-size` data rows. The delimiter is detected automatically; if detection fails, semicolon-separated CSV is used.
 ## Supported CLI options
+Log analysis options:
 - `--logs-dir`
 - `--out-dir`
 - `--backup-dir`
@@ -119,6 +123,28 @@ python .\log_analysis.py --convert
 - `--convert`
 - `--self-test`
+CSV chunking options:
+- `input_file` - path to the CSV file to split
+- `--chunk-size` - required number of data rows per output file; must be greater than zero
+- `--encoding` - input and output encoding; defaults to `utf-8-sig`
+## Release process
+To publish a new release, always test on TestPyPI first, then upload to PyPI only after successful Conda smoke tests:
+```bash
+export TWINE_USERNAME="__token__"
+export TWINE_PASSWORD="pypi-..."
+bash scripts/release_testpypi.sh --bump patch
+bash scripts/release_pypi.sh --yes
+```
+**Notes:**
+- Run and verify the TestPyPI release first, then upload the final package to PyPI.
+- PyPI versions cannot be overwritten or reused.
 ## Library usage
 ```python
@@ -130,6 +156,7 @@ from loganalysis import (
     iter_logical_entries,
     normalize_message,
 )
+from csvchunking import split_csv
 result = analyze_file(Path("logs/app.txt"))
 print(result["norm_counts"])
@@ -146,14 +173,21 @@ convert_report_md_to_html_pdf(
     Path("log_analyse_out/report.html"),
     Path("log_analyse_out/report.pdf"),
 )
+chunk_result = split_csv(Path("data/large_export.csv"), chunk_size=5000)
+print(chunk_result.output_dir)
+print(chunk_result.output_files)
 ```
+`split_csv()` returns a `ChunkResult` with the input file, output directory, chunk size, processed data-row count, created file count, and generated output file paths.
 ## Project structure
 ```text
 .
 ├─ pyproject.toml
 ├─ src/loganalysis/
+├─ src/csvchunking/
 ├─ tests/
 ├─ docs/
 └─ log_analysis.py
@@ -168,7 +202,9 @@ Important modules:
 - `report_html.py` - HTML/CSS rendering
 - `report_pdf.py` - PDF engine selection and fallback handling
 - `converters.py` - Markdown-to-HTML/PDF conversion
-- `cli.py` - command-line entry point
+- `loganalysis/cli.py` - log analysis command-line entry point
+- `csvchunking/chunker.py` - CSV splitting logic and `ChunkResult`
+- `csvchunking/cli.py` - CSV chunking command-line entry point
 ## HTML/PDF conversion
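
Note on the delimiter behaviour described in the README above ("detected automatically; if detection fails, semicolon-separated CSV is used"): the following is a minimal sketch of that idea, not the package's actual code. `SemicolonDialect` and `detect_dialect` are hypothetical names, and the candidate delimiter list is an assumption. The sketch subclasses `csv.excel` for the fallback so that the shared `csv.excel` class itself is never modified.

```python
import csv
import io


class SemicolonDialect(csv.excel):
    """Excel-style dialect with ';' as delimiter (hypothetical helper)."""

    delimiter = ";"


def detect_dialect(sample: str):
    """Return a sniffed dialect, or the semicolon fallback if sniffing fails."""
    try:
        # The candidate delimiters are an assumption made for this sketch.
        return csv.Sniffer().sniff(sample, delimiters=";,\t|")
    except csv.Error:
        return SemicolonDialect


# A comma-separated sample is detected as such.
comma_sample = "a,b\n1,2\n"
comma_rows = list(csv.reader(io.StringIO(comma_sample), detect_dialect(comma_sample)))

# An empty sample cannot be sniffed, so the semicolon fallback is returned.
fallback = detect_dialect("")
fallback_rows = list(csv.reader(io.StringIO("x;y\n"), fallback))
```

Returning a dedicated dialect class keeps the fallback local to the caller, so other code that relies on `csv.excel` still sees a comma delimiter.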

{nm_tool_forge-0.2.3 → nm_tool_forge-0.2.5}/README.md RENAMED

@@ -1,6 +1,6 @@
 # nm-tool-forge
-`nm-tool-forge` analyzes MigMan text log files with severity tokens such as `INFO`, `ERROR`, and `WARNING` and generates aggregated CSV, Markdown, HTML, and optional PDF reports.
+`nm-tool-forge` analyzes MigMan text log files with severity tokens such as `INFO`, `ERROR`, and `WARNING` and generates aggregated CSV, Markdown, HTML, and optional PDF reports. The package also includes `csvchunking`, a small helper for splitting large CSV files into migration-friendly chunks.
 The project uses a package-ready `src` layout. The legacy `log_analysis.py` file remains available as a thin compatibility entry point for older local setups.
@@ -9,10 +9,11 @@ The project uses a package-ready `src` layout. The legacy `log_analysis.py` file
 - Parse logical log entries from multi-line text logs
 - Normalize recurring error patterns for better aggregation
 - Generate aggregated CSV reports
-- Generate Markdown summary reports
-- Optionally convert reports to HTML and PDF
-- Keep a backup copy of analyzed log files
-- Run built-in self-tests from the CLI
+- Generate Markdown summary reports
+- Optionally convert reports to HTML and PDF
+- Keep a backup copy of analyzed log files
+- Split large CSV files into numbered chunks while preserving the header row
+- Run built-in self-tests from the CLI
 ## Installation
@@ -30,13 +31,15 @@ python -m pip install .[pdf,dev]
 ## Command-line usage
-After installation, both entry points are available:
-```powershell
-python -m loganalysis --help
-loganalysis --help
-nm-tool-forge --help
-```
+After installation, the CLI entry points are available:
+```powershell
+python -m loganalysis --help
+python -m csvchunking --help
+loganalysis --help
+nm-tool-forge --help
+csvchunking --help
+```
 Typical analysis run:
@@ -50,18 +53,54 @@ Analysis with HTML/PDF conversion:
 nm-tool-forge --logs-dir logs --out-dir log_analyse_out --convert
 ```
-Self-test mode:
-```powershell
-python -m loganalysis --self-test
-```
-Legacy compatibility call:
-## Release process
-To publish a new release, always test on TestPyPI first, then upload to PyPI only after successful Conda/Smoke-Tests:
+Self-test mode:
+```powershell
+python -m loganalysis --self-test
+```
+Legacy compatibility call:
+```powershell
+python .\log_analysis.py --convert
+```
+CSV chunking run:
+```powershell
+csvchunking "data\large_export.csv" --chunk-size 5000
+```
+The command creates an output directory next to the input file named after the CSV stem. For example, `data\large_export.csv` is split into files such as `data\large_export\large_export_01.csv`, `data\large_export\large_export_02.csv`, and so on.
+CSV chunking with an explicit encoding:
+```powershell
+python -m csvchunking "data\large_export.csv" --chunk-size 5000 --encoding utf-8-sig
+```
+Each chunk contains the original header row plus up to `--chunk-size` data rows. The delimiter is detected automatically; if detection fails, semicolon-separated CSV is used.
+## Supported CLI options
+Log analysis options:
+- `--logs-dir`
+- `--out-dir`
+- `--backup-dir`
+- `--top-examples`
+- `--convert`
+- `--self-test`
+CSV chunking options:
+- `input_file` - path to the CSV file to split
+- `--chunk-size` - required number of data rows per output file; must be greater than zero
+- `--encoding` - input and output encoding; defaults to `utf-8-sig`
+## Release process
+To publish a new release, always test on TestPyPI first, then upload to PyPI only after successful Conda smoke tests:
 ```bash
 export TWINE_USERNAME="__token__"
@@ -71,37 +110,25 @@ bash scripts/release_testpypi.sh --bump patch
 bash scripts/release_pypi.sh --yes
 ```
-**Hinweise:**
-- Erst TestPyPI ausführen und testen, dann final nach PyPI hochladen.
-- Versionen auf PyPI können nicht überschrieben oder erneut verwendet werden.
-```powershell
-python .\log_analysis.py --convert
-```
-## Supported CLI options
-- `--logs-dir`
-- `--out-dir`
-- `--backup-dir`
-- `--top-examples`
-- `--convert`
-- `--self-test`
-## Library usage
-```python
+**Notes:**
+- Run and verify the TestPyPI release first, then upload the final package to PyPI.
+- PyPI versions cannot be overwritten or reused.
+## Library usage
+```python
 from pathlib import Path
 from loganalysis import (
     analyze_file,
     convert_report_md_to_html_pdf,
-    iter_logical_entries,
-    normalize_message,
-)
-result = analyze_file(Path("logs/app.txt"))
-print(result["norm_counts"])
+    iter_logical_entries,
+    normalize_message,
+)
+from csvchunking import split_csv
+result = analyze_file(Path("logs/app.txt"))
+print(result["norm_counts"])
 print(normalize_message(
     'Conversion: X =3100110. 138 The record was not found in table "Teile".'
@@ -112,20 +139,27 @@ for entry in iter_logical_entries(Path("logs/app.txt")):
 convert_report_md_to_html_pdf(
     Path("log_analyse_out/report.md"),
-    Path("log_analyse_out/report.html"),
-    Path("log_analyse_out/report.pdf"),
-)
-```
+    Path("log_analyse_out/report.html"),
+    Path("log_analyse_out/report.pdf"),
+)
+chunk_result = split_csv(Path("data/large_export.csv"), chunk_size=5000)
+print(chunk_result.output_dir)
+print(chunk_result.output_files)
+```
+`split_csv()` returns a `ChunkResult` with the input file, output directory, chunk size, processed data-row count, created file count, and generated output file paths.
 ## Project structure
 ```text
 .
-├─ pyproject.toml
-├─ src/loganalysis/
-├─ tests/
-├─ docs/
-└─ log_analysis.py
+├─ pyproject.toml
+├─ src/loganalysis/
+├─ src/csvchunking/
+├─ tests/
+├─ docs/
+└─ log_analysis.py
 ```
 Important modules:
@@ -135,9 +169,11 @@ Important modules:
 - `normalization.py` - message normalization
 - `report_markdown.py` - Markdown report model and rendering
 - `report_html.py` - HTML/CSS rendering
-- `report_pdf.py` - PDF engine selection and fallback handling
-- `converters.py` - Markdown-to-HTML/PDF conversion
-- `cli.py` - command-line entry point
+- `report_pdf.py` - PDF engine selection and fallback handling
+- `converters.py` - Markdown-to-HTML/PDF conversion
+- `loganalysis/cli.py` - log analysis command-line entry point
+- `csvchunking/chunker.py` - CSV splitting logic and `ChunkResult`
+- `csvchunking/cli.py` - CSV chunking command-line entry point
 ## HTML/PDF conversion
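
Note on the chunking rule stated in the README above ("each chunk contains the original header row plus up to `--chunk-size` data rows"): the rule can be illustrated on in-memory rows with the sketch below. This is only an illustration under stated assumptions, not the code in `csvchunking/chunker.py`. `iter_chunks` and `chunk_name` are hypothetical helpers, and the two-digit zero padding is inferred from the example file names such as `large_export_01.csv`.

```python
from itertools import islice
from typing import Iterable, Iterator


def iter_chunks(
    rows: Iterable[list[str]], chunk_size: int
) -> Iterator[list[list[str]]]:
    """Yield chunks that each start with the header row (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be greater than 0")
    iterator = iter(rows)
    try:
        header = next(iterator)
    except StopIteration:
        raise ValueError("Input is empty.") from None
    while True:
        # Take up to chunk_size data rows; stop once the input is exhausted.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield [header, *chunk]


def chunk_name(stem: str, index: int, suffix: str = ".csv") -> str:
    """Build a numbered file name; the padding width is an assumption."""
    return f"{stem}_{index:02d}{suffix}"


rows = [["col1", "col2"], ["A", "1"], ["B", "2"], ["C", "3"], ["D", "4"], ["E", "5"]]
chunks = list(iter_chunks(rows, chunk_size=2))
names = [chunk_name("sample", i) for i, _ in enumerate(chunks, start=1)]
```

With five data rows and a chunk size of 2, the sketch produces three chunks, which matches the file count asserted by `test_regular_split` in the package's test suite.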

{nm_tool_forge-0.2.3 → nm_tool_forge-0.2.5}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "nm-tool-forge"
-version = "0.2.3"
+version = "0.2.5"
 description = "Analyze MigMan log files and generate aggregated CSV, Markdown, HTML, and optional PDF reports."
 readme = { file = "README.md", content-type = "text/markdown" }
 requires-python = ">=3.10"

{nm_tool_forge-0.2.3 → nm_tool_forge-0.2.5}/src/csvchunking/__init__.py RENAMED

@@ -1,4 +1,4 @@
 from .chunker import ChunkResult, split_csv
 __all__ = ["ChunkResult", "split_csv"]
-__version__ = "0.2.3"
+__version__ = "0.2.5"

{nm_tool_forge-0.2.3 → nm_tool_forge-0.2.5}/src/csvchunking/chunker.py RENAMED

@@ -1,33 +1,50 @@
-import csv
-from dataclasses import dataclass
-from pathlib import Path
-@dataclass(frozen=True)
-class ChunkResult:
+import csv
+import re
+from dataclasses import dataclass
+from pathlib import Path
+@dataclass(frozen=True)
+class ChunkResult:
     input_file: Path
     output_dir: Path
-    chunk_size: int
-    data_rows_processed: int
-    files_created: int
-    output_files: tuple[Path, ...]
-def split_csv(
-    input_file: Path,
-    chunk_size: int,
-    encoding: str = "utf-8-sig",
-) -> ChunkResult:
+    chunk_size: int
+    data_rows_processed: int
+    files_created: int
+    output_files: tuple[Path, ...]
+def cleanup_existing_chunks(output_dir: Path, input_file: Path) -> None:
+    output_dir = Path(output_dir)
+    if not output_dir.exists():
+        return
+    input_file = Path(input_file)
+    pattern = re.compile(
+        rf"^{re.escape(input_file.stem)}_\d{{2,}}{re.escape(input_file.suffix)}$"
+    )
+    for existing_file in output_dir.iterdir():
+        if existing_file.is_file() and pattern.fullmatch(existing_file.name):
+            existing_file.unlink()
+def split_csv(
+    input_file: Path,
+    chunk_size: int,
+    encoding: str = "utf-8-sig",
+) -> ChunkResult:
     if not Path(input_file).is_file():
-        raise FileNotFoundError(f"Eingabedatei nicht gefunden: {input_file}")
+        raise FileNotFoundError(f"Input file not found: {input_file}")
     if chunk_size <= 0:
-        raise ValueError("chunk_size muss > 0 sein")
+        raise ValueError("chunk_size must be greater than 0")
     input_file = Path(input_file)
     output_dir = input_file.parent / input_file.stem
     output_dir.mkdir(exist_ok=True)
+    cleanup_existing_chunks(output_dir, input_file)
-    # Delimiter automatisch erkennen
+    # Detect the delimiter automatically.
     with open(input_file, encoding=encoding, newline="") as f:
         sample = f.read(4096)
         f.seek(0)
@@ -38,10 +55,10 @@ def split_csv(
             dialect = csv.excel
             dialect.delimiter = ";"
         reader = csv.reader(f, dialect)
-        try:
-            header = next(reader)
-        except StopIteration as exc:
-            raise ValueError("Eingabedatei ist leer.") from exc
+        try:
+            header = next(reader)
+        except StopIteration as exc:
+            raise ValueError("Input file is empty.") from exc
         chunk = []
         file_count = 0
         data_rows = 0
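
Note on `cleanup_existing_chunks` above: the regular expression only matches names of the form `<stem>_<two or more digits><suffix>`, so unrelated files in the output directory are left alone. The sketch below rebuilds the same pattern in isolation to show which names it accepts. `chunk_pattern` is a hypothetical wrapper introduced for illustration, and the candidate names are taken from the package's tests plus one three-digit example.

```python
import re
from pathlib import Path


def chunk_pattern(input_file: Path) -> re.Pattern[str]:
    """Compile the chunk-name pattern used by the cleanup step (wrapper is hypothetical)."""
    return re.compile(
        rf"^{re.escape(input_file.stem)}_\d{{2,}}{re.escape(input_file.suffix)}$"
    )


pattern = chunk_pattern(Path("sample.csv"))
candidates = [
    "sample_01.csv",      # numbered chunk: matches
    "sample_100.csv",     # three digits are allowed by \d{2,}: matches
    "sample_1.csv",       # only one digit: kept
    "sample_backup.csv",  # non-numeric suffix: kept
    "other_01.csv",       # different stem: kept
    "notes.csv",          # unrelated file: kept
]
matches = {name: bool(pattern.fullmatch(name)) for name in candidates}

# File names with spaces and hyphens are escaped, so they match literally.
spaced = chunk_pattern(Path("Part-Storage Areas Relationships.csv"))
spaced_match = bool(spaced.fullmatch("Part-Storage Areas Relationships_02.csv"))
```

Because the cleanup in the diff additionally checks `existing_file.is_file()`, a directory whose name happens to match the pattern is also preserved, as exercised by `test_cleanup_keeps_non_matching_csv_files_and_subdirectories`.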

{nm_tool_forge-0.2.3 → nm_tool_forge-0.2.5}/src/csvchunking/cli.py RENAMED

@@ -7,22 +7,22 @@ from .chunker import split_csv
 def main() -> None:
     parser = argparse.ArgumentParser(
-        description="Teilt eine große CSV-Datei in kleinere Chunks mit Header.",
+        description="Split a large CSV file into smaller chunks with a header row.",
     )
-    parser.add_argument("input_file", help="Pfad zur CSV-Datei")
+    parser.add_argument("input_file", help="Path to the CSV file")
     parser.add_argument(
         "--chunk-size",
         type=int,
         required=True,
-        help="Anzahl Datenzeilen pro Ausgabedatei, muss > 0 sein",
+        help="Number of data rows per output file; must be greater than 0",
     )
-    parser.add_argument("--encoding", default="utf-8-sig", help="Encoding für Ein- und Ausgabe (Standard: utf-8-sig)")
+    parser.add_argument("--encoding", default="utf-8-sig", help="Input and output encoding (Default: utf-8-sig)")
     args = parser.parse_args()
-    try:
-        result = split_csv(Path(args.input_file), args.chunk_size, encoding=args.encoding)
-    except Exception as e:
-        print(f"Fehler: {e}", file=sys.stderr)
-        sys.exit(1)
+    try:
+        result = split_csv(Path(args.input_file), args.chunk_size, encoding=args.encoding)
+    except Exception as e:
+        print(f"Error: {e}", file=sys.stderr)
+        sys.exit(1)
     print("CSV chunking completed.")
     print(f"- Input: {result.input_file}")
     print(f"- Output directory: {result.output_dir}")

{nm_tool_forge-0.2.3 → nm_tool_forge-0.2.5}/src/loganalysis/__init__.py RENAMED

@@ -13,4 +13,4 @@ __all__ = [
     "run_analysis",
 ]
-__version__ = "0.2.3"
+__version__ = "0.2.5"

{nm_tool_forge-0.2.3 → nm_tool_forge-0.2.5}/src/nm_tool_forge.egg-info/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: nm-tool-forge
-Version: 0.2.3
+Version: 0.2.5
 Summary: Analyze MigMan log files and generate aggregated CSV, Markdown, HTML, and optional PDF reports.
 Author-email: Stefan Ewald <s.ew@outlook.de>
 License-Expression: MIT
@@ -31,7 +31,7 @@ Dynamic: license-file
 # nm-tool-forge
-`nm-tool-forge` analyzes MigMan text log files with severity tokens such as `INFO`, `ERROR`, and `WARNING` and generates aggregated CSV, Markdown, HTML, and optional PDF reports.
+`nm-tool-forge` analyzes MigMan text log files with severity tokens such as `INFO`, `ERROR`, and `WARNING` and generates aggregated CSV, Markdown, HTML, and optional PDF reports. The package also includes `csvchunking`, a small helper for splitting large CSV files into migration-friendly chunks.
 The project uses a package-ready `src` layout. The legacy `log_analysis.py` file remains available as a thin compatibility entry point for older local setups.
@@ -43,6 +43,7 @@ The project uses a package-ready `src` layout. The legacy `log_analysis.py` file
 - Generate Markdown summary reports
 - Optionally convert reports to HTML and PDF
 - Keep a backup copy of analyzed log files
+- Split large CSV files into numbered chunks while preserving the header row
 - Run built-in self-tests from the CLI
 ## Installation
@@ -61,12 +62,14 @@ python -m pip install .[pdf,dev]
 ## Command-line usage
-After installation, both entry points are available:
+After installation, the CLI entry points are available:
 ```powershell
 python -m loganalysis --help
+python -m csvchunking --help
 loganalysis --help
 nm-tool-forge --help
+csvchunking --help
 ```
 Typical analysis run:
@@ -89,29 +92,30 @@ python -m loganalysis --self-test
 Legacy compatibility call:
+```powershell
+python .\log_analysis.py --convert
+```
-## Release process
-To publish a new release, always test on TestPyPI first, then upload to PyPI only after successful Conda/Smoke-Tests:
-```bash
-export TWINE_USERNAME="__token__"
-export TWINE_PASSWORD="pypi-..."
+CSV chunking run:
-bash scripts/release_testpypi.sh --bump patch
-bash scripts/release_pypi.sh --yes
+```powershell
+csvchunking "data\large_export.csv" --chunk-size 5000
 ```
-**Hinweise:**
-- Erst TestPyPI ausführen und testen, dann final nach PyPI hochladen.
-- Versionen auf PyPI können nicht überschrieben oder erneut verwendet werden.
+The command creates an output directory next to the input file named after the CSV stem. For example, `data\large_export.csv` is split into files such as `data\large_export\large_export_01.csv`, `data\large_export\large_export_02.csv`, and so on.
+CSV chunking with an explicit encoding:
 ```powershell
-python .\log_analysis.py --convert
+python -m csvchunking "data\large_export.csv" --chunk-size 5000 --encoding utf-8-sig
 ```
+Each chunk contains the original header row plus up to `--chunk-size` data rows. The delimiter is detected automatically; if detection fails, semicolon-separated CSV is used.
 ## Supported CLI options
+Log analysis options:
 - `--logs-dir`
 - `--out-dir`
 - `--backup-dir`
@@ -119,6 +123,28 @@ python .\log_analysis.py --convert
 - `--convert`
 - `--self-test`
+CSV chunking options:
+- `input_file` - path to the CSV file to split
+- `--chunk-size` - required number of data rows per output file; must be greater than zero
+- `--encoding` - input and output encoding; defaults to `utf-8-sig`
+## Release process
+To publish a new release, always test on TestPyPI first, then upload to PyPI only after successful Conda smoke tests:
+```bash
+export TWINE_USERNAME="__token__"
+export TWINE_PASSWORD="pypi-..."
+bash scripts/release_testpypi.sh --bump patch
+bash scripts/release_pypi.sh --yes
+```
+**Notes:**
+- Run and verify the TestPyPI release first, then upload the final package to PyPI.
+- PyPI versions cannot be overwritten or reused.
 ## Library usage
 ```python
@@ -130,6 +156,7 @@ from loganalysis import (
     iter_logical_entries,
     normalize_message,
 )
+from csvchunking import split_csv
 result = analyze_file(Path("logs/app.txt"))
 print(result["norm_counts"])
@@ -146,14 +173,21 @@ convert_report_md_to_html_pdf(
     Path("log_analyse_out/report.html"),
     Path("log_analyse_out/report.pdf"),
 )
+chunk_result = split_csv(Path("data/large_export.csv"), chunk_size=5000)
+print(chunk_result.output_dir)
+print(chunk_result.output_files)
 ```
+`split_csv()` returns a `ChunkResult` with the input file, output directory, chunk size, processed data-row count, created file count, and generated output file paths.
 ## Project structure
 ```text
 .
 ├─ pyproject.toml
 ├─ src/loganalysis/
+├─ src/csvchunking/
 ├─ tests/
 ├─ docs/
 └─ log_analysis.py
@@ -168,7 +202,9 @@ Important modules:
 - `report_html.py` - HTML/CSS rendering
 - `report_pdf.py` - PDF engine selection and fallback handling
 - `converters.py` - Markdown-to-HTML/PDF conversion
-- `cli.py` - command-line entry point
+- `loganalysis/cli.py` - log analysis command-line entry point
+- `csvchunking/chunker.py` - CSV splitting logic and `ChunkResult`
+- `csvchunking/cli.py` - CSV chunking command-line entry point
 ## HTML/PDF conversion

nm_tool_forge-0.2.5/tests/test_csvchunking.py ADDED

@@ -0,0 +1,153 @@
+import pytest
+from csvchunking.chunker import split_csv
+def make_csv(tmp_path, name, header, rows, encoding="utf-8-sig", delimiter=";"):
+    file = tmp_path / name
+    with open(file, "w", encoding=encoding, newline="") as f:
+        f.write(delimiter.join(header) + "\n")
+        for row in rows:
+            f.write(delimiter.join(row) + "\n")
+    return file
+def test_regular_split(tmp_path):
+    header = ["col1", "col2"]
+    rows = [["A", "1"], ["B", "2"], ["C", "3"], ["D", "4"], ["E", "5"]]
+    file = make_csv(tmp_path, "sample.csv", header, rows)
+    result = split_csv(file, chunk_size=2)
+    assert result.files_created == 3
+    for out in result.output_files:
+        with open(out, encoding="utf-8-sig") as f:
+            lines = f.read().splitlines()
+            assert lines[0] == "col1;col2"
+    assert (result.output_dir / "sample_01.csv").exists()
+    assert (result.output_dir / "sample_02.csv").exists()
+    assert (result.output_dir / "sample_03.csv").exists()
+def test_header_in_each_file(tmp_path):
+    header = ["foo", "bar"]
+    rows = [["x", "1"], ["y", "2"], ["z", "3"]]
+    file = make_csv(tmp_path, "test.csv", header, rows)
+    result = split_csv(file, chunk_size=1)
+    for out in result.output_files:
+        with open(out, encoding="utf-8-sig") as f:
+            assert f.readline().strip() == "foo;bar"
+def test_filename_with_spaces(tmp_path):
+    header = ["a", "b"]
+    rows = [["1", "2"]]
+    file = make_csv(tmp_path, "Part-Storage Areas Relationships.csv", header, rows)
+    result = split_csv(file, chunk_size=1)
+    assert result.output_dir.name == "Part-Storage Areas Relationships"
+    assert (result.output_dir / "Part-Storage Areas Relationships_01.csv").exists()
+def test_cleanup_removes_stale_matching_chunk_files(tmp_path):
+    header = ["col1", "col2"]
+    rows = [["A", "1"], ["B", "2"], ["C", "3"], ["D", "4"]]
+    file = make_csv(tmp_path, "sample.csv", header, rows)
+    output_dir = tmp_path / "sample"
+    output_dir.mkdir()
+    for name in ("sample_01.csv", "sample_02.csv", "sample_03.csv"):
+        (output_dir / name).write_text("old chunk\n", encoding="utf-8-sig")
+    result = split_csv(file, chunk_size=2)
+    assert result.files_created == 2
+    assert (result.output_dir / "sample_01.csv").exists()
+    assert (result.output_dir / "sample_02.csv").exists()
+    assert not (result.output_dir / "sample_03.csv").exists()
+def test_cleanup_keeps_non_matching_csv_files_and_subdirectories(tmp_path):
+    header = ["col1", "col2"]
+    rows = [["A", "1"], ["B", "2"]]
+    file = make_csv(tmp_path, "sample.csv", header, rows)
+    output_dir = tmp_path / "sample"
+    output_dir.mkdir()
+    preserved_files = [
+        "notes.csv",
+        "sample_backup.csv",
+        "sample_old.csv",
+        "other_01.csv",
+        "sample_1.csv",
+    ]
+    for name in preserved_files:
+        (output_dir / name).write_text("keep\n", encoding="utf-8-sig")
+    matching_subdir = output_dir / "sample_99.csv"
+    matching_subdir.mkdir()
+    (matching_subdir / "nested.txt").write_text("keep nested\n", encoding="utf-8-sig")
+    result = split_csv(file, chunk_size=1)
+    for name in preserved_files:
+        assert (result.output_dir / name).exists()
+    assert matching_subdir.is_dir()
+    assert (matching_subdir / "nested.txt").exists()
+def test_cleanup_filename_with_spaces_uses_exact_chunk_pattern(tmp_path):
+    header = ["a", "b"]
+    rows = [["1", "2"], ["3", "4"], ["5", "6"], ["7", "8"]]
+    filename = "Part-Storage Areas Relationships.csv"
+    file = make_csv(tmp_path, filename, header, rows)
+    output_dir = tmp_path / "Part-Storage Areas Relationships"
+    output_dir.mkdir()
+    for name in (
+        "Part-Storage Areas Relationships_01.csv",
+        "Part-Storage Areas Relationships_02.csv",
+        "Part-Storage Areas Relationships_99.csv",
+    ):
+        (output_dir / name).write_text("old chunk\n", encoding="utf-8-sig")
+    backup_file = output_dir / "Part-Storage Areas Relationships_backup.csv"
+    backup_file.write_text("keep\n", encoding="utf-8-sig")
+    result = split_csv(file, chunk_size=2)
+    assert result.output_dir == output_dir
+    assert (result.output_dir / "Part-Storage Areas Relationships_01.csv").exists()
+    assert (result.output_dir / "Part-Storage Areas Relationships_02.csv").exists()
+    assert not (result.output_dir / "Part-Storage Areas Relationships_99.csv").exists()
+    assert backup_file.exists()
+    assert "old chunk" not in (
+        result.output_dir / "Part-Storage Areas Relationships_01.csv"
+    ).read_text(encoding="utf-8-sig")
+def test_cleanup_repeated_run_removes_extra_chunks(tmp_path):
+    header = ["col1", "col2"]
+    first_rows = [["A", "1"], ["B", "2"], ["C", "3"], ["D", "4"], ["E", "5"]]
+    file = make_csv(tmp_path, "sample.csv", header, first_rows)
+    first_result = split_csv(file, chunk_size=2)
+    assert first_result.files_created == 3
+    assert (first_result.output_dir / "sample_03.csv").exists()
+    second_rows = [["A", "1"], ["B", "2"]]
+    make_csv(tmp_path, "sample.csv", header, second_rows)
+    second_result = split_csv(file, chunk_size=2)
+    assert second_result.files_created == 1
+    assert (second_result.output_dir / "sample_01.csv").exists()
+    assert not (second_result.output_dir / "sample_02.csv").exists()
+    assert not (second_result.output_dir / "sample_03.csv").exists()
+def test_invalid_chunk_size(tmp_path):
+    header = ["a", "b"]
+    rows = [["1", "2"]]
+    file = make_csv(tmp_path, "fail.csv", header, rows)
+    with pytest.raises(ValueError):
+        split_csv(file, chunk_size=0)
+    with pytest.raises(ValueError):
+        split_csv(file, chunk_size=-1)
+def test_empty_file(tmp_path):
+    file = tmp_path / "empty.csv"
+    file.write_text("")
+    with pytest.raises(ValueError):
+        split_csv(file, chunk_size=1)

nm_tool_forge-0.2.3/tests/test_csvchunking.py DELETED

@@ -1,63 +0,0 @@
-import pytest
-from csvchunking.chunker import split_csv
-def make_csv(tmp_path, name, header, rows, encoding="utf-8-sig", delimiter=";"):
-    file = tmp_path / name
-    with open(file, "w", encoding=encoding, newline="") as f:
-        f.write(delimiter.join(header) + "\n")
-        for row in rows:
-            f.write(delimiter.join(row) + "\n")
-    return file
-def test_normale_aufteilung(tmp_path):
-    header = ["col1", "col2"]
-    rows = [["A", "1"], ["B", "2"], ["C", "3"], ["D", "4"], ["E", "5"]]
-    file = make_csv(tmp_path, "sample.csv", header, rows)
-    result = split_csv(file, chunk_size=2)
-    assert result.files_created == 3
-    for out in result.output_files:
-        with open(out, encoding="utf-8-sig") as f:
-            lines = f.read().splitlines()
-            assert lines[0] == "col1;col2"
-    assert (result.output_dir / "sample_01.csv").exists()
-    assert (result.output_dir / "sample_02.csv").exists()
-    assert (result.output_dir / "sample_03.csv").exists()
-def test_header_in_jeder_datei(tmp_path):
-    header = ["foo", "bar"]
-    rows = [["x", "1"], ["y", "2"], ["z", "3"]]
-    file = make_csv(tmp_path, "test.csv", header, rows)
-    result = split_csv(file, chunk_size=1)
-    for out in result.output_files:
-        with open(out, encoding="utf-8-sig") as f:
-            assert f.readline().strip() == "foo;bar"
-def test_dateiname_mit_leerzeichen(tmp_path):
-    header = ["a", "b"]
-    rows = [["1", "2"]]
-    file = make_csv(tmp_path, "Part-Storage Areas Relationships.csv", header, rows)
-    result = split_csv(file, chunk_size=1)
-    assert result.output_dir.name == "Part-Storage Areas Relationships"
-    assert (result.output_dir / "Part-Storage Areas Relationships_01.csv").exists()
-def test_ungueltige_chunkgroesse(tmp_path):
-    header = ["a", "b"]
-    rows = [["1", "2"]]
-    file = make_csv(tmp_path, "fail.csv", header, rows)
-    with pytest.raises(ValueError):
-        split_csv(file, chunk_size=0)
-    with pytest.raises(ValueError):
-        split_csv(file, chunk_size=-1)
-def test_leere_datei(tmp_path):
-    file = tmp_path / "empty.csv"
-    file.write_text("")
-    with pytest.raises(ValueError):
-        split_csv(file, chunk_size=1)

Files without changes (each RENAMED from nm_tool_forge-0.2.3 to nm_tool_forge-0.2.5):

LICENSE
setup.cfg
src/csvchunking/__main__.py
src/loganalysis/__main__.py
src/loganalysis/analysis.py
src/loganalysis/cli.py
src/loganalysis/constants.py
src/loganalysis/converters.py
src/loganalysis/csv_export.py
src/loganalysis/encoding.py
src/loganalysis/filesystem.py
src/loganalysis/models.py
src/loganalysis/normalization.py
src/loganalysis/parsing.py
src/loganalysis/report_html.py
src/loganalysis/report_markdown.py
src/loganalysis/report_models.py
src/loganalysis/report_pdf.py
src/loganalysis/selftest.py
src/nm_tool_forge.egg-info/SOURCES.txt
src/nm_tool_forge.egg-info/dependency_links.txt
src/nm_tool_forge.egg-info/entry_points.txt
src/nm_tool_forge.egg-info/requires.txt
src/nm_tool_forge.egg-info/top_level.txt
tests/test_analysis.py
tests/test_normalization.py
tests/test_parsing.py
tests/test_report_html.py
tests/test_report_markdown.py
