PyPI - cli-test-framework - Versions diffs - 0.4.1__tar.gz → 0.4.2__tar.gz - Mend

cli-test-framework 0.4.1.tar.gz → 0.4.2.tar.gz

Files changed (99)

{cli_test_framework-0.4.1/src/cli_test_framework.egg-info → cli_test_framework-0.4.2}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: cli-test-framework
-Version: 0.4.1
+Version: 0.4.2
 Summary: A powerful command line testing framework in Python with setup modules, parallel execution, and file comparison capabilities.
 Home-page: https://github.com/ozil111/cli-test-framework
 Author: Xiaotong Wang
@@ -53,7 +53,7 @@ This is a lightweight and extensible automated testing framework that supports d
 - **📊 Comprehensive Reports**: Detailed pass rate statistics and failure diagnostics
 - **🔧 Thread-Safe Design**: Robust concurrent execution with proper synchronization
 - **📝 Advanced File Comparison**: Support for comparing various file types (text, binary, JSON, HDF5) with detailed diff output
-- **🎛️ Resource-Aware Scheduling**: Per-test timeout and resource hints (estimated time / memory / priority) with LPT-based ordering in parallel runs to improve throughput and avoid long-tail blocking
+- **🎛️ Resource-Aware Scheduling**: Per-test timeout and resource hints (CPU cores / estimated time / memory / priority) with automatic CPU core detection and semaphore-based scheduling to prevent resource conflicts and solver "runaway" scenarios
 ## 3. Quick Start
@@ -81,16 +81,22 @@ success = runner.run_tests()
 ```python
 from src.runners.parallel_json_runner import ParallelJSONRunner
-# Multi-threaded execution (recommended for I/O-intensive tests)
+# Multi-threaded execution with resource-aware scheduling
 runner = ParallelJSONRunner(
     config_file="path/to/test_cases.json",
     workspace="/project/root",
     max_workers=4,           # Maximum concurrent workers
-    execution_mode="thread"  # "thread" or "process"
+    execution_mode="thread"  # "thread" (supports CPU scheduling) or "process"
 )
 success = runner.run_tests()
 ```
+**Resource-Aware Scheduling**: When using `execution_mode="thread"`, the framework automatically:
+- Detects available CPU cores on your machine
+- Manages CPU resource allocation using semaphore-based scheduling
+- Prevents resource conflicts by queuing tasks that require more cores than available
+- Injects environment variables to constrain solver threads (prevents "runaway" scenarios)
 ### Setup Module Usage
 ```python
@@ -151,6 +157,7 @@ compare-files binary1.bin binary2.bin --similarity
             "args": ["-i", "input.0000.rad"],
             "timeout": 36000,
             "resources": {
+                "cpu_cores": 4,
                 "estimated_time": 18000,
                 "min_memory_mb": 16000,
                 "priority": 10
@@ -202,6 +209,61 @@ test_cases:
       output_matches: ".*\\.md$"
 ```
+### Resource-Aware Configuration
+For simulation and long-running tasks (CAE/FEA), you can specify resource requirements to enable intelligent scheduling with automatic CPU core management:
+```json
+{
+    "name": "Full_Car_Crash_Simulation",
+    "command": "radioss_solver",
+    "args": ["-i", "input.0000.rad"],
+    "timeout": 36000,
+    "resources": {
+        "cpu_cores": 4,
+        "estimated_time": 18000,
+        "min_memory_mb": 16000,
+        "priority": 10
+    },
+    "expected": {
+        "return_code": 0
+    }
+}
+```
+**Field Descriptions:**
+- **`timeout`** (optional, float): **Hard limit in seconds**. If the test exceeds this time, it will be killed. Default: 3600 seconds (1 hour). Set to `null` for unlimited (not recommended).
+  - Common values: `60` (1 min), `300` (5 min), `3600` (1 hour), `18000` (5 hours), `86400` (24 hours)
+- **`resources.cpu_cores`** (optional, int): **Number of CPU cores required by this task**. The framework automatically detects available CPU cores and uses semaphore-based scheduling to manage resource allocation. Tasks that require more cores than available will wait until resources are freed. Default: `1` core.
+  - **How it works**: The framework automatically detects your machine's CPU count (e.g., 16 cores), reserves 2 cores for the system, and creates a resource pool with the remaining cores (e.g., 14 cores). Tasks acquire cores from this pool before execution.
+  - **Environment injection**: When a task starts, the framework automatically sets `OMP_NUM_THREADS`, `MKL_NUM_THREADS`, and `NPROC` environment variables to constrain solver threads, preventing "runaway" scenarios where solvers ignore Python's scheduling.
+  - **Example scenarios**:
+    - Machine with 16 cores (14 available): 3 tasks requiring 4 cores each can run concurrently (3×4=12 cores used, 2 cores free)
+    - Machine with 8 cores (6 available): 1 task requiring 4 cores + 1 task requiring 2 cores can run concurrently
+  - **Recommendations**:
+    - Heavy simulations: `4-8` cores
+    - Medium tasks: `2-4` cores
+    - Lightweight scripts: `1` core (default)
+- **`resources.estimated_time`** (optional, float): **Estimated duration in seconds** for LPT (Longest Processing Time) scheduling. Tasks with longer estimated times are scheduled first in parallel runs to improve throughput.
+  - Example: `18000` = 5 hours, `3600` = 1 hour, `300` = 5 minutes
+- **`resources.min_memory_mb`** (optional, float): **Estimated memory requirement in MB**. Used for OOM (Out Of Memory) risk warnings. Currently informational only.
+  - Example: `16000` = 16 GB, `8192` = 8 GB, `4096` = 4 GB
+- **`resources.priority`** (optional, int): **Task priority** (higher number = higher priority). Currently informational only. Recommended range: 0-10.
+  - `10`: Critical/blocking tasks (must run first)
+  - `7-9`: High priority (important business paths)
+  - `4-6`: Normal priority
+  - `1-3`: Low priority / exploratory tests
+  - `0` or unset: Default priority
+**Note:**
+- All time values (`timeout`, `estimated_time`) are in **seconds**, not milliseconds. This matches Python's `subprocess.run(timeout=...)` API.
+- **CPU core scheduling is only active in thread mode**. Process mode will fall back to original behavior (process-level isolation provides some resource separation).
 ## 5. File Comparison Features
 ### Supported File Types
@@ -369,12 +431,35 @@ except Exception as e:
    max_workers = os.cpu_count() * 2
    ```
-2. **Test Case Design**:
+2. **Resource-Aware Scheduling**:
+   - **For CAE/FEA simulations**: Always specify `cpu_cores` in your test configuration to prevent resource conflicts
+   - **Mixed workloads**: Configure lightweight tasks with `cpu_cores: 1` and heavy simulations with appropriate core counts
+   - **Example mixed configuration**:
+     ```json
+     {
+         "test_cases": [
+             {
+                 "name": "Heavy Simulation",
+                 "command": "radioss_solver",
+                 "resources": { "cpu_cores": 4 }
+             },
+             {
+                 "name": "Lightweight Script",
+                 "command": "python script.py",
+                 "resources": { "cpu_cores": 1 }
+             }
+         ]
+     }
+     ```
+   - **Monitor resource usage**: The framework prints resource acquisition/release logs to help you understand scheduling behavior
+3. **Test Case Design**:
    - ✅ Ensure test independence (no dependencies between tests)
    - ✅ Avoid shared resource conflicts (different files/ports)
    - ✅ Use relative paths (framework handles resolution automatically)
+   - ✅ Specify `cpu_cores` for CPU-intensive tasks to enable intelligent scheduling
-3. **Debugging**:
+4. **Debugging**:
    ```python
    # Enable verbose output for debugging
    runner = ParallelJSONRunner(
@@ -559,7 +644,22 @@ compare-files data1.h5 data2.h5 --h5-data-filter '<=0.01'
 # Version Changelog
-## 0.3.7 (Latest)
+## 0.4.2 (Latest)
+### ✨ New Features
+- **Resource-Aware CPU Scheduling**: Automatic CPU core detection and semaphore-based scheduling to prevent resource conflicts
+  - Added `cpu_cores` field in `resources` configuration to specify CPU requirements per task
+  - Automatic environment variable injection (`OMP_NUM_THREADS`, `MKL_NUM_THREADS`, `NPROC`) to constrain solver threads
+  - Prevents solver "runaway" scenarios where solvers ignore Python's scheduling
+  - Intelligent resource pool management: automatically reserves 2 cores for system use
+- **Enhanced execution engine**: Support for custom environment variables in test execution
+### 🔧 Improvements
+- **Better resource management**: Tasks now wait for available CPU cores instead of overwhelming the system
+- **Automatic CPU detection**: No manual configuration needed - framework detects available cores automatically
+- **Thread-safe resource allocation**: Semaphore-based scheduling ensures thread-safe resource management
+## 0.3.7
 ### 🐛 Bug Fixes
 - **Fixed H5 table regex matching**: `--h5-table-regex=table1,table2` now correctly matches both `table1` and `table2` instead of treating the entire string as a single regex pattern
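
The README text above describes a core pool (detected cores minus 2 reserved) from which tasks acquire `cpu_cores` before running, plus injection of thread-limiting environment variables. The scheduler source is not part of the hunks shown here, so the following is only a minimal sketch of that idea under stated assumptions: `CorePool`, `thread_limit_env`, the condition-variable design, and the clamping of oversized requests are all hypothetical and not taken from the package.

```python
import os
import threading


class CorePool:
    """Hypothetical counting pool of CPU cores (not the package's class)."""

    def __init__(self, total_cores=None, reserved=2):
        detected = total_cores if total_cores is not None else (os.cpu_count() or 1)
        # Keep at least one core schedulable, even on very small machines.
        self.capacity = max(1, detected - reserved)
        self._free = self.capacity
        self._cond = threading.Condition()

    def acquire(self, cores):
        # Assumption: clamp oversized requests so a task can never wait forever.
        cores = min(max(1, cores), self.capacity)
        with self._cond:
            # Block until enough cores are free, then take them atomically.
            self._cond.wait_for(lambda: self._free >= cores)
            self._free -= cores
        return cores

    def release(self, cores):
        with self._cond:
            self._free += cores
            self._cond.notify_all()

    @property
    def free(self):
        with self._cond:
            return self._free


def thread_limit_env(cores):
    """Environment overrides named in the README for constraining solver threads."""
    value = str(cores)
    return {"OMP_NUM_THREADS": value, "MKL_NUM_THREADS": value, "NPROC": value}


pool = CorePool(total_cores=16)             # 16 detected, 2 reserved
held = [pool.acquire(4) for _ in range(3)]  # three 4-core tasks
print(pool.capacity, pool.free)             # 14 2
```

A condition variable is used here instead of looping over `threading.Semaphore.acquire()` because taking several permits one at a time can deadlock when two multi-core tasks each hold part of what the other needs.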

{cli_test_framework-0.4.1 → cli_test_framework-0.4.2}/README.md RENAMED

@@ -18,7 +18,7 @@ This is a lightweight and extensible automated testing framework that supports d
 - **📊 Comprehensive Reports**: Detailed pass rate statistics and failure diagnostics
 - **🔧 Thread-Safe Design**: Robust concurrent execution with proper synchronization
 - **📝 Advanced File Comparison**: Support for comparing various file types (text, binary, JSON, HDF5) with detailed diff output
-- **🎛️ Resource-Aware Scheduling**: Per-test timeout and resource hints (estimated time / memory / priority) with LPT-based ordering in parallel runs to improve throughput and avoid long-tail blocking
+- **🎛️ Resource-Aware Scheduling**: Per-test timeout and resource hints (CPU cores / estimated time / memory / priority) with automatic CPU core detection and semaphore-based scheduling to prevent resource conflicts and solver "runaway" scenarios
 ## 3. Quick Start
@@ -46,16 +46,22 @@ success = runner.run_tests()
 ```python
 from src.runners.parallel_json_runner import ParallelJSONRunner
-# Multi-threaded execution (recommended for I/O-intensive tests)
+# Multi-threaded execution with resource-aware scheduling
 runner = ParallelJSONRunner(
     config_file="path/to/test_cases.json",
     workspace="/project/root",
     max_workers=4,           # Maximum concurrent workers
-    execution_mode="thread"  # "thread" or "process"
+    execution_mode="thread"  # "thread" (supports CPU scheduling) or "process"
 )
 success = runner.run_tests()
 ```
+**Resource-Aware Scheduling**: When using `execution_mode="thread"`, the framework automatically:
+- Detects available CPU cores on your machine
+- Manages CPU resource allocation using semaphore-based scheduling
+- Prevents resource conflicts by queuing tasks that require more cores than available
+- Injects environment variables to constrain solver threads (prevents "runaway" scenarios)
 ### Setup Module Usage
 ```python
@@ -116,6 +122,7 @@ compare-files binary1.bin binary2.bin --similarity
             "args": ["-i", "input.0000.rad"],
             "timeout": 36000,
             "resources": {
+                "cpu_cores": 4,
                 "estimated_time": 18000,
                 "min_memory_mb": 16000,
                 "priority": 10
@@ -167,6 +174,61 @@ test_cases:
       output_matches: ".*\\.md$"
 ```
+### Resource-Aware Configuration
+For simulation and long-running tasks (CAE/FEA), you can specify resource requirements to enable intelligent scheduling with automatic CPU core management:
+```json
+{
+    "name": "Full_Car_Crash_Simulation",
+    "command": "radioss_solver",
+    "args": ["-i", "input.0000.rad"],
+    "timeout": 36000,
+    "resources": {
+        "cpu_cores": 4,
+        "estimated_time": 18000,
+        "min_memory_mb": 16000,
+        "priority": 10
+    },
+    "expected": {
+        "return_code": 0
+    }
+}
+```
+**Field Descriptions:**
+- **`timeout`** (optional, float): **Hard limit in seconds**. If the test exceeds this time, it will be killed. Default: 3600 seconds (1 hour). Set to `null` for unlimited (not recommended).
+  - Common values: `60` (1 min), `300` (5 min), `3600` (1 hour), `18000` (5 hours), `86400` (24 hours)
+- **`resources.cpu_cores`** (optional, int): **Number of CPU cores required by this task**. The framework automatically detects available CPU cores and uses semaphore-based scheduling to manage resource allocation. Tasks that require more cores than available will wait until resources are freed. Default: `1` core.
+  - **How it works**: The framework automatically detects your machine's CPU count (e.g., 16 cores), reserves 2 cores for the system, and creates a resource pool with the remaining cores (e.g., 14 cores). Tasks acquire cores from this pool before execution.
+  - **Environment injection**: When a task starts, the framework automatically sets `OMP_NUM_THREADS`, `MKL_NUM_THREADS`, and `NPROC` environment variables to constrain solver threads, preventing "runaway" scenarios where solvers ignore Python's scheduling.
+  - **Example scenarios**:
+    - Machine with 16 cores (14 available): 3 tasks requiring 4 cores each can run concurrently (3×4=12 cores used, 2 cores free)
+    - Machine with 8 cores (6 available): 1 task requiring 4 cores + 1 task requiring 2 cores can run concurrently
+  - **Recommendations**:
+    - Heavy simulations: `4-8` cores
+    - Medium tasks: `2-4` cores
+    - Lightweight scripts: `1` core (default)
+- **`resources.estimated_time`** (optional, float): **Estimated duration in seconds** for LPT (Longest Processing Time) scheduling. Tasks with longer estimated times are scheduled first in parallel runs to improve throughput.
+  - Example: `18000` = 5 hours, `3600` = 1 hour, `300` = 5 minutes
+- **`resources.min_memory_mb`** (optional, float): **Estimated memory requirement in MB**. Used for OOM (Out Of Memory) risk warnings. Currently informational only.
+  - Example: `16000` = 16 GB, `8192` = 8 GB, `4096` = 4 GB
+- **`resources.priority`** (optional, int): **Task priority** (higher number = higher priority). Currently informational only. Recommended range: 0-10.
+  - `10`: Critical/blocking tasks (must run first)
+  - `7-9`: High priority (important business paths)
+  - `4-6`: Normal priority
+  - `1-3`: Low priority / exploratory tests
+  - `0` or unset: Default priority
+**Note:**
+- All time values (`timeout`, `estimated_time`) are in **seconds**, not milliseconds. This matches Python's `subprocess.run(timeout=...)` API.
+- **CPU core scheduling is only active in thread mode**. Process mode will fall back to original behavior (process-level isolation provides some resource separation).
 ## 5. File Comparison Features
 ### Supported File Types
@@ -334,12 +396,35 @@ except Exception as e:
    max_workers = os.cpu_count() * 2
    ```
-2. **Test Case Design**:
+2. **Resource-Aware Scheduling**:
+   - **For CAE/FEA simulations**: Always specify `cpu_cores` in your test configuration to prevent resource conflicts
+   - **Mixed workloads**: Configure lightweight tasks with `cpu_cores: 1` and heavy simulations with appropriate core counts
+   - **Example mixed configuration**:
+     ```json
+     {
+         "test_cases": [
+             {
+                 "name": "Heavy Simulation",
+                 "command": "radioss_solver",
+                 "resources": { "cpu_cores": 4 }
+             },
+             {
+                 "name": "Lightweight Script",
+                 "command": "python script.py",
+                 "resources": { "cpu_cores": 1 }
+             }
+         ]
+     }
+     ```
+   - **Monitor resource usage**: The framework prints resource acquisition/release logs to help you understand scheduling behavior
+3. **Test Case Design**:
    - ✅ Ensure test independence (no dependencies between tests)
    - ✅ Avoid shared resource conflicts (different files/ports)
    - ✅ Use relative paths (framework handles resolution automatically)
+   - ✅ Specify `cpu_cores` for CPU-intensive tasks to enable intelligent scheduling
-3. **Debugging**:
+4. **Debugging**:
    ```python
    # Enable verbose output for debugging
    runner = ParallelJSONRunner(
@@ -524,7 +609,22 @@ compare-files data1.h5 data2.h5 --h5-data-filter '<=0.01'
 # Version Changelog
-## 0.3.7 (Latest)
+## 0.4.2 (Latest)
+### ✨ New Features
+- **Resource-Aware CPU Scheduling**: Automatic CPU core detection and semaphore-based scheduling to prevent resource conflicts
+  - Added `cpu_cores` field in `resources` configuration to specify CPU requirements per task
+  - Automatic environment variable injection (`OMP_NUM_THREADS`, `MKL_NUM_THREADS`, `NPROC`) to constrain solver threads
+  - Prevents solver "runaway" scenarios where solvers ignore Python's scheduling
+  - Intelligent resource pool management: automatically reserves 2 cores for system use
+- **Enhanced execution engine**: Support for custom environment variables in test execution
+### 🔧 Improvements
+- **Better resource management**: Tasks now wait for available CPU cores instead of overwhelming the system
+- **Automatic CPU detection**: No manual configuration needed - framework detects available cores automatically
+- **Thread-safe resource allocation**: Semaphore-based scheduling ensures thread-safe resource management
+## 0.3.7
 ### 🐛 Bug Fixes
 - **Fixed H5 table regex matching**: `--h5-table-regex=table1,table2` now correctly matches both `table1` and `table2` instead of treating the entire string as a single regex pattern
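
The `resources.estimated_time` field above is documented as driving LPT (Longest Processing Time) ordering, where longer tasks are submitted first. As a rough illustration only, assuming test cases are plain dictionaries shaped like the JSON in the README, such an ordering could look like the sketch below. The helper name `lpt_order` and the choice to treat a missing hint as `0.0` are assumptions, not the framework's code.

```python
def lpt_order(cases):
    """Return cases sorted by resources.estimated_time, longest first."""
    def estimated(case):
        # Assumption: a missing or empty "resources" block counts as 0 seconds.
        return (case.get("resources") or {}).get("estimated_time", 0.0)

    return sorted(cases, key=estimated, reverse=True)


cases = [
    {"name": "quick_check", "resources": {"estimated_time": 300}},
    {"name": "full_car_crash", "resources": {"estimated_time": 18000}},
    {"name": "no_hint"},
    {"name": "medium_run", "resources": {"estimated_time": 3600}},
]
ordered = [case["name"] for case in lpt_order(cases)]
print(ordered)  # ['full_car_crash', 'medium_run', 'quick_check', 'no_hint']
```

Starting the longest jobs first tends to keep a single long task from being left to run alone at the end of a parallel batch.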

{cli_test_framework-0.4.1 → cli_test_framework-0.4.2}/setup.py RENAMED

@@ -8,7 +8,7 @@ with open(os.path.join(this_directory, 'README.md'), encoding='utf-8') as f:
 setup(
     name="cli-test-framework",
-    version="0.4.1",
+    version="0.4.2",
     author="Xiaotong Wang",
     author_email="xiaotongwang98@gmail.com",
     description="A powerful command line testing framework in Python with setup modules, parallel execution, and file comparison capabilities.",

{cli_test_framework-0.4.1 → cli_test_framework-0.4.2}/src/cli_test_framework/core/execution.py RENAMED

@@ -1,6 +1,7 @@
 import subprocess
 import time
-from typing import Optional
+import os
+from typing import Optional, Dict
 from .assertions import Assertions
 from .types import ExpectedResult, TestCaseData, TestResultData
@@ -23,9 +24,14 @@ def validate_result(expected: ExpectedResult, actual: TestResultData) -> None:
         assertions.matches(actual["output"], expected["output_matches"])
-def execute_single_test_case(case: TestCaseData, workspace: Optional[str] = None) -> TestResultData:
+def execute_single_test_case(case: TestCaseData, workspace: Optional[str] = None, env: Optional[Dict[str, str]] = None) -> TestResultData:
     """
     Stateless execution of a single test case.
+    Args:
+        case: Test case data
+        workspace: Working directory for test execution
+        env: Optional environment variables to inject/override (merged with os.environ)
     """
     start_time = time.time()
     full_command = f"{case['command']} {' '.join(case['args'])}".strip()
@@ -41,6 +47,12 @@ def execute_single_test_case(case: TestCaseData, workspace: Optional[str] = None
         "duration": 0.0,
     }
+    # Prepare environment variables
+    # Default to current environment, merge with provided env if any
+    current_env = os.environ.copy()
+    if env:
+        current_env.update(env)
     try:
         process = subprocess.run(
             full_command,
@@ -50,6 +62,7 @@ def execute_single_test_case(case: TestCaseData, workspace: Optional[str] = None
             check=False,
             shell=True,
             timeout=timeout_limit if timeout_limit is not None else None,
+            env=current_env,
         )
         output = process.stdout + process.stderr
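
The hunks above add an `env` parameter that is merged over a copy of `os.environ` and passed to `subprocess.run`. The standalone sketch below shows the same merge pattern in isolation. `run_with_env` is a hypothetical helper, and it passes an argument list without `shell=True` (unlike the diffed code) only so that the example behaves the same across platforms.

```python
import os
import subprocess
import sys


def run_with_env(command, env=None, timeout=None):
    """Run a command with overrides layered on top of the current environment."""
    merged = os.environ.copy()  # never mutate os.environ itself
    if env:
        merged.update(env)      # caller-supplied values win on conflict
    return subprocess.run(
        command,
        capture_output=True,
        text=True,
        check=False,
        timeout=timeout,
        env=merged,
    )


result = run_with_env(
    [sys.executable, "-c", "import os; print(os.environ['OMP_NUM_THREADS'])"],
    env={"OMP_NUM_THREADS": "4"},
)
print(result.stdout.strip())  # 4
```

Copying the environment first matters: passing only the override dictionary as `env` would drop `PATH` and other inherited variables from the child process.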

{cli_test_framework-0.4.1 → cli_test_framework-0.4.2}/src/cli_test_framework/core/parallel_runner.py RENAMED

@@ -59,7 +59,9 @@ class ParallelRunner(BaseRunner):
                                 "name": case.name,
                                 "command": case.command,
                                 "args": case.args,
-                                "expected": case.expected
+                                "expected": case.expected,
+                                "timeout": case.timeout,
+                                "resources": case.resources
                             },
                             str(self.workspace) if self.workspace else None
                         ): (i, case)

{cli_test_framework-0.4.1 → cli_test_framework-0.4.2}/src/cli_test_framework/core/process_worker.py RENAMED

@@ -25,6 +25,8 @@ def run_test_in_process(test_index: int, case_data: Dict[str, Any], workspace: s
         "args": case_data["args"],
         "expected": case_data["expected"],
         "description": case_data.get("description"),
+        "timeout": case_data.get("timeout"),
+        "resources": case_data.get("resources"),
     }
     command_preview = f"{case['command']} {' '.join(case['args'])}".strip()

{cli_test_framework-0.4.1 → cli_test_framework-0.4.2}/src/cli_test_framework/core/types.py RENAMED

@@ -15,6 +15,7 @@ class ResourceRequirements(TypedDict, total=False):
     estimated_time: float  # seconds, used for ordering (LPT)
     min_memory_mb: float  # soft hint to avoid OOM
     priority: int  # higher value => higher priority
+    cpu_cores: int  # number of CPU cores required by this task
 class TestCaseData(TypedDict):
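
Because `ResourceRequirements` is declared with `total=False`, every key, including the new `cpu_cores`, may be absent, so readers of the dictionary need a fallback. A small sketch of how a consumer might read it follows. The helper `requested_cores` is hypothetical, and its default of `1` follows the README's stated default rather than any code shown in this diff.

```python
from typing import Optional, TypedDict


class ResourceRequirements(TypedDict, total=False):
    estimated_time: float  # seconds, used for ordering (LPT)
    min_memory_mb: float   # soft hint to avoid OOM
    priority: int          # higher value => higher priority
    cpu_cores: int         # number of CPU cores required by this task


def requested_cores(resources: Optional[ResourceRequirements]) -> int:
    """Hypothetical helper: cores a task asks for, defaulting to 1."""
    if not resources:
        return 1
    return max(1, int(resources.get("cpu_cores", 1)))


print(requested_cores({"cpu_cores": 4, "priority": 10}))  # 4
print(requested_cores({"estimated_time": 300.0}))         # 1
print(requested_cores(None))                              # 1
```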

{cli_test_framework-0.4.1 → cli_test_framework-0.4.2}/src/cli_test_framework/file_comparator/binary_comparator.py RENAMED

@@ -182,7 +182,7 @@ class BinaryComparator(BaseComparator):
     def compare_files(self, file1, file2, start_line=0, end_line=None, start_column=0, end_column=None):
         """
-        @brief Compare two binary files with optional similarity calculation
+        @brief Compare two binary files with optional similarity calculation using chunk-based streaming
         @param file1 Path: Path to the first binary file
         @param file2 Path: Path to the second binary file
         @param start_line int: Starting byte offset
@@ -190,6 +190,8 @@ class BinaryComparator(BaseComparator):
         @param start_column int: Ignored for binary files
         @param end_column int: Ignored for binary files
         @return ComparisonResult: Result object containing comparison details
+        @details This method implements chunk-based streaming comparison to avoid loading
+                 entire files into memory, making it suitable for large files with O(1) memory usage.
         """
         from pathlib import Path
         from .result import ComparisonResult
@@ -207,26 +209,170 @@ class BinaryComparator(BaseComparator):
             file2_path = Path(file2)
             result.file1_size = file1_path.stat().st_size
             result.file2_size = file2_path.stat().st_size
-            self.logger.debug("Reading content from files")
-            content1 = self.read_content(file1, start_line, end_line, start_column, end_column)
-            content2 = self.read_content(file2, start_line, end_line, start_column, end_column)
-            self.logger.debug("Comparing content")
-            identical, differences = self.compare_content(content1, content2)
-            result.identical = identical
-            result.differences = differences
+            # Quick size check: if file sizes differ and similarity is not requested,
+            # we can return early without streaming
+            if result.file1_size != result.file2_size and not self.similarity:
+                # Adjust sizes based on offset if specified
+                adjusted_size1 = result.file1_size - start_line
+                adjusted_size2 = result.file2_size - start_line
+                if end_line is not None:
+                    adjusted_size1 = min(adjusted_size1, end_line - start_line)
+                    adjusted_size2 = min(adjusted_size2, end_line - start_line)
+                if adjusted_size1 != adjusted_size2:
+                    result.identical = False
+                    result.differences.append(Difference(
+                        position="file size",
+                        expected=f"{result.file1_size} bytes",
+                        actual=f"{result.file2_size} bytes",
+                        diff_type="size"
+                    ))
+                    return result
+            # If similarity calculation is needed, we still need to read full content
+            # but for regular comparison, use chunk-based streaming
             if self.similarity:
+                # For similarity calculation, we still need full content
+                # This is a limitation of the current LCS algorithm
+                self.logger.debug("Reading full content for similarity calculation")
+                content1 = self.read_content(file1, start_line, end_line, start_column, end_column)
+                content2 = self.read_content(file2, start_line, end_line, start_column, end_column)
+                identical, differences = self.compare_content(content1, content2)
                 if (len(content1) + len(content2)) > 0:
                     lcs_len = self.compute_lcs_length(content1, content2)
                     similarity = 2 * lcs_len / (len(content1) + len(content2))
                 else:
                     similarity = 1
                 result.similarity = similarity
+            else:
+                # Chunk-based streaming comparison for O(1) memory usage
+                self.logger.debug("Using chunk-based streaming comparison")
+                identical, differences = self._compare_files_streaming(
+                    file1_path, file2_path, start_line, end_line
+                )
+            result.identical = identical
+            result.differences = differences
             return result
         except Exception as e:
             self.logger.error(f"Error during comparison: {str(e)}")
             result.error = str(e)
             result.identical = False
             return result
+    def _compare_files_streaming(self, file1_path, file2_path, start_offset=0, end_offset=None):
+        """
+        @brief Compare two binary files using chunk-based streaming
+        @param file1_path Path: Path to the first binary file
+        @param file2_path Path: Path to the second binary file
+        @param start_offset int: Starting byte offset
+        @param end_offset int: Ending byte offset (None for end of file)
+        @return tuple: (bool, list) - (identical, differences)
+        @details This method compares files chunk by chunk without loading entire files
+                 into memory, achieving O(1) memory complexity.
+        """
+        differences = []
+        max_differences = 10  # Limit number of differences reported
+        try:
+            with open(file1_path, 'rb') as f1, open(file2_path, 'rb') as f2:
+                # Seek to start offset if specified
+                if start_offset > 0:
+                    f1.seek(start_offset)
+                    f2.seek(start_offset)
+                # Calculate bytes to read if end_offset is specified
+                bytes_to_read = None
+                if end_offset is not None:
+                    if end_offset <= start_offset:
+                        raise ValueError("End offset must be greater than start offset")
+                    bytes_to_read = end_offset - start_offset
+                chunk_size = self.chunk_size
+                current_offset = start_offset
+                bytes_read_total = 0
+                while True:
+                    # Determine how many bytes to read in this chunk
+                    if bytes_to_read is not None:
+                        remaining = bytes_to_read - bytes_read_total
+                        if remaining <= 0:
+                            break
+                        read_size = min(chunk_size, remaining)
+                    else:
+                        read_size = chunk_size
+                    # Read chunks from both files
+                    chunk1 = f1.read(read_size)
+                    chunk2 = f2.read(read_size)
+                    # If both files are exhausted, we're done
+                    if not chunk1 and not chunk2:
+                        break
+                    # If one file ends before the other, that's a difference
+                    if len(chunk1) != len(chunk2):
+                        differences.append(Difference(
+                            position=f"byte {current_offset}",
+                            expected=f"{len(chunk1)} bytes in chunk",
+                            actual=f"{len(chunk2)} bytes in chunk",
+                            diff_type="content"
+                        ))
+                        break
+                    # Compare chunks byte by byte
+                    if chunk1 != chunk2:
+                        # Find the exact byte position where the difference starts
+                        for i in range(len(chunk1)):
+                            if chunk1[i] != chunk2[i]:
+                                abs_pos = current_offset + i
+                                # Show a few bytes before and after the difference for context
+                                context_size = 8
+                                context_start = max(0, i - context_size)
+                                context_end = min(len(chunk1), i + context_size)
+                                # Get context bytes (may need to read previous chunk)
+                                context1 = chunk1[context_start:context_end]
+                                context2 = chunk2[context_start:context_end]
+                                expected_hex = ' '.join(f"{b:02x}" for b in context1)
+                                actual_hex = ' '.join(f"{b:02x}" for b in context2)
+                                differences.append(Difference(
+                                    position=f"byte {abs_pos}",
+                                    expected=expected_hex,
+                                    actual=actual_hex,
+                                    diff_type="content"
+                                ))
+                                # Stop after finding first difference in chunk
+                                # or if we've reached max differences
+                                if len(differences) >= max_differences:
+                                    differences.append(Difference(
+                                        position=None,
+                                        expected=None,
+                                        actual=None,
+                                        diff_type="more differences not shown"
+                                    ))
+                                    return False, differences
+                                break
+                    current_offset += len(chunk1)
+                    bytes_read_total += len(chunk1)
+                    # If we didn't read a full chunk, we've reached EOF
+                    if len(chunk1) < read_size:
+                        break
+                identical = len(differences) == 0
+                return identical, differences
+        except FileNotFoundError as e:
+            raise ValueError(f"File not found: {e}")
+        except IOError as e:
+            raise ValueError(f"Error reading file: {str(e)}")
     def get_file_hash(self, file_path, chunk_size=8192):
         """

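
The new `_compare_files_streaming` method in `binary_comparator.py` reads both files in fixed-size chunks so that memory use stays bounded by the chunk size rather than the file size. The reduced sketch below illustrates just that core loop. `first_difference` is a hypothetical function that returns only the first differing byte offset, and it omits the offset ranges, hex context, and difference cap of the diffed method.

```python
import os
import tempfile


def first_difference(path1, path2, chunk_size=8192):
    """Return the offset of the first differing byte, or None if identical."""
    offset = 0
    with open(path1, "rb") as f1, open(path2, "rb") as f2:
        while True:
            chunk1 = f1.read(chunk_size)
            chunk2 = f2.read(chunk_size)
            if not chunk1 and not chunk2:
                return None  # both files exhausted without a mismatch
            if chunk1 != chunk2:
                common = min(len(chunk1), len(chunk2))
                for i in range(common):
                    if chunk1[i] != chunk2[i]:
                        return offset + i
                return offset + common  # one file ended before the other
            offset += len(chunk1)


with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, "a.bin")
    path_b = os.path.join(tmp, "b.bin")
    with open(path_a, "wb") as handle:
        handle.write(b"a" * 10000 + b"x")
    with open(path_b, "wb") as handle:
        handle.write(b"a" * 10000 + b"y")
    same = first_difference(path_a, path_a, chunk_size=4096)
    diff_at = first_difference(path_a, path_b, chunk_size=4096)

print(same, diff_at)  # None 10000
```

Comparing whole chunks with `!=` first and scanning byte by byte only on a mismatch keeps the common case (identical chunks) fast.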