pytest-drift 0.1.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,146 @@
Metadata-Version: 2.4
Name: pytest-drift
Version: 0.1.0
Summary: Pytest plugin for regression testing via branch comparison
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: pytest>=7.0
Requires-Dist: cloudpickle>=3.0
Requires-Dist: pandas>=1.5
Provides-Extra: datacompy
Requires-Dist: datacompy>=0.9; extra == "datacompy"
Provides-Extra: parquet
Requires-Dist: pyarrow>=10.0; extra == "parquet"
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pandas; extra == "dev"
Requires-Dist: numpy; extra == "dev"
Requires-Dist: datacompy>=0.9; extra == "dev"
Requires-Dist: pyarrow>=10.0; extra == "dev"

# pytest-drift

A pytest plugin for regression testing via branch comparison. When a test returns a value, the plugin runs the same test on a base git branch and compares the results — catching regressions before they merge.

## How it works

1. You run `pytest --drift BASE_BRANCH`
2. For every test that **returns a non-None value**, the plugin:
   - Records the return value from the current branch (HEAD)
   - Simultaneously runs the same tests on `BASE_BRANCH` in a git worktree
   - Compares the two results at the end of the session
3. Tests returning `None` (the default for normal pytest tests) are ignored entirely

The base branch runs in parallel with your HEAD tests, so total wall time is approximately `max(HEAD_time, BASE_time)` rather than `HEAD_time + BASE_time`.
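The overlap described above can be sketched with stdlib threading; `run_suite` and its sleep durations are illustrative stand-ins for shelling out to pytest in each checkout:

```python
import threading
import time

def run_suite(label: str, duration: float, results: dict) -> None:
    # Stand-in for "run pytest in this checkout"; the real plugin shells
    # out to pytest, but the overlap works the same way.
    time.sleep(duration)
    results[label] = "done"

results: dict = {}
base = threading.Thread(target=run_suite, args=("BASE", 0.1, results))
base.start()                      # BASE suite starts in the background
run_suite("HEAD", 0.1, results)   # HEAD suite runs in the foreground
base.join()                       # wait for BASE before comparing
```

Total elapsed time here is roughly one `duration`, not two, which is the whole point of running both suites concurrently.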
35
+
36
+ ## Installation
37
+
38
+ ```bash
39
+ pip install pytest-drift
40
+
41
+ # With smart DataFrame diff reports (recommended):
42
+ pip install "pytest-drift[datacompy]"
43
+ ```
44
+
45
+ ## Usage
46
+
47
+ ### CLI flag
48
+
49
+ ```bash
50
+ pytest --drift main
51
+ pytest --drift origin/main
52
+ ```
53
+
54
+ ### Environment variable
55
+
56
+ ```bash
57
+ export PYTEST_DRIFT_BASE_BRANCH=main
58
+ pytest
59
+ ```
60
+
61
+ ## Writing regression tests
62
+
63
+ Return a value from your test — that's it:
64
+
65
+ ```python
66
+ def test_revenue_calculation():
67
+ df = compute_revenue(load_data())
68
+ return df # compared against the same function on BASE_BRANCH
69
+
70
+ def test_model_accuracy():
71
+ return evaluate_model() # compared as a float
72
+
73
+ def test_pipeline_output():
74
+ return run_pipeline() # compared as a dict, list, DataFrame, etc.
75
+ ```
76
+
77
+ Normal tests (returning `None`) are unaffected and run as usual.
78
+
79
+ ## Comparison logic
80
+
81
+ The plugin dispatches comparison based on the return type:
82
+
83
+ | Type | Comparison method |
84
+ |---|---|
85
+ | `pd.DataFrame` | Auto-detects join columns; uses `datacompy` if installed, else `pd.testing.assert_frame_equal` |
86
+ | `pd.Series` | Converted to DataFrame, same path as above |
87
+ | `float` / `np.floating` | `math.isclose` with `rtol=1e-5, atol=1e-8` |
88
+ | `np.ndarray` | `np.testing.assert_array_almost_equal` (5 decimal places) |
89
+ | `dict` | Recursive key-by-key comparison |
90
+ | `list` / `tuple` | Element-wise comparison |
91
+ | Everything else | `==`, with `repr()` diff on failure |
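The float row, for instance, boils down to a single `math.isclose` call (the helper name below is illustrative, not the plugin's internal name):

```python
import math

# The documented float rule: equal within rel_tol=1e-5 or abs_tol=1e-8.
def floats_match(head: float, base: float) -> bool:
    return math.isclose(head, base, rel_tol=1e-5, abs_tol=1e-8)

floats_match(1.000001, 1.0000015)  # tiny relative drift: treated as equal
floats_match(0.923, 0.941)         # real drift: flagged as a regression
```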
92
+
93
+ ### Pandas index auto-detection
94
+
95
+ When comparing DataFrames, the plugin automatically finds the best join key:
96
+
97
+ 1. **Named index**: if the DataFrame already has a named (non-RangeIndex) index, it's used directly
98
+ 2. **MultiIndex**: all named index levels are used
99
+ 3. **Column heuristic**: searches combinations of up to 3 non-float columns with full cardinality (every row is unique in that combination)
100
+ 4. **Positional fallback**: if no unique key is found, rows are compared positionally
101
+
102
+ You can also pass `join_columns` explicitly by calling `compare_dataframes` directly from `pandas_utils`.
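The column heuristic in step 3 can be sketched as follows; this is a simplified version of what `detect_index_columns` does (the cardinality pre-sort is omitted), and the example DataFrame is illustrative:

```python
import itertools
import pandas as pd

def find_join_key(df: pd.DataFrame, max_combo_size: int = 3):
    # Exclude float columns (poor join keys), then look for the smallest
    # column combination under which every row is unique.
    cols = [c for c in df.columns if not pd.api.types.is_float_dtype(df[c].dtype)]
    for r in range(1, min(max_combo_size, len(cols)) + 1):
        for combo in itertools.combinations(cols, r):
            if df.groupby(list(combo)).ngroups == len(df):
                return list(combo)
    return None  # no unique key: fall back to positional comparison

df = pd.DataFrame({
    "region": ["EU", "EU", "US", "US"],
    "year": [2023, 2024, 2023, 2024],
    "revenue": [1.0, 2.0, 3.0, 4.0],  # float column, skipped as a key
})
```

Here neither `region` nor `year` alone is unique, but the pair is, so the pair becomes the join key.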
103
+
104
+ ## Terminal output
105
+
106
+ At the end of the session a regression summary is printed:
107
+
108
+ ```
109
+ ========================================================================
110
+ REGRESSION COMPARISON SUMMARY
111
+ ========================================================================
112
+ PASSED tests/test_revenue.py::test_revenue_calculation
113
+ FAILED tests/test_model.py::test_model_accuracy
114
+ Float mismatch:
115
+ head: 0.923
116
+ base: 0.941
117
+ ------------------------------------------------------------------------
118
+ 1 passed, 1 failed (2 total regression comparisons)
119
+ ```
120
+
121
+ ## How branch switching works
122
+
123
+ The plugin uses `git worktree add` to check out `BASE_BRANCH` into a temporary directory — your working tree is never touched. The worktree is cleaned up automatically after the session.
124
+
125
+ ```
126
+ HEAD tests run ─────────────────────────▶ sessionfinish
127
+
128
+ git worktree add ──▶ BASE tests run in parallel ────┘ compare
129
+ ```
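The worktree lifecycle the plugin relies on can be reproduced by hand. The snippet below is a self-contained sketch (the throwaway repo, paths, and commit are illustrative, not what the plugin literally runs):

```bash
repo=$(mktemp -d)
wt="${repo}-base"
cd "$repo"
git init -q
git -c user.name=ci -c user.email=ci@example.com commit -q --allow-empty -m init
git worktree add "$wt" HEAD          # detached checkout of the base commit
git -C "$wt" rev-parse --short HEAD  # the BASE suite would run inside "$wt"
git worktree remove "$wt"            # cleanup, as the plugin does at session end
```

Because the checkout lands in a separate directory, the main working tree (including uncommitted changes) is never disturbed.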
130
+
131
+ ## Requirements
132
+
133
+ | Package | Required | Purpose |
134
+ |---|---|---|
135
+ | `pytest >= 7.0` | Yes | Core |
136
+ | `cloudpickle >= 3.0` | Yes | Serialization of return values |
137
+ | `pandas >= 1.5` | Yes | DataFrame/Series support |
138
+ | `datacompy >= 0.9` | Optional | Rich DataFrame diff reports |
139
+ | `pyarrow >= 10.0` | Optional | Parquet storage for large DataFrames |
140
+
141
+ ## Caveats
142
+
143
+ - The base branch subprocess uses the same Python environment as HEAD — if your project uses `tox` or `nox`, point to the correct environment
144
+ - Session-scoped fixtures with side effects (e.g. starting a server) will run twice — once per session
145
+ - Tests that fail on HEAD are not compared (no base result is fetched for them)
146
+ - Tests that fail on BASE produce a "base branch test failed, cannot compare" warning
@@ -0,0 +1,33 @@
[build-system]
requires = ["setuptools>=42", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "pytest-drift"
version = "0.1.0"
description = "Pytest plugin for regression testing via branch comparison"
readme = "README.md"
requires-python = ">=3.10"
dependencies = [
    "pytest>=7.0",
    "cloudpickle>=3.0",
    "pandas>=1.5",
]

[project.optional-dependencies]
datacompy = ["datacompy>=0.9"]
parquet = ["pyarrow>=10.0"]
dev = [
    "pytest",
    "pandas",
    "numpy",
    "datacompy>=0.9",
    "pyarrow>=10.0",
]

[project.entry-points.pytest11]
drift = "pytest_drift.plugin"

[tool.setuptools.packages.find]
where = ["."]
include = ["pytest_drift*"]
@@ -0,0 +1,154 @@
"""Type-dispatching comparison logic for test return values."""
from __future__ import annotations

import math
from typing import Any

from .pandas_utils import ComparisonResult, compare_dataframes, compare_series


def compare_values(head: Any, base: Any, node_id: str = "") -> ComparisonResult:
    """
    Compare head (current branch) and base (base branch) return values.
    Dispatches based on type.
    """
    result = _dispatch(head, base)
    result.node_id = node_id
    return result


def _dispatch(head: Any, base: Any) -> ComparisonResult:
    # Check pandas types first (before generic checks)
    try:
        import pandas as pd

        if isinstance(head, pd.DataFrame) and isinstance(base, pd.DataFrame):
            return compare_dataframes(head, base)
        if isinstance(head, pd.Series) and isinstance(base, pd.Series):
            return compare_series(head, base)
        if isinstance(head, (pd.DataFrame, pd.Series)) or isinstance(
            base, (pd.DataFrame, pd.Series)
        ):
            return ComparisonResult(
                equal=False,
                report=f"Type mismatch: head={type(head).__name__}, base={type(base).__name__}",
            )
    except ImportError:
        pass

    # numpy arrays
    try:
        import numpy as np

        if isinstance(head, np.ndarray) and isinstance(base, np.ndarray):
            return _compare_arrays(head, base)
    except ImportError:
        pass

    # float scalars
    if isinstance(head, float) and isinstance(base, float):
        return _compare_floats(head, base)

    # numpy scalars that are float-like
    try:
        import numpy as np

        if isinstance(head, np.floating) and isinstance(base, np.floating):
            return _compare_floats(float(head), float(base))
    except ImportError:
        pass

    # dict
    if isinstance(head, dict) and isinstance(base, dict):
        return _compare_dicts(head, base)

    # list / tuple
    if isinstance(head, (list, tuple)) and isinstance(base, (list, tuple)):
        return _compare_sequences(head, base)

    # generic fallback
    return _compare_generic(head, base)


def _compare_floats(head: float, base: float, rtol: float = 1e-5, atol: float = 1e-8) -> ComparisonResult:
    if math.isnan(head) and math.isnan(base):
        return ComparisonResult(equal=True, report=None)
    equal = math.isclose(head, base, rel_tol=rtol, abs_tol=atol)
    report = None if equal else f"Float mismatch: head={head!r}, base={base!r}"
    return ComparisonResult(equal=equal, report=report)


def _compare_arrays(head, base) -> ComparisonResult:
    import numpy as np

    if head.shape != base.shape:
        return ComparisonResult(
            equal=False,
            report=f"Shape mismatch: head={head.shape}, base={base.shape}",
        )
    try:
        np.testing.assert_array_almost_equal(head, base, decimal=5)
        return ComparisonResult(equal=True, report=None)
    except AssertionError as e:
        return ComparisonResult(equal=False, report=str(e))


def _compare_dicts(head: dict, base: dict) -> ComparisonResult:
    head_keys = set(head.keys())
    base_keys = set(base.keys())

    if head_keys != base_keys:
        only_head = head_keys - base_keys
        only_base = base_keys - head_keys
        parts = []
        if only_head:
            parts.append(f"Keys only in head: {sorted(str(k) for k in only_head)}")
        if only_base:
            parts.append(f"Keys only in base: {sorted(str(k) for k in only_base)}")
        return ComparisonResult(equal=False, report="\n".join(parts))

    mismatches = []
    for key in sorted(head_keys, key=str):
        sub = _dispatch(head[key], base[key])
        if not sub.equal:
            mismatches.append(f"  Key {key!r}: {sub.report}")

    if mismatches:
        return ComparisonResult(
            equal=False, report="Dict mismatches:\n" + "\n".join(mismatches)
        )
    return ComparisonResult(equal=True, report=None)


def _compare_sequences(head, base) -> ComparisonResult:
    if len(head) != len(base):
        return ComparisonResult(
            equal=False,
            report=f"Length mismatch: head={len(head)}, base={len(base)}",
        )
    mismatches = []
    for i, (h, b) in enumerate(zip(head, base)):
        sub = _dispatch(h, b)
        if not sub.equal:
            mismatches.append(f"  Index {i}: {sub.report}")
    if mismatches:
        return ComparisonResult(
            equal=False,
            report=f"{type(head).__name__} mismatches:\n" + "\n".join(mismatches),
        )
    return ComparisonResult(equal=True, report=None)


def _compare_generic(head: Any, base: Any) -> ComparisonResult:
    try:
        equal = bool(head == base)
    except Exception:
        equal = False

    if equal:
        return ComparisonResult(equal=True, report=None)

    return ComparisonResult(
        equal=False,
        report=f"Value mismatch:\n  head: {head!r}\n  base: {base!r}",
    )
@@ -0,0 +1,135 @@
"""Pandas DataFrame/Series index auto-detection and comparison."""
from __future__ import annotations

import itertools
from dataclasses import dataclass, field
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    import pandas as pd


@dataclass
class ComparisonResult:
    equal: bool
    report: str | None
    node_id: str = ""
    extra: dict = field(default_factory=dict)


def detect_index_columns(df: "pd.DataFrame", max_combo_size: int = 3) -> list[str] | None:
    """
    Auto-detect which columns can serve as join keys for comparison.

    Priority:
    1. Named non-RangeIndex (already set as index)
    2. Heuristic: find smallest combo of non-float cols with full cardinality
    3. None → fall back to positional comparison
    """
    import pandas as pd

    # Case A: already has a meaningful named index
    if not isinstance(df.index, pd.RangeIndex):
        if isinstance(df.index, pd.MultiIndex):
            if all(name is not None for name in df.index.names):
                return list(df.index.names)
        elif df.index.name is not None:
            return [df.index.name]

    n = len(df)
    if n == 0 or len(df.columns) == 0:
        return None

    # Case B: heuristic column search (exclude float columns — poor join keys)
    candidate_cols = [
        c for c in df.columns if not pd.api.types.is_float_dtype(df[c].dtype)
    ]
    # Sort by cardinality descending (higher = better key candidate)
    candidate_cols = sorted(candidate_cols, key=lambda c: df[c].nunique(), reverse=True)

    for r in range(1, min(max_combo_size + 1, len(candidate_cols) + 1)):
        for combo in itertools.combinations(candidate_cols, r):
            try:
                if df.groupby(list(combo)).ngroups == n:
                    return list(combo)
            except Exception:
                continue

    return None


def _reset_named_index(df: "pd.DataFrame") -> "pd.DataFrame":
    """If df has a named index, reset it to columns."""
    import pandas as pd

    if not isinstance(df.index, pd.RangeIndex):
        return df.reset_index()
    return df


def compare_dataframes(
    head_df: "pd.DataFrame",
    base_df: "pd.DataFrame",
    join_columns: list[str] | None = None,
) -> ComparisonResult:
    """Compare two DataFrames, auto-detecting join columns if not provided."""
    import pandas as pd

    head_flat = _reset_named_index(head_df)
    base_flat = _reset_named_index(base_df)

    if join_columns is None:
        join_columns = detect_index_columns(head_flat)

    # Try datacompy first
    try:
        import datacompy

        if not hasattr(datacompy, "Compare"):
            raise ImportError("datacompy.Compare not available")

        if join_columns is None:
            # No key found; use all columns positionally by adding a row-number key
            head_flat = head_flat.copy()
            base_flat = base_flat.copy()
            head_flat["__row__"] = range(len(head_flat))
            base_flat["__row__"] = range(len(base_flat))
            join_columns = ["__row__"]

        cmp = datacompy.Compare(
            head_flat,
            base_flat,
            join_columns=join_columns,
            df1_name="head",
            df2_name="base",
        )
        equal = cmp.matches()
        return ComparisonResult(
            equal=equal,
            report=None if equal else cmp.report(),
        )
    except ImportError:
        pass

    # Fallback: pd.testing.assert_frame_equal
    try:
        if join_columns:
            head_sorted = head_flat.set_index(join_columns).sort_index()
            base_sorted = base_flat.set_index(join_columns).sort_index()
        else:
            head_sorted = head_flat.reset_index(drop=True)
            base_sorted = base_flat.reset_index(drop=True)

        pd.testing.assert_frame_equal(
            head_sorted, base_sorted, check_like=True, rtol=1e-5
        )
        return ComparisonResult(equal=True, report=None)
    except AssertionError as e:
        return ComparisonResult(equal=False, report=str(e))


def compare_series(head_s: "pd.Series", base_s: "pd.Series") -> ComparisonResult:
    """Compare two Series by converting to DataFrames."""
    head_df = head_s.to_frame(name=head_s.name or "value")
    base_df = base_s.to_frame(name=base_s.name or "value")
    return compare_dataframes(head_df, base_df)