PyPI - pdex - Versions diffs - 0.2.2__tar.gz → 0.2.3__tar.gz - Mend

pdex 0.2.2tar.gz → 0.2.3tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (18)

{pdex-0.2.2 → pdex-0.2.3}/CLAUDE.md +2 -2
{pdex-0.2.2 → pdex-0.2.3}/PKG-INFO +3 -3
{pdex-0.2.2 → pdex-0.2.3}/README.md +2 -2
{pdex-0.2.2 → pdex-0.2.3}/pyproject.toml +1 -1
{pdex-0.2.2 → pdex-0.2.3}/src/pdex/__init__.py +12 -5
{pdex-0.2.2 → pdex-0.2.3}/src/pdex/_math.py +16 -2
{pdex-0.2.2 → pdex-0.2.3}/tests/test_math.py +32 -0
{pdex-0.2.2 → pdex-0.2.3}/tests/test_pdex.py +54 -0
{pdex-0.2.2 → pdex-0.2.3}/.github/workflows/ci.yml +0 -0
{pdex-0.2.2 → pdex-0.2.3}/.github/workflows/release.yml +0 -0
{pdex-0.2.2 → pdex-0.2.3}/.gitignore +0 -0
{pdex-0.2.2 → pdex-0.2.3}/.python-version +0 -0
{pdex-0.2.2 → pdex-0.2.3}/LICENSE +0 -0
{pdex-0.2.2 → pdex-0.2.3}/src/pdex/_utils.py +0 -0
{pdex-0.2.2 → pdex-0.2.3}/src/pdex/py.typed +0 -0
{pdex-0.2.2 → pdex-0.2.3}/tests/conftest.py +0 -0
{pdex-0.2.2 → pdex-0.2.3}/tests/test_internals.py +0 -0
{pdex-0.2.2 → pdex-0.2.3}/tests/test_utils.py +0 -0

{pdex-0.2.2 → pdex-0.2.3}/CLAUDE.md

@@ -80,8 +80,8 @@ The returned Polars DataFrame (or pandas DataFrame when `as_pandas=True`) has co
 | `target_membership` | int   | Number of cells in the target group                                   |
 | `ref_membership`    | int   | Number of cells in the reference                                      |
 | `fold_change`       | float | **Deprecated** alias for `log2_fold_change` (identical values). Retained for one release; emits a `FutureWarning` on every `pdex(...)` call and will be removed in pdex 0.3.0. |
-| `log2_fold_change`  | float | log2((target_mean + epsilon) / (ref_mean + epsilon)) — computed from pseudobulk means |
-| `percent_change`    | float | (target_mean - ref_mean) / (ref_mean + epsilon) — computed from pseudobulk means |
+| `log2_fold_change`  | float | log2((target_mean + epsilon) / (ref_mean + epsilon)) — computed from pseudobulk means. Features unexpressed in both groups (`target_mean == ref_mean == 0`, only with `epsilon == 0`) give `0/0`, defined as `0.0` (not `NaN`); one-sided zeros still yield `±inf`. |
+| `percent_change`    | float | (target_mean - ref_mean) / (ref_mean + epsilon) — computed from pseudobulk means. Features unexpressed in both groups (`target_mean == ref_mean == 0`, only with `epsilon == 0`) give `0/0`, defined as `0.0` (not `NaN`); a zero reference with nonzero target still yields `+inf`. |
 | `p_value`           | float | Mann-Whitney U p-value (per-cell vectors)                             |
 | `statistic`         | float | Mann-Whitney U statistic                                              |
 | `fdr`               | float | FDR-corrected p-value, applied per-group across genes. For `on_target` mode, applied across all groups.                 |
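
As a worked illustration of the two formulas documented in the table above, the following standalone NumPy sketch evaluates them for a few hand-picked means, with and without a pseudocount. It does not call pdex: the helper names `lfc` and `pct` and the input values are invented for this example, and the `0/0 -> 0.0` rule is re-implemented here only to mirror the documented behaviour.

```python
import numpy as np


def lfc(target_mean, ref_mean, epsilon):
    # log2((target_mean + epsilon) / (ref_mean + epsilon)), with 0/0 mapped to 0.0
    with np.errstate(divide="ignore", invalid="ignore"):
        out = np.log2((target_mean + epsilon) / (ref_mean + epsilon))
    return np.where(np.isnan(out), 0.0, out)


def pct(target_mean, ref_mean, epsilon):
    # (target_mean - ref_mean) / (ref_mean + epsilon), with 0/0 mapped to 0.0
    with np.errstate(divide="ignore", invalid="ignore"):
        out = (target_mean - ref_mean) / (ref_mean + epsilon)
    return np.where(np.isnan(out), 0.0, out)


# Three hypothetical features: unexpressed in both groups,
# expressed only in the target, expressed only in the reference.
target = np.array([0.0, 4.0, 0.0])
ref = np.array([0.0, 0.0, 1.0])

no_pseudocount_lfc = lfc(target, ref, 0.0)    # [0.0, +inf, -inf]
no_pseudocount_pct = pct(target, ref, 0.0)    # [0.0, +inf, -1.0]
with_pseudocount_lfc = lfc(target, ref, 0.5)  # [0.0, log2(9), log2(1/3)]
with_pseudocount_pct = pct(target, ref, 0.5)  # [0.0, 8.0, -2/3]
```

With `epsilon = 0.5` the both-zero feature is already `0.0` without any special casing (`log2(0.5 / 0.5)` and `0 / 0.5`), which is consistent with the table's remark that the `0/0` case only arises when `epsilon == 0`.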

{pdex-0.2.2 → pdex-0.2.3}/PKG-INFO

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: pdex
-Version: 0.2.2
+Version: 0.2.3
 Summary: Parallel differential expression for single-cell perturbation sequencing
 Author-email: noam teyssier <noam.teyssier@arcinstitute.org>
 License-File: LICENSE
@@ -114,8 +114,8 @@ Returns a Polars DataFrame (or pandas if `as_pandas=True`) with one row per (gro
 | `target_membership` | Number of cells in the target group                |
 | `ref_membership`    | Number of cells in the reference                   |
 | `fold_change`       | **Deprecated alias** for `log2_fold_change` (identical values). Will be removed in pdex 0.3.0. |
-| `log2_fold_change`  | log2(target_mean / ref_mean)                       |
-| `percent_change`    | (target_mean - ref_mean) / ref_mean                |
+| `log2_fold_change`  | log2(target_mean / ref_mean). Genes unexpressed in both groups (0/0) report `0.0`, not `NaN`. |
+| `percent_change`    | (target_mean - ref_mean) / ref_mean. Genes unexpressed in both groups (0/0) report `0.0`, not `NaN`. |
 | `p_value`           | Mann-Whitney U p-value                             |
 | `statistic`         | Mann-Whitney U statistic                           |
 | `fdr`               | FDR-corrected p-value (per-group, across genes). For `on_target` mode, this is applied across all groups.    |

{pdex-0.2.2 → pdex-0.2.3}/README.md

@@ -96,8 +96,8 @@ Returns a Polars DataFrame (or pandas if `as_pandas=True`) with one row per (gro
 | `target_membership` | Number of cells in the target group                |
 | `ref_membership`    | Number of cells in the reference                   |
 | `fold_change`       | **Deprecated alias** for `log2_fold_change` (identical values). Will be removed in pdex 0.3.0. |
-| `log2_fold_change`  | log2(target_mean / ref_mean)                       |
-| `percent_change`    | (target_mean - ref_mean) / ref_mean                |
+| `log2_fold_change`  | log2(target_mean / ref_mean). Genes unexpressed in both groups (0/0) report `0.0`, not `NaN`. |
+| `percent_change`    | (target_mean - ref_mean) / ref_mean. Genes unexpressed in both groups (0/0) report `0.0`, not `NaN`. |
 | `p_value`           | Mann-Whitney U p-value                             |
 | `statistic`         | Mann-Whitney U statistic                           |
 | `fdr`               | FDR-corrected p-value (per-group, across genes). For `on_target` mode, this is applied across all groups.    |

{pdex-0.2.2 → pdex-0.2.3}/pyproject.toml

@@ -1,6 +1,6 @@
 [project]
 name = "pdex"
-version = "0.2.2"
+version = "0.2.3"
 description = "Parallel differential expression for single-cell perturbation sequencing"
 readme = "README.md"
 authors = [{ name = "noam teyssier", email = "noam.teyssier@arcinstitute.org" }]

{pdex-0.2.2 → pdex-0.2.3}/src/pdex/__init__.py

@@ -203,11 +203,13 @@ def pdex(
         If ``True``, return a :class:`pandas.DataFrame` instead of a
         :class:`polars.DataFrame`. Requires ``pyarrow``.
     epsilon:
-        Pseudocount added to both ``target_mean`` and ``ref_mean`` before computing
-        ``fold_change`` and ``percent_change``. When ``epsilon > 0``, extreme
-        values from near-zero reference means (scRNA-seq sparsity artifact) are
-        dampened toward zero. Has no effect on the Mann-Whitney U p-value or FDR.
-        Default ``0.0`` preserves existing behaviour.
+        Pseudocount added to the denominator (and, for ``log2_fold_change``, the
+        numerator) before computing ``fold_change`` and ``percent_change``. When
+        ``epsilon > 0``, extreme values from near-zero reference means (scRNA-seq
+        sparsity artifact) are dampened toward zero. Has no effect on the
+        Mann-Whitney U p-value or FDR. Default ``0.0`` preserves existing behaviour;
+        regardless of ``epsilon``, features unexpressed in both groups report
+        ``0.0`` (no change) rather than ``NaN`` (see Returns).
         **Recommended usage:** For scRNA-seq CRISPRi/CRISPRa screens where many
         genes are unexpressed in the reference group, start with ``epsilon=0.5``.
@@ -243,6 +245,11 @@ def pdex(
         ``log2((target_mean + epsilon) / (ref_mean + epsilon))`` and
         ``percent_change`` is ``(target_mean - ref_mean) / (ref_mean + epsilon)``.
+        Features unexpressed in both groups (``target_mean == ref_mean == 0`` with
+        ``epsilon == 0``) would evaluate to ``0 / 0``; both ``log2_fold_change``
+        and ``percent_change`` define this as ``0.0`` (no change) rather than
+        ``NaN``. One-sided zeros still produce ``±inf``.
         ``fold_change`` is a **deprecated** alias for ``log2_fold_change``
         (identical values). It is retained for one release to ease migration
         and will be removed in pdex 0.3.0. New code should read

{pdex-0.2.2 → pdex-0.2.3}/src/pdex/_math.py

@@ -112,8 +112,15 @@ def log2_fold_change(x: np.ndarray, y: np.ndarray, epsilon: float = 0.0) -> np.n
     When ``epsilon > 0``, adds a small pseudocount to both numerator and
     denominator before taking the ratio, dampening extreme fold changes that arise
     when the reference mean is near zero (scRNA-seq sparsity artifact).
+    Entries that are zero in both arrays (a feature unexpressed in both groups,
+    only possible when ``epsilon == 0``) evaluate to ``log2(0 / 0)``; these are
+    defined as ``0.0`` (no change) rather than ``NaN``. Legitimate ``±inf`` values
+    from one-sided zeros are preserved.
     """
-    return np.log2((x + epsilon) / (y + epsilon))
+    lfc = np.log2((x + epsilon) / (y + epsilon))
+    lfc[np.isnan(lfc)] = 0.0
+    return lfc
 @nb.njit(parallel=True)
@@ -125,8 +132,15 @@ def percent_change(
     When ``prior_count > 0``, adds a pseudocount to the denominator before
     computing the ratio, dampening extreme values when the reference mean is
     near zero (scRNA-seq sparsity artifact).
+    Entries that are zero in both arrays (a feature unexpressed in both groups,
+    only possible when ``prior_count == 0``) evaluate to ``0 / 0``; these are
+    defined as ``0.0`` (no change) rather than ``NaN``. Legitimate ``+inf`` values
+    from a zero reference are preserved.
     """
-    return (x - y) / (y + prior_count)
+    pc = (x - y) / (y + prior_count)
+    pc[np.isnan(pc)] = 0.0
+    return pc
 def mwu(
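
One property of the fill used in this hunk is worth spelling out: `arr[np.isnan(arr)] = 0.0` replaces every `NaN` in the result, whatever produced it. The sketch below is a standalone copy of the patched expression (the name `log2_fold_change_sketch` is hypothetical, and the `np.errstate` guard is added here only to silence NumPy warnings) that shows the fill also applies to `NaN` inputs and to negative ratios. Whether such inputs can reach this function inside pdex is not visible in the diff, so treat this as an observation about the expression, not about the package.

```python
import numpy as np


def log2_fold_change_sketch(x, y, epsilon=0.0):
    # Same arithmetic as the patched function: compute, then zero-fill NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        lfc = np.log2((x + epsilon) / (y + epsilon))
    lfc[np.isnan(lfc)] = 0.0
    return lfc


# The case the patch targets: 0/0.
both_zero = log2_fold_change_sketch(np.array([0.0]), np.array([0.0]))

# Cases the mask also catches, because it tests the output, not the inputs.
nan_input = log2_fold_change_sketch(np.array([np.nan]), np.array([2.0]))
negative_ratio = log2_fold_change_sketch(np.array([-1.0]), np.array([2.0]))

# One-sided zeros are not NaN, so they pass through unchanged.
one_sided = log2_fold_change_sketch(np.array([0.0, 4.0]), np.array([1.0, 0.0]))
```

A stricter variant, if only the documented case should be filled, would build the mask from the inputs instead, for example `(x == 0) & (y == 0)`.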

{pdex-0.2.2 → pdex-0.2.3}/tests/test_math.py

@@ -29,6 +29,22 @@ class TestFoldChange:
         result = log2_fold_change(x, y)
         np.testing.assert_allclose(result, [0.0, 1.0, 2.0, 3.0])
+    def test_zero_over_zero_is_zero(self):
+        """0/0 (unexpressed in both groups) is defined as 0.0, not NaN."""
+        x = np.array([0.0])
+        y = np.array([0.0])
+        result = log2_fold_change(x, y)
+        assert not np.isnan(result).any()
+        np.testing.assert_array_equal(result, [0.0])
+    def test_zero_over_zero_mixed_with_finite_and_inf(self):
+        """0/0 -> 0.0 while normal ratios and one-sided zeros are untouched."""
+        x = np.array([0.0, 4.0, 0.0, 4.0])
+        y = np.array([0.0, 2.0, 1.0, 0.0])
+        result = log2_fold_change(x, y)
+        # 0/0 -> 0.0, log2(2) -> 1.0, log2(0) -> -inf, log2(4/0) -> +inf
+        np.testing.assert_array_equal(result, [0.0, 1.0, -np.inf, np.inf])
 class TestPercentChange:
     def test_double(self):
@@ -54,6 +70,22 @@ class TestPercentChange:
         result = percent_change(x, y)
         np.testing.assert_allclose(result, [-0.5, 0.0, 0.5])
+    def test_zero_over_zero_is_zero(self):
+        """0/0 (unexpressed in both groups) is defined as 0.0, not NaN."""
+        x = np.array([0.0])
+        y = np.array([0.0])
+        result = percent_change(x, y)
+        assert not np.isnan(result).any()
+        np.testing.assert_array_equal(result, [0.0])
+    def test_zero_over_zero_mixed_with_finite_and_inf(self):
+        """0/0 -> 0.0 while normal ratios and a zero reference are untouched."""
+        x = np.array([0.0, 4.0, 4.0])
+        y = np.array([0.0, 2.0, 0.0])
+        result = percent_change(x, y)
+        # 0/0 -> 0.0, (4-2)/2 -> 1.0, (4-0)/0 -> +inf
+        np.testing.assert_array_equal(result, [0.0, 1.0, np.inf])
 class TestFoldChangeWithEpsilon:
     def test_zero_epsilon_matches_baseline(self):

{pdex-0.2.2 → pdex-0.2.3}/tests/test_pdex.py

@@ -682,3 +682,57 @@ class TestLog2FoldChangeColumn:
         finite = np.isfinite(expected) & np.isfinite(actual)
         assert finite.any()
         np.testing.assert_allclose(actual[finite], expected[finite], rtol=1e-6)
+class TestUnexpressedInBothGroups:
+    """A feature unexpressed in both groups (0/0) reports 0.0, not NaN."""
+    @pytest.mark.parametrize("mode", ["ref", "all"])
+    def test_zero_in_both_is_zero_not_nan(self, small_adata, mode):
+        """gene_0 is zero everywhere -> 0/0 in every comparison -> 0.0."""
+        adata = small_adata.copy()
+        adata.X[:, 0] = 0.0  # gene_0 unexpressed in every cell
+        result = pdex(adata, groupby="guide", mode=mode, is_log1p=False, epsilon=0.0)
+        gene0 = result.filter(pl.col("feature") == "gene_0")
+        assert (gene0["target_mean"].to_numpy() == 0).all()
+        assert (gene0["ref_mean"].to_numpy() == 0).all()
+        for col in ["log2_fold_change", "fold_change", "percent_change"]:
+            values = gene0[col].to_numpy()
+            assert not np.isnan(values).any(), f"{col} contains NaN"
+            np.testing.assert_array_equal(values, 0.0)
+    def test_on_target_zero_in_both_is_zero_not_nan(self, on_target_adata):
+        """on_target mode: a targeted gene that is zero everywhere reports 0.0."""
+        adata = on_target_adata.copy()
+        adata.X[:, 1] = 0.0  # group "A" targets gene_1
+        result = pdex(
+            adata,
+            groupby="guide",
+            mode="on_target",
+            gene_col="target_gene",
+            is_log1p=False,
+            epsilon=0.0,
+        )
+        row = result.filter(pl.col("target") == "A")
+        assert row["target_mean"].to_numpy()[0] == 0
+        assert row["ref_mean"].to_numpy()[0] == 0
+        for col in ["log2_fold_change", "fold_change", "percent_change"]:
+            value = row[col].to_numpy()[0]
+            assert not np.isnan(value), f"{col} is NaN"
+            assert value == 0.0
+    def test_one_sided_zero_still_infinite(self, small_adata):
+        """Only 0/0 is filled; a zero target over a nonzero reference stays infinite."""
+        adata = small_adata.copy()
+        # gene_0 expressed only in the reference -> target_mean 0, ref_mean > 0
+        adata.X[:, 0] = 0.0
+        adata.X[adata.obs["guide"].to_numpy() == "non-targeting", 0] = 5.0
+        result = pdex(adata, groupby="guide", mode="ref", is_log1p=False, epsilon=0.0)
+        gene0 = result.filter(pl.col("feature") == "gene_0")
+        # log2(0 / ref) -> -inf; percent_change (0 - ref) / ref -> -1.0
+        assert np.isneginf(gene0["log2_fold_change"].to_numpy()).all()
+        np.testing.assert_allclose(gene0["percent_change"].to_numpy(), -1.0)
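
Because a both-zero feature now reports `0.0`, downstream code can no longer use `NaN` to tell "unexpressed in both groups" apart from "expressed equally in both groups". A minimal sketch of one way to recover that distinction from the `target_mean` and `ref_mean` columns follows; the arrays are hypothetical stand-ins for those result columns, not output from an actual `pdex(...)` call.

```python
import numpy as np

# Hypothetical column values for four features (epsilon == 0).
target_mean = np.array([0.0, 3.0, 0.0, 2.0])
ref_mean = np.array([0.0, 3.0, 5.0, 0.0])
log2_fold_change = np.array([0.0, 0.0, -np.inf, np.inf])

# Unexpressed in both groups: the 0/0 case that is reported as 0.0.
unexpressed_in_both = (target_mean == 0) & (ref_mean == 0)

# One-sided zeros are still reported as +/-inf.
one_sided_zero = np.isinf(log2_fold_change)

# Expressed in both groups at the same level: a measured "no change".
measured_no_change = (log2_fold_change == 0.0) & ~unexpressed_in_both
```

The same masks can be written as column expressions on the returned DataFrame; plain NumPy is used here only to keep the sketch free of extra dependencies.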

{pdex-0.2.2 → pdex-0.2.3}/.github/workflows/ci.yml

File without changes

{pdex-0.2.2 → pdex-0.2.3}/.github/workflows/release.yml

File without changes

{pdex-0.2.2 → pdex-0.2.3}/.gitignore

File without changes

{pdex-0.2.2 → pdex-0.2.3}/.python-version

File without changes

{pdex-0.2.2 → pdex-0.2.3}/LICENSE

File without changes

{pdex-0.2.2 → pdex-0.2.3}/src/pdex/_utils.py

File without changes

{pdex-0.2.2 → pdex-0.2.3}/src/pdex/py.typed

File without changes

{pdex-0.2.2 → pdex-0.2.3}/tests/conftest.py

File without changes

{pdex-0.2.2 → pdex-0.2.3}/tests/test_internals.py

File without changes

{pdex-0.2.2 → pdex-0.2.3}/tests/test_utils.py

File without changes
