PyPI - gengeneeval - Versions diffs - 0.4.0__tar.gz → 0.4.1__tar.gz - Mend

gengeneeval 0.4.0.tar.gz → 0.4.1.tar.gz


Files changed (40)

{gengeneeval-0.4.0 → gengeneeval-0.4.1}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: gengeneeval
-Version: 0.4.0
+Version: 0.4.1
 Summary: Comprehensive evaluation of generated gene expression data. Computes metrics between real and generated datasets with support for condition matching, DEG-focused evaluation, per-context analysis, train/test splits, memory-efficient lazy loading, CPU parallelization, GPU acceleration, and publication-quality visualizations.
 License: MIT
 License-File: LICENSE
@@ -256,6 +256,8 @@ GenEval supports **Differentially Expressed Genes (DEG)-focused evaluation**, co
 #### Key Features
 - **Fast DEG detection**: Vectorized Welch's t-test, Student's t-test, or Wilcoxon rank-sum
+- **DEG vs all-genes comparison**: Compute metrics on both and compare
+- **Flexible DEG selection**: Top N by significance, or threshold-based filtering
 - **Per-context evaluation**: Automatically evaluates each (covariate × perturbation) combination
 - **GPU acceleration**: DEG detection and metrics on GPU for large datasets
 - **Comprehensive reporting**: Aggregated and expanded results with visualizations
@@ -266,7 +268,7 @@ GenEval supports **Differentially Expressed Genes (DEG)-focused evaluation**, co
 from geneval import evaluate_degs
 import pandas as pd
-# Evaluate with DEG-focused metrics
+# Evaluate with DEG-focused metrics (computes both DEG and all-genes by default)
 results = evaluate_degs(
     real_data=real_adata.X,           # (n_samples, n_genes)
     generated_data=gen_adata.X,
@@ -276,29 +278,62 @@ results = evaluate_degs(
     control_key="control",            # Value indicating control samples
     perturbation_column="perturbation",
     deg_method="welch",               # or "student", "wilcoxon", "logfc"
-    pval_threshold=0.05,
+    pval_threshold=0.05,              # Significance threshold
     lfc_threshold=0.5,                # log2 fold change threshold
+    compute_all_genes=True,           # Also compute metrics on all genes
     device="cuda",                    # GPU acceleration
 )
-# Access results
-print(results.aggregated_metrics)     # Summary across all contexts
-print(results.expanded_metrics)       # Per-context metrics
+# Compare DEG-only vs all-genes metrics
+print(results.comparison_summary)
+#        metric  deg_mean  all_mean  difference  ratio
+# wasserstein_1     5.34      0.69        4.65   7.74
+#           mmd     1.14      0.13        1.02   9.00
+# Access per-context results
+print(results.expanded_metrics)       # Has deg_* and all_* columns
 print(results.deg_summary)            # DEG counts per context
 # Save results with plots
 results.save("deg_evaluation/")
 ```
+#### DEG Selection Control
+```python
+# Option 1: Top N most significant DEGs
+results = evaluate_degs(
+    ...,
+    n_top_degs=50,      # Use only top 50 DEGs by adjusted p-value
+)
+# Option 2: Stricter thresholds
+results = evaluate_degs(
+    ...,
+    pval_threshold=0.01,    # More stringent p-value
+    lfc_threshold=1.0,      # 2-fold change minimum
+)
+# Option 3: DEGs only (skip all-genes metrics for speed)
+results = evaluate_degs(
+    ...,
+    compute_all_genes=False,
+)
+# Get DEG-only or all-genes metrics separately
+deg_only = results.get_deg_only_metrics()
+all_genes = results.get_all_genes_metrics()
+```
 #### Per-Context Evaluation
 When multiple condition columns are provided (e.g., `["cell_type", "perturbation"]`), GenEval evaluates **every combination** separately:
-| Context | n_DEGs | W1 (DEGs only) | MMD (DEGs only) |
-|---------|--------|----------------|-----------------|
-| TypeA_drug1 | 234 | 0.42 | 0.031 |
-| TypeA_drug2 | 189 | 0.38 | 0.027 |
-| TypeB_drug1 | 312 | 0.51 | 0.045 |
+| Context | n_DEGs | deg_W1 | all_W1 | deg_MMD | all_MMD |
+|---------|--------|--------|--------|---------|---------|
+| TypeA_drug1 | 234 | 5.42 | 0.69 | 1.03 | 0.13 |
+| TypeA_drug2 | 189 | 4.38 | 0.71 | 0.92 | 0.12 |
+| TypeB_drug1 | 312 | 6.51 | 0.68 | 1.21 | 0.14 |
 If only `perturbation` column is provided, evaluation is done per-perturbation.
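
Note on the `comparison_summary` columns shown in the hunk above: the diff does not include the code that produces them, but the sample row for `wasserstein_1` is consistent with `difference = deg_mean - all_mean` and `ratio = deg_mean / all_mean`. The sketch below reproduces that aggregation from the per-context example table, as a reading aid only. The `summarize` helper and the dictionary layout are hypothetical and are not part of the `geneval` API.

```python
from statistics import mean

# Per-context values copied from the example table in the README diff.
contexts = {
    "TypeA_drug1": {"deg_W1": 5.42, "all_W1": 0.69, "deg_MMD": 1.03, "all_MMD": 0.13},
    "TypeA_drug2": {"deg_W1": 4.38, "all_W1": 0.71, "deg_MMD": 0.92, "all_MMD": 0.12},
    "TypeB_drug1": {"deg_W1": 6.51, "all_W1": 0.68, "deg_MMD": 1.21, "all_MMD": 0.14},
}


def summarize(contexts, metrics=("W1", "MMD")):
    """Average each metric over contexts and compare DEG-only vs all-genes.

    Hypothetical helper: assumes difference = deg_mean - all_mean and
    ratio = deg_mean / all_mean, which is inferred from the README sample,
    not taken from the package source.
    """
    rows = []
    for metric in metrics:
        deg_mean = mean(ctx[f"deg_{metric}"] for ctx in contexts.values())
        all_mean = mean(ctx[f"all_{metric}"] for ctx in contexts.values())
        rows.append(
            {
                "metric": metric,
                "deg_mean": deg_mean,
                "all_mean": all_mean,
                "difference": deg_mean - all_mean,
                "ratio": deg_mean / all_mean,
            }
        )
    return rows


summary = summarize(contexts)
for row in summary:
    print(
        f"{row['metric']:>4}  deg={row['deg_mean']:.2f}  all={row['all_mean']:.2f}  "
        f"diff={row['difference']:.2f}  ratio={row['ratio']:.2f}"
    )
```

A ratio well above 1 means the generated data deviates more from the real data on the DEGs than on the full gene set, which is the contrast the new all-genes columns are meant to expose.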

{gengeneeval-0.4.0 → gengeneeval-0.4.1}/README.md RENAMED

@@ -216,6 +216,8 @@ GenEval supports **Differentially Expressed Genes (DEG)-focused evaluation**, co
 #### Key Features
 - **Fast DEG detection**: Vectorized Welch's t-test, Student's t-test, or Wilcoxon rank-sum
+- **DEG vs all-genes comparison**: Compute metrics on both and compare
+- **Flexible DEG selection**: Top N by significance, or threshold-based filtering
 - **Per-context evaluation**: Automatically evaluates each (covariate × perturbation) combination
 - **GPU acceleration**: DEG detection and metrics on GPU for large datasets
 - **Comprehensive reporting**: Aggregated and expanded results with visualizations
@@ -226,7 +228,7 @@ GenEval supports **Differentially Expressed Genes (DEG)-focused evaluation**, co
 from geneval import evaluate_degs
 import pandas as pd
-# Evaluate with DEG-focused metrics
+# Evaluate with DEG-focused metrics (computes both DEG and all-genes by default)
 results = evaluate_degs(
     real_data=real_adata.X,           # (n_samples, n_genes)
     generated_data=gen_adata.X,
@@ -236,29 +238,62 @@ results = evaluate_degs(
     control_key="control",            # Value indicating control samples
     perturbation_column="perturbation",
     deg_method="welch",               # or "student", "wilcoxon", "logfc"
-    pval_threshold=0.05,
+    pval_threshold=0.05,              # Significance threshold
     lfc_threshold=0.5,                # log2 fold change threshold
+    compute_all_genes=True,           # Also compute metrics on all genes
     device="cuda",                    # GPU acceleration
 )
-# Access results
-print(results.aggregated_metrics)     # Summary across all contexts
-print(results.expanded_metrics)       # Per-context metrics
+# Compare DEG-only vs all-genes metrics
+print(results.comparison_summary)
+#        metric  deg_mean  all_mean  difference  ratio
+# wasserstein_1     5.34      0.69        4.65   7.74
+#           mmd     1.14      0.13        1.02   9.00
+# Access per-context results
+print(results.expanded_metrics)       # Has deg_* and all_* columns
 print(results.deg_summary)            # DEG counts per context
 # Save results with plots
 results.save("deg_evaluation/")
 ```
+#### DEG Selection Control
+```python
+# Option 1: Top N most significant DEGs
+results = evaluate_degs(
+    ...,
+    n_top_degs=50,      # Use only top 50 DEGs by adjusted p-value
+)
+# Option 2: Stricter thresholds
+results = evaluate_degs(
+    ...,
+    pval_threshold=0.01,    # More stringent p-value
+    lfc_threshold=1.0,      # 2-fold change minimum
+)
+# Option 3: DEGs only (skip all-genes metrics for speed)
+results = evaluate_degs(
+    ...,
+    compute_all_genes=False,
+)
+# Get DEG-only or all-genes metrics separately
+deg_only = results.get_deg_only_metrics()
+all_genes = results.get_all_genes_metrics()
+```
 #### Per-Context Evaluation
 When multiple condition columns are provided (e.g., `["cell_type", "perturbation"]`), GenEval evaluates **every combination** separately:
-| Context | n_DEGs | W1 (DEGs only) | MMD (DEGs only) |
-|---------|--------|----------------|-----------------|
-| TypeA_drug1 | 234 | 0.42 | 0.031 |
-| TypeA_drug2 | 189 | 0.38 | 0.027 |
-| TypeB_drug1 | 312 | 0.51 | 0.045 |
+| Context | n_DEGs | deg_W1 | all_W1 | deg_MMD | all_MMD |
+|---------|--------|--------|--------|---------|---------|
+| TypeA_drug1 | 234 | 5.42 | 0.69 | 1.03 | 0.13 |
+| TypeA_drug2 | 189 | 4.38 | 0.71 | 0.92 | 0.12 |
+| TypeB_drug1 | 312 | 6.51 | 0.68 | 1.21 | 0.14 |
 If only `perturbation` column is provided, evaluation is done per-perturbation.
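
Note on the "DEG Selection Control" options added in the hunk above: the selection logic itself is not part of this diff, so the sketch below only illustrates how threshold-based filtering and a top-N cap could combine. The `select_degs` helper is hypothetical. The strict `<` on the adjusted p-value, the use of the absolute log2 fold change, and applying the top-N cap after thresholding are all assumptions rather than confirmed package behaviour.

```python
def select_degs(genes, adj_pvals, log2_fc,
                pval_threshold=0.05, lfc_threshold=0.5, n_top=None):
    """Return gene names passing both thresholds, most significant first.

    Hypothetical helper mirroring the README parameter names; it is not
    the geneval implementation.
    """
    passing = [
        (pval, gene)
        for gene, pval, lfc in zip(genes, adj_pvals, log2_fc)
        if pval < pval_threshold and abs(lfc) >= lfc_threshold
    ]
    passing.sort()  # ascending adjusted p-value, ties broken by gene name
    selected = [gene for _, gene in passing]
    return selected if n_top is None else selected[:n_top]


# Made-up illustrative values, not real expression data.
genes = ["G1", "G2", "G3", "G4", "G5"]
adj_pvals = [0.001, 0.04, 0.20, 0.008, 0.03]
log2_fc = [1.8, -0.7, 2.5, -1.2, 0.3]

default_degs = select_degs(genes, adj_pvals, log2_fc)
strict_degs = select_degs(genes, adj_pvals, log2_fc,
                          pval_threshold=0.01, lfc_threshold=1.0)
top_one = select_degs(genes, adj_pvals, log2_fc, n_top=1)

# A log2 fold change threshold of 1.0 corresponds to a 2-fold change.
fold_change_at_threshold = 2 ** 1.0

print(default_degs, strict_degs, top_one, fold_change_at_threshold)
```

Under these assumptions, tightening the thresholds can only shrink the selected set, while `n_top` bounds its size regardless of how many genes pass.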

{gengeneeval-0.4.0 → gengeneeval-0.4.1}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "gengeneeval"
-version = "0.4.0"
+version = "0.4.1"
 description = "Comprehensive evaluation of generated gene expression data. Computes metrics between real and generated datasets with support for condition matching, DEG-focused evaluation, per-context analysis, train/test splits, memory-efficient lazy loading, CPU parallelization, GPU acceleration, and publication-quality visualizations."
 authors = ["GenEval Team <geneval@example.com>"]
 license = "MIT"

{gengeneeval-0.4.0 → gengeneeval-0.4.1}/src/geneval/__init__.py RENAMED

@@ -49,7 +49,7 @@ CLI Usage:
               --conditions perturbation cell_type --output results/
 """
-__version__ = "0.4.0"
+__version__ = "0.4.1"
 __author__ = "GenEval Team"
 # Main evaluation interface

{gengeneeval-0.4.0 → gengeneeval-0.4.1}/src/geneval/deg/__init__.py RENAMED

@@ -31,6 +31,8 @@ from .context import (
 from .evaluator import (
     DEGEvaluator,
     DEGEvaluationResult,
+    DEGSettings,
+    ContextMetrics,
     evaluate_degs,
 )
 from .visualization import (
@@ -56,6 +58,8 @@ __all__ = [
     # Evaluator
     "DEGEvaluator",
     "DEGEvaluationResult",
+    "DEGSettings",
+    "ContextMetrics",
     "evaluate_degs",
     # Visualization
     "plot_deg_distributions",
