gengeneeval-0.1.0-py3-none-any.whl

@@ -0,0 +1,172 @@
1
+ Metadata-Version: 2.4
2
+ Name: gengeneeval
3
+ Version: 0.1.0
4
+ Summary: Comprehensive evaluation of generated gene expression data. Computes metrics between real and generated datasets with support for condition matching, train/test splits, and publication-quality visualizations.
5
+ License: MIT
6
+ License-File: LICENSE
7
+ Keywords: gene expression,evaluation,metrics,single-cell,generative models,benchmarking
8
+ Author: GenEval Team
9
+ Author-email: geneval@example.com
10
+ Requires-Python: >=3.8,<4.0
11
+ Classifier: Development Status :: 4 - Beta
12
+ Classifier: Intended Audience :: Science/Research
13
+ Classifier: License :: OSI Approved :: MIT License
14
+ Classifier: Programming Language :: Python :: 3
15
+ Classifier: Programming Language :: Python :: 3.8
16
+ Classifier: Programming Language :: Python :: 3.9
17
+ Classifier: Programming Language :: Python :: 3.10
18
+ Classifier: Programming Language :: Python :: 3.11
19
+ Classifier: Programming Language :: Python :: 3.12
20
+ Classifier: Programming Language :: Python :: 3.13
21
+ Classifier: Programming Language :: Python :: 3.14
22
+ Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
23
+ Provides-Extra: full
24
+ Provides-Extra: gpu
25
+ Requires-Dist: anndata (>=0.8.0)
26
+ Requires-Dist: geomloss (>=0.2.1) ; extra == "full" or extra == "gpu"
27
+ Requires-Dist: matplotlib (>=3.5.0)
28
+ Requires-Dist: numpy (>=1.21.0)
29
+ Requires-Dist: pandas (>=1.3.0)
30
+ Requires-Dist: pykeops (>=1.4.0) ; extra == "full" or extra == "gpu"
31
+ Requires-Dist: scanpy (>=1.9.0)
32
+ Requires-Dist: scipy (>=1.7.0)
33
+ Requires-Dist: seaborn (>=0.11.0)
34
+ Requires-Dist: torch (>=1.9.0)
35
+ Requires-Dist: umap-learn (>=0.5.0) ; extra == "full"
36
+ Project-URL: Homepage, https://github.com/AndreaRubbi/GenGeneEval
37
+ Project-URL: Repository, https://github.com/AndreaRubbi/GenGeneEval
38
+ Description-Content-Type: text/markdown
39
+
40
+ # GenEval: Gene Expression Evaluation Framework
41
+
42
+ [![PyPI version](https://badge.fury.io/py/gengeneeval.svg)](https://badge.fury.io/py/gengeneeval)
43
+ [![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
44
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
45
+ [![Tests](https://github.com/AndreaRubbi/GenGeneEval/actions/workflows/tests.yml/badge.svg)](https://github.com/AndreaRubbi/GenGeneEval/actions)
46
+
47
+ **Comprehensive evaluation of generated gene expression data against real datasets.**
48
+
49
+ GenEval is a modular, object-oriented Python framework for computing metrics between real and generated gene expression datasets stored in AnnData (h5ad) format. It supports condition-based matching and train/test splits, and it generates publication-quality visualizations.
50
+
51
+ ## Features
52
+
53
+ ### Metrics
54
+ All metrics are computed **per gene** (returning one value per gene) and then **aggregated**:
55
+
56
+ | Metric | Description | Direction |
57
+ |--------|-------------|-----------|
58
+ | **Pearson Correlation** | Linear correlation between expression profiles | Higher is better |
59
+ | **Spearman Correlation** | Rank correlation (robust to outliers) | Higher is better |
60
+ | **Wasserstein-1** | Earth Mover's Distance (L1) | Lower is better |
61
+ | **Wasserstein-2** | Quadratic optimal transport | Lower is better |
62
+ | **MMD** | Maximum Mean Discrepancy (kernel-based) | Lower is better |
63
+ | **Energy Distance** | Distance between distributions based on pairwise sample distances | Lower is better |
64
+
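The README does not show how each metric is implemented (that code lives in `geneval/metrics/distances.py`, which this diff lists but does not display). As a rough illustration of what "per gene, then aggregated" could mean, here is a minimal sketch for Wasserstein-1 only. It assumes both matrices have the same number of samples, so the distance reduces to the mean absolute difference of sorted values, and it assumes the aggregate is a plain mean over genes. The function names are hypothetical and are not the package's API.

```python
from statistics import mean


def wasserstein_1(xs, ys):
    """Wasserstein-1 distance between two equally sized 1-D samples."""
    if len(xs) != len(ys):
        raise ValueError("this simplified formula needs equally sized samples")
    # For equal-size empirical distributions, W1 is the mean absolute
    # difference between the sorted values.
    return mean(abs(x - y) for x, y in zip(sorted(xs), sorted(ys)))


def per_gene_wasserstein_1(real, generated):
    """Return one distance per gene for two samples x genes matrices."""
    # zip(*matrix) transposes rows (samples) into columns (genes).
    return [wasserstein_1(r, g) for r, g in zip(zip(*real), zip(*generated))]


# Toy data: 3 samples x 2 genes (values are made up for illustration).
real = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
generated = [[2.0, 0.0], [3.0, 0.0], [4.0, 1.5]]

per_gene = per_gene_wasserstein_1(real, generated)  # vector, one entry per gene
aggregate = mean(per_gene)                          # assumed aggregation: mean
print(per_gene, aggregate)
```

For samples of different sizes, a library routine such as `scipy.stats.wasserstein_distance` handles the general 1-D case.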
65
+ ### Visualizations
66
+ - **Boxplots & Violin plots**: Metric distributions across conditions
67
+ - **Radar plots**: Multi-metric comparison
68
+ - **Scatter plots**: Real vs generated expression
69
+ - **Embedding plots**: PCA/UMAP of real vs generated data
70
+ - **Heatmaps**: Per-gene metric values
71
+
72
+ ### Key Features
73
+ - ✅ Condition-based matching (perturbation, cell type, etc.)
74
+ - ✅ Train/test split support
75
+ - ✅ Per-gene and aggregate metrics
76
+ - ✅ Modular, extensible architecture
77
+ - ✅ Command-line interface
78
+ - ✅ Publication-quality visualizations
79
+
80
+ ## Installation
81
+
82
+ ### Using pip
83
+ ```bash
84
+ pip install gengeneeval  # or, from a local clone: pip install -e .
85
+ ```
86
+
87
+ ### With GPU support (faster distance metrics)
88
+ ```bash
89
+ pip install "gengeneeval[gpu]"  # or, from a local clone: pip install -e ".[gpu]"
90
+ ```
91
+
92
+ ## Quick Start
93
+
94
+ ### Python API
95
+
96
+ ```python
97
+ from geneval import evaluate
98
+
99
+ # Run evaluation
100
+ results = evaluate(
101
+     real_path="real_data.h5ad",
102
+     generated_path="generated_data.h5ad",
103
+     condition_columns=["perturbation", "cell_type"],
104
+     split_column="split",  # Optional: for train/test
105
+     output_dir="evaluation_output/"
106
+ )
107
+
108
+ # Access results
109
+ print(results.summary())
110
+
111
+ # Get metrics for a specific split
112
+ test_results = results.get_split("test")
113
+ for condition, cond_result in test_results.conditions.items():
114
+     print(f"{condition}: Pearson={cond_result.get_metric_value('pearson'):.3f}")
115
+ ```
116
+
117
+ ### Command Line
118
+
119
+ ```bash
120
+ # Basic usage
121
+ geneval --real real.h5ad --generated generated.h5ad \
122
+     --conditions perturbation cell_type \
123
+     --output results/
124
+
125
+ # With split column
126
+ geneval --real real.h5ad --generated generated.h5ad \
127
+     --conditions perturbation \
128
+     --split-column split \
129
+     --splits test \
130
+     --output results/
131
+
132
+ # Specify metrics
133
+ geneval --real real.h5ad --generated generated.h5ad \
134
+     --conditions perturbation \
135
+     --metrics pearson spearman wasserstein_1 mmd \
136
+     --output results/
137
+ ```
138
+
139
+ ## Expected Data Format
140
+
141
+ GenEval expects AnnData (h5ad) files with:
142
+
143
+ ### Required
144
+ - `adata.X`: Gene expression matrix (samples × genes)
145
+ - `adata.var_names`: Gene identifiers (must overlap between datasets)
146
+ - `adata.obs[condition_columns]`: Columns for matching conditions
147
+
148
+ ### Optional
149
+ - `adata.obs[split_column]`: Train/test split indicator
150
+
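The requirements above imply two alignment steps before any metric can be computed: restricting both datasets to their shared genes and pairing samples by condition. The package's own matching logic (presumably in `geneval/data/loader.py`) is not shown in this diff, so the following is only a sketch of the idea using plain lists and dicts as stand-ins for `adata.var_names` and `adata.obs`. All names and values in it are hypothetical.

```python
from collections import defaultdict


def group_by_condition(obs_rows, condition_columns):
    """Map each condition tuple to the indices of the samples carrying it."""
    groups = defaultdict(list)
    for idx, row in enumerate(obs_rows):
        key = tuple(row[col] for col in condition_columns)
        groups[key].append(idx)
    return dict(groups)


# Stand-ins for adata.obs in the real and generated datasets.
real_obs = [
    {"perturbation": "ctrl", "cell_type": "T"},
    {"perturbation": "drugA", "cell_type": "T"},
    {"perturbation": "drugA", "cell_type": "B"},
]
generated_obs = [
    {"perturbation": "drugA", "cell_type": "T"},
    {"perturbation": "drugA", "cell_type": "T"},
    {"perturbation": "drugB", "cell_type": "B"},
]
columns = ["perturbation", "cell_type"]

real_groups = group_by_condition(real_obs, columns)
generated_groups = group_by_condition(generated_obs, columns)

# Only conditions present in both datasets can be compared.
shared_conditions = sorted(set(real_groups) & set(generated_groups))

# Stand-ins for adata.var_names; keep the real dataset's gene order.
real_genes = ["GeneA", "GeneB", "GeneC"]
generated_genes = ["GeneB", "GeneC", "GeneD"]
generated_set = set(generated_genes)
shared_genes = [g for g in real_genes if g in generated_set]

print(shared_conditions, shared_genes)
```

Conditions or genes that appear in only one dataset are simply dropped in this sketch; whether the package warns, errors, or skips in that situation is not stated in the README.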
151
+ ## Output Structure
152
+
153
+ ```
154
+ output/
155
+ ├── summary.json # Aggregate metrics and metadata
156
+ ├── results.csv # Per-condition metrics table
157
+ ├── per_gene_*.csv # Per-gene metric values
158
+ └── plots/
159
+     ├── boxplot_metrics.png
160
+     ├── violin_metrics.png
161
+     ├── radar_split.png
162
+     ├── scatter_grid.png
163
+     └── embedding_pca.png
164
+ ```
165
+
166
+ ## Contributing
167
+
168
+ Contributions are welcome! Please feel free to submit a pull request or open an issue.
169
+
170
+ ## License
171
+
172
+ This project is licensed under the MIT License. See the LICENSE file for details.
@@ -0,0 +1,31 @@
1
+ geneval/__init__.py,sha256=6aSk50coEvm8rqwxGvuCyCaN_Dqj1VfuSqvOSTLEgqY,2872
2
+ geneval/cli.py,sha256=0ai0IGyn3SSmEnfLRJhcr0brvUxuNZHE4IXod7jvosU,9977
3
+ geneval/config.py,sha256=gkCjs_gzPWgUZNcmSR3Y70XQCAZ1m9AKLueaM-x8bvw,3729
4
+ geneval/core.py,sha256=No0DP8bNR6LedfCWEedY9C5r_c4M14rvSPaGZqbxc94,1155
5
+ geneval/data/__init__.py,sha256=nD3uWostZbYD3Yj_TOE44LvPDen-Vm3gN8ZH0QptPGw,450
6
+ geneval/data/gene_expression_datamodule.py,sha256=XiBIdf68JZ-3S-FaZsrQlBJA7qL9uUXo2C8y0r4an5M,8009
7
+ geneval/data/loader.py,sha256=zpRmwGZ4PJkB3rpXXRCMFtvMi4qvUrPkKmvIlGjfRpY,14555
8
+ geneval/evaluator.py,sha256=grPudMng-CcnWwkxQGWM6RZ198Q-1THkR4MCXtadCdU,11545
9
+ geneval/evaluators/__init__.py,sha256=i11sHvhsjEAeI3Aw9zFTPmCYuqkGxzTHggAKehe3HQ0,160
10
+ geneval/evaluators/base_evaluator.py,sha256=yJL568HdNofIcHgNOElSQMVlG9oRPTTDIZ7CmKccRqs,5967
11
+ geneval/evaluators/gene_expression_evaluator.py,sha256=v8QL6tzOQ3QVXdPMM8tFHTTviZC3WsPRX4G0ShgeDUw,8743
12
+ geneval/metrics/__init__.py,sha256=wk0CdFXvipfPqXWUMsRRz9CPiSVPG40Id4lyoSaLIkY,1417
13
+ geneval/metrics/base_metric.py,sha256=prbnB-Ap-P64m-2_TUrHxO3NFQaw-obVg1Tw4pjC5EY,6961
14
+ geneval/metrics/correlation.py,sha256=jpYmaihWK89J1E5yQinGUJeB6pTZ21xPNHJi3XYyXJE,6987
15
+ geneval/metrics/distances.py,sha256=9mWzbMbIBY1ckOd2a0l3by3aEFMQZL9bVMSeP44xzUg,16155
16
+ geneval/metrics/metrics.py,sha256=RPRUkgaDeL3cmJDEN7b3sUuPZdvrWXI3YRWwdsTTjL0,4171
17
+ geneval/models/__init__.py,sha256=vJHXIhwzykjoqZ-vHQJnPwwjSUu9nnMyo7jGnWlTd94,42
18
+ geneval/models/base_model.py,sha256=2QDtweYTgiovnksaRPBjNbIDu1l9l_WQMMFfeIX3GB8,1345
19
+ geneval/results.py,sha256=iXSB0o0f1jQrCKjc-lbRfwBFGhspTDDJpQ2K2tM-XR4,11362
20
+ geneval/testing.py,sha256=bD8c966LB6inNHabrFccoCRULPtPc_UYTty-uw7aSGU,11864
21
+ geneval/utils/__init__.py,sha256=wwzI0HWMz0FUp4V66XGRfzeaK3gaQUnIjDstG8ZUpFI,40
22
+ geneval/utils/io.py,sha256=LrRhIRlx_wlCs5Mayaq8hyVIp9uduHHohXuv8zQMwyI,888
23
+ geneval/utils/preprocessing.py,sha256=1Cij1O2dwDR6_zh5IEgLPq3jEmV8VfIRjfQrHiKe3Mw,2612
24
+ geneval/visualization/__init__.py,sha256=LN19jl5xV4WVJTePaOUHWvKZ_pgDFp1chhcklGkNtm8,792
25
+ geneval/visualization/plots.py,sha256=3K94r3x5NjIUZ-hYVQIivO63VkLOvDWl-BLB_qL2pSY,15008
26
+ geneval/visualization/visualizer.py,sha256=dwx1oc0C3dXlJguvIx1pLAOeHcGE5v85OctHOsRE2Yo,36526
27
+ gengeneeval-0.1.0.dist-info/METADATA,sha256=KaYsfE44TMNNBVrfCEFvk4sBczDR5IYy1fzfY8C1nEI,6041
28
+ gengeneeval-0.1.0.dist-info/WHEEL,sha256=3ny-bZhpXrU6vSQ1UPG34FoxZBp3lVcvK0LkgUz6VLk,88
29
+ gengeneeval-0.1.0.dist-info/entry_points.txt,sha256=xTkwnNa2fP0w1uGVsafzRTaCeuBSWLlNO-1CN8uBSK0,43
30
+ gengeneeval-0.1.0.dist-info/licenses/LICENSE,sha256=RDHgHDI4rSDq35R4CAC3npy86YUnmZ81ecO7aHfmmGA,1073
31
+ gengeneeval-0.1.0.dist-info/RECORD,,
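Each RECORD line above is a CSV row of `path,hash,size`, where the hash is written as `sha256=` followed by the URL-safe base64 encoding of the digest with the trailing `=` padding removed. The RECORD file's own entry leaves hash and size empty. The sketch below shows how such an entry could be parsed and checked against file bytes; the helper names are hypothetical, and the demo payload is made up rather than taken from the wheel.

```python
import base64
import csv
import hashlib


def record_digest(data: bytes) -> str:
    """Hash bytes the way wheel RECORD files write them."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=")
    return "sha256=" + encoded.decode("ascii")


def parse_record_line(line: str):
    """Split one RECORD row into (path, hash, size); size may be None."""
    path, digest, size = next(csv.reader([line]))
    return path, digest, (int(size) if size else None)


def verify(line: str, data: bytes) -> bool:
    """Check that bytes match the hash and size recorded for a file."""
    _, digest, size = parse_record_line(line)
    return digest == record_digest(data) and size == len(data)


# Parsing an entry copied from the listing above.
entry = "geneval/core.py,sha256=No0DP8bNR6LedfCWEedY9C5r_c4M14rvSPaGZqbxc94,1155"
print(parse_record_line(entry))

# Round trip on made-up content, since the wheel's file bytes are not shown here.
payload = b"print('hello')\n"
demo_line = f"example/hello.py,{record_digest(payload)},{len(payload)}"
print(verify(demo_line, payload))
```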
@@ -0,0 +1,4 @@
1
+ Wheel-Version: 1.0
2
+ Generator: poetry-core 2.3.0
3
+ Root-Is-Purelib: true
4
+ Tag: py3-none-any
@@ -0,0 +1,3 @@
1
+ [console_scripts]
2
+ geneval=geneval.cli:run
3
+
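The `console_scripts` entry above maps the command name `geneval` to the object `run` in the module `geneval.cli`. Installers generate the launcher from this mapping; the sketch below only illustrates how the `name=module:attr` form can be read and resolved, using the standard library. The helper names are hypothetical, and the resolution step is demonstrated on a stdlib module so that it runs without the package installed.

```python
import configparser
import importlib

ENTRY_POINTS_TXT = """\
[console_scripts]
geneval=geneval.cli:run
"""


def parse_console_scripts(text):
    """Return {script name: (module, attribute)} from entry_points.txt text."""
    # Restrict delimiters to "=" so the ":" in "module:attr" stays in the value.
    parser = configparser.ConfigParser(delimiters=("=",))
    parser.optionxform = str  # keep script names case-sensitive
    parser.read_string(text)
    scripts = {}
    for name, target in parser["console_scripts"].items():
        module, _, attr = target.partition(":")
        scripts[name] = (module.strip(), attr.strip())
    return scripts


def load_target(module, attr):
    """Import the module and walk the (possibly dotted) attribute path."""
    obj = importlib.import_module(module)
    for part in attr.split("."):
        obj = getattr(obj, part)
    return obj


scripts = parse_console_scripts(ENTRY_POINTS_TXT)
print(scripts)

# Resolution shown on a stdlib target; with the package installed,
# load_target(*scripts["geneval"]) would return geneval.cli.run.
dumps = load_target("json", "dumps")
print(dumps([1, 2]))
```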
@@ -0,0 +1,9 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2023 [Your Name]
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
6
+
7
+ 1. The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
8
+
9
+ 2. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.