omicsync-0.1.0-py3-none-any.whl

@@ -0,0 +1,188 @@
1
+ Metadata-Version: 2.4
2
+ Name: omicsync
3
+ Version: 0.1.0
4
+ Summary: Multi-omics data harmonisation for Python
5
+ Author-email: "Paterson V." <citrus.bird72@gmail.com>
6
+ License: MIT License
7
+
8
+ Copyright (c) 2026 Paterson V.
9
+
10
+ Permission is hereby granted, free of charge, to any person obtaining a copy
11
+ of this software and associated documentation files (the "Software"), to deal
12
+ in the Software without restriction, including without limitation the rights
13
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
14
+ copies of the Software, and to permit persons to whom the Software is
15
+ furnished to do so, subject to the following conditions:
16
+
17
+ The above copyright notice and this permission notice shall be included in all
18
+ copies or substantial portions of the Software.
19
+
20
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
21
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
22
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
23
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
24
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
25
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
26
+ SOFTWARE.
27
+
28
+ Project-URL: Homepage, https://github.com/vi-c-ky/omicsync
29
+ Project-URL: Documentation, https://github.com/vi-c-ky/omicsync/blob/main/docs/index.md
30
+ Project-URL: Repository, https://github.com/vi-c-ky/omicsync
31
+ Project-URL: Bug Tracker, https://github.com/vi-c-ky/omicsync/issues
32
+ Keywords: bioinformatics,multi-omics,TCGA,genomics,data harmonisation
33
+ Classifier: Development Status :: 3 - Alpha
34
+ Classifier: Intended Audience :: Science/Research
35
+ Classifier: License :: OSI Approved :: MIT License
36
+ Classifier: Programming Language :: Python :: 3
37
+ Classifier: Programming Language :: Python :: 3.9
38
+ Classifier: Programming Language :: Python :: 3.10
39
+ Classifier: Programming Language :: Python :: 3.11
40
+ Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
41
+ Requires-Python: >=3.9
42
+ Description-Content-Type: text/markdown
43
+ License-File: LICENSE
44
+ Requires-Dist: pandas>=1.5.0
45
+ Requires-Dist: numpy>=1.23.0
46
+ Requires-Dist: scipy>=1.9.0
47
+ Requires-Dist: scikit-learn>=1.1.0
48
+ Requires-Dist: requests>=2.28.0
49
+ Provides-Extra: mofa
50
+ Requires-Dist: mofapy2>=0.7.0; extra == "mofa"
51
+ Provides-Extra: geo
52
+ Requires-Dist: GEOparse>=2.0.0; extra == "geo"
53
+ Provides-Extra: torch
54
+ Requires-Dist: torch>=1.12.0; extra == "torch"
55
+ Provides-Extra: anndata
56
+ Requires-Dist: anndata>=0.8.0; extra == "anndata"
57
+ Provides-Extra: dev
58
+ Requires-Dist: pytest>=7.0.0; extra == "dev"
59
+ Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
60
+ Requires-Dist: black>=23.0.0; extra == "dev"
61
+ Requires-Dist: ruff>=0.1.0; extra == "dev"
62
+ Requires-Dist: mypy>=1.0.0; extra == "dev"
63
+ Requires-Dist: build>=1.0.0; extra == "dev"
64
+ Requires-Dist: twine>=4.0.0; extra == "dev"
65
+ Provides-Extra: all
66
+ Requires-Dist: omicsync[anndata,geo,mofa,torch]; extra == "all"
67
+ Dynamic: license-file
68
+
69
+ # omicsync
70
+
71
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
72
+ [![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
73
+ [![PyPI version](https://img.shields.io/pypi/v/omicsync.svg)](https://pypi.org/project/omicsync/)
74
+
75
+ **A Python library for multi-omics data harmonisation.**
76
+
77
+ omicsync handles the tedious work of aligning sample IDs, normalising each modality consistently, and exporting to downstream tools so you can focus on biology, not data wrangling.
78
+
79
+ ---
80
+
81
+ ## Installation
82
+
83
+ ```bash
84
+ pip install omicsync
85
+ ```
86
+
87
+ With optional extras:
88
+
89
+ ```bash
90
+ pip install "omicsync[mofa]" # MOFA2 factor analysis
91
+ pip install "omicsync[geo]" # GEO data loading
92
+ pip install "omicsync[anndata]" # AnnData export
93
+ pip install "omicsync[torch]" # PyTorch tensor export
94
+ pip install "omicsync[all]" # Everything
95
+ ```
96
+
97
+ ---
98
+
99
+ ## Quick Start
100
+
101
+ ```python
102
+ import omicsync as oms
103
+ from omicsync.loaders.csv import load_multimodal_csv
104
+
105
+ # Load multiple modalities from CSV/TSV files
106
+ dataset = load_multimodal_csv({
107
+     "rna": "brca_rna.tsv",
108
+     "protein": "brca_rppa.tsv",
109
+     "cnv": "brca_cnv.tsv",
110
+ }, study_id="TCGA-BRCA")
111
+
112
+ # Align, normalise, filter — all chainable
113
+ dataset.align_samples().normalize().filter_features(min_variance=0.01)
114
+
115
+ # Export to DataFrame or MOFA2
116
+ df = dataset.to_dataframe() # samples × features, prefixed columns
117
+ mofa_input = dataset.to_mofa2() # dict ready for mofapy2 entry_point
118
+ ```
119
+
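The chained call in the Quick Start only works if each method returns the dataset itself. The following is a minimal sketch of that pattern, not omicsync's implementation: `ToyDataset`, its nested-dict storage, and the use of population variance in the filter are all assumptions made for illustration.

```python
from statistics import pvariance


class ToyDataset:
    """Hypothetical stand-in for a multi-modal dataset (not omicsync's class)."""

    def __init__(self, modalities):
        # modalities: {modality name: {feature name: {sample ID: value}}}
        self.modalities = modalities

    def align_samples(self):
        """Keep only samples that have a value for every feature in every modality."""
        sample_sets = [
            set(values)
            for features in self.modalities.values()
            for values in features.values()
        ]
        shared = set.intersection(*sample_sets) if sample_sets else set()
        for features in self.modalities.values():
            for name, values in features.items():
                features[name] = {s: v for s, v in values.items() if s in shared}
        return self  # returning self is what makes the calls chainable

    def filter_features(self, min_variance=0.0):
        """Drop features whose population variance is below min_variance."""
        for mod, features in self.modalities.items():
            self.modalities[mod] = {
                name: values
                for name, values in features.items()
                if values and pvariance(list(values.values())) >= min_variance
            }
        return self


toy = ToyDataset({
    "rna": {
        "TP53": {"s1": 1.0, "s2": 3.0, "s3": 5.0},
        "ACTB": {"s1": 2.0, "s2": 2.0},
    },
    "cnv": {"MYC": {"s1": 0.5, "s2": -0.5}},
})
toy.align_samples().filter_features(min_variance=0.01)
print(sorted(toy.modalities["rna"]))  # ['TP53'] (ACTB is constant after alignment)
```

Mutating in place and returning `self` keeps the call chain short, at the cost of modifying the original object; a copy-on-write design would return a new dataset from each step instead.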
120
+ ---
121
+
122
+ ## Features
123
+
124
+ - **Sample harmonisation** — TCGA barcode parsing, fuzzy ID matching, coverage reporting
125
+ - **Per-modality normalisation** — auto-detection of count/TPM/M-value formats
126
+ - **Chainable API** — `dataset.align_samples().normalize().filter_features()`
127
+ - **sklearn compatibility** — use `OmicsSyncTransformer` in a `Pipeline`
128
+ - **Multiple export formats** — DataFrame, dict, MOFA2, PyTorch tensor, AnnData
129
+ - **Open Targets integration** — query target-disease associations via GraphQL
130
+ - **Type hints throughout** — fully typed public API
131
+
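To illustrate the sample harmonisation feature listed above, here is a rough sketch of barcode-based alignment with a coverage report. It relies on the standard TCGA barcode layout (project, tissue source site, participant, sample type and vial, and so on); the helper names `sample_key` and `coverage` are hypothetical, and the sketch uses exact matching on a truncated barcode rather than whatever fuzzy matching omicsync performs.

```python
import re

# Standard TCGA aliquot barcode layout:
# TCGA-<TSS>-<participant>-<sample type + vial>-<portion + analyte>-<plate>-<center>
BARCODE_RE = re.compile(
    r"^TCGA-(?P<tss>[A-Z0-9]{2})-(?P<participant>[A-Z0-9]{4})"
    r"(?:-(?P<sample_type>\d{2})(?P<vial>[A-Z])?)?"
)


def sample_key(barcode):
    """Reduce a barcode to TCGA-<TSS>-<participant>[-<sample type>]."""
    # Dots often replace hyphens when barcodes are used as column names.
    cleaned = barcode.strip().upper().replace(".", "-")
    m = BARCODE_RE.match(cleaned)
    if m is None:
        raise ValueError(f"not a TCGA barcode: {barcode!r}")
    key = f"TCGA-{m['tss']}-{m['participant']}"
    if m["sample_type"]:
        key += f"-{m['sample_type']}"
    return key


def coverage(modalities):
    """Return (shared keys, fraction of each modality's samples that are shared)."""
    keyed = {name: {sample_key(b) for b in ids} for name, ids in modalities.items()}
    shared = set.intersection(*keyed.values())
    report = {
        name: (len(shared) / len(keys) if keys else 0.0)
        for name, keys in keyed.items()
    }
    return shared, report


shared, report = coverage({
    "rna": ["TCGA-A1-A0SB-01A-11R-A144-07", "TCGA-A2-A04P-01A-31R-A034-07"],
    "protein": ["TCGA.A1.A0SB.01A", "tcga-bh-a0b3-01a"],
})
print(shared)  # {'TCGA-A1-A0SB-01'}
print(report)  # {'rna': 0.5, 'protein': 0.5}
```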
132
+ ---
133
+
134
+ ## Supported Data Sources
135
+
136
+ | Source | Loader | Notes |
137
+ |--------|--------|-------|
138
+ | TCGA | `load_tcga_files()` | Local files; barcode auto-harmonisation |
139
+ | GEO | `load_geo()` | Via GEOparse; requires `omicsync[geo]` |
140
+ | CSV/TSV | `load_csv()` | Any tabular file |
141
+ | Open Targets | `load_open_targets_targets()` | GraphQL API v4 |
142
+
143
+ ---
144
+
145
+ ## Supported Modalities
146
+
147
+ | Modality | Class | Default Normalisation |
148
+ |----------|-------|-----------------------|
149
+ | RNA expression | `RNAModality` | `detect_and_normalise()` (log1p) |
150
+ | DNA methylation | `MethylationModality` | M→beta conversion + clip |
151
+ | Copy number | `CNVModality` | log2 ratio, clipped [-2, 2] |
152
+ | Somatic mutations | `MutationModality` | Binarise at threshold |
153
+ | Protein abundance | `ProteinModality` | Z-score per protein |
154
+
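The default normalisations in the table can be sketched in a few lines of plain Python. These are textbook versions under stated assumptions, not the code shipped in `omicsync/normalisation/`: the function names, the `eps` clipping bound, the use of population standard deviation, and the assumption that CNV input is already a log2 ratio are all illustrative choices.

```python
from statistics import mean, pstdev


def m_to_beta(m_value, eps=1e-6):
    """Convert a methylation M-value to a beta value, clipped to [eps, 1 - eps].

    Uses beta = 2^M / (2^M + 1), written as 1 / (1 + 2^-M).
    """
    beta = 1.0 / (1.0 + 2.0 ** -m_value)
    return min(max(beta, eps), 1.0 - eps)


def clip_log2_ratio(value, lo=-2.0, hi=2.0):
    """Clip a copy-number log2 ratio to [lo, hi] (input assumed already log2)."""
    return min(max(value, lo), hi)


def binarise(values, threshold):
    """Map values to 1 where they exceed the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]


def zscore(values):
    """Standardise one protein's values to zero mean and unit variance."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - mu) / sd for v in values]


print(m_to_beta(0.0))                 # 0.5
print(clip_log2_ratio(3.7))           # 2.0
print(binarise([0, 1, 3], threshold=0))  # [0, 1, 1]
```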
155
+ ---
156
+
157
+ ## Documentation
158
+
159
+ - [Quickstart guide](docs/quickstart.md)
160
+ - [API reference](docs/api_reference.md)
161
+ - [Tutorial: TCGA BRCA](docs/tutorials/tcga_brca.md)
162
+ - [Tutorial: Custom CSV data](docs/tutorials/custom_data.md)
163
+
164
+ ---
165
+
166
+ ## Citation
167
+
168
+ If you use omicsync in your research, please cite:
169
+
170
+ > Paterson V. (2026). *omicsync: A Python library for multi-omics data harmonisation*. GitHub: github.com/vi-c-ky/omicsync
171
+
172
+ ---
173
+
174
+ ## Contributing
175
+
176
+ Contributions are welcome. Please open an issue or pull request on GitHub.
177
+
178
+ 1. Fork the repository
179
+ 2. Create a feature branch (`git checkout -b feature/my-feature`)
180
+ 3. Write tests for new functionality
181
+ 4. Run the test suite (`pytest tests/`)
182
+ 5. Open a pull request
183
+
184
+ ---
185
+
186
+ ## License
187
+
188
+ MIT — see [LICENSE](LICENSE) for details.
@@ -0,0 +1,29 @@
1
+ omicsync/__init__.py,sha256=lJ8YoMOblSsKmM2YW-vwLvio1HbNKGcDVgjGH-VxApE,733
2
+ omicsync/core/__init__.py,sha256=0FkHOPs0-6u27WkKh6WZG9NJL1xVeGEHhHApYJFKeX0,527
3
+ omicsync/core/dataset.py,sha256=YkYrxgyeiWOJQgddCtlrk8bhUtF0XvzUsUOmUwFv4cY,15845
4
+ omicsync/core/modality.py,sha256=4v2wVP9TjgD1h980GUKRSNaViNMWAboXnWasstf46Zg,11602
5
+ omicsync/core/sample_index.py,sha256=Q4Vgpvn1Bgm3snMkMRpbm5btC8i5fYg1NQVw7BcOI1k,5908
6
+ omicsync/integration/__init__.py,sha256=iPV828TXBQDT_P0PiqkfqI0BfnV2D0Vz79VQHyCL20Q,313
7
+ omicsync/integration/concat.py,sha256=q5wzm-JM0z9Sx2iDZLwNmxvhbXr9_NRK0bc8ABhwiXs,4100
8
+ omicsync/integration/mofa.py,sha256=MAJJ4w2GBSpfPLPrHxfOgF0ST6AAAmyqTV7UcU6a538,8632
9
+ omicsync/integration/sklearn_compat.py,sha256=p0ISwsQ-F6gy1Retd38eJK8j-i19Pr17-XuPwGyDZQE,5596
10
+ omicsync/loaders/__init__.py,sha256=IUkIVfH6Ef_TjFTiwwgUTW9bB-5CeyKHmAE_jeMSwBc,520
11
+ omicsync/loaders/csv.py,sha256=ue94Ewphf-NBJWHwPbTztUNJHDD16d6KFYK0ndHyBiw,4562
12
+ omicsync/loaders/geo.py,sha256=7KA7sLpWynsQpCu29UV6diwfLDamxImdwO4jksgB9kI,3362
13
+ omicsync/loaders/open_targets.py,sha256=ePZlF1vkPoZawDEkY1DoTDc_LsoNRAMsEZVmm4n9DTQ,7431
14
+ omicsync/loaders/tcga.py,sha256=JMKFCLPxAGmWac9gYCsUay85n2loX-1OR0kw5kDIsgI,8653
15
+ omicsync/normalisation/__init__.py,sha256=XT9r8DGxAQUPy49CrdPrskQx2cIFsHQ91BHcEkQPjS4,196
16
+ omicsync/normalisation/cnv.py,sha256=F7FbobuqdFT2v-ivBXMH0jykivujbwtHCMJkF-UH7i4,2971
17
+ omicsync/normalisation/methylation.py,sha256=Gb6dsep_u8Ak4xUnV82bpdgl-27ohWArM_zi0FzA44w,3533
18
+ omicsync/normalisation/mutations.py,sha256=MRVms0xnJNqJDzyUP0qaUsgy4PnJTthLafSWRmKeRdc,3380
19
+ omicsync/normalisation/protein.py,sha256=-bYn4i3yIl5gtlPkF7Uc_oLmwhY2Nkc4GGJqCxVO0pU,1439
20
+ omicsync/normalisation/rna.py,sha256=ZuzkuAAM5T2mMQ0bwwBD8a_l7yBzGGczhQcds4z09Ss,5160
21
+ omicsync/utils/__init__.py,sha256=4MOcXz0iYhroNZ5ZPLYFxAejdUPToDtnkq13cChiM6k,687
22
+ omicsync/utils/barcode.py,sha256=FyxSVVEef0JVY4ZwShwmqQ_uMoD2mKBGteROJFSjC4k,4654
23
+ omicsync/utils/logging.py,sha256=OXkh81cGbgQ7PXzWNpruGrbVNGLAsffwIIAekFc5Kns,1083
24
+ omicsync/utils/validation.py,sha256=PJ02oISpius-2kxT0guDh3VuoNro82yIvq2upD4VaiY,4504
25
+ omicsync-0.1.0.dist-info/licenses/LICENSE,sha256=ECjspIW15nN3KiM9i0JV1_LYdeH3Ee0tdZVbrWPb6L8,1068
26
+ omicsync-0.1.0.dist-info/METADATA,sha256=6uPKlN-3_tztciL-ZgzcANY99qibpvI7Ub4CNJtckQA,6948
27
+ omicsync-0.1.0.dist-info/WHEEL,sha256=aeYiig01lYGDzBgS8HxWXOg3uV61G9ijOsup-k9o1sk,91
28
+ omicsync-0.1.0.dist-info/top_level.txt,sha256=t9P2Cb5neiuB67sDktLTfjxE003jZgm41i7qGVfNsBI,9
29
+ omicsync-0.1.0.dist-info/RECORD,,
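Each RECORD row above is `path,sha256=<digest>,size`, where the digest is the URL-safe base64 encoding of the SHA-256 hash with trailing `=` padding removed, and RECORD lists itself with empty hash and size fields. A small sketch of checking an unpacked file against its row follows; the helper names are hypothetical, and the rows are assumed to have had the diff's leading `+ ` stripped.

```python
import base64
import csv
import hashlib
import io


def record_hash(data):
    """Return the 'sha256=<digest>' field as written in a wheel's RECORD."""
    digest = hashlib.sha256(data).digest()
    # URL-safe base64 with the trailing '=' padding removed.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"sha256={encoded}"


def parse_record(text):
    """Parse RECORD text into {path: (hash field, size or None)}."""
    entries = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        path, hash_field, size = row
        entries[path] = (hash_field, int(size) if size else None)
    return entries


def verify(path, data, entries):
    """Check one file's bytes against its RECORD entry."""
    hash_field, size = entries[path]
    if not hash_field:  # RECORD itself carries no hash or size
        return True
    return record_hash(data) == hash_field and len(data) == size


record_text = (
    "omicsync-0.1.0.dist-info/top_level.txt,"
    "sha256=t9P2Cb5neiuB67sDktLTfjxE003jZgm41i7qGVfNsBI,9\n"
    "omicsync-0.1.0.dist-info/RECORD,,\n"
)
entries = parse_record(record_text)
print(entries["omicsync-0.1.0.dist-info/top_level.txt"][1])  # 9
```

The recorded size of 9 bytes for `top_level.txt` is consistent with the eight characters of `omicsync` plus a trailing newline.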
@@ -0,0 +1,5 @@
1
+ Wheel-Version: 1.0
2
+ Generator: setuptools (82.0.1)
3
+ Root-Is-Purelib: true
4
+ Tag: py3-none-any
5
+
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Paterson V.
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1 @@
1
+ omicsync