gentroutils 1.5.0__tar.gz → 1.6.0.dev2__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- gentroutils-1.6.0.dev2/.RData +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/.gitignore +14 -0
- gentroutils-1.6.0.dev2/.vscode/extensions.json +7 -0
- gentroutils-1.6.0.dev2/.vscode/settings.json +19 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/CHANGELOG.md +16 -0
- gentroutils-1.6.0.dev2/Dockerfile +16 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/LICENSE +1 -1
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/Makefile +1 -1
- gentroutils-1.6.0.dev2/PKG-INFO +274 -0
- gentroutils-1.6.0.dev2/README.md +249 -0
- gentroutils-1.6.0.dev2/config.yaml +32 -0
- gentroutils-1.6.0.dev2/conftest.py +1 -0
- gentroutils-1.6.0.dev2/docs/00_prepare_tables_for_curation.R +126 -0
- gentroutils-1.6.0.dev2/docs/gwas_catalog_curation.md +45 -0
- gentroutils-1.6.0.dev2/pyproject.toml +333 -0
- gentroutils-1.6.0.dev2/src/gentroutils/__init__.py +11 -0
- gentroutils-1.6.0.dev2/src/gentroutils/errors.py +39 -0
- gentroutils-1.6.0.dev2/src/gentroutils/io/path/__init__.py +6 -0
- gentroutils-1.6.0.dev2/src/gentroutils/io/path/ftp.py +48 -0
- gentroutils-1.6.0.dev2/src/gentroutils/io/path/gcs.py +45 -0
- gentroutils-1.6.0.dev2/src/gentroutils/io/transfer/__init__.py +6 -0
- gentroutils-1.6.0.dev2/src/gentroutils/io/transfer/ftp_to_gcs.py +49 -0
- gentroutils-1.6.0.dev2/src/gentroutils/io/transfer/model.py +36 -0
- gentroutils-1.6.0.dev2/src/gentroutils/io/transfer/polars_to_gcs.py +20 -0
- gentroutils-1.6.0.dev2/src/gentroutils/parsers/__init__.py +1 -0
- gentroutils-1.6.0.dev2/src/gentroutils/parsers/curation.py +168 -0
- gentroutils-1.6.0.dev2/src/gentroutils/tasks/__init__.py +90 -0
- gentroutils-1.6.0.dev2/src/gentroutils/tasks/crawl.py +156 -0
- gentroutils-1.6.0.dev2/src/gentroutils/tasks/curation.py +110 -0
- gentroutils-1.6.0.dev2/src/gentroutils/tasks/fetch.py +141 -0
- gentroutils-1.6.0.dev2/src/gentroutils/transfer.py +81 -0
- gentroutils-1.6.0.dev2/tests/data/ftp/test/databases/gwas/summary_statistics/harmonised_list.txt +0 -0
- gentroutils-1.6.0.dev2/tests/data/gsutil_list.txt +95567 -0
- gentroutils-1.6.0.dev2/tests/data/test.h.tsv.gz +0 -0
- gentroutils-1.6.0.dev2/tests/io/conftest.py +0 -0
- gentroutils-1.6.0.dev2/tests/io/path/conftest.py +0 -0
- gentroutils-1.6.0.dev2/tests/io/path/test_ftp.py +36 -0
- gentroutils-1.6.0.dev2/tests/io/path/test_gcs.py +35 -0
- gentroutils-1.6.0.dev2/tests/io/transfer/conftest.py +0 -0
- gentroutils-1.6.0.dev2/tests/io/transfer/test_ftp_to_gcs.py +87 -0
- gentroutils-1.6.0.dev2/tests/io/transfer/test_model.py +23 -0
- gentroutils-1.6.0.dev2/tests/io/transfer/test_polars_to_gcs.py +45 -0
- gentroutils-1.6.0.dev2/tests/parsers/conftest.py +0 -0
- gentroutils-1.6.0.dev2/tests/parsers/test_curation.py +157 -0
- gentroutils-1.6.0.dev2/tests/tasks/conftest.py +84 -0
- gentroutils-1.6.0.dev2/tests/tasks/test_crawl_task.py +219 -0
- gentroutils-1.6.0.dev2/tests/tasks/test_curation_task.py +220 -0
- gentroutils-1.6.0.dev2/tests/tasks/test_fetch_task.py +198 -0
- gentroutils-1.6.0.dev2/tests/test_transfer.py +94 -0
- gentroutils-1.6.0.dev2/uv.lock +2063 -0
- gentroutils-1.5.0/PKG-INFO +0 -135
- gentroutils-1.5.0/README.md +0 -110
- gentroutils-1.5.0/pyproject.toml +0 -218
- gentroutils-1.5.0/src/gentroutils/__init__.py +0 -46
- gentroutils-1.5.0/src/gentroutils/commands/__init__.py +0 -11
- gentroutils-1.5.0/src/gentroutils/commands/update_gwas_curation_metadata.py +0 -287
- gentroutils-1.5.0/src/gentroutils/commands/utils.py +0 -152
- gentroutils-1.5.0/src/gentroutils/commands/validate_gwas_curation.py +0 -165
- gentroutils-1.5.0/tests/conftest.py +0 -132
- gentroutils-1.5.0/tests/test_cli.py +0 -23
- gentroutils-1.5.0/tests/test_update_gwas_curation_metadata.py +0 -205
- gentroutils-1.5.0/tests/test_validate_gwas_curation.py +0 -60
- gentroutils-1.5.0/uv.lock +0 -1796
- /gentroutils-1.5.0/src/gentroutils/py.typed → /gentroutils-1.6.0.dev2/.Rhistory +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/.github/workflows/labeler.yaml +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/.github/workflows/pr.yaml +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/.github/workflows/release.yaml +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/.github/workflows/release_pr.yaml +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/.github/workflows/tag.yaml +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/.pre-commit-config.yaml +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/commitlint.config.js +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/setup.sh +0 -0
- /gentroutils-1.5.0/tests/data/harmonised_list.txt → /gentroutils-1.6.0.dev2/src/gentroutils/py.typed +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/tests/data/manual_curation/correct_curation.tsv +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/tests/data/manual_curation/incorrect_analysisFlag_type.tsv +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/tests/data/manual_curation/incorrect_analysisFlag_value.tsv +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/tests/data/manual_curation/incorrect_columns_curation.tsv +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/tests/data/manual_curation/incorrect_publicationTitle_type.tsv +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/tests/data/manual_curation/incorrect_pubmedId_type.tsv +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/tests/data/manual_curation/incorrect_studyId_type.tsv +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/tests/data/manual_curation/incorrect_studyId_value.tsv +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/tests/data/manual_curation/incorrect_studyType_type.tsv +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/tests/data/manual_curation/incorrect_studyType_value.tsv +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/tests/data/manual_curation/incorrect_traitFromSource_type.tsv +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/tests/data/manual_curation/non_unique_studyId.tsv +0 -0
- {gentroutils-1.5.0 → gentroutils-1.6.0.dev2}/tests/data/manual_curation/null_value_in_studyId.tsv +0 -0
Binary file
@@ -0,0 +1,19 @@
+{
+  "python.analysis.typeCheckingMode": "standard",
+  "[python]": {
+    "editor.formatOnSave": true,
+    "editor.defaultFormatter": "charliermarsh.ruff",
+    "editor.codeActionsOnSave": {
+      "source.organizeImports": "explicit"
+    }
+  },
+  "python.testing.pytestArgs": ["tests", "src"],
+  "python.testing.unittestEnabled": false,
+  "python.testing.pytestEnabled": true,
+  "cSpell.words": [
+    "aioftp",
+    "gentroutils",
+    "harmonised",
+    "sumstat"
+  ]
+}
@@ -1,6 +1,22 @@
 # CHANGELOG
 
 
+## v1.6.0-dev.2 (2025-08-12)
+
+### Features
+
+- Update readme
+  ([`e966927`](https://github.com/opentargets/gentroutils/commit/e966927f8c4b3c670c694258c65eb6b3da6eeb49))
+
+
+## v1.6.0-dev.1 (2025-08-12)
+
+### Features
+
+- Version 2.0.0
+  ([`47c9690`](https://github.com/opentargets/gentroutils/commit/47c9690ffc23be713ef0246aae5271ebe2ab5e3a))
+
+
 ## v1.5.0 (2025-02-12)
 
 
@@ -0,0 +1,16 @@
+# Description: Dockerfile for the gentroutils package
+#
+# To run locally, you must have a credentials file for GCP. Assuming you do,
+# you can run the following command:
+#
+# docker run -v /path/to/credentials.json:/app/credentials.json -e GOOGLE_APPLICATION_CREDENTIALS=/app/credentials.json gentroutils -s gwas_catalog_release
+
+FROM python:3.13.1-alpine3.21
+COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv
+
+ADD . /app
+
+WORKDIR /app
+RUN uv sync --frozen
+
+ENTRYPOINT ["uv", "run", "gentroutils"]
@@ -186,7 +186,7 @@ APPENDIX: How to apply the Apache License to your work.
 same "printed page" as the copyright notice for easier
 identification within third-party archives.
 
-Copyright
+Copyright 2025 [name of copyright owner]
 
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
@@ -19,7 +19,7 @@ lint: ## run linting
 	@echo "Running linting tools..."
 	@uv run --frozen ruff check --fix --select I src/$(APP_NAME) tests
 	@uv run --frozen pydoclint --config=pyproject.toml src tests
-	@uv run --frozen interrogate -vv src/$(APP_NAME)
+	@uv run --frozen interrogate -vv src/$(APP_NAME)
 
 type-check: ## run mypy and check types
 	@echo "Running type checks..."
@@ -0,0 +1,274 @@
+Metadata-Version: 2.4
+Name: gentroutils
+Version: 1.6.0.dev2
+Summary: Open Targets python genetics utility CLI tools
+Author-email: Szymon Szyszkowski <ss60@sanger.ac.uk>
+License-Expression: Apache-2.0
+License-File: LICENSE
+Classifier: Development Status :: 3 - Alpha
+Classifier: Intended Audience :: Healthcare Industry
+Classifier: Intended Audience :: Science/Research
+Classifier: License :: OSI Approved :: Apache Software License
+Classifier: Operating System :: Unix
+Classifier: Programming Language :: Python :: 3.13
+Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
+Requires-Python: >=3.13
+Requires-Dist: aioftp>=0.25.1
+Requires-Dist: aiohttp>=3.11.18
+Requires-Dist: google-cloud-storage>=3.1.1
+Requires-Dist: loguru>=0.7.3
+Requires-Dist: opentargets-otter>=25.0.2
+Requires-Dist: polars>=1.31.0
+Requires-Dist: pydantic>=2.10.6
+Requires-Dist: tqdm>=4.67.1
+Description-Content-Type: text/markdown
+
+# gentroutils
+
+[](https://github.com/opentargets/gentroutils/actions/workflows/pr.yaml)
+
+[](https://github.com/opentargets/gentroutils/actions/workflows/release.yaml)
+
+Set of Command Line Interface tools to process Open Targets Genetics GWAS data.
+
+## Installation
+
+```
+pip install gentroutils
+```
+
+## Available commands
+
+To see all available commands after installation run
+
+```{bash}
+gentroutils --help
+```
+
+## Usage
+
+To run a single step run
+```{bash}
+uv run gentroutils -s gwas_catalog_release # After cloning the repository
+gentroutils -s gwas_catalog_release -c otter_config.yaml # When installed by pip
+```
+
+The `gentroutils` repository uses the [otter](https://github.com/opentargets/otter) framework to build the set of tasks to run. The current implementation of tasks can be found in the `config.yaml` file in the root of the repository. To run gentroutils installed via `pip` you need to define the otter config that looks like the `config.yaml` file.
+
+<details>
+<summary>Example config</summary>
+
+For the top level fields refer to the [otter documentation](https://opentargets.github.io/otter/otter.config.html)
+
+```yaml
+---
+work_path: ./work
+log_level: DEBUG
+scratchpad:
+steps:
+  gwas_catalog_release:
+    - name: crawl release metadata
+      stats_uri: "https://www.ebi.ac.uk/gwas/api/search/stats"
+      destination_template: "gs://gwas_catalog_inputs/gentroutils/{release_date}/stats.json"
+      promote: "true"
+    - name: fetch associations
+      stats_uri: "https://www.ebi.ac.uk/gwas/api/search/stats"
+      source_template: "ftp://ftp.ebi.ac.uk/pub/databases/gwas/releases/{release_date}/gwas-catalog-associations_ontology-annotated.tsv"
+      destination_template: "gs://gwas_catalog_inputs/gentroutils/{release_date}/gwas_catalog_associations_ontology_annotated.tsv"
+      promote: true
+    - name: fetch studies
+      stats_uri: "https://www.ebi.ac.uk/gwas/api/search/stats"
+      source_template: "ftp://ftp.ebi.ac.uk/pub/databases/gwas/releases/{release_date}/gwas-catalog-download-studies-v1.0.3.1.txt"
+      destination_template: "gs://gwas_catalog_inputs/gentroutils/{release_date}/gwas_catalog_download_studies.tsv"
+      promote: true
+    - name: fetch ancestries
+      stats_uri: "https://www.ebi.ac.uk/gwas/api/search/stats"
+      source_template: "ftp://ftp.ebi.ac.uk/pub/databases/gwas/releases/{release_date}/gwas-catalog-download-ancestries-v1.0.3.1.txt"
+      destination_template: "gs://gwas_catalog_inputs/gentroutils/{release_date}/gwas_catalog_download_ancestries.tsv"
+      promote: true
+    - name: curation study
+      requires:
+        - fetch studies
+      previous_curation: gs://gwas_catalog_inputs/curation/latest/curated/GWAS_Catalog_study_curation.tsv
+      studies: gs://gwas_catalog_inputs/gentroutils/latest/gwas_catalog_download_studies.tsv
+      destination_template: ./work/curation_{release_date}.tsv
+      promote: true
+```
+
+The config above defines the steps that are run in parallel by the `otter` framework.
+
+</details>
+
+### Available tasks
+
+The list of tasks (defined in the `config.yaml` file) that can be run are:
+
+#### Crawl release metadata
+
+```yaml
+- name: crawl release metadata
+  stats_uri: "https://www.ebi.ac.uk/gwas/api/search/stats"
+  destination_template: "gs://gwas_catalog_inputs/gentroutils/{release_date}/stats.json"
+  promote: "true"
+```
+
+This task fetches the latest GWAS Catalog release metadata from the `https://www.ebi.ac.uk/gwas/api/search/stats` endpoint and saves it to the specified destination.
+
+> [!NOTE]
+> **Task parameters**
+>
+> - The `stats_uri` is used to fetch the latest release date and other metadata.
+> - The `destination_template` is where the metadata will be saved, and it uses the `{release_date}` placeholder to specify the release date dynamically. By default it searches for the release directly in the stats_uri json output.
+> - The `promote` field is set to `true`, which means the output will be promoted to the latest release. Meaning that the file will be saved under `gs://gwas_catalog_inputs/gentroutils/latest/stats.json` after the task is completed. If the `promote` field is set to `false`, the file will not be promoted and will be saved under the specified path with the release date.
+
+---
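The `{release_date}` placeholder and the `promote` behaviour described in the note above can be sketched in a few lines of Python. This is only an illustration, not the gentroutils implementation: the function names `resolve_template` and `promoted_path` are hypothetical, the release date is passed in as a plain string instead of being fetched from `stats_uri`, and promotion is assumed to mean substituting `latest` for the release date, which is consistent with the example paths given in the README.

```python
def resolve_template(template: str, release_date: str) -> str:
    """Fill the {release_date} placeholder of a source or destination template."""
    return template.format(release_date=release_date)


def promoted_path(template: str) -> str:
    """Return the 'latest' location (assumption: promotion swaps the date for 'latest')."""
    return template.format(release_date="latest")


template = "gs://gwas_catalog_inputs/gentroutils/{release_date}/stats.json"
dated = resolve_template(template, "2025-08-12")  # hypothetical release date
latest = promoted_path(template)
print(dated)
print(latest)
```

With `promote` disabled, only the dated path would be written under this reading; with it enabled, the same content would also appear at the `latest` path.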
+
+### Fetch associations
+
+```yaml
+- name: fetch associations
+  stats_uri: "https://www.ebi.ac.uk/gwas/api/search/stats"
+  source_template: "ftp://ftp.ebi.ac.uk/pub/databases/gwas/releases/{release_date}/gwas-catalog-associations_ontology-annotated.tsv"
+  destination_template: "gs://gwas_catalog_inputs/gentroutils/{release_date}/gwas_catalog_associations_ontology_annotated.tsv"
+  promote: true
+```
+
+This task fetches the GWAS Catalog associations file from the specified FTP server and saves it to the specified destination.
+
+> [!NOTE]
+> **Task parameters**
+>
+> - The `stats_uri` is used to fetch the latest release date and other metadata.
+> - The `source_template` is the URL of the GWAS Catalog associations file, which uses the `{release_date}` placeholder to specify the release date dynamically. The release date is fetched from the `stats_uri` endpoint.
+> - The `destination_template` is where the associations file will be saved, and it also uses the `{release_date}` placeholder. The release date is fetched from the `stats_uri` endpoint.
+> - The `promote` field is set to `true`, which means the output will be promoted to the latest release. Meaning that the file will be saved under `gs://gwas_catalog_inputs/gentroutils/latest/gwas_catalog_associations_ontology_annotated.tsv` after the task is completed. If the `promote` field is set to `false`, the file will not be promoted and will be saved under the specified path with the release date.
+
+---
+
+### Fetch studies
+
+```yaml
+- name: fetch studies
+  stats_uri: "https://www.ebi.ac.uk/gwas/api/search/stats"
+  source_template: "ftp://ftp.ebi.ac.uk/pub/databases/gwas/releases/{release_date}/gwas-catalog-download-studies-v1.0.3.1.txt"
+  destination_template: "gs://gwas_catalog_inputs/gentroutils/{release_date}/gwas_catalog_download_studies.tsv"
+  promote: true
+```
+
+This task fetches the GWAS Catalog studies file from the specified FTP server and saves it to the specified destination.
+
+> [!NOTE]
+> **Task parameters**
+>
+> - The `stats_uri` is used to fetch the latest release date and other metadata.
+> - The `source_template` is the URL of the GWAS Catalog studies file, which uses the `{release_date}` placeholder to specify the release date dynamically. The release date is fetched from the `stats_uri` endpoint.
+> - The `destination_template` is where the studies file will be saved, and it also uses the `{release_date}` placeholder. The release date is fetched from the `stats_uri` endpoint.
+> - The `promote` field is set to `true`, which means the output will be promoted to the latest release. Meaning that the file will be saved under `gs://gwas_catalog_inputs/gentroutils/latest/gwas_catalog_download_studies.tsv` after the task is completed. If the `promote` field is set to `false`, the file will not be promoted and will be saved under the specified path with the release date.
+
+---
+
+### Fetch ancestries
+
+```yaml
+- name: fetch ancestries
+  stats_uri: "https://www.ebi.ac.uk/gwas/api/search/stats"
+  source_template: "ftp://ftp.ebi.ac.uk/pub/databases/gwas/releases/{release_date}/gwas-catalog-download-ancestries-v1.0.3.1.txt"
+  destination_template: "gs://gwas_catalog_inputs/gentroutils/{release_date}/gwas_catalog_download_ancestries.tsv"
+  promote: true
+```
+
+This task fetches the GWAS Catalog ancestries file from the specified FTP server and saves it to the specified destination.
+
+> [!NOTE]
+> **Task parameters**
+>
+> - The `stats_uri` is used to fetch the latest release date and other metadata.
+> - The `source_template` is the URL of the GWAS Catalog ancestries file, which uses the `{release_date}` placeholder to specify the release date dynamically. The release date is fetched from the `stats_uri` endpoint.
+> - The `destination_template` is where the ancestries file will be saved, and it also uses the `{release_date}` placeholder. The release date is fetched from the `stats_uri` endpoint.
+> - The `promote` field is set to `true`, which means the output will be promoted to the latest release. Meaning that the file will be saved under `gs://gwas_catalog_inputs/gentroutils/latest/gwas_catalog_download_ancestries.tsv` after the task is completed. If the `promote` field is set to `false`, the file will not be promoted and will be saved under the specified path with the release date.
+
+---
+
+### Curation
+
+```yaml
+- name: curation study
+  requires:
+    - fetch studies
+  previous_curation: gs://gwas_catalog_inputs/curation/latest/curated/GWAS_Catalog_study_curation.tsv
+  studies: gs://gwas_catalog_inputs/gentroutils/latest/gwas_catalog_download_studies.tsv
+  destination_template: gs://gwas_catalog_inputs/curation/{release_date}/raw/gwas_catalog_study_curation.tsv
+  promote: true
+```
+
+This task is used to build the GWAS Catalog curation file that is later used as a template for manual curation. It requires the `fetch studies` task to be completed before it can run. This is due to the fact that the curation file is built based on the list of studies fetched from the `download studies` file.
+
+> [!NOTE]
+> **Task parameters**
+>
+> - The `requires` field specifies that this task depends on the `fetch studies` task, meaning it will only run after the studies have been fetched.
+> - The `previous_curation` field is used to specify the path to the previous curation file. This is used to build the new curation file based on the previous one.
+> - The `studies` field is the path to the studies file that was fetched in the `fetch studies` task. This file is used to build the curation file.
+> - The `destination_template` is where the curation file will be saved, and it uses the `{release_date}` placeholder to specify the release date dynamically. The release date is fetched from the `stats_uri` endpoint.
+> - The `promote` field is set to `true`, which means the output will be promoted to the latest release. Meaning that the file will be saved under `gs://gwas_catalog_inputs/curation/latest/raw/gwas_catalog_study_curation.tsv` after the task is completed. If the `promote` field is set to `false`, the file will not be promoted and will be saved under the specified path with the release date.
+
+---
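The effect of the `requires` field on execution order can be illustrated with the standard library. The sketch below is not how otter schedules tasks; it only assumes that tasks without unmet requirements may run together and that `curation study` waits for `fetch studies`, as the README states. The `requires` dictionary is a hand-written stand-in for the parsed config.

```python
from graphlib import TopologicalSorter

# Task name -> names listed under its `requires` field (empty when absent).
requires = {
    "crawl release metadata": [],
    "fetch associations": [],
    "fetch studies": [],
    "fetch ancestries": [],
    "curation study": ["fetch studies"],
}

sorter = TopologicalSorter(requires)
sorter.prepare()

batches = []
while sorter.is_active():
    # Tasks whose requirements have all completed; these could run in parallel.
    ready = sorted(sorter.get_ready())
    batches.append(ready)
    sorter.done(*ready)

for number, batch in enumerate(batches, start=1):
    print(number, batch)
```

The first batch holds the four independent tasks and the second holds only `curation study`.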
+
+## Curation process
+
+The base of the curation process for GWAS Catalog data is defined in the [docs/gwas_catalog_curation.md](docs/gwas_catalog_curation.md). The original solution uses an R script to prepare the data for curation and then manually curates the data. The solution proposed in the `curation` task automates the preparation of the data for curation and provides a template for manual curation. The manual curation process is still required, but the data preparation is automated.
+
+The automated process includes:
+
+1. Reading the `download studies` file with the list of studies that are currently coming from the latest GWAS Catalog release.
+2. Reading the `previous curation` file that contains the list of the curated studies from the previous release.
+3. Comparing the two datasets with the following logic:
+   - In case the study is present in the `previous curation` and `download studies`, the study is marked as `curated`
+   - In case the study is present in the `download studies` but not in the `previous curation`, the study is marked as `new`
+   - In case the study is present in the `previous curation` but not in the `download studies`, the study is marked as `removed`
+4. The output of the curation process is a file that contains the list of studies with their status (curated, new, removed) and the fields that are required for manual curation. The output file is saved to the `destination_template` path specified in the task configuration. The file is saved under `gs://gwas_catalog_inputs/curation/{release_date}/raw/gwas_catalog_study_curation.tsv` path.
+5. The output file is then promoted to the latest release path `gs://gwas_catalog_inputs/curation/latest/raw/gwas_catalog_study_curation.tsv` so that it can be used for manual curation.
+6. The manual curation process is then performed on the `gs://gwas_catalog_inputs/curation/latest/raw/gwas_catalog_study_curation.tsv` file. The manual curation process is not automated and requires manual intervention. The output from the manual curation process should then be saved to both `gs://gwas_catalog_inputs/curation/latest/curated/GWAS_Catalog_study_curation.tsv` and `gs://gwas_catalog_inputs/curation/{release_date}/curated/GWAS_Catalog_study_curation.tsv`. This file is then used for the [Open Targets Staging Dags](https://github.com/opentargets/orchestration).
+
+---
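The comparison in step 3 of the list above amounts to set membership on study identifiers. The following is a minimal sketch under stated assumptions: `classify_studies` is a hypothetical helper, the identifiers are made up, and the inputs are plain lists instead of the TSV files the task reads. It is not the code in `src/gentroutils/parsers/curation.py`.

```python
def classify_studies(previous_ids, current_ids):
    """Label each study as curated, new, or removed based on where it appears."""
    previous, current = set(previous_ids), set(current_ids)
    status = {}
    for study_id in sorted(previous | current):
        if study_id in previous and study_id in current:
            status[study_id] = "curated"   # in both files
        elif study_id in current:
            status[study_id] = "new"       # only in the latest download
        else:
            status[study_id] = "removed"   # only in the previous curation
    return status


previous_curation = ["GCST000001", "GCST000002"]  # made-up identifiers
download_studies = ["GCST000002", "GCST000003"]
result = classify_studies(previous_curation, download_studies)
print(result)
```

A real implementation would also carry the curation columns along with each study; this sketch covers only the status label.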
+
+## Contribute
+
+To be able to contribute to the project you need to set it up. This project
+runs on:
+
+- [x] python 3.13
+- [x] uv (dependency manager)
+
+To set up the project run
+
+```{bash}
+make dev
+```
+
+The command will install the above dependencies (initial requirements are curl and bash) if not present and
+install all python dependencies listed in `pyproject.toml`. Finally the command will install `pre-commit` hooks
+required to be run before the commit is created.
+
+The project has additional `dev` dependencies that include the list of packages used for testing purposes.
+All of the `dev` dependencies are automatically installed by `uv`.
+
+To see all available dev commands
+
+Run the following command to see all available dev commands
+
+```{bash}
+make help
+```
+
+### Manual testing of CLI module
+
+To check CLI execution manually you need to run
+
+```{bash}
+uv run gentroutils
+```
+---
+
+This software was developed as part of the Open Targets project. For more
+information please see: http://www.opentargets.org
@@ -0,0 +1,249 @@
+# gentroutils
+
+[](https://github.com/opentargets/gentroutils/actions/workflows/pr.yaml)
+
+[](https://github.com/opentargets/gentroutils/actions/workflows/release.yaml)
+
+Set of Command Line Interface tools to process Open Targets Genetics GWAS data.
+
+## Installation
+
+```
+pip install gentroutils
+```
+
+## Available commands
+
+To see all available commands after installation run
+
+```{bash}
+gentroutils --help
+```
+
+## Usage
+
+To run a single step run
+```{bash}
+uv run gentroutils -s gwas_catalog_release # After cloning the repository
+gentroutils -s gwas_catalog_release -c otter_config.yaml # When installed by pip
+```
+
+The `gentroutils` repository uses the [otter](https://github.com/opentargets/otter) framework to build the set of tasks to run. The current implementation of tasks can be found in the `config.yaml` file in the root of the repository. To run gentroutils installed via `pip` you need to define the otter config that looks like the `config.yaml` file.
+
+<details>
+<summary>Example config</summary>
+
+For the top level fields refer to the [otter documentation](https://opentargets.github.io/otter/otter.config.html)
+
+```yaml
+---
+work_path: ./work
+log_level: DEBUG
+scratchpad:
+steps:
+  gwas_catalog_release:
+    - name: crawl release metadata
+      stats_uri: "https://www.ebi.ac.uk/gwas/api/search/stats"
+      destination_template: "gs://gwas_catalog_inputs/gentroutils/{release_date}/stats.json"
+      promote: "true"
+    - name: fetch associations
+      stats_uri: "https://www.ebi.ac.uk/gwas/api/search/stats"
+      source_template: "ftp://ftp.ebi.ac.uk/pub/databases/gwas/releases/{release_date}/gwas-catalog-associations_ontology-annotated.tsv"
+      destination_template: "gs://gwas_catalog_inputs/gentroutils/{release_date}/gwas_catalog_associations_ontology_annotated.tsv"
+      promote: true
+    - name: fetch studies
+      stats_uri: "https://www.ebi.ac.uk/gwas/api/search/stats"
+      source_template: "ftp://ftp.ebi.ac.uk/pub/databases/gwas/releases/{release_date}/gwas-catalog-download-studies-v1.0.3.1.txt"
+      destination_template: "gs://gwas_catalog_inputs/gentroutils/{release_date}/gwas_catalog_download_studies.tsv"
+      promote: true
+    - name: fetch ancestries
+      stats_uri: "https://www.ebi.ac.uk/gwas/api/search/stats"
+      source_template: "ftp://ftp.ebi.ac.uk/pub/databases/gwas/releases/{release_date}/gwas-catalog-download-ancestries-v1.0.3.1.txt"
+      destination_template: "gs://gwas_catalog_inputs/gentroutils/{release_date}/gwas_catalog_download_ancestries.tsv"
+      promote: true
+    - name: curation study
+      requires:
+        - fetch studies
+      previous_curation: gs://gwas_catalog_inputs/curation/latest/curated/GWAS_Catalog_study_curation.tsv
+      studies: gs://gwas_catalog_inputs/gentroutils/latest/gwas_catalog_download_studies.tsv
+      destination_template: ./work/curation_{release_date}.tsv
+      promote: true
+```
+
+The config above defines the steps that are run in parallel by the `otter` framework.
+
+</details>
+
|
77
|
+
### Available tasks
|
|
78
|
+
|
|
79
|
+
The list of tasks (defined in the `config.yaml` file) that can be run are:
|
|
80
|
+
|
|
81
|
+
#### Crawl release metadata
|
|
82
|
+
|
|
83
|
+
```yaml
|
|
84
|
+
- name: crawl release metadata
|
|
85
|
+
stats_uri: "https://www.ebi.ac.uk/gwas/api/search/stats"
|
|
86
|
+
destination_template: "gs://gwas_catalog_inputs/gentroutils/{release_date}/stats.json"
|
|
87
|
+
promote: "true"
|
|
88
|
+
```
|
|
89
|
+
|
|
90
|
+
This task fetches the latest GWAS Catalog release metadata from the `https://www.ebi.ac.uk/gwas/api/search/stats` endpoint and saves it to the specified destination.
|
|
91
|
+
|
|
92
|
+
> [!NOTE]
|
|
93
|
+
> **Task parameters**
|
|
94
|
+
>
|
|
95
|
+
> - The `stats_uri` is used to fetch the latest release date and other metadata.
|
|
96
|
+
> - The `destination_template` is where the metadata will be saved, and it uses the `{release_date}` placeholder to specify the release date dynamically. By default it searches for the release directly in the stats_uri json output.
|
|
97
|
+
> - The `promote` field is set to `true`, which means the output will be promoted to the latest release. Meaning that the file will be saved under `gs://gwas_catalog_inputs/gentroutils/latest/stats.json` after the task is completed. If the `promote` field is set to `false`, the file will not be promoted and will be saved under the specified path with the release date.
|
|
98
|
+
|
|
99
|
+
---
|
|
100
|
+
|
|
101
|
+
### Fetch associations
|
|
102
|
+
|
|
103
|
+
```yaml
|
|
104
|
+
- name: fetch associations
|
|
105
|
+
stats_uri: "https://www.ebi.ac.uk/gwas/api/search/stats"
|
|
106
|
+
source_template: "ftp://ftp.ebi.ac.uk/pub/databases/gwas/releases/{release_date}/gwas-catalog-associations_ontology-annotated.tsv"
|
|
107
|
+
destination_template: "gs://gwas_catalog_inputs/gentroutils/{release_date}/gwas_catalog_associations_ontology_annotated.tsv"
|
|
108
|
+
promote: true
|
|
109
|
+
```
|
|
110
|
+
|
|
111
|
+
This task fetches the GWAS Catalog associations file from the specified FTP server and saves it to the specified destination.
|
|
112
|
+
|
|
113
|
+
> [!NOTE]
|
|
114
|
+
> **Task parameters**
|
|
115
|
+
>
|
|
116
|
+
> - The `stats_uri` is used to fetch the latest release date and other metadata.
|
|
117
|
+
> - The `source_template` is the URL of the GWAS Catalog associations file, which uses the `{release_date}` placeholder to specify the release date dynamically. The release date is fetched from the `stats_uri` endpoint.
|
|
118
|
+
> - The `destination_template` is where the associations file will be saved, and it also uses the `{release_date}` placeholder. The release date is fetched from the `stats_uri` endpoint.
|
|
119
|
+
> - The `promote` field is set to `true`, which means the output will be promoted to the latest release: the file will be saved under `gs://gwas_catalog_inputs/gentroutils/latest/gwas_catalog_associations_ontology_annotated.tsv` after the task is completed. If `promote` is set to `false`, the file will not be promoted and will be saved under the specified path with the release date.
|
|
120
|
+
|
|
121
|
+
---
|
|
122
|
+
|
|
123
|
+
#### Fetch studies
|
|
124
|
+
|
|
125
|
+
```yaml
|
|
126
|
+
- name: fetch studies
|
|
127
|
+
stats_uri: "https://www.ebi.ac.uk/gwas/api/search/stats"
|
|
128
|
+
source_template: "ftp://ftp.ebi.ac.uk/pub/databases/gwas/releases/{release_date}/gwas-catalog-download-studies-v1.0.3.1.txt"
|
|
129
|
+
destination_template: "gs://gwas_catalog_inputs/gentroutils/{release_date}/gwas_catalog_download_studies.tsv"
|
|
130
|
+
promote: true
|
|
131
|
+
```
|
|
132
|
+
|
|
133
|
+
This task fetches the GWAS Catalog studies file from the specified FTP server and saves it to the specified destination.
|
|
134
|
+
|
|
135
|
+
> [!NOTE]
|
|
136
|
+
> **Task parameters**
|
|
137
|
+
>
|
|
138
|
+
> - The `stats_uri` is used to fetch the latest release date and other metadata.
|
|
139
|
+
> - The `source_template` is the URL of the GWAS Catalog studies file, which uses the `{release_date}` placeholder to specify the release date dynamically. The release date is fetched from the `stats_uri` endpoint.
|
|
140
|
+
> - The `destination_template` is where the studies file will be saved, and it also uses the `{release_date}` placeholder. The release date is fetched from the `stats_uri` endpoint.
|
|
141
|
+
> - The `promote` field is set to `true`, which means the output will be promoted to the latest release: the file will be saved under `gs://gwas_catalog_inputs/gentroutils/latest/gwas_catalog_download_studies.tsv` after the task is completed. If `promote` is set to `false`, the file will not be promoted and will be saved under the specified path with the release date.
|
|
142
|
+
|
|
143
|
+
---
|
|
144
|
+
|
|
145
|
+
#### Fetch ancestries
|
|
146
|
+
|
|
147
|
+
```yaml
|
|
148
|
+
- name: fetch ancestries
|
|
149
|
+
stats_uri: "https://www.ebi.ac.uk/gwas/api/search/stats"
|
|
150
|
+
source_template: "ftp://ftp.ebi.ac.uk/pub/databases/gwas/releases/{release_date}/gwas-catalog-download-ancestries-v1.0.3.1.txt"
|
|
151
|
+
destination_template: "gs://gwas_catalog_inputs/gentroutils/{release_date}/gwas_catalog_download_ancestries.tsv"
|
|
152
|
+
promote: true
|
|
153
|
+
```
|
|
154
|
+
|
|
155
|
+
This task fetches the GWAS Catalog ancestries file from the specified FTP server and saves it to the specified destination.
|
|
156
|
+
|
|
157
|
+
> [!NOTE]
|
|
158
|
+
> **Task parameters**
|
|
159
|
+
>
|
|
160
|
+
> - The `stats_uri` is used to fetch the latest release date and other metadata.
|
|
161
|
+
> - The `source_template` is the URL of the GWAS Catalog ancestries file, which uses the `{release_date}` placeholder to specify the release date dynamically. The release date is fetched from the `stats_uri` endpoint.
|
|
162
|
+
> - The `destination_template` is where the ancestries file will be saved, and it also uses the `{release_date}` placeholder. The release date is fetched from the `stats_uri` endpoint.
|
|
163
|
+
> - The `promote` field is set to `true`, which means the output will be promoted to the latest release: the file will be saved under `gs://gwas_catalog_inputs/gentroutils/latest/gwas_catalog_download_ancestries.tsv` after the task is completed. If `promote` is set to `false`, the file will not be promoted and will be saved under the specified path with the release date.
|
|
164
|
+
|
|
165
|
+
---
|
|
166
|
+
|
|
167
|
+
#### Curation
|
|
168
|
+
|
|
169
|
+
```yaml
|
|
170
|
+
- name: curation study
|
|
171
|
+
requires:
|
|
172
|
+
- fetch studies
|
|
173
|
+
previous_curation: gs://gwas_catalog_inputs/curation/latest/curated/GWAS_Catalog_study_curation.tsv
|
|
174
|
+
studies: gs://gwas_catalog_inputs/gentroutils/latest/gwas_catalog_download_studies.tsv
|
|
175
|
+
destination_template: gs://gwas_catalog_inputs/curation/{release_date}/raw/gwas_catalog_study_curation.tsv
|
|
176
|
+
promote: true
|
|
177
|
+
```
|
|
178
|
+
|
|
179
|
+
This task builds the GWAS Catalog curation file that is later used as a template for manual curation. It requires the `fetch studies` task to complete before it can run, because the curation file is built from the list of studies in the `download studies` file.
|
|
180
|
+
|
|
181
|
+
> [!NOTE]
|
|
182
|
+
> **Task parameters**
|
|
183
|
+
>
|
|
184
|
+
> - The `requires` field specifies that this task depends on the `fetch studies` task, meaning it will only run after the studies have been fetched.
|
|
185
|
+
> - The `previous_curation` field is used to specify the path to the previous curation file. This is used to build the new curation file based on the previous one.
|
|
186
|
+
> - The `studies` field is the path to the studies file that was fetched in the `fetch studies` task. This file is used to build the curation file.
|
|
187
|
+
> - The `destination_template` is where the curation file will be saved, and it uses the `{release_date}` placeholder to specify the release date dynamically. The release date is fetched from the `stats_uri` endpoint.
|
|
188
|
+
> - The `promote` field is set to `true`, which means the output will be promoted to the latest release: the file will be saved under `gs://gwas_catalog_inputs/curation/latest/raw/gwas_catalog_study_curation.tsv` after the task is completed. If `promote` is set to `false`, the file will not be promoted and will be saved under the specified path with the release date.
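
To make the effect of `requires` concrete, the sketch below groups the tasks from the examples above into batches that could run in parallel, using only the Python standard library. It is an illustration of dependency ordering in general and makes no claim about how the `otter` framework schedules steps internally; the `requires` mapping is transcribed by hand from the YAML snippets.

```python
from graphlib import TopologicalSorter

# Task name -> task names listed under its `requires` field.
requires = {
    "crawl release metadata": [],
    "fetch associations": [],
    "fetch studies": [],
    "fetch ancestries": [],
    "curation study": ["fetch studies"],
}

sorter = TopologicalSorter(requires)
sorter.prepare()

batches = []
while sorter.is_active():
    # Tasks whose requirements have all been marked as done.
    ready = sorted(sorter.get_ready())
    batches.append(ready)
    sorter.done(*ready)

for number, batch in enumerate(batches, start=1):
    print(number, batch)
```

In this sketch the four independent tasks form the first batch and `curation study` runs only in the second, after `fetch studies` has finished.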
|
|
189
|
+
|
|
190
|
+
---
|
|
191
|
+
|
|
192
|
+
## Curation process
|
|
193
|
+
|
|
194
|
+
The basis of the curation process for GWAS Catalog data is defined in [docs/gwas_catalog_curation.md](docs/gwas_catalog_curation.md). The original solution uses an R script to prepare the data for curation, after which the data is curated manually. The solution proposed in the `curation` task automates the preparation of the data and provides a template for manual curation. Manual curation is still required; only the data preparation is automated.
|
|
195
|
+
|
|
196
|
+
The automated process includes:
|
|
197
|
+
|
|
198
|
+
1. Reading the `download studies` file with the list of studies that come from the latest GWAS Catalog release.
|
|
199
|
+
2. Reading the `previous curation` file that contains the list of curated studies from the previous release.
|
|
200
|
+
3. Comparing the two datasets with the following logic:
|
|
201
|
+
* In case the study is present in the `previous curation` and `download studies`, the study is marked as `curated`
|
|
202
|
+
* In case the study is present in the `download studies` but not in the `previous curation`, the study is marked as `new`
|
|
203
|
+
* In case the study is present in the `previous curation` but not in the `download studies`, the study is marked as `removed`
|
|
204
|
+
4. The output of the curation process is a file that contains the list of studies with their status (curated, new, removed) and the fields that are required for manual curation. The output file is saved to the `destination_template` path specified in the task configuration, i.e. `gs://gwas_catalog_inputs/curation/{release_date}/raw/gwas_catalog_study_curation.tsv`.
|
|
205
|
+
5. The output file is then promoted to the latest release path `gs://gwas_catalog_inputs/curation/latest/raw/gwas_catalog_study_curation.tsv` so that it can be used for manual curation.
|
|
206
|
+
6. The manual curation process is then performed on the `gs://gwas_catalog_inputs/curation/latest/raw/gwas_catalog_study_curation.tsv` file. This step is not automated and requires manual intervention. The output of the manual curation should then be saved to both `gs://gwas_catalog_inputs/curation/latest/curated/GWAS_Catalog_study_curation.tsv` and `gs://gwas_catalog_inputs/curation/{release_date}/curated/GWAS_Catalog_study_curation.tsv`. This file is then used by the [Open Targets Staging Dags](https://github.com/opentargets/orchestration).
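
The comparison in step 3 can be sketched with plain Python sets. This is a minimal illustration of the status logic only: the function name `classify_studies` and the study identifiers are made up for the example, and the package itself may well implement the comparison as a table join rather than with sets.

```python
def classify_studies(previous: set[str], current: set[str]) -> dict[str, str]:
    """Assign a status to every study seen in either input (hypothetical helper)."""
    status = {}
    for study_id in previous | current:
        if study_id in previous and study_id in current:
            status[study_id] = "curated"  # present in both files
        elif study_id in current:
            status[study_id] = "new"  # only in the download studies file
        else:
            status[study_id] = "removed"  # only in the previous curation file
    return status


# Made-up identifiers standing in for the two input files.
previous_curation = {"GCST000001", "GCST000002"}
download_studies = {"GCST000002", "GCST000003"}

statuses = classify_studies(previous_curation, download_studies)
for study_id in sorted(statuses):
    print(study_id, statuses[study_id])
```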
|
|
207
|
+
|
|
208
|
+
---
|
|
209
|
+
|
|
210
|
+
## Contribute
|
|
211
|
+
|
|
212
|
+
To be able to contribute to the project you need to set it up. This project
|
|
213
|
+
runs on:
|
|
214
|
+
|
|
215
|
+
- [x] python 3.13
|
|
216
|
+
- [x] uv (dependency manager)
|
|
217
|
+
|
|
218
|
+
To set up the project, run:
|
|
219
|
+
|
|
220
|
+
```{bash}
|
|
221
|
+
make dev
|
|
222
|
+
```
|
|
223
|
+
|
|
224
|
+
The command will install the above dependencies (the only initial requirements are curl and bash) if they are not present and
|
|
225
|
+
install all Python dependencies listed in `pyproject.toml`. Finally, the command will install the `pre-commit` hooks
|
|
226
|
+
required to be run before the commit is created.
|
|
227
|
+
|
|
228
|
+
The project has additional `dev` dependencies that include the list of packages used for testing purposes.
|
|
229
|
+
All of the `dev` dependencies are automatically installed by `uv`.
|
|
230
|
+
|
|
231
|
+
|
|
232
|
+
|
|
233
|
+
Run the following command to see all available dev commands:
|
|
234
|
+
|
|
235
|
+
```{bash}
|
|
236
|
+
make help
|
|
237
|
+
```
|
|
238
|
+
|
|
239
|
+
### Manual testing of CLI module
|
|
240
|
+
|
|
241
|
+
To check CLI execution manually, run:
|
|
242
|
+
|
|
243
|
+
```{bash}
|
|
244
|
+
uv run gentroutils
|
|
245
|
+
```
|
|
246
|
+
---
|
|
247
|
+
|
|
248
|
+
This software was developed as part of the Open Targets project. For more
|
|
249
|
+
information please see: http://www.opentargets.org
|