genal-python 0.7__tar.gz → 0.8__tar.gz

Files changed (171)
  1. genal_python-0.8/.DS_Store +0 -0
  2. {genal_python-0.7 → genal_python-0.8}/.gitignore +1 -2
  3. genal_python-0.8/.readthedocs.yaml +22 -0
  4. {genal_python-0.7 → genal_python-0.8}/PKG-INFO +73 -10
  5. {genal_python-0.7 → genal_python-0.8}/README.md +73 -8
  6. genal_python-0.8/docs/.DS_Store +0 -0
  7. genal_python-0.8/docs/build/.buildinfo +4 -0
  8. genal_python-0.8/docs/build/.doctrees/api.doctree +0 -0
  9. genal_python-0.8/docs/build/.doctrees/environment.pickle +0 -0
  10. genal_python-0.8/docs/build/.doctrees/genal.doctree +0 -0
  11. genal_python-0.8/docs/build/.doctrees/index.doctree +0 -0
  12. genal_python-0.8/docs/build/.doctrees/introduction.doctree +0 -0
  13. genal_python-0.8/docs/build/.doctrees/modules.doctree +0 -0
  14. genal_python-0.8/docs/build/_modules/genal/Geno.html +1480 -0
  15. genal_python-0.8/docs/build/_modules/genal/MR.html +1065 -0
  16. genal_python-0.8/docs/build/_modules/genal/MR_tools.html +671 -0
  17. genal_python-0.8/docs/build/_modules/genal/MRpresso.html +409 -0
  18. genal_python-0.8/docs/build/_modules/genal/association.html +445 -0
  19. genal_python-0.8/docs/build/_modules/genal/clump.html +183 -0
  20. genal_python-0.8/docs/build/_modules/genal/extract_prs.html +426 -0
  21. genal_python-0.8/docs/build/_modules/genal/geno_tools.html +567 -0
  22. genal_python-0.8/docs/build/_modules/genal/lift.html +371 -0
  23. genal_python-0.8/docs/build/_modules/genal/proxy.html +359 -0
  24. genal_python-0.8/docs/build/_modules/genal/snp_query.html +231 -0
  25. genal_python-0.8/docs/build/_modules/genal/tools.html +440 -0
  26. genal_python-0.8/docs/build/_modules/index.html +114 -0
  27. genal_python-0.8/docs/build/_sources/api.rst.txt +100 -0
  28. genal_python-0.8/docs/build/_sources/index.rst.txt +69 -0
  29. genal_python-0.8/docs/build/_sources/introduction.rst.txt +666 -0
  30. genal_python-0.8/docs/build/_sources/modules.rst.txt +82 -0
  31. genal_python-0.8/docs/build/_static/basic.css +925 -0
  32. genal_python-0.8/docs/build/_static/css/badge_only.css +1 -0
  33. genal_python-0.8/docs/build/_static/css/fonts/Roboto-Slab-Bold.woff +0 -0
  34. genal_python-0.8/docs/build/_static/css/fonts/Roboto-Slab-Bold.woff2 +0 -0
  35. genal_python-0.8/docs/build/_static/css/fonts/Roboto-Slab-Regular.woff +0 -0
  36. genal_python-0.8/docs/build/_static/css/fonts/Roboto-Slab-Regular.woff2 +0 -0
  37. genal_python-0.8/docs/build/_static/css/fonts/fontawesome-webfont.eot +0 -0
  38. genal_python-0.8/docs/build/_static/css/fonts/fontawesome-webfont.svg +2671 -0
  39. genal_python-0.8/docs/build/_static/css/fonts/fontawesome-webfont.ttf +0 -0
  40. genal_python-0.8/docs/build/_static/css/fonts/fontawesome-webfont.woff +0 -0
  41. genal_python-0.8/docs/build/_static/css/fonts/fontawesome-webfont.woff2 +0 -0
  42. genal_python-0.8/docs/build/_static/css/fonts/lato-bold-italic.woff +0 -0
  43. genal_python-0.8/docs/build/_static/css/fonts/lato-bold-italic.woff2 +0 -0
  44. genal_python-0.8/docs/build/_static/css/fonts/lato-bold.woff +0 -0
  45. genal_python-0.8/docs/build/_static/css/fonts/lato-bold.woff2 +0 -0
  46. genal_python-0.8/docs/build/_static/css/fonts/lato-normal-italic.woff +0 -0
  47. genal_python-0.8/docs/build/_static/css/fonts/lato-normal-italic.woff2 +0 -0
  48. genal_python-0.8/docs/build/_static/css/fonts/lato-normal.woff +0 -0
  49. genal_python-0.8/docs/build/_static/css/fonts/lato-normal.woff2 +0 -0
  50. genal_python-0.8/docs/build/_static/css/theme.css +4 -0
  51. genal_python-0.8/docs/build/_static/doctools.js +156 -0
  52. genal_python-0.8/docs/build/_static/documentation_options.js +13 -0
  53. genal_python-0.8/docs/build/_static/file.png +0 -0
  54. genal_python-0.8/docs/build/_static/js/badge_only.js +1 -0
  55. genal_python-0.8/docs/build/_static/js/html5shiv-printshiv.min.js +4 -0
  56. genal_python-0.8/docs/build/_static/js/html5shiv.min.js +4 -0
  57. genal_python-0.8/docs/build/_static/js/theme.js +1 -0
  58. genal_python-0.8/docs/build/_static/language_data.js +199 -0
  59. genal_python-0.8/docs/build/_static/minus.png +0 -0
  60. genal_python-0.8/docs/build/_static/plus.png +0 -0
  61. genal_python-0.8/docs/build/_static/pygments.css +75 -0
  62. genal_python-0.8/docs/build/_static/searchtools.js +619 -0
  63. genal_python-0.8/docs/build/_static/sphinx_highlight.js +154 -0
  64. genal_python-0.8/docs/build/api.html +2251 -0
  65. genal_python-0.8/docs/build/genal.html +2060 -0
  66. genal_python-0.8/docs/build/genindex.html +584 -0
  67. genal_python-0.8/docs/build/index.html +186 -0
  68. genal_python-0.8/docs/build/introduction.html +706 -0
  69. genal_python-0.8/docs/build/modules.html +754 -0
  70. genal_python-0.8/docs/build/objects.inv +0 -0
  71. genal_python-0.8/docs/build/py-modindex.html +177 -0
  72. genal_python-0.8/docs/build/search.html +122 -0
  73. genal_python-0.8/docs/build/searchindex.js +1 -0
  74. genal_python-0.8/docs/requirements.txt +14 -0
  75. genal_python-0.8/docs/source/.DS_Store +0 -0
  76. genal_python-0.8/docs/source/Images/MR_plot_SBP_AS.png +0 -0
  77. genal_python-0.8/docs/source/api.rst +100 -0
  78. {genal_python-0.7 → genal_python-0.8}/docs/source/conf.py +3 -2
  79. {genal_python-0.7 → genal_python-0.8}/docs/source/index.rst +14 -4
  80. genal_python-0.8/docs/source/introduction.rst +666 -0
  81. genal_python-0.8/docs/source/modules.rst +82 -0
  82. {genal_python-0.7 → genal_python-0.8}/genal/Geno.py +72 -49
  83. {genal_python-0.7 → genal_python-0.8}/genal/MR.py +16 -16
  84. {genal_python-0.7 → genal_python-0.8}/genal/MR_tools.py +11 -0
  85. {genal_python-0.7 → genal_python-0.8}/genal/__init__.py +1 -1
  86. {genal_python-0.7 → genal_python-0.8}/genal/constants.py +1 -0
  87. {genal_python-0.7 → genal_python-0.8}/genal/extract_prs.py +1 -1
  88. {genal_python-0.7 → genal_python-0.8}/genal/snp_query.py +53 -17
  89. {genal_python-0.7 → genal_python-0.8}/genal/tools.py +16 -6
  90. {genal_python-0.7 → genal_python-0.8}/pyproject.toml +2 -3
  91. genal_python-0.7/docs/requirements.txt +0 -2
  92. genal_python-0.7/docs/source/api.rst +0 -24
  93. genal_python-0.7/docs/source/introduction.rst +0 -505
  94. genal_python-0.7/docs/source/modules.rst +0 -7
  95. {genal_python-0.7 → genal_python-0.8}/LICENSE +0 -0
  96. {genal_python-0.7 → genal_python-0.8}/docs/Makefile +0 -0
  97. {genal_python-0.7 → genal_python-0.8}/docs/_build/doctrees/api.doctree +0 -0
  98. {genal_python-0.7 → genal_python-0.8}/docs/_build/doctrees/environment.pickle +0 -0
  99. {genal_python-0.7 → genal_python-0.8}/docs/_build/doctrees/genal.doctree +0 -0
  100. {genal_python-0.7 → genal_python-0.8}/docs/_build/doctrees/index.doctree +0 -0
  101. {genal_python-0.7 → genal_python-0.8}/docs/_build/doctrees/introduction.doctree +0 -0
  102. {genal_python-0.7 → genal_python-0.8}/docs/_build/doctrees/modules.doctree +0 -0
  103. {genal_python-0.7 → genal_python-0.8}/docs/_build/doctrees/source/genal.doctree +0 -0
  104. {genal_python-0.7 → genal_python-0.8}/docs/_build/doctrees/source/modules.doctree +0 -0
  105. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/.buildinfo +0 -0
  106. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_sources/api.rst.txt +0 -0
  107. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_sources/genal.rst.txt +0 -0
  108. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_sources/index.rst.txt +0 -0
  109. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_sources/introduction.rst.txt +0 -0
  110. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_sources/modules.rst.txt +0 -0
  111. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_sources/source/genal.rst.txt +0 -0
  112. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_sources/source/modules.rst.txt +0 -0
  113. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/_sphinx_javascript_frameworks_compat.js +0 -0
  114. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/basic.css +0 -0
  115. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/css/badge_only.css +0 -0
  116. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/css/fonts/Roboto-Slab-Bold.woff +0 -0
  117. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/css/fonts/Roboto-Slab-Bold.woff2 +0 -0
  118. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/css/fonts/Roboto-Slab-Regular.woff +0 -0
  119. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/css/fonts/Roboto-Slab-Regular.woff2 +0 -0
  120. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/css/fonts/fontawesome-webfont.eot +0 -0
  121. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/css/fonts/fontawesome-webfont.svg +0 -0
  122. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/css/fonts/fontawesome-webfont.ttf +0 -0
  123. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/css/fonts/fontawesome-webfont.woff +0 -0
  124. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/css/fonts/fontawesome-webfont.woff2 +0 -0
  125. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/css/fonts/lato-bold-italic.woff +0 -0
  126. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/css/fonts/lato-bold-italic.woff2 +0 -0
  127. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/css/fonts/lato-bold.woff +0 -0
  128. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/css/fonts/lato-bold.woff2 +0 -0
  129. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/css/fonts/lato-normal-italic.woff +0 -0
  130. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/css/fonts/lato-normal-italic.woff2 +0 -0
  131. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/css/fonts/lato-normal.woff +0 -0
  132. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/css/fonts/lato-normal.woff2 +0 -0
  133. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/css/theme.css +0 -0
  134. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/doctools.js +0 -0
  135. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/documentation_options.js +0 -0
  136. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/file.png +0 -0
  137. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/jquery.js +0 -0
  138. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/js/badge_only.js +0 -0
  139. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/js/html5shiv-printshiv.min.js +0 -0
  140. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/js/html5shiv.min.js +0 -0
  141. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/js/theme.js +0 -0
  142. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/language_data.js +0 -0
  143. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/minus.png +0 -0
  144. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/plus.png +0 -0
  145. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/pygments.css +0 -0
  146. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/searchtools.js +0 -0
  147. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/_static/sphinx_highlight.js +0 -0
  148. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/api.html +0 -0
  149. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/genal.html +0 -0
  150. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/genindex.html +0 -0
  151. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/index.html +0 -0
  152. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/introduction.html +0 -0
  153. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/modules.html +0 -0
  154. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/objects.inv +0 -0
  155. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/py-modindex.html +0 -0
  156. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/search.html +0 -0
  157. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/searchindex.js +0 -0
  158. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/source/genal.html +0 -0
  159. {genal_python-0.7 → genal_python-0.8}/docs/_build/html/source/modules.html +0 -0
  160. {genal_python-0.7/docs/Images → genal_python-0.8/docs/build/_images}/MR_plot_SBP_AS.png +0 -0
  161. /genal_python-0.7/docs/source/genal.rst → /genal_python-0.8/docs/build/_sources/genal.rst.txt +0 -0
  162. {genal_python-0.7 → genal_python-0.8}/docs/make.bat +0 -0
  163. {genal_python-0.7 → genal_python-0.8}/genal/MRpresso.py +0 -0
  164. {genal_python-0.7 → genal_python-0.8}/genal/association.py +0 -0
  165. {genal_python-0.7 → genal_python-0.8}/genal/clump.py +0 -0
  166. {genal_python-0.7 → genal_python-0.8}/genal/geno_tools.py +0 -0
  167. {genal_python-0.7 → genal_python-0.8}/genal/lift.py +0 -0
  168. {genal_python-0.7 → genal_python-0.8}/genal/proxy.py +0 -0
  169. {genal_python-0.7 → genal_python-0.8}/gitignore +0 -0
  170. {genal_python-0.7 → genal_python-0.8}/readthedocs.yaml +0 -0
  171. {genal_python-0.7 → genal_python-0.8}/requirements.txt +0 -0
Binary file
@@ -2,5 +2,4 @@ __pycache__/
2
2
  dist/
3
3
  .ipynb_checkpoints/
4
4
  ipynb_checkpoints/
5
- genal/.ipynb_checkpoints/
6
- docs/
5
+ genal/.ipynb_checkpoints/
@@ -0,0 +1,22 @@
1
+ # .readthedocs.yaml
2
+ # Read the Docs configuration file
3
+ # See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
4
+
5
+ # Required
6
+ version: 2
7
+
8
+ # Set the version of Python and other tools you might need
9
+ build:
10
+   os: ubuntu-22.04
11
+   tools:
12
+     python: "3.11"
13
+
14
+ # Build documentation in the docs/ directory with Sphinx
15
+ sphinx:
16
+   configuration: docs/source/conf.py
17
+
18
+ # We recommend specifying your dependencies to enable reproducible builds:
19
+ # https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
20
+ python:
21
+   install:
22
+     - requirements: docs/requirements.txt
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.1
2
2
  Name: genal-python
3
- Version: 0.7
3
+ Version: 0.8
4
4
  Summary: A python toolkit for polygenic risk scoring and mendelian randomization.
5
5
  Author-email: Cyprien Rivier <riviercyprien@gmail.com>
6
6
  Requires-Python: >=3.7
@@ -17,7 +17,6 @@ Requires-Dist: psutil==5.9.1
17
17
  Requires-Dist: pyliftover==0.4
18
18
  Requires-Dist: scikit_learn>=1.3.0
19
19
  Requires-Dist: scipy>=1.11.4
20
- Requires-Dist: sphinx_rtd_theme==1.3.0
21
20
  Requires-Dist: statsmodels==0.14.0
22
21
  Requires-Dist: tqdm==4.66.1
23
22
  Requires-Dist: wget==3.2
@@ -32,10 +31,11 @@ Project-URL: Home, https://github.com/CypRiv/genal
32
31
 
33
32
  # Table of contents
34
33
  1. [Introduction](#introduction)
35
- 2. [Citation] (#citation)
34
+ 2. [Citation](#citation)
36
35
  3. [Requirements for the genal module](#paragraph1)
37
36
  4. [Installation and how to use genal](#paragraph2)
38
37
  1. [Installation](#paragraph2.1)
38
+ 2. [Documentation](#paragraph2.2)
39
39
  5. [Tutorial and presentation of the main tools](#paragraph3)
40
40
  1. [Data loading](#paragraph3.1)
41
41
  2. [Data preprocessing](#paragraph3.2)
@@ -44,6 +44,7 @@ Project-URL: Home, https://github.com/CypRiv/genal
44
44
  5. [Mendelian Randomization](#paragraph3.5)
45
45
  6. [SNP-association testing](#paragraph3.6)
46
46
  7. [Lifting](#paragraph3.7)
47
+ 8. [GWAS Catalog](#paragraph3.8)
47
48
 
48
49
 
49
50
  ## Introduction <a name="introduction"></a>
@@ -80,6 +81,16 @@ Once downloaded, the path to the plink executable can be set with:
80
81
  ```
81
82
  genal.set_plink(path="/path/to/plink/executable/file")
82
83
  ```
84
+ ### Documentation <a name="paragraph2.2"></a>
85
+
86
+ For detailed information on how to use the functionalities of Genal, please refer to the documentation: https://genal.rtfd.io
87
+
88
+ The documentation covers:
89
+ - Installation
90
+ - This tutorial
91
+ - The list of the main functions with complete description of their arguments
92
+ - An exhaustive API reference
93
+
83
94
 
84
95
  ## Tutorial <a name="paragraph3"></a>
85
96
  For this tutorial, we will obtain genetic instruments for systolic blood pressure (SBP), compute a Polygenic Risk Score (PRS), and run a Mendelian Randomization analysis to investigate the genetically-determined effect of SBP on the risk of stroke. We will utilize summary statistics from Genome-Wide Association Studies (GWAS) and individual-level data from the UK Biobank. The steps include:
@@ -100,11 +111,11 @@ For this tutorial, we will obtain genetic instruments for systolic blood pressur
100
111
  - Data lifting to another genomic build
101
112
  - In pure Python
102
113
  - Using LiftOver
103
- - Phenoscanner (to be added)
114
+ - Querying the GWAS Catalog
104
115
 
105
116
  ### Data loading <a name="paragraph3.1"></a>
106
117
 
107
- We begin with publicly available summary statistics from a large GWAS study of systolic blood pressure. [Link to study](https://www.nature.com/articles/s41588-018-0205-x). After downloading and unzipping the summary statistics, we load them into a pandas DataFrame:
118
+ We start this tutorial with publicly available summary statistics from a large GWAS study of systolic blood pressure. [Link to study](https://www.nature.com/articles/s41588-018-0205-x). After downloading and unzipping the summary statistics, we load them into a pandas DataFrame:
108
119
 
109
120
  ```python
110
121
  import pandas as pd
@@ -133,6 +144,10 @@ The `genal.Geno` takes as input a pandas dataframe where each row corresponds to
133
144
  - **P**: Column name for effect p-value. Defaults to `'P'`.
134
145
  - **EAF**: Column name for effect allele frequency. Defaults to `'EAF'`.
135
146
 
147
+ > **Note:**
148
+ >
149
+ > You do not need all columns to move forward, as not all columns are required by every function. Additionally, some columns can be imputed as we will see in the next paragraph.
150
+
136
151
  After inspecting the dataframe, we first need to extract the chromosome and position information from the `MarkerName` column into two new columns `CHR` and `POS`:
137
152
 
138
153
  ```python
@@ -158,7 +173,7 @@ The last argument (`keep_columns = False`) indicates that we do not wish to keep
158
173
 
159
174
  > **Note:**
160
175
  >
161
- > Make sure to read the readme file usually provided with the summary statistics to identify the correct columns. It is particularly important to correctly identify the allele that represents the effect allele. Also, you do not need all columns to move forward, as some can be inputted as we will see next.
176
+ > Make sure to read the readme file usually provided with the summary statistics to identify the correct columns. It is particularly important to correctly identify the allele that represents the effect allele.
162
177
 
163
178
  ### Data preprocessing <a name="paragraph3.2"></a>
164
179
 
@@ -337,7 +352,7 @@ and the output is:
337
352
  The PRS computation was successful and used 1330/1538 (86.476%) SNPs.
338
353
  PRS data saved to SBP_prs.csv
339
354
 
340
- In our case, we have been able to find proxies for 571 of the 786 SNPs that were missing in the population genetic data (7 potential proxies have been removed because they were identical to SNPs already present in our data).
355
+ In our case, we have been able to find proxies for 578 of the 786 SNPs that were missing in the population genetic data (7 potential proxies have been removed because they were identical to SNPs already present in our data).
341
356
 
342
357
  You can customize how the proxies are chosen with the following arguments:
343
358
  - `reference_panel`: The reference population used to derive linkage disequilibrium values and find proxies. Defaults to `eur`.
@@ -347,7 +362,7 @@ You can customize how the proxies are chosen with the following arguments:
347
362
 
348
363
  > **Note:**
349
364
  >
350
- > You can call the `genal.Geno.prs` method on any `Geno` instance (containing at least the EA, BETA, and either SNP or CHR/POS columns). The data does not need to be clumped, and there is no limit to the number of instruments used to compute the scores.
365
+ > You can call the `genal.Geno.prs` method on any `Geno` instance (containing at least the EA, BETA, and either SNP or CHR/POS columns). The data does not need to be clumped, and there is no limit to the number of SNPs used to compute the scores.
351
366
 
352
367
 
353
368
  ### Mendelian Randomization <a name="paragraph3.5"></a>
@@ -462,7 +477,7 @@ If you want to visualize the obtained MR results, you can use the `genal.Geno.MR
462
477
  SBP_clumped.MR_plot(filename="MR_plot_SBP_AS")
463
478
  ```
464
479
 
465
- ![MR plot](docs/Images/MR_plot_SBP_AS.png)
480
+ ![MR plot](docs/build/_images/MR_plot_SBP_AS.png)
466
481
  You can select which MR methods you wish to plot with the `methods` argument. Note that for an MR method to be plotted, they must be included in the latest `genal.Geno.MR` call of this `genal.Geno` instance.
467
482
 
468
483
  If you wish to include the heterogeneity values (Cochran's Q) in the results, you can use the heterogeneity argument in the `genal.Geno.MR` call. Here, the heterogeneity for the inverse-variance weighted method:
@@ -507,7 +522,7 @@ df_pheno = pd.read_csv("path/to/trait/data")
507
522
 
508
523
  > **Note:**
509
524
  >
510
- > One important detail is to make sure that the individual IDs are identical between the phenotypic data and the genetic data for the target population.
525
+ > One important detail is to make sure that the IDs of the participants are identical in the phenotypic data and in the genetic data.
511
526
 
512
527
  Then, it is advised to make a copy of the `genal.Geno` instance containing our instruments as we are going to update their coefficients and to avoid any confusion:
513
528
 
@@ -585,6 +600,54 @@ You can specify the path of the LiftOver executable to the `liftover_path` argum
585
600
  SBP_Geno.lift(start = "hg19", end = "hg38", replace = False, liftover_path = "path/to/liftover/exec")
586
601
  ```
587
602
 
603
+ ### GWAS Catalog <a name="paragraph3.8"></a>
588
604
 
605
+ It is sometimes interesting to determine the traits associated with our SNPs. In Mendelian Randomization, for instance, we may want to exclude instruments that are associated with traits likely causing horizontal pleiotropy. For this purpose, we can use the `genal.Geno.query_gwas_catalog` method. This method will query the GWAS Catalog API to determine the list of traits associated with each of our SNPs and store the results in a list in the `ASSOC` column of the `.data` attribute:
589
606
 
607
+ ```python
608
+ SBP_clumped.query_gwas_catalog(p_threshold=5e-8)
609
+ ```
610
+ Which will output:
611
+
612
+ Querying the GWAS Catalog and creating the ASSOC column.
613
+ Only associations with a p-value <= 5e-08 are reported. Use the p_threshold argument to change the threshold.
614
+ To report the p-value for each association, use return_p=True.
615
+ To report the study ID for each association, use return_study=True.
616
+ The .data attribute will be modified. Use replace=False to leave it as is.
617
+ 100%|██████████| 1545/1545 [00:34<00:00, 44.86it/s]
618
+ The ASSOC column has been successfully created.
619
+ 701 (45.37%) SNPs failed to query (not found in GWAS Catalog) and 7 (0.5%) SNPs timed out after 34.33 seconds. You can increase the timeout value with the timeout argument.
620
+ | EA | NEA | EAF | BETA | SE | CHR | POS | SNP | ASSOC |
621
+ |-----|-----|-------|--------|--------|-----|------------|------------|------------------------------------------------------------------------|
622
+ | A | G | 0.1784| 0.2330 | 0.0402 | 10 | 102075479 | rs603424 | [eicosanoids measurement, decadienedioic acid (...] |
623
+ | A | G | 0.0706| -0.3873| 0.0626 | 10 | 102403682 | rs2996303 | FAILED_QUERY |
624
+ | T | G | 0.8872| 0.6846 | 0.0480 | 10 | 102553647 | rs1006545 | [diastolic blood pressure, systolic blood pressure...] |
625
+ | T | G | 0.6652| -0.2098| 0.0340 | 10 | 102558506 | rs12570050 | FAILED_QUERY |
626
+ | T | C | 0.3057| -0.2448| 0.0334 | 10 | 102603924 | rs4919502 | FAILED_QUERY |
627
+ | ... | ... | ... | ... | ... | ... | ... | ... | ... | |
628
+ | T | C | 0.3514| 0.2203 | 0.0314 | 9 | 9350706 | rs1332813 | [diastolic blood pressure, systolic blood pressure...] |
629
+ | T | C | 0.6880| -0.1897| 0.0332 | 9 | 94201341 | rs10820855 | FAILED_QUERY |
630
+ | A | T | 0.3669| -0.1862| 0.0313 | 9 | 95201540 | rs7045409 | [protein measurement, pulse pressure measurement...] |
631
+
632
+ If you are also interested in the p-values of each SNP-trait association, or the ID of the study from which the association was reported, you can use the `return_p = True` and `return_study = True` arguments. Then, the `ASSOC` column will contain a list of tuples, where each tuple contains the trait name, the p-value, and the study ID:
590
633
 
634
+ ```python
635
+ SBP_clumped.query_gwas_catalog(p_threshold=5e-8, return_p=True, return_study=True)
636
+ ```
637
+
638
+ | EA | NEA | EAF | BETA | SE | CHR | POS | SNP | ASSOC |
639
+ |-----|-----|-------|--------|--------|-----|------------|------------|------------------------------------------------------------------------|
640
+ | A | G | 0.1784| 0.2330 | 0.0402 | 10 | 102075479 | rs603424 | TIMEOUT |
641
+ | A | G | 0.0706| -0.3873| 0.0626 | 10 | 102403682 | rs2996303 | FAILED_QUERY |
642
+ | T | G | 0.8872| 0.6846 | 0.0480 | 10 | 102553647 | rs1006545 | [(heart rate response to exercise, 6e-12, GCST... |
643
+ | T | G | 0.6652| -0.2098| 0.0340 | 10 | 102558506 | rs12570050 | FAILED_QUERY |
644
+ | T | C | 0.3057| -0.2448| 0.0334 | 10 | 102603924 | rs4919502 | FAILED_QUERY |
645
+ | ... | ... | ... | ... | ... | ... | ... | ... | ... | |
646
+ | T | C | 0.3514| 0.2203 | 0.0314 | 9 | 9350706 | rs1332813 | [(diastolic blood pressure, 1e-12, GCST9031029... |
647
+ | T | C | 0.6880| -0.1897| 0.0332 | 9 | 94201341 | rs10820855 | FAILED_QUERY |
648
+ | A | T | 0.3669| -0.1862| 0.0313 | 9 | 95201540 | rs7045409 | [(systolic blood pressure, 9e-13, GCST006624),... |
649
+
650
+
651
+ > **Note:**
652
+ >
653
+ > As you can see, many SNPs failed to be queried. This is normal as the GWAS Catalog is not exhaustive.
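
Reviewer note (not part of the package diff): the new GWAS Catalog section says the `ASSOC` column holds either a list of traits, a list of `(trait, p-value, study ID)` tuples, or a marker string such as `FAILED_QUERY` or `TIMEOUT`. The sketch below shows one way a user might screen instruments on such values. It is an illustration under stated assumptions, not genal's own API: the helper `has_pleiotropic_trait`, the toy rows, the SNP names, and the study ID are all hypothetical, and plain Python dictionaries stand in for the `.data` DataFrame.

```python
# Marker strings shown in the ASSOC column of the example output above.
FAILED_MARKERS = {"FAILED_QUERY", "TIMEOUT"}


def has_pleiotropic_trait(assoc, unwanted_traits):
    """Return True if any trait reported for a SNP is in unwanted_traits.

    `assoc` is either a marker string or a list whose items are trait names
    (default call) or tuples that start with the trait name
    (return_p=True / return_study=True).
    """
    if not isinstance(assoc, list):
        return False  # failed or timed-out query: nothing to match against
    traits = (item[0] if isinstance(item, tuple) else item for item in assoc)
    return any(trait in unwanted_traits for trait in traits)


# Toy rows mimicking the SNP and ASSOC columns (illustrative values only).
rows = [
    {"SNP": "snp_a", "ASSOC": ["eicosanoids measurement"]},
    {"SNP": "snp_b", "ASSOC": "FAILED_QUERY"},
    {"SNP": "snp_c", "ASSOC": [("heart rate response to exercise", 6e-12, "STUDY_1")]},
    {"SNP": "snp_d", "ASSOC": "TIMEOUT"},
]

# Hypothetical trait that the analyst considers a source of horizontal pleiotropy.
unwanted = {"heart rate response to exercise"}

# SNPs kept as instruments: no association with an unwanted trait was reported.
kept = [r["SNP"] for r in rows if not has_pleiotropic_trait(r["ASSOC"], unwanted)]

# SNPs whose query did not return a result, so their status remains unknown.
unresolved = [
    r["SNP"]
    for r in rows
    if isinstance(r["ASSOC"], str) and r["ASSOC"] in FAILED_MARKERS
]

print(kept)
print(unresolved)
```

With a pandas DataFrame, the same predicate could be applied to the `ASSOC` column with `Series.apply`. Whether unresolved SNPs should be kept or dropped is an analysis decision that this sketch leaves open.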
@@ -7,10 +7,11 @@
7
7
 
8
8
  # Table of contents
9
9
  1. [Introduction](#introduction)
10
- 2. [Citation] (#citation)
10
+ 2. [Citation](#citation)
11
11
  3. [Requirements for the genal module](#paragraph1)
12
12
  4. [Installation and how to use genal](#paragraph2)
13
13
  1. [Installation](#paragraph2.1)
14
+ 2. [Documentation](#paragraph2.2)
14
15
  5. [Tutorial and presentation of the main tools](#paragraph3)
15
16
  1. [Data loading](#paragraph3.1)
16
17
  2. [Data preprocessing](#paragraph3.2)
@@ -19,6 +20,7 @@
19
20
  5. [Mendelian Randomization](#paragraph3.5)
20
21
  6. [SNP-association testing](#paragraph3.6)
21
22
  7. [Lifting](#paragraph3.7)
23
+ 8. [GWAS Catalog](#paragraph3.8)
22
24
 
23
25
 
24
26
  ## Introduction <a name="introduction"></a>
@@ -55,6 +57,16 @@ Once downloaded, the path to the plink executable can be set with:
55
57
  ```
56
58
  genal.set_plink(path="/path/to/plink/executable/file")
57
59
  ```
60
+ ### Documentation <a name="paragraph2.2"></a>
61
+
62
+ For detailed information on how to use the functionalities of Genal, please refer to the documentation: https://genal.rtfd.io
63
+
64
+ The documentation covers:
65
+ - Installation
66
+ - This tutorial
67
+ - The list of the main functions with complete description of their arguments
68
+ - An exhaustive API reference
69
+
58
70
 
59
71
  ## Tutorial <a name="paragraph3"></a>
60
72
  For this tutorial, we will obtain genetic instruments for systolic blood pressure (SBP), compute a Polygenic Risk Score (PRS), and run a Mendelian Randomization analysis to investigate the genetically-determined effect of SBP on the risk of stroke. We will utilize summary statistics from Genome-Wide Association Studies (GWAS) and individual-level data from the UK Biobank. The steps include:
@@ -75,11 +87,11 @@ For this tutorial, we will obtain genetic instruments for systolic blood pressur
75
87
  - Data lifting to another genomic build
76
88
  - In pure Python
77
89
  - Using LiftOver
78
- - Phenoscanner (to be added)
90
+ - Querying the GWAS Catalog
79
91
 
80
92
  ### Data loading <a name="paragraph3.1"></a>
81
93
 
82
- We begin with publicly available summary statistics from a large GWAS study of systolic blood pressure. [Link to study](https://www.nature.com/articles/s41588-018-0205-x). After downloading and unzipping the summary statistics, we load them into a pandas DataFrame:
94
+ We start this tutorial with publicly available summary statistics from a large GWAS study of systolic blood pressure. [Link to study](https://www.nature.com/articles/s41588-018-0205-x). After downloading and unzipping the summary statistics, we load them into a pandas DataFrame:
83
95
 
84
96
  ```python
85
97
  import pandas as pd
@@ -108,6 +120,10 @@ The `genal.Geno` takes as input a pandas dataframe where each row corresponds to
108
120
  - **P**: Column name for effect p-value. Defaults to `'P'`.
109
121
  - **EAF**: Column name for effect allele frequency. Defaults to `'EAF'`.
110
122
 
123
+ > **Note:**
124
+ >
125
+ > You do not need all columns to move forward, as not all columns are required by every function. Additionally, some columns can be imputed as we will see in the next paragraph.
126
+
111
127
  After inspecting the dataframe, we first need to extract the chromosome and position information from the `MarkerName` column into two new columns `CHR` and `POS`:
112
128
 
113
129
  ```python
@@ -133,7 +149,7 @@ The last argument (`keep_columns = False`) indicates that we do not wish to keep
133
149
 
134
150
  > **Note:**
135
151
  >
136
- > Make sure to read the readme file usually provided with the summary statistics to identify the correct columns. It is particularly important to correctly identify the allele that represents the effect allele. Also, you do not need all columns to move forward, as some can be inputted as we will see next.
152
+ > Make sure to read the readme file usually provided with the summary statistics to identify the correct columns. It is particularly important to correctly identify the allele that represents the effect allele.
137
153
 
138
154
  ### Data preprocessing <a name="paragraph3.2"></a>
139
155
 
@@ -312,7 +328,7 @@ and the output is:
312
328
  The PRS computation was successful and used 1330/1538 (86.476%) SNPs.
313
329
  PRS data saved to SBP_prs.csv
314
330
 
315
- In our case, we have been able to find proxies for 571 of the 786 SNPs that were missing in the population genetic data (7 potential proxies have been removed because they were identical to SNPs already present in our data).
331
+ In our case, we have been able to find proxies for 578 of the 786 SNPs that were missing in the population genetic data (7 potential proxies have been removed because they were identical to SNPs already present in our data).
316
332
 
317
333
  You can customize how the proxies are chosen with the following arguments:
318
334
  - `reference_panel`: The reference population used to derive linkage disequilibrium values and find proxies. Defaults to `eur`.
@@ -322,7 +338,7 @@ You can customize how the proxies are chosen with the following arguments:
322
338
 
323
339
  > **Note:**
324
340
  >
325
- > You can call the `genal.Geno.prs` method on any `Geno` instance (containing at least the EA, BETA, and either SNP or CHR/POS columns). The data does not need to be clumped, and there is no limit to the number of instruments used to compute the scores.
341
+ > You can call the `genal.Geno.prs` method on any `Geno` instance (containing at least the EA, BETA, and either SNP or CHR/POS columns). The data does not need to be clumped, and there is no limit to the number of SNPs used to compute the scores.
326
342
 
327
343
 
328
344
  ### Mendelian Randomization <a name="paragraph3.5"></a>
@@ -437,7 +453,7 @@ If you want to visualize the obtained MR results, you can use the `genal.Geno.MR
437
453
  SBP_clumped.MR_plot(filename="MR_plot_SBP_AS")
438
454
  ```
439
455
 
440
- ![MR plot](docs/Images/MR_plot_SBP_AS.png)
456
+ ![MR plot](docs/build/_images/MR_plot_SBP_AS.png)
441
457
  You can select which MR methods you wish to plot with the `methods` argument. Note that for an MR method to be plotted, it must be included in the latest `genal.Geno.MR` call of this `genal.Geno` instance.
442
458
 
443
459
  If you wish to include the heterogeneity values (Cochran's Q) in the results, you can use the `heterogeneity` argument in the `genal.Geno.MR` call. Here is the heterogeneity for the inverse-variance weighted method:
@@ -482,7 +498,7 @@ df_pheno = pd.read_csv("path/to/trait/data")
482
498
 
483
499
  > **Note:**
484
500
  >
485
- > One important detail is to make sure that the individual IDs are identical between the phenotypic data and the genetic data for the target population.
501
+ > One important detail is to make sure that the IDs of the participants are identical in the phenotypic data and in the genetic data.
486
502
 
487
503
  Then, it is advisable to make a copy of the `genal.Geno` instance containing our instruments, since we are going to update their coefficients and want to avoid any confusion:
488
504
 
@@ -560,5 +576,54 @@ You can specify the path of the LiftOver executable to the `liftover_path` argum
560
576
  SBP_Geno.lift(start = "hg19", end = "hg38", replace = False, liftover_path = "path/to/liftover/exec")
561
577
  ```
562
578
 
579
+ ### GWAS Catalog <a name="paragraph3.8"></a>
580
+
581
+ It is sometimes interesting to determine the traits associated with our SNPs. In Mendelian Randomization, for instance, we may want to exclude instruments that are associated with traits likely causing horizontal pleiotropy. For this purpose, we can use the `genal.Geno.query_gwas_catalog` method. This method will query the GWAS Catalog API to determine the list of traits associated with each of our SNPs and store the results in a list in the `ASSOC` column of the `.data` attribute:
582
+
583
+ ```python
584
+ SBP_clumped.query_gwas_catalog(p_threshold=5e-8)
585
+ ```
586
+ Which will output:
587
+
588
+ Querying the GWAS Catalog and creating the ASSOC column.
589
+ Only associations with a p-value <= 5e-08 are reported. Use the p_threshold argument to change the threshold.
590
+ To report the p-value for each association, use return_p=True.
591
+ To report the study ID for each association, use return_study=True.
592
+ The .data attribute will be modified. Use replace=False to leave it as is.
593
+ 100%|██████████| 1545/1545 [00:34<00:00, 44.86it/s]
594
+ The ASSOC column has been successfully created.
595
+ 701 (45.37%) SNPs failed to query (not found in GWAS Catalog) and 7 (0.5%) SNPs timed out after 34.33 seconds. You can increase the timeout value with the timeout argument.
596
+ | EA | NEA | EAF | BETA | SE | CHR | POS | SNP | ASSOC |
597
+ |-----|-----|-------|--------|--------|-----|------------|------------|------------------------------------------------------------------------|
598
+ | A | G | 0.1784| 0.2330 | 0.0402 | 10 | 102075479 | rs603424 | [eicosanoids measurement, decadienedioic acid (...] |
599
+ | A | G | 0.0706| -0.3873| 0.0626 | 10 | 102403682 | rs2996303 | FAILED_QUERY |
600
+ | T | G | 0.8872| 0.6846 | 0.0480 | 10 | 102553647 | rs1006545 | [diastolic blood pressure, systolic blood pressure...] |
601
+ | T | G | 0.6652| -0.2098| 0.0340 | 10 | 102558506 | rs12570050 | FAILED_QUERY |
602
+ | T | C | 0.3057| -0.2448| 0.0334 | 10 | 102603924 | rs4919502 | FAILED_QUERY |
603
+ | ... | ... | ... | ... | ... | ... | ... | ... | ... |
604
+ | T | C | 0.3514| 0.2203 | 0.0314 | 9 | 9350706 | rs1332813 | [diastolic blood pressure, systolic blood pressure...] |
605
+ | T | C | 0.6880| -0.1897| 0.0332 | 9 | 94201341 | rs10820855 | FAILED_QUERY |
606
+ | A | T | 0.3669| -0.1862| 0.0313 | 9 | 95201540 | rs7045409 | [protein measurement, pulse pressure measurement...] |
607
+
608
+ If you are also interested in the p-values of each SNP-trait association, or the ID of the study from which the association was reported, you can use the `return_p = True` and `return_study = True` arguments. Then, the `ASSOC` column will contain a list of tuples, where each tuple contains the trait name, the p-value, and the study ID:
609
+
610
+ ```python
611
+ SBP_clumped.query_gwas_catalog(p_threshold=5e-8, return_p=True, return_study=True)
612
+ ```
613
+
614
+ | EA | NEA | EAF | BETA | SE | CHR | POS | SNP | ASSOC |
615
+ |-----|-----|-------|--------|--------|-----|------------|------------|------------------------------------------------------------------------|
616
+ | A | G | 0.1784| 0.2330 | 0.0402 | 10 | 102075479 | rs603424 | TIMEOUT |
617
+ | A | G | 0.0706| -0.3873| 0.0626 | 10 | 102403682 | rs2996303 | FAILED_QUERY |
618
+ | T | G | 0.8872| 0.6846 | 0.0480 | 10 | 102553647 | rs1006545 | [(heart rate response to exercise, 6e-12, GCST... |
619
+ | T | G | 0.6652| -0.2098| 0.0340 | 10 | 102558506 | rs12570050 | FAILED_QUERY |
620
+ | T | C | 0.3057| -0.2448| 0.0334 | 10 | 102603924 | rs4919502 | FAILED_QUERY |
621
+ | ... | ... | ... | ... | ... | ... | ... | ... | ... |
622
+ | T | C | 0.3514| 0.2203 | 0.0314 | 9 | 9350706 | rs1332813 | [(diastolic blood pressure, 1e-12, GCST9031029... |
623
+ | T | C | 0.6880| -0.1897| 0.0332 | 9 | 94201341 | rs10820855 | FAILED_QUERY |
624
+ | A | T | 0.3669| -0.1862| 0.0313 | 9 | 95201540 | rs7045409 | [(systolic blood pressure, 9e-13, GCST006624),... |
563
625
 
564
626
 
627
+ > **Note:**
628
+ >
629
+ > As you can see, many SNPs failed to be queried. This is normal as the GWAS Catalog is not exhaustive.
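Once the `ASSOC` column exists, one possible way to use it is to drop instruments associated with traits suspected of causing horizontal pleiotropy. The sketch below is not part of genal: it works on a toy pandas DataFrame that mimics the `.data` attribute, and the helper `has_trait` and the `excluded` set are hypothetical names introduced for illustration. It assumes that `ASSOC` holds either a list (of trait names, or of tuples whose first element is the trait name) or a status string such as `FAILED_QUERY` or `TIMEOUT`, as in the tables above. The two list shapes are mixed in one column here only to show that both are handled.

```python
import pandas as pd

# Toy stand-in for the `.data` attribute after query_gwas_catalog (illustrative rows only).
data = pd.DataFrame({
    "SNP": ["rs603424", "rs2996303", "rs1006545", "rs7045409"],
    "ASSOC": [
        ["eicosanoids measurement"],
        "FAILED_QUERY",
        ["diastolic blood pressure", "systolic blood pressure"],
        [("systolic blood pressure", 9e-13, "GCST006624")],
    ],
})

def has_trait(assoc, traits):
    """Return True if any association in `assoc` names one of `traits` (hypothetical helper)."""
    # Status strings such as "FAILED_QUERY" or "TIMEOUT" carry no associations.
    if not isinstance(assoc, list):
        return False
    # Entries are plain trait names, or tuples whose first element is the trait name.
    names = {a[0] if isinstance(a, tuple) else a for a in assoc}
    return not names.isdisjoint(traits)

# Hypothetical set of traits we consider likely to cause horizontal pleiotropy.
excluded = {"eicosanoids measurement"}

# Boolean mask of SNPs to remove, then the filtered table.
mask = data["ASSOC"].apply(lambda assoc: has_trait(assoc, excluded))
kept = data[~mask]
```

Note that SNPs marked `FAILED_QUERY` or `TIMEOUT` are kept by this filter, because no association could be checked for them. Whether to keep or drop such SNPs is a judgment call that depends on the analysis.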
Binary file
@@ -0,0 +1,4 @@
1
+ # Sphinx build info version 1
2
+ # This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
3
+ config: 1a3c03fa317dbf0f46b6f7567774d6c5
4
+ tags: 645f666f9bcd5a90fca523b33c5a78b7