genal-python 1.0__tar.gz → 1.1__tar.gz

Files changed (111)
  1. genal_python-1.1/Genal_flowchart.png +0 -0
  2. {genal_python-1.0 → genal_python-1.1}/PKG-INFO +18 -9
  3. {genal_python-1.0 → genal_python-1.1}/README.md +17 -8
  4. genal_python-1.1/docs/source/Images/Genal_flowchart.png +0 -0
  5. genal_python-1.1/docs/source/Images/genal_logo.png +0 -0
  6. {genal_python-1.0 → genal_python-1.1}/docs/source/conf.py +1 -1
  7. {genal_python-1.0 → genal_python-1.1}/docs/source/index.rst +5 -1
  8. {genal_python-1.0 → genal_python-1.1}/docs/source/introduction.rst +14 -8
  9. {genal_python-1.0 → genal_python-1.1}/genal/Geno.py +169 -103
  10. {genal_python-1.0 → genal_python-1.1}/genal/MR.py +1 -5
  11. {genal_python-1.0 → genal_python-1.1}/genal/MR_tools.py +24 -24
  12. {genal_python-1.0 → genal_python-1.1}/genal/MRpresso.py +12 -13
  13. genal_python-1.1/genal/__init__.py +19 -0
  14. {genal_python-1.0 → genal_python-1.1}/genal/association.py +173 -115
  15. {genal_python-1.0 → genal_python-1.1}/genal/clump.py +33 -18
  16. {genal_python-1.0 → genal_python-1.1}/genal/constants.py +3 -0
  17. {genal_python-1.0 → genal_python-1.1}/genal/extract_prs.py +106 -101
  18. {genal_python-1.0 → genal_python-1.1}/genal/geno_tools.py +14 -8
  19. {genal_python-1.0 → genal_python-1.1}/genal/proxy.py +120 -54
  20. {genal_python-1.0 → genal_python-1.1}/genal/tools.py +211 -137
  21. genal_python-1.1/genal_logo.png +0 -0
  22. {genal_python-1.0 → genal_python-1.1}/pyproject.toml +1 -1
  23. genal_python-1.0/Genal_flowchart.png +0 -0
  24. genal_python-1.0/genal/__init__.py +0 -20
  25. genal_python-1.0/genal_logo.png +0 -0
  26. {genal_python-1.0 → genal_python-1.1}/.DS_Store +0 -0
  27. {genal_python-1.0 → genal_python-1.1}/.gitignore +0 -0
  28. {genal_python-1.0 → genal_python-1.1}/.readthedocs.yaml +0 -0
  29. {genal_python-1.0 → genal_python-1.1}/LICENSE +0 -0
  30. {genal_python-1.0 → genal_python-1.1}/docs/.DS_Store +0 -0
  31. {genal_python-1.0 → genal_python-1.1}/docs/Makefile +0 -0
  32. {genal_python-1.0 → genal_python-1.1}/docs/build/.DS_Store +0 -0
  33. {genal_python-1.0 → genal_python-1.1}/docs/build/.buildinfo +0 -0
  34. {genal_python-1.0 → genal_python-1.1}/docs/build/.doctrees/api.doctree +0 -0
  35. {genal_python-1.0 → genal_python-1.1}/docs/build/.doctrees/environment.pickle +0 -0
  36. {genal_python-1.0 → genal_python-1.1}/docs/build/.doctrees/genal.doctree +0 -0
  37. {genal_python-1.0 → genal_python-1.1}/docs/build/.doctrees/index.doctree +0 -0
  38. {genal_python-1.0 → genal_python-1.1}/docs/build/.doctrees/introduction.doctree +0 -0
  39. {genal_python-1.0 → genal_python-1.1}/docs/build/.doctrees/modules.doctree +0 -0
  40. {genal_python-1.0 → genal_python-1.1}/docs/build/_images/MR_plot_SBP_AS.png +0 -0
  41. {genal_python-1.0 → genal_python-1.1}/docs/build/_modules/genal/Geno.html +0 -0
  42. {genal_python-1.0 → genal_python-1.1}/docs/build/_modules/genal/MR.html +0 -0
  43. {genal_python-1.0 → genal_python-1.1}/docs/build/_modules/genal/MR_tools.html +0 -0
  44. {genal_python-1.0 → genal_python-1.1}/docs/build/_modules/genal/MRpresso.html +0 -0
  45. {genal_python-1.0 → genal_python-1.1}/docs/build/_modules/genal/association.html +0 -0
  46. {genal_python-1.0 → genal_python-1.1}/docs/build/_modules/genal/clump.html +0 -0
  47. {genal_python-1.0 → genal_python-1.1}/docs/build/_modules/genal/extract_prs.html +0 -0
  48. {genal_python-1.0 → genal_python-1.1}/docs/build/_modules/genal/geno_tools.html +0 -0
  49. {genal_python-1.0 → genal_python-1.1}/docs/build/_modules/genal/lift.html +0 -0
  50. {genal_python-1.0 → genal_python-1.1}/docs/build/_modules/genal/proxy.html +0 -0
  51. {genal_python-1.0 → genal_python-1.1}/docs/build/_modules/genal/snp_query.html +0 -0
  52. {genal_python-1.0 → genal_python-1.1}/docs/build/_modules/genal/tools.html +0 -0
  53. {genal_python-1.0 → genal_python-1.1}/docs/build/_modules/index.html +0 -0
  54. {genal_python-1.0 → genal_python-1.1}/docs/build/_sources/api.rst.txt +0 -0
  55. {genal_python-1.0 → genal_python-1.1}/docs/build/_sources/genal.rst.txt +0 -0
  56. {genal_python-1.0 → genal_python-1.1}/docs/build/_sources/index.rst.txt +0 -0
  57. {genal_python-1.0 → genal_python-1.1}/docs/build/_sources/introduction.rst.txt +0 -0
  58. {genal_python-1.0 → genal_python-1.1}/docs/build/_sources/modules.rst.txt +0 -0
  59. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/basic.css +0 -0
  60. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/css/badge_only.css +0 -0
  61. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/css/fonts/Roboto-Slab-Bold.woff +0 -0
  62. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/css/fonts/Roboto-Slab-Bold.woff2 +0 -0
  63. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/css/fonts/Roboto-Slab-Regular.woff +0 -0
  64. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/css/fonts/Roboto-Slab-Regular.woff2 +0 -0
  65. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/css/fonts/fontawesome-webfont.eot +0 -0
  66. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/css/fonts/fontawesome-webfont.svg +0 -0
  67. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/css/fonts/fontawesome-webfont.ttf +0 -0
  68. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/css/fonts/fontawesome-webfont.woff +0 -0
  69. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/css/fonts/fontawesome-webfont.woff2 +0 -0
  70. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/css/fonts/lato-bold-italic.woff +0 -0
  71. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/css/fonts/lato-bold-italic.woff2 +0 -0
  72. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/css/fonts/lato-bold.woff +0 -0
  73. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/css/fonts/lato-bold.woff2 +0 -0
  74. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/css/fonts/lato-normal-italic.woff +0 -0
  75. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/css/fonts/lato-normal-italic.woff2 +0 -0
  76. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/css/fonts/lato-normal.woff +0 -0
  77. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/css/fonts/lato-normal.woff2 +0 -0
  78. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/css/theme.css +0 -0
  79. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/doctools.js +0 -0
  80. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/documentation_options.js +0 -0
  81. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/file.png +0 -0
  82. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/js/badge_only.js +0 -0
  83. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/js/html5shiv-printshiv.min.js +0 -0
  84. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/js/html5shiv.min.js +0 -0
  85. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/js/theme.js +0 -0
  86. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/language_data.js +0 -0
  87. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/minus.png +0 -0
  88. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/plus.png +0 -0
  89. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/pygments.css +0 -0
  90. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/searchtools.js +0 -0
  91. {genal_python-1.0 → genal_python-1.1}/docs/build/_static/sphinx_highlight.js +0 -0
  92. {genal_python-1.0 → genal_python-1.1}/docs/build/api.html +0 -0
  93. {genal_python-1.0 → genal_python-1.1}/docs/build/genal.html +0 -0
  94. {genal_python-1.0 → genal_python-1.1}/docs/build/genindex.html +0 -0
  95. {genal_python-1.0 → genal_python-1.1}/docs/build/index.html +0 -0
  96. {genal_python-1.0 → genal_python-1.1}/docs/build/introduction.html +0 -0
  97. {genal_python-1.0 → genal_python-1.1}/docs/build/modules.html +0 -0
  98. {genal_python-1.0 → genal_python-1.1}/docs/build/objects.inv +0 -0
  99. {genal_python-1.0 → genal_python-1.1}/docs/build/py-modindex.html +0 -0
  100. {genal_python-1.0 → genal_python-1.1}/docs/build/search.html +0 -0
  101. {genal_python-1.0 → genal_python-1.1}/docs/build/searchindex.js +0 -0
  102. {genal_python-1.0 → genal_python-1.1}/docs/make.bat +0 -0
  103. {genal_python-1.0 → genal_python-1.1}/docs/requirements.txt +0 -0
  104. {genal_python-1.0 → genal_python-1.1}/docs/source/.DS_Store +0 -0
  105. {genal_python-1.0 → genal_python-1.1}/docs/source/Images/MR_plot_SBP_AS.png +0 -0
  106. {genal_python-1.0 → genal_python-1.1}/docs/source/api.rst +0 -0
  107. {genal_python-1.0 → genal_python-1.1}/docs/source/modules.rst +0 -0
  108. {genal_python-1.0 → genal_python-1.1}/genal/lift.py +0 -0
  109. {genal_python-1.0 → genal_python-1.1}/genal/snp_query.py +0 -0
  110. {genal_python-1.0 → genal_python-1.1}/gitignore +0 -0
  111. {genal_python-1.0 → genal_python-1.1}/readthedocs.yaml +0 -0
Binary file
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.3
2
2
  Name: genal-python
3
- Version: 1.0
3
+ Version: 1.1
4
4
  Summary: A python toolkit for polygenic risk scoring and mendelian randomization.
5
5
  Author-email: Cyprien Rivier <riviercyprien@gmail.com>
6
6
  Requires-Python: >=3.8
@@ -62,8 +62,11 @@ Genal draws on concepts from well-established R packages such as TwoSampleMR, MR
62
62
 
63
63
  Genal flowchart. Created in https://www.BioRender.com
64
64
  ## Citation <a name="citation"></a>
65
- If you're using genal, please cite the following paper:
66
- **Genal: A Python Toolkit for Genetic Risk Scoring and Mendelian Randomization.** Cyprien A. Rivier, Santiago Clocchiatti-Tuozzo, Shufan Huo, Victor Torres-Lopez, Daniela Renedo, Kevin N. Sheth, Guido J. Falcone, Julian N. Acosta. medRxiv 2024.05.23.24307776; doi: https://doi.org/10.1101/2024.05.23.24307776
65
+ If you're using genal, please cite the following paper:
66
+ **Genal: A Python Toolkit for Genetic Risk Scoring and Mendelian Randomization.**
67
+ Cyprien A. Rivier, Santiago Clocchiatti-Tuozzo, Shufan Huo, Victor Torres-Lopez, Daniela Renedo, Kevin N. Sheth, Guido J. Falcone, Julian N. Acosta.
68
+ Bioinformatics Advances 2024.
69
+ doi: https://doi.org/10.1093/bioadv/vbae207
67
70
 
68
71
  ## Requirements for the genal module <a name="paragraph1"></a>
69
72
  ***Python 3.8 or later***. https://www.python.org/ <br>
@@ -91,8 +94,8 @@ And import it in a python environment with:
91
94
  import genal
92
95
  ```
93
96
 
94
- The main genal functionalities require a working installation of PLINK v1.9 (and not 2.0 as certain functionalities have not been updated yet).
95
- If you have already installed plink v1.9, you can set the path to its executable with:
97
+ The main genal functionalities require a working installation of PLINK v2.0.
98
+ If you have already installed plink v2.0, you can set the path to its executable with:
96
99
 
97
100
  ```
98
101
  genal.set_plink(path="/path/to/plink/executable/file")
@@ -256,7 +259,7 @@ You do not need to obtain the 1000 genome reference panel yourself, genal will d
256
259
  SBP_Geno.preprocess_data(preprocessing = 'Fill_delete', reference_panel = "afr")
257
260
  ```
258
261
 
259
- You can also use a custom reference panel by specifying to the reference_panel argument a path to bed/bim/fam files (without the extension).
262
+ You can also use a custom reference panel by specifying to the reference_panel argument a path to bed/bim/fam (plink v1.9 format) or pgen/pvar/psam files (plink v2.0 format), without the extension.
260
263
 
261
264
  ### Clumping <a name="paragraph3.3"></a>
262
265
 
@@ -289,7 +292,7 @@ Computing a Polygenic Risk Score (PRS) can be done in one line with the `genal.G
289
292
  SBP_clumped.prs(name = "SBP_prs", path = "path/to/genetic/files")
290
293
  ```
291
294
 
292
- The genetic files of the target population can be either contained in one triple of bed/bim/fam files with information for all SNPs, or divided by chromosome (one bed/bim/fam triple for chr 1, another for chr 2, etc...). In the latter case, provide the path by replacing the chromosome number by `$` and genal will extract the necessary SNPs from each chromosome and merge them before running the PRS. For instance, if the genetic files are named `Pop_chr1.bed`, `Pop_chr1.bim`, `Pop_chr1.fam`, `Pop_chr2.bed`, ..., you can use:
295
+ The genetic files of the target population can be either contained in one triple of bed/bim/fam or pgen/pvar/psam files with information for all SNPs, or divided by chromosome (one bed/bim/fam or pgen/pvar/psam triple for chr 1, another for chr 2, etc...). In the latter case, provide the path by replacing the chromosome number by `$` and genal will extract the necessary SNPs from each chromosome and merge them before running the PRS. For instance, if the genetic files are named `Pop_chr1.bed`, `Pop_chr1.bim`, `Pop_chr1.fam`, `Pop_chr2.bed`, ..., you can use:
293
296
 
294
297
  ```python
295
298
  SBP_clumped.prs(name = "SBP_prs", path = "Pop_chr$")
@@ -520,6 +523,12 @@ And that will give:
520
523
  | SBP | Stroke_eur | Egger Intercept | 1499 | -0.001381 | 0.000813 | 8.935529e-02 | 2959.965136 | 1497 | 1.253763e-98 |
521
524
  | SBP | Stroke_eur | Inverse-Variance Weighted| 1499 | 0.023049 | 0.001061 | 1.382645e-104 | 2965.678836 | 1498 | 4.280737e-99 |
522
525
 
526
+ If you wish to display the coefficients as odds ratios with confidence intervals for a binary outcome trait, you can use the `odds = True` argument:
527
+
528
+ ```python
529
+ SBP_clumped.MR(action = 2, methods = ["Egger","IVW"], exposure_name = "SBP", outcome_name = "Stroke_eur", heterogeneity = True, odds = True)
530
+ ```
531
+
523
532
  As expected, many MR methods indicate that SBP is strongly associated with stroke, but there could be concerns for horizontal pleiotropy (instruments influencing the outcome through a different pathway than the one used as exposure) given the almost significant MR-Egger intercept p-value.
524
533
  To investigate horizontal pleiotropy in more details, a very useful method is Mendelian Randomization Pleiotropy RESidual Sum and Outlier (MR-PRESSO). MR-PRESSO is a method designed to detect and correct for horizontal pleiotropy. It will identify which instruments are likely to be pleiotropic on their effect on the outcome, and it will rerun an inverse-variance weighted MR after excluding them. It can be run using the `genal.Geno.MRpresso` method:
525
534
 
@@ -549,7 +558,7 @@ df_pheno = pd.read_csv("path/to/trait/data")
549
558
 
550
559
  > **Note:**
551
560
  >
552
- > One important point is to make sure that the IDs of the participants are identical in the phenotypic data and in the genetic data.
561
+ > One important point is to make sure that both the Family IDs (FID) and Individual IDs (IID) of the participants are identical in the phenotypic data and in the genetic data.
553
562
 
554
563
  Then, it is advised to make a copy of the `genal.Geno` instance containing our instruments as we are going to update their coefficients and to avoid any confusion:
555
564
 
@@ -560,7 +569,7 @@ SBP_adjusted = SBP_clumped.copy()
560
569
  We can then call the `genal.Geno.set_phenotype` method, specifying which column contains our trait of interest (for the association testing) and which column contains the individual IDs:
561
570
 
562
571
  ```python
563
- SBP_adjusted.set_phenotype(df_pheno, PHENO = "htn", IID = "IID")
572
+ SBP_adjusted.set_phenotype(df_pheno, PHENO = "htn", IID = "IID", FID = "FID")
564
573
  ```
565
574
 
566
575
  At this point, genal will identify if the phenotype is binary or quantitative in order to choose the appropriate regression model. If the phenotype is binary, it will assume that the most frequent value is coding for control (and the other value for case), this can be changed with `alternate_control = True`:
@@ -38,8 +38,11 @@ Genal draws on concepts from well-established R packages such as TwoSampleMR, MR
38
38
 
39
39
  Genal flowchart. Created in https://www.BioRender.com
40
40
  ## Citation <a name="citation"></a>
41
- If you're using genal, please cite the following paper:
42
- **Genal: A Python Toolkit for Genetic Risk Scoring and Mendelian Randomization.** Cyprien A. Rivier, Santiago Clocchiatti-Tuozzo, Shufan Huo, Victor Torres-Lopez, Daniela Renedo, Kevin N. Sheth, Guido J. Falcone, Julian N. Acosta. medRxiv 2024.05.23.24307776; doi: https://doi.org/10.1101/2024.05.23.24307776
41
+ If you're using genal, please cite the following paper:
42
+ **Genal: A Python Toolkit for Genetic Risk Scoring and Mendelian Randomization.**
43
+ Cyprien A. Rivier, Santiago Clocchiatti-Tuozzo, Shufan Huo, Victor Torres-Lopez, Daniela Renedo, Kevin N. Sheth, Guido J. Falcone, Julian N. Acosta.
44
+ Bioinformatics Advances 2024.
45
+ doi: https://doi.org/10.1093/bioadv/vbae207
43
46
 
44
47
  ## Requirements for the genal module <a name="paragraph1"></a>
45
48
  ***Python 3.8 or later***. https://www.python.org/ <br>
@@ -67,8 +70,8 @@ And import it in a python environment with:
67
70
  import genal
68
71
  ```
69
72
 
70
- The main genal functionalities require a working installation of PLINK v1.9 (and not 2.0 as certain functionalities have not been updated yet).
71
- If you have already installed plink v1.9, you can set the path to its executable with:
73
+ The main genal functionalities require a working installation of PLINK v2.0.
74
+ If you have already installed plink v2.0, you can set the path to its executable with:
72
75
 
73
76
  ```
74
77
  genal.set_plink(path="/path/to/plink/executable/file")
@@ -232,7 +235,7 @@ You do not need to obtain the 1000 genome reference panel yourself, genal will d
232
235
  SBP_Geno.preprocess_data(preprocessing = 'Fill_delete', reference_panel = "afr")
233
236
  ```
234
237
 
235
- You can also use a custom reference panel by specifying to the reference_panel argument a path to bed/bim/fam files (without the extension).
238
+ You can also use a custom reference panel by specifying to the reference_panel argument a path to bed/bim/fam (plink v1.9 format) or pgen/pvar/psam files (plink v2.0 format), without the extension.
236
239
 
237
240
  ### Clumping <a name="paragraph3.3"></a>
238
241
 
@@ -265,7 +268,7 @@ Computing a Polygenic Risk Score (PRS) can be done in one line with the `genal.G
265
268
  SBP_clumped.prs(name = "SBP_prs", path = "path/to/genetic/files")
266
269
  ```
267
270
 
268
- The genetic files of the target population can be either contained in one triple of bed/bim/fam files with information for all SNPs, or divided by chromosome (one bed/bim/fam triple for chr 1, another for chr 2, etc...). In the latter case, provide the path by replacing the chromosome number by `$` and genal will extract the necessary SNPs from each chromosome and merge them before running the PRS. For instance, if the genetic files are named `Pop_chr1.bed`, `Pop_chr1.bim`, `Pop_chr1.fam`, `Pop_chr2.bed`, ..., you can use:
271
+ The genetic files of the target population can be either contained in one triple of bed/bim/fam or pgen/pvar/psam files with information for all SNPs, or divided by chromosome (one bed/bim/fam or pgen/pvar/psam triple for chr 1, another for chr 2, etc...). In the latter case, provide the path by replacing the chromosome number by `$` and genal will extract the necessary SNPs from each chromosome and merge them before running the PRS. For instance, if the genetic files are named `Pop_chr1.bed`, `Pop_chr1.bim`, `Pop_chr1.fam`, `Pop_chr2.bed`, ..., you can use:
269
272
 
270
273
  ```python
271
274
  SBP_clumped.prs(name = "SBP_prs", path = "Pop_chr$")
@@ -496,6 +499,12 @@ And that will give:
496
499
  | SBP | Stroke_eur | Egger Intercept | 1499 | -0.001381 | 0.000813 | 8.935529e-02 | 2959.965136 | 1497 | 1.253763e-98 |
497
500
  | SBP | Stroke_eur | Inverse-Variance Weighted| 1499 | 0.023049 | 0.001061 | 1.382645e-104 | 2965.678836 | 1498 | 4.280737e-99 |
498
501
 
502
+ If you wish to display the coefficients as odds ratios with confidence intervals for a binary outcome trait, you can use the `odds = True` argument:
503
+
504
+ ```python
505
+ SBP_clumped.MR(action = 2, methods = ["Egger","IVW"], exposure_name = "SBP", outcome_name = "Stroke_eur", heterogeneity = True, odds = True)
506
+ ```
507
+
499
508
  As expected, many MR methods indicate that SBP is strongly associated with stroke, but there could be concerns for horizontal pleiotropy (instruments influencing the outcome through a different pathway than the one used as exposure) given the almost significant MR-Egger intercept p-value.
500
509
  To investigate horizontal pleiotropy in more details, a very useful method is Mendelian Randomization Pleiotropy RESidual Sum and Outlier (MR-PRESSO). MR-PRESSO is a method designed to detect and correct for horizontal pleiotropy. It will identify which instruments are likely to be pleiotropic on their effect on the outcome, and it will rerun an inverse-variance weighted MR after excluding them. It can be run using the `genal.Geno.MRpresso` method:
501
510
 
@@ -525,7 +534,7 @@ df_pheno = pd.read_csv("path/to/trait/data")
525
534
 
526
535
  > **Note:**
527
536
  >
528
- > One important point is to make sure that the IDs of the participants are identical in the phenotypic data and in the genetic data.
537
+ > One important point is to make sure that both the Family IDs (FID) and Individual IDs (IID) of the participants are identical in the phenotypic data and in the genetic data.
529
538
 
530
539
  Then, it is advised to make a copy of the `genal.Geno` instance containing our instruments as we are going to update their coefficients and to avoid any confusion:
531
540
 
@@ -536,7 +545,7 @@ SBP_adjusted = SBP_clumped.copy()
536
545
  We can then call the `genal.Geno.set_phenotype` method, specifying which column contains our trait of interest (for the association testing) and which column contains the individual IDs:
537
546
 
538
547
  ```python
539
- SBP_adjusted.set_phenotype(df_pheno, PHENO = "htn", IID = "IID")
548
+ SBP_adjusted.set_phenotype(df_pheno, PHENO = "htn", IID = "IID", FID = "FID")
540
549
  ```
541
550
 
542
551
  At this point, genal will identify if the phenotype is binary or quantitative in order to choose the appropriate regression model. If the phenotype is binary, it will assume that the most frequent value is coding for control (and the other value for case), this can be changed with `alternate_control = True`:
@@ -13,7 +13,7 @@ sys.path.insert(0, os.path.abspath('../../'))
13
13
  project = 'genal'
14
14
  copyright = '2023, Cyprien A. Rivier'
15
15
  author = 'Cyprien A. Rivier'
16
- release = 'v1.0'
16
+ release = 'v1.1'
17
17
 
18
18
 
19
19
  # -- General configuration ---------------------------------------------------
@@ -3,6 +3,10 @@
3
3
  You can adapt this file completely to your liking, but it should at least
4
4
  contain the root `toctree` directive.
5
5
 
6
+ .. image:: Images/genal_logo.png
7
+ :alt: genal_logo
8
+ :width: 400px
9
+
6
10
  genal: A Python Toolkit for Genetic Risk Scoring and Mendelian Randomization
7
11
  ============================================================================
8
12
 
@@ -47,7 +51,7 @@ If you use genal in your work, please cite the following paper:
47
51
 
48
52
  .. [Rivier.2024] *Genal: A Python Toolkit for Genetic Risk Scoring and Mendelian Randomization*
49
53
  Cyprien A. Rivier, Santiago Clocchiatti-Tuozzo, Shufan Huo, Victor Torres-Lopez, Daniela Renedo, Kevin N. Sheth, Guido J. Falcone, Julian N. Acosta.
50
- medRxiv. 2024 May `10.1101/2024.05.23.24307776 <https://doi.org/10.1101/2024.05.23.24307776>`_.
54
+ Bioinformatics Advances. 2024 December; `10.1093/bioadv/vbae207 <https://doi.org/10.1093/bioadv/vbae207>`_.
51
55
 
52
56
  References
53
57
  ----------
@@ -22,8 +22,8 @@ And import it in a python environment with:
22
22
 
23
23
  import genal
24
24
 
25
- The main genal functionalities require a working installation of PLINK v1.9 (and not 2.0 as certain functionalities have not been updated yet).
26
- If you have already installed plink v1.9, you can set the path to its executable with:
25
+ The main genal functionalities require a working installation of PLINK v2.0.
26
+ If you have already installed plink v2.0, you can set the path to its executable with:
27
27
 
28
28
  .. code-block:: python
29
29
 
@@ -39,6 +39,9 @@ If plink is not installed, genal can install the correct version for your system
39
39
  Tutorial
40
40
  ========
41
41
 
42
+ .. image:: Images/Genal_flowchart.png
43
+ :alt: Genal_flowchart
44
+
42
45
  For the purpose of this tutorial, we are going to build a PRS of systolic blood pressure (SBP) and investigate the genetically-determined effect of SBP on the risk of stroke. We will use both summary statistics from Genome-Wide Association Studies (GWAS) and individual-level data from the UK Biobank as our test population. We are going to go through the following steps:
43
46
 
44
47
  Table of contents
@@ -185,8 +188,7 @@ By default, the reference panel used is the European (EUR) one. You can specify
185
188
 
186
189
  SBP_Geno.preprocess_data(preprocessing='Fill_delete', reference_panel="afr")
187
190
 
188
- You can also use a custom reference panel by specifying the path to bed/bim/fam files (without the extension) in the ``reference_panel`` argument.
189
-
191
+ You can also use a custom reference panel by specifying to the reference_panel argument a path to bed/bim/fam (plink v1.9 format) or pgen/pvar/psam files (plink v2.0 format), without the extension.
190
192
 
191
193
  Clumping
192
194
  --------
@@ -221,7 +223,7 @@ Computing a Polygenic Risk Score (PRS) can be done in one line with the :meth:`~
221
223
 
222
224
  SBP_clumped.prs(name="SBP_prs", path="path/to/genetic/files")
223
225
 
224
- The genetic files of the target population can be either contained in one triple of bed/bim/fam files with information for all SNPs, or divided by chromosome (one bed/bim/fam triple for chr 1, another for chr 2, etc...). In the latter case, provide the path by replacing the chromosome number by ``$`` and genal will extract the necessary SNPs from each chromosome and merge them before running the PRS. For instance, if the genetic files are named ``Pop_chr1.bed``, ``Pop_chr1.bim``, ``Pop_chr1.fam``, ``Pop_chr2.bed``, ..., you can use:
226
+ The genetic files of the target population can be either contained in one triple of bed/bim/fam or pgen/pvar/psam files with information for all SNPs, or divided by chromosome (one bed/bim/fam or pgen/pvar/psam triple for chr 1, another for chr 2, etc...). In the latter case, provide the path by replacing the chromosome number by `$` and genal will extract the necessary SNPs from each chromosome and merge them before running the PRS. For instance, if the genetic files are named `Pop_chr1.bed`, `Pop_chr1.bim`, `Pop_chr1.fam`, `Pop_chr2.bed`, ..., you can use:
225
227
 
226
228
  .. code-block:: python
227
229
 
@@ -468,8 +470,12 @@ And that will give:
468
470
  1 SBP Stroke_eur Egger Intercept 1499 -0.001381 0.000813 8.935529e-02 2959.965136 1497 1.253763e-98
469
471
  2 SBP Stroke_eur Inverse-Variance Weighted 1499 0.023049 0.001061 1.382645e-104 2965.678836 1498 4.280737e-99
470
472
 
473
+ If you wish to display the coefficients as odds ratios with confidence intervals for a binary outcome trait, you can use the `odds = True` argument:
474
+
475
+ .. code-block:: python
476
+
477
+ SBP_clumped.MR(action=2, methods=["Egger","IVW"], exposure_name="SBP", outcome_name="Stroke_eur", heterogeneity=True, odds=True)
471
478
 
472
-
473
479
  As expected, many MR methods indicate that SBP is strongly associated with stroke, but there could be concerns for horizontal pleiotropy (instruments influencing the outcome through a different pathway than the one used as exposure) given the almost significant MR-Egger intercept p-value.
474
480
 
475
481
  To investigate horizontal pleiotropy in more detail, a very useful method is Mendelian Randomization Pleiotropy RESidual Sum and Outlier (MR-PRESSO).
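Reviewer note on the `odds=True` option added in the hunk above: the diff does not show how genal computes the odds ratios, so the following is only a minimal sketch of the standard conversion, assuming the reported coefficient is on the log-odds scale and that a normal-approximation 95% confidence interval is wanted. The helper name `log_odds_to_odds_ratio` is hypothetical and is not part of genal.

```python
import math

def log_odds_to_odds_ratio(beta, se, z=1.96):
    """Convert a log-odds estimate and its standard error to (OR, lower, upper).

    Hypothetical helper for illustration only; not genal's API.
    Assumes beta is a log-odds coefficient and uses a normal approximation.
    """
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return odds_ratio, lower, upper

# Inverse-Variance Weighted row from the results table in the hunk above:
# b = 0.023049, se = 0.001061
or_ivw, lower_ivw, upper_ivw = log_odds_to_odds_ratio(0.023049, 0.001061)
print(f"OR = {or_ivw:.4f} (95% CI {lower_ivw:.4f} to {upper_ivw:.4f})")
```

Under these assumptions the interval is symmetric on the log scale, not on the odds-ratio scale, which is why the bounds are exponentiated separately.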
@@ -506,7 +512,7 @@ Let's start by loading phenotypic data:
506
512
  df_pheno = pd.read_csv("path/to/trait/data")
507
513
 
508
514
  .. note::
509
- One important point is to make sure that the IDs of the participants are identical in the phenotypic data and in the genetic data.
515
+ One important point is to make sure that both the Family IDs (FID) and Individual IDs (IID) of the participants are identical in the phenotypic data and in the genetic data.
510
516
 
511
517
  Then, it is advised to make a copy of the :class:`~genal.Geno` instance containing our instruments as we are going to update their coefficients and to avoid any confusion:
512
518
 
@@ -518,7 +524,7 @@ We can then call the :meth:`~genal.Geno.set_phenotype` method, specifying which
518
524
 
519
525
  .. code-block:: python
520
526
 
521
- SBP_adjusted.set_phenotype(df_pheno, PHENO="htn", IID="IID")
527
+ SBP_adjusted.set_phenotype(df_pheno, PHENO="htn", IID="IID", FID="FID")
522
528
 
523
529
  At this point, genal will identify if the phenotype is binary or quantitative in order to choose the appropriate regression model. If the phenotype is binary, it will assume that the most frequent value is coding for control (and the other value for case), this can be changed with ``alternate_control=True``::
524
530
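Reviewer note on the new `FID` argument of `set_phenotype`: since 1.1 asks for both Family IDs and Individual IDs to match between the phenotypic and genetic data, it may help to check the overlap before calling the method. The sketch below is an assumption-laden illustration using pandas only; `unmatched_samples` is a hypothetical helper, not a genal function, and the genetic sample table is assumed to have already been loaded into a DataFrame with `FID` and `IID` columns (for a plink v1.9 `.fam` file, these are the first two whitespace-separated columns).

```python
import pandas as pd

def unmatched_samples(df_pheno, df_geno, fid_col="FID", iid_col="IID"):
    """Return the (FID, IID) pairs of df_pheno that are absent from df_geno.

    Hypothetical helper for illustration only; not part of genal.
    IDs are compared as strings so that 1 and "1" count as the same ID.
    """
    keys = [fid_col, iid_col]
    pheno_ids = df_pheno[keys].astype(str)
    geno_ids = df_geno[keys].astype(str).drop_duplicates()
    # A left merge with indicator=True marks rows found only in the phenotype data.
    merged = pheno_ids.merge(geno_ids, on=keys, how="left", indicator=True)
    return merged.loc[merged["_merge"] == "left_only", keys].reset_index(drop=True)

# Toy data: participant 3 has a phenotype but no genetic record.
df_pheno = pd.DataFrame({"FID": [1, 2, 3], "IID": [1, 2, 3], "htn": [0, 1, 0]})
df_geno = pd.DataFrame({"FID": ["1", "2", "4"], "IID": ["1", "2", "4"]})

missing = unmatched_samples(df_pheno, df_geno)
print(missing)
```

An empty result would suggest that every phenotype row has a matching (FID, IID) pair in the genetic data; how genal itself handles unmatched participants is not shown in this diff.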