PyPI - genal-python - Versions diffs - 1.5.0__tar.gz → 1.5.1__tar.gz - Mend

genal-python 1.5.0__tar.gz → 1.5.1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (45)

{genal_python-1.5.0 → genal_python-1.5.1}/.gitignore RENAMED

@@ -1,14 +1,18 @@
+.DS_Store
 __pycache__/
+.pytest_cache/
+.coverage
+htmlcov/
 dist/
 .ipynb_checkpoints/
 ipynb_checkpoints/
 genal/.ipynb_checkpoints/
 test_data/
 cursor/
-tests/
 tmp_GENAL/
 docs/build/
 docs/_build/
 REVIEW_REPORT.md
 TASKS.md
 code_concatenated
+tests/

genal_python-1.5.0/README.md → genal_python-1.5.1/PKG-INFO RENAMED

@@ -1,3 +1,35 @@
+Metadata-Version: 2.4
+Name: genal-python
+Version: 1.5.1
+Summary: A python toolkit for polygenic risk scoring and mendelian randomization.
+Author-email: Cyprien Rivier <riviercyprien@gmail.com>
+Requires-Python: >=3.8
+Description-Content-Type: text/markdown
+Classifier: Programming Language :: Python :: 3
+Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
+Classifier: Operating System :: OS Independent
+License-File: LICENSE
+Requires-Dist: aiohttp>=3.7
+Requires-Dist: nest_asyncio>=1.5
+Requires-Dist: numpy>=1.17.3
+Requires-Dist: pandas>=1.0
+Requires-Dist: plotnine>=0.9
+Requires-Dist: psutil>=5.0
+Requires-Dist: requests>=2.0
+Requires-Dist: pyliftover>=0.4
+Requires-Dist: scikit_learn>=0.24
+Requires-Dist: scipy>=1.7,<1.13
+Requires-Dist: statsmodels>=0.13,<0.15
+Requires-Dist: tqdm>=4.38
+Requires-Dist: wget>=3.0
+Requires-Dist: fastparquet>=0.4
+Requires-Dist: pyarrow>=3.0
+Requires-Dist: pytest>=7.0 ; extra == "test"
+Requires-Dist: pytest-cov ; extra == "test"
+Requires-Dist: pytest-xdist ; extra == "test"
+Project-URL: Home, https://github.com/CypRiv/genal
+Provides-Extra: test
 <div align="center">
   <img src="genal_logo.png" height="80" alt="genal logo" />
   <h1>genal</h1>
@@ -123,7 +155,7 @@ G_instruments.query_outcome(G_outcome, proxy=True, reference_panel="EUR_37")
 mr_results = G_instruments.MR(action=2, heterogeneity=True, odds=False)
 # 5.5) Plot MR results
-G_instruments.MR_plot(filename="mr_scatter")
+G_instruments.MR_plot(filename="mr_scatter", figure_size=(10, 6))
 ```
 ## Core concept: the `Geno` object
@@ -133,6 +165,7 @@ G_instruments.MR_plot(filename="mr_scatter")
 - `G.phenotype`: stored after `G.set_phenotype(...)` (phenotype DataFrame + metadata)
 - `G.MR_data`: stored after `G.query_outcome(...)` (exposure/outcome association tables used by MR)
 - `G.MR_results`: stored after `G.MR(...)` (results table + harmonized SNP table; used by plotting)
+- `G.MRpresso_subset_data`: stored after `G.MRpresso(...)` (outlier-removed harmonized table)
 Most methods either:
 - return a **new `Geno`** object (e.g., `clump()`), or
@@ -147,6 +180,7 @@ Most methods either:
 - **Align rsIDs to a target genotype dataset**: `Geno.update_snpids(path=..., replace=...)`
 - **Extract genotype subset**: `Geno.extract_snps(path=...)` → writes extracted files under `tmp_GENAL/`
 - **Two-sample MR pipeline**: `Geno.query_outcome(...)` → `Geno.MR(...)` (+ `MR_plot`)
+- **Leave-one-out MR**: `Geno.MR_loo(...)` → `Geno.MR_loo_plot(...)` (identify influential variants)
 - **MR-PRESSO**: `Geno.MRpresso(...)` (parallel; outlier + distortion tests)
 - **Colocalization**: `Geno.colocalize(...)` (approx Bayes factors; returns posterior probabilities)
 - **Association testing (individual-level)**: `Geno.set_phenotype(...)` → `Geno.association_test(...)`
@@ -298,7 +332,17 @@ About `action` (palindromic SNP handling during harmonization):
 Plot the MR scatter:
 ```python
-G_clumped.MR_plot(filename="mr_scatter")
+G_clumped.MR_plot(filename="mr_scatter", figure_size=(10, 6))
+```
+You can also draw a funnel plot of single-SNP ratio estimates (Wald ratios):
+```python
+G_clumped.MR_funnel(
+    methods=["IVW", "WM", "Egger"],  # vertical reference lines (optional)
+    filename="mr_funnel",
+    figure_size=(10, 6),
+)
 ```
 ### 6) Sensitivity: MR-PRESSO
@@ -311,7 +355,16 @@ mod_table, GlobalTest, OutlierTest, BiasTest = G_clumped.MRpresso(
     cpus=-1,   # use all CPU cores
 )
 ```
-If you want to rerun all MR methods after removing outliers with MR-PRESSO, you can use the `use_mrpresso_data=True` argument in `MR()`:
+To highlight MR-PRESSO outliers on plots, pass `use_mrpresso_data=True` (outliers are colored in red):
+```python
+G_clumped.MR_plot(filename="mr_scatter_mrpresso", figure_size=(10, 6), use_mrpresso_data=True)
+G_clumped.MR_funnel(filename="mr_funnel_mrpresso", figure_size=(10, 6), use_mrpresso_data=True)
+G_clumped.MR_loo_plot(filename="loo_forest_mrpresso", figure_size=(10, 8), use_mrpresso_data=True)
+```
+If you want to rerun MR methods after removing outliers with MR-PRESSO, you can use the `use_mrpresso_data=True` argument in `MR()`:
 ```python
 res = G_clumped.MR(
     action=2,
@@ -325,7 +378,29 @@ res = G_clumped.MR(
 res
 ```
-### 7) Single-SNP association tests (individual-level data)
+### 7) Sensitivity: Leave-One-Out MR
+Leave-one-out MR helps identify influential variants that may be driving the causal estimate.
+```python
+# Run leave-one-out analysis (default uses IVW)
+loo_results = G_clumped.MR_loo(method="IVW", heterogeneity=False, odds=False)
+```
+Visualize the results with a forest plot:
+```python
+# Default: show top influential instruments
+G_clumped.MR_loo_plot(filename="loo_forest", figure_size=(10, 8))
+```
+Tips:
+- `MR_loo` accepts the same `action`, `use_mrpresso_data`, and method parameters as `MR`.
+- `MR_loo_plot` supports `top_influential=True` (default) for a compact figure showing the most influential SNPs, or `top_influential=False` for paginated output with all instruments.
+- `MR_loo_plot(..., use_mrpresso_data=True)` colors MR-PRESSO outliers in red (requires running `MRpresso()` first). When outliers exist, an extra summary row ("MR-PRESSO corrected") is added using the same MR method as the leave-one-out analysis.
+- `MR_loo_plot(..., methods=["WM", "Egger"])` adds extra overall estimates for the requested methods (computed on all instruments).
+- Set `odds=True` in `MR_loo` if you want odds ratio scaling on the plot.
+### 8) Single-SNP association tests (individual-level data)
 Use individual-level data to re-estimate SNP–trait effects in a specific cohort (e.g., different ancestry, different measurement protocol).
@@ -354,7 +429,7 @@ G_adj.association_test(
 This updates `G_adj.data[["BETA","SE","P"]]` with cohort-specific estimates and recomputes `FSTAT` to be consistent with the updated values.
-### 8) Lift to a different build
+### 9) Lift to a different build
 Lift variants between builds (e.g., hg19 → hg38):
@@ -365,7 +440,7 @@ lifted.head()
 For large datasets, you can provide a UCSC LiftOver executable via `liftover_path`.
-### 9) Query the GWAS Catalog
+### 10) Query the GWAS Catalog
 Attach a per-SNP list of associated traits using the GWAS Catalog API:
@@ -520,3 +595,4 @@ If you use methods derived from other packages (e.g., MR-PRESSO), please also ci
 ## License
 GPL-3.0-or-later (see `LICENSE`).

genal_python-1.5.0/PKG-INFO → genal_python-1.5.1/README.md RENAMED

@@ -1,31 +1,3 @@
-Metadata-Version: 2.4
-Name: genal-python
-Version: 1.5.0
-Summary: A python toolkit for polygenic risk scoring and mendelian randomization.
-Author-email: Cyprien Rivier <riviercyprien@gmail.com>
-Requires-Python: >=3.8
-Description-Content-Type: text/markdown
-Classifier: Programming Language :: Python :: 3
-Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
-Classifier: Operating System :: OS Independent
-License-File: LICENSE
-Requires-Dist: aiohttp>=3.7
-Requires-Dist: nest_asyncio>=1.5
-Requires-Dist: numpy>=1.17.3
-Requires-Dist: pandas>=1.0
-Requires-Dist: plotnine>=0.9
-Requires-Dist: psutil>=5.0
-Requires-Dist: requests>=2.0
-Requires-Dist: pyliftover>=0.4
-Requires-Dist: scikit_learn>=0.24
-Requires-Dist: scipy>=1.7,<1.13
-Requires-Dist: statsmodels>=0.13,<0.15
-Requires-Dist: tqdm>=4.38
-Requires-Dist: wget>=3.0
-Requires-Dist: fastparquet>=0.4
-Requires-Dist: pyarrow>=3.0
-Project-URL: Home, https://github.com/CypRiv/genal
 <div align="center">
   <img src="genal_logo.png" height="80" alt="genal logo" />
   <h1>genal</h1>
@@ -151,7 +123,7 @@ G_instruments.query_outcome(G_outcome, proxy=True, reference_panel="EUR_37")
 mr_results = G_instruments.MR(action=2, heterogeneity=True, odds=False)
 # 5.5) Plot MR results
-G_instruments.MR_plot(filename="mr_scatter")
+G_instruments.MR_plot(filename="mr_scatter", figure_size=(10, 6))
 ```
 ## Core concept: the `Geno` object
@@ -161,6 +133,7 @@ G_instruments.MR_plot(filename="mr_scatter")
 - `G.phenotype`: stored after `G.set_phenotype(...)` (phenotype DataFrame + metadata)
 - `G.MR_data`: stored after `G.query_outcome(...)` (exposure/outcome association tables used by MR)
 - `G.MR_results`: stored after `G.MR(...)` (results table + harmonized SNP table; used by plotting)
+- `G.MRpresso_subset_data`: stored after `G.MRpresso(...)` (outlier-removed harmonized table)
 Most methods either:
 - return a **new `Geno`** object (e.g., `clump()`), or
@@ -175,6 +148,7 @@ Most methods either:
 - **Align rsIDs to a target genotype dataset**: `Geno.update_snpids(path=..., replace=...)`
 - **Extract genotype subset**: `Geno.extract_snps(path=...)` → writes extracted files under `tmp_GENAL/`
 - **Two-sample MR pipeline**: `Geno.query_outcome(...)` → `Geno.MR(...)` (+ `MR_plot`)
+- **Leave-one-out MR**: `Geno.MR_loo(...)` → `Geno.MR_loo_plot(...)` (identify influential variants)
 - **MR-PRESSO**: `Geno.MRpresso(...)` (parallel; outlier + distortion tests)
 - **Colocalization**: `Geno.colocalize(...)` (approx Bayes factors; returns posterior probabilities)
 - **Association testing (individual-level)**: `Geno.set_phenotype(...)` → `Geno.association_test(...)`
@@ -326,7 +300,17 @@ About `action` (palindromic SNP handling during harmonization):
 Plot the MR scatter:
 ```python
-G_clumped.MR_plot(filename="mr_scatter")
+G_clumped.MR_plot(filename="mr_scatter", figure_size=(10, 6))
+```
+You can also draw a funnel plot of single-SNP ratio estimates (Wald ratios):
+```python
+G_clumped.MR_funnel(
+    methods=["IVW", "WM", "Egger"],  # vertical reference lines (optional)
+    filename="mr_funnel",
+    figure_size=(10, 6),
+)
 ```
 ### 6) Sensitivity: MR-PRESSO
@@ -339,7 +323,16 @@ mod_table, GlobalTest, OutlierTest, BiasTest = G_clumped.MRpresso(
     cpus=-1,   # use all CPU cores
 )
 ```
-If you want to rerun all MR methods after removing outliers with MR-PRESSO, you can use the `use_mrpresso_data=True` argument in `MR()`:
+To highlight MR-PRESSO outliers on plots, pass `use_mrpresso_data=True` (outliers are colored in red):
+```python
+G_clumped.MR_plot(filename="mr_scatter_mrpresso", figure_size=(10, 6), use_mrpresso_data=True)
+G_clumped.MR_funnel(filename="mr_funnel_mrpresso", figure_size=(10, 6), use_mrpresso_data=True)
+G_clumped.MR_loo_plot(filename="loo_forest_mrpresso", figure_size=(10, 8), use_mrpresso_data=True)
+```
+If you want to rerun MR methods after removing outliers with MR-PRESSO, you can use the `use_mrpresso_data=True` argument in `MR()`:
 ```python
 res = G_clumped.MR(
     action=2,
@@ -353,7 +346,29 @@ res = G_clumped.MR(
 res
 ```
-### 7) Single-SNP association tests (individual-level data)
+### 7) Sensitivity: Leave-One-Out MR
+Leave-one-out MR helps identify influential variants that may be driving the causal estimate.
+```python
+# Run leave-one-out analysis (default uses IVW)
+loo_results = G_clumped.MR_loo(method="IVW", heterogeneity=False, odds=False)
+```
+Visualize the results with a forest plot:
+```python
+# Default: show top influential instruments
+G_clumped.MR_loo_plot(filename="loo_forest", figure_size=(10, 8))
+```
+Tips:
+- `MR_loo` accepts the same `action`, `use_mrpresso_data`, and method parameters as `MR`.
+- `MR_loo_plot` supports `top_influential=True` (default) for a compact figure showing the most influential SNPs, or `top_influential=False` for paginated output with all instruments.
+- `MR_loo_plot(..., use_mrpresso_data=True)` colors MR-PRESSO outliers in red (requires running `MRpresso()` first). When outliers exist, an extra summary row ("MR-PRESSO corrected") is added using the same MR method as the leave-one-out analysis.
+- `MR_loo_plot(..., methods=["WM", "Egger"])` adds extra overall estimates for the requested methods (computed on all instruments).
+- Set `odds=True` in `MR_loo` if you want odds ratio scaling on the plot.
+### 8) Single-SNP association tests (individual-level data)
 Use individual-level data to re-estimate SNP–trait effects in a specific cohort (e.g., different ancestry, different measurement protocol).
@@ -382,7 +397,7 @@ G_adj.association_test(
 This updates `G_adj.data[["BETA","SE","P"]]` with cohort-specific estimates and recomputes `FSTAT` to be consistent with the updated values.
-### 8) Lift to a different build
+### 9) Lift to a different build
 Lift variants between builds (e.g., hg19 → hg38):
@@ -393,7 +408,7 @@ lifted.head()
 For large datasets, you can provide a UCSC LiftOver executable via `liftover_path`.
-### 9) Query the GWAS Catalog
+### 10) Query the GWAS Catalog
 Attach a per-SNP list of associated traits using the GWAS Catalog API:
@@ -548,4 +563,3 @@ If you use methods derived from other packages (e.g., MR-PRESSO), please also ci
 ## License
 GPL-3.0-or-later (see `LICENSE`).
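
Note on the funnel plot added to the README in this release: the diff describes it as a plot of single-SNP ratio estimates (Wald ratios). As a rough illustration of what such a plot draws, the sketch below computes one Wald ratio per SNP and a precision value for the vertical axis. The first-order standard error `se_out / |beta_exp|` and the function name `wald_ratios` are assumptions made for this example; the diff does not show how `MR_funnel` computes its axes.

```python
import numpy as np


def wald_ratios(beta_exp, beta_out, se_out):
    """Per-SNP Wald ratio and its precision (hypothetical helper, not genal's API)."""
    beta_exp = np.asarray(beta_exp, dtype=float)
    beta_out = np.asarray(beta_out, dtype=float)
    se_out = np.asarray(se_out, dtype=float)

    ratio = beta_out / beta_exp            # single-SNP causal estimate
    se_ratio = se_out / np.abs(beta_exp)   # first-order approximation (assumption)
    precision = 1.0 / se_ratio             # funnel plots usually put precision on the y-axis
    return ratio, precision


# Toy summary statistics (made up for illustration).
ratio, precision = wald_ratios(
    beta_exp=[0.1, 0.2, -0.4],
    beta_out=[0.05, 0.08, -0.22],
    se_out=[0.01, 0.02, 0.02],
)
```

Plotting `ratio` against `precision` gives the funnel shape; strong asymmetry around an overall estimate such as IVW is the usual visual hint of directional pleiotropy.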

{genal_python-1.5.0 → genal_python-1.5.1}/docs/source/concepts.md RENAMED

@@ -10,6 +10,7 @@ Key attributes (you don't need to manipulate these directly):
 - `G.phenotype`: set by {py:meth}`genal.Geno.set_phenotype` (phenotype DataFrame + metadata)
 - `G.MR_data`: set by {py:meth}`genal.Geno.query_outcome` (exposure/outcome tables used by MR)
 - `G.MR_results`: set by {py:meth}`genal.Geno.MR` (results table + harmonized SNP table; used by plotting)
+- `G.MR_loo_results`: set by {py:meth}`genal.Geno.MR_loo` (leave-one-out results tuple; used by `MR_loo_plot`)
 - `G.MRpresso_results` / `G.MRpresso_subset_data`: set by {py:meth}`genal.Geno.MRpresso`
 ## Standard columns
@@ -38,7 +39,7 @@ This is a *practical guide*, not an exhaustive contract. When a method can work
 | {py:meth}`genal.Geno.clump` | `SNP`, `P` | LD clumping via PLINK; returns a new `Geno` (or `None` if nothing passes). |
 | {py:meth}`genal.Geno.prs` | `EA`, `BETA`, plus `SNP (or CHR+POS)` | If `CHR+POS` are available, genal will prefer position-based matching to your genotype dataset to reduce ID-mismatch losses. |
 | {py:meth}`genal.Geno.query_outcome` | `SNP`, `EA`, `NEA`, `BETA`, `SE` (exposure and outcome) | Outcome querying is rsID-based; proxy search is optional. If you plan to use `action=2` later, `EAF` in both datasets is strongly recommended. |
-| {py:meth}`genal.Geno.MR` / {py:meth}`genal.Geno.MRpresso` | `MR_data` | Both consume `MR_data` produced by `query_outcome()`. |
+| {py:meth}`genal.Geno.MR` / {py:meth}`genal.Geno.MR_loo` / {py:meth}`genal.Geno.MRpresso` | `MR_data` | All consume `MR_data` produced by `query_outcome()`. |
 | {py:meth}`genal.Geno.colocalize` | `BETA`, `SE`, plus `CHR+POS` (preferred) **or** `SNP` (in both datasets) | If `EA/NEA` are present in both datasets, effects are allele-aligned; otherwise results assume both GWAS use the same reference allele. For quantitative traits, provide `sdY` or (`EAF` + `n`) to avoid the default `sdY=1` assumption. |
 | {py:meth}`genal.Geno.update_eaf` | `EA`, plus `CHR+POS` **or** `SNP` | Uses PLINK to compute allele frequencies from a reference panel; coordinate-based matching is faster when available. |
 | {py:meth}`genal.Geno.filter_by_gene` / {py:meth}`genal.Geno.lift` | `CHR`, `POS` | Genomic coordinate operations. |
@@ -65,8 +66,11 @@ A helpful mental framework:
 | `association_test()` | `None` | runs PLINK `--glm`; mutates `G.data` (`BETA/SE/P`) |
 | `query_outcome()` | `None` | sets `G.MR_data` (exposure/outcome tables used by MR) |
 | `MR()` | `pd.DataFrame` | sets `G.MR_results` and returns the results table |
-| `MR_plot()` | plot object | requires `G.MR_results`; writes `.png` if `filename=...` |
-| `MRpresso()` | tuple | sets `G.MRpresso_results` and `G.MRpresso_subset_data` (outlier-removed harmonized table) |
+| `MR_plot()` | plot object | requires `G.MR_results`; writes `.png` if `filename=...`; supports `use_mrpresso_data=True` for outlier highlighting |
+| `MR_funnel()` | plot object | requires `G.MR_results`; writes `.png` if `filename=...`; supports `use_mrpresso_data=True` for outlier highlighting |
+| `MR_loo()` | `pd.DataFrame` | sets `G.MR_loo_results` and returns the LOO results table |
+| `MR_loo_plot()` | plot object(s) | requires `G.MR_loo_results`; writes `.png` if `filename=...`; may return a list for multi-page output; supports `methods=[...]` overall rows and `use_mrpresso_data=True` for outlier highlighting |
+| `MRpresso()` | tuple | sets `G.MRpresso_results` and `G.MRpresso_subset_data` (outlier-removed harmonized table; SNP-indexed) |
 | `prs()` | `None` | writes `<name>.csv` and uses PLINK temp files |
 | `query_gwas_catalog()` | `pd.DataFrame` | adds an `ASSOC` column (network-bound); `replace=True` overwrites `G.data` |
 | `filter_by_gene(replace=False)` | `Geno` | returns a new `Geno` filtered to a locus |
@@ -80,7 +84,7 @@ Be aware of these common side effects:
 - `~/.genal/config.json` is created/updated as you configure PLINK, reference folders, or default genotype paths.
 - `tmp_GENAL/` is used as a scratch directory for PLINK commands and is **not** automatically deleted.
-- Some methods generate output files in your current directory (notably `prs()`, and plot saving in `MR_plot()`).
+- Some methods generate output files in your current directory (notably `prs()`, and plot saving in `MR_plot()`, `MR_funnel()`, and `MR_loo_plot()`).
 ## Resource usage (`ram`, `cpus`)

{genal_python-1.5.0 → genal_python-1.5.1}/docs/source/methods.md RENAMED

@@ -132,6 +132,30 @@ The bandwidth uses a modified Silverman rule multiplied by the user-provided fac
 The sign method tests whether exposure and outcome effects tend to have the same sign across variants. `genal` performs a binomial test against the null of 50% sign agreement.
+## Leave-one-out MR
+Implementation: {py:func}`genal.MR_tools.MR_loo_func`, wrapped by {py:meth}`genal.Geno.MR_loo`.
+Leave-one-out MR iterates over all instruments, sequentially removing each SNP and re-estimating the causal effect using the remaining instruments. This identifies variants that have a disproportionate influence on the overall estimate.
+For each SNP $i$ in the instrument set:
+1. Remove SNP $i$ from the harmonized data.
+2. Re-run the selected MR method on the remaining $J-1$ SNPs.
+3. Store the resulting estimate $\hat{\theta}_{-i}$.
+The "influence" of SNP $i$ is defined as:
+```{math}
+\text{influence}_i = \left| \hat{\theta}_{-i} - \hat{\theta}_{\text{all}} \right|
+```
+where $\hat{\theta}_{\text{all}}$ is the estimate using all instruments.
+Notes:
+- Any single MR method can be used (IVW, Egger, weighted median, mode-based, etc.).
+- At least 3 instruments are required (so that each LOO subset has ≥2 instruments).
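
Note on the leave-one-out section added above: the three steps and the influence definition can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in `genal.MR_tools.MR_loo_func`. It assumes a fixed-effect IVW estimator with weights `1 / se_out**2`, and the names `ivw_estimate` and `leave_one_out` are hypothetical.

```python
import numpy as np


def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect IVW: weighted regression of beta_out on beta_exp through the origin."""
    w = 1.0 / se_out**2
    return np.sum(w * beta_exp * beta_out) / np.sum(w * beta_exp**2)


def leave_one_out(beta_exp, beta_out, se_out):
    """Return the all-SNP estimate, the per-SNP LOO estimates, and |theta_-i - theta_all|."""
    beta_exp = np.asarray(beta_exp, dtype=float)
    beta_out = np.asarray(beta_out, dtype=float)
    se_out = np.asarray(se_out, dtype=float)

    n = len(beta_exp)
    if n < 3:
        # Mirrors the documented requirement: each LOO subset needs >= 2 instruments.
        raise ValueError("at least 3 instruments are required")

    theta_all = ivw_estimate(beta_exp, beta_out, se_out)
    theta_loo = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # drop SNP i, keep the other n - 1
        theta_loo[i] = ivw_estimate(beta_exp[keep], beta_out[keep], se_out[keep])

    influence = np.abs(theta_loo - theta_all)
    return theta_all, theta_loo, influence


# Toy harmonized data (made up): the last SNP sits off the common slope of 0.5.
beta_exp = [0.1, 0.2, 0.3, 0.4]
beta_out = [0.05, 0.10, 0.15, 0.40]
se_out = [0.01, 0.01, 0.01, 0.01]

theta_all, theta_loo, influence = leave_one_out(beta_exp, beta_out, se_out)
most_influential = int(np.argmax(influence))
```

In this toy data, removing the last SNP brings the estimate back to the slope shared by the other three, so it has the largest influence value.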
 ## MR-PRESSO (summary)
 Implementation: {py:func}`genal.MRpresso.mr_presso`, wrapped by {py:meth}`genal.Geno.MRpresso`.
@@ -144,11 +168,26 @@ At a high level, `genal`'s MR-PRESSO implementation:
   - an outlier test (per-variant p-values, Bonferroni-corrected),
   - an optional distortion test (whether the causal estimate changes materially after removing outliers).
+### Distortion test
+The distortion test assesses whether detected outliers materially bias the causal estimate. `genal` implements the following version:
+1. **Observed distortion**: $D_\text{obs} = (\hat{\theta}_\text{all} - \hat{\theta}_\text{no outliers}) / |\hat{\theta}_\text{no outliers}|$
+2. **Expected distortion null**: bootstrap resampling is performed *exclusively on the non-outlier subset*. For each iteration:
+   - Sample with replacement $J-k$ SNPs from the non-outlier data (where $k$ is the number of detected outliers).
+   - Fit the IVW model on the sampled data and record $\hat{\theta}_\text{exp}$.
+   - Compute $D_\text{exp} = (\hat{\theta}_\text{all} - \hat{\theta}_\text{exp}) / |\hat{\theta}_\text{exp}|$.
+3. **P-value**: $p = \text{mean}(|D_\text{exp}| > |D_\text{obs}|)$.
+This differs from the original MR-PRESSO R implementation, which in some cases samples from the full dataset (including outliers) for the expected-bias regressions and was inconsistent with the paper's description.
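
Note on the distortion test described above: the three steps translate fairly directly into a bootstrap loop. The sketch below follows the documented formulas for $D_\text{obs}$, $D_\text{exp}$, and the p-value, resampling only the non-outlier SNPs. It is not the code in `genal.MRpresso`. The fixed-effect IVW weights, the function names, and the `n_boot` and `seed` parameters are assumptions made for this example.

```python
import numpy as np


def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect IVW through the origin (assumed estimator for this sketch)."""
    w = 1.0 / se_out**2
    return np.sum(w * beta_exp * beta_out) / np.sum(w * beta_exp**2)


def distortion_test(beta_exp, beta_out, se_out, outlier_mask, n_boot=1000, seed=0):
    """Return (D_obs, p) using bootstrap resampling of the non-outlier subset only."""
    rng = np.random.default_rng(seed)
    beta_exp = np.asarray(beta_exp, dtype=float)
    beta_out = np.asarray(beta_out, dtype=float)
    se_out = np.asarray(se_out, dtype=float)
    keep = ~np.asarray(outlier_mask, dtype=bool)

    theta_all = ivw_estimate(beta_exp, beta_out, se_out)
    bx, by, se = beta_exp[keep], beta_out[keep], se_out[keep]
    theta_clean = ivw_estimate(bx, by, se)
    d_obs = (theta_all - theta_clean) / abs(theta_clean)

    m = len(bx)  # J - k non-outlier SNPs
    d_exp = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, m, size=m)  # sample J - k SNPs with replacement
        theta_exp = ivw_estimate(bx[idx], by[idx], se[idx])
        d_exp[b] = (theta_all - theta_exp) / abs(theta_exp)

    p = float(np.mean(np.abs(d_exp) > abs(d_obs)))
    return d_obs, p


# Toy data (made up): the last SNP is flagged as the single outlier.
d_obs, p = distortion_test(
    beta_exp=[0.1, 0.2, 0.3, 0.4, 0.5],
    beta_out=[0.06, 0.09, 0.16, 0.19, 0.60],
    se_out=[0.02, 0.02, 0.02, 0.02, 0.02],
    outlier_mask=[False, False, False, False, True],
)
```

Because the null distribution is built from the outlier-free subset, a small `p` indicates that the shift caused by the outliers is larger than what resampling noise among the remaining instruments would produce.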
+### Output structure
 `Geno.MRpresso()` returns four objects:
 - `mod_table`: a small results table (`Raw` and `Outlier-corrected` rows; IVW model),
 - `GlobalTest`: RSS and global p-value,
-- `OutlierTest`: per-variant outlier p-values (empty if the global test is not significant),
-- `BiasTest`: distortion test result dictionary (empty if distortion test was not run).
+- `OutlierTest`: per-variant outlier p-values (empty if the global test is not significant); SNP IDs as row labels,
+- `BiasTest`: distortion test result dictionary containing `"outliers_indices"` (SNP IDs), `"distortion_test_coefficient"`, and `"distortion_test_p"` (empty if distortion test was not run).
 If outliers are found, `genal` stores the outlier-removed harmonized table and allows rerunning MR with `Geno.MR(use_mrpresso_data=True)`.

{genal_python-1.5.0 → genal_python-1.5.1}/docs/source/workflows.md RENAMED

@@ -175,10 +175,20 @@ Key arguments you commonly tune:
 After `MR()`, you can generate a scatter plot with method lines:
 ```python
-G_instruments.MR_plot(filename="mr_scatter")  # saves mr_scatter.png
+G_instruments.MR_plot(filename="mr_scatter", figure_size=(10, 6))  # saves mr_scatter.png
 ```
-## 6) MR-PRESSO (outlier detection and distortion testing)
+You can also draw a funnel plot of single-SNP ratio estimates (Wald ratios):
+```python
+G_instruments.MR_funnel(
+    methods=["IVW", "WM", "Egger"],  # vertical reference lines (optional)
+    filename="mr_funnel",
+    figure_size=(10, 6),
+)
+```
+## 6a) MR-PRESSO (outlier detection and distortion testing)
 {py:meth}`genal.Geno.MRpresso` runs a parallel MR-PRESSO implementation.
@@ -196,14 +206,71 @@ What you typically tune:
 - `significance_p`: threshold for global/outlier tests.
 - `outlier_test` / `distortion_test`: disable if you only want the global test.
+Output structure:
+- `OutlierTest`: DataFrame with per-SNP outlier p-values; SNP identifiers (rsIDs) are used as row labels (not numeric indices).
+- `BiasTest`: dictionary containing `"outliers_indices"` (list of SNP IDs), `"distortion_test_coefficient"`, and `"distortion_test_p"`.
 If outliers are found, you can rerun MR using the outlier-removed subset:
 ```python
 res_no_outliers = G_instruments.MR(use_mrpresso_data=True)
 ```
+To highlight MR-PRESSO outliers on plots, pass `use_mrpresso_data=True` (outliers are colored in red and shown in the legend):
+```python
+G_instruments.MR_plot(filename="mr_scatter_mrpresso", figure_size=(10, 6), use_mrpresso_data=True)
+G_instruments.MR_funnel(filename="mr_funnel_mrpresso", figure_size=(10, 6), use_mrpresso_data=True)
+G_instruments.MR_loo_plot(filename="loo_forest_mrpresso", figure_size=(10, 8), use_mrpresso_data=True)
+```
 See {doc}`methods` for algorithm details and outputs.
+## 6b) Leave-one-out MR (sensitivity analysis)
+{py:meth}`genal.Geno.MR_loo` iteratively removes each SNP and re-estimates the causal effect. This helps identify influential variants that may be driving the overall result.
+```python
+loo_df = G_instruments.MR_loo(
+    method="IVW",        # any single MR method key (see MR method map)
+    action=2,
+    heterogeneity=False, # set True to include Q statistics
+    odds=False,          # set True for OR-scale output
+)
+```
+Key arguments:
+- `method`: a single MR method key (e.g., `"IVW"`, `"Egger"`, `"WM"`); must not be `"all"`.
+- `use_mrpresso_data=True`: use the outlier-removed dataset from MR-PRESSO instead of all instruments.
+### Visualizing leave-one-out results
+{py:meth}`genal.Geno.MR_loo_plot` creates a forest plot from the stored `MR_loo_results`:
+```python
+# Default: show top influential instruments
+G_instruments.MR_loo_plot(filename="loo_forest", figure_size=(10, 8))
+```
+```python
+# Or paginate all instruments
+G_instruments.MR_loo_plot(
+    top_influential=False,  # show all, not just influential
+    snps_per_page=30,
+    page=1,                 # or None for all pages
+    filename="loo_forest_all",
+    figure_size=(10, 12),
+)
+```
+Key arguments:
+- `top_influential=True` (default): select the `snps_per_page` most influential SNPs (largest change in estimate when removed) and render a single compact figure.
+- `top_influential=False`: paginate all instruments; use `page=N` to select a specific page or `page=None` to render all pages.
+- `snps_per_page`: number of SNPs per page (minimum 5).
+- `use_mrpresso_data=True`: color MR-PRESSO outliers in red (requires `MRpresso()` first). When outliers exist, an extra summary row ("MR-PRESSO corrected") is added using the same MR method as the leave-one-out analysis.
+- `methods=["WM", "Egger"]`: add extra overall estimates for the requested methods (computed on all instruments).
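
Note on the `top_influential`, `snps_per_page`, and `page` arguments listed above: the selection logic they describe can be pictured as follows. This is only a sketch of the documented behaviour, not the code behind `MR_loo_plot`. The helper name `select_rows`, the 1-based page numbering, the ordering of rows within a page, and raising an error below the minimum of 5 are all assumptions.

```python
def select_rows(influence_by_snp, top_influential=True, snps_per_page=30, page=1):
    """Return a list of pages, each a list of SNP IDs (hypothetical helper)."""
    if snps_per_page < 5:
        # The docs state a minimum of 5; how genal enforces it is not shown in the diff.
        raise ValueError("snps_per_page must be at least 5")

    # Rank SNPs by influence, largest change in estimate first.
    ranked = sorted(influence_by_snp, key=influence_by_snp.get, reverse=True)

    if top_influential:
        return [ranked[:snps_per_page]]  # one compact figure

    pages = [ranked[i:i + snps_per_page] for i in range(0, len(ranked), snps_per_page)]
    if page is None:
        return pages                     # render every page
    return [pages[page - 1]]             # assumed 1-based page index


# Toy influence values (made up): rs12 is the most influential, rs1 the least.
influence_by_snp = {f"rs{i}": i / 100 for i in range(1, 13)}

top = select_rows(influence_by_snp, top_influential=True, snps_per_page=5)
all_pages = select_rows(influence_by_snp, top_influential=False, snps_per_page=5, page=None)
last_page = select_rows(influence_by_snp, top_influential=False, snps_per_page=5, page=3)
```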
 ## 7) Additional capabilities (beyond the core pipeline)
 ### Single-SNP association tests (individual-level data)
