pylocuszoom-0.6.0.tar.gz → pylocuszoom-1.0.0.tar.gz

Files changed (90)
  1. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/.github/workflows/ci.yml +3 -8
  2. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/.github/workflows/publish.yml +63 -8
  3. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/.gitignore +2 -0
  4. pylocuszoom-1.0.0/.pre-commit-config.yaml +17 -0
  5. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/CHANGELOG.md +69 -1
  6. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/PKG-INFO +82 -37
  7. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/README.md +79 -34
  8. pylocuszoom-1.0.0/bioconda/meta.yaml +63 -0
  9. pylocuszoom-1.0.0/docs/CODEMAP.md +256 -0
  10. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/docs/USER_GUIDE.md +78 -22
  11. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/examples/eqtl_bokeh.html +5 -5
  12. pylocuszoom-1.0.0/examples/eqtl_overlay.png +0 -0
  13. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/examples/eqtl_plotly.html +1 -1
  14. pylocuszoom-1.0.0/examples/finemapping_bokeh.html +61 -0
  15. pylocuszoom-1.0.0/examples/finemapping_plot.png +0 -0
  16. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/examples/finemapping_plotly.html +1 -1
  17. pylocuszoom-1.0.0/examples/forest_plot.png +0 -0
  18. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/examples/generate_readme_plots.py +31 -7
  19. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/examples/getting_started.ipynb +167 -56
  20. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/examples/regional_plot.png +0 -0
  21. pylocuszoom-1.0.0/examples/stacked_plot.png +0 -0
  22. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/pyproject.toml +22 -4
  23. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/__init__.py +34 -7
  24. pylocuszoom-1.0.0/src/pylocuszoom/backends/__init__.py +147 -0
  25. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/backends/base.py +363 -60
  26. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/backends/bokeh_backend.py +77 -15
  27. pylocuszoom-1.0.0/src/pylocuszoom/backends/hover.py +198 -0
  28. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/backends/matplotlib_backend.py +263 -3
  29. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/backends/plotly_backend.py +73 -16
  30. pylocuszoom-1.0.0/src/pylocuszoom/config.py +365 -0
  31. pylocuszoom-1.0.0/src/pylocuszoom/ensembl.py +476 -0
  32. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/eqtl.py +17 -25
  33. pylocuszoom-1.0.0/src/pylocuszoom/exceptions.py +33 -0
  34. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/finemapping.py +18 -32
  35. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/forest.py +10 -11
  36. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/gene_track.py +169 -142
  37. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/loaders.py +3 -1
  38. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/phewas.py +10 -11
  39. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/plotter.py +311 -277
  40. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/recombination.py +19 -3
  41. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/schemas.py +1 -6
  42. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/utils.py +54 -4
  43. pylocuszoom-1.0.0/src/pylocuszoom/validation.py +223 -0
  44. pylocuszoom-1.0.0/tests/test_backends.py +294 -0
  45. pylocuszoom-1.0.0/tests/test_config.py +540 -0
  46. pylocuszoom-1.0.0/tests/test_ensembl.py +466 -0
  47. pylocuszoom-1.0.0/tests/test_ensembl_integration.py +77 -0
  48. pylocuszoom-1.0.0/tests/test_exceptions.py +158 -0
  49. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/tests/test_finemapping.py +3 -7
  50. pylocuszoom-1.0.0/tests/test_forest.py +111 -0
  51. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/tests/test_gene_track.py +61 -0
  52. pylocuszoom-1.0.0/tests/test_hover.py +351 -0
  53. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/tests/test_notebook_backends.py +66 -13
  54. pylocuszoom-1.0.0/tests/test_phewas.py +98 -0
  55. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/tests/test_plotter.py +306 -13
  56. pylocuszoom-1.0.0/tests/test_utils.py +162 -0
  57. pylocuszoom-1.0.0/tests/test_validation.py +472 -0
  58. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/uv.lock +1 -1
  59. pylocuszoom-0.6.0/.pre-commit-config.yaml +0 -7
  60. pylocuszoom-0.6.0/bioconda/meta.yaml +0 -54
  61. pylocuszoom-0.6.0/examples/eqtl_overlay.png +0 -0
  62. pylocuszoom-0.6.0/examples/finemapping_bokeh.html +0 -61
  63. pylocuszoom-0.6.0/examples/finemapping_plot.png +0 -0
  64. pylocuszoom-0.6.0/examples/forest_plot.png +0 -0
  65. pylocuszoom-0.6.0/examples/stacked_plot.png +0 -0
  66. pylocuszoom-0.6.0/src/pylocuszoom/backends/__init__.py +0 -48
  67. pylocuszoom-0.6.0/tests/test_forest.py +0 -54
  68. pylocuszoom-0.6.0/tests/test_phewas.py +0 -51
  69. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/.gitattributes +0 -0
  70. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/.github/ISSUE_TEMPLATE/bug_report.md +0 -0
  71. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/.github/ISSUE_TEMPLATE/config.yml +0 -0
  72. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/.github/ISSUE_TEMPLATE/feature_request.md +0 -0
  73. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/CONTRIBUTING.md +0 -0
  74. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/LICENSE.md +0 -0
  75. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/docs/ARCHITECTURE.md +0 -0
  76. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/examples/phewas_plot.png +0 -0
  77. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/logo.svg +0 -0
  78. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/colors.py +0 -0
  79. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/labels.py +0 -0
  80. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/ld.py +0 -0
  81. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/logging.py +0 -0
  82. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/py.typed +0 -0
  83. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/src/pylocuszoom/reference_data/__init__.py +0 -0
  84. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/tests/conftest.py +0 -0
  85. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/tests/test_colors.py +0 -0
  86. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/tests/test_labels.py +0 -0
  87. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/tests/test_ld.py +0 -0
  88. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/tests/test_loaders.py +0 -0
  89. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/tests/test_logging.py +0 -0
  90. {pylocuszoom-0.6.0 → pylocuszoom-1.0.0}/tests/test_recombination.py +0 -0
@@ -44,14 +44,9 @@ jobs:
  run: uv sync --extra dev --extra all

  - name: Run tests
- run: uv run pytest --cov=pylocuszoom --cov-report=xml
-
- - name: Upload coverage
- uses: codecov/codecov-action@v4
- if: matrix.python-version == '3.11'
- with:
- files: ./coverage.xml
- fail_ci_if_error: false
+ # Verbose shows pytest-randomly seed for reproduction
+ # To reproduce: pytest --randomly-seed=<seed>
+ run: uv run pytest -v

  build:
  runs-on: ubuntu-latest
@@ -58,8 +58,13 @@ jobs:
  update-bioconda:
  needs: publish
  runs-on: ubuntu-latest
+ permissions:
+ contents: write
+ pull-requests: write
  steps:
  - uses: actions/checkout@v4
+ with:
+ ref: main

  - name: Update bioconda/meta.yaml
  env:
@@ -88,21 +93,71 @@ jobs:

  cat bioconda/meta.yaml

- - name: Create Pull Request
+ - name: Create local PR for bioconda/meta.yaml
  uses: peter-evans/create-pull-request@v6
  with:
  token: ${{ secrets.GITHUB_TOKEN }}
- commit-message: "chore: update bioconda recipe for new release"
+ commit-message: "chore: update bioconda recipe for v${{ needs.publish.outputs.version }}"
  branch: bioconda-update
- title: "Update bioconda recipe"
+ title: "Update bioconda recipe for v${{ needs.publish.outputs.version }}"
  body: |
  Automated update of bioconda/meta.yaml after PyPI release.

- **Next steps:**
- 1. Review this PR
- 2. Merge to main
- 3. Copy `bioconda/meta.yaml` to your fork of bioconda-recipes
- 4. Submit PR to bioconda-recipes
+ This PR updates the local recipe. The bioconda-recipes PR is created automatically below.
  labels: |
  bioconda
  automated
+
+ submit-bioconda-pr:
+ needs: [publish, update-bioconda]
+ runs-on: ubuntu-latest
+ steps:
+ - name: Checkout bioconda-recipes fork
+ uses: actions/checkout@v4
+ with:
+ repository: michael-denyer/bioconda-recipes
+ token: ${{ secrets.BIOCONDA_PAT }}
+ path: bioconda-recipes
+
+ - name: Checkout pyLocusZoom
+ uses: actions/checkout@v4
+ with:
+ ref: main
+ path: pylocuszoom
+
+ - name: Update recipe and create PR
+ env:
+ GH_TOKEN: ${{ secrets.BIOCONDA_PAT }}
+ PKG_VERSION: ${{ needs.publish.outputs.version }}
+ run: |
+ cd bioconda-recipes
+
+ # Sync fork with upstream
+ git remote add upstream https://github.com/bioconda/bioconda-recipes.git
+ git fetch upstream
+ git checkout master
+ git reset --hard upstream/master
+ git push origin master --force
+
+ # Create branch for this version
+ BRANCH="pylocuszoom-$PKG_VERSION"
+ git checkout -b "$BRANCH"
+
+ # Copy updated recipe
+ mkdir -p recipes/pylocuszoom
+ cp ../pylocuszoom/bioconda/meta.yaml recipes/pylocuszoom/meta.yaml
+
+ # Commit and push
+ git add recipes/pylocuszoom/meta.yaml
+ git commit -m "Update pylocuszoom to $PKG_VERSION"
+ git push -u origin "$BRANCH" --force
+
+ # Create PR to bioconda/bioconda-recipes
+ gh pr create \
+ --repo bioconda/bioconda-recipes \
+ --title "Update pylocuszoom to $PKG_VERSION" \
+ --body "Automated update of pylocuszoom to version $PKG_VERSION.
+
+ Changes: https://github.com/michael-denyer/pyLocusZoom/releases/tag/v$PKG_VERSION" \
+ --head "michael-denyer:$BRANCH" \
+ --base master || echo "PR may already exist"
@@ -26,3 +26,5 @@ htmlcov/

  # Project instructions (private)
  CLAUDE.md
+ docs/plans/
+ .planning/
@@ -0,0 +1,17 @@
+ repos:
+ - repo: https://github.com/astral-sh/ruff-pre-commit
+ rev: v0.9.1
+ hooks:
+ - id: ruff
+ args: [--fix]
+ - id: ruff-format
+
+ - repo: local
+ hooks:
+ - id: pytest-cov
+ name: pytest with coverage
+ entry: uv run python -m pytest -n auto -q
+ language: system
+ types: [python]
+ pass_filenames: false
+ always_run: true
@@ -5,6 +5,72 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+ ## [Unreleased]
+
+ ## [1.0.0] - 2026-01-28
+
+ ### Added
+ - Unified exception hierarchy with `PyLocusZoomError` base class
+ - Custom exceptions: `ValidationError`, `DownloadError`, `LiftoverError`, `DataError`, `PLINKError`, `ConfigurationError`
+ - Internal Pydantic validation for plot parameters (validates kwargs at call time)
+ - Error path tests for download failures and validation edge cases
+ - CI ordering validation for forest plots (`ci_lower <= effect <= ci_upper`)
+ - P-value validation warnings for NaN and out-of-range values
+ - Vectorized eQTL/PheWAS scatter calls for better performance
+
+ ### Changed
+ - All validation errors now raise `ValidationError` (also a `ValueError` for backward compatibility)
+ - Test randomization enabled via pytest-randomly (visible in CI output)
+ - Config classes (`PlotConfig`, `StackedPlotConfig`) are now internal implementation details, not part of public API
+ - Capped pytest-xdist workers at 8 to prevent terminal issues
+
+ ### Fixed
+ - Recombination overlay now uses correct twin axis for matplotlib (no longer distorts GWAS y-limits)
+ - Mb formatting now applied to gene track axis for interactive backends (Plotly/Bokeh)
+ - Gene track row assignment algorithm now correctly prevents overlapping genes in same row
+ - Handle all-NaN p-values in stacked plot lead SNP detection
+ - Replaced broad `except Exception` blocks with specific exception types (only 1 justified fallback remains)
+ - Download error handling now catches specific HTTP/network errors
+
+ ## [0.8.0] - 2026-01-28
+
+ ### Added
+ - `set_yticks()` backend method for consistent y-axis labels across all backends
+ - Shared `convert_latex_to_unicode()` utility for interactive backends
+ - Automatic gene annotation fetching from Ensembl REST API (`auto_genes=True`)
+ - `get_genes_for_region()` function to fetch genes from Ensembl with disk caching
+ - `fetch_genes_from_ensembl()` and `fetch_exons_from_ensembl()` low-level API functions
+ - `clear_ensembl_cache()` utility to clear cached Ensembl data
+ - Support for human, mouse, rat, and any Ensembl species
+ - Retry logic with exponential backoff for Ensembl API resilience
+ - 5Mb region size validation (Ensembl API limit)
+ - `DataFrameValidator` builder class for consistent validation across modules
+ - `filter_by_region()` shared utility for chromosome/position filtering
+ - `HoverDataBuilder` for constructing hover tooltips across backends
+ - Backend capability system with `supports_*` properties for feature detection
+ - Backend registration system with `get_backend()` and automatic fallback
+ - Pre-commit hook for pytest with coverage enforcement (70% minimum)
+
+ ### Changed
+ - Forest plot example now uses odds ratios with `null_value=1.0` (more representative)
+ - PheWAS and forest plot y-axis labels now work correctly in Plotly and Bokeh backends
+ - Gene track styling: arrows now 75% height and 10% wider for better proportions
+ - Gene track labels increased from 5.5pt to 7pt for improved readability
+ - Migrated eQTL, finemapping, phewas, and forest validation to `DataFrameValidator`
+ - Plotter now uses capability-based dispatch instead of backend name checks
+ - Removed empty `__init__` methods from backend classes
+ - Removed unused matplotlib imports from plotter (now backend-agnostic)
+
+ ### Fixed
+ - `load_gwas()` now forwards `**kwargs` to format-specific loaders
+ - Forest plot validator now checks that effect and CI columns are numeric
+ - PheWAS validator now checks that p-values are numeric and within (0, 1] range
+
+ ### Security
+ - Tar extraction now includes path traversal protection for recombination map downloads
+
+ ## [0.7.0] - 2026-01-27
+
  ## [0.6.0] - 2026-01-27

  ### Added
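
The 1.0.0 entries in the hunk above describe a `PyLocusZoomError` base class, a `ValidationError` that is also a `ValueError`, and a CI ordering check for forest plots. The contents of `exceptions.py` and `validation.py` are not shown in this diff, so the following is only a minimal sketch of how such a hierarchy can be arranged; the class bodies and the helpers `check_ci_ordering` and `describe` are assumptions for illustration, not the package's code.

```python
class PyLocusZoomError(Exception):
    """Base class for library errors (name taken from the changelog)."""


class ValidationError(PyLocusZoomError, ValueError):
    """Invalid input. Inheriting from ValueError keeps old
    `except ValueError` handlers working."""


class DownloadError(PyLocusZoomError):
    """Failed download (name taken from the changelog; body assumed)."""


def check_ci_ordering(ci_lower: float, effect: float, ci_upper: float) -> None:
    """Hypothetical helper enforcing ci_lower <= effect <= ci_upper."""
    # NaN compares false against everything, so NaN inputs are rejected too.
    if not (ci_lower <= effect <= ci_upper):
        raise ValidationError(
            f"expected ci_lower <= effect <= ci_upper, "
            f"got {ci_lower}, {effect}, {ci_upper}"
        )


def describe(ci_lower: float, effect: float, ci_upper: float) -> str:
    """Show that pre-1.0 style `except ValueError` code still catches the error."""
    try:
        check_ci_ordering(ci_lower, effect, ci_upper)
    except ValueError as exc:
        return f"rejected: {type(exc).__name__}"
    return "ok"
```

Callers that want to catch every library error can use `except PyLocusZoomError`, while existing code written against `ValueError` keeps working without changes.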
@@ -140,7 +206,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
  - bokeh >= 3.8.2
  - kaleido >= 0.2.0

- [Unreleased]: https://github.com/michael-denyer/pyLocusZoom/compare/v0.6.0...HEAD
+ [Unreleased]: https://github.com/michael-denyer/pyLocusZoom/compare/v0.8.0...HEAD
+ [0.8.0]: https://github.com/michael-denyer/pyLocusZoom/compare/v0.7.0...v0.8.0
+ [0.7.0]: https://github.com/michael-denyer/pyLocusZoom/compare/v0.6.0...v0.7.0
  [0.6.0]: https://github.com/michael-denyer/pyLocusZoom/compare/v0.5.0...v0.6.0
  [0.5.0]: https://github.com/michael-denyer/pyLocusZoom/compare/v0.4.0...v0.5.0
  [0.4.0]: https://github.com/michael-denyer/pyLocusZoom/compare/v0.3.0...v0.4.0
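
The 1.0.0 "Fixed" section of the CHANGELOG says the gene track row assignment now prevents overlapping genes in the same row. The `gene_track.py` changes are not shown in this diff, so the sketch below uses a common greedy interval-packing approach as a stand-in; `assign_rows` and its `padding` parameter are hypothetical names, and the package may work differently.

```python
def assign_rows(genes, padding=0):
    """Assign each gene to the first row where it does not overlap.

    genes: iterable of (name, start, end) tuples with start <= end.
    padding: minimum gap in bp required between neighbours in a row.
    Returns a dict mapping gene name to a zero-based row index.
    """
    row_ends = []  # right-most occupied position in each row
    rows = {}
    # Processing genes left to right means each row only needs to
    # remember where its most recently placed gene ends.
    for name, start, end in sorted(genes, key=lambda g: (g[1], g[2])):
        for row, last_end in enumerate(row_ends):
            if start > last_end + padding:
                row_ends[row] = end
                rows[name] = row
                break
        else:
            # No existing row has room, so open a new one.
            row_ends.append(end)
            rows[name] = len(row_ends) - 1
    return rows


example = [("A", 100, 500), ("B", 300, 800), ("C", 600, 900), ("D", 850, 1000)]
layout = assign_rows(example)
```

With these four genes, `A` and `C` share one row and `B` and `D` share another, because each pair is separated along the x-axis while `A`/`B` and `C`/`D` overlap.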
@@ -1,15 +1,15 @@
  Metadata-Version: 2.4
  Name: pylocuszoom
- Version: 0.6.0
+ Version: 1.0.0
  Summary: Publication-ready regional association plots with LD coloring, gene tracks, and recombination overlays
  Project-URL: Homepage, https://github.com/michael-denyer/pylocuszoom
  Project-URL: Documentation, https://github.com/michael-denyer/pylocuszoom#readme
  Project-URL: Repository, https://github.com/michael-denyer/pylocuszoom
- Author: Michael Denyer
+ Author-email: Michael Denyer <code.denyer@gmail.com>
  License-Expression: GPL-3.0-or-later
  License-File: LICENSE.md
  Keywords: genetics,gwas,locus-zoom,locuszoom,regional-plot,visualization
- Classifier: Development Status :: 3 - Alpha
+ Classifier: Development Status :: 4 - Beta
  Classifier: Intended Audience :: Science/Research
  Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
  Classifier: Programming Language :: Python :: 3
@@ -44,20 +44,18 @@ Requires-Dist: pyspark>=3.0.0; extra == 'spark'
  Description-Content-Type: text/markdown

  [![CI](https://github.com/michael-denyer/pyLocusZoom/actions/workflows/ci.yml/badge.svg)](https://github.com/michael-denyer/pyLocusZoom/actions/workflows/ci.yml)
- [![codecov](https://codecov.io/gh/michael-denyer/pyLocusZoom/graph/badge.svg)](https://codecov.io/gh/michael-denyer/pyLocusZoom)
  [![PyPI](https://img.shields.io/pypi/v/pylocuszoom)](https://pypi.org/project/pylocuszoom/)
- [![Bioconda](https://img.shields.io/conda/vn/bioconda/pylocuszoom)](https://anaconda.org/bioconda/pylocuszoom)
  [![License: GPL v3](https://img.shields.io/badge/License-GPLv3-red.svg)](https://www.gnu.org/licenses/gpl-3.0)
  [![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
  [![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
  [![Matplotlib](https://img.shields.io/badge/Matplotlib-3.5+-11557c.svg)](https://matplotlib.org/)
- [![Plotly](https://img.shields.io/badge/Plotly-5.0+-3F4F75.svg)](https://plotly.com/python/)
+ [![Plotly](https://img.shields.io/badge/Plotly-5.15+-3F4F75.svg)](https://plotly.com/python/)
  [![Bokeh](https://img.shields.io/badge/Bokeh-3.8+-E6526F.svg)](https://bokeh.org/)
  [![Pandas](https://img.shields.io/badge/Pandas-1.4+-150458.svg)](https://pandas.pydata.org/)
  <img src="logo.svg" alt="pyLocusZoom logo" width="120" align="right">
  # pyLocusZoom

- Publication-ready regional association plots with LD coloring, gene tracks, and recombination overlays.
+ Designed for publication-ready GWAS visualization with regional association plots, gene tracks, eQTL, PheWAS, fine-mapping, and forest plots.

  Inspired by [LocusZoom](http://locuszoom.org/) and [locuszoomr](https://github.com/myles-lewis/locuszoomr).

@@ -68,20 +66,22 @@ Inspired by [LocusZoom](http://locuszoom.org/) and [locuszoomr](https://github.c
  - **Multi-species support**: Built-in reference data for *Canis lupus familiaris* (CanFam3.1/CanFam4) and *Felis catus* (FelCat9), or optionally provide your own for any species
  - **LD coloring**: SNPs colored by linkage disequilibrium (R²) with lead variant
  - **Gene tracks**: Annotated gene/exon positions below the association plot
- - **Recombination rate**: Overlay showing recombination rate across region (*Canis lupus familiaris* only)
- - **SNP labels (matplotlib)**: Automatic labeling of lead SNPs with RS ID
- - **Tooltips (Bokeh and Plotly)**: Mouseover for detailed SNP data
+ - **Recombination rate**: Optional overlay across region (*Canis lupus familiaris* built-in, not shown in example image)
+ - **SNP labels (matplotlib)**: Automatic labeling of top SNPs by p-value (RS IDs)
+ - **Hover tooltips (Plotly and Bokeh)**: Detailed SNP data on hover

- ![Example regional association plot](examples/regional_plot.png)
+ ![Example regional association plot with LD coloring and gene track](examples/regional_plot.png)
+ *Regional association plot with LD coloring, gene/exon track, and top SNP labels (recombination overlay disabled in example).*

  2. **Stacked plots**: Compare multiple GWAS/phenotypes vertically
  3. **eQTL plot**: Expression QTL data aligned with association plots and gene tracks
  4. **Fine-mapping plots**: Visualize SuSiE credible sets with posterior inclusion probabilities
  5. **PheWAS plots**: Phenome-wide association study visualization across multiple phenotypes
  6. **Forest plots**: Meta-analysis effect size visualization with confidence intervals
- 7. **Multiple charting libraries**: matplotlib (static), plotly (interactive), bokeh (dashboards)
+ 7. **Multiple backends**: matplotlib (publication-ready), plotly (interactive), bokeh (dashboard integration)
  8. **Pandas and PySpark support**: Works with both Pandas and PySpark DataFrames for large-scale genomics data
  9. **Convenience data file loaders**: Load and validate common GWAS, eQTL and fine-mapping file formats
+ 10. **Automatic gene annotations**: Fetch gene/exon data from Ensembl REST API with caching (human, mouse, rat, canine, feline, and any Ensembl species)

  ## Installation

@@ -109,15 +109,14 @@ from pylocuszoom import LocusZoomPlotter
  # Initialize plotter (loads reference data for canine)
  plotter = LocusZoomPlotter(species="canine")

- # Create regional plot
+ # Plot with parameters passed directly
  fig = plotter.plot(
- gwas_df, # DataFrame with ps, p_wald, rs columns
+ gwas_df, # DataFrame with ps, p_wald, rs columns
  chrom=1,
  start=1000000,
  end=2000000,
- lead_pos=1500000, # Highlight lead SNP
+ lead_pos=1500000, # Highlight lead SNP
  )
-
  fig.savefig("regional_plot.png", dpi=150)
  ```

@@ -137,9 +136,7 @@ fig = plotter.plot(
  start=1000000,
  end=2000000,
  lead_pos=1500000,
- ld_reference_file="genotypes.bed", # For LD calculation
- genes_df=genes_df, # Gene annotations
- exons_df=exons_df, # Exon annotations
+ ld_reference_file="genotypes", # PLINK fileset (without extension)
  show_recombination=True, # Overlay recombination rate
  snp_labels=True, # Label top SNPs
  label_top_n=5, # How many to label
@@ -147,6 +144,8 @@ fig = plotter.plot(
  p_col="p_wald", # Column name for p-value
  rs_col="rs", # Column name for SNP ID
  figsize=(12, 8),
+ genes_df=genes_df, # Gene annotations
+ exons_df=exons_df, # Exon annotations
  )
  ```

@@ -163,6 +162,8 @@ Recombination maps are automatically lifted over from CanFam3.1 to CanFam4 coord
  ## Using with Other Species

  ```python
+ from pylocuszoom import LocusZoomPlotter
+
  # Feline (LD and gene tracks, user provides recombination data)
  plotter = LocusZoomPlotter(species="feline")

@@ -172,37 +173,61 @@ plotter = LocusZoomPlotter(
  recomb_data_dir="/path/to/recomb_maps/",
  )

- # Or provide data per-plot
+ # Provide data per-plot
  fig = plotter.plot(
  gwas_df,
- chrom=1, start=1000000, end=2000000,
+ chrom=1,
+ start=1000000,
+ end=2000000,
  recomb_df=my_recomb_dataframe,
  genes_df=my_genes_df,
  )
  ```

+ ## Automatic Gene Annotations
+
+ pyLocusZoom can automatically fetch gene annotations from Ensembl for any species:
+
+ ```python
+ from pylocuszoom import LocusZoomPlotter
+
+ # Enable automatic gene fetching
+ plotter = LocusZoomPlotter(species="human", auto_genes=True)
+
+ # No need to provide genes_df - fetched automatically
+ fig = plotter.plot(gwas_df, chrom=13, start=32000000, end=33000000)
+ ```
+
+ Supported species aliases: `human`, `mouse`, `rat`, `canine`/`dog`, `feline`/`cat`, or any Ensembl species name.
+ Data is cached locally for fast subsequent plots. Maximum region size is 5Mb (Ensembl API limit).
+
  ## Backends

- pyLocusZoom supports multiple rendering backends:
+ pyLocusZoom supports multiple rendering backends (set at initialization):

  ```python
+ from pylocuszoom import LocusZoomPlotter
+
  # Static publication-quality plot (default)
- fig = plotter.plot(gwas_df, chrom=1, start=1000000, end=2000000, backend="matplotlib")
+ plotter = LocusZoomPlotter(species="canine", backend="matplotlib")
+ fig = plotter.plot(gwas_df, chrom=1, start=1000000, end=2000000)
  fig.savefig("plot.png", dpi=150)

  # Interactive Plotly (hover tooltips, pan/zoom)
- fig = plotter.plot(gwas_df, chrom=1, start=1000000, end=2000000, backend="plotly")
+ plotter = LocusZoomPlotter(species="canine", backend="plotly")
+ fig = plotter.plot(gwas_df, chrom=1, start=1000000, end=2000000)
  fig.write_html("plot.html")

  # Interactive Bokeh (dashboard-ready)
- fig = plotter.plot(gwas_df, chrom=1, start=1000000, end=2000000, backend="bokeh")
+ plotter = LocusZoomPlotter(species="canine", backend="bokeh")
+ fig = plotter.plot(gwas_df, chrom=1, start=1000000, end=2000000)
  ```

  | Backend | Output | Best For | Features |
  |---------|--------|----------|----------|
- | `matplotlib` | Static PNG/PDF/SVG | Publications, presentations | Full feature set with SNP labels |
- | `plotly` | Interactive HTML | Web reports, data exploration | Hover tooltips, pan/zoom |
- | `bokeh` | Interactive HTML | Dashboards, web apps | Hover tooltips, pan/zoom |
+ | `matplotlib` | Static PNG/PDF/SVG | Publication-ready figures | Full feature set with SNP labels |
+ | `plotly` | Interactive HTML | Web reports, exploration | Hover tooltips, pan/zoom |
+ | `bokeh` | Interactive HTML | Dashboard integration | Hover tooltips, pan/zoom |

  > **Note:** All backends support scatter plots, gene tracks, recombination overlay, and LD legend. SNP labels (auto-positioned with adjustText) are matplotlib-only; interactive backends use hover tooltips instead.
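
The "Automatic Gene Annotations" text added in this hunk states a 5Mb maximum region size for Ensembl fetching. The `ensembl.py` source is not shown here, so the following is only a sketch of what a pre-flight check for that limit could look like; `MAX_REGION_BP`, `check_region`, the choice of `ValueError`, and measuring the region as `end - start` are all assumptions.

```python
MAX_REGION_BP = 5_000_000  # 5 Mb limit quoted in the README text


def check_region(start: int, end: int, max_bp: int = MAX_REGION_BP) -> int:
    """Return the region size in bp, or raise if the region is unusable.

    Hypothetical helper; the package's own validation may differ,
    including in whether the size is counted inclusively.
    """
    if start < 0 or end <= start:
        raise ValueError(f"invalid region: start={start}, end={end}")
    size = end - start
    if size > max_bp:
        raise ValueError(
            f"region spans {size:,} bp, above the {max_bp:,} bp limit; "
            "narrow the window or pass genes_df explicitly"
        )
    return size


# The README example region (chr13:32,000,000-33,000,000) is 1 Mb wide.
size = check_region(32_000_000, 33_000_000)
```

Checking the window before any network request gives a clear local error instead of a failed API call.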
@@ -211,6 +236,10 @@ fig = plotter.plot(gwas_df, chrom=1, start=1000000, end=2000000, backend="bokeh"
  Compare multiple GWAS results vertically with shared x-axis:

  ```python
+ from pylocuszoom import LocusZoomPlotter
+
+ plotter = LocusZoomPlotter(species="canine")
+
  fig = plotter.plot_stacked(
  [gwas_height, gwas_bmi, gwas_whr],
  chrom=1,
@@ -221,22 +250,29 @@ fig = plotter.plot_stacked(
  )
  ```

- ![Example stacked plot](examples/stacked_plot.png)
+ ![Example stacked plot comparing two phenotypes](examples/stacked_plot.png)
+ *Stacked plot comparing two phenotypes with LD coloring and shared gene track.*

  ## eQTL Overlay

  Add expression QTL data as a separate panel:

  ```python
+ from pylocuszoom import LocusZoomPlotter
+
  eqtl_df = pd.DataFrame({
  "pos": [1000500, 1001200, 1002000],
  "p_value": [1e-6, 1e-4, 0.01],
  "gene": ["BRCA1", "BRCA1", "BRCA1"],
  })

+ plotter = LocusZoomPlotter(species="canine")
+
  fig = plotter.plot_stacked(
  [gwas_df],
- chrom=1, start=1000000, end=2000000,
+ chrom=1,
+ start=1000000,
+ end=2000000,
  eqtl_df=eqtl_df,
  eqtl_gene="BRCA1",
  genes_df=genes_df,
244
280
  ```
245
281
 
246
282
  ![Example eQTL overlay plot](examples/eqtl_overlay.png)
283
+ *eQTL overlay with effect direction (up/down triangles) and magnitude binning.*
247
284
 
248
285
  ## Fine-mapping Visualization
249
286
 
250
287
  Visualize SuSiE or other fine-mapping results with credible set coloring:
251
288
 
252
289
  ```python
290
+ from pylocuszoom import LocusZoomPlotter
291
+
253
292
  finemapping_df = pd.DataFrame({
254
293
  "pos": [1000500, 1001200, 1002000, 1003500],
255
294
  "pip": [0.85, 0.12, 0.02, 0.45], # Posterior inclusion probability
256
295
  "cs": [1, 1, 0, 2], # Credible set assignment (0 = not in CS)
257
296
  })
258
297
 
298
+ plotter = LocusZoomPlotter(species="canine")
299
+
259
300
  fig = plotter.plot_stacked(
260
301
  [gwas_df],
261
- chrom=1, start=1000000, end=2000000,
302
+ chrom=1,
303
+ start=1000000,
304
+ end=2000000,
262
305
  finemapping_df=finemapping_df,
263
306
  finemapping_cs_col="cs",
264
307
  genes_df=genes_df,
@@ -266,6 +309,7 @@ fig = plotter.plot_stacked(
266
309
  ```
267
310
 
268
311
  ![Example fine-mapping plot](examples/finemapping_plot.png)
312
+ *Fine-mapping visualization with PIP line and credible set coloring (CS1/CS2).*
269
313
 
270
314
  ## PheWAS Plots
271
315
 
@@ -286,6 +330,7 @@ fig = plotter.plot_phewas(
  ```

  ![Example PheWAS plot](examples/phewas_plot.png)
+ *PheWAS plot showing associations across phenotype categories with significance threshold.*

  ## Forest Plots

@@ -308,19 +353,18 @@ fig = plotter.plot_forest(
  ```

  ![Example forest plot](examples/forest_plot.png)
+ *Forest plot with effect sizes, confidence intervals, and weight-proportional markers.*

  ## PySpark Support

- For large-scale genomics data, pass PySpark DataFrames directly:
+ For large-scale genomics data, convert PySpark DataFrames with `to_pandas()` before plotting:

  ```python
  from pylocuszoom import LocusZoomPlotter, to_pandas

- # PySpark DataFrame (automatically converted)
- fig = plotter.plot(spark_gwas_df, chrom=1, start=1000000, end=2000000)
-
- # Or convert manually with sampling for very large data
+ # Convert PySpark DataFrame (optionally sampled for very large data)
  pandas_df = to_pandas(spark_gwas_df, sample_size=100000)
+ fig = plotter.plot(pandas_df, chrom=1, start=1000000, end=2000000)
  ```

  Install PySpark support: `uv add pylocuszoom[spark]`
@@ -393,7 +437,7 @@ gwas_df = pd.DataFrame({
  |--------|------|----------|-------------|
  | `chr` | str or int | Yes | Chromosome identifier. Accepts "1", "chr1", or 1. The "chr" prefix is stripped for matching. |
  | `start` | int | Yes | Gene start position (bp, 1-based). Transcript start for strand-aware genes. |
- | `end` | int | Yes | Gene end position (bp, 1-based). Must be start. |
+ | `end` | int | Yes | Gene end position (bp, 1-based). Must be >= start. |
  | `gene_name` | str | Yes | Gene symbol displayed in track (e.g., "BRCA1", "TP53"). Keep short for readability. |

  Example:
@@ -495,6 +539,7 @@ Optional:
  ## Documentation

  - [User Guide](docs/USER_GUIDE.md) - Comprehensive documentation with API reference
+ - [Code Map](docs/CODEMAP.md) - Architecture diagram with source code links
  - [Architecture](docs/ARCHITECTURE.md) - Design decisions and component overview
  - [Example Notebook](examples/getting_started.ipynb) - Interactive tutorial
  - [CHANGELOG](CHANGELOG.md) - Version history