
pycmplot 0.1.9.tar.gz → 0.2.1.tar.gz


Files changed (38)

{pycmplot-0.1.9 → pycmplot-0.2.1}/LICENSE RENAMED

@@ -1,4 +1,4 @@
-CC BY-NC-SA 4.0 License
+CC-BY-NC-SA-4.0 License
 Copyright (c) 2026 Kevin Esoh

pycmplot-0.2.1/PKG-INFO ADDED

@@ -0,0 +1,231 @@
+Metadata-Version: 2.4
+Name: pycmplot
+Version: 0.2.1
+Summary: Multi-track circular and linear Manhattan plot generation for GWAS summary statistics
+Author: Kevin Esoh
+Author-email: Kevin Esoh <kesohku1@jh.edu>
+License-Expression: CC-BY-NC-SA-4.0
+Requires-Python: >=3.9
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: pandas>=1.5
+Requires-Dist: numpy>=1.23
+Requires-Dist: matplotlib>=3.6
+Requires-Dist: pillow>=9.0
+Requires-Dist: pycirclize>=0.6
+Requires-Dist: natsort>=8.0
+Requires-Dist: adjustText>=0.8
+Requires-Dist: pyliftover>=0.4
+Provides-Extra: dev
+Requires-Dist: pytest; extra == "dev"
+Requires-Dist: black; extra == "dev"
+Requires-Dist: ruff; extra == "dev"
+Requires-Dist: towncrier; extra == "dev"
+Requires-Dist: sphinx; extra == "dev"
+Dynamic: license-file
+# pycmplot
+Multi-track **circular** and **linear** Manhattan plot generation for GWAS summary statistics.
+```
+#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
+|  PACKAGE FOR CIRCULAR AND LINEAR MANHATTAN PLOTTING  |
+|                    Kevin Esoh, 2026                  |
+|                    kesohku1@jh.edu                   |
+#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
+```
+This package will take any number of per-SNP/variant summary statistics files, be it GWAS,
+selection scans (e.g. iHS, EHH, FST), etc., and generate Manhattan plots. If given a single
+file, a single one-track Manhattan plot will be generated. Multiple files will result in
+the generation of a multi-track stacked Manhattan plot.
+In the process, the package will generate a **hits summary table** for variants with p-value
+(or whatever statistic for significance is used) below the user-specified significance threshold.
+This hits summary table will contain annotated gene names, in addition to other annotations, that
+would then be used to annotate the plots.
+Importantly, the package allows for conversion of hg19 genomic coordinates to hg38 coordinates.
+This ensures that summary stats obtained using different imputation panels, for instance, can be
+processed in the same run. That is, users can simply concatenate multiple summary stats files together,
+such as those for the same trait but analysed using different imputation panels. Users only need to
+add a new column specifying the genome build (hg19 or hg38) of the variants. The `--build_column`
+option should then be used to indicate that column, and the package will lift over all
+positions in hg19 to hg38, ensuring that hits table generation and plotting are done with one unified
+coordinate system.
+A key functionality of the package is its ability to auto-detect certain columns if omitted on the
+command-line or Python API:
+- Chromosome column: `-chr, --chrom_column` or omitted
+- Basepair position column: `-pos, --pos_column` or omitted
+- SNP or Marker ID column: `-snp, --snp_column` or omitted
+- P-value (or whatever value) column: `-p, --pval_column` or omitted
+- Build version column: `-b, --build_column` or omitted
+Candidate names for each of the columns are shown below.
+```python
+# Resolve column names
+chr_candidates = [chrom, 'CHR', 'CHROM', 'Chromosome', '#CHROM', '#CHR', 'Chrom', 'chrom', 'chr', 'chromosome', '#chr', '#chrom']
+pos_candidates = [pos, 'BP', 'POS', 'bp', 'pos', 'Basepair']
+snp_candidates = [snp, 'SNP', 'RSID', 'rsID', 'MarkerName', 'MarkerID', 'Predictor', 'Marker', 'SNPID', 'ID']
+pvl_candidates = [pcol, 'P', 'P-value', 'Wald_P', 'pvalue', 'p_val', 'pval']
+bld_candidates = [build, 'BUILD', 'Genome', 'Genome_Build', 'Genome-build']
+```
+> NB: Upper- and lower-case versions of the candidates are also considered, so each candidate is expanded into three variants.
+Since GWAS summary stats files can be very large, to improve speed and memory efficiency, it is
+**highly recommended** to use `-tp, --trim_pval` with a value to exclude variants with p-value above a
+certain threshold, e.g. `0.01 (1e-2)` or `0.001 (1e-3)`.
+A potentially useful application is **comparative visualization** of results from multiple imputation panels,
+multiple populations, or multiple traits to observe shared genetic architecture.
+Read more in the package documentation page: https://pycmplot.readthedocs.io/en/latest/
+---
+## Installation
+### From PyPI
+```bash
+pip install pycmplot
+```
+### From GitHub
+```bash
+git clone https://github.com/esohkevin/pycmplot.git
+cd pycmplot
+pip install -e .
+# or
+pip install -e . --break-system-packages
+```
+### Use python virtual environment if local installation is not possible
+```bash
+python -m venv ~/bin/pycmplot
+source ~/bin/pycmplot/bin/activate
+pip install --upgrade pip setuptools wheel
+# then follow any of the installation steps above
+```
+### Test the installation
+```bash
+pycmplot -h
+```
+### Dependencies
+| Package | Purpose |
+|---------|---------|
+| pandas, numpy | Data loading & statistics |
+| matplotlib | Plotting backend |
+| pycirclize | Circular (Circos-style) tracks |
+| natsort | Natural chromosome sorting |
+| adjustText | Label collision avoidance |
+| pyliftover | hg19 to hg38 coordinate conversion |
+| Pillow | Image utilities |
+---
+## Command-line usage
+### Linear Manhattan (default)
+```bash
+pycmplot \
+  --sum_stats HbF.tsv.gz,MCV.txt.gz,MCH.tsv.gz \
+  --labels HbF,MCV,MCH \
+  --logp \
+  --signif_line \
+  --highlight \
+  --annotate GENE \
+  --output_dir ./results \
+  --output_format png \
+  --dpi 300
+```
+### Circular Manhattan
+```bash
+pycmplot \
+  --sum_stats HbF.tsv.gz,MCV.tsv.gz \
+  --labels HbF,MCV \
+  --mode cm \
+  --trim_pval 0.01 \
+  --logp \
+  --signif_threshold \
+  --plot_title "RBC Traits" \
+  --output_dir ./results
+```
+### Key options
+| Flag | Description | Default |
+|------|-------------|---------|
+| `-s, --sum_stats` | Comma-separated sumstats files | **required** |
+| `-l, --labels` | Comma-separated track labels | **required** |
+| `-b, --build` | Comma-separated genome builds of sumstats  | off |
+| `-bc, --build_column` | Genome build column name (containing hg18/hg19/hg38) | off |
+| `-m, --mode` | `lm` linear or `cm` circular | `lm` |
+| `-qq, --qq_plot` | Also generate a QQ-plot | off (coming soon...) |
+| `--logp` | Plot -log10(p) | off |
+| `-sig, --signif_threshold` | Genome-wide significance threshold | off (auto 0.05/N) |
+| `-sigl, --signif_line` | Value for genome-wide significance line if different from `-sig` | 5e-8 |
+| `-sug, --suggest_threshold` | Threshold for suggestive signals | off |
+| `-hl, --highlight` | Highlight significant loci | off |
+| `-a, --annotate` | Annotate with `snp`, `gene`, or any column in `hits_table` | `snp` |
+| `-tp, --trim_pval` | Trim variants above this p-value for speed | off |
+| `-st, --sort_track` | Sort tracks by `label` or `chrom_len` | input order |
+| `-od, --output_dir` | Output directory | `.` |
+| `-of, --output_format` | Output format (`png`, `pdf`, `svg`, `jpg`) | `png` |
+Run `pycmplot -h` for the full option list.
+---
+## Python API
+A demonstration of how to use the python API is provided in this notebook: https://github.com/esohkevin/pycmplot/blob/main/pycmplot_python_api.ipynb
+---
+## Package structure
+```
+pycmplot/
+├── pyproject.toml
+├── setup.py
+├── setup.cfg
+├── README.md
+└── pycmplot/
+      ├── __init__.py          # public API exports
+      ├── __main__.py          # python -m pycmplot
+      ├── _core.py             # main() orchestration
+      ├── cli.py               # argparse definitions
+      ├── constants.py         # chromosome lengths, biotype weights
+      ├── resources.py         # external resource path config
+      ├── io.py                # sumstat loading, delimiter detection
+      ├── stats.py             # get_lead_snps, get_highlight_snps
+      ├── liftover.py          # lazy hg19→hg38 liftover
+      ├── annotation.py        # nearest-gene annotation, hits table
+      └── plotting/
+          ├── __init__.py
+          ├── linear.py        # plot_linear
+          └── circular.py      # plot_circular, compute_track_radii_dict
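
The column auto-detection described in the README above can be illustrated with a minimal sketch. It is not the package's implementation: the helper names `expand_case_variants` and `resolve_column` are hypothetical, and the sketch assumes that a user-supplied name (the first list entry, possibly `None`) takes priority over the built-in candidates and that each candidate is tried as written, upper-cased, and lower-cased.

```python
from typing import Iterable, Optional


def expand_case_variants(candidates: Iterable[Optional[str]]) -> list[str]:
    """Return each candidate as written, upper-cased, and lower-cased (order kept, no duplicates)."""
    expanded: list[str] = []
    for name in candidates:
        if not name:  # the user-supplied slot may be None when the option was omitted
            continue
        for variant in (name, name.upper(), name.lower()):
            if variant not in expanded:
                expanded.append(variant)
    return expanded


def resolve_column(header: Iterable[str], candidates: Iterable[Optional[str]]) -> Optional[str]:
    """Return the first candidate variant found in the file header, or None if nothing matches."""
    columns = set(header)
    for variant in expand_case_variants(candidates):
        if variant in columns:
            return variant
    return None


# Hypothetical header of a summary statistics file
header = ["#chrom", "bp", "rsid", "P-VALUE"]
chr_candidates = [None, "CHR", "CHROM", "Chromosome", "#CHROM", "#CHR"]
pvl_candidates = [None, "P", "P-value", "Wald_P", "pvalue"]

print(resolve_column(header, chr_candidates))
print(resolve_column(header, pvl_candidates))
```

Putting the user-supplied name first means an explicit `--chrom_column` style option wins over any guessed candidate, which is one plausible reading of the candidate lists shown in the README.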

{pycmplot-0.1.9 → pycmplot-0.2.1}/README.md RENAMED

@@ -49,6 +49,9 @@ pvl_candidates = [pcol, 'P', 'P-value', 'Wald_P', 'pvalue', 'p_val', 'pval']
 bld_candidates = [build, 'BUILD', 'Genome', 'Genome_Build', 'Genome-build']
 ```
+> NB: Upper- and lower-case versions of the candidates are also considered, so each candidate is expanded into three variants.
 Since GWAS summary stats files can be very large, to improve speed and memory efficiency, it is
 **highly recommended** to use `-tp, --trim_pval` with a value to exclude variants with p-value above a
 certain threshold, e.g. `0.01 (1e-2)` or `0.001 (1e-3)`.
@@ -56,6 +59,8 @@ certain threshold, e.g. `0.01 (1e-2)` or `0.001 (1e-3)`.
 A potentially useful application is **comparative visualization** of results from multiple imputation panels,
 multiple populations, or multiple traits to observe shared genetic architecture.
+Read more in the package documentation page: https://pycmplot.readthedocs.io/en/latest/
 ---
 ## Installation
@@ -149,7 +154,8 @@ pycmplot \
 |------|-------------|---------|
 | `-s, --sum_stats` | Comma-separated sumstats files | **required** |
 | `-l, --labels` | Comma-separated track labels | **required** |
-| `-b, --build_column` | Genome build column name (containing hg18/hg19/hg38) | **required** |
+| `-b, --build` | Comma-separated genome builds of sumstats  | off |
+| `-bc, --build_column` | Genome build column name (containing hg18/hg19/hg38) | off |
 | `-m, --mode` | `lm` linear or `cm` circular | `lm` |
 | `-qq, --qq_plot` | Also generate a QQ-plot | off (coming soon...) |
 | `--logp` | Plot -log10(p) | off |
@@ -157,7 +163,7 @@ pycmplot \
 | `-sigl, --signif_line` | Value for genome-wide significance line if different from `-sig` | 5e-8 |
 | `-sug, --suggest_threshold` | Threshold for suggestive signals | off |
 | `-hl, --highlight` | Highlight significant loci | off |
-| `-a, --annotate` | Annotate with `SNP` or `GENE` | `SNP` |
+| `-a, --annotate` | Annotate with `snp`, `gene`, or any column in `hits_table` | `snp` |
 | `-tp, --trim_pval` | Trim variants above this p-value for speed | off |
 | `-st, --sort_track` | Sort tracks by `label` or `chrom_len` | input order |
 | `-od, --output_dir` | Output directory | `.` |
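
For readers unfamiliar with the options in the table above, the sketch below shows what `--trim_pval`, `--logp`, and the "auto 0.05/N" default for `--signif_threshold` plausibly compute. It is an illustration under assumptions, not the package's code: the function names are hypothetical, N is assumed to be the number of variants tested, and non-positive p-values are dropped so that the logarithm is defined.

```python
import math
from typing import Iterable, Optional


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Bonferroni-style threshold alpha / N (assumed meaning of 'auto 0.05/N')."""
    return alpha / n_tests


def trim_and_transform(
    pvals: Iterable[float],
    trim_pval: Optional[float] = None,
    logp: bool = True,
) -> list[float]:
    """Drop p-values above trim_pval, then optionally convert to -log10(p)."""
    kept = [
        p for p in pvals
        if p > 0 and (trim_pval is None or p <= trim_pval)
    ]
    return [-math.log10(p) for p in kept] if logp else kept


pvals = [0.5, 0.02, 1e-3, 5e-9]
print(trim_and_transform(pvals, trim_pval=0.01))  # two values survive the trim
print(bonferroni_threshold(1_000_000))            # threshold for one million tests
```

Trimming before plotting discards the bulk of non-significant variants, which is why the README recommends it for large files.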

{pycmplot-0.1.9 → pycmplot-0.2.1}/docs/conf.py RENAMED

@@ -12,7 +12,7 @@ sys.path.insert(0, os.path.abspath(".."))
 project = "pycmplot"
 copyright = "2026, Kevin Esoh"
 author = "Kevin Esoh"
-release = "0.1.9"  # update to match your PyPI version
+release = "0.2.1"  # update to match PyPI version
 # -- General configuration -----------------------------------------------------
 extensions = [

{pycmplot-0.1.9 → pycmplot-0.2.1}/pycmplot/__init__.py RENAMED

@@ -42,4 +42,4 @@ __all__ = [
     "ResourceConfig",
 ]
-__version__ = "0.1.9"
+__version__ = "0.2.1"

{pycmplot-0.1.9 → pycmplot-0.2.1}/pycmplot/_core.py RENAMED

@@ -1,6 +1,6 @@
 from __future__ import annotations
-CORE_MODULE = '''"""
+CORE_MODULE = """
 pycmplot._core
 ==============
@@ -12,7 +12,7 @@ work to :mod:`pycmplot.io`, :mod:`pycmplot.plotting.linear`, and
 All imports are deferred inside :func:`main` so that
 ``import pycmplot`` remains fast regardless of the size of the dependency
 tree.
-"""'''
+"""
 import logging
 import warnings
@@ -26,7 +26,7 @@ logger = logging.getLogger(__name__)
 def main() -> None:
-    MAIN = '''"""Orchestrate the full pycmplot pipeline from the command line.
+    MAIN = """Orchestrate the full pycmplot pipeline from the command line.
     This function is registered as the ``pycmplot`` console-script entry point
     in ``pyproject.toml`` / ``setup.cfg``.  It performs the following steps in
@@ -75,7 +75,7 @@ def main() -> None:
         Linear Manhattan plotter called for ``--mode lm`` (default).
     pycmplot.plotting.circular.plot_circular :
         Circular Manhattan plotter called for ``--mode cm``.
-    """'''
+    """
     # ------------------------------------------------------------------
     # Deferred imports so ``import pycmplot`` remains fast
@@ -105,7 +105,8 @@ def main() -> None:
     chrom_arg        = args.chrom_column
     pos_arg          = args.pos_column
     snp_arg          = args.snp_column
-    build_arg        = args.build_column
+    build_arg        = args.build
+    buildc_arg       = args.build_column
     labels_raw       = args.labels
     pcol_arg         = args.pval_column
     logp             = args.logp
@@ -123,13 +124,13 @@ def main() -> None:
     point_size       = args.point_size
     highlight        = args.highlight
     highlight_thresh = args.highlight_thresh
-    highight_color   = args.highight_color
+    highlight_color   = args.highlight_color
     highlight_line   = args.highlight_line
-    highight_line_color = args.highight_line_color
+    highlight_line_color = args.highlight_line_color
     colors_raw       = args.colors
-    r_min            = args.r_min
-    r_max            = args.r_max
-    pad              = args.pad
+    r_min            = args.min_radius
+    r_max            = args.max_radius
+    pad              = args.circular_track_spacing
     output_format    = args.output_format
     output_dir       = args.output_dir
     dpi              = args.dpi
@@ -142,18 +143,20 @@ def main() -> None:
     # ------------------------------------------------------------------
-    # Sumstat, labels, colours, track heights str to list
+    # Sumstat, labels, colours, track heights [build] str to list
     # ------------------------------------------------------------------
     (
         sum_stats,
         labels,
         colors,
-        t_heights
+        t_heights,
+        builds
     ) = strip_comma_separated_input_streams(
         sum_stats = sum_stats_raw,
         labels = labels_raw,
         colors_raw = colors_raw,
         track_heights = track_heights,
+        builds = build_arg if build_arg else None,
     )
     # ------------------------------------------------------------------
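
The hunk above threads a new comma-separated `--build` value through `strip_comma_separated_input_streams`, whose body is not shown in this diff. As a rough sketch of the kind of handling involved, the hypothetical helper `split_builds` below splits the raw string and checks that there is one build per summary statistics file. The length check is an assumption for illustration; the package may behave differently.

```python
from typing import Optional


def split_builds(builds_raw: Optional[str], n_files: int) -> Optional[list[str]]:
    """Split a comma-separated build string into one entry per input file.

    Returns None when no build string was given, mirroring the
    ``builds = build_arg if build_arg else None`` call in the diff.
    """
    if not builds_raw:
        return None
    builds = [b.strip() for b in builds_raw.split(",") if b.strip()]
    # Assumption: each summary statistics file needs exactly one build label.
    if len(builds) != n_files:
        raise ValueError(
            f"expected {n_files} genome builds, got {len(builds)}: {builds}"
        )
    return builds


print(split_builds("hg19, hg38", n_files=2))
print(split_builds(None, n_files=2))
```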
@@ -182,7 +185,8 @@ def main() -> None:
         pos = pos_arg,
         snp = snp_arg,
         pcol = pcol_arg,
-        build = build_arg
+        buildc = buildc_arg,
+        build = builds
     )
     # ------------------------------------------------------------------
@@ -212,6 +216,19 @@ def main() -> None:
         resources=resources,
     )
+    # ------------------------------------------------------------------
+    # ANNOTATE BY
+    # ------------------------------------------------------------------
+    if annotate:
+        if str(annotate).upper() == "GENE":
+            label_col = 'top_gene'
+        elif str(annotate).upper() == "SNP":
+            label_col = 'SNP'
+        else:
+            label_col = annotate
+        logger.info(f"Anotate by: {label_col}")
     # ------------------------------------------------------------------
     # CIRCULAR MANHATTAN
     # ------------------------------------------------------------------
@@ -224,15 +241,16 @@ def main() -> None:
             signif_lines = signif_lines,
             highlight = highlight,
             highlight_thresh = highlight_thresh,
-            highight_color = highight_color,
+            highlight_color = highlight_color,
             highlight_line = highlight_line,
-            highight_line_color = highight_line_color,
+            highlight_line_color = highlight_line_color,
             colors = colors,
             chrom_label_side = chrom_label_side,
             chrom_label_size = chrom_label_size,
             track_label_size = track_label_size,
             track_label_orientation = track_label_orientation,
             annotate = annotate,
+            label_col = label_col if annotate else None,
             annotation_size = annotation_size,
             hits_table = hits_table,
             sector_sizes = merged_assoc_sector_sizes,
@@ -253,24 +271,25 @@ def main() -> None:
     else:
         logger.info("Generating LINEAR MANHATTAN Plot ...")
         plot_linear(
-            sumstats_loaded = sumstats_loaded,
-            track_heights = t_heights,
+            sumstats_loaded=sumstats_loaded,
+            track_heights=t_heights,
             trim_pval=trim_pval,
             logp=True if logp else False,
             point_size=point_size,
             highlight=highlight,
             highlight_thresh=highlight_thresh,
-            highight_color = highight_color,
-            highlight_line = highlight_line,
-            highight_line_color = highight_line_color,
-            annot_df=hits_table if not hits_table.empty else None,
-            label_col="top_gene",
+            highlight_color=highlight_color,
+            highlight_line=highlight_line,
+            highlight_line_color=highlight_line_color,
+            annotate=annotate,
+            hits_table=hits_table if not hits_table.empty else None,
+            label_col=label_col if annotate else None,
             chr_spacing=chr_spacing,
             linear_track_spacing=linear_track_spacing,
             colors=colors,
             signif_lines=signif_lines,
             plot_title=plot_title,
-            no_track_labels = no_track_labels,
+            no_track_labels=no_track_labels,
             dpi=dpi,
             output_format=output_format,
             output_dir=output_dir,
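
The new "ANNOTATE BY" block in this file maps the `--annotate` value to a hits-table column, and both plot calls then pass `label_col if annotate else None`. The same mapping can be written as a small pure function, sketched below. The name `resolve_label_col` is hypothetical and does not appear in the package; the branch logic follows the added lines in the diff.

```python
from typing import Optional


def resolve_label_col(annotate: Optional[str]) -> Optional[str]:
    """Map the --annotate value to the hits-table column used for labels."""
    if not annotate:
        return None  # annotation disabled: no label column
    key = str(annotate).upper()
    if key == "GENE":
        return "top_gene"  # prioritized gene column from the hits table
    if key == "SNP":
        return "SNP"
    return annotate  # any other value is treated as a column name as given


for value in ("gene", "SNP", "biotype", None):
    print(value, "->", resolve_label_col(value))
```

Returning `None` from the function removes the need for the repeated `label_col if annotate else None` guard at each call site, since `label_col` is always bound.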

{pycmplot-0.1.9 → pycmplot-0.2.1}/pycmplot/annotation.py RENAMED

@@ -1,6 +1,6 @@
 from __future__ import annotations
-MODULE_DOCSTRING = '''"""
+MODULE_DOCSTRING = """
 pycmplot.annotation
 ====================
@@ -20,7 +20,7 @@ Annotation relies on a bundled Ensembl gene-info TSV (hg38 or hg19).  The
 file is resolved through :class:`~pycmplot.resources.ResourceConfig`; custom
 paths can be supplied via the ``PYCMPLOT_GENEINFO_HG38`` /
 ``PYCMPLOT_GENEINFO_HG19`` environment variables.
-"""'''
+"""
 import bisect
 import logging
@@ -41,7 +41,7 @@ logger = logging.getLogger(__name__)
 # ---------------------------------------------------------------------------
 def _build_genes_dict(genes_df: pd.DataFrame) -> dict:
-    BUILD_GENES_DICT = '''"""Build a chromosome-keyed interval dictionary with sorted start positions.
+    BUILD_GENES_DICT = """Build a chromosome-keyed interval dictionary with sorted start positions.
     Pre-processes the gene reference DataFrame into a structure that supports
     efficient O(log N) binary-search lookup of genes near a query position.
@@ -67,7 +67,7 @@ def _build_genes_dict(genes_df: pd.DataFrame) -> dict:
     -----
     This function is called once per :func:`get_hits_summary_table` invocation;
     the result is passed to :func:`_annotate_variant` for each lead SNP.
-    """'''
+    """
     genes_df = genes_df.sort_values(["CHR", "START"])
     genes_dict: dict = {}
@@ -98,7 +98,7 @@ def _annotate_variant(
     window: int = 500_000,
     promoter_window: int = 2_000,
 ) -> dict:
-    ANNOTATE_VARIANT = '''"""Return strand-aware nearest-gene annotation for a single variant.
+    ANNOTATE_VARIANT = """Return strand-aware nearest-gene annotation for a single variant.
     Searches the pre-built *genes_dict* within *window* bp of *pos* on
     *chrom*.  Reports the nearest upstream and downstream genes (relative to
@@ -138,7 +138,7 @@ def _annotate_variant(
         within *promoter_window* bp upstream of any TSS.
         * ``gene_density`` (int) – number of genes with any overlap in the
         search window.
-    """'''
+    """
     _empty = {
         "genic": False,
@@ -238,7 +238,7 @@ def _annotate_and_prioritize_variant(
     promoter_window: int = 2_000,
     biotype_weights: Optional[dict] = None,
 ) -> Optional[dict]:
-    ANNOTATE_PRIORITIZE = '''"""Score and rank candidate genes for a single variant using a composite
+    ANNOTATE_PRIORITIZE = """Score and rank candidate genes for a single variant using a composite
     priority metric.
     Builds a candidate gene set within *window* bp of *pos* on *chrom*, then
@@ -287,7 +287,7 @@ def _annotate_and_prioritize_variant(
         For intergenic variants, ``top_gene`` contains the two nearest flanking
         gene symbols joined by ``'-'`` (e.g. ``'HBB-HBD'``) and ``biotype``
         is set to ``'intergenic'``.
-    """'''
+    """
     if biotype_weights is None:
         biotype_weights = BIOTYPE_WEIGHTS
@@ -386,7 +386,7 @@ def _annotate_and_prioritize_variant(
 # ---------------------------------------------------------------------------
 def _clump_by_distance(df: pd.DataFrame, window_kb: int = 500) -> pd.DataFrame:
-    CLUMP_BY_DISTANCE = '''"""Reduce a lead-SNP table to one representative SNP per locus.
+    CLUMP_BY_DISTANCE = """Reduce a lead-SNP table to one representative SNP per locus.
     Applies greedy distance-based clumping within each chromosome group,
     starting from the most significant SNP (lowest ``P`` or highest ``logP``).
@@ -406,7 +406,7 @@ def _clump_by_distance(df: pd.DataFrame, window_kb: int = 500) -> pd.DataFrame:
     pandas.DataFrame
         Deduplicated locus representatives sorted by chromosome and position
         (natural sort order).
-    """'''
+    """
     window = window_kb * 1000
     clumped: list[pd.Series] = []
@@ -438,7 +438,7 @@ def get_hits_summary_table(
     table_out: Optional[str] = None,
     resources: Optional[ResourceConfig] = None,
 ) -> pd.DataFrame:
-    GET_HITS_SUMMARY_TABLE = '''"""Annotate lead SNPs with nearest genes and write the locus summary table.
+    GET_HITS_SUMMARY_TABLE = """Annotate lead SNPs with nearest genes and write the locus summary table.
     For each lead SNP in *leads_df*, runs two complementary annotation passes:
@@ -528,51 +528,54 @@ def get_hits_summary_table(
             SNP CHR       POS  top_gene           biotype
     0  rs123456   2  60718043    BCL11A    protein_coding
     1  rs789012  11   5246696       HBB    protein_coding
-    """'''
+    """
     if resources is None:
         resources = default_resources
     # Choose gene info file based on build
-    if "OLD_POS" not in leads_df.columns and list(set(leads_df["BUILD"])) == ["hg19"]:
-        geneinfo_path = resources.require("geneinfo_hg19")
-    else:
-        geneinfo_path = resources.require("geneinfo_hg38")
+    if 'BUILD' in leads_df.columns:
+        if "OLD_POS" not in leads_df.columns and list(set(leads_df["BUILD"])) == ["hg19"]:
+            geneinfo_path = resources.require("geneinfo_hg19")
+        else:
+            geneinfo_path = resources.require("geneinfo_hg38")
-    logger.info("Loading gene info from: %s", geneinfo_path)
-    geneinfo = pd.read_csv(geneinfo_path, header=0, sep="\t")
-    genes_dict = _build_genes_dict(geneinfo)
+        logger.info("Loading gene info from: %s", geneinfo_path)
+        geneinfo = pd.read_csv(geneinfo_path, header=0, sep="\t")
+        genes_dict = _build_genes_dict(geneinfo)
-    window = window_kb * 1_000
-    records: list[dict] = []
+        window = window_kb * 1_000
+        records: list[dict] = []
-    logger.info("Annotating lead variants and generating hits summary table ...")
-    for _, row in leads_df.iterrows():
-        annotation = _annotate_variant(
-            chrom=row["CHR"],
-            pos=row["POS"],
-            genes_dict=genes_dict,
-            window=window,
-        )
-        prioritized = _annotate_and_prioritize_variant(
-            chrom=row["CHR"],
-            pos=row["POS"],
-            genes_df=geneinfo,
-            lead_snps_df=leads_df,
-            window=window,
-        )
+        logger.info("Annotating lead variants and generating hits summary table ...")
+        for _, row in leads_df.iterrows():
+            annotation = _annotate_variant(
+                chrom=row["CHR"],
+                pos=row["POS"],
+                genes_dict=genes_dict,
+                window=window,
+            )
+            prioritized = _annotate_and_prioritize_variant(
+                chrom=row["CHR"],
+                pos=row["POS"],
+                genes_df=geneinfo,
+                lead_snps_df=leads_df,
+                window=window,
+            )
-        record = {
-            **(row.to_dict()),
-            **(annotation if annotation is not None else {}),
-            **(prioritized if prioritized is not None else {}),
-        }
-        records.append(record)
+            record = {
+                **(row.to_dict()),
+                **(annotation if annotation is not None else {}),
+                **(prioritized if prioritized is not None else {}),
+            }
+            records.append(record)
-    locus_table = pd.DataFrame(records).sort_values(
-        ["CHR", "POS"], key=natsort.natsort_keygen()
-    )
+        locus_table = pd.DataFrame(records).sort_values(
+            ["CHR", "POS"], key=natsort.natsort_keygen()
+        )
+    else:
+        locus_table = leads_df
     if table_out is not None:
         locus_table.to_csv(table_out, index=False, sep="\t", na_rep="None")
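
The main behavioural change in `get_hits_summary_table` is the new guard on the `BUILD` column: annotation only runs when that column exists, and otherwise the lead-SNP table is written out unchanged. The decision can be isolated as in the sketch below. The function `choose_geneinfo_key` is hypothetical; the resource keys `geneinfo_hg19` and `geneinfo_hg38` are taken from the diff, and reading `OLD_POS` as a marker that positions were already lifted over is an assumption.

```python
from typing import Iterable, Optional


def choose_geneinfo_key(columns: Iterable[str], builds: Iterable[str]) -> Optional[str]:
    """Pick the gene-info resource key, or None when annotation should be skipped."""
    cols = set(columns)
    if "BUILD" not in cols:
        # No build information: the diff falls back to locus_table = leads_df
        return None
    # hg19 gene info is used only when every variant is hg19 and no
    # OLD_POS column is present (assumed to indicate a prior liftover).
    if "OLD_POS" not in cols and set(builds) == {"hg19"}:
        return "geneinfo_hg19"
    return "geneinfo_hg38"


print(choose_geneinfo_key(["CHR", "POS", "BUILD"], ["hg19", "hg19"]))
print(choose_geneinfo_key(["CHR", "POS", "BUILD", "OLD_POS"], ["hg19"]))
print(choose_geneinfo_key(["CHR", "POS"], []))
```

Comparing `set(builds)` with `{"hg19"}` is equivalent to the `list(set(...)) == ["hg19"]` test in the diff.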
