PyPI - cool-seq-tool - Versions diffs - 0.3.0.dev1__tar.gz → 0.4.0.dev0__tar.gz - Mend

cool-seq-tool 0.3.0.dev1__tar.gz → 0.4.0.dev0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (47)

{cool_seq_tool-0.3.0.dev1 → cool_seq_tool-0.4.0.dev0}/LICENSE RENAMED

@@ -1,6 +1,6 @@
 MIT License
-Copyright (c) 2021 VICC
+Copyright (c) 2021-2023 Wagner Lab
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal

cool_seq_tool-0.4.0.dev0/PKG-INFO ADDED

@@ -0,0 +1,130 @@
+Metadata-Version: 2.1
+Name: cool_seq_tool
+Version: 0.4.0.dev0
+Summary: Common Operation on Lots of Sequences Tool
+Author: Kori Kuzma, James Stevenson, Katie Stahl, Alex Wagner
+License: MIT License
+        Copyright (c) 2021-2023 Wagner Lab
+        Permission is hereby granted, free of charge, to any person obtaining a copy
+        of this software and associated documentation files (the "Software"), to deal
+        in the Software without restriction, including without limitation the rights
+        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+        copies of the Software, and to permit persons to whom the Software is
+        furnished to do so, subject to the following conditions:
+        The above copyright notice and this permission notice shall be included in all
+        copies or substantial portions of the Software.
+        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+        SOFTWARE.
+Project-URL: Homepage, https://github.com/genomicmedlab/cool-seq-tool
+Project-URL: Documentation, https://coolseqtool.readthedocs.io/en/latest/index.html
+Project-URL: Changelog, https://github.com/genomicmedlab/cool-seq-tool/releases
+Project-URL: Source, https://github.com/genomicmedlab/cool-seq-tool
+Project-URL: Bug Tracker, https://github.com/genomicmedlab/cool-seq-tool/issues
+Classifier: Development Status :: 3 - Alpha
+Classifier: Framework :: FastAPI
+Classifier: Framework :: Pydantic
+Classifier: Framework :: Pydantic :: 2
+Classifier: Intended Audience :: Science/Research
+Classifier: Intended Audience :: Developers
+Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.8
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Requires-Python: >=3.8
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: asyncpg
+Requires-Dist: aiofiles
+Requires-Dist: boto3
+Requires-Dist: pyliftover
+Requires-Dist: polars
+Requires-Dist: hgvs
+Requires-Dist: biocommons.seqrepo
+Requires-Dist: pydantic==2.*
+Requires-Dist: uvicorn
+Requires-Dist: fastapi
+Requires-Dist: ga4gh.vrs
+Provides-Extra: dev
+Requires-Dist: pre-commit; extra == "dev"
+Requires-Dist: ipython; extra == "dev"
+Requires-Dist: ipykernel; extra == "dev"
+Requires-Dist: psycopg2-binary; extra == "dev"
+Requires-Dist: ruff; extra == "dev"
+Provides-Extra: tests
+Requires-Dist: pytest; extra == "tests"
+Requires-Dist: pytest-cov; extra == "tests"
+Requires-Dist: pytest-asyncio==0.18.3; extra == "tests"
+Requires-Dist: mock; extra == "tests"
+Provides-Extra: docs
+Requires-Dist: sphinx==6.1.3; extra == "docs"
+Requires-Dist: sphinx-autodoc-typehints==1.22.0; extra == "docs"
+Requires-Dist: sphinx-autobuild==2021.3.14; extra == "docs"
+Requires-Dist: sphinx-copybutton==0.5.2; extra == "docs"
+Requires-Dist: sphinxext-opengraph==0.8.2; extra == "docs"
+Requires-Dist: furo==2023.3.27; extra == "docs"
+<h1 align="center">
+CoolSeqTool
+</h1>
+**[Documentation](https://coolseqtool.readthedocs.io/en/latest/)** · [Installation](https://coolseqtool.readthedocs.io/en/latest/install.html) · [Usage](https://coolseqtool.readthedocs.io/en/latest/usage.html) · [API reference](https://coolseqtool.readthedocs.io/en/latest/reference/index.html)
+## Overview
+<!-- description -->
+The **CoolSeqTool** provides:
+ - A Pythonic API on top of sequence data of interest to tertiary analysis tools, including mappings between gene names and transcripts, [MANE transcript](https://www.ncbi.nlm.nih.gov/refseq/MANE/) descriptions, and the [Universal Transcript Archive](https://github.com/biocommons/uta)
+ - Augmented access to the [SeqRepo](https://github.com/biocommons/biocommons.seqrepo) database, including multiple additional methods and tools
+ - Mapping tools that combine the above to support translation between references sequences, annotation layers, and MANE transcripts
+<!-- /description -->
+---
+## Install
+CoolSeqTool is available on [PyPI](https://pypi.org/project/cool-seq-tool)
+```shell
+python3 -m pip install cool-seq-tool
+```
+See the [installation instructions](https://coolseqtool.readthedocs.io/en/latest/install.html) in the documentation for a description of dependency setup requirements.
+---
+## Usage
+All CoolSeqTool resources can be initialized by way of a top-level class instance:
+```pycon
+>>> from cool_seq_tool.app import CoolSeqTool
+>>> cst = CoolSeqTool()
+>>> result = await cst.mane_transcript.get_mane_transcript(
+...     "NP_004324.2",
+...     599,
+...     AnnotationLayer.PROTEIN,
+...     residue_mode=ResidueMode.INTER_RESIDUE,
+... )
+>>> result.gene, result.refseq, result.status
+('EGFR', 'NM_005228.5', <TranscriptPriority.MANE_SELECT: 'mane_select'>)
+```
+---
+## Feedback and contributing
+We welcome bug reports, feature requests, and code contributions from users and interested collaborators. The [documentation](https://coolseqtool.readthedocs.io/en/latest/contributing.html) contains guidance for submitting feedback and contributing new code.

cool_seq_tool-0.4.0.dev0/README.md ADDED

@@ -0,0 +1,52 @@
+<h1 align="center">
+CoolSeqTool
+</h1>
+**[Documentation](https://coolseqtool.readthedocs.io/en/latest/)** · [Installation](https://coolseqtool.readthedocs.io/en/latest/install.html) · [Usage](https://coolseqtool.readthedocs.io/en/latest/usage.html) · [API reference](https://coolseqtool.readthedocs.io/en/latest/reference/index.html)
+## Overview
+<!-- description -->
+The **CoolSeqTool** provides:
+ - A Pythonic API on top of sequence data of interest to tertiary analysis tools, including mappings between gene names and transcripts, [MANE transcript](https://www.ncbi.nlm.nih.gov/refseq/MANE/) descriptions, and the [Universal Transcript Archive](https://github.com/biocommons/uta)
+ - Augmented access to the [SeqRepo](https://github.com/biocommons/biocommons.seqrepo) database, including multiple additional methods and tools
+ - Mapping tools that combine the above to support translation between references sequences, annotation layers, and MANE transcripts
+<!-- /description -->
+---
+## Install
+CoolSeqTool is available on [PyPI](https://pypi.org/project/cool-seq-tool)
+```shell
+python3 -m pip install cool-seq-tool
+```
+See the [installation instructions](https://coolseqtool.readthedocs.io/en/latest/install.html) in the documentation for a description of dependency setup requirements.
+---
+## Usage
+All CoolSeqTool resources can be initialized by way of a top-level class instance:
+```pycon
+>>> from cool_seq_tool.app import CoolSeqTool
+>>> cst = CoolSeqTool()
+>>> result = await cst.mane_transcript.get_mane_transcript(
+...     "NP_004324.2",
+...     599,
+...     AnnotationLayer.PROTEIN,
+...     residue_mode=ResidueMode.INTER_RESIDUE,
+... )
+>>> result.gene, result.refseq, result.status
+('EGFR', 'NM_005228.5', <TranscriptPriority.MANE_SELECT: 'mane_select'>)
+```
+---
+## Feedback and contributing
+We welcome bug reports, feature requests, and code contributions from users and interested collaborators. The [documentation](https://coolseqtool.readthedocs.io/en/latest/contributing.html) contains guidance for submitting feedback and contributing new code.

cool_seq_tool-0.4.0.dev0/pyproject.toml ADDED

@@ -0,0 +1,122 @@
+[project]
+name = "cool_seq_tool"
+authors = [
+    {name = "Kori Kuzma"},
+    {name = "James Stevenson"},
+    {name = "Katie Stahl"},
+    {name = "Alex Wagner"},
+]
+readme = "README.md"
+classifiers = [
+    "Development Status :: 3 - Alpha",
+    "Framework :: FastAPI",
+    "Framework :: Pydantic",
+    "Framework :: Pydantic :: 2",
+    "Intended Audience :: Science/Research",
+    "Intended Audience :: Developers",
+    "Topic :: Scientific/Engineering :: Bio-Informatics",
+    "License :: OSI Approved :: MIT License",
+    "Programming Language :: Python :: 3",
+    "Programming Language :: Python :: 3.8",
+    "Programming Language :: Python :: 3.9",
+    "Programming Language :: Python :: 3.10",
+    "Programming Language :: Python :: 3.11",
+]
+requires-python = ">=3.8"
+description = "Common Operation on Lots of Sequences Tool"
+license = {file = "LICENSE"}
+dependencies = [
+    "asyncpg",
+    "aiofiles",
+    "boto3",
+    "pyliftover",
+    "polars",
+    "hgvs",
+    "biocommons.seqrepo",
+    "pydantic == 2.*",
+    "uvicorn",
+    "fastapi",
+    "ga4gh.vrs",
+]
+dynamic = ["version"]
+[project.optional-dependencies]
+dev = ["pre-commit", "ipython", "ipykernel", "psycopg2-binary", "ruff"]
+tests = ["pytest", "pytest-cov", "pytest-asyncio==0.18.3", "mock"]
+docs = [
+    "sphinx==6.1.3",
+    "sphinx-autodoc-typehints==1.22.0",
+    "sphinx-autobuild==2021.3.14",
+    "sphinx-copybutton==0.5.2",
+    "sphinxext-opengraph==0.8.2",
+    "furo==2023.3.27",
+]
+[project.urls]
+Homepage = "https://github.com/genomicmedlab/cool-seq-tool"
+Documentation = "https://coolseqtool.readthedocs.io/en/latest/index.html"
+Changelog = "https://github.com/genomicmedlab/cool-seq-tool/releases"
+Source = "https://github.com/genomicmedlab/cool-seq-tool"
+"Bug Tracker" = "https://github.com/genomicmedlab/cool-seq-tool/issues"
+[build-system]
+requires = ["setuptools>=61.0"]
+build-backend = "setuptools.build_meta"
+[tool.setuptools.dynamic]
+version = {attr = "cool_seq_tool.version.__version__"}
+[tool.setuptools.packages.find]
+where = ["src"]
+[tool.pytest.ini_options]
+addopts = "--cov=src --cov-report term-missing"
+testpaths = ["tests"]
+[tool.coverage.run]
+branch = true
+[tool.ruff]
+src = ["src"]
+exclude = ["docs/source/conf.py"]
+# pycodestyle (E, W)
+# Pyflakes (F)
+# flake8-annotations (ANN)
+# pydocstyle (D)
+# pep8-naming (N)
+# isort (I)
+select = ["E", "W", "F", "ANN", "D", "N", "I"]
+fixable = ["I", "F401"]
+# ANN101 - missing-type-self
+# ANN003 - missing-type-kwargs
+# D203 - one-blank-line-before-class
+# D205 - blank-line-after-summary
+# D206 - indent-with-spaces*
+# D213 - multi-line-summary-second-line
+# D300 - triple-single-quotes*
+# D400 - ends-in-period
+# D415 - ends-in-punctuation
+# E111 - indentation-with-invalid-multiple*
+# E114 - indentation-with-invalid-multiple-comment*
+# E117 - over-indented*
+# E501 - line-too-long*
+# W191 - tab-indentation*
+# *ignored for compatibility with formatter
+ignore = [
+    "ANN101", "ANN003",
+    "D203", "D205", "D206", "D213", "D300", "D400", "D415",
+    "E111", "E114", "E117", "E501",
+    "W191"
+]
+[tool.ruff.per-file-ignores]
+# ANN001 - missing-type-function-argument
+# ANN2 - missing-return-type
+# ANN102 - missing-type-cls
+# N805 - invalid-first-argument-name-for-method
+# F821 - undefined-name
+# F401 - unused-import
+"tests/*" = ["ANN001", "ANN2", "ANN102"]
+"*__init__.py" = ["F401"]
+"src/cool_seq_tool/schemas.py" = ["ANN201", "N805", "ANN001"]

cool_seq_tool-0.4.0.dev0/setup.cfg ADDED

@@ -0,0 +1,4 @@
+[egg_info]
+tag_build =
+tag_date = 0

{cool_seq_tool-0.3.0.dev1 → cool_seq_tool-0.4.0.dev0/src}/cool_seq_tool/api.py RENAMED

@@ -24,16 +24,16 @@ def custom_openapi() -> Dict:
     if app.openapi_schema:
         return app.openapi_schema
     openapi_schema = get_openapi(
-        title="The GenomicMedLab Cool Seq Tool",
+        title="The GenomicMedLab Cool-Seq-Tool",
         version=__version__,
-        description="Common Operations On Lots-of Sequences Tool.",
+        description="Common Operations On Lots of Sequences Tool.",
         routes=app.routes,
     )
     openapi_schema["info"]["contact"] = {
         "name": "Alex H. Wagner",
         "email": "Alex.Wagner@nationwidechildrens.org",
-        "url": "https://www.nationwidechildrens.org/specialties/institute-for-genomic-medicine/research-labs/wagner-lab",  # noqa: E501
+        "url": "https://www.nationwidechildrens.org/specialties/institute-for-genomic-medicine/research-labs/wagner-lab",
     }
     app.openapi_schema = openapi_schema
     return app.openapi_schema

cool_seq_tool-0.4.0.dev0/src/cool_seq_tool/app.py ADDED

@@ -0,0 +1,90 @@
+"""Provides core CoolSeqTool class, which non-redundantly initializes all Cool-Seq-Tool
+data handler and mapping resources for straightforward access.
+"""
+import logging
+from pathlib import Path
+from typing import Optional
+from biocommons.seqrepo import SeqRepo
+from cool_seq_tool.handlers.seqrepo_access import SeqRepoAccess
+from cool_seq_tool.mappers import (
+    AlignmentMapper,
+    ExonGenomicCoordsMapper,
+    ManeTranscript,
+)
+from cool_seq_tool.paths import (
+    LRG_REFSEQGENE_PATH,
+    MANE_SUMMARY_PATH,
+    SEQREPO_ROOT_DIR,
+    TRANSCRIPT_MAPPINGS_PATH,
+)
+from cool_seq_tool.sources.mane_transcript_mappings import ManeTranscriptMappings
+from cool_seq_tool.sources.transcript_mappings import TranscriptMappings
+from cool_seq_tool.sources.uta_database import UTA_DB_URL, UtaDatabase
+logger = logging.getLogger(__name__)
+class CoolSeqTool:
+    """Non-redundantly initialize all Cool-Seq-Tool data resources, available under the
+    following attribute names:
+    * ``self.seqrepo_access``: :py:class:`SeqRepoAccess <cool_seq_tool.handlers.seqrepo_access.SeqRepoAccess>`
+    * ``self.transcript_mappings``: :py:class:`TranscriptMappings <cool_seq_tool.sources.transcript_mappings.TranscriptMappings>`
+    * ``self.mane_transcript_mappings``: :py:class:`ManeTranscriptMappings <cool_seq_tool.sources.mane_transcript_mappings.ManeTranscriptMappings>`
+    * ``self.uta_db``: :py:class:`UtaDatabase <cool_seq_tool.sources.uta_database.UtaDatabase>`
+    * ``self.alignment_mapper``: :py:class:`AlignmentMapper <cool_seq_tool.mappers.alignment.AlignmentMapper>`
+    * ``self.mane_transcript``: :py:class:`ManeTranscript <cool_seq_tool.mappers.mane_transcript.ManeTranscript>`
+    * ``self.ex_g_coords_mapper``: :py:class:`ExonGenomicCoordsMapper <cool_seq_tool.mappers.exon_genomic_coords.ExonGenomicCoordsMapper>`
+    Initialization with default resource locations is straightforward:
+    .. code-block:: pycon
+       >>> from cool_seq_tool.app import CoolSeqTool
+       >>> cst = CoolSeqTool()
+    See the :ref:`configuration <configuration>` section for more information.
+    """
+    def __init__(
+        self,
+        transcript_file_path: Path = TRANSCRIPT_MAPPINGS_PATH,
+        lrg_refseqgene_path: Path = LRG_REFSEQGENE_PATH,
+        mane_data_path: Path = MANE_SUMMARY_PATH,
+        db_url: str = UTA_DB_URL,
+        sr: Optional[SeqRepo] = None,
+    ) -> None:
+        """Initialize CoolSeqTool class
+        :param transcript_file_path: The path to ``transcript_mapping.tsv``
+        :param lrg_refseqgene_path: The path to the LRG_RefSeqGene file
+        :param mane_data_path: Path to RefSeq MANE summary data
+        :param db_url: PostgreSQL connection URL
+            Format: ``driver://user:password@host/database/schema``
+        :param sr: SeqRepo instance. If this is not provided, will create a new instance
+        """
+        if not sr:
+            sr = SeqRepo(root_dir=SEQREPO_ROOT_DIR)
+        self.seqrepo_access = SeqRepoAccess(sr)
+        self.transcript_mappings = TranscriptMappings(
+            transcript_file_path=transcript_file_path,
+            lrg_refseqgene_path=lrg_refseqgene_path,
+        )
+        self.mane_transcript_mappings = ManeTranscriptMappings(
+            mane_data_path=mane_data_path
+        )
+        self.uta_db = UtaDatabase(db_url=db_url)
+        self.alignment_mapper = AlignmentMapper(
+            self.seqrepo_access, self.transcript_mappings, self.uta_db
+        )
+        self.mane_transcript = ManeTranscript(
+            self.seqrepo_access,
+            self.transcript_mappings,
+            self.mane_transcript_mappings,
+            self.uta_db,
+        )
+        self.ex_g_coords_mapper = ExonGenomicCoordsMapper(
+            self.uta_db, self.mane_transcript
+        )
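
Note on `db_url`: the constructor docstring above gives the connection format as `driver://user:password@host/database/schema`. As a rough illustration of how such a URL breaks down, the sketch below splits one with the standard library's `urllib.parse`. The helper name `split_uta_db_url` and the example URL (user, password, host, port, schema name) are made up for illustration and are not part of cool-seq-tool; the package's own handling of the URL may differ.

```python
from typing import Dict, Optional
from urllib.parse import urlparse


def split_uta_db_url(db_url: str) -> Dict[str, Optional[str]]:
    """Split a ``driver://user:password@host/database/schema`` URL into parts.

    Hypothetical helper for illustration only; not part of cool-seq-tool.
    """
    parsed = urlparse(db_url)
    # The URL path holds "/database/schema"; split it at the first inner slash.
    database, _, schema = parsed.path.lstrip("/").partition("/")
    return {
        "driver": parsed.scheme,
        "user": parsed.username,
        "password": parsed.password,
        "host": parsed.hostname,
        "port": str(parsed.port) if parsed.port is not None else None,
        "database": database or None,
        "schema": schema or None,
    }


# Placeholder values, not real credentials or a real schema name.
parts = split_uta_db_url("postgresql://some_user:some_pw@localhost:5432/uta/uta_schema")
print(parts["database"], parts["schema"])
```

A URL that omits the trailing `/schema` segment would come back with `schema` set to `None` in this sketch, which may be a convenient place to fail early before a connection is attempted.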

{cool_seq_tool-0.3.0.dev1 → cool_seq_tool-0.4.0.dev0/src}/cool_seq_tool/data/data_downloads.py RENAMED

@@ -1,4 +1,4 @@
-"""Module for handling downloadable data files."""
+"""Handle acquisition of external data."""
 import datetime
 import gzip
 import logging
@@ -15,8 +15,11 @@ logger = logging.getLogger("cool_seq_tool")
 class DataDownload:
-    """Class for managing downloadable data files. Responsible for checking if files
-    are available under default locations, and fetching them if not.
+    """Manage downloadable data files. Responsible for checking if files are available
+    under expected locations, and fetching them if not.
+    Relevant methods are called automatically by data classes; users should not have
+    to interact with this class under normal circumstances.
     """
     def __init__(self) -> None:
@@ -25,7 +28,7 @@ class DataDownload:
     def get_mane_summary(self) -> Path:
         """Identify latest MANE summary data. If unavailable locally, download from
-        source.
+        `NCBI FTP server <https://ftp.ncbi.nlm.nih.gov/refseq/MANE/MANE_human/current/>`_.
         :return: path to MANE summary file
         """
@@ -52,7 +55,7 @@ class DataDownload:
     def get_lrg_refseq_gene_data(self) -> Path:
         """Identify latest LRG RefSeq Gene file. If unavailable locally, download from
-        source.
+        `NCBI FTP server <https://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/RefSeqGene/>`_.
         :return: path to acquired LRG RefSeq Gene data file
         """

{cool_seq_tool-0.3.0.dev1 → cool_seq_tool-0.4.0.dev0/src}/cool_seq_tool/handlers/seqrepo_access.py RENAMED

@@ -1,4 +1,6 @@
-"""A module for accessing SeqRepo."""
+"""Wrap SeqRepo to provide additional lookup and identification methods on top of basic
+dereferencing functions.
+"""
 import logging
 from os import environ
 from pathlib import Path
@@ -13,7 +15,9 @@ logger = logging.getLogger(__name__)
 class SeqRepoAccess(SeqRepoDataProxy):
-    """The SeqRepoAccess class."""
+    """Provide a wrapper around the base SeqRepoDataProxy class from ``VRS-Python`` to
+    provide additional lookup and identification methods.
+    """
     environ["SEQREPO_LRU_CACHE_MAXSIZE"] = "none"
@@ -24,25 +28,37 @@ class SeqRepoAccess(SeqRepoDataProxy):
         end: Optional[int] = None,
         residue_mode: ResidueMode = ResidueMode.RESIDUE,
     ) -> Tuple[str, Optional[str]]:
-        """Get reference sequence for an accession given a start and end position.
-        If `start` and `end` are not given, it will return the entire reference sequence
+        """Get reference sequence for an accession given a start and end position. If
+        ``start`` and ``end`` are not given, returns the entire reference sequence.
+        >>> from cool_seq_tool.handlers import SeqRepoAccess
+        >>> from biocommons.seqrepo import SeqRepo
+        >>> sr = SeqRepoAccess(SeqRepo("/usr/local/share/seqrepo/latest"))
+        >>> sr.get_reference_sequence("NM_002529.3", 1, 10)[0]
+        'TGCAGCTGG'
+        >>> sr.get_reference_sequence("NP_001341538.1", 1, 10)[0]
+        'MAALSGGGG'
         :param ac: Accession
         :param start: Start pos change
-        :param end: End pos change. If `None` assumes both `start` and `end` have same
-            values, if `start` exists.
-        :param residue_mode: Residue mode for `start` and `end`
+        :param end: End pos change. If ``None`` assumes both ``start`` and ``end`` have
+            same values, if ``start`` exists.
+        :param residue_mode: Residue mode for ``start`` and ``end``
         :return: Sequence at position (if accession and positions actually
             exist, else return empty string), warning if any
         """
-        if start or end:
-            pos, warning = get_inter_residue_pos(start, residue_mode, end_pos=end)
-            if pos is None:
-                return "", warning
-            else:
-                start, end = pos
-                if start == end:
-                    end += 1
+        if start and end:
+            if start > end:
+                msg = f"start ({start}) cannot be greater than end ({end})"
+                return "", msg
+            start, end = get_inter_residue_pos(start, end, residue_mode)
+            if start == end:
+                end += 1
+        else:
+            if start is not None and residue_mode == ResidueMode.RESIDUE:
+                start -= 1
         try:
             sequence = self.sr.fetch(ac, start=start, end=end)
         except KeyError:
@@ -53,18 +69,12 @@ class SeqRepoAccess(SeqRepoDataProxy):
             error = str(e)
             if error.startswith("start out of range"):
                 msg = (
-                    f"Start inter-residue coordinate ({start}) is out of "
-                    f"index on {ac}"
+                    f"Start inter-residue coordinate ({start}) is out of index on {ac}"
                 )
             elif error.startswith("stop out of range"):
                 msg = (
                     f"End inter-residue coordinate ({end}) is out of " f"index on {ac}"
                 )
-            elif error.startswith("invalid coordinates") and ">" in error:
-                msg = (
-                    f"Invalid inter-residue coordinates: start ({start}) "
-                    f"cannot be greater than end ({end})"
-                )
             else:
                 msg = f"{e}"
             logger.warning(msg)
@@ -78,8 +88,7 @@ class SeqRepoAccess(SeqRepoDataProxy):
                 if len(sequence) != expected_len_of_seq:
                     return (
                         "",
-                        f"End inter-residue coordinate ({end})"
-                        f" is out of index on {ac}",
+                        f"End inter-residue coordinate ({end}) is out of index on {ac}",
                     )
             return sequence, None
@@ -88,6 +97,14 @@ class SeqRepoAccess(SeqRepoDataProxy):
     ) -> Tuple[List[str], Optional[str]]:
         """Return list of identifiers for accession.
+        >>> from cool_seq_tool.handlers import SeqRepoAccess
+        >>> from biocommons.seqrepo import SeqRepo
+        >>> sr = SeqRepoAccess(SeqRepo("/usr/local/share/seqrepo/latest"))
+        >>> sr.translate_identifier("NM_002529.3")[0]
+        ['MD5:18f0a6e3af9e1bbd8fef1948c7156012', 'NCBI:NM_002529.3', 'refseq:NM_002529.3', 'SEGUID:dEJQBkga9d9VeBHTyTbg6JEtTGQ', 'SHA1:74425006481af5df557811d3c936e0e8912d4c64', 'VMC:GS_RSkww1aYmsMiWbNdNnOTnVDAM3ZWp1uA', 'sha512t24u:RSkww1aYmsMiWbNdNnOTnVDAM3ZWp1uA', 'ga4gh:SQ.RSkww1aYmsMiWbNdNnOTnVDAM3ZWp1uA']
+        >>> sr.translate_identifier("NM_002529.3", "ga4gh")[0]
+        ['ga4gh:SQ.RSkww1aYmsMiWbNdNnOTnVDAM3ZWp1uA']
         :param ac: Identifier accession
         :param target_namespace: The namespace(s) of identifier to return
         :return: List of identifiers, warning
@@ -123,7 +140,7 @@ class SeqRepoAccess(SeqRepoDataProxy):
     ) -> Tuple[Optional[List[str]], Optional[str]]:
         """Get accessions for a chromosome
-        :param str chromosome: Chromosome number. Must be either 1-22, X, or Y
+        :param chromosome: Chromosome number. Must be either 1-22, X, or Y
         :return: Accessions for chromosome (ordered by latest assembly)
         """
         acs = []
@@ -160,9 +177,20 @@ class SeqRepoAccess(SeqRepoDataProxy):
     def get_fasta_file(self, sequence_id: str, outfile_path: Path) -> None:
         """Retrieve FASTA file containing sequence for requested sequence ID.
-        :param sequence_id: accession ID, sans namespace, eg `NM_152263.3`
+        >>> from pathlib import Path
+        >>> from cool_seq_tool.handlers import SeqRepoAccess
+        >>> from biocommons.seqrepo import SeqRepo
+        >>> sr = SeqRepoAccess(SeqRepo("/usr/local/share/seqrepo/latest"))
+        >>> # write to local file tpm3.fasta:
+        >>> sr.get_fasta_file("NM_002529.3", Path("tpm3.fasta"))
+        FASTA file headers will include GA4GH sequence digest, Ensembl accession ID,
+        and RefSeq accession ID.
+        :param sequence_id: accession ID, sans namespace, eg ``NM_152263.3``
         :param outfile_path: path to save file to
-        :return: None, but saves sequence data to `outfile_path` if successful
+        :return: None, but saves sequence data to ``outfile_path`` if successful
         :raise: KeyError if SeqRepo doesn't have sequence data for the given ID
         """
         sequence = self.get_reference_sequence(sequence_id)[0]
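
Note on the `get_reference_sequence` change: the reworked branch rejects `start > end` up front and then converts the coordinates before `self.sr.fetch` is called. A standalone sketch of that control flow follows. `to_inter_residue` is a hypothetical stand-in for `get_inter_residue_pos`, whose body is not shown in this diff; it assumes that residue-mode input is converted by decrementing `start`. `normalize_range` and the two-member `ResidueMode` enum are likewise illustrative, and the sketch returns `None` bounds with the message where the diffed method returns an empty sequence with the message.

```python
from enum import Enum
from typing import Optional, Tuple


class ResidueMode(str, Enum):
    """Minimal stand-in for the package's ResidueMode enum (values are placeholders)."""

    RESIDUE = "residue"
    INTER_RESIDUE = "inter-residue"


def to_inter_residue(start: int, end: int, mode: ResidueMode) -> Tuple[int, int]:
    """Hypothetical stand-in for get_inter_residue_pos (ASSUMED behavior)."""
    if mode == ResidueMode.RESIDUE:
        start -= 1
    return start, end


def normalize_range(
    start: Optional[int],
    end: Optional[int],
    mode: ResidueMode = ResidueMode.RESIDUE,
) -> Tuple[Optional[int], Optional[int], Optional[str]]:
    """Mirror the new start/end handling; return (start, end, warning)."""
    if start and end:
        if start > end:
            return None, None, f"start ({start}) cannot be greater than end ({end})"
        start, end = to_inter_residue(start, end, mode)
        if start == end:
            # Widen an empty interval so that one residue is fetched.
            end += 1
    else:
        # Only one bound (or neither) was supplied.
        if start is not None and mode == ResidueMode.RESIDUE:
            start -= 1
    return start, end, None


print(normalize_range(1, 10))
print(normalize_range(10, 1))
print(normalize_range(5, None))
```

Because the first branch tests truthiness rather than `is not None`, a `start` of 0 is handled by the `else` branch here, as it would be in the diffed code.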

{cool_seq_tool-0.3.0.dev1 → cool_seq_tool-0.4.0.dev0/src}/cool_seq_tool/mappers/__init__.py RENAMED

@@ -1,4 +1,7 @@
 """Module for mapping data"""
 from .alignment import AlignmentMapper  # noqa: I001
-from .mane_transcript import MANETranscript
+from .mane_transcript import ManeTranscript
 from .exon_genomic_coords import ExonGenomicCoordsMapper
+__all__ = ["AlignmentMapper", "ManeTranscript", "ExonGenomicCoordsMapper"]
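
Note on the rename: this hunk changes the exported class from `MANETranscript` to `ManeTranscript`, so an import written against 0.3.0.dev1 will not resolve on 0.4.0.dev0. One possible way for downstream code to tolerate both names is sketched below. `resolve_mane_transcript` is a hypothetical helper, not something the package provides, and the demo uses stand-in module objects rather than the installed package.

```python
from types import ModuleType
from typing import Any


def resolve_mane_transcript(mappers: ModuleType) -> Any:
    """Return the MANE transcript class under its new or old name."""
    # Prefer the 0.4.0.dev0 spelling, then fall back to the 0.3.0.dev1 one.
    for name in ("ManeTranscript", "MANETranscript"):
        cls = getattr(mappers, name, None)
        if cls is not None:
            return cls
    raise ImportError("no MANE transcript class found in " + mappers.__name__)


# Demo with stand-in modules, so no installed package is needed.
old_style = ModuleType("fake_mappers_old")
old_style.MANETranscript = type("MANETranscript", (), {})
new_style = ModuleType("fake_mappers_new")
new_style.ManeTranscript = type("ManeTranscript", (), {})

print(resolve_mane_transcript(old_style).__name__)
print(resolve_mane_transcript(new_style).__name__)
```

With the real package, the argument would be the imported `cool_seq_tool.mappers` module.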
