PyPI - align-trim - Versions diffs - 1.1.0__tar.gz - Mend

align-trim 1.1.0__tar.gz

Files changed (24)

align_trim-1.1.0/.github/ISSUE_TEMPLATE/bug_report.yml +40 -0
align_trim-1.1.0/.github/ISSUE_TEMPLATE/feature_request.yml +11 -0
align_trim-1.1.0/.github/workflows/pytest.yml +34 -0
align_trim-1.1.0/.github/workflows/python-publish.yml +64 -0
align_trim-1.1.0/.gitignore +204 -0
align_trim-1.1.0/.pre-commit-config.yaml +17 -0
align_trim-1.1.0/LICENSE +19 -0
align_trim-1.1.0/PKG-INFO +133 -0
align_trim-1.1.0/README.md +117 -0
align_trim-1.1.0/align_trim/__init__.py +0 -0
align_trim-1.1.0/align_trim/main.py +1065 -0
align_trim-1.1.0/pyproject.toml +42 -0
align_trim-1.1.0/tests/__init__.py +0 -0
align_trim-1.1.0/tests/test_data/primer.bed +196 -0
align_trim-1.1.0/tests/test_data/sars-cov-2_v3.0.0_paired.bam +0 -0
align_trim-1.1.0/tests/test_data/sars-cov-2_v5.3.2.bam +0 -0
align_trim-1.1.0/tests/test_data/v1.0.0.primer.bed +196 -0
align_trim-1.1.0/tests/test_data/v3.0.0.primer.bed +218 -0
align_trim-1.1.0/tests/test_data/v5.3.2.primer.bed +193 -0
align_trim-1.1.0/tests/test_integration.py +470 -0
align_trim-1.1.0/tests/test_legacy.py +204 -0
align_trim-1.1.0/tests/test_main.py +219 -0
align_trim-1.1.0/tests/test_normalise.py +220 -0
align_trim-1.1.0/uv.lock +674 -0

align_trim-1.1.0/.github/ISSUE_TEMPLATE/bug_report.yml ADDED

@@ -0,0 +1,40 @@
+name: Bug report
+description: Report something that is broken or incorrect. Please ensure you have checked existing issues before submitting a new issue.
+labels: [bug]
+body:
+  - type: textarea
+    id: description
+    attributes:
+      label: Description of the bug
+      description: A clear and concise description of what the bug is.
+    validations:
+      required: true
+  - type: textarea
+    id: command_used
+    attributes:
+      label: Command used and terminal output
+      description: Steps to reproduce the behaviour. Please paste the command you used to launch the pipeline and the output from your terminal.
+      render: console
+      placeholder: |
+        $ align_trim ...
+        Some output where something broke
+  - type: textarea
+    id: files
+    attributes:
+      label: Relevant files
+      description: |
+        Please drag and drop the relevant files here. Create a `.zip` archive if the extension is not allowed.
+        Your verbose log file (from stderr with --verbose enabled), the report TSV generated with `--report`, and your primer scheme bedfile are all helpful.
+  - type: textarea
+    id: system
+    attributes:
+      label: System information
+      description: |
+        * Hardware _(eg. HPC, Desktop, Cloud)_
+        * OS _(eg. CentOS Linux, macOS, Linux Mint)_
+        * Install method _(eg. pip, conda, source)_
+        * Version of align_trim _(eg. 1.1, 1.5, 1.8.2)_

align_trim-1.1.0/.github/ISSUE_TEMPLATE/feature_request.yml ADDED

@@ -0,0 +1,11 @@
+name: Feature request
+description: Suggest an idea for align_trim. Please ensure you have checked existing feature requests before submitting a new suggestion to avoid duplicates.
+labels: enhancement
+body:
+  - type: textarea
+    id: description
+    attributes:
+      label: Description of feature
+      description: Please describe your suggestion for a new feature. It might help to describe a problem or use case, plus any alternatives that you have considered.
+    validations:
+      required: true

align_trim-1.1.0/.github/workflows/pytest.yml ADDED

@@ -0,0 +1,34 @@
+name: pytest
+on:
+  push:
+    paths:
+      - "align_trim/**"
+      - "tests/**"
+      - "pyproject.toml"
+      - "uv.lock"
+      - ".github/workflows/pytest.yml"
+jobs:
+  ci:
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.9", "3.10", "3.11", "3.12", "3.13"]
+        os: [ubuntu-22.04, macos-latest]
+    runs-on: ${{ matrix.os }}
+    steps:
+      - uses: actions/checkout@v4
+      - name: install uv
+        uses: astral-sh/setup-uv@v5
+      - name: set up python
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: install project
+        run: uv sync --python ${{ matrix.python-version }} --locked --all-extras --dev
+      - name: run tests
+        run: uv run --python  ${{ matrix.python-version }} pytest
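
Reviewer note: the `matrix` in this workflow expands to one job per combination of Python version and OS, and `fail-fast: false` lets the remaining jobs keep running when one fails. The expansion can be sketched in plain Python; this is only an illustration of the combinatorics, not code from the package or from GitHub Actions.

```python
from itertools import product

# Values copied from the workflow's matrix definition.
python_versions = ["3.9", "3.10", "3.11", "3.12", "3.13"]
operating_systems = ["ubuntu-22.04", "macos-latest"]

# One job per (python-version, os) combination.
jobs = [
    {"python-version": py, "os": os_name}
    for py, os_name in product(python_versions, operating_systems)
]

print(len(jobs))  # 10
```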

align_trim-1.1.0/.github/workflows/python-publish.yml ADDED

@@ -0,0 +1,64 @@
+# This workflow will upload a Python Package to PyPI when a release is created
+# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python#publishing-to-package-registries
+# This workflow uses actions that are not certified by GitHub.
+# They are provided by a third-party and are governed by
+# separate terms of service, privacy policy, and support
+# documentation.
+name: Upload Python Package
+on:
+  release:
+    types: [published]
+permissions:
+  contents: read
+jobs:
+  release-build:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.x"
+      - name: Build release distributions
+        run: |
+          python -m pip install build
+          python -m build
+      - name: Upload distributions
+        uses: actions/upload-artifact@v4
+        with:
+          name: release-dists
+          path: dist/
+  pypi-publish:
+    runs-on: ubuntu-latest
+    needs:
+      - release-build
+    permissions:
+      # IMPORTANT: this permission is mandatory for trusted publishing
+      id-token: write
+    # Dedicated environments with protections for publishing are strongly recommended.
+    # For more information, see: https://docs.github.com/en/actions/deployment/targeting-different-environments/using-environments-for-deployment#deployment-protection-rules
+    environment:
+      name: pypi
+      url: https://pypi.org/p/align_trim
+    steps:
+      - name: Retrieve release distributions
+        uses: actions/download-artifact@v4
+        with:
+          name: release-dists
+          path: dist/
+      - name: Publish release distributions to PyPI
+        uses: pypa/gh-action-pypi-publish@release/v1
+        with:
+          packages-dir: dist/

align_trim-1.1.0/.gitignore ADDED

@@ -0,0 +1,204 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[codz]
+*$py.class
+# C extensions
+*.so
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+.DS_store
+# PyInstaller
+#  Usually these files are written by a python script from a template
+#  before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py.cover
+.hypothesis/
+.pytest_cache/
+cover/
+# Translations
+*.mo
+*.pot
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+# Flask stuff:
+instance/
+.webassets-cache
+# Scrapy stuff:
+.scrapy
+# Sphinx documentation
+docs/_build/
+# PyBuilder
+.pybuilder/
+target/
+# Jupyter Notebook
+.ipynb_checkpoints
+# IPython
+profile_default/
+ipython_config.py
+# pyenv
+#   For a library or package, you might want to ignore these files since the code is
+#   intended to run in multiple environments; otherwise, check them in:
+# .python-version
+# pipenv
+#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+#   However, in case of collaboration, if having platform-specific dependencies or dependencies
+#   having no cross-platform support, pipenv may install dependencies that don't work, or not
+#   install all needed dependencies.
+#Pipfile.lock
+# UV
+#   Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
+#   This is especially recommended for binary packages to ensure reproducibility, and is more
+#   commonly ignored for libraries.
+#uv.lock
+# poetry
+#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
+#   This is especially recommended for binary packages to ensure reproducibility, and is more
+#   commonly ignored for libraries.
+#   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
+#poetry.lock
+#poetry.toml
+# pdm
+#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
+#   pdm recommends including project-wide configuration in pdm.toml, but excluding .pdm-python.
+#   https://pdm-project.org/en/latest/usage/project/#working-with-version-control
+#pdm.lock
+#pdm.toml
+.pdm-python
+.pdm-build/
+# pixi
+#   Similar to Pipfile.lock, it is generally recommended to include pixi.lock in version control.
+#pixi.lock
+#   Pixi creates a virtual environment in the .pixi directory, just like venv module creates one
+#   in the .venv directory. It is recommended not to include this directory in version control.
+.pixi
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
+__pypackages__/
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+# SageMath parsed files
+*.sage.py
+# Environments
+.env
+.envrc
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+# Spyder project settings
+.spyderproject
+.spyproject
+# Rope project settings
+.ropeproject
+# mkdocs documentation
+/site
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+# Pyre type checker
+.pyre/
+# pytype static type analyzer
+.pytype/
+# Cython debug symbols
+cython_debug/
+# PyCharm
+#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
+#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
+#  and can be added to the global gitignore or merged into this file.  For a more nuclear
+#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
+#.idea/
+# Abstra
+# Abstra is an AI-powered process automation framework.
+# Ignore directories containing user credentials, local state, and settings.
+# Learn more at https://abstra.io/docs
+.abstra/
+# Visual Studio Code
+#  Visual Studio Code specific template is maintained in a separate VisualStudioCode.gitignore
+#  that can be found at https://github.com/github/gitignore/blob/main/Global/VisualStudioCode.gitignore
+#  and can be added to the global gitignore or merged into this file. However, if you prefer,
+#  you could uncomment the following to ignore the entire vscode folder
+# .vscode/
+# Ruff stuff:
+.ruff_cache/
+# PyPI configuration file
+.pypirc
+# Marimo
+marimo/_static/
+marimo/_lsp/
+__marimo__/
+# Streamlit
+.streamlit/secrets.toml

align_trim-1.1.0/.pre-commit-config.yaml ADDED

@@ -0,0 +1,17 @@
+repos:
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    # Ruff version.
+    rev: v0.4.1
+    hooks:
+      # Run the linter.
+      - id: ruff
+        args: [--fix, --show-fixes]
+      # Run the formatter.
+      - id: ruff-format
+  - repo: https://github.com/astral-sh/uv-pre-commit
+    # uv version.
+    rev: 0.7.11
+    hooks:
+      - id: uv-lock
+      - id: uv-export

align_trim-1.1.0/LICENSE ADDED

@@ -0,0 +1,19 @@
+Copyright (c) 2017-2018 Nick Loman & the ZiBRA Project & the ARTIC project
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

align_trim-1.1.0/PKG-INFO ADDED

@@ -0,0 +1,133 @@
+Metadata-Version: 2.4
+Name: align_trim
+Version: 1.1.0
+Summary: Soft-clip primer sites for SAM/BAM files generated from amplicon sequencing runs
+Project-URL: Repository, https://github.com/artic-network/align_trim.git
+Project-URL: Issues, https://github.com/artic-network/align_trim/issues
+Author-email: Nick Loman <n.j.loman@bham.ac.uk>, Sam Wilkinson <s.a.j.wilkinson@bham.ac.uk>, Chris Kent <c.g.kent@bham.ac.uk>
+Maintainer-email: Sam Wilkinson <s.a.j.wilkinson@bham.ac.uk>, Chris Kent <c.g.kent@bham.ac.uk>
+License-Expression: MIT
+License-File: LICENSE
+Requires-Python: >=3.9
+Requires-Dist: numpy
+Requires-Dist: primalbedtools>=0.10.1
+Requires-Dist: pysam
+Description-Content-Type: text/markdown
+# align_trim
+Standalone version of ARTIC's fieldbioinformatics align_trim.py
+## Installation
+From conda
+```bash
+conda install bioconda::align_trim
+```
+From PyPI
+```bash
+pip install align_trim
+```
+From source
+```bash
+git clone https://github.com/artic-network/align_trim.git
+cd align_trim
+uv sync
+uv run align_trim --help
+```
+## Command Line Interface
+### Basic Usage
+```bash
+align_trim [OPTIONS] BEDFILE
+```
+The tool reads alignment data from either a SAM/BAM file or stdin and outputs trimmed alignments to stdout in SAM format by default.
+### Required Arguments
+- `BEDFILE`: BED file containing the amplicon primer scheme in [v3](https://doi.org/10.5281/zenodo.16366659) format.
+### Optional Arguments
+#### Input/Output Options
+- `--samfile`, `-s` : Sorted SAM/BAM file containing the aligned reads. If this is not provided (or is '-'), `align_trim` reads from stdin.
+- `--output`, `-o` : Output file path. Format determined by extension (.sam/.bam). If not provided or '-', writes SAM to stdout.
+#### Processing Options
+- `--normalise`, `-n` : Normalise to target depth N per amplicon using a greedy per-read algorithm. Each read is kept only if it brings the amplicon depth closer to the target. Use 0 for no normalisation (default: 0)
+- `--min-mapq`, `-m` : Minimum mapping quality to keep an aligned read (default: 20)
+- `--primer-match-threshold`, `-p` : Add this many bases of padding to the 5' end of primer coordinates to allow fuzzy matching for reads with barcodes/adapters (default: 35)
+#### Primer and Read Handling
+- `--no-trim-primers` : Do not trim primers from reads (by default, primers are trimmed)
+- `--allow-incorrect-pairs` : Allow reads to be assigned to amplicons even if primers are not correctly paired
+- `--require-full-length` : Require all reads to start and stop in primer sites (do not use with rapid barcoding)
+#### Output and Reporting
+- `--report`, `-r` : Output detailed report TSV to specified filepath
+- `--amp-depth-report`, `-a` : Output mean depth for each amplicon as TSV to specified filepath
+- `--no-read-groups` : Do not divide reads into pool-based read groups in SAM/BAM output
+#### General Options
+- `--verbose`, `-v` : Enable debug mode with detailed logging to stderr
+- `--version` : Show version information
+- `--help` : Show help message
+### Examples
+#### Basic trimming with primer removal
+```bash
+align_trim primers.bed --samfile input.bam --output trimmed.bam
+```
+#### Normalize coverage and generate reports
+```bash
+align_trim primers.bed --samfile input.bam --normalise 100 \
+  --report alignment_report.tsv --amp-depth-report depth_report.tsv \
+  --output normalized.bam
+```
+#### Process from stdin with verbose output
+```bash
+samtools view -h input.bam | align_trim primers.bed --verbose > trimmed.sam 2> verbose.out.txt
+```
+#### Strict full-length read filtering
+```bash
+align_trim primers.bed --samfile input.bam --require-full-length \
+  --min-mapq 30 --output filtered.bam
+```
+#### Allow mismatched primer pairs with custom threshold
+```bash
+align_trim primers.bed --samfile input.bam --allow-incorrect-pairs \
+  --primer-match-threshold 50 --output relaxed.bam
+```
+### Output Formats
+The tool supports multiple output formats based on file extension:
+- `.sam` - SAM format (text)
+- `.bam` - BAM format (binary, compressed)
+- No extension or `-` - SAM format to stdout
+### Report Files
+When using `--report`, a tab-separated file is generated with the following columns:
+- `chrom`: Reference chromosome/contig
+- `QueryName`: Read name
+- `ReferenceStart`/`ReferenceEnd`: Alignment coordinates
+- `PrimerPair`: Primer pair assignment
+- `Primer1`/`Primer2`: Individual primer information
+- `CorrectlyPaired`: Boolean indicating proper primer pairing
+- Additional alignment metrics
+The `--amp-depth-report` generates a summary of coverage depth per amplicon.
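
Reviewer note: the `--normalise` description above ("each read is kept only if it brings the amplicon depth closer to the target") can be illustrated with a small sketch. The README does not say how "closer" is measured, so this sketch assumes the sum of absolute differences from the target over the bases a read covers. `greedy_normalise` is a hypothetical name, and the code is an interpretation of the description, not the implementation in `align_trim/main.py`.

```python
from typing import List, Tuple

def greedy_normalise(
    reads: List[Tuple[int, int]], amplicon_len: int, target: int
) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Greedily keep reads for one amplicon.

    reads: half-open (start, end) offsets within the amplicon, in input order.
    Returns the kept reads and the resulting per-base depth.
    """
    depth = [0] * amplicon_len
    kept = []
    for start, end in reads:
        covered = depth[start:end]
        # Assumed distance metric: sum of |target - depth| over covered bases.
        before = sum(abs(target - d) for d in covered)
        after = sum(abs(target - (d + 1)) for d in covered)
        if after < before:  # keep only if the read moves depth toward the target
            for i in range(start, end):
                depth[i] += 1
            kept.append((start, end))
    return kept, depth

# Five identical full-length reads, target depth 3: only the first three are kept.
kept, depth = greedy_normalise([(0, 10)] * 5, amplicon_len=10, target=3)
print(len(kept), depth)
```

Because each decision depends only on the depth accumulated so far, a scheme like this works in a single pass over the sorted input, at the cost of favouring reads that appear earlier in the file.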

align_trim-1.1.0/README.md ADDED

@@ -0,0 +1,117 @@
+# align_trim
+Standalone version of ARTIC's fieldbioinformatics align_trim.py
+## Installation
+From conda
+```bash
+conda install bioconda::align_trim
+```
+From PyPI
+```bash
+pip install align_trim
+```
+From source
+```bash
+git clone https://github.com/artic-network/align_trim.git
+cd align_trim
+uv sync
+uv run align_trim --help
+```
+## Command Line Interface
+### Basic Usage
+```bash
+align_trim [OPTIONS] BEDFILE
+```
+The tool reads alignment data from either a SAM/BAM file or stdin and outputs trimmed alignments to stdout in SAM format by default.
+### Required Arguments
+- `BEDFILE`: BED file containing the amplicon primer scheme in [v3](https://doi.org/10.5281/zenodo.16366659) format.
+### Optional Arguments
+#### Input/Output Options
+- `--samfile`, `-s` : Sorted SAM/BAM file containing the aligned reads. If this is not provided (or is '-'), `align_trim` reads from stdin.
+- `--output`, `-o` : Output file path. Format determined by extension (.sam/.bam). If not provided or '-', writes SAM to stdout.
+#### Processing Options
+- `--normalise`, `-n` : Normalise to target depth N per amplicon using a greedy per-read algorithm. Each read is kept only if it brings the amplicon depth closer to the target. Use 0 for no normalisation (default: 0)
+- `--min-mapq`, `-m` : Minimum mapping quality to keep an aligned read (default: 20)
+- `--primer-match-threshold`, `-p` : Add this many bases of padding to the 5' end of primer coordinates to allow fuzzy matching for reads with barcodes/adapters (default: 35)
+#### Primer and Read Handling
+- `--no-trim-primers` : Do not trim primers from reads (by default, primers are trimmed)
+- `--allow-incorrect-pairs` : Allow reads to be assigned to amplicons even if primers are not correctly paired
+- `--require-full-length` : Require all reads to start and stop in primer sites (do not use with rapid barcoding)
+#### Output and Reporting
+- `--report`, `-r` : Output detailed report TSV to specified filepath
+- `--amp-depth-report`, `-a` : Output mean depth for each amplicon as TSV to specified filepath
+- `--no-read-groups` : Do not divide reads into pool-based read groups in SAM/BAM output
+#### General Options
+- `--verbose`, `-v` : Enable debug mode with detailed logging to stderr
+- `--version` : Show version information
+- `--help` : Show help message
+### Examples
+#### Basic trimming with primer removal
+```bash
+align_trim primers.bed --samfile input.bam --output trimmed.bam
+```
+#### Normalize coverage and generate reports
+```bash
+align_trim primers.bed --samfile input.bam --normalise 100 \
+  --report alignment_report.tsv --amp-depth-report depth_report.tsv \
+  --output normalized.bam
+```
+#### Process from stdin with verbose output
+```bash
+samtools view -h input.bam | align_trim primers.bed --verbose > trimmed.sam 2> verbose.out.txt
+```
+#### Strict full-length read filtering
+```bash
+align_trim primers.bed --samfile input.bam --require-full-length \
+  --min-mapq 30 --output filtered.bam
+```
+#### Allow mismatched primer pairs with custom threshold
+```bash
+align_trim primers.bed --samfile input.bam --allow-incorrect-pairs \
+  --primer-match-threshold 50 --output relaxed.bam
+```
+### Output Formats
+The tool supports multiple output formats based on file extension:
+- `.sam` - SAM format (text)
+- `.bam` - BAM format (binary, compressed)
+- No extension or `-` - SAM format to stdout
+### Report Files
+When using `--report`, a tab-separated file is generated with the following columns:
+- `chrom`: Reference chromosome/contig
+- `QueryName`: Read name
+- `ReferenceStart`/`ReferenceEnd`: Alignment coordinates
+- `PrimerPair`: Primer pair assignment
+- `Primer1`/`Primer2`: Individual primer information
+- `CorrectlyPaired`: Boolean indicating proper primer pairing
+- Additional alignment metrics
+The `--amp-depth-report` generates a summary of coverage depth per amplicon.
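
Reviewer note: the `--report` TSV described above can be summarised with the standard library alone. The sketch below uses only the column names listed in the README. The sample rows are invented for illustration, and the assumption that `CorrectlyPaired` is written as text such as `True`/`False` is not confirmed by the README. `summarise_report` is a hypothetical helper, not part of the package.

```python
import csv
import io
from collections import Counter

def summarise_report(handle):
    """Count reads per (PrimerPair, correctly paired?) in a --report TSV."""
    counts = Counter()
    for row in csv.DictReader(handle, delimiter="\t"):
        # Assumption: the boolean is serialised as text such as "True"/"False".
        paired = row["CorrectlyPaired"].strip().lower() in {"true", "1"}
        counts[(row["PrimerPair"], paired)] += 1
    return counts

# Invented rows for illustration only; real reports carry additional columns.
header = ["chrom", "QueryName", "ReferenceStart", "ReferenceEnd",
          "PrimerPair", "Primer1", "Primer2", "CorrectlyPaired"]
rows = [
    ["ref", "read1", "30", "410", "pair_1", "left_1", "right_1", "True"],
    ["ref", "read2", "32", "405", "pair_1", "left_1", "right_1", "True"],
    ["ref", "read3", "30", "720", "pair_1_2", "left_1", "right_2", "False"],
]
sample = "\n".join("\t".join(fields) for fields in [header] + rows) + "\n"

summary = summarise_report(io.StringIO(sample))
for (pair, paired), n in sorted(summary.items()):
    print(pair, paired, n)
```

For a real run, pass an open file handle for the path given to `--report` in place of the `io.StringIO` sample.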

align_trim-1.1.0/align_trim/__init__.py ADDED

File without changes