PyPI - XspecT - Versions diffs - 0.5.1__tar.gz → 0.5.2__tar.gz - Mend

XspecT 0.5.1tar.gz → 0.5.2tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

This version of XspecT might be problematic.

Files changed (127)

{xspect-0.5.1 → xspect-0.5.2}/.github/workflows/test.yml RENAMED

@@ -24,9 +24,10 @@ jobs:
           run: |
             python -m pip install --upgrade pip
             pip install '.[test]'
-        - name: Download models
+        - name: Download models and train MLST
           run: |
             xspect models download
+            yes 1 | xspect models train mlst
         - name: Test with pytest
           env:
             NCBI_API_KEY: ${{ secrets.NCBI_API_KEY }}

{xspect-0.5.1 → xspect-0.5.2}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: XspecT
-Version: 0.5.1
+Version: 0.5.2
 Summary: Tool to monitor and characterize pathogens using Bloom filters.
 License: MIT License

xspect-0.5.2/docs/contributing.md ADDED

@@ -0,0 +1,95 @@
+# Contributing to XspecT
+## Introduction
+Thank you for your interest in contributing to XspecT! This page provides guidelines for contributing to the project, including how to set up your own development environment, the XspecT architecture, CI/CD, and the process for submitting contributions.
+When contributing to XspecT, please follow these steps to ensure a smooth process:
+- **Read the documentation**: Familiarize yourself with the project by reading the [documentation](https://bionf.github.io/XspecT2/), including the [Understanding XspecT](understanding.md) page and the [architecture overview](#architecture-overview).
+- **Follow the coding standards**: Adhere to the project's coding standards and best practices. This includes using consistent naming conventions and writing clear, concise code and documentation. Furthermore, please make sure your changes are aligned with the project's [architecture](#architecture-overview).
+- **Write tests**: Ensure that your changes are covered by tests. We use [pytest](https://docs.pytest.org/en/stable/) for testing. If you add new features or fix bugs, please include tests to verify your changes.
+- **Document your changes**: Update the documentation to reflect any new features or changes you make. This includes updating the README, Google-style docstrings, and the [MkDocs](https://www.mkdocs.org)-based documentation.
+- **Use clear commit messages**: When committing your changes, use clear and descriptive commit messages that explain the purpose of the changes.
+- **Follow the pull request process**: When you're ready to submit your changes, follow the [pull request process](#pull-request-process) outlined below.
+## Development Installation
+To set up XspecT for development, first make sure you have [Python](https://www.python.org/downloads/) and [Node.js](https://nodejs.org/en/download/) installed. Please note that XspecT is currently not supported in Windows or Alpine Linux environments, unless you build [COBS](https://github.com/aromberg/cobs) yourself.
+Get started by cloning the repository:
+```bash
+git clone https://github.com/BIONF/XspecT2.git
+```
+You then need to build the web application using Vite. Navigate to the `xspect-web` directory and run the build command, which will also watch for changes:
+```bash
+cd XspecT2/src/xspect/xspect-web
+```
+```bash
+npx vite build --watch
+```
+Finally, in a separate terminal, navigate to the root of the cloned repository and install the Python package in editable mode:
+```bash
+pip install -e .
+```
+By combining the two processes, you can develop both the frontend and backend simultaneously.
+## Architecture Overview
+XspecT consists of a Python component (`src/xspect`) and a web application built with [Vite](https://vitejs.dev/) (`src/xspect/xspect-web`). The Python component provides the core functionality, including the command-line interface (CLI) and the backend API, while the web application provides a user-friendly interface for interacting with XspecT. Furthermore, tests for the Python component reside in the `tests/` directory, while documentation is provided in the `docs/` directory.
+### Python Component
+The Python component of XspecT is structured as follows:
+- `main.py`: The entry point for the command-line interface (CLI) and the backend API.
+- `web.py`: The [FastAPI](https://fastapi.tiangolo.com/) application that serves the web interface and handles API requests.
+The core functionality of XspecT is implemented using the following modules:
+- `classify.py`: Contains methods to classify sequences based on previously trained XspecT models.
+- `filter_sequences.py`: Contains methods to filter sequences based on classification results.
+- `model_management.py`: Contains methods to manage XspecT models.
+- `train.py`: Contains methods to train XspecT models based on user-provided data or data from the NCBI/PubMLST API.
+- `download_models.py`: Contains methods to download pre-trained XspecT models.
+In the background, these modules utilize model classes and a result class, which are defined in the `/models/` folder.
+- `/models/probabilistic_filter_model.py`: Base class for probabilistic filter models, which uses COBS indices for classification and stores the model's metadata. Results from the classification are stored in a `ModelResult` class.
+- `/models/probabilistic_filter_svm_model.py`: This class extends the base model class and implements a probabilistic filter model, in which classification scores are passed to a support vector machine (SVM) for a final prediction. This model is typically used for species-level classification.
+- `/models/probabilistic_filter_mlst_model.py`: This class extends the base model class and implements multilocus sequence typing (MLST) by using multiple COBS indices.
+- `/models/probabilistic_single_filter_model.py`: This class extends the base model class and implements a model that uses a single Bloom filter for classification. It is typically used for genus-level classification.
+- `/models/result.py`: Contains the `ModelResult` class, which stores the results of a classification operation, including classification metadata, hits, and a prediction, if applicable.
+Supplementary modules are documented in their respective files.
+### Web Application
+The web application (`src/xspect/xspect-web`) is built using Vite, [Axios](https://axios-http.com/), [Tailwind CSS](https://tailwindcss.com/), and [shadcn/ui](https://ui.shadcn.com/). It provides a user-friendly interface for interacting with XspecT and includes the following main components:
+- `src/api.tsx`: Contains the API client for making requests to the backend FastAPI application.
+- `src/App.tsx`: The main application component that renders the user interface. It uses React Router for navigation and includes the main layout as well as routing logic.
+- `src/assets/`: Contains static assets such as images and icons.
+- `src/components/`: Contains reusable components for the user interface, such as buttons, forms, and modals.
+- `src/components/ui/`: Contains UI components from shadcn/ui, which are used to build the user interface.
+- `src/types.ts`: Contains TypeScript type definitions for the application, including types for API responses.
+- `vite.config.ts`: The Vite configuration file that defines how the web application is built and served. Also includes a configuration for the API proxy to the FastAPI backend.
+## Continuous Integration and Deployment
+We use GitHub Actions to run checks on commits and pull requests. These checks include:
+- **Code style and formatting**: Ensures that changes align with the project's code style. We use [Black](https://black.readthedocs.io/en/stable/) for Python code formatting.
+- **Linting**: [Pylint](https://pylint.pycqa.org/en/latest/) is used for Python code linting. It checks for coding standards, potential errors, and code smells.
+- **Tests**: Ensures that all tests pass. We use [pytest](https://docs.pytest.org/en/stable/) for testing.
+Additionally, GitHub Actions is used for deployment:
+- **Documentation**: The MkDocs-based documentation is built and deployed to GitHub Pages on changes to the `main` branch. You can view the documentation at [https://bionf.github.io/XspecT2/](https://bionf.github.io/XspecT2/).
+- **Python package**: The Python package is built and uploaded to PyPI when a new release is created. This allows users to easily install the latest version of XspecT using `pip install xspect`. Pre-releases are uploaded to TestPyPI and can be installed using `pip install --index-url https://test.pypi.org/simple/ xspect`.
+## Pull Request Process
+Once you have made your changes and tested them, you can submit a pull request. Please follow these steps:
+1. Ensure your code is up to date with the `dev` branch
+2. Create a pull request with a clear description of your changes to the `dev` branch
+3. Address any feedback from reviewers
+4. Once approved, your changes will be merged

xspect-0.5.2/docs/understanding.md ADDED

@@ -0,0 +1,24 @@
+# Understanding XspecT
+## What is XspecT?
+XspecT is a tool designed to monitor and characterize pathogens using exact pattern matching of kmers. It allows users to filter for pathogen sequences in metagenomic datasets, classify these sequences on a species level, and perform strain-level typing.
+## Key Features
+- **Genus-Level Classification**: Classify sequences at the genus level, enabling researchers to quickly identify the presence of specific microbial groups.
+- **Species-Level Classification**: Provides detailed classification of sequences at the species level, enhancing the understanding of microbial diversity.
+- **Multi-Locus Strain Typing**: Offers the ability to type sequences at the strain level, which is crucial for understanding variations within species.
+- **Filtering**: Classification results can be used to filter sequences, enabling analysis of metagenomic samples.
+- **Model Management**: XspecT models can be easily downloaded or trained from scratch using the command line interface. Training is possible both from local data and from the NCBI Datasets and PubMLST APIs.
+- **User-friendly Interface**: Next to the command line interface (CLI), a React-based web interface is available for easy interaction and visualization of results.
+- **Works with Large Datasets**: Entire folders of input data can be passed to the tool, allowing for efficient processing of large datasets.
+## How XspecT Works
+At its core, XspecT uses exact pattern matching of kmers to identify and classify sequences. The tool leverages indices of known pathogen sequences stored in XspecT models to match against input data. This process involves:
+1. **Kmer Extraction**: The input sequences are processed to extract kmers, which are short sequences of a fixed length.
+2. **Pattern Matching**: The extracted kmers are matched against an index of known sequences using exact matching algorithms. The number of matches is recorded and stored as hits.
+3. **Classification**: Based on hits, scores are calculated as the fraction of kmers that match known sequences. These scores are then used to classify the sequences at different taxonomic levels.
+### COBS Index
+In order to store kmers in a space-efficient manner, XspecT uses a COBS ("Compact Bit-Sliced Signature Index") classic index. This index uses a probabilistic data structure to store kmers, allowing for efficient storage and retrieval. The COBS index is designed to handle large datasets while maintaining fast query performance. More information about the COBS index can be found in the [COBS research paper](https://arxiv.org/abs/1905.09624).
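The three steps described under "How XspecT Works" (kmer extraction, pattern matching, scoring) can be illustrated with a minimal sketch. This is not XspecT's implementation: a plain Python `set` stands in for the COBS index (which is probabilistic), and the function names `extract_kmers` and `score`, the toy sequences, and `k=4` are all hypothetical choices made for illustration.

```python
def extract_kmers(sequence: str, k: int) -> list[str]:
    """Return all overlapping k-mers of `sequence`, uppercased."""
    sequence = sequence.upper()
    return [sequence[i : i + k] for i in range(len(sequence) - k + 1)]


def score(sequence: str, index: set[str], k: int) -> float:
    """Fraction of the k-mers in `sequence` that are present in `index`."""
    kmers = extract_kmers(sequence, k)
    if not kmers:
        # Sequence shorter than k: nothing to match.
        return 0.0
    hits = sum(1 for kmer in kmers if kmer in index)
    return hits / len(kmers)


# Toy "index" built from one known reference sequence (assumption: exact set,
# whereas a real COBS index can report false positives).
reference = "ACGTACGTGA"
index = set(extract_kmers(reference, k=4))

full_match = score("ACGTACGT", index, k=4)  # every k-mer occurs in the reference
no_match = score("TTTTTTTT", index, k=4)    # no k-mer occurs in the reference
print(full_match, no_match)
```

With one such index per taxon, the taxon with the highest score would be a natural candidate prediction; how XspecT actually turns scores into predictions (for example via the SVM mentioned in the contributing guide) is not shown here.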

{xspect-0.5.1 → xspect-0.5.2}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "XspecT"
-version = "0.5.1"
+version = "0.5.2"
 description = "Tool to monitor and characterize pathogens using Bloom filters."
 readme = {file = "README.md", content-type = "text/markdown"}
 license = {file = "LICENSE"}

{xspect-0.5.1 → xspect-0.5.2}/src/XspecT.egg-info/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: XspecT
-Version: 0.5.1
+Version: 0.5.2
 Summary: Tool to monitor and characterize pathogens using Bloom filters.
 License: MIT License

{xspect-0.5.1 → xspect-0.5.2}/src/XspecT.egg-info/SOURCES.txt RENAMED

@@ -54,8 +54,8 @@ src/xspect/xspect-web/tsconfig.node.json
 src/xspect/xspect-web/vite.config.ts
 src/xspect/xspect-web/dist/index.html
 src/xspect/xspect-web/dist/vite.svg
-src/xspect/xspect-web/dist/assets/index-CMG4V7fZ.js
-src/xspect/xspect-web/dist/assets/index-jIKg1HIy.css
+src/xspect/xspect-web/dist/assets/index-Ceo58xui.css
+src/xspect/xspect-web/dist/assets/index-Dt_UlbgE.js
 src/xspect/xspect-web/public/vite.svg
 src/xspect/xspect-web/src/App.tsx
 src/xspect/xspect-web/src/api.tsx
@@ -72,6 +72,7 @@ src/xspect/xspect-web/src/components/dropdown-checkboxes.tsx
 src/xspect/xspect-web/src/components/dropdown-slider.tsx
 src/xspect/xspect-web/src/components/filter-form.tsx
 src/xspect/xspect-web/src/components/filter.tsx
+src/xspect/xspect-web/src/components/filtering-result.tsx
 src/xspect/xspect-web/src/components/header.tsx
 src/xspect/xspect-web/src/components/landing.tsx
 src/xspect/xspect-web/src/components/models-details.tsx

xspect-0.5.2/src/xspect/classify.py ADDED

@@ -0,0 +1,80 @@
+from pathlib import Path
+from xspect.mlst_feature.mlst_helper import pick_scheme_from_models_dir
+import xspect.model_management as mm
+from xspect.models.probabilistic_filter_mlst_model import (
+    ProbabilisticFilterMlstSchemeModel,
+)
+from xspect.file_io import prepare_input_output_paths
+def classify_genus(
+    model_genus: str, input_path: Path, output_path: Path, step: int = 1
+):
+    """
+    Classify the genus of sequences.
+    This function classifies input files using the genus model.
+    The input path can be a file or directory
+    Args:
+        model_genus (str): The genus model slug.
+        input_path (Path): The path to the input file/directory containing sequences.
+        output_path (Path): The path to the output file where results will be saved.
+        step (int): The amount of kmers to be skipped.
+    """
+    model = mm.get_genus_model(model_genus)
+    input_paths, get_output_path = prepare_input_output_paths(input_path)
+    for idx, current_path in enumerate(input_paths):
+        result = model.predict(current_path, step=step)
+        result.input_source = current_path.name
+        cls_path = get_output_path(idx, output_path)
+        result.save(cls_path)
+        print(f"Saved result as {cls_path.name}")
+def classify_species(
+    model_genus: str, input_path: Path, output_path: Path, step: int = 1
+):
+    """
+    Classify the species of sequences.
+    This function classifies input files using the species model.
+    The input path can be a file or directory
+    Args:
+        model_genus (str): The genus model slug.
+        input_path (Path): The path to the input file/directory containing sequences.
+        output_path (Path): The path to the output file where results will be saved.
+        step (int): The amount of kmers to be skipped.
+    """
+    model = mm.get_species_model(model_genus)
+    input_paths, get_output_path = prepare_input_output_paths(input_path)
+    for idx, current_path in enumerate(input_paths):
+        result = model.predict(current_path, step=step)
+        result.input_source = current_path.name
+        cls_path = get_output_path(idx, output_path)
+        result.save(cls_path)
+        print(f"Saved result as {cls_path.name}")
+def classify_mlst(input_path: Path, output_path: Path, limit: bool):
+    """
+    Classify the strain type using the specific MLST model.
+    Args:
+        input_path (Path): The path to the input file/directory containing sequences.
+        output_path (Path): The path to the output file where results will be saved.
+        limit (bool): A limit for the highest allele_id results that are shown.
+    """
+    scheme_path = pick_scheme_from_models_dir()
+    model = ProbabilisticFilterMlstSchemeModel.load(scheme_path)
+    input_paths, get_output_path = prepare_input_output_paths(input_path)
+    for idx, current_path in enumerate(input_paths):
+        result = model.predict(scheme_path, current_path, step=1, limit=limit)
+        result.input_source = current_path.name
+        cls_path = get_output_path(idx, output_path)
+        result.save(cls_path)
+        print(f"Saved result as {cls_path.name}")

xspect-0.5.2/src/xspect/definitions.py ADDED

@@ -0,0 +1,90 @@
+"""This module contains definitions for the XspecT package."""
+from pathlib import Path
+from os import getcwd
+fasta_endings = ["fasta", "fna", "fa", "ffn", "frn"]
+fastq_endings = ["fastq", "fq"]
+def get_xspect_root_path() -> Path:
+    """
+    Return the root path for XspecT data.
+    Returns the path to the XspecT data directory, which can be located either in the user's home directory or in the current working directory.
+    If neither exists, it creates the directory in the user's home directory.
+    Returns:
+        Path: The path to the XspecT data directory.
+    """
+    home_based_dir = Path.home() / "xspect-data"
+    if home_based_dir.exists():
+        return home_based_dir
+    cwd_based_dir = Path(getcwd()) / "xspect-data"
+    if cwd_based_dir.exists():
+        return cwd_based_dir
+    home_based_dir.mkdir(exist_ok=True, parents=True)
+    return home_based_dir
+def get_xspect_model_path() -> Path:
+    """
+    Return the path to the XspecT models.
+    Returns the path to the XspecT models directory, which is located within the XspecT data directory.
+    If the directory does not exist, it creates the directory.
+    Returns:
+        Path: The path to the XspecT models directory.
+    """
+    model_path = get_xspect_root_path() / "models"
+    model_path.mkdir(exist_ok=True, parents=True)
+    return model_path
+def get_xspect_upload_path() -> Path:
+    """
+    Return the path to the XspecT upload directory.
+    Returns the path to the XspecT uploads directory, which is located within the XspecT data directory.
+    If the directory does not exist, it creates the directory.
+    Returns:
+        Path: The path to the XspecT uploads directory.
+    """
+    upload_path = get_xspect_root_path() / "uploads"
+    upload_path.mkdir(exist_ok=True, parents=True)
+    return upload_path
+def get_xspect_runs_path() -> Path:
+    """
+    Return the path to the XspecT runs directory.
+    Returns the path to the XspecT runs directory, which is located within the XspecT data directory.
+    If the directory does not exist, it creates the directory.
+    Returns:
+        Path: The path to the XspecT runs directory.
+    """
+    runs_path = get_xspect_root_path() / "runs"
+    runs_path.mkdir(exist_ok=True, parents=True)
+    return runs_path
+def get_xspect_mlst_path() -> Path:
+    """
+    Return the path to the XspecT MLST directory.
+    Returns the path to the XspecT MLST directory, which is located within the XspecT data directory.
+    If the directory does not exist, it creates the directory.
+    Returns:
+        Path: The path to the XspecT MLST directory.
+    """
+    mlst_path = get_xspect_root_path() / "mlst"
+    mlst_path.mkdir(exist_ok=True, parents=True)
+    return mlst_path
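The lookup order in `get_xspect_root_path` (an existing home-based directory first, then an existing directory in the current working directory, otherwise a newly created home-based one) can be tried out in isolation. The sketch below is not part of the package: `resolve_data_dir` is a hypothetical stand-in that takes the home and working directories as parameters, so the precedence can be checked inside a temporary directory without touching a real `~/xspect-data`.

```python
from pathlib import Path
from tempfile import TemporaryDirectory


def resolve_data_dir(home: Path, cwd: Path, name: str = "xspect-data") -> Path:
    """Prefer an existing home-based dir, then a cwd-based one, else create in home."""
    home_based = home / name
    if home_based.exists():
        return home_based
    cwd_based = cwd / name
    if cwd_based.exists():
        return cwd_based
    home_based.mkdir(parents=True, exist_ok=True)
    return home_based


with TemporaryDirectory() as tmp:
    home, cwd = Path(tmp) / "home", Path(tmp) / "work"

    # Only the working-directory copy exists, so it is chosen.
    (cwd / "xspect-data").mkdir(parents=True)
    picked_first = resolve_data_dir(home, cwd)

    # Once a home-based directory exists, it takes precedence.
    (home / "xspect-data").mkdir(parents=True)
    picked_second = resolve_data_dir(home, cwd)

print(picked_first.parent.name, picked_second.parent.name)
```

One consequence of this order is that a data directory in the working directory is ignored as soon as a home-based one exists.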

{xspect-0.5.1 → xspect-0.5.2}/src/xspect/download_models.py RENAMED

@@ -8,8 +8,16 @@ import requests
 from xspect.definitions import get_xspect_model_path
-def download_test_models(url):
-    """Download models."""
+def download_test_models(url: str) -> None:
+    """
+    Download models from the specified URL.
+    This function downloads a zip file from the given URL, extracts its contents,
+    and copies the extracted files to the XspecT model directory.
+    Args:
+        url (str): The URL from which to download the models.
+    """
     with TemporaryDirectory() as tmp_dir:
         tmp_dir = Path(tmp_dir)
         download_path = tmp_dir / "models.zip"

xspect-0.5.2/src/xspect/file_io.py ADDED

@@ -0,0 +1,232 @@
+"""
+File IO module.
+"""
+from json import loads
+import os
+from pathlib import Path
+import zipfile
+from typing import Callable, Iterator
+from Bio import SeqIO
+from xspect.definitions import fasta_endings, fastq_endings
+def delete_zip_files(dir_path) -> None:
+    """
+    Delete all zip files in the given directory.
+    This function checks each file in the specified directory and removes it if it is a zip file.
+    Args:
+        dir_path (Path): Path to the directory where zip files should be deleted.
+    """
+    files = os.listdir(dir_path)
+    for file in files:
+        file_path = dir_path / str(file)
+        if zipfile.is_zipfile(file_path):
+            os.remove(file_path)
+def extract_zip(zip_path: Path, unzipped_path: Path) -> None:
+    """
+    Extracts all files from a zip file.
+    Extracts the contents of the specified zip file to the given directory.
+    Args:
+        zip_path (Path): Path to the zip file to be extracted.
+        unzipped_path (Path): Path to the directory where the contents will be extracted.
+    """
+    unzipped_path.mkdir(parents=True, exist_ok=True)
+    with zipfile.ZipFile(zip_path) as item:
+        item.extractall(unzipped_path)
+def get_record_iterator(file_path: Path) -> Iterator:
+    """
+    Returns a record iterator for a fasta or fastq file.
+    This function checks the file extension to determine if the file is in fasta or fastq format
+    and returns an iterator over the records in the file using Biopython's SeqIO module.
+    Args:
+        file_path (Path): Path to the fasta or fastq file.
+    Returns:
+        Iterator: An iterator over the records in the file.
+    Raises:
+        ValueError: If the file path is not a Path object, does not exist, is not a file,
+                    or has an invalid file format.
+    """
+    if not isinstance(file_path, Path):
+        raise ValueError("Path must be a Path object")
+    if not file_path.exists():
+        raise ValueError("File does not exist")
+    if not file_path.is_file():
+        raise ValueError("Path must be a file")
+    if file_path.suffix[1:] in fasta_endings:
+        return SeqIO.parse(file_path, "fasta")
+    if file_path.suffix[1:] in fastq_endings:
+        return SeqIO.parse(file_path, "fastq")
+    raise ValueError("Invalid file format, must be a fasta or fastq file")
+def concatenate_species_fasta_files(
+    input_folders: list[Path], output_directory: Path
+) -> None:
+    """
+    Concatenate fasta files from different species into one file per species.
+    This function iterates through each of the given species folders,
+    collects all fasta files, and concatenates their contents into a single fasta file
+    named after the species.
+    Args:
+        input_folders (list[Path]): List of paths to species folders.
+        output_directory (Path): Path to the output directory.
+    """
+    for species_folder in input_folders:
+        species_name = species_folder.name
+        fasta_files = [
+            f for ending in fasta_endings for f in species_folder.glob(f"*.{ending}")
+        ]
+        if len(fasta_files) == 0:
+            raise ValueError(f"no fasta files found in {species_folder}")
+        # concatenate fasta files
+        concatenated_fasta = output_directory / f"{species_name}.fasta"
+        with open(concatenated_fasta, "w", encoding="utf-8") as f:
+            for fasta_file in fasta_files:
+                with open(fasta_file, "r", encoding="utf-8") as f_in:
+                    f.write(f_in.read())
+def concatenate_metagenome(fasta_dir: Path, meta_path: Path) -> None:
+    """
+    Concatenate all fasta files in a directory into one file.
+    This function searches for all fasta files in the specified directory and writes their contents
+    into a single output file. The output file will contain the concatenated sequences from all fasta files.
+    Args:
+        fasta_dir (Path): Path to the directory with the fasta files.
+        meta_path (Path): Path to the output file.
+    """
+    fasta_files = [
+        file for ending in fasta_endings for file in fasta_dir.glob(f"*.{ending}")
+    ]
+    with open(meta_path, "w", encoding="utf-8") as meta_file:
+        for fasta_file in fasta_files:
+            with open(fasta_file, "r", encoding="utf-8") as f_in:
+                meta_file.write(f_in.read())
+def get_ncbi_dataset_accession_paths(
+    ncbi_dataset_path: Path,
+) -> dict[str, Path]:
+    """
+    Get the paths of the NCBI dataset accessions.
+    This function reads the dataset catalog from the NCBI dataset directory and returns a dictionary
+    mapping each accession to its corresponding file path. The first item in the dataset catalog is
+    assumed to be a data report, and is skipped.
+    Args:
+        ncbi_dataset_path (Path): Path to the NCBI dataset directory.
+    Returns:
+        dict[str, Path]: Dictionary with the accession as key and the path as value.
+    Raises:
+        ValueError: If the dataset path does not exist or is invalid.
+    """
+    data_path = ncbi_dataset_path / "ncbi_dataset" / "data"
+    if not data_path.exists():
+        raise ValueError(f"Path {data_path} does not exist.")
+    accession_paths = {}
+    with open(data_path / "dataset_catalog.json", "r", encoding="utf-8") as f:
+        res = loads(f.read())
+        for assembly in res["assemblies"][1:]:  # the first item is the data report
+            accession = assembly["accession"]
+            assembly_path = data_path / assembly["files"][0]["filePath"]
+            accession_paths[accession] = assembly_path
+    return accession_paths
+def filter_sequences(
+    input_file: Path,
+    output_file: Path,
+    included_ids: list[str],
+) -> None:
+    """
+    Filter sequences by IDs from an input file and save them to an output file.
+    This function reads a fasta or fastq file, filters the sequences based on the provided IDs,
+    and writes the matching sequences to an output file. If no IDs are provided, no output file
+    is created.
+    Args:
+        input_file (Path): Path to the input file.
+        output_file (Path): Path to the output file.
+        included_ids (list[str]): List of IDs to include. If empty, no output file
+            is created.
+    """
+    if not included_ids:
+        print("No IDs provided, no output file will be created.")
+        return
+    with open(output_file, "w", encoding="utf-8") as out_f:
+        for record in get_record_iterator(input_file):
+            if record.id in included_ids:
+                SeqIO.write(record, out_f, "fasta")
+def prepare_input_output_paths(
+    input_path: Path,
+) -> tuple[list[Path], Callable[[int, Path], Path]]:
+    """
+    Processes the input path into a list of input paths and a function generating output paths.
+    This function checks if the input path is a directory or a file. If it is a directory,
+    it collects all files with specified fasta and fastq endings. If it is a file, it uses that file
+    as the input path. It then returns a list of input file paths and a function that generates
+    output paths based on the index of the input file and a specified output path.
+    Args:
+        input_path (Path): Path to the directory or file.
+    Returns:
+        tuple[list[Path], Callable[[int, Path], Path]]: A tuple containing:
+            - A list of input file paths
+            - A function that takes an index and the output path,
+              and returns the processed output path.
+    Raises:
+        ValueError: If the input path is invalid.
+    """
+    input_is_dir = input_path.is_dir()
+    ending_wildcards = [f"*.{ending}" for ending in fasta_endings + fastq_endings]
+    if input_is_dir:
+        input_paths = [p for e in ending_wildcards for p in input_path.glob(e)]
+    elif input_path.is_file():
+        input_paths = [input_path]
+    else:
+        raise ValueError("Invalid input path")
+    def get_output_path(idx: int, output_path: Path) -> Path:
+        return (
+            output_path.parent / f"{output_path.stem}_{idx+1}{output_path.suffix}"
+            if input_is_dir
+            else output_path
+        )
+    return input_paths, get_output_path
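The output naming rule used by the inner `get_output_path` helper is easier to see with concrete values. The sketch below mirrors only that rule; `numbered_output_path` and the example path `results/classification.json` are hypothetical, and the `input_is_dir` flag is passed explicitly instead of being captured from the enclosing function.

```python
from pathlib import Path


def numbered_output_path(idx: int, output_path: Path, input_is_dir: bool) -> Path:
    """Append a 1-based index to the file stem when the input was a directory."""
    if not input_is_dir:
        # A single input file writes to the output path unchanged.
        return output_path
    return output_path.parent / f"{output_path.stem}_{idx + 1}{output_path.suffix}"


out = Path("results/classification.json")
first = numbered_output_path(0, out, input_is_dir=True)
second = numbered_output_path(1, out, input_is_dir=True)
single = numbered_output_path(0, out, input_is_dir=False)

print(first.as_posix())   # results/classification_1.json
print(second.as_posix())  # results/classification_2.json
print(single.as_posix())  # results/classification.json
```

Because the input list is built from `Path.glob` results, whose order is not guaranteed to be sorted, the index in an output name should not be assumed to follow alphabetical input order; the `input_source` field that `classify.py` sets on each result identifies the originating file.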
