PyPI - PyamilySeq - Versions diffs - 1.3.1__tar.gz → 1.3.3__tar.gz - Mend

PyamilySeq 1.3.1tar.gz → 1.3.3tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (28)

{pyamilyseq-1.3.1 → pyamilyseq-1.3.3}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: PyamilySeq
-Version: 1.3.1
+Version: 1.3.3
 Summary: PyamilySeq - A a tool to investigate sequence-based gene groups identified by clustering methods such as CD-HIT, DIAMOND, BLAST or MMseqs2.
 Author-email: Nicholas Dimonaco <nicholas@dimonaco.co.uk>
 License:                     GNU GENERAL PUBLIC LICENSE
@@ -720,7 +720,7 @@ To update to the newest version add '-U' to end of the pip install command.
 ```commandline
 usage: PyamilySeq.py [-h] {Full,Partial} ...
-PyamilySeq v1.3.1: A tool for gene clustering and analysis.
+PyamilySeq v1.3.3: A tool for gene clustering and analysis.
 positional arguments:
   {Full,Partial}  Choose a mode: 'Full' or 'Partial'.
@@ -750,7 +750,7 @@ Escherichia_coli_110957|ENSB_TIZS9kbTvShDvyX	Escherichia_coli_110957|ENSB_TIZS9k
 ```
 ### Example output:
 ```
-Running PyamilySeq v1.3.1
+Running PyamilySeq v1.3.3
 Calculating Groups
 Number of Genomes: 10
 Gene Groups
@@ -805,7 +805,7 @@ Total Number of First Gene Groups That Had Additional Second Sequences But Not N
 ## PyamilySeq is separated into two main 'run modes', Full and Partial. They each have their own set of required and optional arguments.
 ### PyamilySeq - Full Menu:
 ```
-usage: PyamilySeq.py Full [-h] -output_dir OUTPUT_DIR -input_type {separate,combined,fasta} [-input_dir INPUT_DIR] [-input_fasta INPUT_FASTA] [-name_split_gff NAME_SPLIT_GFF] [-name_split_fasta NAME_SPLIT_FASTA] [-sequence_type {AA,DNA}] [-gene_ident GENE_IDENT] [-c PIDENT] [-s LEN_DIFF] [-fast_mode]
+usage: PyamilySeq.py Full [-h] -output_dir OUTPUT_DIR -input_type {separate,combined,fasta} [-input_dir INPUT_DIR] [-input_fasta INPUT_FASTA] [-name_split_gff NAME_SPLIT_GFF] [-name_split_fasta NAME_SPLIT_FASTA] [-seq_type {AA,DNA}] [-gene_ident GENE_IDENT] [-c PIDENT] [-s LEN_DIFF] [-fast_mode]
                           [-group_mode {Species,Genus}] [-species_groups SPECIES_GROUPS] [-genus_groups GENUS_GROUPS] [-write_groups WRITE_GROUPS] [-write_individual_groups] [-align] [-align_aa] [-no_gpa] [-M MEM] [-T THREADS] [-verbose] [-v]
 options:
@@ -821,7 +821,7 @@ options:
                         Substring to split filenames and extract genome names for gff files (e.g., '_combined.gff3') - Use with -input_type separate/combined.
   -name_split_fasta NAME_SPLIT_FASTA
                         Substring to split filenames and extract genome names for fasta files if named differently to paired gff files (e.g., '_dna.fasta') - Use with -input_type separate/combined.
-  -sequence_type {AA,DNA}
+  -seq_type {AA,DNA}
                         Clustering mode: 'DNA' or 'AA'.
   -gene_ident GENE_IDENT
                         Gene identifiers to extract sequences (e.g., 'CDS, tRNA').
@@ -895,7 +895,7 @@ Seq-Combiner -input_dir .../test_data/genomes -name_split_gff .gff3 -output_dir
 ```
 usage: Seq_Combiner.py [-h] -input_dir INPUT_DIR -input_type {separate,combined,fasta} [-name_split_gff NAME_SPLIT_GFF] [-name_split_fasta NAME_SPLIT_FASTA] -output_dir OUTPUT_DIR -output_name OUTPUT_FILE [-gene_ident GENE_IDENT] [-translate] [-v]
-PyamilySeq v1.3.1: Seq-Combiner - A tool to extract sequences from GFF/FASTA files and prepare them for PyamilySeq.
+PyamilySeq v1.3.3: Seq-Combiner - A tool to extract sequences from GFF/FASTA files and prepare them for PyamilySeq.
 options:
   -h, --help            show this help message and exit
@@ -927,18 +927,18 @@ Misc Arguments:
 ## Group-Splitter: This tool can split multi-copy gene groups using CD-HIT after initial PyamilySeq analysis.
 ### Example:
 ```bash
-Group-Splitter -genome_num 10 -input_fasta .../test/species/ -output_dir .../test/species/ -sequence_type AA
+Group-Splitter -genome_num 10 -input_fasta .../test/species/ -output_dir .../test/species/ -seq_type AA
 ```
 ### Group-Splitter Menu:
 ```
-usage: Group_Splitter.py [-h] -input_fasta INPUT_FASTA -sequence_type {AA,DNA}
+usage: Group_Splitter.py [-h] -input_fasta INPUT_FASTA -seq_type {AA,DNA}
                          -genome_num GENOME_NUM -output_dir OUTPUT_DIR
                          [-groups GROUPS] [-group_threshold GROUP_THRESHOLD]
                          [-c PIDENT] [-s LEN_DIFF] [-T CLUSTERING_THREADS]
                          [-M CLUSTERING_MEMORY] [-no_delete_temp_files]
                          [-verbose] [-v]
-PyamilySeq v1.3.1: Group-Splitter - A tool to split multi-copy gene groups
+PyamilySeq v1.3.3: Group-Splitter - A tool to split multi-copy gene groups
 identified by PyamilySeq.
 options:
@@ -947,7 +947,7 @@ options:
 Required Parameters:
   -input_fasta INPUT_FASTA
                         Input FASTA file containing gene groups.
-  -sequence_type {AA,DNA}
+  -seq_type {AA,DNA}
                         Default - DNA: Are groups "DNA" or "AA" sequences?
   -genome_num GENOME_NUM
                         The total number of genomes must be provide
@@ -981,17 +981,17 @@ Misc Parameters:
 ```
-## Cluster-Summary menu: This tool can be used to summarise CD-HIT .clstr files:
+## Group-Summary menu: This tool can be used to summarise CD-HIT .clstr files:
 ### Example:
 ```bash
-Cluster-Summary -genome_num 10 -input_clstr .../test_data/species/E-coli/E-coli_extracted_pep_cd-hit_80.clstr -output_tsv .../test_data/species/E-coli/E-coli_extracted_pep_cd-hit_80_Summary.tsv
+Group-Summary -genome_num 10 -input_clstr .../test_data/species/E-coli/E-coli_extracted_pep_cd-hit_80.clstr -output_tsv .../test_data/species/E-coli/E-coli_extracted_pep_cd-hit_80_Summary.tsv
 ```
-### Cluster-Summary Menu:
+### Group-Summary Menu:
 ```
 usage: Cluster_Summary.py [-h] -input_clstr INPUT_CLSTR -output OUTPUT -genome_num GENOME_NUM
                           [-output_dir OUTPUT_DIR] [-verbose] [-v]
-PyamilySeq v1.3.1: Cluster-Summary - A tool to summarise CD-HIT clustering files.
+PyamilySeq v1.3.3: Group-Summary - A tool to summarise CD-HIT clustering files.
 options:
   -h, --help            show this help message and exit

{pyamilyseq-1.3.1 → pyamilyseq-1.3.3}/README.md RENAMED

@@ -29,7 +29,7 @@ To update to the newest version add '-U' to end of the pip install command.
 ```commandline
 usage: PyamilySeq.py [-h] {Full,Partial} ...
-PyamilySeq v1.3.1: A tool for gene clustering and analysis.
+PyamilySeq v1.3.3: A tool for gene clustering and analysis.
 positional arguments:
   {Full,Partial}  Choose a mode: 'Full' or 'Partial'.
@@ -59,7 +59,7 @@ Escherichia_coli_110957|ENSB_TIZS9kbTvShDvyX	Escherichia_coli_110957|ENSB_TIZS9k
 ```
 ### Example output:
 ```
-Running PyamilySeq v1.3.1
+Running PyamilySeq v1.3.3
 Calculating Groups
 Number of Genomes: 10
 Gene Groups
@@ -114,7 +114,7 @@ Total Number of First Gene Groups That Had Additional Second Sequences But Not N
 ## PyamilySeq is separated into two main 'run modes', Full and Partial. They each have their own set of required and optional arguments.
 ### PyamilySeq - Full Menu:
 ```
-usage: PyamilySeq.py Full [-h] -output_dir OUTPUT_DIR -input_type {separate,combined,fasta} [-input_dir INPUT_DIR] [-input_fasta INPUT_FASTA] [-name_split_gff NAME_SPLIT_GFF] [-name_split_fasta NAME_SPLIT_FASTA] [-sequence_type {AA,DNA}] [-gene_ident GENE_IDENT] [-c PIDENT] [-s LEN_DIFF] [-fast_mode]
+usage: PyamilySeq.py Full [-h] -output_dir OUTPUT_DIR -input_type {separate,combined,fasta} [-input_dir INPUT_DIR] [-input_fasta INPUT_FASTA] [-name_split_gff NAME_SPLIT_GFF] [-name_split_fasta NAME_SPLIT_FASTA] [-seq_type {AA,DNA}] [-gene_ident GENE_IDENT] [-c PIDENT] [-s LEN_DIFF] [-fast_mode]
                           [-group_mode {Species,Genus}] [-species_groups SPECIES_GROUPS] [-genus_groups GENUS_GROUPS] [-write_groups WRITE_GROUPS] [-write_individual_groups] [-align] [-align_aa] [-no_gpa] [-M MEM] [-T THREADS] [-verbose] [-v]
 options:
@@ -130,7 +130,7 @@ options:
                         Substring to split filenames and extract genome names for gff files (e.g., '_combined.gff3') - Use with -input_type separate/combined.
   -name_split_fasta NAME_SPLIT_FASTA
                         Substring to split filenames and extract genome names for fasta files if named differently to paired gff files (e.g., '_dna.fasta') - Use with -input_type separate/combined.
-  -sequence_type {AA,DNA}
+  -seq_type {AA,DNA}
                         Clustering mode: 'DNA' or 'AA'.
   -gene_ident GENE_IDENT
                         Gene identifiers to extract sequences (e.g., 'CDS, tRNA').
@@ -204,7 +204,7 @@ Seq-Combiner -input_dir .../test_data/genomes -name_split_gff .gff3 -output_dir
 ```
 usage: Seq_Combiner.py [-h] -input_dir INPUT_DIR -input_type {separate,combined,fasta} [-name_split_gff NAME_SPLIT_GFF] [-name_split_fasta NAME_SPLIT_FASTA] -output_dir OUTPUT_DIR -output_name OUTPUT_FILE [-gene_ident GENE_IDENT] [-translate] [-v]
-PyamilySeq v1.3.1: Seq-Combiner - A tool to extract sequences from GFF/FASTA files and prepare them for PyamilySeq.
+PyamilySeq v1.3.3: Seq-Combiner - A tool to extract sequences from GFF/FASTA files and prepare them for PyamilySeq.
 options:
   -h, --help            show this help message and exit
@@ -236,18 +236,18 @@ Misc Arguments:
 ## Group-Splitter: This tool can split multi-copy gene groups using CD-HIT after initial PyamilySeq analysis.
 ### Example:
 ```bash
-Group-Splitter -genome_num 10 -input_fasta .../test/species/ -output_dir .../test/species/ -sequence_type AA
+Group-Splitter -genome_num 10 -input_fasta .../test/species/ -output_dir .../test/species/ -seq_type AA
 ```
 ### Group-Splitter Menu:
 ```
-usage: Group_Splitter.py [-h] -input_fasta INPUT_FASTA -sequence_type {AA,DNA}
+usage: Group_Splitter.py [-h] -input_fasta INPUT_FASTA -seq_type {AA,DNA}
                          -genome_num GENOME_NUM -output_dir OUTPUT_DIR
                          [-groups GROUPS] [-group_threshold GROUP_THRESHOLD]
                          [-c PIDENT] [-s LEN_DIFF] [-T CLUSTERING_THREADS]
                          [-M CLUSTERING_MEMORY] [-no_delete_temp_files]
                          [-verbose] [-v]
-PyamilySeq v1.3.1: Group-Splitter - A tool to split multi-copy gene groups
+PyamilySeq v1.3.3: Group-Splitter - A tool to split multi-copy gene groups
 identified by PyamilySeq.
 options:
@@ -256,7 +256,7 @@ options:
 Required Parameters:
   -input_fasta INPUT_FASTA
                         Input FASTA file containing gene groups.
-  -sequence_type {AA,DNA}
+  -seq_type {AA,DNA}
                         Default - DNA: Are groups "DNA" or "AA" sequences?
   -genome_num GENOME_NUM
                         The total number of genomes must be provide
@@ -290,17 +290,17 @@ Misc Parameters:
 ```
-## Cluster-Summary menu: This tool can be used to summarise CD-HIT .clstr files:
+## Group-Summary menu: This tool can be used to summarise CD-HIT .clstr files:
 ### Example:
 ```bash
-Cluster-Summary -genome_num 10 -input_clstr .../test_data/species/E-coli/E-coli_extracted_pep_cd-hit_80.clstr -output_tsv .../test_data/species/E-coli/E-coli_extracted_pep_cd-hit_80_Summary.tsv
+Group-Summary -genome_num 10 -input_clstr .../test_data/species/E-coli/E-coli_extracted_pep_cd-hit_80.clstr -output_tsv .../test_data/species/E-coli/E-coli_extracted_pep_cd-hit_80_Summary.tsv
 ```
-### Cluster-Summary Menu:
+### Group-Summary Menu:
 ```
 usage: Cluster_Summary.py [-h] -input_clstr INPUT_CLSTR -output OUTPUT -genome_num GENOME_NUM
                           [-output_dir OUTPUT_DIR] [-verbose] [-v]
-PyamilySeq v1.3.1: Cluster-Summary - A tool to summarise CD-HIT clustering files.
+PyamilySeq v1.3.3: Group-Summary - A tool to summarise CD-HIT clustering files.
 options:
   -h, --help            show this help message and exit
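
The README hunks above rename the `-sequence_type` flag to `-seq_type` and the `Cluster-Summary` command to `Group-Summary`. For scripts written against 1.3.1, a small argument translator is one way to bridge the change. This is a minimal sketch and not part of PyamilySeq: `migrate_argv` and both mapping tables are hypothetical names, and the tables hold only the renames visible in this diff. Whether the renamed commands accept otherwise identical arguments is not confirmed here.

```python
# Hypothetical helper, not part of PyamilySeq: rewrites a 1.3.1-style
# command line into its 1.3.3 spelling using only renames seen in this diff.
RENAMED_COMMANDS = {
    "Cluster-Summary": "Group-Summary",
    "cluster-summary": "group-summary",
    "Cluster-Extractor": "Group-Extractor",
    "cluster-extractor": "group-extractor",
}
RENAMED_FLAGS = {"-sequence_type": "-seq_type"}


def migrate_argv(argv):
    """Return a copy of argv with old command and flag names replaced."""
    if not argv:
        return []
    # Only the first token is treated as the command name.
    command = RENAMED_COMMANDS.get(argv[0], argv[0])
    # Remaining tokens are replaced only on an exact flag match.
    rest = [RENAMED_FLAGS.get(token, token) for token in argv[1:]]
    return [command] + rest


old = ["Group-Splitter", "-genome_num", "10", "-sequence_type", "AA"]
print(migrate_argv(old))
```

Matching whole tokens, rather than substrings, keeps file paths that happen to contain `sequence_type` untouched.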

{pyamilyseq-1.3.1 → pyamilyseq-1.3.3}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "PyamilySeq"
-version = "1.3.1"
+version = "1.3.3"
 authors = [
     {name = "Nicholas Dimonaco", email = "nicholas@dimonaco.co.uk"}
 ]
@@ -33,10 +33,10 @@ Homepage = "https://github.com/NickJD/PyamilySeq"
     seq-combiner = "PyamilySeq.Seq_Combiner:main"
     Group-Splitter = "PyamilySeq.Group_Splitter:main"
     group-splitter = "PyamilySeq.Group_Splitter:main"
-    Cluster-Summary = "PyamilySeq.Cluster_Summary:main"
-    cluster-summary = "PyamilySeq.Cluster_Summary:main"
-    Cluster-Extractor = "PyamilySeq.Cluster_Extractor:main"
-    cluster-extractor = "PyamilySeq.Cluster_Extractor:main"
+    Group-Extractor = "PyamilySeq.Group_Extractor:main"
+    group-extractor = "PyamilySeq.Group_Extractor:main"
+    Group-Summary = "PyamilySeq.Group_Summary:main"
+    group-summary = "PyamilySeq.Group_Summary:main"
     Seq-Finder = "PyamilySeq.Seq_Finder:main"
     seq-finder = "PyamilySeq.Seq_Finder:main"
     Seq-Extractor = "PyamilySeq.Seq_Extractor:main"
@@ -56,5 +56,4 @@ include = ["PyamilySeq*"]
 [tool.setuptools.package-data]
 PyamilySeq = [
 ]
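
Each console-script line in the hunk above has the form `name = "module:attribute"`. As an illustration of how such a value is split, the standard-library `importlib.metadata.EntryPoint` class can parse one of the new lines directly. The sketch assumes Python 3.9 or newer, where the `module` and `attr` properties are available. It only parses the string and does not import PyamilySeq.

```python
from importlib.metadata import EntryPoint

# One of the console-script lines added in 1.3.3, copied from the hunk above.
ep = EntryPoint(
    name="Group-Summary",
    value="PyamilySeq.Group_Summary:main",
    group="console_scripts",
)

# The value splits at the colon into an importable module and an attribute.
print(ep.module)
print(ep.attr)
```

Because the old `Cluster-Summary` and `Cluster-Extractor` names are removed rather than aliased, the old command names would no longer be installed as console scripts in 1.3.3.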

pyamilyseq-1.3.1/src/PyamilySeq/Cluster_Compare.py → pyamilyseq-1.3.3/src/PyamilySeq/Group_Compare.py RENAMED

@@ -1,5 +1,12 @@
-import argparse
 from collections import defaultdict
+import logging
+import os
+# Use centralised logger factory from constants
+try:
+    from .constants import configure_logger, LoggingArgumentParser
+except Exception:
+    from constants import configure_logger, LoggingArgumentParser
 def read_cd_hit_output(clstr_file):
     """
@@ -23,10 +30,8 @@ def read_cd_hit_output(clstr_file):
     return seq_to_cluster
 def compare_cd_hit_clusters(file1, file2, output_file):
-    """
-    Compares two CD-HIT .clstr files to check if clusters are the same.
-    Writes the results to a TSV file.
-    """
+    logger = logging.getLogger("PyamilySeq.Group_Compare")
+    logger.info("Comparing clusters: %s vs %s", file1, file2)
     # Read both clustering files
     clusters1 = read_cd_hit_output(file1)
     clusters2 = read_cd_hit_output(file2)
@@ -80,12 +85,11 @@ def compare_cd_hit_clusters(file1, file2, output_file):
                     tsv_data.append([seq, cluster_id1, cluster_id2, "Cluster name change"])
     # Print metrics
-    print("🔢 Clustering Comparison Metrics:")
-    print(f"Cluster name changes: {cluster_name_changes}")
-    print(f"Sequence shifts (sequences assigned to different clusters): {sequence_shifts}")
-    print(f"Sequences only in the first file: {len(only_in_file1)}")
-    print(f"Sequences only in the second file: {len(only_in_file2)}")
-    print()
+    logger.info("Clustering Comparison Metrics:")
+    logger.info("Cluster name changes: %s", cluster_name_changes)
+    logger.info("Sequence shifts (sequences assigned to different clusters): %s", sequence_shifts)
+    logger.info("Sequences only in the first file: %s", len(only_in_file1))
+    logger.info("Sequences only in the second file: %s", len(only_in_file2))
     # Write the results to a TSV file
     with open(output_file, 'w') as out_file:
@@ -93,15 +97,25 @@ def compare_cd_hit_clusters(file1, file2, output_file):
         for row in tsv_data:
             out_file.write("\t".join(map(str, row)) + "\n")
-    print(f"✅ Results have been written to {output_file}")
+    logger.info("Results have been written to %s", output_file)
 def main():
-    parser = argparse.ArgumentParser(description="Compare two CD-HIT .clstr files to check for clustering consistency.")
+    # Early console-only logger so parser.description and argparse messages are emitted via logger
+    early_logger = configure_logger("PyamilySeq.Group_Compare", enable_file=False, log_dir=None, verbose=False)
+    parser = LoggingArgumentParser(logger_name="PyamilySeq.Group_Compare", description="Running Group-Compare - A tool to compare two CD-HIT .clstr files to check for clustering consistency.")
     parser.add_argument("-file1", required=True, help="First CD-HIT .clstr file")
     parser.add_argument("-file2", required=True, help="Second CD-HIT .clstr file")
     parser.add_argument("-output", required=True, help="Output file (TSV format)")
+    parser.add_argument("--log", action="store_true", dest="log", help="Create a timestamped logfile for this run.")
+    parser.add_argument("--log-dir", dest="log_dir", default=None, help="Directory for logfile (default: same dir as -output).")
     args = parser.parse_args()
+    # Setup logger
+    out_dir = os.path.abspath(os.path.dirname(args.output)) if args.output else os.getcwd()
+    log_dir = args.log_dir if args.log_dir else out_dir
+    logger = configure_logger("PyamilySeq.Group_Compare", enable_file=args.log, log_dir=log_dir, verbose=False)
     compare_cd_hit_clusters(args.file1, args.file2, args.output)
 if __name__ == "__main__":
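
The hunks above call `configure_logger(name, enable_file=..., log_dir=..., verbose=...)` from `constants`, but that module is not part of this excerpt. The following is a sketch of what such a factory might look like, inferred only from the call sites. The handler setup, log format, and logfile naming are all assumptions, not the package's actual implementation.

```python
import logging
import os
import sys
import tempfile
from datetime import datetime


def configure_logger(name, enable_file=False, log_dir=None, verbose=False):
    """Assumed behaviour only: console logging, plus an optional timestamped file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)
    logger.propagate = False
    # Drop handlers from an earlier call (e.g. the early console-only logger)
    # so that messages are not emitted twice.
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
        handler.close()

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    console = logging.StreamHandler(sys.stdout)
    console.setFormatter(formatter)
    logger.addHandler(console)

    if enable_file:
        log_dir = log_dir or os.getcwd()
        os.makedirs(log_dir, exist_ok=True)
        stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        file_handler = logging.FileHandler(os.path.join(log_dir, f"{name}_{stamp}.log"))
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)
    return logger


# Mirror the call order in main(): console-only first, then the real configuration.
configure_logger("PyamilySeq.Group_Compare", enable_file=False)
demo_dir = tempfile.mkdtemp()
logger = configure_logger("PyamilySeq.Group_Compare", enable_file=True, log_dir=demo_dir)
logger.info("Comparing clusters: %s vs %s", "a.clstr", "b.clstr")
```

Since `logging.getLogger` returns the same object for the same name, `compare_cd_hit_clusters` can fetch the logger by name and pick up whatever handlers `main()` configured, without the logger being passed in as an argument.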

{pyamilyseq-1.3.1 → pyamilyseq-1.3.3}/src/PyamilySeq/Group_Extractor.py RENAMED

@@ -1,6 +1,12 @@
-import argparse
 import os
 import csv
+import logging
+# Use centralissed logger factory from constants
+try:
+    from .constants import configure_logger, LoggingArgumentParser
+except Exception:
+    from constants import configure_logger, LoggingArgumentParser
 def parse_fasta(fasta_file):
@@ -43,9 +49,8 @@ def parse_csv(csv_file):
 def write_group_fastas(groups, sequences, output_dir):
-    """
-    Writes individual FASTA files for each group with the relevant sequences.
-    """
+    logger = logging.getLogger("PyamilySeq.Group_Extractor")
     if not os.path.exists(output_dir):
         os.makedirs(output_dir)
@@ -56,27 +61,39 @@ def write_group_fastas(groups, sequences, output_dir):
                 if gene_id in sequences:
                     f.write(f">{gene_id}\n{sequences[gene_id]}\n")
                 else:
-                    print(f"Warning: Gene ID {gene_id} not found in FASTA file.")
+                    logger.warning("Warning: Gene ID %s not found in FASTA file.", gene_id)
 def main():
-    parser = argparse.ArgumentParser(description="Process FASTA and CSV files to create grouped FASTA outputs.")
+    # Early console-only logger so the parser description is logged before argparse outputs.
+    early_logger = configure_logger("PyamilySeq.Group_Extractor", enable_file=False, log_dir=None, verbose=False)
+    parser = LoggingArgumentParser(logger_name="PyamilySeq.Group_Extractor", description="Running Group-Extractor - A tool to process FASTA and CSV files to create grouped FASTA outputs.")
     parser.add_argument("-fasta", required=True, help="Input FASTA file containing gene sequences.")
     parser.add_argument("-csv", required=True, help="Input CSV file containing group and gene information.")
     parser.add_argument("-output_dir", required=True, help="Directory to save the grouped FASTA files.")
+    parser.add_argument("--log", action="store_true", dest="log", help="Create a timestamped logfile for this run.")
+    parser.add_argument("--log-dir", dest="log_dir", default=None, help="Directory for logfile (default: output_dir).")
     args = parser.parse_args()
-    # Parse the input files
-    print("Parsing FASTA file...")
+    # Setup logger writing to output_dir (optional file)
+    log_dir = os.path.abspath(args.output_dir) if args.output_dir else os.getcwd()
+    if hasattr(args, "log_dir") and args.log_dir:
+        log_dir = args.log_dir
+    # Only create a logfile when --log is provided; default is console (stdout) only.
+    logger = configure_logger("PyamilySeq.Group_Extractor", enable_file=getattr(args, "log", False), log_dir=log_dir, verbose=False)
+    logger.info("Parsing FASTA file: %s", args.fasta)
     sequences = parse_fasta(args.fasta)
-    print("Parsing CSV file...")
+    logger.info("Parsed %d sequences.", len(sequences))
+    logger.info("Parsing CSV file: %s", args.csv)
     groups = parse_csv(args.csv)
+    logger.info("Parsed %d groups.", len(groups))
-    # Write the grouped FASTA files
-    print("Writing grouped FASTA files...")
+    logger.info("Writing grouped FASTA files to %s", args.output_dir)
     write_group_fastas(groups, sequences, args.output_dir)
-    print("Process completed successfully.")
+    logger.info("Process completed successfully.")
 if __name__ == "__main__":
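
`LoggingArgumentParser` replaces `argparse.ArgumentParser` in these hunks and takes an extra `logger_name` keyword, but its definition in `constants` is not shown. One plausible shape, sketched here purely as an assumption, is a subclass that routes argparse error messages through the named logger. The real class may do more, such as logging the description, or may work differently.

```python
import argparse
import logging
import sys


class LoggingArgumentParser(argparse.ArgumentParser):
    """Assumed behaviour only: send argparse errors through a named logger."""

    def __init__(self, *args, logger_name="PyamilySeq", **kwargs):
        super().__init__(*args, **kwargs)
        self._logger = logging.getLogger(logger_name)

    def error(self, message):
        # Stock argparse prints usage and the message to stderr, then exits
        # with status 2; here the message goes to the logger instead.
        self._logger.error("%s: %s", self.prog, message)
        self.exit(2)


logging.basicConfig(stream=sys.stdout, level=logging.INFO)
parser = LoggingArgumentParser(logger_name="PyamilySeq.Group_Extractor", description="demo")
parser.add_argument("-fasta", required=True)
parser.add_argument("--log", action="store_true", dest="log")

args = parser.parse_args(["-fasta", "genes.fasta", "--log"])
print(args.fasta, args.log)
```

Because `--log` is declared with `action="store_true"`, `args.log` always exists after parsing, so the `getattr(args, "log", False)` guard in the hunk above is defensive rather than required.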

{pyamilyseq-1.3.1 → pyamilyseq-1.3.3}/src/PyamilySeq/Group_Sizes.py RENAMED

@@ -1,6 +1,14 @@
-import argparse
 import os
 import csv
+import logging
+# Use centralised logger factory from constants
+try:
+    from .constants import configure_logger, LoggingArgumentParser
+except Exception:
+    from constants import configure_logger, LoggingArgumentParser
 def parse_fasta_stats(fasta_file):
@@ -43,9 +51,7 @@ def parse_fasta_stats(fasta_file):
 def process_fasta_directory(input_dir, output_csv):
-    """
-    Processes a directory of FASTA files and writes statistics to a CSV file.
-    """
+    logger = logging.getLogger("PyamilySeq.Group_Sizes")
     results = []
     for filename in os.listdir(input_dir):
         if filename.endswith(".fasta"):
@@ -68,19 +74,27 @@ def process_fasta_directory(input_dir, output_csv):
         writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
         writer.writeheader()
         writer.writerows(results)
+    logger.info("Wrote statistics for %d FASTA files to %s", len(results), output_csv)
 def main():
-    parser = argparse.ArgumentParser(description="Summarize sequence statistics for a directory of FASTA files.")
+    # Early console-only logger so the parser.description is emitted via logger before argparse prints usage/help.
+    early_logger = configure_logger("PyamilySeq.Group_Sizes", enable_file=False, log_dir=None, verbose=False)
+    parser = LoggingArgumentParser(logger_name="PyamilySeq.Group_Sizes", description="Group-Sizes - A tool to summarise sequence statistics for a directory of FASTA files.")
     parser.add_argument("-input_dir", required=True, help="Directory containing FASTA files.")
     parser.add_argument("-output_csv", required=True, help="Output CSV file to save statistics.")
+    parser.add_argument("--log", action="store_true", dest="log", help="Create a timestamped logfile for this run.")
+    parser.add_argument("--log-dir", dest="log_dir", default=None, help="Directory for logfile (default: same dir as -output_csv).")
     args = parser.parse_args()
-    # Process the directory of FASTA files
-    print("Processing FASTA files...")
+    out_dir = os.path.abspath(os.path.dirname(args.output_csv)) if args.output_csv else os.getcwd()
+    log_dir = args.log_dir if args.log_dir else out_dir
+    logger = configure_logger("PyamilySeq.Group_Sizes", enable_file=args.log, log_dir=log_dir, verbose=False)
+    logger.info("Processing FASTA files in %s", args.input_dir)
     process_fasta_directory(args.input_dir, args.output_csv)
-    print(f"Statistics saved to {args.output_csv}")
+    logger.info("Statistics saved to %s", args.output_csv)
 if __name__ == "__main__":
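
`Group_Compare.py` and `Group_Sizes.py` both choose the logfile directory with the same two lines: an explicit `--log-dir` wins, otherwise the directory of the output file is used. The sketch below pulls that rule into a standalone function to show how it behaves for a bare filename. `resolve_log_dir` is a hypothetical name and does not appear in the package.

```python
import os


def resolve_log_dir(output_path, log_dir=None):
    """Pick the logfile directory the same way the main() hunks above do."""
    if log_dir:
        # An explicit --log-dir is used unchanged, even if it is relative.
        return log_dir
    if output_path:
        # dirname("stats.csv") is "", and abspath("") is the current directory.
        return os.path.abspath(os.path.dirname(output_path))
    return os.getcwd()


print(resolve_log_dir("results/stats.csv"))
print(resolve_log_dir("stats.csv"))
print(resolve_log_dir("results/stats.csv", log_dir="logs"))
```

`Group_Extractor.py` differs slightly: its output argument is already a directory, so it uses `os.path.abspath(args.output_dir)` directly instead of taking `dirname`.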
