PyPI - PyamilySeq - Versions diffs - 1.1.2__tar.gz → 1.2.0__tar.gz - Mend

PyamilySeq 1.1.2__tar.gz → 1.2.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (28)

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: PyamilySeq
-Version: 1.1.2
+Version: 1.2.0
 Summary: PyamilySeq - A a tool to investigate sequence-based gene groups identified by clustering methods such as CD-HIT, DIAMOND, BLAST or MMseqs2.
 Home-page: https://github.com/NickJD/PyamilySeq
 Author: Nicholas Dimonaco
@@ -46,7 +46,7 @@ To update to the newest version add '-U' to end of the pip install command.
 ```commandline
 usage: PyamilySeq.py [-h] {Full,Partial} ...
-PyamilySeq v1.1.2: A tool for gene clustering and analysis.
+PyamilySeq v1.2.0: A tool for gene clustering and analysis.
 positional arguments:
   {Full,Partial}  Choose a mode: 'Full' or 'Partial'.
@@ -76,7 +76,7 @@ Escherichia_coli_110957|ENSB_TIZS9kbTvShDvyX	Escherichia_coli_110957|ENSB_TIZS9k
 ```
 ### Example output:
 ```
-Running PyamilySeq v1.1.2
+Running PyamilySeq v1.2.0
 Calculating Groups
 Number of Genomes: 10
 Gene Groups
@@ -221,7 +221,7 @@ Seq-Combiner -input_dir .../test_data/genomes -name_split_gff .gff3 -output_dir
 ```
 usage: Seq_Combiner.py [-h] -input_dir INPUT_DIR -input_type {separate,combined,fasta} [-name_split_gff NAME_SPLIT_GFF] [-name_split_fasta NAME_SPLIT_FASTA] -output_dir OUTPUT_DIR -output_name OUTPUT_FILE [-gene_ident GENE_IDENT] [-translate] [-v]
-PyamilySeq v1.1.2: Seq-Combiner - A tool to extract sequences from GFF/FASTA files and prepare them for PyamilySeq.
+PyamilySeq v1.2.0: Seq-Combiner - A tool to extract sequences from GFF/FASTA files and prepare them for PyamilySeq.
 options:
   -h, --help            show this help message and exit
@@ -264,7 +264,7 @@ usage: Group_Splitter.py [-h] -input_fasta INPUT_FASTA -sequence_type {AA,DNA}
                          [-M CLUSTERING_MEMORY] [-no_delete_temp_files]
                          [-verbose] [-v]
-PyamilySeq v1.1.2: Group-Splitter - A tool to split multi-copy gene groups
+PyamilySeq v1.2.0: Group-Splitter - A tool to split multi-copy gene groups
 identified by PyamilySeq.
 options:
@@ -317,7 +317,7 @@ Cluster-Summary -genome_num 10 -input_clstr .../test_data/species/E-coli/E-coli_
 usage: Cluster_Summary.py [-h] -input_clstr INPUT_CLSTR -output OUTPUT -genome_num GENOME_NUM
                           [-output_dir OUTPUT_DIR] [-verbose] [-v]
-PyamilySeq v1.1.2: Cluster-Summary - A tool to summarise CD-HIT clustering files.
+PyamilySeq v1.2.0: Cluster-Summary - A tool to summarise CD-HIT clustering files.
 options:
   -h, --help            show this help message and exit

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/README.md RENAMED

@@ -29,7 +29,7 @@ To update to the newest version add '-U' to end of the pip install command.
 ```commandline
 usage: PyamilySeq.py [-h] {Full,Partial} ...
-PyamilySeq v1.1.2: A tool for gene clustering and analysis.
+PyamilySeq v1.2.0: A tool for gene clustering and analysis.
 positional arguments:
   {Full,Partial}  Choose a mode: 'Full' or 'Partial'.
@@ -59,7 +59,7 @@ Escherichia_coli_110957|ENSB_TIZS9kbTvShDvyX	Escherichia_coli_110957|ENSB_TIZS9k
 ```
 ### Example output:
 ```
-Running PyamilySeq v1.1.2
+Running PyamilySeq v1.2.0
 Calculating Groups
 Number of Genomes: 10
 Gene Groups
@@ -204,7 +204,7 @@ Seq-Combiner -input_dir .../test_data/genomes -name_split_gff .gff3 -output_dir
 ```
 usage: Seq_Combiner.py [-h] -input_dir INPUT_DIR -input_type {separate,combined,fasta} [-name_split_gff NAME_SPLIT_GFF] [-name_split_fasta NAME_SPLIT_FASTA] -output_dir OUTPUT_DIR -output_name OUTPUT_FILE [-gene_ident GENE_IDENT] [-translate] [-v]
-PyamilySeq v1.1.2: Seq-Combiner - A tool to extract sequences from GFF/FASTA files and prepare them for PyamilySeq.
+PyamilySeq v1.2.0: Seq-Combiner - A tool to extract sequences from GFF/FASTA files and prepare them for PyamilySeq.
 options:
   -h, --help            show this help message and exit
@@ -247,7 +247,7 @@ usage: Group_Splitter.py [-h] -input_fasta INPUT_FASTA -sequence_type {AA,DNA}
                          [-M CLUSTERING_MEMORY] [-no_delete_temp_files]
                          [-verbose] [-v]
-PyamilySeq v1.1.2: Group-Splitter - A tool to split multi-copy gene groups
+PyamilySeq v1.2.0: Group-Splitter - A tool to split multi-copy gene groups
 identified by PyamilySeq.
 options:
@@ -300,7 +300,7 @@ Cluster-Summary -genome_num 10 -input_clstr .../test_data/species/E-coli/E-coli_
 usage: Cluster_Summary.py [-h] -input_clstr INPUT_CLSTR -output OUTPUT -genome_num GENOME_NUM
                           [-output_dir OUTPUT_DIR] [-verbose] [-v]
-PyamilySeq v1.1.2: Cluster-Summary - A tool to summarise CD-HIT clustering files.
+PyamilySeq v1.2.0: Cluster-Summary - A tool to summarise CD-HIT clustering files.
 options:
   -h, --help            show this help message and exit

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/setup.cfg RENAMED

@@ -1,6 +1,6 @@
 [metadata]
 name = PyamilySeq
-version = v1.1.2
+version = v1.2.0
 license_files = LICENSE
 author = Nicholas Dimonaco
 author_email = nicholas@dimonaco.co.uk
@@ -43,6 +43,10 @@ console_scripts =
 	seq-finder = PyamilySeq.Seq_Finder:main
 	Seq-Extractor = PyamilySeq.Seq_Extractor:main
 	seq-extractor = PyamilySeq.Seq_Extractor:main
+	compute-singletrees-rf = aux_tools.RF.Compute_SingleTree_RFs:main
+	compare-rf = aux_tools.RF.compare_RF:main
+	compare-contree-singletrees = aux_tools.RF.compare_contree_singletrees:main
 [egg_info]
 tag_build =
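
The three new `console_scripts` entries use the usual `name = module:attribute` form, and all of them point into an `aux_tools.RF` package rather than `PyamilySeq`. As a rough illustration of how such a line breaks down, the sketch below parses one entry by hand. `parse_entry_point` is a hypothetical helper written for this note; it is not part of PyamilySeq or setuptools.

```python
def parse_entry_point(line):
    """Split a 'name = package.module:attr' line into its three parts."""
    name, _, target = line.partition("=")
    module, _, attr = target.strip().partition(":")
    return name.strip(), module, attr


# One of the entries added in this release, copied from the hunk above.
entry = "compare-rf = aux_tools.RF.compare_RF:main"
name, module, attr = parse_entry_point(entry)
print(name, module, attr)
```

Running the installed `compare-rf` command therefore relies on `aux_tools.RF.compare_RF` being importable. This page lists `SOURCES.txt` and `top_level.txt` as unchanged, so whether that package ships in the archive cannot be confirmed from the diff alone.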

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq/Seq_Combiner.py RENAMED

@@ -59,7 +59,7 @@ def main():
         exit(1)
     if options.input_type == 'fasta' and options.name_split_fasta is None:
         print("Please provide a substring to split the filename and extract the genome name.")
-        exit
+        exit(1)
     output_path = os.path.abspath(options.output_dir)
     if not os.path.exists(output_path):
@@ -77,7 +77,7 @@ def main():
     elif options.input_type == 'combined':
         read_combined_files(options.input_dir, options.name_split_gff, options.gene_ident, combined_out_file, options.translate, True)
     elif options.input_type == 'fasta':
-        read_fasta_files(options.input_dir, options.name_split_fasta, combined_out_file, options.translate)
+        read_fasta_files(options.input_dir, options.name_split_fasta, combined_out_file, options.translate, True)
 if __name__ == "__main__":
     main()
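
The first hunk replaces a bare `exit` with `exit(1)`. In Python, writing `exit` without parentheses only names the object and does nothing, so in 1.1.2 the script would have printed the message and carried on. The sketch below shows the corrected pattern in isolation. It uses `sys.exit` in place of the `exit` builtin seen in the diff, and `check_required` is a hypothetical helper, not a function from the package.

```python
import sys


def check_required(value):
    """Return value if it was given; otherwise stop with exit status 1."""
    if value is None:
        print("Please provide a substring to split the filename and extract the genome name.")
        # A bare `exit` here would be a no-op expression; calling
        # sys.exit(1) raises SystemExit, which ends the program.
        sys.exit(1)
    return value


try:
    check_required(None)
except SystemExit as err:
    # Caught only so that the status code can be shown in this example.
    print("exit status:", err.code)
```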

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq/Seq_Extractor.py RENAMED

@@ -9,8 +9,13 @@ def find_gene_ids_in_csv(csv_file, group_name):
             cells = line.strip().split(',')
             if cells[0].replace('"','') == group_name:
                 # Collect gene IDs from column 14 onward
+                # for cell in cells[14:]:
+                #     gene_ids.extend(cell.strip().replace('"','').split())  # Splitting by spaces if there are multiple IDs in a cell                break
                 for cell in cells[14:]:
-                    gene_ids.extend(cell.strip().replace('"','').split())  # Splitting by spaces if there are multiple IDs in a cell                break
+                    for gene in cell.strip().replace('"', '').split(';'):
+                        if gene:
+                            gene_ids.append(gene)
     return gene_ids
 def extract_sequences(fasta_file, gene_ids):
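
This hunk changes how gene IDs are pulled out of each CSV cell: 1.1.2 split on whitespace, while 1.2.0 splits on `;` and skips empty strings. A minimal sketch of the new behaviour follows, with the loop lifted into a hypothetical `split_gene_ids` function and made-up cell values; the real function reads cells from column 14 onward of a CSV line.

```python
def split_gene_ids(cells):
    """Collect non-empty gene IDs from cells whose IDs are separated by ';'."""
    gene_ids = []
    for cell in cells:
        for gene in cell.strip().replace('"', '').split(';'):
            if gene:  # an empty cell yields '' after splitting; skip it
                gene_ids.append(gene)
    return gene_ids


# Made-up cells for illustration only.
cells = ['"g1|a;g1|b"', '""', '"g2|c"\n']
print(split_gene_ids(cells))
```

With the older whitespace split, the first cell would have stayed as the single token `g1|a;g1|b`, which would not match any individual sequence ID.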

pyamilyseq-1.2.0/src/PyamilySeq/constants.py ADDED

@@ -0,0 +1,2 @@
+PyamilySeq_Version = 'v1.2.0'
+

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq/utils.py RENAMED

@@ -7,7 +7,6 @@ from tempfile import NamedTemporaryFile
 import sys
 import re
 import math
-#from config import config_params
 ####
 # Placeholder for the distance function
@@ -15,11 +14,10 @@ levenshtein_distance_cal = None
 # Check for Levenshtein library once
 try:
     import Levenshtein as LV
-    # Assign the optimized function
+    # Assign the optimised function
     def levenshtein_distance_calc(seq1, seq2):
         return LV.distance(seq1, seq2)
 except (ModuleNotFoundError, ImportError):
-    #if config_params.verbose == True: - Not implemented yet
     print("Levenshtein package not installed - Will fallback to slower Python implementation.")
     # Fallback implementation
     def levenshtein_distance_calc(seq1, seq2):
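
The hunk above is cut off before the body of the fallback function. For readers unfamiliar with the pattern, the sketch below shows an optional import with a pure-Python fallback. The fallback body is a standard two-row dynamic-programming version written for this note, and `edit_distance` is a hypothetical name; it should not be read as the package's own implementation.

```python
try:
    import Levenshtein as LV  # optional compiled package

    def edit_distance(seq1, seq2):
        return LV.distance(seq1, seq2)
except ImportError:  # ModuleNotFoundError is a subclass of ImportError
    def edit_distance(seq1, seq2):
        """Pure-Python Levenshtein distance using two rolling rows."""
        previous = list(range(len(seq2) + 1))
        for i, a in enumerate(seq1, start=1):
            current = [i]
            for j, b in enumerate(seq2, start=1):
                current.append(min(
                    previous[j] + 1,             # deletion
                    current[j - 1] + 1,          # insertion
                    previous[j - 1] + (a != b),  # substitution
                ))
            previous = current
        return previous[-1]


print(edit_distance("kitten", "sitting"))  # prints 3
```

Both branches define the same function name, so callers do not need to know which implementation was selected at import time.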

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq.egg-info/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: PyamilySeq
-Version: 1.1.2
+Version: 1.2.0
 Summary: PyamilySeq - A a tool to investigate sequence-based gene groups identified by clustering methods such as CD-HIT, DIAMOND, BLAST or MMseqs2.
 Home-page: https://github.com/NickJD/PyamilySeq
 Author: Nicholas Dimonaco
@@ -46,7 +46,7 @@ To update to the newest version add '-U' to end of the pip install command.
 ```commandline
 usage: PyamilySeq.py [-h] {Full,Partial} ...
-PyamilySeq v1.1.2: A tool for gene clustering and analysis.
+PyamilySeq v1.2.0: A tool for gene clustering and analysis.
 positional arguments:
   {Full,Partial}  Choose a mode: 'Full' or 'Partial'.
@@ -76,7 +76,7 @@ Escherichia_coli_110957|ENSB_TIZS9kbTvShDvyX	Escherichia_coli_110957|ENSB_TIZS9k
 ```
 ### Example output:
 ```
-Running PyamilySeq v1.1.2
+Running PyamilySeq v1.2.0
 Calculating Groups
 Number of Genomes: 10
 Gene Groups
@@ -221,7 +221,7 @@ Seq-Combiner -input_dir .../test_data/genomes -name_split_gff .gff3 -output_dir
 ```
 usage: Seq_Combiner.py [-h] -input_dir INPUT_DIR -input_type {separate,combined,fasta} [-name_split_gff NAME_SPLIT_GFF] [-name_split_fasta NAME_SPLIT_FASTA] -output_dir OUTPUT_DIR -output_name OUTPUT_FILE [-gene_ident GENE_IDENT] [-translate] [-v]
-PyamilySeq v1.1.2: Seq-Combiner - A tool to extract sequences from GFF/FASTA files and prepare them for PyamilySeq.
+PyamilySeq v1.2.0: Seq-Combiner - A tool to extract sequences from GFF/FASTA files and prepare them for PyamilySeq.
 options:
   -h, --help            show this help message and exit
@@ -264,7 +264,7 @@ usage: Group_Splitter.py [-h] -input_fasta INPUT_FASTA -sequence_type {AA,DNA}
                          [-M CLUSTERING_MEMORY] [-no_delete_temp_files]
                          [-verbose] [-v]
-PyamilySeq v1.1.2: Group-Splitter - A tool to split multi-copy gene groups
+PyamilySeq v1.2.0: Group-Splitter - A tool to split multi-copy gene groups
 identified by PyamilySeq.
 options:
@@ -317,7 +317,7 @@ Cluster-Summary -genome_num 10 -input_clstr .../test_data/species/E-coli/E-coli_
 usage: Cluster_Summary.py [-h] -input_clstr INPUT_CLSTR -output OUTPUT -genome_num GENOME_NUM
                           [-output_dir OUTPUT_DIR] [-verbose] [-v]
-PyamilySeq v1.1.2: Cluster-Summary - A tool to summarise CD-HIT clustering files.
+PyamilySeq v1.2.0: Cluster-Summary - A tool to summarise CD-HIT clustering files.
 options:
   -h, --help            show this help message and exit

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq.egg-info/entry_points.txt RENAMED

@@ -8,6 +8,9 @@ Seq-Extractor = PyamilySeq.Seq_Extractor:main
 Seq-Finder = PyamilySeq.Seq_Finder:main
 cluster-extractor = PyamilySeq.Cluster_Extractor:main
 cluster-summary = PyamilySeq.Cluster_Summary:main
+compare-contree-singletrees = aux_tools.RF.compare_contree_singletrees:main
+compare-rf = aux_tools.RF.compare_RF:main
+compute-singletrees-rf = aux_tools.RF.Compute_SingleTree_RFs:main
 group-splitter = PyamilySeq.Group_Splitter:main
 pyamilyseq = PyamilySeq.PyamilySeq:main
 seq-combiner = PyamilySeq.Seq_Combiner:main

pyamilyseq-1.1.2/src/PyamilySeq/constants.py DELETED

@@ -1,2 +0,0 @@
-PyamilySeq_Version = 'v1.1.2'
-

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/LICENSE RENAMED

File without changes

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/pyproject.toml RENAMED

File without changes

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq/Cluster_Compare.py RENAMED

File without changes

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq/Cluster_Summary.py RENAMED

File without changes

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq/Group_Extractor.py RENAMED

File without changes

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq/Group_Sizes.py RENAMED

File without changes

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq/Group_Splitter.py RENAMED

File without changes

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq/PyamilySeq.py RENAMED

File without changes

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq/PyamilySeq_Genus.py RENAMED

File without changes

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq/PyamilySeq_Species.py RENAMED

File without changes

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq/Seq_Finder.py RENAMED

File without changes

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq/__init__.py RENAMED

File without changes

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq/clusterings.py RENAMED

File without changes

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq/config.py RENAMED

File without changes

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq.egg-info/SOURCES.txt RENAMED

File without changes

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq.egg-info/dependency_links.txt RENAMED

File without changes

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq.egg-info/requires.txt RENAMED

File without changes

{pyamilyseq-1.1.2 → pyamilyseq-1.2.0}/src/PyamilySeq.egg-info/top_level.txt RENAMED

File without changes
