PyPI - ORForise - Version diff: 1.5.0.tar.gz → 1.5.1.tar.gz

Files changed (73)

{orforise-1.5.0 → orforise-1.5.1}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: ORForise
-Version: 1.5.0
+Version: 1.5.1
 Summary: ORForise - Platform for analysing and comparing Prokaryote CoDing Sequence (CDS) Gene Predictions.
 Home-page: https://github.com/NickJD/ORForise
 Author: Nicholas Dimonaco
@@ -57,13 +57,7 @@ Example output files from ```Annotation-Compare```, ```GFF-Adder``` and ```GFF-I
 For Help: ```Annotation-Compare -h ```
 ```python
-Thank you for using ORForise
-Please report any issues to: https://github.com/NickJD/ORForise/issues
-#####
-usage: Annotation_Compare.py [-h] -dna GENOME_DNA -ref REFERENCE_ANNOTATION -t TOOL -tp TOOL_PREDICTION
-                             [-rt REFERENCE_TOOL] [-o OUTNAME] [-v {True,False}]
-ORForise v1.5.0: Annotatione-Compare Run Parameters.
+ORForise v1.5.1: Annotatione-Compare Run Parameters.
 Required Arguments:
   -dna GENOME_DNA       Genome DNA file (.fa) which both annotations are based on
@@ -78,8 +72,8 @@ Optional Arguments:
                         name to compare output from two tools
 Output:
-  -o OUTNAME            Define full output filename (format is CSV) - If not provided, summary will be printed to
-                        std-out
+  -o OUTDIR             Define directory where detailed output should be places
+  -n OUTNAME            Define output filename(s) prefix - If not provided, filename of reference annotation file will be used- <outname>_<contig_id>_ORF_Comparison.csv
 Misc:
   -v {True,False}       Default - False: Print out runtime status
@@ -107,13 +101,7 @@ ORForise can be used as the example below.
 For Help: ```Aggregate-Compare -h ```
 ```python
-Thank you for using ORForise
-Please report any issues to: https://github.com/NickJD/ORForise/issues
-#####
-usage: Aggregate_Compare.py [-h] -dna GENOME_DNA -t TOOLS -tp TOOL_PREDICTIONS -ref REFERENCE_ANNOTATION
-                            [-rt REFERENCE_TOOL] [-o OUTNAME] [-v {True,False}]
-ORForise v1.5.0: Aggregate-Compare Run Parameters.
+ORForise v1.5.1: Aggregate-Compare Run Parameters.
 Required Arguments:
   -dna GENOME_DNA       Genome DNA file (.fa) which both annotations are based on
@@ -261,13 +249,7 @@ The ```-gi``` option can be used to allow for different genomic elements to be a
 For Help: ```GFF-Adder -h ```
 ```python
-Thank you for using ORForise
-Please report any issues to: https://github.com/NickJD/ORForise/issues
-#####
-usage: GFF_Adder.py [-h] -dna GENOME_DNA -ref REFERENCE_ANNOTATION -at ADDITIONAL_TOOL -add ADDITIONAL_ANNOTATION -o
-                    OUTPUT_FILE [-rt REFERENCE_TOOL] [-gi GENE_IDENT] [-gene_ident GENE_IDENT] [-olap OVERLAP]
-ORForise v1.5.0: GFF-Adder Run Parameters.
+ORForise v1.5.1: GFF-Adder Run Parameters.
 Required Arguments:
   -dna GENOME_DNA       Genome DNA file (.fa) which both annotations are based on
@@ -323,13 +305,7 @@ The ```-gi``` option can be used to allow for different genomic elements to be a
 For Help: ```GFF-Intersector -h ```
 ```python
-Thank you for using ORForise
-Please report any issues to: https://github.com/NickJD/ORForise/issues
-#####
-usage: GFF_Intersector.py [-h] -dna GENOME_DNA -ref REFERENCE_ANNOTATION -at ADDITIONAL_TOOL -add
-                          ADDITIONAL_ANNOTATION -o OUTPUT_FILE [-rt REFERENCE_TOOL] [-gi GENE_IDENT] [-cov COVERAGE]
-ORForise v1.5.0: GFF-Intersector Run Parameters.
+ORForise v1.5.1: GFF-Intersector Run Parameters.
 Required Arguments:
   -dna GENOME_DNA       Genome DNA file (.fa) which both annotations are based on

{orforise-1.5.0 → orforise-1.5.1}/README.md RENAMED

@@ -40,13 +40,7 @@ Example output files from ```Annotation-Compare```, ```GFF-Adder``` and ```GFF-I
 For Help: ```Annotation-Compare -h ```
 ```python
-Thank you for using ORForise
-Please report any issues to: https://github.com/NickJD/ORForise/issues
-#####
-usage: Annotation_Compare.py [-h] -dna GENOME_DNA -ref REFERENCE_ANNOTATION -t TOOL -tp TOOL_PREDICTION
-                             [-rt REFERENCE_TOOL] [-o OUTNAME] [-v {True,False}]
-ORForise v1.5.0: Annotatione-Compare Run Parameters.
+ORForise v1.5.1: Annotatione-Compare Run Parameters.
 Required Arguments:
   -dna GENOME_DNA       Genome DNA file (.fa) which both annotations are based on
@@ -61,8 +55,8 @@ Optional Arguments:
                         name to compare output from two tools
 Output:
-  -o OUTNAME            Define full output filename (format is CSV) - If not provided, summary will be printed to
-                        std-out
+  -o OUTDIR             Define directory where detailed output should be places
+  -n OUTNAME            Define output filename(s) prefix - If not provided, filename of reference annotation file will be used- <outname>_<contig_id>_ORF_Comparison.csv
 Misc:
   -v {True,False}       Default - False: Print out runtime status
@@ -90,13 +84,7 @@ ORForise can be used as the example below.
 For Help: ```Aggregate-Compare -h ```
 ```python
-Thank you for using ORForise
-Please report any issues to: https://github.com/NickJD/ORForise/issues
-#####
-usage: Aggregate_Compare.py [-h] -dna GENOME_DNA -t TOOLS -tp TOOL_PREDICTIONS -ref REFERENCE_ANNOTATION
-                            [-rt REFERENCE_TOOL] [-o OUTNAME] [-v {True,False}]
-ORForise v1.5.0: Aggregate-Compare Run Parameters.
+ORForise v1.5.1: Aggregate-Compare Run Parameters.
 Required Arguments:
   -dna GENOME_DNA       Genome DNA file (.fa) which both annotations are based on
@@ -244,13 +232,7 @@ The ```-gi``` option can be used to allow for different genomic elements to be a
 For Help: ```GFF-Adder -h ```
 ```python
-Thank you for using ORForise
-Please report any issues to: https://github.com/NickJD/ORForise/issues
-#####
-usage: GFF_Adder.py [-h] -dna GENOME_DNA -ref REFERENCE_ANNOTATION -at ADDITIONAL_TOOL -add ADDITIONAL_ANNOTATION -o
-                    OUTPUT_FILE [-rt REFERENCE_TOOL] [-gi GENE_IDENT] [-gene_ident GENE_IDENT] [-olap OVERLAP]
-ORForise v1.5.0: GFF-Adder Run Parameters.
+ORForise v1.5.1: GFF-Adder Run Parameters.
 Required Arguments:
   -dna GENOME_DNA       Genome DNA file (.fa) which both annotations are based on
@@ -306,13 +288,7 @@ The ```-gi``` option can be used to allow for different genomic elements to be a
 For Help: ```GFF-Intersector -h ```
 ```python
-Thank you for using ORForise
-Please report any issues to: https://github.com/NickJD/ORForise/issues
-#####
-usage: GFF_Intersector.py [-h] -dna GENOME_DNA -ref REFERENCE_ANNOTATION -at ADDITIONAL_TOOL -add
-                          ADDITIONAL_ANNOTATION -o OUTPUT_FILE [-rt REFERENCE_TOOL] [-gi GENE_IDENT] [-cov COVERAGE]
-ORForise v1.5.0: GFF-Intersector Run Parameters.
+ORForise v1.5.1: GFF-Intersector Run Parameters.
 Required Arguments:
   -dna GENOME_DNA       Genome DNA file (.fa) which both annotations are based on

{orforise-1.5.0 → orforise-1.5.1}/setup.cfg RENAMED

@@ -1,6 +1,6 @@
 [metadata]
 name = ORForise
-version = 1.5.0
+version = 1.5.1
 author = Nicholas Dimonaco
 author_email = nicholas@dimonaco.co.uk
 description = ORForise - Platform for analysing and comparing Prokaryote CoDing Sequence (CDS) Gene Predictions.

{orforise-1.5.0 → orforise-1.5.1}/src/ORForise/Annotation_Compare.py RENAMED

@@ -1,7 +1,10 @@
 from importlib import import_module
 import argparse
-import sys,os
-import gzip,csv
+import sys, os
+import gzip, csv
+import logging
+from datetime import datetime
 try:
     from Comparator import tool_comparison
@@ -13,10 +16,21 @@ try:
 except ImportError:
     from ORForise.utils import *
+##########################
+# Consolidate printing and logging into a single block
+def _pct(n, total):
+    try:
+        return format(100 * n / total, '.2f') + '%'
+    except Exception:
+        return 'N/A'
 ##########################
 def comparator(options):
     try:
         try:  # Detect whether fasta/gff files are .gz or text and read accordingly
             fasta_in = gzip.open(options.genome_dna, 'rt')
@@ -77,36 +91,56 @@ def comparator(options):
                 'Contig\tGenes\tORFs\tPerfect_Matches\tPartial_Matches\tMissed_Genes\tUnmatched_ORFs\tMulti_Matched_ORFs\n')
     for dna_region, result in results.items():
-        num_current_genes = len(dna_regions[dna_region][2])
-        num_orfs = result['pred_metrics']['Number_of_ORFs']
-        num_perfect = result['pred_metrics']['Number_of_Perfect_Matches']
-        num_partial = len(result['pred_metrics']['partial_Hits'])
-        num_missed = len(result['rep_metrics']['genes_Undetected'])
-        num_unmatched = len(result['pred_metrics']['unmatched_ORFs'])
-        num_multi = len(result['pred_metrics']['multi_Matched_ORFs'])
-        # Collect summary for this contig
-        if options.outdir:
-            contig_summaries.append([
-                dna_region, num_current_genes, num_orfs, num_perfect, num_partial, num_missed, num_unmatched, num_multi
-            ])
-        ###
-        num_current_genes = len(dna_regions[dna_region][2])
-        print("These are the results for: " + dna_region + '\n')
-        ############################################# To get default output filename from input file details
-        genome_name = options.reference_annotation.split('/')[-1].split('.')[0]
-        rep_metric_description, rep_metrics = get_rep_metrics(result)
-        all_metric_description, all_metrics = get_all_metrics(result)
-        print('Current Contig: ' + str(dna_region))
-        print('Number of Genes: ' + str(num_current_genes))
-        print('Number of ORFs: ' + str(result['pred_metrics']['Number_of_ORFs']))
-        print('Perfect Matches: ' + str(result['pred_metrics']['Number_of_Perfect_Matches']) + ' [' + str(num_current_genes)+ '] - '+ format(100 * result['pred_metrics']['Number_of_Perfect_Matches']/num_current_genes,'.2f')+'%')
-        print('Partial Matches: ' + str(len(result['pred_metrics']['partial_Hits'])) + ' [' + str(num_current_genes)+ '] - '+ format(100 * len(result['pred_metrics']['partial_Hits'])/num_current_genes,'.2f')+'%')
-        print('Missed Genes: ' + str(len(result['rep_metrics']['genes_Undetected'])) + ' [' + str(num_current_genes)+ '] - '+ format(100 * len(result['rep_metrics']['genes_Undetected'])/num_current_genes,'.2f')+'%')
-        print('Unmatched ORFs: ' + str(len(result['pred_metrics']['unmatched_ORFs'])) + ' [' + str(num_current_genes)+ '] - '+ format(100 * len(result['pred_metrics']['unmatched_ORFs'])/num_current_genes,'.2f')+'%')
-        print('Multi-matched ORFs: ' + str(len(result['pred_metrics']['multi_Matched_ORFs'])) + ' [' + str(num_current_genes)+ '] - '+ format(100 * len(result['pred_metrics']['multi_Matched_ORFs'])/num_current_genes,'.2f')+'%')
-        if options.outdir:
+        if result:
+            num_current_genes = len(dna_regions[dna_region][2])
+            num_orfs = result['pred_metrics']['Number_of_ORFs']
+            num_perfect = result['pred_metrics']['Number_of_Perfect_Matches']
+            num_partial = len(result['pred_metrics']['partial_Hits'])
+            num_missed = len(result['rep_metrics']['genes_Undetected'])
+            num_unmatched = len(result['pred_metrics']['unmatched_ORFs'])
+            num_multi = len(result['pred_metrics']['multi_Matched_ORFs'])
+            # Collect summary for this contig
+            contig_summaries.append([dna_region, num_current_genes, num_orfs, num_perfect, num_partial, num_missed, num_unmatched, num_multi])
+            num_current_genes = len(dna_regions[dna_region][2])
+            genome_name = options.reference_annotation.split('/')[-1].split('.')[0]
+            rep_metric_description, rep_metrics = get_rep_metrics(result)
+            all_metric_description, all_metrics = get_all_metrics(result)
+             # Safely extract metric values
+            num_orfs = result.get('pred_metrics', {}).get('Number_of_ORFs') if isinstance(result, dict) else 'N/A'
+            perfect = result.get('pred_metrics', {}).get('Number_of_Perfect_Matches') if isinstance(result, dict) else 0
+            partial = len(result.get('pred_metrics', {}).get('partial_Hits', [])) if isinstance(result, dict) else 'N/A'
+            missed = len(result.get('rep_metrics', {}).get('genes_Undetected', [])) if isinstance(result, dict) else 'N/A'
+            unmatched = len(result.get('pred_metrics', {}).get('unmatched_ORFs', [])) if isinstance(result, dict) else 'N/A'
+            multi = len(result.get('pred_metrics', {}).get('multi_Matched_ORFs', [])) if isinstance(result, dict) else 'N/A'
+            lines = [
+                f"These are the results for: {dna_region}",
+                f"Current Contig: {dna_region}",
+                f"Number of Genes: {num_current_genes}",
+                f"Number of ORFs: {num_orfs}",
+                f"Perfect Matches: {perfect} [{num_current_genes}] - {_pct(perfect, num_current_genes) if isinstance(num_current_genes, (int, float)) else 'N/A'}",
+                f"Partial Matches: {partial} [{num_current_genes}] - {_pct(partial, num_current_genes) if isinstance(num_current_genes, (int, float)) else 'N/A'}",
+                f"Missed Genes: {missed} [{num_current_genes}] - {_pct(missed, num_current_genes) if isinstance(num_current_genes, (int, float)) else 'N/A'}",
+                f"Unmatched ORFs: {unmatched} [{num_current_genes}] - {_pct(unmatched, num_current_genes) if isinstance(num_current_genes, (int, float)) else 'N/A'}",
+                f"Multi-matched ORFs: {multi} [{num_current_genes}] - {_pct(multi, num_current_genes) if isinstance(num_current_genes, (int, float)) else 'N/A'}"
+            ]
+            full_msg = '\n'.join(lines) + '\n'
+            if options.verbose:
+                print(full_msg)
+            options.output_logger.info(full_msg)
+            # print("These are the results for: " + dna_region + '\n')
+            # print('Current Contig: ' + str(dna_region))
+            # print('Number of Genes: ' + str(num_current_genes))
+            # print('Number of ORFs: ' + str(result['pred_metrics']['Number_of_ORFs']))
+            # print('Perfect Matches: ' + str(result['pred_metrics']['Number_of_Perfect_Matches']) + ' [' + str(num_current_genes)+ '] - '+ format(100 * result['pred_metrics']['Number_of_Perfect_Matches']/num_current_genes,'.2f')+'%')
+            # print('Partial Matches: ' + str(len(result['pred_metrics']['partial_Hits'])) + ' [' + str(num_current_genes)+ '] - '+ format(100 * len(result['pred_metrics']['partial_Hits'])/num_current_genes,'.2f')+'%')
+            # print('Missed Genes: ' + str(len(result['rep_metrics']['genes_Undetected'])) + ' [' + str(num_current_genes)+ '] - '+ format(100 * len(result['rep_metrics']['genes_Undetected'])/num_current_genes,'.2f')+'%')
+            # print('Unmatched ORFs: ' + str(len(result['pred_metrics']['unmatched_ORFs'])) + ' [' + str(num_current_genes)+ '] - '+ format(100 * len(result['pred_metrics']['unmatched_ORFs'])/num_current_genes,'.2f')+'%')
+            # print('Multi-matched ORFs: ' + str(len(result['pred_metrics']['multi_Matched_ORFs'])) + ' [' + str(num_current_genes)+ '] - '+ format(100 * len(result['pred_metrics']['multi_Matched_ORFs'])/num_current_genes,'.2f')+'%')
             # Prepare output directory and file names for each contig
             contig_save = dna_region.replace('/', '_').replace('\\', '_')
             contig_dir = os.path.join(options.outdir, contig_save)
@@ -210,6 +244,11 @@ def comparator(options):
                     key_parts = key.split(',')
                     multi = f">Predicted_CDS:{key_parts[0]}-{key_parts[1]}_Genes:{'|'.join(value)}"
                     f.write(f"{multi}\n")
+        else:
+            if options.verbose:
+                print(f"No results to process for dna region - " + str(dna_region))
+            options.output_logger.info(f"No results to process for dna region - " + str(dna_region))
     # After all contigs, append the summary table to the main summary file
     if options.outdir and contig_summaries:
@@ -238,23 +277,21 @@ def comparator(options):
             out_file.write(
                 f'Multi-matched ORFs: {total_multi} [{total_genes}] - {format(100 * total_multi / total_genes, ".2f")}%\n')
-            # Print combined metrics to stdout
-            print("\nCombined metrics for all contigs:")
-            print(f'Number of Genes: {total_genes}')
-            print(f'Number of ORFs: {total_orfs}')
-            print(
-                f'Perfect Matches: {total_perfect} [{total_genes}] - {format(100 * total_perfect / total_genes, ".2f")}%')
-            print(
-                f'Partial Matches: {total_partial} [{total_genes}] - {format(100 * total_partial / total_genes, ".2f")}%')
-            print(f'Missed Genes: {total_missed} [{total_genes}] - {format(100 * total_missed / total_genes, ".2f")}%')
-            print(
-                f'Unmatched ORFs: {total_unmatched} [{total_genes}] - {format(100 * total_unmatched / total_genes, ".2f")}%')
-            print(
-                f'Multi-matched ORFs: {total_multi} [{total_genes}] - {format(100 * total_multi / total_genes, ".2f")}%')
+            lines = [
+                f"Combined metrics for all contigs:",
+                f"Number of Genes: {total_genes}",
+                f"Number of ORFs: {total_orfs}",
+                f"Perfect Matches: {total_perfect} [{total_genes}] - {format(100 * total_perfect / total_genes, ".2f")}%",
+                f"Partial Matches: {total_partial} [{total_genes}] - {format(100 * total_partial / total_genes, ".2f")}%",
+                f"Missed Genes: {total_missed} [{total_genes}] - {format(100 * total_missed / total_genes, ".2f")}%",
+                f"Unmatched ORFs: {total_unmatched} [{total_genes}] - {format(100 * total_unmatched / total_genes, ".2f")}%",
+                f"Multi-matched ORFs: {total_multi} [{total_genes}] - {format(100 * total_multi / total_genes, ".2f")}%"
+            ]
+            full_msg = '\n'.join(lines) + '\n'
+            if options.verbose:
+                print(full_msg)
+            options.output_logger.info(full_msg)
 def main():
@@ -282,18 +319,32 @@ def main():
                                '- Provide tool name to compare output from two tools')
     output = parser.add_argument_group('Output')
-    output.add_argument('-o', dest='outdir', required=False,
-                        help='Define directory where detailed output should be places - If not provided, summary will be printed to std-out')
+    output.add_argument('-o', dest='outdir', required=True,
+                        help='Define directory where detailed output should be places')
     output.add_argument('-n', dest='outname', required=False,
-                        help='Define output file name - Mandatory is -o is provided: <outname>_<contig_id>_ORF_Comparison.csv')
+                        help='Define output filename(s) prefix - If not provided, filename of reference '
+                             'annotation file will be used- <outname>_<contig_id>_ORF_Comparison.csv')
     misc = parser.add_argument_group('Misc')
     misc.add_argument('-v', dest='verbose', default='False', type=eval, choices=[True, False],
                       help='Default - False: Print out runtime status')
     options = parser.parse_args()
-    if options.outdir and not options.outname:
-        sys.exit("Error: If -o (outdir) is provided, you must also provide -n (outname).")
+    options.outname = options.outname if options.outname else options.reference_annotation.split('/')[-1].split('.')[0]
+    # Initialise loggers once and store on options
+    if not getattr(options, 'logger_initialized', False):
+        os.makedirs(options.outdir, exist_ok=True)
+        output_log = os.path.join(options.outdir, f"ORForise_{options.outname}_{datetime.now().strftime('%Y%m%d_%H%M%S')}.log")
+        logger = logging.getLogger('ORForise.output')
+        logger.setLevel(logging.INFO)
+        fh_out = logging.FileHandler(output_log, encoding='utf-8')
+        fh_out.setFormatter(logging.Formatter('%(message)s'))
+        logger.addHandler(fh_out)
+        options.output_logger = logger
+        options.logger_initialized = True
     comparator(options)
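
The reporting pattern introduced in this file (build the report lines once, print them only when verbose, and always write them to a per-run log file under the output directory) can be sketched in isolation as below. This is a minimal illustration under stated assumptions, not the package's code: `pct`, `make_output_logger`, `report`, and the logger name `example.output` are hypothetical names, and a temporary directory stands in for the `-o` directory.

```python
import logging
import os
import tempfile
from datetime import datetime


def pct(n, total):
    """Return n/total as a percentage string, or 'N/A' if it cannot be computed."""
    try:
        return format(100 * n / total, '.2f') + '%'
    except (ZeroDivisionError, TypeError):
        return 'N/A'


def make_output_logger(outdir, outname):
    """Create a message-only file logger inside outdir (hypothetical helper).

    Intended to be called once per run; calling it again would attach a
    second file handler to the same named logger.
    """
    os.makedirs(outdir, exist_ok=True)
    stamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    log_path = os.path.join(outdir, f"ORForise_{outname}_{stamp}.log")
    logger = logging.getLogger('example.output')
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path, encoding='utf-8')
    handler.setFormatter(logging.Formatter('%(message)s'))
    logger.addHandler(handler)
    return logger, log_path


def report(logger, lines, verbose=False):
    """Join the lines, print them if verbose, and always write them to the log."""
    msg = '\n'.join(lines) + '\n'
    if verbose:
        print(msg)
    logger.info(msg)
    return msg


if __name__ == '__main__':
    out_logger, path = make_output_logger(tempfile.mkdtemp(), 'demo')
    report(out_logger, [f"Perfect Matches: 3 [4] - {pct(3, 4)}"], verbose=True)
    print('Log written to', path)
```

Catching only `ZeroDivisionError` and `TypeError` in `pct` is a deliberate narrowing for this sketch; the diff above catches `Exception`.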

{orforise-1.5.0 → orforise-1.5.1}/src/ORForise/Comparator.py RENAMED

@@ -206,33 +206,53 @@ def start_Codon_Count(orfs):
         else:
             other += 1
             other_Starts.append(codon)
-    atg_P = format(100 * atg / len(orfs), '.2f')
-    gtg_P = format(100 * gtg / len(orfs), '.2f')
-    ttg_P = format(100 * ttg / len(orfs), '.2f')
-    att_P = format(100 * att / len(orfs), '.2f')
-    ctg_P = format(100 * ctg / len(orfs), '.2f')
-    other_Start_P = format(100 * other / len(orfs), '.2f')
-    return atg_P, gtg_P, ttg_P, att_P, ctg_P, other_Start_P, other_Starts
+    total = len(orfs) if orfs is not None else 0
+    if total:
+        atg_P = format(100 * atg / len(orfs), '.2f')
+        gtg_P = format(100 * gtg / len(orfs), '.2f')
+        ttg_P = format(100 * ttg / len(orfs), '.2f')
+        att_P = format(100 * att / len(orfs), '.2f')
+        ctg_P = format(100 * ctg / len(orfs), '.2f')
+        other_Start_P = format(100 * other / len(orfs), '.2f')
+    else:
+        atg_P = ttg_P = gtg_P = ctg_P = att_P = other_Start_P = format(0, '.2f')
+    return {
+        'ATG': (atg, atg_P),
+        'TTG': (ttg, ttg_P),
+        'GTG': (gtg, gtg_P),
+        'CTG': (ctg, ctg_P),
+        'ATT': (att, att_P),
+        'Other': (other, other_Start_P),
+        'total': total
+    }
 def stop_Codon_Count(orfs):
     tag, taa, tga, other = 0, 0, 0, 0
     other_Stops = []
-    for orf in orfs.values():
-        codon = orf[2]
-        if codon == 'TAG':
-            tag += 1
-        elif codon == 'TAA':
-            taa += 1
-        elif codon == 'TGA':
-            tga += 1
-        else:
-            other += 1
-            other_Stops.append(codon)
-    tag_p = format(100 * tag / len(orfs), '.2f')
-    taa_p = format(100 * taa / len(orfs), '.2f')
-    tga_p = format(100 * tga / len(orfs), '.2f')
-    other_Stop_P = format(100 * other / len(orfs), '.2f')
+    total = len(orfs) if orfs else 0
+    if total:
+        for orf in orfs.values():
+            codon = orf[2]
+            if codon == 'TAG':
+                tag += 1
+            elif codon == 'TAA':
+                taa += 1
+            elif codon == 'TGA':
+                tga += 1
+            else:
+                other += 1
+                other_Stops.append(codon)
+        tag_p = format(100 * tag / len(orfs), '.2f')
+        taa_p = format(100 * taa / len(orfs), '.2f')
+        tga_p = format(100 * tga / len(orfs), '.2f')
+        other_Stop_P = format(100 * other / len(orfs), '.2f')
+    else:
+        tag_p = taa_p = tga_p = other_Stop_P = format(0, '.2f')
     return tag_p, taa_p, tga_p, other_Stop_P, other_Stops
@@ -260,8 +280,8 @@ def candidate_ORF_Selection(gene_Set,
             if len(current_ORF_Difference) > len(candidate_ORF_Difference):
                 pos = c_Pos
                 orf_Details = c_ORF_Details
-        else:
-            print("Match filtered out")
+        #else:
+            #("Match filtered out")
     return pos, orf_Details
@@ -300,6 +320,11 @@ def tool_comparison(all_orfs, dna_regions, verbose):
         ref_genes_list = dna_regions[dna_region][2]
         ref_genes = collections.OrderedDict()
+        if not ref_genes_list:
+            results[dna_region] = {}
+            continue
         for d in ref_genes_list:
             ref_genes.update(d)
         comp.genome_Seq = dna_regions[dna_region][0]
@@ -311,6 +336,10 @@ def tool_comparison(all_orfs, dna_regions, verbose):
         better_pos_orfs_items = [[(int(pos.split(',')[0]), int(pos.split(',')[1])), orf_Details] for pos, orf_Details in current_orfs.items()] #TODO: turn pos into tuple instead of string everywhere
+        if not current_orfs or not better_pos_orfs_items:
+            results[dna_region] = {}
+            continue
         for gene_num, gene_details in ref_genes.items():  # Loop through each gene to compare against predicted ORFs
             g_Start = int(gene_details[0])
             g_Stop = int(gene_details[1])
@@ -477,10 +506,13 @@ def tool_comparison(all_orfs, dna_regions, verbose):
                 comp.gene_Pos_Olap.append(0)
             elif '-' in g_Strand:
                 comp.gene_Neg_Olap.append(0)
-        ####
-        min_Gene_Length = min(comp.gene_Lengths)
-        max_Gene_Length = max(comp.gene_Lengths)
-        median_Gene_Length = np.median(comp.gene_Lengths)
+        #### avoid ValueError
+        if comp.gene_Lengths:
+            min_Gene_Length = min(comp.gene_Lengths)
+            max_Gene_Length = max(comp.gene_Lengths)
+            median_Gene_Length = np.median(comp.gene_Lengths)
+        else:
+            min_Gene_Length = max_Gene_Length = min_Length_Difference = 0
         prev_ORF_Stop = 0
         prev_ORF_Overlapped = False
         for o_Positions, orf_Details in current_orfs.items():
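
The empty-input guards added to the codon counters in this file follow a common pattern: count first, then divide only when the total is non-zero. A minimal sketch of that pattern, assuming ORFs are held in a dict whose values carry the stop codon at index 2 (as `orf[2]` in the diff suggests); `stop_codon_percentages` is a hypothetical name, not a function in the package:

```python
from collections import Counter


def stop_codon_percentages(orfs):
    """Return {codon: 'xx.xx'} for TAG/TAA/TGA/Other.

    All values are '0.00' when orfs is empty or None, so no division
    by zero can occur.
    """
    known = ('TAG', 'TAA', 'TGA')
    counts = Counter()
    for orf in (orfs or {}).values():
        codon = orf[2]  # assumed position of the stop codon in each record
        counts[codon if codon in known else 'Other'] += 1
    total = sum(counts.values())
    return {
        key: format(100 * counts[key] / total, '.2f') if total else format(0, '.2f')
        for key in known + ('Other',)
    }


if __name__ == '__main__':
    print(stop_codon_percentages({}))
    print(stop_codon_percentages({'1,10': ('+', 0, 'TAA'), '20,40': ('-', 1, 'TGA')}))
```

Returning a dict keyed by codon keeps the empty and non-empty cases the same shape, which mirrors the direction `start_Codon_Count` takes in this release.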

{orforise-1.5.0 → orforise-1.5.1}/src/ORForise/utils.py RENAMED

@@ -4,7 +4,7 @@ import collections
 # Constants
 SHORT_ORF_LENGTH = 300
 MIN_COVERAGE = 75
-ORForise_Version = 'v1.5.0'
+ORForise_Version = 'v1.5.1'
 def revCompIterative(watson):  # Gets Reverse Complement

{orforise-1.5.0 → orforise-1.5.1}/src/ORForise.egg-info/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: ORForise
-Version: 1.5.0
+Version: 1.5.1
 Summary: ORForise - Platform for analysing and comparing Prokaryote CoDing Sequence (CDS) Gene Predictions.
 Home-page: https://github.com/NickJD/ORForise
 Author: Nicholas Dimonaco
@@ -57,13 +57,7 @@ Example output files from ```Annotation-Compare```, ```GFF-Adder``` and ```GFF-I
 For Help: ```Annotation-Compare -h ```
 ```python
-Thank you for using ORForise
-Please report any issues to: https://github.com/NickJD/ORForise/issues
-#####
-usage: Annotation_Compare.py [-h] -dna GENOME_DNA -ref REFERENCE_ANNOTATION -t TOOL -tp TOOL_PREDICTION
-                             [-rt REFERENCE_TOOL] [-o OUTNAME] [-v {True,False}]
-ORForise v1.5.0: Annotatione-Compare Run Parameters.
+ORForise v1.5.1: Annotatione-Compare Run Parameters.
 Required Arguments:
   -dna GENOME_DNA       Genome DNA file (.fa) which both annotations are based on
@@ -78,8 +72,8 @@ Optional Arguments:
                         name to compare output from two tools
 Output:
-  -o OUTNAME            Define full output filename (format is CSV) - If not provided, summary will be printed to
-                        std-out
+  -o OUTDIR             Define directory where detailed output should be places
+  -n OUTNAME            Define output filename(s) prefix - If not provided, filename of reference annotation file will be used- <outname>_<contig_id>_ORF_Comparison.csv
 Misc:
   -v {True,False}       Default - False: Print out runtime status
@@ -107,13 +101,7 @@ ORForise can be used as the example below.
 For Help: ```Aggregate-Compare -h ```
 ```python
-Thank you for using ORForise
-Please report any issues to: https://github.com/NickJD/ORForise/issues
-#####
-usage: Aggregate_Compare.py [-h] -dna GENOME_DNA -t TOOLS -tp TOOL_PREDICTIONS -ref REFERENCE_ANNOTATION
-                            [-rt REFERENCE_TOOL] [-o OUTNAME] [-v {True,False}]
-ORForise v1.5.0: Aggregate-Compare Run Parameters.
+ORForise v1.5.1: Aggregate-Compare Run Parameters.
 Required Arguments:
   -dna GENOME_DNA       Genome DNA file (.fa) which both annotations are based on
@@ -261,13 +249,7 @@ The ```-gi``` option can be used to allow for different genomic elements to be a
 For Help: ```GFF-Adder -h ```
 ```python
-Thank you for using ORForise
-Please report any issues to: https://github.com/NickJD/ORForise/issues
-#####
-usage: GFF_Adder.py [-h] -dna GENOME_DNA -ref REFERENCE_ANNOTATION -at ADDITIONAL_TOOL -add ADDITIONAL_ANNOTATION -o
-                    OUTPUT_FILE [-rt REFERENCE_TOOL] [-gi GENE_IDENT] [-gene_ident GENE_IDENT] [-olap OVERLAP]
-ORForise v1.5.0: GFF-Adder Run Parameters.
+ORForise v1.5.1: GFF-Adder Run Parameters.
 Required Arguments:
   -dna GENOME_DNA       Genome DNA file (.fa) which both annotations are based on
@@ -323,13 +305,7 @@ The ```-gi``` option can be used to allow for different genomic elements to be a
 For Help: ```GFF-Intersector -h ```
 ```python
-Thank you for using ORForise
-Please report any issues to: https://github.com/NickJD/ORForise/issues
-#####
-usage: GFF_Intersector.py [-h] -dna GENOME_DNA -ref REFERENCE_ANNOTATION -at ADDITIONAL_TOOL -add
-                          ADDITIONAL_ANNOTATION -o OUTPUT_FILE [-rt REFERENCE_TOOL] [-gi GENE_IDENT] [-cov COVERAGE]
-ORForise v1.5.0: GFF-Intersector Run Parameters.
+ORForise v1.5.1: GFF-Intersector Run Parameters.
 Required Arguments:
   -dna GENOME_DNA       Genome DNA file (.fa) which both annotations are based on