PyPI - toulligqc - Versions diffs - 2.7__tar.gz → 2.7.1__tar.gz - Mend

toulligqc 2.7.tar.gz → 2.7.1.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (41)

{toulligqc-2.7 → toulligqc-2.7.1}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: toulligqc
-Version: 2.7
+Version: 2.7.1
 Summary: A post sequencing QC tool for Oxford Nanopore sequencers
 Home-page: https://github.com/GenomiqueENS/toulligQC
 Author: Genomic Paris Centre team

{toulligqc-2.7 → toulligqc-2.7.1}/README.md RENAMED

@@ -53,18 +53,31 @@ $ cd toulligqc && python3 setup.py build install
 ToulligQC is written with Python 3.
 To run ToulligQC without Docker, you need to install the following Python modules:
-* matplotlib
-* plotly
-* h5py
+ * matplotlib
+ * plotly
+ * h5py
 * pandas
 * numpy
 * scipy
 * scikit-learn
 * pysam
+* tqdm
+* pod5
+<a name="Conda-environemnt"></a>
+### 1.2 Conda environemnt**
+You can use a conda environment to install the required packages:
+```
+git clone https://github.com/GenomicParisCentre/toulligQC.git
+cd toulligqc && python3 setup.py build install
+conda env create -f environment.yml
+conda activate toulliqc
+```
 <a name="pypi-installation"></a>
-### 1.2 Using a PyPi package
+### 1.3 Using a PyPi package
 ToulligQC can be more easlily installed with a pip package availlable on the PyPi repository. The following command line  will install the latest version of ToulligQC:
 ```bash
@@ -72,7 +85,7 @@ $ pip3 install toulligqc
 ```
 <a name="docker"></a>
-### 1.3 Using Docker
+### 1.4 Using Docker
 ToulligQC and its dependencies are available through a Docker image. To install docker on your system, go to the Docker website (<https://docs.docker.com/engine/installation/>).
 Even if Docker can run on Windows or macOS virtual machines, we recommend to run ToulligQC on a Linux host.
 <a name="docker-image-recovery"></a>

{toulligqc-2.7 → toulligqc-2.7.1}/toulligqc/common_statistics.py RENAMED

@@ -18,7 +18,7 @@ def compute_LXX(dataframe_dict, x):
     cum_sum = 0
     count = 0
     for v in data:
-        cum_sum += v
+        cum_sum += int(v)
         count += 1
         if cum_sum >= half_sum:
             return count
@@ -31,7 +31,7 @@ def compute_NXX(dataframe_dict, x):
     half_sum = data.sum() * x / 100
     cum_sum = 0
     for v in data:
-        cum_sum += v
+        cum_sum += int(v)
         if cum_sum >= half_sum:
             return int(v)
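
Reviewer note (not part of the package): the two hunks above only show the accumulation loops, so the following is a minimal sketch of the NXX/LXX idea under stated assumptions. The function names `nxx` and `lxx` are hypothetical, and sorting the lengths in descending order is assumed because the sorting step is not visible in the diff. The hunk does not say why the `int(v)` cast was added; one effect is that the accumulator stays an arbitrary-precision Python `int` whatever the column's dtype, which would matter if the column used a narrow integer type.

```python
def nxx(lengths, x=50):
    """Length of the read at which the running total reaches x% of all bases."""
    # Assumption: longest reads first (the sort is not shown in the diff).
    data = sorted(lengths, reverse=True)
    threshold = sum(int(v) for v in data) * x / 100
    cum_sum = 0
    for v in data:
        cum_sum += int(v)  # plain Python int, so the sum cannot overflow
        if cum_sum >= threshold:
            return int(v)
    return None  # empty input


def lxx(lengths, x=50):
    """Number of reads needed for the running total to reach x% of all bases."""
    data = sorted(lengths, reverse=True)
    threshold = sum(int(v) for v in data) * x / 100
    cum_sum = 0
    for count, v in enumerate(data, start=1):
        cum_sum += int(v)
        if cum_sum >= threshold:
            return count
    return None  # empty input


if __name__ == "__main__":
    reads = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(nxx(reads), lxx(reads))
```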

{toulligqc-2.7 → toulligqc-2.7.1}/toulligqc/extractor_common.py RENAMED

@@ -432,7 +432,11 @@ def add_image_to_result(quiet, image_list, start_time, image):
 def timeISO_to_float(iso_datetime, format):
         """
         """
-        dt = datetime.strptime(iso_datetime, format)
+        try:
+            dt = datetime.strptime(iso_datetime, format)
+        except:
+            format = '%Y-%m-%dT%H:%M:%SZ'
+            dt = datetime.strptime(iso_datetime, format)
         unix_timestamp = dt.timestamp()
         return unix_timestamp
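
Reviewer note (not part of the package): the hunk above retries with a second format string inside a bare `except:`. A possible alternative, sketched here under assumptions, walks a tuple of candidate formats and catches only `ValueError`, which is what `datetime.strptime` raises on a mismatch. The name `iso_to_unix` and the `FORMATS` tuple are hypothetical. The sketch also treats a literal trailing `Z` as UTC, which differs from the hunk, where a naive datetime is converted using the local time zone.

```python
from datetime import datetime, timezone

# Candidate formats, tried in order (both strings appear in the diff).
FORMATS = ('%Y-%m-%dT%H:%M:%S.%f%z', '%Y-%m-%dT%H:%M:%SZ')


def iso_to_unix(iso_datetime, formats=FORMATS):
    """Parse an ISO-like timestamp with the first matching format."""
    for fmt in formats:
        try:
            dt = datetime.strptime(iso_datetime, fmt)
        except ValueError:
            continue  # this format does not match, try the next one
        if dt.tzinfo is None:
            # Assumption of this sketch: a naive result means the literal 'Z'
            # format matched, so interpret it as UTC.
            dt = dt.replace(tzinfo=timezone.utc)
        return dt.timestamp()
    raise ValueError("unsupported timestamp: {!r}".format(iso_datetime))


if __name__ == "__main__":
    print(iso_to_unix("2023-05-01T12:00:00Z"))
    print(iso_to_unix("2023-05-01T12:00:00.250000+02:00"))
```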

{toulligqc-2.7 → toulligqc-2.7.1}/toulligqc/fastq_extractor.py RENAMED

@@ -119,32 +119,45 @@ class fastqExtractor:
         add_image_to_result(self.quiet, images, time.time(), pgg.read_count_histogram(result_dict, self.images_directory))
         add_image_to_result(self.quiet, images, time.time(), pgg.read_length_scatterplot(self.dataframe_dict, self.images_directory))
         if self.rich:
             add_image_to_result(self.quiet, images, time.time(), pgg.yield_plot(self.dataframe_1d, self.images_directory))
         add_image_to_result(self.quiet, images, time.time(), pgg.read_quality_multiboxplot(self.dataframe_dict, self.images_directory))
         add_image_to_result(self.quiet, images, time.time(), pgg.allphred_score_frequency(self.dataframe_dict, self.images_directory))
         if self.rich:
             add_image_to_result(self.quiet, images, time.time(), pgg.plot_performance(self.dataframe_1d, self.images_directory))
         add_image_to_result(self.quiet, images, time.time(), pgg.twod_density(self.dataframe_dict, self.images_directory))
         if self.rich:
             add_image_to_result(self.quiet, images, time.time(), pgg.sequence_length_over_time(self.dataframe_dict, self.images_directory))
             add_image_to_result(self.quiet, images, time.time(), pgg.phred_score_over_time(self.dataframe_dict, result_dict, self.images_directory))
-        if self.is_barcode:
-            add_image_to_result(self.quiet, images, time.time(), pgg.barcode_percentage_pie_chart_pass(self.dataframe_dict,
-                                                                                                       self.barcode_selection,
-                                                                                                       self.images_directory))
-            read_fail = self.dataframe_dict["read.fail.barcoded"]
-            if not (len(read_fail) == 1 and read_fail["other barcodes"] == 0):
-                add_image_to_result(self.quiet, images, time.time(), pgg.barcode_percentage_pie_chart_fail(self.dataframe_dict,
-                                                                                                      self.barcode_selection,
-                                                                                                      self.images_directory))
-            add_image_to_result(self.quiet, images, time.time(), pgg.barcode_length_boxplot(self.dataframe_dict,
-                                                                                            self.images_directory))
-            add_image_to_result(self.quiet, images, time.time(), pgg.barcoded_phred_score_frequency(self.dataframe_dict,
-                                                                                                    self.images_directory))
+            if self.is_barcode:
+                if "barcode_alias" in self.config_dictionary:
+                    barcode_alias = self.config_dictionary['barcode_alias']
+                else:
+                    barcode_alias = None
+                add_image_to_result(self.quiet, images, time.time(), pgg.barcode_percentage_pie_chart_pass(self.dataframe_dict,
+                                                                                                        self.barcode_selection,
+                                                                                                        self.images_directory,
+                                                                                                        barcode_alias))
+                read_fail = self.dataframe_dict["read.fail.barcoded"]
+                if not (len(read_fail) == 1 and read_fail["other barcodes"] == 0):
+                    add_image_to_result(self.quiet, images, time.time(), pgg.barcode_percentage_pie_chart_fail(self.dataframe_dict,
+                                                                                                        self.barcode_selection,
+                                                                                                        self.images_directory,
+                                                                                                        barcode_alias))
+                add_image_to_result(self.quiet, images, time.time(), pgg.barcode_length_boxplot(self.dataframe_dict,
+                                                                                                self.images_directory,
+                                                                                                barcode_alias))
+                add_image_to_result(self.quiet, images, time.time(), pgg.barcoded_phred_score_frequency(self.dataframe_dict,
+                                                                                                        self.images_directory,
+                                                                                                        barcode_alias))
         return images
@@ -211,7 +224,7 @@ class fastqExtractor:
                       "pass.reads.sequence.length")
         describe_dict(self, result_dict, self.dataframe_dict["fail.reads.sequence.length"],
                       "fail.reads.sequence.length")
-        if self.is_barcode:
+        if self.rich and self.is_barcode:
             extract_barcode_info(self, result_dict,
                                  self.barcode_selection,
                                  self.dataframe_dict,
@@ -258,8 +271,9 @@ class fastqExtractor:
         columns = ['sequence_length', 'mean_qscore', 'passes_filtering']
         if self.rich:
             columns.extend(['start_time', 'channel'])
-        if self.is_barcode:
-            columns.append('barcode_arrangement')
+            if self.is_barcode:
+                columns.append('barcode_arrangement')
         fq_df = pd.DataFrame(fq_df, columns=columns)
@@ -271,8 +285,10 @@ class fastqExtractor:
             fq_df["start_time"] = fq_df["start_time"] - fq_df["start_time"].min()
             fq_df['start_time'] = fq_df['start_time'].astype(np.float64)
             fq_df['channel'] = fq_df['channel'].astype(np.int16)
-        if self.is_barcode:
-            fq_df['barcode_arrangement'] = fq_df['barcode_arrangement'].astype("category")
+            if self.is_barcode:
+                fq_df['barcode_arrangement'] = fq_df['barcode_arrangement'].astype("category")
         return fq_df
@@ -346,8 +362,11 @@ class fastqExtractor:
             self.is_barcode = False
         if 'model_version_id' not in metadata:
                 metadata['model_version_id'] = 'Unknow'
+        run_info = []
         try:
-            return metadata['runid'] , metadata['sampleid'] , metadata['model_version_id']
+           sample_id = 'sample_id' if 'sample_id' in metadata else 'sampleid'
+           run_id = 'run_id' if 'run_id' in metadata else 'runid'
+           return metadata[run_id] , metadata[sample_id] , metadata['model_version_id']
         except:
             return None
@@ -356,7 +375,7 @@ class fastqExtractor:
         """
         """
         metadata = dict(x.split("=") for x in name.split(" ")[1:])
-        start_time = timeISO_to_float(metadata['start_time'], '%Y-%m-%dT%H:%M:%SZ')
+        start_time = timeISO_to_float(metadata['start_time'],  '%Y-%m-%dT%H:%M:%S.%f%z')
         if self.is_barcode:
             return  start_time, metadata['ch'], metadata['barcode']
         return  start_time, metadata['ch']
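
Reviewer note (not part of the package): the last two hunks above parse `key=value` fields from a FASTQ header and accept either `run_id`/`sample_id` or the older `runid`/`sampleid` spellings. The sketch below illustrates that lookup under assumptions: `parse_header`, `first_present` and `run_info` are hypothetical names, the example header is invented, and splitting on the first `=` only is a defensive choice of this sketch rather than what the hunk does.

```python
def parse_header(name):
    """Turn 'id key1=v1 key2=v2' into {'key1': 'v1', 'key2': 'v2'}."""
    fields = name.split(" ")[1:]  # skip the read identifier
    # Split on the first '=' only, so values containing '=' stay intact.
    return dict(f.split("=", 1) for f in fields if "=" in f)


def first_present(metadata, *keys, default=None):
    """Return the value of the first key found in metadata."""
    for key in keys:
        if key in metadata:
            return metadata[key]
    return default


def run_info(name):
    """Return (run id, sample id, model version) from a header line."""
    metadata = parse_header(name)
    return (
        first_present(metadata, "run_id", "runid"),
        first_present(metadata, "sample_id", "sampleid"),
        # The package stores the literal 'Unknow'; this sketch uses 'Unknown'.
        metadata.get("model_version_id", "Unknown"),
    )


if __name__ == "__main__":
    # Invented header, for illustration only.
    print(run_info("@read1 runid=abc sample_id=s1 ch=12"))
```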

{toulligqc-2.7 → toulligqc-2.7.1}/toulligqc/toulligqc.py RENAMED

@@ -352,17 +352,25 @@ def main():
         sys.exit("ERROR: dico_path is empty")
     # Get barcode selection
+    allowed_patterns = r'(BC|RB|NB|BP|BARCODE)(\d{2})'
     if config_dictionary['barcoding'].lower() == 'true':
         config_dictionary['barcode_selection'] = []
-        if 'barcodes' in config_dictionary:
+        if 'samplesheet' in config_dictionary:
+            samplesheet = parse_samplesheet(config_dictionary['samplesheet'])
+            config_dictionary['barcodes'] = ",".join(list(samplesheet['barcode']))
+            config_dictionary['barcode_alias'] = pd.Series(samplesheet.alias.values,
+                                                           index=samplesheet.barcode).to_dict()
+        if 'barcodes' in config_dictionary or 'samplesheet' in config_dictionary:
             barcode_set = set()
             if ":" in config_dictionary['barcodes']:
                 start, end  = config_dictionary['barcodes'].strip().split(':')
-                pattern = re.search(r'(BC|RB|NB|BP|BARCODE)(\d{2})', start.strip().upper())
+                pattern = re.search(allowed_patterns, start.strip().upper())
                 if pattern:
                     start_number = int(pattern.group(2))
-                pattern = re.search(r'(BC|RB|NB|BP|BARCODE)(\d{2})', end.strip().upper())
+                pattern = re.search(allowed_patterns, end.strip().upper())
                 if pattern:
                     end_number = int(pattern.group(2))
                 for i in range(start_number, end_number + 1):
@@ -371,13 +379,15 @@ def main():
             else:
                 for b in config_dictionary['barcodes'].strip().split(','):
-                    pattern = re.search(r'(BC|RB|NB|BP|BARCODE)(\d{2})', b.strip().upper())
+                    pattern = re.search(allowed_patterns, b.strip().upper())
                     if pattern:
                         barcode = 'barcode{}'.format(pattern.group(2))
                         barcode_set.add(barcode)
                     else:
                         sys.stderr.write("\033[93mWarning:\033[0m Barcode '{}' is non-standard custom arrangement.\n".format(b))
                         barcode_set.add(b)
+                    if 'samplesheet' in config_dictionary:
+                        config_dictionary['barcode_alias'][barcode] = config_dictionary['barcode_alias'].pop(b)
             barcode_selection = sorted(barcode_set)
@@ -385,12 +395,6 @@ def main():
                 sys.exit("ERROR: No known barcode found in provided list of barcodes")
             config_dictionary['barcode_selection'] = barcode_selection
-        elif 'samplesheet' in config_dictionary:
-            samplesheet = parse_samplesheet(config_dictionary['samplesheet'])
-            config_dictionary['barcode_selection'] = list(samplesheet['barcode'])
-            config_dictionary['barcode_alias'] = pd.Series(samplesheet.alias.values,
-                                                           index=samplesheet.barcode).to_dict()
     else:
         config_dictionary['barcode_selection'] = ''
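
Reviewer note (not part of the package): the hunks above normalise barcode names with the `allowed_patterns` regular expression and, when a samplesheet is given, re-key the alias dictionary to the normalised names. As written, the re-keying line appears to run even when the pattern did not match, in which case `barcode` would hold the value from an earlier iteration or be undefined. The sketch below shows the same normalisation with the re-keying done per name. The function names are hypothetical, and the zero-padded `barcode{:02d}` form for ranges is an assumption because that line is elided from the diff.

```python
import re

# Same pattern as allowed_patterns in the diff.
ALLOWED = re.compile(r'(BC|RB|NB|BP|BARCODE)(\d{2})')


def normalize_barcode(name):
    """Map e.g. 'BC01' or 'nb01' to 'barcode01'; keep custom names as given."""
    match = ALLOWED.search(name.strip().upper())
    if match:
        return 'barcode{}'.format(match.group(2))
    return name.strip()  # non-standard arrangement (stripped in this sketch)


def expand_selection(spec):
    """Expand 'BC01:BC03' or 'BC01,BC02' into a sorted list of names."""
    spec = spec.strip()
    if ':' in spec:
        start, end = spec.split(':')
        m_start = ALLOWED.search(start.strip().upper())
        m_end = ALLOWED.search(end.strip().upper())
        if not (m_start and m_end):
            raise ValueError("range bounds must be standard barcodes")
        first, last = int(m_start.group(2)), int(m_end.group(2))
        # Assumption: two-digit zero padding for generated names.
        return ['barcode{:02d}'.format(i) for i in range(first, last + 1)]
    return sorted({normalize_barcode(b) for b in spec.split(',')})


def rekey_aliases(aliases):
    """Re-key a {barcode: alias} dict to the normalised barcode names."""
    return {normalize_barcode(k): v for k, v in aliases.items()}


if __name__ == "__main__":
    print(expand_selection("BC01:BC03"))
    print(rekey_aliases({"BC01": "control", "custom": "sampleA"}))
```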

toulligqc-2.7.1/toulligqc/version.py ADDED

@@ -0,0 +1 @@
+__version__ = '2.7.1'

{toulligqc-2.7 → toulligqc-2.7.1}/toulligqc.egg-info/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: toulligqc
-Version: 2.7
+Version: 2.7.1
 Summary: A post sequencing QC tool for Oxford Nanopore sequencers
 Home-page: https://github.com/GenomiqueENS/toulligQC
 Author: Genomic Paris Centre team

toulligqc-2.7/toulligqc/version.py DELETED

@@ -1 +0,0 @@
-__version__ = '2.7'

{toulligqc-2.7 → toulligqc-2.7.1}/ RENAMED, file without changes (32 files):

AUTHORS
LICENSE-CeCILL.txt
LICENSE.txt
MANIFEST.in
setup.cfg
setup.py
test/test_sequencing_summary_extractor.py
toulligqc/__init__.py
toulligqc/bam_extractor.py
toulligqc/common.py
toulligqc/configuration.py
toulligqc/fast5_extractor.py
toulligqc/fastq_bam_common.py
toulligqc/html_report_generator.py
toulligqc/plotly_graph_common.py
toulligqc/plotly_graph_generator.py
toulligqc/plotly_graph_onedsquare_generator.py
toulligqc/pod5_extractor.py
toulligqc/report_data_file_generator.py
toulligqc/resources/plotly-latest.min.js
toulligqc/resources/toulligqc.css
toulligqc/resources/toulligqc.png
toulligqc/sequencing_summary_extractor.py
toulligqc/sequencing_summary_onedsquare_extractor.py
toulligqc/sequencing_telemetry_extractor.py
toulligqc/toulligqc_info_extractor.py
toulligqc.egg-info/SOURCES.txt
toulligqc.egg-info/dependency_links.txt
toulligqc.egg-info/entry_points.txt
toulligqc.egg-info/not-zip-safe
toulligqc.egg-info/requires.txt
toulligqc.egg-info/top_level.txt
