levseq 1.0.0__py3-none-any.whl

@@ -0,0 +1,180 @@
1
+ Metadata-Version: 2.1
2
+ Name: levseq
3
+ Version: 1.0.0
4
+ Home-page: https://github.com/fhalab/levseq/
5
+ Author: Yueming Long, Emreay Gursoy, Ariane Mora
6
+ Author-email: ylong@caltech.edu
7
+ License: GPL3
8
+ Project-URL: Bug Tracker, https://github.com/fhalab/levseq/
9
+ Project-URL: Documentation, https://github.com/fhalab/levseq/
10
+ Project-URL: Source Code, https://github.com/fhalab/levseq/
11
+ Keywords: Nanopore,ONT,evSeq
12
+ Classifier: Development Status :: 5 - Production/Stable
13
+ Classifier: Intended Audience :: Science/Research
14
+ Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
15
+ Classifier: Natural Language :: English
16
+ Classifier: Operating System :: OS Independent
17
+ Classifier: Programming Language :: Python :: 3.6
18
+ Classifier: Programming Language :: Python :: 3.7
19
+ Classifier: Programming Language :: Python :: 3.8
20
+ Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
21
+ Requires-Python: >=3.8
22
+ Description-Content-Type: text/markdown
23
+ License-File: LICENSE
24
+ Requires-Dist: Bio
25
+ Requires-Dist: biopython
26
+ Requires-Dist: fsspec
27
+ Requires-Dist: h5py
28
+ Requires-Dist: holoviews
29
+ Requires-Dist: jupyterlab
30
+ Requires-Dist: mappy
31
+ Requires-Dist: matplotlib
32
+ Requires-Dist: ninetysix
33
+ Requires-Dist: numpy
34
+ Requires-Dist: pandas
35
+ Requires-Dist: pybedtools
36
+ Requires-Dist: pycoQC
37
+ Requires-Dist: pyfaidx
38
+ Requires-Dist: pyparsing
39
+ Requires-Dist: pysam
40
+ Requires-Dist: scipy
41
+ Requires-Dist: sciutil
42
+ Requires-Dist: seaborn
43
+ Requires-Dist: scikit-learn
44
+ Requires-Dist: statsmodels
45
+ Requires-Dist: tqdm
46
+
47
+ # Variant Sequencing with Nanopore
48
+
49
+ In directed evolution, sequencing every variant enhances data insight and creates datasets suitable for AI/ML methods. This method is presented as an extension of the original Every Variant Sequencer using Illumina technology. With this approach, sequence variants can be generated within a day at an extremely low cost.
50
+
51
+ ![Figure 1: LevSeq Workflow](manuscript/Figures/LevSeq_Figure-1.png)
52
+ Figure 1: Overview of the LevSeq variant sequencing workflow using Nanopore technology. This diagram illustrates the key steps in the process, from sample preparation to data analysis and visualization.
53
+
54
+
55
+ - Data to reproduce the results and to test are available on zenodo [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.13694463.svg)](https://doi.org/10.5281/zenodo.13694463)
56
+
57
+ - A dockerized website and database for labs to locally host and visualize their data: https://github.com/fhalab/LevSeq_VDB/
58
+
59
+ ## Setup
60
+
61
+ For setting up the experimental side of LevSeq we suggest the following preparations:
62
+
63
+ - Order forward and reverse primers compatible with the desired plasmid; see the methods section of [our paper](http://biorxiv.org/cgi/content/short/2024.09.04.611255v1?rss=1).
64
+ - Install Oxford Nanopore's software (only needed if you are doing basecalling/MinION processing). [Link to installation guide](https://nanoporetech.com/).
65
+
66
+ ## How to Use LevSeq
67
+
68
+ The wet lab part is detailed in the method section of the paper.
69
+
70
+ Once samples are prepared, the multiplexed sample is used for sequencing, and the sequencing data is stored in the `../data` folder as per the typical Nanopore flow (refer to Nanopore documentation for this).
71
+
72
+ After sequencing, you can identify variants, demultiplex, and combine with your variant function here! For simple applications, we recommend using the notebook `example/Example.ipynb`.
73
+
74
+ ### Steps of LevSeq:
75
+
76
+ 1. **Basecalling**: This step converts Nanopore's FAST5 files to sequences. For basecalling, we use Nanopore's basecaller, Medaka, which can run in parallel with sequencing (recommended) or afterward.
77
+
78
+ 2. **Demultiplexing**: After sequencing, the reads, stored as bulk FASTQ files, are sorted. During demultiplexing, each read is assigned to its correct plate/well combination and stored as a FASTQ file.
79
+
80
+ 3. **Variant Calling**: For each sample, the consensus sequence is compared to the reference sequence. A variant is called if it differs from the reference sequence. The success of variant calling depends on the number of reads sequenced and their quality.
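To make the variant calling step concrete, here is a minimal sketch of the comparison it describes: a consensus sequence is checked position by position against the reference, and every mismatch is reported. This is an illustration only, not LevSeq's implementation. The function name `call_substitutions` and the `<ref><position><alt>` notation are assumptions, and the sketch assumes the two sequences are already aligned and of equal length (no insertions or deletions).

```python
from typing import List


def call_substitutions(reference: str, consensus: str) -> List[str]:
    """Report substitutions as '<ref base><1-based position><consensus base>'.

    Hypothetical helper for illustration; it only handles pre-aligned,
    equal-length sequences and ignores read depth and base quality.
    """
    if len(reference) != len(consensus):
        raise ValueError("this sketch only handles equal-length, pre-aligned sequences")

    variants = []
    pairs = zip(reference.upper(), consensus.upper())
    for position, (ref_base, alt_base) in enumerate(pairs, start=1):
        if ref_base != alt_base:
            variants.append(f"{ref_base}{position}{alt_base}")
    return variants


if __name__ == "__main__":
    # The consensus differs from the reference at position 4 (G -> A).
    print(call_substitutions("ATGGCC", "ATGACC"))
```

A real caller also has to weigh how many reads support each position and how good they are, which is why the success of this step depends on read count and quality.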
81
+
82
+
83
+ ### Installation:
84
+
85
+ We aimed to make LevSeq as simple to use as possible, so you should be able to install and run it entirely with pip. However, if you have issues, we recommend using the Docker instance!
86
+
87
+ We recommend using a command line interface (Terminal) and a conda environment for installation:
88
+ ```
89
+ git clone https://github.com/fhalab/LevSeq.git
90
+ ```
91
+
92
+ ```
93
+ conda create --name levseq python=3.8
94
+ ```
95
+
96
+ ```
97
+ conda activate levseq
98
+ ```
99
+
100
+ From the LevSeq folder, install the package using pip:
101
+
102
+ ```
103
+ pip install releases/levseq-0.1.0.tar.gz
104
+ ```
105
+ #### Dependencies
106
+
107
+ 1. Samtools: https://www.htslib.org/download/
108
+ ```
109
+ conda install -c bioconda -c conda-forge samtools
110
+ ```
111
+ or for mac users you can use: `brew install samtools`
112
+
113
+ 2. Minimap2: https://github.com/lh3/minimap2
114
+ ```
115
+ conda install -c bioconda -c conda-forge minimap2
116
+ ```
117
+ or for mac users you can use: `brew install minimap2`
118
+ Once all dependencies are installed, you can run LevSeq from the command line.
119
+ 3. GCC
120
+ For Mac M1 users: installation via homebrew
121
+ ```
122
+ brew install gcc
123
+ ```
124
+ For Linux users: installation via conda
125
+ ```
126
+ conda install conda-forge::gcc
127
+ ```
128
+ ### Usage
129
+ #### Command Line Interface
130
+ LevSeq can be run using the command line interface. Here's the basic structure of the command:
131
+
132
+ ```
133
+ levseq <name> <location to data folder> <location of reference csv file>
134
+ ```
135
+ #### Required Arguments
136
+ 1. Name of the experiment, this will be the name of the output folder
137
+ 2. Location of the basecalled FASTQ files; this is the direct output of the MinKNOW software used for sequencing
138
+ 3. Location of the reference CSV file; this file lists the barcodes used for each plate and the reference DNA sequence for each plate
139
+
140
+ #### Optional Arguments
141
+ --skip\_demultiplexing If enabled, the demultiplexing step will be skipped
142
+
143
+ --skip\_variantcalling If enabled, the variant calling step will be skipped
144
+
145
+ --output Save location for the output; if not provided, defaults to the directory where the program is executed
146
+
147
+ --show\_msa Show the multiple sequence alignment for each well
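If you want to launch LevSeq from a Python script, the command structure above can be assembled as an argument list. The sketch below is an illustration under stated assumptions: the helper name `build_levseq_command` is hypothetical, the two `--skip_*` options and `--show_msa` are assumed to be on/off switches that take no value, and `--output` is assumed to take a path. Check `levseq --help` on your installation before relying on it.

```python
import shlex
from typing import List, Optional


def build_levseq_command(
    name: str,
    data_dir: str,
    reference_csv: str,
    output: Optional[str] = None,
    skip_demultiplexing: bool = False,
    skip_variantcalling: bool = False,
    show_msa: bool = False,
) -> List[str]:
    """Assemble the argv list for the `levseq` console script (hypothetical helper)."""
    # The three required positional arguments, in the documented order.
    command = ["levseq", name, data_dir, reference_csv]
    if output is not None:
        # Assumption: --output is followed by a directory path.
        command += ["--output", output]
    # Assumption: these options are plain switches without a value.
    if skip_demultiplexing:
        command.append("--skip_demultiplexing")
    if skip_variantcalling:
        command.append("--skip_variantcalling")
    if show_msa:
        command.append("--show_msa")
    return command


if __name__ == "__main__":
    argv = build_levseq_command("my_run", "data/my_run", "my_run_ref.csv", show_msa=True)
    # Print the shell form; pass `argv` to subprocess.run(argv, check=True) to execute it.
    print(shlex.join(argv))
```

Building a list rather than a single string avoids quoting problems when folder names contain spaces.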
148
+
149
+ ### Docker Installation (Recommended for full pipeline)
150
+ For installing the whole pipeline, you'll need to use the docker image. For this, install docker as required for your
151
+ operating system (https://docs.docker.com/engine/install/).
152
+
153
+
154
+ To build the docker image run (within the main folder that contains the `Dockerfile`):
155
+
156
+ ```
157
+ docker build -t levseq .
158
+ ```
159
+
160
+ This gives access to the LevSeq command line interface via:
161
+
162
+ ```
163
+ docker run levseq
164
+ ```
165
+ Note! The Docker image should work on Linux and Mac; however, different Mac architectures may have issues (owing to the different M1/M3 processors).
166
+
167
+ The `-v` option mounts a folder on your computer that contains the output from the MinION sequencer into the Docker container, which then takes these results and performs
168
+ demultiplexing and variant calling.
169
+
170
+ ```
171
+ docker run -v /Users/XXXX/Documents/LevSeq/data:/levseq_results/ levseq 20240502 levseq_results/20240502/ levseq_results/20240502-YL-ParLQ-ep2.csv
172
+ ```
173
+
174
+ In this command: `/Users/XXXX/Documents/LevSeq/data` is a folder on your computer, which contains a subfolder `20240502`
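The pieces of the `docker run` command above can be put together programmatically, which makes the mapping explicit: the host folder and the container folder are joined with a colon after `-v`, and everything after the image name is handed to LevSeq. This is a minimal sketch; the helper name `build_docker_run` is hypothetical, and the paths are the ones from the example command.

```python
import shlex
from typing import List, Sequence


def build_docker_run(
    host_dir: str,
    container_dir: str,
    image: str,
    levseq_args: Sequence[str],
) -> List[str]:
    """Assemble a `docker run` argv that mounts host_dir at container_dir (hypothetical helper)."""
    # -v takes "<folder on your computer>:<folder inside the container>".
    volume = f"{host_dir}:{container_dir}"
    # Arguments after the image name are passed on to the LevSeq command line.
    return ["docker", "run", "-v", volume, image, *levseq_args]


if __name__ == "__main__":
    argv = build_docker_run(
        host_dir="/Users/XXXX/Documents/LevSeq/data",
        container_dir="/levseq_results/",
        image="levseq",
        levseq_args=[
            "20240502",  # experiment name
            "levseq_results/20240502/",  # data folder, as seen inside the container
            "levseq_results/20240502-YL-ParLQ-ep2.csv",  # reference CSV, as seen inside the container
        ],
    )
    print(shlex.join(argv))
```

Note that the data folder and reference CSV are given as container paths, so they must live inside the mounted host folder.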
175
+
176
+ ### Issues and Troubleshooting
177
+
178
+ If you have any issues, please check the LevSeq\_error.log found in the output directory and report the issue. If the problem persists, please open an issue on the GitHub repository with the error details.
179
+
180
+ If you fix something in the code, submit a pull request! We would love community input.
@@ -0,0 +1,26 @@
1
+ levseq/IO_processor.py,sha256=5wlN2osYDARrlre0V74n142tAfBYEr4z7AFPEklO6yQ,20397
2
+ levseq/__init__.py,sha256=eIsx4_96Omo0TTt6wMRKzr8gnyt_0ECnGsP7RWFYpa0,1775
3
+ levseq/basecaller.py,sha256=OCcBoWAoke4h0ALE-Jl-XKPgWIjilcXVIyP228rFwfQ,3054
4
+ levseq/cmd.py,sha256=MXji4L6w5PchFGRSxz25mc3B767l71d-A3wHBxGASbw,1413
5
+ levseq/globals.py,sha256=2blnORDlq8iFIpWCTlA_0aAaP6eW3NqTHBMlNrrA-tM,3130
6
+ levseq/interface.py,sha256=OT3RxkxfrC1W8--ybrkFcGSaT6aEaU2xJNDAwEKlqqQ,3863
7
+ levseq/parser.py,sha256=BWu_4U4m_s4-_tyRNkPZEbp8wqLIkMZ_XKGx0PxxtLI,3574
8
+ levseq/run_levseq.py,sha256=PWEG615Y2ji1dP403xAYPl8Imb442yjXJa1dh_2L1dE,21031
9
+ levseq/screen.py,sha256=YAG-C6K7CwXaVnvyS4Y8gaPV_DLewRXtTH5SlmrQceQ,1496
10
+ levseq/simulation.py,sha256=Fo3tBgu1_iZJHkaj922aS5SIK5KTi8BYmGh1qnKsbuY,14432
11
+ levseq/user.py,sha256=G9MwG88wHWOysN437JK_hZJFnFD5129rnKbTdu6Qxm8,6748
12
+ levseq/utils.py,sha256=4sQ0nH7AErf_zKkXCQl7eOlJd3RMhVb2S_36wP0vqv0,21406
13
+ levseq/variantcaller.py,sha256=COpxAuMWaltjRmX8uhJ9dJrZm0zVc6OhCcGK4Yezf6s,12235
14
+ levseq/visualization.py,sha256=4OFX2n763HdpjDn4Ra2BYNf9KyfBG4X3QmbBtbfeTdQ,33966
15
+ levseq/barcoding/__init__.py,sha256=eK46E0pOoyct-TVkjeBkeA69ImvY27lX_eWTXof_dNo,35
16
+ levseq/barcoding/demultiplex,sha256=TXqQTJ50mmUVUJw2Zw_lSJXxsIpkGy1e_t1OyNz2km8,434104
17
+ levseq/barcoding/demultiplex-arm64,sha256=wIz0ojOHJlMoLcbhA-McuN4o2swiojP3h5sWxs9pVWI,351096
18
+ levseq/barcoding/demultiplex-x86,sha256=38vk7i-RFBnOSW6HGOCVYI1QcOyjOyVg7ziIObwt3IM,336272
19
+ levseq/barcoding/minion_barcodes.fasta,sha256=a3XV-_WJ0E1--cOWgPFqYoC7FwZ8CIWgzGPrwW2czgs,5954
20
+ levseq-1.0.0.data/data/LICENSE,sha256=OXLcl0T2SZ8Pmy2_dmlvKuetivmyPd5m1q-Gyd-zaYY,35149
21
+ levseq-1.0.0.dist-info/LICENSE,sha256=OXLcl0T2SZ8Pmy2_dmlvKuetivmyPd5m1q-Gyd-zaYY,35149
22
+ levseq-1.0.0.dist-info/METADATA,sha256=AGffeZqDavzd4yTTfJIot1ZTtw9vkF2c4O-AXiQFO1Q,7448
23
+ levseq-1.0.0.dist-info/WHEEL,sha256=GJ7t_kWBFywbagK5eo9IoUwLW6oyOeTKmQ-9iHFVNxQ,92
24
+ levseq-1.0.0.dist-info/entry_points.txt,sha256=TrV6VrNW1nWdlKhbtWMh9J_VLPBlYqPIJWNEQSLQGoo,43
25
+ levseq-1.0.0.dist-info/top_level.txt,sha256=8r2n0hF_yJ5VbAyh6EB405w2rjjGLH6CFhfQa7EVTOE,7
26
+ levseq-1.0.0.dist-info/RECORD,,
@@ -0,0 +1,5 @@
1
+ Wheel-Version: 1.0
2
+ Generator: bdist_wheel (0.43.0)
3
+ Root-Is-Purelib: true
4
+ Tag: py3-none-any
5
+
@@ -0,0 +1,2 @@
1
+ [console_scripts]
2
+ levseq = levseq.cmd:main
@@ -0,0 +1 @@
1
+ levseq