bioroebe 0.10.80 → 0.11.24
Potentially problematic release.
This version of bioroebe might be problematic.
- checksums.yaml +4 -4
- data/README.md +1204 -772
- data/bioroebe.gemspec +3 -3
- data/doc/README.gen +1203 -771
- data/doc/todo/bioroebe_todo.md +391 -365
- data/lib/bioroebe/aminoacids/aminoacid_substitution.rb +1 -9
- data/lib/bioroebe/aminoacids/codon_percentage.rb +1 -9
- data/lib/bioroebe/aminoacids/deduce_aminoacid_sequence.rb +1 -9
- data/lib/bioroebe/aminoacids/display_aminoacid_table.rb +1 -0
- data/lib/bioroebe/aminoacids/show_hydrophobicity.rb +1 -6
- data/lib/bioroebe/base/colours_for_base/colours_for_base.rb +18 -8
- data/lib/bioroebe/base/commandline_application/commandline_arguments.rb +13 -11
- data/lib/bioroebe/base/commandline_application/misc.rb +18 -8
- data/lib/bioroebe/base/misc.rb +16 -0
- data/lib/bioroebe/base/prototype/misc.rb +1 -1
- data/lib/bioroebe/codons/show_codon_tables.rb +6 -2
- data/lib/bioroebe/codons/show_codon_usage.rb +2 -1
- data/lib/bioroebe/constants/aminoacids_and_proteins.rb +1 -0
- data/lib/bioroebe/constants/database_constants.rb +1 -1
- data/lib/bioroebe/constants/files_and_directories.rb +20 -1
- data/lib/bioroebe/constants/misc.rb +20 -0
- data/lib/bioroebe/count/count_amount_of_nucleotides.rb +3 -0
- data/lib/bioroebe/crystal/README.md +2 -0
- data/lib/bioroebe/crystal/to_rna.cr +19 -0
- data/lib/bioroebe/data/README.md +11 -8
- data/lib/bioroebe/data/electron_microscopy/pos_example.pos +396 -0
- data/lib/bioroebe/data/electron_microscopy/test_particles.star +36 -0
- data/lib/bioroebe/{shell/tk.rb → electron_microscopy/electron_microscopy_module.rb} +15 -10
- data/lib/bioroebe/electron_microscopy/simple_star_file_generator.rb +4 -9
- data/lib/bioroebe/fasta_and_fastq/show_fasta_headers.rb +27 -12
- data/lib/bioroebe/genome/README.md +4 -0
- data/lib/bioroebe/genome/genome.rb +67 -0
- data/lib/bioroebe/gui/gtk3/protein_to_DNA/protein_to_DNA.rb +18 -18
- data/lib/bioroebe/gui/gtk3/random_sequence/random_sequence.rb +19 -11
- data/lib/bioroebe/gui/shared_code/protein_to_DNA/protein_to_DNA_module.rb +14 -14
- data/lib/bioroebe/misc/ruler.rb +1 -0
- data/lib/bioroebe/parsers/genbank_parser.rb +353 -24
- data/lib/bioroebe/parsers/gff.rb +1 -9
- data/lib/bioroebe/pdb/parse_pdb_file.rb +1 -9
- data/lib/bioroebe/project/project.rb +1 -1
- data/lib/bioroebe/python/README.md +1 -0
- data/lib/bioroebe/python/__pycache__/mymodule.cpython-39.pyc +0 -0
- data/lib/bioroebe/python/gui/gtk3/all_in_one.css +4 -0
- data/lib/bioroebe/python/gui/gtk3/all_in_one.py +59 -0
- data/lib/bioroebe/python/gui/gtk3/widget1.py +20 -0
- data/lib/bioroebe/python/gui/tkinter/all_in_one.py +91 -0
- data/lib/bioroebe/python/mymodule.py +8 -0
- data/lib/bioroebe/python/protein_to_dna.py +33 -0
- data/lib/bioroebe/python/shell/shell.py +19 -0
- data/lib/bioroebe/python/to_rna.py +14 -0
- data/lib/bioroebe/python/toplevel_methods/open_in_browser.py +20 -0
- data/lib/bioroebe/python/toplevel_methods/palindromes.py +42 -0
- data/lib/bioroebe/python/toplevel_methods/rds.py +13 -0
- data/lib/bioroebe/python/toplevel_methods/three_delimiter.py +34 -0
- data/lib/bioroebe/python/toplevel_methods/time_and_date.py +43 -0
- data/lib/bioroebe/python/toplevel_methods/to_camelcase.py +11 -0
- data/lib/bioroebe/requires/require_the_bioroebe_project.rb +3 -1
- data/lib/bioroebe/sequence/nucleotide_module/nucleotide_module.rb +28 -25
- data/lib/bioroebe/sequence/protein.rb +105 -3
- data/lib/bioroebe/sequence/sequence.rb +61 -2
- data/lib/bioroebe/shell/menu.rb +3451 -3366
- data/lib/bioroebe/shell/misc.rb +51 -4311
- data/lib/bioroebe/shell/readline/readline.rb +1 -1
- data/lib/bioroebe/shell/shell.rb +11192 -28
- data/lib/bioroebe/siRNA/siRNA.rb +81 -1
- data/lib/bioroebe/string_matching/find_longest_substring.rb +3 -2
- data/lib/bioroebe/taxonomy/class_methods.rb +3 -8
- data/lib/bioroebe/taxonomy/constants.rb +4 -3
- data/lib/bioroebe/taxonomy/edit.rb +2 -1
- data/lib/bioroebe/taxonomy/help/help.rb +10 -10
- data/lib/bioroebe/taxonomy/info/check_available.rb +15 -9
- data/lib/bioroebe/taxonomy/info/info.rb +17 -2
- data/lib/bioroebe/taxonomy/info/is_dna.rb +46 -36
- data/lib/bioroebe/taxonomy/interactive.rb +139 -95
- data/lib/bioroebe/taxonomy/menu.rb +27 -18
- data/lib/bioroebe/taxonomy/parse_fasta.rb +3 -1
- data/lib/bioroebe/taxonomy/shared.rb +1 -0
- data/lib/bioroebe/taxonomy/taxonomy.rb +1 -0
- data/lib/bioroebe/toplevel_methods/aminoacids_and_proteins.rb +31 -24
- data/lib/bioroebe/toplevel_methods/databases.rb +1 -1
- data/lib/bioroebe/toplevel_methods/fasta_and_fastq.rb +101 -63
- data/lib/bioroebe/toplevel_methods/misc.rb +17 -16
- data/lib/bioroebe/toplevel_methods/nucleotides.rb +22 -5
- data/lib/bioroebe/toplevel_methods/open_in_browser.rb +2 -0
- data/lib/bioroebe/toplevel_methods/palindromes.rb +1 -2
- data/lib/bioroebe/toplevel_methods/taxonomy.rb +2 -2
- data/lib/bioroebe/toplevel_methods/to_camelcase.rb +5 -0
- data/lib/bioroebe/utility_scripts/align_open_reading_frames.rb +1 -9
- data/lib/bioroebe/utility_scripts/check_for_mismatches/check_for_mismatches.rb +1 -9
- data/lib/bioroebe/utility_scripts/compacter.rb +1 -9
- data/lib/bioroebe/utility_scripts/compseq/compseq.rb +1 -9
- data/lib/bioroebe/utility_scripts/create_batch_entrez_file.rb +1 -9
- data/lib/bioroebe/utility_scripts/dot_alignment.rb +1 -9
- data/lib/bioroebe/utility_scripts/move_file_to_its_correct_location.rb +1 -4
- data/lib/bioroebe/utility_scripts/showorf/constants.rb +0 -5
- data/lib/bioroebe/utility_scripts/showorf/reset.rb +1 -4
- data/lib/bioroebe/version/version.rb +2 -2
- data/lib/bioroebe/www/embeddable_interface.rb +101 -52
- data/lib/bioroebe/www/sinatra/sinatra.rb +186 -70
- data/lib/bioroebe/yaml/aminoacids/amino_acids_long_name_to_one_letter.yml +2 -2
- data/lib/bioroebe/yaml/configuration/browser.yml +1 -1
- data/lib/bioroebe/yaml/genomes/README.md +3 -4
- data/lib/bioroebe/yaml/restriction_enzymes/restriction_enzymes.yml +3 -3
- metadata +32 -35
- data/doc/setup.rb +0 -1655
- data/lib/bioroebe/genbank/genbank_parser.rb +0 -291
- data/lib/bioroebe/shell/add.rb +0 -108
- data/lib/bioroebe/shell/assign.rb +0 -360
- data/lib/bioroebe/shell/chop_and_cut.rb +0 -281
- data/lib/bioroebe/shell/constants.rb +0 -166
- data/lib/bioroebe/shell/download.rb +0 -335
- data/lib/bioroebe/shell/enable_and_disable.rb +0 -158
- data/lib/bioroebe/shell/enzymes.rb +0 -310
- data/lib/bioroebe/shell/fasta.rb +0 -345
- data/lib/bioroebe/shell/gtk.rb +0 -76
- data/lib/bioroebe/shell/history.rb +0 -132
- data/lib/bioroebe/shell/initialize.rb +0 -217
- data/lib/bioroebe/shell/loop.rb +0 -74
- data/lib/bioroebe/shell/prompt.rb +0 -107
- data/lib/bioroebe/shell/random.rb +0 -289
- data/lib/bioroebe/shell/reset.rb +0 -335
- data/lib/bioroebe/shell/scan_and_parse.rb +0 -135
- data/lib/bioroebe/shell/search.rb +0 -337
- data/lib/bioroebe/shell/sequences.rb +0 -200
- data/lib/bioroebe/shell/show_report_and_display.rb +0 -2901
- data/lib/bioroebe/shell/startup.rb +0 -127
- data/lib/bioroebe/shell/taxonomy.rb +0 -14
- data/lib/bioroebe/shell/user_input.rb +0 -88
- data/lib/bioroebe/shell/xorg.rb +0 -45
data/doc/todo/bioroebe_todo.md
CHANGED
@@ -1,16 +1,124 @@
|
|
1
|
-
|
2
|
-
(1) →
|
1
|
+
------------------------------------------------------------------------------------------
|
2
|
+
(1) → set_dna_sequence alu
|
3
|
+
|
4
|
+
^^^ fetch random alu
|
5
|
+
|
6
|
+
^^^ alu sequence
|
7
|
+
Ok we started this now adding more details, but we
|
8
|
+
need to become better at searching for this
|
9
|
+
sequence.
|
10
|
+
------------------------------------------------------------------------------------------
|
11
|
+
(2) → draw things based on GR
|
12
|
+
------------------------------------------------------------------------------------------
|
13
|
+
(3) → https://mycocosm.jgi.doe.gov/help/screenshots/browser_viewer.png
|
14
|
+
^^^ offer the same functionality
|
15
|
+
------------------------------------------------------------------------------------------
|
16
|
+
(4) → https://genome.cshlp.org/content/12/10/1611/F3.expansion.html
|
17
|
+
|
18
|
+
^^^ enable this, we must obtain a sequence then store into genbank format
|
19
|
+
so, first fetch; then store as-is.
|
20
|
+
------------------------------------------------------------------------------------------
|
21
|
+
(5) → be able to generate nice graphics
|
22
|
+
|
23
|
+
https://genome.cshlp.org/content/12/10/1611/F1.large.jpg
|
24
|
+
------------------------------------------------------------------------------------------
|
25
|
+
(6) → add an rmagick wrapper, perhaps via imageparadise or something
|
26
|
+
the idea is that we can make fancy drawings and generate
|
27
|
+
an image for the end user to see
|
28
|
+
------------------------------------------------------------------------------------------
|
29
|
+
(7) → https://bioperl.org/howtos/Beginners_HOWTO.html#item13
|
30
|
+
extend the sequence object and document it
|
31
|
+
|
32
|
+
also add:
|
33
|
+
|
34
|
+
class Genome
|
35
|
+
and:
|
36
|
+
def is_circular?
|
37
|
+
@internal_hash[:is_circular]
|
38
|
+
end; alias circular? is_circular? # === circular?
|
39
|
+
def species?
|
40
|
+
@internal_hash[:species] # return the species here
|
41
|
+
end
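
The two query methods in the note above could sit in a small class like the following. This is only a minimal sketch: the constructor and its keyword arguments (`species:`, `is_circular:`) are assumptions made for illustration and are not taken from the bioroebe source.

```ruby
# Hypothetical sketch; the constructor signature is an assumption.
class Genome
  def initialize(species: nil, is_circular: false)
    # All state lives in one hash, as the todo entry suggests.
    @internal_hash = {
      species:     species,
      is_circular: is_circular
    }
  end

  # Returns true if the genome was registered as circular.
  def is_circular?
    @internal_hash[:is_circular]
  end
  alias circular? is_circular?

  # Returns the species name, or nil if none was given.
  def species?
    @internal_hash[:species]
  end
end

genome = Genome.new(species: 'Escherichia coli', is_circular: true)
puts genome.circular? # prints true
puts genome.species?  # prints Escherichia coli
```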
|
42
|
+
------------------------------------------------------------------------------------------
|
43
|
+
(2) http://lib.ysu.am/open_books/312400.pdf
|
44
|
+
|
45
|
+
clone:
|
46
|
+
Primer.pl
|
47
|
+
This program was written to support the required informatics for a sequencing
|
48
|
+
lab. The desire was to quickly generate primer pair candidates for use in STS
|
49
|
+
mapping. We use Bioperl modules to fetch the sequences from GenBank.
|
50
|
+
#! /usr/bin/perl
|
51
|
+
#
|
52
|
+
# primers.pl
|
53
|
+
#
|
54
|
+
# Reads a list of
|
55
|
+
|
56
|
+
% primers.pl AC013798
|
57
|
+
AC013798
|
58
|
+
Left Right Length Penalty
|
59
|
+
CCTCCTGGACAACCTGTGTT TGAAGTCAGGGGACATAGGG 280 0.0823
|
60
|
+
CCTCCTGGACAACCTGTGTT AGGCCAGTAGACTGGGTGTG 298 0.1758
|
61
|
+
CCTCCTGGACAACCTGTGTT GGTGTGAAGTCAGGGGACAT 284 0.1852
|
62
|
+
TTCCCGCATCTCTTAGCAGT AGGCCAGTAGACTGGGTGTG 209 0.1962
|
63
|
+
CTTCCCGCATCTCTTAGCAG GACACTAGTGGCAAGGAGGC 226 0.2362
|
64
|
+
Most of the primers.pl program is extremely simple. The real guts and power
|
65
|
+
of the program lie in the classes and the methods we call. The next section
|
66
|
+
examines the Primer3 module, which is similar to many Bioperl modules
|
67
|
+
|
68
|
+
|
69
|
+
------------------------------------------------------------------------------------------
|
70
|
+
(1) → Clone all of Emboss. :)
|
71
|
+
|
72
|
+
→ Clone and document the getorf functionality properly.
|
73
|
+
|
74
|
+
See: http://emboss.sourceforge.net/apps/cvs/emboss/apps/getorf.html
|
75
|
+
|
76
|
+
http://emboss.sourceforge.net
|
77
|
+
http://emboss.sourceforge.net/apps/cvs/emboss/apps/getorf.html
|
78
|
+
|
79
|
+
------------------------------------------------------------------------------------------
|
80
|
+
(3) → Add useful formulas for bioshell.
|
81
|
+
------------------------------------------------------------------------------------------
|
82
|
+
(1) → Polish the GUI sets:
|
83
|
+
|
84
|
+
https://i.imgur.com/djElIMh.png
|
85
|
+
|
86
|
+
------------------------------------------------------------------------------------------
|
87
|
+
(4) → The taxonomy part should be fully integrated, without it
|
88
|
+
being a standalone part anymore.
|
89
|
+
continue on the taxonomy stuff.
|
90
|
+
One day this will work again *shake fist*
|
91
|
+
------------------------------------------------------------------------------------------
|
92
|
+
(1) → Show the frequency of codons in different tables
|
93
|
+
|
94
|
+
This works quite ok, but right now the approach is to store
|
95
|
+
this in a .yml file which is not ideal.
|
96
|
+
|
97
|
+
Thus, we have to add two things:
|
98
|
+
- The ability to store this into a SQL database
|
99
|
+
- The ability to batch-download all of these codons,
|
100
|
+
which first requires that we have a way to obtain all
|
101
|
+
taxonomic ids.
|
102
|
+
Add where this can be found.
|
103
|
+
|
104
|
+
IMPROVE THIS ALL!!!!!!!
|
105
|
+
|
106
|
+
------------------------------------------------------------------------------------------
|
107
|
+
(2) improve docu + tests for melting temperature analysis again
|
108
|
+
+ usage example + GUI + web-use
|
109
|
+
------------------------------------------------------------------------------------------
|
110
|
+
(3) → https://biopython.org/DIST/docs/tutorial/Tutorial.html#sec15
|
3
111
|
|
4
112
|
^^^ work through the above, also integrate it + write docs
|
5
113
|
|
6
114
|
https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orchid.fasta
|
7
115
|
|
8
|
-
|
9
|
-
(
|
116
|
+
------------------------------------------------------------------------------------------
|
117
|
+
(4) → integrate electron microscopy slowly and also add documentation
|
10
118
|
about this AS YOU GO!!!!!
|
11
119
|
^^^ yup add more of it
|
12
|
-
|
13
|
-
(
|
120
|
+
------------------------------------------------------------------------------------------
|
121
|
+
(5) → Add save session support
|
14
122
|
to reload our last activity completely ...
|
15
123
|
hmmm..
|
16
124
|
This has to be well designed...
|
@@ -27,9 +135,8 @@ https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orc
|
|
27
135
|
upon startup of the bioroebe shell.
|
28
136
|
This is in preparation for save-session support.
|
29
137
|
|
30
|
-
|
31
|
-
|
32
|
-
(5) → Lys-Asp-Glu-Leu
|
138
|
+
------------------------------------------------------------------------------------------
|
139
|
+
(6) → Lys-Asp-Glu-Leu
|
33
140
|
|
34
141
|
if i.include?('-') and Bioroebe.is_in_the_three_letter_code?(i)
|
35
142
|
end
|
@@ -47,11 +154,11 @@ https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orc
|
|
47
154
|
|
48
155
|
^^ yep this is also called KDEL
|
49
156
|
https://en.wikipedia.org/wiki/KDEL_(amino_acid_sequence)
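
A possible way to turn hyphenated three-letter input such as Lys-Asp-Glu-Leu into the one-letter form (KDEL) is sketched below. The method name `three_letter_to_one_letter` and the constant `THREE_TO_ONE` are hypothetical and not part of the bioroebe API; only the 20 standard amino acids are covered.

```ruby
# Hypothetical helper; covers the 20 standard amino acids only.
THREE_TO_ONE = {
  'Ala' => 'A', 'Arg' => 'R', 'Asn' => 'N', 'Asp' => 'D', 'Cys' => 'C',
  'Gln' => 'Q', 'Glu' => 'E', 'Gly' => 'G', 'His' => 'H', 'Ile' => 'I',
  'Leu' => 'L', 'Lys' => 'K', 'Met' => 'M', 'Phe' => 'F', 'Pro' => 'P',
  'Ser' => 'S', 'Thr' => 'T', 'Trp' => 'W', 'Tyr' => 'Y', 'Val' => 'V'
}.freeze

# Converts 'Lys-Asp-Glu-Leu' style input into a one-letter string.
def three_letter_to_one_letter(input)
  input.split('-').map { |residue|
    THREE_TO_ONE.fetch(residue.capitalize) {
      raise ArgumentError, "Unknown amino acid: #{residue}"
    }
  }.join
end

puts three_letter_to_one_letter('Lys-Asp-Glu-Leu') # prints KDEL
```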
|
50
|
-
|
51
|
-
(
|
157
|
+
------------------------------------------------------------------------------------------
|
158
|
+
(7) → Add "orthologs". this shall show us the top 25 orthologs or
|
52
159
|
something. In the bioshell? Hmm. Not sure yet.
|
53
|
-
|
54
|
-
(
|
160
|
+
------------------------------------------------------------------------------------------
|
161
|
+
(8) → clone the functionality of this:
|
55
162
|
|
56
163
|
http://www.kazusa.or.jp/codon/cgi-bin/countcodon.cgi
|
57
164
|
http://www.kazusa.or.jp/codon/countcodon.html
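
As a starting point for cloning the codon-counting functionality, the core step can be sketched in plain Ruby. The method name `count_codons` is an assumption; the sketch reads the sequence in frame 1 only and silently drops a trailing incomplete codon.

```ruby
# Hypothetical helper: counts codons in reading frame 1.
def count_codons(sequence)
  counts = Hash.new(0)
  # Normalise case, strip whitespace, then walk the sequence in triplets.
  sequence.upcase.gsub(/\s+/, '').scan(/.{3}/) { |codon| counts[codon] += 1 }
  counts
end

count_codons('ATGGCCATGTAA').each_pair { |codon, n| puts "#{codon} #{n}" }
# prints:
# ATG 2
# GCC 1
# TAA 1
```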
|
@@ -63,18 +170,18 @@ https://en.wikipedia.org/wiki/KDEL_(amino_acid_sequence)
|
|
63
170
|
widget first. And sinatra output too.
|
64
171
|
AND document it as well
|
65
172
|
|
66
|
-
|
173
|
+
------------------------------------------------------------------------------------------
|
67
174
|
(8) → Analyse the SARS genome in bioroebe
|
68
175
|
possibly graphically as well
|
69
176
|
|
70
177
|
Are there new GUIs that we could combine? Hmmm.
|
71
|
-
|
178
|
+
------------------------------------------------------------------------------------------
|
72
179
|
(9) → In bioroebe, generate that .ps thingy graphical thing from the
|
73
180
|
vienna RNA tutorial. Hmmm.
|
74
181
|
|
75
182
|
https://www.tbi.univie.ac.at/RNA/tutorial/
|
76
|
-
|
77
|
-
(
|
183
|
+
------------------------------------------------------------------------------------------
|
184
|
+
(10) → get insulin sequence from NCBI
|
78
185
|
human
|
79
186
|
then apply trypsin onto it
|
80
187
|
and try it like this:
|
@@ -88,13 +195,13 @@ Also add:
|
|
88
195
|
^^^ to show it
|
89
196
|
Hmm. Perhaps also auto-download or something.
|
90
197
|
|
91
|
-
|
92
|
-
(
|
198
|
+
------------------------------------------------------------------------------------------
|
199
|
+
(11) → in bioroebe: UAG?
|
93
200
|
^^^ show all stop codons with that in the bioshell
|
94
201
|
all UAG sequences... hmm. and TAG?
|
95
202
|
Finish that.
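
The UAG/TAG lookup could be sketched as follows. The method name `positions_of_triplet` is hypothetical; DNA input is treated as RNA by mapping T to U, and positions are 1-based. The sketch reports every occurrence, regardless of reading frame.

```ruby
# Hypothetical helper: 1-based start positions of a triplet in a sequence.
def positions_of_triplet(sequence, triplet = 'UAG')
  rna   = sequence.upcase.tr('T', 'U') # treat DNA and RNA input alike
  query = triplet.upcase.tr('T', 'U')
  positions = []
  offset = 0
  while (index = rna.index(query, offset))
    positions << index + 1 # convert 0-based index to 1-based position
    offset = index + 1
  end
  positions
end

p positions_of_triplet('AUGUAGCCUAGUAA')        # prints [4, 9]
p positions_of_triplet('ATGTAGCCTAGTAA', 'TAG') # prints [4, 9]
```

To keep only in-frame hits for frame 1, the result could be filtered with `select { |pos| (pos - 1) % 3 == 0 }`.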
|
96
|
-
|
97
|
-
(
|
203
|
+
------------------------------------------------------------------------------------------
|
204
|
+
(12) → The position of a symbol in a string is the total number of
|
98
205
|
symbols found to its left, including itself (e.g., the positions
|
99
206
|
of all occurrences of 'U' in "AUGCUUCAGAAAGGUCUUACG" are 2, 5,
|
100
207
|
6, 15, 17, and 18). The symbol at position i
|
@@ -102,70 +209,70 @@ Hmm. Perhaps also auto-download or something.
|
|
102
209
|
|
103
210
|
^^^ add a solution there, a toplevel API
|
104
211
|
!!!!!
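
One possible shape for such a toplevel method, assuming the 1-based position definition quoted above. The name `positions_of` is hypothetical and not an existing bioroebe method.

```ruby
# Hypothetical helper: 1-based positions of a symbol within a string.
def positions_of(symbol, string)
  (1..string.length).select { |position| string[position - 1] == symbol }
end

p positions_of('U', 'AUGCUUCAGAAAGGUCUUACG') # prints [2, 5, 6, 15, 17, 18]
```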
|
105
|
-
|
106
|
-
(
|
212
|
+
------------------------------------------------------------------------------------------
|
213
|
+
(13) → http://bioruby.org/rdoc/Bio/Blast.html
|
107
214
|
^^^ add support for BLAST
|
108
|
-
|
109
|
-
(
|
215
|
+
------------------------------------------------------------------------------------------
|
216
|
+
(14) → add: parse_pdb()
|
110
217
|
With this we shall just show some info, about a given
|
111
218
|
.pdb file at hand.
|
112
219
|
Also make it commandline based too + bioshell variant
|
113
220
|
here, and a sinatra interface once this all works.
|
114
221
|
Don't forget to document it!!!!!
|
115
222
|
^^^ and google a bit how others do that
|
116
|
-
|
117
|
-
(
|
223
|
+
------------------------------------------------------------------------------------------
|
224
|
+
(15) → pdb 1a6m
|
118
225
|
^^^ download this when that is used in the bioshell; we also have
|
119
226
|
to use the download directory for this, so make sure that
|
120
227
|
we do.
|
121
228
|
^^^ And then, also document this clearly.
|
122
|
-
|
123
|
-
(
|
229
|
+
------------------------------------------------------------------------------------------
|
230
|
+
(16) show_string
|
124
231
|
^^^ slowly port this ... find out differences
|
125
232
|
then unify into one method. right now we used
|
126
233
|
two or something.
|
127
|
-
|
128
|
-
(
|
234
|
+
------------------------------------------------------------------------------------------
|
235
|
+
(17) → Try to see if we can integrate this into our GUI:
|
129
236
|
|
130
237
|
https://cdn.snapgene.com/assets/7.6.11/assets/images/snapgene/homepage/homepage-hero.png
|
131
|
-
|
238
|
+
------------------------------------------------------------------------------------------
|
132
239
|
(5) → Scan for leucine zipper!
|
133
240
|
|
134
241
|
This is ~25% implemented. We need to double-check what
|
135
242
|
exactly is a leucine zipper.
|
136
|
-
|
243
|
+
------------------------------------------------------------------------------------------
|
137
244
|
(6) → Extend the sinatra-interface for the Rosalind task,
|
138
245
|
perhaps add a sub-link to show which parts are solved
|
139
246
|
as-is. Hmm. I am not continuing on this though.
|
140
247
|
^^^^
|
141
248
|
well - make rosalind anew again or something.
|
142
249
|
|
143
|
-
|
250
|
+
------------------------------------------------------------------------------------------
|
144
251
|
(7) - Add a blast interface; both via the web-interface, GUI,
|
145
252
|
and also from the commandline.
|
146
|
-
|
253
|
+
------------------------------------------------------------------------------------------
|
147
254
|
(8) - Write a tutorial about primer design.
|
148
255
|
also make sure that the GUI has support for this.
|
149
|
-
|
256
|
+
------------------------------------------------------------------------------------------
|
150
257
|
(9) - In the documentation examples, show some examples of how to work
|
151
258
|
with different organisms.
|
152
|
-
|
259
|
+
------------------------------------------------------------------------------------------
|
153
260
|
(10) - In the bioshell, if "stop?" is issued, then the colouring isn't
|
154
261
|
correct. It currently does not show any result. This has to
|
155
262
|
be fixed.
|
156
|
-
|
263
|
+
------------------------------------------------------------------------------------------
|
157
264
|
(11) → https://www.rubydoc.info/gems/biomart
|
158
265
|
^^^ integrate biomart
|
159
266
|
|
160
267
|
p biomart.list_datasets
|
161
268
|
p biomart.datasets?
|
162
|
-
|
269
|
+
------------------------------------------------------------------------------------------
|
163
270
|
(12) Add Trypsin and Trypsinogen sequences, both as FASTA
|
164
271
|
but also as shortcut via the commandline such as:
|
165
272
|
show_orf :trypsine
|
166
273
|
show_orf :trypsin
|
167
274
|
Or something like this; and document it as well.
|
168
|
-
|
275
|
+
------------------------------------------------------------------------------------------
|
169
276
|
(13) → 1..60
|
170
277
|
|
171
278
|
setdna 57
|
@@ -177,12 +284,12 @@ well - make rosalind anew again or something.
|
|
177
284
|
5' - ATGTGCAGTCAGGTGAATTTATTGAAAAATTTGAGGCTCCTGGTGGTGCAAATCAAAGAACTGCTCCTCAGTGGATGTTGCCTTTACTTCTAG - 3'
|
178
285
|
^^^ here, when colourizing: if the last codon is a STOP codon,
|
179
286
|
then we colourize that as well.
|
180
|
-
|
287
|
+
------------------------------------------------------------------------------------------
|
181
288
|
(14) → MG1655
|
182
289
|
^^^ input this to download the sequence. Also show it to the user.
|
183
|
-
|
290
|
+
------------------------------------------------------------------------------------------
|
184
291
|
(15) → extend virus-information into the bioroebe project.
|
185
|
-
|
292
|
+
------------------------------------------------------------------------------------------
|
186
293
|
(16) → Add a way to analyse the chemical structure of all
|
187
294
|
aminoacids. We wish to show the chemical formula.
|
188
295
|
|
@@ -196,22 +303,22 @@ well - make rosalind anew again or something.
|
|
196
303
|
I don't understand why it removes H and O so perhaps
|
197
304
|
don't remove that part. But still show the -R.
|
198
305
|
|
199
|
-
|
306
|
+
------------------------------------------------------------------------------------------
|
200
307
|
(17) FIX THE COLOURIZATION BUG; THIS ONE TRIGGERED THE WHOLE
|
201
308
|
REWRITE AFTER ALL!
|
202
|
-
|
309
|
+
------------------------------------------------------------------------------------------
|
203
310
|
(18) FIX TAXONOMY related-problems AS WELL
|
204
311
|
^^^^^^ AND DOCUMENT THIS related-problems.
|
205
|
-
|
312
|
+
------------------------------------------------------------------------------------------
|
206
313
|
(19) Do note that z will then be a String, not a sequence object anymore.
|
207
314
|
(This may be subject to change in the future, but for now, aka
|
208
315
|
**February 2020**, it is that way.)
|
209
316
|
^^^^
|
210
|
-
|
317
|
+
------------------------------------------------------------------------------------------
|
211
318
|
(20) ^^^ colours are appended. That should not be the case!
|
212
319
|
ADD SOMETHING NEW ... some todo entries
|
213
320
|
and some python tool
|
214
|
-
|
321
|
+
------------------------------------------------------------------------------------------
|
215
322
|
(21) → rewrite the whole project anew
|
216
323
|
- improve the documentation
|
217
324
|
- focus on class Protein first and add
|
@@ -219,10 +326,10 @@ well - make rosalind anew again or something.
|
|
219
326
|
that, as well as:
|
220
327
|
.backtrans
|
221
328
|
.reverse_translate
|
222
|
-
|
329
|
+
------------------------------------------------------------------------------------------
|
223
330
|
(22) → AND THEN test on windows as well.
|
224
331
|
^^^^^^^^^^^^^^
|
225
|
-
|
332
|
+
------------------------------------------------------------------------------------------
|
226
333
|
(23) →
|
227
334
|
Reduced alphabets for proteins | [not implemented yet]
|
228
335
|
^^^ check this as well
|
@@ -252,9 +359,9 @@ First focus on bioroebe.
|
|
252
359
|
efetch "https://www.ncbi.nlm.nih.gov/gene/744779"
|
253
360
|
^^^ test this. again
|
254
361
|
|
255
|
-
|
362
|
+
------------------------------------------------------------------------------------------
|
256
363
|
(25) fix tk-levenshtein
|
257
|
-
|
364
|
+
------------------------------------------------------------------------------------------
|
258
365
|
(26) → rewrite the whole project anew
|
259
366
|
- improve the documentation
|
260
367
|
- rework the WHOLE tutorial as well
|
@@ -263,13 +370,13 @@ efetch "https://www.ncbi.nlm.nih.gov/gene/744779"
|
|
263
370
|
that
|
264
371
|
.backtrans
|
265
372
|
.reverse_translate
|
266
|
-
|
373
|
+
------------------------------------------------------------------------------------------
|
267
374
|
(27) → analyze /Depot/Temp/Bioroebe/1CEZ.pdb
|
268
375
|
|
269
376
|
^^^
|
270
377
|
support this. Already works half-way, we started writing a pdb parser.
|
271
378
|
this should work in general, for .fasta files as well.
|
272
|
-
|
379
|
+
------------------------------------------------------------------------------------------
|
273
380
|
(28) → SINATRA STUFF:
|
274
381
|
FIX AND EXTEND SINATRA IN BIOROEBE.
|
275
382
|
extend it too.
|
@@ -281,7 +388,7 @@ efetch "https://www.ncbi.nlm.nih.gov/gene/744779"
|
|
281
388
|
and special-display on sinatra aka
|
282
389
|
where the nucleotide sequence has numbers
|
283
390
|
^^^
|
284
|
-
|
391
|
+
------------------------------------------------------------------------------------------
|
285
392
|
(29) pick any virus and begin to amass tons of data; and then when done
|
286
393
|
also connect this into a GUI for use therein.
|
287
394
|
|
@@ -302,7 +409,7 @@ https://www.ncbi.nlm.nih.gov/nuccore/NC_038391.1
|
|
302
409
|
|
303
410
|
|
304
411
|
|
305
|
-
|
412
|
+
------------------------------------------------------------------------------------------
|
306
413
|
(1) → Fix:
|
307
414
|
|
308
415
|
require 'bioroebe/toplevel_methods/open_reading_frames.rb'
|
@@ -310,10 +417,10 @@ https://www.ncbi.nlm.nih.gov/nuccore/NC_038391.1
|
|
310
417
|
Something is wrong; it returns regions that contain
|
311
418
|
a stop codon, which cannot be true.
|
312
419
|
|
313
|
-
|
420
|
+
------------------------------------------------------------------------------------------
|
314
421
|
(3) → Fix: extend glycovirology parts
|
315
422
|
seek stuff in viral genomes
|
316
|
-
|
423
|
+
------------------------------------------------------------------------------------------
|
317
424
|
(4) →
|
318
425
|
|
319
426
|
seq = Bio::Sequence::NA.new("atgcatgcaaaaaaa")
|
@@ -336,13 +443,13 @@ https://www.ncbi.nlm.nih.gov/nuccore/NC_038391.1
|
|
336
443
|
seq = Bioroebe::Sequence.new("atgcatgcaaaaaaa")
|
337
444
|
puts seq
|
338
445
|
puts seq.complement
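
For comparison, a base-wise complement can be written in plain Ruby without either library. This is a sketch under stated assumptions: the method names are hypothetical, only unambiguous DNA letters are handled, and it makes no claim about what `complement` returns in Bioroebe or BioRuby.

```ruby
# Hypothetical helpers using only core String methods.

# Base-wise complement; the sequence is not reversed.
def complement(dna)
  dna.tr('atgcATGC', 'tacgTACG')
end

# Complement read in the opposite direction.
def reverse_complement(dna)
  complement(dna).reverse
end

puts complement('atgcatgcaaaaaaa')         # prints tacgtacgttttttt
puts reverse_complement('atgcatgcaaaaaaa') # prints tttttttgcatgcat
```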
|
339
|
-
|
446
|
+
------------------------------------------------------------------------------------------
|
340
447
|
(5) →
|
341
448
|
make sure we have a good fasta-showing widget
|
342
449
|
show how many nucleotides are
|
343
450
|
AND add support to modify this as-is
|
344
451
|
^^^^
|
345
|
-
|
452
|
+
------------------------------------------------------------------------------------------
|
346
453
|
(6) → In BioRoebe:
|
347
454
|
|
348
455
|
Add a table showing how compatible bioroebe is compared to the other
|
@@ -352,7 +459,7 @@ https://www.ncbi.nlm.nih.gov/nuccore/NC_038391.1
|
|
352
459
|
including Bio (ruby-bio) the main ruby project here.
|
353
460
|
And add a table showing which functionality is implemented
|
354
461
|
in Java already.
|
355
|
-
|
462
|
+
------------------------------------------------------------------------------------------
|
356
463
|
(7) →
|
357
464
|
********************************************************************************
|
358
465
|
What happens when we treat the lambda genome with EcoRI?
|
@@ -375,19 +482,19 @@ Bioroebe.digest_this_dna("/root/Bioroebe/fasta/NC_001416.1_Enterobacteria_phage_
|
|
375
482
|
DNA.
|
376
483
|
^^^ this now works kind of ... but it must be better
|
377
484
|
documented and we must test this with more data.
|
378
|
-
|
485
|
+
------------------------------------------------------------------------------------------
|
379
486
|
(8) → add the bioroebe logo to sinatra, but as appropriate size,
|
380
487
|
via base64. perhaps width 50 or so. need to determine
|
381
488
|
which size fits here.
|
382
|
-
|
489
|
+
------------------------------------------------------------------------------------------
|
383
490
|
(9) → Integrate http://nc2.neb.com/NEBcutter2/cutshow.php?name=ffe1d68e-
|
384
491
|
|
385
492
|
in particular the visual part.
|
386
|
-
|
493
|
+
------------------------------------------------------------------------------------------
|
387
494
|
(10) → https://international.neb.com/products/r0196-ncii#Product%20Information
|
388
495
|
^^^ autogenerate such an image, aka restriction cutting enzyme
|
389
496
|
to indicate the target sequence.
|
390
|
-
|
497
|
+
------------------------------------------------------------------------------------------
|
391
498
|
(6) → how to do codon optimisation in E. coli? bioroebe must support this!
|
392
499
|
|
393
500
|
we must first get a display which codon is very commonly used in
|
@@ -399,34 +506,23 @@ and then we look which codons may be improvable - display
|
|
399
506
|
them on the commandline
|
400
507
|
|
401
508
|
class: OptimizeCodons.new(of_this_sequence)
|
402
|
-
|
509
|
+
------------------------------------------------------------------------------------------
|
403
510
|
(7) → Molecular size of "Ubiquitin"? "8.5 kDa".
|
404
511
|
^^^ this should be calculated automatically
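
A rough first approximation can be had from the common rule of thumb of about 110 Da per amino acid residue. The sketch below uses that average only, so it is an estimate rather than an exact mass; the constant and method names are hypothetical. An exact value would require summing per-residue masses.

```ruby
# Rule-of-thumb average; an approximation, not an exact residue mass.
AVERAGE_RESIDUE_MASS_IN_DALTON = 110.0

# Hypothetical helper: estimated protein mass in kDa from residue count.
def estimated_mass_in_kilodalton(n_residues)
  (n_residues * AVERAGE_RESIDUE_MASS_IN_DALTON / 1000.0).round(2)
end

# Ubiquitin has 76 residues.
puts estimated_mass_in_kilodalton(76) # prints 8.36
```

The estimate of 8.36 kDa is close to the quoted 8.5 kDa, which suggests the rule of thumb is adequate for a quick display but not for precise work.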
|
405
|
-
|
512
|
+
------------------------------------------------------------------------------------------
|
406
513
|
(8) → taxonomy !!!!!!!!!!!!!!!!!!
|
407
|
-
|
514
|
+
------------------------------------------------------------------------------------------
|
408
515
|
(9) → Given a list of gene names that I would like to get chromosome/position
|
409
516
|
information for (in mm10). Is there some service online where I can
|
410
517
|
paste this list? ^^^ enable this
|
411
|
-
|
412
|
-
(10) → Show the frequency of codons in different tables
|
413
|
-
|
414
|
-
This works quite ok, but right now the approach is to store
|
415
|
-
this in a .yml file which is not ideal.
|
416
|
-
|
417
|
-
Thus, we have to add two things:
|
418
|
-
- The ability to store this into a SQL database
|
419
|
-
- The ability to batch-download all of these codons,
|
420
|
-
which first requires that we have a way to obtain all
|
421
|
-
taxonomic ids.
|
422
|
-
-------------------------------------------------------------------------------
|
518
|
+
------------------------------------------------------------------------------------------
|
423
519
|
(11) → Add a way in bioroebe to store a gene into a yaml file
|
424
520
|
or so, and to also load it up again. Perhaps simplify
|
425
521
|
this automatically. Need some ways to describe that.
|
426
|
-
|
522
|
+
------------------------------------------------------------------------------------------
|
427
523
|
(12) → Make bioroebe very useful from the www, no matter if via sinatra
|
428
524
|
or rails. It should be a tool-set project on the www as well.
|
429
|
-
|
525
|
+
------------------------------------------------------------------------------------------
|
430
526
|
(13) → Suppose you have a GenBank file which you want to turn into a
|
431
527
|
Fasta file. For example, let's consider the file cor6_6.gb
|
432
528
|
which is included in the Biopython unit tests under the
|
@@ -441,12 +537,12 @@ call it format-converter or so
|
|
441
537
|
the GUI works somewhat but needs to be polished up.
|
442
538
|
THEN THIS CAN BE REMOVED!!!!!!!
|
443
539
|
|
444
|
-
|
540
|
+
------------------------------------------------------------------------------------------
|
445
541
|
(14) → We need a table in which we can collect and compare the strong promoters of
|
446
542
|
different organisms.
|
447
543
|
|
448
544
|
strong_promoters.yml
|
449
|
-
|
545
|
+
------------------------------------------------------------------------------------------
|
450
546
|
(15) → add:
|
451
547
|
start position of exons
|
452
548
|
and show the sequence based on that file
|
@@ -454,9 +550,9 @@ THEN THIS CAN BE REMOVED!!!!!!!
|
|
454
550
|
Normally there's a "gene" entry for each gene, so:
|
455
551
|
awk 'BEGIN{FS="\t"; OFS="\t"}{if($3 == "gene") print $1, $4, $5}' foo.gtf
|
456
552
|
|
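The awk one-liner above could be mirrored in Ruby roughly as follows. This is only a sketch: `gene_coordinates` is a hypothetical name, and the input is assumed to be tab-separated GTF text with the feature type in column 3 and the start and end positions in columns 4 and 5.

```ruby
# Hypothetical helper: return [seqname, start, end] for every "gene" row
# of tab-separated GTF text, like the awk command does with $1, $4, $5.
def gene_coordinates(gtf_text)
  rows = gtf_text.each_line.map do |line|
    fields = line.chomp.split("\t")
    next unless fields[2] == 'gene'
    [fields[0], fields[3].to_i, fields[4].to_i]
  end
  rows.compact
end

example = [
  "chr1\tsrc\tgene\t100\t200\t.\t+\t.\tgene_id \"g1\";",
  "chr1\tsrc\texon\t100\t150\t.\t+\t.\tgene_id \"g1\";"
].join("\n")

gene_coordinates(example).each { |row| puts row.join("\t") }
```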
457
|
-
|
553
|
+
------------------------------------------------------------------------------------------
|
458
554
|
(16) → also add 30-33 to aminoacids hmmm difficult.
|
459
|
-
|
555
|
+
------------------------------------------------------------------------------------------
|
460
556
|
(17) → http://bioinformatics.oxfordjournals.org/content/18/8/1135
|
461
557
|
"TFBS: Computational framework for transcription factor
|
462
558
|
binding site analysis"
|
@@ -464,7 +560,7 @@ THEN THIS CAN BE REMOVED!!!!!!!
|
|
464
560
|
into bioroebe
|
465
561
|
|
466
562
|
http://tfbs.genereg.net/
|
467
|
-
|
563
|
+
------------------------------------------------------------------------------------------
|
468
564
|
(18) → They include trypsin, chymotrypsin, thrombin, plasmin, papain and factor Xa.
|
469
565
|
^^^ provide means to identify where they cut,
|
470
566
|
and show this then by simulating a digest.
|
@@ -472,7 +568,7 @@ THEN THIS CAN BE REMOVED!!!!!!!
|
|
472
568
|
also document this on bioroebe todo
|
473
569
|
this is done via digestion/digestions
|
474
570
|
but it's not quite perfect yet.
|
475
|
-
|
571
|
+
------------------------------------------------------------------------------------------
|
476
572
|
(19) → a) add a commandline way to generate a random protein
|
477
573
|
with a specified length and then display it on the
|
478
574
|
commandline [DONE] !!!
|
@@ -498,29 +594,29 @@ THEN THIS CAN BE REMOVED!!!!!!!
|
|
498
594
|
Enable this BOTH from the commandline AND from the
|
499
595
|
interactive variant and from sinatra! Hmmmm.
|
500
596
|
|
501
|
-
|
597
|
+
------------------------------------------------------------------------------------------
|
502
598
|
(1) → add an option to design a
|
503
599
|
|
504
600
|
degenerate primer
|
505
|
-
|
601
|
+
------------------------------------------------------------------------------------------
|
506
602
|
(2) Add upcase to sequences and ensure that it works; also document it
|
507
603
|
internally and in the .pdf tutorial
|
508
604
|
what does that mean? upcase as method? hmmm.
|
509
605
|
|
510
|
-
|
606
|
+
------------------------------------------------------------------------------------------
|
511
607
|
(1) → http://www.biomart.org/other/user-docs.pdf
|
512
608
|
^^^ work through this
|
513
609
|
^^^ integrate the old .cgi part and improve as you go
|
514
|
-
|
610
|
+
------------------------------------------------------------------------------------------
|
515
611
|
(1) → Access geninfo numbers easily.
|
516
612
|
Search for these and download them.
|
517
|
-
|
613
|
+
------------------------------------------------------------------------------------------
|
518
614
|
- Add all of bioruby into bioroebe:
|
519
615
|
|
520
616
|
continuous project
|
521
617
|
https://github.com/biopython/biopython
|
522
618
|
https://github.com/bioruby/bioruby/tree/master/lib/bio
|
523
|
-
|
619
|
+
------------------------------------------------------------------------------------------
|
524
620
|
(3) → https://github.com/bioruby/bioruby/issues/134
|
525
621
|
^^^ check this, for restriction enzymes
|
526
622
|
http://rebase.neb.com/rebase/enz/MboII.html
|
@@ -530,9 +626,9 @@ THEN THIS CAN BE REMOVED!!!!!!!
|
|
530
626
|
> seq = seq.reverse_complement
|
531
627
|
> Bio::RestrictionEnzyme.cut(seq, 'MboII').primary rescue [seq]
|
532
628
|
=> ["atcatcaatcctaatcttct"]
|
533
|
-
|
629
|
+
------------------------------------------------------------------------------------------
|
534
630
|
(4) → Document how an ORF is defined for the bioroebe project.
|
535
|
-
|
631
|
+
------------------------------------------------------------------------------------------
|
536
632
|
(5) Continue with biojava in bioroebe.
|
537
633
|
|
538
634
|
→ We need to make some table that tells us what is implemented
|
@@ -547,7 +643,7 @@ THEN THIS CAN BE REMOVED!!!!!!!
|
|
547
643
|
|
548
644
|
dprimer M-T-T-Y-Y-T-A-A-A-STOP
|
549
645
|
|
550
|
-
|
646
|
+
------------------------------------------------------------------------------------------
|
551
647
|
(1) → The codon tables:
|
552
648
|
→ In January we added a codon-table GUI to ruby-gtk3.
|
553
649
|
|
@@ -576,31 +672,29 @@ THEN THIS CAN BE REMOVED!!!!!!!
|
|
576
672
|
|
577
673
|
This now sorta works semi-ok.
|
578
674
|
|
579
|
-
|
675
|
+
------------------------------------------------------------------------------------------
|
580
676
|
(1) → In the bioroebe-shell, enable input such as:
|
581
677
|
|
582
678
|
NC_000011.10
|
583
679
|
|
584
680
|
This shall quickly download this sequence into the
|
585
681
|
local file, and also rename it properly.
|
586
|
-
|
682
|
+
------------------------------------------------------------------------------------------
|
587
683
|
→ clone all of bioruby
|
588
|
-
|
684
|
+
------------------------------------------------------------------------------------------
|
589
685
|
(1) → Read through bioinformatics books and include the material !!!
|
590
686
|
^^^^^ add more pictures ... possibly also of the GUIs.
|
591
687
|
Also work through biopython and port everything important to
|
592
688
|
bioroebe.
|
593
|
-
|
689
|
+
------------------------------------------------------------------------------------------
|
594
690
|
- Add: DetectMotif
|
595
691
|
|
596
692
|
This class shall be used for detecting subsequences.
|
597
|
-
|
693
|
+
------------------------------------------------------------------------------------------
|
598
694
|
- Add new functionality
|
599
|
-
|
600
|
-
- more documentation
|
601
|
-
|
602
|
-
- continue on bioroebe, and when it is done, write to the guy.
|
603
|
-
-------------------------------------------------------------------------------
|
695
|
+
------------------------------------------------------------------------------------------
|
696
|
+
- more documentation!!!
|
697
|
+
------------------------------------------------------------------------------------------
|
604
698
|
- Rewrite bioroebe completely - add some tests, too or so, to
|
605
699
|
test this. ^^^
|
606
700
|
That way we learn how to write tests.
|
@@ -643,22 +737,13 @@ extend bioroebe sinatra interface
|
|
643
737
|
also add a footer to show which entries are available or so
|
644
738
|
→ in bioroebe, make the postgresql database work again ...
|
645
739
|
|
646
|
-
|
647
|
-
|
648
|
-
..........................................................................
|
740
|
+
------------------------------------------------------------------------------------------
|
649
741
|
|
650
742
|
→ ^^^ improve this whole project a lot
|
651
743
|
|
652
744
|
before uploading then send email
|
653
745
|
|
654
746
|
|
655
|
-
- 1fat.pdb
|
656
|
-
|
657
|
-
^^^ download this, also via bioshell
|
658
|
-
download 1fat
|
659
|
-
^^^ notify the user about this
|
660
|
-
but put it into the dir of bioshell
|
661
|
-
|
662
747
|
→ add:
|
663
748
|
|
664
749
|
set_dna :insulin
|
@@ -674,44 +759,26 @@ also add a footer to show which entries are available or so
|
|
674
759
|
→ becomes: http://www.ncbi.nlm.nih.gov/gene/3630
|
675
760
|
|
676
761
|
wtf ... better to learn how NCBI works
|
677
|
-
|
678
|
-
- Add a sequence table
|
762
|
+
------------------------------------------------------------------------------------------
|
763
|
+
- Add a sequence table into bioroebe for GFP, YFP etc
|
679
764
|
and make this show in both the interactive bioshell but
|
680
765
|
also the main README.md
|
681
|
-
-------------------------------------------------------------------------------
|
682
|
-
- stop_frame1?
|
683
|
-
^^^ add support for this
|
684
|
-
and stop_frame2?
|
685
|
-
etc.
|
686
|
-
to show stop-codons in this colour
|
687
|
-
THEN UPLOAD!
|
688
|
-
^^^ this works now but is not documented
|
689
|
-
|
690
|
-
|
691
|
-
-------------------------------------------------------------------------------
|
692
|
-
|
693
|
-
- chop to first ATG
|
694
766
|
|
695
|
-
|
696
|
-
|
697
|
-
^^^^ enable this, to chop towards the first ATG
|
698
|
-
sequence in the string
|
699
|
-
|
700
|
-
-------------------------------------------------------------------------------
|
767
|
+
------------------------------------------------------------------------------------------
|
701
768
|
→ http://www.biophp.org/stats/describe_data/demo.php?show=formula
|
702
769
|
|
703
770
|
^^^ should also add documentation like this, also via www interface
|
704
|
-
|
771
|
+
------------------------------------------------------------------------------------------
|
705
772
|
→ add mouse chromosome URL, also in the bioshell
|
706
773
|
and the main README, to be of help for the
|
707
774
|
user. add a mouse subsection.
|
708
|
-
|
775
|
+
------------------------------------------------------------------------------------------
|
709
776
|
→ fix the taxonomy stuff...
|
710
|
-
|
777
|
+
------------------------------------------------------------------------------------------
|
711
778
|
(1) → add 2nd_orf
|
712
779
|
→ this shall scan for the 2nd orf
|
713
780
|
→ and third ORF as well, then, and document it.
|
714
|
-
|
781
|
+
------------------------------------------------------------------------------------------
|
715
782
|
(2) → Add a "cutter-range example" in restriction enzymes +
|
716
783
|
table + examples + tutorial
|
717
784
|
|
@@ -719,21 +786,16 @@ also add a footer to show which entries are available or so
|
|
719
786
|
|
720
787
|
Also, add in the documentation where this
|
721
788
|
can be found.
|
722
|
-
|
723
|
-
(3) → Add aaruler, similar to "ruler"; in the bioshell.
|
724
|
-
But we want to do this on the dna-sequence rather
|
725
|
-
than the aminoacid sequence.
|
726
|
-
This works but the display is not ideal.
|
727
|
-
..........................................................................
|
789
|
+
------------------------------------------------------------------------------------------
|
728
790
|
(4) → Add some codon-usage analyzer. What shall it show? It
|
729
791
|
should show how many codons are used, frequencies etc...
|
730
792
|
by an organism, and compare that to other data.
|
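One possible starting point for the codon-usage analyzer in item (4) is a plain counter over in-frame triplets. The sketch below assumes the sequence is already in the desired reading frame; `codon_usage` is a hypothetical name and not an existing bioroebe method.

```ruby
# Hypothetical helper: count codons in an in-frame coding sequence and
# return both the absolute counts and the relative frequencies.
def codon_usage(sequence)
  codons = sequence.upcase.scan(/.{3}/) # a trailing partial codon is dropped
  counts = Hash.new(0)
  codons.each { |codon| counts[codon] += 1 }
  total = codons.size.to_f
  frequencies = {}
  counts.each { |codon, n| frequencies[codon] = n / total }
  { counts: counts, frequencies: frequencies }
end

usage = codon_usage('ATGGCCGCCTAA')
usage[:counts].each do |codon, n|
  puts format('%s %d %.2f', codon, n, usage[:frequencies][codon])
end
```

Comparing two organisms would then amount to comparing the two frequency Hashes codon by codon.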
731
|
-
|
793
|
+
------------------------------------------------------------------------------------------
|
732
794
|
(5) → Implement a GPCR interface.
|
733
795
|
|
734
796
|
This is for "G-protein coupled receptors."
|
735
797
|
Denote which variants exist and so forth. Document it as well.
|
736
|
-
|
798
|
+
------------------------------------------------------------------------------------------
|
737
799
|
(6) → alu?
|
738
800
|
|
739
801
|
Will read from the file `/Programs/Ruby/2.3.0/lib/ruby/site_ruby/2.3.0/bioroebe/yaml/alu_elements.yml`.
|
@@ -756,7 +818,7 @@ also add a footer to show which entries are available or so
|
|
756
818
|
^^^ add this and document it or something like that
|
757
819
|
And perhaps add a small protein as an example how to
|
758
820
|
work with .pdb files instead.
|
759
|
-
|
821
|
+
------------------------------------------------------------------------------------------
|
760
822
|
(4) → Extend bioroebe to allow download
|
761
823
|
|
762
824
|
PDB files
|
@@ -770,13 +832,13 @@ also add a footer to show which entries are available or so
|
|
770
832
|
|
771
833
|
in 3EML 2VTP 2VEZ
|
772
834
|
do
|
773
|
-
|
835
|
+
------------------------------------------------------------------------------------------
|
774
836
|
(1) → Fully integrate electron microscopy then remove the old entry.
|
775
837
|
Test it though.
|
776
838
|
Hmm... but ... we will first polish the main bioroebe
|
777
839
|
gem AND the taxonomy gem and THEN AFTERWARDS
|
778
840
|
integrate electron microscopy.
|
779
|
-
|
841
|
+
------------------------------------------------------------------------------------------
|
780
842
|
(1) → ORF Finder:
|
781
843
|
|
782
844
|
We must add an ORF finder for the bioroebe project,
|
@@ -785,23 +847,23 @@ also add a footer to show which entries are available or so
|
|
785
847
|
This works partially... start_stop works but we do not
|
786
848
|
yet find all subsequences.
|
787
849
|
|
788
|
-
|
850
|
+
------------------------------------------------------------------------------------------
|
789
851
|
(1) → we must determine whether we have a protein or a nucleotide sequence
|
790
852
|
or so via a toplevel method!
|
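A toplevel check of this kind could be sketched with two character-class tests. This is a heuristic under stated assumptions: `sequence_type` is a hypothetical name, only the 20 standard amino acid letters are considered, and a short peptide made only of A, C, G, T, U or N would be reported as a nucleotide sequence.

```ruby
# Hypothetical heuristic: guess whether a string is a nucleotide or a
# protein sequence from the letters it contains.
def sequence_type(sequence)
  seq = sequence.upcase.delete(" \n")
  return :nucleotide if seq.match?(/\A[ACGTUN]+\z/)
  return :protein    if seq.match?(/\A[ACDEFGHIKLMNPQRSTVWY]+\z/)
  :unknown
end

puts sequence_type('ATGCCGTTA')
puts sequence_type('MFLMVSPTAYHQNKDECFLP')
```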
791
|
-
|
853
|
+
------------------------------------------------------------------------------------------
|
792
854
|
(1) → there is a talens module.
|
793
855
|
we have to improve on it for a while
|
794
856
|
better docu
|
795
857
|
more testing
|
796
858
|
then we can get rid of this entry here
|
797
|
-
|
859
|
+
------------------------------------------------------------------------------------------
|
798
860
|
(1) → 33.44
|
799
861
|
Next showing the nucleotides 33 to 44 (including 33 and 44).
|
800
862
|
The length of the fragment will be 12 nucleotides.
|
801
863
|
5' - 2;70;130;180 - 3'
|
802
864
|
^^^ there is some problem; we somehow embed the colour codes,
|
803
865
|
which should not happen.
|
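The range input in item (1) maps onto a 1-based, inclusive slice. The sketch below also strips ANSI colour escapes before slicing, on the assumption (not confirmed by the todo entry) that the stray `2;70;130;180` comes from slicing an already colourized string. Both method names are hypothetical.

```ruby
# Hypothetical helper: remove ANSI colour escape sequences such as
# "\e[38;2;70;130;180m" so that positions refer to nucleotides only.
def strip_colour_codes(text)
  text.gsub(/\e\[[\d;]*m/, '')
end

# Hypothetical helper: 1-based, inclusive subsequence, so that
# subsequence(seq, 33, 44) returns 12 nucleotides.
def subsequence(sequence, first, last)
  strip_colour_codes(sequence)[(first - 1)..(last - 1)]
end

plain = ('A' * 32) + 'CCCCGGGGTTTT' + ('A' * 10)
puts subsequence(plain, 33, 44)
```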
804
|
-
|
866
|
+
------------------------------------------------------------------------------------------
|
805
867
|
(1) → set_aa DTLCIGYHAN NSTDTVDTVL EKNVTVTHSV NLLEDKHNGK LCKLRGVAPL HLGKCNIAGW ILGNPECESL STASSWSYIV ETSNSDNGTC YPGDFINYEE LREQLSSVSS FERFEIFPKT SSWPNHDNKG VTAACPHAGA KSFYKNLIWL VKKGNSYPKL NQSYINDKGK EVLVLWGIHH PSTTADQQSL YQNADAYVFV GTSRYSKKFK PEIATRPKVR DQEGRMNYYW TLVEPGDKIT FEATGNLVVP RYAFMERNAG SGIIISDTPV HDCNTTCQTP EGAINTSLPF QNIHPITIGK CPKYVKSTKL RLATGLRNVP SIQSRGLFGA IAGFIEGGWT GMVDGWYGYH HQNEQGSGYA ADLKSTQNAI DKITNKVNSV IKMNTQFTAV GKEFNHLEKR IENLNKKVDD GFLDIWTYNA ELLVLLENER TLDYHDSNVK NLYEKVRNQL KNNAKEIGNG CFEFYHKCDN TCMESVKNGT YDYPKYSEEA KLNREKIDGV KLESTRIYHH HHHH
|
806
868
|
|
807
869
|
^^^ enable copy/pasting,
|
@@ -816,7 +878,7 @@ also add a footer to show which entries are available or so
|
|
816
878
|
This sequence has 50 aminoacids.
|
817
879
|
^^^ that is not correct.
|
818
880
|
|
819
|
-
|
881
|
+
------------------------------------------------------------------------------------------
|
820
882
|
(1) → add this functionality:
|
821
883
|
|
822
884
|
melting temper
|
@@ -853,70 +915,57 @@ also add a footer to show which entries are available or so
|
|
853
915
|
and also provide a commandline-way to calculate them,
|
854
916
|
using ruby. The latter may be useful and rather easy for
|
855
917
|
scripted use.
|
856
|
-
|
918
|
+
------------------------------------------------------------------------------------------
|
857
919
|
(1) → show insulin
|
858
920
|
^^^ to show the insulin structure
|
859
921
|
how to find it? no idea...
|
860
922
|
but we should have these structures already made available somewhere.
|
861
|
-
|
923
|
+
------------------------------------------------------------------------------------------
|
862
924
|
(1) → Todo: find family of enzymes, based on sequence structure
|
863
925
|
alone.
|
864
|
-
|
926
|
+
------------------------------------------------------------------------------------------
|
865
927
|
(1) → https://pubchem.ncbi.nlm.nih.gov/compound/16131099#section=Top
|
866
928
|
|
867
929
|
^^^ this website is quite interesting; try to use components
|
868
930
|
from it.
|
869
|
-
|
931
|
+
------------------------------------------------------------------------------------------
|
870
932
|
(1) → Add some option to show the aminoacid sequence, at the least
|
871
933
|
store it; and optionally show it.
|
872
934
|
|
873
935
|
possibly always report how many aminoacids are
|
874
936
|
part of that file; and optionally also show
|
875
937
|
the whole sequence.
|
876
|
-
|
938
|
+
------------------------------------------------------------------------------------------
|
877
939
|
(1) → WORK THROUGH the PROTOCOL AT BOKU. THEN WORK THROUGH THE VARIOUS
|
878
940
|
TIDBITS AT UNI WIEN STARTING WITH HEIKO.
|
879
941
|
^^^ this is where we are now.
|
880
942
|
we are at the beginning of 1b ... hmmmm, so first go through the one at
|
881
943
|
BOKU. Then delete this.
|
882
|
-
|
944
|
+
------------------------------------------------------------------------------------------
|
883
945
|
(1) → Begin tk-bindings for bioroebe, following the gtk stuff.
|
884
|
-
|
946
|
+
------------------------------------------------------------------------------------------
|
885
947
|
(2) → frame_value = position_of_the_stop_codon - position_of_the_start_codon
|
886
948
|
^^^ continue on this ...
|
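The formula in item (2) could be sketched as follows, assuming "position" means the 0-based index of the first nucleotide of each codon and that the stop codon must be in frame with the first ATG. `frame_value` as a method and the `STOP_CODONS` constant are hypothetical names for illustration.

```ruby
STOP_CODONS = %w[TAA TAG TGA].freeze

# Hypothetical helper: distance between the first ATG and the first
# in-frame stop codon, i.e. stop position minus start position.
# Returns nil when no start codon or no in-frame stop codon is found.
def frame_value(sequence)
  seq = sequence.upcase
  start = seq.index('ATG')
  return nil unless start

  # Walk codon by codon, beginning with the codon after the ATG.
  ((start + 3)...(seq.length - 2)).step(3) do |position|
    return position - start if STOP_CODONS.include?(seq[position, 3])
  end
  nil
end

puts frame_value('CCATGAAATTTTAGGG')
```

Because the scan moves in steps of three, any value returned is a multiple of 3.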
887
|
-
|
949
|
+
------------------------------------------------------------------------------------------
|
888
950
|
(1) → improve both the gtk-apps parts, and the sinatra web-interface,
|
889
951
|
and other GUI-like elements. The idea is to make this software
|
890
952
|
more useful for people around the world, which should help
|
891
953
|
increase its adoption rate.
|
892
|
-
|
954
|
+
------------------------------------------------------------------------------------------
|
893
955
|
(2) → Look to integrate this:
|
894
956
|
|
895
957
|
http://www.ncbi.nlm.nih.gov/nuccore/NM_007315.3?report=fasta&log$=seqview&format=text
|
896
958
|
^^^
|
897
|
-
|
898
|
-
(1) → Clone and document the getorf functionality properly.
|
899
|
-
|
900
|
-
See: http://emboss.sourceforge.net/apps/cvs/emboss/apps/getorf.html
|
901
|
-
-------------------------------------------------------------------------------
|
902
|
-
(2) → set_dna_sequence alu
|
903
|
-
|
904
|
-
^^^ fetch random alu
|
905
|
-
|
906
|
-
^^^ alu sequence
|
907
|
-
Ok we started this now adding more details, but we
|
908
|
-
need to become better at searching for this
|
909
|
-
sequence.
|
910
|
-
-------------------------------------------------------------------------------
|
959
|
+
------------------------------------------------------------------------------------------
|
911
960
|
(3) → We need to make available the ... thingy magick
|
912
961
|
emboss functionality. that may seem useful
|
913
962
|
but also feel free to extend these parts for
|
914
963
|
bioroebe as necessary.
|
915
|
-
|
964
|
+
------------------------------------------------------------------------------------------
|
916
965
|
(4) → integrate electron_microscopy fully
|
917
966
|
This will take more time, so first we finish with the
|
918
967
|
taxonomy module instead.
|
919
|
-
|
968
|
+
------------------------------------------------------------------------------------------
|
920
969
|
(5) → Improve support for BLAST up until
|
921
970
|
|
922
971
|
middle of 2015 so that I am better prepared
|
@@ -927,7 +976,7 @@ also add a footer to show which entries are available or so
|
|
927
976
|
So, work on BLAST tutorial at bioinf page:
|
928
977
|
|
929
978
|
bl bioinf; rf bioinf
|
930
|
-
|
979
|
+
------------------------------------------------------------------------------------------
|
931
980
|
(3) → integrate a "codon usage database", whatever this means.
|
932
981
|
It is a cool database anyway. Then document this.
|
933
982
|
First, create a codon-usage analyzer on a per-FASTA
|
@@ -935,7 +984,7 @@ also add a footer to show which entries are available or so
|
|
935
984
|
and calculate the codon usage from there.
|
936
985
|
|
937
986
|
^^^ and add some GUI to this. hmmm
|
938
|
-
|
987
|
+
------------------------------------------------------------------------------------------
|
939
988
|
(4) → Input sequence:
|
940
989
|
|
941
990
|
MFLMVSPTAYHQNKDECFLP
|
@@ -951,46 +1000,40 @@ also add a footer to show which entries are available or so
|
|
951
1000
|
|
952
1001
|
^^^ we should also show this on the commandline AND the
|
953
1002
|
www ... hmmm.
|
954
|
-
|
1003
|
+
------------------------------------------------------------------------------------------
|
955
1004
|
(5) → enable a graphical layer so that we can find out which
|
956
1005
|
transcription factor activates which gene(s). This
|
957
1006
|
should show e. g. a transcription factor highlighting
|
958
1007
|
a target genetic area.
|
959
|
-
|
1008
|
+
------------------------------------------------------------------------------------------
|
960
1009
|
(2) → We should add more screenshots, make them available on imgur
|
961
1010
|
as well, after storing them locally. Start with the more
|
962
1011
|
important functionality.
|
963
1012
|
|
964
|
-
|
1013
|
+
------------------------------------------------------------------------------------------
|
965
1014
|
(2) → clone serial cloner or whatever the name was, that GUI,
|
966
1015
|
so that we can offer the same functionality.
|
967
|
-
|
1016
|
+
------------------------------------------------------------------------------------------
|
968
1017
|
(1) →
|
969
1018
|
|
970
1019
|
# * searching for PubMed IDs given a query string:
|
971
1020
|
# * Bio::PubMed#esearch (recommended)
|
972
1021
|
# * Bio::PubMed#search (only retrieves top 20 hits; will be deprecated)
|
973
1022
|
^^^ implement this
|
974
|
-
|
975
|
-
|
976
|
-
..........................................................................
|
1023
|
+
------------------------------------------------------------------------------------------
|
977
1024
|
(3) → Be able to solve exercise 16 in bioroebe
|
978
|
-
|
979
|
-
|
980
|
-
being a standalone part anymore.
|
981
|
-
continue on the taxonomy stuff.
|
982
|
-
One day this will work again *shake fist*
|
983
|
-
-------------------------------------------------------------------------------
|
1025
|
+
|
1026
|
+
------------------------------------------------------------------------------------------
|
984
1027
|
(5) → re1 = Bio::RestrictionEnzyme::DoubleStranded.new(enzyme1)
|
985
1028
|
|
986
1029
|
^^^ add this? hmmmm
|
987
1030
|
^^^ from here.
|
988
|
-
|
1031
|
+
------------------------------------------------------------------------------------------
|
989
1032
|
(1) → Colourize exon/intron boundaries.
|
990
|
-
|
1033
|
+
------------------------------------------------------------------------------------------
|
991
1034
|
(2) → In bioroebe: enhance phylogeny stuff and perhaps automatically
|
992
1035
|
generate pictures here.
|
993
|
-
|
1036
|
+
------------------------------------------------------------------------------------------
|
994
1037
|
(1) → In sinatra: add a backtranseq entry point, perhaps
|
995
1038
|
alias it as well.
|
996
1039
|
|
@@ -1000,7 +1043,7 @@ bioroebe --protein-to-dna
|
|
1000
1043
|
|
1001
1044
|
^^^ this shall start the GTK3 variant
|
1002
1045
|
|
1003
|
-
|
1046
|
+
------------------------------------------------------------------------------------------
|
1004
1047
|
(1) → require 'rubygems/text'
|
1005
1048
|
include Gem::Text
|
1006
1049
|
levenshtein_distance 'shevy', 'chevy' # => 1
|
@@ -1012,13 +1055,13 @@ bioroebe --protein-to-dna
|
|
1012
1055
|
https://github.com/rubygems/rubygems/blob/master/lib/rubygems/text.rb
|
1013
1056
|
^^^ actually move that part into bioroebe itself...
|
1014
1057
|
|
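If the Levenshtein helper is to live inside bioroebe rather than being pulled in from `rubygems/text`, a standalone version could look like the sketch below. This is a textbook dynamic-programming implementation written for illustration; it is not the RubyGems source.

```ruby
# Standalone Levenshtein distance using two rows of the usual
# dynamic-programming table (insertions, deletions, substitutions cost 1).
def levenshtein_distance(first, second)
  return second.length if first.empty?
  return first.length if second.empty?

  previous = (0..second.length).to_a
  first.each_char.with_index(1) do |char_a, i|
    current = [i]
    second.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [previous[j] + 1,          # deletion
                  current[j - 1] + 1,       # insertion
                  previous[j - 1] + cost].min # substitution
    end
    previous = current
  end
  previous.last
end

puts levenshtein_distance('shevy', 'chevy')
```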
1015
|
-
|
1058
|
+
------------------------------------------------------------------------------------------
|
1016
1059
|
(1) → add _source to all APIs in sinatra there. Ensure that this works
|
1017
1060
|
too. The user should be able to view the source code.
|
1018
1061
|
^^^ it has been added for 2 methods so far in sinatra; we need
|
1019
1062
|
to add it for the remaining ones too. Then we can remove
|
1020
1063
|
this entry point.
|
1021
|
-
|
1064
|
+
------------------------------------------------------------------------------------------
|
1022
1065
|
(2) → Check out expasy
|
1023
1066
|
PeptideCutter
|
1024
1067
|
also offer this functionality, through commandline, GUI
|
@@ -1026,16 +1069,12 @@ bioroebe --protein-to-dna
|
|
1026
1069
|
https://web.expasy.org/peptide_cutter/
|
1027
1070
|
We now have added trypsin but we should add more here; and
|
1028
1071
|
still have to add support for sinatra here.
|
1029
|
-
|
1072
|
+
------------------------------------------------------------------------------------------
|
1030
1073
|
(3) → melting temperature subsection
|
1031
1074
|
|
1032
1075
|
hmmm .... molecular weight calculation works now ... but
|
1033
1076
|
... is it correct for a ssDNA string? hmm...
|
1034
|
-
|
1035
|
-
(3) → Add useful formulas for bioshell.
|
1036
|
-
|
1037
|
-
|
1038
|
-
...........................................................................
|
1077
|
+
------------------------------------------------------------------------------------------
|
1039
1078
|
(1) → Degenerate Primers
|
1040
1079
|
|
1041
1080
|
You can try to determine the degenerate primers via the Shell
|
@@ -1046,7 +1085,7 @@ bioroebe --protein-to-dna
|
|
1046
1085
|
^^^ expand that subsection
|
1047
1086
|
more explanations and examples
|
1048
1087
|
|
1049
|
-
|
1088
|
+
------------------------------------------------------------------------------------------
|
1050
1089
|
(1) → Copy the functionality of plotorf:
|
1051
1090
|
|
1052
1091
|
See:
|
@@ -1062,7 +1101,7 @@ bioroebe --protein-to-dna
|
|
1062
1101
|
|
1063
1102
|
|
1064
1103
|
|
1065
|
-
|
1104
|
+
------------------------------------------------------------------------------------------
|
1066
1105
|
(2) → Start nucleotide position is at: 142
|
1067
1106
|
|
1068
1107
|
See the following example:
|
@@ -1072,24 +1111,24 @@ bioroebe --protein-to-dna
|
|
1072
1111
|
BIO SHELL>
|
1073
1112
|
^^^ this does not work; nothing is highlighted.
|
1074
1113
|
|
1075
|
-
|
1114
|
+
------------------------------------------------------------------------------------------
|
1076
1115
|
(2) → Add a myristoylation signal
|
1077
1116
|
|
1078
1117
|
Met-Gly-Xaa-Xaa-YXaa-Ser/Thr-Lys-Lys
|
1079
1118
|
|
1080
1119
|
^^^ but check first.
|
1081
1120
|
|
1082
|
-
|
1121
|
+
------------------------------------------------------------------------------------------
|
1083
1122
|
(3) → integrate the bioroebe_tutorial.cgi into the .md file completely.
|
1084
1123
|
|
1085
|
-
|
1124
|
+
------------------------------------------------------------------------------------------
|
1086
1125
|
(4) → Integrate everything from the biopython tutorial, if it makes
|
1087
1126
|
sense.
|
1088
1127
|
|
1089
|
-
|
1128
|
+
------------------------------------------------------------------------------------------
|
1090
1129
|
(5) → Improve the codon-optimizer in Bioroebe, including the
|
1091
1130
|
documentation. We need to make this really useful.
|
1092
|
-
|
1131
|
+
------------------------------------------------------------------------------------------
|
1093
1132
|
(6) →
|
1094
1133
|
5'- TACACGGCACAT -3'
|
1095
1134
|
3'- ATGTGCCGTGTA -5'
|
@@ -1098,7 +1137,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1098
1137
|
|
1099
1138
|
^^^ integrate mirror repeats creation
|
1100
1139
|
and searching for them. Hmmm.
|
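For the perfect case, a mirror repeat on one strand reads the same forwards and backwards, as with `TACACGGCACAT` above (`TACACG` followed by its reverse `GCACAT`). The sketch below covers creation and a simple check; the method names are hypothetical, and imperfect mirror repeats (IMRs) are not handled.

```ruby
# Hypothetical helper: build a perfect mirror repeat from one half.
def make_mirror_repeat(half)
  half.upcase + half.upcase.reverse
end

# Hypothetical helper: true if the strand equals its own reverse.
# Note that this uses the plain reverse, not the reverse complement.
def mirror_repeat?(sequence)
  seq = sequence.upcase
  seq.length > 1 && seq == seq.reverse
end

puts make_mirror_repeat('TACACG')
puts mirror_repeat?('TACACGGCACAT')
```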
1101
|
-
|
1140
|
+
------------------------------------------------------------------------------------------
|
1102
1141
|
(7) → continue porting bioroebe/taxonomy
|
1103
1142
|
|
1104
1143
|
^^^^^^^^^^
|
@@ -1108,12 +1147,12 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1108
1147
|
^^^ this is the next step, so that
|
1109
1148
|
we no longer need this.
|
1110
1149
|
|
1111
|
-
|
1150
|
+
------------------------------------------------------------------------------------------
|
1112
1151
|
(8) → find out which bacteria all contain the needle complex; find out
|
1113
1152
|
the sequence for the needle complex as well and study it;
|
1114
1153
|
find the positions of the genes responsible.
|
1115
1154
|
|
1116
|
-
|
1155
|
+
------------------------------------------------------------------------------------------
|
1117
1156
|
(9) → Add trypsin_digest, also in the shell, but possibly
|
1118
1157
|
on toplevel as well (if the input is a protein sequence).
|
1119
1158
|
|
@@ -1127,29 +1166,24 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1127
1166
|
And document it; but do not digest if a proline
|
1128
1167
|
follows !!!
|
1129
1168
|
^^^ document this too into .md
|
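The rule described in item (9), cleaving after lysine or arginine unless a proline follows, fits into a single regular expression. This is a sketch of that simplified rule only; `trypsin_digest` here is a hypothetical standalone method and not the existing bioroebe implementation.

```ruby
# Hypothetical helper: split a protein after every K or R,
# except when the next residue is P (proline).
def trypsin_digest(protein)
  protein.upcase.split(/(?<=[KR])(?!P)/)
end

puts trypsin_digest('MKRPAKG').inspect
```

The lookbehind keeps the K or R at the end of each fragment, and the negative lookahead implements the proline exception.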
1130
|
-
|
1131
|
-
-------------------------------------------------------------------------------
|
1132
|
-
(10) → in bioroebe, add a Coomassie check... do we include
|
1133
|
-
arginine or not.
|
1134
|
-
|
1135
|
-
..........................................................................
|
1169
|
+
------------------------------------------------------------------------------------------
|
1136
1170
|
(11) → add codon usage in bioroebe
|
1137
|
-
|
1171
|
+
------------------------------------------------------------------------------------------
|
1138
1172
|
(12) → Clone the following functionality.
|
1139
1173
|
|
1140
1174
|
http://www.bioinformatics.nl/cgi-bin/emboss/help/sirna
|
1141
|
-
|
1175
|
+
------------------------------------------------------------------------------------------
|
1142
1176
|
(13) → Improve the "find and scan" subsection. We must be able to find
|
1143
1177
|
subsequences; check for "matches" as well, including the bioshell.
|
1144
|
-
|
1178
|
+
------------------------------------------------------------------------------------------
|
1145
1179
|
(14) → Clone the CLUSTAL format alignment.
|
1146
|
-
|
1180
|
+
------------------------------------------------------------------------------------------
|
1147
1181
|
(15) → We need to be able to load up a whole genome into bioroebe,
|
1148
1182
|
and then be able to manipulate it.
|
1149
1183
|
|
1150
1184
|
^^^ perhaps test this with some example
|
1151
1185
|
data or so...
|
1152
|
-
|
1186
|
+
------------------------------------------------------------------------------------------
|
1153
1187
|
(16) → Restriction enzymes:
|
1154
1188
|
|
1155
1189
|
Add a subsection about restriction enzymes including
|
@@ -1163,7 +1197,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1163
1197
|
general, so that we can reproduce and verify the
|
1164
1198
|
information there.
|
1165
1199
|
|
1166
|
-
|
1200
|
+
------------------------------------------------------------------------------------------
|
1167
1201
|
(18) → clone pepinfo
|
1168
1202
|
|
1169
1203
|
The program "pepinfo" plots various amino acid properties in
|
@@ -1181,7 +1215,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1181
1215
|
|
1182
1216
|
The data are also written out to an output file.
|
1183
1217
|
|
1184
|
-
|
1218
|
+
------------------------------------------------------------------------------------------
|
1185
1219
|
(19) → gff?
|
1186
1220
|
|
1187
1221
|
There are 6 .gff3 files in the current directory.
|
@@ -1193,23 +1227,22 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1193
1227
|
|
1194
1228
|
^^^ we need an analyze-mode as well.
|
1195
1229
|
|
1196
|
-
|
1230
|
+
------------------------------------------------------------------------------------------
|
1197
1231
|
(20) → ^^^^ add the ability to
|
1198
1232
|
show a ruler AND highlighting as well
|
1199
1233
|
^^^ then document it.
|
1200
|
-
|
1234
|
+
------------------------------------------------------------------------------------------
|
1201
1235
|
(21) → https://github.com/bioperl/bioperl-live
|
1202
1236
|
Look what we can take from ^^^.
|
1203
1237
|
|
1204
1238
|
https://github.com/bioperl/bioperl-live/tree/master/examples
|
1205
1239
|
|
1206
|
-
|
1240
|
+
------------------------------------------------------------------------------------------
|
1207
1241
|
(23) → continue biojava, and bioroebe a bit
|
1208
1242
|
|
1209
1243
|
Ideally we should bring biojava to a working point.
|
1210
|
-
|
1211
|
-
|
1212
|
-
..........................................................................
|
1244
|
+
|
1245
|
+
------------------------------------------------------------------------------------------
|
1213
1246
|
(25) → clone the functionality found at https://web.expasy.org/protparam/
|
1214
1247
|
|
1215
1248
|
https://web.expasy.org/cgi-bin/protparam/protparam
|
@@ -1219,7 +1252,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1219
1252
|
|
1220
1253
|
Theoretical pI: 5.78
|
1221
1254
|
|
1222
|
-
|
1255
|
+
------------------------------------------------------------------------------------------
|
1223
1256
|
(27) → NP_417539.1
|
1224
1257
|
|
1225
1258
|
https://www.ncbi.nlm.nih.gov/protein/NP_417539.1
|
@@ -1227,26 +1260,26 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1227
1260
|
|
1228
1261
|
^^^ if the input is exactly like the above, on the first line,
|
1229
1262
|
download the sequence.
|
1230
|
-
|
1263
|
+
------------------------------------------------------------------------------------------
|
1231
1264
|
(28) → Integrate these nice GUI parts:
|
1232
1265
|
|
1233
1266
|
https://dev.to/kojix2/introduction-to-gr-rb-data-visualization-with-ruby-2c39
|
1234
1267
|
|
1235
|
-
|
1268
|
+
------------------------------------------------------------------------------------------
|
1236
1269
|
(29) → http://insilico.ehu.es/
|
1237
1270
|
|
1238
1271
|
^^^ check if we have all of this incorporated
|
1239
|
-
|
1272
|
+
------------------------------------------------------------------------------------------
|
1240
1273
|
(30) → http://www.biostars.org/
|
1241
1274
|
|
1242
1275
|
^^^ regularly work through this
|
1243
1276
|
and try to help
|
1244
1277
|
and extend bioruby at the same time.
|
1245
|
-
|
1278
|
+
------------------------------------------------------------------------------------------
|
1246
1279
|
(31) → The taxonomy-submodule should work one day, and be properly
|
1247
1280
|
documented as well. Perhaps integrate the parts of Taxonomy
|
1248
1281
|
that can be included into the toplevel domain.
|
1249
|
-
|
1282
|
+
------------------------------------------------------------------------------------------
|
1250
1283
|
(32) → Enable:
|
1251
1284
|
|
1252
1285
|
Bioroebe.set_genetic_code()
|
@@ -1262,7 +1295,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1262
1295
|
|
1263
1296
|
^^^ enable this as well; extend the documentation too.
|
1264
1297
|
|
1265
|
-
|
1298
|
+
------------------------------------------------------------------------------------------
|
1266
1299
|
(34) → We have found a restriction enzyme called NheI.
|
1267
1300
|
|
1268
1301
|
The sequence this 6-cutter relates to is: `5' - GCTAGC - 3'`
|
@@ -1270,23 +1303,23 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1270
1303
|
This restriction enzyme will produce a blunt overhang.
|
1271
1304
|
|
1272
1305
|
^^^ nope, that is wrong
|
1273
|
-
|
1306
|
+
------------------------------------------------------------------------------------------
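The NheI note above can be checked with a small sketch. NheI recognizes 5'-GCTAGC-3' and cuts after the first G (G^CTAGC), which leaves a 5' CTAG overhang rather than a blunt end. The constant and method names below are hypothetical helpers for illustration, not part of the Bioroebe API.

```ruby
# Hypothetical helpers, not Bioroebe API: locate a restriction site and
# cut the top strand at a fixed offset inside each site.
NHEI_SITE       = 'GCTAGC'
NHEI_CUT_OFFSET = 1 # NheI cuts G^CTAGC, i.e. after the first base of the site

# Returns the 1-based start positions of every (possibly overlapping) site.
def restriction_site_positions(sequence, site)
  positions = []
  index = sequence.index(site)
  while index
    positions << index + 1
    index = sequence.index(site, index + 1)
  end
  positions
end

# Returns the top-strand fragments after cutting at every site.
def digest_top_strand(sequence, site, cut_offset)
  fragments = []
  start = 0
  restriction_site_positions(sequence, site).each do |position|
    cut = position - 1 + cut_offset
    fragments << sequence[start...cut]
    start = cut
  end
  fragments << sequence[start..-1]
end

p restriction_site_positions('AAGCTAGCTT', NHEI_SITE)
p digest_top_strand('AAGCTAGCTT', NHEI_SITE, NHEI_CUT_OFFSET)
```

Only the top strand is modelled here; the bottom strand is cut at the mirrored position, which is what produces the four-base overhang.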
|
1274
1307
|
(35) → Sau3A?
|
1275
1308
|
^^^ enable this restriction site
|
1276
1309
|
|
1277
|
-
|
1310
|
+
------------------------------------------------------------------------------------------
|
1278
1311
|
(37) → Add matplotlib support.
|
1279
1312
|
|
1280
1313
|
try_to_use_matplotlib
|
1281
|
-
|
1314
|
+
------------------------------------------------------------------------------------------
|
1282
1315
|
(38) → https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmTools/RESTfulAPIs.html
|
1283
|
-
|
1316
|
+
------------------------------------------------------------------------------------------
|
1284
1317
|
(39) → The following input:
|
1285
1318
|
|
1286
1319
|
downcase; orf?; seq?
|
1287
1320
|
|
1288
1321
|
leads to strange display. Something is wrong here, must be checked.
|
1289
|
-
|
1322
|
+
------------------------------------------------------------------------------------------
|
1290
1323
|
(40) → Continue with rosalind problems.
|
1291
1324
|
|
1292
1325
|
These challenges can be found here:
|
@@ -1295,42 +1328,42 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1295
1328
|
|
1296
1329
|
Also integrate these rosalind-quizzes into bioroebe
|
1297
1330
|
when possible.
|
1298
|
-
|
1331
|
+
------------------------------------------------------------------------------------------
|
1299
1332
|
(41) → https://web.expasy.org/cgi-bin/peptide_mass/peptide-mass.pl
|
1300
1333
|
|
1301
1334
|
^^^ make the above usable in sinatra as well
|
1302
|
-
|
1335
|
+
------------------------------------------------------------------------------------------
|
1303
1336
|
(42) → Integrate a way to search for commonly known promoters:
|
1304
1337
|
|
1305
1338
|
promoters?
|
1306
1339
|
^^^ this functionality
|
1307
1340
|
^^^ this has to be expanded
|
1308
1341
|
and ...
|
1309
|
-
|
1342
|
+
------------------------------------------------------------------------------------------
|
1310
1343
|
(43) → Integrate:
|
1311
1344
|
|
1312
1345
|
http://biotools.nubic.northwestern.edu/OligoCalc.html
|
1313
|
-
|
1346
|
+
------------------------------------------------------------------------------------------
|
1314
1347
|
(44) → Extend the Java part of BioRoebe systematically..
|
1315
1348
|
|
1316
1349
|
What should come next? Let's make a list.
|
1317
1350
|
|
1318
1351
|
→ remove_numbers [DONE]
|
1319
|
-
|
1352
|
+
------------------------------------------------------------------------------------------
|
1320
1353
|
(46) → Study gnuplot; one day we have to draw graphs.
|
1321
1354
|
|
1322
|
-
|
1355
|
+
------------------------------------------------------------------------------------------
|
1323
1356
|
(47) → Add a genome browser, both ascii without GUI and also
|
1324
1357
|
with. In ruby-gtk.
|
1325
|
-
|
1358
|
+
------------------------------------------------------------------------------------------
|
1326
1359
|
(48) → Clone the functionality of:
|
1327
1360
|
|
1328
1361
|
http://www.biophp.org/minitools/restriction_digest/demo.php
|
1329
|
-
|
1362
|
+
------------------------------------------------------------------------------------------
|
1330
1363
|
(50) → Add the loxP sequence to readme [DONE] and explain this
|
1331
1364
|
better on the main readme; and perhaps also assign
|
1332
1365
|
the sequence via the bioshell.
|
1333
|
-
|
1366
|
+
------------------------------------------------------------------------------------------
|
1334
1367
|
(51) → 33. Cephalodiscidae Mitochondrial UAA-Tyr Code (transl_table=33)
|
1335
1368
|
|
1336
1369
|
AAs = FFLLSSSSYYY*CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSSKVVVVAAAADDEEGGGG
|
@@ -1341,7 +1374,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1341
1374
|
|
1342
1375
|
^^^ add a parser, and document it, that can take this input
|
1343
1376
|
and output the corresponding code, in a valid .yml file.
|
1344
|
-
|
1377
|
+
------------------------------------------------------------------------------------------
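A possible shape for the parser requested in (51), as a minimal sketch. It assumes the usual NCBI layout, where the 64 letters of the `AAs` line follow the codon order produced by iterating the bases T, C, A, G for Base1, Base2 and Base3; the Base lines themselves are not parsed here. The method and constant names are hypothetical.

```ruby
require 'yaml'

# Assumption: NCBI lists codons in T, C, A, G order for all three positions.
NCBI_BASE_ORDER = %w[T C A G].freeze

# Builds { 'TTT' => 'F', ... } from a 64-character NCBI "AAs" string.
def codon_table_from_ncbi(aas)
  raise ArgumentError, 'expected 64 amino acid letters' unless aas.size == 64
  codons = NCBI_BASE_ORDER.product(NCBI_BASE_ORDER, NCBI_BASE_ORDER).map(&:join)
  codons.zip(aas.chars).to_h
end

TABLE_33_AAS = 'FFLLSSSSYYY*CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSSKVVVVAAAADDEEGGGG'

table = codon_table_from_ncbi(TABLE_33_AAS)
puts table.to_yaml # YAML text that could be written to a .yml file
```

In this table TAA maps to Y, which matches the "UAA-Tyr" in the name of transl_table=33.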
|
1345
1378
|
(52) → Add to bioroebe the ability to add cloning vectors
|
1346
1379
|
and molecular_weight calculation
|
1347
1380
|
for this
|
@@ -1363,19 +1396,19 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1363
1396
|
|
1364
1397
|
^^^ we also need a way to find out what resistance genes
|
1365
1398
|
are carried there.
|
1366
|
-
|
1399
|
+
------------------------------------------------------------------------------------------
|
1367
1400
|
(53) → In the lambda genome sequence there are 10 EcoB and
|
1368
1401
|
5 EcoK sites.
|
1369
1402
|
^^^ verify this too, as an example as well
|
1370
|
-
|
1403
|
+
------------------------------------------------------------------------------------------
|
1371
1404
|
(54) → show restriction sites, composable and compatible with
|
1372
1405
|
serial clone ... hmm
|
1373
|
-
|
1406
|
+
------------------------------------------------------------------------------------------
|
1374
1407
|
(55) → enable:
|
1375
1408
|
BIOROEBE_USE_COLOURS:
|
1376
1409
|
can be 0 or 1
|
1377
1410
|
what is this?
|
1378
|
-
|
1411
|
+
------------------------------------------------------------------------------------------
|
1379
1412
|
(56) → Burrows-Wheeler-Transform (BWT)
|
1380
1413
|
|
1381
1414
|
^^^ add some method here
|
@@ -1388,15 +1421,15 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1388
1421
|
|
1389
1422
|
also test this against my paper-result
|
1390
1423
|
with input being: "GATAG$".
|
1391
|
-
|
1424
|
+
------------------------------------------------------------------------------------------
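For the Burrows-Wheeler entry above, a naive sketch that sorts all rotations and reads off the last column. It is quadratic in memory and only meant for short test inputs such as "GATAG$"; the method name is hypothetical, and the paper result mentioned in the entry is not known here, so the output should still be compared by hand.

```ruby
# Naive Burrows-Wheeler transform: build every rotation, sort them
# bytewise ('$' sorts before 'A'), then join the last characters.
def burrows_wheeler_transform(text)
  rotations = Array.new(text.size) { |i| text[i..-1] + text[0...i] }
  rotations.sort.map { |rotation| rotation[-1] }.join
end

puts burrows_wheeler_transform('GATAG$') # => GTGA$A
```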
|
1392
1425
|
(56) → Enable working with several genes... hmm and store that somewhere.
|
1393
1426
|
Something like a per-project workspace thingy.
|
1394
|
-
|
1427
|
+
------------------------------------------------------------------------------------------
|
1395
1428
|
(57) → Add:
|
1396
1429
|
|
1397
1430
|
http://nar.oxfordjournals.org/content/35/suppl_2/W71.long
|
1398
1431
|
|
1399
|
-
|
1432
|
+
------------------------------------------------------------------------------------------
|
1400
1433
|
(58) → Now, you may want to translate the nucleotides up to
|
1401
1434
|
the first in frame stop codon, and then stop (as
|
1402
1435
|
happens in nature):
|
@@ -1410,14 +1443,14 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1410
1443
|
Then continue from here:
|
1411
1444
|
|
1412
1445
|
https://people.duke.edu/~ccc14/pcfb/biopython/BiopythonSequences.html
|
1413
|
-
|
1446
|
+
------------------------------------------------------------------------------------------
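Translating up to the first in-frame stop codon, as described in (58), could look roughly like this. The sketch builds the standard genetic code (NCBI table 1) from its 64-letter string and stops at the first `*`; unknown codons become `X`. All names are hypothetical and not Bioroebe API.

```ruby
# Standard genetic code (NCBI transl_table=1), codons in T, C, A, G order.
STANDARD_AAS    = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
STANDARD_CODONS = %w[T C A G].product(%w[T C A G], %w[T C A G]).map(&:join)
STANDARD_TABLE  = STANDARD_CODONS.zip(STANDARD_AAS.chars).to_h.freeze

# Translates frame 1 and stops before the first stop codon.
# A trailing incomplete codon is ignored.
def translate_to_stop(nucleotides)
  protein = +''
  nucleotides.upcase.tr('U', 'T').scan(/.{3}/) do |codon|
    aminoacid = STANDARD_TABLE.fetch(codon, 'X')
    break if aminoacid == '*'
    protein << aminoacid
  end
  protein
end

puts translate_to_stop('ATGGCCTAAGGG') # => MA
```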
|
1414
1447
|
(59) → Add:
|
1415
1448
|
|
1416
1449
|
set_dna :Ubiquitin
|
1417
1450
|
set_dna :ubiquitin
|
1418
1451
|
|
1419
1452
|
^^^ we want to obtain the ubiquitin sequence
|
1420
|
-
|
1453
|
+
------------------------------------------------------------------------------------------
|
1421
1454
|
(59) → Telomeres
|
1422
1455
|
|
1423
1456
|
Telomeres are listed from 5' to 3'.
|
@@ -1431,28 +1464,28 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1431
1464
|
doc_telomeres
|
1432
1465
|
|
1433
1466
|
^^^ add this to say the human telomere sequence
|
1434
|
-
|
1467
|
+
------------------------------------------------------------------------------------------
|
1435
1468
|
(60) → ORF_positions?
|
1436
1469
|
^^^ change this a bit, to actually show the positions
|
1437
1470
|
of the various ORFs with the start-position.
|
1438
|
-
|
1471
|
+
------------------------------------------------------------------------------------------
|
1439
1472
|
(62) → add:
|
1440
1473
|
|
1441
1474
|
setgene2
|
1442
1475
|
add_dna2
|
1443
1476
|
dna2
|
1444
1477
|
dna? <--- this one is not a setter but a query.
|
1445
|
-
|
1478
|
+
------------------------------------------------------------------------------------------
|
1446
1479
|
(63) → improve the TM calculation. must be better, must have more
|
1447
1480
|
documentation, and a small tutorial.
|
1448
|
-
|
1481
|
+
------------------------------------------------------------------------------------------
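As a starting point for the Tm entry above, the simplest estimate is the Wallace rule, Tm = 2 °C × (A+T) + 4 °C × (G+C). It is only a rough approximation for short oligonucleotides (roughly below 14 nt) and ignores salt and oligo concentration; a better implementation would use nearest-neighbour parameters. The method name is hypothetical.

```ruby
# Wallace rule: 2 degrees per A/T and 4 degrees per G/C.
# Rough estimate for short oligos only.
def wallace_tm(oligo)
  sequence = oligo.upcase
  at = sequence.count('AT') # String#count counts characters from the set
  gc = sequence.count('GC')
  2 * at + 4 * gc
end

puts wallace_tm('ATGCATGCATGC') # => 36
```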
|
1449
1482
|
(64) → Compare bioroebe to:
|
1450
1483
|
|
1451
1484
|
https://www.ncbi.nlm.nih.gov/orffinder
|
1452
1485
|
|
1453
1486
|
whether both return the same
|
1454
1487
|
also possibly add a web-gui
|
1455
|
-
|
1488
|
+
------------------------------------------------------------------------------------------
|
1456
1489
|
(65) → Find out ratios from:
|
1457
1490
|
|
1458
1491
|
Doolittle RF. 1989. Redundancies in protein sequences. I
|
@@ -1478,16 +1511,16 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1478
1511
|
Bioroebe::Blosum[50] as an API.
|
1479
1512
|
and document it in general.
|
1480
1513
|
|
1481
|
-
|
1514
|
+
------------------------------------------------------------------------------------------
|
1482
1515
|
(65) → http://www.biomart.org/other/user-docs.pdf
|
1483
1516
|
^^^ work through this
|
1484
|
-
|
1517
|
+
------------------------------------------------------------------------------------------
|
1485
1518
|
(66) → add:
|
1486
1519
|
|
1487
1520
|
class Cell
|
1488
1521
|
^^^ simulate a cell
|
1489
1522
|
Hmmm. Needs specific components ... and needs a better plan.
|
1490
|
-
|
1523
|
+
------------------------------------------------------------------------------------------
|
1491
1524
|
(68) → class Protein:
|
1492
1525
|
|
1493
1526
|
add glycosylation pattern
|
@@ -1496,18 +1529,18 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1496
1529
|
need to somehow add the modification type
|
1497
1530
|
|
1498
1531
|
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5358406/
|
1499
|
-
|
1532
|
+
------------------------------------------------------------------------------------------
|
1500
1533
|
(69) → In the BioShell we must be able to do probes - complementary
|
1501
1534
|
to amino acids.
|
1502
|
-
|
1535
|
+
------------------------------------------------------------------------------------------
|
1503
1536
|
(70) → Add www-related functionality to bioroebe; eventually make use
|
1504
1537
|
of rails, but start with sinatra possibly. In the long run,
|
1505
1538
|
make it flexible to work with as many different frameworks
|
1506
1539
|
as possible, though.
|
1507
|
-
|
1540
|
+
------------------------------------------------------------------------------------------
|
1508
1541
|
(71) → Show cleavage sites, for example for a lambda-DNA digest with
|
1509
1542
|
BglII.
|
1510
|
-
|
1543
|
+
------------------------------------------------------------------------------------------
|
1511
1544
|
(72) → dnaanalyze
|
1512
1545
|
|
1513
1546
|
In the DNA string `TCCGTCGCAACACATCGCCTCAACAAACCGACCGGGATATGCAATACCGGAATCCGATCCTTTAGAAGCTGCATTCCAAACGCTTGCAATAACACCCACTCGACTATTCAGCATTGGCAAAGGGTACGAATTCGACGAAGGGAGGGTGCTATATTTTCCAAGTTGCTCGCCGATTGATACGGAGCCTGTGGAAAGATTTCGCGGCTCTAGTCTTTAGCTTTGATGTCACCCCTGAGTAGTAACCCGGCGTGGTAGCTTTCATTAGACTTCTCGGAGAGAGTATTAAGCAAAGGTGGAGGTCCCAGGGGTCCAGTGAGCTGTATCGCACTAAAAGCATGCCTACGGGCAATGCTATTTTGCTCACAGGAACTTTGGGGGAGCCACAAACTCTCGAAGCCGGATTGTTGTGGCGGCTAACTTTCCAAAGGCGACCATTCATGGTCTGAATGGGCCCTCACCAGAAGAACGTTTTCGACGGGCATTCTTCCCCGGGGTTTCGAAGGCAAGGGTCAGCACGGCGCGGAAAAGTACGCGACGCATACCGGACTAGTCATGCAACTCCCTCGGAACTGGCGATTCCCACCCAAGAGACGCACGCTGATCATTGCCCATGCCGACTGGAGATGCTGAATTTGGTATGCGGGTCTGTTGCCAGCGCTGACATTATCGGACATTGTGGGGAGAACCGTGTGATTGATTGAGCTGGCGCATTTGTCCGCATGCTCTCCTCATGTGGACACCTTCGCAGGTTCTTTCCGCGGCCACAGTGTCGGGATCTACCCCTGGTGCGTCGCCGCGAGTACAGGTGGGGTTTCGCGCATGAGAACCAATGTTGCACGCCTCAAAACATGGCTGTAACATATTAGCGCCAATAAAAATTTTTGGCAACAAAGAAACAAGGCCAACCGAAGTGCTAAGCCGCGATCATGAAGGGGCGATGCCAGAATGGGAGTCTGCCTTTCCTGTGTGGACGTGAGATTGTACCTAGACAGAGAACGCC` we found these Nucleotides:
|
@@ -1532,11 +1565,11 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1532
1565
|
we need to make it so that an input sequence
|
1533
1566
|
can be assigned, and dnaanalyse --GUI should
|
1534
1567
|
start it too. ALSO document it once this works.
|
1535
|
-
|
1568
|
+
------------------------------------------------------------------------------------------
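The nucleotide report that dnaanalyze prints for an assigned sequence boils down to a per-character count. A minimal sketch, with a hypothetical method name that is not the Bioroebe implementation:

```ruby
# Counts how often each nucleotide occurs in the given sequence.
def count_nucleotides(sequence)
  counts = Hash.new(0)
  sequence.upcase.each_char { |nucleotide| counts[nucleotide] += 1 }
  counts
end

count_nucleotides('TCCGTCGCAACACATCG').sort.each do |nucleotide, amount|
  puts "#{nucleotide}: #{amount}"
end
```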
|
1536
1569
|
(73) → go through the individual components slowly and improve them,
|
1537
1570
|
step by step, including the documentation. Then eventually
|
1538
1571
|
remove this todo-entry here.
|
1539
|
-
|
1572
|
+
------------------------------------------------------------------------------------------
|
1540
1573
|
(74) → Add a consensus sequence for:
|
1541
1574
|
|
1542
1575
|
Asn-X-Ser/Thr-Consensus
|
@@ -1548,13 +1581,13 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1548
1581
|
NGlyc
|
1549
1582
|
/N-?Glyc/i
|
1550
1583
|
^^^ use that regex
|
1551
|
-
|
1584
|
+
------------------------------------------------------------------------------------------
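The Asn-X-Ser/Thr entry above involves two different patterns: the `/N-?Glyc/i` regex matches the name the user types, while the sequon itself has to be searched in the protein. A sketch under the common assumption that X may be any residue except proline; the names are hypothetical.

```ruby
# Matches user input such as "NGlyc" or "n-glyc" (regex from the todo entry).
NGLYC_NAME   = /N-?Glyc/i
# Assumption: N-glycosylation sequon Asn-X-Ser/Thr with X != Pro.
NGLYC_SEQUON = /N[^P][ST]/

# Returns the 1-based positions of every sequon, including overlapping ones.
def n_glyc_sequon_positions(aminoacids)
  positions = []
  offset = 0
  while (index = aminoacids.index(NGLYC_SEQUON, offset))
    positions << index + 1
    offset = index + 1
  end
  positions
end

p n_glyc_sequon_positions('MNGTNPSANST') # => [2, 9]
```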
|
1552
1585
|
(74) → make sure that newly generated files respect the
|
1553
1586
|
default chmod value on the system. from bioroebe.
|
1554
1587
|
right now we default to 755 which I assume is
|
1555
1588
|
hardcoded but perhaps this is wrong.
|
1556
1589
|
|
1557
|
-
|
1590
|
+
------------------------------------------------------------------------------------------
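Regarding the chmod entry above: on POSIX systems a file created without an explicit mode gets `0666 & ~umask`, so one way to respect the system default is simply not to pass or force a mode such as 755. A sketch with a hypothetical helper name; behaviour on Windows differs.

```ruby
require 'tmpdir'

# Hypothetical helper: write a new file without an explicit mode, so the
# operating system applies 0666 & ~umask, then report the resulting mode.
def write_with_default_mode(path, content)
  File.write(path, content)
  File.stat(path).mode & 0o777
end

Dir.mktmpdir do |dir|
  mode = write_with_default_mode(File.join(dir, 'example.fasta'), ">test\nATGC\n")
  puts format('umask %04o -> file mode %04o', File.umask, mode)
end
```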
|
1558
1591
|
(75) → require 'bio'
|
1559
1592
|
|
1560
1593
|
# creating a Bio::Sequence::NA object containing ambiguous alphabets
|
@@ -1579,34 +1612,34 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1579
1612
|
part into a standalone file
|
1580
1613
|
so that it can be used by both the .cgi and
|
1581
1614
|
well rdoc...
|
1582
|
-
|
1615
|
+
------------------------------------------------------------------------------------------
|
1583
1616
|
- Add more protein-specific thingies to bioroebe.
|
1584
|
-
|
1617
|
+
------------------------------------------------------------------------------------------
|
1585
1618
|
- Push the bioshell forward and work through std_biology.rb.
|
1586
1619
|
Perhaps we can move some of it out into a class
|
1587
1620
|
or so.
|
1588
1621
|
|
1589
1622
|
All of this should also be linked with Webmin (biomin), so that
|
1590
1623
|
we can also use the BioShell elegantly over the www!
|
1591
|
-
|
1624
|
+
------------------------------------------------------------------------------------------
|
1592
1625
|
- ^^^ when we find restriction enzyme sites in a DNA
|
1593
1626
|
string, colourize them RED.
|
1594
1627
|
|
1595
1628
|
also set it to
|
1596
1629
|
set_restriction_size()
|
1597
|
-
|
1630
|
+
------------------------------------------------------------------------------------------
|
1598
1631
|
- also ... while learning C++ we extend the project here...
|
1599
1632
|
Useful C++ things will be combined.
|
1600
|
-
|
1633
|
+
------------------------------------------------------------------------------------------
|
1601
1634
|
- As of April 2003, there were 176,890 total taxa represented.
|
1602
1635
|
|
1603
1636
|
^^^ we need a way to also output how many entries we
|
1604
1637
|
have there.
|
1605
|
-
|
1638
|
+
------------------------------------------------------------------------------------------
|
1606
1639
|
- Replace bioruby with bioroebe completely!
|
1607
1640
|
In order for this to work, we first need to find out
|
1608
1641
|
what bioruby is able to do. :P
|
1609
|
-
|
1642
|
+
------------------------------------------------------------------------------------------
|
1610
1643
|
- append 33
|
1611
1644
|
# ^^^ in the bioshell
|
1612
1645
|
Only numbers were given: Adding 33 random nucleotides to the main string next.
|
@@ -1626,7 +1659,7 @@ Did you mean? return_random_codon_sequence_for_this_aminoacid_sequence
|
|
1626
1659
|
|
1627
1660
|
|
1628
1661
|
^^^^^ BUG!
|
1629
|
-
|
1662
|
+
------------------------------------------------------------------------------------------
|
1630
1663
|
> rest?
|
1631
1664
|
|
1632
1665
|
We found these restriction sites within the sequence `TTCAGAACTCAACGCCTGGTTGGCCGTCCAGTAAGCTGACTAAGTAAGTCTATGCCCGCGATAACCAGGATACAGATATCGTGAAACCTGGTTTATCTCCTTCTATAAGAGTCTGCACATCTAGC`:
|
@@ -1656,7 +1689,7 @@ We found these restriction sites within the sequence `TTCAGAACTCAACGCCTGGTTGGCCG
|
|
1656
1689
|
^^^^ also show the position
|
1657
1690
|
|
1658
1691
|
|
1659
|
-
|
1692
|
+
------------------------------------------------------------------------------------------
|
1660
1693
|
|
1661
1694
|
PMID entries are:
|
1662
1695
|
|
@@ -1728,7 +1761,7 @@ We found these restriction sites within the sequence `TTCAGAACTCAACGCCTGGTTGGCCG
|
|
1728
1761
|
|
1729
1762
|
|
1730
1763
|
|
1731
|
-
|
1764
|
+
------------------------------------------------------------------------------------------
|
1732
1765
|
In the database search, the measured masses are compared with the peptide masses
|
1733
1766
|
of all proteins or genes in a database (NCBI, UniProt). For this, DNA
|
1734
1767
|
sequences are translated into protein sequences and cut in silico with the
|
@@ -1738,7 +1771,7 @@ protease used for the digest.
|
|
1738
1771
|
|
1739
1772
|
|
1740
1773
|
|
1741
|
-
|
1774
|
+
------------------------------------------------------------------------------------------
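The in silico digest step described above can be sketched as follows. The protease is not named in the note, so trypsin is used here purely as an assumed example (cleavage C-terminal of K or R, but not before P); the peptide masses would then be computed per fragment. Names are hypothetical.

```ruby
# Assumption: trypsin as the example protease. It cuts after K or R,
# except when the next residue is P. Other proteases need other rules.
TRYPSIN_CLEAVAGE = /(?<=[KR])(?!P)/

# Splits a protein sequence into peptides at every cleavage position.
def in_silico_digest(protein, cleavage_pattern = TRYPSIN_CLEAVAGE)
  protein.upcase.split(cleavage_pattern)
end

p in_silico_digest('AKRPGKC') # => ["AK", "RPGK", "C"]
```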
|
1742
1775
|
Complexity of libraries:
|
1743
1776
|
How many independent clones are necessary to represent a genome (plant,
|
1744
1777
|
animal/fungus) or how many such clones have to be screened to have realistic
|
@@ -1773,7 +1806,7 @@ have to be hybridized.
|
|
1773
1806
|
|
1774
1807
|
|
1775
1808
|
|
1776
|
-
|
1809
|
+
------------------------------------------------------------------------------------------
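The library-complexity question above is commonly answered with the Clarke-Carbon formula, N = ln(1 - P) / ln(1 - f), where P is the desired probability of finding a given sequence and f is the insert size divided by the genome size. A sketch with a hypothetical method name; the example numbers are illustrative and not taken from the notes.

```ruby
# Clarke-Carbon estimate: number of independent clones needed so that any
# given sequence is present with the requested probability.
def clones_needed(probability, insert_size, genome_size)
  fraction = insert_size.to_f / genome_size
  (Math.log(1 - probability) / Math.log(1 - fraction)).ceil
end

# Illustrative values: 99% probability, 20 kb inserts, 3 Gb genome.
puts clones_needed(0.99, 20_000, 3_000_000_000)
```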
|
1777
1810
|
|
1778
1811
|
BIO SHELL> BglI?
|
1779
1812
|
|
@@ -1818,12 +1851,12 @@ List all enzymes that produce compatible ends for the enzyme.
|
|
1818
1851
|
http://biopython.org/DIST/docs/api/Bio.Restriction.Restriction.Blunt-class.html
|
1819
1852
|
|
1820
1853
|
|
1821
|
-
|
1854
|
+
------------------------------------------------------------------------------------------
|
1822
1855
|
https://www.reddit.com/r/bioinformatics/comments/5o3kn8/bioinformatics_contest_2017_jan_23rd29th_solve_as/
|
1823
|
-
|
1856
|
+
------------------------------------------------------------------------------------------
|
1824
1857
|
(1) → Finish all of biophp integration into bioroebe.
|
1825
1858
|
http://www.biophp.org/
|
1826
|
-
|
1859
|
+
------------------------------------------------------------------------------------------
|
1827
1860
|
|
1828
1861
|
locate oriC here:
|
1829
1862
|
|
@@ -1858,13 +1891,13 @@ But I do not know how to locate ORIs.
|
|
1858
1891
|
|
1859
1892
|
|
1860
1893
|
|
1861
|
-
|
1894
|
+
------------------------------------------------------------------------------------------
|
1862
1895
|
^^^ also integrate git into bioroebe.
|
1863
|
-
|
1896
|
+
------------------------------------------------------------------------------------------
|
1864
1897
|
WE HAVE TO IMPROVE THIS EXTREMELY.
|
1865
1898
|
|
1866
1899
|
THEN UPLOAD IT AND USE IT AS THE BASIS FOR APPLICATIONS.
|
1867
|
-
|
1900
|
+
------------------------------------------------------------------------------------------
|
1868
1901
|
|
1869
1902
|
Study MetaCyc
|
1870
1903
|
^^^ study metabolic pathways.
|
@@ -1873,7 +1906,7 @@ http://metacyc.org/
|
|
1873
1906
|
|
1874
1907
|
→ Create KuroMetaCyc, in Analogy towards Metabolic Cycle.
|
1875
1908
|
|
1876
|
-
|
1909
|
+
------------------------------------------------------------------------------------------
|
1877
1910
|
|
1878
1911
|
Welcome to BioShell May 2012. Type "help" to get some help.
|
1879
1912
|
|
@@ -1895,7 +1928,7 @@ When we type this, we then ask:
|
|
1895
1928
|
|
1896
1929
|
|
1897
1930
|
|
1898
|
-
|
1931
|
+
------------------------------------------------------------------------------------------
|
1899
1932
|
|
1900
1933
|
http://biopython.org/DIST/docs/cookbook/Restriction.html#mozTocId101269
|
1901
1934
|
|
@@ -1985,16 +2018,16 @@ ausreichend.
|
|
1985
2018
|
|
1986
2019
|
|
1987
2020
|
|
1988
|
-
|
2021
|
+
------------------------------------------------------------------------------------------
|
1989
2022
|
BioTodo - GENESIS, science fiction.
|
1990
2023
|
|
1991
2024
|
- create virus(:which_one, :amount) # Note the difference to the below
|
1992
2025
|
- create hydra(:amount)
|
1993
2026
|
- create bread
|
1994
|
-
|
2027
|
+
------------------------------------------------------------------------------------------
|
1995
2028
|
→ both
|
1996
2029
|
^ should work, does not work right now.
|
1997
|
-
|
2030
|
+
------------------------------------------------------------------------------------------
|
1998
2031
|
→ Taxonomy is now integrated into bioroebe. This is good but we need more
|
1999
2032
|
documentation, some more tests, a rethinking of the layout and the
|
2000
2033
|
structures, and a fixing of the query-part of the database.
|
@@ -2008,13 +2041,13 @@ ausreichend.
|
|
2008
2041
|
at about the same time \o/
|
2009
2042
|
AND document this related-problems too
|
2010
2043
|
Integrate this some other day...
|
2011
|
-
|
2044
|
+
------------------------------------------------------------------------------------------
|
2012
2045
|
- http://www.restrictionmapper.org/cgi-bin/sitefind3.pl
|
2013
2046
|
|
2014
2047
|
^^^ This functionality should be integrated, so that
|
2015
2048
|
ALL restriction enzymes are tried, starting from
|
2016
2049
|
a given sequence.
|
2017
|
-
|
2050
|
+
------------------------------------------------------------------------------------------
|
2018
2051
|
→ A search is essentially substring search across a database of strings
|
2019
2052
|
(albeit with a smaller alphabet). Some common use cases: one,
|
2020
2053
|
scientists will search for certain genes that they've used in engineered
|
@@ -2033,13 +2066,13 @@ ausreichend.
|
|
2033
2066
|
Bioroebe::DetermineOptimalCodons
|
2034
2067
|
^^^ this is currently incomplete.
|
2035
2068
|
|
2036
|
-
|
2069
|
+
------------------------------------------------------------------------------------------
|
2037
2070
|
→ Redo restriction enzymes completely.
|
2038
2071
|
And polish this a LOT.
|
2039
2072
|
This may take some days. But we want this to be REALLY good and
|
2040
2073
|
lasting for a long time.
|
2041
2074
|
Need to keep on working at that!
|
2042
|
-
|
2075
|
+
------------------------------------------------------------------------------------------
|
2043
2076
|
→ Add: average_aminoacid_weight?
|
2044
2077
|
|
2045
2078
|
|
@@ -2077,7 +2110,7 @@ end
|
|
2077
2110
|
→ We must be able to align not only nucleotides but also aminoacids.
|
2078
2111
|
But where is the alignment comparer? perhaps hamming distance?
|
2079
2112
|
hmm we have to see.
|
2080
|
-
|
2113
|
+
------------------------------------------------------------------------------------------
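If the Hamming distance is chosen as the comparer mentioned above, the same code works for nucleotides and amino acids, since it only compares characters position by position. It requires sequences of equal length and does not handle gaps, so it is not a real alignment. The method name is hypothetical.

```ruby
# Number of positions at which two equally long sequences differ.
def hamming_distance(first, second)
  raise ArgumentError, 'sequences must have equal length' unless first.size == second.size
  first.chars.zip(second.chars).count { |a, b| a != b }
end

puts hamming_distance('GATTACA', 'GACTATA') # => 2 (nucleotides)
puts hamming_distance('MKV', 'MRV')         # => 1 (amino acids)
```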
|
2081
2114
|
→ /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/menu.rb:311:in `menu': undefined method `upcase' for ["EcoRI"]:Array (NoMethodError)
|
2082
2115
|
from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/user_input.rb:31:in `block in enter_main_loop'
|
2083
2116
|
from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/user_input.rb:12:in `loop'
|
@@ -2106,12 +2139,12 @@ end
|
|
2106
2139
|
at this date.'
|
2107
2140
|
SendEmail.new to: Roebe.email?, data
|
2108
2141
|
|
2109
|
-
|
2142
|
+
------------------------------------------------------------------------------------------
|
2110
2143
|
|
2111
2144
|
|
2112
2145
|
→ Document which parts of emboss have already been copied.
|
2113
2146
|
→ EMBOSS.md
|
2114
|
-
|
2147
|
+
------------------------------------------------------------------------------------------
|
2115
2148
|
|
2116
2149
|
|
2117
2150
|
|
@@ -2168,7 +2201,7 @@ Traceback (most recent call last):
|
|
2168
2201
|
|
2169
2202
|
http://www.snapgene.com/products/snapgene_viewer/
|
2170
2203
|
|
2171
|
-
|
2204
|
+
------------------------------------------------------------------------------------------
|
2172
2205
|
(1) → We should support GFP tagging, i.e. what the
|
2173
2206
|
protein construct should look like and so on.
|
2174
2207
|
This partially works...
|
@@ -2177,22 +2210,22 @@ Traceback (most recent call last):
|
|
2177
2210
|
inserts the sequence as the main DNA sequence.
|
2178
2211
|
What is missing? Hmmmm... possibly more
|
2179
2212
|
documentation.
|
2180
|
-
|
2213
|
+
------------------------------------------------------------------------------------------
|
2181
2214
|
|
2182
2215
|
- in bioroebe, create subsequences for siRNA, then scan for
|
2183
2216
|
submatcher + report where these are. Should be fast too.
|
2184
|
-
|
2217
|
+
------------------------------------------------------------------------------------------
|
2185
2218
|
- Reverse complement now works quite well, also via the sinatra
|
2186
2219
|
interface. We still should have a way to show 5' and
|
2187
2220
|
3', both on the commandline, and via sinatra.
|
2188
2221
|
Perhaps via --fancy commandline flag or so.
|
2189
|
-
|
2222
|
+
------------------------------------------------------------------------------------------
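Showing the 5' and 3' ends alongside the reverse complement, as requested above, can be sketched like this. Both method names are hypothetical; how a `--fancy` flag would trigger the labelled output is left open.

```ruby
# Reverse complement of a DNA sequence (A<->T, G<->C).
def reverse_complement(dna)
  dna.upcase.reverse.tr('ATGC', 'TACG')
end

# Wraps a sequence in 5'/3' labels for display.
def with_strand_labels(sequence)
  "5'-#{sequence}-3'"
end

puts with_strand_labels(reverse_complement('ATGC')) # => 5'-GCAT-3'
```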
|
2190
2223
|
- Cn3D files?
|
2191
2224
|
^^^ add support for these; research what they are, too.
|
2192
|
-
|
2225
|
+
------------------------------------------------------------------------------------------
|
2193
2226
|
- Consider adding graphviz, perhaps to the taxonomy project
|
2194
2227
|
where we make graphs towards different nodes or so...
|
2195
|
-
|
2228
|
+
------------------------------------------------------------------------------------------
|
2196
2229
|
- in parse fasta
|
2197
2230
|
@colourize_sequence = false
|
2198
2231
|
^^^ change this later on...
|
@@ -2200,7 +2233,7 @@ Traceback (most recent call last):
|
|
2200
2233
|
this method now exists, but we still have to make
|
2201
2234
|
the check better whether it is a protein or a DNA/RNA
|
2202
2235
|
add a toplevel method for this.
|
2203
|
-
|
2236
|
+
------------------------------------------------------------------------------------------
|
2204
2237
|
- clone the BLAST ident matcher functionality for amino acids into
|
2205
2238
|
Bioroebe.
|
2206
2239
|
|
@@ -2215,7 +2248,7 @@ Traceback (most recent call last):
|
|
2215
2248
|
|
2216
2249
|
|
2217
2250
|
|
2218
|
-
|
2251
|
+
------------------------------------------------------------------------------------------
|
2219
2252
|
- Be able to mark exon/intron boundaries.
|
2220
2253
|
|
2221
2254
|
- Add "taxid?" to tell us the name of the organism. This works now.
|
@@ -2259,9 +2292,9 @@ Traceback (most recent call last):
|
|
2259
2292
|
|
2260
2293
|
^^^
|
2261
2294
|
study sumoplot ...
|
2262
|
-
|
2295
|
+
------------------------------------------------------------------------------------------
|
2263
2296
|
- http://a-little-book-of-r-for-bioinformatics.readthedocs.io/en/latest/src/chapter7.html
|
2264
|
-
|
2297
|
+
------------------------------------------------------------------------------------------
|
2265
2298
|
- http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc22
|
2266
2299
|
^^^ continue here; "You can also specify the table using the
|
2267
2300
|
NCBI table number which is shorter, and often included in
|
@@ -2269,7 +2302,7 @@ Traceback (most recent call last):
|
|
2269
2302
|
|
2270
2303
|
^^^ work through this and see if it is good.
|
2271
2304
|
|
2272
|
-
|
2305
|
+
------------------------------------------------------------------------------------------
|
2273
2306
|
|
2274
2307
|
- Clone ALL of biophp, if it is useful.
|
2275
2308
|
|
@@ -2316,7 +2349,7 @@ Palindromic sequences finder
|
|
2316
2349
|
We should also put this part into a doc/ subsection
|
2317
2350
|
to keep track of what is missing and what is not.
|
2318
2351
|
|
2319
|
-
|
2352
|
+
------------------------------------------------------------------------------------------
|
2320
2353
|
(1) → sizeseq
|
2321
2354
|
|
2322
2355
|
^^^ clone this functionality and describe it in detail.
|
@@ -2353,7 +2386,7 @@ foobar.fasta
|
|
2353
2386
|
|
2354
2387
|
ALSO ADD A GUI; sizeseq.rb was added in February 2021.
|
2355
2388
|
|
2356
|
-
|
2389
|
+
------------------------------------------------------------------------------------------
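The core of the sizeseq functionality discussed above is sorting FASTA records by sequence length. The following is a minimal sketch with hypothetical method names; it is not the sizeseq.rb that ships with Bioroebe, and it keeps only the header text and the joined sequence.

```ruby
# Parses FASTA text into { header => sequence }.
def parse_fasta(text)
  records = {}
  header = nil
  text.each_line do |line|
    line = line.strip
    next if line.empty?
    if line.start_with?('>')
      header = line[1..-1]
      records[header] = +''
    elsif header
      records[header] << line
    end
  end
  records
end

# Returns the records ordered by sequence length, shortest first by default.
def sort_by_size(records, descending: false)
  sorted = records.sort_by { |_header, sequence| sequence.size }
  sorted.reverse! if descending
  sorted.to_h
end

fasta = ">a\nATGC\nAT\n>b\nAT\n>c\nATG\n"
sort_by_size(parse_fasta(fasta)).each do |header, sequence|
  puts "#{header}\t#{sequence.size}"
end
```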
|
2357
2390
|
- In the sinatra-web-interface for Bioroebe:
|
2358
2391
|
continue quiz in rosalind !!!
|
2359
2392
|
also, at to_dna: default to RNA
|
@@ -2372,8 +2405,8 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
|
|
2372
2405
|
→ formatted_view
|
2373
2406
|
111^^^^ in ncbi format
|
2374
2407
|
and document all of this.
|
2375
|
-
|
2376
|
-
|
2408
|
+
------------------------------------------------------------------------------------------
|
2409
|
+
------------------------------------------------------------------------------------------
|
2377
2410
|
- Add ruby-GUI stuff; probably the old biology/ subsection
|
2378
2411
|
will be moved into the project.
|
2379
2412
|
|
@@ -2470,7 +2503,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
|
|
2470
2503
|
|
2471
2504
|
|
2472
2505
|
|
2473
|
-
|
2506
|
+
------------------------------------------------------------------------------------------
|
2474
2507
|
- Identifying amino acid cleavage sites (Sigcleave)
|
2475
2508
|
|
2476
2509
|
For amino acid sequences we may be interested to know whether
|
@@ -2533,29 +2566,22 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
|
|
2533
2566
|
^^^ there is a bug here. Try again later.
|
2534
2567
|
|
2535
2568
|
|
2536
|
-
- We will read from NM_001180897.3_Saccharomyces_cerevisiae_S288c_Aga2p_AGA2.fasta
|
2537
|
-
|
2538
|
-
The file NM_001180897.3_Saccharomyces_cerevisiae_S288c_Aga2p_AGA2.fasta has this FASTA header:
|
2539
|
-
|
2540
|
-
>gi|398364826|ref|NM_001180897.3| Saccharomyces cerevisiae S288c Aga2p (AGA2), mRNA
|
2541
2569
|
|
2542
|
-
^^^ this should also (optionally) tell us the organism, via a switch.
|
2543
|
-
for this we need some way to return the taxonomic ID of an organism
|
2544
2570
|
|
2545
2571
|
- we have to add expasy...
|
2546
2572
|
functionality to the cmdline too.
|
2547
2573
|
Which one specifically? Let's see...
|
2548
2574
|
|
2549
2575
|
https://www.expasy.org/
|
2550
|
-
|
2576
|
+
------------------------------------------------------------------------------------------
|
2551
2577
|
- https://biopython.org/wiki/Category%3ACookbook
|
2552
2578
|
^^^ clone that
|
2553
|
-
|
2579
|
+
------------------------------------------------------------------------------------------
|
2554
2580
|
- include covid genome, and begin to analyse it in bioroebe
|
2555
2581
|
"The genome of SARS-CoV-2 is said to be twice as large as that
|
2556
2582
|
of influenza viruses, which is why the latter appear to mutate
|
2557
2583
|
four times as fast, wrote Moshiri."
|
2558
|
-
|
2584
|
+
------------------------------------------------------------------------------------------
|
2559
2585
|
- Look at the GUIs that are part of the BioRoebe project.
|
2560
2586
|
|
2561
2587
|
Polish these parts, at least one widget, then
|
@@ -2570,7 +2596,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
|
|
2570
2596
|
|
2571
2597
|
Hmmm. And then, also consider transitioning into gtk3,
|
2572
2598
|
and make more screenshots.
|
2573
|
-
|
2599
|
+
------------------------------------------------------------------------------------------
|
2574
2600
|
|
2575
2601
|
- https://www.ebi.ac.uk/Tools/seqstats/emboss_pepstats/
|
2576
2602
|
http://www.ebi.ac.uk/Tools/services/web/toolresult.ebi?jobId=emboss_pepstats-I20160208-020243-0564-53154194-oy
|
@@ -2582,7 +2608,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
|
|
2582
2608
|
- Improve on temperature content and how it is calculated
|
2583
2609
|
|
2584
2610
|
someone googled for it in 2014 so build on it
|
2585
|
-
|
2611
|
+
------------------------------------------------------------------------------------------
|
2586
2612
|
- pfasta /Depot/Temp/bioroebe/NM_000539.3_Homo_sapiens_rhodopsin_RHO.fasta
|
2587
2613
|
|
2588
2614
|
Will read from the file `/Depot/Temp/bioroebe/NM_000539.3_Homo_sapiens_rhodopsin_RHO.fasta`.
|
@@ -2593,7 +2619,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
|
|
2593
2619
|
Now assigning aminoacid sequence to:
|
2594
2620
|
AGAGTCATCCAGCTGGAGCCCTGAGTGGCTGAGCTCAGGCCTTCGCAG
|
2595
2621
|
AGAGTCATCCAGCTGGAGCCCTGAGTGGCTGAGCTCAGGCCTTCGCAG
|
2596
|
-
|
2622
|
+
------------------------------------------------------------------------------------------
|
2597
2623
|
|
2598
2624
|
|
2599
2625
|
- Formats
|
@@ -2647,7 +2673,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
|
|
2647
2673
|
tinyseq NCBI TinySeq XML
|
2648
2674
|
ztr ZTR tracefile ztr
|
2649
2675
|
|
2650
|
-
|
2676
|
+
------------------------------------------------------------------------------------------
|
2651
2677
|
(1) Look at f1 display:
|
2652
2678
|
|
2653
2679
|
|
@@ -2670,7 +2696,7 @@ we probably have to rewrite the whole thing
|
|
2670
2696
|
BEFORE we add ANY COLOURS.
|
2671
2697
|
OH WELL.
|
2672
2698
|
|
2673
|
-
|
2699
|
+
------------------------------------------------------------------------------------------
|
2674
2700
|
(100) → Add a primer-design widget
|
2675
2701
|
|
2676
2702
|
The idea is to be able to manipulate forward and
|
@@ -2684,7 +2710,7 @@ perfect but it is a start.
|
|
2684
2710
|
https://www.bioinformatics.nl/molbi/SCLResources/sequence_notation.htm
|
2685
2711
|
^^^ and check what is useful there. perhaps also add
|
2686
2712
|
nicer visual cues to pretty it up a bit.
|
2687
|
-
|
2713
|
+
------------------------------------------------------------------------------------------
|
2688
2714
|
(1) → Compare bioroebe to:
|
2689
2715
|
|
2690
2716
|
https://www.ncbi.nlm.nih.gov/orffinder
|
@@ -2694,18 +2720,18 @@ whether both return the same also possibly add a web-gui
|
|
2694
2720
|
check this... so that we can search in standard ORF
|
2695
2721
|
but also in different ORFs
|
2696
2722
|
and report the length, at least of the longest ORF (start + stop) ... so that the result is correct as well
|
2697
|
-
|
2723
|
+
------------------------------------------------------------------------------------------
|
2698
2724
|
test reverse complement in bioroebe
|
2699
2725
|
^^^
|
2700
2726
|
new_WWW/
|
2701
2727
|
^^^ this should eventually become the new web-related interface.
|
2702
2728
|
Ah well. Perhaps not ... ruby-cgi is soooooo annoying ...
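
For the "test reverse complement" note above, a minimal sketch of what such a check could compare against. The method name `reverse_complement` is a hypothetical stand-alone helper, not the Bioroebe API, and it assumes plain DNA input (A, T, G, C only).

```ruby
# Hypothetical reference helper (not the Bioroebe API).
# Assumes a plain DNA string; other characters pass through unchanged.
def reverse_complement(dna)
  # Reverse the strand, then swap each base for its pairing partner.
  dna.upcase.reverse.tr('ATGC', 'TACG')
end

puts reverse_complement('ATGCCG')
```

Applying the helper twice should return the original sequence, which makes for a cheap sanity test.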
|
2703
|
-
|
2729
|
+
------------------------------------------------------------------------------------------
|
2704
2730
|
(154) → the blosum-viewer should be supported in the cgi part
|
2705
2731
|
and sinatra part as well.
|
2706
2732
|
This now works for sinatra. Need to enable this for
|
2707
2733
|
the cgi-part too eventually.
|
2708
|
-
|
2734
|
+
------------------------------------------------------------------------------------------
|
2709
2735
|
(155) → port the sinatra stuff together in bioroebe
|
2710
2736
|
create a dir: web_api
|
2711
2737
|
^^^ also make params? usable in both sinatra and cgi page
|
@@ -2716,18 +2742,18 @@ Ah well. Perhaps not ... ruby-cgi is soooooo annoying ...
|
|
2716
2742
|
add tons of HtmlTemplate[]
|
2717
2743
|
and replace the ad-hoc code otherwise...
|
2718
2744
|
^^^ yeah, finish the HtmlTemplate stuff.
|
2719
|
-
|
2745
|
+
------------------------------------------------------------------------------------------
|
2720
2746
|
(1) → https://i.imgur.com/ptcSn12.png
|
2721
2747
|
^^^ enable such an overview; this shows mass computation, e.g.
|
2722
2748
|
peptide mass and such
|
2723
|
-
|
2749
|
+
------------------------------------------------------------------------------------------
|
2724
2750
|
(80) Bioroebe.sanitize_nucleotide_sequence
|
2725
2751
|
^^^ port this into java. The code has been written for this already,
|
2726
2752
|
but we currently fail to link it.
|
2727
|
-
|
2753
|
+
------------------------------------------------------------------------------------------
|
2728
2754
|
(81) Bioroebe.base_composition
|
2729
2755
|
^^^^^^^^^ port this into java
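
As a reference point for the port, a rough Ruby sketch of what a base-composition routine might compute. This is an assumption about the intended behaviour (percentage of each nucleotide), not the actual `Bioroebe.base_composition` implementation; the stand-alone method below is hypothetical.

```ruby
# Hypothetical sketch: percentage of each nucleotide in a sequence.
# Characters other than A, C, G, T and U are ignored.
def base_composition(sequence)
  counts = Hash.new(0)
  sequence.upcase.each_char { |c| counts[c] += 1 if 'ACGTU'.include?(c) }
  total = counts.values.sum
  # Sort by nucleotide letter and convert the counts into percentages.
  counts.sort.to_h.transform_values { |n| (n * 100.0 / total).round(2) }
end

p base_composition('AACG')
```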
|
2730
|
-
|
2756
|
+
------------------------------------------------------------------------------------------
|
2731
2757
|
(82) - work a bit more on tk!!!
|
2732
2758
|
in particular to start it from the bioshell as-is.
|
2733
2759
|
^^^ this is mostly done for quick
|
@@ -2740,20 +2766,20 @@ Ah well. Perhaps not ... ruby-cgi is soooooo annoying ...
|
|
2740
2766
|
hamming_distance [PARTIALLY IMPLEMENTED; ~80%]
|
2741
2767
|
protein_to_DNA
|
2742
2768
|
^^^^ improve both while improving tk_paradise docu as well.
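
For the hamming_distance widget, the underlying computation can be sketched as follows. This is a stand-alone illustration with a hypothetical method, not the code used in the Bioroebe widget; it assumes both sequences have the same length.

```ruby
# Hypothetical sketch: number of positions at which two
# equal-length sequences differ.
def hamming_distance(a, b)
  raise ArgumentError, 'sequences must have equal length' unless a.length == b.length
  # Pair up the characters and count the mismatching pairs.
  a.chars.zip(b.chars).count { |x, y| x != y }
end

puts hamming_distance('ATGC', 'ATGA')
```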
|
2743
|
-
|
2769
|
+
------------------------------------------------------------------------------------------
|
2744
2770
|
(83) Batch-create the .exe files on windows for libui, once
|
2745
2771
|
the first has been added. And then test it too
|
2746
2772
|
AND document it. This should be done with the controller
|
2747
2773
|
eventually. Once this works, we can remove this entry
|
2748
2774
|
here.
|
2749
|
-
|
2775
|
+
------------------------------------------------------------------------------------------
|
2750
2776
|
(84) port more libui stuff in bioroebe. We have two widgets ported so far;
|
2751
2777
|
add more such entries.
|
2752
|
-
|
2778
|
+
------------------------------------------------------------------------------------------
|
2753
2779
|
(85) after libui has been ported, explore how gosu works on windows.
|
2754
2780
|
if possible add things to a gosu-specific UI as well, but
|
2755
2781
|
we may need a common, unified GUI base for that.
|
2756
|
-
|
2782
|
+
------------------------------------------------------------------------------------------
|
2757
2783
|
(86)
|
2758
2784
|
|
2759
2785
|
add libui bindings AND once done make sure the controller works in
|
@@ -2762,22 +2788,22 @@ libui as well. Embed the various things into it.
|
|
2762
2788
|
Tab A set named tabs for placing items in
|
2763
2789
|
^^^ use this perhaps also in bioroebe hmmm
|
2764
2790
|
yeah.
|
2765
|
-
|
2791
|
+
------------------------------------------------------------------------------------------
|
2766
2792
|
(87) https://github.com/cnjinhao/nana/wiki/User-Works-using-Nana
|
2767
2793
|
|
2768
2794
|
^^^ port the "DNA hybrid"
|
2769
2795
|
https://camo.githubusercontent.com/4c27d554ca4d698d288628f21255f917c2c577e35d7e11dd67e21880d56b6b0a/687474703a2f2f6e616e6170726f2e6f72672f696d616765732f73637265656e73686f74732f746864795f7365715f6578706c2e706e67
|
2770
2796
|
|
2771
|
-
|
2797
|
+
------------------------------------------------------------------------------------------
|
2772
2798
|
(88) Bioroebe::Cell
|
2773
2799
|
^^^ think about what to do with it. If we don't need it then perhaps
|
2774
2800
|
we should just remove it. Think about this more in 2022, before
|
2775
2801
|
deciding what to do.
|
2776
|
-
|
2802
|
+
------------------------------------------------------------------------------------------
|
2777
2803
|
(89) - Add emboss cpgplot functionality.
|
2778
2804
|
|
2779
2805
|
https://www.bioinformatics.nl/cgi-bin/emboss/cpgplot
|
2780
|
-
|
2806
|
+
------------------------------------------------------------------------------------------
|
2781
2807
|
(90) - integrate calculation of the Instability index (II)
|
2782
2808
|
|
2783
2809
|
The instability index provides an estimate of the
|
@@ -2815,9 +2841,9 @@ that the protein may be unstable.
|
|
2815
2841
|
The instability index (II) is computed to be 65.43
|
2816
2842
|
This classifies the protein as unstable.
|
2817
2843
|
|
2818
|
-
|
2844
|
+
------------------------------------------------------------------------------------------
|
2819
2845
|
(1) → We have now added a method to show all hydrophobic amino acids, via the
|
2820
2846
|
method .hydrophobic_amino_acids?. This works and has been documented
|
2821
2847
|
in May 2022. However, we still need a way to PREDICT
|
2822
2848
|
hydrophobic segments in a polypeptide sequence.
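
One possible approach for the prediction part is a sliding-window average over the Kyte-Doolittle hydropathy scale. The sketch below is only an illustration: the method `hydrophobic_segments`, its `window` and `threshold` parameters, and the returned format are assumptions and do not exist in Bioroebe. The defaults (window of 19, mean above 1.6) follow the commonly cited guideline for transmembrane stretches.

```ruby
# Kyte-Doolittle hydropathy values per amino acid (one-letter code).
KYTE_DOOLITTLE = {
  'A' => 1.8,  'R' => -4.5, 'N' => -3.5, 'D' => -3.5, 'C' => 2.5,
  'Q' => -3.5, 'E' => -3.5, 'G' => -0.4, 'H' => -3.2, 'I' => 4.5,
  'L' => 3.8,  'K' => -3.9, 'M' => 1.9,  'F' => 2.8,  'P' => -1.6,
  'S' => -0.8, 'T' => -0.7, 'W' => -0.9, 'Y' => -1.3, 'V' => 4.2
}.freeze

# Hypothetical sketch: returns [start_index, mean_hydropathy] for every
# window whose mean hydropathy reaches the threshold.
# Unknown residues are scored as 0.0.
def hydrophobic_segments(sequence, window: 19, threshold: 1.6)
  residues = sequence.upcase.chars
  return [] if residues.size < window
  (0..residues.size - window).map { |start|
    scores = residues[start, window].map { |aa| KYTE_DOOLITTLE.fetch(aa, 0.0) }
    mean = scores.sum / window
    [start, mean.round(2)] if mean >= threshold
  }.compact
end

p hydrophobic_segments('L' * 19)
```

Overlapping windows are reported individually here; merging adjacent hits into one segment would be a natural next step.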
|
2823
|
-
|
2849
|
+
------------------------------------------------------------------------------------------
|