bioroebe 0.10.80 → 0.11.25
Potentially problematic release.
This version of bioroebe might be problematic.
- checksums.yaml +4 -4
- data/README.md +3117 -2645
- data/bioroebe.gemspec +3 -3
- data/doc/README.gen +3116 -2644
- data/doc/todo/bioroebe_todo.md +418 -387
- data/lib/bioroebe/aminoacids/aminoacid_substitution.rb +1 -9
- data/lib/bioroebe/aminoacids/codon_percentage.rb +1 -9
- data/lib/bioroebe/aminoacids/deduce_aminoacid_sequence.rb +1 -9
- data/lib/bioroebe/aminoacids/display_aminoacid_table.rb +1 -0
- data/lib/bioroebe/aminoacids/show_hydrophobicity.rb +1 -6
- data/lib/bioroebe/base/colours_for_base/colours_for_base.rb +18 -8
- data/lib/bioroebe/base/commandline_application/commandline_arguments.rb +13 -11
- data/lib/bioroebe/base/commandline_application/misc.rb +18 -8
- data/lib/bioroebe/base/misc.rb +16 -0
- data/lib/bioroebe/base/prototype/misc.rb +1 -1
- data/lib/bioroebe/codons/show_codon_tables.rb +6 -2
- data/lib/bioroebe/codons/show_codon_usage.rb +2 -1
- data/lib/bioroebe/constants/aminoacids_and_proteins.rb +1 -0
- data/lib/bioroebe/constants/database_constants.rb +1 -1
- data/lib/bioroebe/constants/files_and_directories.rb +24 -4
- data/lib/bioroebe/constants/misc.rb +20 -0
- data/lib/bioroebe/count/count_amount_of_nucleotides.rb +3 -0
- data/lib/bioroebe/crystal/README.md +2 -0
- data/lib/bioroebe/crystal/to_rna.cr +19 -0
- data/lib/bioroebe/data/README.md +11 -8
- data/lib/bioroebe/data/electron_microscopy/pos_example.pos +396 -0
- data/lib/bioroebe/data/electron_microscopy/test_particles.star +36 -0
- data/lib/bioroebe/{shell/tk.rb → electron_microscopy/electron_microscopy_module.rb} +15 -10
- data/lib/bioroebe/electron_microscopy/simple_star_file_generator.rb +4 -9
- data/lib/bioroebe/fasta_and_fastq/show_fasta_headers.rb +27 -12
- data/lib/bioroebe/genome/README.md +4 -0
- data/lib/bioroebe/genome/genome.rb +67 -0
- data/lib/bioroebe/gui/gtk +1 -0
- data/lib/bioroebe/gui/gtk3/controller/controller.rb +45 -27
- data/lib/bioroebe/gui/gtk3/dna_to_aminoacid_widget/dna_to_aminoacid_widget.rb +76 -50
- data/lib/bioroebe/gui/gtk3/hamming_distance/hamming_distance.rb +42 -28
- data/lib/bioroebe/gui/gtk3/nucleotide_analyser/nucleotide_analyser.rb +119 -71
- data/lib/bioroebe/gui/gtk3/protein_to_DNA/protein_to_DNA.rb +18 -18
- data/lib/bioroebe/gui/gtk3/random_sequence/random_sequence.rb +19 -11
- data/lib/bioroebe/gui/shared_code/protein_to_DNA/protein_to_DNA_module.rb +14 -14
- data/lib/bioroebe/misc/ruler.rb +1 -0
- data/lib/bioroebe/parsers/genbank_parser.rb +353 -24
- data/lib/bioroebe/parsers/gff.rb +1 -9
- data/lib/bioroebe/pdb/parse_pdb_file.rb +1 -9
- data/lib/bioroebe/project/project.rb +1 -1
- data/lib/bioroebe/python/README.md +1 -0
- data/lib/bioroebe/python/__pycache__/mymodule.cpython-39.pyc +0 -0
- data/lib/bioroebe/python/gui/gtk3/all_in_one.css +4 -0
- data/lib/bioroebe/python/gui/gtk3/all_in_one.py +59 -0
- data/lib/bioroebe/python/gui/gtk3/widget1.py +20 -0
- data/lib/bioroebe/python/gui/tkinter/all_in_one.py +91 -0
- data/lib/bioroebe/python/mymodule.py +8 -0
- data/lib/bioroebe/python/protein_to_dna.py +33 -0
- data/lib/bioroebe/python/shell/shell.py +19 -0
- data/lib/bioroebe/python/to_rna.py +14 -0
- data/lib/bioroebe/python/toplevel_methods/open_in_browser.py +20 -0
- data/lib/bioroebe/python/toplevel_methods/palindromes.py +42 -0
- data/lib/bioroebe/python/toplevel_methods/rds.py +13 -0
- data/lib/bioroebe/python/toplevel_methods/three_delimiter.py +34 -0
- data/lib/bioroebe/python/toplevel_methods/time_and_date.py +43 -0
- data/lib/bioroebe/python/toplevel_methods/to_camelcase.py +11 -0
- data/lib/bioroebe/requires/require_the_bioroebe_project.rb +3 -1
- data/lib/bioroebe/sequence/nucleotide_module/nucleotide_module.rb +28 -25
- data/lib/bioroebe/sequence/protein.rb +105 -3
- data/lib/bioroebe/sequence/sequence.rb +61 -2
- data/lib/bioroebe/shell/menu.rb +3752 -3667
- data/lib/bioroebe/shell/misc.rb +51 -4311
- data/lib/bioroebe/shell/readline/readline.rb +1 -1
- data/lib/bioroebe/shell/shell.rb +11199 -28
- data/lib/bioroebe/siRNA/siRNA.rb +81 -1
- data/lib/bioroebe/string_matching/find_longest_substring.rb +3 -2
- data/lib/bioroebe/taxonomy/class_methods.rb +3 -8
- data/lib/bioroebe/taxonomy/constants.rb +4 -3
- data/lib/bioroebe/taxonomy/edit.rb +2 -1
- data/lib/bioroebe/taxonomy/help/help.rb +10 -10
- data/lib/bioroebe/taxonomy/info/check_available.rb +15 -9
- data/lib/bioroebe/taxonomy/info/info.rb +17 -2
- data/lib/bioroebe/taxonomy/info/is_dna.rb +46 -36
- data/lib/bioroebe/taxonomy/interactive.rb +139 -95
- data/lib/bioroebe/taxonomy/menu.rb +27 -18
- data/lib/bioroebe/taxonomy/parse_fasta.rb +3 -1
- data/lib/bioroebe/taxonomy/shared.rb +1 -0
- data/lib/bioroebe/taxonomy/taxonomy.rb +1 -0
- data/lib/bioroebe/toplevel_methods/aminoacids_and_proteins.rb +31 -24
- data/lib/bioroebe/toplevel_methods/databases.rb +1 -1
- data/lib/bioroebe/toplevel_methods/fasta_and_fastq.rb +101 -63
- data/lib/bioroebe/toplevel_methods/misc.rb +17 -16
- data/lib/bioroebe/toplevel_methods/nucleotides.rb +22 -5
- data/lib/bioroebe/toplevel_methods/open_in_browser.rb +2 -0
- data/lib/bioroebe/toplevel_methods/palindromes.rb +1 -2
- data/lib/bioroebe/toplevel_methods/taxonomy.rb +2 -2
- data/lib/bioroebe/toplevel_methods/to_camelcase.rb +5 -0
- data/lib/bioroebe/utility_scripts/align_open_reading_frames.rb +1 -9
- data/lib/bioroebe/utility_scripts/check_for_mismatches/check_for_mismatches.rb +1 -9
- data/lib/bioroebe/utility_scripts/compacter.rb +1 -9
- data/lib/bioroebe/utility_scripts/compseq/compseq.rb +1 -9
- data/lib/bioroebe/utility_scripts/create_batch_entrez_file.rb +1 -9
- data/lib/bioroebe/utility_scripts/dot_alignment.rb +1 -9
- data/lib/bioroebe/utility_scripts/move_file_to_its_correct_location.rb +1 -4
- data/lib/bioroebe/utility_scripts/showorf/constants.rb +0 -5
- data/lib/bioroebe/utility_scripts/showorf/reset.rb +1 -4
- data/lib/bioroebe/version/version.rb +2 -2
- data/lib/bioroebe/www/embeddable_interface.rb +101 -52
- data/lib/bioroebe/www/sinatra/sinatra.rb +186 -70
- data/lib/bioroebe/yaml/aminoacids/amino_acids_long_name_to_one_letter.yml +2 -2
- data/lib/bioroebe/yaml/configuration/browser.yml +1 -1
- data/lib/bioroebe/yaml/genomes/README.md +3 -4
- data/lib/bioroebe/yaml/restriction_enzymes/restriction_enzymes.yml +3 -3
- metadata +33 -35
- data/doc/setup.rb +0 -1655
- data/lib/bioroebe/genbank/genbank_parser.rb +0 -291
- data/lib/bioroebe/shell/add.rb +0 -108
- data/lib/bioroebe/shell/assign.rb +0 -360
- data/lib/bioroebe/shell/chop_and_cut.rb +0 -281
- data/lib/bioroebe/shell/constants.rb +0 -166
- data/lib/bioroebe/shell/download.rb +0 -335
- data/lib/bioroebe/shell/enable_and_disable.rb +0 -158
- data/lib/bioroebe/shell/enzymes.rb +0 -310
- data/lib/bioroebe/shell/fasta.rb +0 -345
- data/lib/bioroebe/shell/gtk.rb +0 -76
- data/lib/bioroebe/shell/history.rb +0 -132
- data/lib/bioroebe/shell/initialize.rb +0 -217
- data/lib/bioroebe/shell/loop.rb +0 -74
- data/lib/bioroebe/shell/prompt.rb +0 -107
- data/lib/bioroebe/shell/random.rb +0 -289
- data/lib/bioroebe/shell/reset.rb +0 -335
- data/lib/bioroebe/shell/scan_and_parse.rb +0 -135
- data/lib/bioroebe/shell/search.rb +0 -337
- data/lib/bioroebe/shell/sequences.rb +0 -200
- data/lib/bioroebe/shell/show_report_and_display.rb +0 -2901
- data/lib/bioroebe/shell/startup.rb +0 -127
- data/lib/bioroebe/shell/taxonomy.rb +0 -14
- data/lib/bioroebe/shell/user_input.rb +0 -88
- data/lib/bioroebe/shell/xorg.rb +0 -45
data/doc/todo/bioroebe_todo.md
CHANGED
@@ -1,16 +1,157 @@
-
-(1) → https://
+------------------------------------------------------------------------------------------
+(1) → https://pubchem.ncbi.nlm.nih.gov/compound/16131099#section=Top
+
+^^^ this website is quite interesting; try to use components
+from it.
+------------------------------------------------------------------------------------------
+(1) → Add some option to show the aminoacid sequence, at the least
+store it; and optionally show it.
+
+possibly always report how many aminoacids are
+part of that file; and optionally also show
+the whole sequence.
+
+
+------------------------------------------------------------------------------------------
+(1) → http://insilico.ehu.es/
+
+^^^ check if we have all of this incorporated
+
+------------------------------------------------------------------------------------------
+(28) → Integrate these nice GUI parts parts:
+
+https://dev.to/kojix2/introduction-to-gr-rb-data-visualization-with-ruby-2c39
+------------------------------------------------------------------------------------------
+(22) → AND THEN test on windows as well.
+^^^^^^^^^^^^^^
+------------------------------------------------------------------------------------------
+(1) → add mouse chromsoome URL, also in the bioshell
+and the main README, to be of help for the
+user. add a mouse subsection.
+
+------------------------------------------------------------------------------------------
+(2) → fix the taxonomy stuff...
+------------------------------------------------------------------------------------------
+(1) → set_dna_sequence alu
+
+^^^ fetch random alu
+
+^^^ alu sequence
+Ok we started this now adding more details, but we
+need to become better at searching for this
+sequence.
+------------------------------------------------------------------------------------------
+(2) → draw things based on GR
+------------------------------------------------------------------------------------------
+(3) → https://mycocosm.jgi.doe.gov/help/screenshots/browser_viewer.png
+^^^ offer the same functionality
+------------------------------------------------------------------------------------------
+(4) → https://genome.cshlp.org/content/12/10/1611/F3.expansion.html
+
+^^^ enable this, we must obtain a sequence then store into genbank format
+so, first fetch; then store as-is.
+------------------------------------------------------------------------------------------
+(5) → be able to generate nice graphics
+
+https://genome.cshlp.org/content/12/10/1611/F1.large.jpg
+------------------------------------------------------------------------------------------
+(6) → add rmagicks wrappre, perhaps via imageparadise or something
+the idea is that we can make fancy drawings and generate
+an image for the end user to see
+------------------------------------------------------------------------------------------
+(7) → https://bioperl.org/howtos/Beginners_HOWTO.html#item13
+extend the sequence object and document it
+
+also add:
+
+class Genome
+and:
+def is_circular?
+@internal_hash[:is_circular]
+end; alias circular? is_circular? # === circular?
+def species?
+@internal_hash[:species] # return the species here
+end
+------------------------------------------------------------------------------------------
+(2) http://lib.ysu.am/open_books/312400.pdf
+
+clone:
+Primer.pl
+This program was written to support the required informatics for a sequencing
+lab. The desire was to quickly generate primer pair candidates for use in STS
+mapping. We use Bioperl modules to fetch the sequences from GenBank.
+#! /usr/bin/perl
+#
+# primers.pl
+#
+# Reads a list of
+
+% primers.pl AC013798
+AC013798
+Left Right Length Penalty
+CCTCCTGGACAACCTGTGTT TGAAGTCAGGGGACATAGGG 280 0.0823
+CCTCCTGGACAACCTGTGTT AGGCCAGTAGACTGGGTGTG 298 0.1758
+CCTCCTGGACAACCTGTGTT GGTGTGAAGTCAGGGGACAT 284 0.1852
+TTCCCGCATCTCTTAGCAGT AGGCCAGTAGACTGGGTGTG 209 0.1962
+CTTCCCGCATCTCTTAGCAG GACACTAGTGGCAAGGAGGC 226 0.2362
+Most of the primers.pl program is extremely simple. The real guts and power
+of the program lie in the classes and the methods we call. The next section
+examines the Primer3 module, which is similar to many Bioperl modules
+
+
+------------------------------------------------------------------------------------------
+(1) → Clone all of Emboss. :)
+
+→ Clone and document the getorf functionality properly.
+
+See: http://emboss.sourceforge.net/apps/cvs/emboss/apps/getorf.html
+
+http://emboss.sourceforge.net
+http://emboss.sourceforge.net/apps/cvs/emboss/apps/getorf.html
+
+------------------------------------------------------------------------------------------
+(3) → Add useful formulas for bioshell.
+------------------------------------------------------------------------------------------
+(1) → Polish the GUI sets:
+
+https://i.imgur.com/djElIMh.png
+
+------------------------------------------------------------------------------------------
+(4) → The taxonomy part should be fully integrated, without it
+being a standalone part anymore.
+continue on the taxonomy stuff.
+ne day this will work again *shake fist*
+------------------------------------------------------------------------------------------
+(1) → Show the frequency of codons in different tables
+
+This works quite ok, but right now the approach is to store
+this in a .yml file which is not ideal.
+
+Thus, we have to add two things:
+- The ability to store this into a SQL database
+- The ability to batch-download all of these codons,
+which first requires that we have a way to obtain all
+taxonomic ids.
+Add where this can be found.
+
+IMPROVE THIS ALL!!!!!!!
+
+------------------------------------------------------------------------------------------
+(2) improve docu + tests for melting temperature analysis again
++ usage example + GUI + web-use
+------------------------------------------------------------------------------------------
+(3) → https://biopython.org/DIST/docs/tutorial/Tutorial.html#sec15
 
 ^^^ work through the above, also integrate it + write docs
 
 https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orchid.fasta
 
-
-(
+------------------------------------------------------------------------------------------
+(4) → integrate electrno microscopy slowly and also add documentation
 about this AS YOU GO!!!!!
 ^^^ yup add more of it
-
-(
+------------------------------------------------------------------------------------------
+(5) → Add save session support
 to reload our last activity completely ...
 hmmm..
 This has to be well designed...
@@ -27,9 +168,8 @@ https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orc
 upon startup of the bioroebe shell.
 This is in preparation for save-session support.
 
-
-
-(5) → Lys-Asp-Glu-Leu
+------------------------------------------------------------------------------------------
+(6) → Lys-Asp-Glu-Leu
 
 if i.include?('-') and Bioroebe.is_in_the_three_letter_code?(i)
 end
@@ -47,11 +187,11 @@ https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orc
 
 ^^ yep this is also called KDEL
 https://en.wikipedia.org/wiki/KDEL_(amino_acid_sequence)
-
-(
+------------------------------------------------------------------------------------------
+(7) → Add "orthologs". this shall show us the top 25 orthologs or
 something. In the bioshell? Hmm. Not sure yet.
-
-(
+------------------------------------------------------------------------------------------
+(8) → clone the functionality of this:
 
 http://www.kazusa.or.jp/codon/cgi-bin/countcodon.cgi
 http://www.kazusa.or.jp/codon/countcodon.html
@@ -63,18 +203,18 @@ https://en.wikipedia.org/wiki/KDEL_(amino_acid_sequence)
 widget first. And sinatra output too.
 AND document it as well
 
-
+------------------------------------------------------------------------------------------
 (8) → SARS genom analyisere in bioroebe
 eventuell auch graphisch
 
 Gibt es neue GUIs die wir kombinieren könnten? Hmmm.
-
+------------------------------------------------------------------------------------------
 (9) → In bioroebe, generate that .ps thingy graphical thing from the
 vienna RNA tutorial. Hmmm.
 
 https://www.tbi.univie.ac.at/RNA/tutorial/
-
-(
+------------------------------------------------------------------------------------------
+(10) → get insulin squence frmo NCBI
 human
 then apply trypsin onto it
 and try it like this:
@@ -88,13 +228,13 @@ Also add:
 ^^^ to show it
 Hmm. Perhaps also auto-download or something.
 
-
-(
+------------------------------------------------------------------------------------------
+(11) → in bioroebe: UAG?
 ^^^ show all stop codons with that in the bioshell
 all UAG sequences... hmm. and TAG?
 Finish that.
-
-(
+------------------------------------------------------------------------------------------
+(12) → The position of a symbol in a string is the total number of
 symbols found to its left, including itself (e.g., the positions
 of all occurrences of 'U' in "AUGCUUCAGAAAGGUCUUACG" are 2, 5,
 6, 15, 17, and 18). The symbol at position i
@@ -102,70 +242,70 @@ Hmm. Perhaps also auto-download or something.
 
 ^^^ add a solution there, a toplevel API
 !!!!!
-
-(
+------------------------------------------------------------------------------------------
+(13) → http://bioruby.org/rdoc/Bio/Blast.html
 ^^^ add support for BLAST
-
-(
+------------------------------------------------------------------------------------------
+(14) → add: parse_pdb()
 With this we shall just show some info, about a given
 .pdb file at hand.
 Also make it commandline based too + bioshell variant
 here, and a sinatra interface once this all works.
 Don't forget to document it!!!!!
 ^^^ and google a bit how others do that
-
-(
+------------------------------------------------------------------------------------------
+(15) → pdb 1a6m
 ^^^ download this when that is used in the bioshell; we also have
 to use the download directory for this, so make sure that
 we do.
 ^^^ And then, also document this clearly.
-
-(
+------------------------------------------------------------------------------------------
+(16) show_string
 ^^^ slowly port this ... find out differences
 then unify into one method. right now we used
 two or something.
-
-(
+------------------------------------------------------------------------------------------
+(17) → Try to see if we can integrate this into our GUI:
 
 https://cdn.snapgene.com/assets/7.6.11/assets/images/snapgene/homepage/homepage-hero.png
-
+------------------------------------------------------------------------------------------
 (5) → Scan for leucine zipper!
 
 This is ~25% implemented. We need to double-check what
 exactly is a leucine zipper.
-
+------------------------------------------------------------------------------------------
 (6) → Extend the sinatra-interface for the Rosalind task,
 perhaps add a sub-link to show which parts are solved
 as-is. Hmm. I am not continuing on this though.
 ^^^^
 well - make rosalind anew again or something.
 
-
+------------------------------------------------------------------------------------------
 (7) - Add a blast interface; both via the web-interface, GUI,
 and also from the commandline.
-
+------------------------------------------------------------------------------------------
 (8) - Write a tutorial about primer design.
 also make sure that the GUI has support for this.
-
+------------------------------------------------------------------------------------------
 (9) - In the documentation examples, show some exampls for how to work
 with different organisms.
-
+------------------------------------------------------------------------------------------
 (10) - In the bioshell, if "stop?" is issued, then the colouring isn't
 correct. It currently does not show any result. This has to
 be fixed.
-
+------------------------------------------------------------------------------------------
 (11) → https://www.rubydoc.info/gems/biomart
 ^^^ integrate biomart
 
 p biomart.list_datasets
 p biomart.datasets?
-
+------------------------------------------------------------------------------------------
 (12) Add Trypsin und Trypsinogen sequences, both as FASTA
 but also as shortcut via the commandline such as:
 show_orf :trypsine
 show_orf :trypsin
 Or something like this; and document it as well.
-
+------------------------------------------------------------------------------------------
 (13) → 1..60
 
 setdna 57
@@ -177,12 +317,12 @@ well - make rosalind anew again or something.
 5' - ATGTGCAGTCAGGTGAATTTATTGAAAAATTTGAGGCTCCTGGTGGTGCAAATCAAAGAACTGCTCCTCAGTGGATGTTGCCTTTACTTCTAG - 3'
 ^^^ hier beim colourize, wenn das letzte codon ein STOP codon ist
 dann colourizen wir das auch.
-
+------------------------------------------------------------------------------------------
 (14) → MG1655
 ^^^ input this to download the sequence. Also show it to the user.
-
+------------------------------------------------------------------------------------------
 (15) → extend virus-information into the bioroebe project.
-
+------------------------------------------------------------------------------------------
 (16) → Add a way to analyse the chemical structure of all
 aminoacids. We wish to show the chemical formula.
 
@@ -196,22 +336,22 @@ well - make rosalind anew again or something.
 I don't understand why it removes H and 0 so perhaps
 dont remove that part. But still show the -R.
 
-
+------------------------------------------------------------------------------------------
 (17) FIX THE COLOURIZATION BUG; THIS ONE TRIGGERED THE WHOLE
 REWRITE AFTER ALL!
-
+------------------------------------------------------------------------------------------
 (18) FIX TAXONOMY related-problems AS WELL
 ^^^^^^ AND DOCUMENT THIS related-problems.
-
+------------------------------------------------------------------------------------------
 (19) Do note that z will then be a String, not a sequence object anymore.
 (This may be subject to change in the future, but for now, aka
 **February 2020**, it is that way.)
 ^^^^
-
+------------------------------------------------------------------------------------------
 (20) ^^^ colours are appended. That should not be the case!
 ADD SOMETHING NEW ... some todo entries
 and some python tool
-
+------------------------------------------------------------------------------------------
 (21) → rewrite the whole project anew
 - improve the documentation
 - focus on class Protein first and add
@@ -219,10 +359,8 @@ well - make rosalind anew again or something.
 that, as well as:
 .backtrans
 .reverse_translate
-
-
-^^^^^^^^^^^^^^
--------------------------------------------------------------------------------
+
+------------------------------------------------------------------------------------------
 (23) →
 Reduced alphabets for proteins | [not implemented yet]
 ^^^ check this as well
@@ -252,9 +390,9 @@ First focus on bioroebe.
 efetch "https://www.ncbi.nlm.nih.gov/gene/744779"
 ^^^ test this. again
 
-
+------------------------------------------------------------------------------------------
 (25) fix tk-levensthein
-
+------------------------------------------------------------------------------------------
 (26) → rewrite the whole project anew
 - improve the documentation
 - rework the WHOLE tutorial as well
@@ -263,13 +401,13 @@ efetch "https://www.ncbi.nlm.nih.gov/gene/744779"
 that
 .backtrans
 .reverse_translate
-
+------------------------------------------------------------------------------------------
 (27) → analyze /Depot/Temp/Bioroebe/1CEZ.pdb
 
 ^^^
 support this. Already works half-way, we started writing a pdb parser.
 this should work in general, for .fasta files as well.
-
+------------------------------------------------------------------------------------------
 (28) → SINATRA STUFF:
 FIX AND EXTEND SINATRA IN BIOROEBE.
 extend it too.
@@ -281,7 +419,7 @@ efetch "https://www.ncbi.nlm.nih.gov/gene/744779"
 and special-dispaly on sinatra kaa
 where the nucleotide sequence has numbers
 ^^^
-
+------------------------------------------------------------------------------------------
 (29) pick any virus and begin to amass tons of data; and then when done
 also connect this into a GUI for use therein.
 
@@ -302,7 +440,7 @@ https://www.ncbi.nlm.nih.gov/nuccore/NC_038391.1
 
 
 
-
+------------------------------------------------------------------------------------------
 (1) → Fix:
 
 require 'bioroebe/toplevel_methods/open_reading_frames.rb'
@@ -310,10 +448,10 @@ https://www.ncbi.nlm.nih.gov/nuccore/NC_038391.1
 Something is wrong; it returns regions that contain
 a stop codon, which can not be true.
 
-
+------------------------------------------------------------------------------------------
 (3) → Fix: extend glycovirology parts
 seek stuff in viral genomes
-
+------------------------------------------------------------------------------------------
 (4) →
 
 seq = Bio::Sequence::NA.new("atgcatgcaaaaaaa")
@@ -336,13 +474,13 @@ https://www.ncbi.nlm.nih.gov/nuccore/NC_038391.1
 seq = Bioroebe::Sequence.new("atgcatgcaaaaaaa")
 puts seq
 puts seq.complement
-
+------------------------------------------------------------------------------------------
 (5) →
 make sure we have a good fasta-showing widget
 show how many nucleotides are
 AND add support to modify this as-is
 ^^^^
-
+------------------------------------------------------------------------------------------
 (6) → In BioRoebe:
 
 Add a table showing how compatible bioroebe is compared to the other
@@ -352,7 +490,7 @@ https://www.ncbi.nlm.nih.gov/nuccore/NC_038391.1
 including Bio (ruby-bio) the main ruby project here.
 And add a table which functionality is implemented
 in Java already.
-
+------------------------------------------------------------------------------------------
 (7) →
 ********************************************************************************
 Was passiert wenn wir das Lambda-Genom mit EcoRI behandeln?
@@ -375,19 +513,19 @@ Bioroebe.digest_this_dna("/root/Bioroebe/fasta/NC_001416.1_Enterobacteria_phage_
 DNA.
 ^^^ this now works kind of ... but it must be better
 documented and we must test this with more data.
-
+------------------------------------------------------------------------------------------
 (8) → add the bioroebe logo to sinatra, but as appropriate size,
 via base64. perhaps width 50 or so. need to determine
 which size fits here.
-
+------------------------------------------------------------------------------------------
 (9) → Integrate http://nc2.neb.com/NEBcutter2/cutshow.php?name=ffe1d68e-
 
 in particular the visual part.
-
+------------------------------------------------------------------------------------------
 (10) → https://international.neb.com/products/r0196-ncii#Product%20Information
 ^^^ autogenerate such an image, aka restriction cutting enzyme
 to indicate the target sequence.
-
+------------------------------------------------------------------------------------------
 (6) → how to do codon optimiation in e.coli? bioroebe must support this!
 
 we must first get a display which codon is very commonly used in
@@ -399,34 +537,23 @@ and then we look which codons may be improvable - display
|
|
399
537
|
them on the commandline
|
400
538
|
|
401
539
|
class: OptimizeCodons.new(of_this_sequence)
|
402
|
-
|
540
|
+
------------------------------------------------------------------------------------------
|
403
541
|
(7) → Molekulare Grösse von "Ubiquitin"? "8.5 kd".
|
404
542
|
^^^ das sollte automatisch ausgerechnet werden
|
405
|
-
|
543
|
+
------------------------------------------------------------------------------------------
|
406
544
|
(8) → taxonomy !!!!!!!!!!!!!!!!!!
|
407
|
-
|
545
|
+
------------------------------------------------------------------------------------------
|
408
546
|
(9) → Given a list of gene names that I would like to get chromosome/position
|
409
547
|
information for (in mm10). Is there some service online where I can
|
410
548
|
paste this list? ^^^ enable this
|
411
|
-
|
412
|
-
(10) → Show the frequency of codons in different tables
|
413
|
-
|
414
|
-
This works quite ok, but right now the approach is to store
|
415
|
-
this in a .yml file which is not ideal.
|
416
|
-
|
417
|
-
Thus, we have to add two things:
|
418
|
-
- The ability to store this into a SQL database
|
419
|
-
- The ability to batch-download all of these codons,
|
420
|
-
which first requires that we have a way to obtain all
|
421
|
-
taxonomic ids.
|
422
|
-
-------------------------------------------------------------------------------
|
549
|
+
------------------------------------------------------------------------------------------
|
423
550
|
(11) → Add a way in bioroebe to store a gene into a yaml file
|
424
551
|
or so, and to also load it up again. Perhaps simplify
|
425
552
|
this automatically. Need some ways to describe that.
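
One possible shape for this, as a sketch only: the gene is represented as a plain Hash with string keys, and the helper names `save_gene` and `load_gene` as well as the example values are made up for illustration. Only the standard `yaml` library is used.

```ruby
require 'yaml'
require 'tmpdir'

# Hypothetical helper: writes a gene description to a YAML file.
def save_gene(path, gene)
  File.write(path, YAML.dump(gene))
end

# Hypothetical helper: reads the gene description back as a Hash.
def load_gene(path)
  YAML.load_file(path)
end

# Illustrative data, not a real gene record.
gene = { 'name' => 'example_gene', 'organism' => 'unknown', 'sequence' => 'ATGGCCTAA' }
path = File.join(Dir.tmpdir, 'example_gene.yml')

save_gene(path, gene)
restored = load_gene(path)
puts restored['sequence']
```

Keeping to strings and other plain types means the file stays loadable with the safe YAML loader and remains easy to edit by hand.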
|
426
|
-
|
553
|
+
------------------------------------------------------------------------------------------
|
427
554
|
(12) → Make bioroebe very useful from the www, no matter if via sinatra
|
428
555
|
or rails. It should be a tool-set project on the www as well.
|
429
|
-
|
556
|
+
------------------------------------------------------------------------------------------
|
430
557
|
(13) → Suppose you have a GenBank file which you want to turn into a
|
431
558
|
Fasta file. For example, let's consider the file cor6_6.gb
|
432
559
|
which is included in the Biopython unit tests under the
|
@@ -441,12 +568,12 @@ call it format-converter or so
|
|
441
568
|
the GUI works somewhat but needs to be polished up.
|
442
569
|
THEN THIS CAN BE REMOVED!!!!!!!
|
443
570
|
|
444
|
-
|
571
|
+
------------------------------------------------------------------------------------------
|
445
572
|
(14) → We need a table in which we collect and compare the strong promoters
|
446
573
|
of different organisms.
|
447
574
|
|
448
575
|
strong_promoters.yml
|
449
|
-
|
576
|
+
------------------------------------------------------------------------------------------
|
450
577
|
(15) → add:
|
451
578
|
start position of exons
|
452
579
|
and show the sequence based on that file
|
@@ -454,9 +581,9 @@ THEN THIS CAN BE REMOVED!!!!!!!
|
|
454
581
|
Normally there's a "gene" entry for each gene, so:
|
455
582
|
awk 'BEGIN{FS="\t"; OFS="\t"}{if($3 == "gene") print $1, $4, $5}' foo.gtf
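
The awk one-liner above could be mirrored in Ruby roughly as follows. This is a sketch; the method name `gene_coordinates` and the example lines are made up for illustration. Unlike awk, it converts the start and end columns to integers and skips comment lines.

```ruby
# Hypothetical helper: returns [seqname, start, end] for every GTF line
# whose third column (the feature type) is "gene".
def gene_coordinates(gtf_lines)
  gtf_lines.filter_map do |line|
    next if line.start_with?('#')
    fields = line.chomp.split("\t")
    [fields[0], fields[3].to_i, fields[4].to_i] if fields[2] == 'gene'
  end
end

# Illustrative GTF lines, not taken from a real annotation.
example = [
  "chr1\tsource\tgene\t100\t900\t.\t+\t.\tgene_id \"g1\";",
  "chr1\tsource\texon\t100\t300\t.\t+\t.\tgene_id \"g1\";",
  "chr2\tsource\tgene\t50\t400\t.\t-\t.\tgene_id \"g2\";"
]

coords = gene_coordinates(example)
coords.each { |row| puts row.join("\t") }
```

For a file on disk, `gene_coordinates(File.foreach('foo.gtf'))` should work the same way, since `File.foreach` without a block returns an enumerator over the lines.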
|
456
583
|
|
457
|
-
|
584
|
+
------------------------------------------------------------------------------------------
|
458
585
|
(16) → also add 30-33 to aminoacids hmmm difficult.
|
459
|
-
|
586
|
+
------------------------------------------------------------------------------------------
|
460
587
|
(17) → http://bioinformatics.oxfordjournals.org/content/18/8/1135
|
461
588
|
"TFBS: Computational framework for transcription factor
|
462
589
|
binding site analysis"
|
@@ -464,7 +591,7 @@ THEN THIS CAN BE REMOVED!!!!!!!
|
|
464
591
|
into bioroebe
|
465
592
|
|
466
593
|
http://tfbs.genereg.net/
|
467
|
-
|
594
|
+
------------------------------------------------------------------------------------------
|
468
595
|
(18) → They include trypsin, chymotrypsin, thrombin, plasmin, papain and factor Xa.
|
469
596
|
^^^ provide means to identify where they cut,
|
470
597
|
and then show this by simulating a digest.
|
@@ -472,7 +599,7 @@ THEN THIS CAN BE REMOVED!!!!!!!
|
|
472
599
|
also document this on bioroebe todo
|
473
600
|
this is done via digestion/digestions
|
474
601
|
but it's not quite perfect yet.
|
475
|
-
|
602
|
+
------------------------------------------------------------------------------------------
|
476
603
|
(19) → a) add a commandline way to generate a random protein
|
477
604
|
with a specified length and then display it on the
|
478
605
|
commandline [DONE] !!!
|
@@ -498,29 +625,29 @@ THEN THIS CAN BE REMOVED!!!!!!!
|
|
498
625
|
Enable this BOTH from the commandline AND from the
|
499
626
|
interactive variant and from sinatra! Hmmmm.
|
500
627
|
|
501
|
-
|
628
|
+
------------------------------------------------------------------------------------------
|
502
629
|
(1) → add an option to design a
|
503
630
|
|
504
631
|
degenerate primer
|
505
|
-
|
632
|
+
------------------------------------------------------------------------------------------
|
506
633
|
(2) Add upcase to sequences and ensure that it works; also document it
|
507
634
|
internally and in the .pdf tutorial
|
508
635
|
what does that mean? upcase as method? hmmm.
|
509
636
|
|
510
|
-
|
637
|
+
------------------------------------------------------------------------------------------
|
511
638
|
(1) → http://www.biomart.org/other/user-docs.pdf
|
512
639
|
^^^ work through this
|
513
640
|
^^^ integrate the old .cgi part and improve as you go
|
514
|
-
|
641
|
+
------------------------------------------------------------------------------------------
|
515
642
|
(1) → Access geninfo numbers easily.
|
516
643
|
Search for them and download them.
|
517
|
-
|
644
|
+
------------------------------------------------------------------------------------------
|
518
645
|
- Add all of bioruby into bioroebe:
|
519
646
|
|
520
647
|
continuous project
|
521
648
|
https://github.com/biopython/biopython
|
522
649
|
https://github.com/bioruby/bioruby/tree/master/lib/bio
|
523
|
-
|
650
|
+
------------------------------------------------------------------------------------------
|
524
651
|
(3) → https://github.com/bioruby/bioruby/issues/134
|
525
652
|
^^^ check this, for restriction enzymes
|
526
653
|
http://rebase.neb.com/rebase/enz/MboII.html
|
@@ -530,9 +657,9 @@ THEN THIS CAN BE REMOVED!!!!!!!
|
|
530
657
|
> seq = seq.reverse_complement
|
531
658
|
> Bio::RestrictionEnzyme.cut(seq, 'MboII').primary rescue [seq]
|
532
659
|
=> ["atcatcaatcctaatcttct"]
|
533
|
-
|
660
|
+
------------------------------------------------------------------------------------------
|
534
661
|
(4) → Document how an ORF is defined for the bioroebe project.
|
535
|
-
|
662
|
+
------------------------------------------------------------------------------------------
|
536
663
|
(5) Continue with biojava in bioroebe.
|
537
664
|
|
538
665
|
→ We need to make some table that tells us what is implemented
|
@@ -547,7 +674,7 @@ THEN THIS CAN BE REMOVED!!!!!!!
|
|
547
674
|
|
548
675
|
dprimer M-T-T-Y-Y-T-A-A-A-STOP
|
549
676
|
|
550
|
-
|
677
|
+
------------------------------------------------------------------------------------------
|
551
678
|
(1) → The codon tables:
|
552
679
|
→ In January we added a codon-table GUI to ruby-gtk3.
|
553
680
|
|
@@ -576,31 +703,29 @@ THEN THIS CAN BE REMOVED!!!!!!!
|
|
576
703
|
|
577
704
|
This now sorta works semi-ok.
|
578
705
|
|
579
|
-
|
706
|
+
------------------------------------------------------------------------------------------
|
580
707
|
(1) → In the bioroebe-shell, enable input such as:
|
581
708
|
|
582
709
|
NC_000011.10
|
583
710
|
|
584
711
|
This shall quickly download this sequence into the
|
585
712
|
local file, and also rename it properly.
|
586
|
-
|
713
|
+
------------------------------------------------------------------------------------------
|
587
714
|
→ clone all of bioruby
|
588
|
-
|
715
|
+
------------------------------------------------------------------------------------------
|
589
716
|
(1) → read through the bioinformatics books and include material from them !!!
|
590
717
|
^^^^^ add more pictures ... possibly of the GUIs as well.
|
591
718
|
Also work through biopython and port everything important to
|
592
719
|
bioroebe.
|
593
|
-
|
720
|
+
------------------------------------------------------------------------------------------
|
594
721
|
- Add: DetectMotif
|
595
722
|
|
596
723
|
This class shall be used for detecting subsequences.
|
597
|
-
|
724
|
+
------------------------------------------------------------------------------------------
|
598
725
|
- Add new functionality
|
599
|
-
|
600
|
-
- more documentation
|
601
|
-
|
602
|
-
- continue on bioroebe, and when it is done, write to the guy.
|
603
|
-
-------------------------------------------------------------------------------
|
726
|
+
------------------------------------------------------------------------------------------
|
727
|
+
- more documentation!!!
|
728
|
+
------------------------------------------------------------------------------------------
|
604
729
|
- Rewrite bioroebe completely - add some tests, too or so, to
|
605
730
|
test this. ^^^
|
606
731
|
That way we learn how to write tests.
|
@@ -643,22 +768,13 @@ extend bioroebe sinatra interface
|
|
643
768
|
also add a footer to show which entries are available or so
|
644
769
|
→ in bioroebe, make the postgresql database work again ...
|
645
770
|
|
646
|
-
|
647
|
-
|
648
|
-
..........................................................................
|
771
|
+
------------------------------------------------------------------------------------------
|
649
772
|
|
650
773
|
→ ^^^ improve this whole project a lot
|
651
774
|
|
652
775
|
before uploading then send email
|
653
776
|
|
654
777
|
|
655
|
-
- 1fat.pdb
|
656
|
-
|
657
|
-
^^^ download this, also via bioshell
|
658
|
-
download 1fat
|
659
|
-
^^^ notify the user about this
|
660
|
-
but put it into the dir of bioshell
|
661
|
-
|
662
778
|
→ add:
|
663
779
|
|
664
780
|
set_dna :insulin
|
@@ -674,44 +790,20 @@ also add a footer to show which entries are available or so
|
|
674
790
|
→ becomes: http://www.ncbi.nlm.nih.gov/gene/3630
|
675
791
|
|
676
792
|
wtf ... better to learn how NCBI works
|
677
|
-
|
678
|
-
- Add a sequence table
|
793
|
+
------------------------------------------------------------------------------------------
|
794
|
+
- Add a sequence table into bioroebe for GFP, YFP etc
|
679
795
|
and make this show in both the interactive bioshell but
|
680
796
|
also the main README.md
|
681
|
-
-------------------------------------------------------------------------------
|
682
|
-
- stop_frame1?
|
683
|
-
^^^ add support for this
|
684
|
-
and stop_frame2?
|
685
|
-
etcc
|
686
|
-
to show stop-codons in this colour
|
687
|
-
THEN UPLOAD!
|
688
|
-
^^^ this works now but is not documented
|
689
|
-
|
690
|
-
|
691
|
-
-------------------------------------------------------------------------------
|
692
|
-
|
693
|
-
- chop to first ATG
|
694
797
|
|
695
|
-
|
696
|
-
|
697
|
-
^^^^ enable this, to chop towards the first ATG
|
698
|
-
sequence in the string
|
699
|
-
|
700
|
-
-------------------------------------------------------------------------------
|
798
|
+
------------------------------------------------------------------------------------------
|
701
799
|
→ http://www.biophp.org/stats/describe_data/demo.php?show=formula
|
702
800
|
|
703
801
|
^^^ should also add documentation like this, also via www interface
|
704
|
-
|
705
|
-
→ add mouse chromosome URL, also in the bioshell
|
706
|
-
and the main README, to be of help for the
|
707
|
-
user. add a mouse subsection.
|
708
|
-
..........................................................................
|
709
|
-
→ fix the taxonomy stuff...
|
710
|
-
..........................................................................
|
802
|
+
------------------------------------------------------------------------------------------
|
711
803
|
(1) → add 2nd_orf
|
712
804
|
→ this shall scan for the 2nd orf
|
713
805
|
→ and third ORF as well, then, and document it.
|
714
|
-
|
806
|
+
------------------------------------------------------------------------------------------
|
715
807
|
(2) → Add a "cutter-range example" in restriction enzymes +
|
716
808
|
table + examples + tutorial
|
717
809
|
|
@@ -719,21 +811,16 @@ also add a footer to show which entries are available or so
|
|
719
811
|
|
720
812
|
Also, add in the documentation where this
|
721
813
|
can be found.
|
722
|
-
|
723
|
-
(3) → Add aaruler, similar to "ruler"; in the bioshell.
|
724
|
-
But we want to do this on the dna-sequence rather
|
725
|
-
than the aminoacid sequence.
|
726
|
-
This works but the display is not ideal.
|
727
|
-
..........................................................................
|
814
|
+
------------------------------------------------------------------------------------------
|
728
815
|
(4) → Add some codon-usage analyzer. What shall it show? It
|
729
816
|
should show how many codons are used, frequencies etc...
|
730
817
|
by an organism, and compare that to other data.
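
As a starting point for such an analyzer, codon counts and relative frequencies for a single coding sequence could be computed like this. It is a sketch only: the method names are hypothetical, the sequence is assumed to start in frame, and trailing bases that do not fill a codon are ignored.

```ruby
# Hypothetical helper: counts each codon, reading from the first base.
def codon_counts(sequence)
  sequence.upcase.chars.each_slice(3)
          .select { |codon| codon.size == 3 }
          .map(&:join)
          .tally
end

# Hypothetical helper: converts the counts into relative frequencies.
def codon_frequencies(sequence)
  counts = codon_counts(sequence)
  total = counts.values.sum
  counts.transform_values { |n| n.to_f / total }
end

counts = codon_counts('ATGAAAAAATAA')
codon_frequencies('ATGAAAAAATAA').each do |codon, freq|
  puts format('%s %.2f', codon, freq)
end
```

Comparing two organisms would then amount to comparing two such frequency hashes codon by codon.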
|
731
|
-
|
818
|
+
------------------------------------------------------------------------------------------
|
732
819
|
(5) → Implement a GPCR interface.
|
733
820
|
|
734
821
|
This is for "G-protein coupled receptors."
|
735
822
|
Denote which variants exist and so forth. Document it as well.
|
736
|
-
|
823
|
+
------------------------------------------------------------------------------------------
|
737
824
|
(6) → alu?
|
738
825
|
|
739
826
|
Will read from the file `/Programs/Ruby/2.3.0/lib/ruby/site_ruby/2.3.0/bioroebe/yaml/alu_elements.yml`.
|
@@ -756,7 +843,7 @@ also add a footer to show which entries are available or so
|
|
756
843
|
^^^ add this and document it or something like that
|
757
844
|
And perhaps add a small protein as an example how to
|
758
845
|
work with .pdb files instead.
|
759
|
-
|
846
|
+
------------------------------------------------------------------------------------------
|
760
847
|
(4) → Extend bioroebe to allow download
|
761
848
|
|
762
849
|
PDB files
|
@@ -770,13 +857,13 @@ also add a footer to show which entries are available or so
|
|
770
857
|
|
771
858
|
in 3EML 2VTP 2VEZ
|
772
859
|
do
|
773
|
-
|
860
|
+
------------------------------------------------------------------------------------------
|
774
861
|
(1) → Fully integrate electron microscopy then remove the old entry.
|
775
862
|
Test it though.
|
776
863
|
Hmm... but ... we will first polish the main bioroebe
|
777
864
|
gem AND the taxonomy gem and THEN AFTERWARDS
|
778
865
|
integrate electron microscopy.
|
779
|
-
|
866
|
+
------------------------------------------------------------------------------------------
|
780
867
|
(1) → ORF Finder:
|
781
868
|
|
782
869
|
We must add an ORF finder for the bioroebe project,
|
@@ -785,23 +872,23 @@ also add a footer to show which entries are available or so
|
|
785
872
|
This works partially... start_stop works but we do not
|
786
873
|
yet find all subsequences.
|
787
874
|
|
788
|
-
|
875
|
+
------------------------------------------------------------------------------------------
|
789
876
|
(1) → must determine whether we have protein or nucleotide or
|
790
877
|
so via a toplevel method!
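
Such a check could look roughly like the following sketch. The method name `sequence_type` is hypothetical. The heuristic only looks at the alphabet, so a short protein consisting solely of A, C, G, T and N would be misread as DNA.

```ruby
# Hypothetical helper: guesses the sequence type from the letters used.
def sequence_type(sequence)
  letters = sequence.upcase.delete('^A-Z')
  return :unknown if letters.empty?
  return :dna if letters.match?(/\A[ACGTN]+\z/)
  return :rna if letters.match?(/\A[ACGUN]+\z/)

  :protein
end

puts sequence_type('ATGCCG')
puts sequence_type('MFLMVSPTAYHQNKDECFLP')
```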
|
791
|
-
|
878
|
+
------------------------------------------------------------------------------------------
|
792
879
|
(1) → there is a talens module.
|
793
880
|
we have to improve on it for a while
|
794
881
|
better docu
|
795
882
|
more testing
|
796
883
|
then we can get rid of this entry here
|
797
|
-
|
884
|
+
------------------------------------------------------------------------------------------
|
798
885
|
(1) → 33.44
|
799
886
|
Next showing the nucleotides 33 to 44 (including 33 and 44).
|
800
887
|
The length of the fragment will be 12 nucleotides.
|
801
888
|
5' - 2;70;130;180 - 3'
|
802
889
|
^^^ there is some problem; we somehow embed the colour codes,
|
803
890
|
which should not happen.
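
A sketch of the range handling on a plain string, assuming the input form `33.44` means the 1-based, inclusive positions 33 to 44. The method name `subsequence_for` is hypothetical. Slicing the raw sequence first and applying colours only afterwards is one way to keep colour codes out of the fragment.

```ruby
# Hypothetical helper: "33.44" selects nucleotides 33 to 44 (1-based,
# both ends included), so the fragment has last - first + 1 nucleotides.
def subsequence_for(sequence, range_string)
  first, last = range_string.split('.').map { |n| Integer(n) }
  sequence[(first - 1)..(last - 1)]
end

sequence = 'ACGT' * 15 # illustrative 60-nucleotide sequence
fragment = subsequence_for(sequence, '33.44')
puts "5' - #{fragment} - 3' (#{fragment.length} nucleotides)"
```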
|
804
|
-
|
891
|
+
------------------------------------------------------------------------------------------
|
805
892
|
(1) → set_aa DTLCIGYHAN NSTDTVDTVL EKNVTVTHSV NLLEDKHNGK LCKLRGVAPL HLGKCNIAGW ILGNPECESL STASSWSYIV ETSNSDNGTC YPGDFINYEE LREQLSSVSS FERFEIFPKT SSWPNHDNKG VTAACPHAGA KSFYKNLIWL VKKGNSYPKL NQSYINDKGK EVLVLWGIHH PSTTADQQSL YQNADAYVFV GTSRYSKKFK PEIATRPKVR DQEGRMNYYW TLVEPGDKIT FEATGNLVVP RYAFMERNAG SGIIISDTPV HDCNTTCQTP EGAINTSLPF QNIHPITIGK CPKYVKSTKL RLATGLRNVP SIQSRGLFGA IAGFIEGGWT GMVDGWYGYH HQNEQGSGYA ADLKSTQNAI DKITNKVNSV IKMNTQFTAV GKEFNHLEKR IENLNKKVDD GFLDIWTYNA ELLVLLENER TLDYHDSNVK NLYEKVRNQL KNNAKEIGNG CFEFYHKCDN TCMESVKNGT YDYPKYSEEA KLNREKIDGV KLESTRIYHH HHHH
|
806
893
|
|
807
894
|
^^^ enable copy/pasting,
|
@@ -816,7 +903,7 @@ also add a footer to show which entries are available or so
|
|
816
903
|
This sequence has 50 aminoacids.
|
817
904
|
^^^ this is not correct.
|
818
905
|
|
819
|
-
|
906
|
+
------------------------------------------------------------------------------------------
|
820
907
|
(1) → add this functionality:
|
821
908
|
|
822
909
|
melting temperature
|
@@ -853,70 +940,46 @@ also add a footer to show which entries are available or so
|
|
853
940
|
and also provide a commandline-way to calculate them,
|
854
941
|
using ruby. The latter may be useful and rather easy for
|
855
942
|
scripted use.
|
856
|
-
|
943
|
+
------------------------------------------------------------------------------------------
|
857
944
|
(1) → show insulin
|
858
945
|
^^^ to show the insulin structure
|
859
946
|
how to find it? no idea...
|
860
947
|
but we should have these structures already made available somewhere.
|
861
|
-
|
948
|
+
------------------------------------------------------------------------------------------
|
862
949
|
(1) → Todo: find family of enzymes, based on sequence structure
|
863
950
|
alone.
|
864
|
-
..........................................................................
|
865
|
-
(1) → https://pubchem.ncbi.nlm.nih.gov/compound/16131099#section=Top
|
866
|
-
|
867
|
-
^^^ this website is quite interesting; try to use components
|
868
|
-
from it.
|
869
|
-
-------------------------------------------------------------------------------
|
870
|
-
(1) → Add some option to show the aminoacid sequence, at the least
|
871
|
-
store it; and optionally show it.
|
872
951
|
|
873
|
-
|
874
|
-
part of that file; and optionally also show
|
875
|
-
the whole sequence.
|
876
|
-
-------------------------------------------------------------------------------
|
952
|
+
------------------------------------------------------------------------------------------
|
877
953
|
(1) → WORK THROUGH the PROTOCOL AT BOKU. THEN WORK THROUGH THE VARIOUS
|
878
954
|
TIDBITS AT UNI WIEN STARTING WITH HEIKO.
|
879
955
|
^^^ this is where we are now.
|
880
956
|
we are at the beginning of 1b ... hmmmm, so first go through the material at
|
881
957
|
BOKU. Then delete this.
|
882
|
-
|
958
|
+
------------------------------------------------------------------------------------------
|
883
959
|
(1) → Begin tk-bindings for bioroebe, following the gtk stuff.
|
884
|
-
|
960
|
+
------------------------------------------------------------------------------------------
|
885
961
|
(2) → frame_value = position_of_the_stop_codon - position_of_the_start_codon
|
886
962
|
^^^ continue on this ...
|
887
|
-
|
963
|
+
------------------------------------------------------------------------------------------
|
888
964
|
(1) → improve both the gtk-apps parts, and the sinatra web-interface,
|
889
965
|
and other GUI-like elements. The idea is to make this software
|
890
966
|
more useful for people around the world, which should help
|
891
967
|
increase its adoption rate.
|
892
|
-
|
968
|
+
------------------------------------------------------------------------------------------
|
893
969
|
(2) → Look to integrate this:
|
894
970
|
|
895
971
|
http://www.ncbi.nlm.nih.gov/nuccore/NM_007315.3?report=fasta&log$=seqview&format=text
|
896
972
|
^^^
|
897
|
-
|
898
|
-
(1) → Clone and document the getorf functionality properly.
|
899
|
-
|
900
|
-
See: http://emboss.sourceforge.net/apps/cvs/emboss/apps/getorf.html
|
901
|
-
-------------------------------------------------------------------------------
|
902
|
-
(2) → set_dna_sequence alu
|
903
|
-
|
904
|
-
^^^ fetch random alu
|
905
|
-
|
906
|
-
^^^ alu sequence
|
907
|
-
Ok we started this now adding more details, but we
|
908
|
-
need to become better at searching for this
|
909
|
-
sequence.
|
910
|
-
-------------------------------------------------------------------------------
|
973
|
+
------------------------------------------------------------------------------------------
|
911
974
|
(3) → We need to make available the ... thingy magick
|
912
975
|
emboss functionality. that may seem useful
|
913
976
|
but also feel free to extend these parts for
|
914
977
|
bioroebe as necessary.
|
915
|
-
|
978
|
+
------------------------------------------------------------------------------------------
|
916
979
|
(4) → integrate electron_microscopy fully
|
917
980
|
This will take more time, so first we finish with the
|
918
981
|
taxonomy module instead.
|
919
|
-
|
982
|
+
------------------------------------------------------------------------------------------
|
920
983
|
(5) → Improve support for BLAST up until
|
921
984
|
|
922
985
|
middle of 2015 so that I am better prepared
|
@@ -927,7 +990,7 @@ also add a footer to show which entries are available or so
|
|
927
990
|
So, work on BLAST tutorial at bioinf page:
|
928
991
|
|
929
992
|
bl bioinf; rf bioinf
|
930
|
-
|
993
|
+
------------------------------------------------------------------------------------------
|
931
994
|
(3) → integrate a "codon usage database", whatever this means.
|
932
995
|
It is a cool database anyway. Then document this.
|
933
996
|
First, create a codon-usage analysis on a per-FASTA
|
@@ -935,7 +998,7 @@ also add a footer to show which entries are available or so
|
|
935
998
|
and calculate the codon usage from there.
|
936
999
|
|
937
1000
|
^^^ and add some GUI to this. hmmm
|
938
|
-
|
1001
|
+
------------------------------------------------------------------------------------------
|
939
1002
|
(4) → Input sequence:
|
940
1003
|
|
941
1004
|
MFLMVSPTAYHQNKDECFLP
|
@@ -951,46 +1014,40 @@ also add a footer to show which entries are available or so
|
|
951
1014
|
|
952
1015
|
^^^ we should also show this on the commandline AND the
|
953
1016
|
www ... hmmm.
|
954
|
-
|
1017
|
+
------------------------------------------------------------------------------------------
|
955
1018
|
(5) → enable a graphical layer so that we can find out which
|
956
1019
|
transcription factor activates which gene(s). This
|
957
1020
|
should show e. g. a transcription factor highlighting
|
958
1021
|
a target genetic area.
|
959
|
-
|
1022
|
+
------------------------------------------------------------------------------------------
|
960
1023
|
(2) → We should add more screenshots, make them available on imgur
|
961
1024
|
as well, after storing them locally. Start with the more
|
962
1025
|
important functionality.
|
963
1026
|
|
964
|
-
|
1027
|
+
------------------------------------------------------------------------------------------
|
965
1028
|
(2) → clone serial cloner or whatever the name was, that GUI,
|
966
1029
|
so that we can offer the same functionality.
|
967
|
-
|
1030
|
+
------------------------------------------------------------------------------------------
|
968
1031
|
(1) →
|
969
1032
|
|
970
1033
|
# * searching for PubMed IDs given a query string:
|
971
1034
|
# * Bio::PubMed#esearch (recommended)
|
972
1035
|
# * Bio::PubMed#search (only retrieves top 20 hits; will be deprecated)
|
973
1036
|
^^^ implement this
|
974
|
-
|
975
|
-
|
976
|
-
..........................................................................
|
1037
|
+
------------------------------------------------------------------------------------------
|
977
1038
|
(3) → Be able to solve exercise 16 in bioroebe
|
978
|
-
|
979
|
-
|
980
|
-
being a standalone part anymore.
|
981
|
-
continue on the taxonomy stuff.
|
982
|
-
ne day this will work again *shake fist*
|
983
|
-
-------------------------------------------------------------------------------
|
1039
|
+
|
1040
|
+
------------------------------------------------------------------------------------------
|
984
1041
|
(5) → re1 = Bio::RestrictionEnzyme::DoubleStranded.new(enzyme1)
|
985
1042
|
|
986
1043
|
^^^ add this? hmmmm
|
987
1044
|
^^^ from here.
|
988
|
-
|
1045
|
+
------------------------------------------------------------------------------------------
|
989
1046
|
(1) → Colourize exon/intron boundaries.
|
990
|
-
|
1047
|
+
------------------------------------------------------------------------------------------
|
991
1048
|
(2) → In bioroebe: enhance phylogeny stuff and perhaps automatically
|
992
1049
|
generate pictures here.
|
993
|
-
|
1050
|
+
------------------------------------------------------------------------------------------
|
994
1051
|
(1) → In sinatra: add a backtranseq entry point, perhaps
|
995
1052
|
alias it as well.
|
996
1053
|
|
@@ -1000,7 +1057,7 @@ bioroebe --protein-to-dna
|
|
1000
1057
|
|
1001
1058
|
^^^ this shall start the GTK3 variant
|
1002
1059
|
|
1003
|
-
|
1060
|
+
------------------------------------------------------------------------------------------
|
1004
1061
|
(1) → require 'rubygems/text'
|
1005
1062
|
include Gem::Text
|
1006
1063
|
levenshtein_distance 'shevy', 'chevy' # => 1
|
@@ -1012,13 +1069,13 @@ bioroebe --protein-to-dna
|
|
1012
1069
|
https://github.com/rubygems/rubygems/blob/master/lib/rubygems/text.rb
|
1013
1070
|
^^^ actually move that part into bioroebe itself...
|
1014
1071
|
|
1015
|
-
|
1072
|
+
------------------------------------------------------------------------------------------
|
1016
1073
|
(1) → add _source to all APIs in sinatra there. Ensure that this works
|
1017
1074
|
too. The user should be able to view the source code.
|
1018
1075
|
^^^ it has been added for 2 methods so far in sinatra; we need
|
1019
1076
|
to add it for the remaining ones too. Then we can remove
|
1020
1077
|
this entry point.
|
1021
|
-
|
1078
|
+
------------------------------------------------------------------------------------------
|
1022
1079
|
(2) → Check out expasy
|
1023
1080
|
PeptideCutter
|
1024
1081
|
also offer this functionality, through commandline, GUI
|
@@ -1026,16 +1083,12 @@ bioroebe --protein-to-dna
|
|
1026
1083
|
https://web.expasy.org/peptide_cutter/
|
1027
1084
|
We now have added trypsin but we should add more here; and
|
1028
1085
|
still have to add support for sinatra here.
|
1029
|
-
|
1086
|
+
------------------------------------------------------------------------------------------
|
1030
1087
|
(3) → melting temperature subsection
|
1031
1088
|
|
1032
1089
|
hmmm .... molecular weight calculation works now ... but
|
1033
1090
|
... is it correct for a ssDNA string? hmm...
|
1034
|
-
|
1035
|
-
(3) → Add useful formulas for bioshell.
|
1036
|
-
|
1037
|
-
|
1038
|
-
...........................................................................
|
1091
|
+
------------------------------------------------------------------------------------------
|
1039
1092
|
(1) → Degenerate Primers
|
1040
1093
|
|
1041
1094
|
You can try to determine the degenerate primers via the Shell
|
@@ -1046,7 +1099,7 @@ bioroebe --protein-to-dna
|
|
1046
1099
|
^^^ expand that subsection
|
1047
1100
|
more explanations and examples
|
1048
1101
|
|
1049
|
-
|
1102
|
+
------------------------------------------------------------------------------------------
|
1050
1103
|
(1) → Copy the functionality of plotorf:
|
1051
1104
|
|
1052
1105
|
See:
|
@@ -1062,7 +1115,7 @@ bioroebe --protein-to-dna
|
|
1062
1115
|
|
1063
1116
|
|
1064
1117
|
|
1065
|
-
|
1118
|
+
------------------------------------------------------------------------------------------
|
1066
1119
|
(2) → Start nucleotide position is at: 142
|
1067
1120
|
|
1068
1121
|
See the following example:
|
@@ -1072,24 +1125,24 @@ bioroebe --protein-to-dna
|
|
1072
1125
|
BIO SHELL>
|
1073
1126
|
^^^ this does not work; nothing is highlighted.
|
1074
1127
|
|
1075
|
-
|
1128
|
+
------------------------------------------------------------------------------------------
|
1076
1129
|
(2) → Add a myristoylation signal
|
1077
1130
|
|
1078
1131
|
Met-Gly-Xaa-Xaa-YXaa-Ser/Thr-Lys-Lys
|
1079
1132
|
|
1080
1133
|
^^^ but check first.
|
1081
1134
|
|
1082
|
-
|
1135
|
+
------------------------------------------------------------------------------------------
|
1083
1136
|
(3) → integrate the bioroebe_tutorial.cgi into the .md file completely.
|
1084
1137
|
|
1085
|
-
|
1138
|
+
------------------------------------------------------------------------------------------
|
1086
1139
|
(4) → Integrate everything from the biopython tutorial, if it makes
|
1087
1140
|
sense.
|
1088
1141
|
|
1089
|
-
|
1142
|
+
------------------------------------------------------------------------------------------
|
1090
1143
|
(5) → Improve the codon-optimizer in Bioroebe, including the
|
1091
1144
|
documentation. We need to make this really useful.
|
1092
|
-
|
1145
|
+
------------------------------------------------------------------------------------------
|
1093
1146
|
(6) →
|
1094
1147
|
5'- TACACGGCACAT -3'
|
1095
1148
|
3'- ATGTGCCGTGTA -5'
|
@@ -1098,7 +1151,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1098
1151
|
|
1099
1152
|
^^^ integrate mirror repeats creation
|
1100
1153
|
and searching for them. Hmmm.
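
A sketch for the search part, assuming a perfect mirror repeat is a stretch that reads the same in both directions on one strand, as the 5'- TACACGGCACAT -3' example does. The method names are hypothetical, and the brute-force scan is only meant for short sequences.

```ruby
# Hypothetical helper: true if the sequence is a perfect mirror repeat.
def mirror_repeat?(sequence)
  s = sequence.upcase
  s == s.reverse
end

# Hypothetical helper: returns [1-based start position, subsequence] for
# every perfect mirror repeat of at least min_length nucleotides.
def mirror_repeats(sequence, min_length: 6)
  s = sequence.upcase
  hits = []
  (0..s.length - min_length).each do |start|
    (min_length..s.length - start).each do |len|
      candidate = s[start, len]
      hits << [start + 1, candidate] if candidate == candidate.reverse
    end
  end
  hits
end

puts mirror_repeat?('TACACGGCACAT')
mirror_repeats('GGTACACGGCACATCC', min_length: 10).each do |pos, repeat|
  puts "#{pos}: #{repeat}"
end
```

Imperfect mirror repeats would need a tolerance for mismatches instead of the strict equality check used here.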
|
1101
|
-
|
1154
|
+
------------------------------------------------------------------------------------------
|
1102
1155
|
(7) → continue porting bioroebe/taxonomy
|
1103
1156
|
|
1104
1157
|
^^^^^^^^^^
|
@@ -1108,12 +1161,12 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1108
1161
|
^^^ this is the next step, so that
|
1109
1162
|
we no longer need this.
|
1110
1163
|
|
1111
|
-
|
1164
|
+
------------------------------------------------------------------------------------------
|
1112
1165
|
(8) → find out which bacteria all contain the needle complex; find out
|
1113
1166
|
the sequence for the needle complex as well and study it;
|
1114
1167
|
find the positions of the genes responsible.
|
1115
1168
|
|
1116
|
-
|
1169
|
+
------------------------------------------------------------------------------------------
|
1117
1170
|
(9) → Add trypsin_digest, also in the shell, but possibly
|
1118
1171
|
on toplevel as well (if the input is a protein sequence).
|
1119
1172
|
|
@@ -1127,29 +1180,24 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1127
1180
|
And document it; but do not digest if a proline
|
1128
1181
|
follows !!!
|
1129
1182
|
^^^ document this too into .md
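
The rule described here (cut after lysine or arginine, but not when a proline follows) can be sketched as below. The method body is illustrative only and is not the existing bioroebe implementation; real digests also involve missed cleavages, which are ignored.

```ruby
# Illustrative sketch: cleaves C-terminal of K or R unless the next
# residue is P. Returns the resulting peptide fragments in order.
def trypsin_digest(protein)
  residues = protein.upcase.chars
  fragments = []
  current = +''
  residues.each_with_index do |aa, i|
    current << aa
    cleaves = %w[K R].include?(aa) && residues[i + 1] != 'P'
    # No cut after the very last residue; that fragment is added below.
    next unless cleaves && i < residues.size - 1

    fragments << current
    current = +''
  end
  fragments << current unless current.empty?
  fragments
end

puts trypsin_digest('AAKGGRPLLKMM').join(' | ')
```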
|
1130
|
-
|
1131
|
-
-------------------------------------------------------------------------------
|
1132
|
-
(10) → in bioroebe, add a Coomassie check... do we include
|
1133
|
-
arginine or not.
|
1134
|
-
|
1135
|
-
..........................................................................
|
1183
|
+
------------------------------------------------------------------------------------------
|
1136
1184
|
(11) → add codon usage in bioroebe
|
1137
|
-
|
1185
|
+
------------------------------------------------------------------------------------------
|
1138
1186
|
(12) → Clone the following functionality.
|
1139
1187
|
|
1140
1188
|
http://www.bioinformatics.nl/cgi-bin/emboss/help/sirna
|
1141
|
-
|
1189
|
+
------------------------------------------------------------------------------------------
|
1142
1190
|
(13) → Improve the "find and scan" subsection. We must be able to find
|
1143
1191
|
subsequences; check for "matches" as well, including the bioshell.
|
1144
|
-
|
1192
|
+
------------------------------------------------------------------------------------------
|
1145
1193
|
(14) → Clone the CLUSTAL format alignment.
|
1146
|
-
|
1194
|
+
------------------------------------------------------------------------------------------
|
1147
1195
|
(15) → We need to be able to load up a whole genome into bioroebe,
|
1148
1196
|
and then be able to manipulate it.
|
1149
1197
|
|
1150
1198
|
^^^ perhaps test this with some example
|
1151
1199
|
data or so...
|
1152
|
-
|
1200
|
+
------------------------------------------------------------------------------------------
|
1153
1201
|
(16) → Restriction enzymes:
|
1154
1202
|
|
1155
1203
|
Add a subsection about restriction enzymes including
|
@@ -1163,7 +1211,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1163
1211
|
general, so that we can reproduce and verify the
|
1164
1212
|
information there.
|
1165
1213
|
|
1166
|
-
|
1214
|
+
------------------------------------------------------------------------------------------
|
1167
1215
|
(18) → clone pepinfo
|
1168
1216
|
|
1169
1217
|
The program "pepinfo" plots various amino acid properties in
|
@@ -1181,7 +1229,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1181
1229
|
|
1182
1230
|
The data are also written out to an output file.
|
1183
1231
|
|
1184
|
-
|
1232
|
+
------------------------------------------------------------------------------------------
|
1185
1233
|
(19) → gff?
|
1186
1234
|
|
1187
1235
|
There are 6 .gff3 files in the current directory.
|
@@ -1193,23 +1241,22 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1193
1241
|
|
1194
1242
|
^^^ we need an analyze-mode as well.
|
1195
1243
|
|
1196
|
-
|
1244
|
+
------------------------------------------------------------------------------------------
|
1197
1245
|
(20) → ^^^^ add the ability to
|
1198
1246
|
show a ruler AND highlighting as well
|
1199
1247
|
^^^ then document it.
|
1200
|
-
|
1248
|
+
------------------------------------------------------------------------------------------
|
1201
1249
|
(21) → https://github.com/bioperl/bioperl-live
|
1202
1250
|
Look what we can take from ^^^.
|
1203
1251
|
|
1204
1252
|
https://github.com/bioperl/bioperl-live/tree/master/examples
|
1205
1253
|
|
1206
|
-
|
1254
|
+
------------------------------------------------------------------------------------------
|
1207
1255
|
(23) → continue biojava, and bioroebe a bit
|
1208
1256
|
|
1209
1257
|
Ideally we should have biojava to a working point.
|
1210
|
-
|
1211
|
-
|
1212
|
-
..........................................................................
|
1258
|
+
|
1259
|
+
------------------------------------------------------------------------------------------
|
1213
1260
|
(25) → clone the functionality found at https://web.expasy.org/protparam/
|
1214
1261
|
|
1215
1262
|
https://web.expasy.org/cgi-bin/protparam/protparam
|
@@ -1219,7 +1266,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1219
1266
|
|
1220
1267
|
Theoretical pI: 5.78
|
1221
1268
|
|
1222
|
-
|
1269
|
+
------------------------------------------------------------------------------------------
|
1223
1270
|
(27) → NP_417539.1
|
1224
1271
|
|
1225
1272
|
https://www.ncbi.nlm.nih.gov/protein/NP_417539.1
|
@@ -1227,26 +1274,17 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1227
1274
|
|
1228
1275
|
^^^ if the input is exactly like the above, on the first line,
|
1229
1276
|
download the sequence.
|
1230
|
-
|
1231
|
-
(28) → Integrate these nice GUI parts:
|
1232
|
-
|
1233
|
-
https://dev.to/kojix2/introduction-to-gr-rb-data-visualization-with-ruby-2c39
|
1234
|
-
|
1235
|
-
-------------------------------------------------------------------------------
|
1236
|
-
(29) → http://insilico.ehu.es/
|
1237
|
-
|
1238
|
-
^^^ check if we have all of this incorporated
|
1239
|
-
-------------------------------------------------------------------------------
|
1277
|
+
------------------------------------------------------------------------------------------
|
1240
1278
|
(30) → http://www.biostars.org/
|
1241
1279
|
|
1242
1280
|
^^^ regularly work through this
|
1243
1281
|
and try to help
|
1244
1282
|
and extend bioruby at the same time.
|
1245
|
-
|
1283
|
+
------------------------------------------------------------------------------------------
|
1246
1284
|
(31) → The taxonomy-submodule should work one day, and be properly
|
1247
1285
|
documented as well. Perhaps integrate the parts of Taxonomy
|
1248
1286
|
that can be included into the toplevel domain.
|
1249
|
-
|
1287
|
+
------------------------------------------------------------------------------------------
|
1250
1288
|
(32) → Enable:
|
1251
1289
|
|
1252
1290
|
Bioroebe.set_genetic_code()
|
@@ -1262,7 +1300,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1262
1300
|
|
1263
1301
|
^^^ enable this as well; extend documentation too.
|
1264
1302
|
|
1265
|
-
|
1303
|
+
------------------------------------------------------------------------------------------
|
1266
1304
|
(34) → We have found a restriction enzyme called NheI.
|
1267
1305
|
|
1268
1306
|
The sequence this 6-cutter relates to is: `5' - GCTAGC - 3'`
|
@@ -1270,23 +1308,23 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1270
1308
|
This restriction enzyme will produce a blunt overhang.
|
1271
1309
|
|
1272
1310
|
^^^ nope, that is wrong
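A minimal sketch for entry (34), in plain Ruby and without assuming any Bioroebe API: it reports the 1-based start positions of the NheI recognition sequence `GCTAGC` in a DNA string. The method name `find_recognition_sites` is hypothetical. NheI cuts after the first G (`G^CTAGC`), which leaves a 5' CTAG overhang rather than a blunt end.

```ruby
# Hypothetical helper, not part of Bioroebe: returns the 1-based start
# positions of every occurrence of +site+ in +sequence+.
def find_recognition_sites(sequence, site)
  seq = sequence.upcase
  positions = []
  offset = 0
  # String#index returns nil once no further match exists.
  while (index = seq.index(site.upcase, offset))
    positions << index + 1 # convert the 0-based index to a 1-based position
    offset = index + 1     # advance by one so overlapping sites are found
  end
  positions
end

NHEI_SITE = 'GCTAGC'
p find_recognition_sites('AAGCTAGCTTGCTAGC', NHEI_SITE)
```

Advancing the offset by one base rather than by the site length keeps overlapping matches, which matters for shorter or repetitive recognition sequences.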
|
1273
|
-
|
1311
|
+
------------------------------------------------------------------------------------------
|
1274
1312
|
(35) → Sau3A?
|
1275
1313
|
^^^ enable this restriction site
|
1276
1314
|
|
1277
|
-
|
1315
|
+
------------------------------------------------------------------------------------------
|
1278
1316
|
(37) → Add matplotlib support.
|
1279
1317
|
|
1280
1318
|
try_to_use_matplotlib
|
1281
|
-
|
1319
|
+
------------------------------------------------------------------------------------------
|
1282
1320
|
(38) → https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmTools/RESTfulAPIs.html
|
1283
|
-
|
1321
|
+
------------------------------------------------------------------------------------------
|
1284
1322
|
(39) → The following input:
|
1285
1323
|
|
1286
1324
|
downcase; orf?; seq?
|
1287
1325
|
|
1288
1326
|
leads to strange display. Something is wrong here, must be checked.
|
1289
|
-
|
1327
|
+
------------------------------------------------------------------------------------------
|
1290
1328
|
(40) → Continue with rosalind problems.
|
1291
1329
|
|
1292
1330
|
These challenges can be found here:
|
@@ -1295,42 +1333,42 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1295
1333
|
|
1296
1334
|
Also integrate these rosalind-quizzes into bioroebe
|
1297
1335
|
when possible.
|
1298
|
-
|
1336
|
+
------------------------------------------------------------------------------------------
|
1299
1337
|
(41) → https://web.expasy.org/cgi-bin/peptide_mass/peptide-mass.pl
|
1300
1338
|
|
1301
1339
|
^^^ make the above usable in sinatra as well
|
1302
|
-
|
1340
|
+
------------------------------------------------------------------------------------------
|
1303
1341
|
(42) → Integrate a way to search for commonly known promoters:
|
1304
1342
|
|
1305
1343
|
promoters?
|
1306
1344
|
^^^ this functionality
|
1307
1345
|
^^^ this has to be expanded
|
1308
1346
|
and ...
|
1309
|
-
|
1347
|
+
------------------------------------------------------------------------------------------
|
1310
1348
|
(43) → Integrate:
|
1311
1349
|
|
1312
1350
|
http://biotools.nubic.northwestern.edu/OligoCalc.html
|
1313
|
-
|
1351
|
+
------------------------------------------------------------------------------------------
|
1314
1352
|
(44) → Extend the Java part of BioRoebe systematically..
|
1315
1353
|
|
1316
1354
|
What should come next? Let's make a list.
|
1317
1355
|
|
1318
1356
|
→ remove_numbers [DONE]
|
1319
|
-
|
1357
|
+
------------------------------------------------------------------------------------------
|
1320
1358
|
(46) → Study gnuplot; one day we have to draw graphs.
|
1321
1359
|
|
1322
|
-
|
1360
|
+
------------------------------------------------------------------------------------------
|
1323
1361
|
(47) → Add a genome browser, both ascii without GUI and also
|
1324
1362
|
with. In ruby-gtk.
|
1325
|
-
|
1363
|
+
------------------------------------------------------------------------------------------
|
1326
1364
|
(48) → Clone the functionality of:
|
1327
1365
|
|
1328
1366
|
http://www.biophp.org/minitools/restriction_digest/demo.php
|
1329
|
-
|
1367
|
+
------------------------------------------------------------------------------------------
|
1330
1368
|
(50) → Add the loxP sequence to readme [DONE] and explain this
|
1331
1369
|
better on the main readme; and perhaps also assign
|
1332
1370
|
the sequence via the bioshell.
|
1333
|
-
|
1371
|
+
------------------------------------------------------------------------------------------
|
1334
1372
|
(51) → 33. Cephalodiscidae Mitochondrial UAA-Tyr Code (transl_table=33)
|
1335
1373
|
|
1336
1374
|
AAs = FFLLSSSSYYY*CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSSKVVVVAAAADDEEGGGG
|
@@ -1341,7 +1379,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1341
1379
|
|
1342
1380
|
^^^ add a parser, and document it, that can take this input
|
1343
1381
|
and output the corresponding code, in a valid .yml file.
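The parser requested in entry (51) could look roughly like the sketch below. It assumes the input follows the NCBI layout with `AAs`, `Base1`, `Base2` and `Base3` lines of 64 characters each; the method names `parse_ncbi_genetic_code` and `genetic_code_to_yaml` are hypothetical and not part of Bioroebe.

```ruby
require 'yaml'

# Hypothetical helper: turns an NCBI-style genetic code block into a
# Hash that maps each codon (such as "TAA") to its one-letter amino acid.
def parse_ncbi_genetic_code(text)
  fields = text.scan(/^\s*(AAs|Base1|Base2|Base3)\s*=\s*(\S+)/).to_h
  aas, base1, base2, base3 = fields.values_at('AAs', 'Base1', 'Base2', 'Base3')
  unless [aas, base1, base2, base3].all? { |line| line && line.length == 64 }
    raise ArgumentError, 'expected AAs, Base1, Base2 and Base3 lines of 64 characters'
  end
  # Column i of the three Base lines spells codon i; column i of AAs is its amino acid.
  codons = base1.chars.zip(base2.chars, base3.chars).map(&:join)
  codons.zip(aas.chars).to_h
end

# Serialises the codon table as YAML text, ready to be written to a .yml file.
def genetic_code_to_yaml(text)
  YAML.dump(parse_ncbi_genetic_code(text))
end
```

The returned string can be written out with `File.write('table_33.yml', genetic_code_to_yaml(input))`; the file name here is only an example.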
|
1344
|
-
|
1382
|
+
------------------------------------------------------------------------------------------
|
1345
1383
|
(52) → Add to bioroebe the ability to add cloning vectors
|
1346
1384
|
and molecular_weight calculation
|
1347
1385
|
for this
|
@@ -1363,19 +1401,19 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1363
1401
|
|
1364
1402
|
^^^ we also need a way to find out what resistance genes
|
1365
1403
|
are carried there.
|
1366
|
-
|
1404
|
+
------------------------------------------------------------------------------------------
|
1367
1405
|
(53) → In the lambda genome sequence there are 10 EcoB and
|
1368
1406
|
5 EcoK sites.
|
1369
1407
|
^^^ verify this too, as an example as well
|
1370
|
-
|
1408
|
+
------------------------------------------------------------------------------------------
|
1371
1409
|
(54) → show restriction sites, composable and compatible with
|
1372
1410
|
serial clone ... hmm
|
1373
|
-
|
1411
|
+
------------------------------------------------------------------------------------------
|
1374
1412
|
(55) → enable:
|
1375
1413
|
BIOROEBE_USE_COLOURS:
|
1376
1414
|
can be 0 or 1
|
1377
1415
|
what is this?
|
1378
|
-
|
1416
|
+
------------------------------------------------------------------------------------------
|
1379
1417
|
(56) → Burrows-Wheeler-Transform (BWT)
|
1380
1418
|
|
1381
1419
|
^^^ add some method here
|
@@ -1388,15 +1426,15 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1388
1426
|
|
1389
1427
|
also test this against my paper-result
|
1390
1428
|
with input being: "GATAG$".
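As a starting point for the method requested in entry (56), here is a naive sketch of the Burrows-Wheeler transform: it builds all rotations, sorts them, and takes the last column. The method name `burrows_wheeler_transform` is hypothetical, and the sketch assumes the input ends with the sentinel `$`, which sorts before all letters in byte order.

```ruby
# Hypothetical helper: naive Burrows-Wheeler transform via sorted rotations.
# Fine for short strings; a genome-scale version would use a suffix array.
def burrows_wheeler_transform(text)
  raise ArgumentError, "input must end with the '$' sentinel" unless text.end_with?('$')

  # Rotation i moves the first i characters to the end of the string.
  rotations = Array.new(text.length) { |i| text[i..-1] + text[0...i] }
  # The transform is the last character of each rotation in sorted order.
  rotations.sort.map { |rotation| rotation[-1] }.join
end

puts burrows_wheeler_transform('GATAG$') # → GTGA$A
```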
|
1391
|
-
|
1429
|
+
------------------------------------------------------------------------------------------
|
1392
1430
|
(56) → Enable working with several genes... hmm and store that somewhere.
|
1393
1431
|
Something like a per-project workspace thingy.
|
1394
|
-
|
1432
|
+
------------------------------------------------------------------------------------------
|
1395
1433
|
(57) → Add:
|
1396
1434
|
|
1397
1435
|
http://nar.oxfordjournals.org/content/35/suppl_2/W71.long
|
1398
1436
|
|
1399
|
-
|
1437
|
+
------------------------------------------------------------------------------------------
|
1400
1438
|
(58) → Now, you may want to translate the nucleotides up to
|
1401
1439
|
the first in frame stop codon, and then stop (as
|
1402
1440
|
happens in nature):
|
@@ -1410,14 +1448,14 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1410
1448
|
Then continue from here:
|
1411
1449
|
|
1412
1450
|
https://people.duke.edu/~ccc14/pcfb/biopython/BiopythonSequences.html
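The translation "up to the first in frame stop codon" from entry (58) can be sketched in plain Ruby as follows. This is an illustration under stated assumptions, not Bioroebe's implementation: it uses the standard genetic code (NCBI table 1), reads frame 1 only, and the names `CODON_TABLE` and `translate_to_stop` are hypothetical.

```ruby
BASES = %w[T C A G].freeze
# Standard genetic code (NCBI transl_table=1), codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
CODON_TABLE = BASES.product(BASES, BASES).map(&:join).zip(AMINO_ACIDS.chars).to_h.freeze

# Hypothetical helper: translates +dna+ codon by codon and stops at the
# first in-frame stop codon. Unknown codons (such as those containing N)
# become 'X'; a trailing partial codon is ignored.
def translate_to_stop(dna)
  protein = +''
  dna.upcase.tr('U', 'T').scan(/.{3}/).each do |codon|
    amino_acid = CODON_TABLE.fetch(codon, 'X')
    break if amino_acid == '*'
    protein << amino_acid
  end
  protein
end

puts translate_to_stop('ATGGCCTAAGGG') # → MA
```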
|
1413
|
-
|
1451
|
+
------------------------------------------------------------------------------------------
|
1414
1452
|
(59) → Add:
|
1415
1453
|
|
1416
1454
|
set_dna :Ubiquitin
|
1417
1455
|
set_dna :ubiquitin
|
1418
1456
|
|
1419
1457
|
^^^ we want to obtain the ubiquitin sequence
|
1420
|
-
|
1458
|
+
------------------------------------------------------------------------------------------
|
1421
1459
|
(59) → Telomeres
|
1422
1460
|
|
1423
1461
|
Telomeres are listed from 5' to 3'.
|
@@ -1431,28 +1469,28 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1431
1469
|
doc_telomeres
|
1432
1470
|
|
1433
1471
|
^^^ add this to say the human telomere sequence
|
1434
|
-
|
1472
|
+
------------------------------------------------------------------------------------------
|
1435
1473
|
(60) → ORF_positions?
|
1436
1474
|
^^^ change this a bit, to actually show the positions
|
1437
1475
|
of the various ORFs with the start-position.
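Reporting ORF positions as asked for in entry (60) could work along these lines. The sketch assumes an ORF runs from an ATG to the next in-frame stop codon on the forward strand only; the method name `orf_positions` is hypothetical and the existing `ORF_positions?` command may behave differently.

```ruby
STOP_CODONS = %w[TAA TAG TGA].freeze

# Hypothetical helper: returns [start, stop] pairs (1-based, inclusive,
# stop codon included) for every ATG...stop ORF in the three forward frames.
def orf_positions(dna)
  seq = dna.upcase
  orfs = []
  (0..2).each do |frame|
    start = nil
    # Walk the frame codon by codon; the last codon starts at length - 3.
    frame.step(seq.length - 3, 3) do |i|
      codon = seq[i, 3]
      if start.nil? && codon == 'ATG'
        start = i
      elsif start && STOP_CODONS.include?(codon)
        orfs << [start + 1, i + 3]
        start = nil
      end
    end
  end
  orfs.sort
end

p orf_positions('CCATGAAATAGCC') # → [[3, 11]]
```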
|
1438
|
-
|
1476
|
+
------------------------------------------------------------------------------------------
|
1439
1477
|
(62) → add:
|
1440
1478
|
|
1441
1479
|
setgene2
|
1442
1480
|
add_dna2
|
1443
1481
|
dna2
|
1444
1482
|
dna? <--- this one is not a setter but a query.
|
1445
|
-
|
1483
|
+
------------------------------------------------------------------------------------------
|
1446
1484
|
(63) → improve the TM calculation. must be better, must have more
|
1447
1485
|
documentation, and a small tutorial.
|
1448
|
-
|
1486
|
+
------------------------------------------------------------------------------------------
|
1449
1487
|
(64) → Compare bioroebe to:
|
1450
1488
|
|
1451
1489
|
https://www.ncbi.nlm.nih.gov/orffinder
|
1452
1490
|
|
1453
1491
|
whether both return the same
|
1454
1492
|
also possibly add a web-gui
|
1455
|
-
|
1493
|
+
------------------------------------------------------------------------------------------
|
1456
1494
|
(65) → Find out ratios from:
|
1457
1495
|
|
1458
1496
|
Doolittle RF. 1989. Redundancies in protein sequences. I
|
@@ -1478,16 +1516,16 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1478
1516
|
Bioroebe::Blosum[50] as an API.
|
1479
1517
|
and document it in general.
|
1480
1518
|
|
1481
|
-
|
1519
|
+
------------------------------------------------------------------------------------------
|
1482
1520
|
(65) → http://www.biomart.org/other/user-docs.pdf
|
1483
1521
|
^^^ work through this
|
1484
|
-
|
1522
|
+
------------------------------------------------------------------------------------------
|
1485
1523
|
(66) → add:
|
1486
1524
|
|
1487
1525
|
class Cell
|
1488
1526
|
^^^ simulate a cell
|
1489
1527
|
Hmmm. Needs specific components ... and needs a better plan.
|
1490
|
-
|
1528
|
+
------------------------------------------------------------------------------------------
|
1491
1529
|
(68) → class Protein:
|
1492
1530
|
|
1493
1531
|
add glycosylation pattern
|
@@ -1496,18 +1534,18 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1496
1534
|
need to somehow add the modification type
|
1497
1535
|
|
1498
1536
|
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5358406/
|
1499
|
-
|
1537
|
+
------------------------------------------------------------------------------------------
|
1500
1538
|
(69) → In the BioShell we must be able to do probes - complementary
|
1501
1539
|
to amino acids.
|
1502
|
-
|
1540
|
+
------------------------------------------------------------------------------------------
|
1503
1541
|
(70) → Add www-related functionality to bioroebe eventually make use
|
1504
1542
|
of rails, but start with sinatra possibly. In the long run,
|
1505
1543
|
make it flexible to work with as many different frameworks
|
1506
1544
|
as possible, though.
|
1507
|
-
|
1545
|
+
------------------------------------------------------------------------------------------
|
1508
1546
|
(71) → Show cleavage sites, for example for a lambda-DNA digest with
|
1509
1547
|
Bgl II.
|
1510
|
-
|
1548
|
+
------------------------------------------------------------------------------------------
|
1511
1549
|
(72) → dnaanalyze
|
1512
1550
|
|
1513
1551
|
In the DNA string `TCCGTCGCAACACATCGCCTCAACAAACCGACCGGGATATGCAATACCGGAATCCGATCCTTTAGAAGCTGCATTCCAAACGCTTGCAATAACACCCACTCGACTATTCAGCATTGGCAAAGGGTACGAATTCGACGAAGGGAGGGTGCTATATTTTCCAAGTTGCTCGCCGATTGATACGGAGCCTGTGGAAAGATTTCGCGGCTCTAGTCTTTAGCTTTGATGTCACCCCTGAGTAGTAACCCGGCGTGGTAGCTTTCATTAGACTTCTCGGAGAGAGTATTAAGCAAAGGTGGAGGTCCCAGGGGTCCAGTGAGCTGTATCGCACTAAAAGCATGCCTACGGGCAATGCTATTTTGCTCACAGGAACTTTGGGGGAGCCACAAACTCTCGAAGCCGGATTGTTGTGGCGGCTAACTTTCCAAAGGCGACCATTCATGGTCTGAATGGGCCCTCACCAGAAGAACGTTTTCGACGGGCATTCTTCCCCGGGGTTTCGAAGGCAAGGGTCAGCACGGCGCGGAAAAGTACGCGACGCATACCGGACTAGTCATGCAACTCCCTCGGAACTGGCGATTCCCACCCAAGAGACGCACGCTGATCATTGCCCATGCCGACTGGAGATGCTGAATTTGGTATGCGGGTCTGTTGCCAGCGCTGACATTATCGGACATTGTGGGGAGAACCGTGTGATTGATTGAGCTGGCGCATTTGTCCGCATGCTCTCCTCATGTGGACACCTTCGCAGGTTCTTTCCGCGGCCACAGTGTCGGGATCTACCCCTGGTGCGTCGCCGCGAGTACAGGTGGGGTTTCGCGCATGAGAACCAATGTTGCACGCCTCAAAACATGGCTGTAACATATTAGCGCCAATAAAAATTTTTGGCAACAAAGAAACAAGGCCAACCGAAGTGCTAAGCCGCGATCATGAAGGGGCGATGCCAGAATGGGAGTCTGCCTTTCCTGTGTGGACGTGAGATTGTACCTAGACAGAGAACGCC` we found these Nucleotides:
|
@@ -1532,11 +1570,11 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1532
1570
|
we need to make it so that an input sequence
|
1533
1571
|
can be assigned, and dnaanalyse --GUI should
|
1534
1572
|
start it too. ALSO document it once this works.
|
1535
|
-
|
1573
|
+
------------------------------------------------------------------------------------------
|
1536
1574
|
(73) → go through the individual components slowly and improve them,
|
1537
1575
|
step by step, including the documentation. Then eventually
|
1538
1576
|
remove this todo-entry here.
|
1539
|
-
|
1577
|
+
------------------------------------------------------------------------------------------
|
1540
1578
|
(74) → Add a consensus sequence for:
|
1541
1579
|
|
1542
1580
|
Asn-X-Ser/Thr-Consensus
|
@@ -1548,13 +1586,13 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1548
1586
|
NGlyc
|
1549
1587
|
/N-?Glyc/i
|
1550
1588
|
^^^ use that regex
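A sketch of how the Asn-X-Ser/Thr consensus from entry (74) could be located in a protein sequence. The method name `n_glycosylation_sites` is hypothetical; excluding proline at the X position is the commonly used form of the sequon and is an assumption here, since the entry only says "X".

```ruby
# Asn, then any residue except Pro (assumption), then Ser or Thr.
N_GLYC_SEQUON = /\AN[^P][ST]\z/

# Hypothetical helper: returns the 1-based positions of the Asn residue
# of every N-X-S/T sequon. A sliding window is used so that overlapping
# sequons are reported as well.
def n_glycosylation_sites(protein)
  seq = protein.upcase
  positions = []
  (0..seq.length - 3).each do |i|
    positions << i + 1 if seq[i, 3].match?(N_GLYC_SEQUON)
  end
  positions
end

p n_glycosylation_sites('MNGSANPTNKT') # → [2, 9]
```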
|
1551
|
-
|
1589
|
+
------------------------------------------------------------------------------------------
|
1552
1590
|
(74) → make sure that newly generated files respect the
|
1553
1591
|
default chmod value on the system. from bioroebe.
|
1554
1592
|
right now we default to 755 which I assume is
|
1555
1593
|
hardcoded but perhaps this is wrong.
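One way to respect the system default instead of a hardcoded 755 is to derive the mode from the process umask. The helper name `default_file_mode` is hypothetical; `File.umask` is part of Ruby's core library.

```ruby
# Hypothetical helper: the permission bits a newly created regular file
# gets when the base mode 0666 is masked by the given umask.
def default_file_mode(umask = File.umask)
  0o666 & ~umask
end

printf("%o\n", default_file_mode(0o022)) # → 644
```

Note that `File.open(path, 'w')` already creates files with 0666 masked by the umask, so simply not calling `chmod` afterwards may be enough.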
|
1556
1594
|
|
1557
|
-
|
1595
|
+
------------------------------------------------------------------------------------------
|
1558
1596
|
(75) → require 'bio'
|
1559
1597
|
|
1560
1598
|
# creating a Bio::Sequence::NA object containing ambiguous alphabets
|
@@ -1579,34 +1617,34 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
|
|
1579
1617
|
part into a standalone file
|
1580
1618
|
so that it can be used by both the .cgi and
|
1581
1619
|
well rdoc...
|
1582
|
-
|
1620
|
+
------------------------------------------------------------------------------------------
|
1583
1621
|
- Add more protein-specific thingies to bioroebe.
|
1584
|
-
|
1622
|
+
------------------------------------------------------------------------------------------
|
1585
1623
|
- Push the bioshell forward and work through std_biology.rb.
|
1586
1624
|
Perhaps we can move some of it out into a class
|
1587
1625
|
or so.
|
1588
1626
|
|
1589
1627
|
All of this should also be linked with Webmin (biomin), so that
|
1590
1628
|
we can also use the Bioshell elegantly over the www!
|
1591
|
-
|
1629
|
+
------------------------------------------------------------------------------------------
|
1592
1630
|
- ^^^ when we find restriction enzyme sites in a DNA
|
1593
1631
|
string, colourize them RED.
|
1594
1632
|
|
1595
1633
|
also set it to
|
1596
1634
|
set_restriction_size()
|
1597
|
-
|
1635
|
+
------------------------------------------------------------------------------------------
|
1598
1636
|
- also ... while learning C++ we extend the project here...
|
1599
1637
|
Useful C++ things will be combined.
|
1600
|
-
|
1638
|
+
------------------------------------------------------------------------------------------
|
1601
1639
|
- As of April 2003, there were 176,890 total taxa represented.
|
1602
1640
|
|
1603
1641
|
^^^ we need a way to also output how many entries we
|
1604
1642
|
have there.
|
1605
|
-
|
1643
|
+
------------------------------------------------------------------------------------------
|
1606
1644
|
- Replace bioruby with bioroebe completely!
|
1607
1645
|
In order for this to work, we first need to find out
|
1608
1646
|
what bioruby is able to do. :P
|
1609
|
-
|
1647
|
+
------------------------------------------------------------------------------------------
|
1610
1648
|
- append 33
|
1611
1649
|
# ^^^ in the bioshell
|
1612
1650
|
Only numbers were given: Adding 33 random nucleotides to the main string next.
|
@@ -1626,7 +1664,7 @@ Did you mean? return_random_codon_sequence_for_this_aminoacid_sequence
|
|
1626
1664
|
|
1627
1665
|
|
1628
1666
|
^^^^^ BUG!
|
1629
|
-
|
1667
|
+
------------------------------------------------------------------------------------------
|
1630
1668
|
> rest?
|
1631
1669
|
|
1632
1670
|
We found these restriction sites within the sequence `TTCAGAACTCAACGCCTGGTTGGCCGTCCAGTAAGCTGACTAAGTAAGTCTATGCCCGCGATAACCAGGATACAGATATCGTGAAACCTGGTTTATCTCCTTCTATAAGAGTCTGCACATCTAGC`:
|
@@ -1656,7 +1694,7 @@ We found these restriction sites within the sequence `TTCAGAACTCAACGCCTGGTTGGCCG
|
|
1656
1694
|
^^^^ also show the position
|
1657
1695
|
|
1658
1696
|
|
1659
|
-
|
1697
|
+
------------------------------------------------------------------------------------------
|
1660
1698
|
|
1661
1699
|
PMID entries are:
|
1662
1700
|
|
@@ -1728,7 +1766,7 @@ We found these restriction sites within the sequence `TTCAGAACTCAACGCCTGGTTGGCCG
|
|
1728
1766
|
|
1729
1767
|
|
1730
1768
|
|
1731
|
-
|
1769
|
+
------------------------------------------------------------------------------------------
|
1732
1770
|
In the database search, the measured masses are compared with the peptide masses
|
1733
1771
|
of all proteins or genes in a database (NCBI, UniProt). For this, DNA
|
1734
1772
|
sequences are translated into protein sequences and cut in silico with the protease
|
@@ -1738,7 +1776,7 @@ used in the digest.
|
|
1738
1776
|
|
1739
1777
|
|
1740
1778
|
|
1741
|
-
|
1779
|
+
------------------------------------------------------------------------------------------
|
1742
1780
|
Complexity of libraries:
|
1743
1781
|
How many independent clones are necessary to represent a genome (plant,
|
1744
1782
|
animal/fungus) or how many such clones have to be screened to have realistic
|
@@ -1773,7 +1811,7 @@ have to be hybridized.
|
|
1773
1811
|
|
1774
1812
|
|
1775
1813
|
|
1776
|
-
|
1814
|
+
------------------------------------------------------------------------------------------
|
1777
1815
|
|
1778
1816
|
BIO SHELL> BglI?
|
1779
1817
|
|
@@ -1818,12 +1856,12 @@ List all enzymes that produce compatible ends for the enzyme.
|
|
1818
1856
|
http://biopython.org/DIST/docs/api/Bio.Restriction.Restriction.Blunt-class.html
|
1819
1857
|
|
1820
1858
|
|
1821
|
-
|
1859
|
+
------------------------------------------------------------------------------------------
|
1822
1860
|
https://www.reddit.com/r/bioinformatics/comments/5o3kn8/bioinformatics_contest_2017_jan_23rd29th_solve_as/
|
1823
|
-
|
1861
|
+
------------------------------------------------------------------------------------------
|
1824
1862
|
(1) → Finish all of biophp integration into bioroebe.
|
1825
1863
|
http://www.biophp.org/
|
1826
|
-
|
1864
|
+
------------------------------------------------------------------------------------------
|
1827
1865
|
|
1828
1866
|
locate oriC here:
|
1829
1867
|
|
@@ -1858,13 +1896,13 @@ But I do not know how to locate ORIs.
|
|
1858
1896
|
|
1859
1897
|
|
1860
1898
|
|
1861
|
-
|
1899
|
+
------------------------------------------------------------------------------------------
|
1862
1900
|
^^^ also integrate git into bioroebe.
|
1863
|
-
|
1901
|
+
------------------------------------------------------------------------------------------
|
1864
1902
|
WE HAVE TO IMPROVE THIS HERE EXTREMELY.
|
1865
1903
|
|
1866
1904
|
THEN UPLOAD IT AND USE IT AS A BASIS FOR APPLICATIONS.
|
1867
|
-
|
1905
|
+
------------------------------------------------------------------------------------------
|
1868
1906
|
|
1869
1907
|
Study MetaCyc
|
1870
1908
|
^^^ study metabolic pathways.
|
@@ -1873,7 +1911,7 @@ http://metacyc.org/
|
|
1873
1911
|
|
1874
1912
|
→ Create KuroMetaCyc, in Analogy towards Metabolic Cycle.
|
1875
1913
|
|
1876
|
-
|
1914
|
+
------------------------------------------------------------------------------------------
|
1877
1915
|
|
1878
1916
|
Welcome to BioShell May 2012. Type "help" to get some help.
|
1879
1917
|
|
@@ -1895,7 +1933,7 @@ When we type this, we then ask:
|
|
1895
1933
|
|
1896
1934
|
|
1897
1935
|
|
1898
|
-
|
1936
|
+
------------------------------------------------------------------------------------------
|
1899
1937
|
|
1900
1938
|
http://biopython.org/DIST/docs/cookbook/Restriction.html#mozTocId101269
|
1901
1939
|
|
@@ -1985,16 +2023,16 @@ sufficient.
|
|
1985
2023
|
|
1986
2024
|
|
1987
2025
|
|
1988
|
-
|
2026
|
+
------------------------------------------------------------------------------------------
|
1989
2027
|
BioTodo - GENESIS, science fiction.
|
1990
2028
|
|
1991
2029
|
- create virus(:which_one, :amount) # Note the difference to the below
|
1992
2030
|
- create hydra(:amount)
|
1993
2031
|
- create bread
|
1994
|
-
|
2032
|
+
------------------------------------------------------------------------------------------
|
1995
2033
|
→ both
|
1996
2034
|
^ should work, does not work right now.
|
1997
|
-
|
2035
|
+
------------------------------------------------------------------------------------------
|
1998
2036
|
→ Taxonomy is now integrated into bioroebe. This is good but we need more
|
1999
2037
|
documentation, some more tests, a rethinking of the layout and the
|
2000
2038
|
structures, and a fixing of the query-part of the database.
|
@@ -2008,13 +2046,13 @@ sufficient.
|
|
2008
2046
|
at about the same time \o/
|
2009
2047
|
AND document this related-problems too
|
2010
2048
|
Integrate this some other day...
|
2011
|
-
|
2049
|
+
------------------------------------------------------------------------------------------
|
2012
2050
|
- http://www.restrictionmapper.org/cgi-bin/sitefind3.pl
|
2013
2051
|
|
2014
2052
|
^^^ This functionality should be integrated, so that
|
2015
2053
|
ALL restriction enzymes are tried out, starting from
|
2016
2054
|
a given sequence.
|
2017
|
-
|
2055
|
+
------------------------------------------------------------------------------------------
|
2018
2056
|
→ A search is essentially substring search across a database of strings
|
2019
2057
|
(albeit with a smaller alphabet). Some common use cases: one,
|
2020
2058
|
scientists will search for certain genes that they've used in engineered
|
@@ -2033,13 +2071,13 @@ ausreichend.
|
|
2033
2071
|
Bioroebe::DetermineOptimalCodons
|
2034
2072
|
^^^ this is currently incomplete.
|
2035
2073
|
|
2036
|
-
|
2074
|
+
------------------------------------------------------------------------------------------
|
2037
2075
|
→ Redo restrictions enzymes completely.
|
2038
2076
|
And polish this a LOT.
|
2039
2077
|
This may take some days. But we want this to be REALLY good and
|
2040
2078
|
lasting for a long time.
|
2041
2079
|
Need to keep on working at that!
|
2042
|
-
|
2080
|
+
------------------------------------------------------------------------------------------
|
2043
2081
|
→ Add: average_aminoacid_weight?
|
2044
2082
|
|
2045
2083
|
|
@@ -2077,7 +2115,7 @@ end
|
|
2077
2115
|
→ We must be able to align not only nucleotides but also aminoacids.
|
2078
2116
|
But where is the alignment comparer? perhaps hamming distance?
|
2079
2117
|
hmm we have to see.
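For the Hamming distance idea above, a minimal sketch that works for nucleotide and aminoacid strings alike. The method name `hamming_distance` is hypothetical; the Hamming distance is only defined for sequences of equal length, so unequal input raises an error rather than being aligned.

```ruby
# Hypothetical helper: counts the positions at which two equally long
# sequences (DNA, RNA or protein) differ, ignoring case.
def hamming_distance(first, second)
  unless first.length == second.length
    raise ArgumentError, 'Hamming distance needs sequences of equal length'
  end
  first.upcase.chars.zip(second.upcase.chars).count { |a, b| a != b }
end

puts hamming_distance('GATTACA', 'GACTATA') # → 2
```

Sequences of different length need a real alignment with gaps before a score can be computed, so this is only a first comparison step.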
|
2080
|
-
|
2118
|
+
------------------------------------------------------------------------------------------
|
2081
2119
|
→ /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/menu.rb:311:in `menu': undefined method `upcase' for ["EcoRI"]:Array (NoMethodError)
|
2082
2120
|
from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/user_input.rb:31:in `block in enter_main_loop'
|
2083
2121
|
from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/user_input.rb:12:in `loop'
|
@@ -2106,12 +2144,12 @@ end
|
|
2106
2144
|
at this date.'
|
2107
2145
|
SendEmail.new to: Roebe.email?, data
|
2108
2146
|
|
2109
|
-
|
2147
|
+
------------------------------------------------------------------------------------------
|
2110
2148
|
|
2111
2149
|
|
2112
2150
|
→ Document which parts of emboss have already been copied.
|
2113
2151
|
→ EMBOSS.md
|
2114
|
-
|
2152
|
+
------------------------------------------------------------------------------------------
|
2115
2153
|
|
2116
2154
|
|
2117
2155
|
|
@@ -2168,7 +2206,7 @@ Traceback (most recent call last):
|
|
2168
2206
|
|
2169
2207
|
http://www.snapgene.com/products/snapgene_viewer/
|
2170
2208
|
|
2171
|
-
|
2209
|
+
------------------------------------------------------------------------------------------
|
2172
2210
|
(1) → We should support GFP tagging, that is, what the
|
2173
2211
|
protein construct should look like and so on.
|
2174
2212
|
This partially works...
|
@@ -2177,22 +2215,22 @@ Traceback (most recent call last):
|
|
2177
2215
|
inserts the sequence as the main DNA sequence.
|
2178
2216
|
What is missing? Hmmmm... possibly more
|
2179
2217
|
documentation.
|
2180
|
-
|
2218
|
+
------------------------------------------------------------------------------------------
|
2181
2219
|
|
2182
2220
|
- in bioroebe, create subsequences for siRNA, then scan for
|
2183
2221
|
submatcher + report where these are. Should be fast too.
|
2184
|
-
|
2222
|
+
------------------------------------------------------------------------------------------
|
2185
2223
|
- Reverse complement now works quite well, also via the sinatra
|
2186
2224
|
interface. We still should have a way to show 5' and
|
2187
2225
|
3', both on the commandline, and via sinatra.
|
2188
2226
|
Perhaps via --fancy commandline flag or so.
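What such a "fancy" output could look like, sketched in plain Ruby. Both method names are hypothetical and the `--fancy` flag itself is only proposed in the entry above; the sketch handles the four unambiguous DNA bases only.

```ruby
# Hypothetical helper: reverse complement of a DNA string (A, C, G, T only).
def reverse_complement(dna)
  dna.upcase.reverse.tr('ACGT', 'TGCA')
end

# Hypothetical helper: the same result with explicit strand ends, as a
# --fancy flag might print it.
def fancy_reverse_complement(dna)
  "5'-#{reverse_complement(dna)}-3'"
end

puts fancy_reverse_complement('ATGC') # → 5'-GCAT-3'
```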
|
2189
|
-
|
2227
|
+
------------------------------------------------------------------------------------------
|
2190
2228
|
- Cn3D files?
|
2191
2229
|
^^^ add support for these; research what they are, too.
|
2192
|
-
|
2230
|
+
------------------------------------------------------------------------------------------
|
2193
2231
|
- Consider adding graphviz, perhaps to the taxonomy project
|
2194
2232
|
where we make graphs towards different nodes or so...
|
2195
|
-
|
2233
|
+
------------------------------------------------------------------------------------------
|
2196
2234
|
- in parse fasta
|
2197
2235
|
@colourize_sequence = false
|
2198
2236
|
^^^ change this later on...
|
@@ -2200,7 +2238,7 @@ Traceback (most recent call last):
|
|
2200
2238
|
this method now exists, but we still have to make
|
2201
2239
|
the check better whether it is a protein or a DNA/RNA
|
2202
2240
|
add a toplevel method for this.
|
2203
|
-
|
2241
|
+
------------------------------------------------------------------------------------------
|
2204
2242
|
- clone the BLAST ident matcher functionality for aminoacids into
|
2205
2243
|
Bioroebe.
|
2206
2244
|
|
@@ -2215,7 +2253,7 @@ Traceback (most recent call last):
|
|
2215
2253
|
|
2216
2254
|
|
2217
2255
|
|
2218
|
-
|
2256
|
+
------------------------------------------------------------------------------------------
|
2219
2257
|
- Be able to mark exon/intron boundaries.
|
2220
2258
|
|
2221
2259
|
- Add "taxid?" to tell us the name of the organism. This works now.
|
@@ -2259,9 +2297,9 @@ Traceback (most recent call last):
|
|
2259
2297
|
|
2260
2298
|
^^^
|
2261
2299
|
study sumoplot ...
|
2262
|
-
|
2300
|
+
------------------------------------------------------------------------------------------
|
2263
2301
|
- http://a-little-book-of-r-for-bioinformatics.readthedocs.io/en/latest/src/chapter7.html
|
2264
|
-
|
2302
|
+
------------------------------------------------------------------------------------------
|
2265
2303
|
- http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc22
|
2266
2304
|
^^^ continue here; "You can also specify the table using the
|
2267
2305
|
NCBI table number which is shorter, and often included in
|
@@ -2269,7 +2307,7 @@ Traceback (most recent call last):
|
|
2269
2307
|
|
2270
2308
|
^^^ work through this and see if it is good.
|
2271
2309
|
|
2272
|
-
|
2310
|
+
------------------------------------------------------------------------------------------
|
2273
2311
|
|
2274
2312
|
- Clone ALL of biophp, if it is useful.
|
2275
2313
|
|
@@ -2316,7 +2354,7 @@ Palindromic sequences finder
|
|
2316
2354
|
We should also put this part into a doc/ subsection
|
2317
2355
|
to keep track of what is missing and what is not.
|
2318
2356
|
|
2319
|
-
|
2357
|
+
------------------------------------------------------------------------------------------
|
2320
2358
|
(1) → sizeseq
|
2321
2359
|
|
2322
2360
|
^^^ clone this functionality and describe it in detail.
|
@@ -2353,7 +2391,7 @@ foobar.fasta
|
|
2353
2391
|
|
2354
2392
|
ALSO ADD A GUI; sizeseq.rb was added in February 2021.
|
2355
2393
|
|
2356
|
-
|
2394
|
+
------------------------------------------------------------------------------------------
|
2357
2395
|
- In the sinatra-web-interface for Bioroebe:
|
2358
2396
|
continue quiz in rosalind !!!
|
2359
2397
|
also, at to_dna: default to RNA
|
@@ -2372,8 +2410,8 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
|
|
2372
2410
|
→ formatted_view
|
2373
2411
|
111^^^^ in ncbi format
|
2374
2412
|
and document all of this.
|
2375
|
-
|
2376
|
-
|
2413
|
+
------------------------------------------------------------------------------------------
|
2414
|
+
------------------------------------------------------------------------------------------
|
2377
2415
|
- Add ruby-GUI stuff, probably the old biology/ subsection
|
2378
2416
|
will be moved into the project.
|
2379
2417
|
|
@@ -2470,7 +2508,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
|
|
2470
2508
|
|
2471
2509
|
|
2472
2510
|
|
2473
|
-
|
2511
|
+
------------------------------------------------------------------------------------------
|
2474
2512
|
- Identifying amino acid cleavage sites (Sigcleave)
|
2475
2513
|
|
2476
2514
|
For amino acid sequences we may be interested to know whether
|
@@ -2533,29 +2571,22 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
|
|
2533
2571
|
^^^ there is a bug here. try again later.
|
2534
2572
|
|
2535
2573
|
|
2536
|
-
- We will read from NM_001180897.3_Saccharomyces_cerevisiae_S288c_Aga2p_AGA2.fasta
|
2537
|
-
|
2538
|
-
The file NM_001180897.3_Saccharomyces_cerevisiae_S288c_Aga2p_AGA2.fasta has this FASTA header:
|
2539
|
-
|
2540
|
-
>gi|398364826|ref|NM_001180897.3| Saccharomyces cerevisiae S288c Aga2p (AGA2), mRNA
|
2541
2574
|
|
2542
|
-
^^^ this should also (optionally) tell us the organism, via a switch.
|
2543
|
-
for this we need some way to return the taxonomic ID of an organism
|
2544
2575
|
|
2545
2576
|
- we have to add expasy...
|
2546
2577
|
functionality to the cmdline too.
|
2547
2578
|
Which one specifically? Let's see...
|
2548
2579
|
|
2549
2580
|
https://www.expasy.org/
|
2550
|
-
|
2581
|
+
------------------------------------------------------------------------------------------
|
2551
2582
|
- https://biopython.org/wiki/Category%3ACookbook
|
2552
2583
|
^^^ clone that
|
2553
|
-
|
2584
|
+
------------------------------------------------------------------------------------------
|
2554
2585
|
- include covid genome, and begin to analyse it in bioroebe
|
2555
2586
|
"The genome of SARS-CoV-2 is said to be twice as large as that
|
2556
2587
|
of influenza viruses, so the latter appear to mutate four times
|
2557
2588
|
as fast, Moshiri wrote."
|
2558
|
-
|
2589
|
+
------------------------------------------------------------------------------------------
|
2559
2590
|
- Look at the GUIs that are part of the BioRoebe project.
|
2560
2591
|
|
2561
2592
|
Polish these parts, at least one widget, then
|
@@ -2570,7 +2601,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
|
|
2570
2601
|
|
2571
2602
|
Hmmm. And then, also consider transitioning into gtk3,
|
2572
2603
|
and make more screenshots.
|
2573
|
-
|
2604
|
+
------------------------------------------------------------------------------------------
|
2574
2605
|
|
2575
2606
|
- https://www.ebi.ac.uk/Tools/seqstats/emboss_pepstats/
|
2576
2607
|
http://www.ebi.ac.uk/Tools/services/web/toolresult.ebi?jobId=emboss_pepstats-I20160208-020243-0564-53154194-oy
|
@@ -2582,7 +2613,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
|
|
2582
2613
|
- Improve on temperature content and how it is calculated
|
2583
2614
|
|
2584
2615
|
someone googled for it in 2014 so build on it
|
2585
|
-
|
2616
|
+
------------------------------------------------------------------------------------------
|
2586
2617
|
- pfasta /Depot/Temp/bioroebe/NM_000539.3_Homo_sapiens_rhodopsin_RHO.fasta
|
2587
2618
|
|
2588
2619
|
Will read from the file `/Depot/Temp/bioroebe/NM_000539.3_Homo_sapiens_rhodopsin_RHO.fasta`.
|
@@ -2593,7 +2624,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
|
|
2593
2624
|
Now assigning aminoacid sequence to:
|
2594
2625
|
AGAGTCATCCAGCTGGAGCCCTGAGTGGCTGAGCTCAGGCCTTCGCAG
|
2595
2626
|
AGAGTCATCCAGCTGGAGCCCTGAGTGGCTGAGCTCAGGCCTTCGCAG
|
2596
|
-
|
2627
|
+
------------------------------------------------------------------------------------------
|
2597
2628
|
|
2598
2629
|
|
2599
2630
|
- Formats
|
@@ -2647,7 +2678,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
|
|
2647
2678
|
tinyseq NCBI TinySeq XML
|
2648
2679
|
ztr ZTR tracefile ztr
|
2649
2680
|
|
2650
|
-
|
2681
|
+
------------------------------------------------------------------------------------------
|
2651
2682
|
(1) Look at f1 display:
|
2652
2683
|
|
2653
2684
|
|
@@ -2670,7 +2701,7 @@ we probably have to rewrite the whole thing
|
|
2670
2701
|
BEFORE we add ANY COLOURS.
|
2671
2702
|
OH WELL.
|
2672
2703
|
|
2673
|
-
|
2704
|
+
------------------------------------------------------------------------------------------
|
2674
2705
|
(100) → Add a primer-design widget
|
2675
2706
|
|
2676
2707
|
The idea is to be able to manipulate forward and
|
@@ -2684,7 +2715,7 @@ perfect but it is a start.
|
|
2684
2715
|
https://www.bioinformatics.nl/molbi/SCLResources/sequence_notation.htm
|
2685
2716
|
^^^ and check what is useful there. perhaps also add
|
2686
2717
|
nicer visual cues to pretty it up a bit.
|
2687
|
-
|
2718
|
+
------------------------------------------------------------------------------------------
|
2688
2719
|
(1) → Compare bioroebe to:
|
2689
2720
|
|
2690
2721
|
https://www.ncbi.nlm.nih.gov/orffinder
|
@@ -2694,18 +2725,18 @@ whether both return the same also possibly add a web-gui
|
|
2694
2725
|
check this... so that we can search in standard ORF
|
2695
2726
|
but also in different ORFs
|
2696
2727
|
and report the length, at least from the longest ORF's start + stop... i.e. so that the result is actually correct
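A rough sketch of the ORF search described here, assuming the three forward reading frames only, ATG as the sole start codon and the standard stop codons. `longest_orf` is a hypothetical name and not the Bioroebe implementation; positions are reported 1-based so they can be compared with the orffinder output.

```ruby
# Hypothetical helper, NOT the Bioroebe implementation.
STOP_CODONS = %w[TAA TAG TGA].freeze

# Returns frame, start, stop (1-based, stop codon included) and length of
# the longest ATG...stop ORF in the three forward frames, or nil if none.
def longest_orf(dna)
  seq = dna.upcase
  best = nil
  (0..2).each do |frame|
    start = nil # index of the currently open ATG, if any
    (frame..seq.size - 3).step(3) do |i|
      codon = seq[i, 3]
      if start.nil?
        start = i if codon == 'ATG'
      elsif STOP_CODONS.include?(codon)
        length = i + 3 - start
        if best.nil? || length > best[:length]
          best = { frame: frame + 1, start: start + 1, stop: i + 3, length: length }
        end
        start = nil
      end
    end
  end
  best
end
```

The reverse strand could be covered by running the same method on the reverse complement; that step is left out here to keep the sketch short.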
|
2697
|
-
|
2728
|
+
------------------------------------------------------------------------------------------
|
2698
2729
|
test reverse complement in bioroebe
|
2699
2730
|
^^^
|
2700
2731
|
new_WWW/
|
2701
2732
|
^^^ this should eventually become the new web-related interface.
|
2702
2733
|
Ah well. Perhaps not ... ruby-cgi is soooooo annoying ...
|
2703
|
-
|
2734
|
+
------------------------------------------------------------------------------------------
|
2704
2735
|
(154) → the blosum-viewer should be supported in the cgi part
|
2705
2736
|
and sinatra part as well.
|
2706
2737
|
This now works for sinatra. Need to enable this for
|
2707
2738
|
the cgi-part too eventually.
|
2708
|
-
|
2739
|
+
------------------------------------------------------------------------------------------
|
2709
2740
|
(155) → port the sinatra stuff together in bioroebe
|
2710
2741
|
create a dir: web_api
|
2711
2742
|
^^^ also make params? usable in both sinatra and cgi page
|
@@ -2716,18 +2747,18 @@ Ah well. Perhaps not ... ruby-cgi is soooooo annoying ...
|
|
2716
2747
|
add tons of HtmlTemplate[]
|
2717
2748
|
and replace the ad-hoc code otherwise...
|
2718
2749
|
^^^ yeah, finish the HtmlTemplate stuff.
|
2719
|
-
|
2750
|
+
------------------------------------------------------------------------------------------
|
2720
2751
|
(1) → https://i.imgur.com/ptcSn12.png
|
2721
2752
|
^^^ enable such an overview; this shows mass computation, e.g.
|
2722
2753
|
peptide mass and such
|
2723
|
-
|
2754
|
+
------------------------------------------------------------------------------------------
|
2724
2755
|
(80) Bioroebe.sanitize_nucleotide_sequence
|
2725
2756
|
^^^ port this into java. The code has been written for this already,
|
2726
2757
|
but we currently fail to link it.
|
2727
|
-
|
2758
|
+
------------------------------------------------------------------------------------------
|
2728
2759
|
(81) Bioroebe.base_composition
|
2729
2760
|
^^^^^^^^^ port this into java
|
2730
|
-
|
2761
|
+
------------------------------------------------------------------------------------------
|
2731
2762
|
(82) - work a bit more on tk!!!
|
2732
2763
|
in particular to start it from the bioshell as-is.
|
2733
2764
|
^^^ this is mostly done for quick
|
@@ -2740,20 +2771,20 @@ Ah well. Perhaps not ... ruby-cgi is soooooo annoying ...
|
|
2740
2771
|
hamming_distance [PARTIALLY IMPLEMENTED; ~80%]
|
2741
2772
|
protein_to_DNA
|
2742
2773
|
^^^^ improve both while improving tk_paradise docu as well.
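For reference, the computation behind a hamming_distance widget can be sketched as follows. This is a plain-Ruby illustration under the assumption that both inputs have equal length; it is not the code of the tk widget mentioned above.

```ruby
# Illustrative sketch only; the method name mirrors the todo entry.
# Counts the positions at which two equal-length sequences differ.
def hamming_distance(a, b)
  raise ArgumentError, 'sequences must have equal length' unless a.size == b.size
  a.chars.zip(b.chars).count { |x, y| x != y }
end
```

Usage: `hamming_distance('GAGCCTACTAACGGGAT', 'CATCGTAATGACGGCCT')` compares the two 17-nucleotide strings position by position.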
|
2743
|
-
|
2774
|
+
------------------------------------------------------------------------------------------
|
2744
2775
|
(83) Batch-create the .exe files on windows for libui, once
|
2745
2776
|
the first has been added. And then test it too
|
2746
2777
|
AND document it. This should be done with the controller
|
2747
2778
|
eventually. Once this works, we can remove this entry
|
2748
2779
|
here.
|
2749
|
-
|
2780
|
+
------------------------------------------------------------------------------------------
|
2750
2781
|
(84) port more libui stuff in bioroebe. We have two widgets ported so far;
|
2751
2782
|
add more such entries.
|
2752
|
-
|
2783
|
+
------------------------------------------------------------------------------------------
|
2753
2784
|
(85) after libui has been ported, explore how gosu works on windows.
|
2754
2785
|
if possible add things to a gosu-specific UI as well, but
|
2755
2786
|
we may need a common, unified GUI base for that.
|
2756
|
-
|
2787
|
+
------------------------------------------------------------------------------------------
|
2757
2788
|
(86)
|
2758
2789
|
|
2759
2790
|
add libui bindings AND once done make sure the controller works in
|
@@ -2762,22 +2793,22 @@ libui as well. Embed the various things into it.
|
|
2762
2793
|
Tab A set of named tabs for placing items in
|
2763
2794
|
^^^ use this perhaps also in bioroebe hmmm
|
2764
2795
|
yeah.
|
2765
|
-
|
2796
|
+
------------------------------------------------------------------------------------------
|
2766
2797
|
(87) https://github.com/cnjinhao/nana/wiki/User-Works-using-Nana
|
2767
2798
|
|
2768
2799
|
^^^ port the "DNA hybrid"
|
2769
2800
|
https://camo.githubusercontent.com/4c27d554ca4d698d288628f21255f917c2c577e35d7e11dd67e21880d56b6b0a/687474703a2f2f6e616e6170726f2e6f72672f696d616765732f73637265656e73686f74732f746864795f7365715f6578706c2e706e67
|
2770
2801
|
|
2771
|
-
|
2802
|
+
------------------------------------------------------------------------------------------
|
2772
2803
|
(88) Bioroebe::Cell
|
2773
2804
|
^^^ think about what to do with it. If we don't need it then perhaps
|
2774
2805
|
we should just remove it. Think about this more in 2022, before
|
2775
2806
|
deciding what to do.
|
2776
|
-
|
2807
|
+
------------------------------------------------------------------------------------------
|
2777
2808
|
(89) - Add emboss cpgplot functionality.
|
2778
2809
|
|
2779
2810
|
https://www.bioinformatics.nl/cgi-bin/emboss/cpgplot
|
2780
|
-
|
2811
|
+
------------------------------------------------------------------------------------------
|
2781
2812
|
(90) - integrate calculation of the Instability index (II)
|
2782
2813
|
|
2783
2814
|
The instability index provides an estimate of the
|
@@ -2815,9 +2846,9 @@ that the protein may be unstable.
|
|
2815
2846
|
The instability index (II) is computed to be 65.43
|
2816
2847
|
This classifies the protein as unstable.
|
2817
2848
|
|
2818
|
-
|
2849
|
+
------------------------------------------------------------------------------------------
|
2819
2850
|
(1) → We have now added a method to show all hydrophobic amino acids, via the
|
2820
2851
|
method .hydrophobic_amino_acids?. This works and has been documented
|
2821
2852
|
in May 2022. However, we still need a way to PREDICT
|
2822
2853
|
hydrophobic segments in a polypeptide sequence.
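One possible approach to the prediction part is a sliding-window average over the Kyte-Doolittle hydropathy scale. The sketch below is an assumption about how this could look, not existing Bioroebe code: `hydrophobic_windows`, the window size of 9 and the threshold of 1.6 are illustrative choices, and unknown residues are scored as 0.0.

```ruby
# Kyte-Doolittle hydropathy values per amino acid (one-letter code).
KYTE_DOOLITTLE = {
  'A' => 1.8, 'R' => -4.5, 'N' => -3.5, 'D' => -3.5, 'C' => 2.5,
  'Q' => -3.5, 'E' => -3.5, 'G' => -0.4, 'H' => -3.2, 'I' => 4.5,
  'L' => 3.8, 'K' => -3.9, 'M' => 1.9, 'F' => 2.8, 'P' => -1.6,
  'S' => -0.8, 'T' => -0.7, 'W' => -0.9, 'Y' => -1.3, 'V' => 4.2
}.freeze

# Hypothetical helper, NOT an existing Bioroebe method.
# Returns [start_position, mean_score] pairs (1-based) for every window
# whose mean hydropathy reaches the threshold.
def hydrophobic_windows(protein, window: 9, threshold: 1.6)
  scores = protein.upcase.chars.map { |aa| KYTE_DOOLITTLE.fetch(aa, 0.0) }
  scores.each_cons(window).each_with_index.filter_map do |slice, index|
    mean = slice.sum / window
    [index + 1, mean.round(2)] if mean >= threshold
  end
end
```

Overlapping windows above the threshold could then be merged into contiguous segments; a larger window (around 19) is commonly used when looking for membrane-spanning regions.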
|
2823
|
-
|
2854
|
+
------------------------------------------------------------------------------------------
|