bioroebe 0.10.80 → 0.11.25

Files changed (134)
  1. checksums.yaml +4 -4
  2. data/README.md +3117 -2645
  3. data/bioroebe.gemspec +3 -3
  4. data/doc/README.gen +3116 -2644
  5. data/doc/todo/bioroebe_todo.md +418 -387
  6. data/lib/bioroebe/aminoacids/aminoacid_substitution.rb +1 -9
  7. data/lib/bioroebe/aminoacids/codon_percentage.rb +1 -9
  8. data/lib/bioroebe/aminoacids/deduce_aminoacid_sequence.rb +1 -9
  9. data/lib/bioroebe/aminoacids/display_aminoacid_table.rb +1 -0
  10. data/lib/bioroebe/aminoacids/show_hydrophobicity.rb +1 -6
  11. data/lib/bioroebe/base/colours_for_base/colours_for_base.rb +18 -8
  12. data/lib/bioroebe/base/commandline_application/commandline_arguments.rb +13 -11
  13. data/lib/bioroebe/base/commandline_application/misc.rb +18 -8
  14. data/lib/bioroebe/base/misc.rb +16 -0
  15. data/lib/bioroebe/base/prototype/misc.rb +1 -1
  16. data/lib/bioroebe/codons/show_codon_tables.rb +6 -2
  17. data/lib/bioroebe/codons/show_codon_usage.rb +2 -1
  18. data/lib/bioroebe/constants/aminoacids_and_proteins.rb +1 -0
  19. data/lib/bioroebe/constants/database_constants.rb +1 -1
  20. data/lib/bioroebe/constants/files_and_directories.rb +24 -4
  21. data/lib/bioroebe/constants/misc.rb +20 -0
  22. data/lib/bioroebe/count/count_amount_of_nucleotides.rb +3 -0
  23. data/lib/bioroebe/crystal/README.md +2 -0
  24. data/lib/bioroebe/crystal/to_rna.cr +19 -0
  25. data/lib/bioroebe/data/README.md +11 -8
  26. data/lib/bioroebe/data/electron_microscopy/pos_example.pos +396 -0
  27. data/lib/bioroebe/data/electron_microscopy/test_particles.star +36 -0
  28. data/lib/bioroebe/{shell/tk.rb → electron_microscopy/electron_microscopy_module.rb} +15 -10
  29. data/lib/bioroebe/electron_microscopy/simple_star_file_generator.rb +4 -9
  30. data/lib/bioroebe/fasta_and_fastq/show_fasta_headers.rb +27 -12
  31. data/lib/bioroebe/genome/README.md +4 -0
  32. data/lib/bioroebe/genome/genome.rb +67 -0
  33. data/lib/bioroebe/gui/gtk +1 -0
  34. data/lib/bioroebe/gui/gtk3/controller/controller.rb +45 -27
  35. data/lib/bioroebe/gui/gtk3/dna_to_aminoacid_widget/dna_to_aminoacid_widget.rb +76 -50
  36. data/lib/bioroebe/gui/gtk3/hamming_distance/hamming_distance.rb +42 -28
  37. data/lib/bioroebe/gui/gtk3/nucleotide_analyser/nucleotide_analyser.rb +119 -71
  38. data/lib/bioroebe/gui/gtk3/protein_to_DNA/protein_to_DNA.rb +18 -18
  39. data/lib/bioroebe/gui/gtk3/random_sequence/random_sequence.rb +19 -11
  40. data/lib/bioroebe/gui/shared_code/protein_to_DNA/protein_to_DNA_module.rb +14 -14
  41. data/lib/bioroebe/misc/ruler.rb +1 -0
  42. data/lib/bioroebe/parsers/genbank_parser.rb +353 -24
  43. data/lib/bioroebe/parsers/gff.rb +1 -9
  44. data/lib/bioroebe/pdb/parse_pdb_file.rb +1 -9
  45. data/lib/bioroebe/project/project.rb +1 -1
  46. data/lib/bioroebe/python/README.md +1 -0
  47. data/lib/bioroebe/python/__pycache__/mymodule.cpython-39.pyc +0 -0
  48. data/lib/bioroebe/python/gui/gtk3/all_in_one.css +4 -0
  49. data/lib/bioroebe/python/gui/gtk3/all_in_one.py +59 -0
  50. data/lib/bioroebe/python/gui/gtk3/widget1.py +20 -0
  51. data/lib/bioroebe/python/gui/tkinter/all_in_one.py +91 -0
  52. data/lib/bioroebe/python/mymodule.py +8 -0
  53. data/lib/bioroebe/python/protein_to_dna.py +33 -0
  54. data/lib/bioroebe/python/shell/shell.py +19 -0
  55. data/lib/bioroebe/python/to_rna.py +14 -0
  56. data/lib/bioroebe/python/toplevel_methods/open_in_browser.py +20 -0
  57. data/lib/bioroebe/python/toplevel_methods/palindromes.py +42 -0
  58. data/lib/bioroebe/python/toplevel_methods/rds.py +13 -0
  59. data/lib/bioroebe/python/toplevel_methods/three_delimiter.py +34 -0
  60. data/lib/bioroebe/python/toplevel_methods/time_and_date.py +43 -0
  61. data/lib/bioroebe/python/toplevel_methods/to_camelcase.py +11 -0
  62. data/lib/bioroebe/requires/require_the_bioroebe_project.rb +3 -1
  63. data/lib/bioroebe/sequence/nucleotide_module/nucleotide_module.rb +28 -25
  64. data/lib/bioroebe/sequence/protein.rb +105 -3
  65. data/lib/bioroebe/sequence/sequence.rb +61 -2
  66. data/lib/bioroebe/shell/menu.rb +3752 -3667
  67. data/lib/bioroebe/shell/misc.rb +51 -4311
  68. data/lib/bioroebe/shell/readline/readline.rb +1 -1
  69. data/lib/bioroebe/shell/shell.rb +11199 -28
  70. data/lib/bioroebe/siRNA/siRNA.rb +81 -1
  71. data/lib/bioroebe/string_matching/find_longest_substring.rb +3 -2
  72. data/lib/bioroebe/taxonomy/class_methods.rb +3 -8
  73. data/lib/bioroebe/taxonomy/constants.rb +4 -3
  74. data/lib/bioroebe/taxonomy/edit.rb +2 -1
  75. data/lib/bioroebe/taxonomy/help/help.rb +10 -10
  76. data/lib/bioroebe/taxonomy/info/check_available.rb +15 -9
  77. data/lib/bioroebe/taxonomy/info/info.rb +17 -2
  78. data/lib/bioroebe/taxonomy/info/is_dna.rb +46 -36
  79. data/lib/bioroebe/taxonomy/interactive.rb +139 -95
  80. data/lib/bioroebe/taxonomy/menu.rb +27 -18
  81. data/lib/bioroebe/taxonomy/parse_fasta.rb +3 -1
  82. data/lib/bioroebe/taxonomy/shared.rb +1 -0
  83. data/lib/bioroebe/taxonomy/taxonomy.rb +1 -0
  84. data/lib/bioroebe/toplevel_methods/aminoacids_and_proteins.rb +31 -24
  85. data/lib/bioroebe/toplevel_methods/databases.rb +1 -1
  86. data/lib/bioroebe/toplevel_methods/fasta_and_fastq.rb +101 -63
  87. data/lib/bioroebe/toplevel_methods/misc.rb +17 -16
  88. data/lib/bioroebe/toplevel_methods/nucleotides.rb +22 -5
  89. data/lib/bioroebe/toplevel_methods/open_in_browser.rb +2 -0
  90. data/lib/bioroebe/toplevel_methods/palindromes.rb +1 -2
  91. data/lib/bioroebe/toplevel_methods/taxonomy.rb +2 -2
  92. data/lib/bioroebe/toplevel_methods/to_camelcase.rb +5 -0
  93. data/lib/bioroebe/utility_scripts/align_open_reading_frames.rb +1 -9
  94. data/lib/bioroebe/utility_scripts/check_for_mismatches/check_for_mismatches.rb +1 -9
  95. data/lib/bioroebe/utility_scripts/compacter.rb +1 -9
  96. data/lib/bioroebe/utility_scripts/compseq/compseq.rb +1 -9
  97. data/lib/bioroebe/utility_scripts/create_batch_entrez_file.rb +1 -9
  98. data/lib/bioroebe/utility_scripts/dot_alignment.rb +1 -9
  99. data/lib/bioroebe/utility_scripts/move_file_to_its_correct_location.rb +1 -4
  100. data/lib/bioroebe/utility_scripts/showorf/constants.rb +0 -5
  101. data/lib/bioroebe/utility_scripts/showorf/reset.rb +1 -4
  102. data/lib/bioroebe/version/version.rb +2 -2
  103. data/lib/bioroebe/www/embeddable_interface.rb +101 -52
  104. data/lib/bioroebe/www/sinatra/sinatra.rb +186 -70
  105. data/lib/bioroebe/yaml/aminoacids/amino_acids_long_name_to_one_letter.yml +2 -2
  106. data/lib/bioroebe/yaml/configuration/browser.yml +1 -1
  107. data/lib/bioroebe/yaml/genomes/README.md +3 -4
  108. data/lib/bioroebe/yaml/restriction_enzymes/restriction_enzymes.yml +3 -3
  109. metadata +33 -35
  110. data/doc/setup.rb +0 -1655
  111. data/lib/bioroebe/genbank/genbank_parser.rb +0 -291
  112. data/lib/bioroebe/shell/add.rb +0 -108
  113. data/lib/bioroebe/shell/assign.rb +0 -360
  114. data/lib/bioroebe/shell/chop_and_cut.rb +0 -281
  115. data/lib/bioroebe/shell/constants.rb +0 -166
  116. data/lib/bioroebe/shell/download.rb +0 -335
  117. data/lib/bioroebe/shell/enable_and_disable.rb +0 -158
  118. data/lib/bioroebe/shell/enzymes.rb +0 -310
  119. data/lib/bioroebe/shell/fasta.rb +0 -345
  120. data/lib/bioroebe/shell/gtk.rb +0 -76
  121. data/lib/bioroebe/shell/history.rb +0 -132
  122. data/lib/bioroebe/shell/initialize.rb +0 -217
  123. data/lib/bioroebe/shell/loop.rb +0 -74
  124. data/lib/bioroebe/shell/prompt.rb +0 -107
  125. data/lib/bioroebe/shell/random.rb +0 -289
  126. data/lib/bioroebe/shell/reset.rb +0 -335
  127. data/lib/bioroebe/shell/scan_and_parse.rb +0 -135
  128. data/lib/bioroebe/shell/search.rb +0 -337
  129. data/lib/bioroebe/shell/sequences.rb +0 -200
  130. data/lib/bioroebe/shell/show_report_and_display.rb +0 -2901
  131. data/lib/bioroebe/shell/startup.rb +0 -127
  132. data/lib/bioroebe/shell/taxonomy.rb +0 -14
  133. data/lib/bioroebe/shell/user_input.rb +0 -88
  134. data/lib/bioroebe/shell/xorg.rb +0 -45
@@ -1,16 +1,157 @@
1
- -------------------------------------------------------------------------------
2
- (1) → https://biopython.org/DIST/docs/tutorial/Tutorial.html#sec15
1
+ ------------------------------------------------------------------------------------------
2
+ (1) → https://pubchem.ncbi.nlm.nih.gov/compound/16131099#section=Top
3
+
4
+ ^^^ this website is quite interesting; try to use components
5
+ from it.
6
+ ------------------------------------------------------------------------------------------
7
+ (1) → Add some option to show the aminoacid sequence, at the least
8
+ store it; and optionally show it.
9
+
10
+ possibly always report how many aminoacids are
11
+ part of that file; and optionally also show
12
+ the whole sequence.
13
+
14
+
15
+ ------------------------------------------------------------------------------------------
16
+ (1) → http://insilico.ehu.es/
17
+
18
+ ^^^ check if we have all of this incorporated
19
+
20
+ ------------------------------------------------------------------------------------------
21
+ (28) → Integrate these nice GUI parts:
22
+
23
+ https://dev.to/kojix2/introduction-to-gr-rb-data-visualization-with-ruby-2c39
24
+ ------------------------------------------------------------------------------------------
25
+ (22) → AND THEN test on windows as well.
26
+ ^^^^^^^^^^^^^^
27
+ ------------------------------------------------------------------------------------------
28
+ (1) → add mouse chromosome URL, also in the bioshell
29
+ and the main README, to be of help for the
30
+ user. add a mouse subsection.
31
+
32
+ ------------------------------------------------------------------------------------------
33
+ (2) → fix the taxonomy stuff...
34
+ ------------------------------------------------------------------------------------------
35
+ (1) → set_dna_sequence alu
36
+
37
+ ^^^ fetch random alu
38
+
39
+ ^^^ alu sequence
40
+ Ok we started this now adding more details, but we
41
+ need to become better at searching for this
42
+ sequence.
43
+ ------------------------------------------------------------------------------------------
44
+ (2) → draw things based on GR
45
+ ------------------------------------------------------------------------------------------
46
+ (3) → https://mycocosm.jgi.doe.gov/help/screenshots/browser_viewer.png
47
+ ^^^ offer the same functionality
48
+ ------------------------------------------------------------------------------------------
49
+ (4) → https://genome.cshlp.org/content/12/10/1611/F3.expansion.html
50
+
51
+ ^^^ enable this, we must obtain a sequence then store into genbank format
52
+ so, first fetch; then store as-is.
53
+ ------------------------------------------------------------------------------------------
54
+ (5) → be able to generate nice graphics
55
+
56
+ https://genome.cshlp.org/content/12/10/1611/F1.large.jpg
57
+ ------------------------------------------------------------------------------------------
58
+ (6) → add an rmagick wrapper, perhaps via imageparadise or something
59
+ the idea is that we can make fancy drawings and generate
60
+ an image for the end user to see
61
+ ------------------------------------------------------------------------------------------
62
+ (7) → https://bioperl.org/howtos/Beginners_HOWTO.html#item13
63
+ extend the sequence object and document it
64
+
65
+ also add:
66
+
67
+ class Genome
68
+ and:
69
+ def is_circular?
70
+ @internal_hash[:is_circular]
71
+ end; alias circular? is_circular? # === circular?
72
+ def species?
73
+ @internal_hash[:species] # return the species here
74
+ end
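
The `Genome` outline in the item above could be fleshed out roughly as follows. This is a minimal sketch: the constructor and its keyword arguments are assumptions made for illustration, and only `is_circular?`, `circular?` and `species?` come from the notes.

```ruby
# Minimal sketch of the Genome class outlined in the todo item.
# The constructor signature is hypothetical, not bioroebe's actual API.
class Genome
  def initialize(species: nil, is_circular: false)
    # All state lives in one hash, as the accessors in the notes suggest.
    @internal_hash = { species: species, is_circular: is_circular }
  end

  # Returns true if the genome was registered as circular.
  def is_circular?
    @internal_hash[:is_circular]
  end
  alias circular? is_circular?

  # Returns the species name, or nil if none was given.
  def species?
    @internal_hash[:species]
  end
end

genome = Genome.new(species: 'Escherichia coli', is_circular: true)
puts genome.circular? # prints "true"
puts genome.species?  # prints "Escherichia coli"
```

Keeping both flags in one hash makes it easy to add further attributes later without new instance variables.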
75
+ ------------------------------------------------------------------------------------------
76
+ (2) http://lib.ysu.am/open_books/312400.pdf
77
+
78
+ clone:
79
+ Primer.pl
80
+ This program was written to support the required informatics for a sequencing
81
+ lab. The desire was to quickly generate primer pair candidates for use in STS
82
+ mapping. We use Bioperl modules to fetch the sequences from GenBank.
83
+ #! /usr/bin/perl
84
+ #
85
+ # primers.pl
86
+ #
87
+ # Reads a list of
88
+
89
+ % primers.pl AC013798
90
+ AC013798
91
+ Left Right Length Penalty
92
+ CCTCCTGGACAACCTGTGTT TGAAGTCAGGGGACATAGGG 280 0.0823
93
+ CCTCCTGGACAACCTGTGTT AGGCCAGTAGACTGGGTGTG 298 0.1758
94
+ CCTCCTGGACAACCTGTGTT GGTGTGAAGTCAGGGGACAT 284 0.1852
95
+ TTCCCGCATCTCTTAGCAGT AGGCCAGTAGACTGGGTGTG 209 0.1962
96
+ CTTCCCGCATCTCTTAGCAG GACACTAGTGGCAAGGAGGC 226 0.2362
97
+ Most of the primers.pl program is extremely simple. The real guts and power
98
+ of the program lie in the classes and the methods we call. The next section
99
+ examines the Primer3 module, which is similar to many Bioperl modules
100
+
101
+
102
+ ------------------------------------------------------------------------------------------
103
+ (1) → Clone all of Emboss. :)
104
+
105
+ → Clone and document the getorf functionality properly.
106
+
107
+ See: http://emboss.sourceforge.net/apps/cvs/emboss/apps/getorf.html
108
+
109
+ http://emboss.sourceforge.net
110
+ http://emboss.sourceforge.net/apps/cvs/emboss/apps/getorf.html
111
+
112
+ ------------------------------------------------------------------------------------------
113
+ (3) → Add useful formulas for bioshell.
114
+ ------------------------------------------------------------------------------------------
115
+ (1) → Polish the GUI sets:
116
+
117
+ https://i.imgur.com/djElIMh.png
118
+
119
+ ------------------------------------------------------------------------------------------
120
+ (4) → The taxonomy part should be fully integrated, without it
121
+ being a standalone part anymore.
122
+ continue on the taxonomy stuff.
123
+ One day this will work again *shake fist*
124
+ ------------------------------------------------------------------------------------------
125
+ (1) → Show the frequency of codons in different tables
126
+
127
+ This works quite ok, but right now the approach is to store
128
+ this in a .yml file which is not ideal.
129
+
130
+ Thus, we have to add two things:
131
+ - The ability to store this into a SQL database
132
+ - The ability to batch-download all of these codons,
133
+ which first requires that we have a way to obtain all
134
+ taxonomic ids.
135
+ Add where this can be found.
136
+
137
+ IMPROVE THIS ALL!!!!!!!
138
+
139
+ ------------------------------------------------------------------------------------------
140
+ (2) improve docu + tests for melting temperature analysis again
141
+ + usage example + GUI + web-use
142
+ ------------------------------------------------------------------------------------------
143
+ (3) → https://biopython.org/DIST/docs/tutorial/Tutorial.html#sec15
3
144
 
4
145
  ^^^ work through the above, also integrate it + write docs
5
146
 
6
147
  https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orchid.fasta
7
148
 
8
- -------------------------------------------------------------------------------
9
- (2) → integrate electron microscopy slowly and also add documentation
149
+ ------------------------------------------------------------------------------------------
150
+ (4) → integrate electron microscopy slowly and also add documentation
10
151
  about this AS YOU GO!!!!!
11
152
  ^^^ yup add more of it
12
- -------------------------------------------------------------------------------
13
- (3) → Add save session support
153
+ ------------------------------------------------------------------------------------------
154
+ (5) → Add save session support
14
155
  to reload our last activity completely ...
15
156
  hmmm..
16
157
  This has to be well designed...
@@ -27,9 +168,8 @@ https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orc
27
168
  upon startup of the bioroebe shell.
28
169
  This is in preparation for save-session support.
29
170
 
30
-
31
- -------------------------------------------------------------------------------
32
- (5) → Lys-Asp-Glu-Leu
171
+ ------------------------------------------------------------------------------------------
172
+ (6) → Lys-Asp-Glu-Leu
33
173
 
34
174
  if i.include?('-') and Bioroebe.is_in_the_three_letter_code?(i)
35
175
  end
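
The condition above only detects hyphenated three-letter input such as Lys-Asp-Glu-Leu. Converting that input into the one-letter code could be sketched as below. The constant and method names are hypothetical and not part of bioroebe; the table holds the 20 standard amino acids.

```ruby
# Hypothetical lookup table: three-letter code => one-letter code
# for the 20 standard amino acids.
THREE_TO_ONE = {
  'Ala' => 'A', 'Arg' => 'R', 'Asn' => 'N', 'Asp' => 'D', 'Cys' => 'C',
  'Gln' => 'Q', 'Glu' => 'E', 'Gly' => 'G', 'His' => 'H', 'Ile' => 'I',
  'Leu' => 'L', 'Lys' => 'K', 'Met' => 'M', 'Phe' => 'F', 'Pro' => 'P',
  'Ser' => 'S', 'Thr' => 'T', 'Trp' => 'W', 'Tyr' => 'Y', 'Val' => 'V'
}.freeze

# Converts "Lys-Asp-Glu-Leu" style input into a one-letter string.
# Raises ArgumentError for residues that are not in the table.
def three_letter_to_one_letter(input)
  input.split('-').map { |residue|
    THREE_TO_ONE.fetch(residue.capitalize) {
      raise ArgumentError, "unknown residue: #{residue}"
    }
  }.join
end

puts three_letter_to_one_letter('Lys-Asp-Glu-Leu') # prints "KDEL"
```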
@@ -47,11 +187,11 @@ https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orc
47
187
 
48
188
  ^^ yep this is also called KDEL
49
189
  https://en.wikipedia.org/wiki/KDEL_(amino_acid_sequence)
50
- -------------------------------------------------------------------------------
51
- (6) → Add "orthologs". this shall show us the top 25 orthologs or
190
+ ------------------------------------------------------------------------------------------
191
+ (7) → Add "orthologs". this shall show us the top 25 orthologs or
52
192
  something. In the bioshell? Hmm. Not sure yet.
53
- -------------------------------------------------------------------------------
54
- (7) → clone the functionality of this:
193
+ ------------------------------------------------------------------------------------------
194
+ (8) → clone the functionality of this:
55
195
 
56
196
  http://www.kazusa.or.jp/codon/cgi-bin/countcodon.cgi
57
197
  http://www.kazusa.or.jp/codon/countcodon.html
@@ -63,18 +203,18 @@ https://en.wikipedia.org/wiki/KDEL_(amino_acid_sequence)
63
203
  widget first. And sinatra output too.
64
204
  AND document it as well
65
205
 
66
- -------------------------------------------------------------------------------
206
+ ------------------------------------------------------------------------------------------
67
207
  (8) → analyse the SARS genome in bioroebe
68
208
  possibly also graphically
69
209
 
70
210
  Are there new GUIs that we could combine? Hmmm.
71
- -------------------------------------------------------------------------------
211
+ ------------------------------------------------------------------------------------------
72
212
  (9) → In bioroebe, generate that .ps thingy graphical thing from the
73
213
  vienna RNA tutorial. Hmmm.
74
214
 
75
215
  https://www.tbi.univie.ac.at/RNA/tutorial/
76
- -------------------------------------------------------------------------------
77
- (1) → get insulin sequence from NCBI
216
+ ------------------------------------------------------------------------------------------
217
+ (10) → get insulin sequence from NCBI
78
218
  human
79
219
  then apply trypsin onto it
80
220
  and try it like this:
@@ -88,13 +228,13 @@ Also add:
88
228
  ^^^ to show it
89
229
  Hmm. Perhaps also auto-download or something.
90
230
 
91
- -------------------------------------------------------------------------------
92
- (1) → in bioroebe: UAG?
231
+ ------------------------------------------------------------------------------------------
232
+ (11) → in bioroebe: UAG?
93
233
  ^^^ show all stop codons with that in the bioshell
94
234
  all UAG sequences... hmm. and TAG?
95
235
  Finish that.
96
- ..........................................................................
97
- (1) → The position of a symbol in a string is the total number of
236
+ ------------------------------------------------------------------------------------------
237
+ (12) → The position of a symbol in a string is the total number of
98
238
  symbols found to its left, including itself (e.g., the positions
99
239
  of all occurrences of 'U' in "AUGCUUCAGAAAGGUCUUACG" are 2, 5,
100
240
  6, 15, 17, and 18). The symbol at position i
@@ -102,70 +242,70 @@ Hmm. Perhaps also auto-download or something.
102
242
 
103
243
  ^^^ add a solution there, a toplevel API
104
244
  !!!!!
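
A toplevel method for the symbol-position task could be sketched as follows. The method name is hypothetical; the 1-based counting follows the definition quoted in the item.

```ruby
# Returns the 1-based positions of every occurrence of `symbol`
# in `string`, following the definition quoted in the todo item.
def positions_of(symbol, string)
  positions = []
  string.each_char.with_index(1) do |char, position|
    positions << position if char == symbol
  end
  positions
end

p positions_of('U', 'AUGCUUCAGAAAGGUCUUACG') # prints [2, 5, 6, 15, 17, 18]
```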
105
- -------------------------------------------------------------------------------
106
- (1) → http://bioruby.org/rdoc/Bio/Blast.html
245
+ ------------------------------------------------------------------------------------------
246
+ (13) → http://bioruby.org/rdoc/Bio/Blast.html
107
247
  ^^^ add support for BLAST
108
- ..........................................................................
109
- (1) → add: parse_pdb()
248
+ ------------------------------------------------------------------------------------------
249
+ (14) → add: parse_pdb()
110
250
  With this we shall just show some info, about a given
111
251
  .pdb file at hand.
112
252
  Also make it commandline based too + bioshell variant
113
253
  here, and a sinatra interface once this all works.
114
254
  Don't forget to document it!!!!!
115
255
  ^^^ and google a bit how others do that
116
- ..........................................................................
117
- (2) → pdb 1a6m
256
+ ------------------------------------------------------------------------------------------
257
+ (15) → pdb 1a6m
118
258
  ^^^ download this when that is used in the bioshell; we also have
119
259
  to use the download directory for this, so make sure that
120
260
  we do.
121
261
  ^^^ And then, also document this clearly.
122
- -------------------------------------------------------------------------------
123
- (3) show_string
262
+ ------------------------------------------------------------------------------------------
263
+ (16) show_string
124
264
  ^^^ slowly port this ... find out differences
125
265
  then unify into one method. right now we used
126
266
  two or something.
127
- -------------------------------------------------------------------------------
128
- (4) → Try to see if we can integrate this into our GUI:
267
+ ------------------------------------------------------------------------------------------
268
+ (17) → Try to see if we can integrate this into our GUI:
129
269
 
130
270
  https://cdn.snapgene.com/assets/7.6.11/assets/images/snapgene/homepage/homepage-hero.png
131
- -------------------------------------------------------------------------------
271
+ ------------------------------------------------------------------------------------------
132
272
  (5) → Scan for leucine zipper!
133
273
 
134
274
  This is ~25% implemented. We need to double-check what
135
275
  exactly is a leucine zipper.
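
Since the definition still needs double-checking, the following is only a heuristic sketch. It assumes the PROSITE-style heptad pattern L-x(6)-L-x(6)-L-x(6)-L, which is known to produce many false positives. The method name is hypothetical.

```ruby
# Length of the pattern L-x(6)-L-x(6)-L-x(6)-L: 4 leucines + 18 spacers.
ZIPPER_LENGTH = 22

# Scans a protein (one-letter code) for candidate leucine zippers.
# Returns an array of [1-based start position, matching window] pairs.
# Overlapping candidates are reported individually.
def leucine_zipper_candidates(protein)
  sequence = protein.upcase
  hits = []
  (0..sequence.size - ZIPPER_LENGTH).each do |start|
    window = sequence[start, ZIPPER_LENGTH]
    hits << [start + 1, window] if window.match?(/\AL.{6}L.{6}L.{6}L\z/)
  end
  hits
end

example = 'M' + ('LAAAAAA' * 3) + 'LG'
p leucine_zipper_candidates(example)
```

A hit only says that four leucines are spaced seven residues apart; whether the region actually forms a coiled coil would need further checks.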
136
- ..........................................................................
276
+ ------------------------------------------------------------------------------------------
137
277
  (6) → Extend the sinatra-interface for the Rosalind task,
138
278
  perhaps add a sub-link to show which parts are solved
139
279
  as-is. Hmm. I am not continuing on this though.
140
280
  ^^^^
141
281
  well - make rosalind anew again or something.
142
282
 
143
- ...........................................................................
283
+ ------------------------------------------------------------------------------------------
144
284
  (7) - Add a blast interface; both via the web-interface, GUI,
145
285
  and also from the commandline.
146
- -------------------------------------------------------------------------------
286
+ ------------------------------------------------------------------------------------------
147
287
  (8) - Write a tutorial about primer design.
148
288
  also make sure that the GUI has support for this.
149
- ..........................................................................
289
+ ------------------------------------------------------------------------------------------
150
290
  (9) - In the documentation examples, show some examples for how to work
151
291
  with different organisms.
152
- ..........................................................................
292
+ ------------------------------------------------------------------------------------------
153
293
  (10) - In the bioshell, if "stop?" is issued, then the colouring isn't
154
294
  correct. It currently does not show any result. This has to
155
295
  be fixed.
156
- ..........................................................................
296
+ ------------------------------------------------------------------------------------------
157
297
  (11) → https://www.rubydoc.info/gems/biomart
158
298
  ^^^ integrate biomart
159
299
 
160
300
  p biomart.list_datasets
161
301
  p biomart.datasets?
162
- -------------------------------------------------------------------------------
302
+ ------------------------------------------------------------------------------------------
163
303
  (12) Add Trypsin and Trypsinogen sequences, both as FASTA
164
304
  but also as shortcut via the commandline such as:
165
305
  show_orf :trypsine
166
306
  show_orf :trypsin
167
307
  Or something like this; and document it as well.
168
- -------------------------------------------------------------------------------
308
+ ------------------------------------------------------------------------------------------
169
309
  (13) → 1..60
170
310
 
171
311
  setdna 57
@@ -177,12 +317,12 @@ well - make rosalind anew again or something.
177
317
  5' - ATGTGCAGTCAGGTGAATTTATTGAAAAATTTGAGGCTCCTGGTGGTGCAAATCAAAGAACTGCTCCTCAGTGGATGTTGCCTTTACTTCTAG - 3'
178
318
  ^^^ here in colourize, if the last codon is a STOP codon
179
319
  then we colourize that as well.
180
- -------------------------------------------------------------------------------
320
+ ------------------------------------------------------------------------------------------
181
321
  (14) → MG1655
182
322
  ^^^ input this to download the sequence. Also show it to the user.
183
- -------------------------------------------------------------------------------
323
+ ------------------------------------------------------------------------------------------
184
324
  (15) → extend virus-information into the bioroebe project.
185
- -------------------------------------------------------------------------------
325
+ ------------------------------------------------------------------------------------------
186
326
  (16) → Add a way to analyse the chemical structure of all
187
327
  aminoacids. We wish to show the chemical formula.
188
328
 
@@ -196,22 +336,22 @@ well - make rosalind anew again or something.
196
336
  I don't understand why it removes H and O so perhaps
197
337
  don't remove that part. But still show the -R.
198
338
 
199
- -------------------------------------------------------------------------------
339
+ ------------------------------------------------------------------------------------------
200
340
  (17) FIX THE COLOURIZATION BUG; THIS ONE TRIGGERED THE WHOLE
201
341
  REWRITE AFTER ALL!
202
- -------------------------------------------------------------------------------
342
+ ------------------------------------------------------------------------------------------
203
343
  (18) FIX TAXONOMY related-problems AS WELL
204
344
  ^^^^^^ AND DOCUMENT THIS related-problems.
205
- -------------------------------------------------------------------------------
345
+ ------------------------------------------------------------------------------------------
206
346
  (19) Do note that z will then be a String, not a sequence object anymore.
207
347
  (This may be subject to change in the future, but for now, aka
208
348
  **February 2020**, it is that way.)
209
349
  ^^^^
210
- -------------------------------------------------------------------------------
350
+ ------------------------------------------------------------------------------------------
211
351
  (20) ^^^ colours are appended. That should not be the case!
212
352
  ADD SOMETHING NEW ... some todo entries
213
353
  and some python tool
214
- -------------------------------------------------------------------------------
354
+ ------------------------------------------------------------------------------------------
215
355
  (21) → rewrite the whole project anew
216
356
  - improve the documentation
217
357
  - focus on class Protein first and add
@@ -219,10 +359,8 @@ well - make rosalind anew again or something.
219
359
  that, as well as:
220
360
  .backtrans
221
361
  .reverse_translate
222
- -------------------------------------------------------------------------------
223
- (22) → AND THEN test on windows as well.
224
- ^^^^^^^^^^^^^^
225
- -------------------------------------------------------------------------------
362
+
363
+ ------------------------------------------------------------------------------------------
226
364
  (23) →
227
365
  Reduced alphabets for proteins | [not implemented yet]
228
366
  ^^^ check this as well
@@ -252,9 +390,9 @@ First focus on bioroebe.
252
390
  efetch "https://www.ncbi.nlm.nih.gov/gene/744779"
253
391
  ^^^ test this. again
254
392
 
255
- -------------------------------------------------------------------------------
393
+ ------------------------------------------------------------------------------------------
256
394
  (25) fix tk-levensthein
257
- -------------------------------------------------------------------------------
395
+ ------------------------------------------------------------------------------------------
258
396
  (26) → rewrite the whole project anew
259
397
  - improve the documentation
260
398
  - rework the WHOLE tutorial as well
@@ -263,13 +401,13 @@ efetch "https://www.ncbi.nlm.nih.gov/gene/744779"
263
401
  that
264
402
  .backtrans
265
403
  .reverse_translate
266
- -------------------------------------------------------------------------------
404
+ ------------------------------------------------------------------------------------------
267
405
  (27) → analyze /Depot/Temp/Bioroebe/1CEZ.pdb
268
406
 
269
407
  ^^^
270
408
  support this. Already works half-way, we started writing a pdb parser.
271
409
  this should work in general, for .fasta files as well.
272
- ..........................................................................
410
+ ------------------------------------------------------------------------------------------
273
411
  (28) → SINATRA STUFF:
274
412
  FIX AND EXTEND SINATRA IN BIOROEBE.
275
413
  extend it too.
@@ -281,7 +419,7 @@ efetch "https://www.ncbi.nlm.nih.gov/gene/744779"
281
419
  and special-display on sinatra kaa
282
420
  where the nucleotide sequence has numbers
283
421
  ^^^
284
- -------------------------------------------------------------------------------
422
+ ------------------------------------------------------------------------------------------
285
423
  (29) pick any virus and begin to amass tons of data; and then when done
286
424
  also connect this into a GUI for use therein.
287
425
 
@@ -302,7 +440,7 @@ https://www.ncbi.nlm.nih.gov/nuccore/NC_038391.1
302
440
 
303
441
 
304
442
 
305
- -------------------------------------------------------------------------------
443
+ ------------------------------------------------------------------------------------------
306
444
  (1) → Fix:
307
445
 
308
446
  require 'bioroebe/toplevel_methods/open_reading_frames.rb'
@@ -310,10 +448,10 @@ https://www.ncbi.nlm.nih.gov/nuccore/NC_038391.1
310
448
  Something is wrong; it returns regions that contain
311
449
  a stop codon, which can not be true.
312
450
 
313
- -------------------------------------------------------------------------------
451
+ ------------------------------------------------------------------------------------------
314
452
  (3) → Fix: extend glycovirology parts
315
453
  seek stuff in viral genomes
316
- -------------------------------------------------------------------------------
454
+ ------------------------------------------------------------------------------------------
317
455
  (4) →
318
456
 
319
457
  seq = Bio::Sequence::NA.new("atgcatgcaaaaaaa")
@@ -336,13 +474,13 @@ https://www.ncbi.nlm.nih.gov/nuccore/NC_038391.1
336
474
  seq = Bioroebe::Sequence.new("atgcatgcaaaaaaa")
337
475
  puts seq
338
476
  puts seq.complement
339
- -------------------------------------------------------------------------------
477
+ ------------------------------------------------------------------------------------------
340
478
  (5) →
341
479
  make sure we have a good fasta-showing widget
342
480
  show how many nucleotides are
343
481
  AND add support to modify this as-is
344
482
  ^^^^
345
- -------------------------------------------------------------------------------
483
+ ------------------------------------------------------------------------------------------
346
484
  (6) → In BioRoebe:
347
485
 
348
486
  Add a table showing how compatible bioroebe is compared to the other
@@ -352,7 +490,7 @@ https://www.ncbi.nlm.nih.gov/nuccore/NC_038391.1
352
490
  including Bio (ruby-bio) the main ruby project here.
353
491
  And add a table showing which functionality is implemented
354
492
  in Java already.
355
- -------------------------------------------------------------------------------
493
+ ------------------------------------------------------------------------------------------
356
494
  (7) →
357
495
  ********************************************************************************
358
496
  What happens when we treat the lambda genome with EcoRI?
@@ -375,19 +513,19 @@ Bioroebe.digest_this_dna("/root/Bioroebe/fasta/NC_001416.1_Enterobacteria_phage_
375
513
  DNA.
376
514
  ^^^ this now works kind of ... but it must be better
377
515
  documented and we must test this with more data.
378
- -------------------------------------------------------------------------------
516
+ ------------------------------------------------------------------------------------------
379
517
  (8) → add the bioroebe logo to sinatra, but as appropriate size,
380
518
  via base64. perhaps width 50 or so. need to determine
381
519
  which size fits here.
382
- -------------------------------------------------------------------------------
520
+ ------------------------------------------------------------------------------------------
383
521
  (9) → Integrate http://nc2.neb.com/NEBcutter2/cutshow.php?name=ffe1d68e-
384
522
 
385
523
  in particular the visual part.
386
- -------------------------------------------------------------------------------
524
+ ------------------------------------------------------------------------------------------
387
525
  (10) → https://international.neb.com/products/r0196-ncii#Product%20Information
388
526
  ^^^ autogenerate such an image, aka restriction cutting enzyme
389
527
  to indicate the target sequence.
390
- -------------------------------------------------------------------------------
528
+ ------------------------------------------------------------------------------------------
391
529
  (6) → how to do codon optimization in E. coli? bioroebe must support this!
392
530
 
393
531
  we must first get a display which codon is very commonly used in
@@ -399,34 +537,23 @@ and then we look which codons may be improvable - display
399
537
  them on the commandline
400
538
 
401
539
  class: OptimizeCodons.new(of_this_sequence)
402
- -------------------------------------------------------------------------------
540
+ ------------------------------------------------------------------------------------------
403
541
  (7) → Molecular size of "Ubiquitin"? "8.5 kd".
404
542
  ^^^ this should be calculated automatically
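
A minimal sketch of how such a size could be calculated automatically. It sums average residue masses and adds one water for the free termini. The method name `molecular_weight_of_protein`, the constants and the ubiquitin sequence (human, 76 residues) are assumptions for illustration and are not taken from bioroebe itself.

```ruby
# Average residue masses in Dalton (amino acid minus one water molecule).
AVERAGE_RESIDUE_MASS = {
  'G' => 57.0519,  'A' => 71.0788,  'S' => 87.0782,  'P' => 97.1167,
  'V' => 99.1326,  'T' => 101.1051, 'C' => 103.1388, 'L' => 113.1594,
  'I' => 113.1594, 'N' => 114.1038, 'D' => 115.0886, 'Q' => 128.1307,
  'K' => 128.1741, 'E' => 129.1155, 'M' => 131.1926, 'H' => 137.1411,
  'F' => 147.1766, 'R' => 156.1875, 'Y' => 163.1760, 'W' => 186.2132
}.freeze

# One water is added back for the free N- and C-terminus.
WATER_MASS = 18.01528

# Hypothetical helper: average molecular weight of a protein in Dalton.
# Raises KeyError on letters that are not one of the 20 standard residues.
def molecular_weight_of_protein(sequence)
  residues = sequence.upcase.delete('^A-Z').chars
  residues.sum { |aminoacid| AVERAGE_RESIDUE_MASS.fetch(aminoacid) } + WATER_MASS
end

# Human ubiquitin (assumed here; any protein sequence can be passed in).
UBIQUITIN = 'MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG'

weight = molecular_weight_of_protein(UBIQUITIN)
puts format('%.1f Da (%.2f kDa)', weight, weight / 1000.0)
```

For this sequence the result is about 8.56 kDa, which is in line with the "8.5 kd" quoted above.
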
405
- -------------------------------------------------------------------------------
543
+ ------------------------------------------------------------------------------------------
406
544
  (8) → taxonomy !!!!!!!!!!!!!!!!!!
407
- -------------------------------------------------------------------------------
545
+ ------------------------------------------------------------------------------------------
408
546
  (9) → Given a list of gene names that I would like to get chromosome/position
409
547
  information for (in mm10). Is there some service online where I can
410
548
  paste this list? ^^^ enable this
411
- -------------------------------------------------------------------------------
412
- (10) → Show the frequency of codons in different tables
413
-
414
- This works quite ok, but right now the approach is to store
415
- this in a .yml file which is not ideal.
416
-
417
- Thus, we have to add two things:
418
- - The ability to store this into a SQL database
419
- - The ability to batch-download all of these codons,
420
- which first requires that we have a way to obtain all
421
- taxonomic ids.
422
- -------------------------------------------------------------------------------
549
+ ------------------------------------------------------------------------------------------
423
550
  (11) → Add a way in bioroebe to store a gene into a yaml file
424
551
  or so, and to also load it up again. Perhaps simplify
425
552
  this automatically. Need some ways to describe that.
426
- -------------------------------------------------------------------------------
553
+ ------------------------------------------------------------------------------------------
427
554
  (12) → Make bioroebe very useful from the www, no matter if via sinatra
428
555
  or rails. It should be a tool-set project on the www as well.
429
- -------------------------------------------------------------------------------
556
+ ------------------------------------------------------------------------------------------
430
557
  (13) → Suppose you have a GenBank file which you want to turn into a
431
558
  Fasta file. For example, let's consider the file cor6_6.gb
432
559
  which is included in the Biopython unit tests under the
@@ -441,12 +568,12 @@ call it format-converter or so
441
568
  the GUI works somewhat but needs to be polished up.
442
569
  THEN THIS CAN BE REMOVED!!!!!!!
443
570
 
444
- -------------------------------------------------------------------------------
571
+ ------------------------------------------------------------------------------------------
445
572
  (14) → We need a table in which we collect and compare the strong
446
574
  promoters of different organisms.
447
574
 
448
575
  strong_promoters.yml
449
- -------------------------------------------------------------------------------
576
+ ------------------------------------------------------------------------------------------
450
577
  (15) → add:
451
578
  start position of exons
452
579
  and show the sequence based on that file
@@ -454,9 +581,9 @@ THEN THIS CAN BE REMOVED!!!!!!!
454
581
  Normally there's a "gene" entry for each gene, so:
455
582
  awk 'BEGIN{FS="\t"; OFS="\t"}{if($3 == "gene") print $1, $4, $5}' foo.gtf
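
The awk one-liner above could be mirrored in ruby roughly like this. It is only a sketch: the method name `gene_coordinates` and the two example rows are made up for illustration, and `foo.gtf` is the same placeholder file name as in the awk call.

```ruby
# Hypothetical helper: return [chromosome, start, end] for every "gene"
# feature of a GTF file. Accepts anything that yields lines.
def gene_coordinates(gtf_lines)
  gtf_lines.each_with_object([]) do |line, result|
    next if line.start_with?('#')          # skip comment/header lines
    fields = line.chomp.split("\t")
    next unless fields[2] == 'gene'        # column 3 holds the feature type
    result << [fields[0], Integer(fields[3]), Integer(fields[4])]
  end
end

# Two invented example rows; only the first one is a "gene" entry.
example = [
  "chr1\tsource\tgene\t11869\t14409\t.\t+\t.\tgene_id \"G1\";",
  "chr1\tsource\texon\t11869\t12227\t.\t+\t.\tgene_id \"G1\";"
]

gene_coordinates(example).each { |row| puts row.join("\t") }
```

For a real file the call would be `gene_coordinates(File.foreach('foo.gtf'))`.
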
456
583
 
457
- -------------------------------------------------------------------------------
584
+ ------------------------------------------------------------------------------------------
458
585
  (16) → also add 30-33 to aminoacids hmmm difficult.
459
- -------------------------------------------------------------------------------
586
+ ------------------------------------------------------------------------------------------
460
587
  (17) → http://bioinformatics.oxfordjournals.org/content/18/8/1135
461
588
  "TFBS: Computational framework for transcription factor
462
589
  binding site analysis"
@@ -464,7 +591,7 @@ THEN THIS CAN BE REMOVED!!!!!!!
464
591
  into bioroebe
465
592
 
466
593
  http://tfbs.genereg.net/
467
- -------------------------------------------------------------------------------
594
+ ------------------------------------------------------------------------------------------
468
595
  (18) → They include trypsin, chymotrypsin, thrombin, plasmin, papain and factor Xa.
469
596
  ^^^ provide means to identify where they cut,
470
597
  and show this then by simulating a digest.
@@ -472,7 +599,7 @@ THEN THIS CAN BE REMOVED!!!!!!!
472
599
  also document this on bioroebe todo
473
600
  this is done via digestion/digestions
474
601
  but it's not quite perfect yet.
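
A small sketch of such a simulated digest, for trypsin only. It uses the common simplified rule (cut after K or R, but not if the next residue is P); the method name `trypsin_digest` is hypothetical and this is not the code in digestion/digestions.

```ruby
# Hypothetical helper: split a protein sequence at trypsin cleavage sites.
# Simplified rule: cleave C-terminal of K or R, unless followed by P.
def trypsin_digest(sequence)
  sequence.upcase.split(/(?<=[KR])(?!P)/)
end

trypsin_digest('AKRPGRT').each_with_index do |fragment, index|
  puts "fragment #{index + 1}: #{fragment}"
end
```

The other proteases listed above would need their own rules; chymotrypsin, for example, is usually described as cutting after F, Y and W.
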
475
- -------------------------------------------------------------------------------
602
+ ------------------------------------------------------------------------------------------
476
603
  (19) → a) add a commandline way to generate a random protein
477
604
  with a specified length and then display it on the
478
605
  commandline [DONE] !!!
@@ -498,29 +625,29 @@ THEN THIS CAN BE REMOVED!!!!!!!
498
625
  Enable this BOTH from the commandline AND from the
499
626
  interactive variant and from sinatra! Hmmmm.
500
627
 
501
- -------------------------------------------------------------------------------
628
+ ------------------------------------------------------------------------------------------
502
629
  (1) → add an option to design a
503
630
 
504
631
  degenerate primer
505
- -------------------------------------------------------------------------------
632
+ ------------------------------------------------------------------------------------------
506
633
  (2) Add upcase to sequences and ensure that it works; also document it
507
634
  internally and in the .pdf tutorial
508
635
  what does that mean? upcase as method? hmmm.
509
636
 
510
- ..........................................................................
637
+ ------------------------------------------------------------------------------------------
511
638
  (1) → http://www.biomart.org/other/user-docs.pdf
512
639
  ^^^ work through this
513
640
  ^^^ integrate the old .cgi part and improve as you go
514
- ..........................................................................
641
+ ------------------------------------------------------------------------------------------
515
642
  (1) → Access geninfo numbers easily.
516
643
  Search for these and download them.
517
- -------------------------------------------------------------------------------
644
+ ------------------------------------------------------------------------------------------
518
645
  - Add all of bioruby into bioroebe:
519
646
 
520
647
  continuous project
521
648
  https://github.com/biopython/biopython
522
649
  https://github.com/bioruby/bioruby/tree/master/lib/bio
523
- -------------------------------------------------------------------------------
650
+ ------------------------------------------------------------------------------------------
524
651
  (3) → https://github.com/bioruby/bioruby/issues/134
525
652
  ^^^ check this, for restriction enzymes
526
653
  http://rebase.neb.com/rebase/enz/MboII.html
@@ -530,9 +657,9 @@ THEN THIS CAN BE REMOVED!!!!!!!
530
657
  > seq = seq.reverse_complement
531
658
  > Bio::RestrictionEnzyme.cut(seq, 'MboII').primary rescue [seq]
532
659
  => ["atcatcaatcctaatcttct"]
533
- -------------------------------------------------------------------------------
660
+ ------------------------------------------------------------------------------------------
534
661
  (4) → Document how an ORF is defined for the bioroebe project.
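
One possible working definition, as a sketch only: an ORF starts at the first ATG and runs in that reading frame up to and including the first stop codon, so it can never contain an internal stop. Whether bioroebe will use exactly this definition is still open; the method name `first_orf` is hypothetical.

```ruby
STOP_CODONS = %w[TAA TAG TGA].freeze

# Hypothetical helper: find the first ORF (ATG ... first in-frame stop).
# Returns 1-based, inclusive positions, or nil if there is no complete ORF.
def first_orf(dna)
  sequence = dna.upcase
  start = sequence.index('ATG')
  return nil unless start

  # Walk codon by codon in the frame defined by the start codon.
  (start...sequence.size - 2).step(3) do |position|
    next unless STOP_CODONS.include?(sequence[position, 3])
    return { start: start + 1,
             stop: position + 3,
             sequence: sequence[start...position + 3] }
  end
  nil # start codon found, but no in-frame stop codon
end

p first_orf('ccatgaaatggtaacc')
```

Because the scan stops at the first in-frame stop codon, every codon before the last one is guaranteed to be a non-stop codon.
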
535
- ..........................................................................
662
+ ------------------------------------------------------------------------------------------
536
663
  (5) Continue with biojava in bioroebe.
537
664
 
538
665
  → We need to make some table that tells us what is implemented
@@ -547,7 +674,7 @@ THEN THIS CAN BE REMOVED!!!!!!!
547
674
 
548
675
  dprimer M-T-T-Y-Y-T-A-A-A-STOP
549
676
 
550
- ..........................................................................
677
+ ------------------------------------------------------------------------------------------
551
678
  (1) → The codon tables:
552
679
  → In January we added a codon-table GUI to ruby-gtk3.
553
680
 
@@ -576,31 +703,29 @@ THEN THIS CAN BE REMOVED!!!!!!!
576
703
 
577
704
  This now sorta works semi-ok.
578
705
 
579
- -------------------------------------------------------------------------------
706
+ ------------------------------------------------------------------------------------------
580
707
  (1) → In the bioroebe-shell, enable input such as:
581
708
 
582
709
  NC_000011.10
583
710
 
584
711
  This shall quickly download this sequence into the
585
712
  local file, and also rename it properly.
586
- -------------------------------------------------------------------------------
713
+ ------------------------------------------------------------------------------------------
587
714
  → clone all of bioruby
588
- -------------------------------------------------------------------------------
715
+ ------------------------------------------------------------------------------------------
589
716
  (1) → read through bioinformatics books and include the material !!!
590
717
  ^^^^^ add more pictures ... possibly also of the GUIs.
591
718
  Also work through biopython and port everything important
592
719
  to bioroebe.
593
- -------------------------------------------------------------------------------
720
+ ------------------------------------------------------------------------------------------
594
721
  - Add: DetectMotif
595
722
 
596
723
  This class shall be used for detecting subsequences.
597
- -------------------------------------------------------------------------------
724
+ ------------------------------------------------------------------------------------------
598
725
  - Add new functionality
599
- -------------------------------------------------------------------------------
600
- - more docs!
601
- -------------------------------------------------------------------------------
602
- - continue on bioroebe, and when it is done, write to the guy.
603
- -------------------------------------------------------------------------------
726
+ ------------------------------------------------------------------------------------------
727
+ - more docs!!!
728
+ ------------------------------------------------------------------------------------------
604
729
  - Rewrite bioroebe completely - add some tests, too or so, to
605
730
  test this. ^^^
606
731
  That way we learn how to write tests.
@@ -643,22 +768,13 @@ extend bioroebe sinatra interface
643
768
  also add a footer to show which entries are available or so
644
769
  → in bioroebe, make the postgresql database work again ...
645
770
 
646
-
647
-
648
- ..........................................................................
771
+ ------------------------------------------------------------------------------------------
649
772
 
650
773
  → ^^^ improve this whole project a lot
651
774
 
652
775
  before uploading then send email
653
776
 
654
777
 
655
- - 1fat.pdb
656
-
657
- ^^^ download this, also via bioshell
658
- download 1fat
659
- ^^^ notify the user about this
660
- but put it into the dir of bioshell
661
-
662
778
  → add:
663
779
 
664
780
  set_dna :insulin
@@ -674,44 +790,20 @@ also add a footer to show which entries are available or so
674
790
  → becomes: http://www.ncbi.nlm.nih.gov/gene/3630
675
791
 
676
792
  wtf ... better to learn how NCBI works
677
- -------------------------------------------------------------------------------
678
- - Add a seuqence table int obioroebe for GFP, YFP etc
793
+ ------------------------------------------------------------------------------------------
794
+ - Add a seuqence table into bioroebe for GFP, YFP etc
679
795
  and make this show in both the interactive bioshell but
680
796
  also the main README.md
681
- -------------------------------------------------------------------------------
682
- - stop_frame1?
683
- ^^^ add support for this
684
- and stop_frame2?
685
- etcc
686
- to show stop-codons in this colour
687
- THEN UPLOAD!
688
- ^^^ this works now but is not documented
689
-
690
-
691
- -------------------------------------------------------------------------------
692
-
693
- - chop to first ATG
694
797
 
695
- chop :ATG
696
-
697
- ^^^^ enable this, to chop towards the first ATG
698
- sequence in the string
699
-
700
- -------------------------------------------------------------------------------
798
+ ------------------------------------------------------------------------------------------
701
799
  → http://www.biophp.org/stats/describe_data/demo.php?show=formula
702
800
 
703
801
  ^^^ should also add documentation like this, also via www interface
704
- -------------------------------------------------------------------------------
705
- → add mouse chromsoome URL, also in the bioshell
706
- and the main README, to be of help for the
707
- user. add a mouse subsection.
708
- ..........................................................................
709
- → fix the taxonomy stuff...
710
- ..........................................................................
802
+ ------------------------------------------------------------------------------------------
711
803
  (1) → add 2nd_orf
712
804
  → this shall scan for the 2nd orf
713
805
  → and third ORF as well, then, and document it.
714
- ..........................................................................
806
+ ------------------------------------------------------------------------------------------
715
807
  (2) → Add a "cutter-range example" in restriction enzymes +
716
808
  table + examples + tutorial
717
809
 
@@ -719,21 +811,16 @@ also add a footer to show which entries are available or so
719
811
 
720
812
  Also, add in the documentation where this
721
813
  can be found.
722
- ..........................................................................
723
- (3) → Add aaruler, similar to "ruler"; in the bioshell.
724
- But we want to do this on the dna-sequence rather
725
- than the aminoacid sequence.
726
- This works but the display is not ideal.
727
- ..........................................................................
814
+ ------------------------------------------------------------------------------------------
728
815
  (4) → Add some codon-usage analyzer. What shall it show? It
729
816
  should show how many codons are used, frequencies etc...
730
817
  by an organism, and compare that to other data.
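
As a starting point, a sketch of what such an analyzer could report for a single coding sequence: how often each codon occurs and its share in percent. The method name `codon_usage` is hypothetical, and the input is assumed to start in frame.

```ruby
# Hypothetical helper: count codons of an in-frame coding sequence.
# Returns { codon => { count: n, percent: x } }.
def codon_usage(dna)
  codons = dna.upcase.gsub(/\s+/, '').scan(/.{3}/) # trailing 1-2 nt are ignored
  counts = codons.each_with_object(Hash.new(0)) { |codon, hash| hash[codon] += 1 }
  total = codons.size.to_f
  counts.each_with_object({}) do |(codon, count), result|
    result[codon] = { count: count, percent: (count / total * 100).round(2) }
  end
end

codon_usage('ATGAAAAAATAA').each do |codon, stats|
  puts "#{codon}  #{stats[:count]}  #{stats[:percent]}%"
end
```

Comparing organisms would then come down to comparing two such hashes codon by codon.
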
731
- ..........................................................................
818
+ ------------------------------------------------------------------------------------------
732
819
  (5) → Implement a GPCR interface.
733
820
 
734
821
  This is for "G-protein coupled receptors."
735
822
  Denote which variants exist and so forth. Document it as well.
736
- ..........................................................................
823
+ ------------------------------------------------------------------------------------------
737
824
  (6) → alu?
738
825
 
739
826
  Will read from the file `/Programs/Ruby/2.3.0/lib/ruby/site_ruby/2.3.0/bioroebe/yaml/alu_elements.yml`.
@@ -756,7 +843,7 @@ also add a footer to show which entries are available or so
756
843
  ^^^ add this and document it or something like that
757
844
  And perhaps add a small protein as an example how to
758
845
  work with .pdb files instead.
759
- -------------------------------------------------------------------------------
846
+ ------------------------------------------------------------------------------------------
760
847
  (4) → Extend bioroebe to allow download
761
848
 
762
849
  PDB files
@@ -770,13 +857,13 @@ also add a footer to show which entries are available or so
770
857
 
771
858
  in 3EML 2VTP 2VEZ
772
859
  do
773
- ..........................................................................
860
+ ------------------------------------------------------------------------------------------
774
861
  (1) → Fully integrate electron microscopy then remove the old entry.
775
862
  Test it though.
776
863
  Hmm... but ... we will first polish the main bioroebe
777
864
  gem AND the taxonomy gem and THEN AFTERWARDS
778
865
  integrate electron microscopy.
779
- ..........................................................................
866
+ ------------------------------------------------------------------------------------------
780
867
  (1) → ORF Finder:
781
868
 
782
869
  We must add an ORF finder for the bioroebe project,
@@ -785,23 +872,23 @@ also add a footer to show which entries are available or so
785
872
  This works partially... start_stop works but we do not
786
873
  yet find all subsequences.
787
874
 
788
- ..........................................................................
875
+ ------------------------------------------------------------------------------------------
789
876
  (1) → must change how we determine whether we have protein or nucleotide or
790
877
  so via a toplevel method!
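
A rough sketch of how such a toplevel method might look. The name `sequence_type` and the returned symbols are assumptions; it is a plain alphabet heuristic and will misclassify a protein that happens to consist only of A, C, G and T residues.

```ruby
# Hypothetical toplevel helper: guess the type of a sequence from its alphabet.
# Returns :dna, :rna, :protein or :unknown.
def sequence_type(sequence)
  letters = sequence.upcase.gsub(/[^A-Z]/, '') # drop whitespace, digits etc.
  return :unknown if letters.empty?
  return :dna     if letters.match?(/\A[ACGTN]+\z/)
  return :rna     if letters.match?(/\A[ACGUN]+\z/)
  :protein
end

puts sequence_type('atgcatgcaaaaaaa')
puts sequence_type('MFLMVSPTAYHQNKDECFLP')
```
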
791
- ..........................................................................
878
+ ------------------------------------------------------------------------------------------
792
879
  (1) → there is a talens module.
793
880
  we have to improve on it for a while
794
881
  better docu
795
882
  more testing
796
883
  then we can get rid of this entry here
797
- ..........................................................................
884
+ ------------------------------------------------------------------------------------------
798
885
  (1) → 33.44
799
886
  Next showing the nucleotides 33 to 44 (including 33 and 44).
800
887
  The length of the fragment will be 12 nucleotides.
801
888
  5' - 2;70;130;180 - 3'
802
889
  ^^^ there is some problem; we somehow embed the colour codes,
803
890
  which should not happen.
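
A sketch of the intended behaviour, assuming the stray "2;70;130;180" is the remainder of an ANSI colour escape that ended up inside the sequence. Both method names are hypothetical. The idea is to strip colour codes first and then slice with 1-based, inclusive positions.

```ruby
# Hypothetical helper: remove ANSI colour escape sequences from a string.
def strip_colour_codes(string)
  string.gsub(/\e\[[0-9;]*m/, '')
end

# Hypothetical helper: nucleotides first..last, 1-based and inclusive.
def subsequence(sequence, first, last)
  plain = strip_colour_codes(sequence)
  unless first >= 1 && last >= first && last <= plain.size
    raise ArgumentError, "invalid range #{first}..#{last}"
  end
  plain[(first - 1)..(last - 1)]
end

sequence = 'ACGT' * 15 # 60 nucleotides, made up for this example
fragment = subsequence(sequence, 33, 44)
puts "5' - #{fragment} - 3' (#{fragment.size} nucleotides)"
```

Colourizing should then happen only at display time, on the already extracted fragment.
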
804
- ..........................................................................
891
+ ------------------------------------------------------------------------------------------
805
892
  (1) → set_aa DTLCIGYHAN NSTDTVDTVL EKNVTVTHSV NLLEDKHNGK LCKLRGVAPL HLGKCNIAGW ILGNPECESL STASSWSYIV ETSNSDNGTC YPGDFINYEE LREQLSSVSS FERFEIFPKT SSWPNHDNKG VTAACPHAGA KSFYKNLIWL VKKGNSYPKL NQSYINDKGK EVLVLWGIHH PSTTADQQSL YQNADAYVFV GTSRYSKKFK PEIATRPKVR DQEGRMNYYW TLVEPGDKIT FEATGNLVVP RYAFMERNAG SGIIISDTPV HDCNTTCQTP EGAINTSLPF QNIHPITIGK CPKYVKSTKL RLATGLRNVP SIQSRGLFGA IAGFIEGGWT GMVDGWYGYH HQNEQGSGYA ADLKSTQNAI DKITNKVNSV IKMNTQFTAV GKEFNHLEKR IENLNKKVDD GFLDIWTYNA ELLVLLENER TLDYHDSNVK NLYEKVRNQL KNNAKEIGNG CFEFYHKCDN TCMESVKNGT YDYPKYSEEA KLNREKIDGV KLESTRIYHH HHHH
806
893
 
807
894
  ^^^ enable copy/pasting,
@@ -816,7 +903,7 @@ also add a footer to show which entries are available or so
816
903
  This sequence has 50 aminoacids.
817
904
  ^^^ that is not correct.
818
905
 
819
- ..........................................................................
906
+ ------------------------------------------------------------------------------------------
820
907
  (1) → add this functionality:
821
908
 
822
909
  melting temper
@@ -853,70 +940,46 @@ also add a footer to show which entries are available or so
853
940
  and also provide a commandline-way to calculate them,
854
941
  using ruby. The latter may be useful and rather easy for
855
942
  scripted use.
856
- ..........................................................................
943
+ ------------------------------------------------------------------------------------------
857
944
  (1) → show insulin
858
945
  ^^^ to show the insulin structure
859
946
  how to find it? no idea...
860
947
  but we should have these structures already made available somewhere.
861
- ..........................................................................
948
+ ------------------------------------------------------------------------------------------
862
949
  (1) → Todo: find family of enzymes, based on sequence structure
863
950
  alone.
864
- ..........................................................................
865
- (1) → https://pubchem.ncbi.nlm.nih.gov/compound/16131099#section=Top
866
-
867
- ^^^ this website is quite interesting; try to use components
868
- from it.
869
- -------------------------------------------------------------------------------
870
- (1) → Add some option to show the aminoacid sequence, at the least
871
- store it; and optionally show it.
872
951
 
873
- possibly always report how many aminoacids are
874
- part of that file; and optionally also show
875
- the whole sequence.
876
- -------------------------------------------------------------------------------
952
+ ------------------------------------------------------------------------------------------
877
953
  (1) → WORK THROUGH the PROTOCOL AT BOKU. THEN WORK THROUGH THE VARIOUS
878
954
  TIDBITS AT UNI WIEN STARTING WITH HEIKO.
879
955
  ^^^ this is where we are now.
880
956
  we are at the beginning of 1b ... hmmmm, so first go through the
881
957
  BOKU part. Then delete this entry.
882
- -------------------------------------------------------------------------------
958
+ ------------------------------------------------------------------------------------------
883
959
  (1) → Begin tk-bindings for bioroebe, following the gtk stuff.
884
- -------------------------------------------------------------------------------
960
+ ------------------------------------------------------------------------------------------
885
961
  (2) → frame_value = position_of_the_stop_codon - position_of_the_start_codon
886
962
  ^^^ continue on this ...
887
- -------------------------------------------------------------------------------
963
+ ------------------------------------------------------------------------------------------
888
964
  (1) → improve both the gtk-apps parts, and the sinatra web-interface,
889
965
  and other GUI-like elements. The idea is to make this software
890
966
  more useful for people around the world, which should help
891
967
  increase its adoption rate.
892
- -------------------------------------------------------------------------------
968
+ ------------------------------------------------------------------------------------------
893
969
  (2) → Look to integrate this:
894
970
 
895
971
  http://www.ncbi.nlm.nih.gov/nuccore/NM_007315.3?report=fasta&log$=seqview&format=text
896
972
  ^^^
897
- -------------------------------------------------------------------------------
898
- (1) → Clone and document the getorf functionality properly.
899
-
900
- See: http://emboss.sourceforge.net/apps/cvs/emboss/apps/getorf.html
901
- -------------------------------------------------------------------------------
902
- (2) → set_dna_sequence alu
903
-
904
- ^^^ fetch random alu
905
-
906
- ^^^ alu sequence
907
- Ok we started this now adding more details, but we
908
- need to become better at searching for this
909
- sequence.
910
- -------------------------------------------------------------------------------
973
+ ------------------------------------------------------------------------------------------
911
974
  (3) → We need to make available the ... thingy magick
912
975
  emboss functionality. that may seem useful
913
976
  but also feel free to extend these parts for
914
977
  bioroebe as necessary.
915
- -------------------------------------------------------------------------------
978
+ ------------------------------------------------------------------------------------------
916
979
  (4) → integrate electron_microscopy fully
917
980
  This will take more time, so first we finish with the
918
981
  taxonomy module instead.
919
- -------------------------------------------------------------------------------
982
+ ------------------------------------------------------------------------------------------
920
983
  (5) → Improve support for BLAST up until
921
984
 
922
985
  middle of 2015 so that I am better prepared
@@ -927,7 +990,7 @@ also add a footer to show which entries are available or so
927
990
  So, work on BLAST tutorial at bioinf page:
928
991
 
929
992
  bl bioinf; rf bioinf
930
- -------------------------------------------------------------------------------
993
+ ------------------------------------------------------------------------------------------
931
994
  (3) → integrate a "codon usage database", whatever this means.
932
995
  It is a cool database anyway. Then document this.
933
996
  First, create a codon-usage analysis on a per-FASTA
@@ -935,7 +998,7 @@ also add a footer to show which entries are available or so
935
998
  and calculate the codon usage from there.
936
999
 
937
1000
  ^^^ and add some GUI to this. hmmm
938
- ..........................................................................
1001
+ ------------------------------------------------------------------------------------------
939
1002
  (4) → Input sequence:
940
1003
 
941
1004
  MFLMVSPTAYHQNKDECFLP
@@ -951,46 +1014,40 @@ also add a footer to show which entries are available or so
951
1014
 
952
1015
  ^^^ we should also show this on the commandline AND the
953
1016
  www ... hmmm.
954
- ..........................................................................
1017
+ ------------------------------------------------------------------------------------------
955
1018
  (5) → enable a graphical layer so that we can find out which
956
1019
  transcription factor activates which gene(s). This
957
1020
  should show e. g. a transcription factor highlighting
958
1021
  a target genetic area.
959
- ..........................................................................
1022
+ ------------------------------------------------------------------------------------------
960
1023
  (2) → We should add more screenshots, make them available on imgur
961
1024
  as well, after storing them locally. Start with the more
962
1025
  important functionality.
963
1026
 
964
- ..........................................................................
1027
+ ------------------------------------------------------------------------------------------
965
1028
  (2) → clone serial cloner or whatever the name was, that GUI,
966
1029
  so that we can offer the same functionality.
967
- ..........................................................................
1030
+ ------------------------------------------------------------------------------------------
968
1031
  (1) →
969
1032
 
970
1033
  # * searching for PubMed IDs given a query string:
971
1034
  # * Bio::PubMed#esearch (recommended)
972
1035
  # * Bio::PubMed#search (only retrieves top 20 hits; will be deprecated)
973
1036
  ^^^ implement this
974
-
975
-
976
- ..........................................................................
1037
+ ------------------------------------------------------------------------------------------
977
1038
  (3) → Be able to solve exercise 16 in bioroebe
978
- ..........................................................................
979
- (4) → The taxonomy part should be fully integrated, without it
980
- being a standalone part anymore.
981
- continue on the taxonomy stuff.
982
- ne day this will work again *shake fist*
983
- -------------------------------------------------------------------------------
1039
+
1040
+ ------------------------------------------------------------------------------------------
984
1041
  (5) → re1 = Bio::RestrictionEnzyme::DoubleStranded.new(enzyme1)
985
1042
 
986
1043
  ^^^ add this? hmmmm
987
1044
  ^^^ from here.
988
- -------------------------------------------------------------------------------
1045
+ ------------------------------------------------------------------------------------------
989
1046
  (1) → Colourize exon/intron boundaries.
990
- -------------------------------------------------------------------------------
1047
+ ------------------------------------------------------------------------------------------
991
1048
  (2) → In bioroebe: enhance phylogeny stuff and perhaps automatically
992
1049
  generate pictures here.
993
- -------------------------------------------------------------------------------
1050
+ ------------------------------------------------------------------------------------------
994
1051
  (1) → In sinatra: add a backtranseq entry point, perhaps
995
1052
  alias it as well.
996
1053
 
@@ -1000,7 +1057,7 @@ bioroebe --protein-to-dna
1000
1057
 
1001
1058
  ^^^ this shall start the GTK3 variant
1002
1059
 
1003
- -------------------------------------------------------------------------------
1060
+ ------------------------------------------------------------------------------------------
1004
1061
  (1) → require 'rubygems/text'
1005
1062
  include Gem::Text
1006
1063
  levenshtein_distance 'shevy', 'chevy' # => 1
@@ -1012,13 +1069,13 @@ bioroebe --protein-to-dna
1012
1069
  https://github.com/rubygems/rubygems/blob/master/lib/rubygems/text.rb
1013
1070
  ^^^ actually move that part into bioroebe itself...
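A minimal sketch of what moving this into bioroebe might look like: a pure-Ruby Levenshtein distance that does not depend on `rubygems/text`. The method name simply mirrors `Gem::Text#levenshtein_distance`; where it would live inside Bioroebe is left open here.

```ruby
# Hypothetical standalone helper (not part of the current Bioroebe API).
# Classic dynamic-programming edit distance, keeping only two rows in memory.
def levenshtein_distance(a, b)
  return b.length if a.empty?
  return a.length if b.empty?

  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,       # deletion
        current[j - 1] + 1,    # insertion
        previous[j - 1] + cost # substitution
      ].min
    end
    previous = current
  end
  previous.last
end

puts levenshtein_distance('shevy', 'chevy')
```

Such a helper could back "Did you mean?" suggestions for mistyped shell commands.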
1014
1071
 
1015
- -------------------------------------------------------------------------------
1072
+ ------------------------------------------------------------------------------------------
1016
1073
  (1) → add _source to all APIs in sinatra there. Ensure that this works
1017
1074
  too. The user should be able to view the source code.
1018
1075
  ^^^ it has been added for 2 methods so far in sinatra; we need
1019
1076
  to add it for the remaining ones too. Then we can remove
1020
1077
  this entry point.
1021
- -------------------------------------------------------------------------------
1078
+ ------------------------------------------------------------------------------------------
1022
1079
  (2) → Check out expasy
1023
1080
  PeptideCutter
1024
1081
  also offer this functionality, through commandline, GUI
@@ -1026,16 +1083,12 @@ bioroebe --protein-to-dna
1026
1083
  https://web.expasy.org/peptide_cutter/
1027
1084
  We now have added trypsin but we should add more here; and
1028
1085
  still have to add support for sinatra here.
1029
- -------------------------------------------------------------------------------
1086
+ ------------------------------------------------------------------------------------------
1030
1087
  (3) → melting temperature subsection
1031
1088
 
1032
1089
  hmmm .... molecular weight calculation works now ... but
1033
1090
  ... is it correct for a ssDNA string? hmm...
1034
- -------------------------------------------------------------------------------
1035
- (3) → Add useful formulas for bioshell.
1036
-
1037
-
1038
- ...........................................................................
1091
+ ------------------------------------------------------------------------------------------
1039
1092
  (1) → Degenerate Primers
1040
1093
 
1041
1094
  You can try to determine the degenerate primers via the Shell
@@ -1046,7 +1099,7 @@ bioroebe --protein-to-dna
1046
1099
  ^^^ expand that subsection
1047
1100
  more explanations and examples
1048
1101
 
1049
- -------------------------------------------------------------------------------
1102
+ ------------------------------------------------------------------------------------------
1050
1103
  (1) → Copy the functionality of plotorf:
1051
1104
 
1052
1105
  See:
@@ -1062,7 +1115,7 @@ bioroebe --protein-to-dna
1062
1115
 
1063
1116
 
1064
1117
 
1065
- -------------------------------------------------------------------------------
1118
+ ------------------------------------------------------------------------------------------
1066
1119
  (2) → Start nucleotide position is at: 142
1067
1120
 
1068
1121
  See the following example:
@@ -1072,24 +1125,24 @@ bioroebe --protein-to-dna
1072
1125
  BIO SHELL>
1073
1126
  ^^^ this does not work; nothing is highlighted.
1074
1127
 
1075
- -------------------------------------------------------------------------------
1128
+ ------------------------------------------------------------------------------------------
1076
1129
  (2) → Add a myristoylation signal
1077
1130
 
1078
1131
  Met-Gly-Xaa-Xaa-YXaa-Ser/Thr-Lys-Lys
1079
1132
 
1080
1133
  ^^^ but check first.
1081
1134
 
1082
- -------------------------------------------------------------------------------
1135
+ ------------------------------------------------------------------------------------------
1083
1136
  (3) → integrate the bioroebe_tutorial.cgi into the .md file completely.
1084
1137
 
1085
- -------------------------------------------------------------------------------
1138
+ ------------------------------------------------------------------------------------------
1086
1139
  (4) → Integrate everything from the biopython tutorial, if it makes
1087
1140
  sense.
1088
1141
 
1089
- -------------------------------------------------------------------------------
1142
+ ------------------------------------------------------------------------------------------
1090
1143
  (5) → Improve the codon-optimizer in Bioroebe, including the
1091
1144
  documentation. We need to make this really useful.
1092
- -------------------------------------------------------------------------------
1145
+ ------------------------------------------------------------------------------------------
1093
1146
  (6) →
1094
1147
  5'- TACACGGCACAT -3'
1095
1148
  3'- ATGTGCCGTGTA -5'
@@ -1098,7 +1151,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1098
1151
 
1099
1152
  ^^^ integrate mirror repeats creation
1100
1153
  and searching for them. Hmmm.
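A rough sketch of mirror-repeat creation and detection. It assumes a perfect mirror repeat is a strand that reads the same in both directions (no complementing involved), as in the `TACACGGCACAT` example above. Both method names are hypothetical.

```ruby
# Hypothetical helpers; neither exists in Bioroebe under these names.

# Build a perfect mirror repeat from one half-site.
def create_mirror_repeat(half)
  half + half.reverse
end

# A perfect mirror repeat equals its own reverse on the same strand.
def mirror_repeat?(sequence)
  sequence == sequence.reverse
end

puts create_mirror_repeat('TACACG')
puts mirror_repeat?('TACACGGCACAT')
```

Imperfect mirror repeats would need a mismatch tolerance on top of this check.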
1101
- -------------------------------------------------------------------------------
1154
+ ------------------------------------------------------------------------------------------
1102
1155
  (7) → continue porting bioroebe/taxonomy
1103
1156
 
1104
1157
  ^^^^^^^^^^
@@ -1108,12 +1161,12 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1108
1161
  ^^^ this is the next step, so that
1109
1162
  we no longer need this.
1110
1163
 
1111
- -------------------------------------------------------------------------------
1164
+ ------------------------------------------------------------------------------------------
1112
1165
  (8) → find out which bacteria all contain the needle complex; find out
1113
1166
  the sequence for the needle complex as well and study it;
1114
1167
  find the positions of the genes responsible.
1115
1168
 
1116
- -------------------------------------------------------------------------------
1169
+ ------------------------------------------------------------------------------------------
1117
1170
  (9) → Add trypsin_digest, also in the shell, but possibly
1118
1171
  on toplevel as well (if the input is a protein sequence).
1119
1172
 
@@ -1127,29 +1180,24 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1127
1180
  And document it; but do not digest if a proline
1128
1181
  follows !!!
1129
1182
  ^^^ document this too into .md
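The rule above could be sketched as follows: cleave after lysine (K) or arginine (R) unless the next residue is proline (P). The method name `trypsin_digest` is taken from this entry; the signature and return type are assumptions.

```ruby
# Sketch only: the real Bioroebe method may take other arguments.
# Splits C-terminal to K or R, except when the next residue is P.
def trypsin_digest(protein)
  protein.upcase.split(/(?<=[KR])(?!P)/)
end

p trypsin_digest('MKRPAKLR')
```

The zero-width lookbehind keeps the K or R at the end of each fragment, and the negative lookahead implements the proline exception.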
1130
-
1131
- -------------------------------------------------------------------------------
1132
- (10) → in bioroebe, add a Coomassie check... do we include
1133
- arginine or not.
1134
-
1135
- ..........................................................................
1183
+ ------------------------------------------------------------------------------------------
1136
1184
  (11) → add codon usage in bioroebe
1137
- -------------------------------------------------------------------------------
1185
+ ------------------------------------------------------------------------------------------
1138
1186
  (12) → Clone the following functionality.
1139
1187
 
1140
1188
  http://www.bioinformatics.nl/cgi-bin/emboss/help/sirna
1141
- -------------------------------------------------------------------------------
1189
+ ------------------------------------------------------------------------------------------
1142
1190
  (13) → Improve the "find and scan" subsection. We must be able to find
1143
1191
  subsequences; check for "matches" as well, including the bioshell.
1144
- -------------------------------------------------------------------------------
1192
+ ------------------------------------------------------------------------------------------
1145
1193
  (14) → Clone the CLUSTAL format alignment.
1146
- -------------------------------------------------------------------------------
1194
+ ------------------------------------------------------------------------------------------
1147
1195
  (15) → We need to be able to load up a whole genome into bioroebe,
1148
1196
  and then be able to manipulate it.
1149
1197
 
1150
1198
  ^^^ perhaps test this with some example
1151
1199
  data or so...
1152
- -------------------------------------------------------------------------------
1200
+ ------------------------------------------------------------------------------------------
1153
1201
  (16) → Restriction enzymes:
1154
1202
 
1155
1203
  Add a subsection about restriction enzymes including
@@ -1163,7 +1211,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1163
1211
  general, so that we can reproduce and verify the
1164
1212
  information there.
1165
1213
 
1166
- -------------------------------------------------------------------------------
1214
+ ------------------------------------------------------------------------------------------
1167
1215
  (18) → clone pepinfo
1168
1216
 
1169
1217
  The program "pepinfo" plots various amino acid properties in
@@ -1181,7 +1229,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1181
1229
 
1182
1230
  The data are also written out to an output file.
1183
1231
 
1184
- -------------------------------------------------------------------------------
1232
+ ------------------------------------------------------------------------------------------
1185
1233
  (19) → gff?
1186
1234
 
1187
1235
  There are 6 .gff3 files in the current directory.
@@ -1193,23 +1241,22 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1193
1241
 
1194
1242
  ^^^ we need an analyze-mode as well.
1195
1243
 
1196
- ..........................................................................
1244
+ ------------------------------------------------------------------------------------------
1197
1245
  (20) → ^^^^ add the ability to
1198
1246
  show a ruler AND highlighting as well
1199
1247
  ^^^ then document it.
1200
- ..........................................................................
1248
+ ------------------------------------------------------------------------------------------
1201
1249
  (21) → https://github.com/bioperl/bioperl-live
1202
1250
  Look what we can take from ^^^.
1203
1251
 
1204
1252
  https://github.com/bioperl/bioperl-live/tree/master/examples
1205
1253
 
1206
- ..........................................................................
1254
+ ------------------------------------------------------------------------------------------
1207
1255
  (23) → continue biojava, and bioroebe a bit
1208
1256
 
1209
1257
  Ideally we should have biojava to a working point.
1210
- ..........................................................................
1211
- (24) → Clone all of Emboss. :)
1212
- ..........................................................................
1258
+
1259
+ ------------------------------------------------------------------------------------------
1213
1260
  (25) → clone the functionality found at https://web.expasy.org/protparam/
1214
1261
 
1215
1262
  https://web.expasy.org/cgi-bin/protparam/protparam
@@ -1219,7 +1266,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1219
1266
 
1220
1267
  Theoretical pI: 5.78
1221
1268
 
1222
- -------------------------------------------------------------------------------
1269
+ ------------------------------------------------------------------------------------------
1223
1270
  (27) → NP_417539.1
1224
1271
 
1225
1272
  https://www.ncbi.nlm.nih.gov/protein/NP_417539.1
@@ -1227,26 +1274,17 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1227
1274
 
1228
1275
  ^^^ if the input is exactly like the above, on the first line,
1229
1276
  download the sequence.
1230
- -------------------------------------------------------------------------------
1231
- (28) → Integrate these nice GUI parts:
1232
-
1233
- https://dev.to/kojix2/introduction-to-gr-rb-data-visualization-with-ruby-2c39
1234
-
1235
- -------------------------------------------------------------------------------
1236
- (29) → http://insilico.ehu.es/
1237
-
1238
- ^^^ check if we have all of this incorporated
1239
- -------------------------------------------------------------------------------
1277
+ ------------------------------------------------------------------------------------------
1240
1278
  (30) → http://www.biostars.org/
1241
1279
 
1242
1280
  ^^^ regularly work through this
1243
1281
  and try to help
1244
1282
  and extend bioruby at the same time.
1245
- -------------------------------------------------------------------------------
1283
+ ------------------------------------------------------------------------------------------
1246
1284
  (31) → The taxonomy-submodule should work one day, and be properly
1247
1285
  documented as well. Perhaps integrate the parts of Taxonomy
1248
1286
  that can be included into the toplevel domain.
1249
- -------------------------------------------------------------------------------
1287
+ ------------------------------------------------------------------------------------------
1250
1288
  (32) → Enable:
1251
1289
 
1252
1290
  Bioroebe.set_genetic_code()
@@ -1262,7 +1300,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1262
1300
 
1263
1301
  ^^^ enable this as well; extend documentation too.
1264
1302
 
1265
- -------------------------------------------------------------------------------
1303
+ ------------------------------------------------------------------------------------------
1266
1304
  (34) → We have found a restriction enzyme called NheI.
1267
1305
 
1268
1306
  The sequence this 6-cutter relates to is: `5' - GCTAGC - 3'`
@@ -1270,23 +1308,23 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1270
1308
  This restriction enzyme will produce a blunt overhang.
1271
1309
 
1272
1310
  ^^^ nope, this is wrong
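NheI cuts `G^CTAGC`, which leaves a 5' overhang (`CTAG`) rather than a blunt end. A rough sketch of how the overhang type could be derived from the cut offset, assuming a palindromic recognition site; `overhang_type` and `overhang_sequence` are hypothetical helper names.

```ruby
# Hypothetical helpers, not existing Bioroebe API.
# top_cut is the number of bases left of the cut on the top strand;
# for a palindromic site the bottom-strand cut mirrors it.
def overhang_type(site, top_cut)
  bottom_cut = site.length - top_cut
  if top_cut == bottom_cut
    :blunt
  elsif top_cut < bottom_cut
    :five_prime_overhang
  else
    :three_prime_overhang
  end
end

# The single-stranded bases lying between the two cuts.
def overhang_sequence(site, top_cut)
  first, last = [top_cut, site.length - top_cut].minmax
  site[first...last]
end

puts overhang_type('GCTAGC', 1)     # NheI, G^CTAGC
puts overhang_sequence('GCTAGC', 1)
```

A check like this could guard the generated description against the wrong "blunt" wording.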
1273
- -------------------------------------------------------------------------------
1311
+ ------------------------------------------------------------------------------------------
1274
1312
  (35) → Sau3A?
1275
1313
  ^^^ enable this restriction site
1276
1314
 
1277
- -------------------------------------------------------------------------------
1315
+ ------------------------------------------------------------------------------------------
1278
1316
  (37) → Add matplotlib support.
1279
1317
 
1280
1318
  try_to_use_matplotlib
1281
- -------------------------------------------------------------------------------
1319
+ ------------------------------------------------------------------------------------------
1282
1320
  (38) → https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmTools/RESTfulAPIs.html
1283
- -------------------------------------------------------------------------------
1321
+ ------------------------------------------------------------------------------------------
1284
1322
  (39) → The following input:
1285
1323
 
1286
1324
  downcase; orf?; seq?
1287
1325
 
1288
1326
  leads to strange display. Something is wrong here, must be checked.
1289
- -------------------------------------------------------------------------------
1327
+ ------------------------------------------------------------------------------------------
1290
1328
  (40) → Continue with rosalind problems.
1291
1329
 
1292
1330
  These challenges can be found here:
@@ -1295,42 +1333,42 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1295
1333
 
1296
1334
  Also integrate these rosalind-quizzes into bioroebe
1297
1335
  when possible.
1298
- -------------------------------------------------------------------------------
1336
+ ------------------------------------------------------------------------------------------
1299
1337
  (41) → https://web.expasy.org/cgi-bin/peptide_mass/peptide-mass.pl
1300
1338
 
1301
1339
  ^^^ make the above usable in sinatra as well
1302
- -------------------------------------------------------------------------------
1340
+ ------------------------------------------------------------------------------------------
1303
1341
  (42) → Integrate a way to search for commonly known promoters:
1304
1342
 
1305
1343
  promoters?
1306
1344
  ^^^ this functionality
1307
1345
  ^^^ this has to be expanded
1308
1346
  and ...
1309
- -------------------------------------------------------------------------------
1347
+ ------------------------------------------------------------------------------------------
1310
1348
  (43) → Integrate:
1311
1349
 
1312
1350
  http://biotools.nubic.northwestern.edu/OligoCalc.html
1313
- -------------------------------------------------------------------------------
1351
+ ------------------------------------------------------------------------------------------
1314
1352
  (44) → Extend the Java part of BioRoebe systematically..
1315
1353
 
1316
1354
  What should come next? Let's make a list.
1317
1355
 
1318
1356
  → remove_numbers [DONE]
1319
- -------------------------------------------------------------------------------
1357
+ ------------------------------------------------------------------------------------------
1320
1358
  (46) → Study gnuplot; one day we have to draw graphs.
1321
1359
 
1322
- -------------------------------------------------------------------------------
1360
+ ------------------------------------------------------------------------------------------
1323
1361
  (47) → Add a genome browser, both ascii without GUI and also
1324
1362
  with. In ruby-gtk.
1325
- -------------------------------------------------------------------------------
1363
+ ------------------------------------------------------------------------------------------
1326
1364
  (48) → Clone the functionality of:
1327
1365
 
1328
1366
  http://www.biophp.org/minitools/restriction_digest/demo.php
1329
- -------------------------------------------------------------------------------
1367
+ ------------------------------------------------------------------------------------------
1330
1368
  (50) → Add the loxP sequence to readme [DONE] and explain this
1331
1369
  better on the main readme; and perhaps also assign
1332
1370
  the sequence via the bioshell.
1333
- -------------------------------------------------------------------------------
1371
+ ------------------------------------------------------------------------------------------
1334
1372
  (51) → 33. Cephalodiscidae Mitochondrial UAA-Tyr Code (transl_table=33)
1335
1373
 
1336
1374
  AAs = FFLLSSSSYYY*CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSSKVVVVAAAADDEEGGGG
@@ -1341,7 +1379,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1341
1379
 
1342
1380
  ^^^ add a parser, and document it, that can take this input
1343
1381
  and output the corresponding code, in a valid .yml file.
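One possible shape for such a parser, as a sketch. It assumes the input uses the NCBI layout with `AAs`, `Base1`, `Base2` and `Base3` lines; the method name `parse_genetic_code` is hypothetical, and the `Base` lines are generated here in the usual T, C, A, G order instead of being typed out.

```ruby
require 'yaml'

# Hypothetical parser: turns "Key = value" lines into a codon => amino acid hash.
def parse_genetic_code(text)
  fields = {}
  text.each_line do |line|
    key, value = line.split('=', 2)
    next if value.nil?
    fields[key.strip] = value.strip
  end
  codons = fields.fetch('Base1').chars
                 .zip(fields.fetch('Base2').chars, fields.fetch('Base3').chars)
                 .map(&:join)
  codons.zip(fields.fetch('AAs').chars).to_h
end

bases = %w[T C A G]
input = <<~TEXT
  AAs    = FFLLSSSSYYY*CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSSKVVVVAAAADDEEGGGG
  Base1  = #{bases.map { |b| b * 16 }.join}
  Base2  = #{bases.map { |b| b * 4 }.join * 4}
  Base3  = #{bases.join * 16}
TEXT

table = parse_genetic_code(input)
puts table.to_yaml
```

Writing the result with `File.write('33.yml', table.to_yaml)` would produce the .yml file; the file name is only an example.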
1344
- -------------------------------------------------------------------------------
1382
+ ------------------------------------------------------------------------------------------
1345
1383
  (52) → Add to bioroebe the ability to add cloning vectors
1346
1384
  and molecular_weight calculation
1347
1385
  for this
@@ -1363,19 +1401,19 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1363
1401
 
1364
1402
  ^^^ we also need a way to find out what resistance genes
1365
1403
  are carried there.
1366
- -------------------------------------------------------------------------------
1404
+ ------------------------------------------------------------------------------------------
1367
1405
  (53) → In the lambda genome sequence there are 10 EcoB and
1368
1406
  5 EcoK sites.
1369
1407
  ^^^ verify this too, as an example as well
1370
- -------------------------------------------------------------------------------
1408
+ ------------------------------------------------------------------------------------------
1371
1409
  (54) → show restriction sites, composable and compatible with
1372
1410
  serial clone ... hmm
1373
- -------------------------------------------------------------------------------
1411
+ ------------------------------------------------------------------------------------------
1374
1412
  (55) → enable:
1375
1413
  BIOROEBE_USE_COLOURS:
1376
1414
  can be 0 or 1
1377
1415
  what is this?
1378
- -------------------------------------------------------------------------------
1416
+ ------------------------------------------------------------------------------------------
1379
1417
  (56) → Burrows-Wheeler-Transform (BWT)
1380
1418
 
1381
1419
  ^^^ add some method here
@@ -1388,15 +1426,15 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1388
1426
 
1389
1427
  also test this against my paper-result
1390
1428
  with input being: "GATAG$".
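A naive sketch of the transform: build all rotations, sort them, and read off the last column. The method name is hypothetical. It relies on `$` sorting before the letters, which holds for Ruby's default bytewise string comparison.

```ruby
# Hypothetical helper; O(n^2 log n), fine for short test strings only.
def burrows_wheeler_transform(text)
  rotations = Array.new(text.length) { |i| text[i..-1] + text[0...i] }
  rotations.sort.map { |rotation| rotation[-1] }.join
end

puts burrows_wheeler_transform('GATAG$')
```

For `GATAG$` this yields `GTGA$A`, which can be compared against the result worked out on paper. Genome-sized input would need a suffix-array based approach instead.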
1391
- -------------------------------------------------------------------------------
1429
+ ------------------------------------------------------------------------------------------
1392
1430
  (56) → Enable working with several genes... hmm and store that somewhere.
1393
1431
  Something like a per-project workspace thingy.
1394
- -------------------------------------------------------------------------------
1432
+ ------------------------------------------------------------------------------------------
1395
1433
  (57) → Add:
1396
1434
 
1397
1435
  http://nar.oxfordjournals.org/content/35/suppl_2/W71.long
1398
1436
 
1399
- -------------------------------------------------------------------------------
1437
+ ------------------------------------------------------------------------------------------
1400
1438
  (58) → Now, you may want to translate the nucleotides up to
1401
1439
  the first in frame stop codon, and then stop (as
1402
1440
  happens in nature):
@@ -1410,14 +1448,14 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1410
1448
  Then continue from here:
1411
1449
 
1412
1450
  https://people.duke.edu/~ccc14/pcfb/biopython/BiopythonSequences.html
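Translating up to the first in-frame stop codon could be sketched like this. The codon table is built from the standard genetic code string; the method name `translate_until_stop` and the use of `X` for unknown codons are assumptions, not Bioroebe API.

```ruby
# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = %w[T C A G].freeze
STANDARD_AAS = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
CODON_TABLE = BASES.product(BASES, BASES).map(&:join)
                   .zip(STANDARD_AAS.chars).to_h.freeze

# Hypothetical helper: translate frame 1 and stop at the first stop codon.
def translate_until_stop(dna)
  protein = +''
  dna.upcase.tr('U', 'T').chars.each_slice(3) do |triplet|
    break if triplet.size < 3 # ignore a trailing partial codon
    aminoacid = CODON_TABLE.fetch(triplet.join, 'X')
    break if aminoacid == '*'
    protein << aminoacid
  end
  protein
end

puts translate_until_stop('ATGGCCTAAGGG')
```

This mirrors the `to_stop` behaviour described in the Biopython tutorial linked above.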
1413
- -------------------------------------------------------------------------------
1451
+ ------------------------------------------------------------------------------------------
1414
1452
  (59) → Add:
1415
1453
 
1416
1454
  set_dna :Ubiquitin
1417
1455
  set_dna :ubiquitin
1418
1456
 
1419
1457
  ^^^ we want to obtain the ubiquitin sequence
1420
- -------------------------------------------------------------------------------
1458
+ ------------------------------------------------------------------------------------------
1421
1459
  (59) → Telomeres
1422
1460
 
1423
1461
  Telomeres are listed from 5' to 3'.
@@ -1431,28 +1469,28 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1431
1469
  doc_telomeres
1432
1470
 
1433
1471
  ^^^ add this to say the human telomere sequence
1434
- -------------------------------------------------------------------------------
1472
+ ------------------------------------------------------------------------------------------
1435
1473
  (60) → ORF_positions?
1436
1474
  ^^^ change this a bit, to actually show the positions
1437
1475
  of the various ORFs with the start-position.
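A sketch of what reporting ORF positions could look like. It scans the forward strand only, treats every `ATG` as a possible start, and reports 1-based positions; the method name and the hash layout are hypothetical.

```ruby
# Hypothetical helper: forward strand only, ATG start, TAA/TAG/TGA stop.
def orf_positions(dna)
  dna = dna.upcase
  stops = %w[TAA TAG TGA]
  positions = []
  (0..dna.length - 3).each do |start|
    next unless dna[start, 3] == 'ATG'
    # Walk codon by codon in the frame set by this ATG.
    (start + 3).step(dna.length - 3, 3) do |i|
      next unless stops.include?(dna[i, 3])
      positions << { start: start + 1, stop: i + 3 }
      break
    end
  end
  positions
end

p orf_positions('CCATGAAATAGCC')
```

Nested ORFs that share a stop codon are all reported; filtering by minimum length would be a natural next step.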
1438
- -------------------------------------------------------------------------------
1476
+ ------------------------------------------------------------------------------------------
1439
1477
  (62) → add:
1440
1478
 
1441
1479
  setgene2
1442
1480
  add_dna2
1443
1481
  dna2
1444
1482
  dna? <--- this one is not a setter but a query.
1445
- -------------------------------------------------------------------------------
1483
+ ------------------------------------------------------------------------------------------
1446
1484
  (63) → improve the TM calculation. must be better, must have more
1447
1485
  documentation, and a small tutorial.
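As a starting point for the tutorial, the Wallace rule is a simple approximation for short oligonucleotides: Tm = 2 °C × (A+T) + 4 °C × (G+C). The sketch below is not how Bioroebe currently computes Tm; the method name is hypothetical.

```ruby
# Hypothetical helper: Wallace rule, a rough estimate for short oligos only.
def wallace_melting_temperature(dna)
  dna = dna.upcase
  at = dna.count('AT')
  gc = dna.count('GC')
  2 * at + 4 * gc
end

puts wallace_melting_temperature('ATGCATGCATGCAT')
```

Longer sequences and salt corrections need nearest-neighbour or similar models, so this would only be one of several selectable formulas.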
1448
- -------------------------------------------------------------------------------
1486
+ ------------------------------------------------------------------------------------------
1449
1487
  (64) → Compare bioroebe to:
1450
1488
 
1451
1489
  https://www.ncbi.nlm.nih.gov/orffinder
1452
1490
 
1453
1491
  whether both return the same
1454
1492
  also possibly add a web-gui
1455
- -------------------------------------------------------------------------------
1493
+ ------------------------------------------------------------------------------------------
1456
1494
  (65) → Find out ratios from:
1457
1495
 
1458
1496
  Doolittle RF. 1989. Redundancies in protein sequences. I
@@ -1478,16 +1516,16 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1478
1516
  Bioroebe::Blosum[50] as an API.
1479
1517
  and document it in general.
1480
1518
 
1481
- ..........................................................................
1519
+ ------------------------------------------------------------------------------------------
1482
1520
  (65) → http://www.biomart.org/other/user-docs.pdf
1483
1521
  ^^^ work through this
1484
- -------------------------------------------------------------------------------
1522
+ ------------------------------------------------------------------------------------------
1485
1523
  (66) → add:
1486
1524
 
1487
1525
  class Cell
1488
1526
  ^^^ simulate a cell
1489
1527
  Hmmm. Needs specific components ... and needs a better plan.
1490
- -------------------------------------------------------------------------------
1528
+ ------------------------------------------------------------------------------------------
1491
1529
  (68) → class Protein:
1492
1530
 
1493
1531
  add glycosylation pattern
@@ -1496,18 +1534,18 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1496
1534
  need to somehow add the modification type
1497
1535
 
1498
1536
  https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5358406/
1499
- -------------------------------------------------------------------------------
1537
+ ------------------------------------------------------------------------------------------
1500
1538
  (69) → In the BioShell we must be able to do probes - complementary
1501
1539
  to amino acids.
1502
- -------------------------------------------------------------------------------
1540
+ ------------------------------------------------------------------------------------------
1503
1541
  (70) → Add www-related functionality to bioroebe; eventually make use
1504
1542
  of rails, but start with sinatra possibly. In the long run,
1505
1543
  make it flexible to work with as many different frameworks
1506
1544
  as possible, though.
1507
- -------------------------------------------------------------------------------
1545
+ ------------------------------------------------------------------------------------------
1508
1546
  (71) → Show cleavage sites, for example for a lambda-DNA digest with
1509
1547
  Bgl II.
1510
- -------------------------------------------------------------------------------
1548
+ ------------------------------------------------------------------------------------------
1511
1549
  (72) → dnaanalyze
1512
1550
 
1513
1551
  In the DNA string `TCCGTCGCAACACATCGCCTCAACAAACCGACCGGGATATGCAATACCGGAATCCGATCCTTTAGAAGCTGCATTCCAAACGCTTGCAATAACACCCACTCGACTATTCAGCATTGGCAAAGGGTACGAATTCGACGAAGGGAGGGTGCTATATTTTCCAAGTTGCTCGCCGATTGATACGGAGCCTGTGGAAAGATTTCGCGGCTCTAGTCTTTAGCTTTGATGTCACCCCTGAGTAGTAACCCGGCGTGGTAGCTTTCATTAGACTTCTCGGAGAGAGTATTAAGCAAAGGTGGAGGTCCCAGGGGTCCAGTGAGCTGTATCGCACTAAAAGCATGCCTACGGGCAATGCTATTTTGCTCACAGGAACTTTGGGGGAGCCACAAACTCTCGAAGCCGGATTGTTGTGGCGGCTAACTTTCCAAAGGCGACCATTCATGGTCTGAATGGGCCCTCACCAGAAGAACGTTTTCGACGGGCATTCTTCCCCGGGGTTTCGAAGGCAAGGGTCAGCACGGCGCGGAAAAGTACGCGACGCATACCGGACTAGTCATGCAACTCCCTCGGAACTGGCGATTCCCACCCAAGAGACGCACGCTGATCATTGCCCATGCCGACTGGAGATGCTGAATTTGGTATGCGGGTCTGTTGCCAGCGCTGACATTATCGGACATTGTGGGGAGAACCGTGTGATTGATTGAGCTGGCGCATTTGTCCGCATGCTCTCCTCATGTGGACACCTTCGCAGGTTCTTTCCGCGGCCACAGTGTCGGGATCTACCCCTGGTGCGTCGCCGCGAGTACAGGTGGGGTTTCGCGCATGAGAACCAATGTTGCACGCCTCAAAACATGGCTGTAACATATTAGCGCCAATAAAAATTTTTGGCAACAAAGAAACAAGGCCAACCGAAGTGCTAAGCCGCGATCATGAAGGGGCGATGCCAGAATGGGAGTCTGCCTTTCCTGTGTGGACGTGAGATTGTACCTAGACAGAGAACGCC` we found these Nucleotides:
@@ -1532,11 +1570,11 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1532
1570
  we need to make it so that an input sequence
1533
1571
  can be assigned, and dnaanalyse --GUI should
1534
1572
  start it too. ALSO document it once this works.
1535
- -------------------------------------------------------------------------------
1573
+ ------------------------------------------------------------------------------------------
1536
1574
  (73) → go through the individual components slowly and improve them,
1537
1575
  step by step, including the documentation. Then eventually
1538
1576
  remove this todo-entry here.
1539
- -------------------------------------------------------------------------------
1577
+ ------------------------------------------------------------------------------------------
1540
1578
  (74) → Add a consensus sequence for:
1541
1579
 
1542
1580
  Asn-X-Ser/Thr-Consensus
@@ -1548,13 +1586,13 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1548
1586
  NGlyc
1549
1587
  /N-?Glyc/i
1550
1588
  ^^^ use that regex
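A sketch combining the alias regex from this entry with a search for the Asn-X-Ser/Thr sequon. It assumes the usual convention that X may be any residue except proline; the constant and method names are hypothetical.

```ruby
# The alias regex from the entry above, matching e.g. "NGlyc" or "n-glyc".
N_GLYC_ALIAS = /N-?Glyc/i
# Sequon N-X-S/T; the lookahead allows overlapping hits. X != P is assumed.
N_GLYC_SEQUON = /N(?=[^P][ST])/

# Hypothetical helper: 1-based positions of the Asn in each sequon.
def n_glycosylation_sites(protein)
  protein = protein.upcase
  positions = []
  offset = 0
  while (index = protein.index(N_GLYC_SEQUON, offset))
    positions << index + 1
    offset = index + 1
  end
  positions
end

p n_glycosylation_sites('MNGSANPTNAT')
puts 'N-glyc'.match?(N_GLYC_ALIAS)
```

The alias regex would select the motif from user input, and the sequon regex would do the actual search.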
1551
- -------------------------------------------------------------------------------
1589
+ ------------------------------------------------------------------------------------------
1552
1590
  (74) → make sure that newly generated files respect the
1553
1591
  default chmod value on the system. from bioroebe.
1554
1592
  right now we default to 755 which I assume is
1555
1593
  hardcoded but perhaps this is wrong.
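One way to respect the system default is to derive the mode from the process umask instead of hardcoding 755. A minimal sketch, with a hypothetical helper name:

```ruby
# Hypothetical helper: the mode a new regular file gets under a given umask.
# File.umask without arguments returns the current umask of the process.
def default_file_mode(umask = File.umask)
  0o666 & ~umask
end

puts format('%o', default_file_mode(0o022))
```

Note that `File.write` and `File.open` already create files with mode 0666 filtered through the umask when no permission argument is given, so dropping an explicit `chmod` call may be enough.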
1556
1594
 
1557
- -------------------------------------------------------------------------------
1595
+ ------------------------------------------------------------------------------------------
1558
1596
  (75) → require 'bio'
1559
1597
 
1560
1598
  # creating a Bio::Sequence::NA object containing ambiguous alphabets
@@ -1579,34 +1617,34 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1579
1617
  part nto a standalone file
1580
1618
  so that it can be used by both the .cgi and
1581
1619
  well rdoc...
1582
- -------------------------------------------------------------------------------
1620
+ ------------------------------------------------------------------------------------------
1583
1621
  - Add more protein-specific thingies to bioroebe.
1584
- -------------------------------------------------------------------------------
1622
+ ------------------------------------------------------------------------------------------
1585
1623
  - Push the bioshell forward and work through std_biology.rb.
1586
1624
  Perhaps we can move some of it out into a class
1587
1625
  or so.
1588
1626
 
1589
1627
  All of this should also be connected to Webmin (biomin), so that
1590
1628
  we can also use the Bioshell elegantly over the www!
1591
- -------------------------------------------------------------------------------
1629
+ ------------------------------------------------------------------------------------------
1592
1630
  - ^^^ when we find restriction enzyme sites in a DNA
1593
1631
  string, colourize them RED.
1594
1632
 
1595
1633
  also set it to
1596
1634
  set_restriction_size()
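Colourizing could be sketched with plain ANSI escape codes. The helper name is hypothetical, and Bioroebe's own colour handling may differ; EcoRI (`GAATTC`) serves only as an example site.

```ruby
RED   = "\e[31m".freeze
RESET = "\e[0m".freeze

# Hypothetical helper: wrap every occurrence of the site in red.
def highlight_restriction_sites(sequence, site)
  sequence.gsub(site) { |match| "#{RED}#{match}#{RESET}" }
end

puts highlight_restriction_sites('AAGAATTCAAGAATTCTT', 'GAATTC')
```

`gsub` only finds non-overlapping matches, which is sufficient for a single recognition sequence but not for overlapping sites of different enzymes.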
1597
- -------------------------------------------------------------------------------
1635
+ ------------------------------------------------------------------------------------------
1598
1636
  - also ... while learning C++ we extend the project here...
1599
1637
  Useful C++ things will be combined.
1600
- -------------------------------------------------------------------------------
1638
+ ------------------------------------------------------------------------------------------
1601
1639
  - As of April 2003, there were 176,890 total taxa represented.
1602
1640
 
1603
1641
  ^^^ we need a way to also output how many entries we
1604
1642
  have there.
1605
- -------------------------------------------------------------------------------
1643
+ ------------------------------------------------------------------------------------------
1606
1644
  - Replace bioruby with bioroebe completely!
1607
1645
  In order for this to work, we first need to find out
1608
1646
  what bioruby is able to do. :P
1609
- -------------------------------------------------------------------------------
1647
+ ------------------------------------------------------------------------------------------
1610
1648
  - append 33
1611
1649
  # ^^^ in the bioshell
1612
1650
  Only numbers were given: Adding 33 random nucleotides to the main string next.
@@ -1626,7 +1664,7 @@ Did you mean? return_random_codon_sequence_for_this_aminoacid_sequence
1626
1664
 
1627
1665
 
1628
1666
  ^^^^^ BUG!
1629
- -------------------------------------------------------------------------------
1667
+ ------------------------------------------------------------------------------------------
1630
1668
  > rest?
1631
1669
 
1632
1670
  We found these restriction sites within the sequence `TTCAGAACTCAACGCCTGGTTGGCCGTCCAGTAAGCTGACTAAGTAAGTCTATGCCCGCGATAACCAGGATACAGATATCGTGAAACCTGGTTTATCTCCTTCTATAAGAGTCTGCACATCTAGC`:
@@ -1656,7 +1694,7 @@ We found these restriction sites within the sequence `TTCAGAACTCAACGCCTGGTTGGCCG
1656
1694
  ^^^^ also show the position
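One possible way to report the positions, as a sketch only. `restriction_site_positions` is a hypothetical name, positions are assumed to be 1-based, and overlapping hits are reported as well.

```ruby
# Hypothetical helper: returns the 1-based start positions of +site+
# within +dna+, including overlapping occurrences.
def restriction_site_positions(dna, site)
  dna  = dna.upcase
  site = site.upcase
  positions = []
  start = 0
  while (hit = dna.index(site, start))
    positions << hit + 1 # convert the 0-based index to a 1-based position
    start = hit + 1      # continue right after the start of this hit
  end
  positions
end

p restriction_site_positions('AAGAATTCTTGAATTC', 'GAATTC') # => [3, 11]
```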
1657
1695
 
1658
1696
 
1659
- -------------------------------------------------------------------------------
1697
+ ------------------------------------------------------------------------------------------
1660
1698
 
1661
1699
  PMID entries are:
1662
1700
 
@@ -1728,7 +1766,7 @@ We found these restriction sites within the sequence `TTCAGAACTCAACGCCTGGTTGGCCG
1728
1766
 
1729
1767
 
1730
1768
 
1731
- ..........................................................................
1769
+ ------------------------------------------------------------------------------------------
1732
1770
  During the database search, the measured masses are compared with the peptide masses
1733
1771
  of all proteins or genes in a database (NCBI, Uniprot). For this, DNA
1734
1772
  sequences are translated into protein sequences and cleaved in silico with the
@@ -1738,7 +1776,7 @@ protease used for the digest.
1738
1776
 
1739
1777
 
1740
1778
 
1741
- ..........................................................................
1779
+ ------------------------------------------------------------------------------------------
1742
1780
  Complexity of libraries:
1743
1781
  How many independent clones are necessary to represent a genome (plant,
1744
1782
  animal/fungus) or how many such clones have to be screened to have realistic
@@ -1773,7 +1811,7 @@ have to be hybridized.
1773
1811
 
1774
1812
 
1775
1813
 
1776
- ..........................................................................
1814
+ ------------------------------------------------------------------------------------------
1777
1815
 
1778
1816
  BIO SHELL> BglI?
1779
1817
 
@@ -1818,12 +1856,12 @@ List all enzymes that produce compatible ends for the enzyme.
1818
1856
  http://biopython.org/DIST/docs/api/Bio.Restriction.Restriction.Blunt-class.html
1819
1857
 
1820
1858
 
1821
- ..........................................................................
1859
+ ------------------------------------------------------------------------------------------
1822
1860
  https://www.reddit.com/r/bioinformatics/comments/5o3kn8/bioinformatics_contest_2017_jan_23rd29th_solve_as/
1823
- ..........................................................................
1861
+ ------------------------------------------------------------------------------------------
1824
1862
  (1) → Finish all of biophp integration into bioroebe.
1825
1863
  http://www.biophp.org/
1826
- -------------------------------------------------------------------------------
1864
+ ------------------------------------------------------------------------------------------
1827
1865
 
1828
1866
  locate oriC here:
1829
1867
 
@@ -1858,13 +1896,13 @@ But I do not know how to locate ORIs.
1858
1896
 
1859
1897
 
1860
1898
 
1861
- -------------------------------------------------------------------------------
1899
+ ------------------------------------------------------------------------------------------
1862
1900
  ^^^ also integrate git into bioroebe.
1863
- -------------------------------------------------------------------------------
1901
+ ------------------------------------------------------------------------------------------
1864
1902
  WE NEED TO IMPROVE THIS A LOT.
1865
1903
 
1866
1904
  THEN UPLOAD IT AND USE IT AS A BASIS FOR APPLICATIONS.
1867
- -------------------------------------------------------------------------------
1905
+ ------------------------------------------------------------------------------------------
1868
1906
 
1869
1907
  Study MetaCyc
1870
1908
  ^^^ study metabolic pathways.
@@ -1873,7 +1911,7 @@ http://metacyc.org/
1873
1911
 
1874
1912
  → Create KuroMetaCyc, in Analogy towards Metabolic Cycle.
1875
1913
 
1876
- -------------------------------------------------------------------------------
1914
+ ------------------------------------------------------------------------------------------
1877
1915
 
1878
1916
  Welcome to BioShell May 2012. Type "help" to get some help.
1879
1917
 
@@ -1895,7 +1933,7 @@ When we type this, we then ask:
1895
1933
 
1896
1934
 
1897
1935
 
1898
- -------------------------------------------------------------------------------
1936
+ ------------------------------------------------------------------------------------------
1899
1937
 
1900
1938
  http://biopython.org/DIST/docs/cookbook/Restriction.html#mozTocId101269
1901
1939
 
@@ -1985,16 +2023,16 @@ ausreichend.
1985
2023
 
1986
2024
 
1987
2025
 
1988
- ..........................................................................
2026
+ ------------------------------------------------------------------------------------------
1989
2027
  BioTodo - GENESIS, science fiction.
1990
2028
 
1991
2029
  - create virus(:which_one, :amount) # Note the difference to the below
1992
2030
  - create hydra(:amount)
1993
2031
  - create bread
1994
- ..........................................................................
2032
+ ------------------------------------------------------------------------------------------
1995
2033
  → both
1996
2034
  ^ should work, does not work right now.
1997
- ..........................................................................
2035
+ ------------------------------------------------------------------------------------------
1998
2036
  → Taxonomy is now integrated into bioroebe. This is good but we need more
1999
2037
  documentation, some more tests, a rethinking of the layout and the
2000
2038
  structures, and a fixing of the query-part of the database.
@@ -2008,13 +2046,13 @@ ausreichend.
2008
2046
  at about the same time \o/
2009
2047
  AND document these related problems too
2010
2048
  Integrate this some other day...
2011
- ..........................................................................
2049
+ ------------------------------------------------------------------------------------------
2012
2050
  - http://www.restrictionmapper.org/cgi-bin/sitefind3.pl
2013
2051
 
2014
2052
  ^^^ This functionality should be integrated, so that
2015
2053
  ALL restriction enzymes are tried out, starting from
2016
2054
  a given sequence.
2017
- ..........................................................................
2055
+ ------------------------------------------------------------------------------------------
2018
2056
  → A search is essentially substring search across a database of strings
2019
2057
  (albeit with a smaller alphabet). Some common use cases: one,
2020
2058
  scientists will search for certain genes that they've used in engineered
@@ -2033,13 +2071,13 @@ ausreichend.
2033
2071
  Bioroebe::DetermineOptimalCodons
2034
2072
  ^^^ this is currently incomplete.
2035
2073
 
2036
- ..........................................................................
2074
+ ------------------------------------------------------------------------------------------
2037
2075
  → Redo restriction enzymes completely.
2038
2076
  And polish this a LOT.
2039
2077
  This may take some days. But we want this to be REALLY good and
2040
2078
  lasting for a long time.
2041
2079
  Need to keep on working at that!
2042
- ..........................................................................
2080
+ ------------------------------------------------------------------------------------------
2043
2081
  → Add: average_aminoacid_weight?
2044
2082
 
2045
2083
 
@@ -2077,7 +2115,7 @@ end
2077
2115
  → We must be able to align not only nucleotides but also aminoacids.
2078
2116
  But where is the alignment comparer? perhaps hamming distance?
2079
2117
  hmm we have to see.
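If Hamming distance is the route taken, a sketch could look like the following. It compares character by character, so the same code works for nucleotide and aminoacid strings of equal length. `hamming_distance` here is a hypothetical standalone method, not the bioroebe implementation, and it does not perform a real alignment (no gaps).

```ruby
# Hypothetical standalone method: counts the positions at which two
# equally long sequences (nucleotides or aminoacids) differ.
def hamming_distance(a, b)
  raise ArgumentError, 'sequences must have equal length' unless a.length == b.length
  a.each_char.zip(b.each_char).count { |x, y| x != y }
end

p hamming_distance('GATTACA', 'GACTATA') # => 2
p hamming_distance('MKV', 'MRV')         # => 1
```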
2080
- ..........................................................................
2118
+ ------------------------------------------------------------------------------------------
2081
2119
  → /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/menu.rb:311:in `menu': undefined method `upcase' for ["EcoRI"]:Array (NoMethodError)
2082
2120
  from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/user_input.rb:31:in `block in enter_main_loop'
2083
2121
  from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/user_input.rb:12:in `loop'
@@ -2106,12 +2144,12 @@ end
2106
2144
  at this date.'
2107
2145
  SendEmail.new to: Roebe.email?, data
2108
2146
 
2109
- ..........................................................................
2147
+ ------------------------------------------------------------------------------------------
2110
2148
 
2111
2149
 
2112
2150
  → Document which parts of emboss have already been copied.
2113
2151
  → EMBOSS.md
2114
- ..........................................................................
2152
+ ------------------------------------------------------------------------------------------
2115
2153
 
2116
2154
 
2117
2155
 
@@ -2168,7 +2206,7 @@ Traceback (most recent call last):
2168
2206
 
2169
2207
  http://www.snapgene.com/products/snapgene_viewer/
2170
2208
 
2171
- -------------------------------------------------------------------------------
2209
+ ------------------------------------------------------------------------------------------
2172
2210
  (1) → We should support GFP tagging, i.e. what the
2173
2211
  protein construct should look like and so on.
2174
2212
  This partially works...
@@ -2177,22 +2215,22 @@ Traceback (most recent call last):
2177
2215
  inserts the sequence as the main DNA sequence.
2178
2216
  What is missing? Hmmmm... possibly some more
2179
2217
  documentation.
2180
- -------------------------------------------------------------------------------
2218
+ ------------------------------------------------------------------------------------------
2181
2219
 
2182
2220
  - in bioroebe, create subsequences for siRNA, then scan for
2183
2221
  submatcher + report where these are. Should be fast too.
2184
- -------------------------------------------------------------------------------
2222
+ ------------------------------------------------------------------------------------------
2185
2223
  - Reverse complement now works quite well, also via the sinatra
2186
2224
  interface. We still should have a way to show 5' and
2187
2225
  3', both on the commandline, and via sinatra.
2188
2226
  Perhaps via --fancy commandline flag or so.
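A sketch of what such a "fancy" display might print. Both method names are hypothetical, the `--fancy` flag is only the idea from the note above, and characters other than A, T, G and C are left untouched.

```ruby
# Hypothetical helper: reverse complement of a DNA string.
# Characters other than A, T, G, C are passed through unchanged.
def reverse_complement(dna)
  dna.upcase.reverse.tr('ATGC', 'TACG')
end

# Hypothetical helper: adds 5' and 3' markers around a strand.
def with_strand_ends(dna)
  "5'-#{dna}-3'"
end

puts with_strand_ends(reverse_complement('ATGC')) # prints 5'-GCAT-3'
```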
2189
- -------------------------------------------------------------------------------
2227
+ ------------------------------------------------------------------------------------------
2190
2228
  - Cn3D files?
2191
2229
  ^^^ add support for these; research what they are, too.
2192
- -------------------------------------------------------------------------------
2230
+ ------------------------------------------------------------------------------------------
2193
2231
  - Consider adding graphviz, perhaps to the taxonomy project
2194
2232
  where we make graphs towards different nodes or so...
2195
- -------------------------------------------------------------------------------
2233
+ ------------------------------------------------------------------------------------------
2196
2234
  - in parse fasta
2197
2235
  @colourize_sequence = false
2198
2236
  ^^^ change this later on...
@@ -2200,7 +2238,7 @@ Traceback (most recent call last):
2200
2238
  this method now exists, but we still have to make
2201
2239
  the check better whether it is a protein or a DNA/RNA
2202
2240
  add a toplevel method for this.
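A rough sketch of such a toplevel check, based only on the alphabet of the sequence. `sequence_type` is a hypothetical name. The heuristic is deliberately simple: a peptide made up solely of A, C, G and T (Ala, Cys, Gly, Thr) would be misclassified as DNA.

```ruby
# Hypothetical toplevel helper: guesses the type of a sequence from the
# letters it contains. Returns :dna, :rna, :protein or :unknown.
def sequence_type(sequence)
  letters = sequence.upcase.gsub(/[^A-Z]/, '') # drop whitespace, digits, gaps
  return :unknown if letters.empty?
  return :dna     if letters.match?(/\A[ACGTN]+\z/)
  return :rna     if letters.match?(/\A[ACGUN]+\z/)
  :protein
end

p sequence_type('ATGCCG') # => :dna
p sequence_type('AUGCCG') # => :rna
p sequence_type('MKVLAA') # => :protein
```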
2203
- -------------------------------------------------------------------------------
2241
+ ------------------------------------------------------------------------------------------
2204
2242
  - clone the BLAST ident matcher functionality for aminoacids into
2205
2243
  Bioroebe.
2206
2244
 
@@ -2215,7 +2253,7 @@ Traceback (most recent call last):
2215
2253
 
2216
2254
 
2217
2255
 
2218
- -------------------------------------------------------------------------------
2256
+ ------------------------------------------------------------------------------------------
2219
2257
  - Be able to mark exon/intron boundaries.
2220
2258
 
2221
2259
  - Add "taxid?" to tell us the name of the organism. This works now.
@@ -2259,9 +2297,9 @@ Traceback (most recent call last):
2259
2297
 
2260
2298
  ^^^
2261
2299
  study sumoplot ...
2262
- -------------------------------------------------------------------------------
2300
+ ------------------------------------------------------------------------------------------
2263
2301
  - http://a-little-book-of-r-for-bioinformatics.readthedocs.io/en/latest/src/chapter7.html
2264
- -------------------------------------------------------------------------------
2302
+ ------------------------------------------------------------------------------------------
2265
2303
  - http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc22
2266
2304
  ^^^ continue here; "You can also specify the table using the
2267
2305
  NCBI table number which is shorter, and often included in
@@ -2269,7 +2307,7 @@ Traceback (most recent call last):
2269
2307
 
2270
2308
  ^^^ work through this and see if it is good.
2271
2309
 
2272
- -------------------------------------------------------------------------------
2310
+ ------------------------------------------------------------------------------------------
2273
2311
 
2274
2312
  - Clone ALL of biophp, if it is useful.
2275
2313
 
@@ -2316,7 +2354,7 @@ Palindromic sequences finder
2316
2354
  We should also put this part into doc/ subsection
2317
2355
  to keep track of what is missing and what is not.
2318
2356
 
2319
- -------------------------------------------------------------------------------
2357
+ ------------------------------------------------------------------------------------------
2320
2358
  (1) → sizeseq
2321
2359
 
2322
2360
  ^^^ clone this functionality and describe it in detail.
@@ -2353,7 +2391,7 @@ foobar.fasta
2353
2391
 
2354
2392
  ALSO ADD A GUI; sizeseq.rb was added in February 2021.
2355
2393
 
2356
- -------------------------------------------------------------------------------
2394
+ ------------------------------------------------------------------------------------------
2357
2395
  - In the sinatra-web-interface for Bioroebe:
2358
2396
  continue quiz in rosalind !!!
2359
2397
  also, at to_dna: default to RNA
@@ -2372,8 +2410,8 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
2372
2410
  → formatted_view
2373
2411
  111^^^^ in ncbi format
2374
2412
  and document all of this.
2375
- ..........................................................................
2376
- -------------------------------------------------------------------------------
2413
+ ------------------------------------------------------------------------------------------
2414
+ ------------------------------------------------------------------------------------------
2377
2415
  - Add a ruby-GUI stuff, probably the old biology/ subsection
2378
2416
  will be moved into the project.
2379
2417
 
@@ -2470,7 +2508,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
2470
2508
 
2471
2509
 
2472
2510
 
2473
- -------------------------------------------------------------------------------
2511
+ ------------------------------------------------------------------------------------------
2474
2512
  - Identifying amino acid cleavage sites (Sigcleave)
2475
2513
 
2476
2514
  For amino acid sequences we may be interested to know whether
@@ -2533,29 +2571,22 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
2533
2571
  ^^^ there is a bug here. Try again later.
2534
2572
 
2535
2573
 
2536
- - We will read from NM_001180897.3_Saccharomyces_cerevisiae_S288c_Aga2p_AGA2.fasta
2537
-
2538
- The file NM_001180897.3_Saccharomyces_cerevisiae_S288c_Aga2p_AGA2.fasta has this FASTA header:
2539
-
2540
- >gi|398364826|ref|NM_001180897.3| Saccharomyces cerevisiae S288c Aga2p (AGA2), mRNA
2541
2574
 
2542
- ^^^ this should also (optionally) tell us the organism, via a switch.
2543
- for this we need some way to return the taxonomic ID of an organism
2544
2575
 
2545
2576
  - we have to add expasy...
2546
2577
  functionality to the cmdline too.
2547
2578
  Which one specifically? Let's see...
2548
2579
 
2549
2580
  https://www.expasy.org/
2550
- -------------------------------------------------------------------------------
2581
+ ------------------------------------------------------------------------------------------
2551
2582
  - https://biopython.org/wiki/Category%3ACookbook
2552
2583
  ^^^ clone that
2553
- -------------------------------------------------------------------------------
2584
+ ------------------------------------------------------------------------------------------
2554
2585
  - include covid genome, and begin to analyse it in bioroebe
2555
2586
  "The genome of SARS-CoV-2 is said to be twice as large as that
2556
2587
  of influenza viruses, which is why the latter appear to mutate four times
2557
2588
  as fast, wrote Moshiri."
2558
- -------------------------------------------------------------------------------
2589
+ ------------------------------------------------------------------------------------------
2559
2590
  - Look at the GUIs that are part of the BioRoebe project.
2560
2591
 
2561
2592
  Polish these parts, at least one widget, then
@@ -2570,7 +2601,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
2570
2601
 
2571
2602
  Hmmm. And then, also consider transitioning into gtk3,
2572
2603
  and make more screenshots.
2573
- -------------------------------------------------------------------------------
2604
+ ------------------------------------------------------------------------------------------
2574
2605
 
2575
2606
  - https://www.ebi.ac.uk/Tools/seqstats/emboss_pepstats/
2576
2607
  http://www.ebi.ac.uk/Tools/services/web/toolresult.ebi?jobId=emboss_pepstats-I20160208-020243-0564-53154194-oy
@@ -2582,7 +2613,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
2582
2613
  - Improve on temperature content and how it is calculated
2583
2614
 
2584
2615
  someone googled for it in 2014 so build on it
2585
- -------------------------------------------------------------------------------
2616
+ ------------------------------------------------------------------------------------------
2586
2617
  - pfasta /Depot/Temp/bioroebe/NM_000539.3_Homo_sapiens_rhodopsin_RHO.fasta
2587
2618
 
2588
2619
  Will read from the file `/Depot/Temp/bioroebe/NM_000539.3_Homo_sapiens_rhodopsin_RHO.fasta`.
@@ -2593,7 +2624,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
2593
2624
  Now assigning aminoacid sequence to:
2594
2625
  AGAGTCATCCAGCTGGAGCCCTGAGTGGCTGAGCTCAGGCCTTCGCAG
2595
2626
  AGAGTCATCCAGCTGGAGCCCTGAGTGGCTGAGCTCAGGCCTTCGCAG
2596
- -------------------------------------------------------------------------------
2627
+ ------------------------------------------------------------------------------------------
2597
2628
 
2598
2629
 
2599
2630
  - Formats
@@ -2647,7 +2678,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
2647
2678
  tinyseq NCBI TinySeq XML
2648
2679
  ztr ZTR tracefile ztr
2649
2680
 
2650
- ..........................................................................
2681
+ ------------------------------------------------------------------------------------------
2651
2682
  (1) Look at f1 display:
2652
2683
 
2653
2684
 
@@ -2670,7 +2701,7 @@ we probably have to rewrite the whole thing
2670
2701
  BEFORE we add ANY COLOURS.
2671
2702
  OH WELL.
2672
2703
 
2673
- -------------------------------------------------------------------------------
2704
+ ------------------------------------------------------------------------------------------
2674
2705
  (100) → Add a primer-design widget
2675
2706
 
2676
2707
  The idea is to be able to manipulate forward and
@@ -2684,7 +2715,7 @@ perfect but it is a start.
2684
2715
  https://www.bioinformatics.nl/molbi/SCLResources/sequence_notation.htm
2685
2716
  ^^^ and check what is useful there. perhaps also add
2686
2717
  nicer visual cues to pretty it up a bit.
2687
- -------------------------------------------------------------------------------
2718
+ ------------------------------------------------------------------------------------------
2688
2719
  (1) → Compare bioroebe to:
2689
2720
 
2690
2721
  https://www.ncbi.nlm.nih.gov/orffinder
@@ -2694,18 +2725,18 @@ whether both return the same also possibly add a web-gui
2694
2725
  check this... so that we can search in standard ORF
2695
2726
  but also in different ORFs
2696
2727
  and report the length, at least of the longest ORF start + stop... so that the result is actually correct
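A sketch for reporting start, stop and length of the longest ORF, so that the result can be compared with other tools. Assumptions: only the three forward reading frames are scanned, an ORF runs from ATG to the first in-frame stop codon (TAA, TAG, TGA), positions are 1-based, and the length includes the stop codon. `longest_orf` is a hypothetical name and not the bioroebe implementation.

```ruby
STOP_CODONS = %w[TAA TAG TGA].freeze

# Hypothetical helper: scans the three forward frames and returns a Hash
# with :start, :stop (both 1-based) and :length of the longest ORF,
# or nil if no ORF with both a start and a stop codon was found.
def longest_orf(dna)
  dna  = dna.upcase
  best = nil
  (0..2).each do |frame|
    start = nil
    (frame..dna.length - 3).step(3) do |i|
      codon = dna[i, 3]
      if start.nil?
        start = i if codon == 'ATG'
      elsif STOP_CODONS.include?(codon)
        length = i + 3 - start
        if best.nil? || length > best[:length]
          best = { start: start + 1, stop: i + 3, length: length }
        end
        start = nil # look for the next ORF in this frame
      end
    end
  end
  best
end

p longest_orf('CCATGAAATAGGG') # ATG at 3..5, TAG at 9..11
```

The reverse strand could be covered by running the same method on the reverse complement and mapping the positions back.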
2697
- ...........................................................................
2728
+ ------------------------------------------------------------------------------------------
2698
2729
  test reverse complement in bioroebe
2699
2730
  ^^^
2700
2731
  new_WWW/
2701
2732
  ^^^ this should eventually become the new web-related interface.
2702
2733
  Ah well. Perhaps not ... ruby-cgi is soooooo annoying ...
2703
- ...........................................................................
2734
+ ------------------------------------------------------------------------------------------
2704
2735
  (154) → the blosum-viewer should be supported in the cgi part
2705
2736
  and sinatra part as well.
2706
2737
  This now works for sinatra. Need to enable this for
2707
2738
  the cgi-part too eventually.
2708
- -------------------------------------------------------------------------------
2739
+ ------------------------------------------------------------------------------------------
2709
2740
  (155) → port the sinatra stuff together in bioroebe
2710
2741
  create a dir: web_api
2711
2742
  ^^^ also make params? usable in both sinatra and cgi page
@@ -2716,18 +2747,18 @@ Ah well. Perhaps not ... ruby-cgi is soooooo annoying ...
2716
2747
  add tons of HtmlTemplate[]
2717
2748
  and replace the ad-hoc code otherwise...
2718
2749
  ^^^ yeah, finish the HtmlTemplate stuff.
2719
- -------------------------------------------------------------------------------
2750
+ ------------------------------------------------------------------------------------------
2720
2751
  (1) → https://i.imgur.com/ptcSn12.png
2721
2752
  ^^^ enable such an overview; this shows mass computation, e.g.
2722
2753
  peptide mass and such
2723
- -------------------------------------------------------------------------------
2754
+ ------------------------------------------------------------------------------------------
2724
2755
  (80) Bioroebe.sanitize_nucleotide_sequence
2725
2756
  ^^^ port this into java. The code has been written for this already,
2726
2757
  but we currently fail to link it.
2727
- -------------------------------------------------------------------------------
2758
+ ------------------------------------------------------------------------------------------
2728
2759
  (81) Bioroebe.base_composition
2729
2760
  ^^^^^^^^^ port this into java
2730
- -------------------------------------------------------------------------------
2761
+ ------------------------------------------------------------------------------------------
2731
2762
  (82) - work a bit more on tk!!!
2732
2763
  in particular to start it from the bioshell as-is.
2733
2764
  ^^^ this is mostly done for quick
@@ -2740,20 +2771,20 @@ Ah well. Perhaps not ... ruby-cgi is soooooo annoying ...
2740
2771
  hamming_distance [PARTIALLY IMPLEMENTED; ~80%]
2741
2772
  protein_to_DNA
2742
2773
  ^^^^ improve both while improving tk_paradise docu as well.
2743
- -------------------------------------------------------------------------------
2774
+ ------------------------------------------------------------------------------------------
2744
2775
  (83) Batch-create the .exe files on windows for libui, once
2745
2776
  the first has been added. And then test it too
2746
2777
  AND document it. This should be done with the controller
2747
2778
  eventually. Once this works, we can remove this entry
2748
2779
  here.
2749
- -------------------------------------------------------------------------------
2780
+ ------------------------------------------------------------------------------------------
2750
2781
  (84) port more libui stuff in bioroebe. We have two widgets ported so far;
2751
2782
  add more such entries.
2752
- -------------------------------------------------------------------------------
2783
+ ------------------------------------------------------------------------------------------
2753
2784
  (85) after libui has been ported, explore how gosu works on windows.
2754
2785
  if possible add things to a gosu-specific UI as well, but
2755
2786
  we may need a common, unified GUI base for that.
2756
- -------------------------------------------------------------------------------
2787
+ ------------------------------------------------------------------------------------------
2757
2788
  (86)
2758
2789
 
2759
2790
  add libui bindings AND once done make sure the controller works in
@@ -2762,22 +2793,22 @@ libui as well. Embed the various things into it.
2762
2793
  Tab A set named tabs for placing items in
2763
2794
  ^^^ use this perhaps also in bioroebe hmmm
2764
2795
  yeah.
2765
- -------------------------------------------------------------------------------
2796
+ ------------------------------------------------------------------------------------------
2766
2797
  (87) https://github.com/cnjinhao/nana/wiki/User-Works-using-Nana
2767
2798
 
2768
2799
  ^^^ port the "DNA hybrid"
2769
2800
  https://camo.githubusercontent.com/4c27d554ca4d698d288628f21255f917c2c577e35d7e11dd67e21880d56b6b0a/687474703a2f2f6e616e6170726f2e6f72672f696d616765732f73637265656e73686f74732f746864795f7365715f6578706c2e706e67
2770
2801
 
2771
- -------------------------------------------------------------------------------
2802
+ ------------------------------------------------------------------------------------------
2772
2803
  (88) Bioroebe::Cell
2773
2804
  ^^^ think about what to do with it. If we don't need it then perhaps
2774
2805
  we should just remove it. Think about this more in 2022, before
2775
2806
  deciding what to do.
2776
- -------------------------------------------------------------------------------
2807
+ ------------------------------------------------------------------------------------------
2777
2808
  (89) - Add emboss cpgplot functionality.
2778
2809
 
2779
2810
  https://www.bioinformatics.nl/cgi-bin/emboss/cpgplot
2780
- -------------------------------------------------------------------------------
2811
+ ------------------------------------------------------------------------------------------
2781
2812
  (90) - integrate calculation of the Instability index (II)
2782
2813
 
2783
2814
  The instability index provides an estimate of the
@@ -2815,9 +2846,9 @@ that the protein may be unstable.
2815
2846
  The instability index (II) is computed to be 65.43
2816
2847
  This classifies the protein as unstable.
2817
2848
 
2818
- -------------------------------------------------------------------------------
2849
+ ------------------------------------------------------------------------------------------
2819
2850
  (1) → We have now added a method to show all hydrophobic amino acids, via the
2820
2851
  method .hydrophobic_amino_acids?. This works and has been documented
2821
2852
  in May 2022. However, we still need a way to PREDICT
2822
2853
  hydrophobic segments in a polypeptide sequence.
2823
- -------------------------------------------------------------------------------
2854
+ ------------------------------------------------------------------------------------------