bioroebe 0.10.80 → 0.11.24


Potentially problematic release.



Files changed (129)
  1. checksums.yaml +4 -4
  2. data/README.md +1204 -772
  3. data/bioroebe.gemspec +3 -3
  4. data/doc/README.gen +1203 -771
  5. data/doc/todo/bioroebe_todo.md +391 -365
  6. data/lib/bioroebe/aminoacids/aminoacid_substitution.rb +1 -9
  7. data/lib/bioroebe/aminoacids/codon_percentage.rb +1 -9
  8. data/lib/bioroebe/aminoacids/deduce_aminoacid_sequence.rb +1 -9
  9. data/lib/bioroebe/aminoacids/display_aminoacid_table.rb +1 -0
  10. data/lib/bioroebe/aminoacids/show_hydrophobicity.rb +1 -6
  11. data/lib/bioroebe/base/colours_for_base/colours_for_base.rb +18 -8
  12. data/lib/bioroebe/base/commandline_application/commandline_arguments.rb +13 -11
  13. data/lib/bioroebe/base/commandline_application/misc.rb +18 -8
  14. data/lib/bioroebe/base/misc.rb +16 -0
  15. data/lib/bioroebe/base/prototype/misc.rb +1 -1
  16. data/lib/bioroebe/codons/show_codon_tables.rb +6 -2
  17. data/lib/bioroebe/codons/show_codon_usage.rb +2 -1
  18. data/lib/bioroebe/constants/aminoacids_and_proteins.rb +1 -0
  19. data/lib/bioroebe/constants/database_constants.rb +1 -1
  20. data/lib/bioroebe/constants/files_and_directories.rb +20 -1
  21. data/lib/bioroebe/constants/misc.rb +20 -0
  22. data/lib/bioroebe/count/count_amount_of_nucleotides.rb +3 -0
  23. data/lib/bioroebe/crystal/README.md +2 -0
  24. data/lib/bioroebe/crystal/to_rna.cr +19 -0
  25. data/lib/bioroebe/data/README.md +11 -8
  26. data/lib/bioroebe/data/electron_microscopy/pos_example.pos +396 -0
  27. data/lib/bioroebe/data/electron_microscopy/test_particles.star +36 -0
  28. data/lib/bioroebe/{shell/tk.rb → electron_microscopy/electron_microscopy_module.rb} +15 -10
  29. data/lib/bioroebe/electron_microscopy/simple_star_file_generator.rb +4 -9
  30. data/lib/bioroebe/fasta_and_fastq/show_fasta_headers.rb +27 -12
  31. data/lib/bioroebe/genome/README.md +4 -0
  32. data/lib/bioroebe/genome/genome.rb +67 -0
  33. data/lib/bioroebe/gui/gtk3/protein_to_DNA/protein_to_DNA.rb +18 -18
  34. data/lib/bioroebe/gui/gtk3/random_sequence/random_sequence.rb +19 -11
  35. data/lib/bioroebe/gui/shared_code/protein_to_DNA/protein_to_DNA_module.rb +14 -14
  36. data/lib/bioroebe/misc/ruler.rb +1 -0
  37. data/lib/bioroebe/parsers/genbank_parser.rb +353 -24
  38. data/lib/bioroebe/parsers/gff.rb +1 -9
  39. data/lib/bioroebe/pdb/parse_pdb_file.rb +1 -9
  40. data/lib/bioroebe/project/project.rb +1 -1
  41. data/lib/bioroebe/python/README.md +1 -0
  42. data/lib/bioroebe/python/__pycache__/mymodule.cpython-39.pyc +0 -0
  43. data/lib/bioroebe/python/gui/gtk3/all_in_one.css +4 -0
  44. data/lib/bioroebe/python/gui/gtk3/all_in_one.py +59 -0
  45. data/lib/bioroebe/python/gui/gtk3/widget1.py +20 -0
  46. data/lib/bioroebe/python/gui/tkinter/all_in_one.py +91 -0
  47. data/lib/bioroebe/python/mymodule.py +8 -0
  48. data/lib/bioroebe/python/protein_to_dna.py +33 -0
  49. data/lib/bioroebe/python/shell/shell.py +19 -0
  50. data/lib/bioroebe/python/to_rna.py +14 -0
  51. data/lib/bioroebe/python/toplevel_methods/open_in_browser.py +20 -0
  52. data/lib/bioroebe/python/toplevel_methods/palindromes.py +42 -0
  53. data/lib/bioroebe/python/toplevel_methods/rds.py +13 -0
  54. data/lib/bioroebe/python/toplevel_methods/three_delimiter.py +34 -0
  55. data/lib/bioroebe/python/toplevel_methods/time_and_date.py +43 -0
  56. data/lib/bioroebe/python/toplevel_methods/to_camelcase.py +11 -0
  57. data/lib/bioroebe/requires/require_the_bioroebe_project.rb +3 -1
  58. data/lib/bioroebe/sequence/nucleotide_module/nucleotide_module.rb +28 -25
  59. data/lib/bioroebe/sequence/protein.rb +105 -3
  60. data/lib/bioroebe/sequence/sequence.rb +61 -2
  61. data/lib/bioroebe/shell/menu.rb +3451 -3366
  62. data/lib/bioroebe/shell/misc.rb +51 -4311
  63. data/lib/bioroebe/shell/readline/readline.rb +1 -1
  64. data/lib/bioroebe/shell/shell.rb +11192 -28
  65. data/lib/bioroebe/siRNA/siRNA.rb +81 -1
  66. data/lib/bioroebe/string_matching/find_longest_substring.rb +3 -2
  67. data/lib/bioroebe/taxonomy/class_methods.rb +3 -8
  68. data/lib/bioroebe/taxonomy/constants.rb +4 -3
  69. data/lib/bioroebe/taxonomy/edit.rb +2 -1
  70. data/lib/bioroebe/taxonomy/help/help.rb +10 -10
  71. data/lib/bioroebe/taxonomy/info/check_available.rb +15 -9
  72. data/lib/bioroebe/taxonomy/info/info.rb +17 -2
  73. data/lib/bioroebe/taxonomy/info/is_dna.rb +46 -36
  74. data/lib/bioroebe/taxonomy/interactive.rb +139 -95
  75. data/lib/bioroebe/taxonomy/menu.rb +27 -18
  76. data/lib/bioroebe/taxonomy/parse_fasta.rb +3 -1
  77. data/lib/bioroebe/taxonomy/shared.rb +1 -0
  78. data/lib/bioroebe/taxonomy/taxonomy.rb +1 -0
  79. data/lib/bioroebe/toplevel_methods/aminoacids_and_proteins.rb +31 -24
  80. data/lib/bioroebe/toplevel_methods/databases.rb +1 -1
  81. data/lib/bioroebe/toplevel_methods/fasta_and_fastq.rb +101 -63
  82. data/lib/bioroebe/toplevel_methods/misc.rb +17 -16
  83. data/lib/bioroebe/toplevel_methods/nucleotides.rb +22 -5
  84. data/lib/bioroebe/toplevel_methods/open_in_browser.rb +2 -0
  85. data/lib/bioroebe/toplevel_methods/palindromes.rb +1 -2
  86. data/lib/bioroebe/toplevel_methods/taxonomy.rb +2 -2
  87. data/lib/bioroebe/toplevel_methods/to_camelcase.rb +5 -0
  88. data/lib/bioroebe/utility_scripts/align_open_reading_frames.rb +1 -9
  89. data/lib/bioroebe/utility_scripts/check_for_mismatches/check_for_mismatches.rb +1 -9
  90. data/lib/bioroebe/utility_scripts/compacter.rb +1 -9
  91. data/lib/bioroebe/utility_scripts/compseq/compseq.rb +1 -9
  92. data/lib/bioroebe/utility_scripts/create_batch_entrez_file.rb +1 -9
  93. data/lib/bioroebe/utility_scripts/dot_alignment.rb +1 -9
  94. data/lib/bioroebe/utility_scripts/move_file_to_its_correct_location.rb +1 -4
  95. data/lib/bioroebe/utility_scripts/showorf/constants.rb +0 -5
  96. data/lib/bioroebe/utility_scripts/showorf/reset.rb +1 -4
  97. data/lib/bioroebe/version/version.rb +2 -2
  98. data/lib/bioroebe/www/embeddable_interface.rb +101 -52
  99. data/lib/bioroebe/www/sinatra/sinatra.rb +186 -70
  100. data/lib/bioroebe/yaml/aminoacids/amino_acids_long_name_to_one_letter.yml +2 -2
  101. data/lib/bioroebe/yaml/configuration/browser.yml +1 -1
  102. data/lib/bioroebe/yaml/genomes/README.md +3 -4
  103. data/lib/bioroebe/yaml/restriction_enzymes/restriction_enzymes.yml +3 -3
  104. metadata +32 -35
  105. data/doc/setup.rb +0 -1655
  106. data/lib/bioroebe/genbank/genbank_parser.rb +0 -291
  107. data/lib/bioroebe/shell/add.rb +0 -108
  108. data/lib/bioroebe/shell/assign.rb +0 -360
  109. data/lib/bioroebe/shell/chop_and_cut.rb +0 -281
  110. data/lib/bioroebe/shell/constants.rb +0 -166
  111. data/lib/bioroebe/shell/download.rb +0 -335
  112. data/lib/bioroebe/shell/enable_and_disable.rb +0 -158
  113. data/lib/bioroebe/shell/enzymes.rb +0 -310
  114. data/lib/bioroebe/shell/fasta.rb +0 -345
  115. data/lib/bioroebe/shell/gtk.rb +0 -76
  116. data/lib/bioroebe/shell/history.rb +0 -132
  117. data/lib/bioroebe/shell/initialize.rb +0 -217
  118. data/lib/bioroebe/shell/loop.rb +0 -74
  119. data/lib/bioroebe/shell/prompt.rb +0 -107
  120. data/lib/bioroebe/shell/random.rb +0 -289
  121. data/lib/bioroebe/shell/reset.rb +0 -335
  122. data/lib/bioroebe/shell/scan_and_parse.rb +0 -135
  123. data/lib/bioroebe/shell/search.rb +0 -337
  124. data/lib/bioroebe/shell/sequences.rb +0 -200
  125. data/lib/bioroebe/shell/show_report_and_display.rb +0 -2901
  126. data/lib/bioroebe/shell/startup.rb +0 -127
  127. data/lib/bioroebe/shell/taxonomy.rb +0 -14
  128. data/lib/bioroebe/shell/user_input.rb +0 -88
  129. data/lib/bioroebe/shell/xorg.rb +0 -45
@@ -1,16 +1,124 @@
1
- -------------------------------------------------------------------------------
2
- (1) → https://biopython.org/DIST/docs/tutorial/Tutorial.html#sec15
1
+ ------------------------------------------------------------------------------------------
2
+ (1) → set_dna_sequence alu
3
+
4
+ ^^^ fetch random alu
5
+
6
+ ^^^ alu sequence
7
+ Ok we started this now adding more details, but we
8
+ need to become better at searching for this
9
+ sequence.
10
+ ------------------------------------------------------------------------------------------
11
+ (2) → draw things based on GR
12
+ ------------------------------------------------------------------------------------------
13
+ (3) → https://mycocosm.jgi.doe.gov/help/screenshots/browser_viewer.png
14
+ ^^^ offer the same functionality
15
+ ------------------------------------------------------------------------------------------
16
+ (4) → https://genome.cshlp.org/content/12/10/1611/F3.expansion.html
17
+
18
+ ^^^ enable this, we must obtain a sequence then store into genbank format
19
+ so, first fetch; then store as-is.
20
+ ------------------------------------------------------------------------------------------
21
+ (5) → be able to generate nice graphics
22
+
23
+ https://genome.cshlp.org/content/12/10/1611/F1.large.jpg
24
+ ------------------------------------------------------------------------------------------
25
+ (6) → add an rmagick wrapper, perhaps via imageparadise or something
26
+ the idea is that we can make fancy drawings and generate
27
+ an image for the end user to see
28
+ ------------------------------------------------------------------------------------------
29
+ (7) → https://bioperl.org/howtos/Beginners_HOWTO.html#item13
30
+ extend the sequence object and document it
31
+
32
+ also add:
33
+
34
+ class Genome
35
+ and:
36
+ def is_circular?
37
+ @internal_hash[:is_circular]
38
+ end; alias circular? is_circular? # === circular?
39
+ def species?
40
+ @internal_hash[:species] # return the species here
41
+ end
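A minimal, self-contained sketch of the `Genome` idea noted above. Only `is_circular?`, `circular?`, `species?` and `@internal_hash` come from the note; the constructor and its keyword arguments are assumptions made for illustration, and this is not the code shipped in `genome.rb`.

```ruby
# Hypothetical standalone sketch; not the Bioroebe implementation.
class Genome
  # sequence:    the nucleotide string of the genome
  # species:     free-form species name (assumed keyword argument)
  # is_circular: whether the genome is circular (assumed keyword argument)
  def initialize(sequence, species: nil, is_circular: false)
    @sequence = sequence
    @internal_hash = { species: species, is_circular: is_circular }
  end

  # Returns true if the genome was registered as circular.
  def is_circular?
    @internal_hash[:is_circular]
  end
  alias circular? is_circular?

  # Returns the species name, or nil if none was given.
  def species?
    @internal_hash[:species]
  end
end

plasmid = Genome.new('ATGC', species: 'Escherichia coli', is_circular: true)
puts plasmid.circular?
puts plasmid.species?
```

Keeping both flags in one hash mirrors the note; plain instance variables would work just as well.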
42
+ ------------------------------------------------------------------------------------------
43
+ (2) http://lib.ysu.am/open_books/312400.pdf
44
+
45
+ clone:
46
+ Primer.pl
47
+ This program was written to support the required informatics for a sequencing
48
+ lab. The desire was to quickly generate primer pair candidates for use in STS
49
+ mapping. We use Bioperl modules to fetch the sequences from GenBank.
50
+ #! /usr/bin/perl
51
+ #
52
+ # primers.pl
53
+ #
54
+ # Reads a list of
55
+
56
+ % primers.pl AC013798
57
+ AC013798
58
+ Left Right Length Penalty
59
+ CCTCCTGGACAACCTGTGTT TGAAGTCAGGGGACATAGGG 280 0.0823
60
+ CCTCCTGGACAACCTGTGTT AGGCCAGTAGACTGGGTGTG 298 0.1758
61
+ CCTCCTGGACAACCTGTGTT GGTGTGAAGTCAGGGGACAT 284 0.1852
62
+ TTCCCGCATCTCTTAGCAGT AGGCCAGTAGACTGGGTGTG 209 0.1962
63
+ CTTCCCGCATCTCTTAGCAG GACACTAGTGGCAAGGAGGC 226 0.2362
64
+ Most of the primers.pl program is extremely simple. The real guts and power
65
+ of the program lie in the classes and the methods we call. The next section
66
+ examines the Primer3 module, which is similar to many Bioperl modules
67
+
68
+
69
+ ------------------------------------------------------------------------------------------
70
+ (1) → Clone all of Emboss. :)
71
+
72
+ → Clone and document the getorf functionality properly.
73
+
74
+ See: http://emboss.sourceforge.net/apps/cvs/emboss/apps/getorf.html
75
+
76
+ http://emboss.sourceforge.net
77
+ http://emboss.sourceforge.net/apps/cvs/emboss/apps/getorf.html
78
+
79
+ ------------------------------------------------------------------------------------------
80
+ (3) → Add useful formulas for bioshell.
81
+ ------------------------------------------------------------------------------------------
82
+ (1) → Polish the GUI sets:
83
+
84
+ https://i.imgur.com/djElIMh.png
85
+
86
+ ------------------------------------------------------------------------------------------
87
+ (4) → The taxonomy part should be fully integrated, without it
88
+ being a standalone part anymore.
89
+ continue on the taxonomy stuff.
90
+ One day this will work again *shake fist*
91
+ ------------------------------------------------------------------------------------------
92
+ (1) → Show the frequency of codons in different tables
93
+
94
+ This works quite ok, but right now the approach is to store
95
+ this in a .yml file which is not ideal.
96
+
97
+ Thus, we have to add two things:
98
+ - The ability to store this into a SQL database
99
+ - The ability to batch-download all of these codons,
100
+ which first requires that we have a way to obtain all
101
+ taxonomic ids.
102
+ Add where this can be found.
103
+
104
+ IMPROVE THIS ALL!!!!!!!
105
+
106
+ ------------------------------------------------------------------------------------------
107
+ (2) improve docu + tests for melting temperature analysis again
108
+ + usage example + GUI + web-use
109
+ ------------------------------------------------------------------------------------------
110
+ (3) → https://biopython.org/DIST/docs/tutorial/Tutorial.html#sec15
3
111
 
4
112
  ^^^ work through the above, also integrate it + write docs
5
113
 
6
114
  https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orchid.fasta
7
115
 
8
- -------------------------------------------------------------------------------
9
- (2) → integrate electron microscopy slowly and also add documentation
116
+ ------------------------------------------------------------------------------------------
117
+ (4) → integrate electron microscopy slowly and also add documentation
10
118
  about this AS YOU GO!!!!!
11
119
  ^^^ yup add more of it
12
- -------------------------------------------------------------------------------
13
- (3) → Add save session support
120
+ ------------------------------------------------------------------------------------------
121
+ (5) → Add save session support
14
122
  to reload our last activity completely ...
15
123
  hmmm..
16
124
  This has to be well designed...
@@ -27,9 +135,8 @@ https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orc
27
135
  upon startup of the bioroebe shell.
28
136
  This is in preparation for save-session support.
29
137
 
30
-
31
- -------------------------------------------------------------------------------
32
- (5) → Lys-Asp-Glu-Leu
138
+ ------------------------------------------------------------------------------------------
139
+ (6) → Lys-Asp-Glu-Leu
33
140
 
34
141
  if i.include?('-') and Bioroebe.is_in_the_three_letter_code?(i)
35
142
  end
@@ -47,11 +154,11 @@ https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orc
47
154
 
48
155
  ^^ yep this is also called KDEL
49
156
  https://en.wikipedia.org/wiki/KDEL_(amino_acid_sequence)
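The `Lys-Asp-Glu-Leu` check above could be backed by a small lookup table. The sketch below is an assumption about how such a conversion might look; `THREE_TO_ONE`, `three_letter_code?` and `three_letter_to_one_letter` are hypothetical names and are not the existing `Bioroebe.is_in_the_three_letter_code?`.

```ruby
# Standard three-letter to one-letter amino acid codes (20 canonical residues).
THREE_TO_ONE = {
  'Ala' => 'A', 'Arg' => 'R', 'Asn' => 'N', 'Asp' => 'D', 'Cys' => 'C',
  'Gln' => 'Q', 'Glu' => 'E', 'Gly' => 'G', 'His' => 'H', 'Ile' => 'I',
  'Leu' => 'L', 'Lys' => 'K', 'Met' => 'M', 'Phe' => 'F', 'Pro' => 'P',
  'Ser' => 'S', 'Thr' => 'T', 'Trp' => 'W', 'Tyr' => 'Y', 'Val' => 'V'
}.freeze

# True if the input is a hyphen-separated list of known three-letter codes.
def three_letter_code?(input)
  input.include?('-') &&
    input.split('-').all? { |residue| THREE_TO_ONE.key?(residue.capitalize) }
end

# Converts e.g. "Lys-Asp-Glu-Leu" into the one-letter form.
# Raises KeyError on an unknown residue.
def three_letter_to_one_letter(input)
  input.split('-').map { |residue| THREE_TO_ONE.fetch(residue.capitalize) }.join
end

puts three_letter_to_one_letter('Lys-Asp-Glu-Leu') if three_letter_code?('Lys-Asp-Glu-Leu')
```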
50
- -------------------------------------------------------------------------------
51
- (6) → Add "orthologs". this shall show us the top 25 orthologs or
157
+ ------------------------------------------------------------------------------------------
158
+ (7) → Add "orthologs". this shall show us the top 25 orthologs or
52
159
  something. In the bioshell? Hmm. Not sure yet.
53
- -------------------------------------------------------------------------------
54
- (7) → clone the functionality of this:
160
+ ------------------------------------------------------------------------------------------
161
+ (8) → clone the functionality of this:
55
162
 
56
163
  http://www.kazusa.or.jp/codon/cgi-bin/countcodon.cgi
57
164
  http://www.kazusa.or.jp/codon/countcodon.html
@@ -63,18 +170,18 @@ https://en.wikipedia.org/wiki/KDEL_(amino_acid_sequence)
63
170
  widget first. And sinatra output too.
64
171
  AND document it as well
65
172
 
66
- -------------------------------------------------------------------------------
173
+ ------------------------------------------------------------------------------------------
67
174
  (8) → analyse the SARS genome in bioroebe
68
175
  possibly graphically as well
69
176
 
70
177
  Are there new GUIs that we could combine? Hmmm.
71
- -------------------------------------------------------------------------------
178
+ ------------------------------------------------------------------------------------------
72
179
  (9) → In bioroebe, generate that .ps thingy graphical thing from the
73
180
  vienna RNA tutorial. Hmmm.
74
181
 
75
182
  https://www.tbi.univie.ac.at/RNA/tutorial/
76
- -------------------------------------------------------------------------------
77
- (1) → get insulin sequence from NCBI
183
+ ------------------------------------------------------------------------------------------
184
+ (10) → get insulin sequence from NCBI
78
185
  human
79
186
  then apply trypsin onto it
80
187
  and try it like this:
@@ -88,13 +195,13 @@ Also add:
88
195
  ^^^ to show it
89
196
  Hmm. Perhaps also auto-download or something.
90
197
 
91
- -------------------------------------------------------------------------------
92
- (1) → in bioroebe: UAG?
198
+ ------------------------------------------------------------------------------------------
199
+ (11) → in bioroebe: UAG?
93
200
  ^^^ show all stop codons with that in the bioshell
94
201
  all UAG sequences... hmm. and TAG?
95
202
  Finish that.
96
- ..........................................................................
97
- (1) → The position of a symbol in a string is the total number of
203
+ ------------------------------------------------------------------------------------------
204
+ (12) → The position of a symbol in a string is the total number of
98
205
  symbols found to its left, including itself (e.g., the positions
99
206
  of all occurrences of 'U' in "AUGCUUCAGAAAGGUCUUACG" are 2, 5,
100
207
  6, 15, 17, and 18). The symbol at position i
@@ -102,70 +209,70 @@ Hmm. Perhaps also auto-download or something.
102
209
 
103
210
  ^^^ add a solution there, a toplevel API
104
211
  !!!!!
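One possible shape for the toplevel API requested above, as a sketch only: `positions_of` is a hypothetical name, and positions are 1-based as in the task description.

```ruby
# Returns the 1-based positions of every occurrence of +symbol+ in +string+.
def positions_of(symbol, string)
  positions = []
  string.each_char.with_index(1) do |char, position|
    positions << position if char == symbol
  end
  positions
end

p positions_of('U', 'AUGCUUCAGAAAGGUCUUACG') # prints [2, 5, 6, 15, 17, 18]
```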
105
- -------------------------------------------------------------------------------
106
- (1) → http://bioruby.org/rdoc/Bio/Blast.html
212
+ ------------------------------------------------------------------------------------------
213
+ (13) → http://bioruby.org/rdoc/Bio/Blast.html
107
214
  ^^^ add support for BLAST
108
- ..........................................................................
109
- (1) → add: parse_pdb()
215
+ ------------------------------------------------------------------------------------------
216
+ (14) → add: parse_pdb()
110
217
  With this we shall just show some info, about a given
111
218
  .pdb file at hand.
112
219
  Also make it commandline based too + bioshell variant
113
220
  here, and a sinatra interface once this all works.
114
221
  Don't forget to document it!!!!!
115
222
  ^^^ and google a bit how others do that
116
- ..........................................................................
117
- (2) → pdb 1a6m
223
+ ------------------------------------------------------------------------------------------
224
+ (15) → pdb 1a6m
118
225
  ^^^ download this when that is used in the bioshell; we also have
119
226
  to use the download directory for this, so make sure that
120
227
  we do.
121
228
  ^^^ And then, also document this clearly.
122
- -------------------------------------------------------------------------------
123
- (3) show_string
229
+ ------------------------------------------------------------------------------------------
230
+ (16) show_string
124
231
  ^^^ slowly port this ... find out differences
125
232
  then unify into one method. right now we used
126
233
  two or something.
127
- -------------------------------------------------------------------------------
128
- (4) → Try to see if we can integrate this into our GUI:
234
+ ------------------------------------------------------------------------------------------
235
+ (17) → Try to see if we can integrate this into our GUI:
129
236
 
130
237
  https://cdn.snapgene.com/assets/7.6.11/assets/images/snapgene/homepage/homepage-hero.png
131
- -------------------------------------------------------------------------------
238
+ ------------------------------------------------------------------------------------------
132
239
  (5) → Scan for leucine zipper!
133
240
 
134
241
  This is ~25% implemented. We need to double-check what
135
242
  exactly is a leucine zipper.
136
- ..........................................................................
243
+ ------------------------------------------------------------------------------------------
137
244
  (6) → Extend the sinatra-interface for the Rosalind task,
138
245
  perhaps add a sub-link to show which parts are solved
139
246
  as-is. Hmm. I am not continuing on this though.
140
247
  ^^^^
141
248
  well - make rosalind anew again or something.
142
249
 
143
- ...........................................................................
250
+ ------------------------------------------------------------------------------------------
144
251
  (7) - Add a blast interface; both via the web-interface, GUI,
145
252
  and also from the commandline.
146
- -------------------------------------------------------------------------------
253
+ ------------------------------------------------------------------------------------------
147
254
  (8) - Write a tutorial about primer design.
148
255
  also make sure that the GUI has support for this.
149
- ..........................................................................
256
+ ------------------------------------------------------------------------------------------
150
257
  (9) - In the documentation examples, show some examples for how to work
151
258
  with different organisms.
152
- ..........................................................................
259
+ ------------------------------------------------------------------------------------------
153
260
  (10) - In the bioshell, if "stop?" is issued, then the colouring isn't
154
261
  correct. It currently does not show any result. This has to
155
262
  be fixed.
156
- ..........................................................................
263
+ ------------------------------------------------------------------------------------------
157
264
  (11) → https://www.rubydoc.info/gems/biomart
158
265
  ^^^ integrate biomart
159
266
 
160
267
  p biomart.list_datasets
161
268
  p biomart.datasets?
162
- -------------------------------------------------------------------------------
269
+ ------------------------------------------------------------------------------------------
163
270
  (12) Add Trypsin and Trypsinogen sequences, both as FASTA
164
271
  but also as shortcut via the commandline such as:
165
272
  show_orf :trypsine
166
273
  show_orf :trypsin
167
274
  Or something like this; and document it as well.
168
- -------------------------------------------------------------------------------
275
+ ------------------------------------------------------------------------------------------
169
276
  (13) → 1..60
170
277
 
171
278
  setdna 57
@@ -177,12 +284,12 @@ well - make rosalind anew again or something.
177
284
  5' - ATGTGCAGTCAGGTGAATTTATTGAAAAATTTGAGGCTCCTGGTGGTGCAAATCAAAGAACTGCTCCTCAGTGGATGTTGCCTTTACTTCTAG - 3'
178
285
  ^^^ here, when colourizing: if the last codon is a STOP codon,
179
286
  then we colourize that one as well.
180
- -------------------------------------------------------------------------------
287
+ ------------------------------------------------------------------------------------------
181
288
  (14) → MG1655
182
289
  ^^^ input this to download the sequence. Also show it to the user.
183
- -------------------------------------------------------------------------------
290
+ ------------------------------------------------------------------------------------------
184
291
  (15) → extend virus-information into the bioroebe project.
185
- -------------------------------------------------------------------------------
292
+ ------------------------------------------------------------------------------------------
186
293
  (16) → Add a way to analyse the chemical structure of all
187
294
  aminoacids. We wish to show the chemical formula.
188
295
 
@@ -196,22 +303,22 @@ well - make rosalind anew again or something.
196
303
  I don't understand why it removes H and O so perhaps
197
304
  don't remove that part. But still show the -R.
198
305
 
199
- -------------------------------------------------------------------------------
306
+ ------------------------------------------------------------------------------------------
200
307
  (17) FIX THE COLOURIZATION BUG; THIS ONE TRIGGERED THE WHOLE
201
308
  REWRITE AFTER ALL!
202
- -------------------------------------------------------------------------------
309
+ ------------------------------------------------------------------------------------------
203
310
  (18) FIX TAXONOMY related-problems AS WELL
204
311
  ^^^^^^ AND DOCUMENT THIS related-problems.
205
- -------------------------------------------------------------------------------
312
+ ------------------------------------------------------------------------------------------
206
313
  (19) Do note that z will then be a String, not a sequence object anymore.
207
314
  (This may be subject to change in the future, but for now, aka
208
315
  **February 2020**, it is that way.)
209
316
  ^^^^
210
- -------------------------------------------------------------------------------
317
+ ------------------------------------------------------------------------------------------
211
318
  (20) ^^^ colours are appended. That should not be the case!
212
319
  ADD SOMETHING NEW ... some todo entries
213
320
  and some python tool
214
- -------------------------------------------------------------------------------
321
+ ------------------------------------------------------------------------------------------
215
322
  (21) → rewrite the whole project anew
216
323
  - improve the documentation
217
324
  - focus on class Protein first and add
@@ -219,10 +326,10 @@ well - make rosalind anew again or something.
219
326
  that, as well as:
220
327
  .backtrans
221
328
  .reverse_translate
222
- -------------------------------------------------------------------------------
329
+ ------------------------------------------------------------------------------------------
223
330
  (22) → AND THEN test on windows as well.
224
331
  ^^^^^^^^^^^^^^
225
- -------------------------------------------------------------------------------
332
+ ------------------------------------------------------------------------------------------
226
333
  (23) →
227
334
  Reduced alphabets for proteins | [not implemented yet]
228
335
  ^^^ check this as well
@@ -252,9 +359,9 @@ First focus on bioroebe.
252
359
  efetch "https://www.ncbi.nlm.nih.gov/gene/744779"
253
360
  ^^^ test this. again
254
361
 
255
- -------------------------------------------------------------------------------
362
+ ------------------------------------------------------------------------------------------
256
363
  (25) fix tk-levenshtein
257
- -------------------------------------------------------------------------------
364
+ ------------------------------------------------------------------------------------------
258
365
  (26) → rewrite the whole project anew
259
366
  - improve the documentation
260
367
  - rework the WHOLE tutorial as well
@@ -263,13 +370,13 @@ efetch "https://www.ncbi.nlm.nih.gov/gene/744779"
263
370
  that
264
371
  .backtrans
265
372
  .reverse_translate
266
- -------------------------------------------------------------------------------
373
+ ------------------------------------------------------------------------------------------
267
374
  (27) → analyze /Depot/Temp/Bioroebe/1CEZ.pdb
268
375
 
269
376
  ^^^
270
377
  support this. Already works half-way, we started writing a pdb parser.
271
378
  this should work in general, for .fasta files as well.
272
- ..........................................................................
379
+ ------------------------------------------------------------------------------------------
273
380
  (28) → SINATRA STUFF:
274
381
  FIX AND EXTEND SINATRA IN BIOROEBE.
275
382
  extend it too.
@@ -281,7 +388,7 @@ efetch "https://www.ncbi.nlm.nih.gov/gene/744779"
281
388
  and special-display on sinatra kaa
282
389
  where the nucleotide sequence has numbers
283
390
  ^^^
284
- -------------------------------------------------------------------------------
391
+ ------------------------------------------------------------------------------------------
285
392
  (29) pick any virus and begin to amass tons of data; and then when done
286
393
  also connect this into a GUI for use therein.
287
394
 
@@ -302,7 +409,7 @@ https://www.ncbi.nlm.nih.gov/nuccore/NC_038391.1
302
409
 
303
410
 
304
411
 
305
- -------------------------------------------------------------------------------
412
+ ------------------------------------------------------------------------------------------
306
413
  (1) → Fix:
307
414
 
308
415
  require 'bioroebe/toplevel_methods/open_reading_frames.rb'
@@ -310,10 +417,10 @@ https://www.ncbi.nlm.nih.gov/nuccore/NC_038391.1
310
417
  Something is wrong; it returns regions that contain
311
418
  a stop codon, which can not be true.
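To pin down the expected behaviour, here is a minimal forward-strand sketch in which every reported region ends just before the first in-frame stop codon, so no result can contain one. `open_reading_frames` is a hypothetical standalone helper, not the method in `open_reading_frames.rb`, and it ignores the reverse strand and ORFs without a stop codon.

```ruby
STOP_CODONS = %w[TAA TAG TGA].freeze

# Returns all complete ORFs (ATG ... up to, but excluding, the first
# in-frame stop codon) found in the three forward reading frames.
def open_reading_frames(dna)
  sequence = dna.upcase
  orfs = []
  (0..2).each do |frame|
    codons = sequence[frame..-1].to_s.scan(/.{3}/)
    current = nil # codons of the ORF currently being extended, if any
    codons.each do |codon|
      if current.nil?
        current = [codon] if codon == 'ATG'
      elsif STOP_CODONS.include?(codon)
        orfs << current.join
        current = nil
      else
        current << codon
      end
    end
  end
  orfs
end

p open_reading_frames('CCATGAAATAGGG') # prints ["ATGAAA"]
```

A regression test for the bug could assert that no returned region, split into codons, includes `TAA`, `TAG` or `TGA`.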
312
419
 
313
- -------------------------------------------------------------------------------
420
+ ------------------------------------------------------------------------------------------
314
421
  (3) → Fix: extend glycovirology parts
315
422
  seek stuff in viral genomes
316
- -------------------------------------------------------------------------------
423
+ ------------------------------------------------------------------------------------------
317
424
  (4) →
318
425
 
319
426
  seq = Bio::Sequence::NA.new("atgcatgcaaaaaaa")
@@ -336,13 +443,13 @@ https://www.ncbi.nlm.nih.gov/nuccore/NC_038391.1
336
443
  seq = Bioroebe::Sequence.new("atgcatgcaaaaaaa")
337
444
  puts seq
338
445
  puts seq.complement
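For comparison when testing, a dependency-free sketch of both conventions a `complement` method might follow. The helper names are hypothetical; which of the two results a given library returns from `complement` should be checked against its documentation.

```ruby
# Base-by-base complement, keeping the input orientation and letter case.
def complement(dna)
  dna.tr('ATGCatgc', 'TACGtacg')
end

# Complement read in the opposite direction (reverse complement).
def reverse_complement(dna)
  complement(dna).reverse
end

sequence = 'atgcatgcaaaaaaa'
puts complement(sequence)         # prints tacgtacgttttttt
puts reverse_complement(sequence) # prints tttttttgcatgcat
```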
339
- -------------------------------------------------------------------------------
446
+ ------------------------------------------------------------------------------------------
340
447
  (5) →
341
448
  make sure we have a good fasta-showing widget
342
449
  show how many nucleotides are
343
450
  AND add support to modify this as-is
344
451
  ^^^^
345
- -------------------------------------------------------------------------------
452
+ ------------------------------------------------------------------------------------------
346
453
  (6) → In BioRoebe:
347
454
 
348
455
  Add a table showing how compatible bioroebe is compared to the other
@@ -352,7 +459,7 @@ https://www.ncbi.nlm.nih.gov/nuccore/NC_038391.1
352
459
  including Bio (ruby-bio) the main ruby project here.
353
460
  And add a table which functionality is implemented
354
461
  in Java already.
355
- -------------------------------------------------------------------------------
462
+ ------------------------------------------------------------------------------------------
356
463
  (7) →
357
464
  ********************************************************************************
358
465
  What happens if we treat the lambda genome with EcoRI?
@@ -375,19 +482,19 @@ Bioroebe.digest_this_dna("/root/Bioroebe/fasta/NC_001416.1_Enterobacteria_phage_
375
482
  DNA.
376
483
  ^^^ this now works kind of ... but it must be better
377
484
  documented and we must test this with more data.
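For documentation and testing, a small reference digest can help to cross-check results. The sketch below assumes linear DNA and the EcoRI site G^AATTC (cut after the first base of the site); `digest_linear` and its parameters are hypothetical and unrelated to the internals of `Bioroebe.digest_this_dna`.

```ruby
# Cuts linear DNA at every occurrence of +site+, +cut_offset+ bases into
# the site, and returns the fragments in 5' to 3' order.
def digest_linear(dna, site: 'GAATTC', cut_offset: 1)
  sequence = dna.upcase
  fragments = []
  fragment_start = 0
  search_from = 0
  while (index = sequence.index(site, search_from))
    cut = index + cut_offset
    fragments << sequence[fragment_start...cut]
    fragment_start = cut
    search_from = index + 1
  end
  fragments << sequence[fragment_start..-1]
  fragments
end

p digest_linear('AAGAATTCTT') # prints ["AAG", "AATTCTT"]
```

Comparing the fragment lengths from such a helper with the existing output on a few small inputs is one way to build up the "more data" tests mentioned above.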
378
- -------------------------------------------------------------------------------
485
+ ------------------------------------------------------------------------------------------
379
486
  (8) → add the bioroebe logo to sinatra, but as appropriate size,
380
487
  via base64. perhaps width 50 or so. need to determine
381
488
  which size fits here.
382
- -------------------------------------------------------------------------------
489
+ ------------------------------------------------------------------------------------------
383
490
  (9) → Integrate http://nc2.neb.com/NEBcutter2/cutshow.php?name=ffe1d68e-
384
491
 
385
492
  in particular the visual part.
386
- -------------------------------------------------------------------------------
493
+ ------------------------------------------------------------------------------------------
387
494
  (10) → https://international.neb.com/products/r0196-ncii#Product%20Information
388
495
  ^^^ autogenerate such an image, aka restriction cutting enzyme
389
496
  to indicate the target sequence.
390
- -------------------------------------------------------------------------------
497
+ ------------------------------------------------------------------------------------------
391
498
  (6) → how to do codon optimization in E. coli? bioroebe must support this!
392
499
 
393
500
  we must first get a display which codon is very commonly used in
@@ -399,34 +506,23 @@ and then we look which codons may be improvable - display
399
506
  them on the commandline
400
507
 
401
508
  class: OptimizeCodons.new(of_this_sequence)
402
- -------------------------------------------------------------------------------
509
+ ------------------------------------------------------------------------------------------
403
510
  (7) → Molecular size of "Ubiquitin"? "8.5 kDa".
404
511
  ^^^ this should be calculated automatically
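The automatic calculation asked for in this entry could look roughly like the following. This is a minimal sketch under stated assumptions, not bioroebe's own API: the method name `molecular_weight_of` and both constants are hypothetical, and the calculation uses average residue masses plus one water molecule per chain, ignoring any modifications.

```ruby
# Average residue masses in Dalton (free amino acid minus one water).
# Hypothetical constant name; not taken from bioroebe.
AVERAGE_RESIDUE_MASS = {
  'G' => 57.0519,  'A' => 71.0788,  'S' => 87.0782,  'P' => 97.1167,
  'V' => 99.1326,  'T' => 101.1051, 'C' => 103.1388, 'L' => 113.1594,
  'I' => 113.1594, 'N' => 114.1038, 'D' => 115.0886, 'Q' => 128.1307,
  'K' => 128.1741, 'E' => 129.1155, 'M' => 131.1926, 'H' => 137.1411,
  'F' => 147.1766, 'R' => 156.1875, 'Y' => 163.1760, 'W' => 186.2132
}.freeze

# One water is added back for the free N- and C-terminus.
WATER_MASS = 18.01528

# Returns the average molecular weight in Dalton.
# Raises KeyError for letters outside the 20 standard amino acids.
def molecular_weight_of(sequence)
  residues = sequence.upcase.delete('^A-Z').chars
  residues.sum { |aminoacid| AVERAGE_RESIDUE_MASS.fetch(aminoacid) } + WATER_MASS
end

# Human ubiquitin, 76 residues.
UBIQUITIN = 'MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG'
```

Calling `molecular_weight_of(UBIQUITIN)` and dividing by 1000 gives the size in kDa, which can be compared against the "8.5 kDa" quoted in the entry.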
405
- -------------------------------------------------------------------------------
512
+ ------------------------------------------------------------------------------------------
406
513
  (8) → taxonomy !!!!!!!!!!!!!!!!!!
407
- -------------------------------------------------------------------------------
514
+ ------------------------------------------------------------------------------------------
408
515
  (9) → Given a list of gene names that I would like to get chromosome/position
409
516
  information for (in mm10). Is there some service online where I can
410
517
  paste this list? ^^^ enable this
411
- -------------------------------------------------------------------------------
412
- (10) → Show the frequency of codons in different tables
413
-
414
- This works quite ok, but right now the approach is to store
415
- this in a .yml file which is not ideal.
416
-
417
- Thus, we have to add two things:
418
- - The ability to store this into a SQL database
419
- - The ability to batch-download all of these codons,
420
- which first requires that we have a way to obtain all
421
- taxonomic ids.
422
- -------------------------------------------------------------------------------
518
+ ------------------------------------------------------------------------------------------
423
519
  (11) → Add a way in bioroebe to store a gene into a yaml file
424
520
  or so, and to also load it up again. Perhaps simplify
425
521
  this automatically. Need some ways to describe that.
426
- -------------------------------------------------------------------------------
522
+ ------------------------------------------------------------------------------------------
427
523
  (12) → Make bioroebe very useful from the www, no matter if via sinatra
428
524
  or rails. It should be a tool-set project on the www as well.
429
- -------------------------------------------------------------------------------
525
+ ------------------------------------------------------------------------------------------
430
526
  (13) → Suppose you have a GenBank file which you want to turn into a
431
527
  Fasta file. For example, let's consider the file cor6_6.gb
432
528
  which is included in the Biopython unit tests under the
@@ -441,12 +537,12 @@ call it format-converter or so
441
537
  the GUI works somewhat but needs to be polished up.
442
538
  THEN THIS CAN BE REMOVED!!!!!!!
443
539
 
444
- -------------------------------------------------------------------------------
540
+ ------------------------------------------------------------------------------------------
445
541
  (14) → We need a table in which we compile and compare the strong promoters
446
542
  of different organisms.
447
543
 
448
544
  strong_promoters.yml
449
- -------------------------------------------------------------------------------
545
+ ------------------------------------------------------------------------------------------
450
546
  (15) → add:
451
547
  start position of exons
452
548
  and show the sequence based on that file
@@ -454,9 +550,9 @@ THEN THIS CAN BE REMOVED!!!!!!!
454
550
  Normally there's a "gene" entry for each gene, so:
455
551
  awk 'BEGIN{FS="\t"; OFS="\t"}{if($3 == "gene") print $1, $4, $5}' foo.gtf
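The awk one-liner above could be mirrored in Ruby along these lines. This is only a sketch: the method name `gene_coordinates_from_gtf` is hypothetical, and unlike awk it converts the start and end columns to integers.

```ruby
# Returns [seqname, start, end] for every GTF line whose third
# column (the feature type) is "gene". Hypothetical helper name.
def gene_coordinates_from_gtf(lines)
  lines.each_with_object([]) do |line, result|
    next if line.start_with?('#')           # skip comment lines
    fields = line.chomp.split("\t")
    next unless fields[2] == 'gene'
    result << [fields[0], fields[3].to_i, fields[4].to_i]
  end
end
```

Any enumerable of lines works as input, so for a file on disk something like `gene_coordinates_from_gtf(File.foreach('foo.gtf'))` should do.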
456
552
 
457
- -------------------------------------------------------------------------------
553
+ ------------------------------------------------------------------------------------------
458
554
  (16) → also add 30-33 to aminoacids hmmm difficult.
459
- -------------------------------------------------------------------------------
555
+ ------------------------------------------------------------------------------------------
460
556
  (17) → http://bioinformatics.oxfordjournals.org/content/18/8/1135
461
557
  "TFBS: Computational framework for transcription factor
462
558
  binding site analysis"
@@ -464,7 +560,7 @@ THEN THIS CAN BE REMOVED!!!!!!!
464
560
  into bioroebe
465
561
 
466
562
  http://tfbs.genereg.net/
467
- -------------------------------------------------------------------------------
563
+ ------------------------------------------------------------------------------------------
468
564
  (18) → They include trypsin, chymotrypsin, thrombin, plasmin, papain and factor Xa.
469
565
  ^^^ provide means to identify where they cut,
470
566
  and then show this by simulating a digest.
@@ -472,7 +568,7 @@ THEN THIS CAN BE REMOVED!!!!!!!
472
568
  also document this on bioroebe todo
473
569
  this is done via digestion/digestions
474
570
  but it's not quite perfect yet.
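A simulated digest for one of these proteases, trypsin, could be sketched as below. The rule assumed here is the common textbook one (cleave C-terminal of K or R, but not when the next residue is P); the method name `trypsin_digest` is hypothetical and is not the existing digestion code in bioroebe.

```ruby
# Splits a protein sequence after every K or R that is not
# followed by P. Assumed textbook rule; hypothetical helper name.
def trypsin_digest(sequence)
  sequence.upcase.split(/(?<=[KR])(?!P)/)
end
```

The other proteases listed in the entry would need their own cleavage patterns; those are not covered by this sketch.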
475
- -------------------------------------------------------------------------------
571
+ ------------------------------------------------------------------------------------------
476
572
  (19) → a) add a commandline way to generate a random protein
477
573
  with a specified length and then display it on the
478
574
  commandline [DONE] !!!
@@ -498,29 +594,29 @@ THEN THIS CAN BE REMOVED!!!!!!!
498
594
  Enable this BOTH from the commandline AND from the
499
595
  interactive variant and from sinatra! Hmmmm.
500
596
 
501
- -------------------------------------------------------------------------------
597
+ ------------------------------------------------------------------------------------------
502
598
  (1) → add an option to design a
503
599
 
504
600
  degenerate primer
505
- -------------------------------------------------------------------------------
601
+ ------------------------------------------------------------------------------------------
506
602
  (2) Add upcase to sequences and ensure that it works; also document it
507
603
  internally and in the .pdf tutorial
508
604
  what does that mean? upcase as method? hmmm.
509
605
 
510
- ..........................................................................
606
+ ------------------------------------------------------------------------------------------
511
607
  (1) → http://www.biomart.org/other/user-docs.pdf
512
608
  ^^^ work through this
513
609
  ^^^ integrate the old .cgi part and improve as you go
514
- ..........................................................................
610
+ ------------------------------------------------------------------------------------------
515
611
  (1) → Access geninfo numbers easily.
516
612
  Die suchen und runterladen.
517
- -------------------------------------------------------------------------------
613
+ ------------------------------------------------------------------------------------------
518
614
  - Add all of bioruby into bioroebe:
519
615
 
520
616
  continuous project
521
617
  https://github.com/biopython/biopython
522
618
  https://github.com/bioruby/bioruby/tree/master/lib/bio
523
- -------------------------------------------------------------------------------
619
+ ------------------------------------------------------------------------------------------
524
620
  (3) → https://github.com/bioruby/bioruby/issues/134
525
621
  ^^^ check this, for restriction enzymes
526
622
  http://rebase.neb.com/rebase/enz/MboII.html
@@ -530,9 +626,9 @@ THEN THIS CAN BE REMOVED!!!!!!!
530
626
  > seq = seq.reverse_complement
531
627
  > Bio::RestrictionEnzyme.cut(seq, 'MboII').primary rescue [seq]
532
628
  => ["atcatcaatcctaatcttct"]
533
- -------------------------------------------------------------------------------
629
+ ------------------------------------------------------------------------------------------
534
630
  (4) → Document how an ORF is defined for the bioroebe project.
535
- ..........................................................................
631
+ ------------------------------------------------------------------------------------------
536
632
  (5) Continue with biojava in bioroebe.
537
633
 
538
634
  → We need to make some table that tells us what is implemented
@@ -547,7 +643,7 @@ THEN THIS CAN BE REMOVED!!!!!!!
547
643
 
548
644
  dprimer M-T-T-Y-Y-T-A-A-A-STOP
549
645
 
550
- ..........................................................................
646
+ ------------------------------------------------------------------------------------------
551
647
  (1) → The codon tables:
552
648
  → In January we added a codon-table GUI to ruby-gtk3.
553
649
 
@@ -576,31 +672,29 @@ THEN THIS CAN BE REMOVED!!!!!!!
576
672
 
577
673
  This now sorta works semi-ok.
578
674
 
579
- -------------------------------------------------------------------------------
675
+ ------------------------------------------------------------------------------------------
580
676
  (1) → In the bioroebe-shell, enable input such as:
581
677
 
582
678
  NC_000011.10
583
679
 
584
680
  This shall quickly download this sequence into the
585
681
  local file, and also rename it properly.
586
- -------------------------------------------------------------------------------
682
+ ------------------------------------------------------------------------------------------
587
683
  → clone all of bioruby
588
- -------------------------------------------------------------------------------
684
+ ------------------------------------------------------------------------------------------
589
685
  (1) → read through bioinformatics books and include the material !!!
590
686
  ^^^^^ add more pictures ... possibly also of the GUIs.
591
687
  Also work through biopython and port everything important
592
688
  to bioroebe.
593
- -------------------------------------------------------------------------------
689
+ ------------------------------------------------------------------------------------------
594
690
  - Add: DetectMotif
595
691
 
596
692
  This class shall be used for detecting subsequences.
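What such a class might look like, as a minimal sketch only: the method name `positions_of` and the choice of 1-based positions are assumptions, not an existing bioroebe interface.

```ruby
# Hypothetical minimal version of the DetectMotif idea.
class DetectMotif
  def initialize(sequence)
    @sequence = sequence.upcase
  end

  # Returns the 1-based start positions of every hit of the motif,
  # including overlapping hits.
  def positions_of(motif)
    motif = motif.upcase
    positions = []
    offset = 0
    while (index = @sequence.index(motif, offset))
      positions << index + 1
      offset = index + 1   # advance by one so overlaps are found
    end
    positions
  end
end
```

Only exact subsequences are handled here; degenerate or regex-style motifs would need a different matching step.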
597
- -------------------------------------------------------------------------------
693
+ ------------------------------------------------------------------------------------------
598
694
  - Add new functionality
599
- -------------------------------------------------------------------------------
600
- - more documentation!
601
- -------------------------------------------------------------------------------
602
- - continue on bioroebe, and when it is done, write to the guy.
603
- -------------------------------------------------------------------------------
695
+ ------------------------------------------------------------------------------------------
696
+ - more documentation!!!
697
+ ------------------------------------------------------------------------------------------
604
698
  - Rewrite bioroebe completely - add some tests, too or so, to
605
699
  test this. ^^^
606
700
  That way we learn how to write tests.
@@ -643,22 +737,13 @@ extend bioroebe sinatra interface
643
737
  also add a footer to show which entries are available or so
644
738
  → in bioroebe, make the postgresql database work again ...
645
739
 
646
-
647
-
648
- ..........................................................................
740
+ ------------------------------------------------------------------------------------------
649
741
 
650
742
  → ^^^ improve this whole project a lot
651
743
 
652
744
  before uploading then send email
653
745
 
654
746
 
655
- - 1fat.pdb
656
-
657
- ^^^ download this, also via bioshell
658
- download 1fat
659
- ^^^ notify the user about this
660
- but put it into the dir of bioshell
661
-
662
747
  → add:
663
748
 
664
749
  set_dna :insulin
@@ -674,44 +759,26 @@ also add a footer to show which entries are available or so
674
759
  → becomes: http://www.ncbi.nlm.nih.gov/gene/3630
675
760
 
676
761
  wtf ... better to learn how NCBI works
677
- -------------------------------------------------------------------------------
678
- - Add a seuqence table int obioroebe for GFP, YFP etc
762
+ ------------------------------------------------------------------------------------------
763
+ - Add a sequence table into bioroebe for GFP, YFP etc
679
764
  and make this show in both the interactive bioshell but
680
765
  also the main README.md
681
- -------------------------------------------------------------------------------
682
- - stop_frame1?
683
- ^^^ add support for this
684
- and stop_frame2?
685
- etcc
686
- to show stop-codons in this colour
687
- THEN UPLOAD!
688
- ^^^ this works now but is not documented
689
-
690
-
691
- -------------------------------------------------------------------------------
692
-
693
- - chop to first ATG
694
766
 
695
- chop :ATG
696
-
697
- ^^^^ enable this, to chop towards the first ATG
698
- sequence in the string
699
-
700
- -------------------------------------------------------------------------------
767
+ ------------------------------------------------------------------------------------------
701
768
  → http://www.biophp.org/stats/describe_data/demo.php?show=formula
702
769
 
703
770
  ^^^ should also add documentation like this, also via www interface
704
- -------------------------------------------------------------------------------
771
+ ------------------------------------------------------------------------------------------
705
772
  → add mouse chromosome URL, also in the bioshell
706
773
  and the main README, to be of help for the
707
774
  user. add a mouse subsection.
708
- ..........................................................................
775
+ ------------------------------------------------------------------------------------------
709
776
  → fix the taxonomy stuff...
710
- ..........................................................................
777
+ ------------------------------------------------------------------------------------------
711
778
  (1) → add 2nd_orf
712
779
  → this shall scan for the 2nd orf
713
780
  → and third ORF as well, then, and document it.
714
- ..........................................................................
781
+ ------------------------------------------------------------------------------------------
715
782
  (2) → Add a "cutter-range example" in restriction enzymes +
716
783
  table + examples + tutorial
717
784
 
@@ -719,21 +786,16 @@ also add a footer to show which entries are available or so
719
786
 
720
787
  Also, add in the documentation where this
721
788
  can be found.
722
- ..........................................................................
723
- (3) → Add aaruler, similar to "ruler"; in the bioshell.
724
- But we want to do this on the dna-sequence rather
725
- than the aminoacid sequence.
726
- This works but the display is not ideal.
727
- ..........................................................................
789
+ ------------------------------------------------------------------------------------------
728
790
  (4) → Add some codon-usage analyzer. What shall it show? It
729
791
  should show how many codons are used, frequencies etc...
730
792
  by an organism, and compare that to other data.
731
- ..........................................................................
793
+ ------------------------------------------------------------------------------------------
732
794
  (5) → Implement a GPCR interface.
733
795
 
734
796
  This is for "G-protein coupled receptors."
735
797
  Denote which variants exist and so forth. Document it as well.
736
- ..........................................................................
798
+ ------------------------------------------------------------------------------------------
737
799
  (6) → alu?
738
800
 
739
801
  Will read from the file `/Programs/Ruby/2.3.0/lib/ruby/site_ruby/2.3.0/bioroebe/yaml/alu_elements.yml`.
@@ -756,7 +818,7 @@ also add a footer to show which entries are available or so
756
818
  ^^^ add this and document it or something like that
757
819
  And perhaps add a small protein as an example how to
758
820
  work with .pdb files instead.
759
- -------------------------------------------------------------------------------
821
+ ------------------------------------------------------------------------------------------
760
822
  (4) → Extend bioroebe to allow download
761
823
 
762
824
  PDB files
@@ -770,13 +832,13 @@ also add a footer to show which entries are available or so
770
832
 
771
833
  in 3EML 2VTP 2VEZ
772
834
  do
773
- ..........................................................................
835
+ ------------------------------------------------------------------------------------------
774
836
  (1) → Fully integrate electron microscopy then remove the old entry.
775
837
  Test it though.
776
838
  Hmm... but ... we will first polish the main bioroebe
777
839
  gem AND the taxonomy gem and THEN AFTERWARDS
778
840
  integrate electron microscopy.
779
- ..........................................................................
841
+ ------------------------------------------------------------------------------------------
780
842
  (1) → ORF Finder:
781
843
 
782
844
  We must add an ORF finder for the bioroebe project,
@@ -785,23 +847,23 @@ also add a footer to show which entries are available or so
785
847
  This works partially... start_stop works but we do not
786
848
  yet find all subsequences.
787
849
 
788
- ..........................................................................
850
+ ------------------------------------------------------------------------------------------
789
851
  (1) → must determine whether we have a protein or nucleotide or
790
852
  so via a toplevel method!
791
- ..........................................................................
853
+ ------------------------------------------------------------------------------------------
792
854
  (1) → there is a talens module.
793
855
  we have to improve on it for a while
794
856
  better docu
795
857
  more testing
796
858
  then we can get rid of this entry here
797
- ..........................................................................
859
+ ------------------------------------------------------------------------------------------
798
860
  (1) → 33.44
799
861
  Next showing the nucleotides 33 to 44 (including 33 and 44).
800
862
  The length of the fragment will be 12 nucleotides.
801
863
  5' - 2;70;130;180 - 3'
802
864
  ^^^ there is some problem; we somehow embed the colour codes,
803
865
  which should not happen.
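The intended "33.44" behaviour can be sketched on a plain string, which sidesteps the embedded colour codes mentioned above as long as the slice is taken before any colourizing. The method name `subsequence_for` is hypothetical.

```ruby
# Parses input such as "33.44" into a 1-based, inclusive
# nucleotide range and returns that part of the sequence.
# Works on the raw sequence, before any colour codes are added.
def subsequence_for(sequence, range_string)
  first, last = range_string.split('.').map(&:to_i)
  sequence[(first - 1)..(last - 1)]
end
```

For a sequence of at least 44 nucleotides, `subsequence_for(sequence, '33.44')` returns 12 nucleotides, matching the fragment length stated in the entry.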
804
- ..........................................................................
866
+ ------------------------------------------------------------------------------------------
805
867
  (1) → set_aa DTLCIGYHAN NSTDTVDTVL EKNVTVTHSV NLLEDKHNGK LCKLRGVAPL HLGKCNIAGW ILGNPECESL STASSWSYIV ETSNSDNGTC YPGDFINYEE LREQLSSVSS FERFEIFPKT SSWPNHDNKG VTAACPHAGA KSFYKNLIWL VKKGNSYPKL NQSYINDKGK EVLVLWGIHH PSTTADQQSL YQNADAYVFV GTSRYSKKFK PEIATRPKVR DQEGRMNYYW TLVEPGDKIT FEATGNLVVP RYAFMERNAG SGIIISDTPV HDCNTTCQTP EGAINTSLPF QNIHPITIGK CPKYVKSTKL RLATGLRNVP SIQSRGLFGA IAGFIEGGWT GMVDGWYGYH HQNEQGSGYA ADLKSTQNAI DKITNKVNSV IKMNTQFTAV GKEFNHLEKR IENLNKKVDD GFLDIWTYNA ELLVLLENER TLDYHDSNVK NLYEKVRNQL KNNAKEIGNG CFEFYHKCDN TCMESVKNGT YDYPKYSEEA KLNREKIDGV KLESTRIYHH HHHH
806
868
 
807
869
  ^^^ enable copy/pasting,
@@ -816,7 +878,7 @@ also add a footer to show which entries are available or so
816
878
  This sequence has 50 aminoacids.
817
879
  ^^^ this is not correct.
818
880
 
819
- ..........................................................................
881
+ ------------------------------------------------------------------------------------------
820
882
  (1) → add this functionality:
821
883
 
822
884
  melting temper
@@ -853,70 +915,57 @@ also add a footer to show which entries are available or so
853
915
  and also provide a commandline-way to calculate them,
854
916
  using ruby. The latter may be useful and rather easy for
855
917
  scripted use.
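For the scripted, commandline-friendly calculation mentioned here, one simple option is the Wallace rule, Tm = 2 * (A + T) + 4 * (G + C) in degrees Celsius. This is a rough estimate intended for short oligonucleotides and is only one of several possible formulas; the method name `wallace_melting_temperature` is hypothetical.

```ruby
# Wallace rule estimate of the melting temperature in degrees Celsius.
# Assumption: a short DNA oligonucleotide containing only A, T, G, C.
def wallace_melting_temperature(sequence)
  seq = sequence.upcase
  at_count = seq.count('AT')   # number of A or T characters
  gc_count = seq.count('GC')   # number of G or C characters
  2 * at_count + 4 * gc_count
end
```

Wrapped in a small script that reads the sequence from `ARGV`, this would cover the commandline use case described above.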
856
- ..........................................................................
918
+ ------------------------------------------------------------------------------------------
857
919
  (1) → show insulin
858
920
  ^^^ to show the insulin structure
859
921
  how to find it? no idea...
860
922
  but we should have these structures already made available somewhere.
861
- ..........................................................................
923
+ ------------------------------------------------------------------------------------------
862
924
  (1) → Todo: find family of enzymes, based on sequence structure
863
925
  alone.
864
- ..........................................................................
926
+ ------------------------------------------------------------------------------------------
865
927
  (1) → https://pubchem.ncbi.nlm.nih.gov/compound/16131099#section=Top
866
928
 
867
929
  ^^^ this website is quite interesting; try to use components
868
930
  from it.
869
- -------------------------------------------------------------------------------
931
+ ------------------------------------------------------------------------------------------
870
932
  (1) → Add some option to show the aminoacid sequence, at the least
871
933
  store it; and optionally show it.
872
934
 
873
935
  possibly always report how many aminoacids are
874
936
  part of that file; and optionally also show
875
937
  the whole sequence.
876
- -------------------------------------------------------------------------------
938
+ ------------------------------------------------------------------------------------------
877
939
  (1) → WORK THROUGH the PROTOCOL AT BOKU. THEN WORK THROUGH THE VARIOUS
878
940
  TIDBITS AT UNI WIEN STARTING WITH HEIKO.
879
941
  ^^^ this is where we are now.
880
942
  we are at the beginning of 1b ... hmmmm, so first go through the
881
943
  BOKU part. Then delete this.
882
- -------------------------------------------------------------------------------
944
+ ------------------------------------------------------------------------------------------
883
945
  (1) → Begin tk-bindings for bioroebe, following the gtk stuff.
884
- -------------------------------------------------------------------------------
946
+ ------------------------------------------------------------------------------------------
885
947
  (2) → frame_value = position_of_the_stop_codon - position_of_the_start_codon
886
948
  ^^^ continue on this ...
887
- -------------------------------------------------------------------------------
949
+ ------------------------------------------------------------------------------------------
888
950
  (1) → improve both the gtk-apps parts, and the sinatra web-interface,
889
951
  and other GUI-like elements. The idea is to make this software
890
952
  more useful for people around the world, which should help
891
953
  increase its adoption rate.
892
- -------------------------------------------------------------------------------
954
+ ------------------------------------------------------------------------------------------
893
955
  (2) → Look to integrate this:
894
956
 
895
957
  http://www.ncbi.nlm.nih.gov/nuccore/NM_007315.3?report=fasta&log$=seqview&format=text
896
958
  ^^^
897
- -------------------------------------------------------------------------------
898
- (1) → Clone and document the getorf functionality properly.
899
-
900
- See: http://emboss.sourceforge.net/apps/cvs/emboss/apps/getorf.html
901
- -------------------------------------------------------------------------------
902
- (2) → set_dna_sequence alu
903
-
904
- ^^^ fetch random alu
905
-
906
- ^^^ alu sequence
907
- Ok we started this now adding more details, but we
908
- need to become better at searching for this
909
- sequence.
910
- -------------------------------------------------------------------------------
959
+ ------------------------------------------------------------------------------------------
911
960
  (3) → We need to make available the ... thingy magick
912
961
  emboss functionality. that may seem useful
913
962
  but also feel free to extend these parts for
914
963
  bioroebe as necessary.
915
- -------------------------------------------------------------------------------
964
+ ------------------------------------------------------------------------------------------
916
965
  (4) → integrate electron_microscopy fully
917
966
  This will take more time, so first we finish with the
918
967
  taxonomy module instead.
919
- -------------------------------------------------------------------------------
968
+ ------------------------------------------------------------------------------------------
920
969
  (5) → Improve support for BLAST up until
921
970
 
922
971
  middle of 2015 so that I am better prepared
@@ -927,7 +976,7 @@ also add a footer to show which entries are available or so
927
976
  So, work on BLAST tutorial at bioinf page:
928
977
 
929
978
  bl bioinf; rf bioinf
930
- -------------------------------------------------------------------------------
979
+ ------------------------------------------------------------------------------------------
931
980
  (3) → integrate a "codon usage database", whatever this means.
932
981
  It is a cool database anyway. Then document this.
933
982
  First, create a codon-usage analysis on a per-FASTA
@@ -935,7 +984,7 @@ also add a footer to show which entries are available or so
935
984
  and calculate the codon usage from there.
936
985
 
937
986
  ^^^ and add some GUI to this. hmmm
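A per-FASTA codon-usage count could start from something like the sketch below. Both method names are hypothetical, the FASTA handling assumes a single record, and the sequence is read in frame 1 from its first nucleotide with any incomplete trailing codon dropped.

```ruby
# Joins all non-header lines of a single-record FASTA text.
def sequence_from_fasta(fasta_text)
  fasta_text.lines.reject { |line| line.start_with?('>') }.map(&:strip).join
end

# Counts codons in frame 1; an incomplete codon at the end is ignored.
def codon_usage(sequence)
  sequence.upcase.scan(/.{3}/).each_with_object(Hash.new(0)) do |codon, counts|
    counts[codon] += 1
  end
end
```

Dividing each count by the total number of codons gives the frequencies that a codon-usage table would show.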
938
- ..........................................................................
987
+ ------------------------------------------------------------------------------------------
939
988
  (4) → Input sequence:
940
989
 
941
990
  MFLMVSPTAYHQNKDECFLP
@@ -951,46 +1000,40 @@ also add a footer to show which entries are available or so
951
1000
 
952
1001
  ^^^ we should also show this on the commandline AND the
953
1002
  www ... hmmm.
954
- ..........................................................................
1003
+ ------------------------------------------------------------------------------------------
955
1004
  (5) → enable a graphical layer so that we can find out which
956
1005
  transcription factor activates which gene(s). This
957
1006
  should show e. g. a transcription factor highlighting
958
1007
  a target genetic area.
959
- ..........................................................................
1008
+ ------------------------------------------------------------------------------------------
960
1009
  (2) → We should add more screenshots, make them available on imgur
961
1010
  as well, after storing them locally. Start with the more
962
1011
  important functionality.
963
1012
 
964
- ..........................................................................
1013
+ ------------------------------------------------------------------------------------------
965
1014
  (2) → clone serial cloner or whatever the name was, that GUI,
966
1015
  so that we can offer the same functionality.
967
- ..........................................................................
1016
+ ------------------------------------------------------------------------------------------
968
1017
  (1) →
969
1018
 
970
1019
  # * searching for PubMed IDs given a query string:
971
1020
  # * Bio::PubMed#esearch (recommended)
972
1021
  # * Bio::PubMed#search (only retrieves top 20 hits; will be deprecated)
973
1022
  ^^^ implement this
974
-
975
-
976
- ..........................................................................
1023
+ ------------------------------------------------------------------------------------------
977
1024
  (3) → Be able to solve exercise 16 in bioroebe
978
- ..........................................................................
979
- (4) → The taxonomy part should be fully integrated, without it
980
- being a standalone part anymore.
981
- continue on the taxonomy stuff.
982
- ne day this will work again *shake fist*
983
- -------------------------------------------------------------------------------
1025
+
1026
+ ------------------------------------------------------------------------------------------
984
1027
  (5) → re1 = Bio::RestrictionEnzyme::DoubleStranded.new(enzyme1)
985
1028
 
986
1029
  ^^^ add this? hmmmm
987
1030
  ^^^ from here.
988
- -------------------------------------------------------------------------------
1031
+ ------------------------------------------------------------------------------------------
989
1032
  (1) → Colourize exon/intron boundaries.
990
- -------------------------------------------------------------------------------
1033
+ ------------------------------------------------------------------------------------------
991
1034
  (2) → In bioroebe: enhance phylogeny stuff and perhaps automatically
992
1035
  generate pictures here.
993
- -------------------------------------------------------------------------------
1036
+ ------------------------------------------------------------------------------------------
994
1037
  (1) → In sinatra: add a backtranseq entry point, perhaps
995
1038
  alias it as well.
996
1039
 
@@ -1000,7 +1043,7 @@ bioroebe --protein-to-dna
1000
1043
 
1001
1044
  ^^^ this shall start the GTK3 variant
1002
1045
 
1003
- -------------------------------------------------------------------------------
1046
+ ------------------------------------------------------------------------------------------
1004
1047
  (1) → require 'rubygems/text'
1005
1048
  include Gem::Text
1006
1049
  levenshtein_distance 'shevy', 'chevy' # => 1
@@ -1012,13 +1055,13 @@ bioroebe --protein-to-dna
1012
1055
  https://github.com/rubygems/rubygems/blob/master/lib/rubygems/text.rb
1013
1056
  ^^^ actually move that part into bioroebe itself...
1014
1057
 
1015
- -------------------------------------------------------------------------------
1058
+ ------------------------------------------------------------------------------------------
1016
1059
  (1) → add _source to all APIs in sinatra there. Ensure that this works
1017
1060
  too. The user should be able to view the source code.
1018
1061
  ^^^ it has been added for 2 methods so far in sinatra; we need
1019
1062
  to add it for the remaining ones too. Then we can remove
1020
1063
  this entry point.
1021
- -------------------------------------------------------------------------------
1064
+ ------------------------------------------------------------------------------------------
1022
1065
  (2) → Check out expasy
1023
1066
  PeptideCutter
1024
1067
  also offer this functionality, through commandline, GUI
@@ -1026,16 +1069,12 @@ bioroebe --protein-to-dna
1026
1069
  https://web.expasy.org/peptide_cutter/
1027
1070
  We now have added trypsin but we should add more here; and
1028
1071
  still have to add support for sinatra here.
1029
- -------------------------------------------------------------------------------
1072
+ ------------------------------------------------------------------------------------------
1030
1073
  (3) → melting temperature subsection
1031
1074
 
1032
1075
  hmmm .... molecular weight calculation works now ... but
1033
1076
  ... is it correct for a ssDNA string? hmm...
1034
- -------------------------------------------------------------------------------
1035
- (3) → Add useful formulas for bioshell.
1036
-
1037
-
1038
- ...........................................................................
1077
+ ------------------------------------------------------------------------------------------
1039
1078
  (1) → Degenerate Primers
1040
1079
 
1041
1080
  You can try to determine the degenerate primers via the Shell
@@ -1046,7 +1085,7 @@ bioroebe --protein-to-dna
1046
1085
  ^^^ expand that subsection
1047
1086
  more explanations and examples
1048
1087
 
1049
- -------------------------------------------------------------------------------
1088
+ ------------------------------------------------------------------------------------------
1050
1089
  (1) → Copy the functionality of plotorf:
1051
1090
 
1052
1091
  See:
@@ -1062,7 +1101,7 @@ bioroebe --protein-to-dna
1062
1101
 
1063
1102
 
1064
1103
 
1065
- -------------------------------------------------------------------------------
1104
+ ------------------------------------------------------------------------------------------
1066
1105
  (2) → Start nucleotide position is at: 142
1067
1106
 
1068
1107
  See the following example:
@@ -1072,24 +1111,24 @@ bioroebe --protein-to-dna
1072
1111
  BIO SHELL>
1073
1112
  ^^^ this does not work; nothing is highlighted.
1074
1113
 
1075
- -------------------------------------------------------------------------------
1114
+ ------------------------------------------------------------------------------------------
1076
1115
  (2) → Add a myristoylation signal
1077
1116
 
1078
1117
  Met-Gly-Xaa-Xaa-YXaa-Ser/Thr-Lys-Lys
1079
1118
 
1080
1119
  ^^^ but check first.
1081
1120
 
1082
- -------------------------------------------------------------------------------
1121
+ ------------------------------------------------------------------------------------------
1083
1122
  (3) → integrate the bioroebe_tutorial.cgi into the .md file completely.
1084
1123
 
1085
- -------------------------------------------------------------------------------
1124
+ ------------------------------------------------------------------------------------------
1086
1125
  (4) → Integrate everything from the biopython tutorial, if it makes
1087
1126
  sense.
1088
1127
 
1089
- -------------------------------------------------------------------------------
1128
+ ------------------------------------------------------------------------------------------
1090
1129
  (5) → Improve the codon-optimizer in Bioroebe, including the
1091
1130
  documentation. We need to make this really useful.
1092
- -------------------------------------------------------------------------------
1131
+ ------------------------------------------------------------------------------------------
1093
1132
  (6) →
1094
1133
  5'- TACACGGCACAT -3'
1095
1134
  3'- ATGTGCCGTGTA -5'
@@ -1098,7 +1137,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1098
1137
 
1099
1138
  ^^^ integrate mirror repeats creation
1100
1139
  and searching for them. Hmmm.
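A rough sketch of how creating and detecting perfect mirror repeats might look, based on the example strand shown above. Both method names are hypothetical and do not refer to existing bioroebe methods. The sketch assumes a perfect repeat on a single strand and does not attempt to handle imperfect repeats.

```ruby
# Hypothetical helpers (not part of bioroebe).

# A perfect mirror repeat reads the same in both directions on one strand.
def mirror_repeat?(sequence)
  normalized = sequence.upcase
  !normalized.empty? && normalized == normalized.reverse
end

# Builds a perfect mirror repeat by appending the reversed half-site.
def build_mirror_repeat(half_site)
  half = half_site.upcase
  half + half.reverse
end

puts build_mirror_repeat('TACACG')
puts mirror_repeat?('TACACGGCACAT')
```

Note that this differs from a restriction-site palindrome such as `GCTAGC`, which matches its reverse complement rather than its plain reverse.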
1101
- -------------------------------------------------------------------------------
1140
+ ------------------------------------------------------------------------------------------
1102
1141
  (7) → continue porting bioroebe/taxonomy
1103
1142
 
1104
1143
  ^^^^^^^^^^
@@ -1108,12 +1147,12 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1108
1147
  ^^^ this is the next step, so that
1109
1148
  we no longer need this.
1110
1149
 
1111
- -------------------------------------------------------------------------------
1150
+ ------------------------------------------------------------------------------------------
1112
1151
  (8) → find out which bacteria all contain the needle complex; find out
1113
1152
  the sequence for the needle complex as well and study it;
1114
1153
  find the positions of the genes responsible.
1115
1154
 
1116
- -------------------------------------------------------------------------------
1155
+ ------------------------------------------------------------------------------------------
1117
1156
  (9) → Add trypsin_digest, also in the shell, but possibly
1118
1157
  on toplevel as well (if the input is a protein sequence).
1119
1158
 
@@ -1127,29 +1166,24 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1127
1166
  And document it; but do not digest if a proline
1128
1167
  follows !!!
1129
1168
  ^^^ document this too into .md
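The proline rule mentioned above could be sketched as follows. This is only an illustration under the usual simplified trypsin rule (cleave C-terminal to K or R unless the next residue is P). The method name `trypsin_fragments` is hypothetical and is not the existing bioroebe trypsin code.

```ruby
# Hypothetical helper (not part of bioroebe): simplified in-silico trypsin
# digest. Cleaves after lysine (K) or arginine (R), except before proline (P).
def trypsin_fragments(protein)
  residues  = protein.upcase.chars
  fragments = []
  current   = +''
  residues.each_with_index do |residue, index|
    current << residue
    next_residue = residues[index + 1]
    if %w[K R].include?(residue) && next_residue != 'P'
      fragments << current
      current = +''
    end
  end
  # Keep a trailing fragment that does not end in K or R.
  fragments << current unless current.empty?
  fragments
end

p trypsin_fragments('MKAARPGKLR')
```

In the example input the `RP` pair stays uncut, so the arginine ends up inside the middle fragment.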
1130
-
1131
- -------------------------------------------------------------------------------
1132
- (10) → in bioroebe, add a commassie check... do we include
1133
- arginine or not.
1134
-
1135
- ..........................................................................
1169
+ ------------------------------------------------------------------------------------------
1136
1170
  (11) → add codon usage in bioroebe
1137
- -------------------------------------------------------------------------------
1171
+ ------------------------------------------------------------------------------------------
1138
1172
  (12) → Clone the following functionality.
1139
1173
 
1140
1174
  http://www.bioinformatics.nl/cgi-bin/emboss/help/sirna
1141
- -------------------------------------------------------------------------------
1175
+ ------------------------------------------------------------------------------------------
1142
1176
  (13) → Improve the "find and scan" subsection. We must be able to find
1143
1177
  subsequences; check for "matches" as well, including the bioshell.
1144
- -------------------------------------------------------------------------------
1178
+ ------------------------------------------------------------------------------------------
1145
1179
  (14) → Clone the CLUSTAL format alignment.
1146
- -------------------------------------------------------------------------------
1180
+ ------------------------------------------------------------------------------------------
1147
1181
  (15) → We need to be able to load up a whole genome into bioroebe,
1148
1182
  and then be able to manipulate it.
1149
1183
 
1150
1184
  ^^^ perhaps test this with some example
1151
1185
  data or so...
1152
- -------------------------------------------------------------------------------
1186
+ ------------------------------------------------------------------------------------------
1153
1187
  (16) → Restriction enzymes:
1154
1188
 
1155
1189
  Add a subsection about restriction enzymes including
@@ -1163,7 +1197,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1163
1197
  general, so that we can reproduce and verify the
1164
1198
  information there.
1165
1199
 
1166
- -------------------------------------------------------------------------------
1200
+ ------------------------------------------------------------------------------------------
1167
1201
  (18) → clone pepinfo
1168
1202
 
1169
1203
  The program "pepinfo" plots various amino acid properties in
@@ -1181,7 +1215,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1181
1215
 
1182
1216
  The data are also written out to an output file.
1183
1217
 
1184
- -------------------------------------------------------------------------------
1218
+ ------------------------------------------------------------------------------------------
1185
1219
  (19) → gff?
1186
1220
 
1187
1221
  There are 6 .gff3 files in the current directory.
@@ -1193,23 +1227,22 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1193
1227
 
1194
1228
  ^^^ we need an analyze-mode as well.
1195
1229
 
1196
- ..........................................................................
1230
+ ------------------------------------------------------------------------------------------
1197
1231
  (20) → ^^^^ add the ability to
1198
1232
  show a ruler AND highlighting as well
1199
1233
  ^^^ then document it.
1200
- ..........................................................................
1234
+ ------------------------------------------------------------------------------------------
1201
1235
  (21) → https://github.com/bioperl/bioperl-live
1202
1236
  Look what we can take from ^^^.
1203
1237
 
1204
1238
  https://github.com/bioperl/bioperl-live/tree/master/examples
1205
1239
 
1206
- ..........................................................................
1240
+ ------------------------------------------------------------------------------------------
1207
1241
  (23) → continue biojava, and bioroebe a bit
1208
1242
 
1209
1243
  Ideally we should have biojava to a working point.
1210
- ..........................................................................
1211
- (24) → Clone all of Emboss. :)
1212
- ..........................................................................
1244
+
1245
+ ------------------------------------------------------------------------------------------
1213
1246
  (25) → clone the functionality found at https://web.expasy.org/protparam/
1214
1247
 
1215
1248
  https://web.expasy.org/cgi-bin/protparam/protparam
@@ -1219,7 +1252,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1219
1252
 
1220
1253
  Theoretical pI: 5.78
1221
1254
 
1222
- -------------------------------------------------------------------------------
1255
+ ------------------------------------------------------------------------------------------
1223
1256
  (27) → NP_417539.1
1224
1257
 
1225
1258
  https://www.ncbi.nlm.nih.gov/protein/NP_417539.1
@@ -1227,26 +1260,26 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1227
1260
 
1228
1261
  ^^^ if the input is exactly like the above, on the first line,
1229
1262
  download the sequence.
1230
- -------------------------------------------------------------------------------
1263
+ ------------------------------------------------------------------------------------------
1231
1264
  (28) → Integrate these nice GUI parts:
1232
1265
 
1233
1266
  https://dev.to/kojix2/introduction-to-gr-rb-data-visualization-with-ruby-2c39
1234
1267
 
1235
- -------------------------------------------------------------------------------
1268
+ ------------------------------------------------------------------------------------------
1236
1269
  (29) → http://insilico.ehu.es/
1237
1270
 
1238
1271
  ^^^ check if we have all of this incorporated
1239
- -------------------------------------------------------------------------------
1272
+ ------------------------------------------------------------------------------------------
1240
1273
  (30) → http://www.biostars.org/
1241
1274
 
1242
1275
  ^^^ regularly work through this
1243
1276
  and try to help
1244
1277
  and extend bioruby at the same time.
1245
- -------------------------------------------------------------------------------
1278
+ ------------------------------------------------------------------------------------------
1246
1279
  (31) → The taxonomy-submodule should work one day, and be properly
1247
1280
  documented as well. Perhaps integrate the parts of Taxonomy
1248
1281
  that can be included into the toplevel domain.
1249
- -------------------------------------------------------------------------------
1282
+ ------------------------------------------------------------------------------------------
1250
1283
  (32) → Enable:
1251
1284
 
1252
1285
  Bioroebe.set_genetic_code()
@@ -1262,7 +1295,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1262
1295
 
1263
1296
  ^^^ enable this as well; extend the documentation too.
1264
1297
 
1265
- -------------------------------------------------------------------------------
1298
+ ------------------------------------------------------------------------------------------
1266
1299
  (34) → We have found a restriction enzyme called NheI.
1267
1300
 
1268
1301
  The sequence this 6-cutter relates to is: `5' - GCTAGC - 3'`
@@ -1270,23 +1303,23 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1270
1303
  This restriction enzyme will produce a blunt overhang.
1271
1304
 
1272
1305
  ^^^ nope, this is wrong
1273
- -------------------------------------------------------------------------------
1306
+ ------------------------------------------------------------------------------------------
1274
1307
  (35) → Sau3A?
1275
1308
  ^^^ enable this restriction site
1276
1309
 
1277
- -------------------------------------------------------------------------------
1310
+ ------------------------------------------------------------------------------------------
1278
1311
  (37) → Add matplotlib support.
1279
1312
 
1280
1313
  try_to_use_matplotlib
1281
- -------------------------------------------------------------------------------
1314
+ ------------------------------------------------------------------------------------------
1282
1315
  (38) → https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmTools/RESTfulAPIs.html
1283
- -------------------------------------------------------------------------------
1316
+ ------------------------------------------------------------------------------------------
1284
1317
  (39) → The following input:
1285
1318
 
1286
1319
  downcase; orf?; seq?
1287
1320
 
1288
1321
  leads to strange display. Something is wrong here, must be checked.
1289
- -------------------------------------------------------------------------------
1322
+ ------------------------------------------------------------------------------------------
1290
1323
  (40) → Continue with rosalind problems.
1291
1324
 
1292
1325
  These challenges can be found here:
@@ -1295,42 +1328,42 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1295
1328
 
1296
1329
  Also integrate these rosalind-quizzes into bioroebe
1297
1330
  when possible.
1298
- -------------------------------------------------------------------------------
1331
+ ------------------------------------------------------------------------------------------
1299
1332
  (41) → https://web.expasy.org/cgi-bin/peptide_mass/peptide-mass.pl
1300
1333
 
1301
1334
  ^^^ make the above usable in sinatra as well
1302
- -------------------------------------------------------------------------------
1335
+ ------------------------------------------------------------------------------------------
1303
1336
  (42) → Integrate a way to search for commonly known promoters:
1304
1337
 
1305
1338
  promoters?
1306
1339
  ^^^ this functionality
1307
1340
  ^^^ this has to be expanded
1308
1341
  and ...
1309
- -------------------------------------------------------------------------------
1342
+ ------------------------------------------------------------------------------------------
1310
1343
  (43) → Integrate:
1311
1344
 
1312
1345
  http://biotools.nubic.northwestern.edu/OligoCalc.html
1313
- -------------------------------------------------------------------------------
1346
+ ------------------------------------------------------------------------------------------
1314
1347
  (44) → Extend the Java part of BioRoebe systematically..
1315
1348
 
1316
1349
  What should come next? Let's make a list.
1317
1350
 
1318
1351
  → remove_numbers [DONE]
1319
- -------------------------------------------------------------------------------
1352
+ ------------------------------------------------------------------------------------------
1320
1353
  (46) → Study gnuplot; one day we have to draw graphs.
1321
1354
 
1322
- -------------------------------------------------------------------------------
1355
+ ------------------------------------------------------------------------------------------
1323
1356
  (47) → Add a genome browser, both ascii without GUI and also
1324
1357
  with. In ruby-gtk.
1325
- -------------------------------------------------------------------------------
1358
+ ------------------------------------------------------------------------------------------
1326
1359
  (48) → Clone the functionality of:
1327
1360
 
1328
1361
  http://www.biophp.org/minitools/restriction_digest/demo.php
1329
- -------------------------------------------------------------------------------
1362
+ ------------------------------------------------------------------------------------------
1330
1363
  (50) → Add the loxP sequence to readme [DONE] and explain this
1331
1364
  better on the main readme; and perhaps also assign
1332
1365
  the sequence via the bioshell.
1333
- -------------------------------------------------------------------------------
1366
+ ------------------------------------------------------------------------------------------
1334
1367
  (51) → 33. Cephalodiscidae Mitochondrial UAA-Tyr Code (transl_table=33)
1335
1368
 
1336
1369
  AAs = FFLLSSSSYYY*CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSSKVVVVAAAADDEEGGGG
@@ -1341,7 +1374,7 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1341
1374
 
1342
1375
  ^^^ add a parser, and document it, that can take this input
1343
1376
  and output the corresponding code, in a valid .yml file.
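A possible starting point for such a parser, sketched under one assumption that is not visible in this excerpt: the 64 letters of the `AAs` line follow the standard NCBI codon order, in which Base1, Base2 and Base3 each cycle through T, C, A, G with Base3 changing fastest. The method name `codon_table_from_ncbi` is hypothetical.

```ruby
require 'yaml'

# Assumed NCBI base order (Base1 varies slowest, Base3 fastest).
NCBI_BASE_ORDER = %w[T C A G].freeze

# Hypothetical helper (not part of bioroebe): maps every codon to the
# one-letter amino acid found at the same index of the AAs string.
def codon_table_from_ncbi(aas)
  raise ArgumentError, 'expected 64 amino acid letters' unless aas.length == 64

  codons = NCBI_BASE_ORDER.product(NCBI_BASE_ORDER, NCBI_BASE_ORDER).map(&:join)
  codons.zip(aas.chars).to_h
end

table_33 = codon_table_from_ncbi(
  'FFLLSSSSYYY*CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSSKVVVVAAAADDEEGGGG'
)

# Emit the mapping as YAML, for example to be saved into a .yml file.
puts table_33.to_yaml
```

A fuller parser would read the `Base1`, `Base2` and `Base3` lines directly instead of assuming their order, and would also record the start codons.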
1344
- -------------------------------------------------------------------------------
1377
+ ------------------------------------------------------------------------------------------
1345
1378
  (52) → Add to bioroebe the ability to add cloning vectors
1346
1379
  and molecular_weight calculation
1347
1380
  for this
@@ -1363,19 +1396,19 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1363
1396
 
1364
1397
  ^^^ we also need a way to find out what resistance genes
1365
1398
  are carried there.
1366
- -------------------------------------------------------------------------------
1399
+ ------------------------------------------------------------------------------------------
1367
1400
  (53) → In the lambda genome sequence there are 10 EcoB and
1368
1401
  5 EcoK sites.
1369
1402
  ^^^ verify this too, as an example as well
1370
- -------------------------------------------------------------------------------
1403
+ ------------------------------------------------------------------------------------------
1371
1404
  (54) → show restriction sites, composable and compatible with
1372
1405
  serial clone ... hmm
1373
- -------------------------------------------------------------------------------
1406
+ ------------------------------------------------------------------------------------------
1374
1407
  (55) → enable:
1375
1408
  BIOROEBE_USE_COLOURS:
1376
1409
  can be 0 or 1
1377
1410
  what is this?
1378
- -------------------------------------------------------------------------------
1411
+ ------------------------------------------------------------------------------------------
1379
1412
  (56) → Burrows-Wheeler-Transform (BWT)
1380
1413
 
1381
1414
  ^^^ add some method here
@@ -1388,15 +1421,15 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1388
1421
 
1389
1422
  also test this against my paper-result
1390
1423
  with input being: "GATAG$".
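A naive sketch of the transform that could be used to check the hand-computed result. It sorts all rotations of the input, which is fine for short test strings but not for genome-scale data. The method name is hypothetical, and the sketch assumes the sentinel `$` sorts before all nucleotide letters (which is the case for plain byte-wise string comparison in Ruby).

```ruby
# Hypothetical helper (not part of bioroebe): naive Burrows-Wheeler
# transform via sorted rotations. The input is expected to end with a
# unique sentinel such as '$'.
def burrows_wheeler_transform(text)
  rotations = Array.new(text.length) { |i| text[i..-1] + text[0...i] }
  # The transform is the last column of the sorted rotation matrix.
  rotations.sort.map { |rotation| rotation[-1] }.join
end

puts burrows_wheeler_transform('GATAG$')
# prints "GTGA$A"
```

For larger inputs a suffix-array based construction would be the more realistic choice.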
1391
- -------------------------------------------------------------------------------
1424
+ ------------------------------------------------------------------------------------------
1392
1425
  (56) → Enable working with several genes... hmm and store that somewhere.
1393
1426
  Something like a per-project workspace thingy.
1394
- -------------------------------------------------------------------------------
1427
+ ------------------------------------------------------------------------------------------
1395
1428
  (57) → Add:
1396
1429
 
1397
1430
  http://nar.oxfordjournals.org/content/35/suppl_2/W71.long
1398
1431
 
1399
- -------------------------------------------------------------------------------
1432
+ ------------------------------------------------------------------------------------------
1400
1433
  (58) → Now, you may want to translate the nucleotides up to
1401
1434
  the first in frame stop codon, and then stop (as
1402
1435
  happens in nature):
@@ -1410,14 +1443,14 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1410
1443
  Then continue from here:
1411
1444
 
1412
1445
  https://people.duke.edu/~ccc14/pcfb/biopython/BiopythonSequences.html
1413
- -------------------------------------------------------------------------------
1446
+ ------------------------------------------------------------------------------------------
1414
1447
  (59) → Add:
1415
1448
 
1416
1449
  set_dna :Ubiquitin
1417
1450
  set_dna :ubiquitin
1418
1451
 
1419
1452
  ^^^ we want to obtain the ubiquitin sequence
1420
- -------------------------------------------------------------------------------
1453
+ ------------------------------------------------------------------------------------------
1421
1454
  (59) → Telomeres
1422
1455
 
1423
1456
  Telomeres are listed from 5' to 3'.
@@ -1431,28 +1464,28 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1431
1464
  doc_telomeres
1432
1465
 
1433
1466
  ^^^ add this to say the human telomere sequence
1434
- -------------------------------------------------------------------------------
1467
+ ------------------------------------------------------------------------------------------
1435
1468
  (60) → ORF_positions?
1436
1469
  ^^^ change this a bit, to actually show the positions
1437
1470
  of the various ORFs with the start-position.
1438
- -------------------------------------------------------------------------------
1471
+ ------------------------------------------------------------------------------------------
1439
1472
  (62) → add:
1440
1473
 
1441
1474
  setgene2
1442
1475
  add_dna2
1443
1476
  dna2
1444
1477
  dna? <--- this one is not a setter but a query.
1445
- -------------------------------------------------------------------------------
1478
+ ------------------------------------------------------------------------------------------
1446
1479
  (63) → improve the TM calculation. must be better, must have more
1447
1480
  documentation, and a small tutorial.
1448
- -------------------------------------------------------------------------------
1481
+ ------------------------------------------------------------------------------------------
1449
1482
  (64) → Compare bioroebe to:
1450
1483
 
1451
1484
  https://www.ncbi.nlm.nih.gov/orffinder
1452
1485
 
1453
1486
  whether both return the same
1454
1487
  also possibly add a web-gui
1455
- -------------------------------------------------------------------------------
1488
+ ------------------------------------------------------------------------------------------
1456
1489
  (65) → Find out ratios from:
1457
1490
 
1458
1491
  Doolittle RF. 1989. Redundancies in protein sequences. I
@@ -1478,16 +1511,16 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1478
1511
  Bioroebe::Blosum[50] as an API.
1479
1512
  and document it in general.
1480
1513
 
1481
- ..........................................................................
1514
+ ------------------------------------------------------------------------------------------
1482
1515
  (65) → http://www.biomart.org/other/user-docs.pdf
1483
1516
  ^^^ work through this
1484
- -------------------------------------------------------------------------------
1517
+ ------------------------------------------------------------------------------------------
1485
1518
  (66) → add:
1486
1519
 
1487
1520
  class Cell
1488
1521
  ^^^ simulate a cell
1489
1522
  Hmmm. Needs specific components ... and needs a better plan.
1490
- -------------------------------------------------------------------------------
1523
+ ------------------------------------------------------------------------------------------
1491
1524
  (68) → class Protein:
1492
1525
 
1493
1526
  add glycosylation pattern
@@ -1496,18 +1529,18 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1496
1529
  need to somehow add the modification type
1497
1530
 
1498
1531
  https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5358406/
1499
- -------------------------------------------------------------------------------
1532
+ ------------------------------------------------------------------------------------------
1500
1533
  (69) → In the BioShell we must be able to do probes - completementary
1501
1534
  to amino acids.
1502
- -------------------------------------------------------------------------------
1535
+ ------------------------------------------------------------------------------------------
1503
1536
  (70) → Add www-related functionality to bioroebe; eventually make use
1504
1537
  of rails, but start with sinatra possibly. In the long run,
1505
1538
  make it flexible to work with as many different frameworks
1506
1539
  as possible, though.
1507
- -------------------------------------------------------------------------------
1540
+ ------------------------------------------------------------------------------------------
1508
1541
  (71) → Show cleavage sites, for example for a lambda-DNA digest with
1509
1542
  Bgl II.
1510
- -------------------------------------------------------------------------------
1543
+ ------------------------------------------------------------------------------------------
1511
1544
  (72) → dnaanalyze
1512
1545
 
1513
1546
  In the DNA string `TCCGTCGCAACACATCGCCTCAACAAACCGACCGGGATATGCAATACCGGAATCCGATCCTTTAGAAGCTGCATTCCAAACGCTTGCAATAACACCCACTCGACTATTCAGCATTGGCAAAGGGTACGAATTCGACGAAGGGAGGGTGCTATATTTTCCAAGTTGCTCGCCGATTGATACGGAGCCTGTGGAAAGATTTCGCGGCTCTAGTCTTTAGCTTTGATGTCACCCCTGAGTAGTAACCCGGCGTGGTAGCTTTCATTAGACTTCTCGGAGAGAGTATTAAGCAAAGGTGGAGGTCCCAGGGGTCCAGTGAGCTGTATCGCACTAAAAGCATGCCTACGGGCAATGCTATTTTGCTCACAGGAACTTTGGGGGAGCCACAAACTCTCGAAGCCGGATTGTTGTGGCGGCTAACTTTCCAAAGGCGACCATTCATGGTCTGAATGGGCCCTCACCAGAAGAACGTTTTCGACGGGCATTCTTCCCCGGGGTTTCGAAGGCAAGGGTCAGCACGGCGCGGAAAAGTACGCGACGCATACCGGACTAGTCATGCAACTCCCTCGGAACTGGCGATTCCCACCCAAGAGACGCACGCTGATCATTGCCCATGCCGACTGGAGATGCTGAATTTGGTATGCGGGTCTGTTGCCAGCGCTGACATTATCGGACATTGTGGGGAGAACCGTGTGATTGATTGAGCTGGCGCATTTGTCCGCATGCTCTCCTCATGTGGACACCTTCGCAGGTTCTTTCCGCGGCCACAGTGTCGGGATCTACCCCTGGTGCGTCGCCGCGAGTACAGGTGGGGTTTCGCGCATGAGAACCAATGTTGCACGCCTCAAAACATGGCTGTAACATATTAGCGCCAATAAAAATTTTTGGCAACAAAGAAACAAGGCCAACCGAAGTGCTAAGCCGCGATCATGAAGGGGCGATGCCAGAATGGGAGTCTGCCTTTCCTGTGTGGACGTGAGATTGTACCTAGACAGAGAACGCC` we found these Nucleotides:
@@ -1532,11 +1565,11 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1532
1565
  we need to make it so that an input sequence
1533
1566
  can be assigned, and dnaanalyse --GUI should
1534
1567
  start it too. ALSO document it once this works.
1535
- -------------------------------------------------------------------------------
1568
+ ------------------------------------------------------------------------------------------
1536
1569
  (73) → go through the individual components slowly and improve them,
1537
1570
  step by step, including the documentation. Then eventually
1538
1571
  remove this todo-entry here.
1539
- -------------------------------------------------------------------------------
1572
+ ------------------------------------------------------------------------------------------
1540
1573
  (74) → Add a consensus sequence for:
1541
1574
 
1542
1575
  Asn-X-Ser/Thr consensus
@@ -1548,13 +1581,13 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1548
1581
  NGlyc
1549
1582
  /N-?Glyc/i
1550
1583
  ^^^ use that regex
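Separately from the name-matching regex above, locating the Asn-X-Ser/Thr consensus in a protein sequence could be sketched like this. The method name is hypothetical. The sketch uses the commonly stated rule that X may be any residue except proline, which is an assumption added here and not taken from the todo entry.

```ruby
# Hypothetical helper (not part of bioroebe): returns the 1-based positions
# of every Asn in an N-X-S/T sequon, where X is any residue except proline.
def n_glycosylation_sites(protein)
  positions = []
  # The lookahead keeps matches zero-width, so overlapping sequons are found.
  protein.upcase.scan(/(?=N[^P][ST])/) do
    positions << Regexp.last_match.begin(0) + 1
  end
  positions
end

p n_glycosylation_sites('MNGTANPSNAS')
```

In the example, the `NPS` stretch is skipped because of the proline, while `NGT` and `NAS` are reported.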
1551
- -------------------------------------------------------------------------------
1584
+ ------------------------------------------------------------------------------------------
1552
1585
  (74) → make sure that newly generated files respect the
1553
1586
  default chmod value on the system. from bioroebe.
1554
1587
  right now we default to 755 which I assume is
1555
1588
  hardcoded but perhaps this is wrong.
1556
1589
 
1557
- -------------------------------------------------------------------------------
1590
+ ------------------------------------------------------------------------------------------
1558
1591
  (75) → require 'bio'
1559
1592
 
1560
1593
  # creating a Bio::Sequence::NA object containing ambiguous alphabets
@@ -1579,34 +1612,34 @@ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1579
1612
  part into a standalone file
1580
1613
  so that it can be used by both the .cgi and
1581
1614
  well rdoc...
1582
- -------------------------------------------------------------------------------
1615
+ ------------------------------------------------------------------------------------------
1583
1616
  - Add more protein-specific thingies to bioroebe.
1584
- -------------------------------------------------------------------------------
1617
+ ------------------------------------------------------------------------------------------
1585
1618
  - Push the bioshell forward and work through std_biology.rb.
1586
1619
  Perhaps we can move some of it out into a class
1587
1620
  or so.
1588
1621
  
1589
1622
  All of this should also be linked with Webmin (biomin), so that
1590
1623
  we can also use the Bioshell elegantly over the www!
1591
- -------------------------------------------------------------------------------
1624
+ ------------------------------------------------------------------------------------------
1592
1625
  - ^^^ when we find restriction enzyme sites in a DNA
1593
1626
  string, colourize them RED.
1594
1627
 
1595
1628
  also set it to
1596
1629
  set_restriction_size()
1597
- -------------------------------------------------------------------------------
1630
+ ------------------------------------------------------------------------------------------
1598
1631
  - also ... while learning C++ we extend the project here...
1599
1632
  Useful C++ things will be combined.
1600
- -------------------------------------------------------------------------------
1633
+ ------------------------------------------------------------------------------------------
1601
1634
  - As of April 2003, there were 176,890 total taxa represented.
1602
1635
 
1603
1636
  ^^^ we need a way to also output how many entries we
1604
1637
  have there.
1605
- -------------------------------------------------------------------------------
1638
+ ------------------------------------------------------------------------------------------
1606
1639
  - Replace bioruby with bioroebe completely!
1607
1640
  In order for this to work, we first need to find out
1608
1641
  what bioruby is able to do. :P
1609
- -------------------------------------------------------------------------------
1642
+ ------------------------------------------------------------------------------------------
1610
1643
  - append 33
1611
1644
  # ^^^ in the bioshell
1612
1645
  Only numbers were given: Adding 33 random nucleotides to the main string next.
@@ -1626,7 +1659,7 @@ Did you mean? return_random_codon_sequence_for_this_aminoacid_sequence
1626
1659
 
1627
1660
 
1628
1661
  ^^^^^ BUG!
1629
- -------------------------------------------------------------------------------
1662
+ ------------------------------------------------------------------------------------------
1630
1663
  > rest?
1631
1664
 
1632
1665
  We found these restriction sites within the sequence `TTCAGAACTCAACGCCTGGTTGGCCGTCCAGTAAGCTGACTAAGTAAGTCTATGCCCGCGATAACCAGGATACAGATATCGTGAAACCTGGTTTATCTCCTTCTATAAGAGTCTGCACATCTAGC`:
@@ -1656,7 +1689,7 @@ We found these restriction sites within the sequence `TTCAGAACTCAACGCCTGGTTGGCCG
1656
1689
  ^^^^ also show the position
1657
1690
 
1658
1691
 
1659
- -------------------------------------------------------------------------------
1692
+ ------------------------------------------------------------------------------------------
1660
1693
 
1661
1694
  PMID entries are:
1662
1695
 
@@ -1728,7 +1761,7 @@ We found these restriction sites within the sequence `TTCAGAACTCAACGCCTGGTTGGCCG
1728
1761
 
1729
1762
 
1730
1763
 
1731
- ..........................................................................
1764
+ ------------------------------------------------------------------------------------------
1732
1765
  In the database search, the measured masses are compared with the peptide masses
1733
1766
  of all proteins or genes in a database (NCBI, Uniprot). For this, DNA
1734
1767
  sequences are translated into protein sequences and cut in silico with the
@@ -1738,7 +1771,7 @@ protease used in the digest.
1738
1771
 
1739
1772
 
1740
1773
 
1741
- ..........................................................................
1774
+ ------------------------------------------------------------------------------------------
1742
1775
  Complexity of libraries:
1743
1776
  How many independent clones are necessary to represent a genome (plant,
1744
1777
  animal/fungus) or how many such clones have to be screened to have realistic
@@ -1773,7 +1806,7 @@ have to be hybridized.
1773
1806
 
1774
1807
 
1775
1808
 
1776
- ..........................................................................
1809
+ ------------------------------------------------------------------------------------------
1777
1810
 
1778
1811
  BIO SHELL> BglI?
1779
1812
 
@@ -1818,12 +1851,12 @@ List all enzymes that produce compatible ends for the enzyme.
1818
1851
  http://biopython.org/DIST/docs/api/Bio.Restriction.Restriction.Blunt-class.html
1819
1852
 
1820
1853
 
1821
- ..........................................................................
1854
+ ------------------------------------------------------------------------------------------
1822
1855
  https://www.reddit.com/r/bioinformatics/comments/5o3kn8/bioinformatics_contest_2017_jan_23rd29th_solve_as/
1823
- ..........................................................................
1856
+ ------------------------------------------------------------------------------------------
1824
1857
  (1) → Finish all of biophp integration into bioroebe.
1825
1858
  http://www.biophp.org/
1826
- -------------------------------------------------------------------------------
1859
+ ------------------------------------------------------------------------------------------
1827
1860
 
1828
1861
  locate oriC here:
1829
1862
 
@@ -1858,13 +1891,13 @@ But I do not know how to locate ORIs.
1858
1891
 
1859
1892
 
1860
1893
 
1861
- -------------------------------------------------------------------------------
1894
+ ------------------------------------------------------------------------------------------
1862
1895
  ^^^ also integrate git into bioroebe.
1863
- -------------------------------------------------------------------------------
1896
+ ------------------------------------------------------------------------------------------
1864
1897
  WE HAVE TO IMPROVE THIS A LOT.
1865
1898
 
1866
1899
  THEN UPLOAD IT AND USE IT AS A BASIS FOR APPLICATIONS.
1867
- -------------------------------------------------------------------------------
1900
+ ------------------------------------------------------------------------------------------
1868
1901
 
1869
1902
  Study MetaCyc
1870
1903
  ^^^ study metabolic pathways.
@@ -1873,7 +1906,7 @@ http://metacyc.org/
1873
1906
 
1874
1907
  → Create KuroMetaCyc, by analogy with Metabolic Cycle.
1875
1908
 
1876
- -------------------------------------------------------------------------------
1909
+ ------------------------------------------------------------------------------------------
1877
1910
 
1878
1911
  Welcome to BioShell May 2012. Type "help" to get some help.
1879
1912
 
@@ -1895,7 +1928,7 @@ When we type this, we then ask:
1895
1928
 
1896
1929
 
1897
1930
 
1898
- -------------------------------------------------------------------------------
1931
+ ------------------------------------------------------------------------------------------
1899
1932
 
1900
1933
  http://biopython.org/DIST/docs/cookbook/Restriction.html#mozTocId101269
1901
1934
 
@@ -1985,16 +2018,16 @@ ausreichend.
1985
2018
 
1986
2019
 
1987
2020
 
1988
- ..........................................................................
2021
+ ------------------------------------------------------------------------------------------
1989
2022
  BioTodo - GENESIS, science fiction.
1990
2023
 
1991
2024
  - create virus(:which_one, :amount) # Note the difference to the below
1992
2025
  - create hydra(:amount)
1993
2026
  - create bread
1994
- ..........................................................................
2027
+ ------------------------------------------------------------------------------------------
1995
2028
  → both
1996
2029
  ^ should work, does not work right now.
1997
- ..........................................................................
2030
+ ------------------------------------------------------------------------------------------
1998
2031
  → Taxonomy is now integrated into bioroebe. This is good but we need more
1999
2032
  documentation, some more tests, a rethinking of the layout and the
2000
2033
  structures, and a fixing of the query-part of the database.
@@ -2008,13 +2041,13 @@ ausreichend.
2008
2041
  at about the same time \o/
2009
2042
  AND document these related problems too
2010
2043
  Integrate this some other day...
2011
- ..........................................................................
2044
+ ------------------------------------------------------------------------------------------
2012
2045
  - http://www.restrictionmapper.org/cgi-bin/sitefind3.pl
2013
2046
 
2014
2047
  ^^^ This functionality should be integrated, so that
2015
2048
  ALL restriction enzymes can be tried out, starting from
2016
2049
  a given sequence.
2017
- ..........................................................................
2050
+ ------------------------------------------------------------------------------------------
2018
2051
  → A search is essentially substring search across a database of strings
2019
2052
  (albeit with a smaller alphabet). Some common use cases: one,
2020
2053
  scientists will search for certain genes that they've used in engineered
@@ -2033,13 +2066,13 @@ ausreichend.
2033
2066
  Bioroebe::DetermineOptimalCodons
2034
2067
  ^^^ this is currently incomplete.
2035
2068
 
2036
- ..........................................................................
2069
+ ------------------------------------------------------------------------------------------
2037
2070
  → Redo restrictions enzymes completely.
2038
2071
  And polish this a LOT.
2039
2072
  This may take some days. But we want this to be REALLY good and
2040
2073
  lasting for a long time.
2041
2074
  Need to keep on working at that!
2042
- ..........................................................................
2075
+ ------------------------------------------------------------------------------------------
2043
2076
  → Add: average_aminoacid_weight?
2044
2077
 
2045
2078
 
@@ -2077,7 +2110,7 @@ end
2077
2110
  → We must be able to align not only nucleotides but also aminoacids.
2078
2111
  But where is the alignment comparer? perhaps hamming distance?
2079
2112
  hmm we have to see.
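A minimal sketch of the Hamming-distance idea mentioned above, assuming two sequences of equal length. It compares characters position by position, so the same code works for nucleotides and for amino acids. The method name `hamming_distance` is hypothetical here and is not claimed to be the Bioroebe API.

```ruby
# Hypothetical helper for illustration; not Bioroebe's actual API.
# Counts the positions at which two equally long sequences differ.
def hamming_distance(a, b)
  unless a.length == b.length
    raise ArgumentError, 'sequences must have equal length'
  end
  a.upcase.chars.zip(b.upcase.chars).count { |x, y| x != y }
end

puts hamming_distance('GATTACA', 'GACTATA') # nucleotides → 2
puts hamming_distance('MKVLA', 'MKILA')     # amino acids → 1
```

A Hamming distance only makes sense for ungapped sequences of the same length; a real alignment comparer would also need to handle insertions and deletions.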
2080
- ..........................................................................
2113
+ ------------------------------------------------------------------------------------------
2081
2114
  → /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/menu.rb:311:in `menu': undefined method `upcase' for ["EcoRI"]:Array (NoMethodError)
2082
2115
  from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/user_input.rb:31:in `block in enter_main_loop'
2083
2116
  from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/user_input.rb:12:in `loop'
@@ -2106,12 +2139,12 @@ end
2106
2139
  at this date.'
2107
2140
  SendEmail.new to: Roebe.email?, data
2108
2141
 
2109
- ..........................................................................
2142
+ ------------------------------------------------------------------------------------------
2110
2143
 
2111
2144
 
2112
2145
  → Document which parts of emboss have already been copied.
2113
2146
  → EMBOSS.md
2114
- ..........................................................................
2147
+ ------------------------------------------------------------------------------------------
2115
2148
 
2116
2149
 
2117
2150
 
@@ -2168,7 +2201,7 @@ Traceback (most recent call last):
2168
2201
 
2169
2202
  http://www.snapgene.com/products/snapgene_viewer/
2170
2203
 
2171
- -------------------------------------------------------------------------------
2204
+ ------------------------------------------------------------------------------------------
2172
2205
  (1) → We should support GFP tagging, that is, what the
2173
2206
  protein construct should look like and so on.
2174
2207
  This partially works...
@@ -2177,22 +2210,22 @@ Traceback (most recent call last):
2177
2210
  inserts the sequence as the main DNA sequence.
2178
2211
  What is missing? Hmmmm... possibly some more
2179
2212
  documentation.
2180
- -------------------------------------------------------------------------------
2213
+ ------------------------------------------------------------------------------------------
2181
2214
 
2182
2215
  - in bioroebe, create subsequences for siRNA, then scan for
2183
2216
  submatcher + report where these are. Should be fast too.
2184
- -------------------------------------------------------------------------------
2217
+ ------------------------------------------------------------------------------------------
2185
2218
  - Reverse complement now works quite well, also via the sinatra
2186
2219
  interface. We still should have a way to show 5' and
2187
2220
  3', both on the commandline, and via sinatra.
2188
2221
  Perhaps via --fancy commandline flag or so.
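One way the 5'/3' display could look is sketched below. The method names and the `fancy:` keyword are assumptions made for illustration (loosely mirroring the `--fancy` idea above); they are not taken from Bioroebe itself. The sketch only handles the four unambiguous DNA letters.

```ruby
# Hypothetical helpers for illustration; not Bioroebe's actual API.
# Reverses the strand and swaps each base for its complement.
def reverse_complement(dna)
  dna.upcase.reverse.tr('ATGC', 'TACG')
end

# Optionally decorates the sequence with its 5' and 3' ends.
def show_strand(dna, fancy: false)
  fancy ? "5'-#{dna}-3'" : dna
end

puts show_strand(reverse_complement('ATGC'))              # → GCAT
puts show_strand(reverse_complement('ATGC'), fancy: true) # → 5'-GCAT-3'
```

Because the reverse complement is always written in the 5' to 3' direction, the same decoration can be applied to both the input strand and the result.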
2189
- -------------------------------------------------------------------------------
2222
+ ------------------------------------------------------------------------------------------
2190
2223
  - Cn3D files?
2191
2224
  ^^^ add support for these; research what they are, too.
2192
- -------------------------------------------------------------------------------
2225
+ ------------------------------------------------------------------------------------------
2193
2226
  - Consider adding graphviz, perhaps to the taxonomy project
2194
2227
  where we make graphs towards different nodes or so...
2195
- -------------------------------------------------------------------------------
2228
+ ------------------------------------------------------------------------------------------
2196
2229
  - in parse fasta
2197
2230
  @colourize_sequence = false
2198
2231
  ^^^ change this later on...
@@ -2200,7 +2233,7 @@ Traceback (most recent call last):
2200
2233
  this method now exists, but we still have to make
2201
2234
  the check better whether it is a protein or a DNA/RNA
2202
2235
  add a toplevel method for this.
2203
- -------------------------------------------------------------------------------
2236
+ ------------------------------------------------------------------------------------------
2204
2237
  - clone the BLAST ident matcher functionality for amino acids into
2205
2238
  Bioroebe.
2206
2239
 
@@ -2215,7 +2248,7 @@ Traceback (most recent call last):
2215
2248
 
2216
2249
 
2217
2250
 
2218
- -------------------------------------------------------------------------------
2251
+ ------------------------------------------------------------------------------------------
2219
2252
  - Be able to mark exon/intron boundaries.
2220
2253
 
2221
2254
  - Add "taxid?" to tell us the name of the organism. This works now.
@@ -2259,9 +2292,9 @@ Traceback (most recent call last):
2259
2292
 
2260
2293
  ^^^
2261
2294
  study sumoplot ...
2262
- -------------------------------------------------------------------------------
2295
+ ------------------------------------------------------------------------------------------
2263
2296
  - http://a-little-book-of-r-for-bioinformatics.readthedocs.io/en/latest/src/chapter7.html
2264
- -------------------------------------------------------------------------------
2297
+ ------------------------------------------------------------------------------------------
2265
2298
  - http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc22
2266
2299
  ^^^ continue here; "You can also specify the table using the
2267
2300
  NCBI table number which is shorter, and often included in
@@ -2269,7 +2302,7 @@ Traceback (most recent call last):
2269
2302
 
2270
2303
  ^^^ work through this and see if it is good.
2271
2304
 
2272
- -------------------------------------------------------------------------------
2305
+ ------------------------------------------------------------------------------------------
2273
2306
 
2274
2307
  - Clone ALL of biophp, if it is useful.
2275
2308
 
@@ -2316,7 +2349,7 @@ Palindromic sequences finder
2316
2349
  We should also put this part into the doc/ subsection
2317
2350
  to keep track of what is missing and what is not.
2318
2351
 
2319
- -------------------------------------------------------------------------------
2352
+ ------------------------------------------------------------------------------------------
2320
2353
  (1) → sizeseq
2321
2354
 
2322
2355
  ^^^ clone this functionality and describe it in detail.
@@ -2353,7 +2386,7 @@ foobar.fasta
2353
2386
 
2354
2387
  ALSO ADD A GUI; sizeseq.rb was added in February 2021.
2355
2388
 
2356
- -------------------------------------------------------------------------------
2389
+ ------------------------------------------------------------------------------------------
2357
2390
  - In the sinatra-web-interface for Bioroebe:
2358
2391
  continue quiz in rosalind !!!
2359
2392
  also, at to_dna: default to RNA
@@ -2372,8 +2405,8 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
2372
2405
  → formatted_view
2373
2406
  ^^^^ in ncbi format
2374
2407
  and document all of this.
2375
- ..........................................................................
2376
- -------------------------------------------------------------------------------
2408
+ ------------------------------------------------------------------------------------------
2409
+ ------------------------------------------------------------------------------------------
2377
2410
  - Add a ruby-GUI stuff, probably the old biology/ subsection
2378
2411
  will be moved into the project.
2379
2412
 
@@ -2470,7 +2503,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
2470
2503
 
2471
2504
 
2472
2505
 
2473
- -------------------------------------------------------------------------------
2506
+ ------------------------------------------------------------------------------------------
2474
2507
  - Identifying amino acid cleavage sites (Sigcleave)
2475
2508
 
2476
2509
  For amino acid sequences we may be interested to know whether
@@ -2533,29 +2566,22 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
2533
2566
  ^^^ there is a bug here. Try again later.
2534
2567
 
2535
2568
 
2536
- - We will read from NM_001180897.3_Saccharomyces_cerevisiae_S288c_Aga2p_AGA2.fasta
2537
-
2538
- The file NM_001180897.3_Saccharomyces_cerevisiae_S288c_Aga2p_AGA2.fasta has this FASTA header:
2539
-
2540
- >gi|398364826|ref|NM_001180897.3| Saccharomyces cerevisiae S288c Aga2p (AGA2), mRNA
2541
2569
 
2542
- ^^^ this should also (optionally) tell us the organism, via a switch.
2543
- for this we need some way to return the taxonomic ID of an organism
2544
2570
 
2545
2571
  - we have to add expasy...
2546
2572
  functionality to the cmdline too.
2547
2573
  Which one specifically? Let's see...
2548
2574
 
2549
2575
  https://www.expasy.org/
2550
- -------------------------------------------------------------------------------
2576
+ ------------------------------------------------------------------------------------------
2551
2577
  - https://biopython.org/wiki/Category%3ACookbook
2552
2578
  ^^^ clone that
2553
- -------------------------------------------------------------------------------
2579
+ ------------------------------------------------------------------------------------------
2554
2580
  - include covid genome, and begin to analyse it in bioroebe
2555
2581
  "The genome of SARS-CoV-2 is said to be twice as large as that
2556
2582
  of influenza viruses, which is why the latter appear to mutate
2557
2583
  four times as fast, Moshiri wrote."
2558
- -------------------------------------------------------------------------------
2584
+ ------------------------------------------------------------------------------------------
2559
2585
  - Look at the GUIs that are part of the BioRoebe project.
2560
2586
 
2561
2587
  Polish these parts, at least one widget, then
@@ -2570,7 +2596,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
2570
2596
 
2571
2597
  Hmmm. And then, also consider transitioning into gtk3,
2572
2598
  and make more screenshots.
2573
- -------------------------------------------------------------------------------
2599
+ ------------------------------------------------------------------------------------------
2574
2600
 
2575
2601
  - https://www.ebi.ac.uk/Tools/seqstats/emboss_pepstats/
2576
2602
  http://www.ebi.ac.uk/Tools/services/web/toolresult.ebi?jobId=emboss_pepstats-I20160208-020243-0564-53154194-oy
@@ -2582,7 +2608,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
2582
2608
  - Improve on temperature content and how it is calculated
2583
2609
 
2584
2610
  someone googled for it in 2014 so build on it
2585
- -------------------------------------------------------------------------------
2611
+ ------------------------------------------------------------------------------------------
2586
2612
  - pfasta /Depot/Temp/bioroebe/NM_000539.3_Homo_sapiens_rhodopsin_RHO.fasta
2587
2613
 
2588
2614
  Will read from the file `/Depot/Temp/bioroebe/NM_000539.3_Homo_sapiens_rhodopsin_RHO.fasta`.
@@ -2593,7 +2619,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
2593
2619
  Now assigning aminoacid sequence to:
2594
2620
  AGAGTCATCCAGCTGGAGCCCTGAGTGGCTGAGCTCAGGCCTTCGCAG
2595
2621
  AGAGTCATCCAGCTGGAGCCCTGAGTGGCTGAGCTCAGGCCTTCGCAG
2596
- -------------------------------------------------------------------------------
2622
+ ------------------------------------------------------------------------------------------
2597
2623
 
2598
2624
 
2599
2625
  - Formats
@@ -2647,7 +2673,7 @@ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
2647
2673
  tinyseq NCBI TinySeq XML
2648
2674
  ztr ZTR tracefile ztr
2649
2675
 
2650
- ..........................................................................
2676
+ ------------------------------------------------------------------------------------------
2651
2677
  (1) Look at f1 display:
2652
2678
 
2653
2679
 
@@ -2670,7 +2696,7 @@ we probably have to rewrite the whole thing
2670
2696
  BEFORE we add ANY COLOURS.
2671
2697
  OH WELL.
2672
2698
 
2673
- -------------------------------------------------------------------------------
2699
+ ------------------------------------------------------------------------------------------
2674
2700
  (100) → Add a primer-design widget
2675
2701
 
2676
2702
  The idea is to be able to manipulate forward and
@@ -2684,7 +2710,7 @@ perfect but it is a start.
2684
2710
  https://www.bioinformatics.nl/molbi/SCLResources/sequence_notation.htm
2685
2711
  ^^^ and check what is useful there. perhaps also add
2686
2712
  nicer visual cues to pretty it up a bit.
2687
- -------------------------------------------------------------------------------
2713
+ ------------------------------------------------------------------------------------------
2688
2714
  (1) → Compare bioroebe to:
2689
2715
 
2690
2716
  https://www.ncbi.nlm.nih.gov/orffinder
@@ -2694,18 +2720,18 @@ whether both return the same also possibly add a web-gui
2694
2720
  check this... so that we can search in standard ORF
2695
2721
  but also in different ORFs
2696
2722
  and report the length, at least for the longest ORF start + stop... so that the result is correct too
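The ORF check described above (searching more than one reading frame and reporting the length of the longest ORF from start to stop) could be sketched roughly as follows. This is only an illustration under stated assumptions: the method name `longest_orf` is hypothetical, only the three forward frames are scanned, `ATG` is the only start codon, and the standard stop codons `TAA`, `TAG` and `TGA` are used.

```ruby
# Hypothetical sketch; not Bioroebe's actual API.
STOP_CODONS = %w[TAA TAG TGA].freeze

# Scans the three forward reading frames and returns a hash with the
# 0-based start, the exclusive stop index and the length in nucleotides
# (stop codon included) of the longest ORF, or nil if none is found.
def longest_orf(dna)
  seq  = dna.upcase
  best = nil
  3.times do |frame|
    start = nil
    (frame..seq.length - 3).step(3) do |i|
      codon = seq[i, 3]
      if start.nil?
        start = i if codon == 'ATG'
      elsif STOP_CODONS.include?(codon)
        length = i + 3 - start
        if best.nil? || length > best[:length]
          best = { start: start, stop: i + 3, length: length }
        end
        start = nil # look for the next ORF in this frame
      end
    end
  end
  best
end

p longest_orf('CCATGAAATAGGG') # → {start: 2, stop: 11, length: 9} (older Rubies print {:start=>2, ...})
```

To cover the other three frames, the same method could be run on the reverse complement of the sequence; comparing the reported coordinates against the NCBI ORFfinder output would then require converting to its 1-based positions.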
2697
- ...........................................................................
2723
+ ------------------------------------------------------------------------------------------
2698
2724
  test reverse complement in bioroebe
2699
2725
  ^^^
2700
2726
  new_WWW/
2701
2727
  ^^^ this should eventually become the new web-related interface.
2702
2728
  Ah well. Perhaps not ... ruby-cgi is soooooo annoying ...
2703
- ...........................................................................
2729
+ ------------------------------------------------------------------------------------------
2704
2730
  (154) → the blosum-viewer should be supported in the cgi part
2705
2731
  and sinatra part as well.
2706
2732
  This now works for sinatra. Need to enable this for
2707
2733
  the cgi-part too eventually.
2708
- -------------------------------------------------------------------------------
2734
+ ------------------------------------------------------------------------------------------
2709
2735
  (155) → port the sinatra stuff together in bioroebe
2710
2736
  create a dir: web_api
2711
2737
  ^^^ also make params? usable in both sinatra and cgi page
@@ -2716,18 +2742,18 @@ Ah well. Perhaps not ... ruby-cgi is soooooo annoying ...
2716
2742
  add tons of HtmlTemplate[]
2717
2743
  and replace the ad-hoc code otherwise...
2718
2744
  ^^^ yeah, finish the HtmlTemplate stuff.
2719
- -------------------------------------------------------------------------------
2745
+ ------------------------------------------------------------------------------------------
2720
2746
  (1) → https://i.imgur.com/ptcSn12.png
2721
2747
  ^^^ enable such an overview; this shows mass computation, e.g.
2722
2748
  peptide mass and such
2723
- -------------------------------------------------------------------------------
2749
+ ------------------------------------------------------------------------------------------
2724
2750
  (80) Bioroebe.sanitize_nucleotide_sequence
2725
2751
  ^^^ port this into java. The code has been written for this already,
2726
2752
  but we currently fail to link it.
2727
- -------------------------------------------------------------------------------
2753
+ ------------------------------------------------------------------------------------------
2728
2754
  (81) Bioroebe.base_composition
2729
2755
  ^^^^^^^^^ port this into java
2730
- -------------------------------------------------------------------------------
2756
+ ------------------------------------------------------------------------------------------
2731
2757
  (82) - work a bit more on tk!!!
2732
2758
  in particular to start it from the bioshell as-is.
2733
2759
  ^^^ this is mostly done for quick
@@ -2740,20 +2766,20 @@ Ah well. Perhaps not ... ruby-cgi is soooooo annoying ...
2740
2766
  hamming_distance [PARTIALLY IMPLEMENTED; ~80%]
2741
2767
  protein_to_DNA
2742
2768
  ^^^^ improve both while improving tk_paradise docu as well.
2743
- -------------------------------------------------------------------------------
2769
+ ------------------------------------------------------------------------------------------
2744
2770
  (83) Batch-create the .exe files on windows for libui, once
2745
2771
  the first has been added. And then test it too
2746
2772
  AND document it. This should be done with the controller
2747
2773
  eventually. Once this works, we can remove this entry
2748
2774
  here.
2749
- -------------------------------------------------------------------------------
2775
+ ------------------------------------------------------------------------------------------
2750
2776
  (84) port more libui stuff in bioroebe. We have two widgets ported so far;
2751
2777
  add more such entries.
2752
- -------------------------------------------------------------------------------
2778
+ ------------------------------------------------------------------------------------------
2753
2779
  (85) after libui has been ported, explore how gosu works on windows.
2754
2780
  if possible add things to a gosu-specific UI as well, but
2755
2781
  we may need a common, unified GUI base for that.
2756
- -------------------------------------------------------------------------------
2782
+ ------------------------------------------------------------------------------------------
2757
2783
  (86)
2758
2784
 
2759
2785
  add libui bindings AND once done make sure the controller works in
@@ -2762,22 +2788,22 @@ libui as well. Embed the various things into it.
2762
2788
  Tab A set named tabs for placing items in
2763
2789
  ^^^ use this perhaps also in bioroebe hmmm
2764
2790
  yeah.
2765
- -------------------------------------------------------------------------------
2791
+ ------------------------------------------------------------------------------------------
2766
2792
  (87) https://github.com/cnjinhao/nana/wiki/User-Works-using-Nana
2767
2793
 
2768
2794
  ^^^ port the "DNA hybrid"
2769
2795
  https://camo.githubusercontent.com/4c27d554ca4d698d288628f21255f917c2c577e35d7e11dd67e21880d56b6b0a/687474703a2f2f6e616e6170726f2e6f72672f696d616765732f73637265656e73686f74732f746864795f7365715f6578706c2e706e67
2770
2796
 
2771
- -------------------------------------------------------------------------------
2797
+ ------------------------------------------------------------------------------------------
2772
2798
  (88) Bioroebe::Cell
2773
2799
  ^^^ think about what to do with it. If we don't need it then perhaps
2774
2800
  we should just remove it. Think about this more in 2022, before
2775
2801
  deciding what to do.
2776
- -------------------------------------------------------------------------------
2802
+ ------------------------------------------------------------------------------------------
2777
2803
  (89) - Add emboss cpgplot functionality.
2778
2804
 
2779
2805
  https://www.bioinformatics.nl/cgi-bin/emboss/cpgplot
2780
- -------------------------------------------------------------------------------
2806
+ ------------------------------------------------------------------------------------------
2781
2807
  (90) - integrate calculation of the Instability index (II)
2782
2808
 
2783
2809
  The instability index provides an estimate of the
@@ -2815,9 +2841,9 @@ that the protein may be unstable.
2815
2841
  The instability index (II) is computed to be 65.43
2816
2842
  This classifies the protein as unstable.
2817
2843
 
2818
- -------------------------------------------------------------------------------
2844
+ ------------------------------------------------------------------------------------------
2819
2845
  (1) → We have now added a method to show all hydrophobic amino acids, via the
2820
2846
  method .hydrophobic_amino_acids?. This works and has been documented
2821
2847
  in May 2022. However, we still need a way to PREDICT
2822
2848
  hydrophobic segments in a polypeptide sequence.
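One common approach to predicting hydrophobic segments is a sliding-window average over the Kyte-Doolittle hydropathy scale. The sketch below uses that technique as an assumption; it is not taken from Bioroebe, the method name `hydrophobic_windows` is hypothetical, and the window size and threshold are illustrative defaults rather than recommended values.

```ruby
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
  'A' => 1.8,  'R' => -4.5, 'N' => -3.5, 'D' => -3.5, 'C' => 2.5,
  'Q' => -3.5, 'E' => -3.5, 'G' => -0.4, 'H' => -3.2, 'I' => 4.5,
  'L' => 3.8,  'K' => -3.9, 'M' => 1.9,  'F' => 2.8,  'P' => -1.6,
  'S' => -0.8, 'T' => -0.7, 'W' => -0.9, 'Y' => -1.3, 'V' => 4.2
}.freeze

# Hypothetical helper; not Bioroebe's actual API.
# Returns the 0-based start positions of all windows whose mean
# hydropathy is above the threshold. Raises KeyError on letters
# outside the 20 standard amino acids.
def hydrophobic_windows(protein, window: 9, threshold: 1.6)
  scores = protein.upcase.chars.map { |aa| KYTE_DOOLITTLE.fetch(aa) }
  scores.each_cons(window).each_with_index
        .select { |values, _index| values.sum / window > threshold }
        .map { |_values, index| index }
end

p hydrophobic_windows('LLLLLLLLLDDDDDDDDD') # → [0, 1, 2]
```

Adjacent start positions can be merged into one segment; longer windows are usually chosen when looking for membrane-spanning regions.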
2823
- -------------------------------------------------------------------------------
2849
+ ------------------------------------------------------------------------------------------