bioroebe 0.10.80 → 0.12.24

Files changed (301)
  1. checksums.yaml +4 -4
  2. data/README.md +3946 -2817
  3. data/bin/bioroebe +13 -2
  4. data/bin/bioroebe_hash +7 -0
  5. data/bin/codon_to_aminoacid +6 -4
  6. data/bin/compacter +7 -0
  7. data/bin/plain_palindrome +7 -0
  8. data/bioroebe.gemspec +3 -3
  9. data/doc/README.gen +3918 -2793
  10. data/doc/quality_control/commandline_applications.md +3 -3
  11. data/doc/statistics/statistics.md +7 -7
  12. data/doc/todo/bioroebe_GUI_todo.md +19 -14
  13. data/doc/todo/bioroebe_java_todo.md +22 -0
  14. data/doc/todo/bioroebe_todo.md +2075 -2620
  15. data/lib/bioroebe/C++/DNA.cpp +69 -0
  16. data/lib/bioroebe/C++/RNA.cpp +58 -0
  17. data/lib/bioroebe/C++/sequence.cpp +35 -0
  18. data/lib/bioroebe/abstract/README.md +1 -0
  19. data/lib/bioroebe/abstract/features.rb +29 -0
  20. data/lib/bioroebe/aminoacids/aminoacid_substitution.rb +1 -9
  21. data/lib/bioroebe/aminoacids/codon_percentage.rb +1 -9
  22. data/lib/bioroebe/aminoacids/deduce_aminoacid_sequence.rb +1 -9
  23. data/lib/bioroebe/aminoacids/display_aminoacid_table.rb +1 -0
  24. data/lib/bioroebe/aminoacids/show_hydrophobicity.rb +1 -6
  25. data/lib/bioroebe/base/base_module/base_module.rb +36 -0
  26. data/lib/bioroebe/base/colours_for_base/colours_for_base.rb +18 -8
  27. data/lib/bioroebe/base/commandline_application/commandline_application.rb +13 -9
  28. data/lib/bioroebe/base/commandline_application/commandline_arguments.rb +24 -19
  29. data/lib/bioroebe/base/commandline_application/misc.rb +66 -49
  30. data/lib/bioroebe/base/commandline_application/opn.rb +8 -8
  31. data/lib/bioroebe/base/commandline_application/reset.rb +5 -3
  32. data/lib/bioroebe/base/internal_hash_module/internal_hash_module.rb +42 -0
  33. data/lib/bioroebe/base/misc.rb +35 -0
  34. data/lib/bioroebe/base/prototype/misc.rb +15 -9
  35. data/lib/bioroebe/base/prototype/reset.rb +10 -0
  36. data/lib/bioroebe/cleave_and_digest/digestion.rb +10 -2
  37. data/lib/bioroebe/cleave_and_digest/trypsin.rb +104 -50
  38. data/lib/bioroebe/codon_tables/frequencies/parse_frequency_table.rb +2 -10
  39. data/lib/bioroebe/codons/codons.rb +1 -1
  40. data/lib/bioroebe/codons/convert_this_codon_to_that_aminoacid.rb +208 -59
  41. data/lib/bioroebe/codons/possible_codons_for_this_aminoacid.rb +1 -9
  42. data/lib/bioroebe/codons/show_codon_tables.rb +8 -3
  43. data/lib/bioroebe/codons/show_codon_usage.rb +15 -4
  44. data/lib/bioroebe/colours/rev.rb +4 -1
  45. data/lib/bioroebe/constants/aminoacids_and_proteins.rb +1 -0
  46. data/lib/bioroebe/constants/database_constants.rb +1 -1
  47. data/lib/bioroebe/constants/files_and_directories.rb +31 -4
  48. data/lib/bioroebe/constants/misc.rb +20 -0
  49. data/lib/bioroebe/constants/nucleotides.rb +7 -0
  50. data/lib/bioroebe/conversions/dna_to_aminoacid_sequence.rb +109 -39
  51. data/lib/bioroebe/count/count_amount_of_aminoacids.rb +3 -2
  52. data/lib/bioroebe/count/count_amount_of_nucleotides.rb +3 -0
  53. data/lib/bioroebe/cpp +1 -0
  54. data/lib/bioroebe/crystal/README.md +2 -0
  55. data/lib/bioroebe/crystal/to_rna.cr +19 -0
  56. data/lib/bioroebe/data/README.md +11 -8
  57. data/lib/bioroebe/data/electron_microscopy/pos_example.pos +396 -0
  58. data/lib/bioroebe/data/electron_microscopy/test_particles.star +36 -0
  59. data/lib/bioroebe/data/fasta/human/Homo_sapiens_hemoglobin_subunit_alpha_HBB_mRNA.fasta +9 -0
  60. data/lib/bioroebe/data/fasta/human/Homo_sapiens_hemoglobin_subunit_beta_HBB_mRNA.fasta +8 -0
  61. data/lib/bioroebe/data/fasta/human/README.md +2 -0
  62. data/lib/bioroebe/dotplots/advanced_dotplot.rb +1 -1
  63. data/lib/bioroebe/electron_microscopy/coordinate_analyzer.rb +15 -18
  64. data/lib/bioroebe/{fasta_and_fastq/parse_fasta/run.rb → electron_microscopy/electron_microscopy_module.rb} +16 -8
  65. data/lib/bioroebe/electron_microscopy/fix_pos_file.rb +1 -9
  66. data/lib/bioroebe/electron_microscopy/flipy.rb +83 -0
  67. data/lib/bioroebe/electron_microscopy/parse_coordinates.rb +2 -10
  68. data/lib/bioroebe/electron_microscopy/read_file_xmd.rb +1 -9
  69. data/lib/bioroebe/electron_microscopy/simple_star_file_generator.rb +4 -9
  70. data/lib/bioroebe/enzymes/has_this_restriction_enzyme.rb +10 -3
  71. data/lib/bioroebe/enzymes/restriction_enzyme.rb +23 -1
  72. data/lib/bioroebe/enzymes/restriction_enzymes/statistics.rb +65 -0
  73. data/lib/bioroebe/fasta_and_fastq/autocorrect_the_name_of_this_fasta_file.rb +1 -9
  74. data/lib/bioroebe/fasta_and_fastq/compact_fasta_file/compact_fasta_file.rb +7 -9
  75. data/lib/bioroebe/fasta_and_fastq/fasta_defline/fasta_defline.rb +1 -5
  76. data/lib/bioroebe/fasta_and_fastq/fasta_to_yaml/fasta_to_yaml.rb +81 -0
  77. data/lib/bioroebe/fasta_and_fastq/parse_fasta/parse_fasta.rb +1518 -7
  78. data/lib/bioroebe/fasta_and_fastq/return_fasta_subsection_of_this_file.rb +11 -2
  79. data/lib/bioroebe/fasta_and_fastq/show_fasta_headers.rb +27 -12
  80. data/lib/bioroebe/fasta_and_fastq/simplify_fasta_header/simplify_fasta_header.rb +1 -5
  81. data/lib/bioroebe/fasta_and_fastq/split_this_fasta_file_into_chromosomes/constants.rb +0 -5
  82. data/lib/bioroebe/genome/README.md +4 -0
  83. data/lib/bioroebe/genome/genome.rb +130 -0
  84. data/lib/bioroebe/genomes/genome_pattern.rb +3 -9
  85. data/lib/bioroebe/gui/gtk +1 -0
  86. data/lib/bioroebe/gui/gtk3/alignment/alignment.rb +106 -137
  87. data/lib/bioroebe/gui/gtk3/aminoacid_composition/aminoacid_composition.rb +27 -61
  88. data/lib/bioroebe/gui/gtk3/aminoacid_composition/customized_dialog.rb +1 -1
  89. data/lib/bioroebe/gui/gtk3/blosum_matrix_viewer/blosum_matrix_viewer.rb +1 -2
  90. data/lib/bioroebe/gui/gtk3/calculate_cell_numbers_of_bacteria/calculate_cell_numbers_of_bacteria.rb +1 -2
  91. data/lib/bioroebe/gui/gtk3/controller/controller.rb +46 -29
  92. data/lib/bioroebe/gui/gtk3/dna_to_aminoacid_widget/dna_to_aminoacid_widget.rb +77 -52
  93. data/lib/bioroebe/gui/gtk3/dna_to_reverse_complement_widget/dna_to_reverse_complement_widget.rb +1 -2
  94. data/lib/bioroebe/gui/gtk3/fasta_table_widget/fasta_table_widget.rb +100 -23
  95. data/lib/bioroebe/gui/gtk3/format_converter/format_converter.rb +1 -2
  96. data/lib/bioroebe/gui/gtk3/gene/gene.rb +1 -2
  97. data/lib/bioroebe/gui/gtk3/hamming_distance/hamming_distance.rb +43 -30
  98. data/lib/bioroebe/gui/gtk3/levensthein_distance/levensthein_distance.rb +1 -2
  99. data/lib/bioroebe/gui/gtk3/nucleotide_analyser/nucleotide_analyser.rb +120 -73
  100. data/lib/bioroebe/gui/gtk3/primer_design_widget/primer_design_widget.rb +1 -2
  101. data/lib/bioroebe/gui/gtk3/protein_to_DNA/protein_to_DNA.rb +19 -20
  102. data/lib/bioroebe/gui/gtk3/random_sequence/random_sequence.rb +20 -13
  103. data/lib/bioroebe/gui/gtk3/restriction_enzymes/restriction_enzymes.rb +1 -2
  104. data/lib/bioroebe/gui/gtk3/show_codon_table/misc.rb +97 -22
  105. data/lib/bioroebe/gui/gtk3/show_codon_table/show_codon_table.rb +3 -73
  106. data/lib/bioroebe/gui/gtk3/show_codon_usage/show_codon_usage.rb +1 -2
  107. data/lib/bioroebe/gui/gtk3/sizeseq/sizeseq.rb +1 -2
  108. data/lib/bioroebe/gui/gtk3/three_to_one/three_to_one.rb +1 -2
  109. data/lib/bioroebe/gui/gtk3/www_finder/www_finder.rb +1 -2
  110. data/lib/bioroebe/gui/javafx/bioroebe/Bioroebe.class +0 -0
  111. data/lib/bioroebe/gui/javafx/bioroebe/Bioroebe.java +104 -0
  112. data/lib/bioroebe/gui/javafx/bioroebe.jar +0 -0
  113. data/lib/bioroebe/gui/javafx/bioroebe.mf +1 -0
  114. data/lib/bioroebe/gui/javafx/module-info.class +0 -0
  115. data/lib/bioroebe/gui/javafx/module-info.java +5 -0
  116. data/lib/bioroebe/gui/jruby/alignment/alignment.rb +165 -0
  117. data/lib/bioroebe/gui/jruby/aminoacid_composition/aminoacid_composition.rb +166 -0
  118. data/lib/bioroebe/gui/libui/alignment/alignment.rb +3 -1
  119. data/lib/bioroebe/gui/libui/controller/controller.rb +116 -0
  120. data/lib/bioroebe/gui/libui/random_sequence/random_sequence.rb +18 -2
  121. data/lib/bioroebe/gui/libui/show_codon_table/show_codon_table.rb +2 -0
  122. data/lib/bioroebe/gui/libui/three_to_one/three_to_one.rb +8 -6
  123. data/lib/bioroebe/gui/shared_code/alignment/alignment_module.rb +102 -0
  124. data/lib/bioroebe/gui/shared_code/aminoacid_composition/aminoacid_composition_module.rb +94 -0
  125. data/lib/bioroebe/gui/shared_code/levensthein_distance/levensthein_distance_module.rb +18 -16
  126. data/lib/bioroebe/gui/shared_code/protein_to_DNA/protein_to_DNA_module.rb +14 -14
  127. data/lib/bioroebe/gui/swing/three_to_one/ThreeToOne$1.class +0 -0
  128. data/lib/bioroebe/gui/swing/three_to_one/ThreeToOne$CloseListener.class +0 -0
  129. data/lib/bioroebe/gui/swing/three_to_one/ThreeToOne.class +0 -0
  130. data/lib/bioroebe/gui/swing/three_to_one/ThreeToOne.java +141 -0
  131. data/lib/bioroebe/images/FORWARD_PRIMER.png +0 -0
  132. data/lib/bioroebe/images/REVERSE_PRIMER.png +0 -0
  133. data/lib/bioroebe/images/images.html +29845 -0
  134. data/lib/bioroebe/java/README.md +5 -0
  135. data/lib/bioroebe/java/bioroebe/AllInOne.java +1 -0
  136. data/lib/bioroebe/java/bioroebe/Base.class +0 -0
  137. data/lib/bioroebe/java/bioroebe/Base.java +39 -5
  138. data/lib/bioroebe/java/bioroebe/IsPalindrome.java +23 -5
  139. data/lib/bioroebe/java/bioroebe/SanitizeNucleotideSequence.java +0 -0
  140. data/lib/bioroebe/java/bioroebe/Sequence.java +28 -3
  141. data/lib/bioroebe/java/bioroebe/ToCamelcase.class +0 -0
  142. data/lib/bioroebe/java/bioroebe/ToCamelcase.java +16 -4
  143. data/lib/bioroebe/java/bioroebe/ToRNA.java +43 -0
  144. data/lib/bioroebe/java/bioroebe/ToplevelMethods.java +6 -0
  145. data/lib/bioroebe/java/bioroebe/{BisulfiteTreatment.class → src/BisulfiteTreatment.class} +0 -0
  146. data/lib/bioroebe/java/bioroebe/{Codons.class → src/Codons.class} +0 -0
  147. data/lib/bioroebe/java/bioroebe/src/Codons.java +35 -0
  148. data/lib/bioroebe/java/bioroebe/src/Commandline.class +0 -0
  149. data/lib/bioroebe/java/bioroebe/src/Commandline.java +101 -0
  150. data/lib/bioroebe/java/bioroebe/{Esystem.class → src/Esystem.class} +0 -0
  151. data/lib/bioroebe/java/bioroebe/{Esystem.java → src/Esystem.java} +6 -1
  152. data/lib/bioroebe/java/bioroebe/{GenerateRandomDnaSequence.class → src/GenerateRandomDnaSequence.class} +0 -0
  153. data/lib/bioroebe/java/bioroebe/{GenerateRandomDnaSequence.java → src/GenerateRandomDnaSequence.java} +8 -2
  154. data/lib/bioroebe/java/bioroebe/src/PartnerNucleotide.class +0 -0
  155. data/lib/bioroebe/java/bioroebe/src/PartnerNucleotide.java +56 -0
  156. data/lib/bioroebe/java/bioroebe/{RemoveFile.java → src/RemoveFile.java} +10 -4
  157. data/lib/bioroebe/java/bioroebe/{RemoveNumbers.class → src/RemoveNumbers.class} +0 -0
  158. data/lib/bioroebe/java/bioroebe/{RemoveNumbers.java → src/RemoveNumbers.java} +1 -0
  159. data/lib/bioroebe/java/bioroebe/src/toplevel_methods/BaseComposition.class +0 -0
  160. data/lib/bioroebe/java/bioroebe/src/toplevel_methods/BaseComposition.java +75 -0
  161. data/lib/bioroebe/misc/ruler.rb +11 -2
  162. data/lib/bioroebe/nucleotides/most_likely_nucleotide_sequence_for_this_aminoacid_sequence.rb +1 -9
  163. data/lib/bioroebe/nucleotides/sanitize_nucleotide_sequence.rb +59 -18
  164. data/lib/bioroebe/nucleotides/show_nucleotide_sequence.rb +7 -7
  165. data/lib/bioroebe/parsers/genbank_parser.rb +347 -26
  166. data/lib/bioroebe/parsers/gff.rb +1 -9
  167. data/lib/bioroebe/patterns/scan_for_repeat.rb +1 -5
  168. data/lib/bioroebe/pdb/fetch_fasta_sequence_from_pdb.rb +1 -9
  169. data/lib/bioroebe/pdb/parse_mmCIF_file.rb +1 -9
  170. data/lib/bioroebe/pdb/parse_pdb_file.rb +4 -10
  171. data/lib/bioroebe/project/project.rb +1 -1
  172. data/lib/bioroebe/python/README.md +1 -0
  173. data/lib/bioroebe/python/__pycache__/mymodule.cpython-39.pyc +0 -0
  174. data/lib/bioroebe/python/gui/gtk3/all_in_one.css +4 -0
  175. data/lib/bioroebe/python/gui/gtk3/all_in_one.py +59 -0
  176. data/lib/bioroebe/python/gui/gtk3/widget1.py +20 -0
  177. data/lib/bioroebe/python/gui/tkinter/all_in_one.py +91 -0
  178. data/lib/bioroebe/python/mymodule.py +8 -0
  179. data/lib/bioroebe/python/protein_to_dna.py +33 -0
  180. data/lib/bioroebe/python/shell/shell.py +19 -0
  181. data/lib/bioroebe/python/to_rna.py +14 -0
  182. data/lib/bioroebe/python/toplevel_methods/convert_dna_to_aminoacid_sequence.py +137 -0
  183. data/lib/bioroebe/python/toplevel_methods/esystem.py +12 -0
  184. data/lib/bioroebe/python/toplevel_methods/open_in_browser.py +20 -0
  185. data/lib/bioroebe/python/toplevel_methods/palindromes.py +52 -0
  186. data/lib/bioroebe/python/toplevel_methods/rds.py +13 -0
  187. data/lib/bioroebe/python/toplevel_methods/shuffleseq.py +23 -0
  188. data/lib/bioroebe/python/toplevel_methods/three_delimiter.py +37 -0
  189. data/lib/bioroebe/python/toplevel_methods/time_and_date.py +43 -0
  190. data/lib/bioroebe/python/toplevel_methods/to_camelcase.py +21 -0
  191. data/lib/bioroebe/requires/require_cleave_and_digest.rb +3 -1
  192. data/lib/bioroebe/requires/require_the_bioroebe_project.rb +3 -1
  193. data/lib/bioroebe/sequence/alignment.rb +14 -4
  194. data/lib/bioroebe/sequence/dna.rb +1 -0
  195. data/lib/bioroebe/sequence/nucleotide_module/nucleotide_module.rb +28 -25
  196. data/lib/bioroebe/sequence/protein.rb +105 -3
  197. data/lib/bioroebe/sequence/rna.rb +220 -0
  198. data/lib/bioroebe/sequence/sequence.rb +128 -40
  199. data/lib/bioroebe/shell/menu.rb +3815 -3696
  200. data/lib/bioroebe/shell/misc.rb +9019 -3133
  201. data/lib/bioroebe/shell/readline/readline.rb +1 -1
  202. data/lib/bioroebe/shell/shell.rb +1137 -28
  203. data/lib/bioroebe/siRNA/siRNA.rb +81 -1
  204. data/lib/bioroebe/string_matching/find_longest_substring.rb +3 -2
  205. data/lib/bioroebe/string_matching/hamming_distance.rb +1 -9
  206. data/lib/bioroebe/taxonomy/class_methods.rb +3 -8
  207. data/lib/bioroebe/taxonomy/constants.rb +4 -3
  208. data/lib/bioroebe/taxonomy/edit.rb +2 -1
  209. data/lib/bioroebe/taxonomy/help/help.rb +10 -10
  210. data/lib/bioroebe/taxonomy/help/helpline.rb +2 -2
  211. data/lib/bioroebe/taxonomy/info/check_available.rb +15 -9
  212. data/lib/bioroebe/taxonomy/info/info.rb +18 -11
  213. data/lib/bioroebe/taxonomy/info/is_dna.rb +46 -36
  214. data/lib/bioroebe/taxonomy/interactive.rb +140 -104
  215. data/lib/bioroebe/taxonomy/menu.rb +27 -18
  216. data/lib/bioroebe/taxonomy/parse_fasta.rb +3 -1
  217. data/lib/bioroebe/taxonomy/shared.rb +1 -0
  218. data/lib/bioroebe/taxonomy/taxonomy.rb +1 -0
  219. data/lib/bioroebe/toplevel_methods/aminoacids_and_proteins.rb +31 -24
  220. data/lib/bioroebe/toplevel_methods/colourize_related_methods.rb +164 -0
  221. data/lib/bioroebe/toplevel_methods/databases.rb +1 -1
  222. data/lib/bioroebe/toplevel_methods/digest.rb +18 -8
  223. data/lib/bioroebe/toplevel_methods/fasta_and_fastq.rb +107 -63
  224. data/lib/bioroebe/toplevel_methods/file_and_directory_related_actions.rb +14 -2
  225. data/lib/bioroebe/toplevel_methods/frequencies.rb +8 -1
  226. data/lib/bioroebe/toplevel_methods/misc.rb +175 -11
  227. data/lib/bioroebe/toplevel_methods/nucleotides.rb +118 -46
  228. data/lib/bioroebe/toplevel_methods/open_in_browser.rb +2 -0
  229. data/lib/bioroebe/toplevel_methods/palindromes.rb +75 -47
  230. data/lib/bioroebe/toplevel_methods/taxonomy.rb +3 -3
  231. data/lib/bioroebe/toplevel_methods/to_camelcase.rb +5 -0
  232. data/lib/bioroebe/utility_scripts/align_open_reading_frames.rb +1 -9
  233. data/lib/bioroebe/utility_scripts/check_for_mismatches/check_for_mismatches.rb +1 -9
  234. data/lib/bioroebe/utility_scripts/compacter/compacter.rb +251 -0
  235. data/lib/bioroebe/utility_scripts/compseq/compseq.rb +1 -9
  236. data/lib/bioroebe/utility_scripts/consensus_sequence.rb +6 -6
  237. data/lib/bioroebe/utility_scripts/create_batch_entrez_file.rb +1 -9
  238. data/lib/bioroebe/utility_scripts/dot_alignment.rb +1 -9
  239. data/lib/bioroebe/utility_scripts/move_file_to_its_correct_location.rb +1 -4
  240. data/lib/bioroebe/utility_scripts/parse_taxonomy.rb +2 -2
  241. data/lib/bioroebe/utility_scripts/permutations.rb +36 -9
  242. data/lib/bioroebe/utility_scripts/showorf/constants.rb +0 -5
  243. data/lib/bioroebe/utility_scripts/showorf/reset.rb +1 -4
  244. data/lib/bioroebe/version/version.rb +2 -2
  245. data/lib/bioroebe/www/embeddable_interface.rb +121 -58
  246. data/lib/bioroebe/www/sinatra/sinatra.rb +186 -71
  247. data/lib/bioroebe/yaml/aminoacids/amino_acids_long_name_to_one_letter.yml +2 -2
  248. data/lib/bioroebe/yaml/aminoacids/weight_of_common_proteins.yml +17 -17
  249. data/lib/bioroebe/yaml/configuration/browser.yml +1 -1
  250. data/lib/bioroebe/yaml/configuration/temp_dir.yml +1 -1
  251. data/lib/bioroebe/yaml/consensus_sequences/consensus_sequences.yml +1 -0
  252. data/lib/bioroebe/yaml/genomes/README.md +3 -4
  253. data/lib/bioroebe/yaml/nucleotides/nucleotides.yml +5 -0
  254. data/lib/bioroebe/yaml/restriction_enzymes/restriction_enzymes.yml +57 -57
  255. data/spec/README.md +6 -0
  256. data/spec/project_wide_specification/classes.md +5 -0
  257. metadata +107 -70
  258. data/doc/setup.rb +0 -1655
  259. data/lib/bioroebe/fasta_and_fastq/parse_fasta/constants.rb +0 -50
  260. data/lib/bioroebe/fasta_and_fastq/parse_fasta/initialize.rb +0 -86
  261. data/lib/bioroebe/fasta_and_fastq/parse_fasta/menu.rb +0 -117
  262. data/lib/bioroebe/fasta_and_fastq/parse_fasta/misc.rb +0 -981
  263. data/lib/bioroebe/fasta_and_fastq/parse_fasta/report.rb +0 -156
  264. data/lib/bioroebe/fasta_and_fastq/parse_fasta/reset.rb +0 -128
  265. data/lib/bioroebe/genbank/genbank_parser.rb +0 -291
  266. data/lib/bioroebe/java/bioroebe/AllInOne.class +0 -0
  267. data/lib/bioroebe/java/bioroebe/Cat.class +0 -0
  268. data/lib/bioroebe/java/bioroebe/Codons.java +0 -22
  269. data/lib/bioroebe/java/bioroebe/IsPalindrome.class +0 -0
  270. data/lib/bioroebe/java/bioroebe/PartnerNucleotide.class +0 -0
  271. data/lib/bioroebe/java/bioroebe/PartnerNucleotide.java +0 -19
  272. data/lib/bioroebe/java/bioroebe/SanitizeNucleotideSequence.class +0 -0
  273. data/lib/bioroebe/java/bioroebe/ToplevelMethods.class +0 -0
  274. data/lib/bioroebe/java/bioroebe.jar +0 -0
  275. data/lib/bioroebe/shell/add.rb +0 -108
  276. data/lib/bioroebe/shell/assign.rb +0 -360
  277. data/lib/bioroebe/shell/chop_and_cut.rb +0 -281
  278. data/lib/bioroebe/shell/constants.rb +0 -166
  279. data/lib/bioroebe/shell/download.rb +0 -335
  280. data/lib/bioroebe/shell/enable_and_disable.rb +0 -158
  281. data/lib/bioroebe/shell/enzymes.rb +0 -310
  282. data/lib/bioroebe/shell/fasta.rb +0 -345
  283. data/lib/bioroebe/shell/gtk.rb +0 -76
  284. data/lib/bioroebe/shell/history.rb +0 -132
  285. data/lib/bioroebe/shell/initialize.rb +0 -217
  286. data/lib/bioroebe/shell/loop.rb +0 -74
  287. data/lib/bioroebe/shell/prompt.rb +0 -107
  288. data/lib/bioroebe/shell/random.rb +0 -289
  289. data/lib/bioroebe/shell/reset.rb +0 -335
  290. data/lib/bioroebe/shell/scan_and_parse.rb +0 -135
  291. data/lib/bioroebe/shell/search.rb +0 -337
  292. data/lib/bioroebe/shell/sequences.rb +0 -200
  293. data/lib/bioroebe/shell/show_report_and_display.rb +0 -2901
  294. data/lib/bioroebe/shell/startup.rb +0 -127
  295. data/lib/bioroebe/shell/taxonomy.rb +0 -14
  296. data/lib/bioroebe/shell/tk.rb +0 -23
  297. data/lib/bioroebe/shell/user_input.rb +0 -88
  298. data/lib/bioroebe/shell/xorg.rb +0 -45
  299. data/lib/bioroebe/utility_scripts/compacter.rb +0 -131
  300. /data/lib/bioroebe/java/bioroebe/{BisulfiteTreatment.java → src/BisulfiteTreatment.java} +0 -0
  301. /data/lib/bioroebe/java/bioroebe/{RemoveFile.class → src/RemoveFile.class} +0 -0
@@ -1,2823 +1,2278 @@
- -------------------------------------------------------------------------------
- (1) → https://biopython.org/DIST/docs/tutorial/Tutorial.html#sec15
-
- ^^^ work through the above, also integrate it + write docs
-
- https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orchid.fasta
-
- -------------------------------------------------------------------------------
- (2) → integrate electrno microscopy slowly and also add documentation
+ --------------------------------------------------------------------------------
+ (2) → Integrate http://nc2.neb.com/NEBcutter2/cutshow.php?name=ffe1d68e-
+ in particular the visual part.
+ --------------------------------------------------------------------------------
+ (3) → add support for:
+ codon_of? this_aminoacid
+ class CodonOfThisAminoacid
+ ^^^^
+ --------------------------------------------------------------------------------
+ (4) → Bioroebe::RestrictionEnzymes::Statistics.show
+ ^^^ improve these
+ and then add it to the documentation.
+ --------------------------------------------------------------------------------
+ (5) → use glimmer + nebula for widgets
+ ^^^
+ improve the nucleotide sequence analyser
+ --------------------------------------------------------------------------------
+ (6) → add to sinatra: a standalone server to query BAM files (and
+ the corresponding reference). The server will return the
+ content of a BAM file in the selected folder when the
+ server is started up. The server used is sintra.
+ --------------------------------------------------------------------------------
+ (7) → add the possibility to show what the effect of enzymes
+ are
+ AND inhibitors of enzymes. perhaps bioroebe can be
+ used in system biology one day
+ --------------------------------------------------------------------------------
+ (8) → Bioroebe::Sequence.new('AGCTTAGCGTACAGCTACGACGTAGTCTGACGA').cut_with? :AluI
+ ^^^ support this API and document it too
+ --------------------------------------------------------------------------------
+ (9) → integrate electrno microscopy slowly and also add documentation
  about this AS YOU GO!!!!!
  ^^^ yup add more of it
- -------------------------------------------------------------------------------
- (3) → Add save session support
+ --------------------------------------------------------------------------------
+ (10) → Add save session support
  to reload our last activity completely ...
- hmmm..
- This has to be well designed...
- Perhaps before we do so, we will add some
- class that anylizes what we have
- call it:
-
- class AnalyseLocalDataset
-
- And it is called when the bioshell is
- called. Can be enabled and disabled.
- AND document it then.
- The idea is to provide additional information
- upon startup of the bioroebe shell.
- This is in preparation for save-session support.
-
-
- -------------------------------------------------------------------------------
- (5) → Lys-Asp-Glu-Leu
-
- if i.include?('-') and Bioroebe.is_in_the_three_letter_code?(i)
- end
-
- - Lys-Asp-Glu-Leu-COO-
-
- Lys-Asp-Glu-Leu
-
- ^^^ this is "retention in lumen of ER"
- find this too
-
- BUT!
-
- we must verify it
-
- ^^ yep this is also called KDEL
- https://en.wikipedia.org/wiki/KDEL_(amino_acid_sequence)
- -------------------------------------------------------------------------------
- (6) → Add "orthologs". this shall show us the top 25 orthologs or
+ hmmm..
+ This has to be well designed...
+ Perhaps before we do so, we will add some
+ class that anylizes what we have
+ call it:
+ class AnalyseLocalDataset
+ And it is called when the bioshell is
+ called. Can be enabled and disabled.
+ AND document it then.
+ The idea is to provide additional information
+ upon startup of the bioroebe shell.
+ This is in preparation for save-session support.
+ --------------------------------------------------------------------------------
+ (11) Lys-Asp-Glu-Leu
+ if i.include?('-') and Bioroebe.is_in_the_three_letter_code?(i)
+ end
+ - Lys-Asp-Glu-Leu-COO-
+ Lys-Asp-Glu-Leu
+ ^^^ this is "retention in lumen of ER"
+ find this too
+ BUT!
+ we must verify it
+ ^^ yep this is also called KDEL
+ https://en.wikipedia.org/wiki/KDEL_(amino_acid_sequence)
+ --------------------------------------------------------------------------------
+ (12) → Add "orthologs". this shall show us the top 25 orthologs or
  something. In the bioshell? Hmm. Not sure yet.
- -------------------------------------------------------------------------------
- (7) → clone the functionality of this:
-
- http://www.kazusa.or.jp/codon/cgi-bin/countcodon.cgi
- http://www.kazusa.or.jp/codon/countcodon.html
-
- In other words, create a class that can generate such an output.
- ^^^ This is now done.
- Then add this to a GUI as well as the www output.
- ^^^ This still has to be done, though. We will use a ruby-gtk3
- widget first. And sinatra output too.
- AND document it as well
-
- -------------------------------------------------------------------------------
- (8) → SARS genom analyisere in bioroebe
+ --------------------------------------------------------------------------------
+ (13) → clone the functionality of this:
+ http://www.kazusa.or.jp/codon/cgi-bin/countcodon.cgi
+ http://www.kazusa.or.jp/codon/countcodon.html
+ In other words, create a class that can generate such an output.
+ ^^^ This is now done.
+ Then add this to a GUI as well as the www output.
+ ^^^ This still has to be done, though. We will use a ruby-gtk3
+ widget first. And sinatra output too.
+ AND document it as well
+ --------------------------------------------------------------------------------
+ (14) SARS genom analyisere in bioroebe
  eventuell auch graphisch
-
- Gibt es neue GUIs die wir kombinieren könnten? Hmmm.
- -------------------------------------------------------------------------------
- (9) → In bioroebe, generate that .ps thingy graphical thing from the
+ Gibt es neue GUIs die wir kombinieren könnten? Hmmm.
+ --------------------------------------------------------------------------------
+ (15) → In bioroebe, generate that .ps thingy graphical thing from the
  vienna RNA tutorial. Hmmm.
-
- https://www.tbi.univie.ac.at/RNA/tutorial/
- -------------------------------------------------------------------------------
- (1) → get insulin squence frmo NCBI
- human
+ https://www.tbi.univie.ac.at/RNA/tutorial/
+ --------------------------------------------------------------------------------
+ (16) → get insulin squence frmo NCBI
+ human
  then apply trypsin onto it
  and try it like this:
-
- trypsin --insulin
- ^^^
- also document it then. well .....
-
- Also add:
- insulin?
- ^^^ to show it
- Hmm. Perhaps also auto-download or something.
-
- -------------------------------------------------------------------------------
- (1) → in bioroebe: UAG?
+ trypsin --insulin
+ ^^^
+ also document it then. well .....
+ Also add:
+ insulin?
+ ^^^ to show it
+ Hmm. Perhaps also auto-download or something.
+ --------------------------------------------------------------------------------
+ (17) in bioroebe: UAG?
  ^^^ show all stop codons with that in the bioshell
  all UAG sequences... hmm. and TAG?
  Finish that.
- ..........................................................................
- (1) → The position of a symbol in a string is the total number of
+ --------------------------------------------------------------------------------
+ (18) → The position of a symbol in a string is the total number of
  symbols found to its left, including itself (e.g., the positions
  of all occurrences of 'U' in "AUGCUUCAGAAAGGUCUUACG" are 2, 5,
  6, 15, 17, and 18). The symbol at position i
  of s is denoted by s[i].
-
- ^^^ add a solution there, a toplevel API
- !!!!!
- -------------------------------------------------------------------------------
- (1) → http://bioruby.org/rdoc/Bio/Blast.html
+ ^^^ add a solution there, a toplevel API
+ !!!!!
+ --------------------------------------------------------------------------------
+ (19) → http://bioruby.org/rdoc/Bio/Blast.html
  ^^^ add support for BLAST
- ..........................................................................
- (1) → add: parse_pdb()
+ --------------------------------------------------------------------------------
+ (20) → add: parse_pdb()
  With this we shall just show some info, about a given
  .pdb file at hand.
  Also make it commandline based too + bioshell variant
  here, and a sinatra interface once this all works.
  Don't forget to document it!!!!!
  ^^^ and google a bit how others do that
- ..........................................................................
- (2) → pdb 1a6m
+ --------------------------------------------------------------------------------
+ (21) → pdb 1a6m
  ^^^ download this when that is used in the bioshell; we also have
- to use the download directory for this, so make sure that
- we do.
+ to use the download directory for this, so make sure that
+ we do.
  ^^^ And then, also document this clearly.
- -------------------------------------------------------------------------------
- (3) show_string
- ^^^ slowly port this ... find out differences
- then unify into one method. right now we used
- two or something.
- -------------------------------------------------------------------------------
- (4) → Try to see if we can integrate this into our GUI:
-
- https://cdn.snapgene.com/assets/7.6.11/assets/images/snapgene/homepage/homepage-hero.png
- -------------------------------------------------------------------------------
- (5) → Scan for leucine zipper!
-
+ --------------------------------------------------------------------------------
+ (22) show_string
+ ^^^ slowly port this ... find out differences
+ then unify into one method. right now we used
+ two or something.
+ --------------------------------------------------------------------------------
+ (23) → Try to see if we can integrate this into our GUI:
+ https://cdn.snapgene.com/assets/7.6.11/assets/images/snapgene/homepage/homepage-hero.png
+ --------------------------------------------------------------------------------
+ (24) → Scan for leucine zipper!
  This is ~25% implemented. We need to double-check what
  exactly is a leucine zipper.
- ..........................................................................
- (6) → Extend the sinatra-interface for the Rosalind task,
+ --------------------------------------------------------------------------------
+ (25) → Extend the sinatra-interface for the Rosalind task,
  perhaps add a sub-link to show which parts are solved
  as-is. Hmm. I am not continuing on this though.
- ^^^^
- well - make rosalind anew again or something.
-
- ...........................................................................
- (7) - Add a blast interface; both via the web-interface, GUI,
+ ^^^^
+ well - make rosalind anew again or something.
+ --------------------------------------------------------------------------------
+ (26) → Add a blast interface; both via the web-interface, GUI,
  and also from the commandline.
- -------------------------------------------------------------------------------
- (8) - Write a tutorial about primer design.
+ --------------------------------------------------------------------------------
+ (27) Write a tutorial about primer design.
  also make sure that the GUI has support for this.
149
- ..........................................................................
150
- (9) - In the documentation examples, show some examples for how to work
148
+ --------------------------------------------------------------------------------
149
+ (28) In the documentation examples, show some examples for how to work
151
150
  with different organisms.
152
- ..........................................................................
153
- (10) - In the bioshell, if "stop?" is issued, then the colouring isn't
154
- correct. It currently does not show any result. This has to
155
- be fixed.
156
- ..........................................................................
157
- (11) https://www.rubydoc.info/gems/biomart
158
- ^^^ integrate biomart
159
-
160
- p biomart.list_datasets
161
- p biomart.datasets?
162
- -------------------------------------------------------------------------------
163
- (12) Add Trypsin and Trypsinogen sequences, both as FASTA
164
- but also as shortcut via the commandline such as:
151
+ --------------------------------------------------------------------------------
152
+ (29) In the bioshell, if "stop?" is issued, then the colouring isn't
153
+ correct. It currently does not show any result. This has to
154
+ be fixed.
155
+ --------------------------------------------------------------------------------
156
+ (30) https://www.rubydoc.info/gems/biomart
157
+ ^^^ integrate biomart
158
+ p biomart.list_datasets
159
+ p biomart.datasets?
160
+ --------------------------------------------------------------------------------
161
+ (31) → Add Trypsin and Trypsinogen sequences, both as FASTA
162
+ but also as shortcut via the commandline such as:
165
163
  show_orf :trypsine
166
164
  show_orf :trypsin
167
- Or something like this; and document it as well.
168
- -------------------------------------------------------------------------------
169
- (13) → 1..60
170
-
171
- setdna 57
172
- append stop
173
- 1..60
174
-
175
- Next showing the nucleotides 1370 to 1462 (including 1370 and 1462).
176
- The length of the fragment will be 93 nucleotides.
177
- 5' - ATGTGCAGTCAGGTGAATTTATTGAAAAATTTGAGGCTCCTGGTGGTGCAAATCAAAGAACTGCTCCTCAGTGGATGTTGCCTTTACTTCTAG - 3'
178
- ^^^ here, during colourize, if the last codon is a STOP codon
179
- then we colourize that as well.
180
- -------------------------------------------------------------------------------
181
- (14) → MG1655
165
+ Or something like this; and document it as well.
166
+ --------------------------------------------------------------------------------
167
+ (32) → 1..60
168
+ setdna 57
169
+ append stop
170
+ 1..60
171
+ Next showing the nucleotides 1370 to 1462 (including 1370 and 1462).
172
+ The length of the fragment will be 93 nucleotides.
173
+ 5' - ATGTGCAGTCAGGTGAATTTATTGAAAAATTTGAGGCTCCTGGTGGTGCAAATCAAAGAACTGCTCCTCAGTGGATGTTGCCTTTACTTCTAG - 3'
174
+ ^^^ here, during colourize, if the last codon is a STOP codon
175
+ then we colourize that as well.
176
+ --------------------------------------------------------------------------------
177
+ (33) MG1655
182
178
  ^^^ input this to download the sequence. Also show it to the user.
183
- -------------------------------------------------------------------------------
184
- (15) → extend virus-information into the bioroebe project.
185
- -------------------------------------------------------------------------------
186
- (16) → Add a way to analyse the chemical structure of all
179
+ --------------------------------------------------------------------------------
180
+ (34) → extend virus-information into the bioroebe project.
181
+ --------------------------------------------------------------------------------
182
+ (35) → Add a way to analyse the chemical structure of all
187
183
  aminoacids. We wish to show the chemical formula.
188
-
189
184
  E. g. if we input:
190
-
191
185
  "phenylalanin"
192
-
193
- Then the C9N should be shown, of its -R part.
194
- ^^^ but H gets removed, and O as well.
195
- wtf?
196
- I don't understand why it removes H and O so perhaps
197
- don't remove that part. But still show the -R.
198
-
199
- -------------------------------------------------------------------------------
200
- (17) FIX THE COLOURIZATION BUG; THIS ONE TRIGGERED THE WHOLE
201
- REWRITE AFTER ALL!
202
- -------------------------------------------------------------------------------
203
- (18) FIX TAXONOMY related-problems AS WELL
204
- ^^^^^^ AND DOCUMENT THIS related-problems.
205
- -------------------------------------------------------------------------------
206
- (19) Do note that z will then be a String, not a sequence object anymore.
207
- (This may be subject to change in the future, but for now, aka
208
- **February 2020**, it is that way.)
209
- ^^^^
210
- -------------------------------------------------------------------------------
211
- (20) ^^^ colours are appended. That should not be the case!
212
- ADD SOMETHING NEW ... some todo entries
213
- and some python tool
214
- -------------------------------------------------------------------------------
215
- (21) → rewrite the whole project anew
186
+ Then the C9N should be shown, of its -R part.
187
+ ^^^ but H gets removed, and O as well.
188
+ wtf?
189
+ I don't understand why it removes H and O so perhaps
190
+ don't remove that part. But still show the -R.
191
+ --------------------------------------------------------------------------------
192
+ (36) → FIX THE COLOURIZATION BUG; THIS ONE TRIGGERED THE WHOLE
193
+ REWRITE AFTER ALL!
194
+ --------------------------------------------------------------------------------
195
+ (37) FIX TAXONOMY related-problems AS WELL
196
+ ^^^^^^ AND DOCUMENT THIS related-problems.
197
+ --------------------------------------------------------------------------------
198
+ (38) Do note that z will then be a String, not a sequence object anymore.
199
+ (This may be subject to change in the future, but for now, aka
200
+ **February 2020**, it is that way.)
201
+ ^^^^
202
+ --------------------------------------------------------------------------------
203
+ (39) → ^^^ colours are appended. That should not be the case!
204
+ ADD SOMETHING NEW ... some todo entries
205
+ and some python tool
206
+ --------------------------------------------------------------------------------
207
+ (40) rewrite the whole project anew
216
208
  - improve the documentation
217
- - focus on class Protein first and add
218
- all_dna_combinations or something like
219
- that, as well as:
220
- .backtrans
221
- .reverse_translate
222
- -------------------------------------------------------------------------------
223
- (22) → AND THEN test on windows as well.
224
- ^^^^^^^^^^^^^^
225
- -------------------------------------------------------------------------------
226
- (23)
227
- Reduced alphabets for proteins | [not implemented yet]
228
- ^^^ check this as well
229
-
230
- require 'bioroebe/base/commandline_application/aminoacids.rb'
231
- ^^^ verify whether we need this really in the commandline
232
- application part. Or perhaps it should be part of
233
- Base.
234
-
235
- - in bioroebe, test sequel for taxonomy...
236
-
237
-
238
- - LEARN FUCKING JAVA; combine it with bioroebe though. work through
239
- bioroebe and as you go, also write related-problems into java.
240
- Add restriction thingy complete in java AND bioroebe
241
- show how many will be cut.
242
- Improve this also in bioroebe at the same time
243
- add a GUI in swing too, also in ruby-gtk.
244
- document publish IMPROVE
245
- for now just show how many segments we will
246
- generate.
247
-
248
- First focus on bioroebe.
249
- ^^^^^^^^^^^^^^^
250
-
251
- -
252
- efetch "https://www.ncbi.nlm.nih.gov/gene/744779"
253
- ^^^ test this. again
254
-
255
- -------------------------------------------------------------------------------
256
- (25) fix tk-levensthein
257
- -------------------------------------------------------------------------------
258
- (26) → rewrite the whole project anew
209
+ - focus on class Protein first and add
210
+ all_dna_combinations or something like
211
+ that, as well as:
212
+ .backtrans
213
+ .reverse_translate
214
+ --------------------------------------------------------------------------------
215
+ (41) →
216
+ Reduced alphabets for proteins | [not implemented yet]
217
+ ^^^ check this as well
218
+ require 'bioroebe/base/commandline_application/aminoacids.rb'
219
+ ^^^ verify whether we need this really in the commandline
220
+ application part. Or perhaps it should be part of
221
+ Base.
222
+ - in bioroebe, test sequel for taxonomy...
223
+ - LEARN FUCKING JAVA; combine it with bioroebe though. work through
224
+ bioroebe and as you go, also write related-problems into java.
225
+ Add restriction thingy complete in java AND bioroebe
226
+ show how many will be cut.
227
+ Improve this also in bioroebe at the same time
228
+ add a GUI in swing too, also in ruby-gtk.
229
+ document publish IMPROVE
230
+ for now just show how many segments we will
231
+ generate.
232
+ First focus on bioroebe.
233
+ ^^^^^^^^^^^^^^^
234
+ -
235
+ efetch "https://www.ncbi.nlm.nih.gov/gene/744779"
236
+ ^^^ test this. again
237
+ --------------------------------------------------------------------------------
238
+ (42) → fix tk-levensthein
239
+ --------------------------------------------------------------------------------
240
+ (43) rewrite the whole project anew
259
241
  - improve the documentation
260
- - rework the WHOLE tutorial as well
261
- - focus on class Protein first and add
262
- all_dna_combinations or something like
263
- that
264
- .backtrans
265
- .reverse_translate
266
- -------------------------------------------------------------------------------
267
- (27) → analyze /Depot/Temp/Bioroebe/1CEZ.pdb
268
-
269
- ^^^
270
- support this. Already works half-way, we started writing a pdb parser.
271
- this should work in general, for .fasta files as well.
272
- ..........................................................................
273
- (28) → SINATRA STUFF:
242
+ - rework the WHOLE tutorial as well
243
+ - focus on class Protein first and add
244
+ all_dna_combinations or something like
245
+ that
246
+ .backtrans
247
+ .reverse_translate
248
+ --------------------------------------------------------------------------------
249
+ (44) → analyze /Depot/Temp/Bioroebe/1CEZ.pdb
250
+ ^^^
251
+ support this. Already works half-way, we started writing a pdb parser.
252
+ this should work in general, for .fasta files as well.
253
+ --------------------------------------------------------------------------------
254
+ (45) → SINATRA STUFF:
274
255
  FIX AND EXTEND SINATRA IN BIOROEBE.
275
256
  extend it too.
276
-
277
257
  http://localhost:4567/random_aminoacids
278
258
  ^^^ add a form there
279
-
280
259
  add emboss show orf or so
281
260
  and special-display on sinatra kaa
282
261
  where the nucleotide sequence has numbers
283
262
  ^^^
284
- -------------------------------------------------------------------------------
285
- (29) pick any virus and begin to amass tons of data; and then when done
286
- also connect this into a GUI for use therein.
287
-
288
- See:
263
+ --------------------------------------------------------------------------------
264
+ (46) pick any virus and begin to amass tons of data; and then when done
265
+ also connect this into a GUI for use therein.
266
+ See:
289
267
  https://raw.githubusercontent.com/labsquare/fastQt/master/screenshot.gif
290
- ^^^^^^^
291
-
292
- begin with a circovirus
293
-
294
- ^^^^^^^
295
-
296
- DOCUMENT THIS AS YOU GO
297
- research about circovirus too
298
-
299
- https://www.ncbi.nlm.nih.gov/nuccore/NC_038391.1
300
-
301
-
302
-
303
-
304
-
305
- -------------------------------------------------------------------------------
306
- (1) → Fix:
307
-
308
- require 'bioroebe/toplevel_methods/open_reading_frames.rb'
309
-
310
- Something is wrong; it returns regions that contain
311
- a stop codon, which can not be true.
312
-
313
- -------------------------------------------------------------------------------
314
- (3) → Fix: extend glycovirology parts
315
- seek stuff in viral genomes
316
- -------------------------------------------------------------------------------
317
- (4) →
318
-
319
- seq = Bio::Sequence::NA.new("atgcatgcaaaaaaa")
320
- puts seq
321
- puts seq.complement
322
- puts seq.subseq(3,8)
323
- puts seq.subseq(3,8).complement #wont work
324
- p seq.gc_percent
325
- p (100 - seq.gc_percent) # at_percent
326
- p seq.composition
327
- puts seq.translate
328
- puts seq.translate(2)
329
- puts seq.translate(1,11)
330
- puts seq.translate.codes
331
- puts seq.translate.names
332
- puts seq.translate.composition
333
- puts seq.translate.molecular_weight
334
- puts seq.complement.translate
335
- ^^^ make sure this works
336
- seq = Bioroebe::Sequence.new("atgcatgcaaaaaaa")
337
- puts seq
338
- puts seq.complement
339
- -------------------------------------------------------------------------------
340
- (5) →
341
- make sure we have a good fasta-showing widget
342
- show how many nucleotides are
343
- AND add support to modify this as-is
344
- ^^^^
345
- -------------------------------------------------------------------------------
346
- (6) → In BioRoebe:
347
-
348
- Add a table showing how compatible bioroebe is compared to the other
349
- bio-projects, starting with biophp.
350
-
351
- Also show the status how much is complete in each,
352
- including Bio (ruby-bio) the main ruby project here.
353
- And add a table which functionality is implemented
354
- in Java already.
355
- -------------------------------------------------------------------------------
356
- (7)
357
- ********************************************************************************
358
- What happens when we treat the lambda genome with EcoRI?
359
- ********************************************************************************
360
-
361
- 3 chromosomal fragments are produced.
362
- ^^^ test this
363
- so first download the lambda genome.
364
- download lambda
365
- download lambda_genome
366
- ^^^
367
- then:
368
-
369
- Bioroebe.digest_this_dna(:lambda_genome, with: :EcoRI)
370
- Bioroebe.digest_this_dna("/root/Bioroebe/fasta/NC_001416.1_Enterobacteria_phage_lambda_complete_genome.fasta", with: :EcoRI)
371
-
372
-
373
- ^^^ test this API and document it as well.
374
- ^^^ and say how many fragments will be created in this CIRCULAR
375
- DNA.
376
- ^^^ this now works kind of ... but it must be better
377
- documented and we must test this with more data.
378
- -------------------------------------------------------------------------------
379
- (8) → add the bioroebe logo to sinatra, but as appropriate size,
380
- via base64. perhaps width 50 or so. need to determine
381
- which size fits here.
382
- -------------------------------------------------------------------------------
383
- (9) → Integrate http://nc2.neb.com/NEBcutter2/cutshow.php?name=ffe1d68e-
384
-
385
- in particular the visual part.
386
- -------------------------------------------------------------------------------
387
- (10) → https://international.neb.com/products/r0196-ncii#Product%20Information
388
- ^^^ autogenerate such an image, aka restriction cutting enzyme
389
- to indicate the target sequence.
390
- -------------------------------------------------------------------------------
391
- (6) → how to do codon optimization in E. coli? bioroebe must support this!
392
-
393
- we must first get a display which codon is very commonly used in
394
- E. coli, from some remote site ... japanese site I think.
395
-
396
- then, we analyse all possibilities.
397
-
398
- and then we look which codons may be improvable - display
399
- them on the commandline
400
-
401
- class: OptimizeCodons.new(of_this_sequence)
402
- -------------------------------------------------------------------------------
403
- (7) → Molecular size of "Ubiquitin"? "8.5 kDa".
268
+ ^^^^^^^
269
+ begin with a circovirus
270
+ ^^^^^^^
271
+ DOCUMENT THIS AS YOU GO
272
+ research about circovirus too
273
+ https://www.ncbi.nlm.nih.gov/nuccore/NC_038391.1
274
+ --------------------------------------------------------------------------------
275
+ (47) → Fix:
276
+ require 'bioroebe/toplevel_methods/open_reading_frames.rb'
277
+ Something is wrong; it returns regions that contain
278
+ a stop codon, which can not be true.
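For comparison, here is a minimal sketch of an ORF scan that cannot return a region with an internal stop codon, because each ORF is closed at the first in-frame stop after its ATG. It assumes the narrow definition "ATG up to and including the first in-frame stop", looks at the forward strand only, and uses the standard stop codons TAA, TAG and TGA. The method name is hypothetical and this is not the code in open_reading_frames.rb.

```ruby
# Hypothetical helper, not the bioroebe implementation.
STOP_CODONS = %w[TAA TAG TGA].freeze

# Returns [start, stop] pairs (1-based, inclusive, stop codon included)
# for all three forward reading frames.
def simple_orfs(dna)
  dna  = dna.upcase
  orfs = []
  3.times do |frame|
    codons      = (dna[frame..-1] || '').scan(/.{3}/)
    start_index = nil
    codons.each_with_index do |codon, i|
      if start_index.nil?
        start_index = i if codon == 'ATG'
      elsif STOP_CODONS.include?(codon)
        # Close the ORF at the first in-frame stop codon.
        orfs << [frame + start_index * 3 + 1, frame + i * 3 + 3]
        start_index = nil
      end
    end
  end
  orfs
end

p simple_orfs('ATGTAAATGCCCTGA')
```

Comparing the output of such a naive scan with what the real method returns on the same input might help to narrow down where the extra stop codons come from.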
279
+ --------------------------------------------------------------------------------
280
+ (48) → Fix: extend glycovirology parts
281
+ seek stuff in viral genomes
282
+ --------------------------------------------------------------------------------
283
+ (49) →
284
+ seq = Bio::Sequence::NA.new("atgcatgcaaaaaaa")
285
+ puts seq
286
+ puts seq.complement
287
+ puts seq.subseq(3,8)
288
+ puts seq.subseq(3,8).complement #wont work
289
+ p seq.gc_percent
290
+ p (100 - seq.gc_percent) # at_percent
291
+ p seq.composition
292
+ puts seq.translate
293
+ puts seq.translate(2)
294
+ puts seq.translate(1,11)
295
+ puts seq.translate.codes
296
+ puts seq.translate.names
297
+ puts seq.translate.composition
298
+ puts seq.translate.molecular_weight
299
+ puts seq.complement.translate
300
+ ^^^ make sure this works
301
+ seq = Bioroebe::Sequence.new("atgcatgcaaaaaaa")
302
+ puts seq
303
+ puts seq.complement
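To have something to compare both libraries against, the basic operations from the list above can be sketched in plain Ruby without any gem. The method names here are made up; whether a library's complement means the forward complement or the reverse complement is exactly the kind of thing worth double-checking against such a reference.

```ruby
# Plain-Ruby reference helpers (hypothetical names, no gem required).

# Complement read in the same direction as the input.
def forward_complement(dna)
  dna.downcase.tr('atgc', 'tacg')
end

# Complement read 5' to 3' on the opposite strand.
def reverse_complement(dna)
  forward_complement(dna).reverse
end

# Percentage of G and C nucleotides, rounded to two decimals.
def gc_percentage(dna)
  return 0.0 if dna.empty?
  (100.0 * dna.downcase.count('gc') / dna.size).round(2)
end

seq = 'atgcatgcaaaaaaa'
puts forward_complement(seq)
puts reverse_complement(seq)
p gc_percentage(seq)
```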
304
+ --------------------------------------------------------------------------------
305
+ (50) → In BioRoebe:
306
+ Add a table showing how compatible bioroebe is compared to the other
307
+ bio-projects, starting with biophp.
308
+ Also show the status how much is complete in each,
309
+ including Bio (ruby-bio) the main ruby project here.
310
+ And add a table which functionality is implemented
311
+ in Java already.
312
+ --------------------------------------------------------------------------------
313
+ (51)
314
+ ********************************************************************************
315
+ What happens when we treat the lambda genome with EcoRI?
316
+ ********************************************************************************
317
+ 3 chromosomal fragments are produced.
318
+ ^^^ test this
319
+ so first download the lambda genome.
320
+ download lambda
321
+ download lambda_genome
322
+ ^^^
323
+ then:
324
+ Bioroebe.digest_this_dna(:lambda_genome, with: :EcoRI)
325
+ Bioroebe.digest_this_dna("/root/Bioroebe/fasta/NC_001416.1_Enterobacteria_phage_lambda_complete_genome.fasta", with: :EcoRI)
326
+ ^^^ test this API and document it as well.
327
+ ^^^ and say how many fragments will be created in this CIRCULAR
328
+ DNA.
329
+ ^^^ this now works kind of ... but it must be better
330
+ documented and we must test this with more data.
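The fragment count depends on whether the DNA is treated as linear or circular: n cut sites give n + 1 fragments on a linear molecule but only n on a circle. A small plain-Ruby sketch of that counting, assuming the EcoRI recognition site GAATTC with the cut after the first G; the method names are hypothetical and this is not how Bioroebe.digest_this_dna is implemented. Sites that span the origin of a circular sequence are not detected by this sketch.

```ruby
# Hypothetical helpers, not the bioroebe API.
ECORI_SITE = 'GAATTC' # EcoRI cuts after the first G: G^AATTC

# 0-based positions at which the strand is cut.
def cut_positions(dna, site: ECORI_SITE, cut_offset: 1)
  positions = []
  index = 0
  while (found = dna.index(site, index))
    positions << found + cut_offset
    index = found + 1
  end
  positions
end

# Fragment lengths after a complete digest.
def fragment_lengths(dna, circular: false)
  dna  = dna.upcase
  cuts = cut_positions(dna)
  return [dna.size] if cuts.empty?
  bounds  = [0] + cuts + [dna.size]
  lengths = bounds.each_cons(2).map { |left, right| right - left }
  # On a circle the first and the last piece are one fragment.
  lengths[0] += lengths.pop if circular
  lengths
end

toy = 'AAGAATTCTTTTGAATTCGG' # made-up test sequence with two sites
p fragment_lengths(toy)
p fragment_lengths(toy, circular: true)
```

Running this on the downloaded lambda FASTA (with the header line stripped) would give an independent count to compare with the expected number of fragments.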
331
+ --------------------------------------------------------------------------------
332
+ (52) https://international.neb.com/products/r0196-ncii#Product%20Information
333
+ ^^^ autogenerate such an image, aka restriction cutting enzyme
334
+ to indicate the target sequence.
335
+ --------------------------------------------------------------------------------
336
+ (53) how to do codon optimization in E. coli? bioroebe must support this!
337
+ we must first get a display which codon is very commonly used in
338
+ E. coli, from some remote site ... japanese site I think.
339
+ then, we analyse all possibilities.
340
+ and then we look which codons may be improvable - display
341
+ them on the commandline
342
+ class: OptimizeCodons.new(of_this_sequence)
343
+ --------------------------------------------------------------------------------
344
+ (54) → Molecular size of "Ubiquitin"? "8.5 kDa".
404
345
  ^^^ this should be calculated automatically
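A sketch of how the weight could be calculated from the sequence: sum the average residue masses and add one water for the free termini. The mass table holds the commonly tabulated average values; the ubiquitin sequence is the 76-residue sequence as I remember it and should be checked against a database entry before relying on it. The method name is hypothetical.

```ruby
# Average residue masses in Da (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
  'G' => 57.0519,  'A' => 71.0788,  'S' => 87.0782,  'P' => 97.1167,
  'V' => 99.1326,  'T' => 101.1051, 'C' => 103.1388, 'L' => 113.1594,
  'I' => 113.1594, 'N' => 114.1038, 'D' => 115.0886, 'Q' => 128.1307,
  'K' => 128.1741, 'E' => 129.1155, 'M' => 131.1926, 'H' => 137.1411,
  'F' => 147.1766, 'R' => 156.1875, 'Y' => 163.1760, 'W' => 186.2132
}.freeze
WATER_MASS = 18.01528

# Hypothetical helper: average molecular weight of an unmodified peptide.
def molecular_weight(aminoacid_sequence)
  residues = aminoacid_sequence.upcase.each_char
  residues.sum { |aa| AVERAGE_RESIDUE_MASS.fetch(aa) } + WATER_MASS
end

# Assumed ubiquitin sequence (76 residues); verify before use.
UBIQUITIN = 'MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQ' \
            'QRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG'

puts "#{(molecular_weight(UBIQUITIN) / 1000.0).round(2)} kDa"
```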
405
- -------------------------------------------------------------------------------
406
- (8) → taxonomy !!!!!!!!!!!!!!!!!!
407
- -------------------------------------------------------------------------------
408
- (9) → Given a list of gene names that I would like to get chromosome/position
346
+ --------------------------------------------------------------------------------
347
+ (55) → taxonomy !!!!!!!!!!!!!!!!!!
348
+ --------------------------------------------------------------------------------
349
+ (56) → Given a list of gene names that I would like to get chromosome/position
409
350
  information for (in mm10). Is there some service online where I can
410
351
  paste this list? ^^^ enable this
411
- -------------------------------------------------------------------------------
412
- (10) → Show the frequency of codons in different tables
413
-
414
- This works quite ok, but right now the approach is to store
415
- this in a .yml file which is not ideal.
416
-
417
- Thus, we have to add two things:
418
- - The ability to store this into a SQL database
419
- - The ability to batch-download all of these codons,
420
- which first requires that we have a way to obtain all
421
- taxonomic ids.
422
- -------------------------------------------------------------------------------
423
- (11) → Add a way in bioroebe to store a gene into a yaml file
424
- or so, and to also load it up again. Perhaps simplify
425
- this automatically. Need some ways to describe that.
426
- -------------------------------------------------------------------------------
427
- (12) → Make bioroebe very useful from the www, no matter if via sinatra
352
+ --------------------------------------------------------------------------------
353
+ (57) → Make bioroebe very useful from the www, no matter if via sinatra
428
354
  or rails. It should be a tool-set project on the www as well.
429
- -------------------------------------------------------------------------------
430
- (13) → Suppose you have a GenBank file which you want to turn into a
431
- Fasta file. For example, let's consider the file cor6_6.gb
432
- which is included in the Biopython unit tests under the
355
+ --------------------------------------------------------------------------------
356
+ (58) → Suppose you have a GenBank file which you want to turn into a
357
+ Fasta file. For example, let's consider the file cor6_6.gb
358
+ which is included in the Biopython unit tests under the
433
359
  GenBank directory.
434
-
435
-
436
- need to check that this is equivalent, think about the API
437
- document it and then remove this entry.
438
-
439
- ^^^ also build a GUI for this.
440
- call it format-converter or so
441
- the GUI works somewhat but needs to be polished up.
442
- THEN THIS CAN BE REMOVED!!!!!!!
443
-
444
- -------------------------------------------------------------------------------
445
- (14) → We need a table where we can compile the strong promoters of different
360
+ need to check that this is equivalent, think about the API
361
+ document it and then remove this entry.
362
+ ^^^ also build a GUI for this.
363
+ call it format-converter or so
364
+ the GUI works somewhat but needs to be polished up.
365
+ THEN THIS CAN BE REMOVED!!!!!!!
366
+ --------------------------------------------------------------------------------
367
+ (59) We need a table where we can compile the strong promoters of different
446
368
  organisms and compare them.
447
-
448
- strong_promoters.yml
449
- -------------------------------------------------------------------------------
450
- (15) add:
451
- start position of exons
452
- and show the sequence based on that file
453
-
454
- Normally there's a "gene" entry for each gene, so:
455
- awk 'BEGIN{FS="\t"; OFS="\t"}{if($3 == "gene") print $1, $4, $5}' foo.gtf
456
-
457
- -------------------------------------------------------------------------------
458
- (16) → also add 30-33 to aminoacids hmmm difficult.
459
- -------------------------------------------------------------------------------
460
- (17) → http://bioinformatics.oxfordjournals.org/content/18/8/1135
369
+ strong_promoters.yml
370
+ --------------------------------------------------------------------------------
371
+ (60) → add:
372
+ start position of exons
373
+ and show the sequence based on that file
374
+ Normally there's a "gene" entry for each gene, so:
375
+ awk 'BEGIN{FS="\t"; OFS="\t"}{if($3 == "gene") print $1, $4, $5}' foo.gtf
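The awk one-liner above could be mirrored in Ruby roughly like this. It is a minimal sketch that assumes a tab-separated GTF where column 3 is the feature type and columns 4 and 5 are start and end; the method name and the sample lines are made up for illustration.

```ruby
# Hypothetical helper: collect [seqname, start, end] for every "gene" feature.
def gene_coordinates(gtf_text)
  rows = gtf_text.each_line.map do |line|
    next if line.start_with?('#')      # skip comment/header lines
    fields = line.chomp.split("\t")
    next unless fields[2] == 'gene'    # same filter as $3 == "gene" in awk
    [fields[0], Integer(fields[3]), Integer(fields[4])]
  end
  rows.compact
end

# Made-up sample data, not from a real annotation.
sample = [
  "#!comment",
  "chr1\tsrc\tgene\t100\t900\t.\t+\t.\tgene_id \"g1\";",
  "chr1\tsrc\texon\t100\t300\t.\t+\t.\tgene_id \"g1\";",
  "chr2\tsrc\tgene\t50\t80\t.\t-\t.\tgene_id \"g2\";"
].join("\n")

gene_coordinates(sample).each { |row| puts row.join("\t") }
```

Swapping 'gene' for 'exon' in the filter would give the exon start positions that this entry asks for.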
376
+ --------------------------------------------------------------------------------
377
+ (61) also add 30-33 to aminoacids hmmm difficult.
378
+ --------------------------------------------------------------------------------
379
+ (62) → http://bioinformatics.oxfordjournals.org/content/18/8/1135
461
380
  "TFBS: Computational framework for transcription factor
462
- binding site analysis"
463
- study the above and see if it can be included
464
- into bioroebe
465
-
466
- http://tfbs.genereg.net/
467
- -------------------------------------------------------------------------------
468
- (18) → They include trypsin, chymotrypsin, thrombin, plasmin, papain and factor Xa.
381
+ binding site analysis"
382
+ study the above and see if it can be included
383
+ into bioroebe
384
+ http://tfbs.genereg.net/
385
+ --------------------------------------------------------------------------------
386
+ (63) → They include trypsin, chymotrypsin, thrombin, plasmin, papain and factor Xa.
469
387
  ^^^ provide means to identify where they cut,
470
- and show this then by simulating a digest.
471
- return an array with the starting aminoacids.
472
- also document this on bioroebe todo
388
+ and show this then by simulating a digest.
389
+ return an array with the starting aminoacids.
390
+ also document this on bioroebe todo
473
391
  this is done via digestion/digestions
474
392
  but it's not quite perfect yet.
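As an illustration for one of these proteases, here is a sketch of a simulated trypsin digest. It assumes the commonly used simplified rule (cut on the C-terminal side of K or R, but not when the next residue is P); real cleavage has more exceptions, and the other proteases would each need their own rule. The method names are hypothetical and this is not the existing digestion code.

```ruby
# Hypothetical helpers, simplified trypsin rule only.

# Split after K or R unless the following residue is P.
def tryptic_digest(protein)
  protein.upcase.split(/(?<=[KR])(?!P)/)
end

# 1-based start position of every fragment within the original protein.
def fragment_start_positions(fragments)
  position = 1
  fragments.map do |fragment|
    start = position
    position += fragment.size
    start
  end
end

fragments = tryptic_digest('MKTAYRPLKGG') # made-up test peptide
p fragments
p fragment_start_positions(fragments)
```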
475
- -------------------------------------------------------------------------------
476
- (19) → a) add a commandline way to generate a random protein
477
- with a specified length and then display it on the
393
+ --------------------------------------------------------------------------------
394
+ (64) → a) add a commandline way to generate a random protein
395
+ with a specified length and then display it on the
478
396
  commandline [DONE] !!!
479
-
480
- bioroebe --random-aminoacids=33
481
- bioroebe --n-aminoacids=33
482
- sinatra:
483
-
484
- random_aminoacids/33 [Implemented! 23.09.2019]
485
-
486
- also added the gtk-GUI code here; needs to be
487
- documented briefly, then this part is completely
488
- done. continue on random_aminoacids: in particular
489
- add a gtk_entry that specifies, no, a spin button
490
- that states how many no... an entry or so
491
- to state how many aminoacids to generate
492
- randomly
493
-
494
- b) add a way to generate a cDNA sequence from such a
495
- protein and view all possible sequences from that
496
- sequence. ^^^
497
-
498
- Enable this BOTH from the commandline AND from the
499
- interactive variant and from sinatra! Hmmmm.
500
-
501
- -------------------------------------------------------------------------------
502
- (1) → add an option to design a
503
-
504
- degenerate primer
505
- -------------------------------------------------------------------------------
506
- (2) Add upcase to sequences and ensure that it works; also document it
507
- internally and in the .pdf tutorial
508
- what does that mean? upcase as method? hmmm.
509
-
510
- ..........................................................................
511
- (1) → http://www.biomart.org/other/user-docs.pdf
512
- ^^^ work through this
513
- ^^^ integrate the old .cgi part and improve as you go
514
- ..........................................................................
515
- (1) → Access geninfo numbers easily.
397
+ bioroebe --random-aminoacids=33
398
+ bioroebe --n-aminoacids=33
399
+ sinatra:
400
+ random_aminoacids/33 [Implemented! 23.09.2019]
401
+ also added the gtk-GUI code here; needs to be
402
+ documented briefly, then this part is completely
403
+ done. continue on random_aminoacids: in particular
404
+ add a gtk_entry that specifies, no, a spin button
405
+ that states how many no... an entry or so
406
+ to state how many aminoacids to generate
407
+ randomly
408
+ b) add a way to generate a cDNA sequence from such a
409
+ protein and view all possible sequences from that
410
+ sequence. ^^^
411
+ Enable this BOTH from the commandline AND from the
412
+ interactive variant and from sinatra! Hmmmm.
413
+ --------------------------------------------------------------------------------
414
+ (65) → add an option to design a
415
+ degenerate primer
416
+ --------------------------------------------------------------------------------
417
+ (66) Add upcase to sequences and ensure that it works; also document it
418
+ internally and in the .pdf tutorial
419
+ what does that mean? upcase as method? hmmm.
420
+ --------------------------------------------------------------------------------
421
+ (67) → http://www.biomart.org/other/user-docs.pdf
422
+ ^^^ work through this
423
+ ^^^ integrate the old .cgi part and improve as you go
424
+ --------------------------------------------------------------------------------
425
+ (68) Access geninfo numbers easily.
516
426
  Search for these and download them.
517
- -------------------------------------------------------------------------------
518
- - Add all of bioruby into bioroebe:
519
-
520
- continuous project
521
- https://github.com/biopython/biopython
522
- https://github.com/bioruby/bioruby/tree/master/lib/bio
523
- -------------------------------------------------------------------------------
524
- (3) https://github.com/bioruby/bioruby/issues/134
525
- ^^^ check this, for restriction enzymes
526
- http://rebase.neb.com/rebase/enz/MboII.html
527
-
528
- Bio::RestrictionEnzyme.cut(seq, 'MboII').primary rescue [seq]
529
- => ["agaagattaggatt", "gatgat"]
530
- > seq = seq.reverse_complement
531
- > Bio::RestrictionEnzyme.cut(seq, 'MboII').primary rescue [seq]
532
- => ["atcatcaatcctaatcttct"]
533
- -------------------------------------------------------------------------------
534
- (4) → Document how an ORF is defined for the bioroebe project.
535
- ..........................................................................
536
- (5) Continue with biojava in bioroebe.
537
-
538
- → We need to make some table that tells us what is implemented
427
+ --------------------------------------------------------------------------------
428
+ (69) Add all of bioruby into bioroebe:
429
+ continuous project
430
+ https://github.com/biopython/biopython
431
+ https://github.com/bioruby/bioruby/tree/master/lib/bio
432
+ --------------------------------------------------------------------------------
433
+ (70) → https://github.com/bioruby/bioruby/issues/134
434
+ ^^^ check this, for restriction enzymes
435
+ http://rebase.neb.com/rebase/enz/MboII.html
436
+ Bio::RestrictionEnzyme.cut(seq, 'MboII').primary rescue [seq]
437
+ => ["agaagattaggatt", "gatgat"]
438
+ > seq = seq.reverse_complement
439
+ > Bio::RestrictionEnzyme.cut(seq, 'MboII').primary rescue [seq]
440
+ => ["atcatcaatcctaatcttct"]
441
+ --------------------------------------------------------------------------------
442
+ (71) → Document how an ORF is defined for the bioroebe project.
443
+ --------------------------------------------------------------------------------
444
+ (72) → Continue with biojava in bioroebe.
445
+ → We need to make some table that tells us what is implemented
539
446
  in java.
540
- → Make it possible to randomly generate aminoacids, and then,
447
+ → Make it possible to randomly generate aminoacids, and then,
541
448
  based on that, design degenerate DNA that matches to it) ←
542
449
  this also must work standalone, and be documented.
543
-
544
- document on bioroebe.cgi as well
545
-
546
- We can generate degenerate primers now:
547
-
548
- dprimer M-T-T-Y-Y-T-A-A-A-STOP
549
-
550
- ..........................................................................
551
- (1) → The codon tables:
552
- → In January we added a codon-table GUI to ruby-gtk3.
553
-
450
+ document on bioroebe.cgi as well
451
+ We can generate degenerate primers now:
452
+ dprimer M-T-T-Y-Y-T-A-A-A-STOP
453
+ --------------------------------------------------------------------------------
454
+ (73) → The codon tables:
455
+ → In January we added a codon-table GUI to ruby-gtk3.
554
456
  also enable an inverse table.
555
- Ala/A GCT, GCC, GCA, GCG GCN Leu/L TTA, TTG, CTT, CTC, CTA, CTG YTR, CTN
556
- Arg/R CGT, CGC, CGA, CGG, AGA, AGG CGN, MGR Lys/K AAA, AAG AAR
557
- Asn/N AAT, AAC AAY Met/M ATG
558
- Asp/D GAT, GAC GAY Phe/F TTT, TTC TTY
559
- Cys/C TGT, TGC TGY Pro/P CCT, CCC, CCA, CCG CCN
560
- Gln/Q CAA, CAG CAR Ser/S TCT, TCC, TCA, TCG, AGT, AGC TCN, AGY
561
- Glu/E GAA, GAG GAR Thr/T ACT, ACC, ACA, ACG ACN
562
- Gly/G GGT, GGC, GGA, GGG GGN Trp/W TGG
563
- His/H CAT, CAC CAY Tyr/Y TAT, TAC TAY
564
- Ile/I ATT, ATC, ATA ATH Val/V GTT, GTC, GTA, GTG GTN
565
- START ATG STOP TAA, TGA, TAG TAR, TRA
566
-
567
- I think this is already stored in:
568
- inverse_rna_codon_table.yml
569
-
570
- table = Bio::CodonTable[1]
571
- ^^^^ this is quite a useful feature of bioruby. We need to
572
- add this as well. Then document it.
573
-
574
- ^^^ document this better too
457
+ Ala/A GCT, GCC, GCA, GCG GCN Leu/L TTA, TTG, CTT, CTC, CTA, CTG YTR, CTN
458
+ Arg/R CGT, CGC, CGA, CGG, AGA, AGG CGN, MGR Lys/K AAA, AAG AAR
459
+ Asn/N AAT, AAC AAY Met/M ATG
460
+ Asp/D GAT, GAC GAY Phe/F TTT, TTC TTY
461
+ Cys/C TGT, TGC TGY Pro/P CCT, CCC, CCA, CCG CCN
462
+ Gln/Q CAA, CAG CAR Ser/S TCT, TCC, TCA, TCG, AGT, AGC TCN, AGY
463
+ Glu/E GAA, GAG GAR Thr/T ACT, ACC, ACA, ACG ACN
464
+ Gly/G GGT, GGC, GGA, GGG GGN Trp/W TGG
465
+ His/H CAT, CAC CAY Tyr/Y TAT, TAC TAY
466
+ Ile/I ATT, ATC, ATA ATH Val/V GTT, GTC, GTA, GTG GTN
467
+ START ATG STOP TAA, TGA, TAG TAR, TRA
468
+ I think this is already stored in:
469
+ inverse_rna_codon_table.yml
470
+ table = Bio::CodonTable[1]
471
+ ^^^^ this is quite a useful feature of bioruby. We need to
472
+ add this as well. Then document it.
473
+ ^^^ document this better too
575
474
  that we can now display all the different codon tables.
576
-
577
- This now sorta works semi-ok.
578
-
579
- -------------------------------------------------------------------------------
580
- (1) In the bioroebe-shell, enable input such as:
581
-
582
- NC_000011.10
583
-
584
- This shall quickly download this sequence into the
585
- local file, and also rename it properly.
586
- -------------------------------------------------------------------------------
587
- clone all of bioruby
588
- -------------------------------------------------------------------------------
589
- (1) → read through bioinformatics books and include material from them !!!
590
- ^^^^^ add more pictures ... possibly also of the GUIs.
591
- Also work through biopython and port everything important
592
- to bioroebe.
593
- -------------------------------------------------------------------------------
594
- - Add: DetectMotif
595
-
596
- This class shall be used for detecting subsequences.
597
- -------------------------------------------------------------------------------
598
- - Add new functionality
599
- -------------------------------------------------------------------------------
600
- - more documentation!
601
- -------------------------------------------------------------------------------
602
- - continue on bioroebe, and when it is done, write to the guy.
603
- -------------------------------------------------------------------------------
604
- - Rewrite bioroebe completely - add some tests, too or so, to
605
- test this. ^^^
606
- That way we learn how to write tests.
607
- AND ... we will actually start with the taxonomy project
608
- so that it finally works again.
609
- continue work on bioroebe
610
- MAKE BIOROEBE EPIC because this is what I will make money with.
611
- CONTINUE THE BIOROEBE PORT !
612
- ^^^^
613
- require 'bioroebe/constants/remote_urls.rb
614
- ^^^
615
- ncbi taxonomy databse: move this into this file.
616
- # require 'bioroebe/constants/aminoacid_families.rb'
617
- ^^^ also ... fix this here.
618
- also continue bioroebe port...
619
- hmm. and perhaps add something else, like the option to have
620
- multiple genes and multiple proteis
621
- and define workspaces.
622
- but start with taxonomy first.
623
- ^^^
624
- Also during that rewrite, make sure that the quality of the
625
- documentation improves. That way the whole project serves
626
- as advertisement too.
627
- wait with rewrite though......
628
-
629
- clone bioroebe in java, STEP BY STEP, as means of learning java too,
630
- for preparation at the TU Wien.
631
- ^^^ but first read up some java tutorial because I dont know related-problems about
632
- java
633
-
634
- CPK: international colour scheme
635
- add document in bioroebe + registration
636
- https://proteopedia.org/wiki/index.php/CPK
637
- ^^^
638
- extend bioroebe ... also with rosalind
639
- ^^^
640
- improve bioroebe so that it is supper
641
- and add C++
642
- extend bioroebe sinatra interface
643
- also add a footer to show which entries are available or so
644
- in bioroebe, make the postgresql database work again ...
645
-
646
-
647
-
648
- ..........................................................................
649
-
650
- ^^^ improve this whole project a lot
651
-
652
- before uploading then send email
653
-
654
-
655
- - 1fat.pdb
656
-
657
- ^^^ download this, also via bioshell
658
- download 1fat
659
- ^^^ notify the user about this
660
- but put it into the dir of bioshell
661
-
662
- → add:
663
-
664
- set_dna :insulin
665
- set_dna insulin
666
-
667
- This shall allow us to use the sequence of human insulin
668
- here. Also document this. Shall just make things more
669
- convenient for us.
670
-
671
- http://www.ncbi.nlm.nih.gov/gene/3630
672
-
673
- insulin = 'ncbi_gene: 3630'
674
- → becomes: http://www.ncbi.nlm.nih.gov/gene/3630
675
-
676
- wtf ... better to learn how NCBI uworks
677
- -------------------------------------------------------------------------------
678
- - Add a seuqence table int obioroebe for GFP, YFP etc
679
- and mae this show in both the interactio bioshell but
680
- also the main README.md
681
- -------------------------------------------------------------------------------
682
- - stop_frame1?
683
- ^^^ add support for this
684
- and stop_frame2?
685
- etcc
686
- to show stop-codons in this colour
687
- THEN UPLOAD!
688
- ^^^ this works now but is not documented
689
-
690
-
691
- -------------------------------------------------------------------------------
692
-
693
- - chop to first ATG
694
-
695
- chop :ATG
696
-
697
- ^^^^ enable this, to chop towards the first ATG
698
- sequence in the string
699
-
700
- -------------------------------------------------------------------------------
701
- → http://www.biophp.org/stats/describe_data/demo.php?show=formula
702
-
703
- ^^^ should also add documentation like this, also via www interface
704
- -------------------------------------------------------------------------------
705
- → add mouse chromsoome URL, also in the bioshell
706
- and the main README, to be of help for the
707
- user. add a mouse subsection.
708
- ..........................................................................
709
- → fix the taxonomy stuff...
710
- ..........................................................................
711
- (1) → add 2nd_orf
712
- → this shall scan for the 2nd orf
713
- → and third ORF as well, then, and document it.
714
- ..........................................................................
715
- (2) → Add a "cutter-range example" in restriction enzymes +
475
+ This now sorta works semi-ok.
476
+ --------------------------------------------------------------------------------
477
+ (74) → In the bioroebe-shell, enable input such as:
478
+ NC_000011.10
479
+ This shall quickly download this sequence into the
480
+ local file, and also rename it properly.
481
+ --------------------------------------------------------------------------------
482
+ (75) → clone all of bioruby
483
+ --------------------------------------------------------------------------------
484
+ (76) read through bioinformatics books and include material from them !!!
485
+ ^^^^^ add more pictures ... possibly also of the GUIs.
486
+ Also work through biopython and port everything important
487
+ to bioroebe.
488
+ --------------------------------------------------------------------------------
489
+ (77) Add: DetectMotif
490
+ This class shall be used for detecting subsequences.
491
+ --------------------------------------------------------------------------------
492
+ (78) → Add new functionality
493
+ --------------------------------------------------------------------------------
494
+ (79) → more documentation!!!
495
+ --------------------------------------------------------------------------------
496
+ (80) → Rewrite bioroebe completely - add some tests, too or so, to
497
+ test this. ^^^
498
+ That way we learn how to write tests.
499
+ AND ... we will actually start with the taxonomy project
500
+ so that it finally works again.
501
+ continue work on bioroebe
502
+ MAKE BIOROEBE EPIC because this is what I will make money with.
503
+ CONTINUE THE BIOROEBE PORT !
504
+ ^^^^
505
+ require 'bioroebe/constants/remote_urls.rb'
506
+ ^^^
507
+ ncbi taxonomy database: move this into this file.
508
+ # require 'bioroebe/constants/aminoacid_families.rb'
509
+ ^^^ also ... fix this here.
510
+ also continue bioroebe port...
511
+ hmm. and perhaps add something else, like the option to have
512
+ multiple genes and multiple proteins
513
+ and define workspaces.
514
+ but start with taxonomy first.
515
+ ^^^
516
+ Also during that rewrite, make sure that the quality of the
517
+ documentation improves. That way the whole project serves
518
+ as advertisement too.
519
+ wait with rewrite though......
520
+ clone bioroebe in java, STEP BY STEP, as means of learning java too,
521
+ for preparation at the TU Wien.
522
+ ^^^ but first read up on some java tutorial because I don't know the problems related to
523
+ java
524
+ CPK: international colour scheme
525
+ add document in bioroebe + registration
526
+ https://proteopedia.org/wiki/index.php/CPK
527
+ ^^^
528
+ extend bioroebe ... also with rosalind
529
+ ^^^
530
+ improve bioroebe so that it is super
531
+ and add C++
532
+ extend bioroebe sinatra interface
533
+ also add a footer to show which entries are available or so
534
+ in bioroebe, make the postgresql database work again ...
535
+ --------------------------------------------------------------------------------
536
+ (81) → ^^^ improve this whole project a lot
537
+ before uploading then send email
538
+ → add:
539
+ set_dna :insulin
540
+ set_dna insulin
541
+ This shall allow us to use the sequence of human insulin
542
+ here. Also document this. Shall just make things more
543
+ convenient for us.
544
+ http://www.ncbi.nlm.nih.gov/gene/3630
545
+ insulin = 'ncbi_gene: 3630'
546
+ → becomes: http://www.ncbi.nlm.nih.gov/gene/3630
547
+ wtf ... better to learn how NCBI works
548
+ --------------------------------------------------------------------------------
549
+ (82) Add a sequence table into bioroebe for GFP, YFP etc
550
+ and make this show in both the interactive bioshell but
551
+ also the main README.md
552
+ --------------------------------------------------------------------------------
553
+ (83) → http://www.biophp.org/stats/describe_data/demo.php?show=formula
554
+ ^^^ should also add documentation like this, also via www interface
555
+ --------------------------------------------------------------------------------
556
+ (84) Add a "cutter-range example" in restriction enzymes +
716
557
  table + examples + tutorial
717
-
718
- one example each in this overview.
719
-
558
+ one example each in this overview.
720
559
  Also, add in the documentation where this
721
560
  can be found.
722
- ..........................................................................
723
- (3) → Add aaruler, similar to "ruler"; in the bioshell.
724
- But we want to do this on the dna-sequence rather
725
- than the aminoacid sequence.
726
- This works but the display is not ideal.
727
- ..........................................................................
728
- (4) → Add some codon-usage analyzer. What shall it show? It
561
+ --------------------------------------------------------------------------------
562
+ (85) → Add some codon-usage analyzer. What shall it show? It
729
563
  should show how many codons are used, frequencies etc...
730
564
  by an organism, and compare that to other data.
731
- ..........................................................................
732
- (5) → Implement a GPCR interface.
733
-
565
+ --------------------------------------------------------------------------------
566
+ (86) → Implement a GPCR interface.
734
567
  This is for "G-protein coupled receptors."
735
568
  Denote which variants exist and so forth. Document it as well.
736
- ..........................................................................
737
- (6) → alu?
738
-
569
+ --------------------------------------------------------------------------------
570
+ (87) → alu?
739
571
  Will read from the file `/Programs/Ruby/2.3.0/lib/ruby/site_ruby/2.3.0/bioroebe/yaml/alu_elements.yml`.
740
572
  Bioroebe::ParseFasta: This sequence is assumed to be a protein.
741
573
  This sequence has 1317 aminoacids.
742
-
743
574
  We have identified a total of 1 entries in this fasta dataset.
744
575
  The ALU sequence in humans may be:
745
-
746
- GC ...
747
-
748
- ^^^ but that is not correct ... hmmm.
749
-
750
- (3) → The .pdb file that used to be distributed via bioroebe was way
751
- too large. Perhaps add a way to download it instead if needed
752
- e. g.:
753
-
754
- common_downloads:
755
-
756
- ^^^ add this and document it or something like that
757
- And perhaps add a small protein as an example how to
758
- work with .pdb files instead.
759
- -------------------------------------------------------------------------------
760
- (4) → Extend bioroebe to allow download
761
-
762
- PDB files
763
-
764
- id 3030
765
-
766
- and then display nice thingies to the user.
767
-
768
- http://www.pdb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=3030
769
- http://www.pdb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=2VEZ
770
-
771
- in 3EML 2VTP 2VEZ
772
- do
773
- ..........................................................................
774
- (1) → Fully integrate electron microscopy then remove the old entry.
576
+ GC ...
577
+ ^^^ but that is not correct ... hmmm.
578
+ (3) → The .pdb file that used to be distributed via bioroebe was way
579
+ too large. Perhaps add a way to download it instead if needed
580
+ e. g.:
581
+ common_downloads:
582
+ ^^^ add this and document it or something like that
583
+ And perhaps add a small protein as an example how to
584
+ work with .pdb files instead.
585
+ --------------------------------------------------------------------------------
586
+ (88) → Extend bioroebe to allow download
587
+ PDB files
588
+ id 3030
589
+ and then display nice thingies to the user.
590
+ http://www.pdb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=3030
591
+ http://www.pdb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=2VEZ
592
+ in 3EML 2VTP 2VEZ
593
+ do
594
+ --------------------------------------------------------------------------------
595
+ (89) → Fully integrate electron microscopy then remove the old entry.
775
596
  Test it though.
776
597
  Hmm... but ... we will first polish the main bioroebe
777
598
  gem AND the taxonomy gem and THEN AFTERWARDS
778
599
  integrate electron microscopy.
779
- ..........................................................................
780
- (1) → ORF Finder:
781
-
600
+ --------------------------------------------------------------------------------
601
+ (90) → ORF Finder:
782
602
  We must add an ORF finder for the bioroebe project,
783
603
  similar to the NCBI ORF Finder.
784
-
785
604
  This works partially... start_stop works but we do not
786
605
  yet find all subsequences.
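A minimal sketch of what "finding all subsequences" could look like. It assumes an ORF is defined as `ATG`, followed by whole codons, up to the first in-frame stop codon (`TAA`, `TAG` or `TGA`). The bioroebe definition of an ORF is not documented yet, so both this definition and the method name `find_orfs` are assumptions made for illustration only.

```ruby
# Hypothetical helper, not part of the bioroebe API.
# Assumed ORF definition: ATG ... first in-frame TAA/TAG/TGA.
STOP_CODONS = %w[TAA TAG TGA].freeze

def find_orfs(sequence)
  # The zero-width lookahead lets the scan restart at every position,
  # so overlapping and nested ORFs are reported as well.
  # The lazy quantifier stops at the first in-frame stop codon.
  pattern = /(?=(ATG(?:[ACGT]{3})*?(?:#{STOP_CODONS.join('|')})))/
  sequence.upcase.scan(pattern).flatten
end

find_orfs('ATGAAATAGATGCCCTGA').each { |orf| puts orf }
```

Only the forward strand is searched here; the reverse complement would have to be passed in separately.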
787
-
788
- ..........................................................................
789
- (1) → must change determine whether we have protein or nucleotide or
606
+ --------------------------------------------------------------------------------
607
+ (91) → must change how we determine whether we have protein or nucleotide or
790
608
  so via a toplevel method!
791
- ..........................................................................
792
- (1) → there is a talens module.
793
- we have to improve on it for a while
794
- better docu
795
- more testing
796
- then we can get rid of this entry here
797
- ..........................................................................
798
- (1) → 33.44
609
+ --------------------------------------------------------------------------------
610
+ (92) → there is a talens module.
611
+ we have to improve on it for a while
612
+ better docu
613
+ more testing
614
+ then we can get rid of this entry here
615
+ --------------------------------------------------------------------------------
616
+ (93) → 33.44
799
617
  Next showing the nucleotides 33 to 44 (including 33 and 44).
800
- The length of the fragment will be 12 nucleotides.
801
- 5' - 2;70;130;180 - 3'
802
- ^^^ there is some problem; we somehow embed the colour codes,
803
- which should not happen.
804
- ..........................................................................
805
- (1) → set_aa DTLCIGYHAN NSTDTVDTVL EKNVTVTHSV NLLEDKHNGK LCKLRGVAPL HLGKCNIAGW ILGNPECESL STASSWSYIV ETSNSDNGTC YPGDFINYEE LREQLSSVSS FERFEIFPKT SSWPNHDNKG VTAACPHAGA KSFYKNLIWL VKKGNSYPKL NQSYINDKGK EVLVLWGIHH PSTTADQQSL YQNADAYVFV GTSRYSKKFK PEIATRPKVR DQEGRMNYYW TLVEPGDKIT FEATGNLVVP RYAFMERNAG SGIIISDTPV HDCNTTCQTP EGAINTSLPF QNIHPITIGK CPKYVKSTKL RLATGLRNVP SIQSRGLFGA IAGFIEGGWT GMVDGWYGYH HQNEQGSGYA ADLKSTQNAI DKITNKVNSV IKMNTQFTAV GKEFNHLEKR IENLNKKVDD GFLDIWTYNA ELLVLLENER TLDYHDSNVK NLYEKVRNQL KNNAKEIGNG CFEFYHKCDN TCMESVKNGT YDYPKYSEEA KLNREKIDGV KLESTRIYHH HHHH
806
-
807
- ^^^ enable copy/pasting,
808
- then reverse_sequence
809
- dna_sequence?
810
-
811
- @type=:dna>
812
-
813
- BIO SHELL> aaseq?
618
+ The length of the fragment will be 12 nucleotides.
619
+ 5' - 2;70;130;180 - 3'
620
+ ^^^ there is some problem; we somehow embed the colour codes,
621
+ which should not happen.
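The range input above can be sketched as follows, assuming that `33.44` means a 1-based, inclusive range. The method names are hypothetical and only illustrate the arithmetic; the slice is taken on the plain sequence string, before any colour codes are added.

```ruby
# Hypothetical helpers, not part of the bioroebe API.

# Turn input such as "33.44" into the Ruby range 33..44.
def nucleotide_range(input)
  first, last = input.split('.').map { |part| Integer(part) }
  first..last
end

# Return the nucleotides in that range (1-based, both ends included).
def subsequence(sequence, input)
  range = nucleotide_range(input)
  sequence[(range.first - 1)..(range.last - 1)]
end

fragment = subsequence('ACGT' * 15, '33.44')
puts "5' - #{fragment} - 3'"
puts "The length of the fragment will be #{fragment.size} nucleotides."
```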
622
+ --------------------------------------------------------------------------------
623
+ (94) → set_aa DTLCIGYHAN NSTDTVDTVL EKNVTVTHSV NLLEDKHNGK LCKLRGVAPL HLGKCNIAGW ILGNPECESL STASSWSYIV ETSNSDNGTC YPGDFINYEE LREQLSSVSS FERFEIFPKT SSWPNHDNKG VTAACPHAGA KSFYKNLIWL VKKGNSYPKL NQSYINDKGK EVLVLWGIHH PSTTADQQSL YQNADAYVFV GTSRYSKKFK PEIATRPKVR DQEGRMNYYW TLVEPGDKIT FEATGNLVVP RYAFMERNAG SGIIISDTPV HDCNTTCQTP EGAINTSLPF QNIHPITIGK CPKYVKSTKL RLATGLRNVP SIQSRGLFGA IAGFIEGGWT GMVDGWYGYH HQNEQGSGYA ADLKSTQNAI DKITNKVNSV IKMNTQFTAV GKEFNHLEKR IENLNKKVDD GFLDIWTYNA ELLVLLENER TLDYHDSNVK NLYEKVRNQL KNNAKEIGNG CFEFYHKCDN TCMESVKNGT YDYPKYSEEA KLNREKIDGV KLESTRIYHH HHHH
624
+ ^^^ enable copy/pasting,
625
+ then reverse_sequence
626
+ dna_sequence?
627
+ @type=:dna>
628
+ BIO SHELL> aaseq?
814
629
  DTLCIGYHANNSTDTVDTVLEKNVTVTHSVNLLEDKHNGKLCKLRGVAPLHLGKCNIAGWILGNPECESLSTASSWSYIVETSNSDNGTCYPGDFINYEELREQLSSVSSFERFEIFPKTSSWPNHDNKGVTAACPHAGAKSFYKNLIWLVKKGNSYPKLNQSYINDKGKEVLVLWGIHHPSTTADQQSLYQNADAYVFVGTSRYSKKFKPEIATRPKVRDQEGRMNYYWTLVEPGDKITFEATGNLVVPRYAFMERNAGSGIIISDTPVHDCNTTCQTPEGAINTSLPFQNIHPITIGKCPKYVKSTKLRLATGLRNVPSIQSRGLFGAIAGFIEGGWTGMVDGWYGYHHQNEQGSGYAADLKSTQNAIDKITNKVNSVIKMNTQFTAVGKEFNHLEKRIENLNKKVDDGFLDIWTYNAELLVLLENERTLDYHDSNVKNLYEKVRNQLKNNAKEIGNGCFEFYHKCDNTCMESVKNGTYDYPKYSEEAKLNREKIDGVKLESTRIYHHHHHH
815
- BIO SHELL> aasize?
816
- This sequence has 50 aminoacids.
817
- ^^^ that is not correct.
818
-
819
- ..........................................................................
820
- (1) → add this functionality:
821
-
630
+ BIO SHELL> aasize?
631
+ This sequence has 50 aminoacids.
632
+ ^^^ that is not correct.
633
+ --------------------------------------------------------------------------------
634
+ (95) → add this functionality:
822
635
  meting temper
823
636
  melting temper
824
637
  melting_temperature?
825
-
826
- ^^^ for short primers
827
-
828
- 4°C for each G/C base pair
829
- → 2°C for each A/T base pair
830
-
831
- Also add an explanation as can be seen here:
832
-
833
- http://comments.gmane.org/gmane.comp.lang.ruby.bio/1182
834
-
835
- "I discovered the above discussion by accident, about two
836
- years lateron. :)
837
-
838
- I am not using email-discussions or usegroups/newsgroups
839
- in general, largely because I have never been able to
840
- keep up to date with them and deal with emails properly;
841
- I am more a casual emails user myself, growing up in a
842
- www-world where phpBB really made it convenient to
843
- communicate with other people. So probably a bit after the
844
- emails-people use emails.
845
-
846
- At any rate, when I noticed it, I decided on my todo-list
847
- that I will improve the melting-temperature calculation
848
- of BioRoebe.
849
-
850
- Note that NCBI Blast and several other sites also already
851
- have very good algorithms in this regards, so the prime
852
- use case for BioRoebe is to explain a bit the algorithms
853
- and also provide a commandline-way to calculate them,
854
- using ruby. The latter may be useful and rather easy for
855
- scripted use.
856
- ..........................................................................
857
- (1) → show insulin
638
+ ^^^ for short primers
639
+ 4°C for each G/C base pair
640
+ → 2°C for each A/T base pair
641
+ Also add an explanation as can be seen here:
642
+ http://comments.gmane.org/gmane.comp.lang.ruby.bio/1182
643
+ "I discovered the above discussion by accident, about two
644
+ years lateron. :)
645
+ I am not using email-discussions or usegroups/newsgroups
646
+ in general, largely because I have never been able to
647
+ keep up to date with them and deal with emails properly;
648
+ I am more a casual emails user myself, growing up in a
649
+ www-world where phpBB really made it convenient to
650
+ communicate with other people. So probably a bit after the
651
+ emails-people use emails.
652
+ At any rate, when I noticed it, I decided on my todo-list
653
+ that I will improve the melting-temperature calculation
654
+ of BioRoebe.
655
+ Note that NCBI Blast and several other sites also already
656
+ have very good algorithms in this regards, so the prime
657
+ use case for BioRoebe is to explain a bit the algorithms
658
+ and also provide a commandline-way to calculate them,
659
+ using ruby. The latter may be useful and rather easy for
660
+ scripted use.
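The rule of thumb for short primers mentioned above (4°C per G/C, 2°C per A/T, often called the Wallace rule) could be sketched like this. The method name is hypothetical and this is not the bioroebe implementation; it is only meant to show the calculation.

```ruby
# Hypothetical helper, not part of the bioroebe API.
# Rule of thumb for short primers: Tm = 4 * (G + C) + 2 * (A + T)
def wallace_melting_temperature(primer)
  bases = primer.upcase
  gc = bases.count('GC') # number of G or C nucleotides
  at = bases.count('AT') # number of A or T nucleotides
  (4 * gc) + (2 * at)
end

puts wallace_melting_temperature('ATGC') # 2 G/C and 2 A/T
```

The estimate is only meaningful for short oligonucleotides; longer sequences need a different model.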
661
+ --------------------------------------------------------------------------------
662
+ (96) → show insulin
858
663
  ^^^ to show the insulin structure
859
- how to find it? no idea...
860
- but we should have these structures already made available somewhere.
861
- ..........................................................................
862
- (1) → Todo: find family of enzymes, based on sequence structure
664
+ how to find it? no idea...
665
+ but we should have these structures already made available somewhere.
666
+ --------------------------------------------------------------------------------
667
+ (97) → Todo: find family of enzymes, based on sequence structure
863
668
  alone.
864
- ..........................................................................
865
- (1) → https://pubchem.ncbi.nlm.nih.gov/compound/16131099#section=Top
866
-
867
- ^^^ this website is quite interesting; try to use components
868
- from it.
869
- -------------------------------------------------------------------------------
870
- (1) → Add some option to show the aminoacid sequence, at the least
871
- store it; and optionally show it.
872
-
873
- possibly always report how many aminoacids are
874
- part of that file; and optionally also show
875
- the whole sequence.
876
- -------------------------------------------------------------------------------
877
- (1) → WORK THROUGH the PROTOCOL AT BOKU. THEN WORK THROUGH THE VARIOUST
669
+ --------------------------------------------------------------------------------
670
+ (98) → WORK THROUGH the PROTOCOL AT BOKU. THEN WORK THROUGH THE VARIOUS
878
671
  TIDBITS AT UNI WIEN STARTING WITH HEIKO.
879
- ^^^ this is where we are now.
672
+ ^^^ this is where we are now.
880
673
  we are at the beginning of 1b ... hmmmm, so first go through the material at the
881
674
  BOKU. Then delete this.
882
- -------------------------------------------------------------------------------
883
- (1) → Begin tk-bindings for bioroebe, following the gtk stuff.
884
- -------------------------------------------------------------------------------
885
- (2) → frame_value = position_of_the_stop_codon - position_of_the_start_codon
675
+ --------------------------------------------------------------------------------
676
+ (99) → Begin tk-bindings for bioroebe, following the gtk stuff.
677
+ --------------------------------------------------------------------------------
678
+ (100) → frame_value = position_of_the_stop_codon - position_of_the_start_codon
886
679
  ^^^ continue on this ...
887
- -------------------------------------------------------------------------------
888
- (1) → improve both the gtk-apps parts, and the sinatra web-interface,
680
+ --------------------------------------------------------------------------------
681
+ (101) → improve both the gtk-apps parts, and the sinatra web-interface,
889
682
  and other GUI-like elements. The idea is to make this software
890
683
  more useful for people around the world, which should help
891
684
  increase its adoption rate.
892
- -------------------------------------------------------------------------------
893
- (2) → Look to integrate this:
894
-
895
- http://www.ncbi.nlm.nih.gov/nuccore/NM_007315.3?report=fasta&log$=seqview&format=text
685
+ --------------------------------------------------------------------------------
686
+ (102) → Look to integrate this:
687
+ http://www.ncbi.nlm.nih.gov/nuccore/NM_007315.3?report=fasta&log$=seqview&format=text
896
688
  ^^^
897
- -------------------------------------------------------------------------------
898
- (1) → Clone and document the getorf functionality properly.
899
-
900
- See: http://emboss.sourceforge.net/apps/cvs/emboss/apps/getorf.html
901
- -------------------------------------------------------------------------------
902
- (2) → set_dna_sequence alu
903
-
904
- ^^^ fetch random alu
905
-
906
- ^^^ alu sequence
907
- Ok we started this now adding more details, but we
908
- need to become better at searching for this
909
- sequence.
910
- -------------------------------------------------------------------------------
911
- (3) → We need to make available the ... thingy magick
689
+ --------------------------------------------------------------------------------
690
+ (103) → We need to make available the ... thingy magick
912
691
  emboss functionality. that may seem useful
913
692
  but also feel free to extend these parts for
914
693
  bioroebe as necessary.
915
- -------------------------------------------------------------------------------
916
- (4) → integrate electron_microscopy fully
917
- This will take more time, so first we finish with the
694
+ --------------------------------------------------------------------------------
695
+ (104) → integrate electron_microscopy fully
696
+ This will take more time, so first we finish with the
918
697
  taxonomy module instead.
919
- -------------------------------------------------------------------------------
920
- (5) → Improve support for BLAST up until
921
-
698
+ --------------------------------------------------------------------------------
699
+ (105) → Improve support for BLAST up until
922
700
  middle of 2015 so that I am better prepared
923
701
  for work-related stuff. In order for this
924
- to succed, we first have to understand
702
+ to succeed, we first have to understand
925
703
  BLAST very well.
926
-
927
704
  So, work on BLAST tutorial at bioinf page:
928
-
929
- bl bioinf; rf bioinf
930
- -------------------------------------------------------------------------------
931
- (3) → integrate a "codon usage database", whatever this means.
705
+ bl bioinf; rf bioinf
706
+ --------------------------------------------------------------------------------
707
+ (106) → integrate a "codon usage database", whatever this means.
932
708
  It is a cool database anyway. Then document this.
933
709
  First, create a codon-usage analysis on a per-FASTA
934
710
  site basis. Meaning we download a fasta sequence
935
711
  and calculate the codon usage from there.
936
-
937
- ^^^ and add some GUI to this. hmmm
938
- ..........................................................................
939
- (4) → Input sequence:
940
-
712
+ ^^^ and add some GUI to this. hmmm
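A minimal sketch of the per-sequence codon usage calculation described above. It assumes the FASTA header has already been removed and that the sequence starts in the correct reading frame; the method names are hypothetical.

```ruby
# Hypothetical helpers, not part of the bioroebe API.

# Count how often each codon occurs, reading in frame 1.
def codon_counts(sequence)
  counts = Hash.new(0)
  sequence.upcase.delete("\n").chars.each_slice(3) do |triplet|
    next unless triplet.size == 3 # ignore a trailing partial codon
    counts[triplet.join] += 1
  end
  counts
end

# Convert the raw counts into relative frequencies.
def codon_frequencies(sequence)
  counts = codon_counts(sequence)
  total = counts.values.sum.to_f
  counts.transform_values { |count| count / total }
end

codon_frequencies('ATGAAAATGTT').each do |codon, frequency|
  puts "#{codon}: #{(frequency * 100).round(1)}%"
end
```

Comparing the result against a reference table for an organism would be a separate step.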
713
+ --------------------------------------------------------------------------------
714
+ (107) → Input sequence:
941
715
  MFLMVSPTAYHQNKDECFLP
942
716
  TAYHQNKDECMVSPTAYHQN
943
717
  KDECFLPTAYHQMVSPTAYH
944
718
  QNKDECFLPTAYHQ
945
719
  Reverse Translated sequence
946
-
947
720
  ATG TTY YTNATGGTNWSNCCNACNGCNTAYCAYCARAAYAARGAYGARTGYTTYYTNCCN
948
721
  ACNGCNTAYCAYCARAAYAARGAYGARTGYATGGTNWSNCCNACNGCNTAYCAYCARAAY
949
722
  AARGAYGARTGYTTYYTNCCNACNGCNTAYCAYCARATGGTNWSNCCNACNGCNTAYCAY
950
723
  CARAAYAARGAYGARTGYTTYYTNCCNACNGCNTAYCAYCAR
951
-
952
724
  ^^^ we should also show this on the commandline AND the
953
- www ... hmmm.
954
- ..........................................................................
955
- (5) → enable a graphical layer so that we can find out which
725
+ www ... hmmm.
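The reverse translation shown above can be approximated with a lookup table of ambiguous (IUPAC) codons. This is a sketch under assumptions: the table collapses Leu, Ser and Arg into a single code each (`YTN`, `WSN`, `MGN`), which matches the output quoted above but is slightly over-inclusive, and the names `AMBIGUOUS_CODONS` and `backtranslate` are hypothetical.

```ruby
# Hypothetical helper, not part of the bioroebe API.
# One ambiguous codon per aminoacid. Leu, Ser and Arg are collapsed
# (assumption), so e.g. YTN also covers TTY, which encodes Phe.
AMBIGUOUS_CODONS = {
  'A' => 'GCN', 'R' => 'MGN', 'N' => 'AAY', 'D' => 'GAY', 'C' => 'TGY',
  'Q' => 'CAR', 'E' => 'GAR', 'G' => 'GGN', 'H' => 'CAY', 'I' => 'ATH',
  'L' => 'YTN', 'K' => 'AAR', 'M' => 'ATG', 'F' => 'TTY', 'P' => 'CCN',
  'S' => 'WSN', 'T' => 'ACN', 'W' => 'TGG', 'Y' => 'TAY', 'V' => 'GTN'
}.freeze

# Raises KeyError for letters that are not in the table.
def backtranslate(protein)
  protein.upcase.delete(" \n").each_char.map { |aa| AMBIGUOUS_CODONS.fetch(aa) }.join
end

puts backtranslate('MFLMVSP')
```

The same table could back both the commandline output and the www interface.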
726
+ --------------------------------------------------------------------------------
727
+ (108) → enable a graphical layer so that we can find out which
956
728
  transcription factor activates which gene(s). This
957
729
  should show e. g. a transcription factor highlighting
958
730
  a target genetic area.
959
- ..........................................................................
960
- (2) → We should add more screenshots, make them available on imgur
731
+ --------------------------------------------------------------------------------
732
+ (109) → We should add more screenshots, make them available on imgur
961
733
  as well, after storing them locally. Start with the more
962
734
  important functionality.
963
-
964
- ..........................................................................
965
- (2) → clone serial cloner or whatever the name was, that GUI,
966
- so that we can offer the same functionality.
967
- ..........................................................................
968
- (1) →
969
-
970
- # * searching for PubMed IDs given a query string:
971
- # * Bio::PubMed#esearch (recommended)
972
- # * Bio::PubMed#search (only retrieves top 20 hits; will be deprecated)
973
- ^^^ implement this
974
-
975
-
976
- ..........................................................................
977
- (3) → Be able to solve exercise 16 in bioroebe
978
- ..........................................................................
979
- (4) The taxonomy part should be fully integrated, without it
980
- being a standalone part anymore.
981
- continue on the taxonomy stuff.
982
- ne day this will work again *shake fist*
983
- -------------------------------------------------------------------------------
984
- (5) → re1 = Bio::RestrictionEnzyme::DoubleStranded.new(enzyme1)
985
-
986
- ^^^ add this? hmmmm
987
- ^^^ from here.
988
- -------------------------------------------------------------------------------
989
- (1) → Colourize exon/intron boundaries.
990
- -------------------------------------------------------------------------------
991
- (2) → In bioroebe: enhance phylogeny stuff and perhaps automatically
735
+ --------------------------------------------------------------------------------
736
+ (110) → clone serial cloner or whatever the name was, that GUI,
737
+ so that we can offer the same functionality.
738
+ --------------------------------------------------------------------------------
739
+ (111) →
740
+ # * searching for PubMed IDs given a query string:
741
+ # * Bio::PubMed#esearch (recommended)
742
+ # * Bio::PubMed#search (only retrieves top 20 hits; will be deprecated)
743
+ ^^^ implement this
744
+ --------------------------------------------------------------------------------
745
+ (112) → Be able to solve exercise 16 in bioroebe
746
+ --------------------------------------------------------------------------------
747
+ (113) → re1 = Bio::RestrictionEnzyme::DoubleStranded.new(enzyme1)
748
+ ^^^ add this? hmmmm
749
+ ^^^ from here.
750
+ --------------------------------------------------------------------------------
751
+ (114) Colourize exon/intron boundaries.
752
+ --------------------------------------------------------------------------------
753
+ (115) In bioroebe: enhance phylogeny stuff and perhaps automatically
992
754
  generate pictures here.
993
- -------------------------------------------------------------------------------
994
- (1) → In sinatra: add a backtranseq entry point, perhaps
755
+ --------------------------------------------------------------------------------
756
+ (116) → In sinatra: add a backtranseq entry point, perhaps
995
757
  alias it as well.
996
-
997
- ^^ sync this to ruby-gtk3? hmm
998
-
999
- bioroebe --protein-to-dna
1000
-
1001
- ^^^ this shall start the GTK3 variant
1002
-
1003
- -------------------------------------------------------------------------------
1004
- (1) require 'rubygems/text'
1005
- include Gem::Text
1006
- levenshtein_distance 'shevy', 'chevy' # => 1
1007
-
1008
- ^^^ add some class that outputs
1009
- the Levenshtein distance on the commandline
1010
- and also in the interactive bioshell
1011
-
1012
- https://github.com/rubygems/rubygems/blob/master/lib/rubygems/text.rb
1013
- ^^^ actually move that part into bioroebe itself...
1014
-
1015
- -------------------------------------------------------------------------------
1016
- (1) → add _source to all APIs in sinatra there. Ensure that this works
758
+ ^^ sync this to ruby-gtk3? hmm
759
+ bioroebe --protein-to-dna
760
+ ^^^ this shall start the GTK3 variant
761
+ --------------------------------------------------------------------------------
762
+ (117) → require 'rubygems/text'
763
+ include Gem::Text
764
+ levenshtein_distance 'shevy', 'chevy' # => 1
765
+ ^^^ add some class that outputs
766
+ the Levenshtein distance on the commandline
767
+ and also in the interactive bioshell
768
+ https://github.com/rubygems/rubygems/blob/master/lib/rubygems/text.rb
769
+ ^^^ actually move that part into bioroebe itself...
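A minimal sketch of what a standalone Levenshtein helper inside bioroebe could look like, so that the dependency on `rubygems/text` is not needed. The method name and its placement are assumptions for illustration only; this is not existing bioroebe API.

```ruby
# Hypothetical helper (not part of bioroebe): classic dynamic-programming
# Levenshtein distance, keeping only the previous and the current row.
def levenshtein_distance(a, b)
  return b.length if a.empty?
  return a.length if b.empty?

  previous = (0..b.length).to_a # distances from the empty prefix of a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,       # deletion
        current[j - 1] + 1,    # insertion
        previous[j - 1] + cost # substitution (or match)
      ].min
    end
    previous = current
  end
  previous.last
end

puts levenshtein_distance('shevy', 'chevy') # prints 1
```

A commandline wrapper and a bioshell command could both delegate to such a method.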
770
+ --------------------------------------------------------------------------------
771
+ (118) add _source to all APIs in sinatra there. Ensure that this works
1017
772
  too. The user should be able to view the source code.
1018
773
  ^^^ it has been added for 2 methods so far in sinatra; we need
1019
- to add it for the remaining ones too. Then we can remove
1020
- this entry point.
1021
- -------------------------------------------------------------------------------
1022
- (2) → Check out ExPASy
1023
- PeptideCutter
1024
- also offer this functionality, through commandline, GUI
774
+ to add it for the remaining ones too. Then we can remove
775
+ this entry point.
776
+ --------------------------------------------------------------------------------
777
+ (119) → Check out ExPASy
778
+ PeptideCutter
779
+ also offer this functionality, through commandline, GUI
1025
780
  and sinatra.
1026
- https://web.expasy.org/peptide_cutter/
1027
- We now have added trypsin but we should add more here; and
781
+ https://web.expasy.org/peptide_cutter/
782
+ We now have added trypsin but we should add more here; and
1028
783
  still have to add support for sinatra here.
1029
- -------------------------------------------------------------------------------
1030
- (3) → melting temperature subsection
1031
-
784
+ --------------------------------------------------------------------------------
785
+ (120) → melting temperature subsection
1032
786
  hmmm .... molecular weight calculation works now ... but
1033
787
  ... is it correct for a ssDNA string? hmm...
1034
- -------------------------------------------------------------------------------
1035
- (3) → Add useful formulas for bioshell.
1036
-
1037
-
1038
- ...........................................................................
1039
- (1) → Degenerate Primers
1040
-
1041
- You can try to determine the degenerate primers via the Shell
1042
- component. Issue the following instructions:
1043
-
1044
- degenerate_primer
1045
-
1046
- ^^^ expand that subsection
1047
- more explanations and examples
1048
-
1049
- -------------------------------------------------------------------------------
1050
- (1) → Copy the functionality of plotorf:
1051
-
788
+ --------------------------------------------------------------------------------
789
+ (121) → Degenerate Primers
790
+ You can try to determine the degenerate primers via the Shell
791
+ component. Issue the following instructions:
792
+ degenerate_primer
793
+ ^^^ expand that subsection
794
+ more explanations and examples
795
+ --------------------------------------------------------------------------------
796
+ (122) Copy the functionality of plotorf:
1052
797
  See:
1053
-
1054
- http://www.bioinformatics.nl/cgi-bin/emboss/plotorf
1055
-
798
+ http://www.bioinformatics.nl/cgi-bin/emboss/plotorf
1056
799
  Also extend emboss info on the main homepage.
1057
800
  For plotorf we also need to be able to generate images.
1058
801
  We also need a simpler toplevel API here, something like
1059
802
  Bioroebe.return_all_open_reading_frames(of_this_sequence, use_these_as_start_codons = start_codons?, use_these_as_stop_codons = stop_codons?)
1060
803
  ^^^
1061
804
  Bioroebe.return_all_ORFS
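A rough sketch of the simpler toplevel API described in this entry. It is written as a plain method rather than as `Bioroebe.return_all_open_reading_frames`, scans only the three forward frames, and the default start and stop codons are assumptions; the real implementation may differ.

```ruby
# Hypothetical sketch (not the bioroebe implementation): return every ORF
# found in the three forward reading frames, from a start codon up to and
# including the first in-frame stop codon.
def return_all_open_reading_frames(
    sequence,
    start_codons = ['ATG'],
    stop_codons  = %w[TAA TAG TGA]
  )
  sequence = sequence.upcase
  orfs = []
  3.times do |frame|
    start_index = nil
    (frame..sequence.length - 3).step(3) do |i|
      codon = sequence[i, 3]
      if start_index.nil?
        start_index = i if start_codons.include?(codon)
      elsif stop_codons.include?(codon)
        orfs << sequence[start_index..i + 2]
        start_index = nil # look for the next ORF in this frame
      end
    end
  end
  orfs
end

p return_all_open_reading_frames('CCATGAAATAGGG') # prints ["ATGAAATAGGG"[0, 9]]
```

The reverse strand could be covered by calling the same method on the reverse complement.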
1062
-
1063
-
1064
-
1065
- -------------------------------------------------------------------------------
1066
- (2) Start nucleotide position is at: 142
1067
-
1068
- See the following example:
1069
-
1070
- BIO SHELL> highlight AAA
1071
- 5' - GTAACTGTTAAACTGTCAGGCAGGCGCTCAGGTGTACGTTTGATGCTCAGTAGTATTCCATTCTCGCGAGGGTCACGATACCCAAGATCTCCATGGCTTTCTGTTAGACGCAGCCGTGGACGACTAGAGCGTTTTTTTTTGGAAAGTATATGACCAGCACTCTACATCCTAACTAGAAGGTCTTCTAGGCGTACCAATATTAACGAATAGTGAGTGGTTACCCGTACCCGTCATGACGTCTATCATTAATT - 3'
1072
- BIO SHELL>
805
+ --------------------------------------------------------------------------------
806
+ (123) → Start nucleotide position is at: 142
807
+ See the following example:
808
+ BIO SHELL> highlight AAA
809
+ 5' - GTAACTGTTAAACTGTCAGGCAGGCGCTCAGGTGTACGTTTGATGCTCAGTAGTATTCCATTCTCGCGAGGGTCACGATACCCAAGATCTCCATGGCTTTCTGTTAGACGCAGCCGTGGACGACTAGAGCGTTTTTTTTTGGAAAGTATATGACCAGCACTCTACATCCTAACTAGAAGGTCTTCTAGGCGTACCAATATTAACGAATAGTGAGTGGTTACCCGTACCCGTCATGACGTCTATCATTAATT - 3'
810
+ BIO SHELL>
1073
811
  ^^^ this does not work; nothing is highlighted.
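To reproduce the highlighting bug outside the shell, a small standalone check may help. The sketch below does not use the bioshell; both method names and the ANSI colour code are assumptions chosen for illustration.

```ruby
# Hypothetical helpers (not bioroebe API) to check what "highlight AAA"
# is expected to do.

# Returns the 1-based start positions of every (possibly overlapping)
# occurrence of target in sequence.
def positions_of(sequence, target)
  positions = []
  offset = 0
  while (index = sequence.index(target, offset))
    positions << index + 1
    offset = index + 1
  end
  positions
end

# Wraps each non-overlapping occurrence of target in ANSI escape codes
# (bold yellow here) so that a terminal displays it highlighted.
def highlight_subsequence(sequence, target)
  sequence.gsub(target) { |match| "\e[1;33m#{match}\e[0m" }
end

dna = 'GTAACTGTTAAACTGTCAGG'
p positions_of(dna, 'AAA')
puts "5' - #{highlight_subsequence(dna, 'AAA')} - 3'"
```

If `positions_of` reports hits but the shell shows nothing, the problem is likely in the display step rather than in the search.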
1074
-
1075
- -------------------------------------------------------------------------------
1076
- (2) → Add a myristoylation signal
1077
-
1078
- Met-Gly-Xaa-Xaa-YXaa-Ser/Thr-Lys-Lys
1079
-
1080
- ^^^ but check first.
1081
-
1082
- -------------------------------------------------------------------------------
1083
- (3) → integrate the bioroebe_tutorial.cgi into the .md file completely.
1084
-
1085
- -------------------------------------------------------------------------------
1086
- (4) → Integrate everything from the biopython tutorial, if it makes
812
+ --------------------------------------------------------------------------------
813
+ (124) → Add a myristoylation signal
814
+ Met-Gly-Xaa-Xaa-YXaa-Ser/Thr-Lys-Lys
815
+ ^^^ but check first.
816
+ --------------------------------------------------------------------------------
817
+ (125) → integrate the bioroebe_tutorial.cgi into the .md file completely.
818
+ --------------------------------------------------------------------------------
819
+ (126) → Integrate everything from the biopython tutorial, if it makes
1087
820
  sense.
1088
-
1089
- -------------------------------------------------------------------------------
1090
- (5) → Improve the codon-optimizer in Bioroebe, including the
821
+ --------------------------------------------------------------------------------
822
+ (127) → Improve the codon-optimizer in Bioroebe, including the
1091
823
  documentation. We need to make this really useful.
1092
- -------------------------------------------------------------------------------
1093
- (6) →
1094
- 5'- TACACGGCACAT -3'
1095
- 3'- ATGTGCCGTGTA -5'
1096
-
1097
- Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
1098
-
1099
- ^^^ integrate mirror repeats creation
1100
- and searching for them. Hmmm.
1101
- -------------------------------------------------------------------------------
1102
- (7) continue porting bioroebe/taxonomy
1103
-
1104
- ^^^^^^^^^^
1105
- It has been 5 years ...
1106
-
1107
- ^^^ taxonomy/colours/colours is being integrated
824
+ --------------------------------------------------------------------------------
825
+ (128) →
826
+ 5'- TACACGGCACAT -3'
827
+ 3'- ATGTGCCGTGTA -5'
828
+ Imperfect DNA mirror repeats (IMRs) are less than 100% symmetrical.
829
+ ^^^ integrate mirror repeats creation
830
+ and searching for them. Hmmm.
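A small sketch of how mirror repeats could be checked. A mirror repeat is treated here as a sequence that reads the same when the same strand is reversed, and the symmetry is scored as the fraction of matching positions; both method names are assumptions, not existing bioroebe API.

```ruby
# Hypothetical helpers (not bioroebe API).

# Fraction of positions in the first half that match their mirror image in
# the second half; 1.0 means a perfect mirror repeat, anything lower would
# be an imperfect mirror repeat (IMR).
def mirror_symmetry(sequence)
  seq = sequence.upcase
  half = seq.length / 2
  return 1.0 if half.zero?

  matches = (0...half).count { |i| seq[i] == seq[-1 - i] }
  matches.to_f / half
end

def mirror_repeat?(sequence)
  mirror_symmetry(sequence) == 1.0
end

p mirror_repeat?('TACACGGCACAT')  # the example strand from above
p mirror_symmetry('TACACGGCACAA') # one mismatch out of six pairs
```

Searching a longer sequence could then slide a window over it and keep windows whose symmetry exceeds a chosen threshold.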
831
+ --------------------------------------------------------------------------------
832
+ (129) continue porting bioroebe/taxonomy
833
+ ^^^^^^^^^^
834
+ It has been 5 years ...
835
+ ^^^ taxonomy/colours/colours is being integrated
1108
836
  ^^^ that is the next step, so that
1109
- we no longer need it.
1110
-
1111
- -------------------------------------------------------------------------------
1112
- (8) → find out which bacteria all contain the needle complex; find out
837
+ we no longer need it.
838
+ --------------------------------------------------------------------------------
839
+ (130) → find out which bacteria all contain the needle complex; find out
1113
840
  the sequence for the needle complex as well and study it;
1114
841
  find the positions of the genes responsible.
1115
-
1116
- -------------------------------------------------------------------------------
1117
- (9) → Add trypsin_digest, also in the shell, but possibly
842
+ --------------------------------------------------------------------------------
843
+ (131) → Add trypsin_digest, also in the shell, but possibly
1118
844
  on toplevel as well (if the input is a protein sequence).
1119
-
1120
845
  Also, more generally in the shell, add this:
1121
-
1122
- digest trypsin
1123
-
846
+ digest trypsin
1124
847
  ^^^ onto the aminoacid
1125
-
1126
- "This is the result when digesting with trypsin."
1127
- And document it; but do not digest if a proline
1128
- follows !!!
1129
- ^^^ document this too into .md
1130
-
1131
- -------------------------------------------------------------------------------
1132
- (10) → in bioroebe, add a Coomassie check... do we include
1133
- arginine or not.
1134
-
1135
- ..........................................................................
1136
- (11) add codon usage in bioroebe
1137
- -------------------------------------------------------------------------------
1138
- (12) → Clone the following functionality.
1139
-
1140
- http://www.bioinformatics.nl/cgi-bin/emboss/help/sirna
1141
- -------------------------------------------------------------------------------
1142
- (13) Improve the "find and scan" subsection. We must be able to find
1143
- subsequences; check for "matches" as well, including the bioshell.
1144
- -------------------------------------------------------------------------------
1145
- (14) → Clone the CLUSTAL format alignment.
1146
- -------------------------------------------------------------------------------
1147
- (15) We need to be able to load up a whole genome into bioroebe,
1148
- and then be able to manipulate it.
1149
-
1150
- ^^^ perhaps test this with some example
1151
- data or so...
1152
- -------------------------------------------------------------------------------
1153
- (16) → Restriction enzymes:
1154
-
1155
- Add a subsection about restriction enzymes including
1156
- examples, and also explain how to use this in bioroebe.
1157
- Minute by minute...
1158
-
1159
- AND add resources to useful sites.
1160
- ^^^ start it... have to expand on it
1161
-
1162
- Also, improve the part about restriction enzymes in
1163
- general, so that we can reproduce and verify the
1164
- information there.
1165
-
1166
- -------------------------------------------------------------------------------
1167
- (18) → clone pepinfo
1168
-
1169
- The program "pepinfo" plots various amino acid properties in
1170
- parallel for an input protein sequence.
1171
-
1172
- The types of plot available are:
1173
-
1174
- i. Hydrophobicity plots using the method of Kyte & Doolittle, the
1175
- optimal matching hydrophobicity scale (OHM) of Sweet & Eisenberg,
1176
- or consensus parameters (Eisenberg et al).
1177
-
1178
- ii. Histogram of the presence of residues with the physico-chemical
1179
- properties: Tiny, Small, Aliphatic, Aromatic, Non-polar, Polar,
1180
- Charged, Positive, Negative.
1181
-
1182
- The data are also written out to an output file.
1183
-
1184
- -------------------------------------------------------------------------------
1185
- (19) gff?
1186
-
1187
- There are 6 .gff3 files in the current directory.
1188
- We will simply pass the first entry there into class Bioroebe::Parser::GFF.
1189
- The accession id is `NZ_CP011602.1`.
1190
- Bioroebe::Parser::GFF: We are instructed to split into standalone files, but we
1191
- Bioroebe::Parser::GFF: can not do so, as there is not more than one accession id
1192
- Bioroebe::Parser::GFF: in this file.
1193
-
1194
- ^^^ we need an analyze-mode as well.
1195
-
1196
- ..........................................................................
1197
- (20) → ^^^^ add the ability to
1198
- show a ruler AND highlighting as well
1199
- ^^^ then document it.
1200
- ..........................................................................
1201
- (21) → https://github.com/bioperl/bioperl-live
1202
- Look what we can take from ^^^.
1203
-
1204
- https://github.com/bioperl/bioperl-live/tree/master/examples
1205
-
1206
- ..........................................................................
1207
- (23) → continue biojava, and bioroebe a bit
1208
-
1209
- Ideally we should get biojava to a working point.
1210
- ..........................................................................
1211
- (24) → Clone all of Emboss. :)
1212
- ..........................................................................
1213
- (25) → clone the functionality found at https://web.expasy.org/protparam/
1214
-
1215
- https://web.expasy.org/cgi-bin/protparam/protparam
1216
- ^^^ this is halfway done...
1217
-
1218
- ^^^^ also add pI calculation:
1219
-
1220
- Theoretical pI: 5.78
1221
-
1222
- -------------------------------------------------------------------------------
1223
- (27) → NP_417539.1
1224
-
848
+ "This is the result when digesting with trypsin."
849
+ And document it; but do not digest if a proline
850
+ follows !!!
851
+ ^^^ document this too into .md
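The trypsin rule stated in this entry (cleave after lysine or arginine, but not when a proline follows) can be sketched as below. The method name `trypsin_digest` is taken from the entry; the implementation itself is an assumption and not the code shipped in bioroebe.

```ruby
# Hypothetical sketch of trypsin_digest (not the bioroebe implementation).
# Trypsin cleaves C-terminal to K (lysine) or R (arginine), except when the
# next residue is P (proline).
def trypsin_digest(protein_sequence)
  # (?<=[KR]) : the position right after a K or R
  # (?!P)     : ... that is not immediately followed by a P
  protein_sequence.upcase.split(/(?<=[KR])(?!P)/)
end

p trypsin_digest('MKWVTFRPLLKAG')
# The R at position 7 is followed by P, so no cut happens there.
```

The shell command `digest trypsin` could call this on the currently assigned aminoacid sequence.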
852
+ --------------------------------------------------------------------------------
853
+ (132) → add codon usage in bioroebe
854
+ --------------------------------------------------------------------------------
855
+ (133) → Clone the following functionality.
856
+ http://www.bioinformatics.nl/cgi-bin/emboss/help/sirna
857
+ --------------------------------------------------------------------------------
858
+ (134) → Improve the "find and scan" subsection. We must be able to find
859
+ subsequences; check for "matches" as well, including the bioshell.
860
+ --------------------------------------------------------------------------------
861
+ (135) → Clone the CLUSTAL format alignment.
862
+ --------------------------------------------------------------------------------
863
+ (136) → We need to be able to load up a whole genome into bioroebe,
864
+ and then be able to manipulate it.
865
+ ^^^ perhaps test this with some example
866
+ data or so...
867
+ --------------------------------------------------------------------------------
868
+ (137) → Restriction enzymes:
869
+ Add a subsection about restriction enzymes including
870
+ examples, and also explain how to use this in bioroebe.
871
+ Minute by minute...
872
+ AND add resources to useful sites.
873
+ ^^^ start it... have to expand on it
874
+ Also, improve the part about restriction enzymes in
875
+ general, so that we can reproduce and verify the
876
+ information there.
877
+ --------------------------------------------------------------------------------
878
+ (138) clone pepinfo
879
+ The program "pepinfo" plots various amino acid properties in
880
+ parallel for an input protein sequence.
881
+ The types of plot available are:
882
+ i. Hydrophobicity plots using the method of Kyte & Doolittle, the
883
+ optimal matching hydrophobicity scale (OHM) of Sweet & Eisenberg,
884
+ or consensus parameters (Eisenberg et al).
885
+ ii. Histogram of the presence of residues with the physico-chemical
886
+ properties: Tiny, Small, Aliphatic, Aromatic, Non-polar, Polar,
887
+ Charged, Positive, Negative.
888
+ The data are also written out to an output file.
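As a starting point for the hydrophobicity part of a pepinfo clone, the Kyte & Doolittle scale can be averaged over a sliding window. The sketch below covers only that one plot type; the method name and the default window size of 9 are assumptions, and nothing here is existing bioroebe API.

```ruby
# Kyte & Doolittle hydropathy values per amino acid (one-letter code).
KYTE_DOOLITTLE = {
  'A' => 1.8,  'R' => -4.5, 'N' => -3.5, 'D' => -3.5, 'C' => 2.5,
  'Q' => -3.5, 'E' => -3.5, 'G' => -0.4, 'H' => -3.2, 'I' => 4.5,
  'L' => 3.8,  'K' => -3.9, 'M' => 1.9,  'F' => 2.8,  'P' => -1.6,
  'S' => -0.8, 'T' => -0.7, 'W' => -0.9, 'Y' => -1.3, 'V' => 4.2
}.freeze

# Hypothetical helper (not bioroebe API): average hydropathy per window.
# Returns one value per window position, rounded to three decimals.
# Raises KeyError for residues that are not in the table above.
def hydropathy_profile(protein, window = 9)
  residues = protein.upcase.chars
  return [] if window < 1 || residues.length < window

  residues.each_cons(window).map do |slice|
    (slice.sum { |aa| KYTE_DOOLITTLE.fetch(aa) } / window).round(3)
  end
end

p hydropathy_profile('MKWVTFISLLLLFSSAYS', 9)
```

The resulting array could be written to an output file or handed to a plotting layer once one exists.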
889
+ --------------------------------------------------------------------------------
890
+ (139) → gff?
891
+ There are 6 .gff3 files in the current directory.
892
+ We will simply pass the first entry there into class Bioroebe::Parser::GFF.
893
+ The accession id is `NZ_CP011602.1`.
894
+ Bioroebe::Parser::GFF: We are instructed to split into standalone files, but we
895
+ Bioroebe::Parser::GFF: can not do so, as there is not more than one accession id
896
+ Bioroebe::Parser::GFF: in this file.
897
+ ^^^ we need an analyze-mode as well.
898
+ --------------------------------------------------------------------------------
899
+ (140) → ^^^^ add the ability to
900
+ show a ruler AND highlighting as well
901
+ ^^^ then document it.
902
+ --------------------------------------------------------------------------------
903
+ (141) → https://github.com/bioperl/bioperl-live
904
+ Look what we can take from ^^^.
905
+ https://github.com/bioperl/bioperl-live/tree/master/examples
906
+ --------------------------------------------------------------------------------
907
+ (142) → continue biojava, and bioroebe a bit
908
+ Ideally we should get biojava to a working point.
909
+ --------------------------------------------------------------------------------
910
+ (143) → clone the functionality found at https://web.expasy.org/protparam/
911
+ https://web.expasy.org/cgi-bin/protparam/protparam
912
+ ^^^ this is halfway done...
913
+ ^^^^ also add pI calculation:
914
+ Theoretical pI: 5.78
915
+ --------------------------------------------------------------------------------
916
+ (144) → NP_417539.1
1225
917
  https://www.ncbi.nlm.nih.gov/protein/NP_417539.1
1226
918
  https://www.ncbi.nlm.nih.gov/protein/NP_417539.1?report=fasta
1227
-
1228
- ^^^ if the input is exactly like the above, on the first line,
919
+ ^^^ if the input is exactly like the above, on the first line,
1229
920
  download the sequence.
1230
- -------------------------------------------------------------------------------
1231
- (28) → Integrate these nice GUI parts:
1232
-
1233
- https://dev.to/kojix2/introduction-to-gr-rb-data-visualization-with-ruby-2c39
1234
-
1235
- -------------------------------------------------------------------------------
1236
- (29) → http://insilico.ehu.es/
1237
-
1238
- ^^^ check if we have all of this incorporated
1239
- -------------------------------------------------------------------------------
1240
- (30) → http://www.biostars.org/
1241
-
1242
- ^^^ regularly work through this
1243
- and try to help
1244
- and extend bioruby at the same time.
1245
- -------------------------------------------------------------------------------
1246
- (31) → The taxonomy-submodule should work one day, and be properly
1247
- documented as well. Perhaps integrate the parts of Taxonomy
1248
- that can be included into the toplevel domain.
1249
- -------------------------------------------------------------------------------
1250
- (32) → Enable:
1251
-
1252
- Bioroebe.set_genetic_code()
1253
- Bioroebe.set_genetic_code(to: 'Vertebrate Mitochondrial')
1254
-
921
+ --------------------------------------------------------------------------------
922
+ (145) → http://www.biostars.org/
923
+ ^^^ regularly work through this
924
+ and try to help
925
+ and extend bioruby at the same time.
926
+ --------------------------------------------------------------------------------
927
+ (146) → The taxonomy-submodule should work one day, and be properly
928
+ documented as well. Perhaps integrate the parts of Taxonomy
929
+ that can be included into the toplevel domain.
930
+ --------------------------------------------------------------------------------
931
+ (147) → Enable:
932
+ Bioroebe.set_genetic_code()
933
+ Bioroebe.set_genetic_code(to: 'Vertebrate Mitochondrial')
1255
934
  ^^^ enable this
1256
-
1257
- Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")
1258
- coding_dna.translate(table=2)
1259
- coding_dna.translate(table="Vertebrate Mitochondrial")
1260
- Seq('MAIVMGRWKGAR*', HasStopCodon(IUPACProtein(), '*'))
1261
- Seq('MAIVMGRWKGAR*')
1262
-
935
+ Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")
936
+ coding_dna.translate(table=2)
937
+ coding_dna.translate(table="Vertebrate Mitochondrial")
938
+ Seq('MAIVMGRWKGAR*', HasStopCodon(IUPACProtein(), '*'))
939
+ Seq('MAIVMGRWKGAR*')
1263
940
  ^^^ enable this as well; extend the documentation too.
1264
-
1265
- -------------------------------------------------------------------------------
1266
- (34) We have found a restriction enzyme called NheI.
1267
-
1268
- The sequence this 6-cutter relates to is: `5' - GCTAGC - 3'`
1269
-
1270
- This restriction enzyme will produce a blunt overhang.
1271
-
1272
- ^^^ nope, that is wrong
1273
- -------------------------------------------------------------------------------
1274
- (35) → Sau3A?
1275
- ^^^ enable this restriction site
1276
-
1277
- -------------------------------------------------------------------------------
1278
- (37) → Add matplotlib support.
1279
-
1280
- try_to_use_matplotlib
1281
- -------------------------------------------------------------------------------
1282
- (38) → https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmTools/RESTfulAPIs.html
1283
- -------------------------------------------------------------------------------
1284
- (39) → The following input:
1285
-
1286
- downcase; orf?; seq?
1287
-
1288
- leads to strange display. Something is wrong here, must be checked.
1289
- -------------------------------------------------------------------------------
1290
- (40) → Continue with rosalind problems.
1291
-
1292
- These challenges can be found here:
1293
-
1294
- http://rosalind.info/problems/sign/
1295
-
1296
- Also integrate these rosalind-quizzes into bioroebe
1297
- when possible.
1298
- -------------------------------------------------------------------------------
1299
- (41) → https://web.expasy.org/cgi-bin/peptide_mass/peptide-mass.pl
1300
-
1301
- ^^^ make the above usable in sinatra as well
1302
- -------------------------------------------------------------------------------
1303
- (42) → Integrate a way to search for commonly known promoters:
1304
-
1305
- promoters?
1306
- ^^^ this functionality
1307
- ^^^ this has to be expanded
1308
- and ...
1309
- -------------------------------------------------------------------------------
1310
- (43) → Integrate:
1311
-
1312
- http://biotools.nubic.northwestern.edu/OligoCalc.html
1313
- -------------------------------------------------------------------------------
1314
- (44) Extend the Java part of BioRoebe systematically..
1315
-
1316
- What should come next? Let's make a list.
1317
-
1318
- remove_numbers [DONE]
1319
- -------------------------------------------------------------------------------
1320
- (46) → Study gnuplot; one day we have to draw graphs.
1321
-
1322
- -------------------------------------------------------------------------------
1323
- (47) Add a genome browser, both ascii without GUI and also
1324
- with. In ruby-gtk.
1325
- -------------------------------------------------------------------------------
1326
- (48) Clone the functionality of:
1327
-
1328
- http://www.biophp.org/minitools/restriction_digest/demo.php
1329
- -------------------------------------------------------------------------------
1330
- (50) Add the loxP sequence to readme [DONE] and explain this
1331
- better on the main readme; and perhaps also assign
1332
- the sequence via the bioshell.
1333
- -------------------------------------------------------------------------------
1334
- (51) 33. Cephalodiscidae Mitochondrial UAA-Tyr Code (transl_table=33)
1335
-
1336
- AAs = FFLLSSSSYYY*CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSSKVVVVAAAADDEEGGGG
1337
- Starts = ---M-------*-------M---------------M---------------M------------
1338
- Base1 = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
1339
- Base2 = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
1340
- Base3 = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG
1341
-
1342
- ^^^ add a parser, and document it, that can take this input
1343
- and output the corresponding code, in a valid .yml file.
1344
- -------------------------------------------------------------------------------
1345
- (52) → Add to bioroebe the ability to add cloning vectors
1346
- and molecular_weight calculation
1347
- for this
1348
-
1349
- and also to show the sequence of a vector
1350
-
1351
- and how to add them to bioroebe
1352
-
1353
- also download sequence data for a vector - this
1354
- should probably be some interactive table or
1355
- something of the sort here.
1356
-
1357
- ^^^ this works a tiny bit now, must be documented still
1358
- via seq2? and so forth
1359
-
1360
- "pBR322 contains the genes for resistance to ampicillin and
1361
- tetracycline, and can be amplified with chloramphenicol.
1362
- The molecular weight is 2.83 x 10^6 daltons."
1363
-
1364
- ^^^ we also need a way to find out what resistance genes
1365
- are carried there.
1366
- -------------------------------------------------------------------------------
1367
- (53) → In the lambda genome sequence there are 10 EcoB and
1368
- 5 EcoK sites.
1369
- ^^^ verify this too, as an example as well
1370
- -------------------------------------------------------------------------------
1371
- (54) show restriction sites, composable and compatible with
1372
- serial clone ... hmm
1373
- -------------------------------------------------------------------------------
1374
- (55) → enable:
1375
- BIOROEBE_USE_COLOURS:
1376
- can be 0 or 1
1377
- what is this?
1378
- -------------------------------------------------------------------------------
1379
- (56) → Burrows-Wheeler-Transform (BWT)
1380
-
1381
- ^^^ add some method here
1382
- Bioroebe.burrows_wheeler_transform
1383
- ^^^ if no '$' char is in the input, then append it
1384
-
1385
- then, output the array of a SORTED BWT transform,
1386
- from the given input string.
1387
- document this properly as well
1388
-
1389
- also test this against my paper-result
1390
- with input being: "GATAG$".
1391
- -------------------------------------------------------------------------------
1392
- (56) Enable working with several genes... hmm and store that somewhere.
1393
- Something like a per-project workspace thingy.
1394
- -------------------------------------------------------------------------------
1395
- (57) → Add:
1396
-
1397
- http://nar.oxfordjournals.org/content/35/suppl_2/W71.long
1398
-
1399
- -------------------------------------------------------------------------------
1400
- (58) → Now, you may want to translate the nucleotides up to
1401
- the first in frame stop codon, and then stop (as
1402
- happens in nature):
1403
-
1404
- coding_dna.translate()
1405
- Seq('MAIVMGR*KGAR*', HasStopCodon(IUPACProtein(), '*'))
1406
- >>> coding_dna.translate(to_stop=True)
1407
- Seq('MAIVMGR', IUPACProtein())
1408
- ^^^^ support this hmmm.
1409
-
1410
- Then continue from here:
1411
-
1412
- https://people.duke.edu/~ccc14/pcfb/biopython/BiopythonSequences.html
1413
- -------------------------------------------------------------------------------
1414
- (59) → Add:
1415
-
1416
- set_dna :Ubiquitin
1417
- set_dna :ubiquitin
1418
-
1419
- ^^^ we want to obtain the ubiquitin sequence
1420
- -------------------------------------------------------------------------------
1421
- (59) → Telomeres
1422
-
1423
- Telomeres are listed from 5' to 3'.
1424
-
1425
- Example for the human telomeres would be:
1426
- 5'-TTAGGG-3'
1427
-
1428
- ^^^ is that correct?
1429
-
1430
- add:
1431
- doc_telomeres
1432
-
1433
- ^^^ add this to say the human telomere sequence
1434
- -------------------------------------------------------------------------------
1435
- (60) → ORF_positions?
941
+ --------------------------------------------------------------------------------
942
+ (148) → We have found a restriction enzyme called NheI.
943
+ The sequence this 6-cutter relates to is: `5' - GCTAGC - 3'`
944
+ This restriction enzyme will produce a blunt overhang.
945
+ ^^^ nope, that is wrong
946
+ --------------------------------------------------------------------------------
947
+ (149) Sau3A?
948
+ ^^^ enable this restriction site
949
+ --------------------------------------------------------------------------------
950
+ (150) → Add matplotlib support.
951
+ try_to_use_matplotlib
952
+ --------------------------------------------------------------------------------
953
+ (151) → https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmTools/RESTfulAPIs.html
954
+ --------------------------------------------------------------------------------
955
+ (152) → The following input:
956
+ downcase; orf?; seq?
957
+ leads to strange display. Something is wrong here, must be checked.
958
+ --------------------------------------------------------------------------------
959
+ (153) → Continue with rosalind problems.
960
+ These challenges can be found here:
961
+ http://rosalind.info/problems/sign/
962
+ Also integrate these rosalind-quizzes into bioroebe
963
+ when possible.
964
+ --------------------------------------------------------------------------------
965
+ (154) https://web.expasy.org/cgi-bin/peptide_mass/peptide-mass.pl
966
+ ^^^ make the above usable in sinatra as well
967
+ --------------------------------------------------------------------------------
968
+ (155) → Integrate a way to search for commonly known promoters:
969
+ promoters?
970
+ ^^^ this functionality
971
+ ^^^ this has to be expanded
972
+ and ...
973
+ --------------------------------------------------------------------------------
974
+ (156) → Integrate:
975
+ http://biotools.nubic.northwestern.edu/OligoCalc.html
976
+ --------------------------------------------------------------------------------
977
+ (157) → Extend the Java part of BioRoebe systematically..
978
+ What should come next? Let's make a list.
979
+ → remove_numbers [DONE]
980
+ --------------------------------------------------------------------------------
981
+ (158) → Study gnuplot; one day we have to draw graphs.
982
+ --------------------------------------------------------------------------------
983
+ (159) Add a genome browser, both an ASCII version without a GUI and
984
+ a GUI version written in ruby-gtk.
985
+ --------------------------------------------------------------------------------
986
+ (160) → Clone the functionality of:
987
+ http://www.biophp.org/minitools/restriction_digest/demo.php
988
+ --------------------------------------------------------------------------------
989
+ (161) → Add the loxP sequence to readme [DONE] and explain this
990
+ better on the main readme; and perhaps also assign
991
+ the sequence via the bioshell.
992
+ --------------------------------------------------------------------------------
993
+ (162) 33. Cephalodiscidae Mitochondrial UAA-Tyr Code (transl_table=33)
994
+ AAs = FFLLSSSSYYY*CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSSKVVVVAAAADDEEGGGG
995
+ Starts = ---M-------*-------M---------------M---------------M------------
996
+ Base1 = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
997
+ Base2 = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
998
+ Base3 = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG
999
+ ^^^ add a parser, and document it, that can take this input
1000
+ and output the corresponding code, in a valid .yml file.
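A minimal sketch of such a parser, assuming the input always carries the `AAs`, `Base1`, `Base2` and `Base3` lines shown above. The method name `parse_genetic_code` is hypothetical and not part of bioroebe; the YAML layout (one `codon: aminoacid` pair per line) is likewise only an assumption.

```ruby
require 'yaml'

# Parse an NCBI-style genetic code block into a Hash that maps
# each codon (e.g. "TAA") to its one-letter amino acid.
# Hypothetical helper, not an existing bioroebe method.
def parse_genetic_code(text)
  fields = {}
  text.each_line do |line|
    key, value = line.split('=', 2)
    next if value.nil? # skip lines without "key = value"
    fields[key.strip] = value.strip
  end
  aas = fields.fetch('AAs')
  base1, base2, base3 = fields.values_at('Base1', 'Base2', 'Base3')
  codon_table = {}
  aas.each_char.with_index do |aminoacid, index|
    codon_table[base1[index] + base2[index] + base3[index]] = aminoacid
  end
  codon_table
end

table_33 = <<~TEXT
  AAs    = FFLLSSSSYYY*CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSSKVVVVAAAADDEEGGGG
  Starts = ---M-------*-------M---------------M---------------M------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG
TEXT

# Emit the table as YAML; redirect to a .yml file if desired.
puts parse_genetic_code(table_33).to_yaml
```

The `Starts` line is read but not used here; storing the start codons would need a second key in the YAML file.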
1001
+ --------------------------------------------------------------------------------
1002
+ (163) → Add to bioroebe the ability to add cloning vectors
1003
+ and molecular_weight calculation
1004
+ for this
1005
+ and also to show the sequence of a vector
1006
+ and how to add them to bioroebe
1007
+ also download sequence data for a vector - this
1008
+ should probably be some interactive table or
1009
+ something of the sort here.
1010
+ ^^^ this works a tiny bit now, must be documented still
1011
+ via seq2? and so forth
1012
+ "pBR322 contains the genes for resistance to ampicillin and
1013
+ tetracycline, and can be amplified with chloramphenicol.
1014
+ The molecular weight is 2.83 x 10^6 daltons."
1015
+ ^^^ we also need a way to find out what resistance genes
1016
+ are carried there.
1017
+ --------------------------------------------------------------------------------
1018
+ (164) → In the lambda genome sequence there are 10 EcoB and
1019
+ 5 EcoK sites.
1020
+ ^^^ verify this too, as an example as well
1021
+ --------------------------------------------------------------------------------
1022
+ (165) → show restriction sites, composable and compatible with
1023
+ serial clone ... hmm
1024
+ --------------------------------------------------------------------------------
1025
+ (166) → enable:
1026
+ BIOROEBE_USE_COLOURS:
1027
+ can be 0 or 1
1028
+ what is this?
1029
+ --------------------------------------------------------------------------------
1030
+ (167) Burrows-Wheeler-Transform (BWT)
1031
+ ^^^ add some method here
1032
+ Bioroebe.burrows_wheeler_transform
1033
+ ^^^ if no '$' char is in the input, then append it
1034
+ then, output the array of a SORTED BWT transform,
1035
+ from the given input string.
1036
+ document this properly as well
1037
+ also test this against my paper-result
1038
+ with input being: "GATAG$".
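A possible sketch of the method described in this entry. The name `burrows_wheeler_transform` follows the entry, but the return value (a Hash holding the sorted rotations and the last column) is only an assumption about what the API could look like.

```ruby
# Build all rotations of the input, sort them, and read off the last
# column. A '$' terminator is appended if the input lacks one.
# Sketch only; the Hash return format is an assumption.
def burrows_wheeler_transform(input)
  text = input.include?('$') ? input : "#{input}$"
  rotations = Array.new(text.size) { |i| text[i..-1] + text[0...i] }
  sorted = rotations.sort # '$' sorts before A, C, G, T in byte order
  { rotations: sorted, bwt: sorted.map { |rotation| rotation[-1] }.join }
end

result = burrows_wheeler_transform('GATAG$')
puts result[:rotations]
puts "BWT: #{result[:bwt]}" # prints "BWT: GTGA$A"
```

Sorting full rotations is quadratic in memory, which is fine for checking small inputs by hand but not for genome-sized strings.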
1039
+ --------------------------------------------------------------------------------
1040
+ (168) → Enable working with several genes... hmm and store that somewhere.
1041
+ Something like a per-project workspace thingy.
1042
+ --------------------------------------------------------------------------------
1043
+ (169) → Add:
1044
+ http://nar.oxfordjournals.org/content/35/suppl_2/W71.long
1045
+ --------------------------------------------------------------------------------
1046
+ (170) Now, you may want to translate the nucleotides up to
1047
+ the first in frame stop codon, and then stop (as
1048
+ happens in nature):
1049
+ coding_dna.translate()
1050
+ Seq('MAIVMGR*KGAR*', HasStopCodon(IUPACProtein(), '*'))
1051
+ >>> coding_dna.translate(to_stop=True)
1052
+ Seq('MAIVMGR', IUPACProtein())
1053
+ ^^^^ support this hmmm.
1054
+ Then continue from here:
1055
+ https://people.duke.edu/~ccc14/pcfb/biopython/BiopythonSequences.html
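A rough Ruby sketch of the `to_stop` behaviour quoted above, using the standard genetic code. The method names and the `to_stop:` keyword are assumptions made for illustration, not existing bioroebe API, and the example DNA string was chosen here so that it translates to the peptide shown in the entry.

```ruby
# Standard genetic code in NCBI order (T, C, A, G for each position).
def standard_codon_table
  aas   = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
  bases = %w[T C A G]
  table = {}
  bases.product(bases, bases).each_with_index do |(a, b, c), index|
    table[a + b + c] = aas[index]
  end
  table
end

# Translate codon by codon; with to_stop: true, stop before the first
# in-frame stop codon. Unknown codons become 'X'; a trailing partial
# codon is ignored.
def translate(dna, to_stop: false)
  table = standard_codon_table
  protein = +''
  dna.upcase.scan(/.{3}/) do |codon|
    aminoacid = table.fetch(codon, 'X')
    break if to_stop && aminoacid == '*'
    protein << aminoacid
  end
  protein
end

coding_dna = 'ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG'
puts translate(coding_dna)                # prints MAIVMGR*KGAR*
puts translate(coding_dna, to_stop: true) # prints MAIVMGR
```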
1056
+ --------------------------------------------------------------------------------
1057
+ (171) → Add:
1058
+ set_dna :Ubiquitin
1059
+ set_dna :ubiquitin
1060
+ ^^^ we want to obtain the ubiquitin sequence
1061
+ --------------------------------------------------------------------------------
1062
+ (172) Telomeres
1063
+ Telomeres are listed from 5' to 3'.
1064
+ Example for the human telomeres would be:
1065
+ 5'-TTAGGG-3'
1066
+ ^^^ is this correct?
1067
+ add:
1068
+ doc_telomeres
1069
+ ^^^ add this to say the human telomere sequence
1070
+ --------------------------------------------------------------------------------
1071
+ (173) → ORF_positions?
1436
1072
  ^^^ change this a bit, to actually show the positions
1437
- of the various ORFs with the start-position.
1438
- -------------------------------------------------------------------------------
1439
- (62) → add:
1440
-
1441
- setgene2
1442
- add_dna2
1443
- dna2
1444
- dna? <--- this one is not a setter but a query.
1445
- -------------------------------------------------------------------------------
1446
- (63) improve the TM calculation. must be better, must have more
1447
- documentation, and a small tutorial.
1448
- -------------------------------------------------------------------------------
1449
- (64) → Compare bioroebe to:
1450
-
1451
- https://www.ncbi.nlm.nih.gov/orffinder
1452
-
1453
- whether both return the same
1454
- also possibly add a web-gui
1455
- -------------------------------------------------------------------------------
1456
- (65) Find out ratios from:
1457
-
1458
- Doolittle RF. 1989. Redundancies in protein sequences. I
1459
-
1460
- http://onlinelibrary.wiley.com/doi/10.1110/ps.9.6.1203/pdf
1461
- ^^^ that table perhaps
1462
-
1463
- (1) require 'bioroebe'
1464
-
1465
- NameError (uninitialized constant Bioroebe::Blosum)
1466
- irb(main):003:0> module LibSSW
1467
- irb(main):004:1> BLOSUM50 = [
1468
- irb(main):005:2*
1469
-
1470
-
1471
- ^^^ also enable Bioroebe::Blosum.matrix?,
1472
- Bioroebe::Blosum.matrix?
1473
- #^^^ add this
1073
+ of the various ORFs with the start-position.
1074
+ --------------------------------------------------------------------------------
1075
+ (174) → add:
1076
+ setgene2
1077
+ add_dna2
1078
+ dna2
1079
+ dna? <--- this one is not a setter but a query.
1080
+ --------------------------------------------------------------------------------
1081
+ (175) → improve the TM calculation. must be better, must have more
1082
+ documentation, and a small tutorial.
1083
+ --------------------------------------------------------------------------------
1084
+ (176) → Compare bioroebe to:
1085
+ https://www.ncbi.nlm.nih.gov/orffinder
1086
+ whether both return the same
1087
+ also possibly add a web-gui
1088
+ --------------------------------------------------------------------------------
1089
+ (177) Find out ratios from:
1090
+ Doolittle RF. 1989. Redundancies in protein sequences. I
1091
+ http://onlinelibrary.wiley.com/doi/10.1110/ps.9.6.1203/pdf
1092
+ ^^^ that table perhaps
1093
+ (1) require 'bioroebe'
1094
+ NameError (uninitialized constant Bioroebe::Blosum)
1095
+ irb(main):003:0> module LibSSW
1096
+ irb(main):004:1> BLOSUM50 = [
1097
+ irb(main):005:2*
1098
+ ^^^ also enable Bioroebe::Blosum.matrix?,
1099
+ Bioroebe::Blosum.matrix?
1100
+ #^^^ add this
1474
1101
  Bioroebe::Blosum[50]
1475
1102
  ^^^ add this, and show an error if the file does not exist.
1476
- .show_matrix
1477
- ^^^ and so forth, also:
1478
- Bioroebe::Blosum[50] as an API.
1479
- and document it in general.
1480
-
1481
- ..........................................................................
1482
- (65) http://www.biomart.org/other/user-docs.pdf
1483
- ^^^ work through this
1484
- -------------------------------------------------------------------------------
1485
- (66) → add:
1486
-
1487
- class Cell
1488
- ^^^ simulate a cell
1103
+ .show_matrix
1104
+ ^^^ and so forth, also:
1105
+ Bioroebe::Blosum[50] as an API.
1106
+ and document it in general.
1107
+ --------------------------------------------------------------------------------
1108
+ (178) → http://www.biomart.org/other/user-docs.pdf
1109
+ ^^^ work through this
1110
+ --------------------------------------------------------------------------------
1111
+ (179) → add:
1112
+ class Cell
1113
+ ^^^ simulate a cell
1489
1114
  Hmmm. Needs specific components ... and needs a better plan.
1490
- -------------------------------------------------------------------------------
1491
- (68) → class Protein:
1492
-
1493
- add glycosylation pattern
1494
- .glycosylated? yes no
1495
- + glycosylated?
1496
- need to somehow add the modification type
1497
-
1498
- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5358406/
1499
- -------------------------------------------------------------------------------
1500
- (69) → In the BioShell we must be able to do probes - complementary
1501
- to amino acids.
1502
- -------------------------------------------------------------------------------
1503
- (70) Add www-related functionality to bioroebe eventually make use
1504
- of rails, but start with sinatra possibly. In the long run,
1505
- make it flexible to work with as many different frameworks
1506
- as possible, though.
1507
- -------------------------------------------------------------------------------
1508
- (71) → Show cleavage sites, for example for a lambda-DNA digest with
1509
- BglII.
1510
- -------------------------------------------------------------------------------
1511
- (72) → dnaanalyze
1512
-
1513
- In the DNA string `TCCGTCGCAACACATCGCCTCAACAAACCGACCGGGATATGCAATACCGGAATCCGATCCTTTAGAAGCTGCATTCCAAACGCTTGCAATAACACCCACTCGACTATTCAGCATTGGCAAAGGGTACGAATTCGACGAAGGGAGGGTGCTATATTTTCCAAGTTGCTCGCCGATTGATACGGAGCCTGTGGAAAGATTTCGCGGCTCTAGTCTTTAGCTTTGATGTCACCCCTGAGTAGTAACCCGGCGTGGTAGCTTTCATTAGACTTCTCGGAGAGAGTATTAAGCAAAGGTGGAGGTCCCAGGGGTCCAGTGAGCTGTATCGCACTAAAAGCATGCCTACGGGCAATGCTATTTTGCTCACAGGAACTTTGGGGGAGCCACAAACTCTCGAAGCCGGATTGTTGTGGCGGCTAACTTTCCAAAGGCGACCATTCATGGTCTGAATGGGCCCTCACCAGAAGAACGTTTTCGACGGGCATTCTTCCCCGGGGTTTCGAAGGCAAGGGTCAGCACGGCGCGGAAAAGTACGCGACGCATACCGGACTAGTCATGCAACTCCCTCGGAACTGGCGATTCCCACCCAAGAGACGCACGCTGATCATTGCCCATGCCGACTGGAGATGCTGAATTTGGTATGCGGGTCTGTTGCCAGCGCTGACATTATCGGACATTGTGGGGAGAACCGTGTGATTGATTGAGCTGGCGCATTTGTCCGCATGCTCTCCTCATGTGGACACCTTCGCAGGTTCTTTCCGCGGCCACAGTGTCGGGATCTACCCCTGGTGCGTCGCCGCGAGTACAGGTGGGGTTTCGCGCATGAGAACCAATGTTGCACGCCTCAAAACATGGCTGTAACATATTAGCGCCAATAAAAATTTTTGGCAACAAAGAAACAAGGCCAACCGAAGTGCTAAGCCGCGATCATGAAGGGGCGATGCCAGAATGGGAGTCTGCCTTTCCTGTGTGGACGTGAGATTGTACCTAGACAGAGAACGCC` we found these Nucleotides:
1514
- ================================================================================
1515
- Adenines: 244 | 24.40 %
1516
- Guanines: 273 | 27.30 %
1517
- Cytosines: 255 | 25.50 %
1518
- Thymine: 228 | 22.80 %
1519
-
1520
-
1521
- ^^^ created balanced composition
1522
-
1115
+ --------------------------------------------------------------------------------
1116
+ (180) → class Protein:
1117
+ add glycosylation pattern
1118
+ .glycosylated? yes no
1119
+ + glycosylated?
1120
+ need to somehow add the modification type
1121
+ https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5358406/
1122
+ --------------------------------------------------------------------------------
1123
+ (181) → In the BioShell we must be able to do probes - complementary
1124
+ to amino acids.
1125
+ --------------------------------------------------------------------------------
1126
+ (182) → Add www-related functionality to bioroebe; eventually make use
1127
+ of rails, but start with sinatra possibly. In the long run,
1128
+ make it flexible to work with as many different frameworks
1129
+ as possible, though.
1130
+ --------------------------------------------------------------------------------
1131
+ (183) Show cleavage sites, for example for a lambda-DNA digest with
1132
+ BglII.
1133
+ --------------------------------------------------------------------------------
1134
+ (184) → dnaanalyze
1135
+ In the DNA string `TCCGTCGCAACACATCGCCTCAACAAACCGACCGGGATATGCAATACCGGAATCCGATCCTTTAGAAGCTGCATTCCAAACGCTTGCAATAACACCCACTCGACTATTCAGCATTGGCAAAGGGTACGAATTCGACGAAGGGAGGGTGCTATATTTTCCAAGTTGCTCGCCGATTGATACGGAGCCTGTGGAAAGATTTCGCGGCTCTAGTCTTTAGCTTTGATGTCACCCCTGAGTAGTAACCCGGCGTGGTAGCTTTCATTAGACTTCTCGGAGAGAGTATTAAGCAAAGGTGGAGGTCCCAGGGGTCCAGTGAGCTGTATCGCACTAAAAGCATGCCTACGGGCAATGCTATTTTGCTCACAGGAACTTTGGGGGAGCCACAAACTCTCGAAGCCGGATTGTTGTGGCGGCTAACTTTCCAAAGGCGACCATTCATGGTCTGAATGGGCCCTCACCAGAAGAACGTTTTCGACGGGCATTCTTCCCCGGGGTTTCGAAGGCAAGGGTCAGCACGGCGCGGAAAAGTACGCGACGCATACCGGACTAGTCATGCAACTCCCTCGGAACTGGCGATTCCCACCCAAGAGACGCACGCTGATCATTGCCCATGCCGACTGGAGATGCTGAATTTGGTATGCGGGTCTGTTGCCAGCGCTGACATTATCGGACATTGTGGGGAGAACCGTGTGATTGATTGAGCTGGCGCATTTGTCCGCATGCTCTCCTCATGTGGACACCTTCGCAGGTTCTTTCCGCGGCCACAGTGTCGGGATCTACCCCTGGTGCGTCGCCGCGAGTACAGGTGGGGTTTCGCGCATGAGAACCAATGTTGCACGCCTCAAAACATGGCTGTAACATATTAGCGCCAATAAAAATTTTTGGCAACAAAGAAACAAGGCCAACCGAAGTGCTAAGCCGCGATCATGAAGGGGCGATGCCAGAATGGGAGTCTGCCTTTCCTGTGTGGACGTGAGATTGTACCTAGACAGAGAACGCC` we found these Nucleotides:
1136
+ ================================================================================
1137
+ Adenines: 244 | 24.40 %
1138
+ Guanines: 273 | 27.30 %
1139
+ Cytosines: 255 | 25.50 %
1140
+ Thymine: 228 | 22.80 %
1141
+ ^^^ created balanced composition
1523
1142
  "Enter the percentage"
1524
-
1525
1143
  "interactive_string"
1526
1144
  ^^^ here we ask user input
1527
-
1528
1145
  otherwise, we assume it to be 25%
1529
1146
  Ok 25% works...
1530
- The other part does not yet work. I am so lazy...
1531
- ^^^ add a GUI part too. The GUI has been added;
1532
- we need to make it so that an input sequence
1533
- can be assigned, and dnaanalyse --GUI should
1534
- start it too. ALSO document it once this works.
1535
- -------------------------------------------------------------------------------
1536
- (73) → go through the individual components slowly and improve them,
1537
- step by step, including the documentation. Then eventually
1538
- remove this todo-entry here.
1539
- -------------------------------------------------------------------------------
1540
- (74) → Add a consensus sequence for:
1541
-
1542
- Asn-X-Ser/Thr-Consensus
1543
-
1544
- first in a yaml file; also documented, then also add
1147
+ The other part does not yet work. I am so lazy...
1148
+ ^^^ add a GUI part too. The GUI has been added;
1149
+ we need to make it so that an input sequence
1150
+ can be assigned, and dnaanalyse --GUI should
1151
+ start it too. ALSO document it once this works.
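The nucleotide report in this entry boils down to counting each base and dividing by the sequence length. A small sketch follows; `nucleotide_composition` is a hypothetical name and is not claimed to be what dnaanalyze uses internally.

```ruby
# Count A, G, C and T in a DNA string and report each as a percentage
# of the total length. Hypothetical helper for illustration.
def nucleotide_composition(sequence)
  seq = sequence.upcase
  %w[A G C T].each_with_object({}) do |base, result|
    count = seq.count(base)
    percent = seq.empty? ? 0.0 : (count * 100.0 / seq.size).round(2)
    result[base] = { count: count, percent: percent }
  end
end

nucleotide_composition('GATTACA').each do |base, stats|
  puts format('%s: %d | %.2f %%', base, stats[:count], stats[:percent])
end
```

For the 1000-nucleotide example above, the counts 244, 273, 255 and 228 map directly to 24.40 %, 27.30 %, 25.50 % and 22.80 %.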
1152
+ --------------------------------------------------------------------------------
1153
+ (185) → go through the individual components slowly and improve them,
1154
+ step by step, including the documentation. Then eventually
1155
+ remove this todo-entry here.
1156
+ --------------------------------------------------------------------------------
1157
+ (186) → Add a consensus sequence for:
1158
+ Asn-X-Ser/Thr-Consensus
1159
+ first in a yaml file; also documented, then also add
1545
1160
  a way to scan for these something like:
1546
- N-Glycosylation?
1547
- NGlycosylation?
1548
- NGlyc
1549
- /N-?Glyc/i
1550
- ^^^ use that regex
1551
- -------------------------------------------------------------------------------
1552
- (74) → make sure that newly generated files respect the
1553
- default chmod value on the system. from bioroebe.
1554
- right now we default to 755 which I assume is
1555
- hardcoded but perhaps this is wrong.
1556
-
1557
- -------------------------------------------------------------------------------
1558
- (75) require 'bio'
1559
-
1560
- # creating a Bio::Sequence::NA object containing ambiguous alphabets
1561
- ambiguous_seq = Bio::Sequence::NA.new("atgcyrwskmbdhvn")
1562
-
1563
- # show the contents and class of the DNA sequence object
1564
- p ambiguous_seq # => "atgcyrwskmbdhvn"
1565
- p ambiguous_seq.class # => Bio::Sequence::NA
1566
-
1567
- # convert the sequence to a Regexp object
1568
- p ambiguous_seq.to_re # => /atgc[tc][ag][at][gc][tg][ac][tgc][atg][atc][agc][atgc]/
1569
- p ambiguous_seq.to_re.class # => Regexp
1570
-
1571
- ^^^ add .to_re to this. it must generate a regexp object.
1572
-
1573
-
1574
-
1575
- - Also add some restriction enzyme examples
1576
- to the bioroebe readme... and bio.cgi
1577
-
1578
- perhaps move the restriction enzyme
1579
- part into a standalone file
1580
- so that it can be used by both the .cgi and
1581
- well rdoc...
1582
- -------------------------------------------------------------------------------
1583
- - Add more protein-specific thingies to bioroebe.
1584
- -------------------------------------------------------------------------------
1585
- - Advance the bioshell and work through std_biology.rb.
1586
- Perhaps we can move some of it out into a class
1587
- or so.
1588
-
1589
- All of this should also be linked with Webmin (biomin), so that
1590
- we can also use the bioshell elegantly over the www!
1591
- -------------------------------------------------------------------------------
1592
- - ^^^ when we find restriction enzyme sites in a DNA
1161
+ N-Glycosylation?
1162
+ NGlycosylation?
1163
+ NGlyc
1164
+ /N-?Glyc/i
1165
+ ^^^ use that regex
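A sketch of the scan itself, assuming the usual definition of the N-glycosylation sequon: Asn, then any residue except proline, then Ser or Thr. The method name is hypothetical. A lookahead is used so that overlapping sequons are all reported.

```ruby
# Return the 0-based positions of every Asn that starts an
# N-X-S/T sequon (X = any amino acid except proline).
# Hypothetical helper; not an existing bioroebe method.
def n_glycosylation_sites(protein)
  positions = []
  protein.upcase.scan(/N(?=[^P][ST])/) do
    positions << Regexp.last_match.begin(0)
  end
  positions
end

p n_glycosylation_sites('MNGSANPTQNAT') # prints [1, 9]

# The command-name regex from the entry matches all three spellings:
%w[N-Glycosylation? NGlycosylation? NGlyc].each do |command|
  puts "#{command} -> #{command.match?(/N-?Glyc/i)}"
end
```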
1166
+ --------------------------------------------------------------------------------
1167
+ (187) → require 'bio'
1168
+ # creating a Bio::Sequence::NA object containing ambiguous alphabets
1169
+ ambiguous_seq = Bio::Sequence::NA.new("atgcyrwskmbdhvn")
1170
+ # show the contents and class of the DNA sequence object
1171
+ p ambiguous_seq # => "atgcyrwskmbdhvn"
1172
+ p ambiguous_seq.class # => Bio::Sequence::NA
1173
+ # convert the sequence to a Regexp object
1174
+ p ambiguous_seq.to_re # => /atgc[tc][ag][at][gc][tg][ac][tgc][atg][atc][agc][atgc]/
1175
+ p ambiguous_seq.to_re.class # => Regexp
1176
+ ^^^ add .to_re to this. it must generate a regexp object.
1177
+ - Also add some restriction enzyme examples
1178
+ to the bioroebe readme... and bio.cgi
1179
+ perhaps move the restriction enzyme
1180
+ part into a standalone file
1181
+ so that it can be used by both the .cgi and
1182
+ well rdoc...
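One way the requested `.to_re` could work, sketched as a standalone method. The name `sequence_to_regexp` is hypothetical; the character classes follow the bioruby output quoted above.

```ruby
# IUPAC ambiguity codes and the nucleotides each one stands for,
# in the same order as the bioruby output quoted in this entry.
def ambiguity_codes
  { 'y' => 'tc', 'r' => 'ag', 'w' => 'at', 's' => 'gc', 'k' => 'tg',
    'm' => 'ac', 'b' => 'tgc', 'd' => 'atg', 'h' => 'atc',
    'v' => 'agc', 'n' => 'atgc' }
end

# Convert a (possibly ambiguous) nucleotide string into a Regexp.
# Hypothetical helper; not an existing bioroebe method.
def sequence_to_regexp(sequence)
  codes = ambiguity_codes
  pattern = sequence.downcase.each_char.map do |char|
    codes.key?(char) ? "[#{codes[char]}]" : Regexp.escape(char)
  end.join
  Regexp.new(pattern)
end

p sequence_to_regexp('atgcyrwskmbdhvn')
p sequence_to_regexp('gccnnnnnggc').match?('gccatgcaggc') # prints true
```

The same conversion is what makes ambiguous restriction sites such as GCCNNNNNGGC searchable in a plain DNA string.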
1183
+ --------------------------------------------------------------------------------
1184
+ (188) Add more protein-specific thingies to bioroebe.
1185
+ --------------------------------------------------------------------------------
1186
+ (189) Advance the bioshell and work through std_biology.rb.
1187
+ Perhaps we can move some of it out into a class
1188
+ or so.
1189
+ All of this should also be linked with Webmin (biomin), so that
1190
+ we can also use the bioshell elegantly over the www!
1191
+ --------------------------------------------------------------------------------
1192
+ (190) → ^^^ when we find restriction enzyme sites in a DNA
1593
1193
  string, colourize them RED.
1594
-
1595
- also set it to
1596
- set_restriction_size()
1597
- -------------------------------------------------------------------------------
1598
- - also ... while learning C++ we extend the project here...
1599
- Useful C++ things will be combined.
1600
- -------------------------------------------------------------------------------
1601
- - As of April 2003, there were 176,890 total taxa represented.
1602
-
1603
- ^^^ we need a way to also output how many entries we
1604
- have there.
1605
- -------------------------------------------------------------------------------
1606
- - Replace bioruby with bioroebe completely!
1607
- In order for this to work, we first need to find out
1608
- what bioruby is able to do. :P
1609
- -------------------------------------------------------------------------------
1610
- - append 33
1611
- # ^^^ in the bioshell
1612
- Only numbers were given: Adding 33 random nucleotides to the main string next.
1613
- Traceback (most recent call last):
1614
- 10: from /usr/bin/bioshell:26:in `<main>'
1615
- 9: from /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/misc.rb:4121:in `shell'
1616
- 8: from /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/misc.rb:4121:in `new'
1617
- 7: from /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/initialize.rb:168:in `initialize'
1618
- 6: from /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/loop.rb:18:in `enter_main_loop'
1619
- 5: from /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/loop.rb:18:in `loop'
1620
- 4: from /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/loop.rb:30:in `block in enter_main_loop'
1621
- 3: from /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/loop.rb:30:in `each'
1622
- 2: from /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/loop.rb:31:in `block (2 levels) in enter_main_loop'
1623
- 1: from /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/menu.rb:3565:in `menu'
1624
- /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/misc.rb:3979:in `append': undefined method `return_sequence_that_is_cut_via_restriction_enzyme' for Bioroebe:Module (NoMethodError)
1625
- Did you mean? return_random_codon_sequence_for_this_aminoacid_sequence
1626
-
1627
-
1628
- ^^^^^ BUG!
1629
- -------------------------------------------------------------------------------
1630
- > rest?
1631
-
1632
- We found these restriction sites within the sequence `TTCAGAACTCAACGCCTGGTTGGCCGTCCAGTAAGCTGACTAAGTAAGTCTATGCCCGCGATAACCAGGATACAGATATCGTGAAACCTGGTTTATCTCCTTCTATAAGAGTCTGCACATCTAGC`:
1633
-
1634
- AccII -> CGCG ( 1 times found)
1635
- AluI -> AGCT ( 1 times found)
1636
- BfaI -> CTAG ( 1 times found)
1637
- BshI -> GGCC ( 1 times found)
1638
- Bsh1236I -> CGCG ( 1 times found)
1639
- BshFI -> GGCC ( 1 times found)
1640
- BstFNI -> CGCG ( 1 times found)
1641
- BstUI -> CGCG ( 1 times found)
1642
- BsuRI -> GGCC ( 1 times found)
1643
- CviRI -> TGCA ( 1 times found)
1644
- Eco32I -> GATATC ( 1 times found)
1645
- EcoRV -> GATATC ( 1 times found)
1646
- FnuDII -> CGCG ( 1 times found)
1647
- HaeIII -> GGCC ( 1 times found)
1648
- HpyCH4V -> TGCA ( 1 times found)
1649
- MaeI -> CTAG ( 1 times found)
1650
- MvnI -> CGCG ( 1 times found)
1651
- PalI -> GGCC ( 1 times found)
1652
- SelI -> CGCG ( 1 times found)
1653
- ThaI -> CGCG ( 1 times found)
1654
- XspI -> CTAG ( 1 times found)
1655
-
1656
- ^^^^ also show the position
1657
-
1658
-
1659
- -------------------------------------------------------------------------------
1660
-
1661
- PMID entries are:
1662
-
1663
- x = 'Goldman, JM, Melo JV 2003 NEJM 349:1451 14534339
1664
- Lewis GD 1993 Cancer Immunol Immun other 37: 255 8102322
1665
- McShane LM 2009 Clin Canc Res 15: 1898 19276274
1666
- Fox JL 2007 Nature Biotech 25: 489 17483821
1667
- Bodin L 2005 Blood 106: 135 15790782'
1668
-
1669
- require 'rubygems'
1670
- require 'bio'
1671
-
1672
- my_file = File.new(ARGV[0])
1673
- refs = my_file.readlines
1674
- ids = []
1675
-
1676
- refs.each do |line|
1194
+ also set it to
1195
+ set_restriction_size()
1196
+ --------------------------------------------------------------------------------
1197
+ (191) → also ... while learning C++ we extend the project here...
1198
+ Useful C++ things will be combined.
1199
+ --------------------------------------------------------------------------------
1200
+ (192) → As of April 2003, there were 176,890 total taxa represented.
1201
+ ^^^ we need a way to also output how many entries we
1202
+ have there.
1203
+ --------------------------------------------------------------------------------
1204
+ (193) → Replace bioruby with bioroebe completely!
1205
+ In order for this to work, we first need to find out
1206
+ what bioruby is able to do. :P
1207
+ --------------------------------------------------------------------------------
1208
+ (194) append 33
1209
+ # ^^^ in the bioshell
1210
+ Only numbers were given: Adding 33 random nucleotides to the main string next.
1211
+ Traceback (most recent call last):
1212
+ 10: from /usr/bin/bioshell:26:in `<main>'
1213
+ 9: from /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/misc.rb:4121:in `shell'
1214
+ 8: from /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/misc.rb:4121:in `new'
1215
+ 7: from /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/initialize.rb:168:in `initialize'
1216
+ 6: from /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/loop.rb:18:in `enter_main_loop'
1217
+ 5: from /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/loop.rb:18:in `loop'
1218
+ 4: from /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/loop.rb:30:in `block in enter_main_loop'
1219
+ 3: from /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/loop.rb:30:in `each'
1220
+ 2: from /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/loop.rb:31:in `block (2 levels) in enter_main_loop'
1221
+ 1: from /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/menu.rb:3565:in `menu'
1222
+ /home/Programs/Ruby/2.7.1/lib/ruby/site_ruby/2.7.0/bioroebe/shell/misc.rb:3979:in `append': undefined method `return_sequence_that_is_cut_via_restriction_enzyme' for Bioroebe:Module (NoMethodError)
1223
+ Did you mean? return_random_codon_sequence_for_this_aminoacid_sequence
1224
+ ^^^^^ BUG!
1225
+ --------------------------------------------------------------------------------
1226
+ (195) → > rest?
1227
+ We found these restriction sites within the sequence `TTCAGAACTCAACGCCTGGTTGGCCGTCCAGTAAGCTGACTAAGTAAGTCTATGCCCGCGATAACCAGGATACAGATATCGTGAAACCTGGTTTATCTCCTTCTATAAGAGTCTGCACATCTAGC`:
1228
+ AccII -> CGCG ( 1 times found)
1229
+ AluI -> AGCT ( 1 times found)
1230
+ BfaI -> CTAG ( 1 times found)
1231
+ BshI -> GGCC ( 1 times found)
1232
+ Bsh1236I -> CGCG ( 1 times found)
1233
+ BshFI -> GGCC ( 1 times found)
1234
+ BstFNI -> CGCG ( 1 times found)
1235
+ BstUI -> CGCG ( 1 times found)
1236
+ BsuRI -> GGCC ( 1 times found)
1237
+ CviRI -> TGCA ( 1 times found)
1238
+ Eco32I -> GATATC ( 1 times found)
1239
+ EcoRV -> GATATC ( 1 times found)
1240
+ FnuDII -> CGCG ( 1 times found)
1241
+ HaeIII -> GGCC ( 1 times found)
1242
+ HpyCH4V -> TGCA ( 1 times found)
1243
+ MaeI -> CTAG ( 1 times found)
1244
+ MvnI -> CGCG ( 1 times found)
1245
+ PalI -> GGCC ( 1 times found)
1246
+ SelI -> CGCG ( 1 times found)
1247
+ ThaI -> CGCG ( 1 times found)
1248
+ XspI -> CTAG ( 1 times found)
1249
+ ^^^^ also show the position
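Reporting the position could look roughly like this. The method name and the Hash-based enzyme list are assumptions for illustration; positions are 1-based and refer to the first nucleotide of the recognition site, not to the cut point.

```ruby
# For each enzyme, list the 1-based start positions of its recognition
# site in the sequence. Enzymes without a hit are left out.
# Hypothetical helper; not an existing bioroebe method.
def restriction_sites_with_positions(sequence, enzymes)
  seq = sequence.upcase
  enzymes.each_with_object({}) do |(name, site), result|
    positions = []
    # Zero-width lookahead so overlapping sites are found as well.
    seq.scan(/(?=#{Regexp.escape(site)})/) do
      positions << Regexp.last_match.begin(0) + 1
    end
    result[name] = positions unless positions.empty?
  end
end

enzymes = { 'EcoRV' => 'GATATC', 'HaeIII' => 'GGCC', 'AluI' => 'AGCT' }
restriction_sites_with_positions('AAGATATCTTGGCC', enzymes).each do |name, positions|
  puts "#{name} -> #{enzymes[name]} (#{positions.size} times found, at #{positions.join(', ')})"
end
```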
1250
+ --------------------------------------------------------------------------------
1251
+ (196) PMID entries are:
1252
+ x = 'Goldman, JM, Melo JV 2003 NEJM 349:1451 14534339
1253
+ Lewis GD 1993 Cancer Immunol Immun other 37: 255 8102322
1254
+ McShane LM 2009 Clin Canc Res 15: 1898 19276274
1255
+ Fox JL 2007 Nature Biotech 25: 489 17483821
1256
+ Bodin L 2005 Blood 106: 135 15790782'
1257
+ require 'rubygems'
1258
+ require 'bio'
1259
+ my_file = File.new(ARGV[0])
1260
+ refs = my_file.readlines
1261
+ ids = []
1262
+ refs.each do |line|
1677
1263
  pmid = line.strip().split("\t")
1678
1264
  ids.push(pmid[2])
1679
- end
1680
-
1681
- ids.each { |id|
1265
+ end
1266
+ ids.each { |id|
1682
1267
  entry = Bio::PubMed.query(id)
1683
1268
  medline = Bio::MEDLINE.new(entry)
1684
1269
  reference = medline.reference
1685
1270
  puts reference.endnote
1686
- }
1687
- - Clone, in bioroebe, the get_ORF functionality.
1688
-
1689
- - in bioroebe: Find out how to count where a restriction
1690
- enzyme was found + add it into a table, also WHEN it
1691
- was found and by WHO.
1692
-
1693
- We should make a good reference table there, so that
1694
- people can reproduce where the information is kept
1695
- or obtained from.
1696
-
1697
- - Enable:
1698
- download ecoli
1699
- first fix a bug
1700
- and then make it so that we can download the ecoli
1701
- sequence from that file.... yaml file
1702
- - Rewrite Bioroebe (Pending as of September 2019 ...)
1703
- Start with the functionality that shall decode something,
1704
- and put this back ... as a first class citizen with
1705
- descriptions how to work with this.
1706
- - improve quality
1707
- - also fix the taxonomy project as you go...
1708
- - Add C++ wrapper stuff, starting with qt-widgets specifically
1709
- but also extend the base part of bioroebe as far as C++ is
1710
- concerned. Perhaps at a later time also add/embed mruby
1711
- into the project.
1712
-
1713
- - Need to overlay 2 exceptionally large DNA data sets and
1714
- analyze the overlap.
1715
-
1716
- Want to determine the frequency of one event within the DNA
1717
- reads of the other data set. Prefer to discuss the details
1718
- of the data sets one-on-one.
1719
- ^^^ we need to compare two DNA data sets and analyze the overlap.
1720
-
1721
- - Make sure that the sinatra interface has the parts of
1722
- emboss available, too. List the progress here.
1723
- - interactivefasta interactive_fasta
1724
- ^^^ enable this... but I am not sure what is meant with
1725
- that. :\
1726
-
1727
-
1728
-
1729
-
1730
-
1731
- ..........................................................................
1732
- In a database search, the measured masses are compared with the peptide masses
1733
- of all proteins or genes in a database (NCBI, UniProt). For this, DNA
1734
- sequences are translated into protein sequences and cut in silico with the
1735
- protease that was used in the digest.
1736
- ^^^ enable digestions
1737
-
1738
-
1739
-
1740
-
1741
- ..........................................................................
1742
- Complexity of libraries:
1743
- How many independent clones are necessary to represent a genome (plant,
1744
- animal/fungus) or how many such clones have to be screened to have realistic
1745
- chance of finding the gene of interest?
1746
- This can be calculated by the formula:
1747
- P = 1 - (1 - F/G)^N
1748
- N = ln(1 - P) / ln(1 - F/G)
1749
- P.... Probability that a certain insert is present in a library consisting of N clones
1750
- F.... average insert size (kb)
1751
- G... Genome size (kb) of the organism from which a library should be constructed
1752
- N.... Number of clones in the library
1753
- ^^^^ add this formula + documentation
1754
-
1755
- Example: 16 kb average insert size in a replacement vector
1756
- Genome sizes:
1757
- Yeast:
1758
- 16 Mb = 16 000 kb (1000 clones with 16 kb = 1 genome equivalent)
1759
- F/G=0.001.
1760
- ln(1-F/G)= - 0.0010005
1761
- 95%: ln0.05= - 2.99573
1762
- 2.99/0.001 ≈ 3000
1763
- Wheat: 16 000 Mb
1764
- 3 x 10^6 clones
1765
- How many plaques can be screened on one 9 cm petri dish or filter?
1766
- Plaques close to confluency: about 10^4 pfu per plate.
1767
- While in case of yeast the whole library can be screened on 1 plate,
1768
- 300 plates would be needed for wheat - which is impracticable!
1769
- (BAC "bacterial artificial chromosome" libraries with 150-300 kb inserts are used!)
1770
- Most plants have reasonable genome size (e.g. tomato about 800 Mb) - 15 filters
1771
- have to be hybridized.
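The library-size formula above, N = ln(1 - P) / ln(1 - F/G), can be sketched in Ruby as follows. The method names are hypothetical, and rounding up to whole clones is a choice made here.

```ruby
# Number of clones N needed so that a given insert is present with
# probability P, for average insert size F and genome size G (both in
# kb): N = ln(1 - P) / ln(1 - F/G). Hypothetical helper.
def clones_needed(probability, insert_kb, genome_kb)
  (Math.log(1.0 - probability) /
   Math.log(1.0 - insert_kb.to_f / genome_kb)).ceil
end

# The inverse: P = 1 - (1 - F/G)^N
def coverage_probability(clones, insert_kb, genome_kb)
  1.0 - (1.0 - insert_kb.to_f / genome_kb)**clones
end

puts clones_needed(0.95, 16, 16_000)     # yeast, 16 Mb: roughly 3000
puts clones_needed(0.95, 16, 16_000_000) # wheat, 16 000 Mb: roughly 3 x 10^6
puts clones_needed(0.95, 16, 800_000)    # tomato, 800 Mb: roughly 150 000
```

Dividing the clone count by about 10^4 pfu per plate gives the plate numbers quoted above: 1 for yeast, about 300 for wheat and about 15 for tomato.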
1772
-
1773
-
1774
-
1775
-
1776
- ..........................................................................
1777
-
1778
- BIO SHELL> BglI?
1779
-
1780
- We have found a restriction enzyme called BglI.
1781
-
1782
- The sequence this 11-cutter relates to is: `5' - GCCNNNNNGGC - 3'`
1783
-
1784
- It will specifically cut between: 5' - GCCNNNN|NGGC - 3'
1785
- 3' - CGG(A/T/G/C)(A/T/G/C)(A/T/G/C)(A/T/G/C)(A/T/|G/C)CCG - 5'
1786
-
1787
- ^
1788
- also add a line to say
1789
- "This is a blunt-end cutter."
1790
- or
1791
- "This enzyme
1792
- creates a staggered cut."
1793
-
1794
-
1795
- http://biopython.org/DIST/docs/api/Bio.Restriction.Restriction.Blunt-class.html
1796
- ^^^ also add something like this... as a query thingy.
1797
-
1798
- catalyse(cls, dna, linear=True)
1799
- List the sequence fragments after cutting dna with enzyme. source code
1800
-
1801
- catalyze(cls, dna, linear=True)
1802
- List the sequence fragments after cutting dna with enzyme. source code
1803
-
1804
- is_blunt(cls)
1805
- Return if the enzyme produces blunt ends. source code
1806
-
1807
- is_5overhang(cls)
1808
- Return if the enzymes produces 5' overhanging ends. source code
1809
-
1810
- is_3overhang(cls)
1811
- Return if the enzyme produces 3' overhanging ends. source code
1812
-
1813
- overhang(cls)
1814
- Return the type of the enzyme's overhang as string. source code
1815
-
1816
- compatible_end(cls, batch=None)
1817
- List all enzymes that produce compatible ends for the enzyme.
1818
- http://biopython.org/DIST/docs/api/Bio.Restriction.Restriction.Blunt-class.html
1819
-
1820
-
1821
- ..........................................................................
1822
- https://www.reddit.com/r/bioinformatics/comments/5o3kn8/bioinformatics_contest_2017_jan_23rd29th_solve_as/
1823
- ..........................................................................
1824
- (1) Finish all of biophp integration into bioroebe.
1825
- http://www.biophp.org/
1826
- -------------------------------------------------------------------------------
1827
-
1828
- locate oriC here:
1829
-
1830
- ttcgttaagtaacttcactgcccgtagtgtaccggcattcgctagcaagagtctttctg
1831
- ggcaagcttcacttgtgatcgcggcctgtgcccccggaatgaaacaaccacgtccctgct
1832
- aacaacgacgggaaaagggaagtgatccgtcggcagacccagactagtgcccttctccgg
1833
- cttccaacaccaacgagtcggaccgaattgagcactcgaatgcacggcgctttttgccgg
1834
- ccgaaacggcgcctccgcattgatcgacgcacggcctcttttggctacagcgcatggctt
1835
- tacactcggcatgcatttccagtgctaatcaaacagaattccttgtaaagtccttcaacc
1836
- gtgacagactatcgctaaggagcctttccagtcgtgcctgcaatcactcgcgaaatcaac
1837
- aaatctacatctaagcacgctcgtggttcggagtcccgccctcatgtggaccatagccgg
1838
- ttcgcccgagtcctaggcacgatcagaggacctatctttcgccactcaactcttctgagt
1839
- gaaacaatatcgaccgaaaccttgctcggttttgtccacaacaacgtcaggcccataagc
1840
- agacgacattagtccgctgttgtcgcgggcgtcccatagccgtacgatgtcccgtcgga
1841
-
1842
- ori?
1843
-
1844
- ^^^ this shall give us all ORI in a sequence.
1845
- DnaA protein binds to DNA-Box in an ori.
1846
-
1847
- '9cut'
1848
- '8cut'
1849
- '7cut'
1850
- ^^^ these give us slices
1851
-
1852
- But I do not know how to locate ORIs.
1853
-
1854
-
1855
-
1856
-
1857
-
1858
-
1859
-
1860
-
1861
- -------------------------------------------------------------------------------
1862
- ^^^ also integrate git into bioroebe.
1863
- -------------------------------------------------------------------------------
1864
- WE MUST IMPROVE THIS DRASTICALLY.
1865
-
1866
- THEN UPLOAD IT AND USE IT AS THE BASIS FOR APPLICATIONS.
1867
- -------------------------------------------------------------------------------
1868
-
1869
- Study MetaCyc
1870
- ^^^ study metabolic pathways.
1871
-
1872
- http://metacyc.org/
1873
-
1874
- → Create KuroMetaCyc, in Analogy towards Metabolic Cycle.
1875
-
1876
- -------------------------------------------------------------------------------
1877
-
1878
- Welcome to BioShell May 2012. Type "help" to get some help.
1879
-
1880
- Hello and welcome to the Bio Shell Version, last updated: May 2012
1881
-
1882
- BIO SHELL> IPEYVDWRQKGAVTPVKNQGSCGSCWAFSAVVTIEGIIKIRTGNLNQYSEQELLDCDRRSYGCNGGYPWSALQLVAQYGI
1883
- BIO SHELL> HYRNTYPYEGVQRYCRSREKGPYAAKTDGVRQVQPYNQGALLYSIANQPVSVVLQAAGKDFQLYRGGIFVGPCGNKVDHA
1884
- BIO SHELL> VAAVGYGPNYILIKNSWGTGWGENGYIRIKRGTGNSYGVCGLYTSSFYPVKN
1885
- BIO SHELL>
1886
- BIO SHELL> input?
1887
- BIO SHELL> pdb
1888
- BIO SHELL>
1889
-
1890
- ^^^ (1)
1891
- Add a pdb submodule
1892
- When we type this, we then ask:
1893
- "Please input your FASTA format now:"
1894
-
1895
-
1896
-
1897
-
1898
- -------------------------------------------------------------------------------
1899
-
1900
- http://biopython.org/DIST/docs/cookbook/Restriction.html#mozTocId101269
1901
-
1902
- ^^^ support this also:
1903
-
1904
- >>> from Bio import Restriction
1905
- >>> dir()
1906
- ['Restriction', '__builtins__', '__doc__', '__name__']
1907
- >>> Restriction.EcoRI
1908
- EcoRI
1909
- >>> Restriction.EcoRI.site
1910
- 'GAATTC'
1911
- >>>
1912
-
1913
- and document it somewhere; perhaps in a new .cgi page.
1914
-
1915
- The above will return the exact site, without verbosity.
1916
-
1917
-
1918
- Restriction.EcoRI
1919
- Restriction.EcoRI.site
1920
-
1921
- result = Bioroebe.restriction_enzyme 'EcoRI.site'
1271
+ }
1272
+ - Clone, in bioroebe, the get_ORF functionality.
1273
+ - in bioroebe: Find out how to count where a restriction
1274
+ enzyme was found + add it into a table, also WHEN it
1275
+ was found and by WHO.
1276
+ We should make a good reference table there, so that
1277
+ people can reproduce where the information is kept
1278
+ or obtained from.
1279
+ - Enable:
1280
+ download ecoli
1281
+ first fix a bug
1282
+ and then make it so that we can download the ecoli
1283
+ sequence from that file.... yaml file
1284
+ - Rewrite Bioroebe (pending since September 2019 ...)
1285
+ Start with the functionality that shall decode something,
1286
+ and put this back ... as a first class citizen with
1287
+ descriptions how to work with this.
1288
+ - improve quality
1289
+ - also fix the taxonomy project as you go...
1290
+ - Add C++ wrapper stuff, starting with qt-widgets specifically
1291
+ but also extend the base part of bioroebe as far as C++ is
1292
+ concerned. Perhaps at a later time also add/embed mruby
1293
+ into the project.
1294
+ - Need to overlay 2 exceptionally large DNA data sets and
1295
+ analyze the overlap.
1296
+ Want to determine the frequency of one event within the DNA
1297
+ reads of the other data set. Prefer to discuss the details
1298
+ of the data sets one-on-one.
1299
+ ^^^ we need to compare two DNA data sets and analyze the overlap.
1300
+ - Make sure that the sinatra interface has the parts of
1301
+ emboss available, too. List the progress here.
1302
+ - interactivefasta interactive_fasta
1303
+ ^^^ enable this... but I am not sure what is meant with
1304
+ that. :\
1305
+ --------------------------------------------------------------------------------
1306
+ (197) In a database search, the measured masses are compared with the peptide masses
1307
+ of all proteins or genes in a database (NCBI, UniProt). For this, DNA
1308
+ sequences are translated into protein sequences and cut in silico with the
1309
+ protease used in the digest.
1310
+ ^^^ enable digestions
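An in-silico digest as described above could be sketched as follows. This is a minimal illustration, not existing bioroebe code: the method name `tryptic_peptides` is hypothetical, and the rule assumed is the common trypsin rule (cleave after K or R unless the next residue is P).

```ruby
# Hypothetical helper, not part of bioroebe: in-silico tryptic digest.
# Assumed rule: cleave C-terminal of K or R, except when followed by P.
def tryptic_peptides(protein)
  # Zero-width split: the lookbehind requires K/R before the cut position,
  # the negative lookahead forbids a P directly after it.
  protein.upcase.split(/(?<=[KR])(?!P)/)
end

peptides = tryptic_peptides('MKWVTFISLLRPAGKSA')
puts peptides.inspect
```

The resulting peptides could then be converted to masses and compared with the measured masses; other proteases would need a different cleavage pattern.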
1311
+ --------------------------------------------------------------------------------
1312
+ (198) → Complexity of libraries:
1313
+ How many independent clones are necessary to represent a genome (plant,
1314
+ animal/fungus) or how many such clones have to be screened to have realistic
1315
+ chance of finding the gene of interest?
1316
+ This can be calculated by the formula:
1317
+ P = 1 - (1 - F/G)^N
1318
+ N = ln(1 - P) / ln(1 - F/G)
1319
+ P.... Probability that a certain insert is present in a library consisting of N clones
1320
+ F.... average insert size (kb)
1321
+ G... Genome size (kb) of the organism from which a library should be constructed
1322
+ N.... Number of clones in the library
1323
+ ^^^^ add this formula + documentation
1324
+ Example: 16 kb average insert size in a replacement vector
1325
+ Genome sizes:
1326
+ Yeast:
1327
+ 16 Mb = 16 000 kb (1000 clones with 16 kb = 1 genome equivalent)
1328
+ F/G=0.001.
1329
+ ln(1-F/G)= - 0.0010005
1330
+ 95%: ln0.05= - 2.99573
1331
+ 2.99/0.001 ≈ 3000
1332
+ Wheat: 16 000 Mb
1333
+ 3 · 10^6 clones
1334
+ How many plaques can be screened on one 9 cm petri dish or filter?
1335
+ Plaques close to confluency: about 10^4 pfu per plate.
1336
+ While in case of yeast the whole library can be screened on 1 plate,
1337
+ 300 plates would be needed for wheat - which is impracticable!
1338
+ (BAC "bacterial artificial chromosome" libraries with 150 -300 kb inserts are used!)
1339
+ Most plants have reasonable genome size (e.g. tomato about 800 Mb) - 15 filters
1340
+ have to be hybridized.
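The library-size formula N = ln(1 - P) / ln(1 - F/G) can be sketched in Ruby as shown here. The method name `clones_needed` is hypothetical and not part of bioroebe; insert size and genome size only need to be in the same unit (kb here).

```ruby
# Hypothetical helper: number of clones N needed so that a given insert is
# present in the library with probability p.
# insert_kb = F (average insert size), genome_kb = G (genome size).
def clones_needed(p, insert_kb, genome_kb)
  ratio = insert_kb.to_f / genome_kb
  (Math.log(1.0 - p) / Math.log(1.0 - ratio)).ceil
end

yeast = clones_needed(0.95, 16, 16_000)      # 16 Mb genome
wheat = clones_needed(0.95, 16, 16_000_000)  # 16 000 Mb genome
puts "yeast: #{yeast} clones, wheat: #{wheat} clones"
puts "plates for wheat at 10^4 pfu per plate: #{(wheat / 10_000.0).ceil}"
```

Rounding up with `ceil` is a design choice of this sketch; the worked example in the text rounds to roughly 3000 and 3 · 10^6.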
1341
+ --------------------------------------------------------------------------------
1342
+ (199) → BIO SHELL> BglI?
1343
+ We have found a restriction enzyme called BglI.
1344
+ The sequence this 11-cutter relates to is: `5' - GCCNNNNNGGC - 3'`
1345
+ It will specifically cut between: 5' - GCCNNNN|NGGC - 3'
1346
+ 3' - CGG(A/T/G/C)(A/T/G/C)(A/T/G/C)(A/T/G/C)(A/T/|G/C)CCG - 5'
1347
+ ^
1348
+ also add a line to say
1349
+ "This is a blunt-end cutter."
1350
+ or
1351
+ "This enzyme creates a staggered cut."
1353
+ http://biopython.org/DIST/docs/api/Bio.Restriction.Restriction.Blunt-class.html
1354
+ ^^^ also add something like this... as a query thingy.
1355
+ catalyse(cls, dna, linear=True)
1356
+ List the sequence fragments after cutting dna with enzyme.
1357
+ catalyze(cls, dna, linear=True)
1358
+ List the sequence fragments after cutting dna with enzyme.
1359
+ is_blunt(cls)
1360
+ Return if the enzyme produces blunt ends.
1361
+ is_5overhang(cls)
1362
+ Return if the enzyme produces 5' overhanging ends.
1363
+ is_3overhang(cls)
1364
+ Return if the enzyme produces 3' overhanging ends.
1365
+ overhang(cls)
1366
+ Return the type of the enzyme's overhang as string.
1367
+ compatible_end(cls, batch=None)
1368
+ List all enzymes that produce compatible ends for the enzyme.
1369
+ http://biopython.org/DIST/docs/api/Bio.Restriction.Restriction.Blunt-class.html
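The "blunt-end cutter" versus "staggered cut" line requested above could be derived from the cut position alone. The following is a sketch under a stated assumption: the site is palindromic and cut symmetrically on both strands, so the bottom-strand cut (in top-strand coordinates) is `length - cut`. The method name `end_type` is hypothetical, and `|` marks the top-strand cut as in the BglI output.

```ruby
# Hypothetical sketch: classify the ends produced at a symmetric site,
# written with '|' at the top-strand cut, e.g. 'G|AATTC'.
def end_type(site_with_cut)
  cut = site_with_cut.index('|')
  raise ArgumentError, 'no cut marker found' if cut.nil?

  length = site_with_cut.delete('|').size
  bottom_cut = length - cut # assumption: symmetric cut on the other strand
  if cut == bottom_cut
    :blunt
  elsif cut < bottom_cut
    :five_prime_overhang
  else
    :three_prime_overhang
  end
end

puts end_type('GCCNNNN|NGGC') # BglI, as shown above
puts end_type('G|AATTC')      # EcoRI
puts end_type('GAT|ATC')      # EcoRV
```

Enzymes that cut asymmetrically or outside their recognition site would need both cut positions stored explicitly.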
1370
+ --------------------------------------------------------------------------------
1371
+ (200) → https://www.reddit.com/r/bioinformatics/comments/5o3kn8/bioinformatics_contest_2017_jan_23rd29th_solve_as/
1372
+ --------------------------------------------------------------------------------
1373
+ (201) Finish all of biophp integration into bioroebe.
1374
+ http://www.biophp.org/
1375
+ --------------------------------------------------------------------------------
1376
+ (202) locate oriC here:
1377
+ ttcgttaagtaacttcactgcccgtagtgtaccggcattcgctagcaagagtctttctg
1378
+ ggcaagcttcacttgtgatcgcggcctgtgcccccggaatgaaacaaccacgtccctgct
1379
+ aacaacgacgggaaaagggaagtgatccgtcggcagacccagactagtgcccttctccgg
1380
+ cttccaacaccaacgagtcggaccgaattgagcactcgaatgcacggcgctttttgccgg
1381
+ ccgaaacggcgcctccgcattgatcgacgcacggcctcttttggctacagcgcatggctt
1382
+ tacactcggcatgcatttccagtgctaatcaaacagaattccttgtaaagtccttcaacc
1383
+ gtgacagactatcgctaaggagcctttccagtcgtgcctgcaatcactcgcgaaatcaac
1384
+ aaatctacatctaagcacgctcgtggttcggagtcccgccctcatgtggaccatagccgg
1385
+ ttcgcccgagtcctaggcacgatcagaggacctatctttcgccactcaactcttctgagt
1386
+ gaaacaatatcgaccgaaaccttgctcggttttgtccacaacaacgtcaggcccataagc
1387
+ agacgacattagtccgctgttgtcgcgggcgtcccatagccgtacgatgtcccgtcgga
1388
+ ori?
1389
+ ^^^ this shall give us all ORI in a sequence.
1390
+ DnaA protein binds to DNA-Box in an ori.
1391
+ '9cut'
1392
+ '8cut'
1393
+ '7cut'
1394
+ ^^^ these give us slices
1395
+ But I do not know how to locate ORIs.
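One commonly used heuristic for locating a bacterial origin is the cumulative GC skew: in many bacterial chromosomes the running sum of (G minus C) tends to reach its minimum near the origin. The sketch below only illustrates that heuristic; it is not existing bioroebe code, the method name is hypothetical, and the approach is meant for whole chromosomes rather than for a short fragment like the one above.

```ruby
# Hypothetical helper: 1-based position at which the cumulative GC skew
# (count of G minus count of C) is minimal; returns 0 if the skew never
# drops below zero.
def min_gc_skew_position(dna)
  skew = 0
  best_skew = 0
  best_position = 0
  dna.upcase.each_char.with_index(1) do |base, position|
    skew += 1 if base == 'G'
    skew -= 1 if base == 'C'
    if skew < best_skew
      best_skew = skew
      best_position = position
    end
  end
  best_position
end

puts min_gc_skew_position('cccggg')
```

A candidate region found this way could then be scanned for DnaA boxes, which matches the note that the DnaA protein binds to DNA boxes in an ori.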
1396
+ --------------------------------------------------------------------------------
1397
+ (203) → ^^^ also integrate git into bioroebe.
1398
+ --------------------------------------------------------------------------------
1399
+ (204) WE MUST IMPROVE THIS DRASTICALLY.
1400
+ THEN UPLOAD IT AND USE IT AS THE BASIS FOR APPLICATIONS.
1401
+ --------------------------------------------------------------------------------
1402
+ (205) Study MetaCyc
1403
+ ^^^ study metabolic pathways.
1404
+ http://metacyc.org/
1405
+ → Create KuroMetaCyc, in Analogy towards Metabolic Cycle.
1406
+ --------------------------------------------------------------------------------
1407
+ (206) → Welcome to BioShell May 2012. Type "help" to get some help.
1408
+ Hello and welcome to the Bio Shell Version, last updated: May 2012
1409
+ BIO SHELL> IPEYVDWRQKGAVTPVKNQGSCGSCWAFSAVVTIEGIIKIRTGNLNQYSEQELLDCDRRSYGCNGGYPWSALQLVAQYGI
1410
+ BIO SHELL> HYRNTYPYEGVQRYCRSREKGPYAAKTDGVRQVQPYNQGALLYSIANQPVSVVLQAAGKDFQLYRGGIFVGPCGNKVDHA
1411
+ BIO SHELL> VAAVGYGPNYILIKNSWGTGWGENGYIRIKRGTGNSYGVCGLYTSSFYPVKN
1412
+ BIO SHELL>
1413
+ BIO SHELL> input?
1414
+ BIO SHELL> pdb
1415
+ BIO SHELL>
1416
+ ^^^ (1)
1417
+ Add a pdb submodule
1418
+ When we type this, we then ask:
1419
+ "Please input your FASTA format now:"
1420
+ --------------------------------------------------------------------------------
1421
+ (207) → http://biopython.org/DIST/docs/cookbook/Restriction.html#mozTocId101269
1422
+ ^^^ support this also:
1423
+ >>> from Bio import Restriction
1424
+ >>> dir()
1425
+ ['Restriction', '__builtins__', '__doc__', '__name__']
1426
+ >>> Restriction.EcoRI
1427
+ EcoRI
1428
+ >>> Restriction.EcoRI.site
1429
+ 'GAATTC'
1430
+ >>>
1431
+ and document it somewhere; perhaps in a new .cgi page.
1432
+ The above will return the exact site, without verbosity.
1433
+ Restriction.EcoRI
1434
+ Restriction.EcoRI.site
1435
+ result = Bioroebe.restriction_enzyme 'EcoRI.site'
1922
1436
  # => "GAATTC"
1923
-
1924
- ^^^ this already works partially, but not yet
1925
- sufficiently.
1926
-
1927
-
1928
-
1929
- - Clone the following functionality from Bio:
1930
-
1931
- require 'bio'
1932
- quality_threshold = 60
1933
- Bio::FlatFile.open('sample.fastq').each {|entry|
1437
+ ^^^ this already works partially, but not yet
1438
+ sufficiently.
1439
+ - Clone the following functionality from Bio:
1440
+ require 'bio'
1441
+ quality_threshold = 60
1442
+ Bio::FlatFile.open('sample.fastq').each {|entry|
1934
1443
  hq_seq = entry.mask(quality_threshold)
1935
1444
  puts hq_seq.output_fasta(entry.entry_id)
1936
- }
1937
-
1938
-
1939
- - Document the workflow of all scripts in a reproducible
1940
- manner, e. g. so that others can use it...
1941
-
1942
-
1943
- it must also allow for different tables to be
1944
- used! check this... so that we can search in
1945
- standard ORF but also in different ORFs
1946
-
1947
- and report the length, at least for the longest ORF,
1948
- start + stop... so that the result is also
1949
- correct
1950
-
1951
-
1952
-
1953
- - Before we start to use rails for bioroebe, let's polish the GUI
1954
- components more.
1955
- But we really should use rails for this project too, at the
1956
- least optionally.
1957
- Ideally every small class has a tiny widget that can be
1958
- interconnected. Perhaps do this via sinatra first and think
1959
- of ways how to generalize on it.
1960
-
1961
- We should probably be systematic about this and go through
1962
- each class, then write a GUI for it:
1963
- ^^^ first GUI
1964
- ^^^ then rails
1965
-
1966
- Ok, first task - write a lot of GUIs.
1967
- gui/hamming_distance.rb - works ok-ish but it needs an in-widget
1968
- notification, that is, we need to somehow colourize this thing.
1969
-
1970
-
1971
-
1972
- - fix batch generation of .star files
1973
-
1974
- - add script that will generate those weird
1975
- files ... from the tch shell script
1976
-
1977
-
1978
- - also add examples to mohitstar --help
1979
- --examples
1980
- we also need the Z coordinates into the .star file
1981
- test.xmd
1982
-
1983
- /home/kumar/Desktop/test_exit/InputMicrographs/
1984
-
1985
-
1986
-
1987
-
1988
- ..........................................................................
1989
- BioTodo - GENESIS, science fiction.
1990
-
1991
- - create virus(:which_one, :amount) # Note the difference to the below
1992
- - create hydra(:amount)
1993
- - create bread
1994
- ..........................................................................
1995
- → both
1996
- ^ should work, does not work right now.
1997
- ..........................................................................
1998
- → Taxonomy is now integrated into bioroebe. This is good but we need more
1999
- documentation, some more tests, a rethinking of the layout and the
2000
- structures, and a fixing of the query-part of the database.
2001
-
2002
- Also, make sure that it does the main functions.
2003
-
2004
- rewrite for taxonomy in bioroebe
2005
- and while doing this, also continue with the
2006
- protokoll in bioinformatik
2007
- so that we can finish both related-problemsters
2008
- at about the same time \o/
2009
- AND document this related-problems too
2010
- Integrate this some other day...
2011
- ..........................................................................
2012
- - http://www.restrictionmapper.org/cgi-bin/sitefind3.pl
2013
-
2014
- ^^^ This functionality should be integrated, so that
1445
+ }
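The quality masking in the BioRuby snippet above could be cloned roughly as follows. This is a plain-Ruby sketch under stated assumptions: quality strings use Phred+33 encoding, bases with a score below the threshold are replaced by a mask character, and the method name `mask_low_quality` is hypothetical.

```ruby
# Hypothetical helper: mask bases whose Phred score is below the threshold.
# Assumes Phred+33 encoded quality strings of the same length as the sequence.
def mask_low_quality(sequence, quality, threshold, mask_char = 'n')
  sequence.chars.zip(quality.bytes).map { |base, byte|
    (byte - 33) < threshold ? mask_char : base
  }.join
end

puts mask_low_quality('ACGT', 'II!I', 20)
```

Reading the FASTQ records themselves (four lines per entry) would still have to be added around this method.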
1446
+ - Document the workflow of all scripts in a reproducible
1447
+ manner, e. g. so that others can use it...
1448
+ it must also allow for different tables to be
1449
+ used! check this... so that we can search in
1450
+ standard ORF but also in different ORFs
1451
+ and report the length, at least for the longest ORF,
1452
+ start + stop... so that the result is also
1453
+ correct
1454
+ - Before we start to use rails for bioroebe, let's polish the GUI
1455
+ components more.
1456
+ But we really should use rails for this project too, at the
1457
+ least optionally.
1458
+ Ideally every small class has a tiny widget that can be
1459
+ interconnected. Perhaps do this via sinatra first and think
1460
+ of ways how to generalize on it.
1461
+ We should probably be systematic about this and go through
1462
+ each class, then write a GUI for it:
1463
+ ^^^ first GUI
1464
+ ^^^ then rails
1465
+ Ok, first task - write a lot of GUIs.
1466
+ gui/hamming_distance.rb - works ok-ish but it needs an in-widget
1467
+ notification, that is, we need to somehow colourize this thing.
1468
+ - fix batch generation of .star files
1469
+ - add script that will generate those weird
1470
+ files ... from the tch shell script
1471
+ - also add examples to mohitstar --help
1472
+ --examples
1473
+ we also need the Z coordinates into the .star file
1474
+ test.xmd
1475
+ /home/kumar/Desktop/test_exit/InputMicrographs/
1476
+ --------------------------------------------------------------------------------
1477
+ (208) BioTodo - GENESIS, science fiction.
1478
+ - create virus(:which_one, :amount) # Note the difference to the below
1479
+ - create hydra(:amount)
1480
+ - create bread
1481
+ --------------------------------------------------------------------------------
1482
+ (209) → both
1483
+ ^ should work, does not work right now.
1484
+ --------------------------------------------------------------------------------
1485
+ (210) → Taxonomy is now integrated into bioroebe. This is good but we need more
1486
+ documentation, some more tests, a rethinking of the layout and the
1487
+ structures, and a fixing of the query-part of the database.
1488
+ Also, make sure that it does the main functions.
1489
+ rewrite for taxonomy in bioroebe
1490
+ and while doing this, also continue with the
1491
+ protokoll in bioinformatik
1492
+ so that we can finish both related-problemsters
1493
+ at about the same time \o/
1494
+ AND document this related-problems too
1495
+ Integrate this some other day...
1496
+ --------------------------------------------------------------------------------
1497
+ (211) → http://www.restrictionmapper.org/cgi-bin/sitefind3.pl
1498
+ ^^^ This functionality should be integrated, so that
2015
1499
  ALL restriction enzymes are tried out, starting from
2016
1500
  a given sequence.
2017
- ..........................................................................
2018
- A search is essentially substring search across a database of strings
2019
- (albeit with a smaller alphabet). Some common use cases: one,
2020
- scientists will search for certain genes that they've used in engineered
2021
- plasmids. Two, since multiple codons can translate to the same amino
2022
- acid, a process called "codon optimization" might replace codons with
2023
- equivalent ones that work better given a certain kind of organism -
2024
- your CAA (glutamine) might become CAG (still glutamine) instead. You
2025
- might want to search by amino acid ("glutamine") so that you find
2026
- it regardless of whether CAA or CAG was used.
2027
-
2028
- ^^^^ yeah enable this too.
2029
- Perhaps start with a "codon optimizer". This shall, for a given
2030
- organism, replace the given codons with the "optimal" ones.
2031
- Then also add this to sinatra interface.
2032
-
2033
- Bioroebe::DetermineOptimalCodons
2034
- ^^^ this is currently incomplete.
2035
-
2036
- ..........................................................................
2037
- Redo restriction enzymes completely.
2038
- And polish this a LOT.
2039
- This may take some days. But we want this to be REALLY good and
2040
- lasting for a long time.
2041
- Need to keep on working at that!
2042
- ..........................................................................
2043
- Add: average_aminoacid_weight?
2044
-
2045
-
2046
- === LV-Nummer 300214 UE Übung III B Sequenzanalysen in der Molekularbiologie
2047
- Pubmed
2048
- Finding sequences
2049
- Sequence homology search (Blast, FASTA)
2050
- Pairwise sequence alignment
2051
- DNA analysis, translation
2052
- Gene finding in genomes
2053
- ^^^ find genes
2054
- Plasmid Cloning
2055
- Primer for PCR
2056
- Protein analysis
2057
- PROSITE
2058
- EMBOSS
2059
- ^^^ go through this again carefully.
2060
-
2061
- write a lot of C++ in bioroebe, optimize it as far as possible,
2062
- and then at some later point provide ruby bindings for it
2063
- add the ability to compare two FASTA files
2064
- can probably do so via the two sequences.
2065
- integrate "grundlagen der bioinfo" slides
2066
- at the same time, while we study for it.
2067
- ntSeq.each_entry do |e|
2068
- # pep = Bio::Sequence.new(e.naseq.reverse_complement!.translate)
2069
- pep = Bio::Sequence.new(e.naseq.translate)
2070
- if strand == 0
2071
- pep = Bio::Sequence.new(e.naseq.reverse_complement!.translate)
2072
- end
2073
- puts pep.output_fasta(e.definition,20)
2074
- end
2075
- ^^^ we must show this in n characters per line:
2076
- see https://raw.githubusercontent.com/zorino/bioruby-scripts/master/transeq.rb
2077
- We must be able to align not only nucleotides but also aminoacids.
2078
- But where is the alignment comparer? perhaps hamming distance?
2079
- hmm we have to see.
2080
- ..........................................................................
2081
- /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/menu.rb:311:in `menu': undefined method `upcase' for ["EcoRI"]:Array (NoMethodError)
2082
- from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/user_input.rb:31:in `block in enter_main_loop'
2083
- from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/user_input.rb:12:in `loop'
2084
- from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/user_input.rb:12:in `enter_main_loop'
2085
- from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/initialize.rb:42:in `initialize'
2086
- from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/bioshell.rb:52:in `new'
2087
- from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/bioshell.rb:52:in `shell'
2088
- from /System/Executables/bioshell:6:in `<main>'
2089
-
2090
- ^^^ also fix this stupid bug.
2091
- perhaps redo the whole restriction enzyme stuff.
2092
-
2093
- → Taxonomy components:
2094
-
2095
- (1) We want to run just one query instead of many small
1501
+ --------------------------------------------------------------------------------
1502
+ (212) A search is essentially substring search across a database of strings
1503
+ (albeit with a smaller alphabet). Some common use cases: one,
1504
+ scientists will search for certain genes that they've used in engineered
1505
+ plasmids. Two, since multiple codons can translate to the same amino
1506
+ acid, a process called "codon optimization" might replace codons with
1507
+ equivalent ones that work better given a certain kind of organism -
1508
+ your CAA (glutamine) might become CAG (still glutamine) instead. You
1509
+ might want to search by amino acid ("glutamine") so that you find
1510
+ it regardless of whether CAA or CAG was used.
1511
+ ^^^^ yeah enable this too.
1512
+ Perhaps start with a "codon optimizer". This shall, for a given
1513
+ organism, replace the given codons with the "optimal" ones.
1514
+ Then also add this to sinatra interface.
1515
+ Bioroebe::DetermineOptimalCodons
1516
+ ^^^ this is currently incomplete.
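Searching by amino acid rather than by codon, as described above, could look roughly like this. It is a minimal sketch: the table contains only two amino acids for illustration (a real table has 64 codons), and the names `SYNONYMOUS_CODONS` and `codon_positions` are hypothetical, not bioroebe API.

```ruby
# Illustrative subset only; a complete codon table would list all 64 codons.
SYNONYMOUS_CODONS = {
  'glutamine' => %w[CAA CAG],
  'lysine'    => %w[AAA AAG]
}.freeze

# Hypothetical helper: 0-based nucleotide positions of every codon that
# encodes the given amino acid, reading in the given frame.
def codon_positions(dna, amino_acid, frame = 0)
  wanted = SYNONYMOUS_CODONS.fetch(amino_acid)
  codons = dna.upcase[frame..-1].scan(/.{3}/)
  codons.each_index
        .select { |index| wanted.include?(codons[index]) }
        .map { |index| frame + index * 3 }
end

puts codon_positions('ATGCAACAGTAA', 'glutamine').inspect
```

A codon optimizer could reuse the same table in the other direction, replacing each codon with the preferred synonym for a given organism; the usage frequencies needed for that are not included here.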
1517
+ --------------------------------------------------------------------------------
1518
+ (213) → Redo restriction enzymes completely.
1519
+ And polish this a LOT.
1520
+ This may take some days. But we want this to be REALLY good and
1521
+ lasting for a long time.
1522
+ Need to keep on working at that!
1523
+ --------------------------------------------------------------------------------
1524
+ (214) → Add: average_aminoacid_weight?
1525
+ === LV-Nummer 300214 UE Übung III B Sequenzanalysen in der Molekularbiologie
1526
+ → Pubmed
1527
+ Finding sequences
1528
+ → Sequence homology search (Blast, FASTA)
1529
+ → Pairwise sequence alignment
1530
+ DNA analysis, translation
1531
+ Gene finding in genomes
1532
+ ^^^ find genes
1533
+ Plasmid Cloning
1534
+ Primer for PCR
1535
+ Protein analysis
1536
+ PROSITE
1537
+ EMBOSS
1538
+ ^^^ go through this again carefully.
1539
+ write a lot of C++ in bioroebe, optimize it as far as possible,
1540
+ and then at some later point provide ruby bindings for it
1541
+ add the ability to compare two FASTA files
1542
+ can probably do so via the two sequences.
1543
+ integrate "grundlagen der bioinfo" slides
1544
+ at the same time, while we study for it.
1545
+ ntSeq.each_entry do |e|
1546
+ # pep = Bio::Sequence.new(e.naseq.reverse_complement!.translate)
1547
+ pep = Bio::Sequence.new(e.naseq.translate)
1548
+ if strand == 0
1549
+ pep = Bio::Sequence.new(e.naseq.reverse_complement!.translate)
1550
+ end
1551
+ puts pep.output_fasta(e.definition,20)
1552
+ end
1553
+ ^^^ we must show this in n characters per line:
1554
+ see https://raw.githubusercontent.com/zorino/bioruby-scripts/master/transeq.rb
1555
+ We must be able to align not only nucleotides but also aminoacids.
1556
+ But where is the alignment comparer? perhaps hamming distance?
1557
+ hmm we have to see.
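Showing a translated sequence in n characters per line, as requested above, only needs a small wrapping step. The following is a sketch with a hypothetical method name; the default width of 60 is an assumption, not something the notes prescribe.

```ruby
# Hypothetical helper: break a sequence into lines of at most `width` characters.
def wrap_sequence(sequence, width = 60)
  sequence.scan(/.{1,#{width}}/).join("\n")
end

puts '>example'
puts wrap_sequence('IPEYVDWRQKGAVTPVKNQGSCGSCWAFSAVVTIEGIIKIRTGNLNQYSE', 20)
```

The same method works for nucleotides and amino acids, since it only looks at string length.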
1558
+ --------------------------------------------------------------------------------
1559
+ (215) → /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/menu.rb:311:in `menu': undefined method `upcase' for ["EcoRI"]:Array (NoMethodError)
1560
+ from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/user_input.rb:31:in `block in enter_main_loop'
1561
+ from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/user_input.rb:12:in `loop'
1562
+ from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/user_input.rb:12:in `enter_main_loop'
1563
+ from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/initialize.rb:42:in `initialize'
1564
+ from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/bioshell.rb:52:in `new'
1565
+ from /Programs/Ruby/2.3.1/lib/ruby/site_ruby/2.3.0/bioroebe/bioshell/bioshell.rb:52:in `shell'
1566
+ from /System/Executables/bioshell:6:in `<main>'
1567
+ ^^^ also fix this stupid bug.
1568
+ perhaps redo the whole restriction enzyme stuff.
1569
+ Taxonomy components:
1570
+ We want to run just one query instead of many small
2096
1571
  queries.
2097
-
2098
- Taxonomy component
2099
-
2100
- (2) Send email, also enable disable_email notification
1572
+ Taxonomy component
1573
+ (2) Send email, also enable disable_email notification
2101
1574
  this should be simple if we use the email part
2102
1575
  of cyberweb-project.
2103
1576
  We send an email when everything has finished.
2104
-
2105
- data = 'Last update... job is now finished,
2106
- at this date.'
2107
- SendEmail.new to: Roebe.email?, data
2108
-
2109
- ..........................................................................
2110
-
2111
-
2112
- Document which parts of emboss have already been copied.
2113
- EMBOSS.md
2114
- ..........................................................................
2115
-
2116
-
2117
-
2118
-
2119
-
2120
- - Trametes_versicolor_FP-101664_SS1_pyranose_2-oxidase_partial_mRNA_XM_008046051.1.fasta
2121
-
2122
-
2123
-
2124
-
2125
- Bioroebe::Shell: Now loading from `Trametes_versicolor_FP-101664_SS1_pyranose_2-oxidase_partial_mRNA_XM_008046051.1.fasta`.
2126
- Bioroebe::ParseFasta: Will read from the file `Trametes_versicolor_FP-101664_SS1_pyranose_2-oxidase_partial_mRNA_XM_008046051.1.fasta`.
2127
- This sequence is assumed to be DNA or RNA.
2128
- The GC content of "XM_008046051.1 Trametes versicolor FP-101664 SS1 pyranose 2-oxidase partial mRNA" is:
2129
- 60.41667 %
2130
- This sequence has 1872 nucleotides.
2131
- We have identified a total of 1 entry in this fasta dataset.
2132
- Setting DNA sequence to (1872 nucleotides):
2133
- 5'- ATGTCTACCAGCTCGAGCGACCCGTTCTTCAACTTCACGAAGTCGAGCTTCAGGAGCGCGGCGGCGCAGAAGGCCTCGGCGACTTCTCTGCCGCCGCTGCCTGGTCCCGACAAGAAAGTCCCTGGAATGGACATCAAGTACGACGTTGTCATAGTAGGCTCCGGACCGATTGGATGCACCTATGCCCGTGAGCTCGTCGAAGCCGGTTACAAGGTCGCGATGTTCGACATCGGAGAGATCGACTCCGGCCTGAAGATCGGTGCCCACAAGAAGAACACTGTCGAATACCAGAAGAACATTGACAAGTTTGTGAACGTCATTCAGGGTCAATTGATGTCTGTTTCCGTTCCCGTCAATACCCTCGTGATCGATACGCTCAGCCCGACGTCTTGGCAAGCTTCATCGTTCTTCGTCCGTAACGGCTCGAACCCAGAGCAGGACCCGCTTCGTAACCTCAGTGGTCAGGCGGTCACACGCGTCGTCGGGGGAATGTCCACGCATTGGACTTGCGCGACACCCCGCTTTGACCGCGAGCAGCGCCCACTGCTCGTGAAGGATGACACGGACGCCGACGACGCCGAGTGGGACCGGCTGTACACCAAGGCCGAGTCGTACTTCAAGACCGGGACGGACCAGTTCAAGGAGTCGATCCGCCACAACCTCGTGCTCAACAAGCTCGCGGAGGAATACAAAGGTCAGCGCGACTTCCAGCAGATCCCGCTGGCGGCAACGCGCCGCAGTCCGACCTTCGTCGAGTGGAGCTCGGCGAACACCGTGTTCGACCTCCAGAACAGGCCGAACACGGACGCGCCGAATGAGCGCTTCAACCTCTTCCCCGCGGTCGCATGTGAGCGCGTCGTGCGCAACACGTCGAACTCCGAGATCGAGAGTCTGCACATCCACGACCTCATCTCAGGCGACCGCTTCGAAATCAAGGCAGACGTGTTTGTTCTCACAGCCGGGGCGGTCCACAACGCGCAGCTTCTCGTGAACTCTGGCTTTGGACAGCTGGGCCGGCCGGACCCCGCGAACCCGCCGCGGTTGCTGCCTTCCCTGGGAAGCTACATCACCGAGCAGTCGCTCGTCTTCTGCCAGACCGTGATGAGCACCGAGCTCATCGACAGCGTCAAGTCCGACATGATCATCAGGGGCAACCCTGGTGATCCGGGGTATAGCGTCACGTACACGCCGGGCGCGTCGACCAACAAGCACCCGGACTGGTGGAACGAGAAGGTGAAGAACCACATGATGCAGCACCAGGAGGACCCGCTCCCGATCCCGTTCGAGGACCCCGAGCCCCAGGTCACCACCCTGTTCCAGCCGTCGCACCCGTGGCACACCCAGATCCACCGCGACGCTTTCAGTTACGGTGCGGTGCAGCAAAGCATCGACTCGCGTCTCATCGTCGACTGGCGCTTTTTCGGAAGGACGGAGCCCAAGGAGGAAAACAAGCTCTGGTTCTCGGACAAGATCACCGACACGTACAACATGCCGCAGCCGACGTTCGACTTCCGCTTCCCGGCAGGCCGCACGAGCAAGGAGGCGGAGGACATGATGACCGACATGTGCGTCATGTCGGCGAAGATTGGTGGCTTCCTGCCCGGCTCTCTCCCGCAATTCATGGAGCCCGGTCTTGTCCTTCACCTCGGTGGTACGCACCGCATGGGCTTCGATGAGCAGGAGGACAAGTGCTGCGTCAACACGGACTCCCGCGTGTTCGGCTTTAAGAACCTTTTCCTCGGCGGCTGCGGCAACATTCCCACCGCGTACGGCGCGAACCCGACGCTCACCGCAATGTCGCTCGCGATCAAGAGTTGCGAGTACATCAAGAACAACTTCACACCGAGCCCTTTCACAGATCAGGCTCAGTGA - 3'
2134
- BIO SHELL> gc?
2135
- Traceback (most recent call last):
2136
- 12: from /System/Index/bin/bioshell:27:in `<main>'
2137
- 11: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/shell/shell.rb:109:in `shell'
2138
- 10: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/shell/shell.rb:109:in `new'
2139
- 9: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/shell/initialize.rb:152:in `initialize'
2140
- 8: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/shell/user_input.rb:18:in `enter_main_loop'
2141
- 7: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/shell/user_input.rb:18:in `loop'
2142
- 6: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/shell/user_input.rb:41:in `block in enter_main_loop'
2143
- 5: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/shell/menu.rb:997:in `menu'
2144
- 4: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/shell/shell.rb:2605:in `calculcate_gc_content'
2145
- 3: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/shell/shell.rb:2605:in `new'
2146
- 2: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/calculate/calculate_gc_content.rb:41:in `initialize'
2147
- 1: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/calculate/calculate_gc_content.rb:71:in `set_data'
2148
- /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/calculate/calculate_gc_content.rb:71:in `exist?': can't convert Bioroebe::Sequence to String (Bioroebe::Sequence#to_str gives Bioroebe::Sequence) (TypeError)
2149
-
2150
-
2151
-
2152
- - require 'bio'
2153
-
2154
- keywords = ARGV.join(' ')
2155
- options = {
2156
- 'retmax' => 1
2157
- }
2158
- entries = Bio::PubMed.esearch(keywords, options)
2159
- Bio::PubMed.efetch(entries).each do |entry|
2160
- medline = Bio::MEDLINE.new(entry)
2161
- reference = medline.reference
2162
- puts reference.bibtex
2163
- end
2164
- ^^^ enable BioPubMed access, similar to bioruby,
2165
- then document it as well.
2166
-
2167
- - Learn from:
2168
-
2169
- http://www.snapgene.com/products/snapgene_viewer/
2170
-
2171
- -------------------------------------------------------------------------------
2172
- (1) → We should support GFP tagging, i.e. what the
1577
+ data = 'Last update... job is now finished,
1578
+ at this date.'
1579
+ SendEmail.new to: Roebe.email?, data
1580
+ --------------------------------------------------------------------------------
1581
+ (216) → Document which parts of emboss have already been copied.
1582
+ → EMBOSS.md
1583
+ --------------------------------------------------------------------------------
1584
+ (217) → Trametes_versicolor_FP-101664_SS1_pyranose_2-oxidase_partial_mRNA_XM_008046051.1.fasta
1585
+ Bioroebe::Shell: Now loading from `Trametes_versicolor_FP-101664_SS1_pyranose_2-oxidase_partial_mRNA_XM_008046051.1.fasta`.
1586
+ Bioroebe::ParseFasta: Will read from the file `Trametes_versicolor_FP-101664_SS1_pyranose_2-oxidase_partial_mRNA_XM_008046051.1.fasta`.
1587
+ This sequence is assumed to be DNA or RNA.
1588
+ The GC content of "XM_008046051.1 Trametes versicolor FP-101664 SS1 pyranose 2-oxidase partial mRNA" is:
1589
+ 60.41667 %
1590
+ This sequence has 1872 nucleotides.
1591
+ We have identified a total of 1 entry in this fasta dataset.
1592
+ Setting DNA sequence to (1872 nucleotides):
1593
+ 5'- ATGTCTACCAGCTCGAGCGACCCGTTCTTCAACTTCACGAAGTCGAGCTTCAGGAGCGCGGCGGCGCAGAAGGCCTCGGCGACTTCTCTGCCGCCGCTGCCTGGTCCCGACAAGAAAGTCCCTGGAATGGACATCAAGTACGACGTTGTCATAGTAGGCTCCGGACCGATTGGATGCACCTATGCCCGTGAGCTCGTCGAAGCCGGTTACAAGGTCGCGATGTTCGACATCGGAGAGATCGACTCCGGCCTGAAGATCGGTGCCCACAAGAAGAACACTGTCGAATACCAGAAGAACATTGACAAGTTTGTGAACGTCATTCAGGGTCAATTGATGTCTGTTTCCGTTCCCGTCAATACCCTCGTGATCGATACGCTCAGCCCGACGTCTTGGCAAGCTTCATCGTTCTTCGTCCGTAACGGCTCGAACCCAGAGCAGGACCCGCTTCGTAACCTCAGTGGTCAGGCGGTCACACGCGTCGTCGGGGGAATGTCCACGCATTGGACTTGCGCGACACCCCGCTTTGACCGCGAGCAGCGCCCACTGCTCGTGAAGGATGACACGGACGCCGACGACGCCGAGTGGGACCGGCTGTACACCAAGGCCGAGTCGTACTTCAAGACCGGGACGGACCAGTTCAAGGAGTCGATCCGCCACAACCTCGTGCTCAACAAGCTCGCGGAGGAATACAAAGGTCAGCGCGACTTCCAGCAGATCCCGCTGGCGGCAACGCGCCGCAGTCCGACCTTCGTCGAGTGGAGCTCGGCGAACACCGTGTTCGACCTCCAGAACAGGCCGAACACGGACGCGCCGAATGAGCGCTTCAACCTCTTCCCCGCGGTCGCATGTGAGCGCGTCGTGCGCAACACGTCGAACTCCGAGATCGAGAGTCTGCACATCCACGACCTCATCTCAGGCGACCGCTTCGAAATCAAGGCAGACGTGTTTGTTCTCACAGCCGGGGCGGTCCACAACGCGCAGCTTCTCGTGAACTCTGGCTTTGGACAGCTGGGCCGGCCGGACCCCGCGAACCCGCCGCGGTTGCTGCCTTCCCTGGGAAGCTACATCACCGAGCAGTCGCTCGTCTTCTGCCAGACCGTGATGAGCACCGAGCTCATCGACAGCGTCAAGTCCGACATGATCATCAGGGGCAACCCTGGTGATCCGGGGTATAGCGTCACGTACACGCCGGGCGCGTCGACCAACAAGCACCCGGACTGGTGGAACGAGAAGGTGAAGAACCACATGATGCAGCACCAGGAGGACCCGCTCCCGATCCCGTTCGAGGACCCCGAGCCCCAGGTCACCACCCTGTTCCAGCCGTCGCACCCGTGGCACACCCAGATCCACCGCGACGCTTTCAGTTACGGTGCGGTGCAGCAAAGCATCGACTCGCGTCTCATCGTCGACTGGCGCTTTTTCGGAAGGACGGAGCCCAAGGAGGAAAACAAGCTCTGGTTCTCGGACAAGATCACCGACACGTACAACATGCCGCAGCCGACGTTCGACTTCCGCTTCCCGGCAGGCCGCACGAGCAAGGAGGCGGAGGACATGATGACCGACATGTGCGTCATGTCGGCGAAGATTGGTGGCTTCCTGCCCGGCTCTCTCCCGCAATTCATGGAGCCCGGTCTTGTCCTTCACCTCGGTGGTACGCACCGCATGGGCTTCGATGAGCAGGAGGACAAGTGCTGCGTCAACACGGACTCCCGCGTGTTCGGCTTTAAGAACCTTTTCCTCGGCGGCTGCGGCAACATTCCCACCGCGTACGGCGCGAACCCGACGCTCACCGCAATGTCGCTCGCGATCAAGAGTTGCGAGTACATCAAGAACAACTTCACACCGAGCCCTTTCACAGATCAGGCTCAGTGA - 3'
1594
+ BIO SHELL> gc?
1595
+ Traceback (most recent call last):
1596
+ 12: from /System/Index/bin/bioshell:27:in `<main>'
1597
+ 11: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/shell/shell.rb:109:in `shell'
1598
+ 10: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/shell/shell.rb:109:in `new'
1599
+ 9: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/shell/initialize.rb:152:in `initialize'
1600
+ 8: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/shell/user_input.rb:18:in `enter_main_loop'
1601
+ 7: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/shell/user_input.rb:18:in `loop'
1602
+ 6: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/shell/user_input.rb:41:in `block in enter_main_loop'
1603
+ 5: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/shell/menu.rb:997:in `menu'
1604
+ 4: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/shell/shell.rb:2605:in `calculcate_gc_content'
1605
+ 3: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/shell/shell.rb:2605:in `new'
1606
+ 2: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/calculate/calculate_gc_content.rb:41:in `initialize'
1607
+ 1: from /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/calculate/calculate_gc_content.rb:71:in `set_data'
1608
+ /Programs/Ruby/2.6.4/lib/ruby/site_ruby/2.6.0/bioroebe/calculate/calculate_gc_content.rb:71:in `exist?': can't convert Bioroebe::Sequence to String (Bioroebe::Sequence#to_str gives Bioroebe::Sequence) (TypeError)
1609
+ - require 'bio'
1610
+ keywords = ARGV.join(' ')
1611
+ options = {
1612
+ 'retmax' => 1
1613
+ }
1614
+ entries = Bio::PubMed.esearch(keywords, options)
1615
+ Bio::PubMed.efetch(entries).each do |entry|
1616
+ medline = Bio::MEDLINE.new(entry)
1617
+ reference = medline.reference
1618
+ puts reference.bibtex
1619
+ end
1620
+ ^^^ enable BioPubMed access, similar to bioruby,
1621
+ then document it as well.
1622
+ - Learn from:
1623
+ http://www.snapgene.com/products/snapgene_viewer/
1624
+ --------------------------------------------------------------------------------
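Note on the `gc?` traceback above: Ruby raises "can't convert X to String (X#to_str gives X)" when an object's `to_str` returns something other than a real `String`. The following is a minimal sketch of a sequence wrapper that avoids this, assuming a hypothetical `MiniSequence` class; it is not Bioroebe's actual `Bioroebe::Sequence`, and the rounding to five decimals is only chosen to mirror the shell output.

```ruby
# Hypothetical stand-in for a sequence class; not Bioroebe's real implementation.
class MiniSequence
  def initialize(sequence)
    @sequence = sequence.to_s.upcase
  end

  # Implicit conversion has to return a real String. If it returned self,
  # methods such as File.exist? would raise a TypeError.
  def to_str
    @sequence
  end
  alias to_s to_str

  # Percentage of G and C among all stored characters, rounded to five decimals.
  def gc_content
    return 0.0 if @sequence.empty?

    (@sequence.count('GC') * 100.0 / @sequence.size).round(5)
  end
end

seq = MiniSequence.new('ATGCGC')
puts seq.gc_content
puts File.exist?(seq) # accepted because to_str yields a String
```

With a `to_str` like this, a check such as `File.exist?(input)` can be applied to either a file name or a sequence object without raising.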
1625
+ (218) We should support GFP tagging, i.e. what the
2173
1626
  protein construct should look like, and so on.
2174
1627
  This partially works...
2175
1628
  GFP? displays the sequence.
2176
- assign :GFP
2177
- inserts the sequence as the main DNA sequence.
2178
- What is missing? Hmmmm... possibly more
2179
- documentation.
2180
- -------------------------------------------------------------------------------
2181
-
2182
- - in bioroebe, create subsequences for siRNA, then scan for
2183
- submatcher + report where these are. Should be fast too.
2184
- -------------------------------------------------------------------------------
2185
- - Reverse complement now works quite well, also via the sinatra
2186
- interface. We still should have a way to show 5' and
2187
- 3', both on the commandline, and via sinatra.
2188
- Perhaps via --fancy commandline flag or so.
2189
- -------------------------------------------------------------------------------
2190
- - Cn3D files?
2191
- ^^^ add support for these; research what they are, too.
2192
- -------------------------------------------------------------------------------
2193
- - Consider adding graphviz, perhaps to the taxonomy project
2194
- where we make graphs towards different nodes or so...
2195
- -------------------------------------------------------------------------------
2196
- - in parse fasta
2197
- @colourize_sequence = false
2198
- ^^^ change this later on...
2199
- perhaps create a toplevel method
2200
- this method now exists, but we still have to make
2201
- the check better whether it is a protein or a DNA/RNA
2202
- add a toplevel method for this.
2203
- -------------------------------------------------------------------------------
2204
- clone the BLAST ident matcher functionality for amino acids into
2205
- Bioroebe.
2206
-
2207
- - fasta_download AAC76198.2
2208
-
2209
- ^^^ enable the above in the bioshell, and perhaps also outside
1629
+ assign :GFP
1630
+ inserts the sequence as the main DNA sequence.
1631
+ What is missing? Hmmmm... possibly more
1632
+ documentation.
1633
+ --------------------------------------------------------------------------------
1634
+ (219) → in bioroebe, create subsequences for siRNA, then scan for
1635
+ submatcher + report where these are. Should be fast too.
1636
+ --------------------------------------------------------------------------------
1637
+ (220) → Reverse complement now works quite well, also via the sinatra
1638
+ interface. We still should have a way to show 5' and
1639
+ 3', both on the commandline, and via sinatra.
1640
+ Perhaps via --fancy commandline flag or so.
1641
+ --------------------------------------------------------------------------------
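A minimal sketch for the 5'/3' display idea in the entry above. The method names are hypothetical, and the label format simply copies the `5'- ... - 3'` style that the shell already prints; how a `--fancy` flag would be wired up is left open.

```ruby
# Reverse complement of a DNA string.
# Characters other than A/T/G/C are passed through unchanged.
def reverse_complement(dna)
  dna.upcase.reverse.tr('ATGC', 'TACG')
end

# Optional "fancy" rendering that labels both strand ends.
def with_ends(dna)
  "5'- #{dna} - 3'"
end

puts with_ends(reverse_complement('ATGCCG'))
```

The same two helpers could back both the commandline and the sinatra output, so the labelling only lives in one place.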
1642
+ (221) → Cn3D files?
1643
+ ^^^ add support for these; research what they are, too.
1644
+ --------------------------------------------------------------------------------
1645
+ (222) → Consider adding graphviz, perhaps to the taxonomy project
1646
+ where we make graphs towards different nodes or so...
1647
+ --------------------------------------------------------------------------------
1648
+ (223) → in parse fasta
1649
+ @colourize_sequence = false
1650
+ ^^^ change this later on...
1651
+ perhaps create a toplevel method
1652
+ this method now exists, but we still have to make
1653
+ the check better whether it is a protein or a DNA/RNA
1654
+ add a toplevel method for this.
1655
+ --------------------------------------------------------------------------------
1656
+ (224) → clone the BLAST ident matcher functionality for amino acids into
1657
+ Bioroebe.
1658
+ - fasta_download AAC76198.2
1659
+ ^^^ enable the above in the bioshell, and perhaps also outside
2210
1660
  of the bioshell.
2211
-
2212
- http://www.ncbi.nlm.nih.gov/protein/145693187?report=fasta
2213
-
2214
- ^^^ shall use something such as the above
2215
-
2216
-
2217
-
2218
- -------------------------------------------------------------------------------
2219
- - Be able to mark exon/intron boundaries.
2220
-
2221
- - Add "taxid?" to tell us the name of the organism. This works now.
2222
-
2223
- ^^^ should also work with a local database. ← we integrate this
1661
+ http://www.ncbi.nlm.nih.gov/protein/145693187?report=fasta
1662
+ ^^^ shall use something such as the above
1663
+ --------------------------------------------------------------------------------
1664
+ (225) Be able to mark exon/intron boundaries.
1665
+ - Add "taxid?" to tell us the name of the organism. This works now.
1666
+ ^^^ should also work with a local database. ← we integrate this
2224
1667
  at a later point.
2225
-
2226
- - We have identified a total of 24 entries in this fasta dataset.
2227
-
2228
- There is a total of 4038 letters/nucleotides stored in total.
2229
- Setting DNA sequence to (1582 nucleotides):
2230
- ^^^^ hmm should enable:
2231
-
2232
- @seq1
2233
- @seq2
2234
-
2235
- and so forth
2236
-
2237
- - SUMOylation
2238
- "small ubiquitin modifier"
2239
- chemistry
2240
- SUMO proteins are small
2241
- 100 aa / 12 kD.
2242
-
2243
- Similar structural fold as ubiquitin
2244
-
2245
- Most SUMO-modified proteins contain the tetrapeptide consensus motif.
2246
- phi is a hydrophobic residue
2247
- kappa is the lysine conjugated to SUMO
2248
- x is any aa
2249
- D or E is an acidic residue
2250
- SomethingHydrophobic-K-x-D/E
2251
-
2252
- Prediction programmes, e.g. SUMOplot http://www.abgent.com/sumoplot
2253
-
2254
- >MYB44 LENGTH=305
2255
- MADRIKGPWSPEEDEQLRRLVVKYGPRNWTVISKSIPGRSGKSCRLRWCNQLSPQVEHRPFSAEEDETIARAHAQFGNKWATI
2256
- ARLLNGRTDNAVKNHWNSTLKRKCGGYDHRGYDGSEDHRPVKRSVSAGSPPVVTGLYMSPGSPTGSDVSDSSTIPILPSVELF
2257
- KPVPRPGAVVLPLPIETSSSSDDPPTSLSLSLPGADVSEESNRSHESTNINNTTSSRHNHNNTVSFMPFSGGFRGAIEEMGKS
2258
- FPGNGGEFMAVVQEMIKAEVRSYMTEMQRNNGGGFVGGFIDNGMIPMSQIGVGRIE
2259
-
2260
- ^^^
2261
- study sumoplot ...
2262
- -------------------------------------------------------------------------------
2263
- - http://a-little-book-of-r-for-bioinformatics.readthedocs.io/en/latest/src/chapter7.html
2264
- -------------------------------------------------------------------------------
2265
- - http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc22
2266
- ^^^ continue here; "You can also specify the table using the
2267
- NCBI table number which is shorter, and often included in
2268
- the feature annotation of GenBank files:"
2269
-
2270
- ^^^ work through this and see if it is good.
2271
-
2272
- -------------------------------------------------------------------------------
2273
-
2274
- - Clone ALL of biophp, if it is useful.
2275
-
2276
- Then state so too, then get rid of this entry here.
2277
-
2278
- But remember, we must also be able to do so via a webinterface!
2279
-
2280
- Oligos now work. Hmm.
2281
-
2282
- http://www.biophp.org/
2283
-
2284
- Let's use a table for now to show which variants we have enabled:
2285
-
2286
- DNA to protein | Bioroebe.to_aa
2287
- Protein to DNA | [IMPLEMENTED FULLY] → http://www.biophp.org/minitools/protein_to_dna/
2288
- | GTK GUI bindings now exist in a simple manner.
2289
- Restriction digest of DNA | http://www.biophp.org/minitools/restriction_digest/demo.php
2290
- Find Palindromic Sequences | [IMPLEMENTED FULLY]
2291
- Sequence manipulation and data |
2292
- Melting Temperature (Tm) Calculator |
2293
- PCR Amplification |
2294
- Microsatellite Repeats Finder |
2295
- Alignment of DNA/Protein sequences | http://www.biophp.org/minitools/seq_alignment/demo.php
2296
-
2297
- Microarray analysis: adaptive quantification |
2298
- Protein sequence information | http://www.biophp.org/minitools/protein_properties/demo.php
2299
- Reduced alphabets for proteins | http://www.biophp.org/minitools/reduce_protein_alphabet/ (started; implemented the first one for now)
2300
- Chaos Game Representation |
2301
- GC-, AT-, KETO- and oligo-skews generator |
2302
- Oligonucleotide Frequency | [IMPLEMENTED PARTIALLY]
2303
- ^^^ we need a way to obtain a hash with the frequencies;
1668
+ - We have identified a total of 24 entries in this fasta dataset.
1669
+ There is a total of 4038 letters/nucleotides stored in total.
1670
+ Setting DNA sequence to (1582 nucleotides):
1671
+ ^^^^ hmm should enable:
1672
+ @seq1
1673
+ @seq2
1674
+ and so forth
1675
+ - SUMOylation
1676
+ "small ubiquitin modifier"
1677
+ chemistry
1678
+ SUMO proteins are small
1679
+ 100 aa / 12 kD.
1680
+ Similar structural fold as ubiquitin
1681
+ Most SUMO-modified proteins contain the tetrapeptide consensus motif.
1682
+ phi is a hydrophobic residue
1683
+ kappa is the lysine conjugated to SUMO
1684
+ x is any aa
1685
+ D or E is an acidic residue
1686
+ SomethingHydrophobic-K-x-D/E
1687
+ Prediction programmes, e.g. SUMOplot http://www.abgent.com/sumoplot
1688
+ >MYB44 LENGTH=305
1689
+ MADRIKGPWSPEEDEQLRRLVVKYGPRNWTVISKSIPGRSGKSCRLRWCNQLSPQVEHRPFSAEEDETIARAHAQFGNKWATI
1690
+ ARLLNGRTDNAVKNHWNSTLKRKCGGYDHRGYDGSEDHRPVKRSVSAGSPPVVTGLYMSPGSPTGSDVSDSSTIPILPSVELF
1691
+ KPVPRPGAVVLPLPIETSSSSDDPPTSLSLSLPGADVSEESNRSHESTNINNTTSSRHNHNNTVSFMPFSGGFRGAIEEMGKS
1692
+ FPGNGGEFMAVVQEMIKAEVRSYMTEMQRNNGGGFVGGFIDNGMIPMSQIGVGRIE
1693
+ ^^^
1694
+ study sumoplot ...
1695
+ --------------------------------------------------------------------------------
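The consensus motif from the SUMOylation notes above (hydrophobic residue, lysine, any residue, D or E) can be scanned for with a short sliding window. This is only a sketch: the method name is hypothetical, the set of residues counted as hydrophobic is an assumption, and it is not a replacement for a prediction tool such as SUMOplot.

```ruby
# Residues accepted at the hydrophobic (phi) position; this set is an assumption.
HYDROPHOBIC = 'AVILMFP'

# Returns one hash per phi-K-x-D/E hit, with the 1-based position of the lysine.
def sumo_motifs(protein, hydrophobic: HYDROPHOBIC)
  seq = protein.upcase
  hits = []
  (0..seq.size - 4).each do |i|
    window = seq[i, 4]
    next unless hydrophobic.include?(window[0]) &&
                window[1] == 'K' &&
                'DE'.include?(window[3])

    hits << { lysine_position: i + 2, motif: window }
  end
  hits
end

p sumo_motifs('AAIKQEAA')
```

A sliding window is used instead of `String#scan` so that overlapping motifs are all reported.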
1696
+ (226) → http://a-little-book-of-r-for-bioinformatics.readthedocs.io/en/latest/src/chapter7.html
1697
+ --------------------------------------------------------------------------------
1698
+ (227) → http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc22
1699
+ ^^^ continue here; "You can also specify the table using the
1700
+ NCBI table number which is shorter, and often included in
1701
+ the feature annotation of GenBank files:"
1702
+ ^^^ work through this and see if it is good.
1703
+ --------------------------------------------------------------------------------
1704
+ (228) Clone ALL of biophp, if it is useful.
1705
+ Then state so too, then get rid of this entry here.
1706
+ But remember, we must also be able to do so via a webinterface!
1707
+ Oligos now work. Hmm.
1708
+ http://www.biophp.org/
1709
+ Let's use a table for now to show which variants we have enabled:
1710
+ DNA to protein | Bioroebe.to_aa
1711
+ Protein to DNA | [IMPLEMENTED FULLY] → http://www.biophp.org/minitools/protein_to_dna/
1712
+ | GTK GUI bindings now exist in a simple manner.
1713
+ Restriction digest of DNA | http://www.biophp.org/minitools/restriction_digest/demo.php
1714
+ Find Palindromic Sequences | [IMPLEMENTED FULLY]
1715
+ Sequence manipulation and data |
1716
+ Melting Temperature (Tm) Calculator |
1717
+ PCR Amplification |
1718
+ Microsatellite Repeats Finder |
1719
+ Alignment of DNA/Protein sequences | http://www.biophp.org/minitools/seq_alignment/demo.php
1720
+ Microarray analysis: adaptive quantification |
1721
+ Protein sequence information | http://www.biophp.org/minitools/protein_properties/demo.php
1722
+ Reduced alphabets for proteins | http://www.biophp.org/minitools/reduce_protein_alphabet/ (started; implemented the first one for now)
1723
+ Chaos Game Representation |
1724
+ GC-, AT-, KETO- and oligo-skews generator |
1725
+ Oligonucleotide Frequency | [IMPLEMENTED PARTIALLY]
1726
+ ^^^ we need a way to obtain a hash with the frequencies;
2304
1727
  we also need to extend this to 3 or 4 etc... oligos
2305
- Oligonucleotides for distance among sequences |
2306
- Random sequences |
2307
- Useful formulas
2308
-
2309
- rf biophp
2310
-
2311
-
2312
- Palindromic sequences finder
2313
-
2314
- ^^^ enable this next.
2315
-
2316
- We should also put this part into a doc/ subsection
2317
- to keep track of what is missing and what is not.
2318
-
2319
- -------------------------------------------------------------------------------
2320
- (1) → sizeseq
2321
-
2322
- ^^^ clone this functionality and describe it in detail.
1728
+ Oligonucleotides for distance among sequences |
1729
+ Random sequences |
1730
+ Useful formulas
1731
+ rf biophp
1732
+ Palindromic sequences finder
1733
+ ^^^ enable this next.
1734
+ We should also put this part into a doc/ subsection
1735
+ to keep track of what is missing and what is not.
1736
+ --------------------------------------------------------------------------------
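For the "Oligonucleotide Frequency" row above, which asks for a hash with the frequencies and for support of 3-mers, 4-mers and so on, a sketch could look like this. The method name `oligo_frequencies` is hypothetical and not part of Bioroebe.

```ruby
# Counts every overlapping oligonucleotide of length k in the sequence
# and returns a Hash such as { "AT" => 2, "TA" => 2 }.
def oligo_frequencies(sequence, k = 2)
  counts = Hash.new(0)
  sequence.upcase.each_char.each_cons(k) { |oligo| counts[oligo.join] += 1 }
  counts
end

p oligo_frequencies('ATATA', 2)
p oligo_frequencies('ATATA', 3)
```

Because `k` is a plain argument, the same method covers dinucleotides as well as longer oligos.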
1737
+ (229) sizeseq
1738
+ ^^^ clone this functionality and describe it in detail.
2323
1739
  also for the www. Hmmm. Need to add this for the
2324
1740
  www.
2325
-
2326
1741
  http://www.bioinformatics.nl/cgi-bin/emboss/sizeseq
2327
- ^^^ hmm implemented that I think.
2328
- GTTGTTGCAAGATACAATCTGGTGTGTACTAGA
2329
- AGCTAACTCCAGACCGATACAT
2330
- CGGACTCGGCC
2331
- AATACCAGCGTAGGCTGTGAGCTCGCGGCTGACAAAC
2332
- GGAAACGTTTCCTATGTCGGGATTC
2333
-
2334
- Output file outseq
2335
-
2336
- >four
2337
- ATGC
2338
- >two
2339
- GTTGTTGCAAGATACAATCTGGTGTGTACTAGACCGATACATCGGACTCGGCCAATACCA
2340
- GCGTAGGCTGTGAGCTCGCGGCTGACAAACGGAAACGTTTCCTATGTCGGGAT
2341
- >one
2342
- GTTGTTGCAAGATACAATCTGGTGTGTACTAGAAGCTAACTCCAGACCGATACATCGGAC
2343
- TCGGCCAATACCAGCGTAGGCTGTGAGCTCGCGGCTGACAAACGGAAACGTTTCCTATGT
2344
- CGGGATTC
2345
- >three
2346
- GTTGTTGCAAGATACAATCTGGTGTGTACTAGAAGCTAACTCCAGACGTTGTTGCAAGAT
2347
- ACAATCTGGTGTGTACTAGACGATACATCGGGTTGTTGCAAGATACAATCTGGTGTGTAC
2348
- TAGAACTCGGCCAATACCAGCGTAGGCTGTGAGCTCGCGGCTGACAAACGGAAACGTTTC
2349
- CTATGTCGGGATTC
2350
-
2351
- foobar.fasta
2352
- ^^^ demonstrate via foobar.fasta
2353
-
2354
- ALSO ADD A GUI; sizeseq.rb was added in February 2021.
2355
-
2356
- -------------------------------------------------------------------------------
2357
- - In the sinatra-web-interface for Bioroebe:
2358
- continue quiz in rosalind !!!
2359
- also, at to_dna: default to RNA
2360
- And improve the general quality.
2361
- also add the ability to change the codon table
2362
- via URL through the sinatra interface
2363
- Example:
2364
- codon_table/1,2,3
2365
- view it, change it, document it
2366
- also add:
2367
- view_codon_table
2368
- ^^^ shall display it on-line
2369
- and give a formatted-view
2370
- output-view numbering
2371
- Something like:
2372
- → formatted_view
2373
- 111^^^^ in ncbi format
2374
- and document all of this.
2375
- ..........................................................................
2376
- -------------------------------------------------------------------------------
2377
- - Add ruby-GUI stuff, probably the old biology/ subsection
2378
- will be moved into the project.
2379
-
2380
- Also tell how to start or get this GUI stuff to run, then add
2381
- components that can be a part of bioroebe into it.
2382
-
2383
- ^^^ we should push this before asking for a job in the summer
1742
+ ^^^ hmm implemented that I think.
1743
+ GTTGTTGCAAGATACAATCTGGTGTGTACTAGA
1744
+ AGCTAACTCCAGACCGATACAT
1745
+ CGGACTCGGCC
1746
+ AATACCAGCGTAGGCTGTGAGCTCGCGGCTGACAAAC
1747
+ GGAAACGTTTCCTATGTCGGGATTC
1748
+ Output file outseq
1749
+ >four
1750
+ ATGC
1751
+ >two
1752
+ GTTGTTGCAAGATACAATCTGGTGTGTACTAGACCGATACATCGGACTCGGCCAATACCA
1753
+ GCGTAGGCTGTGAGCTCGCGGCTGACAAACGGAAACGTTTCCTATGTCGGGAT
1754
+ >one
1755
+ GTTGTTGCAAGATACAATCTGGTGTGTACTAGAAGCTAACTCCAGACCGATACATCGGAC
1756
+ TCGGCCAATACCAGCGTAGGCTGTGAGCTCGCGGCTGACAAACGGAAACGTTTCCTATGT
1757
+ CGGGATTC
1758
+ >three
1759
+ GTTGTTGCAAGATACAATCTGGTGTGTACTAGAAGCTAACTCCAGACGTTGTTGCAAGAT
1760
+ ACAATCTGGTGTGTACTAGACGATACATCGGGTTGTTGCAAGATACAATCTGGTGTGTAC
1761
+ TAGAACTCGGCCAATACCAGCGTAGGCTGTGAGCTCGCGGCTGACAAACGGAAACGTTTC
1762
+ CTATGTCGGGATTC
1763
+ foobar.fasta
1764
+ ^^^ demonstrate via foobar.fasta
1765
+ ALSO ADD A GUI; sizeseq.rb was added in February 2021.
1766
+ --------------------------------------------------------------------------------
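The sizeseq behaviour shown above (entries of a FASTA file written out sorted by sequence length) can be sketched in a few lines. The helper names and the `descending:` keyword are assumptions made for illustration; this is not the code of Bioroebe's `sizeseq.rb`.

```ruby
# Parses FASTA text into a Hash of { header => sequence }.
def parse_fasta(text)
  entries = {}
  current = nil
  text.each_line do |line|
    line = line.strip
    next if line.empty?

    if line.start_with?('>')
      current = line[1..-1]
      entries[current] = String.new
    elsif current
      entries[current] << line
    end
  end
  entries
end

# Returns [header, sequence] pairs ordered by sequence length.
def sizeseq(entries, descending: false)
  sorted = entries.sort_by { |_name, seq| seq.size }
  sorted.reverse! if descending
  sorted
end

fasta = ">one\nATGCATGC\n>two\nAT\nGC\n>three\nA\n"
sizeseq(parse_fasta(fasta)).each do |name, seq|
  puts ">#{name}", seq
end
```

Entries of equal length may come out in any order, since `sort_by` is not guaranteed to be stable.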
1767
+ (230) In the sinatra-web-interface for Bioroebe:
1768
+ continue quiz in rosalind !!!
1769
+ also, at to_dna: default to RNA
1770
+ And improve the general quality.
1771
+ → also add the ability to change the codon table
1772
+ via URL through the sinatra interface
1773
+ Example:
1774
+ codon_table/1,2,3
1775
+ view it, change it, document it
1776
+ also add:
1777
+ view_codon_table
1778
+ ^^^ shall display it on-line
1779
+ and give a formatted-view
1780
+ output-view numbering
1781
+ Something like:
1782
+ → formatted_view
1783
+ 111^^^^ in ncbi format
1784
+ and document all of this.
1785
+ --------------------------------------------------------------------------------
1786
+ (231)
1787
+ --------------------------------------------------------------------------------
1788
+ (232) Add ruby-GUI stuff, probably the old biology/ subsection
1789
+ will be moved into the project.
1790
+ Also tell how to start or get this GUI stuff to run, then add
1791
+ components that can be a part of bioroebe into it.
1792
+ ^^^ we should push this before asking for a job in the summer
2384
1793
  months.
2385
-
2386
-
2387
- - http://www.biophp.org/minitools/seq_alignment/demo.php
2388
-
2389
- ^^^ implement smith waterman alignment
2390
-
2391
- swalign AAGGGGAGGACGATGCGGATGTTC AGGGAGGACGATGCGG
2392
- ---
2393
- Query: cmdline (16 nt)
2394
- Ref : cmdline (24 nt)
2395
-
2396
- Query: 1 A-GGGAGGACGATGCGG 16
2397
- | |||||||||||||||
2398
- Ref : 2 AGGGGAGGACGATGCGG 18
2399
-
2400
- Score: 31
2401
- Matches: 16 (94.1%)
2402
- Mismatches: 1
2403
- CIGAR: 1M1D15M
2404
-
2405
- ---
2406
-
2407
- ^^^^ also add this commandline tool
2408
-
2409
-
2410
- bin/swalign
2411
-
2412
- should have same output
2413
- - Gene finding in genomes
2414
-
2415
- ^^^ find genes;
1794
+ - http://www.biophp.org/minitools/seq_alignment/demo.php
1795
+ ^^^ implement smith waterman alignment
1796
+ swalign AAGGGGAGGACGATGCGGATGTTC AGGGAGGACGATGCGG
1797
+ --------------------------------------------------------------------------------
1798
+ (233) Query: cmdline (16 nt)
1799
+ Ref : cmdline (24 nt)
1800
+ Query: 1 A-GGGAGGACGATGCGG 16
1801
+ | |||||||||||||||
1802
+ Ref : 2 AGGGGAGGACGATGCGG 18
1803
+ Score: 31
1804
+ Matches: 16 (94.1%)
1805
+ Mismatches: 1
1806
+ CIGAR: 1M1D15M
1807
+ --------------------------------------------------------------------------------
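A sketch of the scoring part of a Smith-Waterman local alignment, as requested in the entry above. The scores (match 2, mismatch -1, linear gap -1) are assumptions for illustration and are not claimed to be the defaults of the `swalign` tool; traceback, CIGAR output and the pretty-printed alignment are left out.

```ruby
# Best local alignment score between query and ref (Smith-Waterman),
# using a linear gap penalty and keeping only two rows of the matrix.
def smith_waterman_score(query, ref, match: 2, mismatch: -1, gap: -1)
  best = 0
  previous = Array.new(ref.size + 1, 0)
  query.each_char do |q|
    current = [0]
    ref.each_char.with_index(1) do |r, j|
      diagonal = previous[j - 1] + (q == r ? match : mismatch)
      score = [0, diagonal, previous[j] + gap, current[j - 1] + gap].max
      current << score
      best = score if score > best
    end
    previous = current
  end
  best
end

puts smith_waterman_score('AGGGAGGACGATGCGG', 'AAGGGGAGGACGATGCGGATGTTC')
```

Under these assumed scores, 16 matches and a single gap give 2 * 16 - 1 = 31 for the example pair, which agrees with the score in the output quoted above.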
1808
+ (234) → ^^^^ also add this commandline tool
1809
+ bin/swalign
1810
+ should have same output
1811
+ - Gene finding in genomes
1812
+ ^^^ find genes;
2416
1813
  and add: return all genes from the ORFs
2417
-
2418
-
2419
-
2420
- - Enable blast.
2421
-
2422
- - fasta http://www.ncbi.nlm.nih.gov/nuccore/145337669?report=fasta
2423
-
2424
- Input `fasta http://www.ncbi.nlm.nih.gov/nuccore/145337669?report=fasta` not found.
2425
-
2426
- ^^^ this should work.
2427
- fasta_header? http://www.ncbi.nlm.nih.gov/nuccore/145337669?report=fasta
2428
-
2429
- ^^^ and this hmmm.
2430
-
2431
-
2432
- - http://www.biophp.org/minitools/random_seqs/demo.php
2433
-
2434
- ^^^ clone this thingy
2435
-
2436
- composition?
2437
-
2438
- ^^^ also calculate the percentage.... hmm
2439
-
2440
- one part has been cloned finally BUT it has to be
2441
- described... and we may have to show this
2442
- in CGI too hmmmmmmm.
2443
-
2444
- the first one has been done, the second one not quite yet
2445
- and we lack documentation!
2446
-
2447
- Also add this to sinatra yay!
2448
- Well, we have cloned quite a bit so far. Need to finish
2449
- this up eventually.
2450
-
2451
-
2452
- - How do I write Sequences in Fasta format?
2453
-
2454
- FASTA format is a fairly standard bioinformatics output that is convenient and easy to read. BioRuby's Sequence class has a to_fasta method for formatting sequence in FASTA format.
2455
-
2456
- Printing any Bio::Sequence sequence object in FASTA format.
2457
-
2458
- #!/usr/bin/env ruby
2459
-
2460
- require 'bio'
2461
-
2462
- # Generates a sample 100bp sequence.
2463
- seq1 = Bio::Sequence::NA.new("aatgacccgt" * 10)
2464
-
2465
- # Naming this sequence as "testseq" and print in FASTA format
2466
- # (folded by 60 chars per line).
2467
- puts seq1.to_fasta("testseq", 60)
2468
-
2469
- ^^^^ enable this
2470
-
2471
-
2472
-
2473
- -------------------------------------------------------------------------------
2474
- - Identifying amino acid cleavage sites (Sigcleave)
2475
-
2476
- For amino acid sequences we may be interested to know whether
2477
- the amino acid sequence contains a cleavable signal sequence
2478
- for directing the transport of the protein within the cell.
2479
-
2480
- SigCleave is a program (originally part of the EGCG molecular
2481
- biology package) to predict signal sequences, and to identify
2482
- the cleavage site based on the "von Heijne" algorithm.
2483
-
2484
- The threshold setting controls the score reporting. If no
2485
- value for threshold is passed in by the user, the code
2486
- defaults to a reporting value of 3.5.
2487
-
2488
- SigCleave will only return score/position pairs which meet
2489
- the threshold limit.
2490
-
2491
- There are 2 accessor methods for this object.
2492
- signals() will return a perl hash containing the
2493
- sigcleave scores keyed by amino acid position.
2494
- pretty_print() returns a formatted string similar
2495
- to the output of the original sigcleave utility.
2496
-
2497
- The syntax for using Sigcleave is as follows:
2498
-
2499
- # create a Seq object, for example:
2500
- $seqobj = Bio::Seq->new(-seq => "AALLHHHHHHGGGGPPRTTTTTVVVVVVVVVVVVVVV");
2501
-
2502
- use Bio::Tools::Sigcleave;
2503
- $sigcleave_object = new Bio::Tools::Sigcleave
1814
+ - Enable blast.
1815
+ - fasta http://www.ncbi.nlm.nih.gov/nuccore/145337669?report=fasta
1816
+ Input `fasta http://www.ncbi.nlm.nih.gov/nuccore/145337669?report=fasta` not found.
1817
+ ^^^ this should work.
1818
+ fasta_header? http://www.ncbi.nlm.nih.gov/nuccore/145337669?report=fasta
1819
+ ^^^ and this hmmm.
1820
+ - http://www.biophp.org/minitools/random_seqs/demo.php
1821
+ ^^^ clone this thingy
1822
+ composition?
1823
+ ^^^ also calculate the percentage.... hmm
1824
+ one part has been cloned finally BUT it has to be
1825
+ described... and we may have to show this
1826
+ in CGI too hmmmmmmm.
1827
+ the first one has been done, the second one not quite yet
1828
+ and we lack documentation!
1829
+ Also add this to sinatra yay!
1830
+ Well, we have cloned quite a bit so far. Need to finish
1831
+ this up eventually.
1832
+ - How do I write Sequences in Fasta format?
1833
+ FASTA format is a fairly standard bioinformatics output that is convenient and easy to read. BioRuby's Sequence class has a to_fasta method for formatting sequence in FASTA format.
1834
+ Printing any Bio::Sequence sequence object in FASTA format.
1835
+ #!/usr/bin/env ruby
1836
+ require 'bio'
1837
+ # Generates a sample 100bp sequence.
1838
+ seq1 = Bio::Sequence::NA.new("aatgacccgt" * 10)
1839
+ # Naming this sequence as "testseq" and print in FASTA format
1840
+ # (folded by 60 chars per line).
1841
+ puts seq1.to_fasta("testseq", 60)
1842
+ ^^^^ enable this
1843
+ --------------------------------------------------------------------------------
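The FASTA output described in the BioRuby note above (header line, then the sequence folded at 60 characters) does not need the `bio` gem. A minimal plain-Ruby sketch, with a hypothetical top-level `to_fasta` method whose name and argument order are assumptions:

```ruby
# Formats a sequence as FASTA: a ">name" header followed by the
# sequence folded into lines of at most `width` characters.
def to_fasta(name, sequence, width = 60)
  lines = sequence.scan(/.{1,#{width}}/)
  ([">#{name}"] + lines).join("\n")
end

# Same sample as in the note above: a 100 bp sequence named "testseq".
puts to_fasta('testseq', 'aatgacccgt' * 10, 60)
```

The sequence is assumed to contain no newlines; input read from a file would have to be joined into one string first.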
1844
+ (235) Identifying amino acid cleavage sites (Sigcleave)
1845
+ For amino acid sequences we may be interested to know whether
1846
+ the amino acid sequence contains a cleavable signal sequence
1847
+ for directing the transport of the protein within the cell.
1848
+ SigCleave is a program (originally part of the EGCG molecular
1849
+ biology package) to predict signal sequences, and to identify
1850
+ the cleavage site based on the "von Heijne" algorithm.
1851
+ The threshold setting controls the score reporting. If no
1852
+ value for threshold is passed in by the user, the code
1853
+ defaults to a reporting value of 3.5.
1854
+ SigCleave will only return score/position pairs which meet
1855
+ the threshold limit.
1856
+ There are 2 accessor methods for this object.
1857
+ signals() will return a perl hash containing the
1858
+ sigcleave scores keyed by amino acid position.
1859
+ pretty_print() returns a formatted string similar
1860
+ to the output of the original sigcleave utility.
1861
+ The syntax for using Sigcleave is as follows:
1862
+ # create a Seq object, for example:
1863
+ $seqobj = Bio::Seq->new(-seq => "AALLHHHHHHGGGGPPRTTTTTVVVVVVVVVVVVVVV");
1864
+ use Bio::Tools::Sigcleave;
1865
+ $sigcleave_object = new Bio::Tools::Sigcleave
2504
1866
  ( -seq => $seqobj,
2505
- -threshold => 3.5,
2506
- -description => 'test sigcleave protein seq',
1867
+ -threshold => 3.5,
1868
+ -description => 'test sigcleave protein seq',
2507
1869
  );
2508
- %raw_results = $sigcleave_object->signals;
2509
- $formatted_output = $sigcleave_object->pretty_print;
2510
-
2511
- Please see Bio::Tools::Sigcleave for details.
2512
-
2513
- ^^^ add this
2514
- http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Tools/Sigcleave.html
2515
-
2516
- - enable drawing of images like the following:
2517
-
2518
- http://nar.oxfordjournals.org/content/43/D1/D227/F1.large.jpg
2519
- https://www.researchgate.net/profile/Matt_Oates/publication/268790596/figure/fig1/AS:295477619773440@1447458762108/Summary-of-all-genome-updates-and-additions-at-the-level-of-taxonomic-Class-since-the.png
2520
-
2521
-
2522
- - add reverse showorf
2523
-
2524
- like emboss
2525
- document this
2526
- then upload
2527
- also enable in bioshell
2528
- r1
2529
- r2
2530
- r3
2531
- just like f1
2532
- f2 f3
2533
- ^^^ there is a bug here. Try again later.
2534
-
2535
-
2536
- - We will read from NM_001180897.3_Saccharomyces_cerevisiae_S288c_Aga2p_AGA2.fasta
2537
-
2538
- The file NM_001180897.3_Saccharomyces_cerevisiae_S288c_Aga2p_AGA2.fasta has this FASTA header:
2539
-
2540
- >gi|398364826|ref|NM_001180897.3| Saccharomyces cerevisiae S288c Aga2p (AGA2), mRNA
2541
-
2542
- ^^^ this should also (optionally) tell us the organism, via a switch.
2543
- for this we need some way to return the taxonomic ID of an organism
2544
-
2545
- - we have to add expasy...
2546
- functionality to the cmdline too.
2547
- Which one specifically? Let's see...
2548
-
2549
- https://www.expasy.org/
2550
- -------------------------------------------------------------------------------
2551
- - https://biopython.org/wiki/Category%3ACookbook
2552
- ^^^ clone that
2553
- -------------------------------------------------------------------------------
2554
- - include covid genome, and begin to analyse it in bioroebe
2555
- "The genome of SARS-CoV-2 is twice as large as that
2556
- of influenza viruses, which is why the latter appear to mutate
2557
- four times as fast, Moshiri wrote."
2558
- -------------------------------------------------------------------------------
2559
- - Look at the GUIs that are part of the BioRoebe project.
2560
-
2561
- Polish these part, at the least one widget, then
2562
- make a screenshot, as the first one.
2563
- Then upload the image + new release and docu!
2564
- document also that more images will be added to this
2565
- in the coming weeks and months. Once done, move this to the
2566
- bottom and regularly improve on this part of the bioroebe
2567
- project.
2568
-
2569
- ^^^ also add java gui to it in the long run.
2570
-
2571
- Hmmm. And then, also consider transitioning into gtk3,
2572
- and make more screenshots.
2573
- -------------------------------------------------------------------------------
2574
-
2575
- - https://www.ebi.ac.uk/Tools/seqstats/emboss_pepstats/
2576
- http://www.ebi.ac.uk/Tools/services/web/toolresult.ebi?jobId=emboss_pepstats-I20160208-020243-0564-53154194-oy
2577
-
2578
- ^^^^ clone the pepstat functionality
2579
-
2580
- printAA RLAVQYAPLSGCHSTIREDVHNLHFCRARKE*
2581
-
2582
- - Improve on temperature content and how it is calculated
2583
-
2584
- someone googled for it in 2014 so build on it
2585
- -------------------------------------------------------------------------------
2586
- - pfasta /Depot/Temp/bioroebe/NM_000539.3_Homo_sapiens_rhodopsin_RHO.fasta
2587
-
2588
- Will read from the file `/Depot/Temp/bioroebe/NM_000539.3_Homo_sapiens_rhodopsin_RHO.fasta`.
2589
- Bioroebe::ParseFasta: This sequence is assumed to be a protein.
2590
- This sequence has 2768 aminoacids.
2591
- We have identified a total of 1 entries in this fasta dataset.
2592
- Bioroebe::BioShell: We will now assign this data to @_.
2593
- Now assigning aminoacid sequence to:
2594
- AGAGTCATCCAGCTGGAGCCCTGAGTGGCTGAGCTCAGGCCTTCGCAG
2595
- AGAGTCATCCAGCTGGAGCCCTGAGTGGCTGAGCTCAGGCCTTCGCAG
2596
- -------------------------------------------------------------------------------
2597
-
2598
-
2599
- - Formats
2600
-
2601
- BioPerl's SeqIO system understands lot of formats and can interconvert
2602
- all of them. Here is a current listing of formats, as of version 1.6.
2603
-
2604
- ^^^ must implement this too
2605
-
2606
- Name Description File extension
2607
- abi ABI tracefile ab[i1]
2608
- ace Ace database ace
2609
- agave AGAVE XML
2610
- alf ALF tracefile alf
2611
- asciitree write-only, to visualize features
2612
- bsml BSML using bsm,bsml
2613
- bsml_sax BSML, using
2614
- chadoxml CHADO sequence format
2615
- chaos CHAOS sequence format
2616
- chaosxml Chaos XML
2617
- ctf CTF tracefile ctf
2618
- embl EMBL database embl,ebl,emb,dat
2619
- entrezgene Entrez Gene ASN1
2620
- excel Excel
2621
- exp Staden EXP format exp
2622
- fasta FASTA fasta,fast,seq,fa,fsa,nt,aa
2623
- fastq quality score data in FASTA-like format fastq
2624
- flybase_chadoxml variant of Chado XML
2625
- game GAME XML
2626
- gcg GCG gcg
2627
- genbank GenBank gb
2628
- interpro InterProScan XML
2629
- kegg KEGG
2630
- largefasta Large files, fasta format
2631
- lasergene Lasergene format
2632
- locuslink LocusLink
2633
- metafasta
2634
- phd Phred phd,phred
2635
- pir PIR database pir
2636
- pln PLN tracefile pln
2637
- qual Phred
2638
- raw plain text txt
2639
- scf Standard Chromatogram Format scf
2640
- seqxml SeqXML sequence format xml
2641
- strider DNA Strider format
2642
- swiss SwissProt swiss,sp
2643
- tab tab-delimited
2644
- table Table
2645
- tigr TIGR XML
2646
- tigrxml TIGR Coordset XML
2647
- tinyseq NCBI TinySeq XML
2648
- ztr ZTR tracefile ztr
2649
-
2650
- ..........................................................................
2651
- (1) Look at f1 display:
2652
-
2653
-
2654
- 10 20 30 40 50 60 70 80 90 100
2655
- ---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
2656
- 1 ATGCAGTTACTTCGCTGTTTTTCAATATTTTCTGTTATTGCTTCAGTTTTAGCACAGGAACTGACAACTATATGCGAGCAAATCCCCTCACCAACTTTAG 100
2657
- F1 1 M Q L L R C F S I F S V I A S V L A Q E L T T I C E Q I P S P T L E 34
2658
-
2659
- ^^^ when we do f1
2660
- the aminoacid sequence position is on the next
2661
- line. this is bad.
2662
-
2663
-
2664
- ff1 "ATGCAGTTACTTCGCTGATTTTCTGTTATTGCTTTTTCAATATTTTCTGTTATTGCTTCAGTTTTAGCACAGGAACTGACAACTATATGCGAGCAAATCCCCTCACCAACTTTAG
2665
- ATGCAGTTACTTCGCTTTCTGTTATTGCTTCAGTTTTAGCACAGGAACTGACAACTATATGCGAGCAAATCCCCTCACCAACTTTAG"
2666
-
2667
- this works semi ok...
2668
- we probably have to rewrite the whole thing
2669
-
2670
- BEFORE we add ANY COLOURS.
2671
- OH WELL.
2672
-
2673
- -------------------------------------------------------------------------------
2674
- (100) → Add a primer-design widget
2675
-
2676
- The idea is to be able to manipulate forward and
2677
- reverse primer areas.
2678
-
2679
- AND research how to do this ...
2680
- We now have a ruby-gtk3 widget for this. It's not yet
2681
- perfect but it is a start.
2682
-
2683
-
2684
- https://www.bioinformatics.nl/molbi/SCLResources/sequence_notation.htm
2685
- ^^^ and check what is useful there. perhaps also add
2686
- nicer visual cues to pretty it up a bit.
2687
- -------------------------------------------------------------------------------
2688
- (1) → Compare bioroebe to:
2689
-
2690
- https://www.ncbi.nlm.nih.gov/orffinder
2691
-
2692
- whether both return the same also possibly add a web-gui
2693
- → it must also allow for different tables to be used!
2694
- check this... so that we can search in standard ORF
2695
- but also in different ORFs
2696
- and report the length, at least the start + stop of the longest ORF... so that the result also matches
2697
- ...........................................................................
2698
- test reverse complement in bioroebe
2699
- ^^^
2700
- new_WWW/
2701
- ^^^ this should eventually become the new web-related interface.
2702
- Ah well. Perhaps not ... ruby-cgi is soooooo annoying ...
2703
- ...........................................................................
2704
- (154) → the blosum-viewer should be supported in the cgi part
1870
+ %raw_results = $sigcleave_object->signals;
1871
+ $formatted_output = $sigcleave_object->pretty_print;
1872
+ Please see Bio::Tools::Sigcleave for details.
1873
+ ^^^ add this
1874
+ http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Tools/Sigcleave.html
1875
+ - enable drawing of images like the following:
1876
+ http://nar.oxfordjournals.org/content/43/D1/D227/F1.large.jpg
1877
+ https://www.researchgate.net/profile/Matt_Oates/publication/268790596/figure/fig1/AS:295477619773440@1447458762108/Summary-of-all-genome-updates-and-additions-at-the-level-of-taxonomic-Class-since-the.png
1878
+ - add reverse showorf
1879
+ like emboss
1880
+ document this
1881
+ then upload
1882
+ also enable in bioshell
1883
+ r1
1884
+ r2
1885
+ r3
1886
+ just like f1
1887
+ f2 f3
1888
+ ^^^ there is a bug here. Try again later.
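A minimal Ruby sketch of what the reverse frames r1, r2 and r3 could compute, assuming they mirror f1, f2 and f3 on the reverse complement. The names `reverse_complement`, `translate` and `reverse_frames` are hypothetical helpers for illustration, not the existing bioroebe API; only the standard genetic code is used.

```ruby
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = %w[T C A G].freeze
AMINO_ACIDS = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
CODON_TABLE = BASES.product(BASES, BASES).map(&:join).zip(AMINO_ACIDS.chars).to_h.freeze

# Reverse the strand and swap each base for its complement.
def reverse_complement(dna)
  dna.upcase.reverse.tr('ACGT', 'TGCA')
end

# Translate complete codons starting at +offset+; unknown codons become 'X'.
def translate(dna, offset = 0)
  dna.upcase[offset..-1].to_s.scan(/.{3}/).map { |codon| CODON_TABLE.fetch(codon, 'X') }.join
end

# The three reverse frames (hypothetical equivalent of r1, r2, r3).
def reverse_frames(dna)
  rc = reverse_complement(dna)
  { r1: translate(rc, 0), r2: translate(rc, 1), r3: translate(rc, 2) }
end

p reverse_frames('CTATTTCAT')
```

Keeping the forward and reverse frames on one shared `translate` helper would make it easier to compare the output against EMBOSS.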
1889
+ - we have to add expasy...
1890
+ functionality to the cmdline too.
1891
+ Which one specifically? Let's see...
1892
+ https://www.expasy.org/
1893
+ --------------------------------------------------------------------------------
1894
+ (236) → https://biopython.org/wiki/Category%3ACookbook
1895
+ ^^^ clone that
1896
+ --------------------------------------------------------------------------------
1897
+ (237) → include covid genome, and begin to analyse it in bioroebe
1898
+ "The genome of SARS-CoV-2 is said to be twice as large as that
1899
+ of influenza viruses, so the latter appear to mutate four
1900
+ times as fast, wrote Moshiri."
1901
+ --------------------------------------------------------------------------------
1902
+ (238) Look at the GUIs that are part of the BioRoebe project.
1903
+ Polish these parts, or at least one widget, then
1904
+ make a screenshot, as the first one.
1905
+ Then upload the image + new release and docu!
1906
+ document also that more images will be added to this
1907
+ in the coming weeks and months. Once done, move this to the
1908
+ bottom and regularly improve on this part of the bioroebe
1909
+ project.
1910
+ ^^^ also add java gui to it in the long run.
1911
+ Hmmm. And then, also consider transitioning into gtk3,
1912
+ and make more screenshots.
1913
+ --------------------------------------------------------------------------------
1914
+ (239) https://www.ebi.ac.uk/Tools/seqstats/emboss_pepstats/
1915
+ http://www.ebi.ac.uk/Tools/services/web/toolresult.ebi?jobId=emboss_pepstats-I20160208-020243-0564-53154194-oy
1916
+ ^^^^ clone the pepstat functionality
1917
+ printAA RLAVQYAPLSGCHSTIREDVHNLHFCRARKE*
1918
+ - Improve on temperature content and how it is calculated
1919
+ someone googled for it in 2014 so build on it
1920
+ --------------------------------------------------------------------------------
1921
+ (240) pfasta /Depot/Temp/bioroebe/NM_000539.3_Homo_sapiens_rhodopsin_RHO.fasta
1922
+ Will read from the file `/Depot/Temp/bioroebe/NM_000539.3_Homo_sapiens_rhodopsin_RHO.fasta`.
1923
+ Bioroebe::ParseFasta: This sequence is assumed to be a protein.
1924
+ This sequence has 2768 aminoacids.
1925
+ We have identified a total of 1 entries in this fasta dataset.
1926
+ Bioroebe::BioShell: We will now assign this data to @_.
1927
+ Now assigning aminoacid sequence to:
1928
+ AGAGTCATCCAGCTGGAGCCCTGAGTGGCTGAGCTCAGGCCTTCGCAG
1929
+ AGAGTCATCCAGCTGGAGCCCTGAGTGGCTGAGCTCAGGCCTTCGCAG
1930
+ --------------------------------------------------------------------------------
1931
+ (241) Formats
1932
+ BioPerl's SeqIO system understands a lot of formats and can interconvert
1933
+ all of them. Here is a current listing of formats, as of version 1.6.
1934
+ ^^^ must implement this too
1935
+ Name Description File extension
1936
+ abi ABI tracefile ab[i1]
1937
+ ace Ace database ace
1938
+ agave AGAVE XML
1939
+ alf ALF tracefile alf
1940
+ asciitree write-only, to visualize features
1941
+ bsml BSML using bsm,bsml
1942
+ bsml_sax BSML, using
1943
+ chadoxml CHADO sequence format
1944
+ chaos CHAOS sequence format
1945
+ chaosxml Chaos XML
1946
+ ctf CTF tracefile ctf
1947
+ embl EMBL database embl,ebl,emb,dat
1948
+ entrezgene Entrez Gene ASN1
1949
+ excel Excel
1950
+ exp Staden EXP format exp
1951
+ fasta FASTA fasta,fast,seq,fa,fsa,nt,aa
1952
+ fastq quality score data in FASTA-like format fastq
1953
+ flybase_chadoxml variant of Chado XML
1954
+ game GAME XML
1955
+ gcg GCG gcg
1956
+ genbank GenBank gb
1957
+ interpro InterProScan XML
1958
+ kegg KEGG
1959
+ largefasta Large files, fasta format
1960
+ lasergene Lasergene format
1961
+ locuslink LocusLink
1962
+ metafasta
1963
+ phd Phred phd,phred
1964
+ pir PIR database pir
1965
+ pln PLN tracefile pln
1966
+ qual Phred
1967
+ raw plain text txt
1968
+ scf Standard Chromatogram Format scf
1969
+ seqxml SeqXML sequence format xml
1970
+ strider DNA Strider format
1971
+ swiss SwissProt swiss,sp
1972
+ tab tab-delimited
1973
+ table Table
1974
+ tigr TIGR XML
1975
+ tigrxml TIGR Coordset XML
1976
+ tinyseq NCBI TinySeq XML
1977
+ ztr ZTR tracefile ztr
1978
+ --------------------------------------------------------------------------------
1979
+ (242) → Look at f1 display:
1980
+ 10 20 30 40 50 60 70 80 90 100
1981
+ --------------------------------------------------------------------------------
1982
+ (243) → 1 ATGCAGTTACTTCGCTGTTTTTCAATATTTTCTGTTATTGCTTCAGTTTTAGCACAGGAACTGACAACTATATGCGAGCAAATCCCCTCACCAACTTTAG 100
1983
+ F1 1 M Q L L R C F S I F S V I A S V L A Q E L T T I C E Q I P S P T L E 34
1984
+ ^^^ when we do f1
1985
+ the aminoacid sequence position is on the next
1986
+ line. this is bad.
1987
+ ff1 "ATGCAGTTACTTCGCTGATTTTCTGTTATTGCTTTTTCAATATTTTCTGTTATTGCTTCAGTTTTAGCACAGGAACTGACAACTATATGCGAGCAAATCCCCTCACCAACTTTAG
1988
+ ATGCAGTTACTTCGCTTTCTGTTATTGCTTCAGTTTTAGCACAGGAACTGACAACTATATGCGAGCAAATCCCCTCACCAACTTTAG"
1989
+ this works semi ok...
1990
+ we probably have to rewrite the whole thing
1991
+ BEFORE we add ANY COLOURS.
1992
+ OH WELL.
1993
+ --------------------------------------------------------------------------------
1994
+ (244) → Add a primer-design widget
1995
+ The idea is to be able to manipulate forward and
1996
+ reverse primer areas.
1997
+ AND research how to do this ...
1998
+ We now have a ruby-gtk3 widget for this. It's not yet
1999
+ perfect but it is a start.
2000
+ https://www.bioinformatics.nl/molbi/SCLResources/sequence_notation.htm
2001
+ ^^^ and check what is useful there. perhaps also add
2002
+ nicer visual cues to pretty it up a bit.
2003
+ --------------------------------------------------------------------------------
2004
+ (245) → Compare bioroebe to:
2005
+ https://www.ncbi.nlm.nih.gov/orffinder
2006
+ and check whether both return the same result; also possibly add a web-gui
2007
+ it must also allow for different tables to be used!
2008
+ check this... so that we can search in standard ORF
2009
+ but also in different ORFs
2010
+ and report the length, at least the start + stop of the longest ORF... so that the result also matches
2011
+ --------------------------------------------------------------------------------
2012
+ (246) → test reverse complement in bioroebe
2013
+ ^^^
2014
+ new_WWW/
2015
+ ^^^ this should eventually become the new web-related interface.
2016
+ Ah well. Perhaps not ... ruby-cgi is soooooo annoying ...
2017
+ --------------------------------------------------------------------------------
2018
+ (247) → the blosum-viewer should be supported in the cgi part
2705
2019
  and sinatra part as well.
2706
2020
  This now works for sinatra. Need to enable this for
2707
2021
  the cgi-part too eventually.
2708
- -------------------------------------------------------------------------------
2709
- (155) → port the sinatra stuff together in bioroebe
2710
- create a dir: web_api
2711
- ^^^ also make params? usable in both sinatra and cgi page
2712
- well ...............
2713
- this is quite tough.
2714
- Hmmmmmm...
2715
- perhaps as a middle-step,
2716
- add tons of HtmlTemplate[]
2717
- and replace the ad-hoc code otherwise...
2718
- ^^^ yeah, finish the HtmlTemplate stuff.
2719
- -------------------------------------------------------------------------------
2720
- (1) https://i.imgur.com/ptcSn12.png
2721
- ^^^ enable such an overview; this shows mass computation, e.g.
2722
- peptide mass and such
2723
- -------------------------------------------------------------------------------
2724
- (80) Bioroebe.sanitize_nucleotide_sequence
2725
- ^^^ port this into java. The code has been written for this already,
2726
- but we currently fail to link it.
2727
- -------------------------------------------------------------------------------
2728
- (81) Bioroebe.base_composition
2729
- ^^^^^^^^^ port this into java
2730
- -------------------------------------------------------------------------------
2731
- (82) - work a bit more on tk!!!
2732
- in particular to start it from the bioshell as-is.
2733
- ^^^ this is mostly done for quick
2734
- demonstration purposes
2735
-
2736
- - also add another ruby-tk widget hmm
2737
- a new one...
2738
- ^^^^^ and fix the remaining ones
2739
-
2740
- hamming_distance [PARTIALLY IMPLEMENTED; ~80%]
2741
- protein_to_DNA
2742
- ^^^^ improve both while improving tk_paradise docu as well.
2743
- -------------------------------------------------------------------------------
2744
- (83) Batch-create the .exe files on windows for libui, once
2745
- the first has been added. And then test it too
2746
- AND document it. This should be done with the controller
2747
- eventually. Once this works, we can remove this entry
2748
- here.
2749
- -------------------------------------------------------------------------------
2750
- (84) port more libui stuff in bioroebe. We have two widgets ported so far;
2751
- add more such entries.
2752
- -------------------------------------------------------------------------------
2753
- (85) after libui has been ported, explore how gosu works on windows.
2754
- if possible add things to a gosu-specific UI as well, but
2755
- we may need a common, unified GUI base for that.
2756
- -------------------------------------------------------------------------------
2757
- (86)
2758
-
2759
- add libui bindings AND once done make sure the controller works in
2760
- libui as well. Embed the various things into it.
2761
-
2762
- Tab A set named tabs for placing items in
2763
- ^^^ use this perhaps also in bioroebe hmmm
2764
- yeah.
2765
- -------------------------------------------------------------------------------
2766
- (87) https://github.com/cnjinhao/nana/wiki/User-Works-using-Nana
2767
-
2768
- ^^^ port the "DNA hybrid"
2769
- https://camo.githubusercontent.com/4c27d554ca4d698d288628f21255f917c2c577e35d7e11dd67e21880d56b6b0a/687474703a2f2f6e616e6170726f2e6f72672f696d616765732f73637265656e73686f74732f746864795f7365715f6578706c2e706e67
2770
-
2771
- -------------------------------------------------------------------------------
2772
- (88) Bioroebe::Cell
2773
- ^^^ think about what to do with it. If we don't need it then perhaps
2774
- we should just remove it. Think about this more at 2022, before
2775
- deciding what to do.
2776
- -------------------------------------------------------------------------------
2777
- (89) - Add emboss cpgplot functionality.
2778
-
2779
- https://www.bioinformatics.nl/cgi-bin/emboss/cpgplot
2780
- -------------------------------------------------------------------------------
2781
- (90) - integrate calculation of the Instability index (II)
2782
-
2783
- The instability index provides an estimate of the
2784
- stability of your protein in a test tube. Statistical
2785
- analysis of 12 unstable and 32 stable proteins has
2786
- revealed [7] that there are certain dipeptides, the
2787
- occurrence of which is significantly different in the
2788
- unstable proteins compared with those in the stable
2789
- ones. The authors of this method have assigned a
2790
- weight value of instability to each of the 400
2791
- different dipeptides (DIWV). Using these weight
2792
- values it is possible to compute an instability
2793
- index (II) which is defined as:
2794
-
2795
-
2796
- i=L-1
2797
-
2798
- II = (10/L) * Sum DIWV(x[i]x[i+1])
2799
-
2800
- i=1
2801
-
2802
-
2803
- where: L is the length of sequence
2804
-
2805
- DIWV(x[i]x[i+1]) is the instability weight value for the dipeptide starting in position i.
2806
-
2807
- A protein whose instability index is smaller than 40
2808
- is predicted as stable, a value above 40 predicts
2809
- that the protein may be unstable.
2810
-
2022
+ --------------------------------------------------------------------------------
2023
+ (248) → port the sinatra stuff together in bioroebe
2024
+ create a dir: web_api
2025
+ ^^^ also make params? usable in both sinatra and cgi page
2026
+ well ...............
2027
+ this is quite tough.
2028
+ Hmmmmmm...
2029
+ perhaps as a middle-step,
2030
+ add tons of HtmlTemplate[]
2031
+ and replace the ad-hoc code otherwise...
2032
+ ^^^ yeah, finish the HtmlTemplate stuff.
2033
+ --------------------------------------------------------------------------------
2034
+ (249) https://i.imgur.com/ptcSn12.png
2035
+ ^^^ enable such an overview; this shows mass computation, e.g.
2036
+ peptide mass and such
2037
+ --------------------------------------------------------------------------------
2038
+ (250) Bioroebe.sanitize_nucleotide_sequence
2039
+ ^^^ port this into java. The code has been written for this already,
2040
+ but we currently fail to link it.
2041
+ --------------------------------------------------------------------------------
2042
+ (251) → Batch-create the .exe files on windows for libui, once
2043
+ the first has been added. And then test it too
2044
+ AND document it. This should be done with the controller
2045
+ eventually. Once this works, we can remove this entry
2046
+ here.
2047
+ --------------------------------------------------------------------------------
2048
+ (252) → port more libui stuff in bioroebe. We have two widgets ported so far;
2049
+ add more such entries.
2050
+ --------------------------------------------------------------------------------
2051
+ (253) after libui has been ported, explore how gosu works on windows.
2052
+ if possible add things to a gosu-specific UI as well, but
2053
+ we may need a common, unified GUI base for that.
2054
+ --------------------------------------------------------------------------------
2055
+ (254) → (86)
2056
+ add libui bindings AND once done make sure the controller works in
2057
+ libui as well. Embed the various things into it.
2058
+ Tab: a set of named tabs for placing items in
2059
+ ^^^ use this perhaps also in bioroebe hmmm
2060
+ yeah.
2061
+ --------------------------------------------------------------------------------
2062
+ (255) → https://github.com/cnjinhao/nana/wiki/User-Works-using-Nana
2063
+ ^^^ port the "DNA hybrid"
2064
+ https://camo.githubusercontent.com/4c27d554ca4d698d288628f21255f917c2c577e35d7e11dd67e21880d56b6b0a/687474703a2f2f6e616e6170726f2e6f72672f696d616765732f73637265656e73686f74732f746864795f7365715f6578706c2e706e67
2065
+ --------------------------------------------------------------------------------
2066
+ (256) → Bioroebe::Cell
2067
+ ^^^ think about what to do with it. If we don't need it then perhaps
2068
+ we should just remove it. Think about this more in 2022, before
2069
+ deciding what to do.
2070
+ --------------------------------------------------------------------------------
2071
+ (257) → Add emboss cpgplot functionality.
2072
+ https://www.bioinformatics.nl/cgi-bin/emboss/cpgplot
2073
+ --------------------------------------------------------------------------------
2074
+ (258) integrate calculation of the Instability index (II)
2075
+ The instability index provides an estimate of the
2076
+ stability of your protein in a test tube. Statistical
2077
+ analysis of 12 unstable and 32 stable proteins has
2078
+ revealed [7] that there are certain dipeptides, the
2079
+ occurrence of which is significantly different in the
2080
+ unstable proteins compared with those in the stable
2081
+ ones. The authors of this method have assigned a
2082
+ weight value of instability to each of the 400
2083
+ different dipeptides (DIWV). Using these weight
2084
+ values it is possible to compute an instability
2085
+ index (II) which is defined as:
2086
+ II = (10/L) * Sum[i=1..L-1] DIWV(x[i]x[i+1])
2089
+ where: L is the length of sequence
2090
+ DIWV(x[i]x[i+1]) is the instability weight value for the dipeptide starting in position i.
2091
+ A protein whose instability index is smaller than 40
2092
+ is predicted as stable, a value above 40 predicts
2093
+ that the protein may be unstable.
2811
2094
  # MEKVQYLTRSAIRRASTIEMPQQARQKLQNLFINFCLILICLLLICIIVMLL
2812
-
2813
- 52
2814
-
2815
- The instability index (II) is computed to be 65.43
2816
- This classifies the protein as unstable.
2817
-
2818
- -------------------------------------------------------------------------------
2819
- (1) → We have now added a method to show all hydrophobic amino acids, via the
2095
+ 52
2096
+ The instability index (II) is computed to be 65.43
2097
+ This classifies the protein as unstable.
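A hedged Ruby sketch of the instability index formula quoted above. The method names `instability_index` and `predicted_stable?` are hypothetical, and `TOY_DIWV` holds made-up placeholder weights for four dipeptides only; a real run needs the published table of 400 DIWV values, which is not reproduced here.

```ruby
# Placeholder weights for illustration only, NOT the published DIWV values.
TOY_DIWV = { 'AA' => 1.0, 'AK' => 2.0, 'KA' => 3.0, 'KK' => 4.0 }.freeze

# II = (10 / L) * sum of DIWV over all adjacent residue pairs.
# +diwv+ maps a two-letter dipeptide string to its instability weight.
def instability_index(sequence, diwv)
  seq = sequence.upcase
  raise ArgumentError, 'need at least two residues' if seq.length < 2

  sum = seq.chars.each_cons(2).sum { |a, b| diwv.fetch(a + b) }
  (10.0 / seq.length) * sum
end

# Per the description above: below 40 is predicted as stable.
def predicted_stable?(index)
  index < 40
end

ii = instability_index('AAKK', TOY_DIWV)
puts "II = #{ii}, predicted stable: #{predicted_stable?(ii)}"
```

Passing the weight table in as an argument keeps the formula testable on its own, independent of where the real DIWV data ends up being stored.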
2098
+ --------------------------------------------------------------------------------
2099
+ (259) We have now added a method to show all hydrophobic amino acids, via the
2820
2100
  method .hydrophobic_amino_acids?. This works and has been documented
2821
2101
  in May 2022. However, we also still need a way to PREDICT
2822
2102
  hydrophobic segments in a polypeptide sequence.
2823
- -------------------------------------------------------------------------------
2103
+ --------------------------------------------------------------------------------
2104
+ (260) → <img src="https://i.imgur.com/tkB8MTJ.png" style="margin: 1em">
2105
+ --------------------------------------------------------------------------------
2106
+ (261) → https://www.studocu.com/en-us/document/queens-college-cuny/biochemistry-laboratory/bioinformatics-exercise/13329106
2107
+ ^^^ enable this via a method
2108
+ and add a screenshot
2109
+ we want to colourize an existing string
2110
+ Also use javascript for this on the www, otherwise use ruby
2111
+ hmmm. so first the ruby variant + sinatra demo app
2112
+ ^^^ this works now, see the
2113
+ image at:
2114
+ https://i.imgur.com/tkB8MTJ.png
2115
+ However, we should add a sinatra demo app too,
2116
+ and demonstrate this too and then document it as-is
2117
+ --------------------------------------------------------------------------------
2118
+ (262) →
2119
+ make sure we have a good fasta-showing widget
2120
+ show how many nucleotides are
2121
+ AND add support to modify this as-is
2122
+ ^^^^
2123
+ The fasta showing widget in ruby-gtk3 is quite nice, but we
2124
+ need to make it more convenient to work with.
2125
+ Perhaps add a context menu that can be customized?
2126
+ hmm. and perhaps open the sequence in the editor
2127
+ or something ... perhaps also keybindings by default
2128
+ and a help option somewhere to explain all of this.
2129
+ --------------------------------------------------------------------------------
2130
+ (263) → Add a way in bioroebe to store a gene into a yaml file
2131
+ or so, and to also load it up again. Perhaps simplify
2132
+ this automatically. Need some ways to describe that.
2133
+ FastaToYaml
2134
+ ^^^ perhaps?
2135
+ ^^^ describe the why too
2136
+ This class now exists. We have to add more features to it
2137
+ eventually, though.
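One possible shape for the store-and-reload round trip, sketched with Ruby's standard `yaml` library. The helpers `parse_fasta_entry`, `fasta_to_yaml` and `yaml_to_record` are hypothetical names for illustration and say nothing about how the existing FastaToYaml class works; the sketch also assumes a single-entry FASTA text.

```ruby
require 'yaml'

# Split a single-entry FASTA text into its header and joined sequence.
def parse_fasta_entry(text)
  lines = text.lines.map(&:strip).reject(&:empty?)
  header = lines.shift.to_s.sub(/\A>/, '')
  { 'header' => header, 'sequence' => lines.join.upcase }
end

# Serialize the entry as a YAML document (a String).
def fasta_to_yaml(text)
  YAML.dump(parse_fasta_entry(text))
end

# Load the entry back into a Hash with 'header' and 'sequence' keys.
def yaml_to_record(yaml_text)
  YAML.safe_load(yaml_text)
end

yaml_text = fasta_to_yaml(">example_gene demo\nATGCAG\nTTACTT\n")
puts yaml_text
p yaml_to_record(yaml_text)
```

Plain string keys keep the file loadable with `YAML.safe_load`, which refuses arbitrary Ruby objects.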
2138
+ --------------------------------------------------------------------------------
2139
+ (264) → https://pubchem.ncbi.nlm.nih.gov/compound/16131099#section=Top
2140
+ ^^^ this website is quite interesting; try to use components
2141
+ from it.
2142
+ --------------------------------------------------------------------------------
2143
+ (265) → Add some option to show the aminoacid sequence, at the least
2144
+ store it; and optionally show it.
2145
+ possibly always report how many aminoacids are
2146
+ part of that file; and optionally also show
2147
+ the whole sequence.
2148
+ --------------------------------------------------------------------------------
2149
+ (266) → http://insilico.ehu.es/
2150
+ ^^^ check if we have all of this incorporated
2151
+ --------------------------------------------------------------------------------
2152
+ (267) → Integrate these nice GUI parts:
2153
+ https://dev.to/kojix2/introduction-to-gr-rb-data-visualization-with-ruby-2c39
2154
+ --------------------------------------------------------------------------------
2155
+ (268) → AND THEN test on windows as well.
2156
+ ^^^^^^^^^^^^^^
2157
+ --------------------------------------------------------------------------------
2158
+ (269) → add mouse chromosome URL, also in the bioshell
2159
+ and the main README, to be of help for the
2160
+ user. add a mouse subsection.
2161
+ --------------------------------------------------------------------------------
2162
+ (270) → fix the taxonomy stuff...
2163
+ --------------------------------------------------------------------------------
2164
+ (271) → set_dna_sequence alu
2165
+ ^^^ fetch random alu
2166
+ ^^^ alu sequence
2167
+ Ok we started this now adding more details, but we
2168
+ need to become better at searching for this
2169
+ sequence.
2170
+ --------------------------------------------------------------------------------
2171
+ (272) → draw things based on GR
2172
+ --------------------------------------------------------------------------------
2173
+ (273) → https://mycocosm.jgi.doe.gov/help/screenshots/browser_viewer.png
2174
+ ^^^ offer the same functionality
2175
+ --------------------------------------------------------------------------------
2176
+ (274) → https://genome.cshlp.org/content/12/10/1611/F3.expansion.html
2177
+ ^^^ enable this, we must obtain a sequence then store into genbank format
2178
+ so, first fetch; then store as-is.
2179
+ --------------------------------------------------------------------------------
2180
+ (275) → be able to generate nice graphics
2181
+ https://genome.cshlp.org/content/12/10/1611/F1.large.jpg
2182
+ --------------------------------------------------------------------------------
2183
+ (276) → add an rmagick wrapper, perhaps via imageparadise or something
2184
+ the idea is that we can make fancy drawings and generate
2185
+ an image for the end user to see
2186
+ --------------------------------------------------------------------------------
2187
+ (277) → https://bioperl.org/howtos/Beginners_HOWTO.html#item13
2188
+ extend the sequence object and document it
2189
+ also add:
2190
+ class Genome
2191
+ and:
2192
+ def is_circular?
2193
+ @internal_hash[:is_circular]
2194
+ end; alias circular? is_circular? # === circular?
2195
+ def species?
2196
+ @internal_hash[:species] # return the species here
2197
+ end
2198
+ --------------------------------------------------------------------------------
2199
+ (278) → http://lib.ysu.am/open_books/312400.pdf
2200
+ clone:
2201
+ Primer.pl
2202
+ This program was written to support the required informatics for a sequencing
2203
+ lab. The desire was to quickly generate primer pair candidates for use in STS
2204
+ mapping. We use Bioperl modules to fetch the sequences from GenBank.
2205
+ #! /usr/bin/perl
2206
+ #
2207
+ # primers.pl
2208
+ #
2209
+ # Reads a list of
2210
+ % primers.pl AC013798
2211
+ AC013798
2212
+ Left Right Length Penalty
2213
+ CCTCCTGGACAACCTGTGTT TGAAGTCAGGGGACATAGGG 280 0.0823
2214
+ CCTCCTGGACAACCTGTGTT AGGCCAGTAGACTGGGTGTG 298 0.1758
2215
+ CCTCCTGGACAACCTGTGTT GGTGTGAAGTCAGGGGACAT 284 0.1852
2216
+ TTCCCGCATCTCTTAGCAGT AGGCCAGTAGACTGGGTGTG 209 0.1962
2217
+ CTTCCCGCATCTCTTAGCAG GACACTAGTGGCAAGGAGGC 226 0.2362
2218
+ Most of the primers.pl program is extremely simple. The real guts and power
2219
+ of the program lie in the classes and the methods we call. The next section
2220
+ examines the Primer3 module, which is similar to many Bioperl modules
2221
+ --------------------------------------------------------------------------------
2222
+ (279) → Clone all of Emboss. :)
2223
+ → Clone and document the getorf functionality properly.
2224
+ See: http://emboss.sourceforge.net/apps/cvs/emboss/apps/getorf.html
2225
+ http://emboss.sourceforge.net
2226
+ http://emboss.sourceforge.net/apps/cvs/emboss/apps/getorf.html
2227
+ --------------------------------------------------------------------------------
2228
+ (280) → Add useful formulas for bioshell.
2229
+ --------------------------------------------------------------------------------
2230
+ (281) → Polish the GUI sets:
2231
+ https://i.imgur.com/djElIMh.png
2232
+ --------------------------------------------------------------------------------
2233
+ (282) → The taxonomy part should be fully integrated, without it
2234
+ being a standalone part anymore.
2235
+ continue on the taxonomy stuff.
2236
+ One day this will work again *shake fist*
2237
+ --------------------------------------------------------------------------------
2238
+ (283) → Show the frequency of codons in different tables
2239
+ This works quite ok, but right now the approach is to store
2240
+ this in a .yml file which is not ideal.
2241
+ Thus, we have to add two things:
2242
+ - The ability to store this into a SQL database
2243
+ - The ability to batch-download all of these codons,
2244
+ which first requires that we have a way to obtain all
2245
+ taxonomic ids.
2246
+ Add where this can be found.
2247
+ IMPROVE THIS ALL!!!!!!!
2248
+ --------------------------------------------------------------------------------
2249
+ (284) → improve docu + tests for melting temperature analysis again
2250
+ + usage example + GUI + web-use
2251
+ --------------------------------------------------------------------------------
2252
+ (285) → https://biopython.org/DIST/docs/tutorial/Tutorial.html#sec15
2253
+ ^^^ work through the above, also integrate it + write docs
2254
+ https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orchid.fasta
2255
+ --------------------------------------------------------------------------------
2256
+ (286) → work a bit more on tk!!!
2257
+ in particular to start it from the bioshell as-is.
2258
+ ^^^ this is mostly done for quick
2259
+ demonstration purposes
2260
+ - also add another ruby-tk widget hmm
2261
+ a new one...
2262
+ ^^^^^ and fix the remaining ones
2263
+ hamming_distance [PARTIALLY IMPLEMENTED; ~80%]
2264
+ protein_to_DNA
2265
+ ^^^^ improve both while improving tk_paradise docu as well.
2266
+ --------------------------------------------------------------------------------
2267
+ (287) → add 2nd_orf
2268
+ → this shall scan for the 2nd orf
2269
+ → and third ORF as well, then, and document it.
2270
+ --------------------------------------------------------------------------------
2271
+ (288) → https://github.com/pjotrp/bigbio
2272
+ ^^^^ include use cases from that readme
2273
+ --------------------------------------------------------------------------------
2274
+ (289) → bioinformatics bioroebe:
2275
+ cut_via(:trypsin)
2276
+ ^^^^ show the digest as array
2277
+ then upload after documenting this
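A small Ruby sketch of a tryptic digest returned as an array. It assumes the common rule that trypsin cleaves after K or R unless the next residue is P; `trypsin_digest` is a hypothetical helper, not the signature of `cut_via`.

```ruby
# Cut after every K or R that is not followed by P and return the fragments.
def trypsin_digest(sequence)
  fragments = []
  current = +''
  residues = sequence.upcase.chars
  residues.each_with_index do |aa, i|
    current << aa
    next unless %w[K R].include?(aa) && residues[i + 1] != 'P'

    fragments << current
    current = +''
  end
  fragments << current unless current.empty?
  fragments
end

p trypsin_digest('MKAARPGKT')
```

A method such as `cut_via(:trypsin)` could dispatch to helpers like this one, with one helper per protease.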
2278
+ ------------------------------------------------------------------------