bioroebe 0.12.24 → 0.13.31
- checksums.yaml +4 -4
- data/LICENSE.md +7 -8
- data/README.md +566 -354
- data/bin/all_positions_of_this_nucleotide +1 -1
- data/bin/aminoacid_frequencies +1 -1
- data/bin/automatically_rename_this_fasta_file +1 -1
- data/bin/base_composition +1 -1
- data/bin/batch_create_windows_executables +1 -1
- data/bin/bioroebe +12 -1
- data/bin/bioroebe_cat +7 -0
- data/bin/calculate_exponential_growth +7 -0
- data/bin/calculate_n50_value +1 -1
- data/bin/calculate_the_frequencies_of_this_species +7 -0
- data/bin/chunked_display +1 -1
- data/bin/codon_frequency +1 -1
- data/bin/codon_to_aminoacid +1 -1
- data/bin/colourize_this_fasta_sequence +1 -1
- data/bin/complementary_dna_strand +1 -1
- data/bin/complementary_rna_strand +1 -1
- data/bin/consensus_sequence +1 -1
- data/bin/dna_to_rna +1 -1
- data/bin/downcase_chunked_display +1 -1
- data/bin/download_this_pdb +1 -1
- data/bin/fasta_index +1 -1
- data/bin/fetch_data_from_uniprot +1 -1
- data/bin/filter_away_invalid_nucleotides +1 -1
- data/bin/find_substring +1 -1
- data/bin/input_as_dna +1 -1
- data/bin/is_palindrome +1 -1
- data/bin/leading_five_prime +1 -1
- data/bin/longest_ORF +1 -1
- data/bin/longest_substring +1 -1
- data/bin/open_reading_frames +1 -1
- data/bin/partner_nucleotide +1 -1
- data/bin/plain_palindrome +1 -1
- data/bin/random_dna_sequence +1 -1
- data/bin/random_sequence +1 -1
- data/bin/raw_hamming_distance +1 -1
- data/bin/return_longest_substring_via_LCS_algorithm +1 -1
- data/bin/reverse_sequence +1 -1
- data/bin/short_aminoacid_letter_from_long_aminoacid_name +1 -1
- data/bin/show_atomic_composition +1 -1
- data/bin/show_fasta_header +1 -1
- data/bin/show_nucleotide_sequence +1 -1
- data/bin/show_this_dna_sequence +1 -1
- data/bin/show_time_now +7 -0
- data/bin/sort_aminoacid_based_on_its_hydrophobicity +1 -1
- data/bin/strict_filter_away_invalid_aminoacids +1 -1
- data/{lib/bioroebe/base/reset.rb → bin/three_delimiter} +9 -6
- data/bin/three_to_one +1 -1
- data/bin/to_rna +1 -1
- data/bin/trailing_three_prime +1 -1
- data/bin/upcase_this_aminoacid_sequence_and_remove_numbers +1 -1
- data/bioroebe.gemspec +6 -7
- data/doc/README.gen +534 -322
- data/doc/blosum/blosum.md +4 -0
- data/doc/compatibility/BIO_PHP.md +20 -18
- data/doc/compatibility/README.md +2 -3
- data/doc/compatibility/emboss.md +5 -3
- data/doc/{extensive_usage_example.md → extensive_usage_example/extensive_usage_example.md} +4 -2
- data/doc/{instructions_for_the_taxonomy_subproject.md → instructions_for_the_taxonomy_subproject/instructions_for_the_taxonomy_subproject.md} +36 -33
- data/doc/{legacy_paths.md → legacy_paths/legacy_paths.md} +3 -3
- data/doc/statistics/statistics.md +12 -10
- data/doc/todo/bioroebe_GUI_todo.md +6 -1
- data/doc/todo/bioroebe_java_todo.md +3 -2
- data/doc/todo/bioroebe_todo.md +328 -310
- data/doc/{using_biomart.md → using_biomart/using_biomart.md} +7 -3
- data/lib/bioroebe/abstract/features.rb +0 -0
- data/lib/bioroebe/aminoacids/aminoacid_substitution.rb +1 -1
- data/lib/bioroebe/aminoacids/aminoacids_mass_table.rb +3 -1
- data/lib/bioroebe/aminoacids/codon_percentage.rb +18 -10
- data/lib/bioroebe/aminoacids/create_random_aminoacids.rb +5 -2
- data/lib/bioroebe/aminoacids/deduce_aminoacid_sequence.rb +90 -64
- data/lib/bioroebe/aminoacids/display_aminoacid_table.rb +1 -3
- data/lib/bioroebe/aminoacids/show_hydrophobicity.rb +2 -2
- data/lib/bioroebe/annotations/create_annotation_format.rb +2 -2
- data/lib/bioroebe/base/base.rb +101 -6
- data/lib/bioroebe/base/base_module/base_module.rb +9 -1
- data/lib/bioroebe/base/colours.rb +3 -0
- data/lib/bioroebe/base/colours_for_base/colours_for_base.rb +80 -44
- data/lib/bioroebe/base/commandline_application/README.md +1 -1
- data/lib/bioroebe/base/commandline_application/commandline_application.rb +661 -22
- data/lib/bioroebe/base/commandline_application/commandline_arguments.rb +2 -1
- data/lib/bioroebe/base/infer_the_namespace_module/infer_the_namespace_module.rb +37 -0
- data/lib/bioroebe/base/internal_hash_module/internal_hash_module.rb +1 -6
- data/lib/bioroebe/base/prototype/prototype.rb +155 -14
- data/lib/bioroebe/biomart/attribute.rb +1 -1
- data/lib/bioroebe/biomart/biomart.rb +8 -9
- data/lib/bioroebe/biomart/server.rb +1 -1
- data/lib/bioroebe/blosum/blosum.rb +2 -2
- data/lib/bioroebe/calculate/calculate_blosum_score.rb +5 -3
- data/lib/bioroebe/calculate/calculate_gc_content.rb +1 -1
- data/lib/bioroebe/calculate/calculate_levensthein_distance.rb +5 -3
- data/lib/bioroebe/calculate/calculate_melting_temperature.rb +2 -10
- data/lib/bioroebe/calculate/calculate_melting_temperature_for_more_than_thirteen_nucleotides.rb +6 -15
- data/lib/bioroebe/calculate/calculate_the_position_specific_scoring_matrix.rb +4 -2
- data/lib/bioroebe/cell/cell.rb +3 -2
- data/lib/bioroebe/cell/specialized_cells/B_cell.rb +60 -0
- data/lib/bioroebe/cell/specialized_cells/Macrophage.rb +60 -0
- data/lib/bioroebe/cell/specialized_cells/README.md +5 -0
- data/lib/bioroebe/cell/specialized_cells/T_cell.rb +60 -0
- data/lib/bioroebe/cleave_and_digest/cleave.rb +3 -1
- data/lib/bioroebe/cleave_and_digest/digestion.rb +1 -1
- data/lib/bioroebe/codon_tables/frequencies/10090_Mus_musculus.yml +93 -0
- data/lib/bioroebe/codon_tables/frequencies/107243_Thlaspi_caerulescens.yml +72 -0
- data/lib/bioroebe/codon_tables/frequencies/parse_frequency_table.rb +2 -2
- data/lib/bioroebe/codons/codon_table.rb +10 -2
- data/lib/bioroebe/codons/codons.rb +3 -3
- data/lib/bioroebe/codons/convert_this_codon_to_that_aminoacid.rb +18 -15
- data/lib/bioroebe/codons/determine_optimal_codons.rb +1 -1
- data/lib/bioroebe/codons/possible_codons_for_this_aminoacid.rb +4 -2
- data/lib/bioroebe/codons/show_codon_tables.rb +1 -1
- data/lib/bioroebe/codons/show_codon_usage.rb +1 -2
- data/lib/bioroebe/codons/show_this_codon_table.rb +2 -2
- data/lib/bioroebe/codons/start_codons.rb +7 -3
- data/lib/bioroebe/colours/colour_schemes/README.md +1 -1
- data/lib/bioroebe/colours/colour_schemes/array_available_colour_schemes.rb +3 -3
- data/lib/bioroebe/colours/colour_schemes/colour_scheme.rb +3 -3
- data/lib/bioroebe/colours/colour_schemes/colour_scheme_demo.rb +4 -3
- data/lib/bioroebe/colours/colour_schemes/helix.rb +3 -1
- data/lib/bioroebe/colours/colour_schemes/hydropathy.rb +3 -1
- data/lib/bioroebe/colours/colour_schemes/score.rb +13 -2
- data/lib/bioroebe/colours/colour_schemes/strand.rb +3 -1
- data/lib/bioroebe/colours/colour_schemes/turn.rb +3 -1
- data/lib/bioroebe/colours/colour_schemes/zappo.rb +1 -1
- data/lib/bioroebe/{toplevel_methods/colourize_related_methods.rb → colours/colourize_related_code.rb} +1 -3
- data/lib/bioroebe/colours/colourize_sequence.rb +3 -1
- data/lib/bioroebe/colours/colours.rb +172 -15
- data/lib/bioroebe/configuration/configuration.rb +1 -1
- data/lib/bioroebe/constants/GUIs.rb +2 -2
- data/lib/bioroebe/constants/constants.rb +1349 -0
- data/lib/bioroebe/conversions/convert_aminoacid_to_dna.rb +8 -13
- data/lib/bioroebe/conversions/dna_to_aminoacid_sequence.rb +9 -3
- data/lib/bioroebe/count/count_amount_of_aminoacids.rb +11 -10
- data/lib/bioroebe/count/count_amount_of_nucleotides.rb +1 -1
- data/lib/bioroebe/count/count_at.rb +2 -1
- data/lib/bioroebe/databases/download_taxonomy_database.rb +1 -1
- data/lib/bioroebe/dotplots/advanced_dotplot.rb +2 -2
- data/lib/bioroebe/electron_microscopy/coordinate_analyzer.rb +2 -2
- data/lib/bioroebe/electron_microscopy/fix_pos_file.rb +2 -2
- data/lib/bioroebe/electron_microscopy/flipy.rb +2 -2
- data/lib/bioroebe/electron_microscopy/generate_em2em_file.rb +3 -11
- data/lib/bioroebe/electron_microscopy/parse_coordinates.rb +6 -6
- data/lib/bioroebe/electron_microscopy/read_file_xmd.rb +6 -6
- data/lib/bioroebe/electron_microscopy/simple_star_file_generator.rb +2 -2
- data/lib/bioroebe/enzymes/has_this_restriction_enzyme.rb +1 -1
- data/lib/bioroebe/enzymes/restriction_enzyme.rb +1 -1
- data/lib/bioroebe/enzymes/restriction_enzymes/statistics.rb +4 -3
- data/lib/bioroebe/enzymes/restriction_enzymes_file.rb +1 -1
- data/lib/bioroebe/enzymes/return_sequence_that_is_cut_via_restriction_enzyme.rb +4 -3
- data/lib/bioroebe/enzymes/show_restriction_enzymes.rb +3 -3
- data/lib/bioroebe/ext/main.cpp +0 -1
- data/lib/bioroebe/fasta_and_fastq/autocorrect_the_name_of_this_fasta_file.rb +3 -3
- data/lib/bioroebe/fasta_and_fastq/compact_fasta_file/compact_fasta_file.rb +1 -1
- data/lib/bioroebe/fasta_and_fastq/display_how_many_fasta_entries_are_in_this_directory.rb +1 -1
- data/lib/bioroebe/fasta_and_fastq/download_fasta.rb +8 -14
- data/lib/bioroebe/fasta_and_fastq/fasta_defline/fasta_defline.rb +1 -1
- data/lib/bioroebe/fasta_and_fastq/fasta_to_yaml/fasta_to_yaml.rb +1 -1
- data/lib/bioroebe/fasta_and_fastq/fastq_format_explainer.rb +1 -1
- data/lib/bioroebe/fasta_and_fastq/length_modifier/length_modifier.rb +1 -1
- data/lib/bioroebe/fasta_and_fastq/parse_fasta/parse_fasta.rb +37 -11
- data/lib/bioroebe/fasta_and_fastq/parse_fastq/parse_fastq.rb +2 -2
- data/lib/bioroebe/fasta_and_fastq/return_fasta_subsection_of_this_file.rb +1 -1
- data/lib/bioroebe/fasta_and_fastq/show_fasta_headers.rb +5 -13
- data/lib/bioroebe/fasta_and_fastq/show_fasta_statistics.rb +1 -1
- data/lib/bioroebe/fasta_and_fastq/simplify_fasta_header/simplify_fasta_header.rb +1 -1
- data/lib/bioroebe/fasta_and_fastq/split_this_fasta_file_into_chromosomes/reset.rb +3 -6
- data/lib/bioroebe/fasta_and_fastq/split_this_fasta_file_into_chromosomes/split_this_fasta_file_into_chromosomes.rb +3 -3
- data/lib/bioroebe/genbank/genbank_flat_file_format_generator.rb +20 -11
- data/lib/bioroebe/genome/genome.rb +1 -1
- data/lib/bioroebe/genomes/genome_pattern.rb +17 -16
- data/lib/bioroebe/genomes/genome_retriever.rb +4 -2
- data/lib/bioroebe/gui/experimental/snapgene/snapgene.rb +10 -13
- data/lib/bioroebe/gui/universal_widgets/alignment/alignment.rb +557 -0
- data/lib/bioroebe/gui/{gtk3 → universal_widgets}/aminoacid_composition/aminoacid_composition.rb +498 -198
- data/lib/bioroebe/gui/universal_widgets/anti_sense_strand/anti_sense_strand.rb +665 -0
- data/lib/bioroebe/gui/universal_widgets/blosum_matrix_viewer/blosum_matrix_viewer.rb +329 -0
- data/lib/bioroebe/gui/universal_widgets/calculate_cell_numbers_of_bacteria/calculate_cell_numbers_of_bacteria.rb +423 -0
- data/lib/bioroebe/gui/{gtk3 → universal_widgets}/controller/controller.rb +170 -118
- data/lib/bioroebe/gui/{gtk3 → universal_widgets}/dna_to_aminoacid_widget/dna_to_aminoacid_widget.rb +277 -215
- data/lib/bioroebe/gui/{shared_code/dna_to_reverse_complement_widget/dna_to_reverse_complement_widget_module.rb → universal_widgets/dna_to_reverse_complement_widget/dna_to_reverse_complement_widget.rb} +297 -107
- data/lib/bioroebe/gui/universal_widgets/fasta_table_widget/fasta_table_widget.rb +643 -0
- data/lib/bioroebe/gui/{gtk3 → universal_widgets}/format_converter/format_converter.rb +236 -164
- data/lib/bioroebe/gui/universal_widgets/gene/gene.rb +278 -0
- data/lib/bioroebe/gui/universal_widgets/hamming_distance/hamming_distance.rb +646 -0
- data/lib/bioroebe/gui/{shared_code/levensthein_distance/levensthein_distance_module.rb → universal_widgets/levensthein_distance/levensthein_distance.rb} +313 -88
- data/lib/bioroebe/gui/{gtk3 → universal_widgets}/nucleotide_analyser/nucleotide_analyser.rb +281 -189
- data/lib/bioroebe/gui/{gtk3 → universal_widgets}/parse_pdb_file/parse_pdb_file.rb +265 -149
- data/lib/bioroebe/gui/{gtk3 → universal_widgets}/primer_design_widget/primer_design_widget.rb +337 -263
- data/lib/bioroebe/gui/universal_widgets/protein_to_DNA/protein_to_DNA.rb +408 -0
- data/lib/bioroebe/gui/{gtk3 → universal_widgets}/random_sequence/random_sequence.rb +245 -187
- data/lib/bioroebe/gui/{gtk3 → universal_widgets}/restriction_enzymes/restriction_enzymes.rb +207 -137
- data/lib/bioroebe/gui/universal_widgets/shell/shell.rb +288 -0
- data/lib/bioroebe/gui/{gtk3/show_codon_table/misc.rb → universal_widgets/show_codon_table/show_codon_table.rb} +290 -110
- data/lib/bioroebe/gui/{shared_code/show_codon_usage/show_codon_usage_module.rb → universal_widgets/show_codon_usage/show_codon_usage.rb} +228 -47
- data/lib/bioroebe/gui/{gtk3 → universal_widgets}/sizeseq/sizeseq.rb +151 -69
- data/lib/bioroebe/gui/{gtk3 → universal_widgets}/three_to_one/three_to_one.rb +190 -127
- data/lib/bioroebe/gui/{gtk3 → universal_widgets}/www_finder/www_finder.rb +211 -152
- data/lib/bioroebe/images/images.html +953 -1170
- data/lib/bioroebe/images/misc/README.md +6 -0
- data/lib/bioroebe/images/misc/activation.avif +0 -0
- data/lib/bioroebe/images/misc/inhibition.avif +0 -0
- data/lib/bioroebe/images/misc/small_virus_logo.avif +0 -0
- data/lib/bioroebe/{constants/base_directory.rb → log_directory/log_directory.rb} +79 -59
- data/lib/bioroebe/matplotlib/matplotlib_generator.rb +1 -1
- data/lib/bioroebe/misc/quiz/three_letter_to_aminoacid.rb +1 -1
- data/lib/bioroebe/misc/ruler.rb +5 -5
- data/lib/bioroebe/misc/useful_formulas.rb +3 -3
- data/lib/bioroebe/ncbi/efetch.rb +1 -2
- data/lib/bioroebe/ngs/phred_quality_score_table.rb +3 -3
- data/lib/bioroebe/nucleotides/complementary_dna_strand.rb +3 -6
- data/lib/bioroebe/nucleotides/molecular_weight_of_nucleotides.rb +3 -3
- data/lib/bioroebe/nucleotides/most_likely_nucleotide_sequence_for_this_aminoacid_sequence.rb +6 -10
- data/lib/bioroebe/nucleotides/{show_nucleotide_sequence.rb → show_nucleotide_sequence/show_nucleotide_sequence.rb} +377 -255
- data/lib/bioroebe/palindromes/palindrome_2D_structure.rb +1 -1
- data/lib/bioroebe/palindromes/palindrome_finder.rb +1 -1
- data/lib/bioroebe/palindromes/palindrome_generator.rb +2 -10
- data/lib/bioroebe/parsers/biolang_parser.rb +1 -1
- data/lib/bioroebe/parsers/blosum_parser.rb +14 -19
- data/lib/bioroebe/parsers/genbank_parser.rb +2 -6
- data/lib/bioroebe/parsers/gff.rb +9 -9
- data/lib/bioroebe/parsers/parse_embl.rb +2 -6
- data/lib/bioroebe/parsers/stride_parser.rb +4 -12
- data/lib/bioroebe/patterns/analyse_glycosylation_pattern.rb +2 -2
- data/lib/bioroebe/patterns/is_this_sequence_a_EGF2_pattern.rb +6 -3
- data/lib/bioroebe/patterns/profile_pattern.rb +2 -2
- data/lib/bioroebe/patterns/rgg_scanner.rb +4 -2
- data/lib/bioroebe/{protein_structure → pdb_and_protein_structure}/alpha_helix.rb +2 -2
- data/lib/bioroebe/{pdb → pdb_and_protein_structure}/download_this_pdb.rb +2 -3
- data/lib/bioroebe/{pdb → pdb_and_protein_structure}/fetch_fasta_sequence_from_pdb.rb +4 -4
- data/lib/bioroebe/{protein_structure → pdb_and_protein_structure}/helical_wheel.rb +2 -2
- data/lib/bioroebe/{pdb → pdb_and_protein_structure}/parse_mmCIF_file.rb +1 -1
- data/lib/bioroebe/{pdb → pdb_and_protein_structure}/parse_pdb_file.rb +3 -3
- data/lib/bioroebe/{pdb → pdb_and_protein_structure}/report_secondary_structures_from_this_pdb_file.rb +3 -3
- data/lib/bioroebe/project/project.rb +3 -1
- data/lib/bioroebe/raw_sequence/README.md +8 -8
- data/lib/bioroebe/raw_sequence/raw_sequence.rb +11 -2
- data/lib/bioroebe/regexes/regexes.rb +1 -2
- data/lib/bioroebe/requires/commandline_application.rb +3 -1
- data/lib/bioroebe/requires/require_all_pdb_files.rb +1 -1
- data/lib/bioroebe/requires/require_all_taxonomy_files.rb +1 -1
- data/lib/bioroebe/requires/require_all_utility_scripts_files.rb +10 -0
- data/lib/bioroebe/requires/require_colours.rb +1 -1
- data/lib/bioroebe/requires/require_the_bioroebe_project.rb +5 -7
- data/lib/bioroebe/requires/require_the_bioroebe_sinatra_components.rb +1 -1
- data/lib/bioroebe/requires/require_the_constants.rb +2 -14
- data/lib/bioroebe/requires/require_yaml.rb +7 -5
- data/lib/bioroebe/sequence/alignment.rb +1 -1
- data/lib/bioroebe/sequence/dna.rb +4 -2
- data/lib/bioroebe/sequence/nucleotide_module/nucleotide_module.rb +22 -8
- data/lib/bioroebe/sequence/protein.rb +2 -2
- data/lib/bioroebe/sequence/reverse_complement.rb +3 -3
- data/lib/bioroebe/sequence/rna.rb +9 -8
- data/lib/bioroebe/sequence/sequence.rb +3 -3
- data/lib/bioroebe/shell/configuration/additionally_set_xorg_buffer.yml +0 -0
- data/lib/bioroebe/shell/configuration/may_we_show_the_startup_information.yml +0 -0
- data/lib/bioroebe/shell/configuration/upcase_nucleotides.yml +0 -0
- data/lib/bioroebe/shell/configuration/use_silent_startup.yml +1 -1
- data/lib/bioroebe/shell/help/class.rb +68 -19
- data/lib/bioroebe/shell/menu.rb +5244 -5322
- data/lib/bioroebe/shell/{readline/readline.rb → readline.rb} +1 -3
- data/lib/bioroebe/shell/shell.rb +11240 -453
- data/lib/bioroebe/siRNA/siRNA.rb +3 -3
- data/lib/bioroebe/{gui/shared_code/blosum_matrix_viewer/blosum_matrix_viewer_module.rb → sinatra/sinatra_interface.rb} +28 -19
- data/lib/bioroebe/{www/sinatra/sinatra.rb → sinatra/sinatra_wrapper.rb} +731 -754
- data/lib/bioroebe/string_matching/find_longest_substring.rb +2 -10
- data/lib/bioroebe/string_matching/find_longest_substring_via_LCS_algorithm.rb +4 -14
- data/lib/bioroebe/string_matching/hamming_distance.rb +11 -10
- data/lib/bioroebe/string_matching/levensthein.rb +5 -17
- data/lib/bioroebe/string_matching/simple_string_comparer.rb +48 -4
- data/lib/bioroebe/string_matching/smith_waterman.rb +11 -6
- data/lib/bioroebe/svg/glyph.rb +4 -1
- data/lib/bioroebe/svg/mini_feature.rb +1 -1
- data/lib/bioroebe/svg/page.rb +18 -7
- data/lib/bioroebe/svg/svgee.rb +22 -13
- data/lib/bioroebe/svg/track.rb +20 -4
- data/lib/bioroebe/taxonomy/chart.rb +2 -2
- data/lib/bioroebe/taxonomy/class_methods.rb +5 -6
- data/lib/bioroebe/taxonomy/constants.rb +1 -1
- data/lib/bioroebe/taxonomy/info/info.rb +1 -1
- data/lib/bioroebe/taxonomy/info/is_dna.rb +1 -1
- data/lib/bioroebe/taxonomy/interactive.rb +1 -2
- data/lib/bioroebe/taxonomy/menu.rb +1 -1
- data/lib/bioroebe/taxonomy/node.rb +1 -1
- data/lib/bioroebe/taxonomy/parse_fasta.rb +4 -2
- data/lib/bioroebe/taxonomy/shared.rb +5 -4
- data/lib/bioroebe/taxonomy/taxonomy.rb +2 -4
- data/lib/bioroebe/toplevel_methods/fasta_and_fastq.rb +3 -45
- data/lib/bioroebe/toplevel_methods/{is_on_roebe.rb → roebe.rb} +1 -11
- data/lib/bioroebe/toplevel_methods/taxonomy.rb +6 -12
- data/lib/bioroebe/toplevel_methods/toplevel_methods.rb +5568 -0
- data/lib/bioroebe/utility_scripts/align_open_reading_frames.rb +4 -3
- data/lib/bioroebe/utility_scripts/analyse_local_dataset.rb +2 -2
- data/lib/bioroebe/utility_scripts/check_for_mismatches/check_for_mismatches.rb +16 -9
- data/lib/bioroebe/utility_scripts/compacter/compacter.rb +4 -2
- data/lib/bioroebe/utility_scripts/compare_these_two_sequences_via_blosum.rb +119 -0
- data/lib/bioroebe/utility_scripts/compseq/compseq.rb +11 -9
- data/lib/bioroebe/utility_scripts/{consensus_sequence.rb → consensus_sequence/consensus_sequence.rb} +13 -4
- data/lib/bioroebe/utility_scripts/{create_batch_entrez_file.rb → create_batch_entrez_file/create_batch_entrez_file.rb} +5 -5
- data/lib/bioroebe/utility_scripts/{determine_antigenic_areas.rb → determine_antigenic_areas/determine_antigenic_areas.rb} +5 -5
- data/lib/bioroebe/utility_scripts/{determine_missing_nucleotides_percentage.rb → determine_missing_nucleotides_percentage/determine_missing_nucleotides_percentage.rb} +16 -15
- data/lib/bioroebe/utility_scripts/display_open_reading_frames/display_open_reading_frames.rb +7 -7
- data/lib/bioroebe/utility_scripts/display_open_reading_frames/misc.rb +1 -1
- data/lib/bioroebe/utility_scripts/display_open_reading_frames/report.rb +2 -0
- data/lib/bioroebe/utility_scripts/{dot_alignment.rb → dot_alignment/dot_alignment.rb} +3 -3
- data/lib/bioroebe/utility_scripts/{download_files_from_rebase.rb → download_files_from_rebase/download_files_from_rebase.rb} +5 -5
- data/lib/bioroebe/utility_scripts/fetch_data_from_uniprot/fetch_data_from_uniprot.rb +269 -0
- data/lib/bioroebe/utility_scripts/find_gene.rb +4 -2
- data/lib/bioroebe/utility_scripts/{mirror_repeat.rb → mirror_repeat/mirror_repeat.rb} +5 -5
- data/lib/bioroebe/utility_scripts/move_file_to_its_correct_location.rb +3 -3
- data/lib/bioroebe/utility_scripts/{parse_taxonomy.rb → parse_taxonomy/parse_taxonomy.rb} +15 -6
- data/lib/bioroebe/utility_scripts/{pathways.rb → pathways/pathways.rb} +4 -3
- data/lib/bioroebe/utility_scripts/{permutations.rb → permutations/permutations.rb} +3 -3
- data/lib/bioroebe/utility_scripts/punnet/punnet.rb +4 -2
- data/lib/bioroebe/utility_scripts/{show_this_dna_sequence.rb → show_this_dna_sequence/show_this_dna_sequence.rb} +1 -1
- data/lib/bioroebe/utility_scripts/showorf/showorf.rb +406 -10
- data/lib/bioroebe/version/version.rb +2 -2
- data/lib/bioroebe/viennarna/rnafold_wrapper.rb +5 -13
- data/lib/bioroebe/virus/individual_viruses/README.md +15 -0
- data/lib/bioroebe/virus/individual_viruses/tobacco_mosaic_virus.rb +40 -0
- data/lib/bioroebe/virus/virus.rb +76 -0
- data/lib/bioroebe/www/bioroebe.cgi +4 -3
- data/lib/bioroebe/www/embeddable_interface.rb +85 -49
- data/lib/bioroebe/yaml/agarose/agarose_concentrations.yml +6 -6
- data/lib/bioroebe/yaml/antisense/antisense.yml +2 -0
- data/lib/bioroebe/yaml/blosum/blosum50.yml +6 -0
- data/lib/bioroebe/yaml/blosum/blosum90.yml +2 -1
- data/lib/bioroebe/yaml/chromosomes/chromosome_numbers.yml +2 -2
- data/lib/bioroebe/yaml/configuration/temp_dir.yml +1 -1
- data/lib/bioroebe/yaml/consensus_sequences/consensus_sequences.yml +1 -0
- data/lib/bioroebe/yaml/enzymes/enzyme_classes.yml +7 -6
- data/lib/bioroebe/yaml/humans/human_chromosomes.yml +3 -3
- data/lib/bioroebe/yaml/mRNA/mRNA.yml +1 -5
- data/lib/bioroebe/yaml/nucleotides/abbreviations_for_nucleotides.yml +1 -0
- data/lib/bioroebe/yaml/nucleotides/nucleotide_density.yml +2 -1
- data/lib/bioroebe/yaml/promoters/35S.yml +3 -1
- data/lib/bioroebe/yaml/proteases/proteases.yml +3 -1
- data/lib/bioroebe/yaml/proteins/ubiquitin.yml +4 -1
- data/lib/bioroebe/yaml/restriction_enzymes/restriction_enzymes.yml +7 -7
- data/spec/testing_toplevel_method_editor.rb +1 -1
- data/spec/testing_toplevel_method_verbose.rb +1 -1
- data/test/testing_dna_to_rna_conversion.rb +1 -1
- metadata +127 -235
- data/doc/blosum.md +0 -5
- data/lib/bioroebe/base/commandline_application/aminoacids.rb +0 -33
- data/lib/bioroebe/base/commandline_application/directory.rb +0 -33
- data/lib/bioroebe/base/commandline_application/extract.rb +0 -22
- data/lib/bioroebe/base/commandline_application/misc.rb +0 -502
- data/lib/bioroebe/base/commandline_application/opn.rb +0 -47
- data/lib/bioroebe/base/commandline_application/reset.rb +0 -42
- data/lib/bioroebe/base/commandline_application/warnings.rb +0 -36
- data/lib/bioroebe/base/commandline_application/write_what_into.rb +0 -29
- data/lib/bioroebe/base/initialize.rb +0 -18
- data/lib/bioroebe/base/misc.rb +0 -129
- data/lib/bioroebe/base/namespace.rb +0 -16
- data/lib/bioroebe/base/prototype/e_and_ee.rb +0 -24
- data/lib/bioroebe/base/prototype/misc.rb +0 -114
- data/lib/bioroebe/base/prototype/mkdir.rb +0 -20
- data/lib/bioroebe/base/prototype/reset.rb +0 -36
- data/lib/bioroebe/colours/misc_colours.rb +0 -80
- data/lib/bioroebe/colours/rev.rb +0 -44
- data/lib/bioroebe/colours/sdir.rb +0 -21
- data/lib/bioroebe/colours/sfancy.rb +0 -21
- data/lib/bioroebe/colours/sfile.rb +0 -21
- data/lib/bioroebe/colours/simp.rb +0 -21
- data/lib/bioroebe/colours/swarn.rb +0 -29
- data/lib/bioroebe/constants/aminoacids_and_proteins.rb +0 -147
- data/lib/bioroebe/constants/carriage_return.rb +0 -14
- data/lib/bioroebe/constants/codon_tables.rb +0 -77
- data/lib/bioroebe/constants/database_constants.rb +0 -107
- data/lib/bioroebe/constants/files_and_directories.rb +0 -606
- data/lib/bioroebe/constants/misc.rb +0 -209
- data/lib/bioroebe/constants/newline.rb +0 -14
- data/lib/bioroebe/constants/nucleotides.rb +0 -121
- data/lib/bioroebe/constants/regex.rb +0 -28
- data/lib/bioroebe/constants/roebe.rb +0 -38
- data/lib/bioroebe/constants/row_terminator.rb +0 -16
- data/lib/bioroebe/constants/tabulator.rb +0 -14
- data/lib/bioroebe/constants/unicode.rb +0 -12
- data/lib/bioroebe/constants/urls.rb +0 -50
- data/lib/bioroebe/gui/gtk +0 -1
- data/lib/bioroebe/gui/gtk3/README.md +0 -2
- data/lib/bioroebe/gui/gtk3/alignment/alignment.rb +0 -306
- data/lib/bioroebe/gui/gtk3/anti_sense_strand/anti_sense_strand.rb +0 -29
- data/lib/bioroebe/gui/gtk3/blosum_matrix_viewer/blosum_matrix_viewer.rb +0 -195
- data/lib/bioroebe/gui/gtk3/calculate_cell_numbers_of_bacteria/calculate_cell_numbers_of_bacteria.rb +0 -105
- data/lib/bioroebe/gui/gtk3/dna_to_reverse_complement_widget/dna_to_reverse_complement_widget.rb +0 -188
- data/lib/bioroebe/gui/gtk3/fasta_table_widget/fasta_table_widget.rb +0 -322
- data/lib/bioroebe/gui/gtk3/gene/gene.rb +0 -181
- data/lib/bioroebe/gui/gtk3/hamming_distance/hamming_distance.rb +0 -383
- data/lib/bioroebe/gui/gtk3/levensthein_distance/levensthein_distance.rb +0 -174
- data/lib/bioroebe/gui/gtk3/protein_to_DNA/protein_to_DNA.rb +0 -181
- data/lib/bioroebe/gui/gtk3/show_codon_table/show_codon_table.rb +0 -101
- data/lib/bioroebe/gui/gtk3/show_codon_usage/show_codon_usage.rb +0 -145
- data/lib/bioroebe/gui/gtk3/three_to_one/title.rb +0 -23
- data/lib/bioroebe/gui/jruby/alignment/alignment.rb +0 -165
- data/lib/bioroebe/gui/jruby/aminoacid_composition/aminoacid_composition.rb +0 -166
- data/lib/bioroebe/gui/jruby/blosum_matrix_viewer/blosum_matrix_viewer.rb +0 -82
- data/lib/bioroebe/gui/libui/README.md +0 -4
- data/lib/bioroebe/gui/libui/alignment/alignment.rb +0 -116
- data/lib/bioroebe/gui/libui/blosum_matrix_viewer/blosum_matrix_viewer.rb +0 -112
- data/lib/bioroebe/gui/libui/calculate_cell_numbers_of_bacteria/calculate_cell_numbers_of_bacteria.rb +0 -60
- data/lib/bioroebe/gui/libui/controller/controller.rb +0 -116
- data/lib/bioroebe/gui/libui/dna_to_aminoacid_widget/dna_to_aminoacid_widget.rb +0 -161
- data/lib/bioroebe/gui/libui/dna_to_reverse_complement_widget/dna_to_reverse_complement_widget.rb +0 -76
- data/lib/bioroebe/gui/libui/hamming_distance/hamming_distance.rb +0 -135
- data/lib/bioroebe/gui/libui/levensthein_distance/levensthein_distance.rb +0 -118
- data/lib/bioroebe/gui/libui/protein_to_DNA/protein_to_DNA.rb +0 -115
- data/lib/bioroebe/gui/libui/random_sequence/random_sequence.rb +0 -190
- data/lib/bioroebe/gui/libui/show_codon_table/show_codon_table.rb +0 -134
- data/lib/bioroebe/gui/libui/show_codon_usage/show_codon_usage.rb +0 -89
- data/lib/bioroebe/gui/libui/three_to_one/three_to_one.rb +0 -113
- data/lib/bioroebe/gui/shared_code/alignment/alignment_module.rb +0 -102
- data/lib/bioroebe/gui/shared_code/aminoacid_composition/aminoacid_composition_module.rb +0 -94
- data/lib/bioroebe/gui/shared_code/calculate_cell_numbers_of_bacteria/calculate_cell_numbers_of_bacteria_module.rb +0 -216
- data/lib/bioroebe/gui/shared_code/protein_to_DNA/protein_to_DNA_module.rb +0 -192
- data/lib/bioroebe/gui/shared_code/show_codon_table/show_codon_table_module.rb +0 -72
- data/lib/bioroebe/gui/tk/aminoacid_composition/aminoacid_composition.rb +0 -206
- data/lib/bioroebe/gui/tk/blosum_matrix_viewer/blosum_matrix_viewer.rb +0 -140
- data/lib/bioroebe/gui/tk/hamming_distance/hamming_distance.rb +0 -262
- data/lib/bioroebe/gui/tk/levensthein_distance/levensthein_distance.rb +0 -243
- data/lib/bioroebe/gui/tk/three_to_one/three_to_one.rb +0 -199
- data/lib/bioroebe/gui/unified_widgets/anti_sense_strand/anti_sense_strand.rb +0 -519
- data/lib/bioroebe/shell/colours/colours.rb +0 -235
- data/lib/bioroebe/shell/help/help.rb +0 -25
- data/lib/bioroebe/shell/misc.rb +0 -10227
- data/lib/bioroebe/toplevel_methods/ad_hoc_task.rb +0 -56
- data/lib/bioroebe/toplevel_methods/aminoacids_and_proteins.rb +0 -722
- data/lib/bioroebe/toplevel_methods/atomic_composition.rb +0 -198
- data/lib/bioroebe/toplevel_methods/base_composition.rb +0 -121
- data/lib/bioroebe/toplevel_methods/blast.rb +0 -153
- data/lib/bioroebe/toplevel_methods/calculate_n50_value.rb +0 -57
- data/lib/bioroebe/toplevel_methods/cat.rb +0 -71
- data/lib/bioroebe/toplevel_methods/chunked_display.rb +0 -92
- data/lib/bioroebe/toplevel_methods/cliner.rb +0 -81
- data/lib/bioroebe/toplevel_methods/complement.rb +0 -58
- data/lib/bioroebe/toplevel_methods/convert_global_env.rb +0 -39
- data/lib/bioroebe/toplevel_methods/databases.rb +0 -73
- data/lib/bioroebe/toplevel_methods/delimiter.rb +0 -19
- data/lib/bioroebe/toplevel_methods/digest.rb +0 -81
- data/lib/bioroebe/toplevel_methods/download_and_fetch_data.rb +0 -146
- data/lib/bioroebe/toplevel_methods/e.rb +0 -20
- data/lib/bioroebe/toplevel_methods/editor.rb +0 -21
- data/lib/bioroebe/toplevel_methods/esystem.rb +0 -22
- data/lib/bioroebe/toplevel_methods/exponential_growth.rb +0 -74
- data/lib/bioroebe/toplevel_methods/extract.rb +0 -56
- data/lib/bioroebe/toplevel_methods/file_and_directory_related_actions.rb +0 -269
- data/lib/bioroebe/toplevel_methods/frequencies.rb +0 -99
- data/lib/bioroebe/toplevel_methods/hamming_distance.rb +0 -60
- data/lib/bioroebe/toplevel_methods/infer.rb +0 -66
- data/lib/bioroebe/toplevel_methods/leading_five_prime_and_trailing_three_prime.rb +0 -101
- data/lib/bioroebe/toplevel_methods/levensthein.rb +0 -63
- data/lib/bioroebe/toplevel_methods/log_directory.rb +0 -109
- data/lib/bioroebe/toplevel_methods/longest_common_substring.rb +0 -55
- data/lib/bioroebe/toplevel_methods/map_ncbi_entry_to_eutils_id.rb +0 -88
- data/lib/bioroebe/toplevel_methods/matches.rb +0 -259
- data/lib/bioroebe/toplevel_methods/misc.rb +0 -596
- data/lib/bioroebe/toplevel_methods/nucleotides.rb +0 -787
- data/lib/bioroebe/toplevel_methods/number_of_clones.rb +0 -63
- data/lib/bioroebe/toplevel_methods/open_in_browser.rb +0 -79
- data/lib/bioroebe/toplevel_methods/open_reading_frames.rb +0 -236
- data/lib/bioroebe/toplevel_methods/opn.rb +0 -34
- data/lib/bioroebe/toplevel_methods/palindromes.rb +0 -155
- data/lib/bioroebe/toplevel_methods/parse.rb +0 -59
- data/lib/bioroebe/toplevel_methods/phred_error_probability.rb +0 -68
- data/lib/bioroebe/toplevel_methods/rds.rb +0 -24
- data/lib/bioroebe/toplevel_methods/remove.rb +0 -86
- data/lib/bioroebe/toplevel_methods/return_source_code_of_this_method.rb +0 -35
- data/lib/bioroebe/toplevel_methods/return_subsequence_based_on_indices.rb +0 -68
- data/lib/bioroebe/toplevel_methods/rna_splicing.rb +0 -73
- data/lib/bioroebe/toplevel_methods/rnalfold.rb +0 -69
- data/lib/bioroebe/toplevel_methods/searching_and_finding.rb +0 -116
- data/lib/bioroebe/toplevel_methods/shuffleseq.rb +0 -37
- data/lib/bioroebe/toplevel_methods/statistics.rb +0 -53
- data/lib/bioroebe/toplevel_methods/sum_of_odd_integers.rb +0 -62
- data/lib/bioroebe/toplevel_methods/three_delimiter.rb +0 -34
- data/lib/bioroebe/toplevel_methods/time_and_date.rb +0 -53
- data/lib/bioroebe/toplevel_methods/to_camelcase.rb +0 -31
- data/lib/bioroebe/toplevel_methods/truncate.rb +0 -48
- data/lib/bioroebe/toplevel_methods/url.rb +0 -36
- data/lib/bioroebe/toplevel_methods/verbose.rb +0 -59
- data/lib/bioroebe/utility_scripts/showorf/constants.rb +0 -31
- data/lib/bioroebe/utility_scripts/showorf/help.rb +0 -33
- data/lib/bioroebe/utility_scripts/showorf/initialize.rb +0 -52
- data/lib/bioroebe/utility_scripts/showorf/menu.rb +0 -68
- data/lib/bioroebe/utility_scripts/showorf/reset.rb +0 -36
- data/lib/bioroebe/utility_scripts/showorf/run.rb +0 -152
- data/lib/bioroebe/utility_scripts/showorf/show.rb +0 -97
- /data/doc/{german_names_for_the_aminoacids.md → german_names_for_the_aminoacids/german_names_for_the_aminoacids.md} +0 -0
- /data/doc/{pdb_ATOM_entry.md → pdb_ATOM_entry/pdb_ATOM_entry.md} +0 -0
- /data/doc/{resources.md → resources/resources.md} +0 -0
- /data/lib/bioroebe/gui/{gtk3 → universal_widgets}/aminoacid_composition/customized_dialog.rb +0 -0
- /data/lib/bioroebe/gui/{gtk3 → universal_widgets}/anti_sense_strand/anti_sense_strand.config +0 -0
- /data/lib/bioroebe/gui/{gtk3 → universal_widgets}/calculate_cell_numbers_of_bacteria/calculate_cell_numbers_of_bacteria.config +0 -0
- /data/lib/bioroebe/gui/{gtk3 → universal_widgets}/dna_to_reverse_complement_widget/dna_to_reverse_complement_widget.config +0 -0
- /data/lib/bioroebe/gui/{gtk3 → universal_widgets}/hamming_distance/hamming_distance.config +0 -0
- /data/lib/bioroebe/gui/{gtk3 → universal_widgets}/levensthein_distance/levensthein_distance.config +0 -0
- /data/lib/bioroebe/gui/{gtk3 → universal_widgets}/protein_to_DNA/protein_to_DNA.config +0 -0
- /data/lib/bioroebe/gui/{gtk3 → universal_widgets}/restriction_enzymes/restriction_enzymes.config +0 -0
- /data/lib/bioroebe/gui/{gtk3 → universal_widgets}/www_finder/www_finder.config +0 -0
- /data/lib/bioroebe/yaml/{base_composition_of_dna.yml → base_composition_of_dna/base_composition_of_dna.yml} +0 -0
- /data/lib/bioroebe/yaml/{nuclear_localization_sequences.yml → nuclear_localization_sequences/nuclear_localization_sequences.yml} +0 -0
- /data/lib/bioroebe/yaml/{talens.yml → talens/talens.yml} +0 -0
data/README.md
CHANGED
@@ -2,7 +2,7 @@
|
|
2
2
|
[![forthebadge](https://forthebadge.com/images/badges/made-with-ruby.svg)](https://www.ruby-lang.org/en/)
|
3
3
|
[![Gem Version](https://badge.fury.io/rb/bioroebe.svg)](https://badge.fury.io/rb/bioroebe)
|
4
4
|
|
5
|
-
This gem was <b>last updated</b> on the <span style="color: darkblue; font-weight: bold">
|
5
|
+
This gem was <b>last updated</b> on the <span style="color: darkblue; font-weight: bold">22.02.2024</span> (dd.mm.yyyy notation), at <span style="color: steelblue; font-weight: bold">13:34:05</span> o'clock.
|
6
6
|
|
7
7
|
# The Bioroebe Project
|
8
8
|
|
@@ -14,239 +14,193 @@ This gem was <b>last updated</b> on the <span style="color: darkblue; font-weigh
|
|
14
14
|
<img src="https://i.imgur.com/AfduheY.png" style="margin: 4px; margin-left: 12px;"/>
|
15
15
|
|
16
16
|
The **<span style="color: darkblue">above pictures</span>**,
|
17
|
-
more or less, represent DNA
|
18
|
-
|
17
|
+
more or less, represent DNA or, more generally, information in biological
|
18
|
+
systems. In particular the very left picture used to represent a
|
19
|
+
<b>dsDNA helix</b> (a **double-stranded DNA helix**), but it no longer
|
20
|
+
really looks like a dsDNA helix, due to my failed attempts at 'improving' it over
|
21
|
+
the years. One day I really should re-do that image ...
|
19
22
|
|
20
|
-
The
|
21
|
-
my attempts to 'improve' it over the years and failing hard at that.
|
22
|
-
I really should re-do that image one day.
|
23
|
-
|
24
|
-
The other two images are more recent. For instance, the second picture (the one
|
23
|
+
The other images are more recent. For instance, the second picture (the one
|
25
24
|
that is almost in the middle) shows a schematic for a dsDNA helix. Of course
|
26
|
-
DNA does not look like this at all whatsoever
|
27
|
-
ever look as depicted like that, aside from the spacing being incorrect
|
28
|
-
it is a pretty picture nonetheless
|
29
|
-
visual show effects! \o/
|
25
|
+
DNA does not look like this at all whatsoever - hydrogen bonds can not possibly
|
26
|
+
ever look as depicted like that, aside from the spacing being incorrect
|
27
|
+
anyway - but it is a pretty picture nonetheless, so let's go with pretty,
|
28
|
+
for the fancy visual show effects! \o/
|
30
29
|
|
31
30
|
Minor nitpick: keep in mind that DNA in regular cells is right-handed, so if
|
32
31
|
you see a DNA double helix displayed that is going in the left direction then
|
33
|
-
this is technically incorrect
|
34
|
-
showing two dsDNA where one is evidently "incorrect"
|
35
|
-
|
36
|
-
|
37
|
-
in chemistry
|
32
|
+
this is <b>technically incorrect</b>. So the image on the very right hand side
|
33
|
+
is showing two dsDNA where one is evidently "<b>incorrect</b>", since it is
|
34
|
+
a mirror image - but we could reason about it still being correct <b>if</b> we
|
35
|
+
were to assume that one of the two dsDNA molecules shown is a synthetic one
|
36
|
+
for a mirror cell, similar to how racemases and epimers (in chemistry) may
|
37
|
+
be defined.
|
38
38
|
|
39
39
|
The last image, that is the fourth image from left, shows a dsDNA helix. It
|
40
40
|
is a bit better than the other pictures because it is **somewhat** more
|
41
41
|
accurate. The diameter of the double helix is about <b>2 nm</b>, so this
|
42
42
|
picture kind of shows this somewhat more accurately. The distance between
|
43
43
|
two adjacent nucleotide pairs is <b>0.34 nm</b> (3.4 nm per full helical turn) - so that kind of fits
|
44
|
-
|
44
|
+
the distance shown in the fourth image; not quite perfect, but somewhat
|
45
|
+
close to that.
|
45
46
|
|
46
47
|
## About the BioRoebe project: History and Goals
|
47
48
|
|
48
49
|
The **BioRoebe project** was initially created in the year **2007** -
|
49
|
-
or at the least close
|
50
|
+
or at least close to that year, give or take, under <b>another
|
51
|
+
name</b>.
|
50
52
|
|
51
53
|
I was using the project for quite some time for my own, personal
|
52
54
|
use cases in regards to **bioinformatics** and **molecular biology**;
|
53
|
-
just a small hobby project, for very small, minor tasks.
|
54
|
-
|
55
|
-
|
56
|
-
|
55
|
+
just a small hobby project, for very small, minor tasks.
|
56
|
+
|
57
|
+
In many ways the project is still **just a hobby project** really -
|
58
|
+
it's not a professional suite of software, and won't be for the
|
59
|
+
foreseeable time either, due to time constraints alone.
|
57
60
|
|
58
|
-
In **early 2013**, the project was finally published on
|
61
|
+
In **early 2013**, the project was finally <b>published</b> on
|
59
62
|
**rubygems.org**, which has been its new home ever since -
|
60
63
|
and probably will remain its home, for a very long time to come.
|
61
|
-
It is hard to predict the future accurately, though.
|
62
|
-
|
63
|
-
|
64
|
-
|
65
|
-
|
66
|
-
|
67
|
-
|
68
|
-
|
69
|
-
|
70
|
-
|
71
|
-
|
72
|
-
|
73
|
-
|
64
|
+
It is hard to predict the future accurately, though. Perhaps I
|
65
|
+
may use the project for additional professional work in the future -
|
66
|
+
I don't know yet.
|
67
|
+
|
68
|
+
Nonetheless, since the year **2013**, the project has grown
|
69
|
+
considerably in size. That makes describing the project a bit difficult,
|
70
|
+
too, since the **use cases** for the project have increased, changed
|
71
|
+
and been adapted over the years. More code, and more use cases,
|
72
|
+
begetting more (and better) documentation. That makes sense, right?
|
73
|
+
|
74
|
+
So there is not merely 'one' use case only for this project - the
|
75
|
+
**bioroebe** gem is ultimately <b>a toolset project</b>. Different
|
76
|
+
people make use of different tools. Even programming languages may
|
77
|
+
vary when it comes to this project - while most of the bioroebe project is
|
78
|
+
written in <b>ruby</b>, there are some (smaller) parts written in Java as
|
79
|
+
well and I do not rule out using Python or other programming languages to
|
80
|
+
do specific tasks either in the long run, including C++, C or any other
|
81
|
+
programming language. Some parts of the bioroebe project are more widely
|
82
|
+
(and often) used; other parts refer to **niche use cases** and thus
|
83
|
+
are less frequently used.
|
74
84
|
|
75
85
|
Despite the plethora of options supported by this project, the
|
76
86
|
**BioRoebe project** has several very important
|
77
87
|
<span style="color: darkblue; font-weight: bold">goals</span>
|
78
|
-
that are still **valid as of today** and stand out compared to
|
79
|
-
other goals of lesser importance
|
88
|
+
that are still **valid as of today** and <b>stand out compared to
|
89
|
+
other goals of lesser importance</b>.
|
80
90
|
|
81
|
-
The
|
82
|
-
case
|
83
|
-
in regards to
|
91
|
+
The <b>primary purpose</b> of this project - that is the <b>main use
|
92
|
+
case</b>, if but one has to be named - is <b>to be able to quickly help
|
93
|
+
in regards to bioinformatics-related tasks</b>, and associated
|
84
94
|
<b>wet-lab</b>-related use cases from within molecular biology.
|
85
95
|
|
86
96
|
For example, the project should allow its users to run it on
|
87
|
-
a
|
97
|
+
a <b>local computer</b>, on a **remote computer**, be used on
|
88
98
|
the commandline, via **different GUIs** or via the **www**
|
89
|
-
or on a smartphone/mobile device in the long run.
|
99
|
+
or on a smartphone/mobile device in the long run. Flexible use
|
100
|
+
cases all the way down the rabbit hole.
|
90
101
|
|
91
|
-
|
92
|
-
or could be run
|
102
|
+
<b>There should be no limitation in where and how the project should
|
103
|
+
or could be run</b>, including the possibility to run it on as many
|
93
104
|
different operating systems as possible (at the least if ruby is
|
94
105
|
available on that operating system; or possibly Java as an
|
95
|
-
additional option one day). This
|
96
|
-
try to **stay as flexible as possible** -
|
97
|
-
different operating systems, with all their
|
98
|
-
unique oddities, if this is doable.
|
106
|
+
additional option one day, including jruby-SWING bindings). This
|
107
|
+
is why the project has to try to **stay as flexible as possible** -
|
108
|
+
we <b>must</b> support different operating systems, with all their
|
109
|
+
quirks and unique oddities, if this is doable.
|
99
110
|
|
100
111
|
Note that this goal also applies to **programming languages**,
|
101
|
-
as pointed out
|
112
|
+
as pointed out above. While the primary focus for the **bioroebe**
|
102
113
|
project is (and will remain) on ruby, I specifically **do not** exclude
|
103
114
|
the possibility that languages such as Java, C/C++ or even Python may
|
104
115
|
be included and used in this project. In fact: since as of **2021**,
|
105
116
|
Java-specific parts will be extended in the bioroebe project as well.
|
117
|
+
|
106
118
|
In the long run I would like to support both ruby and Java from the
|
107
|
-
get go. The primary question that is relevant here in
|
108
|
-
to different programming languages is that of <b>use case</b>;
|
119
|
+
get go, on an equal level. The primary question that is relevant here in
|
120
|
+
regards to different programming languages is that of <b>use case</b>;
|
109
121
|
<b>usability</b> and <b>usefulness</b>, and then <b>maintainability
|
110
|
-
of the code base</b> as well, as a secondary consideration.
|
122
|
+
of the code base</b> as well, as a secondary consideration. Documentation
|
123
|
+
is also equally important, so I will try to improve the documentation
|
124
|
+
systematically - both from a user's point of view, but also from
|
125
|
+
the point of view of other developers who may try to make use of
|
126
|
+
this project.
|
111
127
|
|
112
128
|
The bioroebe project additionally has to **solve real problems**,
|
113
129
|
in particular from a molecular biology point of view. Most
|
114
130
|
bioinformatics-related toolkits were written by experts in the
|
115
131
|
field who tend to have a strong background in **mathematics** and
|
116
132
|
**informatics**. While that is a perfectly fine background to have,
|
117
|
-
and most definitely an **asset**, I
|
118
|
-
side - molecular biology and molecular genetics. This makes
|
119
|
-
a difference in thinking too, because I tend to be closer towards
|
120
|
-
the side of, say, synthetic biology
|
133
|
+
and most definitely an **asset**, I myself came from the "<i>other</i>"
|
134
|
+
side of the coin - molecular biology and molecular genetics. This makes
|
135
|
+
for a difference in thinking too, because I tend to be closer towards
|
136
|
+
the side of, say, <b>synthetic biology</b>, than on the side of
|
121
137
|
(bio)mathematics or statistics, due to my own interests in the
|
122
|
-
way how I think or approach a given problem set.
|
123
|
-
|
124
|
-
|
125
|
-
|
126
|
-
|
127
|
-
|
128
|
-
|
129
|
-
|
130
|
-
|
131
|
-
|
132
|
-
to
|
133
|
-
|
134
|
-
|
135
|
-
|
136
|
-
|
137
|
-
|
138
|
-
|
139
|
-
|
140
|
-
|
141
|
-
the
|
142
|
-
|
138
|
+
way how I think or approach a given problem set. I find life fascinating,
|
139
|
+
more so than computer systems - thus, biological information is
|
140
|
+
more appealing to me than computer-stored information.
|
141
|
+
|
142
|
+
It is most definitely a different view to primarily in-silico driven
|
143
|
+
approaches. Nonetheless, the **bioroebe project** attempts to remain
|
144
|
+
as flexible as possible, including exploring other ways in how a
|
145
|
+
given problem set can be solved.
|
146
|
+
|
147
|
+
When the bioroebe project was initially created, many years ago, I wanted it
|
148
|
+
to be <b>more natural</b> to people who may not necessarily excel at designing
|
149
|
+
(or even understanding) algorithms. Not that algorithms should be neglected,
|
150
|
+
mind you, for <b>efficiency reasons</b> alone; but the primary view for the
|
151
|
+
whole BioRoebe suite of programs will be to focus on "real" biology first
|
152
|
+
and foremost, not just <i>in-silico</i> dry runs or simulations, per se.
|
153
|
+
|
154
|
+
Although most of the field of bioinformatics is dominated by mathematicians
|
155
|
+
and computer scientists, I know that there are plenty of people who come from
|
156
|
+
a simpler **molecular biology**-specific background. So the project tries to
|
157
|
+
cater primarily to the latter group, without trying to exclude anyone else.
|
158
|
+
This also includes the focus on **documentation** - while the documentation
|
159
|
+
right now is far from perfect, I try to polish it every now and then. The
|
160
|
+
main aim here is to make the documentation useful for "Average Joe" - the
|
161
|
+
common user, at the least from a wet-lab focus on molecular biology.
|
143
162
|
|
144
163
|
So, <b>how</b> can the BioRoebe project be helpful to its users?
|
145
164
|
|
146
165
|
The **BioRoebe** project can be used to solve (some) problems
|
147
166
|
related to **biology**, **molecular biology**, **genetics**
|
148
|
-
and, last but not least, **bioinformatics
|
167
|
+
and, last but not least, **bioinformatics**, perhaps even
|
168
|
+
<b>synthetic biology</b>.
|
149
169
|
|
150
170
|
For example, say that you quickly wish to **reverse-translate a
|
151
|
-
sequence of amino acids**, and select all possible codons
|
152
|
-
or <b>the most likely codon candidate</b>; or
|
153
|
-
candidates, and display that result on the
|
154
|
-
a www interface.
|
171
|
+
sequence of amino acids**, and select all possible codons from
|
172
|
+
that sequence, or <b>the most likely codon candidate(s)</b>; or
|
173
|
+
just random codon candidates, and display that result on the
|
174
|
+
commandline or via a www interface.
|
155
175
|
|
156
|
-
This is easily possible through **BioRoebe**. How fancy and
|
157
|
-
useful
|
176
|
+
This is easily possible through **BioRoebe**. <b>How fancy and
|
177
|
+
useful!</b> \o/
|
158
178
|
|
159
179
|
For instance, when I do this on the commandline via bash and
|
160
|
-
KDE konsole
|
180
|
+
KDE konsole, invoking the aliased binary called
|
181
|
+
<b>revseq</b>, for <b>reverse sequence</b>:
|
161
182
|
|
162
|
-
revseq AAT # Alanine-Alanine-Threonine
|
183
|
+
revseq AAT # Alanine-Alanine-Threonine, so three amino acids
|
163
184
|
|
164
|
-
Then
|
185
|
+
Then on the commandline the following result will be shown,
|
186
|
+
representing a <b>DNA sequence</b>:
|
165
187
|
|
166
188
|
GCCGCCACC # These are 9 nucleotides, corresponding to the three amino acids.
|
167
189
|
|
168
|
-
(This will only work if your alias of revseq points
|
169
|
-
to the correct bin/ entry
|
170
|
-
|
171
|
-
|
172
|
-
|
173
|
-
|
174
|
-
|
175
|
-
|
176
|
-
In order to require **BioRoebe**, do use a line in ruby code such
|
177
|
-
as the following one:
|
178
|
-
|
179
|
-
require 'bioroebe'
|
180
|
-
|
181
|
-
To **automatically** include the **main namespace** upon require-time,
|
182
|
-
the following line of code can be used:
|
183
|
-
|
184
|
-
require 'bioroebe/autoinclude'
|
185
|
-
|
186
|
-
Note that this will include into the Object namespace, so if you want
|
187
|
-
to have more control over the include-action, you need to first require
|
188
|
-
the bioroebe project, and then include it onto the target namespace
|
189
|
-
that you wish to use specifically, such as a subclass or another
|
190
|
-
module. If you do not need to autoinclude then I recommend to
|
191
|
-
simply use the first variant how to require the bioroebe gem - that
|
192
|
-
should suffice. The reason why the file called **autoinclude.rb**
|
193
|
-
exists is mostly due to laziness, so we can type less. We can omit
|
194
|
-
<b>include Bioroebe</b> after all.
|
195
|
-
|
196
|
-
The **BioRoebe** project comes with a file called **bin/bioshell**,
|
197
|
-
which allows you to start this project from a typical shell, such
|
198
|
-
as **bash**, by issuing this command on the command prompt:
|
199
|
-
|
200
|
-
bioshell
|
201
|
-
|
202
|
-
You can also load up the project and run it from within a .rb file
|
203
|
-
or during an **IRB session**, by doing the following:
|
204
|
-
|
205
|
-
require 'bioroebe'
|
206
|
-
|
207
|
-
Bioroebe::BioShell[]
|
208
|
-
|
209
|
-
Or alternatively, which may be more convenient to type:
|
210
|
-
|
211
|
-
require 'bioroebe'
|
212
|
-
|
213
|
-
Bioroebe.start_shell # No need to use the [], unlike the example shown above
|
214
|
-
|
215
|
-
## Usage of the BioRoebe project
|
216
|
-
|
217
|
-
Not all subcomponents within the **BioRoebe** project have received
|
218
|
-
equal attention and thus, the **quality** of these subcomponents may
|
219
|
-
often differ, to word this nicely.
|
220
|
-
|
221
|
-
Patches and contributions to extend functionality, improve the
|
222
|
-
documentation, fix existing bugs or improve the usability and
|
223
|
-
general quality of the project, are welcome. Take note that I
|
224
|
-
in general tend to add **new** entries at the bottom of this file
|
225
|
-
here (README.gen or README.md, respectively); use the navigation
|
226
|
-
menu on the **top right** of this page to quickly jump to these
|
227
|
-
entries. Sometimes headers change a bit, but by and large content
|
228
|
-
is rarely removed; so if you ever found something in the past,
|
229
|
-
you should be able to find it again in the future - except for
|
230
|
-
when APIs are removed. These may be omitted in the documentation.
|
231
|
-
That's quite rare, though.
|
232
|
-
|
233
|
-
I will move entries that have received updates with more recent
|
234
|
-
releases to the ~bottom of this very page here - that way it should
|
235
|
-
be a bit easier to keep up to date with what has changed within
|
236
|
-
this project. It is admittedly becoming a fairly large project,
|
237
|
-
which is why I try to keep things somewhat organized.
|
238
|
-
|
239
|
-
In the long run I will also, most likely, publish the documentation
|
240
|
-
in a "booklet" format such as https://yaml.readthedocs.io/en/latest/.
|
241
|
-
That way one may be more easily able to read individual
|
242
|
-
subchapters.
|
190
|
+
(This will only work if your alias of <b>revseq</b> points
|
191
|
+
to the correct bin/ entry as well; I keep the rcfiles gem for
|
192
|
+
that, which includes all the aliases I use. In my case I have
|
193
|
+
aliased **revseq** onto **bin/deduce_most_likely_aminoacid_sequence**
|
194
|
+
actually, as can be seen in the rcfiles gem. Of course you can use
|
195
|
+
any other alias, or just flat out call **bin/deduce_most_likely_aminoacid_sequence**
|
196
|
+
directly. I just like to be succinct and to-the-point whenever
|
197
|
+
possible, so the revseq alias suits the way how my brain works.)
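The reverse-translation step described above can be sketched in a few lines of ruby. This is only a minimal illustration, not the code behind **bin/deduce_most_likely_aminoacid_sequence**: the constant `PREFERRED_CODON` and the method `reverse_translate` are hypothetical names, and the lookup table is a deliberately tiny assumption that covers only the two amino acids used in the example.

```ruby
# Hypothetical, deliberately tiny lookup table: one preferred codon per
# amino acid (one-letter code). Only the entries needed here are included.
PREFERRED_CODON = {
  'A' => 'GCC', # Alanine
  'T' => 'ACC'  # Threonine
}.freeze

# Reverse-translate a string of one-letter amino acids into a DNA string
# by concatenating the preferred codon of each residue.
def reverse_translate(aminoacids)
  aminoacids.upcase.each_char.map { |aminoacid|
    PREFERRED_CODON.fetch(aminoacid) {
      raise ArgumentError, "No codon known for #{aminoacid}"
    }
  }.join
end

puts reverse_translate('AAT') # prints GCCGCCACC
```

A real codon table would hold all 20 amino acids and would depend on the organism, since codon usage differs between species.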
|
243
198
|
|
244
199
|
## Differences and Compatibility towards BioRuby
|
245
200
|
|
246
|
-
This subsection will explain some of the philosophical - and, more
|
247
|
-
|
248
|
-
|
249
|
-
goals**.
|
201
|
+
This subsection will explain some of the philosophical - and, more importantly,
|
202
|
+
**practical** - differences between **BioRoebe** and **BioRuby**, as both projects
|
203
|
+
have somewhat similar, hence <b>shared goals</b>.
|
250
204
|
|
251
205
|
One philosophical difference, for example, is that BioRoebe is less
|
252
206
|
focused on bioinformatics as such, and more focused on **molecular
|
@@ -344,6 +298,106 @@ instead, they can do so, too - see the <b>API</b> for codon tables
|
|
344
298
|
lateron. Simply define your own constants and pass them to the
|
345
299
|
appropriate methods.
|
346
300
|
|
301
|
+
## The rewrite in November 2023
|
302
|
+
|
303
|
+
The bioroebe gem was partially rewritten in November 2023.
|
304
|
+
|
305
|
+
The primary goal for this rewrite was to add a jruby-SWING
|
306
|
+
GUI that allows the user to make use of the interactive
|
307
|
+
bioshell from a SWING GUI. This GUI works right now, but it
|
308
|
+
is not very pretty or elegant, as the following image shows
|
309
|
+
(you may have to scroll down a little bit, in order to
|
310
|
+
see it):
|
311
|
+
|
312
|
+
<img src="https://i.imgur.com/bVr8eIx.png" style="margin: 1em">
|
313
|
+
|
314
|
+
I also made sure that this does indeed work on Windows, which
|
315
|
+
was the primary reason this GUI was added in the first place -
|
316
|
+
and it does indeed work on Windows. \o/
|
317
|
+
|
318
|
+
In the coming months this will be improved.
|
319
|
+
|
320
|
+
## Usage of the BioRoebe project
|
321
|
+
|
322
|
+
Not all subcomponents within the **BioRoebe** project have received equal
|
323
|
+
attention and thus, the **quality** of these subcomponents may often
|
324
|
+
differ, to word this nicely. In other words: the quality of the code will
|
325
|
+
be different, with some parts being tested more than others.
|
326
|
+
|
327
|
+
Patches and contributions to extend functionality, improve the documentation,
|
328
|
+
fix existing bugs or improve the usability and general quality of the project,
|
329
|
+
are welcome. Take note that I tend to add **new**
|
330
|
+
entries at the bottom of this file here (README.gen or README.md, respectively);
|
331
|
+
use the navigation menu on the **top right** of this page to quickly jump to
|
332
|
+
these entries.
|
333
|
+
|
334
|
+
Sometimes headers change a bit, but by and large content is rarely removed; so
|
335
|
+
if you ever found something in the past, you should be able to find it again
|
336
|
+
in the future - except for when APIs or submodules are removed. These may
|
337
|
+
be omitted in the documentation. That rarely happens, though.
|
338
|
+
|
339
|
+
I will typically move entries that have received updates with more recent
|
340
|
+
releases to the ~bottom of this very page here - that way it should be a bit
|
341
|
+
easier to keep up to date with what has changed within this project. It is
|
342
|
+
admittedly becoming a fairly large project, which is why I try to keep
|
343
|
+
things somewhat organized here.
|
344
|
+
|
345
|
+
In the long run I will also, most likely, publish the documentation
|
346
|
+
in a "booklet" format such as https://yaml.readthedocs.io/en/latest/.
|
347
|
+
That way one may be more easily able to read individual subchapters.
|
348
|
+
|
349
|
+
The rest of this document shall now attempt to explain the different
|
350
|
+
parts of the bioroebe-gem.
|
351
|
+
|
352
|
+
## Requiring the BioRoebe project and starting the bioshell interface
|
353
|
+
|
354
|
+
In order to require **BioRoebe** you can use the following line of
|
355
|
+
ruby code:
|
356
|
+
|
357
|
+
require 'bioroebe'
|
358
|
+
|
359
|
+
To <b>automatically</b> include the **main namespace** upon require-time,
|
360
|
+
the following line of code can be used:
|
361
|
+
|
362
|
+
require 'bioroebe/autoinclude'
|
363
|
+
|
364
|
+
Note that this will include into the Object namespace, so if you want
|
365
|
+
to have more control over the include-action (and avoid inclusion into
|
366
|
+
ruby's Object namespace), you need to first require the bioroebe
|
367
|
+
project, as shown above, and then include it onto the target namespace
|
368
|
+
that you wish to use specifically, such as a subclass or another
|
369
|
+
module.
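The difference between including into Object and including into one specific namespace can be shown with plain ruby. The sketch below does not load the bioroebe gem: `ToolkitStandIn` is a hypothetical stand-in for the Bioroebe module and `PrimerDesigner` is a made-up class, so that the example runs on its own.

```ruby
# Hypothetical stand-in for the real Bioroebe module, so that this
# sketch runs without the gem being installed.
module ToolkitStandIn
  # Return the complementary DNA strand (not reversed).
  def complement(dna)
    dna.tr('ATGC', 'TACG')
  end
end

# Scoped include: only this class gains the module's methods.
class PrimerDesigner
  include ToolkitStandIn

  # Reverse complement, built on top of the included method.
  def antisense(dna)
    complement(dna).reverse
  end
end

puts PrimerDesigner.new.antisense('ATGC')  # prints GCAT
puts Object.new.respond_to?(:complement)   # prints false
```

With the real gem the same pattern would presumably be `require 'bioroebe'` followed by `include Bioroebe` inside the class or module of your choice.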
|
370
|
+
|
371
|
+
If you do not need to autoinclude then I recommend simply using the
|
372
|
+
first variant (require 'bioroebe') to require the bioroebe gem -
|
373
|
+
that should suffice for most use cases. The reason why the file called
|
374
|
+
<b>autoinclude.rb</b> exists is mostly due to my inherent laziness,
|
375
|
+
as well as a desire to be flexible, so we can ultimately type less.
|
376
|
+
We can omit <b>include Bioroebe</b> after all in the second case.
|
377
|
+
|
378
|
+
The **BioRoebe** project comes with a file called **bin/bioshell**,
|
379
|
+
which allows you to start this project from a typical shell, similar
|
380
|
+
to <b>bash</b>, by issuing this command on the command prompt:
|
381
|
+
|
382
|
+
bioshell
|
383
|
+
|
384
|
+
(If this does not work, make sure that bin/bioshell is in your
|
385
|
+
$PATH; you can also extract the .gem and move bin/bioshell to
|
386
|
+
any location you desire it to be, of course.)
|
387
|
+
|
388
|
+
You can also load up the project and run it from within a .rb
|
389
|
+
file or during an **IRB session**, by doing the following:
|
390
|
+
|
391
|
+
require 'bioroebe'
|
392
|
+
|
393
|
+
Bioroebe::BioShell[]
|
394
|
+
|
395
|
+
Or alternatively, which may be more convenient to type:
|
396
|
+
|
397
|
+
require 'bioroebe'
|
398
|
+
|
399
|
+
Bioroebe.start_shell # No need to use the [] here, unlike in the example shown above
|
400
|
+
|
347
401
|
## Readline support in the BioRoebe project
|
348
402
|
|
349
403
|
The **BioRoebe** project will attempt to make use of **Readline**, if
|
@@ -388,6 +442,61 @@ Personally I recommend that people should switch to **psych** and
|
|
388
442
|
give it a try. It should work fine really. But ultimately this is
|
389
443
|
up to them.
|
390
444
|
|
445
|
+
## Colours support in the Bioroebe project
|
446
|
+
|
447
|
+
<b>Colours</b> can be immensely useful, at the least to most people.
|
448
|
+
|
449
|
+
The bioroebe project has to support colours. The primary file
|
450
|
+
that bundles together most of that colours-related functionality
|
451
|
+
can be found at:
|
452
|
+
|
453
|
+
require 'bioroebe/colours/colours.rb'
|
454
|
+
|
455
|
+
This, in turn, depends on the gem called <b>colours</b>.
|
456
|
+
|
457
|
+
Code exists in the bioroebe project that can be used on the
|
458
|
+
commandline to output colours, such as the following image
|
459
|
+
shows:
|
460
|
+
|
461
|
+
<img src="https://i.imgur.com/VbfOME3.png" style="margin: 1em">
|
462
|
+
|
463
|
+
If you do not want or need colours, you can disable them via
|
464
|
+
this method call:
|
465
|
+
|
466
|
+
Bioroebe.disable_colours
|
467
|
+
|
468
|
+
Conversely, to enable colours again, use:
|
469
|
+
|
470
|
+
Bioroebe.enable_colours
|
471
|
+
|
472
|
+
Classes that subclass from Bioroebe::CommandlineApplication will
|
473
|
+
have a method called <b>.use_colours?</b>, which can be used
|
474
|
+
to query whether the class at hand makes use of colours.
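How such a colour toggle may work internally can be sketched as follows. This is an assumption-laden illustration and not the actual code of the bioroebe or colours gems: the module `ColourToggle` and its `steelblue` method are hypothetical, and only the method names `enable_colours`, `disable_colours` and `use_colours?` are borrowed from the text above.

```ruby
# Hypothetical module; it mirrors the method names mentioned above but is
# not the implementation used by the bioroebe gem.
module ColourToggle
  @use_colours = true # colours are on by default in this sketch

  def self.enable_colours
    @use_colours = true
  end

  def self.disable_colours
    @use_colours = false
  end

  def self.use_colours?
    @use_colours
  end

  # Wrap the text in a 24-bit (RGB) ANSI escape sequence for the HTML
  # colour steelblue (70, 130, 180), unless colours are disabled.
  def self.steelblue(text)
    return text unless use_colours?
    "\e[38;2;70;130;180m#{text}\e[0m"
  end
end

puts ColourToggle.steelblue('ATGCATGC') # coloured on RGB-capable terminals
ColourToggle.disable_colours
puts ColourToggle.steelblue('ATGCATGC') # plain text
```

Keeping the check in one place means that every output method can stay unaware of whether the terminal supports colours.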
|
475
|
+
|
476
|
+
This functionality depends on the gem called <b>colours</b>.
|
477
|
+
|
478
|
+
In the past the code allowed for konsole-colours support (KDE
|
479
|
+
konsole) and simpler terminals without RGB colour support.
|
480
|
+
|
481
|
+
When the bioroebe project was rewritten in April 2020, this was
|
482
|
+
changed. The project now depends on the Colours module, and it
|
483
|
+
will try to use that project to its full possibilities, including
|
484
|
+
KDE konsole colours (RGB colours) by default. This will also include
|
485
|
+
so called "HTML colours", such as :slateblue or :steelblue.
|
486
|
+
|
487
|
+
## class Bioroebe::DetermineMissingNucleotidesPercentage
|
488
|
+
|
489
|
+
The small class Bioroebe::DetermineMissingNucleotidesPercentage can be
|
490
|
+
used to determine missing nucleotide content, in percentage.
|
491
|
+
|
492
|
+
For instance, say that you know the GC content of a given
|
493
|
+
DNA sequence is at 48%. You want to know the GC and AT
|
494
|
+
content of that quickly, so you invoke this class.
|
495
|
+
|
496
|
+
Its output may then look like this:
|
497
|
+
|
498
|
+
<img src="https://i.imgur.com/txV3ZGY.png" style="margin: 1em">
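The arithmetic behind such a calculation is short. The following sketch assumes double-stranded DNA that follows Chargaff's parity rule (G equals C, A equals T); whether the class applies exactly this reasoning is an assumption, and `missing_nucleotide_percentages` is a hypothetical method name.

```ruby
# Derive the percentage of each nucleotide from a known GC content,
# assuming double-stranded DNA where G == C and A == T.
def missing_nucleotide_percentages(gc_percentage)
  gc = gc_percentage.to_f
  unless (0.0..100.0).cover?(gc)
    raise ArgumentError, 'GC content must be between 0 and 100'
  end
  at = 100.0 - gc # whatever is not G or C must be A or T
  { 'G' => gc / 2, 'C' => gc / 2, 'A' => at / 2, 'T' => at / 2 }
end

missing_nucleotide_percentages(48).each_pair { |nucleotide, percentage|
  puts "#{nucleotide}: #{percentage}%"
}
```

For a GC content of 48% this yields an AT content of 52%, so 24% each for G and C, and 26% each for A and T.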
|
499
|
+
|
391
500
|
## Phred quality score
|
392
501
|
|
393
502
|
If you need support for PHRED, you could use this method:
|
@@ -484,28 +593,6 @@ now and then.
|
|
484
593
|
TLR3 | double-stranded RNA (dsRNA) | https://en.wikipedia.org/wiki/TLR3
|
485
594
|
TLR4 | binds cell-wall components of gram-negative bacteria (via their LPS) | https://en.wikipedia.org/wiki/TLR4
|
486
595
|
|
487
|
-
## Enzymes
|
488
|
-
|
489
|
-
This subsection will be expanded at a later time - it will be
|
490
|
-
about enzymes in general.
|
491
|
-
|
492
|
-
For now, if you need a table, as memory, for the enzyme classes,
|
493
|
-
here is one:
|
494
|
-
|
495
|
-
| Enzyme class (EC) | Name |
|
496
|
-
|-------------------- |------------------|
|
497
|
-
| 1 | Oxidoreductases |
|
498
|
-
| 2 | Transferases |
|
499
|
-
| 3 | Hydrolases |
|
500
|
-
| 4 | Lyases |
|
501
|
-
| 5 | Isomerases |
|
502
|
-
| 6 | Ligases |
|
503
|
-
| 7 | Translocases |
|
504
|
-
|
505
|
-
See wikipedia for more information:
|
506
|
-
|
507
|
-
https://en.wikipedia.org/wiki/Enzyme_Commission_number#Top_level_codes
|
508
|
-
|
509
596
|
## Using Bioroebe in a project
|
510
597
|
|
511
598
|
Of considerable usefulness to the end-user may be the **BioShell**.
|
@@ -2455,24 +2542,6 @@ And, my favourite one:
|
|
2455
2542
|
|
2456
2543
|
https://labcalculator.net/wiki/oligo-tm
|
2457
2544
|
|
2458
|
-
## SimpleStringComparer
|
2459
|
-
|
2460
|
-
class **SimpleStringComparer** can be used to compare two strings visually,
|
2461
|
-
similar as to how **NCBI BLAST** compares two sequences to one another.
|
2462
|
-
|
2463
|
-
By default the output of that class will be, in colours, on the commandline,
|
2464
|
-
but you can disable this like so:
|
2465
|
-
|
2466
|
-
Bioroebe::SimpleStringComparer.new(ARGV) { :disable_colours }
|
2467
|
-
|
2468
|
-
Let's look at another example:
|
2469
|
-
|
2470
|
-
Bioroebe::SimpleStringComparer.new('AAAAAAAAAAAAAATTTTTTTTTTTAAAAAAAAAAAATATA|GAAAAAAAAAAAAAAAATATTTTTTTTTTTTTTTTTTTTTT')
|
2471
|
-
|
2472
|
-
This may look like so:
|
2473
|
-
|
2474
|
-
<img src="https://i.imgur.com/kkqkpmQ.png" style="margin-left: 2em">
|
2475
|
-
|
2476
2545
|
## Bioroebe::SanitizeNucleotideSequence
|
2477
2546
|
|
2478
2547
|
class **Bioroebe::SanitizeNucleotideSequence** can be used to **sanitize
|
@@ -2533,7 +2602,7 @@ class will replace the older code. But it will happen.
|
|
2533
2602
|
If you wish to make use of this class in your own projects,
|
2534
2603
|
first require it:
|
2535
2604
|
|
2536
|
-
require 'bioroebe/nucleotides/show_nucleotide_sequence.rb'
|
2605
|
+
require 'bioroebe/nucleotides/show_nucleotide_sequence/show_nucleotide_sequence.rb'
|
2537
2606
|
|
2538
2607
|
And then you can use it to display a nucleotide sequence,
|
2539
2608
|
such as via:
|
@@ -3385,7 +3454,7 @@ This would colourize Lysine. K is the one amino acid letter
|
|
3385
3454
|
for Lysine.
|
3386
3455
|
|
3387
3456
|
Next, you can test whether this works. A simple way is to
|
3388
|
-
ask for the Ubiquitin sequence. Pay attention to lysine
|
3457
|
+
ask for the Ubiquitin sequence. Pay attention to <b>lysine</b>
|
3389
3458
|
at position 48.
|
3390
3459
|
|
3391
3460
|
From within the bioshell, you can query for the ubiquitin
|
@@ -4139,7 +4208,7 @@ DNA-strand**, whereas the **palindrome** occurs on the **sister
|
|
4139
4208
|
strand**.
|
4140
4209
|
|
4141
4210
|
Bioroebe supports the simple "creation" of mirror repeats, through the
|
4142
|
-
file <b>bioroebe/utility_scripts/mirror_repeat.rb</b> and the method
|
4211
|
+
file <b>bioroebe/utility_scripts/mirror_repeat/mirror_repeat.rb</b> and the method
|
4143
4212
|
there called <b>Bioroebe.mirror_repeat_of()</b>.
|
4144
4213
|
|
4145
4214
|
Simply pass in the sequence that you wish to mirror, such as:
|
@@ -4254,50 +4323,6 @@ Why is this important to understand? Well, you may wish to design
|
|
4254
4323
|
or check that both primers in PCR, forward and reverse, would be
|
4255
4324
|
correct.
|
4256
4325
|
|
4257
|
-
## Logging and Log output
|
4258
|
-
|
4259
|
-
The **BioRoebe project** may autogenerate some files, including
|
4260
|
-
**log files**.
|
4261
|
-
|
4262
|
-
In order to be able to do so, the bioroebe project needs the user
|
4263
|
-
to be able to access a **base directory**, the so-called
|
4264
|
-
**working base directory**. This is where bioroebe assumes
|
4265
|
-
most working files to exist.
|
4266
|
-
|
4267
|
-
On my linux system this used to default to the directory
|
4268
|
-
called **/Depot/Bioroebe/**. On other systems, the log directory
|
4269
|
-
may default into the **user's home directory**, via a call to
|
4270
|
-
<b>"#{File.expand_path('~')}/"</b>. (I use this presently,
|
4271
|
-
since **2020**.)
|
4272
|
-
|
4273
|
-
This should work on most systems, but it may not be what you
|
4274
|
-
want to have or use in your own workflow. Thus, code exists
|
4275
|
-
that allows you to designate another log directory to use.
|
4276
|
-
|
4277
|
-
The API for this is simply called:
|
4278
|
-
|
4279
|
-
Bioroebe.set_log_dir()
|
4280
|
-
|
4281
|
-
The method can be found in
|
4282
|
-
**lib/bioroebe/toplevel_methods/log_directory.rb**.
|
4283
|
-
|
4284
|
-
If you want to do this from within the **bioshell** itself,
|
4285
|
-
try:
|
4286
|
-
|
4287
|
-
set_log_dir /tmp/test
|
4288
|
-
setlogdir /tmp/test
|
4289
|
-
|
4290
|
-
If you use the interactive bioroebe-shell then you can use
|
4291
|
-
**home?** to determine where the log directory is on your
|
4292
|
-
system. For example, I tend to just use **/root/Bioroebe/**
|
4293
|
-
these days, when I am the superuser.
|
4294
|
-
|
4295
|
-
Note that you can also define the environment variable
|
4296
|
-
called **BIOROEBE_DEFAULT_LOG_DIRECTORY**. If this is
|
4297
|
-
set on startup of the bioroebe-shell, then it will
|
4298
|
-
overrule the initial :default value that is used
|
4299
|
-
otherwise.
|
4300
|
-
|
4301
4326
|
## Working with Blosum
|
4302
4327
|
|
4303
4328
|
**BLOSUM** is used to sequence-align proteins; thus, it can be
|
@@ -4726,6 +4751,16 @@ than by phosphatases. Thus, there is a bias in regards to the
|
|
4726
4751
|
publications that are published. This bias is not necessarily
|
4727
4752
|
"existing" on the level of the cell(s) itself.
|
4728
4753
|
|
4754
|
+
Next, a table is shown to compare some search databases:
|
4755
|
+
|
4756
|
+
Name | remote URL
|
4757
|
+
----------------|------------------------------------------------------------------------|
|
4758
|
+
PubMed | https://pubmed.ncbi.nlm.nih.gov/ |
|
4759
|
+
Google Scholar | https://scholar.google.com/ |
|
4760
|
+
Web of Science | https://clarivate.com/webofsciencegroup/solutions/web-of-science/ |
|
4761
|
+
Scopus | https://www.scopus.com/ |
|
4762
|
+
----------------|------------------------------------------------------------------------|
|
4763
|
+
|
4729
4764
|
## Browser setting in the configuration file
|
4730
4765
|
|
4731
4766
|
The browser (for opening external websites) is defined here:
|
@@ -4866,7 +4901,7 @@ Let's take the phage lambda of E. coli. The **refseq entry** is at:
|
|
4866
4901
|
|
4867
4902
|
https://www.ncbi.nlm.nih.gov/nuccore/9626243
|
4868
4903
|
|
4869
|
-
The genome is
|
4904
|
+
The genome is <b>48502 bp</b> long (<b>dsDNA</b>).
|
4870
4905
|
|
4871
4906
|
The NCBI ID is: **NC_001416.1**
|
4872
4907
|
|
@@ -5300,44 +5335,6 @@ Try the following instruction in order to **disable** it again:
|
|
5300
5335
|
|
5301
5336
|
no_expand_cd_aliases
|
5302
5337
|
|
5303
|
-
## UniProt
|
5304
|
-
|
5305
|
-
UniProt is a freely accessible database of protein sequence and
|
5306
|
-
functional information. What makes it also useful for
|
5307
|
-
bioinformatics is that you can easily query the FASTA sequence
|
5308
|
-
of a protein.
|
5309
|
-
|
5310
|
-
Consider the protein called **A2Z669**. The **entry** to this protein
|
5311
|
-
can be found here:
|
5312
|
-
|
5313
|
-
https://www.uniprot.org/uniprot/A2Z669
|
5314
|
-
|
5315
|
-
And the corresponding FASTA sequence of that protein can be
|
5316
|
-
found here, if you append **.fasta** to the URL:
|
5317
|
-
|
5318
|
-
https://www.uniprot.org/uniprot/A2Z669.fasta
|
5319
|
-
|
5320
|
-
If you wish to save this file, from within **Bioroebe** itself,
|
5321
|
-
then you can use the following API:
|
5322
|
-
|
5323
|
-
Bioroebe.fetch_data_from_uniprot()
|
5324
|
-
Bioroebe.fetch_data_from_uniprot('A2Z669')
|
5325
|
-
|
5326
|
-
NCBI also has entries related to UniProt.
|
5327
|
-
|
5328
|
-
Example:
|
5329
|
-
|
5330
|
-
https://www.ncbi.nlm.nih.gov/protein/P02768
|
5331
|
-
https://www.ncbi.nlm.nih.gov/protein/P02768.2?report=fasta
|
5332
|
-
|
5333
|
-
This has the header **RecName: Full=Serum albumin; Flags:
|
5334
|
-
PrecursorUniProtKB/Swiss-Prot: P02768.2**.
|
5335
|
-
|
5336
|
-
From the bioroebe-shell, you can can fetch data from
|
5337
|
-
<b>Uniprot</b>, such as by issuing:
|
5338
|
-
|
5339
|
-
unitprot_fetch
|
5340
|
-
|
5341
5338
|
## AminoAcid composition
|
5342
5339
|
|
5343
5340
|
A small widget exists to show the amino-acid composition.
|
@@ -5446,11 +5443,9 @@ A bin file exists as well:
|
|
5446
5443
|
|
5447
5444
|
bin/genbank_to_fasta
|
5448
5445
|
|
5449
|
-
This is also handled by the more generic method
|
5450
|
-
|
5451
|
-
|
5452
|
-
eventually. For now (**May 2021**) only a few files are
|
5453
|
-
supported.
|
5446
|
+
This is also handled by the more generic method called <b>Bioroebe.parse()</b>,
|
5447
|
+
which attempts to parse any file that may be relevant in regards to bioinformatics
|
5448
|
+
eventually. For now (**May 2021**) only a few files are supported.
|
5454
5449
|
|
5455
5450
|
A small ruby-gtk3 widget exists for this as well:
|
5456
5451
|
|
@@ -6057,6 +6052,7 @@ found in **bioroebe/toplevel_methods/palindromes.rb**.
|
|
6057
6052
|
You can also use a toplevel method for this. Example:
|
6058
6053
|
|
6059
6054
|
Bioroebe.palindrome_generator 4 # => "CTAG\nGATC"
|
6055
|
+
Bioroebe.palindrome_generator(10)
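As a rough illustration of what such a generator has to do, here is a minimal sketch in plain ruby. It is not the code used by Bioroebe; the names `COMPLEMENT`, `reverse_complement` and `palindrome_sketch` are hypothetical. The sketch assumes that a DNA palindrome is a sequence that is identical to its own reverse complement, which is why only even lengths are accepted here.

```ruby
# Hypothetical sketch, NOT the Bioroebe implementation.
COMPLEMENT = { 'A' => 'T', 'T' => 'A', 'C' => 'G', 'G' => 'C' }.freeze

# Returns the reverse complement of a DNA sequence.
def reverse_complement(sequence)
  sequence.reverse.chars.map { |nucleotide| COMPLEMENT.fetch(nucleotide) }.join
end

# Builds a random first half and appends its reverse complement,
# so that the result reads the same on both strands.
def palindrome_sketch(length)
  raise ArgumentError, 'length must be even' if length.odd?
  half = Array.new(length / 2) { COMPLEMENT.keys.sample }.join
  half + reverse_complement(half)
end

top_strand = palindrome_sketch(10)
puts top_strand
puts top_strand.tr('ATCG', 'TAGC') # the partner strand, as in "CTAG\nGATC"
```

The second `puts` mirrors the two-line output shown in the example above: the first line is the palindrome, the second line is its partner strand.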
|
6060
6056
|
|
6061
6057
|
In **June 2020** code was added to "display" a 2D structure
|
6062
6058
|
of RNA or DNA palindromes. The code for this resides in
|
@@ -6464,20 +6460,6 @@ You can use it as follows:
|
|
6464
6460
|
Several commandline scripts make use of that. I found it to be
|
6465
6461
|
useful to have a short visual separator ready.
|
6466
6462
|
|
6467
|
-
## Using Bioroebe.three_delimiter()
|
6468
|
-
|
6469
|
-
The method **Bioroebe.three_delimiter()** can be used to
|
6470
|
-
split a String into a String where every third position has
|
6471
|
-
a trailing '|' token.
|
6472
|
-
|
6473
|
-
So for instance:
|
6474
|
-
|
6475
|
-
Bioroebe.three_delimiter 'ATGGGGATGTAGGTA' # => "ATG|GGG|ATG|TAG|GTA"
|
6476
|
-
|
6477
|
-
The primary reason why that was added as a toplevel method has been
|
6478
|
-
because it may be visually simpler to identify the individual codons
|
6479
|
-
via your eyes that way.
|
6480
|
-
|
6481
6463
|
## Generating a random DNA sequence
|
6482
6464
|
|
6483
6465
|
You can "generate" a random DNA sequence from the commandline
|
@@ -7186,6 +7168,11 @@ It will deduce the possible codons for the aminoacid sequence
|
|
7186
7168
|
MTTAGP, and it will display the findings in RNA - thus, all
|
7187
7169
|
T are U on the display on the commandline.
|
7188
7170
|
|
7171
|
+
The commandline output of the above, captured as an image,
|
7172
|
+
will be shown next:
|
7173
|
+
|
7174
|
+
<img src="https://i.imgur.com/EUCU3sH.png" style="margin: 1em">
|
7175
|
+
|
7189
7176
|
## Determining the possible codons for a given aminoacid
|
7190
7177
|
|
7191
7178
|
If you need to quickly determine all possible codons for a specific
|
@@ -8956,7 +8943,7 @@ that the base is actually correct, we can use the following formula:
|
|
8956
8943
|
P stands for <b>Probability</b>. An example to this follows next:
|
8957
8944
|
|
8958
8945
|
Say that you have a quality score of 48; then, in ruby code,
|
8959
|
-
you would get an
|
8946
|
+
you would get an <b>error probability</b> of:
|
8960
8947
|
|
8961
8948
|
10 ** (-48 / 10.0) # => 1.584893192461114e-05
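If you need this calculation more than once, it can be wrapped into a small helper. The following is only a sketch; the method name `error_probability` is a hypothetical one and is not part of the Bioroebe API. It assumes the same relation as the line above, that is P = 10 ** (-Q / 10).

```ruby
# Hypothetical helper (not part of Bioroebe): converts a quality
# score Q into the error probability P = 10 ** (-Q / 10).
def error_probability(quality_score)
  10 ** (-quality_score / 10.0)
end

# Show a few quality scores next to their error probability.
[10, 20, 30, 48].each { |quality_score|
  puts "Q=#{quality_score} => P(error)=#{error_probability(quality_score)}"
}
```

A higher quality score thus means a lower probability that the base call is wrong; every step of 10 reduces the error probability by a factor of ten.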
|
8962
8949
|
|
@@ -9524,11 +9511,14 @@ information about <b>SUMO</b>: https://en.wikipedia.org/wiki/SUMO_protein
|
|
9524
9511
|
|
9525
9512
|
This is just a simple table, for summary purposes.
|
9526
9513
|
|
9527
|
-
Name of the organism | Latin name | Number of chromosomes
|
9528
|
-
|
9514
|
+
Name of the organism | Latin name | Number of chromosomes (in somatic, diploid cells)
|
9515
|
+
------------------------------|--------------------------------|---------------------------------------------------
|
9529
9516
|
Zebrafish | Danio rerio | 16
|
9517
|
+
Cabbage plants | Brassica oleracea | 18
|
9530
9518
|
House mouse | Mus musculus | 20
|
9519
|
+
Chimpanzees | Pan troglodytes | 48
|
9531
9520
|
Pigeon (the domestic pigeon) | Columba livia domestica | 80
|
9521
|
+
Hedgehogs | Erinaceidae | 90
|
9532
9522
|
|
9533
9523
|
## Finding the consensus sequence and constructing a frequency profile
|
9534
9524
|
|
@@ -9801,6 +9791,9 @@ in time.
|
|
9801
9791
|
|
9802
9792
|
## GUIs of the bioroebe project - Graphical User Interface of the bioroebe project
|
9803
9793
|
|
9794
|
+
The bioroebe project comes with various GUIs (Graphical User Interfaces), to
|
9795
|
+
help work with various aspects even for "average users".
|
9796
|
+
|
9804
9797
|
For example, <b>levensthein.rb</b> has code that allows you to
|
9805
9798
|
start its <b>ruby-gtk GUI</b> component, via:
|
9806
9799
|
|
@@ -9810,7 +9803,8 @@ Or, in a more generic manner:
|
|
9810
9803
|
|
9811
9804
|
bioroebe --levensthein-gui
|
9812
9805
|
|
9813
|
-
Here is a **screenshot** of the gtk2-class for
|
9806
|
+
Here is a **screenshot** of the (old) ruby-gtk2-class for the
|
9807
|
+
<b>HammingDistance</b>.
|
9814
9808
|
|
9815
9809
|
<img src="https://i.imgur.com/OT4dJiq.png" style="margin-left: 2em">
|
9816
9810
|
|
@@ -9916,7 +9910,7 @@ bindings in Bioroebe to ruby-tk bindings:
|
|
9916
9910
|
19 | sizeseq | [NOT YET IMPLEMENTED] | |
|
9917
9911
|
20 | three_to_one | [NOT YET IMPLEMENTED] | |
|
9918
9912
|
21 | www_finder | [NOT YET IMPLEMENTED] | |
|
9919
|
-
22 | blosum_matrix_viewer | [TINY BIT IMPLEMENTED; ~5%] | |
|
9913
|
+
22 | blosum_matrix_viewer | [TINY BIT IMPLEMENTED; ~5%] | | [PARTIALLY IMPLEMENTED]
|
9920
9914
|
23 | random_sequence | [NOT YET IMPLEMENTED] | |
|
9921
9915
|
|
9922
9916
|
</div>
|
@@ -9956,6 +9950,12 @@ second one is jruby+swing:
|
|
9956
9950
|
|
9957
9951
|
<img src="https://i.imgur.com/5IUbDSt.png" style="margin: 1em">
|
9958
9952
|
|
9953
|
+
In December 2023 I decided to replace all old GUIs via the
|
9954
|
+
universal-widget projects. This project allows us to, eventually,
|
9955
|
+
make use of different GUI toolkits, as well as the world wide
|
9956
|
+
web, for GUIs. This is ongoing - right now only one GUI has been
|
9957
|
+
ported (three_to_one.rb), but expect more changes in 2024 here.
|
9958
|
+
|
9959
9959
|
### libUI support
|
9960
9960
|
|
9961
9961
|
Presently, as of **August 2021**, support for libUI in the bioroebe
|
@@ -10283,12 +10283,13 @@ For instance, the last CDS ranges from 2931 to 3917.
|
|
10283
10283
|
class <b>Bioroebe::Compacter</b> can be used to sanitize a text file that
|
10284
10284
|
is supposedly a FASTA sequence, such as for a DNA sequence.
|
10285
10285
|
|
10286
|
-
In September 2023 this class was partially rewritten - the old
|
10287
|
-
was not flexible enough and confusing to me
|
10288
|
-
|
10289
|
-
|
10286
|
+
In <b>September 2023</b> this class was partially rewritten - the old
|
10287
|
+
code was not flexible enough and confusing to me, which is a bad sign
|
10288
|
+
considering I wrote it in the first place. I also added more commandline
|
10289
|
+
options to this class during the rewrite, to allow the user more
|
10290
|
+
fine-tuned control over its behaviour.
|
10290
10291
|
|
10291
|
-
Why has this class been created in the first place?
|
10292
|
+
<b>Why</b> has this class been created in the first place?
|
10292
10293
|
|
10293
10294
|
If you download data from the internet, that data may not be what
|
10294
10295
|
you want it to be. It may contain numbers, rather than just
|
@@ -10305,6 +10306,217 @@ such as the following example shows:
|
|
10305
10306
|
|
10306
10307
|
compacter SPRR4_protein.fasta --retain-newlines
|
10307
10308
|
|
10309
|
+
## Calculating the BLOSUM substitution score via class Bioroebe::CompareTheseTwoSequencesViaBlosum.new
|
10310
|
+
|
10311
|
+
class Bioroebe::CompareTheseTwoSequencesViaBlosum.new is mostly an
|
10312
|
+
ad-hoc class; I wrote it quickly in 2023 to calculate the
|
10313
|
+
BLOSUM50 score.
|
10314
|
+
|
10315
|
+
I then invoke it like this from the commandline:
|
10316
|
+
|
10317
|
+
comparethesetwosequencesviablosum GSAQVKGHGKKVADALTNAVAHVDDMPNALSALSD----LHAHK GSGYLVGDSLTFVDLL--VAQHTADLLAANAALLDEFPQFKAHQ
|
10318
|
+
|
10319
|
+
This compares the two sequences:
|
10320
|
+
|
10321
|
+
GSAQVKGHGKKVADALTNAVAHVDDMPNALSALSD----LHAHK
|
10322
|
+
GSGYLVGDSLTFVDLL--VAQHTADLLAANAALLDEFPQFKAHQ
|
10323
|
+
|
10324
|
+
The score I obtained was 39.
|
10325
|
+
|
10326
|
+
## Bioroebe::SimpleStringComparer
|
10327
|
+
|
10328
|
+
class **SimpleStringComparer** (Bioroebe::SimpleStringComparer) can be
|
10329
|
+
used to compare two strings visually, similar to how **NCBI BLAST**
|
10330
|
+
compares two sequences to one another.
|
10331
|
+
|
10332
|
+
By default the output of that class will be, in colours, on the commandline,
|
10333
|
+
but you can disable this like so:
|
10334
|
+
|
10335
|
+
Bioroebe::SimpleStringComparer.new(ARGV) { :disable_colours }
|
10336
|
+
|
10337
|
+
Let's look at another example:
|
10338
|
+
|
10339
|
+
Bioroebe::SimpleStringComparer.new('AAAAAAAAAAAAAATTTTTTTTTTTAAAAAAAAAAAATATA|GAAAAAAAAAAAAAAAATATTTTTTTTTTTTTTTTTTTTTT')
|
10340
|
+
|
10341
|
+
This may look like so:
|
10342
|
+
|
10343
|
+
<img src="https://i.imgur.com/kkqkpmQ.png" style="margin-left: 2em">
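The idea behind such a visual comparison can be sketched in a few lines of plain ruby. This is not the code of Bioroebe::SimpleStringComparer; the method name `match_line` is hypothetical, colours are left out, and the sketch simply assumes that the two sequences are separated by a '|' token, as in the example above.

```ruby
# Hypothetical sketch, NOT the Bioroebe implementation.
# Returns a String that has a '|' at every position where both
# sequences carry the same character, and a ' ' otherwise.
def match_line(first, second)
  first.chars.zip(second.chars).map { |a, b| a == b ? '|' : ' ' }.join
end

input = 'AAAAAAAAAAAAAATTTTTTTTTTTAAAAAAAAAAAATATA|GAAAAAAAAAAAAAAAATATTTTTTTTTTTTTTTTTTTTTT'
first, second = input.split('|')

# Print both sequences with the match line between them.
puts first
puts match_line(first, second)
puts second
```

Positions beyond the end of the shorter sequence count as mismatches in this sketch; the real class may treat them differently.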
|
10344
|
+
|
10345
|
+
Because different people may wish to use different colours,
|
10346
|
+
the class allows the user to change these colours via
|
10347
|
+
the commandline. Let's assume you did alias
|
10348
|
+
simple_string_comparer to this class - then you can do the following:
|
10349
|
+
|
10350
|
+
simple_string_comparer 'AAAAAAAAAAAAAATTTTTTTTTTTAAAAAAAAAAAATATA|GAAAAAAAAAAAAAAAATATTTTTTTTTTTTTTTTTTTTTT' --colour-for-a-match=lightblue
|
10351
|
+
|
10352
|
+
So you can modify the vertical bar and display it in a specific
|
10353
|
+
colour. See the following image for how this may then look:
|
10354
|
+
|
10355
|
+
<img src="https://i.imgur.com/ipaXUQT.png" style="margin: 1em">
|
10356
|
+
|
10357
|
+
I may add more options here, to allow arbitrary styling, but
|
10358
|
+
I'll leave it at this for now - the future shows how useful
|
10359
|
+
this class may be.
|
10360
|
+
|
10361
|
+
## Using Bioroebe.three_delimiter()
|
10362
|
+
|
10363
|
+
The method <b>Bioroebe.three_delimiter()</b> can be used to
|
10364
|
+
split a String into a String where every third position has
|
10365
|
+
a trailing '|' token.
|
10366
|
+
|
10367
|
+
So, for instance:
|
10368
|
+
|
10369
|
+
Bioroebe.three_delimiter 'ATGGGGATGTAGGTAAAA' # => "ATG|GGG|ATG|TAG|GTA|AAA"
|
10370
|
+
|
10371
|
+
The primary reason why that was added as a toplevel method has been
|
10372
|
+
because it may be visually simpler to identify the individual codons
|
10373
|
+
via your eyes that way.
|
10374
|
+
|
10375
|
+
The following image shows this output:
|
10376
|
+
|
10377
|
+
<img src="https://i.imgur.com/aakm0Z9.png" style="margin: 1em">
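To make the behaviour described in this section more tangible, here is a minimal sketch in plain ruby. It is not the actual code of Bioroebe.three_delimiter(); the method name `three_delimiter_sketch` and its optional second argument are hypothetical.

```ruby
# Hypothetical re-implementation, for illustration only; the real
# Bioroebe.three_delimiter() may be implemented differently.
def three_delimiter_sketch(sequence, delimiter = '|')
  # Cut the String into chunks of up to three characters and
  # put the delimiter between these chunks.
  sequence.scan(/.{1,3}/).join(delimiter)
end

puts three_delimiter_sketch('ATGGGGATGTAGGTAAAA') # prints ATG|GGG|ATG|TAG|GTA|AAA
```

In this sketch a sequence whose length is not a multiple of three simply ends with a shorter last chunk; this is an assumption and may not match what Bioroebe does in that case.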
|
10378
|
+
|
10379
|
+
## Bioroebe.cat() - displaying the content of files
|
10380
|
+
|
10381
|
+
If you need to display the content of files you can use the
|
10382
|
+
helper-method called <b>Bioroebe.cat()</b>. A /bin executable
|
10383
|
+
for this functionality exists as well, aptly
|
10384
|
+
called <b>bioroebe_cat</b>.
|
10385
|
+
|
10386
|
+
## Bioroebe.extractseq
|
10387
|
+
|
10388
|
+
Bioroebe.extractseq can be used to assemble a new sequence
|
10389
|
+
from an existing sequence. This functionality has been
|
10390
|
+
inspired by EMBOSS extractseq.
|
10391
|
+
|
10392
|
+
Usage example:
|
10393
|
+
|
10394
|
+
Bioroebe.extractseq('AAAGGGTTT', '7-9','3-4') # => TTTAG
|
10395
|
+
|
10396
|
+
So a new String is generated; 7-9 and 3-4 each refer to a range of positions,
|
10397
|
+
so first we take position 7, 8, and 9, then we add 3 and
|
10398
|
+
4, and finally return the new sequence.
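The range handling described above can be sketched in plain ruby as follows. This is not the code of Bioroebe.extractseq; the method name `extractseq_sketch` is hypothetical, and the sketch assumes that positions are 1-based and inclusive, which is what the usage example implies.

```ruby
# Hypothetical sketch, NOT the Bioroebe implementation.
# Every range is a String such as '7-9'; positions are assumed
# to be 1-based and inclusive.
def extractseq_sketch(sequence, *ranges)
  ranges.map { |range|
    first, last = range.split('-').map(&:to_i)
    sequence[(first - 1)..(last - 1)] # ruby Strings are 0-based
  }.join
end

puts extractseq_sketch('AAAGGGTTT', '7-9', '3-4') # prints TTTAG
```

The sub-sequences are joined in the order in which the ranges were given, so '7-9' followed by '3-4' yields TTT first and AG second.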
|
10399
|
+
|
10400
|
+
Note that one difference between EMBOSS extractseq and
|
10401
|
+
bioroebe extractseq is that no local file is generated
|
10402
|
+
in bioroebe; you may have to combine this by yourself
|
10403
|
+
if you desire this functionality.
|
10404
|
+
|
10405
|
+
## Bioroebe.log_dir? - Logging and Log output
|
10406
|
+
|
10407
|
+
As of <b>November 2023</b>, the bioroebe project uses a simplified approach
|
10408
|
+
when it comes to the log-directory. Before that there was also the
|
10409
|
+
Bioroebe.base_dir? in use, and I kept on forgetting what the difference
|
10410
|
+
was between these two - so the latter simply became an alias to
|
10411
|
+
Bioroebe.log_dir? now.
|
10412
|
+
|
10413
|
+
<b>Bioroebe.log_dir?</b> will determine where the directory resides into which
|
10414
|
+
you can put files and directories, and have the bioroebe-project
|
10415
|
+
recognize these files and directories too, in particular FASTA files.
|
10416
|
+
|
10417
|
+
The **BioRoebe project** may autogenerate some files, including
|
10418
|
+
**log files**.
|
10419
|
+
|
10420
|
+
In order to be able to do so, the bioroebe project needs the user
|
10421
|
+
to be able to access a **base directory**, the so-called
|
10422
|
+
**working base directory**. This is where bioroebe assumes
|
10423
|
+
most working files to exist.
|
10424
|
+
|
10425
|
+
On my linux system this used to default to the directory
|
10426
|
+
called **/Depot/Bioroebe/**. On other systems, the log directory
|
10427
|
+
may default into the **user's home directory**, via a call to
|
10428
|
+
<b>"#{File.expand_path('~')}/"</b>. (I use this presently,
|
10429
|
+
since **2020**.)
|
10430
|
+
|
10431
|
+
This should work on most systems, but it may not be what you
|
10432
|
+
want to have or use in your own workflow. Thus, code exists
|
10433
|
+
that allows you to designate another log directory to use.
|
10434
|
+
|
10435
|
+
The API for this is simply called:
|
10436
|
+
|
10437
|
+
Bioroebe.set_log_dir()
|
10438
|
+
|
10439
|
+
The method can be found in
|
10440
|
+
**lib/bioroebe/log_directory/log_directory.rb**.
|
10441
|
+
|
10442
|
+
If you want to do this from within the **bioshell** itself,
|
10443
|
+
try:
|
10444
|
+
|
10445
|
+
set_log_dir /tmp/test
|
10446
|
+
setlogdir /tmp/test
|
10447
|
+
|
10448
|
+
If you use the interactive bioroebe-shell then you can use
|
10449
|
+
**home?** to determine where the log directory is on your
|
10450
|
+
system. For example, I tend to just use **/root/Bioroebe/**
|
10451
|
+
these days, when I am the superuser.
|
10452
|
+
|
10453
|
+
Note that you can also define the environment variable called
|
10454
|
+
<b>BIOROEBE_DEFAULT_LOG_DIRECTORY</b>. If this is set on startup
|
10455
|
+
of the bioroebe-shell, then it will overrule the initial :default
|
10456
|
+
value that is used otherwise.
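The lookup order described in this section can be sketched like this. It is not the code found in log_directory.rb; the method name `resolve_log_dir` is hypothetical, and the fallback to a <b>Bioroebe/</b> subdirectory of the home directory is only an assumption based on the /root/Bioroebe/ example above.

```ruby
# Hypothetical sketch, NOT the Bioroebe implementation.
# The environment variable wins if it is set; otherwise a
# directory below the home directory of the user is assumed.
def resolve_log_dir(env = ENV)
  from_env = env['BIOROEBE_DEFAULT_LOG_DIRECTORY']
  return from_env unless from_env.nil? || from_env.empty?
  "#{File.expand_path('~')}/Bioroebe/" # assumed default
end

# A plain Hash is passed here instead of ENV, to show the override:
puts resolve_log_dir('BIOROEBE_DEFAULT_LOG_DIRECTORY' => '/tmp/test')
```

Passing a Hash rather than reading ENV directly keeps the sketch easy to try out without having to change your shell environment.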
|
10457
|
+
|
10458
|
+
## Enzymes
|
10459
|
+
|
10460
|
+
This subsection will be expanded at a later time - it will be
|
10461
|
+
about enzymes in general.
|
10462
|
+
|
10463
|
+
For now, if you need a table, as a memory aid, for the enzyme classes,
|
10464
|
+
here is one:
|
10465
|
+
|
10466
|
+
| Enzyme class (EC) | Name |
|
10467
|
+
|-------------------- |------------------|
|
10468
|
+
| 1 | Oxidoreductases |
|
10469
|
+
| 2 | Transferases |
|
10470
|
+
| 3 | Hydrolases |
|
10471
|
+
| 4 | Lyases |
|
10472
|
+
| 5 | Isomerases |
|
10473
|
+
| 6 | Ligases |
|
10474
|
+
| 7 | Translocases |
|
10475
|
+
|
10476
|
+
See wikipedia for more information:
|
10477
|
+
|
10478
|
+
https://en.wikipedia.org/wiki/Enzyme_Commission_number#Top_level_codes
|
10479
|
+
|
10480
|
+
## UniProt and the data provided by UniProt
|
10481
|
+
|
10482
|
+
UniProt is a freely accessible database of protein sequence and
|
10483
|
+
functional information. What makes it also useful for
|
10484
|
+
bioinformatics is that you can easily query the FASTA sequence
|
10485
|
+
of a protein.
|
10486
|
+
|
10487
|
+
Consider the protein called **A2Z669**. The **entry** to this protein
|
10488
|
+
can be found here:
|
10489
|
+
|
10490
|
+
https://www.uniprot.org/uniprot/A2Z669
|
10491
|
+
|
10492
|
+
And the corresponding FASTA sequence of that protein can be
|
10493
|
+
found here, if you append **.fasta** to the URL:
|
10494
|
+
|
10495
|
+
https://www.uniprot.org/uniprot/A2Z669.fasta
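The URL scheme shown above is simple enough to be assembled in ruby. The following sketch only builds the two URLs and does not download anything; the method names `uniprot_entry_url` and `uniprot_fasta_url` are hypothetical and are not part of the Bioroebe API.

```ruby
# Hypothetical helpers, NOT part of Bioroebe. They only follow
# the URL scheme shown above.
def uniprot_entry_url(accession)
  "https://www.uniprot.org/uniprot/#{accession}"
end

# The FASTA variant is the entry URL with .fasta appended.
def uniprot_fasta_url(accession)
  "#{uniprot_entry_url(accession)}.fasta"
end

puts uniprot_entry_url('A2Z669')
puts uniprot_fasta_url('A2Z669')
```

The resulting String could then be handed to any HTTP client of your choice in order to store the FASTA file locally.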
|
10496
|
+
|
10497
|
+
If you wish to save this file, from within **Bioroebe** itself,
|
10498
|
+
then you can use the following API:
|
10499
|
+
|
10500
|
+
Bioroebe.fetch_data_from_uniprot()
|
10501
|
+
Bioroebe.fetch_data_from_uniprot('A2Z669')
|
10502
|
+
|
10503
|
+
NCBI also has entries related to UniProt.
|
10504
|
+
|
10505
|
+
Example:
|
10506
|
+
|
10507
|
+
https://www.ncbi.nlm.nih.gov/protein/P02768
|
10508
|
+
https://www.ncbi.nlm.nih.gov/protein/P02768.2?report=fasta
|
10509
|
+
|
10510
|
+
This has the header **RecName: Full=Serum albumin; Flags:
|
10511
|
+
PrecursorUniProtKB/Swiss-Prot: P02768.2**.
|
10512
|
+
|
10513
|
+
From the bioroebe-shell, you can fetch data from
|
10514
|
+
<b>Uniprot</b>, such as by issuing:
|
10515
|
+
|
10516
|
+
uniprot_fetch
|
10517
|
+
uniprot # This alias works as well.
|
10518
|
+
fetch_data_from_uniprot # As does this variant.
|
10519
|
+
|
10308
10520
|
## Possibly useful links in regards to molecular biology and science in general
|
10309
10521
|
|
10310
10522
|
On the www there are a myriad of links to various other external sites.
|
@@ -10364,7 +10576,7 @@ specific protein belongs to and which domains it contains.
|
|
10364
10576
|
https://www.ebi.ac.uk/interpro/
|
10365
10577
|
|
10366
10578
|
|
10367
|
-
## Contact information and mandatory 2FA coming up in 2022
|
10579
|
+
## Contact information and mandatory 2FA (no longer) coming up in 2022 / 2023
|
10368
10580
|
|
10369
10581
|
If your creative mind has ideas and specific suggestions to make this gem
|
10370
10582
|
more useful in general, feel free to drop me an email at any time, via:
|
@@ -10374,36 +10586,36 @@ more useful in general, feel free to drop me an email at any time, via:
|
|
10374
10586
|
Before that email I used an email account at Google gmail, but in **2021** I
|
10375
10587
|
decided to slowly abandon gmail, for various reasons. In order to limit the
|
10376
10588
|
explanation here, allow me to just briefly state that I do not feel as if I
|
10377
|
-
want to promote any Google service anymore when the user becomes the
|
10378
|
-
product (such as via data collection by upstream services
|
10379
|
-
a hugely flawed business model
|
10380
|
-
|
10381
|
-
|
10382
|
-
|
10383
|
-
|
10384
|
-
In
|
10385
|
-
|
10386
|
-
|
10589
|
+
want to promote any Google service anymore when the user becomes the end
|
10590
|
+
product (such as via data collection by upstream services, including other
|
10591
|
+
proxy-services). My feeling is that this is a hugely flawed business model
|
10592
|
+
to begin with, and I no longer wish to support this in any way, even if
|
10593
|
+
only indirectly so, such as by using services of companies that try to
|
10594
|
+
promote this flawed model.
|
10595
|
+
|
10596
|
+
In regards to responding to emails: please keep in mind that responding
|
10597
|
+
may take some time, depending on the amount of work I may have at that
|
10598
|
+
moment. So it is not that emails are ignored; it is more that I have not
|
10599
|
+
(yet) found the time to read and reply. This means there may be a delay
|
10600
|
+
of days, weeks and in some instances also months. There is, unfortunately,
|
10601
|
+
not much I can do when I need to prioritise my time investment, but I try
|
10602
|
+
to consider <b>all</b> feedback as an opportunity to improve my projects
|
10603
|
+
nonetheless.
|
10604
|
+
|
10605
|
+
In <b>2022</b> rubygems.org decided to make 2FA mandatory for every
|
10606
|
+
gem owner eventually:
|
10607
|
+
|
10608
|
+
see
|
10387
10609
|
https://blog.rubygems.org/2022/06/13/making-packages-more-secure.html
|
10388
10610
|
|
10389
|
-
|
10390
|
-
|
10391
|
-
|
10392
|
-
|
10393
|
-
|
10394
|
-
|
10395
|
-
|
10396
|
-
|
10397
|
-
|
10398
|
-
|
10399
|
-
days:
|
10400
|
-
|
10401
|
-
https://bugs.ruby-lang.org/issues/18800
|
10402
|
-
|
10403
|
-
(Note that this was changed a few months ago, so the last part is no
|
10404
|
-
longer valid - it is possible to register again without mandating
|
10405
|
-
2FA. I will retain the above notice for a bit longer, though, as I feel
|
10406
|
-
we should not restrict communication via mandatory authentification
|
10407
|
-
in general. Fighting spam is a noble goal, but when it also means you
|
10408
|
-
lock out real human people then this is definitely NOT good.)
|
10611
|
+
However, that has been reverted again, so I decided to shorten
|
10612
|
+
this paragraph. Mandatory 2FA may exclude users who do not have a
|
10613
|
+
smartphone device or other means to 'identify'. I do not feel it is
|
10614
|
+
a fair assumption by others to be made that non-identified people may
|
10615
|
+
not contribute code, which is why I reject it. Mandatory 2FA would mean
|
10616
|
+
an end to all my projects on rubygems.org, so let's hope it will never
|
10617
|
+
happen. (Keep in mind that I refer to mandatory 2FA; I have no qualms
|
10618
|
+
for people who use 2FA on their own, but this carrot-and-stick strategy
|
10619
|
+
by those who control the rubygems infrastructure is a very bad one to
|
10620
|
+
pursue.
|
10409
10621
|
|