bio 1.0.0 → 1.1.0

Files changed (340)
  1. data/bin/bioruby +14 -122
  2. data/bin/br_biofetch.rb +2 -2
  3. data/bin/br_bioflat.rb +2 -2
  4. data/bin/br_biogetseq.rb +2 -2
  5. data/bin/br_pmfetch.rb +3 -3
  6. data/doc/Changes-0.7.rd +77 -0
  7. data/doc/KEGG_API.rd +523 -232
  8. data/doc/KEGG_API.rd.ja +529 -207
  9. data/doc/Tutorial.rd +48 -11
  10. data/lib/bio.rb +59 -6
  11. data/lib/bio/alignment.rb +713 -103
  12. data/lib/bio/appl/bl2seq/report.rb +2 -18
  13. data/lib/bio/appl/blast.rb +108 -91
  14. data/lib/bio/appl/blast/format0.rb +33 -18
  15. data/lib/bio/appl/blast/format8.rb +6 -20
  16. data/lib/bio/appl/blast/report.rb +293 -429
  17. data/lib/bio/appl/blast/rexml.rb +8 -22
  18. data/lib/bio/appl/blast/wublast.rb +21 -12
  19. data/lib/bio/appl/blast/xmlparser.rb +180 -183
  20. data/lib/bio/appl/blat/report.rb +127 -30
  21. data/lib/bio/appl/clustalw.rb +87 -59
  22. data/lib/bio/appl/clustalw/report.rb +20 -22
  23. data/lib/bio/appl/emboss.rb +113 -20
  24. data/lib/bio/appl/fasta.rb +173 -198
  25. data/lib/bio/appl/fasta/format10.rb +244 -347
  26. data/lib/bio/appl/gcg/msf.rb +212 -0
  27. data/lib/bio/appl/gcg/seq.rb +195 -0
  28. data/lib/bio/appl/genscan/report.rb +5 -23
  29. data/lib/bio/appl/hmmer.rb +8 -45
  30. data/lib/bio/appl/hmmer/report.rb +2 -20
  31. data/lib/bio/appl/iprscan/report.rb +374 -0
  32. data/lib/bio/appl/mafft.rb +87 -50
  33. data/lib/bio/appl/mafft/report.rb +151 -44
  34. data/lib/bio/appl/muscle.rb +52 -0
  35. data/lib/bio/appl/phylip/alignment.rb +129 -0
  36. data/lib/bio/appl/phylip/distance_matrix.rb +96 -0
  37. data/lib/bio/appl/probcons.rb +41 -0
  38. data/lib/bio/appl/psort.rb +89 -96
  39. data/lib/bio/appl/psort/report.rb +6 -22
  40. data/lib/bio/appl/pts1.rb +263 -0
  41. data/lib/bio/appl/sim4.rb +26 -36
  42. data/lib/bio/appl/sim4/report.rb +2 -18
  43. data/lib/bio/appl/sosui/report.rb +5 -20
  44. data/lib/bio/appl/spidey/report.rb +2 -2
  45. data/lib/bio/appl/targetp/report.rb +4 -20
  46. data/lib/bio/appl/tcoffee.rb +55 -0
  47. data/lib/bio/appl/tmhmm/report.rb +4 -20
  48. data/lib/bio/command.rb +235 -64
  49. data/lib/bio/data/aa.rb +21 -26
  50. data/lib/bio/data/codontable.rb +2 -20
  51. data/lib/bio/data/na.rb +19 -4
  52. data/lib/bio/db.rb +27 -12
  53. data/lib/bio/db/aaindex.rb +2 -20
  54. data/lib/bio/db/embl/common.rb +4 -21
  55. data/lib/bio/db/embl/embl.rb +33 -85
  56. data/lib/bio/db/embl/sptr.rb +612 -302
  57. data/lib/bio/db/embl/swissprot.rb +10 -29
  58. data/lib/bio/db/embl/trembl.rb +10 -29
  59. data/lib/bio/db/embl/uniprot.rb +10 -29
  60. data/lib/bio/db/fantom.rb +15 -20
  61. data/lib/bio/db/fasta.rb +3 -3
  62. data/lib/bio/db/genbank/common.rb +37 -46
  63. data/lib/bio/db/genbank/ddbj.rb +6 -18
  64. data/lib/bio/db/genbank/genbank.rb +47 -186
  65. data/lib/bio/db/genbank/genpept.rb +4 -17
  66. data/lib/bio/db/genbank/refseq.rb +4 -17
  67. data/lib/bio/db/gff.rb +103 -35
  68. data/lib/bio/db/go.rb +4 -20
  69. data/lib/bio/db/kegg/brite.rb +26 -36
  70. data/lib/bio/db/kegg/compound.rb +81 -85
  71. data/lib/bio/db/kegg/drug.rb +98 -0
  72. data/lib/bio/db/kegg/enzyme.rb +133 -110
  73. data/lib/bio/db/kegg/expression.rb +2 -20
  74. data/lib/bio/db/kegg/genes.rb +208 -238
  75. data/lib/bio/db/kegg/genome.rb +164 -285
  76. data/lib/bio/db/kegg/glycan.rb +114 -157
  77. data/lib/bio/db/kegg/keggtab.rb +242 -303
  78. data/lib/bio/db/kegg/kgml.rb +117 -160
  79. data/lib/bio/db/kegg/orthology.rb +112 -0
  80. data/lib/bio/db/kegg/reaction.rb +54 -69
  81. data/lib/bio/db/kegg/taxonomy.rb +331 -0
  82. data/lib/bio/db/lasergene.rb +209 -0
  83. data/lib/bio/db/litdb.rb +3 -27
  84. data/lib/bio/db/medline.rb +228 -249
  85. data/lib/bio/db/nbrf.rb +3 -3
  86. data/lib/bio/db/newick.rb +510 -0
  87. data/lib/bio/db/nexus.rb +1854 -0
  88. data/lib/bio/db/pdb.rb +5 -17
  89. data/lib/bio/db/pdb/atom.rb +2 -18
  90. data/lib/bio/db/pdb/chain.rb +2 -18
  91. data/lib/bio/db/pdb/chemicalcomponent.rb +2 -18
  92. data/lib/bio/db/pdb/model.rb +2 -18
  93. data/lib/bio/db/pdb/pdb.rb +73 -34
  94. data/lib/bio/db/pdb/residue.rb +4 -20
  95. data/lib/bio/db/pdb/utils.rb +2 -18
  96. data/lib/bio/db/prosite.rb +403 -422
  97. data/lib/bio/db/rebase.rb +84 -40
  98. data/lib/bio/db/soft.rb +404 -0
  99. data/lib/bio/db/transfac.rb +5 -17
  100. data/lib/bio/feature.rb +106 -52
  101. data/lib/bio/io/das.rb +32 -42
  102. data/lib/bio/io/dbget.rb +2 -20
  103. data/lib/bio/io/ddbjxml.rb +77 -138
  104. data/lib/bio/io/ebisoap.rb +158 -0
  105. data/lib/bio/io/ensembl.rb +229 -0
  106. data/lib/bio/io/fastacmd.rb +89 -82
  107. data/lib/bio/io/fetch.rb +163 -96
  108. data/lib/bio/io/flatfile.rb +170 -73
  109. data/lib/bio/io/flatfile/bdb.rb +3 -16
  110. data/lib/bio/io/flatfile/index.rb +2 -2
  111. data/lib/bio/io/flatfile/indexer.rb +3 -2
  112. data/lib/bio/io/higet.rb +12 -31
  113. data/lib/bio/io/keggapi.rb +210 -269
  114. data/lib/bio/io/ncbisoap.rb +155 -0
  115. data/lib/bio/io/pubmed.rb +169 -147
  116. data/lib/bio/io/registry.rb +4 -20
  117. data/lib/bio/io/soapwsdl.rb +43 -38
  118. data/lib/bio/io/sql.rb +242 -305
  119. data/lib/bio/location.rb +407 -285
  120. data/lib/bio/map.rb +410 -0
  121. data/lib/bio/pathway.rb +558 -695
  122. data/lib/bio/reference.rb +272 -75
  123. data/lib/bio/sequence.rb +255 -13
  124. data/lib/bio/sequence/aa.rb +71 -10
  125. data/lib/bio/sequence/common.rb +187 -33
  126. data/lib/bio/sequence/compat.rb +59 -4
  127. data/lib/bio/sequence/format.rb +54 -7
  128. data/lib/bio/sequence/generic.rb +3 -3
  129. data/lib/bio/sequence/na.rb +328 -26
  130. data/lib/bio/shell.rb +11 -4
  131. data/lib/bio/shell/core.rb +221 -160
  132. data/lib/bio/shell/demo.rb +18 -15
  133. data/lib/bio/shell/interface.rb +14 -12
  134. data/lib/bio/shell/irb.rb +95 -0
  135. data/lib/bio/shell/object.rb +45 -26
  136. data/lib/bio/shell/plugin/blast.rb +42 -0
  137. data/lib/bio/shell/plugin/codon.rb +22 -14
  138. data/lib/bio/shell/plugin/das.rb +58 -0
  139. data/lib/bio/shell/plugin/emboss.rb +2 -2
  140. data/lib/bio/shell/plugin/entry.rb +22 -11
  141. data/lib/bio/shell/plugin/flatfile.rb +2 -2
  142. data/lib/bio/shell/plugin/keggapi.rb +13 -6
  143. data/lib/bio/shell/plugin/midi.rb +4 -4
  144. data/lib/bio/shell/plugin/obda.rb +2 -2
  145. data/lib/bio/shell/plugin/psort.rb +56 -0
  146. data/lib/bio/shell/plugin/seq.rb +35 -8
  147. data/lib/bio/shell/plugin/soap.rb +87 -0
  148. data/lib/bio/shell/rails/vendor/plugins/generators/bioruby/bioruby_generator.rb +29 -0
  149. data/lib/bio/shell/rails/vendor/plugins/generators/bioruby/templates/_classes.rhtml +4 -0
  150. data/lib/bio/shell/rails/vendor/plugins/generators/bioruby/templates/_log.rhtml +27 -0
  151. data/lib/bio/shell/rails/vendor/plugins/generators/bioruby/templates/_methods.rhtml +11 -0
  152. data/lib/bio/shell/rails/vendor/plugins/generators/bioruby/templates/_modules.rhtml +4 -0
  153. data/lib/bio/shell/rails/vendor/plugins/generators/bioruby/templates/_variables.rhtml +7 -0
  154. data/lib/bio/shell/rails/vendor/plugins/generators/bioruby/templates/bioruby-bg.gif +0 -0
  155. data/lib/bio/shell/rails/vendor/plugins/generators/bioruby/templates/bioruby-console.png +0 -0
  156. data/lib/bio/shell/rails/{public/images/icon.png → vendor/plugins/generators/bioruby/templates/bioruby-gem.png} +0 -0
  157. data/lib/bio/shell/rails/vendor/plugins/generators/bioruby/templates/bioruby-link.gif +0 -0
  158. data/lib/bio/shell/rails/vendor/plugins/generators/bioruby/templates/bioruby.css +369 -0
  159. data/lib/bio/shell/rails/vendor/plugins/generators/bioruby/templates/bioruby.rhtml +47 -0
  160. data/lib/bio/shell/rails/vendor/plugins/generators/bioruby/templates/bioruby_controller.rb +144 -0
  161. data/lib/bio/shell/rails/vendor/plugins/generators/bioruby/templates/bioruby_helper.rb +47 -0
  162. data/lib/bio/shell/rails/vendor/plugins/generators/bioruby/templates/commands.rhtml +8 -0
  163. data/lib/bio/shell/rails/vendor/plugins/generators/bioruby/templates/history.rhtml +10 -0
  164. data/lib/bio/shell/rails/vendor/plugins/generators/bioruby/templates/index.rhtml +22 -0
  165. data/lib/bio/shell/script.rb +25 -0
  166. data/lib/bio/shell/setup.rb +109 -0
  167. data/lib/bio/shell/web.rb +70 -58
  168. data/lib/bio/tree.rb +850 -0
  169. data/lib/bio/util/color_scheme.rb +84 -107
  170. data/lib/bio/util/color_scheme/buried.rb +5 -24
  171. data/lib/bio/util/color_scheme/helix.rb +5 -24
  172. data/lib/bio/util/color_scheme/hydropathy.rb +5 -24
  173. data/lib/bio/util/color_scheme/nucleotide.rb +5 -24
  174. data/lib/bio/util/color_scheme/strand.rb +5 -24
  175. data/lib/bio/util/color_scheme/taylor.rb +5 -24
  176. data/lib/bio/util/color_scheme/turn.rb +5 -24
  177. data/lib/bio/util/color_scheme/zappo.rb +5 -24
  178. data/lib/bio/util/contingency_table.rb +70 -43
  179. data/lib/bio/util/restriction_enzyme.rb +228 -0
  180. data/lib/bio/util/restriction_enzyme/analysis.rb +249 -0
  181. data/lib/bio/util/restriction_enzyme/analysis_basic.rb +217 -0
  182. data/lib/bio/util/restriction_enzyme/cut_symbol.rb +107 -0
  183. data/lib/bio/util/restriction_enzyme/double_stranded.rb +321 -0
  184. data/lib/bio/util/restriction_enzyme/double_stranded/aligned_strands.rb +130 -0
  185. data/lib/bio/util/restriction_enzyme/double_stranded/cut_location_pair.rb +103 -0
  186. data/lib/bio/util/restriction_enzyme/double_stranded/cut_location_pair_in_enzyme_notation.rb +38 -0
  187. data/lib/bio/util/restriction_enzyme/double_stranded/cut_locations.rb +76 -0
  188. data/lib/bio/util/restriction_enzyme/double_stranded/cut_locations_in_enzyme_notation.rb +107 -0
  189. data/lib/bio/util/restriction_enzyme/enzymes.yaml +7061 -0
  190. data/lib/bio/util/restriction_enzyme/range/cut_range.rb +24 -0
  191. data/lib/bio/util/restriction_enzyme/range/cut_ranges.rb +47 -0
  192. data/lib/bio/util/restriction_enzyme/range/horizontal_cut_range.rb +67 -0
  193. data/lib/bio/util/restriction_enzyme/range/sequence_range.rb +257 -0
  194. data/lib/bio/util/restriction_enzyme/range/sequence_range/calculated_cuts.rb +242 -0
  195. data/lib/bio/util/restriction_enzyme/range/sequence_range/fragment.rb +51 -0
  196. data/lib/bio/util/restriction_enzyme/range/sequence_range/fragments.rb +41 -0
  197. data/lib/bio/util/restriction_enzyme/range/vertical_cut_range.rb +77 -0
  198. data/lib/bio/util/restriction_enzyme/single_strand.rb +199 -0
  199. data/lib/bio/util/restriction_enzyme/single_strand/cut_locations_in_enzyme_notation.rb +135 -0
  200. data/lib/bio/util/restriction_enzyme/single_strand_complement.rb +23 -0
  201. data/lib/bio/util/restriction_enzyme/string_formatting.rb +111 -0
  202. data/lib/bio/util/sirna.rb +4 -22
  203. data/sample/color_scheme_na.rb +4 -12
  204. data/sample/enzymes.rb +78 -0
  205. data/sample/goslim.rb +5 -13
  206. data/sample/psortplot_html.rb +4 -12
  207. data/test/data/blast/2.2.15.blastp.m7 +876 -0
  208. data/test/data/embl/AB090716.embl.rel89 +63 -0
  209. data/test/data/fasta/example1.txt +75 -0
  210. data/test/data/fasta/example2.txt +21 -0
  211. data/test/data/iprscan/merged.raw +32 -0
  212. data/test/data/iprscan/merged.txt +74 -0
  213. data/test/data/soft/GDS100_partial.soft +92 -0
  214. data/test/data/soft/GSE3457_family_partial.soft +874 -0
  215. data/test/functional/bio/io/test_ensembl.rb +103 -0
  216. data/test/functional/bio/io/test_soapwsdl.rb +5 -17
  217. data/test/unit/bio/appl/bl2seq/test_report.rb +2 -2
  218. data/test/unit/bio/appl/blast/test_report.rb +3 -16
  219. data/test/unit/bio/appl/blast/test_xmlparser.rb +4 -16
  220. data/test/unit/bio/appl/genscan/test_report.rb +3 -16
  221. data/test/unit/bio/appl/hmmer/test_report.rb +3 -16
  222. data/test/unit/bio/appl/iprscan/test_report.rb +338 -0
  223. data/test/unit/bio/appl/mafft/test_report.rb +63 -0
  224. data/test/unit/bio/appl/sosui/test_report.rb +3 -16
  225. data/test/unit/bio/appl/targetp/test_report.rb +3 -16
  226. data/test/unit/bio/appl/test_blast.rb +3 -16
  227. data/test/unit/bio/appl/test_fasta.rb +4 -16
  228. data/test/unit/bio/appl/test_pts1.rb +140 -0
  229. data/test/unit/bio/appl/tmhmm/test_report.rb +3 -16
  230. data/test/unit/bio/data/test_aa.rb +4 -17
  231. data/test/unit/bio/data/test_codontable.rb +3 -16
  232. data/test/unit/bio/data/test_na.rb +3 -3
  233. data/test/unit/bio/db/embl/test_common.rb +3 -16
  234. data/test/unit/bio/db/embl/test_embl.rb +3 -16
  235. data/test/unit/bio/db/embl/test_embl_rel89.rb +219 -0
  236. data/test/unit/bio/db/embl/test_sptr.rb +1548 -41
  237. data/test/unit/bio/db/embl/test_uniprot.rb +3 -16
  238. data/test/unit/bio/db/kegg/test_genes.rb +3 -16
  239. data/test/unit/bio/db/pdb/test_pdb.rb +7 -24
  240. data/test/unit/bio/db/test_aaindex.rb +2 -2
  241. data/test/unit/bio/db/test_fasta.rb +3 -16
  242. data/test/unit/bio/db/test_gff.rb +3 -16
  243. data/test/unit/bio/db/test_lasergene.rb +95 -0
  244. data/test/unit/bio/db/test_newick.rb +56 -0
  245. data/test/unit/bio/db/test_nexus.rb +360 -0
  246. data/test/unit/bio/db/test_prosite.rb +5 -18
  247. data/test/unit/bio/db/test_rebase.rb +11 -25
  248. data/test/unit/bio/db/test_soft.rb +138 -0
  249. data/test/unit/bio/io/test_ddbjxml.rb +5 -17
  250. data/test/unit/bio/io/test_ensembl.rb +109 -0
  251. data/test/unit/bio/io/test_fastacmd.rb +3 -16
  252. data/test/unit/bio/io/test_flatfile.rb +237 -0
  253. data/test/unit/bio/io/test_soapwsdl.rb +4 -17
  254. data/test/unit/bio/sequence/test_aa.rb +3 -3
  255. data/test/unit/bio/sequence/test_common.rb +3 -16
  256. data/test/unit/bio/sequence/test_compat.rb +3 -16
  257. data/test/unit/bio/sequence/test_na.rb +29 -3
  258. data/test/unit/bio/shell/plugin/test_seq.rb +8 -8
  259. data/test/unit/bio/test_alignment.rb +16 -27
  260. data/test/unit/bio/test_command.rb +242 -25
  261. data/test/unit/bio/test_db.rb +3 -16
  262. data/test/unit/bio/test_feature.rb +4 -16
  263. data/test/unit/bio/test_location.rb +4 -16
  264. data/test/unit/bio/test_map.rb +230 -0
  265. data/test/unit/bio/test_pathway.rb +4 -16
  266. data/test/unit/bio/test_reference.rb +2 -2
  267. data/test/unit/bio/test_sequence.rb +7 -19
  268. data/test/unit/bio/test_shell.rb +3 -16
  269. data/test/unit/bio/test_tree.rb +593 -0
  270. data/test/unit/bio/util/restriction_enzyme/analysis/test_calculated_cuts.rb +299 -0
  271. data/test/unit/bio/util/restriction_enzyme/analysis/test_cut_ranges.rb +103 -0
  272. data/test/unit/bio/util/restriction_enzyme/analysis/test_sequence_range.rb +240 -0
  273. data/test/unit/bio/util/restriction_enzyme/double_stranded/test_aligned_strands.rb +100 -0
  274. data/test/unit/bio/util/restriction_enzyme/double_stranded/test_cut_location_pair.rb +75 -0
  275. data/test/unit/bio/util/restriction_enzyme/double_stranded/test_cut_location_pair_in_enzyme_notation.rb +73 -0
  276. data/test/unit/bio/util/restriction_enzyme/double_stranded/test_cut_locations.rb +53 -0
  277. data/test/unit/bio/util/restriction_enzyme/double_stranded/test_cut_locations_in_enzyme_notation.rb +104 -0
  278. data/test/unit/bio/util/restriction_enzyme/single_strand/test_cut_locations_in_enzyme_notation.rb +83 -0
  279. data/test/unit/bio/util/restriction_enzyme/test_analysis.rb +246 -0
  280. data/test/unit/bio/util/restriction_enzyme/test_cut_symbol.rb +44 -0
  281. data/test/unit/bio/util/restriction_enzyme/test_double_stranded.rb +115 -0
  282. data/test/unit/bio/util/restriction_enzyme/test_single_strand.rb +147 -0
  283. data/test/unit/bio/util/restriction_enzyme/test_single_strand_complement.rb +147 -0
  284. data/test/unit/bio/util/restriction_enzyme/test_string_formatting.rb +60 -0
  285. data/test/unit/bio/util/test_color_scheme.rb +6 -18
  286. data/test/unit/bio/util/test_contingency_table.rb +6 -18
  287. data/test/unit/bio/util/test_restriction_enzyme.rb +42 -0
  288. data/test/unit/bio/util/test_sirna.rb +3 -16
  289. metadata +228 -169
  290. data/doc/BioRuby.rd.ja +0 -225
  291. data/doc/Design.rd.ja +0 -341
  292. data/doc/TODO.rd.ja +0 -138
  293. data/lib/bio/appl/fasta/format6.rb +0 -37
  294. data/lib/bio/db/kegg/cell.rb +0 -88
  295. data/lib/bio/db/kegg/ko.rb +0 -178
  296. data/lib/bio/shell/rails/Rakefile +0 -10
  297. data/lib/bio/shell/rails/app/controllers/application.rb +0 -4
  298. data/lib/bio/shell/rails/app/controllers/shell_controller.rb +0 -94
  299. data/lib/bio/shell/rails/app/helpers/application_helper.rb +0 -3
  300. data/lib/bio/shell/rails/app/models/shell_connection.rb +0 -30
  301. data/lib/bio/shell/rails/app/views/layouts/shell.rhtml +0 -37
  302. data/lib/bio/shell/rails/app/views/shell/history.rhtml +0 -5
  303. data/lib/bio/shell/rails/app/views/shell/index.rhtml +0 -2
  304. data/lib/bio/shell/rails/app/views/shell/show.rhtml +0 -13
  305. data/lib/bio/shell/rails/config/boot.rb +0 -19
  306. data/lib/bio/shell/rails/config/database.yml +0 -85
  307. data/lib/bio/shell/rails/config/environment.rb +0 -53
  308. data/lib/bio/shell/rails/config/environments/development.rb +0 -19
  309. data/lib/bio/shell/rails/config/environments/production.rb +0 -19
  310. data/lib/bio/shell/rails/config/environments/test.rb +0 -19
  311. data/lib/bio/shell/rails/config/routes.rb +0 -19
  312. data/lib/bio/shell/rails/doc/README_FOR_APP +0 -2
  313. data/lib/bio/shell/rails/public/404.html +0 -8
  314. data/lib/bio/shell/rails/public/500.html +0 -8
  315. data/lib/bio/shell/rails/public/dispatch.cgi +0 -10
  316. data/lib/bio/shell/rails/public/dispatch.fcgi +0 -24
  317. data/lib/bio/shell/rails/public/dispatch.rb +0 -10
  318. data/lib/bio/shell/rails/public/favicon.ico +0 -0
  319. data/lib/bio/shell/rails/public/images/rails.png +0 -0
  320. data/lib/bio/shell/rails/public/index.html +0 -277
  321. data/lib/bio/shell/rails/public/javascripts/controls.js +0 -750
  322. data/lib/bio/shell/rails/public/javascripts/dragdrop.js +0 -584
  323. data/lib/bio/shell/rails/public/javascripts/effects.js +0 -854
  324. data/lib/bio/shell/rails/public/javascripts/prototype.js +0 -1785
  325. data/lib/bio/shell/rails/public/robots.txt +0 -1
  326. data/lib/bio/shell/rails/public/stylesheets/main.css +0 -187
  327. data/lib/bio/shell/rails/script/about +0 -3
  328. data/lib/bio/shell/rails/script/breakpointer +0 -3
  329. data/lib/bio/shell/rails/script/console +0 -3
  330. data/lib/bio/shell/rails/script/destroy +0 -3
  331. data/lib/bio/shell/rails/script/generate +0 -3
  332. data/lib/bio/shell/rails/script/performance/benchmarker +0 -3
  333. data/lib/bio/shell/rails/script/performance/profiler +0 -3
  334. data/lib/bio/shell/rails/script/plugin +0 -3
  335. data/lib/bio/shell/rails/script/process/reaper +0 -3
  336. data/lib/bio/shell/rails/script/process/spawner +0 -3
  337. data/lib/bio/shell/rails/script/process/spinner +0 -3
  338. data/lib/bio/shell/rails/script/runner +0 -3
  339. data/lib/bio/shell/rails/script/server +0 -42
  340. data/lib/bio/shell/rails/test/test_helper.rb +0 -28
@@ -0,0 +1,155 @@
1
+ #
2
+ # = bio/io/ncbisoap.rb - SOAP interface for NCBI Entrez Utilities
3
+ #
4
+ # Copyright:: Copyright (C) 2004, 2006
5
+ # Toshiaki Katayama <k@bioruby.org>
6
+ # License:: The Ruby License
7
+ #
8
+ # $Id: ncbisoap.rb,v 1.3 2007/04/05 23:35:41 trevor Exp $
9
+ #
10
+
11
+ require 'bio/io/soapwsdl'
12
+
13
+ module Bio
14
+ class NCBI
15
+
16
+ # == References
17
+ #
18
+ # * http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esoap_help.html
19
+ #
20
+ # == Methods
21
+ #
22
+ # All methods accept a hash as their argument, and most of the keys can be
23
+ # omitted (values are strings).
24
+ #
25
+ # Note: Methods whose names end with _MS are designed for use with
26
+ # Microsoft Visual Studio and SOAP Toolkit 3.0
27
+ #
28
+ # * http://www.ncbi.nlm.nih.gov/entrez/query/static/esoap_ms_help.html
29
+ #
30
+ # * run_eFetch(_MS)
31
+ # * "db", "id", "WebEnv", "query_key", "tool", "email", "retstart",
32
+ # "retmax", "rettype", "strand", "seq_start", "seq_stop", "complexity",
33
+ # "report"
34
+ #
35
+ # * run_eGquery(_MS)
36
+ # * "term", "tool", "email"
37
+ #
38
+ # * run_eInfo(_MS)
39
+ # * "db", "tool", "email"
40
+ #
41
+ # * run_eSpell(_MS)
42
+ # * "db", "term", "tool", "email"
43
+ #
44
+ # * run_eLink(_MS)
45
+ # * "db", "id", "reldate", "mindate", "maxdate", "datetype", "term",
46
+ # "dbfrom", "WebEnv", "query_key", "cmd", "tool", "email"
47
+ #
48
+ # * run_eSearch(_MS)
49
+ # * "db", "term", "WebEnv", "QueryKey", "usehistory", "tool", "email",
50
+ # "field", "reldate", "mindate", "maxdate", "datetype", "RetStart",
51
+ # "RetMax", "rettype", "sort"
52
+ #
53
+ # * run_eSummary(_MS)
54
+ # * "db", "id", "WebEnv", "query_key", "retstart", "retmax", "tool", "email"
55
+ #
56
+ # == Complex data types
57
+ #
58
+ # * http://www.ncbi.nlm.nih.gov/entrez/eutils/soap/egquery.xsd
59
+ # * http://www.ncbi.nlm.nih.gov/entrez/eutils/soap/einfo.xsd
60
+ # * http://www.ncbi.nlm.nih.gov/entrez/eutils/soap/esearch.xsd
61
+ # * http://www.ncbi.nlm.nih.gov/entrez/eutils/soap/esummary.xsd
62
+ # * http://www.ncbi.nlm.nih.gov/entrez/eutils/soap/elink.xsd
63
+ # * http://www.ncbi.nlm.nih.gov/entrez/eutils/soap/efetch.xsd
64
+ # * http://www.ncbi.nlm.nih.gov/entrez/eutils/soap/espell.xsd
65
+ #
66
+ class SOAP < Bio::SOAPWSDL
67
+
68
+ BASE_URI = "http://www.ncbi.nlm.nih.gov/entrez/eutils/soap/"
69
+
70
+ # set default to EUtils
71
+ SERVER_URI = BASE_URI + "eutils.wsdl"
72
+
73
+ def initialize(wsdl = nil)
74
+ super(wsdl || self.class::SERVER_URI)
75
+ end
76
+
77
+ def method_missing(*arg)
78
+ sleep 3 # make sure to rest for 3 seconds per request
79
+ @driver.send(*arg)
80
+ end
81
+
82
+ class EUtils < Bio::NCBI::SOAP
83
+ SERVER_URI = BASE_URI + "eutils.wsdl"
84
+ end
85
+
86
+ class EUtilsLite < Bio::NCBI::SOAP
87
+ SERVER_URI = BASE_URI + "eutils_lite.wsdl"
88
+ end
89
+
90
+ class EFetch < Bio::NCBI::SOAP
91
+ SERVER_URI = BASE_URI + "efetch.wsdl"
92
+ end
93
+
94
+ class EFetchLite < Bio::NCBI::SOAP
95
+ SERVER_URI = BASE_URI + "efetch_lit.wsdl"
96
+ end
97
+
98
+ end # SOAP
99
+ end # NCBI
100
+ end # Bio
101
+
102
+
103
+ if __FILE__ == $0
104
+
105
+ puts ">>> Bio::NCBI::SOAP::EFetch"
106
+ efetch = Bio::NCBI::SOAP::EFetch.new
107
+
108
+ puts "### run_eFetch in EFetch"
109
+ hash = {"db" => "protein", "id" => "37776955"}
110
+ result = efetch.run_eFetch(hash)
111
+ p result
112
+
113
+ puts ">>> Bio::NCBI::SOAP::EUtils"
114
+ eutils = Bio::NCBI::SOAP::EUtils.new
115
+
116
+ puts "### run_eFetch in EUtils"
117
+ hash = {"db" => "pubmed", "id" => "12345"}
118
+ result = eutils.run_eFetch(hash)
119
+ p result
120
+
121
+ puts "### run_eGquery - Entrez meta search to count hits in each DB"
122
+ hash = {"term" => "kinase"}
123
+ result = eutils.run_eGquery(hash) # working?
124
+ p result
125
+
126
+ puts "### run_eInfo - listing of the databases"
127
+ hash = {"db" => "protein"}
128
+ result = eutils.run_eInfo(hash)
129
+ p result
130
+
131
+ puts "### run_eSpell"
132
+ hash = {"db" => "pubmed", "term" => "kinas"}
133
+ result = eutils.run_eSpell(hash)
134
+ p result
135
+ p result["CorrectedQuery"]
136
+
137
+ puts "### run_eLink"
138
+ hash = {"db" => "protein", "id" => "37776955"}
139
+ result = eutils.run_eLink(hash) # working?
140
+ p result
141
+
142
+ puts "### run_eSearch"
143
+ hash = {"db" => "pubmed", "term" => "kinase"}
144
+ result = eutils.run_eSearch(hash)
145
+ p result
146
+
147
+ puts "### run_eSummary"
148
+ hash = {"db" => "protein", "id" => "37776955"}
149
+ result = eutils.run_eSummary(hash)
150
+ p result
151
+
152
+ end
153
+
154
+
155
+
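Review note on the `method_missing` delegation above: the sketch below shows one way the same idea could be written with a configurable pause and a matching `respond_to_missing?`. `RecordingDriver` and `ThrottledProxy` are hypothetical names used only for illustration; they are not BioRuby classes, and the stand-in driver makes no network calls.

```ruby
# Hypothetical stand-in for the SOAP driver; it only records the calls it gets.
class RecordingDriver
  attr_reader :calls

  def initialize
    @calls = []
  end

  def run_eInfo(params)
    @calls << [:run_eInfo, params]
    "eInfo for #{params['db']}"
  end
end

# Sketch of a throttled delegator (an assumption, not the library's class).
class ThrottledProxy
  def initialize(driver, delay = 3)
    @driver = driver
    @delay = delay
  end

  # Forward only methods the driver understands, pausing before each call.
  def method_missing(name, *args, &block)
    return super unless @driver.respond_to?(name)
    sleep @delay if @delay > 0
    @driver.send(name, *args, &block)
  end

  # Keep respond_to? consistent with what method_missing will accept.
  def respond_to_missing?(name, include_private = false)
    @driver.respond_to?(name, include_private) || super
  end
end

driver = RecordingDriver.new
proxy = ThrottledProxy.new(driver, 0) # zero delay for a quick local run
puts proxy.run_eInfo({ "db" => "protein" })
```

Checking `respond_to?` before forwarding means a misspelled method name raises `NoMethodError` immediately instead of first waiting out the delay.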
@@ -1,110 +1,195 @@
1
1
  #
2
- # bio/io/pubmed.rb - NCBI Entrez/PubMed client module
2
+ # = bio/io/pubmed.rb - NCBI Entrez/PubMed client module
3
3
  #
4
- # Copyright (C) 2001 KATAYAMA Toshiaki <k@bioruby.org>
4
+ # Copyright:: Copyright (C) 2001 Toshiaki Katayama <k@bioruby.org>
5
+ # Copyright:: Copyright (C) 2006 Jan Aerts <jan.aerts@bbsrc.ac.uk>
6
+ # License:: The Ruby License
5
7
  #
6
- # This library is free software; you can redistribute it and/or
7
- # modify it under the terms of the GNU Lesser General Public
8
- # License as published by the Free Software Foundation; either
9
- # version 2 of the License, or (at your option) any later version.
10
- #
11
- # This library is distributed in the hope that it will be useful,
12
- # but WITHOUT ANY WARRANTY; without even the implied warranty of
13
- # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
14
- # Lesser General Public License for more details.
15
- #
16
- # You should have received a copy of the GNU Lesser General Public
17
- # License along with this library; if not, write to the Free Software
18
- # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
19
- #
20
- # $Id: pubmed.rb,v 1.12 2005/09/08 01:22:12 k Exp $
8
+ # $Id: pubmed.rb,v 1.16 2007/04/05 23:35:41 trevor Exp $
21
9
  #
22
10
 
23
11
  require 'net/http'
24
12
  require 'cgi' unless defined?(CGI)
13
+ require 'bio/command'
25
14
 
26
15
  module Bio
27
16
 
28
- class PubMed
29
-
30
- def self.query(id)
31
- host = "www.ncbi.nlm.nih.gov"
32
- path = "/entrez/query.fcgi?tool=bioruby&cmd=Text&dopt=MEDLINE&db=PubMed&uid="
33
-
34
- http = Net::HTTP.new(host)
35
- response, = http.get(path + id.to_s)
36
- result = response.body
37
- if result =~ /#{id}\s+Error/
38
- raise( result )
39
- else
40
- result = result.gsub("\r", "\n").squeeze("\n").gsub(/<\/?pre>/, '')
41
- return result
42
- end
43
- end
44
-
45
- def self.pmfetch(id)
46
- host = "www.ncbi.nlm.nih.gov"
47
- path = "/entrez/utils/pmfetch.fcgi?tool=bioruby&mode=text&report=medline&db=PubMed&id="
48
-
49
- http = Net::HTTP.new(host)
50
- response, = http.get(path + id.to_s)
51
- result = response.body
52
- if result =~ /#{id}\s+Error/
53
- raise( result )
54
- else
55
- result = result.gsub("\r", "\n").squeeze("\n").gsub(/<\/?pre>/, '')
56
- return result
57
- end
58
- end
59
-
60
- def self.search(str)
61
- host = "www.ncbi.nlm.nih.gov"
62
- path = "/entrez/query.fcgi?tool=bioruby&cmd=Search&doptcmdl=MEDLINE&db=PubMed&term="
17
+ # == Description
18
+ #
19
+ # The Bio::PubMed class provides several ways to retrieve bibliographic
20
+ # information from the PubMed database at
21
+ # http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed. Basically, two
22
+ # types of queries are possible:
23
+ #
24
+ # * searching for PubMed IDs given a query string:
25
+ # * Bio::PubMed#search
26
+ # * Bio::PubMed#esearch
27
+ #
28
+ # * retrieving the MEDLINE text (i.e. authors, journal, abstract, ...)
29
+ # given a PubMed ID
30
+ # * Bio::PubMed#query
31
+ # * Bio::PubMed#pmfetch
32
+ # * Bio::PubMed#efetch
33
+ #
34
+ # The different methods within the same group are interchangeable and should
35
+ # return the same result.
36
+ #
37
+ # Additional information about the MEDLINE format and PubMed programmable
38
+ # APIs can be found on the following websites:
39
+ #
40
+ # * Overview: http://www.ncbi.nlm.nih.gov/entrez/query/static/overview.html
41
+ # * How to link: http://www.ncbi.nlm.nih.gov/entrez/query/static/linking.html
42
+ # * MEDLINE format: http://www.ncbi.nlm.nih.gov/entrez/query/static/help/pmhelp.html#MEDLINEDisplayFormat
43
+ # * Search field descriptions and tags: http://www.ncbi.nlm.nih.gov/entrez/query/static/help/pmhelp.html#SearchFieldDescriptionsandTags
44
+ # * Entrez utilities index: http://www.ncbi.nlm.nih.gov/entrez/utils/utils_index.html
45
+ # * PmFetch CGI help: http://www.ncbi.nlm.nih.gov/entrez/utils/pmfetch_help.html
46
+ # * E-Utilities CGI help: http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html
47
+ #
48
+ # == Usage
49
+ #
50
+ # require 'bio'
51
+ #
52
+ # # If you don't know the pubmed ID:
53
+ # Bio::PubMed.search("(genome AND analysis) OR bioinformatics").each do |x|
54
+ # p x
55
+ # end
56
+ # Bio::PubMed.esearch("(genome AND analysis) OR bioinformatics").each do |x|
57
+ # p x
58
+ # end
59
+ #
60
+ # # To retrieve the MEDLINE entry for a given PubMed ID:
61
+ # puts Bio::PubMed.query("10592173")
62
+ # puts Bio::PubMed.pmfetch("10592173")
63
+ # puts Bio::PubMed.efetch("10592173", "14693808")
64
+ # # This can be converted into a Bio::MEDLINE object:
65
+ # manuscript = Bio::PubMed.query("10592173")
66
+ # medline = Bio::MEDLINE.new(manuscript)
67
+ #
68
+ class PubMed
69
+
70
+ # Searches the PubMed database for the given keywords using Entrez query and
71
+ # returns an array of PubMed IDs.
72
+ # ---
73
+ # *Arguments*:
74
+ # * _str_: query string (required)
75
+ # *Returns*:: array of PubMed IDs
76
+ def self.search(str)
77
+ host = "www.ncbi.nlm.nih.gov"
78
+ path = "/entrez/query.fcgi?tool=bioruby&cmd=Search&doptcmdl=MEDLINE&db=PubMed&term="
79
+
80
+ http = Bio::Command.new_http(host)
81
+ response, = http.get(path + CGI.escape(str))
82
+ result = response.body
83
+ result = result.gsub("\r", "\n").squeeze("\n")
84
+ result = result.scan(/<pre>(.*?)<\/pre>/m).flatten
85
+ return result
86
+ end
63
87
 
64
- http = Net::HTTP.new(host)
65
- response, = http.get(path + CGI.escape(str))
66
- result = response.body
67
- result = result.gsub("\r", "\n").squeeze("\n")
68
- result = result.scan(/<pre>(.*?)<\/pre>/m).flatten
69
- return result
88
+ # Searches the PubMed database for the given keywords using E-Utils and returns
89
+ # an array of PubMed IDs.
90
+ #
91
+ # For information on the possible arguments, see
92
+ # http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html#PubMed
93
+ # ---
94
+ # *Arguments*:
95
+ # * _str_: query string (required)
96
+ # * _field_
97
+ # * _reldate_
98
+ # * _mindate_
99
+ # * _maxdate_
100
+ # * _datetype_
101
+ # * _retstart_
102
+ # * _retmax_ (default 100)
103
+ # * _retmode_
104
+ # * _rettype_
105
+ # *Returns*:: array of PubMed IDs
106
+ def self.esearch(str, hash = {})
107
+ hash['retmax'] = 100 unless hash['retmax']
108
+
109
+ opts = []
110
+ hash.each do |k, v|
111
+ opts << "#{k}=#{v}"
70
112
  end
71
113
 
72
- def self.esearch(str, hash = {})
73
- hash['retmax'] = 100 unless hash['retmax']
74
-
75
- opts = []
76
- hash.each do |k, v|
77
- opts << "#{k}=#{v}"
78
- end
114
+ host = "eutils.ncbi.nlm.nih.gov"
115
+ path = "/entrez/eutils/esearch.fcgi?tool=bioruby&db=pubmed&#{opts.join('&')}&term="
79
116
 
80
- host = "eutils.ncbi.nlm.nih.gov"
81
- path = "/entrez/eutils/esearch.fcgi?tool=bioruby&db=pubmed&#{opts.join('&')}&term="
117
+ http = Bio::Command.new_http(host)
118
+ response, = http.get(path + CGI.escape(str))
119
+ result = response.body
120
+ result = result.scan(/<Id>(.*?)<\/Id>/m).flatten
121
+ return result
122
+ end
82
123
 
83
- http = Net::HTTP.new(host)
84
- response, = http.get(path + CGI.escape(str))
85
- result = response.body
86
- result = result.scan(/<Id>(.*?)<\/Id>/m).flatten
124
+ # Retrieves a PubMed entry by PMID using Entrez query and returns it as a
125
+ # MEDLINE formatted string.
126
+ # ---
127
+ # *Arguments*:
128
+ # * _id_: PubMed ID (required)
129
+ # *Returns*:: MEDLINE formatted String
130
+ def self.query(id)
131
+ host = "www.ncbi.nlm.nih.gov"
132
+ path = "/entrez/query.fcgi?tool=bioruby&cmd=Text&dopt=MEDLINE&db=PubMed&uid="
133
+
134
+ http = Bio::Command.new_http(host)
135
+ response, = http.get(path + id.to_s)
136
+ result = response.body
137
+ if result =~ /#{id}\s+Error/
138
+ raise( result )
139
+ else
140
+ result = result.gsub("\r", "\n").squeeze("\n").gsub(/<\/?pre>/, '')
87
141
  return result
88
142
  end
143
+ end
89
144
 
90
- def self.efetch(*ids)
91
- return [] if ids.empty?
92
-
93
- host = "eutils.ncbi.nlm.nih.gov"
94
- path = "/entrez/eutils/efetch.fcgi?tool=bioruby&db=pubmed&retmode=text&rettype=medline&id="
95
-
96
- ids = ids.join(",")
97
-
98
- http = Net::HTTP.new(host)
99
- response, = http.get(path + ids)
100
- result = response.body
101
- result = result.split(/\n\n+/)
145
+ # Retrieves a PubMed entry by PMID using Entrez pmfetch and returns it as a
146
+ # MEDLINE formatted string.
147
+ # ---
148
+ # *Arguments*:
149
+ # * _id_: PubMed ID (required)
150
+ # *Returns*:: MEDLINE formatted String
151
+ def self.pmfetch(id)
152
+ host = "www.ncbi.nlm.nih.gov"
153
+ path = "/entrez/utils/pmfetch.fcgi?tool=bioruby&mode=text&report=medline&db=PubMed&id="
154
+
155
+ http = Bio::Command.new_http(host)
156
+ response, = http.get(path + id.to_s)
157
+ result = response.body
158
+ if result =~ /#{id}\s+Error/
159
+ raise( result )
160
+ else
161
+ result = result.gsub("\r", "\n").squeeze("\n").gsub(/<\/?pre>/, '')
102
162
  return result
103
163
  end
164
+ end
104
165
 
166
+ # Retrieves PubMed entries by PMID using Entrez efetch and returns them as
167
+ # MEDLINE formatted strings. Multiple PubMed IDs can be provided:
168
+ # Bio::PubMed.efetch(123)
169
+ # Bio::PubMed.efetch(123,456,789)
170
+ # Bio::PubMed.efetch([123,456,789])
171
+ # ---
172
+ # *Arguments*:
173
+ # * _ids_: list of PubMed IDs (required)
174
+ # *Returns*:: array of MEDLINE formatted Strings
175
+ def self.efetch(*ids)
176
+ return [] if ids.empty?
177
+
178
+ host = "eutils.ncbi.nlm.nih.gov"
179
+ path = "/entrez/eutils/efetch.fcgi?tool=bioruby&db=pubmed&retmode=text&rettype=medline&id="
180
+
181
+ ids = ids.join(",")
182
+
183
+ http = Bio::Command.new_http(host)
184
+ response, = http.get(path + ids)
185
+ result = response.body
186
+ result = result.split(/\n\n+/)
187
+ return result
105
188
  end
106
189
 
107
- end
190
+ end # PubMed
191
+
192
+ end # Bio
108
193
 
109
194
 
110
195
  if __FILE__ == $0
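Review note on the return values in the hunk above: `esearch` collects IDs with `scan(/<Id>(.*?)<\/Id>/m).flatten`, and `efetch` splits the response body on blank lines. The sketch below applies the same two expressions to made-up response bodies (the XML and MEDLINE text here are illustrative assumptions, not real NCBI output) to show the shape of the resulting arrays.

```ruby
# Made-up response bodies for illustration only; real NCBI output differs.
esearch_body = <<~XML
  <eSearchResult>
    <IdList>
      <Id>10592173</Id>
      <Id>14693808</Id>
    </IdList>
  </eSearchResult>
XML

# scan with one capture group yields an array of one-element arrays,
# so flatten turns it into a flat array of ID strings.
ids = esearch_body.scan(/<Id>(.*?)<\/Id>/m).flatten

efetch_body = "PMID- 10592173\nTI  - First title\n\nPMID- 14693808\nTI  - Second title\n"

# Records are separated by one or more blank lines.
records = efetch_body.split(/\n\n+/)

p ids
puts records.size
```

Both methods therefore hand back arrays of strings, which is why callers in the usage section iterate over the result or pass it to `puts`.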
@@ -124,66 +209,3 @@ if __FILE__ == $0
124
209
  puts Bio::PubMed.efetch("10592173", "14693808")
125
210
 
126
211
  end
127
-
128
- =begin
129
-
130
- = Bio::PubMed
131
-
132
- These class methods access NCBI/PubMed database via HTTP.
133
-
134
- --- Bio::PubMed.esearch(str, options)
135
-
136
- Search keywords in PubMed by E-Utils and returns an array of PubMed IDs.
137
- Options can be a hash containing keys include 'field', 'reldate',
138
- 'mindate', 'maxdate', 'datetype', 'retstart', 'retmax', 'retmode',
139
- and 'rettype' as specified in the following URL:
140
-
141
- ((<URL:http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html#PubMed>))
142
-
143
- Default 'retmax' is 100.
144
-
145
- --- Bio::PubMed.efetch(pmids)
146
-
147
- Returns an array of MEDLINE records. A list of PubMed IDs can be
148
- supplied as following:
149
-
150
- Bio::PubMed.efetch(123)
151
- Bio::PubMed.efetch(123,456,789)
152
- Bio::PubMed.efetch([123,456,789])
153
-
154
- --- Bio::PubMed.query(pmid)
155
-
156
- Retrieve PubMed entry by PMID and returns MEDLINE format string (can
157
- be parsed by the Bio::MEDLINE and can be converted into Bio::Reference
158
- object).
159
-
160
- --- Bio::PubMed.pmfetch(pmid)
161
-
162
- Just another query method (by pmfetch).
163
-
164
- --- Bio::PubMed.search(str)
165
-
166
- Search the PubMed database by given keywords and returns the list of
167
- matched records in MEDLINE format.
168
-
169
-
170
- = For more informations
171
-
172
- * Overview
173
- * ((<URL:http://www.ncbi.nlm.nih.gov/entrez/query/static/overview.html>))
174
- * How to link
175
- * ((<URL:http://www.ncbi.nlm.nih.gov/entrez/query/static/linking.html>))
176
- * MEDLINE format
177
- * ((<URL:http://www.ncbi.nlm.nih.gov/entrez/query/static/help/pmhelp.html#MEDLINEDisplayFormat>))
178
- * Search field descriptions and tags
179
- * ((<URL:http://www.ncbi.nlm.nih.gov/entrez/query/static/help/pmhelp.html#SearchFieldDescriptionsandTags>))
180
- * Entrez utilities index
181
- * ((<URL:http://www.ncbi.nlm.nih.gov/entrez/utils/utils_index.html>))
182
- * PmFetch CGI help
183
- * ((<URL:http://www.ncbi.nlm.nih.gov/entrez/utils/pmfetch_help.html>))
184
- * E-Utilities CGI help
185
- * ((<URL:http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html>))
186
-
187
- =end
188
-
189
-
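Review note on `Bio::PubMed.esearch`: the diff builds the option part of the path with `"#{k}=#{v}"`, so option values are not URL-escaped, and the default `retmax` is written into the caller's hash. A minimal sketch of an alternative follows, assuming a hypothetical helper named `esearch_path` (not part of BioRuby) and using `URI.encode_www_form` from the standard library rather than the `CGI.escape` call seen in the source.

```ruby
require 'uri'

# Hypothetical helper, not part of Bio::PubMed: builds the request path only.
def esearch_path(term, options = {})
  # merge returns a new hash, so the caller's options are left untouched.
  opts = { 'retmax' => 100 }.merge(options)
  "/entrez/eutils/esearch.fcgi?tool=bioruby&db=pubmed&" +
    URI.encode_www_form(opts) +
    "&term=" + URI.encode_www_form_component(term)
end

user_opts = { 'mindate' => '2000/01/01' }
path = esearch_path("genome AND analysis", user_opts)
puts path
```

The path string could then be passed to `http.get` exactly as in the diff; only the way the query string is assembled changes.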