bio 0.7.0 → 0.7.1

@@ -5,22 +5,22 @@ is released.
5
5
 
6
6
  --- Ruby 1.6 series are no longer supported.
7
7
 
8
- We use autoload functionality and many other libraries bundled in
9
- Ruby 1.8.2 (such as SOAP, open-uri, pp etc.) by default.
8
+ We use autoload functionality and many standard (bundled) libraries
9
+ (such as SOAP, open-uri, pp etc.) only in Ruby >1.8.2.
10
10
 
11
11
  --- BioRuby will be loaded about 30 times faster than before.
12
12
 
13
13
  As we changed to use autoload instead of require, time required
14
14
  to start up the BioRuby library became surprisingly shorter.
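
The startup gain described above comes from deferring file loads until a constant is first referenced. A minimal sketch of that mechanism follows, using only Ruby's built-in autoload; the names Demo and Demo::Heavy and the temporary file are assumptions made for illustration and are not part of BioRuby.

```ruby
require 'tmpdir'
require 'fileutils'

# Demo and Demo::Heavy are hypothetical names used only for this illustration.
module Demo; end

dir  = Dir.mktmpdir
path = File.join(dir, 'demo_heavy.rb')
File.write(path, "module Demo\n  class Heavy\n  end\nend\n")

# Register the constant; the file is not read yet.
Demo.autoload(:Heavy, path)
before = Demo.autoload?(:Heavy)  # the registered path while still unloaded

klass = Demo::Heavy              # the first reference triggers the load
after = Demo.autoload?(:Heavy)   # nil once the constant has been loaded

FileUtils.remove_entry(dir)
```

With require, every listed file is read when the library starts; with autoload, a file is read only if its constant is actually used, which is presumably why a library with many sub-modules starts faster.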
15
15
 
16
- Other changes (including exciting BioRuby shell etc.) made in this release
17
- is described in this file.
16
+ Other changes (including newly introduced BioRuby shell etc.) made
17
+ in this series will be described in this file.
18
18
 
19
19
  == New features
20
20
 
21
21
  --- BioRuby shell
22
22
 
23
- Command line user interface for the BioRuby is included.
23
+ A new command line user interface for the BioRuby is now included.
24
24
  You can invoke the shell by
25
25
 
26
26
  % bioruby
@@ -139,13 +139,13 @@ as it was a typo.
139
139
 
140
140
  * lib/bio/db/genbank/common.rb is removed.
141
141
 
142
- Renamed to Bio::NCBIDB::Common for the simple autoload dependency.
142
+ Renamed to Bio::NCBIDB::Common to simplify the autoload dependency.
143
143
 
144
144
  --- Bio::EMBL::Common
145
145
 
146
146
  * lib/bio/db/embl/common.rb is removed.
147
147
 
148
- Renamed to Bio::EMBLDB::Common for the simple autoload dependency.
148
+ Renamed to Bio::EMBLDB::Common to simplify the autoload dependency.
149
149
 
150
150
  --- Bio::KEGG::GENES
151
151
 
@@ -188,16 +188,21 @@ instead of a Hash of a entry ID string.
188
188
 
189
189
  --- Bio::PDB
190
190
 
191
+ In 0.7.0:
192
+
191
193
  * Bio::PDB::Atom is removed. Instead, please use Bio::PDB::Record::ATOM and
192
194
  Bio::PDB::Record::HETATM.
193
195
  * Bio::PDB::FieldDef is removed and Bio::PDB::Record is completely
194
- changed. Now, Record is changed from hash to Struct, and
195
- method_missing is no longer used.
196
+ changed. Now, records are changed from hashes to Struct objects.
197
+ (Note that method_missing is no longer used.)
198
+ * In records, "do_parse" is now automatically called.
199
+ Users don't need to call do_parse explicitly.
200
+ (0.7.0 feature: "inspect" does not call do_parse.)
201
+ (0.7.1 feature: "inspect" calls do_parse.)
196
202
  * In the "MODEL" record, model_serial is changed to serial.
197
203
  * In records, record_type is changed to record_name.
198
- * In any records, record_type is changed to record_name.
199
- * In most records contains real numbers, changed to return
200
- float values instead of strings.
204
+ * In most records that contain real numbers, return values are changed
205
+ to float instead of string.
201
206
  * Pdb_AChar, Pdb_Atom, Pdb_Character, Pdb_Continuation,
202
207
  Pdb_Date, Pdb_IDcode, Pdb_Integer, Pdb_LString, Pdb_List,
203
208
  Pdb_Real, Pdb_Residue_name, Pdb_SList, Pdb_Specification_list,
@@ -205,6 +210,22 @@ instead of a Hash of a entry ID string.
205
210
  Bio::PDB::DataType.
206
211
  * There are more and more changes to be written...
207
212
 
213
+ In 0.7.1:
214
+
215
+ * Heterogens and HETATMs are completely separated from residues and ATOMs.
216
+ HETATMs (Bio::PDB::Record::HETATM objects) are stored in
217
+ Bio::PDB::Heterogen (which inherits Bio::PDB::Residue).
218
+ * Waters (resName=="HOH") are treated as normal heterogens.
219
+ Model#solvents is still available but it will be deprecated.
220
+ * In Bio::PDB::Chain, adding "LIGAND" to the heterogen id is no longer
221
+ available. Instead, please use Chain#get_heterogen_by_id method.
222
+ In addition, Bio::{PDB|PDB::Model::PDB::Chain}#heterogens, #each_heterogen,
223
+ #find_heterogen, Bio::{PDB|PDB::Model::PDB::Chain::PDB::Heterogen}#hetatms,
224
+ #each_hetatm, #find_hetatm methods are added.
225
+ * Bio::PDB#seqres returns Bio::Sequence::NA object if the chain seems to be
226
+ a nucleic acid sequence.
227
+ * There are more and more changes to be written...
228
+
208
229
  === Deleted files
209
230
 
210
231
  : lib/bio/db/genbank.rb
@@ -223,7 +244,7 @@ in your code to
223
244
 
224
245
  require 'bio'
225
246
 
226
- and this change will also speeds up loading time if you only need
247
+ and this change will also speed up loading time even if you only need
227
248
  one of the sub classes under the genbank/ or embl/ directory.
228
249
 
229
250
  : lib/bio/extend.rb
@@ -1,6 +1,6 @@
1
1
  =begin
2
2
 
3
- $Id: Tutorial.rd.ja,v 1.18 2005/12/07 11:40:45 k Exp $
3
+ $Id: Tutorial.rd.ja,v 1.19 2006/01/16 15:23:11 k Exp $
4
4
 
5
5
  Copyright (C) 2001-2003, 2005 KATAYAMA Toshiaki <k@bioruby.org>
6
6
 
@@ -50,6 +50,18 @@ Ruby
50
50
 
51
51
  * ((<URL:http://i.loveruby.net/ja/prog/refe.html>))
52
52
 
53
+ === Installing RubyGems
54
+
55
+ Download the latest version from the RubyGems page.
56
+
57
+ * ((<URL:http://rubyforge.org/projects/rubygems/>))
58
+
59
+ Extract the archive and install it.
60
+
61
+ % tar zxvf rubygems-x.x.x.tar.gz
62
+ % cd rubygems-x.x.x
63
+ % ruby setup.rb
64
+
53
65
  === Installing BioRuby
54
66
 
55
67
  To install BioRuby, from http://bioruby.org/archive/
@@ -64,9 +76,9 @@ BioPerl
64
76
  % ruby install.rb setup
65
77
  # ruby install.rb install
66
78
 
67
- In addition, if RubyGems is available in your environment,
79
+ If RubyGems is available in your environment,
68
80
 
69
- % gems install bio
81
+ % gem install bio
70
82
 
71
83
  alone is enough to install it.
72
84
 
@@ -2048,7 +2060,7 @@ BioFetch
2048
2060
  serv = reg.get_database('genbank')
2049
2061
  entry = serv.get_by_id('AA2CG')
2050
2062
 
2051
- If you want to use(4), in seqdatabase.ini
2063
+ If you want to use (4), in seqdatabase.ini
2052
2064
 
2053
2065
  [genbank]
2054
2066
  protocol=biofetch
data/lib/bio.rb CHANGED
@@ -1,11 +1,11 @@
1
1
  #
2
2
  # = bio.rb - Loading all BioRuby modules
3
3
  #
4
- # Copyright:: Copyright (C) 2001-2005
4
+ # Copyright:: Copyright (C) 2001-2006
5
5
  # Toshiaki Katayama <k@bioruby.org>
6
6
  # License:: LGPL
7
7
  #
8
- # $Id: bio.rb,v 1.58 2005/11/28 04:57:32 k Exp $
8
+ # $Id: bio.rb,v 1.59 2006/01/20 09:57:08 k Exp $
9
9
  #
10
10
  #--
11
11
  #
@@ -28,7 +28,7 @@
28
28
 
29
29
  module Bio
30
30
 
31
- BIORUBY_VERSION = [0, 7, 0].extend(Comparable)
31
+ BIORUBY_VERSION = [0, 7, 1].extend(Comparable)
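
Extending the version array with Comparable is what makes ordered version checks possible, since Array defines <=> but not <, >= or between?. A small sketch of how a caller might use this; SKETCH_VERSION is a hypothetical stand-in so the example runs without BioRuby installed.

```ruby
# SKETCH_VERSION is a hypothetical stand-in for Bio::BIORUBY_VERSION.
SKETCH_VERSION = [0, 7, 1].extend(Comparable)

# Comparable builds these operators on top of Array#<=>,
# which compares arrays element by element.
at_least_070   = SKETCH_VERSION >= [0, 7, 0]
older_than_080 = SKETCH_VERSION < [0, 8, 0]
in_07_series   = SKETCH_VERSION.between?([0, 7, 0], [0, 7, 99])

label = SKETCH_VERSION.join('.')
```

Only the extended object gains the operators; the plain arrays on the right-hand side stay ordinary arrays.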
32
32
 
33
33
  ### Basic data types
34
34
 
@@ -78,19 +78,6 @@ module Bio
78
78
 
79
79
  ## GenBank/RefSeq/DDBJ
80
80
 
81
- # module Bio
82
- # autoload :NCBIDB, 'bio/db'
83
- # class GenBank < NCBIDB
84
- # autoload :Common, 'bio/db/genbank/common'
85
- # include Bio::GenBank::Common
86
-
87
- # module Bio
88
- # autoload :NCBIDB, 'bio/db'
89
- # end
90
- # class Bio::GenBank < Bio::NCBIDB
91
- # autoload :Common, 'bio/db/genbank/common'
92
- # include Bio::GenBank::Common
93
-
94
81
  autoload :GenBank, 'bio/db/genbank/genbank'
95
82
  autoload :GenPept, 'bio/db/genbank/genpept'
96
83
  autoload :RefSeq, 'bio/db/genbank/refseq'
@@ -108,7 +95,6 @@ module Bio
108
95
  autoload :UniProt, 'bio/db/embl/uniprot'
109
96
  autoload :SwissProt, 'bio/db/embl/swissprot'
110
97
 
111
-
112
98
  ## KEGG
113
99
 
114
100
  class KEGG
@@ -254,3 +240,4 @@ module Bio
254
240
  autoload :ColorScheme, 'bio/util/color_scheme'
255
241
 
256
242
  end
243
+
@@ -5,7 +5,7 @@
5
5
  # KATAYAMA Toshiaki <k@bioruby.org>
6
6
  # License:: LGPL
7
7
  #
8
- # $Id: db.rb,v 0.31 2005/12/07 11:23:51 k Exp $
8
+ # $Id: db.rb,v 0.32 2006/01/12 08:58:27 k Exp $
9
9
  #
10
10
  # == On-demand parsing and cache
11
11
  #
@@ -210,29 +210,21 @@ class DB
210
210
  # Returns a String in which successive white spaces are replaced by one
211
211
  # space and stripped.
212
212
  def truncate(str)
213
- if str
214
- str.gsub(/\s+/, ' ').strip
215
- else
216
- ""
217
- end
213
+ str ||= ""
214
+ return str.gsub(/\s+/, ' ').strip
218
215
  end
219
216
 
220
217
  # Returns a tag name of the field as a String.
221
218
  def tag_get(str)
222
- if str
223
- str[0,@tagsize].strip
224
- else
225
- ""
226
- end
219
+ str ||= ""
220
+ return str[0,@tagsize].strip
227
221
  end
228
222
 
229
223
  # Returns a String of the field without a tag name.
230
224
  def tag_cut(str)
231
- if str
232
- str[0,@tagsize] = ''
233
- else
234
- ""
235
- end
225
+ str ||= ""
226
+ str[0,@tagsize] = ''
227
+ return str
236
228
  end
237
229
 
238
230
  # Returns the content of the field as a String like the fetch method.
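
The three helpers rewritten in this hunk now share the same nil handling: a nil argument becomes an empty string first. Note that the removed tag_cut ended with an assignment expression, so it evaluated to '' rather than to the trimmed string; the new version returns the string explicitly. The sketch below shows the behaviour as standalone functions. TAGSIZE = 12 and the sample line are assumptions, and unlike the code in the diff, this tag_cut works on a copy instead of modifying the caller's string.

```ruby
TAGSIZE = 12  # assumed width of the tag column; the real value comes from @tagsize

# Collapse runs of whitespace into single spaces and strip both ends.
def truncate(str)
  str ||= ""
  str.gsub(/\s+/, ' ').strip
end

# Return the tag name found in the first TAGSIZE columns.
def tag_get(str, tagsize = TAGSIZE)
  str ||= ""
  str[0, tagsize].strip
end

# Return the field body without the tag columns.
# Assumption: operate on a copy so the argument is left untouched.
def tag_cut(str, tagsize = TAGSIZE)
  str = str ? str.dup : String.new
  str[0, tagsize] = ''
  str
end

line = "DEFINITION  Homo sapiens   mRNA."
tag  = tag_get(line)
body = truncate(tag_cut(line))
```

Passing nil to any of the three returns an empty string, which matches the intent of the str ||= "" lines in the diff.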
@@ -1,9 +1,14 @@
1
1
  #
2
- # bio/db/pdb/atom.rb - Coordinate and atom class for PDB
2
+ # = bio/db/pdb/atom.rb - Coordinate class for PDB
3
3
  #
4
- # Copyright (C) 2004 Alex Gutteridge <alexg@ebi.ac.uk>
5
- # Copyright (C) 2004 GOTO Naohisa <ngoto@gen-info.osaka-u.ac.jp>
4
+ # Copyright:: Copyright (C) 2004, 2006
5
+ # Alex Gutteridge <alexg@ebi.ac.uk>
6
+ # Naohisa Goto <ng@bioruby.org>
7
+ # License:: LGPL
6
8
  #
9
+ # $Id: atom.rb,v 1.6 2006/01/08 12:59:04 ngoto Exp $
10
+ #
11
+ #--
7
12
  # This library is free software; you can redistribute it and/or
8
13
  # modify it under the terms of the GNU Lesser General Public
9
14
  # License as published by the Free Software Foundation; either
@@ -17,8 +22,17 @@
17
22
  # You should have received a copy of the GNU Lesser General Public
18
23
  # License along with this library; if not, write to the Free Software
19
24
  # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
25
+ #++
26
+ #
27
+ # = Bio::PDB::Coordinate
28
+ #
29
+ # Coordinate class for PDB.
30
+ #
31
+ # = Compatibility Note
32
+ #
33
+ # From bioruby 0.7.0, the Bio::PDB::Atom class is no longer available.
34
+ # Please use Bio::PDB::Record::ATOM and Bio::PDB::Record::HETATM instead.
20
35
  #
21
- # $Id: atom.rb,v 1.5 2005/12/18 17:33:32 ngoto Exp $
22
36
 
23
37
  require 'matrix'
24
38
  require 'bio/db/pdb'
@@ -26,31 +40,50 @@ require 'bio/db/pdb'
26
40
  module Bio
27
41
  class PDB
28
42
 
43
+ # Bio::PDB::Coordinate is a class to store a 3D coordinate.
44
+ # It inherits from Vector (in the matrix library bundled with Ruby).
45
+ #
29
46
  class Coordinate < Vector
47
+ # same as Vector.[x,y,z]
30
48
  def self.[](x,y,z)
31
49
  super
32
50
  end
33
51
 
52
+ # same as Vector.elements
34
53
  def self.elements(array, *a)
35
54
  raise 'Size of given array must be 3' if array.size != 3
36
55
  super
37
56
  end
38
-
57
+
58
+ # x
39
59
  def x; self[0]; end
60
+ # y
40
61
  def y; self[1]; end
62
+ # z
41
63
  def z; self[2]; end
64
+ # x=(n)
42
65
  def x=(n); self[0]=n; end
66
+ # y=(n)
43
67
  def y=(n); self[1]=n; end
68
+ # z=(n)
44
69
  def z=(n); self[2]=n; end
45
70
 
71
+ # Implicit conversion to an array.
72
+ #
73
+ # Note that this method may be deprecated in the future.
74
+ #
75
+ #--
46
76
  # Defining 'to_ary' means that objects of the class are
47
77
  # implicitly regarded as an array.
78
+ #++
48
79
  def to_ary; self.to_a; end
49
80
 
81
+ # returns self.
50
82
  def xyz; self; end
51
-
83
+
84
+ # distance between self and <em>object2</em>.
52
85
  def distance(object2)
53
- Utils::to_xyz(object2)
86
+ Utils::convert_to_xyz(object2)
54
87
  (self - object2).r
55
88
  end
56
89
  end #class Coordinate
@@ -1,8 +1,14 @@
1
1
  #
2
- # bio/db/pdb/chain.rb - chain class for PDB
2
+ # = bio/db/pdb/chain.rb - chain class for PDB
3
3
  #
4
- # Copyright (C) 2004 Alex Gutteridge <alexg@ebi.ac.uk>
4
+ # Copyright:: Copyright (C) 2004, 2006
5
+ # Alex Gutteridge <alexg@ebi.ac.uk>
6
+ # Naohisa Goto <ng@bioruby.org>
7
+ # License:: LGPL
8
+ #
9
+ # $Id: chain.rb,v 1.6 2006/01/20 13:54:08 ngoto Exp $
5
10
  #
11
+ #--
6
12
  # This library is free software; you can redistribute it and/or
7
13
  # modify it under the terms of the GNU Lesser General Public
8
14
  # License as published by the Free Software Foundation; either
@@ -16,8 +22,12 @@
16
22
  # You should have received a copy of the GNU Lesser General Public
17
23
  # License along with this library; if not, write to the Free Software
18
24
  # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
25
+ #++
26
+ #
27
+ # = Bio::PDB::Chain
28
+ #
29
+ # Please refer to Bio::PDB::Chain.
19
30
  #
20
- # $Id: chain.rb,v 1.2 2005/09/26 13:00:08 k Exp $
21
31
 
22
32
  require 'bio/db/pdb'
23
33
 
@@ -25,93 +35,182 @@ module Bio
25
35
 
26
36
  class PDB
27
37
 
38
+ # Bio::PDB::Chain is a class to store a chain.
39
+ #
40
+ # The object would contain some residues (Bio::PDB::Residue objects)
41
+ # and some heterogens (Bio::PDB::Heterogen objects).
42
+ #
28
43
  class Chain
29
44
 
30
45
  include Utils
31
46
  include AtomFinder
32
47
  include ResidueFinder
48
+
49
+ include HetatmFinder
50
+ include HeterogenFinder
51
+
33
52
  include Enumerable
34
53
  include Comparable
35
-
36
- attr_reader :id, :model
37
- attr_writer :id
38
-
54
+
55
+ # Creates a new chain object.
39
56
  def initialize(id = nil, model = nil)
40
57
 
41
- @id = id
58
+ @chain_id = id
42
59
 
43
60
  @model = model
44
61
 
45
- @residues = Array.new
46
- @ligands = Array.new
47
-
62
+ @residues = []
63
+ @residues_hash = {}
64
+ @heterogens = []
65
+ @heterogens_hash = {}
48
66
  end
67
+
68
+ # Identifier of this chain
69
+ attr_accessor :chain_id
70
+ # alias
71
+ alias id chain_id
72
+
73
+ # the model to which this chain belongs.
74
+ attr_reader :model
75
+
76
+ # residues in this chain
77
+ attr_reader :residues
78
+
79
+ # heterogens in this chain
80
+ attr_reader :heterogens
49
81
 
50
- #Keyed access to residues based on ids
82
+ # get the residue by id
83
+ def get_residue_by_id(key)
84
+ #@residues.find { |r| r.residue_id == key }
85
+ @residues_hash[key]
86
+ end
87
+
88
+ # get the residue by id.
89
+ #
90
+ # Compatibility Note: HETATMs can no longer be found with this method.
91
+ # Adding "LIGAND" to the id is no longer supported.
92
+ # To get heterogens, you must use <code>get_heterogen_by_id</code>.
51
93
  def [](key)
52
- #If you want to find HETATMS you need to add LIGAND to the id
53
- if key.to_s[0,6] == 'LIGAND'
54
- residue = @ligands.find{ |residue| key.to_s == residue.id }
55
- else
56
- residue = @residues.find{ |residue| key.to_s == residue.id }
57
- end
94
+ get_residue_by_id(key)
95
+ end
96
+
97
+ # get the heterogen (ligand) by id
98
+ def get_heterogen_by_id(key)
99
+ #@heterogens.find { |r| r.residue_id == key }
100
+ @heterogens_hash[key]
58
101
  end
59
102
 
60
103
  #Add a residue to this chain
61
104
  def addResidue(residue)
62
- raise "Expecting a Bio::PDB::Residue" if not residue.is_a? Bio::PDB::Residue
105
+ raise "Expecting a Bio::PDB::Residue" unless residue.is_a? Bio::PDB::Residue
63
106
  @residues.push(residue)
107
+ if @residues_hash[residue.residue_id] then
108
+ $stderr.puts "Warning: residue_id #{residue.residue_id.inspect} is already used" if $VERBOSE
109
+ else
110
+ @residues_hash[residue.residue_id] = residue
111
+ end
64
112
  self
65
113
  end
66
114
 
67
- #Add a ligand to this chain
68
- def addLigand(residue)
69
- raise "Expecting a Bio::PDB::Residue" if not residue.is_a? Bio::PDB::Residue
70
- @ligands.push(residue)
115
+ #Add a heterogen (ligand) to this chain
116
+ def addLigand(ligand)
117
+ raise "Expecting a Bio::PDB::Residue" unless ligand.is_a? Bio::PDB::Residue
118
+ @heterogens.push(ligand)
119
+ if @heterogens_hash[ligand.residue_id] then
120
+ $stderr.puts "Warning: heterogen_id (residue_id) #{ligand.residue_id.inspect} is already used" if $VERBOSE
121
+ else
122
+ @heterogens_hash[ligand.residue_id] = ligand
123
+ end
71
124
  self
72
125
  end
73
-
74
- #Residue iterator
75
- def each
76
- @residues.each{ |residue| yield residue }
126
+
127
+ # rehash residues hash
128
+ def rehash_residues
129
+ begin
130
+ residues_bak = @residues
131
+ residues_hash_bak = @residues_hash
132
+ @residues = []
133
+ @residues_hash = {}
134
+ residues_bak.each do |residue|
135
+ self.addResidue(residue)
136
+ end
137
+ rescue RuntimeError
138
+ @residues = residues_bak
139
+ @residues_hash = residues_hash_bak
140
+ raise
141
+ end
142
+ self
143
+ end
144
+
145
+ # rehash heterogens hash
146
+ def rehash_heterogens
147
+ begin
148
+ heterogens_bak = @heterogens
149
+ heterogens_hash_bak = @heterogens_hash
150
+ @heterogens = []
151
+ @heterogens_hash = {}
152
+ heterogens_bak.each do |heterogen|
153
+ self.addLigand(heterogen)
154
+ end
155
+ rescue RuntimeError
156
+ @heterogens = heterogens_bak
157
+ @heterogens_hash = heterogens_hash_bak
158
+ raise
159
+ end
160
+ self
161
+ end
162
+
163
+ # rehash residues hash and heterogens hash
164
+ def rehash
165
+ rehash_residues
166
+ rehash_heterogens
167
+ end
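
The new Chain keeps each collection twice: an array that preserves order and a hash that gives direct lookup by id, with the first entry winning when an id is duplicated. The rehash methods rebuild the hash from the array, which matters when an id changes after insertion. A minimal sketch of that pattern follows; Entry and IndexedList are hypothetical stand-ins for Bio::PDB::Residue and Bio::PDB::Chain, and the string ids are an assumption.

```ruby
# Entry is a hypothetical stand-in for Bio::PDB::Residue.
Entry = Struct.new(:residue_id, :name)

# IndexedList mimics the array-plus-hash bookkeeping shown in the diff.
class IndexedList
  attr_reader :items

  def initialize
    @items = []
    @by_id = {}
  end

  # Append the item; index it only if its id is not taken yet.
  def add(item)
    @items.push(item)
    if @by_id.key?(item.residue_id)
      warn "Warning: residue_id #{item.residue_id.inspect} is already used" if $VERBOSE
    else
      @by_id[item.residue_id] = item
    end
    self
  end

  def get_by_id(key)
    @by_id[key]
  end

  # Rebuild the index from the ordered list.
  def rehash
    old = @items
    @items = []
    @by_id = {}
    old.each { |item| add(item) }
    self
  end
end

list  = IndexedList.new
first = Entry.new('10', 'ALA')
dup   = Entry.new('10', 'GLY')
list.add(first).add(dup).add(Entry.new('11', 'SER'))

first.residue_id = '12'           # id changed after insertion
stale = list.get_by_id('10')      # the index still points at first
list.rehash
fresh = list.get_by_id('10')      # now resolves to dup
moved = list.get_by_id('12')      # first is reachable under its new id
```

Because the lookup is a plain hash access, the key must match the stored id exactly; the removed code compared key.to_s against the id instead.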
168
+
169
+ # Iterates over each residue
170
+ def each(&x) #:yields: residue
171
+ @residues.each(&x)
77
172
  end
78
173
  #Alias to override ResidueFinder#each_residue
79
174
  alias each_residue each
175
+
176
+ # Iterates over each hetero-compound
177
+ def each_heterogen(&x) #:yields: heterogen
178
+ @heterogens.each(&x)
179
+ end
80
180
 
81
- #Sort based on chain id
181
+ # Operator aimed to sort based on chain id
82
182
  def <=>(other)
83
- return @id <=> other.id
183
+ return @chain_id <=> other.chain_id
84
184
  end
85
185
 
86
- #Stringifies each residue
186
+ # Stringifies each residue
87
187
  def to_s
88
- string = ""
89
- @residues.each{ |residue| string << residue.to_s }
90
- string = string << "TER\n"
91
- return string
188
+ @residues.join('') + "TER\n" + @heterogens.join('')
92
189
  end
93
190
 
94
- def atom_seq
95
- string = ""
96
- last_residue_num = nil
97
- @residues.each{ |residue|
98
- if last_residue_num and
99
- (residue.resSeq.to_i - last_residue_num).abs > 1
100
- (residue.resSeq.to_i - last_residue_num).abs.times{ string << 'X' }
101
- end
102
- tlc = residue.resName.capitalize
103
- olc = AminoAcid.names.invert[tlc]
104
- if !olc
105
- olc = 'X'
191
+ # gets an amino acid sequence of this chain from ATOM records
192
+ def aaseq
193
+ unless defined? @aaseq
194
+ string = ""
195
+ last_residue_num = nil
196
+ @residues.each do |residue|
197
+ if last_residue_num and
198
+ (x = (residue.resSeq.to_i - last_residue_num).abs) > 1 then
199
+ x.times { string << 'X' }
200
+ end
201
+ tlc = residue.resName.capitalize
202
+ olc = (Bio::AminoAcid.three2one(tlc) or 'X')
203
+ string << olc
106
204
  end
107
- string << olc
108
- }
109
- Bio::Sequence::AA.new(string)
110
-
205
+ @aaseq = Bio::Sequence::AA.new(string)
206
+ end
207
+ @aaseq
111
208
  end
209
+ # for backward compatibility
210
+ alias atom_seq aaseq
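
The aaseq method above converts three-letter residue names to one-letter codes, is meant to pad numbering gaps with 'X', and caches the result. As shown in the diff, last_residue_num is never reassigned inside the loop, so the gap branch would not run. The sketch below assumes the intent was to track the previous residue number and to insert one 'X' per missing residue. THREE2ONE, Residue and ChainSketch are hypothetical stand-ins, and the lookup table covers only a few residues.

```ruby
# Hypothetical stand-in for Bio::AminoAcid.three2one; deliberately incomplete.
THREE2ONE = { 'Ala' => 'A', 'Gly' => 'G', 'Ser' => 'S', 'Trp' => 'W' }.freeze

# Hypothetical stand-in for Bio::PDB::Residue.
Residue = Struct.new(:resName, :resSeq)

class ChainSketch
  def initialize(residues)
    @residues = residues
  end

  # Build the sequence once and reuse it on later calls.
  def aaseq
    return @aaseq if defined?(@aaseq)
    string = String.new
    last = nil
    @residues.each do |residue|
      num = residue.resSeq.to_i
      # Assumption: pad with one 'X' per residue missing from the numbering.
      if last && (gap = (num - last).abs) > 1
        string << 'X' * (gap - 1)
      end
      string << (THREE2ONE[residue.resName.capitalize] || 'X')
      last = num
    end
    @aaseq = string
  end
  # kept for backward compatibility, as in the diff
  alias atom_seq aaseq
end

chain = ChainSketch.new([
  Residue.new('ALA', 1), Residue.new('GLY', 2),
  Residue.new('SER', 5), Residue.new('XYZ', 6)
])
seq = chain.aaseq
```

Caching means later changes to the residue list are not reflected in aaseq, a trade-off that also applies to the memoized version in the diff.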
112
211
 
113
- end
212
+ end #class Chain
114
213
 
115
- end
214
+ end #class PDB
116
215
 
117
- end
216
+ end #module Bio