codon_table_parser 0.2.0
- data/LICENSE +20 -0
- data/README.md +290 -0
- data/Rakefile +25 -0
- data/lib/codon_table_parser/version.rb +3 -0
- data/lib/codon_table_parser.rb +189 -0
- metadata +66 -0
data/LICENSE
ADDED
@@ -0,0 +1,20 @@
|
|
1
|
+
Copyright (c) 2011 Stefan Rohlfing
|
2
|
+
|
3
|
+
Permission is hereby granted, free of charge, to any person obtaining
|
4
|
+
a copy of this software and associated documentation files (the
|
5
|
+
"Software"), to deal in the Software without restriction, including
|
6
|
+
without limitation the rights to use, copy, modify, merge, publish,
|
7
|
+
distribute, sublicense, and/or sell copies of the Software, and to
|
8
|
+
permit persons to whom the Software is furnished to do so, subject to
|
9
|
+
the following conditions:
|
10
|
+
|
11
|
+
The above copyright notice and this permission notice shall be
|
12
|
+
included in all copies or substantial portions of the Software.
|
13
|
+
|
14
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
|
15
|
+
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
|
16
|
+
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
17
|
+
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
|
18
|
+
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
|
19
|
+
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
|
20
|
+
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
data/README.md
ADDED
@@ -0,0 +1,290 @@
|
|
1
|
+
# CodonTableParser
|
2
|
+
|
3
|
+
Parses the [NCBI genetic code](ftp://ftp.ncbi.nih.gov/entrez/misc/data/gc.prt) table with a multiline Regex, generating hash maps of each species' name, start codons, stop codons and codon table.
|
4
|
+
The output can be easily customized and used to update the respective constants of BioRuby's [CodonTable](https://github.com/bioruby/bioruby/blob/master/lib/bio/data/codontable.rb) class whenever the original data changes.
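
As a rough illustration of the multiline-regex idea, the sketch below scans a short, made-up excerpt laid out like `gc.prt`. The helper name `parse_entries`, the simplified pattern and the truncated amino acid strings are assumptions for this example; the pattern used by the gem itself is more detailed.

``` ruby
# Abbreviated, made-up excerpt in the layout of gc.prt
# (the real ncbieaa/sncbieaa strings have 64 characters each).
sample = <<-EOS
 {
  name "Standard" ,
  name "SGC0" ,
  id 1 ,
  ncbieaa  "FFLL",
  sncbieaa "---M"
 },
 {
  name "Vertebrate Mitochondrial" ,
  name "SGC1" ,
  id 2 ,
  ncbieaa  "FFLL",
  sncbieaa "--MM"
 }
EOS

# Hypothetical helper: the /m flag lets '.' match newlines, so a single
# pattern can span one whole entry; the lazy '.*?' keeps entries separate.
def parse_entries(text)
  pattern = /name "(.*?)".*?id (\d+).*?ncbieaa\s+"(.*?)".*?sncbieaa\s+"(.*?)"/m
  text.scan(pattern).map do |name, id, aa, starts|
    {:id => id.to_i, :name => name, :ncbieaa => aa, :sncbieaa => starts}
  end
end

entries = parse_entries(sample)
entries.map { |e| [e[:id], e[:name]] }
# => [[1, "Standard"], [2, "Vertebrate Mitochondrial"]]
```

Each match captures the first (long) name of an entry; the optional short name line is simply skipped by the lazy `.*?` in this sketch.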
|
5
|
+
|
6
|
+
## Installation
|
7
|
+
|
8
|
+
``` bash
|
9
|
+
gem install codon_table_parser
|
10
|
+
|
11
|
+
```
|
12
|
+
|
13
|
+
## Usage
|
14
|
+
|
15
|
+
Without any parameters, the genetic code file is downloaded directly from the NCBI web site:
|
16
|
+
|
17
|
+
``` ruby
|
18
|
+
parser = CodonTableParser.new
|
19
|
+
```
|
20
|
+
|
21
|
+
Alternatively, the genetic code file can be loaded from a local path:
|
22
|
+
|
23
|
+
``` ruby
|
24
|
+
file = 'path/to/genetic_code.txt'
|
25
|
+
parser = CodonTableParser.new(file)
|
26
|
+
```
|
27
|
+
|
28
|
+
The first line of the file is checked to verify that the file really is the NCBI genetic code table. If it is not, an exception is raised:
|
29
|
+
|
30
|
+
``` ruby
|
31
|
+
wrong_content = 'path/to/wrong_content.txt'
|
32
|
+
parser = CodonTableParser.new(wrong_content)
|
33
|
+
# Exception: This is not the NCBI genetic code table
|
34
|
+
```
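
The check can be pictured roughly as follows. This is a minimal sketch, assuming the file starts with a banner line of the form `--*****...`; the method name `genetic_code_table?` is hypothetical and not part of the gem's API.

``` ruby
require 'stringio'

# Hypothetical helper: returns true when the first line looks like
# the banner of the genetic code file ("--" followed by asterisks).
def genetic_code_table?(io)
  first_line = io.gets.to_s          # nil (empty input) becomes ""
  !(first_line =~ /--\*+/).nil?
end

genetic_code_table?(StringIO.new("--**************\n--  more header text\n")) # => true
genetic_code_table?(StringIO.new("just some text\n"))                        # => false
```

Any IO-like object works here, so the same check applies to a local file and to a downloaded one.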
|
35
|
+
|
36
|
+
### Instance Methods
|
37
|
+
|
38
|
+
The following instance methods are available:
|
39
|
+
|
40
|
+
* CodonTableParser#definitions
|
41
|
+
* CodonTableParser#starts
|
42
|
+
* CodonTableParser#stops
|
43
|
+
* CodonTableParser#tables
|
44
|
+
* CodonTableParser#bundle
|
45
|
+
|
46
|
+
Every instance method can take a *:range* option that specifies the ids of the species to be considered in the output.
|
47
|
+
A range is specified as an array of integers, Ranges or both.
|
48
|
+
Example:
|
49
|
+
|
50
|
+
``` ruby
|
51
|
+
:range => [(1..3), 5, 9] # converted internally to [1, 2, 3, 5, 9]
|
52
|
+
|
53
|
+
```
|
54
|
+
Ids not present in the original data are ignored.
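
The range handling can be sketched in a few lines. The helper names `expand_range` and `select_ids` are made up for this example, and the three-entry hash stands in for the parsed data.

``` ruby
# Hypothetical helper: flatten integers and Ranges into a sorted list of unique ids.
def expand_range(range)
  range.flat_map { |value| value.is_a?(Range) ? value.to_a : [value] }.uniq.sort
end

# Hypothetical helper: keep only the entries whose id is listed in the range.
# Ids that are listed but missing from the data simply never match.
def select_ids(data, range = nil)
  return data unless range
  ids = expand_range(range)
  data.select { |id, _| ids.include?(id) }
end

names = {1 => "Standard", 2 => "Vertebrate Mitochondrial", 5 => "Invertebrate Mitochondrial"}

expand_range([(1..3), 5, 9])   # => [1, 2, 3, 5, 9]
select_ids(names, [(1..3), 9]) # => {1=>"Standard", 2=>"Vertebrate Mitochondrial"}
```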
|
55
|
+
Besides the *:range* option, several methods also take other options as demonstrated below.
|
56
|
+
|
57
|
+
#### CodonTableParser#definitions
|
58
|
+
|
59
|
+
``` ruby
|
60
|
+
|
61
|
+
parser = CodonTableParser.new
|
62
|
+
|
63
|
+
# Return default hash map of names
|
64
|
+
definitions = parser.definitions
|
65
|
+
|
66
|
+
definitions
|
67
|
+
# {1=>"Standard",
|
68
|
+
# 2=>"Vertebrate Mitochondrial",
|
69
|
+
# 3=>"Yeast Mitochondrial",
|
70
|
+
# 4=>"Mold Mitochondrial; Protozoan Mitochondrial; Coelenterate Mitochondrial; Mycoplasma; Spiroplasma",
|
71
|
+
# 5=>"Invertebrate Mitochondrial",
|
72
|
+
# 6=>"Ciliate Nuclear; Dasycladacean Nuclear; Hexamita Nuclear",
|
73
|
+
# 9=>"Echinoderm Mitochondrial; Flatworm Mitochondrial",
|
74
|
+
# 10=>"Euplotid Nuclear",
|
75
|
+
# 11=>"Bacterial and Plant Plastid",
|
76
|
+
# 12=>"Alternative Yeast Nuclear",
|
77
|
+
# 13=>"Ascidian Mitochondrial",
|
78
|
+
# 14=>"Alternative Flatworm Mitochondrial",
|
79
|
+
# 15=>"Blepharisma Macronuclear",
|
80
|
+
# 16=>"Chlorophycean Mitochondrial",
|
81
|
+
# 21=>"Trematode Mitochondrial",
|
82
|
+
# 22=>"Scenedesmus obliquus Mitochondrial",
|
83
|
+
# 23=>"Thraustochytrium Mitochondrial"}
|
84
|
+
|
85
|
+
# Return the names for the ids specified in :range
|
86
|
+
definitions = parser.definitions :range => [(1..3), 5, 9]
|
87
|
+
|
88
|
+
# Return default hash map with custom names for the ids 1 and 3
|
89
|
+
definitions = parser.definitions :names => {1 => "Standard (Eukaryote)",
|
90
|
+
3 => "Yeast Mitochondorial"}
|
91
|
+
definitions[1]
|
92
|
+
# "Standard (Eukaryote)"
|
93
|
+
definitions[3]
|
94
|
+
# "Yeast Mitochondorial"
|
95
|
+
|
96
|
+
# Return the names for the ids specified in :range, with custom names for the ids 1 and 3
|
97
|
+
parser.definitions :range => [(1..3), 5, 9],
|
98
|
+
:names => {1 => "Standard (Eukaryote)",
|
99
|
+
3 => "Yeast Mitochondorial"}
|
100
|
+
|
101
|
+
```
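
The effect of the *:names* option can be sketched as a per-id override of the default names. The method `with_custom_names` and the shortened defaults hash are assumptions for illustration only.

``` ruby
# Hypothetical helper: replace a default name only where a custom name
# is given for the same id; unknown ids in `names` have no effect.
def with_custom_names(defaults, names = {})
  Hash[defaults.map { |id, name| [id, names.fetch(id, name)] }]
end

defaults = {1 => "Standard",
            2 => "Vertebrate Mitochondrial",
            3 => "Yeast Mitochondrial"}

custom = with_custom_names(defaults, {1 => "Standard (Eukaryote)", 99 => "Unknown"})

custom[1]       # => "Standard (Eukaryote)"
custom[2]       # => "Vertebrate Mitochondrial"
custom.key?(99) # => false
```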
|
102
|
+
|
103
|
+
#### CodonTableParser#starts
|
104
|
+
|
105
|
+
``` ruby
|
106
|
+
|
107
|
+
parser = CodonTableParser.new
|
108
|
+
|
109
|
+
# Return default hash map of start codons
|
110
|
+
start_codons = parser.starts
|
111
|
+
|
112
|
+
start_codons
|
113
|
+
# {1=>["ttg", "ctg", "atg"],
|
114
|
+
# 2=>["att", "atc", "ata", "atg", "gtg"],
|
115
|
+
# 3=>["ata", "atg"],
|
116
|
+
# 4=>["tta", "ttg", "ctg", "att", "atc", "ata", "atg", "gtg"],
|
117
|
+
# 5=>["ttg", "att", "atc", "ata", "atg", "gtg"],
|
118
|
+
# 6=>["atg"],
|
119
|
+
# 9=>["atg", "gtg"],
|
120
|
+
# 10=>["atg"],
|
121
|
+
# 11=>["ttg", "ctg", "att", "atc", "ata", "atg", "gtg"],
|
122
|
+
# 12=>["ctg", "atg"],
|
123
|
+
# 13=>["ttg", "ata", "atg", "gtg"],
|
124
|
+
# 14=>["atg"],
|
125
|
+
# 15=>["atg"],
|
126
|
+
# 16=>["atg"],
|
127
|
+
# 21=>["atg", "gtg"],
|
128
|
+
# 22=>["atg"],
|
129
|
+
# 23=>["att", "atg", "gtg"]}
|
130
|
+
|
131
|
+
# Return the start codons for the ids specified in :range
|
132
|
+
start_codons = parser.starts :range => [(1..3), 5, 9]
|
133
|
+
|
134
|
+
# Add or remove start codons as necessary
|
135
|
+
start_codons = parser.starts 1 => {:add => ['gtg']},
|
136
|
+
13 => {:remove => ['ttg', 'ata', 'gtg']}
|
137
|
+
|
138
|
+
start_codons[1]
|
139
|
+
# ["ttg", "ctg", "atg", "gtg"]
|
140
|
+
start_codons[13]
|
141
|
+
# ["atg"]
|
142
|
+
|
143
|
+
# Alternative syntax, normally only used in the bundle method described below
|
144
|
+
start_codons = parser.starts :starts => {1 => {:add => ['gtg']},
|
145
|
+
13 => {:remove => ['ttg', 'ata', 'gtg']}}
|
146
|
+
|
147
|
+
# Return the start codons for the ids specified with :range, add or remove codons from specific ids
|
148
|
+
start_codons = parser.starts :range => [(1..3), 13],
|
149
|
+
1 => {:add => ['gtg']},
|
150
|
+
13 => {:remove => ['ttg', 'ata', 'gtg']}
|
151
|
+
|
152
|
+
```
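
Where the default start codons come from can be sketched as follows: each table carries a 64-character string in which `M` marks a start codon, and the position of each marker maps to a codon built from the three base lines. The base lines and the marker string below are rebuilt programmatically to keep the example short, and `marked_codons` is a hypothetical helper name.

``` ruby
# Base lines in the usual T, C, A, G order (assumed to match the data file).
bases = %w[T C A G]
base1 = bases.flat_map { |b| [b] * 16 }     # 16 x T, 16 x C, 16 x A, 16 x G
base2 = bases.flat_map { |b| [b] * 4 } * 4  # TTTTCCCCAAAAGGGG, four times
base3 = bases * 16                          # TCAG, sixteen times

# Combine the three lines column by column into 64 lowercase codons.
codons = base1.zip(base2, base3).map { |triplet| triplet.join.downcase }

# Marker string for the standard table: 'M' at the positions of ttg, ctg and atg.
sncbieaa = '-' * 64
[3, 19, 35].each { |i| sncbieaa[i] = 'M' }

# Hypothetical helper: collect the codons whose position carries the given marker.
def marked_codons(codons, markers, symbol)
  codons.each_index.select { |i| markers[i] == symbol }.map { |i| codons[i] }
end

marked_codons(codons, sncbieaa, 'M') # => ["ttg", "ctg", "atg"]
```

The same lookup with `'*'` on the amino acid string yields the stop codons.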
|
153
|
+
|
154
|
+
#### CodonTableParser#stops
|
155
|
+
|
156
|
+
``` ruby
|
157
|
+
|
158
|
+
parser = CodonTableParser.new
|
159
|
+
|
160
|
+
# Return the default hash map of stop codons
|
161
|
+
stop_codons = parser.stops
|
162
|
+
|
163
|
+
stop_codons
|
164
|
+
# {1=>["taa", "tag", "tga"],
|
165
|
+
# 2=>["taa", "tag", "aga", "agg"],
|
166
|
+
# 3=>["taa", "tag"],
|
167
|
+
# 4=>["taa", "tag"],
|
168
|
+
# 5=>["taa", "tag"],
|
169
|
+
# 6=>["tga"],
|
170
|
+
# 9=>["taa", "tag"],
|
171
|
+
# 10=>["taa", "tag"],
|
172
|
+
# 11=>["taa", "tag", "tga"],
|
173
|
+
# 12=>["taa", "tag", "tga"],
|
174
|
+
# 13=>["taa", "tag"],
|
175
|
+
# 14=>["tag"],
|
176
|
+
# 15=>["taa", "tga"],
|
177
|
+
# 16=>["taa", "tga"],
|
178
|
+
# 21=>["taa", "tag"],
|
179
|
+
# 22=>["tca", "taa", "tga"],
|
180
|
+
# 23=>["tta", "taa", "tag", "tga"]}
|
181
|
+
|
182
|
+
|
183
|
+
# Return the stop codons for the ids specified with :range
|
184
|
+
stop_codons = parser.stops :range => [(1..3), 5, 9]
|
185
|
+
|
186
|
+
# Add or remove stop codons as necessary
|
187
|
+
|
188
|
+
stop_codons = parser.stops 1 => {:add => ['gtg'], :remove => ['taa']},
|
189
|
+
13 => {:add => ['gcc'], :remove => ['taa', 'tag']}
|
190
|
+
|
191
|
+
stop_codons[1]
|
192
|
+
# ["tag", "tga", "gtg"]
|
193
|
+
stop_codons[13]
|
194
|
+
# ["gcc"]
|
195
|
+
|
196
|
+
# Alternative syntax, normally only used in the bundle method described below
|
197
|
+
stop_codons = parser.stops :stops => {1 => {:add => ['gtg'], :remove => ['taa']},
|
198
|
+
13 => {:add => ['gcc'], :remove => ['taa', 'tag']}}
|
199
|
+
|
200
|
+
|
201
|
+
# Return the stop codons for the ids specified with :range, add or remove codons from specific ids
|
202
|
+
stop_codons = parser.stops :range => [(1..3), 5, 13],
|
203
|
+
1 => {:add => ['gtg'], :remove => ['taa']},
|
204
|
+
13 => {:add => ['gcc'], :remove => ['taa', 'tag']}
|
205
|
+
|
206
|
+
```
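
The *:add* and *:remove* options behave like a set union followed by a set difference. A minimal sketch, using a hypothetical `customize` helper and the default stop codons of ids 1 and 13 shown above:

``` ruby
# Hypothetical helper: add codons first (without duplicates), then remove codons.
# Unlike an in-place delete, this leaves the array passed in untouched.
def customize(codons, opt = nil)
  return codons unless opt
  codons |= opt[:add] if opt[:add]
  codons -= opt[:remove] if opt[:remove]
  codons
end

customize(%w[taa tag tga], {:add => ['gtg'], :remove => ['taa']})
# => ["tag", "tga", "gtg"]
customize(%w[taa tag], {:add => ['gcc'], :remove => %w[taa tag]})
# => ["gcc"]
```

Because additions are applied before removals, a codon listed under both keys ends up removed.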
|
207
|
+
|
208
|
+
#### CodonTableParser#tables
|
209
|
+
|
210
|
+
``` ruby
|
211
|
+
|
212
|
+
parser = CodonTableParser.new
|
213
|
+
|
214
|
+
# Return codon tables of all species
|
215
|
+
codon_tables = parser.tables
|
216
|
+
|
217
|
+
codon_tables
|
218
|
+
# {
|
219
|
+
# 1 => {
|
220
|
+
# 'ttt' => 'F', 'tct' => 'S', 'tat' => 'Y', 'tgt' => 'C',
|
221
|
+
# 'ttc' => 'F', 'tcc' => 'S', 'tac' => 'Y', 'tgc' => 'C',
|
222
|
+
# 'tta' => 'L', 'tca' => 'S', 'taa' => '*', 'tga' => '*',
|
223
|
+
# 'ttg' => 'L', 'tcg' => 'S', 'tag' => '*', 'tgg' => 'W',
|
224
|
+
#
|
225
|
+
# 'ctt' => 'L', 'cct' => 'P', 'cat' => 'H', 'cgt' => 'R',
|
226
|
+
# 'ctc' => 'L', 'ccc' => 'P', 'cac' => 'H', 'cgc' => 'R',
|
227
|
+
# 'cta' => 'L', 'cca' => 'P', 'caa' => 'Q', 'cga' => 'R',
|
228
|
+
# 'ctg' => 'L', 'ccg' => 'P', 'cag' => 'Q', 'cgg' => 'R',
|
229
|
+
#
|
230
|
+
# 'att' => 'I', 'act' => 'T', 'aat' => 'N', 'agt' => 'S',
|
231
|
+
# 'atc' => 'I', 'acc' => 'T', 'aac' => 'N', 'agc' => 'S',
|
232
|
+
# 'ata' => 'I', 'aca' => 'T', 'aaa' => 'K', 'aga' => 'R',
|
233
|
+
# 'atg' => 'M', 'acg' => 'T', 'aag' => 'K', 'agg' => 'R',
|
234
|
+
#
|
235
|
+
# 'gtt' => 'V', 'gct' => 'A', 'gat' => 'D', 'ggt' => 'G',
|
236
|
+
# 'gtc' => 'V', 'gcc' => 'A', 'gac' => 'D', 'ggc' => 'G',
|
237
|
+
# 'gta' => 'V', 'gca' => 'A', 'gaa' => 'E', 'gga' => 'G',
|
238
|
+
# 'gtg' => 'V', 'gcg' => 'A', 'gag' => 'E', 'ggg' => 'G',
|
239
|
+
# },
|
240
|
+
# 2 => { ... },
|
241
|
+
# 3 => { ... },
|
242
|
+
# ...
|
243
|
+
# 23 => { ... }
|
244
|
+
# }
|
245
|
+
|
246
|
+
# Return the codon tables for the ids specified with :range
|
247
|
+
codon_tables = parser.tables :range => [(1..3), 5, 9, 23]
|
248
|
+
|
249
|
+
```
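
A codon table like the one shown for id 1 can be rebuilt by pairing 64 codons with the 64-character amino acid string. The sketch below uses `Array#product` as a shortcut for generating the codons, which assumes the T, C, A, G ordering of the data file; the amino acid string is the one implied by the table for id 1 above.

``` ruby
# Amino acids of the standard table, one character per codon ('*' marks a stop).
ncbieaa = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# All 64 codons, with the last base varying fastest: ttt, ttc, tta, ttg, tct, ...
bases  = %w[t c a g]
codons = bases.product(bases, bases).map(&:join)

# Pair each codon with the amino acid at the same position.
table = Hash[codons.zip(ncbieaa.split(//))]

table['atg'] # => "M"
table['tgg'] # => "W"
table.select { |_, aa| aa == '*' }.keys # => ["taa", "tag", "tga"]
```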
|
250
|
+
|
251
|
+
#### CodonTableParser#bundle
|
252
|
+
|
253
|
+
``` ruby
|
254
|
+
|
255
|
+
parser = CodonTableParser.new
|
256
|
+
|
257
|
+
# Return the definitions, codon table, start and stop codons for all species as a hash map
|
258
|
+
bundle = parser.bundle
|
259
|
+
|
260
|
+
bundle
|
261
|
+
# {:definitions => {return value of the 'definitions' method}
|
262
|
+
# :starts => {return value of the 'starts' method}
|
263
|
+
# :stops => {return value of the 'stops' method}
|
264
|
+
# :tables => {return value of the 'tables' method}
|
265
|
+
# }
|
266
|
+
|
267
|
+
```
|
268
|
+
The *bundle* method accepts all options from the methods described above, that is:
|
269
|
+
|
270
|
+
* :range (applied to all methods)
|
271
|
+
* :names (applied to the *definitions* method)
|
272
|
+
* :starts (applied to the *starts* method)
|
273
|
+
* :stops (applied to the *stops* method)
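
The nested *:starts* and *:stops* keys matter for *bundle* because one options hash is handed to every method. The sketch below mirrors the lookup used in the library source (`options[:starts] || options`); the helper name `customizations_for` is made up for this example.

``` ruby
# Hypothetical helper: prefer the nested hash under `key`,
# otherwise fall back to the options hash itself.
def customizations_for(options, key)
  options[key] || options
end

nested = {:starts => {1 => {:add => ['gtg']}}}
flat   = {1 => {:add => ['gtg']}}

# Nested form: only the start codons of id 1 are customized.
customizations_for(nested, :starts)[1] # => {:add => ["gtg"]}
customizations_for(nested, :stops)[1]  # => nil

# Flat form: the same customization is picked up for starts and stops alike.
customizations_for(flat, :starts)[1]   # => {:add => ["gtg"]}
customizations_for(flat, :stops)[1]    # => {:add => ["gtg"]}
```

With the flat form, *bundle* would therefore apply the change to both the start and the stop codons, which is rarely what is wanted.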
|
274
|
+
|
275
|
+
|
276
|
+
To get the same values as those assigned to the constants *DEFINITIONS*, *STARTS*, *STOPS*, and *TABLES* of BioRuby's [CodonTable](https://github.com/bioruby/bioruby/blob/master/lib/bio/data/codontable.rb) class, call *bundle* with the following options:
|
277
|
+
|
278
|
+
``` ruby
|
279
|
+
bundle = parser.bundle :names => {1 => "Standard (Eukaryote)",
|
280
|
+
4 => "Mold, Protozoan, Coelenterate Mitochondrial and Mycoplasma/Spiroplasma",
|
281
|
+
3 => "Yeast Mitochondorial",
|
282
|
+
6 => "Ciliate Macronuclear and Dasycladacean",
|
283
|
+
9 => "Echinoderm Mitochondrial",
|
284
|
+
11 => "Bacteria",
|
285
|
+
14 => "Flatworm Mitochondrial",
|
286
|
+
22 => "Scenedesmus obliquus mitochondrial"},
|
287
|
+
:starts => {1 => {:add => ['gtg']},
|
288
|
+
13 => {:remove => ['ttg', 'ata', 'gtg']}}
|
289
|
+
|
290
|
+
```
|
data/Rakefile
ADDED
@@ -0,0 +1,25 @@
|
|
1
|
+
# require 'spec/rake/spectask' # deprecated
|
2
|
+
require 'rspec/core/rake_task'
|
3
|
+
require 'rake/gempackagetask'
|
4
|
+
require 'rdoc/task'
|
5
|
+
|
6
|
+
# Build gem: rake gem
|
7
|
+
# Push gem: rake push
|
8
|
+
|
9
|
+
task :default => [ :spec, :gem ]
|
10
|
+
|
11
|
+
RSpec::Core::RakeTask.new :spec
|
12
|
+
|
13
|
+
gem_spec = eval(File.read('codon_table_parser.gemspec'))
|
14
|
+
|
15
|
+
Rake::GemPackageTask.new( gem_spec ) do |t|
|
16
|
+
t.need_zip = true
|
17
|
+
end
|
18
|
+
|
19
|
+
#RDoc::Task.new do |rdoc|
|
20
|
+
#
|
21
|
+
#end
|
22
|
+
|
23
|
+
task :push => :gem do |t|
|
24
|
+
sh "gem push pkg/#{gem_spec.name}-#{gem_spec.version}.gem"
|
25
|
+
end
|
data/lib/codon_table_parser.rb
ADDED
@@ -0,0 +1,189 @@
|
|
1
|
+
# encoding: utf-8
|
2
|
+
|
3
|
+
# Parses the NCBI genetic code table, generating separate hash maps of each species' name, start & stop codons and codon table.
|
4
|
+
#
|
5
|
+
# The output can be used to update the respective constants of BioRuby's CodonTable class.
|
6
|
+
class CodonTableParser
|
7
|
+
|
8
|
+
attr_reader :address
|
9
|
+
|
10
|
+
@default_address = 'ftp://ftp.ncbi.nih.gov/entrez/misc/data/gc.prt'
|
11
|
+
|
12
|
+
class << self
|
13
|
+
attr_accessor :default_address
|
14
|
+
end
|
15
|
+
|
16
|
+
def initialize(path = '')
|
17
|
+
@address = CodonTableParser.default_address
|
18
|
+
data = content(path)
|
19
|
+
@codons = triplets(data)
|
20
|
+
@parsed_data = parse(data)
|
21
|
+
end
|
22
|
+
|
23
|
+
def content path
|
24
|
+
if path.empty?
|
25
|
+
require 'open-uri'
|
26
|
+
f = open(@address)
|
27
|
+
else
|
28
|
+
f = File.new(path)
|
29
|
+
end
|
30
|
+
|
31
|
+
first_line = f.readline
|
32
|
+
raise Exception, "This is not the NCBI genetic code table" unless first_line =~ /--\*+/
|
33
|
+
f.read.each_line do |line|
|
34
|
+
next if line.match(/-- /)
|
35
|
+
line
|
36
|
+
end
|
37
|
+
end
|
38
|
+
|
39
|
+
def triplets data
|
40
|
+
base1, base2, base3 = bases data
|
41
|
+
arr = []
|
42
|
+
|
43
|
+
base1.each_with_index do |base, i|
|
44
|
+
arr << (base + base2[i] + base3[i]).downcase
|
45
|
+
end
|
46
|
+
arr
|
47
|
+
end
|
48
|
+
|
49
|
+
def parse data
|
50
|
+
del = /.*?\s/.source # .+ does not work as the regex is greedy.
|
51
|
+
# del = /.*?(?=[a-z])/.source # Using non-greedy search + pos. lookahead also works
|
52
|
+
l_name = /name "(.*?)"#{del}/.source
|
53
|
+
s_name = /(|name ".*?"#{del})/.source # Either nothing (the line does not exist) or the short name.
|
54
|
+
id = /id (\d+)#{del}/.source
|
55
|
+
ncbieaa = /ncbieaa "(.*?)"#{del}/.source
|
56
|
+
sncbieaa = /sncbieaa "(.*?)"/.source
|
57
|
+
|
58
|
+
# flag 'o':
|
59
|
+
# Perform inline substitutions (#{variable}) only once on creation.
|
60
|
+
# Normally, the variable is inserted on every evaluation.
|
61
|
+
result = data.scan(/#{l_name}#{s_name}#{id}#{ncbieaa}#{sncbieaa}/mo).
|
62
|
+
inject([]) do |res, (l_name, s_name, id, ncbieaa, sncbieaa)|
|
63
|
+
|
64
|
+
short = s_name.match(/[A-Z]{3}\d/)[0] unless s_name.empty?
|
65
|
+
l_name = l_name.gsub(/\n/,'')
|
66
|
+
|
67
|
+
res << {:id => id.to_i,
|
68
|
+
:long_name => l_name,
|
69
|
+
:short_name => short,
|
70
|
+
:ncbieaa => ncbieaa,
|
71
|
+
:sncbieaa => sncbieaa}
|
72
|
+
end
|
73
|
+
|
74
|
+
result
|
75
|
+
end
|
76
|
+
|
77
|
+
|
78
|
+
def definitions options = {}
|
79
|
+
Hash[@parsed_data.map do |species|
|
80
|
+
id = species[:id]
|
81
|
+
name = species[:long_name]
|
82
|
+
new_names = options[:names]
|
83
|
+
if new_names
|
84
|
+
name = new_names[id] if new_names[id]
|
85
|
+
end
|
86
|
+
custom_range(options[:range], id) {[id, name]}
|
87
|
+
end]
|
88
|
+
end
|
89
|
+
|
90
|
+
def starts options = {}
|
91
|
+
Hash[@parsed_data.map do |species|
|
92
|
+
codons = []
|
93
|
+
species[:sncbieaa].split(//).each_with_index do |pos, i|
|
94
|
+
if pos == 'M'
|
95
|
+
codons << @codons[i]
|
96
|
+
end
|
97
|
+
end
|
98
|
+
id = species[:id]
|
99
|
+
# Options can either be passed as :starts => {1 => {:add => ...}} or 1 => {:add => ...}
|
100
|
+
selection = options[:starts] || options
|
101
|
+
codons = custom_codons(selection[id], codons)
|
102
|
+
custom_range(options[:range], id) {[id, codons]}
|
103
|
+
end]
|
104
|
+
end
|
105
|
+
|
106
|
+
|
107
|
+
def stops options = {}
|
108
|
+
Hash[@parsed_data.map do |species|
|
109
|
+
codons = []
|
110
|
+
species[:ncbieaa].split(//).each_with_index do |pos, i|
|
111
|
+
if pos == '*'
|
112
|
+
codons << @codons[i]
|
113
|
+
end
|
114
|
+
end
|
115
|
+
|
116
|
+
id = species[:id]
|
117
|
+
# Options can either be passed as :stops => {1 => {:add => ...}} or 1 => {:add => ...}
|
118
|
+
selection = options[:stops] || options
|
119
|
+
codons = custom_codons(selection[id], codons)
|
120
|
+
custom_range(options[:range], id) {[id, codons]}
|
121
|
+
end]
|
122
|
+
end
|
123
|
+
|
124
|
+
def tables options = {}
|
125
|
+
Hash[@parsed_data.map do |species|
|
126
|
+
id = species[:id]
|
127
|
+
codon_table = table(@codons, species[:ncbieaa])
|
128
|
+
custom_range(options[:range], id) {[id, codon_table]}
|
129
|
+
end]
|
130
|
+
end
|
131
|
+
|
132
|
+
def bundle options = {}
|
133
|
+
{:definitions => definitions(options),
|
134
|
+
:starts => starts(options),
|
135
|
+
:stops => stops(options),
|
136
|
+
:tables => tables(options)}
|
137
|
+
end
|
138
|
+
|
139
|
+
|
140
|
+
def bases data
|
141
|
+
del = /[^\n]*\n\s+/.source
|
142
|
+
base1 = /-- Base1\s+([A-Z]+)#{del}/.source
|
143
|
+
base2 = /-- Base2\s+([A-Z]+)#{del}/.source
|
144
|
+
base3 = /-- Base3\s+([A-Z]+)/.source
|
145
|
+
data.scan(/#{base1}#{base2}#{base3}/m).first.map do |base|
|
146
|
+
base.split(//)
|
147
|
+
end
|
148
|
+
end
|
149
|
+
|
150
|
+
def prepare_range range
|
151
|
+
require 'set'
|
152
|
+
range.map do |val|
|
153
|
+
val.is_a?(Range) ? val.to_a : val
|
154
|
+
end.flatten.to_set.sort
|
155
|
+
end
|
156
|
+
|
157
|
+
|
158
|
+
def custom_range options, id, &block
|
159
|
+
range = prepare_range options if options
|
160
|
+
if range
|
161
|
+
block.call if range.include?(id)
|
162
|
+
else
|
163
|
+
block.call
|
164
|
+
end
|
165
|
+
end
|
166
|
+
|
167
|
+
|
168
|
+
def custom_codons options, codons
|
169
|
+
opt = options
|
170
|
+
if opt
|
171
|
+
codons = codons | opt[:add] if opt[:add]
|
172
|
+
codons = codons.delete_if {|codon| opt[:remove].include?(codon)} if opt[:remove]
|
173
|
+
end
|
174
|
+
codons
|
175
|
+
end
|
176
|
+
|
177
|
+
def table triplets, ncbieaa
|
178
|
+
ncbieaa = ncbieaa.split(//)
|
179
|
+
|
180
|
+
hash = {}
|
181
|
+
triplets.each_with_index do |codon, i|
|
182
|
+
hash[codon] = ncbieaa[i]
|
183
|
+
end
|
184
|
+
hash
|
185
|
+
end
|
186
|
+
|
187
|
+
private :content, :bases, :prepare_range, :custom_range, :custom_codons, :table
|
188
|
+
end
|
189
|
+
|
metadata
ADDED
@@ -0,0 +1,66 @@
|
|
1
|
+
--- !ruby/object:Gem::Specification
|
2
|
+
name: codon_table_parser
|
3
|
+
version: !ruby/object:Gem::Version
|
4
|
+
version: 0.2.0
|
5
|
+
prerelease:
|
6
|
+
platform: ruby
|
7
|
+
authors:
|
8
|
+
- Stefan Rohlfing
|
9
|
+
autorequire:
|
10
|
+
bindir: bin
|
11
|
+
cert_chain: []
|
12
|
+
date: 2011-11-03 00:00:00.000000000Z
|
13
|
+
dependencies:
|
14
|
+
- !ruby/object:Gem::Dependency
|
15
|
+
name: rspec
|
16
|
+
requirement: &13232540 !ruby/object:Gem::Requirement
|
17
|
+
none: false
|
18
|
+
requirements:
|
19
|
+
- - ! '>='
|
20
|
+
- !ruby/object:Gem::Version
|
21
|
+
version: '0'
|
22
|
+
type: :development
|
23
|
+
prerelease: false
|
24
|
+
version_requirements: *13232540
|
25
|
+
description: ! ' Parses the NCBI genetic code table, generating hash maps of each
|
26
|
+
species'' name, start codons, stop codons and codon table. The output of CodonTableParser
|
27
|
+
can be customized easily and used to update the respective constants of BioRuby''s
|
28
|
+
CodonTable class whenever the original data has changed.
|
29
|
+
|
30
|
+
'
|
31
|
+
email: stefan.rohlfing@gmail.com
|
32
|
+
executables: []
|
33
|
+
extensions: []
|
34
|
+
extra_rdoc_files: []
|
35
|
+
files:
|
36
|
+
- lib/codon_table_parser.rb
|
37
|
+
- lib/codon_table_parser/version.rb
|
38
|
+
- README.md
|
39
|
+
- Rakefile
|
40
|
+
- LICENSE
|
41
|
+
homepage: http://github.com/bytesource/codon_table_parser
|
42
|
+
licenses: []
|
43
|
+
post_install_message:
|
44
|
+
rdoc_options: []
|
45
|
+
require_paths:
|
46
|
+
- lib
|
47
|
+
required_ruby_version: !ruby/object:Gem::Requirement
|
48
|
+
none: false
|
49
|
+
requirements:
|
50
|
+
- - ! '>='
|
51
|
+
- !ruby/object:Gem::Version
|
52
|
+
version: 1.9.1
|
53
|
+
required_rubygems_version: !ruby/object:Gem::Requirement
|
54
|
+
none: false
|
55
|
+
requirements:
|
56
|
+
- - ! '>='
|
57
|
+
- !ruby/object:Gem::Version
|
58
|
+
version: '0'
|
59
|
+
requirements: []
|
60
|
+
rubyforge_project: codon_table_parser
|
61
|
+
rubygems_version: 1.8.10
|
62
|
+
signing_key:
|
63
|
+
specification_version: 3
|
64
|
+
summary: Parses the NCBI genetic code table, generating hash maps of each species'
|
65
|
+
name, start codons, stop codons and codon table.
|
66
|
+
test_files: []
|