bio-protparam 0.1.0
- data/.document +5 -0
- data/.travis.yml +12 -0
- data/Gemfile +9 -0
- data/LICENSE.txt +20 -0
- data/README.md +47 -0
- data/README.rdoc +48 -0
- data/Rakefile +42 -0
- data/VERSION +1 -0
- data/lib/bio-protparam.rb +12 -0
- data/lib/bio/util/protparam.rb +817 -0
- data/test/data/uniprot/p53_human.uniprot +1456 -0
- data/test/helper.rb +20 -0
- data/test/test_bio-protparam.rb +121 -0
- metadata +130 -0
data/.document
ADDED
data/.travis.yml
ADDED
@@ -0,0 +1,12 @@
|
|
1
|
+
language: ruby
|
2
|
+
rvm:
|
3
|
+
- 1.9.2
|
4
|
+
- 1.9.3
|
5
|
+
- jruby-19mode # JRuby in 1.9 mode
|
6
|
+
- rbx-19mode
|
7
|
+
# - 1.8.7
|
8
|
+
# - jruby-18mode # JRuby in 1.8 mode
|
9
|
+
# - rbx-18mode
|
10
|
+
|
11
|
+
# uncomment this line if your project needs to run something other than `rake`:
|
12
|
+
# script: bundle exec rspec spec
|
data/Gemfile
ADDED
data/LICENSE.txt
ADDED
@@ -0,0 +1,20 @@
|
|
1
|
+
Copyright (c) 2012 hryk <hiroyuki@1vq9.com>
|
2
|
+
|
3
|
+
Permission is hereby granted, free of charge, to any person obtaining
|
4
|
+
a copy of this software and associated documentation files (the
|
5
|
+
"Software"), to deal in the Software without restriction, including
|
6
|
+
without limitation the rights to use, copy, modify, merge, publish,
|
7
|
+
distribute, sublicense, and/or sell copies of the Software, and to
|
8
|
+
permit persons to whom the Software is furnished to do so, subject to
|
9
|
+
the following conditions:
|
10
|
+
|
11
|
+
The above copyright notice and this permission notice shall be
|
12
|
+
included in all copies or substantial portions of the Software.
|
13
|
+
|
14
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
|
15
|
+
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
|
16
|
+
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
17
|
+
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
|
18
|
+
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
|
19
|
+
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
|
20
|
+
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
data/README.md
ADDED
@@ -0,0 +1,47 @@
|
|
1
|
+
# bio-protparam
|
2
|
+
|
3
|
+
[Build status](http://travis-ci.org/hryk/bioruby-protparam)
|
4
|
+
|
5
|
+
`bio-protparam` adds the Bio::Protparam class. Bio::Protparam has the same interface and
|
6
|
+
function as the Bio::Tools::Protparam class of BioPerl, except that it calculates
|
7
|
+
parameters locally instead of sending a query to the ExPASy ProtParam tool.
|
8
|
+
|
9
|
+
**Note: this software is under active development!**
|
10
|
+
|
11
|
+
## Installation
|
12
|
+
|
13
|
+
```sh
|
14
|
+
gem install bio-protparam
|
15
|
+
```
|
16
|
+
|
17
|
+
## Usage
|
18
|
+
|
19
|
+
```ruby
|
20
|
+
require 'bio-protparam'
|
21
|
+
|
22
|
+
protparam = Bio::Protparam.new("MYNNYNLCHIRTINWEEIITGPSAMYSYVY...")
|
23
|
+
# Return Mw
|
24
|
+
protparam.molecular_weight
|
25
|
+
# Return pI
|
26
|
+
protparam.theoretical_pI
|
27
|
+
|
28
|
+
```
|
29
|
+
|
30
|
+
The API doc is on [rdoc.info](http://rdoc.info/github/hryk/bioruby-protparam/). For
|
31
|
+
more code examples see the test files in the source tree.
|
32
|
+
|
33
|
+
## Cite
|
34
|
+
|
35
|
+
If you use this software, please cite one of
|
36
|
+
|
37
|
+
* [BioRuby: bioinformatics software for the Ruby programming language](http://dx.doi.org/10.1093/bioinformatics/btq475)
|
38
|
+
* [Biogem: an effective tool-based approach for scaling up open source software development in bioinformatics](http://dx.doi.org/10.1093/bioinformatics/bts080)
|
39
|
+
|
40
|
+
## Biogems.info
|
41
|
+
|
42
|
+
This Biogem is published at [biogems.info](http://biogems.info/index.html#bio-protparam)
|
43
|
+
|
44
|
+
## Copyright
|
45
|
+
|
46
|
+
Copyright (c) 2012 hryk. See LICENSE.txt for further details.
|
47
|
+
|
data/README.rdoc
ADDED
@@ -0,0 +1,48 @@
|
|
1
|
+
= bio-protparam
|
2
|
+
|
3
|
+
{<img
|
4
|
+
src="https://secure.travis-ci.org/hryk/bioruby-protparam.png"
|
5
|
+
/>}[http://travis-ci.org/#!/hryk/bioruby-protparam]
|
6
|
+
|
7
|
+
Full description goes here
|
8
|
+
|
9
|
+
Note: this software is under active development!
|
10
|
+
|
11
|
+
== Installation
|
12
|
+
|
13
|
+
gem install bio-protparam
|
14
|
+
|
15
|
+
== Usage
|
16
|
+
|
17
|
+
== Developers
|
18
|
+
|
19
|
+
To use the library
|
20
|
+
|
21
|
+
require 'bio-protparam'
|
22
|
+
|
23
|
+
The API doc is online. For more code examples see also the test files in
|
24
|
+
the source tree.
|
25
|
+
|
26
|
+
== Project home page
|
27
|
+
|
28
|
+
For information on the source tree, documentation, issues and how to contribute, see
|
29
|
+
|
30
|
+
http://github.com/hryk/bioruby-protparam
|
31
|
+
|
32
|
+
The BioRuby community is on IRC server: irc.freenode.org, channel: #bioruby.
|
33
|
+
|
34
|
+
== Cite
|
35
|
+
|
36
|
+
If you use this software, please cite one of
|
37
|
+
|
38
|
+
* {BioRuby: bioinformatics software for the Ruby programming language}[http://dx.doi.org/10.1093/bioinformatics/btq475]
|
39
|
+
* {Biogem: an effective tool-based approach for scaling up open source software development in bioinformatics}[http://dx.doi.org/10.1093/bioinformatics/bts080]
|
40
|
+
|
41
|
+
== Biogems.info
|
42
|
+
|
43
|
+
This Biogem is published at http://biogems.info/index.html#bio-protparam
|
44
|
+
|
45
|
+
== Copyright
|
46
|
+
|
47
|
+
Copyright (c) 2012 hryk. See LICENSE.txt for further details.
|
48
|
+
|
data/Rakefile
ADDED
@@ -0,0 +1,42 @@
|
|
1
|
+
# encoding: utf-8
|
2
|
+
|
3
|
+
require 'rubygems'
|
4
|
+
require 'bundler'
|
5
|
+
begin
|
6
|
+
Bundler.setup(:default, :development)
|
7
|
+
rescue Bundler::BundlerError => e
|
8
|
+
$stderr.puts e.message
|
9
|
+
$stderr.puts "Run `bundle install` to install missing gems"
|
10
|
+
exit e.status_code
|
11
|
+
end
|
12
|
+
require 'rake'
|
13
|
+
|
14
|
+
require 'jeweler'
|
15
|
+
Jeweler::Tasks.new do |gem|
|
16
|
+
gem.name = "bio-protparam"
|
17
|
+
gem.homepage = "http://github.com/hryk/bioruby-protparam"
|
18
|
+
gem.license = "MIT"
|
19
|
+
gem.summary = %Q{A Protparam compatible utility for BioRuby.}
|
20
|
+
gem.description = %Q{Bio::Protparam has the same interface and function as the Bio::Tools::Protparam class of BioPerl, except that it calculates parameters locally instead of sending a query to the ExPASy ProtParam tool.}
|
21
|
+
gem.email = "hiroyuki@1vq9.com"
|
22
|
+
gem.authors = ["hryk"]
|
23
|
+
end
|
24
|
+
Jeweler::RubygemsDotOrgTasks.new
|
25
|
+
|
26
|
+
require 'rake/testtask'
|
27
|
+
Rake::TestTask.new(:test) do |test|
|
28
|
+
test.libs << 'lib' << 'test'
|
29
|
+
test.pattern = 'test/**/test_*.rb'
|
30
|
+
test.verbose = true
|
31
|
+
end
|
32
|
+
|
33
|
+
task :default => :test
|
34
|
+
|
35
|
+
require 'rdoc/task'
|
36
|
+
Rake::RDocTask.new do |rdoc|
|
37
|
+
version = File.exist?('VERSION') ? File.read('VERSION') : ""
|
38
|
+
rdoc.rdoc_dir = 'rdoc'
|
39
|
+
rdoc.title = "bio-protparam #{version}"
|
40
|
+
rdoc.rdoc_files.include('README*')
|
41
|
+
rdoc.rdoc_files.include('lib/**/*.rb')
|
42
|
+
end
|
data/VERSION
ADDED
@@ -0,0 +1 @@
|
|
1
|
+
0.1.0
|
data/lib/bio-protparam.rb
ADDED
@@ -0,0 +1,12 @@
|
|
1
|
+
# Please require your code below, respecting the naming conventions in the
|
2
|
+
# bioruby directory tree.
|
3
|
+
#
|
4
|
+
# For example, say you have a plugin named bio-plugin, the only uncommented
|
5
|
+
# line in this file would be
|
6
|
+
#
|
7
|
+
# require 'bio/bio-plugin/plugin'
|
8
|
+
#
|
9
|
+
# In this file only require other files. Avoid other source code.
|
10
|
+
|
11
|
+
require 'bio/util/protparam'
|
12
|
+
|
data/lib/bio/util/protparam.rb
ADDED
@@ -0,0 +1,817 @@
|
|
1
|
+
# encoding: utf-8
|
2
|
+
#
|
3
|
+
#
|
4
|
+
# = bio/util/protparam.rb - A class to calculate protein parameters.
|
5
|
+
#
|
6
|
+
# Copyright:: Copyright (C) 2012
|
7
|
+
# Hiroyuki Nakamura <hiroyuki@1vq9.com>
|
8
|
+
# License:: The Ruby License
|
9
|
+
#
|
10
|
+
require 'rational'
|
11
|
+
|
12
|
+
module Bio
|
13
|
+
##
|
14
|
+
# == Description
|
15
|
+
#
|
16
|
+
# Bio::Protparam is a class for calculating protein parameters. This class
|
17
|
+
# has a similar interface to BioPerl's Bio::Tools::Protparam. However, it
|
18
|
+
# calculates parameters locally instead of sending a query to ExPASy's {Protparam
|
19
|
+
# tool}[http://web.expasy.org/protparam/]{[1]}[rdoc-label:1] as Bio::Tools::Protparam does.
|
20
|
+
#
|
21
|
+
class Protparam
|
22
|
+
|
23
|
+
# {IUPAC codes}[http://www.bioinformatics.org/sms2/iupac.html] for amino acids.
|
24
|
+
IUPAC_CODE = {
|
25
|
+
:I => "Ile",
|
26
|
+
:V => "Val",
|
27
|
+
:L => "Leu",
|
28
|
+
:F => "Phe",
|
29
|
+
:C => "Cys",
|
30
|
+
:M => "Met",
|
31
|
+
:A => "Ala",
|
32
|
+
:G => "Gly",
|
33
|
+
:T => "Thr",
|
34
|
+
:W => "Trp",
|
35
|
+
:S => "Ser",
|
36
|
+
:Y => "Tyr",
|
37
|
+
:P => "Pro",
|
38
|
+
:H => "His",
|
39
|
+
:E => "Glu",
|
40
|
+
:Q => "Gln",
|
41
|
+
:D => "Asp",
|
42
|
+
:N => "Asn",
|
43
|
+
:K => "Lys",
|
44
|
+
:R => "Arg",
|
45
|
+
:U => "Sec",
|
46
|
+
:O => "Pyl",
|
47
|
+
:B => "Asx",
|
48
|
+
:Z => "Glx",
|
49
|
+
:X => "Xaa"
|
50
|
+
}
|
51
|
+
|
52
|
+
# Dipeptide instability weight values (DIWV) for calculating the instability index of proteins {[10]}[rdoc-label:10].
|
53
|
+
DIWV = {
|
54
|
+
:W => {
|
55
|
+
:W => 1.0, :C => 1.0, :M => 24.68, :H => 24.68, :Y => 1.0, :F => 1.0, :Q => 1.0,
|
56
|
+
:N => 13.34, :I => 1.0, :R => 1.0, :D => 1.0, :P => 1.0, :T => -14.03, :K => 1.0,
|
57
|
+
:E => 1.0, :V => -7.49, :S => 1.0, :G => -9.37, :A => -14.03, :L => 13.34
|
58
|
+
},
|
59
|
+
:C => {
|
60
|
+
:W => 24.68, :C => 1.0, :M => 33.6, :H => 33.6, :Y => 1.0, :F => 1.0, :Q => -6.54, :N => 1.0,
|
61
|
+
:I => 1.0, :R => 1.0, :D => 20.26, :P => 20.26, :T => 33.6, :K => 1.0, :E => 1.0, :V => -6.54,
|
62
|
+
:S => 1.0, :G => 1.0, :A => 1.0, :L => 20.26
|
63
|
+
},
|
64
|
+
:M => {
|
65
|
+
:W => 1.0, :C => 1.0, :M => -1.88, :H => 58.28, :Y => 24.68, :F => 1.0, :Q => -6.54,
|
66
|
+
:N => 1.0, :I => 1.0, :R => -6.54, :D => 1.0, :P => 44.94, :T => -1.88, :K => 1.0, :E => 1.0,
|
67
|
+
:V => 1.0, :S => 44.94, :G => 1.0, :A => 13.34, :L => 1.0
|
68
|
+
},
|
69
|
+
:H => {
|
70
|
+
:W => -1.88, :C => 1.0, :M => 1.0, :H => 1.0, :Y => 44.94, :F => -9.37, :Q => 1.0,
|
71
|
+
:N => 24.68, :I => 44.94, :R => 1.0, :D => 1.0, :P => -1.88, :T => -6.54, :K => 24.68,
|
72
|
+
:E => 1.0, :V => 1.0, :S => 1.0, :G => -9.37, :A => 1.0, :L => 1.0
|
73
|
+
},
|
74
|
+
:Y => {
|
75
|
+
:W => -9.37, :C => 1.0, :M => 44.94, :H => 13.34, :Y => 13.34, :F => 1.0, :Q => 1.0,
|
76
|
+
:N => 1.0, :I => 1.0, :R => -15.91, :D => 24.68, :P => 13.34, :T => -7.49, :K => 1.0,
|
77
|
+
:E => -6.54, :V => 1.0, :S => 1.0, :G => -7.49, :A => 24.68, :L => 1.0
|
78
|
+
},
|
79
|
+
:F => {
|
80
|
+
:W => 1.0, :C => 1.0, :M => 1.0, :H => 1.0, :Y => 33.6, :F => 1.0, :Q => 1.0, :N => 1.0,
|
81
|
+
:I => 1.0, :R => 1.0, :D => 13.34, :P => 20.26, :T => 1.0, :K => -14.03, :E => 1.0,
|
82
|
+
:V => 1.0, :S => 1.0, :G => 1.0, :A => 1.0, :L => 1.0
|
83
|
+
},
|
84
|
+
:Q => {
|
85
|
+
:W => 1.0, :C => -6.54, :M => 1.0, :H => 1.0, :Y => -6.54, :F => -6.54, :Q => 20.26,
|
86
|
+
:N => 1.0, :I => 1.0, :R => 1.0, :D => 20.26, :P => 20.26, :T => 1.0, :K => 1.0, :E => 20.26,
|
87
|
+
:V => -6.54, :S => 44.94, :G => 1.0, :A => 1.0, :L => 1.0
|
88
|
+
},
|
89
|
+
:N => {
|
90
|
+
:W => -9.37, :C => -1.88, :M => 1.0, :H => 1.0, :Y => 1.0, :F => -14.03, :Q => -6.54,
|
91
|
+
:N => 1.0, :I => 44.94, :R => 1.0, :D => 1.0, :P => -1.88, :T => -7.49, :K => 24.68,
|
92
|
+
:E => 1.0, :V => 1.0, :S => 1.0, :G => -14.03, :A => 1.0, :L => 1.0
|
93
|
+
},
|
94
|
+
:I => {
|
95
|
+
:W => 1.0, :C => 1.0, :M => 1.0, :H => 13.34, :Y => 1.0, :F => 1.0, :Q => 1.0, :N => 1.0,
|
96
|
+
:I => 1.0, :R => 1.0, :D => 1.0, :P => -1.88, :T => 1.0, :K => -7.49, :E => 44.94,
|
97
|
+
:V => -7.49, :S => 1.0, :G => 1.0, :A => 1.0, :L => 20.26
|
98
|
+
},
|
99
|
+
:R => {
|
100
|
+
:W => 58.28, :C => 1.0, :M => 1.0, :H => 20.26, :Y => -6.54, :F => 1.0, :Q => 20.26,
|
101
|
+
:N => 13.34, :I => 1.0, :R => 58.28, :D => 1.0, :P => 20.26, :T => 1.0, :K => 1.0, :E => 1.0,
|
102
|
+
:V => 1.0, :S => 44.94, :G => -7.49, :A => 1.0, :L => 1.0
|
103
|
+
},
|
104
|
+
:D => {
|
105
|
+
:W => 1.0, :C => 1.0, :M => 1.0, :H => 1.0, :Y => 1.0, :F => -6.54, :Q => 1.0, :N => 1.0,
|
106
|
+
:I => 1.0, :R => -6.54, :D => 1.0, :P => 1.0, :T => -14.03, :K => -7.49, :E => 1.0,
|
107
|
+
:V => 1.0, :S => 20.26, :G => 1.0, :A => 1.0, :L => 1.0
|
108
|
+
},
|
109
|
+
:P => {
|
110
|
+
:W => -1.88, :C => -6.54, :M => -6.54, :H => 1.0, :Y => 1.0, :F => 20.26, :Q => 20.26,
|
111
|
+
:N => 1.0, :I => 1.0, :R => -6.54, :D => -6.54, :P => 20.26, :T => 1.0, :K => 1.0, :E => 18.38,
|
112
|
+
:V => 20.26, :S => 20.26, :G => 1.0, :A => 20.26, :L => 1.0
|
113
|
+
},
|
114
|
+
:T => {
|
115
|
+
:W => -14.03, :C => 1.0, :M => 1.0, :H => 1.0, :Y => 1.0, :F => 13.34, :Q => -6.54,
|
116
|
+
:N => -14.03, :I => 1.0, :R => 1.0, :D => 1.0, :P => 1.0, :T => 1.0, :K => 1.0, :E => 20.26,
|
117
|
+
:V => 1.0, :S => 1.0, :G => -7.49, :A => 1.0, :L => 1.0
|
118
|
+
},
|
119
|
+
:K => {
|
120
|
+
:W => 1.0, :C => 1.0, :M => 33.6, :H => 1.0, :Y => 1.0, :F => 1.0, :Q => 24.68, :N => 1.0,
|
121
|
+
:I => -7.49, :R => 33.6, :D => 1.0, :P => -6.54, :T => 1.0, :K => 1.0, :E => 1.0, :V => -7.49,
|
122
|
+
:S => 1.0, :G => -7.49, :A => 1.0, :L => -7.49
|
123
|
+
},
|
124
|
+
:E => {
|
125
|
+
:W => -14.03, :C => 44.94, :M => 1.0, :H => -6.54, :Y => 1.0, :F => 1.0, :Q => 20.26,
|
126
|
+
:N => 1.0, :I => 20.26, :R => 1.0, :D => 20.26, :P => 20.26, :T => 1.0, :K => 1.0, :E => 33.6,
|
127
|
+
:V => 1.0, :S => 20.26, :G => 1.0, :A => 1.0, :L => 1.0
|
128
|
+
},
|
129
|
+
:V => {
|
130
|
+
:W => 1.0, :C => 1.0, :M => 1.0, :H => 1.0, :Y => -6.54, :F => 1.0, :Q => 1.0, :N => 1.0,
|
131
|
+
:I => 1.0, :R => 1.0, :D => -14.03, :P => 20.26, :T => -7.49, :K => -1.88, :E => 1.0,
|
132
|
+
:V => 1.0, :S => 1.0, :G => -7.49, :A => 1.0, :L => 1.0
|
133
|
+
},
|
134
|
+
:S => {
|
135
|
+
:W => 1.0, :C => 33.6, :M => 1.0, :H => 1.0, :Y => 1.0, :F => 1.0, :Q => 20.26, :N => 1.0,
|
136
|
+
:I => 1.0, :R => 20.26, :D => 1.0, :P => 44.94, :T => 1.0, :K => 1.0, :E => 20.26, :V => 1.0,
|
137
|
+
:S => 20.26, :G => 1.0, :A => 1.0, :L => 1.0
|
138
|
+
},
|
139
|
+
:G => {
|
140
|
+
:W => 13.34, :C => 1.0, :M => 1.0, :H => 1.0, :Y => -7.49, :F => 1.0, :Q => 1.0, :N => -7.49,
|
141
|
+
:I => -7.49, :R => 1.0, :D => 1.0, :P => 1.0, :T => -7.49, :K => -7.49, :E => -6.54,
|
142
|
+
:V => 1.0, :S => 1.0, :G => 13.34, :A => -7.49, :L => 1.0
|
143
|
+
},
|
144
|
+
:A => {
|
145
|
+
:W => 1.0, :C => 44.94, :M => 1.0, :H => -7.49, :Y => 1.0, :F => 1.0, :Q => 1.0, :N => 1.0,
|
146
|
+
:I => 1.0, :R => 1.0, :D => -7.49, :P => 20.26, :T => 1.0, :K => 1.0, :E => 1.0, :V => 1.0,
|
147
|
+
:S => 1.0, :G => 1.0, :A => 1.0, :L => 1.0
|
148
|
+
},
|
149
|
+
:L => {
|
150
|
+
:W => 24.68, :C => 1.0, :M => 1.0, :H => 1.0, :Y => 1.0, :F => 1.0, :Q => 33.6, :N => 1.0,
|
151
|
+
:I => 1.0, :R => 20.26, :D => 1.0, :P => 20.26, :T => 1.0, :K => -7.49, :E => 1.0, :V => 1.0,
|
152
|
+
:S => 1.0, :G => 1.0, :A => 1.0, :L => 1.0
|
153
|
+
}
|
154
|
+
}
|
155
|
+
|
156
|
+
# Estimated half-life (in minutes) of a protein, keyed by its N-terminal residue.
|
157
|
+
HALFLIFE = {
|
158
|
+
:ecoli => {
|
159
|
+
:I => 600,
|
160
|
+
:V => 600,
|
161
|
+
:L => 2,
|
162
|
+
:F => 2,
|
163
|
+
:C => 600,
|
164
|
+
:M => 600,
|
165
|
+
:A => 600,
|
166
|
+
:G => 600,
|
167
|
+
:T => 600,
|
168
|
+
:W => 2,
|
169
|
+
:S => 600,
|
170
|
+
:Y => 2,
|
171
|
+
:P => 600,
|
172
|
+
:H => 600,
|
173
|
+
:E => 600,
|
174
|
+
:Q => 600,
|
175
|
+
:D => 600,
|
176
|
+
:N => 600,
|
177
|
+
:K => 2,
|
178
|
+
:R => 2,
|
179
|
+
:U => 600
|
180
|
+
},
|
181
|
+
:mammalian => {
|
182
|
+
:A => 264,
|
183
|
+
:R => 60,
|
184
|
+
:N => 84,
|
185
|
+
:D => 66,
|
186
|
+
:C => 72,
|
187
|
+
:Q => 48,
|
188
|
+
:E => 60,
|
189
|
+
:G => 30,
|
190
|
+
:H => 210,
|
191
|
+
:I => 1200,
|
192
|
+
:L => 330,
|
193
|
+
:K => 78,
|
194
|
+
:M => 1800,
|
195
|
+
:F => 66,
|
196
|
+
:P => 1200,
|
197
|
+
:S => 114,
|
198
|
+
:T => 432,
|
199
|
+
:W => 168,
|
200
|
+
:Y => 168,
|
201
|
+
:V => 6000
|
202
|
+
},
|
203
|
+
:yeast => {
|
204
|
+
:A => 1200,
|
205
|
+
:R => 2,
|
206
|
+
:N => 3,
|
207
|
+
:D => 3,
|
208
|
+
:C => 1200,
|
209
|
+
:Q => 10,
|
210
|
+
:E => 30,
|
211
|
+
:G => 1200,
|
212
|
+
:H => 10,
|
213
|
+
:I => 30,
|
214
|
+
:L => 3,
|
215
|
+
:K => 3,
|
216
|
+
:M => 1200,
|
217
|
+
:F => 3,
|
218
|
+
:P => 1200,
|
219
|
+
:S => 1200,
|
220
|
+
:T => 1200,
|
221
|
+
:W => 3,
|
222
|
+
:Y => 10,
|
223
|
+
:V => 1200
|
224
|
+
}
|
225
|
+
}
|
226
|
+
|
227
|
+
## TOP-IDP
|
228
|
+
##
|
229
|
+
## http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2676888/
|
230
|
+
##
|
231
|
+
# TOP_IDP = {
|
232
|
+
# :I => -0.486,
|
233
|
+
# :V => -0.121,
|
234
|
+
# :L => -0.326,
|
235
|
+
# :F => -0.697,
|
236
|
+
# :C => 0.02,
|
237
|
+
# :M => -0.397,
|
238
|
+
# :A => 0.06,
|
239
|
+
# :G => 0.166,
|
240
|
+
# :T => 0.059,
|
241
|
+
# :W => -0.884,
|
242
|
+
# :S => 0.341,
|
243
|
+
# :Y => -0.510,
|
244
|
+
# :P => 0.987,
|
245
|
+
# :H => 0.303,
|
246
|
+
# :E => 0.736,
|
247
|
+
# :Q => 0.318,
|
248
|
+
# :D => 0.192,
|
249
|
+
# :N => 0.007,
|
250
|
+
# :K => 0.586,
|
251
|
+
# :R => 0.180,
|
252
|
+
# :U => 0.02
|
253
|
+
# }
|
254
|
+
|
255
|
+
# Hydropathy values for amino acids {[12]}[rdoc-label:12].
|
256
|
+
HYDROPATHY = {
|
257
|
+
:I => 4.5 ,
|
258
|
+
:V => 4.2 ,
|
259
|
+
:L => 3.8 ,
|
260
|
+
:F => 2.8 ,
|
261
|
+
:C => 2.5 ,
|
262
|
+
:M => 1.9 ,
|
263
|
+
:A => 1.8 ,
|
264
|
+
:G => -0.4,
|
265
|
+
:T => -0.7,
|
266
|
+
:W => -0.9,
|
267
|
+
:S => -0.8,
|
268
|
+
:Y => -1.3,
|
269
|
+
:P => -1.6,
|
270
|
+
:H => -3.2,
|
271
|
+
:E => -3.5,
|
272
|
+
:Q => -3.5,
|
273
|
+
:D => -3.5,
|
274
|
+
:N => -3.5,
|
275
|
+
:K => -3.9,
|
276
|
+
:R => -4.5,
|
277
|
+
:U => 2.5
|
278
|
+
}
|
279
|
+
|
280
|
+
# {Average isotopic masses of amino acids}[http://web.expasy.org/findmod/findmod_masses.html#AA]
|
281
|
+
AVERAGE_MASS = {
|
282
|
+
:I => 113.1594,
|
283
|
+
:V => 99.1326,
|
284
|
+
:L => 113.1594,
|
285
|
+
:F => 147.1766,
|
286
|
+
:C => 103.1388,
|
287
|
+
:M => 131.1926,
|
288
|
+
:A => 71.0788,
|
289
|
+
:G => 57.0519,
|
290
|
+
:T => 101.1051,
|
291
|
+
:W => 186.2132,
|
292
|
+
:S => 87.0782,
|
293
|
+
:Y => 163.1760,
|
294
|
+
:P => 97.1167,
|
295
|
+
:H => 137.1411,
|
296
|
+
:E => 129.1155,
|
297
|
+
:Q => 128.1307,
|
298
|
+
:D => 115.0886,
|
299
|
+
:N => 114.1038,
|
300
|
+
:K => 128.1741,
|
301
|
+
:R => 156.1875,
|
302
|
+
:U => 150.0388
|
303
|
+
}
|
304
|
+
WATER_MASS = 18.01524
|
305
|
+
|
306
|
+
# Atomic composition of amino acids.
|
307
|
+
ATOM = {
|
308
|
+
:I => {:C => 6, :H => 13, :O => 2, :N => 1, :S => 0}, # C6H13NO2
|
309
|
+
:V => {:C => 5, :H => 11, :O => 2, :N => 1, :S => 0}, # C5H11NO2
|
310
|
+
:L => {:C => 6, :H => 13, :O => 2, :N => 1, :S => 0}, # C6H13NO2
|
311
|
+
:F => {:C => 9, :H => 11, :O => 2, :N => 1, :S => 0}, # C9H11NO2
|
312
|
+
:C => {:C => 3, :H => 7 , :O => 2, :N => 1, :S => 1}, # C3H7NO2S
|
313
|
+
:M => {:C => 5, :H => 11 ,:O => 2, :N => 1, :S => 1}, # C5H11NO2S
|
314
|
+
:A => {:C => 3, :H => 7 , :O => 2, :N => 1, :S => 0}, # C3H7NO2
|
315
|
+
:G => {:C => 2, :H => 5 , :O => 2, :N => 1, :S => 0}, # C2H5NO2
|
316
|
+
:T => {:C => 4, :H => 9 , :O => 3, :N => 1, :S => 0}, # C4H9NO3
|
317
|
+
:W => {:C => 11,:H => 12, :O => 2, :N => 2, :S => 0}, # C11H12N2O2
|
318
|
+
:S => {:C => 3, :H => 7 , :O => 3, :N => 1, :S => 0}, # C3H7NO3
|
319
|
+
:Y => {:C => 9, :H => 11, :O => 3, :N => 1, :S => 0}, # C9H11NO3
|
320
|
+
:P => {:C => 5, :H => 9 , :O => 2, :N => 1, :S => 0}, # C5H9NO2
|
321
|
+
:H => {:C => 6, :H => 9 , :O => 2, :N => 3, :S => 0}, # C6H9N3O2
|
322
|
+
:E => {:C => 5, :H => 9 , :O => 4, :N => 1, :S => 0}, # C5H9NO4
|
323
|
+
:Q => {:C => 5, :H => 10, :O => 3, :N => 2, :S => 0}, # C5H10N2O3
|
324
|
+
:D => {:C => 4, :H => 7 , :O => 4, :N => 1, :S => 0}, # C4H7NO4
|
325
|
+
:N => {:C => 4, :H => 8 , :O => 3, :N => 2, :S => 0}, # C4H8N2O3
|
326
|
+
:K => {:C => 6, :H => 14, :O => 2, :N => 2, :S => 0}, # C6H14N2O2
|
327
|
+
:R => {:C => 6, :H => 14, :O => 2, :N => 4, :S => 0}, # C6H14N4O2
|
328
|
+
}
|
329
|
+
|
330
|
+
##
|
331
|
+
#
|
332
|
+
# pK values from Bjellqvist et al. {[13]}[rdoc-label:13].
|
333
|
+
# Taking into account the decrease in pK differences
|
334
|
+
# between acids and bases when going from water
|
335
|
+
# to 8 M urea, a value of 7.5 has been assigned to the
|
336
|
+
# N-terminal residue.
|
337
|
+
#
|
338
|
+
PK = {
|
339
|
+
:cterm => {
|
340
|
+
:normal => 3.55, :D => 4.55, :E => 4.75
|
341
|
+
},
|
342
|
+
:nterm => {
|
343
|
+
:A => 7.59, :M => 7.00, :S => 6.93, :P => 8.36,
|
344
|
+
:T => 6.82, :V => 7.44, :E => 7.70 , :G => 7.50
|
345
|
+
},
|
346
|
+
:internal => {
|
347
|
+
:D => 4.05, :E => 4.45, :H => 5.98, :C => 9.0,
|
348
|
+
:Y => 10.0, :K => 10.0, :R => 12.0
|
349
|
+
}
|
350
|
+
}
|
351
|
+
|
352
|
+
def initialize(seq)
|
353
|
+
if seq.kind_of?(String) && Bio::Sequence.guess(seq) == Bio::Sequence::AA
|
354
|
+
# TODO: has issue.
|
355
|
+
@seq = Bio::Sequence::AA.new seq
|
356
|
+
elsif seq.kind_of? Bio::Sequence::AA
|
357
|
+
@seq = seq
|
358
|
+
elsif seq.kind_of?(Bio::Sequence) &&
|
359
|
+
seq.guess.kind_of?(Bio::Sequence::AA)
|
360
|
+
@seq = seq.guess
|
361
|
+
else
|
362
|
+
raise ArgumentError, "sequence must be an AA sequence"
|
363
|
+
end
|
364
|
+
end
|
365
|
+
|
366
|
+
##
|
367
|
+
#
|
368
|
+
# Return the number of negative amino acids (D and E) in an AA sequence.
|
369
|
+
#
|
370
|
+
def num_neg
|
371
|
+
@num_neg ||= @seq.count("DE")
|
372
|
+
end
|
373
|
+
|
374
|
+
##
|
375
|
+
#
|
376
|
+
# Return the number of positive amino acids (R and K) in an AA sequence.
|
377
|
+
#
|
378
|
+
def num_pos
|
379
|
+
@num_pos ||= @seq.count("RK")
|
380
|
+
end
|
381
|
+
|
382
|
+
##
|
383
|
+
#
|
384
|
+
# Return the number of residues in an AA sequence.
|
385
|
+
#
|
386
|
+
def amino_acid_number
|
387
|
+
@seq.length
|
388
|
+
end
|
389
|
+
|
390
|
+
##
|
391
|
+
#
|
392
|
+
# Return the number of atoms in a sequence. If type is given, return the
|
393
|
+
# number of specific atoms in a sequence.
|
394
|
+
#
|
395
|
+
def total_atoms(type=nil)
|
396
|
+
if !type.nil?
|
397
|
+
type = type.to_sym
|
398
|
+
if /^(?:C|H|O|N|S){1}$/ !~ type.to_s
|
399
|
+
raise ArgumentError, "type must be C/H/O/N/S/nil(all)"
|
400
|
+
end
|
401
|
+
end
|
402
|
+
num_atom = {:C => 0,
|
403
|
+
:H => 0,
|
404
|
+
:O => 0,
|
405
|
+
:N => 0,
|
406
|
+
:S => 0}
|
407
|
+
each_aa do |aa|
|
408
|
+
ATOM[aa].each do |t, num|
|
409
|
+
num_atom[t] += num
|
410
|
+
end
|
411
|
+
end
|
412
|
+
num_atom[:H] = num_atom[:H] - 2 * (amino_acid_number - 1)
|
413
|
+
num_atom[:O] = num_atom[:O] - (amino_acid_number - 1)
|
414
|
+
if type.nil?
|
415
|
+
num_atom.values.inject(0) { |sum, num| sum + num }
|
416
|
+
else
|
417
|
+
num_atom[type]
|
418
|
+
end
|
419
|
+
end
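The atom count above sums the formulas of the free amino acids and then removes one water molecule (two H, one O) per peptide bond. A minimal standalone sketch of that bookkeeping follows, assuming a non-empty sequence and a two-entry excerpt of the `ATOM` table; `FORMULA_EXCERPT` and `sketch_atom_counts` are hypothetical names used only for illustration and are not part of the gem.

```ruby
# Free amino acid formulas for two residues, copied from the ATOM table.
FORMULA_EXCERPT = {
  :G => {:C => 2, :H => 5, :O => 2, :N => 1, :S => 0},
  :A => {:C => 3, :H => 7, :O => 2, :N => 1, :S => 0}
}

# Hypothetical helper: per-element atom counts for a non-empty peptide.
def sketch_atom_counts(sequence)
  counts = Hash.new(0)
  sequence.each_char do |aa|
    FORMULA_EXCERPT.fetch(aa.to_sym).each { |element, n| counts[element] += n }
  end
  # Each peptide bond releases one water molecule: two H and one O.
  bonds = sequence.length - 1
  counts[:H] -= 2 * bonds
  counts[:O] -= bonds
  counts
end

counts = sketch_atom_counts("GA")
total = counts.values.inject(0) { |sum, n| sum + n }
```

For the dipeptide Gly-Ala this gives the formula C5H10N2O3, which is the sum of both free amino acids minus one water.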
|
420
|
+
|
421
|
+
##
|
422
|
+
#
|
423
|
+
# Return the number of carbons.
|
424
|
+
#
|
425
|
+
def num_carbon
|
426
|
+
@num_carbon ||= total_atoms :C
|
427
|
+
end
|
428
|
+
|
429
|
+
def num_hydrogen
|
430
|
+
@num_hydrogen ||= total_atoms :H
|
431
|
+
end
|
432
|
+
|
433
|
+
##
|
434
|
+
#
|
435
|
+
# Return the number of nitrogens.
|
436
|
+
#
|
437
|
+
def num_nitro
|
438
|
+
@num_nitro ||= total_atoms :N
|
439
|
+
end
|
440
|
+
|
441
|
+
##
|
442
|
+
#
|
443
|
+
# Return the number of oxygens.
|
444
|
+
#
|
445
|
+
def num_oxygen
|
446
|
+
@num_oxygen ||= total_atoms :O
|
447
|
+
end
|
448
|
+
|
449
|
+
##
|
450
|
+
#
|
451
|
+
# Return the number of sulphurs.
|
452
|
+
#
|
453
|
+
def num_sulphur
|
454
|
+
@num_sulphur ||= total_atoms :S
|
455
|
+
end
|
456
|
+
|
457
|
+
##
|
458
|
+
#
|
459
|
+
# Calculate molecular weight of an AA sequence.
|
460
|
+
#
|
461
|
+
# _Protein Mw is calculated by the addition of average isotopic masses of
|
462
|
+
# amino acids in the protein and the average isotopic mass of one water
|
463
|
+
# molecule._
|
464
|
+
#
|
465
|
+
def molecular_weight
|
466
|
+
@mw ||= begin
|
467
|
+
mass = WATER_MASS
|
468
|
+
each_aa do |aa|
|
469
|
+
mass += AVERAGE_MASS[aa.to_sym]
|
470
|
+
end
|
471
|
+
(mass * 10).floor().to_f / 10
|
472
|
+
end
|
473
|
+
end
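The molecular weight is the sum of the average residue masses plus one water molecule, truncated to one decimal place. The sketch below reproduces that arithmetic on its own, assuming a two-entry excerpt of `AVERAGE_MASS`; `RESIDUE_MASS_EXCERPT`, `WATER_MASS_SKETCH` and `sketch_molecular_weight` are hypothetical names, not part of the gem.

```ruby
# Average residue masses for two amino acids, copied from AVERAGE_MASS.
RESIDUE_MASS_EXCERPT = { :G => 57.0519, :A => 71.0788 }
# Average mass of one water molecule, as in WATER_MASS.
WATER_MASS_SKETCH = 18.01524

# Hypothetical helper: residue masses plus one water, truncated to 0.1 Da.
def sketch_molecular_weight(sequence)
  mass = sequence.each_char.inject(WATER_MASS_SKETCH) do |sum, aa|
    sum + RESIDUE_MASS_EXCERPT.fetch(aa.to_sym)
  end
  (mass * 10).floor / 10.0
end

mw = sketch_molecular_weight("GA")
```

Note that the result is floored rather than rounded, mirroring `(mass * 10).floor().to_f / 10` in the method above.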
|
474
|
+
|
475
|
+
##
|
476
|
+
#
|
477
|
+
# Calculate the theoretical pI of an AA sequence with the bisection algorithm.
|
478
|
+
# pK values by Bjellqvist et al. are used to calculate pI.
|
479
|
+
#
|
480
|
+
def theoretical_pI
|
481
|
+
charges = []
|
482
|
+
residue_count().each do |residue|
|
483
|
+
charges << charge_proc(residue[:positive],
|
484
|
+
residue[:pK],
|
485
|
+
residue[:num])
|
486
|
+
end
|
487
|
+
round(solve_pI(charges), 2)
|
488
|
+
end
|
489
|
+
|
490
|
+
##
|
491
|
+
#
|
492
|
+
# Return the estimated half-life of an AA sequence.
|
493
|
+
#
|
494
|
+
# _The half-life is a prediction of the time it takes for half of the
|
495
|
+
# amount of protein in a cell to disappear after its synthesis in the
|
496
|
+
# cell. ProtParam relies on the "N-end rule", which relates the half-life
|
497
|
+
# of a protein to the identity of its N-terminal residue; the prediction
|
498
|
+
# is given for 3 model organisms (human, yeast and E.coli)._
|
499
|
+
#
|
500
|
+
def half_life(species=nil)
|
501
|
+
n_end = @seq[0].chr.to_sym
|
502
|
+
if species
|
503
|
+
HALFLIFE[species][n_end]
|
504
|
+
else
|
505
|
+
{
|
506
|
+
:ecoli => HALFLIFE[:ecoli][n_end],
|
507
|
+
:mammalian => HALFLIFE[:mammalian][n_end],
|
508
|
+
:yeast => HALFLIFE[:yeast][n_end]
|
509
|
+
}
|
510
|
+
end
|
511
|
+
end
|
512
|
+
|
513
|
+
##
|
514
|
+
#
|
515
|
+
# Calculate instability index of an AA sequence.
|
516
|
+
#
|
517
|
+
# _The instability index provides an estimate of the stability of your
|
518
|
+
# protein in a test tube. Statistical analysis of 12 unstable and 32
|
519
|
+
# stable proteins has revealed [7] that there are certain dipeptides, the
|
520
|
+
# occurrence of which is significantly different in the unstable proteins
|
521
|
+
# compared with those in the stable ones. The authors of this method have
|
522
|
+
# assigned a weight value of instability to each of the 400 different
|
523
|
+
# dipeptides (DIWV)._
|
524
|
+
#
|
525
|
+
def instability_index
|
526
|
+
@instability_index ||=
|
527
|
+
begin
|
528
|
+
instability_sum = 0.0
|
529
|
+
i = 0
|
530
|
+
while @seq[i+1] != nil
|
531
|
+
aa, next_aa = [@seq[i].chr.to_sym, @seq[i+1].chr.to_sym]
|
532
|
+
if DIWV.key?(aa) && DIWV[aa].key?(next_aa)
|
533
|
+
instability_sum += DIWV[aa][next_aa]
|
534
|
+
end
|
535
|
+
i += 1
|
536
|
+
end
|
537
|
+
round((10.0/amino_acid_number.to_f) * instability_sum, 2)
|
538
|
+
end
|
539
|
+
end
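The instability index is (10 / L) times the sum of the DIWV weights over all adjacent residue pairs, where L is the sequence length. A minimal standalone sketch, assuming a four-entry excerpt of the `DIWV` table; `DIWV_EXCERPT` and `sketch_instability_index` are hypothetical names used only for illustration.

```ruby
# Four dipeptide weights copied from the DIWV table (first residue => second).
DIWV_EXCERPT = {
  :W => { :W => 1.0, :C => 1.0 },
  :C => { :W => 24.68, :C => 1.0 }
}

# Hypothetical helper: (10 / L) * sum of weights over adjacent residue pairs.
def sketch_instability_index(sequence)
  residues = sequence.each_char.map { |aa| aa.to_sym }
  weight_sum = residues.each_cons(2).inject(0.0) do |sum, (first, second)|
    sum + DIWV_EXCERPT.fetch(first).fetch(second)
  end
  ((10.0 / residues.length) * weight_sum).round(2)
end

ii = sketch_instability_index("WCW")
```

Unlike the method above, this sketch raises `KeyError` for a pair missing from the table instead of skipping it; that is a deliberate simplification, not the gem's behavior.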
|
540
|
+
|
541
|
+
##
|
542
|
+
#
|
543
|
+
# Return whether the sequence is stable, as a String ("stable" or "unstable").
|
544
|
+
#
|
545
|
+
# _A protein whose instability index is smaller than 40 is predicted as
|
546
|
+
# stable, a value above 40 predicts that the protein may be unstable._
|
547
|
+
#
|
548
|
+
#
|
549
|
+
def stability
|
550
|
+
(instability_index <= 40) ? "stable" : "unstable"
|
551
|
+
end
|
552
|
+
|
553
|
+
##
|
554
|
+
#
|
555
|
+
# Return true if the sequence is stable.
|
556
|
+
#
|
557
|
+
def stable?
|
558
|
+
instability_index <= 40
|
559
|
+
end
|
560
|
+
|
561
|
+
##
|
562
|
+
#
|
563
|
+
# Calculate aliphatic index of an AA sequence.
|
564
|
+
#
|
565
|
+
# _The aliphatic index of a protein is defined as the relative volume
|
566
|
+
# occupied by aliphatic side chains (alanine, valine, isoleucine, and
|
567
|
+
# leucine). It may be regarded as a positive factor for the increase of
|
568
|
+
# thermostability of globular proteins._
|
569
|
+
#
|
570
|
+
def aliphatic_index
|
571
|
+
aa_map = aa_comp_map
|
572
|
+
@aliphatic_index ||= round(aa_map.fetch(:A, 0.0) +
|
573
|
+
2.9 * aa_map.fetch(:V, 0.0) +
|
574
|
+
(3.9 * (aa_map.fetch(:I, 0.0) + aa_map.fetch(:L, 0.0))), 2)
|
575
|
+
end
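The aliphatic index combines the mole percentages of Ala, Val, Ile and Leu as A + 2.9 * V + 3.9 * (I + L). The sketch below computes it directly from a string, assuming a non-empty sequence; `sketch_aliphatic_index` is a hypothetical helper, and residues that do not occur simply contribute 0.0.

```ruby
# Hypothetical helper: aliphatic index from mole percentages of A, V, I and L.
def sketch_aliphatic_index(sequence)
  # Missing residues read as 0.0 thanks to the hash default.
  percent = Hash.new(0.0)
  sequence.each_char { |aa| percent[aa.to_sym] += 100.0 / sequence.length }
  (percent[:A] + 2.9 * percent[:V] + 3.9 * (percent[:I] + percent[:L])).round(2)
end

ai = sketch_aliphatic_index("AVIL")
no_aliphatic = sketch_aliphatic_index("GG")
```

With one residue of each kind, every mole percentage is 25, so the index is 25 + 72.5 + 195 = 292.5.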
|
576
|
+
|
577
|
+
##
|
578
|
+
#
|
579
|
+
# Calculate GRAVY score of an AA sequence.
|
580
|
+
#
|
581
|
+
# _The GRAVY(Grand Average of Hydropathy) value for a peptide or protein
|
582
|
+
# is calculated as the sum of hydropathy values [9] of all the amino acids,
|
583
|
+
# divided by the number of residues in the sequence._
|
584
|
+
#
|
585
|
+
def gravy
|
586
|
+
@gravy ||= begin
|
587
|
+
hydropathy_sum = 0.0
|
588
|
+
each_aa do |aa|
|
589
|
+
hydropathy_sum += HYDROPATHY[aa]
|
590
|
+
end
|
591
|
+
round(hydropathy_sum / @seq.length.to_f, 3)
|
592
|
+
end
|
593
|
+
end
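GRAVY is the mean hydropathy value over all residues. A minimal standalone sketch, assuming a non-empty sequence and a three-entry excerpt of the `HYDROPATHY` table; `HYDROPATHY_EXCERPT` and `sketch_gravy` are hypothetical names used only for illustration.

```ruby
# Three hydropathy values copied from the HYDROPATHY table.
HYDROPATHY_EXCERPT = { :I => 4.5, :V => 4.2, :R => -4.5 }

# Hypothetical helper: sum of hydropathy values divided by sequence length.
def sketch_gravy(sequence)
  total = sequence.each_char.inject(0.0) do |sum, aa|
    sum + HYDROPATHY_EXCERPT.fetch(aa.to_sym)
  end
  (total / sequence.length).round(3)
end

gravy_value = sketch_gravy("IVR")
```

Here the Ile and Arg values cancel, leaving 4.2 / 3 = 1.4.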
|
594
|
+
|
595
|
+
##
|
596
|
+
#
|
597
|
+
# Calculate the percentage composition of an AA sequence as a Hash object.
|
598
|
+
# Returns the percentage of the given amino acid if aa_code is not nil.
|
599
|
+
#
|
600
|
+
def aa_comp(aa_code=nil)
|
601
|
+
if aa_code.nil?
|
602
|
+
aa_map = {}
|
603
|
+
IUPAC_CODE.keys.each do |k|
|
604
|
+
aa_map[k] = 0.0
|
605
|
+
end
|
606
|
+
aa_map.update(aa_comp_map){|k,_,v| round(v, 1) }
|
607
|
+
else
|
608
|
+
round(aa_comp_map.fetch(aa_code.to_sym, 0.0), 1)
|
609
|
+
end
|
610
|
+
end
|
611
|
+
|
612
|
+
private
|
613
|
+
|
614
|
+
def aa_comp_map
|
615
|
+
@aa_comp_map ||=
|
616
|
+
begin
|
617
|
+
aa_map = {}
|
618
|
+
aa_comp = {}
|
619
|
+
sum = 0
|
620
|
+
each_aa do |aa|
|
621
|
+
if aa_map.key? aa
|
622
|
+
aa_map[aa] += 1
|
623
|
+
else
|
624
|
+
aa_map[aa] = 1
|
625
|
+
end
|
626
|
+
sum += 1
|
627
|
+
end
|
628
|
+
aa_map.each {|aa, count| aa_comp[aa] = (Rational(count,sum) * 100).to_f }
|
629
|
+
aa_comp
|
630
|
+
end
|
631
|
+
end
|
632
|
+
|
633
|
+
def each_aa
|
634
|
+
@seq.each_byte do |x|
|
635
|
+
yield x.chr.to_sym
|
636
|
+
end
|
637
|
+
end
|
638
|
+
|
639
|
+
def positive? residue
|
640
|
+
(residue == "H" || residue == "R" || residue == "K")
|
641
|
+
end
|
642
|
+
|
643
|
+
#
|
644
|
+
# Return a lambda that calculates the charge of a residue group at a given pH.
|
645
|
+
#
|
646
|
+
def charge_proc positive, pK, num
|
647
|
+
if positive
|
648
|
+
lambda {|ph|
|
649
|
+
num.to_f / (1.0 + 10.0 ** (ph - pK))
|
650
|
+
}
|
651
|
+
else
|
652
|
+
lambda {|ph|
|
653
|
+
(-1.0 * num.to_f) / (1.0 + 10.0 ** (pK - ph))
|
654
|
+
}
|
655
|
+
end
|
656
|
+
end
|
657
|
+
|
658
|
+
#
|
659
|
+
# Transform AA sequence into residue count
|
660
|
+
#
|
661
|
+
def residue_count
|
662
|
+
counted = []
|
663
|
+
# N-terminal
|
664
|
+
n_term = @seq[0].chr
|
665
|
+
if PK[:nterm].key? n_term.to_sym
|
666
|
+
counted << {
|
667
|
+
:num => 1,
|
668
|
+
:residue => n_term.to_sym,
|
669
|
+
:pK => PK[:nterm][n_term.to_sym],
|
670
|
+
:positive => positive?(n_term)
|
671
|
+
}
|
672
|
+
elsif PK[:normal].key? n_term.to_sym
|
673
|
+
counted << {
|
674
|
+
:num => 1,
|
675
|
+
:residue => n_term.to_sym,
|
676
|
+
:pK => PK[:normal][n_term.to_sym],
|
677
|
+
:positive => positive?(n_term)
|
678
|
+
}
|
679
|
+
end
|
680
|
+
# Internal
|
681
|
+
tmp_internal = {}
|
682
|
+
@seq[1,(@seq.length-2)].each_byte do |x|
|
683
|
+
aa = x.chr.to_sym
|
684
|
+
if PK[:internal].key? aa
|
685
|
+
if tmp_internal.key? aa
|
686
|
+
tmp_internal[aa][:num] += 1
|
687
|
+
else
|
688
|
+
tmp_internal[aa] = {
|
689
|
+
:num => 1,
|
690
|
+
:residue => aa,
|
691
|
+
:pK => PK[:internal][aa],
|
692
|
+
:positive => positive?(aa.to_s)
|
693
|
+
}
|
694
|
+
end
|
695
|
+
end
|
696
|
+
end
|
697
|
+
tmp_internal.each do |aa, val|
|
698
|
+
counted << val
|
699
|
+
end
|
700
|
+
# C-terminal
|
701
|
+
c_term = @seq[-1].chr
|
702
|
+
if PK[:cterm].key? c_term.to_sym
|
703
|
+
counted << {
|
704
|
+
:num => 1,
|
705
|
+
:residue => c_term.to_sym,
|
706
|
+
:pK => PK[:cterm][c_term.to_sym],
|
707
|
+
:positive => positive?(c_term)
|
708
|
+
}
|
709
|
+
end
|
710
|
+
counted
|
711
|
+
end
|
712
|
+
|
713
|
+
#
|
714
|
+
# Solving pI value with bisect algorithm.
|
715
|
+
#
|
716
|
+
def solve_pI charges
|
717
|
+
state = {
|
718
|
+
:ph => 0.0,
|
719
|
+
:charges => charges,
|
720
|
+
:pI => nil,
|
721
|
+
:ph_prev => 0.0,
|
722
|
+
:ph_next => 14.0,
|
723
|
+
:net_charge => 0.0
|
724
|
+
}
|
725
|
+
error = false
|
726
|
+
# epsilon is the precision of the result [pI = pH +/- epsilon]
|
727
|
+
epsilon = 0.001
|
728
|
+
|
729
|
+
loop do
|
730
|
+
# Reset net charge
|
731
|
+
state[:net_charge] = 0.0
|
732
|
+
# Calculate net charge
|
733
|
+
state[:charges].each do |charge_proc|
|
734
|
+
state[:net_charge] += charge_proc.call state[:ph]
|
735
|
+
end
|
736
|
+
|
737
|
+
# Something is wrong: pH has reached 14 or higher
|
738
|
+
if state[:ph] >= 14.0
|
739
|
+
error = true
|
740
|
+
break
|
741
|
+
end
|
742
|
+
|
743
|
+
# Bisect: lower the pH if the net charge is non-positive, otherwise raise it
|
744
|
+
temp_ph = 0.0
|
745
|
+
if state[:net_charge] <= 0.0
|
746
|
+
temp_ph = state[:ph]
|
747
|
+
state[:ph] = state[:ph] - ((state[:ph] - state[:ph_prev]) / 2.0)
|
748
|
+
state[:ph_next] = temp_ph
|
749
|
+
else
|
750
|
+
temp_ph = state[:ph]
|
751
|
+
state[:ph] = state[:ph] + ((state[:ph_next] - state[:ph]) / 2.0)
|
752
|
+
state[:ph_prev] = temp_ph
|
753
|
+
end
|
754
|
+
|
755
|
+
if (state[:ph] - state[:ph_prev] < epsilon) &&
|
756
|
+
(state[:ph_next] - state[:ph] < epsilon)
|
757
|
+
state[:pI] = state[:ph]
|
758
|
+
break
|
759
|
+
end
|
760
|
+
end
|
761
|
+
|
762
|
+
if !state[:pI].nil? && !error
|
763
|
+
state[:pI]
|
764
|
+
else
|
765
|
+
raise "Failed to Calc pI: pH is higher than 14"
|
766
|
+
end
|
767
|
+
end
|
768
|
+
|
769
|
+
def round(num, ndigits=0)
|
770
|
+
(num * (10 ** ndigits)).round().to_f / (10 ** ndigits).to_f
|
771
|
+
end
|
772
|
+
|
773
|
+
# --------------------------------
|
774
|
+
# :section: References
|
775
|
+
#
|
776
|
+
#
|
777
|
+
# 1. Protein Identification and Analysis Tools on the ExPASy Server;
|
778
|
+
# Gasteiger E., Hoogland C., Gattiker A., Duvaud S., Wilkins M.R.,
|
779
|
+
# Appel R.D., Bairoch A.; (In) John M. Walker (ed): The Proteomics
|
780
|
+
# Protocols Handbook, Humana Press (2005). pp. 571-607
|
781
|
+
# 2. Pace, C.N., Vajdos, F., Fee, L., Grimsley, G., and Gray, T. (1995)
|
782
|
+
# How to measure and predict the molar absorption coefficient of a
|
783
|
+
# protein. Protein Sci. 4, 2411-2423.
|
784
|
+
# 3. Edelhoch, H. (1967) Spectroscopic determination of tryptophan and
|
785
|
+
# tyrosine in proteins. Biochemistry 6, 1948-1954.
|
786
|
+
# 4. Gill, S.C. and von Hippel, P.H. (1989) Calculation of protein
|
787
|
+
# extinction coefficients from amino acid sequence data. Anal. Biochem.
|
788
|
+
# 182, 319-326.
|
789
|
+
# 5. Bachmair, A., Finley, D. and Varshavsky, A. (1986) In vivo half-life
|
790
|
+
# of a protein is a function of its amino-terminal residue. Science 234,
|
791
|
+
# 179-186.
|
792
|
+
# 6. Gonda, D.K., Bachmair, A., Wunning, I., Tobias, J.W., Lane, W.S. and
|
793
|
+
# Varshavsky, A. J. (1989) Universality and structure of the N-end rule.
|
794
|
+
# J. Biol. Chem. 264, 16700-16712.
|
795
|
+
# 7. Tobias, J.W., Shrader, T.E., Rocap, G. and Varshavsky, A. (1991) The
|
796
|
+
# N-end rule in bacteria. Science 254, 1374-1377.
|
797
|
+
# 8. Ciechanover, A. and Schwartz, A.L. (1989) How are substrates
|
798
|
+
# recognized by the ubiquitin-mediated proteolytic system? Trends Biochem.
|
799
|
+
# Sci. 14, 483-488.
|
800
|
+
# 9. Varshavsky, A. (1997) The N-end rule pathway of protein degradation.
|
801
|
+
# Genes Cells 2, 13-28.
|
802
|
+
# 10. Guruprasad, K., Reddy, B.V.B. and Pandit, M.W. (1990) Correlation
|
803
|
+
# between stability of a protein and its dipeptide composition: a novel
|
804
|
+
# approach for predicting in vivo stability of a protein from its primary
|
805
|
+
# sequence. Protein Eng. 4, 155-161.
|
806
|
+
# 11. Ikai, A.J. (1980) Thermostability and aliphatic index of globular
|
807
|
+
# proteins. J. Biochem. 88, 1895-1898.
|
808
|
+
# 12. Kyte, J. and Doolittle, R.F. (1982) A simple method for displaying
|
809
|
+
# the hydropathic character of a protein. J. Mol. Biol. 157, 105-132.
|
810
|
+
# 13. Bjellqvist, B., Hughes, G.J., Pasquali, Ch., Paquet, N., Ravier, F.,
|
811
|
+
# Sanchez, J.-Ch., Frutiger, S. and Hochstrasser, D.F. (1993) The focusing positions
|
812
|
+
# of polypeptides in immobilized pH gradients can be predicted from their
|
813
|
+
# amino acid sequences. Electrophoresis 14, 1023-1031.
|
814
|
+
#
|
815
|
+
# --------------------------------
|
816
|
+
end
|
817
|
+
end
|
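The pI search in `solve_pI` above can be illustrated with a minimal standalone sketch. It assumes each ionizable group contributes a Henderson-Hasselbalch partial charge, as the two lambdas near the top of this section do, and that the net charge decreases monotonically with pH. The helper names `charge_proc`, `net_charge` and `bisect_pi`, and the pK values 9.0 and 3.0, are hypothetical and chosen only for illustration; they are not the gem's API or its `PK` table, and the loop is a simplified bisection rather than a copy of the gem's state hash.

```ruby
# Hypothetical helper: partial charge of `num` identical groups with the
# given pK, as a function of pH (same formulas as the lambdas in the diff).
def charge_proc(num, pk, positive)
  if positive
    # Basic group: close to +num well below pK, close to 0 well above it.
    lambda { |ph| num.to_f / (1.0 + 10.0**(ph - pk)) }
  else
    # Acidic group: close to 0 well below pK, close to -num well above it.
    lambda { |ph| -num.to_f / (1.0 + 10.0**(pk - ph)) }
  end
end

# Sum of all partial charges at a given pH.
def net_charge(charges, ph)
  charges.inject(0.0) { |sum, charge| sum + charge.call(ph) }
end

# Simplified bisection over pH 0..14: halve the interval until it is
# narrower than epsilon, keeping the zero of the net charge inside it.
def bisect_pi(charges, epsilon = 0.001)
  low = 0.0
  high = 14.0
  while high - low > epsilon
    mid = (low + high) / 2.0
    if net_charge(charges, mid) > 0.0
      low = mid   # still positively charged: pI lies at higher pH
    else
      high = mid  # neutral or negative: pI lies at lower pH
    end
  end
  (low + high) / 2.0
end

# Illustrative pK values only (NOT taken from the gem's PK table):
# one basic group with pK 9.0 and one acidic group with pK 3.0.
charges = [charge_proc(1, 9.0, true), charge_proc(1, 3.0, false)]
pi_value = bisect_pi(charges)
puts format("estimated pI: %.2f", pi_value)
```

With one basic and one acidic group the two partial charges cancel midway between the pK values, so the estimate should land near pH 6. A real sequence would supply one charge function per entry returned by `residue_count`.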