bio-protparam 0.1.0
- data/.document +5 -0
- data/.travis.yml +12 -0
- data/Gemfile +9 -0
- data/LICENSE.txt +20 -0
- data/README.md +47 -0
- data/README.rdoc +48 -0
- data/Rakefile +42 -0
- data/VERSION +1 -0
- data/lib/bio-protparam.rb +12 -0
- data/lib/bio/util/protparam.rb +817 -0
- data/test/data/uniprot/p53_human.uniprot +1456 -0
- data/test/helper.rb +20 -0
- data/test/test_bio-protparam.rb +121 -0
- metadata +130 -0
data/.document
ADDED
data/.travis.yml
ADDED
@@ -0,0 +1,12 @@
|
|
1
|
+
language: ruby
|
2
|
+
rvm:
|
3
|
+
- 1.9.2
|
4
|
+
- 1.9.3
|
5
|
+
- jruby-19mode # JRuby in 1.9 mode
|
6
|
+
- rbx-19mode
|
7
|
+
# - 1.8.7
|
8
|
+
# - jruby-18mode # JRuby in 1.8 mode
|
9
|
+
# - rbx-18mode
|
10
|
+
|
11
|
+
# uncomment this line if your project needs to run something other than `rake`:
|
12
|
+
# script: bundle exec rspec spec
|
data/Gemfile
ADDED
data/LICENSE.txt
ADDED
@@ -0,0 +1,20 @@
|
|
1
|
+
Copyright (c) 2012 hryk <hiroyuki@1vq9.com>
|
2
|
+
|
3
|
+
Permission is hereby granted, free of charge, to any person obtaining
|
4
|
+
a copy of this software and associated documentation files (the
|
5
|
+
"Software"), to deal in the Software without restriction, including
|
6
|
+
without limitation the rights to use, copy, modify, merge, publish,
|
7
|
+
distribute, sublicense, and/or sell copies of the Software, and to
|
8
|
+
permit persons to whom the Software is furnished to do so, subject to
|
9
|
+
the following conditions:
|
10
|
+
|
11
|
+
The above copyright notice and this permission notice shall be
|
12
|
+
included in all copies or substantial portions of the Software.
|
13
|
+
|
14
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
|
15
|
+
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
|
16
|
+
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
17
|
+
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
|
18
|
+
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
|
19
|
+
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
|
20
|
+
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
data/README.md
ADDED
@@ -0,0 +1,47 @@
|
|
1
|
+
# bio-protparam
|
2
|
+
|
3
|
+
[![Build Status](https://secure.travis-ci.org/hryk/bioruby-protparam.png)](http://travis-ci.org/hryk/bioruby-protparam)
|
4
|
+
|
5
|
+
`bio-protparam` adds the Bio::Protparam class. Bio::Protparam has the same interface and
|
6
|
+
function as the Bio::Tools::Protparam class of BioPerl, except that it calculates
|
7
|
+
parameters locally instead of sending a query to the ExPASy ProtParam tool.
|
8
|
+
|
9
|
+
**Note: this software is under active development!**
|
10
|
+
|
11
|
+
## Installation
|
12
|
+
|
13
|
+
```sh
|
14
|
+
gem install bio-protparam
|
15
|
+
```
|
16
|
+
|
17
|
+
## Usage
|
18
|
+
|
19
|
+
```ruby
|
20
|
+
require 'bio-protparam'
|
21
|
+
|
22
|
+
protparam = Bio::Protparam.new("MYNNYNLCHIRTINWEEIITGPSAMYSYVY...")
|
23
|
+
# Return Mw
|
24
|
+
protparam.molecular_weight
|
25
|
+
# Return pI
|
26
|
+
protparam.theoretical_pI
|
27
|
+
|
28
|
+
```
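
As a rough illustration of what `molecular_weight` computes, the sketch below sums average residue masses and adds one water molecule, then truncates to one decimal place. It is a standalone approximation, not the gem's code: the helper name `sketch_molecular_weight` and the constant names are hypothetical, and only three residues are included, with mass values copied from the `AVERAGE_MASS` table in `lib/bio/util/protparam.rb`.

```ruby
# Average residue masses (Da), copied from the AVERAGE_MASS table in
# lib/bio/util/protparam.rb. Only the residues used below are listed.
SKETCH_AVERAGE_MASS = {
  "A" => 71.0788,
  "G" => 57.0519,
  "S" => 87.0782
}.freeze

# Average mass of one water molecule, as in the gem's WATER_MASS constant.
SKETCH_WATER_MASS = 18.01524

# Hypothetical helper, not part of the gem's API.
def sketch_molecular_weight(sequence)
  mass = sequence.each_char.inject(SKETCH_WATER_MASS) do |sum, aa|
    sum + SKETCH_AVERAGE_MASS.fetch(aa)
  end
  # Truncate (not round) to one decimal place, mirroring the gem's method.
  (mass * 10).floor / 10.0
end

puts sketch_molecular_weight("GAS")
```

`fetch` is used so that a residue missing from the reduced table raises a `KeyError` rather than silently contributing nothing.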
|
29
|
+
|
30
|
+
The API doc is on [rdoc.info](http://rdoc.info/github/hryk/bioruby-protparam/). For
|
31
|
+
more code examples see the test files in the source tree.
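
The source describes `theoretical_pI` as a bisection search for the pH at which the net charge is zero. The following is a minimal sketch of that idea, assuming a simplified model: the two ionizable groups are made up for illustration, the names `sketch_net_charge` and `sketch_pi` are hypothetical, and the loop is a plain textbook bisection rather than a copy of the gem's `solve_pI`.

```ruby
# Hypothetical ionizable groups for illustration: one basic group (pK 10.0)
# and one acidic group (pK 4.0). These are not taken from a real protein.
SKETCH_GROUPS = [
  { positive: true,  pk: 10.0, num: 1 },
  { positive: false, pk: 4.0,  num: 1 }
].freeze

# Net charge at a given pH, using the same two expressions as the gem's
# charge_proc: basic groups contribute +n / (1 + 10^(pH - pK)),
# acidic groups contribute -n / (1 + 10^(pK - pH)).
def sketch_net_charge(groups, ph)
  groups.inject(0.0) do |sum, g|
    if g[:positive]
      sum + g[:num] / (1.0 + 10.0**(ph - g[:pk]))
    else
      sum - g[:num] / (1.0 + 10.0**(g[:pk] - ph))
    end
  end
end

# Plain bisection on [0, 14]: net charge falls monotonically as pH rises.
def sketch_pi(groups, epsilon = 0.001)
  low = 0.0
  high = 14.0
  while high - low > epsilon
    mid = (low + high) / 2.0
    if sketch_net_charge(groups, mid) > 0.0
      low = mid   # still positively charged: pI lies at a higher pH
    else
      high = mid  # neutral or negative: pI lies at or below this pH
    end
  end
  (low + high) / 2.0
end

puts sketch_pi(SKETCH_GROUPS).round(2)
```

Because the two hypothetical pK values are symmetric around 7, the result should land within `epsilon` of pH 7, which makes the sketch easy to sanity-check by hand.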
|
32
|
+
|
33
|
+
## Cite
|
34
|
+
|
35
|
+
If you use this software, please cite one of
|
36
|
+
|
37
|
+
* [BioRuby: bioinformatics software for the Ruby programming language](http://dx.doi.org/10.1093/bioinformatics/btq475)
|
38
|
+
* [Biogem: an effective tool-based approach for scaling up open source software development in bioinformatics](http://dx.doi.org/10.1093/bioinformatics/bts080)
|
39
|
+
|
40
|
+
## Biogems.info
|
41
|
+
|
42
|
+
This Biogem is published at [biogems.info](http://biogems.info/index.html#bio-protparam)
|
43
|
+
|
44
|
+
## Copyright
|
45
|
+
|
46
|
+
Copyright (c) 2012 hryk. See LICENSE.txt for further details.
|
47
|
+
|
data/README.rdoc
ADDED
@@ -0,0 +1,48 @@
|
|
1
|
+
= bio-protparam
|
2
|
+
|
3
|
+
{<img
|
4
|
+
src="https://secure.travis-ci.org/hryk/bioruby-protparam.png"
|
5
|
+
/>}[http://travis-ci.org/#!/hryk/bioruby-protparam]
|
6
|
+
|
7
|
+
Full description goes here
|
8
|
+
|
9
|
+
Note: this software is under active development!
|
10
|
+
|
11
|
+
== Installation
|
12
|
+
|
13
|
+
gem install bio-protparam
|
14
|
+
|
15
|
+
== Usage
|
16
|
+
|
17
|
+
== Developers
|
18
|
+
|
19
|
+
To use the library
|
20
|
+
|
21
|
+
require 'bio-protparam'
|
22
|
+
|
23
|
+
The API doc is online. For more code examples see also the test files in
|
24
|
+
the source tree.
|
25
|
+
|
26
|
+
== Project home page
|
27
|
+
|
28
|
+
Information on the source tree, documentation, issues and how to contribute, see
|
29
|
+
|
30
|
+
http://github.com/hryk/bioruby-protparam
|
31
|
+
|
32
|
+
The BioRuby community is on IRC server: irc.freenode.org, channel: #bioruby.
|
33
|
+
|
34
|
+
== Cite
|
35
|
+
|
36
|
+
If you use this software, please cite one of
|
37
|
+
|
38
|
+
* [BioRuby: bioinformatics software for the Ruby programming language](http://dx.doi.org/10.1093/bioinformatics/btq475)
|
39
|
+
* [Biogem: an effective tool-based approach for scaling up open source software development in bioinformatics](http://dx.doi.org/10.1093/bioinformatics/bts080)
|
40
|
+
|
41
|
+
== Biogems.info
|
42
|
+
|
43
|
+
This Biogem is published at http://biogems.info/index.html#bio-protparam
|
44
|
+
|
45
|
+
== Copyright
|
46
|
+
|
47
|
+
Copyright (c) 2012 hryk. See LICENSE.txt for further details.
|
48
|
+
|
data/Rakefile
ADDED
@@ -0,0 +1,42 @@
|
|
1
|
+
# encoding: utf-8
|
2
|
+
|
3
|
+
require 'rubygems'
|
4
|
+
require 'bundler'
|
5
|
+
begin
|
6
|
+
Bundler.setup(:default, :development)
|
7
|
+
rescue Bundler::BundlerError => e
|
8
|
+
$stderr.puts e.message
|
9
|
+
$stderr.puts "Run `bundle install` to install missing gems"
|
10
|
+
exit e.status_code
|
11
|
+
end
|
12
|
+
require 'rake'
|
13
|
+
|
14
|
+
require 'jeweler'
|
15
|
+
Jeweler::Tasks.new do |gem|
|
16
|
+
gem.name = "bio-protparam"
|
17
|
+
gem.homepage = "http://github.com/hryk/bioruby-protparam"
|
18
|
+
gem.license = "MIT"
|
19
|
+
gem.summary = %Q{A Protparam compatible utility for BioRuby.}
|
20
|
+
gem.description = %Q{Bio::Protparam has the same interface and function as the Bio::Tools::Protparam class of BioPerl, except that it calculates parameters locally instead of sending a query to the ExPASy ProtParam tool.}
|
21
|
+
gem.email = "hiroyuki@1vq9.com"
|
22
|
+
gem.authors = ["hryk"]
|
23
|
+
end
|
24
|
+
Jeweler::RubygemsDotOrgTasks.new
|
25
|
+
|
26
|
+
require 'rake/testtask'
|
27
|
+
Rake::TestTask.new(:test) do |test|
|
28
|
+
test.libs << 'lib' << 'test'
|
29
|
+
test.pattern = 'test/**/test_*.rb'
|
30
|
+
test.verbose = true
|
31
|
+
end
|
32
|
+
|
33
|
+
task :default => :test
|
34
|
+
|
35
|
+
require 'rdoc/task'
|
36
|
+
Rake::RDocTask.new do |rdoc|
|
37
|
+
version = File.exist?('VERSION') ? File.read('VERSION') : ""
|
38
|
+
rdoc.rdoc_dir = 'rdoc'
|
39
|
+
rdoc.title = "bio-protparam #{version}"
|
40
|
+
rdoc.rdoc_files.include('README*')
|
41
|
+
rdoc.rdoc_files.include('lib/**/*.rb')
|
42
|
+
end
|
data/VERSION
ADDED
@@ -0,0 +1 @@
|
|
1
|
+
0.1.0
|
data/lib/bio-protparam.rb
ADDED
@@ -0,0 +1,12 @@
|
|
1
|
+
# Please require your code below, respecting the naming conventions in the
|
2
|
+
# bioruby directory tree.
|
3
|
+
#
|
4
|
+
# For example, say you have a plugin named bio-plugin, the only uncommented
|
5
|
+
# line in this file would be
|
6
|
+
#
|
7
|
+
# require 'bio/bio-plugin/plugin'
|
8
|
+
#
|
9
|
+
# In this file only require other files. Avoid other source code.
|
10
|
+
|
11
|
+
require 'bio/util/protparam'
|
12
|
+
|
data/lib/bio/util/protparam.rb
ADDED
@@ -0,0 +1,817 @@
|
|
1
|
+
# encoding: utf-8
|
2
|
+
#
|
3
|
+
#
|
4
|
+
# = bio/util/protparam.rb - A class to calculate protein parameters.
|
5
|
+
#
|
6
|
+
# Copyright:: Copyright (C) 2012
|
7
|
+
# Hiroyuki Nakamura <hiroyuki@1vq9.com>
|
8
|
+
# License:: The Ruby License
|
9
|
+
#
|
10
|
+
require 'rational'
|
11
|
+
|
12
|
+
module Bio
|
13
|
+
##
|
14
|
+
# == Description
|
15
|
+
#
|
16
|
+
# Bio::Protparam is a class for calculating protein parameters. This class
|
17
|
+
# has a similar interface to BioPerl's Bio::Tools::Protparam. However, it
|
18
|
+
# calculates parameters locally instead of sending a query to Expasy's {Protparam
|
19
|
+
# tool}[http://web.expasy.org/protparam/]{[1]}[rdoc-label:1] as Bio::Tools::Protparam does.
|
20
|
+
#
|
21
|
+
class Protparam
|
22
|
+
|
23
|
+
# {IUPAC codes}[http://www.bioinformatics.org/sms2/iupac.html] for amino acids.
|
24
|
+
IUPAC_CODE = {
|
25
|
+
:I => "Ile",
|
26
|
+
:V => "Val",
|
27
|
+
:L => "Leu",
|
28
|
+
:F => "Phe",
|
29
|
+
:C => "Cys",
|
30
|
+
:M => "Met",
|
31
|
+
:A => "Ala",
|
32
|
+
:G => "Gly",
|
33
|
+
:T => "Thr",
|
34
|
+
:W => "Trp",
|
35
|
+
:S => "Ser",
|
36
|
+
:Y => "Tyr",
|
37
|
+
:P => "Pro",
|
38
|
+
:H => "His",
|
39
|
+
:E => "Glu",
|
40
|
+
:Q => "Gln",
|
41
|
+
:D => "Asp",
|
42
|
+
:N => "Asn",
|
43
|
+
:K => "Lys",
|
44
|
+
:R => "Arg",
|
45
|
+
:U => "Sec",
|
46
|
+
:O => "Pyl",
|
47
|
+
:B => "Asx",
|
48
|
+
:Z => "Glx",
|
49
|
+
:X => "Xaa"
|
50
|
+
}
|
51
|
+
|
52
|
+
# Dipeptide instability weight value for calculating instability index of proteins {[10]}[rdoc-label:10].
|
53
|
+
DIWV = {
|
54
|
+
:W => {
|
55
|
+
:W => 1.0, :C => 1.0, :M => 24.68, :H => 24.68, :Y => 1.0, :F => 1.0, :Q => 1.0,
|
56
|
+
:N => 13.34, :I => 1.0, :R => 1.0, :D => 1.0, :P => 1.0, :T => -14.03, :K => 1.0,
|
57
|
+
:E => 1.0, :V => -7.49, :S => 1.0, :G => -9.37, :A => -14.03, :L => 13.34
|
58
|
+
},
|
59
|
+
:C => {
|
60
|
+
:W => 24.68, :C => 1.0, :M => 33.6, :H => 33.6, :Y => 1.0, :F => 1.0, :Q => -6.54, :N => 1.0,
|
61
|
+
:I => 1.0, :R => 1.0, :D => 20.26, :P => 20.26, :T => 33.6, :K => 1.0, :E => 1.0, :V => -6.54,
|
62
|
+
:S => 1.0, :G => 1.0, :A => 1.0, :L => 20.26
|
63
|
+
},
|
64
|
+
:M => {
|
65
|
+
:W => 1.0, :C => 1.0, :M => -1.88, :H => 58.28, :Y => 24.68, :F => 1.0, :Q => -6.54,
|
66
|
+
:N => 1.0, :I => 1.0, :R => -6.54, :D => 1.0, :P => 44.94, :T => -1.88, :K => 1.0, :E => 1.0,
|
67
|
+
:V => 1.0, :S => 44.94, :G => 1.0, :A => 13.34, :L => 1.0
|
68
|
+
},
|
69
|
+
:H => {
|
70
|
+
:W => -1.88, :C => 1.0, :M => 1.0, :H => 1.0, :Y => 44.94, :F => -9.37, :Q => 1.0,
|
71
|
+
:N => 24.68, :I => 44.94, :R => 1.0, :D => 1.0, :P => -1.88, :T => -6.54, :K => 24.68,
|
72
|
+
:E => 1.0, :V => 1.0, :S => 1.0, :G => -9.37, :A => 1.0, :L => 1.0
|
73
|
+
},
|
74
|
+
:Y => {
|
75
|
+
:W => -9.37, :C => 1.0, :M => 44.94, :H => 13.34, :Y => 13.34, :F => 1.0, :Q => 1.0,
|
76
|
+
:N => 1.0, :I => 1.0, :R => -15.91, :D => 24.68, :P => 13.34, :T => -7.49, :K => 1.0,
|
77
|
+
:E => -6.54, :V => 1.0, :S => 1.0, :G => -7.49, :A => 24.68, :L => 1.0
|
78
|
+
},
|
79
|
+
:F => {
|
80
|
+
:W => 1.0, :C => 1.0, :M => 1.0, :H => 1.0, :Y => 33.6, :F => 1.0, :Q => 1.0, :N => 1.0,
|
81
|
+
:I => 1.0, :R => 1.0, :D => 13.34, :P => 20.26, :T => 1.0, :K => -14.03, :E => 1.0,
|
82
|
+
:V => 1.0, :S => 1.0, :G => 1.0, :A => 1.0, :L => 1.0
|
83
|
+
},
|
84
|
+
:Q => {
|
85
|
+
:W => 1.0, :C => -6.54, :M => 1.0, :H => 1.0, :Y => -6.54, :F => -6.54, :Q => 20.26,
|
86
|
+
:N => 1.0, :I => 1.0, :R => 1.0, :D => 20.26, :P => 20.26, :T => 1.0, :K => 1.0, :E => 20.26,
|
87
|
+
:V => -6.54, :S => 44.94, :G => 1.0, :A => 1.0, :L => 1.0
|
88
|
+
},
|
89
|
+
:N => {
|
90
|
+
:W => -9.37, :C => -1.88, :M => 1.0, :H => 1.0, :Y => 1.0, :F => -14.03, :Q => -6.54,
|
91
|
+
:N => 1.0, :I => 44.94, :R => 1.0, :D => 1.0, :P => -1.88, :T => -7.49, :K => 24.68,
|
92
|
+
:E => 1.0, :V => 1.0, :S => 1.0, :G => -14.03, :A => 1.0, :L => 1.0
|
93
|
+
},
|
94
|
+
:I => {
|
95
|
+
:W => 1.0, :C => 1.0, :M => 1.0, :H => 13.34, :Y => 1.0, :F => 1.0, :Q => 1.0, :N => 1.0,
|
96
|
+
:I => 1.0, :R => 1.0, :D => 1.0, :P => -1.88, :T => 1.0, :K => -7.49, :E => 44.94,
|
97
|
+
:V => -7.49, :S => 1.0, :G => 1.0, :A => 1.0, :L => 20.26
|
98
|
+
},
|
99
|
+
:R => {
|
100
|
+
:W => 58.28, :C => 1.0, :M => 1.0, :H => 20.26, :Y => -6.54, :F => 1.0, :Q => 20.26,
|
101
|
+
:N => 13.34, :I => 1.0, :R => 58.28, :D => 1.0, :P => 20.26, :T => 1.0, :K => 1.0, :E => 1.0,
|
102
|
+
:V => 1.0, :S => 44.94, :G => -7.49, :A => 1.0, :L => 1.0
|
103
|
+
},
|
104
|
+
:D => {
|
105
|
+
:W => 1.0, :C => 1.0, :M => 1.0, :H => 1.0, :Y => 1.0, :F => -6.54, :Q => 1.0, :N => 1.0,
|
106
|
+
:I => 1.0, :R => -6.54, :D => 1.0, :P => 1.0, :T => -14.03, :K => -7.49, :E => 1.0,
|
107
|
+
:V => 1.0, :S => 20.26, :G => 1.0, :A => 1.0, :L => 1.0
|
108
|
+
},
|
109
|
+
:P => {
|
110
|
+
:W => -1.88, :C => -6.54, :M => -6.54, :H => 1.0, :Y => 1.0, :F => 20.26, :Q => 20.26,
|
111
|
+
:N => 1.0, :I => 1.0, :R => -6.54, :D => -6.54, :P => 20.26, :T => 1.0, :K => 1.0, :E => 18.38,
|
112
|
+
:V => 20.26, :S => 20.26, :G => 1.0, :A => 20.26, :L => 1.0
|
113
|
+
},
|
114
|
+
:T => {
|
115
|
+
:W => -14.03, :C => 1.0, :M => 1.0, :H => 1.0, :Y => 1.0, :F => 13.34, :Q => -6.54,
|
116
|
+
:N => -14.03, :I => 1.0, :R => 1.0, :D => 1.0, :P => 1.0, :T => 1.0, :K => 1.0, :E => 20.26,
|
117
|
+
:V => 1.0, :S => 1.0, :G => -7.49, :A => 1.0, :L => 1.0
|
118
|
+
},
|
119
|
+
:K => {
|
120
|
+
:W => 1.0, :C => 1.0, :M => 33.6, :H => 1.0, :Y => 1.0, :F => 1.0, :Q => 24.68, :N => 1.0,
|
121
|
+
:I => -7.49, :R => 33.6, :D => 1.0, :P => -6.54, :T => 1.0, :K => 1.0, :E => 1.0, :V => -7.49,
|
122
|
+
:S => 1.0, :G => -7.49, :A => 1.0, :L => -7.49
|
123
|
+
},
|
124
|
+
:E => {
|
125
|
+
:W => -14.03, :C => 44.94, :M => 1.0, :H => -6.54, :Y => 1.0, :F => 1.0, :Q => 20.26,
|
126
|
+
:N => 1.0, :I => 20.26, :R => 1.0, :D => 20.26, :P => 20.26, :T => 1.0, :K => 1.0, :E => 33.6,
|
127
|
+
:V => 1.0, :S => 20.26, :G => 1.0, :A => 1.0, :L => 1.0
|
128
|
+
},
|
129
|
+
:V => {
|
130
|
+
:W => 1.0, :C => 1.0, :M => 1.0, :H => 1.0, :Y => -6.54, :F => 1.0, :Q => 1.0, :N => 1.0,
|
131
|
+
:I => 1.0, :R => 1.0, :D => -14.03, :P => 20.26, :T => -7.49, :K => -1.88, :E => 1.0,
|
132
|
+
:V => 1.0, :S => 1.0, :G => -7.49, :A => 1.0, :L => 1.0
|
133
|
+
},
|
134
|
+
:S => {
|
135
|
+
:W => 1.0, :C => 33.6, :M => 1.0, :H => 1.0, :Y => 1.0, :F => 1.0, :Q => 20.26, :N => 1.0,
|
136
|
+
:I => 1.0, :R => 20.26, :D => 1.0, :P => 44.94, :T => 1.0, :K => 1.0, :E => 20.26, :V => 1.0,
|
137
|
+
:S => 20.26, :G => 1.0, :A => 1.0, :L => 1.0
|
138
|
+
},
|
139
|
+
:G => {
|
140
|
+
:W => 13.34, :C => 1.0, :M => 1.0, :H => 1.0, :Y => -7.49, :F => 1.0, :Q => 1.0, :N => -7.49,
|
141
|
+
:I => -7.49, :R => 1.0, :D => 1.0, :P => 1.0, :T => -7.49, :K => -7.49, :E => -6.54,
|
142
|
+
:V => 1.0, :S => 1.0, :G => 13.34, :A => -7.49, :L => 1.0
|
143
|
+
},
|
144
|
+
:A => {
|
145
|
+
:W => 1.0, :C => 44.94, :M => 1.0, :H => -7.49, :Y => 1.0, :F => 1.0, :Q => 1.0, :N => 1.0,
|
146
|
+
:I => 1.0, :R => 1.0, :D => -7.49, :P => 20.26, :T => 1.0, :K => 1.0, :E => 1.0, :V => 1.0,
|
147
|
+
:S => 1.0, :G => 1.0, :A => 1.0, :L => 1.0
|
148
|
+
},
|
149
|
+
:L => {
|
150
|
+
:W => 24.68, :C => 1.0, :M => 1.0, :H => 1.0, :Y => 1.0, :F => 1.0, :Q => 33.6, :N => 1.0,
|
151
|
+
:I => 1.0, :R => 20.26, :D => 1.0, :P => 20.26, :T => 1.0, :K => -7.49, :E => 1.0, :V => 1.0,
|
152
|
+
:S => 1.0, :G => 1.0, :A => 1.0, :L => 1.0
|
153
|
+
}
|
154
|
+
}
|
155
|
+
|
156
|
+
# Estimated half-life, in minutes, keyed by the N-terminal residue of a protein.
|
157
|
+
HALFLIFE = {
|
158
|
+
:ecoli => {
|
159
|
+
:I => 600,
|
160
|
+
:V => 600,
|
161
|
+
:L => 2,
|
162
|
+
:F => 2,
|
163
|
+
:C => 600,
|
164
|
+
:M => 600,
|
165
|
+
:A => 600,
|
166
|
+
:G => 600,
|
167
|
+
:T => 600,
|
168
|
+
:W => 2,
|
169
|
+
:S => 600,
|
170
|
+
:Y => 2,
|
171
|
+
:P => 600,
|
172
|
+
:H => 600,
|
173
|
+
:E => 600,
|
174
|
+
:Q => 600,
|
175
|
+
:D => 600,
|
176
|
+
:N => 600,
|
177
|
+
:K => 2,
|
178
|
+
:R => 2,
|
179
|
+
:U => 600
|
180
|
+
},
|
181
|
+
:mammalian => {
|
182
|
+
:A => 264,
|
183
|
+
:R => 60,
|
184
|
+
:N => 84,
|
185
|
+
:D => 66,
|
186
|
+
:C => 72,
|
187
|
+
:Q => 48,
|
188
|
+
:E => 60,
|
189
|
+
:G => 30,
|
190
|
+
:H => 210,
|
191
|
+
:I => 1200,
|
192
|
+
:L => 330,
|
193
|
+
:K => 78,
|
194
|
+
:M => 1800,
|
195
|
+
:F => 66,
|
196
|
+
:P => 1200,
|
197
|
+
:S => 114,
|
198
|
+
:T => 432,
|
199
|
+
:W => 168,
|
200
|
+
:Y => 168,
|
201
|
+
:V => 6000
|
202
|
+
},
|
203
|
+
:yeast => {
|
204
|
+
:A => 1200,
|
205
|
+
:R => 2,
|
206
|
+
:N => 3,
|
207
|
+
:D => 3,
|
208
|
+
:C => 1200,
|
209
|
+
:Q => 10,
|
210
|
+
:E => 30,
|
211
|
+
:G => 1200,
|
212
|
+
:H => 10,
|
213
|
+
:I => 30,
|
214
|
+
:L => 3,
|
215
|
+
:K => 3,
|
216
|
+
:M => 1200,
|
217
|
+
:F => 3,
|
218
|
+
:P => 1200,
|
219
|
+
:S => 1200,
|
220
|
+
:T => 1200,
|
221
|
+
:W => 3,
|
222
|
+
:Y => 10,
|
223
|
+
:V => 1200
|
224
|
+
}
|
225
|
+
}
|
226
|
+
|
227
|
+
## TOP-IDP
|
228
|
+
##
|
229
|
+
## http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2676888/
|
230
|
+
##
|
231
|
+
# TOP_IDP = {
|
232
|
+
# :I => -0.486,
|
233
|
+
# :V => -0.121,
|
234
|
+
# :L => -0.326,
|
235
|
+
# :F => -0.697,
|
236
|
+
# :C => 0.02,
|
237
|
+
# :M => -0.397,
|
238
|
+
# :A => 0.06,
|
239
|
+
# :G => 0.166,
|
240
|
+
# :T => 0.059,
|
241
|
+
# :W => -0.884,
|
242
|
+
# :S => 0.341,
|
243
|
+
# :Y => -0.510,
|
244
|
+
# :P => 0.987,
|
245
|
+
# :H => 0.303,
|
246
|
+
# :E => 0.736,
|
247
|
+
# :Q => 0.318,
|
248
|
+
# :D => 0.192,
|
249
|
+
# :N => 0.007,
|
250
|
+
# :K => 0.586,
|
251
|
+
# :R => 0.180,
|
252
|
+
# :U => 0.02
|
253
|
+
# }
|
254
|
+
|
255
|
+
# Hydropathy values for amino acids {[12]}[rdoc-label:12].
|
256
|
+
HYDROPATHY = {
|
257
|
+
:I => 4.5 ,
|
258
|
+
:V => 4.2 ,
|
259
|
+
:L => 3.8 ,
|
260
|
+
:F => 2.8 ,
|
261
|
+
:C => 2.5 ,
|
262
|
+
:M => 1.9 ,
|
263
|
+
:A => 1.8 ,
|
264
|
+
:G => -0.4,
|
265
|
+
:T => -0.7,
|
266
|
+
:W => -0.9,
|
267
|
+
:S => -0.8,
|
268
|
+
:Y => -1.3,
|
269
|
+
:P => -1.6,
|
270
|
+
:H => -3.2,
|
271
|
+
:E => -3.5,
|
272
|
+
:Q => -3.5,
|
273
|
+
:D => -3.5,
|
274
|
+
:N => -3.5,
|
275
|
+
:K => -3.9,
|
276
|
+
:R => -4.5,
|
277
|
+
:U => 2.5
|
278
|
+
}
|
279
|
+
|
280
|
+
# {Average isotopic masses of amino acids}[http://web.expasy.org/findmod/findmod_masses.html#AA]
|
281
|
+
AVERAGE_MASS = {
|
282
|
+
:I => 113.1594,
|
283
|
+
:V => 99.1326,
|
284
|
+
:L => 113.1594,
|
285
|
+
:F => 147.1766,
|
286
|
+
:C => 103.1388,
|
287
|
+
:M => 131.1926,
|
288
|
+
:A => 71.0788,
|
289
|
+
:G => 57.0519,
|
290
|
+
:T => 101.1051,
|
291
|
+
:W => 186.2132,
|
292
|
+
:S => 87.0782,
|
293
|
+
:Y => 163.1760,
|
294
|
+
:P => 97.1167,
|
295
|
+
:H => 137.1411,
|
296
|
+
:E => 129.1155,
|
297
|
+
:Q => 128.1307,
|
298
|
+
:D => 115.0886,
|
299
|
+
:N => 114.1038,
|
300
|
+
:K => 128.1741,
|
301
|
+
:R => 156.1875,
|
302
|
+
:U => 150.0388
|
303
|
+
}
|
304
|
+
WATER_MASS = 18.01524
|
305
|
+
|
306
|
+
# Atomic composition of amino acids.
|
307
|
+
ATOM = {
|
308
|
+
:I => {:C => 6, :H => 13, :O => 2, :N => 1, :S => 0}, # C6H13NO2
|
309
|
+
:V => {:C => 5, :H => 11, :O => 2, :N => 1, :S => 0}, # C5H11NO2
|
310
|
+
:L => {:C => 6, :H => 13, :O => 2, :N => 1, :S => 0}, # C6H13NO2
|
311
|
+
:F => {:C => 9, :H => 11, :O => 2, :N => 1, :S => 0}, # C9H11NO2
|
312
|
+
:C => {:C => 3, :H => 7 , :O => 2, :N => 1, :S => 1}, # C3H7NO2S
|
313
|
+
:M => {:C => 5, :H => 11 ,:O => 2, :N => 1, :S => 1}, # C5H11NO2S
|
314
|
+
:A => {:C => 3, :H => 7 , :O => 2, :N => 1, :S => 0}, # C3H7NO2
|
315
|
+
:G => {:C => 2, :H => 5 , :O => 2, :N => 1, :S => 0}, # C2H5NO2
|
316
|
+
:T => {:C => 4, :H => 9 , :O => 3, :N => 1, :S => 0}, # C4H9NO3
|
317
|
+
:W => {:C => 11,:H => 12, :O => 2, :N => 2, :S => 0}, # C11H12N2O2
|
318
|
+
:S => {:C => 3, :H => 7 , :O => 3, :N => 1, :S => 0}, # C3H7NO3
|
319
|
+
:Y => {:C => 9, :H => 11, :O => 3, :N => 1, :S => 0}, # C9H11NO3
|
320
|
+
:P => {:C => 5, :H => 9 , :O => 2, :N => 1, :S => 0}, # C5H9NO2
|
321
|
+
:H => {:C => 6, :H => 9 , :O => 2, :N => 3, :S => 0}, # C6H9N3O2
|
322
|
+
:E => {:C => 5, :H => 9 , :O => 4, :N => 1, :S => 0}, # C5H9NO4
|
323
|
+
:Q => {:C => 5, :H => 10, :O => 3, :N => 2, :S => 0}, # C5H10N2O3
|
324
|
+
:D => {:C => 4, :H => 7 , :O => 4, :N => 1, :S => 0}, # C4H7NO4
|
325
|
+
:N => {:C => 4, :H => 8 , :O => 3, :N => 2, :S => 0}, # C4H8N2O3
|
326
|
+
:K => {:C => 6, :H => 14, :O => 2, :N => 2, :S => 0}, # C6H14N2O2
|
327
|
+
:R => {:C => 6, :H => 14, :O => 2, :N => 4, :S => 0}, # C6H14N4O2
|
328
|
+
}
|
329
|
+
|
330
|
+
##
|
331
|
+
#
|
332
|
+
# pK value from Bjellqvist, et al {[13]}[rdoc-label:13].
|
333
|
+
# Taking into account the decrease in pK differences
|
334
|
+
# between acids and bases when going from water
|
335
|
+
# to 8 M urea, a value of 7.5 has been assigned to the
|
336
|
+
# N-terminal residue.
|
337
|
+
#
|
338
|
+
PK = {
|
339
|
+
:cterm => {
|
340
|
+
:normal => 3.55, :D => 4.55, :E => 4.75
|
341
|
+
},
|
342
|
+
:nterm => {
|
343
|
+
:A => 7.59, :M => 7.00, :S => 6.93, :P => 8.36,
|
344
|
+
:T => 6.82, :V => 7.44, :E => 7.70 , :G => 7.50
|
345
|
+
},
|
346
|
+
:internal => {
|
347
|
+
:D => 4.05, :E => 4.45, :H => 5.98, :C => 9.0,
|
348
|
+
:Y => 10.0, :K => 10.0, :R => 12.0
|
349
|
+
}
|
350
|
+
}
|
351
|
+
|
352
|
+
def initialize(seq)
|
353
|
+
if seq.kind_of?(String) && Bio::Sequence.guess(seq) == Bio::Sequence::AA
|
354
|
+
# TODO: has issue.
|
355
|
+
@seq = Bio::Sequence::AA.new seq
|
356
|
+
elsif seq.kind_of? Bio::Sequence::AA
|
357
|
+
@seq = seq
|
358
|
+
elsif seq.kind_of?(Bio::Sequence) &&
|
359
|
+
seq.guess.kind_of?(Bio::Sequence::AA)
|
360
|
+
@seq = seq.guess
|
361
|
+
else
|
362
|
+
raise ArgumentError, "sequence must be an AA sequence"
|
363
|
+
end
|
364
|
+
end
|
365
|
+
|
366
|
+
##
|
367
|
+
#
|
368
|
+
# Return the number of negative amino acids (D and E) in an AA sequence.
|
369
|
+
#
|
370
|
+
def num_neg
|
371
|
+
@num_neg ||= @seq.count("DE")
|
372
|
+
end
|
373
|
+
|
374
|
+
##
|
375
|
+
#
|
376
|
+
# Return the number of positive amino acids (R and K) in an AA sequence.
|
377
|
+
#
|
378
|
+
def num_pos
|
379
|
+
@num_pos ||= @seq.count("RK")
|
380
|
+
end
|
381
|
+
|
382
|
+
##
|
383
|
+
#
|
384
|
+
# Return the number of residues in an AA sequence.
|
385
|
+
#
|
386
|
+
def amino_acid_number
|
387
|
+
@seq.length
|
388
|
+
end
|
389
|
+
|
390
|
+
##
|
391
|
+
#
|
392
|
+
# Return the number of atoms in a sequence. If type is given, return the
|
393
|
+
# number of specific atoms in a sequence.
|
394
|
+
#
|
395
|
+
def total_atoms(type=nil)
|
396
|
+
if !type.nil?
|
397
|
+
type = type.to_sym
|
398
|
+
if /^(?:C|H|O|N|S){1}$/ !~ type.to_s
|
399
|
+
raise ArgumentError, "type must be C/H/O/N/S/nil(all)"
|
400
|
+
end
|
401
|
+
end
|
402
|
+
num_atom = {:C => 0,
|
403
|
+
:H => 0,
|
404
|
+
:O => 0,
|
405
|
+
:N => 0,
|
406
|
+
:S => 0}
|
407
|
+
each_aa do |aa|
|
408
|
+
ATOM[aa].each do |t, num|
|
409
|
+
num_atom[t] += num
|
410
|
+
end
|
411
|
+
end
|
412
|
+
num_atom[:H] = num_atom[:H] - 2 * (amino_acid_number - 1)
|
413
|
+
num_atom[:O] = num_atom[:O] - (amino_acid_number - 1)
|
414
|
+
if type.nil?
|
415
|
+
num_atom.values.inject(0) { |sum, num| sum + num }
|
416
|
+
else
|
417
|
+
num_atom[type]
|
418
|
+
end
|
419
|
+
end
|
420
|
+
|
421
|
+
##
|
422
|
+
#
|
423
|
+
# Return the number of carbons.
|
424
|
+
#
|
425
|
+
def num_carbon
|
426
|
+
@num_carbon ||= total_atoms :C
|
427
|
+
end
|
428
|
+
|
429
|
+
def num_hydrogen
|
430
|
+
@num_hydrogen ||= total_atoms :H
|
431
|
+
end
|
432
|
+
|
433
|
+
##
|
434
|
+
#
|
435
|
+
# Return the number of nitrogens.
|
436
|
+
#
|
437
|
+
def num_nitro
|
438
|
+
@num_nitro ||= total_atoms :N
|
439
|
+
end
|
440
|
+
|
441
|
+
##
|
442
|
+
#
|
443
|
+
# Return the number of oxygens.
|
444
|
+
#
|
445
|
+
def num_oxygen
|
446
|
+
@num_oxygen ||= total_atoms :O
|
447
|
+
end
|
448
|
+
|
449
|
+
##
|
450
|
+
#
|
451
|
+
# Return the number of sulphurs.
|
452
|
+
#
|
453
|
+
def num_sulphur
|
454
|
+
@num_sulphur ||= total_atoms :S
|
455
|
+
end
|
456
|
+
|
457
|
+
##
|
458
|
+
#
|
459
|
+
# Calculate molecular weight of an AA sequence.
|
460
|
+
#
|
461
|
+
# _Protein Mw is calculated by the addition of average isotopic masses of
|
462
|
+
# amino acids in the protein and the average isotopic mass of one water
|
463
|
+
# molecule._
|
464
|
+
#
|
465
|
+
def molecular_weight
|
466
|
+
@mw ||= begin
|
467
|
+
mass = WATER_MASS
|
468
|
+
each_aa do |aa|
|
469
|
+
mass += AVERAGE_MASS[aa.to_sym]
|
470
|
+
end
|
471
|
+
(mass * 10).floor().to_f / 10
|
472
|
+
end
|
473
|
+
end
|
474
|
+
|
475
|
+
##
|
476
|
+
#
|
477
|
+
# Calculate the theoretical pI of an AA sequence using the bisection algorithm.
|
478
|
+
# The pK values of Bjellqvist et al. are used to calculate the pI.
|
479
|
+
#
|
480
|
+
def theoretical_pI
|
481
|
+
charges = []
|
482
|
+
residue_count().each do |residue|
|
483
|
+
charges << charge_proc(residue[:positive],
|
484
|
+
residue[:pK],
|
485
|
+
residue[:num])
|
486
|
+
end
|
487
|
+
round(solve_pI(charges), 2)
|
488
|
+
end
|
489
|
+
|
490
|
+
##
|
491
|
+
#
|
492
|
+
# Return estimated half_life of an AA sequence.
|
493
|
+
#
|
494
|
+
# _The half-life is a prediction of the time it takes for half of the
|
495
|
+
# amount of protein in a cell to disappear after its synthesis in the
|
496
|
+
# cell. ProtParam relies on the "N-end rule", which relates the half-life
|
497
|
+
# of a protein to the identity of its N-terminal residue; the prediction
|
498
|
+
# is given for 3 model organisms (human, yeast and E.coli)._
|
499
|
+
#
|
500
|
+
def half_life(species=nil)
|
501
|
+
n_end = @seq[0].chr.to_sym
|
502
|
+
if species
|
503
|
+
HALFLIFE[species][n_end]
|
504
|
+
else
|
505
|
+
{
|
506
|
+
:ecoli => HALFLIFE[:ecoli][n_end],
|
507
|
+
:mammalian => HALFLIFE[:mammalian][n_end],
|
508
|
+
:yeast => HALFLIFE[:yeast][n_end]
|
509
|
+
}
|
510
|
+
end
|
511
|
+
end
|
512
|
+
|
513
|
+
##
|
514
|
+
#
|
515
|
+
# Calculate instability index of an AA sequence.
|
516
|
+
#
|
517
|
+
# _The instability index provides an estimate of the stability of your
|
518
|
+
# protein in a test tube. Statistical analysis of 12 unstable and 32
|
519
|
+
# stable proteins has revealed [7] that there are certain dipeptides, the
|
520
|
+
# occurrence of which is significantly different in the unstable proteins
|
521
|
+
# compared with those in the stable ones. The authors of this method have
|
522
|
+
# assigned a weight value of instability to each of the 400 different
|
523
|
+
# dipeptides (DIWV)._
|
524
|
+
#
|
525
|
+
def instability_index
|
526
|
+
@instability_index ||=
|
527
|
+
begin
|
528
|
+
instability_sum = 0.0
|
529
|
+
i = 0
|
530
|
+
while @seq[i+1] != nil
|
531
|
+
aa, next_aa = [@seq[i].chr.to_sym, @seq[i+1].chr.to_sym]
|
532
|
+
if DIWV.key?(aa) && DIWV[aa].key?(next_aa)
|
533
|
+
instability_sum += DIWV[aa][next_aa]
|
534
|
+
end
|
535
|
+
i += 1
|
536
|
+
end
|
537
|
+
round((10.0/amino_acid_number.to_f) * instability_sum, 2)
|
538
|
+
end
|
539
|
+
end
|
540
|
+
|
541
|
+
##
|
542
|
+
#
|
543
|
+
# Return whether the sequence is stable or not as a String ("stable"/"unstable").
|
544
|
+
#
|
545
|
+
# _Protein whose instability index is smaller than 40 is predicted as
|
546
|
+
# stable, a value above 40 predicts that the protein may be unstable._
|
547
|
+
#
|
548
|
+
#
|
549
|
+
def stability
|
550
|
+
(instability_index <= 40) ? "stable" : "unstable"
|
551
|
+
end
|
552
|
+
|
553
|
+
##
|
554
|
+
#
|
555
|
+
# Return true if the sequence is stable.
|
556
|
+
#
|
557
|
+
def stable?
|
558
|
+
instability_index <= 40
|
559
|
+
end
|
560
|
+
|
561
|
+
##
|
562
|
+
#
|
563
|
+
# Calculate aliphatic index of an AA sequence.
|
564
|
+
#
|
565
|
+
# _The aliphatic index of a protein is defined as the relative volume
|
566
|
+
# occupied by aliphatic side chains (alanine, valine, isoleucine, and
|
567
|
+
# leucine). It may be regarded as a positive factor for the increase of
|
568
|
+
# thermostability of globular proteins._
|
569
|
+
#
|
570
|
+
def aliphatic_index
|
571
|
+
aa_map = aa_comp_map
|
572
|
+
@aliphatic_index ||= round(aa_map[:A] +
|
573
|
+
2.9 * aa_map[:V] +
|
574
|
+
(3.9 * (aa_map[:I] + aa_map[:L])), 2)
|
575
|
+
end
|
576
|
+
|
577
|
+
##
|
578
|
+
#
|
579
|
+
# Calculate GRAVY score of an AA sequence.
|
580
|
+
#
|
581
|
+
# _The GRAVY(Grand Average of Hydropathy) value for a peptide or protein
|
582
|
+
# is calculated as the sum of hydropathy values [9] of all the amino acids,
|
583
|
+
# divided by the number of residues in the sequence._
|
584
|
+
#
|
585
|
+
def gravy
|
586
|
+
@gravy ||= begin
|
587
|
+
hydropathy_sum = 0.0
|
588
|
+
each_aa do |aa|
|
589
|
+
hydropathy_sum += HYDROPATHY[aa]
|
590
|
+
end
|
591
|
+
round(hydropathy_sum / @seq.length.to_f, 3)
|
592
|
+
end
|
593
|
+
end
|
594
|
+
|
595
|
+
##
|
596
|
+
#
|
597
|
+
# Calculate the percentage composition of an AA sequence as a Hash object.
|
598
|
+
# It returns the percentage of a given amino acid if aa_code is not nil.
|
599
|
+
#
|
600
|
+
def aa_comp(aa_code=nil)
|
601
|
+
if aa_code.nil?
|
602
|
+
aa_map = {}
|
603
|
+
IUPAC_CODE.keys.each do |k|
|
604
|
+
aa_map[k] = 0.0
|
605
|
+
end
|
606
|
+
aa_map.update(aa_comp_map){|k,_,v| round(v, 1) }
|
607
|
+
else
|
608
|
+
round(aa_comp_map[aa_code], 1)
|
609
|
+
end
|
610
|
+
end
|
611
|
+
|
612
|
+
private
|
613
|
+
|
614
|
+
def aa_comp_map
|
615
|
+
@aa_comp_map ||=
|
616
|
+
begin
|
617
|
+
aa_map = {}
|
618
|
+
aa_comp = {}
|
619
|
+
sum = 0
|
620
|
+
each_aa do |aa|
|
621
|
+
if aa_map.key? aa
|
622
|
+
aa_map[aa] += 1
|
623
|
+
else
|
624
|
+
aa_map[aa] = 1
|
625
|
+
end
|
626
|
+
sum += 1
|
627
|
+
end
|
628
|
+
aa_map.each {|aa, count| aa_comp[aa] = (Rational(count,sum) * 100).to_f }
|
629
|
+
aa_comp
|
630
|
+
end
|
631
|
+
end
|
632
|
+
|
633
|
+
def each_aa
|
634
|
+
@seq.each_byte do |x|
|
635
|
+
yield x.chr.to_sym
|
636
|
+
end
|
637
|
+
end
|
638
|
+
|
639
|
+
def positive? residue
|
640
|
+
(residue == "H" || residue == "R" || residue == "K")
|
641
|
+
end
|
642
|
+
|
643
|
+
#
|
644
|
+
# Return a proc that calculates the charge of a residue at a given pH.
|
645
|
+
#
|
646
|
+
def charge_proc positive, pK, num
|
647
|
+
if positive
|
648
|
+
lambda {|ph|
|
649
|
+
num.to_f / (1.0 + 10.0 ** (ph - pK))
|
650
|
+
}
|
651
|
+
else
|
652
|
+
lambda {|ph|
|
653
|
+
(-1.0 * num.to_f) / (1.0 + 10.0 ** (pK - ph))
|
654
|
+
}
|
655
|
+
end
|
656
|
+
end
|
657
|
+
|
658
|
+
#
|
659
|
+
# Transform AA sequence into residue count
|
660
|
+
#
|
661
|
+
def residue_count
|
662
|
+
counted = []
|
663
|
+
# N-terminal
|
664
|
+
n_term = @seq[0].chr
|
665
|
+
if PK[:nterm].key? n_term.to_sym
|
666
|
+
counted << {
|
667
|
+
:num => 1,
|
668
|
+
:residue => n_term.to_sym,
|
669
|
+
:pK => PK[:nterm][n_term.to_sym],
|
670
|
+
:positive => positive?(n_term)
|
671
|
+
}
|
672
|
+
elsif PK[:normal].key? n_term.to_sym
|
673
|
+
counted << {
|
674
|
+
:num => 1,
|
675
|
+
:residue => n_term.to_sym,
|
676
|
+
:pK => PK[:normal][n_term.to_sym],
|
677
|
+
:positive => positive?(n_term)
|
678
|
+
}
|
679
|
+
end
|
680
|
+
# Internal
|
681
|
+
tmp_internal = {}
|
682
|
+
@seq[1,(@seq.length-2)].each_byte do |x|
|
683
|
+
aa = x.chr.to_sym
|
684
|
+
if PK[:internal].key? aa
|
685
|
+
if tmp_internal.key? aa
|
686
|
+
tmp_internal[aa][:num] += 1
|
687
|
+
else
|
688
|
+
tmp_internal[aa] = {
|
689
|
+
:num => 1,
|
690
|
+
:residue => aa,
|
691
|
+
:pK => PK[:internal][aa],
|
692
|
+
:positive => positive?(aa.to_s)
|
693
|
+
}
|
694
|
+
end
|
695
|
+
end
|
696
|
+
end
|
697
|
+
tmp_internal.each do |aa, val|
|
698
|
+
counted << val
|
699
|
+
end
|
700
|
+
# C-terminal
|
701
|
+
c_term = @seq[-1].chr
|
702
|
+
if PK[:cterm].key? c_term.to_sym
|
703
|
+
counted << {
|
704
|
+
:num => 1,
|
705
|
+
:residue => c_term.to_sym,
|
706
|
+
:pK => PK[:cterm][c_term.to_sym],
|
707
|
+
:positive => positive?(c_term)
|
708
|
+
}
|
709
|
+
end
|
710
|
+
counted
|
711
|
+
end
|
712
|
+
|
713
|
+
#
|
714
|
+
# Solve for the pI value using the bisection algorithm.
|
715
|
+
#
|
716
|
+
def solve_pI charges
|
717
|
+
state = {
|
718
|
+
:ph => 0.0,
|
719
|
+
:charges => charges,
|
720
|
+
:pI => nil,
|
721
|
+
:ph_prev => 0.0,
|
722
|
+
:ph_next => 14.0,
|
723
|
+
:net_charge => 0.0
|
724
|
+
}
|
725
|
+
error = false
|
726
|
+
# epsilon is the precision: pI = pH ± epsilon
|
727
|
+
epsilon = 0.001
|
728
|
+
|
729
|
+
loop do
|
730
|
+
# Reset net charge
|
731
|
+
state[:net_charge] = 0.0
|
732
|
+
# Calculate net charge
|
733
|
+
state[:charges].each do |charge_proc|
|
734
|
+
state[:net_charge] += charge_proc.call state[:ph]
|
735
|
+
end
|
736
|
+
|
737
|
+
# Something is wrong - pH is higher than 14
|
738
|
+
if state[:ph] >= 14.0
|
739
|
+
error = true
|
740
|
+
break
|
741
|
+
end
|
742
|
+
|
743
|
+
# Making decision
|
744
|
+
temp_ph = 0.0
|
745
|
+
if state[:net_charge] <= 0.0
|
746
|
+
temp_ph = state[:ph]
|
747
|
+
state[:ph] = state[:ph] - ((state[:ph] - state[:ph_prev]) / 2.0)
|
748
|
+
state[:ph_next] = temp_ph
|
749
|
+
else
|
750
|
+
temp_ph = state[:ph]
|
751
|
+
state[:ph] = state[:ph] + ((state[:ph_next] - state[:ph]) / 2.0)
|
752
|
+
state[:ph_prev] = temp_ph
|
753
|
+
end
|
754
|
+
|
755
|
+
if (state[:ph] - state[:ph_prev] < epsilon) &&
|
756
|
+
(state[:ph_next] - state[:ph] < epsilon)
|
757
|
+
state[:pI] = state[:ph]
|
758
|
+
break
|
759
|
+
end
|
760
|
+
end
|
761
|
+
|
762
|
+
if !state[:pI].nil? && !error
|
763
|
+
state[:pI]
|
764
|
+
else
|
765
|
+
raise "Failed to Calc pI: pH is higher than 14"
|
766
|
+
end
|
767
|
+
end
|
768
|
+
|
769
|
+
def round(num, ndigits=0)
|
770
|
+
(num * (10 ** ndigits)).round().to_f / (10 ** ndigits).to_f
|
771
|
+
end
|
772
|
+
|
773
|
+
# --------------------------------
    # :section: References
    #
    #
    # 1. Protein Identification and Analysis Tools on the ExPASy Server;
    #    Gasteiger E., Hoogland C., Gattiker A., Duvaud S., Wilkins M.R.,
    #    Appel R.D., Bairoch A.; (In) John M. Walker (ed): The Proteomics
    #    Protocols Handbook, Humana Press (2005). pp. 571-607
    # 2. Pace, C.N., Vajdos, F., Fee, L., Grimsley, G., and Gray, T. (1995)
    #    How to measure and predict the molar absorption coefficient of a
    #    protein. Protein Sci. 4, 2411-2423.
    # 3. Edelhoch, H. (1967) Spectroscopic determination of tryptophan and
    #    tyrosine in proteins. Biochemistry 6, 1948-1954.
    # 4. Gill, S.C. and von Hippel, P.H. (1989) Calculation of protein
    #    extinction coefficients from amino acid sequence data. Anal. Biochem.
    #    182, 319-326.
    # 5. Bachmair, A., Finley, D. and Varshavsky, A. (1986) In vivo half-life
    #    of a protein is a function of its amino-terminal residue. Science 234,
    #    179-186.
    # 6. Gonda, D.K., Bachmair, A., Wunning, I., Tobias, J.W., Lane, W.S. and
    #    Varshavsky, A. J. (1989) Universality and structure of the N-end rule.
    #    J. Biol. Chem. 264, 16700-16712.
    # 7. Tobias, J.W., Shrader, T.E., Rocap, G. and Varshavsky, A. (1991) The
    #    N-end rule in bacteria. Science 254, 1374-1377.
    # 8. Ciechanover, A. and Schwartz, A.L. (1989) How are substrates
    #    recognized by the ubiquitin-mediated proteolytic system? Trends Biochem.
    #    Sci. 14, 483-488.
    # 9. Varshavsky, A. (1997) The N-end rule pathway of protein degradation.
    #    Genes Cells 2, 13-28.
    # 10. Guruprasad, K., Reddy, B.V.B. and Pandit, M.W. (1990) Correlation
    #     between stability of a protein and its dipeptide composition: a novel
    #     approach for predicting in vivo stability of a protein from its primary
    #     sequence. Protein Eng. 4, 155-161.
    # 11. Ikai, A.J. (1980) Thermostability and aliphatic index of globular
    #     proteins. J. Biochem. 88, 1895-1898.
    # 12. Kyte, J. and Doolittle, R.F. (1982) A simple method for displaying
    #     the hydropathic character of a protein. J. Mol. Biol. 157, 105-132.
    # 13. Bjellqvist, B., Hughes, G.J., Pasquali, Ch., Paquet, N., Ravier, F.,
    #     Sanchez, J.-Ch., Frutiger, S. and Hochstrasser, D.F. (1993) The focusing
    #     positions of polypeptides in immobilized pH gradients can be predicted
    #     from their amino acid sequences. Electrophoresis 14, 1023-1031.
    #
    # --------------------------------
  end
end
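The pI search in this file is a bisection over the pH range 0 to 14: the net charge of a protein falls monotonically as pH rises, so the pH at which it crosses zero can be bracketed and halved until the bracket is narrower than the chosen precision. The following is a minimal standalone sketch of that idea, not the gem's own code. The names `GROUPS`, `group_charge`, `net_charge` and `bisect_pi` are hypothetical, the pK values are made up for illustration (they are not the `PK` table used by bio-protparam), and the formula for positively charged groups is assumed to be the usual Henderson-Hasselbalch counterpart of the negative-group formula.

```ruby
# Illustrative ionizable groups. The pK values are invented for this sketch
# and are NOT the PK table from protparam.rb.
GROUPS = [
  { :pK => 9.0,  :num => 1, :positive => true  }, # amino terminus
  { :pK => 10.5, :num => 2, :positive => true  }, # two basic side chains
  { :pK => 4.0,  :num => 1, :positive => false }, # one acidic side chain
  { :pK => 3.5,  :num => 1, :positive => false }  # carboxyl terminus
]

# Henderson-Hasselbalch charge contributed by `num` identical groups at `ph`.
# The positive branch is an assumption; the negative branch mirrors the
# expression (-1.0 * num) / (1.0 + 10.0 ** (pK - ph)).
def group_charge(group, ph)
  if group[:positive]
    group[:num] / (1.0 + 10.0**(ph - group[:pK]))
  else
    -group[:num] / (1.0 + 10.0**(group[:pK] - ph))
  end
end

# Sum of all group charges at a given pH.
def net_charge(groups, ph)
  groups.inject(0.0) { |sum, g| sum + group_charge(g, ph) }
end

# Bisection on [0, 14]. A positive net charge means the pI lies above the
# midpoint, so the lower bound moves up; otherwise the upper bound moves down.
def bisect_pi(groups, epsilon = 0.001)
  low = 0.0
  high = 14.0
  while (high - low) > epsilon
    mid = (low + high) / 2.0
    if net_charge(groups, mid) > 0.0
      low = mid
    else
      high = mid
    end
  end
  (low + high) / 2.0
end

puts bisect_pi(GROUPS).round(2)
```

Keeping explicit `low` and `high` bounds makes the stopping condition a single comparison on the bracket width. For a single positive group and a single negative group, the result should land midway between the two pK values, which is a convenient sanity check.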