bio-bigbio 0.1.4 → 0.1.5

data/.travis.yml ADDED
@@ -0,0 +1,12 @@
1
+ language: ruby
2
+ rvm:
3
+ - 1.9.2
4
+ # - 1.9.3
5
+ # - 1.8.7
6
+ # - jruby-19mode # JRuby in 1.9 mode
7
+ # - rbx-19mode
8
+ # - jruby-18mode # JRuby in 1.8 mode
9
+ # - rbx-18mode
10
+
11
+ # uncomment this line if your project needs to run something other than `rake`:
12
+ # script: bundle exec rspec spec
data/LICENSE.txt ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2011-2013 Pjotr Prins
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md CHANGED
@@ -8,31 +8,119 @@ computing in biology.
8
8
  BigBio may use BioLib C/C++/D functions for increasing performance and
9
9
  reducing memory consumption.
10
10
 
11
- This is an experimental project. If you wish to contribute subscribe
12
- to the BioRuby and/or BioLib mailing lists.
11
+ In a way, this is an experimental project. I use it for
12
+ experimentation, but what is in here should work fine. If you wish to
13
+ contribute, subscribe to the BioRuby and/or BioLib mailing lists
14
+ instead.
13
15
 
14
16
  # Overview
15
17
 
16
18
  * BigBio can translate nucleotide sequences to amino acid
17
19
  sequences using an EMBOSS C function, or BioRuby's translator.
20
+ * BigBio has a terrific FASTA file emitter which iterates FASTA files and
21
+ iterates sequences without loading everything in memory. There is
22
+ also an indexed edition
23
+ * BigBio has a flexible FASTA filter
18
24
  * BigBio has an ORF emitter which parses DNA/RNA sequences and emits
19
25
  ORFs between START_STOP or STOP_STOP codons.
20
- * BigBio has a FASTA file emitter, with iterates FASTA files and
21
- iterates sequences without loading everything in memory.
26
+ * BigBio has a Phylip (PAML style) emitter and writer
22
27
 
23
- # Examples
28
+ # Installation
29
+
30
+ The easy way
31
+
32
+ ```sh
33
+ gem install bio-bigbio
34
+ ```
35
+
36
+ in your code
37
+
38
+ ```ruby
39
+ require 'bigbio'
40
+ ```
41
+
42
+ # Command line tools
43
+
44
+ Some functionality is also available as executable command line tools (see the
45
+ ./bin directory). Use the -h switch to get information. Current tools
46
+ are
47
+
48
+ 1. getorf: fetch all areas between start-stop and stop-stop codons in six frames (using EMBOSS when biolib is available)
49
+ 2. nt2aa.rb: translate in six frames (using EMBOSS when biolib is available)
50
+ 3. fasta_filter.rb
51
+
52
+ ## Command line Fasta Filter
53
+
54
+ The CLI filter accepts standard Ruby commands.
55
+
56
+ Filter sequences that contain more than 25% C's
57
+
58
+ ```sh
59
+ fasta_filter.rb --filter "rec.seq.count('C') > rec.seq.size*0.25" test/data/fasta/nt.fa
60
+ ```
61
+
62
+ Look for IDs containing -126 and sequences ending in CCC
63
+
64
+ ```sh
65
+ fasta_filter.rb --filter "rec.id =~ /-126/ or rec.seq =~ /CCC$/" test/data/fasta/nt.fa
66
+ ```
67
+
68
+ Filter out all masked sequences that contain more than 10% masked
69
+ nucleotides
70
+
71
+ ```sh
72
+ fasta_filter.rb --filter "rec.seq.count('N')<rec.seq.size*0.10"
73
+ ```
74
+
75
+ Next to rec.id and rec.seq, you have rec.descr and 'num' as variables,
76
+ so to skip every other record
77
+
78
+ ```sh
79
+ fasta_filter.rb --filter "num % 2 == 0"
80
+ ```
81
+
82
+ To rewrite all sequences to lower case, use the rewrite
83
+ option
84
+
85
+ ```sh
86
+ fasta_filter.rb --rewrite 'rec.seq = rec.seq.downcase'
87
+ ```
88
+
89
+ Filters and rewrites can be combined. The rest is up to your imagination!
90
+
91
+ # API Examples
24
92
 
25
93
  ## Iterate through a FASTA file
26
94
 
27
95
  Read a file without loading the whole thing in memory
28
96
 
29
97
  ```ruby
98
+ require 'bigbio'
99
+
30
100
  fasta = FastaReader.new(fn)
31
101
  fasta.each do | rec |
32
102
  print rec.descr,rec.seq
33
103
  end
34
104
  ```
35
105
 
106
+ Since FastaReader parses the ID, write a tab file with id and sequence
107
+
108
+ ```ruby
109
+ i = 1
110
+ print "num\tid\tseq\n"
111
+ FastaReader.new(fn).each do | rec |
112
+ if rec.id =~ /(AT\w+)/
113
+ print i,"\t",$1,"\t",rec.seq,"\n"
114
+ i += 1
115
+ end
116
+ end
117
+ ```
118
+
119
+ which, for example, can be turned into RDF with the
120
+ [bio-table](https://github.com/pjotrp/bioruby-table) biogem.
121
+
122
+ ## Write a FASTA file
123
+
36
124
  Write a FASTA file. The simple way
37
125
 
38
126
  ```ruby
@@ -60,6 +148,44 @@ fasta = FastaWriter.new(fn)
60
148
  fasta.write(mysequence)
61
149
  ```
62
150
 
151
+ ## Transform a FASTA file
152
+
153
+ You can combine above FastaReader and FastaWriter to transform
154
+ sequences, e.g.
155
+
156
+ ```ruby
157
+ fasta = FastaWriter.new(out_fn)
158
+ FastaReader.new(in_fn).each do | rec |
159
+ # Strip the description down to the second ID
160
+ (id1,id2) = /(\S+)\s+(\S+)/.match(rec.descr).captures
161
+ fasta.write(id2,rec.seq)
162
+ end
163
+ ```
164
+
165
+ The downside to this approach is the explicit file naming. What if you
166
+ want to use STDIN or some other source instead? I have come round to
167
+ the idea of using a combination of lambda and block. For example:
168
+
169
+ ```ruby
170
+ FastaReader::emit_fastarecord(-> {gets}) { |rec|
171
+ print FastaWriter.to_fasta(rec)
172
+ }
173
+ ```
174
+
175
+ which takes STDIN line by line, and outputs FASTA on STDOUT. This is
176
+ a better design as the FastaReader and FastaWriter know nothing of
177
+ the mechanism fetching and displaying data. These can both be 'pure'
178
+ functions. Note also that the data is never fully loaded into RAM.
179
+
180
+ Here is the transformer in functional style
181
+
182
+ ```ruby
183
+ FastaReader::emit_fastarecord(-> {gets}) { |rec|
184
+ (id1,id2) = /(\S+)\s+(\S+)/.match(rec.descr).captures
185
+ print FastaWriter.to_fasta(id2,rec.seq)
186
+ }
187
+ ```
188
+
63
189
  ## Fetch ORFs from a sequence
64
190
 
65
191
  BigBio can parse a sequence for ORFs. Together with the FastaReader
@@ -83,21 +209,27 @@ translate = Nucleotide::Translate.new(trn_table)
83
209
  aa_frames = translate.aa_6_frames("ATCATTAGCAACACCAGCTTCCTCTCTCTCGCTTCAAAGTTCACTACTCGTGGATCTCGT")
84
210
  ```
85
211
 
86
- # Install
212
+ # Project home page
87
213
 
88
- The easy way
214
+ For information on the source tree, documentation, examples, issues and
215
+ how to contribute, see
89
216
 
90
- ```sh
91
- gem install bio-bigbio
92
- ```
217
+ http://github.com/pjotrp/bigbio
93
218
 
94
- in your code
219
+ The BioRuby community is on IRC server: irc.freenode.org, channel: #bioruby.
95
220
 
96
- ```ruby
97
- require 'bigbio'
98
- ```
221
+ # Cite
222
+
223
+ If you use this software, please cite one of
224
+
225
+ * [BioRuby: bioinformatics software for the Ruby programming language](http://dx.doi.org/10.1093/bioinformatics/btq475)
226
+ * [Biogem: an effective tool-based approach for scaling up open source software development in bioinformatics](http://dx.doi.org/10.1093/bioinformatics/bts080)
227
+
228
+ # Biogems.info
229
+
230
+ This Biogem is published at [#bio-table](http://biogems.info/index.html)
99
231
 
100
232
  # Copyright
101
233
 
102
- Copyright (c) 2011-2012 Pjotr Prins. See LICENSE for further details.
234
+ Copyright (c) 2011-2013 Pjotr Prins. See LICENSE for further details.
103
235
 
data/Rakefile CHANGED
@@ -37,6 +37,7 @@ RSpec::Core::RakeTask.new(:rcov) do |spec|
37
37
  spec.rcov = true
38
38
  end
39
39
 
40
+ task :test => :spec
40
41
  task :default => :spec
41
42
 
42
43
  require 'rake/rdoctask'
data/VERSION CHANGED
@@ -1 +1 @@
1
- 0.1.4
1
+ 0.1.5
data/bin/fasta_filter.rb ADDED
@@ -0,0 +1,100 @@
1
+ #! /usr/bin/env ruby
2
+ #
3
+ # Filter for FASTA files
4
+ #
5
+
6
+ $: << File.dirname(__FILE__)+'/../lib'
7
+
8
+ require 'bigbio'
9
+ require 'optparse'
10
+ require 'ostruct'
11
+
12
+ class OptParser
13
+ #
14
+ # Return a structure describing the options.
15
+ #
16
+ def self.parse(args)
17
+ # The options specified on the command line will be collected in *options*.
18
+ # We set default values here.
19
+ options = OpenStruct.new
20
+ options.codonize = false
21
+ options.verbose = false
22
+
23
+ opt_parser = OptionParser.new do |opts|
24
+ opts.banner = "Usage: fasta_filter.rb [options]"
25
+
26
+ opts.separator ""
27
+ opts.separator "Specific options:"
28
+
29
+ opts.on("--filter expression","Filter on Ruby expression") do |expr|
30
+ options.filter = expr
31
+ end
32
+
33
+ opts.on("--rewrite expression","Rewrite expression") do |expr|
34
+ options.rewrite = expr
35
+ end
36
+
37
+ opts.on("--codonize",
38
+ "Trim sequence to be at multiple of 3 nucleotides") do |b|
39
+ options.codonize = b
40
+ end
41
+
42
+ opts.on("--min size",
43
+ "Set minimum sequence size") do |min|
44
+ options.min = min.to_i
45
+ end
46
+
47
+ opts.on("--id","Write out ID only") do |b|
48
+ options.id = b
49
+ end
50
+
51
+ opts.on("-v", "--[no-]verbose", "Run verbosely") do |v|
52
+ options.verbose = v
53
+ end
54
+
55
+ opts.separator ""
56
+ opts.separator "Examples:"
57
+ opts.separator ""
58
+ opts.separator " fasta_filter.rb --filter \"rec.id =~ /-126/ or rec.seq =~ /CCC$/\" test/data/fasta/nt.fa"
59
+ opts.separator " fasta_filter.rb --filter \"rec.seq.count('C') > rec.seq.size*0.25\" test/data/fasta/nt.fa"
60
+ opts.separator " fasta_filter.rb --filter \"rec.descr =~ /C. elegans/\" test/data/fasta/nt.fa"
61
+ opts.separator " fasta_filter.rb --filter \"num % 2 == 0\" test/data/fasta/nt.fa"
62
+ opts.separator " fasta_filter.rb test/data/fasta/nt.fa --rewrite 'rec.seq.downcase!'"
63
+ opts.separator ""
64
+ opts.separator "Other options:"
65
+ opts.separator ""
66
+
67
+ opts.on_tail("-h", "--help", "Show this message") do
68
+ puts opts
69
+ exit
70
+ end
71
+
72
+ end
73
+
74
+ opt_parser.parse!(args)
75
+ options
76
+ end # parse()
77
+ end # class OptParser
78
+
79
+ options = OptParser.parse(ARGV)
80
+
81
+ num = -1
82
+ FastaReader::emit_fastarecord(-> { ARGF.gets }) { | rec |
83
+ num += 1
84
+ # --- Filtering
85
+ next if options.filter and not eval(options.filter)
86
+ if options.codonize
87
+ # --- Round sequence to nearest 3 nucleotides
88
+ size = rec.seq.size
89
+ rec.seq = rec.seq[0..size - (size % 3) - 1]
90
+ end
91
+ # --- Only use sequences from MIN size
92
+ next if options.min and rec.seq.size < options.min
93
+ # --- Truncate description to ID
94
+ rec.descr = rec.id if options.id
95
+
96
+ # --- rewrite
97
+ eval(options.rewrite) if options.rewrite
98
+ print rec.to_fasta
99
+ }
100
+
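The filter script above evaluates the `--filter` and `--rewrite` strings as Ruby, with `rec` and `num` visible to the expression. The following is a minimal sketch of that mechanism, not the gem's code: `Rec`, `filter_records` and `demo_records` are hypothetical names, and the record type is a plain Struct standing in for bigbio's FastaRecord.

```ruby
# Hypothetical record type; the real script works on bigbio's FastaRecord.
Rec = Struct.new(:id, :descr, :seq)

# Returns the records that pass the filter, after applying the optional rewrite.
# Both expressions are evaluated as Ruby with `rec` and `num` in scope, so they
# must come from a trusted source: eval runs arbitrary code.
def filter_records(records, filter: nil, rewrite: nil)
  kept = []
  records.each_with_index do |rec, num|
    next if filter && !eval(filter)   # skip records that fail the filter
    eval(rewrite) if rewrite          # the rewrite may modify rec in place
    kept << rec
  end
  kept
end

# Small in-memory data set used for illustration only.
def demo_records
  [
    Rec.new("a", "a one",   "CCCCAT"),
    Rec.new("b", "b two",   "ATATAT"),
    Rec.new("c", "c three", "GCCC")
  ]
end
```

Calling `filter_records(demo_records, filter: "num % 2 == 0", rewrite: "rec.seq = rec.seq.downcase")` keeps every other record and lower-cases the kept sequences. As in the script, the first record has `num` equal to 0.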
data/bin/fasta_sort.rb ADDED
@@ -0,0 +1,24 @@
1
+ #!/usr/bin/env ruby
2
+ #
3
+ # fasta_sort: Sorts a FASTA file and outputs sorted unique records as FASTA again
4
+ #
5
+ # Usage:
6
+ #
7
+ # fasta_sort inputfile(s)
8
+
9
+ require 'bio'
10
+
11
+ include Bio
12
+
13
+ table = Hash.new
14
+ ARGV.each do | fn |
15
+ Bio::FlatFile.auto(fn).each do | seq |
16
+ table[seq.definition] ||= seq.data
17
+ end
18
+ end
19
+
20
+ table.sort.each do | definition, data |
21
+ rec = Bio::FastaFormat.new('> '+definition.strip+"\n"+data)
22
+ print rec
23
+ end
24
+
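The sort script relies on two small idioms: `||=` keeps the first sequence seen for each definition, and `Hash#sort` returns the entries ordered by key. A minimal sketch without the BioRuby dependency, where `sort_unique` is a hypothetical helper and the input is assumed to be an array of `[definition, sequence]` pairs:

```ruby
# Hypothetical helper: pairs is an array of [definition, sequence] tuples.
def sort_unique(pairs)
  table = {}
  pairs.each do |definition, data|
    table[definition] ||= data   # first occurrence wins, later duplicates are ignored
  end
  table.sort                     # array of [definition, data], ordered by definition
end
```

Note that uniqueness is decided on the definition line only, so two records with the same definition but different sequences collapse into the first one.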
data/bin/getorf CHANGED
@@ -6,12 +6,8 @@
6
6
  # (aa_heuristic.fa and nt_heuristic.fa respectively)
7
7
  #
8
8
  # You can choose the heuristic on the command line (default stopstop).
9
- #
10
- # Author:: Pjotr Prins
11
- # Copyright:: 2009-2011
12
- # License:: Ruby License
13
- #
14
- # Copyright (C) 2009-2011 Pjotr Prins <pjotr.prins@thebird.nl>
9
+
10
+ $stderr.print "WARNING: This tool has one or more known bugs! Better use the EMBOSS getorf instead for now\n"
15
11
 
16
12
  rootpath = File.dirname(File.dirname(__FILE__))
17
13
  $: << File.join(rootpath,'lib')
@@ -48,10 +44,10 @@ EXAMPLE
48
44
  exit()
49
45
  }
50
46
 
51
- opts.on("-h heuristic", String, "Heuristic (stopstop)") do | s |
47
+ opts.on("-h heuristic", String, "Heuristic (default #{heuristic})") do | s |
52
48
  heuristic = s
53
49
  end
54
- opts.on("-s size", "--min-size", Integer, "Minimal sequence size") do | n |
50
+ opts.on("-s size", "--min-size", Integer, "Minimal sequence size (default #{minsize})") do | n |
55
51
  minsize = n
56
52
  end
57
53
  opts.on("--longest", "Only get longest ORF match") do
data/bin/nt2aa.rb CHANGED
@@ -3,11 +3,6 @@
3
3
  # Translate nucleotide sequences into aminoacids sequences in all
4
4
  # reading frames.
5
5
  #
6
- #
7
- # (: pjotrp 2009, 2012 rblicense :)
8
- #
9
- # Copyright (C) 2012 Pjotr Prins <pjotr.prins@thebird.nl>
10
-
11
6
  USAGE =<<EOM
12
7
  ruby #{__FILE__} [--six-frame] inputfile(s)
13
8
  EOM
@@ -44,7 +39,9 @@ ARGV.each do | fn |
44
39
 
45
40
  # ajpseqt = Biolib::Emboss.ajTrnSeqOrig(trnTable,ajpseq,frame)
46
41
  # aa = Biolib::Emboss.ajSeqGetSeqCopyC(ajpseqt)
47
- print "> ",rec.descr," [",frame.to_s,"]\n"
42
+ print ">",rec.descr
43
+ print " [",frame.to_s,"]" if do_sixframes
44
+ print "\n"
48
45
  print aa,"\n"
49
46
  end
50
47
  }
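The hunk above changes the header that nt2aa.rb prints: there is no longer a space after `>`, and the frame suffix appears only in six-frame mode. A small sketch of that formatting rule, with `fasta_header` as a hypothetical helper that is not part of the gem:

```ruby
# Hypothetical helper mirroring the header logic in the hunk above.
# frame is nil unless six-frame translation was requested.
def fasta_header(descr, frame = nil)
  header = ">" + descr
  header += " [#{frame}]" if frame
  header
end
```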
data/bio-bigbio.gemspec CHANGED
@@ -5,25 +5,28 @@
5
5
 
6
6
  Gem::Specification.new do |s|
7
7
  s.name = "bio-bigbio"
8
- s.version = "0.1.4"
8
+ s.version = "0.1.5"
9
9
 
10
10
  s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
11
11
  s.authors = ["Pjotr Prins"]
12
- s.date = "2012-02-03"
12
+ s.date = "2013-05-03"
13
13
  s.description = "Fasta reader, ORF emitter, sequence translation"
14
14
  s.email = "pjotr.public01@thebird.nl"
15
- s.executables = ["getorf", "nt2aa.rb"]
15
+ s.executables = ["fasta_filter.rb", "fasta_sort.rb", "getorf", "nt2aa.rb"]
16
16
  s.extra_rdoc_files = [
17
- "LICENSE",
17
+ "LICENSE.txt",
18
18
  "README.md"
19
19
  ]
20
20
  s.files = [
21
+ ".travis.yml",
21
22
  "Gemfile",
22
23
  "Gemfile.lock",
23
- "LICENSE",
24
+ "LICENSE.txt",
24
25
  "README.md",
25
26
  "Rakefile",
26
27
  "VERSION",
28
+ "bin/fasta_filter.rb",
29
+ "bin/fasta_sort.rb",
27
30
  "bin/getorf",
28
31
  "bin/nt2aa.rb",
29
32
  "bio-bigbio.gemspec",
@@ -42,6 +45,7 @@ Gem::Specification.new do |s|
42
45
  "lib/bigbio/db/fasta/fastarecord.rb",
43
46
  "lib/bigbio/db/fasta/fastawriter.rb",
44
47
  "lib/bigbio/db/fasta/indexer.rb",
48
+ "lib/bigbio/db/phylip.rb",
45
49
  "lib/bigbio/environment.rb",
46
50
  "lib/bigbio/sequence/predictorf.rb",
47
51
  "lib/bigbio/sequence/translate.rb",
@@ -130,3 +130,38 @@ class FastaReader
130
130
  end
131
131
 
132
132
  end
133
+
134
+ # The following is actually a module/trait implementation without state
135
+
136
+ class FastaReader
137
+
138
+ # getbuf_func returns the next FASTA buffer (nil at the end). Every parsed record is
139
+ # yielded.
140
+ #
141
+ def FastaReader::emit getbuf_func
142
+ seq = ""
143
+ id = nil
144
+ descr = nil
145
+ while buf = getbuf_func.call
146
+ buf.split(/\n/).each do | line |
147
+ if line =~ /^>/
148
+ yield id, descr, seq if descr
149
+ descr = line[1..-1].strip
150
+ matched = /^(\S+)/.match(descr)
151
+ id = matched[0]
152
+ seq = ""
153
+ else
154
+ seq += line.strip
155
+ end
156
+ end
157
+ end
158
+ yield id, descr, seq if descr and seq.size > 0
159
+ end
160
+
161
+ def FastaReader::emit_fastarecord getbuf_func
162
+ emit(getbuf_func) do | id, descr, seq |
163
+ yield FastaRecord.new(id, descr, seq)
164
+ end
165
+ end
166
+
167
+ end
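The emitter above takes a lambda as its data source, so the parser does not need to know whether the text comes from a file, STDIN or a string. The sketch below is a stand-alone approximation of that idea, assuming each buffer ends on a line boundary; `each_fasta`, `SAMPLE_FASTA` and `parse_sample` are hypothetical names and not the gem's API.

```ruby
require 'stringio'

# Hypothetical stand-alone parser, modelled on the FastaReader::emit hunk above.
# get_buf is any callable that returns the next chunk of text, or nil at the end.
def each_fasta(get_buf)
  id = descr = nil
  seq = ""
  while (buf = get_buf.call)
    buf.each_line do |line|
      line = line.strip
      if line.start_with?(">")
        yield id, descr, seq if descr   # flush the previous record
        descr = line[1..-1].strip
        id = descr[/\A\S+/]             # the ID is the first word of the description
        seq = ""
      else
        seq += line
      end
    end
  end
  yield id, descr, seq if descr && !seq.empty?   # flush the last record
end

SAMPLE_FASTA = ">seq1 first record\nACGT\nAC\n>seq2 second record\nGGCC\n"

# Feeds the parser line by line from an in-memory string.
def parse_sample
  io = StringIO.new(SAMPLE_FASTA)
  records = []
  each_fasta(-> { io.gets }) { |id, descr, seq| records << [id, descr, seq] }
  records
end
```

Only one record is held in memory at a time. The source lambda must eventually return nil, otherwise the loop never ends.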
@@ -7,6 +7,10 @@ class FastaRecord
7
7
  @descr = descr
8
8
  @seq = seq
9
9
  end
10
+
11
+ def to_fasta
12
+ ">"+@descr+"\n"+@seq+"\n"
13
+ end
10
14
  end
11
15
 
12
16
  class FastaPairedRecord
@@ -30,7 +34,9 @@ class FastaPairedRecord
30
34
  if nt.seq.size == aa.seq.size*3-3
31
35
  aa.seq.chop!
32
36
  end
33
- raise "Sequence size mismatch for #{nt.id} <nt:#{nt.seq.size} != #{aa.seq.size*3} (aa:#{aa.seq.size}*3)>" if nt.seq.size != aa.seq.size*3
37
+ nt_size = nt.seq.size
38
+ expected_size = aa.seq.size*3
39
+ # raise "Sequence size mismatch for #{nt.id} <nt:#{nt.seq.size} != #{aa.seq.size*3} (aa:#{aa.seq.size}*3)>" if expected_size - 3 > nt_size and nt_size > expected_size + 3
34
40
  end
35
41
 
36
42
  def id
data/lib/bigbio/db/phylip.rb ADDED
@@ -0,0 +1,49 @@
1
+ # Simple phylip reader. Supports PAML style files formatted as
2
+ #
3
+ # sequence 1
4
+ # AAGCTTCACCGGCGCAGTCATTCTCATAAT
5
+ # CGCCCACGGACTTACATCCTCATTACTATT
6
+ # sequence 2
7
+ # AAGCTTCACCGGCGCAATTATCCTCATAAT
8
+ # CGCCCACGGACTTACATCCTCATTATTATT
9
+ # sequence 3
10
+ # AAGCTTCACCGGCGCAGTTGTTCTTATAAT
11
+ # TGCCCACGGACTTACATCATCATTATTATT
12
+ # sequence 4
13
+ # AAGCTTCACCGGCGCAACCACCCTCATGAT
14
+ # TGCCCATGGACTCACATCCTCCCTACTGTT
15
+
16
+ module Bio
17
+ module Big
18
+ module PhylipReader
19
+ # Define get_line as a lambda function, e.g.
20
+ # Bio::Big::PhylipReader.emit_seq(-> { lines.next }) { | name, seq | p [name,seq] }
21
+
22
+ def PhylipReader::emit_seq get_line
23
+ line = get_line.call.strip
24
+ a = line.split
25
+ seq_num = a[0].to_i
26
+ seq_size = a[1].to_i
27
+ name = nil
28
+ seq = ""
29
+ while true
30
+ line = get_line.call
31
+ break if line == nil or line == ""
32
+ line = line.strip
33
+ if name == nil
34
+ name = line
35
+ next
36
+ end
37
+ seq += line
38
+ if seq.size >= seq_size
39
+ raise "Name wrong size for #{name}" if name.size > 20
40
+ raise "Sequence wrong size for #{name}" if seq.size > seq_size
41
+ yield name, seq
42
+ name = nil
43
+ seq = ""
44
+ end
45
+ end
46
+ end
47
+ end
48
+ end
49
+ end
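The reader above expects a first line with the number of sequences and the sequence length, followed by name lines and sequence lines. The following stand-alone sketch shows how such a reader can be driven by a lambda that returns nil at the end of input. It is an approximation under stated assumptions: `each_phylip_seq`, `SAMPLE_PHYLIP` and `parse_phylip_sample` are hypothetical names, and unlike the hunk above this version skips blank lines rather than stopping at them.

```ruby
require 'stringio'

# Hypothetical stand-alone reader, modelled on PhylipReader::emit_seq above.
# The first line holds "<number of sequences> <sequence length>"; each record
# is a name line followed by sequence lines until the length is reached.
def each_phylip_seq(get_line)
  header = get_line.call
  return if header.nil?
  _count, seq_size = header.split.map(&:to_i)
  name = nil
  seq = ""
  while (line = get_line.call)
    line = line.strip
    next if line.empty?              # assumption: blank lines are skipped
    if name.nil?
      name = line                    # first non-blank line of a record is its name
      next
    end
    seq += line
    if seq.size >= seq_size
      raise "Sequence wrong size for #{name}" if seq.size > seq_size
      yield name, seq
      name = nil
      seq = ""
    end
  end
end

SAMPLE_PHYLIP = "2 8\nseq_one\nACGT\nACGT\nseq_two\nGGGGCCCC\n"

# Reads the sample line by line; StringIO#gets returns nil at the end.
def parse_phylip_sample
  io = StringIO.new(SAMPLE_PHYLIP)
  out = []
  each_phylip_seq(-> { io.gets }) { |name, seq| out << [name, seq] }
  out
end
```

A source such as `-> { io.gets }` is convenient here because it returns nil when the input is exhausted, which ends the loop cleanly.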
data/spec/emitter_spec.rb CHANGED
@@ -20,6 +20,23 @@ describe Bio::Big::FastaEmitter, "when using the emitter" do
20
20
  end
21
21
  end
22
22
 
23
+ it "should emit functional style" do
24
+ count = 0
25
+ FastaReader::emit_fastarecord(-> { File.open("test/data/fasta/nt.fa").read }) { |rec|
26
+ case count
27
+ when 0
28
+ rec.id.should == "PUT-157a-Arabidopsis_thaliana-1"
29
+ rec.seq[0..10].should == "AGGTTCGNACG"
30
+ when 1
31
+ rec.id.should == "PUT-157a-Arabidopsis_thaliana-2"
32
+ rec.seq[0..10].should == "AGACAAACGAC"
33
+ else
34
+ break
35
+ end
36
+ count += 1
37
+ }
38
+ end
39
+
23
40
  it "should emit large parts" do
24
41
  FastaEmitter.new("test/data/fasta/nt.fa").emit_seq do | part, index, tag, seq |
25
42
  # p [index, part, tag, seq]
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: bio-bigbio
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.4
4
+ version: 0.1.5
5
5
  prerelease:
6
6
  platform: ruby
7
7
  authors:
@@ -9,11 +9,11 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2012-02-03 00:00:00.000000000Z
12
+ date: 2013-05-03 00:00:00.000000000Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
15
  name: bio
16
- requirement: &15446660 !ruby/object:Gem::Requirement
16
+ requirement: &27203900 !ruby/object:Gem::Requirement
17
17
  none: false
18
18
  requirements:
19
19
  - - ! '>='
@@ -21,10 +21,10 @@ dependencies:
21
21
  version: 1.4.1
22
22
  type: :runtime
23
23
  prerelease: false
24
- version_requirements: *15446660
24
+ version_requirements: *27203900
25
25
  - !ruby/object:Gem::Dependency
26
26
  name: bio-logger
27
- requirement: &15445800 !ruby/object:Gem::Requirement
27
+ requirement: &27203120 !ruby/object:Gem::Requirement
28
28
  none: false
29
29
  requirements:
30
30
  - - ! '>='
@@ -32,10 +32,10 @@ dependencies:
32
32
  version: 0.9.0
33
33
  type: :runtime
34
34
  prerelease: false
35
- version_requirements: *15445800
35
+ version_requirements: *27203120
36
36
  - !ruby/object:Gem::Dependency
37
37
  name: rspec
38
- requirement: &15445180 !ruby/object:Gem::Requirement
38
+ requirement: &27202300 !ruby/object:Gem::Requirement
39
39
  none: false
40
40
  requirements:
41
41
  - - ~>
@@ -43,10 +43,10 @@ dependencies:
43
43
  version: 2.3.0
44
44
  type: :development
45
45
  prerelease: false
46
- version_requirements: *15445180
46
+ version_requirements: *27202300
47
47
  - !ruby/object:Gem::Dependency
48
48
  name: bundler
49
- requirement: &15444540 !ruby/object:Gem::Requirement
49
+ requirement: &27201380 !ruby/object:Gem::Requirement
50
50
  none: false
51
51
  requirements:
52
52
  - - ~>
@@ -54,10 +54,10 @@ dependencies:
54
54
  version: 1.0.0
55
55
  type: :development
56
56
  prerelease: false
57
- version_requirements: *15444540
57
+ version_requirements: *27201380
58
58
  - !ruby/object:Gem::Dependency
59
59
  name: jeweler
60
- requirement: &15443800 !ruby/object:Gem::Requirement
60
+ requirement: &27200760 !ruby/object:Gem::Requirement
61
61
  none: false
62
62
  requirements:
63
63
  - - ~>
@@ -65,10 +65,10 @@ dependencies:
65
65
  version: 1.5.2
66
66
  type: :development
67
67
  prerelease: false
68
- version_requirements: *15443800
68
+ version_requirements: *27200760
69
69
  - !ruby/object:Gem::Dependency
70
70
  name: rcov
71
- requirement: &15440240 !ruby/object:Gem::Requirement
71
+ requirement: &27199840 !ruby/object:Gem::Requirement
72
72
  none: false
73
73
  requirements:
74
74
  - - ! '>='
@@ -76,23 +76,28 @@ dependencies:
76
76
  version: '0'
77
77
  type: :development
78
78
  prerelease: false
79
- version_requirements: *15440240
79
+ version_requirements: *27199840
80
80
  description: Fasta reader, ORF emitter, sequence translation
81
81
  email: pjotr.public01@thebird.nl
82
82
  executables:
83
+ - fasta_filter.rb
84
+ - fasta_sort.rb
83
85
  - getorf
84
86
  - nt2aa.rb
85
87
  extensions: []
86
88
  extra_rdoc_files:
87
- - LICENSE
89
+ - LICENSE.txt
88
90
  - README.md
89
91
  files:
92
+ - .travis.yml
90
93
  - Gemfile
91
94
  - Gemfile.lock
92
- - LICENSE
95
+ - LICENSE.txt
93
96
  - README.md
94
97
  - Rakefile
95
98
  - VERSION
99
+ - bin/fasta_filter.rb
100
+ - bin/fasta_sort.rb
96
101
  - bin/getorf
97
102
  - bin/nt2aa.rb
98
103
  - bio-bigbio.gemspec
@@ -111,6 +116,7 @@ files:
111
116
  - lib/bigbio/db/fasta/fastarecord.rb
112
117
  - lib/bigbio/db/fasta/fastawriter.rb
113
118
  - lib/bigbio/db/fasta/indexer.rb
119
+ - lib/bigbio/db/phylip.rb
114
120
  - lib/bigbio/environment.rb
115
121
  - lib/bigbio/sequence/predictorf.rb
116
122
  - lib/bigbio/sequence/translate.rb
@@ -139,7 +145,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
139
145
  version: '0'
140
146
  segments:
141
147
  - 0
142
- hash: -2436097965031091716
148
+ hash: 2941883289909211187
143
149
  required_rubygems_version: !ruby/object:Gem::Requirement
144
150
  none: false
145
151
  requirements:
data/LICENSE DELETED
@@ -1,34 +0,0 @@
1
- If a license is not specified the code contributed to BioBig defaults to the
2
- BSD license:
3
-
4
- Copyright (c) 2008, 2009 The BioLib Project
5
- All rights reserved.
6
-
7
- Redistribution and use in source and binary forms, with or without
8
- modification, are permitted provided that the following conditions are met:
9
-
10
- * Redistributions of source code must retain the above copyright notice,
11
- this list of conditions and the following disclaimer.
12
-
13
- * Redistributions in binary form must reproduce the above copyright notice,
14
- this list of conditions and the following disclaimer in the documentation
15
- and/or other materials provided with the distribution.
16
-
17
- * Neither the name of the The BioLib Project nor the names of
18
- its contributors may be used to endorse or promote products derived from
19
- this software without specific prior written permission.
20
-
21
- THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
22
- ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
23
- WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
24
- DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
25
- ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
26
- (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
27
- LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
28
- ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
29
- (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
30
- SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
31
-
32
- For more information on opensource software licenses see
33
- http://www.opensource.org/licenses/bsd-license.php,
34
- http://www.gnu.org/licenses/gpl.html and http://www.fsf.org/.