bio-velvet 0.0.1 → 0.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 798289b36dd93bb47a40f8f5c1e71ecf59305699
4
- data.tar.gz: 3b97653d1ca5fd6b62c1ab1f097c3ffe868ce04c
3
+ metadata.gz: 4150c97e0ffba2bebefade9a57dca252344e3550
4
+ data.tar.gz: ce80f5a1d0ae5443ab707f1cd93efa576ab27fcf
5
5
  SHA512:
6
- metadata.gz: ee227d4e19f9ce09edb22316aea8fa05fde499e27dcec8af4bec6afc7c8bbb7567b081f86cd64d0c34a00ec4d7e7c2202a3336914770ee00e053bf09907cf8f0
7
- data.tar.gz: f84054f0cff8d627a57d2431d3af35b19a9941d65da10aee04e38431e436bc28ddef4021904df445d31d59a87cddabe8b61dcc409abd6df1db19bc9f41930fb2
6
+ metadata.gz: 571473ce789272839ba47759af5f2c5a6e6cdfc97f206b21112549b25ac897eee7f1cad1b7cf08d59a703c448c75e50627e0c8e079cc477fcb2ea5632d6aaf9c
7
+ data.tar.gz: 42d86334301a5fa559c3de76144951717bd20514b3025b594c052e99ce17414a0acb9d68a2215bef93b48b16f80ee39e926959ee2c0efad61543152d1263ee75
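The digests in checksums.yaml cover the two archives packed inside the .gem file, `metadata.gz` and `data.tar.gz`. A minimal sketch of how such digests could be recomputed with Ruby's standard `digest` library is below; the helper name `gem_part_digests` is hypothetical, and a temporary file containing `abc` stands in for a real archive.

```ruby
require 'digest'
require 'tempfile'

# Hypothetical helper (not part of RubyGems or bio-velvet): returns the hex
# digests of one file, keyed the same way as the sections of checksums.yaml.
def gem_part_digests(path)
  {
    'SHA1'   => Digest::SHA1.file(path).hexdigest,
    'SHA512' => Digest::SHA512.file(path).hexdigest
  }
end

# Stand-in for an unpacked data.tar.gz: a temporary file containing "abc".
digests = nil
Tempfile.create('data.tar.gz') do |f|
  f.write('abc')
  f.flush
  digests = gem_part_digests(f.path)
end

puts digests['SHA1']
puts digests['SHA512']
```

Comparing these values against the corresponding lines of checksums.yaml would show whether an unpacked archive matches the published one.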
data/Gemfile CHANGED
@@ -1,17 +1,16 @@
1
1
  source "http://rubygems.org"
2
2
 
3
- gem 'bio-logger', '>=1.0.1'
4
- gem 'systemu'
5
- gem 'files'
6
- gem 'hopcsv', '>= 0.4.3'
3
+ gem 'bio-logger', '~>1.0'
4
+ gem 'systemu', '~>2.6'
5
+ gem 'files', '~>0.3'
6
+ gem 'hopcsv', '~> 0.4'
7
7
 
8
8
  # Add dependencies to develop your gem here.
9
9
  # Include everything needed to run rake, tests, features, etc.
10
10
  group :development do
11
- gem "rspec", ">= 2.8.0"
12
- gem "rdoc", ">= 3.12"
13
- gem "jeweler", ">= 1.8.4"
14
- gem "bundler", ">= 1.0.21"
15
- gem "bio", ">= 1.4.2"
16
- gem "rdoc", ">= 3.12"
11
+ gem "rspec", "~> 2.8"
12
+ gem "jeweler", "~> 2.0"
13
+ gem "bundler", "~> 1.0"
14
+ gem "bio", "~> 1.4"
15
+ gem "rdoc", "~> 4.1"
17
16
  end
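The Gemfile hunk above replaces open-ended `>=` constraints with pessimistic `~>` constraints. As a rough illustration of the difference, the sketch below checks a few version numbers against both styles using the `Gem::Requirement` class that ships with RubyGems; the helper `allowed?` and the version numbers tried are illustrative only.

```ruby
require 'rubygems'

# Hypothetical helper: true if the version string satisfies the constraint.
def allowed?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# '~> 1.0' accepts any 1.x release from 1.0 upwards, but not 2.0.
puts allowed?('~> 1.0', '1.0.1')
puts allowed?('~> 1.0', '1.9.3')
puts allowed?('~> 1.0', '2.0.0')

# The old open-ended constraint also accepted the next major version.
puts allowed?('>= 1.0.1', '2.0.0')

# Two-segment pessimistic constraints pin the major version only, so
# '~> 0.4' accepts 0.4.0, which the old '>= 0.4.3' rejected.
puts allowed?('~> 0.4', '0.4.0')
puts allowed?('>= 0.4.3', '0.4.0')
```

One consequence visible here is that moving hopcsv from `>= 0.4.3` to `~> 0.4` lowers the minimum accepted version while capping the maximum below 1.0.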
data/README.md CHANGED
@@ -23,13 +23,25 @@ contigs_file = velvet_result.contigs_path #=> path to contigs file as a String
23
23
  lastgraph_file = velvet_result.last_graph_path #=> path to last graph file as a String
24
24
  ```
25
25
 
26
- The graph file can be then parsed from the ```velvet_result```:
26
+ By default, the ```velvet``` method passes no parameters to ```velvetg``` other than the velvet directory created by velveth. This directory is a temporary directory by default, but this can also be set. For instance, to run velvet with a ```-cov_cutoff``` parameter in the ```velvet_dir``` directory:
27
+ ```ruby
28
+ velvet_result = Bio::Velvet::Runner.new.velvet(87,
29
+ '-short /path/to/reads.fa',
30
+ '-cov_cutoff 3.5',
31
+ :output_assembly_path => 'velvet_dir')
32
+ ```
33
+
34
+ The graph file can be parsed from a ```velvet_result```:
27
35
  ```ruby
28
36
  graph = velvet_result.last_graph #=> Bio::Velvet::Graph object
29
37
  ```
30
- In my experience (mostly on complex metagenomes), the graph object itself does not take as much RAM as I initially expected. Most of the hard work has already been done by velvet itself, particularly if the ```-cov_cutoff``` has been set. However parsing in the graph can take many minutes if the LastGraph file is big (>500MB).
38
+ In my experience (mostly on complex metagenomes), the graph object itself does not take as much RAM as initially expected. Most of the hard work has already been done by velvet itself, particularly if the ```-cov_cutoff``` has been set. However, parsing the graph can take many minutes or even hours if the LastGraph file is big (>500MB). The slowest part is parsing in the positions of reads, which are present when velvet was run with the ```-read_trkg yes``` option. To speed this up, parsing can be restricted to the reads of interest, e.g.
39
+ ```ruby
40
+ velvet_result.last_graph(:interesting_read_ids => Set.new([1,2,3]))
41
+ ```
42
+ This parses in the positions of reads 1, 2 and 3 only.
31
43
 
32
- With this graph you can access interact with the graph e.g.
44
+ With a parsed graph (a ```Bio::Velvet::Graph``` object) you can interact with the graph e.g.
33
45
  ```ruby
34
46
  graph.kmer_length #=> 87
35
47
  graph.nodes #=> Bio::Velvet::Graph::NodeArray object
@@ -37,7 +49,7 @@ graph.nodes[3] #=> Bio::Velvet::Graph::Node object with node ID 3
37
49
  graph.get_arcs_by_node_id(1, 3) #=> an array of arcs between nodes 1 and 3 (Bio::Velvet::Graph::Arc objects)
38
50
  graph.nodes[5].noded_reads #=> array of Bio::Velvet::Graph::NodedRead objects, for read tracking
39
51
  ```
40
- There is much more that can be done to interact with the graph object and its components - see the [rubydoc](http://rubydoc.info/gems/bio-velvet).
52
+ There is much more that can be done to interact with the graph object and its components - see the [rubydoc](http://rubydoc.info/gems/bio-velvet/Bio/Velvet/Graph).
41
53
 
42
54
  ## Project home page
43
55
 
@@ -54,7 +66,7 @@ This code is currently unpublished.
54
66
 
55
67
  ## Biogems.info
56
68
 
57
- This Biogem is published at (http://biogems.info/index.html#bio-velvet)
69
+ This Biogem is listed at [biogems.info](http://biogems.info/index.html#bio-velvet)
58
70
 
59
71
  ## Copyright
60
72
 
data/VERSION CHANGED
@@ -1 +1 @@
1
- 0.0.1
1
+ 0.1.0
@@ -12,13 +12,13 @@ module Bio
12
12
  class Graph
13
13
  include Bio::Velvet::Logging
14
14
 
15
- # $NUMBER_OF_NODES $NUMBER_OF_SEQUENCES $HASH_LENGTH
15
+ # Statistics about the graph, taken directly from the header line of the graph file, i.e. from the velvet manual "$NUMBER_OF_NODES $NUMBER_OF_SEQUENCES $HASH_LENGTH"
16
16
  attr_accessor :number_of_nodes, :number_of_sequences, :hash_length
17
17
 
18
18
  # NodeArray object of all the graph's node objects
19
19
  attr_accessor :nodes
20
20
 
21
- # Array of Arc objects
21
+ # ArcArray of Arc objects
22
22
  attr_accessor :arcs
23
23
 
24
24
  def self.log
@@ -27,7 +27,12 @@ module Bio
27
27
 
28
28
  # Parse a graph file from a Graph, Graph2 or LastGraph output file from velvet
29
29
  # into a Bio::Velvet::Graph object
30
- def self.parse_from_file(path_to_graph_file)
30
+ #
31
+ # Options:
32
+ # * :interesting_read_ids: If not nil, a Set of read IDs that we are interested in. Reads
33
+ # not of interest will not be parsed in (the NR part of the velvet LastGraph file). Regardless, all
34
+ # nodes and edges are parsed in. Using this option saves both memory and CPU.
35
+ def self.parse_from_file(path_to_graph_file, options={})
31
36
  graph = self.new
32
37
  state = :header
33
38
 
@@ -126,8 +131,13 @@ module Bio
126
131
  next
127
132
  else
128
133
  raise unless row.length == 3
134
+ read_id = row[0].to_i
135
+ if options[:interesting_read_ids] and !options[:interesting_read_ids].include?(read_id)
136
+ # We have come across an uninteresting read. Ignore it.
137
+ next
138
+ end
129
139
  nr = NodedRead.new
130
- nr.read_id = row[0].to_i
140
+ nr.read_id = read_id
131
141
  nr.offset_from_start_of_node = row[1].to_i
132
142
  nr.start_coord = row[2].to_i
133
143
  nr.direction = current_node_direction
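The filtering added in this hunk can be sketched in isolation as follows. This is a simplified stand-in rather than the gem's parser: `NodedReadSketch` and `parse_noded_reads` are hypothetical names, node direction is left out, and the rows are the two reads quoted in the spec's comment for the `NR -951 2` block.

```ruby
require 'set'

# Hypothetical stand-in for Bio::Velvet::Graph::NodedRead.
NodedReadSketch = Struct.new(:read_id, :offset_from_start_of_node, :start_coord)

# rows: one 3-column array of strings per read line of an NR block.
# interesting_read_ids: nil to keep every read, or a Set of read IDs to keep.
def parse_noded_reads(rows, interesting_read_ids = nil)
  rows.each_with_object([]) do |row, reads|
    raise ArgumentError, "expected 3 columns, got #{row.length}" unless row.length == 3
    read_id = row[0].to_i
    # Skip uninteresting reads before allocating an object for them.
    next if interesting_read_ids && !interesting_read_ids.include?(read_id)
    reads << NodedReadSketch.new(read_id, row[1].to_i, row[2].to_i)
  end
end

rows = [%w[47210 0 0], %w[47223 41 0]]
kept = parse_noded_reads(rows, Set.new([47223]))
everything = parse_noded_reads(rows)

puts kept.length
puts everything.length
```

Because the membership test runs before any object is built, skipped reads cost only a string-to-integer conversion and a Set lookup, which is presumably where the memory and CPU savings come from.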
@@ -191,7 +201,7 @@ module Bio
191
201
  # are deleted first, and then the node, so that the graph remains sane at all
192
202
  # times - there are never dangling arcs, as such.
193
203
  #
194
- # Returns a [deleted_nodes, deleted_arc] tuple, which are both enumerables,
204
+ # Returns a [deleted_nodes, deleted_arcs] tuple, which are both enumerables,
195
205
  # each in no particular order.
196
206
  def delete_nodes_if(&block)
197
207
  deleted_nodes = []
@@ -82,8 +82,9 @@ module Bio
82
82
  File.join result_directory, 'stats.txt'
83
83
  end
84
84
 
85
- # Return a Bio::Velvet::Graph object built from the LastGraph file
86
- def last_graph
85
+ # Return a Bio::Velvet::Graph object built from the LastGraph file.
86
+ # The options for parsing are as per Bio::Velvet::Graph#parse_from_file
87
+ def last_graph(options=nil)
87
88
  Bio::Velvet::Graph.parse_from_file(last_graph_path)
88
89
  end
89
90
  end
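Judging only by the lines shown in this hunk, `last_graph` now accepts `options` but the unchanged call still reads `parse_from_file(last_graph_path)`, so the options would appear not to reach the parser. The default of `nil` would also fail on `options[:interesting_read_ids]` if it were forwarded as-is. A minimal sketch of forwarding with an empty-hash default is below; the `Sketch` module and its classes are hypothetical stand-ins, and the `LastGraph` file name inside `last_graph_path` is an assumption since that method's body is not part of the diff.

```ruby
require 'set'

# Hypothetical stand-ins for the gem's classes, just enough to show forwarding.
module Sketch
  class Graph
    attr_reader :path, :options

    def initialize(path, options)
      @path = path
      @options = options
    end

    # Mirrors the new signature: parse_from_file(path_to_graph_file, options={}).
    # No file is read here; the arguments are only recorded.
    def self.parse_from_file(path_to_graph_file, options = {})
      new(path_to_graph_file, options)
    end
  end

  class Result
    attr_accessor :result_directory

    # Assumed location of the graph file inside the result directory.
    def last_graph_path
      File.join(result_directory, 'LastGraph')
    end

    # Defaulting to {} keeps options[:key] lookups safe; passing the hash
    # along lets :interesting_read_ids reach the parser.
    def last_graph(options = {})
      Graph.parse_from_file(last_graph_path, options)
    end
  end
end

result = Sketch::Result.new
result.result_directory = 'velvet_dir'
graph = result.last_graph(:interesting_read_ids => Set.new([1, 2, 3]))
default_graph = result.last_graph

puts graph.path
puts graph.options[:interesting_read_ids].include?(2)
```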
@@ -93,6 +93,31 @@ describe "BioVelvet" do
93
93
  graph.nodes.collect{|n| n.short_reads.nil? ? 0 : n.short_reads.length}.reduce(:+).should eq(40327)
94
94
  end
95
95
 
96
+ it 'should ignore read_ids when they are uninteresting when parsing the graph' do
97
+ graph = Bio::Velvet::Graph.parse_from_file(
98
+ File.join(TEST_DATA_DIR, 'velvet_test_reads_assembly_read_tracking','Graph2'),
99
+ :interesting_read_ids => Set.new([47223])
100
+ )
101
+ graph.should be_kind_of(Bio::Velvet::Graph)
102
+
103
+ graph.number_of_nodes.should eq(967)
104
+ graph.number_of_sequences.should eq(50000)
105
+ graph.hash_length.should eq(31)
106
+
107
+ # NR -951 2
108
+ #47210 0 0
109
+ #47223 41 0
110
+ # ====later
111
+ # NR 951 2
112
+ # 47209 54 0
113
+ # 47224 0 0
114
+ node = graph.nodes[951]
115
+ node.short_reads.length.should eq(1)
116
+ node.number_of_short_reads.should eq(4)
117
+ node.short_reads[0].read_id.should eq(47223)
118
+ node.short_reads[0].offset_from_start_of_node.should eq(41)
119
+ end
120
+
96
121
  it 'should return sets of arcs by id' do
97
122
  graph = Bio::Velvet::Graph.parse_from_file File.join(TEST_DATA_DIR, 'velvet_test_reads_assembly','LastGraph')
98
123
  # ARC 2 -578 1
metadata CHANGED
@@ -1,155 +1,141 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: bio-velvet
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.1
4
+ version: 0.1.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Ben J Woodcroft
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2013-12-20 00:00:00.000000000 Z
11
+ date: 2014-01-06 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: bio-logger
15
15
  requirement: !ruby/object:Gem::Requirement
16
16
  requirements:
17
- - - '>='
17
+ - - "~>"
18
18
  - !ruby/object:Gem::Version
19
- version: 1.0.1
19
+ version: '1.0'
20
20
  type: :runtime
21
21
  prerelease: false
22
22
  version_requirements: !ruby/object:Gem::Requirement
23
23
  requirements:
24
- - - '>='
24
+ - - "~>"
25
25
  - !ruby/object:Gem::Version
26
- version: 1.0.1
26
+ version: '1.0'
27
27
  - !ruby/object:Gem::Dependency
28
28
  name: systemu
29
29
  requirement: !ruby/object:Gem::Requirement
30
30
  requirements:
31
- - - '>='
31
+ - - "~>"
32
32
  - !ruby/object:Gem::Version
33
- version: '0'
33
+ version: '2.6'
34
34
  type: :runtime
35
35
  prerelease: false
36
36
  version_requirements: !ruby/object:Gem::Requirement
37
37
  requirements:
38
- - - '>='
38
+ - - "~>"
39
39
  - !ruby/object:Gem::Version
40
- version: '0'
40
+ version: '2.6'
41
41
  - !ruby/object:Gem::Dependency
42
42
  name: files
43
43
  requirement: !ruby/object:Gem::Requirement
44
44
  requirements:
45
- - - '>='
45
+ - - "~>"
46
46
  - !ruby/object:Gem::Version
47
- version: '0'
47
+ version: '0.3'
48
48
  type: :runtime
49
49
  prerelease: false
50
50
  version_requirements: !ruby/object:Gem::Requirement
51
51
  requirements:
52
- - - '>='
52
+ - - "~>"
53
53
  - !ruby/object:Gem::Version
54
- version: '0'
54
+ version: '0.3'
55
55
  - !ruby/object:Gem::Dependency
56
56
  name: hopcsv
57
57
  requirement: !ruby/object:Gem::Requirement
58
58
  requirements:
59
- - - '>='
59
+ - - "~>"
60
60
  - !ruby/object:Gem::Version
61
- version: 0.4.3
61
+ version: '0.4'
62
62
  type: :runtime
63
63
  prerelease: false
64
64
  version_requirements: !ruby/object:Gem::Requirement
65
65
  requirements:
66
- - - '>='
66
+ - - "~>"
67
67
  - !ruby/object:Gem::Version
68
- version: 0.4.3
68
+ version: '0.4'
69
69
  - !ruby/object:Gem::Dependency
70
70
  name: rspec
71
71
  requirement: !ruby/object:Gem::Requirement
72
72
  requirements:
73
- - - '>='
73
+ - - "~>"
74
74
  - !ruby/object:Gem::Version
75
- version: 2.8.0
75
+ version: '2.8'
76
76
  type: :development
77
77
  prerelease: false
78
78
  version_requirements: !ruby/object:Gem::Requirement
79
79
  requirements:
80
- - - '>='
80
+ - - "~>"
81
81
  - !ruby/object:Gem::Version
82
- version: 2.8.0
83
- - !ruby/object:Gem::Dependency
84
- name: rdoc
85
- requirement: !ruby/object:Gem::Requirement
86
- requirements:
87
- - - '>='
88
- - !ruby/object:Gem::Version
89
- version: '3.12'
90
- type: :development
91
- prerelease: false
92
- version_requirements: !ruby/object:Gem::Requirement
93
- requirements:
94
- - - '>='
95
- - !ruby/object:Gem::Version
96
- version: '3.12'
82
+ version: '2.8'
97
83
  - !ruby/object:Gem::Dependency
98
84
  name: jeweler
99
85
  requirement: !ruby/object:Gem::Requirement
100
86
  requirements:
101
- - - '>='
87
+ - - "~>"
102
88
  - !ruby/object:Gem::Version
103
- version: 1.8.4
89
+ version: '2.0'
104
90
  type: :development
105
91
  prerelease: false
106
92
  version_requirements: !ruby/object:Gem::Requirement
107
93
  requirements:
108
- - - '>='
94
+ - - "~>"
109
95
  - !ruby/object:Gem::Version
110
- version: 1.8.4
96
+ version: '2.0'
111
97
  - !ruby/object:Gem::Dependency
112
98
  name: bundler
113
99
  requirement: !ruby/object:Gem::Requirement
114
100
  requirements:
115
- - - '>='
101
+ - - "~>"
116
102
  - !ruby/object:Gem::Version
117
- version: 1.0.21
103
+ version: '1.0'
118
104
  type: :development
119
105
  prerelease: false
120
106
  version_requirements: !ruby/object:Gem::Requirement
121
107
  requirements:
122
- - - '>='
108
+ - - "~>"
123
109
  - !ruby/object:Gem::Version
124
- version: 1.0.21
110
+ version: '1.0'
125
111
  - !ruby/object:Gem::Dependency
126
112
  name: bio
127
113
  requirement: !ruby/object:Gem::Requirement
128
114
  requirements:
129
- - - '>='
115
+ - - "~>"
130
116
  - !ruby/object:Gem::Version
131
- version: 1.4.2
117
+ version: '1.4'
132
118
  type: :development
133
119
  prerelease: false
134
120
  version_requirements: !ruby/object:Gem::Requirement
135
121
  requirements:
136
- - - '>='
122
+ - - "~>"
137
123
  - !ruby/object:Gem::Version
138
- version: 1.4.2
124
+ version: '1.4'
139
125
  - !ruby/object:Gem::Dependency
140
126
  name: rdoc
141
127
  requirement: !ruby/object:Gem::Requirement
142
128
  requirements:
143
- - - '>='
129
+ - - "~>"
144
130
  - !ruby/object:Gem::Version
145
- version: '3.12'
131
+ version: '4.1'
146
132
  type: :development
147
133
  prerelease: false
148
134
  version_requirements: !ruby/object:Gem::Requirement
149
135
  requirements:
150
- - - '>='
136
+ - - "~>"
151
137
  - !ruby/object:Gem::Version
152
- version: '3.12'
138
+ version: '4.1'
153
139
  description: Parser to work with some file formats used in the velvet DNA assembler
154
140
  email: donttrustben@gmail.com
155
141
  executables: []
@@ -158,9 +144,9 @@ extra_rdoc_files:
158
144
  - LICENSE.txt
159
145
  - README.md
160
146
  files:
161
- - .document
162
- - .rspec
163
- - .travis.yml
147
+ - ".document"
148
+ - ".rspec"
149
+ - ".travis.yml"
164
150
  - Gemfile
165
151
  - LICENSE.txt
166
152
  - README.md
@@ -194,17 +180,17 @@ require_paths:
194
180
  - lib
195
181
  required_ruby_version: !ruby/object:Gem::Requirement
196
182
  requirements:
197
- - - '>='
183
+ - - ">="
198
184
  - !ruby/object:Gem::Version
199
185
  version: '0'
200
186
  required_rubygems_version: !ruby/object:Gem::Requirement
201
187
  requirements:
202
- - - '>='
188
+ - - ">="
203
189
  - !ruby/object:Gem::Version
204
190
  version: '0'
205
191
  requirements: []
206
192
  rubyforge_project:
207
- rubygems_version: 2.0.3
193
+ rubygems_version: 2.2.0
208
194
  signing_key:
209
195
  specification_version: 4
210
196
  summary: Parser to work with file formats used in the velvet DNA assembler