bio-velvet 0.0.1 → 0.1.0
- checksums.yaml +4 -4
- data/Gemfile +9 -10
- data/README.md +17 -5
- data/VERSION +1 -1
- data/lib/bio-velvet/graph.rb +15 -5
- data/lib/bio-velvet/runner.rb +3 -2
- data/spec/bio-velvet_graph_spec.rb +25 -0
- metadata +44 -58
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 4150c97e0ffba2bebefade9a57dca252344e3550
+  data.tar.gz: ce80f5a1d0ae5443ab707f1cd93efa576ab27fcf
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 571473ce789272839ba47759af5f2c5a6e6cdfc97f206b21112549b25ac897eee7f1cad1b7cf08d59a703c448c75e50627e0c8e079cc477fcb2ea5632d6aaf9c
+  data.tar.gz: 42d86334301a5fa559c3de76144951717bd20514b3025b594c052e99ce17414a0acb9d68a2215bef93b48b16f80ee39e926959ee2c0efad61543152d1263ee75
data/Gemfile
CHANGED
@@ -1,17 +1,16 @@
 source "http://rubygems.org"
 
-gem 'bio-logger', '
-gem 'systemu'
-gem 'files'
-gem 'hopcsv', '
+gem 'bio-logger', '~>1.0'
+gem 'systemu', '~>2.6'
+gem 'files', '~>0.3'
+gem 'hopcsv', '~> 0.4'
 
 # Add dependencies to develop your gem here.
 # Include everything needed to run rake, tests, features, etc.
 group :development do
-  gem "rspec", "
-  gem "
-  gem "
-  gem "
-  gem "
-  gem "rdoc", ">= 3.12"
+  gem "rspec", "~> 2.8"
+  gem "jeweler", "~> 2.0"
+  gem "bundler", "~> 1.0"
+  gem "bio", "~> 1.4"
+  gem "rdoc", "~> 4.1"
 end
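Every dependency in the 0.1.0 Gemfile now uses the pessimistic operator `~>`. As a short sketch of what that operator admits, the check below uses RubyGems' own Gem::Requirement and Gem::Version classes. The helper name allowed? and the candidate version numbers are illustrative assumptions, not values taken from this diff.

```ruby
require 'rubygems'

# Hypothetical helper: does the given version string satisfy the constraint?
def allowed?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# '~> 2.6' accepts 2.6 and later 2.x releases, but not 3.0 or anything below 2.6.
puts allowed?('~> 2.6', '2.6.4')
puts allowed?('~> 2.6', '2.5.9')
puts allowed?('~> 2.6', '3.0.0')
```

With a two-segment constraint such as `~> 2.6`, only the last segment may grow, so the major version stays fixed.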
data/README.md
CHANGED
@@ -23,13 +23,25 @@ contigs_file = velvet_result.contigs_path #=> path to contigs file as a String
 lastgraph_file = velvet_result.last_graph_path #=> path to last graph file as a String
 ```
 
-
+By default, the ```velvet``` method passes no parameters to ```velvetg``` other than the velvet directory created by velveth. This directory is a temporary directory by default, but this can also be set. For instance, to run velvet using with a ```-cov_cutoff``` parameter in the ```velvet_dir``` directory:
+```ruby
+velvet_result = Bio::Velvet::Runner.new.velvet(87,
+'-short /path/to/reads.fa',
+'-cov_cutoff 3.5',
+:output_assembly_path => 'velvet_dir')
+```
+
+The graph file can be parsed from a ```velvet_result```:
 ```ruby
 graph = velvet_result.last_graph #=> Bio::Velvet::Graph object
 ```
-In my experience (mostly on complex metagenomes), the graph object itself does not take as much RAM as
+In my experience (mostly on complex metagenomes), the graph object itself does not take as much RAM as initially expected. Most of the hard work has already been done by velvet itself, particularly if the ```-cov_cutoff``` has been set. However parsing in the graph can take many minutes or even hours if the LastGraph file is big (>500MB). The slowest part of parsing is parsing in the positions of reads i.e. using the ```-read_trkg yes``` velvet option. To speed up that process one can use e.g.
+```ruby
+velvet_result.last_graph(:interesting_read_ids => Set.new([1,2,3]))
+```
+To only parse read in the positions of the first 3 reads.
 
-With
+With a parsed graph (a ```Bio::Velvet::Graph``` object) you can interact with the graph e.g.
 ```ruby
 graph.kmer_length #=> 87
 graph.nodes #=> Bio::Velvet::Graph::NodeArray object
@@ -37,7 +49,7 @@ graph.nodes[3] #=> Bio::Velvet::Graph::Node object with node ID 3
 graph.get_arcs_by_node_id(1, 3) #=> an array of arcs between nodes 1 and 3 (Bio::Velvet::Graph::Arc objects)
 graph.nodes[5].noded_reads #=> array of Bio::Velvet::Graph::NodedRead objects, for read tracking
 ```
-There is much more that can be done to interact with the graph object and its components - see the [rubydoc](http://rubydoc.info/gems/bio-velvet).
+There is much more that can be done to interact with the graph object and its components - see the [rubydoc](http://rubydoc.info/gems/bio-velvet/Bio/Velvet/Graph).
 
 ## Project home page
 
@@ -54,7 +66,7 @@ This code is currently unpublished.
 
 ## Biogems.info
 
-This Biogem is
+This Biogem is listed at [biogems.info](http://biogems.info/index.html#bio-velvet)
 
 ## Copyright
 
data/VERSION
CHANGED
@@ -1 +1 @@
-0.0
+0.1.0
data/lib/bio-velvet/graph.rb
CHANGED
@@ -12,13 +12,13 @@ module Bio
 class Graph
   include Bio::Velvet::Logging
 
-  # $NUMBER_OF_NODES $NUMBER_OF_SEQUENCES $HASH_LENGTH
+  # Taken directly from the graph, statistics and information about the Graph i.e. from the velvet manual "$NUMBER_OF_NODES $NUMBER_OF_SEQUENCES $HASH_LENGTH"
   attr_accessor :number_of_nodes, :number_of_sequences, :hash_length
 
   # NodeArray object of all the graph's node objects
   attr_accessor :nodes
 
-  #
+  # ArcArray of Arc objects
   attr_accessor :arcs
 
   def self.log
@@ -27,7 +27,12 @@ module Bio
 
 # Parse a graph file from a Graph, Graph2 or LastGraph output file from velvet
 # into a Bio::Velvet::Graph object
-
+#
+# Options:
+# * :interesting_read_ids: If not nil, is a Set of nodes that we are interested in. Reads
+# not of interest will not be parsed in (the NR part of the velvet LastGraph file). Regardless all
+# nodes and edges are parsed in. Using this options saves both memory and CPU.
+def self.parse_from_file(path_to_graph_file, options={})
   graph = self.new
   state = :header
 
@@ -126,8 +131,13 @@ module Bio
   next
 else
   raise unless row.length == 3
+  read_id = row[0].to_i
+  if options[:interesting_read_ids] and !options[:interesting_read_ids].include?(read_id)
+    # We have come across an uninteresting read. Ignore it.
+    next
+  end
   nr = NodedRead.new
-  nr.read_id =
+  nr.read_id = read_id
   nr.offset_from_start_of_node = row[1].to_i
   nr.start_coord = row[2].to_i
   nr.direction = current_node_direction
@@ -191,7 +201,7 @@ module Bio
 # are deleted first, and then the node, so that the graph remains sane at all
 # times - there is never dangling arcs, as such.
 #
-# Returns a [deleted_nodes,
+# Returns a [deleted_nodes, deleted_arcs] tuple, which are both enumerables,
 # each in no particular order.
 def delete_nodes_if(&block)
   deleted_nodes = []
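The third hunk above skips noded-read rows whose read ID is not in the :interesting_read_ids Set. The following is a standalone sketch of that filtering step, not the gem's parser: the names NodedReadSketch and filter_noded_reads are hypothetical, and rows are assumed to be three whitespace-separated integer fields, as the `row.length == 3` check suggests. The two sample rows are the ones quoted in the comments of the new spec.

```ruby
require 'set'

# Hypothetical stand-in for the gem's NodedRead, limited to fields seen in the hunk.
NodedReadSketch = Struct.new(:read_id, :offset_from_start_of_node, :start_coord)

# Keep only rows whose read ID is in interesting_read_ids.
# A nil set means "keep everything", mirroring the default behaviour.
def filter_noded_reads(lines, interesting_read_ids = nil)
  lines.each_with_object([]) do |line, kept|
    row = line.split
    raise "unexpected row: #{line.inspect}" unless row.length == 3
    read_id = row[0].to_i
    # An uninteresting read is skipped before any object is allocated.
    next if interesting_read_ids && !interesting_read_ids.include?(read_id)
    kept << NodedReadSketch.new(read_id, row[1].to_i, row[2].to_i)
  end
end

rows = ['47210 0 0', '47223 41 0']
p filter_noded_reads(rows, Set.new([47223])).map(&:read_id)
```

Testing membership before building the object is what lets the option save both memory and CPU, since a Set lookup is cheap compared with allocating a record per read.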
data/lib/bio-velvet/runner.rb
CHANGED
@@ -82,8 +82,9 @@ module Bio
     File.join result_directory, 'stats.txt'
   end
 
-  # Return a Bio::Velvet::Graph object built from the LastGraph file
-
+  # Return a Bio::Velvet::Graph object built from the LastGraph file.
+  # The options for parsing are as per Bio::Velvet::Graph#parse_from_file
+  def last_graph(options=nil)
     Bio::Velvet::Graph.parse_from_file(last_graph_path)
   end
 end
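Review note: in the hunk as shown, last_graph now accepts options, but the unchanged context line still calls parse_from_file(last_graph_path) with a single argument, so the options do not appear to reach the parser. Below is a minimal sketch of forwarding them, using stand-in classes so it runs on its own. GraphStub and ResultStub are hypothetical names, not classes from bio-velvet.

```ruby
# Stand-in for Bio::Velvet::Graph: records what it was asked to parse.
class GraphStub
  def self.parse_from_file(path, options = {})
    { :path => path, :options => options }
  end
end

# Stand-in for the result object that owns last_graph_path.
class ResultStub
  def last_graph_path
    'LastGraph'
  end

  # Forward the caller's options, turning the nil default into an empty Hash
  # so the parser can index into it safely.
  def last_graph(options = nil)
    GraphStub.parse_from_file(last_graph_path, options || {})
  end
end

p ResultStub.new.last_graph(:interesting_read_ids => [1, 2, 3])
```

The `options || {}` guard matters because the parser hunk indexes `options[:interesting_read_ids]` directly, which would raise on nil.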
data/spec/bio-velvet_graph_spec.rb
CHANGED
@@ -93,6 +93,31 @@ describe "BioVelvet" do
   graph.nodes.collect{|n| n.short_reads.nil? ? 0 : n.short_reads.length}.reduce(:+).should eq(40327)
 end
 
+it 'should ignore read_ids when they are uninteresting when parsing the graph' do
+  graph = Bio::Velvet::Graph.parse_from_file(
+    File.join(TEST_DATA_DIR, 'velvet_test_reads_assembly_read_tracking','Graph2'),
+    :interesting_read_ids => Set.new([47223])
+  )
+  graph.should be_kind_of(Bio::Velvet::Graph)
+
+  graph.number_of_nodes.should eq(967)
+  graph.number_of_sequences.should eq(50000)
+  graph.hash_length.should eq(31)
+
+  # NR -951 2
+  #47210 0 0
+  #47223 41 0
+  # ====later
+  # NR 951 2
+  # 47209 54 0
+  # 47224 0 0
+  node = graph.nodes[951]
+  node.short_reads.length.should eq(1)
+  node.number_of_short_reads.should eq(4)
+  node.short_reads[0].read_id.should eq(47223)
+  node.short_reads[0].offset_from_start_of_node.should eq(41)
+end
+
 it 'should return sets of arcs by id' do
   graph = Bio::Velvet::Graph.parse_from_file File.join(TEST_DATA_DIR, 'velvet_test_reads_assembly','LastGraph')
   # ARC 2 -578 1
metadata
CHANGED
@@ -1,155 +1,141 @@
 --- !ruby/object:Gem::Specification
 name: bio-velvet
 version: !ruby/object:Gem::Version
-  version: 0.0
+  version: 0.1.0
 platform: ruby
 authors:
 - Ben J Woodcroft
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2014-01-06 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bio-logger
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: 1.0
+        version: '1.0'
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: 1.0
+        version: '1.0'
 - !ruby/object:Gem::Dependency
   name: systemu
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '2.6'
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '2.6'
 - !ruby/object:Gem::Dependency
   name: files
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: '0'
+        version: '0.3'
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: '0'
+        version: '0.3'
 - !ruby/object:Gem::Dependency
   name: hopcsv
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: 0.4
+        version: '0.4'
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: 0.4
+        version: '0.4'
 - !ruby/object:Gem::Dependency
   name: rspec
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: 2.8
+        version: '2.8'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: 2.8
-- !ruby/object:Gem::Dependency
-  name: rdoc
-  requirement: !ruby/object:Gem::Requirement
-    requirements:
-    - - '>='
-      - !ruby/object:Gem::Version
-        version: '3.12'
-  type: :development
-  prerelease: false
-  version_requirements: !ruby/object:Gem::Requirement
-    requirements:
-    - - '>='
-      - !ruby/object:Gem::Version
-        version: '3.12'
+        version: '2.8'
 - !ruby/object:Gem::Dependency
   name: jeweler
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - "~>"
       - !ruby/object:Gem::Version
-        version:
+        version: '2.0'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - "~>"
       - !ruby/object:Gem::Version
-        version:
+        version: '2.0'
 - !ruby/object:Gem::Dependency
   name: bundler
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: 1.0
+        version: '1.0'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: 1.0
+        version: '1.0'
 - !ruby/object:Gem::Dependency
   name: bio
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: 1.4
+        version: '1.4'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: 1.4
+        version: '1.4'
 - !ruby/object:Gem::Dependency
   name: rdoc
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '4.1'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '4.1'
 description: Parser to work with some file formats used in the velvet DNA assembler
 email: donttrustben@gmail.com
 executables: []
@@ -158,9 +144,9 @@ extra_rdoc_files:
 - LICENSE.txt
 - README.md
 files:
-- .document
-- .rspec
-- .travis.yml
+- ".document"
+- ".rspec"
+- ".travis.yml"
 - Gemfile
 - LICENSE.txt
 - README.md
@@ -194,17 +180,17 @@ require_paths:
 - lib
 required_ruby_version: !ruby/object:Gem::Requirement
   requirements:
-  - -
+  - - ">="
     - !ruby/object:Gem::Version
       version: '0'
 required_rubygems_version: !ruby/object:Gem::Requirement
   requirements:
-  - -
+  - - ">="
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 2.0
+rubygems_version: 2.2.0
 signing_key:
 specification_version: 4
 summary: Parser to work with file formats used in the velvet DNA assembler