NetAnalyzer 0.1.5 → 0.6.2
- checksums.yaml +5 -5
- data/.rspec +1 -0
- data/Gemfile +4 -0
- data/NetAnalyzer.gemspec +15 -5
- data/README.md +14 -2
- data/Rakefile +13 -9
- data/bin/NetAnalyzer.rb +183 -30
- data/bin/text2binary_matrix.rb +294 -0
- data/lib/NetAnalyzer/network.rb +651 -87
- data/lib/NetAnalyzer/templates/ElGrapho.min.js +28 -0
- data/lib/NetAnalyzer/templates/cytoscape.erb +65 -0
- data/lib/NetAnalyzer/templates/cytoscape.min.js +32 -0
- data/lib/NetAnalyzer/templates/el_grapho.erb +89 -0
- data/lib/NetAnalyzer/templates/pako.min.js +1 -0
- data/lib/NetAnalyzer/templates/sigma.erb +132 -0
- data/lib/NetAnalyzer/version.rb +1 -1
- data/lib/NetAnalyzer.rb +2 -0
- metadata +171 -24
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
-
-  metadata.gz:
-  data.tar.gz:
+SHA256:
+  metadata.gz: 53e3a09e27675b6e10398c8c869e31314c8afccb440b5f7d3cf2f84bec554d24
+  data.tar.gz: 17b9a25ca6e45512f049097dad67f3f8a12ab4cf12b2f5706b2777ec301f436f
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 58d378216bdd2aaa7b374b43ce441500d0b08b2cf30e54a88cab9fec39c4ccad9dfe77554b7ef670d4a2874de9142c0046b5b2aa89be803c4a037d541816abf4
+  data.tar.gz: 5992bbed01102a8e59da389f872f24087e8d0f6f31aefe53518c49ba88e75730493fea725485686ab0c3b1c846c7d9afd2c4091163e97921670e5a6a8ddfac22
data/.rspec
CHANGED
data/Gemfile
CHANGED
@@ -2,3 +2,7 @@ source 'https://rubygems.org'
 
 # Specify your gem's dependencies in NetAnalyzer.gemspec
 gemspec
+semtools_dev_path = File.expand_path('~/dev_gems/semtools')
+gem "semtools", github: "seoanezonjic/semtools", branch: "master" if Dir.exist?(semtools_dev_path)
+expcalc_dev_path = File.expand_path('~/dev_gems/expcalc')
+gem "expcalc", github: "seoanezonjic/expcalc", branch: "master" if Dir.exist?(expcalc_dev_path)
data/NetAnalyzer.gemspec
CHANGED
@@ -19,9 +19,19 @@ Gem::Specification.new do |spec|
   spec.executables = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
   spec.require_paths = ["lib"]
 
-  spec.add_development_dependency "
-  spec.add_development_dependency "
-  spec.add_development_dependency "
-  spec.add_dependency "
-  spec.add_dependency "
+  spec.add_development_dependency "rake", ">= 13.0.3"
+  spec.add_development_dependency "rspec"
+  spec.add_development_dependency "minitest"
+  spec.add_dependency "cmath", ">= 1.0.0"
+  spec.add_dependency "numo-linalg", ">= 0.1.5"
+  spec.add_dependency "numo-narray", ">= 0.9.1.9"
+  spec.add_dependency "pp", ">= 0.1.0"
+  spec.add_dependency "npy", ">= 0.2.0"
+  spec.add_dependency "bigdecimal", ">= 3.0.0"
+  spec.add_dependency "gv", ">= 0.1.0"
+  spec.add_dependency "semtools", ">= 0.1.1"
+  spec.add_dependency "expcalc"
+  spec.add_dependency "parallel"
+  spec.add_dependency "rubystats"
+  spec.add_dependency "red-colors"
 end
data/README.md
CHANGED
@@ -2,7 +2,11 @@
 
 NetAnalyzer is a network analysis tool that can be used to calculate the associations between nodes in unweighted n-partite networks [1]. The calculation of the association between nodes is based on similarity indices (Jaccard, Simpson, geometric and cosine), statistic-based (Pearson correlation coefficient, CSI and hypergeometric) [2] and a special metric designed only for tripartite networks (here called as 'transference' method [3]). The user can choose the association index method according to the network to analyse. The tool gives a table of results, with all the associations between nodes and the association value calculated.
 
-If you use this tool, please cite us: E. Rojano, P. Seoane, A. Bueno, J. R. Perkins & J. A. G. Ranea. Revealing the Relationship Between Human Genome Regions and Pathological Phenotypes Through Network Analysis. Lecture Notes in Computer Science, Vol 10208, 197-207 (2017).
+If you use this tool, please cite us: [1] E. Rojano, P. Seoane, A. Bueno, J. R. Perkins & J. A. G. Ranea. Revealing the Relationship Between Human Genome Regions and Pathological Phenotypes Through Network Analysis. Lecture Notes in Computer Science, Vol 10208, 197-207 (2017).
+
+[2] Fuxman-Bass et al. Using networks to measure similarity between genes: association index selection. Nature Methods, 10(12):1169-76. 2013.
+
+[3] Alaimo et al. ncPred: ncRNA-Disease Association Prediction through Tripartite Network-Based Inference. Frontiers in Bioengineering and Biotechnology, 2:71, 2014.
 
 ## Installation
 
@@ -32,7 +36,6 @@ Once nmatrix gem is installed:
 gem install 'NetAnalyzer'
 ```
 
-
 ## Usage
 
 The program NetAnalyzer.rb can analyse an unweighted network to calculate the association index between different nodes.
@@ -47,9 +50,18 @@ Where:
 -i: Input file with the network to analyse. It must have two columns (separated by default by tabs) that represents the nodes that are related (NodeA\tNodeB). Please if you have doubts about the format, check the example providen.
 -l: Layers construction. Please consider that, depending on the n-partite network you provide, NetAnalyzer will transform it into a bipartite one to perform the analysis (excepting if the association method used is 'transference'). The layers must contain a identifier of the node, and a character or pattern to identify. In this example, the bipartite network has HPO terms (with 'HP:' string in each of them) and patients that have these HPO terms (they are given as numerical patient IDs). Both layers must be separated by ';'.
 -m: Association method. There are 8 different association methods to choose: 'jaccard', 'cosine', 'pcc', 'csi', 'hypergeometric', 'simpson', 'geometric' and 'transference'.
+-u: Set which layer will be the one that establish connections between nodes in the other layer. In this case, we will get with patient is associated to other patient because the HPO they share.
 -a: Associations output file name. Here you can find the associations between nodes in the network and the calculated association value, according to the chosen method.
 ```
 
+Optional flags:
+
+```
+-s: Split character. Change if the layers of the network are not separated by tabs.
+-o: Output file name.
+
+```
+
 ## Development
 
 After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
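The README text in this diff lists Jaccard among the similarity indices. As a rough illustration only, and not NetAnalyzer's own implementation (that code lives in `lib/NetAnalyzer/network.rb`, whose contents are not shown here), the Jaccard index between two nodes of one layer can be computed from the sets of nodes they connect to in the other layer. The helper name `jaccard_index` and the patient/HPO sample data are assumptions made for this sketch.

```ruby
require 'set'

# Hypothetical helper: Jaccard index between two neighbour sets,
# i.e. |A ∩ B| / |A ∪ B|. Returns 0.0 when both sets are empty.
def jaccard_index(neighbours_a, neighbours_b)
  a = neighbours_a.to_set
  b = neighbours_b.to_set
  union = a | b
  return 0.0 if union.empty?
  (a & b).size.to_f / union.size
end

# Made-up bipartite data: patients linked to the HPO terms they present.
patient2hpo = {
  'patient1' => ['HP:0000001', 'HP:0000002', 'HP:0000003'],
  'patient2' => ['HP:0000002', 'HP:0000003', 'HP:0000004']
}

# Two shared terms out of four distinct terms.
puts jaccard_index(patient2hpo['patient1'], patient2hpo['patient2'])
```

Here the two patients play the role of the nodes being associated, and the HPO terms act as the connecting layer described for the `-u` flag.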
data/Rakefile
CHANGED
@@ -1,13 +1,17 @@
 require "bundler/gem_tasks"
-
+require "rake/testtask"
+require 'rdoc/task'
 
-
+Rake::TestTask.new(:test) do |t|
+  t.libs << "test"
+  t.libs << "lib"
+  t.test_files = FileList["test/**/*_test.rb"]
+end
 
-
+RDoc::Task.new do |rdoc|
+  rdoc.main = "README.doc"
+  rdoc.rdoc_files.include("README.md", "lib/*.rb", "lib/NetAnalyzer/*.rb")
+  rdoc.options << "--all"
+end
 
-
-
-Rake::TestTask.new do |t|
-  t.libs << 'test'
-  t.pattern = "test/*_test.rb"
-end
+task :default => :test
data/bin/NetAnalyzer.rb
CHANGED
@@ -1,11 +1,21 @@
 #! /usr/bin/env ruby
 
 ROOT_PATH = File.dirname(__FILE__)
-
-$: << File.expand_path(File.join(ROOT_PATH, '..', 'lib', 'NetAnalyzer', 'methods'))
-
-require 'network'
+$LOAD_PATH.unshift(File.expand_path(File.join(ROOT_PATH, '..', 'lib')))
 require 'optparse'
+require 'benchmark'
+require 'NetAnalyzer'
+
+######################################
+## METHODS
+######################################
+def load_file(path)
+  data = []
+  File.open(path).each do |line|
+    data << line.chomp.split("\t")
+  end
+  return data
+end
 
 ##############################
 #OPTPARSE
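The `load_file` helper added in this hunk calls `File.open` without a block, so the file handle is not closed explicitly. A minimal alternative sketch, assuming the same tab-separated input, uses `File.foreach`, which closes the file once iteration finishes. The name `load_tabular_file` and its `sep` parameter are invented for this illustration and are not part of the gem.

```ruby
require 'tempfile'

# Illustrative variant of load_file: returns one array of fields per line.
# File.foreach without a block returns an Enumerator and closes the file
# when the enumeration completes.
def load_tabular_file(path, sep = "\t")
  File.foreach(path).map { |line| line.chomp.split(sep) }
end

# Usage with a throwaway file holding two node pairs.
Tempfile.create('nodes') do |f|
  f.write("nodeA\tnodeB\nnodeC\tnodeD\n")
  f.flush
  p load_tabular_file(f.path)
end
```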
@@ -20,11 +30,26 @@ OptionParser.new do |opts|
     options[:input_file] = input_file
   end
 
+  options[:node_file] = nil
+  opts.on("-n", "--node_names_file PATH", "File with node names corresponding to the input matrix, only use when -i is set to bin or matrix.") do |node_file|
+    options[:node_file] = node_file
+  end
+
+  options[:input_format] = 'pair'
+  opts.on("-f", "--input_format STRING", "Input file format: pair (default), bin, matrix") do |input_format|
+    options[:input_format] = input_format
+  end
+
   options[:split_char] = "\t"
   opts.on("-s", "--split_char STRING", "Character for splitting input file. Default: tab") do |split_char|
     options[:split_char] = split_char
   end
 
+  options[:use_pairs] = :conn
+  opts.on("-P", "--use_pairs STRING", "Which pairs must be computed. 'all' means all posible pair node combinations and 'conn' means the pair are truly connected in the network. Default 'conn' ") do |use_pairs|
+    options[:use_pairs] = use_pairs.to_sym
+  end
+
   options[:output_file] = "network2plot"
   opts.on("-o", "--output_file PATH", "Output file name") do |output_file|
     options[:output_file] = output_file
@@ -35,6 +60,11 @@ OptionParser.new do |opts|
     options[:assoc_file] = output_file
   end
 
+  options[:kernel_file] = "kernel_values"
+  opts.on("-K", "--kernel_file PATH", "Output file name for kernel values") do |output_file|
+    options[:kernel_file] = output_file
+  end
+
   options[:performance_file] = "perf_values.txt"
   opts.on("-p", "--performance_file PATH", "Output file name for performance values") do |output_file|
     options[:performance_file] = output_file
@@ -42,8 +72,8 @@ OptionParser.new do |opts|
 
   options[:layers] = [:layer, '-']
   opts.on("-l", "--layers STRING", "Layer definition on network: layer1name,regexp1;layer2name,regexp2...") do |layers|
-
-
+    layers_definition = layers.split(";").map{|layer_attr| layer_attr.split(',')}
+    layers_definition.map!{|layer_attr| [layer_attr.first.to_sym, /#{layer_attr.last}/]}
     options[:layers] = layers_definition
   end
 
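The two added lines turn the `-l` string into pairs of a layer name (as a symbol) and a regular expression. The following sketch wraps the same two steps in a function so the result can be inspected. The function name `parse_layers` and the sample definition `hpo,HP:;patient,[0-9]+` are assumptions chosen to match the README's HPO/patient example, not values taken from the gem.

```ruby
# Hypothetical wrapper around the parsing steps shown in the hunk above.
# Input:  "layer1name,regexp1;layer2name,regexp2"
# Output: [[:layer1name, /regexp1/], [:layer2name, /regexp2/]]
def parse_layers(definition)
  layers_definition = definition.split(';').map { |layer_attr| layer_attr.split(',') }
  layers_definition.map { |layer_attr| [layer_attr.first.to_sym, /#{layer_attr.last}/] }
end

layers = parse_layers('hpo,HP:;patient,[0-9]+')
layers.each do |name, pattern|
  puts "#{name.inspect} matches node ids with #{pattern.inspect}"
end
```

Because each definition is split on `,` and only the first and last fields are used, a pattern that itself contains a comma would be cut down to its last fragment.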
@@ -62,28 +92,129 @@ OptionParser.new do |opts|
     options[:output_style] = output_style
   end
 
+  options[:ontologies] = []
+  opts.on("-O", "--ontology STRING", "String that define which ontologies must be used with each layer. String definition:'layer_name1:path_to_obo_file1;layer_name2:path_to_obo_file2'") do |ontologies|
+    options[:ontologies] = ontologies.split(';').map{|pair| pair.split(':')}
+  end
+
   options[:meth] = nil
   opts.on("-m", "--association_method STRING", "Association method to use on network") do |meth|
     options[:meth] = meth.to_sym
   end
 
-  options[:
+  options[:kernel] = nil
+  opts.on("-k", "--kernel_method STRING", "Kernel operation to perform with the adjacency matrix") do |kernel|
+    options[:kernel] = kernel
+  end
+
+  options[:no_autorelations] = false
   opts.on("-N", "--no_autorelations", "Remove association values between nodes os same type") do
-    options[:no_autorelations] =
+    options[:no_autorelations] = true
+  end
+
+  options[:normalize_kernel] = false
+  opts.on("-z", "--normalize_kernel_values", "Apply cosine normalization to the obtained kernel") do
+    options[:normalize_kernel] = true
+  end
+
+  options[:graph_file] = nil
+  opts.on("-g", "--graph_file PATH", "Build a graphic representation of the network") do |item|
+    options[:graph_file] = item
+  end
+
+  options[:graph_options] = {method: 'el_grapho', layout: 'forcedir', steps: '30'}
+  opts.on("--graph_options STRING", "Set graph parameters as 'NAME1=value1,NAME2=value2,...") do |item|
+    options[:graph_options] = {}
+    item.split(',').each do |pair|
+      fields = pair.split('=')
+      options[:graph_options][fields.first.to_sym] = fields.last
+    end
+  end
+
+  options[:threads] = 0
+  opts.on( '-T', '--threads INTEGER', 'Number of threads to use in computation, one thread will be reserved as manager.' ) do |opt|
+    options[:threads] = opt.to_i - 1
+  end
+
+  options[:reference_nodes] = []
+  opts.on("-r", "--reference_nodes STRING", "Node ids comma separared") do |item|
+    options[:reference_nodes] = item.split(',')
   end
 
+  options[:group_nodes] = {}
+  opts.on("-G", "--group_nodes STRING", "File path or groups separated by ';' and group node ids comma separared") do |item|
+    if File.exists?(item)
+      File.open(item).each do |line|
+        groupID, nodeID = line.chomp.split("\t")
+        query = options[:group_nodes][groupID]
+        query.nil? ? options[:group_nodes][groupID] = [nodeID] : query << nodeID
+      end
+    else
+      item.split(';').each_with_index do |group, i|
+        options[:group_nodes][i] = group.split(',')
+      end
+    end
+  end
+
+  options[:group_metrics] = false
+  opts.on("-M", "--group_metrics", "Perform group group_metrics") do
+    options[:group_metrics] = true
+  end
+
+  options[:expand_clusters] = nil
+  opts.on("-x", "--expand_clusters STRING", "Method to expand clusters Available methods: sht_path") do |item|
+    options[:expand_clusters] = item
+  end
+
+  options[:get_attributes] = []
+  opts.on("-A", "--attributes STRING", "String separadted by commas with the name of network attribute") do |item|
+    options[:get_attributes] = item.split(',')
+  end
+
+  options[:delete_nodes] = []
+  opts.on("-d", "--delete PATH", "Remove nodes from file. If PATH;r then nodes not included in file are removed") do |item|
+    options[:delete_nodes] = item.split(';')
+  end
 end.parse!
 
 ##########################
 #MAIN
 ##########################
-
 fullNet = Network.new(options[:layers].map{|layer| layer.first})
+fullNet.reference_nodes = options[:reference_nodes]
+fullNet.threads = options[:threads]
+fullNet.group_nodes = options[:group_nodes]
+fullNet.set_compute_pairs(options[:use_pairs], !options[:no_autorelations])
 #puts options[:layers].map{|layer| layer.first}.inspect
 puts "Loading network data"
-
+if options[:input_format] == 'pair'
+  fullNet.load_network_by_pairs(options[:input_file], options[:layers], options[:split_char])
+elsif options[:input_format] == 'bin'
+  fullNet.load_network_by_bin_matrix(options[:input_file], options[:node_file], options[:layers])
+elsif options[:input_format] == 'matrix'
+  fullNet.load_network_by_plain_matrix(options[:input_file], options[:node_file], options[:layers], options[:splitChar])
+else
+  raise("ERROR: The format #{options[:input_format]} is not defined")
+end
+
+if !options[:delete_nodes].empty?
+  node_list = load_file(options[:delete_nodes].first).flatten
+  options[:delete_nodes].length > 1 ? mode = options[:delete_nodes][1] : 'd'
+  fullNet.delete_nodes(node_list, mode)
+end
 
-
+options[:ontologies].each do |layer_name, ontology_file_path|
+  fullNet.link_ontology(ontology_file_path, layer_name.to_sym)
+end
+
+if !options[:get_attributes].empty?
+  node_attributes = fullNet.get_node_attributes(options[:get_attributes])
+  File.open(File.join(File.dirname(options[:output_file]), 'node_attributes.txt'), 'w' ) do |f|
+    node_attributes.each do |attributes|
+      f.puts(attributes.join("\t"))
+    end
+  end
+end
 
 if !options[:meth].nil?
   puts "Performing association method #{options[:meth]} on network"
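Three details in this hunk may deserve a second look. The `matrix` branch reads `options[:splitChar]`, while the parser stores the value under `:split_char`, so that lookup yields `nil`. The ternary in the node-deletion block assigns `mode` only in its true branch, so `mode` is `nil` when no `;r` suffix is given. And `File.exists?` was removed in Ruby 3.2 in favour of `File.exist?`. Whether the first two matter depends on how `load_network_by_plain_matrix` and `delete_nodes` treat `nil`, which this diff does not show. The sketch below only demonstrates the plain-Ruby behaviour; the function names and sample values are made up.

```ruby
# Reproduces the ternary as written in the hunk: `mode` is assigned only
# when a second ';'-separated field exists, otherwise it stays nil.
def delete_mode_as_written(delete_nodes)
  delete_nodes.length > 1 ? mode = delete_nodes[1] : 'd'
  mode
end

# Possible variant (an assumption about the intent): assign the result of
# the whole ternary so that 'd' becomes the fallback.
def delete_mode_with_default(delete_nodes)
  delete_nodes.length > 1 ? delete_nodes[1] : 'd'
end

p delete_mode_as_written(['nodes.txt'])        # nil
p delete_mode_as_written(['nodes.txt', 'r'])   # "r"
p delete_mode_with_default(['nodes.txt'])      # "d"

# A hash key that was never set returns nil rather than raising.
options = { split_char: "\t" }
p options[:splitChar]    # nil
p options[:split_char]   # "\t"
```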
@@ -100,31 +231,53 @@ if !options[:meth].nil?
       options[:use_layers][1].first,
       options[:meth])
   end
-  puts 'Clean autorelations' if options[:no_autorelations]
-  fullNet.clean_autorelations_on_association_values if options[:no_autorelations]
   File.open(options[:assoc_file], 'w') do |f|
     fullNet.association_values[options[:meth]].each do |val|
       f.puts val.join("\t")
     end
   end
+  if !options[:control_file].nil?
+    puts "Doing validation on association values obtained from method #{options[:meth]}"
+    control = []
+    File.open(options[:control_file]).each("\n") do |line|
+      line.chomp!
+      control << line.split("\t")
+    end
+    fullNet.load_control(control)
+    performance = fullNet.get_pred_rec(options[:meth])
+    File.open(options[:performance_file], 'w') do |f|
+      f.puts %w[cut prec rec meth].join("\t")
+      performance.each do |item|
+        item << options[:meth].to_s
+        f.puts item.join("\t")
+      end
+    end
+  end
+  puts "End of analysis: #{options[:meth]}"
 end
 
-if !options[:
-
-
-
-
-
-
-
-
-  File.open(options[:performance_file], 'w') do |f|
-    f.puts %w[cut prec rec meth].join("\t")
-    performance.each do |item|
-      item << options[:meth].to_s
-      f.puts item.join("\t")
-    end
-  end
+if !options[:kernel].nil?
+  layer2kernel = options[:use_layers].first # we use only a layer to perform the kernel, so only one item it is selected.
+  fullNet.get_kernel(layer2kernel, options[:kernel], options[:normalize_kernel])
+  fullNet.write_kernel(layer2kernel, options[:kernel_file])
+end
+
+if !options[:graph_file].nil?
+  options[:graph_options][:output_file] = options[:graph_file]
+  fullNet.plot_network(options[:graph_options])
 end
 
-
+if options[:group_metrics]
+  fullNet.compute_group_metrics(File.join(File.dirname(options[:output_file]), 'group_metrics.txt'))
+end
+
+if !options[:expand_clusters].nil?
+  expanded_clusters = fullNet.expand_clusters(options[:expand_clusters])
+  File.open(File.join(File.dirname(options[:output_file]), 'expand_clusters.txt'), 'w' ) do |f|
+    expanded_clusters.each do |cl_id, nodes|
+      nodes.each do |node|
+        f.puts "#{cl_id}\t#{node}"
+      end
+    end
+  end
+end
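The last added block writes the expanded clusters as one tab-separated line per cluster member. A small sketch of that output layout follows, assuming `expand_clusters` returns a hash of cluster id to node list (which is what the nested loops suggest, though the method itself is not shown in this diff). The helper name `write_cluster_members` and the sample clusters are hypothetical.

```ruby
require 'stringio'

# Hypothetical helper mirroring the nested loops in the hunk above:
# one "cluster_id<TAB>node" line per member, written to any IO-like object.
def write_cluster_members(io, clusters)
  clusters.each do |cl_id, nodes|
    nodes.each { |node| io.puts "#{cl_id}\t#{node}" }
  end
end

# Made-up clusters; StringIO stands in for the expand_clusters.txt file.
clusters = { 'cluster1' => %w[nodeA nodeB], 'cluster2' => %w[nodeC] }
out = StringIO.new
write_cluster_members(out, clusters)
print out.string
```

Writing to an IO argument instead of a fixed path keeps the formatting step testable without touching the file system.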