NetAnalyzer 0.1.2 → 0.6.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
- SHA1:
- metadata.gz: 412854b58d2f6a2dbb1844cb943a90c9b96f7fd6
- data.tar.gz: df275072cb0d1bb9b9d15099b089897471566eb8
+ SHA256:
+ metadata.gz: 53e3a09e27675b6e10398c8c869e31314c8afccb440b5f7d3cf2f84bec554d24
+ data.tar.gz: 17b9a25ca6e45512f049097dad67f3f8a12ab4cf12b2f5706b2777ec301f436f
  SHA512:
- metadata.gz: a6d3be799d9def07f7addef75dfad5a8d1c6b9d14c684d3dcb227b6be68ef22912ab069beb2e439d07a08b7dc2a6d7d0f38e873a1303a099c3c110c9cf6b075b
- data.tar.gz: 561ad012584a397023897047b0e401a96246d39a67082f0c7cfd3ee6dc3cb8075d77d08bb75ae34a8174cbc13e5a8439f0348175f38133162f54ef9e6b55c07c
+ metadata.gz: 58d378216bdd2aaa7b374b43ce441500d0b08b2cf30e54a88cab9fec39c4ccad9dfe77554b7ef670d4a2874de9142c0046b5b2aa89be803c4a037d541816abf4
+ data.tar.gz: 5992bbed01102a8e59da389f872f24087e8d0f6f31aefe53518c49ba88e75730493fea725485686ab0c3b1c846c7d9afd2c4091163e97921670e5a6a8ddfac22
data/.rspec CHANGED
@@ -1,2 +1,3 @@
  --format documentation
  --color
+ --require spec_helper
data/Gemfile CHANGED
@@ -2,3 +2,7 @@ source 'https://rubygems.org'
 
  # Specify your gem's dependencies in NetAnalyzer.gemspec
  gemspec
+ semtools_dev_path = File.expand_path('~/dev_gems/semtools')
+ gem "semtools", github: "seoanezonjic/semtools", branch: "master" if Dir.exist?(semtools_dev_path)
+ expcalc_dev_path = File.expand_path('~/dev_gems/expcalc')
+ gem "expcalc", github: "seoanezonjic/expcalc", branch: "master" if Dir.exist?(expcalc_dev_path)
data/NetAnalyzer.gemspec CHANGED
@@ -7,7 +7,7 @@ Gem::Specification.new do |spec|
  spec.name = "NetAnalyzer"
  spec.version = NetAnalyzer::VERSION
  spec.authors = ["Elena Rojano, Pedro Seoane"]
- spec.email = ["elenarojano@uma.es, seoanezonjic@uma.es"]
+ spec.email = ["elenarojano@uma.es, seoanezonjic@hotmail.com"]
 
  spec.summary = %q{Network analysis tool that calculate and validate different association indices.}
  spec.description = %q{NetAnalyzer is a useful network analysis tool developed in Ruby that can 1) analyse any type of unweighted network, regardless of the number of layers, 2) calculate the relationship between different layers, using various association indices (Jaccard, Simpson, PCC, geometric, cosine and hypergeometric) and 3) validate the results}
@@ -19,9 +19,19 @@ Gem::Specification.new do |spec|
  spec.executables = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
  spec.require_paths = ["lib"]
 
- spec.add_development_dependency "bundler", "~> 1.11"
- spec.add_development_dependency "rake", "~> 10.0"
- spec.add_development_dependency "rspec", "~> 3.0"
- spec.add_dependency "nmatrix"
- spec.add_dependency "bigdecimal"
+ spec.add_development_dependency "rake", ">= 13.0.3"
+ spec.add_development_dependency "rspec"
+ spec.add_development_dependency "minitest"
+ spec.add_dependency "cmath", ">= 1.0.0"
+ spec.add_dependency "numo-linalg", ">= 0.1.5"
+ spec.add_dependency "numo-narray", ">= 0.9.1.9"
+ spec.add_dependency "pp", ">= 0.1.0"
+ spec.add_dependency "npy", ">= 0.2.0"
+ spec.add_dependency "bigdecimal", ">= 3.0.0"
+ spec.add_dependency "gv", ">= 0.1.0"
+ spec.add_dependency "semtools", ">= 0.1.1"
+ spec.add_dependency "expcalc"
+ spec.add_dependency "parallel"
+ spec.add_dependency "rubystats"
+ spec.add_dependency "red-colors"
  end
data/README.md CHANGED
@@ -1,28 +1,66 @@
  # NetAnalyzer
 
- Welcome to your new gem! In this directory, you'll find the files you need to be able to package up your Ruby library into a gem. Put your Ruby code in the file `lib/NetAnalyzer`. To experiment with that code, run `bin/console` for an interactive prompt.
+ NetAnalyzer is a network analysis tool that can be used to calculate the associations between nodes in unweighted n-partite networks [1]. The calculation of the association between nodes is based on similarity indices (Jaccard, Simpson, geometric and cosine), statistic-based (Pearson correlation coefficient, CSI and hypergeometric) [2] and a special metric designed only for tripartite networks (here called as 'transference' method [3]). The user can choose the association index method according to the network to analyse. The tool gives a table of results, with all the associations between nodes and the association value calculated.
 
- TODO: Delete this and the text above, and describe your gem
+ If you use this tool, please cite us: [1] E. Rojano, P. Seoane, A. Bueno, J. R. Perkins & J. A. G. Ranea. Revealing the Relationship Between Human Genome Regions and Pathological Phenotypes Through Network Analysis. Lecture Notes in Computer Science, Vol 10208, 197-207 (2017).
+
+ [2] Fuxman-Bass et al. Using networks to measure similarity between genes: association index selection. Nature Methods, 10(12):1169-76. 2013.
+
+ [3] Alaimo et al. ncPred: ncRNA-Disease Association Prediction through Tripartite Network-Based Inference. Frontiers in Bioengineering and Biotechnology, 2:71, 2014.
 
  ## Installation
 
- Add this line to your application's Gemfile:
+ Linux & MacOS:
+
+ Please, check before your Ruby compiler (it has to be clang to install nmatrix)
 
- ```ruby
- gem 'NetAnalyzer'
+ ```
+ ruby -rrbconfig -e 'puts RbConfig::MAKEFILE_CONFIG["CC"]'
  ```
 
- And then execute:
+ If not, install RVM (https://rvm.io/), and then:
 
- $ bundle
+ ```
+ rvm reinstall 2.4.1 --with-gcc=clang --with-cxx=clang++
+ git clone https://github.com/SciRuby/nmatrix.git
+ cd nmatrix
+ gem install bundler
+ bundle install
+ bundle exec rake compile
+ bundle exec rake spec
+ ```
 
- Or install it yourself as:
+ Once nmatrix gem is installed:
 
- $ gem install NetAnalyzer
+ ````ruby
+ gem install 'NetAnalyzer'
+ ```
 
  ## Usage
 
- TODO: Write usage instructions here
+ The program NetAnalyzer.rb can analyse an unweighted network to calculate the association index between different nodes.
+
+ An example of use can be the following:
+
+ $ NetAnalyzer.rb NetAnalyzer.rb -i network.txt -l 'hpo,HP:;patients,[0-9]' -m hypergeometric -u 'hpo;patients' -a 'associations_file.txt'
+
+ Where:
+
+ ```
+ -i: Input file with the network to analyse. It must have two columns (separated by default by tabs) that represents the nodes that are related (NodeA\tNodeB). Please if you have doubts about the format, check the example providen.
+ -l: Layers construction. Please consider that, depending on the n-partite network you provide, NetAnalyzer will transform it into a bipartite one to perform the analysis (excepting if the association method used is 'transference'). The layers must contain a identifier of the node, and a character or pattern to identify. In this example, the bipartite network has HPO terms (with 'HP:' string in each of them) and patients that have these HPO terms (they are given as numerical patient IDs). Both layers must be separated by ';'.
+ -m: Association method. There are 8 different association methods to choose: 'jaccard', 'cosine', 'pcc', 'csi', 'hypergeometric', 'simpson', 'geometric' and 'transference'.
+ -u: Set which layer will be the one that establish connections between nodes in the other layer. In this case, we will get with patient is associated to other patient because the HPO they share.
+ -a: Associations output file name. Here you can find the associations between nodes in the network and the calculated association value, according to the chosen method.
+ ```
+
+ Optional flags:
+
+ ```
+ -s: Split character. Change if the layers of the network are not separated by tabs.
+ -o: Output file name.
+
+ ```
 
  ## Development
 
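Reviewer note: the README text added above describes layer assignment by pattern (`-l 'hpo,HP:;patients,[0-9]'`) and a Jaccard association between nodes that share neighbours in the other layer. The sketch below illustrates that idea on a toy edge list. It is not the gem's implementation: `layer_for`, `jaccard`, and the sample pairs are hypothetical names and data chosen for illustration, and only the layer patterns are taken from the README example.

```ruby
require 'set'

# Hypothetical helper: return the first layer whose pattern matches the node id.
# Order matters here, because 'HP:0001' also matches /[0-9]/.
def layer_for(node, layers)
  layers.each { |name, pattern| return name if node =~ pattern }
  nil
end

# Jaccard index of two neighbour sets: |A & B| / |A | B|.
def jaccard(set_a, set_b)
  union = (set_a | set_b).size
  union.zero? ? 0.0 : (set_a & set_b).size.to_f / union
end

# Layer definitions as in the README example: name plus identifying pattern.
layers = [[:hpo, /HP:/], [:patients, /[0-9]/]]

# Toy two-column network (NodeA, NodeB); invented data, not from the gem.
pairs = [
  ['HP:0001', '101'],
  ['HP:0002', '101'],
  ['HP:0001', '102'],
  ['HP:0003', '102']
]

# For every patient, collect the HPO terms it is connected to.
neighbours = Hash.new { |hash, key| hash[key] = Set.new }
pairs.each do |a, b|
  hpo, patient = layer_for(a, layers) == :hpo ? [a, b] : [b, a]
  neighbours[patient] << hpo
end

# Patients 101 and 102 share one HPO term out of three distinct terms.
puts jaccard(neighbours['101'], neighbours['102'])
```

Listing the `hpo` layer first matters in this sketch because its identifiers also contain digits; the removed `set_layer` method in `bin/NetAnalyzer.rb` further down this diff likewise stops at the first matching pattern.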
data/Rakefile CHANGED
@@ -1,6 +1,17 @@
  require "bundler/gem_tasks"
- require "rspec/core/rake_task"
+ require "rake/testtask"
+ require 'rdoc/task'
 
- RSpec::Core::RakeTask.new(:spec)
+ Rake::TestTask.new(:test) do |t|
+ t.libs << "test"
+ t.libs << "lib"
+ t.test_files = FileList["test/**/*_test.rb"]
+ end
 
- task :default => :spec
+ RDoc::Task.new do |rdoc|
+ rdoc.main = "README.doc"
+ rdoc.rdoc_files.include("README.md", "lib/*.rb", "lib/NetAnalyzer/*.rb")
+ rdoc.options << "--all"
+ end
+
+ task :default => :test
data/bin/NetAnalyzer.rb CHANGED
@@ -1,28 +1,20 @@
  #! /usr/bin/env ruby
 
  ROOT_PATH = File.dirname(__FILE__)
- $: << File.expand_path(File.join(ROOT_PATH, '..', 'lib', 'NetAnalyzer'))
-
- require 'network'
+ $LOAD_PATH.unshift(File.expand_path(File.join(ROOT_PATH, '..', 'lib')))
  require 'optparse'
+ require 'benchmark'
+ require 'NetAnalyzer'
 
- ##############################
- # MAIN METHODS
- ##############################
-
- def set_layer(layer_definitions, node_name)
- layer = nil
- if layer_definitions.length > 1
- layer_definitions.each do |layer_name, regexp|
- if node_name =~ regexp
- layer = layer_name
- break
- end
- end
- else
- layer = layer_definitions.first.first
- end
- return layer
+ ######################################
+ ## METHODS
+ ######################################
+ def load_file(path)
+ data = []
+ File.open(path).each do |line|
+ data << line.chomp.split("\t")
+ end
+ return data
  end
 
  ##############################
@@ -38,11 +30,26 @@ OptionParser.new do |opts|
  options[:input_file] = input_file
  end
 
+ options[:node_file] = nil
+ opts.on("-n", "--node_names_file PATH", "File with node names corresponding to the input matrix, only use when -i is set to bin or matrix.") do |node_file|
+ options[:node_file] = node_file
+ end
+
+ options[:input_format] = 'pair'
+ opts.on("-f", "--input_format STRING", "Input file format: pair (default), bin, matrix") do |input_format|
+ options[:input_format] = input_format
+ end
+
  options[:split_char] = "\t"
  opts.on("-s", "--split_char STRING", "Character for splitting input file. Default: tab") do |split_char|
  options[:split_char] = split_char
  end
 
+ options[:use_pairs] = :conn
+ opts.on("-P", "--use_pairs STRING", "Which pairs must be computed. 'all' means all posible pair node combinations and 'conn' means the pair are truly connected in the network. Default 'conn' ") do |use_pairs|
+ options[:use_pairs] = use_pairs.to_sym
+ end
+
  options[:output_file] = "network2plot"
  opts.on("-o", "--output_file PATH", "Output file name") do |output_file|
  options[:output_file] = output_file
@@ -53,6 +60,11 @@ OptionParser.new do |opts|
  options[:assoc_file] = output_file
  end
 
+ options[:kernel_file] = "kernel_values"
+ opts.on("-K", "--kernel_file PATH", "Output file name for kernel values") do |output_file|
+ options[:kernel_file] = output_file
+ end
+
  options[:performance_file] = "perf_values.txt"
  opts.on("-p", "--performance_file PATH", "Output file name for performance values") do |output_file|
  options[:performance_file] = output_file
@@ -60,8 +72,8 @@ OptionParser.new do |opts|
 
  options[:layers] = [:layer, '-']
  opts.on("-l", "--layers STRING", "Layer definition on network: layer1name,regexp1;layer2name,regexp2...") do |layers|
- layers_definition = layers.split(";").map{|layer_attr| layer_attr.split(',')}
- layers_definition.map!{|layer_attr| [layer_attr.first.to_sym, /#{layer_attr.last}/]}
+ layers_definition = layers.split(";").map{|layer_attr| layer_attr.split(',')}
+ layers_definition.map!{|layer_attr| [layer_attr.first.to_sym, /#{layer_attr.last}/]}
  options[:layers] = layers_definition
  end
 
@@ -80,35 +92,129 @@ OptionParser.new do |opts|
  options[:output_style] = output_style
  end
 
+ options[:ontologies] = []
+ opts.on("-O", "--ontology STRING", "String that define which ontologies must be used with each layer. String definition:'layer_name1:path_to_obo_file1;layer_name2:path_to_obo_file2'") do |ontologies|
+ options[:ontologies] = ontologies.split(';').map{|pair| pair.split(':')}
+ end
+
  options[:meth] = nil
  opts.on("-m", "--association_method STRING", "Association method to use on network") do |meth|
  options[:meth] = meth.to_sym
  end
 
- options[:no_autorelations] = FALSE
+ options[:kernel] = nil
+ opts.on("-k", "--kernel_method STRING", "Kernel operation to perform with the adjacency matrix") do |kernel|
+ options[:kernel] = kernel
+ end
+
+ options[:no_autorelations] = false
  opts.on("-N", "--no_autorelations", "Remove association values between nodes os same type") do
- options[:no_autorelations] = TRUE
+ options[:no_autorelations] = true
+ end
+
+ options[:normalize_kernel] = false
+ opts.on("-z", "--normalize_kernel_values", "Apply cosine normalization to the obtained kernel") do
+ options[:normalize_kernel] = true
+ end
+
+ options[:graph_file] = nil
+ opts.on("-g", "--graph_file PATH", "Build a graphic representation of the network") do |item|
+ options[:graph_file] = item
+ end
+
+ options[:graph_options] = {method: 'el_grapho', layout: 'forcedir', steps: '30'}
+ opts.on("--graph_options STRING", "Set graph parameters as 'NAME1=value1,NAME2=value2,...") do |item|
+ options[:graph_options] = {}
+ item.split(',').each do |pair|
+ fields = pair.split('=')
+ options[:graph_options][fields.first.to_sym] = fields.last
+ end
+ end
+
+ options[:threads] = 0
+ opts.on( '-T', '--threads INTEGER', 'Number of threads to use in computation, one thread will be reserved as manager.' ) do |opt|
+ options[:threads] = opt.to_i - 1
  end
 
+ options[:reference_nodes] = []
+ opts.on("-r", "--reference_nodes STRING", "Node ids comma separared") do |item|
+ options[:reference_nodes] = item.split(',')
+ end
+
+ options[:group_nodes] = {}
+ opts.on("-G", "--group_nodes STRING", "File path or groups separated by ';' and group node ids comma separared") do |item|
+ if File.exists?(item)
+ File.open(item).each do |line|
+ groupID, nodeID = line.chomp.split("\t")
+ query = options[:group_nodes][groupID]
+ query.nil? ? options[:group_nodes][groupID] = [nodeID] : query << nodeID
+ end
+ else
+ item.split(';').each_with_index do |group, i|
+ options[:group_nodes][i] = group.split(',')
+ end
+ end
+ end
+
+ options[:group_metrics] = false
+ opts.on("-M", "--group_metrics", "Perform group group_metrics") do
+ options[:group_metrics] = true
+ end
+
+ options[:expand_clusters] = nil
+ opts.on("-x", "--expand_clusters STRING", "Method to expand clusters Available methods: sht_path") do |item|
+ options[:expand_clusters] = item
+ end
+
+ options[:get_attributes] = []
+ opts.on("-A", "--attributes STRING", "String separadted by commas with the name of network attribute") do |item|
+ options[:get_attributes] = item.split(',')
+ end
+
+ options[:delete_nodes] = []
+ opts.on("-d", "--delete PATH", "Remove nodes from file. If PATH;r then nodes not included in file are removed") do |item|
+ options[:delete_nodes] = item.split(';')
+ end
  end.parse!
 
  ##########################
  #MAIN
  ##########################
-
  fullNet = Network.new(options[:layers].map{|layer| layer.first})
+ fullNet.reference_nodes = options[:reference_nodes]
+ fullNet.threads = options[:threads]
+ fullNet.group_nodes = options[:group_nodes]
+ fullNet.set_compute_pairs(options[:use_pairs], !options[:no_autorelations])
+ #puts options[:layers].map{|layer| layer.first}.inspect
  puts "Loading network data"
- File.open(options[:input_file]).each("\n") do |line|
- line.chomp!
- pair = line.split(options[:splitChar])
- node1 = pair[0]
- node2 = pair[1]
- fullNet.add_node(node1, set_layer(options[:layers], node1))
- fullNet.add_node(node2, set_layer(options[:layers], node2))
- fullNet.add_edge(node1, node2)
+ if options[:input_format] == 'pair'
+ fullNet.load_network_by_pairs(options[:input_file], options[:layers], options[:split_char])
+ elsif options[:input_format] == 'bin'
+ fullNet.load_network_by_bin_matrix(options[:input_file], options[:node_file], options[:layers])
+ elsif options[:input_format] == 'matrix'
+ fullNet.load_network_by_plain_matrix(options[:input_file], options[:node_file], options[:layers], options[:splitChar])
+ else
+ raise("ERROR: The format #{options[:input_format]} is not defined")
  end
- #fullNet.plot(options[:output_file], options[:output_style])
 
+ if !options[:delete_nodes].empty?
+ node_list = load_file(options[:delete_nodes].first).flatten
+ options[:delete_nodes].length > 1 ? mode = options[:delete_nodes][1] : 'd'
+ fullNet.delete_nodes(node_list, mode)
+ end
+
+ options[:ontologies].each do |layer_name, ontology_file_path|
+ fullNet.link_ontology(ontology_file_path, layer_name.to_sym)
+ end
+
+ if !options[:get_attributes].empty?
+ node_attributes = fullNet.get_node_attributes(options[:get_attributes])
+ File.open(File.join(File.dirname(options[:output_file]), 'node_attributes.txt'), 'w' ) do |f|
+ node_attributes.each do |attributes|
+ f.puts(attributes.join("\t"))
+ end
+ end
+ end
 
  if !options[:meth].nil?
  puts "Performing association method #{options[:meth]} on network"
@@ -121,35 +227,57 @@ if !options[:meth].nil?
  :transference)
  else
  fullNet.get_association_values(
- options[:use_layers][0],
+ options[:use_layers][0],
  options[:use_layers][1].first,
  options[:meth])
  end
- puts 'Clean autorelations' if options[:no_autorelations]
- fullNet.clean_autorelations_on_association_values if options[:no_autorelations]
  File.open(options[:assoc_file], 'w') do |f|
  fullNet.association_values[options[:meth]].each do |val|
  f.puts val.join("\t")
  end
  end
+ if !options[:control_file].nil?
+ puts "Doing validation on association values obtained from method #{options[:meth]}"
+ control = []
+ File.open(options[:control_file]).each("\n") do |line|
+ line.chomp!
+ control << line.split("\t")
+ end
+ fullNet.load_control(control)
+ performance = fullNet.get_pred_rec(options[:meth])
+ File.open(options[:performance_file], 'w') do |f|
+ f.puts %w[cut prec rec meth].join("\t")
+ performance.each do |item|
+ item << options[:meth].to_s
+ f.puts item.join("\t")
+ end
+ end
+ end
+ puts "End of analysis: #{options[:meth]}"
  end
 
- if !options[:meth].nil? && !options[:control_file].nil?
- puts "Doing validation on association values obtained from method #{options[:meth]}"
- control = []
- File.open(options[:control_file]).each("\n") do |line|
- line.chomp!
- control << line.split("\t")
- end
- fullNet.load_control(control)
- performance = fullNet.get_pred_rec(options[:meth])
- File.open(options[:performance_file], 'w') do |f|
- f.puts %w[cut prec rec meth].join("\t")
- performance.each do |item|
- item << options[:meth].to_s
- f.puts item.join("\t")
- end
- end
+ if !options[:kernel].nil?
+ layer2kernel = options[:use_layers].first # we use only a layer to perform the kernel, so only one item it is selected.
+ fullNet.get_kernel(layer2kernel, options[:kernel], options[:normalize_kernel])
+ fullNet.write_kernel(layer2kernel, options[:kernel_file])
+ end
+
+ if !options[:graph_file].nil?
+ options[:graph_options][:output_file] = options[:graph_file]
+ fullNet.plot_network(options[:graph_options])
  end
 
- puts "End of analysis: #{options[:meth]}"
+ if options[:group_metrics]
+ fullNet.compute_group_metrics(File.join(File.dirname(options[:output_file]), 'group_metrics.txt'))
+ end
+
+ if !options[:expand_clusters].nil?
+ expanded_clusters = fullNet.expand_clusters(options[:expand_clusters])
+ File.open(File.join(File.dirname(options[:output_file]), 'expand_clusters.txt'), 'w' ) do |f|
+ expanded_clusters.each do |cl_id, nodes|
+ nodes.each do |node|
+ f.puts "#{cl_id}\t#{node}"
+ end
+ end
+ end
+ end
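
Reviewer note: two details in the new option handling of `bin/NetAnalyzer.rb` may be worth a second look. In the `-d` block, `mode` is assigned only in the true branch of the ternary, so with a bare PATH the value passed to `delete_nodes` is `nil` rather than `'d'`. Separately, the `-G` block calls `File.exists?`, an alias that was removed in Ruby 3.2. The sketch below shows one possible shape for both pieces of parsing. `parse_delete_option` and `parse_group_nodes` are hypothetical helper names that do not exist in the gem, and how `Network#delete_nodes` treats a `nil` mode is not visible in this diff.

```ruby
# Hypothetical helper mirroring the -d handling: accepts "PATH" or "PATH;r".
# Returns the path and the mode, falling back to 'd' when no mode is given.
def parse_delete_option(item)
  path, mode = item.split(';')
  [path, mode || 'd']
end

# Hypothetical helper mirroring the -G handling: the argument is either a
# two-column, tab-separated file (group id, node id) or an inline string such
# as "A,B;C,D", where ';' separates groups and ',' separates node ids.
def parse_group_nodes(item)
  groups = {}
  if File.exist?(item) # File.exist? is available in all current Ruby versions
    File.foreach(item) do |line|
      group_id, node_id = line.chomp.split("\t")
      (groups[group_id] ||= []) << node_id
    end
  else
    item.split(';').each_with_index do |group, i|
      groups[i] = group.split(',')
    end
  end
  groups
end

p parse_delete_option('nodes.txt')
p parse_delete_option('nodes.txt;r')
p parse_group_nodes('A,B;C,D')
```

Returning the default from the helper keeps the fallback in one place instead of relying on a side effect inside the ternary.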