NetAnalyzer 0.1.5 → 0.6.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
- SHA1:
3
- metadata.gz: 35cb3956bb731175602f6149bce80bec19e11d67
4
- data.tar.gz: d0b3127557d53140681054dd73f85b781b823408
2
+ SHA256:
3
+ metadata.gz: 53e3a09e27675b6e10398c8c869e31314c8afccb440b5f7d3cf2f84bec554d24
4
+ data.tar.gz: 17b9a25ca6e45512f049097dad67f3f8a12ab4cf12b2f5706b2777ec301f436f
5
5
  SHA512:
6
- metadata.gz: 7848fe8950d1e7ce9ce1a2f74e427a7ff0825cfce3994bfb69580323d5e08b46981810749a1091c8a7fd3e2b7bd0c7e95a265d59e43dbae6f73695ed0e48eb8e
7
- data.tar.gz: ff6a98cce36dad00ef8f9eb015343117d8128fd6ddc5f4ade641edc69fb87ad0e4dcb1ecb731caaa2de42bd2a19b3f4b6c23d148e972bc2ecda7a8a8cb1e72d8
6
+ metadata.gz: 58d378216bdd2aaa7b374b43ce441500d0b08b2cf30e54a88cab9fec39c4ccad9dfe77554b7ef670d4a2874de9142c0046b5b2aa89be803c4a037d541816abf4
7
+ data.tar.gz: 5992bbed01102a8e59da389f872f24087e8d0f6f31aefe53518c49ba88e75730493fea725485686ab0c3b1c846c7d9afd2c4091163e97921670e5a6a8ddfac22
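The switch from SHA1 to SHA256 checksums above can be checked locally. The following is a minimal sketch, assuming the `metadata.gz` and `data.tar.gz` entries have already been extracted from the `.gem` archive. The helper name `checksum_matches?` is hypothetical and is not part of RubyGems.

```ruby
require 'digest'

# Hypothetical helper: compare the SHA256 digest of a byte string with the
# hex digest recorded in checksums.yaml.
def checksum_matches?(bytes, expected_hex)
  Digest::SHA256.hexdigest(bytes) == expected_hex.downcase
end

# Demonstration with an in-memory string and its well-known SHA256 digest.
# For a real archive entry, pass File.binread(path) instead.
puts checksum_matches?(
  "abc",
  "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
)
```

Reading the file in binary mode matters here, because any newline translation would change the digest.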
data/.rspec CHANGED
@@ -1,2 +1,3 @@
1
1
  --format documentation
2
2
  --color
3
+ --require spec_helper
data/Gemfile CHANGED
@@ -2,3 +2,7 @@ source 'https://rubygems.org'
2
2
 
3
3
  # Specify your gem's dependencies in NetAnalyzer.gemspec
4
4
  gemspec
5
+ semtools_dev_path = File.expand_path('~/dev_gems/semtools')
6
+ gem "semtools", github: "seoanezonjic/semtools", branch: "master" if Dir.exist?(semtools_dev_path)
7
+ expcalc_dev_path = File.expand_path('~/dev_gems/expcalc')
8
+ gem "expcalc", github: "seoanezonjic/expcalc", branch: "master" if Dir.exist?(expcalc_dev_path)
data/NetAnalyzer.gemspec CHANGED
@@ -19,9 +19,19 @@ Gem::Specification.new do |spec|
19
19
  spec.executables = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
20
20
  spec.require_paths = ["lib"]
21
21
 
22
- spec.add_development_dependency "bundler", "~> 1.11"
23
- spec.add_development_dependency "rake", "~> 10.0"
24
- spec.add_development_dependency "rspec", "~> 3.0"
25
- spec.add_dependency "nmatrix"
26
- spec.add_dependency "bigdecimal"
22
+ spec.add_development_dependency "rake", ">= 13.0.3"
23
+ spec.add_development_dependency "rspec"
24
+ spec.add_development_dependency "minitest"
25
+ spec.add_dependency "cmath", ">= 1.0.0"
26
+ spec.add_dependency "numo-linalg", ">= 0.1.5"
27
+ spec.add_dependency "numo-narray", ">= 0.9.1.9"
28
+ spec.add_dependency "pp", ">= 0.1.0"
29
+ spec.add_dependency "npy", ">= 0.2.0"
30
+ spec.add_dependency "bigdecimal", ">= 3.0.0"
31
+ spec.add_dependency "gv", ">= 0.1.0"
32
+ spec.add_dependency "semtools", ">= 0.1.1"
33
+ spec.add_dependency "expcalc"
34
+ spec.add_dependency "parallel"
35
+ spec.add_dependency "rubystats"
36
+ spec.add_dependency "red-colors"
27
37
  end
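The rake constraint changes from a pessimistic bound (`~> 10.0`) to an open-ended lower bound (`>= 13.0.3`). As a rough illustration of the difference, the sketch below uses RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The version numbers tested are arbitrary examples, not versions named in the diff.

```ruby
require 'rubygems'

old_req = Gem::Requirement.new("~> 10.0")    # pessimistic: >= 10.0 and < 11
new_req = Gem::Requirement.new(">= 13.0.3")  # lower bound only, no upper limit

candidate = Gem::Version.new("13.0.6")       # arbitrary example version

puts "old constraint accepts #{candidate}: #{old_req.satisfied_by?(candidate)}"
puts "new constraint accepts #{candidate}: #{new_req.satisfied_by?(candidate)}"
```

Dependencies declared without any version string, such as `expcalc` or `parallel` in this gemspec, accept every released version.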
data/README.md CHANGED
@@ -2,7 +2,11 @@
2
2
 
3
3
  NetAnalyzer is a network analysis tool that calculates the associations between nodes in unweighted n-partite networks [1]. The association between nodes is computed with similarity indices (Jaccard, Simpson, geometric and cosine), statistics-based indices (Pearson correlation coefficient, CSI and hypergeometric) [2] and a special metric designed only for tripartite networks (here called the 'transference' method [3]). The user can choose the association index according to the network to analyse. The tool outputs a table with all the associations between nodes and their calculated association values.
4
4
 
5
- If you use this tool, please cite us: E. Rojano, P. Seoane, A. Bueno, J. R. Perkins & J. A. G. Ranea. Revealing the Relationship Between Human Genome Regions and Pathological Phenotypes Through Network Analysis. Lecture Notes in Computer Science, Vol 10208, 197-207 (2017).
5
+ If you use this tool, please cite us: [1] E. Rojano, P. Seoane, A. Bueno, J. R. Perkins & J. A. G. Ranea. Revealing the Relationship Between Human Genome Regions and Pathological Phenotypes Through Network Analysis. Lecture Notes in Computer Science, Vol 10208, 197-207 (2017).
6
+
7
+ [2] Fuxman-Bass et al. Using networks to measure similarity between genes: association index selection. Nature Methods, 10(12):1169-76. 2013.
8
+
9
+ [3] Alaimo et al. ncPred: ncRNA-Disease Association Prediction through Tripartite Network-Based Inference. Frontiers in Bioengineering and Biotechnology, 2:71, 2014.
6
10
 
7
11
  ## Installation
8
12
 
@@ -32,7 +36,6 @@ Once nmatrix gem is installed:
32
36
  gem install 'NetAnalyzer'
33
37
  ```
34
38
 
35
-
36
39
  ## Usage
37
40
 
38
41
  The program NetAnalyzer.rb can analyse an unweighted network to calculate the association index between different nodes.
@@ -47,9 +50,18 @@ Where:
47
50
  -i: Input file with the network to analyse. It must have two columns (separated by tabs by default) that represent the related nodes (NodeA\tNodeB). If you have doubts about the format, check the example provided.
48
51
  -l: Layers construction. Depending on the n-partite network you provide, NetAnalyzer will transform it into a bipartite one to perform the analysis (except when the association method is 'transference'). Each layer is defined by a name and a character or pattern that identifies its nodes. In this example, the bipartite network has HPO terms (with the 'HP:' string in each of them) and patients that have these HPO terms (given as numerical patient IDs). Layer definitions must be separated by ';'.
49
52
  -m: Association method. There are 8 association methods to choose from: 'jaccard', 'cosine', 'pcc', 'csi', 'hypergeometric', 'simpson', 'geometric' and 'transference'.
53
+ -u: Set which layer establishes the connections between nodes in the other layer. In this case, we obtain which patients are associated with each other through the HPO terms they share.
50
54
  -a: Associations output file name. Here you can find the associations between nodes in the network and the calculated association value, according to the chosen method.
51
55
  ```
52
56
 
57
+ Optional flags:
58
+
59
+ ```
60
+ -s: Split character. Change if the layers of the network are not separated by tabs.
61
+ -o: Output file name.
62
+
63
+ ```
64
+
53
65
  ## Development
54
66
 
55
67
  After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
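The `-l` layer definitions described in the README are plain `name,regexp` pairs. The sketch below mirrors how `bin/NetAnalyzer.rb` splits that string and shows how such patterns could classify node ids. The helper `layer_for` and the example definition `'hpo,HP:;patients,[0-9]'` are assumptions made for illustration; the way the `Network` class actually assigns nodes to layers is not shown in this diff.

```ruby
# Mirrors the -l parsing in bin/NetAnalyzer.rb: "name,regexp;name,regexp".
def parse_layers(definition)
  definition.split(';').map do |layer_attr|
    name, pattern = layer_attr.split(',')
    [name.to_sym, /#{pattern}/]
  end
end

# Hypothetical helper: return the name of the first layer whose pattern
# matches the node id, or nil when no layer matches.
def layer_for(node_id, layers)
  match = layers.find { |_name, regexp| node_id =~ regexp }
  match && match.first
end

layers = parse_layers('hpo,HP:;patients,[0-9]')
puts layer_for('HP:0000118', layers).inspect
puts layer_for('1234', layers).inspect
```

Because the first matching layer wins in this sketch, the order of the definitions matters whenever patterns overlap: an HPO id also contains digits, so the `hpo` layer has to come first.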
data/Rakefile CHANGED
@@ -1,13 +1,17 @@
1
1
  require "bundler/gem_tasks"
2
- #require "rspec/core/rake_task"
2
+ require "rake/testtask"
3
+ require 'rdoc/task'
3
4
 
4
- #RSpec::Core::RakeTask.new(:spec)
5
+ Rake::TestTask.new(:test) do |t|
6
+ t.libs << "test"
7
+ t.libs << "lib"
8
+ t.test_files = FileList["test/**/*_test.rb"]
9
+ end
5
10
 
6
- #task :default => :spec
11
+ RDoc::Task.new do |rdoc|
12
+ rdoc.main = "README.md"
13
+ rdoc.rdoc_files.include("README.md", "lib/*.rb", "lib/NetAnalyzer/*.rb")
14
+ rdoc.options << "--all"
15
+ end
7
16
 
8
- require 'rake/testtask'
9
-
10
- Rake::TestTask.new do |t|
11
- t.libs << 'test'
12
- t.pattern = "test/*_test.rb"
13
- end
17
+ task :default => :test
data/bin/NetAnalyzer.rb CHANGED
@@ -1,11 +1,21 @@
1
1
  #! /usr/bin/env ruby
2
2
 
3
3
  ROOT_PATH = File.dirname(__FILE__)
4
- $: << File.expand_path(File.join(ROOT_PATH, '..', 'lib', 'NetAnalyzer'))
5
- $: << File.expand_path(File.join(ROOT_PATH, '..', 'lib', 'NetAnalyzer', 'methods'))
6
-
7
- require 'network'
4
+ $LOAD_PATH.unshift(File.expand_path(File.join(ROOT_PATH, '..', 'lib')))
8
5
  require 'optparse'
6
+ require 'benchmark'
7
+ require 'NetAnalyzer'
8
+
9
+ ######################################
10
+ ## METHODS
11
+ ######################################
12
+ def load_file(path)
13
+ data = []
14
+ File.foreach(path) do |line|
15
+ data << line.chomp.split("\t")
16
+ end
17
+ return data
18
+ end
9
19
 
10
20
  ##############################
11
21
  #OPTPARSE
@@ -20,11 +30,26 @@ OptionParser.new do |opts|
20
30
  options[:input_file] = input_file
21
31
  end
22
32
 
33
+ options[:node_file] = nil
34
+ opts.on("-n", "--node_names_file PATH", "File with node names corresponding to the input matrix. Only use when -f is set to bin or matrix.") do |node_file|
35
+ options[:node_file] = node_file
36
+ end
37
+
38
+ options[:input_format] = 'pair'
39
+ opts.on("-f", "--input_format STRING", "Input file format: pair (default), bin, matrix") do |input_format|
40
+ options[:input_format] = input_format
41
+ end
42
+
23
43
  options[:split_char] = "\t"
24
44
  opts.on("-s", "--split_char STRING", "Character for splitting input file. Default: tab") do |split_char|
25
45
  options[:split_char] = split_char
26
46
  end
27
47
 
48
+ options[:use_pairs] = :conn
49
+ opts.on("-P", "--use_pairs STRING", "Which pairs must be computed. 'all' means all possible node pair combinations and 'conn' means only the pairs that are connected in the network. Default: 'conn'") do |use_pairs|
50
+ options[:use_pairs] = use_pairs.to_sym
51
+ end
52
+
28
53
  options[:output_file] = "network2plot"
29
54
  opts.on("-o", "--output_file PATH", "Output file name") do |output_file|
30
55
  options[:output_file] = output_file
@@ -35,6 +60,11 @@ OptionParser.new do |opts|
35
60
  options[:assoc_file] = output_file
36
61
  end
37
62
 
63
+ options[:kernel_file] = "kernel_values"
64
+ opts.on("-K", "--kernel_file PATH", "Output file name for kernel values") do |output_file|
65
+ options[:kernel_file] = output_file
66
+ end
67
+
38
68
  options[:performance_file] = "perf_values.txt"
39
69
  opts.on("-p", "--performance_file PATH", "Output file name for performance values") do |output_file|
40
70
  options[:performance_file] = output_file
@@ -42,8 +72,8 @@ OptionParser.new do |opts|
42
72
 
43
73
  options[:layers] = [:layer, '-']
44
74
  opts.on("-l", "--layers STRING", "Layer definition on network: layer1name,regexp1;layer2name,regexp2...") do |layers|
45
- layers_definition = layers.split(";").map{|layer_attr| layer_attr.split(',')}
46
- layers_definition.map!{|layer_attr| [layer_attr.first.to_sym, /#{layer_attr.last}/]}
75
+ layers_definition = layers.split(";").map{|layer_attr| layer_attr.split(',')}
76
+ layers_definition.map!{|layer_attr| [layer_attr.first.to_sym, /#{layer_attr.last}/]}
47
77
  options[:layers] = layers_definition
48
78
  end
49
79
 
@@ -62,28 +92,129 @@ OptionParser.new do |opts|
62
92
  options[:output_style] = output_style
63
93
  end
64
94
 
95
+ options[:ontologies] = []
96
+ opts.on("-O", "--ontology STRING", "String that define which ontologies must be used with each layer. String definition:'layer_name1:path_to_obo_file1;layer_name2:path_to_obo_file2'") do |ontologies|
97
+ options[:ontologies] = ontologies.split(';').map{|pair| pair.split(':')}
98
+ end
99
+
65
100
  options[:meth] = nil
66
101
  opts.on("-m", "--association_method STRING", "Association method to use on network") do |meth|
67
102
  options[:meth] = meth.to_sym
68
103
  end
69
104
 
70
- options[:no_autorelations] = FALSE
105
+ options[:kernel] = nil
106
+ opts.on("-k", "--kernel_method STRING", "Kernel operation to perform with the adjacency matrix") do |kernel|
107
+ options[:kernel] = kernel
108
+ end
109
+
110
+ options[:no_autorelations] = false
71
111
  opts.on("-N", "--no_autorelations", "Remove association values between nodes of the same type") do
72
- options[:no_autorelations] = TRUE
112
+ options[:no_autorelations] = true
113
+ end
114
+
115
+ options[:normalize_kernel] = false
116
+ opts.on("-z", "--normalize_kernel_values", "Apply cosine normalization to the obtained kernel") do
117
+ options[:normalize_kernel] = true
118
+ end
119
+
120
+ options[:graph_file] = nil
121
+ opts.on("-g", "--graph_file PATH", "Build a graphic representation of the network") do |item|
122
+ options[:graph_file] = item
123
+ end
124
+
125
+ options[:graph_options] = {method: 'el_grapho', layout: 'forcedir', steps: '30'}
126
+ opts.on("--graph_options STRING", "Set graph parameters as 'NAME1=value1,NAME2=value2,...'") do |item|
127
+ options[:graph_options] = {}
128
+ item.split(',').each do |pair|
129
+ fields = pair.split('=')
130
+ options[:graph_options][fields.first.to_sym] = fields.last
131
+ end
132
+ end
133
+
134
+ options[:threads] = 0
135
+ opts.on( '-T', '--threads INTEGER', 'Number of threads to use in computation, one thread will be reserved as manager.' ) do |opt|
136
+ options[:threads] = opt.to_i - 1
137
+ end
138
+
139
+ options[:reference_nodes] = []
140
+ opts.on("-r", "--reference_nodes STRING", "Comma-separated node ids") do |item|
141
+ options[:reference_nodes] = item.split(',')
73
142
  end
74
143
 
144
+ options[:group_nodes] = {}
145
+ opts.on("-G", "--group_nodes STRING", "File path, or groups separated by ';' with comma-separated node ids in each group") do |item|
146
+ if File.exist?(item)
147
+ File.open(item).each do |line|
148
+ groupID, nodeID = line.chomp.split("\t")
149
+ query = options[:group_nodes][groupID]
150
+ query.nil? ? options[:group_nodes][groupID] = [nodeID] : query << nodeID
151
+ end
152
+ else
153
+ item.split(';').each_with_index do |group, i|
154
+ options[:group_nodes][i] = group.split(',')
155
+ end
156
+ end
157
+ end
158
+
159
+ options[:group_metrics] = false
160
+ opts.on("-M", "--group_metrics", "Compute group metrics") do
161
+ options[:group_metrics] = true
162
+ end
163
+
164
+ options[:expand_clusters] = nil
165
+ opts.on("-x", "--expand_clusters STRING", "Method to expand clusters. Available methods: sht_path") do |item|
166
+ options[:expand_clusters] = item
167
+ end
168
+
169
+ options[:get_attributes] = []
170
+ opts.on("-A", "--attributes STRING", "Comma-separated string with the names of the network attributes") do |item|
171
+ options[:get_attributes] = item.split(',')
172
+ end
173
+
174
+ options[:delete_nodes] = []
175
+ opts.on("-d", "--delete PATH", "Remove nodes from file. If PATH;r then nodes not included in file are removed") do |item|
176
+ options[:delete_nodes] = item.split(';')
177
+ end
75
178
  end.parse!
76
179
 
77
180
  ##########################
78
181
  #MAIN
79
182
  ##########################
80
-
81
183
  fullNet = Network.new(options[:layers].map{|layer| layer.first})
184
+ fullNet.reference_nodes = options[:reference_nodes]
185
+ fullNet.threads = options[:threads]
186
+ fullNet.group_nodes = options[:group_nodes]
187
+ fullNet.set_compute_pairs(options[:use_pairs], !options[:no_autorelations])
82
188
  #puts options[:layers].map{|layer| layer.first}.inspect
83
189
  puts "Loading network data"
84
- fullNet.load_network_by_pairs(options[:input_file], options[:layers], options[:splitChar])
190
+ if options[:input_format] == 'pair'
191
+ fullNet.load_network_by_pairs(options[:input_file], options[:layers], options[:split_char])
192
+ elsif options[:input_format] == 'bin'
193
+ fullNet.load_network_by_bin_matrix(options[:input_file], options[:node_file], options[:layers])
194
+ elsif options[:input_format] == 'matrix'
195
+ fullNet.load_network_by_plain_matrix(options[:input_file], options[:node_file], options[:layers], options[:split_char])
196
+ else
197
+ raise("ERROR: The format #{options[:input_format]} is not defined")
198
+ end
199
+
200
+ if !options[:delete_nodes].empty?
201
+ node_list = load_file(options[:delete_nodes].first).flatten
202
+ mode = options[:delete_nodes].length > 1 ? options[:delete_nodes][1] : 'd'
203
+ fullNet.delete_nodes(node_list, mode)
204
+ end
85
205
 
86
- #fullNet.plot(options[:output_file], options[:output_style])
206
+ options[:ontologies].each do |layer_name, ontology_file_path|
207
+ fullNet.link_ontology(ontology_file_path, layer_name.to_sym)
208
+ end
209
+
210
+ if !options[:get_attributes].empty?
211
+ node_attributes = fullNet.get_node_attributes(options[:get_attributes])
212
+ File.open(File.join(File.dirname(options[:output_file]), 'node_attributes.txt'), 'w' ) do |f|
213
+ node_attributes.each do |attributes|
214
+ f.puts(attributes.join("\t"))
215
+ end
216
+ end
217
+ end
87
218
 
88
219
  if !options[:meth].nil?
89
220
  puts "Performing association method #{options[:meth]} on network"
@@ -100,31 +231,53 @@ if !options[:meth].nil?
100
231
  options[:use_layers][1].first,
101
232
  options[:meth])
102
233
  end
103
- puts 'Clean autorelations' if options[:no_autorelations]
104
- fullNet.clean_autorelations_on_association_values if options[:no_autorelations]
105
234
  File.open(options[:assoc_file], 'w') do |f|
106
235
  fullNet.association_values[options[:meth]].each do |val|
107
236
  f.puts val.join("\t")
108
237
  end
109
238
  end
239
+ if !options[:control_file].nil?
240
+ puts "Doing validation on association values obtained from method #{options[:meth]}"
241
+ control = []
242
+ File.open(options[:control_file]).each("\n") do |line|
243
+ line.chomp!
244
+ control << line.split("\t")
245
+ end
246
+ fullNet.load_control(control)
247
+ performance = fullNet.get_pred_rec(options[:meth])
248
+ File.open(options[:performance_file], 'w') do |f|
249
+ f.puts %w[cut prec rec meth].join("\t")
250
+ performance.each do |item|
251
+ item << options[:meth].to_s
252
+ f.puts item.join("\t")
253
+ end
254
+ end
255
+ end
256
+ puts "End of analysis: #{options[:meth]}"
110
257
  end
111
258
 
112
- if !options[:meth].nil? && !options[:control_file].nil?
113
- puts "Doing validation on association values obtained from method #{options[:meth]}"
114
- control = []
115
- File.open(options[:control_file]).each("\n") do |line|
116
- line.chomp!
117
- control << line.split("\t")
118
- end
119
- fullNet.load_control(control)
120
- performance = fullNet.get_pred_rec(options[:meth])
121
- File.open(options[:performance_file], 'w') do |f|
122
- f.puts %w[cut prec rec meth].join("\t")
123
- performance.each do |item|
124
- item << options[:meth].to_s
125
- f.puts item.join("\t")
126
- end
127
- end
259
+ if !options[:kernel].nil?
260
+ layer2kernel = options[:use_layers].first # the kernel is computed on a single layer, so only the first item is selected.
261
+ fullNet.get_kernel(layer2kernel, options[:kernel], options[:normalize_kernel])
262
+ fullNet.write_kernel(layer2kernel, options[:kernel_file])
263
+ end
264
+
265
+ if !options[:graph_file].nil?
266
+ options[:graph_options][:output_file] = options[:graph_file]
267
+ fullNet.plot_network(options[:graph_options])
128
268
  end
129
269
 
130
- puts "End of analysis: #{options[:meth]}"
270
+ if options[:group_metrics]
271
+ fullNet.compute_group_metrics(File.join(File.dirname(options[:output_file]), 'group_metrics.txt'))
272
+ end
273
+
274
+ if !options[:expand_clusters].nil?
275
+ expanded_clusters = fullNet.expand_clusters(options[:expand_clusters])
276
+ File.open(File.join(File.dirname(options[:output_file]), 'expand_clusters.txt'), 'w' ) do |f|
277
+ expanded_clusters.each do |cl_id, nodes|
278
+ nodes.each do |node|
279
+ f.puts "#{cl_id}\t#{node}"
280
+ end
281
+ end
282
+ end
283
+ end
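The new `-G` option accepts either a file path or an inline string. The two branches can be sketched as standalone functions, shown below under the assumption that the input is well formed. The function names `parse_inline_groups` and `parse_group_lines` are hypothetical and do not exist in NetAnalyzer; the second one takes an array of lines instead of opening a file so that the example is self-contained.

```ruby
# Mirrors the inline branch of -G: groups separated by ';', node ids by ','.
# Groups are keyed by their position in the string.
def parse_inline_groups(item)
  groups = {}
  item.split(';').each_with_index do |group, i|
    groups[i] = group.split(',')
  end
  groups
end

# Mirrors the file branch of -G: one "groupID<TAB>nodeID" pair per line.
def parse_group_lines(lines)
  groups = Hash.new { |hash, key| hash[key] = [] }
  lines.each do |line|
    group_id, node_id = line.chomp.split("\t")
    groups[group_id] << node_id
  end
  groups
end

p parse_inline_groups('A,B;C')
p parse_group_lines(["g1\tA\n", "g1\tB\n", "g2\tC\n"])
```

Note the asymmetry that the original option handler also has: inline groups get integer keys, while groups read from a file keep their string ids.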