macroape 3.3.8 → 4.0.0
- checksums.yaml +7 -0
- data/README.md +2 -2
- data/TODO.txt +2 -0
- data/lib/macroape/cli/align_motifs.rb +62 -22
- data/lib/macroape/version.rb +1 -1
- data/macroape.gemspec +2 -3
- data/test/align_motifs_test.rb +13 -0
- metadata +13 -32
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA1:
+  metadata.gz: 3037e164fd1b1c23bf40a9fca1d1da39737934a5
+  data.tar.gz: 4d4482d00ce0c76cbb47fe9d9eff53b48c1d2741
+SHA512:
+  metadata.gz: cc4176fe2b1d2f7b5bf4835b3612d316797f378dbf036bf0bba3135effe1cf9cbba1037ef87c7cffc24d0d2ef8d2ebf16fe5d585780445a1492fa795df44cb48
+  data.tar.gz: d86983eeb148235e1470dbfa7f674cf0e395129bd1836e34256b42bba06525f0881961cd8c6ca0e5e09df520786a8735e1ec2e10e76c4e59b8cfddb38b151685
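The checksums.yaml added above records SHA1 and SHA512 hex digests for the two archives inside the packaged gem. A minimal sketch of how such digests can be recomputed for comparison follows; the `digests_for` helper and the sample input string are illustrative assumptions and are not part of macroape or RubyGems.

```ruby
require 'digest'

# Hypothetical helper: computes the two digest kinds that checksums.yaml
# records, for an arbitrary byte string (for example the bytes of metadata.gz).
def digests_for(bytes)
  {
    'SHA1'   => Digest::SHA1.hexdigest(bytes),
    'SHA512' => Digest::SHA512.hexdigest(bytes)
  }
end

# Sample input only; a real check would read the archive file in binary mode.
sample = digests_for('example archive bytes')
puts "SHA1:   #{sample['SHA1']}"
puts "SHA512: #{sample['SHA512']}"
```

To verify a downloaded gem, the same helper could be fed `File.binread` of each extracted archive and the result compared with the values listed in the hunk.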
data/README.md
CHANGED
@@ -17,7 +17,7 @@ Or install it yourself as:
 $ gem install macroape
 
 ## Usage
-For more information read manual at https://docs.google.com/document/pub?id=1_jsxhMNzMzy4d2d_byAd3n6Szg5gEcqG_Sf7w9tEqWw
+For more information read manual at https://docs.google.com/document/pub?id=1_jsxhMNzMzy4d2d_byAd3n6Szg5gEcqG_Sf7w9tEqWw
 
 ## Basic usage as a command-line tool
 MacroAPE have 7 command line tools:
@@ -31,7 +31,7 @@ Or install it yourself as:
 * eval_alignment \<first PWM file\> \<second PWM file\> \<shift of second matrix\> \<orientation of second matrix(direct|revcomp)\>
 
 ### Tools for looking through collection for the motifs most similar to a query motif
-* preprocess_collection \<folder with motif files\>
+* preprocess_collection \<folder with motif files\> \<collection output file\>
 * scan_collection \<query PWM file\> \<collection file\>
 
 ### Tool for finding mutual alignment of several motifs relative to first(leader) motif. It's designed to use with sequence_logo to draw logos of clusters
data/TODO.txt
CHANGED
@@ -17,6 +17,8 @@ ToDo:
 8)(TODO: for theoretically consistency, while making small inconsistences to old calculations)
 When we work with strong threshold, we round matrix up(in order to overrate threshold comparing to real thus taking underrated pvalue) and take upper bound of discrete-thresholds fork.
 When we are estimating lower bound of threshold (weak threshold) we take lower bound of fork of discrete thresholds. But we should ALSO (not done yet) take matrix discreted down! This'd allow us give exact answer on a question in which range real threshold should lay with given P-value, now we correctly estimate only lower bound of threshold(upper bound of P-value)
+9) (may be) Option to specify predefined query motif threshold in scan_collection
+10) Fix Readme!
 
 Specs and tests:
 create spec on use of MaxHashSize, MaxHashSizeDouble
data/lib/macroape/cli/align_motifs.rb
CHANGED
@@ -1,12 +1,12 @@
-require 'docopt'
 require_relative '../../macroape'
+require 'shellwords'
 
 module Macroape
   module CLI
     module AlignMotifs
 
       def self.main(argv)
-        doc = <<-
+        doc = <<-EOS.strip_doc
           Align motifs tool.
           It takes motifs and builds alignment of each motif to the first (leader) motif.
 
@@ -16,38 +16,78 @@ module Macroape
           pwm_file_3 shift_3 orientation_3
 
           Usage:
-
+          #{run_tool_cmd} [options] <leader pm> <rest pm files>...
+          or
+          ls rest_pms/*.pm | #{run_tool_cmd} [options] <leader pm>
 
           Options:
-          -
-
-
+          [-p <P-value>]
+          [-d <discretization level>]
+          [--pcm] - treat the input file as Position Count Matrix. PCM-to-PWM transformation to be done internally.
+          [--boundary lower|upper] Upper boundary (default) means that the obtained P-value is greater than or equal to the requested P-value
+          [-b <background probabilities] ACGT - 4 numbers, comma-delimited(spaces not allowed), sum should be equal to 1, like 0.25,0.24,0.26,0.25
+        EOS
 
-
+        if argv.empty? || ['-h', '--h', '-help', '--help'].any?{|help_option| argv.include?(help_option)}
+          STDERR.puts doc
+          exit
+        end
 
-
-
-        leader = motif_files.first
-        background = [1,1,1,1]
+        leader_background = [1,1,1,1]
+        rest_motifs_background = [1,1,1,1]
         discretization = 1
         pvalue = 0.0005
+        max_hash_size = 10000000
+        max_pair_hash_size = 10000
+        pvalue_boundary = :upper
+
+        data_model = argv.delete('--pcm') ? Bioinform::PCM : Bioinform::PWM
+
+        while argv.first && argv.first.start_with?('-')
+          case argv.shift
+          when '-p'
+            pvalue = argv.shift.to_f
+          when '-d'
+            discretization = argv.shift.to_f
+          when '--max-hash-size'
+            max_hash_size = argv.shift.to_i
+          when '--max-2d-hash-size'
+            max_pair_hash_size = argv.shift.to_i
+          when '-b'
+            rest_motifs_background = leader_background = argv.shift.split(',').map(&:to_f)
+          when '-b1'
+            leader_background = argv.shift.split(',').map(&:to_f)
+          when '-b2'
+            rest_motifs_background = argv.shift.split(',').map(&:to_f)
+          when '--boundary'
+            pvalue_boundary = argv.shift.to_sym
+            raise 'boundary should be either lower or upper' unless pvalue_boundary == :lower || pvalue_boundary == :upper
+          end
+        end
 
-
-
-
-
+        leader_pwm_file = argv.shift
+        rest_pwms_file = argv
+        rest_pwms_file += $stdin.read.shellsplit unless $stdin.tty?
+        rest_pwms_file.reject!{|filename| File.expand_path(filename) == File.expand_path(leader_pwm_file)}
+
+        shifts = []
+        shifts << [leader_pwm_file, 0, :direct]
+        pwm_first = data_model.new(File.read(leader_pwm_file)).to_pwm
+        pwm_first.set_parameters(background: leader_background, max_hash_size: max_hash_size).discrete!(discretization)
+
+        rest_pwms_file.each do |motif_name|
           pwm_second = data_model.new(File.read(motif_name)).to_pwm
-          pwm_second.set_parameters(background:
-
-
+          pwm_second.set_parameters(background: rest_motifs_background, max_hash_size: max_hash_size).discrete!(discretization)
+          cmp = Macroape::PWMCompare.new(pwm_first, pwm_second).set_parameters(max_pair_hash_size: max_pair_hash_size)
+          info = cmp.jaccard_by_pvalue(pvalue)
+          shifts << [motif_name, info[:shift], info[:orientation]]
         end
 
-        shifts.each do |motif_name,
+        shifts.each do |motif_name, shift,orientation|
           puts "#{motif_name}\t#{shift}\t#{orientation}"
         end
-
-
-        puts e.message
+      rescue => err
+        STDERR.puts "\n#{err}\n#{err.backtrace.first(5).join("\n")}\n\nUse --help option for help\n\n#{doc}"
       end
 
     end
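The added lines that build `rest_pwms_file` combine file names from the command line with names piped through standard input, then drop any entry that resolves to the leader file. A minimal standalone sketch of that step is shown below; `rest_motif_files` is a hypothetical helper introduced only for illustration, and a `StringIO` stands in for a real pipe.

```ruby
require 'shellwords'
require 'stringio'

# Hypothetical helper mirroring the logic in the hunk above: merge names given
# as arguments with shell-quoted names read from an input stream, then remove
# every entry that points at the same path as the leader motif.
def rest_motif_files(leader, argv_rest, input)
  files = argv_rest.dup
  # Only read the stream when it is not an interactive terminal.
  files += input.read.shellsplit unless input.tty?
  files.reject { |name| File.expand_path(name) == File.expand_path(leader) }
end

# StringIO#tty? returns false, so it behaves like piped input here.
piped = StringIO.new("KLF3_f1.pwm KLF4_f2.pwm 'name with space.pwm'\n")
p rest_motif_files('KLF4_f2.pwm', [], piped)
```

Using `shellsplit` instead of a plain whitespace split keeps quoted file names with spaces intact, and comparing expanded paths means the leader is filtered out even when it is spelled differently (for example `./KLF4_f2.pwm`).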
data/lib/macroape/version.rb
CHANGED
data/macroape.gemspec
CHANGED
@@ -4,7 +4,7 @@ require File.expand_path('../lib/macroape/version', __FILE__)
 Gem::Specification.new do |gem|
   gem.authors = ["Ilya Vorontsov"]
   gem.email = ["prijutme4ty@gmail.com"]
-  gem.description = %q{Macroape is an abbreviation for MAtrix CompaRisOn by Approximate P-value Estimation. It's a bioinformatic tool for evaluating similarity measure and best alignment between a pair of Position Weight Matrices(PWM), finding thresholds by P-values and
+  gem.description = %q{Macroape is an abbreviation for MAtrix CompaRisOn by Approximate P-value Estimation. It's a bioinformatic tool for evaluating similarity measure and best alignment between a pair of Position Weight Matrices(PWM), finding thresholds by P-values and vice versa and even searching a collection of motifs for the most similar ones. Used approach and application described in manual at https://docs.google.com/document/pub?id=1_jsxhMNzMzy4d2d_byAd3n6Szg5gEcqG_Sf7w9tEqWw}
   gem.summary = %q{PWM comparison tool using MACROAPE approach}
   gem.homepage = "http://autosome.ru/macroape/"
 
@@ -15,6 +15,5 @@ Gem::Specification.new do |gem|
   gem.require_paths = ["lib"]
   gem.version = Macroape::VERSION
 
-  gem.add_dependency('bioinform', '
-  gem.add_dependency('docopt', '= 0.5.0')
+  gem.add_dependency('bioinform', '~> 0.1.10')
 end
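The new dependency line uses the pessimistic operator, `~> 0.1.10`, which accepts versions from 0.1.10 up to but not including 0.2. A short sketch using RubyGems' own `Gem::Requirement` and `Gem::Version` classes shows which versions satisfy it; the candidate version numbers are arbitrary examples, not actual bioinform releases.

```ruby
require 'rubygems'

# The constraint string exactly as it appears in the new gemspec line.
requirement = Gem::Requirement.new('~> 0.1.10')

# Arbitrary sample versions chosen to sit on both sides of the allowed range.
%w[0.1.9 0.1.10 0.1.12 0.2.0].each do |candidate|
  ok = requirement.satisfied_by?(Gem::Version.new(candidate))
  puts "#{candidate}: #{ok}"
end
```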
data/test/align_motifs_test.rb
CHANGED
@@ -21,4 +21,17 @@ class TestAlignmotifs < Test::Unit::TestCase
                   %w[SP1_f1_revcomp.pcm -1 revcomp]],
                  Helpers.align_motifs_output('--pcm KLF4_f2.pcm KLF3_f1.pcm SP1_f1_revcomp.pcm')
   end
+  def test_names_from_stdin
+    assert_equal [%w[KLF4_f2.pwm 0 direct],
+                  %w[KLF3_f1.pwm -4 direct],
+                  %w[SP1_f1_revcomp.pwm -1 revcomp]],
+                 Helpers.provide_stdin('KLF3_f1.pwm SP1_f1_revcomp.pwm'){ Helpers.align_motifs_output('KLF4_f2.pwm') }
+  end
+  def test_names_from_stdin_duplicate_leader
+    assert_equal [%w[KLF4_f2.pwm 0 direct],
+                  %w[KLF3_f1.pwm -4 direct],
+                  %w[SP1_f1_revcomp.pwm -1 revcomp]],
+                 Helpers.provide_stdin('KLF3_f1.pwm KLF4_f2.pwm SP1_f1_revcomp.pwm'){ Helpers.align_motifs_output('KLF4_f2.pwm') }
+  end
+
 end
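The new tests rely on `Helpers.provide_stdin`, whose implementation is not part of this diff. One plausible way to write such a helper, shown purely as an assumption, is to swap `$stdin` for a `StringIO` while a block runs and restore it afterwards; the name `with_stdin` is hypothetical and chosen so it is not mistaken for the gem's real helper.

```ruby
require 'stringio'

# Hypothetical stand-in for Helpers.provide_stdin (the real helper is not shown
# in this diff): temporarily replaces $stdin with an in-memory stream.
def with_stdin(text)
  original = $stdin
  $stdin = StringIO.new(text)
  yield
ensure
  # Always restore the previous stream, even if the block raises.
  $stdin = original
end

names = with_stdin('KLF3_f1.pwm SP1_f1_revcomp.pwm') { $stdin.read.split }
p names
```

Because `StringIO#tty?` returns false, code under test that guards on `$stdin.tty?` would treat the substituted stream as piped input.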
metadata
CHANGED
@@ -1,52 +1,33 @@
 --- !ruby/object:Gem::Specification
 name: macroape
 version: !ruby/object:Gem::Version
-  version:
-  prerelease:
+  version: 4.0.0
 platform: ruby
 authors:
 - Ilya Vorontsov
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2013-04-29 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bioinform
   requirement: !ruby/object:Gem::Requirement
-    none: false
     requirements:
-    - -
+    - - ~>
       - !ruby/object:Gem::Version
-        version: 0.1.
+        version: 0.1.10
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
-    none: false
     requirements:
-    - -
+    - - ~>
       - !ruby/object:Gem::Version
-        version: 0.1.
-- !ruby/object:Gem::Dependency
-  name: docopt
-  requirement: !ruby/object:Gem::Requirement
-    none: false
-    requirements:
-    - - '='
-      - !ruby/object:Gem::Version
-        version: 0.5.0
-  type: :runtime
-  prerelease: false
-  version_requirements: !ruby/object:Gem::Requirement
-    none: false
-    requirements:
-    - - '='
-      - !ruby/object:Gem::Version
-        version: 0.5.0
+        version: 0.1.10
 description: Macroape is an abbreviation for MAtrix CompaRisOn by Approximate P-value
   Estimation. It's a bioinformatic tool for evaluating similarity measure and best
   alignment between a pair of Position Weight Matrices(PWM), finding thresholds by
-  P-values and
+  P-values and vice versa and even searching a collection of motifs for the most similar
   ones. Used approach and application described in manual at https://docs.google.com/document/pub?id=1_jsxhMNzMzy4d2d_byAd3n6Szg5gEcqG_Sf7w9tEqWw
 email:
 - prijutme4ty@gmail.com
@@ -130,27 +111,26 @@ files:
 - test/test_helper.rb
 homepage: http://autosome.ru/macroape/
 licenses: []
+metadata: {}
 post_install_message:
 rdoc_options: []
 require_paths:
 - lib
 required_ruby_version: !ruby/object:Gem::Requirement
-  none: false
   requirements:
-  - -
+  - - '>='
     - !ruby/object:Gem::Version
       version: '0'
 required_rubygems_version: !ruby/object:Gem::Requirement
-  none: false
   requirements:
-  - -
+  - - '>='
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version:
+rubygems_version: 2.0.3
 signing_key:
-specification_version:
+specification_version: 4
 summary: PWM comparison tool using MACROAPE approach
 test_files:
 - spec/count_distribution_spec.rb
@@ -190,3 +170,4 @@ test_files:
 - test/preprocess_collection_test.rb
 - test/scan_collection_test.rb
 - test/test_helper.rb
+has_rdoc: