cheripic 1.1.0 → 1.2.0
- checksums.yaml +4 -4
- data/.travis.yml +2 -2
- data/Gemfile +1 -0
- data/README.md +75 -1
- data/Rakefile +72 -4
- data/lib/cheripic/cmd.rb +16 -7
- data/lib/cheripic/contig_pileups.rb +3 -1
- data/lib/cheripic/implementer.rb +3 -3
- data/lib/cheripic/options.rb +9 -8
- data/lib/cheripic/variants.rb +13 -13
- data/lib/cheripic/version.rb +1 -1
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 958b4091f2c95903c3a43a13af7d75cbc7605813
+  data.tar.gz: 18b91af8e68553f4d1700dae921beb7e420f11ac
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 7290c13e270aae1a777767179168353c5c55a035bfd6e82025d000414112425e77f59ad7a6fd0c736d2d2775182e17f978ffa6b67153201e9d316458a6360db6
+  data.tar.gz: 595be6e01fdc4e0d6185a86f79f207abb2dc6bf50a8763a67186339e639542294185656b204cde8d86cda2f3519f7ae3341443a978f9f019c51cfef294513694
data/.travis.yml
CHANGED
data/Gemfile
CHANGED
data/README.md
CHANGED
@@ -15,6 +15,15 @@ Currently this gem is still in development and nearing complete working package.
 
 ## Installation
 
+Cheripic is available both as a command line tool and as a gem.
+Binaries are available for Linux 64bit and OSX.
+Best way to use Cheripic is to download appropriate binary arhcive
+unpack (`tar -xzf`) and add the unpacked directory to your `PATH`
+
+Latest binaries are available to [download here](https://github.com/shyamrallapalli/cheripic/releases/tag/v1.1.0)
+
+
+To install gem and use the gem in your development
 Add this line to your application's Gemfile:
 
 ```ruby
@@ -31,7 +40,72 @@ Or install it yourself as:
 
 ## Usage
 
-
+Running `cheripic` without any input at command line interface shows following help options
+
+```
+
+Cheripic v1.1.0
+Authors: Shyam Rallapalli and Dan MacLean
+
+Description: Candidate mutation and closely linked marker selection for non reference genomes
+Uses bulk segregant data from non-reference sequence genomes
+
+Inputs:
+1. Needs a reference fasta file of asssembly use for variant analysis
+2. Pileup files for mutant (phenotype of interest) bulks and background (wildtype phenotype) bulks
+3. If polyploid species, include of pileup from one or both parents
+
+USAGE:
+cheripic <options>
+
+OPTIONS:
+-f, --assembly=<s> Assembly file in FASTA format
+-F, --input-format=<s> bulk and parent alignment file format types - set either pileup or bam (default: pileup)
+-a, --mut-bulk=<s> Pileup or sorted BAM file alignments from mutant/trait of interest bulk 1
+-b, --bg-bulk=<s> Pileup or sorted BAM file alignments from background/wildtype bulk 2
+--output=<s> Directory to store results, will be created if not existing (default: cheripic_results)
+--loglevel=<s> Choose any one of "info / warn / debug" level for logs generated (default: debug)
+--hmes-adjust=<f> factor added to snp count of each contig to adjust for hme score calculations (default: 0.5)
+--htlow=<f> lower level for categorizing heterozygosity (default: 0.2)
+--hthigh=<f> high level for categorizing heterozygosity (default: 0.9)
+--mindepth=<i> minimum read depth to conisder a position for variant calls (default: 6)
+--min-non-ref-count=<i> minimum read depth supporting non reference base at each position (default: 3)
+--min-indel-count-support=<i> minimum read depth supporting an indel at each position (default: 3)
+--ignore-reference-n, --no-ignore-reference-n ignore variant calls at N (completely ambigous) bases in the reference (default: true)
+-q, --mapping-quality=<i> minimum mapping quality of read covering the position (default: 20)
+-Q, --base-quality=<i> minimum base quality of bases covering the position (default: 15)
+--noise=<f> praportion of reads for a variant to conisder as noise (default: 0.1)
+--cross-type=<s> type of cross used to generated mapping population - back or out (default: back)
+--only-frag-with-vars, --no-only-frag-with-vars select only contigs containing variants for analysis (default: true)
+--filter-out-low-hmes, --no-filter-out-low-hmes ignore variants from contigs with low hmescore or bfr to list in the final output (default: true)
+--polyploidy Set if the data input is from polyploids
+-p, --mut-parent=<s> Pileup or sorted BAM file alignments from mutant/trait of interest parent (default: )
+-r, --bg-parent=<s> Pileup or sorted BAM file alignments from background/wildtype parent (default: )
+--bfr-adjust=<f> factor added to hemi snp frequency of each parent to adjust for bfr calculations (default: 0.05)
+--examples shows some example commands with explanation
+
+```
+
+
+
+Example Commands
+
+
+```
+EXAMPLE COMMANDS:
+1. cheripic -f assembly.fa -a mutbulk.pileup -b bgbulk.pileup --output=cheripic_output
+2. cheripic --assembly assembly.fa --mut-bulk mutbulk.pileup --bg-bulk bgbulk.pileup
+--mut-parent mutparent.pileup --bg-parent bgparent.pileup --polyploidy true --output cheripic_results
+3. cheripic --assembly assembly.fa --mut-bulk mutbulk.pileup --bg-bulk bgbulk.pileup
+--mut-parent mutparent.pileup --bg-parent bgparent.pileup --polyploidy true
+--no-only-frag-with-vars --no-filter-out-low-hmes --output cheripic_results
+
+```
+
+
+By default contigs with out a variant and thos contigs with lower scores are discarded.
+so use options `--no-only-frag-with-vars` and `--no-filter-out-low-hmes` to disable them
+
 
 ## Development
 
data/Rakefile
CHANGED
@@ -1,10 +1,78 @@
-require
-require
+require 'bundler/gem_tasks'
+require 'rake/testtask'
+# For Bundler.with_clean_env
+require 'bundler/setup'
 
 Rake::TestTask.new(:test) do |t|
-  t.libs <<
-  t.libs <<
+  t.libs << 'test'
+  t.libs << 'lib'
   t.test_files = FileList['test/**/*_test.rb']
 end
 
 task :default => :test
+
+
+# for packaging
+
+PACKAGE_NAME = 'cheripic'
+VERSION = `bundle exec bin/cheripic -v`.chomp
+TRAVELING_RUBY_VERSION = '20150210-2.1.5'
+
+# pre-downloaded travelling ruby from following links and placed them in 'packaging' dirctory
+# http://d6r77u77i8pq3.cloudfront.net/releases/traveling-ruby-20150210-2.1.5-linux-x86_64.tar.gz
+# http://d6r77u77i8pq3.cloudfront.net/releases/traveling-ruby-20150210-2.1.5-osx.tar.gz
+
+desc 'Package your app'
+task :package => ['package:linux:x86_64', 'package:osx']
+
+namespace :package do
+
+  namespace :linux do
+    desc 'Package your app for Linux x86_64'
+    task :x86_64 => [:bundle_install, "packaging/traveling-ruby-#{TRAVELING_RUBY_VERSION}-linux-x86_64.tar.gz"] do
+      create_package('linux-x86_64')
+    end
+  end
+
+  desc 'Package your app for OS X'
+  task :osx => [:bundle_install, "packaging/traveling-ruby-#{TRAVELING_RUBY_VERSION}-osx.tar.gz"] do
+    create_package('osx')
+  end
+
+  desc 'Install gems to local directory'
+  task :bundle_install do
+    if RUBY_VERSION !~ /^2\.1\./
+      abort "You can only 'bundle install' using Ruby 2.1, because that's what Traveling Ruby uses."
+    end
+    sh 'rm -rf packaging/tmp'
+    sh 'mkdir packaging/tmp'
+    sh 'cp Gemfile.lock packaging/tmp/'
+    sh 'cp packaging/Gemfile packaging/tmp/'
+    Bundler.with_clean_env do
+      sh 'env BUNDLE_IGNORE_CONFIG=1 bundle install --path packaging/vendor --without development'
+    end
+    sh 'rm -rf packaging/tmp'
+    sh 'rm -f packaging/vendor/*/*/cache/*'
+  end
+end
+
+def create_package(target)
+  package_dest = "#{PACKAGE_NAME}-#{VERSION}-#{target}"
+  package_dir = "packaging/#{package_dest}"
+  sh "rm -rf #{package_dir}"
+  sh "mkdir #{package_dir}"
+  sh "mkdir -p #{package_dir}/lib/app"
+  sh "cp -R bin #{package_dir}/lib/app/"
+  sh "cp -R lib #{package_dir}/lib/app/"
+  sh "mkdir #{package_dir}/lib/app/ruby"
+  sh "tar -xzf packaging/traveling-ruby-#{TRAVELING_RUBY_VERSION}-#{target}.tar.gz -C #{package_dir}/lib/app/ruby"
+  sh "cp packaging/wrapper.sh #{package_dir}/cheripic"
+  sh "cp -pR packaging/vendor/ruby/2.1.0 #{package_dir}/lib/app/ruby/"
+  sh "cp packaging/cheripic.gemspec Gemfile Gemfile.lock LICENSE.txt #{package_dir}/lib/app/"
+  sh "mkdir #{package_dir}/lib/app/.bundle"
+  sh "cp packaging/bundler-config #{package_dir}/lib/app/.bundle/config"
+  # if !ENV['DIR_ONLY']
+  #   sh "tar -czf #{package_dir}.tar.gz #{package_dir}"
+  #   sh "rm -rf #{package_dir}"
+  # end
+end
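The packaging tasks in this Rakefile derive the output directory from the package name, the version reported by the CLI, and the target platform, and they refuse to bundle under anything other than Ruby 2.1. A minimal sketch of just that naming and guard logic, with the shell calls left out. The helper names `package_dir_for`, `traveling_ruby_archive`, and `traveling_ruby_compatible?` are hypothetical, and the version string is passed in as an argument rather than read from `bin/cheripic -v`:

```ruby
# Hypothetical helpers mirroring the naming and version guard in the Rakefile.
PACKAGE_NAME = 'cheripic'
TRAVELING_RUBY_VERSION = '20150210-2.1.5'

# Directory that create_package(target) would populate for a given version.
def package_dir_for(version, target)
  "packaging/#{PACKAGE_NAME}-#{version}-#{target}"
end

# Path of the pre-downloaded Traveling Ruby archive for a target platform.
def traveling_ruby_archive(target)
  "packaging/traveling-ruby-#{TRAVELING_RUBY_VERSION}-#{target}.tar.gz"
end

# Same pattern as the :bundle_install task: only Ruby 2.1.x passes.
def traveling_ruby_compatible?(ruby_version)
  !!(ruby_version =~ /^2\.1\./)
end
```

For example, `package_dir_for('1.2.0', 'osx')` yields `packaging/cheripic-1.2.0-osx`, the directory the `package:osx` task would fill for that version.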
data/lib/cheripic/cmd.rb
CHANGED
@@ -40,6 +40,7 @@ module Cheripic
     def argument_parser
       cmds = self
       Trollop::Parser.new do
+        version Cheripic::VERSION
         banner cmds.help_message
         opt :assembly, 'Assembly file in FASTA format',
             :short => '-f',
@@ -76,9 +77,9 @@ module Cheripic
         opt :min_indel_count_support, 'minimum read depth supporting an indel at each position',
             :type => Integer,
             :default => 3
-        opt :
+        opt :ambiguous_ref_bases, 'including variant at completely ambiguous bases in the reference',
             :type => FalseClass,
-            :default =>
+            :default => false
         opt :mapping_quality, 'minimum mapping quality of read covering the position',
             :short => '-q',
             :type => Integer,
@@ -93,12 +94,12 @@ module Cheripic
         opt :cross_type, 'type of cross used to generated mapping population - back or out',
             :type => String,
             :default => 'back'
-        opt :
+        opt :use_all_contigs, 'option to select all contigs or only contigs containing variants for analysis',
             :type => FalseClass,
-            :default =>
-        opt :
+            :default => false
+        opt :include_low_hmes, 'option to include or discard variants from contigs with low hme-score or bfr score to list in the final output',
             :type => FalseClass,
-            :default =>
+            :default => false
         opt :polyploidy, 'Set if the data input is from polyploids',
             :type => FalseClass,
             :default => false
@@ -113,6 +114,9 @@ module Cheripic
         opt :bfr_adjust, 'factor added to hemi snp frequency of each parent to adjust for bfr calculations',
             :type => Float,
             :default => 0.05
+        opt :sel_seq_len, 'sequence length to print from either side of selected variants',
+            :type => Integer,
+            :default => 50
         opt :examples, 'shows some example commands with explanation'
       end
     end
@@ -148,7 +152,12 @@ module Cheripic
       Cheripic v#{Cheripic::VERSION.dup}
 
       EXAMPLE COMMANDS:
-
+      1. cheripic -f assembly.fa -a mutbulk.pileup -b bgbulk.pileup --output=cheripic_output
+      2. cheripic --assembly assembly.fa --mut-bulk mutbulk.pileup --bg-bulk bgbulk.pileup
+      --mut-parent mutparent.pileup --bg-parent bgparent.pileup --polyploidy true --output cheripic_results
+      3. cheripic --assembly assembly.fa --mut-bulk mutbulk.pileup --bg-bulk bgbulk.pileup
+      --mut-parent mutparent.pileup --bg-parent bgparent.pileup --polyploidy true
+      --no-only-frag-with-vars --no-filter-out-low-hmes --output cheripic_results
       EOS
       puts msg.split("\n").map{ |line| line.lstrip }.join("\n")
       exit(0)
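The renamed options in `cmd.rb` (`ambiguous_ref_bases`, `use_all_contigs`, `include_low_hmes`) are plain switches that default to `false`, and `sel_seq_len` is an integer defaulting to 50. The gem declares them with Trollop; the sketch below swaps in Ruby's standard `OptionParser` so it runs without extra gems. The long-flag spellings (`--use-all-contigs` and so on) are assumed from the option names, and `parse_flags` is a hypothetical helper, not part of the gem:

```ruby
require 'optparse'

# Defaults as set in the diff for the renamed switches and sel_seq_len.
DEFAULTS = {
  ambiguous_ref_bases: false,
  use_all_contigs: false,
  include_low_hmes: false,
  sel_seq_len: 50
}.freeze

# Parses an argv-style array and returns a settings hash (hypothetical helper).
def parse_flags(argv)
  settings = DEFAULTS.dup
  OptionParser.new do |opts|
    # Each switch flips its setting to true only when given on the command line.
    opts.on('--ambiguous-ref-bases') { settings[:ambiguous_ref_bases] = true }
    opts.on('--use-all-contigs') { settings[:use_all_contigs] = true }
    opts.on('--include-low-hmes') { settings[:include_low_hmes] = true }
    # Integer coercion rejects non-numeric values.
    opts.on('--sel-seq-len=N', Integer) { |n| settings[:sel_seq_len] = n }
  end.parse(argv)
  settings
end
```

Calling `parse_flags(['--use-all-contigs', '--sel-seq-len=75'])` turns on only the one switch and overrides the sequence length; every other setting keeps its default.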
data/lib/cheripic/contig_pileups.rb
CHANGED
@@ -131,12 +131,14 @@ module Cheripic
     # @return [Symbol] variant mode of the background bulk (:hom or :het) at the position
     def bg_bulk_var(pos)
       bg_base_hash = @bg_bulk[pos].var_base_frac
+      bg_base_hash.delete(:ref)
+      return nil if bg_base_hash.empty?
       if bg_base_hash.length > 1
         # taking only var mode
         var_mode(bg_base_hash.values.max)
       else
         # taking only var mode
-        var_mode(bg_base_hash[0])
+        var_mode(bg_base_hash[bg_base_hash.keys[0]])
       end
     end
 
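The change to `bg_bulk_var` replaces `bg_base_hash[0]` with a lookup through the first key. Indexing a hash with `0` looks for a key equal to `0`, so it returns `nil` unless such a key exists. A self-contained sketch of the revised flow follows. The base-fraction hash, its `:A`/`:T` keys, and the `var_mode` thresholds are illustrative assumptions (the thresholds borrow the `htlow` and `hthigh` defaults from the help text), not the gem's actual implementation, and unlike the original this version copies the hash instead of deleting `:ref` in place:

```ruby
# Hypothetical stand-in for var_mode: classifies a variant fraction.
def var_mode(fraction, ht_low = 0.2, ht_high = 0.9)
  return :hom if fraction >= ht_high
  return :het if fraction >= ht_low
  nil
end

# Mirrors the revised bg_bulk_var: drop the :ref entry, return nil when
# nothing is left, otherwise classify the largest (or only) variant fraction.
def bg_bulk_var(base_fractions)
  bases = base_fractions.dup
  bases.delete(:ref)
  return nil if bases.empty?
  if bases.length > 1
    var_mode(bases.values.max)
  else
    # bases[0] would look up the key 0; go through the first key instead.
    var_mode(bases[bases.keys[0]])
  end
end
```

With only a `:ref` entry the method now returns `nil` early instead of passing `nil` on to `var_mode`.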
data/lib/cheripic/implementer.rb
CHANGED
@@ -36,13 +36,13 @@ module Cheripic
         mindepth
         min_non_ref_count
         min_indel_count_support
-
+        ambiguous_ref_bases
         mapping_quality
         base_quality
         noise
         cross_type
-
-
+        use_all_contigs
+        include_low_hmes
         polyploidy
         bfr_adjust}
       settings = inputs.select { |k| set2.include?(k) }
data/lib/cheripic/options.rb
CHANGED
@@ -14,13 +14,13 @@ module Cheripic
       :mindepth => 6,
       :min_non_ref_count => 3,
       :min_indel_count_support => 3,
-      :
+      :ambiguous_ref_bases => false,
       :mapping_quality => 20,
       :base_quality => 15,
       :noise => 0.1,
       :cross_type => 'back',
-      :
-      :
+      :use_all_contigs => false,
+      :include_low_hmes => false,
       :polyploidy => false,
       :bfr_adjust => 0.05,
       :sel_seq_len => 50
@@ -66,9 +66,10 @@ module Cheripic
     end
 
     # Option to whether to ignore or consider the reference positions which are ambiguous
+    # @note switching option name here so Pileup options are same
     # @return [Boolean]
     def self.ignore_reference_n
-      @user_settings[:
+      @user_settings[:ambiguous_ref_bases] ? false : true
     end
 
     # Minimum alignment mapping quality of the read to be used for bam files
@@ -98,14 +99,14 @@ module Cheripic
 
     # Option to whether to ignore or consider the contigs with out any variants
     # @return [Boolean]
-    def self.
-      @user_settings[:
+    def self.use_all_contigs
+      @user_settings[:use_all_contigs]
     end
 
     # Option to whether to ignore or consider the contigs with low HME score
     # @return [Boolean]
-    def self.
-      @user_settings[:
+    def self.include_low_hmes
+      @user_settings[:include_low_hmes]
     end
 
     # Option to whether to set the input data is from polyploid or not
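In `options.rb`, `ignore_reference_n` keeps its name for the pileup code but now reads the inverted `ambiguous_ref_bases` setting, while `use_all_contigs` and `include_low_hmes` return their settings directly. A minimal sketch of that mapping; the `SettingsSketch` module and its `configure` method are hypothetical scaffolding to drive the example, and only the accessor bodies follow the diff:

```ruby
module SettingsSketch
  # Defaults for the three renamed options, as listed in the diff.
  @user_settings = {
    ambiguous_ref_bases: false,
    use_all_contigs: false,
    include_low_hmes: false
  }

  # Hypothetical setter used only to drive the example.
  def self.configure(overrides)
    @user_settings = @user_settings.merge(overrides)
  end

  # Inverted view of :ambiguous_ref_bases, as in the diff.
  def self.ignore_reference_n
    @user_settings[:ambiguous_ref_bases] ? false : true
  end

  # Direct passthrough accessors, as in the diff.
  def self.use_all_contigs
    @user_settings[:use_all_contigs]
  end

  def self.include_low_hmes
    @user_settings[:include_low_hmes]
  end
end
```

With the defaults, `SettingsSketch.ignore_reference_n` is `true`, which matches the previous default of ignoring variant calls at `N` bases in the reference.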
data/lib/cheripic/variants.rb
CHANGED
@@ -119,15 +119,17 @@ module Cheripic
     end
 
     # Applies selection procedure on assembly contigs based on the ratio_type provided.
-    # If
+    # If use_all_contigs is set to false then contigs without any variant are discarded for :hme_score
     # while contigs without any hemisnps are discarded for :bfr_score
-    # If
+    # If include_low_hmes is set to false then contigs are further filtered based on a cut off value of the score
     # @param ratio_type [Symbol] ratio_type is either :hme_score or :bfr_score
     def select_contigs(ratio_type)
       selected_contigs ={}
-
+      use_all_contigs = Options.use_all_contigs
       @assembly.each_key do | frag |
-        if
+        if use_all_contigs
+          selected_contigs[frag] = @assembly[frag]
+        else
           if ratio_type == :hme_score
             # selecting fragments which have a variant
             if @assembly[frag].hm_num + @assembly[frag].ht_num > 2 * Options.hmes_adjust
@@ -139,15 +141,13 @@ module Cheripic
               selected_contigs[frag] = @assembly[frag]
             end
           end
-        else
-          selected_contigs[frag] = @assembly[frag]
         end
       end
       selected_contigs = filter_contigs(selected_contigs, ratio_type)
-      if
-        logger.info "Selected #{selected_contigs.length} out of #{@assembly.length} fragments with #{ratio_type} score\n"
-      else
+      if use_all_contigs
         logger.info "No filtering was applied to fragments\n"
+      else
+        logger.info "Selected #{selected_contigs.length} out of #{@assembly.length} fragments with #{ratio_type} score\n"
       end
       selected_contigs
     end
@@ -171,11 +171,13 @@ module Cheripic
     # @param ratio_type [Symbol] ratio_type is either :hme_score or :bfr_score
     # @param selected_contigs [Hash] a hash of contigs with selected ratio_type, a subset of assembly hash
     def get_cutoff(selected_contigs, ratio_type)
-
+      include_low_hmes = Options.include_low_hmes
       # set minimum cut off hme_score or bfr_score to pick fragments with variants
       # calculate min hme score for back or out crossed data or bfr_score for polypoidy data
       # if no filtering applied set cutoff to 1.1
-      if
+      if include_low_hmes
+        cutoff = 0.0
+      else
         if ratio_type == :hme_score
           adjust = Options.hmes_adjust
           if Options.cross_type == 'back'
@@ -186,8 +188,6 @@ module Cheripic
         else # ratio_type is bfr_score
           cutoff = bfr_cutoff(selected_contigs)
         end
-      else
-        cutoff = 0.0
       end
       cutoff
     end
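The `variants.rb` hunks flip both conditions so that the positive flag comes first: `use_all_contigs` keeps every contig, and `include_low_hmes` sets the cutoff to `0.0`. A simplified sketch of that control flow, covering only the `:hme_score` branch shown in the diff. The `Contig` struct, the keyword arguments, and the block passed to `get_cutoff` (standing in for the gem's own cutoff calculation) are hypothetical:

```ruby
# Hypothetical contig record; hm_num and ht_num are the counts used in the diff.
Contig = Struct.new(:hm_num, :ht_num)

# Keeps every contig when use_all_contigs is true; otherwise keeps only
# contigs whose variant count exceeds twice the adjustment factor.
def select_contigs(assembly, use_all_contigs:, hmes_adjust: 0.5)
  assembly.select do |_id, contig|
    use_all_contigs || (contig.hm_num + contig.ht_num > 2 * hmes_adjust)
  end
end

# Returns 0.0 when low-scoring contigs are to be included; otherwise defers
# to the caller-supplied block that stands in for the real cutoff calculation.
def get_cutoff(include_low_hmes:)
  include_low_hmes ? 0.0 : yield
end
```

Passing the flags as arguments rather than reading them from `Options` keeps the sketch independent of the gem's global settings.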
data/lib/cheripic/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: cheripic
 version: !ruby/object:Gem::Version
-  version: 1.
+  version: 1.2.0
 platform: ruby
 authors:
 - Shyam Rallapalli
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2016-08-
+date: 2016-08-11 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: yell