cheripic 1.1.0 → 1.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: 458f681424a73ea58acb8aefa73d68019ad0854d
-   data.tar.gz: 23547939b1fead465d06d2f6d8e45ce4172b1cb1
+   metadata.gz: 958b4091f2c95903c3a43a13af7d75cbc7605813
+   data.tar.gz: 18b91af8e68553f4d1700dae921beb7e420f11ac
  SHA512:
-   metadata.gz: 2e3af0df95197769c542b4aab76444a6b14842890b46a97d6be10101f267db5f5df7d1ed8d67083ac8890a866e1cab678a9b23c5dc03b1edb7b8fc2150b35097
-   data.tar.gz: 9aa159df9086102679bd6359d4a5bf94dfe72f52d9c11e66259ffd40754f767001bb2a67e996b958018a009db0a6aa558c7ebe4001c2f6bce9b8993bdfd66091
+   metadata.gz: 7290c13e270aae1a777767179168353c5c55a035bfd6e82025d000414112425e77f59ad7a6fd0c736d2d2775182e17f978ffa6b67153201e9d316458a6360db6
+   data.tar.gz: 595be6e01fdc4e0d6185a86f79f207abb2dc6bf50a8763a67186339e639542294185656b204cde8d86cda2f3519f7ae3341443a978f9f019c51cfef294513694
data/.travis.yml CHANGED
@@ -1,4 +1,4 @@
  language: ruby
  rvm:
- - 2.2.1
- before_install: gem install bundler -v 1.10.6
+ - 2.1.5
+ before_install: gem install bundler -v 1.7.6
data/Gemfile CHANGED
@@ -1,4 +1,5 @@
  source 'https://rubygems.org'
+ ruby '2.1.5'

  # Specify your gem's dependencies in cheripic.gemspec
  gemspec
data/README.md CHANGED
@@ -15,6 +15,15 @@ Currently this gem is still in development and nearing complete working package.

  ## Installation

+ Cheripic is available both as a command line tool and as a gem.
+ Binaries are available for Linux 64bit and OSX.
+ The best way to use Cheripic is to download the appropriate binary archive,
+ unpack it (`tar -xzf`) and add the unpacked directory to your `PATH`.
+
+ Latest binaries are available to [download here](https://github.com/shyamrallapalli/cheripic/releases/tag/v1.1.0)
+
+
+ To install the gem and use it in your development:
  Add this line to your application's Gemfile:

  ```ruby
@@ -31,7 +40,72 @@ Or install it yourself as:

  ## Usage

- TODO: Write usage instructions here
+ Running `cheripic` without any input at the command line shows the following help options
+
+ ```
+
+ Cheripic v1.1.0
+ Authors: Shyam Rallapalli and Dan MacLean
+
+ Description: Candidate mutation and closely linked marker selection for non reference genomes
+ Uses bulk segregant data from non-reference sequence genomes
+
+ Inputs:
+ 1. Needs a reference fasta file of asssembly use for variant analysis
+ 2. Pileup files for mutant (phenotype of interest) bulks and background (wildtype phenotype) bulks
+ 3. If polyploid species, include of pileup from one or both parents
+
+ USAGE:
+ cheripic <options>
+
+ OPTIONS:
+ -f, --assembly=<s> Assembly file in FASTA format
+ -F, --input-format=<s> bulk and parent alignment file format types - set either pileup or bam (default: pileup)
+ -a, --mut-bulk=<s> Pileup or sorted BAM file alignments from mutant/trait of interest bulk 1
+ -b, --bg-bulk=<s> Pileup or sorted BAM file alignments from background/wildtype bulk 2
+ --output=<s> Directory to store results, will be created if not existing (default: cheripic_results)
+ --loglevel=<s> Choose any one of "info / warn / debug" level for logs generated (default: debug)
+ --hmes-adjust=<f> factor added to snp count of each contig to adjust for hme score calculations (default: 0.5)
+ --htlow=<f> lower level for categorizing heterozygosity (default: 0.2)
+ --hthigh=<f> high level for categorizing heterozygosity (default: 0.9)
+ --mindepth=<i> minimum read depth to conisder a position for variant calls (default: 6)
+ --min-non-ref-count=<i> minimum read depth supporting non reference base at each position (default: 3)
+ --min-indel-count-support=<i> minimum read depth supporting an indel at each position (default: 3)
+ --ignore-reference-n, --no-ignore-reference-n ignore variant calls at N (completely ambigous) bases in the reference (default: true)
+ -q, --mapping-quality=<i> minimum mapping quality of read covering the position (default: 20)
+ -Q, --base-quality=<i> minimum base quality of bases covering the position (default: 15)
+ --noise=<f> praportion of reads for a variant to conisder as noise (default: 0.1)
+ --cross-type=<s> type of cross used to generated mapping population - back or out (default: back)
+ --only-frag-with-vars, --no-only-frag-with-vars select only contigs containing variants for analysis (default: true)
+ --filter-out-low-hmes, --no-filter-out-low-hmes ignore variants from contigs with low hmescore or bfr to list in the final output (default: true)
+ --polyploidy Set if the data input is from polyploids
+ -p, --mut-parent=<s> Pileup or sorted BAM file alignments from mutant/trait of interest parent (default: )
+ -r, --bg-parent=<s> Pileup or sorted BAM file alignments from background/wildtype parent (default: )
+ --bfr-adjust=<f> factor added to hemi snp frequency of each parent to adjust for bfr calculations (default: 0.05)
+ --examples shows some example commands with explanation
+
+ ```
+
+
+
+ Example Commands
+
+
+ ```
+ EXAMPLE COMMANDS:
+ 1. cheripic -f assembly.fa -a mutbulk.pileup -b bgbulk.pileup --output=cheripic_output
+ 2. cheripic --assembly assembly.fa --mut-bulk mutbulk.pileup --bg-bulk bgbulk.pileup
+ --mut-parent mutparent.pileup --bg-parent bgparent.pileup --polyploidy true --output cheripic_results
+ 3. cheripic --assembly assembly.fa --mut-bulk mutbulk.pileup --bg-bulk bgbulk.pileup
+ --mut-parent mutparent.pileup --bg-parent bgparent.pileup --polyploidy true
+ --no-only-frag-with-vars --no-filter-out-low-hmes --output cheripic_results
+
+ ```
+
+
+ By default, contigs without a variant and contigs with low scores are discarded.
+ Use the options `--no-only-frag-with-vars` and `--no-filter-out-low-hmes` to disable this filtering.
+

  ## Development
data/Rakefile CHANGED
@@ -1,10 +1,78 @@
- require "bundler/gem_tasks"
- require "rake/testtask"
+ require 'bundler/gem_tasks'
+ require 'rake/testtask'
+ # For Bundler.with_clean_env
+ require 'bundler/setup'

  Rake::TestTask.new(:test) do |t|
-   t.libs << "test"
-   t.libs << "lib"
+   t.libs << 'test'
+   t.libs << 'lib'
    t.test_files = FileList['test/**/*_test.rb']
  end

  task :default => :test
+
+
+ # for packaging
+
+ PACKAGE_NAME = 'cheripic'
+ VERSION = `bundle exec bin/cheripic -v`.chomp
+ TRAVELING_RUBY_VERSION = '20150210-2.1.5'
+
+ # pre-downloaded Traveling Ruby from the following links and placed them in the 'packaging' directory
+ # http://d6r77u77i8pq3.cloudfront.net/releases/traveling-ruby-20150210-2.1.5-linux-x86_64.tar.gz
+ # http://d6r77u77i8pq3.cloudfront.net/releases/traveling-ruby-20150210-2.1.5-osx.tar.gz
+
+ desc 'Package your app'
+ task :package => ['package:linux:x86_64', 'package:osx']
+
+ namespace :package do
+
+   namespace :linux do
+     desc 'Package your app for Linux x86_64'
+     task :x86_64 => [:bundle_install, "packaging/traveling-ruby-#{TRAVELING_RUBY_VERSION}-linux-x86_64.tar.gz"] do
+       create_package('linux-x86_64')
+     end
+   end
+
+   desc 'Package your app for OS X'
+   task :osx => [:bundle_install, "packaging/traveling-ruby-#{TRAVELING_RUBY_VERSION}-osx.tar.gz"] do
+     create_package('osx')
+   end
+
+   desc 'Install gems to local directory'
+   task :bundle_install do
+     if RUBY_VERSION !~ /^2\.1\./
+       abort "You can only 'bundle install' using Ruby 2.1, because that's what Traveling Ruby uses."
+     end
+     sh 'rm -rf packaging/tmp'
+     sh 'mkdir packaging/tmp'
+     sh 'cp Gemfile.lock packaging/tmp/'
+     sh 'cp packaging/Gemfile packaging/tmp/'
+     Bundler.with_clean_env do
+       sh 'env BUNDLE_IGNORE_CONFIG=1 bundle install --path packaging/vendor --without development'
+     end
+     sh 'rm -rf packaging/tmp'
+     sh 'rm -f packaging/vendor/*/*/cache/*'
+   end
+ end
+
+ def create_package(target)
+   package_dest = "#{PACKAGE_NAME}-#{VERSION}-#{target}"
+   package_dir = "packaging/#{package_dest}"
+   sh "rm -rf #{package_dir}"
+   sh "mkdir #{package_dir}"
+   sh "mkdir -p #{package_dir}/lib/app"
+   sh "cp -R bin #{package_dir}/lib/app/"
+   sh "cp -R lib #{package_dir}/lib/app/"
+   sh "mkdir #{package_dir}/lib/app/ruby"
+   sh "tar -xzf packaging/traveling-ruby-#{TRAVELING_RUBY_VERSION}-#{target}.tar.gz -C #{package_dir}/lib/app/ruby"
+   sh "cp packaging/wrapper.sh #{package_dir}/cheripic"
+   sh "cp -pR packaging/vendor/ruby/2.1.0 #{package_dir}/lib/app/ruby/"
+   sh "cp packaging/cheripic.gemspec Gemfile Gemfile.lock LICENSE.txt #{package_dir}/lib/app/"
+   sh "mkdir #{package_dir}/lib/app/.bundle"
+   sh "cp packaging/bundler-config #{package_dir}/lib/app/.bundle/config"
+   # if !ENV['DIR_ONLY']
+   #   sh "tar -czf #{package_dir}.tar.gz #{package_dir}"
+   #   sh "rm -rf #{package_dir}"
+   # end
+ end
data/lib/cheripic/cmd.rb CHANGED
@@ -40,6 +40,7 @@ module Cheripic
  def argument_parser
    cmds = self
    Trollop::Parser.new do
+     version Cheripic::VERSION
      banner cmds.help_message
      opt :assembly, 'Assembly file in FASTA format',
          :short => '-f',
@@ -76,9 +77,9 @@ module Cheripic
      opt :min_indel_count_support, 'minimum read depth supporting an indel at each position',
          :type => Integer,
          :default => 3
-     opt :ignore_reference_n, 'ignore variant calls at N (completely ambigous) bases in the reference',
+     opt :ambiguous_ref_bases, 'including variant at completely ambiguous bases in the reference',
          :type => FalseClass,
-         :default => true
+         :default => false
      opt :mapping_quality, 'minimum mapping quality of read covering the position',
          :short => '-q',
          :type => Integer,
@@ -93,12 +94,12 @@ module Cheripic
      opt :cross_type, 'type of cross used to generated mapping population - back or out',
          :type => String,
          :default => 'back'
-     opt :only_frag_with_vars, 'select only contigs containing variants for analysis',
+     opt :use_all_contigs, 'option to select all contigs or only contigs containing variants for analysis',
          :type => FalseClass,
-         :default => true
-     opt :filter_out_low_hmes, 'ignore variants from contigs with low hmescore or bfr to list in the final output',
+         :default => false
+     opt :include_low_hmes, 'option to include or discard variants from contigs with low hme-score or bfr score to list in the final output',
          :type => FalseClass,
-         :default => true
+         :default => false
      opt :polyploidy, 'Set if the data input is from polyploids',
          :type => FalseClass,
          :default => false
@@ -113,6 +114,9 @@ module Cheripic
      opt :bfr_adjust, 'factor added to hemi snp frequency of each parent to adjust for bfr calculations',
          :type => Float,
          :default => 0.05
+     opt :sel_seq_len, 'sequence length to print from either side of selected variants',
+         :type => Integer,
+         :default => 50
      opt :examples, 'shows some example commands with explanation'
    end
  end
@@ -148,7 +152,12 @@ module Cheripic
  Cheripic v#{Cheripic::VERSION.dup}

  EXAMPLE COMMANDS:
-
+ 1. cheripic -f assembly.fa -a mutbulk.pileup -b bgbulk.pileup --output=cheripic_output
+ 2. cheripic --assembly assembly.fa --mut-bulk mutbulk.pileup --bg-bulk bgbulk.pileup
+ --mut-parent mutparent.pileup --bg-parent bgparent.pileup --polyploidy true --output cheripic_results
+ 3. cheripic --assembly assembly.fa --mut-bulk mutbulk.pileup --bg-bulk bgbulk.pileup
+ --mut-parent mutparent.pileup --bg-parent bgparent.pileup --polyploidy true
+ --no-only-frag-with-vars --no-filter-out-low-hmes --output cheripic_results
  EOS
  puts msg.split("\n").map{ |line| line.lstrip }.join("\n")
  exit(0)
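
Note on the example commands: they still pass `--no-only-frag-with-vars` and `--no-filter-out-low-hmes`, while the option definitions in this same file rename those options to `use_all_contigs` and `include_low_hmes`. The following is a minimal Ruby sketch of how a 1.1.0 command line might be translated for 1.2.0. The new flag spellings are an assumption (Trollop normally derives long flags by replacing underscores with dashes), and `translate_legacy_flags` is a hypothetical helper, not part of cheripic.

```ruby
# Assumed mapping from 1.1.0 flags to their 1.2.0 equivalents.
# The right-hand names are inferred from the renamed option symbols.
LEGACY_FLAGS = {
  '--no-ignore-reference-n'  => '--ambiguous-ref-bases',
  '--no-only-frag-with-vars' => '--use-all-contigs',
  '--no-filter-out-low-hmes' => '--include-low-hmes'
}.freeze

# Old flags that only restated the old default (true); in 1.2.0 the same
# behaviour is the default, so they are simply dropped.
LEGACY_DEFAULTS = %w[
  --ignore-reference-n
  --only-frag-with-vars
  --filter-out-low-hmes
].freeze

# Hypothetical helper: rewrites an argument list, leaving unknown arguments untouched.
def translate_legacy_flags(argv)
  argv.reject { |arg| LEGACY_DEFAULTS.include?(arg) }
      .map { |arg| LEGACY_FLAGS.fetch(arg, arg) }
end

old_args = %w[-f assembly.fa -a mutbulk.pileup -b bgbulk.pileup
              --no-only-frag-with-vars --no-filter-out-low-hmes]
puts translate_legacy_flags(old_args).join(' ')
```

Because the new options default to `false`, the old "keep the default" flags have no counterpart and can be removed rather than translated.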
@@ -131,12 +131,14 @@ module Cheripic
  # @return [Symbol] variant mode of the background bulk (:hom or :het) at the position
  def bg_bulk_var(pos)
    bg_base_hash = @bg_bulk[pos].var_base_frac
+   bg_base_hash.delete(:ref)
+   return nil if bg_base_hash.empty?
    if bg_base_hash.length > 1
      # taking only var mode
      var_mode(bg_base_hash.values.max)
    else
      # taking only var mode
-     var_mode(bg_base_hash[0])
+     var_mode(bg_base_hash[bg_base_hash.keys[0]])
    end
  end
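
Note on this hunk: the removed line `var_mode(bg_base_hash[0])` looked up the key `0`, which a hash keyed by base (as `delete(:ref)` suggests) would not contain, so the lookup would return `nil`. A minimal standalone sketch of the corrected behaviour follows. Here `var_mode` is a hypothetical stand-in whose thresholds are borrowed from the `--htlow` and `--hthigh` defaults in the help text, and the method takes the fraction hash directly instead of reading `@bg_bulk`.

```ruby
# Hypothetical stand-in for the gem's var_mode; the real implementation is
# not shown in this diff. Thresholds assume the htlow/hthigh defaults.
def var_mode(fraction, ht_low: 0.2, ht_high: 0.9)
  if fraction >= ht_high
    :hom
  elsif fraction >= ht_low
    :het
  end
end

# Sketch of bg_bulk_var operating on a plain hash of base => fraction.
def bg_bulk_var(base_fractions)
  # Drop the reference entry without mutating the caller's hash.
  variants = base_fractions.reject { |base, _frac| base == :ref }
  return nil if variants.empty?

  # values.max also covers the single-entry case, so no branch is needed.
  var_mode(variants.values.max)
end

single = { A: 0.95 }
p single[0]              # the old lookup: key 0 is absent
p single[single.keys[0]] # the corrected lookup
p bg_bulk_var({ ref: 0.05, A: 0.95 })
p bg_bulk_var({ ref: 0.7 })
```

The sketch uses `reject` rather than `delete` so the input hash is left intact; whether the gem relies on the in-place deletion elsewhere is not visible from this diff.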
@@ -36,13 +36,13 @@ module Cheripic
  mindepth
  min_non_ref_count
  min_indel_count_support
- ignore_reference_n
+ ambiguous_ref_bases
  mapping_quality
  base_quality
  noise
  cross_type
- only_frag_with_vars
- filter_out_low_hmes
+ use_all_contigs
+ include_low_hmes
  polyploidy
  bfr_adjust}
  settings = inputs.select { |k| set2.include?(k) }
@@ -14,13 +14,13 @@ module Cheripic
  :mindepth => 6,
  :min_non_ref_count => 3,
  :min_indel_count_support => 3,
- :ignore_reference_n => true,
+ :ambiguous_ref_bases => false,
  :mapping_quality => 20,
  :base_quality => 15,
  :noise => 0.1,
  :cross_type => 'back',
- :only_frag_with_vars => true,
- :filter_out_low_hmes => true,
+ :use_all_contigs => false,
+ :include_low_hmes => false,
  :polyploidy => false,
  :bfr_adjust => 0.05,
  :sel_seq_len => 50
@@ -66,9 +66,10 @@ module Cheripic
  end

  # Option to whether to ignore or consider the reference positions which are ambiguous
+ # @note switching option name here so Pileup options are same
  # @return [Boolean]
  def self.ignore_reference_n
-   @user_settings[:ignore_reference_n]
+   @user_settings[:ambiguous_ref_bases] ? false : true
  end

  # Minimum alignment mapping quality of the read to be used for bam files
@@ -98,14 +99,14 @@ module Cheripic

  # Option to whether to ignore or consider the contigs with out any variants
  # @return [Boolean]
- def self.only_frag_with_vars
-   @user_settings[:only_frag_with_vars]
+ def self.use_all_contigs
+   @user_settings[:use_all_contigs]
  end

  # Option to whether to ignore or consider the contigs with low HME score
  # @return [Boolean]
- def self.filter_out_low_hmes
-   @user_settings[:filter_out_low_hmes]
+ def self.include_low_hmes
+   @user_settings[:include_low_hmes]
  end

  # Option to whether to set the input data is from polyploid or not
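
Note on the accessor changes above: the settings key becomes `ambiguous_ref_bases` while the accessor keeps the name `ignore_reference_n`, so the accessor now returns the logical inverse of the stored value. A minimal sketch of that relationship, using a hypothetical `OptionsSketch` module rather than the gem's own `Options` class:

```ruby
# Hypothetical, simplified stand-in for the gem's Options module.
module OptionsSketch
  # Defaults copied from the 1.2.0 side of the diff.
  DEFAULTS = {
    ambiguous_ref_bases: false,
    use_all_contigs: false,
    include_low_hmes: false
  }.freeze

  @user_settings = DEFAULTS.dup

  # Replace the active settings with the defaults plus any overrides.
  def self.update(overrides)
    @user_settings = DEFAULTS.merge(overrides)
  end

  # Same result as `@user_settings[:ambiguous_ref_bases] ? false : true`.
  def self.ignore_reference_n
    !@user_settings[:ambiguous_ref_bases]
  end

  def self.use_all_contigs
    @user_settings[:use_all_contigs]
  end
end

p OptionsSketch.ignore_reference_n # default: ambiguous bases are ignored
OptionsSketch.update(ambiguous_ref_bases: true)
p OptionsSketch.ignore_reference_n # ambiguous bases are now considered
```

Keeping the old accessor name means callers that already ask "should N positions be ignored?" do not need to change, which appears to be the intent of the `@note` in the hunk.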
@@ -119,15 +119,17 @@ module Cheripic
  end

  # Applies selection procedure on assembly contigs based on the ratio_type provided.
- # If only_frag_with_vars is set to true then contigs without any variant are discarded for :hme_score
+ # If use_all_contigs is set to false then contigs without any variant are discarded for :hme_score
  # while contigs without any hemisnps are discarded for :bfr_score
- # If filter_out_low_hmes is set to true then contigs are further filtered based on a cut off value of the score
+ # If include_low_hmes is set to false then contigs are further filtered based on a cut off value of the score
  # @param ratio_type [Symbol] ratio_type is either :hme_score or :bfr_score
  def select_contigs(ratio_type)
    selected_contigs ={}
-   only_frag_with_vars = Options.only_frag_with_vars
+   use_all_contigs = Options.use_all_contigs
    @assembly.each_key do | frag |
-     if only_frag_with_vars
+     if use_all_contigs
+       selected_contigs[frag] = @assembly[frag]
+     else
        if ratio_type == :hme_score
          # selecting fragments which have a variant
          if @assembly[frag].hm_num + @assembly[frag].ht_num > 2 * Options.hmes_adjust
@@ -139,15 +141,13 @@ module Cheripic
            selected_contigs[frag] = @assembly[frag]
          end
        end
-     else
-       selected_contigs[frag] = @assembly[frag]
      end
    end
    selected_contigs = filter_contigs(selected_contigs, ratio_type)
-   if only_frag_with_vars
-     logger.info "Selected #{selected_contigs.length} out of #{@assembly.length} fragments with #{ratio_type} score\n"
-   else
+   if use_all_contigs
      logger.info "No filtering was applied to fragments\n"
+   else
+     logger.info "Selected #{selected_contigs.length} out of #{@assembly.length} fragments with #{ratio_type} score\n"
    end
    selected_contigs
  end
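
Note on `select_contigs`: the inverted flag can be illustrated with a small standalone sketch. It covers only the `:hme_score` branch visible in the hunks above and skips the later `filter_contigs` step. `Contig` and `select_contigs_sketch` are hypothetical names, and the sketch assumes `hm_num` and `ht_num` already include the `hmes_adjust` pseudo-count, which is what the `> 2 * Options.hmes_adjust` comparison suggests.

```ruby
# Hypothetical contig record holding adjusted homozygous and heterozygous counts.
Contig = Struct.new(:hm_num, :ht_num)

# Keep every contig when use_all_contigs is true; otherwise keep only contigs
# whose adjusted counts exceed the "no variants" baseline of 2 * hmes_adjust.
def select_contigs_sketch(assembly, use_all_contigs:, hmes_adjust: 0.5)
  return assembly.dup if use_all_contigs

  assembly.select do |_id, contig|
    contig.hm_num + contig.ht_num > 2 * hmes_adjust
  end
end

assembly = {
  'contig_1' => Contig.new(0.5, 0.5), # adjustment only, no variants
  'contig_2' => Contig.new(2.5, 0.5)  # two homozygous variants plus adjustment
}

p select_contigs_sketch(assembly, use_all_contigs: false).keys # → ["contig_2"]
p select_contigs_sketch(assembly, use_all_contigs: true).keys  # → ["contig_1", "contig_2"]
```

With the 1.2.0 default of `use_all_contigs: false`, the behaviour matches the 1.1.0 default of `only_frag_with_vars: true`; only the polarity of the option name changed.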
@@ -171,11 +171,13 @@ module Cheripic
  # @param ratio_type [Symbol] ratio_type is either :hme_score or :bfr_score
  # @param selected_contigs [Hash] a hash of contigs with selected ratio_type, a subset of assembly hash
  def get_cutoff(selected_contigs, ratio_type)
-   filter_out_low_hmes = Options.filter_out_low_hmes
+   include_low_hmes = Options.include_low_hmes
    # set minimum cut off hme_score or bfr_score to pick fragments with variants
    # calculate min hme score for back or out crossed data or bfr_score for polypoidy data
    # if no filtering applied set cutoff to 1.1
-   if filter_out_low_hmes
+   if include_low_hmes
+     cutoff = 0.0
+   else
      if ratio_type == :hme_score
        adjust = Options.hmes_adjust
        if Options.cross_type == 'back'
@@ -186,8 +188,6 @@ module Cheripic
      else # ratio_type is bfr_score
        cutoff = bfr_cutoff(selected_contigs)
      end
-   else
-     cutoff = 0.0
    end
    cutoff
  end
@@ -2,6 +2,6 @@ module Cheripic

    # Sets the semantic version number for this module.
    # Version number will be used in help messages and for generating gem.
-   VERSION = '1.1.0'
+   VERSION = '1.2.0'

  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: cheripic
  version: !ruby/object:Gem::Version
-   version: 1.1.0
+   version: 1.2.0
  platform: ruby
  authors:
  - Shyam Rallapalli
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2016-08-08 00:00:00.000000000 Z
+ date: 2016-08-11 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: yell