cheripic 1.1.0 → 1.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 458f681424a73ea58acb8aefa73d68019ad0854d
- data.tar.gz: 23547939b1fead465d06d2f6d8e45ce4172b1cb1
+ metadata.gz: 958b4091f2c95903c3a43a13af7d75cbc7605813
+ data.tar.gz: 18b91af8e68553f4d1700dae921beb7e420f11ac
  SHA512:
- metadata.gz: 2e3af0df95197769c542b4aab76444a6b14842890b46a97d6be10101f267db5f5df7d1ed8d67083ac8890a866e1cab678a9b23c5dc03b1edb7b8fc2150b35097
- data.tar.gz: 9aa159df9086102679bd6359d4a5bf94dfe72f52d9c11e66259ffd40754f767001bb2a67e996b958018a009db0a6aa558c7ebe4001c2f6bce9b8993bdfd66091
+ metadata.gz: 7290c13e270aae1a777767179168353c5c55a035bfd6e82025d000414112425e77f59ad7a6fd0c736d2d2775182e17f978ffa6b67153201e9d316458a6360db6
+ data.tar.gz: 595be6e01fdc4e0d6185a86f79f207abb2dc6bf50a8763a67186339e639542294185656b204cde8d86cda2f3519f7ae3341443a978f9f019c51cfef294513694
data/.travis.yml CHANGED
@@ -1,4 +1,4 @@
  language: ruby
  rvm:
- - 2.2.1
- before_install: gem install bundler -v 1.10.6
+ - 2.1.5
+ before_install: gem install bundler -v 1.7.6
data/Gemfile CHANGED
@@ -1,4 +1,5 @@
  source 'https://rubygems.org'
+ ruby '2.1.5'

  # Specify your gem's dependencies in cheripic.gemspec
  gemspec
data/README.md CHANGED
@@ -15,6 +15,15 @@ Currently this gem is still in development and nearing complete working package.

  ## Installation

+ Cheripic is available both as a command line tool and as a gem.
+ Binaries are available for Linux 64bit and OSX.
+ Best way to use Cheripic is to download appropriate binary arhcive
+ unpack (`tar -xzf`) and add the unpacked directory to your `PATH`
+
+ Latest binaries are available to [download here](https://github.com/shyamrallapalli/cheripic/releases/tag/v1.1.0)
+
+
+ To install gem and use the gem in your development
  Add this line to your application's Gemfile:

  ```ruby
@@ -31,7 +40,72 @@ Or install it yourself as:

  ## Usage

- TODO: Write usage instructions here
+ Running `cheripic` without any input at command line interface shows following help options
+
+ ```
+
+ Cheripic v1.1.0
+ Authors: Shyam Rallapalli and Dan MacLean
+
+ Description: Candidate mutation and closely linked marker selection for non reference genomes
+ Uses bulk segregant data from non-reference sequence genomes
+
+ Inputs:
+ 1. Needs a reference fasta file of asssembly use for variant analysis
+ 2. Pileup files for mutant (phenotype of interest) bulks and background (wildtype phenotype) bulks
+ 3. If polyploid species, include of pileup from one or both parents
+
+ USAGE:
+ cheripic <options>
+
+ OPTIONS:
+ -f, --assembly=<s> Assembly file in FASTA format
+ -F, --input-format=<s> bulk and parent alignment file format types - set either pileup or bam (default: pileup)
+ -a, --mut-bulk=<s> Pileup or sorted BAM file alignments from mutant/trait of interest bulk 1
+ -b, --bg-bulk=<s> Pileup or sorted BAM file alignments from background/wildtype bulk 2
+ --output=<s> Directory to store results, will be created if not existing (default: cheripic_results)
+ --loglevel=<s> Choose any one of "info / warn / debug" level for logs generated (default: debug)
+ --hmes-adjust=<f> factor added to snp count of each contig to adjust for hme score calculations (default: 0.5)
+ --htlow=<f> lower level for categorizing heterozygosity (default: 0.2)
+ --hthigh=<f> high level for categorizing heterozygosity (default: 0.9)
+ --mindepth=<i> minimum read depth to conisder a position for variant calls (default: 6)
+ --min-non-ref-count=<i> minimum read depth supporting non reference base at each position (default: 3)
+ --min-indel-count-support=<i> minimum read depth supporting an indel at each position (default: 3)
+ --ignore-reference-n, --no-ignore-reference-n ignore variant calls at N (completely ambigous) bases in the reference (default: true)
+ -q, --mapping-quality=<i> minimum mapping quality of read covering the position (default: 20)
+ -Q, --base-quality=<i> minimum base quality of bases covering the position (default: 15)
+ --noise=<f> praportion of reads for a variant to conisder as noise (default: 0.1)
+ --cross-type=<s> type of cross used to generated mapping population - back or out (default: back)
+ --only-frag-with-vars, --no-only-frag-with-vars select only contigs containing variants for analysis (default: true)
+ --filter-out-low-hmes, --no-filter-out-low-hmes ignore variants from contigs with low hmescore or bfr to list in the final output (default: true)
+ --polyploidy Set if the data input is from polyploids
+ -p, --mut-parent=<s> Pileup or sorted BAM file alignments from mutant/trait of interest parent (default: )
+ -r, --bg-parent=<s> Pileup or sorted BAM file alignments from background/wildtype parent (default: )
+ --bfr-adjust=<f> factor added to hemi snp frequency of each parent to adjust for bfr calculations (default: 0.05)
+ --examples shows some example commands with explanation
+
+ ```
+
+
+
+ Example Commands
+
+
+ ```
+ EXAMPLE COMMANDS:
+ 1. cheripic -f assembly.fa -a mutbulk.pileup -b bgbulk.pileup --output=cheripic_output
+ 2. cheripic --assembly assembly.fa --mut-bulk mutbulk.pileup --bg-bulk bgbulk.pileup
+ --mut-parent mutparent.pileup --bg-parent bgparent.pileup --polyploidy true --output cheripic_results
+ 3. cheripic --assembly assembly.fa --mut-bulk mutbulk.pileup --bg-bulk bgbulk.pileup
+ --mut-parent mutparent.pileup --bg-parent bgparent.pileup --polyploidy true
+ --no-only-frag-with-vars --no-filter-out-low-hmes --output cheripic_results
+
+ ```
+
+
+ By default contigs with out a variant and thos contigs with lower scores are discarded.
+ so use options `--no-only-frag-with-vars` and `--no-filter-out-low-hmes` to disable them
+

  ## Development

data/Rakefile CHANGED
@@ -1,10 +1,78 @@
- require "bundler/gem_tasks"
- require "rake/testtask"
+ require 'bundler/gem_tasks'
+ require 'rake/testtask'
+ # For Bundler.with_clean_env
+ require 'bundler/setup'

  Rake::TestTask.new(:test) do |t|
- t.libs << "test"
- t.libs << "lib"
+ t.libs << 'test'
+ t.libs << 'lib'
  t.test_files = FileList['test/**/*_test.rb']
  end

  task :default => :test
+
+
+ # for packaging
+
+ PACKAGE_NAME = 'cheripic'
+ VERSION = `bundle exec bin/cheripic -v`.chomp
+ TRAVELING_RUBY_VERSION = '20150210-2.1.5'
+
+ # pre-downloaded travelling ruby from following links and placed them in 'packaging' dirctory
+ # http://d6r77u77i8pq3.cloudfront.net/releases/traveling-ruby-20150210-2.1.5-linux-x86_64.tar.gz
+ # http://d6r77u77i8pq3.cloudfront.net/releases/traveling-ruby-20150210-2.1.5-osx.tar.gz
+
+ desc 'Package your app'
+ task :package => ['package:linux:x86_64', 'package:osx']
+
+ namespace :package do
+
+ namespace :linux do
+ desc 'Package your app for Linux x86_64'
+ task :x86_64 => [:bundle_install, "packaging/traveling-ruby-#{TRAVELING_RUBY_VERSION}-linux-x86_64.tar.gz"] do
+ create_package('linux-x86_64')
+ end
+ end
+
+ desc 'Package your app for OS X'
+ task :osx => [:bundle_install, "packaging/traveling-ruby-#{TRAVELING_RUBY_VERSION}-osx.tar.gz"] do
+ create_package('osx')
+ end
+
+ desc 'Install gems to local directory'
+ task :bundle_install do
+ if RUBY_VERSION !~ /^2\.1\./
+ abort "You can only 'bundle install' using Ruby 2.1, because that's what Traveling Ruby uses."
+ end
+ sh 'rm -rf packaging/tmp'
+ sh 'mkdir packaging/tmp'
+ sh 'cp Gemfile.lock packaging/tmp/'
+ sh 'cp packaging/Gemfile packaging/tmp/'
+ Bundler.with_clean_env do
+ sh 'env BUNDLE_IGNORE_CONFIG=1 bundle install --path packaging/vendor --without development'
+ end
+ sh 'rm -rf packaging/tmp'
+ sh 'rm -f packaging/vendor/*/*/cache/*'
+ end
+ end
+
+ def create_package(target)
+ package_dest = "#{PACKAGE_NAME}-#{VERSION}-#{target}"
+ package_dir = "packaging/#{package_dest}"
+ sh "rm -rf #{package_dir}"
+ sh "mkdir #{package_dir}"
+ sh "mkdir -p #{package_dir}/lib/app"
+ sh "cp -R bin #{package_dir}/lib/app/"
+ sh "cp -R lib #{package_dir}/lib/app/"
+ sh "mkdir #{package_dir}/lib/app/ruby"
+ sh "tar -xzf packaging/traveling-ruby-#{TRAVELING_RUBY_VERSION}-#{target}.tar.gz -C #{package_dir}/lib/app/ruby"
+ sh "cp packaging/wrapper.sh #{package_dir}/cheripic"
+ sh "cp -pR packaging/vendor/ruby/2.1.0 #{package_dir}/lib/app/ruby/"
+ sh "cp packaging/cheripic.gemspec Gemfile Gemfile.lock LICENSE.txt #{package_dir}/lib/app/"
+ sh "mkdir #{package_dir}/lib/app/.bundle"
+ sh "cp packaging/bundler-config #{package_dir}/lib/app/.bundle/config"
+ # if !ENV['DIR_ONLY']
+ # sh "tar -czf #{package_dir}.tar.gz #{package_dir}"
+ # sh "rm -rf #{package_dir}"
+ # end
+ end
data/lib/cheripic/cmd.rb CHANGED
@@ -40,6 +40,7 @@ module Cheripic
  def argument_parser
  cmds = self
  Trollop::Parser.new do
+ version Cheripic::VERSION
  banner cmds.help_message
  opt :assembly, 'Assembly file in FASTA format',
  :short => '-f',
@@ -76,9 +77,9 @@ module Cheripic
  opt :min_indel_count_support, 'minimum read depth supporting an indel at each position',
  :type => Integer,
  :default => 3
- opt :ignore_reference_n, 'ignore variant calls at N (completely ambigous) bases in the reference',
+ opt :ambiguous_ref_bases, 'including variant at completely ambiguous bases in the reference',
  :type => FalseClass,
- :default => true
+ :default => false
  opt :mapping_quality, 'minimum mapping quality of read covering the position',
  :short => '-q',
  :type => Integer,
@@ -93,12 +94,12 @@ module Cheripic
  opt :cross_type, 'type of cross used to generated mapping population - back or out',
  :type => String,
  :default => 'back'
- opt :only_frag_with_vars, 'select only contigs containing variants for analysis',
+ opt :use_all_contigs, 'option to select all contigs or only contigs containing variants for analysis',
  :type => FalseClass,
- :default => true
- opt :filter_out_low_hmes, 'ignore variants from contigs with low hmescore or bfr to list in the final output',
+ :default => false
+ opt :include_low_hmes, 'option to include or discard variants from contigs with low hme-score or bfr score to list in the final output',
  :type => FalseClass,
- :default => true
+ :default => false
  opt :polyploidy, 'Set if the data input is from polyploids',
  :type => FalseClass,
  :default => false
@@ -113,6 +114,9 @@ module Cheripic
  opt :bfr_adjust, 'factor added to hemi snp frequency of each parent to adjust for bfr calculations',
  :type => Float,
  :default => 0.05
+ opt :sel_seq_len, 'sequence length to print from either side of selected variants',
+ :type => Integer,
+ :default => 50
  opt :examples, 'shows some example commands with explanation'
  end
  end
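The new `sel_seq_len` option is described only as the "sequence length to print from either side of selected variants". A minimal sketch of what such flanking extraction could look like, assuming 1-based variant positions and clipping at the contig ends. `flanking_sequence` is a hypothetical helper for illustration, not a method taken from the gem:

```ruby
# Hypothetical helper: returns the variant base plus up to `flank`
# bases on either side. `pos` is assumed to be 1-based.
def flanking_sequence(seq, pos, flank = 50)
  index = pos - 1
  start = [index - flank, 0].max               # clip at the contig start
  stop  = [index + flank, seq.length - 1].min  # clip at the contig end
  seq[start..stop]
end

contig = 'ACGTACGTAC'
puts flanking_sequence(contig, 5, 2)  # two bases either side of position 5
puts flanking_sequence(contig, 1, 2)  # left flank is clipped at the start
```

With the default of 50, a variant in the middle of a long contig would yield a 101-base window under these assumptions.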
@@ -148,7 +152,12 @@ module Cheripic
  Cheripic v#{Cheripic::VERSION.dup}

  EXAMPLE COMMANDS:
-
+ 1. cheripic -f assembly.fa -a mutbulk.pileup -b bgbulk.pileup --output=cheripic_output
+ 2. cheripic --assembly assembly.fa --mut-bulk mutbulk.pileup --bg-bulk bgbulk.pileup
+ --mut-parent mutparent.pileup --bg-parent bgparent.pileup --polyploidy true --output cheripic_results
+ 3. cheripic --assembly assembly.fa --mut-bulk mutbulk.pileup --bg-bulk bgbulk.pileup
+ --mut-parent mutparent.pileup --bg-parent bgparent.pileup --polyploidy true
+ --no-only-frag-with-vars --no-filter-out-low-hmes --output cheripic_results
  EOS
  puts msg.split("\n").map{ |line| line.lstrip }.join("\n")
  exit(0)
@@ -131,12 +131,14 @@ module Cheripic
  # @return [Symbol] variant mode of the background bulk (:hom or :het) at the position
  def bg_bulk_var(pos)
  bg_base_hash = @bg_bulk[pos].var_base_frac
+ bg_base_hash.delete(:ref)
+ return nil if bg_base_hash.empty?
  if bg_base_hash.length > 1
  # taking only var mode
  var_mode(bg_base_hash.values.max)
  else
  # taking only var mode
- var_mode(bg_base_hash[0])
+ var_mode(bg_base_hash[bg_base_hash.keys[0]])
  end
  end

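The lookup change in `bg_bulk_var` matters because `Hash#[]` takes a key, not an index. The sketch below illustrates the difference with a made-up hash; the key names and fractions are assumptions standing in for whatever `var_base_frac` actually returns:

```ruby
# Hypothetical stand-in for the hash returned by var_base_frac:
# keys are bases (plus :ref), values are read fractions.
bg_base_hash = { ref: 0.55, A: 0.45 }

# New guard from the hunk: drop the reference entry first.
bg_base_hash.delete(:ref)
only_ref = bg_base_hash.empty?

# Old lookup: 0 is treated as a key, so nothing is found.
old_value = bg_base_hash[0]

# New lookup: read the value stored under the first remaining key.
new_value = bg_base_hash[bg_base_hash.keys[0]]

puts only_ref.inspect   # false
puts old_value.inspect  # nil
puts new_value.inspect  # 0.45
```

Under these assumptions the old single-entry branch passed `nil` to `var_mode`, while the new one passes the actual fraction.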
@@ -36,13 +36,13 @@ module Cheripic
  mindepth
  min_non_ref_count
  min_indel_count_support
- ignore_reference_n
+ ambiguous_ref_bases
  mapping_quality
  base_quality
  noise
  cross_type
- only_frag_with_vars
- filter_out_low_hmes
+ use_all_contigs
+ include_low_hmes
  polyploidy
  bfr_adjust}
  settings = inputs.select { |k| set2.include?(k) }
@@ -14,13 +14,13 @@ module Cheripic
  :mindepth => 6,
  :min_non_ref_count => 3,
  :min_indel_count_support => 3,
- :ignore_reference_n => true,
+ :ambiguous_ref_bases => false,
  :mapping_quality => 20,
  :base_quality => 15,
  :noise => 0.1,
  :cross_type => 'back',
- :only_frag_with_vars => true,
- :filter_out_low_hmes => true,
+ :use_all_contigs => false,
+ :include_low_hmes => false,
  :polyploidy => false,
  :bfr_adjust => 0.05,
  :sel_seq_len => 50
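Each of the three renamed settings flips meaning as well as name: a default of `true` under the old name becomes `false` under the new one. A minimal sketch of that mapping, assuming a settings hash keyed by symbols as in the defaults above. `RENAMED_FLAGS` and `migrate_settings` are hypothetical names for illustration and do not exist in the gem:

```ruby
# Hypothetical helper, not part of the gem: maps 1.1.0 setting names to
# their 1.2.0 replacements. Every renamed flag has the opposite sense,
# so its boolean value is negated; other settings pass through as-is.
RENAMED_FLAGS = {
  ignore_reference_n:  :ambiguous_ref_bases,
  only_frag_with_vars: :use_all_contigs,
  filter_out_low_hmes: :include_low_hmes
}.freeze

def migrate_settings(old_settings)
  old_settings.each_with_object({}) do |(key, value), migrated|
    if RENAMED_FLAGS.key?(key)
      migrated[RENAMED_FLAGS[key]] = !value
    else
      migrated[key] = value
    end
  end
end

old_defaults = { ignore_reference_n: true, only_frag_with_vars: true,
                 filter_out_low_hmes: true, mindepth: 6 }
p migrate_settings(old_defaults)
```

Anyone scripting against the 1.1.0 option names would need an equivalent translation, since the old names are removed from the parser in this release.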
@@ -66,9 +66,10 @@ module Cheripic
  end

  # Option to whether to ignore or consider the reference positions which are ambiguous
+ # @note switching option name here so Pileup options are same
  # @return [Boolean]
  def self.ignore_reference_n
- @user_settings[:ignore_reference_n]
+ @user_settings[:ambiguous_ref_bases] ? false : true
  end

  # Minimum alignment mapping quality of the read to be used for bam files
@@ -98,14 +99,14 @@ module Cheripic

  # Option to whether to ignore or consider the contigs with out any variants
  # @return [Boolean]
- def self.only_frag_with_vars
- @user_settings[:only_frag_with_vars]
+ def self.use_all_contigs
+ @user_settings[:use_all_contigs]
  end

  # Option to whether to ignore or consider the contigs with low HME score
  # @return [Boolean]
- def self.filter_out_low_hmes
- @user_settings[:filter_out_low_hmes]
+ def self.include_low_hmes
+ @user_settings[:include_low_hmes]
  end

  # Option to whether to set the input data is from polyploid or not
@@ -119,15 +119,17 @@ module Cheripic
  end

  # Applies selection procedure on assembly contigs based on the ratio_type provided.
- # If only_frag_with_vars is set to true then contigs without any variant are discarded for :hme_score
+ # If use_all_contigs is set to false then contigs without any variant are discarded for :hme_score
  # while contigs without any hemisnps are discarded for :bfr_score
- # If filter_out_low_hmes is set to true then contigs are further filtered based on a cut off value of the score
+ # If include_low_hmes is set to false then contigs are further filtered based on a cut off value of the score
  # @param ratio_type [Symbol] ratio_type is either :hme_score or :bfr_score
  def select_contigs(ratio_type)
  selected_contigs ={}
- only_frag_with_vars = Options.only_frag_with_vars
+ use_all_contigs = Options.use_all_contigs
  @assembly.each_key do | frag |
- if only_frag_with_vars
+ if use_all_contigs
+ selected_contigs[frag] = @assembly[frag]
+ else
  if ratio_type == :hme_score
  # selecting fragments which have a variant
  if @assembly[frag].hm_num + @assembly[frag].ht_num > 2 * Options.hmes_adjust
@@ -139,15 +141,13 @@ module Cheripic
  selected_contigs[frag] = @assembly[frag]
  end
  end
- else
- selected_contigs[frag] = @assembly[frag]
  end
  end
  selected_contigs = filter_contigs(selected_contigs, ratio_type)
- if only_frag_with_vars
- logger.info "Selected #{selected_contigs.length} out of #{@assembly.length} fragments with #{ratio_type} score\n"
- else
+ if use_all_contigs
  logger.info "No filtering was applied to fragments\n"
+ else
+ logger.info "Selected #{selected_contigs.length} out of #{@assembly.length} fragments with #{ratio_type} score\n"
  end
  selected_contigs
  end
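The first step of the reworked `select_contigs` can be sketched in isolation for the `:hme_score` case. This is a simplified illustration, not the gem's code: `Contig` and `select_contigs_sketch` are hypothetical, the `:bfr_score` branch and the later `filter_contigs` call are left out, and it assumes (as the `2 * Options.hmes_adjust` comparison suggests) that the stored `hm_num` and `ht_num` counts each already include the adjustment factor:

```ruby
# Hypothetical stand-in for an assembly entry; only the two counts used
# by the :hme_score comparison are modelled.
Contig = Struct.new(:hm_num, :ht_num)

def select_contigs_sketch(assembly, use_all_contigs:, hmes_adjust: 0.5)
  # With use_all_contigs set, every contig is kept unchanged.
  return assembly.dup if use_all_contigs

  # Otherwise keep only contigs whose combined counts exceed the
  # baseline contributed by the adjustment alone.
  assembly.select do |_id, contig|
    contig.hm_num + contig.ht_num > 2 * hmes_adjust
  end
end

assembly = {
  'contig_1' => Contig.new(0.5, 0.5),  # adjustment only, no variants
  'contig_2' => Contig.new(2.5, 1.5)   # carries variants
}

p select_contigs_sketch(assembly, use_all_contigs: false).keys
p select_contigs_sketch(assembly, use_all_contigs: true).keys
```

The early return mirrors the new ordering of the branches in the hunk, where the "keep everything" case now comes first.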
@@ -171,11 +171,13 @@ module Cheripic
  # @param ratio_type [Symbol] ratio_type is either :hme_score or :bfr_score
  # @param selected_contigs [Hash] a hash of contigs with selected ratio_type, a subset of assembly hash
  def get_cutoff(selected_contigs, ratio_type)
- filter_out_low_hmes = Options.filter_out_low_hmes
+ include_low_hmes = Options.include_low_hmes
  # set minimum cut off hme_score or bfr_score to pick fragments with variants
  # calculate min hme score for back or out crossed data or bfr_score for polypoidy data
  # if no filtering applied set cutoff to 1.1
- if filter_out_low_hmes
+ if include_low_hmes
+ cutoff = 0.0
+ else
  if ratio_type == :hme_score
  adjust = Options.hmes_adjust
  if Options.cross_type == 'back'
@@ -186,8 +188,6 @@ module Cheripic
  else # ratio_type is bfr_score
  cutoff = bfr_cutoff(selected_contigs)
  end
- else
- cutoff = 0.0
  end
  cutoff
  end
@@ -2,6 +2,6 @@ module Cheripic

  # Sets the semantic version number for this module.
  # Version number will be used in help messages and for generating gem.
- VERSION = '1.1.0'
+ VERSION = '1.2.0'

  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: cheripic
  version: !ruby/object:Gem::Version
- version: 1.1.0
+ version: 1.2.0
  platform: ruby
  authors:
  - Shyam Rallapalli
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2016-08-08 00:00:00.000000000 Z
+ date: 2016-08-11 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: yell