bio-polyploid-tools 0.7.0 → 0.7.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 330ed2e80e9d4bc4146e88f1705633cde7554dc6
- data.tar.gz: 123b1efc4cb49358be17f1913ed68af241b8232b
+ metadata.gz: b5cbfc6ce3619d8af0f5b7c9780227b2d8730e6c
+ data.tar.gz: 430ebabf4af13f68b00cf90d2018fd144a2a07d2
  SHA512:
- metadata.gz: f5eec1a30cdb7365d1bccae60e0c5b89b4939dbae7b928d0d30d1bc88b4b2f193b710894120d756f252c2b76d93b7c3bcf700e9b729973f15481325978fcf049
- data.tar.gz: e84c2c06bd35049fd90c7b4a8a7a475a59030e757ee0a7506c68ab9aa4419818c3ddc389a820f9de35b4c839d81b8efd3a25c3ff284f8e4d8940ca11b8de71c9
+ metadata.gz: c01c1705ec3af7ae6253b7cc870f455e4f02e27e1d909f54b877ec3d291064deca272c14bab162990c8c57b180ae779f1c66bd9f53b0a38ad6ecba65cef96df6
+ data.tar.gz: cdbdf9f70e044f581e80058373d43d365767066e63f0155196a8ddd03daeabad78bba2612c52f918d48707a3369c6d9d6c9bd85fd90254f419841da239b6705f
data/README.md CHANGED
@@ -21,8 +21,8 @@ The code has been developed on ruby 2.1.0, but it should work on 1.9.3 and above
  #PolyMarker

  To run PolyMarker with the CSS wheat contigs, you need to unzip the
- (reference file)[ftp://ftp.ensemblgenomes.org/pub/release-25/plants/fasta/triticum_aestivum/dna/Triticum_aestivum.IWGSC2.25.dna.genome.fa.gz
- ].
+ [reference file](ftp://ftp.ensemblgenomes.org/pub/release-25/plants/fasta/triticum_aestivum/dna/Triticum_aestivum.IWGSC2.25.dna.genome.fa.gz
+ ).

@@ -30,14 +30,63 @@ To run PolyMarker with the CSS wheat contigs, you need to unzip the
  polymarker.rb --contigs Triticum_aestivum.IWGSC2.25.dna.genome.fa --marker_list snp_list.csv --output output_folder
  ```

- The snp_list file must follow the convention
- <ID>,<Chromosome>,<SEQUENCE>
+ The ```snp_list``` file must follow the convention
+ ```<ID>,<Chromosome>,<SEQUENCE>```
  with the SNP inside the sequence in the format [A/T]. As a reference, look at test/data/short_primer_design_test.csv

  If you want to use the web interface, visit the [PolyMarker webservice at TGAC](http://polymarker.tgac.ac.uk)

+ The available command line arguments are:
+
+ ```
+ Usage: polymarker.rb [options]
+ -c, --contigs FILE File with contigs to use as database
+ -m, --marker_list FILE File with the list of markers to search from
+ -g, --genomes_count INT Number of genomes (default 3, for hexaploid)
+ -s, --snp_list FILE File with the list of snps to search from, requires --reference to get the sequence using a position
+ -t, --mutant_list FILE File with the list of positions with mutation and the mutation line.
+ requires --reference to get the sequence using a position
+ -r, --reference FILE Fasta file with the sequence for the markers (to complement --snp_list)
+ -o, --output FOLDER Output folder
+ -e, --exonerate_model MODEL Model to be used in exonerate to search for the contigs
+ -a, --arm_selection arm_selection_embl|arm_selection_morex|arm_selection_first_two
+ Function to decide the chromome arm
+ -p, --primer_3_preferences FILE file with preferences to be sent to primer3
+ -v, --variation_free_region INT If present, avoid generating the common primer if there are homoeologous SNPs within the specified distance (not tested)
+ -x, --extract_found_contigs If present, save in a separate file the contigs with matches. Useful to debug.
+ -P, --primers_to_order If present, saves a file named primers_to_order which contains the KASP tails
+ ```
+
+ ###Custom reference sequences.
+ By default, the contigs and pseudomolecules from [ensembl](ftp://ftp.ensemblgenomes.org/pub/release-25/plants/fasta/triticum_aestivum/dna/Triticum_aestivum.IWGSC2.25.dna.genome.fa.gz
+ ) are used. However, it is possible to use a custom reference. The argument ```arm_selection``` defines the chromosome to which each contig belongs. The default uses ids like ```IWGSC_CSS_1AL_scaff_110```, where the third field, separated by underscores, is used. A simple way to add custom references is to rename the entries in the fasta file to follow that convention. Another way is to use the option ```--arm_selection arm_selection_first_two```, where only the first two characters of each contig name are used as the identifier; this is useful when pseudomolecules are named after the chromosomes (e.g. ">1A" in the fasta file).
+
+ If your contigs follow a different convention, you can define a new parser in ```polymarker.rb``` by adding a new lambda at the beginning, alongside the existing parsers, like:
+
+ ```rb
+ arm_selection_functions[:arm_selection_embl] = lambda do | contig_name|
+ arr = contig_name.split('_')
+ ret = "U"
+ ret = arr[2][0,2] if arr.size >= 3
+ ret = "3B" if arr.size == 2 and arr[0] == "v443"
+ ret = arr[0][0,2] if arr.size == 1
+ return ret
+ end
+ ```
+
+ The function should return a 2 character string, where the first character is the chromosome number and the second the chromosome group. The symbol in the hash is the name to be used in the argument ```--arm_selection```. If you want your parser to be added to the distribution, feel free to fork and make a pull request.
+
+
+
  ##Release Notes

+ ###0.7.1
+ * BUGFIX: The parser for ```arm_selection_embl``` now works with a mixture of contigs and pseudomolecules.
+ * DOC: Added documentation on how to use custom references.
+
+ ###0.7.0
+ * Added flag ```genomes_count``` for number of genomes, to be used on tetraploids, etc.
+
  ###0.6.1

@@ -49,10 +98,10 @@ If you want to use the web interface, visit the [PolyMarker webservice at TGAC](

  * If the SNP is in a gap in the alignment to the chromosomes, it is ignored.

- BUG: Blocks with NNNs are picked and treated as semi-specific.
- BUG: If the name of the reference has a space, the ID is not chopped. ">gene_1 (G12A)" should be treated as ">gene_1".
- TODO: Add a parameter file to configure the alignments.
- TODO: Produce primers for products of different sizes
+ * BUG: Blocks with NNNs are picked and treated as semi-specific.
+ * BUG: If the name of the reference has a space, the ID is not chopped. ">gene_1 (G12A)" should be treated as ">gene_1".
+ * TODO: Add a parameter file to configure the alignments.
+ * TODO: Produce primers for products of different sizes. This can probably be done with the primer_3_preferences option, but hasn't been tested.

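As an illustration of the custom-parser mechanism the README describes, here is a minimal sketch of a parser for a different naming convention. The contig names (`chr1A_scaffold_12`), the key `:arm_selection_chr_prefix` and the constant `CHR_PREFIX_PARSER` are hypothetical, and the hash is re-created locally as a stand-in for the one `polymarker.rb` defines, which is assumed here to be a plain Hash. The `"U"` fallback mirrors what the shipped `arm_selection_embl` parser returns for names it cannot place.

```ruby
# Stand-in for the hash that polymarker.rb already defines (assumed to be a plain Hash).
arm_selection_functions = {}

# Hypothetical parser for contig names such as "chr1A_scaffold_12":
# take the two characters that follow the "chr" prefix, or "U" if the name does not match.
CHR_PREFIX_PARSER = lambda do |contig_name|
  match = contig_name.match(/\Achr(\d[A-Z])/)
  match ? match[1] : "U"
end

# The symbol used as the key is the value that would be passed to --arm_selection.
arm_selection_functions[:arm_selection_chr_prefix] = CHR_PREFIX_PARSER

parser = arm_selection_functions[:arm_selection_chr_prefix]
puts parser.call("chr1A_scaffold_12")
puts parser.call("unplaced_scaffold_7")
```

With a parser registered this way, the run would presumably be started with `--arm_selection arm_selection_chr_prefix`, following the README's statement that the hash symbol is the name used in the argument.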
data/VERSION CHANGED
@@ -1 +1 @@
- 0.7.0
+ 0.7.1
data/bin/polymarker.rb CHANGED
@@ -17,9 +17,18 @@ arm_selection_functions[:arm_selection_first_two] = lambda do | contig_name |
  ret = contig_name[0,2]
  return ret
  end
- #Function to parse stuff like: IWGSC_CSS_1AL_scaff_110
+
+ #Function to parse stuff like: "IWGSC_CSS_1AL_scaff_110"
+ #Or the first two characters in the contig name, to deal with
+ #pseudomolecules that start with headers like: "1A"
+ #And with the cases when 3B is named with the prefix: v443
  arm_selection_functions[:arm_selection_embl] = lambda do | contig_name|
- ret = contig_name.split('_')[2][0,2]
+
+ arr = contig_name.split('_')
+ ret = "U"
+ ret = arr[2][0,2] if arr.size >= 3
+ ret = "3B" if arr.size == 2 and arr[0] == "v443"
+ ret = arr[0][0,2] if arr.size == 1
  return ret
  end

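The effect of this hunk can be checked in isolation. The sketch below copies the 0.7.0 and 0.7.1 parser bodies from the diff into two standalone lambdas and runs both on the three kinds of names the new comments mention. The constant names `OLD_EMBL` and `NEW_EMBL` and the sample name `v443_0001` are made up for the illustration; only the `v443` prefix comes from the source.

```ruby
# Parser body as shipped in 0.7.0 (from the removed line in the hunk).
OLD_EMBL = lambda do |contig_name|
  contig_name.split('_')[2][0, 2]
end

# Parser body as shipped in 0.7.1 (from the added lines in the hunk).
NEW_EMBL = lambda do |contig_name|
  arr = contig_name.split('_')
  ret = "U"                                         # fallback for names that match no rule
  ret = arr[2][0, 2] if arr.size >= 3               # e.g. IWGSC_CSS_1AL_scaff_110
  ret = "3B" if arr.size == 2 and arr[0] == "v443"  # 3B named with the v443 prefix
  ret = arr[0][0, 2] if arr.size == 1               # pseudomolecule headers such as 1A
  ret
end

["IWGSC_CSS_1AL_scaff_110", "1A", "v443_0001"].each do |name|
  old = begin
    OLD_EMBL.call(name)
  rescue NoMethodError
    # With fewer than three underscore-separated fields, [2] is nil and nil[0, 2] raises.
    "error"
  end
  puts "#{name}: 0.7.0 -> #{old}, 0.7.1 -> #{NEW_EMBL.call(name)}"
end
```

In the old version, any name with fewer than three underscore-separated fields raises `NoMethodError`, which is consistent with the 0.7.1 release note about mixing contigs and pseudomolecules.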
@@ -2,16 +2,16 @@
  # DO NOT EDIT THIS FILE DIRECTLY
  # Instead, edit Jeweler::Tasks in Rakefile, and run 'rake gemspec'
  # -*- encoding: utf-8 -*-
- # stub: bio-polyploid-tools 0.7.0 ruby lib
+ # stub: bio-polyploid-tools 0.7.1 ruby lib

  Gem::Specification.new do |s|
  s.name = "bio-polyploid-tools"
- s.version = "0.7.0"
+ s.version = "0.7.1"

  s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
  s.require_paths = ["lib"]
  s.authors = ["Ricardo H. Ramirez-Gonzalez"]
- s.date = "2015-05-08"
+ s.date = "2015-05-28"
  s.description = "Repository of tools developed in TGAC and Crop Genetics in JIC to work with polyploid wheat"
  s.email = "ricardo.ramirez-gonzalez@tgac.ac.uk"
  s.executables = ["bfr.rb", "count_variations.rb", "filter_blat_by_target_coverage.rb", "filter_exonerate_by_identity.rb", "find_best_blat_hit.rb", "find_best_exonerate.rb", "hexaploid_primers.rb", "homokaryot_primers.rb", "map_markers_to_contigs.rb", "markers_in_region.rb", "polymarker.rb", "snp_position_to_polymarker.rb", "snps_between_bams.rb"]
@@ -129,7 +129,7 @@ Gem::Specification.new do |s|
  ]
  s.homepage = "http://github.com/tgac/bioruby-polyploid-tools"
  s.licenses = ["MIT"]
- s.rubygems_version = "2.2.1"
+ s.rubygems_version = "2.4.7"
  s.summary = "Tool to work with polyploids, NGS and molecular biology"

  if s.respond_to? :specification_version then
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: bio-polyploid-tools
  version: !ruby/object:Gem::Version
- version: 0.7.0
+ version: 0.7.1
  platform: ruby
  authors:
  - Ricardo H. Ramirez-Gonzalez
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2015-05-08 00:00:00.000000000 Z
+ date: 2015-05-28 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: bio
@@ -228,7 +228,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  version: '0'
  requirements: []
  rubyforge_project:
- rubygems_version: 2.2.1
+ rubygems_version: 2.4.7
  signing_key:
  specification_version: 4
  summary: Tool to work with polyploids, NGS and molecular biology