snp-search 0.4.0 → 0.5.0
- data/README.rdoc +26 -19
- data/VERSION +1 -1
- data/bin/snp-search +16 -10
- data/lib/snp-search.rb +2 -2
- data/snp-search.gemspec +2 -2
- metadata +17 -17
data/README.rdoc
CHANGED
@@ -1,6 +1,6 @@
|
|
1
1
|
= snp-search
|
2
2
|
|
3
|
-
snp-search is a set of tools that manages SNP data and allows for data importing, manipulating, editing and complex querying of SNP data. It can be used to evaluate the utility of SNPs for the assessment of genetic diversity between strains and the management of genotype and phenotype data. Once a query is performed, SNPsearch can be used to convert the selected SNP data into FASTA sequences. SNPsearch is particularly useful in the analysis of phylogenetic trees that are based on SNP differences across whole core genomes. Queries can be made to answer critical genomic questions such as the association of SNPs with particular phenotypes.
|
3
|
+
snp-search is a set of tools that manages SNP data and allows for data importing, manipulating, editing and complex querying of SNP data. It can be used to evaluate the utility of SNPs for the assessment of genetic diversity between haploid strains and the management of genotype and phenotype data. Once a query is performed, SNPsearch can be used to convert the selected SNP data into FASTA sequences. SNPsearch is particularly useful in the analysis of phylogenetic trees that are based on SNP differences across whole core genomes. Queries can be made to answer critical genomic questions such as the association of SNPs with particular phenotypes.
|
4
4
|
|
5
5
|
== Obtaining and installing the code
|
6
6
|
snp-search is written in Ruby and operates in a Unix environment. It is made available as a gem. See the GitHub site for more information (https://github.com/hpa-bioinformatics/snp-search).
|
@@ -10,27 +10,34 @@ To install snp-search, do
|
|
10
10
|
|
11
11
|
== Requirements
|
12
12
|
|
13
|
-
*
|
14
|
-
gem install activerecord
|
15
|
-
|
16
|
-
* SQLite3: The SQL engine used is sqlite3. Most Linux operating systems come with sqlite3. However if you do not have sqlite then you may download it from http://www.sqlite.org/download.html. The installation instructions are available in the download page.
|
17
|
-
|
18
|
-
* bio gem.
|
13
|
+
* Nothing! You just need to run this in unix and it will install all the necessary gems for you.
|
19
14
|
|
20
15
|
== Running snp-search
|
21
16
|
|
22
|
-
|
23
|
-
|
24
|
-
|
25
|
-
|
26
|
-
|
27
|
-
|
28
|
-
|
29
|
-
|
30
|
-
|
31
|
-
|
32
|
-
|
33
|
-
*
|
17
|
+
* To run snp-search, you need to have 3 files:
|
18
|
+
1- Variant Call Format (.vcf) file (which contains the SNP information)
|
19
|
+
2- The reference genome that you used to generate your .vcf file (in GenBank or EMBL format; the script detects the format automatically).
|
20
|
+
3- A text file with a list of your strain/sample names. These should be the same strains/samples used in generating the .vcf file. In the text file, each strain/sample name should be on its own line, e.g.
|
21
|
+
|
22
|
+
strain1
|
23
|
+
strain2
|
24
|
+
strain3
|
25
|
+
strain4
|
26
|
+
etc.
|
27
|
+
|
28
|
+
* Once you have these files ready, you may run snp-search with the following options:
|
29
|
+
|
30
|
+
-V Enable verbose mode
|
31
|
+
-n Name of your database. Optional, default = snp_db.sqlite3
|
32
|
+
-v .vcf file. Required
|
33
|
+
-r Reference genome file (The same file that was used in generating the .vcf file). This should be in genbank or embl format. Required
|
34
|
+
-s Text file that contains a list of the strain/sample names (The same strains/samples used in generating the .vcf file). Required
|
35
|
+
-c SNP quality cutoff. A phred-scaled quality score. High quality scores indicate high confidence calls. Optional, default = 90
|
36
|
+
-t Genotype quality cutoff. A phred-scaled score for the probability that the genotype call is wrong, given that the site is variant. Optional, default = 30
|
37
|
+
-h help message
|
38
|
+
|
39
|
+
* Usage:
|
40
|
+
snp-search -n my_snp_db.sqlite3 -r my_ref.gbk -v my_vcf_file.vcf -s my_list_of_strains.txt
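To give a feel for what the -c and -t defaults mean, here is a minimal sketch of the standard phred relationship, Q = -10 * log10(P). The helper name phred_to_error_probability is hypothetical and is not part of snp-search; how the tool itself compares a score against the cutoff is not shown in this diff.

```ruby
# Hypothetical helper, not part of snp-search.
# Standard phred definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10).
def phred_to_error_probability(quality)
  10.0**(-quality / 10.0)
end

# Error probabilities implied by the documented defaults.
puts "genotype quality 30 -> #{phred_to_error_probability(30)}"
puts "SNP quality 90      -> #{phred_to_error_probability(90)}"
```

Under this definition a genotype quality of 30 corresponds to roughly a 1 in 1,000 chance that the call is wrong, and a SNP quality of 90 to roughly 1 in 1,000,000,000.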
|
34
41
|
|
35
42
|
== Copyright
|
36
43
|
|
data/VERSION
CHANGED
@@ -1 +1 @@
|
|
1
|
-
0.
|
1
|
+
0.5.0
|
data/bin/snp-search
CHANGED
@@ -10,23 +10,26 @@ opts = Slop.new :help do
|
|
10
10
|
banner "ruby snp-search [OPTIONS]"
|
11
11
|
|
12
12
|
on :V, :verbose, 'Enable verbose mode'
|
13
|
-
on :n, :name=, 'Name of database',
|
14
|
-
on :r, :reference_file=, '
|
15
|
-
on :v, :vcf_file=, '
|
16
|
-
on :s, :strain
|
17
|
-
on :c, :cuttoff_snp=, '
|
18
|
-
on :t, :cuttoff_genotype=, 'cuttoff
|
13
|
+
on :n, :name=, 'Name of database', :default => 'snp_db.sqlite3'
|
14
|
+
on :r, :reference_file=, 'Reference genome file, in gbk or embl file format', true
|
15
|
+
on :v, :vcf_file=, '.vcf file', true
|
16
|
+
on :s, :strain=, 'text file with a list of strains/samples', true
|
17
|
+
on :c, :cuttoff_snp=, 'SNP quality cuttoff', :default => 90
|
18
|
+
on :t, :cuttoff_genotype=, 'Genotype quality cuttoff', :default => 30
|
19
|
+
|
19
20
|
on_empty do
|
20
21
|
puts help
|
21
22
|
end
|
22
23
|
end
|
23
|
-
|
24
24
|
opts.parse
|
25
25
|
|
26
|
+
|
27
|
+
begin
|
26
28
|
strains = []
|
27
|
-
File.read(opts[:strain]).each_line do |line|
|
28
|
-
|
29
|
-
end
|
29
|
+
File.read(opts[:strain]).each_line do |line|
|
30
|
+
strains << line.chop
|
31
|
+
end
|
32
|
+
|
30
33
|
|
31
34
|
# Enter the name of your database
|
32
35
|
establish_connection(opts[:name])
|
@@ -61,3 +64,6 @@ populate_features_and_annotations(sequence_flatfile)
|
|
61
64
|
|
62
65
|
#The populate_snps_alleles_genotypes method populates the snps, alleles and genotypes. It uses the strain names (array) and vcf file.
|
63
66
|
populate_snps_alleles_genotypes(strains, vcf_mpileup_file, opts[:cuttoff_snp].to_i, opts[:cuttoff_genotype].to_i)
|
67
|
+
|
68
|
+
rescue
|
69
|
+
end
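The strain-list loop added in this hunk uses line.chop, which removes the last character of each line whatever it is, so a final name with no trailing newline would lose its last letter. The sketch below shows one possible alternative using String#chomp, which strips only a line terminator. The helper name read_strain_names is hypothetical and is not part of snp-search; skipping blank lines is an assumption about the desired behaviour.

```ruby
require 'tempfile'

# Hypothetical helper, not part of snp-search: read one strain name per line.
# String#chomp removes only a trailing "\n" or "\r\n"; String#chop would
# remove the final character even when it belongs to the name.
def read_strain_names(path)
  File.readlines(path).map(&:chomp).reject(&:empty?)
end

# Small demonstration with a throwaway file whose last line has no newline.
Tempfile.create('strains') do |f|
  f.write("strain1\nstrain2\n\nstrain3")
  f.flush
  p read_strain_names(f.path)
end
```

The hunk also wraps the script body in begin ... rescue ... end with an empty rescue clause, which would silently discard any StandardError raised while populating the database; printing the exception message in the rescue clause would make failures visible.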
|
data/lib/snp-search.rb
CHANGED
@@ -37,7 +37,7 @@ def populate_features_and_annotations(sequence_file)
|
|
37
37
|
db_feature.strand = feature.locations.first.strand
|
38
38
|
db_feature.name = feature.feature
|
39
39
|
db_feature.save
|
40
|
-
puts "
|
40
|
+
puts "populating #{db_feature.name}, start: #{db_feature.start}, end: #{db_feature.end}, strand: #{db_feature.strand} for feature: #{db_feature.id}"
|
41
41
|
# Populate the Annotation table with qualifier information from the genbank file
|
42
42
|
feature.qualifiers.each do |qualifier|
|
43
43
|
a = Annotation.new
|
@@ -45,7 +45,7 @@ def populate_features_and_annotations(sequence_file)
|
|
45
45
|
a.value = qualifier.value
|
46
46
|
a.save
|
47
47
|
db_feature.annotations << a
|
48
|
-
puts "
|
48
|
+
puts "populating #{a.qualifier} for feature: #{db_feature.id}"
|
49
49
|
end
|
50
50
|
end
|
51
51
|
end
|
data/snp-search.gemspec
CHANGED
@@ -5,11 +5,11 @@
|
|
5
5
|
|
6
6
|
Gem::Specification.new do |s|
|
7
7
|
s.name = "snp-search"
|
8
|
-
s.version = "0.
|
8
|
+
s.version = "0.5.0"
|
9
9
|
|
10
10
|
s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
|
11
11
|
s.authors = ["Ali Al-Shahib", "Anthony Underwood"]
|
12
|
-
s.date = "2011-
|
12
|
+
s.date = "2011-12-01"
|
13
13
|
s.description = "Use the snp-search toolset to query the SNP database"
|
14
14
|
s.email = "ali.al-shahib@hpa.org.uk"
|
15
15
|
s.executables = ["snp-search"]
|
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: snp-search
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.5.0
|
5
5
|
prerelease:
|
6
6
|
platform: ruby
|
7
7
|
authors:
|
@@ -10,11 +10,11 @@ authors:
|
|
10
10
|
autorequire:
|
11
11
|
bindir: bin
|
12
12
|
cert_chain: []
|
13
|
-
date: 2011-
|
13
|
+
date: 2011-12-01 00:00:00.000000000Z
|
14
14
|
dependencies:
|
15
15
|
- !ruby/object:Gem::Dependency
|
16
16
|
name: activerecord
|
17
|
-
requirement: &
|
17
|
+
requirement: &2158814260 !ruby/object:Gem::Requirement
|
18
18
|
none: false
|
19
19
|
requirements:
|
20
20
|
- - ! '>='
|
@@ -22,10 +22,10 @@ dependencies:
|
|
22
22
|
version: '0'
|
23
23
|
type: :runtime
|
24
24
|
prerelease: false
|
25
|
-
version_requirements: *
|
25
|
+
version_requirements: *2158814260
|
26
26
|
- !ruby/object:Gem::Dependency
|
27
27
|
name: bio
|
28
|
-
requirement: &
|
28
|
+
requirement: &2158813660 !ruby/object:Gem::Requirement
|
29
29
|
none: false
|
30
30
|
requirements:
|
31
31
|
- - ! '>='
|
@@ -33,10 +33,10 @@ dependencies:
|
|
33
33
|
version: '0'
|
34
34
|
type: :runtime
|
35
35
|
prerelease: false
|
36
|
-
version_requirements: *
|
36
|
+
version_requirements: *2158813660
|
37
37
|
- !ruby/object:Gem::Dependency
|
38
38
|
name: slop
|
39
|
-
requirement: &
|
39
|
+
requirement: &2158813060 !ruby/object:Gem::Requirement
|
40
40
|
none: false
|
41
41
|
requirements:
|
42
42
|
- - ! '>='
|
@@ -44,10 +44,10 @@ dependencies:
|
|
44
44
|
version: '0'
|
45
45
|
type: :runtime
|
46
46
|
prerelease: false
|
47
|
-
version_requirements: *
|
47
|
+
version_requirements: *2158813060
|
48
48
|
- !ruby/object:Gem::Dependency
|
49
49
|
name: rspec
|
50
|
-
requirement: &
|
50
|
+
requirement: &2158812460 !ruby/object:Gem::Requirement
|
51
51
|
none: false
|
52
52
|
requirements:
|
53
53
|
- - ~>
|
@@ -55,10 +55,10 @@ dependencies:
|
|
55
55
|
version: 2.3.0
|
56
56
|
type: :development
|
57
57
|
prerelease: false
|
58
|
-
version_requirements: *
|
58
|
+
version_requirements: *2158812460
|
59
59
|
- !ruby/object:Gem::Dependency
|
60
60
|
name: bundler
|
61
|
-
requirement: &
|
61
|
+
requirement: &2158752880 !ruby/object:Gem::Requirement
|
62
62
|
none: false
|
63
63
|
requirements:
|
64
64
|
- - ~>
|
@@ -66,10 +66,10 @@ dependencies:
|
|
66
66
|
version: 1.0.0
|
67
67
|
type: :development
|
68
68
|
prerelease: false
|
69
|
-
version_requirements: *
|
69
|
+
version_requirements: *2158752880
|
70
70
|
- !ruby/object:Gem::Dependency
|
71
71
|
name: jeweler
|
72
|
-
requirement: &
|
72
|
+
requirement: &2158752260 !ruby/object:Gem::Requirement
|
73
73
|
none: false
|
74
74
|
requirements:
|
75
75
|
- - ~>
|
@@ -77,10 +77,10 @@ dependencies:
|
|
77
77
|
version: 1.6.4
|
78
78
|
type: :development
|
79
79
|
prerelease: false
|
80
|
-
version_requirements: *
|
80
|
+
version_requirements: *2158752260
|
81
81
|
- !ruby/object:Gem::Dependency
|
82
82
|
name: rcov
|
83
|
-
requirement: &
|
83
|
+
requirement: &2158751780 !ruby/object:Gem::Requirement
|
84
84
|
none: false
|
85
85
|
requirements:
|
86
86
|
- - ! '>='
|
@@ -88,7 +88,7 @@ dependencies:
|
|
88
88
|
version: '0'
|
89
89
|
type: :development
|
90
90
|
prerelease: false
|
91
|
-
version_requirements: *
|
91
|
+
version_requirements: *2158751780
|
92
92
|
description: Use the snp-search toolset to query the SNP database
|
93
93
|
email: ali.al-shahib@hpa.org.uk
|
94
94
|
executables:
|
@@ -130,7 +130,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
|
|
130
130
|
version: '0'
|
131
131
|
segments:
|
132
132
|
- 0
|
133
|
-
hash:
|
133
|
+
hash: 2192451038165693366
|
134
134
|
required_rubygems_version: !ruby/object:Gem::Requirement
|
135
135
|
none: false
|
136
136
|
requirements:
|