snp-search 0.4.0 → 0.5.0

@@ -1,6 +1,6 @@
1
1
  = snp-search
2
2
 
3
- snp-search is a set of tools that manages SNP data and allows for data importing, manipulating, editing and complex querying of SNP data. It can be used to evaluate the utility of SNPs for the assessment of genetic diversity between strains and the management of genotype and phenotype data. Once a query is performed, SNPsearch can be used to convert the selected SNP data into FASTA sequences. SNPsearch is particularly useful in the analysis of phylogenetic trees that are based on SNP differences across whole core genomes. Queries can be made to answer critical genomic questions such as the association of SNPs with particular phenotypes.
3
+ snp-search is a set of tools that manages SNP data and allows for data importing, manipulating, editing and complex querying of SNP data. It can be used to evaluate the utility of SNPs for the assessment of genetic diversity between haploid strains and the management of genotype and phenotype data. Once a query is performed, SNPsearch can be used to convert the selected SNP data into FASTA sequences. SNPsearch is particularly useful in the analysis of phylogenetic trees that are based on SNP differences across whole core genomes. Queries can be made to answer critical genomic questions such as the association of SNPs with particular phenotypes.
4
4
 
5
5
  == Obtaining and installing the code
6
6
  snp-search is written in Ruby and operates in a Unix environment. It is made available as a gem. See the GitHub site for more information (https://github.com/hpa-bioinformatics/snp-search).
@@ -10,27 +10,34 @@ To install snp-search, do
10
10
 
11
11
  == Requirements
12
12
 
13
- * ActiveRecord: The snp-search API is based on ActiveRecord to get the data from the database. ActiveRecord is available as a gem:
14
- gem install activerecord
15
-
16
- * SQLite3: The SQL engine used is sqlite3. Most Linux operating systems come with sqlite3. However if you do not have sqlite then you may download it from http://www.sqlite.org/download.html. The installation instructions are available in the download page.
17
-
18
- * bio gem.
13
+ * Nothing! You just need to run this in unix and it will install all the necessary gems for you.
19
14
 
20
15
  == Running snp-search
21
16
 
22
- Step 1:
23
-
24
- == Contributing to snp-search
25
-
26
- * Check out the latest master to make sure the feature hasn't been implemented or the bug hasn't been fixed yet
27
- * Check out the issue tracker to make sure someone already hasn't requested it and/or contributed it
28
- * Fork the project
29
- * Start a feature/bugfix branch
30
- * Commit and push until you are happy with your contribution
31
- * Make sure to add tests for it. This is important so I don't break it in a future version unintentionally.
32
- * Please try not to mess with the Rakefile, version, or history. If you want to have your own version, or is otherwise necessary, that is fine, but please isolate to its own commit so I can cherry-pick around it.
33
- *
17
+ * To run snp-search, you need to have 3 files:
18
+ 1- Variant Call Format (.vcf) file (which contains the SNP information)
19
+ 2- Your reference genome that you used to generate your .vcf file (in genbank or embl format, the script will automatically detect the format).
20
+ 3- A text file with a list of your strain/sample names. These should be the same strains/samples used in generating the .vcf file. In the text file, every strain/sample name should have a new line, e.g.
21
+
22
+ strain1
23
+ strain2
24
+ strain3
25
+ strain4
26
+ etc..
27
+
28
+ * Once you have these files ready, you may run snp-search with the following options:
29
+
30
+ -V Enable verbose mode
31
+ -n Name of your database Optional, default = snp_db.sqlite3
32
+ -v .vcf file Required
33
+ -r Reference genome file (The same file that was used in generating the .vcf file). This should be in genbank or embl format. Required
34
+ -s Text file that contains a list of the strain/sample names (The same strains/samples used in generating the .vcf file) Required
35
+ -c SNP quality cutoff. A phred-scaled quality score. High quality scores indicate high confidence calls. Optional, default = 90
36
+ -t Genotype quality cutoff. A phred-scaled score for the probability that the genotype call is wrong, given that the site is variant. Optional, default = 30
37
+ -h help message
38
+
39
+ * Usage:
40
+ snp-search -n my_snp_db.sqlite3 -r my_ref.gbk -v my_vcf_file.vcf -s my_list_of_strains.txt
34
41
 
35
42
  == Copyright
36
43
 
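Note on the -c and -t options in the README hunk above: both cutoffs are phred-scaled, so a score Q corresponds to an error probability of 10^(-Q/10). The following is a minimal Ruby sketch of that conversion. The helper name phred_to_error_probability is hypothetical and is not part of snp-search.

```ruby
# Hypothetical helper (not part of snp-search): convert a phred-scaled
# quality score Q into the implied error probability P = 10^(-Q/10).
def phred_to_error_probability(quality)
  10.0**(-quality / 10.0)
end

# The README defaults: -c 90 (SNP quality) and -t 30 (genotype quality).
{ "SNP quality (-c)" => 90, "genotype quality (-t)" => 30 }.each do |label, q|
  puts format("%s cutoff %d -> error probability %.1e",
              label, q, phred_to_error_probability(q))
end
```

Under this reading, the default genotype cutoff of 30 corresponds to roughly a 1 in 1,000 chance that the call is wrong, and the default SNP cutoff of 90 is far stricter.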
data/VERSION CHANGED
@@ -1 +1 @@
1
- 0.4.0
1
+ 0.5.0
@@ -10,23 +10,26 @@ opts = Slop.new :help do
10
10
  banner "ruby snp-search [OPTIONS]"
11
11
 
12
12
  on :V, :verbose, 'Enable verbose mode'
13
- on :n, :name=, 'Name of database', true
14
- on :r, :reference_file=, 'Path for the reference database, in gbk or embl file format'
15
- on :v, :vcf_file=, 'Path for the .vcf file', true
16
- on :s, :strain, 'Path for the list of strains text file', true
17
- on :c, :cuttoff_snp=, 'cuttoff for SNP quality'
18
- on :t, :cuttoff_genotype=, 'cuttoff for genotype quality'
13
+ on :n, :name=, 'Name of database', :default => 'snp_db.sqlite3'
14
+ on :r, :reference_file=, 'Reference genome file, in gbk or embl file format', true
15
+ on :v, :vcf_file=, '.vcf file', true
16
+ on :s, :strain=, 'text file with a list of strains/samples', true
17
+ on :c, :cuttoff_snp=, 'SNP quality cuttoff', :default => 90
18
+ on :t, :cuttoff_genotype=, 'Genotype quality cuttoff', :default => 30
19
+
19
20
  on_empty do
20
21
  puts help
21
22
  end
22
23
  end
23
-
24
24
  opts.parse
25
25
 
26
+
27
+ begin
26
28
  strains = []
27
- File.read(opts[:strain]).each_line do |line|
28
- strains << line.chop
29
- end
29
+ File.read(opts[:strain]).each_line do |line|
30
+ strains << line.chop
31
+ end
32
+
30
33
 
31
34
  # Enter the name of your database
32
35
  establish_connection(opts[:name])
@@ -61,3 +64,6 @@ populate_features_and_annotations(sequence_flatfile)
61
64
 
62
65
  #The populate_snps_alleles_genotypes method populates the snps, alleles and genotypes. It uses the strain names (array) and vcf file.
63
66
  populate_snps_alleles_genotypes(strains, vcf_mpileup_file, opts[:cuttoff_snp].to_i, opts[:cuttoff_genotype].to_i)
67
+
68
+ rescue
69
+ end
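Note on the strain list handling in the hunk above: String#chop removes the last character of each line whatever it is, so a final line with no trailing newline would lose the last character of its strain name. A minimal sketch of a newline-safe alternative follows. The helper name read_strain_names is hypothetical, and skipping blank lines and trimming surrounding whitespace are assumptions, not behaviour confirmed by the snp-search source.

```ruby
require "tempfile"

# Hypothetical helper (not part of snp-search): read one strain/sample
# name per line. Assumes surrounding whitespace and blank lines should
# be ignored.
def read_strain_names(path)
  File.readlines(path).map(&:strip).reject(&:empty?)
end

# Demonstration with a temporary file whose last line has no trailing newline.
Tempfile.create("strains") do |f|
  f.write("strain1\nstrain2\n\nstrain3")
  f.flush
  p read_strain_names(f.path)
end
```

The hunk also wraps the script body in a begin/rescue with an empty rescue clause, which hides any error raised while reading the inputs. Printing the exception message in that clause would make a missing or unreadable strain file easier to diagnose.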
@@ -37,7 +37,7 @@ def populate_features_and_annotations(sequence_file)
37
37
  db_feature.strand = feature.locations.first.strand
38
38
  db_feature.name = feature.feature
39
39
  db_feature.save
40
- puts "populated #{db_feature.name}, start: #{db_feature.start}, end: #{db_feature.end}, strand: #{db_feature.strand} for feature: #{db_feature.id}"
40
+ puts "populating #{db_feature.name}, start: #{db_feature.start}, end: #{db_feature.end}, strand: #{db_feature.strand} for feature: #{db_feature.id}"
41
41
  # Populate the Annotation table with qualifier information from the genbank file
42
42
  feature.qualifiers.each do |qualifier|
43
43
  a = Annotation.new
@@ -45,7 +45,7 @@ def populate_features_and_annotations(sequence_file)
45
45
  a.value = qualifier.value
46
46
  a.save
47
47
  db_feature.annotations << a
48
- puts "populated #{a.qualifier} for feature: #{db_feature.id}"
48
+ puts "populating #{a.qualifier} for feature: #{db_feature.id}"
49
49
  end
50
50
  end
51
51
  end
@@ -5,11 +5,11 @@
5
5
 
6
6
  Gem::Specification.new do |s|
7
7
  s.name = "snp-search"
8
- s.version = "0.4.0"
8
+ s.version = "0.5.0"
9
9
 
10
10
  s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
11
11
  s.authors = ["Ali Al-Shahib", "Anthony Underwood"]
12
- s.date = "2011-11-30"
12
+ s.date = "2011-12-01"
13
13
  s.description = "Use the snp-search toolset to query the SNP database"
14
14
  s.email = "ali.al-shahib@hpa.org.uk"
15
15
  s.executables = ["snp-search"]
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: snp-search
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.4.0
4
+ version: 0.5.0
5
5
  prerelease:
6
6
  platform: ruby
7
7
  authors:
@@ -10,11 +10,11 @@ authors:
10
10
  autorequire:
11
11
  bindir: bin
12
12
  cert_chain: []
13
- date: 2011-11-30 00:00:00.000000000Z
13
+ date: 2011-12-01 00:00:00.000000000Z
14
14
  dependencies:
15
15
  - !ruby/object:Gem::Dependency
16
16
  name: activerecord
17
- requirement: &2155288080 !ruby/object:Gem::Requirement
17
+ requirement: &2158814260 !ruby/object:Gem::Requirement
18
18
  none: false
19
19
  requirements:
20
20
  - - ! '>='
@@ -22,10 +22,10 @@ dependencies:
22
22
  version: '0'
23
23
  type: :runtime
24
24
  prerelease: false
25
- version_requirements: *2155288080
25
+ version_requirements: *2158814260
26
26
  - !ruby/object:Gem::Dependency
27
27
  name: bio
28
- requirement: &2155287480 !ruby/object:Gem::Requirement
28
+ requirement: &2158813660 !ruby/object:Gem::Requirement
29
29
  none: false
30
30
  requirements:
31
31
  - - ! '>='
@@ -33,10 +33,10 @@ dependencies:
33
33
  version: '0'
34
34
  type: :runtime
35
35
  prerelease: false
36
- version_requirements: *2155287480
36
+ version_requirements: *2158813660
37
37
  - !ruby/object:Gem::Dependency
38
38
  name: slop
39
- requirement: &2155286880 !ruby/object:Gem::Requirement
39
+ requirement: &2158813060 !ruby/object:Gem::Requirement
40
40
  none: false
41
41
  requirements:
42
42
  - - ! '>='
@@ -44,10 +44,10 @@ dependencies:
44
44
  version: '0'
45
45
  type: :runtime
46
46
  prerelease: false
47
- version_requirements: *2155286880
47
+ version_requirements: *2158813060
48
48
  - !ruby/object:Gem::Dependency
49
49
  name: rspec
50
- requirement: &2155286300 !ruby/object:Gem::Requirement
50
+ requirement: &2158812460 !ruby/object:Gem::Requirement
51
51
  none: false
52
52
  requirements:
53
53
  - - ~>
@@ -55,10 +55,10 @@ dependencies:
55
55
  version: 2.3.0
56
56
  type: :development
57
57
  prerelease: false
58
- version_requirements: *2155286300
58
+ version_requirements: *2158812460
59
59
  - !ruby/object:Gem::Dependency
60
60
  name: bundler
61
- requirement: &2155285700 !ruby/object:Gem::Requirement
61
+ requirement: &2158752880 !ruby/object:Gem::Requirement
62
62
  none: false
63
63
  requirements:
64
64
  - - ~>
@@ -66,10 +66,10 @@ dependencies:
66
66
  version: 1.0.0
67
67
  type: :development
68
68
  prerelease: false
69
- version_requirements: *2155285700
69
+ version_requirements: *2158752880
70
70
  - !ruby/object:Gem::Dependency
71
71
  name: jeweler
72
- requirement: &2155285120 !ruby/object:Gem::Requirement
72
+ requirement: &2158752260 !ruby/object:Gem::Requirement
73
73
  none: false
74
74
  requirements:
75
75
  - - ~>
@@ -77,10 +77,10 @@ dependencies:
77
77
  version: 1.6.4
78
78
  type: :development
79
79
  prerelease: false
80
- version_requirements: *2155285120
80
+ version_requirements: *2158752260
81
81
  - !ruby/object:Gem::Dependency
82
82
  name: rcov
83
- requirement: &2155284520 !ruby/object:Gem::Requirement
83
+ requirement: &2158751780 !ruby/object:Gem::Requirement
84
84
  none: false
85
85
  requirements:
86
86
  - - ! '>='
@@ -88,7 +88,7 @@ dependencies:
88
88
  version: '0'
89
89
  type: :development
90
90
  prerelease: false
91
- version_requirements: *2155284520
91
+ version_requirements: *2158751780
92
92
  description: Use the snp-search toolset to query the SNP database
93
93
  email: ali.al-shahib@hpa.org.uk
94
94
  executables:
@@ -130,7 +130,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
130
130
  version: '0'
131
131
  segments:
132
132
  - 0
133
- hash: 3021040479965194059
133
+ hash: 2192451038165693366
134
134
  required_rubygems_version: !ruby/object:Gem::Requirement
135
135
  none: false
136
136
  requirements: