bio-sra 0.1.0 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (5)
  1. checksums.yaml +4 -4
  2. data/README.md +36 -15
  3. data/VERSION +1 -1
  4. data/bin/sra_download +18 -1
  5. metadata +1 -1
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: 48531f3fc7b4facb8b4ed2b4523f6ab5eabb6897
-   data.tar.gz: 1ce0ebaccc4601144aa0150caf2913e2c3d96de1
+   metadata.gz: 7ffa909e8f796caf46d17335cd68a8075cc5da00
+   data.tar.gz: fdf1328d17f032197485359f73c313533649ca23
  SHA512:
-   metadata.gz: 3a81f703d495b138cb4a2d53ec2cced77fe888cd6f505ca28c9bec2bb20cddff4ca3b57c2b7cfcf31f8eda5fe00679d1bab503c91423f1f66ab141028ddf10c0
-   data.tar.gz: 7f4564e2b124d49b6313548f88020202942092ac41788f6d3228ce9608f8c92c955eadc0e5e789cc05744eb312341511a162848b783fdb3d1550bb317834a2c3
+   metadata.gz: 5b457ed258b8fa212996f142b69049c9e1b86abd36b3f316522cecb8bfd009349a8920e112ee34e977c1300075a5a1240c93c6707d9f3d82c4e8f9eb4b1fd6b3
+   data.tar.gz: 980aa9f9c2b0266b558ae16f7d0b643adf9f78837c67e21638a4be33b5fed224b1e49ca6aa39cef91bbdf17e102630ceb4c6d6eb2b6d38afb95d0f27011af2cb
data/README.md CHANGED
@@ -1,20 +1,18 @@
  # bio-sra

- [![Build Status](https://secure.travis-ci.org/wwood/bioruby-sra.png)](http://travis-ci.org/wwood/bioruby-sra)
-
  A Sequence Read Archive (SRA) download script and Ruby interface to the [SRAdb](ncbi.nlm.nih.gov/pmc/articles/PMC3560148/) (SRA metadata) SQLite database.

  ## Installation

  ```sh
- gem install bio-sra
+ $ gem install bio-sra
  ```

  ## Download script usage

  Download a single run file to the current directory:
  ```sh
- sra_download --runs ERR229501.sra
+ $ sra_download ERR229501
  ```

  Download a list of runs
@@ -22,19 +20,45 @@ Download a list of runs
  $ cat srr_list.txt
  ERR229501
  ERR229498
- $ sra_download --runs -f srr_list.txt
+ $ sra_download -f srr_list.txt
  ```

- Download all runs that are a part of the experiment ERP001779 (Microbial biogeography of public restroom surfaces)
+ Download all runs that are a part of the experiment ERP001779 "Microbial biogeography of public restroom surfaces". This requires an [SRAdb](http://www.bioconductor.org/packages/release/bioc/html/SRAdb.html) database (i.e. a database of the SRA metadata), which can be downloaded from
  ```sh
- $ sra_download ERP001779
+ $ sra_download -d '/path/to/SRAmetadb.sqlite' ERP001779
  ```
- This finds ERP001779 and links it to runs through the SRAdb
+ The SRAdb SQLite file can be downloaded from these mirrors:
+ * http://gbnci.abcc.ncifcrf.gov/backup/SRAmetadb.sqlite.gz
+ * http://watson.nci.nih.gov/~zhujack/SRAmetadb.sqlite.gz
+ * http://dl.dropbox.com/u/51653511/SRAmetadb.sqlite.gz

  ## Ruby interface script

  ```ruby
  require 'bio-sra'
+
+ # Connect to the database
+ Bio::SRA::Connection.connect '/path/to/SRAmetadb.sqlite'
+ ```
+ Once connected, the each row of the Bio::SRA::Tables::SRA table represents an SRA run:
+ ```
+ Bio::SRA::Tables::SRA.first.run_accession
+ # => "DRR000001"
+
+ Bio::SRA::Tables::SRA.first.submission_accession
+ # => "DRA000001"
+
+ Bio::SRA::Tables::SRA.first.submission_date
+ # => "2009-06-20"
+
+ Bio::SRA::Tables::SRA.first.submission_comment
+ # => "Bacillus subtilis subsp. natto BEST195 draft sequence, the chromosome and plasmid pBEST195S"
+ ```
+ There is a description of each available table on the [wiki](https://github.com/wwood/bioruby-sra/wiki).
+
+ There are also methods for working with accession numbers, e.g.
+ ```ruby
+ Bio::SRA::Accession.classify_accession_type('ERP001779') #=> :study_accession
  ```

  The API doc is online. For more code examples see the test files in
@@ -47,20 +71,17 @@ how to contribute, see

  http://github.com/wwood/bioruby-sra

- The BioRuby community is on IRC server: irc.freenode.org, channel: #bioruby.
-
  ## Cite

- This Ruby code is unpublished, but there's a problem with
+ This Ruby code is unpublished, but citing the SRAdb paper is probably good practice:

- * [BioRuby: bioinformatics software for the Ruby programming language](http://dx.doi.org/10.1093/bioinformatics/btq475)
- * [Biogem: an effective tool-based approach for scaling up open source software development in bioinformatics](http://dx.doi.org/10.1093/bioinformatics/bts080)
+ * [SRAdb: query and use public next-generation sequencing data from within R](dx.doi.org/10.1186/1471-2105-14-19)

  ## Biogems.info

- This Biogem is published at [#bio-sra](http://biogems.info/index.html)
+ This Biogem is published at [biogems.info](http://biogems.info/index.html)

  ## Copyright

- Copyright (c) 2012 Ben J. Woodcroft. See LICENSE.txt for further details.
+ Copyright (c) 2012-2014 Ben J. Woodcroft. See LICENSE.txt for further details.

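The README change above shows `Bio::SRA::Accession.classify_accession_type('ERP001779')` returning `:study_accession`. As a rough illustration of how such a classification could work, here is a minimal sketch based on the usual SRA/ENA/DDBJ accession prefixes. It is not the gem's implementation: `classify_accession_sketch` and `ACCESSION_TYPES` are hypothetical names, and every symbol other than `:study_accession` is an assumption modelled on the column names shown in the README.

```ruby
# Hypothetical sketch, NOT the bio-sra implementation.
# Accessions look like <archive letter>R<type letter><digits>, where the
# archive letter is S (NCBI), E (EBI) or D (DDBJ).
ACCESSION_TYPES = {
  'A' => :submission_accession, # assumed symbol name
  'P' => :study_accession,      # matches the README example
  'X' => :experiment_accession, # assumed symbol name
  'S' => :sample_accession,     # assumed symbol name
  'R' => :run_accession         # assumed symbol name
}.freeze

# Returns a symbol describing the accession type, or nil if the string
# does not look like an SRA-style accession.
def classify_accession_sketch(accession)
  match = /\A[SED]R([APXSR])\d+\z/.match(accession)
  return nil unless match

  ACCESSION_TYPES.fetch(match[1])
end

puts classify_accession_sketch('ERP001779').inspect
puts classify_accession_sketch('DRR000001').inspect
```

A classification like this is what lets a download script decide whether an argument is already a run or needs to be resolved to runs through the metadata database.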
data/VERSION CHANGED
@@ -1 +1 @@
- 0.1.0
+ 0.2.0
data/bin/sra_download CHANGED
@@ -27,6 +27,9 @@ Download data from SRA \n"
  opts.on('-f', "--file FILENAME", "Provide a file of accession numbers, separated by whitespace or commas [default: not used, use the first argument <SRA_ACCESSION>]") do |f|
    options[:accessions_file] = f
  end
+ opts.on('-d', '--db SRAmetaDB_PATH', "Path to the SRAmetadb downloaded from NCBI e.g. from the URL [required unless all accessions are runs (rather than e.g. studies or submissions)]") do |arg|
+   options[:sradb] = arg
+ end
  opts.on("--format FORMAT", "format for download [default: 'sra']") do |f|
    format_string_to_sym = {
      'sralite' => :sralite, # no longer supported by NCBI?
@@ -89,7 +92,21 @@ end

  # Connect to the database if required
  log.info "Connecting to database.."
- Bio::SRA::Connection.connect unless options[:treat_input_as_runs]
+ unless options[:treat_input_as_runs]
+   if options[:sradb]
+     Bio::SRA::Connection.connect options[:sradb]
+   else
+     Bio::SRA::Connection.connect
+   end
+
+   # Check for connection
+   begin
+     s = Bio::SRA::Tables::SRA.first
+   rescue
+     log.error "There was a problem connecting to the database at `#{options[:sradb] }', was it specified correctly?"
+     exit 2
+   end
+ end

  log.info "Collecting a list of runs to download.."
  runs = []
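The hunk above connects with an explicit path when `-d` is given, falls back to the default connection otherwise, and then probes the database with one cheap query so that a bad path fails early. The same pattern can be sketched in isolation. This is a minimal sketch, not code from the gem: `connect_and_verify` is a hypothetical helper, and the `connector` and `probe` callables are stand-ins for `Bio::SRA::Connection.connect` and `Bio::SRA::Tables::SRA.first` so the example runs without a database.

```ruby
require 'logger'

# Hypothetical helper illustrating "connect, then probe".
# connector: callable standing in for Bio::SRA::Connection.connect
# probe:     callable standing in for Bio::SRA::Tables::SRA.first
# Returns true if the probe query succeeds, false otherwise.
def connect_and_verify(db_path, connector:, probe:, log: Logger.new($stderr))
  # Use the explicit path when one was given, else the default connection.
  db_path ? connector.call(db_path) : connector.call
  # A single cheap query surfaces a wrong path or an unreadable file now,
  # rather than in the middle of a download run.
  probe.call
  true
rescue StandardError => e
  log.error "There was a problem connecting to the database at `#{db_path}': #{e.message}"
  false
end

# Demo with stand-ins: the probe fails, as it would for a bad path.
ok = connect_and_verify('/no/such/SRAmetadb.sqlite',
                        connector: ->(*_args) {},
                        probe: -> { raise IOError, 'unable to open database file' },
                        log: Logger.new(nil))
puts ok
```

A script would typically turn a `false` result into a non-zero exit status, as the hunk does with `exit 2`.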
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: bio-sra
  version: !ruby/object:Gem::Version
-   version: 0.1.0
+   version: 0.2.0
  platform: ruby
  authors:
  - Ben J. Woodcroft
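
The `-f` option in `data/bin/sra_download` is documented as taking a file of accession numbers "separated by whitespace or commas". The script's actual parsing code is not part of this diff, so the following is only a sketch of one way to split such input. `parse_accessions` is a hypothetical helper name.

```ruby
# Hypothetical helper, not taken from the gem: split text on any run of
# whitespace and/or commas and drop empty fragments.
def parse_accessions(text)
  text.split(/[\s,]+/).reject(&:empty?)
end

# The srr_list.txt contents from the README, one accession per line.
puts parse_accessions("ERR229501\nERR229498\n").inspect
```

With a real file, the text would come from something like `File.read(options[:accessions_file])`.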