bio-maf 0.3.1 → 0.3.2

data/DEVELOPMENT.md CHANGED
@@ -20,6 +20,10 @@ platform.
  The version is simply set by hand in `bio-maf.gemspec`. Don't forget
  to increment it!
 
+ First, verify that you are on the `master` branch:
+
+ $ git branch
+
  Testing the build:
 
  $ rake build
data/README.md CHANGED
@@ -129,6 +129,10 @@ end
  # => Matched block at 80082713, 54 bases
  ```
 
+ This can be done with [`maf_extract(1)`](http://csw.github.com/bioruby-maf/man/maf_extract.1.html) as well:
+
+ $ maf_extract -d test/data --interval mm8.chr7:80082592-80082766
+
  ### Extract alignment blocks truncated to a given interval
 
  Given a genomic interval of interest, one can also extract only the
@@ -144,6 +148,10 @@ puts "Got #{blocks.size} blocks, first #{blocks.first.ref_seq.size} base pairs."
  # => Got 2 blocks, first 18 base pairs.
  ```
 
+ Or, with [`maf_extract(1)`](http://csw.github.com/bioruby-maf/man/maf_extract.1.html):
+
+ $ maf_extract -d test/data --mode slice --interval mm8.chr7:80082592-80082766
+
  ### Filter species returned in alignment blocks
 
  ```ruby
@@ -159,6 +167,10 @@ puts "Block has #{block.sequences.size} sequences."
  # => Block has 3 sequences.
  ```
 
+ With [`maf_extract(1)`](http://csw.github.com/bioruby-maf/man/maf_extract.1.html):
+
+ $ maf_extract -d test/data --interval mm8.chr7:80082592-80082766 --only-species hg18,mm8,rheMac2
+
  ### Extract blocks matching certain conditions
 
  See also the [Cucumber feature][] and [step definitions][] for this.
@@ -176,6 +188,10 @@ n_blocks = access.find(q).count
  # => 1
  ```
 
+ With [`maf_extract(1)`](http://csw.github.com/bioruby-maf/man/maf_extract.1.html):
+
+ $ maf_extract -d test/data --interval mm8.chr7:80082471-80082730 --with-all-species panTro2,loxAfr1
+
  #### Match only blocks with a certain number of sequences
 
  ```ruby
@@ -186,6 +202,10 @@ n_blocks = access.find(q).count
  # => 1
  ```
 
+ With [`maf_extract(1)`](http://csw.github.com/bioruby-maf/man/maf_extract.1.html):
+
+ $ maf_extract -d test/data --interval mm8.chr7:80082767-80083008 --min-sequences 6
+
  #### Match only blocks within a text size range
 
  ```ruby
@@ -196,6 +216,10 @@ n_blocks = access.find(q).count
  # => 3
  ```
 
+ With [`maf_extract(1)`](http://csw.github.com/bioruby-maf/man/maf_extract.1.html):
+
+ $ maf_extract -d test/data --interval mm8.chr7:0-80100000 --min-text-size 72 --max-text-size 160
+
  ### Process each block in a MAF file
 
  ```ruby
@@ -333,6 +357,7 @@ end
  Man pages for command line tools:
 
  * [`maf_index(1)`](http://csw.github.com/bioruby-maf/man/maf_index.1.html)
+ * [`maf_extract(1)`](http://csw.github.com/bioruby-maf/man/maf_extract.1.html)
  * [`maf_to_fasta(1)`](http://csw.github.com/bioruby-maf/man/maf_to_fasta.1.html)
  * [`maf_tile(1)`](http://csw.github.com/bioruby-maf/man/maf_tile.1.html)
 
@@ -377,4 +402,3 @@ This Biogem is published at [biogems.info](http://biogems.info/index.html#bio-ma
  ## Copyright
 
  Copyright (c) 2012 Clayton Wheeler. See LICENSE.txt for further details.
-
data/bin/maf_extract CHANGED
@@ -22,8 +22,11 @@ def handle_list_spec(spec)
  end
 
  def handle_interval_spec(int)
-   parts = int.split(':')
-   Bio::GenomicInterval.zero_based(parts[0], parts[1].to_i, parts[2].to_i)
+   if int =~ /(.+):(\d+)-(\d+)/
+     Bio::GenomicInterval.zero_based($1, $2.to_i, $3.to_i)
+   else
+     raise "Invalid interval specification: #{int}"
+   end
  end
 
  $op = OptionParser.new do |opts|
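
The new `SEQ:START-END` parsing in this hunk can be tried in isolation with a minimal sketch like the one below. It is not the gem's code: `Interval` is a hypothetical stand-in for `Bio::GenomicInterval` so that the example runs without extra gems, `parse_interval_spec` is an illustrative name, and the pattern is anchored with `\A` and `\z`, which the pattern in the diff is not.

```ruby
# Hypothetical stand-in for Bio::GenomicInterval, so the sketch runs
# without any gems installed.
Interval = Struct.new(:chrom, :chr_start, :chr_end)

# Parse "SEQ:START-END" into a zero-based, half-open interval.
# Assumption: the whole string must match, hence the \A and \z anchors.
def parse_interval_spec(spec)
  m = /\A(.+):(\d+)-(\d+)\z/.match(spec)
  raise ArgumentError, "Invalid interval specification: #{spec}" unless m
  Interval.new(m[1], m[2].to_i, m[3].to_i)
end

iv = parse_interval_spec("mm8.chr7:80082592-80082766")
puts "#{iv.chrom} #{iv.chr_start} #{iv.chr_end}"
# prints "mm8.chr7 80082592 80082766"
```

With this sketch, a specification in the old `SEQ:START:END` form, such as `mm8.chr7:80082592:80082766`, raises `ArgumentError` rather than being silently misparsed.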
data/bio-maf.gemspec CHANGED
@@ -2,7 +2,7 @@
 
  Gem::Specification.new do |s|
    s.name = "bio-maf"
-   s.version = "0.3.1"
+   s.version = "0.3.2"
 
    s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
    s.authors = ["Clayton Wheeler"]
data/man/maf_extract.1 CHANGED
@@ -7,13 +7,13 @@
  \fBmaf_extract\fR \- extract blocks from MAF files
  .
  .SH "SYNOPSIS"
- \fBmaf_extract\fR \-m MAF [\-i INDEX] \-\-interval SEQ:START:END \fIOPTIONS\fR
+ \fBmaf_extract\fR \-m MAF [\-i INDEX] \-\-interval SEQ:START\-END \fIOPTIONS\fR
  .
  .P
  \fBmaf_extract\fR \-m MAF [\-i INDEX] \-\-bed BED \fIOPTIONS\fR
  .
  .P
- \fBmaf_extract\fR \-d MAFDIR \-\-interval SEQ:START:END \fIOPTIONS\fR
+ \fBmaf_extract\fR \-d MAFDIR \-\-interval SEQ:START\-END \fIOPTIONS\fR
  .
  .P
  \fBmaf_extract\fR \-d MAFDIR \-\-bed BED \fIOPTIONS\fR
@@ -69,7 +69,7 @@ The extraction mode to use\. With \fB\-\-mode intersect\fR, any alignment block
  The specified file will be parsed as a BED file, and each interval it contains will be matched in turn\.
  .
  .TP
- \fB\-\-interval SEQ:START:END\fR
+ \fB\-\-interval SEQ:START\-END\fR
  A single zero\-based half\-open genomic interval will be matched, with sequence identifier \fIseq\fR, (inclusive) start position \fIstart\fR, and (exclusive) end position \fIend\fR\.
  .
  .P
@@ -141,7 +141,116 @@ Run verbosely, with additional informational messages\.
  Log debugging information\.
  .
  .SH "EXAMPLES"
- TODO
+ Extract MAF blocks intersecting with a given interval:
+ .
+ .IP "" 4
+ .
+ .nf
+
+ $ maf_extract \-d test/data \-\-interval mm8\.chr7:80082592\-80082766
+ .
+ .fi
+ .
+ .IP "" 0
+ .
+ .P
+ As above, but operating on a single file:
+ .
+ .IP "" 4
+ .
+ .nf
+
+ $ maf_extract \-m test/data/mm8_chr7_tiny\.maf \e
+ \-i test/data/mm8_chr7_tiny\.kct \e
+ \-\-interval mm8\.chr7:80082592\-80082766
+ .
+ .fi
+ .
+ .IP "" 0
+ .
+ .P
+ Like the first case, but writing output to a file:
+ .
+ .IP "" 4
+ .
+ .nf
+
+ $ maf_extract \-d test/data \-\-interval mm8\.chr7:80082592\-80082766 \e
+ \-\-output out\.maf
+ .
+ .fi
+ .
+ .IP "" 0
+ .
+ .P
+ Extract a slice of MAF blocks over a given interval:
+ .
+ .IP "" 4
+ .
+ .nf
+
+ $ maf_extract \-d test/data \-\-mode slice \e
+ \-\-interval mm8\.chr7:80082592\-80082766
+ .
+ .fi
+ .
+ .IP "" 0
+ .
+ .P
+ Filter for sequences from only certain species:
+ .
+ .IP "" 4
+ .
+ .nf
+
+ $ maf_extract \-d test/data \-\-interval mm8\.chr7:80082592\-80082766 \e
+ \-\-only\-species hg18,mm8,rheMac2
+ .
+ .fi
+ .
+ .IP "" 0
+ .
+ .P
+ Extract only blocks with all specified species:
+ .
+ .IP "" 4
+ .
+ .nf
+
+ $ maf_extract \-d test/data \-\-interval mm8\.chr7:80082471\-80082730 \e
+ \-\-with\-all\-species panTro2,loxAfr1
+ .
+ .fi
+ .
+ .IP "" 0
+ .
+ .P
+ Extract blocks with at least a certain number of sequences:
+ .
+ .IP "" 4
+ .
+ .nf
+
+ $ maf_extract \-d test/data \-\-interval mm8\.chr7:80082767\-80083008 \e
+ \-\-min\-sequences 6
+ .
+ .fi
+ .
+ .IP "" 0
+ .
+ .P
+ Extract blocks with text sizes in a certain range:
+ .
+ .IP "" 4
+ .
+ .nf
+
+ $ maf_extract \-d test/data \-\-interval mm8\.chr7:0\-80100000 \e
+ \-\-min\-text\-size 72 \-\-max\-text\-size 160
+ .
+ .fi
+ .
+ .IP "" 0
  .
  .SH "ENVIRONMENT"
  \fBmaf_index\fR is a Ruby program and relies on ordinary Ruby environment variables\.
@@ -3,11 +3,11 @@ maf_extract(1) -- extract blocks from MAF files
 
  ## SYNOPSIS
 
- `maf_extract` -m MAF [-i INDEX] --interval SEQ:START:END [OPTIONS]
+ `maf_extract` -m MAF [-i INDEX] --interval SEQ:START-END [OPTIONS]
 
  `maf_extract` -m MAF [-i INDEX] --bed BED [OPTIONS]
 
- `maf_extract` -d MAFDIR --interval SEQ:START:END [OPTIONS]
+ `maf_extract` -d MAFDIR --interval SEQ:START-END [OPTIONS]
 
  `maf_extract` -d MAFDIR --bed BED [OPTIONS]
 
@@ -79,7 +79,7 @@ Extraction options:
  The specified file will be parsed as a BED file, and each interval
  it contains will be matched in turn.
 
- * `--interval SEQ:START:END`:
+ * `--interval SEQ:START-END`:
  A single zero-based half-open genomic interval will be matched,
  with sequence identifier <seq>, (inclusive) start position <start>,
  and (exclusive) end position <end>.
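
As an illustration of the zero-based, half-open convention described for `--interval`, the following sketch works through the arithmetic for the interval used in the examples. The variable names are illustrative only and are not part of the `bio-maf` API.

```ruby
# Zero-based, half-open interval [start, end): start is included, end is not.
start_pos = 80082592
end_pos   = 80082766

# The number of bases covered is simply end - start.
length = end_pos - start_pos

# Ruby's exclusive range (three dots) follows the same convention.
same_length = (start_pos...end_pos).size

# Equivalent one-based, fully closed coordinates: only the start shifts by one.
one_based_start = start_pos + 1
one_based_end   = end_pos

puts "length: #{length}"
# prints "length: 174"
puts "one-based closed: #{one_based_start}-#{one_based_end}"
# prints "one-based closed: 80082593-80082766"
```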
@@ -153,7 +153,45 @@ Logging options:
 
  ## EXAMPLES
 
- TODO
+ Extract MAF blocks intersecting with a given interval:
+
+ $ maf_extract -d test/data --interval mm8.chr7:80082592-80082766
+
+ As above, but operating on a single file:
+
+ $ maf_extract -m test/data/mm8_chr7_tiny.maf \
+ -i test/data/mm8_chr7_tiny.kct \
+ --interval mm8.chr7:80082592-80082766
+
+ Like the first case, but writing output to a file:
+
+ $ maf_extract -d test/data --interval mm8.chr7:80082592-80082766 \
+ --output out.maf
+
+ Extract a slice of MAF blocks over a given interval:
+
+ $ maf_extract -d test/data --mode slice \
+ --interval mm8.chr7:80082592-80082766
+
+ Filter for sequences from only certain species:
+
+ $ maf_extract -d test/data --interval mm8.chr7:80082592-80082766 \
+ --only-species hg18,mm8,rheMac2
+
+ Extract only blocks with all specified species:
+
+ $ maf_extract -d test/data --interval mm8.chr7:80082471-80082730 \
+ --with-all-species panTro2,loxAfr1
+
+ Extract blocks with at least a certain number of sequences:
+
+ $ maf_extract -d test/data --interval mm8.chr7:80082767-80083008 \
+ --min-sequences 6
+
+ Extract blocks with text sizes in a certain range:
+
+ $ maf_extract -d test/data --interval mm8.chr7:0-80100000 \
+ --min-text-size 72 --max-text-size 160
 
  ## ENVIRONMENT
 
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: bio-maf
  version: !ruby/object:Gem::Version
- version: 0.3.1
+ version: 0.3.2
  prerelease:
  platform: ruby
  authors:
@@ -219,7 +219,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
  version: '0'
  segments:
  - 0
- hash: -1336822573836516057
+ hash: 2092820657742105268
  required_rubygems_version: !ruby/object:Gem::Requirement
  none: false
  requirements: