bio-maf 0.3.1 → 0.3.2

data/DEVELOPMENT.md CHANGED
@@ -20,6 +20,10 @@ platform.
  The version is simply set by hand in `bio-maf.gemspec`. Don't forget
  to increment it!
 
+ First, verify that you are on the `master` branch:
+
+ $ git branch
+
  Testing the build:
 
  $ rake build
data/README.md CHANGED
@@ -129,6 +129,10 @@ end
  # => Matched block at 80082713, 54 bases
  ```
 
+ This can be done with [`maf_extract(1)`](http://csw.github.com/bioruby-maf/man/maf_extract.1.html) as well:
+
+ $ maf_extract -d test/data --interval mm8.chr7:80082592-80082766
+
  ### Extract alignment blocks truncated to a given interval
 
  Given a genomic interval of interest, one can also extract only the
@@ -144,6 +148,10 @@ puts "Got #{blocks.size} blocks, first #{blocks.first.ref_seq.size} base pairs."
  # => Got 2 blocks, first 18 base pairs.
  ```
 
+ Or, with [`maf_extract(1)`](http://csw.github.com/bioruby-maf/man/maf_extract.1.html):
+
+ $ maf_extract -d test/data --mode slice --interval mm8.chr7:80082592-80082766
+
  ### Filter species returned in alignment blocks
 
  ```ruby
@@ -159,6 +167,10 @@ puts "Block has #{block.sequences.size} sequences."
  # => Block has 3 sequences.
  ```
 
+ With [`maf_extract(1)`](http://csw.github.com/bioruby-maf/man/maf_extract.1.html):
+
+ $ maf_extract -d test/data --interval mm8.chr7:80082592-80082766 --only-species hg18,mm8,rheMac2
+
  ### Extract blocks matching certain conditions
 
  See also the [Cucumber feature][] and [step definitions][] for this.
@@ -176,6 +188,10 @@ n_blocks = access.find(q).count
  # => 1
  ```
 
+ With [`maf_extract(1)`](http://csw.github.com/bioruby-maf/man/maf_extract.1.html):
+
+ $ maf_extract -d test/data --interval mm8.chr7:80082471-80082730 --with-all-species panTro2,loxAfr1
+
  #### Match only blocks with a certain number of sequences
 
  ```ruby
@@ -186,6 +202,10 @@ n_blocks = access.find(q).count
  # => 1
  ```
 
+ With [`maf_extract(1)`](http://csw.github.com/bioruby-maf/man/maf_extract.1.html):
+
+ $ maf_extract -d test/data --interval mm8.chr7:80082767-80083008 --min-sequences 6
+
  #### Match only blocks within a text size range
 
  ```ruby
@@ -196,6 +216,10 @@ n_blocks = access.find(q).count
  # => 3
  ```
 
+ With [`maf_extract(1)`](http://csw.github.com/bioruby-maf/man/maf_extract.1.html):
+
+ $ maf_extract -d test/data --interval mm8.chr7:0-80100000 --min-text-size 72 --max-text-size 160
+
  ### Process each block in a MAF file
 
  ```ruby
@@ -333,6 +357,7 @@ end
  Man pages for command line tools:
 
  * [`maf_index(1)`](http://csw.github.com/bioruby-maf/man/maf_index.1.html)
+ * [`maf_extract(1)`](http://csw.github.com/bioruby-maf/man/maf_extract.1.html)
  * [`maf_to_fasta(1)`](http://csw.github.com/bioruby-maf/man/maf_to_fasta.1.html)
  * [`maf_tile(1)`](http://csw.github.com/bioruby-maf/man/maf_tile.1.html)
 
@@ -377,4 +402,3 @@ This Biogem is published at [biogems.info](http://biogems.info/index.html#bio-ma
  ## Copyright
 
  Copyright (c) 2012 Clayton Wheeler. See LICENSE.txt for further details.
-
data/bin/maf_extract CHANGED
@@ -22,8 +22,11 @@ def handle_list_spec(spec)
  end
 
  def handle_interval_spec(int)
- parts = int.split(':')
- Bio::GenomicInterval.zero_based(parts[0], parts[1].to_i, parts[2].to_i)
+ if int =~ /(.+):(\d+)-(\d+)/
+ Bio::GenomicInterval.zero_based($1, $2.to_i, $3.to_i)
+ else
+ raise "Invalid interval specification: #{int}"
+ end
  end
 
  $op = OptionParser.new do |opts|
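
Note on the hunk above: the new pattern `/(.+):(\d+)-(\d+)/` is unanchored, so a spec with trailing characters (for example `mm8.chr7:1-2xyz`) would still match. The following is a minimal sketch of a stricter parser, not the gem's code. `Interval` and `parse_interval_spec` are hypothetical names, and the `Struct` stands in for `Bio::GenomicInterval` so the example runs without the gem.

```ruby
# Hypothetical stand-in for Bio::GenomicInterval, so the sketch runs without the gem.
Interval = Struct.new(:seq, :start, :end_pos)

# Parse "SEQ:START-END" into an Interval.
# The pattern is anchored with \A and \z, so leading or trailing junk is rejected.
def parse_interval_spec(spec)
  m = /\A(.+):(\d+)-(\d+)\z/.match(spec)
  raise ArgumentError, "Invalid interval specification: #{spec}" unless m
  Interval.new(m[1], m[2].to_i, m[3].to_i)
end

iv = parse_interval_spec("mm8.chr7:80082592-80082766")
puts "#{iv.seq} #{iv.start} #{iv.end_pos}"
# prints: mm8.chr7 80082592 80082766
```

With this sketch, the pre-0.3.2 form `mm8.chr7:80082592:80082766` raises `ArgumentError` instead of being parsed.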
data/bio-maf.gemspec CHANGED
@@ -2,7 +2,7 @@
 
  Gem::Specification.new do |s|
  s.name = "bio-maf"
- s.version = "0.3.1"
+ s.version = "0.3.2"
 
  s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
  s.authors = ["Clayton Wheeler"]
data/man/maf_extract.1 CHANGED
@@ -7,13 +7,13 @@
  \fBmaf_extract\fR \- extract blocks from MAF files
  .
  .SH "SYNOPSIS"
- \fBmaf_extract\fR \-m MAF [\-i INDEX] \-\-interval SEQ:START:END \fIOPTIONS\fR
+ \fBmaf_extract\fR \-m MAF [\-i INDEX] \-\-interval SEQ:START\-END \fIOPTIONS\fR
  .
  .P
  \fBmaf_extract\fR \-m MAF [\-i INDEX] \-\-bed BED \fIOPTIONS\fR
  .
  .P
- \fBmaf_extract\fR \-d MAFDIR \-\-interval SEQ:START:END \fIOPTIONS\fR
+ \fBmaf_extract\fR \-d MAFDIR \-\-interval SEQ:START\-END \fIOPTIONS\fR
  .
  .P
  \fBmaf_extract\fR \-d MAFDIR \-\-bed BED \fIOPTIONS\fR
@@ -69,7 +69,7 @@ The extraction mode to use\. With \fB\-\-mode intersect\fR, any alignment block
  The specified file will be parsed as a BED file, and each interval it contains will be matched in turn\.
  .
  .TP
- \fB\-\-interval SEQ:START:END\fR
+ \fB\-\-interval SEQ:START\-END\fR
  A single zero\-based half\-open genomic interval will be matched, with sequence identifier \fIseq\fR, (inclusive) start position \fIstart\fR, and (exclusive) end position \fIend\fR\.
  .
  .P
@@ -141,7 +141,116 @@ Run verbosely, with additional informational messages\.
  Log debugging information\.
  .
  .SH "EXAMPLES"
- TODO
+ Extract MAF blocks intersecting with a given interval:
+ .
+ .IP "" 4
+ .
+ .nf
+
+ $ maf_extract \-d test/data \-\-interval mm8\.chr7:80082592\-80082766
+ .
+ .fi
+ .
+ .IP "" 0
+ .
+ .P
+ As above, but operating on a single file:
+ .
+ .IP "" 4
+ .
+ .nf
+
+ $ maf_extract \-m test/data/mm8_chr7_tiny\.maf \e
+ \-i test/data/mm8_chr7_tiny\.kct \e
+ \-\-interval mm8\.chr7:80082592\-80082766
+ .
+ .fi
+ .
+ .IP "" 0
+ .
+ .P
+ Like the first case, but writing output to a file:
+ .
+ .IP "" 4
+ .
+ .nf
+
+ $ maf_extract \-d test/data \-\-interval mm8\.chr7:80082592\-80082766 \e
+ \-\-output out\.maf
+ .
+ .fi
+ .
+ .IP "" 0
+ .
+ .P
+ Extract a slice of MAF blocks over a given interval:
+ .
+ .IP "" 4
+ .
+ .nf
+
+ $ maf_extract \-d test/data \-\-mode slice \e
+ \-\-interval mm8\.chr7:80082592\-80082766
+ .
+ .fi
+ .
+ .IP "" 0
+ .
+ .P
+ Filter for sequences from only certain species:
+ .
+ .IP "" 4
+ .
+ .nf
+
+ $ maf_extract \-d test/data \-\-interval mm8\.chr7:80082592\-80082766 \e
+ \-\-only\-species hg18,mm8,rheMac2
+ .
+ .fi
+ .
+ .IP "" 0
+ .
+ .P
+ Extract only blocks with all specified species:
+ .
+ .IP "" 4
+ .
+ .nf
+
+ $ maf_extract \-d test/data \-\-interval mm8\.chr7:80082471\-80082730 \e
+ \-\-with\-all\-species panTro2,loxAfr1
+ .
+ .fi
+ .
+ .IP "" 0
+ .
+ .P
+ Extract blocks with at least a certain number of sequences:
+ .
+ .IP "" 4
+ .
+ .nf
+
+ $ maf_extract \-d test/data \-\-interval mm8\.chr7:80082767\-80083008 \e
+ \-\-min\-sequences 6
+ .
+ .fi
+ .
+ .IP "" 0
+ .
+ .P
+ Extract blocks with text sizes in a certain range:
+ .
+ .IP "" 4
+ .
+ .nf
+
+ $ maf_extract \-d test/data \-\-interval mm8\.chr7:0\-80100000 \e
+ \-\-min\-text\-size 72 \-\-max\-text\-size 160
+ .
+ .fi
+ .
+ .IP "" 0
  .
  .SH "ENVIRONMENT"
  \fBmaf_index\fR is a Ruby program and relies on ordinary Ruby environment variables\.
@@ -3,11 +3,11 @@ maf_extract(1) -- extract blocks from MAF files
 
  ## SYNOPSIS
 
- `maf_extract` -m MAF [-i INDEX] --interval SEQ:START:END [OPTIONS]
+ `maf_extract` -m MAF [-i INDEX] --interval SEQ:START-END [OPTIONS]
 
  `maf_extract` -m MAF [-i INDEX] --bed BED [OPTIONS]
 
- `maf_extract` -d MAFDIR --interval SEQ:START:END [OPTIONS]
+ `maf_extract` -d MAFDIR --interval SEQ:START-END [OPTIONS]
 
  `maf_extract` -d MAFDIR --bed BED [OPTIONS]
 
@@ -79,7 +79,7 @@ Extraction options:
  The specified file will be parsed as a BED file, and each interval
  it contains will be matched in turn.
 
- * `--interval SEQ:START:END`:
+ * `--interval SEQ:START-END`:
  A single zero-based half-open genomic interval will be matched,
  with sequence identifier <seq>, (inclusive) start position <start>,
  and (exclusive) end position <end>.
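
Note on the option text above: in a zero-based half-open interval, START is included and END is not, so the length is END minus START and two intervals that merely touch do not overlap. The sketch below illustrates that convention only. `HalfOpen`, `size`, and `overlaps?` are hypothetical names, not part of bio-maf or `Bio::GenomicInterval`.

```ruby
# Zero-based, half-open interval [start, end_pos): start is included, end_pos is not.
# HalfOpen is a hypothetical illustration type, not a bio-maf class.
HalfOpen = Struct.new(:start, :end_pos) do
  # Number of positions covered by the interval.
  def size
    end_pos - start
  end

  # True when the two intervals share at least one position.
  def overlaps?(other)
    start < other.end_pos && other.start < end_pos
  end
end

query    = HalfOpen.new(80082592, 80082766)
adjacent = HalfOpen.new(80082766, 80082800)

puts query.size                 # 174
puts query.overlaps?(adjacent)  # false: position 80082766 is excluded from query
```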
@@ -153,7 +153,45 @@ Logging options:
 
  ## EXAMPLES
 
- TODO
+ Extract MAF blocks intersecting with a given interval:
+
+ $ maf_extract -d test/data --interval mm8.chr7:80082592-80082766
+
+ As above, but operating on a single file:
+
+ $ maf_extract -m test/data/mm8_chr7_tiny.maf \
+ -i test/data/mm8_chr7_tiny.kct \
+ --interval mm8.chr7:80082592-80082766
+
+ Like the first case, but writing output to a file:
+
+ $ maf_extract -d test/data --interval mm8.chr7:80082592-80082766 \
+ --output out.maf
+
+ Extract a slice of MAF blocks over a given interval:
+
+ $ maf_extract -d test/data --mode slice \
+ --interval mm8.chr7:80082592-80082766
+
+ Filter for sequences from only certain species:
+
+ $ maf_extract -d test/data --interval mm8.chr7:80082592-80082766 \
+ --only-species hg18,mm8,rheMac2
+
+ Extract only blocks with all specified species:
+
+ $ maf_extract -d test/data --interval mm8.chr7:80082471-80082730 \
+ --with-all-species panTro2,loxAfr1
+
+ Extract blocks with at least a certain number of sequences:
+
+ $ maf_extract -d test/data --interval mm8.chr7:80082767-80083008 \
+ --min-sequences 6
+
+ Extract blocks with text sizes in a certain range:
+
+ $ maf_extract -d test/data --interval mm8.chr7:0-80100000 \
+ --min-text-size 72 --max-text-size 160
 
  ## ENVIRONMENT
 
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: bio-maf
  version: !ruby/object:Gem::Version
- version: 0.3.1
+ version: 0.3.2
  prerelease:
  platform: ruby
  authors:
@@ -219,7 +219,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
  version: '0'
  segments:
  - 0
- hash: -1336822573836516057
+ hash: 2092820657742105268
  required_rubygems_version: !ruby/object:Gem::Requirement
  none: false
  requirements: