RubyGems - bio-maf - Versions diffs - 0.1.0 → 0.2.0 - Mend

bio-maf 0.1.0 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (26)

data/.gitignore +53 -0
data/DEVELOPMENT.md +29 -0
data/Gemfile +1 -0
data/README.md +69 -1
data/Rakefile +4 -3
data/bin/find_overlaps +21 -0
data/bin/maf_tile +103 -0
data/bio-maf.gemspec +43 -0
data/features/gap-filling.feature +158 -0
data/features/gap-removal.feature +50 -0
data/features/step_definitions/gap-filling_steps.rb +32 -0
data/features/step_definitions/gap_removal_steps.rb +19 -0
data/features/step_definitions/parse_steps.rb +2 -1
data/lib/bio/maf/index.rb +15 -8
data/lib/bio/maf/maf.rb +267 -0
data/lib/bio/maf/parser.rb +115 -175
data/lib/bio/maf/tiler.rb +167 -0
data/lib/bio/maf.rb +2 -0
data/man/maf_tile.1 +108 -0
data/man/maf_tile.1.ronn +104 -0
data/spec/bio/maf/index_spec.rb +1 -0
data/spec/bio/maf/parser_spec.rb +103 -0
data/spec/bio/maf/tiler_spec.rb +69 -0
data/test/data/gap-sp1.fa +6 -0
data/test/data/mm8_chr7_tiny.kct +0 -0
metadata +58 -3

data/.gitignore ADDED

@@ -0,0 +1,53 @@
+# rcov generated
+coverage
+coverage.data
+# rdoc generated
+rdoc
+# yard generated
+doc
+.yardoc
+# bundler
+.bundle
+# jeweler generated
+pkg
+# Have editor/IDE/OS specific files you need to ignore? Consider using a global gitignore:
+#
+# * Create a file at ~/.gitignore
+# * Include files you want ignored
+# * Run: git config --global core.excludesfile ~/.gitignore
+#
+# After doing this, these files will be ignored in all your git projects,
+# saving you from having to 'pollute' every project you touch with them
+#
+# Not sure what to needs to be ignored for particular editors/OSes? Here's some ideas to get you started. (Remember, remove the leading # of the line)
+#
+# For MacOS:
+#
+#.DS_Store
+# For TextMate
+#*.tmproj
+#tmtags
+# For emacs:
+#*~
+#\#*
+#.\#*
+# For vim:
+#*.swp
+# For redcar:
+#.redcar
+# For rubinius:
+*.rbc
+.rbx
+# Ignore Gemfile.lock for gems. See http://yehudakatz.com/2010/12/16/clarifying-the-roles-of-the-gemspec-and-gemfile/
+Gemfile.lock

data/DEVELOPMENT.md CHANGED

@@ -3,6 +3,35 @@
 Here are notes on less obvious aspects of the development process for
 this library.
+## Gem build / tagging / release
+This now uses [rubygems-tasks][] for building and releasing gems.
+[rubygems-tasks]: https://github.com/postmodern/rubygems-tasks
+We build two gem platform variants: a 'default' one for MRI with no
+platform set, and a JRuby one with `platform = 'java'`. These get
+built as `bio-maf-X.Y.Z.gem` and `bio-maf-X.Y.Z-java.gem`. At least
+for now, this is done by running `gem release` separately under JRuby
+and MRI. SCM tagging and pushing is done under MRI only, but the gems
+will be built and pushed to rubygems.org separately under each
+platform.
+The version is simply set by hand in `bio-maf.gemspec`. Don't forget
+to increment it!
+Testing the build:
+    $ rake build
+    $ rake install
+Release:
+    $ rvm use 1.9.3@bioruby-maf
+    $ rake release
+    $ rvm use jruby-1.6.7.2@bioruby-maf
+    $ rake release
 ## kyotocabinet-java
 Running `bio-maf` on JRuby requires the [kyotocabinet-java][] gem, a

data/Gemfile CHANGED

@@ -13,6 +13,7 @@ group :development do
   gem "redcarpet", "~> 2.1.1", :platforms => :mri
   gem "ronn", "~> 0.7.3", :platforms => :mri
   gem "sinatra", "~> 1.3.2" # for ronn --server
+  gem "rubygems-tasks", "~> 0.2.3"
 end
 group :test do

data/README.md CHANGED

@@ -47,8 +47,29 @@ problems building or using this gem, which is still fairly new.
 ## Installation
+`bio-maf` is now published as a Ruby [gem](https://rubygems.org/gems/bio-maf).
     $ gem install bio-maf
+## Performance
+This parser performs best under [JRuby][], particularly with Java
+7. See the [Performance][] wiki page for more information. For best
+results, use JRuby in 1.9 mode with the ObjectProxyCache disabled:
+[JRuby]: http://jruby.org/
+[Performance]: https://github.com/csw/bioruby-maf/wiki/Performance
+    $ export JRUBY_OPTS='--1.9 -Xji.objectProxyCache=false'
+Many parsing modes are multithreaded. Under JRuby, it will default to
+using one parser thread per available core, but if desired this can be
+configured with the `:threads` parser option.
+Ruby 1.9.3 is fully supported, but does not perform as well,
+especially since its concurrency features are not useful for this
+workload.
 ## Usage
 ### Create an index on a MAF file
@@ -162,6 +183,47 @@ Refer to [`chr22_ieq.maf`](https://github.com/csw/bioruby-maf/blob/master/test/d
     #      @size=1601, @strand=:+, @src_size=50103, @text=nil,
     #      @status="I">
+### Remove gaps from parsed blocks
+After filtering out species with
+[`Parser#sequence_filter`](#filter-species-returned-in-alignment-blocks),
+gaps may be left where there was an insertion present only in
+sequences that were filtered out. Such gaps can be removed by setting
+the `:remove_gaps` parser option:
+    require 'bio-maf'
+    p = Bio::MAF::Parser.new('test/data/chr22_ieq.maf',
+                             :remove_gaps => true)
+### Tile blocks together over an interval
+Extracts alignment blocks overlapping the given genomic interval and
+constructs a single alignment block covering the entire interval for
+the specified species. Optionally, any gaps in coverage of the MAF
+file's reference sequence can be filled in from a FASTA sequence
+file. See the Cucumber [feature][] for examples of output, and also
+the
+[`maf_tile(1)`](http://csw.github.com/bioruby-maf/man/maf_tile.1.html)
+man page.
+[feature]: https://github.com/csw/bioruby-maf/blob/master/features/gap-filling.feature
+    require 'bio-maf'
+    tiler = Bio::MAF::Tiler.new
+    tiler.index = Bio::MAF::KyotoIndex.open('test/data/mm8_chr7_tiny.kct')
+    tiler.parser = Bio::MAF::Parser.new('test/data/mm8_chr7_tiny.maf')
+    # optional
+    tiler.reference = Bio::MAF::FASTARangeReader.new('reference.fa.gz')
+    tiler.species = %w(mm8 rn4 hg18)
+    tiler.species_map = {
+      'mm8' => 'mouse',
+      'rn4' => 'rat',
+      'hg18' => 'human'
+    }
+    tiler.interval = Bio::GenomicInterval.zero_based('mm8.chr7',
+                                                     80082334,
+                                                     80082468)
+    tiler.write_fasta($stdout)
 ### Command line tools
@@ -169,6 +231,12 @@ Man pages for command line tools:
 * [`maf_index(1)`](http://csw.github.com/bioruby-maf/man/maf_index.1.html)
 * [`maf_to_fasta(1)`](http://csw.github.com/bioruby-maf/man/maf_to_fasta.1.html)
+* [`maf_tile(1)`](http://csw.github.com/bioruby-maf/man/maf_tile.1.html)
+With [gem-man](https://github.com/defunkt/gem-man) installed, these
+can be read with:
+    $ gem man bio-maf
 ### Other documentation
@@ -201,7 +269,7 @@ If you use this software, please cite one of
 ## Biogems.info
-This Biogem will be published at [#bio-maf](http://biogems.info/index.html)
+This Biogem is published at [biogems.info](http://biogems.info/index.html#bio-maf).
 ## Copyright

data/Rakefile CHANGED

@@ -10,10 +10,11 @@ rescue Bundler::BundlerError => e
   exit e.status_code
 end
 require 'rake'
-require 'rubygems/package_task'
-$gemspec = Gem::Specification.load("bio-maf.gemspec")
-Gem::PackageTask.new($gemspec) { |pkg| }
+require 'rubygems/tasks'
+# we only want to do the SCM tag/push stuff once, on MRI
+use_scm = (RUBY_PLATFORM != 'java')
+Gem::Tasks.new(:scm => {:tag => use_scm, :push => use_scm})
 require 'rspec/core'
 require 'rspec/core/rake_task'

data/bin/find_overlaps ADDED

@@ -0,0 +1,21 @@
+#!/usr/bin/env ruby
+require 'bio-maf'
+parser = Bio::MAF::Parser.new(ARGV.shift, :threads => 4)
+def desc(seq)
+  "#{seq.source}:#{seq.start}-#{seq.end}"
+end
+open = []
+parser.parse_blocks.each do |block|
+  start_pos = block.ref_seq.start
+  open.delete_if { |open_b| open_b.ref_seq.end <= start_pos }
+  open.each do |ovl|
+    ref_a = ovl.ref_seq
+    ref_b = block.ref_seq
+    puts "#{desc(ref_a)} overlaps #{desc(ref_b)}"
+  end
+  open << block
+end
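
The script above appears to rely on blocks arriving in order of reference start position: it keeps a list of blocks that are still "open", drops those ending at or before the current start, and reports every remaining one as an overlap. A minimal standalone sketch of that sweep follows. `Span` and the `find_overlaps` method are hypothetical stand-ins introduced here for illustration, not part of the bio-maf API, and the sketch assumes half-open coordinates and input sorted by start.

```ruby
# Hypothetical stand-in for a block's reference sequence (not a bio-maf class).
Span = Struct.new(:source, :start, :end)

# Returns [earlier, later] pairs of overlapping spans.
# Assumes spans are sorted by start and use half-open [start, end) coordinates.
def find_overlaps(spans)
  active = []
  pairs = []
  spans.each do |span|
    # Anything ending at or before this start can no longer overlap.
    active.delete_if { |a| a.end <= span.start }
    active.each { |a| pairs << [a, span] }
    active << span
  end
  pairs
end

spans = [
  Span.new('sp1.chr1', 10, 23),
  Span.new('sp1.chr1', 20, 30),
  Span.new('sp1.chr1', 30, 40)
]
find_overlaps(spans).each do |a, b|
  puts "#{a.source}:#{a.start}-#{a.end} overlaps #{b.source}:#{b.start}-#{b.end}"
end
```

With the three spans shown, only the first two overlap; the third starts exactly where the second ends and so is not reported.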

data/bin/maf_tile ADDED

@@ -0,0 +1,103 @@
+#!/usr/bin/env ruby
+require 'optparse'
+require 'ostruct'
+require 'bio-maf'
+require 'bio-genomic-interval'
+options = OpenStruct.new
+options.p = { :threads => 1 }
+options.species = []
+options.species_map = {}
+options.usage = false
+o_parser = OptionParser.new do |opts|
+  opts.banner = "Usage: maf_tile [options] <maf> <index>"
+  opts.separator ""
+  opts.separator "Options:"
+  opts.on("-r", "--reference SEQ", "FASTA reference sequence") do |ref|
+    options.ref = ref
+  end
+  opts.on("-i", "--interval BEGIN:END", "Genomic interval, zero-based") do |int|
+    if int =~ /(\d+):(\d+)/
+      options.interval = ($1.to_i)...($2.to_i)
+    else
+      options.usage = true
+    end
+  end
+  opts.on("-s", "--species SPECIES[:NAME]", "Species to use (with mapped name)") do |sp|
+    if sp =~ /:/
+      species, mapped = sp.split(/:/)
+      options.species << species
+      options.species_map[species] = mapped
+    else
+      options.species << sp
+    end
+  end
+  opts.on("-o", "--output-base BASE", "Base name for output files",
+          "Use stdout for a single interval if not given") do |base|
+    options.output_base = base
+  end
+  opts.on("--bed BED", "BED file specifying intervals",
+          "(requires --output-base)") do |bed|
+    options.bed = bed
+  end
+end
+o_parser.parse!(ARGV)
+maf_p = ARGV.shift
+index_p = ARGV.shift
+unless (! options.usage) \
+  && maf_p && index_p && (! options.species.empty?) \
+  && (options.output_base ? options.bed : options.interval)
+  $stderr.puts o_parser
+  exit 2
+end
+tiler = Bio::MAF::Tiler.new
+tiler.index = Bio::MAF::KyotoIndex.open(index_p)
+tiler.parser = Bio::MAF::Parser.new(maf_p, options.p)
+tiler.reference = Bio::MAF::FASTARangeReader.new(options.ref) if options.ref
+tiler.species = options.species
+tiler.species_map = options.species_map
+def parse_interval(line)
+  src, r_start_s, r_end_s, _ = line.split(nil, 4)
+  r_start = r_start_s.to_i
+  r_end = r_end_s.to_i
+  return Bio::GenomicInterval.zero_based(src, r_start, r_end)
+end
+def target_for(base, interval)
+  path = "#{base}_#{interval.zero_start}-#{interval.zero_end}.fa"
+  File.open(path, 'w')
+end
+if options.bed
+  intervals = []
+  File.open(options.bed) do |bed_f|
+    bed_f.each_line { |line| intervals << parse_interval(line) }
+  end
+  intervals.sort_by! { |int| int.zero_start }
+  intervals.each do |int|
+    tiler.interval = int
+    target = target_for(options.output_base, int)
+    tiler.write_fasta(target)
+    target.close
+  end
+else
+  # single interval
+  tiler.interval = Bio::GenomicInterval.zero_based(tiler.index.ref_seq,
+                                                   options.interval.begin,
+                                                   options.interval.end)
+  if options.output_base
+    target = target_for(options.output_base, tiler.interval)
+  else
+    target = $stdout
+  end
+  tiler.write_fasta(target)
+  target.close
+end

data/bio-maf.gemspec ADDED

@@ -0,0 +1,43 @@
+# -*- encoding: utf-8 -*-
+Gem::Specification.new do |s|
+  s.name = "bio-maf"
+  s.version = "0.2.0"
+  s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
+  s.authors = ["Clayton Wheeler"]
+  s.date = "2012-06-29"
+  s.description = "Multiple Alignment Format parser for BioRuby."
+  s.email = "cswh@umich.edu"
+  s.executables = ["maf_count", "maf_dump_blocks", "maf_extract_ranges_count", "maf_index", "maf_parse_bench", "maf_to_fasta", "maf_write", "random_ranges"]
+  s.extra_rdoc_files = [
+    "LICENSE.txt",
+    "README.md"
+                       ]
+  s.files         = `git ls-files`.split("\n")
+  s.test_files    = `git ls-files -- {test,spec,features}/*`.split("\n")
+  s.executables   = `git ls-files -- bin/*`.split("\n").map {
+    |f| File.basename(f)
+  }
+  s.homepage = "http://github.com/csw/bioruby-maf"
+  s.licenses = ["MIT"]
+  s.require_paths = ["lib"]
+  s.rubygems_version = "1.8.24"
+  s.summary = "MAF parser for BioRuby"
+  s.specification_version = 3
+  if RUBY_PLATFORM == 'java'
+    s.platform = 'java'
+  end
+  s.add_runtime_dependency('bio-bigbio', [">= 0"])
+  s.add_runtime_dependency('bio-genomic-interval', ["~> 0.1.2"])
+  if RUBY_PLATFORM == 'java'
+    s.add_runtime_dependency('kyotocabinet-java', ["~> 0.2.0"])
+  else
+    s.add_runtime_dependency('kyotocabinet-ruby', ["~> 1.27.1"])
+  end
+end

data/features/gap-filling.feature ADDED

@@ -0,0 +1,158 @@
+Feature: Join alignment blocks with reference data
+  In order to produce FASTA output with one sequence per species
+  For use in downstream tools
+  We need to join adjacent MAF blocks together
+  And fill gaps in the reference sequence from reference data
+  Scenario: Non-overlapping MAF blocks in region of interest
+    Given MAF data:
+    """
+    ##maf version=1
+    a score=20.0
+    s sp1.chr1        10 13 +      50 GGGCTGAGGGC--AG
+    s sp2.chr5     53010 13 +   65536 GGGCTGACGGC--AG
+    s sp3.chr2     33010 15 +   65536 AGGTTTAGGGCAGAG
+    a score=21.0
+    s sp1.chr1        30 10 +      50 AGGGCGGTCC
+    s sp2.chr5     53030 10 +   65536 AGGGCGGTGC
+    """
+    And chromosome reference sequence:
+    """
+    >sp1.chr1
+    CCAGGATGCT
+    GGGCTGAGGG
+    CAGTTGTGTC
+    AGGGCGGTCC
+    GGTGCAGGCA
+    """
+    When I open it with a MAF reader
+    And build an index on the reference sequence
+    And tile sp1.chr1:0-50 with the chromosome reference
+    And tile with species [sp1, sp2, sp3]
+    And write the tiled data as FASTA
+    Then the FASTA data obtained should be:
+    """
+    >sp1
+    CCAGGATGCTGGGCTGAGGGC--AGTTGTGTCAGGGCGGTCCGGTGCAGGCA
+    >sp2
+    **********GGGCTGACGGC--AG*******AGGGCGGTGC**********
+    >sp3
+    **********AGGTTTAGGGCAGAG***************************
+    """
+  Scenario: Non-overlapping MAF blocks with species map
+    Given MAF data:
+    """
+    ##maf version=1
+    a score=20.0
+    s sp1.chr1        10 13 +      50 GGGCTGAGGGC--AG
+    s sp2.chr5     53010 13 +   65536 GGGCTGACGGC--AG
+    s sp3.chr2     33010 15 +   65536 AGGTTTAGGGCAGAG
+    a score=21.0
+    s sp1.chr1        30 10 +      50 AGGGCGGTCC
+    s sp2.chr5     53030 10 +   65536 AGGGCGGTGC
+    """
+    And chromosome reference sequence:
+    """
+    >sp1.chr1
+    CCAGGATGCT
+    GGGCTGAGGG
+    CAGTTGTGTC
+    AGGGCGGTCC
+    GGTGCAGGCA
+    """
+    When I open it with a MAF reader
+    And build an index on the reference sequence
+    And tile sp1.chr1:0-50 with the chromosome reference
+    And tile with species [sp1, sp2, sp3]
+    And map species sp1 as mouse
+    And map species sp2 as hippo
+    And map species sp3 as squid
+    And write the tiled data as FASTA
+    Then the FASTA data obtained should be:
+    """
+    >mouse
+    CCAGGATGCTGGGCTGAGGGC--AGTTGTGTCAGGGCGGTCCGGTGCAGGCA
+    >hippo
+    **********GGGCTGACGGC--AG*******AGGGCGGTGC**********
+    >squid
+    **********AGGTTTAGGGCAGAG***************************
+    """
+  Scenario: Subset of non-overlapping MAF blocks in region
+    Given MAF data:
+    """
+    ##maf version=1
+    a score=20.0
+    s sp1.chr1        10 13 +      50 GGGCTGAGGGC--AG
+    s sp2.chr5     53010 13 +   65536 GGGCTGACGGC--AG
+    s sp3.chr2     33010 15 +   65536 AGGTTTAGGGCAGAG
+    a score=21.0
+    s sp1.chr1        30 10 +      50 AGGGCGGTCC
+    s sp2.chr5     53030 10 +   65536 AGGGCGGTGC
+    """
+    And chromosome reference sequence:
+    """
+    >sp1.chr1
+    CCAGGATGCT
+    GGGCTGAGGG
+    CAGTTGTGTC
+    AGGGCGGTCC
+    GGTGCAGGCA
+    """
+    When I open it with a MAF reader
+    And build an index on the reference sequence
+    And tile sp1.chr1:12-36 with the chromosome reference
+    And tile with species [sp1, sp2, sp3]
+    And write the tiled data as FASTA
+    Then the FASTA data obtained should be:
+    """
+    >sp1
+    GCTGAGGGC--AGTTGTGTCAGGGCG
+    >sp2
+    GCTGACGGC--AG*******AGGGCG
+    >sp3
+    GTTTAGGGCAGAG*************
+    """
+  Scenario: Overlapping MAF blocks in region of interest
+    Given MAF data:
+    """
+    ##maf version=1
+    a score=20.0
+    s sp1.chr1        10 13 +      50 GGGCTGAGGGC--AG
+    s sp2.chr5     53010 13 +   65536 GGGCTGACGGC--AG
+    s sp3.chr2     33010 15 +   65536 AGGTTTAGGGCAGAG
+    a score=21.0
+    s sp1.chr1        20 10 +      50 AGGGCGGTCC
+    s sp2.chr5     53020 10 +   65536 AGGGCGGTGC
+    """
+    And chromosome reference sequence:
+    """
+    >sp1.chr1
+    CCAGGATGCT
+    GGGCTGAGGG
+    CAGTTGTGTC
+    AGGGCGGTCC
+    GGTGCAGGCA
+    """
+    When I open it with a MAF reader
+    And build an index on the reference sequence
+    And tile sp1.chr1:0-50 with the chromosome reference
+    And tile with species [sp1, sp2, sp3]
+    And write the tiled data as FASTA
+    Then the FASTA data obtained should be:
+    """
+    >sp1
+    CCAGGATGCTGGGCTGAGGGAGGGCGGTCCAGGGCGGTCCGGTGCAGGCA
+    >sp2
+    **********GGGCTGACGGAGGGCGGTGC********************
+    >sp3
+    **********AGGTTTAGGG******************************
+    """
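
To make the non-overlapping scenarios above concrete, here is a rough sketch of the tiling idea: walk the blocks in reference order, fill uncovered stretches from the reference sequence for the reference species and with `*` for every other species, and pad species missing from a block with `*` across that block's columns. `Block` and `tile` are hypothetical names introduced for illustration; this is not how `Bio::MAF::Tiler` is implemented. The sketch assumes blocks lie entirely inside the interval and do not overlap, so it does not cover the overlapping scenario.

```ruby
# Hypothetical simplified block: reference start, reference size (ungapped),
# and aligned text keyed by species. Not a bio-maf class.
Block = Struct.new(:ref_start, :ref_size, :text)

# Returns [species, tiled_text] pairs for the zero-based interval [from, to).
# Assumes non-overlapping blocks that lie fully inside the interval.
def tile(blocks, ref_species, species, reference, from, to)
  out = species.each_with_object({}) { |sp, h| h[sp] = String.new }
  pos = from
  fill = lambda do |upto|
    n = upto - pos
    if n > 0
      species.each do |sp|
        out[sp] << (sp == ref_species ? reference[pos, n] : '*' * n)
      end
    end
    pos = upto
  end
  blocks.sort_by(&:ref_start).each do |b|
    fill.call(b.ref_start)
    width = b.text[ref_species].size # columns, including gap characters
    species.each { |sp| out[sp] << (b.text[sp] || '*' * width) }
    pos = b.ref_start + b.ref_size
  end
  fill.call(to)
  species.map { |sp| [sp, out[sp]] }
end

reference = 'CCAGGATGCT' 'GGGCTGAGGG' 'CAGTTGTGTC' 'AGGGCGGTCC' 'GGTGCAGGCA'
blocks = [
  Block.new(10, 13, 'sp1' => 'GGGCTGAGGGC--AG',
                    'sp2' => 'GGGCTGACGGC--AG',
                    'sp3' => 'AGGTTTAGGGCAGAG'),
  Block.new(30, 10, 'sp1' => 'AGGGCGGTCC', 'sp2' => 'AGGGCGGTGC')
]
tile(blocks, 'sp1', %w(sp1 sp2 sp3), reference, 0, 50).each do |sp, text|
  puts ">#{sp}", text
end
```

Run on the data from the first scenario, this should print the same three FASTA records that the scenario expects.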

data/features/gap-removal.feature ADDED

@@ -0,0 +1,50 @@
+Feature: Remove gaps from MAF files
+  In order to work with only the alignment data involving sequences
+  Which can be used by downstream software
+  We may want to filter out certain species
+  Which can leave gap regions where sequence data was only present
+  For removed species
+  So it is useful to be able to remove those gaps
+  Background:
+    Given MAF data:
+    """
+    ##maf version=1
+    a score=10542.0
+    s mm8.chr7                 80082334 34 + 145134094 GGGCTGAGGGC--AGGGATGG---AGGGCGGTCC--------------CAGCA-
+    s rn4.chr1                136011785 34 + 267910886 GGGCTGAGGGC--AGGGACGG---AGGGCGGTCC--------------CAGCA-
+    s oryCun1.scaffold_199771     14021 43 -     75077 -----ATGGGC--AAGCGTGG---AGGGGAACCTCTCCTCCCCTCCGACAAAG-
+    s hg18.chr15               88557580 27 + 100338915 --------GGC--AAGTGTGGA--AGGGAAGCCC--------------CAGAA-
+    s panTro2.chr15            87959837 27 + 100063422 --------GGC--AAGTGTGGA--AGGGAAGCCC--------------CAGAA-
+    s rheMac2.chr7             69864714 28 + 169801366 -------GGGC--AAGTATGGA--AGGGAAGCCC--------------CAGAA-
+    s canFam2.chr3             56030570 39 +  94715083 AGGTTTAGGGCAGAGGGATGAAGGAGGAGAATCC--------------CTATG-
+    s dasNov1.scaffold_106893      7435 34 +      9831 GGAACGAGGGC--ATGTGTGG---AGGGGGCTGC--------------CCACA-
+    s loxAfr1.scaffold_8298       30264 38 +     78952 ATGATGAGGGG--AAGCGTGGAGGAGGGGAACCC--------------CTAGGA
+    s echTel1.scaffold_304651       594 37 -     10007 -TGCTATGGCT--TTGTGTCTAGGAGGGGAATCC--------------CCAGGA
+    """
+    When I open it with a MAF reader
+    And filter for only the species
+    | mm8     |
+    | rn4     |
+    | hg18    |
+    | canFam2 |
+    | loxAfr1 |
+  Scenario: Detect filtered blocks
+    When an alignment block can be obtained
+    Then the alignment block is marked as filtered
+    And the alignment block has 5 sequences
+  Scenario: Detect gaps
+    When an alignment block can be obtained
+    Then 1 gap is found with length [14]
+  Scenario: Remove gaps
+    When an alignment block can be obtained
+    And gaps are removed
+    Then the text size of the block is 40
+  Scenario: Remove gaps in the parser
+    When I enable the :remove_gaps parser option
+    And an alignment block can be obtained
+    Then the text size of the block is 40
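
The gap-removal scenarios above come down to finding columns that contain `-` in every remaining sequence and deleting them. A minimal sketch of that idea on plain strings follows. The `find_gaps` and `remove_gaps` methods here are illustrative helpers written for this note, not the library's `Block#find_gaps` or `Block#remove_gaps!`; the `[start, length]` return shape is an assumption based on how the step definitions read `gap[1]` as a size.

```ruby
# Sketch only: plain strings stand in for the aligned text of each sequence.

# Returns [start, length] pairs for each run of columns that hold '-'
# in every row. Assumes rows is non-empty and all rows have equal length.
def find_gaps(rows)
  width = rows.first.size
  gap_cols = (0...width).select { |i| rows.all? { |row| row[i] == '-' } }
  gap_cols.slice_when { |a, b| b != a + 1 }
          .map { |run| [run.first, run.size] }
end

# Returns new rows with every all-gap column removed.
def remove_gaps(rows)
  gaps = find_gaps(rows)
  rows.map do |row|
    out = row.dup
    # Delete right to left so earlier offsets stay valid.
    gaps.reverse_each { |start, len| out.slice!(start, len) }
    out
  end
end

rows = ['AC--GT-', 'AG--GTA', 'A---G--']
p find_gaps(rows)
puts remove_gaps(rows)
```

In this toy input only columns 2 and 3 are gaps in all three rows, so one gap of length 2 is found and each row shrinks from 7 to 5 characters. The last column is kept because the second row has a base there.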

data/features/step_definitions/gap-filling_steps.rb ADDED

@@ -0,0 +1,32 @@
+Given /^chromosome reference sequence:$/ do |string|
+  sio = StringIO.new(string)
+  @refseq = Bio::MAF::FASTARangeReader.new(sio)
+end
+When /^tile ([^:\s]+):(\d+)-(\d+)( with the chromosome reference)?$/ do |seq, i_start, i_end, ref_p|
+  @tiler = Bio::MAF::Tiler.new
+  @tiler.index = @idx
+  @tiler.parser = @parser
+  @tiler.reference = @refseq if ref_p
+  @tiler.interval = Bio::GenomicInterval.zero_based(seq,
+                                                    i_start.to_i,
+                                                    i_end.to_i)
+end
+When /^tile with species \[(.+?)\]$/ do |species_text|
+  @tiler.species = species_text.split(/,\s*/)
+end
+When /^map species (\S+) as (\S+)$/ do |sp1, sp2|
+  @tiler.species_map[sp1] = sp2
+end
+When /^write the tiled data as FASTA$/ do
+  @dst = Tempfile.new(["cuke", ".fa"])
+  @tiler.write_fasta(@dst)
+end
+Then /^the FASTA data obtained should be:$/ do |string|
+  @dst.seek(0)
+  @dst.read.rstrip.should == string.rstrip
+end

data/features/step_definitions/gap_removal_steps.rb ADDED

@@ -0,0 +1,19 @@
+Then /^the alignment block is marked as filtered$/ do
+  @block.filtered?.should be_true
+end
+Then /^(\d+) gaps? (?:is|are) found with length \[(\d+)\]$/ do |n_gaps, gap_sizes_s|
+  gaps = @block.find_gaps
+  gaps.size.should == n_gaps.to_i
+  e_gap_sizes = gap_sizes_s.split(/,\s*/).collect { |n| n.to_i }
+  gap_sizes = gaps.collect { |gap| gap[1] }
+  gap_sizes.should == e_gap_sizes
+end
+When /^gaps are removed$/ do
+  @block.remove_gaps!
+end
+Then /^the text size of the block is (\d+)$/ do |e_text_size|
+  @block.text_size.should == e_text_size.to_i
+end

data/features/step_definitions/parse_steps.rb CHANGED

@@ -1,5 +1,6 @@
 When /^I open it with a MAF reader$/ do
-  @parser = Bio::MAF::Parser.new(@src_f, @opts || {})
+  @opts ||= {}
+  @parser = Bio::MAF::Parser.new(@src_f, @opts)
 end
 When /^I enable the :(\S+) parser option$/ do |opt_s|