RubyGems - loofah - Versions diffs - 0.4.6 → 0.4.7 - Mend

loofah 0.4.6 → 0.4.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

This version of loofah might be problematic.

Files changed (20)

data.tar.gz.sig +0 -0
data/CHANGELOG.rdoc +31 -19
data/Manifest.txt +4 -0
data/README.rdoc +7 -0
data/lib/loofah.rb +4 -1
data/lib/loofah/elements.rb +19 -0
data/lib/loofah/helpers.rb +7 -0
data/lib/loofah/html/document.rb +4 -9
data/lib/loofah/html/document_fragment.rb +6 -15
data/lib/loofah/html5/whitelist.rb +1 -7
data/lib/loofah/instance_methods.rb +54 -4
data/lib/loofah/metahelpers.rb +15 -0
data/lib/loofah/scrubbers.rb +19 -5
data/test/integration/test_ad_hoc.rb +0 -75
data/test/integration/test_html.rb +51 -0
data/test/integration/test_scrubbers.rb +37 -0
data/test/integration/test_xml.rb +55 -0
data/test/unit/test_api.rb +12 -2
metadata +85 -26
metadata.gz.sig +0 -0

data.tar.gz.sig CHANGED

Binary file

data/CHANGELOG.rdoc CHANGED

@@ -1,60 +1,72 @@
 = Changelog
+== 0.4.7 (2010-03-09)
+Enhancements:
+* New methods Loofah::HTML::Document#to_text and
+  Loofah::HTML::DocumentFragment#to_text do the right thing with
+  whitespace. Note that these methods are significantly slower than
+  #text. GH #12
+* Loofah::Elements::BLOCK_LEVEL contains a canonical list of HTML4 block-level elements.
+* Loofah::HTML::Document#text and Loofah::HTML::DocumentFragment#text
+  will return unescaped HTML entities by passing :encode_special_chars => false.
 == 0.4.4, 0.4.5, 0.4.6 (2010-02-01)
 Enhancements:
-  * Loofah::HTML::Document#text and Loofah::HTML::DocumentFragment#text now escape HTML entities.
+* Loofah::HTML::Document#text and Loofah::HTML::DocumentFragment#text now escape HTML entities.
 Bug fixes:
-  * Loofah::XssFoliate was not properly escaping HTML entities when implicitly scrubbing a string attribute. GH #17
+* Loofah::XssFoliate was not properly escaping HTML entities when implicitly scrubbing a string attribute. GH #17
 == 0.4.3 (2010-01-29)
 Enhancements:
-  * All built-in scrubbers are accepted by ActiveRecord::Base.xss_foliate
-  * Loofah::XssFoliate.xss_foliate_all_models replaces use of the constant LOOFAH_XSS_FOLIATE_ALL_MODELS
+* All built-in scrubbers are accepted by ActiveRecord::Base.xss_foliate
+* Loofah::XssFoliate.xss_foliate_all_models replaces use of the constant LOOFAH_XSS_FOLIATE_ALL_MODELS
 Miscellaneous:
-  * Modified documentation for bootstrapping XssFoliate in a Rails
-    app, since the use of Bundler breaks the previously-documented
-    method. To be safe, always use an initializer file.
+* Modified documentation for bootstrapping XssFoliate in a Rails app,
+  since the use of Bundler breaks the previously-documented method. To
+  be safe, always use an initializer file.
 == 0.4.2 (2010-01-22)
 Enhancements:
-  * Implemented Node#scrub! for scrubbing subtrees.
-  * Implemented NodeSet#scrub! for scrubbing a set of subtrees.
-  * Document.text now only serializes <body> contents (ignores <head>)
-  * <head>, <html> and <body> added to the HTML5lib whitelist.
+* Implemented Node#scrub! for scrubbing subtrees.
+* Implemented NodeSet#scrub! for scrubbing a set of subtrees.
+* Document.text now only serializes <body> contents (ignores <head>)
+* <head>, <html> and <body> added to the HTML5lib whitelist.
 Bug fixes:
-  * Supporting Rails apps that aren't loading ActiveRecord. GH #10
+* Supporting Rails apps that aren't loading ActiveRecord. GH #10
 Miscellaneous:
-  * Mailing list is now loofah@librelist.com / http://librelist.com
-  * IRC channel is now \#loofah on freenode.
+* Mailing list is now loofah@librelist.com / http://librelist.com
+* IRC channel is now \#loofah on freenode.
 == 0.4.1 (2009-11-23)
 Bugfix:
-  * Manifest fixed. Whoops.
+* Manifest fixed. Whoops.
 == 0.4.0 (2009-11-21)
 Enhancements:
-  * Scrubber class introduced, allowing development of custom scrubbers.
-  * Added support for XML documents and fragments.
-  * Added :nofollow HTML scrubber (thanks Luke Melia!)
-  * Built-in scrubbing methods refactored to use Scrubber.
+* Scrubber class introduced, allowing development of custom scrubbers.
+* Added support for XML documents and fragments.
+* Added :nofollow HTML scrubber (thanks Luke Melia!)
+* Built-in scrubbing methods refactored to use Scrubber.
 == 0.3.1 (2009-10-12)

data/Manifest.txt CHANGED

@@ -12,12 +12,14 @@ benchmark/www.slashdot.com.html
 init.rb
 lib/loofah.rb
 lib/loofah/active_record.rb
+lib/loofah/elements.rb
 lib/loofah/helpers.rb
 lib/loofah/html/document.rb
 lib/loofah/html/document_fragment.rb
 lib/loofah/html5/scrub.rb
 lib/loofah/html5/whitelist.rb
 lib/loofah/instance_methods.rb
+lib/loofah/metahelpers.rb
 lib/loofah/scrubber.rb
 lib/loofah/scrubbers.rb
 lib/loofah/xml/document.rb
@@ -27,7 +29,9 @@ test/helper.rb
 test/html5/test_sanitizer.rb
 test/integration/test_ad_hoc.rb
 test/integration/test_helpers.rb
+test/integration/test_html.rb
 test/integration/test_scrubbers.rb
+test/integration/test_xml.rb
 test/unit/test_active_record.rb
 test/unit/test_api.rb
 test/unit/test_helpers.rb

data/README.rdoc CHANGED

@@ -120,6 +120,13 @@ and +text+ to return plain text:
   doc.text    # => "ohai! div is safe "
+Also, +to_text+ is available, which does the right thing with
+whitespace around block-level elements.
+  doc = Loofah.fragment("<h1>Title</h1><div>Content</div>")
+  doc.text    # => "TitleContent"           # probably not what you want
+  doc.to_text # => "\nTitle\n\nContent\n"   # better
 === Loofah::XML::Document and Loofah::XML::DocumentFragment
 These classes are subclasses of Nokogiri::XML::Document and

data/lib/loofah.rb CHANGED

@@ -2,6 +2,9 @@ $LOAD_PATH.unshift(File.expand_path(File.dirname(__FILE__))) unless $LOAD_PATH.i
 require 'nokogiri'
+require 'loofah/metahelpers'
+require 'loofah/elements'
 require 'loofah/html5/whitelist'
 require 'loofah/html5/scrub'
@@ -26,7 +29,7 @@ require 'loofah/helpers'
 #
 module Loofah
   # The version of Loofah you are using
-  VERSION = '0.4.6'
+  VERSION = '0.4.7'
   # The minimum required version of Nokogiri
   REQUIRED_NOKOGIRI_VERSION = '1.3.3'

data/lib/loofah/elements.rb ADDED

@@ -0,0 +1,19 @@
+module Loofah
+  module Elements
+    # Block elements in HTML4
+    STRICT_BLOCK_LEVEL = %w[address blockquote center dir div dl
+      fieldset form h1 h2 h3 h4 h5 h6 hr isindex menu noframes
+      noscript ol p pre table ul]
+    # The following elements may also be considered block-level elements since they may contain block-level elements
+    LOOSE_BLOCK_LEVEL = %w[dd dt frameset li tbody td tfoot th thead tr]
+    BLOCK_LEVEL = STRICT_BLOCK_LEVEL + LOOSE_BLOCK_LEVEL
+  end
+  module HashedElements
+    include Loofah::MetaHelpers::HashifiedConstants(Elements)
+  end
+end

data/lib/loofah/helpers.rb CHANGED

@@ -18,6 +18,13 @@ module Loofah
       def sanitize(string_or_io)
         Loofah.scrub_fragment(string_or_io, :strip).to_s
       end
+      #
+      #  A helper to remove extraneous whitespace from text-ified HTML
+      #
+      def remove_extraneous_whitespace(string)
+        string.gsub(/\n\s*\n\s*\n/,"\n\n")
+      end
     end
   end
 end
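
The whitespace helper added above is a single regular-expression substitution, so its behaviour can be checked without Loofah installed. The following is a minimal sketch that copies the same pattern into a standalone method; the name `squeeze_blank_lines` is hypothetical and used only for illustration.

```ruby
# Standalone copy of the substitution used by the new helper.
# `squeeze_blank_lines` is a hypothetical name, not part of Loofah.
def squeeze_blank_lines(string)
  # Three or more newlines, optionally separated by other whitespace,
  # collapse to exactly two newlines (one blank line).
  string.gsub(/\n\s*\n\s*\n/, "\n\n")
end

# Input taken from the new test in test/integration/test_html.rb,
# after the block-element newlines have been added.
p squeeze_blank_lines("\ntweedle\n\n\t\n \nbeetle\n")  # => "\ntweedle\n\nbeetle\n"

# Two consecutive newlines do not match the pattern and are left alone.
p squeeze_blank_lines("a\n\nb")                        # => "a\n\nb"

# Four consecutive newlines collapse to two.
p squeeze_blank_lines("a\n\n\n\nb")                    # => "a\n\nb"
```

Because the pattern needs at least three newlines to match, a single blank line between paragraphs survives unchanged.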

data/lib/loofah/html/document.rb CHANGED

@@ -3,21 +3,16 @@ module Loofah
     #
     #  Subclass of Nokogiri::HTML::Document.
     #
-    #  See Loofah::ScrubBehavior and Loofah::DocumentDecorator for additional methods.
+    #  See Loofah::ScrubBehavior and Loofah::TextBehavior for additional methods.
     #
     class Document < Nokogiri::HTML::Document
       include Loofah::ScrubBehavior::Node
       include Loofah::DocumentDecorator
+      include Loofah::TextBehavior
-      #
-      #  Returns a plain-text version of the markup contained by the document,
-      #  with HTML entities encoded.
-      #
-      def text
-        encode_special_chars xpath("/html/body").inner_text
+      def serialize_root
+        at_xpath("/html/body")
       end
-      alias :inner_text :text
-      alias :to_str     :text
     end
   end
 end

data/lib/loofah/html/document_fragment.rb CHANGED

@@ -3,9 +3,11 @@ module Loofah
     #
     #  Subclass of Nokogiri::HTML::DocumentFragment.
     #
-    #  See Loofah::ScrubBehavior for additional methods.
+    #  See Loofah::ScrubBehavior and Loofah::TextBehavior for additional methods.
     #
     class DocumentFragment < Nokogiri::HTML::DocumentFragment
+      include Loofah::TextBehavior
       class << self
         #
         #  Overridden Nokogiri::HTML::DocumentFragment
@@ -21,23 +23,12 @@ module Loofah
       #  Returns the HTML markup contained by the fragment
       #
       def to_s
-        serialize_roots.children.to_s
+        serialize_root.children.to_s
       end
       alias :serialize :to_s
-      #
-      #  Returns a plain-text version of the markup contained by the fragment
-      #
-      def text
-        encode_special_chars serialize_roots.children.inner_text
-      end
-      alias :inner_text :text
-      alias :to_str     :text
-      private
-      def serialize_roots # :nodoc:
-        xpath("./body").first || self
+      def serialize_root
+        at_xpath("./body") || self
       end
     end
   end

data/lib/loofah/html5/whitelist.rb CHANGED

@@ -162,13 +162,7 @@ module Loofah
     #  The HTML5lib whitelist arrays, transformed into hashes for faster lookup.
     #
     module HashedWhiteList
-      WhiteList.constants.each do |constant|
-        next unless WhiteList.module_eval("#{constant}").is_a?(Array)
-        module_eval <<-CODE
-        #{constant} = {}
-        WhiteList::#{constant}.each { |c| #{constant}[c] = true ; #{constant}[c.downcase] = true }
-      CODE
-      end
+      include Loofah::MetaHelpers::HashifiedConstants(WhiteList)
     end
   end
 end

data/lib/loofah/instance_methods.rb CHANGED

@@ -27,8 +27,7 @@ module Loofah
   #  README.rdoc for more example usage.
   #
   module ScrubBehavior
-    # see Loofah::ScrubBehavior
-    module Node
+    module Node # :nodoc:
       def scrub!(scrubber)
         #
         #  yes. this should be three separate methods. but nokogiri
@@ -50,8 +49,7 @@ module Loofah
       end
     end
-    # see Loofah::ScrubBehavior
-    module NodeSet
+    module NodeSet # :nodoc:
       def scrub!(scrubber)
         each { |node| node.scrub!(scrubber) }
         self
@@ -67,6 +65,58 @@ module Loofah
     end
   end
+  #
+  #  Overrides +text+ in HTML::Document and HTML::DocumentFragment,
+  #  and mixes in +to_text+.
+  #
+  module TextBehavior
+    #
+    #  Returns a plain-text version of the markup contained by the document,
+    #  with HTML entities encoded.
+    #
+    #  This method is significantly faster than #to_text, but isn't
+    #  clever about whitespace around block elements.
+    #
+    #    Loofah.document("<h1>Title</h1><div>Content</div>").text
+    #    # => "TitleContent"
+    #
+    #  By default, the returned text will have HTML entities
+    #  escaped. If you want unescaped entities, and you understand
+    #  that the result is unsafe to render in a browser, then you
+    #  can pass an argument as shown:
+    #
+    #    frag = Loofah.fragment("&lt;script&gt;alert('EVIL');&lt;/script&gt;")
+    #    # ok for browser:
+    #    frag.text                                 # => "&lt;script&gt;alert('EVIL');&lt;/script&gt;"
+    #    # decidedly not ok for browser:
+    #    frag.text(:encode_special_chars => false) # => "<script>alert('EVIL');</script>"
+    #
+    def text(options={})
+      result = serialize_root.children.inner_text rescue ""
+      if options[:encode_special_chars] == false
+        result # possibly dangerous if rendered in a browser
+      else
+        encode_special_chars result
+      end
+    end
+    alias :inner_text :text
+    alias :to_str     :text
+    #
+    #  Returns a plain-text version of the markup contained by the
+    #  fragment, with HTML entities encoded.
+    #
+    #  This method is slower than #text, but is clever about
+    #  whitespace around block elements.
+    #
+    #    Loofah.document("<h1>Title</h1><div>Content</div>").to_text
+    #    # => "\nTitle\n\nContent\n"
+    #
+    def to_text(options={})
+      Loofah::Helpers.remove_extraneous_whitespace self.dup.scrub!(:newline_block_elements).text(options)
+    end
+  end
   module DocumentDecorator # :nodoc:
     def initialize(*args, &block)
       super

data/lib/loofah/metahelpers.rb ADDED

@@ -0,0 +1,15 @@
+module Loofah
+  module MetaHelpers
+    def self.HashifiedConstants(orig_module)
+      hashed_module = Module.new
+      orig_module.constants.each do |constant|
+        next unless orig_module.module_eval("#{constant}").is_a?(Array)
+        hashed_module.module_eval <<-CODE
+          #{constant} = {}
+          #{orig_module.name}::#{constant}.each { |c| #{constant}[c] = true ; #{constant}[c.downcase] = true }
+        CODE
+      end
+      hashed_module
+    end
+  end
+end
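
`HashifiedConstants` builds an anonymous module in which every Array constant of the original module is mirrored as a Hash, so membership checks become hash lookups. The sketch below shows the same idea using `const_get` and `const_set` instead of string `module_eval`; this is a swapped-in technique chosen for readability, not the code in the diff. `Sketch`, `Tags`, `HashedTags` and `hashified_constants` are hypothetical names.

```ruby
module Sketch
  # Hypothetical stand-in for a module of Array constants such as Loofah::Elements.
  module Tags
    BLOCK = %w[div P table]
    LABEL = "not an array, so it is skipped"
  end

  # Returns an anonymous module whose constants mirror the Array constants
  # of +orig_module+ as Hashes keyed by each entry and its downcased form.
  def self.hashified_constants(orig_module)
    hashed = Module.new
    orig_module.constants.each do |name|
      value = orig_module.const_get(name)
      next unless value.is_a?(Array)
      lookup = {}
      value.each { |c| lookup[c] = true; lookup[c.downcase] = true }
      hashed.const_set(name, lookup)
    end
    hashed
  end

  # Including the generated module makes its constants visible here.
  module HashedTags
    include Sketch.hashified_constants(Tags)
  end
end

p Sketch::HashedTags::BLOCK["div"]   # => true
p Sketch::HashedTags::BLOCK["p"]     # => true (downcased copy of "P")
p Sketch::HashedTags::BLOCK["span"]  # => nil
```

Non-Array constants are skipped, which is why `LABEL` has no counterpart in `HashedTags`.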

data/lib/loofah/scrubbers.rb CHANGED

@@ -58,7 +58,6 @@ module Loofah
   #     Loofah.fragment(link_farmers_markup).scrub!(:nofollow)
   #     => "ohai! <a href='http://www.myswarmysite.com/' rel="nofollow">I like your blog post</a>"
   #
-  #
   module Scrubbers
     #
     #  === scrub!(:strip)
@@ -184,15 +183,30 @@ module Loofah
       end
     end
+    # This class probably isn't useful publicly, but is used for #to_text's current implementation
+    class NewlineBlockElements < Scrubber # :nodoc:
+      def initialize
+        @direction = :bottom_up
+      end
+      def scrub(node)
+        return CONTINUE unless Loofah::HashedElements::BLOCK_LEVEL[node.name]
+        replacement_killer = Nokogiri::XML::Text.new("\n#{node.content}\n", node.document)
+        node.add_next_sibling replacement_killer
+        node.remove
+      end
+    end
     #
     #  A hash that maps a symbol (like +:prune+) to the appropriate Scrubber (Loofah::Scrubbers::Prune).
     #
     MAP = {
-      :escape => Escape,
-      :prune => Prune,
+      :escape    => Escape,
+      :prune     => Prune,
       :whitewash => Whitewash,
-      :strip => Strip,
-      :nofollow => NoFollow
+      :strip     => Strip,
+      :nofollow  => NoFollow,
+      :newline_block_elements => NewlineBlockElements
     }
     #

data/test/integration/test_ad_hoc.rb CHANGED

@@ -16,81 +16,6 @@ class TestAdHoc < Test::Unit::TestCase
     end
   end
-  context "integration test" do
-    context "xml document" do
-      context "custom scrubber" do
-        should "act as expected" do
-          xml = Loofah.xml_document <<-EOXML
-            <root>
-              <employee deceased='true'>Abraham Lincoln</employee>
-              <employee deceased='false'>Abe Vigoda</employee>
-            </root>
-          EOXML
-          bring_out_your_dead = Loofah::Scrubber.new do |node|
-            if node.name == "employee" and node["deceased"] == "true"
-              node.remove
-              Loofah::Scrubber::STOP # don't bother with the rest of the subtree
-            end
-          end
-          assert_equal 2, xml.css("employee").length
-          xml.scrub!(bring_out_your_dead)
-          employees = xml.css "employee"
-          assert_equal 1, employees.length
-          assert_equal "Abe Vigoda", employees.first.inner_text
-        end
-      end
-    end
-    context "xml fragment" do
-      context "custom scrubber" do
-        should "act as expected" do
-          xml = Loofah.xml_fragment <<-EOXML
-            <employee deceased='true'>Abraham Lincoln</employee>
-            <employee deceased='false'>Abe Vigoda</employee>
-          EOXML
-          bring_out_your_dead = Loofah::Scrubber.new do |node|
-            if node.name == "employee" and node["deceased"] == "true"
-              node.remove
-              Loofah::Scrubber::STOP # don't bother with the rest of the subtree
-            end
-          end
-          assert_equal 2, xml.css("employee").length
-          xml.scrub!(bring_out_your_dead)
-          employees = xml.css "employee"
-          assert_equal 1, employees.length
-          assert_equal "Abe Vigoda", employees.first.inner_text
-        end
-      end
-    end
-    context "html fragment" do
-      context "#to_s" do
-        should "not include head tags (like style)" do
-          html = Loofah.fragment "<style>foo</style><div>bar</div>"
-          assert_equal "<div>bar</div>", html.to_s
-        end
-      end
-      context "#text" do
-        should "not include head tags (like style)" do
-          html = Loofah.fragment "<style>foo</style><div>bar</div>"
-          assert_equal "bar", html.text
-        end
-      end
-    end
-    context "html document" do
-      should "not include head tags (like style)" do
-        html = Loofah.document "<style>foo</style><div>bar</div>"
-        assert_equal "bar", html.text
-      end
-    end
-  end
   def test_removal_of_illegal_tag
     html = <<-HTML
       following this there should be no jim tag

data/test/integration/test_html.rb ADDED

@@ -0,0 +1,51 @@
+require File.expand_path(File.join(File.dirname(__FILE__), '..', 'helper'))
+class TestHtml < Test::Unit::TestCase
+  context "html fragment" do
+    context "#to_s" do
+      should "not include head tags (like style)" do
+        html = Loofah.fragment "<style>foo</style><div>bar</div>"
+        assert_equal "<div>bar</div>", html.to_s
+      end
+    end
+    context "#text" do
+      should "not include head tags (like style)" do
+        html = Loofah.fragment "<style>foo</style><div>bar</div>"
+        assert_equal "bar", html.text
+      end
+    end
+    context "#to_text" do
+      should "add newlines before and after block elements" do
+        html = Loofah.fragment "<div>tweedle<h1>beetle</h1>bottle<span>puddle</span>paddle<div>battle</div>muddle</div>"
+        assert_equal "\ntweedle\nbeetle\nbottlepuddlepaddle\nbattle\nmuddle\n", html.to_text
+      end
+      should "remove extraneous whitespace" do
+        html = Loofah.fragment "<div>tweedle\n\n\t\n\s\nbeetle</div>"
+        assert_equal "\ntweedle\n\nbeetle\n", html.to_text
+      end
+    end
+  end
+  context "html document" do
+    should "not include head tags (like style)" do
+      html = Loofah.document "<style>foo</style><div>bar</div>"
+      assert_equal "bar", html.text
+    end
+    context "#to_text" do
+      should "add newlines before and after block elements" do
+        html = Loofah.document "<div>tweedle<h1>beetle</h1>bottle<span>puddle</span>paddle<div>battle</div>muddle</div>"
+        assert_equal "\ntweedle\nbeetle\nbottlepuddlepaddle\nbattle\nmuddle\n", html.to_text
+      end
+      should "remove extraneous whitespace" do
+        html = Loofah.document "<div>tweedle\n\n\t\n\s\nbeetle</div>"
+        assert_equal "\ntweedle\n\nbeetle\n", html.to_text
+      end
+    end
+  end
+end

data/test/integration/test_scrubbers.rb CHANGED

@@ -18,6 +18,7 @@ class TestScrubbers < Test::Unit::TestCase
   ENTITY_HACK_ATTACK            = "<div><div>Hack attack!</div><div>&lt;script&gt;alert('evil')&lt;/script&gt;</div></div>"
   ENTITY_HACK_ATTACK_TEXT_SCRUB = "Hack attack!&lt;script&gt;alert('evil')&lt;/script&gt;"
+  ENTITY_HACK_ATTACK_TEXT_SCRUB_UNESC = "Hack attack!<script>alert('evil')</script>"
   context "Document" do
     context "#scrub!" do
@@ -89,6 +90,24 @@ class TestScrubbers < Test::Unit::TestCase
         assert_equal ENTITY_HACK_ATTACK_TEXT_SCRUB, result
       end
+      context "with encode_special_chars => false" do
+        should "leave behind only inner text with html entities unescaped" do
+          doc = Loofah::HTML::Document.parse "<html><body>#{ENTITY_HACK_ATTACK}</body></html>"
+          result = doc.text(:encode_special_chars => false)
+          assert_equal ENTITY_HACK_ATTACK_TEXT_SCRUB_UNESC, result
+        end
+      end
+      context "with encode_special_chars => true" do
+        should "leave behind only inner text with html entities still escaped" do
+          doc = Loofah::HTML::Document.parse "<html><body>#{ENTITY_HACK_ATTACK}</body></html>"
+          result = doc.text(:encode_special_chars => true)
+          assert_equal ENTITY_HACK_ATTACK_TEXT_SCRUB, result
+        end
+      end
     end
     context "#to_s" do
@@ -239,6 +258,24 @@ class TestScrubbers < Test::Unit::TestCase
         assert_equal ENTITY_HACK_ATTACK_TEXT_SCRUB, result
       end
+      context "with encode_special_chars => false" do
+        should "leave behind only inner text with html entities unescaped" do
+          doc = Loofah::HTML::DocumentFragment.parse "<div>#{ENTITY_HACK_ATTACK}</div>"
+          result = doc.text(:encode_special_chars => false)
+          assert_equal ENTITY_HACK_ATTACK_TEXT_SCRUB_UNESC, result
+        end
+      end
+      context "with encode_special_chars => true" do
+        should "leave behind only inner text with html entities still escaped" do
+          doc = Loofah::HTML::DocumentFragment.parse "<div>#{ENTITY_HACK_ATTACK}</div>"
+          result = doc.text(:encode_special_chars => true)
+          assert_equal ENTITY_HACK_ATTACK_TEXT_SCRUB, result
+        end
+      end
     end
     context "#to_s" do

data/test/integration/test_xml.rb ADDED

@@ -0,0 +1,55 @@
+require File.expand_path(File.join(File.dirname(__FILE__), '..', 'helper'))
+class TestXml < Test::Unit::TestCase
+  context "integration test" do
+    context "xml document" do
+      context "custom scrubber" do
+        should "act as expected" do
+          xml = Loofah.xml_document <<-EOXML
+            <root>
+              <employee deceased='true'>Abraham Lincoln</employee>
+              <employee deceased='false'>Abe Vigoda</employee>
+            </root>
+          EOXML
+          bring_out_your_dead = Loofah::Scrubber.new do |node|
+            if node.name == "employee" and node["deceased"] == "true"
+              node.remove
+              Loofah::Scrubber::STOP # don't bother with the rest of the subtree
+            end
+          end
+          assert_equal 2, xml.css("employee").length
+          xml.scrub!(bring_out_your_dead)
+          employees = xml.css "employee"
+          assert_equal 1, employees.length
+          assert_equal "Abe Vigoda", employees.first.inner_text
+        end
+      end
+    end
+    context "xml fragment" do
+      context "custom scrubber" do
+        should "act as expected" do
+          xml = Loofah.xml_fragment <<-EOXML
+            <employee deceased='true'>Abraham Lincoln</employee>
+            <employee deceased='false'>Abe Vigoda</employee>
+          EOXML
+          bring_out_your_dead = Loofah::Scrubber.new do |node|
+            if node.name == "employee" and node["deceased"] == "true"
+              node.remove
+              Loofah::Scrubber::STOP # don't bother with the rest of the subtree
+            end
+          end
+          assert_equal 2, xml.css("employee").length
+          xml.scrub!(bring_out_your_dead)
+          employees = xml.css "employee"
+          assert_equal 1, employees.length
+          assert_equal "Abe Vigoda", employees.first.inner_text
+        end
+      end
+    end
+  end
+end

data/test/unit/test_api.rb CHANGED

@@ -81,13 +81,13 @@ class TestApi < Test::Unit::TestCase
   end
   def test_loofah_xml_document_node_scrub!
-    doc = Loofah.document(XML)
+    doc = Loofah.xml_document(XML)
     assert(node = doc.at_css("div"))
     node.scrub!(:strip)
   end
   def test_loofah_xml_fragment_node_scrub!
-    doc = Loofah.fragment(XML)
+    doc = Loofah.xml_fragment(XML)
     assert(node = doc.at_css("div"))
     node.scrub!(:strip)
   end
@@ -99,6 +99,16 @@ class TestApi < Test::Unit::TestCase
     node_set.scrub!(:strip)
   end
+  should "HTML::DocumentFragment exposes serialize_root" do
+    doc = Loofah.fragment(HTML)
+    assert_equal HTML, doc.serialize_root.to_html
+  end
+  should "HTML::Document exposes serialize_root" do
+    doc = Loofah.document(HTML)
+    assert_equal HTML, doc.serialize_root.children.to_html
+  end
   private
   def assert_html_documentish(doc)

metadata CHANGED

@@ -1,7 +1,12 @@
 --- !ruby/object:Gem::Specification
 name: loofah
 version: !ruby/object:Gem::Version
-  version: 0.4.6
+  prerelease: false
+  segments:
+  - 0
+  - 4
+  - 7
+  version: 0.4.7
 platform: ruby
 authors:
 - Mike Dalessio
@@ -31,59 +36,105 @@ cert_chain:
   FlqnTjy13J3nD30uxy9a1g==
   -----END CERTIFICATE-----
-date: 2010-02-02 00:00:00 -05:00
+date: 2010-03-09 00:00:00 -05:00
 default_executable:
 dependencies:
 - !ruby/object:Gem::Dependency
   name: nokogiri
-  type: :runtime
-  version_requirement:
-  version_requirements: !ruby/object:Gem::Requirement
+  prerelease: false
+  requirement: &id001 !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
+        segments:
+        - 1
+        - 3
+        - 3
         version: 1.3.3
-    version:
+  type: :runtime
+  version_requirements: *id001
 - !ruby/object:Gem::Dependency
-  name: mocha
+  name: rubyforge
+  prerelease: false
+  requirement: &id002 !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        segments:
+        - 2
+        - 0
+        - 3
+        version: 2.0.3
+  type: :development
+  version_requirements: *id002
+- !ruby/object:Gem::Dependency
+  name: gemcutter
+  prerelease: false
+  requirement: &id003 !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        segments:
+        - 0
+        - 3
+        - 0
+        version: 0.3.0
   type: :development
-  version_requirement:
-  version_requirements: !ruby/object:Gem::Requirement
+  version_requirements: *id003
+- !ruby/object:Gem::Dependency
+  name: mocha
+  prerelease: false
+  requirement: &id004 !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
+        segments:
+        - 0
+        - 9
         version: "0.9"
-    version:
+  type: :development
+  version_requirements: *id004
 - !ruby/object:Gem::Dependency
   name: thoughtbot-shoulda
-  type: :development
-  version_requirement:
-  version_requirements: !ruby/object:Gem::Requirement
+  prerelease: false
+  requirement: &id005 !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
+        segments:
+        - 2
+        - 10
         version: "2.10"
-    version:
+  type: :development
+  version_requirements: *id005
 - !ruby/object:Gem::Dependency
   name: acts_as_fu
-  type: :development
-  version_requirement:
-  version_requirements: !ruby/object:Gem::Requirement
+  prerelease: false
+  requirement: &id006 !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
+        segments:
+        - 0
+        - 0
+        - 5
         version: 0.0.5
-    version:
+  type: :development
+  version_requirements: *id006
 - !ruby/object:Gem::Dependency
   name: hoe
-  type: :development
-  version_requirement:
-  version_requirements: !ruby/object:Gem::Requirement
+  prerelease: false
+  requirement: &id007 !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: 2.3.3
-    version:
+        segments:
+        - 2
+        - 5
+        - 0
+        version: 2.5.0
+  type: :development
+  version_requirements: *id007
 description: |-
   Loofah is a general library for manipulating HTML/XML documents and
   fragments. It's built on top of Nokogiri and libxml2, so it's fast and
@@ -122,12 +173,14 @@ files:
 - init.rb
 - lib/loofah.rb
 - lib/loofah/active_record.rb
+- lib/loofah/elements.rb
 - lib/loofah/helpers.rb
 - lib/loofah/html/document.rb
 - lib/loofah/html/document_fragment.rb
 - lib/loofah/html5/scrub.rb
 - lib/loofah/html5/whitelist.rb
 - lib/loofah/instance_methods.rb
+- lib/loofah/metahelpers.rb
 - lib/loofah/scrubber.rb
 - lib/loofah/scrubbers.rb
 - lib/loofah/xml/document.rb
@@ -137,7 +190,9 @@ files:
 - test/html5/test_sanitizer.rb
 - test/integration/test_ad_hoc.rb
 - test/integration/test_helpers.rb
+- test/integration/test_html.rb
 - test/integration/test_scrubbers.rb
+- test/integration/test_xml.rb
 - test/unit/test_active_record.rb
 - test/unit/test_api.rb
 - test/unit/test_helpers.rb
@@ -158,18 +213,20 @@ required_ruby_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
+      segments:
+      - 0
       version: "0"
-  version:
 required_rubygems_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
+      segments:
+      - 0
       version: "0"
-  version:
 requirements: []
 rubyforge_project: loofah
-rubygems_version: 1.3.5
+rubygems_version: 1.3.6
 signing_key:
 specification_version: 3
 summary: Loofah is a general library for manipulating HTML/XML documents and fragments
@@ -177,6 +234,8 @@ test_files:
 - test/integration/test_helpers.rb
 - test/integration/test_scrubbers.rb
 - test/integration/test_ad_hoc.rb
+- test/integration/test_xml.rb
+- test/integration/test_html.rb
 - test/unit/test_xss_foliate.rb
 - test/unit/test_helpers.rb
 - test/unit/test_scrubber.rb
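
Most of the metadata churn above comes from a newer RubyGems serializing each version with explicit `segments` and a `prerelease` flag, and sharing each requirement through a YAML anchor and alias (`&id001` / `*id001`). The sketch below uses the standard `Gem::Version` and `Gem::Requirement` classes to show what those fields correspond to; the version strings are taken from the diff, and nothing here reads the actual gemspec.

```ruby
require "rubygems"

old_version = Gem::Version.new("0.4.6")
new_version = Gem::Version.new("0.4.7")

p new_version.segments      # => [0, 4, 7]
p new_version.prerelease?   # => false
p new_version > old_version # => true

# The runtime dependency recorded in the metadata: nokogiri >= 1.3.3
nokogiri = Gem::Requirement.new(">= 1.3.3")
p nokogiri.satisfied_by?(Gem::Version.new("1.3.3")) # => true
p nokogiri.satisfied_by?(Gem::Version.new("1.3.2")) # => false

# Two-part versions produce two segments, as in the shoulda entry.
p Gem::Version.new("2.10").segments # => [2, 10]
```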

metadata.gz.sig CHANGED

Binary file