sanitize 6.0.0 → 6.0.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 94a37503617774f9317150c834cc3025cd32a718be754fb72eea1b9dd7347571
- data.tar.gz: 597c76746d742db21842377bafab2911e7b84f389baf4dffafb2e53ecf67de92
+ metadata.gz: 819d713b2d4a78519e8bd4f2f853d6558d93ffd2d0481e10d012d8f74afbb555
+ data.tar.gz: 04a48476bf940cfffc12654e71d60a95fd93c0576b6bec6870c2defb5b72fa90
  SHA512:
- metadata.gz: c6d2dedfa9d6a589788d4156babae09cf14b3bebc765a9bb04a492aa5b5702f82dc3ae26d45199da3e8f9c096dfd191d15c53fea8d62084a3679604be5f7ddba
- data.tar.gz: 70bbb00756f1a4a085ad5901b27fd91ebc4308d5f42bfa57ec54c8cc7982ded8395eff9b59546ca62f3dba6e7a012351d62f9ec81b06aa8ccbb563211f39bd3c
+ metadata.gz: ed59ea47cc4a620ccf61be3443ef97036a877903bbc90fa855936e57446e34b92f5b9eb41ed9a026e17779fa473ce10d066986c1dd986c58381dae22bb7c9905
+ data.tar.gz: 27b40d2033ecd346c299bb77a7788b5325b79edd39c4767c9e5bf27486cf29bf2a5f3b34f96def645bbefd325b0e51a27182b75f187d2eb00931542769cd8c37
data/HISTORY.md CHANGED
@@ -1,5 +1,57 @@
  # Sanitize History
  
+ ## 6.0.1 (2023-01-27)
+
+ ### Bug Fixes
+
+ * Sanitize now always removes `<noscript>` elements and their contents, even
+ when `noscript` is in the allowlist.
+
+ This fixes a sanitization bypass that could occur when `noscript` was allowed
+ by a custom allowlist. In this scenario, carefully crafted input could sneak
+ arbitrary HTML through Sanitize, potentially enabling an XSS (cross-site
+ scripting) attack.
+
+ Sanitize's default configs don't allow `<noscript>` elements and are not
+ vulnerable. This issue only affects users who are using a custom config that
+ adds `noscript` to the element allowlist.
+
+ The root cause of this issue is that HTML parsing rules treat the contents of
+ a `<noscript>` element differently depending on whether scripting is enabled
+ in the user agent. Nokogiri doesn't support scripting so it follows the
+ "scripting disabled" rules, but a web browser with scripting enabled will
+ follow the "scripting enabled" rules. This means that Sanitize can't reliably
+ make the contents of a `<noscript>` element safe for scripting enabled
+ browsers, so the safest thing to do is to remove the element and its contents
+ entirely.
+
+ See the following security advisory for additional details:
+ [GHSA-fw3g-2h3j-qmm7](https://github.com/rgrove/sanitize/security/advisories/GHSA-fw3g-2h3j-qmm7)
+
+ Thanks to David Klein from [TU Braunschweig](https://www.tu-braunschweig.de/en/ias)
+ (@leeN) for reporting this issue.
+
+ * Fixed an edge case in which the contents of an "unescaped text" element (such
+ as `<noembed>` or `<xmp>`) were not properly escaped if that element was
+ allowlisted and was also inside an allowlisted `<math>` or `<svg>` element.
+
+ The only way to encounter this situation was to ignore multiple warnings in
+ the readme and create a custom config that allowlisted all the elements
+ involved, including `<math>` or `<svg>`. If you're using a default config or
+ if you heeded the warnings about MathML and SVG not being supported, you're
+ not affected by this issue.
+
+ Please let this be a reminder that Sanitize cannot safely sanitize MathML or
+ SVG content and does not support this use case. The default configs don't
+ allow MathML or SVG elements, and allowlisting MathML or SVG elements in a
+ custom config may create a security vulnerability in your application.
+
+ Documentation has been updated to add more warnings and to make the existing
+ warnings about this more prominent.
+
+ Thanks to David Klein from [TU Braunschweig](https://www.tu-braunschweig.de/en/ias)
+ (@leeN) for reporting this issue.
+
  ## 6.0.0 (2021-08-03)
  
  ### Potentially Breaking Changes
data/README.md CHANGED
@@ -11,27 +11,26 @@ protocols within attributes that contain URLs. You can also allow specific CSS
  properties, @ rules, and URL protocols in elements or attributes containing CSS.
  Any HTML or CSS that you don't explicitly allow will be removed.
  
- Sanitize is based on the [Nokogumbo HTML5 parser][nokogumbo], which parses HTML
- exactly the same way modern browsers do, and [Crass][crass], which parses CSS
- exactly the same way modern browsers do. As long as your allowlist config only
- allows safe markup and CSS, even the most malformed or malicious input will be
- transformed into safe output.
+ Sanitize is based on the [Nokogiri HTML5 parser][nokogiri], which parses HTML
+ the same way modern browsers do, and [Crass][crass], which parses CSS the same
+ way modern browsers do. As long as your allowlist config only allows safe markup
+ and CSS, even the most malformed or malicious input will be transformed into
+ safe output.
  
  [![Gem Version](https://badge.fury.io/rb/sanitize.svg)](http://badge.fury.io/rb/sanitize)
  [![Tests](https://github.com/rgrove/sanitize/workflows/Tests/badge.svg)](https://github.com/rgrove/sanitize/actions?query=workflow%3ATests)
  
  [crass]:https://github.com/rgrove/crass
- [nokogumbo]:https://github.com/rubys/nokogumbo
+ [nokogiri]:https://github.com/sparklemotion/nokogiri
  
  Links
  -----
  
  * [Home](https://github.com/rgrove/sanitize/)
- * [API Docs](http://rubydoc.info/github/rgrove/sanitize/master)
+ * [API Docs](https://rubydoc.info/github/rgrove/sanitize/Sanitize)
  * [Issues](https://github.com/rgrove/sanitize/issues)
- * [Release History](https://github.com/rgrove/sanitize/blob/master/HISTORY.md#sanitize-history)
- * [Online Demo](https://sanitize.herokuapp.com/)
- * [Biased comparison of Ruby HTML sanitization libraries](https://github.com/rgrove/sanitize/blob/master/COMPARISON.md)
+ * [Release History](https://github.com/rgrove/sanitize/releases)
+ * [Online Demo](https://sanitize-web.fly.dev/)
  
  Installation
  -------------
@@ -72,10 +71,11 @@ Sanitize can sanitize the following types of input:
  * Standalone CSS stylesheets
  * Standalone CSS properties
  
- However, please note that Sanitize _cannot_ fully sanitize the contents of
- `<math>` or `<svg>` elements, since these elements don't follow the same parsing
- rules as the rest of HTML. If this is something you need, you may want to look
- for another solution.
+ > **Warning**
+ >
+ > Sanitize cannot fully sanitize the contents of `<math>` or `<svg>` elements. MathML and SVG elements are [foreign elements](https://html.spec.whatwg.org/multipage/syntax.html#foreign-elements) that don't follow normal HTML parsing rules.
+ >
+ > By default, Sanitize will remove all MathML and SVG elements. If you add MathML or SVG elements to a custom element allowlist, you may create a security vulnerability in your application.
  
  ### HTML Fragments
  
@@ -420,11 +420,17 @@ elements not in this array will be removed.
  ]
  ```
  
- **Warning:** Sanitize cannot fully sanitize the contents of `<math>` or `<svg>`
- elements, since these elements don't follow the same parsing rules as the rest
- of HTML. If you add `math` or `svg` to the allowlist, you must assume that any
- content inside them will be allowed, even if that content would otherwise be
- removed by Sanitize.
+ > **Warning**
+ >
+ > Sanitize cannot fully sanitize the contents of `<math>` or `<svg>` elements. MathML and SVG elements are [foreign elements](https://html.spec.whatwg.org/multipage/syntax.html#foreign-elements) that don't follow normal HTML parsing rules.
+ >
+ > By default, Sanitize will remove all MathML and SVG elements. If you add MathML or SVG elements to a custom element allowlist, you must assume that any content inside them will be allowed, even if that content would otherwise be removed or escaped by Sanitize. This may create a security vulnerability in your application.
+
+ > **Note**
+ >
+ > Sanitize always removes `<noscript>` elements and their contents, even if `noscript` is in the allowlist.
+ >
+ > This is because a `<noscript>` element's content is parsed differently in browsers depending on whether or not scripting is enabled. Since Nokogiri doesn't support scripting, it always parses `<noscript>` elements as if scripting is disabled. This results in edge cases where it's not possible to reliably sanitize the contents of a `<noscript>` element because Nokogiri can't fully replicate the parsing behavior of a scripting-enabled browser.
  
  #### :parser_options (Hash)
  
@@ -54,6 +54,11 @@ class Sanitize
  
  # HTML elements to allow. By default, no elements are allowed (which means
  # that all HTML will be stripped).
+ #
+ # Warning: Sanitize cannot safely sanitize the contents of foreign
+ # elements (elements in the MathML or SVG namespaces). Do not add `math`
+ # or `svg` to this list! If you do, you may create a security
+ # vulnerability in your application.
  :elements => [],
  
  # HTML parsing options to pass to Nokogumbo.
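
The warning added to the `:elements` comment can be turned into a simple check on an application's own custom allowlist. The following is a minimal sketch, not part of Sanitize: the `RISKY_ELEMENTS` constant and the `risky_allowlist_entries` helper are hypothetical names introduced here for illustration. It only encodes what this release documents: `math` and `svg` cannot be sanitized safely, and `noscript` is always removed even when allowlisted.

```ruby
require 'set'

# Hypothetical constant, not part of Sanitize's API.
# `math` and `svg` are foreign elements that Sanitize cannot safely sanitize;
# `noscript` is always removed as of 6.0.1, so allowlisting it has no effect.
RISKY_ELEMENTS = Set.new(%w[math noscript svg]).freeze

# Hypothetical helper: returns the allowlisted element names that deserve a
# second look. Accepts strings or symbols and compares case-insensitively.
def risky_allowlist_entries(elements)
  elements.map { |name| name.to_s.downcase }
          .select { |name| RISKY_ELEMENTS.include?(name) }
          .uniq
          .sort
end

custom_elements = %w[a b em svg noscript]
flagged = risky_allowlist_entries(custom_elements)
warn "Review these allowlist entries: #{flagged.join(', ')}" unless flagged.empty?
```

A check like this could run once at boot or in a test, so that a risky entry in a custom config is noticed before it reaches production.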
@@ -1,5 +1,6 @@
  # encoding: utf-8
  
+ require 'cgi'
  require 'set'
  
  class Sanitize; module Transformers; class CleanElement
@@ -18,6 +19,18 @@ class Sanitize; module Transformers; class CleanElement
  # http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#embedding-custom-non-visible-data-with-the-data-*-attributes
  REGEX_DATA_ATTR = /\Adata-(?!xml)[a-z_][\w.\u00E0-\u00F6\u00F8-\u017F\u01DD-\u02AF-]*\z/u
  
+ # Elements whose content is treated as unescaped text by HTML parsers.
+ UNESCAPED_TEXT_ELEMENTS = Set.new(%w[
+ iframe
+ noembed
+ noframes
+ noscript
+ plaintext
+ script
+ style
+ xmp
+ ])
+
  # Attributes that need additional escaping on `<a>` elements due to unsafe
  # libxml2 behavior.
  UNSAFE_LIBXML_ATTRS_A = Set.new(%w[
@@ -185,6 +198,28 @@ class Sanitize; module Transformers; class CleanElement
  @add_attributes[name].each {|key, val| node[key] = val }
  end
  
+ # Make a best effort to ensure that text nodes in invalid "unescaped text"
+ # elements that are inside a math or svg namespace are properly escaped so
+ # that they don't get parsed as HTML.
+ #
+ # Sanitize is explicitly documented as not supporting MathML or SVG, but
+ # people sometimes allow `<math>` and `<svg>` elements in their custom
+ # configs without realizing that it's not safe. This workaround makes it
+ # slightly less unsafe, but you still shouldn't allow `<math>` or `<svg>`
+ # because Nokogiri doesn't parse them the same way browsers do and Sanitize
+ # can't guarantee that their contents are safe.
+ unless node.namespace.nil?
+ prefix = node.namespace.prefix
+
+ if (prefix == 'math' || prefix == 'svg') && UNESCAPED_TEXT_ELEMENTS.include?(name)
+ node.children.each do |child|
+ if child.type == Nokogiri::XML::Node::TEXT_NODE
+ child.content = CGI.escapeHTML(child.content)
+ end
+ end
+ end
+ end
+
  # Element-specific special cases.
  case name
  
@@ -217,6 +252,16 @@ class Sanitize; module Transformers; class CleanElement
  
  node['content'] = node['content'].gsub(/;\s*charset\s*=.+\z/, ';charset=utf-8')
  end
+
+ # A `<noscript>` element's content is parsed differently in browsers
+ # depending on whether or not scripting is enabled. Since Nokogiri doesn't
+ # support scripting, it always parses `<noscript>` elements as if scripting
+ # is disabled. This results in edge cases where it's not possible to
+ # reliably sanitize the contents of a `<noscript>` element because Nokogiri
+ # can't fully replicate the parsing behavior of a scripting-enabled browser.
+ # The safest thing to do is to simply remove all `<noscript>` elements.
+ when 'noscript'
+ node.unlink
  end
  end
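
The escaping workaround added to `CleanElement` above can be illustrated without Nokogiri. This is a minimal sketch under stated assumptions: the `escape_unescaped_text` method and its `(namespace_prefix, name, text)` arguments are hypothetical stand-ins for a parsed node, while `UNESCAPED_TEXT_ELEMENTS` and the call to `CGI.escapeHTML` mirror the diff.

```ruby
require 'cgi'
require 'set'

# Same element names as the UNESCAPED_TEXT_ELEMENTS set added in this diff.
UNESCAPED_TEXT_ELEMENTS = Set.new(%w[
  iframe noembed noframes noscript plaintext script style xmp
]).freeze

# Hypothetical stand-in for the node handling in CleanElement: escape the text
# only when the element is in a math or svg namespace AND is one of the
# "unescaped text" elements; otherwise return the text unchanged.
def escape_unescaped_text(namespace_prefix, name, text)
  foreign = %w[math svg].include?(namespace_prefix)
  return text unless foreign && UNESCAPED_TEXT_ELEMENTS.include?(name)

  CGI.escapeHTML(text)
end

puts escape_unescaped_text('svg', 'xmp', '<img src=x onerror=alert(1)>')
# prints "&lt;img src=x onerror=alert(1)&gt;"
puts escape_unescaped_text(nil, 'xmp', '<b>unchanged</b>')
# prints "<b>unchanged</b>"
```

As the source comment says, this is a best-effort mitigation only; it does not make allowlisting `<math>` or `<svg>` safe.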
 
@@ -1,5 +1,5 @@
  # encoding: utf-8
  
  class Sanitize
- VERSION = '6.0.0'
+ VERSION = '6.0.1'
  end
@@ -11,18 +11,18 @@ describe 'Sanitize::Transformers::CleanComment' do
  end
  
  it 'should remove comments' do
- @s.fragment('foo <!-- comment --> bar').must_equal 'foo bar'
- @s.fragment('foo <!-- ').must_equal 'foo '
- @s.fragment('foo <!-- - -> bar').must_equal 'foo '
- @s.fragment("foo <!--\n\n\n\n-->bar").must_equal 'foo bar'
- @s.fragment("foo <!-- <!-- <!-- --> --> -->bar").must_equal 'foo --&gt; --&gt;bar'
- @s.fragment("foo <div <!-- comment -->>bar</div>").must_equal 'foo <div>&gt;bar</div>'
+ _(@s.fragment('foo <!-- comment --> bar')).must_equal 'foo bar'
+ _(@s.fragment('foo <!-- ')).must_equal 'foo '
+ _(@s.fragment('foo <!-- - -> bar')).must_equal 'foo '
+ _(@s.fragment("foo <!--\n\n\n\n-->bar")).must_equal 'foo bar'
+ _(@s.fragment("foo <!-- <!-- <!-- --> --> -->bar")).must_equal 'foo --&gt; --&gt;bar'
+ _(@s.fragment("foo <div <!-- comment -->>bar</div>")).must_equal 'foo <div>&gt;bar</div>'
  
  # Special case: the comment markup is inside a <script>, which makes it
  # text content and not an actual HTML comment.
- @s.fragment("<script><!-- comment --></script>").must_equal ''
+ _(@s.fragment("<script><!-- comment --></script>")).must_equal ''
  
- Sanitize.fragment("<script><!-- comment --></script>", :allow_comments => false, :elements => ['script'])
+ _(Sanitize.fragment("<script><!-- comment --></script>", :allow_comments => false, :elements => ['script']))
  .must_equal '<script><!-- comment --></script>'
  end
  end
@@ -33,14 +33,14 @@ describe 'Sanitize::Transformers::CleanComment' do
  end
  
  it 'should allow comments' do
- @s.fragment('foo <!-- comment --> bar').must_equal 'foo <!-- comment --> bar'
- @s.fragment('foo <!-- ').must_equal 'foo <!-- -->'
- @s.fragment('foo <!-- - -> bar').must_equal 'foo <!-- - -> bar-->'
- @s.fragment("foo <!--\n\n\n\n-->bar").must_equal "foo <!--\n\n\n\n-->bar"
- @s.fragment("foo <!-- <!-- <!-- --> --> -->bar").must_equal 'foo <!-- <!-- <!-- --> --&gt; --&gt;bar'
- @s.fragment("foo <div <!-- comment -->>bar</div>").must_equal 'foo <div>&gt;bar</div>'
-
- Sanitize.fragment("<script><!-- comment --></script>", :allow_comments => true, :elements => ['script'])
+ _(@s.fragment('foo <!-- comment --> bar')).must_equal 'foo <!-- comment --> bar'
+ _(@s.fragment('foo <!-- ')).must_equal 'foo <!-- -->'
+ _(@s.fragment('foo <!-- - -> bar')).must_equal 'foo <!-- - -> bar-->'
+ _(@s.fragment("foo <!--\n\n\n\n-->bar")).must_equal "foo <!--\n\n\n\n-->bar"
+ _(@s.fragment("foo <!-- <!-- <!-- --> --> -->bar")).must_equal 'foo <!-- <!-- <!-- --> --&gt; --&gt;bar'
+ _(@s.fragment("foo <div <!-- comment -->>bar</div>")).must_equal 'foo <div>&gt;bar</div>'
+
+ _(Sanitize.fragment("<script><!-- comment --></script>", :allow_comments => true, :elements => ['script']))
  .must_equal '<script><!-- comment --></script>'
  end
  end
@@ -10,15 +10,15 @@ describe 'Sanitize::Transformers::CSS::CleanAttribute' do
  end
  
  it 'should sanitize CSS properties in style attributes' do
- @s.fragment(%[
+ _(@s.fragment(%[
  <div style="color: #fff; width: expression(alert(1)); /* <-- evil! */"></div>
- ].strip).must_equal %[
+ ].strip)).must_equal %[
  <div style="color: #fff; /* <-- evil! */"></div>
  ].strip
  end
  
  it 'should remove the style attribute if the sanitized CSS is empty' do
- @s.fragment('<div style="width: expression(alert(1))"></div>').
+ _(@s.fragment('<div style="width: expression(alert(1))"></div>')).
  must_equal '<div></div>'
  end
  end
@@ -46,7 +46,7 @@ describe 'Sanitize::Transformers::CSS::CleanElement' do
  </style>
  ].strip
  
- @s.fragment(html).must_equal %[
+ _(@s.fragment(html)).must_equal %[
  <style>
  /* Yay CSS! */
  .foo { color: #fff; }
@@ -62,6 +62,6 @@ describe 'Sanitize::Transformers::CSS::CleanElement' do
  end
  
  it 'should remove the <style> element if the sanitized CSS is empty' do
- @s.fragment('<style></style>').must_equal ''
+ _(@s.fragment('<style></style>')).must_equal ''
  end
  end
@@ -11,18 +11,18 @@ describe 'Sanitize::Transformers::CleanDoctype' do
  end
  
  it 'should remove doctype declarations' do
- @s.document('<!DOCTYPE html><html>foo</html>').must_equal "<html>foo</html>"
- @s.fragment('<!DOCTYPE html>foo').must_equal 'foo'
+ _(@s.document('<!DOCTYPE html><html>foo</html>')).must_equal "<html>foo</html>"
+ _(@s.fragment('<!DOCTYPE html>foo')).must_equal 'foo'
  end
  
  it 'should not allow doctype definitions in fragments' do
- @s.fragment('<!DOCTYPE html><html>foo</html>')
+ _(@s.fragment('<!DOCTYPE html><html>foo</html>'))
  .must_equal "foo"
  
- @s.fragment('<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"><html>foo</html>')
+ _(@s.fragment('<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"><html>foo</html>'))
  .must_equal "foo"
  
- @s.fragment("<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\"\n \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\"><html>foo</html>")
+ _(@s.fragment("<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\"\n \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\"><html>foo</html>"))
  .must_equal "foo"
  end
  end
@@ -33,38 +33,38 @@ describe 'Sanitize::Transformers::CleanDoctype' do
  end
  
  it 'should allow doctype declarations in documents' do
- @s.document('<!DOCTYPE html><html>foo</html>')
+ _(@s.document('<!DOCTYPE html><html>foo</html>'))
  .must_equal "<!DOCTYPE html><html>foo</html>"
  
- @s.document('<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"><html>foo</html>')
+ _(@s.document('<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"><html>foo</html>'))
  .must_equal "<!DOCTYPE html><html>foo</html>"
  
- @s.document("<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\"\n \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\"><html>foo</html>")
+ _(@s.document("<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\"\n \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\"><html>foo</html>"))
  .must_equal "<!DOCTYPE html><html>foo</html>"
  end
  
  it 'should not allow obviously invalid doctype declarations in documents' do
- @s.document('<!DOCTYPE blah blah blah><html>foo</html>')
+ _(@s.document('<!DOCTYPE blah blah blah><html>foo</html>'))
  .must_equal "<!DOCTYPE html><html>foo</html>"
  
- @s.document('<!DOCTYPE blah><html>foo</html>')
+ _(@s.document('<!DOCTYPE blah><html>foo</html>'))
  .must_equal "<!DOCTYPE html><html>foo</html>"
  
- @s.document('<!DOCTYPE html BLAH "-//W3C//DTD HTML 4.01//EN"><html>foo</html>')
+ _(@s.document('<!DOCTYPE html BLAH "-//W3C//DTD HTML 4.01//EN"><html>foo</html>'))
  .must_equal "<!DOCTYPE html><html>foo</html>"
  
- @s.document('<!whatever><html>foo</html>')
+ _(@s.document('<!whatever><html>foo</html>'))
  .must_equal "<html>foo</html>"
  end
  
  it 'should not allow doctype definitions in fragments' do
- @s.fragment('<!DOCTYPE html><html>foo</html>')
+ _(@s.fragment('<!DOCTYPE html><html>foo</html>'))
  .must_equal "foo"
  
- @s.fragment('<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"><html>foo</html>')
+ _(@s.fragment('<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"><html>foo</html>'))
  .must_equal "foo"
  
- @s.fragment("<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\"\n \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\"><html>foo</html>")
+ _(@s.fragment("<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\"\n \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\"><html>foo</html>"))
  .must_equal "foo"
  end
  end