RubyGems - kramdown - Versions diffs - 0.2.0 → 0.3.0 - Mend

kramdown 0.2.0 → 0.3.0

Potentially problematic release.

Files changed (67)

data/ChangeLog +267 -0
data/VERSION +1 -1
data/benchmark/benchmark.rb +2 -1
data/benchmark/generate_data.rb +110 -0
data/benchmark/historic-jruby-1.4.0.dat +7 -0
data/benchmark/historic-ruby-1.8.6.dat +7 -0
data/benchmark/historic-ruby-1.8.7.dat +7 -0
data/benchmark/historic-ruby-1.9.1p243.dat +7 -0
data/benchmark/historic-ruby-1.9.2dev.dat +7 -0
data/benchmark/static-jruby-1.4.0.dat +7 -0
data/benchmark/static-ruby-1.8.6.dat +7 -0
data/benchmark/static-ruby-1.8.7.dat +7 -0
data/benchmark/static-ruby-1.9.1p243.dat +7 -0
data/benchmark/static-ruby-1.9.2dev.dat +7 -0
data/benchmark/testing.sh +1 -1
data/doc/index.page +5 -5
data/doc/installation.page +3 -3
data/doc/quickref.page +3 -3
data/doc/syntax.page +133 -101
data/doc/tests.page +9 -1
data/lib/kramdown/compatibility.rb +34 -0
data/lib/kramdown/converter.rb +26 -8
data/lib/kramdown/document.rb +2 -1
data/lib/kramdown/parser.rb +1 -1192
data/lib/kramdown/parser/kramdown.rb +272 -0
data/lib/kramdown/parser/kramdown/attribute_list.rb +102 -0
data/lib/kramdown/parser/kramdown/autolink.rb +42 -0
data/lib/kramdown/parser/kramdown/blank_line.rb +43 -0
data/lib/kramdown/parser/kramdown/blockquote.rb +42 -0
data/lib/kramdown/parser/kramdown/codeblock.rb +62 -0
data/lib/kramdown/parser/kramdown/codespan.rb +57 -0
data/lib/kramdown/parser/kramdown/emphasis.rb +69 -0
data/lib/kramdown/parser/kramdown/eob.rb +39 -0
data/lib/kramdown/parser/kramdown/escaped_chars.rb +38 -0
data/lib/kramdown/parser/kramdown/extension.rb +65 -0
data/lib/kramdown/parser/kramdown/footnote.rb +72 -0
data/lib/kramdown/parser/kramdown/header.rb +81 -0
data/lib/kramdown/parser/kramdown/horizontal_rule.rb +39 -0
data/lib/kramdown/parser/kramdown/html.rb +253 -0
data/lib/kramdown/{deprecated.rb → parser/kramdown/html_entity.rb} +10 -12
data/lib/kramdown/parser/kramdown/line_break.rb +38 -0
data/lib/kramdown/parser/kramdown/link.rb +153 -0
data/lib/kramdown/parser/kramdown/list.rb +225 -0
data/lib/kramdown/parser/kramdown/paragraph.rb +44 -0
data/lib/kramdown/parser/kramdown/typographic_symbol.rb +48 -0
data/lib/kramdown/version.rb +1 -1
data/test/testcases/block/09_html/comment.html +1 -0
data/test/testcases/block/09_html/comment.text +1 -1
data/test/testcases/block/09_html/content_model/tables.text +2 -2
data/test/testcases/block/09_html/not_parsed.html +10 -0
data/test/testcases/block/09_html/not_parsed.text +9 -0
data/test/testcases/block/09_html/parse_as_raw.html +4 -0
data/test/testcases/block/09_html/parse_as_raw.text +2 -0
data/test/testcases/block/09_html/parse_block_html.html +4 -0
data/test/testcases/block/09_html/parse_block_html.text +3 -0
data/test/testcases/block/09_html/processing_instruction.html +1 -0
data/test/testcases/block/09_html/processing_instruction.text +1 -1
data/test/testcases/block/09_html/simple.html +8 -15
data/test/testcases/block/09_html/simple.text +2 -12
data/test/testcases/span/02_emphasis/normal.html +8 -4
data/test/testcases/span/02_emphasis/normal.text +6 -2
data/test/testcases/span/05_html/markdown_attr.html +2 -1
data/test/testcases/span/05_html/markdown_attr.text +2 -1
data/test/testcases/span/05_html/normal.html +6 -2
data/test/testcases/span/05_html/normal.text +4 -0
metadata +35 -4
data/lib/kramdown/parser/registry.rb +0 -62

data/benchmark/static-ruby-1.8.6.dat ADDED

@@ -0,0 +1,7 @@
+# Maruku 0.6.0 || BlueFeather 0.32 || BlueCloth 2.0.5 || RDiscount 1.3.5
+    1    0.06202    0.12554    0.00055    0.00061
+    2    0.11490    0.42433    0.00150    0.00104
+    4    0.22792    1.35390    0.00189    0.00210
+    8    0.50887    4.93941    0.00363    0.00411
+   16    1.11704   18.90444    0.00792    0.00937
+   32    2.36787   78.08507    0.01822    0.02148

data/benchmark/static-ruby-1.8.7.dat ADDED

@@ -0,0 +1,7 @@
+# Maruku 0.6.0 || BlueFeather 0.32 || BlueCloth 2.0.5 || RDiscount 1.3.5
+    1    0.06325    0.13453    0.00060    0.00058
+    2    0.12058    0.45557    0.00108    0.00115
+    4    0.24777    1.47726    0.00219    0.00237
+    8    0.49758    5.21774    0.00401    0.00460
+   16    1.18307   20.93746    0.00916    0.01011
+   32    2.65514   81.59896    0.02015    0.02210

data/benchmark/static-ruby-1.9.1p243.dat ADDED

@@ -0,0 +1,7 @@
+# Maruku 0.6.0 || BlueFeather 0.32 || BlueCloth 2.0.5 || RDiscount 1.3.5
+    1    0.07650    0.16749    0.00067    0.00070
+    2    0.22272    0.54882    0.00146    0.00132
+    4    0.30977    1.95581    0.00248    0.00265
+    8    0.61692    7.35415    0.00453    0.00555
+   16    1.35505   28.91778    0.00939    0.01241
+   32    3.00461  115.68435    0.02550    0.02510

data/benchmark/static-ruby-1.9.2dev.dat ADDED

@@ -0,0 +1,7 @@
+# Maruku 0.6.0 || BlueFeather 0.32 || BlueCloth 2.0.5 || RDiscount 1.3.5
+    1    0.05888    0.12860    0.00063    0.00059
+    2    0.18025    0.40292    0.00115    0.00108
+    4    0.24735    1.51833    0.00353    0.00299
+    8    0.51102    6.07239    0.00410    0.00497
+   16    1.03159   23.21515    0.01348    0.01277
+   32    2.61620   92.00956    0.01893    0.02003
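
The four static benchmark files above share one layout: a `#` comment line naming the compared converters, separated by `||`, followed by rows of whitespace-separated numbers. As a reading aid only, here is a minimal Ruby sketch of how such a table could be loaded. `parse_benchmark` and the `:size` key are hypothetical names that do not appear in the kramdown sources. The first column is not labelled in the files, so treating it as an input-size factor is an assumption.

```ruby
# Hypothetical helper (not part of kramdown): parse one benchmark table
# in the layout shown in the .dat files above.
def parse_benchmark(text)
  names = []
  rows = []
  text.each_line do |line|
    # Drop the diff's leading "+" marker if the text was copied from a diff.
    line = line.sub(/\A\+/, '').strip
    next if line.empty?
    if line.start_with?('#')
      # Header line: converter names separated by "||".
      names = line.sub(/\A#\s*/, '').split('||').map(&:strip)
    else
      fields = line.split
      # Assumption: first field is an input-size factor, the rest are timings.
      rows << { size: Integer(fields[0]), times: fields[1..-1].map { |f| Float(f) } }
    end
  end
  { names: names, rows: rows }
end

# Two rows copied from static-ruby-1.9.2dev.dat above.
SAMPLE = <<~DATA
  # Maruku 0.6.0 || BlueFeather 0.32 || BlueCloth 2.0.5 || RDiscount 1.3.5
      1    0.05888    0.12860    0.00063    0.00059
     32    2.61620   92.00956    0.01893    0.02003
DATA

table = parse_benchmark(SAMPLE)
last = table[:rows].last
# Print each converter next to its value in the last row.
table[:names].zip(last[:times]).each do |name, value|
  puts format('%-18s %10.5f', name, value)
end
```

Pairing each header name with the matching column keeps the converter labels attached to their numbers, which makes it easier to compare the same row across the per-interpreter files.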

data/benchmark/testing.sh CHANGED

@@ -2,7 +2,7 @@
 source ~/.bashrc
-for VERSION in 1.8.6 1.8.7 1.9.1 1.9.2 'jruby 1.4.0'; do
+for VERSION in 1.8.5 1.8.6 1.8.7 1.9.1 1.9.2 'jruby 1.4.0'; do
 	rvm $VERSION
 	echo $(ruby -v)
 	rake test

data/doc/index.page CHANGED

@@ -73,16 +73,16 @@ and [mailing lists][ml] available if you have any questions!
 ## Welcome to the kramdown site
-kramdown is a *free* GPL-licensed [Ruby](http://www.ruby-lang.org) library for parsing a superset of
-Markdown. It is completely written in Ruby, supports standard Markdown (with some minor
-modifications) and various extensions that have been made popular by the [PHP Markdown Extra]
-package and [Maruku].
+**kramdown** (sic, not Kramdown or KramDown, just kramdown) is a *free* GPL-licensed
+[Ruby](http://www.ruby-lang.org) library for parsing a superset of Markdown. It is completely
+written in Ruby, supports standard Markdown (with some minor modifications) and various extensions
+that have been made popular by the [PHP Markdown Extra] package and [Maruku].
 It is probably the fastest pure-Ruby Markdown converter available (November 2009), being 5x faster
 than [Maruku] and about 10x faster than [BlueFeather].
 <div class="a-center">
-The latest version of kramdown is <b>0.2.0</b> and it was released on <b>2009-12-03</b>.
+The latest version of kramdown is <b>0.3.0</b> and it was released on <b>2009-12-20</b>.
 </div>
 [PHP Markdown Extra]: http://michelf.com/projects/php-markdown/extra/

data/doc/installation.page CHANGED

@@ -10,8 +10,8 @@ sort_info: 5
 kramdown should work on any platform which supports Ruby. It has been successfully tested on the
 following platforms:
-* Linux with Ruby 1.8.6
-* Mac OSX with Ruby 1.8.6, 1.8.7, 1.9.1, 1.9.2-preview1 and jruby 1.4.0.
+* Linux with Ruby 1.8.5, 1.8.6, 1.8.7, 1.9.1
+* Mac OSX with Ruby 1.8.5, 1.8.6, 1.8.7, 1.9.1, 1.9.2-preview1 and jruby 1.4.0.
 See the platform specific installation notes for more information!
@@ -87,4 +87,4 @@ out kramdown use the following command:
 ## Dependencies
 Since kramdown is written in Ruby, you just need the [Ruby interpreter](http://www.ruby-lang.org)
-versions 1.8.6, 1.8.7 or 1.9.1. There are no other dependencies.
+versions 1.8.5, 1.8.6, 1.8.7 or 1.9.1. There are no other dependencies.

data/doc/quickref.page CHANGED

@@ -55,7 +55,7 @@ which contains a hard line break.
 {::kdlink:: #headers part="headers"}
-kramdown supports Setext style headers and atx style headers. A header must always be preceeded by a
+kramdown supports Setext style headers and atx style headers. A header must always be preceded by a
 blank line except at the beginning of the document:
 {::kdexample:}
@@ -239,7 +239,7 @@ term
 {::kdexample:}
 Each term can be styled using span level elements and each definition is parsed as block level
-elements, ie. you can use any block level in a definition. Just use the same indent for the lines
+elements, i.e. you can use any block level in a definition. Just use the same indent for the lines
 following the definition line:
 {::kdexample:}
@@ -469,7 +469,7 @@ e.g. `` `code` ``.
 {::kdlink:: #footnotes part="footnotes"}
 Footnotes can easily be used in kramdown. Just set a footnote marker (consists of square brackets
-with a caret and the footname inside) in the text and somewhere else the footnote definition (which
+with a caret and the footnote name inside) in the text and somewhere else the footnote definition (which
 basically looks like a reference link definition):
 {::kdexample:}

data/doc/syntax.page CHANGED

@@ -17,7 +17,7 @@ The kramdown syntax is based on the Markdown syntax and has been enhanced with f
 found in other Markdown implementations like [Maruku], [PHP Markdown Extra] and [Pandoc]. However,
 it strives to provide a strict syntax with definite rules and therefore isn't completely compatible
 with Markdown. Nonetheless, most Markdown documents should work fine when parsed with kramdown. All
-places where the kramdown syntax differes from the Markdown syntax are highlighted.
+places where the kramdown syntax differs from the Markdown syntax are highlighted.
 Following is the complete syntax definition so that you know what you will get when a kramdown
 document is converted to HTML. There are basically two types of elements: block level elements
@@ -59,7 +59,7 @@ will be converted to:
     0 &lt; 1 &lt; 2 and 2 &gt; 1 &gt; 0</p>
 Since kramdown also uses some characters to mark-up the text, there need to be a way to escape these
-special characters sothat they can have their normal meaning. This can be done by using backslash
+special characters so that they can have their normal meaning. This can be done by using backslash
 escapes. For example, you can use a literal backtick like this:
     This \`is not a code\` span!
@@ -152,7 +152,7 @@ Paragraphs are the most used block level elements. One or more consecutive lines
 interpreted as one paragraph. Every line of a paragraph may be indented up to three spaces. You can
 separate two consecutive paragraphs from each other by using one or more blank lines. Notice that a
 line break in the source does not mean a line break in the output. If you want to have an explicit
-line break (ie. a `<br />` tag) you need to end a line with two or more spaces or two backslashes!
+line break (i.e. a `<br />` tag) you need to end a line with two or more spaces or two backslashes!
 Note, however, that a line break on the last text line of a paragraph is not possible and will be
 ignored. Leading and trailing spaces will be stripped from the paragraph text.
@@ -205,7 +205,7 @@ As mentioned you need to insert a blank line before a Setext header:
     ------------
 However, it is generally a good idea to also use a blank line after a Setext header because it looks
-more approriate.
+more appropriate.
 > The original Markdown syntax allows one to omit the blank line before a Setext header. However,
 > this makes reading the document harder than necessary and is therefore not allowed in a kramdown
@@ -243,13 +243,13 @@ close the header. Any leading or trailing spaces are stripped from the header te
 ### Automatic Generation of Header IDs
 kramdown supports the automatic generation of header IDs if the option `:auto_ids` is set to `true`
-(which is the default). This is done by converting the untransformed, ie. plain, header text via the
+(which is the default). This is done by converting the untransformed, i.e. plain, header text via the
 following steps:
 * All characters except letters, numbers, spaces and dashes are removed.
 * All characters from the start of the line till the first letter are removed.
 * Everything except letters and numbers is converted to dashes.
-* Everything is downcased.
+* Everything is lowercased.
 * If nothing is left, the identifier `section` is used.
 * If a such created identifier already exists, a dash and a sequential number is added (first `-1`,
   then `-2` and so on).
@@ -335,7 +335,7 @@ indented with five spaces or one space and one tab, like this:
     >
     >     ruby -e 'puts :works'
-> The original Markdown syntax allowed "lazy" blockquotes, ie. blockquotes where only the first line
+> The original Markdown syntax allowed "lazy" blockquotes, i.e. blockquotes where only the first line
 > needs a blockquote marker. This is disallowed in kramdown, you always need to use a blockquote
 > marker! The rational behind this is that most email programs and good text editors put the `>`
 > maker automatically before every quoted line and that things like the following work like
@@ -477,7 +477,7 @@ list markers in the list may be indented up to three spaces or the number of spa
 indentation of the last list item minus one, whichever number is smaller. For example:
     * This is the first line. Since the first non-space characters appears in column 3, all other
-      lines have to be indented 2 spaces sothat they first characters align. This tells kramdown
+      lines have to be indented 2 spaces so that the first characters align. This tells kramdown
       that the lines belong to the list item.
     *       This is the another item of the list. It uses a different number of spaces for
             indentation which is okay but should generally be avoided.
@@ -531,7 +531,7 @@ by leaving a blank line after the last list item and using an EOB marker:
 > blank lines are wrapped in paragraph tags. This means that the first text will also be wrapped in
 > a paragraph if you have block level elements in a list which are separated by blank lines. The
 > above rule is easy to remember and lets you exactly specify when the first list text should be
-> wrapped in a paragraph. The idea for the above rule comes from the [pandoc] package.
+> wrapped in a paragraph. The idea for the above rule comes from the [Pandoc] package.
 {: .markdown-difference}
 As seen in the examples above, blank lines between list items are allowed.
@@ -580,7 +580,7 @@ marker.
 > This kind of syntax is disallowed in kramdown.
 {: .markdown-difference}
-If you want to have one list directly after another one (both with the same list type, ie. ordered
+If you want to have one list directly after another one (both with the same list type, i.e. ordered
 or unordered), you need to use an EOB marker to separate the two:
     * List one
@@ -641,7 +641,7 @@ The column number of the first non-space character which appears after a definit
 same line specifies the indentation that has to be used for the following lines of the definition.
 If there is no such character, the indentation that needs to be used is four spaces or one tab. If
 one of the following lines does not have the needed amount of indentation, it is not treated as part
-of the defintion. The indentation is stripped from the definition and it (note that the definition
+of the definition. The indentation is stripped from the definition and it (note that the definition
 naturally also contains the content of the line with the definition marker) is processed as text
 containing block level elements. If there is more than one definition, all other definition markers
 for the term may be indented up to three spaces or the number of spaces used for the indentation of
@@ -650,9 +650,9 @@ the last definition minus one, whichever number is smaller. For example:
     definition term 1
     definition term 2
     : This is the first line. Since the first non-space characters appears in column 3, all other
-      lines have to be indented 2 spaces sothat they first characters align. This tells kramdown
+      lines have to be indented 2 spaces so that they first characters align. This tells kramdown
       that the lines belong to the definition.
-    :       This is the another defintion for the same term. It uses a different number of spaces
+    :       This is the another definition for the same term. It uses a different number of spaces
             for indentation which is okay but should generally be avoided.
        : The definition marker is indented 3 spaces which is allowed but should also be avoided.
@@ -668,7 +668,7 @@ paragraph otherwise:
     : This definition will just be text because it would normally be a paragraph and the there is
       no preceding blank line.
-      > although the defintion contains other block level elements
+      > although the definition contains other block level elements
     : This definition *will* be a paragraph since it is preceded by a blank line.
@@ -678,26 +678,57 @@ definition.
 ## HTML Blocks
-There is no problem mixing HTML tags into a kramdown document. An HTML block is started when
-kramdown encounters a line beginning with an HTML tag that is *not* a span level HTML tag (`div`,
-`p`, `pre`, ...) -- or with a general XML tag -- which may be indented up to three spaces. After
-that any combination of text and HTML tags are allowed, with HTML span tags being treated as text.
-Each HTML block tag must appear completely on the line, it may not be broken across several lines!
-If a block HTML tag is not closed on the same line, the HTML block continues till the HTML block
-line with the corresponding closing tag. Note that only correct XHTML is supported! This means that
-you have to use, for example, `<hr />` instead of `<hr>` (although kramdown tries to fix such things
-if possible).
+> The original Markdown syntax specifies that an HTML block must start at the left margin, i.e. no
+> indentation is allowed. Also, the HTML block has to be surrounded by blank lines. Both
+> restrictions are lifted for kramdown documents. Additionally, the original syntax does not allow
+> you to use Markdown syntax in HTML blocks which is allowed with kramdown.
+{: .markdown-difference}
+An HTML block is potentially started if a line is encountered that begins with a non-span-level HTML
+tag or a general XML tag (opening or closing) which may be indented up to three spaces.
+The following HTML tags count as span level HTML tags and *won't* start an HTML block if found at
+the beginning of an HTML block line:
+    a abbr acronym b big bdo br button cite code del dfn em i img input
+    ins kbd label option q rb rbc rp rt rtc ruby samp select small span
+    strong sub sup textarea tt var
+Further parsing of a found start tag depends on the tag and in which of three possible ways its
+content is parsed:
+* Parse as raw HTML block: If the HTML/XML tag content should be handled as raw HTML, then only
+  HTML/XML tags are parsed from this point onwards and text is handled as raw, unparsed text until
+  the matching end tag is found or until the end of the document. Each found tag will be parsed as
+  raw HTML again. However, if a tag has a `markdown` attribute, this attribute controls parsing of
+  this one tag (see below).
-By default, kramdown does not parse HTML blocks, i.e. when a block level HTML tag is encountered
-only HTML block lines are parsed and everything else is treated as raw text. This will be done until
-the closing tag for the outermost HTML tag is found (or until the end of the document if the closing
-HTML tag does not exist). However, this can be configured with the `:parse_block_html` option. If
-this is set to `true`, then syntax parsing in HTML blocks is enabled. All the examples below assume
-that `:parse_block_html` is set to `true`. It is also possible to enable/disable syntax parsing on a
-tag per tag basis using the `markdown` attribute:
+  Note that only correct XHTML is supported! This means that you have to use, for example, `<hr />`
+  instead of `<hr>` (although kramdown tries to fix such things if possible). If an invalid closing
+  tag is found, it is ignored.
-* If an HTML tag has an attribute `markdown="0"`, then no parsing (except parsing of HTML block
-  lines) is done inside that HTML tag.
+* Parse as block level elements: If the HTML/XML tag content should be parsed as text containing
+  block level elements, the remaining text on the line will be parsed by the block level parser as
+  if it appears on a separate line (**Caution**: This also means that if the line consists of the
+  start tag, text and the end tag, the end tag will not be found!). All following lines are parsed
+  as block level elements until an HTML block line with the matching end tag is found or until the
+  end of the document.
+* Parse as span level elements: If the HTML/XML tag content should be parsed as text containing span
+  level elements, then all text until the *next* matching end tag or until the end of the document
+  will be the content of the tag and will later be parsed by the span level parser.
+If there is text after an end tag, it will be parsed as if it appears on a separate line except when
+inside a raw HTML block.
+Also, if an invalid closing tag is found, it is ignored.
+By default, kramdown parses all block HTML tags and all XML tags as raw HTML blocks. However, this
+can be configured with the `:parse_block_html` option. If this is set to `true`, then syntax parsing
+in HTML blocks is globally enabled. It is also possible to enable/disable syntax parsing on a tag
+per tag basis using the `markdown` attribute:
+* If an HTML tag has an attribute `markdown="0"`, then the tag is parsed as raw HTML block.
 * If an HTML tag has an attribute `markdown="1"`, then the default mechanism for parsing syntax in
   this tag is used.
@@ -708,20 +739,31 @@ tag per tag basis using the `markdown` attribute:
 * If an HTML tag has an attribute `markdown="span"`, then the content of the tag is parsed as span
   level elements.
-Note, however, that text that appears on an HTML block starting or ending line is not parsed as
-block level element even if the rest of the HTML block is! It is parsed as span level text
-nonetheless.
+The following list shows which HTML tags are parsed in which mode by default when `markdown="1"` is
+applied or `:parse_block_html` is `true`:
-If an invalid closing tag is found on an HTML block line while no HTML block is active, it is
-ignored.
+Parse as raw HTML block
+:
+        script math option textarea
-> The original Markdown syntax specifies that an HTML block must start at the left margin, ie. no
-> indentation is allowed. Also, the HTML block has to be surrounded by blank lines. Both
-> restrictions are lifted for kramdown documents. Additionally, the original syntax does not allow
-> you to use Markdown syntax in HTML blocks which is allowed with kramdown.
-{: .markdown-difference}
+    Also, all general XML tags are parsed as raw HTML blocks.
+Parse as block level elements
+:
+        applet button blockquote colgroup dd div dl fieldset form iframe li
+        map noscript object ol table tbody td th thead tfoot tr ul
+Parse as span level elements
+:
+        a abbr acronym address b bdo big cite caption code del dfn dt em
+        h1 h2 h3 h4 h5 h6 i ins kbd label legend optgroup p pre q rb rbc
+        rp rt rtc ruby samp select small span strong sub sup tt var
+> Remember that all span level HTML tags like `a` or `b` do not start a HTML block! However, the
+> above lists also include span level HTML tags in the case the `markdown` attribute is used on a
+> tag inside a raw HTML block.
-Here is a simple example:
+Here is a simple example input and its output with `:parse_block_html` set to `false`:
     This is a para.
     <div>
@@ -731,84 +773,64 @@ Here is a simple example:
 ^
     <p>This is a para.</p>
     <div>
-      <p>Something in here.</p>
+    Something in here.
     </div>
     <p>Other para.</p>
-See how the content of the `div` tag is wrapped in a paragraph!
+As one can see the content of the `div` tag will be parsed as raw HTML block and left alone.
+However, if the `markdown="1"` attribute was used on the `div` tag, the content would be parsed as
+block level elements and therefore converted to a paragraph.
 You can also use several HTML tags at once:
     <div id="content"><div id="layers"><div id="layer1">
-    This is a para in the `layer1` div.
+    This is some text in the `layer1` div.
     </div>
-    This is a para in the `layers` div.
+    This is some text in the `layers` div.
     </div></div>
     This is a para outside the HTML block.
-When you specify an HTML block don't forget that the first column does not change:
+However, remember that if the content of a tag is parsed as block level elements, the content that
+appears after a start/end tag but on the same line, is processed as if it appears on a new line:
-    This is a para.
-    <div><h1>some header</h1>
-        code block (indented four spaces)
-      <div>
-        also a code block
-      </div>
+    <div markdown="1">This is the first part of a para,
+    which is continued here.
     </div>
-If you don't use valid XHTML tags, you sometimes won't get the expected result. However, kramdown
-tries to fix broken HTML if possible (note the automatically closed `<hr />` tag):
+    <p markdown="1">This works without problems because it is parsed as span level elements</p>
-    This is a para.
-    <div>
-    Something is broken here.
-    <hr>
-    </div>
-    This is a para.
-^
-    <p>This is a para.</p>
-    <div>
-      <p>Something is broken here.</p>
-      <hr />
-    </div>
-    <p>This is a para.</p>
+    <div markdown="1">The end tag is not found because
+    this line is parsed as a paragraph</div>
+Since setting `:parse_block_html` to `true` can lead to some not wanted behaviour, it is generally
+better to selectively enable or disable block/span level elements parsing by using the `markdown`
+attribute!
 Unclosed block level HTML tags are correctly closed at the end of the document to ensure correct
 nesting and invalidly used end tags are removed from the output:
     This is a para.
-    <div><div class="clear"></div>
+    <div markdown="1">
     Another para.
     </p>
 ^
     <p>This is a para.</p>
-    <div><div class="clear"></div>
+    <div>
       <p>Another para.</p>
     </div>
-The content of a HTML tag is either parsed as block level elements, span level elements or is not
-parsed at all depending on the tag encountered. For example, a `<div>` tag contains block level
-elements, a `<p>` tag contains span level elements and the contents of a `<script>` tag is not
-parsed at all. General XML tags are also not parsed at all by default.
-The following HTML tags count as span level HTML tags and *won't* start an HTML block if found on an
-HTML block line. All other HTML tags and general XML tags will start an HTML block!
-    a abbr acronym b big bdo br button cite code del dfn em i img input
-    ins kbd label option q rb rbc rp rt rtc ruby samp select small span
-    strong sub sup textarea tt var
 The parsing of processing instructions and XML comments is also supported. The content of both, PIs
 and XML comments, may span multiple lines. The start of a PI/XML comment may only appear at the
-beginning of a line, optionally indented up to three spaces. All characters from the end of a PI/XML
-comment till the end of the line are ignored. kramdown syntax in PIs/XML comments is not processed:
+beginning of a line, optionally indented up to three spaces. If there is text after the end of a PI
+or XML comment, it will be parsed as if it appears on a separate line. kramdown syntax in PIs/XML
+comments is not processed:
     This is a para.
     <!-- a *comment* -->
     <? a processing `instruction`
        spanning multiple lines
-    ?>
-    Another para.
+    ?> First part of para,
+    continues here.
 ## Attribute List Definitions
@@ -836,7 +858,7 @@ An ALD line has the following structure:
 * a left brace, optionally preceded by up to three spaces,
 * followed by a colon, the reference name and another colon,
 * followed by attribute definitions (allowed characters are backslash-escaped closing braces or any
-  character except an unescaped closing brace),
+  character except a not escaped closing brace),
 * followed by a closing brace and optional spaces till the end of the line.
 The reference name needs to start with a word character or a digit, optionally followed by other word
@@ -847,7 +869,7 @@ spaces:
 references
-:   This must be a valid reference name. It is used to reference an other ALD sothat the attributes
+:   This must be a valid reference name. It is used to reference an other ALD so that the attributes
     of the other ALD are also included in this one. The reference name is ignored when collecting the
     attributes if no attribute definition list with this reference name exists. For example, a
     simple reference looks like `id`.
@@ -857,7 +879,7 @@ key-value pairs
 :   A key-value pair is defined by a key name, which must follow the rules for reference names, then
     an equal sign and then the value in single or double quotes. If you need to use the value
     delimiter (a single or a double quote) inside the value, you need to escape it with a backslash.
-    Key-value pairs can be used to specify arbitray attributes for block or span level elements. For
+    Key-value pairs can be used to specify arbitrary attributes for block or span level elements. For
     example, a key-value pair looks like `key1="bef \"quoted\" aft"` or `title='This is a title'`.
 ID name
@@ -869,7 +891,7 @@ ID name
 class names
-:   A class name is definied by using a dot and then the class name. This is (almost, but not quite)
+:   A class name is defined by using a dot and then the class name. This is (almost, but not quite)
     a short hand for the key-value pair `class="class-name"`. Almost because it actually means that
     the class name should be appended to the current value of the `class` attribute. The following
     ALDs are all equivalent:
@@ -941,7 +963,7 @@ With a body
     normally by kramdown.
 If the specified extension is not found, a warning is shown and the whole extension block including
-the body is ignored. The following extensions are builtin:
+the body is ignored. The following extensions are built-in:
 `comment`
@@ -1013,15 +1035,15 @@ Notes:
 To create a reference style link, you need to surround the link text with square brackets (as with
 inline links), followed by optional spaces/tabs/line breaks and then optionally followed with
-another set of square brackets with the link identifier in them. A link indentifier may only contain
+another set of square brackets with the link identifier in them. A link identifier may only contain
 numbers, letters, spaces (line breaks and tabs are converted to spaces, multiple spaces are
-compressed to one) and punctuation characters (ie. `_.:,;!?-`) and is not case sensitive. For
+compressed to one) and punctuation characters (i.e. `_.:,;!?-`) and is not case sensitive. For
 example:
     This is a [reference style link][linkid] to a page. And [this]
     [linkid] is also a link. As is [this][] and [THIS].
-If you don't specify a link identifier (ie. only use empty square brackets) or completely omit the
+If you don't specify a link identifier (i.e. only use empty square brackets) or completely omit the
 second pair of square brackets, the link text is converted to a valid link identifier by removing
 all invalid characters and inserting spaces for line breaks. If there is a link definition found for
 the link identifier, a link will be created. Otherwise the text is not converted to a link.
@@ -1068,7 +1090,7 @@ kramdown uses the HTML elements `em` and `strong` to style emphasized text parts
 are surrounded with single asterisks `*` or underscores `_` are wrapped in `em` tags, text parts
 surrounded with two asterisks or underscores are wrapped in `strong` tags. Surrounded means that the
 starting delimiter must not be followed by a space and that the stopping delimiter must not be
-preceeded by a space. For example:
+preceded by a space. For example:
     *some text*
     _some text_
@@ -1143,9 +1165,10 @@ literal meaning of a backtick you can backslash-escape it:
 HTML tags cannot only be used on the block level but also on the span level. Span level HTML tags
 can only be used inside one block level element, it is not possible to use a start tag in one block
 level element and the end tag in another. Note that only correct XHTML is supported! This means that
-you have to use, for example, `<br />` instead of `<br>`.
+you have to use, for example, `<br />` instead of `<br>` (although kramdown tries to fix such errors
+if possible).
-By default, kramdown parses kramdown syntax inside HTML spans. However, this behaviour can be
+By default, kramdown parses kramdown syntax inside span HTML tags. However, this behaviour can be
 configured with the `:parse_span_html` option. If this is set to `true`, then syntax parsing in HTML
 spans is enabled, if it is set to `false`, parsing is disabled. It is also possible to
 enable/disable syntax parsing on a tag per tag basis using the `markdown` attribute:
@@ -1169,11 +1192,20 @@ Processing instructions and XML comments can also be used (their content is not
 with HTML tags the start and the end have to appear in the same block level element.
 Span level PIs and span level XML comments as well as general span level HTML and XML tags have to
-be preceded by at least one non whitespace character on the same line sothat kramdown correctly
+be preceded by at least one non whitespace character on the same line so that kramdown correctly
 recognizes them as span level element and not as block level element. However, all span HTML tags,
 i.e. `a`, `em`, `b`, ..., (opening or closing) can appear at the start of a line.
-Unclosed HTML tags as well as invalidly used end tags or block HTML tags are escaped.
+Unclosed span level HTML tags are correctly closed at the end of the span level text to ensure
+correct nesting and invalidly used end tags or block HTML tags are removed from the output:
+    This is </invalid>.
+    This <span>is automatically closed.
+^
+    <p>This is .</p>
+    <p>This <span>is automatically closed.</span></p>
 Also note that one or more consecutive new line characters in an HTML span tag are replaced by a
 single space, for example:
@@ -1190,7 +1222,7 @@ single space, for example:
 > the [PHP Markdown Extra] package.
 {: .markdown-difference}
-Footnotes in kramdown are simliar to reference style links and link definitions. You need to place
+Footnotes in kramdown are similar to reference style links and link definitions. You need to place
 the footnote marker in the correct position in the text and the actual footnote content can be
 defined anywhere in the document.