kramdown 0.2.0 → 0.3.0

Files changed (67)
  1. data/ChangeLog +267 -0
  2. data/VERSION +1 -1
  3. data/benchmark/benchmark.rb +2 -1
  4. data/benchmark/generate_data.rb +110 -0
  5. data/benchmark/historic-jruby-1.4.0.dat +7 -0
  6. data/benchmark/historic-ruby-1.8.6.dat +7 -0
  7. data/benchmark/historic-ruby-1.8.7.dat +7 -0
  8. data/benchmark/historic-ruby-1.9.1p243.dat +7 -0
  9. data/benchmark/historic-ruby-1.9.2dev.dat +7 -0
  10. data/benchmark/static-jruby-1.4.0.dat +7 -0
  11. data/benchmark/static-ruby-1.8.6.dat +7 -0
  12. data/benchmark/static-ruby-1.8.7.dat +7 -0
  13. data/benchmark/static-ruby-1.9.1p243.dat +7 -0
  14. data/benchmark/static-ruby-1.9.2dev.dat +7 -0
  15. data/benchmark/testing.sh +1 -1
  16. data/doc/index.page +5 -5
  17. data/doc/installation.page +3 -3
  18. data/doc/quickref.page +3 -3
  19. data/doc/syntax.page +133 -101
  20. data/doc/tests.page +9 -1
  21. data/lib/kramdown/compatibility.rb +34 -0
  22. data/lib/kramdown/converter.rb +26 -8
  23. data/lib/kramdown/document.rb +2 -1
  24. data/lib/kramdown/parser.rb +1 -1192
  25. data/lib/kramdown/parser/kramdown.rb +272 -0
  26. data/lib/kramdown/parser/kramdown/attribute_list.rb +102 -0
  27. data/lib/kramdown/parser/kramdown/autolink.rb +42 -0
  28. data/lib/kramdown/parser/kramdown/blank_line.rb +43 -0
  29. data/lib/kramdown/parser/kramdown/blockquote.rb +42 -0
  30. data/lib/kramdown/parser/kramdown/codeblock.rb +62 -0
  31. data/lib/kramdown/parser/kramdown/codespan.rb +57 -0
  32. data/lib/kramdown/parser/kramdown/emphasis.rb +69 -0
  33. data/lib/kramdown/parser/kramdown/eob.rb +39 -0
  34. data/lib/kramdown/parser/kramdown/escaped_chars.rb +38 -0
  35. data/lib/kramdown/parser/kramdown/extension.rb +65 -0
  36. data/lib/kramdown/parser/kramdown/footnote.rb +72 -0
  37. data/lib/kramdown/parser/kramdown/header.rb +81 -0
  38. data/lib/kramdown/parser/kramdown/horizontal_rule.rb +39 -0
  39. data/lib/kramdown/parser/kramdown/html.rb +253 -0
  40. data/lib/kramdown/{deprecated.rb → parser/kramdown/html_entity.rb} +10 -12
  41. data/lib/kramdown/parser/kramdown/line_break.rb +38 -0
  42. data/lib/kramdown/parser/kramdown/link.rb +153 -0
  43. data/lib/kramdown/parser/kramdown/list.rb +225 -0
  44. data/lib/kramdown/parser/kramdown/paragraph.rb +44 -0
  45. data/lib/kramdown/parser/kramdown/typographic_symbol.rb +48 -0
  46. data/lib/kramdown/version.rb +1 -1
  47. data/test/testcases/block/09_html/comment.html +1 -0
  48. data/test/testcases/block/09_html/comment.text +1 -1
  49. data/test/testcases/block/09_html/content_model/tables.text +2 -2
  50. data/test/testcases/block/09_html/not_parsed.html +10 -0
  51. data/test/testcases/block/09_html/not_parsed.text +9 -0
  52. data/test/testcases/block/09_html/parse_as_raw.html +4 -0
  53. data/test/testcases/block/09_html/parse_as_raw.text +2 -0
  54. data/test/testcases/block/09_html/parse_block_html.html +4 -0
  55. data/test/testcases/block/09_html/parse_block_html.text +3 -0
  56. data/test/testcases/block/09_html/processing_instruction.html +1 -0
  57. data/test/testcases/block/09_html/processing_instruction.text +1 -1
  58. data/test/testcases/block/09_html/simple.html +8 -15
  59. data/test/testcases/block/09_html/simple.text +2 -12
  60. data/test/testcases/span/02_emphasis/normal.html +8 -4
  61. data/test/testcases/span/02_emphasis/normal.text +6 -2
  62. data/test/testcases/span/05_html/markdown_attr.html +2 -1
  63. data/test/testcases/span/05_html/markdown_attr.text +2 -1
  64. data/test/testcases/span/05_html/normal.html +6 -2
  65. data/test/testcases/span/05_html/normal.text +4 -0
  66. metadata +35 -4
  67. data/lib/kramdown/parser/registry.rb +0 -62
@@ -0,0 +1,7 @@
+ # Maruku 0.6.0 || BlueFeather 0.32 || BlueCloth 2.0.5 || RDiscount 1.3.5
+ 1 0.06202 0.12554 0.00055 0.00061
+ 2 0.11490 0.42433 0.00150 0.00104
+ 4 0.22792 1.35390 0.00189 0.00210
+ 8 0.50887 4.93941 0.00363 0.00411
+ 16 1.11704 18.90444 0.00792 0.00937
+ 32 2.36787 78.08507 0.01822 0.02148
@@ -0,0 +1,7 @@
+ # Maruku 0.6.0 || BlueFeather 0.32 || BlueCloth 2.0.5 || RDiscount 1.3.5
+ 1 0.06325 0.13453 0.00060 0.00058
+ 2 0.12058 0.45557 0.00108 0.00115
+ 4 0.24777 1.47726 0.00219 0.00237
+ 8 0.49758 5.21774 0.00401 0.00460
+ 16 1.18307 20.93746 0.00916 0.01011
+ 32 2.65514 81.59896 0.02015 0.02210
@@ -0,0 +1,7 @@
+ # Maruku 0.6.0 || BlueFeather 0.32 || BlueCloth 2.0.5 || RDiscount 1.3.5
+ 1 0.07650 0.16749 0.00067 0.00070
+ 2 0.22272 0.54882 0.00146 0.00132
+ 4 0.30977 1.95581 0.00248 0.00265
+ 8 0.61692 7.35415 0.00453 0.00555
+ 16 1.35505 28.91778 0.00939 0.01241
+ 32 3.00461 115.68435 0.02550 0.02510
@@ -0,0 +1,7 @@
+ # Maruku 0.6.0 || BlueFeather 0.32 || BlueCloth 2.0.5 || RDiscount 1.3.5
+ 1 0.05888 0.12860 0.00063 0.00059
+ 2 0.18025 0.40292 0.00115 0.00108
+ 4 0.24735 1.51833 0.00353 0.00299
+ 8 0.51102 6.07239 0.00410 0.00497
+ 16 1.03159 23.21515 0.01348 0.01277
+ 32 2.61620 92.00956 0.01893 0.02003
@@ -2,7 +2,7 @@
 
  source ~/.bashrc
 
- for VERSION in 1.8.6 1.8.7 1.9.1 1.9.2 'jruby 1.4.0'; do
+ for VERSION in 1.8.5 1.8.6 1.8.7 1.9.1 1.9.2 'jruby 1.4.0'; do
  rvm $VERSION
  echo $(ruby -v)
  rake test
@@ -73,16 +73,16 @@ and [mailing lists][ml] available if you have any questions!
 
  ## Welcome to the kramdown site
 
- kramdown is a *free* GPL-licensed [Ruby](http://www.ruby-lang.org) library for parsing a superset of
- Markdown. It is completely written in Ruby, supports standard Markdown (with some minor
- modifications) and various extensions that have been made popular by the [PHP Markdown Extra]
- package and [Maruku].
+ **kramdown** (sic, not Kramdown or KramDown, just kramdown) is a *free* GPL-licensed
+ [Ruby](http://www.ruby-lang.org) library for parsing a superset of Markdown. It is completely
+ written in Ruby, supports standard Markdown (with some minor modifications) and various extensions
+ that have been made popular by the [PHP Markdown Extra] package and [Maruku].
 
  It is probably the fastest pure-Ruby Markdown converter available (November 2009), being 5x faster
  than [Maruku] and about 10x faster than [BlueFeather].
 
  <div class="a-center">
- The latest version of kramdown is <b>0.2.0</b> and it was released on <b>2009-12-03</b>.
+ The latest version of kramdown is <b>0.3.0</b> and it was released on <b>2009-12-20</b>.
  </div>
 
  [PHP Markdown Extra]: http://michelf.com/projects/php-markdown/extra/
@@ -10,8 +10,8 @@ sort_info: 5
  kramdown should work on any platform which supports Ruby. It has been successfully tested on the
  following platforms:
 
- * Linux with Ruby 1.8.6
- * Mac OSX with Ruby 1.8.6, 1.8.7, 1.9.1, 1.9.2-preview1 and jruby 1.4.0.
+ * Linux with Ruby 1.8.5, 1.8.6, 1.8.7, 1.9.1
+ * Mac OSX with Ruby 1.8.5, 1.8.6, 1.8.7, 1.9.1, 1.9.2-preview1 and jruby 1.4.0.
 
  See the platform specific installation notes for more information!
 
@@ -87,4 +87,4 @@ out kramdown use the following command:
  ## Dependencies
 
  Since kramdown is written in Ruby, you just need the [Ruby interpreter](http://www.ruby-lang.org)
- versions 1.8.6, 1.8.7 or 1.9.1. There are no other dependencies.
+ versions 1.8.5, 1.8.6, 1.8.7 or 1.9.1. There are no other dependencies.
@@ -55,7 +55,7 @@ which contains a hard line break.
55
55
 
56
56
  {::kdlink:: #headers part="headers"}
57
57
 
58
- kramdown supports Setext style headers and atx style headers. A header must always be preceeded by a
58
+ kramdown supports Setext style headers and atx style headers. A header must always be preceded by a
59
59
  blank line except at the beginning of the document:
60
60
 
61
61
  {::kdexample:}
@@ -239,7 +239,7 @@ term
239
239
  {::kdexample:}
240
240
 
241
241
  Each term can be styled using span level elements and each definition is parsed as block level
242
- elements, ie. you can use any block level in a definition. Just use the same indent for the lines
242
+ elements, i.e. you can use any block level in a definition. Just use the same indent for the lines
243
243
  following the definition line:
244
244
 
245
245
  {::kdexample:}
@@ -469,7 +469,7 @@ e.g. `` `code` ``.
469
469
  {::kdlink:: #footnotes part="footnotes"}
470
470
 
471
471
  Footnotes can easily be used in kramdown. Just set a footnote marker (consists of square brackets
472
- with a caret and the footname inside) in the text and somewhere else the footnote definition (which
472
+ with a caret and the footnote name inside) in the text and somewhere else the footnote definition (which
473
473
  basically looks like a reference link definition):
474
474
 
475
475
  {::kdexample:}
@@ -17,7 +17,7 @@ The kramdown syntax is based on the Markdown syntax and has been enhanced with f
17
17
  found in other Markdown implementations like [Maruku], [PHP Markdown Extra] and [Pandoc]. However,
18
18
  it strives to provide a strict syntax with definite rules and therefore isn't completely compatible
19
19
  with Markdown. Nonetheless, most Markdown documents should work fine when parsed with kramdown. All
20
- places where the kramdown syntax differes from the Markdown syntax are highlighted.
20
+ places where the kramdown syntax differs from the Markdown syntax are highlighted.
21
21
 
22
22
  Following is the complete syntax definition so that you know what you will get when a kramdown
23
23
  document is converted to HTML. There are basically two types of elements: block level elements
@@ -59,7 +59,7 @@ will be converted to:
59
59
  0 &lt; 1 &lt; 2 and 2 &gt; 1 &gt; 0</p>
60
60
 
61
61
  Since kramdown also uses some characters to mark-up the text, there need to be a way to escape these
62
- special characters sothat they can have their normal meaning. This can be done by using backslash
62
+ special characters so that they can have their normal meaning. This can be done by using backslash
63
63
  escapes. For example, you can use a literal backtick like this:
64
64
 
65
65
  This \`is not a code\` span!
@@ -152,7 +152,7 @@ Paragraphs are the most used block level elements. One or more consecutive lines
152
152
  interpreted as one paragraph. Every line of a paragraph may be indented up to three spaces. You can
153
153
  separate two consecutive paragraphs from each other by using one or more blank lines. Notice that a
154
154
  line break in the source does not mean a line break in the output. If you want to have an explicit
155
- line break (ie. a `<br />` tag) you need to end a line with two or more spaces or two backslashes!
155
+ line break (i.e. a `<br />` tag) you need to end a line with two or more spaces or two backslashes!
156
156
  Note, however, that a line break on the last text line of a paragraph is not possible and will be
157
157
  ignored. Leading and trailing spaces will be stripped from the paragraph text.
158
158
 
@@ -205,7 +205,7 @@ As mentioned you need to insert a blank line before a Setext header:
205
205
  ------------
206
206
 
207
207
  However, it is generally a good idea to also use a blank line after a Setext header because it looks
208
- more approriate.
208
+ more appropriate.
209
209
 
210
210
  > The original Markdown syntax allows one to omit the blank line before a Setext header. However,
211
211
  > this makes reading the document harder than necessary and is therefore not allowed in a kramdown
@@ -243,13 +243,13 @@ close the header. Any leading or trailing spaces are stripped from the header te
  ### Automatic Generation of Header IDs
 
  kramdown supports the automatic generation of header IDs if the option `:auto_ids` is set to `true`
- (which is the default). This is done by converting the untransformed, ie. plain, header text via the
+ (which is the default). This is done by converting the untransformed, i.e. plain, header text via the
  following steps:
 
  * All characters except letters, numbers, spaces and dashes are removed.
  * All characters from the start of the line till the first letter are removed.
  * Everything except letters and numbers is converted to dashes.
- * Everything is downcased.
+ * Everything is lowercased.
  * If nothing is left, the identifier `section` is used.
  * If a such created identifier already exists, a dash and a sequential number is added (first `-1`,
  then `-2` and so on).
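Reviewer note: the ID generation steps listed in this hunk can be sketched in Ruby roughly as follows. This is only an illustration of the documented rules, not kramdown's actual code. The method name `sketch_header_id` and the `used` counter hash are hypothetical, and the sketch assumes that "letters" means ASCII letters.

```ruby
# Hypothetical helper illustrating the documented steps; not kramdown's implementation.
# `used` maps each base ID to the number of times it has been generated so far.
def sketch_header_id(text, used = Hash.new(0))
  id = text.gsub(/[^a-zA-Z0-9 -]/, '')  # remove all but letters, numbers, spaces, dashes
  id = id.sub(/\A[^a-zA-Z]+/, '')       # remove everything before the first letter
  id = id.gsub(/[^a-zA-Z0-9]/, '-')     # remaining spaces and dashes become dashes
  id = id.downcase                      # lowercase everything
  id = 'section' if id.empty?           # fallback identifier when nothing is left
  seen = used[id]
  used[id] += 1
  seen.zero? ? id : "#{id}-#{seen}"     # first duplicate gets -1, the next -2, ...
end
```

When the same `used` hash is passed for every header of a document, a repeated header text such as `Hello World!` yields `hello-world` first and `hello-world-1` on the second call. As a simplification, the sketch does not check whether a suffixed ID collides with another header's literal text.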
@@ -335,7 +335,7 @@ indented with five spaces or one space and one tab, like this:
335
335
  >
336
336
  > ruby -e 'puts :works'
337
337
 
338
- > The original Markdown syntax allowed "lazy" blockquotes, ie. blockquotes where only the first line
338
+ > The original Markdown syntax allowed "lazy" blockquotes, i.e. blockquotes where only the first line
339
339
  > needs a blockquote marker. This is disallowed in kramdown, you always need to use a blockquote
340
340
  > marker! The rational behind this is that most email programs and good text editors put the `>`
341
341
  > maker automatically before every quoted line and that things like the following work like
@@ -477,7 +477,7 @@ list markers in the list may be indented up to three spaces or the number of spa
477
477
  indentation of the last list item minus one, whichever number is smaller. For example:
478
478
 
479
479
  * This is the first line. Since the first non-space characters appears in column 3, all other
480
- lines have to be indented 2 spaces sothat they first characters align. This tells kramdown
480
+ lines have to be indented 2 spaces so that the first characters align. This tells kramdown
481
481
  that the lines belong to the list item.
482
482
  * This is the another item of the list. It uses a different number of spaces for
483
483
  indentation which is okay but should generally be avoided.
@@ -531,7 +531,7 @@ by leaving a blank line after the last list item and using an EOB marker:
531
531
  > blank lines are wrapped in paragraph tags. This means that the first text will also be wrapped in
532
532
  > a paragraph if you have block level elements in a list which are separated by blank lines. The
533
533
  > above rule is easy to remember and lets you exactly specify when the first list text should be
534
- > wrapped in a paragraph. The idea for the above rule comes from the [pandoc] package.
534
+ > wrapped in a paragraph. The idea for the above rule comes from the [Pandoc] package.
535
535
  {: .markdown-difference}
536
536
 
537
537
  As seen in the examples above, blank lines between list items are allowed.
@@ -580,7 +580,7 @@ marker.
580
580
  > This kind of syntax is disallowed in kramdown.
581
581
  {: .markdown-difference}
582
582
 
583
- If you want to have one list directly after another one (both with the same list type, ie. ordered
583
+ If you want to have one list directly after another one (both with the same list type, i.e. ordered
584
584
  or unordered), you need to use an EOB marker to separate the two:
585
585
 
586
586
  * List one
@@ -641,7 +641,7 @@ The column number of the first non-space character which appears after a definit
641
641
  same line specifies the indentation that has to be used for the following lines of the definition.
642
642
  If there is no such character, the indentation that needs to be used is four spaces or one tab. If
643
643
  one of the following lines does not have the needed amount of indentation, it is not treated as part
644
- of the defintion. The indentation is stripped from the definition and it (note that the definition
644
+ of the definition. The indentation is stripped from the definition and it (note that the definition
645
645
  naturally also contains the content of the line with the definition marker) is processed as text
646
646
  containing block level elements. If there is more than one definition, all other definition markers
647
647
  for the term may be indented up to three spaces or the number of spaces used for the indentation of
@@ -650,9 +650,9 @@ the last definition minus one, whichever number is smaller. For example:
650
650
  definition term 1
651
651
  definition term 2
652
652
  : This is the first line. Since the first non-space characters appears in column 3, all other
653
- lines have to be indented 2 spaces sothat they first characters align. This tells kramdown
653
+ lines have to be indented 2 spaces so that they first characters align. This tells kramdown
654
654
  that the lines belong to the definition.
655
- : This is the another defintion for the same term. It uses a different number of spaces
655
+ : This is the another definition for the same term. It uses a different number of spaces
656
656
  for indentation which is okay but should generally be avoided.
657
657
  : The definition marker is indented 3 spaces which is allowed but should also be avoided.
658
658
 
@@ -668,7 +668,7 @@ paragraph otherwise:
668
668
  : This definition will just be text because it would normally be a paragraph and the there is
669
669
  no preceding blank line.
670
670
 
671
- > although the defintion contains other block level elements
671
+ > although the definition contains other block level elements
672
672
 
673
673
  : This definition *will* be a paragraph since it is preceded by a blank line.
674
674
 
@@ -678,26 +678,57 @@ definition.
678
678
 
679
679
  ## HTML Blocks
680
680
 
681
- There is no problem mixing HTML tags into a kramdown document. An HTML block is started when
682
- kramdown encounters a line beginning with an HTML tag that is *not* a span level HTML tag (`div`,
683
- `p`, `pre`, ...) -- or with a general XML tag -- which may be indented up to three spaces. After
684
- that any combination of text and HTML tags are allowed, with HTML span tags being treated as text.
685
- Each HTML block tag must appear completely on the line, it may not be broken across several lines!
686
- If a block HTML tag is not closed on the same line, the HTML block continues till the HTML block
687
- line with the corresponding closing tag. Note that only correct XHTML is supported! This means that
688
- you have to use, for example, `<hr />` instead of `<hr>` (although kramdown tries to fix such things
689
- if possible).
681
+ > The original Markdown syntax specifies that an HTML block must start at the left margin, i.e. no
682
+ > indentation is allowed. Also, the HTML block has to be surrounded by blank lines. Both
683
+ > restrictions are lifted for kramdown documents. Additionally, the original syntax does not allow
684
+ > you to use Markdown syntax in HTML blocks which is allowed with kramdown.
685
+ {: .markdown-difference}
686
+
687
+ An HTML block is potentially started if a line is encountered that begins with a non-span-level HTML
688
+ tag or a general XML tag (opening or closing) which may be indented up to three spaces.
689
+
690
+ The following HTML tags count as span level HTML tags and *won't* start an HTML block if found at
691
+ the beginning of an HTML block line:
692
+
693
+ a abbr acronym b big bdo br button cite code del dfn em i img input
694
+ ins kbd label option q rb rbc rp rt rtc ruby samp select small span
695
+ strong sub sup textarea tt var
696
+
697
+ Further parsing of a found start tag depends on the tag and in which of three possible ways its
698
+ content is parsed:
699
+
700
+ * Parse as raw HTML block: If the HTML/XML tag content should be handled as raw HTML, then only
701
+ HTML/XML tags are parsed from this point onwards and text is handled as raw, unparsed text until
702
+ the matching end tag is found or until the end of the document. Each found tag will be parsed as
703
+ raw HTML again. However, if a tag has a `markdown` attribute, this attribute controls parsing of
704
+ this one tag (see below).
690
705
 
691
- By default, kramdown does not parse HTML blocks, i.e. when a block level HTML tag is encountered
692
- only HTML block lines are parsed and everything else is treated as raw text. This will be done until
693
- the closing tag for the outermost HTML tag is found (or until the end of the document if the closing
694
- HTML tag does not exist). However, this can be configured with the `:parse_block_html` option. If
695
- this is set to `true`, then syntax parsing in HTML blocks is enabled. All the examples below assume
696
- that `:parse_block_html` is set to `true`. It is also possible to enable/disable syntax parsing on a
697
- tag per tag basis using the `markdown` attribute:
706
+ Note that only correct XHTML is supported! This means that you have to use, for example, `<hr />`
707
+ instead of `<hr>` (although kramdown tries to fix such things if possible). If an invalid closing
708
+ tag is found, it is ignored.
698
709
 
699
- * If an HTML tag has an attribute `markdown="0"`, then no parsing (except parsing of HTML block
700
- lines) is done inside that HTML tag.
710
+ * Parse as block level elements: If the HTML/XML tag content should be parsed as text containing
711
+ block level elements, the remaining text on the line will be parsed by the block level parser as
712
+ if it appears on a separate line (**Caution**: This also means that if the line consists of the
713
+ start tag, text and the end tag, the end tag will not be found!). All following lines are parsed
714
+ as block level elements until an HTML block line with the matching end tag is found or until the
715
+ end of the document.
716
+
717
+ * Parse as span level elements: If the HTML/XML tag content should be parsed as text containing span
718
+ level elements, then all text until the *next* matching end tag or until the end of the document
719
+ will be the content of the tag and will later be parsed by the span level parser.
720
+
721
+ If there is text after an end tag, it will be parsed as if it appears on a separate line except when
722
+ inside a raw HTML block.
723
+
724
+ Also, if an invalid closing tag is found, it is ignored.
725
+
726
+ By default, kramdown parses all block HTML tags and all XML tags as raw HTML blocks. However, this
727
+ can be configured with the `:parse_block_html` option. If this is set to `true`, then syntax parsing
728
+ in HTML blocks is globally enabled. It is also possible to enable/disable syntax parsing on a tag
729
+ per tag basis using the `markdown` attribute:
730
+
731
+ * If an HTML tag has an attribute `markdown="0"`, then the tag is parsed as raw HTML block.
701
732
 
702
733
  * If an HTML tag has an attribute `markdown="1"`, then the default mechanism for parsing syntax in
703
734
  this tag is used.
@@ -708,20 +739,31 @@ tag per tag basis using the `markdown` attribute:
708
739
  * If an HTML tag has an attribute `markdown="span"`, then the content of the tag is parsed as span
709
740
  level elements.
710
741
 
711
- Note, however, that text that appears on an HTML block starting or ending line is not parsed as
712
- block level element even if the rest of the HTML block is! It is parsed as span level text
713
- nonetheless.
742
+ The following list shows which HTML tags are parsed in which mode by default when `markdown="1"` is
743
+ applied or `:parse_block_html` is `true`:
714
744
 
715
- If an invalid closing tag is found on an HTML block line while no HTML block is active, it is
716
- ignored.
745
+ Parse as raw HTML block
746
+ :
747
+ script math option textarea
717
748
 
718
- > The original Markdown syntax specifies that an HTML block must start at the left margin, ie. no
719
- > indentation is allowed. Also, the HTML block has to be surrounded by blank lines. Both
720
- > restrictions are lifted for kramdown documents. Additionally, the original syntax does not allow
721
- > you to use Markdown syntax in HTML blocks which is allowed with kramdown.
722
- {: .markdown-difference}
749
+ Also, all general XML tags are parsed as raw HTML blocks.
750
+
751
+ Parse as block level elements
752
+ :
753
+ applet button blockquote colgroup dd div dl fieldset form iframe li
754
+ map noscript object ol table tbody td th thead tfoot tr ul
755
+
756
+ Parse as span level elements
757
+ :
758
+ a abbr acronym address b bdo big cite caption code del dfn dt em
759
+ h1 h2 h3 h4 h5 h6 i ins kbd label legend optgroup p pre q rb rbc
760
+ rp rt rtc ruby samp select small span strong sub sup tt var
761
+
762
+ > Remember that all span level HTML tags like `a` or `b` do not start a HTML block! However, the
763
+ > above lists also include span level HTML tags in the case the `markdown` attribute is used on a
764
+ > tag inside a raw HTML block.
723
765
 
724
- Here is a simple example:
766
+ Here is a simple example input and its output with `:parse_block_html` set to `false`:
725
767
 
726
768
  This is a para.
727
769
  <div>
@@ -731,84 +773,64 @@ Here is a simple example:
731
773
  ^
732
774
  <p>This is a para.</p>
733
775
  <div>
734
- <p>Something in here.</p>
776
+ Something in here.
735
777
  </div>
736
778
  <p>Other para.</p>
737
779
 
738
- See how the content of the `div` tag is wrapped in a paragraph!
780
+ As one can see the content of the `div` tag will be parsed as raw HTML block and left alone.
781
+ However, if the `markdown="1"` attribute was used on the `div` tag, the content would be parsed as
782
+ block level elements and therefore converted to a paragraph.
739
783
 
740
784
  You can also use several HTML tags at once:
741
785
 
742
786
  <div id="content"><div id="layers"><div id="layer1">
743
- This is a para in the `layer1` div.
787
+ This is some text in the `layer1` div.
744
788
  </div>
745
- This is a para in the `layers` div.
789
+ This is some text in the `layers` div.
746
790
  </div></div>
747
791
  This is a para outside the HTML block.
748
792
 
749
- When you specify an HTML block don't forget that the first column does not change:
793
+ However, remember that if the content of a tag is parsed as block level elements, the content that
794
+ appears after a start/end tag but on the same line, is processed as if it appears on a new line:
750
795
 
751
- This is a para.
752
- <div><h1>some header</h1>
753
- code block (indented four spaces)
754
- <div>
755
- also a code block
756
- </div>
796
+ <div markdown="1">This is the first part of a para,
797
+ which is continued here.
757
798
  </div>
758
799
 
759
- If you don't use valid XHTML tags, you sometimes won't get the expected result. However, kramdown
760
- tries to fix broken HTML if possible (note the automatically closed `<hr />` tag):
800
+ <p markdown="1">This works without problems because it is parsed as span level elements</p>
761
801
 
762
- This is a para.
763
- <div>
764
- Something is broken here.
765
- <hr>
766
- </div>
767
- This is a para.
768
- ^
769
- <p>This is a para.</p>
770
- <div>
771
- <p>Something is broken here.</p>
772
- <hr />
773
- </div>
774
- <p>This is a para.</p>
802
+ <div markdown="1">The end tag is not found because
803
+ this line is parsed as a paragraph</div>
804
+
805
+ Since setting `:parse_block_html` to `true` can lead to some not wanted behaviour, it is generally
806
+ better to selectively enable or disable block/span level elements parsing by using the `markdown`
807
+ attribute!
775
808
 
776
809
  Unclosed block level HTML tags are correctly closed at the end of the document to ensure correct
777
810
  nesting and invalidly used end tags are removed from the output:
778
811
 
779
812
  This is a para.
780
- <div><div class="clear"></div>
813
+ <div markdown="1">
781
814
  Another para.
782
815
  </p>
783
816
  ^
784
817
  <p>This is a para.</p>
785
- <div><div class="clear"></div>
818
+ <div>
786
819
  <p>Another para.</p>
787
820
  </div>
788
821
 
789
- The content of a HTML tag is either parsed as block level elements, span level elements or is not
790
- parsed at all depending on the tag encountered. For example, a `<div>` tag contains block level
791
- elements, a `<p>` tag contains span level elements and the contents of a `<script>` tag is not
792
- parsed at all. General XML tags are also not parsed at all by default.
793
-
794
- The following HTML tags count as span level HTML tags and *won't* start an HTML block if found on an
795
- HTML block line. All other HTML tags and general XML tags will start an HTML block!
796
-
797
- a abbr acronym b big bdo br button cite code del dfn em i img input
798
- ins kbd label option q rb rbc rp rt rtc ruby samp select small span
799
- strong sub sup textarea tt var
800
-
801
822
  The parsing of processing instructions and XML comments is also supported. The content of both, PIs
802
823
  and XML comments, may span multiple lines. The start of a PI/XML comment may only appear at the
803
- beginning of a line, optionally indented up to three spaces. All characters from the end of a PI/XML
804
- comment till the end of the line are ignored. kramdown syntax in PIs/XML comments is not processed:
824
+ beginning of a line, optionally indented up to three spaces. If there is text after the end of a PI
825
+ or XML comment, it will be parsed as if it appears on a separate line. kramdown syntax in PIs/XML
826
+ comments is not processed:
805
827
 
806
828
  This is a para.
807
829
  <!-- a *comment* -->
808
830
  <? a processing `instruction`
809
831
  spanning multiple lines
810
- ?>
811
- Another para.
832
+ ?> First part of para,
833
+ continues here.
812
834
 
813
835
 
814
836
  ## Attribute List Definitions
@@ -836,7 +858,7 @@ An ALD line has the following structure:
836
858
  * a left brace, optionally preceded by up to three spaces,
837
859
  * followed by a colon, the reference name and another colon,
838
860
  * followed by attribute definitions (allowed characters are backslash-escaped closing braces or any
839
- character except an unescaped closing brace),
861
+ character except a not escaped closing brace),
840
862
  * followed by a closing brace and optional spaces till the end of the line.
841
863
 
842
864
  The reference name needs to start with a word character or a digit, optionally followed by other word
@@ -847,7 +869,7 @@ spaces:
847
869
 
848
870
  references
849
871
 
850
- : This must be a valid reference name. It is used to reference an other ALD sothat the attributes
872
+ : This must be a valid reference name. It is used to reference an other ALD so that the attributes
851
873
  of the other ALD are also included in this one. The reference name is ignored when collecting the
852
874
  attributes if no attribute definition list with this reference name exists. For example, a
853
875
  simple reference looks like `id`.
@@ -857,7 +879,7 @@ key-value pairs
857
879
  : A key-value pair is defined by a key name, which must follow the rules for reference names, then
858
880
  an equal sign and then the value in single or double quotes. If you need to use the value
859
881
  delimiter (a single or a double quote) inside the value, you need to escape it with a backslash.
860
- Key-value pairs can be used to specify arbitray attributes for block or span level elements. For
882
+ Key-value pairs can be used to specify arbitrary attributes for block or span level elements. For
861
883
  example, a key-value pair looks like `key1="bef \"quoted\" aft"` or `title='This is a title'`.
862
884
 
863
885
  ID name
@@ -869,7 +891,7 @@ ID name
 
  class names
 
- : A class name is definied by using a dot and then the class name. This is (almost, but not quite)
+ : A class name is defined by using a dot and then the class name. This is (almost, but not quite)
  a short hand for the key-value pair `class="class-name"`. Almost because it actually means that
  the class name should be appended to the current value of the `class` attribute. The following
  ALDs are all equivalent:
@@ -941,7 +963,7 @@ With a body
  normally by kramdown.
 
  If the specified extension is not found, a warning is shown and the whole extension block including
- the body is ignored. The following extensions are builtin:
+ the body is ignored. The following extensions are built-in:
 
  `comment`
 
@@ -1013,15 +1035,15 @@ Notes:
 
  To create a reference style link, you need to surround the link text with square brackets (as with
  inline links), followed by optional spaces/tabs/line breaks and then optionally followed with
- another set of square brackets with the link identifier in them. A link indentifier may only contain
+ another set of square brackets with the link identifier in them. A link identifier may only contain
  numbers, letters, spaces (line breaks and tabs are converted to spaces, multiple spaces are
- compressed to one) and punctuation characters (ie. `_.:,;!?-`) and is not case sensitive. For
+ compressed to one) and punctuation characters (i.e. `_.:,;!?-`) and is not case sensitive. For
  example:
 
  This is a [reference style link][linkid] to a page. And [this]
  [linkid] is also a link. As is [this][] and [THIS].
 
- If you don't specify a link identifier (ie. only use empty square brackets) or completely omit the
+ If you don't specify a link identifier (i.e. only use empty square brackets) or completely omit the
  second pair of square brackets, the link text is converted to a valid link identifier by removing
  all invalid characters and inserting spaces for line breaks. If there is a link definition found for
  the link identifier, a link will be created. Otherwise the text is not converted to a link.
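
The link identifier rules in the hunk above (tabs and line breaks become spaces, invalid characters are removed, runs of spaces are compressed, case is ignored) can be illustrated with a minimal Ruby sketch. The helper name `normalize_link_id` is hypothetical and is not part of kramdown's API; treating "letters" as ASCII letters only and the order of the steps are assumptions as well, so this is not a description of how kramdown implements the conversion.

```ruby
# Hypothetical helper (not kramdown API): derive a link identifier from link
# text according to the rules described in the documentation hunk above.
# Assumption: "letters" means ASCII letters only.
def normalize_link_id(text)
  text
    .gsub(/[\t\r\n]/, ' ')              # line breaks and tabs are converted to spaces
    .gsub(/[^A-Za-z0-9 _.:,;!?-]/, '')  # remove characters that are not allowed
    .squeeze(' ')                       # compress multiple spaces to one
    .downcase                           # identifiers are not case sensitive
end

puts normalize_link_id("THIS")                 # prints "this"
puts normalize_link_id("Some\tLink  (text)!")  # prints "some link text!"
```

Under these assumptions, `[THIS]` and `[this][]` from the example resolve to the same identifier, which is why both can match a single link definition.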
@@ -1068,7 +1090,7 @@ kramdown uses the HTML elements `em` and `strong` to style emphasized text parts
  are surrounded with single asterisks `*` or underscores `_` are wrapped in `em` tags, text parts
  surrounded with two asterisks or underscores are wrapped in `strong` tags. Surrounded means that the
  starting delimiter must not be followed by a space and that the stopping delimiter must not be
- preceeded by a space. For example:
+ preceded by a space. For example:
 
  *some text*
  _some text_
@@ -1143,9 +1165,10 @@ literal meaning of a backtick you can backslash-escape it:
  HTML tags cannot only be used on the block level but also on the span level. Span level HTML tags
  can only be used inside one block level element, it is not possible to use a start tag in one block
  level element and the end tag in another. Note that only correct XHTML is supported! This means that
- you have to use, for example, `<br />` instead of `<br>`.
+ you have to use, for example, `<br />` instead of `<br>` (although kramdown tries to fix such errors
+ if possible).
 
- By default, kramdown parses kramdown syntax inside HTML spans. However, this behaviour can be
+ By default, kramdown parses kramdown syntax inside span HTML tags. However, this behaviour can be
  configured with the `:parse_span_html` option. If this is set to `true`, then syntax parsing in HTML
  spans is enabled, if it is set to `false`, parsing is disabled. It is also possible to
  enable/disable syntax parsing on a tag per tag basis using the `markdown` attribute:
@@ -1169,11 +1192,20 @@ Processing instructions and XML comments can also be used (their content is not
  with HTML tags the start and the end have to appear in the same block level element.
 
  Span level PIs and span level XML comments as well as general span level HTML and XML tags have to
- be preceded by at least one non whitespace character on the same line sothat kramdown correctly
+ be preceded by at least one non whitespace character on the same line so that kramdown correctly
  recognizes them as span level element and not as block level element. However, all span HTML tags,
  i.e. `a`, `em`, `b`, ..., (opening or closing) can appear at the start of a line.
 
- Unclosed HTML tags as well as invalidly used end tags or block HTML tags are escaped.
+ Unclosed span level HTML tags are correctly closed at the end of the span level text to ensure
+ correct nesting and invalidly used end tags or block HTML tags are removed from the output:
+
+ This is </invalid>.
+
+ This <span>is automatically closed.
+ ^
+ <p>This is .</p>
+
+ <p>This <span>is automatically closed.</span></p>
 
  Also note that one or more consecutive new line characters in an HTML span tag are replaced by a
  single space, for example:
@@ -1190,7 +1222,7 @@ single space, for example:
  > the [PHP Markdown Extra] package.
  {: .markdown-difference}
 
- Footnotes in kramdown are simliar to reference style links and link definitions. You need to place
+ Footnotes in kramdown are similar to reference style links and link definitions. You need to place
  the footnote marker in the correct position in the text and the actual footnote content can be
  defined anywhere in the document.