maruku 0.2.13 → 0.3.0

Files changed (86)
  1. data/bin/maruku +23 -15
  2. data/bin/maruku0.3 +37 -0
  3. data/bin/marutest +277 -0
  4. data/docs/changelog-0.3.html +99 -0
  5. data/docs/changelog-0.3.md +84 -0
  6. data/docs/faq.html +46 -0
  7. data/docs/faq.md +32 -0
  8. data/docs/index.html +629 -64
  9. data/docs/markdown_extra2.html +67 -14
  10. data/docs/markdown_syntax.html +631 -94
  11. data/docs/markdown_syntax_2.html +152 -0
  12. data/docs/maruku.html +629 -64
  13. data/docs/maruku.md +108 -105
  14. data/docs/proposal.html +362 -55
  15. data/docs/proposal.md +133 -169
  16. data/docs/todo.html +30 -0
  17. data/lib/maruku.rb +13 -3
  18. data/lib/maruku/errors_management.rb +75 -0
  19. data/lib/maruku/helpers.rb +164 -0
  20. data/lib/maruku/html_helper.rb +33 -13
  21. data/lib/maruku/parse_block.rb +89 -92
  22. data/lib/maruku/parse_doc.rb +43 -18
  23. data/lib/maruku/parse_span.rb +17 -46
  24. data/lib/maruku/parse_span_better.rb +681 -0
  25. data/lib/maruku/string_utils.rb +17 -10
  26. data/lib/maruku/structures.rb +62 -35
  27. data/lib/maruku/structures_iterators.rb +39 -0
  28. data/lib/maruku/tests/benchmark.rb +12 -4
  29. data/lib/maruku/tests/new_parser.rb +318 -0
  30. data/lib/maruku/to_html.rb +113 -44
  31. data/lib/maruku/to_latex.rb +32 -14
  32. data/lib/maruku/to_markdown.rb +110 -0
  33. data/lib/maruku/toc.rb +35 -1
  34. data/lib/maruku/version.rb +10 -1
  35. data/lib/test.rb +29 -0
  36. data/tests/others/escaping.md +6 -4
  37. data/tests/others/links.md +1 -1
  38. data/tests/others/lists_after_paragraph.md +44 -0
  39. data/tests/unittest/abbreviations.md +71 -0
  40. data/tests/unittest/blank.md +43 -0
  41. data/tests/unittest/blanks_in_code.md +131 -0
  42. data/tests/unittest/code.md +64 -0
  43. data/tests/unittest/code2.md +59 -0
  44. data/tests/unittest/code3.md +121 -0
  45. data/tests/unittest/easy.md +36 -0
  46. data/tests/unittest/email.md +39 -0
  47. data/tests/unittest/encoding/iso-8859-1.md +9 -0
  48. data/tests/unittest/encoding/utf-8.md +38 -0
  49. data/tests/unittest/entities.md +174 -0
  50. data/tests/unittest/escaping.md +97 -0
  51. data/tests/unittest/extra_dl.md +81 -0
  52. data/tests/unittest/extra_header_id.md +96 -0
  53. data/tests/unittest/extra_table1.md +78 -0
  54. data/tests/unittest/footnotes.md +120 -0
  55. data/tests/unittest/headers.md +64 -0
  56. data/tests/unittest/hrule.md +77 -0
  57. data/tests/unittest/images.md +114 -0
  58. data/tests/unittest/inline_html.md +185 -0
  59. data/tests/unittest/links.md +162 -0
  60. data/tests/unittest/list1.md +80 -0
  61. data/tests/unittest/list2.md +75 -0
  62. data/tests/unittest/list3.md +111 -0
  63. data/tests/unittest/list4.md +43 -0
  64. data/tests/unittest/lists.md +262 -0
  65. data/tests/unittest/lists_after_paragraph.md +280 -0
  66. data/tests/unittest/lists_ol.md +323 -0
  67. data/tests/unittest/misc_sw.md +751 -0
  68. data/tests/unittest/notyet/escape.md +46 -0
  69. data/tests/unittest/notyet/header_after_par.md +85 -0
  70. data/tests/unittest/notyet/ticks.md +67 -0
  71. data/tests/unittest/notyet/triggering.md +210 -0
  72. data/tests/unittest/one.md +33 -0
  73. data/tests/unittest/paragraph.md +34 -0
  74. data/tests/unittest/paragraph_rules/dont_merge_ref.md +60 -0
  75. data/tests/unittest/paragraph_rules/tab_is_blank.md +43 -0
  76. data/tests/unittest/paragraphs.md +84 -0
  77. data/tests/unittest/recover/recover_links.md +32 -0
  78. data/tests/unittest/references/long_example.md +87 -0
  79. data/tests/unittest/references/spaces_and_numbers.md +27 -0
  80. data/tests/unittest/syntax_hl.md +99 -0
  81. data/tests/unittest/test.md +36 -0
  82. data/tests/unittest/wrapping.md +88 -0
  83. data/tests/utf8-files/simple.md +1 -0
  84. metadata +139 -86
  85. data/lib/maruku/maruku.rb +0 -50
  86. data/tests/a.md +0 -10
@@ -3,74 +3,96 @@ LaTeX_use_listings: true
3
3
  html_use_syntax: true
4
4
  use_numbered_headers: true
5
5
 
6
- Syntax for meta-data
7
- ====================
6
+ Proposal for adding a meta-data syntax to Markdown
7
+ =============================================
8
8
 
9
- This document describe a syntax that makes it possible to attach meta-data to
10
- block-level elements (headers, paragraphs, code blocks, ...),
11
- and to span-level elements (links, images, ...).
9
+ This document describes a syntax for attaching meta-data to
10
+ block-level elements (headers, paragraphs, code blocks,…),
11
+ and to span-level elements (links, images,…).
12
12
 
13
-
14
- Last update: December 29th, 2006.
13
+ Last updated **January 2nd, 2007**: integrated topics
14
+ discussed in mailing list.
15
15
 
16
16
  *Table of contents:*
17
17
  > @toc
18
18
  > * Table of contents
19
19
 
20
+
21
+ Overview
22
+ --------
23
+
24
+ This proposal describes two additions to the Markdown syntax:
25
+
26
+ 1. inline attribute lists (IAL)
27
+
28
+ ## Header ## {key=val .class #id ref_id}
29
+
30
+ 2. attribute lists definitions (ALD)
31
+
32
+ {ref_id}: key=val .class #id
33
+
34
+ Every span-level or block-level element can be followed by an IAL:
35
+
36
+ ### Header ### {#header1 class=c1}
37
+
38
+ Paragraph *with emphasis*{class=c1}
39
+ second line of paragraph
40
+ {class=c1}
41
+
42
+ In this example, the three IALs refer to the header, the emphasis span, and the entire paragraph, respectively.
43
+
44
+ IALs can reference ALDs. The result of the following example is the same as the previous one:
45
+
46
+ ### Header ### {#header1 c1}
47
+
48
+ Paragraph *with emphasis*{c1}
49
+ second line of paragraph
50
+ {c1}
51
+
52
+ {c1}: class=c1
53
+
20
54
  Attribute lists
21
55
  ---------------
22
56
 
23
57
  This is an example attribute list, which shows
24
58
  everything you can put inside:
25
59
 
26
- {key1=val key2="long val" #myid .class1 .class2 tag1 tag2}
60
+ key1=val key2="long val" #myid .class1 .class2 ref1 ref2
27
61
 
28
- More in particular, an attribute list is a brace-enclosed, whitespace-separated list
62
+ More in particular, an attribute list is a whitespace-separated list
29
63
  of elements of 4 different kinds:
30
64
 
31
- 1. key/value pairs
32
- 2. [tags](#using_tags) (`tag1`,`tag2`)
65
+ 1. key/value pairs (quoted if necessary)
66
+ 2. [references to ALD](#using_tags) (`ref1`,`ref2`)
33
67
  3. [id specifiers](#class_id) (`#myid`)
34
68
  4. [class specifiers](#class_id) (`.myclass`)
35
69
 
36
- The formal grammar is specified [below](#grammar).
37
-
38
70
  ### `id` and `class` are special ### {#class_id}
39
71
 
40
- You can attach every attribute you want to elements, but
41
- some are threated in a special way:
42
-
43
- * `id`: you can only have one ID specified for an element.
44
- ID must not conflict with one another.
72
+ For ID and classes there are special shortcuts:
45
73
 
46
- * `class`: class attributes are cumulative.
47
- It is possible to attach more that one class attribute
48
- to the same element (just like HTML).
74
+ * `#myid` is a shortcut for `id=myid`
75
+ * `.myclass` means "add `myclass` to the current `class` attribute".
49
76
 
50
- In this case, the values get merged. So these are equivalent:
77
+ So these are equivalent:
51
78
 
52
79
  {.class1 .class2}
53
80
  {class="class1 class2"}
54
81
 
55
82
 
56
- For ID and classes there are special shortcuts:
57
-
58
- * `#myid` is a shortcut for `id=myid`
59
- * `.myclass` is a shortcut for `class=myclass`
60
-
61
- Therefore the following attribute lists are equivalent:
83
+ The following attribute lists are equivalent:
62
84
 
63
85
  {#myid .class1 .class2}
64
- {id=myid class=class1 class=class2}
86
+ {id=myid class=class1 .class2}
65
87
  {id=myid class="class1 class2"}
88
+ {id=myid class="will be overridden" class=class1 .class2}
66
89
 
67
-
68
- Where to put attribute lists
69
- ----------------------------
90
+ Where to put inline attribute lists
91
+ ----------------------------------
70
92
 
71
93
  ### For block-level elements ###
72
94
 
73
- For paragraphs and other block-level elements, attributes lists go
95
+ For paragraphs and other block-level elements, IAL go
74
96
  **after** the element:
75
97
 
76
98
  This is a paragraph.
@@ -81,7 +103,7 @@ For paragraphs and other block-level elements, attributes lists go
81
103
  > Who said that?
82
104
  {cite=google.com}
83
105
 
84
- Note: empty lines between the block and the attributes list are not tollerated.
106
+ Note: empty lines between the block and the IAL are not tollerated.
85
107
  So this is not legal:
86
108
 
87
109
  This is a paragraph.
@@ -110,7 +132,7 @@ For headers, you can put attribute lists on the same line:
110
132
  Header {#myid .myclass}
111
133
  ------
112
134
 
113
- or, as other block-level elements, on the line after:
135
+ or, as like other block-level elements, on the line below:
114
136
 
115
137
  ### Header ###
116
138
  {#myid}
@@ -121,17 +143,16 @@ or, as other block-level elements, on the line after:
121
143
 
122
144
  ### For span-level elements ###
123
145
 
124
- For span-level elements, metadata goes immediately **after** in the paragraph
146
+ For span-level elements, meta-data goes immediately **after** in the
125
147
  flow.
126
148
 
127
-
128
149
  For example, in this:
129
150
 
130
- This is a *chunky paragraph*{#id1}.
151
+ This is a *chunky paragraph*{#id1}
131
152
  {#id2}
132
153
 
133
154
  the ID of the `em` element is set to `id1`
134
- and the id of the paragraph is set to `id2`.
155
+ and the ID of the paragraph is set to `id2`.
135
156
 
136
157
  This works also for links, like this:
137
158
 
@@ -145,189 +166,132 @@ is equivalent to:
145
166
 
146
167
  This is ![Alt text](url){title="fresh carrots"}
147
168
 
148
- Using "tags" {#using_tags}
149
- ------------
169
+ Using attributes lists definition {#using_tags}
170
+ ---------------------------------
150
171
 
151
172
  In an attribute list, you can have:
173
+
152
174
  1. `key=value` pairs,
153
175
  2. id attributes (`#myid`)
154
176
  3. class attributes (`.myclass`)
155
177
 
156
- Everything else is interpreted as a "tag" [^tag].
157
- Tags let you tag an element and then specify
158
- the attributes later:
178
+ Everything else is interpreted as a reference to
179
+ an ALD.
159
180
 
160
- # Header # {tag}
181
+ # Header # {ref}
161
182
 
162
183
  Blah blah blah.
163
184
 
164
- {tag}: #myhead .myclass lang=fr
185
+ {ref}: #myhead .myclass lang=fr
165
186
 
166
- Tags are not unique: more than one element can
167
- be assigned the same tag.
187
+ Of course, more than one IAL can reference the same ALD:
168
188
 
169
- # Header 1 # {tag}
189
+ # Header 1 # {1}
170
190
  ...
171
- # Header 2 # {tag}
191
+ # Header 2 # {1}
172
192
 
173
- {tag}: .myclass lang=fr
193
+ {1}: .myclass lang=fr
174
194
 
175
- In this case, however, you should not assign
176
- the `id` attribute. So this is **not** valid:
177
195
 
178
- # Header 1 # {tag}
179
- ...
180
- # Header 2 # {tag}
181
-
182
- {tag}: #myid .myclass lang=fr
183
-
184
-
185
- [^tag]: a better name for this?
186
-
187
- Of course, tags are valid for both block-level and span-level elements:
188
-
189
- ### My header ### {1}
190
- This is a paragraph with an *emphasis*{2}
191
- a and the paragraph goes on.
192
- {3}
193
-
194
- {1}: #header_id
195
- {2}: #emph_id
196
- {3}: #par_id
196
+ The rules {#grammar}
197
+ ---------
197
198
 
199
+ ### The issue of escaping ###
198
200
 
199
- Additional examples and corner-cases
200
- ------------------------------------
201
+ 1. No escaping in code blocks.
201
202
 
202
- ### Code blocks ###
203
+ * ``` `\` ``` represents the one-character string `\`.
203
204
 
204
- Note that attributes for code blocks should not be indented
205
- by more than 3 spaces:
205
+ 2. Everywhere else, **all** characters **can** be escaped:
206
206
 
207
- @ code_show_spaces
208
- This is a code block.
209
- {#myid} <-- this is part of the block
210
- {#blockid}
207
+ * `\|` is the literal `|`, `\n` is the literal `n`.
208
+ * `\ ` represents a non-breaking space.
209
+ * `\` followed by a newline represents a linebreak.
211
210
 
211
+ 3. Quotes **must** be escaped inside quoted values:
212
+
213
+ * Inside `"quoted values"`, you **must** escape `"`.
214
+ * Inside `'quoted values'`, you **must** escape `'`.
212
215
 
213
- Formal grammar {#grammar}
214
- --------------
216
+ * Other examples:
215
217
 
216
- In this section we define the formal grammar AKA the big regexp.
218
+ `"bah 'bah' bah"` = `"bah \'bah\' bah"` = `'bah \'bah\' bah'`
219
+
220
+ `'bah "bah" bah'` = `'bah \"bah\" bah'` = `"bah \"bah\" bah"`
217
221
 
218
- In the spirit of HTML:
219
222
 
220
- > Identifiers must begin with a letter (`[A-Za-z]`) and
221
- > may be followed by any number of letters, digits (`[0-9]`),
222
- > hyphens (`-`), underscores (`_`), colons (`:`), and periods (`.`).
223
+ 4. There is an exception for backward compatibility:
223
224
 
224
- the same applies to class attributes and for the keys in key/value pairs.
225
- Moreover, they are case-sensitive.
225
+ [text](url "title"with"quotes")
226
226
 
227
- So this is a valid attribute list:
228
227
 
229
- {#my:_A123.veryspecialID .my:____:class }
228
+ ### Syntax for attribute lists ####
230
229
 
231
- The regexp for identifiers is therefore
230
+ Consider the following attribute list:
232
231
 
233
- Identifier = [A-Za-z][A-Za-z0-9_\.\:\-]*
232
+ {key=value ref key2="quoted value" }
233
+
234
+ In this string, `key`, `value`, and `ref` can be substituted by any
235
+ string that does not contain whitespace, or the unescaped characters `}`,`=`,`'`,`"`.
234
236
 
235
- (This is Ruby syntax; I am told it is similar to Perl's so I guess
236
- it is generally understandable. If not, please tell me the equivalent
237
- in your language.)
237
+ Inside a quoted value, you **may** use `}`,`=` unescaped but you **must**
238
+ escape the other kind of quote.
238
239
 
239
- Now:
240
- * an id attribute is an `Identifier` preceded by `#`
241
- * a class attribute is an `Identifier` preceded by `.`
242
- * a `Tag` is an `Identifier`
243
240
 
244
- * A key/value pair is an Identifier, followed by a `=`, followed by
245
- a value.
241
+ Things to discuss
242
+ -----------------
246
243
 
247
- The value can be quoted (`key="Very long quote"`) or unquoted (`key=small_value`).
244
+ * A syntax for creating `SPAN` elements in the paragraphs and setting their attributes.
248
245
 
249
- * An unquoted value must not start with a double quote `"`, and may contain everything
250
- except whitespace:
246
+ This is my proposal:
251
247
 
252
- UnquotedValue = [^\s\"][^\s]*
248
+ a long paragraph with [special words]{#myspan} that I want to
249
+ highlight
253
250
 
254
- Example:
251
+ should originate the following HTML:
255
252
 
256
- {key1=This=is"myValue_%&$&d9i key2=true}
253
+ <p>a long paragraph with <span id="myspan">special words</span>
254
+ that I want to highlight</p>
257
255
 
258
- * A quoted value is enclosed in double quotes and may contain every char.
259
- In a quoted value there are two escaping rules:
256
+ ***Note: I changed the old `{special words}{#myspan}` with `[special words]{#myspan}` which is less ambiguous.***
260
257
 
261
- 1. The sequence ` \\ ` is replaced by ` \ `
262
- 2. The sequence `\"` is replaced by `"`
258
+ * > Another question: does it makes sense to define `<span>` within
259
+ > Markdown when you can't have `<b>` and `<i>`, or the more meaningful
260
+ > `<cite>`, `<q>`, `<dfn>`, and `<var>`? We have to draw the line somewhere,
261
+ > where should it be? Another good question for the list.
263
262
 
264
- this makes it possible to include both `"` and `\` in the strings.
263
+ Any opinion?
265
264
 
266
- {key1="\\\" backslash and quote also a tab"}
265
+ * **Default ALD for classes of elements.** For example, an header of level 2 inherits automatically the attributes of `{header2}`, if it is defined.
267
266
 
268
- ### Summary ###
267
+ ## Header ##
268
+
269
+ Paragraph..
270
+
271
+ ## Second Header ## {.mah}
272
+
273
+ Paragraph..
274
+
275
+ {header2}: .myclass
276
+ {paragraph}: .withmargins
269
277
 
270
- To summarize:
278
+ In this example:
271
279
 
272
- AttributeList = \{ (ws [KeyValue|IdSpec|ClassSpec|Tag])* ws \}
273
- Identifier = [A-Za-z][A-Za-z0-9_\.\:\-]*
274
- Tag = Identifier
275
- IdSpec = #Identifier
276
- ClassSpec = .Identifier
277
- KeyValue = Key=[QuotedValue|UnquotedValue]
278
- Key = Identifier
279
- UnquotedValue = [^\s\"][^\s]*
280
- QuotedValue = \"[^\"]*\" <---------- note: simplistic
280
+ * the first header has attributes `class=myclass`
281
+ * the second header has attributes `class="myclass mah"`
282
+ * the two paragraphs have attributes `class=withmargins`
281
283
 
282
- **Note**: I am not able to write the regexp for `QuotedValue` that takes into
283
- account also the escaping of the characters. Any regexp wizard out there?
284
284
 
285
- Things to discuss
286
- -----------------
285
+ Design rationale
286
+ ----------------
287
287
 
288
288
  * Question: should we allow whitespace at the sides of `=` in key/value pairs?
289
289
 
290
- * Question: should `:` be a synonym for `=` in attributes list.
290
+ > No, because it is difficult to parse.
291
291
 
292
- Personally, I like this:
293
-
294
- {key1: value key2: "value2 with spaces" }
292
+ * Question: should `:` be a synonym for `=` in attributes list?
295
293
 
296
- much more than this:
297
-
298
- {key1=value key2="value2 with spaces " }
299
-
300
-
301
- * A syntax for creating `SPAN` elements in the paragraphs and setting their
302
- attributes.
303
-
304
- This is my proposal:
294
+ > No, because ':' is used for XML namespaces (`xml:lang=en`)
305
295
 
306
- a long paragraph with {special words}{#myspan} that I want to
307
- highlight
308
-
309
- should originate the following HTML:
310
-
311
- <p>a long paragraph with <span id="myspan">special words</span>
312
- that I want to highlight</p>
313
-
314
- This is Michel's comment on this syntax:
315
-
316
- > It looks quite good. One question is can it be amgibuous with braces
317
- > used for the attributes themselves? I don't have an answer to that
318
- > question; better ask this on the list.
319
-
320
- I don't think it is ambiguous, because it's the only case in which you have
321
- the sequence `}{`:
322
-
323
- {.*}{Attributes}
324
-
325
- > Another question: does it makes sense to define `<span>` within
326
- > Markdown when you can't have `<b>` and `<i>`, or the more meaningful
327
- > `<cite>`, `<q>`, `<dfn>`, and `<var>`? We have to draw the line somewhere,
328
- > where should it be? Another good question for the list.
329
-
330
-
331
296
 
332
- * anything else?
333
297
 
@@ -0,0 +1,30 @@
1
+ <?xml version='1.0' ?>
2
+ <!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Strict//EN'
3
+ 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'>
4
+ <html lang='en' xml:lang='en' xmlns='http://www.w3.org/1999/xhtml'
5
+ ><head
6
+ ><title></title
7
+ ></head
8
+ ><head
9
+ ><title></title
10
+ ></head
11
+ ><body
12
+ ><ul
13
+ ><li
14
+ ><p>Export to HTML * Include RubyPants</p
15
+ ></li
16
+ ><li
17
+ ><p>Export to PDF * support for images</p
18
+ ></li
19
+ ><li
20
+ ><p>Export to Markdown (pretty-printing)</p
21
+ ></li
22
+ ></ul
23
+ ><div class='maruku_signature'
24
+ ><hr
25
+ /><span style='font-size: small; font-style: italic'>Created by <a href='http://maruku.rubyforge.org' title='Maruku: a Markdown interpreter'>Maruku</a
26
+ > at 17:47 on Sunday, December 31st, 2006.</span
27
+ ></div
28
+ ></body
29
+ ></html
30
+ >
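
Note on the attribute-list rules in the proposal text above: the shortcut and override behaviour (`#myid` sets `id`, `.myclass` appends to `class`, a later `key=value` replaces an earlier one, anything else is a reference to an ALD) can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions, not Maruku's parser: `TOKEN`, `unquote` and `parse_attribute_list` are hypothetical names, the input is the text between the braces, an ALD reference is expanded as if its tokens were written inline, and the special meanings of `\ ` (non-breaking space) and `\` followed by a newline are ignored.

```ruby
# Hypothetical sketch of the attribute-list rules; not part of Maruku.

# A token is either key=value (value optionally quoted) or a bare word.
TOKEN = /([^\s=]+)=("(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*'|\S*)|(\S+)/

# Strip surrounding quotes, if any, then drop the escaping backslashes.
def unquote(value)
  m = value.match(/\A(["'])(.*)\1\z/m)
  inner = m ? m[2] : value
  inner.gsub(/\\(.)/m, '\1')
end

# src:  attribute list without the enclosing braces
# alds: hash mapping a reference name to its attribute-list text
def parse_attribute_list(src, alds = {}, attrs = {}, seen = [])
  src.scan(TOKEN) do |key, value, bare|
    if key
      attrs[key] = unquote(value)            # later key=value overrides
    elsif bare.start_with?('#')
      attrs['id'] = bare[1..-1]              # #myid is id=myid
    elsif bare.start_with?('.')
      # .myclass is added to the current class attribute
      attrs['class'] = [attrs['class'], bare[1..-1]].compact.join(' ')
    elsif alds.key?(bare) && !seen.include?(bare)
      # Assumption: a reference behaves like its ALD written inline;
      # `seen` guards against an ALD that references itself.
      parse_attribute_list(alds[bare], alds, attrs, seen + [bare])
    end
  end
  attrs
end
```

With this sketch, the four lists that the proposal calls equivalent (`#myid .class1 .class2` through `id=myid class="will be overridden" class=class1 .class2`) all produce the same hash, and `parse_attribute_list('ref', 'ref' => '#myhead .myclass lang=fr')` resolves the reference through the ALD table.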