maruku 0.2.13 → 0.3.0

Files changed (86)
  1. data/bin/maruku +23 -15
  2. data/bin/maruku0.3 +37 -0
  3. data/bin/marutest +277 -0
  4. data/docs/changelog-0.3.html +99 -0
  5. data/docs/changelog-0.3.md +84 -0
  6. data/docs/faq.html +46 -0
  7. data/docs/faq.md +32 -0
  8. data/docs/index.html +629 -64
  9. data/docs/markdown_extra2.html +67 -14
  10. data/docs/markdown_syntax.html +631 -94
  11. data/docs/markdown_syntax_2.html +152 -0
  12. data/docs/maruku.html +629 -64
  13. data/docs/maruku.md +108 -105
  14. data/docs/proposal.html +362 -55
  15. data/docs/proposal.md +133 -169
  16. data/docs/todo.html +30 -0
  17. data/lib/maruku.rb +13 -3
  18. data/lib/maruku/errors_management.rb +75 -0
  19. data/lib/maruku/helpers.rb +164 -0
  20. data/lib/maruku/html_helper.rb +33 -13
  21. data/lib/maruku/parse_block.rb +89 -92
  22. data/lib/maruku/parse_doc.rb +43 -18
  23. data/lib/maruku/parse_span.rb +17 -46
  24. data/lib/maruku/parse_span_better.rb +681 -0
  25. data/lib/maruku/string_utils.rb +17 -10
  26. data/lib/maruku/structures.rb +62 -35
  27. data/lib/maruku/structures_iterators.rb +39 -0
  28. data/lib/maruku/tests/benchmark.rb +12 -4
  29. data/lib/maruku/tests/new_parser.rb +318 -0
  30. data/lib/maruku/to_html.rb +113 -44
  31. data/lib/maruku/to_latex.rb +32 -14
  32. data/lib/maruku/to_markdown.rb +110 -0
  33. data/lib/maruku/toc.rb +35 -1
  34. data/lib/maruku/version.rb +10 -1
  35. data/lib/test.rb +29 -0
  36. data/tests/others/escaping.md +6 -4
  37. data/tests/others/links.md +1 -1
  38. data/tests/others/lists_after_paragraph.md +44 -0
  39. data/tests/unittest/abbreviations.md +71 -0
  40. data/tests/unittest/blank.md +43 -0
  41. data/tests/unittest/blanks_in_code.md +131 -0
  42. data/tests/unittest/code.md +64 -0
  43. data/tests/unittest/code2.md +59 -0
  44. data/tests/unittest/code3.md +121 -0
  45. data/tests/unittest/easy.md +36 -0
  46. data/tests/unittest/email.md +39 -0
  47. data/tests/unittest/encoding/iso-8859-1.md +9 -0
  48. data/tests/unittest/encoding/utf-8.md +38 -0
  49. data/tests/unittest/entities.md +174 -0
  50. data/tests/unittest/escaping.md +97 -0
  51. data/tests/unittest/extra_dl.md +81 -0
  52. data/tests/unittest/extra_header_id.md +96 -0
  53. data/tests/unittest/extra_table1.md +78 -0
  54. data/tests/unittest/footnotes.md +120 -0
  55. data/tests/unittest/headers.md +64 -0
  56. data/tests/unittest/hrule.md +77 -0
  57. data/tests/unittest/images.md +114 -0
  58. data/tests/unittest/inline_html.md +185 -0
  59. data/tests/unittest/links.md +162 -0
  60. data/tests/unittest/list1.md +80 -0
  61. data/tests/unittest/list2.md +75 -0
  62. data/tests/unittest/list3.md +111 -0
  63. data/tests/unittest/list4.md +43 -0
  64. data/tests/unittest/lists.md +262 -0
  65. data/tests/unittest/lists_after_paragraph.md +280 -0
  66. data/tests/unittest/lists_ol.md +323 -0
  67. data/tests/unittest/misc_sw.md +751 -0
  68. data/tests/unittest/notyet/escape.md +46 -0
  69. data/tests/unittest/notyet/header_after_par.md +85 -0
  70. data/tests/unittest/notyet/ticks.md +67 -0
  71. data/tests/unittest/notyet/triggering.md +210 -0
  72. data/tests/unittest/one.md +33 -0
  73. data/tests/unittest/paragraph.md +34 -0
  74. data/tests/unittest/paragraph_rules/dont_merge_ref.md +60 -0
  75. data/tests/unittest/paragraph_rules/tab_is_blank.md +43 -0
  76. data/tests/unittest/paragraphs.md +84 -0
  77. data/tests/unittest/recover/recover_links.md +32 -0
  78. data/tests/unittest/references/long_example.md +87 -0
  79. data/tests/unittest/references/spaces_and_numbers.md +27 -0
  80. data/tests/unittest/syntax_hl.md +99 -0
  81. data/tests/unittest/test.md +36 -0
  82. data/tests/unittest/wrapping.md +88 -0
  83. data/tests/utf8-files/simple.md +1 -0
  84. metadata +139 -86
  85. data/lib/maruku/maruku.rb +0 -50
  86. data/tests/a.md +0 -10
@@ -3,74 +3,96 @@ LaTeX_use_listings: true
3
3
  html_use_syntax: true
4
4
  use_numbered_headers: true
5
5
 
6
- Syntax for meta-data
7
- ====================
6
+ Proposal for adding a meta-data syntax to Markdown
7
+ =============================================
8
8
 
9
- This document describe a syntax that makes it possible to attach meta-data to
10
- block-level elements (headers, paragraphs, code blocks, ...),
11
- and to span-level elements (links, images, ...).
9
+ This document describes a syntax for attaching meta-data to
10
+ block-level elements (headers, paragraphs, code blocks,…),
11
+ and to span-level elements (links, images,…).
12
12
 
13
-
14
- Last update: December 29th, 2006.
13
+ Last updated **January 2nd, 2007**: integrated topics
14
+ discussed in mailing list.
15
15
 
16
16
  *Table of contents:*
17
17
  > @toc
18
18
  > * Table of contents
19
19
 
20
+
21
+ Overview
22
+ --------
23
+
24
+ This proposal describes two additions to the Markdown syntax:
25
+
26
+ 1. inline attribute lists (IAL)
27
+
28
+ ## Header ## {key=val .class #id ref_id}
29
+
30
+ 2. attribute lists definitions (ALD)
31
+
32
+ {ref_id}: key=val .class #id
33
+
34
+ Every span-level or block-level element can be followed by an IAL:
35
+
36
+ ### Header ### {#header1 class=c1}
37
+
38
+ Paragraph *with emphasis*{class=c1}
39
+ second line of paragraph
40
+ {class=c1}
41
+
42
+ In this example, the three IALs refer to the header, the emphasis span, and the entire paragraph, respectively.
43
+
44
+ IALs can reference ALDs. The result of the following example is the same as the previous one:
45
+
46
+ ### Header ### {#header1 c1}
47
+
48
+ Paragraph *with emphasis*{c1}
49
+ second line of paragraph
50
+ {c1}
51
+
52
+ {c1}: class=c1
53
+
20
54
  Attribute lists
21
55
  ---------------
22
56
 
23
57
  This is an example attribute list, which shows
24
58
  everything you can put inside:
25
59
 
26
- {key1=val key2="long val" #myid .class1 .class2 tag1 tag2}
60
+ key1=val key2="long val" #myid .class1 .class2 ref1 ref2
27
61
 
28
- More in particular, an attribute list is a brace-enclosed, whitespace-separated list
62
+ More in particular, an attribute list is a whitespace-separated list
29
63
  of elements of 4 different kinds:
30
64
 
31
- 1. key/value pairs
32
- 2. [tags](#using_tags) (`tag1`,`tag2`)
65
+ 1. key/value pairs (quoted if necessary)
66
+ 2. [references to ALD](#using_tags) (`ref1`,`ref2`)
33
67
  3. [id specifiers](#class_id) (`#myid`)
34
68
  4. [class specifiers](#class_id) (`.myclass`)
35
69
 
36
- The formal grammar is specified [below](#grammar).
37
-
38
70
  ### `id` and `class` are special ### {#class_id}
39
71
 
40
- You can attach every attribute you want to elements, but
41
- some are threated in a special way:
42
-
43
- * `id`: you can only have one ID specified for an element.
44
- ID must not conflict with one another.
72
+ For ID and classes there are special shortcuts:
45
73
 
46
- * `class`: class attributes are cumulative.
47
- It is possible to attach more that one class attribute
48
- to the same element (just like HTML).
74
+ * `#myid` is a shortcut for `id=myid`
75
+ * `.myclass` means "add `myclass` to the current `class` attribute".
49
76
 
50
- In this case, the values get merged. So these are equivalent:
77
+ So these are equivalent:
51
78
 
52
79
  {.class1 .class2}
53
80
  {class="class1 class2"}
54
81
 
55
82
 
56
- For ID and classes there are special shortcuts:
57
-
58
- * `#myid` is a shortcut for `id=myid`
59
- * `.myclass` is a shortcut for `class=myclass`
60
-
61
- Therefore the following attribute lists are equivalent:
83
+ The following attribute lists are equivalent:
62
84
 
63
85
  {#myid .class1 .class2}
64
- {id=myid class=class1 class=class2}
86
+ {id=myid class=class1 .class2}
65
87
  {id=myid class="class1 class2"}
88
+ {id=myid class="will be overridden" class=class1 .class2}
66
89
 
67
-
68
- Where to put attribute lists
69
- ----------------------------
90
+ Where to put inline attribute lists
91
+ ----------------------------------
70
92
 
71
93
  ### For block-level elements ###
72
94
 
73
- For paragraphs and other block-level elements, attributes lists go
95
+ For paragraphs and other block-level elements, IAL go
74
96
  **after** the element:
75
97
 
76
98
  This is a paragraph.
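
Reviewer note on the hunk above: the shortcut rules it introduces (`#myid` sets `id`, `.myclass` appends to the current `class`, a later `class=` replaces the current value, anything else is a reference to an ALD) can be sketched in Ruby roughly as follows. `parse_attribute_list` is a hypothetical helper written for illustration, not Maruku's actual parser. It takes the text between the braces and, as a simplifying assumption, treats every backslash-escaped character inside a value as the literal character.

```ruby
# Hypothetical helper (not part of Maruku): parses the inside of an
# attribute list, without the surrounding braces, into [attributes, refs].
def parse_attribute_list(source)
  attributes = {}
  refs = []
  # One token is key="quoted", key='quoted', key=bare, or a bare word.
  token = /([^\s=]+)=(?:"((?:\\.|[^"\\])*)"|'((?:\\.|[^'\\])*)'|(\S*))|(\S+)/
  source.scan(token) do |key, dquoted, squoted, bare, word|
    if key
      value = dquoted || squoted || bare
      # Assumption: "\x" always means the literal character "x".
      # A later key=value replaces an earlier one, including class=.
      attributes[key] = value.gsub(/\\(.)/, '\1')
    elsif word.start_with?('#')
      attributes['id'] = word[1..-1]            # "#myid" is a shortcut for id=myid
    elsif word.start_with?('.')
      # ".myclass" adds myclass to the current class attribute
      attributes['class'] = [attributes['class'], word[1..-1]].compact.join(' ')
    else
      refs << word                              # everything else references an ALD
    end
  end
  [attributes, refs]
end

# The four lists the proposal calls equivalent:
lists = [
  '#myid .class1 .class2',
  'id=myid class=class1 .class2',
  'id=myid class="class1 class2"',
  'id=myid class="will be overridden" class=class1 .class2'
]
lists.each { |list| p parse_attribute_list(list).first }
```

Under these assumptions all four lists resolve to the same attributes, which is the behaviour the new `class=class1 .class2` and "will be overridden" examples appear to be documenting.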
@@ -81,7 +103,7 @@ For paragraphs and other block-level elements, attributes lists go
81
103
  > Who said that?
82
104
  {cite=google.com}
83
105
 
84
- Note: empty lines between the block and the attributes list are not tollerated.
106
+ Note: empty lines between the block and the IAL are not tolerated.
85
107
  So this is not legal:
86
108
 
87
109
  This is a paragraph.
@@ -110,7 +132,7 @@ For headers, you can put attribute lists on the same line:
110
132
  Header {#myid .myclass}
111
133
  ------
112
134
 
113
- or, as other block-level elements, on the line after:
135
+ or, like other block-level elements, on the line below:
114
136
 
115
137
  ### Header ###
116
138
  {#myid}
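
Reviewer note on the hunk above: the two placements it describes for headers (IAL at the end of the header line, or alone on the line below) could both be detected with a single pattern. The sketch below is only an illustration. `split_trailing_ial` is a hypothetical name, and the pattern deliberately ignores escaped braces and quoted values that contain `}`, both of which the proposal allows.

```ruby
# Hypothetical helper (not part of Maruku): separates a trailing {...}
# from the text before it. Returns [text, ial_body], where ial_body is nil
# when the line does not end in a brace-enclosed list.
def split_trailing_ial(line)
  if line =~ /\A(.*?)\s*\{([^{}]*)\}\s*\z/
    [$1, $2]
  else
    [line.rstrip, nil]
  end
end

p split_trailing_ial('Header {#myid .myclass}')
p split_trailing_ial('### Header ###')
p split_trailing_ial('{#myid}')   # IAL on its own line: the text part is empty
```

An empty text part signals a stand-alone IAL line, which a block parser would then attach to the preceding element.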
@@ -121,17 +143,16 @@ or, as other block-level elements, on the line after:
121
143
 
122
144
  ### For span-level elements ###
123
145
 
124
- For span-level elements, metadata goes immediately **after** in the paragraph
146
+ For span-level elements, meta-data goes immediately **after** in the
125
147
  flow.
126
148
 
127
-
128
149
  For example, in this:
129
150
 
130
- This is a *chunky paragraph*{#id1}.
151
+ This is a *chunky paragraph*{#id1}
131
152
  {#id2}
132
153
 
133
154
  the ID of the `em` element is set to `id1`
134
- and the id of the paragraph is set to `id2`.
155
+ and the ID of the paragraph is set to `id2`.
135
156
 
136
157
  This works also for links, like this:
137
158
 
@@ -145,189 +166,132 @@ is equivalent to:
145
166
 
146
167
  This is ![Alt text](url){title="fresh carrots"}
147
168
 
148
- Using "tags" {#using_tags}
149
- ------------
169
+ Using attributes lists definition {#using_tags}
170
+ ---------------------------------
150
171
 
151
172
  In an attribute list, you can have:
173
+
152
174
  1. `key=value` pairs,
153
175
  2. id attributes (`#myid`)
154
176
  3. class attributes (`.myclass`)
155
177
 
156
- Everything else is interpreted as a "tag" [^tag].
157
- Tags let you tag an element and then specify
158
- the attributes later:
178
+ Everything else is interpreted as a reference to
179
+ an ALD.
159
180
 
160
- # Header # {tag}
181
+ # Header # {ref}
161
182
 
162
183
  Blah blah blah.
163
184
 
164
- {tag}: #myhead .myclass lang=fr
185
+ {ref}: #myhead .myclass lang=fr
165
186
 
166
- Tags are not unique: more than one element can
167
- be assigned the same tag.
187
+ Of course, more than one IAL can reference the same ALD:
168
188
 
169
- # Header 1 # {tag}
189
+ # Header 1 # {1}
170
190
  ...
171
- # Header 2 # {tag}
191
+ # Header 2 # {1}
172
192
 
173
- {tag}: .myclass lang=fr
193
+ {1}: .myclass lang=fr
174
194
 
175
- In this case, however, you should not assign
176
- the `id` attribute. So this is **not** valid:
177
195
 
178
- # Header 1 # {tag}
179
- ...
180
- # Header 2 # {tag}
181
-
182
- {tag}: #myid .myclass lang=fr
183
-
184
-
185
- [^tag]: a better name for this?
186
-
187
- Of course, tags are valid for both block-level and span-level elements:
188
-
189
- ### My header ### {1}
190
- This is a paragraph with an *emphasis*{2}
191
- a and the paragraph goes on.
192
- {3}
193
-
194
- {1}: #header_id
195
- {2}: #emph_id
196
- {3}: #par_id
196
+ The rules {#grammar}
197
+ ---------
197
198
 
199
+ ### The issue of escaping ###
198
200
 
199
- Additional examples and corner-cases
200
- ------------------------------------
201
+ 1. No escaping in code blocks.
201
202
 
202
- ### Code blocks ###
203
+ * ``` `\` ``` represents the one-character string `\`.
203
204
 
204
- Note that attributes for code blocks should not be indented
205
- by more than 3 spaces:
205
+ 2. Everywhere else, **all** characters **can** be escaped:
206
206
 
207
- @ code_show_spaces
208
- This is a code block.
209
- {#myid} <-- this is part of the block
210
- {#blockid}
207
+ * `\|` is the literal `|`, `\n` is the literal `n`.
208
+ * `\ ` represents a non-breaking space.
209
+ * `\` followed by a newline represents a linebreak.
211
210
 
211
+ 3. Quotes **must** be escaped inside quoted values:
212
+
213
+ * Inside `"quoted values"`, you **must** escape `"`.
214
+ * Inside `'quoted values'`, you **must** escape `'`.
212
215
 
213
- Formal grammar {#grammar}
214
- --------------
216
+ * Other examples:
215
217
 
216
- In this section we define the formal grammar AKA the big regexp.
218
+ `"bah 'bah' bah"` = `"bah \'bah\' bah"` = `'bah \'bah\' bah'`
219
+
220
+ `'bah "bah" bah'` = `'bah \"bah\" bah'` = `"bah \"bah\" bah"`
217
221
 
218
- In the spirit of HTML:
219
222
 
220
- > Identifiers must begin with a letter (`[A-Za-z]`) and
221
- > may be followed by any number of letters, digits (`[0-9]`),
222
- > hyphens (`-`), underscores (`_`), colons (`:`), and periods (`.`).
223
+ 4. There is an exception for backward compatibility:
223
224
 
224
- the same applies to class attributes and for the keys in key/value pairs.
225
- Moreover, they are case-sensitive.
225
+ [text](url "title"with"quotes")
226
226
 
227
- So this is a valid attribute list:
228
227
 
229
- {#my:_A123.veryspecialID .my:____:class }
228
+ ### Syntax for attribute lists ###
230
229
 
231
- The regexp for identifiers is therefore
230
+ Consider the following attribute list:
232
231
 
233
- Identifier = [A-Za-z][A-Za-z0-9_\.\:\-]*
232
+ {key=value ref key2="quoted value" }
233
+
234
+ In this string, `key`, `value`, and `ref` can be substituted by any
235
+ string that does not contain whitespace, or the unescaped characters `}`,`=`,`'`,`"`.
234
236
 
235
- (This is Ruby syntax; I am told it is similar to Perl's so I guess
236
- it is generally understandable. If not, please tell me the equivalent
237
- in your language.)
237
+ Inside a quoted value, you **may** use `}`,`=` unescaped but you **must**
238
+ escape the other kind of quote.
238
239
 
239
- Now:
240
- * an id attribute is an `Identifier` preceded by `#`
241
- * a class attribute is an `Identifier` preceded by `.`
242
- * a `Tag` is an `Identifier`
243
240
 
244
- * A key/value pair is an Identifier, followed by a `=`, followed by
245
- a value.
241
+ Things to discuss
242
+ -----------------
246
243
 
247
- The value can be quoted (`key="Very long quote"`) or unquoted (`key=small_value`).
244
+ * A syntax for creating `SPAN` elements in the paragraphs and setting their attributes.
248
245
 
249
- * An unquoted value must not start with a double quote `"`, and may contain everything
250
- except whitespace:
246
+ This is my proposal:
251
247
 
252
- UnquotedValue = [^\s\"][^\s]*
248
+ a long paragraph with [special words]{#myspan} that I want to
249
+ highlight
253
250
 
254
- Example:
251
+ should originate the following HTML:
255
252
 
256
- {key1=This=is"myValue_%&$&d9i key2=true}
253
+ <p>a long paragraph with <span id="myspan">special words</span>
254
+ that I want to highlight</p>
257
255
 
258
- * A quoted value is enclosed in double quotes and may contain every char.
259
- In a quoted value there are two escaping rules:
256
+ ***Note: I changed the old `{special words}{#myspan}` with `[special words]{#myspan}` which is less ambiguous.***
260
257
 
261
- 1. The sequence ` \\ ` is replaced by ` \ `
262
- 2. The sequence `\"` is replaced by `"`
258
+ * > Another question: does it makes sense to define `<span>` within
259
+ > Markdown when you can't have `<b>` and `<i>`, or the more meaningful
260
+ > `<cite>`, `<q>`, `<dfn>`, and `<var>`? We have to draw the line somewhere,
261
+ > where should it be? Another good question for the list.
263
262
 
264
- this makes it possible to include both `"` and `\` in the strings.
263
+ Any opinion?
265
264
 
266
- {key1="\\\" backslash and quote also a tab"}
265
+ * **Default ALD for classes of elements.** For example, an header of level 2 inherits automatically the attributes of `{header2}`, if it is defined.
267
266
 
268
- ### Summary ###
267
+ ## Header ##
268
+
269
+ Paragraph..
270
+
271
+ ## Second Header ## {.mah}
272
+
273
+ Paragraph..
274
+
275
+ {header2}: .myclass
276
+ {paragraph}: .withmargins
269
277
 
270
- To summarize:
278
+ In this example:
271
279
 
272
- AttributeList = \{ (ws [KeyValue|IdSpec|ClassSpec|Tag])* ws \}
273
- Identifier = [A-Za-z][A-Za-z0-9_\.\:\-]*
274
- Tag = Identifier
275
- IdSpec = #Identifier
276
- ClassSpec = .Identifier
277
- KeyValue = Key=[QuotedValue|UnquotedValue]
278
- Key = Identifier
279
- UnquotedValue = [^\s\"][^\s]*
280
- QuotedValue = \"[^\"]*\" <---------- note: simplistic
280
+ * the first header has attributes `class=myclass`
281
+ * the second header has attributes `class="myclass mah"`
282
+ * the two paragraphs have attributes `class=withmargins`
281
283
 
282
- **Note**: I am not able to write the regexp for `QuotedValue` that takes into
283
- account also the escaping of the characters. Any regexp wizard out there?
284
284
 
285
- Things to discuss
286
- -----------------
285
+ Design rationale
286
+ ----------------
287
287
 
288
288
  * Question: should we allow whitespace at the sides of `=` in key/value pairs?
289
289
 
290
- * Question: should `:` be a synonym for `=` in attributes list.
290
+ > No, because it is difficult to parse.
291
291
 
292
- Personally, I like this:
293
-
294
- {key1: value key2: "value2 with spaces" }
292
+ * Question: should `:` be a synonym for `=` in attributes list?
295
293
 
296
- much more than this:
297
-
298
- {key1=value key2="value2 with spaces " }
299
-
300
-
301
- * A syntax for creating `SPAN` elements in the paragraphs and setting their
302
- attributes.
303
-
304
- This is my proposal:
294
+ > No, because ':' is used for XML namespaces (`xml:lang=en`)
305
295
 
306
- a long paragraph with {special words}{#myspan} that I want to
307
- highlight
308
-
309
- should originate the following HTML:
310
-
311
- <p>a long paragraph with <span id="myspan">special words</span>
312
- that I want to highlight</p>
313
-
314
- This is Michel's comment on this syntax:
315
-
316
- > It looks quite good. One question is can it be amgibuous with braces
317
- > used for the attributes themselves? I don't have an answer to that
318
- > question; better ask this on the list.
319
-
320
- I don't think it is ambiguous, because it's the only case in which you have
321
- the sequence `}{`:
322
-
323
- {.*}{Attributes}
324
-
325
- > Another question: does it makes sense to define `<span>` within
326
- > Markdown when you can't have `<b>` and `<i>`, or the more meaningful
327
- > `<cite>`, `<q>`, `<dfn>`, and `<var>`? We have to draw the line somewhere,
328
- > where should it be? Another good question for the list.
329
-
330
-
331
296
 
332
- * anything else?
333
297
 
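
Reviewer note on the `proposal.md` changes above: the ALD references and the proposed "default ALD for classes of elements" can be modelled as an ordered list of operations applied to an attribute hash. The following Ruby sketch is an illustration only. `apply_attribute_list` and the `:set` / `:add_class` / `:ref` item tags are hypothetical names, and skipping undefined or circular references is an assumption, since the proposal does not say what should happen in those cases.

```ruby
# Hypothetical model (not Maruku's implementation). An attribute list is an
# ordered array of items:
#   [:set, key, value]   for key=value (and #id, as id=...)
#   [:add_class, name]   for .name
#   [:ref, name]         for a reference to an ALD
# alds maps an ALD name to its own array of items.
def apply_attribute_list(items, alds, attributes = {}, seen = [])
  items.each do |kind, name, value|
    case kind
    when :set
      attributes[name] = value
    when :add_class
      attributes['class'] = [attributes['class'], name].compact.join(' ')
    when :ref
      # Assumption: undefined and circular references are silently skipped.
      next if seen.include?(name) || !alds.key?(name)
      apply_attribute_list(alds[name], alds, attributes, seen + [name])
    end
  end
  attributes
end

# The "default ALD" example: {header2}: .myclass
alds = {
  'header2'   => [[:add_class, 'myclass']],
  'paragraph' => [[:add_class, 'withmargins']]
}
# A level-2 header implicitly starts with a reference to header2.
first_header  = apply_attribute_list([[:ref, 'header2']], alds)
second_header = apply_attribute_list([[:ref, 'header2'], [:add_class, 'mah']], alds)
p first_header
p second_header
```

Applying the default reference first and the element's own IAL afterwards reproduces the result stated in the proposal, where the second header ends up with `class="myclass mah"`.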
@@ -0,0 +1,30 @@
1
+ <?xml version='1.0' ?>
2
+ <!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Strict//EN'
3
+ 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'>
4
+ <html lang='en' xml:lang='en' xmlns='http://www.w3.org/1999/xhtml'
5
+ ><head
6
+ ><title></title
7
+ ></head
8
+ ><head
9
+ ><title></title
10
+ ></head
11
+ ><body
12
+ ><ul
13
+ ><li
14
+ ><p>Export to HTML * Include RubyPants</p
15
+ ></li
16
+ ><li
17
+ ><p>Export to PDF * support for images</p
18
+ ></li
19
+ ><li
20
+ ><p>Export to Markdown (pretty-printing)</p
21
+ ></li
22
+ ></ul
23
+ ><div class='maruku_signature'
24
+ ><hr
25
+ /><span style='font-size: small; font-style: italic'>Created by <a href='http://maruku.rubyforge.org' title='Maruku: a Markdown interpreter'>Maruku</a
26
+ > at 17:47 on Sunday, December 31st, 2006.</span
27
+ ></div
28
+ ></body
29
+ ></html
30
+ >