maruku 0.2.13 → 0.3.0
- data/bin/maruku +23 -15
- data/bin/maruku0.3 +37 -0
- data/bin/marutest +277 -0
- data/docs/changelog-0.3.html +99 -0
- data/docs/changelog-0.3.md +84 -0
- data/docs/faq.html +46 -0
- data/docs/faq.md +32 -0
- data/docs/index.html +629 -64
- data/docs/markdown_extra2.html +67 -14
- data/docs/markdown_syntax.html +631 -94
- data/docs/markdown_syntax_2.html +152 -0
- data/docs/maruku.html +629 -64
- data/docs/maruku.md +108 -105
- data/docs/proposal.html +362 -55
- data/docs/proposal.md +133 -169
- data/docs/todo.html +30 -0
- data/lib/maruku.rb +13 -3
- data/lib/maruku/errors_management.rb +75 -0
- data/lib/maruku/helpers.rb +164 -0
- data/lib/maruku/html_helper.rb +33 -13
- data/lib/maruku/parse_block.rb +89 -92
- data/lib/maruku/parse_doc.rb +43 -18
- data/lib/maruku/parse_span.rb +17 -46
- data/lib/maruku/parse_span_better.rb +681 -0
- data/lib/maruku/string_utils.rb +17 -10
- data/lib/maruku/structures.rb +62 -35
- data/lib/maruku/structures_iterators.rb +39 -0
- data/lib/maruku/tests/benchmark.rb +12 -4
- data/lib/maruku/tests/new_parser.rb +318 -0
- data/lib/maruku/to_html.rb +113 -44
- data/lib/maruku/to_latex.rb +32 -14
- data/lib/maruku/to_markdown.rb +110 -0
- data/lib/maruku/toc.rb +35 -1
- data/lib/maruku/version.rb +10 -1
- data/lib/test.rb +29 -0
- data/tests/others/escaping.md +6 -4
- data/tests/others/links.md +1 -1
- data/tests/others/lists_after_paragraph.md +44 -0
- data/tests/unittest/abbreviations.md +71 -0
- data/tests/unittest/blank.md +43 -0
- data/tests/unittest/blanks_in_code.md +131 -0
- data/tests/unittest/code.md +64 -0
- data/tests/unittest/code2.md +59 -0
- data/tests/unittest/code3.md +121 -0
- data/tests/unittest/easy.md +36 -0
- data/tests/unittest/email.md +39 -0
- data/tests/unittest/encoding/iso-8859-1.md +9 -0
- data/tests/unittest/encoding/utf-8.md +38 -0
- data/tests/unittest/entities.md +174 -0
- data/tests/unittest/escaping.md +97 -0
- data/tests/unittest/extra_dl.md +81 -0
- data/tests/unittest/extra_header_id.md +96 -0
- data/tests/unittest/extra_table1.md +78 -0
- data/tests/unittest/footnotes.md +120 -0
- data/tests/unittest/headers.md +64 -0
- data/tests/unittest/hrule.md +77 -0
- data/tests/unittest/images.md +114 -0
- data/tests/unittest/inline_html.md +185 -0
- data/tests/unittest/links.md +162 -0
- data/tests/unittest/list1.md +80 -0
- data/tests/unittest/list2.md +75 -0
- data/tests/unittest/list3.md +111 -0
- data/tests/unittest/list4.md +43 -0
- data/tests/unittest/lists.md +262 -0
- data/tests/unittest/lists_after_paragraph.md +280 -0
- data/tests/unittest/lists_ol.md +323 -0
- data/tests/unittest/misc_sw.md +751 -0
- data/tests/unittest/notyet/escape.md +46 -0
- data/tests/unittest/notyet/header_after_par.md +85 -0
- data/tests/unittest/notyet/ticks.md +67 -0
- data/tests/unittest/notyet/triggering.md +210 -0
- data/tests/unittest/one.md +33 -0
- data/tests/unittest/paragraph.md +34 -0
- data/tests/unittest/paragraph_rules/dont_merge_ref.md +60 -0
- data/tests/unittest/paragraph_rules/tab_is_blank.md +43 -0
- data/tests/unittest/paragraphs.md +84 -0
- data/tests/unittest/recover/recover_links.md +32 -0
- data/tests/unittest/references/long_example.md +87 -0
- data/tests/unittest/references/spaces_and_numbers.md +27 -0
- data/tests/unittest/syntax_hl.md +99 -0
- data/tests/unittest/test.md +36 -0
- data/tests/unittest/wrapping.md +88 -0
- data/tests/utf8-files/simple.md +1 -0
- metadata +139 -86
- data/lib/maruku/maruku.rb +0 -50
- data/tests/a.md +0 -10
data/docs/proposal.md
CHANGED
@@ -3,74 +3,96 @@ LaTeX_use_listings: true
|
|
3
3
|
html_use_syntax: true
|
4
4
|
use_numbered_headers: true
|
5
5
|
|
6
|
-
|
7
|
-
|
6
|
+
Proposal for adding a meta-data syntax to Markdown
|
7
|
+
=============================================
|
8
8
|
|
9
|
-
This document
|
10
|
-
block-level elements (headers, paragraphs, code blocks
|
11
|
-
and to span-level elements (links, images
|
9
|
+
This document describes a syntax for attaching meta-data to
|
10
|
+
block-level elements (headers, paragraphs, code blocks,…),
|
11
|
+
and to span-level elements (links, images,…).
|
12
12
|
|
13
|
-
|
14
|
-
|
13
|
+
Last updated **January 2nd, 2007**: integrated topics
|
14
|
+
discussed in mailing list.
|
15
15
|
|
16
16
|
*Table of contents:*
|
17
17
|
> @toc
|
18
18
|
> * Table of contents
|
19
19
|
|
20
|
+
|
21
|
+
Overview
|
22
|
+
--------
|
23
|
+
|
24
|
+
This proposal describes two additions to the Markdown syntax:
|
25
|
+
|
26
|
+
1. inline attribute lists (IAL)
|
27
|
+
|
28
|
+
## Header ## {key=val .class #id ref_id}
|
29
|
+
|
30
|
+
2. attribute lists definitions (ALD)
|
31
|
+
|
32
|
+
{ref_id}: key=val .class #id
|
33
|
+
|
34
|
+
Every span-level or block-level element can be followed by an IAL:
|
35
|
+
|
36
|
+
### Header ### {#header1 class=c1}
|
37
|
+
|
38
|
+
Paragraph *with emphasis*{class=c1}
|
39
|
+
second line of paragraph
|
40
|
+
{class=c1}
|
41
|
+
|
42
|
+
In this example, the three IALs refer to the header, the emphasis span, and the entire paragraph, respectively.
|
43
|
+
|
44
|
+
IALs can reference ALDs. The result of the following example is the same as the previous one:
|
45
|
+
|
46
|
+
### Header ### {#header1 c1}
|
47
|
+
|
48
|
+
Paragraph *with emphasis*{c1}
|
49
|
+
second line of paragraph
|
50
|
+
{c1}
|
51
|
+
|
52
|
+
{c1}: class=c1
|
53
|
+
|
20
54
|
Attribute lists
|
21
55
|
---------------
|
22
56
|
|
23
57
|
This is an example attribute list, which shows
|
24
58
|
everything you can put inside:
|
25
59
|
|
26
|
-
|
60
|
+
key1=val key2="long val" #myid .class1 .class2 ref1 ref2
|
27
61
|
|
28
|
-
More in particular, an attribute list is a
|
62
|
+
More in particular, an attribute list is a whitespace-separated list
|
29
63
|
of elements of 4 different kinds:
|
30
64
|
|
31
|
-
1. key/value pairs
|
32
|
-
2. [
|
65
|
+
1. key/value pairs (quoted if necessary)
|
66
|
+
2. [references to ALD](#using_tags) (`ref1`,`ref2`)
|
33
67
|
3. [id specifiers](#class_id) (`#myid`)
|
34
68
|
4. [class specifiers](#class_id) (`.myclass`)
|
35
69
|
|
36
|
-
The formal grammar is specified [below](#grammar).
|
37
|
-
|
38
70
|
### `id` and `class` are special ### {#class_id}
|
39
71
|
|
40
|
-
|
41
|
-
some are threated in a special way:
|
42
|
-
|
43
|
-
* `id`: you can only have one ID specified for an element.
|
44
|
-
ID must not conflict with one another.
|
72
|
+
For ID and classes there are special shortcuts:
|
45
73
|
|
46
|
-
* `
|
47
|
-
|
48
|
-
to the same element (just like HTML).
|
74
|
+
* `#myid` is a shortcut for `id=myid`
|
75
|
+
* `.myclass` means "add `myclass` to the current `class` attribute".
|
49
76
|
|
50
|
-
|
77
|
+
So these are equivalent:
|
51
78
|
|
52
79
|
{.class1 .class2}
|
53
80
|
{class="class1 class2"}
|
54
81
|
|
55
82
|
|
56
|
-
|
57
|
-
|
58
|
-
* `#myid` is a shortcut for `id=myid`
|
59
|
-
* `.myclass` is a shortcut for `class=myclass`
|
60
|
-
|
61
|
-
Therefore the following attribute lists are equivalent:
|
83
|
+
The following attribute lists are equivalent:
|
62
84
|
|
63
85
|
{#myid .class1 .class2}
|
64
|
-
{id=myid class=class1
|
86
|
+
{id=myid class=class1 .class2}
|
65
87
|
{id=myid class="class1 class2"}
|
88
|
+
{id=myid class="will be overridden" class=class1 .class2}
|
66
89
|
|
67
|
-
|
68
|
-
|
69
|
-
----------------------------
|
90
|
+
Where to put inline attribute lists
|
91
|
+
----------------------------------
|
70
92
|
|
71
93
|
### For block-level elements ###
|
72
94
|
|
73
|
-
For paragraphs and other block-level elements,
|
95
|
+
For paragraphs and other block-level elements, IAL go
|
74
96
|
**after** the element:
|
75
97
|
|
76
98
|
This is a paragraph.
|
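The attribute-list rules in the hunk above (`#myid` as a shortcut for `id=myid`, `.myclass` appending to the current `class`, later `key=value` pairs overriding earlier ones, and anything else being a reference) can be sketched in Ruby roughly as follows. This is only an illustration of the proposal text, not Maruku's actual parser: the names `TOKEN` and `parse_attribute_list` are hypothetical, and the input is assumed to be the text between the braces.

```ruby
# Hypothetical sketch of the attribute-list rules described in proposal.md.
# One token is: key=value (value optionally quoted), #id, .class, or a bare reference.
TOKEN = /([^\s=]+)=(?:"((?:\\.|[^"\\])*)"|'((?:\\.|[^'\\])*)'|(\S+))|#(\S+)|\.(\S+)|(\S+)/

# Returns [attributes_hash, references_array] for the text inside `{...}`.
def parse_attribute_list(source)
  attrs = {}
  refs = []
  source.scan(TOKEN) do |key, dq, sq, bare, id, klass, ref|
    if key
      raw = dq || sq || bare
      # A later key=value replaces an earlier one; backslash escapes are dropped.
      attrs[key] = raw.gsub(/\\(.)/) { $1 }
    elsif id
      attrs["id"] = id                                          # #myid == id=myid
    elsif klass
      attrs["class"] = [attrs["class"], klass].compact.join(" ") # .c appends to class
    else
      refs << ref                                               # everything else: ALD reference
    end
  end
  [attrs, refs]
end
```

Under these assumptions, the four attribute lists that the proposal calls equivalent all produce the same hash, and `ref1`/`ref2` in the longer example end up in the reference list rather than in the attributes.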
@@ -81,7 +103,7 @@ For paragraphs and other block-level elements, attributes lists go
|
|
81
103
|
> Who said that?
|
82
104
|
{cite=google.com}
|
83
105
|
|
84
|
-
Note: empty lines between the block and the
|
106
|
+
Note: empty lines between the block and the IAL are not tolerated.
|
85
107
|
So this is not legal:
|
86
108
|
|
87
109
|
This is a paragraph.
|
@@ -110,7 +132,7 @@ For headers, you can put attribute lists on the same line:
|
|
110
132
|
Header {#myid .myclass}
|
111
133
|
------
|
112
134
|
|
113
|
-
or, as other block-level elements, on the line
|
135
|
+
or, like other block-level elements, on the line below:
|
114
136
|
|
115
137
|
### Header ###
|
116
138
|
{#myid}
|
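The two header placements shown above (IAL on the same line, or on the line below) could be recognised with something like the following Ruby sketch. The names `HEADER_IAL` and `split_header_ial` are hypothetical, and the pattern is deliberately simplistic: it assumes the attribute list contains no `}` (escaped or inside a quoted value) and that `next_line` is the line following the complete header.

```ruby
# Hypothetical sketch: separate a header's text from its inline attribute list.
HEADER_IAL = /\A(.*?)\s*\{([^}]*)\}\s*\z/

# Returns [header_text, ial_source]; ial_source is nil when no IAL is found.
def split_header_ial(line, next_line = nil)
  if (m = HEADER_IAL.match(line))
    [m[1], m[2]]                       # IAL on the same line as the header
  elsif next_line && (m = /\A\{([^}]*)\}\s*\z/.match(next_line))
    [line.rstrip, m[1]]                # IAL alone on the following line
  else
    [line.rstrip, nil]                 # plain header, no attributes
  end
end
```

The second element of the result is the raw text between the braces, which would then be handed to whatever attribute-list parser is in use.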
@@ -121,17 +143,16 @@ or, as other block-level elements, on the line after:
|
|
121
143
|
|
122
144
|
### For span-level elements ###
|
123
145
|
|
124
|
-
For span-level elements,
|
146
|
+
For span-level elements, meta-data goes immediately **after** in the
|
125
147
|
flow.
|
126
148
|
|
127
|
-
|
128
149
|
For example, in this:
|
129
150
|
|
130
|
-
This is a *chunky paragraph*{#id1}
|
151
|
+
This is a *chunky paragraph*{#id1}
|
131
152
|
{#id2}
|
132
153
|
|
133
154
|
the ID of the `em` element is set to `id1`
|
134
|
-
and the
|
155
|
+
and the ID of the paragraph is set to `id2`.
|
135
156
|
|
136
157
|
This works also for links, like this:
|
137
158
|
|
@@ -145,189 +166,132 @@ is equivalent to:
|
|
145
166
|
|
146
167
|
This is {title="fresh carrots"}
|
147
168
|
|
148
|
-
Using
|
149
|
-
|
169
|
+
Using attributes lists definition {#using_tags}
|
170
|
+
---------------------------------
|
150
171
|
|
151
172
|
In an attribute list, you can have:
|
173
|
+
|
152
174
|
1. `key=value` pairs,
|
153
175
|
2. id attributes (`#myid`)
|
154
176
|
3. class attributes (`.myclass`)
|
155
177
|
|
156
|
-
Everything else is interpreted as a
|
157
|
-
|
158
|
-
the attributes later:
|
178
|
+
Everything else is interpreted as a reference to
|
179
|
+
an ALD.
|
159
180
|
|
160
|
-
# Header # {
|
181
|
+
# Header # {ref}
|
161
182
|
|
162
183
|
Blah blah blah.
|
163
184
|
|
164
|
-
{
|
185
|
+
{ref}: #myhead .myclass lang=fr
|
165
186
|
|
166
|
-
|
167
|
-
be assigned the same tag.
|
187
|
+
Of course, more than one IAL can reference the same ALD:
|
168
188
|
|
169
|
-
# Header 1 # {
|
189
|
+
# Header 1 # {1}
|
170
190
|
...
|
171
|
-
# Header 2 # {
|
191
|
+
# Header 2 # {1}
|
172
192
|
|
173
|
-
{
|
193
|
+
{1}: .myclass lang=fr
|
174
194
|
|
175
|
-
In this case, however, you should not assign
|
176
|
-
the `id` attribute. So this is **not** valid:
|
177
195
|
|
178
|
-
|
179
|
-
|
180
|
-
# Header 2 # {tag}
|
181
|
-
|
182
|
-
{tag}: #myid .myclass lang=fr
|
183
|
-
|
184
|
-
|
185
|
-
[^tag]: a better name for this?
|
186
|
-
|
187
|
-
Of course, tags are valid for both block-level and span-level elements:
|
188
|
-
|
189
|
-
### My header ### {1}
|
190
|
-
This is a paragraph with an *emphasis*{2}
|
191
|
-
a and the paragraph goes on.
|
192
|
-
{3}
|
193
|
-
|
194
|
-
{1}: #header_id
|
195
|
-
{2}: #emph_id
|
196
|
-
{3}: #par_id
|
196
|
+
The rules {#grammar}
|
197
|
+
---------
|
197
198
|
|
199
|
+
### The issue of escaping ###
|
198
200
|
|
199
|
-
|
200
|
-
------------------------------------
|
201
|
+
1. No escaping in code blocks.
|
201
202
|
|
202
|
-
|
203
|
+
* ``` `\` ``` represents the one-character string `\`.
|
203
204
|
|
204
|
-
|
205
|
-
by more than 3 spaces:
|
205
|
+
2. Everywhere else, **all** characters **can** be escaped:
|
206
206
|
|
207
|
-
|
208
|
-
|
209
|
-
|
210
|
-
{#blockid}
|
207
|
+
* `\|` is the literal `|`, `\n` is the literal `n`.
|
208
|
+
* `\ ` represents a non-breaking space.
|
209
|
+
* `\` followed by a newline represents a linebreak.
|
211
210
|
|
211
|
+
3. Quotes **must** be escaped inside quoted values:
|
212
|
+
|
213
|
+
* Inside `"quoted values"`, you **must** escape `"`.
|
214
|
+
* Inside `'quoted values'`, you **must** escape `'`.
|
212
215
|
|
213
|
-
|
214
|
-
--------------
|
216
|
+
* Other examples:
|
215
217
|
|
216
|
-
|
218
|
+
`"bah 'bah' bah"` = `"bah \'bah\' bah"` = `'bah \'bah\' bah'`
|
219
|
+
|
220
|
+
`'bah "bah" bah'` = `'bah \"bah\" bah'` = `"bah \"bah\" bah"`
|
217
221
|
|
218
|
-
In the spirit of HTML:
|
219
222
|
|
220
|
-
|
221
|
-
> may be followed by any number of letters, digits (`[0-9]`),
|
222
|
-
> hyphens (`-`), underscores (`_`), colons (`:`), and periods (`.`).
|
223
|
+
4. There is an exception for backward compatibility:
|
223
224
|
|
224
|
-
|
225
|
-
Moreover, they are case-sensitive.
|
225
|
+
[text](url "title"with"quotes")
|
226
226
|
|
227
|
-
So this is a valid attribute list:
|
228
227
|
|
229
|
-
|
228
|
+
### Syntax for attribute lists ####
|
230
229
|
|
231
|
-
|
230
|
+
Consider the following attribute list:
|
232
231
|
|
233
|
-
|
232
|
+
{key=value ref key2="quoted value" }
|
233
|
+
|
234
|
+
In this string, `key`, `value`, and `ref` can be substituted by any
|
235
|
+
string that does not contain whitespace, or the unescaped characters `}`,`=`,`'`,`"`.
|
234
236
|
|
235
|
-
|
236
|
-
|
237
|
-
in your language.)
|
237
|
+
Inside a quoted value, you **may** use `}`,`=` unescaped but you **must**
|
238
|
+
escape the other kind of quote.
|
238
239
|
|
239
|
-
Now:
|
240
|
-
* an id attribute is an `Identifier` preceded by `#`
|
241
|
-
* a class attribute is an `Identifier` preceded by `.`
|
242
|
-
* a `Tag` is an `Identifier`
|
243
240
|
|
244
|
-
|
245
|
-
|
241
|
+
Things to discuss
|
242
|
+
-----------------
|
246
243
|
|
247
|
-
|
244
|
+
* A syntax for creating `SPAN` elements in the paragraphs and setting their attributes.
|
248
245
|
|
249
|
-
|
250
|
-
except whitespace:
|
246
|
+
This is my proposal:
|
251
247
|
|
252
|
-
|
248
|
+
a long paragraph with [special words]{#myspan} that I want to
|
249
|
+
highlight
|
253
250
|
|
254
|
-
|
251
|
+
should originate the following HTML:
|
255
252
|
|
256
|
-
|
253
|
+
<p>a long paragraph with <span id="myspan">special words</span>
|
254
|
+
that I want to highlight</p>
|
257
255
|
|
258
|
-
|
259
|
-
In a quoted value there are two escaping rules:
|
256
|
+
***Note: I replaced the old `{special words}{#myspan}` with `[special words]{#myspan}`, which is less ambiguous.***
|
260
257
|
|
261
|
-
|
262
|
-
|
258
|
+
* > Another question: does it makes sense to define `<span>` within
|
259
|
+
> Markdown when you can't have `<b>` and `<i>`, or the more meaningful
|
260
|
+
> `<cite>`, `<q>`, `<dfn>`, and `<var>`? We have to draw the line somewhere,
|
261
|
+
> where should it be? Another good question for the list.
|
263
262
|
|
264
|
-
|
263
|
+
Any opinion?
|
265
264
|
|
266
|
-
|
265
|
+
* **Default ALD for classes of elements.** For example, a level-2 header automatically inherits the attributes of `{header2}`, if it is defined.
|
267
266
|
|
268
|
-
|
267
|
+
## Header ##
|
268
|
+
|
269
|
+
Paragraph..
|
270
|
+
|
271
|
+
## Second Header ## {.mah}
|
272
|
+
|
273
|
+
Paragraph..
|
274
|
+
|
275
|
+
{header2}: .myclass
|
276
|
+
{paragraph}: .withmargins
|
269
277
|
|
270
|
-
|
278
|
+
In this example:
|
271
279
|
|
272
|
-
|
273
|
-
|
274
|
-
|
275
|
-
IdSpec = #Identifier
|
276
|
-
ClassSpec = .Identifier
|
277
|
-
KeyValue = Key=[QuotedValue|UnquotedValue]
|
278
|
-
Key = Identifier
|
279
|
-
UnquotedValue = [^\s\"][^\s]*
|
280
|
-
QuotedValue = \"[^\"]*\" <---------- note: simplistic
|
280
|
+
* the first header has attributes `class=myclass`
|
281
|
+
* the second header has attributes `class="myclass mah"`
|
282
|
+
* the two paragraphs have attributes `class=withmargins`
|
281
283
|
|
282
|
-
**Note**: I am not able to write the regexp for `QuotedValue` that takes into
|
283
|
-
account also the escaping of the characters. Any regexp wizard out there?
|
284
284
|
|
285
|
-
|
286
|
-
|
285
|
+
Design rationale
|
286
|
+
----------------
|
287
287
|
|
288
288
|
* Question: should we allow whitespace at the sides of `=` in key/value pairs?
|
289
289
|
|
290
|
-
|
290
|
+
> No, because it is difficult to parse.
|
291
291
|
|
292
|
-
|
293
|
-
|
294
|
-
{key1: value key2: "value2 with spaces" }
|
292
|
+
* Question: should `:` be a synonym for `=` in attributes list?
|
295
293
|
|
296
|
-
|
297
|
-
|
298
|
-
{key1=value key2="value2 with spaces " }
|
299
|
-
|
300
|
-
|
301
|
-
* A syntax for creating `SPAN` elements in the paragraphs and setting their
|
302
|
-
attributes.
|
303
|
-
|
304
|
-
This is my proposal:
|
294
|
+
> No, because ':' is used for XML namespaces (`xml:lang=en`)
|
305
295
|
|
306
|
-
a long paragraph with {special words}{#myspan} that I want to
|
307
|
-
highlight
|
308
|
-
|
309
|
-
should originate the following HTML:
|
310
|
-
|
311
|
-
<p>a long paragraph with <span id="myspan">special words</span>
|
312
|
-
that I want to highlight</p>
|
313
|
-
|
314
|
-
This is Michel's comment on this syntax:
|
315
|
-
|
316
|
-
> It looks quite good. One question is can it be amgibuous with braces
|
317
|
-
> used for the attributes themselves? I don't have an answer to that
|
318
|
-
> question; better ask this on the list.
|
319
|
-
|
320
|
-
I don't think it is ambiguous, because it's the only case in which you have
|
321
|
-
the sequence `}{`:
|
322
|
-
|
323
|
-
{.*}{Attributes}
|
324
|
-
|
325
|
-
> Another question: does it makes sense to define `<span>` within
|
326
|
-
> Markdown when you can't have `<b>` and `<i>`, or the more meaningful
|
327
|
-
> `<cite>`, `<q>`, `<dfn>`, and `<var>`? We have to draw the line somewhere,
|
328
|
-
> where should it be? Another good question for the list.
|
329
|
-
|
330
|
-
|
331
296
|
|
332
|
-
* anything else?
|
333
297
|
|
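The ALD rules in the proposal.md hunks above (an IAL may reference an ALD, several IALs may share one ALD, and, under "Things to discuss", an element may inherit a default ALD such as `{header2}`) can be illustrated with a short Ruby sketch. Everything here is an assumption made for illustration: the item representation (`[:ref, name]`, `[:id, v]`, `[:class, v]`, `[:kv, key, v]`), the function names, the choice to apply defaults before the element's own IAL, and the decision to ignore unknown or circular references.

```ruby
# Hypothetical sketch of ALD resolution; not Maruku's implementation.
# `alds` maps an ALD name to its list of items.

# Replace every [:ref, name] item by the items of the ALD it names.
def expand(items, alds, seen = [])
  items.flat_map do |item|
    kind, name = item
    if kind != :ref
      [item]
    elsif seen.include?(name) || !alds.key?(name)
      []                                   # unknown or circular reference: ignored here
    else
      expand(alds[name], alds, seen + [name])
    end
  end
end

# Fold the expanded items, in order, into a hash of attributes.
def apply(items)
  items.each_with_object({}) do |(kind, a, b), attrs|
    case kind
    when :id    then attrs["id"] = a
    when :class then attrs["class"] = [attrs["class"], a].compact.join(" ")
    when :kv    then attrs[a] = b
    end
  end
end

# Default ALD (named after the element type) first, then the element's own IAL.
def attributes_for(element_type, ial_items, alds)
  defaults = alds.key?(element_type) ? [[:ref, element_type]] : []
  apply(expand(defaults + ial_items, alds))
end
```

With `{header2}: .myclass` and `{paragraph}: .withmargins` defined, this ordering reproduces the three results listed in the "Default ALD" example: `class=myclass` for the first header, `class="myclass mah"` for the second, and `class=withmargins` for the paragraphs.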
data/docs/todo.html
ADDED
@@ -0,0 +1,30 @@
|
|
1
|
+
<?xml version='1.0' ?>
|
2
|
+
<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Strict//EN'
|
3
|
+
'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'>
|
4
|
+
<html lang='en' xml:lang='en' xmlns='http://www.w3.org/1999/xhtml'
|
5
|
+
><head
|
6
|
+
><title></title
|
7
|
+
></head
|
8
|
+
><head
|
9
|
+
><title></title
|
10
|
+
></head
|
11
|
+
><body
|
12
|
+
><ul
|
13
|
+
><li
|
14
|
+
><p>Export to HTML * Include RubyPants</p
|
15
|
+
></li
|
16
|
+
><li
|
17
|
+
><p>Export to PDF * support for images</p
|
18
|
+
></li
|
19
|
+
><li
|
20
|
+
><p>Export to Markdown (pretty-printing)</p
|
21
|
+
></li
|
22
|
+
></ul
|
23
|
+
><div class='maruku_signature'
|
24
|
+
><hr
|
25
|
+
/><span style='font-size: small; font-style: italic'>Created by <a href='http://maruku.rubyforge.org' title='Maruku: a Markdown interpreter'>Maruku</a
|
26
|
+
> at 17:47 on Sunday, December 31st, 2006.</span
|
27
|
+
></div
|
28
|
+
></body
|
29
|
+
></html
|
30
|
+
>
|