commonmarker 0.14.3 → 0.14.4


Files changed (31)
  1. checksums.yaml +4 -4
  2. data/commonmarker.gemspec +1 -1
  3. data/ext/commonmarker/cmark/Makefile +2 -2
  4. data/ext/commonmarker/cmark/extensions/CMakeLists.txt +38 -12
  5. data/ext/commonmarker/cmark/extensions/autolink.c +151 -102
  6. data/ext/commonmarker/cmark/extensions/core-extensions.h +2 -0
  7. data/ext/commonmarker/cmark/extensions/strikethrough.c +15 -10
  8. data/ext/commonmarker/cmark/extensions/table.c +31 -26
  9. data/ext/commonmarker/cmark/src/CMakeLists.txt +4 -1
  10. data/ext/commonmarker/cmark/src/blocks.c +11 -7
  11. data/ext/commonmarker/cmark/src/cmark_extension_api.h +14 -0
  12. data/ext/commonmarker/cmark/src/commonmark.c +7 -5
  13. data/ext/commonmarker/cmark/src/inlines.c +1 -6
  14. data/ext/commonmarker/cmark/src/latex.c +4 -3
  15. data/ext/commonmarker/cmark/src/man.c +4 -3
  16. data/ext/commonmarker/cmark/src/render.c +5 -3
  17. data/ext/commonmarker/cmark/src/render.h +4 -3
  18. data/ext/commonmarker/cmark/src/syntax_extension.c +5 -0
  19. data/ext/commonmarker/cmark/src/syntax_extension.h +1 -0
  20. data/ext/commonmarker/cmark/test/CMakeLists.txt +3 -2
  21. data/ext/commonmarker/cmark/test/cmark.py +69 -23
  22. data/ext/commonmarker/cmark/test/extensions.txt +6 -6
  23. data/ext/commonmarker/cmark/test/roundtrip_tests.py +6 -4
  24. data/ext/commonmarker/cmark/test/spec.txt +420 -4
  25. data/ext/commonmarker/cmark/test/spec_tests.py +12 -6
  26. data/ext/commonmarker/extconf.rb +1 -1
  27. data/lib/commonmarker/renderer/html_renderer.rb +4 -0
  28. data/lib/commonmarker/version.rb +1 -1
  29. data/test/test_helper.rb +5 -2
  30. data/test/test_spec.rb +9 -7
  31. metadata +6 -6
@@ -44,7 +44,7 @@ Hello!
44
44
  | _abc_ | セン |
45
45
  | ----- | ---- |
46
46
  | 1. Block elements inside cells don't work. | |
47
- | But **_inline elements do_**. | x |
47
+ | But _**inline elements do**_. | x |
48
48
 
49
49
  Hi!
50
50
  .
@@ -62,7 +62,7 @@ Hi!
62
62
  <td></td>
63
63
  </tr>
64
64
  <tr>
65
- <td>But <strong><em>inline elements do</em></strong>.</td>
65
+ <td>But <em><strong>inline elements do</strong></em>.</td>
66
66
  <td>x</td>
67
67
  </tr></tbody></table>
68
68
  <p>Hi!</p>
@@ -145,7 +145,7 @@ Hello!
145
145
  | _abc_ | セン |
146
146
  | ----- | ---- |
147
147
  | this row has a space at the end | |
148
- | But **_inline elements do_**. | x |
148
+ | But _**inline elements do**_. | x |
149
149
 
150
150
  Hi!
151
151
  .
@@ -163,7 +163,7 @@ Hi!
163
163
  <td></td>
164
164
  </tr>
165
165
  <tr>
166
- <td>But <strong><em>inline elements do</em></strong>.</td>
166
+ <td>But <em><strong>inline elements do</strong></em>.</td>
167
167
  <td>x</td>
168
168
  </tr></tbody></table>
169
169
  <p>Hi!</p>
@@ -255,7 +255,7 @@ Tables with embedded pipes could be tricky.
255
255
  | --- | --- |
256
256
  | Escaped pipes are \|okay\|. | Like \| this. |
257
257
  | Within `|code| is okay` too. |
258
- | **_`c|`_** \| complex
258
+ | _**`c|`**_ \| complex
259
259
  | don't **\_reparse\_**
260
260
  .
261
261
  <table>
@@ -275,7 +275,7 @@ Tables with embedded pipes could be tricky.
275
275
  <td></td>
276
276
  </tr>
277
277
  <tr>
278
- <td><strong><em><code>c|</code></em></strong> | complex</td>
278
+ <td><em><strong><code>c|</code></strong></em> | complex</td>
279
279
  <td></td>
280
280
  </tr>
281
281
  <tr>
@@ -14,6 +14,8 @@ if __name__ == "__main__":
14
14
  default=None, help='limit to sections matching regex pattern')
15
15
  parser.add_argument('--library-dir', dest='library_dir', nargs='?',
16
16
  default=None, help='directory containing dynamic library')
17
+ parser.add_argument('--extensions', dest='extensions', nargs='?',
18
+ default=None, help='space separated list of extensions to enable')
17
19
  parser.add_argument('--no-normalize', dest='normalize',
18
20
  action='store_const', const=False, default=True,
19
21
  help='do not normalize HTML')
@@ -23,11 +25,11 @@ if __name__ == "__main__":
23
25
 
24
26
  spec = sys.argv[1]
25
27
 
26
- def converter(md):
27
- cmark = CMark(prog=args.program, library_dir=args.library_dir)
28
- [ec, result, err] = cmark.to_commonmark(md)
28
+ def converter(md, exts):
29
+ cmark = CMark(prog=args.program, library_dir=args.library_dir, extensions=args.extensions)
30
+ [ec, result, err] = cmark.to_commonmark(md, exts)
29
31
  if ec == 0:
30
- [ec, html, err] = cmark.to_html(result)
32
+ [ec, html, err] = cmark.to_html(result, exts)
31
33
  if ec == 0:
32
34
  # In the commonmark writer we insert dummy HTML
33
35
  # comments between lists, and between lists and code
@@ -3161,7 +3161,187 @@ aaa
3161
3161
  <h1>aaa</h1>
3162
3162
  ````````````````````````````````
3163
3163
 
3164
+ <div class="extension">
3164
3165
 
3166
+ ## Tables (extension)
3167
+
3168
+ If the `table` extension is enabled, an additional leaf block type is
3169
+ available
3170
+
3171
+ A [table](@) is an arrangement of data with rows and columns, consisting of a
3172
+ single header row, a [delimiter row] separating the header from the data, and
3173
+ zero or more data rows.
3174
+
3175
+ Each row consists of cells containing arbitrary text, in which [inlines] are
3176
+ parsed, separated by pipes (`|`). A leading and trailing pipe is also
3177
+ recommended for clarity of reading, and if there's otherwise parsing ambiguity.
3178
+ Spaces between pipes and cell content are trimmed. Block-level elements cannot
3179
+ be inserted in a table.
3180
+
3181
+ The [delimiter row](@) consists of cells whose only content are hyphens (`-`),
3182
+ and optionally, a leading or trailing colon (`:`), or both, to indicate left,
3183
+ right, or center alignment respectively.
3184
+
3185
+ ```````````````````````````````` example table
3186
+ | foo | bar |
3187
+ | --- | --- |
3188
+ | baz | bim |
3189
+ .
3190
+ <table>
3191
+ <thead>
3192
+ <tr>
3193
+ <th>foo</th>
3194
+ <th>bar</th>
3195
+ </tr>
3196
+ </thead>
3197
+ <tbody>
3198
+ <tr>
3199
+ <td>baz</td>
3200
+ <td>bim</td>
3201
+ </tr></tbody></table>
3202
+ ````````````````````````````````
3203
+
3204
+ Cells in one column don't need to match length, though it's easier to read if
3205
+ they are. Likewise, use of leading and trailing pipes may be inconsistent:
3206
+
3207
+ ```````````````````````````````` example table
3208
+ | abc | defghi |
3209
+ :-: | -----------:
3210
+ bar | baz
3211
+ .
3212
+ <table>
3213
+ <thead>
3214
+ <tr>
3215
+ <th align="center">abc</th>
3216
+ <th align="right">defghi</th>
3217
+ </tr>
3218
+ </thead>
3219
+ <tbody>
3220
+ <tr>
3221
+ <td align="center">bar</td>
3222
+ <td align="right">baz</td>
3223
+ </tr></tbody></table>
3224
+ ````````````````````````````````
3225
+
3226
+ Include a pipe in a cell's content by escaping it. Pipes inside other inline
3227
+ spans (such as emphasis, code, etc.) will not break a cell:
3228
+
3229
+ ```````````````````````````````` example table
3230
+ | f\|oo |
3231
+ | ------ |
3232
+ | b `|` az |
3233
+ | b **|** im |
3234
+ .
3235
+ <table>
3236
+ <thead>
3237
+ <tr>
3238
+ <th>f|oo</th>
3239
+ </tr>
3240
+ </thead>
3241
+ <tbody>
3242
+ <tr>
3243
+ <td>b <code>|</code> az</td>
3244
+ </tr>
3245
+ <tr>
3246
+ <td>b <strong>|</strong> im</td>
3247
+ </tr></tbody></table>
3248
+ ````````````````````````````````
3249
+
3250
+ The table is broken at the first empty line, or beginning of another
3251
+ block-level structure:
3252
+
3253
+ ```````````````````````````````` example table
3254
+ | abc | def |
3255
+ | --- | --- |
3256
+ | bar | baz |
3257
+ > bar
3258
+ .
3259
+ <table>
3260
+ <thead>
3261
+ <tr>
3262
+ <th>abc</th>
3263
+ <th>def</th>
3264
+ </tr>
3265
+ </thead>
3266
+ <tbody>
3267
+ <tr>
3268
+ <td>bar</td>
3269
+ <td>baz</td>
3270
+ </tr></tbody></table>
3271
+ <blockquote>
3272
+ <p>bar</p>
3273
+ </blockquote>
3274
+ ````````````````````````````````
3275
+
3276
+ ```````````````````````````````` example table
3277
+ | abc | def |
3278
+ | --- | --- |
3279
+ | bar | baz |
3280
+ bar
3281
+
3282
+ bar
3283
+ .
3284
+ <table>
3285
+ <thead>
3286
+ <tr>
3287
+ <th>abc</th>
3288
+ <th>def</th>
3289
+ </tr>
3290
+ </thead>
3291
+ <tbody>
3292
+ <tr>
3293
+ <td>bar</td>
3294
+ <td>baz</td>
3295
+ </tr>
3296
+ <tr>
3297
+ <td>bar</td>
3298
+ <td></td>
3299
+ </tr></tbody></table>
3300
+ <p>bar</p>
3301
+ ````````````````````````````````
3302
+
3303
+ The header row must match the [delimiter row] in the number of cells. If not,
3304
+ a table will not be recognized:
3305
+
3306
+ ```````````````````````````````` example table
3307
+ | abc | def |
3308
+ | --- |
3309
+ | bar |
3310
+ .
3311
+ <p>| abc | def |
3312
+ | --- |
3313
+ | bar |</p>
3314
+ ````````````````````````````````
3315
+
3316
+ The remainder of the table's rows may vary in the number of cells. If there
3317
+ are a number of cells than the header, empty cells are inserted. If there are
3318
+ greater, the excess is ignored:
3319
+
3320
+ ```````````````````````````````` example table
3321
+ | abc | def |
3322
+ | --- | --- |
3323
+ | bar |
3324
+ | bar | baz | boo |
3325
+ .
3326
+ <table>
3327
+ <thead>
3328
+ <tr>
3329
+ <th>abc</th>
3330
+ <th>def</th>
3331
+ </tr>
3332
+ </thead>
3333
+ <tbody>
3334
+ <tr>
3335
+ <td>bar</td>
3336
+ <td></td>
3337
+ </tr>
3338
+ <tr>
3339
+ <td>bar</td>
3340
+ <td>baz</td>
3341
+ </tr></tbody></table>
3342
+ ````````````````````````````````
3343
+
3344
+ </div>
3165
3345
 
3166
3346
  # Container blocks
3167
3347
 
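Reviewer note on the table rules added in the hunk above: the following is a minimal Python sketch, not the cmark implementation, of two of the rules described in the new spec text. It covers how a delimiter-row cell maps to an alignment and how a data row is padded or truncated to the header width. The function names `delimiter_alignment` and `normalize_row` are hypothetical and exist only for illustration.

```python
import re

# A delimiter cell is one or more hyphens with an optional colon on either side.
_DELIMITER_CELL = re.compile(r":?-+:?")


def delimiter_alignment(cell):
    """Return 'left', 'right', 'center', or None (no alignment) for a delimiter cell.

    Raises ValueError if the cell is not a valid delimiter cell.
    """
    cell = cell.strip()
    if not _DELIMITER_CELL.fullmatch(cell):
        raise ValueError(f"not a delimiter cell: {cell!r}")
    leading = cell.startswith(":")
    trailing = cell.endswith(":")
    if leading and trailing:
        return "center"
    if leading:
        return "left"
    if trailing:
        return "right"
    return None


def normalize_row(cells, width):
    """Pad a data row with empty cells, or drop the excess, so it has `width` cells."""
    return (list(cells) + [""] * width)[:width]
```

Applied to the last table example in the hunk, `normalize_row(["bar"], 2)` gives `["bar", ""]` and `normalize_row(["bar", "baz", "boo"], 2)` gives `["bar", "baz"]`, matching the expected HTML rows.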
@@ -5976,8 +6156,8 @@ the following principles resolve ambiguity:
5976
6156
  an interpretation `<strong>...</strong>` is always preferred to
5977
6157
  `<em><em>...</em></em>`.
5978
6158
 
5979
- 14. An interpretation `<strong><em>...</em></strong>` is always
5980
- preferred to `<em><strong>..</strong></em>`.
6159
+ 14. An interpretation `<em><strong>...</strong></em>` is always
6160
+ preferred to `<strong><em>..</em></strong>`.
5981
6161
 
5982
6162
  15. When two potential emphasis or strong emphasis spans overlap,
5983
6163
  so that the second begins before the first ends and ends after
@@ -7000,14 +7180,14 @@ Rule 14:
7000
7180
  ```````````````````````````````` example
7001
7181
  ***foo***
7002
7182
  .
7003
- <p><strong><em>foo</em></strong></p>
7183
+ <p><em><strong>foo</strong></em></p>
7004
7184
  ````````````````````````````````
7005
7185
 
7006
7186
 
7007
7187
  ```````````````````````````````` example
7008
7188
  _____foo_____
7009
7189
  .
7010
- <p><strong><strong><em>foo</em></strong></strong></p>
7190
+ <p><em><strong><strong>foo</strong></strong></em></p>
7011
7191
  ````````````````````````````````
7012
7192
 
7013
7193
 
@@ -7108,6 +7288,43 @@ __a<http://foo.bar/?q=__>
7108
7288
  ````````````````````````````````
7109
7289
 
7110
7290
 
7291
+ <div class="extension">
7292
+
7293
+ ## Strikethrough (extension)
7294
+
7295
+ If the `strikethrough` extension is enabled, an additional emphasis type is
7296
+ available.
7297
+
7298
+ Strikethrough text is any text wrapped in tildes (`~`).
7299
+
7300
+ ```````````````````````````````` example strikethrough
7301
+ ~Hi~ Hello, world!
7302
+ .
7303
+ <p><del>Hi</del> Hello, world!</p>
7304
+ ````````````````````````````````
7305
+
7306
+ Any number of tildes may be used on either side of the text; they do not need
7307
+ to match, and they cannot be nested.
7308
+
7309
+ ```````````````````````````````` example strikethrough
7310
+ This ~text~~~~ is ~~~~curious~.
7311
+ .
7312
+ <p>This <del>text</del> is <del>curious</del>.</p>
7313
+ ````````````````````````````````
7314
+
7315
+ As with regular emphasis delimiters, a new paragraph will cause the cessation
7316
+ of parsing a strikethrough:
7317
+
7318
+ ```````````````````````````````` example strikethrough
7319
+ This ~~has a
7320
+
7321
+ new paragraph~~.
7322
+ .
7323
+ <p>This ~~has a</p>
7324
+ <p>new paragraph~~.</p>
7325
+ ````````````````````````````````
7326
+
7327
+ </div>
7111
7328
 
7112
7329
  ## Links
7113
7330
 
@@ -8529,6 +8746,166 @@ foo@bar.example.com
8529
8746
  <p>foo@bar.example.com</p>
8530
8747
  ````````````````````````````````
8531
8748
 
8749
+ <div class="extension">
8750
+
8751
+ ## Autolinks (extension)
8752
+
8753
+ If the `autolink` extension is enabled, autolinks will be recognised in a
8754
+ greater number of conditions.
8755
+
8756
+ [Autolink]s can also be constructed without requiring the use of `<` and to `>`
8757
+ to delimit them, although they will be recognized under a smaller set of
8758
+ circumstances. All such recognized autolinks can only come after whitespace,
8759
+ or any of the delimiting characters `*`, `_`, `~`, `(`, and `[`.
8760
+
8761
+ An [extended www autolink](@) will be recognized when a [valid domain] is
8762
+ found. A [valid domain](@) consists of the text `www.`, followed by
8763
+ alphanumeric characters, underscores (`_`), hyphens (`-`) and periods (`.`).
8764
+ There must be at least one period, and no underscores may be present in the
8765
+ last two segments of the domain.
8766
+
8767
+ The scheme `http` will be inserted automatically:
8768
+
8769
+ ```````````````````````````````` example autolink
8770
+ www.commonmark.org
8771
+ .
8772
+ <p><a href="http://www.commonmark.org">www.commonmark.org</a></p>
8773
+ ````````````````````````````````
8774
+
8775
+ After a [valid domain], zero or more non-space non-`<` characters may follow:
8776
+
8777
+ ```````````````````````````````` example autolink
8778
+ Visit www.commonmark.org/help for more information.
8779
+ .
8780
+ <p>Visit <a href="http://www.commonmark.org/help">www.commonmark.org/help</a> for more information.</p>
8781
+ ````````````````````````````````
8782
+
8783
+ We then apply [extended autolink path validation](@) as follows:
8784
+
8785
+ Trailing punctuation (specifically, `?`, `!`, `.`, `,`, `:`, `*`, `_`, and `~`)
8786
+ will not be considered part of the autolink, though they may be included in the
8787
+ interior of the link:
8788
+
8789
+ ```````````````````````````````` example autolink
8790
+ Visit www.commonmark.org.
8791
+
8792
+ Visit www.commonmark.org/a.b.
8793
+ .
8794
+ <p>Visit <a href="http://www.commonmark.org">www.commonmark.org</a>.</p>
8795
+ <p>Visit <a href="http://www.commonmark.org/a.b">www.commonmark.org/a.b</a>.</p>
8796
+ ````````````````````````````````
8797
+
8798
+ When an autolink ends in `)`, we scan the entire autolink for the total number
8799
+ of parentheses. If there is a greater number of closing parentheses than
8800
+ opening ones, we don't consider the last character part of the autolink, in
8801
+ order to facilitate including an autolink inside a parenthesis:
8802
+
8803
+ ```````````````````````````````` example autolink
8804
+ www.google.com/search?q=Markup+(business)
8805
+
8806
+ (www.google.com/search?q=Markup+(business))
8807
+ .
8808
+ <p><a href="http://www.google.com/search?q=Markup+(business)">www.google.com/search?q=Markup+(business)</a></p>
8809
+ <p>(<a href="http://www.google.com/search?q=Markup+(business)">www.google.com/search?q=Markup+(business)</a>)</p>
8810
+ ````````````````````````````````
8811
+
8812
+ This check is only done when the link ends in a closing parentheses `)`, so if
8813
+ the only parentheses are in the interior of the autolink, no special rules are
8814
+ applied:
8815
+
8816
+ ```````````````````````````````` example autolink
8817
+ www.google.com/search?q=(business))+ok
8818
+ .
8819
+ <p><a href="http://www.google.com/search?q=(business))+ok">www.google.com/search?q=(business))+ok</a></p>
8820
+ ````````````````````````````````
8821
+
8822
+ If an autolink ends in a semicolon (`;`), we check to see if it appears to
8823
+ resemble an [entity reference][entity references]; if the preceding text is `&`
8824
+ followed by one or more alphanumeric characters. If so, it is excluded from
8825
+ the autolink:
8826
+
8827
+ ```````````````````````````````` example autolink
8828
+ www.google.com/search?q=commonmark&hl=en
8829
+
8830
+ www.google.com/search?q=commonmark&hl;
8831
+ .
8832
+ <p><a href="http://www.google.com/search?q=commonmark&amp;hl=en">www.google.com/search?q=commonmark&amp;hl=en</a></p>
8833
+ <p><a href="http://www.google.com/search?q=commonmark">www.google.com/search?q=commonmark</a>&amp;hl;</p>
8834
+ ````````````````````````````````
8835
+
8836
+ `<` immediately ends an autolink.
8837
+
8838
+ ```````````````````````````````` example autolink
8839
+ www.commonmark.org/he<lp
8840
+ .
8841
+ <p><a href="http://www.commonmark.org/he">www.commonmark.org/he</a>&lt;lp</p>
8842
+ ````````````````````````````````
8843
+
8844
+ An [extended url autolink](@) will be recognised when one of the schemes
8845
+ `http://`, `https://`, or `ftp://`, followed by a [valid domain], then zero or
8846
+ more non-space non-`<` characters according to
8847
+ [extended autolink path validation]:
8848
+
8849
+ ```````````````````````````````` example autolink
8850
+ http://commonmark.org
8851
+
8852
+ (Visit https://encrypted.google.com/search?q=Markup+(business))
8853
+
8854
+ Anonymous FTP is available at ftp://foo.bar.baz.
8855
+ .
8856
+ <p><a href="http://commonmark.org">http://commonmark.org</a></p>
8857
+ <p>(Visit <a href="https://encrypted.google.com/search?q=Markup+(business)">https://encrypted.google.com/search?q=Markup+(business)</a>)</p>
8858
+ <p>Anonymous FTP is available at <a href="ftp://foo.bar.baz">ftp://foo.bar.baz</a>.</p>
8859
+ ````````````````````````````````
8860
+
8861
+
8862
+ An [extended email autolink](@) will be recognised when an email address is
8863
+ recognised within any text node. Email addresses are recognised according to
8864
+ the following rules:
8865
+
8866
+ * One ore more characters which are alphanumeric, or `.`, `-`, `_`, or `+`.
8867
+ * An `@` symbol.
8868
+ * One or more characters which are alphanumeric, or `.`, `-`, or `_`. At least
8869
+ one of the characters here must be a period (`.`). The last character must
8870
+ not be one of `-` or `_`. If the last character is a period (`.`), it will
8871
+ be excluded from the autolink.
8872
+
8873
+ The scheme `mailto:` will automatically be added to the generated link:
8874
+
8875
+ ```````````````````````````````` example autolink
8876
+ foo@bar.baz
8877
+ .
8878
+ <p><a href="mailto:foo@bar.baz">foo@bar.baz</a></p>
8879
+ ````````````````````````````````
8880
+
8881
+ `+` can occur before the `@`, but not after.
8882
+
8883
+ ```````````````````````````````` example autolink
8884
+ hello@mail+xyz.example isn't valid, but hello+xyz@mail.example is.
8885
+ .
8886
+ <p>hello@mail+xyz.example isn't valid, but <a href="mailto:hello+xyz@mail.example">hello+xyz@mail.example</a> is.</p>
8887
+ ````````````````````````````````
8888
+
8889
+ `.`, `-`, and `_` can occur on both sides of the `@`, but only `.` may occur at
8890
+ the end of the email address, in which case it will not be considered part of
8891
+ the address:
8892
+
8893
+ ```````````````````````````````` example autolink
8894
+ a.b-c_d@a.b
8895
+
8896
+ a.b-c_d@a.b.
8897
+
8898
+ a.b-c_d@a.b-
8899
+
8900
+ a.b-c_d@a.b_
8901
+ .
8902
+ <p><a href="mailto:a.b-c_d@a.b">a.b-c_d@a.b</a></p>
8903
+ <p><a href="mailto:a.b-c_d@a.b">a.b-c_d@a.b</a>.</p>
8904
+ <p>a.b-c_d@a.b-</p>
8905
+ <p>a.b-c_d@a.b_</p>
8906
+ ````````````````````````````````
8907
+
8908
+ </div>
8532
8909
 
8533
8910
  ## Raw HTML
8534
8911
 
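Reviewer note on the extended autolink path validation added in the hunk above: the sketch below restates the prose rules in Python so they can be checked against the spec examples. It is an illustration under stated assumptions, not the code in `autolink.c`. The name `trim_autolink` is hypothetical, "alphanumeric" is taken to mean ASCII letters and digits, and applying the trailing rules repeatedly until none matches is an assumption of this sketch.

```python
import re

# Trailing punctuation that is never part of the autolink.
TRAILING_PUNCT = "?!.,:*_~"

# Something that looks like an entity reference at the very end of the link.
ENTITY_TAIL = re.compile(r"&[A-Za-z0-9]+;$")


def trim_autolink(candidate):
    """Trim a run of non-space characters down to the part treated as the link."""
    # '<' immediately ends an autolink.
    lt = candidate.find("<")
    if lt != -1:
        candidate = candidate[:lt]

    while candidate:
        last = candidate[-1]
        if last in TRAILING_PUNCT:
            # Trailing punctuation is excluded (interior punctuation is kept).
            candidate = candidate[:-1]
        elif last == ")" and candidate.count(")") > candidate.count("("):
            # More closing than opening parentheses: drop the final ')'.
            candidate = candidate[:-1]
        elif last == ";":
            match = ENTITY_TAIL.search(candidate)
            if match is None:
                break
            # Exclude the trailing entity-like text from the link.
            candidate = candidate[: match.start()]
        else:
            break
    return candidate
```

For example, `trim_autolink("www.google.com/search?q=Markup+(business))")` returns `"www.google.com/search?q=Markup+(business)"`, and `trim_autolink("www.google.com/search?q=commonmark&hl;")` returns `"www.google.com/search?q=commonmark"`, in line with the expected output in the spec examples.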
@@ -8800,6 +9177,45 @@ foo <a href="\*">
8800
9177
  ````````````````````````````````
8801
9178
 
8802
9179
 
9180
+ <div class="extension">
9181
+
9182
+ ## Raw HTML (extension)
9183
+
9184
+ If the `tagfilter` extension is enabled, the following HTML tags will be
9185
+ filtered when rendering HTML output:
9186
+
9187
+ * `<title>`
9188
+ * `<textarea>`
9189
+ * `<style>`
9190
+ * `<xmp>`
9191
+ * `<iframe>`
9192
+ * `<noembed>`
9193
+ * `<noframes>`
9194
+ * `<script>`
9195
+ * `<plaintext>`
9196
+
9197
+ Filtering is done by replacing the leading `<` with the entity `&lt;`. These
9198
+ tags are chosen in particular as they change how HTML is interpreted in a way
9199
+ unique to them (i.e. nested HTML is interpreted differently), and this is
9200
+ usually undesireable in the context of other rendered Markdown content.
9201
+
9202
+ All other HTML tags are left untouched.
9203
+
9204
+ ```````````````````````````````` example tagfilter
9205
+ <strong> <title> <style> <em>
9206
+
9207
+ <blockquote>
9208
+ <xmp> is disallowed.
9209
+ </blockquote>
9210
+ .
9211
+ <p><strong> &lt;title> &lt;style> <em></p>
9212
+ <blockquote>
9213
+ &lt;xmp> is disallowed.
9214
+ </blockquote>
9215
+ ````````````````````````````````
9216
+
9217
+ </div>
9218
+
8803
9219
  ## Hard line breaks
8804
9220
 
8805
9221
  A line break (not in a code span or HTML tag) that is preceded