docverter-server 1.0.1-java

Files changed (58)
  1. data/.buildpacks +2 -0
  2. data/.gitignore +6 -0
  3. data/.vendor_urls +3 -0
  4. data/Gemfile +3 -0
  5. data/Gemfile.lock +55 -0
  6. data/LICENSE +67 -0
  7. data/Procfile +1 -0
  8. data/README.md +32 -0
  9. data/Rakefile +23 -0
  10. data/config.ru +3 -0
  11. data/doc/api.md +1664 -0
  12. data/doc/examples/html_to_encrypted_pdf/convert.sh +16 -0
  13. data/doc/examples/html_to_encrypted_pdf/imfeldoublepica.ttf +0 -0
  14. data/doc/examples/html_to_encrypted_pdf/input.html +27 -0
  15. data/doc/examples/html_to_encrypted_pdf/marcellus.ttf +0 -0
  16. data/doc/examples/html_to_encrypted_pdf/stylesheet.css +22 -0
  17. data/doc/examples/html_to_pdf/convert.sh +16 -0
  18. data/doc/examples/html_to_pdf/imfeldoublepica.ttf +0 -0
  19. data/doc/examples/html_to_pdf/input.html +27 -0
  20. data/doc/examples/html_to_pdf/marcellus.ttf +0 -0
  21. data/doc/examples/html_to_pdf/stylesheet.css +22 -0
  22. data/doc/examples/markdown_to_epub/chapter1.md +11 -0
  23. data/doc/examples/markdown_to_epub/chapter2.md +10 -0
  24. data/doc/examples/markdown_to_epub/convert.sh +20 -0
  25. data/doc/examples/markdown_to_epub/document-open.png +0 -0
  26. data/doc/examples/markdown_to_epub/markdown_to_epub.epub +241 -0
  27. data/doc/examples/markdown_to_epub/markdown_to_epub.epub.html +251 -0
  28. data/doc/examples/markdown_to_epub/metadata.xml +2 -0
  29. data/doc/examples/markdown_to_epub/stylesheet.css +216 -0
  30. data/doc/examples/markdown_to_epub/title.txt +2 -0
  31. data/doc/examples/markdown_to_mobi/chapter1.md +11 -0
  32. data/doc/examples/markdown_to_mobi/chapter2.md +10 -0
  33. data/doc/examples/markdown_to_mobi/convert.sh +20 -0
  34. data/doc/examples/markdown_to_mobi/document-open.png +0 -0
  35. data/doc/examples/markdown_to_mobi/metadata.xml +2 -0
  36. data/doc/examples/markdown_to_mobi/stylesheet.css +216 -0
  37. data/doc/examples/markdown_to_mobi/title.txt +2 -0
  38. data/doc/examples/markdown_to_pdf/chapter1.md +11 -0
  39. data/doc/examples/markdown_to_pdf/chapter2.md +10 -0
  40. data/doc/examples/markdown_to_pdf/convert.sh +18 -0
  41. data/doc/examples/markdown_to_pdf/imfeldoublepica.ttf +0 -0
  42. data/doc/examples/markdown_to_pdf/manifest.yml +8 -0
  43. data/doc/examples/markdown_to_pdf/marcellus.ttf +0 -0
  44. data/doc/examples/markdown_to_pdf/stylesheet.css +22 -0
  45. data/docverter.gemspec +28 -0
  46. data/lib/docverter-server/app.rb +50 -0
  47. data/lib/docverter-server/conversion.rb +38 -0
  48. data/lib/docverter-server/conversion_types.rb +59 -0
  49. data/lib/docverter-server/jars/bcprov-ext-jdk15-1.43.jar +0 -0
  50. data/lib/docverter-server/jars/htmlcleaner-2.2.jar +0 -0
  51. data/lib/docverter-server/manifest.rb +107 -0
  52. data/lib/docverter-server/runner/base.rb +36 -0
  53. data/lib/docverter-server/runner/calibre.rb +10 -0
  54. data/lib/docverter-server/runner/pandoc.rb +18 -0
  55. data/lib/docverter-server/runner/pdf.rb +91 -0
  56. data/lib/docverter-server/version.rb +3 -0
  57. data/lib/docverter-server.rb +11 -0
  58. metadata +230 -0
data/doc/api.md ADDED
@@ -0,0 +1,1664 @@
1
+ Languages
2
+ =========
3
+
4
+ Docverter has an [official Ruby Gem](http://rubygems.org/gems/docverter) ([GitHub](https://github.com/docverter/docverter-ruby)). The API described here can be used from any language that can make HTTP requests.
5
+
6
+ Conversions
7
+ ===========
8
+
9
+ The API has one endpoint:
10
+
11
+ POST https://api.docverter.com/v1/convert
12
+
13
+ The contents of your POST should be `multipart/form-data` and consist of your input file(s) and options which describe your conversion. For example:
14
+
15
+ curl -u "your_api_key:" \
16
+ --form input_files[]=@chapter1.md \
17
+ --form input_files[]=@chapter2.md \
18
+ --form input_files[]=@chapter3.md \
19
+ --form from=markdown \
20
+ --form to=pdf \
21
+ --form css=stylesheet.css \
22
+ --form other_files[]=@stylesheet.css
23
+
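For readers who are not using curl, the same request can be sketched in Ruby with the standard library. This is a minimal sketch, not the official gem's API: `build_convert_request` is a hypothetical helper, the file contents are placeholders, and the request is only built, not sent.

```ruby
require 'net/http'
require 'stringio'
require 'uri'

# Endpoint as given in the documentation above.
CONVERT_URI = URI('https://api.docverter.com/v1/convert')

# Hypothetical helper: builds the multipart POST without sending it.
# input_files and other_files map a filename to its contents.
def build_convert_request(api_key, input_files, other_files, options)
  request = Net::HTTP::Post.new(CONVERT_URI)
  request.basic_auth(api_key, '') # same as curl -u "your_api_key:"

  fields = []
  input_files.each do |name, body|
    fields << ['input_files[]', StringIO.new(body), { filename: name }]
  end
  other_files.each do |name, body|
    fields << ['other_files[]', StringIO.new(body), { filename: name }]
  end
  options.each { |key, value| fields << [key.to_s, value.to_s] }

  request.set_form(fields, 'multipart/form-data')
  request
end

request = build_convert_request(
  'your_api_key',
  { 'chapter1.md' => "# Chapter 1\n", 'chapter2.md' => "# Chapter 2\n" },
  { 'stylesheet.css' => "body { margin: 1in; }\n" },
  { from: 'markdown', to: 'pdf', css: 'stylesheet.css' }
)

# Sending requires network access and a valid key:
# response = Net::HTTP.start(CONVERT_URI.host, CONVERT_URI.port, use_ssl: true) do |http|
#   http.request(request)
# end
```

Building the request separately from sending it keeps the form fields easy to inspect before any network call is made.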
24
+ The `examples` directory contains several examples showing off various API options.
25
+
26
+ Full Option Reference
27
+ =====================
28
+
29
+ General options
30
+ ---------------
31
+
32
+ **`input_files[]`** *ATTACHMENT*
33
+
34
+ A single input file. This can be specified multiple times. The value should be a `multipart/form-data` file upload.
35
+
36
+ **`other_files[]`** *ATTACHMENT*
37
+
38
+ A single additional file. This can be specified multiple times. The value should be a `multipart/form-data` file upload.
39
+
40
+ **`from`**
41
+
42
+ Specify input format. *FORMAT* can be `markdown` (markdown),
43
+ `textile` (Textile), `rst` (reStructuredText), `html` (HTML),
44
+ `docbook` (DocBook XML), or `latex` (LaTeX).
45
+
46
+ **`to`**
47
+
48
+ Specify output format. *FORMAT* can be `markdown` (markdown), `rst` (reStructuredText), `html` (XHTML 1),
49
+ `latex` (LaTeX), `context` (ConTeXt), `mediawiki` (MediaWiki markup),
50
+ `textile` (Textile), `org` (Emacs Org-Mode), `texinfo` (GNU Texinfo),
51
+ `docbook` (DocBook XML), `docx` (Word docx), `epub` (EPUB book),
52
+ `mobi` (Kindle book), `asciidoc` (AsciiDoc), or `rtf` (rich text format).
53
+
54
+
55
+ Reader options
56
+ --------------
57
+
58
+ **`strict`**
59
+
60
+ Use strict markdown syntax, with no docverter extensions or variants.
61
+ When the input format is HTML, this means that constructs that have no
62
+ equivalents in standard markdown (e.g. definition lists or strikeout
63
+ text) will be parsed as raw HTML.
64
+
65
+ **`parse_raw`**
66
+
67
+ Parse untranslatable HTML codes and LaTeX environments as raw HTML
68
+ or LaTeX, instead of ignoring them.
69
+
70
+ **`smart`**
71
+
72
+ Produce typographically correct output, converting straight quotes
73
+ to curly quotes, `---` to em-dashes, `--` to en-dashes, and
74
+ `...` to ellipses. Nonbreaking spaces are inserted after certain
75
+ abbreviations, such as "Mr." (Note: This option is significant only when
76
+ the input format is `markdown` or `textile`. It is selected automatically
77
+ when the input format is `textile` or the output format is `latex` or
78
+ `context`, unless `no_tex_ligatures` is used.)
79
+
80
+ **`base_header_level`** *NUMBER*
81
+
82
+ Specify the base level for headers (defaults to 1).
83
+
84
+ **`indented_code_classes`** *CLASSES*
85
+
86
+ Specify classes to use for indented code blocks--for example,
87
+ `perl,numberLines` or `haskell`. Multiple classes may be separated by commas.
88
+
89
+ **`normalize`**
90
+
91
+ Normalize the document after reading: merge adjacent
92
+ `Str` or `Emph` elements, for example, and remove repeated `Space`s.
93
+
94
+ **`preserve_tabs`**
95
+
96
+ Preserve tabs instead of converting them to spaces (the default).
97
+
98
+ **`tab-stop`** *NUMBER*
99
+
100
+ Specify the number of spaces per tab (default is 4).
101
+
102
+ General writer options
103
+ ----------------------
104
+
105
+ **`template`** *FILE*
106
+
107
+ Use *FILE* as a custom template for the generated document. See [Templates](#templates) below for a description
108
+ of template syntax. If no extension is specified, an extension
109
+ corresponding to the writer will be added, so that `template: special`
110
+ looks for `special.html` for HTML output. If this option is not used, a default
111
+ template appropriate for the output format will be used. This file must be included using `other_files[]`.
112
+
113
+ **`no_wrap`**
114
+
115
+ Disable text wrapping in output. By default, text is wrapped
116
+ appropriately for the output format.
117
+
118
+ **`columns`** *NUMBER*
119
+
120
+ Specify length of lines in characters (for text wrapping).
121
+
122
+ **`table_of_contents`**
123
+
124
+ Include an automatically generated table of contents (or, in
125
+ the case of `latex`, `context`, and `rst`, an instruction to create
126
+ one) in the output document.
127
+
128
+ **`no_highlight`**
129
+
130
+ Disables syntax highlighting for code blocks and inlines, even when
131
+ a language attribute is given.
132
+
133
+ **`highlight_style`** *STYLE*
134
+
135
+ Specifies the coloring style to be used in highlighted source code.
136
+ Options are `pygments` (the default), `kate`, `monochrome`,
137
+ `espresso`, `zenburn`, `haddock`, and `tango`.
138
+
139
+ **`include_in_header`** *FILE*
140
+
141
+ Include contents of *FILE*, verbatim, at the end of the header.
142
+ This can be used, for example, to include special
143
+ CSS or javascript in HTML documents. This file must be included using `other_files[]`.
144
+
145
+ **`include_before_body`** *FILE*
146
+
147
+ Include contents of *FILE*, verbatim, at the beginning of the
148
+ document body (e.g. after the `<body>` tag in HTML, or the
149
+ `\begin{document}` command in LaTeX). This can be used to include
150
+ navigation bars or banners in HTML documents. This file must be included using `other_files[]`.
151
+
152
+ **`include_after_body`** *FILE*
153
+
154
+ Include contents of *FILE*, verbatim, at the end of the document
155
+ body (before the `</body>` tag in HTML, or the
156
+ `\end{document}` command in LaTeX). This file must be included using `other_files[]`.
157
+
158
+ **`variable`** *KEY[:VAL]*
159
+
160
+ Set the template variable *KEY* to the value *VAL* when rendering the
161
+ document in standalone mode. This is generally only useful when the
162
+ `template` option is used to specify a custom template, since
163
+ docverter automatically sets the variables used in the default
164
+ templates. If no *VAL* is specified, the key will be given the
165
+ value `true`.
166
+
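The *KEY[:VAL]* convention can be illustrated with a short sketch. `parse_variable` is a hypothetical helper, and representing the default as the string `'true'` is an assumption of this sketch rather than documented behavior.

```ruby
# Hypothetical helper: splits a KEY[:VAL] spec into a [key, value] pair.
# Only the first colon separates key from value, so values may contain colons.
def parse_variable(spec)
  key, value = spec.split(':', 2)
  [key, value.nil? ? 'true' : value]
end

parse_variable('author:Jane Doe') # => ["author", "Jane Doe"]
parse_variable('draft')           # => ["draft", "true"]
```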
167
+ Options affecting specific writers
168
+ ----------------------------------
169
+
170
+ **`ascii`**
171
+
172
+ Use only ascii characters in output. Currently supported only
173
+ for HTML output (which uses numerical entities instead of
174
+ UTF-8 when this option is selected).
175
+
176
+ **`reference_links`**
177
+
178
+ Use reference-style links, rather than inline links, in writing markdown
179
+ or reStructuredText. By default inline links are used.
180
+
181
+ **`atx_headers`**
182
+
183
+ Use ATX style headers in markdown output. The default is to use
184
+ setext-style headers for levels 1-2, and then ATX headers.
185
+
186
+ **`chapters`**
187
+
188
+ Treat top-level headers as chapters in LaTeX, ConTeXt, and DocBook
189
+ output. When the LaTeX template uses the report, book, or
190
+ memoir class, this option is implied.
191
+
192
+ **`number_sections`**
193
+
194
+ Number section headings in LaTeX, ConTeXt, or HTML output.
195
+ By default, sections are not numbered.
196
+
197
+ **`no_tex_ligatures`**
198
+
199
+ Do not convert quotation marks, apostrophes, and dashes to
200
+ the TeX ligatures when writing LaTeX or ConTeXt. Instead, just
201
+ use literal unicode characters. This is needed for using advanced
202
+ OpenType features with XeLaTeX and LuaLaTeX. Note: normally
203
+ `smart` is selected automatically for LaTeX and ConTeXt
204
+ output, but it must be specified explicitly if `no_tex_ligatures`
205
+ is selected. If you use literal curly quotes, dashes, and ellipses
206
+ in your source, then you may want to use `no_tex_ligatures`
207
+ without `smart`.
208
+
209
+ **`listings`**
210
+
211
+ Use the listings package for LaTeX code blocks.
212
+
213
+ **`section_divs`**
214
+
215
+ Wrap sections in `<div>` tags (or `<section>` tags in HTML5),
216
+ and attach identifiers to the enclosing `<div>` (or `<section>`)
217
+ rather than the header itself.
218
+ See [Section identifiers](#header-identifiers-in-html-latex-and-context), below.
219
+
220
+ **`email_obfuscation`** *none|javascript|references*
221
+
222
+ Specify a method for obfuscating `mailto:` links in HTML documents.
223
+ *none* leaves `mailto:` links as they are. *javascript* obfuscates
224
+ them using javascript. *references* obfuscates them by printing their
225
+ letters as decimal or hexadecimal character references.
226
+ If `strict` is specified, *references* is used regardless of the
227
+ presence of this option.
228
+
229
+ **`id_prefix`** *STRING*
230
+
231
+ Specify a prefix to be added to all automatically generated identifiers
232
+ in HTML output. This is useful for preventing duplicate identifiers
233
+ when generating fragments to be included in other pages.
234
+
235
+ **`title_prefix`** *STRING*
236
+
237
+ Specify *STRING* as a prefix at the beginning of the title
238
+ that appears in the HTML header (but not in the title as it
239
+ appears at the beginning of the HTML body).
240
+
241
+ **`css`** *URL*
242
+
243
+ Link to a CSS style sheet.
244
+
245
+ **`reference_docx`** *FILE*
246
+
247
+ Use the specified file as a style reference in producing a docx file.
248
+ For best results, the reference docx should be a modified version
249
+ of a docx file produced using docverter. The contents of the reference docx
250
+ are ignored, but its stylesheets are used in the new docx. This file must be included using `other_files[]`.
251
+
252
+ **`pdf_username`** *STRING*
253
+
254
+ Encrypt the output PDF with the given username.
255
+
256
+ **`pdf_password`** *STRING*
257
+
258
+ Encrypt the output PDF with the given password.
259
+
260
+ **`epub_stylesheet`** *FILE*
261
+
262
+ Use the specified CSS file to style the EPUB. This file must be included using `other_files[]`.
263
+
264
+ **`epub_cover_image`** *FILE*
265
+
266
+ Use the specified image as the EPUB cover. It is recommended
267
+ that the image be less than 1000px in width and height. This file must be included using `other_files[]`.
268
+
269
+ **`epub_metadata`** *FILE*
270
+
271
+ Look in the specified XML file for metadata for the EPUB.
272
+ The file should contain a series of Dublin Core elements,
273
+ as documented at <http://dublincore.org/documents/dces/>.
274
+ For example:
275
+
276
+ <dc:rights>Creative Commons</dc:rights>
277
+ <dc:language>es-AR</dc:language>
278
+
279
+ By default, docverter will include the following metadata elements:
280
+ `<dc:title>` (from the document title), `<dc:creator>` (from the
281
+ document authors), `<dc:date>` (from the document date, which should
282
+ be in [ISO 8601 format]), `<dc:language>` (from the `lang`
283
+ variable, or, if it is not set, the locale), and `<dc:identifier
284
+ id="BookId">` (a randomly generated UUID). Any of these may be
285
+ overridden by elements in the metadata file. This file must be included using `other_files[]`.
286
+
287
+ **`epub_embed_font`** *FILE*
288
+
289
+ Embed the specified font in the EPUB. This option can be an
290
+ array to embed multiple fonts. To use embedded fonts, you
291
+ will need to add declarations like the following to your CSS (see
292
+ `epub_stylesheet`):
293
+
294
+ @font-face {
295
+ font-family: DejaVuSans;
296
+ font-style: normal;
297
+ font-weight: normal;
298
+ src:url("DejaVuSans-Regular.ttf");
299
+ }
300
+ @font-face {
301
+ font-family: DejaVuSans;
302
+ font-style: normal;
303
+ font-weight: bold;
304
+ src:url("DejaVuSans-Bold.ttf");
305
+ }
306
+ @font-face {
307
+ font-family: DejaVuSans;
308
+ font-style: italic;
309
+ font-weight: normal;
310
+ src:url("DejaVuSans-Oblique.ttf");
311
+ }
312
+ @font-face {
313
+ font-family: DejaVuSans;
314
+ font-style: italic;
315
+ font-weight: bold;
316
+ src:url("DejaVuSans-BoldOblique.ttf");
317
+ }
318
+ body { font-family: "DejaVuSans"; }
319
+
320
+ This file must be included using `other_files[]`.
321
+
322
+ Templates
323
+ =========
324
+
325
+ Docverter uses a template to
326
+ add header and footer material that is needed for a self-standing
327
+ document. A custom template
328
+ can be specified using the `template` option.
329
+
330
+ Templates may contain *variables*. Variable names are sequences of
331
+ alphanumerics, `-`, and `_`, starting with a letter. A variable name
332
+ surrounded by `$` signs will be replaced by its value. For example,
333
+ the string `$title$` in
334
+
335
+ <title>$title$</title>
336
+
337
+ will be replaced by the document title.
338
+
339
+ To write a literal `$` in a template, use `$$`.
340
+
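The substitution rules above (`$name$` replaced by its value, `$$` as a literal dollar sign) can be sketched as follows. This is an illustration only, not the template engine itself: `interpolate` is a hypothetical helper, it ignores conditionals and loops, and treating unset variables as empty strings is an assumption.

```ruby
# Hypothetical helper: replaces $name$ with vars['name'] and $$ with a literal $.
# Variable names start with a letter and may contain alphanumerics, - and _.
def interpolate(template, vars)
  template.gsub(/\$\$|\$([A-Za-z][A-Za-z0-9_-]*)\$/) do
    name = Regexp.last_match(1)
    name.nil? ? '$' : vars.fetch(name, '').to_s
  end
end

interpolate('<title>$title$</title>', 'title' => 'My Book')
# => "<title>My Book</title>"
interpolate('Price: $$5', {})
# => "Price: $5"
```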
341
+ Some variables are set automatically by docverter. These vary somewhat
342
+ depending on the output format, but include:
343
+
344
+ **`header-includes`**
345
+
346
+ contents specified by `include_in_header` (may have multiple
347
+ values)
348
+
349
+ **`toc`**
350
+
351
+ non-null value if `table_of_contents` was specified
352
+
353
+ **`include-before`**
354
+
355
+ contents specified by `include_before_body` (may have
356
+ multiple values)
357
+
358
+ **`include-after`**
359
+
360
+ contents specified by `include_after_body` (may have
361
+ multiple values)
362
+
363
+ **`body`**
364
+
365
+ body of document
366
+
367
+ **`title`**
368
+
369
+ title of document, as specified in title block
370
+
371
+ **`author`**
372
+
373
+ author of document, as specified in title block (may have
374
+ multiple values)
375
+
376
+ **`date`**
377
+
378
+ date of document, as specified in title block
379
+
380
+ **`lang`**
381
+
382
+ language code for HTML or LaTeX documents
383
+
384
+ **`fontsize`**
385
+
386
+ font size (10pt, 11pt, 12pt) for LaTeX documents
387
+
388
+ **`documentclass`**
389
+
390
+ document class for LaTeX documents
391
+
392
+ **`geometry`**
393
+
394
+ options for the LaTeX `geometry` package, e.g. `margin=1in`;
395
+ may be repeated for multiple options
396
+
397
+ **`mainfont`**, **`sansfont`**, **`monofont`**, **`mathfont`**
398
+
399
+ fonts for LaTeX documents (works only with xelatex
400
+ and lualatex)
401
+
402
+ **`linkcolor`**
403
+
404
+ color for internal links in LaTeX documents (`red`, `green`,
405
+ `magenta`, `cyan`, `blue`, `black`)
406
+
407
+ **`urlcolor`**
408
+
409
+ color for external links in LaTeX documents
410
+
411
+ **`links-as-notes`**
412
+
413
+ causes links to be printed as footnotes in LaTeX documents
414
+
415
+ Variables may be set in the manifest using the `variable`
416
+ option. This allows users to include custom variables in their
417
+ templates.
418
+
419
+ Templates may contain conditionals. The syntax is as follows:
420
+
421
+ $if(variable)$
422
+ X
423
+ $else$
424
+ Y
425
+ $endif$
426
+
427
+ This will include `X` in the template if `variable` has a non-null
428
+ value; otherwise it will include `Y`. `X` and `Y` are placeholders for
429
+ any valid template text, and may include interpolated variables or other
430
+ conditionals. The `$else$` section may be omitted.
431
+
432
+ When variables can have multiple values (for example, `author` in
433
+ a multi-author document), you can use the `$for$` keyword:
434
+
435
+ $for(author)$
436
+ <meta name="author" content="$author$" />
437
+ $endfor$
438
+
439
+ You can optionally specify a separator to be used between
440
+ consecutive items:
441
+
442
+ $for(author)$$author$$sep$, $endfor$
443
+
444
+ If you use custom templates, you may need to revise them as pandoc
445
+ changes. We recommend tracking the changes in the default templates,
446
+ and modifying your custom templates accordingly. An easy way to do this
447
+ is to fork the pandoc-templates repository
448
+ (<http://github.com/jgm/pandoc-templates>) and merge in changes after each
449
+ pandoc release.
450
+
451
+ Docverter's markdown
452
+ ====================
453
+
454
+ Docverter understands an extended and slightly revised version of
455
+ John Gruber's [markdown][] syntax. This document explains the syntax,
456
+ noting differences from standard markdown. Except where noted, these
457
+ differences can be suppressed by specifying the `strict`
458
+ option.
459
+
460
+ Philosophy
461
+ ----------
462
+
463
+ Markdown is designed to be easy to write, and, even more importantly,
464
+ easy to read:
465
+
466
+ > A Markdown-formatted document should be publishable as-is, as plain
467
+ > text, without looking like it's been marked up with tags or formatting
468
+ > instructions.
469
+ > -- [John Gruber](http://daringfireball.net/projects/markdown/syntax#philosophy)
470
+
471
+ This principle has guided docverter's decisions in finding syntax for
472
+ tables, footnotes, and other extensions.
473
+
474
+ There is, however, one respect in which docverter's aims are different
475
+ from the original aims of markdown. Whereas markdown was originally
476
+ designed with HTML generation in mind, docverter is designed for multiple
477
+ output formats. Thus, while docverter allows the embedding of raw HTML,
478
+ it discourages it, and provides other, non-HTMLish ways of representing
479
+ important document elements like definition lists, tables, mathematics, and
480
+ footnotes.
481
+
482
+ Paragraphs
483
+ ----------
484
+
485
+ A paragraph is one or more lines of text followed by one or more blank lines.
486
+ Newlines are treated as spaces, so you can reflow your paragraphs as you like.
487
+ If you need a hard line break, put two or more spaces at the end of a line,
488
+ or type a backslash followed by a newline.
489
+
490
+ Headers
491
+ -------
492
+
493
+ There are two kinds of headers, Setext and atx.
494
+
495
+ ### Setext-style headers ###
496
+
497
+ A setext-style header is a line of text "underlined" with a row of `=` signs
498
+ (for a level one header) or `-` signs (for a level two header):
499
+
500
+ A level-one header
501
+ ==================
502
+
503
+ A level-two header
504
+ ------------------
505
+
506
+ The header text can contain inline formatting, such as emphasis (see
507
+ [Inline formatting](#inline-formatting), below).
508
+
509
+
510
+ ### Atx-style headers ###
511
+
512
+ An Atx-style header consists of one to six `#` signs and a line of
513
+ text, optionally followed by any number of `#` signs. The number of
514
+ `#` signs at the beginning of the line is the header level:
515
+
516
+ ## A level-two header
517
+
518
+ ### A level-three header ###
519
+
520
+ As with setext-style headers, the header text can contain formatting:
521
+
522
+ # A level-one header with a [link](/url) and *emphasis*
523
+
524
+ Standard markdown syntax does not require a blank line before a header.
525
+ Docverter does require this (except, of course, at the beginning of the
526
+ document). The reason for the requirement is that it is all too easy for a
527
+ `#` to end up at the beginning of a line by accident (perhaps through line
528
+ wrapping). Consider, for example:
529
+
530
+ I like several of their flavors of ice cream:
531
+ #22, for example, and #5.
532
+
533
+
534
+ ### Header identifiers in HTML, LaTeX, and ConTeXt ###
535
+
536
+ *Docverter extension*.
537
+
538
+ Each header element in docverter's HTML and ConTeXt output is given a
539
+ unique identifier. This identifier is based on the text of the header.
540
+ To derive the identifier from the header text,
541
+
542
+ - Remove all formatting, links, etc.
543
+ - Remove all punctuation, except underscores, hyphens, and periods.
544
+ - Replace all spaces and newlines with hyphens.
545
+ - Convert all alphabetic characters to lowercase.
546
+ - Remove everything up to the first letter (identifiers may
547
+ not begin with a number or punctuation mark).
548
+ - If nothing is left after this, use the identifier `section`.
549
+
550
+ Thus, for example,
551
+
552
+   Header                            Identifier
553
+   -------------------------------   ----------------------------
554
+   Header identifiers in HTML        `header-identifiers-in-html`
555
+   *Dogs*?--in *my* house?           `dogs--in-my-house`
556
+   [HTML], [S5], or [RTF]?           `html-s5-or-rtf`
557
+   3. Applications                   `applications`
558
+   33                                `section`
559
+
560
+ These rules should, in most cases, allow one to determine the identifier
561
+ from the header text. The exception is when several headers have the
562
+ same text; in this case, the first will get an identifier as described
563
+ above; the second will get the same identifier with `-1` appended; the
564
+ third with `-2`; and so on.
565
+
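The derivation rules and the duplicate-suffix rule above can be sketched as follows. This is a simplified illustration, not the converter's actual implementation: `header_identifier` and `unique_identifiers` are hypothetical names, and the formatting removal only handles inline links, `*`, backticks, and square brackets.

```ruby
# Hypothetical helper: derives an identifier from header text.
def header_identifier(text)
  plain = text.gsub(/\[([^\]]*)\]\([^)]*\)/, '\1') # inline links -> link text
              .gsub(/[*`\[\]]/, '')                # emphasis, code, bracket markers
  id = plain.gsub(/[^[:alnum:]_\-.\s]/, '')        # drop punctuation except _ - .
            .gsub(/\s/, '-')                       # spaces and newlines -> hyphens
            .downcase
            .sub(/\A[^[:alpha:]]+/, '')            # drop everything before the first letter
  id.empty? ? 'section' : id
end

# Hypothetical helper: appends -1, -2, ... to repeated identifiers.
def unique_identifiers(headers)
  seen = Hash.new(0)
  headers.map do |text|
    base = header_identifier(text)
    count = seen[base]
    seen[base] += 1
    count.zero? ? base : "#{base}-#{count}"
  end
end

header_identifier('*Dogs*?--in *my* house?')   # => "dogs--in-my-house"
header_identifier('3. Applications')           # => "applications"
unique_identifiers(%w[Intro Intro Intro])      # => ["intro", "intro-1", "intro-2"]
```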
566
+ These identifiers are used to provide link targets in the table of
567
+ contents generated by the `table_of_contents` option. They
568
+ also make it easy to provide links from one section of a document to
569
+ another. A link to this section, for example, might look like this:
570
+
571
+ See the section on
572
+ [header identifiers](#header-identifiers-in-html).
573
+
574
+ Note, however, that this method of providing links to sections works
575
+ only in HTML, LaTeX, and ConTeXt formats.
576
+
577
+ If the `section_divs` option is specified, then each section will
578
+ be wrapped in a `div` (or a `section`, if `--html5` was specified),
579
+ and the identifier will be attached to the enclosing `<div>`
580
+ (or `<section>`) tag rather than the header itself. This allows entire
581
+ sections to be manipulated using javascript or treated differently in
582
+ CSS.
583
+
584
+
585
+ Block quotations
586
+ ----------------
587
+
588
+ Markdown uses email conventions for quoting blocks of text.
589
+ A block quotation is one or more paragraphs or other block elements
590
+ (such as lists or headers), with each line preceded by a `>` character
591
+ and a space. (The `>` need not start at the left margin, but it should
592
+ not be indented more than three spaces.)
593
+
594
+ > This is a block quote. This
595
+ > paragraph has two lines.
596
+ >
597
+ > 1. This is a list inside a block quote.
598
+ > 2. Second item.
599
+
600
+ A "lazy" form, which requires the `>` character only on the first
601
+ line of each block, is also allowed:
602
+
603
+ > This is a block quote. This
604
+ paragraph has two lines.
605
+
606
+ > 1. This is a list inside a block quote.
607
+ 2. Second item.
608
+
609
+ Among the block elements that can be contained in a block quote are
610
+ other block quotes. That is, block quotes can be nested:
611
+
612
+ > This is a block quote.
613
+ >
614
+ > > A block quote within a block quote.
615
+
616
+ Standard markdown syntax does not require a blank line before a block
617
+ quote. Docverter does require this (except, of course, at the beginning of the
618
+ document). The reason for the requirement is that it is all too easy for a
619
+ `>` to end up at the beginning of a line by accident (perhaps through line
620
+ wrapping). So, unless `strict` is used, the following does not produce
621
+ a nested block quote in docverter:
622
+
623
+ > This is a block quote.
624
+ >> Nested.
625
+
626
+
627
+ Verbatim (code) blocks
628
+ ----------------------
629
+
630
+ ### Indented code blocks ###
631
+
632
+ A block of text indented four spaces (or one tab) is treated as verbatim
633
+ text: that is, special characters do not trigger special formatting,
634
+ and all spaces and line breaks are preserved. For example,
635
+
636
+ if (a > 3) {
637
+ moveShip(5 * gravity, DOWN);
638
+ }
639
+
640
+ The initial (four space or one tab) indentation is not considered part
641
+ of the verbatim text, and is removed in the output.
642
+
643
+ Note: blank lines in the verbatim text need not begin with four spaces.
644
+
645
+
646
+ ### Delimited code blocks ###
647
+
648
+ *Docverter extension*.
649
+
650
+ In addition to standard indented code blocks, Docverter supports
651
+ *delimited* code blocks. These begin with a row of three or more
652
+ tildes (`~`) or backticks (`` ` ``) and end with a row of tildes or
653
+ backticks that must be at least as long as the starting row. Everything
654
+ between these lines is treated as code. No indentation is necessary:
655
+
656
+ ~~~~~~~
657
+ if (a > 3) {
658
+ moveShip(5 * gravity, DOWN);
659
+ }
660
+ ~~~~~~~
661
+
662
+ Like regular code blocks, delimited code blocks must be separated
663
+ from surrounding text by blank lines.
664
+
665
+ If the code itself contains a row of tildes or backticks, just use a longer
666
+ row of tildes or backticks at the start and end:
667
+
668
+ ~~~~~~~~~~~~~~~~
669
+ ~~~~~~~~~~
670
+ code including tildes
671
+ ~~~~~~~~~~
672
+ ~~~~~~~~~~~~~~~~
673
+
674
+ Optionally, you may attach attributes to the code block using
675
+ this syntax:
676
+
677
+ ~~~~ {#mycode .haskell .numberLines startFrom="100"}
678
+ qsort [] = []
679
+ qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++
680
+ qsort (filter (>= x) xs)
681
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
682
+
683
+ Here `mycode` is an identifier, `haskell` and `numberLines` are classes, and
684
+ `startFrom` is an attribute with value `100`. Some output formats can use this
685
+ information to do syntax highlighting. Currently, the only output formats
686
+ that use this information are HTML and LaTeX. If highlighting is supported
687
+ for your output format and language, then the code block above will appear
688
+ highlighted, with numbered lines. (To see which languages are supported, do
689
+ `docverter --version`.) Otherwise, the code block above will appear as follows:
690
+
691
+ <pre id="mycode" class="haskell numberLines" startFrom="100">
692
+ <code>
693
+ ...
694
+ </code>
695
+ </pre>
696
+
697
+ A shortcut form can also be used for specifying the language of
698
+ the code block:
699
+
700
+ ```haskell
701
+ qsort [] = []
702
+ ```
703
+
704
+ This is equivalent to:
705
+
706
+ ``` {.haskell}
707
+ qsort [] = []
708
+ ```
709
+
710
+ To prevent all highlighting, use the `no_highlight` option.
711
+ To set the highlighting style, use `highlight_style`.
712
+
713
+ Lists
714
+ -----
715
+
716
+ ### Bullet lists ###
717
+
718
+ A bullet list is a list of bulleted list items. A bulleted list
719
+ item begins with a bullet (`*`, `+`, or `-`). Here is a simple
720
+ example:
721
+
722
+ * one
723
+ * two
724
+ * three
725
+
726
+ This will produce a "compact" list. If you want a "loose" list, in which
727
+ each item is formatted as a paragraph, put blank lines between the items:
728
+
729
+ * one
730
+
731
+ * two
732
+
733
+ * three
734
+
735
+ The bullets need not be flush with the left margin; they may be
736
+ indented one, two, or three spaces. The bullet must be followed
737
+ by whitespace.
738
+
739
+ List items look best if subsequent lines are flush with the first
740
+ line (after the bullet):
741
+
742
+ * here is my first
743
+   list item.
744
+ * and my second.
745
+
746
+ But markdown also allows a "lazy" format:
747
+
748
+ * here is my first
749
+ list item.
750
+ * and my second.
751
+
752
+ ### The four-space rule ###
753
+
754
+ A list item may contain multiple paragraphs and other block-level
755
+ content. However, subsequent paragraphs must be preceded by a blank line
756
+ and indented four spaces or a tab. The list will look better if the first
757
+ paragraph is aligned with the rest:
758
+
759
+   * First paragraph.
760
+
761
+     Continued.
762
+
763
+   * Second paragraph. With a code block, which must be indented
764
+     eight spaces:
765
+
766
+         { code }
767
+
768
+ List items may include other lists. In this case the preceding blank
769
+ line is optional. The nested list must be indented four spaces or
770
+ one tab:
771
+
772
+ * fruits
773
+     + apples
774
+         - macintosh
775
+         - red delicious
776
+     + pears
777
+     + peaches
778
+ * vegetables
779
+     + broccoli
780
+     + chard
781
+
782
+ As noted above, markdown allows you to write list items "lazily," instead of
783
+ indenting continuation lines. However, if there are multiple paragraphs or
784
+ other blocks in a list item, the first line of each must be indented.
785
+
786
+ + A lazy, lazy, list
787
+ item.
788
+
789
+ + Another one; this looks
790
+ bad but is legal.
791
+
792
+ Second paragraph of second
793
+ list item.
794
+
795
+ **Note:** Although the four-space rule for continuation paragraphs
796
+ comes from the official [markdown syntax guide], the reference implementation,
797
+ `Markdown.pl`, does not follow it. So docverter will give different results than
798
+ `Markdown.pl` when authors have indented continuation paragraphs fewer than
799
+ four spaces.
800
+
801
+ The [markdown syntax guide] is not explicit about whether the four-space
802
+ rule applies to *all* block-level content in a list item; it only
803
+ mentions paragraphs and code blocks. But it implies that the rule
804
+ applies to all block-level content (including nested lists), and
805
+ docverter interprets it that way.
806
+
807
+ [markdown syntax guide]:
808
+ http://daringfireball.net/projects/markdown/syntax#list
809
+
810
+ ### Ordered lists ###
811
+
812
+ Ordered lists work just like bulleted lists, except that the items
813
+ begin with enumerators rather than bullets.
814
+
815
+ In standard markdown, enumerators are decimal numbers followed
816
+ by a period and a space. The numbers themselves are ignored, so
817
+ there is no difference between this list:
818
+
819
+ 1. one
820
+ 2. two
821
+ 3. three
822
+
823
+ and this one:
824
+
825
+ 5. one
826
+ 7. two
827
+ 1. three
828
+
829
+ *Docverter extension*.
830
+
831
+ Unlike standard markdown, Docverter allows ordered list items to be marked
832
+ with uppercase and lowercase letters and roman numerals, in addition to
833
+ arabic numerals. List markers may be enclosed in parentheses or followed by a
834
+ single right parenthesis or period. They must be separated from the
835
+ text that follows by at least one space, and, if the list marker is a
836
+ capital letter with a period, by at least two spaces.
837
+
838
+ Docverter also pays attention to the type of list marker used, and to the
839
+ starting number, and both of these are preserved where possible in the
840
+ output format. Thus, the following yields a list with numbers followed
841
+ by a single parenthesis, starting with 9, and a sublist with lowercase
842
+ roman numerals:
843
+
844
+ 9) Ninth
845
+ 10) Tenth
846
+ 11) Eleventh
847
+ i. subone
848
+ ii. subtwo
849
+ iii. subthree
850
+
851
+ Docverter will start a new list each time a different type of list
852
+ marker is used. So, the following will create three lists:
853
+
854
+ (2) Two
855
+ (5) Three
856
+ 1. Four
857
+ * Five
858
+
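The marker rules above can be illustrated with a small sketch. It is not Docverter's parser: `classify_marker` and its style and delimiter names are hypothetical, chosen only to show how a marker's numbering style and delimiter might be told apart. Two items whose markers classify differently would, under the rule above, belong to different lists.

```python
import re

# Hypothetical helper, not Docverter's actual parser.
# A marker is an optional '(', then digits, '#', or letters, then '.' or ')'.
MARKER = re.compile(r"^(\()?([0-9]+|#|[A-Za-z]+)([.)])$")
ROMAN = re.compile(r"^[ivxlcdm]+$", re.IGNORECASE)


def classify_marker(marker):
    """Return (style, delimiter) for a marker such as '9)', '(2)' or 'iii.'.

    Returns None when the text is not an ordered-list marker.
    """
    match = MARKER.match(marker)
    if match is None:
        return None
    opening, body, closing = match.groups()
    if opening and closing != ")":
        return None  # '(2.' mixes delimiters and is rejected
    if opening:
        delimiter = "two-parens"
    elif closing == ")":
        delimiter = "one-paren"
    else:
        delimiter = "period"
    if body == "#":
        style = "default"
    elif body.isdigit():
        style = "decimal"
    elif body.islower() or body.isupper():
        case = "lower" if body.islower() else "upper"
        if ROMAN.match(body):
            # Assumption: letters that form a roman numeral count as roman,
            # so 'i.' and 'C.' are treated as numerals rather than letters.
            style = case + "-roman"
        elif len(body) == 1:
            style = case + "-alpha"
        else:
            return None
    else:
        return None  # mixed-case bodies are not markers
    return style, delimiter
```

In the three-list example above, `(2)` and `(5)` classify identically, `1.` differs in delimiter, and `*` is not an ordered marker at all.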
859
+ If default list markers are desired, use `#.`:
860
+
861
+ #. one
862
+ #. two
863
+ #. three
864
+
865
+
866
+ ### Definition lists ###
867
+
868
+ *Docverter extension*.
869
+
870
+ Docverter supports definition lists, using a syntax inspired by
871
+ [PHP Markdown Extra] and [reStructuredText]:
872
+
873
+ Term 1
874
+
875
+ : Definition 1
876
+
877
+ Term 2 with *inline markup*
878
+
879
+ : Definition 2
880
+
881
+ { some code, part of Definition 2 }
882
+
883
+ Third paragraph of definition 2.
884
+
885
+ Each term must fit on one line, which may optionally be followed by
886
+ a blank line, and must be followed by one or more definitions.
887
+ A definition begins with a colon or tilde, which may be indented one
888
+ or two spaces. The body of the definition (including the first line,
889
+ aside from the colon or tilde) should be indented four spaces. A term may have
890
+ multiple definitions, and each definition may consist of one or more block
891
+ elements (paragraph, code block, list, etc.), each indented four spaces or one
892
+ tab stop.
893
+
894
+ If you leave space after the definition (as in the example above),
895
+ the blocks of the definitions will be considered paragraphs. In some
896
+ output formats, this will mean greater spacing between term/definition
897
+ pairs. For a compact definition list, do not leave space between the
898
+ definition and the next term:
899
+
900
+ Term 1
901
+ ~ Definition 1
902
+ Term 2
903
+ ~ Definition 2a
904
+ ~ Definition 2b
905
+
906
+ [PHP Markdown Extra]: http://www.michelf.com/projects/php-markdown/extra/
907
+
908
+
909
+ ### Numbered example lists ###
910
+
911
+ *Docverter extension*.
912
+
913
+ The special list marker `@` can be used for sequentially numbered
914
+ examples. The first list item with a `@` marker will be numbered '1',
915
+ the next '2', and so on, throughout the document. The numbered examples
916
+ need not occur in a single list; each new list using `@` will take up
917
+ where the last stopped. So, for example:
918
+
919
+ (@) My first example will be numbered (1).
920
+ (@) My second example will be numbered (2).
921
+
922
+ Explanation of examples.
923
+
924
+ (@) My third example will be numbered (3).
925
+
926
+ Numbered examples can be labeled and referred to elsewhere in the
927
+ document:
928
+
929
+ (@good) This is a good example.
930
+
931
+ As (@good) illustrates, ...
932
+
933
+ The label can be any string of alphanumeric characters, underscores,
934
+ or hyphens.
935
+
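The numbering behaviour described above can be sketched as a single pass over the text. This is only an illustration under simplifying assumptions, not Docverter's implementation: `number_examples` is a hypothetical name, the first occurrence of a label is taken as its definition, and list markers are not distinguished from in-text references.

```python
import re

# '(@)' or '(@label)'; labels are alphanumerics, underscores, or hyphens.
EXAMPLE = re.compile(r"\(@([A-Za-z0-9_-]*)\)")


def number_examples(text):
    """Replace (@) markers with document-wide running numbers (a sketch)."""
    labels = {}
    counter = 0

    def replace(match):
        nonlocal counter
        label = match.group(1)
        if label and label in labels:
            # A label seen before is a reference: reuse its number.
            return f"({labels[label]})"
        counter += 1
        if label:
            labels[label] = counter
        return f"({counter})"

    return EXAMPLE.sub(replace, text)
```

Because the counter lives outside any single list, a later list using `(@)` continues from where the previous one stopped.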
936
+
937
+ ### Compact and loose lists ###
938
+
939
+ Docverter behaves differently from `Markdown.pl` on some "edge
940
+ cases" involving lists. Consider this source:
941
+
942
+ + First
943
+ + Second:
944
+ - Fee
945
+ - Fie
946
+ - Foe
947
+
948
+ + Third
949
+
950
+ Docverter transforms this into a "compact list" (with no `<p>` tags around
951
+ "First", "Second", or "Third"), while markdown puts `<p>` tags around
952
+ "Second" and "Third" (but not "First"), because of the blank space
953
+ around "Third". Docverter follows a simple rule: if the text is followed by
954
+ a blank line, it is treated as a paragraph. Since "Second" is followed
955
+ by a list, and not a blank line, it isn't treated as a paragraph. The
956
+ fact that the list is followed by a blank line is irrelevant. (Note:
957
+ Docverter works this way even when the `--strict` option is specified. This
958
+ behavior is consistent with the official markdown syntax description,
959
+ even though it is different from that of `Markdown.pl`.)
960
+
961
+
962
+ ### Ending a list ###
963
+
964
+ What if you want to put an indented code block after a list?
965
+
966
+ - item one
967
+ - item two
968
+
969
+ { my code block }
970
+
971
+ Trouble! Here docverter (like other markdown implementations) will treat
972
+ `{ my code block }` as the second paragraph of item two, and not as
973
+ a code block.
974
+
975
+ To "cut off" the list after item two, you can insert some non-indented
976
+ content, like an HTML comment, which won't produce visible output in
977
+ any format:
978
+
979
+ - item one
980
+ - item two
981
+
982
+ <!-- end of list -->
983
+
984
+ { my code block }
985
+
986
+ You can use the same trick if you want two consecutive lists instead
987
+ of one big list:
988
+
989
+ 1. one
990
+ 2. two
991
+ 3. three
992
+
993
+ <!-- -->
994
+
995
+ 1. uno
996
+ 2. dos
997
+ 3. tres
998
+
999
+ Horizontal rules
1000
+ ----------------
1001
+
1002
+ A line containing a row of three or more `*`, `-`, or `_` characters
1003
+ (optionally separated by spaces) produces a horizontal rule:
1004
+
1005
+ * * * *
1006
+
1007
+ ---------------
1008
+
1009
+
1010
+ Tables
1011
+ ------
1012
+
1013
+ *Docverter extension*.
1014
+
1015
+ Three kinds of tables may be used. All three kinds presuppose the use of
1016
+ a fixed-width font, such as Courier.
1017
+
1018
+ **Simple tables** look like this:
1019
+
1020
+ Right Left Center Default
1021
+ ------- ------ ---------- -------
1022
+ 12 12 12 12
1023
+ 123 123 123 123
1024
+ 1 1 1 1
1025
+
1026
+ Table: Demonstration of simple table syntax.
1027
+
1028
+ The headers and table rows must each fit on one line. Column
1029
+ alignments are determined by the position of the header text relative
1030
+ to the dashed line below it:
1031
+
1032
+ - If the dashed line is flush with the header text on the right side
1033
+ but extends beyond it on the left, the column is right-aligned.
1034
+ - If the dashed line is flush with the header text on the left side
1035
+ but extends beyond it on the right, the column is left-aligned.
1036
+ - If the dashed line extends beyond the header text on both sides,
1037
+ the column is centered.
1038
+ - If the dashed line is flush with the header text on both sides,
1039
+ the default alignment is used (in most cases, this will be left).
1040
+
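The four alignment rules above can be expressed compactly. The following is a minimal sketch, assuming the header line and the dashed line are available as plain strings; `column_alignments` is a hypothetical helper, not part of Docverter.

```python
import re


def column_alignments(header_line, dash_line):
    """Infer one alignment per dashed run, following the rules above."""
    alignments = []
    for run in re.finditer(r"-+", dash_line):
        width = run.end() - run.start()
        # Header text that sits directly above this run of dashes,
        # padded in case the header line is shorter than the dashed line.
        cell = header_line[run.start():run.end()].ljust(width)
        flush_left = not cell[0].isspace()
        flush_right = not cell[-1].isspace()
        if flush_left and flush_right:
            alignments.append("default")
        elif flush_left:
            alignments.append("left")
        elif flush_right:
            alignments.append("right")
        else:
            alignments.append("center")
    return alignments
```

The sketch only looks at the characters directly above each end of a dashed run, so it assumes the header text does not extend past its dashes.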
1041
+ The table must end with a blank line, or a line of dashes followed by
1042
+ a blank line. A caption may optionally be provided (as illustrated in
1043
+ the example above). A caption is a paragraph beginning with the string
1044
+ `Table:` (or just `:`), which will be stripped off. It may appear either
1045
+ before or after the table.
1046
+
1047
+ The column headers may be omitted, provided a dashed line is used
1048
+ to end the table. For example:
1049
+
1050
+ ------- ------ ---------- -------
1051
+ 12 12 12 12
1052
+ 123 123 123 123
1053
+ 1 1 1 1
1054
+ ------- ------ ---------- -------
1055
+
1056
+ When headers are omitted, column alignments are determined on the basis
1057
+ of the first line of the table body. So, in the tables above, the columns
1058
+ would be right, left, center, and right aligned, respectively.
1059
+
1060
+ **Multiline tables** allow headers and table rows to span multiple lines
1061
+ of text (but cells that span multiple columns or rows of the table are
1062
+ not supported). Here is an example:
1063
+
1064
+ -------------------------------------------------------------
1065
+ Centered Default Right Left
1066
+ Header Aligned Aligned Aligned
1067
+ ----------- ------- --------------- -------------------------
1068
+ First row 12.0 Example of a row that
1069
+ spans multiple lines.
1070
+
1071
+ Second row 5.0 Here's another one. Note
1072
+ the blank line between
1073
+ rows.
1074
+ -------------------------------------------------------------
1075
+
1076
+ Table: Here's the caption. It, too, may span
1077
+ multiple lines.
1078
+
1079
+ These work like simple tables, but with the following differences:
1080
+
1081
+ - They must begin with a row of dashes, before the header text
1082
+ (unless the headers are omitted).
1083
+ - They must end with a row of dashes, then a blank line.
1084
+ - The rows must be separated by blank lines.
1085
+
1086
+ In multiline tables, the table parser pays attention to the widths of
1087
+ the columns, and the writers try to reproduce these relative widths in
1088
+ the output. So, if you find that one of the columns is too narrow in the
1089
+ output, try widening it in the markdown source.
1090
+
1091
+ Headers may be omitted in multiline tables as well as simple tables:
1092
+
1093
+ ----------- ------- --------------- -------------------------
1094
+ First row 12.0 Example of a row that
1095
+ spans multiple lines.
1096
+
1097
+ Second row 5.0 Here's another one. Note
1098
+ the blank line between
1099
+ rows.
1100
+ -------------------------------------------------------------
1101
+
1102
+ : Here's a multiline table without headers.
1103
+
1104
+ It is possible for a multiline table to have just one row, but the row
1105
+ should be followed by a blank line (and then the row of dashes that ends
1106
+ the table), or the table may be interpreted as a simple table.
1107
+
1108
+ **Grid tables** look like this:
1109
+
1110
+ : Sample grid table.
1111
+
1112
+ +---------------+---------------+--------------------+
1113
+ | Fruit | Price | Advantages |
1114
+ +===============+===============+====================+
1115
+ | Bananas | $1.34 | - built-in wrapper |
1116
+ | | | - bright color |
1117
+ +---------------+---------------+--------------------+
1118
+ | Oranges | $2.10 | - cures scurvy |
1119
+ | | | - tasty |
1120
+ +---------------+---------------+--------------------+
1121
+
1122
+ The row of `=`s separates the header from the table body, and can be
1123
+ omitted for a headerless table. The cells of grid tables may contain
1124
+ arbitrary block elements (multiple paragraphs, code blocks, lists,
1125
+ etc.). Alignments are not supported, nor are cells that span multiple
1126
+ columns or rows. Grid tables can be created easily using [Emacs table mode].
1127
+
1128
+ [Emacs table mode]: http://table.sourceforge.net/
1129
+
1130
+
1131
+ Title block
1132
+ -----------
1133
+
1134
+ *Docverter extension*.
1135
+
1136
+ If the file begins with a title block
1137
+
1138
+ % title
1139
+ % author(s) (separated by semicolons)
1140
+ % date
1141
+
1142
+ it will be parsed as bibliographic information, not regular text. (It
1143
+ will be used, for example, in the title of standalone LaTeX or HTML
1144
+ output.) The block may contain just a title, a title and an author,
1145
+ or all three elements. If you want to include an author but no
1146
+ title, or a title and a date but no author, you need a blank line:
1147
+
1148
+ %
1149
+ % Author
1150
+
1151
+ % My title
1152
+ %
1153
+ % June 15, 2006
1154
+
1155
+ The title may occupy multiple lines, but continuation lines must
1156
+ begin with leading space, thus:
1157
+
1158
+ % My title
1159
+ on multiple lines
1160
+
1161
+ If a document has multiple authors, the authors may be put on
1162
+ separate lines with leading space, or separated by semicolons, or
1163
+ both. So, all of the following are equivalent:
1164
+
1165
+ % Author One
1166
+ Author Two
1167
+
1168
+ % Author One; Author Two
1169
+
1170
+ % Author One;
1171
+ Author Two
1172
+
1173
+ The date must fit on one line.
1174
+
1175
+ All three metadata fields may contain standard inline formatting
1176
+ (italics, links, footnotes, etc.).
1177
+
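As a rough illustration of the title block rules above, the sketch below reads up to three `%` fields, joins continuation lines, and splits authors on semicolons. `parse_title_block` and the shape of its return value are assumptions made for this example. It is not Docverter's parser, and it leaves inline formatting untouched.

```python
def parse_title_block(lines):
    """Split a leading title block into title, authors, and date (a sketch)."""
    fields = []  # one list of physical lines per field
    for line in lines:
        if line.startswith("%") and len(fields) < 3:
            fields.append([line[1:].strip()])
        elif fields and len(fields) < 3 and line[:1] in (" ", "\t") and line.strip():
            # Continuation lines begin with leading space; the date
            # (third field) must fit on one line, so it takes none.
            fields[-1].append(line.strip())
        else:
            break
    while len(fields) < 3:
        fields.append([])
    title = " ".join(part for part in fields[0] if part)
    authors = [
        name.strip()
        for part in fields[1]
        for name in part.split(";")
        if name.strip()
    ]
    date = fields[2][0] if fields[2] else ""
    return {"title": title, "authors": authors, "date": date}
```

An empty `%` line yields an empty field, which is how a block with an author but no title is represented here.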
1178
+ Title blocks will always be parsed, but they will affect the output only
1179
+ when the `--standalone` (`-s`) option is chosen. In HTML output, titles
1180
+ will appear twice: once in the document head -- this is the title that
1181
+ will appear at the top of the window in a browser -- and once at the
1182
+ beginning of the document body. The title in the document head can have
1183
+ an optional prefix attached (`--title-prefix` or `-T` option). The title
1184
+ in the body appears as an H1 element with class "title", so it can be
1185
+ suppressed or reformatted with CSS. If a title prefix is specified with
1186
+ `-T` and no title block appears in the document, the title prefix will
1187
+ be used by itself as the HTML title.
1188
+
1189
+ The man page writer extracts a title, man page section number, and
1190
+ other header and footer information from the title line. The title
1191
+ is assumed to be the first word on the title line, which may optionally
1192
+ end with a (single-digit) section number in parentheses. (There should
1193
+ be no space between the title and the parentheses.) Anything after
1194
+ this is assumed to be additional footer and header text. A single pipe
1195
+ character (`|`) should be used to separate the footer text from the header
1196
+ text. Thus,
1197
+
1198
+ % DOCVERTER(1)
1199
+
1200
+ will yield a man page with the title `DOCVERTER` and section 1.
1201
+
1202
+ % DOCVERTER(1) Docverter User Manuals
1203
+
1204
+ will also have "Docverter User Manuals" in the footer.
1205
+
1206
+ % DOCVERTER(1) Docverter User Manuals | Version 4.0
1207
+
1208
+ will also have "Version 4.0" in the header.
1209
+
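The way a man page title line breaks down can be sketched with a regular expression. This is an illustrative reading of the rules above, not the man page writer's actual code; `parse_man_title` and the dictionary keys are hypothetical.

```python
import re

# First word, optional single-digit section in parentheses, then the rest.
TITLE_LINE = re.compile(
    r"^(?P<title>[^\s(]+)(?:\((?P<section>[0-9])\))?\s*(?P<rest>.*)$"
)


def parse_man_title(line):
    """Split a '% TITLE(n) footer | header' line into its parts (a sketch)."""
    match = TITLE_LINE.match(line.lstrip("%").strip())
    if match is None:
        return None
    # Text before the pipe is footer text; text after it is header text.
    footer, _, header = match.group("rest").partition("|")
    return {
        "title": match.group("title"),
        "section": match.group("section"),
        "footer": footer.strip(),
        "header": header.strip(),
    }
```

When no section number is given, `section` is `None`, and missing footer or header text comes back as an empty string.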
1210
+
1211
+ Backslash escapes
1212
+ -----------------
1213
+
1214
+ Except inside a code block or inline code, any punctuation or space
1215
+ character preceded by a backslash will be treated literally, even if it
1216
+ would normally indicate formatting. Thus, for example, if one writes
1217
+
1218
+ *\*hello\**
1219
+
1220
+ one will get
1221
+
1222
+ <em>*hello*</em>
1223
+
1224
+ instead of
1225
+
1226
+ <strong>hello</strong>
1227
+
1228
+ This rule is easier to remember than standard markdown's rule,
1229
+ which allows only the following characters to be backslash-escaped:
1230
+
1231
+ \`*_{}[]()>#+-.!
1232
+
1233
+ (However, if the `--strict` option is supplied, the standard
1234
+ markdown rule will be used.)
1235
+
1236
+ A backslash-escaped space is parsed as a nonbreaking space. It will
1237
+ appear in TeX output as `~` and in HTML and XML as `&#160;` or
1238
+ `&nbsp;`.
1239
+
1240
+ A backslash-escaped newline (i.e. a backslash occurring at the end of
1241
+ a line) is parsed as a hard line break. It will appear in TeX output as
1242
+ `\\` and in HTML as `<br />`. This is a nice alternative to
1243
+ markdown's "invisible" way of indicating hard line breaks using
1244
+ two trailing spaces on a line.
1245
+
1246
+ Backslash escapes do not work in verbatim contexts.
1247
+
1248
+ Smart punctuation
1249
+ -----------------
1250
+
1251
+ *Docverter extension*.
1252
+
1253
+ If the `--smart` option is specified, docverter will produce typographically
1254
+ correct output, converting straight quotes to curly quotes, `---` to
1255
+ em-dashes, `--` to en-dashes, and `...` to ellipses. Nonbreaking spaces
1256
+ are inserted after certain abbreviations, such as "Mr."
1257
+
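The dash and ellipsis substitutions can be sketched in a few lines. This is a deliberately partial illustration: `smart_dashes` is a hypothetical name, and the sketch ignores curly quotes, abbreviations, and verbatim spans, all of which a real implementation has to handle.

```python
def smart_dashes(text):
    """Convert '---', '--', and '...' to typographic characters (a sketch)."""
    # Replace the longest pattern first, so that '---' is not consumed
    # as '--' followed by a stray hyphen.
    text = text.replace("---", "\u2014")  # em-dash
    text = text.replace("--", "\u2013")   # en-dash
    text = text.replace("...", "\u2026")  # ellipsis
    return text
```

The order of the replacements is the point of the example; swapping the first two would turn every `---` into an en-dash plus a hyphen.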
1258
+ Note: if your LaTeX template uses the `csquotes` package, docverter will
1259
+ automatically detect this and use `\enquote{...}` for quoted text.
1260
+
1261
+ Inline formatting
1262
+ -----------------
1263
+
1264
+ ### Emphasis ###
1265
+
1266
+ To *emphasize* some text, surround it with `*`s or `_`, like this:
1267
+
1268
+ This text is _emphasized with underscores_, and this
1269
+ is *emphasized with asterisks*.
1270
+
1271
+ Double `*` or `_` produces **strong emphasis**:
1272
+
1273
+ This is **strong emphasis** and __with underscores__.
1274
+
1275
+ A `*` or `_` character surrounded by spaces, or backslash-escaped,
1276
+ will not trigger emphasis:
1277
+
1278
+ This is * not emphasized *, and \*neither is this\*.
1279
+
1280
+ Because `_` is sometimes used inside words and identifiers,
1281
+ docverter does not interpret a `_` surrounded by alphanumeric
1282
+ characters as an emphasis marker. If you want to emphasize
1283
+ just part of a word, use `*`:
1284
+
1285
+ feas*ible*, not feas*able*.
1286
+
1287
+
1288
+ ### Strikeout ###
1289
+
1290
+ *Docverter extension*.
1291
+
1292
+ To strike out a section of text with a horizontal line, begin and end it
1293
+ with `~~`. Thus, for example,
1294
+
1295
+ This ~~is deleted text.~~
1296
+
1297
+
1298
+ ### Superscripts and subscripts ###
1299
+
1300
+ *Docverter extension*.
1301
+
1302
+ Superscripts may be written by surrounding the superscripted text by `^`
1303
+ characters; subscripts may be written by surrounding the subscripted
1304
+ text by `~` characters. Thus, for example,
1305
+
1306
+ H~2~O is a liquid. 2^10^ is 1024.
1307
+
1308
+ If the superscripted or subscripted text contains spaces, these spaces
1309
+ must be escaped with backslashes. (This is to prevent accidental
1310
+ superscripting and subscripting through the ordinary use of `~` and `^`.)
1311
+ Thus, if you want the letter P with 'a cat' in subscripts, use
1312
+ `P~a\ cat~`, not `P~a cat~`.
1313
+
1314
+
1315
+ ### Verbatim ###
1316
+
1317
+ To make a short span of text verbatim, put it inside backticks:
1318
+
1319
+ What is the difference between `>>=` and `>>`?
1320
+
1321
+ If the verbatim text includes a backtick, use double backticks:
1322
+
1323
+ Here is a literal backtick `` ` ``.
1324
+
1325
+ (The spaces after the opening backticks and before the closing
1326
+ backticks will be ignored.)
1327
+
1328
+ The general rule is that a verbatim span starts with a string
1329
+ of consecutive backticks (optionally followed by a space)
1330
+ and ends with a string of the same number of backticks (optionally
1331
+ preceded by a space).
1332
+
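One practical consequence of the general rule is that a delimiter can always be chosen for any text: use one more backtick than the longest run inside it. The helper below is a sketch of that idea for someone generating markdown; `verbatim_span` is a hypothetical name, not a Docverter function.

```python
import re


def verbatim_span(text):
    """Wrap text in a backtick delimiter longer than any run it contains."""
    runs = re.findall(r"`+", text)
    longest = max((len(run) for run in runs), default=0)
    fence = "`" * (longest + 1)
    if text.startswith("`") or text.endswith("`"):
        # The optional space keeps the delimiter from merging with
        # backticks at the edge of the text; the parser ignores it.
        return f"{fence} {text} {fence}"
    return f"{fence}{text}{fence}"
```

For a single literal backtick this produces the double-backtick form shown earlier in this section.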
1333
+ Note that backslash-escapes (and other markdown constructs) do not
1334
+ work in verbatim contexts:
1335
+
1336
+ This is a backslash followed by an asterisk: `\*`.
1337
+
1338
+ Attributes can be attached to verbatim text, just as with
1339
+ [delimited code blocks](#delimited-code-blocks):
1340
+
1341
+ `<$>`{.haskell}
1342
+
1343
+
1344
+ Raw HTML
1345
+ --------
1346
+
1347
+ Markdown allows you to insert raw HTML (or DocBook) anywhere in a document
1348
+ (except verbatim contexts, where `<`, `>`, and `&` are interpreted
1349
+ literally).
1350
+
1351
+ The raw HTML is passed through unchanged in HTML, S5, Slidy, Slideous,
1352
+ DZSlides, EPUB,
1353
+ Markdown, and Textile output, and suppressed in other formats.
1354
+
1355
+ *Docverter extension*.
1356
+
1357
+ Standard markdown allows you to include HTML "blocks": blocks
1358
+ of HTML between balanced tags that are separated from the surrounding text
1359
+ with blank lines, and start and end at the left margin. Within
1360
+ these blocks, everything is interpreted as HTML, not markdown;
1361
+ so (for example), `*` does not signify emphasis.
1362
+
1363
+ Docverter behaves this way when `--strict` is specified; but by default,
1364
+ Docverter interprets material between HTML block tags as markdown.
1365
+ Thus, for example, Docverter will turn
1366
+
1367
+ <table>
1368
+ <tr>
1369
+ <td>*one*</td>
1370
+ <td>[a link](http://google.com)</td>
1371
+ </tr>
1372
+ </table>
1373
+
1374
+ into
1375
+
1376
+ <table>
1377
+ <tr>
1378
+ <td><em>one</em></td>
1379
+ <td><a href="http://google.com">a link</a></td>
1380
+ </tr>
1381
+ </table>
1382
+
1383
+ whereas `Markdown.pl` will preserve it as is.
1384
+
1385
+ There is one exception to this rule: text between `<script>` and
1386
+ `<style>` tags is not interpreted as markdown.
1387
+
1388
+ This departure from standard markdown should make it easier to mix
1389
+ markdown with HTML block elements. For example, one can surround
1390
+ a block of markdown text with `<div>` tags without preventing it
1391
+ from being interpreted as markdown.
1392
+
1393
+ Links
1394
+ -----
1395
+
1396
+ Markdown allows links to be specified in several ways.
1397
+
1398
+ ### Automatic links ###
1399
+
1400
+ If you enclose a URL or email address in pointy brackets, it
1401
+ will become a link:
1402
+
1403
+ <http://google.com>
1404
+ <sam@green.eggs.ham>
1405
+
1406
+
1407
+ ### Inline links ###
1408
+
1409
+ An inline link consists of the link text in square brackets,
1410
+ followed by the URL in parentheses. (Optionally, the URL can
1411
+ be followed by a link title, in quotes.)
1412
+
1413
+ This is an [inline link](/url), and here's [one with
1414
+ a title](http://fsf.org "click here for a good time!").
1415
+
1416
+ There can be no space between the bracketed part and the parenthesized part.
1417
+ The link text can contain formatting (such as emphasis), but the title cannot.
1418
+
1419
+
1420
+ ### Reference links ###
1421
+
1422
+ An *explicit* reference link has two parts, the link itself and the link
1423
+ definition, which may occur elsewhere in the document (either
1424
+ before or after the link).
1425
+
1426
+ The link consists of link text in square brackets, followed by a label in
1427
+ square brackets. (There can be space between the two.) The link definition
1428
+ must begin at the left margin or indented no more than three spaces. It
1429
+ consists of the bracketed label, followed by a colon and a space, followed by
1430
+ the URL, and optionally (after a space) a link title either in quotes or in
1431
+ parentheses.
1432
+
1433
+ Here are some examples:
1434
+
1435
+ [my label 1]: /foo/bar.html "My title, optional"
1436
+ [my label 2]: /foo
1437
+ [my label 3]: http://fsf.org (The free software foundation)
1438
+ [my label 4]: /bar#special 'A title in single quotes'
1439
+
1440
+ The URL may optionally be surrounded by angle brackets:
1441
+
1442
+ [my label 5]: <http://foo.bar.baz>
1443
+
1444
+ The title may go on the next line:
1445
+
1446
+ [my label 3]: http://fsf.org
1447
+ "The free software foundation"
1448
+
1449
+ Note that link labels are not case sensitive. So, this will work:
1450
+
1451
+ Here is [my link][FOO]
1452
+
1453
+ [Foo]: /bar/baz
1454
+
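Case-insensitive label matching amounts to normalizing labels before lookup. The sketch below shows one way to do that. The function names are hypothetical, and collapsing runs of whitespace is an extra assumption made here, not something this document states.

```python
def normalize_label(label):
    """Fold case and collapse whitespace (the latter is an assumption)."""
    return " ".join(label.split()).casefold()


def build_link_table(definitions):
    """Map normalized labels to URLs from (label, url) pairs."""
    return {normalize_label(label): url for label, url in definitions}


def resolve(table, label):
    """Look up a link label, ignoring case; None if it is undefined."""
    return table.get(normalize_label(label))
```

With a table built from `[("Foo", "/bar/baz")]`, looking up `FOO` finds the same URL, which mirrors the example above.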
1455
+ In an *implicit* reference link, the second pair of brackets is
1456
+ empty, or omitted entirely:
1457
+
1458
+ See [my website][], or [my website].
1459
+
1460
+ [my website]: http://foo.bar.baz
1461
+
1462
+ ### Internal links ###
1463
+
1464
+ To link to another section of the same document, use the automatically
1465
+ generated identifier (see [Header identifiers in HTML, LaTeX, and
1466
+ ConTeXt](#header-identifiers-in-html-latex-and-context), below).
1467
+ For example:
1468
+
1469
+ See the [Introduction](#introduction).
1470
+
1471
+ or
1472
+
1473
+ See the [Introduction].
1474
+
1475
+ [Introduction]: #introduction
1476
+
1477
+ Internal links are currently supported for HTML formats (including
1478
+ HTML slide shows and EPUB), LaTeX, and ConTeXt.
1479
+
1480
+ Images
1481
+ ------
1482
+
1483
+ A link immediately preceded by a `!` will be treated as an image.
1484
+ The link text will be used as the image's alt text:
1485
+
1486
+ ![la lune](lalune.jpg "Voyage to the moon")
1487
+
1488
+ ![movie reel]
1489
+
1490
+ [movie reel]: movie.gif
1491
+
1492
+ ### Pictures with captions ###
1493
+
1494
+ *Docverter extension*.
1495
+
1496
+ An image occurring by itself in a paragraph will be rendered as
1497
+ a figure with a caption. (In LaTeX, a figure environment will be
1498
+ used; in HTML, the image will be placed in a `div` with class
1499
+ `figure`, together with a caption in a `p` with class `caption`.)
1500
+ The image's alt text will be used as the caption.
1501
+
1502
+ ![This is the caption](/url/of/image.png)
1503
+
1504
+ If you want a regular inline image, make sure it is not
1505
+ the only thing in the paragraph. One way to do this is to insert a
1506
+ nonbreaking space after the image:
1507
+
1508
+ ![This image won't be a figure](/url/of/image.png)\
1509
+
1510
+
1511
+ Footnotes
1512
+ ---------
1513
+
1514
+ *Docverter extension*.
1515
+
1516
+ Docverter's markdown allows footnotes, using the following syntax:
1517
+
1518
+ Here is a footnote reference,[^1] and another.[^longnote]
1519
+
1520
+ [^1]: Here is the footnote.
1521
+
1522
+ [^longnote]: Here's one with multiple blocks.
1523
+
1524
+ Subsequent paragraphs are indented to show that they
1525
+ belong to the previous footnote.
1526
+
1527
+ { some.code }
1528
+
1529
+ The whole paragraph can be indented, or just the first
1530
+ line. In this way, multi-paragraph footnotes work like
1531
+ multi-paragraph list items.
1532
+
1533
+ This paragraph won't be part of the note, because it
1534
+ isn't indented.
1535
+
1536
+ The identifiers in footnote references may not contain spaces, tabs,
1537
+ or newlines. These identifiers are used only to correlate the
1538
+ footnote reference with the note itself; in the output, footnotes
1539
+ will be numbered sequentially.
1540
+
1541
+ The footnotes themselves need not be placed at the end of the
1542
+ document. They may appear anywhere except inside other block elements
1543
+ (lists, block quotes, tables, etc.).
1544
+
1545
+ Inline footnotes are also allowed (though, unlike regular notes,
1546
+ they cannot contain multiple paragraphs). The syntax is as follows:
1547
+
1548
+ Here is an inline note.^[Inline notes are easier to write, since
1549
+ you don't have to pick an identifier and move down to type the
1550
+ note.]
1551
+
1552
+ Inline and regular footnotes may be mixed freely.
1553
+
1554
+ PDF Styling
1555
+ ===========
1556
+
1557
+ Docverter PDF conversion supports all of CSS 2.1 and some of CSS 3, including
1558
+ `@font-face` and paged media. Docverter uses Flying Saucer to render HTML
1559
+ to PDF. See the [user's guide](http://flyingsaucerproject.github.com/flyingsaucer/r8/guide/users-guide-R8.html)
1560
+ for extensive details. Here are a few useful things.
1561
+
1562
+ Fonts
1563
+ -----
1564
+
1565
+ Use a `@font-face` declaration to include fonts in your stylesheet. Any fonts should be
1566
+ included in `other_files[]` as TrueType font files. For example:
1567
+
1568
+ @font-face {
1569
+ font-family: 'Arial';
1570
+ font-style: normal;
1571
+ font-weight: 400;
1572
+ src: url('arial.ttf');
1573
+ -fs-pdf-font-embed: embed;
1574
+ -fs-pdf-font-encoding: Identity-H;
1575
+ }
1576
+ body {
1577
+ font-family: 'Arial';
1578
+ }
1579
+
1580
+ **VERY IMPORTANT NOTE** You *must* include the `-fs-pdf-font-embed` and `-fs-pdf-font-encoding` properties, and they must have exactly the values shown above. In addition, the `font-family` value *must* be identical to the font family name encoded in the font file itself.
1581
+
1582
+ Page Attributes
1583
+ ---------------
1584
+
1585
+ See the W3C's [Paged Media](http://www.w3.org/TR/css3-page/) for details. A small example:
1586
+
1587
+ @page {
1588
+ size: 8.5in 11in;
1589
+ margin: 27mm;
1590
+ }
1591
+
1592
+ Headers and Footers
1593
+ -------------------
1594
+
1595
+ See the W3C's [Paged Media](http://www.w3.org/TR/css3-page/) for details. A small example:
1596
+
1597
+ h1 {
1598
+ string-set: header content();
1599
+ }
1600
+
1601
+ @page {
1602
+ @bottom-right {
1603
+ content: string(header, first);
1604
+ }
1605
+
1606
+ @bottom-left {
1607
+ content: counter(page);
1608
+ }
1609
+ }
1610
+
1611
+ This copies the contents of each `<h1>` into a string named `header`. Then, it inserts it into
1612
+ the bottom right corner of each page. It also inserts a page counter into the bottom left corner
1613
+ of each page.
1614
+
1615
+ Docverter supports both margin boxes, described above, and running elements as defined by the CSS3 spec.
1616
+
1617
+
1618
+ Authors
1619
+ =======
1620
+
1621
+ This is a copy of the Pandoc README file, modified to suit Docverter's manifest format.
1622
+
1623
+ Docverter © 2012 Pete Keen (pete@bugsplat.info) and released under the MIT license (see LICENSE)
1624
+
1625
+ Original © 2006-2011 John MacFarlane (jgm at berkeley dot edu). Pandoc
1626
+ is released under the [GPL], version 2 or greater. This software carries no warranty of
1627
+ any kind.
1628
+
1629
+ Other contributors include Recai Oktaş, Paulo Tanimoto, Peter Wang,
1630
+ Andrea Rossato, Eric Kow, infinity0x, Luke Plant, shreevatsa.public,
1631
+ Puneeth Chaganti, Paul Rivier, rodja.trappe, Bradley Kuhn, thsutton,
1632
+ Nathan Gass, Jonathan Daugherty, Jérémy Bobbio, Justin Bogner, qerub,
1633
+ Christopher Sawicki, Kelsey Hightower, Masayoshi Takahashi, Antoine
1634
+ Latter, Ralf Stephan, Eric Seidel, B. Scott Michel, Gavin Beatty,
1635
+ Sergey Astanin.
1636
+
1637
+ [markdown]: http://daringfireball.net/projects/markdown/
1638
+ [reStructuredText]: http://docutils.sourceforge.net/docs/ref/rst/introduction.html
1639
+ [S5]: http://meyerweb.com/eric/tools/s5/
1640
+ [Slidy]: http://www.w3.org/Talks/Tools/Slidy/
1641
+ [Slideous]: http://goessner.net/articles/slideous/
1642
+ [HTML]: http://www.w3.org/TR/html40/
1643
+ [HTML 5]: http://www.w3.org/TR/html5/
1644
+ [XHTML]: http://www.w3.org/TR/xhtml1/
1645
+ [LaTeX]: http://www.latex-project.org/
1646
+ [beamer]: http://www.tex.ac.uk/CTAN/macros/latex/contrib/beamer
1647
+ [ConTeXt]: http://www.pragma-ade.nl/
1648
+ [RTF]: http://en.wikipedia.org/wiki/Rich_Text_Format
1649
+ [DocBook XML]: http://www.docbook.org/
1650
+ [OpenDocument XML]: http://opendocument.xml.org/
1651
+ [ODT]: http://en.wikipedia.org/wiki/OpenDocument
1652
+ [Textile]: http://redcloth.org/textile
1653
+ [MediaWiki markup]: http://www.mediawiki.org/wiki/Help:Formatting
1654
+ [groff man]: http://developer.apple.com/DOCUMENTATION/Darwin/Reference/ManPages/man7/groff_man.7.html
1655
+ [Haskell]: http://www.haskell.org/
1656
+ [GNU Texinfo]: http://www.gnu.org/software/texinfo/
1657
+ [Emacs Org-Mode]: http://orgmode.org
1658
+ [AsciiDoc]: http://www.methods.co.nz/asciidoc/
1659
+ [EPUB]: http://www.idpf.org/
1660
+ [GPL]: http://www.gnu.org/copyleft/gpl.html "GNU General Public License"
1661
+ [DZSlides]: http://paulrouget.com/dzslides/
1662
+ [ISO 8601 format]: http://www.w3.org/TR/NOTE-datetime
1663
+ [Word docx]: http://www.microsoft.com/interop/openup/openxml/default.aspx
1664
+ [PDF]: http://www.adobe.com/pdf/