metanorma-iso 1.2.4 → 1.3.0

@@ -1,7 +0,0 @@
1
- = Customising Metanorma
2
-
3
- Refer to Metanorma adoption FAQ for advice on how to adopt the Metanorma approach
4
- to document generation for your documents.
5
-
6
- You will usually find there detailed customisation guidance prefixed with a short summary
7
- for a quick-and-dirty implementation.
data/docs/htmloutput.adoc DELETED
@@ -1,118 +0,0 @@
1
- = HTML and Word HTML Output
2
-
3
- In order to create CSS stylesheets for the HTML and Word HTML output of the Metanorma tool, it is necessary to understand the structure of the HTML it generates.
4
-
5
- == HTML
6
-
7
- === Top-Level Structure
8
-
9
- The `head` of the HTML document contains a single stylesheet (the `:htmlstylesheet` parameter of `HtmlConvert.new()`), and some brief script calls that are embedded in the Ruby code (initialising jQuery, including webfonts).
10
-
11
- The `body` of the HTML document is divided into the following parts:
12
-
13
- * A title section (`<div class="title-section">`), comprising identifying information about the document, such as appears in a title page in print.
14
- ** The section is populated with an HTML template (the `:htmlcoverpage` parameter of `HtmlConvert.new()`). The information in this section is sourced from document metadata, rather than document content proper; the gem uses http://liquidmarkup.org[Liquid Template] to populate the HTML template. Different fields usually have distinct class names for CSS styling; these can vary by gem.
15
- ** For example, ISO documents have `coverpage_docnumber` (for the document ID), `coverpage_techcommittee` (for the technical committee responsible for the document), `doctitle-en` (for the English-language title of the document), `doctitle-fr` (for the French title), `title`, `subtitle`, `part` (for the three components of the document title), and `coverpage_docstage` (for the stage of publication of the document).
16
- * A prefatory section (`<div class="prefatory-section">`), comprising boilerplate information which also does not come from document content proper. This is typically restricted to a copyright statement (`<div class="copyright">`), contact details, and a table of contents (`<div id="toc">`).
17
- ** The section is also populated with a Liquid HTML template (the `:htmlintropage` parameter of `HtmlConvert.new()`).
18
- ** The table of contents in the HTML template is a placeholder; it is populated by a table of contents script included among the scripts loaded into the HTML body.
19
- * The main section of the document (`<main class="main-section">`), which is populated with the document content.
20
- * Optionally, a colophon (`<div class="colophon">`), which is populated with boilerplate information and/or document metadata. (Currently colophons in Metanorma gems appear only in Word output.)
21
- * Scripts. These are populated from a static file (the `:scripts` parameter of `HtmlConvert.new()`). These are expected to include https://www.mathjax.org[MathJax], a Table of Contents generator, and a script for handling footnotes.
22
-
23
- === Body markup
24
-
25
- Within the body of the document, different blocks and inline spans of the Metanorma document model (https://github.com/metanorma/metanorma-model-standoc[Standoc XML], https://github.com/metanorma/basicdoc-models[BasicDoc XML]) are represented by different CSS classes, as follows:
26
-
27
- ==== Sections
28
-
29
- Symbols and abbreviated terms:: `<div class="Symbols">` (contents are a definition list)
30
- Appendix title:: `<h1 class="Annex">`
31
- Appendix, Bibliography, Introduction:: `<div class="Section3">`
32
- Introduction title:: `<h1 class="IntroTitle">`
33
- Foreword title:: `<h1 class="ForewordTitle">`
34
- Deprecated term:: `<p class="DeprecatedTerms">`
35
- Alternative term:: `<p class="AltTerms">`
36
- Primary term:: `<p class="Terms">`
37
- Term header:: `<p class="TermNum">`
38
- Document title (in body):: `<p class="zzSTDTitle1">`
39
-
40
- ==== Blocks
41
-
42
- Note:: `<div class="Note">`
43
- Note label:: `<span class="note_label">`
44
- Figure:: `<div class="figure">`
45
- Figure title:: `<span class="FigureTitle">`
46
- Example:: `<table class="example">` or `<div class="example">`
47
- Example label:: `<span class="example_label">`
48
- Sourcecode:: `<p class="Sourcecode">`
49
- Admonition:: `<div class="Admonition">`
50
- Formula:: `<div class="formula">`
51
- Blockquote:: `<div class="Quote">`
52
- Blockquote attribution:: `<p class="QuoteAttribution">`
53
- Footnote:: `<aside class="footnote">`
54
- Ordered list:: `<ol>`
55
- Unordered list:: `<ul>`
56
- Definition list:: `<dl>`
57
- Normative reference:: `<p class="NormRef">`
58
- Informative reference:: `<p class="Biblio">`
59
- Table:: `<table>`
60
- Table title:: `<p class="TableTitle">`
61
- Table head:: `<thead>`
62
- Table body:: `<tbody>`
63
- Table foot:: `<tfoot>`
64
-
65
- ==== Inline
66
-
67
- Hyperlink:: `<a>`
68
- Cross-Reference:: `<a>`
69
- Stem expression:: `<span class="stem">`
70
- Small caps:: `<span style="font-variant:small-caps;">`
71
- Emphasis:: `<i>`
72
- Strong:: `<b>`
73
- Superscript:: `<sup>`
74
- Subscript:: `<sub>`
75
- Monospace:: `<tt>`
76
- Strikethrough:: `<s>`
77
- Line Break:: `<br>`
78
- Horizontal Rule:: `<hr>`
79
- Page Break:: `<br>` (realised as page break in Word HTML)
80
-
81
- ==== Images
82
-
83
- All images for an HTML document `{filename}.html` are moved to the folder `{filename}_images`, and renamed with GUIDs. This ensures that all images are available in one location, making it easier to package the HTML output and upload it elsewhere.
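The relocation described above can be sketched in a few lines of Ruby. This is a minimal illustration only, not the gem's own code: the helper name `relocate_images` and its return value (a map from old path to new relative path) are assumptions made for the example.

```ruby
require "fileutils"
require "securerandom"

# Hypothetical helper (not part of any Metanorma gem): move each image into
# "<basename>_images" next to the HTML file, rename it with a GUID, and
# return a map from the original path to the new relative path.
def relocate_images(html_path, image_paths)
  base = File.basename(html_path, ".html")
  target_dir = File.join(File.dirname(html_path), "#{base}_images")
  FileUtils.mkdir_p(target_dir)
  image_paths.each_with_object({}) do |src, mapping|
    # Keep the original extension so browsers still recognise the file type.
    new_name = "#{SecureRandom.uuid}#{File.extname(src)}"
    FileUtils.mv(src, File.join(target_dir, new_name))
    mapping[src] = File.join("#{base}_images", new_name)
  end
end
```

The returned map is what a converter would need in order to rewrite the `src` attributes in the generated HTML.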
84
-
85
- == Word HTML
86
-
87
- === Word HTML and Word HTML CSS
88
-
89
- The Word HTML documented here is what is used by the gems to generate DOC output. For more on why Word HTML is used, instead of OOXML or HTML 5 embedded into DOCX, see https://github.com/metanorma/html2doc/wiki/Why-not-docx%3F
90
-
91
- Word HTML, and the Word HTML version of CSS, are restricted compared to the HTML and CSS you are likely familiar with. Word HTML is a subset of HTML 4; Word HTML CSS has a weakened set of selectors, and a range of Microsoft-specific extensions (prefixed with `@` or `mso-`). The weakened set of selectors means you cannot assume that classes are inherited by their children; normal CSS would apply formatting on a `div` class to its child paragraphs, but Word HTML would expect you to repeat that class definition for `p`.
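One way to cope with the lack of inheritance is to generate each rule with the class repeated for every element type that should carry it. The Ruby sketch below assumes a hypothetical helper, `word_css_rule`, and an example `Note` class with illustrative declarations; it is not taken from the gems.

```ruby
# Hypothetical helper: emit a single CSS rule whose selector list repeats
# the class for each element type, since Word HTML will not pass a div's
# class formatting down to its child paragraphs.
def word_css_rule(css_class, declarations, elements: %w[div p])
  selectors = elements.map { |el| "#{el}.#{css_class}" }.join(", ")
  body = declarations.map { |prop, value| "  #{prop}: #{value};" }.join("\n")
  "#{selectors} {\n#{body}\n}"
end

# Illustrative declarations; real values should come from a saved Word document.
puts word_css_rule("Note", { "font-size" => "10.0pt", "mso-pagination" => "widow-orphan" })
```

The child paragraphs in the generated markup then need to carry the same class (`<p class="Note">`) for the rule to apply.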
92
-
93
- Some of the necessary caveats are listed in https://github.com/metanorma/html2doc/blob/master/README.adoc. The styling of lists in particular is quite different to normal CSS, and requires a Word-specific selector to define list styles (the `:ulstyle` and `:olstyle` parameters of `WordConvert.new()`).
94
-
95
- Word HTML and its CSS are not well documented (even though there is a 1500-page manual from Microsoft); fortunately, saving Word documents to HTML will reveal the Word HTML and Word HTML CSS that can be used to generate the same formatting. The stylesheets need to follow the conventions of Word HTML, and should be formulated by saving Word documents as HTML, and extracting their CSS stylesheets. Note that the CSS is prefixed with a set of font definitions; these too should be obtained by saving Word documents as HTML.
96
-
97
- === Top-Level Structure
98
-
99
- The headers and footers of a Word document are defined in Word HTML in a separate file, `header.html` (the `:header` parameter of `WordConvert.new()`), which is included in the file manifest for the document. The `header.html` file is cross-referenced to the Word HTML CSS file, and contains a separate `div` for each header and footer type; refer to the instances in the gems for illustration.
100
-
101
- The `head` of the Word HTML document contains two stylesheets (the `:wordstylesheet` and `:standardsheet` parameters of `WordConvert.new()`). The `:wordstylesheet` is intended as generic Word markup, while `:standardsheet` is intended to contain styling specific to the standard. No scripts are supported in Word HTML.
102
-
103
- The other elements of the Word HTML head are populated by the https://github.com/metanorma/html2doc[html2doc gem]: a reference to a manifest of included files (specifically images and the header file), and settings to open the document in Print View at 100% magnification.
104
-
105
- The `body` of the Word HTML document is divided into the following parts:
106
-
107
- * A title section (`<div class="WordSection1">`), comprising identifying information about the document, such as appears in a title page in print.
108
- ** The section is populated with an HTML template (the `:wordcoverpage` parameter of `WordConvert.new()`). As with HTML, the information in this section is sourced from document metadata, rather than document content proper; and the gem uses http://liquidmarkup.org[Liquid Template] to populate the HTML template.
109
- * A prefatory section (`<div class="WordSection2">`), comprising boilerplate information which does not come from document content proper (such as a Table of Contents shell), as well as prefatory material from the document content. The prefatory section is set in the CSS stylesheet to have Roman numerals for its pagination.
110
- ** Because of the requirement for Roman numerals, prefatory material from the document is sent to this section, whereas all document content in the HTML document is sent to the main section.
111
- * The main section of the document (`<div class="WordSection3">`), which is populated with the remaining document content. The main section is set in the CSS stylesheet to have Arabic numerals for its pagination.
112
- * Optionally, a colophon (`<div class="colophon">`), which is populated with boilerplate information and/or document metadata.
113
-
114
- === Body markup
115
-
116
- With the exception of the top-level document sections, discussed above, the Word HTML generated by the gem uses the same CSS classes as the HTML proper. As already noted, the quirks of Word HTML CSS mean that classes need to be repeated on descendant elements, which is not required in normal CSS.
117
-
118
- The handling of footnotes and comments in Word HTML uses idiosyncratic Word HTML markup, including custom CSS, and is generated separately from their HTML counterparts in the gems.
@@ -1,57 +0,0 @@
1
- = How can I localize the resulting output?
2
-
3
- [TIP]
4
- ====
5
- * Copy the `lib/isodoc/i18n-en.yaml` file from the isodoc gem to your gem.
6
- * Edit the right-hand text in the file.
7
- * Give the file location as the `i18nyaml` document attribute in any files in which you wish to use your localisation.
8
- ====
9
-
10
- Every piece of text generated by the toolset instead of the author is looked up in an internationalisation file; that means that if the language setting for the document changes, and there is an internationalisation file for that language, all output is localised to that language. Of the existing gems, metanorma-gb is localised in this way for English and Chinese, and metanorma-iso is localised for English, French and Chinese.
11
-
12
- The localisation files are http://yaml.org[YAML] files stored in `lib/isodoc/`, named `i18n-{languagecode}.yaml`. (In the case of Chinese, the script code is added to the filename: `i18n-zh-Hans.yaml`.) Most localised texts are direct mappings from English metalanguage to the target language (including English itself); there are also instances of hashes in the YAML files. Most localisation text consists of one- or two-word labels, such as "Figure" or "Annex"; some boilerplate text is also included in the localisation text, such as the ISO text describing the use of external sources in Terms and Definitions.
13
-
14
- Localisation files are mostly used for translation, but they can also be used to customise the rendering of particular labels in English. For example, the default English label for a first-level supplementary section is "Annex", reflecting ISO practice; but in the metanorma-sample gem, as seen above, this label is overruled in code to be "Appendix" instead.
15
-
16
- The YAML files are read into the `IsoDoc` classes through the `i18n_init()` method of `IsoDoc::...::HtmlConvert` and `IsoDoc::...::WordConvert`. The localisation equivalents for the nominated language are read from the corresponding YAML file into the `@labels` hash. The base IsoDoc implementation of `i18n_init()` also assigns an instance variable for each label (e.g. `@annex_lbl` for English "Annex"). These instance variables are used to generate all automated text in the IsoDoc classes.
17
-
18
- All current gems inherit their localisation files from the base isodoc gem. A local `i18n_init()` method can overwrite individual labels in code (metanorma-csd), or it can read in an additional local YAML file for the same language (metanorma-gb). If you are implementing a completely new language, you will need to replace the base `i18n_init()` method rather than inheriting from it, to ensure that the local YAML files are read in.
19
-
20
- The foregoing describes how to incorporate localisation into your gem on a permanent basis; but the toolset also allows you to nominate a YAML localisation file just for the current document. In AsciiDoc, the YAML file is nominated as the `i18nyaml` document attribute; for IsoDoc, it is passed in as the `i18nyaml` hash attribute to the initialisation method. You will still need to consult the base IsoDoc YAML files, to make sure that all necessary labels are given in your YAML document.
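The layering of labels can be sketched as a plain hash merge in Ruby. This is a simplified illustration, not the isodoc implementation: the helper `merged_labels` is hypothetical, and the inline YAML strings stand in for the base `i18n-en.yaml` file and a document-specific file, with illustrative keys and values.

```ruby
require "yaml"

# Stand-in for the base labels shipped with the toolset (illustrative keys).
BASE_YAML = <<~YAML
  annex: Annex
  clause: Clause
  figure: Figure
YAML

# Stand-in for a document-specific file nominated through `i18nyaml`.
LOCAL_YAML = <<~YAML
  annex: Appendix
  clause: Paragraph
YAML

# Hypothetical helper: local labels win, base labels fill any gaps.
def merged_labels(base_yaml, local_yaml)
  YAML.safe_load(base_yaml).merge(YAML.safe_load(local_yaml))
end

labels = merged_labels(BASE_YAML, LOCAL_YAML)
labels.each { |key, text| puts "#{key}: #{text}" }
```

Merging over the base labels, rather than replacing them, is what keeps a partial local file from leaving any label undefined.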
21
-
22
- === Example internationalisation code
23
-
24
- * metanorma-mpfd/lib/isodoc/mpfd/i18n-en.yaml: customisation of clause label in YAML
25
-
26
- [source]
27
- --
28
- clause: Paragraph
29
- --
30
-
31
- * metanorma-m3d/lib/isodoc/m3d/m3dhtmlconvert.rb: customisation of annex label as instance variable
32
-
33
- [source,ruby]
34
- --
- def i18n_init(lang, script)
-   super
-   # Override the inherited English label for annexes.
-   @annex_lbl = "Appendix"
- end
39
- --
40
-
41
- * metanorma-gb/lib/isodoc/gb/gbhtmlconvert.rb: code to read in internationalisation YAML templates (merges the superclass `@labels` map, derived from the parent `IsoDoc::HtmlConvert` class, with the labels read in from the GB-specific YAML templates).
42
-
43
- [source,ruby]
44
- --
- def i18n_init(lang, script)
-   super
-   # Pick the GB-specific label file for the requested language and script.
-   y = if lang == "en"
-         YAML.load_file(File.join(File.dirname(__FILE__), "i18n-en.yaml"))
-       elsif lang == "zh" && script == "Hans"
-         YAML.load_file(File.join(File.dirname(__FILE__),
-                                  "i18n-zh-Hans.yaml"))
-       else
-         # Any other language falls back to the Simplified Chinese labels.
-         YAML.load_file(File.join(File.dirname(__FILE__), "i18n-zh-Hans.yaml"))
-       end
-   # GB-specific labels override those inherited from the superclass.
-   @labels = @labels.merge(y)
- end
57
- --
data/docs/outputs.adoc DELETED
@@ -1,42 +0,0 @@
1
- = Outputs
2
-
3
- The metanorma toolset currently outputs documents in four formats.
4
-
5
- == Metanorma XML
6
-
7
- The Metanorma XML output is the intermediate format which marks up the semantic content of the standards document, and is
8
- used to drive the other formats. The Metanorma XML file is also the file which is used for validation of the standards
9
- document: line numbers in the validation output refer to this file.
10
-
11
- == HTML
12
-
13
- The HTML output is in HTML 5. It has optional Data-URI encoding of local images; if images in the output are not Data-URI encoded,
14
- they are moved to a folder called `{filename}_images`, and renamed with GUID names, to prevent collisions. Audio and video files are
15
- not supported. All HTML output has a sidebar with a JavaScript-generated Table of Contents, which is two section levels deep.
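Data-URI encoding replaces an image reference with the image bytes themselves, Base64-encoded inline. A minimal Ruby sketch of the idea follows; the helper `data_uri` and the `MIME_TYPES` table are assumptions made for the example and are not the toolset's own code.

```ruby
# Illustrative lookup table; extend as needed.
MIME_TYPES = {
  ".png" => "image/png",
  ".jpg" => "image/jpeg",
  ".jpeg" => "image/jpeg",
  ".gif" => "image/gif",
  ".svg" => "image/svg+xml"
}.freeze

# Hypothetical helper: build a data: URI from raw image bytes and a file
# extension. pack("m0") is core Ruby's Base64 encoding without line breaks.
def data_uri(bytes, extension)
  mime = MIME_TYPES.fetch(extension.downcase) do
    raise ArgumentError, "unsupported image type: #{extension}"
  end
  "data:#{mime};base64,#{[bytes].pack('m0')}"
end

# Usage: inline a file's contents, e.g. as the src attribute of an <img>.
# src = data_uri(File.binread("logo.png"), File.extname("logo.png"))
```

Inlining makes the HTML file self-contained at the cost of a larger file, which is why the separate `{filename}_images` folder remains the alternative.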
16
-
17
- == PDF
18
-
19
- PDF output, for those standards gems that support it, is generated from the HTML output via Google Puppeteer (which runs in Node.js).
20
- The PDF output generation takes advantage of the print mode in the HTML CSS stylesheet, so much of the browser-like styling of the HTML
21
- is rendered as a more print-like document. Because it is generated from HTML, the PDF output does not support page numbers in its
22
- Table of Contents. Nor does it support advanced paragraph formatting, such as Keep With Next or Widow/Orphan control.
23
-
24
- == Word
25
-
26
- The Word output is produced in DOC format rather than DOCX (i.e. the pre-2007 version of Word), and it is generated using the
27
- Microsoft Office flavour of HTML 4, as a Multipart HTML Word Document (MHT). (This is a MIME-encoded counterpart to the HTML obtained
28
- when you save a Word document as HTML.)
29
-
30
- Using DOC HTML makes it much easier to generate documents with
31
- the advanced formatting requirements of Metanorma (including complex tables, formulas, footnotes, headers and footers,
32
- nested list numbering and crossreferences) than generating either native DOCX (in OOXML), or the DOCX flavour of MHT. For more
33
- on the choice to use DOC, see https://github.com/metanorma/html2doc/wiki/Why-not-docx%3F
34
-
35
- Using DOC, however, imposes some constraints.
36
-
37
- * SVG images are not supported. (Word internally converts them into PNG files to render them in Word HTML.)
38
- * DOC files are a legacy format of Word.
39
- * DOC files cannot be processed by Pages or LibreOffice: they can only be processed by Microsoft Word. To open the Word output in LibreOffice in particular, you will need to convert the DOC file to a DOCX file, taking the following steps.
40
- ** Open with MS Word
41
- ** Save it once by changing the Extension to `.doc`. (When it asks you to overwrite, say Yes.)
42
- ** "Save As" a `.docx` file.
@@ -1,15 +0,0 @@
1
- = I can translate my specifications into IsoDoc XML myself (i.e. I don't like AsciiDoc, or I already have my own toolchain). Can I only use IsoDoc XML to produce pretty output?
2
-
3
- [TIP]
4
- ====
5
- * Generate correct IsoDoc XML (make sure it validates!)
6
- * Create just the `IsoDoc::...::HtmlConvert` and/or `IsoDoc::...::WordConvert` classes to convert the IsoDoc XML into target formats.
7
- * Initialise the IsoDoc class passing the necessary information about fonts and scripts; the existing gems all illustrate this kind of initialisation.
8
- * Create the target format using the method `.convert(filename, xml)`.
9
- ====
10
-
11
- The Asciidoctor-to-XML and XML-to-Output classes are separate, so you can invoke just the latter without the former. Of course, you will need to make sure that the IsoDoc XML you are passing to the generators is valid.
12
-
13
- The `IsoDoc::...::HtmlConvert` and/or `IsoDoc::...::WordConvert` classes are initialised in the existing gems with a hash giving the fonts to be used in the document (to be injected in the document SCSS stylesheets), the script of the document (to be used to pick the right font, in case of default font settings), and the `i18nyaml` YAML file for localisation. All existing gems have defaults set for these values on the Asciidoctor side invoking the class, so all parameters are optional.
14
-
15
- Once you have the classes set up, all you need to do is invoke the conversion of XML to the target format, with the method `.convert(filename, xml)`, where `xml` is the IsoDoc XML.
@@ -1,38 +0,0 @@
1
- = How can I style the resulting Microsoft Word output?
2
-
3
- [TIP]
4
- ====
5
- * There is no quick way of doing this.
6
- * Everything you can do in Word, you can do in Word HTML. Save Word documents as Word HTML to see how.
7
- * Clone the metanorma-sample gem: https://github.com/metanorma/metanorma-sample.
8
- * Edit the `word_sample_titlepage.html` and `word_sample_intro.html` pages to match your organisation's branding. Expect many iterations of saving Word documents as HTML, by trial and error.
9
- ** Leave the Liquid Template instructions alone (`{{`, `{%`) unless you know what you're doing with them: they are how the pages are populated with metadata.
10
- * Edit the `default_fonts()` method in your `IsoDoc::...::WordConvert` class, to match your desired fonts.
11
- * Edit the `default_file_locations()` method in your `IsoDoc::...::WordConvert` class, to match your desired stylesheets and file templates.
12
- * Edit the `wordstyle.scss` and `sample.scss` stylesheets to match your organisation's branding. Expect many iterations of saving Word documents as HTML, by trial and error.
13
- ====
14
-
15
- Word output in the document toolset is generated through Word HTML, the variant of HTML that you get when you save a Word document as HTML. (That is why documents are saved in `.doc`, not `.docx`.) This has the advantage over https://en.wikipedia.org/wiki/Office_Open_XML[OOXML], the native markup of DOCX, of using a well-known markup language, with a low barrier to entry: if you want to work out how to do something in Word HTML, do it in Word, save the document as HTML, and open up the HTML in a text editor. (For more on the choice of using Word HTML, see https://github.com/metanorma/html2doc/wiki/Why-not-docx%3F.)
16
-
17
- However, Word HTML is not quite the HTML you are used to: it is a restricted, syntactically idiosyncratic variant of HTML 4, with a non-standard and weakened form of CSS. Doing any styling in Word HTML involves lots of trial and error, and paying close attention to how Word HTML does things in its CSS. We have documented a few of the clearer gotchas in https://github.com/metanorma/html2doc/blob/master/README.adoc.
18
-
19
- It's still better than learning OOXML.
20
-
21
- The process for generating Word output is fairly similar to that for generating HTML, since both processes are generating a form of HTML; as we already noted, the two processes share a substantial amount of code. The main differences are in the handling of page-media features that CSS has lagged in (footnotes, pagination, headers and footers), and in the styling of lists, for which Word HTML uses custom (and undocumented) CSS classes prefixed with `@`, specifying inter alia the numbering for nine levels of nesting of the same list.
22
-
23
- * Styling information is stored in the `.../lib/isodoc/html` folder of the gem, and applies to both Word and HTML content. For Word content, the relevant files are `word_..._titlepage.html` (title page HTML template), `word_..._intro.html` (introductory HTML template, typically restricted to Table of Contents), `wordstyle.scss` and `{name_of_standard}.scss` (the Word stylesheets), and `header.html` (document headers, footers, and endnote/footnote separators, referenced from the stylesheets).
24
- * The styling files to be loaded in are set in the `default_file_locations()` method of `IsoDoc::...::WordConvert`.
25
- * As with HTML generation, additional files (e.g. logos) can be loaded in the `initialize()` method of `IsoDoc::...::WordConvert`. The `initialize()` method also sets the `@` styles in the stylesheet to be used for unordered and ordered lists; a single such style is intended to capture the behaviour of all levels of indentation.
26
- * As with HTML output, the HTML templates are populated through Liquid Templates: variables in `{{` correspond to the hash keys for metadata extracted in `IsoDoc::...::Metadata`, and its superclass `IsoDoc::Metadata` in the isodoc gem.
27
- * As with HTML, the SCSS stylesheets treat fonts as variables, and are set in the `default_fonts()` method of `IsoDoc::...::WordConvert`.
28
- * Document headers and footers are set in the `header.html` file. This is also an HTML template, which is populated with metadata attributes through Liquid Template. The structure of `header.html` is determined by Word, and elements of `header.html` need to be crossreferenced from the Word stylesheet. To inspect Word `header.html` files, save a Word document as HTML, and look inside the `{document_name}.fld` folder generated alongside the HTML output.
29
- * The classes in the SCSS stylesheet correspond to static HTML content in the HTML templates, and dynamic HTML content in the `IsoDoc::...::WordConvert` class, and its superclasses `IsoDoc::WordConvert` and `IsoDoc::Common` in the isodoc gem.
30
-
31
- A Word HTML document is populated as follows:
32
- * HTML Head wrapper (in `IsoDoc::WordConvert`)
33
- ** `@wordstylesheet` CSS stylesheet (generated from SCSS through the `generate_css()` method of `IsoDoc::WordConvert`); corresponds to `wordstyle.scss`.
34
- ** `@standstylesheet` CSS stylesheet (generated from SCSS through the `generate_css()` method of `IsoDoc::WordConvert`); intended to override any generic CSS in `@wordstylesheet`. Optional, corresponds to `{name_of_standard}.scss`.
35
- * HTML Body
36
- ** `@wordcoverpage` HTML template (optional, corresponds to `word_..._titlepage.html`). Included in `<div class=WordSection1>`.
37
- ** `@wordintropage` HTML template (optional, corresponds to `word_..._intro.html`). Included in `<div class=WordSection2>`. In the existing gems, WordSection2 is paginated with Roman numerals.
38
- ** Document proper (converted from Standoc XML). Included in `<div class=WordSection2>` (prefatory material) and `<div class=WordSection3>` (main document). In the existing gems, WordSection3 is paginated with Arabic numerals.
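The body layout listed above can be pictured with a short Ruby sketch. It only wraps already-rendered HTML fragments in the three Word sections; the helper `word_body` and its keyword arguments are hypothetical and do not reflect how `IsoDoc::WordConvert` is actually structured.

```ruby
# Hypothetical helper: place rendered fragments into the three Word sections.
# WordSection1 holds the cover page, WordSection2 the intro template plus the
# document's prefatory material, and WordSection3 the main document content.
def word_body(coverpage:, intropage:, preface:, main:)
  <<~HTML
    <div class="WordSection1">#{coverpage}</div>
    <div class="WordSection2">#{intropage}#{preface}</div>
    <div class="WordSection3">#{main}</div>
  HTML
end

puts word_body(
  coverpage: "<p>Title page</p>",
  intropage: "<p>Table of contents</p>",
  preface: "<p>Foreword</p>",
  main: "<p>1 Scope</p>"
)
```

Keeping prefatory material in its own section is what lets the stylesheet give it separate page numbering from the main document.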