asciidoctor-iso 0.0.1

@@ -0,0 +1,15 @@
1
+ # This project follows the Ribose OSS style guide.
2
+ # https://github.com/riboseinc/oss-guides
3
+ # All project-specific additions and overrides should be specified in this file.
4
+
5
+ inherit_from:
6
+ # Thoughtbot's style guide from: https://github.com/thoughtbot/guides
7
+ - ".rubocop.tb.yml"
8
+ # Overrides from Ribose
9
+ - ".rubocop.ribose.yml"
10
+ AllCops:
11
+ DisplayCopNames: false
12
+ StyleGuideCopsOnly: false
13
+ TargetRubyVersion: 2.4
14
+ Rails:
15
+ Enabled: true
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source "https://rubygems.org"
2
+
3
+ # Specify your gem's dependencies in asciidoctor-iso.gemspec
4
+ gemspec
@@ -0,0 +1,96 @@
1
+ # asciidoctor-iso
2
+ = Asciidoctor processor for ISO standards
3
+
4
+ image:https://img.shields.io/gem/v/asciidoctor-iso.svg["Gem Version", link="https://rubygems.org/gems/asciidoctor-iso"]
5
+ image:https://img.shields.io/travis/riboseinc/asciidoctor-iso/master.svg["Build Status", link="https://travis-ci.org/riboseinc/asciidoctor-iso"]
6
+ image:https://codeclimate.com/github/riboseinc/asciidoctor-iso/badges/gpa.svg["Code Climate", link="https://codeclimate.com/github/riboseinc/asciidoctor-iso"]
7
+
8
+ This gem processes http://asciidoctor.org/[Asciidoctor] documents and outputs an XML representation of the document, intended as a document model for ISO International Standards. The XML representation can then be processed in turn to generate PDF or Microsoft Word output (via DocBook).
9
+
10
+ The document model intends to introduce rigour into the ISO standards authoring process; the existing https://www.iso.org/iso-templates.html[Microsoft Word template from ISO] does not support such rigour down to the element level. The ISO International Standard format is prescribed in http://www.iec.ch/members_experts/refdocs/iec/isoiecdir-2%7Bed7.0%7Den.pdf[ISO/IEC DIR 2 "Principles and rules for the structure and drafting of ISO and IEC documents"], to a level that is amenable to an explicit document model. A formal document model would allow checking for consistency in format and content, and expedite authoring and quality control of ISO standards.
11
+
12
+ The document model ("ISO XML") is under development, but it already contains all the markup needed to render the https://www.iso.org/publication/PUB100407.html["Rice document"], the ISO's model document of an international standard. It is expressed as a link:lib/asciidoctor/iso/validate.rnc[Relax NG Compact schema]; actual validation occurs against its link:lib/asciidoctor/iso/validate.rng[full Relax NG counterpart]. A UML representation of the document model is given below. Note that the document model is currently still in the exploratory phase, and will likely be changing significantly.
13
+
14
+ Asciidoctor has been selected as the authoring tool to generate the document model representation of ISO standards. It is a document formatting tool like Markdown and DocBook, which combines the relative ease of use of the former (using relatively lightweight markup) with the rigour and expressiveness of the latter (it has a well-defined syntax, and was in fact initially developed as a DocBook document authoring tool). Asciidoctor has built-in capability to output Text, DocBook and HTML, so it can be used to preview the file as it is being authored.
15
+
16
+ Note that in order to generate output close to what is intended, the Asciidoc document includes a fair amount of formatting instructions (e.g. disabling section numbering where appropriate, the titling of Appendixes as Annexes), as well as ISO boilerplate text, and predefined section headers (sections are recognised by fixed titles such as `Normative References`). Authoring ISO standards in this fashion assumes that users will be populating an Asciidoc template, and not removing needed formatting instructions.
17
+
18
+ == Features not visible in HTML preview
19
+
20
+ The gem uses built-in Asciidoc formatting as much as possible, so that users can retain the ability to preview documents; for _Terms and Definitions_ clauses, which have a good deal of explicit structure, macros have been introduced for semantic markup (admitted terms, deprecated terms, etc). The default HTML output of an Asciidoc-formatted ISO document is quite close to the intended final output, with the following exceptions:
21
+
22
+ * _Terms and Definitions_: Each term is marked up as an unnumbered subclause; the semantic markup of alternate and other terms is not rendered visually.
23
+ * _Formulas_: Asciidoctor has no provision for the automated numbering of isolated block formulas ("stem"), and its default HTML processor does not display the number assigned to a block formula, although it does provide automated numbering of examples. The encoding of formulas may change in future versions, although the final numbering is meant to be provided by downstream tools processing the ISO XML output.
24
+ * _Missing elements_: The document model does not yet include Asciidoc elements that do not occur in the Rice document: in particular, source code, definition lists (except when used as keys for formulas or figures), examples (as distinct from figures; examples within _Terms and Definitions_ are catered for), sidebars (as distinct from warnings), and quotes.
25
+ * _Markup_: Some connecting text which is used to convey markup structure is left out: in particular, `DEPRECATED` and `SOURCE` (replaced by formatting macros).
26
+ * _Tables_: Table footnotes are treated like all other footnotes: they are rendered at the bottom of the document, rather than the bottom of the table, and they are not numbered separately.
27
+ * _Crossreferences_: Footnoted crossreferences are indicated with the reference text `fn` in isolation, or `fn:` as a prefix to the reference text. The default HTML processor leaves these as is: if no reference text is given, only `fn` will be displayed (though it will still hyperlink to the right reference).
28
+ * _References_: The convention for references is that ISO documents are cited without brackets by ISO number, and optionally year, whether they are normative or in the bibliography (e.g. `ISO 20483:2013`); while all other references are cited by bracketed number in the bibliography (e.g. `[1]`). The default HTML processor treats all references the same, and will bracket them (e.g. `[ISO 20483:2013]`). For the same reason, ISO references listed in the bibliography will be listed under an ISO reference, rather than a bracketed number.
29
+ * _References_: References are rendered as cited throughout, since they are automated. For that reason, if reference is to be made to both an undated and a dated version of an ISO reference, these need to be explicitly listed as separate references. (This is not done in the Rice model document, which lists ISO 6646, but under _Terms and Definitions_ cites the dated ISO 6646:2011.)
30
+ * _References_: ISO references that are undated but published have their date indicated under the ISO standards format in an explanatory footnote. Because of constraints introduced by Asciidoctor, that explanation is instead given in square brackets in Asciidoc format.
31
+ * _Annexes_: Subheadings cannot preserve subsection numbering, while also appearing inline with their text (e.g. Rice document, Annex B.2): they appear as headings in separate lines.
32
+ * _Annexes_: Crossreferences to Annex subclauses are automatically prefixed with `Clause` rather than `Annex` or nothing.
33
+ * _Metadata_: Document metadata such as document numbers, technical committees and title wording are not rendered in the default HTML output.
34
+ * _Patent Notice_: Patent notices are treated and rendered as a subsection of the introduction, with an explicit subheading.
35
+ * _Numbering_: The numbering of figures and tables is sequential in the default HTML processor: it does not include the Clause or Annex number. Thus, _Figure 1_, not _Figure A.1_.
36
+ * _Notes_: There is no automatic note numbering by the default HTML processor.
37
+ * _Keys_: Keys to formulas and figures are expected to be marked up as definition lists consistently, rather than as inline prose.
38
+ * _Figures_: Simple figures are marked up as images, figures containing subfigures as examples. Numbering by the default HTML processor may be inconsistent. Subfigures are automatically numbered as independent figures.
39
+ * _Markup_: The default HTML processor does not support CSS extensions such as small caps or strike through, though these can be marked up as CSS classes through custom macros in Asciidoc: a custom CSS stylesheet will be needed to render them.
40
+
41
+ TODO: May need to only encode figures as examples.
42
+
43
+ == Document Attributes
44
+
45
+ The gem also relies on Asciidoc document attributes to provide necessary metadata about the document. These include:
46
+
47
+ `:docnumber:`:: The ISO document number (mandatory)
48
+ `:tc-docnumber:`:: The document number assigned by the technical committee
49
+ `:ref-docnumber:`:: The reference document number (appearing in page headers)
50
+ `:partnumber:`:: The ISO document part number
51
+ `:edition:`:: The document edition
52
+ `:revdate:`:: The date the document was last updated
53
+ `:copyright-year:`:: The year which will be claimed as when the copyright for the document was issued
54
+ `:title-intro-en:`:: The introductory component of the English title of the document
55
+ `:title-main-en:`:: The main component of the English title of the document (mandatory). (The first line of the Asciidoc document, which contains the title introduced with `=`, is ignored)
56
+ `:title-part-en:`:: The English title of the document part
57
+ `:title-intro-fr:`:: The introductory component of the French title of the document. (This document template presupposes authoring in English; a different template will be needed for French, including French titles of document components such as annexes.)
58
+ `:title-main-fr:`:: The main component of the French title of the document (mandatory).
59
+ `:title-part-fr:`:: The French title of the document part
60
+ `:doctype:`:: The document type (see https://www.iso.org/deliverables-all.html[ISO deliverables: The different types of ISO publications]) (mandatory). The permitted types are: `international-standard, technical-specification, technical-report, publicly-available-specification, international-workshop-agreement, guide`.
61
+ `:docstage:`:: The stage code for the document status (see https://www.iso.org/stage-codes.html[International harmonized stage codes])
62
+ `:docsubstage:`:: The substage code for the document status (see https://www.iso.org/stage-codes.html[International harmonized stage codes])
63
+ `:secretariat:`:: The national body acting as the secretariat for the document in the drafting stage
64
+ `:technical-committee-number:`:: The number of the relevant ISO technical committee
65
+ `:technical-committee:`:: The name of the relevant ISO technical committee (mandatory)
66
+ `:subcommittee-number:`:: The number of the relevant ISO subcommittee
67
+ `:subcommittee:`:: The name of the relevant ISO subcommittee
68
+ `:workgroup-number:`:: The number of the relevant ISO workgroup
69
+ `:workgroup:`:: The name of the relevant ISO workgroup
70
+ `:language:`:: The language of the document (`en` or `fr`) (mandatory)
71
+
72
+ The gem translates the document into ISO XML format, and then validates its output against the ISO XML document model; errors are reported to console against the XML, and are intended for users to check that they have provided all necessary components of the document.
73
+
74
+ The attribute `:draft:`, if present, includes review notes in the XML output; these are otherwise suppressed.
75
+
76
+ == Usage
77
+ [source,console]
78
+ ----
79
+ $ asciidoctor a.adoc # HTML output of Asciidoc file
80
+ $ asciidoctor -b iso -r 'asciidoctor-iso' a.adoc # ISO XML output
81
+ ----
82
+
83
+ == Document model
84
+
85
+ image::grammar1.gif[]
86
+ image::grammar2.gif[]
87
+ image::grammar3.gif[]
88
+ image::grammar4.gif[]
89
+
90
+
91
+ == Examples
92
+ The gem has been tested to date against the https://www.iso.org/publication/PUB100407.html["Rice document"], the ISO's model document of an international standard. This repository includes:
93
+
94
+ * the link:spec/examples/rice.adoc[Asciidoc version of the Rice document].
95
+ * the link:spec/examples/rice.html[Asciidoc rendering of the Rice document as HTML].
96
+ * the link:spec/examples/rice.xml[ISO XML rendering of the Rice document].
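The README above marks six document attributes as mandatory and lists the permitted `:doctype:` values. As a rough illustration of what that implies for authors, the following sketch checks an Asciidoc header for those attributes before conversion. It is not part of the gem: `MANDATORY_ATTRIBUTES`, `PERMITTED_DOCTYPES`, `header_attributes` and `attribute_problems` are hypothetical names, the sample values are placeholders, and the gem itself reports such problems through schema validation of the XML output.

```ruby
# Sketch only: these names are hypothetical and not part of asciidoctor-iso.
# The attribute and doctype lists are taken from the README.
MANDATORY_ATTRIBUTES = %w[
  docnumber title-main-en title-main-fr doctype technical-committee language
].freeze

PERMITTED_DOCTYPES = %w[
  international-standard technical-specification technical-report
  publicly-available-specification international-workshop-agreement guide
].freeze

# Collect `:name: value` attribute entries from an Asciidoc source string.
def header_attributes(source)
  source.each_line.with_object({}) do |line, attrs|
    match = /\A:(?<name>[\w-]+):\s*(?<value>.*)\z/.match(line.chomp)
    attrs[match[:name]] = match[:value] if match
  end
end

# Return a list of human-readable problems; empty when the header looks fine.
def attribute_problems(source)
  attrs = header_attributes(source)
  problems = MANDATORY_ATTRIBUTES.reject { |name| attrs.key?(name) }.
    map { |name| "missing :#{name}:" }
  doctype = attrs["doctype"]
  if doctype && !PERMITTED_DOCTYPES.include?(doctype)
    problems << "unknown doctype #{doctype}"
  end
  problems
end

# Placeholder header: the French title and technical committee are left out.
sample = <<~ADOC
  = Example title
  :docnumber: 0000
  :title-main-en: Example title
  :doctype: international-standard
  :language: en
ADOC

puts attribute_problems(sample)
```

A pre-flight check like this only catches missing header entries; structural problems in the body would still surface through the Relax NG validation described in the README.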
@@ -0,0 +1,44 @@
1
+ # coding: utf-8
2
+
3
+ lib = File.expand_path("../lib", __FILE__)
4
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
5
+ require "asciidoctor/iso/version"
6
+
7
+ Gem::Specification.new do |spec|
8
+ spec.name = "asciidoctor-iso"
9
+ spec.version = Asciidoctor::ISO::VERSION
10
+ spec.authors = ["Ribose Inc."]
11
+ spec.email = ["open.source@ribose.com"]
12
+
13
+ spec.summary = "asciidoctor-iso lets you write ISO standards in AsciiDoc."
14
+ spec.description = <<~DESCRIPTION
15
+ asciidoctor-iso lets you write ISO standards in AsciiDoc syntax.
16
+
17
+ This gem is in active development.
18
+ DESCRIPTION
19
+
20
+ spec.homepage = "https://github.com/riboseinc/asciidoctor-iso"
21
+ spec.license = "MIT"
22
+
23
+ spec.bindir = "bin"
24
+ spec.require_paths = ["lib"]
25
+ spec.files = `git ls-files`.split("\n")
26
+ spec.test_files = `git ls-files -- {spec}/*`.split("\n")
27
+ spec.required_ruby_version = Gem::Requirement.new(">= 2.3.0")
28
+
29
+ spec.add_dependency "asciidoctor", "~> 1.5.6"
30
+ spec.add_dependency "htmlentities", "~> 4.3.4"
31
+ spec.add_dependency "nokogiri", "~> 1.8.1"
32
+ spec.add_dependency "thread_safe"
33
+
34
+ spec.add_development_dependency "bundler", "~> 1.15"
35
+ spec.add_development_dependency "byebug", "~> 9.1"
36
+ spec.add_development_dependency "equivalent-xml", "~> 0.6"
37
+ spec.add_development_dependency "guard", "~> 2.14"
38
+ spec.add_development_dependency "guard-rspec", "~> 4.7"
39
+ spec.add_development_dependency "rake", "~> 12.0"
40
+ spec.add_development_dependency "rspec", "~> 3.6"
41
+ spec.add_development_dependency "rubocop", "~> 0.50"
42
+ spec.add_development_dependency "simplecov", "~> 0.15"
43
+ spec.add_development_dependency "timecop", "~> 0.9"
44
+ end
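The gemspec pins most dependencies with the pessimistic operator (`~>`). As a brief illustration of what those constraints admit, the sketch below evaluates two of them with RubyGems' own `Gem::Requirement` and `Gem::Version` classes; the candidate version numbers are arbitrary examples, not releases this gem has been tested against.

```ruby
require "rubygems"

# "~> 1.5.6" allows patch releases of 1.5 from 1.5.6 upward, but not 1.6.
runtime = Gem::Requirement.new("~> 1.5.6")
%w[1.5.6 1.5.8 1.6.0].each do |candidate|
  puts "asciidoctor #{candidate}: #{runtime.satisfied_by?(Gem::Version.new(candidate))}"
end

# "~> 1.15" allows any 1.x release from 1.15 upward, but not 2.0.
development = Gem::Requirement.new("~> 1.15")
%w[1.15.0 1.16.1 2.0.0].each do |candidate|
  puts "bundler #{candidate}: #{development.satisfied_by?(Gem::Version.new(candidate))}"
end
```

The three-segment form is therefore the stricter one: it locks the minor version, whereas the two-segment form locks only the major version.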
Binary file
Binary file
Binary file
Binary file
@@ -0,0 +1,3 @@
1
+ require "asciidoctor" unless defined? Asciidoctor::Converter
2
+ require_relative "asciidoctor/iso/converter"
3
+ require_relative "asciidoctor/iso/version"
@@ -0,0 +1,145 @@
1
+ require "date"
2
+ require "nokogiri"
3
+ require "htmlentities"
4
+ require "json"
5
+ require "pathname"
6
+ require "open-uri"
7
+ require "pp"
8
+
9
+ module Asciidoctor
10
+ module ISO
11
+ module Base
12
+ def content(node)
13
+ node.content
14
+ end
15
+
16
+ def skip(node, name = nil)
17
+ warn %(asciidoctor: WARNING (#{current_location(node)}): \
18
+ converter missing for #{name || node.node_name} node in ISO backend)
19
+ nil
20
+ end
21
+
22
+ def document(node)
23
+ result = ["<?xml version='1.0' encoding='UTF-8'?>\n<iso-standard#{document_ns_attributes node}>"]
24
+ $draft = node.attributes.has_key?("draft")
25
+ result << noko { |ixml| front node, ixml }
26
+ result << noko { |ixml| middle node, ixml }
27
+ result << "</iso-standard>"
28
+ ret = result.flatten * "\n"
29
+ ret1 = cleanup(Nokogiri::XML(ret))
30
+ Validate::validate(ret1)
31
+ ret1.to_xml(indent: 2)
32
+ end
33
+
34
+ def front(node, xml)
35
+ xml.front do |xml_front|
36
+ title node, xml_front
37
+ metadata node, xml_front
38
+ end
39
+ end
40
+
41
+ def middle(node, xml)
42
+ xml.middle do |xml_middle|
43
+ xml_middle << node.content if node.blocks?
44
+ end
45
+ end
46
+
47
+ def termsource(node)
48
+ result = []
49
+ result << noko do |xml|
50
+ xml.termref do |xml_t|
51
+ matched = /^(?<xref><xref[^>]+>)
52
+ (,\s(?<section>.[^, ]+))?
53
+ (,\s(?<text>.*))?$/x.match node.content
54
+ if matched.nil?
55
+ warn %(asciidoctor: WARNING (#{current_location(node)}): term reference not in expected format: #{node.content})
56
+ else
57
+ seen_xref = Nokogiri::XML.fragment(matched[:xref])
58
+ attr = {
59
+ target: seen_xref.children[0]["target"],
60
+ format: seen_xref.children[0]["format"],
61
+ }
62
+ xml_t.xref seen_xref.children[0].content, **attr_code(attr)
63
+ xml_t.isosection matched[:section] if matched[:section]
64
+ xml_t.modification { |m| m << matched[:text] } if matched[:text]
65
+ end
66
+ end
67
+ end
68
+ result
69
+ end
70
+
71
+ def paragraph(node)
72
+ return termsource(node) if node.role == "source"
73
+ result = []
74
+ result << noko do |xml|
75
+ xml.p do |xml_t|
76
+ xml_t << node.content
77
+ end
78
+ end
79
+ result
80
+ end
81
+
82
+ def inline_footnote(node)
83
+ noko do |xml|
84
+ xml.fn do |xml_t|
85
+ xml_t << node.text
86
+ end
87
+ end.join
88
+ end
89
+
90
+ def open(node)
91
+ # open block is a container of multiple blocks,
92
+ # treated as a single block.
93
+ # We append each contained block to its parent
94
+ result = []
95
+ if node.blocks?
96
+ node.blocks.each do |b|
97
+ result << send(b.context, b)
98
+ end
99
+ else
100
+ result = paragraph(node)
101
+ end
102
+ result
103
+ end
104
+
105
+ def inline_break(node)
106
+ noko do |xml|
107
+ xml << node.text
108
+ xml.br
109
+ end.join
110
+ end
111
+
112
+ def page_break(node)
113
+ noko do |xml|
114
+ xml << node.text
115
+ xml.pagebreak
116
+ end.join
117
+ end
118
+
119
+ def inline_quoted(node)
120
+ noko do |xml|
121
+ case node.type
122
+ when :emphasis then xml.em node.text
123
+ when :strong then xml.strong node.text
124
+ when :monospaced then xml.tt node.text
125
+ when :double then xml << "\"#{node.text}\""
126
+ when :single then xml << "'#{node.text}'"
127
+ when :superscript then xml.sup node.text
128
+ when :subscript then xml.sub node.text
129
+ when :asciimath then xml.stem node.text
130
+ else
131
+ if node.role == "alt"
132
+ xml.admitted_term { |a| a << node.text }
133
+ elsif node.role == "deprecated"
134
+ xml.deprecated_term { |a| a << node.text }
135
+ elsif node.role == "domain"
136
+ xml.termdomain { |a| a << node.text }
137
+ else
138
+ xml << node.text
139
+ end
140
+ end
141
+ end.join
142
+ end
143
+ end
144
+ end
145
+ end
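The `termsource` method above expects a term source paragraph in the shape "xref, section, modification text". The following sketch exercises the same regular expression in isolation to show which parts it captures. The pattern is copied from `termsource`; the `parse_termsource` helper, the sample strings, and the assumption that cross-references arrive as self-closing `<xref .../>` tags are illustrative only.

```ruby
# Pattern copied from termsource; everything else here is a hypothetical
# illustration and not part of the gem.
TERMSOURCE = /^(?<xref><xref[^>]+>)
              (,\s(?<section>.[^, ]+))?
              (,\s(?<text>.*))?$/x

# Return the captured parts as a hash, or nil when the format is not matched
# (the case in which termsource emits its warning).
def parse_termsource(content)
  matched = TERMSOURCE.match(content)
  return nil if matched.nil?
  { xref: matched[:xref], section: matched[:section], text: matched[:text] }
end

# Assumed input: a self-closing xref, a section number, and a modification.
p parse_termsource('<xref target="ISO7301"/>, 3.1, modified')

# A paragraph that does not start with an xref is rejected.
p parse_termsource("see ISO 7301")
```

Both the section and the modification text are optional, so a bare xref still matches with `section` and `text` set to `nil`.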
@@ -0,0 +1,185 @@
1
+ require "htmlentities"
2
+ require "uri"
3
+
4
+ module Asciidoctor
5
+ module ISO
6
+ module Blocks
7
+ def stem(node)
8
+ stem_attributes = {
9
+ anchor: node.id,
10
+ }
11
+ # NOTE: html escaping is performed by Nokogiri
12
+ stem_content = node.lines.join("\n")
13
+
14
+ noko do |xml|
15
+ xml.formula **attr_code(stem_attributes) do |s|
16
+ s.stem stem_content
17
+ end
18
+ end
19
+ end
20
+
21
+ def sidebar(node)
22
+ if $draft
23
+ note_attributes = {
24
+ color: node.attr("color") || "red",
25
+ }
26
+ content = flatten_rawtext(node.content).join("\n")
27
+ noko do |xml|
28
+ xml.review_note content, **attr_code(note_attributes)
29
+ end
30
+ end
31
+ end
32
+
33
+ def termnote(node)
34
+ note_attributes = { anchor: node.id }
35
+
36
+ warn <<~WARNING_MESSAGE if node.blocks?
37
+ asciidoctor: WARNING (#{current_location(node)}): \
38
+ term note cannot contain blocks of text in ISO XML:\n #{node.content}
39
+ WARNING_MESSAGE
40
+
41
+ noko do |xml|
42
+ xml.termnote **attr_code(note_attributes) do |xml_cref|
43
+ xml_cref << node.content
44
+ end
45
+ end.join
46
+ end
47
+
48
+ def admonition(node)
49
+ return termnote(node) if $term_def
50
+ noko do |xml|
51
+ xml.note **attr_code(anchor: node.id) do |xml_cref|
52
+ if node.blocks?
53
+ xml_cref << node.content
54
+ else
55
+ xml_cref.p { |p| p << node.content }
56
+ end
57
+ end
58
+ end.join
59
+ end
60
+
61
+ def term_example(node)
62
+ noko do |xml|
63
+ xml.termexample **attr_code(anchor: node.id) do |ex|
64
+ ex << node.content
65
+ end
66
+ end.join
67
+ end
68
+
69
+ def example(node)
70
+ return term_example(node) if $term_def
71
+ noko do |xml|
72
+ xml.example **attr_code(anchor: node.id) do |ex|
73
+ ex << node.content
74
+ end
75
+ end.join
76
+ end
77
+
78
+ def preamble(node)
79
+ result = []
80
+ result << noko do |xml|
81
+ xml.foreword do |xml_abstract|
82
+ xml_abstract << node.content
83
+ end
84
+ end
85
+ result
86
+ end
87
+
88
+ def section(node)
89
+ attr = { anchor: node.id.to_s.empty? ? nil : node.id } # node.id may be nil
90
+ noko do |xml|
91
+ case node.title.downcase
92
+ when "introduction"
93
+ xml.introduction **attr_code(attr) do |xml_section|
94
+ xml_section << node.content
95
+ end
96
+ when "patent notice"
97
+ xml.patent_notice **attr_code(attr) do |xml_section|
98
+ xml_section << node.content
99
+ end
100
+ when "scope"
101
+ xml.scope **attr_code(attr) do |xml_section|
102
+ xml_section << node.content
103
+ end
104
+ when "normative references"
105
+ $norm_ref = true
106
+ xml.norm_ref **attr_code(attr) do |xml_section|
107
+ xml_section << node.content
108
+ end
109
+ $norm_ref = false
110
+ when "terms and definitions"
111
+ $term_def = true
112
+ xml.terms_defs **attr_code(attr) do |xml_section|
113
+ xml_section << node.content
114
+ end
115
+ $term_def = false
116
+ when "bibliography"
117
+ $biblio = true
118
+ xml.bibliography **attr_code(attr) do |xml_section|
119
+ xml_section << node.content
120
+ end
121
+ $biblio = false
122
+ else
123
+ if $term_def
124
+ xml.termdef **attr_code(attr) do |xml_section|
125
+ xml_section.term { |name| name << node.title }
126
+ xml_section << node.content
127
+ end
128
+ elsif node.attr("style") == "appendix"
129
+ xml.annex **attr_code(attr) do |xml_section|
130
+ xml_section.name { |name| name << node.title }
131
+ xml_section << node.content
132
+ end
133
+ else
134
+ xml.clause **attr_code(attr) do |xml_section|
135
+ unless node.title.nil?
136
+ xml_section.name { |name| name << node.title }
137
+ end
138
+ xml_section << node.content
139
+ end
140
+ end
141
+ end
142
+ end.join
143
+ end
144
+
145
+ def image(node)
146
+ uri = node.image_uri node.attr("target")
147
+ artwork_attributes = {
148
+ anchor: node.id,
149
+ src: uri,
150
+ }
151
+
152
+ noko do |xml|
153
+ xml.figure **attr_code(artwork_attributes) do |f|
154
+ f.name { |name| name << node.title } unless node.title.nil?
155
+ end
156
+ end
157
+ end
158
+
159
+ def quote(node)
160
+ noko do |xml|
161
+ xml.quote **attr_code(anchor: node.id) do |xml_blockquote|
162
+ if node.blocks?
163
+ xml_blockquote << node.content
164
+ else
165
+ xml_blockquote.p { |p| p << node.content }
166
+ end
167
+ end
168
+ end
169
+ end
170
+
171
+ def listing(node)
172
+ # NOTE: html escaping is performed by Nokogiri
173
+ noko do |xml|
174
+ if node.parent.context != :example
175
+ xml.figure do |xml_figure|
176
+ xml_figure.sourcecode { |s| s << node.content }
177
+ end
178
+ else
179
+ xml.sourcecode { |s| s << node.content }
180
+ end
181
+ end
182
+ end
183
+ end
184
+ end
185
+ end
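The `section` method above recognises clauses by their fixed, case-insensitive titles, as the README notes. A compact way to see that mapping is the lookup-table sketch below. It is a simplification and not the gem's implementation: `SECTION_ELEMENTS` and `section_element` are hypothetical names, and the sketch ignores the `$term_def` state that turns subsections of _Terms and Definitions_ into `termdef` elements.

```ruby
# Hypothetical lookup table; the gem itself uses a case statement.
# Keys are downcased section titles, values are the XML builder method names.
SECTION_ELEMENTS = {
  "introduction" => "introduction",
  "patent notice" => "patent_notice",
  "scope" => "scope",
  "normative references" => "norm_ref",
  "terms and definitions" => "terms_defs",
  "bibliography" => "bibliography",
}.freeze

# Pick the element for a section title; unrecognised titles become an annex
# when the section has the appendix style, and a plain clause otherwise.
def section_element(title, appendix: false)
  SECTION_ELEMENTS.fetch(title.downcase) { appendix ? "annex" : "clause" }
end

puts section_element("Normative References")
puts section_element("Test methods")
puts section_element("Determination of defects", appendix: true)
```

Because matching depends on exact titles, renaming a heading such as `Normative References` in the Asciidoc template would silently demote it to an ordinary clause.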