ebnf 2.0.0 → 2.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 06a44ff62e466914a2d6f1d80d4eacae30745c5c7823799af847db98cd27c813
4
- data.tar.gz: fa433cd2719ec088cadfd987d69effa5490a3219166bd93406b1e73537956056
3
+ metadata.gz: dc55292610eb978d5751361069f3b993d35db3a597442e2027cf0fd2ff886ba5
4
+ data.tar.gz: cc74cd0257a36fa3591f54becfdb51dfffbf44662598f2d67c6a36bf4e969e61
5
5
  SHA512:
6
- metadata.gz: 335fb057c452beb803c64fdfcbe87bc908e26323904a9641892987f819beb87edea7c3ca2f2bfa1aa911fcbf4b642c41b5cfdf5ba77568a8455283482fcaeb7e
7
- data.tar.gz: f4dd5ab8c0b86c2cf2ca2b1edc4f5d93ce3c86a6b5bb527f43def66c33374090873be59ecadb60874bea62d5f80bfa76c230dc7468caf5e8af3557125d887b85
6
+ metadata.gz: bf7c7df32e027a0739b4830651dcff1f4b5186ff2177e1f57b4e952db33619660c4025a29ffc8dba5c7d0f5f5b95f2cbd432a379bc2ffb02d22f7ec6913a48e2
7
+ data.tar.gz: 909a8ff438172431054a33fb067198e32d98d5b88fa630978831af4f1dd728f0197512ee1f341667eb3039662db618122d7da5d0cc4cc55838625f685f7d7a9b
data/README.md CHANGED
@@ -9,10 +9,17 @@
9
9
  ## Description
10
10
  This is a [Ruby][] implementation of an [EBNF][] and [BNF][] parser and parser generator.
11
11
 
12
+ ### [PEG][]/[Packrat][] Parser
13
+ In the primary mode, it supports a Parsing Expression Grammar ([PEG][]) parser generator. This performs more minimal transformations on the parsed grammar to extract sub-productions, which allows each component of a rule to generate its own parsing event.
14
+
15
+ The resulting {EBNF::PEG::Rule} objects then parse each associated rule according to the operator semantics and use a [Packrat][] memoizer to reduce extra work when backtracking.
16
+
17
+ These rules are driven using the {EBNF::PEG::Parser} module, which invokes the starting rule and ensures that all input is consumed.
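As a rough illustration of the memoization idea described above (not the gem's actual {EBNF::PEG::Rule} implementation), the sketch below caches each (rule, position) result so that backtracking never re-parses the same rule at the same offset. The `TinyPackrat` class, its two rules, and the `calls` counter are hypothetical names introduced only for this example.

```ruby
# Hypothetical sketch of packrat memoization; not the gem's API.
class TinyPackrat
  attr_reader :calls

  def initialize(input)
    @input = input
    @memo  = {}            # [rule, pos] => end offset, or nil on failure
    @calls = Hash.new(0)   # how often each rule body actually ran
  end

  # Memoized rule application: end offset on success, nil on failure.
  def apply(rule, pos)
    key = [rule, pos]
    return @memo[key] if @memo.key?(key)
    @calls[rule] += 1
    @memo[key] = send(rule, pos)
  end

  # expr ::= term "+" expr | term
  def expr(pos)
    if (p1 = apply(:term, pos)) && @input[p1] == "+" && (p2 = apply(:expr, p1 + 1))
      return p2
    end
    apply(:term, pos)      # backtrack: answered from the memo table
  end

  # term ::= [0-9]+
  def term(pos)
    m = /\G[0-9]+/.match(@input, pos)
    m && m.end(0)
  end

  # Succeeds only if the starting rule consumes all input.
  def parse
    apply(:expr, 0) == @input.length
  end
end

parser = TinyPackrat.new("1+22+333")
puts parser.parse
puts parser.calls[:term]
```

In this sketch the second alternative of `expr` re-requests `term` at the same offset after the first alternative fails, and the memo table answers that request without running `term` again.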
18
+
12
19
  ### LL(1) Parser
13
- In one mode, it parses [EBNF][] grammars to [BNF][], generates [First/Follow][] and Branch tables for [LL(1)][] grammars, which can be used with the stream [Tokenizer][] and [LL(1) Parser][].
20
+ In another mode, it parses [EBNF][] grammars to [BNF][], generates [First/Follow][] and Branch tables for [LL(1)][] grammars, which can be used with the stream [Tokenizer][] and [LL(1) Parser][].
14
21
 
15
- As LL(1) grammars operate using `alt` and `seq` primitives, allowing for a match on alternative productions or a sequence of productions, generating a parser requires turning the EBNF rules into BNF:
22
+ As LL(1) grammars operate using `alt` and `seq` primitives, allowing for a match on alternative productions or a sequence of productions, generating a parser requires turning the [EBNF][] rules into [BNF][]:
16
23
 
17
24
  * Transform `a ::= b?` into `a ::= _empty | b`
18
25
  * Transform `a ::= b+` into `a ::= b b*`
@@ -25,58 +32,77 @@ As LL(1) grammars operate using `alt` and `seq` primitives, allowing for a match
25
32
 
26
33
  Of note in this implementation is that the tokenizer and parser are streaming, so that they can process inputs of arbitrary size.
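The `b?` and `b+` rewrites listed above can be sketched on a simple array representation of expressions. This is only an illustration under assumptions: the `rewrite_for_bnf` helper and the nested-array form (for example `[:opt, :b]`) are hypothetical, and the sketch covers just those two rewrites rather than the full transformation.

```ruby
# Hypothetical helper; expressions are nested arrays such as [:opt, :b].
def rewrite_for_bnf(expr)
  return expr unless expr.is_a?(Array)

  op, *args = expr
  args = args.map { |arg| rewrite_for_bnf(arg) }  # rewrite nested expressions first

  case op
  when :opt  then [:alt, :_empty, args.first]              # b? becomes _empty | b
  when :plus then [:seq, args.first, [:star, args.first]]  # b+ becomes b b*
  else [op, *args]
  end
end

p rewrite_for_bnf([:opt, :b])
p rewrite_for_bnf([:seq, :a, [:plus, :b]])
```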
27
34
 
28
- See {EBNF::LL1} and {EBNF::LL1::Parser} for further information.
35
+ The _exception operator_ (`A - B`) is only supported on terminals.
29
36
 
30
- ### [PEG][]/[Packrat][] Parser
31
- An additional Parsing Expression Grammar ([PEG][]) parser generator is also supported. This performs more minmal transformations on the parsed grammar to extract sub-productions, which allows each component of a rule to generate its own parsing event.
37
+ See {EBNF::LL1} and {EBNF::LL1::Parser} for further information.
32
38
 
33
39
  ## Usage
34
40
  ### Parsing an EBNF Grammar
35
41
 
36
42
  require 'ebnf'
37
43
 
38
- ebnf = EBNF.parse(File.open('./etc/ebnf.ebnf'))
44
+ grammar = EBNF.parse(File.open('./etc/ebnf.ebnf'))
39
45
 
40
- Output rules and terminals as S-Expressions, Turtle, HTML or BNF
46
+ Output rules and terminals as [S-Expressions][S-Expression], [Turtle][], HTML or [BNF][]
41
47
 
42
- puts ebnf.to_sxp
43
- puts ebnf.to_ttl
44
- puts ebnf.to_html
45
- puts ebnf.to_s
48
+ puts grammar.to_sxp
49
+ puts grammar.to_ttl
50
+ puts grammar.to_html
51
+ puts grammar.to_s
46
52
 
47
- Transform EBNF to PEG (generates sub-rules for embedded expressions) and the RULES table as Ruby for parsing grammars:
53
+ Transform [EBNF][] to [PEG][] (generates sub-rules for embedded expressions) and the RULES table as Ruby for parsing grammars:
48
54
 
49
- ebnf.make_peg
50
- ebnf.to_ruby
55
+ grammar.make_peg
56
+ grammar.to_ruby
51
57
 
52
- Transform EBNF to BNF (generates sub-rules using `alt` or `seq` from `plus`, `star` or `opt`)
58
+ Transform [EBNF][] to [BNF][] (generates sub-rules using `alt` or `seq` from `plus`, `star` or `opt`)
53
59
 
54
- ebnf.make_bnf
60
+ grammar.make_bnf
55
61
 
56
62
  Generate [First/Follow][] rules for BNF grammars (using "ebnf" as the starting production):
57
63
 
58
- ebnf.first_follow(:ebnf)
64
+ grammar.first_follow(:ebnf)
59
65
 
60
66
  Generate Terminal, [First/Follow][], Cleanup and Branch tables as Ruby for parsing grammars:
61
67
 
62
- ebnf.build_tables
63
- ebnf.to_ruby
68
+ grammar.build_tables
69
+ grammar.to_ruby
64
70
 
65
71
  Generate formatted grammar using HTML (requires [Haml][Haml] gem):
66
72
 
67
- ebnf.to_html
73
+ grammar.to_html
74
+
75
+ ### Parsing an ISO/IEC 14977 Grammar
76
+
77
+ The EBNF gem can also parse [ISO/IEC 14977][] grammars (ISOEBNF) to [S-Expressions][S-Expression].
78
+
79
+ grammar = EBNF.parse(File.open('./etc/iso-ebnf.isoebnf'), format: :isoebnf)
80
+
81
+ ### Parsing an ABNF Grammar
82
+
83
+ The EBNF gem can also parse [ABNF][] grammars to [S-Expressions][S-Expression].
84
+
85
+ grammar = EBNF.parse(File.open('./etc/abnf.abnf'), format: :abnf)
68
86
 
69
- ### Parser debugging
87
+ ### Parser Debugging
70
88
 
71
89
  Inevitably while implementing a parser for some specific grammar, a developer will need greater insight into the operation of the parser. While this can involve sorting through a tremendous amount of data, the parser can be provided a [Logger][] instance which will output messages at varying levels of detail to document the state of the parser at any given point. Most useful is likely the `INFO` level of debugging, but even more detail is revealed using the `DEBUG` level. `WARN` and `ERROR` statements will typically also be provided as part of an exception if parsing fails, but can be shown in the context of other parsing state with appropriate indentation as part of the logger.
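To show how the logger level filters output, the sketch below uses only Ruby's standard `Logger` and `StringIO`. How the logger is handed to the parser is not shown in this excerpt, so the example does not call the gem; the `messages_at` helper and the message texts are hypothetical. The numeric levels that `bin/ebnf` sets (`0` for `--debug`, `1` for `--progress`) have the same values as `Logger::DEBUG` and `Logger::INFO`.

```ruby
require 'logger'
require 'stringio'

# Capture log output in memory so the effect of each level is easy to compare.
def messages_at(level)
  io = StringIO.new
  logger = Logger.new(io)
  logger.level = level
  logger.formatter = ->(severity, _time, _progname, msg) { "#{severity}: #{msg}\n" }

  # Hypothetical messages a parser might emit at each severity.
  logger.debug("trying rule ebnf at offset 0")
  logger.info("matched rule declaration")
  logger.warn("unexpected token")
  logger.error("parse failed")

  io.string.lines.map(&:chomp)
end

puts messages_at(Logger::INFO)
puts messages_at(Logger::DEBUG).length
```

At `INFO` the debug message is suppressed, while `DEBUG` lets all four through.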
72
90
 
73
- ### Parser errors
91
+ ### Writing Grammars
92
+
93
+ The {EBNF::Writer} class can be used to write parsed grammars out, either as formatted text, or HTML. Because grammars are written from the Abstract Syntax Tree, represented as [S-Expressions][S-Expression], this provides a means of transforming between grammar formats (e.g., W3C [EBNF][] to [ABNF][]), although with some potential loss in semantic fidelity (case-insensitive string matching vs. case-sensitive matching).
94
+
95
+ The formatted HTML results are designed to be appropriate for including in specifications.
96
+
97
+ ### Parser Errors
74
98
  On a parsing failure, an exception is raised with information that may be useful in determining the source of the error.
75
99
 
76
100
  ## EBNF Grammar
77
101
  The [EBNF][] variant used here is based on [W3C](https://w3.org/) [EBNF][] (see {file:etc/ebnf.ebnf EBNF grammar}) as defined in the
78
102
  [XML 1.0 recommendation](https://www.w3.org/TR/REC-xml/), with minor extensions:
79
103
 
104
+ The character set for EBNF is UTF-8.
105
+
80
106
  The general form of a rule is:
81
107
 
82
108
  symbol ::= expression
@@ -85,7 +111,9 @@ which can also be proceeded by an optional number enclosed in square brackets to
85
111
 
86
112
  [1] symbol ::= expression
87
113
 
88
- Symbols are written with an initial capital letter if they are the start symbol of a regular language (terminals), otherwise with an initial lowercase letter (non-terminals). Literal strings are quoted.
114
+ (Note, this can introduce an ambiguity if the previous rule ends in a range or enum and the current rule has no identifier. In this case, enclosing `expression` within parentheses, or adding intervening comments can resolve the ambiguity.)
115
+
116
+ Symbols are written in CAPITAL CASE if they are the start symbol of a regular language (terminals); otherwise, they are treated as non-terminal rules. Literal strings are quoted.
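The rule form and the terminal naming convention can be illustrated with a small standalone sketch. It does not use the gem's parser: the `RULE_LINE` pattern and `describe_rule` helper are hypothetical, and the reading of "CAPITAL CASE" as "contains an uppercase letter and no lowercase letters" is an assumption.

```ruby
# Hypothetical pattern for "[id] symbol ::= expression" (the id is optional).
RULE_LINE = /\A\s*(?:\[(?<id>[^\]]+)\]\s*)?(?<symbol>\w+)\s*::=\s*(?<expression>.+?)\s*\z/

def describe_rule(line)
  m = RULE_LINE.match(line) or return nil
  symbol = m[:symbol]
  # Assumption: a symbol with uppercase letters and no lowercase letters is a terminal.
  capital = symbol.match?(/[A-Z]/) && !symbol.match?(/[a-z]/)
  {
    id: m[:id],
    symbol: symbol,
    kind: (capital ? :terminal : :rule),
    expression: m[:expression]
  }
end

p describe_rule("[1] ebnf ::= (declaration | rule)*")
p describe_rule("HEX ::= '#x' ([a-f] | [A-F] | [0-9])+")
```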
89
117
 
90
118
  Within the expression on the right-hand side of a rule, the following expressions are used to match strings of one or more characters:
91
119
 
@@ -93,13 +121,13 @@ Within the expression on the right-hand side of a rule, the following expression
93
121
  <tr><td><code>#xN</code></td>
94
122
  <td>where <code>N</code> is a hexadecimal integer, the expression matches the character whose number (code point) in ISO/IEC 10646 is <code>N</code>. The number of leading zeros in the <code>#xN</code> form is insignificant.</td></tr>
95
123
  <tr><td><code>[a-zA-Z], [#xN-#xN]</code></td>
96
- <td>matches any Char with a value in the range(s) indicated (inclusive).</td></tr>
124
+ <td>matches any Char or HEX with a value in the range(s) indicated (inclusive).</td></tr>
97
125
  <tr><td><code>[abc], [#xN#xN#xN]</code></td>
98
- <td>matches any Char with a value among the characters enumerated. Enumerations and ranges can be mixed in one set of brackets.</td></tr>
126
+ <td>matches any UTF-8 R\_CHAR or HEX with a value among the characters enumerated. The last component may be '-'. Enumerations and ranges may be mixed in one set of brackets.</td></tr>
99
127
  <tr><td><code>[^a-z], [^#xN-#xN]</code></td>
100
- <td>matches any Char with a value outside the range indicated.</td></tr>
128
+ <td>matches any UTF-8 Char or HEX with a value outside the range indicated.</td></tr>
101
129
  <tr><td><code>[^abc], [^#xN#xN#xN]</code></td>
102
- <td>matches any Char with a value not among the characters given. Enumerations and ranges of forbidden values can be mixed in one set of brackets.</td></tr>
130
+ <td>matches any UTF-8 R\_CHAR or HEX with a value not among the characters given. The last component may be '-'. Enumerations and ranges of excluded values may be mixed in one set of brackets.</td></tr>
103
131
  <tr><td><code>"string"</code></td>
104
132
  <td>matches a literal string matching that given inside the double quotes.</td></tr>
105
133
  <tr><td><code>'string'</code></td>
@@ -113,7 +141,7 @@ Within the expression on the right-hand side of a rule, the following expression
113
141
  <tr><td><code>A | B</code></td>
114
142
  <td>matches <code>A</code> or <code>B</code>.</td></tr>
115
143
  <tr><td><code>A - B</code></td>
116
- <td>matches any string that matches <code>A</code> but does not match <code>B</code>.</td></tr>
144
+ <td>matches any string that matches <code>A</code> but does not match <code>B</code>. (Only supported on Terminals in LL(1) BNF).</td></tr>
117
145
  <tr><td><code>A+</code></td>
118
146
  <td>matches one or more occurrences of <code>A</code>. Concatenation has higher precedence than alternation; thus <code>A+ | B+</code> is identical to <code>(A+) | (B+)</code>.</td></tr>
119
147
  <tr><td><code>A*</code></td>
@@ -130,10 +158,10 @@ Within the expression on the right-hand side of a rule, the following expression
130
158
  * `@pass` defines the expression used to detect whitespace, which is removed in processing.
131
159
  * No support for `wfc` (well-formedness constraint) or `vc` (validity constraint).
132
160
 
133
- Parsing this grammar yields an S-Expression version: {file:etc/ebnf.sxp} (or [LL(1)][] version {file:etc/ebnf.ll1.sxp} or [PEG][] version {file:etc/ebnf.peg.sxp}).
161
+ Parsing this grammar yields an [S-Expression][] version: {file:etc/ebnf.sxp} (or [LL(1)][] version {file:etc/ebnf.ll1.sxp} or [PEG][] version {file:etc/ebnf.peg.sxp}).
134
162
 
135
163
  ### Parser S-Expressions
136
- Intermediate representations of the grammar may be serialized to Lisp-like S-Expressions. For example, the rule
164
+ Intermediate representations of the grammar may be serialized to Lisp-like [S-Expressions][S-Expression]. For example, the rule
137
165
 
138
166
  [1] ebnf ::= (declaration | rule)*
139
167
 
@@ -155,13 +183,23 @@ Different components of an EBNF rule expression are transformed into their own o
155
183
  <tr><td><code>A?</code></td><td><code>(opt A)</code></td></tr>
156
184
  <tr><td><code>A B</code></td><td><code>(seq A B)</code></td></tr>
157
185
  <tr><td><code>A | B</code></td><td><code>(alt A B)</code></td></tr>
158
- <tr><td><code>A - B</code></td><td><code>(diff A B)</code></td></tr>
186
+ <tr><td><code>A - B</code></td>
187
+ <td><code>(diff A B)</code> for terminals.<br/>
188
+ <code>(seq (not B) A)</code> for non-terminals (PEG parsing only)</td></tr>
159
189
  <tr><td><code>A+</code></td><td><code>(plus A)</code></td></tr>
160
190
  <tr><td><code>A*</code></td><td><code>(star A)</code></td></tr>
161
- <tr><td><code>@pass " "*</code></td><td><code>(pass (star " "))</code></td></tr>
191
+ <tr><td><code>@pass " "*</code></td><td><code>(pass _pass (star " "))</code></td></tr>
162
192
  <tr><td><code>@terminals</code></td><td></td></tr>
163
193
  </table>
164
194
 
195
+ Other rule operators are not directly supported in [EBNF][], but are included to support other notations (e.g., [ABNF][] and [ISO/IEC 14977][]):
196
+
197
+ <table>
198
+ <tr><td><code>%i"StRiNg"</code></td><td><code>(istr "StRiNg")</code></td><td>Case-insensitive string matching</td></tr>
199
+ <tr><td><code>'' - A</code></td><td><code>(not A)</code></td><td>Negative look-ahead, used for non-terminal uses of <code>B - A</code>.</td></tr>
200
+ <tr><td><code>n*mA</code></td><td><code>(rept n m A)</code></td><td>Explicit repetition.</td></tr>
201
+ </table>
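The semantics of `istr`, `not`, `rept`, and the `(seq (not B) A)` form of `A - B` can be sketched with tiny matcher functions. These are not the gem's classes: the helper names `str`, `istr`, `not_`, `seq`, and `rept` are hypothetical, each matcher takes `(input, pos)` and returns the new offset or `nil`, and `rept` assumes a finite upper bound.

```ruby
# (seq "s"): exact string match.
def str(s)
  ->(input, pos) { input[pos, s.length] == s ? pos + s.length : nil }
end

# (istr "StRiNg"): case-insensitive string match.
def istr(s)
  ->(input, pos) { input[pos, s.length].to_s.casecmp?(s) ? pos + s.length : nil }
end

# (not A): negative look-ahead; succeeds without consuming input when A fails.
def not_(a)
  ->(input, pos) { a.call(input, pos) ? nil : pos }
end

# (seq A B ...): match each component in order, stopping at the first failure.
def seq(*parts)
  ->(input, pos) { parts.reduce(pos) { |p, m| p && m.call(input, p) } }
end

# (rept n m A): between n and m repetitions of A, matched greedily (m finite).
def rept(n, m, a)
  lambda do |input, pos|
    count = 0
    while count < m && (nxt = a.call(input, pos))
      pos = nxt
      count += 1
    end
    count >= n ? pos : nil
  end
end

# A - B on non-terminals, written as (seq (not B) A).
a = rept(1, 4, istr("a"))   # one to four "a" characters in either case
b = str("aa")               # excluded prefix
a_minus_b = seq(not_(b), a)

p a_minus_b.call("aAa", 0)
p a_minus_b.call("aaa", 0)
```

The look-ahead runs first and consumes nothing, so `A` is only attempted at positions where `B` does not match.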
202
+
165
203
  Additionally, rules defined with an UPPERCASE symbol are treated as terminals.
166
204
 
167
205
  For an [LL(1)][] parser generator, the {EBNF::BNF.make_bnf} method can be used to transform the EBNF rule into a BNF rule.
@@ -179,12 +217,16 @@ For a [PEG][] parser generator, there is a simpler transformation that reduces r
179
217
  (rule _ebnf_1 "1.1" (alt declaration rule))
180
218
 
181
219
  ## Example parsers
182
- For a [PEG][] parser for a simple grammar implementing a calculator see [Calc example](http://dryruby.github.io/ebnf/examples/calc/doc/calc.html
220
+ For a [PEG][] parser for a simple grammar implementing a calculator see [Calc example](https://dryruby.github.io/ebnf/examples/calc/doc/calc.html)
183
221
 
184
- For an example parser built using this gem that parses the [EBNF][] grammar, see [EBNF PEG Parser example](http://dryruby.github.io/ebnf/examples/ebnf-peg-parser/doc/parser.html). This example creates a parser for the [EBNF][] grammar which generates the same Abstract Syntax Tree as the built-in parser in the gem.
222
+ For an example parser built using this gem that parses the [EBNF][] grammar, see [EBNF PEG Parser example](https://dryruby.github.io/ebnf/examples/ebnf-peg-parser/doc/parser.html). This example creates a parser for the [EBNF][] grammar which generates the same Abstract Syntax Tree as the built-in parser in the gem.
185
223
 
186
224
  There is also an
187
- [EBNF LL(1) Parser example](http://dryruby.github.io/ebnf/examples/ebnf-peg-parser/doc/parser.html).
225
+ [EBNF LL(1) Parser example](https://dryruby.github.io/ebnf/examples/ebnf-peg-parser/doc/parser.html).
226
+
227
+ The [ISO EBNF Parser](https://dryruby.github.io/ebnf/examples/isoebnf/doc/parser.html) example parses [ISO/IEC 14977][] into [S-Expressions][S-Expression], which can be used to parse compatible grammars using this parser (either [PEG][] or [LL(1)][]).
228
+
229
+ The [ABNF Parser](https://dryruby.github.io/ebnf/examples/abnf/doc/parser.html) example parses [ABNF][] into [S-Expressions][S-Expression], which can be used to parse compatible grammars using this [PEG][] parser.
188
230
 
189
231
  ## Acknowledgements
190
232
  Much of this work, particularly the generic parser, is inspired by work originally done by
@@ -229,16 +271,19 @@ A copy of the [Turtle EBNF][] and derived parser files are included in the repos
229
271
  [YARD]: https://yardoc.org/
230
272
  [YARD-GS]: https://rubydoc.info/docs/yard/file/docs/GettingStarted.md
231
273
  [PDD]: https://lists.w3.org/Archives/Public/public-rdf-ruby/2010May/0013.html
274
+ [ABNF]: https://www.rfc-editor.org/rfc/rfc5234
232
275
  [BNF]: https://en.wikipedia.org/wiki/Backus–Naur_form
233
276
  [EBNF]: https://www.w3.org/TR/REC-xml/#sec-notation
234
277
  [EBNF doc]: https://rubydoc.info/github/dryruby/ebnf
235
278
  [First/Follow]: https://en.wikipedia.org/wiki/LL_parser#Constructing_an_LL.281.29_parsing_table
279
+ [ISO/IEC 14977]: https://www.iso.org/standard/26153.html
236
280
  [LL(1)]: https://www.csd.uwo.ca/~moreno//CS447/Lectures/Syntax.html/node14.html
237
281
  [LL(1) Parser]: https://en.wikipedia.org/wiki/LL_parser
238
282
  [Logger]: https://ruby-doc.org/stdlib-2.4.0/libdoc/logger/rdoc/Logger.html
283
+ [S-expression]: https://en.wikipedia.org/wiki/S-expression
239
284
  [Tokenizer]: https://en.wikipedia.org/wiki/Lexical_analysis#Tokenizer
285
+ [Turtle]: https://www.w3.org/TR/2012/WD-turtle-20120710/
240
286
  [Turtle EBNF]: https://dvcs.w3.org/hg/rdf/file/default/rdf-turtle/turtle.bnf
241
287
  [Packrat]: https://pdos.csail.mit.edu/~baford/packrat/thesis/
242
288
  [PEG]: https://en.wikipedia.org/wiki/Parsing_expression_grammar
243
- [Treetop]: https://rubygems.org/gems/treetop
244
289
  [Haml]: https://rubygems.org/gems/haml
data/VERSION CHANGED
@@ -1 +1 @@
1
- 2.0.0
1
+ 2.1.0
data/bin/ebnf CHANGED
@@ -15,6 +15,7 @@ options = {
15
15
  output_format: :sxp,
16
16
  prefix: "ttl",
17
17
  namespace: "http://www.w3.org/ns/formats/Turtle#",
18
+ level: 4
18
19
  }
19
20
 
20
21
  input, out = nil, STDOUT
@@ -23,15 +24,17 @@ OPT_ARGS = [
23
24
  ["--debug", GetoptLong::NO_ARGUMENT, "Turn on debugging output"],
24
25
  ["--bnf", GetoptLong::NO_ARGUMENT, "Transform EBNF to BNF"],
25
26
  ["--evaluate","-e", GetoptLong::REQUIRED_ARGUMENT,"Evaluate argument as an EBNF document"],
27
+ ["--format", "-f", GetoptLong::REQUIRED_ARGUMENT,"Specify output format one of abnf, abnfh, ebnf, html, isoebnf, isoebnfh, ttl, sxp, or rb"],
28
+ ["--input-format", GetoptLong::REQUIRED_ARGUMENT,"Specify input format one of abnf, ebnf, isoebnf, native, or sxp"],
26
29
  ["--ll1", GetoptLong::REQUIRED_ARGUMENT,"Generate First/Follow rules, argument is start symbol"],
27
- ["--format", "-f", GetoptLong::REQUIRED_ARGUMENT,"Specify output format one of ebnf, html, ttl, sxp, or rb"],
28
- ["--input-format", GetoptLong::REQUIRED_ARGUMENT,"Specify input format one of ebnf or sxp"],
29
30
  ["--mod-name", GetoptLong::REQUIRED_ARGUMENT,"Module name used when creating ruby tables"],
31
+ ["--namespace", "-n", GetoptLong::REQUIRED_ARGUMENT,"Namespace to use when generating Turtle"],
30
32
  ["--output", "-o", GetoptLong::REQUIRED_ARGUMENT,"Output to the specified file path"],
31
33
  ["--peg", GetoptLong::NO_ARGUMENT, "Transform EBNF to PEG"],
32
34
  ["--prefix", "-p", GetoptLong::REQUIRED_ARGUMENT,"Prefix to use when generating Turtle"],
33
35
  ["--progress", "-v", GetoptLong::NO_ARGUMENT, "Detail on execution"],
34
- ["--namespace", "-n", GetoptLong::REQUIRED_ARGUMENT,"Namespace to use when generating Turtle"],
36
+ ["--renumber", GetoptLong::NO_ARGUMENT, "Renumber parsed rules"],
37
+ ["--validate", GetoptLong::NO_ARGUMENT, "Validate grammar"],
35
38
  ["--help", "-?", GetoptLong::NO_ARGUMENT, "This message"]
36
39
  ]
37
40
  def usage
@@ -54,27 +57,34 @@ opts = GetoptLong.new(*OPT_ARGS.map {|o| o[0..-2]})
54
57
 
55
58
  opts.each do |opt, arg|
56
59
  case opt
57
- when '--debug' then options[:debug] = true
60
+ when '--debug' then options[:level] = 0
58
61
  when '--bnf' then options[:bnf] = true
59
62
  when '--evaluate' then input = arg
60
- when '--input-format' then options[:format] = arg.to_sym
61
- when '--format' then options[:output_format] = arg.to_sym
63
+ when '--input-format'
64
+ unless %w(abnf ebnf isoebnf native sxp).include?(arg)
65
+ STDERR.puts("unrecognized input format #{arg}")
66
+ usage
67
+ end
68
+ options[:format] = arg.to_sym
69
+ when '--format'
70
+ unless %w(abnf abnfh ebnf html isoebnf isoebnfh rb sxp ttl).include?(arg)
71
+ STDERR.puts("unrecognized output format #{arg}")
72
+ usage
73
+ end
74
+ options[:output_format] = arg.to_sym
62
75
  when '--ll1' then (options[:ll1] ||= []) << arg.to_sym
63
76
  when '--mod-name' then options[:mod_name] = arg
64
77
  when '--output' then out = File.open(arg, "w")
65
78
  when '--peg' then options[:peg] = true
66
79
  when '--prefix' then options[:prefix] = arg
80
+ when '--renumber' then options[:renumber] = true
67
81
  when '--namespace' then options[:namespace] = arg
68
- when '--progress' then options[:progress] = true
82
+ when '--progress' then options[:level] = 1 unless options[:level] == 0
83
+ when '--validate' then options[:validate] = true
69
84
  when '--help' then usage
70
85
  end
71
86
  end
72
87
 
73
- if options[:output_format] == :rb && !(options[:ll1] || options[:peg])
74
- STDERR.puts "outputing in .rb format requires --ll or --peg"
75
- exit(1)
76
- end
77
-
78
88
  input = File.open(ARGV[0]) if ARGV[0]
79
89
 
80
90
  ebnf = EBNF.parse(input || STDIN, **options)
@@ -85,13 +95,19 @@ if options[:ll1]
85
95
  ebnf.build_tables
86
96
  end
87
97
 
98
+ ebnf.renumber! if options[:renumber]
99
+
88
100
  res = case options[:output_format]
89
- when :ebnf then ebnf.to_s
90
- when :html then ebnf.to_html
91
- when :sxp then ebnf.to_sxp
92
- when :ttl then ebnf.to_ttl(options[:prefix], options[:namespace])
93
- when :rb then ebnf.to_ruby(out, grammarFile: ARGV[0], **options)
94
- else ebnf.ast.inspect
101
+ when :abnf then ebnf.to_s(format: :abnf)
102
+ when :abnfh then ebnf.to_html(format: :abnf)
103
+ when :ebnf then ebnf.to_s
104
+ when :html then ebnf.to_html
105
+ when :isoebnf then ebnf.to_s(format: :isoebnf)
106
+ when :isoebnfh then ebnf.to_html(format: :isoebnf)
107
+ when :sxp then ebnf.to_sxp
108
+ when :ttl then ebnf.to_ttl(options[:prefix], options[:namespace])
109
+ when :rb then ebnf.to_ruby(out, grammarFile: ARGV[0], **options)
110
+ else ebnf.ast.inspect
95
111
  end
96
112
 
97
113
  out.puts res
@@ -0,0 +1,52 @@
1
+ # Core terminals available in uses of ABNF
2
+ ALPHA ::= [#x41-#x5A#x61-#x7A] # A-Z | a-z
3
+
4
+ BIT ::= '0' | '1'
5
+
6
+ CHAR ::= [#x01-#x7F]
7
+ # any 7-bit US-ASCII character,
8
+ # excluding NUL
9
+ CR ::= #x0D
10
+ # carriage return
11
+
12
+ CRLF ::= CR? LF
13
+ # Internet standard newline
14
+
15
+ CTL ::= [#x00-#x1F] | #x7F
16
+ # controls
17
+
18
+ DIGIT ::= [#x30-#x39]
19
+ # 0-9
20
+
21
+ DQUOTE ::= #x22
22
+ # " (Double Quote)
23
+
24
+ HEXDIG ::= DIGIT | [A-F] # [0-9A-F]
25
+
26
+ HTAB ::= #x09
27
+ # horizontal tab
28
+
29
+ LF ::= #x0A
30
+ # linefeed
31
+
32
+ LWSP ::= (WSP | CRLF WSP)*
33
+ # Use of this linear-white-space rule
34
+ # permits lines containing only white
35
+ # space that are no longer legal in
36
+ # mail headers and have caused
37
+ # interoperability problems in other
38
+ # contexts.
39
+ # Do not use when defining mail
40
+ # headers and use with caution in
41
+ # other contexts.
42
+
43
+ OCTET ::= [#x00-#xFF]
44
+ # 8 bits of data
45
+
46
+ SP ::= #x20
47
+
48
+ VCHAR ::= [#x21-#x7E]
49
+ # visible (printing) characters
50
+
51
+ WSP ::= SP | HTAB
52
+ # white space
@@ -0,0 +1,121 @@
1
+ rulelist = 1*( rule / (*c-wsp c-nl) )
2
+
3
+ rule = rulename defined-as elements c-nl
4
+ ; continues if next line starts
5
+ ; with white space
6
+
7
+ rulename = ALPHA *(ALPHA / DIGIT / "-")
8
+
9
+ defined-as = *c-wsp ("=" / "=/") *c-wsp
10
+ ; basic rules definition and
11
+ ; incremental alternatives
12
+
13
+ elements = alternation *c-wsp
14
+
15
+ c-wsp = WSP / (c-nl WSP)
16
+
17
+ c-nl = comment / CRLF
18
+ ; comment or newline
19
+
20
+ comment = ";" *(WSP / VCHAR) CRLF
21
+
22
+ alternation = concatenation
23
+ *(*c-wsp "/" *c-wsp concatenation)
24
+
25
+ concatenation = repetition *(1*c-wsp repetition)
26
+
27
+ repetition = [repeat] element
28
+
29
+ repeat = (*DIGIT "*" *DIGIT) / 1*DIGIT
30
+
31
+ element = rulename / group / option /
32
+ char-val / num-val / prose-val
33
+
34
+ group = "(" *c-wsp alternation *c-wsp ")"
35
+
36
+ option = "[" *c-wsp alternation *c-wsp "]"
37
+
38
+ char-val = case-insensitive-string /
39
+ case-sensitive-string
40
+
41
+ case-insensitive-string =
42
+ [ "%i" ] quoted-string
43
+
44
+ case-sensitive-string =
45
+ "%s" quoted-string
46
+
47
+ quoted-string = DQUOTE *(%x20-21 / %x23-7E) DQUOTE
48
+ ; quoted string of SP and VCHAR
49
+ ; without DQUOTE
50
+
51
+ num-val = "%" (bin-val / dec-val / hex-val)
52
+
53
+ bin-val = "b" 1*BIT
54
+ [ 1*("." 1*BIT) / ("-" 1*BIT) ]
55
+ ; series of concatenated bit values
56
+ ; or single ONEOF range
57
+
58
+ dec-val = "d" 1*DIGIT
59
+ [ 1*("." 1*DIGIT) / ("-" 1*DIGIT) ]
60
+
61
+ hex-val = "x" 1*HEXDIG
62
+ [ 1*("." 1*HEXDIG) / ("-" 1*HEXDIG) ]
63
+
64
+ prose-val = "<" *(%x20-3D / %x3F-7E) ">"
65
+ ; bracketed string of SP and VCHAR
66
+ ; without angles
67
+ ; prose description, to be used as
68
+ ; last resort
69
+
70
+ ALPHA = %x41-5A / %x61-7A ; A-Z / a-z
71
+
72
+ BIT = "0" / "1"
73
+
74
+ CHAR = %x01-7F
75
+ ; any 7-bit US-ASCII character,
76
+ ; excluding NUL
77
+ CR = %x0D
78
+ ; carriage return
79
+
80
+ CRLF = [CR] LF
81
+ ; Internet standard newline
82
+ ; Extended to allow only newline
83
+
84
+ CTL = %x00-1F / %x7F
85
+ ; controls
86
+
87
+ DIGIT = %x30-39
88
+ ; 0-9
89
+
90
+ DQUOTE = %x22
91
+ ; " (Double Quote)
92
+
93
+ HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
94
+
95
+ HTAB = %x09
96
+ ; horizontal tab
97
+
98
+ LF = %x0A
99
+ ; linefeed
100
+
101
+ LWSP = *(WSP / CRLF WSP)
102
+ ; Use of this linear-white-space rule
103
+ ; permits lines containing only white
104
+ ; space that are no longer legal in
105
+ ; mail headers and have caused
106
+ ; interoperability problems in other
107
+ ; contexts.
108
+ ; Do not use when defining mail
109
+ ; headers and use with caution in
110
+ ; other contexts.
111
+
112
+ OCTET = %x00-FF
113
+ ; 8 bits of data
114
+
115
+ SP = %x20
116
+
117
+ VCHAR = %x21-7E
118
+ ; visible (printing) characters
119
+
120
+ WSP = SP / HTAB
121
+ ; white space