ebnf 1.1.1 → 2.1.0

Files changed (56)
  1. checksums.yaml +5 -5
  2. data/README.md +218 -196
  3. data/UNLICENSE +1 -1
  4. data/VERSION +1 -1
  5. data/bin/ebnf +40 -21
  6. data/etc/abnf-core.ebnf +52 -0
  7. data/etc/abnf.abnf +121 -0
  8. data/etc/abnf.ebnf +124 -0
  9. data/etc/abnf.sxp +45 -0
  10. data/etc/doap.ttl +13 -12
  11. data/etc/ebnf.ebnf +21 -33
  12. data/etc/ebnf.html +171 -160
  13. data/etc/{ebnf.rb → ebnf.ll1.rb} +30 -107
  14. data/etc/ebnf.ll1.sxp +182 -183
  15. data/etc/ebnf.peg.rb +90 -0
  16. data/etc/ebnf.peg.sxp +84 -0
  17. data/etc/ebnf.sxp +40 -41
  18. data/etc/iso-ebnf.ebnf +140 -0
  19. data/etc/iso-ebnf.isoebnf +138 -0
  20. data/etc/iso-ebnf.sxp +65 -0
  21. data/etc/sparql.ebnf +4 -4
  22. data/etc/sparql.html +1603 -1751
  23. data/etc/sparql.ll1.sxp +7372 -7372
  24. data/etc/sparql.peg.rb +532 -0
  25. data/etc/sparql.peg.sxp +597 -0
  26. data/etc/sparql.sxp +363 -362
  27. data/etc/turtle.ebnf +3 -3
  28. data/etc/turtle.html +465 -517
  29. data/etc/{turtle.rb → turtle.ll1.rb} +3 -4
  30. data/etc/turtle.ll1.sxp +425 -425
  31. data/etc/turtle.peg.rb +182 -0
  32. data/etc/turtle.peg.sxp +199 -0
  33. data/etc/turtle.sxp +103 -101
  34. data/lib/ebnf.rb +7 -2
  35. data/lib/ebnf/abnf.rb +301 -0
  36. data/lib/ebnf/abnf/core.rb +23 -0
  37. data/lib/ebnf/abnf/meta.rb +111 -0
  38. data/lib/ebnf/base.rb +128 -87
  39. data/lib/ebnf/bnf.rb +1 -26
  40. data/lib/ebnf/ebnf/meta.rb +90 -0
  41. data/lib/ebnf/isoebnf.rb +229 -0
  42. data/lib/ebnf/isoebnf/meta.rb +75 -0
  43. data/lib/ebnf/ll1.rb +140 -8
  44. data/lib/ebnf/ll1/lexer.rb +37 -32
  45. data/lib/ebnf/ll1/parser.rb +113 -73
  46. data/lib/ebnf/ll1/scanner.rb +84 -51
  47. data/lib/ebnf/native.rb +320 -0
  48. data/lib/ebnf/parser.rb +285 -302
  49. data/lib/ebnf/peg.rb +39 -0
  50. data/lib/ebnf/peg/parser.rb +554 -0
  51. data/lib/ebnf/peg/rule.rb +241 -0
  52. data/lib/ebnf/rule.rb +453 -163
  53. data/lib/ebnf/terminals.rb +21 -0
  54. data/lib/ebnf/writer.rb +554 -85
  55. metadata +98 -20
  56. data/etc/sparql.rb +0 -45773
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
- SHA1:
3
- metadata.gz: 9beea76a6ce73ec7ecd512e5b6b6a9a9057059aa
4
- data.tar.gz: 7bbcfb97700d781070d276ca3e5a378fc0fdb9f4
2
+ SHA256:
3
+ metadata.gz: dc55292610eb978d5751361069f3b993d35db3a597442e2027cf0fd2ff886ba5
4
+ data.tar.gz: cc74cd0257a36fa3591f54becfdb51dfffbf44662598f2d67c6a36bf4e969e61
5
5
  SHA512:
6
- metadata.gz: 3b84a0e8f12ab0e83d7cd37f88c5ce960ee80c86e485f49a8f3125f53d5aa3614300209c974904a5c86bfe7c97069be697e61182d464869423b5f6499d55f21c
7
- data.tar.gz: 9e1bfdf21b091b61e892a614be05f1f6d43d9c5a42530142bd0d005e0a79438fbc7a7009502260ddaad1b2b3adeb7ea850353c159672224feed3319f4d7f4ffd
6
+ metadata.gz: bf7c7df32e027a0739b4830651dcff1f4b5186ff2177e1f57b4e952db33619660c4025a29ffc8dba5c7d0f5f5b95f2cbd432a379bc2ffb02d22f7ec6913a48e2
7
+ data.tar.gz: 909a8ff438172431054a33fb067198e32d98d5b88fa630978831af4f1dd728f0197512ee1f341667eb3039662db618122d7da5d0cc4cc55838625f685f7d7a9b
data/README.md CHANGED
@@ -2,15 +2,24 @@
2
2
 
3
3
  [EBNF][] parser and generic parser generator.
4
4
 
5
- [![Gem Version](https://badge.fury.io/rb/ebnf.png)](http://badge.fury.io/rb/ebnf)
6
- [![Build Status](https://secure.travis-ci.org/gkellogg/ebnf.png?branch=master)](http://travis-ci.org/gkellogg/ebnf)
7
- [![Coverage Status](https://coveralls.io/repos/gkellogg/ebnf/badge.svg)](https://coveralls.io/r/gkellogg/ebnf)
8
- [![Dependency Status](https://gemnasium.com/gkellogg/ebnf.png)](https://gemnasium.com/gkellogg/ebnf)
5
+ [![Gem Version](https://badge.fury.io/rb/ebnf.png)](https://badge.fury.io/rb/ebnf)
6
+ [![Build Status](https://secure.travis-ci.org/dryruby/ebnf.png?branch=master)](https://travis-ci.org/dryruby/ebnf)
7
+ [![Coverage Status](https://coveralls.io/repos/dryruby/ebnf/badge.svg)](https://coveralls.io/r/dryruby/ebnf)
9
8
 
10
9
  ## Description
11
- This is a [Ruby][] implementation of an [EBNF][] and [BNF][] parser and parser generator. It parses [EBNF][] grammars to [BNF][], generates [First/Follow][] and Branch tables for [LL(1)][] grammars, which can be used with the stream [Tokenizer][] and [LL(1) Parser][].
10
+ This is a [Ruby][] implementation of an [EBNF][] and [BNF][] parser and parser generator.
12
11
 
13
- As LL(1) grammars operate using `alt` and `seq` primitives, allowing for a match on alternative productions or a sequence of productions, generating a parser requires turning the EBNF rules into BNF:
12
+ ### [PEG][]/[Packrat][] Parser
13
+ In the primary mode, it supports a Parsing Expression Grammar ([PEG][]) parser generator. This performs only minimal transformations on the parsed grammar to extract sub-productions, which allows each component of a rule to generate its own parsing event.
14
+
15
+ The resulting {EBNF::PEG::Rule} objects then parse each associated rule according to the operator semantics and use a [Packrat][] memoizer to reduce extra work when backtracking.
16
+
17
+ These rules are driven using the {EBNF::PEG::Parser} module, which invokes the starting rule and ensures that all input is consumed.
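The Packrat idea described above can be sketched in a few lines of plain Ruby. This is only an illustration under stated assumptions: the `TinyPackrat` class, its tiny two-rule grammar and its `calls` counter are all hypothetical and are not part of the gem's API. It memoizes each `(rule, position)` result so that re-applying a rule at the same offset does no extra work, and it rejects input that is not fully consumed.

```ruby
# Minimal packrat-style memoization for a tiny PEG (hypothetical example):
#   sum    <- number ('+' number)*
#   number <- [0-9]+
class TinyPackrat
  attr_reader :calls

  def initialize(input)
    @input = input
    @memo  = {}            # [rule, position] => [value, next_position] or :fail
    @calls = Hash.new(0)   # counts non-memoized rule evaluations
  end

  # Apply a rule at a position, consulting the memo table first.
  def apply(rule, pos)
    key = [rule, pos]
    return @memo[key] if @memo.key?(key)
    @calls[rule] += 1
    @memo[key] = send(rule, pos)
  end

  # Invoke the starting rule and require that all input is consumed.
  def parse
    result = apply(:sum, 0)
    return nil if result == :fail || result[1] != @input.length
    result[0]
  end

  private

  def number(pos)
    m = /\G[0-9]+/.match(@input, pos)
    m ? [m[0].to_i, m.end(0)] : :fail
  end

  def sum(pos)
    first = apply(:number, pos)
    return :fail if first == :fail
    total, pos = first
    while @input[pos] == '+'
      nxt = apply(:number, pos + 1)
      break if nxt == :fail
      total += nxt[0]
      pos = nxt[1]
    end
    [total, pos]
  end
end

parser = TinyPackrat.new("1+22+3")
puts parser.parse   # prints 26
```

Calling `parse` a second time on the same object returns the memoized result without re-evaluating `sum`, which is the saving a memoizer offers when a backtracking parser revisits a position.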
18
+
19
+ ### LL(1) Parser
20
+ In another mode, it parses [EBNF][] grammars to [BNF][], generates [First/Follow][] and Branch tables for [LL(1)][] grammars, which can be used with the stream [Tokenizer][] and [LL(1) Parser][].
21
+
22
+ As LL(1) grammars operate using `alt` and `seq` primitives, allowing for a match on alternative productions or a sequence of productions, generating a parser requires turning the [EBNF][] rules into [BNF][]:
14
23
 
15
24
  * Transform `a ::= b?` into `a ::= _empty | b`
16
25
  * Transform `a ::= b+` into `a ::= b b*`
@@ -23,211 +32,219 @@ As LL(1) grammars operate using `alt` and `seq` primitives, allowing for a match
23
32
 
24
33
  Of note in this implementation is that the tokenizer and parser are streaming, so that they can process inputs of arbitrary size.
25
34
 
35
+ The _exception operator_ (`A - B`) is only supported on terminals.
36
+
37
+ See {EBNF::LL1} and {EBNF::LL1::Parser} for further information.
38
+
26
39
  ## Usage
27
- ### Parsing an LL(1) Grammar
40
+ ### Parsing an EBNF Grammar
28
41
 
29
42
  require 'ebnf'
30
43
 
31
- ebnf = EBNF.parse(File.open('./etc/ebnf.bnf'))
44
+ grammar = EBNF.parse(File.open('./etc/ebnf.ebnf'))
32
45
 
33
- Output rules and terminals as S-Expressions, Turtle or EBNF
46
+ Output rules and terminals as [S-Expressions][S-Expression], [Turtle][], HTML or [BNF][]
34
47
 
35
- puts ebnf.to_sxp
36
- puts ebnf.to_ttl
37
- puts ebnf.to_ebnf
48
+ puts grammar.to_sxp
49
+ puts grammar.to_ttl
50
+ puts grammar.to_html
51
+ puts grammar.to_s
38
52
 
39
- Transform EBNF to BNF (generates sub-productions using `alt` or `seq` from `plus`, `star` or `opt`)
53
+ Transform [EBNF][] to [PEG][] (generates sub-rules for embedded expressions) and generate the RULES table as Ruby for parsing grammars:
40
54
 
41
- ebnf.make_bnf
55
+ grammar.make_peg
56
+ grammar.to_ruby
42
57
 
43
- Generate [First/Follow][] rules for BNF grammars
58
+ Transform [EBNF][] to [BNF][] (generates sub-rules using `alt` or `seq` from `plus`, `star` or `opt`)
44
59
 
45
- ebnf.first_follow(start_tokens)
60
+ grammar.make_bnf
46
61
 
47
- Generate Terminal, [First/Follow][], Cleanup and Branch tables as Ruby for parsing grammars
62
+ Generate [First/Follow][] rules for BNF grammars (using "ebnf" as the starting production):
48
63
 
49
- ebnf.to_ruby
64
+ grammar.first_follow(:ebnf)
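To make the First/Follow step less abstract, here is a minimal sketch of how FIRST sets can be computed by fixed-point iteration. It is not the gem's implementation: `first_sets`, the hash-based grammar layout and the simplified right-hand sides (cut down to a single terminal each) are assumptions made for illustration only.

```ruby
require 'set'

# Compute FIRST sets for a BNF grammar by fixed-point iteration.
# grammar: { nonterminal => [[symbol, ...], ...] }; an empty alternative
# stands for the empty string, recorded as :_eps. Any symbol that is not a
# key of the grammar is treated as a terminal.
def first_sets(grammar)
  first = Hash.new { |h, k| h[k] = Set.new }
  changed = true
  while changed
    changed = false
    grammar.each do |lhs, alternatives|
      alternatives.each do |seq|
        before = first[lhs].size
        nullable_prefix = true
        seq.each do |sym|
          if grammar.key?(sym)
            first[lhs].merge(first[sym] - [:_eps])
            nullable_prefix = first[sym].include?(:_eps)
          else
            first[lhs] << sym
            nullable_prefix = false
          end
          break unless nullable_prefix
        end
        # Every symbol in the alternative could be empty, so the rule can be too.
        first[lhs] << :_eps if nullable_prefix
        changed ||= first[lhs].size != before
      end
    end
  end
  first
end

# Simplified, hypothetical version of the decomposed `ebnf` rule.
grammar = {
  ebnf:        [[], [:_ebnf_2]],
  _ebnf_1:     [[:declaration], [:rule]],
  _ebnf_2:     [[:_ebnf_1, :ebnf]],
  declaration: [["@terminals"], [:pass]],
  pass:        [["@pass"]],
  rule:        [[:LHS]]
}
first_sets(grammar).each { |name, set| puts "#{name}: #{set.to_a.inspect}" }
```

The loop stops once a full pass adds nothing new, so the order in which rules are listed only affects how many passes are needed, not the result.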
50
65
 
51
- Generate formatted grammar using HTML (requires [Haml][Haml] gem)
66
+ Generate Terminal, [First/Follow][], Cleanup and Branch tables as Ruby for parsing grammars:
52
67
 
53
- ebnf.to_html
68
+ grammar.build_tables
69
+ grammar.to_ruby
54
70
 
55
- ### Parser S-Expressions
56
- Intermediate representations of the grammar may be serialized to Lisp-like S-Expressions. For example, the rule `[1] ebnf ::= (declaration | rule)*` is serialized as `(rule ebnf "1" (star (alt declaration rule)))`.
57
-
58
- Once the [LL(1)][] conversion is made, the [First/Follow][] table is generated, this rule expands as follows:
59
-
60
- (rule ebnf "1"
61
- (start #t)
62
- (first "@pass" "@terminals" LHS _eps)
63
- (follow _eof)
64
- (cleanup star)
65
- (alt _empty _ebnf_2))
66
- (rule _ebnf_1 "1.1"
67
- (first "@pass" "@terminals" LHS)
68
- (follow "@pass" "@terminals" LHS _eof)
69
- (alt declaration rule))
70
- (rule _ebnf_2 "1.2"
71
- (first "@pass" "@terminals" LHS)
72
- (follow _eof)
73
- (cleanup merge)
74
- (seq _ebnf_1 ebnf))
75
- (rule _ebnf_3 "1.3" (first "@pass" "@terminals" LHS _eps) (follow _eof) (seq ebnf))
76
-
77
- ### Creating terminal definitions and parser rules to parse generated grammars
78
- The parser is initialized to callbacks invoked on entry and exit
79
- to each `terminal` and `production`. A trivial parser loop can be described as follows:
80
-
81
- require 'ebnf/ll1/parser'
82
- require 'meta'
83
-
84
- class Parser
85
- include Meta
86
-
87
- terminal(:SYMBOL, /([a-z]|[A-Z]|[0-9]|_)+/) do |prod, token, input|
88
- # Add data based on scanned token to input
89
- input[:symbol] = token.value
90
- end
91
-
92
- start_production(:rule) do |input, current, callback|
93
- # Process on start of production
94
- # Set state for entry into recursed rules through current
95
-
96
- # Callback to parser loop with callback
97
- end
98
-
99
- production(:rule) do |input, current, callback|
100
- # Process on end of production
101
- # return results in input, retrieve results from recursed rules in current
102
-
103
- # Callback to parser loop with callback
104
- end
105
-
106
- def initialize(input)
107
- parser_options = {
108
- branch: BRANCH,
109
- first: FIRST,
110
- follow: FOLLOW,
111
- cleanup: CLEANUP
112
- }
113
- parse(input, start_symbol, parser_options) do |context, *data|
114
- # Process calls from callback from productions
115
-
116
- rescue ArgumentError, RDF::LL1::Parser::Error => e
117
- progress("Parsing completed with errors:\n\t#{e.message}")
118
- raise RDF::ReaderError, e.message if validate?
119
- end
120
-
121
- ### Branch Table
122
- The Branch table is a hash mapping production rules to a hash relating terminals appearing in input to sequence of productions to follow when the corresponding input terminal is found. This allows either the `seq` primitive, where all terminals map to the same sequence of productions, or the `alt` primitive, where each terminal may map to a different production.
123
-
124
- BRANCH = {
125
- :alt => {
126
- "(" => [:seq, :_alt_1],
127
- :ENUM => [:seq, :_alt_1],
128
- :HEX => [:seq, :_alt_1],
129
- :O_ENUM => [:seq, :_alt_1],
130
- :O_RANGE => [:seq, :_alt_1],
131
- :RANGE => [:seq, :_alt_1],
132
- :STRING1 => [:seq, :_alt_1],
133
- :STRING2 => [:seq, :_alt_1],
134
- :SYMBOL => [:seq, :_alt_1],
135
- },
136
- ...
137
- :declaration => {
138
- "@pass" => [:pass],
139
- "@terminals" => ["@terminals"],
140
- },
141
- ...
142
- }
143
-
144
- In this case the `alt` rule is `seq ('|' seq)*` can happen when any of the specified tokens appears on the input stream. The all cause the same token to be passed to the `seq` rule and follow with `_alt_1`, which handles the `('|' seq)*` portion of the rule, after the first sequence is matched.
145
-
146
- The `declaration` rule is `@terminals' | pass` using the `alt` primitive determining the production to run based on the terminal appearing on the input stream. Eventually, a terminal production is found and the token is consumed.
147
-
148
- ### First/Follow Table
149
- The [First/Follow][] table is a hash mapping production rules to the terminals that may proceed or follow the rule. For example:
150
-
151
- FIRST = {
152
- :alt => [
153
- :HEX,
154
- :SYMBOL,
155
- :ENUM,
156
- :O_ENUM,
157
- :RANGE,
158
- :O_RANGE,
159
- :STRING1,
160
- :STRING2,
161
- "("],
162
- ...
163
- }
164
-
165
- ### Terminals Table
166
- This table is a simple list of the terminal productions found in the grammar. For example:
167
-
168
- TERMINALS = ["(", ")", "-",
169
- "@pass", "@terminals",
170
- :ENUM, :HEX, :LHS, :O_ENUM, :O_RANGE,:POSTFIX,
171
- :RANGE, :STRING1, :STRING2, :SYMBOL,"|"
172
- ].freeze
173
-
174
- ### Cleanup Table
175
- This table identifies productions which used EBNF rules, which are transformed to BNF for actual parsing. This allows the parser, in some cases, to reproduce *star*, *plus*, and *opt* rule matches. For example:
176
-
177
- CLEANUP = {
178
- :_alt_1 => :star,
179
- :_alt_3 => :merge,
180
- :_diff_1 => :opt,
181
- :ebnf => :star,
182
- :_ebnf_2 => :merge,
183
- :_postfix_1 => :opt,
184
- :seq => :plus,
185
- :_seq_1 => :star,
186
- :_seq_2 => :merge,
187
- }.freeze
188
-
189
- In this case the `ebnf` rule was `(declaration | rule)*`. As BNF does not support a star operator, this is decomposed into a set of rules using `alt` and `seq` primitives:
190
-
191
- ebnf ::= _empty _ebnf_2
192
- _ebnf_1 ::= declaration | rule
193
- _ebnf_2 ::= _ebnf_1 ebnf
194
- _ebnf_3 ::= ebnf
195
-
196
- The `_empty` production matches an empty string, so allows for now value. `_ebnf_2` matches `declaration | rule` (using the `alt` primitive) followed by `ebnf`, creating a sequence of zero or more `declaration` or `alt` members.
71
+ Generate formatted grammar using HTML (requires [Haml][Haml] gem):
197
72
 
198
- ## EBNF Grammar
199
- The [EBNF][] variant used here is based on [W3C](http://w3.org/) [EBNF][] (see {file:etc/ebnf.ebnf EBNF grammar}) as defined in the
200
- [XML 1.0 recommendation](http://www.w3.org/TR/REC-xml/), with minor extensions:
73
+ grammar.to_html
74
+
75
+ ### Parsing an ISO/IEC 14977 Grammar
76
+
77
+ The EBNF gem can also parse [ISO/IEC 14977][] grammars (ISOEBNF) to [S-Expressions][S-Expression].
78
+
79
+ grammar = EBNF.parse(File.open('./etc/iso-ebnf.isoebnf'), format: :isoebnf)
80
+
81
+ ### Parsing an ABNF Grammar
82
+
83
+ The EBNF gem can also parse [ABNF] Grammars to [S-Expressions][S-Expression].
84
+
85
+ grammar = EBNF.parse(File.open('./etc/abnf.abnf'), format: :abnf)
86
+
87
+ ### Parser Debugging
88
+
89
+ Inevitably while implementing a parser for some specific grammar, a developer will need greater insight into the operation of the parser. While this can involve sorting through a tremendous amount of data, the parser can be provided a [Logger][] instance which will output messages at varying levels of detail to document the state of the parser at any given point. Most useful is likely the `INFO` level of debugging, but even more detail is revealed using the `DEBUG` level. `WARN` and `ERROR` statements will typically also be provided as part of an exception if parsing fails, but can be shown in the context of other parsing state with appropriate indentation as part of the logger.
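As a rough illustration of the logging levels mentioned above, the following sketch uses only Ruby's standard `Logger`. The helper name `sample_parser_log` and the message texts are invented stand-ins for parser output, and handing the logger to the parser is deliberately not shown, since the exact option should be checked in the {EBNF::PEG::Parser} and {EBNF::LL1::Parser} documentation.

```ruby
require 'logger'
require 'stringio'

# Hypothetical helper: emit a few parser-like messages at the given level
# and return whatever the logger actually wrote.
def sample_parser_log(level)
  buffer = StringIO.new
  logger = Logger.new(buffer)
  logger.level = level
  # Keep the output compact: severity and message only.
  logger.formatter = ->(severity, _time, _progname, msg) { "#{severity}: #{msg}\n" }

  logger.debug("trying rule ebnf at offset 0")
  logger.info("matched rule declaration")
  logger.warn("unexpected token, attempting recovery")
  buffer.string
end

puts sample_parser_log(Logger::INFO)
```

At `Logger::INFO` the `DEBUG` line is suppressed; lowering the level to `Logger::DEBUG` includes it as well.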
90
+
91
+ ### Writing Grammars
92
+
93
+ The {EBNF::Writer} class can be used to write parsed grammars out, either as formatted text, or HTML. Because grammars are written from the Abstract Syntax Tree, represented as [S-Expressions][S-Expression], this provides a means of transforming between grammar formats (e.g., W3C [EBNF][] to [ABNF][]), although with some potential loss in semantic fidelity (case-insensitive string matching vs. case-sensitive matching).
94
+
95
+ The formatted HTML results are designed to be appropriate for including in specifications.
96
+
97
+ ### Parser Errors
98
+ On a parsing failure, an exception is raised with information that may be useful in determining the source of the error.
201
99
 
202
- * Comments include `\\` and `#` through end of line (other than hex character) and `/* ... */ (* ... *) which may cross lines`
100
+ ## EBNF Grammar
101
+ The [EBNF][] variant used here is based on [W3C](https://w3.org/) [EBNF][] (see {file:etc/ebnf.ebnf EBNF grammar}) as defined in the
102
+ [XML 1.0 recommendation](https://www.w3.org/TR/REC-xml/), with minor extensions:
103
+
104
+ The character set for EBNF is UTF-8.
105
+
106
+ The general form of a rule is:
107
+
108
+ symbol ::= expression
109
+
110
+ which can also be preceded by an optional number enclosed in square brackets to identify the rule number:
111
+
112
+ [1] symbol ::= expression
113
+
114
+ (Note, this can introduce an ambiguity if the previous rule ends in a range or enum and the current rule has no identifier. In this case, enclosing `expression` within parentheses, or adding intervening comments can resolve the ambiguity.)
115
+
116
+ Symbols are written in CAPITAL CASE if they are the start symbol of a regular language (terminals); otherwise they are treated as non-terminal rules. Literal strings are quoted.
117
+
118
+ Within the expression on the right-hand side of a rule, the following expressions are used to match strings of one or more characters:
119
+
120
+ <table>
121
+ <tr><td><code>#xN</code></td>
122
+ <td>where <code>N</code> is a hexadecimal integer, the expression matches the character whose number (code point) in ISO/IEC 10646 is <code>N</code>. The number of leading zeros in the <code>#xN</code> form is insignificant.</td></tr>
123
+ <tr><td><code>[a-zA-Z], [#xN-#xN]</code></td>
124
+ <td>matches any Char or HEX with a value in the range(s) indicated (inclusive).</td></tr>
125
+ <tr><td><code>[abc], [#xN#xN#xN]</code></td>
126
+ <td>matches any UTF-8 R\_CHAR or HEX with a value among the characters enumerated. The last component may be '-'. Enumerations and ranges may be mixed in one set of brackets.</td></tr>
127
+ <tr><td><code>[^a-z], [^#xN-#xN]</code></td>
128
+ <td>matches any UTF-8 Char or HEX with a value outside the range indicated.</td></tr>
129
+ <tr><td><code>[^abc], [^#xN#xN#xN]</code></td>
130
+ <td>matches any UTF-8 R\_CHAR or HEX with a value not among the characters given. The last component may be '-'. Enumerations and ranges of excluded values may be mixed in one set of brackets.</td></tr>
131
+ <tr><td><code>"string"</code></td>
132
+ <td>matches a literal string matching that given inside the double quotes.</td></tr>
133
+ <tr><td><code>'string'</code></td>
134
+ <td>matches a literal string matching that given inside the single quotes.</td></tr>
135
+ <tr><td><code>A (B | C)</code></td>
136
+ <td><code>(B | C)</code> is treated as a unit and may be combined as described in this list.</td></tr>
137
+ <tr><td><code>A?</code></td>
138
+ <td>matches A or nothing; optional A.</td></tr>
139
+ <tr><td><code>A B</code></td>
140
+ <td>matches <code>A</code> followed by <code>B</code>. This operator has higher precedence than alternation; thus <code>A B | C D</code> is identical to <code>(A B) | (C D)</code>.</td></tr>
141
+ <tr><td><code>A | B</code></td>
142
+ <td>matches <code>A</code> or <code>B</code>.</td></tr>
143
+ <tr><td><code>A - B</code></td>
144
+ <td>matches any string that matches <code>A</code> but does not match <code>B</code>. (Only supported on Terminals in LL(1) BNF).</td></tr>
145
+ <tr><td><code>A+</code></td>
146
+ <td>matches one or more occurrences of <code>A</code>. Concatenation has higher precedence than alternation; thus <code>A+ | B+</code> is identical to <code>(A+) | (B+)</code>.</td></tr>
147
+ <tr><td><code>A*</code></td>
148
+ <td>matches zero or more occurrences of <code>A</code>. Concatenation has higher precedence than alternation; thus <code>A* | B*</code> is identical to <code>(A*) | (B*)</code>.</td></tr>
149
+ <tr><td><code>@pass " "*</code></td>
150
+ <td>Defines consumed whitespace in the document. Any whitespace found between non-terminal rules is consumed and ignored.</td></tr>
151
+ <tr><td><code>@terminals</code></td>
152
+ <td>Introduces terminal rules. All rules defined after this point are treated as terminals.</td></tr>
153
+ </table>
154
+
155
+ * Comments include `//` and `#` through end of line (other than hex character) and `/* ... */` or `(* ... *)`, which may cross lines.
203
156
  * All rules **MAY** start with an identifier, contained within square brackets. For example `[1] rule`, where the value within the brackets is a symbol `([a-z] | [A-Z] | [0-9] | "_" | ".")+`
204
- * `@terminals` causes following rules to be treated as terminals. Any terminal which are entirely upper-case are also treated as terminals
157
+ * `@terminals` causes following rules to be treated as terminals. Any rule whose symbol is all upper-case (e.g., `TERMINAL`), or any rule with an expression that matches characters (`#xN`, `[a-z]`, `[^a-z]`, `[abc]`, `[^abc]`, `"string"`, `'string'`, or `A - B`), is also treated as a terminal.
205
158
  * `@pass` defines the expression used to detect whitespace, which is removed in processing.
206
159
  * No support for `wfc` (well-formedness constraint) or `vc` (validity constraint).
207
160
 
208
- Parsing this grammar yields an S-Expression version: {file:etc/ebnf.ll1.sxp}.
161
+ Parsing this grammar yields an [S-Expression][] version: {file:etc/ebnf.sxp} (or [LL(1)][] version {file:etc/ebnf.ll1.sxp} or [PEG][] version {file:etc/ebnf.peg.sxp}).
162
+
163
+ ### Parser S-Expressions
164
+ Intermediate representations of the grammar may be serialized to Lisp-like [S-Expressions][S-Expression]. For example, the rule
165
+
166
+ [1] ebnf ::= (declaration | rule)*
167
+
168
+ is serialized as
169
+
170
+ (rule ebnf "1" (star (alt declaration rule)))
171
+
172
+ Different components of an EBNF rule expression are transformed into their own operator:
173
+
174
+ <table>
175
+ <tr><td><code>#xN</code></td><td><code>(hex "#xN")</code></td></tr>
176
+ <tr><td><code>[a-z#xN-#xN]</code></td><td><code>(range "a-z#xN-#xN")</code></td></tr>
177
+ <tr><td><code>[abc#xN]</code></td><td><code>(range "abc#xN")</code></td></tr>
178
+ <tr><td><code>[^a-z#xN-#xN]</code></td><td><code>(range "^a-z#xN-#xN")</code></td></tr>
179
+ <tr><td><code>[^abc#xN]</code></td><td><code>(range "^abc#xN")</code></td></tr>
180
+ <tr><td><code>"string"</code></td><td><code>"string"</code></td></tr>
181
+ <tr><td><code>'string'</code></td><td><code>"string"</code></td></tr>
182
+ <tr><td><code>A (B | C)</code></td><td><code>(seq A (alt B C))</code></td></tr>
183
+ <tr><td><code>A?</code></td><td><code>(opt A)</code></td></tr>
184
+ <tr><td><code>A B</code></td><td><code>(seq A B)</code></td></tr>
185
+ <tr><td><code>A | B</code></td><td><code>(alt A B)</code></td></tr>
186
+ <tr><td><code>A - B</code></td>
187
+ <td><code>(diff A B)</code> for terminals.<br/>
188
+ <code>(seq (not B) A)</code> for non-terminals (PEG parsing only)</td></tr>
189
+ <tr><td><code>A+</code></td><td><code>(plus A)</code></td></tr>
190
+ <tr><td><code>A*</code></td><td><code>(star A)</code></td></tr>
191
+ <tr><td><code>@pass " "*</code></td><td><code>(pass _pass (star " "))</code></td></tr>
192
+ <tr><td><code>@terminals</code></td><td></td></tr>
193
+ </table>
194
+
195
+ Other rule operators are not directly supported in [EBNF][], but are included to support other notations (e.g., [ABNF][] and [ISO/IEC 14977][]):
196
+
197
+ <table>
198
+ <tr><td><code>%i"StRiNg"</code></td><td><code>(istr "StRiNg")</code></td><td>Case-insensitive string matching</td></tr>
199
+ <tr><td><code>'' - A</code></td><td><code>(not A)</code></td><td>Negative look-ahead, used for non-terminal uses of <code>B - A</code>.</td></tr>
200
+ <tr><td><code>n*mA</code></td><td><code>(rept n m A)</code></td><td>Explicit repetition.</td></tr>
201
+ </table>
202
+
203
+ Additionally, rules defined with an UPPERCASE symbol are treated as terminals.
204
+
205
+ For an [LL(1)][] parser generator, the {EBNF::BNF.make_bnf} method can be used to transform the EBNF rule into a BNF rule.
206
+
207
+ (rule ebnf "1" (alt _empty _ebnf_2))
208
+ (rule _ebnf_1 "1.1" (alt declaration rule))
209
+ (rule _ebnf_2 "1.2" (seq _ebnf_1 ebnf))
210
+ (rule _ebnf_3 "1.3" (seq ebnf))
211
+
212
+ This allows [First/Follow][] and other tables used by a parser to parse examples of the associated grammar. For more, see {EBNF::LL1}.
213
+
214
+ For a [PEG][] parser generator, there is a simpler transformation that reduces rules containing sub-expressions (composed of `star`, `alt`, `seq` and similar expressions) and creates named rules to allow appropriate callbacks and for naming elements of the generated abstract syntax tree. The {EBNF::PEG.make_peg} method transforms the original rule into the following two rules:
215
+
216
+ (rule ebnf "1" (star _ebnf_1))
217
+ (rule _ebnf_1 "1.1" (alt declaration rule))
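The sub-rule extraction shown in the two rules above can be approximated with a short recursive function. This is a sketch under assumptions, not the gem's `make_peg`: the name `extract_subrules`, the array-based expression format and the naming of deeper nested sub-rules are all hypothetical and may differ from what the gem produces.

```ruby
# Extract nested sub-expressions of a rule into named sub-rules.
# An expression is either a Symbol/String (a reference or literal) or an
# Array of the form [operator, operand, ...].
def extract_subrules(name, expr)
  rules = []
  counter = 0
  op, *operands = expr
  flattened = operands.map do |operand|
    # Plain references and literals stay where they are.
    next operand unless operand.is_a?(Array)
    counter += 1
    sub_name = :"_#{name}_#{counter}"
    rules.concat(extract_subrules(sub_name, operand))
    sub_name
  end
  [[name, [op, *flattened]], *rules]
end

rules = extract_subrules(:ebnf, [:star, [:alt, :declaration, :rule]])
rules.each { |rule_name, rule_expr| puts "#{rule_name}: #{rule_expr.inspect}" }
```

For the `ebnf` rule this yields a `star` over `_ebnf_1` plus a separate `_ebnf_1` rule holding the `alt`, mirroring the listing above; giving every sub-expression a name is what allows a parser to attach a callback to it.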
218
+
219
+ ## Example parsers
220
+ For a [PEG][] parser for a simple grammar implementing a calculator see [Calc example](https://dryruby.github.io/ebnf/examples/calc/doc/calc.html)
221
+
222
+ For an example parser built using this gem that parses the [EBNF][] grammar, see [EBNF PEG Parser example](https://dryruby.github.io/ebnf/examples/ebnf-peg-parser/doc/parser.html). This example creates a parser for the [EBNF][] grammar which generates the same Abstract Syntax Tree as the built-in parser in the gem.
223
+
224
+ There is also an
225
+ [EBNF LL(1) Parser example](https://dryruby.github.io/ebnf/examples/ebnf-ll1-parser/doc/parser.html).
226
+
227
+ The [ISO EBNF Parser](https://dryruby.github.io/ebnf/examples/isoebnf/doc/parser.html) example parses [ISO/IEC 14977][] into [S-Expressions][S-Expression], which can be used to parse compatible grammars using this parser (either [PEG][] or [LL(1)][]).
209
228
 
210
- ## Example parser
211
- For an example parser built using this gem, see {file:examples/ebnf-parser/README EBNF Parser example}. This example creates a parser for the [EBNF][] grammar which generates the same Abstract Syntax Tree as the built-in parser in the gem.
229
+ The [ABNF Parser](https://dryruby.github.io/ebnf/examples/abnf/doc/parser.html) example parses [ABNF][] into [S-Expressions][S-Expression], which can be used to parse compatible grammars using this [PEG][] parser.
212
230
 
213
231
  ## Acknowledgements
214
232
  Much of this work, particularly the generic parser, is inspired by work originally done by
215
- Tim Berners-Lee's Python [predictive parser](http://www.w3.org/2000/10/swap/grammar/predictiveParser.py).
233
+ Tim Berners-Lee's Python [predictive parser](https://www.w3.org/2000/10/swap/grammar/predictiveParser.py).
216
234
 
217
- The EBNF parser was inspired by Dan Connolly's
218
- [EBNF to Turtle processor](http://www.w3.org/2000/10/swap/grammar/ebnf2turtle.py),
219
- [EBNF to BNF Notation-3 rules](http://www.w3.org/2000/10/swap/grammar/ebnf2bnf.n3),
220
- and [First Follow Notation-3 rules](http://www.w3.org/2000/10/swap/grammar/first_follow.n3).
235
+ The [LL(1)][] parser was inspired by Dan Connolly's
236
+ [EBNF to Turtle processor](https://www.w3.org/2000/10/swap/grammar/ebnf2turtle.py),
237
+ [EBNF to BNF Notation-3 rules](https://www.w3.org/2000/10/swap/grammar/ebnf2bnf.n3),
238
+ and [First Follow Notation-3 rules](https://www.w3.org/2000/10/swap/grammar/first_follow.n3).
221
239
 
222
240
  ## Documentation
223
241
  Full documentation available on [Rubydoc.info][EBNF doc].
224
242
 
225
243
  ## Future Work
226
244
  * Better LL(1) parser tests
227
- * Either generate [Packrat parser][Packrat] for a [Parsing Regular Expression Grammar][PEG], or integrate with [Treetop][] or similar.
228
245
 
229
246
  ## Author
230
- * [Gregg Kellogg](http://github.com/gkellogg) - <http://greggkellogg.net/>
247
+ * [Gregg Kellogg](https://github.com/gkellogg) - <https://greggkellogg.net/>
231
248
 
232
249
  ## Contributing
233
250
  This repository uses [Git Flow](https://github.com/nvie/gitflow) to manage development and release activity. All submissions _must_ be on a feature branch based on the _develop_ branch to ease staging and integration.
@@ -246,22 +263,27 @@ This repository uses [Git Flow](https://github.com/nvie/gitflow) to mange develo
246
263
 
247
264
  ## License
248
265
  This is free and unencumbered public domain software. For more information,
249
- see <http://unlicense.org/> or the accompanying {file:UNLICENSE} file.
250
-
251
- A copy of the [Turtle EBNF][] and derived parser files are included in the repository, which are not covered under the UNLICENSE. These files are covered via the [W3C Document License](http://www.w3.org/Consortium/Legal/2002/copyright-documents-20021231).
252
-
253
- [Ruby]: http://ruby-lang.org/
254
- [YARD]: http://yardoc.org/
255
- [YARD-GS]: http://rubydoc.info/docs/yard/file/docs/GettingStarted.md
256
- [PDD]: http://lists.w3.org/Archives/Public/public-rdf-ruby/2010May/0013.html
257
- [EBNF]: http://www.w3.org/TR/REC-xml/#sec-notation
258
- [EBNF doc]: http://rubydoc.info/github/gkellogg/ebnf/master/frames
259
- [First/Follow]: http://en.wikipedia.org/wiki/LL_parser#Constructing_an_LL.281.29_parsing_table
260
- [LL(1)]: http://www.csd.uwo.ca/~moreno//CS447/Lectures/Syntax.html/node14.html
261
- [LL(1) Parser]: http://en.wikipedia.org/wiki/LL_parser
262
- [Tokenizer]: http://en.wikipedia.org/wiki/Lexical_analysis#Tokenizer
263
- [Turtle EBNF]: http://dvcs.w3.org/hg/rdf/file/default/rdf-turtle/turtle.bnf
264
- [Packrat]: http://pdos.csail.mit.edu/~baford/packrat/thesis/
265
- [PEG]: http://en.wikipedia.org/wiki/Parsing_expression_grammar
266
- [Treetop]: http://rubygems.org/gems/treetop
267
- [Haml]: http://rubygems.org/gems/haml
266
+ see <https://unlicense.org/> or the accompanying {file:UNLICENSE} file.
267
+
268
+ A copy of the [Turtle EBNF][] and derived parser files are included in the repository, which are not covered under the UNLICENSE. These files are covered via the [W3C Document License](https://www.w3.org/Consortium/Legal/2002/copyright-documents-20021231).
269
+
270
+ [Ruby]: https://ruby-lang.org/
271
+ [YARD]: https://yardoc.org/
272
+ [YARD-GS]: https://rubydoc.info/docs/yard/file/docs/GettingStarted.md
273
+ [PDD]: https://lists.w3.org/Archives/Public/public-rdf-ruby/2010May/0013.html
274
+ [ABNF]: https://www.rfc-editor.org/rfc/rfc5234
275
+ [BNF]: https://en.wikipedia.org/wiki/Backus–Naur_form
276
+ [EBNF]: https://www.w3.org/TR/REC-xml/#sec-notation
277
+ [EBNF doc]: https://rubydoc.info/github/dryruby/ebnf
278
+ [First/Follow]: https://en.wikipedia.org/wiki/LL_parser#Constructing_an_LL.281.29_parsing_table
279
+ [ISO/IEC 14977]:https://www.iso.org/standard/26153.html
280
+ [LL(1)]: https://www.csd.uwo.ca/~moreno//CS447/Lectures/Syntax.html/node14.html
281
+ [LL(1) Parser]: https://en.wikipedia.org/wiki/LL_parser
282
+ [Logger]: https://ruby-doc.org/stdlib-2.4.0/libdoc/logger/rdoc/Logger.html
283
+ [S-expression]: https://en.wikipedia.org/wiki/S-expression
284
+ [Tokenizer]: https://en.wikipedia.org/wiki/Lexical_analysis#Tokenizer
285
+ [Turtle]: https://www.w3.org/TR/2012/WD-turtle-20120710/
286
+ [Turtle EBNF]: https://dvcs.w3.org/hg/rdf/file/default/rdf-turtle/turtle.bnf
287
+ [Packrat]: https://pdos.csail.mit.edu/~baford/packrat/thesis/
288
+ [PEG]: https://en.wikipedia.org/wiki/Parsing_expression_grammar
289
+ [Haml]: https://rubygems.org/gems/haml