ebnf 1.0.2 → 1.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: b7016a2ce8e6c7b23ec669880a9b3f3ddd381793
4
- data.tar.gz: 625460a7f6f2c8155960fc5a8640e4b0d854d1cd
3
+ metadata.gz: 5b0233cc19d80ca25dc2221725770bc87a47d0f0
4
+ data.tar.gz: e381edc76658d0f56816c4ddc107417df301c9cc
5
5
  SHA512:
6
- metadata.gz: e09689686d97b44c8845f66129a313006a7264722eb9694b35a675edf2c369b7795f7b3ca6bb6ee82c7567d3be4a752c74b4388b1d57e3f242759eec47e37199
7
- data.tar.gz: 6aa62a952f16d3ccd018397a6652b65d3053f68b00c0d0ee767cae3e4686d659e7c5f7e94d89a80a10fdab271ad45e3a733e780372379be5aa7a611ef71b6e46
6
+ metadata.gz: 18415d7b3393069f09d0af3c6433722c3aadfd4f3b5eca67c763294dd81584c9e435eb2aab6e27d15cf557f1d8587bfc67c3671e64797523340f06560c42f27c
7
+ data.tar.gz: bc6917f74c4420facfee72e7089d225545e49d201653731dd1caa28eb71f547b3c8231b44175b9144f2b7c9d4ac979088d4647c487e885adc624c046c9c595fd
data/README.md CHANGED
@@ -8,12 +8,20 @@
8
8
  [![Dependency Status](https://gemnasium.com/gkellogg/ebnf.png)](https://gemnasium.com/gkellogg/ebnf)
9
9
 
10
10
  ## Description
11
- This is a [Ruby][] implementation of an [EBNF][] and [BNF][] parser and parser generator.
12
- It parses [EBNF][] grammars to [BNF][], generates [First/Follow and Branch][] tables for
13
- [LL(1)][] grammars, which can be used with the stream [Tokenizer][] and [LL(1) Parser][].
11
+ This is a [Ruby][] implementation of an [EBNF][] and [BNF][] parser and parser generator. It parses [EBNF][] grammars to [BNF][], generates [First/Follow][] and Branch tables for [LL(1)][] grammars, which can be used with the stream [Tokenizer][] and [LL(1) Parser][].
14
12
 
15
- Of note in this implementation is that the tokenizer and parser are streaming, so that they can
16
- process inputs of arbitrary size.
13
+ As LL(1) grammars operate using `alt` and `seq` primitives, allowing for a match on alternative productions or a sequence of productions, generating a parser requires turning the EBNF rules into BNF:
14
+
15
+ * Transform `a ::= b?` into `a ::= _empty | b`
16
+ * Transform `a ::= b+` into `a ::= b b*`
17
+ * Transform `a ::= b*` into `a ::= _empty | (b a)`
18
+ * Transform `a ::= op1 (op2)` into two rules:
19
+ ```
20
+ a ::= op1 _a_1
21
+ _a_1 ::= op2
22
+ ```
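The rewrite rules listed above can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: `ebnf_to_bnf` is a hypothetical helper, not part of the gem's API, rules are modelled as nested arrays, and the `_a_1` name for the helper rule in the `plus` case is an assumed naming scheme.

```ruby
# Hypothetical helper (not part of the ebnf gem): rewrite one EBNF rule,
# given as a nested-array S-expression, into BNF rules using only alt and seq.
def ebnf_to_bnf(name, expr)
  op, arg = expr
  case op
  when :opt   # a ::= b?  becomes  a ::= _empty | b
    [[name, [:alt, :_empty, arg]]]
  when :star  # a ::= b*  becomes  a ::= _empty | (b a)
    [[name, [:alt, :_empty, [:seq, arg, name]]]]
  when :plus  # a ::= b+  becomes  a ::= b _a_1  with  _a_1 ::= b*
    rest = :"_#{name}_1"
    [[name, [:seq, arg, rest]]] + ebnf_to_bnf(rest, [:star, arg])
  else        # already expressed with BNF primitives: leave unchanged
    [[name, expr]]
  end
end

ebnf_to_bnf(:a, [:opt, :b])
# => [[:a, [:alt, :_empty, :b]]]
```

The `plus` case reuses the `star` case for the trailing `b*`, which mirrors how the list above defines `b+` in terms of `b*`.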
23
+
24
+ Of note in this implementation is that the tokenizer and parser are streaming, so that they can process inputs of arbitrary size.
17
25
 
18
26
  ## Usage
19
27
  ### Parsing an LL(1) Grammar
@@ -36,7 +44,7 @@ Generate [First/Follow][] rules for BNF grammars
36
44
 
37
45
  ebnf.first_follow(start_tokens)
38
46
 
39
- Generate Terminal, [First/Follow and Branch][] tables as Ruby for parsing grammars
47
+ Generate Terminal, [First/Follow][], Cleanup and Branch tables as Ruby for parsing grammars
40
48
 
41
49
  ebnf.to_ruby
42
50
 
@@ -44,8 +52,29 @@ Generate formatted grammar using HTML (requires [Haml][Haml] gem)
44
52
 
45
53
  ebnf.to_html
46
54
 
47
- ### Creating terminal definitions and parser rules to parse generated grammars
55
+ ### Parser S-Expressions
56
+ Intermediate representations of the grammar may be serialized to Lisp-like S-Expressions. For example, the rule `[1] ebnf ::= (declaration | rule)*` is serialized as `(rule ebnf "1" (star (alt declaration rule)))`.
57
+
58
+ Once the [LL(1)][] conversion is made and the [First/Follow][] table is generated, this rule expands as follows:
59
+
60
+ (rule ebnf "1"
61
+ (start #t)
62
+ (first "@pass" "@terminals" LHS _eps)
63
+ (follow _eof)
64
+ (cleanup star)
65
+ (alt _empty _ebnf_2))
66
+ (rule _ebnf_1 "1.1"
67
+ (first "@pass" "@terminals" LHS)
68
+ (follow "@pass" "@terminals" LHS _eof)
69
+ (alt declaration rule))
70
+ (rule _ebnf_2 "1.2"
71
+ (first "@pass" "@terminals" LHS)
72
+ (follow _eof)
73
+ (cleanup merge)
74
+ (seq _ebnf_1 ebnf))
75
+ (rule _ebnf_3 "1.3" (first "@pass" "@terminals" LHS _eps) (follow _eof) (seq ebnf))
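To make the serialization concrete, the following sketch renders a nested Ruby array as an S-Expression string. `to_sxp` here is a hypothetical stand-alone helper written for illustration; it is not claimed to be how the gem produces its output.

```ruby
# Hypothetical helper: render nested arrays as a Lisp-like S-Expression.
# Arrays become parenthesized lists, strings are quoted, symbols are bare.
def to_sxp(expr)
  case expr
  when Array  then "(" + expr.map { |e| to_sxp(e) }.join(" ") + ")"
  when String then expr.inspect
  else expr.to_s
  end
end

puts to_sxp([:rule, :ebnf, "1", [:star, [:alt, :declaration, :rule]]])
# prints: (rule ebnf "1" (star (alt declaration rule)))
```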
48
76
 
77
+ ### Creating terminal definitions and parser rules to parse generated grammars
49
78
  The parser is initialized to callbacks invoked on entry and exit
50
79
  to each `terminal` and `production`. A trivial parser loop can be described as follows:
51
80
 
@@ -76,9 +105,10 @@ to each `terminal` and `production`. A trivial parser loop can be described as f
76
105
 
77
106
  def initialize(input)
78
107
  parser_options = {
79
- :branch => BRANCH,
80
- :first => FIRST,
81
- :follow => FOLLOW
108
+ branch: BRANCH,
109
+ first: FIRST,
110
+ follow: FOLLOW,
111
+ cleanup: CLEANUP
82
112
  }
83
113
  parse(input, start_symbol, parser_options) do |context, *data|
84
114
  # Process calls from callback from productions
@@ -88,10 +118,92 @@ to each `terminal` and `production`. A trivial parser loop can be described as f
88
118
  raise RDF::ReaderError, e.message if validate?
89
119
  end
90
120
 
121
+ ### Branch Table
122
+ The Branch table is a hash mapping each production rule to a hash that relates terminals appearing in the input to the sequence of productions to follow when that terminal is found. This allows either the `seq` primitive, where all terminals map to the same sequence of productions, or the `alt` primitive, where each terminal may map to a different production.
123
+
124
+ BRANCH = {
125
+ :alt => {
126
+ "(" => [:seq, :_alt_1],
127
+ :ENUM => [:seq, :_alt_1],
128
+ :HEX => [:seq, :_alt_1],
129
+ :O_ENUM => [:seq, :_alt_1],
130
+ :O_RANGE => [:seq, :_alt_1],
131
+ :RANGE => [:seq, :_alt_1],
132
+ :STRING1 => [:seq, :_alt_1],
133
+ :STRING2 => [:seq, :_alt_1],
134
+ :SYMBOL => [:seq, :_alt_1],
135
+ },
136
+ ...
137
+ :declaration => {
138
+ "@pass" => [:pass],
139
+ "@terminals" => ["@terminals"],
140
+ },
141
+ ...
142
+ }
143
+
144
+ In this case the `alt` rule, `seq ('|' seq)*`, can be entered when any of the specified tokens appears on the input stream. They all cause the same token to be passed to the `seq` rule, followed by `_alt_1`, which handles the `('|' seq)*` portion of the rule after the first sequence is matched.
145
+
146
+ The `declaration` rule is `'@terminals' | pass`, using the `alt` primitive to determine the production to run based on the terminal appearing on the input stream. Eventually, a terminal production is found and the token is consumed.
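A table of this shape can drive a predictive parse with a simple stack. The sketch below is a minimal illustration, not the gem's parser: the toy grammar, the `TOY_BRANCH` constant, the `ll1_accepts?` method, and the explicit `:_eof` entry mapping to an empty sequence are all assumptions made for the example.

```ruby
# Hypothetical toy grammar (not taken from the gem):
#   list    ::= ITEM _list_1
#   _list_1 ::= _empty | "," list
TOY_BRANCH = {
  :list    => { :ITEM => [:ITEM, :_list_1] },
  :_list_1 => { "," => [",", :list], :_eof => [] },
}.freeze

# Keep a stack of pending symbols. A production is replaced by the sequence
# its branch table gives for the next token; a terminal must equal the next
# token, which is then consumed.
def ll1_accepts?(tokens, start)
  input = tokens + [:_eof]
  stack = [start]
  until stack.empty?
    top = stack.pop
    if TOY_BRANCH.key?(top)
      sequence = TOY_BRANCH[top][input.first]
      return false if sequence.nil?   # no branch for this token
      stack.concat(sequence.reverse)  # leftmost symbol ends up on top
    elsif top == input.first
      input.shift                     # terminal matched and consumed
    else
      return false
    end
  end
  input == [:_eof]
end

ll1_accepts?([:ITEM, ",", :ITEM], :list)  # => true
ll1_accepts?([:ITEM, :ITEM], :list)       # => false
```

Because each lookup needs only the current production and one token of lookahead, the input can be consumed as a stream.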
147
+
148
+ ### First/Follow Table
149
+ The [First/Follow][] table is a hash mapping production rules to the terminals that may begin (First) or follow (Follow) the rule. For example:
150
+
151
+ FIRST = {
152
+ :alt => [
153
+ :HEX,
154
+ :SYMBOL,
155
+ :ENUM,
156
+ :O_ENUM,
157
+ :RANGE,
158
+ :O_RANGE,
159
+ :STRING1,
160
+ :STRING2,
161
+ "("],
162
+ ...
163
+ }
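For readers unfamiliar with how such a table comes about, here is a minimal sketch of the standard fixed-point computation of First sets. The toy grammar, the `TOY_GRAMMAR` constant, and the `first_sets` method are hypothetical and written for this example; the gem's own `first_follow` implementation may differ.

```ruby
require 'set'

# Hypothetical toy BNF: each production maps to a list of alternatives, each
# alternative is a sequence of symbols. Symbols that are not keys are terminals.
TOY_GRAMMAR = {
  :list    => [[:ITEM, :_list_1]],
  :_list_1 => [[:_eps], [",", :list]],
}.freeze

# Repeat until no First set grows any further.
def first_sets(grammar)
  first = Hash.new { |h, k| h[k] = Set.new }
  changed = true
  while changed
    changed = false
    grammar.each do |prod, alternatives|
      alternatives.each do |seq|
        before = first[prod].size
        nullable_prefix = true
        seq.each do |sym|
          if grammar.key?(sym)
            # A production contributes its First set, minus the empty marker
            first[prod].merge(first[sym] - [:_eps])
            nullable_prefix = first[sym].include?(:_eps)
          elsif sym != :_eps
            first[prod] << sym        # a terminal starts the sequence
            nullable_prefix = false
          end
          break unless nullable_prefix
        end
        # Every symbol could be empty, so the whole sequence could be empty
        first[prod] << :_eps if nullable_prefix
        changed ||= first[prod].size != before
      end
    end
  end
  first
end

first_sets(TOY_GRAMMAR)[:list].to_a  # => [:ITEM]
```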
164
+
165
+ ### Terminals Table
166
+ This table is a simple list of the terminal productions found in the grammar. For example:
167
+
168
+ TERMINALS = ["(", ")", "-",
169
+ "@pass", "@terminals",
170
+ :ENUM, :HEX, :LHS, :O_ENUM, :O_RANGE,:POSTFIX,
171
+ :RANGE, :STRING1, :STRING2, :SYMBOL,"|"
172
+ ].freeze
173
+
174
+ ### Cleanup Table
175
+ This table identifies productions which used EBNF rules, which are transformed to BNF for actual parsing. This allows the parser, in some cases, to reproduce *star*, *plus*, and *opt* rule matches. For example:
176
+
177
+ CLEANUP = {
178
+ :_alt_1 => :star,
179
+ :_alt_3 => :merge,
180
+ :_diff_1 => :opt,
181
+ :ebnf => :star,
182
+ :_ebnf_2 => :merge,
183
+ :_postfix_1 => :opt,
184
+ :seq => :plus,
185
+ :_seq_1 => :star,
186
+ :_seq_2 => :merge,
187
+ }.freeze
188
+
189
+ In this case the `ebnf` rule was `(declaration | rule)*`. As BNF does not support a star operator, this is decomposed into a set of rules using `alt` and `seq` primitives:
190
+
191
+ ebnf ::= _empty | _ebnf_2
192
+ _ebnf_1 ::= declaration | rule
193
+ _ebnf_2 ::= _ebnf_1 ebnf
194
+ _ebnf_3 ::= ebnf
195
+
196
+ The `_empty` production matches an empty string, so allows for no value. `_ebnf_2` matches `declaration | rule` (using the `alt` primitive) followed by `ebnf`, creating a sequence of zero or more `declaration` or `rule` members.
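One consequence of this decomposition is that a naive match of the BNF rules is right-nested rather than flat. The sketch below shows one way such a result could be merged back into a flat list, which is the kind of reconstruction the Cleanup table makes possible. `flatten_star` and the nested-array match format are assumptions for illustration, not the gem's internal representation.

```ruby
# Hypothetical: a right-nested match of ebnf ::= _empty | (_ebnf_1 ebnf),
# where each level is [member, rest] and the empty match is [].
def flatten_star(match)
  items = []
  until match.empty?
    head, match = match  # split off the first member, continue with the rest
    items << head
  end
  items
end

flatten_star([:declaration_1, [:rule_1, [:rule_2, []]]])
# => [:declaration_1, :rule_1, :rule_2]
```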
91
197
 
92
198
  ## EBNF Grammar
93
199
  The [EBNF][] variant used here is based on [W3C](http://w3.org/) [EBNF][] (see {file:etc/ebnf.ebnf EBNF grammar}) as defined in the
94
- [XML 1.0 recommendation](http://www.w3.org/TR/REC-xml/), with minor extensions.
200
+ [XML 1.0 recommendation](http://www.w3.org/TR/REC-xml/), with minor extensions:
201
+
202
+ * Comments include `//` and `#` through end of line (other than a hex character) and `/* ... */` or `(* ... *)`, which may cross lines
203
+ * All rules **MAY** start with an identifier, contained within square brackets. For example `[1] rule`, where the value within the brackets is a symbol `([a-z] | [A-Z] | [0-9] | "_" | ".")+`
204
+ * `@terminals` causes following rules to be treated as terminals. Any symbols which are entirely upper-case are also treated as terminals
205
+ * `@pass` defines the expression used to detect whitespace, which is removed in processing.
206
+ * No support for `wfc` (well-formedness constraint) or `vc` (validity constraint).
95
207
 
96
208
  Parsing this grammar yields an S-Expression version: {file:etc/ebnf.ll1.sxp}.
97
209
 
data/VERSION CHANGED
@@ -1 +1 @@
1
- 1.0.2
1
+ 1.1.0
data/bin/ebnf CHANGED
@@ -12,9 +12,9 @@ require 'getoptlong'
12
12
  require 'ebnf'
13
13
 
14
14
  options = {
15
- :output_format => :sxp,
16
- :prefix => "ttl",
17
- :namespace => "http://www.w3.org/ns/formats/Turtle#",
15
+ output_format: :sxp,
16
+ prefix: "ttl",
17
+ namespace: "http://www.w3.org/ns/formats/Turtle#",
18
18
  }
19
19
 
20
20
  input, out = nil, STDOUT
@@ -20,6 +20,8 @@
20
20
 
21
21
  [9] primary ::= HEX
22
22
  | SYMBOL
23
+ | ENUM
24
+ | O_ENUM
23
25
  | RANGE
24
26
  | O_RANGE
25
27
  | STRING1
@@ -36,29 +38,33 @@
36
38
 
37
39
  [13] HEX ::= '#x' ([0-9]|[a-f]|[A-F])+
38
40
 
41
+ [14] ENUM ::= '[' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR)) '-' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR)) ']'
42
+
43
+ [15] O_ENUM ::= '[^' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR)) '-' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR)) ']'
44
+
39
45
  # Range is any combination of R_CHAR '-' R_CHAR or R_CHAR+
40
- [14] RANGE ::= '[' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR))+ ']'
46
+ [16] RANGE ::= '[' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR))+ ']'
41
47
 
42
48
  # Range is any combination of R_CHAR '-' R_CHAR or R_CHAR+ preceded by ^
43
- [15] O_RANGE ::= '[^' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR))+ ']'
49
+ [17] O_RANGE ::= '[^' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR))+ ']'
44
50
 
45
51
  # Strings are unescaped Unicode, excepting control characters and hash (#)
46
- [16] STRING1 ::= '"' (CHAR - '"')* '"'
52
+ [18] STRING1 ::= '"' (CHAR - '"')* '"'
47
53
 
48
- [17] STRING2 ::= "'" (CHAR - "'")* "'"
54
+ [19] STRING2 ::= "'" (CHAR - "'")* "'"
49
55
 
50
- [18] CHAR ::= HEX
56
+ [20] CHAR ::= HEX
51
57
  | [#x20#x21#x22]
52
58
  | [#x24-#x00FFFFFF]
53
59
 
54
- [19] R_CHAR ::= CHAR - ']'
60
+ [21] R_CHAR ::= CHAR - ']'
55
61
 
56
- [20] R_BEGIN ::= (HEX | R_CHAR) "-"
62
+ [22] R_BEGIN ::= (HEX | R_CHAR) "-"
57
63
 
58
64
  # Should be able to do this inline, but not until terminal regular expressions are created automatically
59
- [21] POSTFIX ::= [?*+]
65
+ [23] POSTFIX ::= [?*+]
60
66
 
61
- [22] PASS ::= ( [#x00-#x20]
67
+ [24] PASS ::= ( [#x00-#x20]
62
68
  | ( '#' | '//' ) [^#x0A#x0D]*
63
69
  | '/*' (( '*' [^/] )? | [^*] )* '*/'
64
70
  | '(*' (( '*' [^)] )? | [^*] )* '*)'
@@ -76,6 +76,8 @@
76
76
  <td>
77
77
  <a href="#grammar-production-HEX">HEX</a>
78
78
  <code>|</code> <a href="#grammar-production-SYMBOL">SYMBOL</a>
79
+ <code>|</code> <a href="#grammar-production-ENUM">ENUM</a>
80
+ <code>|</code> <a href="#grammar-production-O_ENUM">O_ENUM</a>
79
81
  <code>|</code> <a href="#grammar-production-RANGE">RANGE</a>
80
82
  <code>|</code> <a href="#grammar-production-O_RANGE">O_RANGE</a>
81
83
  <code>|</code> <a href="#grammar-production-STRING1">STRING1</a>
@@ -119,8 +121,32 @@
119
121
  (<code>[</code> <code class="grammar-literal">0-9</code><code>]</code> <code>|</code> <code>[</code> <code class="grammar-literal">a-f</code><code>]</code> <code>|</code> <code>[</code> <code class="grammar-literal">A-F</code><code>]</code> )<code>+</code>
120
122
  </td>
121
123
  </tr>
122
- <tr id='grammar-production-RANGE'>
124
+ <tr id='grammar-production-ENUM'>
123
125
  <td>[14]</td>
126
+ <td><code>ENUM</code></td>
127
+ <td>::=</td>
128
+ <td>
129
+ "<code class="grammar-literal">[</code>"
130
+ <code>(</code> <a href="#grammar-production-R_BEGIN">R_BEGIN</a> <code>(</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code> <code>|</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code>
131
+ "<code class="grammar-literal">-</code>"
132
+ <code>(</code> <a href="#grammar-production-R_BEGIN">R_BEGIN</a> <code>(</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code> <code>|</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code>
133
+ "<code class="grammar-literal">]</code>"
134
+ </td>
135
+ </tr>
136
+ <tr id='grammar-production-O_ENUM'>
137
+ <td>[15]</td>
138
+ <td><code>O_ENUM</code></td>
139
+ <td>::=</td>
140
+ <td>
141
+ "<code class="grammar-literal">[^</code>"
142
+ <code>(</code> <a href="#grammar-production-R_BEGIN">R_BEGIN</a> <code>(</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code> <code>|</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code>
143
+ "<code class="grammar-literal">-</code>"
144
+ <code>(</code> <a href="#grammar-production-R_BEGIN">R_BEGIN</a> <code>(</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code> <code>|</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code>
145
+ "<code class="grammar-literal">]</code>"
146
+ </td>
147
+ </tr>
148
+ <tr id='grammar-production-RANGE'>
149
+ <td>[16]</td>
124
150
  <td><code>RANGE</code></td>
125
151
  <td>::=</td>
126
152
  <td>
@@ -130,7 +156,7 @@
130
156
  </td>
131
157
  </tr>
132
158
  <tr id='grammar-production-O_RANGE'>
133
- <td>[15]</td>
159
+ <td>[17]</td>
134
160
  <td><code>O_RANGE</code></td>
135
161
  <td>::=</td>
136
162
  <td>
@@ -140,7 +166,7 @@
140
166
  </td>
141
167
  </tr>
142
168
  <tr id='grammar-production-STRING1'>
143
- <td>[16]</td>
169
+ <td>[18]</td>
144
170
  <td><code>STRING1</code></td>
145
171
  <td>::=</td>
146
172
  <td>
@@ -150,7 +176,7 @@
150
176
  </td>
151
177
  </tr>
152
178
  <tr id='grammar-production-STRING2'>
153
- <td>[17]</td>
179
+ <td>[19]</td>
154
180
  <td><code>STRING2</code></td>
155
181
  <td>::=</td>
156
182
  <td>
@@ -160,7 +186,7 @@
160
186
  </td>
161
187
  </tr>
162
188
  <tr id='grammar-production-CHAR'>
163
- <td>[18]</td>
189
+ <td>[20]</td>
164
190
  <td><code>CHAR</code></td>
165
191
  <td>::=</td>
166
192
  <td>
@@ -170,7 +196,7 @@
170
196
  </td>
171
197
  </tr>
172
198
  <tr id='grammar-production-R_CHAR'>
173
- <td>[19]</td>
199
+ <td>[21]</td>
174
200
  <td><code>R_CHAR</code></td>
175
201
  <td>::=</td>
176
202
  <td>
@@ -179,7 +205,7 @@
179
205
  </td>
180
206
  </tr>
181
207
  <tr id='grammar-production-R_BEGIN'>
182
- <td>[20]</td>
208
+ <td>[22]</td>
183
209
  <td><code>R_BEGIN</code></td>
184
210
  <td>::=</td>
185
211
  <td>
@@ -188,7 +214,7 @@
188
214
  </td>
189
215
  </tr>
190
216
  <tr id='grammar-production-POSTFIX'>
191
- <td>[21]</td>
217
+ <td>[23]</td>
192
218
  <td><code>POSTFIX</code></td>
193
219
  <td>::=</td>
194
220
  <td>
@@ -196,7 +222,7 @@
196
222
  </td>
197
223
  </tr>
198
224
  <tr id='grammar-production-PASS'>
199
- <td>[22]</td>
225
+ <td>[24]</td>
200
226
  <td><code>PASS</code></td>
201
227
  <td>::=</td>
202
228
  <td>
@@ -5,12 +5,17 @@
5
5
  (start #t)
6
6
  (first "@pass" "@terminals" LHS _eps)
7
7
  (follow _eof)
8
+ (cleanup star)
8
9
  (alt _empty _ebnf_2))
9
10
  (rule _ebnf_1 "1.1"
10
11
  (first "@pass" "@terminals" LHS)
11
12
  (follow "@pass" "@terminals" LHS _eof)
12
13
  (alt declaration rule))
13
- (rule _ebnf_2 "1.2" (first "@pass" "@terminals" LHS) (follow _eof) (seq _ebnf_1 ebnf))
14
+ (rule _ebnf_2 "1.2"
15
+ (first "@pass" "@terminals" LHS)
16
+ (follow _eof)
17
+ (cleanup merge)
18
+ (seq _ebnf_1 ebnf))
14
19
  (rule _ebnf_3 "1.3" (first "@pass" "@terminals" LHS _eps) (follow _eof) (seq ebnf))
15
20
  (rule declaration "2"
16
21
  (first "@pass" "@terminals")
@@ -18,20 +23,21 @@
18
23
  (alt "@terminals" pass))
19
24
  (rule rule "3" (first LHS) (follow "@pass" "@terminals" LHS _eof) (seq LHS expression))
20
25
  (rule _rule_1 "3.1"
21
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
26
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
22
27
  (follow "@pass" "@terminals" LHS _eof)
23
28
  (seq expression))
24
29
  (rule expression "4"
25
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
30
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
26
31
  (follow ")" "@pass" "@terminals" LHS _eof)
27
32
  (seq alt))
28
33
  (rule alt "5"
29
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
34
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
30
35
  (follow ")" "@pass" "@terminals" LHS _eof)
31
36
  (seq seq _alt_1))
32
37
  (rule _alt_1 "5.1"
33
38
  (first _eps "|")
34
39
  (follow ")" "@pass" "@terminals" LHS _eof)
40
+ (cleanup star)
35
41
  (alt _empty _alt_3))
36
42
  (rule _alt_2 "5.2"
37
43
  (first "|")
@@ -40,6 +46,7 @@
40
46
  (rule _alt_3 "5.3"
41
47
  (first "|")
42
48
  (follow ")" "@pass" "@terminals" LHS _eof)
49
+ (cleanup merge)
43
50
  (seq _alt_2 _alt_1))
44
51
  (rule _alt_4 "5.4"
45
52
  (first _eps "|")
@@ -50,111 +57,124 @@
50
57
  (follow ")" "@pass" "@terminals" LHS _eof)
51
58
  (seq _alt_1))
52
59
  (rule _alt_6 "5.6"
53
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
60
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
54
61
  (follow ")" "@pass" "@terminals" LHS _eof "|")
55
62
  (seq seq))
56
63
  (rule seq "6"
57
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
64
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
58
65
  (follow ")" "@pass" "@terminals" LHS _eof "|")
66
+ (cleanup plus)
59
67
  (seq diff _seq_1))
60
68
  (rule _seq_1 "6.1"
61
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
69
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
62
70
  (follow ")" "@pass" "@terminals" LHS _eof "|")
71
+ (cleanup star)
63
72
  (alt _empty _seq_2))
64
73
  (rule _seq_2 "6.2"
65
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
74
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
66
75
  (follow ")" "@pass" "@terminals" LHS _eof "|")
76
+ (cleanup merge)
67
77
  (seq diff _seq_1))
68
78
  (rule _seq_3 "6.3"
69
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
79
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
70
80
  (follow ")" "@pass" "@terminals" LHS _eof "|")
71
81
  (seq _seq_1))
72
82
  (rule _seq_4 "6.4"
73
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
83
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
74
84
  (follow ")" "@pass" "@terminals" LHS _eof "|")
75
85
  (seq _seq_1))
76
86
  (rule diff "7"
77
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
78
- (follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE STRING1 STRING2
79
- SYMBOL _eof "|" )
87
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
88
+ (follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
89
+ STRING1 STRING2 SYMBOL _eof "|" )
80
90
  (seq postfix _diff_1))
81
91
  (rule _diff_1 "7.1"
82
92
  (first "-" _eps)
83
- (follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE STRING1 STRING2
84
- SYMBOL _eof "|" )
93
+ (follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
94
+ STRING1 STRING2 SYMBOL _eof "|" )
95
+ (cleanup opt)
85
96
  (alt _empty _diff_2))
86
97
  (rule _diff_2 "7.2"
87
98
  (first "-")
88
- (follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE STRING1 STRING2
89
- SYMBOL _eof "|" )
99
+ (follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
100
+ STRING1 STRING2 SYMBOL _eof "|" )
90
101
  (seq "-" postfix))
91
102
  (rule _diff_3 "7.3"
92
103
  (first "-" _eps)
93
- (follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE STRING1 STRING2
94
- SYMBOL _eof "|" )
104
+ (follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
105
+ STRING1 STRING2 SYMBOL _eof "|" )
95
106
  (seq _diff_1))
96
107
  (rule _diff_4 "7.4"
97
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
98
- (follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE STRING1 STRING2
99
- SYMBOL _eof "|" )
108
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
109
+ (follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
110
+ STRING1 STRING2 SYMBOL _eof "|" )
100
111
  (seq postfix))
101
112
  (rule postfix "8"
102
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
103
- (follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE RANGE STRING1
104
- STRING2 SYMBOL _eof "|" )
113
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
114
+ (follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
115
+ STRING1 STRING2 SYMBOL _eof "|" )
105
116
  (seq primary _postfix_1))
106
117
  (rule _postfix_1 "8.1"
107
118
  (first POSTFIX _eps)
108
- (follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE RANGE STRING1
109
- STRING2 SYMBOL _eof "|" )
119
+ (follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
120
+ STRING1 STRING2 SYMBOL _eof "|" )
121
+ (cleanup opt)
110
122
  (alt _empty POSTFIX))
111
123
  (rule _postfix_2 "8.2"
112
124
  (first POSTFIX _eps)
113
- (follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE RANGE STRING1
114
- STRING2 SYMBOL _eof "|" )
125
+ (follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
126
+ STRING1 STRING2 SYMBOL _eof "|" )
115
127
  (seq _postfix_1))
116
128
  (rule primary "9"
117
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
118
- (follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE POSTFIX RANGE
119
- STRING1 STRING2 SYMBOL _eof "|" )
120
- (alt HEX SYMBOL RANGE O_RANGE STRING1 STRING2 _primary_1))
129
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
130
+ (follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE POSTFIX
131
+ RANGE STRING1 STRING2 SYMBOL _eof "|" )
132
+ (alt HEX SYMBOL ENUM O_ENUM RANGE O_RANGE STRING1 STRING2 _primary_1))
121
133
  (rule _primary_1 "9.1"
122
134
  (first "(")
123
- (follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE POSTFIX RANGE
124
- STRING1 STRING2 SYMBOL _eof "|" )
135
+ (follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE POSTFIX
136
+ RANGE STRING1 STRING2 SYMBOL _eof "|" )
125
137
  (seq "(" expression ")"))
126
138
  (rule _primary_2 "9.2"
127
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
128
- (follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE POSTFIX RANGE
129
- STRING1 STRING2 SYMBOL _eof "|" )
139
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
140
+ (follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE POSTFIX
141
+ RANGE STRING1 STRING2 SYMBOL _eof "|" )
130
142
  (seq expression ")"))
131
143
  (rule _primary_3 "9.3"
132
144
  (first ")")
133
- (follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE POSTFIX RANGE
134
- STRING1 STRING2 SYMBOL _eof "|" )
145
+ (follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE POSTFIX
146
+ RANGE STRING1 STRING2 SYMBOL _eof "|" )
135
147
  (seq ")"))
136
148
  (rule pass "10"
137
149
  (first "@pass")
138
150
  (follow "@pass" "@terminals" LHS _eof)
139
151
  (seq "@pass" expression))
140
152
  (rule _pass_1 "10.1"
141
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
153
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
142
154
  (follow "@pass" "@terminals" LHS _eof)
143
155
  (seq expression))
144
156
  (terminal LHS "11" (seq (opt (seq "[" (plus SYMBOL) "]")) SYMBOL "::="))
145
157
  (terminal SYMBOL "12" (plus (alt (range "a-z") (range "A-Z") (range "0-9") "_" ".")))
146
158
  (terminal HEX "13" (seq "#x" (plus (alt (range "0-9") (range "a-f") (range "A-F")))))
147
- (terminal RANGE "14"
159
+ (terminal ENUM "14"
160
+ (seq "["
161
+ (alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR)) "-"
162
+ (alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR)) "]" ))
163
+ (terminal O_ENUM "15"
164
+ (seq "[^"
165
+ (alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR)) "-"
166
+ (alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR)) "]" ))
167
+ (terminal RANGE "16"
148
168
  (seq "[" (plus (alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR))) "]"))
149
- (terminal O_RANGE "15"
169
+ (terminal O_RANGE "17"
150
170
  (seq "[^" (plus (alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR))) "]"))
151
- (terminal STRING1 "16" (seq "\"" (star (diff CHAR "\"")) "\""))
152
- (terminal STRING2 "17" (seq "'" (star (diff CHAR "'")) "'"))
153
- (terminal CHAR "18" (alt HEX (range "#x20#x21#x22") (range "#x24-#x00FFFFFF")))
154
- (terminal R_CHAR "19" (diff CHAR "]"))
155
- (terminal R_BEGIN "20" (seq (alt HEX R_CHAR) "-"))
156
- (terminal POSTFIX "21" (range "?*+"))
157
- (terminal PASS "22"
171
+ (terminal STRING1 "18" (seq "\"" (star (diff CHAR "\"")) "\""))
172
+ (terminal STRING2 "19" (seq "'" (star (diff CHAR "'")) "'"))
173
+ (terminal CHAR "20" (alt HEX (range "#x20#x21#x22") (range "#x24-#x00FFFFFF")))
174
+ (terminal R_CHAR "21" (diff CHAR "]"))
175
+ (terminal R_BEGIN "22" (seq (alt HEX R_CHAR) "-"))
176
+ (terminal POSTFIX "23" (range "?*+"))
177
+ (terminal PASS "24"
158
178
  (plus
159
179
  (alt
160
180
  (range "#x00-#x20")