ebnf 1.0.2 → 1.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: b7016a2ce8e6c7b23ec669880a9b3f3ddd381793
4
- data.tar.gz: 625460a7f6f2c8155960fc5a8640e4b0d854d1cd
3
+ metadata.gz: 5b0233cc19d80ca25dc2221725770bc87a47d0f0
4
+ data.tar.gz: e381edc76658d0f56816c4ddc107417df301c9cc
5
5
  SHA512:
6
- metadata.gz: e09689686d97b44c8845f66129a313006a7264722eb9694b35a675edf2c369b7795f7b3ca6bb6ee82c7567d3be4a752c74b4388b1d57e3f242759eec47e37199
7
- data.tar.gz: 6aa62a952f16d3ccd018397a6652b65d3053f68b00c0d0ee767cae3e4686d659e7c5f7e94d89a80a10fdab271ad45e3a733e780372379be5aa7a611ef71b6e46
6
+ metadata.gz: 18415d7b3393069f09d0af3c6433722c3aadfd4f3b5eca67c763294dd81584c9e435eb2aab6e27d15cf557f1d8587bfc67c3671e64797523340f06560c42f27c
7
+ data.tar.gz: bc6917f74c4420facfee72e7089d225545e49d201653731dd1caa28eb71f547b3c8231b44175b9144f2b7c9d4ac979088d4647c487e885adc624c046c9c595fd
data/README.md CHANGED
@@ -8,12 +8,20 @@
8
8
  [![Dependency Status](https://gemnasium.com/gkellogg/ebnf.png)](https://gemnasium.com/gkellogg/ebnf)
9
9
 
10
10
  ## Description
11
- This is a [Ruby][] implementation of an [EBNF][] and [BNF][] parser and parser generator.
12
- It parses [EBNF][] grammars to [BNF][], generates [First/Follow and Branch][] tables for
13
- [LL(1)][] grammars, which can be used with the stream [Tokenizer][] and [LL(1) Parser][].
11
+ This is a [Ruby][] implementation of an [EBNF][] and [BNF][] parser and parser generator. It parses [EBNF][] grammars to [BNF][], generates [First/Follow][] and Branch tables for [LL(1)][] grammars, which can be used with the stream [Tokenizer][] and [LL(1) Parser][].
14
12
 
15
- Of note in this implementation is that the tokenizer and parser are streaming, so that they can
16
- process inputs of arbitrary size.
13
+ As LL(1) grammars operate using `alt` and `seq` primitives, allowing for a match on alternative productions or a sequence of productions, generating a parser requires turning the EBNF rules into BNF:
14
+
15
+ * Transform `a ::= b?` into `a ::= _empty | b`
16
+ * Transform `a ::= b+` into `a ::= b b*`
17
+ * Transform `a ::= b*` into `a ::= _empty | (b a)`
18
+ * Transform `a ::= op1 (op2)` into two rules:
19
+ ```
20
+ a ::= op1 _a_1
21
+ _a_1 ::= op2
22
+ ```
23
+
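The postfix rewrites listed above can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: `ebnf_to_bnf` is a hypothetical helper, not part of the gem's API, and the nested-array rule representation is invented for the example.

```ruby
# Hypothetical helper (not part of the ebnf gem): rewrites one EBNF postfix
# rule into BNF alternatives, following the transformations listed above.
# A rule body is an Array of alternatives; each alternative is an Array of symbols.
def ebnf_to_bnf(name, operand, postfix)
  case postfix
  when "?"
    { name => [[:_empty], [operand]] }          # a ::= _empty | b
  when "*"
    { name => [[:_empty], [operand, name]] }    # a ::= _empty | (b a)
  when "+"
    star = :"_#{name}_1"                        # assumed name for the helper rule standing in for b*
    { name => [[operand, star]],                # a ::= b b*
      star => [[:_empty], [operand, star]] }
  else
    raise ArgumentError, "unknown postfix #{postfix.inspect}"
  end
end

ebnf_to_bnf(:a, :b, "*")  # => {:a=>[[:_empty], [:b, :a]]}
```

The `+` case reuses the `*` rewrite for its helper rule, which is why `plus` and `star` productions tend to appear together in the generated tables.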
24
+ Of note in this implementation is that the tokenizer and parser are streaming, so that they can process inputs of arbitrary size.
17
25
 
18
26
  ## Usage
19
27
  ### Parsing an LL(1) Grammar
@@ -36,7 +44,7 @@ Generate [First/Follow][] rules for BNF grammars
36
44
 
37
45
  ebnf.first_follow(start_tokens)
38
46
 
39
- Generate Terminal, [First/Follow and Branch][] tables as Ruby for parsing grammars
47
+ Generate Terminal, [First/Follow][], Cleanup and Branch tables as Ruby for parsing grammars
40
48
 
41
49
  ebnf.to_ruby
42
50
 
@@ -44,8 +52,29 @@ Generate formatted grammar using HTML (requires [Haml][Haml] gem)
44
52
 
45
53
  ebnf.to_html
46
54
 
47
- ### Creating terminal definitions and parser rules to parse generated grammars
55
+ ### Parser S-Expressions
56
+ Intermediate representations of the grammar may be serialized to Lisp-like S-Expressions. For example, the rule `[1] ebnf ::= (declaration | rule)*` is serialized as `(rule ebnf "1" (star (alt declaration rule)))`.
57
+
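As a rough illustration of the serialization described above, nested Ruby arrays can be printed as S-Expressions with a short recursive function. `to_sxp` here is a hypothetical stand-in written for this example; the gem's own serializer may differ in formatting and escaping.

```ruby
# Hypothetical serializer: Arrays become parenthesized lists, Strings are
# quoted, and everything else (e.g. Symbols) is printed bare.
def to_sxp(node)
  case node
  when Array  then "(" + node.map { |n| to_sxp(n) }.join(" ") + ")"
  when String then node.inspect
  else node.to_s
  end
end

puts to_sxp([:rule, :ebnf, "1", [:star, [:alt, :declaration, :rule]]])
# prints: (rule ebnf "1" (star (alt declaration rule)))
```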
58
+ Once the [LL(1)][] conversion is made and the [First/Follow][] table is generated, this rule expands as follows:
59
+
60
+ (rule ebnf "1"
61
+ (start #t)
62
+ (first "@pass" "@terminals" LHS _eps)
63
+ (follow _eof)
64
+ (cleanup star)
65
+ (alt _empty _ebnf_2))
66
+ (rule _ebnf_1 "1.1"
67
+ (first "@pass" "@terminals" LHS)
68
+ (follow "@pass" "@terminals" LHS _eof)
69
+ (alt declaration rule))
70
+ (rule _ebnf_2 "1.2"
71
+ (first "@pass" "@terminals" LHS)
72
+ (follow _eof)
73
+ (cleanup merge)
74
+ (seq _ebnf_1 ebnf))
75
+ (rule _ebnf_3 "1.3" (first "@pass" "@terminals" LHS _eps) (follow _eof) (seq ebnf))
48
76
 
77
+ ### Creating terminal definitions and parser rules to parse generated grammars
49
78
  The parser is initialized to callbacks invoked on entry and exit
50
79
  to each `terminal` and `production`. A trivial parser loop can be described as follows:
51
80
 
@@ -76,9 +105,10 @@ to each `terminal` and `production`. A trivial parser loop can be described as f
76
105
 
77
106
  def initialize(input)
78
107
  parser_options = {
79
- :branch => BRANCH,
80
- :first => FIRST,
81
- :follow => FOLLOW
108
+ branch: BRANCH,
109
+ first: FIRST,
110
+ follow: FOLLOW,
111
+ cleanup: CLEANUP
82
112
  }
83
113
  parse(input, start_symbol, parser_options) do |context, *data|
84
114
  # Process calls from callback from productions
@@ -88,10 +118,92 @@ to each `terminal` and `production`. A trivial parser loop can be described as f
88
118
  raise RDF::ReaderError, e.message if validate?
89
119
  end
90
120
 
121
+ ### Branch Table
122
+ The Branch table is a hash mapping each production rule to a hash that relates terminals appearing in the input to the sequence of productions to follow when that terminal is found. This allows either the `seq` primitive, where all terminals map to the same sequence of productions, or the `alt` primitive, where each terminal may map to a different production.
123
+
124
+ BRANCH = {
125
+ :alt => {
126
+ "(" => [:seq, :_alt_1],
127
+ :ENUM => [:seq, :_alt_1],
128
+ :HEX => [:seq, :_alt_1],
129
+ :O_ENUM => [:seq, :_alt_1],
130
+ :O_RANGE => [:seq, :_alt_1],
131
+ :RANGE => [:seq, :_alt_1],
132
+ :STRING1 => [:seq, :_alt_1],
133
+ :STRING2 => [:seq, :_alt_1],
134
+ :SYMBOL => [:seq, :_alt_1],
135
+ },
136
+ ...
137
+ :declaration => {
138
+ "@pass" => [:pass],
139
+ "@terminals" => ["@terminals"],
140
+ },
141
+ ...
142
+ }
143
+
144
+ In this case the `alt` rule is `seq ('|' seq)*`, which can apply when any of the specified tokens appears on the input stream. They all cause the same token to be passed to the `seq` rule, followed by `_alt_1`, which handles the `('|' seq)*` portion of the rule after the first sequence is matched.
145
+
146
+ The `declaration` rule is `'@terminals' | pass`, which uses the `alt` primitive to determine the production to run based on the terminal appearing on the input stream. Eventually, a terminal production is found and the token is consumed.
147
+
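A lookup against a table of this shape might look like the following sketch. The excerpt repeats only a few entries from the example above, and `next_productions` is a hypothetical helper written for illustration, not the parser's actual method.

```ruby
# Excerpt shaped like the BRANCH table above (not the full generated table).
branch_excerpt = {
  alt:         { "(" => [:seq, :_alt_1], :HEX => [:seq, :_alt_1] },
  declaration: { "@pass" => [:pass], "@terminals" => ["@terminals"] }
}

# Hypothetical lookup: the productions to follow for `production` when
# `token` is the next terminal on the input; nil means no branch applies.
def next_productions(branch, production, token)
  branch.fetch(production, {})[token]
end

next_productions(branch_excerpt, :alt, :HEX)              # => [:seq, :_alt_1]
next_productions(branch_excerpt, :declaration, "@pass")   # => [:pass]
next_productions(branch_excerpt, :alt, ")")               # => nil
```

Every terminal under `:alt` maps to the same sequence (the `seq` case), while the terminals under `:declaration` select different productions (the `alt` case).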
148
+ ### First/Follow Table
149
+ The [First/Follow][] table is a hash mapping production rules to the terminals that may begin or follow the rule. For example:
150
+
151
+ FIRST = {
152
+ :alt => [
153
+ :HEX,
154
+ :SYMBOL,
155
+ :ENUM,
156
+ :O_ENUM,
157
+ :RANGE,
158
+ :O_RANGE,
159
+ :STRING1,
160
+ :STRING2,
161
+ "("],
162
+ ...
163
+ }
164
+
165
+ ### Terminals Table
166
+ This table is a simple list of the terminal productions found in the grammar. For example:
167
+
168
+ TERMINALS = ["(", ")", "-",
169
+ "@pass", "@terminals",
170
+ :ENUM, :HEX, :LHS, :O_ENUM, :O_RANGE,:POSTFIX,
171
+ :RANGE, :STRING1, :STRING2, :SYMBOL,"|"
172
+ ].freeze
173
+
174
+ ### Cleanup Table
175
+ This table identifies productions which used EBNF rules, which are transformed to BNF for actual parsing. This allows the parser, in some cases, to reproduce *star*, *plus*, and *opt* rule matches. For example:
176
+
177
+ CLEANUP = {
178
+ :_alt_1 => :star,
179
+ :_alt_3 => :merge,
180
+ :_diff_1 => :opt,
181
+ :ebnf => :star,
182
+ :_ebnf_2 => :merge,
183
+ :_postfix_1 => :opt,
184
+ :seq => :plus,
185
+ :_seq_1 => :star,
186
+ :_seq_2 => :merge,
187
+ }.freeze
188
+
189
+ In this case the `ebnf` rule was `(declaration | rule)*`. As BNF does not support a star operator, this is decomposed into a set of rules using `alt` and `seq` primitives:
190
+
191
+ ebnf ::= _empty | _ebnf_2
192
+ _ebnf_1 ::= declaration | rule
193
+ _ebnf_2 ::= _ebnf_1 ebnf
194
+ _ebnf_3 ::= ebnf
195
+
196
+ The `_empty` production matches an empty string, so it allows for no value. `_ebnf_2` matches `declaration | rule` (using the `alt` primitive) followed by `ebnf`, creating a sequence of zero or more `declaration` or `rule` members.
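To see why this decomposition accepts zero or more members, here is a toy matcher over the same rule shapes. It is a sketch under assumptions: `declaration` and `rule` are treated as single tokens, `consume` is a hypothetical helper, and it uses naive backtracking with longest match rather than the table-driven LL(1) approach the parser uses.

```ruby
# Toy rules mirroring the decomposition above; an empty alternative plays
# the role of _empty.
rules = {
  ebnf:    [[], [:_ebnf_2]],           # _empty | _ebnf_2
  _ebnf_1: [[:declaration], [:rule]],  # declaration | rule
  _ebnf_2: [[:_ebnf_1, :ebnf]]         # _ebnf_1 ebnf
}

# Hypothetical matcher: returns the position reached after matching `sym`
# at `pos` (longest match), or nil if `sym` cannot match there.
def consume(rules, sym, tokens, pos = 0)
  # Anything without a rule is treated as a terminal token.
  return (tokens[pos] == sym ? pos + 1 : nil) unless rules.key?(sym)

  rules[sym].map { |alt|
    alt.reduce(pos) { |at, s| at && consume(rules, s, tokens, at) }
  }.compact.max
end

consume(rules, :ebnf, [])                             # => 0 (zero members)
consume(rules, :ebnf, [:declaration, :rule, :rule])   # => 3 (all three consumed)
```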
91
197
 
92
198
  ## EBNF Grammar
93
199
  The [EBNF][] variant used here is based on [W3C](http://w3.org/) [EBNF][] (see {file:etc/ebnf.ebnf EBNF grammar}) as defined in the
94
- [XML 1.0 recommendation](http://www.w3.org/TR/REC-xml/), with minor extensions.
200
+ [XML 1.0 recommendation](http://www.w3.org/TR/REC-xml/), with minor extensions:
201
+
202
+ * Comments include `//` and `#` through end of line (other than a `#x` hex character), and `/* ... */` and `(* ... *)`, which may cross lines
203
+ * All rules **MAY** start with an identifier, contained within square brackets. For example `[1] rule`, where the value within the brackets is a symbol `([a-z] | [A-Z] | [0-9] | "_" | ".")+`
204
+ * `@terminals` causes following rules to be treated as terminals. Any rules whose names are entirely upper-case are also treated as terminals
205
+ * `@pass` defines the expression used to detect whitespace, which is removed in processing.
206
+ * No support for `wfc` (well-formedness constraint) or `vc` (validity constraint).
95
207
 
96
208
  Parsing this grammar yields an S-Expression version: {file:etc/ebnf.ll1.sxp}.
97
209
 
data/VERSION CHANGED
@@ -1 +1 @@
1
- 1.0.2
1
+ 1.1.0
data/bin/ebnf CHANGED
@@ -12,9 +12,9 @@ require 'getoptlong'
12
12
  require 'ebnf'
13
13
 
14
14
  options = {
15
- :output_format => :sxp,
16
- :prefix => "ttl",
17
- :namespace => "http://www.w3.org/ns/formats/Turtle#",
15
+ output_format: :sxp,
16
+ prefix: "ttl",
17
+ namespace: "http://www.w3.org/ns/formats/Turtle#",
18
18
  }
19
19
 
20
20
  input, out = nil, STDOUT
@@ -20,6 +20,8 @@
20
20
 
21
21
  [9] primary ::= HEX
22
22
  | SYMBOL
23
+ | ENUM
24
+ | O_ENUM
23
25
  | RANGE
24
26
  | O_RANGE
25
27
  | STRING1
@@ -36,29 +38,33 @@
36
38
 
37
39
  [13] HEX ::= '#x' ([0-9]|[a-f]|[A-F])+
38
40
 
41
+ [14] ENUM ::= '[' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR)) '-' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR)) ']'
42
+
43
+ [15] O_ENUM ::= '[^' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR)) '-' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR)) ']'
44
+
39
45
  # Range is any combination of R_CHAR '-' R_CHAR or R_CHAR+
40
- [14] RANGE ::= '[' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR))+ ']'
46
+ [16] RANGE ::= '[' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR))+ ']'
41
47
 
42
48
  # Range is any combination of R_CHAR '-' R_CHAR or R_CHAR+ preceded by ^
43
- [15] O_RANGE ::= '[^' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR))+ ']'
49
+ [17] O_RANGE ::= '[^' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR))+ ']'
44
50
 
45
51
  # Strings are unescaped Unicode, excepting control characters and hash (#)
46
- [16] STRING1 ::= '"' (CHAR - '"')* '"'
52
+ [18] STRING1 ::= '"' (CHAR - '"')* '"'
47
53
 
48
- [17] STRING2 ::= "'" (CHAR - "'")* "'"
54
+ [19] STRING2 ::= "'" (CHAR - "'")* "'"
49
55
 
50
- [18] CHAR ::= HEX
56
+ [20] CHAR ::= HEX
51
57
  | [#x20#x21#x22]
52
58
  | [#x24-#x00FFFFFF]
53
59
 
54
- [19] R_CHAR ::= CHAR - ']'
60
+ [21] R_CHAR ::= CHAR - ']'
55
61
 
56
- [20] R_BEGIN ::= (HEX | R_CHAR) "-"
62
+ [22] R_BEGIN ::= (HEX | R_CHAR) "-"
57
63
 
58
64
  # Should be able to do this inline, but not until terminal regular expressions are created automatically
59
- [21] POSTFIX ::= [?*+]
65
+ [23] POSTFIX ::= [?*+]
60
66
 
61
- [22] PASS ::= ( [#x00-#x20]
67
+ [24] PASS ::= ( [#x00-#x20]
62
68
  | ( '#' | '//' ) [^#x0A#x0D]*
63
69
  | '/*' (( '*' [^/] )? | [^*] )* '*/'
64
70
  | '(*' (( '*' [^)] )? | [^*] )* '*)'
@@ -76,6 +76,8 @@
76
76
  <td>
77
77
  <a href="#grammar-production-HEX">HEX</a>
78
78
  <code>|</code> <a href="#grammar-production-SYMBOL">SYMBOL</a>
79
+ <code>|</code> <a href="#grammar-production-ENUM">ENUM</a>
80
+ <code>|</code> <a href="#grammar-production-O_ENUM">O_ENUM</a>
79
81
  <code>|</code> <a href="#grammar-production-RANGE">RANGE</a>
80
82
  <code>|</code> <a href="#grammar-production-O_RANGE">O_RANGE</a>
81
83
  <code>|</code> <a href="#grammar-production-STRING1">STRING1</a>
@@ -119,8 +121,32 @@
119
121
  (<code>[</code> <code class="grammar-literal">0-9</code><code>]</code> <code>|</code> <code>[</code> <code class="grammar-literal">a-f</code><code>]</code> <code>|</code> <code>[</code> <code class="grammar-literal">A-F</code><code>]</code> )<code>+</code>
120
122
  </td>
121
123
  </tr>
122
- <tr id='grammar-production-RANGE'>
124
+ <tr id='grammar-production-ENUM'>
123
125
  <td>[14]</td>
126
+ <td><code>ENUM</code></td>
127
+ <td>::=</td>
128
+ <td>
129
+ "<code class="grammar-literal">[</code>"
130
+ <code>(</code> <a href="#grammar-production-R_BEGIN">R_BEGIN</a> <code>(</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code> <code>|</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code>
131
+ "<code class="grammar-literal">-</code>"
132
+ <code>(</code> <a href="#grammar-production-R_BEGIN">R_BEGIN</a> <code>(</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code> <code>|</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code>
133
+ "<code class="grammar-literal">]</code>"
134
+ </td>
135
+ </tr>
136
+ <tr id='grammar-production-O_ENUM'>
137
+ <td>[15]</td>
138
+ <td><code>O_ENUM</code></td>
139
+ <td>::=</td>
140
+ <td>
141
+ "<code class="grammar-literal">[^</code>"
142
+ <code>(</code> <a href="#grammar-production-R_BEGIN">R_BEGIN</a> <code>(</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code> <code>|</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code>
143
+ "<code class="grammar-literal">-</code>"
144
+ <code>(</code> <a href="#grammar-production-R_BEGIN">R_BEGIN</a> <code>(</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code> <code>|</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code>
145
+ "<code class="grammar-literal">]</code>"
146
+ </td>
147
+ </tr>
148
+ <tr id='grammar-production-RANGE'>
149
+ <td>[16]</td>
124
150
  <td><code>RANGE</code></td>
125
151
  <td>::=</td>
126
152
  <td>
@@ -130,7 +156,7 @@
130
156
  </td>
131
157
  </tr>
132
158
  <tr id='grammar-production-O_RANGE'>
133
- <td>[15]</td>
159
+ <td>[17]</td>
134
160
  <td><code>O_RANGE</code></td>
135
161
  <td>::=</td>
136
162
  <td>
@@ -140,7 +166,7 @@
140
166
  </td>
141
167
  </tr>
142
168
  <tr id='grammar-production-STRING1'>
143
- <td>[16]</td>
169
+ <td>[18]</td>
144
170
  <td><code>STRING1</code></td>
145
171
  <td>::=</td>
146
172
  <td>
@@ -150,7 +176,7 @@
150
176
  </td>
151
177
  </tr>
152
178
  <tr id='grammar-production-STRING2'>
153
- <td>[17]</td>
179
+ <td>[19]</td>
154
180
  <td><code>STRING2</code></td>
155
181
  <td>::=</td>
156
182
  <td>
@@ -160,7 +186,7 @@
160
186
  </td>
161
187
  </tr>
162
188
  <tr id='grammar-production-CHAR'>
163
- <td>[18]</td>
189
+ <td>[20]</td>
164
190
  <td><code>CHAR</code></td>
165
191
  <td>::=</td>
166
192
  <td>
@@ -170,7 +196,7 @@
170
196
  </td>
171
197
  </tr>
172
198
  <tr id='grammar-production-R_CHAR'>
173
- <td>[19]</td>
199
+ <td>[21]</td>
174
200
  <td><code>R_CHAR</code></td>
175
201
  <td>::=</td>
176
202
  <td>
@@ -179,7 +205,7 @@
179
205
  </td>
180
206
  </tr>
181
207
  <tr id='grammar-production-R_BEGIN'>
182
- <td>[20]</td>
208
+ <td>[22]</td>
183
209
  <td><code>R_BEGIN</code></td>
184
210
  <td>::=</td>
185
211
  <td>
@@ -188,7 +214,7 @@
188
214
  </td>
189
215
  </tr>
190
216
  <tr id='grammar-production-POSTFIX'>
191
- <td>[21]</td>
217
+ <td>[23]</td>
192
218
  <td><code>POSTFIX</code></td>
193
219
  <td>::=</td>
194
220
  <td>
@@ -196,7 +222,7 @@
196
222
  </td>
197
223
  </tr>
198
224
  <tr id='grammar-production-PASS'>
199
- <td>[22]</td>
225
+ <td>[24]</td>
200
226
  <td><code>PASS</code></td>
201
227
  <td>::=</td>
202
228
  <td>
@@ -5,12 +5,17 @@
5
5
  (start #t)
6
6
  (first "@pass" "@terminals" LHS _eps)
7
7
  (follow _eof)
8
+ (cleanup star)
8
9
  (alt _empty _ebnf_2))
9
10
  (rule _ebnf_1 "1.1"
10
11
  (first "@pass" "@terminals" LHS)
11
12
  (follow "@pass" "@terminals" LHS _eof)
12
13
  (alt declaration rule))
13
- (rule _ebnf_2 "1.2" (first "@pass" "@terminals" LHS) (follow _eof) (seq _ebnf_1 ebnf))
14
+ (rule _ebnf_2 "1.2"
15
+ (first "@pass" "@terminals" LHS)
16
+ (follow _eof)
17
+ (cleanup merge)
18
+ (seq _ebnf_1 ebnf))
14
19
  (rule _ebnf_3 "1.3" (first "@pass" "@terminals" LHS _eps) (follow _eof) (seq ebnf))
15
20
  (rule declaration "2"
16
21
  (first "@pass" "@terminals")
@@ -18,20 +23,21 @@
18
23
  (alt "@terminals" pass))
19
24
  (rule rule "3" (first LHS) (follow "@pass" "@terminals" LHS _eof) (seq LHS expression))
20
25
  (rule _rule_1 "3.1"
21
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
26
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
22
27
  (follow "@pass" "@terminals" LHS _eof)
23
28
  (seq expression))
24
29
  (rule expression "4"
25
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
30
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
26
31
  (follow ")" "@pass" "@terminals" LHS _eof)
27
32
  (seq alt))
28
33
  (rule alt "5"
29
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
34
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
30
35
  (follow ")" "@pass" "@terminals" LHS _eof)
31
36
  (seq seq _alt_1))
32
37
  (rule _alt_1 "5.1"
33
38
  (first _eps "|")
34
39
  (follow ")" "@pass" "@terminals" LHS _eof)
40
+ (cleanup star)
35
41
  (alt _empty _alt_3))
36
42
  (rule _alt_2 "5.2"
37
43
  (first "|")
@@ -40,6 +46,7 @@
40
46
  (rule _alt_3 "5.3"
41
47
  (first "|")
42
48
  (follow ")" "@pass" "@terminals" LHS _eof)
49
+ (cleanup merge)
43
50
  (seq _alt_2 _alt_1))
44
51
  (rule _alt_4 "5.4"
45
52
  (first _eps "|")
@@ -50,111 +57,124 @@
50
57
  (follow ")" "@pass" "@terminals" LHS _eof)
51
58
  (seq _alt_1))
52
59
  (rule _alt_6 "5.6"
53
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
60
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
54
61
  (follow ")" "@pass" "@terminals" LHS _eof "|")
55
62
  (seq seq))
56
63
  (rule seq "6"
57
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
64
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
58
65
  (follow ")" "@pass" "@terminals" LHS _eof "|")
66
+ (cleanup plus)
59
67
  (seq diff _seq_1))
60
68
  (rule _seq_1 "6.1"
61
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
69
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
62
70
  (follow ")" "@pass" "@terminals" LHS _eof "|")
71
+ (cleanup star)
63
72
  (alt _empty _seq_2))
64
73
  (rule _seq_2 "6.2"
65
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
74
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
66
75
  (follow ")" "@pass" "@terminals" LHS _eof "|")
76
+ (cleanup merge)
67
77
  (seq diff _seq_1))
68
78
  (rule _seq_3 "6.3"
69
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
79
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
70
80
  (follow ")" "@pass" "@terminals" LHS _eof "|")
71
81
  (seq _seq_1))
72
82
  (rule _seq_4 "6.4"
73
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
83
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
74
84
  (follow ")" "@pass" "@terminals" LHS _eof "|")
75
85
  (seq _seq_1))
76
86
  (rule diff "7"
77
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
78
- (follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE STRING1 STRING2
79
- SYMBOL _eof "|" )
87
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
88
+ (follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
89
+ STRING1 STRING2 SYMBOL _eof "|" )
80
90
  (seq postfix _diff_1))
81
91
  (rule _diff_1 "7.1"
82
92
  (first "-" _eps)
83
- (follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE STRING1 STRING2
84
- SYMBOL _eof "|" )
93
+ (follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
94
+ STRING1 STRING2 SYMBOL _eof "|" )
95
+ (cleanup opt)
85
96
  (alt _empty _diff_2))
86
97
  (rule _diff_2 "7.2"
87
98
  (first "-")
88
- (follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE STRING1 STRING2
89
- SYMBOL _eof "|" )
99
+ (follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
100
+ STRING1 STRING2 SYMBOL _eof "|" )
90
101
  (seq "-" postfix))
91
102
  (rule _diff_3 "7.3"
92
103
  (first "-" _eps)
93
- (follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE STRING1 STRING2
94
- SYMBOL _eof "|" )
104
+ (follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
105
+ STRING1 STRING2 SYMBOL _eof "|" )
95
106
  (seq _diff_1))
96
107
  (rule _diff_4 "7.4"
97
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
98
- (follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE STRING1 STRING2
99
- SYMBOL _eof "|" )
108
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
109
+ (follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
110
+ STRING1 STRING2 SYMBOL _eof "|" )
100
111
  (seq postfix))
101
112
  (rule postfix "8"
102
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
103
- (follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE RANGE STRING1
104
- STRING2 SYMBOL _eof "|" )
113
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
114
+ (follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
115
+ STRING1 STRING2 SYMBOL _eof "|" )
105
116
  (seq primary _postfix_1))
106
117
  (rule _postfix_1 "8.1"
107
118
  (first POSTFIX _eps)
108
- (follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE RANGE STRING1
109
- STRING2 SYMBOL _eof "|" )
119
+ (follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
120
+ STRING1 STRING2 SYMBOL _eof "|" )
121
+ (cleanup opt)
110
122
  (alt _empty POSTFIX))
111
123
  (rule _postfix_2 "8.2"
112
124
  (first POSTFIX _eps)
113
- (follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE RANGE STRING1
114
- STRING2 SYMBOL _eof "|" )
125
+ (follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
126
+ STRING1 STRING2 SYMBOL _eof "|" )
115
127
  (seq _postfix_1))
116
128
  (rule primary "9"
117
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
118
- (follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE POSTFIX RANGE
119
- STRING1 STRING2 SYMBOL _eof "|" )
120
- (alt HEX SYMBOL RANGE O_RANGE STRING1 STRING2 _primary_1))
129
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
130
+ (follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE POSTFIX
131
+ RANGE STRING1 STRING2 SYMBOL _eof "|" )
132
+ (alt HEX SYMBOL ENUM O_ENUM RANGE O_RANGE STRING1 STRING2 _primary_1))
121
133
  (rule _primary_1 "9.1"
122
134
  (first "(")
123
- (follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE POSTFIX RANGE
124
- STRING1 STRING2 SYMBOL _eof "|" )
135
+ (follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE POSTFIX
136
+ RANGE STRING1 STRING2 SYMBOL _eof "|" )
125
137
  (seq "(" expression ")"))
126
138
  (rule _primary_2 "9.2"
127
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
128
- (follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE POSTFIX RANGE
129
- STRING1 STRING2 SYMBOL _eof "|" )
139
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
140
+ (follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE POSTFIX
141
+ RANGE STRING1 STRING2 SYMBOL _eof "|" )
130
142
  (seq expression ")"))
131
143
  (rule _primary_3 "9.3"
132
144
  (first ")")
133
- (follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE POSTFIX RANGE
134
- STRING1 STRING2 SYMBOL _eof "|" )
145
+ (follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE POSTFIX
146
+ RANGE STRING1 STRING2 SYMBOL _eof "|" )
135
147
  (seq ")"))
136
148
  (rule pass "10"
137
149
  (first "@pass")
138
150
  (follow "@pass" "@terminals" LHS _eof)
139
151
  (seq "@pass" expression))
140
152
  (rule _pass_1 "10.1"
141
- (first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
153
+ (first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
142
154
  (follow "@pass" "@terminals" LHS _eof)
143
155
  (seq expression))
144
156
  (terminal LHS "11" (seq (opt (seq "[" (plus SYMBOL) "]")) SYMBOL "::="))
145
157
  (terminal SYMBOL "12" (plus (alt (range "a-z") (range "A-Z") (range "0-9") "_" ".")))
146
158
  (terminal HEX "13" (seq "#x" (plus (alt (range "0-9") (range "a-f") (range "A-F")))))
147
- (terminal RANGE "14"
159
+ (terminal ENUM "14"
160
+ (seq "["
161
+ (alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR)) "-"
162
+ (alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR)) "]" ))
163
+ (terminal O_ENUM "15"
164
+ (seq "[^"
165
+ (alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR)) "-"
166
+ (alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR)) "]" ))
167
+ (terminal RANGE "16"
148
168
  (seq "[" (plus (alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR))) "]"))
149
- (terminal O_RANGE "15"
169
+ (terminal O_RANGE "17"
150
170
  (seq "[^" (plus (alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR))) "]"))
151
- (terminal STRING1 "16" (seq "\"" (star (diff CHAR "\"")) "\""))
152
- (terminal STRING2 "17" (seq "'" (star (diff CHAR "'")) "'"))
153
- (terminal CHAR "18" (alt HEX (range "#x20#x21#x22") (range "#x24-#x00FFFFFF")))
154
- (terminal R_CHAR "19" (diff CHAR "]"))
155
- (terminal R_BEGIN "20" (seq (alt HEX R_CHAR) "-"))
156
- (terminal POSTFIX "21" (range "?*+"))
157
- (terminal PASS "22"
171
+ (terminal STRING1 "18" (seq "\"" (star (diff CHAR "\"")) "\""))
172
+ (terminal STRING2 "19" (seq "'" (star (diff CHAR "'")) "'"))
173
+ (terminal CHAR "20" (alt HEX (range "#x20#x21#x22") (range "#x24-#x00FFFFFF")))
174
+ (terminal R_CHAR "21" (diff CHAR "]"))
175
+ (terminal R_BEGIN "22" (seq (alt HEX R_CHAR) "-"))
176
+ (terminal POSTFIX "23" (range "?*+"))
177
+ (terminal PASS "24"
158
178
  (plus
159
179
  (alt
160
180
  (range "#x00-#x20")