ebnf 1.0.2 → 1.1.0
- checksums.yaml +4 -4
- data/README.md +123 -11
- data/VERSION +1 -1
- data/bin/ebnf +3 -3
- data/etc/ebnf.ebnf +15 -9
- data/etc/ebnf.html +35 -9
- data/etc/ebnf.ll1.sxp +70 -50
- data/etc/ebnf.rb +87 -0
- data/etc/ebnf.sxp +18 -10
- data/etc/sparql.ll1.sxp +277 -102
- data/etc/sparql.rb +140 -0
- data/etc/turtle.ll1.sxp +27 -16
- data/etc/turtle.rb +13 -0
- data/lib/ebnf/base.rb +3 -2
- data/lib/ebnf/bnf.rb +1 -1
- data/lib/ebnf/ll1.rb +19 -9
- data/lib/ebnf/ll1/lexer.rb +15 -11
- data/lib/ebnf/ll1/parser.rb +34 -16
- data/lib/ebnf/ll1/scanner.rb +22 -8
- data/lib/ebnf/parser.rb +1 -1
- data/lib/ebnf/rule.rb +22 -10
- data/lib/ebnf/writer.rb +1 -1
- metadata +3 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 5b0233cc19d80ca25dc2221725770bc87a47d0f0
|
4
|
+
data.tar.gz: e381edc76658d0f56816c4ddc107417df301c9cc
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 18415d7b3393069f09d0af3c6433722c3aadfd4f3b5eca67c763294dd81584c9e435eb2aab6e27d15cf557f1d8587bfc67c3671e64797523340f06560c42f27c
|
7
|
+
data.tar.gz: bc6917f74c4420facfee72e7089d225545e49d201653731dd1caa28eb71f547b3c8231b44175b9144f2b7c9d4ac979088d4647c487e885adc624c046c9c595fd
|
data/README.md
CHANGED
@@ -8,12 +8,20 @@
|
|
8
8
|
[](https://gemnasium.com/gkellogg/ebnf)
|
9
9
|
|
10
10
|
## Description
|
11
|
-
This is a [Ruby][] implementation of an [EBNF][] and [BNF][] parser and parser generator.
|
12
|
-
It parses [EBNF][] grammars to [BNF][], generates [First/Follow and Branch][] tables for
|
13
|
-
[LL(1)][] grammars, which can be used with the stream [Tokenizer][] and [LL(1) Parser][].
|
11
|
+
This is a [Ruby][] implementation of an [EBNF][] and [BNF][] parser and parser generator. It parses [EBNF][] grammars to [BNF][] and generates [First/Follow][] and Branch tables for [LL(1)][] grammars, which can be used with the stream [Tokenizer][] and [LL(1) Parser][].
|
14
12
|
|
15
|
-
|
16
|
-
|
13
|
+
As LL(1) grammars operate using `alt` and `seq` primitives, allowing for a match on alternative productions or a sequence of productions, generating a parser requires turning the EBNF rules into BNF:
|
14
|
+
|
15
|
+
* Transform `a ::= b?` into `a ::= _empty | b`
|
16
|
+
* Transform `a ::= b+` into `a ::= b b*`
|
17
|
+
* Transform `a ::= b*` into `a ::= _empty | (b a)`
|
18
|
+
* Transform `a ::= op1 (op2)` into two rules:
|
19
|
+
```
|
20
|
+
a ::= op1 _a_1
|
21
|
+
_a_1 ::= op2
|
22
|
+
```
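The rewrites listed above can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions, not the gem's implementation: the nested-array expression form (`[:star, :b]`) and the `to_bnf` helper are hypothetical names introduced for the example.

```ruby
# Hypothetical sketch: rewrite one EBNF rule into BNF rules that use only
# :alt and :seq. Expressions are nested arrays such as [:star, :b].
def to_bnf(name, expr)
  op, *args = expr
  case op
  when :opt   # a ::= b?  =>  a ::= _empty | b
    [[name, [:alt, :_empty, args.first]]]
  when :star  # a ::= b*  =>  a ::= _empty | (b a)
    sub = :"_#{name}_1"
    [[name, [:alt, :_empty, sub]], [sub, [:seq, args.first, name]]]
  when :plus  # a ::= b+  =>  a ::= b b*
    sub = :"_#{name}_1"
    [[name, [:seq, args.first, sub]]] + to_bnf(sub, [:star, args.first])
  else        # already expressed with alt/seq: keep unchanged
    [[name, expr]]
  end
end
```

Calling `to_bnf(:a, [:star, :b])` yields the two rules `a ::= _empty | _a_1` and `_a_1 ::= b a`, which mirrors the parenthesized sub-expression being lifted into its own rule.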
|
23
|
+
|
24
|
+
Of note in this implementation is that the tokenizer and parser are streaming, so that they can process inputs of arbitrary size.
|
17
25
|
|
18
26
|
## Usage
|
19
27
|
### Parsing an LL(1) Grammar
|
@@ -36,7 +44,7 @@ Generate [First/Follow][] rules for BNF grammars
|
|
36
44
|
|
37
45
|
ebnf.first_follow(start_tokens)
|
38
46
|
|
39
|
-
Generate Terminal, [First/Follow and Branch
|
47
|
+
Generate Terminal, [First/Follow][], Cleanup and Branch tables as Ruby for parsing grammars
|
40
48
|
|
41
49
|
ebnf.to_ruby
|
42
50
|
|
@@ -44,8 +52,29 @@ Generate formatted grammar using HTML (requires [Haml][Haml] gem)
|
|
44
52
|
|
45
53
|
ebnf.to_html
|
46
54
|
|
47
|
-
###
|
55
|
+
### Parser S-Expressions
|
56
|
+
Intermediate representations of the grammar may be serialized to Lisp-like S-Expressions. For example, the rule `[1] ebnf ::= (declaration | rule)*` is serialized as `(rule ebnf "1" (star (alt declaration rule)))`.
|
57
|
+
|
58
|
+
Once the [LL(1)][] conversion is made and the [First/Follow][] table is generated, this rule expands as follows:
|
59
|
+
|
60
|
+
(rule ebnf "1"
|
61
|
+
(start #t)
|
62
|
+
(first "@pass" "@terminals" LHS _eps)
|
63
|
+
(follow _eof)
|
64
|
+
(cleanup star)
|
65
|
+
(alt _empty _ebnf_2))
|
66
|
+
(rule _ebnf_1 "1.1"
|
67
|
+
(first "@pass" "@terminals" LHS)
|
68
|
+
(follow "@pass" "@terminals" LHS _eof)
|
69
|
+
(alt declaration rule))
|
70
|
+
(rule _ebnf_2 "1.2"
|
71
|
+
(first "@pass" "@terminals" LHS)
|
72
|
+
(follow _eof)
|
73
|
+
(cleanup merge)
|
74
|
+
(seq _ebnf_1 ebnf))
|
75
|
+
(rule _ebnf_3 "1.3" (first "@pass" "@terminals" LHS _eps) (follow _eof) (seq ebnf))
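To make the one-line serialization quoted earlier in this section concrete, here is a minimal sketch of rendering a nested array as an S-Expression. It does not use the gem's own serializer; `to_sxp` is a hypothetical helper written for this example, and the nested-array input shape is an assumption.

```ruby
# Hypothetical helper: render nested arrays as a Lisp-like S-Expression.
# Strings are quoted, symbols are written bare, arrays become lists.
def to_sxp(node)
  case node
  when Array  then "(#{node.map { |child| to_sxp(child) }.join(' ')})"
  when String then node.inspect
  else node.to_s
  end
end

puts to_sxp([:rule, :ebnf, "1", [:star, [:alt, :declaration, :rule]]])
# prints: (rule ebnf "1" (star (alt declaration rule)))
```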
|
48
76
|
|
77
|
+
### Creating terminal definitions and parser rules to parse generated grammars
|
49
78
|
The parser is initialized with callbacks invoked on entry and exit
|
50
79
|
to each `terminal` and `production`. A trivial parser loop can be described as follows:
|
51
80
|
|
@@ -76,9 +105,10 @@ to each `terminal` and `production`. A trivial parser loop can be described as f
|
|
76
105
|
|
77
106
|
def initialize(input)
|
78
107
|
parser_options = {
|
79
|
-
:
|
80
|
-
:
|
81
|
-
:
|
108
|
+
branch: BRANCH,
|
109
|
+
first: FIRST,
|
110
|
+
follow: FOLLOW,
|
111
|
+
cleanup: CLEANUP
|
82
112
|
}
|
83
113
|
parse(input, start_symbol, parser_options) do |context, *data|
|
84
114
|
# Process calls from callback from productions
|
@@ -88,10 +118,92 @@ to each `terminal` and `production`. A trivial parser loop can be described as f
|
|
88
118
|
raise RDF::ReaderError, e.message if validate?
|
89
119
|
end
|
90
120
|
|
121
|
+
### Branch Table
|
122
|
+
The Branch table is a hash mapping production rules to a hash relating terminals appearing in the input to the sequence of productions to follow when the corresponding input terminal is found. This allows either the `seq` primitive, where all terminals map to the same sequence of productions, or the `alt` primitive, where each terminal may map to a different production.
|
123
|
+
|
124
|
+
BRANCH = {
|
125
|
+
:alt => {
|
126
|
+
"(" => [:seq, :_alt_1],
|
127
|
+
:ENUM => [:seq, :_alt_1],
|
128
|
+
:HEX => [:seq, :_alt_1],
|
129
|
+
:O_ENUM => [:seq, :_alt_1],
|
130
|
+
:O_RANGE => [:seq, :_alt_1],
|
131
|
+
:RANGE => [:seq, :_alt_1],
|
132
|
+
:STRING1 => [:seq, :_alt_1],
|
133
|
+
:STRING2 => [:seq, :_alt_1],
|
134
|
+
:SYMBOL => [:seq, :_alt_1],
|
135
|
+
},
|
136
|
+
...
|
137
|
+
:declaration => {
|
138
|
+
"@pass" => [:pass],
|
139
|
+
"@terminals" => ["@terminals"],
|
140
|
+
},
|
141
|
+
...
|
142
|
+
}
|
143
|
+
|
144
|
+
In this case the `alt` rule is `seq ('|' seq)*`, which can be entered when any of the specified tokens appears on the input stream. They all cause the same token to be passed to the `seq` rule, followed by `_alt_1`, which handles the `('|' seq)*` portion of the rule after the first sequence is matched.
|
145
|
+
|
146
|
+
The `declaration` rule is `'@terminals' | pass`, using the `alt` primitive to determine the production to run based on the terminal appearing on the input stream. Eventually, a terminal production is found and the token is consumed.
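How a table of this shape can drive a parse may be easier to see with a small sketch. The code below is not the gem's parser: the tiny grammar, the `example_branch` table, and the `accepts?` method are all assumptions made for illustration, and it only recognizes input without invoking any callbacks.

```ruby
# Hypothetical BRANCH-style table for a tiny grammar (not part of the gem):
#   list ::= ITEM rest
#   rest ::= "," ITEM rest | _empty
def example_branch
  {
    list: { ITEM: [:ITEM, :rest] },
    rest: { "," => [",", :ITEM, :rest], :_eof => [] }
  }
end

# Table-driven LL(1) recognizer: true when `tokens` derive from `start`.
def accepts?(branch, tokens, start)
  input = tokens + [:_eof]
  stack = [start]
  until stack.empty?
    top = stack.pop
    if branch.key?(top)
      # Production: the next input terminal selects the sequence to follow.
      sequence = branch[top][input.first]
      return false if sequence.nil?
      stack.concat(sequence.reverse)
    else
      # Terminal: it must equal the next input token, which is then consumed.
      return false unless input.shift == top
    end
  end
  input == [:_eof]
end
```

For example, `accepts?(example_branch, [:ITEM, ",", :ITEM], :list)` walks `list`, consumes `ITEM`, and then lets the `","` entry of `rest` choose the continuation, which is the same lookup the `alt` entries in the table above describe.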
|
147
|
+
|
148
|
+
### First/Follow Table
|
149
|
+
The [First/Follow][] table is a hash mapping production rules to the terminals that may begin (First) or follow (Follow) the rule. For example:
|
150
|
+
|
151
|
+
FIRST = {
|
152
|
+
:alt => [
|
153
|
+
:HEX,
|
154
|
+
:SYMBOL,
|
155
|
+
:ENUM,
|
156
|
+
:O_ENUM,
|
157
|
+
:RANGE,
|
158
|
+
:O_RANGE,
|
159
|
+
:STRING1,
|
160
|
+
:STRING2,
|
161
|
+
"("],
|
162
|
+
...
|
163
|
+
}
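As a rough illustration of where such entries come from, First sets can be computed by iterating to a fixed point over BNF rules. The sketch below is a textbook-style computation, not the gem's `first_follow` code; the `first_sets` name and the hash-of-alternatives grammar shape are assumptions for the example.

```ruby
require 'set'

# Hypothetical First-set computation by fixed-point iteration.
# `grammar` maps each production to a list of alternatives; each alternative
# is a sequence of symbols. Any symbol that is not a key is a terminal.
def first_sets(grammar)
  first = Hash.new { |hash, key| hash[key] = Set.new }
  changed = true
  while changed
    changed = false
    grammar.each do |lhs, alternatives|
      alternatives.each do |sequence|
        before = first[lhs].size
        nullable = true
        sequence.each do |sym|
          if grammar.key?(sym)
            # Non-terminal: inherit its First set, except the empty marker.
            first[lhs].merge(first[sym] - [:_eps])
            nullable = first[sym].include?(:_eps)
          else
            # Terminal: it starts this alternative and ends the scan.
            first[lhs] << sym
            nullable = false
          end
          break unless nullable
        end
        first[lhs] << :_eps if nullable
        changed ||= first[lhs].size != before
      end
    end
  end
  first
end
```

An empty alternative (`[]`) plays the role of `_empty` here, so a rule such as `_alt_1` ends up with both `_eps` and `"|"` in its First set.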
|
164
|
+
|
165
|
+
### Terminals Table
|
166
|
+
This table is a simple list of the terminal productions found in the grammar. For example:
|
167
|
+
|
168
|
+
TERMINALS = ["(", ")", "-",
|
169
|
+
"@pass", "@terminals",
|
170
|
+
:ENUM, :HEX, :LHS, :O_ENUM, :O_RANGE, :POSTFIX,
|
171
|
+
:RANGE, :STRING1, :STRING2, :SYMBOL, "|"
|
172
|
+
].freeze
|
173
|
+
|
174
|
+
### Cleanup Table
|
175
|
+
This table identifies productions which used EBNF rules, which are transformed to BNF for actual parsing. This allows the parser, in some cases, to reproduce *star*, *plus*, and *opt* rule matches. For example:
|
176
|
+
|
177
|
+
CLEANUP = {
|
178
|
+
:_alt_1 => :star,
|
179
|
+
:_alt_3 => :merge,
|
180
|
+
:_diff_1 => :opt,
|
181
|
+
:ebnf => :star,
|
182
|
+
:_ebnf_2 => :merge,
|
183
|
+
:_postfix_1 => :opt,
|
184
|
+
:seq => :plus,
|
185
|
+
:_seq_1 => :star,
|
186
|
+
:_seq_2 => :merge,
|
187
|
+
}.freeze
|
188
|
+
|
189
|
+
In this case the `ebnf` rule was `(declaration | rule)*`. As BNF does not support a star operator, this is decomposed into a set of rules using `alt` and `seq` primitives:
|
190
|
+
|
191
|
+
ebnf ::= _empty | _ebnf_2
|
192
|
+
_ebnf_1 ::= declaration | rule
|
193
|
+
_ebnf_2 ::= _ebnf_1 ebnf
|
194
|
+
_ebnf_3 ::= ebnf
|
195
|
+
|
196
|
+
The `_empty` production matches an empty string, so allows for no value. `_ebnf_2` matches `declaration | rule` (using the `alt` primitive) followed by `ebnf`, creating a sequence of zero or more `declaration` or `rule` members.
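The reason a cleanup step is useful can be shown with a small sketch: a star rule rewritten as right recursion naturally produces nested matches, and flattening them recovers the list that the original `*` implied. The code below is an assumption-laden illustration, not the gem's cleanup logic; the `flatten_star` name and the `[head, rest]` match shape are hypothetical.

```ruby
# Hypothetical illustration: flatten the nested result of a right-recursive
# star rule. Assumed shape: nil for _empty, or [head, rest] for a match of
# `_ebnf_1 ebnf`.
def flatten_star(match)
  items = []
  until match.nil?
    head, match = match   # take one member, continue with the remainder
    items << head
  end
  items
end
```

With this shape, `flatten_star([:declaration, [:rule, nil]])` returns a flat two-element list, which is the form a consumer of a `(declaration | rule)*` production would usually expect.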
|
91
197
|
|
92
198
|
## EBNF Grammar
|
93
199
|
The [EBNF][] variant used here is based on [W3C](http://w3.org/) [EBNF][] (see {file:etc/ebnf.ebnf EBNF grammar}) as defined in the
|
94
|
-
[XML 1.0 recommendation](http://www.w3.org/TR/REC-xml/), with minor extensions
|
200
|
+
[XML 1.0 recommendation](http://www.w3.org/TR/REC-xml/), with minor extensions:
|
201
|
+
|
202
|
+
* Comments include `//` and `#` through end of line (other than a hex character) and `/* ... */` or `(* ... *)`, which may cross lines
|
203
|
+
* All rules **MAY** start with an identifier, contained within square brackets. For example `[1] rule`, where the value within the brackets is a symbol `([a-z] | [A-Z] | [0-9] | "_" | ".")+`
|
204
|
+
* `@terminals` causes following rules to be treated as terminals. Any symbol which is entirely upper-case is also treated as a terminal
|
205
|
+
* `@pass` defines the expression used to detect whitespace, which is removed in processing.
|
206
|
+
* No support for `wfc` (well-formedness constraint) or `vc` (validity constraint).
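The optional bracketed identifier described in the list above can be recognized with a short regular expression. This is a sketch only: `parse_lhs` is a hypothetical helper, the symbol character class follows the definition quoted in the list, and the whitespace handling is an assumption rather than the gem's tokenizer behavior.

```ruby
# Hypothetical helper: split a rule's left-hand side into its optional
# bracketed identifier and its name. Returns nil when the line is not a rule.
def parse_lhs(line)
  symbol = /[a-zA-Z0-9_.]+/
  match = line.match(/\A\s*(?:\[(#{symbol})\]\s*)?(#{symbol})\s*::=/)
  match && { id: match[1], name: match[2] }
end
```

For a line such as `[1] ebnf ::= (declaration | rule)*` this returns the identifier `"1"` and the name `"ebnf"`; for a rule written without brackets the `:id` entry is `nil`.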
|
95
207
|
|
96
208
|
Parsing this grammar yields an S-Expression version: {file:etc/ebnf.ll1.sxp}.
|
97
209
|
|
data/VERSION
CHANGED
@@ -1 +1 @@
|
|
1
|
-
1.0
|
1
|
+
1.1.0
|
data/bin/ebnf
CHANGED
@@ -12,9 +12,9 @@ require 'getoptlong'
|
|
12
12
|
require 'ebnf'
|
13
13
|
|
14
14
|
options = {
|
15
|
-
:
|
16
|
-
:
|
17
|
-
:
|
15
|
+
output_format: :sxp,
|
16
|
+
prefix: "ttl",
|
17
|
+
namespace: "http://www.w3.org/ns/formats/Turtle#",
|
18
18
|
}
|
19
19
|
|
20
20
|
input, out = nil, STDOUT
|
data/etc/ebnf.ebnf
CHANGED
@@ -20,6 +20,8 @@
|
|
20
20
|
|
21
21
|
[9] primary ::= HEX
|
22
22
|
| SYMBOL
|
23
|
+
| ENUM
|
24
|
+
| O_ENUM
|
23
25
|
| RANGE
|
24
26
|
| O_RANGE
|
25
27
|
| STRING1
|
@@ -36,29 +38,33 @@
|
|
36
38
|
|
37
39
|
[13] HEX ::= '#x' ([0-9]|[a-f]|[A-F])+
|
38
40
|
|
41
|
+
[14] ENUM ::= '[' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR)) '-' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR)) ']'
|
42
|
+
|
43
|
+
[15] O_ENUM ::= '[^' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR)) '-' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR)) ']'
|
44
|
+
|
39
45
|
# Range is any combination of R_CHAR '-' R_CHAR or R_CHAR+
|
40
|
-
[
|
46
|
+
[16] RANGE ::= '[' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR))+ ']'
|
41
47
|
|
42
48
|
# Range is any combination of R_CHAR '-' R_CHAR or R_CHAR+ preceded by ^
|
43
|
-
[
|
49
|
+
[17] O_RANGE ::= '[^' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR))+ ']'
|
44
50
|
|
45
51
|
# Strings are unescaped Unicode, excepting control characters and hash (#)
|
46
|
-
[
|
52
|
+
[18] STRING1 ::= '"' (CHAR - '"')* '"'
|
47
53
|
|
48
|
-
[
|
54
|
+
[19] STRING2 ::= "'" (CHAR - "'")* "'"
|
49
55
|
|
50
|
-
[
|
56
|
+
[20] CHAR ::= HEX
|
51
57
|
| [#x20#x21#x22]
|
52
58
|
| [#x24-#x00FFFFFF]
|
53
59
|
|
54
|
-
[
|
60
|
+
[21] R_CHAR ::= CHAR - ']'
|
55
61
|
|
56
|
-
[
|
62
|
+
[22] R_BEGIN ::= (HEX | R_CHAR) "-"
|
57
63
|
|
58
64
|
# Should be able to do this inline, but not until terminal regular expressions are created automatically
|
59
|
-
[
|
65
|
+
[23] POSTFIX ::= [?*+]
|
60
66
|
|
61
|
-
[
|
67
|
+
[24] PASS ::= ( [#x00-#x20]
|
62
68
|
| ( '#' | '//' ) [^#x0A#x0D]*
|
63
69
|
| '/*' (( '*' [^/] )? | [^*] )* '*/'
|
64
70
|
| '(*' (( '*' [^)] )? | [^*] )* '*)'
|
data/etc/ebnf.html
CHANGED
@@ -76,6 +76,8 @@
|
|
76
76
|
<td>
|
77
77
|
<a href="#grammar-production-HEX">HEX</a>
|
78
78
|
<code>|</code> <a href="#grammar-production-SYMBOL">SYMBOL</a>
|
79
|
+
<code>|</code> <a href="#grammar-production-ENUM">ENUM</a>
|
80
|
+
<code>|</code> <a href="#grammar-production-O_ENUM">O_ENUM</a>
|
79
81
|
<code>|</code> <a href="#grammar-production-RANGE">RANGE</a>
|
80
82
|
<code>|</code> <a href="#grammar-production-O_RANGE">O_RANGE</a>
|
81
83
|
<code>|</code> <a href="#grammar-production-STRING1">STRING1</a>
|
@@ -119,8 +121,32 @@
|
|
119
121
|
(<code>[</code> <code class="grammar-literal">0-9</code><code>]</code> <code>|</code> <code>[</code> <code class="grammar-literal">a-f</code><code>]</code> <code>|</code> <code>[</code> <code class="grammar-literal">A-F</code><code>]</code> )<code>+</code>
|
120
122
|
</td>
|
121
123
|
</tr>
|
122
|
-
<tr id='grammar-production-
|
124
|
+
<tr id='grammar-production-ENUM'>
|
123
125
|
<td>[14]</td>
|
126
|
+
<td><code>ENUM</code></td>
|
127
|
+
<td>::=</td>
|
128
|
+
<td>
|
129
|
+
"<code class="grammar-literal">[</code>"
|
130
|
+
<code>(</code> <a href="#grammar-production-R_BEGIN">R_BEGIN</a> <code>(</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code> <code>|</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code>
|
131
|
+
"<code class="grammar-literal">-</code>"
|
132
|
+
<code>(</code> <a href="#grammar-production-R_BEGIN">R_BEGIN</a> <code>(</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code> <code>|</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code>
|
133
|
+
"<code class="grammar-literal">]</code>"
|
134
|
+
</td>
|
135
|
+
</tr>
|
136
|
+
<tr id='grammar-production-O_ENUM'>
|
137
|
+
<td>[15]</td>
|
138
|
+
<td><code>O_ENUM</code></td>
|
139
|
+
<td>::=</td>
|
140
|
+
<td>
|
141
|
+
"<code class="grammar-literal">[^</code>"
|
142
|
+
<code>(</code> <a href="#grammar-production-R_BEGIN">R_BEGIN</a> <code>(</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code> <code>|</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code>
|
143
|
+
"<code class="grammar-literal">-</code>"
|
144
|
+
<code>(</code> <a href="#grammar-production-R_BEGIN">R_BEGIN</a> <code>(</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code> <code>|</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code>
|
145
|
+
"<code class="grammar-literal">]</code>"
|
146
|
+
</td>
|
147
|
+
</tr>
|
148
|
+
<tr id='grammar-production-RANGE'>
|
149
|
+
<td>[16]</td>
|
124
150
|
<td><code>RANGE</code></td>
|
125
151
|
<td>::=</td>
|
126
152
|
<td>
|
@@ -130,7 +156,7 @@
|
|
130
156
|
</td>
|
131
157
|
</tr>
|
132
158
|
<tr id='grammar-production-O_RANGE'>
|
133
|
-
<td>[
|
159
|
+
<td>[17]</td>
|
134
160
|
<td><code>O_RANGE</code></td>
|
135
161
|
<td>::=</td>
|
136
162
|
<td>
|
@@ -140,7 +166,7 @@
|
|
140
166
|
</td>
|
141
167
|
</tr>
|
142
168
|
<tr id='grammar-production-STRING1'>
|
143
|
-
<td>[
|
169
|
+
<td>[18]</td>
|
144
170
|
<td><code>STRING1</code></td>
|
145
171
|
<td>::=</td>
|
146
172
|
<td>
|
@@ -150,7 +176,7 @@
|
|
150
176
|
</td>
|
151
177
|
</tr>
|
152
178
|
<tr id='grammar-production-STRING2'>
|
153
|
-
<td>[
|
179
|
+
<td>[19]</td>
|
154
180
|
<td><code>STRING2</code></td>
|
155
181
|
<td>::=</td>
|
156
182
|
<td>
|
@@ -160,7 +186,7 @@
|
|
160
186
|
</td>
|
161
187
|
</tr>
|
162
188
|
<tr id='grammar-production-CHAR'>
|
163
|
-
<td>[
|
189
|
+
<td>[20]</td>
|
164
190
|
<td><code>CHAR</code></td>
|
165
191
|
<td>::=</td>
|
166
192
|
<td>
|
@@ -170,7 +196,7 @@
|
|
170
196
|
</td>
|
171
197
|
</tr>
|
172
198
|
<tr id='grammar-production-R_CHAR'>
|
173
|
-
<td>[
|
199
|
+
<td>[21]</td>
|
174
200
|
<td><code>R_CHAR</code></td>
|
175
201
|
<td>::=</td>
|
176
202
|
<td>
|
@@ -179,7 +205,7 @@
|
|
179
205
|
</td>
|
180
206
|
</tr>
|
181
207
|
<tr id='grammar-production-R_BEGIN'>
|
182
|
-
<td>[
|
208
|
+
<td>[22]</td>
|
183
209
|
<td><code>R_BEGIN</code></td>
|
184
210
|
<td>::=</td>
|
185
211
|
<td>
|
@@ -188,7 +214,7 @@
|
|
188
214
|
</td>
|
189
215
|
</tr>
|
190
216
|
<tr id='grammar-production-POSTFIX'>
|
191
|
-
<td>[
|
217
|
+
<td>[23]</td>
|
192
218
|
<td><code>POSTFIX</code></td>
|
193
219
|
<td>::=</td>
|
194
220
|
<td>
|
@@ -196,7 +222,7 @@
|
|
196
222
|
</td>
|
197
223
|
</tr>
|
198
224
|
<tr id='grammar-production-PASS'>
|
199
|
-
<td>[
|
225
|
+
<td>[24]</td>
|
200
226
|
<td><code>PASS</code></td>
|
201
227
|
<td>::=</td>
|
202
228
|
<td>
|
data/etc/ebnf.ll1.sxp
CHANGED
@@ -5,12 +5,17 @@
|
|
5
5
|
(start #t)
|
6
6
|
(first "@pass" "@terminals" LHS _eps)
|
7
7
|
(follow _eof)
|
8
|
+
(cleanup star)
|
8
9
|
(alt _empty _ebnf_2))
|
9
10
|
(rule _ebnf_1 "1.1"
|
10
11
|
(first "@pass" "@terminals" LHS)
|
11
12
|
(follow "@pass" "@terminals" LHS _eof)
|
12
13
|
(alt declaration rule))
|
13
|
-
(rule _ebnf_2 "1.2"
|
14
|
+
(rule _ebnf_2 "1.2"
|
15
|
+
(first "@pass" "@terminals" LHS)
|
16
|
+
(follow _eof)
|
17
|
+
(cleanup merge)
|
18
|
+
(seq _ebnf_1 ebnf))
|
14
19
|
(rule _ebnf_3 "1.3" (first "@pass" "@terminals" LHS _eps) (follow _eof) (seq ebnf))
|
15
20
|
(rule declaration "2"
|
16
21
|
(first "@pass" "@terminals")
|
@@ -18,20 +23,21 @@
|
|
18
23
|
(alt "@terminals" pass))
|
19
24
|
(rule rule "3" (first LHS) (follow "@pass" "@terminals" LHS _eof) (seq LHS expression))
|
20
25
|
(rule _rule_1 "3.1"
|
21
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
26
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
22
27
|
(follow "@pass" "@terminals" LHS _eof)
|
23
28
|
(seq expression))
|
24
29
|
(rule expression "4"
|
25
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
30
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
26
31
|
(follow ")" "@pass" "@terminals" LHS _eof)
|
27
32
|
(seq alt))
|
28
33
|
(rule alt "5"
|
29
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
34
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
30
35
|
(follow ")" "@pass" "@terminals" LHS _eof)
|
31
36
|
(seq seq _alt_1))
|
32
37
|
(rule _alt_1 "5.1"
|
33
38
|
(first _eps "|")
|
34
39
|
(follow ")" "@pass" "@terminals" LHS _eof)
|
40
|
+
(cleanup star)
|
35
41
|
(alt _empty _alt_3))
|
36
42
|
(rule _alt_2 "5.2"
|
37
43
|
(first "|")
|
@@ -40,6 +46,7 @@
|
|
40
46
|
(rule _alt_3 "5.3"
|
41
47
|
(first "|")
|
42
48
|
(follow ")" "@pass" "@terminals" LHS _eof)
|
49
|
+
(cleanup merge)
|
43
50
|
(seq _alt_2 _alt_1))
|
44
51
|
(rule _alt_4 "5.4"
|
45
52
|
(first _eps "|")
|
@@ -50,111 +57,124 @@
|
|
50
57
|
(follow ")" "@pass" "@terminals" LHS _eof)
|
51
58
|
(seq _alt_1))
|
52
59
|
(rule _alt_6 "5.6"
|
53
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
60
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
54
61
|
(follow ")" "@pass" "@terminals" LHS _eof "|")
|
55
62
|
(seq seq))
|
56
63
|
(rule seq "6"
|
57
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
64
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
58
65
|
(follow ")" "@pass" "@terminals" LHS _eof "|")
|
66
|
+
(cleanup plus)
|
59
67
|
(seq diff _seq_1))
|
60
68
|
(rule _seq_1 "6.1"
|
61
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
|
69
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
|
62
70
|
(follow ")" "@pass" "@terminals" LHS _eof "|")
|
71
|
+
(cleanup star)
|
63
72
|
(alt _empty _seq_2))
|
64
73
|
(rule _seq_2 "6.2"
|
65
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
74
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
66
75
|
(follow ")" "@pass" "@terminals" LHS _eof "|")
|
76
|
+
(cleanup merge)
|
67
77
|
(seq diff _seq_1))
|
68
78
|
(rule _seq_3 "6.3"
|
69
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
|
79
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
|
70
80
|
(follow ")" "@pass" "@terminals" LHS _eof "|")
|
71
81
|
(seq _seq_1))
|
72
82
|
(rule _seq_4 "6.4"
|
73
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
|
83
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
|
74
84
|
(follow ")" "@pass" "@terminals" LHS _eof "|")
|
75
85
|
(seq _seq_1))
|
76
86
|
(rule diff "7"
|
77
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
78
|
-
(follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE
|
79
|
-
SYMBOL _eof "|" )
|
87
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
88
|
+
(follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
|
89
|
+
STRING1 STRING2 SYMBOL _eof "|" )
|
80
90
|
(seq postfix _diff_1))
|
81
91
|
(rule _diff_1 "7.1"
|
82
92
|
(first "-" _eps)
|
83
|
-
(follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE
|
84
|
-
SYMBOL _eof "|" )
|
93
|
+
(follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
|
94
|
+
STRING1 STRING2 SYMBOL _eof "|" )
|
95
|
+
(cleanup opt)
|
85
96
|
(alt _empty _diff_2))
|
86
97
|
(rule _diff_2 "7.2"
|
87
98
|
(first "-")
|
88
|
-
(follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE
|
89
|
-
SYMBOL _eof "|" )
|
99
|
+
(follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
|
100
|
+
STRING1 STRING2 SYMBOL _eof "|" )
|
90
101
|
(seq "-" postfix))
|
91
102
|
(rule _diff_3 "7.3"
|
92
103
|
(first "-" _eps)
|
93
|
-
(follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE
|
94
|
-
SYMBOL _eof "|" )
|
104
|
+
(follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
|
105
|
+
STRING1 STRING2 SYMBOL _eof "|" )
|
95
106
|
(seq _diff_1))
|
96
107
|
(rule _diff_4 "7.4"
|
97
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
98
|
-
(follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE
|
99
|
-
SYMBOL _eof "|" )
|
108
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
109
|
+
(follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
|
110
|
+
STRING1 STRING2 SYMBOL _eof "|" )
|
100
111
|
(seq postfix))
|
101
112
|
(rule postfix "8"
|
102
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
103
|
-
(follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE RANGE
|
104
|
-
STRING2 SYMBOL _eof "|" )
|
113
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
114
|
+
(follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
|
115
|
+
STRING1 STRING2 SYMBOL _eof "|" )
|
105
116
|
(seq primary _postfix_1))
|
106
117
|
(rule _postfix_1 "8.1"
|
107
118
|
(first POSTFIX _eps)
|
108
|
-
(follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE RANGE
|
109
|
-
STRING2 SYMBOL _eof "|" )
|
119
|
+
(follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
|
120
|
+
STRING1 STRING2 SYMBOL _eof "|" )
|
121
|
+
(cleanup opt)
|
110
122
|
(alt _empty POSTFIX))
|
111
123
|
(rule _postfix_2 "8.2"
|
112
124
|
(first POSTFIX _eps)
|
113
|
-
(follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE RANGE
|
114
|
-
STRING2 SYMBOL _eof "|" )
|
125
|
+
(follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
|
126
|
+
STRING1 STRING2 SYMBOL _eof "|" )
|
115
127
|
(seq _postfix_1))
|
116
128
|
(rule primary "9"
|
117
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
118
|
-
(follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE POSTFIX
|
119
|
-
STRING1 STRING2 SYMBOL _eof "|" )
|
120
|
-
(alt HEX SYMBOL RANGE O_RANGE STRING1 STRING2 _primary_1))
|
129
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
130
|
+
(follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE POSTFIX
|
131
|
+
RANGE STRING1 STRING2 SYMBOL _eof "|" )
|
132
|
+
(alt HEX SYMBOL ENUM O_ENUM RANGE O_RANGE STRING1 STRING2 _primary_1))
|
121
133
|
(rule _primary_1 "9.1"
|
122
134
|
(first "(")
|
123
|
-
(follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE POSTFIX
|
124
|
-
STRING1 STRING2 SYMBOL _eof "|" )
|
135
|
+
(follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE POSTFIX
|
136
|
+
RANGE STRING1 STRING2 SYMBOL _eof "|" )
|
125
137
|
(seq "(" expression ")"))
|
126
138
|
(rule _primary_2 "9.2"
|
127
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
128
|
-
(follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE POSTFIX
|
129
|
-
STRING1 STRING2 SYMBOL _eof "|" )
|
139
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
140
|
+
(follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE POSTFIX
|
141
|
+
RANGE STRING1 STRING2 SYMBOL _eof "|" )
|
130
142
|
(seq expression ")"))
|
131
143
|
(rule _primary_3 "9.3"
|
132
144
|
(first ")")
|
133
|
-
(follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE POSTFIX
|
134
|
-
STRING1 STRING2 SYMBOL _eof "|" )
|
145
|
+
(follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE POSTFIX
|
146
|
+
RANGE STRING1 STRING2 SYMBOL _eof "|" )
|
135
147
|
(seq ")"))
|
136
148
|
(rule pass "10"
|
137
149
|
(first "@pass")
|
138
150
|
(follow "@pass" "@terminals" LHS _eof)
|
139
151
|
(seq "@pass" expression))
|
140
152
|
(rule _pass_1 "10.1"
|
141
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
153
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
142
154
|
(follow "@pass" "@terminals" LHS _eof)
|
143
155
|
(seq expression))
|
144
156
|
(terminal LHS "11" (seq (opt (seq "[" (plus SYMBOL) "]")) SYMBOL "::="))
|
145
157
|
(terminal SYMBOL "12" (plus (alt (range "a-z") (range "A-Z") (range "0-9") "_" ".")))
|
146
158
|
(terminal HEX "13" (seq "#x" (plus (alt (range "0-9") (range "a-f") (range "A-F")))))
|
147
|
-
(terminal
|
159
|
+
(terminal ENUM "14"
|
160
|
+
(seq "["
|
161
|
+
(alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR)) "-"
|
162
|
+
(alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR)) "]" ))
|
163
|
+
(terminal O_ENUM "15"
|
164
|
+
(seq "[^"
|
165
|
+
(alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR)) "-"
|
166
|
+
(alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR)) "]" ))
|
167
|
+
(terminal RANGE "16"
|
148
168
|
(seq "[" (plus (alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR))) "]"))
|
149
|
-
(terminal O_RANGE "
|
169
|
+
(terminal O_RANGE "17"
|
150
170
|
(seq "[^" (plus (alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR))) "]"))
|
151
|
-
(terminal STRING1 "
|
152
|
-
(terminal STRING2 "
|
153
|
-
(terminal CHAR "
|
154
|
-
(terminal R_CHAR "
|
155
|
-
(terminal R_BEGIN "
|
156
|
-
(terminal POSTFIX "
|
157
|
-
(terminal PASS "
|
171
|
+
(terminal STRING1 "18" (seq "\"" (star (diff CHAR "\"")) "\""))
|
172
|
+
(terminal STRING2 "19" (seq "'" (star (diff CHAR "'")) "'"))
|
173
|
+
(terminal CHAR "20" (alt HEX (range "#x20#x21#x22") (range "#x24-#x00FFFFFF")))
|
174
|
+
(terminal R_CHAR "21" (diff CHAR "]"))
|
175
|
+
(terminal R_BEGIN "22" (seq (alt HEX R_CHAR) "-"))
|
176
|
+
(terminal POSTFIX "23" (range "?*+"))
|
177
|
+
(terminal PASS "24"
|
158
178
|
(plus
|
159
179
|
(alt
|
160
180
|
(range "#x00-#x20")
|