ebnf 1.0.2 → 1.1.0
- checksums.yaml +4 -4
- data/README.md +123 -11
- data/VERSION +1 -1
- data/bin/ebnf +3 -3
- data/etc/ebnf.ebnf +15 -9
- data/etc/ebnf.html +35 -9
- data/etc/ebnf.ll1.sxp +70 -50
- data/etc/ebnf.rb +87 -0
- data/etc/ebnf.sxp +18 -10
- data/etc/sparql.ll1.sxp +277 -102
- data/etc/sparql.rb +140 -0
- data/etc/turtle.ll1.sxp +27 -16
- data/etc/turtle.rb +13 -0
- data/lib/ebnf/base.rb +3 -2
- data/lib/ebnf/bnf.rb +1 -1
- data/lib/ebnf/ll1.rb +19 -9
- data/lib/ebnf/ll1/lexer.rb +15 -11
- data/lib/ebnf/ll1/parser.rb +34 -16
- data/lib/ebnf/ll1/scanner.rb +22 -8
- data/lib/ebnf/parser.rb +1 -1
- data/lib/ebnf/rule.rb +22 -10
- data/lib/ebnf/writer.rb +1 -1
- metadata +3 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 5b0233cc19d80ca25dc2221725770bc87a47d0f0
|
4
|
+
data.tar.gz: e381edc76658d0f56816c4ddc107417df301c9cc
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 18415d7b3393069f09d0af3c6433722c3aadfd4f3b5eca67c763294dd81584c9e435eb2aab6e27d15cf557f1d8587bfc67c3671e64797523340f06560c42f27c
|
7
|
+
data.tar.gz: bc6917f74c4420facfee72e7089d225545e49d201653731dd1caa28eb71f547b3c8231b44175b9144f2b7c9d4ac979088d4647c487e885adc624c046c9c595fd
|
data/README.md
CHANGED
@@ -8,12 +8,20 @@
|
|
8
8
|
[![Dependency Status](https://gemnasium.com/gkellogg/ebnf.png)](https://gemnasium.com/gkellogg/ebnf)
|
9
9
|
|
10
10
|
## Description
|
11
|
-
This is a [Ruby][] implementation of an [EBNF][] and [BNF][] parser and parser generator.
|
12
|
-
It parses [EBNF][] grammars to [BNF][], generates [First/Follow and Branch][] tables for
|
13
|
-
[LL(1)][] grammars, which can be used with the stream [Tokenizer][] and [LL(1) Parser][].
|
11
|
+
This is a [Ruby][] implementation of an [EBNF][] and [BNF][] parser and parser generator. It parses [EBNF][] grammars to [BNF][], generates [First/Follow][] and Branch tables for [LL(1)][] grammars, which can be used with the stream [Tokenizer][] and [LL(1) Parser][].
|
14
12
|
|
15
|
-
|
16
|
-
|
13
|
+
As LL(1) grammars operate using `alt` and `seq` primitives, allowing for a match on alternative productions or a sequence of productions, generating a parser requires turning the EBNF rules into BNF:
|
14
|
+
|
15
|
+
* Transform `a ::= b?` into `a ::= _empty | b`
|
16
|
+
* Transform `a ::= b+` into `a ::= b b*`
|
17
|
+
* Transform `a ::= b*` into `a ::= _empty | (b a)`
|
18
|
+
* Transform `a ::= op1 (op2)` into two rules:
|
19
|
+
```
|
20
|
+
a ::= op1 _a_1
|
21
|
+
_a_1 ::= op2
|
22
|
+
```
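The rewrites listed above can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: `ebnf_to_bnf` is a hypothetical helper (not part of the gem's API), rules are modelled as plain nested arrays, and the `_name_1` naming merely imitates the generated rule names shown in this README.

```ruby
# Hypothetical helper, not part of the ebnf gem: rewrite a single EBNF
# postfix rule into BNF rules that use only :alt and :seq.
def ebnf_to_bnf(name, expr)
  op, operand = expr
  case op
  when :opt
    # a ::= b?  becomes  a ::= _empty | b
    { name => [:alt, :_empty, operand] }
  when :star
    # a ::= b*  becomes  a ::= _empty | (b a)
    { name => [:alt, :_empty, [:seq, operand, name]] }
  when :plus
    # a ::= b+  becomes  a ::= b b*, with b* moved into a helper rule
    helper = :"_#{name}_1"
    { name => [:seq, operand, helper] }.merge(ebnf_to_bnf(helper, [:star, operand]))
  else
    # Already expressed with alt/seq; leave unchanged.
    { name => expr }
  end
end

p ebnf_to_bnf(:a, [:plus, :b])
```

The `:plus` case applies two of the listed rewrites in a row, which is why it returns two rules.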
|
23
|
+
|
24
|
+
Of note in this implementation is that the tokenizer and parser are streaming, so that they can process inputs of arbitrary size.
|
17
25
|
|
18
26
|
## Usage
|
19
27
|
### Parsing an LL(1) Grammar
|
@@ -36,7 +44,7 @@ Generate [First/Follow][] rules for BNF grammars
|
|
36
44
|
|
37
45
|
ebnf.first_follow(start_tokens)
|
38
46
|
|
39
|
-
Generate Terminal, [First/Follow and Branch
|
47
|
+
Generate Terminal, [First/Follow][], Cleanup and Branch tables as Ruby for parsing grammars
|
40
48
|
|
41
49
|
ebnf.to_ruby
|
42
50
|
|
@@ -44,8 +52,29 @@ Generate formatted grammar using HTML (requires [Haml][Haml] gem)
|
|
44
52
|
|
45
53
|
ebnf.to_html
|
46
54
|
|
47
|
-
###
|
55
|
+
### Parser S-Expressions
|
56
|
+
Intermediate representations of the grammar may be serialized to Lisp-like S-Expressions. For example, the rule `[1] ebnf ::= (declaration | rule)*` is serialized as `(rule ebnf "1" (star (alt declaration rule)))`.
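As a rough illustration of that serialization, the sketch below renders a nested Ruby array as an S-Expression string. The `to_sxp` function here is a standalone stand-in written for this example; it is an assumption and not the serializer the gem actually uses.

```ruby
# Standalone sketch: arrays become parenthesized lists, strings are quoted,
# and symbols are written bare.
def to_sxp(node)
  case node
  when Array  then "(" + node.map { |n| to_sxp(n) }.join(" ") + ")"
  when String then node.inspect
  else node.to_s
  end
end

rule = [:rule, :ebnf, "1", [:star, [:alt, :declaration, :rule]]]
puts to_sxp(rule)
```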
|
57
|
+
|
58
|
+
Once the [LL(1)][] conversion is made and the [First/Follow][] table is generated, this rule expands as follows:
|
59
|
+
|
60
|
+
(rule ebnf "1"
|
61
|
+
(start #t)
|
62
|
+
(first "@pass" "@terminals" LHS _eps)
|
63
|
+
(follow _eof)
|
64
|
+
(cleanup star)
|
65
|
+
(alt _empty _ebnf_2))
|
66
|
+
(rule _ebnf_1 "1.1"
|
67
|
+
(first "@pass" "@terminals" LHS)
|
68
|
+
(follow "@pass" "@terminals" LHS _eof)
|
69
|
+
(alt declaration rule))
|
70
|
+
(rule _ebnf_2 "1.2"
|
71
|
+
(first "@pass" "@terminals" LHS)
|
72
|
+
(follow _eof)
|
73
|
+
(cleanup merge)
|
74
|
+
(seq _ebnf_1 ebnf))
|
75
|
+
(rule _ebnf_3 "1.3" (first "@pass" "@terminals" LHS _eps) (follow _eof) (seq ebnf))
|
48
76
|
|
77
|
+
### Creating terminal definitions and parser rules to parse generated grammars
|
49
78
|
The parser is initialized to callbacks invoked on entry and exit
|
50
79
|
to each `terminal` and `production`. A trivial parser loop can be described as follows:
|
51
80
|
|
@@ -76,9 +105,10 @@ to each `terminal` and `production`. A trivial parser loop can be described as f
|
|
76
105
|
|
77
106
|
def initialize(input)
|
78
107
|
parser_options = {
|
79
|
-
:
|
80
|
-
:
|
81
|
-
:
|
108
|
+
branch: BRANCH,
|
109
|
+
first: FIRST,
|
110
|
+
follow: FOLLOW,
|
111
|
+
cleanup: CLEANUP
|
82
112
|
}
|
83
113
|
parse(input, start_symbol, parser_options) do |context, *data|
|
84
114
|
# Process calls from callback from productions
|
@@ -88,10 +118,92 @@ to each `terminal` and `production`. A trivial parser loop can be described as f
|
|
88
118
|
raise RDF::ReaderError, e.message if validate?
|
89
119
|
end
|
90
120
|
|
121
|
+
### Branch Table
|
122
|
+
The Branch table is a hash mapping each production rule to a hash that relates terminals appearing in the input to the sequence of productions to follow when that terminal is found. This allows either the `seq` primitive, where all terminals map to the same sequence of productions, or the `alt` primitive, where each terminal may map to a different production.
|
123
|
+
|
124
|
+
BRANCH = {
|
125
|
+
:alt => {
|
126
|
+
"(" => [:seq, :_alt_1],
|
127
|
+
:ENUM => [:seq, :_alt_1],
|
128
|
+
:HEX => [:seq, :_alt_1],
|
129
|
+
:O_ENUM => [:seq, :_alt_1],
|
130
|
+
:O_RANGE => [:seq, :_alt_1],
|
131
|
+
:RANGE => [:seq, :_alt_1],
|
132
|
+
:STRING1 => [:seq, :_alt_1],
|
133
|
+
:STRING2 => [:seq, :_alt_1],
|
134
|
+
:SYMBOL => [:seq, :_alt_1],
|
135
|
+
},
|
136
|
+
...
|
137
|
+
:declaration => {
|
138
|
+
"@pass" => [:pass],
|
139
|
+
"@terminals" => ["@terminals"],
|
140
|
+
},
|
141
|
+
...
|
142
|
+
}
|
143
|
+
|
144
|
+
In this case the `alt` rule is `seq ('|' seq)*`, which can be entered when any of the specified tokens appears on the input stream. They all cause the same token to be passed to the `seq` rule, followed by `_alt_1`, which handles the `('|' seq)*` portion of the rule after the first sequence is matched.
|
145
|
+
|
146
|
+
The `declaration` rule is `'@terminals' | pass`, using the `alt` primitive to determine the production to run based on the terminal appearing on the input stream. Eventually, a terminal production is found and the token is consumed.
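To make the lookup concrete, here is a minimal table-driven recognizer. It is a sketch under assumptions: `TOY_BRANCH`, `recognize`, and the toy grammar `expr ::= "(" expr ")" | SYMBOL` are invented for illustration, and the gem's own parser additionally handles empty productions, callbacks, and error recovery.

```ruby
# Toy branch table: production => { lookahead terminal => sequence to push }.
TOY_BRANCH = {
  expr: {
    "("     => ["(", :expr, ")"],
    :SYMBOL => [:SYMBOL]
  }
}.freeze

# Returns true when the token list matches the start production.
def recognize(tokens, start, branch)
  stack = [start]
  input = tokens.dup
  until stack.empty?
    top = stack.shift
    lookahead = input.first
    if branch.key?(top)
      # Non-terminal: pick the sequence selected by the lookahead terminal.
      sequence = branch[top][lookahead]
      return false unless sequence
      stack.unshift(*sequence)
    elsif top == lookahead
      # Terminal: consume the matching token.
      input.shift
    else
      return false
    end
  end
  input.empty?
end

puts recognize(["(", :SYMBOL, ")"], :expr, TOY_BRANCH)
```

Because each step consults only the next token, the input never needs to be held in memory as a whole, which fits the streaming design described earlier.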
|
147
|
+
|
148
|
+
### First/Follow Table
|
149
|
+
The [First/Follow][] table is a hash mapping production rules to the terminals that may begin or follow the rule. For example:
|
150
|
+
|
151
|
+
FIRST = {
|
152
|
+
:alt => [
|
153
|
+
:HEX,
|
154
|
+
:SYMBOL,
|
155
|
+
:ENUM,
|
156
|
+
:O_ENUM,
|
157
|
+
:RANGE,
|
158
|
+
:O_RANGE,
|
159
|
+
:STRING1,
|
160
|
+
:STRING2,
|
161
|
+
"("],
|
162
|
+
...
|
163
|
+
}
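For readers unfamiliar with how such a table is derived, the following sketch computes FIRST sets with the textbook fixed-point iteration. It is not the gem's implementation: `first_sets` and `RULES` are hypothetical names, rules are plain `[:alt, ...]` or `[:seq, ...]` arrays, and any symbol without a rule is treated as a terminal.

```ruby
require "set"

# Textbook FIRST-set computation; :_empty contributes :_eps.
def first_sets(rules)
  first = Hash.new { |hash, key| hash[key] = Set.new }
  first_of = lambda do |sym|
    next Set[:_eps] if sym == :_empty
    rules.key?(sym) ? first[sym] : Set[sym]
  end
  loop do
    changed = false
    rules.each do |name, (op, *parts)|
      before = first[name].size
      if op == :alt
        parts.each { |part| first[name].merge(first_of.call(part)) }
      else
        # :seq: take FIRST of each part until one cannot be empty.
        nullable = true
        parts.each do |part|
          part_first = first_of.call(part)
          first[name].merge(part_first - [:_eps])
          next if part_first.include?(:_eps)
          nullable = false
          break
        end
        first[name] << :_eps if nullable
      end
      changed ||= first[name].size != before
    end
    break unless changed
  end
  first
end

# A fragment of the EBNF grammar after BNF conversion (simplified).
RULES = {
  ebnf:        [:alt, :_empty, :_ebnf_2],
  _ebnf_1:     [:alt, :declaration, :rule],
  _ebnf_2:     [:seq, :_ebnf_1, :ebnf],
  declaration: [:alt, "@terminals", :pass],
  rule:        [:seq, :LHS, :expression],
  pass:        [:seq, "@pass", :expression]
}.freeze

p first_sets(RULES)[:ebnf]
```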
|
164
|
+
|
165
|
+
### Terminals Table
|
166
|
+
This table is a simple list of the terminal productions found in the grammar. For example:
|
167
|
+
|
168
|
+
TERMINALS = ["(", ")", "-",
|
169
|
+
"@pass", "@terminals",
|
170
|
+
:ENUM, :HEX, :LHS, :O_ENUM, :O_RANGE,:POSTFIX,
|
171
|
+
:RANGE, :STRING1, :STRING2, :SYMBOL,"|"
|
172
|
+
].freeze
|
173
|
+
|
174
|
+
### Cleanup Table
|
175
|
+
This table identifies productions which used EBNF rules, which are transformed to BNF for actual parsing. This allows the parser, in some cases, to reproduce *star*, *plus*, and *opt* rule matches. For example:
|
176
|
+
|
177
|
+
CLEANUP = {
|
178
|
+
:_alt_1 => :star,
|
179
|
+
:_alt_3 => :merge,
|
180
|
+
:_diff_1 => :opt,
|
181
|
+
:ebnf => :star,
|
182
|
+
:_ebnf_2 => :merge,
|
183
|
+
:_postfix_1 => :opt,
|
184
|
+
:seq => :plus,
|
185
|
+
:_seq_1 => :star,
|
186
|
+
:_seq_2 => :merge,
|
187
|
+
}.freeze
|
188
|
+
|
189
|
+
In this case the `ebnf` rule was `(declaration | rule)*`. As BNF does not support a star operator, this is decomposed into a set of rules using `alt` and `seq` primitives:
|
190
|
+
|
191
|
+
ebnf ::= _empty | _ebnf_2
|
192
|
+
_ebnf_1 ::= declaration | rule
|
193
|
+
_ebnf_2 ::= _ebnf_1 ebnf
|
194
|
+
_ebnf_3 ::= ebnf
|
195
|
+
|
196
|
+
The `_empty` production matches an empty string, so allows for no value. `_ebnf_2` matches `declaration | rule` (using the `alt` primitive) followed by `ebnf`, creating a sequence of zero or more `declaration` or `rule` members.
|
91
197
|
|
92
198
|
## EBNF Grammar
|
93
199
|
The [EBNF][] variant used here is based on [W3C](http://w3.org/) [EBNF][] (see {file:etc/ebnf.ebnf EBNF grammar}) as defined in the
|
94
|
-
[XML 1.0 recommendation](http://www.w3.org/TR/REC-xml/), with minor extensions
|
200
|
+
[XML 1.0 recommendation](http://www.w3.org/TR/REC-xml/), with minor extensions:
|
201
|
+
|
202
|
+
* Comments include `//` and `#` through end of line (other than hex character) and `/* ... */` or `(* ... *)`, which may cross lines
|
203
|
+
* All rules **MAY** start with an identifier, contained within square brackets. For example `[1] rule`, where the value within the brackets is a symbol `([a-z] | [A-Z] | [0-9] | "_" | ".")+`
|
204
|
+
* `@terminals` causes following rules to be treated as terminals. Any rules whose names are entirely upper-case are also treated as terminals
|
205
|
+
* `@pass` defines the expression used to detect whitespace, which is removed in processing.
|
206
|
+
* No support for `wfc` (well-formedness constraint) or `vc` (validity constraint).
|
95
207
|
|
96
208
|
Parsing this grammar yields an S-Expression version: {file:etc/ebnf.ll1.sxp}.
|
97
209
|
|
data/VERSION
CHANGED
@@ -1 +1 @@
|
|
1
|
-
1.0
|
1
|
+
1.1.0
|
data/bin/ebnf
CHANGED
@@ -12,9 +12,9 @@ require 'getoptlong'
|
|
12
12
|
require 'ebnf'
|
13
13
|
|
14
14
|
options = {
|
15
|
-
:
|
16
|
-
:
|
17
|
-
:
|
15
|
+
output_format: :sxp,
|
16
|
+
prefix: "ttl",
|
17
|
+
namespace: "http://www.w3.org/ns/formats/Turtle#",
|
18
18
|
}
|
19
19
|
|
20
20
|
input, out = nil, STDOUT
|
data/etc/ebnf.ebnf
CHANGED
@@ -20,6 +20,8 @@
|
|
20
20
|
|
21
21
|
[9] primary ::= HEX
|
22
22
|
| SYMBOL
|
23
|
+
| ENUM
|
24
|
+
| O_ENUM
|
23
25
|
| RANGE
|
24
26
|
| O_RANGE
|
25
27
|
| STRING1
|
@@ -36,29 +38,33 @@
|
|
36
38
|
|
37
39
|
[13] HEX ::= '#x' ([0-9]|[a-f]|[A-F])+
|
38
40
|
|
41
|
+
[14] ENUM ::= '[' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR)) '-' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR)) ']'
|
42
|
+
|
43
|
+
[15] O_ENUM ::= '[^' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR)) '-' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR)) ']'
|
44
|
+
|
39
45
|
# Range is any combination of R_CHAR '-' R_CHAR or R_CHAR+
|
40
|
-
[
|
46
|
+
[16] RANGE ::= '[' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR))+ ']'
|
41
47
|
|
42
48
|
# Range is any combination of R_CHAR '-' R_CHAR or R_CHAR+ preceded by ^
|
43
|
-
[
|
49
|
+
[17] O_RANGE ::= '[^' ((R_BEGIN (HEX | R_CHAR)) | (HEX | R_CHAR))+ ']'
|
44
50
|
|
45
51
|
# Strings are unescaped Unicode, excepting control characters and hash (#)
|
46
|
-
[
|
52
|
+
[18] STRING1 ::= '"' (CHAR - '"')* '"'
|
47
53
|
|
48
|
-
[
|
54
|
+
[19] STRING2 ::= "'" (CHAR - "'")* "'"
|
49
55
|
|
50
|
-
[
|
56
|
+
[20] CHAR ::= HEX
|
51
57
|
| [#x20#x21#x22]
|
52
58
|
| [#x24-#x00FFFFFF]
|
53
59
|
|
54
|
-
[
|
60
|
+
[21] R_CHAR ::= CHAR - ']'
|
55
61
|
|
56
|
-
[
|
62
|
+
[22] R_BEGIN ::= (HEX | R_CHAR) "-"
|
57
63
|
|
58
64
|
# Should be able to do this inline, but not until terminal regular expressions are created automatically
|
59
|
-
[
|
65
|
+
[23] POSTFIX ::= [?*+]
|
60
66
|
|
61
|
-
[
|
67
|
+
[24] PASS ::= ( [#x00-#x20]
|
62
68
|
| ( '#' | '//' ) [^#x0A#x0D]*
|
63
69
|
| '/*' (( '*' [^/] )? | [^*] )* '*/'
|
64
70
|
| '(*' (( '*' [^)] )? | [^*] )* '*)'
|
data/etc/ebnf.html
CHANGED
@@ -76,6 +76,8 @@
|
|
76
76
|
<td>
|
77
77
|
<a href="#grammar-production-HEX">HEX</a>
|
78
78
|
<code>|</code> <a href="#grammar-production-SYMBOL">SYMBOL</a>
|
79
|
+
<code>|</code> <a href="#grammar-production-ENUM">ENUM</a>
|
80
|
+
<code>|</code> <a href="#grammar-production-O_ENUM">O_ENUM</a>
|
79
81
|
<code>|</code> <a href="#grammar-production-RANGE">RANGE</a>
|
80
82
|
<code>|</code> <a href="#grammar-production-O_RANGE">O_RANGE</a>
|
81
83
|
<code>|</code> <a href="#grammar-production-STRING1">STRING1</a>
|
@@ -119,8 +121,32 @@
|
|
119
121
|
(<code>[</code> <code class="grammar-literal">0-9</code><code>]</code> <code>|</code> <code>[</code> <code class="grammar-literal">a-f</code><code>]</code> <code>|</code> <code>[</code> <code class="grammar-literal">A-F</code><code>]</code> )<code>+</code>
|
120
122
|
</td>
|
121
123
|
</tr>
|
122
|
-
<tr id='grammar-production-
|
124
|
+
<tr id='grammar-production-ENUM'>
|
123
125
|
<td>[14]</td>
|
126
|
+
<td><code>ENUM</code></td>
|
127
|
+
<td>::=</td>
|
128
|
+
<td>
|
129
|
+
"<code class="grammar-literal">[</code>"
|
130
|
+
<code>(</code> <a href="#grammar-production-R_BEGIN">R_BEGIN</a> <code>(</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code> <code>|</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code>
|
131
|
+
"<code class="grammar-literal">-</code>"
|
132
|
+
<code>(</code> <a href="#grammar-production-R_BEGIN">R_BEGIN</a> <code>(</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code> <code>|</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code>
|
133
|
+
"<code class="grammar-literal">]</code>"
|
134
|
+
</td>
|
135
|
+
</tr>
|
136
|
+
<tr id='grammar-production-O_ENUM'>
|
137
|
+
<td>[15]</td>
|
138
|
+
<td><code>O_ENUM</code></td>
|
139
|
+
<td>::=</td>
|
140
|
+
<td>
|
141
|
+
"<code class="grammar-literal">[^</code>"
|
142
|
+
<code>(</code> <a href="#grammar-production-R_BEGIN">R_BEGIN</a> <code>(</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code> <code>|</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code>
|
143
|
+
"<code class="grammar-literal">-</code>"
|
144
|
+
<code>(</code> <a href="#grammar-production-R_BEGIN">R_BEGIN</a> <code>(</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code> <code>|</code> <a href="#grammar-production-HEX">HEX</a> <code>|</code> <a href="#grammar-production-R_CHAR">R_CHAR</a><code>)</code>
|
145
|
+
"<code class="grammar-literal">]</code>"
|
146
|
+
</td>
|
147
|
+
</tr>
|
148
|
+
<tr id='grammar-production-RANGE'>
|
149
|
+
<td>[16]</td>
|
124
150
|
<td><code>RANGE</code></td>
|
125
151
|
<td>::=</td>
|
126
152
|
<td>
|
@@ -130,7 +156,7 @@
|
|
130
156
|
</td>
|
131
157
|
</tr>
|
132
158
|
<tr id='grammar-production-O_RANGE'>
|
133
|
-
<td>[
|
159
|
+
<td>[17]</td>
|
134
160
|
<td><code>O_RANGE</code></td>
|
135
161
|
<td>::=</td>
|
136
162
|
<td>
|
@@ -140,7 +166,7 @@
|
|
140
166
|
</td>
|
141
167
|
</tr>
|
142
168
|
<tr id='grammar-production-STRING1'>
|
143
|
-
<td>[
|
169
|
+
<td>[18]</td>
|
144
170
|
<td><code>STRING1</code></td>
|
145
171
|
<td>::=</td>
|
146
172
|
<td>
|
@@ -150,7 +176,7 @@
|
|
150
176
|
</td>
|
151
177
|
</tr>
|
152
178
|
<tr id='grammar-production-STRING2'>
|
153
|
-
<td>[
|
179
|
+
<td>[19]</td>
|
154
180
|
<td><code>STRING2</code></td>
|
155
181
|
<td>::=</td>
|
156
182
|
<td>
|
@@ -160,7 +186,7 @@
|
|
160
186
|
</td>
|
161
187
|
</tr>
|
162
188
|
<tr id='grammar-production-CHAR'>
|
163
|
-
<td>[
|
189
|
+
<td>[20]</td>
|
164
190
|
<td><code>CHAR</code></td>
|
165
191
|
<td>::=</td>
|
166
192
|
<td>
|
@@ -170,7 +196,7 @@
|
|
170
196
|
</td>
|
171
197
|
</tr>
|
172
198
|
<tr id='grammar-production-R_CHAR'>
|
173
|
-
<td>[
|
199
|
+
<td>[21]</td>
|
174
200
|
<td><code>R_CHAR</code></td>
|
175
201
|
<td>::=</td>
|
176
202
|
<td>
|
@@ -179,7 +205,7 @@
|
|
179
205
|
</td>
|
180
206
|
</tr>
|
181
207
|
<tr id='grammar-production-R_BEGIN'>
|
182
|
-
<td>[
|
208
|
+
<td>[22]</td>
|
183
209
|
<td><code>R_BEGIN</code></td>
|
184
210
|
<td>::=</td>
|
185
211
|
<td>
|
@@ -188,7 +214,7 @@
|
|
188
214
|
</td>
|
189
215
|
</tr>
|
190
216
|
<tr id='grammar-production-POSTFIX'>
|
191
|
-
<td>[
|
217
|
+
<td>[23]</td>
|
192
218
|
<td><code>POSTFIX</code></td>
|
193
219
|
<td>::=</td>
|
194
220
|
<td>
|
@@ -196,7 +222,7 @@
|
|
196
222
|
</td>
|
197
223
|
</tr>
|
198
224
|
<tr id='grammar-production-PASS'>
|
199
|
-
<td>[
|
225
|
+
<td>[24]</td>
|
200
226
|
<td><code>PASS</code></td>
|
201
227
|
<td>::=</td>
|
202
228
|
<td>
|
data/etc/ebnf.ll1.sxp
CHANGED
@@ -5,12 +5,17 @@
|
|
5
5
|
(start #t)
|
6
6
|
(first "@pass" "@terminals" LHS _eps)
|
7
7
|
(follow _eof)
|
8
|
+
(cleanup star)
|
8
9
|
(alt _empty _ebnf_2))
|
9
10
|
(rule _ebnf_1 "1.1"
|
10
11
|
(first "@pass" "@terminals" LHS)
|
11
12
|
(follow "@pass" "@terminals" LHS _eof)
|
12
13
|
(alt declaration rule))
|
13
|
-
(rule _ebnf_2 "1.2"
|
14
|
+
(rule _ebnf_2 "1.2"
|
15
|
+
(first "@pass" "@terminals" LHS)
|
16
|
+
(follow _eof)
|
17
|
+
(cleanup merge)
|
18
|
+
(seq _ebnf_1 ebnf))
|
14
19
|
(rule _ebnf_3 "1.3" (first "@pass" "@terminals" LHS _eps) (follow _eof) (seq ebnf))
|
15
20
|
(rule declaration "2"
|
16
21
|
(first "@pass" "@terminals")
|
@@ -18,20 +23,21 @@
|
|
18
23
|
(alt "@terminals" pass))
|
19
24
|
(rule rule "3" (first LHS) (follow "@pass" "@terminals" LHS _eof) (seq LHS expression))
|
20
25
|
(rule _rule_1 "3.1"
|
21
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
26
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
22
27
|
(follow "@pass" "@terminals" LHS _eof)
|
23
28
|
(seq expression))
|
24
29
|
(rule expression "4"
|
25
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
30
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
26
31
|
(follow ")" "@pass" "@terminals" LHS _eof)
|
27
32
|
(seq alt))
|
28
33
|
(rule alt "5"
|
29
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
34
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
30
35
|
(follow ")" "@pass" "@terminals" LHS _eof)
|
31
36
|
(seq seq _alt_1))
|
32
37
|
(rule _alt_1 "5.1"
|
33
38
|
(first _eps "|")
|
34
39
|
(follow ")" "@pass" "@terminals" LHS _eof)
|
40
|
+
(cleanup star)
|
35
41
|
(alt _empty _alt_3))
|
36
42
|
(rule _alt_2 "5.2"
|
37
43
|
(first "|")
|
@@ -40,6 +46,7 @@
|
|
40
46
|
(rule _alt_3 "5.3"
|
41
47
|
(first "|")
|
42
48
|
(follow ")" "@pass" "@terminals" LHS _eof)
|
49
|
+
(cleanup merge)
|
43
50
|
(seq _alt_2 _alt_1))
|
44
51
|
(rule _alt_4 "5.4"
|
45
52
|
(first _eps "|")
|
@@ -50,111 +57,124 @@
|
|
50
57
|
(follow ")" "@pass" "@terminals" LHS _eof)
|
51
58
|
(seq _alt_1))
|
52
59
|
(rule _alt_6 "5.6"
|
53
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
60
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
54
61
|
(follow ")" "@pass" "@terminals" LHS _eof "|")
|
55
62
|
(seq seq))
|
56
63
|
(rule seq "6"
|
57
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
64
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
58
65
|
(follow ")" "@pass" "@terminals" LHS _eof "|")
|
66
|
+
(cleanup plus)
|
59
67
|
(seq diff _seq_1))
|
60
68
|
(rule _seq_1 "6.1"
|
61
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
|
69
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
|
62
70
|
(follow ")" "@pass" "@terminals" LHS _eof "|")
|
71
|
+
(cleanup star)
|
63
72
|
(alt _empty _seq_2))
|
64
73
|
(rule _seq_2 "6.2"
|
65
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
74
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
66
75
|
(follow ")" "@pass" "@terminals" LHS _eof "|")
|
76
|
+
(cleanup merge)
|
67
77
|
(seq diff _seq_1))
|
68
78
|
(rule _seq_3 "6.3"
|
69
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
|
79
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
|
70
80
|
(follow ")" "@pass" "@terminals" LHS _eof "|")
|
71
81
|
(seq _seq_1))
|
72
82
|
(rule _seq_4 "6.4"
|
73
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
|
83
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL _eps)
|
74
84
|
(follow ")" "@pass" "@terminals" LHS _eof "|")
|
75
85
|
(seq _seq_1))
|
76
86
|
(rule diff "7"
|
77
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
78
|
-
(follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE
|
79
|
-
SYMBOL _eof "|" )
|
87
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
88
|
+
(follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
|
89
|
+
STRING1 STRING2 SYMBOL _eof "|" )
|
80
90
|
(seq postfix _diff_1))
|
81
91
|
(rule _diff_1 "7.1"
|
82
92
|
(first "-" _eps)
|
83
|
-
(follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE
|
84
|
-
SYMBOL _eof "|" )
|
93
|
+
(follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
|
94
|
+
STRING1 STRING2 SYMBOL _eof "|" )
|
95
|
+
(cleanup opt)
|
85
96
|
(alt _empty _diff_2))
|
86
97
|
(rule _diff_2 "7.2"
|
87
98
|
(first "-")
|
88
|
-
(follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE
|
89
|
-
SYMBOL _eof "|" )
|
99
|
+
(follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
|
100
|
+
STRING1 STRING2 SYMBOL _eof "|" )
|
90
101
|
(seq "-" postfix))
|
91
102
|
(rule _diff_3 "7.3"
|
92
103
|
(first "-" _eps)
|
93
|
-
(follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE
|
94
|
-
SYMBOL _eof "|" )
|
104
|
+
(follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
|
105
|
+
STRING1 STRING2 SYMBOL _eof "|" )
|
95
106
|
(seq _diff_1))
|
96
107
|
(rule _diff_4 "7.4"
|
97
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
98
|
-
(follow "(" ")" "@pass" "@terminals" HEX LHS O_RANGE RANGE
|
99
|
-
SYMBOL _eof "|" )
|
108
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
109
|
+
(follow "(" ")" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
|
110
|
+
STRING1 STRING2 SYMBOL _eof "|" )
|
100
111
|
(seq postfix))
|
101
112
|
(rule postfix "8"
|
102
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
103
|
-
(follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE RANGE
|
104
|
-
STRING2 SYMBOL _eof "|" )
|
113
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
114
|
+
(follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
|
115
|
+
STRING1 STRING2 SYMBOL _eof "|" )
|
105
116
|
(seq primary _postfix_1))
|
106
117
|
(rule _postfix_1 "8.1"
|
107
118
|
(first POSTFIX _eps)
|
108
|
-
(follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE RANGE
|
109
|
-
STRING2 SYMBOL _eof "|" )
|
119
|
+
(follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
|
120
|
+
STRING1 STRING2 SYMBOL _eof "|" )
|
121
|
+
(cleanup opt)
|
110
122
|
(alt _empty POSTFIX))
|
111
123
|
(rule _postfix_2 "8.2"
|
112
124
|
(first POSTFIX _eps)
|
113
|
-
(follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE RANGE
|
114
|
-
STRING2 SYMBOL _eof "|" )
|
125
|
+
(follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE RANGE
|
126
|
+
STRING1 STRING2 SYMBOL _eof "|" )
|
115
127
|
(seq _postfix_1))
|
116
128
|
(rule primary "9"
|
117
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
118
|
-
(follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE POSTFIX
|
119
|
-
STRING1 STRING2 SYMBOL _eof "|" )
|
120
|
-
(alt HEX SYMBOL RANGE O_RANGE STRING1 STRING2 _primary_1))
|
129
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
130
|
+
(follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE POSTFIX
|
131
|
+
RANGE STRING1 STRING2 SYMBOL _eof "|" )
|
132
|
+
(alt HEX SYMBOL ENUM O_ENUM RANGE O_RANGE STRING1 STRING2 _primary_1))
|
121
133
|
(rule _primary_1 "9.1"
|
122
134
|
(first "(")
|
123
|
-
(follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE POSTFIX
|
124
|
-
STRING1 STRING2 SYMBOL _eof "|" )
|
135
|
+
(follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE POSTFIX
|
136
|
+
RANGE STRING1 STRING2 SYMBOL _eof "|" )
|
125
137
|
(seq "(" expression ")"))
|
126
138
|
(rule _primary_2 "9.2"
|
127
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
128
|
-
(follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE POSTFIX
|
129
|
-
STRING1 STRING2 SYMBOL _eof "|" )
|
139
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
140
|
+
(follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE POSTFIX
|
141
|
+
RANGE STRING1 STRING2 SYMBOL _eof "|" )
|
130
142
|
(seq expression ")"))
|
131
143
|
(rule _primary_3 "9.3"
|
132
144
|
(first ")")
|
133
|
-
(follow "(" ")" "-" "@pass" "@terminals" HEX LHS O_RANGE POSTFIX
|
134
|
-
STRING1 STRING2 SYMBOL _eof "|" )
|
145
|
+
(follow "(" ")" "-" "@pass" "@terminals" ENUM HEX LHS O_ENUM O_RANGE POSTFIX
|
146
|
+
RANGE STRING1 STRING2 SYMBOL _eof "|" )
|
135
147
|
(seq ")"))
|
136
148
|
(rule pass "10"
|
137
149
|
(first "@pass")
|
138
150
|
(follow "@pass" "@terminals" LHS _eof)
|
139
151
|
(seq "@pass" expression))
|
140
152
|
(rule _pass_1 "10.1"
|
141
|
-
(first "(" HEX O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
153
|
+
(first "(" ENUM HEX O_ENUM O_RANGE RANGE STRING1 STRING2 SYMBOL)
|
142
154
|
(follow "@pass" "@terminals" LHS _eof)
|
143
155
|
(seq expression))
|
144
156
|
(terminal LHS "11" (seq (opt (seq "[" (plus SYMBOL) "]")) SYMBOL "::="))
|
145
157
|
(terminal SYMBOL "12" (plus (alt (range "a-z") (range "A-Z") (range "0-9") "_" ".")))
|
146
158
|
(terminal HEX "13" (seq "#x" (plus (alt (range "0-9") (range "a-f") (range "A-F")))))
|
147
|
-
(terminal
|
159
|
+
(terminal ENUM "14"
|
160
|
+
(seq "["
|
161
|
+
(alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR)) "-"
|
162
|
+
(alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR)) "]" ))
|
163
|
+
(terminal O_ENUM "15"
|
164
|
+
(seq "[^"
|
165
|
+
(alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR)) "-"
|
166
|
+
(alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR)) "]" ))
|
167
|
+
(terminal RANGE "16"
|
148
168
|
(seq "[" (plus (alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR))) "]"))
|
149
|
-
(terminal O_RANGE "
|
169
|
+
(terminal O_RANGE "17"
|
150
170
|
(seq "[^" (plus (alt (seq R_BEGIN (alt HEX R_CHAR)) (alt HEX R_CHAR))) "]"))
|
151
|
-
(terminal STRING1 "
|
152
|
-
(terminal STRING2 "
|
153
|
-
(terminal CHAR "
|
154
|
-
(terminal R_CHAR "
|
155
|
-
(terminal R_BEGIN "
|
156
|
-
(terminal POSTFIX "
|
157
|
-
(terminal PASS "
|
171
|
+
(terminal STRING1 "18" (seq "\"" (star (diff CHAR "\"")) "\""))
|
172
|
+
(terminal STRING2 "19" (seq "'" (star (diff CHAR "'")) "'"))
|
173
|
+
(terminal CHAR "20" (alt HEX (range "#x20#x21#x22") (range "#x24-#x00FFFFFF")))
|
174
|
+
(terminal R_CHAR "21" (diff CHAR "]"))
|
175
|
+
(terminal R_BEGIN "22" (seq (alt HEX R_CHAR) "-"))
|
176
|
+
(terminal POSTFIX "23" (range "?*+"))
|
177
|
+
(terminal PASS "24"
|
158
178
|
(plus
|
159
179
|
(alt
|
160
180
|
(range "#x00-#x20")
|