lrama 0.6.10 → 0.6.11
- checksums.yaml +4 -4
- data/.github/workflows/test.yaml +5 -1
- data/Gemfile +2 -2
- data/NEWS.md +65 -30
- data/Steepfile +3 -0
- data/doc/development/compressed_state_table/main.md +635 -0
- data/doc/development/compressed_state_table/parse.output +174 -0
- data/doc/development/compressed_state_table/parse.y +22 -0
- data/doc/development/compressed_state_table/parser.rb +282 -0
- data/lib/lrama/bitmap.rb +1 -1
- data/lib/lrama/context.rb +3 -3
- data/lib/lrama/counterexamples/derivation.rb +6 -5
- data/lib/lrama/counterexamples/example.rb +7 -4
- data/lib/lrama/counterexamples/path.rb +4 -0
- data/lib/lrama/counterexamples.rb +19 -9
- data/lib/lrama/grammar/parameterizing_rule/rhs.rb +1 -1
- data/lib/lrama/grammar/rule_builder.rb +1 -1
- data/lib/lrama/grammar/symbols/resolver.rb +4 -0
- data/lib/lrama/grammar.rb +2 -2
- data/lib/lrama/lexer/token/user_code.rb +1 -1
- data/lib/lrama/lexer.rb +1 -0
- data/lib/lrama/parser.rb +520 -487
- data/lib/lrama/state/reduce.rb +2 -3
- data/lib/lrama/version.rb +1 -1
- data/parser.y +38 -27
- data/rbs_collection.lock.yaml +10 -2
- data/sig/lrama/counterexamples/derivation.rbs +33 -0
- data/sig/lrama/counterexamples/example.rbs +45 -0
- data/sig/lrama/counterexamples/path.rbs +21 -0
- data/sig/lrama/counterexamples/production_path.rbs +11 -0
- data/sig/lrama/counterexamples/start_path.rbs +13 -0
- data/sig/lrama/counterexamples/state_item.rbs +10 -0
- data/sig/lrama/counterexamples/transition_path.rbs +11 -0
- data/sig/lrama/counterexamples/triple.rbs +20 -0
- data/sig/lrama/counterexamples.rbs +29 -0
- data/sig/lrama/grammar/symbol.rbs +1 -1
- data/sig/lrama/grammar/symbols/resolver.rbs +3 -3
- data/sig/lrama/grammar.rbs +13 -0
- data/sig/lrama/state/reduce_reduce_conflict.rbs +2 -2
- data/sig/lrama/state.rbs +79 -0
- data/sig/lrama/states.rbs +101 -0
- metadata +17 -2
@@ -0,0 +1,635 @@
|
|
1
|
+
# Compressed State Table
|
2
|
+
|
3
|
+
An LR parser generator produces two large tables: the action table and the GOTO table.
|
4
|
+
The action table is a matrix of states and tokens. Each cell indicates the next action (shift, reduce, accept, or error).
|
5
|
+
The GOTO table is a matrix of states and nonterminal symbols. Each cell indicates the next state.
|
6
|
+
|
7
|
+
Action table of "parse.y":
|
8
|
+
|
9
|
+
| |EOF| LF|NUM|'+'|'*'|'('|')'|
|
10
|
+
|--------|--:|--:|--:|--:|--:|--:|--:|
|
11
|
+
|State 0| r1| | s1| | | s2| |
|
12
|
+
|State 1| r3| r3| r3| r3| r3| r3| r3|
|
13
|
+
|State 2| | | s1| | | s2| |
|
14
|
+
|State 3| s6| | | | | | |
|
15
|
+
|State 4| | s7| | s8| s9| | |
|
16
|
+
|State 5| | | | s8| s9| |s10|
|
17
|
+
|State 6|acc|acc|acc|acc|acc|acc|acc|
|
18
|
+
|State 7| r2| r2| r2| r2| r2| r2| r2|
|
19
|
+
|State 8| | | s1| | | s2| |
|
20
|
+
|State 9| | | s1| | | s2| |
|
21
|
+
|State 10| r6| r6| r6| r6| r6| r6| r6|
|
22
|
+
|State 11| | r4| | r4| s9| | r4|
|
23
|
+
|State 12| | r5| | r5| r5| | r5|
|
24
|
+
|
25
|
+
GOTO table of "parse.y":
|
26
|
+
|
27
|
+
| |$accept|program|expr|
|
28
|
+
|--------|------:|------:|---:|
|
29
|
+
|State 0| | g3| g4|
|
30
|
+
|State 1| | | |
|
31
|
+
|State 2| | | g5|
|
32
|
+
|State 3| | | |
|
33
|
+
|State 4| | | |
|
34
|
+
|State 5| | | |
|
35
|
+
|State 6| | | |
|
36
|
+
|State 7| | | |
|
37
|
+
|State 8| | | g11|
|
38
|
+
|State 9| | | g12|
|
39
|
+
|State 10| | | |
|
40
|
+
|State 11| | | |
|
41
|
+
|State 12| | | |
|
42
|
+
|
43
|
+
|
44
|
+
Both the action table and the GOTO table are sparse, so an LR parser generator compresses them into the following tables.
|
45
|
+
|
46
|
+
* `yypact` & `yypgoto`
|
47
|
+
* `yytable`
|
48
|
+
* `yycheck`
|
49
|
+
* `yydefact` & `yydefgoto`
|
50
|
+
|
51
|
+
## Introduction to major tables
|
52
|
+
|
53
|
+
### `yypact` & `yypgoto`
|
54
|
+
|
55
|
+
`yypact` specifies the offset into `yytable` for the current state.
|
56
|
+
As an optimization, `yypact` also marks the states for which only the default reduce action exists.
|
57
|
+
The value is looked up by `state`. For example:
|
58
|
+
|
59
|
+
```ruby
|
60
|
+
offset = yypact[state]
|
61
|
+
```
|
62
|
+
|
63
|
+
If the value is `YYPACT_NINF` (Negative INFinity), the default reduce action is executed.
|
64
|
+
Otherwise the value is an offset in `yytable`.
|
65
|
+
|
66
|
+
`yypgoto` plays the same role as `yypact`.
|
67
|
+
However, `yypgoto` is used for the GOTO table.
|
68
|
+
Its index is therefore a nonterminal symbol id.
|
69
|
+
In particular, `yypgoto` is consulted when a reduce happens.
|
70
|
+
|
71
|
+
```ruby
|
72
|
+
rule_for_reduce = rules[rule_id]
|
73
|
+
|
74
|
+
# lhs_id holds LHS nonterminal id of the rule used for reduce.
|
75
|
+
lhs_id = rule_for_reduce.lhs.id
|
76
|
+
|
77
|
+
offset = yypgoto[lhs_id]
|
78
|
+
|
79
|
+
# Validate access to yytable
|
80
|
+
if yycheck[offset + state] == state
|
81
|
+
next_state = yytable[offset + state]
|
82
|
+
end
|
83
|
+
```
|
84
|
+
|
85
|
+
### `yytable`
|
86
|
+
|
87
|
+
`yytable` holds the entries of both the action table and the GOTO table.
|
88
|
+
|
89
|
+
#### For action table
|
90
|
+
|
91
|
+
For the action table, `yytable` specifies what to do in the current state.
|
92
|
+
|
93
|
+
A positive number means shift, and the number is the next state.
|
94
|
+
For example, `yytable[yyn] == 1` means shift, and the next state is State 1.
|
95
|
+
|
96
|
+
`YYTABLE_NINF` (Negative INFinity) means syntax error.
|
97
|
+
For example, `yytable[yyn] == YYTABLE_NINF` means syntax error.
|
98
|
+
|
99
|
+
Any other negative number, or zero, means reduce using the rule whose number is the negation of the value.
|
100
|
+
For example, `yytable[yyn] == -1` means reduce with Rule 1.
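
The value encoding described above can be summarized in a small helper. This is a minimal sketch, not Lrama's code: `decode_action` and its `table_ninf` keyword are hypothetical names, and `-100` is an arbitrary stand-in for `YYTABLE_NINF`, chosen only so that `-1` remains a valid reduce entry.

```ruby
# Hypothetical helper (not part of Lrama): classify one action entry of yytable.
# table_ninf stands in for YYTABLE_NINF; the caller supplies the real value.
def decode_action(entry, table_ninf:)
  if entry == table_ninf
    [:syntax_error, nil]
  elsif entry > 0
    [:shift, entry]    # positive: the value is the next state
  else
    [:reduce, -entry]  # zero or negative: the negated value is the rule number
  end
end

decode_action(1, table_ninf: -100)    # => [:shift, 1]
decode_action(-1, table_ninf: -100)   # => [:reduce, 1]
decode_action(-100, table_ninf: -100) # => [:syntax_error, nil]
```

The `YYTABLE_NINF` check comes first because that value is itself negative and would otherwise be read as a reduce.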
|
101
|
+
|
102
|
+
#### For GOTO table
|
103
|
+
|
104
|
+
For the GOTO table, `yytable` specifies the next state for the given LHS nonterminal.
|
105
|
+
|
106
|
+
The value is always a positive number, which is the id of the next state.
|
107
|
+
It never becomes `YYTABLE_NINF`.
|
108
|
+
|
109
|
+
### `yycheck`
|
110
|
+
|
111
|
+
`yycheck` validates accesses to `yytable`.
|
112
|
+
|
113
|
+
Each row of the action table and the GOTO table is placed into a single array, `yytable`.
|
114
|
+
Consider an action table that has only two states.
|
115
|
+
If the second row is shifted one cell to the right, the two rows can be merged into one array without conflict.
|
116
|
+
|
117
|
+
```ruby
|
118
|
+
[
|
119
|
+
[ 'a', 'b', , , 'e'], # State 0
|
120
|
+
[ , 'B', 'C', , 'E'], # State 1
|
121
|
+
]
|
122
|
+
|
123
|
+
# => Shift the second array to the right
|
124
|
+
|
125
|
+
[
|
126
|
+
[ 'a', 'b', , , 'e'], # State 0
|
127
|
+
     [ , 'B', 'C', , 'E'], # State 1 (shifted right by one cell)
|
128
|
+
]
|
129
|
+
|
130
|
+
# => Merge them into single array
|
131
|
+
|
132
|
+
yytable = [
|
133
|
+
'a', 'b', 'B', 'C', 'e', 'E'
|
134
|
+
]
|
135
|
+
```
|
136
|
+
|
137
|
+
`yypact` is an array that holds the offset of each state.
|
138
|
+
|
139
|
+
```ruby
|
140
|
+
yypact = [
|
141
|
+
0, # State 0 is not shifted
|
142
|
+
1 # State 1 is shifted one to right
|
143
|
+
]
|
144
|
+
```
|
145
|
+
|
146
|
+
We can access the value of `state1[2]` by consulting `yypact`.
|
147
|
+
|
148
|
+
```ruby
|
149
|
+
yytable[yypact[1] + 2]
|
150
|
+
# => yytable[1 + 2]
|
151
|
+
# => 'C'
|
152
|
+
```
|
153
|
+
|
154
|
+
However, this approach fails when the accessed cell is blank (nil), such as `state1[3]`.
|
155
|
+
The lookup lands on the slot that holds `state0[4]` instead.
|
156
|
+
|
157
|
+
```ruby
|
158
|
+
yytable[yypact[1] + 3]
|
159
|
+
# => yytable[1 + 3]
|
160
|
+
# => 'e'
|
161
|
+
```
|
162
|
+
|
163
|
+
This is why `yycheck` is needed.
|
164
|
+
`yycheck` stores valid indexes of the original table.
|
165
|
+
In the current example:
|
166
|
+
|
167
|
+
* 0, 1 and 4 are the valid indexes of State 0
|
168
|
+
* 1, 2 and 4 are the valid indexes of State 1
|
169
|
+
|
170
|
+
`yycheck` stores these indexes at the same offsets as `yytable`.
|
171
|
+
|
172
|
+
```ruby
|
173
|
+
# yytable
|
174
|
+
[
|
175
|
+
[ 'a', 'b', , , 'e'], # State 0
|
176
|
+
[ , 'B', 'C', , 'E'], # State 1
|
177
|
+
]
|
178
|
+
|
179
|
+
yytable = [
|
180
|
+
'a', 'b', 'B', 'C', 'e', 'E'
|
181
|
+
]
|
182
|
+
|
183
|
+
# yycheck
|
184
|
+
[
|
185
|
+
[ 0, 1, , , 4], # State 0
|
186
|
+
[ , 1, 2, , 4], # State 1
|
187
|
+
]
|
188
|
+
|
189
|
+
yycheck = [
|
190
|
+
0, 1, 1, 2, 4, 4
|
191
|
+
]
|
192
|
+
```
|
193
|
+
|
194
|
+
We can validate accesses to `yytable` by consulting `yycheck`.
|
195
|
+
Because `yycheck` stores the valid indexes of the original arrays, validation compares `yycheck[index_for_yytable]` with `index_for_the_state`.
|
196
|
+
The access is valid if the two values are equal.
|
197
|
+
|
198
|
+
```ruby
|
199
|
+
# Validate an access to state1[2]
|
200
|
+
yycheck[yypact[1] + 2] == 2
|
201
|
+
# => yycheck[1 + 2] == 2
|
202
|
+
# => 2 == 2
|
203
|
+
# => true (valid)
|
204
|
+
|
205
|
+
# Validate an access to state1[3]
|
206
|
+
yycheck[yypact[1] + 3] == 3
|
207
|
+
# => yycheck[1 + 3] == 3
|
208
|
+
# => 4 == 3
|
209
|
+
# => false (invalid)
|
210
|
+
```
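
The index computation and the `yycheck` comparison can be folded into a single lookup. The following is a minimal sketch using the toy arrays from this section; `toy_lookup` and the `TOY_` constant names are hypothetical, and `nil` stands for a blank cell.

```ruby
# Toy arrays copied from the example in this section.
TOY_YYTABLE = ['a', 'b', 'B', 'C', 'e', 'E']
TOY_YYCHECK = [0, 1, 1, 2, 4, 4]
TOY_YYPACT  = [0, 1]

# Hypothetical helper: return the cell state[col] of the original table,
# or nil when that cell was blank.
def toy_lookup(state, col)
  idx = TOY_YYPACT[state] + col
  return nil if idx < 0 || idx >= TOY_YYTABLE.size # outside the merged array
  return nil unless TOY_YYCHECK[idx] == col        # slot belongs to another cell
  TOY_YYTABLE[idx]
end

toy_lookup(1, 2) # => "C"
toy_lookup(1, 3) # => nil (yycheck[4] is 4, not 3)
toy_lookup(0, 4) # => "e"
```

The range check matters as much as the `yycheck` comparison: an offset plus a column can point past either end of the merged array.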
|
211
|
+
|
212
|
+
### `yydefact` & `yydefgoto`
|
213
|
+
|
214
|
+
`yydefact` stores the rule id of the default action for each state.
|
215
|
+
`0` means syntax error; any other number N means reduce using Rule N.
|
216
|
+
|
217
|
+
```ruby
|
218
|
+
rule_id = yydefact[state]
|
219
|
+
# => 0 means syntax error, other number means reduce using Rule whose id is `rule_id`
|
220
|
+
```
|
221
|
+
|
222
|
+
`yydefgoto` stores the default GOTO for each nonterminal.
|
223
|
+
The number means next state.
|
224
|
+
|
225
|
+
```ruby
|
226
|
+
next_state = yydefgoto[lhs_id]
|
227
|
+
# => Next state id is `next_state`
|
228
|
+
```
|
229
|
+
|
230
|
+
## Example
|
231
|
+
|
232
|
+
Take a look at the compressed tables of "parse.y".
|
233
|
+
See "parse.output" for detailed information of symbols and states.
|
234
|
+
|
235
|
+
### `yytable`
|
236
|
+
|
237
|
+
The original action table and GOTO table look like this:
|
238
|
+
|
239
|
+
```ruby
|
240
|
+
# Action table: one row per state, one column per terminal
|
241
|
+
[
|
242
|
+
# [ EOF, error, undef, LF, NUM, '+', '*', '(', ')'] (default reduce)
|
243
|
+
[ , , , , s1, , , s2, ], # State 0 (r1)
|
244
|
+
[ , , , , , , , , ], # State 1 (r3)
|
245
|
+
[ , , , , s1, , , s2, ], # State 2 ()
|
246
|
+
[ s6, , , , , , , , ], # State 3 ()
|
247
|
+
[ , , , s7, , s8, s9, , ], # State 4 ()
|
248
|
+
[ , , , , , s8, s9, , s10], # State 5 ()
|
249
|
+
[ , , , , , , , , ], # State 6 (accept)
|
250
|
+
[ , , , , , , , , ], # State 7 (r2)
|
251
|
+
[ , , , , s1, , , s2, ], # State 8 ()
|
252
|
+
[ , , , , s1, , , s2, ], # State 9 ()
|
253
|
+
[ , , , , , , , , ], # State 10 (r6)
|
254
|
+
[ , , , , , , s9, , ], # State 11 (r4)
|
255
|
+
[ , , , , , , , , ], # State 12 (r5)
|
256
|
+
]
|
257
|
+
|
258
|
+
# GOTO table: one row per nonterminal, one column per state
|
259
|
+
[
|
260
|
+
# [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12] State No (default goto)
|
261
|
+
[ , , , , , , , , , , , , ], # $accept (g0)
|
262
|
+
[ g3, , , , , , , , , , , , ], # program (g3)
|
263
|
+
[ g4, , g5, , , , , , g11, g12, , , ], # expr (g4)
|
264
|
+
]
|
265
|
+
|
266
|
+
# => Remove default goto
|
267
|
+
|
268
|
+
[
|
269
|
+
# [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12] State No (default goto)
|
270
|
+
[ , , , , , , , , , , , , ], # $accept (g0)
|
271
|
+
[ , , , , , , , , , , , , ], # program (g3)
|
272
|
+
[ , , g5, , , , , , g11, g12, , , ], # expr (g4)
|
273
|
+
]
|
274
|
+
```
|
275
|
+
|
276
|
+
These are compressed into `yytable` as shown below.
|
277
|
+
If the offset equals `YYPACT_NINF`, the row contains only the default value, so the row can be ignored (it is commented out in this example).
|
278
|
+
|
279
|
+
```ruby
|
280
|
+
[
|
281
|
+
# Action table
|
282
|
+
# (offset, YYPACT_NINF = -4)
|
283
|
+
[ , , , , s1, , , s2, ], # State 0 ( 6)
|
284
|
+
# [ , , , , , , , , ], # State 1 (-4)
|
285
|
+
[ , , , , s1, , , s2, ], # State 2 ( 6)
|
286
|
+
[ s6, , , , , , , , ], # State 3 ( 1)
|
287
|
+
[ , , , s7, , s8, s9, , ], # State 4 (-1)
|
288
|
+
[ , , , , , s8, s9, , s10], # State 5 ( 3)
|
289
|
+
# [ , , , , , , , , ], # State 6 (-4)
|
290
|
+
# [ , , , , , , , , ], # State 7 (-4)
|
291
|
+
[ , , , , s1, , , s2, ], # State 8 ( 6)
|
292
|
+
[ , , , , s1, , , s2, ], # State 9 ( 6)
|
293
|
+
# [ , , , , , , , , ], # State 10 (-4)
|
294
|
+
[ , , , , , , s9, , ], # State 11 (-3)
|
295
|
+
# [ , , , , , , , , ], # State 12 (-4)
|
296
|
+
|
297
|
+
# GOTO table
|
298
|
+
# [ , , , , , , , , , , , , ], # $accept (-4)
|
299
|
+
# [ , , , , , , , , , , , , ], # program (-4)
|
300
|
+
[ , , g5, , , , , , g11, g12, , , ], # expr (-2)
|
301
|
+
]
|
302
|
+
|
303
|
+
# => compressed into single array
|
304
|
+
[ , , , g5, s6, s7, s9, s8, s9, g11, g12, s8, s9, s1, s10, , s2, ]
|
305
|
+
|
306
|
+
# => Cut blank cells on head and tail, remove 'g' and 's' prefix, fill blank with 0
|
307
|
+
# This is `yytable`
|
308
|
+
[ 5, 6, 7, 9, 8, 9, 11, 12, 8, 9, 1, 10, 0, 2]
|
309
|
+
```
|
310
|
+
|
311
|
+
`YYTABLE_NINF` is a negative number smaller than any value stored in `yytable`.
|
312
|
+
In this case, `0` is the minimum value in `yytable`, so `YYTABLE_NINF` is `-1`.
|
313
|
+
|
314
|
+
### `yycheck`
|
315
|
+
|
316
|
+
```ruby
|
317
|
+
[
|
318
|
+
# Action table valid indexes
|
319
|
+
# (offset, YYPACT_NINF = -4)
|
320
|
+
[ , , , , 4, , , 7, ], # State 0 ( 6)
|
321
|
+
# [ , , , , , , , , ], # State 1 (-4)
|
322
|
+
[ , , , , 4, , , 7, ], # State 2 ( 6)
|
323
|
+
[ 0, , , , , , , , ], # State 3 ( 1)
|
324
|
+
[ , , , 3, , 5, 6, , ], # State 4 (-1)
|
325
|
+
[ , , , , , 5, 6, , 8], # State 5 ( 3)
|
326
|
+
# [ , , , , , , , , ], # State 6 (-4)
|
327
|
+
# [ , , , , , , , , ], # State 7 (-4)
|
328
|
+
[ , , , , 4, , , 7, ], # State 8 ( 6)
|
329
|
+
[ , , , , 4, , , 7, ], # State 9 ( 6)
|
330
|
+
# [ , , , , , , , , ], # State 10 (-4)
|
331
|
+
[ , , , , , , 6, , ], # State 11 (-3)
|
332
|
+
# [ , , , , , , , , ], # State 12 (-4)
|
333
|
+
|
334
|
+
# GOTO table valid indexes
|
335
|
+
# [ , , , , , , , , , , , , ], # $accept (-4)
|
336
|
+
# [ , , , , , , , , , , , , ], # program (-4)
|
337
|
+
[ , , 2, , , , , , 8, 9, , , ], # expr (-2)
|
338
|
+
]
|
339
|
+
|
340
|
+
# => compressed into single array
|
341
|
+
[ , , , 2, 0, 3, 6, 5, 6, 8, 9, 5, 6, 4, 8, , 7, ]
|
342
|
+
|
343
|
+
# => Cut blank cells on head and tail, fill blank with -1 because no index can be -1 and comparison always fails
|
344
|
+
# This is `yycheck`
|
345
|
+
[ 2, 0, 3, 6, 5, 6, 8, 9, 5, 6, 4, 8, -1, 7]
|
346
|
+
```
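
The placement rule behind both arrays is `index = offset + column`. The sketch below assumes that the kept rows and their offsets (as listed in the comments above) are already known, and rebuilds `yytable` and `yycheck` for "parse.y" from them. `PACKED_ROWS` and `pack_rows` are hypothetical names; choosing the offsets, which is the hard part of the compression, is not shown.

```ruby
# Non-blank cells of every kept row, as [offset, { column => value }].
# Columns are token ids for action rows and state ids for the GOTO row.
PACKED_ROWS = [
  [ 6, { 4 => 1, 7 => 2 }],           # States 0, 2, 8 and 9 share this row
  [ 1, { 0 => 6 }],                   # State 3
  [-1, { 3 => 7, 5 => 8, 6 => 9 }],   # State 4
  [ 3, { 5 => 8, 6 => 9, 8 => 10 }],  # State 5
  [-3, { 6 => 9 }],                   # State 11
  [-2, { 2 => 5, 8 => 11, 9 => 12 }], # expr (GOTO)
]

# Hypothetical helper: place every cell at offset + column.
# Assumes the smallest resulting index is 0, as it is for these rows.
def pack_rows(rows)
  size = rows.flat_map { |offset, cells| cells.keys.map { |col| offset + col } }.max + 1
  table = Array.new(size, 0)  # blank cells of yytable become 0
  check = Array.new(size, -1) # blank cells of yycheck become -1
  rows.each do |offset, cells|
    cells.each do |col, value|
      table[offset + col] = value
      check[offset + col] = col
    end
  end
  [table, check]
end

table, check = pack_rows(PACKED_ROWS)
table # => [5, 6, 7, 9, 8, 9, 11, 12, 8, 9, 1, 10, 0, 2]
check # => [2, 0, 3, 6, 5, 6, 8, 9, 5, 6, 4, 8, -1, 7]
```

States 0, 2, 8 and 9 have identical rows, which is why they can share offset `6` and occupy the same slots.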
|
347
|
+
|
348
|
+
### `yypact` & `yypgoto`
|
349
|
+
|
350
|
+
`yypact` & `yypgoto` hold either an offset into `yytable` or `YYPACT_NINF` (default reduce action).
|
351
|
+
The index of `yypact` is a state id, and the index of `yypgoto` is a nonterminal symbol id.
|
352
|
+
`YYPACT_NINF` is a negative number smaller than any offset.
|
353
|
+
In this case, `-3` is the minimum offset, so `YYPACT_NINF` is `-4`.
|
354
|
+
|
355
|
+
```ruby
|
356
|
+
YYPACT_NINF = -4
|
357
|
+
|
358
|
+
yypact = [
|
359
|
+
# 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 (State No)
|
360
|
+
6, -4, 6, 1, -1, 3, -4, -4, 6, 6, -4, -3, -4
|
361
|
+
]
|
362
|
+
|
363
|
+
yypgoto = [
|
364
|
+
# $accept, program, expr
|
365
|
+
-4, -4, -2
|
366
|
+
]
|
367
|
+
```
|
368
|
+
|
369
|
+
### `yydefact` & `yydefgoto`
|
370
|
+
|
371
|
+
`yydefact` & `yydefgoto` store default values.
|
372
|
+
|
373
|
+
`yydefact` specifies the rule id of the default action of each state.
|
374
|
+
Because `0` is reserved for syntax error, rule ids start at 1.
|
375
|
+
|
376
|
+
```
|
377
|
+
# In "parse.output"
|
378
|
+
Grammar
|
379
|
+
|
380
|
+
0 $accept: program "end of file"
|
381
|
+
|
382
|
+
1 program: ε
|
383
|
+
2 | expr LF
|
384
|
+
|
385
|
+
3 expr: NUM
|
386
|
+
4 | expr '+' expr
|
387
|
+
5 | expr '*' expr
|
388
|
+
6 | '(' expr ')'
|
389
|
+
|
390
|
+
# =>
|
391
|
+
|
392
|
+
# In `yydefact`
|
393
|
+
Grammar
|
394
|
+
|
395
|
+
0 Syntax Error
|
396
|
+
|
397
|
+
1 $accept: program "end of file"
|
398
|
+
|
399
|
+
2 program: ε
|
400
|
+
3 | expr LF
|
401
|
+
|
402
|
+
4 expr: NUM
|
403
|
+
5 | expr '+' expr
|
404
|
+
6 | expr '*' expr
|
405
|
+
7 | '(' expr ')'
|
406
|
+
```
|
407
|
+
|
408
|
+
For example, the default action for State 1 is 4 (`yydefact[1] == 4`).
|
409
|
+
This corresponds to Rule 3 (`3 expr: NUM`) in the "parse.output" file.
|
410
|
+
|
411
|
+
`yydefgoto` specifies the default next state id for each nonterminal.
|
412
|
+
|
413
|
+
```ruby
|
414
|
+
yydefact = [
|
415
|
+
# 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 (State No)
|
416
|
+
2, 4, 0, 0, 0, 0, 1, 3, 0, 0, 7, 5, 6
|
417
|
+
]
|
418
|
+
|
419
|
+
yydefgoto = [
|
420
|
+
# $accept, program, expr
|
421
|
+
0, 3, 4
|
422
|
+
]
|
423
|
+
```
|
424
|
+
|
425
|
+
### `yyr1` & `yyr2`
|
426
|
+
|
427
|
+
Both of them are tables describing rules.
|
428
|
+
`yyr1` specifies the nonterminal symbol id of the rule's Left-Hand-Side.
|
429
|
+
`yyr2` specifies the length of the rule, that is, the number of symbols on the rule's Right-Hand-Side.
|
430
|
+
Index 0 is not used because rule ids start at 1.
|
431
|
+
|
432
|
+
```ruby
|
433
|
+
yyr1 = [
|
434
|
+
# 0, 1, 2, 3, 4, 5, 6, 7 (Rule id)
|
435
|
+
# no rule, $accept, program, program, expr, expr, expr, expr (LHS symbol id)
|
436
|
+
0, 9, 10, 10, 11, 11, 11, 11
|
437
|
+
]
|
438
|
+
|
439
|
+
yyr2 = [
|
440
|
+
# 0, 1, 2, 3, 4, 5, 6, 7 (Rule id)
|
441
|
+
0, 2, 0, 2, 1, 3, 3, 3
|
442
|
+
]
|
443
|
+
```
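
As a small illustration of how the two tables are read together, the sketch below describes a rule from its id. `describe_rule`, `NTERM_NAMES` and the uppercase table names are hypothetical; the value `9` for `YYNTOKENS` (the number of terminals, so nonterminal symbol ids start at 9) is the one used in the constants of this document.

```ruby
YYNTOKENS = 9 # number of terminal symbols; nonterminal ids start here
NTERM_NAMES = ['$accept', 'program', 'expr'] # hypothetical name table

YYR1 = [0, 9, 10, 10, 11, 11, 11, 11] # LHS symbol id per rule
YYR2 = [0, 2,  0,  2,  1,  3,  3,  3] # RHS length per rule

# Hypothetical helper: summarize a rule by its LHS name and RHS length.
def describe_rule(rule_id)
  lhs = NTERM_NAMES[YYR1[rule_id] - YYNTOKENS]
  "Rule #{rule_id}: #{lhs} with #{YYR2[rule_id]} RHS symbol(s)"
end

describe_rule(5) # => "Rule 5: expr with 3 RHS symbol(s)"
describe_rule(2) # => "Rule 2: program with 0 RHS symbol(s)"
```

Rule ids here follow the `yydefact` numbering, so Rule 5 is `expr: expr '+' expr` (Rule 4 in "parse.output").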
|
444
|
+
|
445
|
+
## How to use tables
|
446
|
+
|
447
|
+
See also "parser.rb", which implements an LALR parser based on the "parse.y" file.
|
448
|
+
|
449
|
+
First, define the important constants and arrays:
|
450
|
+
|
451
|
+
```ruby
|
452
|
+
YYNTOKENS = 9
|
453
|
+
|
454
|
+
# The last index of yytable and yycheck
|
455
|
+
# yytable and yycheck always have the same length
|
456
|
+
YYLAST = 13
|
457
|
+
YYTABLE_NINF = -1
|
458
|
+
yytable = [ 5, 6, 7, 9, 8, 9, 11, 12, 8, 9, 1, 10, 0, 2]
|
459
|
+
yycheck = [ 2, 0, 3, 6, 5, 6, 8, 9, 5, 6, 4, 8, -1, 7]
|
460
|
+
|
461
|
+
YYPACT_NINF = -4
|
462
|
+
yypact = [ 6, -4, 6, 1, -1, 3, -4, -4, 6, 6, -4, -3, -4]
|
463
|
+
yypgoto = [ -4, -4, -2]
|
464
|
+
|
465
|
+
yydefact = [ 2, 4, 0, 0, 0, 0, 1, 3, 0, 0, 7, 5, 6]
|
466
|
+
yydefgoto = [ 0, 3, 4]
|
467
|
+
|
468
|
+
yyr1 = [ 0, 9, 10, 10, 11, 11, 11, 11]
|
469
|
+
yyr2 = [ 0, 2, 0, 2, 1, 3, 3, 3]
|
470
|
+
```
|
471
|
+
|
472
|
+
### Determine what to do next
|
473
|
+
|
474
|
+
Determine what to do next based on the current state (`state`) and the next token (`yytoken`).
|
475
|
+
|
476
|
+
The first step in deciding the action is to look up the `yypact` table by the current state.
|
477
|
+
If only a default reduce exists for the current state, `yypact` returns `YYPACT_NINF`.
|
478
|
+
|
479
|
+
```ruby
|
480
|
+
# Case 1: Only default reduce exists for the state
|
481
|
+
#
|
482
|
+
# State 7
|
483
|
+
#
|
484
|
+
# 2 program: expr LF •
|
485
|
+
#
|
486
|
+
# $default reduce using rule 2 (program)
|
487
|
+
|
488
|
+
state = 7
|
489
|
+
yytoken = nil # Do not use yytoken in this case
|
490
|
+
|
491
|
+
offset = yypact[state] # -4
|
492
|
+
if offset == YYPACT_NINF # true
|
493
|
+
next_action = :yydefault
|
494
|
+
return
|
495
|
+
end
|
496
|
+
```
|
497
|
+
|
498
|
+
If both a shift and a default reduce exist for the current state, `yypact` returns an offset into `yytable`.
|
499
|
+
The index is the sum of `offset` and `yytoken`.
|
500
|
+
The index must be validated by consulting `yycheck` before accessing `yytable`.
|
501
|
+
The index can be out of range because blank cells at the head and tail are omitted; see how `yycheck` is constructed in the example above.
|
502
|
+
Therefore, check that the index is not less than 0 and not greater than `YYLAST`.
|
503
|
+
|
504
|
+
```ruby
|
505
|
+
# Case 2: Both shift and default reduce exist for the state
|
506
|
+
#
|
507
|
+
# State 11
|
508
|
+
#
|
509
|
+
# 4 expr: expr • '+' expr
|
510
|
+
# 4 | expr '+' expr • [LF, '+', ')']
|
511
|
+
# 5 | expr • '*' expr
|
512
|
+
#
|
513
|
+
# '*' shift, and go to state 9
|
514
|
+
#
|
515
|
+
# $default reduce using rule 4 (expr)
|
516
|
+
|
517
|
+
# The next token is '*', so shift it
|
518
|
+
state = 11
|
519
|
+
yytoken = nil
|
520
|
+
|
521
|
+
offset = yypact[state] # -3
|
522
|
+
if offset == YYPACT_NINF # false
|
523
|
+
next_action = :yydefault
|
524
|
+
break
|
525
|
+
end
|
526
|
+
|
527
|
+
unless yytoken
|
528
|
+
yytoken = yylex() # yylex returns 6 ('*')
|
529
|
+
end
|
530
|
+
|
531
|
+
idx = offset + yytoken # 3
|
532
|
+
if idx < 0 || YYLAST < idx # false
|
533
|
+
next_action = :yydefault
|
534
|
+
break
|
535
|
+
end
|
536
|
+
if yycheck[idx] != yytoken # false
|
537
|
+
next_action = :yydefault
|
538
|
+
break
|
539
|
+
end
|
540
|
+
|
541
|
+
act = yytable[idx] # 9
|
542
|
+
if act == YYTABLE_NINF # false
|
543
|
+
next_action = :syntax_error
|
544
|
+
break
|
545
|
+
end
|
546
|
+
if act > 0 # true
|
547
|
+
# Shift
|
548
|
+
next_action = :yyshift
|
549
|
+
break
|
550
|
+
else
|
551
|
+
# Reduce
|
552
|
+
next_action = :yyreduce
|
553
|
+
break
|
554
|
+
end
|
555
|
+
```
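
The two cases can be combined into one method. This is a sketch that repeats the constants of this section in uppercase so that it is self-contained; it is not Lrama's actual implementation. `decide_action` and its return values are hypothetical, and the token is passed in as an argument rather than read with `yylex`.

```ruby
YYLAST = 13
YYTABLE_NINF = -1
YYPACT_NINF = -4
YYTABLE = [ 5, 6, 7, 9, 8, 9, 11, 12, 8, 9, 1, 10,  0, 2]
YYCHECK = [ 2, 0, 3, 6, 5, 6,  8,  9, 5, 6, 4,  8, -1, 7]
YYPACT  = [ 6, -4, 6, 1, -1, 3, -4, -4, 6, 6, -4, -3, -4]

# Hypothetical helper: returns [:shift, next_state], [:reduce, rule],
# [:syntax_error] or [:default] (meaning: consult yydefact).
def decide_action(state, yytoken)
  offset = YYPACT[state]
  return [:default] if offset == YYPACT_NINF       # Case 1

  idx = offset + yytoken
  return [:default] if idx < 0 || YYLAST < idx     # outside yytable
  return [:default] if YYCHECK[idx] != yytoken     # slot belongs to another row

  act = YYTABLE[idx]
  return [:syntax_error] if act == YYTABLE_NINF
  act > 0 ? [:shift, act] : [:reduce, -act]
end

decide_action(7, 3)  # => [:default]   (State 7 has only a default reduce)
decide_action(11, 6) # => [:shift, 9]  ('*' in State 11)
decide_action(11, 5) # => [:default]   ('+' in State 11 falls back to yydefact)
```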
|
556
|
+
|
557
|
+
### Execute (default) reduce
|
558
|
+
|
559
|
+
Once the next action is decided to be the default reduce, two things must be determined:
|
560
|
+
|
561
|
+
1. the rule to be applied
|
562
|
+
2. the next state from GOTO table
|
563
|
+
|
564
|
+
The rule id for the default reduce is stored in `yydefact`.
|
565
|
+
`0` in `yydefact` means syntax error, so check that the value is not `0` before continuing.
|
566
|
+
|
567
|
+
Once the rule is determined, its length can be read from `yyr2` and its LHS nonterminal from `yyr1`.
|
568
|
+
|
569
|
+
The next state is determined by the LHS nonterminal and the state on top of the stack after the reduce.
|
570
|
+
The GOTO table is also compressed into `yytable`, so the process of deciding the next state is similar to the one for `yypact`.
|
571
|
+
|
572
|
+
1. Look up `yypgoto` by the LHS nonterminal. Note that `yypact` is indexed by state but `yypgoto` is indexed by nonterminal.
|
573
|
+
2. Check whether the value in `yypgoto` is `YYPACT_NINF` or not.
|
574
|
+
3. Check whether the index, the sum of the offset and the state, is out of range or not.
|
575
|
+
4. Check the `yycheck` table before accessing `yytable`.
|
576
|
+
|
577
|
+
Finally, push the state onto the stack.
|
578
|
+
|
579
|
+
```ruby
|
580
|
+
# State 11
|
581
|
+
#
|
582
|
+
# 4 expr: expr • '+' expr
|
583
|
+
# 4 | expr '+' expr • [LF, '+', ')']
|
584
|
+
# 5 | expr • '*' expr
|
585
|
+
#
|
586
|
+
# '*' shift, and go to state 9
|
587
|
+
#
|
588
|
+
# $default reduce using rule 4 (expr)
|
589
|
+
|
590
|
+
# Input is "1 + 2 + 3 LF" and next token is the second '+'.
|
591
|
+
# Current state stack is `[0, 4, 8, 11]`.
|
592
|
+
# What to do next is reduce with default action.
|
593
|
+
state = 11
|
594
|
+
yytoken = 5 # '+'
|
595
|
+
|
596
|
+
rule = yydefact[state] # 5
|
597
|
+
if rule == 0 # false
|
598
|
+
next_action = :syntax_error
|
599
|
+
break
|
600
|
+
end
|
601
|
+
|
602
|
+
rhs_length = yyr2[rule] # 3, because Rule 4 in "parse.output" is "expr: expr '+' expr"
|
603
|
+
lhs_nterm = yyr1[rule] # 11 (expr)
|
604
|
+
lhs_nterm_id = lhs_nterm - YYNTOKENS # 11 - 9 = 2
|
605
|
+
|
606
|
+
case rule
|
607
|
+
when 1
|
608
|
+
# Execute Rule 1 action
|
609
|
+
when 2
|
610
|
+
# Execute Rule 2 action
|
611
|
+
#...
|
612
|
+
when 7
|
613
|
+
# Execute Rule 7 action
|
614
|
+
end
|
615
|
+
|
616
|
+
stack.pop(rhs_length) # state stack: `[0, 4, 8, 11]` -> `[0]`
|
617
|
+
state = stack[-1] # state = 0
|
618
|
+
|
619
|
+
offset = yypgoto[lhs_nterm_id] # -2
|
620
|
+
if offset == YYPACT_NINF # false
|
621
|
+
state = yydefgoto[lhs_nterm_id]
|
622
|
+
else
|
623
|
+
idx = offset + state # -2 + 0 = -2
|
624
|
+
if idx < 0 || YYLAST < idx # true
|
625
|
+
state = yydefgoto[lhs_nterm_id] # 4
|
626
|
+
elsif yycheck[idx] != state
|
627
|
+
state = yydefgoto[lhs_nterm_id]
|
628
|
+
else
|
629
|
+
state = yytable[idx]
|
630
|
+
end
|
631
|
+
end
|
632
|
+
|
633
|
+
# yyval = $$, yyloc = @$
|
634
|
+
push_state(state, yyval, yyloc) # state stack: [0, 4]
|
635
|
+
```
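
The GOTO lookup at the end of the reduce can be isolated into one method. This is a sketch that repeats the constants of this document in uppercase so that it is self-contained; `goto_state` is a hypothetical name, and `state` is the state on top of the stack after the RHS has been popped.

```ruby
YYLAST = 13
YYPACT_NINF = -4
YYTABLE   = [ 5, 6, 7, 9, 8, 9, 11, 12, 8, 9, 1, 10,  0, 2]
YYCHECK   = [ 2, 0, 3, 6, 5, 6,  8,  9, 5, 6, 4,  8, -1, 7]
YYPGOTO   = [-4, -4, -2]
YYDEFGOTO = [ 0,  3,  4]

# Hypothetical helper: next state after reducing to the nonterminal
# lhs_nterm_id, given the state exposed on top of the stack.
def goto_state(state, lhs_nterm_id)
  offset = YYPGOTO[lhs_nterm_id]
  return YYDEFGOTO[lhs_nterm_id] if offset == YYPACT_NINF

  idx = offset + state
  if idx < 0 || YYLAST < idx || YYCHECK[idx] != state
    YYDEFGOTO[lhs_nterm_id] # no explicit entry: use the default GOTO
  else
    YYTABLE[idx]
  end
end

goto_state(0, 2) # => 4  (index -2 is out of range, default GOTO for expr)
goto_state(2, 2) # => 5
goto_state(8, 2) # => 11
goto_state(0, 1) # => 3  (program has only a default GOTO)
```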
|