lrama 0.6.10 → 0.7.0

Files changed (87)
  1. checksums.yaml +4 -4
  2. data/.github/workflows/gh-pages.yml +46 -0
  3. data/.github/workflows/test.yaml +40 -8
  4. data/.gitignore +1 -0
  5. data/.rdoc_options +2 -0
  6. data/Gemfile +4 -2
  7. data/NEWS.md +125 -30
  8. data/README.md +44 -15
  9. data/Rakefile +13 -1
  10. data/Steepfile +5 -0
  11. data/doc/Index.md +58 -0
  12. data/doc/development/compressed_state_table/main.md +635 -0
  13. data/doc/development/compressed_state_table/parse.output +174 -0
  14. data/doc/development/compressed_state_table/parse.y +22 -0
  15. data/doc/development/compressed_state_table/parser.rb +282 -0
  16. data/lib/lrama/bitmap.rb +4 -1
  17. data/lib/lrama/command.rb +2 -1
  18. data/lib/lrama/context.rb +3 -3
  19. data/lib/lrama/counterexamples/derivation.rb +6 -5
  20. data/lib/lrama/counterexamples/example.rb +7 -4
  21. data/lib/lrama/counterexamples/path.rb +4 -0
  22. data/lib/lrama/counterexamples.rb +19 -9
  23. data/lib/lrama/digraph.rb +30 -0
  24. data/lib/lrama/grammar/binding.rb +47 -15
  25. data/lib/lrama/grammar/parameterizing_rule/rhs.rb +1 -1
  26. data/lib/lrama/grammar/rule.rb +8 -0
  27. data/lib/lrama/grammar/rule_builder.rb +4 -16
  28. data/lib/lrama/grammar/symbols/resolver.rb +4 -0
  29. data/lib/lrama/grammar.rb +10 -5
  30. data/lib/lrama/lexer/grammar_file.rb +8 -1
  31. data/lib/lrama/lexer/location.rb +17 -1
  32. data/lib/lrama/lexer/token/char.rb +1 -0
  33. data/lib/lrama/lexer/token/ident.rb +1 -0
  34. data/lib/lrama/lexer/token/instantiate_rule.rb +6 -1
  35. data/lib/lrama/lexer/token/tag.rb +3 -1
  36. data/lib/lrama/lexer/token/user_code.rb +6 -2
  37. data/lib/lrama/lexer/token.rb +14 -2
  38. data/lib/lrama/lexer.rb +5 -5
  39. data/lib/lrama/logger.rb +4 -0
  40. data/lib/lrama/option_parser.rb +10 -8
  41. data/lib/lrama/options.rb +2 -1
  42. data/lib/lrama/parser.rb +529 -490
  43. data/lib/lrama/state/reduce.rb +2 -3
  44. data/lib/lrama/state.rb +288 -1
  45. data/lib/lrama/states/item.rb +8 -0
  46. data/lib/lrama/states.rb +69 -2
  47. data/lib/lrama/trace_reporter.rb +17 -2
  48. data/lib/lrama/version.rb +1 -1
  49. data/lrama.gemspec +1 -1
  50. data/parser.y +42 -30
  51. data/rbs_collection.lock.yaml +10 -2
  52. data/sig/generated/lrama/bitmap.rbs +11 -0
  53. data/sig/generated/lrama/digraph.rbs +39 -0
  54. data/sig/generated/lrama/grammar/binding.rbs +34 -0
  55. data/sig/generated/lrama/lexer/grammar_file.rbs +28 -0
  56. data/sig/generated/lrama/lexer/location.rbs +52 -0
  57. data/sig/{lrama → generated/lrama}/lexer/token/char.rbs +2 -0
  58. data/sig/{lrama → generated/lrama}/lexer/token/ident.rbs +2 -0
  59. data/sig/{lrama → generated/lrama}/lexer/token/instantiate_rule.rbs +8 -0
  60. data/sig/{lrama → generated/lrama}/lexer/token/tag.rbs +3 -0
  61. data/sig/{lrama → generated/lrama}/lexer/token/user_code.rbs +6 -1
  62. data/sig/{lrama → generated/lrama}/lexer/token.rbs +26 -3
  63. data/sig/generated/lrama/logger.rbs +14 -0
  64. data/sig/generated/lrama/trace_reporter.rbs +25 -0
  65. data/sig/lrama/counterexamples/derivation.rbs +33 -0
  66. data/sig/lrama/counterexamples/example.rbs +45 -0
  67. data/sig/lrama/counterexamples/path.rbs +21 -0
  68. data/sig/lrama/counterexamples/production_path.rbs +11 -0
  69. data/sig/lrama/counterexamples/start_path.rbs +13 -0
  70. data/sig/lrama/counterexamples/state_item.rbs +10 -0
  71. data/sig/lrama/counterexamples/transition_path.rbs +11 -0
  72. data/sig/lrama/counterexamples/triple.rbs +20 -0
  73. data/sig/lrama/counterexamples.rbs +29 -0
  74. data/sig/lrama/grammar/rule_builder.rbs +0 -1
  75. data/sig/lrama/grammar/symbol.rbs +1 -1
  76. data/sig/lrama/grammar/symbols/resolver.rbs +3 -3
  77. data/sig/lrama/grammar.rbs +13 -0
  78. data/sig/lrama/options.rbs +1 -0
  79. data/sig/lrama/state/reduce_reduce_conflict.rbs +2 -2
  80. data/sig/lrama/state.rbs +79 -0
  81. data/sig/lrama/states.rbs +101 -0
  82. metadata +34 -14
  83. data/sig/lrama/bitmap.rbs +0 -7
  84. data/sig/lrama/digraph.rbs +0 -23
  85. data/sig/lrama/grammar/binding.rbs +0 -19
  86. data/sig/lrama/lexer/grammar_file.rbs +0 -17
  87. data/sig/lrama/lexer/location.rbs +0 -26
data/doc/Index.md ADDED
@@ -0,0 +1,58 @@
1
+ # Lrama
2
+
3
+ [![Gem Version](https://badge.fury.io/rb/lrama.svg)](https://badge.fury.io/rb/lrama)
4
+ [![build](https://github.com/ruby/lrama/actions/workflows/test.yaml/badge.svg)](https://github.com/ruby/lrama/actions/workflows/test.yaml)
5
+
6
+
7
+ ## Overview
8
+
9
+ Lrama is an LALR(1) parser generator written in Ruby. The first goal of this project is to provide an error-tolerant parser for CRuby with minimal changes to the CRuby parse.y file.
10
+
11
+ ## Installation
12
+
13
+ Lrama's installation is simple. You can install it via RubyGems.
14
+
15
+ ```shell
16
+ $ gem install lrama
17
+ ```
18
+
19
+ To install from source:
20
+
21
+ ```shell
22
+ $ cd "$(lrama root)"
23
+ $ bundle install
24
+ $ bundle exec rake install
25
+ $ bundle exec lrama --version
26
+ lrama 0.7.0
27
+ ```
28
+ ## Usage
29
+
30
+ Lrama is a command-line tool. You can generate a parser from a grammar file by running the `lrama` command.
31
+
32
+ ```shell
33
+ # "y.tab.c" and "y.tab.h" are generated
34
+ $ lrama -d sample/parse.y
35
+ ```
36
+ Specify the output file with the `-o` option. The following example generates "calc.c" and "calc.h".
37
+
38
+ ```shell
39
+ # "calc", "calc.c", and "calc.h" are generated
40
+ $ lrama -d sample/calc.y -o calc.c && gcc -Wall calc.c -o calc && ./calc
41
+ Enter the formula:
42
+ 1
43
+ => 1
44
+ 1+2*3
45
+ => 7
46
+ (1+2)*3
47
+ => 9
48
+ ```
49
+
50
+ ## Supported Ruby version
51
+
52
+ Lrama is executed with BASERUBY when building Ruby from source. Therefore Lrama needs to support the BASERUBY version, currently 2.5, and later versions.
53
+
54
+ This also requires Lrama to run with only default gems, because BASERUBY runs with the `--disable=gems` option.
55
+
56
+ ## License
57
+
58
+ See [LEGAL.md](https://github.com/ruby/lrama/blob/master/LEGAL.md) file.
data/doc/development/compressed_state_table/main.md ADDED
@@ -0,0 +1,635 @@
1
+ # Compressed State Table
2
+
3
+ An LR parser generator produces two large tables: the action table and the GOTO table.
4
+ The action table is a matrix of states and tokens. Each cell indicates the next action (shift, reduce, accept, or error).
5
+ The GOTO table is a matrix of states and nonterminal symbols. Each cell indicates the next state.
6
+
7
+ Action table of "parse.y":
8
+
9
+ | |EOF| LF|NUM|'+'|'*'|'('|')'|
10
+ |--------|--:|--:|--:|--:|--:|--:|--:|
11
+ |State 0| r1| | s1| | | s2| |
12
+ |State 1| r3| r3| r3| r3| r3| r3| r3|
13
+ |State 2| | | s1| | | s2| |
14
+ |State 3| s6| | | | | | |
15
+ |State 4| | s7| | s8| s9| | |
16
+ |State 5| | | | s8| s9| |s10|
17
+ |State 6|acc|acc|acc|acc|acc|acc|acc|
18
+ |State 7| r2| r2| r2| r2| r2| r2| r2|
19
+ |State 8| | | s1| | | s2| |
20
+ |State 9| | | s1| | | s2| |
21
+ |State 10| r6| r6| r6| r6| r6| r6| r6|
22
+ |State 11| | r4| | r4| s9| | r4|
23
+ |State 12| | r5| | r5| r5| | r5|
24
+
25
+ GOTO table of "parse.y":
26
+
27
+ | |$accept|program|expr|
28
+ |--------|------:|------:|---:|
29
+ |State 0| | g3| g4|
30
+ |State 1| | | |
31
+ |State 2| | | g5|
32
+ |State 3| | | |
33
+ |State 4| | | |
34
+ |State 5| | | |
35
+ |State 6| | | |
36
+ |State 7| | | |
37
+ |State 8| | | g11|
38
+ |State 9| | | g12|
39
+ |State 10| | | |
40
+ |State 11| | | |
41
+ |State 12| | | |
42
+
43
+
44
+ Both the action table and the GOTO table are sparse. Therefore the LR parser generator compresses them into the following tables:
45
+
46
+ * `yypact` & `yypgoto`
47
+ * `yytable`
48
+ * `yycheck`
49
+ * `yydefact` & `yydefgoto`
50
+
51
+ ## Introduction to major tables
52
+
53
+ ### `yypact` & `yypgoto`
54
+
55
+ `yypact` specifies the offset into `yytable` for the current state.
56
+ As an optimization, `yypact` also marks states that have only a default reduce action.
57
+ It is indexed by `state`. For example:
58
+
59
+ ```ruby
60
+ offset = yypact[state]
61
+ ```
62
+
63
+ If the value is `YYPACT_NINF` (Negative INFinity), the default reduce action is executed.
64
+ Otherwise the value is an offset into `yytable`.
65
+
66
+ `yypgoto` plays the same role as `yypact`.
67
+ However, `yypgoto` is used for the GOTO table,
68
+ so its index is a nonterminal symbol id.
69
+ In particular, `yypgoto` is used when a reduce happens.
70
+
71
+ ```ruby
72
+ rule_for_reduce = rules[rule_id]
73
+
74
+ # lhs_id holds LHS nonterminal id of the rule used for reduce.
75
+ lhs_id = rule_for_reduce.lhs.id
76
+
77
+ offset = yypgoto[lhs_id]
78
+
79
+ # Validate access to yytable
80
+ if yycheck[offset + state] == state
81
+ next_state = yytable[offset + state]
82
+ end
83
+ ```
84
+
85
+ ### `yytable`
86
+
87
+ `yytable` is a mixture of action table and GOTO table.
88
+
89
+ #### For action table
90
+
91
+ For the action table, `yytable` specifies what to do in the current state.
92
+
93
+ A positive number means shift and specifies the next state.
94
+ For example, `yytable[yyn] == 1` means shift, and the next state is State 1.
95
+
96
+ `YYTABLE_NINF` (Negative INFinity) means syntax error.
97
+ For example, `yytable[yyn] == YYTABLE_NINF` means syntax error.
98
+
99
+ Zero and any other negative number mean reduce with the rule whose number is the negation of the value.
100
+ For example, `yytable[yyn] == -1` means reduce with Rule 1.
101
+
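The three cases above can be sketched as a small decoder. This is only an illustration: `decode_action` is a hypothetical helper, not part of Lrama, and the sentinel `-100` is an arbitrary value chosen for the demonstration (the real `YYTABLE_NINF` depends on the grammar).

```ruby
# Hypothetical helper (not part of Lrama): classify one action entry of yytable.
# yytable_ninf is passed in because its value depends on the generated tables.
def decode_action(value, yytable_ninf)
  if value == yytable_ninf
    [:syntax_error, nil]
  elsif value > 0
    [:shift, value]   # the value is the next state
  else
    [:reduce, -value] # zero or negative: the rule number is the negation
  end
end

p decode_action(1, -100)    # shift, next state is State 1
p decode_action(-1, -100)   # reduce with Rule 1
p decode_action(-100, -100) # syntax error
```

The sentinel is checked first because it is itself a negative number and would otherwise be read as a reduce.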
102
+ #### For GOTO table
103
+
104
+ For the GOTO table, `yytable` specifies the next state for the given LHS nonterminal.
105
+
106
+ The value is always a positive number, which is the next state id.
107
+ It is never `YYTABLE_NINF`.
108
+
109
+ ### `yycheck`
110
+
111
+ `yycheck` validates accesses to `yytable`.
112
+
113
+ Each row of the action table and the GOTO table is placed into a single array, `yytable`.
114
+ Consider the case where the action table has only two states.
115
+ In this case, if the second row is shifted to the right, the rows can be merged into one array without conflict.
116
+
117
+ ```ruby
118
+ [
119
+ [ 'a', 'b', , , 'e'], # State 0
120
+ [ , 'B', 'C', , 'E'], # State 1
121
+ ]
122
+
123
+ # => Shift the second array to the right
124
+
125
+ [
126
+   [ 'a', 'b',    ,    , 'e'],      # State 0
127
+        [    , 'B', 'C',    , 'E'], # State 1
128
+ ]
129
+
130
+ # => Merge them into single array
131
+
132
+ yytable = [
133
+ 'a', 'b', 'B', 'C', 'e', 'E'
134
+ ]
135
+ ```
136
+
137
+ `yypact` is an array holding the offset of each state.
138
+
139
+ ```ruby
140
+ yypact = [
141
+ 0, # State 0 is not shifted
142
+ 1 # State 1 is shifted one to right
143
+ ]
144
+ ```
145
+
146
+ We can access the value of `state1[2]` by consulting `yypact`.
147
+
148
+ ```ruby
149
+ yytable[yypact[1] + 2]
150
+ # => yytable[1 + 2]
151
+ # => 'C'
152
+ ```
153
+
154
+ However, this approach doesn't work when accessing a blank cell such as `state1[3]`,
155
+ because the lookup actually reads `state0[4]`.
156
+
157
+ ```ruby
158
+ yytable[yypact[1] + 3]
159
+ # => yytable[1 + 3]
160
+ # => 'e'
161
+ ```
162
+
163
+ This is why `yycheck` is needed.
164
+ `yycheck` stores the valid indexes of the original table.
165
+ In the current example:
166
+
167
+ * 0, 1 and 4 are the valid indexes of State 0
168
+ * 1, 2 and 4 are the valid indexes of State 1
169
+
170
+ `yycheck` stores these indexes at the same offsets as `yytable`.
171
+
172
+ ```ruby
173
+ # yytable
174
+ [
175
+ [ 'a', 'b', , , 'e'], # State 0
176
+ [ , 'B', 'C', , 'E'], # State 1
177
+ ]
178
+
179
+ yytable = [
180
+ 'a', 'b', 'B', 'C', 'e', 'E'
181
+ ]
182
+
183
+ # yycheck
184
+ [
185
+ [ 0, 1, , , 4], # State 0
186
+ [ , 1, 2, , 4], # State 1
187
+ ]
188
+
189
+ yycheck = [
190
+ 0, 1, 1, 2, 4, 4
191
+ ]
192
+ ```
193
+
194
+ We can validate accesses to `yytable` by consulting `yycheck`.
195
+ `yycheck` stores the valid indexes of the original arrays, so validation means comparing `yycheck[index_for_yytable]` with `index_for_the_state`.
196
+ The access is valid if both values are the same.
197
+
198
+ ```ruby
199
+ # Validate an access to state1[2]
200
+ yycheck[yypact[1] + 2] == 2
201
+ # => yycheck[1 + 2] == 2
202
+ # => 2 == 2
203
+ # => true (valid)
204
+
205
+ # Validate an access to state1[3]
206
+ yycheck[yypact[1] + 3] == 3
207
+ # => yycheck[1 + 3] == 3
208
+ # => 4 == 3
209
+ # => false (invalid)
210
+ ```
211
+
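The two-state toy example can be put together as a runnable sketch. The names `pack` and `lookup` are hypothetical and exist only for this illustration; `nil` stands for a blank cell, and the offsets are the ones from the toy `yypact` above.

```ruby
# Hypothetical helpers for the toy example (not Lrama's implementation).

# Merge the rows into one array using the given per-row offsets, and record
# for each occupied slot the column index it came from.
def pack(rows, offsets)
  table = []
  check = []
  rows.each_with_index do |row, state|
    row.each_with_index do |cell, col|
      next if cell.nil?
      idx = offsets[state] + col
      raise "conflict at index #{idx}" unless table[idx].nil?
      table[idx] = cell
      check[idx] = col
    end
  end
  [table, check]
end

# Read state[col] from the packed table, returning nil for blank cells.
def lookup(table, check, offsets, state, col)
  idx = offsets[state] + col
  return nil if idx < 0 || idx >= table.size
  check[idx] == col ? table[idx] : nil
end

rows = [
  ['a', 'b', nil, nil, 'e'], # State 0
  [nil, 'B', 'C', nil, 'E'], # State 1
]
offsets = [0, 1] # plays the role of yypact

table, check = pack(rows, offsets)
p table                                  # ["a", "b", "B", "C", "e", "E"]
p check                                  # [0, 1, 1, 2, 4, 4]
p lookup(table, check, offsets, 1, 2)    # "C"
p lookup(table, check, offsets, 1, 3)    # nil, the slot belongs to state0[4]
```

Storing the column index rather than the state id is what lets a single comparison reject a slot that was filled by a different row.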
212
+ ### `yydefact` & `yydefgoto`
213
+
214
+ `yydefact` stores the rule id of the default action for each state.
215
+ `0` means syntax error; any other number N means reduce using Rule N.
216
+
217
+ ```ruby
218
+ rule_id = yydefact[state]
219
+ # => 0 means syntax error, other number means reduce using Rule whose id is `rule_id`
220
+ ```
221
+
222
+ `yydefgoto` stores the default GOTO for each nonterminal.
223
+ The number is the next state.
224
+
225
+ ```ruby
226
+ next_state = yydefgoto[lhs_id]
227
+ # => Next state id is `next_state`
228
+ ```
229
+
230
+ ## Example
231
+
232
+ Take a look at the compressed tables of "parse.y".
233
+ See "parse.output" for detailed information about symbols and states.
234
+
235
+ ### `yytable`
236
+
237
+ The original action table and GOTO table look like this:
238
+
239
+ ```ruby
240
+ # Action table is a matrix of terminals * states
241
+ [
242
+ # [ EOF, error, undef, LF, NUM, '+', '*', '(', ')'] (default reduce)
243
+ [ , , , , s1, , , s2, ], # State 0 (r1)
244
+ [ , , , , , , , , ], # State 1 (r3)
245
+ [ , , , , s1, , , s2, ], # State 2 ()
246
+ [ s6, , , , , , , , ], # State 3 ()
247
+ [ , , , s7, , s8, s9, , ], # State 4 ()
248
+ [ , , , , , s8, s9, , s10], # State 5 ()
249
+ [ , , , , , , , , ], # State 6 (accept)
250
+ [ , , , , , , , , ], # State 7 (r2)
251
+ [ , , , , s1, , , s2, ], # State 8 ()
252
+ [ , , , , s1, , , s2, ], # State 9 ()
253
+ [ , , , , , , , , ], # State 10 (r6)
254
+ [ , , , , , , s9, , ], # State 11 (r4)
255
+ [ , , , , , , , , ], # State 12 (r5)
256
+ ]
257
+
258
+ # GOTO table is a matrix of states * nonterminals
259
+ [
260
+ # [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12] State No (default goto)
261
+ [ , , , , , , , , , , , , ], # $accept (g0)
262
+ [ g3, , , , , , , , , , , , ], # program (g3)
263
+ [ g4, , g5, , , , , , g11, g12, , , ], # expr (g4)
264
+ ]
265
+
266
+ # => Remove default goto
267
+
268
+ [
269
+ # [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12] State No (default goto)
270
+ [ , , , , , , , , , , , , ], # $accept (g0)
271
+ [ , , , , , , , , , , , , ], # program (g3)
272
+ [ , , g5, , , , , , g11, g12, , , ], # expr (g4)
273
+ ]
274
+ ```
275
+
276
+ These are compressed into `yytable` as shown below.
277
+ If the offset equals `YYPACT_NINF`, the row has only a default value, so the row can be ignored (commented out in this example).
278
+
279
+ ```ruby
280
+ [
281
+ # Action table
282
+ # (offset, YYPACT_NINF = -4)
283
+ [ , , , , s1, , , s2, ], # State 0 ( 6)
284
+ # [ , , , , , , , , ], # State 1 (-4)
285
+ [ , , , , s1, , , s2, ], # State 2 ( 6)
286
+ [ s6, , , , , , , , ], # State 3 ( 1)
287
+ [ , , , s7, , s8, s9, , ], # State 4 (-1)
288
+ [ , , , , , s8, s9, , s10], # State 5 ( 3)
289
+ # [ , , , , , , , , ], # State 6 (-4)
290
+ # [ , , , , , , , , ], # State 7 (-4)
291
+ [ , , , , s1, , , s2, ], # State 8 ( 6)
292
+ [ , , , , s1, , , s2, ], # State 9 ( 6)
293
+ # [ , , , , , , , , ], # State 10 (-4)
294
+ [ , , , , , , s9, , ], # State 11 (-3)
295
+ # [ , , , , , , , , ], # State 12 (-4)
296
+
297
+ # GOTO table
298
+ # [ , , , , , , , , , , , , ], # $accept (-4)
299
+ # [ , , , , , , , , , , , , ], # program (-4)
300
+ [ , , g5, , , , , , g11, g12, , , ], # expr (-2)
301
+ ]
302
+
303
+ # => compressed into single array
304
+ [ , , , g5, s6, s7, s9, s8, s9, g11, g12, s8, s9, s1, s10, , s2, ]
305
+
306
+ # => Cut blank cells on head and tail, remove 'g' and 's' prefix, fill blank with 0
307
+ # This is `yytable`
308
+ [ 5, 6, 7, 9, 8, 9, 11, 12, 8, 9, 1, 10, 0, 2]
309
+ ```
310
+
311
+ `YYTABLE_NINF` is a negative number smaller than every value in `yytable`.
312
+ In this case, `0` is the minimum value in `yytable`, so `YYTABLE_NINF` is `-1`.
313
+
314
+ ### `yycheck`
315
+
316
+ ```ruby
317
+ [
318
+ # Action table valid indexes
319
+ # (offset, YYPACT_NINF = -4)
320
+ [ , , , , 4, , , 7, ], # State 0 ( 6)
321
+ # [ , , , , , , , , ], # State 1 (-4)
322
+ [ , , , , 4, , , 7, ], # State 2 ( 6)
323
+ [ 0, , , , , , , , ], # State 3 ( 1)
324
+ [ , , , 3, , 5, 6, , ], # State 4 (-1)
325
+ [ , , , , , 5, 6, , 8], # State 5 ( 3)
326
+ # [ , , , , , , , , ], # State 6 (-4)
327
+ # [ , , , , , , , , ], # State 7 (-4)
328
+ [ , , , , 4, , , 7, ], # State 8 ( 6)
329
+ [ , , , , 4, , , 7, ], # State 9 ( 6)
330
+ # [ , , , , , , , , ], # State 10 (-4)
331
+ [ , , , , , , 6, , ], # State 11 (-3)
332
+ # [ , , , , , , , , ], # State 12 (-4)
333
+
334
+ # GOTO table valid indexes
335
+ # [ , , , , , , , , , , , , ], # $accept (-4)
336
+ # [ , , , , , , , , , , , , ], # program (-4)
337
+ [ , , 2, , , , , , 8, 9, , , ], # expr (-2)
338
+ ]
339
+
340
+ # => compressed into single array
341
+ [ , , , 2, 0, 3, 6, 5, 6, 8, 9, 5, 6, 4, 8, , 7, ]
342
+
343
+ # => Cut blank cells on head and tail, fill blank with -1 because no index can be -1 and comparison always fails
344
+ # This is `yycheck`
345
+ [ 2, 0, 3, 6, 5, 6, 8, 9, 5, 6, 4, 8, -1, 7]
346
+ ```
347
+
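As a cross-check, the flat `yytable` and `yycheck` above can be reproduced from the sparse rows and their offsets. This is a minimal sketch, assuming the offsets are already known (choosing them is the generator's job and is not shown); `flatten_rows` and `ROWS` are hypothetical names. States 0, 2, 8 and 9 have identical rows and share offset 6, so that row is listed once.

```ruby
# Sparse rows from the example: [offset, { column => value }].
# Action rows are indexed by token id, the GOTO row by state id.
ROWS = [
  [ 6, { 4 => 1, 7 => 2 }],           # State 0, 2, 8, 9
  [ 1, { 0 => 6 }],                   # State 3
  [-1, { 3 => 7, 5 => 8, 6 => 9 }],   # State 4
  [ 3, { 5 => 8, 6 => 9, 8 => 10 }],  # State 5
  [-3, { 6 => 9 }],                   # State 11
  [-2, { 2 => 5, 8 => 11, 9 => 12 }], # expr (GOTO)
]

# Hypothetical helper: place every cell at offset + column, then cut the
# blank head and tail and fill the remaining gaps (0 in the table, -1 in check).
def flatten_rows(rows)
  cells = {}
  rows.each do |offset, row|
    row.each do |col, value|
      idx = offset + col
      raise "conflict at index #{idx}" if cells.key?(idx)
      cells[idx] = [value, col]
    end
  end
  first, last = cells.keys.minmax
  table = (first..last).map { |i| cells.key?(i) ? cells[i][0] : 0 }
  check = (first..last).map { |i| cells.key?(i) ? cells[i][1] : -1 }
  [table, check]
end

yytable, yycheck = flatten_rows(ROWS)
p yytable # [5, 6, 7, 9, 8, 9, 11, 12, 8, 9, 1, 10, 0, 2]
p yycheck # [2, 0, 3, 6, 5, 6, 8, 9, 5, 6, 4, 8, -1, 7]
```

In this example the smallest occupied index is already 0, so cutting the blank head does not change any index.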
348
+ ### `yypact` & `yypgoto`
349
+
350
+ `yypact` & `yypgoto` hold a mixture of offsets into `yytable` and `YYPACT_NINF` (default reduce action).
351
+ The index in `yypact` is a state id, and the index in `yypgoto` is a nonterminal symbol id.
352
+ `YYPACT_NINF` is a negative number smaller than every offset.
353
+ In this case, `-3` is the minimum offset, so `YYPACT_NINF` is `-4`.
354
+
355
+ ```ruby
356
+ YYPACT_NINF = -4
357
+
358
+ yypact = [
359
+ # 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 (State No)
360
+ 6, -4, 6, 1, -1, 3, -4, -4, 6, 6, -4, -3, -4
361
+ ]
362
+
363
+ yypgoto = [
364
+ # $accept, program, expr
365
+ -4, -4, -2
366
+ ]
367
+ ```
368
+
369
+ ### `yydefact` & `yydefgoto`
370
+
371
+ `yydefact` & `yydefgoto` store default values.
372
+
373
+ `yydefact` specifies the rule id of the default action of each state.
374
+ Because `0` is reserved for syntax error, rule ids start at 1.
375
+
376
+ ```
377
+ # In "parse.output"
378
+ Grammar
379
+
380
+ 0 $accept: program "end of file"
381
+
382
+ 1 program: ε
383
+ 2 | expr LF
384
+
385
+ 3 expr: NUM
386
+ 4 | expr '+' expr
387
+ 5 | expr '*' expr
388
+ 6 | '(' expr ')'
389
+
390
+ # =>
391
+
392
+ # In `yydefact`
393
+ Grammar
394
+
395
+ 0 Syntax Error
396
+
397
+ 1 $accept: program "end of file"
398
+
399
+ 2 program: ε
400
+ 3 | expr LF
401
+
402
+ 4 expr: NUM
403
+ 5 | expr '+' expr
404
+ 6 | expr '*' expr
405
+ 7 | '(' expr ')'
406
+ ```
407
+
408
+ For example, the default action for state 1 is 4 (`yydefact[1] == 4`).
409
+ This means Rule 3 (`3 expr: NUM`) in the "parse.output" file.
410
+
411
+ `yydefgoto` specifies the next state id for each nonterminal.
412
+
413
+ ```ruby
414
+ yydefact = [
415
+ # 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 (State No)
416
+ 2, 4, 0, 0, 0, 0, 1, 3, 0, 0, 7, 5, 6
417
+ ]
418
+
419
+ yydefgoto = [
420
+ # $accept, program, expr
421
+ 0, 3, 4
422
+ ]
423
+ ```
424
+
425
+ ### `yyr1` & `yyr2`
426
+
427
+ Both of them are tables for rules.
428
+ `yyr1` specifies the nonterminal symbol id of the rule's left-hand side.
429
+ `yyr2` specifies the length of the rule, that is, the number of symbols on the rule's right-hand side.
430
+ Index 0 is not used because rule ids start at 1.
431
+
432
+ ```ruby
433
+ yyr1 = [
434
+ # 0, 1, 2, 3, 4, 5, 6, 7 (Rule id)
435
+ # no rule, $accept, program, program, expr, expr, expr, expr (LHS symbol id)
436
+ 0, 9, 10, 10, 11, 11, 11, 11
437
+ ]
438
+
439
+ yyr2 = [
440
+ # 0, 1, 2, 3, 4, 5, 6, 7 (Rule id)
441
+ 0, 2, 0, 2, 1, 3, 3, 3
442
+ ]
443
+ ```
444
+
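To see how `yydefact`, `yyr1` and `yyr2` fit together, here is a small sketch that describes the default action of a state. `describe_default` and `NTERM_NAMES` are hypothetical names used only here; `YYNTOKENS = 9` is the token count this document defines in its next section, and the tables are written as constants so the method can see them.

```ruby
YYNTOKENS = 9
YYDEFACT = [2, 4, 0, 0, 0, 0, 1, 3, 0, 0, 7, 5, 6]
YYR1 = [0, 9, 10, 10, 11, 11, 11, 11]
YYR2 = [0, 2, 0, 2, 1, 3, 3, 3]
NTERM_NAMES = ['$accept', 'program', 'expr'] # same order as yypgoto

# Hypothetical helper: summarize the default reduce of a state, or nil when
# the state has no default reduce (yydefact holds 0).
def describe_default(state)
  rule = YYDEFACT[state]
  return nil if rule == 0
  {
    output_rule: rule - 1,                       # rule number in "parse.output"
    lhs: NTERM_NAMES[YYR1[rule] - YYNTOKENS],    # symbol id -> nonterminal index
    rhs_length: YYR2[rule],                      # number of symbols to pop
  }
end

p describe_default(1)  # Rule 3 in "parse.output": expr, one symbol on the RHS
p describe_default(11) # Rule 4 in "parse.output": expr, three symbols on the RHS
p describe_default(2)  # nil
```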
445
+ ## How to use tables
446
+
447
+ See also "parser.rb", which implements an LALR parser based on the "parse.y" file.
448
+
449
+ First, define the important constants and arrays:
450
+
451
+ ```ruby
452
+ YYNTOKENS = 9
453
+
454
+ # The last index of yytable and yycheck
455
+ # The length of yytable and yycheck are always same
456
+ YYLAST = 13
457
+ YYTABLE_NINF = -1
458
+ yytable = [ 5, 6, 7, 9, 8, 9, 11, 12, 8, 9, 1, 10, 0, 2]
459
+ yycheck = [ 2, 0, 3, 6, 5, 6, 8, 9, 5, 6, 4, 8, -1, 7]
460
+
461
+ YYPACT_NINF = -4
462
+ yypact = [ 6, -4, 6, 1, -1, 3, -4, -4, 6, 6, -4, -3, -4]
463
+ yypgoto = [ -4, -4, -2]
464
+
465
+ yydefact = [ 2, 4, 0, 0, 0, 0, 1, 3, 0, 0, 7, 5, 6]
466
+ yydefgoto = [ 0, 3, 4]
467
+
468
+ yyr1 = [ 0, 9, 10, 10, 11, 11, 11, 11]
469
+ yyr2 = [ 0, 2, 0, 2, 1, 3, 3, 3]
470
+ ```
471
+
472
+ ### Determine what to do next
473
+
474
+ Determine what to do next based on the current state (`state`) and the next token (`yytoken`).
475
+
476
+ The first step in deciding the action is to look up the `yypact` table by the current state.
477
+ If only a default reduce exists for the current state, `yypact` returns `YYPACT_NINF`.
478
+
479
+ ```ruby
480
+ # Case 1: Only default reduce exists for the state
481
+ #
482
+ # State 7
483
+ #
484
+ # 2 program: expr LF •
485
+ #
486
+ # $default reduce using rule 2 (program)
487
+
488
+ state = 7
489
+ yytoken = nil # Do not use yytoken in this case
490
+
491
+ offset = yypact[state] # -4
492
+ if offset == YYPACT_NINF # true
493
+ next_action = :yydefault
494
+ return
495
+ end
496
+ ```
497
+
498
+ If both a shift and a default reduce exist for the current state, `yypact` returns an offset into `yytable`.
499
+ The index is the sum of `offset` and `yytoken`.
500
+ The index must be validated by consulting `yycheck` before accessing `yytable`.
501
+ The index can be out of range because blank cells at the head and tail are omitted; see how `yycheck` is constructed in the example above.
502
+ Therefore, first check that the index is not less than 0 and not greater than `YYLAST`.
503
+
504
+ ```ruby
505
+ # Case 2: Both shift and default reduce exists for the state
506
+ #
507
+ # State 11
508
+ #
509
+ # 4 expr: expr • '+' expr
510
+ # 4 | expr '+' expr • [LF, '+', ')']
511
+ # 5 | expr • '*' expr
512
+ #
513
+ # '*' shift, and go to state 9
514
+ #
515
+ # $default reduce using rule 4 (expr)
516
+
517
+ # Next token is '*' then shift it
518
+ state = 11
519
+ yytoken = nil
520
+
521
+ offset = yypact[state] # -3
522
+ if offset == YYPACT_NINF # false
523
+ next_action = :yydefault
524
+ break
525
+ end
526
+
527
+ unless yytoken
528
+ yytoken = yylex() # yylex returns 6 ('*')
529
+ end
530
+
531
+ idx = offset + yytoken # 3
532
+ if idx < 0 || YYLAST < idx # false
533
+ next_action = :yydefault
534
+ break
535
+ end
536
+ if yycheck[idx] != yytoken # false
537
+ next_action = :yydefault
538
+ break
539
+ end
540
+
541
+ act = yytable[idx] # 9
542
+ if act == YYTABLE_NINF # false
543
+ next_action = :syntax_error
544
+ break
545
+ end
546
+ if act > 0 # true
547
+ # Shift
548
+ next_action = :yyshift
549
+ break
550
+ else
551
+ # Reduce
552
+ next_action = :yyreduce
553
+ break
554
+ end
555
+ ```
556
+
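Case 1 and Case 2 can be folded into one function. The following is a sketch under stated assumptions, not the code in "parser.rb": `next_action` is a hypothetical name, the tables are written as constants so the method can see them, and token ids follow the column order of the action table (EOF 0, error 1, undef 2, LF 3, NUM 4, '+' 5, '*' 6, '(' 7, ')' 8).

```ruby
YYLAST = 13
YYTABLE_NINF = -1
YYPACT_NINF = -4
YYTABLE = [5, 6, 7, 9, 8, 9, 11, 12, 8, 9, 1, 10, 0, 2]
YYCHECK = [2, 0, 3, 6, 5, 6, 8, 9, 5, 6, 4, 8, -1, 7]
YYPACT = [6, -4, 6, 1, -1, 3, -4, -4, 6, 6, -4, -3, -4]

# Hypothetical helper: decide the next action for (state, yytoken).
# Returns [:yydefault], [:syntax_error], [:yyshift, next_state]
# or [:yyreduce, rule].
def next_action(state, yytoken)
  offset = YYPACT[state]
  return [:yydefault] if offset == YYPACT_NINF

  idx = offset + yytoken
  # The range check matters in Ruby: a negative index would silently
  # wrap around to the end of the array.
  return [:yydefault] if idx < 0 || YYLAST < idx
  return [:yydefault] if YYCHECK[idx] != yytoken

  act = YYTABLE[idx]
  return [:syntax_error] if act == YYTABLE_NINF
  act > 0 ? [:yyshift, act] : [:yyreduce, -act]
end

p next_action(7, 3)  # State 7 has only a default reduce
p next_action(11, 6) # State 11 with '*': shift and go to state 9
p next_action(11, 5) # State 11 with '+': fall back to the default reduce
```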
557
+ ### Execute (default) reduce
558
+
559
+ Once the next action is decided to be the default reduce, we need to determine:
560
+
561
+ 1. the rule to be applied
562
+ 2. the next state from GOTO table
563
+
564
+ Rule id for the default reduce is stored in `yydefact`.
565
+ `0` in `yydefact` means syntax error, so check that the value is not `0` before continuing.
566
+
567
+ Once the rule is determined, the length of the rule is given by `yyr2` and the LHS nonterminal is given by `yyr1`.
568
+
569
+ The next state is determined by the LHS nonterminal and the state on top of the stack after the reduce.
570
+ The GOTO table is also compressed into `yytable`, so the process of deciding the next state is similar to the one for `yypact`.
571
+
572
+ 1. Look up `yypgoto` by the LHS nonterminal. Note that `yypact` is indexed by state but `yypgoto` is indexed by nonterminal.
573
+ 2. Check whether the value in `yypgoto` is `YYPACT_NINF`.
574
+ 3. Check whether the index, the sum of the offset and the state, is out of range.
575
+ 4. Check the `yycheck` table before accessing `yytable`.
576
+
577
+ Finally, push the state onto the stack.
578
+
579
+ ```ruby
580
+ # State 11
581
+ #
582
+ # 4 expr: expr • '+' expr
583
+ # 4 | expr '+' expr • [LF, '+', ')']
584
+ # 5 | expr • '*' expr
585
+ #
586
+ # '*' shift, and go to state 9
587
+ #
588
+ # $default reduce using rule 4 (expr)
589
+
590
+ # Input is "1 + 2 + 3 LF" and next token is the second '+'.
591
+ # Current state stack is `[0, 4, 8, 11]`.
592
+ # What to do next is reduce with default action.
593
+ state = 11
594
+ yytoken = 5 # '+'
595
+
596
+ rule = yydefact[state] # 5
597
+ if rule == 0 # false
598
+ next_action = :syntax_error
599
+ break
600
+ end
601
+
602
+ rhs_length = yyr2[rule] # 3, because rule 5 (Rule 4 in "parse.output") is "expr: expr '+' expr"
603
+ lhs_nterm = yyr1[rule] # 11 (expr)
604
+ lhs_nterm_id = lhs_nterm - YYNTOKENS # 11 - 9 = 2
605
+
606
+ case rule
607
+ when 1
608
+ # Execute Rule 1 action
609
+ when 2
610
+ # Execute Rule 2 action
611
+ #...
612
+ when 7
613
+ # Execute Rule 7 action
614
+ end
615
+
616
+ stack.pop(rhs_length) # state stack: `[0, 4, 8, 11]` -> `[0]`
617
+ state = stack[-1] # state = 0
618
+
619
+ offset = yypgoto[lhs_nterm_id] # -2
620
+ if offset == YYPACT_NINF # false
621
+ state = yydefgoto[lhs_nterm_id]
622
+ else
623
+ idx = offset + state # -2
624
+ if idx < 0 || YYLAST < idx # true
625
+ state = yydefgoto[lhs_nterm_id] # 4
626
+ elsif yycheck[idx] != state
627
+ state = yydefgoto[lhs_nterm_id]
628
+ else
629
+ state = yytable[idx]
630
+ end
631
+ end
632
+
633
+ # yyval = $$, yyloc = @$
634
+ push_state(state, yyval, yyloc) # state stack: [0, 4]
635
+ ```
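The reduce walkthrough above can be condensed into two small functions that operate on a state stack only. This is a sketch, not the code in "parser.rb": `goto_state` and `default_reduce` are hypothetical names, semantic values and locations are left out, and the tables are written as constants so the methods can see them.

```ruby
YYNTOKENS = 9
YYLAST = 13
YYPACT_NINF = -4
YYTABLE = [5, 6, 7, 9, 8, 9, 11, 12, 8, 9, 1, 10, 0, 2]
YYCHECK = [2, 0, 3, 6, 5, 6, 8, 9, 5, 6, 4, 8, -1, 7]
YYPGOTO = [-4, -4, -2]
YYDEFGOTO = [0, 3, 4]
YYDEFACT = [2, 4, 0, 0, 0, 0, 1, 3, 0, 0, 7, 5, 6]
YYR1 = [0, 9, 10, 10, 11, 11, 11, 11]
YYR2 = [0, 2, 0, 2, 1, 3, 3, 3]

# Hypothetical helper: next state after reducing to the given nonterminal
# while `state` is on top of the stack. Falls back to yydefgoto whenever
# the compressed entry is missing or belongs to another row.
def goto_state(state, lhs_nterm_id)
  offset = YYPGOTO[lhs_nterm_id]
  return YYDEFGOTO[lhs_nterm_id] if offset == YYPACT_NINF

  idx = offset + state
  return YYDEFGOTO[lhs_nterm_id] if idx < 0 || YYLAST < idx
  return YYDEFGOTO[lhs_nterm_id] if YYCHECK[idx] != state

  YYTABLE[idx]
end

# Hypothetical helper: apply the default reduce of the state on top of the
# stack. Mutates the stack and returns the rule id, or :syntax_error.
def default_reduce(stack)
  rule = YYDEFACT[stack.last]
  return :syntax_error if rule == 0

  stack.pop(YYR2[rule])                  # pop one state per RHS symbol
  lhs_nterm_id = YYR1[rule] - YYNTOKENS  # symbol id -> nonterminal index
  stack.push(goto_state(stack.last, lhs_nterm_id))
  rule
end

stack = [0, 4, 8, 11]
default_reduce(stack)
p stack # [0, 4]
```

The same two functions also cover the case where the compressed entry is hit: with `[0, 2, 1]` on the stack, reducing `expr: NUM` leaves State 2 on top and the lookup finds State 5 in `yytable`.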