lrama 0.6.2 → 0.6.3
- checksums.yaml +4 -4
- data/NEWS.md +34 -0
- data/README.md +23 -0
- data/Steepfile +2 -0
- data/lib/lrama/context.rb +4 -4
- data/lib/lrama/grammar/code/initial_action_code.rb +6 -0
- data/lib/lrama/grammar/code/no_reference_code.rb +4 -0
- data/lib/lrama/grammar/code/printer_code.rb +6 -0
- data/lib/lrama/grammar/code/rule_action.rb +11 -1
- data/lib/lrama/grammar/reference.rb +4 -3
- data/lib/lrama/grammar/rule_builder.rb +8 -1
- data/lib/lrama/grammar/symbol.rb +1 -1
- data/lib/lrama/grammar/symbols/resolver.rb +276 -0
- data/lib/lrama/grammar/symbols.rb +1 -0
- data/lib/lrama/grammar.rb +25 -244
- data/lib/lrama/lexer/token/user_code.rb +13 -2
- data/lib/lrama/lexer.rb +6 -0
- data/lib/lrama/output.rb +56 -2
- data/lib/lrama/parser.rb +520 -457
- data/lib/lrama/state.rb +4 -4
- data/lib/lrama/states/item.rb +6 -8
- data/lib/lrama/states_reporter.rb +2 -2
- data/lib/lrama/version.rb +1 -1
- data/lrama.gemspec +7 -0
- data/parser.y +20 -0
- data/sig/lrama/grammar/reference.rbs +2 -1
- data/sig/lrama/grammar/symbol.rbs +4 -4
- data/sig/lrama/grammar/symbols/resolver.rbs +41 -0
- data/sig/lrama/grammar/type.rbs +11 -0
- data/template/bison/yacc.c +6 -0
- metadata +12 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: ecd30d3fab4dd73442ed6d3b2802db5b463159cb6ddf1f1835d9b8e860d4c9dd
|
4
|
+
data.tar.gz: 79b6087e68d3c2e95db81fa1d25f58280a5543e9c5fa91f5b6ecc7c40b5599d7
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: f3302156423399987015deb90afbaa0d6916e5e61b14c5297ff6a0e01ab9db3bbd164b334a11e26d38cf626425b7faee404e5d1cdec236a9b08b576ced4fe201
|
7
|
+
data.tar.gz: 380d8d31c93e5ae6c5a406c2b2eedad0d4b52dd311ecd742ca1412bd1acade7dae8fe12f7a6325ef8a0b4922ffd927b0dfa2caf3cd54c095669ab2b2cab85516
|
data/NEWS.md
CHANGED
@@ -1,5 +1,39 @@
|
|
1
1
|
# NEWS for Lrama
|
2
2
|
|
3
|
+
## Lrama 0.6.3 (2024-02-15)
|
4
|
+
|
5
|
+
### Bring Your Own Stack
|
6
|
+
|
7
|
+
Provide functionalities for Bring Your Own Stack.
|
8
|
+
|
9
|
+
Ruby’s Ripper library requires their own semantic value stack to manage Ruby Objects returned by user defined callback method. Currently Ripper uses semantic value stack (`yyvsa`) which is used by parser to manage Node. This hack introduces some limitation on Ripper. For example, Ripper can not execute semantic analysis depending on Node structure.
|
10
|
+
|
11
|
+
Lrama introduces two features to support another semantic value stack by parser generator users.
|
12
|
+
|
13
|
+
1. Callback entry points
|
14
|
+
|
15
|
+
User can emulate semantic value stack by these callbacks.
|
16
|
+
Lrama provides these five callbacks. Registered functions are called when each event happen. For example %after-shift function is called when shift happens on original semantic value stack.
|
17
|
+
|
18
|
+
* `%after-shift` function_name
|
19
|
+
* `%before-reduce` function_name
|
20
|
+
* `%after-reduce` function_name
|
21
|
+
* `%after-shift-error-token` function_name
|
22
|
+
* `%after-pop-stack` function_name
|
23
|
+
|
24
|
+
2. `$:n` variable to access index of each grammar symbols
|
25
|
+
|
26
|
+
User also needs to access semantic value of their stack in grammar action. `$:n` provides the way to access to it. `$:n` is translated to the minus index from the top of the stack.
|
27
|
+
For example
|
28
|
+
|
29
|
+
```
|
30
|
+
primary: k_if expr_value then compstmt if_tail k_end
|
31
|
+
{
|
32
|
+
/*% ripper: if!($:2, $:4, $:5) %*/
|
33
|
+
/* $:2 = -5, $:4 = -3, $:5 = -2. */
|
34
|
+
}
|
35
|
+
```
|
36
|
+
|
3
37
|
## Lrama 0.6.2 (2024-01-27)
|
4
38
|
|
5
39
|
### %no-stdlib directive
|
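The `$:n` translation described in the NEWS.md entry above can be checked with a small Ruby sketch. This is only an illustration under stated assumptions: `stack_offset` is a hypothetical helper (not part of Lrama), and the action is assumed to sit at the end of a rule whose right-hand side has `rhs_length` symbols.

```ruby
# Hypothetical helper (not part of Lrama): offset of `$:n` relative to the
# top of a user-managed stack, for an action placed after `rhs_length` symbols.
def stack_offset(rhs_length, n)
  raise ArgumentError, "n must be within 1..#{rhs_length}" unless (1..rhs_length).cover?(n)

  -rhs_length + n - 1
end

# primary: k_if expr_value then compstmt if_tail k_end  (6 symbols)
rhs_length = 6
offsets = [2, 4, 5].to_h { |n| ["$:#{n}", stack_offset(rhs_length, n)] }
p offsets
```

For the six-symbol `primary` rule this yields -5, -3 and -2 for `$:2`, `$:4` and `$:5`, matching the comment in the NEWS example.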
data/README.md
CHANGED
@@ -1,7 +1,23 @@
|
|
1
1
|
# Lrama
|
2
2
|
|
3
|
+
[](https://badge.fury.io/rb/lrama)
|
4
|
+
[](https://github.com/ruby/lrama/actions/workflows/test.yaml)
|
5
|
+
|
3
6
|
Lrama is LALR (1) parser generator written by Ruby. The first goal of this project is providing error tolerant parser for CRuby with minimal changes on CRuby parse.y file.
|
4
7
|
|
8
|
+
* [Features](#features)
|
9
|
+
* [Installation](#installation)
|
10
|
+
* [Usage](#usage)
|
11
|
+
* [Versions and Branches](#versions-and-branches)
|
12
|
+
* [Supported Ruby version](#supported-ruby-version)
|
13
|
+
* [Development](#development)
|
14
|
+
* [How to generate parser.rb](#how-to-generate-parserrb)
|
15
|
+
* [Test](#test)
|
16
|
+
* [Profiling Lrama](#profiling-lrama)
|
17
|
+
* [Build Ruby](#build-ruby)
|
18
|
+
* [Release flow](#release-flow)
|
19
|
+
* [License](#license)
|
20
|
+
|
5
21
|
## Features
|
6
22
|
|
7
23
|
* Bison style grammar file is supported with some assumptions
|
@@ -11,6 +27,9 @@ Lrama is LALR (1) parser generator written by Ruby. The first goal of this proje
|
|
11
27
|
* b4_lac_if is always false
|
12
28
|
* Error Tolerance parser
|
13
29
|
* Subset of [Repairing Syntax Errors in LR Parsers (Corchuelo et al.)](https://idus.us.es/bitstream/handle/11441/65631/Repairing%20syntax%20errors.pdf) algorithm is supported
|
30
|
+
* Parameterizing rules
|
31
|
+
* The definition of a non-terminal symbol can be parameterized with other (terminal or non-terminal) symbols.
|
32
|
+
* Providing a generic definition of parameterizing rules as a [standard library](lib/lrama/grammar/stdlib.y).
|
14
33
|
|
15
34
|
## Installation
|
16
35
|
|
@@ -85,6 +104,8 @@ Running tests:
|
|
85
104
|
```shell
|
86
105
|
$ bundle install
|
87
106
|
$ bundle exec rspec
|
107
|
+
# or
|
108
|
+
$ bundle exec rake spec
|
88
109
|
```
|
89
110
|
|
90
111
|
Running type check:
|
@@ -93,6 +114,8 @@ Running type check:
|
|
93
114
|
$ bundle install
|
94
115
|
$ bundle exec rbs collection install
|
95
116
|
$ bundle exec steep check
|
117
|
+
# or
|
118
|
+
$ bundle exec rake steep
|
96
119
|
```
|
97
120
|
|
98
121
|
Running both of them:
|
data/Steepfile
CHANGED
@@ -11,12 +11,14 @@ target :lib do
|
|
11
11
|
check "lib/lrama/grammar/error_token.rb"
|
12
12
|
check "lib/lrama/grammar/parameterizing_rule"
|
13
13
|
check "lib/lrama/grammar/parameterizing_rules"
|
14
|
+
check "lib/lrama/grammar/symbols"
|
14
15
|
check "lib/lrama/grammar/percent_code.rb"
|
15
16
|
check "lib/lrama/grammar/precedence.rb"
|
16
17
|
check "lib/lrama/grammar/printer.rb"
|
17
18
|
check "lib/lrama/grammar/reference.rb"
|
18
19
|
check "lib/lrama/grammar/rule_builder.rb"
|
19
20
|
check "lib/lrama/grammar/symbol.rb"
|
21
|
+
check "lib/lrama/grammar/type.rb"
|
20
22
|
check "lib/lrama/lexer"
|
21
23
|
check "lib/lrama/report"
|
22
24
|
check "lib/lrama/bitmap.rb"
|
data/lib/lrama/context.rb
CHANGED
@@ -265,9 +265,9 @@ module Lrama
|
|
265
265
|
|
266
266
|
s = actions.each_with_index.map do |n, i|
|
267
267
|
[i, n]
|
268
|
-
end.
|
268
|
+
end.reject do |i, n|
|
269
269
|
# Remove default_reduction_rule entries
|
270
|
-
n
|
270
|
+
n == 0
|
271
271
|
end
|
272
272
|
|
273
273
|
if s.count != 0
|
@@ -462,7 +462,7 @@ module Lrama
|
|
462
462
|
@yylast = high
|
463
463
|
|
464
464
|
# replace_ninf
|
465
|
-
@yypact_ninf = (@base.
|
465
|
+
@yypact_ninf = (@base.reject {|i| i == BaseMin } + [0]).min - 1
|
466
466
|
@base.map! do |i|
|
467
467
|
case i
|
468
468
|
when BaseMin
|
@@ -472,7 +472,7 @@ module Lrama
|
|
472
472
|
end
|
473
473
|
end
|
474
474
|
|
475
|
-
@yytable_ninf = (@table.compact.
|
475
|
+
@yytable_ninf = (@table.compact.reject {|i| i == ErrorActionNumber } + [0]).min - 1
|
476
476
|
@table.map! do |i|
|
477
477
|
case i
|
478
478
|
when nil
|
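The two changed `*_ninf` lines above pick a sentinel that is smaller than every real table entry. A minimal Ruby sketch of that idea follows, assuming a placeholder marker: the constant `BASE_MIN` and the helper `ninf_for` are hypothetical names, and `BASE_MIN` only stands in for `BaseMin`, whose real value is not shown in this diff.

```ruby
# Hypothetical stand-in for Lrama's BaseMin marker; the real value is not shown here.
BASE_MIN = :base_min

# Choose a sentinel strictly below every real entry (and below 0), mirroring
# `(@base.reject {|i| i == BaseMin } + [0]).min - 1` from the diff.
def ninf_for(entries, marker)
  (entries.reject { |i| i == marker } + [0]).min - 1
end

base = [3, BASE_MIN, -7, 12, BASE_MIN]
ninf = ninf_for(base, BASE_MIN)

# Replace every marker with the sentinel, as the `map!` that follows in the diff does.
packed = base.map { |i| i == BASE_MIN ? ninf : i }
p ninf
p packed
```

Appending `[0]` before taking the minimum keeps the expression defined for an empty list and guarantees the sentinel is at most -1.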
data/lib/lrama/grammar/code/initial_action_code.rb
CHANGED
@@ -6,18 +6,24 @@ module Lrama
|
|
6
6
|
|
7
7
|
# * ($$) yylval
|
8
8
|
# * (@$) yylloc
|
9
|
+
# * ($:$) error
|
9
10
|
# * ($1) error
|
10
11
|
# * (@1) error
|
12
|
+
# * ($:1) error
|
11
13
|
def reference_to_c(ref)
|
12
14
|
case
|
13
15
|
when ref.type == :dollar && ref.name == "$" # $$
|
14
16
|
"yylval"
|
15
17
|
when ref.type == :at && ref.name == "$" # @$
|
16
18
|
"yylloc"
|
19
|
+
when ref.type == :index && ref.name == "$" # $:$
|
20
|
+
raise "$:#{ref.value} can not be used in initial_action."
|
17
21
|
when ref.type == :dollar # $n
|
18
22
|
raise "$#{ref.value} can not be used in initial_action."
|
19
23
|
when ref.type == :at # @n
|
20
24
|
raise "@#{ref.value} can not be used in initial_action."
|
25
|
+
when ref.type == :index # $:n
|
26
|
+
raise "$:#{ref.value} can not be used in initial_action."
|
21
27
|
else
|
22
28
|
raise "Unexpected. #{self}, #{ref}"
|
23
29
|
end
|
data/lib/lrama/grammar/code/no_reference_code.rb
CHANGED
@@ -6,14 +6,18 @@ module Lrama
|
|
6
6
|
|
7
7
|
# * ($$) error
|
8
8
|
# * (@$) error
|
9
|
+
# * ($:$) error
|
9
10
|
# * ($1) error
|
10
11
|
# * (@1) error
|
12
|
+
# * ($:1) error
|
11
13
|
def reference_to_c(ref)
|
12
14
|
case
|
13
15
|
when ref.type == :dollar # $$, $n
|
14
16
|
raise "$#{ref.value} can not be used in #{type}."
|
15
17
|
when ref.type == :at # @$, @n
|
16
18
|
raise "@#{ref.value} can not be used in #{type}."
|
19
|
+
when ref.type == :index # $:$, $:n
|
20
|
+
raise "$:#{ref.value} can not be used in #{type}."
|
17
21
|
else
|
18
22
|
raise "Unexpected. #{self}, #{ref}"
|
19
23
|
end
|
data/lib/lrama/grammar/code/printer_code.rb
CHANGED
@@ -11,8 +11,10 @@ module Lrama
|
|
11
11
|
|
12
12
|
# * ($$) *yyvaluep
|
13
13
|
# * (@$) *yylocationp
|
14
|
+
# * ($:$) error
|
14
15
|
# * ($1) error
|
15
16
|
# * (@1) error
|
17
|
+
# * ($:1) error
|
16
18
|
def reference_to_c(ref)
|
17
19
|
case
|
18
20
|
when ref.type == :dollar && ref.name == "$" # $$
|
@@ -20,10 +22,14 @@ module Lrama
|
|
20
22
|
"((*yyvaluep).#{member})"
|
21
23
|
when ref.type == :at && ref.name == "$" # @$
|
22
24
|
"(*yylocationp)"
|
25
|
+
when ref.type == :index && ref.name == "$" # $:$
|
26
|
+
raise "$:#{ref.value} can not be used in #{type}."
|
23
27
|
when ref.type == :dollar # $n
|
24
28
|
raise "$#{ref.value} can not be used in #{type}."
|
25
29
|
when ref.type == :at # @n
|
26
30
|
raise "@#{ref.value} can not be used in #{type}."
|
31
|
+
when ref.type == :index # $:n
|
32
|
+
raise "$:#{ref.value} can not be used in #{type}."
|
27
33
|
else
|
28
34
|
raise "Unexpected. #{self}, #{ref}"
|
29
35
|
end
|
data/lib/lrama/grammar/code/rule_action.rb
CHANGED
@@ -11,8 +11,10 @@ module Lrama
|
|
11
11
|
|
12
12
|
# * ($$) yyval
|
13
13
|
# * (@$) yyloc
|
14
|
+
# * ($:$) error
|
14
15
|
# * ($1) yyvsp[i]
|
15
16
|
# * (@1) yylsp[i]
|
17
|
+
# * ($:1) i - 1
|
16
18
|
#
|
17
19
|
#
|
18
20
|
# Consider a rule like
|
@@ -24,6 +26,8 @@ module Lrama
|
|
24
26
|
# "Rule" class: keyword_class { $1 } tSTRING { $2 + $3 } keyword_end { $class = $1 + $keyword_end }
|
25
27
|
# "Position in grammar" $1 $2 $3 $4 $5
|
26
28
|
# "Index for yyvsp" -4 -3 -2 -1 0
|
29
|
+
# "$:n" $:1 $:2 $:3 $:4 $:5
|
30
|
+
# "index of $:n" -5 -4 -3 -2 -1
|
27
31
|
#
|
28
32
|
#
|
29
33
|
# For the first midrule action:
|
@@ -31,6 +35,7 @@ module Lrama
|
|
31
35
|
# "Rule" class: keyword_class { $1 } tSTRING { $2 + $3 } keyword_end { $class = $1 + $keyword_end }
|
32
36
|
# "Position in grammar" $1
|
33
37
|
# "Index for yyvsp" 0
|
38
|
+
# "$:n" $:1
|
34
39
|
def reference_to_c(ref)
|
35
40
|
case
|
36
41
|
when ref.type == :dollar && ref.name == "$" # $$
|
@@ -39,6 +44,8 @@ module Lrama
|
|
39
44
|
"(yyval.#{tag.member})"
|
40
45
|
when ref.type == :at && ref.name == "$" # @$
|
41
46
|
"(yyloc)"
|
47
|
+
when ref.type == :index && ref.name == "$" # $:$
|
48
|
+
raise "$:$ is not supported"
|
42
49
|
when ref.type == :dollar # $n
|
43
50
|
i = -position_in_rhs + ref.index
|
44
51
|
tag = ref.ex_tag || rhs[ref.index - 1].tag
|
@@ -47,6 +54,9 @@ module Lrama
|
|
47
54
|
when ref.type == :at # @n
|
48
55
|
i = -position_in_rhs + ref.index
|
49
56
|
"(yylsp[#{i}])"
|
57
|
+
when ref.type == :index # $:n
|
58
|
+
i = -position_in_rhs + ref.index
|
59
|
+
"(#{i} - 1)"
|
50
60
|
else
|
51
61
|
raise "Unexpected. #{self}, #{ref}"
|
52
62
|
end
|
@@ -70,7 +80,7 @@ module Lrama
|
|
70
80
|
end
|
71
81
|
|
72
82
|
def raise_tag_not_found_error(ref)
|
73
|
-
raise "Tag is not specified for '$#{ref.value}' in '#{@rule
|
83
|
+
raise "Tag is not specified for '$#{ref.value}' in '#{@rule}'"
|
74
84
|
end
|
75
85
|
end
|
76
86
|
end
|
data/lib/lrama/grammar/reference.rb
CHANGED
@@ -2,11 +2,12 @@ module Lrama
|
|
2
2
|
class Grammar
|
3
3
|
# type: :dollar or :at
|
4
4
|
# name: String (e.g. $$, $foo, $expr.right)
|
5
|
-
#
|
5
|
+
# number: Integer (e.g. $1)
|
6
|
+
# index: Integer
|
6
7
|
# ex_tag: "$<tag>1" (Optional)
|
7
|
-
class Reference < Struct.new(:type, :name, :index, :ex_tag, :first_column, :last_column, keyword_init: true)
|
8
|
+
class Reference < Struct.new(:type, :name, :number, :index, :ex_tag, :first_column, :last_column, keyword_init: true)
|
8
9
|
def value
|
9
|
-
name ||
|
10
|
+
name || number
|
10
11
|
end
|
11
12
|
end
|
12
13
|
end
|
data/lib/lrama/grammar/rule_builder.rb
CHANGED
@@ -181,11 +181,18 @@ module Lrama
|
|
181
181
|
if referring_symbol[1] == 0 # Refers to LHS
|
182
182
|
ref.name = '$'
|
183
183
|
else
|
184
|
-
ref.
|
184
|
+
ref.number = referring_symbol[1]
|
185
185
|
end
|
186
186
|
end
|
187
187
|
end
|
188
188
|
|
189
|
+
if ref.number
|
190
|
+
# TODO: When Inlining is implemented, for example, if `$1` is expanded to multiple RHS tokens,
|
191
|
+
# `$2` needs to access `$2 + n` to actually access it. So, after the Inlining implementation,
|
192
|
+
# it needs resolves from number to index.
|
193
|
+
ref.index = ref.number
|
194
|
+
end
|
195
|
+
|
189
196
|
# TODO: Need to check index of @ too?
|
190
197
|
next if ref.type == :at
|
191
198
|
|
data/lib/lrama/grammar/symbol.rb
CHANGED
@@ -11,7 +11,7 @@ module Lrama
|
|
11
11
|
attr_reader :term
|
12
12
|
attr_writer :eof_symbol, :error_symbol, :undef_symbol, :accept_symbol
|
13
13
|
|
14
|
-
def initialize(id:, alias_name: nil, number: nil, tag: nil,
|
14
|
+
def initialize(id:, term:, alias_name: nil, number: nil, tag: nil, token_id: nil, nullable: nil, precedence: nil, printer: nil)
|
15
15
|
@id = id
|
16
16
|
@alias_name = alias_name
|
17
17
|
@number = number
|
data/lib/lrama/grammar/symbols/resolver.rb
ADDED
@@ -0,0 +1,276 @@
|
|
1
|
+
module Lrama
|
2
|
+
class Grammar
|
3
|
+
class Symbols
|
4
|
+
class Resolver
|
5
|
+
attr_reader :terms, :nterms
|
6
|
+
|
7
|
+
def initialize
|
8
|
+
@terms = []
|
9
|
+
@nterms = []
|
10
|
+
end
|
11
|
+
|
12
|
+
def symbols
|
13
|
+
@symbols ||= (@terms + @nterms)
|
14
|
+
end
|
15
|
+
|
16
|
+
def sort_by_number!
|
17
|
+
symbols.sort_by!(&:number)
|
18
|
+
end
|
19
|
+
|
20
|
+
def add_term(id:, alias_name: nil, tag: nil, token_id: nil, replace: false)
|
21
|
+
if token_id && (sym = find_symbol_by_token_id(token_id))
|
22
|
+
if replace
|
23
|
+
sym.id = id
|
24
|
+
sym.alias_name = alias_name
|
25
|
+
sym.tag = tag
|
26
|
+
end
|
27
|
+
|
28
|
+
return sym
|
29
|
+
end
|
30
|
+
|
31
|
+
if (sym = find_symbol_by_id(id))
|
32
|
+
return sym
|
33
|
+
end
|
34
|
+
|
35
|
+
@symbols = nil
|
36
|
+
term = Symbol.new(
|
37
|
+
id: id, alias_name: alias_name, number: nil, tag: tag,
|
38
|
+
term: true, token_id: token_id, nullable: false
|
39
|
+
)
|
40
|
+
@terms << term
|
41
|
+
term
|
42
|
+
end
|
43
|
+
|
44
|
+
def add_nterm(id:, alias_name: nil, tag: nil)
|
45
|
+
return if find_symbol_by_id(id)
|
46
|
+
|
47
|
+
@symbols = nil
|
48
|
+
nterm = Symbol.new(
|
49
|
+
id: id, alias_name: alias_name, number: nil, tag: tag,
|
50
|
+
term: false, token_id: nil, nullable: nil,
|
51
|
+
)
|
52
|
+
@nterms << nterm
|
53
|
+
nterm
|
54
|
+
end
|
55
|
+
|
56
|
+
def find_symbol_by_s_value(s_value)
|
57
|
+
symbols.find { |s| s.id.s_value == s_value }
|
58
|
+
end
|
59
|
+
|
60
|
+
def find_symbol_by_s_value!(s_value)
|
61
|
+
find_symbol_by_s_value(s_value) || (raise "Symbol not found: #{s_value}")
|
62
|
+
end
|
63
|
+
|
64
|
+
def find_symbol_by_id(id)
|
65
|
+
symbols.find do |s|
|
66
|
+
s.id == id || s.alias_name == id.s_value
|
67
|
+
end
|
68
|
+
end
|
69
|
+
|
70
|
+
def find_symbol_by_id!(id)
|
71
|
+
find_symbol_by_id(id) || (raise "Symbol not found: #{id}")
|
72
|
+
end
|
73
|
+
|
74
|
+
def find_symbol_by_token_id(token_id)
|
75
|
+
symbols.find {|s| s.token_id == token_id }
|
76
|
+
end
|
77
|
+
|
78
|
+
def find_symbol_by_number!(number)
|
79
|
+
sym = symbols[number]
|
80
|
+
|
81
|
+
raise "Symbol not found: #{number}" unless sym
|
82
|
+
raise "[BUG] Symbol number mismatch. #{number}, #{sym}" if sym.number != number
|
83
|
+
|
84
|
+
sym
|
85
|
+
end
|
86
|
+
|
87
|
+
def fill_symbol_number
|
88
|
+
# YYEMPTY = -2
|
89
|
+
# YYEOF = 0
|
90
|
+
# YYerror = 1
|
91
|
+
# YYUNDEF = 2
|
92
|
+
@number = 3
|
93
|
+
fill_terms_number
|
94
|
+
fill_nterms_number
|
95
|
+
end
|
96
|
+
|
97
|
+
def fill_nterm_type(types)
|
98
|
+
types.each do |type|
|
99
|
+
nterm = find_nterm_by_id!(type.id)
|
100
|
+
nterm.tag = type.tag
|
101
|
+
end
|
102
|
+
end
|
103
|
+
|
104
|
+
def fill_printer(printers)
|
105
|
+
symbols.each do |sym|
|
106
|
+
printers.each do |printer|
|
107
|
+
printer.ident_or_tags.each do |ident_or_tag|
|
108
|
+
case ident_or_tag
|
109
|
+
when Lrama::Lexer::Token::Ident
|
110
|
+
sym.printer = printer if sym.id == ident_or_tag
|
111
|
+
when Lrama::Lexer::Token::Tag
|
112
|
+
sym.printer = printer if sym.tag == ident_or_tag
|
113
|
+
else
|
114
|
+
raise "Unknown token type. #{printer}"
|
115
|
+
end
|
116
|
+
end
|
117
|
+
end
|
118
|
+
end
|
119
|
+
end
|
120
|
+
|
121
|
+
def fill_error_token(error_tokens)
|
122
|
+
symbols.each do |sym|
|
123
|
+
error_tokens.each do |token|
|
124
|
+
token.ident_or_tags.each do |ident_or_tag|
|
125
|
+
case ident_or_tag
|
126
|
+
when Lrama::Lexer::Token::Ident
|
127
|
+
sym.error_token = token if sym.id == ident_or_tag
|
128
|
+
when Lrama::Lexer::Token::Tag
|
129
|
+
sym.error_token = token if sym.tag == ident_or_tag
|
130
|
+
else
|
131
|
+
raise "Unknown token type. #{token}"
|
132
|
+
end
|
133
|
+
end
|
134
|
+
end
|
135
|
+
end
|
136
|
+
end
|
137
|
+
|
138
|
+
def token_to_symbol(token)
|
139
|
+
case token
|
140
|
+
when Lrama::Lexer::Token
|
141
|
+
find_symbol_by_id!(token)
|
142
|
+
else
|
143
|
+
raise "Unknown class: #{token}"
|
144
|
+
end
|
145
|
+
end
|
146
|
+
|
147
|
+
def validate!
|
148
|
+
validate_number_uniqueness!
|
149
|
+
validate_alias_name_uniqueness!
|
150
|
+
end
|
151
|
+
|
152
|
+
private
|
153
|
+
|
154
|
+
def find_nterm_by_id!(id)
|
155
|
+
@nterms.find do |s|
|
156
|
+
s.id == id
|
157
|
+
end || (raise "Symbol not found: #{id}")
|
158
|
+
end
|
159
|
+
|
160
|
+
def fill_terms_number
|
161
|
+
# Character literal in grammar file has
|
162
|
+
# token id corresponding to ASCII code by default,
|
163
|
+
# so start token_id from 256.
|
164
|
+
token_id = 256
|
165
|
+
|
166
|
+
@terms.each do |sym|
|
167
|
+
while used_numbers[@number] do
|
168
|
+
@number += 1
|
169
|
+
end
|
170
|
+
|
171
|
+
if sym.number.nil?
|
172
|
+
sym.number = @number
|
173
|
+
used_numbers[@number] = true
|
174
|
+
@number += 1
|
175
|
+
end
|
176
|
+
|
177
|
+
# If id is Token::Char, it uses ASCII code
|
178
|
+
if sym.token_id.nil?
|
179
|
+
if sym.id.is_a?(Lrama::Lexer::Token::Char)
|
180
|
+
# Ignore ' on the both sides
|
181
|
+
case sym.id.s_value[1..-2]
|
182
|
+
when "\\b"
|
183
|
+
sym.token_id = 8
|
184
|
+
when "\\f"
|
185
|
+
sym.token_id = 12
|
186
|
+
when "\\n"
|
187
|
+
sym.token_id = 10
|
188
|
+
when "\\r"
|
189
|
+
sym.token_id = 13
|
190
|
+
when "\\t"
|
191
|
+
sym.token_id = 9
|
192
|
+
when "\\v"
|
193
|
+
sym.token_id = 11
|
194
|
+
when "\""
|
195
|
+
sym.token_id = 34
|
196
|
+
when "'"
|
197
|
+
sym.token_id = 39
|
198
|
+
when "\\\\"
|
199
|
+
sym.token_id = 92
|
200
|
+
when /\A\\(\d+)\z/
|
201
|
+
unless (id = Integer($1, 8)).nil?
|
202
|
+
sym.token_id = id
|
203
|
+
else
|
204
|
+
raise "Unknown Char s_value #{sym}"
|
205
|
+
end
|
206
|
+
when /\A(.)\z/
|
207
|
+
unless (id = $1&.bytes&.first).nil?
|
208
|
+
sym.token_id = id
|
209
|
+
else
|
210
|
+
raise "Unknown Char s_value #{sym}"
|
211
|
+
end
|
212
|
+
else
|
213
|
+
raise "Unknown Char s_value #{sym}"
|
214
|
+
end
|
215
|
+
else
|
216
|
+
sym.token_id = token_id
|
217
|
+
token_id += 1
|
218
|
+
end
|
219
|
+
end
|
220
|
+
end
|
221
|
+
end
|
222
|
+
|
223
|
+
def fill_nterms_number
|
224
|
+
token_id = 0
|
225
|
+
|
226
|
+
@nterms.each do |sym|
|
227
|
+
while used_numbers[@number] do
|
228
|
+
@number += 1
|
229
|
+
end
|
230
|
+
|
231
|
+
if sym.number.nil?
|
232
|
+
sym.number = @number
|
233
|
+
used_numbers[@number] = true
|
234
|
+
@number += 1
|
235
|
+
end
|
236
|
+
|
237
|
+
if sym.token_id.nil?
|
238
|
+
sym.token_id = token_id
|
239
|
+
token_id += 1
|
240
|
+
end
|
241
|
+
end
|
242
|
+
end
|
243
|
+
|
244
|
+
def used_numbers
|
245
|
+
return @used_numbers if defined?(@used_numbers)
|
246
|
+
|
247
|
+
@used_numbers = {}
|
248
|
+
symbols.map(&:number).each do |n|
|
249
|
+
@used_numbers[n] = true
|
250
|
+
end
|
251
|
+
@used_numbers
|
252
|
+
end
|
253
|
+
|
254
|
+
def validate_number_uniqueness!
|
255
|
+
invalid = symbols.group_by(&:number).select do |number, syms|
|
256
|
+
syms.count > 1
|
257
|
+
end
|
258
|
+
|
259
|
+
return if invalid.empty?
|
260
|
+
|
261
|
+
raise "Symbol number is duplicated. #{invalid}"
|
262
|
+
end
|
263
|
+
|
264
|
+
def validate_alias_name_uniqueness!
|
265
|
+
invalid = symbols.select(&:alias_name).group_by(&:alias_name).select do |alias_name, syms|
|
266
|
+
syms.count > 1
|
267
|
+
end
|
268
|
+
|
269
|
+
return if invalid.empty?
|
270
|
+
|
271
|
+
raise "Symbol alias name is duplicated. #{invalid}"
|
272
|
+
end
|
273
|
+
end
|
274
|
+
end
|
275
|
+
end
|
276
|
+
end
|
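The character-literal branch of `fill_terms_number` above can be exercised in isolation. The sketch below is a simplified restatement, assuming a standalone helper named `char_token_id` and a lookup table `ESCAPES` (both hypothetical, not part of Lrama); the helper takes the literal including its surrounding quotes.

```ruby
# Hypothetical standalone helper; it restates the `case` in fill_terms_number.
ESCAPES = {
  "\\b" => 8, "\\f" => 12, "\\n" => 10, "\\r" => 13,
  "\\t" => 9, "\\v" => 11, "\"" => 34, "'" => 39, "\\\\" => 92
}.freeze

def char_token_id(s_value)
  body = s_value[1..-2] # drop the quote on both sides
  return ESCAPES[body] if ESCAPES.key?(body)

  case body
  when /\A\\(\d+)\z/ then Integer($1, 8) # octal escape such as '\101'
  when /\A(.)\z/     then $1.bytes.first # plain single character
  else raise "Unknown Char s_value #{s_value}"
  end
end

p char_token_id("'a'")      # 97
p char_token_id("'\\n'")    # 10
p char_token_id("'\\101'")  # 65
```

Named tokens without an explicit id do not go through this path; in the diff they receive sequential ids starting at 256, so they cannot collide with these byte values.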
data/lib/lrama/grammar/symbols.rb
ADDED
@@ -0,0 +1 @@
|
|
1
|
+
require_relative "symbols/resolver"
|