lrama 0.6.2 → 0.6.3
- checksums.yaml +4 -4
- data/NEWS.md +34 -0
- data/README.md +23 -0
- data/Steepfile +2 -0
- data/lib/lrama/context.rb +4 -4
- data/lib/lrama/grammar/code/initial_action_code.rb +6 -0
- data/lib/lrama/grammar/code/no_reference_code.rb +4 -0
- data/lib/lrama/grammar/code/printer_code.rb +6 -0
- data/lib/lrama/grammar/code/rule_action.rb +11 -1
- data/lib/lrama/grammar/reference.rb +4 -3
- data/lib/lrama/grammar/rule_builder.rb +8 -1
- data/lib/lrama/grammar/symbol.rb +1 -1
- data/lib/lrama/grammar/symbols/resolver.rb +276 -0
- data/lib/lrama/grammar/symbols.rb +1 -0
- data/lib/lrama/grammar.rb +25 -244
- data/lib/lrama/lexer/token/user_code.rb +13 -2
- data/lib/lrama/lexer.rb +6 -0
- data/lib/lrama/output.rb +56 -2
- data/lib/lrama/parser.rb +520 -457
- data/lib/lrama/state.rb +4 -4
- data/lib/lrama/states/item.rb +6 -8
- data/lib/lrama/states_reporter.rb +2 -2
- data/lib/lrama/version.rb +1 -1
- data/lrama.gemspec +7 -0
- data/parser.y +20 -0
- data/sig/lrama/grammar/reference.rbs +2 -1
- data/sig/lrama/grammar/symbol.rbs +4 -4
- data/sig/lrama/grammar/symbols/resolver.rbs +41 -0
- data/sig/lrama/grammar/type.rbs +11 -0
- data/template/bison/yacc.c +6 -0
- metadata +12 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: ecd30d3fab4dd73442ed6d3b2802db5b463159cb6ddf1f1835d9b8e860d4c9dd
|
4
|
+
data.tar.gz: 79b6087e68d3c2e95db81fa1d25f58280a5543e9c5fa91f5b6ecc7c40b5599d7
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: f3302156423399987015deb90afbaa0d6916e5e61b14c5297ff6a0e01ab9db3bbd164b334a11e26d38cf626425b7faee404e5d1cdec236a9b08b576ced4fe201
|
7
|
+
data.tar.gz: 380d8d31c93e5ae6c5a406c2b2eedad0d4b52dd311ecd742ca1412bd1acade7dae8fe12f7a6325ef8a0b4922ffd927b0dfa2caf3cd54c095669ab2b2cab85516
|
data/NEWS.md
CHANGED
@@ -1,5 +1,39 @@
|
|
1
1
|
# NEWS for Lrama
|
2
2
|
|
3
|
+
## Lrama 0.6.3 (2024-02-15)
|
4
|
+
|
5
|
+
### Bring Your Own Stack
|
6
|
+
|
7
|
+
Provide functionalities for Bring Your Own Stack.
|
8
|
+
|
9
|
+
Ruby’s Ripper library requires their own semantic value stack to manage Ruby Objects returned by user defined callback method. Currently Ripper uses semantic value stack (`yyvsa`) which is used by parser to manage Node. This hack introduces some limitation on Ripper. For example, Ripper can not execute semantic analysis depending on Node structure.
|
10
|
+
|
11
|
+
Lrama introduces two features to support another semantic value stack by parser generator users.
|
12
|
+
|
13
|
+
1. Callback entry points
|
14
|
+
|
15
|
+
User can emulate semantic value stack by these callbacks.
|
16
|
+
Lrama provides these five callbacks. Registered functions are called when each event happen. For example %after-shift function is called when shift happens on original semantic value stack.
|
17
|
+
|
18
|
+
* `%after-shift` function_name
|
19
|
+
* `%before-reduce` function_name
|
20
|
+
* `%after-reduce` function_name
|
21
|
+
* `%after-shift-error-token` function_name
|
22
|
+
* `%after-pop-stack` function_name
|
23
|
+
|
24
|
+
2. `$:n` variable to access index of each grammar symbols
|
25
|
+
|
26
|
+
User also needs to access semantic value of their stack in grammar action. `$:n` provides the way to access to it. `$:n` is translated to the minus index from the top of the stack.
|
27
|
+
For example
|
28
|
+
|
29
|
+
```
|
30
|
+
primary: k_if expr_value then compstmt if_tail k_end
|
31
|
+
{
|
32
|
+
/*% ripper: if!($:2, $:4, $:5) %*/
|
33
|
+
/* $:2 = -5, $:4 = -3, $:5 = -2. */
|
34
|
+
}
|
35
|
+
```
|
36
|
+
|
3
37
|
## Lrama 0.6.2 (2024-01-27)
|
4
38
|
|
5
39
|
### %no-stdlib directive
|
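The `$:n` arithmetic from the NEWS entry above can be sketched in Ruby. This is only an illustration: it assumes the user-managed stack can be modelled as a plain Ruby array whose last element is the top of the stack. The array contents and the helper name `index_from_top` are hypothetical and not part of Lrama.

```ruby
# Hypothetical model of a user-managed value stack after all six RHS symbols of
#   primary: k_if expr_value then compstmt if_tail k_end
# have been pushed. The last element is the top of the stack.
user_stack = [:older_value, :k_if, :expr_value, :then, :compstmt, :if_tail, :k_end]

# For an action at the end of a rule with `rhs_size` symbols, `$:n` becomes
# a negative offset counted from the top of the stack.
def index_from_top(n, rhs_size)
  n - rhs_size - 1
end

indices = [2, 4, 5].map { |n| index_from_top(n, 6) }
values = indices.map { |i| user_stack[i] }

p indices # offsets for $:2, $:4, $:5
p values  # the entries those offsets select in the modelled stack
```

Ruby's negative array indices happen to follow the same convention (`-1` is the last element), which is why the array works as a stand-in here.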
data/README.md
CHANGED
@@ -1,7 +1,23 @@
|
|
1
1
|
# Lrama
|
2
2
|
|
3
|
+
[![Gem Version](https://badge.fury.io/rb/lrama.svg)](https://badge.fury.io/rb/lrama)
|
4
|
+
[![build](https://github.com/ruby/lrama/actions/workflows/test.yaml/badge.svg)](https://github.com/ruby/lrama/actions/workflows/test.yaml)
|
5
|
+
|
3
6
|
Lrama is LALR (1) parser generator written by Ruby. The first goal of this project is providing error tolerant parser for CRuby with minimal changes on CRuby parse.y file.
|
4
7
|
|
8
|
+
* [Features](#features)
|
9
|
+
* [Installation](#installation)
|
10
|
+
* [Usage](#usage)
|
11
|
+
* [Versions and Branches](#versions-and-branches)
|
12
|
+
* [Supported Ruby version](#supported-ruby-version)
|
13
|
+
* [Development](#development)
|
14
|
+
* [How to generate parser.rb](#how-to-generate-parserrb)
|
15
|
+
* [Test](#test)
|
16
|
+
* [Profiling Lrama](#profiling-lrama)
|
17
|
+
* [Build Ruby](#build-ruby)
|
18
|
+
* [Release flow](#release-flow)
|
19
|
+
* [License](#license)
|
20
|
+
|
5
21
|
## Features
|
6
22
|
|
7
23
|
* Bison style grammar file is supported with some assumptions
|
@@ -11,6 +27,9 @@ Lrama is LALR (1) parser generator written by Ruby. The first goal of this proje
|
|
11
27
|
* b4_lac_if is always false
|
12
28
|
* Error Tolerance parser
|
13
29
|
* Subset of [Repairing Syntax Errors in LR Parsers (Corchuelo et al.)](https://idus.us.es/bitstream/handle/11441/65631/Repairing%20syntax%20errors.pdf) algorithm is supported
|
30
|
+
* Parameterizing rules
|
31
|
+
* The definition of a non-terminal symbol can be parameterized with other (terminal or non-terminal) symbols.
|
32
|
+
* Providing a generic definition of parameterizing rules as a [standard library](lib/lrama/grammar/stdlib.y).
|
14
33
|
|
15
34
|
## Installation
|
16
35
|
|
@@ -85,6 +104,8 @@ Running tests:
|
|
85
104
|
```shell
|
86
105
|
$ bundle install
|
87
106
|
$ bundle exec rspec
|
107
|
+
# or
|
108
|
+
$ bundle exec rake spec
|
88
109
|
```
|
89
110
|
|
90
111
|
Running type check:
|
@@ -93,6 +114,8 @@ Running type check:
|
|
93
114
|
$ bundle install
|
94
115
|
$ bundle exec rbs collection install
|
95
116
|
$ bundle exec steep check
|
117
|
+
# or
|
118
|
+
$ bundle exec rake steep
|
96
119
|
```
|
97
120
|
|
98
121
|
Running both of them:
|
data/Steepfile
CHANGED
@@ -11,12 +11,14 @@ target :lib do
|
|
11
11
|
check "lib/lrama/grammar/error_token.rb"
|
12
12
|
check "lib/lrama/grammar/parameterizing_rule"
|
13
13
|
check "lib/lrama/grammar/parameterizing_rules"
|
14
|
+
check "lib/lrama/grammar/symbols"
|
14
15
|
check "lib/lrama/grammar/percent_code.rb"
|
15
16
|
check "lib/lrama/grammar/precedence.rb"
|
16
17
|
check "lib/lrama/grammar/printer.rb"
|
17
18
|
check "lib/lrama/grammar/reference.rb"
|
18
19
|
check "lib/lrama/grammar/rule_builder.rb"
|
19
20
|
check "lib/lrama/grammar/symbol.rb"
|
21
|
+
check "lib/lrama/grammar/type.rb"
|
20
22
|
check "lib/lrama/lexer"
|
21
23
|
check "lib/lrama/report"
|
22
24
|
check "lib/lrama/bitmap.rb"
|
data/lib/lrama/context.rb
CHANGED
@@ -265,9 +265,9 @@ module Lrama
|
|
265
265
|
|
266
266
|
s = actions.each_with_index.map do |n, i|
|
267
267
|
[i, n]
|
268
|
-
end.
|
268
|
+
end.reject do |i, n|
|
269
269
|
# Remove default_reduction_rule entries
|
270
|
-
n
|
270
|
+
n == 0
|
271
271
|
end
|
272
272
|
|
273
273
|
if s.count != 0
|
@@ -462,7 +462,7 @@ module Lrama
|
|
462
462
|
@yylast = high
|
463
463
|
|
464
464
|
# replace_ninf
|
465
|
-
@yypact_ninf = (@base.
|
465
|
+
@yypact_ninf = (@base.reject {|i| i == BaseMin } + [0]).min - 1
|
466
466
|
@base.map! do |i|
|
467
467
|
case i
|
468
468
|
when BaseMin
|
@@ -472,7 +472,7 @@ module Lrama
|
|
472
472
|
end
|
473
473
|
end
|
474
474
|
|
475
|
-
@yytable_ninf = (@table.compact.
|
475
|
+
@yytable_ninf = (@table.compact.reject {|i| i == ErrorActionNumber } + [0]).min - 1
|
476
476
|
@table.map! do |i|
|
477
477
|
case i
|
478
478
|
when nil
|
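The two `*_ninf` lines added in the hunks above share one pattern: drop the sentinel entries, append `0`, take the minimum, and subtract one. The sketch below isolates that pattern. It assumes the sentinel is a value that would otherwise dominate `min` (here `-Float::INFINITY`); the constant `SENTINEL` and the method `ninf_for` are hypothetical names used only for this illustration.

```ruby
# Hypothetical stand-in for constants such as BaseMin or ErrorActionNumber.
SENTINEL = -Float::INFINITY

# Returns a value strictly smaller than every real entry (and smaller than 0),
# so it can later replace the sentinel without colliding with real data.
def ninf_for(entries, sentinel)
  (entries.reject { |i| i == sentinel } + [0]).min - 1
end

p ninf_for([4, SENTINEL, -7, 12], SENTINEL)
p ninf_for([SENTINEL], SENTINEL)
p ninf_for([5, 9], SENTINEL)
```

Appending `0` keeps `min` well defined when every entry is a sentinel and guarantees the result is negative.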
data/lib/lrama/grammar/code/initial_action_code.rb
CHANGED
@@ -6,18 +6,24 @@ module Lrama
|
|
6
6
|
|
7
7
|
# * ($$) yylval
|
8
8
|
# * (@$) yylloc
|
9
|
+
# * ($:$) error
|
9
10
|
# * ($1) error
|
10
11
|
# * (@1) error
|
12
|
+
# * ($:1) error
|
11
13
|
def reference_to_c(ref)
|
12
14
|
case
|
13
15
|
when ref.type == :dollar && ref.name == "$" # $$
|
14
16
|
"yylval"
|
15
17
|
when ref.type == :at && ref.name == "$" # @$
|
16
18
|
"yylloc"
|
19
|
+
when ref.type == :index && ref.name == "$" # $:$
|
20
|
+
raise "$:#{ref.value} can not be used in initial_action."
|
17
21
|
when ref.type == :dollar # $n
|
18
22
|
raise "$#{ref.value} can not be used in initial_action."
|
19
23
|
when ref.type == :at # @n
|
20
24
|
raise "@#{ref.value} can not be used in initial_action."
|
25
|
+
when ref.type == :index # $:n
|
26
|
+
raise "$:#{ref.value} can not be used in initial_action."
|
21
27
|
else
|
22
28
|
raise "Unexpected. #{self}, #{ref}"
|
23
29
|
end
|
data/lib/lrama/grammar/code/no_reference_code.rb
CHANGED
@@ -6,14 +6,18 @@ module Lrama
|
|
6
6
|
|
7
7
|
# * ($$) error
|
8
8
|
# * (@$) error
|
9
|
+
# * ($:$) error
|
9
10
|
# * ($1) error
|
10
11
|
# * (@1) error
|
12
|
+
# * ($:1) error
|
11
13
|
def reference_to_c(ref)
|
12
14
|
case
|
13
15
|
when ref.type == :dollar # $$, $n
|
14
16
|
raise "$#{ref.value} can not be used in #{type}."
|
15
17
|
when ref.type == :at # @$, @n
|
16
18
|
raise "@#{ref.value} can not be used in #{type}."
|
19
|
+
when ref.type == :index # $:$, $:n
|
20
|
+
raise "$:#{ref.value} can not be used in #{type}."
|
17
21
|
else
|
18
22
|
raise "Unexpected. #{self}, #{ref}"
|
19
23
|
end
|
data/lib/lrama/grammar/code/printer_code.rb
CHANGED
@@ -11,8 +11,10 @@ module Lrama
|
|
11
11
|
|
12
12
|
# * ($$) *yyvaluep
|
13
13
|
# * (@$) *yylocationp
|
14
|
+
# * ($:$) error
|
14
15
|
# * ($1) error
|
15
16
|
# * (@1) error
|
17
|
+
# * ($:1) error
|
16
18
|
def reference_to_c(ref)
|
17
19
|
case
|
18
20
|
when ref.type == :dollar && ref.name == "$" # $$
|
@@ -20,10 +22,14 @@ module Lrama
|
|
20
22
|
"((*yyvaluep).#{member})"
|
21
23
|
when ref.type == :at && ref.name == "$" # @$
|
22
24
|
"(*yylocationp)"
|
25
|
+
when ref.type == :index && ref.name == "$" # $:$
|
26
|
+
raise "$:#{ref.value} can not be used in #{type}."
|
23
27
|
when ref.type == :dollar # $n
|
24
28
|
raise "$#{ref.value} can not be used in #{type}."
|
25
29
|
when ref.type == :at # @n
|
26
30
|
raise "@#{ref.value} can not be used in #{type}."
|
31
|
+
when ref.type == :index # $:n
|
32
|
+
raise "$:#{ref.value} can not be used in #{type}."
|
27
33
|
else
|
28
34
|
raise "Unexpected. #{self}, #{ref}"
|
29
35
|
end
|
data/lib/lrama/grammar/code/rule_action.rb
CHANGED
@@ -11,8 +11,10 @@ module Lrama
|
|
11
11
|
|
12
12
|
# * ($$) yyval
|
13
13
|
# * (@$) yyloc
|
14
|
+
# * ($:$) error
|
14
15
|
# * ($1) yyvsp[i]
|
15
16
|
# * (@1) yylsp[i]
|
17
|
+
# * ($:1) i - 1
|
16
18
|
#
|
17
19
|
#
|
18
20
|
# Consider a rule like
|
@@ -24,6 +26,8 @@ module Lrama
|
|
24
26
|
# "Rule" class: keyword_class { $1 } tSTRING { $2 + $3 } keyword_end { $class = $1 + $keyword_end }
|
25
27
|
# "Position in grammar" $1 $2 $3 $4 $5
|
26
28
|
# "Index for yyvsp" -4 -3 -2 -1 0
|
29
|
+
# "$:n" $:1 $:2 $:3 $:4 $:5
|
30
|
+
# "index of $:n" -5 -4 -3 -2 -1
|
27
31
|
#
|
28
32
|
#
|
29
33
|
# For the first midrule action:
|
@@ -31,6 +35,7 @@ module Lrama
|
|
31
35
|
# "Rule" class: keyword_class { $1 } tSTRING { $2 + $3 } keyword_end { $class = $1 + $keyword_end }
|
32
36
|
# "Position in grammar" $1
|
33
37
|
# "Index for yyvsp" 0
|
38
|
+
# "$:n" $:1
|
34
39
|
def reference_to_c(ref)
|
35
40
|
case
|
36
41
|
when ref.type == :dollar && ref.name == "$" # $$
|
@@ -39,6 +44,8 @@ module Lrama
|
|
39
44
|
"(yyval.#{tag.member})"
|
40
45
|
when ref.type == :at && ref.name == "$" # @$
|
41
46
|
"(yyloc)"
|
47
|
+
when ref.type == :index && ref.name == "$" # $:$
|
48
|
+
raise "$:$ is not supported"
|
42
49
|
when ref.type == :dollar # $n
|
43
50
|
i = -position_in_rhs + ref.index
|
44
51
|
tag = ref.ex_tag || rhs[ref.index - 1].tag
|
@@ -47,6 +54,9 @@ module Lrama
|
|
47
54
|
when ref.type == :at # @n
|
48
55
|
i = -position_in_rhs + ref.index
|
49
56
|
"(yylsp[#{i}])"
|
57
|
+
when ref.type == :index # $:n
|
58
|
+
i = -position_in_rhs + ref.index
|
59
|
+
"(#{i} - 1)"
|
50
60
|
else
|
51
61
|
raise "Unexpected. #{self}, #{ref}"
|
52
62
|
end
|
@@ -70,7 +80,7 @@ module Lrama
|
|
70
80
|
end
|
71
81
|
|
72
82
|
def raise_tag_not_found_error(ref)
|
73
|
-
raise "Tag is not specified for '$#{ref.value}' in '#{@rule
|
83
|
+
raise "Tag is not specified for '$#{ref.value}' in '#{@rule}'"
|
74
84
|
end
|
75
85
|
end
|
76
86
|
end
|
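The index translation in the hunks above can be reproduced in a small standalone sketch. It is simplified: the `Ref` struct and the explicit `position_in_rhs` argument are hypothetical stand-ins, and the `.member` suffix that the `$n` branch in the diff derives from the symbol's tag is omitted.

```ruby
# Hypothetical, trimmed-down reference: only the fields this sketch needs.
Ref = Struct.new(:type, :index, keyword_init: true)

# Mirrors the arithmetic shown in the diff:
#   $n  -> yyvsp[i], @n -> yylsp[i], $:n -> (i - 1)
# where i = -position_in_rhs + n.
def reference_to_c(ref, position_in_rhs)
  i = -position_in_rhs + ref.index
  case ref.type
  when :dollar then "(yyvsp[#{i}])"
  when :at     then "(yylsp[#{i}])"
  when :index  then "(#{i} - 1)"
  else raise "Unexpected. #{ref}"
  end
end

# End-of-rule action for a rule with five RHS positions, as in the comment table.
puts reference_to_c(Ref.new(type: :dollar, index: 1), 5)
puts reference_to_c(Ref.new(type: :at, index: 5), 5)
puts reference_to_c(Ref.new(type: :index, index: 1), 5)
puts reference_to_c(Ref.new(type: :index, index: 5), 5)
```

The `$:n` branch emits a C expression rather than an array access, so `$:1` in this rule becomes `(-4 - 1)`, which evaluates to the `-5` listed in the comment table.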
data/lib/lrama/grammar/reference.rb
CHANGED
@@ -2,11 +2,12 @@ module Lrama
|
|
2
2
|
class Grammar
|
3
3
|
# type: :dollar or :at
|
4
4
|
# name: String (e.g. $$, $foo, $expr.right)
|
5
|
-
#
|
5
|
+
# number: Integer (e.g. $1)
|
6
|
+
# index: Integer
|
6
7
|
# ex_tag: "$<tag>1" (Optional)
|
7
|
-
class Reference < Struct.new(:type, :name, :index, :ex_tag, :first_column, :last_column, keyword_init: true)
|
8
|
+
class Reference < Struct.new(:type, :name, :number, :index, :ex_tag, :first_column, :last_column, keyword_init: true)
|
8
9
|
def value
|
9
|
-
name ||
|
10
|
+
name || number
|
10
11
|
end
|
11
12
|
end
|
12
13
|
end
|
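The effect of the new `number` field on `Reference#value` can be checked in isolation. The sketch below copies the struct definition from the hunk above into a top-level constant (outside the `Lrama::Grammar` namespace, purely for illustration) and shows which field `value` returns for named and numbered references.

```ruby
# Standalone copy of the struct shape from the diff, defined at top level
# only so the snippet runs without Lrama loaded.
Reference = Struct.new(:type, :name, :number, :index, :ex_tag,
                       :first_column, :last_column, keyword_init: true) do
  # A reference is identified by its name when it has one, else by its number.
  def value
    name || number
  end
end

named = Reference.new(type: :dollar, name: "$")
numbered = Reference.new(type: :dollar, number: 2, index: 2)

p named.value
p numbered.value
```

Keeping `number` (what the grammar author wrote) apart from `index` (the position actually used) matches the TODO in the `rule_builder.rb` hunk, which expects the two to diverge once inlining exists.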
data/lib/lrama/grammar/rule_builder.rb
CHANGED
@@ -181,11 +181,18 @@ module Lrama
|
|
181
181
|
if referring_symbol[1] == 0 # Refers to LHS
|
182
182
|
ref.name = '$'
|
183
183
|
else
|
184
|
-
ref.
|
184
|
+
ref.number = referring_symbol[1]
|
185
185
|
end
|
186
186
|
end
|
187
187
|
end
|
188
188
|
|
189
|
+
if ref.number
|
190
|
+
# TODO: When Inlining is implemented, for example, if `$1` is expanded to multiple RHS tokens,
|
191
|
+
# `$2` needs to access `$2 + n` to actually access it. So, after the Inlining implementation,
|
192
|
+
# it needs resolves from number to index.
|
193
|
+
ref.index = ref.number
|
194
|
+
end
|
195
|
+
|
189
196
|
# TODO: Need to check index of @ too?
|
190
197
|
next if ref.type == :at
|
191
198
|
|
data/lib/lrama/grammar/symbol.rb
CHANGED
@@ -11,7 +11,7 @@ module Lrama
|
|
11
11
|
attr_reader :term
|
12
12
|
attr_writer :eof_symbol, :error_symbol, :undef_symbol, :accept_symbol
|
13
13
|
|
14
|
-
def initialize(id:, alias_name: nil, number: nil, tag: nil,
|
14
|
+
def initialize(id:, term:, alias_name: nil, number: nil, tag: nil, token_id: nil, nullable: nil, precedence: nil, printer: nil)
|
15
15
|
@id = id
|
16
16
|
@alias_name = alias_name
|
17
17
|
@number = number
|
data/lib/lrama/grammar/symbols/resolver.rb
ADDED
@@ -0,0 +1,276 @@
|
|
1
|
+
module Lrama
|
2
|
+
class Grammar
|
3
|
+
class Symbols
|
4
|
+
class Resolver
|
5
|
+
attr_reader :terms, :nterms
|
6
|
+
|
7
|
+
def initialize
|
8
|
+
@terms = []
|
9
|
+
@nterms = []
|
10
|
+
end
|
11
|
+
|
12
|
+
def symbols
|
13
|
+
@symbols ||= (@terms + @nterms)
|
14
|
+
end
|
15
|
+
|
16
|
+
def sort_by_number!
|
17
|
+
symbols.sort_by!(&:number)
|
18
|
+
end
|
19
|
+
|
20
|
+
def add_term(id:, alias_name: nil, tag: nil, token_id: nil, replace: false)
|
21
|
+
if token_id && (sym = find_symbol_by_token_id(token_id))
|
22
|
+
if replace
|
23
|
+
sym.id = id
|
24
|
+
sym.alias_name = alias_name
|
25
|
+
sym.tag = tag
|
26
|
+
end
|
27
|
+
|
28
|
+
return sym
|
29
|
+
end
|
30
|
+
|
31
|
+
if (sym = find_symbol_by_id(id))
|
32
|
+
return sym
|
33
|
+
end
|
34
|
+
|
35
|
+
@symbols = nil
|
36
|
+
term = Symbol.new(
|
37
|
+
id: id, alias_name: alias_name, number: nil, tag: tag,
|
38
|
+
term: true, token_id: token_id, nullable: false
|
39
|
+
)
|
40
|
+
@terms << term
|
41
|
+
term
|
42
|
+
end
|
43
|
+
|
44
|
+
def add_nterm(id:, alias_name: nil, tag: nil)
|
45
|
+
return if find_symbol_by_id(id)
|
46
|
+
|
47
|
+
@symbols = nil
|
48
|
+
nterm = Symbol.new(
|
49
|
+
id: id, alias_name: alias_name, number: nil, tag: tag,
|
50
|
+
term: false, token_id: nil, nullable: nil,
|
51
|
+
)
|
52
|
+
@nterms << nterm
|
53
|
+
nterm
|
54
|
+
end
|
55
|
+
|
56
|
+
def find_symbol_by_s_value(s_value)
|
57
|
+
symbols.find { |s| s.id.s_value == s_value }
|
58
|
+
end
|
59
|
+
|
60
|
+
def find_symbol_by_s_value!(s_value)
|
61
|
+
find_symbol_by_s_value(s_value) || (raise "Symbol not found: #{s_value}")
|
62
|
+
end
|
63
|
+
|
64
|
+
def find_symbol_by_id(id)
|
65
|
+
symbols.find do |s|
|
66
|
+
s.id == id || s.alias_name == id.s_value
|
67
|
+
end
|
68
|
+
end
|
69
|
+
|
70
|
+
def find_symbol_by_id!(id)
|
71
|
+
find_symbol_by_id(id) || (raise "Symbol not found: #{id}")
|
72
|
+
end
|
73
|
+
|
74
|
+
def find_symbol_by_token_id(token_id)
|
75
|
+
symbols.find {|s| s.token_id == token_id }
|
76
|
+
end
|
77
|
+
|
78
|
+
def find_symbol_by_number!(number)
|
79
|
+
sym = symbols[number]
|
80
|
+
|
81
|
+
raise "Symbol not found: #{number}" unless sym
|
82
|
+
raise "[BUG] Symbol number mismatch. #{number}, #{sym}" if sym.number != number
|
83
|
+
|
84
|
+
sym
|
85
|
+
end
|
86
|
+
|
87
|
+
def fill_symbol_number
|
88
|
+
# YYEMPTY = -2
|
89
|
+
# YYEOF = 0
|
90
|
+
# YYerror = 1
|
91
|
+
# YYUNDEF = 2
|
92
|
+
@number = 3
|
93
|
+
fill_terms_number
|
94
|
+
fill_nterms_number
|
95
|
+
end
|
96
|
+
|
97
|
+
def fill_nterm_type(types)
|
98
|
+
types.each do |type|
|
99
|
+
nterm = find_nterm_by_id!(type.id)
|
100
|
+
nterm.tag = type.tag
|
101
|
+
end
|
102
|
+
end
|
103
|
+
|
104
|
+
def fill_printer(printers)
|
105
|
+
symbols.each do |sym|
|
106
|
+
printers.each do |printer|
|
107
|
+
printer.ident_or_tags.each do |ident_or_tag|
|
108
|
+
case ident_or_tag
|
109
|
+
when Lrama::Lexer::Token::Ident
|
110
|
+
sym.printer = printer if sym.id == ident_or_tag
|
111
|
+
when Lrama::Lexer::Token::Tag
|
112
|
+
sym.printer = printer if sym.tag == ident_or_tag
|
113
|
+
else
|
114
|
+
raise "Unknown token type. #{printer}"
|
115
|
+
end
|
116
|
+
end
|
117
|
+
end
|
118
|
+
end
|
119
|
+
end
|
120
|
+
|
121
|
+
def fill_error_token(error_tokens)
|
122
|
+
symbols.each do |sym|
|
123
|
+
error_tokens.each do |token|
|
124
|
+
token.ident_or_tags.each do |ident_or_tag|
|
125
|
+
case ident_or_tag
|
126
|
+
when Lrama::Lexer::Token::Ident
|
127
|
+
sym.error_token = token if sym.id == ident_or_tag
|
128
|
+
when Lrama::Lexer::Token::Tag
|
129
|
+
sym.error_token = token if sym.tag == ident_or_tag
|
130
|
+
else
|
131
|
+
raise "Unknown token type. #{token}"
|
132
|
+
end
|
133
|
+
end
|
134
|
+
end
|
135
|
+
end
|
136
|
+
end
|
137
|
+
|
138
|
+
def token_to_symbol(token)
|
139
|
+
case token
|
140
|
+
when Lrama::Lexer::Token
|
141
|
+
find_symbol_by_id!(token)
|
142
|
+
else
|
143
|
+
raise "Unknown class: #{token}"
|
144
|
+
end
|
145
|
+
end
|
146
|
+
|
147
|
+
def validate!
|
148
|
+
validate_number_uniqueness!
|
149
|
+
validate_alias_name_uniqueness!
|
150
|
+
end
|
151
|
+
|
152
|
+
private
|
153
|
+
|
154
|
+
def find_nterm_by_id!(id)
|
155
|
+
@nterms.find do |s|
|
156
|
+
s.id == id
|
157
|
+
end || (raise "Symbol not found: #{id}")
|
158
|
+
end
|
159
|
+
|
160
|
+
def fill_terms_number
|
161
|
+
# Character literal in grammar file has
|
162
|
+
# token id corresponding to ASCII code by default,
|
163
|
+
# so start token_id from 256.
|
164
|
+
token_id = 256
|
165
|
+
|
166
|
+
@terms.each do |sym|
|
167
|
+
while used_numbers[@number] do
|
168
|
+
@number += 1
|
169
|
+
end
|
170
|
+
|
171
|
+
if sym.number.nil?
|
172
|
+
sym.number = @number
|
173
|
+
used_numbers[@number] = true
|
174
|
+
@number += 1
|
175
|
+
end
|
176
|
+
|
177
|
+
# If id is Token::Char, it uses ASCII code
|
178
|
+
if sym.token_id.nil?
|
179
|
+
if sym.id.is_a?(Lrama::Lexer::Token::Char)
|
180
|
+
# Ignore ' on the both sides
|
181
|
+
case sym.id.s_value[1..-2]
|
182
|
+
when "\\b"
|
183
|
+
sym.token_id = 8
|
184
|
+
when "\\f"
|
185
|
+
sym.token_id = 12
|
186
|
+
when "\\n"
|
187
|
+
sym.token_id = 10
|
188
|
+
when "\\r"
|
189
|
+
sym.token_id = 13
|
190
|
+
when "\\t"
|
191
|
+
sym.token_id = 9
|
192
|
+
when "\\v"
|
193
|
+
sym.token_id = 11
|
194
|
+
when "\""
|
195
|
+
sym.token_id = 34
|
196
|
+
when "'"
|
197
|
+
sym.token_id = 39
|
198
|
+
when "\\\\"
|
199
|
+
sym.token_id = 92
|
200
|
+
when /\A\\(\d+)\z/
|
201
|
+
unless (id = Integer($1, 8)).nil?
|
202
|
+
sym.token_id = id
|
203
|
+
else
|
204
|
+
raise "Unknown Char s_value #{sym}"
|
205
|
+
end
|
206
|
+
when /\A(.)\z/
|
207
|
+
unless (id = $1&.bytes&.first).nil?
|
208
|
+
sym.token_id = id
|
209
|
+
else
|
210
|
+
raise "Unknown Char s_value #{sym}"
|
211
|
+
end
|
212
|
+
else
|
213
|
+
raise "Unknown Char s_value #{sym}"
|
214
|
+
end
|
215
|
+
else
|
216
|
+
sym.token_id = token_id
|
217
|
+
token_id += 1
|
218
|
+
end
|
219
|
+
end
|
220
|
+
end
|
221
|
+
end
|
222
|
+
|
223
|
+
def fill_nterms_number
|
224
|
+
token_id = 0
|
225
|
+
|
226
|
+
@nterms.each do |sym|
|
227
|
+
while used_numbers[@number] do
|
228
|
+
@number += 1
|
229
|
+
end
|
230
|
+
|
231
|
+
if sym.number.nil?
|
232
|
+
sym.number = @number
|
233
|
+
used_numbers[@number] = true
|
234
|
+
@number += 1
|
235
|
+
end
|
236
|
+
|
237
|
+
if sym.token_id.nil?
|
238
|
+
sym.token_id = token_id
|
239
|
+
token_id += 1
|
240
|
+
end
|
241
|
+
end
|
242
|
+
end
|
243
|
+
|
244
|
+
def used_numbers
|
245
|
+
return @used_numbers if defined?(@used_numbers)
|
246
|
+
|
247
|
+
@used_numbers = {}
|
248
|
+
symbols.map(&:number).each do |n|
|
249
|
+
@used_numbers[n] = true
|
250
|
+
end
|
251
|
+
@used_numbers
|
252
|
+
end
|
253
|
+
|
254
|
+
def validate_number_uniqueness!
|
255
|
+
invalid = symbols.group_by(&:number).select do |number, syms|
|
256
|
+
syms.count > 1
|
257
|
+
end
|
258
|
+
|
259
|
+
return if invalid.empty?
|
260
|
+
|
261
|
+
raise "Symbol number is duplicated. #{invalid}"
|
262
|
+
end
|
263
|
+
|
264
|
+
def validate_alias_name_uniqueness!
|
265
|
+
invalid = symbols.select(&:alias_name).group_by(&:alias_name).select do |alias_name, syms|
|
266
|
+
syms.count > 1
|
267
|
+
end
|
268
|
+
|
269
|
+
return if invalid.empty?
|
270
|
+
|
271
|
+
raise "Symbol alias name is duplicated. #{invalid}"
|
272
|
+
end
|
273
|
+
end
|
274
|
+
end
|
275
|
+
end
|
276
|
+
end
|
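The numbering pass in the resolver above (`fill_terms_number` and `fill_nterms_number`) starts at 3 and skips numbers that are already taken. A minimal sketch of that loop follows. The `Sym` struct, the `fill_numbers` helper, and the sample symbols are hypothetical; token id assignment is left out.

```ruby
# Hypothetical minimal symbol: a name plus an optional preassigned number.
Sym = Struct.new(:name, :number)

# Assigns numbers to symbols that lack one, starting at `start` and skipping
# any number that is already in use by another symbol.
def fill_numbers(symbols, start: 3)
  used = symbols.map(&:number).compact.to_h { |n| [n, true] }
  number = start

  symbols.each do |sym|
    number += 1 while used[number] # skip numbers that are already taken
    next unless sym.number.nil?    # keep preassigned numbers untouched

    sym.number = number
    used[number] = true
    number += 1
  end

  symbols
end

syms = [Sym.new("EOF", 0), Sym.new("tA", nil), Sym.new("tB", 4), Sym.new("tC", nil)]
fill_numbers(syms)
p syms.map(&:number)
```

Starting at 3 leaves 0, 1, and 2 free for `YYEOF`, `YYerror`, and `YYUNDEF`, as the comments in `fill_symbol_number` note.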
data/lib/lrama/grammar/symbols.rb
ADDED
@@ -0,0 +1 @@
|
|
1
|
+
require_relative "symbols/resolver"
|