ruby_parser 3.14.0 → 3.16.0
- checksums.yaml +4 -4
- checksums.yaml.gz.sig +0 -0
- data.tar.gz.sig +0 -0
- data/History.rdoc +78 -0
- data/Manifest.txt +4 -0
- data/Rakefile +50 -13
- data/bin/ruby_parse_extract_error +8 -3
- data/compare/normalize.rb +45 -5
- data/debugging.md +172 -0
- data/lib/ruby20_parser.rb +3378 -3353
- data/lib/ruby20_parser.y +99 -64
- data/lib/ruby21_parser.rb +3438 -3411
- data/lib/ruby21_parser.y +99 -64
- data/lib/ruby22_parser.rb +3445 -3414
- data/lib/ruby22_parser.y +99 -64
- data/lib/ruby23_parser.rb +3395 -3367
- data/lib/ruby23_parser.y +99 -64
- data/lib/ruby24_parser.rb +3443 -3407
- data/lib/ruby24_parser.y +99 -64
- data/lib/ruby25_parser.rb +3442 -3407
- data/lib/ruby25_parser.y +99 -64
- data/lib/ruby26_parser.rb +3380 -3343
- data/lib/ruby26_parser.y +100 -64
- data/lib/ruby27_parser.rb +7310 -0
- data/lib/ruby27_parser.y +2677 -0
- data/lib/ruby30_parser.rb +7310 -0
- data/lib/ruby30_parser.y +2677 -0
- data/lib/ruby_lexer.rb +94 -43
- data/lib/ruby_lexer.rex +6 -7
- data/lib/ruby_lexer.rex.rb +7 -9
- data/lib/ruby_parser.rb +4 -0
- data/lib/ruby_parser.yy +122 -64
- data/lib/ruby_parser_extras.rb +52 -23
- data/test/test_ruby_lexer.rb +80 -16
- data/test/test_ruby_parser.rb +259 -4
- data/tools/munge.rb +9 -4
- metadata +57 -36
- metadata.gz.sig +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 229868c7db5e2ab8106bc746973fd349d0c656c6886a3dcf9e772c88108a5e62
|
4
|
+
data.tar.gz: 583b39eb3c6834e9de3567bae2300bfdfc96101f5c5fe73f483db63229e73282
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: ce7d8b3e670dc37e9c7c215f47fd1ec8642f6ddab8648f6e502f507295be01ba976914c1dd6c7a727cb7075085e42dc7f252a08912337d0e43479bb664ec2e65
|
7
|
+
data.tar.gz: a59038ca6c27c24cfb4161ae2366df61cdc4e6c4e77f8201a463e4c21bc676517f80e6d9a9798113d60d497b216eed66796fc2fabf1ea4d9de0e15d0eaa6c5a9
|
checksums.yaml.gz.sig
CHANGED
Binary file
|
data.tar.gz.sig
CHANGED
Binary file
|
data/History.rdoc
CHANGED
@@ -1,3 +1,81 @@
|
|
1
|
+
=== 3.16.0 / 2021-05-15
|
2
|
+
|
3
|
+
* 1 major enhancement:
|
4
|
+
|
5
|
+
* Added tentative 3.0 support.
|
6
|
+
|
7
|
+
* 3 minor enhancements:
|
8
|
+
|
9
|
+
* Added lexing for "beginless range" (bdots).
|
10
|
+
* Added parsing for bdots.
|
11
|
+
* Updated rake compare task to download xz files, bumped versions, etc
|
12
|
+
|
13
|
+
* 4 bug fixes:
|
14
|
+
|
15
|
+
* Bump rake dependency to >= 10, < 15. (presidentbeef)
|
16
|
+
* Bump sexp_processor dependency to 4.15.1+. (pravi)
|
17
|
+
* Fixed minor state mismatch at the end of parsing to make diffing a little cleaner.
|
18
|
+
* Fixed normalizer to deal with new bison token syntax
|
19
|
+
|
20
|
+
=== 3.15.1 / 2021-01-10
|
21
|
+
|
22
|
+
* 1 bug fix:
|
23
|
+
|
24
|
+
* Bumped ruby version to include < 4 (trunk).
|
25
|
+
|
26
|
+
=== 3.15.0 / 2020-08-31
|
27
|
+
|
28
|
+
* 1 major enhancement:
|
29
|
+
|
30
|
+
* Added tentative 2.7 support.
|
31
|
+
|
32
|
+
* 1 minor enhancement:
|
33
|
+
|
34
|
+
* Improved ruby_parse_extract_error's handling of moving slow files out.
|
35
|
+
|
36
|
+
* 22 bug fixes:
|
37
|
+
|
38
|
+
* Bumped ruby version to include 3.0 (trunk).
|
39
|
+
* Fix an error related to empty ensure bodies. (presidentbeef)
|
40
|
+
* Fix handling of bad magic encoding comment.
|
41
|
+
* Fixed SystemStackError when parsing a huoooge hash, caused by a splat arg.
|
42
|
+
* Fixed a number of errors parsing do blocks in strange edge cases.
|
43
|
+
* Fixed a string backslash lexing bug when the string is an invalid encoding. (nijikon, gmcgibbon)
|
44
|
+
* Fixed bug assigning line number to some arg nodes.
|
45
|
+
* Fixed bug concatinating string literals with differing encodings.
|
46
|
+
* Fixed bug lexing heredoc w/ nasty mix of \r\n and \n.
|
47
|
+
* Fixed bug lexing multiple codepoints in \u{0000 1111 2222} forms.
|
48
|
+
* Fixed bug setting line numbers in empty xstrings in some contexts.
|
49
|
+
* Fixed edge case on call w/ begin + do block as an arg.
|
50
|
+
* Fixed handling of UTF BOM.
|
51
|
+
* Fixed handling of lexer state across string interpolation braces.
|
52
|
+
* Fixed infinite loop when lexing backslash+cr+newline (aka dos-files)
|
53
|
+
* Fixed lambda + do block edge case.
|
54
|
+
* Fixed lexing of some ?\M... and ?\C... edge cases.
|
55
|
+
* Fixed more do/brace block edge case failures.
|
56
|
+
* Fixed parsing bug where splat was used in the middle of a list.
|
57
|
+
* Fixed parsing of interpolation in heredoc-like strings. (presidentbeef)
|
58
|
+
* Fixed parsing some esoteric edge cases in op_asgn.
|
59
|
+
* Fixed unicode processing in ident chars so now they better mix.
|
60
|
+
|
61
|
+
=== 3.14.2 / 2020-02-06
|
62
|
+
|
63
|
+
* 1 minor enhancement:
|
64
|
+
|
65
|
+
* Cleaned up call_args and removed arg_blk_pass from ruby_parser_extras.rb! Yay!
|
66
|
+
|
67
|
+
=== 3.14.1 / 2019-10-29
|
68
|
+
|
69
|
+
* 1 minor enhancement:
|
70
|
+
|
71
|
+
* Declared that ruby_parser supports ruby 2.2 and up.
|
72
|
+
|
73
|
+
* 3 bug fixes:
|
74
|
+
|
75
|
+
* Fixed a problem with %W with a null-byte terminator. (wtf?) (spohlenz)
|
76
|
+
* Fixed line numbering for command (eg methods without parentheses) arguments. (mvz)
|
77
|
+
* Fixed lineno on new dxstrs. (presidentbeef)
|
78
|
+
|
1
79
|
=== 3.14.0 / 2019-09-24
|
2
80
|
|
3
81
|
* 8 minor enhancements:
|
data/Manifest.txt
CHANGED
data/Rakefile
CHANGED
@@ -8,11 +8,12 @@ Hoe.plugin :racc
|
|
8
8
|
Hoe.plugin :isolate
|
9
9
|
Hoe.plugin :rdoc
|
10
10
|
|
11
|
+
Hoe.add_include_dirs "lib"
|
11
12
|
Hoe.add_include_dirs "../../sexp_processor/dev/lib"
|
12
13
|
Hoe.add_include_dirs "../../minitest/dev/lib"
|
13
14
|
Hoe.add_include_dirs "../../oedipus_lex/dev/lib"
|
14
15
|
|
15
|
-
V2 = %w[20 21 22 23 24 25 26]
|
16
|
+
V2 = %w[20 21 22 23 24 25 26 27 30]
|
16
17
|
V2.replace [V2.last] if ENV["FAST"] # HACK
|
17
18
|
|
18
19
|
Hoe.spec "ruby_parser" do
|
@@ -20,10 +21,19 @@ Hoe.spec "ruby_parser" do
|
|
20
21
|
|
21
22
|
license "MIT"
|
22
23
|
|
23
|
-
dependency "sexp_processor", "~> 4.
|
24
|
-
dependency "rake", "<
|
24
|
+
dependency "sexp_processor", ["~> 4.15", ">= 4.15.1"]
|
25
|
+
dependency "rake", [">= 10", "< 15"], :developer
|
25
26
|
dependency "oedipus_lex", "~> 2.5", :developer
|
26
27
|
|
28
|
+
# NOTE: Ryan!!! Stop trying to fix this dependency! Isolate just
|
29
|
+
# can't handle having a faux-gem half-installed! Stop! Just `gem
|
30
|
+
# install racc` and move on. Revisit this ONLY once racc-compiler
|
31
|
+
# gets split out.
|
32
|
+
|
33
|
+
dependency "racc", "~> 1.5", :developer
|
34
|
+
|
35
|
+
require_ruby_version [">= 2.1", "< 4"]
|
36
|
+
|
27
37
|
if plugin? :perforce then # generated files
|
28
38
|
V2.each do |n|
|
29
39
|
self.perforce_ignore << "lib/ruby#{n}_parser.rb"
|
@@ -54,6 +64,8 @@ end
|
|
54
64
|
|
55
65
|
file "lib/ruby_lexer.rex.rb" => "lib/ruby_lexer.rex"
|
56
66
|
|
67
|
+
task :generate => [:lexer, :parser]
|
68
|
+
|
57
69
|
task :clean do
|
58
70
|
rm_rf(Dir["**/*~"] +
|
59
71
|
Dir["diff.diff"] + # not all diffs. bit me too many times
|
@@ -87,7 +99,7 @@ end
|
|
87
99
|
|
88
100
|
def dl v
|
89
101
|
dir = v[/^\d+\.\d+/]
|
90
|
-
url = "https://cache.ruby-lang.org/pub/ruby/#{dir}/ruby-#{v}.tar.
|
102
|
+
url = "https://cache.ruby-lang.org/pub/ruby/#{dir}/ruby-#{v}.tar.xz"
|
91
103
|
path = File.basename url
|
92
104
|
unless File.exist? path then
|
93
105
|
system "curl -O #{url}"
|
@@ -99,7 +111,7 @@ def ruby_parse version
|
|
99
111
|
rp_txt = "rp#{v}.txt"
|
100
112
|
mri_txt = "mri#{v}.txt"
|
101
113
|
parse_y = "parse#{v}.y"
|
102
|
-
tarball = "ruby-#{version}.tar.
|
114
|
+
tarball = "ruby-#{version}.tar.xz"
|
103
115
|
ruby_dir = "ruby-#{version}"
|
104
116
|
diff = "diff#{v}.diff"
|
105
117
|
rp_out = "lib/ruby#{v}_parser.output"
|
@@ -119,23 +131,40 @@ def ruby_parse version
|
|
119
131
|
end
|
120
132
|
end
|
121
133
|
|
134
|
+
desc "fetch all tarballs"
|
135
|
+
task :fetch => c_tarball
|
136
|
+
|
122
137
|
file c_parse_y => c_tarball do
|
123
138
|
in_compare do
|
124
|
-
|
139
|
+
extract_glob = case version
|
140
|
+
when /2\.7|3\.0/
|
141
|
+
"{id.h,parse.y,tool/{id2token.rb,lib/vpath.rb}}"
|
142
|
+
else
|
143
|
+
"{id.h,parse.y,tool/{id2token.rb,vpath.rb}}"
|
144
|
+
end
|
145
|
+
system "tar Jxf #{tarball} #{ruby_dir}/#{extract_glob}"
|
146
|
+
|
125
147
|
Dir.chdir ruby_dir do
|
126
148
|
if File.exist? "tool/id2token.rb" then
|
127
149
|
sh "ruby tool/id2token.rb --path-separator=.:./ id.h parse.y | expand > ../#{parse_y}"
|
128
150
|
else
|
129
151
|
sh "expand parse.y > ../#{parse_y}"
|
130
152
|
end
|
153
|
+
|
154
|
+
ruby "-pi", "-e", 'gsub(/^%define\s+api\.pure/, "%pure-parser")', "../#{parse_y}"
|
131
155
|
end
|
132
156
|
sh "rm -rf #{ruby_dir}"
|
133
157
|
end
|
134
158
|
end
|
135
159
|
|
160
|
+
bison = Dir["/opt/homebrew/opt/bison/bin/bison",
|
161
|
+
"/usr/local/opt/bison/bin/bison",
|
162
|
+
`which bison`.chomp,
|
163
|
+
].first
|
164
|
+
|
136
165
|
file c_mri_txt => [c_parse_y, normalize] do
|
137
166
|
in_compare do
|
138
|
-
sh "bison -r all #{parse_y}"
|
167
|
+
sh "#{bison} -r all #{parse_y}"
|
139
168
|
sh "./normalize.rb parse#{v}.output > #{mri_txt}"
|
140
169
|
rm ["parse#{v}.output", "parse#{v}.tab.c"]
|
141
170
|
end
|
@@ -180,16 +209,18 @@ ruby_parse "2.0.0-p648"
|
|
180
209
|
ruby_parse "2.1.9"
|
181
210
|
ruby_parse "2.2.9"
|
182
211
|
ruby_parse "2.3.8"
|
183
|
-
ruby_parse "2.4.
|
184
|
-
ruby_parse "2.5.
|
185
|
-
ruby_parse "2.6.
|
212
|
+
ruby_parse "2.4.10"
|
213
|
+
ruby_parse "2.5.9"
|
214
|
+
ruby_parse "2.6.7"
|
215
|
+
ruby_parse "2.7.3"
|
216
|
+
ruby_parse "3.0.1"
|
186
217
|
|
187
218
|
task :debug => :isolate do
|
188
219
|
ENV["V"] ||= V2.last
|
189
220
|
Rake.application[:parser].invoke # this way we can have DEBUG set
|
190
221
|
Rake.application[:lexer].invoke # this way we can have DEBUG set
|
191
222
|
|
192
|
-
|
223
|
+
$:.unshift "lib"
|
193
224
|
require "ruby_parser"
|
194
225
|
require "pp"
|
195
226
|
|
@@ -212,8 +243,9 @@ task :debug => :isolate do
|
|
212
243
|
|
213
244
|
begin
|
214
245
|
pp parser.process(ruby, file, time)
|
215
|
-
rescue Racc::ParseError => e
|
246
|
+
rescue ArgumentError, Racc::ParseError => e
|
216
247
|
p e
|
248
|
+
puts e.backtrace.join "\n "
|
217
249
|
ss = parser.lexer.ss
|
218
250
|
src = ss.string
|
219
251
|
lines = src[0..ss.pos].split(/\n/)
|
@@ -230,12 +262,17 @@ task :debug3 do
|
|
230
262
|
|
231
263
|
ENV.delete "V"
|
232
264
|
|
265
|
+
sh "ruby -v"
|
233
266
|
sh "ruby -y #{file} 2>&1 | #{munge} > tmp/ruby"
|
234
267
|
sh "./tools/ripper.rb -d #{file} | #{munge} > tmp/rip"
|
235
|
-
sh "rake debug F=#{file} DEBUG=1
|
268
|
+
sh "rake debug F=#{file} DEBUG=1 2>&1 | #{munge} > tmp/rp"
|
236
269
|
sh "diff -U 999 -d tmp/{rip,rp}"
|
237
270
|
end
|
238
271
|
|
272
|
+
task :cmp do
|
273
|
+
sh %(emacsclient --eval '(ediff-files "tmp/ruby" "tmp/rp")')
|
274
|
+
end
|
275
|
+
|
239
276
|
task :cmp3 do
|
240
277
|
sh %(emacsclient --eval '(ediff-files3 "tmp/ruby" "tmp/rip" "tmp/rp")')
|
241
278
|
end
|
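The `bison = Dir[...].first` lines added to the Rakefile above lean on `Dir[]` returning only paths that actually exist, in the order the patterns were given, so the first hit wins. A minimal sketch of that idiom, using a hypothetical `first_existing` helper (not a name from the Rakefile) and temp files in place of real bison installs:

```ruby
require "tmpdir"

# Hypothetical helper (not in the Rakefile): first candidate path that
# exists on disk, or nil when none of them do.
def first_existing *paths
  Dir[*paths].first
end

Dir.mktmpdir do |dir|
  homebrew = File.join(dir, "homebrew", "bison") # never created
  local    = File.join(dir, "local-bison")
  File.write local, ""

  p first_existing(homebrew, local) == local
  p first_existing(homebrew)
end
```

When no candidate exists the result is `nil`, so a caller that interpolates it into a shell command may want a guard first.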
data/bin/ruby_parse_extract_error
CHANGED
@@ -104,9 +104,14 @@ rescue Timeout::Error
|
|
104
104
|
warn "TIMEOUT parsing #{file}. Skipping."
|
105
105
|
|
106
106
|
if $m then
|
107
|
-
|
108
|
-
|
109
|
-
|
107
|
+
base_dir, *rest = file.split("/")
|
108
|
+
base_dir.sub!(/\.slow\.?.*/, "")
|
109
|
+
base_dir += ".slow.#{time}"
|
110
|
+
|
111
|
+
new_file = File.join(base_dir, *rest)
|
112
|
+
|
113
|
+
FileUtils.mkdir_p File.dirname(new_file)
|
114
|
+
FileUtils.move file, new_file, verbose:true
|
110
115
|
elsif $t then
|
111
116
|
File.unlink file
|
112
117
|
end
|
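The added lines in this hunk rebuild the destination path for a file that timed out: the first path component loses any existing `.slow...` suffix and gains `.slow.<time>`. A minimal sketch of just that path computation, pulled out into a hypothetical `slow_path` helper (not a name from the gem) so it can be tried without touching the filesystem; the sample paths are made up:

```ruby
# Hypothetical helper, not part of ruby_parser: mirrors the path
# arithmetic in the hunk above without moving any files.
def slow_path file, time
  base_dir, *rest = file.split("/")

  # Strip a previous ".slow" / ".slow.N" suffix so reruns don't stack them.
  base_dir = base_dir.sub(/\.slow\.?.*/, "")
  base_dir += ".slow.#{time}"

  File.join(base_dir, *rest)
end

puts slow_path("gauntlet/foo/bar.rb", 10)
puts slow_path("gauntlet.slow.10/foo/bar.rb", 30)
```

Only the first path component is renamed, so the rest of the directory layout is preserved under the new `.slow.<time>` root.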
data/compare/normalize.rb
CHANGED
@@ -8,6 +8,10 @@ order = []
|
|
8
8
|
|
9
9
|
def munge s
|
10
10
|
renames = [
|
11
|
+
# unquote... wtf?
|
12
|
+
/`(.+?)'/, proc { $1 },
|
13
|
+
/"'(.+?)'"/, proc { "\"#{$1}\"" },
|
14
|
+
|
11
15
|
"'='", "tEQL",
|
12
16
|
"'!'", "tBANG",
|
13
17
|
"'%'", "tPERCENT",
|
@@ -100,6 +104,43 @@ def munge s
|
|
100
104
|
|
101
105
|
"kVARIABLE", "keyword_variable", # ugh: this is a rule name
|
102
106
|
|
107
|
+
# 2.7 changes:
|
108
|
+
|
109
|
+
'"global variable"', "tGVAR",
|
110
|
+
'"operator-assignment"', "tOP_ASGN",
|
111
|
+
'"back reference"', "tBACK_REF",
|
112
|
+
'"numbered reference"', "tNTH_REF",
|
113
|
+
'"local variable or method"', "tIDENTIFIER",
|
114
|
+
'"constant"', "tCONSTANT",
|
115
|
+
|
116
|
+
'"(.."', "tBDOT2",
|
117
|
+
'"(..."', "tBDOT3",
|
118
|
+
'"char literal"', "tCHAR",
|
119
|
+
'"literal content"', "tSTRING_CONTENT",
|
120
|
+
'"string literal"', "tSTRING_BEG",
|
121
|
+
'"symbol literal"', "tSYMBEG",
|
122
|
+
'"backtick literal"', "tXSTRING_BEG",
|
123
|
+
'"regexp literal"', "tREGEXP_BEG",
|
124
|
+
'"word list"', "tWORDS_BEG",
|
125
|
+
'"verbatim word list"', "tQWORDS_BEG",
|
126
|
+
'"symbol list"', "tSYMBOLS_BEG",
|
127
|
+
'"verbatim symbol list"', "tQSYMBOLS_BEG",
|
128
|
+
|
129
|
+
'"float literal"', "tFLOAT",
|
130
|
+
'"imaginary literal"', "tIMAGINARY",
|
131
|
+
'"integer literal"', "tINTEGER",
|
132
|
+
'"rational literal"', "tRATIONAL",
|
133
|
+
|
134
|
+
'"instance variable"', "tIVAR",
|
135
|
+
'"class variable"', "tCVAR",
|
136
|
+
'"terminator"', "tSTRING_END", # TODO: switch this?
|
137
|
+
'"method"', "tFID",
|
138
|
+
'"}"', "tSTRING_DEND",
|
139
|
+
|
140
|
+
'"do for block"', "kDO_BLOCK",
|
141
|
+
'"do for condition"', "kDO_COND",
|
142
|
+
'"do for lambda"', "kDO_LAMBDA",
|
143
|
+
|
103
144
|
# UGH
|
104
145
|
"k_LINE__", "k__LINE__",
|
105
146
|
"k_FILE__", "k__FILE__",
|
@@ -107,13 +148,12 @@ def munge s
|
|
107
148
|
|
108
149
|
'"defined?"', "kDEFINED",
|
109
150
|
|
110
|
-
|
111
151
|
'"do (for condition)"', "kDO_COND",
|
112
152
|
'"do (for lambda)"', "kDO_LAMBDA",
|
113
153
|
'"do (for block)"', "kDO_BLOCK",
|
114
154
|
|
115
|
-
/\"(\w+) \(modifier\)
|
116
|
-
/\"(\w+)\"/,
|
155
|
+
/\"(\w+) \(?modifier\)?\"/, proc { |x| "k#{$1.upcase}_MOD" },
|
156
|
+
/\"(\w+)\"/, proc { |x| "k#{$1.upcase}" },
|
117
157
|
|
118
158
|
/@(\d+)(\s+|$)/, "",
|
119
159
|
]
|
@@ -134,7 +174,7 @@ ARGF.each_line do |line|
|
|
134
174
|
|
135
175
|
case line.strip
|
136
176
|
when /^$/ then
|
137
|
-
when /^(\d+) (
|
177
|
+
when /^(\d+) (\$?[@\w]+): (.*)/ then # yacc
|
138
178
|
rule = $2
|
139
179
|
order << rule unless rules.has_key? rule
|
140
180
|
rules[rule] << munge($3)
|
@@ -159,7 +199,7 @@ ARGF.each_line do |line|
|
|
159
199
|
when /^\cL/ then # byacc
|
160
200
|
break
|
161
201
|
else
|
162
|
-
warn "unparsed: #{$.}: #{line.
|
202
|
+
warn "unparsed: #{$.}: #{line.strip.inspect}"
|
163
203
|
end
|
164
204
|
end
|
165
205
|
|
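The `renames` table in `munge` is a flat list of pattern/replacement pairs, where a replacement is either a string or a proc. A minimal sketch of how such a table can be applied, assuming pairs are walked in order with `each_slice(2)`; the table below is a trimmed, hypothetical subset, `apply_renames` is not a name from the script, and the callables take the matched text as an argument rather than reading `$1`:

```ruby
# Hypothetical, trimmed-down table in the same flat
# [pattern, replacement, pattern, replacement, ...] layout as the diff.
# Replacements are either a String or a callable taking the matched text.
RENAMES = [
  '"global variable"',    "tGVAR",
  '"(.."',                "tBDOT2",
  /"\w+ \(?modifier\)?"/, ->(m) { "k#{m[/\w+/].upcase}_MOD" },
  /"\w+"/,                ->(m) { "k#{m[/\w+/].upcase}" },
]

# Apply every pair in order, feeding each result into the next pair.
def apply_renames line
  RENAMES.each_slice(2).reduce(line) do |s, (pattern, replacement)|
    if replacement.respond_to? :call
      s.gsub(pattern) { |m| replacement.call(m) }
    else
      s.gsub(pattern, replacement)
    end
  end
end

puts apply_renames('arg "(.." arg')
puts apply_renames('stmt "if (modifier)" expr')
puts apply_renames('stmt "if modifier" expr')
puts apply_renames('"while" expr')
```

Order matters here: the modifier pattern has to run before the generic quoted-word pattern, and the optional parentheses let one pattern cover both the older `"if (modifier)"` spelling and the parenthesis-free one.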
data/debugging.md
CHANGED
@@ -1,5 +1,44 @@
|
|
1
1
|
# Quick Notes to Help with Debugging
|
2
2
|
|
3
|
+
## Reducing
|
4
|
+
|
5
|
+
One of the most important steps is reducing the code sample to a
|
6
|
+
minimal reproduction. For example, one thing I'm debugging right now
|
7
|
+
was reported as:
|
8
|
+
|
9
|
+
```ruby
|
10
|
+
a, b, c, d, e, f, g, h, i, j = 1, *[p1, p2, p3], *[p1, p2, p3], *[p4, p5, p6]
|
11
|
+
```
|
12
|
+
|
13
|
+
This original sample has 10 items on the left-hand-side (LHS) and 1 +
|
14
|
+
3 groups of 3 (calls) on the RHS + 3 arrays + 3 splats. That's a lot.
|
15
|
+
|
16
|
+
It's already been reported (perhaps incorrectly) that this has to do
|
17
|
+
with multiple splats on the RHS, so let's focus on that. At a minimum
|
18
|
+
the code can be reduced to 2 splats on the RHS and some
|
19
|
+
experimentation shows that it needs a non-splat item to fail:
|
20
|
+
|
21
|
+
```
|
22
|
+
_, _, _ = 1, *[2], *[3]
|
23
|
+
```
|
24
|
+
|
25
|
+
and some intuition further removed the arrays:
|
26
|
+
|
27
|
+
```
|
28
|
+
_, _, _ = 1, *2, *3
|
29
|
+
```
|
30
|
+
|
31
|
+
the difference is huge and will make a ton of difference when
|
32
|
+
debugging.
|
33
|
+
|
34
|
+
## Getting something to compare
|
35
|
+
|
36
|
+
```
|
37
|
+
% rake debug3 F=file.rb
|
38
|
+
```
|
39
|
+
|
40
|
+
TODO
|
41
|
+
|
3
42
|
## Comparing against ruby / ripper:
|
4
43
|
|
5
44
|
```
|
@@ -16,3 +55,136 @@ From there? Good luck. I'm currently trying to backtrack from rule
|
|
16
55
|
reductions to state change differences. I'd like to figure out a way
|
17
56
|
to go from this sort of diff to a reasonable test that checks state
|
18
57
|
changes but I don't have that set up at this point.
|
58
|
+
|
59
|
+
## Adding New Grammar Productions
|
60
|
+
|
61
|
+
Ruby adds stuff to the parser ALL THE TIME. It's actually hard to keep
|
62
|
+
up with, but I've added some tools and shown what a typical workflow
|
63
|
+
looks like. Let's say you want to add ruby 2.7's "beginless range" (eg
|
64
|
+
`..42`).
|
65
|
+
|
66
|
+
Whenever there's a language feature missing, I start with comparing
|
67
|
+
the parse trees between MRI and RP:
|
68
|
+
|
69
|
+
### Structural Comparing
|
70
|
+
|
71
|
+
There's a bunch of rake tasks `compare27`, `compare26`, etc that try
|
72
|
+
to normalize and diff MRI's parse.y parse tree (just the structure of
|
73
|
+
the tree in yacc) to ruby\_parser's parse tree (racc). It's the first
|
74
|
+
thing I do when I'm adding a new version. Stub out all the version
|
75
|
+
differences, and then start to diff the structure and move
|
76
|
+
ruby\_parser towards the new changes.
|
77
|
+
|
78
|
+
Some differences are just gonna be there... but here's an example of a
|
79
|
+
real diff between MRI 2.7 and ruby_parser as of today:
|
80
|
+
|
81
|
+
```diff
|
82
|
+
arg tDOT3 arg
|
83
|
+
arg tDOT2
|
84
|
+
arg tDOT3
|
85
|
+
- tBDOT2 arg
|
86
|
+
- tBDOT3 arg
|
87
|
+
arg tPLUS arg
|
88
|
+
arg tMINUS arg
|
89
|
+
arg tSTAR2 arg
|
90
|
+
```
|
91
|
+
|
92
|
+
This is a new language feature that ruby_parser doesn't handle yet.
|
93
|
+
It's in MRI (the left hand side of the diff) but not ruby\_parser (the
|
94
|
+
right hand side) so it is a `-` or missing line.
|
95
|
+
|
96
|
+
Some other diffs will have both `+` and `-` lines. That usually
|
97
|
+
happens when MRI has been refactoring the grammar. Sometimes I choose
|
98
|
+
to adapt those refactorings and sometimes it starts to get too
|
99
|
+
difficult to maintain multiple versions of ruby parsing in a single
|
100
|
+
file.
|
101
|
+
|
102
|
+
But! This structural comparing is always a place you should look when
|
103
|
+
ruby_parser is failing to parse something. Maybe it just hasn't been
|
104
|
+
implemented yet and the easiest place to look is the diff.
|
105
|
+
|
106
|
+
### Starting Test First
|
107
|
+
|
108
|
+
The next thing I do is to add a parser test to cover that feature. I
|
109
|
+
usually start with the parser and work backwards towards the lexer as
|
110
|
+
needed, as I find it structures things properly and keeps things goal
|
111
|
+
oriented.
|
112
|
+
|
113
|
+
So, make a new parser test, usually in the versioned section of the
|
114
|
+
parser tests.
|
115
|
+
|
116
|
+
```
|
117
|
+
def test_beginless2
|
118
|
+
rb = "..10\n; ..a\n; c"
|
119
|
+
pt = s(:block,
|
120
|
+
s(:dot2, nil, s(:lit, 0).line(1)).line(1),
|
121
|
+
s(:dot2, nil, s(:call, nil, :a).line(2)).line(2),
|
122
|
+
s(:call, nil, :c).line(3)).line(1)
|
123
|
+
|
124
|
+
assert_parse_line rb, pt, 1
|
125
|
+
|
126
|
+
flunk "not done yet"
|
127
|
+
end
|
128
|
+
```
|
129
|
+
|
130
|
+
(In this case copied and modified the tests for open ranges from 2.6)
|
131
|
+
and run it to get my first error:
|
132
|
+
|
133
|
+
```
|
134
|
+
% rake N=/beginless/
|
135
|
+
|
136
|
+
...
|
137
|
+
|
138
|
+
E
|
139
|
+
|
140
|
+
Finished in 0.021814s, 45.8421 runs/s, 0.0000 assertions/s.
|
141
|
+
|
142
|
+
1) Error:
|
143
|
+
TestRubyParserV27#test_whatevs:
|
144
|
+
Racc::ParseError: (string):1 :: parse error on value ".." (tDOT2)
|
145
|
+
GEMS/2.7.0/gems/racc-1.5.0/lib/racc/parser.rb:538:in `on_error'
|
146
|
+
WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1304:in `on_error'
|
147
|
+
(eval):3:in `_racc_do_parse_c'
|
148
|
+
(eval):3:in `do_parse'
|
149
|
+
WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1329:in `block in process'
|
150
|
+
RUBY/lib/ruby/2.7.0/timeout.rb:95:in `block in timeout'
|
151
|
+
RUBY/lib/ruby/2.7.0/timeout.rb:33:in `block in catch'
|
152
|
+
RUBY/lib/ruby/2.7.0/timeout.rb:33:in `catch'
|
153
|
+
RUBY/lib/ruby/2.7.0/timeout.rb:33:in `catch'
|
154
|
+
RUBY/lib/ruby/2.7.0/timeout.rb:110:in `timeout'
|
155
|
+
WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1317:in `process'
|
156
|
+
WORK/ruby_parser/dev/test/test_ruby_parser.rb:4198:in `assert_parse'
|
157
|
+
WORK/ruby_parser/dev/test/test_ruby_parser.rb:4221:in `assert_parse_line'
|
158
|
+
WORK/ruby_parser/dev/test/test_ruby_parser.rb:4451:in `test_whatevs'
|
159
|
+
```
|
160
|
+
|
161
|
+
For starters, we know the missing production is for `tBDOT2 arg`. It
|
162
|
+
is currently blowing up because it is getting `tDOT2` and simply
|
163
|
+
doesn't know what to do with it, so it raises the error. As the diff
|
164
|
+
suggests, that's the wrong token to begin with, so it is probably time
|
165
|
+
to also create a lexer test:
|
166
|
+
|
167
|
+
```
|
168
|
+
def test_yylex_bdot2
|
169
|
+
assert_lex3("..42",
|
170
|
+
s(:dot2, nil, s(:lit, 42)),
|
171
|
+
|
172
|
+
:tBDOT2, "..", EXPR_BEG,
|
173
|
+
:tINTEGER, "42", EXPR_NUM)
|
174
|
+
|
175
|
+
flunk "not done yet"
|
176
|
+
end
|
177
|
+
```
|
178
|
+
|
179
|
+
This one is mostly speculative at this point. It says "if we're lexing
|
180
|
+
this string, we should get this sexp if we fully parse it, and the
|
181
|
+
lexical stream should look like this"... That last bit is mostly made
|
182
|
+
up at this point. Sometimes I don't know exactly what expression state
|
183
|
+
things should be in until I start really digging in.
|
184
|
+
|
185
|
+
At this point, I have 2 failing tests that are directing me in the
|
186
|
+
right direction. It's now a matter of digging through
|
187
|
+
`compare/parse26.y` to see how the lexer differs and implementing
|
188
|
+
it...
|
189
|
+
|
190
|
+
But this is a good start to the doco for now. I'll add more later.
|
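The expected sexp in the lexer test, `s(:dot2, nil, s(:lit, 42))`, puts `nil` in the begin slot. That lines up with how Ruby itself represents a beginless range at runtime. A plain-Ruby sketch of those semantics, with no ruby_parser involved; it assumes Ruby 2.7 or newer, since older rubies reject the `..42` syntax:

```ruby
# Requires Ruby 2.7+: a range literal with no begin value.
r = (..42)

p r.begin                 # the missing start shows up as nil
p r.end
p r.cover?(-1_000)        # anything up to and including 42 is covered
p r.cover?(43)
p r == Range.new(nil, 42) # same object the explicit constructor builds
```

This is only a reminder of what the syntax means; the parser work itself is about producing the `tBDOT2` token and the matching `tBDOT2 arg` production.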