ruby_parser 3.15.0 → 3.18.0
- checksums.yaml +4 -4
- checksums.yaml.gz.sig +0 -0
- data/History.rdoc +101 -0
- data/Manifest.txt +5 -0
- data/README.rdoc +1 -0
- data/Rakefile +128 -30
- data/bin/ruby_parse_extract_error +1 -1
- data/compare/normalize.rb +8 -3
- data/debugging.md +133 -0
- data/gauntlet.md +106 -0
- data/lib/rp_extensions.rb +15 -36
- data/lib/rp_stringscanner.rb +20 -51
- data/lib/ruby20_parser.rb +3559 -3499
- data/lib/ruby20_parser.y +333 -248
- data/lib/ruby21_parser.rb +3650 -3614
- data/lib/ruby21_parser.y +328 -245
- data/lib/ruby22_parser.rb +3690 -3628
- data/lib/ruby22_parser.y +332 -247
- data/lib/ruby23_parser.rb +3629 -3573
- data/lib/ruby23_parser.y +332 -247
- data/lib/ruby24_parser.rb +3712 -3654
- data/lib/ruby24_parser.y +332 -247
- data/lib/ruby25_parser.rb +3712 -3654
- data/lib/ruby25_parser.y +332 -247
- data/lib/ruby26_parser.rb +3715 -3658
- data/lib/ruby26_parser.y +332 -246
- data/lib/ruby27_parser.rb +5009 -3722
- data/lib/ruby27_parser.y +928 -245
- data/lib/ruby30_parser.rb +8741 -0
- data/lib/ruby30_parser.y +3463 -0
- data/lib/ruby3_parser.yy +3467 -0
- data/lib/ruby_lexer.rb +273 -602
- data/lib/ruby_lexer.rex +28 -21
- data/lib/ruby_lexer.rex.rb +60 -24
- data/lib/ruby_lexer_strings.rb +638 -0
- data/lib/ruby_parser.rb +2 -0
- data/lib/ruby_parser.yy +969 -252
- data/lib/ruby_parser_extras.rb +297 -116
- data/test/test_ruby_lexer.rb +213 -129
- data/test/test_ruby_parser.rb +1288 -110
- data/tools/munge.rb +36 -8
- data/tools/ripper.rb +15 -10
- data.tar.gz.sig +0 -0
- metadata +48 -35
- metadata.gz.sig +1 -4
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 36780d9d3244dd62d13430987076d5e81ae2e536d6d2bfd259f8a612da3d94cc
+data.tar.gz: bec4b32e7f7a8d9ae8e3202f30230f351a2fedc6e2ac4e984260486dbb7529c6
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: f28d02d2b14687e365bab3a353348b93a9df993be2d1afd3f2783b5b97ca016a6ca2f834ef61ebb4a4eae3decc38e1351349679f951f901bef09c25f23d44322
+data.tar.gz: 276ecce4db1f72ed2ce0d276679e65419225a46b885d0050aa7ba6382b45033ccd24b5006a0d382f0aecdbb6c5a5fd93e3e826adeafccc3c47ee051b76772eee
checksums.yaml.gz.sig
CHANGED
Binary file
|
data/History.rdoc
CHANGED
@@ -1,3 +1,104 @@
+=== 3.18.0 / 2021-10-27
+
+Holy crap... 58 commits! 2.7 and 3.0 are feature complete. Strings
+& heredocs have been rewritten.
+
+* 9 major enhancements:
+
+* !!! Rewrote lexer (and friends) for strings, heredocs, and %*[] constructs.
+* Massive overhaul on line numbers.
+* Freeze input! Finally!!! No more modifying the input string for heredocs.
+* Overhauled RPStringScanner. Removed OLD compatibility methods!
+* Removed Sexp methods: value, to_sym, add, add_all, node_type, values.
+* value moved to sexp_processor.
+* Removed String#grep monkey-patch.
+* Removed String#lineno monkey-patch.
+* Removed string_to_pos, charpos, etc hacks for ancient ruby versions.
+* Removed unread_many... NO! NO EDITING THE INPUT STRING!
+
+* 31 minor enhancements:
+
+* 2.7/3.0: many more pattern edge cases
+* 2.7: Added `mlhs = rhs rescue expr`
+* 2.7: refactored destructured args (`|(k,v)|`) and unfactored(?!) case_body/args.
+* 3.0: excessed_comma
+* 3.0: finished most everything: endless methods, patterns, etc.
+* 3.0: refactored / added new pattern changes
+* Added RubyLexer#in_heredoc? (ie, is there old_ss ?)
+* Added RubyLexer#old_ss and old_lineno and removed much of SSStack(ish).
+* Added Symbol#end_with? when necessary
+* Added TALLY and DEBUG options for ss.getch and ss.scan
+* Added ignore_body_comments to make parser productions more clear.
+* Added support for no_kwarg (eg `def f(**nil)`).
+* Added support for no_kwarg in blocks (eg `f { |**nil| }`).
+* Augmented generated parser files to have frozen_string_literal comments and fixed tests.
+* Broke out 3.0 parser into its own to ease development.
+* Bumped dependencies on sexp_processor and oedipus_lex.
+* Clean generated 3.x files.
+* Extracted all string scanner methods to their own module.
+* Fixed some precedence decls.
+* Implemented most of pattern matching for 2.7+.
+* Improve lex_state= to report location in verbose debug mode.
+* Made it easier to debug with a particular version of ruby via rake.
+* Make sure ripper uses the same version of ruby we specified.
+* Moved all string/heredoc/etc code to ruby_lexer_strings.rb
+* Remove warning from newer bisons.
+* Sprinkled in some frozen_string_literal, but mostly helped by oedipus bump.
+* Switch to comparing against ruby binary since ripper is buggy.
+* bugs task should try both bug*.rb and bad*.rb.
+* endless methods
+* f_any_kwrest refactoring.
+* refactored defn/defs
+
+* 15 bug fixes:
+
+* Cleaned a bunch of old hacks. Initializing RubyLexer w/ Parser is cleaner now.
+* Corrected some lex_state errors in process_token_keyword.
+* Fixed ancient ruby2 change (use #lines) in ruby_parse_extract_error.
+* Fixed bug where else without rescue only raises on 2.6+
+* Fixed caller for getch and scan when DEBUG=1
+* Fixed comments in the middle of message cascades.
+* Fixed differences w/ symbol productions against ruby 2.7.
+* Fixed dsym to use string_contents production.
+* Fixed error in bdot2/3 in some edge cases. Fixed p_alt line.
+* Fixed heredoc dedenting in the presence of empty lines. (mvz)
+* Fixed some leading whitespace / comment processing
+* Fixed up how class/module/defn/defs comments were collected.
+* Overhauled ripper.rb to deal with buggy ripper w/ yydebug.
+* Removed dsym from literal.
+* Removed tUBANG lexeme but kept it distinct as a method name (eg: `def !@`).
+
+=== 3.17.0 / 2021-08-03
+
+* 1 minor enhancement:
+
+* Added support for arg forwarding (eg `def f(...); m(...); end`) (presidentbeef)
+
+=== 3.16.0 / 2021-05-15
+
+* 1 major enhancement:
+
+* Added tentative 3.0 support.
+
+* 3 minor enhancements:
+
+* Added lexing for "beginless range" (bdots).
+* Added parsing for bdots.
+* Updated rake compare task to download xz files, bumped versions, etc
+
+* 4 bug fixes:
+
+* Bump rake dependency to >= 10, < 15. (presidentbeef)
+* Bump sexp_processor dependency to 4.15.1+. (pravi)
+* Fixed minor state mismatch at the end of parsing to make diffing a little cleaner.
+* Fixed normalizer to deal with new bison token syntax
+
+=== 3.15.1 / 2021-01-10
+
+* 1 bug fix:
+
+* Bumped ruby version to include < 4 (trunk).
+
 === 3.15.0 / 2020-08-31
 
 * 1 major enhancement:
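For readers unfamiliar with the syntax named in the 3.16.0 through 3.18.0 entries of History.rdoc, the following is a minimal sketch in plain Ruby. It assumes Ruby 3.0 or newer, and every name in it (`UP_TO_42`, `target`, `forward`, `square`, `no_keywords`, `describe`, `rescued_pair`) is hypothetical, chosen only for illustration. It exercises the language features themselves and does not call ruby_parser.

```ruby
# Illustrative only: Ruby 2.7/3.0 syntax that the changelog says the parser
# now handles. Requires Ruby 3.0+. All names below are hypothetical.

# Beginless range (Ruby 2.7+), the "bdots" of the 3.16.0 entry.
UP_TO_42 = (..42)

# Argument forwarding (Ruby 2.7+), as in `def f(...); m(...); end`.
def target(*args, **opts)
  [args, opts]
end

def forward(...)
  target(...)
end

# Endless method definition (Ruby 3.0+).
def square(x) = x * x

# no_kwarg: `**nil` declares that the method accepts no keyword arguments.
def no_keywords(a, **nil)
  a
end

# Pattern matching with case/in (Ruby 2.7+).
def describe(value)
  case value
  in { name: String => name }
    "named #{name}"
  in [first, *]
    "list starting with #{first}"
  else
    "other"
  end
end

# `mlhs = rhs rescue expr` (Ruby 2.7+): the rescue modifier applies to the
# right-hand side only, so the fallback array is destructured into a and b.
def rescued_pair
  a, b = raise("boom") rescue [1, 2]
  [a, b]
end
```

Each definition maps to one changelog bullet, so a parser regression on any of them should be easy to isolate by feeding the corresponding snippet to the parser on its own.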
data/Manifest.txt
CHANGED
@@ -7,6 +7,7 @@ bin/ruby_parse
 bin/ruby_parse_extract_error
 compare/normalize.rb
 debugging.md
+gauntlet.md
 lib/.document
 lib/rp_extensions.rb
 lib/rp_stringscanner.rb
@@ -26,9 +27,13 @@ lib/ruby26_parser.rb
 lib/ruby26_parser.y
 lib/ruby27_parser.rb
 lib/ruby27_parser.y
+lib/ruby30_parser.rb
+lib/ruby30_parser.y
+lib/ruby3_parser.yy
 lib/ruby_lexer.rb
 lib/ruby_lexer.rex
 lib/ruby_lexer.rex.rb
+lib/ruby_lexer_strings.rb
 lib/ruby_parser.rb
 lib/ruby_parser.yy
 lib/ruby_parser_extras.rb
data/README.rdoc
CHANGED
@@ -32,6 +32,7 @@ Tested against 801,039 files from the latest of all rubygems (as of 2013-05):
 * 1.8 parser is at 99.9739% accuracy, 3.651 sigma
 * 1.9 parser is at 99.9940% accuracy, 4.013 sigma
 * 2.0 parser is at 99.9939% accuracy, 4.008 sigma
+* 2.6 parser is at 99.9972% accuracy, 4.191 sigma
 
 == FEATURES/PROBLEMS:
 
data/Rakefile
CHANGED
@@ -14,25 +14,37 @@ Hoe.add_include_dirs "../../minitest/dev/lib"
 Hoe.add_include_dirs "../../oedipus_lex/dev/lib"
 
 V2 = %w[20 21 22 23 24 25 26 27]
-
+V3 = %w[30]
+
+VERS = V2 + V3
+
+ENV["FAST"] = VERS.last if ENV["FAST"] && !VERS.include?(ENV["FAST"])
+VERS.replace [ENV["FAST"]] if ENV["FAST"]
 
 Hoe.spec "ruby_parser" do
 developer "Ryan Davis", "ryand-ruby@zenspider.com"
 
 license "MIT"
 
-dependency "sexp_processor", "~> 4.
-dependency "rake", "<
-dependency "oedipus_lex", "~> 2.
+dependency "sexp_processor", "~> 4.16"
+dependency "rake", [">= 10", "< 15"], :developer
+dependency "oedipus_lex", "~> 2.6", :developer
+
+# NOTE: Ryan!!! Stop trying to fix this dependency! Isolate just
+# can't handle having a faux-gem half-installed! Stop! Just `gem
+# install racc` and move on. Revisit this ONLY once racc-compiler
+# gets split out.
 
-
+dependency "racc", "~> 1.5", :developer
+
+require_ruby_version [">= 2.1", "< 4"]
 
 if plugin? :perforce then # generated files
-
+VERS.each do |n|
 self.perforce_ignore << "lib/ruby#{n}_parser.rb"
 end
 
-
+VERS.each do |n|
 self.perforce_ignore << "lib/ruby#{n}_parser.y"
 end
 
@@ -46,6 +58,23 @@ Hoe.spec "ruby_parser" do
 end
 end
 
+def maybe_add_to_top path, string
+file = File.read path
+
+return if file.start_with? string
+
+warn "Altering top of #{path}"
+tmp_path = "#{path}.tmp"
+File.open(tmp_path, "w") do |f|
+f.puts string
+f.puts
+
+f.write file
+# TODO: make this deal with encoding comments properly?
+end
+File.rename tmp_path, path
+end
+
 V2.each do |n|
 file "lib/ruby#{n}_parser.y" => "lib/ruby_parser.yy" do |t|
 cmd = 'unifdef -tk -DV=%s -UDEAD %s > %s || true' % [n, t.source, t.name]
@@ -55,8 +84,23 @@ V2.each do |n|
 file "lib/ruby#{n}_parser.rb" => "lib/ruby#{n}_parser.y"
 end
 
+V3.each do |n|
+file "lib/ruby#{n}_parser.y" => "lib/ruby3_parser.yy" do |t|
+cmd = 'unifdef -tk -DV=%s -UDEAD %s > %s || true' % [n, t.source, t.name]
+sh cmd
+end
+
+file "lib/ruby#{n}_parser.rb" => "lib/ruby#{n}_parser.y"
+end
+
 file "lib/ruby_lexer.rex.rb" => "lib/ruby_lexer.rex"
 
+task :parser do |t|
+t.prerequisite_tasks.grep(Rake::FileTask).select(&:already_invoked).each do |f|
+maybe_add_to_top f.name, "# frozen_string_literal: true"
+end
+end
+
 task :generate => [:lexer, :parser]
 
 task :clean do
@@ -65,6 +109,7 @@ task :clean do
 Dir["coverage.info"] +
 Dir["coverage"] +
 Dir["lib/ruby2*_parser.y"] +
+Dir["lib/ruby3*_parser.y"] +
 Dir["lib/*.output"])
 end
 
@@ -92,7 +137,7 @@ end
 
 def dl v
 dir = v[/^\d+\.\d+/]
-url = "https://cache.ruby-lang.org/pub/ruby/#{dir}/ruby-#{v}.tar.
+url = "https://cache.ruby-lang.org/pub/ruby/#{dir}/ruby-#{v}.tar.xz"
 path = File.basename url
 unless File.exist? path then
 system "curl -O #{url}"
@@ -104,7 +149,7 @@ def ruby_parse version
 rp_txt = "rp#{v}.txt"
 mri_txt = "mri#{v}.txt"
 parse_y = "parse#{v}.y"
-tarball = "ruby-#{version}.tar.
+tarball = "ruby-#{version}.tar.xz"
 ruby_dir = "ruby-#{version}"
 diff = "diff#{v}.diff"
 rp_out = "lib/ruby#{v}_parser.output"
@@ -124,15 +169,18 @@ def ruby_parse version
 end
 end
 
+desc "fetch all tarballs"
+task :fetch => c_tarball
+
 file c_parse_y => c_tarball do
 in_compare do
 extract_glob = case version
-when /2\.7/
+when /2\.7|3\.0/
 "{id.h,parse.y,tool/{id2token.rb,lib/vpath.rb}}"
 else
 "{id.h,parse.y,tool/{id2token.rb,vpath.rb}}"
 end
-system "tar
+system "tar Jxf #{tarball} #{ruby_dir}/#{extract_glob}"
 
 Dir.chdir ruby_dir do
 if File.exist? "tool/id2token.rb" then
@@ -141,15 +189,20 @@ def ruby_parse version
 sh "expand parse.y > ../#{parse_y}"
 end
 
-ruby "-pi", "-e", 'gsub(/^%
+ruby "-pi", "-e", 'gsub(/^%pure-parser/, "%define api.pure")', "../#{parse_y}"
 end
 sh "rm -rf #{ruby_dir}"
 end
 end
 
+bison = Dir["/opt/homebrew/opt/bison/bin/bison",
+"/usr/local/opt/bison/bin/bison",
+`which bison`.chomp,
+].first
+
 file c_mri_txt => [c_parse_y, normalize] do
 in_compare do
-sh "bison -r all #{parse_y}"
+sh "#{bison} -r all #{parse_y}"
 sh "./normalize.rb parse#{v}.output > #{mri_txt}"
 rm ["parse#{v}.output", "parse#{v}.tab.c"]
 end
@@ -190,17 +243,50 @@ def ruby_parse version
 end
 end
 
+task :versions do
+require "open-uri"
+require "net/http" # avoid require issues in threads
+require "net/https"
+
+versions = %w[ 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 3.0 ]
+
+base_url = "https://cache.ruby-lang.org/pub/ruby"
+
+class Array
+def human_sort
+sort_by { |item| item.to_s.split(/(\d+)/).map { |e| [e.to_i, e] } }
+end
+end
+
+versions = versions.map { |ver|
+Thread.new {
+URI
+.parse("#{base_url}/#{ver}/")
+.read
+.scan(/ruby-\d+\.\d+\.\d+[-\w.]*?.tar.gz/)
+.reject { |s| s =~ /-(?:rc|preview)\d/ }
+.human_sort
+.last
+.delete_prefix("ruby-")
+.delete_suffix ".tar.gz"
+}
+}.map(&:value).sort
+
+puts versions.map { |v| "ruby_parse %p" % [v] }
+end
+
 ruby_parse "2.0.0-p648"
-ruby_parse "2.1.
-ruby_parse "2.2.
+ruby_parse "2.1.10"
+ruby_parse "2.2.10"
 ruby_parse "2.3.8"
-ruby_parse "2.4.
-ruby_parse "2.5.
-ruby_parse "2.6.
-ruby_parse "2.7.
+ruby_parse "2.4.10"
+ruby_parse "2.5.9"
+ruby_parse "2.6.8"
+ruby_parse "2.7.4"
+ruby_parse "3.0.2"
 
 task :debug => :isolate do
-ENV["V"] ||=
+ENV["V"] ||= VERS.last
 Rake.application[:parser].invoke # this way we can have DEBUG set
 Rake.application[:lexer].invoke # this way we can have DEBUG set
 
@@ -215,7 +301,7 @@ task :debug => :isolate do
 time = (ENV["RP_TIMEOUT"] || 10).to_i
 
 n = ENV["BUG"]
-file = (n && "bug#{n}.rb") || ENV["F"] || ENV["FILE"] || "
+file = (n && "bug#{n}.rb") || ENV["F"] || ENV["FILE"] || "debug.rb"
 ruby = ENV["R"] || ENV["RUBY"]
 
 if ruby then
@@ -238,19 +324,22 @@ task :debug => :isolate do
 end
 
 task :debug3 do
-file = ENV["F"] || "
-
+file = ENV["F"] || "debug.rb"
+version = ENV["V"] || ""
+verbose = ENV["VERBOSE"] ? "-v" : ""
 munge = "./tools/munge.rb #{verbose}"
 
 abort "Need a file to parse, via: F=path.rb" unless file
 
 ENV.delete "V"
 
-
-
-sh "
+ruby = "ruby#{version}"
+
+sh "#{ruby} -v"
+sh "#{ruby} -y #{file} 2>&1 | #{munge} > tmp/ruby"
+sh "#{ruby} ./tools/ripper.rb -d #{file} | #{munge} > tmp/rip"
 sh "rake debug F=#{file} DEBUG=1 2>&1 | #{munge} > tmp/rp"
-sh "diff -U 999 -d tmp/{
+sh "diff -U 999 -d tmp/{ruby,rp}"
 end
 
 task :cmp do
@@ -262,16 +351,25 @@ task :cmp3 do
 end
 
 task :extract => :isolate do
-ENV["V"] ||=
+ENV["V"] ||= VERS.last
 Rake.application[:parser].invoke # this way we can have DEBUG set
 
-file = ENV["F"] || ENV["FILE"]
+file = ENV["F"] || ENV["FILE"] || abort("Need to provide F=<path>")
 
 ruby "-Ilib", "bin/ruby_parse_extract_error", file
 end
 
+task :parse => :isolate do
+ENV["V"] ||= VERS.last
+Rake.application[:parser].invoke # this way we can have DEBUG set
+
+file = ENV["F"] || ENV["FILE"] || abort("Need to provide F=<path>")
+
+ruby "-Ilib", "bin/ruby_parse", file
+end
+
 task :bugs do
-sh "for f in bug*.rb ; do #{Gem.ruby} -S rake debug F=$f && rm $f ; done"
+sh "for f in bug*.rb bad*.rb ; do #{Gem.ruby} -S rake debug F=$f && rm $f ; done"
 end
 
 # vim: syntax=Ruby
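The `versions` task in the Rakefile diff relies on a `human_sort` helper so that `2.7.10` sorts after `2.7.4`. The sketch below reproduces the same `sort_by` expression as a standalone method instead of reopening `Array`; the method signature and the sample tarball names are assumptions made for illustration, not part of the gem.

```ruby
# Natural-order sort, using the same key expression as the Rakefile's
# Array#human_sort but written as a plain method (hypothetical wrapper).
def human_sort(items)
  items.sort_by do |item|
    # split(/(\d+)/) keeps the digit runs as separate pieces; each piece
    # becomes [integer_value, text], so digit runs compare numerically
    # and everything else falls back to string comparison.
    item.to_s.split(/(\d+)/).map { |e| [e.to_i, e] }
  end
end

# Sample data (made up for this example).
tarballs = %w[ruby-2.7.10.tar.gz ruby-2.7.4.tar.gz ruby-2.7.9.tar.gz]

puts "plain sort:   #{tarballs.sort.inspect}"
puts "natural sort: #{human_sort(tarballs).inspect}"
```

A plain lexicographic sort places `2.7.10` before `2.7.4` because `"1"` sorts before `"4"`, which is why the task needs the numeric key before it takes `.last` as the newest release.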
data/compare/normalize.rb
CHANGED
@@ -84,6 +84,7 @@ def munge s
 
 "' '", "tSPACE", # needs to be later to avoid bad hits
 
+"%empty", "none", # newer bison
 "/* empty */", "none",
 /^\s*$/, "none",
 
@@ -140,6 +141,7 @@ def munge s
 '"do for block"', "kDO_BLOCK",
 '"do for condition"', "kDO_COND",
 '"do for lambda"', "kDO_LAMBDA",
+"tLABEL", "kLABEL",
 
 # UGH
 "k_LINE__", "k__LINE__",
@@ -155,7 +157,10 @@ def munge s
 /\"(\w+) \(?modifier\)?\"/, proc { |x| "k#{$1.upcase}_MOD" },
 /\"(\w+)\"/, proc { |x| "k#{$1.upcase}" },
 
-
+/\$?@(\d+)(\s+|$)/, "", # newer bison
+
+# TODO: remove for 3.0 work:
+"lex_ctxt ", "" # 3.0 production that's mostly noise right now
 ]
 
 renames.each_slice(2) do |(a, b)|
@@ -174,7 +179,7 @@ ARGF.each_line do |line|
 
 case line.strip
 when /^$/ then
-when /^(\d+) (
+when /^(\d+) (\$?[@\w]+): (.*)/ then # yacc
 rule = $2
 order << rule unless rules.has_key? rule
 rules[rule] << munge($3)
@@ -199,7 +204,7 @@ ARGF.each_line do |line|
 when /^\cL/ then # byacc
 break
 else
-warn "unparsed: #{$.}: #{line.
+warn "unparsed: #{$.}: #{line.strip.inspect}"
 end
 end
 
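The normalize.rb hunks add entries to a flat list of pattern/replacement pairs that is walked with `each_slice(2)`. The body of that loop is not shown in the diff, so the following is only a rough sketch of the idea under stated assumptions: `RENAMES_SKETCH` and `munge_sketch` are hypothetical names, the list holds just four pairs taken from the diff, and the `proc` replacements used by the real file are left out.

```ruby
# A flat list read two entries at a time: pattern, then replacement.
# Only a handful of the pairs visible in the diff are included here.
RENAMES_SKETCH = [
  "%empty",           "none",   # newer bison
  "/* empty */",      "none",
  /\$?@(\d+)(\s+|$)/, "",       # newer bison mid-rule action names
  "tLABEL",           "kLABEL",
]

# Apply every pair in order. A String pattern is replaced literally and a
# Regexp pattern is matched as a regular expression (standard String#gsub).
def munge_sketch(text)
  RENAMES_SKETCH.each_slice(2) do |pattern, replacement|
    text = text.gsub(pattern, replacement)
  end
  text.strip
end

puts munge_sketch("%empty")
puts munge_sketch("$@1 tBDOT2 arg")
```

Keeping the pairs in one ordered array means later entries see the output of earlier ones, which matches the "needs to be later to avoid bad hits" comment in the original list.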
data/debugging.md
CHANGED
@@ -55,3 +55,136 @@ From there? Good luck. I'm currently trying to backtrack from rule
 reductions to state change differences. I'd like to figure out a way
 to go from this sort of diff to a reasonable test that checks state
 changes but I don't have that set up at this point.
+
+## Adding New Grammar Productions
+
+Ruby adds stuff to the parser ALL THE TIME. It's actually hard to keep
+up with, but I've added some tools and shown what a typical workflow
+looks like. Let's say you want to add ruby 2.7's "beginless range" (eg
+`..42`).
+
+Whenever there's a language feature missing, I start with comparing
+the parse trees between MRI and RP:
+
+### Structural Comparing
+
+There's a bunch of rake tasks `compare27`, `compare26`, etc that try
+to normalize and diff MRI's parse.y parse tree (just the structure of
+the tree in yacc) to ruby\_parser's parse tree (racc). It's the first
+thing I do when I'm adding a new version. Stub out all the version
+differences, and then start to diff the structure and move
+ruby\_parser towards the new changes.
+
+Some differences are just gonna be there... but here's an example of a
+real diff between MRI 2.7 and ruby_parser as of today:
+
+```diff
+arg tDOT3 arg
+arg tDOT2
+arg tDOT3
+- tBDOT2 arg
+- tBDOT3 arg
+arg tPLUS arg
+arg tMINUS arg
+arg tSTAR2 arg
+```
+
+This is a new language feature that ruby_parser doesn't handle yet.
+It's in MRI (the left hand side of the diff) but not ruby\_parser (the
+right hand side) so it is a `-` or missing line.
+
+Some other diffs will have both `+` and `-` lines. That usually
+happens when MRI has been refactoring the grammar. Sometimes I choose
+to adapt those refactorings and sometimes it starts to get too
+difficult to maintain multiple versions of ruby parsing in a single
+file.
+
+But! This structural comparing is always a place you should look when
+ruby_parser is failing to parse something. Maybe it just hasn't been
+implemented yet and the easiest place to look is the diff.
+
+### Starting Test First
+
+The next thing I do is to add a parser test to cover that feature. I
+usually start with the parser and work backwards towards the lexer as
+needed, as I find it structures things properly and keeps things goal
+oriented.
+
+So, make a new parser test, usually in the versioned section of the
+parser tests.
+
+```
+def test_beginless2
+rb = "..10\n; ..a\n; c"
+pt = s(:block,
+s(:dot2, nil, s(:lit, 0).line(1)).line(1),
+s(:dot2, nil, s(:call, nil, :a).line(2)).line(2),
+s(:call, nil, :c).line(3)).line(1)
+
+assert_parse_line rb, pt, 1
+
+flunk "not done yet"
+end
+```
+
+(In this case copied and modified the tests for open ranges from 2.6)
+and run it to get my first error:
+
+```
+% rake N=/beginless/
+
+...
+
+E
+
+Finished in 0.021814s, 45.8421 runs/s, 0.0000 assertions/s.
+
+1) Error:
+TestRubyParserV27#test_whatevs:
+Racc::ParseError: (string):1 :: parse error on value ".." (tDOT2)
+GEMS/2.7.0/gems/racc-1.5.0/lib/racc/parser.rb:538:in `on_error'
+WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1304:in `on_error'
+(eval):3:in `_racc_do_parse_c'
+(eval):3:in `do_parse'
+WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1329:in `block in process'
+RUBY/lib/ruby/2.7.0/timeout.rb:95:in `block in timeout'
+RUBY/lib/ruby/2.7.0/timeout.rb:33:in `block in catch'
+RUBY/lib/ruby/2.7.0/timeout.rb:33:in `catch'
+RUBY/lib/ruby/2.7.0/timeout.rb:33:in `catch'
+RUBY/lib/ruby/2.7.0/timeout.rb:110:in `timeout'
+WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1317:in `process'
+WORK/ruby_parser/dev/test/test_ruby_parser.rb:4198:in `assert_parse'
+WORK/ruby_parser/dev/test/test_ruby_parser.rb:4221:in `assert_parse_line'
+WORK/ruby_parser/dev/test/test_ruby_parser.rb:4451:in `test_whatevs'
+```
+
+For starters, we know the missing production is for `tBDOT2 arg`. It
+is currently blowing up because it is getting `tDOT2` and simply
+doesn't know what to do with it, so it raises the error. As the diff
+suggests, that's the wrong token to begin with, so it is probably time
+to also create a lexer test:
+
+```
+def test_yylex_bdot2
+assert_lex3("..42",
+s(:dot2, nil, s(:lit, 42)),
+
+:tBDOT2, "..", EXPR_BEG,
+:tINTEGER, "42", EXPR_NUM)
+
+flunk "not done yet"
+end
+```
+
+This one is mostly speculative at this point. It says "if we're lexing
+this string, we should get this sexp if we fully parse it, and the
+lexical stream should look like this"... That last bit is mostly made
+up at this point. Sometimes I don't know exactly what expression state
+things should be in until I start really digging in.
+
+At this point, I have 2 failing tests that are directing me in the
+right direction. It's now a matter of digging through
+`compare/parse26.y` to see how the lexer differs and implementing
+it...
+
+But this is a good start to the doco for now. I'll add more later.
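The "Structural Comparing" section of debugging.md reads a `-` line as a production that MRI has and ruby_parser lacks. A minimal sketch of that reading, assuming each side has already been normalized into a list of production strings: the constants `MRI_ARG` and `RP_ARG` and the method `structural_diff` are hypothetical, and the real rake `compare` tasks diff normalized text files rather than arrays.

```ruby
# Right-hand sides for the `arg` rule, copied from the diff excerpt in
# debugging.md. MRI has the two beginless-range productions; the
# ruby_parser side is modelled as the same list without them.
MRI_ARG = [
  "arg tDOT3 arg", "arg tDOT2", "arg tDOT3",
  "tBDOT2 arg", "tBDOT3 arg",
  "arg tPLUS arg", "arg tMINUS arg", "arg tSTAR2 arg",
]
RP_ARG = MRI_ARG - ["tBDOT2 arg", "tBDOT3 arg"]

# "-" marks productions only MRI has (not implemented yet);
# "+" marks productions only ruby_parser has (often a grammar refactoring).
def structural_diff(mri, rp)
  (mri - rp).map { |prod| "- #{prod}" } +
    (rp - mri).map { |prod| "+ #{prod}" }
end

puts structural_diff(MRI_ARG, RP_ARG)
```

An empty result for a rule suggests its structure already matches, so a parse failure there is more likely a lexer or action problem than a missing production.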