ruby_parser 3.15.0 → 3.18.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: b1172d3be0aa30a6d959bd6ab265a6e05b391923196ec30236705637f8b05cc0
- data.tar.gz: bedbe7c23eb256413a34d9b5dc67d975c70a8757edea6b22cebfcb82fb42d412
+ metadata.gz: 36780d9d3244dd62d13430987076d5e81ae2e536d6d2bfd259f8a612da3d94cc
+ data.tar.gz: bec4b32e7f7a8d9ae8e3202f30230f351a2fedc6e2ac4e984260486dbb7529c6
  SHA512:
- metadata.gz: 4e3174ada892182f6f60497feaa74668517e71bcffd44b10590f9c2fc153ba1744dbb8afd35493d1f7cb96598795e30505c922c2e53e441cbb1a7a646e9005ef
- data.tar.gz: 2cab3beeefb45cba366c482e862bb2594169560f8920d39b28a7a86ad5c6c6582393fe91e082cd6387314f36ae5e660d8e18c2de8a4ddcefce9d40600593f795
+ metadata.gz: f28d02d2b14687e365bab3a353348b93a9df993be2d1afd3f2783b5b97ca016a6ca2f834ef61ebb4a4eae3decc38e1351349679f951f901bef09c25f23d44322
+ data.tar.gz: 276ecce4db1f72ed2ce0d276679e65419225a46b885d0050aa7ba6382b45033ccd24b5006a0d382f0aecdbb6c5a5fd93e3e826adeafccc3c47ee051b76772eee
checksums.yaml.gz.sig CHANGED
Binary file
data/History.rdoc CHANGED
@@ -1,3 +1,104 @@
+ === 3.18.0 / 2021-10-27
+
+ Holy crap... 58 commits! 2.7 and 3.0 are feature complete. Strings
+ & heredocs have been rewritten.
+
+ * 9 major enhancements:
+
+ * !!! Rewrote lexer (and friends) for strings, heredocs, and %*[] constructs.
+ * Massive overhaul on line numbers.
+ * Freeze input! Finally!!! No more modifying the input string for heredocs.
+ * Overhauled RPStringScanner. Removed OLD compatibility methods!
+ * Removed Sexp methods: value, to_sym, add, add_all, node_type, values.
+ * value moved to sexp_processor.
+ * Removed String#grep monkey-patch.
+ * Removed String#lineno monkey-patch.
+ * Removed string_to_pos, charpos, etc hacks for ancient ruby versions.
+ * Removed unread_many... NO! NO EDITING THE INPUT STRING!
+
+ * 31 minor enhancements:
+
+ * 2.7/3.0: many more pattern edge cases
+ * 2.7: Added `mlhs = rhs rescue expr`
+ * 2.7: refactored destructured args (`|(k,v)|`) and unfactored(?!) case_body/args.
+ * 3.0: excessed_comma
+ * 3.0: finished most everything: endless methods, patterns, etc.
+ * 3.0: refactored / added new pattern changes
+ * Added RubyLexer#in_heredoc? (ie, is there old_ss ?)
+ * Added RubyLexer#old_ss and old_lineno and removed much of SSStack(ish).
+ * Added Symbol#end_with? when necessary
+ * Added TALLY and DEBUG options for ss.getch and ss.scan
+ * Added ignore_body_comments to make parser productions more clear.
+ * Added support for no_kwarg (eg `def f(**nil)`).
+ * Added support for no_kwarg in blocks (eg `f { |**nil| }`).
+ * Augmented generated parser files to have frozen_string_literal comments and fixed tests.
+ * Broke out 3.0 parser into its own to ease development.
+ * Bumped dependencies on sexp_processor and oedipus_lex.
+ * Clean generated 3.x files.
+ * Extracted all string scanner methods to their own module.
+ * Fixed some precedence decls.
+ * Implemented most of pattern matching for 2.7+.
+ * Improve lex_state= to report location in verbose debug mode.
+ * Made it easier to debug with a particular version of ruby via rake.
+ * Make sure ripper uses the same version of ruby we specified.
+ * Moved all string/heredoc/etc code to ruby_lexer_strings.rb
+ * Remove warning from newer bisons.
+ * Sprinkled in some frozen_string_literal, but mostly helped by oedipus bump.
+ * Switch to comparing against ruby binary since ripper is buggy.
+ * bugs task should try both bug*.rb and bad*.rb.
+ * endless methods
+ * f_any_kwrest refactoring.
+ * refactored defn/defs
+
+ * 15 bug fixes:
+
+ * Cleaned a bunch of old hacks. Initializing RubyLexer w/ Parser is cleaner now.
+ * Corrected some lex_state errors in process_token_keyword.
+ * Fixed ancient ruby2 change (use #lines) in ruby_parse_extract_error.
+ * Fixed bug where else without rescue only raises on 2.6+
+ * Fixed caller for getch and scan when DEBUG=1
+ * Fixed comments in the middle of message cascades.
+ * Fixed differences w/ symbol productions against ruby 2.7.
+ * Fixed dsym to use string_contents production.
+ * Fixed error in bdot2/3 in some edge cases. Fixed p_alt line.
+ * Fixed heredoc dedenting in the presence of empty lines. (mvz)
+ * Fixed some leading whitespace / comment processing
+ * Fixed up how class/module/defn/defs comments were collected.
+ * Overhauled ripper.rb to deal with buggy ripper w/ yydebug.
+ * Removed dsym from literal.
+ * Removed tUBANG lexeme but kept it distinct as a method name (eg: `def !@`).
+
+ === 3.17.0 / 2021-08-03
+
+ * 1 minor enhancement:
+
+ * Added support for arg forwarding (eg `def f(...); m(...); end`) (presidentbeef)
+
+ === 3.16.0 / 2021-05-15
+
+ * 1 major enhancement:
+
+ * Added tentative 3.0 support.
+
+ * 3 minor enhancements:
+
+ * Added lexing for "beginless range" (bdots).
+ * Added parsing for bdots.
+ * Updated rake compare task to download xz files, bumped versions, etc
+
+ * 4 bug fixes:
+
+ * Bump rake dependency to >= 10, < 15. (presidentbeef)
+ * Bump sexp_processor dependency to 4.15.1+. (pravi)
+ * Fixed minor state mismatch at the end of parsing to make diffing a little cleaner.
+ * Fixed normalizer to deal with new bison token syntax
+
+ === 3.15.1 / 2021-01-10
+
+ * 1 bug fix:
+
+ * Bumped ruby version to include < 4 (trunk).
+
  === 3.15.0 / 2020-08-31
 
  * 1 major enhancement:
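Several of the changelog entries above name Ruby syntax that the parser now accepts. As a reader's aid, here is a minimal sketch of those forms in plain Ruby (it needs Ruby 3.0 or newer to run). The method and variable names (`target`, `forward`, `no_keywords`, `config`) are made up for illustration and do not come from ruby_parser.

```ruby
# 3.0 endless method definition.
def target(a, b = 0, key: nil) = [a, b, key]

# 2.7 argument forwarding: `...` passes every argument through unchanged.
def forward(...)
  target(...)
end

# 2.7 no_kwarg: `**nil` declares that the method accepts no keywords at all.
def no_keywords(x, **nil)
  x
end

# 2.7 beginless range.
upto_ten = (..10)

# 2.7 `mlhs = rhs rescue expr`: the rescue modifier now binds to the rhs,
# so the fallback array is what gets destructured.
first, second = raise("boom") rescue [1, 2]

# 2.7+ pattern matching with case/in.
config = { name: "ruby_parser", version: [3, 18, 0] }
major = case config
        in { name: String, version: [Integer => m, *] } then m
        end
```

Each construct is a separate grammar production, which is presumably why the changelog lists them one by one.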
data/Manifest.txt CHANGED
@@ -7,6 +7,7 @@ bin/ruby_parse
  bin/ruby_parse_extract_error
  compare/normalize.rb
  debugging.md
+ gauntlet.md
  lib/.document
  lib/rp_extensions.rb
  lib/rp_stringscanner.rb
@@ -26,9 +27,13 @@ lib/ruby26_parser.rb
  lib/ruby26_parser.y
  lib/ruby27_parser.rb
  lib/ruby27_parser.y
+ lib/ruby30_parser.rb
+ lib/ruby30_parser.y
+ lib/ruby3_parser.yy
  lib/ruby_lexer.rb
  lib/ruby_lexer.rex
  lib/ruby_lexer.rex.rb
+ lib/ruby_lexer_strings.rb
  lib/ruby_parser.rb
  lib/ruby_parser.yy
  lib/ruby_parser_extras.rb
data/README.rdoc CHANGED
@@ -32,6 +32,7 @@ Tested against 801,039 files from the latest of all rubygems (as of 2013-05):
  * 1.8 parser is at 99.9739% accuracy, 3.651 sigma
  * 1.9 parser is at 99.9940% accuracy, 4.013 sigma
  * 2.0 parser is at 99.9939% accuracy, 4.008 sigma
+ * 2.6 parser is at 99.9972% accuracy, 4.191 sigma
 
  == FEATURES/PROBLEMS:
 
data/Rakefile CHANGED
@@ -14,25 +14,37 @@ Hoe.add_include_dirs "../../minitest/dev/lib"
  Hoe.add_include_dirs "../../oedipus_lex/dev/lib"
 
  V2 = %w[20 21 22 23 24 25 26 27]
- V2.replace [V2.last] if ENV["FAST"] # HACK
+ V3 = %w[30]
+
+ VERS = V2 + V3
+
+ ENV["FAST"] = VERS.last if ENV["FAST"] && !VERS.include?(ENV["FAST"])
+ VERS.replace [ENV["FAST"]] if ENV["FAST"]
 
  Hoe.spec "ruby_parser" do
  developer "Ryan Davis", "ryand-ruby@zenspider.com"
 
  license "MIT"
 
- dependency "sexp_processor", "~> 4.9"
- dependency "rake", "< 11", :developer
- dependency "oedipus_lex", "~> 2.5", :developer
+ dependency "sexp_processor", "~> 4.16"
+ dependency "rake", [">= 10", "< 15"], :developer
+ dependency "oedipus_lex", "~> 2.6", :developer
+
+ # NOTE: Ryan!!! Stop trying to fix this dependency! Isolate just
+ # can't handle having a faux-gem half-installed! Stop! Just `gem
+ # install racc` and move on. Revisit this ONLY once racc-compiler
+ # gets split out.
 
- require_ruby_version [">= 2.1", "< 3.1"]
+ dependency "racc", "~> 1.5", :developer
+
+ require_ruby_version [">= 2.1", "< 4"]
 
  if plugin? :perforce then # generated files
- V2.each do |n|
+ VERS.each do |n|
  self.perforce_ignore << "lib/ruby#{n}_parser.rb"
  end
 
- V2.each do |n|
+ VERS.each do |n|
  self.perforce_ignore << "lib/ruby#{n}_parser.y"
  end
 
@@ -46,6 +58,23 @@ Hoe.spec "ruby_parser" do
  end
  end
 
+ def maybe_add_to_top path, string
+ file = File.read path
+
+ return if file.start_with? string
+
+ warn "Altering top of #{path}"
+ tmp_path = "#{path}.tmp"
+ File.open(tmp_path, "w") do |f|
+ f.puts string
+ f.puts
+
+ f.write file
+ # TODO: make this deal with encoding comments properly?
+ end
+ File.rename tmp_path, path
+ end
+
  V2.each do |n|
  file "lib/ruby#{n}_parser.y" => "lib/ruby_parser.yy" do |t|
  cmd = 'unifdef -tk -DV=%s -UDEAD %s > %s || true' % [n, t.source, t.name]
@@ -55,8 +84,23 @@ V2.each do |n|
  file "lib/ruby#{n}_parser.rb" => "lib/ruby#{n}_parser.y"
  end
 
+ V3.each do |n|
+ file "lib/ruby#{n}_parser.y" => "lib/ruby3_parser.yy" do |t|
+ cmd = 'unifdef -tk -DV=%s -UDEAD %s > %s || true' % [n, t.source, t.name]
+ sh cmd
+ end
+
+ file "lib/ruby#{n}_parser.rb" => "lib/ruby#{n}_parser.y"
+ end
+
  file "lib/ruby_lexer.rex.rb" => "lib/ruby_lexer.rex"
 
+ task :parser do |t|
+ t.prerequisite_tasks.grep(Rake::FileTask).select(&:already_invoked).each do |f|
+ maybe_add_to_top f.name, "# frozen_string_literal: true"
+ end
+ end
+
  task :generate => [:lexer, :parser]
 
  task :clean do
@@ -65,6 +109,7 @@ task :clean do
  Dir["coverage.info"] +
  Dir["coverage"] +
  Dir["lib/ruby2*_parser.y"] +
+ Dir["lib/ruby3*_parser.y"] +
  Dir["lib/*.output"])
  end
 
@@ -92,7 +137,7 @@ end
 
  def dl v
  dir = v[/^\d+\.\d+/]
- url = "https://cache.ruby-lang.org/pub/ruby/#{dir}/ruby-#{v}.tar.bz2"
+ url = "https://cache.ruby-lang.org/pub/ruby/#{dir}/ruby-#{v}.tar.xz"
  path = File.basename url
  unless File.exist? path then
  system "curl -O #{url}"
@@ -104,7 +149,7 @@ def ruby_parse version
  rp_txt = "rp#{v}.txt"
  mri_txt = "mri#{v}.txt"
  parse_y = "parse#{v}.y"
- tarball = "ruby-#{version}.tar.bz2"
+ tarball = "ruby-#{version}.tar.xz"
  ruby_dir = "ruby-#{version}"
  diff = "diff#{v}.diff"
  rp_out = "lib/ruby#{v}_parser.output"
@@ -124,15 +169,18 @@ def ruby_parse version
  end
  end
 
+ desc "fetch all tarballs"
+ task :fetch => c_tarball
+
  file c_parse_y => c_tarball do
  in_compare do
  extract_glob = case version
- when /2\.7/
+ when /2\.7|3\.0/
  "{id.h,parse.y,tool/{id2token.rb,lib/vpath.rb}}"
  else
  "{id.h,parse.y,tool/{id2token.rb,vpath.rb}}"
  end
- system "tar yxf #{tarball} #{ruby_dir}/#{extract_glob}"
+ system "tar Jxf #{tarball} #{ruby_dir}/#{extract_glob}"
 
  Dir.chdir ruby_dir do
  if File.exist? "tool/id2token.rb" then
141
189
  sh "expand parse.y > ../#{parse_y}"
142
190
  end
143
191
 
144
- ruby "-pi", "-e", 'gsub(/^%define\s+api\.pure/, "%pure-parser")', "../#{parse_y}"
192
+ ruby "-pi", "-e", 'gsub(/^%pure-parser/, "%define api.pure")', "../#{parse_y}"
145
193
  end
146
194
  sh "rm -rf #{ruby_dir}"
147
195
  end
148
196
  end
149
197
 
198
+ bison = Dir["/opt/homebrew/opt/bison/bin/bison",
199
+ "/usr/local/opt/bison/bin/bison",
200
+ `which bison`.chomp,
201
+ ].first
202
+
150
203
  file c_mri_txt => [c_parse_y, normalize] do
151
204
  in_compare do
152
- sh "bison -r all #{parse_y}"
205
+ sh "#{bison} -r all #{parse_y}"
153
206
  sh "./normalize.rb parse#{v}.output > #{mri_txt}"
154
207
  rm ["parse#{v}.output", "parse#{v}.tab.c"]
155
208
  end
@@ -190,17 +243,50 @@ def ruby_parse version
  end
  end
 
+ task :versions do
+ require "open-uri"
+ require "net/http" # avoid require issues in threads
+ require "net/https"
+
+ versions = %w[ 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 3.0 ]
+
+ base_url = "https://cache.ruby-lang.org/pub/ruby"
+
+ class Array
+ def human_sort
+ sort_by { |item| item.to_s.split(/(\d+)/).map { |e| [e.to_i, e] } }
+ end
+ end
+
+ versions = versions.map { |ver|
+ Thread.new {
+ URI
+ .parse("#{base_url}/#{ver}/")
+ .read
+ .scan(/ruby-\d+\.\d+\.\d+[-\w.]*?.tar.gz/)
+ .reject { |s| s =~ /-(?:rc|preview)\d/ }
+ .human_sort
+ .last
+ .delete_prefix("ruby-")
+ .delete_suffix ".tar.gz"
+ }
+ }.map(&:value).sort
+
+ puts versions.map { |v| "ruby_parse %p" % [v] }
+ end
+
  ruby_parse "2.0.0-p648"
- ruby_parse "2.1.9"
- ruby_parse "2.2.9"
+ ruby_parse "2.1.10"
+ ruby_parse "2.2.10"
  ruby_parse "2.3.8"
- ruby_parse "2.4.9"
- ruby_parse "2.5.8"
- ruby_parse "2.6.6"
- ruby_parse "2.7.1"
+ ruby_parse "2.4.10"
+ ruby_parse "2.5.9"
+ ruby_parse "2.6.8"
+ ruby_parse "2.7.4"
+ ruby_parse "3.0.2"
 
  task :debug => :isolate do
- ENV["V"] ||= V2.last
+ ENV["V"] ||= VERS.last
  Rake.application[:parser].invoke # this way we can have DEBUG set
  Rake.application[:lexer].invoke # this way we can have DEBUG set
 
@@ -215,7 +301,7 @@ task :debug => :isolate do
  time = (ENV["RP_TIMEOUT"] || 10).to_i
 
  n = ENV["BUG"]
- file = (n && "bug#{n}.rb") || ENV["F"] || ENV["FILE"] || "bug.rb"
+ file = (n && "bug#{n}.rb") || ENV["F"] || ENV["FILE"] || "debug.rb"
  ruby = ENV["R"] || ENV["RUBY"]
 
  if ruby then
@@ -238,19 +324,22 @@ task :debug => :isolate do
  end
 
  task :debug3 do
- file = ENV["F"] || "bug.rb"
- verbose = ENV["V"] ? "-v" : ""
+ file = ENV["F"] || "debug.rb"
+ version = ENV["V"] || ""
+ verbose = ENV["VERBOSE"] ? "-v" : ""
  munge = "./tools/munge.rb #{verbose}"
 
  abort "Need a file to parse, via: F=path.rb" unless file
 
  ENV.delete "V"
 
- sh "ruby -v"
- sh "ruby -y #{file} 2>&1 | #{munge} > tmp/ruby"
- sh "./tools/ripper.rb -d #{file} | #{munge} > tmp/rip"
+ ruby = "ruby#{version}"
+
+ sh "#{ruby} -v"
+ sh "#{ruby} -y #{file} 2>&1 | #{munge} > tmp/ruby"
+ sh "#{ruby} ./tools/ripper.rb -d #{file} | #{munge} > tmp/rip"
  sh "rake debug F=#{file} DEBUG=1 2>&1 | #{munge} > tmp/rp"
- sh "diff -U 999 -d tmp/{rip,rp}"
+ sh "diff -U 999 -d tmp/{ruby,rp}"
  end
 
  task :cmp do
@@ -262,16 +351,25 @@ task :cmp3 do
  end
 
  task :extract => :isolate do
- ENV["V"] ||= V2.last
+ ENV["V"] ||= VERS.last
  Rake.application[:parser].invoke # this way we can have DEBUG set
 
- file = ENV["F"] || ENV["FILE"]
+ file = ENV["F"] || ENV["FILE"] || abort("Need to provide F=<path>")
 
  ruby "-Ilib", "bin/ruby_parse_extract_error", file
  end
 
+ task :parse => :isolate do
+ ENV["V"] ||= VERS.last
+ Rake.application[:parser].invoke # this way we can have DEBUG set
+
+ file = ENV["F"] || ENV["FILE"] || abort("Need to provide F=<path>")
+
+ ruby "-Ilib", "bin/ruby_parse", file
+ end
+
  task :bugs do
- sh "for f in bug*.rb ; do #{Gem.ruby} -S rake debug F=$f && rm $f ; done"
+ sh "for f in bug*.rb bad*.rb ; do #{Gem.ruby} -S rake debug F=$f && rm $f ; done"
  end
 
  # vim: syntax=Ruby
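The `versions` task in the Rakefile diff above relies on a "human sort" so that a release like 2.1.10 ranks after 2.1.9. A minimal standalone sketch of that sort key follows. The helper name `natural_key` and the sample file names are assumptions for illustration, and it is written as a plain method rather than the `Array` monkey-patch the Rakefile uses.

```ruby
# Split on digit runs (keeping them, thanks to the capture group) so that
# numeric pieces compare as integers and everything else compares as text.
def natural_key(name)
  name.to_s.split(/(\d+)/).map { |part| [part.to_i, part] }
end

tarballs = %w[ruby-2.1.9.tar.gz ruby-2.1.10.tar.gz ruby-2.1.2.tar.gz]

plain   = tarballs.sort                            # lexicographic: "10" sorts before "2"
natural = tarballs.sort_by { |t| natural_key(t) }  # numeric-aware ordering

puts "plain last:   #{plain.last}"
puts "natural last: #{natural.last}"
```

With a plain string sort the "latest" tarball would come out as 2.1.9, which is likely why the task sorts this way before calling `.last`.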
data/bin/ruby_parse_extract_error CHANGED
@@ -21,7 +21,7 @@ class RubyParser
  src = ss.string
  pre_error = src[0...ss.pos]
 
- defs = pre_error.grep(/^ *(?:def|it)/)
+ defs = pre_error.lines.grep(/^ *(?:def|it)/)
 
  raise "can't figure out where the bad code starts" unless defs.last
 
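The one-line change above pairs with the changelog entries about removing the `String#grep` monkey-patch: a modern Ruby `String` is not `Enumerable`, so the text has to be split into lines before `grep` applies. A small sketch with made-up source text (the names `src` and `error_pos` are illustrative, not taken from the script):

```ruby
src = <<~RUBY
  def first
    1
  end

  def second
    oops(
RUBY

# Pretend the parser gave up at "oops" and keep only the text before it.
error_pos = src.index("oops")
pre_error = src[0...error_pos]

# String#lines returns an Array of lines, and Array (via Enumerable) has grep.
defs = pre_error.lines.grep(/^ *(?:def|it)/)
last_def = defs.last

puts "last definition before the error: #{last_def}"
```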
data/compare/normalize.rb CHANGED
@@ -84,6 +84,7 @@ def munge s
 
  "' '", "tSPACE", # needs to be later to avoid bad hits
 
+ "%empty", "none", # newer bison
  "/* empty */", "none",
  /^\s*$/, "none",
 
@@ -140,6 +141,7 @@ def munge s
  '"do for block"', "kDO_BLOCK",
  '"do for condition"', "kDO_COND",
  '"do for lambda"', "kDO_LAMBDA",
+ "tLABEL", "kLABEL",
 
  # UGH
  "k_LINE__", "k__LINE__",
@@ -155,7 +157,10 @@ def munge s
  /\"(\w+) \(?modifier\)?\"/, proc { |x| "k#{$1.upcase}_MOD" },
  /\"(\w+)\"/, proc { |x| "k#{$1.upcase}" },
 
- /@(\d+)(\s+|$)/, "",
+ /\$?@(\d+)(\s+|$)/, "", # newer bison
+
+ # TODO: remove for 3.0 work:
+ "lex_ctxt ", "" # 3.0 production that's mostly noise right now
  ]
 
  renames.each_slice(2) do |(a, b)|
@@ -174,7 +179,7 @@ ARGF.each_line do |line|
 
  case line.strip
  when /^$/ then
- when /^(\d+) (\$?\w+): (.*)/ then # yacc
+ when /^(\d+) (\$?[@\w]+): (.*)/ then # yacc
  rule = $2
  order << rule unless rules.has_key? rule
  rules[rule] << munge($3)
@@ -199,7 +204,7 @@ ARGF.each_line do |line|
  when /^\cL/ then # byacc
  break
  else
- warn "unparsed: #{$.}: #{line.chomp}"
+ warn "unparsed: #{$.}: #{line.strip.inspect}"
  end
  end
 
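The normalizer hunks above add entries to a flat list of pattern/replacement pairs that is consumed with `each_slice(2)`. A trimmed-down sketch of that idea follows. The method name `normalize_rule` and the three-pair table are assumptions for illustration; the real `munge` has many more entries and also accepts procs as replacements, which this sketch leaves out.

```ruby
# Apply each [pattern, replacement] pair in order. Patterns may be plain
# strings (matched literally by gsub) or regexps.
def normalize_rule(text, renames)
  renames.each_slice(2).reduce(text) do |acc, (pattern, replacement)|
    acc.gsub(pattern, replacement)
  end
end

renames = [
  "%empty",           "none", # newer bison spelling of an empty rule
  "/* empty */",      "none", # older spelling
  /\$?@(\d+)(\s+|$)/, "",     # anonymous mid-rule markers such as @1 or $@1
]

puts normalize_rule("%empty", renames)
puts normalize_rule("$@1 tBDOT2 arg", renames)
```

Keeping the table flat makes it cheap to add a pair per bison quirk, at the cost of ordering mattering (the source comments on one entry that it "needs to be later to avoid bad hits").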
data/debugging.md CHANGED
@@ -55,3 +55,136 @@ From there? Good luck. I'm currently trying to backtrack from rule
  reductions to state change differences. I'd like to figure out a way
  to go from this sort of diff to a reasonable test that checks state
  changes but I don't have that set up at this point.
+
+ ## Adding New Grammar Productions
+
+ Ruby adds stuff to the parser ALL THE TIME. It's actually hard to keep
+ up with, but I've added some tools and shown what a typical workflow
+ looks like. Let's say you want to add ruby 2.7's "beginless range" (eg
+ `..42`).
+
+ Whenever there's a language feature missing, I start with comparing
+ the parse trees between MRI and RP:
+
+ ### Structural Comparing
+
+ There's a bunch of rake tasks `compare27`, `compare26`, etc that try
+ to normalize and diff MRI's parse.y parse tree (just the structure of
+ the tree in yacc) to ruby\_parser's parse tree (racc). It's the first
+ thing I do when I'm adding a new version. Stub out all the version
+ differences, and then start to diff the structure and move
+ ruby\_parser towards the new changes.
+
+ Some differences are just gonna be there... but here's an example of a
+ real diff between MRI 2.7 and ruby_parser as of today:
+
+ ```diff
+ arg tDOT3 arg
+ arg tDOT2
+ arg tDOT3
+ - tBDOT2 arg
+ - tBDOT3 arg
+ arg tPLUS arg
+ arg tMINUS arg
+ arg tSTAR2 arg
+ ```
+
+ This is a new language feature that ruby_parser doesn't handle yet.
+ It's in MRI (the left hand side of the diff) but not ruby\_parser (the
+ right hand side) so it is a `-` or missing line.
+
+ Some other diffs will have both `+` and `-` lines. That usually
+ happens when MRI has been refactoring the grammar. Sometimes I choose
+ to adapt those refactorings and sometimes it starts to get too
+ difficult to maintain multiple versions of ruby parsing in a single
+ file.
+
+ But! This structural comparing is always a place you should look when
+ ruby_parser is failing to parse something. Maybe it just hasn't been
+ implemented yet and the easiest place to look is the diff.
+
+ ### Starting Test First
+
+ The next thing I do is to add a parser test to cover that feature. I
+ usually start with the parser and work backwards towards the lexer as
+ needed, as I find it structures things properly and keeps things goal
+ oriented.
+
+ So, make a new parser test, usually in the versioned section of the
+ parser tests.
+
+ ```
+ def test_beginless2
+ rb = "..10\n; ..a\n; c"
+ pt = s(:block,
+ s(:dot2, nil, s(:lit, 0).line(1)).line(1),
+ s(:dot2, nil, s(:call, nil, :a).line(2)).line(2),
+ s(:call, nil, :c).line(3)).line(1)
+
+ assert_parse_line rb, pt, 1
+
+ flunk "not done yet"
+ end
+ ```
+
+ (In this case copied and modified the tests for open ranges from 2.6)
+ and run it to get my first error:
+
+ ```
+ % rake N=/beginless/
+
+ ...
+
+ E
+
+ Finished in 0.021814s, 45.8421 runs/s, 0.0000 assertions/s.
+
+ 1) Error:
+ TestRubyParserV27#test_whatevs:
+ Racc::ParseError: (string):1 :: parse error on value ".." (tDOT2)
+ GEMS/2.7.0/gems/racc-1.5.0/lib/racc/parser.rb:538:in `on_error'
+ WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1304:in `on_error'
+ (eval):3:in `_racc_do_parse_c'
+ (eval):3:in `do_parse'
+ WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1329:in `block in process'
+ RUBY/lib/ruby/2.7.0/timeout.rb:95:in `block in timeout'
+ RUBY/lib/ruby/2.7.0/timeout.rb:33:in `block in catch'
+ RUBY/lib/ruby/2.7.0/timeout.rb:33:in `catch'
+ RUBY/lib/ruby/2.7.0/timeout.rb:33:in `catch'
+ RUBY/lib/ruby/2.7.0/timeout.rb:110:in `timeout'
+ WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1317:in `process'
+ WORK/ruby_parser/dev/test/test_ruby_parser.rb:4198:in `assert_parse'
+ WORK/ruby_parser/dev/test/test_ruby_parser.rb:4221:in `assert_parse_line'
+ WORK/ruby_parser/dev/test/test_ruby_parser.rb:4451:in `test_whatevs'
+ ```
+
+ For starters, we know the missing production is for `tBDOT2 arg`. It
+ is currently blowing up because it is getting `tDOT2` and simply
+ doesn't know what to do with it, so it raises the error. As the diff
+ suggests, that's the wrong token to begin with, so it is probably time
+ to also create a lexer test:
+
+ ```
+ def test_yylex_bdot2
+ assert_lex3("..42",
+ s(:dot2, nil, s(:lit, 42)),
+
+ :tBDOT2, "..", EXPR_BEG,
+ :tINTEGER, "42", EXPR_NUM)
+
+ flunk "not done yet"
+ end
+ ```
+
+ This one is mostly speculative at this point. It says "if we're lexing
+ this string, we should get this sexp if we fully parse it, and the
+ lexical stream should look like this"... That last bit is mostly made
+ up at this point. Sometimes I don't know exactly what expression state
+ things should be in until I start really digging in.
+
+ At this point, I have 2 failing tests that are directing me in the
+ right direction. It's now a matter of digging through
+ `compare/parse26.y` to see how the lexer differs and implementing
+ it...
+
+ But this is a good start to the doco for now. I'll add more later.
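The workflow in the added debugging.md text compares ruby_parser against MRI. One quick way to see what MRI itself does with `..42` is the standard library's Ripper, sketched below under the assumption of Ruby 2.7 or newer. Ripper's event names (`:on_op`, `:on_int`) are MRI's own and differ from the `tBDOT2` / `tINTEGER` token names used in the lexer test above, so treat this as a sanity check of the input rather than a picture of ruby_parser's output.

```ruby
require "ripper"

source = "..42"

# Ripper.lex yields [[line, column], event, text, state] entries;
# keep only the event and the matched text.
tokens = Ripper.lex(source).map { |(_pos, event, text, _state)| [event, text] }

# Ripper.sexp returns MRI's own nested-array parse tree for the same input.
tree = Ripper.sexp(source)

p tokens
p tree
```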