ruby_parser 3.15.0 → 3.18.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: b1172d3be0aa30a6d959bd6ab265a6e05b391923196ec30236705637f8b05cc0
4
- data.tar.gz: bedbe7c23eb256413a34d9b5dc67d975c70a8757edea6b22cebfcb82fb42d412
3
+ metadata.gz: 36780d9d3244dd62d13430987076d5e81ae2e536d6d2bfd259f8a612da3d94cc
4
+ data.tar.gz: bec4b32e7f7a8d9ae8e3202f30230f351a2fedc6e2ac4e984260486dbb7529c6
5
5
  SHA512:
6
- metadata.gz: 4e3174ada892182f6f60497feaa74668517e71bcffd44b10590f9c2fc153ba1744dbb8afd35493d1f7cb96598795e30505c922c2e53e441cbb1a7a646e9005ef
7
- data.tar.gz: 2cab3beeefb45cba366c482e862bb2594169560f8920d39b28a7a86ad5c6c6582393fe91e082cd6387314f36ae5e660d8e18c2de8a4ddcefce9d40600593f795
6
+ metadata.gz: f28d02d2b14687e365bab3a353348b93a9df993be2d1afd3f2783b5b97ca016a6ca2f834ef61ebb4a4eae3decc38e1351349679f951f901bef09c25f23d44322
7
+ data.tar.gz: 276ecce4db1f72ed2ce0d276679e65419225a46b885d0050aa7ba6382b45033ccd24b5006a0d382f0aecdbb6c5a5fd93e3e826adeafccc3c47ee051b76772eee
checksums.yaml.gz.sig CHANGED
Binary file
data/History.rdoc CHANGED
@@ -1,3 +1,104 @@
1
+ === 3.18.0 / 2021-10-27
2
+
3
+ Holy crap... 58 commits! 2.7 and 3.0 are feature complete. Strings
4
+ & heredocs have been rewritten.
5
+
6
+ * 9 major enhancements:
7
+
8
+ * !!! Rewrote lexer (and friends) for strings, heredocs, and %*[] constructs.
9
+ * Massive overhaul on line numbers.
10
+ * Freeze input! Finally!!! No more modifying the input string for heredocs.
11
+ * Overhauled RPStringScanner. Removed OLD compatibility methods!
12
+ * Removed Sexp methods: value, to_sym, add, add_all, node_type, values.
13
+ * value moved to sexp_processor.
14
+ * Removed String#grep monkey-patch.
15
+ * Removed String#lineno monkey-patch.
16
+ * Removed string_to_pos, charpos, etc hacks for ancient ruby versions.
17
+ * Removed unread_many... NO! NO EDITING THE INPUT STRING!
18
+
19
+ * 31 minor enhancements:
20
+
21
+ * 2.7/3.0: many more pattern edge cases
22
+ * 2.7: Added `mlhs = rhs rescue expr`
23
+ * 2.7: refactored destructured args (`|(k,v)|`) and unfactored(?!) case_body/args.
24
+ * 3.0: excessed_comma
25
+ * 3.0: finished most everything: endless methods, patterns, etc.
26
+ * 3.0: refactored / added new pattern changes
27
+ * Added RubyLexer#in_heredoc? (ie, is there old_ss ?)
28
+ * Added RubyLexer#old_ss and old_lineno and removed much of SSStack(ish).
29
+ * Added Symbol#end_with? when necessary
30
+ * Added TALLY and DEBUG options for ss.getch and ss.scan
31
+ * Added ignore_body_comments to make parser productions more clear.
32
+ * Added support for no_kwarg (eg `def f(**nil)`).
33
+ * Added support for no_kwarg in blocks (eg `f { |**nil| }`).
34
+ * Augmented generated parser files to have frozen_string_literal comments and fixed tests.
35
+ * Broke out 3.0 parser into its own to ease development.
36
+ * Bumped dependencies on sexp_processor and oedipus_lex.
37
+ * Clean generated 3.x files.
38
+ * Extracted all string scanner methods to their own module.
39
+ * Fixed some precedence decls.
40
+ * Implemented most of pattern matching for 2.7+.
41
+ * Improve lex_state= to report location in verbose debug mode.
42
+ * Made it easier to debug with a particular version of ruby via rake.
43
+ * Make sure ripper uses the same version of ruby we specified.
44
+ * Moved all string/heredoc/etc code to ruby_lexer_strings.rb
45
+ * Remove warning from newer bisons.
46
+ * Sprinkled in some frozen_string_literal, but mostly helped by oedipus bump.
47
+ * Switch to comparing against ruby binary since ripper is buggy.
48
+ * bugs task should try both bug*.rb and bad*.rb.
49
+ * endless methods
50
+ * f_any_kwrest refactoring.
51
+ * refactored defn/defs
52
+
53
+ * 15 bug fixes:
54
+
55
+ * Cleaned a bunch of old hacks. Initializing RubyLexer w/ Parser is cleaner now.
56
+ * Corrected some lex_state errors in process_token_keyword.
57
+ * Fixed ancient ruby2 change (use #lines) in ruby_parse_extract_error.
58
+ * Fixed bug where else without rescue only raises on 2.6+
59
+ * Fixed caller for getch and scan when DEBUG=1
60
+ * Fixed comments in the middle of message cascades.
61
+ * Fixed differences w/ symbol productions against ruby 2.7.
62
+ * Fixed dsym to use string_contents production.
63
+ * Fixed error in bdot2/3 in some edge cases. Fixed p_alt line.
64
+ * Fixed heredoc dedenting in the presence of empty lines. (mvz)
65
+ * Fixed some leading whitespace / comment processing
66
+ * Fixed up how class/module/defn/defs comments were collected.
67
+ * Overhauled ripper.rb to deal with buggy ripper w/ yydebug.
68
+ * Removed dsym from literal.
69
+ * Removed tUBANG lexeme but kept it distinct as a method name (eg: `def !@`).
70
+
71
+ === 3.17.0 / 2021-08-03
72
+
73
+ * 1 minor enhancement:
74
+
75
+ * Added support for arg forwarding (eg `def f(...); m(...); end`) (presidentbeef)
76
+
77
+ === 3.16.0 / 2021-05-15
78
+
79
+ * 1 major enhancement:
80
+
81
+ * Added tentative 3.0 support.
82
+
83
+ * 3 minor enhancements:
84
+
85
+ * Added lexing for "beginless range" (bdots).
86
+ * Added parsing for bdots.
87
+ * Updated rake compare task to download xz files, bumped versions, etc
88
+
89
+ * 4 bug fixes:
90
+
91
+ * Bump rake dependency to >= 10, < 15. (presidentbeef)
92
+ * Bump sexp_processor dependency to 4.15.1+. (pravi)
93
+ * Fixed minor state mismatch at the end of parsing to make diffing a little cleaner.
94
+ * Fixed normalizer to deal with new bison token syntax
95
+
96
+ === 3.15.1 / 2021-01-10
97
+
98
+ * 1 bug fix:
99
+
100
+ * Bumped ruby version to include < 4 (trunk).
101
+
1
102
  === 3.15.0 / 2020-08-31
2
103
 
3
104
  * 1 major enhancement:
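For readers unfamiliar with the Ruby 2.7/3.0 syntax named in the changelog entries above, here is a short plain-Ruby sketch of four of the constructs (no_kwarg, argument forwarding, endless methods, beginless ranges). The method names are made up for illustration, and it assumes Ruby 3.0 or newer. It shows what the syntax means in MRI, not how ruby_parser represents it.

```ruby
# Hypothetical method names; requires Ruby >= 3.0.

# no_kwarg: `**nil` declares that the method accepts no keyword arguments.
def no_keywords(a, **nil)
  a
end

# Argument forwarding: `...` passes every argument through to another call.
def target(*args, **opts)
  [args, opts]
end

def forwarder(...)
  target(...)
end

# Endless method definition (3.0).
def square(x) = x * x

# Beginless range (2.7): a range with no lower bound.
small = (..42)

p no_keywords(1)
p forwarder(1, 2, k: 3)
p square(7)
p small.begin
p small.cover?(-1000)
```

Each of these is a form the parser has to accept before a file that uses it can be turned into a sexp at all.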
data/Manifest.txt CHANGED
@@ -7,6 +7,7 @@ bin/ruby_parse
7
7
  bin/ruby_parse_extract_error
8
8
  compare/normalize.rb
9
9
  debugging.md
10
+ gauntlet.md
10
11
  lib/.document
11
12
  lib/rp_extensions.rb
12
13
  lib/rp_stringscanner.rb
@@ -26,9 +27,13 @@ lib/ruby26_parser.rb
26
27
  lib/ruby26_parser.y
27
28
  lib/ruby27_parser.rb
28
29
  lib/ruby27_parser.y
30
+ lib/ruby30_parser.rb
31
+ lib/ruby30_parser.y
32
+ lib/ruby3_parser.yy
29
33
  lib/ruby_lexer.rb
30
34
  lib/ruby_lexer.rex
31
35
  lib/ruby_lexer.rex.rb
36
+ lib/ruby_lexer_strings.rb
32
37
  lib/ruby_parser.rb
33
38
  lib/ruby_parser.yy
34
39
  lib/ruby_parser_extras.rb
data/README.rdoc CHANGED
@@ -32,6 +32,7 @@ Tested against 801,039 files from the latest of all rubygems (as of 2013-05):
32
32
  * 1.8 parser is at 99.9739% accuracy, 3.651 sigma
33
33
  * 1.9 parser is at 99.9940% accuracy, 4.013 sigma
34
34
  * 2.0 parser is at 99.9939% accuracy, 4.008 sigma
35
+ * 2.6 parser is at 99.9972% accuracy, 4.191 sigma
35
36
 
36
37
  == FEATURES/PROBLEMS:
37
38
 
data/Rakefile CHANGED
@@ -14,25 +14,37 @@ Hoe.add_include_dirs "../../minitest/dev/lib"
14
14
  Hoe.add_include_dirs "../../oedipus_lex/dev/lib"
15
15
 
16
16
  V2 = %w[20 21 22 23 24 25 26 27]
17
- V2.replace [V2.last] if ENV["FAST"] # HACK
17
+ V3 = %w[30]
18
+
19
+ VERS = V2 + V3
20
+
21
+ ENV["FAST"] = VERS.last if ENV["FAST"] && !VERS.include?(ENV["FAST"])
22
+ VERS.replace [ENV["FAST"]] if ENV["FAST"]
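The two `FAST` lines above can be read as a pure function: an unrecognized value falls back to the newest version, and any set value narrows the list to a single entry. A minimal sketch, where `select_versions` is a hypothetical name that does not appear in the Rakefile (the real code mutates `ENV` and `VERS` in place):

```ruby
V2   = %w[20 21 22 23 24 25 26 27]
V3   = %w[30]
VERS = V2 + V3

# Hypothetical helper mirroring the two FAST lines without touching ENV.
def select_versions(fast, vers = VERS)
  return vers unless fast                     # FAST unset: keep every version
  fast = vers.last unless vers.include?(fast) # e.g. FAST=1 means "newest only"
  [fast]
end

p select_versions(nil)
p select_versions("1")
p select_versions("26")
```

Under this reading, `FAST=26` would build only the 2.6 parser, while the old behaviour (`FAST=1`) still selects the last entry.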
18
23
 
19
24
  Hoe.spec "ruby_parser" do
20
25
  developer "Ryan Davis", "ryand-ruby@zenspider.com"
21
26
 
22
27
  license "MIT"
23
28
 
24
- dependency "sexp_processor", "~> 4.9"
25
- dependency "rake", "< 11", :developer
26
- dependency "oedipus_lex", "~> 2.5", :developer
29
+ dependency "sexp_processor", "~> 4.16"
30
+ dependency "rake", [">= 10", "< 15"], :developer
31
+ dependency "oedipus_lex", "~> 2.6", :developer
32
+
33
+ # NOTE: Ryan!!! Stop trying to fix this dependency! Isolate just
34
+ # can't handle having a faux-gem half-installed! Stop! Just `gem
35
+ # install racc` and move on. Revisit this ONLY once racc-compiler
36
+ # gets split out.
27
37
 
28
- require_ruby_version [">= 2.1", "< 3.1"]
38
+ dependency "racc", "~> 1.5", :developer
39
+
40
+ require_ruby_version [">= 2.1", "< 4"]
29
41
 
30
42
  if plugin? :perforce then # generated files
31
- V2.each do |n|
43
+ VERS.each do |n|
32
44
  self.perforce_ignore << "lib/ruby#{n}_parser.rb"
33
45
  end
34
46
 
35
- V2.each do |n|
47
+ VERS.each do |n|
36
48
  self.perforce_ignore << "lib/ruby#{n}_parser.y"
37
49
  end
38
50
 
@@ -46,6 +58,23 @@ Hoe.spec "ruby_parser" do
46
58
  end
47
59
  end
48
60
 
61
+ def maybe_add_to_top path, string
62
+ file = File.read path
63
+
64
+ return if file.start_with? string
65
+
66
+ warn "Altering top of #{path}"
67
+ tmp_path = "#{path}.tmp"
68
+ File.open(tmp_path, "w") do |f|
69
+ f.puts string
70
+ f.puts
71
+
72
+ f.write file
73
+ # TODO: make this deal with encoding comments properly?
74
+ end
75
+ File.rename tmp_path, path
76
+ end
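To see what `maybe_add_to_top` does in isolation, here is a self-contained version run against a temporary file. The method body follows the diff above; the temporary directory, file name, and file contents are illustrative assumptions.

```ruby
require "tmpdir"

# Prepend `string` (plus a blank line) to the file at `path`, unless the
# file already starts with it. The new content is written to a temp file
# that is then renamed over the original.
def maybe_add_to_top(path, string)
  file = File.read path

  return if file.start_with? string

  warn "Altering top of #{path}"
  tmp_path = "#{path}.tmp"
  File.open(tmp_path, "w") do |f|
    f.puts string
    f.puts
    f.write file
  end
  File.rename tmp_path, path
end

MAGIC = "# frozen_string_literal: true"

Dir.mktmpdir do |dir|
  path = File.join(dir, "ruby30_parser.rb") # illustrative file name
  File.write path, "class Ruby30Parser; end\n"

  maybe_add_to_top path, MAGIC
  maybe_add_to_top path, MAGIC # second call returns early and changes nothing

  puts File.read(path)
end
```

The early return is what makes the `:parser` task safe to run repeatedly over generated files.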
77
+
49
78
  V2.each do |n|
50
79
  file "lib/ruby#{n}_parser.y" => "lib/ruby_parser.yy" do |t|
51
80
  cmd = 'unifdef -tk -DV=%s -UDEAD %s > %s || true' % [n, t.source, t.name]
@@ -55,8 +84,23 @@ V2.each do |n|
55
84
  file "lib/ruby#{n}_parser.rb" => "lib/ruby#{n}_parser.y"
56
85
  end
57
86
 
87
+ V3.each do |n|
88
+ file "lib/ruby#{n}_parser.y" => "lib/ruby3_parser.yy" do |t|
89
+ cmd = 'unifdef -tk -DV=%s -UDEAD %s > %s || true' % [n, t.source, t.name]
90
+ sh cmd
91
+ end
92
+
93
+ file "lib/ruby#{n}_parser.rb" => "lib/ruby#{n}_parser.y"
94
+ end
95
+
58
96
  file "lib/ruby_lexer.rex.rb" => "lib/ruby_lexer.rex"
59
97
 
98
+ task :parser do |t|
99
+ t.prerequisite_tasks.grep(Rake::FileTask).select(&:already_invoked).each do |f|
100
+ maybe_add_to_top f.name, "# frozen_string_literal: true"
101
+ end
102
+ end
103
+
60
104
  task :generate => [:lexer, :parser]
61
105
 
62
106
  task :clean do
@@ -65,6 +109,7 @@ task :clean do
65
109
  Dir["coverage.info"] +
66
110
  Dir["coverage"] +
67
111
  Dir["lib/ruby2*_parser.y"] +
112
+ Dir["lib/ruby3*_parser.y"] +
68
113
  Dir["lib/*.output"])
69
114
  end
70
115
 
@@ -92,7 +137,7 @@ end
92
137
 
93
138
  def dl v
94
139
  dir = v[/^\d+\.\d+/]
95
- url = "https://cache.ruby-lang.org/pub/ruby/#{dir}/ruby-#{v}.tar.bz2"
140
+ url = "https://cache.ruby-lang.org/pub/ruby/#{dir}/ruby-#{v}.tar.xz"
96
141
  path = File.basename url
97
142
  unless File.exist? path then
98
143
  system "curl -O #{url}"
@@ -104,7 +149,7 @@ def ruby_parse version
104
149
  rp_txt = "rp#{v}.txt"
105
150
  mri_txt = "mri#{v}.txt"
106
151
  parse_y = "parse#{v}.y"
107
- tarball = "ruby-#{version}.tar.bz2"
152
+ tarball = "ruby-#{version}.tar.xz"
108
153
  ruby_dir = "ruby-#{version}"
109
154
  diff = "diff#{v}.diff"
110
155
  rp_out = "lib/ruby#{v}_parser.output"
@@ -124,15 +169,18 @@ def ruby_parse version
124
169
  end
125
170
  end
126
171
 
172
+ desc "fetch all tarballs"
173
+ task :fetch => c_tarball
174
+
127
175
  file c_parse_y => c_tarball do
128
176
  in_compare do
129
177
  extract_glob = case version
130
- when /2\.7/
178
+ when /2\.7|3\.0/
131
179
  "{id.h,parse.y,tool/{id2token.rb,lib/vpath.rb}}"
132
180
  else
133
181
  "{id.h,parse.y,tool/{id2token.rb,vpath.rb}}"
134
182
  end
135
- system "tar yxf #{tarball} #{ruby_dir}/#{extract_glob}"
183
+ system "tar Jxf #{tarball} #{ruby_dir}/#{extract_glob}"
136
184
 
137
185
  Dir.chdir ruby_dir do
138
186
  if File.exist? "tool/id2token.rb" then
@@ -141,15 +189,20 @@ def ruby_parse version
141
189
  sh "expand parse.y > ../#{parse_y}"
142
190
  end
143
191
 
144
- ruby "-pi", "-e", 'gsub(/^%define\s+api\.pure/, "%pure-parser")', "../#{parse_y}"
192
+ ruby "-pi", "-e", 'gsub(/^%pure-parser/, "%define api.pure")', "../#{parse_y}"
145
193
  end
146
194
  sh "rm -rf #{ruby_dir}"
147
195
  end
148
196
  end
149
197
 
198
+ bison = Dir["/opt/homebrew/opt/bison/bin/bison",
199
+ "/usr/local/opt/bison/bin/bison",
200
+ `which bison`.chomp,
201
+ ].first
202
+
150
203
  file c_mri_txt => [c_parse_y, normalize] do
151
204
  in_compare do
152
- sh "bison -r all #{parse_y}"
205
+ sh "#{bison} -r all #{parse_y}"
153
206
  sh "./normalize.rb parse#{v}.output > #{mri_txt}"
154
207
  rm ["parse#{v}.output", "parse#{v}.tab.c"]
155
208
  end
@@ -190,17 +243,50 @@ def ruby_parse version
190
243
  end
191
244
  end
192
245
 
246
+ task :versions do
247
+ require "open-uri"
248
+ require "net/http" # avoid require issues in threads
249
+ require "net/https"
250
+
251
+ versions = %w[ 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 3.0 ]
252
+
253
+ base_url = "https://cache.ruby-lang.org/pub/ruby"
254
+
255
+ class Array
256
+ def human_sort
257
+ sort_by { |item| item.to_s.split(/(\d+)/).map { |e| [e.to_i, e] } }
258
+ end
259
+ end
260
+
261
+ versions = versions.map { |ver|
262
+ Thread.new {
263
+ URI
264
+ .parse("#{base_url}/#{ver}/")
265
+ .read
266
+ .scan(/ruby-\d+\.\d+\.\d+[-\w.]*?.tar.gz/)
267
+ .reject { |s| s =~ /-(?:rc|preview)\d/ }
268
+ .human_sort
269
+ .last
270
+ .delete_prefix("ruby-")
271
+ .delete_suffix ".tar.gz"
272
+ }
273
+ }.map(&:value).sort
274
+
275
+ puts versions.map { |v| "ruby_parse %p" % [v] }
276
+ end
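The `human_sort` helper in the `versions` task above is what lets `2.1.10` sort after `2.1.9`. A small sketch of the same idea as a standalone method (written as a plain method here rather than an `Array` monkey-patch, and with made-up input):

```ruby
# Natural ("human") sort: split each item into digit and non-digit runs so
# that numeric parts compare as integers rather than as text.
def human_sort(items)
  items.sort_by { |item| item.to_s.split(/(\d+)/).map { |e| [e.to_i, e] } }
end

tarballs = %w[ruby-2.1.9.tar.gz ruby-2.1.10.tar.gz ruby-2.1.2.tar.gz]

p tarballs.sort.last        # plain string sort picks 2.1.9
p human_sort(tarballs).last # natural sort picks 2.1.10
```

Non-digit runs convert to `0` via `to_i`, so the second element of each pair breaks ties between them.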
277
+
193
278
  ruby_parse "2.0.0-p648"
194
- ruby_parse "2.1.9"
195
- ruby_parse "2.2.9"
279
+ ruby_parse "2.1.10"
280
+ ruby_parse "2.2.10"
196
281
  ruby_parse "2.3.8"
197
- ruby_parse "2.4.9"
198
- ruby_parse "2.5.8"
199
- ruby_parse "2.6.6"
200
- ruby_parse "2.7.1"
282
+ ruby_parse "2.4.10"
283
+ ruby_parse "2.5.9"
284
+ ruby_parse "2.6.8"
285
+ ruby_parse "2.7.4"
286
+ ruby_parse "3.0.2"
201
287
 
202
288
  task :debug => :isolate do
203
- ENV["V"] ||= V2.last
289
+ ENV["V"] ||= VERS.last
204
290
  Rake.application[:parser].invoke # this way we can have DEBUG set
205
291
  Rake.application[:lexer].invoke # this way we can have DEBUG set
206
292
 
@@ -215,7 +301,7 @@ task :debug => :isolate do
215
301
  time = (ENV["RP_TIMEOUT"] || 10).to_i
216
302
 
217
303
  n = ENV["BUG"]
218
- file = (n && "bug#{n}.rb") || ENV["F"] || ENV["FILE"] || "bug.rb"
304
+ file = (n && "bug#{n}.rb") || ENV["F"] || ENV["FILE"] || "debug.rb"
219
305
  ruby = ENV["R"] || ENV["RUBY"]
220
306
 
221
307
  if ruby then
@@ -238,19 +324,22 @@ task :debug => :isolate do
238
324
  end
239
325
 
240
326
  task :debug3 do
241
- file = ENV["F"] || "bug.rb"
242
- verbose = ENV["V"] ? "-v" : ""
327
+ file = ENV["F"] || "debug.rb"
328
+ version = ENV["V"] || ""
329
+ verbose = ENV["VERBOSE"] ? "-v" : ""
243
330
  munge = "./tools/munge.rb #{verbose}"
244
331
 
245
332
  abort "Need a file to parse, via: F=path.rb" unless file
246
333
 
247
334
  ENV.delete "V"
248
335
 
249
- sh "ruby -v"
250
- sh "ruby -y #{file} 2>&1 | #{munge} > tmp/ruby"
251
- sh "./tools/ripper.rb -d #{file} | #{munge} > tmp/rip"
336
+ ruby = "ruby#{version}"
337
+
338
+ sh "#{ruby} -v"
339
+ sh "#{ruby} -y #{file} 2>&1 | #{munge} > tmp/ruby"
340
+ sh "#{ruby} ./tools/ripper.rb -d #{file} | #{munge} > tmp/rip"
252
341
  sh "rake debug F=#{file} DEBUG=1 2>&1 | #{munge} > tmp/rp"
253
- sh "diff -U 999 -d tmp/{rip,rp}"
342
+ sh "diff -U 999 -d tmp/{ruby,rp}"
254
343
  end
255
344
 
256
345
  task :cmp do
@@ -262,16 +351,25 @@ task :cmp3 do
262
351
  end
263
352
 
264
353
  task :extract => :isolate do
265
- ENV["V"] ||= V2.last
354
+ ENV["V"] ||= VERS.last
266
355
  Rake.application[:parser].invoke # this way we can have DEBUG set
267
356
 
268
- file = ENV["F"] || ENV["FILE"]
357
+ file = ENV["F"] || ENV["FILE"] || abort("Need to provide F=<path>")
269
358
 
270
359
  ruby "-Ilib", "bin/ruby_parse_extract_error", file
271
360
  end
272
361
 
362
+ task :parse => :isolate do
363
+ ENV["V"] ||= VERS.last
364
+ Rake.application[:parser].invoke # this way we can have DEBUG set
365
+
366
+ file = ENV["F"] || ENV["FILE"] || abort("Need to provide F=<path>")
367
+
368
+ ruby "-Ilib", "bin/ruby_parse", file
369
+ end
370
+
273
371
  task :bugs do
274
- sh "for f in bug*.rb ; do #{Gem.ruby} -S rake debug F=$f && rm $f ; done"
372
+ sh "for f in bug*.rb bad*.rb ; do #{Gem.ruby} -S rake debug F=$f && rm $f ; done"
275
373
  end
276
374
 
277
375
  # vim: syntax=Ruby
data/bin/ruby_parse_extract_error CHANGED
@@ -21,7 +21,7 @@ class RubyParser
21
21
  src = ss.string
22
22
  pre_error = src[0...ss.pos]
23
23
 
24
- defs = pre_error.grep(/^ *(?:def|it)/)
24
+ defs = pre_error.lines.grep(/^ *(?:def|it)/)
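The one-line change above matters because `String` has no `grep` of its own. The old call presumably relied on the `String#grep` monkey-patch that the 3.18.0 changelog lists as removed. A small sketch of the replacement idiom, using made-up source text:

```ruby
# Made-up stand-in for the source text that precedes the parse error.
pre_error = <<~RUBY
  class Example
    def ok
      1
    end

    def broken
      foo(
RUBY

# String#lines returns an Array of lines; Enumerable#grep then filters it.
defs = pre_error.lines.grep(/^ *(?:def|it)/)

p defs
p defs.last
```

`defs.last` is then the nearest `def` (or `it`) before the error position, which is what the `raise` on the following line guards.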
25
25
 
26
26
  raise "can't figure out where the bad code starts" unless defs.last
27
27
 
data/compare/normalize.rb CHANGED
@@ -84,6 +84,7 @@ def munge s
84
84
 
85
85
  "' '", "tSPACE", # needs to be later to avoid bad hits
86
86
 
87
+ "%empty", "none", # newer bison
87
88
  "/* empty */", "none",
88
89
  /^\s*$/, "none",
89
90
 
@@ -140,6 +141,7 @@ def munge s
140
141
  '"do for block"', "kDO_BLOCK",
141
142
  '"do for condition"', "kDO_COND",
142
143
  '"do for lambda"', "kDO_LAMBDA",
144
+ "tLABEL", "kLABEL",
143
145
 
144
146
  # UGH
145
147
  "k_LINE__", "k__LINE__",
@@ -155,7 +157,10 @@ def munge s
155
157
  /\"(\w+) \(?modifier\)?\"/, proc { |x| "k#{$1.upcase}_MOD" },
156
158
  /\"(\w+)\"/, proc { |x| "k#{$1.upcase}" },
157
159
 
158
- /@(\d+)(\s+|$)/, "",
160
+ /\$?@(\d+)(\s+|$)/, "", # newer bison
161
+
162
+ # TODO: remove for 3.0 work:
163
+ "lex_ctxt ", "" # 3.0 production that's mostly noise right now
159
164
  ]
160
165
 
161
166
  renames.each_slice(2) do |(a, b)|
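The `renames` table in `munge` is a flat array consumed two entries at a time (pattern, replacement). The diff does not show the loop body, so the following is only a sketch of how such a table could be applied, assuming plain `gsub` for string replacements and block-form `gsub` for procs. `apply_renames` and the three-pair table are hypothetical and cover only entries added in this release.

```ruby
# Hypothetical subset of the rename table: pattern, replacement, pattern, ...
RENAMES = [
  "%empty",           "none",   # newer bison spelling of an empty rule
  "tLABEL",           "kLABEL",
  /\$?@(\d+)(\s+|$)/, "",       # drop generated symbols such as $@1 or @12
]

# Assumed application loop: strings and regexps both go through gsub;
# a Proc replacement is called with the matched text.
def apply_renames(s, renames = RENAMES)
  renames.each_slice(2).inject(s) do |acc, (a, b)|
    b.is_a?(Proc) ? acc.gsub(a) { |m| b.call(m) } : acc.gsub(a, b)
  end
end

p apply_renames("%empty")
p apply_renames("$@1 tLABEL arg")
p apply_renames("@12 expr")
```

Order matters in a table like this, which is why the original carries comments such as "needs to be later to avoid bad hits".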
@@ -174,7 +179,7 @@ ARGF.each_line do |line|
174
179
 
175
180
  case line.strip
176
181
  when /^$/ then
177
- when /^(\d+) (\$?\w+): (.*)/ then # yacc
182
+ when /^(\d+) (\$?[@\w]+): (.*)/ then # yacc
178
183
  rule = $2
179
184
  order << rule unless rules.has_key? rule
180
185
  rules[rule] << munge($3)
@@ -199,7 +204,7 @@ ARGF.each_line do |line|
199
204
  when /^\cL/ then # byacc
200
205
  break
201
206
  else
202
- warn "unparsed: #{$.}: #{line.chomp}"
207
+ warn "unparsed: #{$.}: #{line.strip.inspect}"
203
208
  end
204
209
  end
205
210
 
data/debugging.md CHANGED
@@ -55,3 +55,136 @@ From there? Good luck. I'm currently trying to backtrack from rule
55
55
  reductions to state change differences. I'd like to figure out a way
56
56
  to go from this sort of diff to a reasonable test that checks state
57
57
  changes but I don't have that set up at this point.
58
+
59
+ ## Adding New Grammar Productions
60
+
61
+ Ruby adds stuff to the parser ALL THE TIME. It's actually hard to keep
62
+ up with, but I've added some tools and shown what a typical workflow
63
+ looks like. Let's say you want to add ruby 2.7's "beginless range" (eg
64
+ `..42`).
65
+
66
+ Whenever there's a language feature missing, I start with comparing
67
+ the parse trees between MRI and RP:
68
+
69
+ ### Structural Comparing
70
+
71
+ There's a bunch of rake tasks `compare27`, `compare26`, etc that try
72
+ to normalize and diff MRI's parse.y parse tree (just the structure of
73
+ the tree in yacc) to ruby\_parser's parse tree (racc). It's the first
74
+ thing I do when I'm adding a new version. Stub out all the version
75
+ differences, and then start to diff the structure and move
76
+ ruby\_parser towards the new changes.
77
+
78
+ Some differences are just gonna be there... but here's an example of a
79
+ real diff between MRI 2.7 and ruby_parser as of today:
80
+
81
+ ```diff
82
+ arg tDOT3 arg
83
+ arg tDOT2
84
+ arg tDOT3
85
+ - tBDOT2 arg
86
+ - tBDOT3 arg
87
+ arg tPLUS arg
88
+ arg tMINUS arg
89
+ arg tSTAR2 arg
90
+ ```
91
+
92
+ This is a new language feature that ruby_parser doesn't handle yet.
93
+ It's in MRI (the left hand side of the diff) but not ruby\_parser (the
94
+ right hand side) so it is a `-` or missing line.
95
+
96
+ Some other diffs will have both `+` and `-` lines. That usually
97
+ happens when MRI has been refactoring the grammar. Sometimes I choose
98
+ to adapt those refactorings and sometimes it starts to get too
99
+ difficult to maintain multiple versions of ruby parsing in a single
100
+ file.
101
+
102
+ But! This structural comparing is always a place you should look when
103
+ ruby_parser is failing to parse something. Maybe it just hasn't been
104
+ implemented yet and the easiest place to look is the diff.
105
+
106
+ ### Starting Test First
107
+
108
+ The next thing I do is to add a parser test to cover that feature. I
109
+ usually start with the parser and work backwards towards the lexer as
110
+ needed, as I find it structures things properly and keeps things goal
111
+ oriented.
112
+
113
+ So, make a new parser test, usually in the versioned section of the
114
+ parser tests.
115
+
116
+ ```ruby
117
+ def test_beginless2
118
+ rb = "..10\n; ..a\n; c"
119
+ pt = s(:block,
120
+ s(:dot2, nil, s(:lit, 10).line(1)).line(1),
121
+ s(:dot2, nil, s(:call, nil, :a).line(2)).line(2),
122
+ s(:call, nil, :c).line(3)).line(1)
123
+
124
+ assert_parse_line rb, pt, 1
125
+
126
+ flunk "not done yet"
127
+ end
128
+ ```
129
+
130
+ (In this case I copied and modified the tests for open ranges from 2.6)
131
+ and run it to get my first error:
132
+
133
+ ```
134
+ % rake N=/beginless/
135
+
136
+ ...
137
+
138
+ E
139
+
140
+ Finished in 0.021814s, 45.8421 runs/s, 0.0000 assertions/s.
141
+
142
+ 1) Error:
143
+ TestRubyParserV27#test_whatevs:
144
+ Racc::ParseError: (string):1 :: parse error on value ".." (tDOT2)
145
+ GEMS/2.7.0/gems/racc-1.5.0/lib/racc/parser.rb:538:in `on_error'
146
+ WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1304:in `on_error'
147
+ (eval):3:in `_racc_do_parse_c'
148
+ (eval):3:in `do_parse'
149
+ WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1329:in `block in process'
150
+ RUBY/lib/ruby/2.7.0/timeout.rb:95:in `block in timeout'
151
+ RUBY/lib/ruby/2.7.0/timeout.rb:33:in `block in catch'
152
+ RUBY/lib/ruby/2.7.0/timeout.rb:33:in `catch'
153
+ RUBY/lib/ruby/2.7.0/timeout.rb:33:in `catch'
154
+ RUBY/lib/ruby/2.7.0/timeout.rb:110:in `timeout'
155
+ WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1317:in `process'
156
+ WORK/ruby_parser/dev/test/test_ruby_parser.rb:4198:in `assert_parse'
157
+ WORK/ruby_parser/dev/test/test_ruby_parser.rb:4221:in `assert_parse_line'
158
+ WORK/ruby_parser/dev/test/test_ruby_parser.rb:4451:in `test_whatevs'
159
+ ```
160
+
161
+ For starters, we know the missing production is for `tBDOT2 arg`. It
162
+ is currently blowing up because it is getting `tDOT2` and simply
163
+ doesn't know what to do with it, so it raises the error. As the diff
164
+ suggests, that's the wrong token to begin with, so it is probably time
165
+ to also create a lexer test:
166
+
167
+ ```ruby
168
+ def test_yylex_bdot2
169
+ assert_lex3("..42",
170
+ s(:dot2, nil, s(:lit, 42)),
171
+
172
+ :tBDOT2, "..", EXPR_BEG,
173
+ :tINTEGER, "42", EXPR_NUM)
174
+
175
+ flunk "not done yet"
176
+ end
177
+ ```
178
+
179
+ This one is mostly speculative at this point. It says "if we're lexing
180
+ this string, we should get this sexp if we fully parse it, and the
181
+ lexical stream should look like this"... That last bit is mostly made
182
+ up at this point. Sometimes I don't know exactly what expression state
183
+ things should be in until I start really digging in.
184
+
185
+ At this point, I have 2 failing tests that are pointing me in the
186
+ right direction. It's now a matter of digging through
187
+ `compare/parse26.y` to see how the lexer differs and implementing
188
+ it...
189
+
190
+ But this is a good start to the doco for now. I'll add more later.
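As a rough cross-check on what MRI itself does with `..42`, the standard library's Ripper can lex and parse the same string. This is a sketch of the MRI side only: Ripper's event names (`:on_op`, `:on_int`) are not ruby_parser's token names (`tBDOT2`, `tINTEGER`), and it assumes Ruby 2.7 or newer so that beginless ranges parse. The changelog above also notes that ripper has been buggy in the author's experience, so treat the output as a hint rather than ground truth.

```ruby
require "ripper"

src = "..42"

# Each lexed entry starts with [[line, col], event, text, ...].
tokens = Ripper.lex(src).map { |tok| [tok[1], tok[2]] }
p tokens

# Ripper.sexp returns nil when the source does not parse.
p Ripper.sexp(src)

# The value itself is a range with no lower bound.
range = (..42)
p range.begin
p range.end
```

If the two tokens come back as an operator followed by an integer, the lexer test's expected stream (`tBDOT2` then `tINTEGER`) is at least the right shape.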