ruby_parser 3.15.1 → 3.18.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 0adcb0c06b7ee6d29360bd8dd9641dfcce867cdd7167f59bef5134c3092cb091
- data.tar.gz: ef51691d5a0e2f09a38a0c6d8bd8d3fccdebf19bd1c1ce41a99f761112a43c4a
+ metadata.gz: dad59e40690dfb746fea4399e13d05f97b43cf562c2559c67d1f9b5868da86c5
+ data.tar.gz: 0c14605ecf34929f6d4865c20f085b6ef2a35533a70e530f261f9d04f44998cb
  SHA512:
- metadata.gz: 85e6684d7a8a6207f1040ca97768d4792ba52e2f039eee5fa83047e8e5c83a7a913f98fc5384ac8b3387829078e294e22fb009163ac4eee272557cbed715849e
- data.tar.gz: 7e996bef3411d3f63dfd8e2af388bddb16dc8a4206bd309611aeda038feb4c66041fb2b7ba5136015d4b0453e34fdebf8c1f8d408d68dccb9031514ebff6f728
+ metadata.gz: 4fa69e55ac577e55fa2945ef9c135b47359a646e654bc8cf9f2ead973852adb97350cf67ac5da549592ea583f4824cd5f6930a167fa19cf3f914cb6d6888093d
+ data.tar.gz: 0c13eb02ffaf3b3c60ef358be2fa8f70b951e947170d3eeb8a94e5c37bec5dc8282376e4ab3f03de99e49b6e2aac2aa13169b31c30de1067e1a290c1fd65803a
checksums.yaml.gz.sig CHANGED
Binary file
data/History.rdoc CHANGED
@@ -1,3 +1,110 @@
+ === 3.18.1 / 2021-11-10
+
+ * 1 minor enhancement:
+
+ * All parser tests are now explicitly testing line numbers at every level.
+
+ * 3 bug fixes:
+
+ * Fixed endless method with noargs. (mitsuru)
+ * Fixed line numbers on some yield forms.
+ * Handle and clearly report if unifdef is missing.
+
+ === 3.18.0 / 2021-10-27
+
+ Holy crap... 58 commits! 2.7 and 3.0 are feature complete. Strings
+ & heredocs have been rewritten.
+
+ * 9 major enhancements:
+
+ * !!! Rewrote lexer (and friends) for strings, heredocs, and %*[] constructs.
+ * Massive overhaul on line numbers.
+ * Freeze input! Finally!!! No more modifying the input string for heredocs.
+ * Overhauled RPStringScanner. Removed OLD compatibility methods!
+ * Removed Sexp methods: value, to_sym, add, add_all, node_type, values.
+ * value moved to sexp_processor.
+ * Removed String#grep monkey-patch.
+ * Removed String#lineno monkey-patch.
+ * Removed string_to_pos, charpos, etc hacks for ancient ruby versions.
+ * Removed unread_many... NO! NO EDITING THE INPUT STRING!
+
+ * 31 minor enhancements:
+
+ * 2.7/3.0: many more pattern edge cases
+ * 2.7: Added `mlhs = rhs rescue expr`
+ * 2.7: refactored destructured args (`|(k,v)|`) and unfactored(?!) case_body/args.
+ * 3.0: excessed_comma
+ * 3.0: finished most everything: endless methods, patterns, etc.
+ * 3.0: refactored / added new pattern changes
+ * Added RubyLexer#in_heredoc? (ie, is there old_ss ?)
+ * Added RubyLexer#old_ss and old_lineno and removed much of SSStack(ish).
+ * Added Symbol#end_with? when necessary
+ * Added TALLY and DEBUG options for ss.getch and ss.scan
+ * Added ignore_body_comments to make parser productions more clear.
+ * Added support for no_kwarg (eg `def f(**nil)`).
+ * Added support for no_kwarg in blocks (eg `f { |**nil| }`).
+ * Augmented generated parser files to have frozen_string_literal comments and fixed tests.
+ * Broke out 3.0 parser into its own to ease development.
+ * Bumped dependencies on sexp_processor and oedipus_lex.
+ * Clean generated 3.x files.
+ * Extracted all string scanner methods to their own module.
+ * Fixed some precedence decls.
+ * Implemented most of pattern matching for 2.7+.
+ * Improve lex_state= to report location in verbose debug mode.
+ * Made it easier to debug with a particular version of ruby via rake.
+ * Make sure ripper uses the same version of ruby we specified.
+ * Moved all string/heredoc/etc code to ruby_lexer_strings.rb
+ * Remove warning from newer bisons.
+ * Sprinkled in some frozen_string_literal, but mostly helped by oedipus bump.
+ * Switch to comparing against ruby binary since ripper is buggy.
+ * bugs task should try both bug*.rb and bad*.rb.
+ * endless methods
+ * f_any_kwrest refactoring.
+ * refactored defn/defs
+
+ * 15 bug fixes:
+
+ * Cleaned a bunch of old hacks. Initializing RubyLexer w/ Parser is cleaner now.
+ * Corrected some lex_state errors in process_token_keyword.
+ * Fixed ancient ruby2 change (use #lines) in ruby_parse_extract_error.
+ * Fixed bug where else without rescue only raises on 2.6+
+ * Fixed caller for getch and scan when DEBUG=1
+ * Fixed comments in the middle of message cascades.
+ * Fixed differences w/ symbol productions against ruby 2.7.
+ * Fixed dsym to use string_contents production.
+ * Fixed error in bdot2/3 in some edge cases. Fixed p_alt line.
+ * Fixed heredoc dedenting in the presence of empty lines. (mvz)
+ * Fixed some leading whitespace / comment processing
+ * Fixed up how class/module/defn/defs comments were collected.
+ * Overhauled ripper.rb to deal with buggy ripper w/ yydebug.
+ * Removed dsym from literal.
+ * Removed tUBANG lexeme but kept it distinct as a method name (eg: `def !@`).
+
+ === 3.17.0 / 2021-08-03
+
+ * 1 minor enhancement:
+
+ * Added support for arg forwarding (eg `def f(...); m(...); end`) (presidentbeef)
+
+ === 3.16.0 / 2021-05-15
+
+ * 1 major enhancement:
+
+ * Added tentative 3.0 support.
+
+ * 3 minor enhancements:
+
+ * Added lexing for "beginless range" (bdots).
+ * Added parsing for bdots.
+ * Updated rake compare task to download xz files, bumped versions, etc
+
+ * 4 bug fixes:
+
+ * Bump rake dependency to >= 10, < 15. (presidentbeef)
+ * Bump sexp_processor dependency to 4.15.1+. (pravi)
+ * Fixed minor state mismatch at the end of parsing to make diffing a little cleaner.
+ * Fixed normalizer to deal with new bison token syntax
+
  === 3.15.1 / 2021-01-10
 
  * 1 bug fix:
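For readers unfamiliar with the syntax forms the changelog entries above name (beginless ranges, argument forwarding, `**nil`, endless methods), here is a minimal sketch in plain Ruby. It assumes Ruby 3.0 or later, does not use ruby_parser itself, and every method name in it is hypothetical.

```ruby
# Plain-Ruby illustrations of syntax named in the changelog.
# Assumes Ruby >= 3.0; all method names here are made up for the example.

# Beginless range ("bdots", 3.16.0): everything up to and including 42.
def small?(n)
  (..42).cover?(n)
end

# Helper that the forwarding example below delegates to.
def sum_all(*nums, scale: 1)
  nums.sum * scale
end

# Argument forwarding (3.17.0): `...` passes positional, keyword,
# and block arguments through unchanged.
def log_and_sum(...)
  [:sum, sum_all(...)]
end

# no_kwarg (3.18.0): `**nil` declares that no keyword arguments are accepted.
def positional_only(a, **nil)
  a
end

# Endless method with no arguments (the form fixed in 3.18.1).
def answer = 42
```

Calling `positional_only(1, b: 2)` raises `ArgumentError`, which is the behavior `**nil` exists to express.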
data/Manifest.txt CHANGED
@@ -7,6 +7,7 @@ bin/ruby_parse
  bin/ruby_parse_extract_error
  compare/normalize.rb
  debugging.md
+ gauntlet.md
  lib/.document
  lib/rp_extensions.rb
  lib/rp_stringscanner.rb
@@ -26,9 +27,13 @@ lib/ruby26_parser.rb
  lib/ruby26_parser.y
  lib/ruby27_parser.rb
  lib/ruby27_parser.y
+ lib/ruby30_parser.rb
+ lib/ruby30_parser.y
+ lib/ruby3_parser.yy
  lib/ruby_lexer.rb
  lib/ruby_lexer.rex
  lib/ruby_lexer.rex.rb
+ lib/ruby_lexer_strings.rb
  lib/ruby_parser.rb
  lib/ruby_parser.yy
  lib/ruby_parser_extras.rb
data/README.rdoc CHANGED
@@ -32,6 +32,7 @@ Tested against 801,039 files from the latest of all rubygems (as of 2013-05):
  * 1.8 parser is at 99.9739% accuracy, 3.651 sigma
  * 1.9 parser is at 99.9940% accuracy, 4.013 sigma
  * 2.0 parser is at 99.9939% accuracy, 4.008 sigma
+ * 2.6 parser is at 99.9972% accuracy, 4.191 sigma
 
  == FEATURES/PROBLEMS:
 
data/Rakefile CHANGED
@@ -14,25 +14,37 @@ Hoe.add_include_dirs "../../minitest/dev/lib"
  Hoe.add_include_dirs "../../oedipus_lex/dev/lib"
 
  V2 = %w[20 21 22 23 24 25 26 27]
- V2.replace [V2.last] if ENV["FAST"] # HACK
+ V3 = %w[30]
+
+ VERS = V2 + V3
+
+ ENV["FAST"] = VERS.last if ENV["FAST"] && !VERS.include?(ENV["FAST"])
+ VERS.replace [ENV["FAST"]] if ENV["FAST"]
 
  Hoe.spec "ruby_parser" do
  developer "Ryan Davis", "ryand-ruby@zenspider.com"
 
  license "MIT"
 
- dependency "sexp_processor", "~> 4.9"
- dependency "rake", "< 11", :developer
- dependency "oedipus_lex", "~> 2.5", :developer
+ dependency "sexp_processor", "~> 4.16"
+ dependency "rake", [">= 10", "< 15"], :developer
+ dependency "oedipus_lex", "~> 2.6", :developer
+
+ # NOTE: Ryan!!! Stop trying to fix this dependency! Isolate just
+ # can't handle having a faux-gem half-installed! Stop! Just `gem
+ # install racc` and move on. Revisit this ONLY once racc-compiler
+ # gets split out.
+
+ dependency "racc", "~> 1.5", :developer
 
  require_ruby_version [">= 2.1", "< 4"]
 
  if plugin? :perforce then # generated files
- V2.each do |n|
+ VERS.each do |n|
  self.perforce_ignore << "lib/ruby#{n}_parser.rb"
  end
 
- V2.each do |n|
+ VERS.each do |n|
  self.perforce_ignore << "lib/ruby#{n}_parser.y"
  end
 
@@ -46,8 +58,44 @@ Hoe.spec "ruby_parser" do
  end
  end
 
+ def maybe_add_to_top path, string
+ file = File.read path
+
+ return if file.start_with? string
+
+ warn "Altering top of #{path}"
+ tmp_path = "#{path}.tmp"
+ File.open(tmp_path, "w") do |f|
+ f.puts string
+ f.puts
+
+ f.write file
+ # TODO: make this deal with encoding comments properly?
+ end
+ File.rename tmp_path, path
+ end
+
+ def unifdef?
+ @unifdef ||= system("which unifdef") or abort <<~EOM
+ unifdef not found!
+
+ Please install 'unifdef' package on your system or `rake generate` on a mac.
+ EOM
+ end
+
  V2.each do |n|
  file "lib/ruby#{n}_parser.y" => "lib/ruby_parser.yy" do |t|
+ unifdef?
+ cmd = 'unifdef -tk -DV=%s -UDEAD %s > %s || true' % [n, t.source, t.name]
+ sh cmd
+ end
+
+ file "lib/ruby#{n}_parser.rb" => "lib/ruby#{n}_parser.y"
+ end
+
+ V3.each do |n|
+ file "lib/ruby#{n}_parser.y" => "lib/ruby3_parser.yy" do |t|
+ unifdef?
  cmd = 'unifdef -tk -DV=%s -UDEAD %s > %s || true' % [n, t.source, t.name]
  sh cmd
  end
@@ -57,6 +105,12 @@ end
 
  file "lib/ruby_lexer.rex.rb" => "lib/ruby_lexer.rex"
 
+ task :parser do |t|
+ t.prerequisite_tasks.grep(Rake::FileTask).select(&:already_invoked).each do |f|
+ maybe_add_to_top f.name, "# frozen_string_literal: true"
+ end
+ end
+
  task :generate => [:lexer, :parser]
 
  task :clean do
@@ -65,6 +119,7 @@ task :clean do
  Dir["coverage.info"] +
  Dir["coverage"] +
  Dir["lib/ruby2*_parser.y"] +
+ Dir["lib/ruby3*_parser.y"] +
  Dir["lib/*.output"])
  end
 
@@ -92,7 +147,7 @@ end
 
  def dl v
  dir = v[/^\d+\.\d+/]
- url = "https://cache.ruby-lang.org/pub/ruby/#{dir}/ruby-#{v}.tar.bz2"
+ url = "https://cache.ruby-lang.org/pub/ruby/#{dir}/ruby-#{v}.tar.xz"
  path = File.basename url
  unless File.exist? path then
  system "curl -O #{url}"
@@ -104,7 +159,7 @@ def ruby_parse version
  rp_txt = "rp#{v}.txt"
  mri_txt = "mri#{v}.txt"
  parse_y = "parse#{v}.y"
- tarball = "ruby-#{version}.tar.bz2"
+ tarball = "ruby-#{version}.tar.xz"
  ruby_dir = "ruby-#{version}"
  diff = "diff#{v}.diff"
  rp_out = "lib/ruby#{v}_parser.output"
@@ -124,15 +179,18 @@ def ruby_parse version
  end
  end
 
+ desc "fetch all tarballs"
+ task :fetch => c_tarball
+
  file c_parse_y => c_tarball do
  in_compare do
  extract_glob = case version
- when /2\.7/
+ when /2\.7|3\.0/
  "{id.h,parse.y,tool/{id2token.rb,lib/vpath.rb}}"
  else
  "{id.h,parse.y,tool/{id2token.rb,vpath.rb}}"
  end
- system "tar yxf #{tarball} #{ruby_dir}/#{extract_glob}"
+ system "tar Jxf #{tarball} #{ruby_dir}/#{extract_glob}"
 
  Dir.chdir ruby_dir do
  if File.exist? "tool/id2token.rb" then
@@ -141,15 +199,20 @@ def ruby_parse version
  sh "expand parse.y > ../#{parse_y}"
  end
 
- ruby "-pi", "-e", 'gsub(/^%define\s+api\.pure/, "%pure-parser")', "../#{parse_y}"
+ ruby "-pi", "-e", 'gsub(/^%pure-parser/, "%define api.pure")', "../#{parse_y}"
  end
  sh "rm -rf #{ruby_dir}"
  end
  end
 
+ bison = Dir["/opt/homebrew/opt/bison/bin/bison",
+ "/usr/local/opt/bison/bin/bison",
+ `which bison`.chomp,
+ ].first
+
  file c_mri_txt => [c_parse_y, normalize] do
  in_compare do
- sh "bison -r all #{parse_y}"
+ sh "#{bison} -r all #{parse_y}"
  sh "./normalize.rb parse#{v}.output > #{mri_txt}"
  rm ["parse#{v}.output", "parse#{v}.tab.c"]
  end
@@ -190,17 +253,50 @@ def ruby_parse version
  end
  end
 
+ task :versions do
+ require "open-uri"
+ require "net/http" # avoid require issues in threads
+ require "net/https"
+
+ versions = %w[ 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 3.0 ]
+
+ base_url = "https://cache.ruby-lang.org/pub/ruby"
+
+ class Array
+ def human_sort
+ sort_by { |item| item.to_s.split(/(\d+)/).map { |e| [e.to_i, e] } }
+ end
+ end
+
+ versions = versions.map { |ver|
+ Thread.new {
+ URI
+ .parse("#{base_url}/#{ver}/")
+ .read
+ .scan(/ruby-\d+\.\d+\.\d+[-\w.]*?.tar.gz/)
+ .reject { |s| s =~ /-(?:rc|preview)\d/ }
+ .human_sort
+ .last
+ .delete_prefix("ruby-")
+ .delete_suffix ".tar.gz"
+ }
+ }.map(&:value).sort
+
+ puts versions.map { |v| "ruby_parse %p" % [v] }
+ end
+
  ruby_parse "2.0.0-p648"
- ruby_parse "2.1.9"
- ruby_parse "2.2.9"
+ ruby_parse "2.1.10"
+ ruby_parse "2.2.10"
  ruby_parse "2.3.8"
- ruby_parse "2.4.9"
- ruby_parse "2.5.8"
- ruby_parse "2.6.6"
- ruby_parse "2.7.1"
+ ruby_parse "2.4.10"
+ ruby_parse "2.5.9"
+ ruby_parse "2.6.8"
+ ruby_parse "2.7.4"
+ ruby_parse "3.0.2"
 
  task :debug => :isolate do
- ENV["V"] ||= V2.last
+ ENV["V"] ||= VERS.last
  Rake.application[:parser].invoke # this way we can have DEBUG set
  Rake.application[:lexer].invoke # this way we can have DEBUG set
 
@@ -215,7 +311,7 @@ task :debug => :isolate do
  time = (ENV["RP_TIMEOUT"] || 10).to_i
 
  n = ENV["BUG"]
- file = (n && "bug#{n}.rb") || ENV["F"] || ENV["FILE"] || "bug.rb"
+ file = (n && "bug#{n}.rb") || ENV["F"] || ENV["FILE"] || "debug.rb"
  ruby = ENV["R"] || ENV["RUBY"]
 
  if ruby then
@@ -238,19 +334,22 @@ task :debug => :isolate do
  end
 
  task :debug3 do
- file = ENV["F"] || "bug.rb"
- verbose = ENV["V"] ? "-v" : ""
+ file = ENV["F"] || "debug.rb"
+ version = ENV["V"] || ""
+ verbose = ENV["VERBOSE"] ? "-v" : ""
  munge = "./tools/munge.rb #{verbose}"
 
  abort "Need a file to parse, via: F=path.rb" unless file
 
  ENV.delete "V"
 
- sh "ruby -v"
- sh "ruby -y #{file} 2>&1 | #{munge} > tmp/ruby"
- sh "./tools/ripper.rb -d #{file} | #{munge} > tmp/rip"
+ ruby = "ruby#{version}"
+
+ sh "#{ruby} -v"
+ sh "#{ruby} -y #{file} 2>&1 | #{munge} > tmp/ruby"
+ sh "#{ruby} ./tools/ripper.rb -d #{file} | #{munge} > tmp/rip"
  sh "rake debug F=#{file} DEBUG=1 2>&1 | #{munge} > tmp/rp"
- sh "diff -U 999 -d tmp/{rip,rp}"
+ sh "diff -U 999 -d tmp/{ruby,rp}"
  end
 
  task :cmp do
@@ -262,16 +361,25 @@ task :cmp3 do
  end
 
  task :extract => :isolate do
- ENV["V"] ||= V2.last
+ ENV["V"] ||= VERS.last
  Rake.application[:parser].invoke # this way we can have DEBUG set
 
- file = ENV["F"] || ENV["FILE"]
+ file = ENV["F"] || ENV["FILE"] || abort("Need to provide F=<path>")
 
  ruby "-Ilib", "bin/ruby_parse_extract_error", file
  end
 
+ task :parse => :isolate do
+ ENV["V"] ||= VERS.last
+ Rake.application[:parser].invoke # this way we can have DEBUG set
+
+ file = ENV["F"] || ENV["FILE"] || abort("Need to provide F=<path>")
+
+ ruby "-Ilib", "bin/ruby_parse", file
+ end
+
  task :bugs do
- sh "for f in bug*.rb ; do #{Gem.ruby} -S rake debug F=$f && rm $f ; done"
+ sh "for f in bug*.rb bad*.rb ; do #{Gem.ruby} -S rake debug F=$f && rm $f ; done"
  end
 
  # vim: syntax=Ruby
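Two pieces of logic in the Rakefile changes above are easy to try outside Rake: the new `FAST` handling that narrows `VERS` to one version, and the "human" sort the `versions` task uses to pick the newest tarball. The following is a minimal sketch as standalone functions. The `sort_by` expression is taken from the diff; `versions_for` and `latest_release` are hypothetical names introduced here, and the tarball names in the usage note are sample data rather than a real directory listing.

```ruby
# Hypothetical pure-function version of the FAST handling: instead of
# mutating ENV and VERS, return the list of versions to build.
def versions_for(fast, all = %w[20 21 22 23 24 25 26 27 30])
  return all unless fast                     # FAST unset: build everything
  fast = all.last unless all.include?(fast)  # unknown value: fall back to newest
  [fast]
end

# Same sort key as Array#human_sort in the diff: split into digit and
# non-digit runs so that "10" sorts after "9" instead of before it.
def human_sort(items)
  items.sort_by { |item| item.to_s.split(/(\d+)/).map { |e| [e.to_i, e] } }
end

# Mirrors the tail of the :versions pipeline: drop rc/preview builds,
# take the newest name, and strip the "ruby-" prefix and ".tar.gz" suffix.
def latest_release(tarballs)
  stable = tarballs.reject { |s| s =~ /-(?:rc|preview)\d/ }
  human_sort(stable).last.delete_prefix("ruby-").delete_suffix(".tar.gz")
end
```

For example, `latest_release(%w[ruby-2.1.9.tar.gz ruby-2.1.10.tar.gz])` returns `"2.1.10"`, whereas a plain string sort would put `2.1.9` last.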
data/bin/ruby_parse_extract_error CHANGED
@@ -21,7 +21,7 @@ class RubyParser
  src = ss.string
  pre_error = src[0...ss.pos]
 
- defs = pre_error.grep(/^ *(?:def|it)/)
+ defs = pre_error.lines.grep(/^ *(?:def|it)/)
 
  raise "can't figure out where the bad code starts" unless defs.last
 
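The hunk above goes with the changelog entry "Removed String#grep monkey-patch": once the patch is gone, the string has to be split into lines before `grep` can be used. Here is a small sketch of that pattern. The regexp is copied from the diff; `candidate_lines` and the sample source are hypothetical.

```ruby
# Returns the lines that look like the start of a method or spec block,
# using the same regexp as the diff. String#lines keeps the trailing "\n".
def candidate_lines(pre_error)
  pre_error.lines.grep(/^ *(?:def|it)/)
end

# Hypothetical input: source text up to the point where parsing failed.
sample = <<~RUBY
  def ok
    1
  end

  it "breaks" do
    2 +
RUBY

# The last candidate is where the extractor would start looking.
puts candidate_lines(sample).last
```

On a stock Ruby without the monkey-patch, `String` is not `Enumerable`, so calling `grep` directly on the string raises `NoMethodError`.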
data/compare/normalize.rb CHANGED
@@ -84,6 +84,7 @@ def munge s
 
  "' '", "tSPACE", # needs to be later to avoid bad hits
 
+ "%empty", "none", # newer bison
  "/* empty */", "none",
  /^\s*$/, "none",
 
@@ -140,6 +141,7 @@ def munge s
  '"do for block"', "kDO_BLOCK",
  '"do for condition"', "kDO_COND",
  '"do for lambda"', "kDO_LAMBDA",
+ "tLABEL", "kLABEL",
 
  # UGH
  "k_LINE__", "k__LINE__",
@@ -155,7 +157,10 @@ def munge s
  /\"(\w+) \(?modifier\)?\"/, proc { |x| "k#{$1.upcase}_MOD" },
  /\"(\w+)\"/, proc { |x| "k#{$1.upcase}" },
 
- /@(\d+)(\s+|$)/, "",
+ /\$?@(\d+)(\s+|$)/, "", # newer bison
+
+ # TODO: remove for 3.0 work:
+ "lex_ctxt ", "" # 3.0 production that's mostly noise right now
  ]
 
  renames.each_slice(2) do |(a, b)|
@@ -174,7 +179,7 @@ ARGF.each_line do |line|
 
  case line.strip
  when /^$/ then
- when /^(\d+) (\$?\w+): (.*)/ then # yacc
+ when /^(\d+) (\$?[@\w]+): (.*)/ then # yacc
  rule = $2
  order << rule unless rules.has_key? rule
  rules[rule] << munge($3)
@@ -199,7 +204,7 @@ ARGF.each_line do |line|
  when /^\cL/ then # byacc
  break
  else
- warn "unparsed: #{$.}: #{line.chomp}"
+ warn "unparsed: #{$.}: #{line.strip.inspect}"
  end
  end
 
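The normalizer changes above add entries to a flat list of pattern/replacement pairs and widen the rule-name regexp to accept newer bison's `$@N` names. The sketch below shows how such a table could be applied. The pairs and the rule regexp are copied from the diff, but the loop body is not shown there, so the `gsub` call is an assumption, as are the names `munge_sketch` and `parse_rule`. Only string replacements are covered; the real table also contains procs.

```ruby
# Applies a flat [pattern, replacement, pattern, replacement, ...] table.
# ASSUMPTION: the real loop body is not visible in the diff; plain gsub
# is used here for illustration.
def munge_sketch(s)
  renames = [
    "%empty",           "none",   # newer bison
    "/* empty */",      "none",
    "tLABEL",           "kLABEL",
    /\$?@(\d+)(\s+|$)/, "",       # newer bison: drop $@1-style markers
    "lex_ctxt ",        "",       # 3.0 production treated as noise
  ]

  renames.each_slice(2) do |(a, b)|
    s = s.gsub(a, b)
  end

  s.strip
end

# Uses the widened rule regexp from the diff: rule names may now contain
# "@", so bison's generated "$@1" rules are recognized.
def parse_rule(line)
  m = /^(\d+) (\$?[@\w]+): (.*)/.match(line.strip)
  return nil unless m
  [m[2], munge_sketch(m[3])]
end
```

With the old `\$?\w+` rule-name pattern, a line such as `12 $@1: %empty` would fall through to the `unparsed` warning. With the new one, `parse_rule` returns `["$@1", "none"]`.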
data/debugging.md CHANGED
@@ -55,3 +55,136 @@ From there? Good luck. I'm currently trying to backtrack from rule
55
55
  reductions to state change differences. I'd like to figure out a way
56
56
  to go from this sort of diff to a reasonable test that checks state
57
57
  changes but I don't have that set up at this point.
58
+
59
+ ## Adding New Grammar Productions
60
+
61
+ Ruby adds stuff to the parser ALL THE TIME. It's actually hard to keep
62
+ up with, but I've added some tools and shown what a typical workflow
63
+ looks like. Let's say you want to add ruby 2.7's "beginless range" (eg
64
+ `..42`).
65
+
66
+ Whenever there's a language feature missing, I start with comparing
67
+ the parse trees between MRI and RP:
68
+
69
+ ### Structural Comparing
70
+
71
+ There's a bunch of rake tasks `compare27`, `compare26`, etc that try
72
+ to normalize and diff MRI's parse.y parse tree (just the structure of
73
+ the tree in yacc) to ruby\_parser's parse tree (racc). It's the first
74
+ thing I do when I'm adding a new version. Stub out all the version
75
+ differences, and then start to diff the structure and move
76
+ ruby\_parser towards the new changes.
77
+
78
+ Some differences are just gonna be there... but here's an example of a
79
+ real diff between MRI 2.7 and ruby_parser as of today:
80
+
81
+ ```diff
82
+ arg tDOT3 arg
83
+ arg tDOT2
84
+ arg tDOT3
85
+ - tBDOT2 arg
86
+ - tBDOT3 arg
87
+ arg tPLUS arg
88
+ arg tMINUS arg
89
+ arg tSTAR2 arg
90
+ ```
91
+
92
+ This is a new language feature that ruby_parser doesn't handle yet.
93
+ It's in MRI (the left hand side of the diff) but not ruby\_parser (the
94
+ right hand side) so it is a `-` or missing line.
95
+
96
+ Some other diffs will have both `+` and `-` lines. That usually
97
+ happens when MRI has been refactoring the grammar. Sometimes I choose
98
+ to adapt those refactorings and sometimes it starts to get too
99
+ difficult to maintain multiple versions of ruby parsing in a single
100
+ file.
101
+
102
+ But! This structural comparing is always a place you should look when
103
+ ruby_parser is failing to parse something. Maybe it just hasn't been
104
+ implemented yet and the easiest place to look is the diff.
105
+
106
+ ### Starting Test First
107
+
108
+ The next thing I do is to add a parser test to cover that feature. I
109
+ usually start with the parser and work backwards towards the lexer as
110
+ needed, as I find it structures things properly and keeps things goal
111
+ oriented.
112
+
113
+ So, make a new parser test, usually in the versioned section of the
114
+ parser tests.
115
+
116
+ ```
117
+ def test_beginless2
118
+ rb = "..10\n; ..a\n; c"
119
+ pt = s(:block,
120
+ s(:dot2, nil, s(:lit, 0).line(1)).line(1),
121
+ s(:dot2, nil, s(:call, nil, :a).line(2)).line(2),
122
+ s(:call, nil, :c).line(3)).line(1)
123
+
124
+ assert_parse_line rb, pt, 1
125
+
126
+ flunk "not done yet"
127
+ end
128
+ ```
129
+
130
+ (In this case copied and modified the tests for open ranges from 2.6)
131
+ and run it to get my first error:
132
+
133
+ ```
134
+ % rake N=/beginless/
135
+
136
+ ...
137
+
138
+ E
139
+
140
+ Finished in 0.021814s, 45.8421 runs/s, 0.0000 assertions/s.
141
+
142
+ 1) Error:
143
+ TestRubyParserV27#test_whatevs:
144
+ Racc::ParseError: (string):1 :: parse error on value ".." (tDOT2)
145
+ GEMS/2.7.0/gems/racc-1.5.0/lib/racc/parser.rb:538:in `on_error'
146
+ WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1304:in `on_error'
147
+ (eval):3:in `_racc_do_parse_c'
148
+ (eval):3:in `do_parse'
149
+ WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1329:in `block in process'
150
+ RUBY/lib/ruby/2.7.0/timeout.rb:95:in `block in timeout'
151
+ RUBY/lib/ruby/2.7.0/timeout.rb:33:in `block in catch'
152
+ RUBY/lib/ruby/2.7.0/timeout.rb:33:in `catch'
153
+ RUBY/lib/ruby/2.7.0/timeout.rb:33:in `catch'
154
+ RUBY/lib/ruby/2.7.0/timeout.rb:110:in `timeout'
155
+ WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1317:in `process'
156
+ WORK/ruby_parser/dev/test/test_ruby_parser.rb:4198:in `assert_parse'
157
+ WORK/ruby_parser/dev/test/test_ruby_parser.rb:4221:in `assert_parse_line'
158
+ WORK/ruby_parser/dev/test/test_ruby_parser.rb:4451:in `test_whatevs'
159
+ ```
160
+
161
+ For starters, we know the missing production is for `tBDOT2 arg`. It
162
+ is currently blowing up because it is getting `tDOT2` and simply
163
+ doesn't know what to do with it, so it raises the error. As the diff
164
+ suggests, that's the wrong token to begin with, so it is probably time
165
+ to also create a lexer test:
166
+
167
+ ```
168
+ def test_yylex_bdot2
169
+ assert_lex3("..42",
170
+ s(:dot2, nil, s(:lit, 42)),
171
+
172
+ :tBDOT2, "..", EXPR_BEG,
173
+ :tINTEGER, "42", EXPR_NUM)
174
+
175
+ flunk "not done yet"
176
+ end
177
+ ```
178
+
179
+ This one is mostly speculative at this point. It says "if we're lexing
180
+ this string, we should get this sexp if we fully parse it, and the
181
+ lexical stream should look like this"... That last bit is mostly made
182
+ up at this point. Sometimes I don't know exactly what expression state
183
+ things should be in until I start really digging in.
184
+
185
+ At this point, I have 2 failing tests that are directing me in the
186
+ right direction. It's now a matter of digging through
187
+ `compare/parse26.y` to see how the lexer differs and implementing
188
+ it...
189
+
190
+ But this is a good start to the doco for now. I'll add more later.
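The walkthrough above turns on one lexer decision: `..` at the start of an expression should come out as `tBDOT2`, while `..` after an operand stays `tDOT2`. The toy lexer below is a sketch of that idea only. It is not ruby_parser's lexer and does not model its `lex_state` values; `toy_lex` and the `expr_beg` flag are hypothetical, and the token names other than those quoted in the document are assumptions.

```ruby
require "strscan"

# Toy lexer: NOT ruby_parser's implementation. It tracks one boolean,
# expr_beg, meaning "the next token would begin an expression".
def toy_lex(src)
  ss = StringScanner.new(src)
  tokens = []
  expr_beg = true

  until ss.eos?
    if ss.skip(/[ \t]+/)
      next                          # whitespace does not change state
    elsif ss.scan(/\n|;/)
      expr_beg = true               # a new statement starts
    elsif ss.scan(/\.\.\./)
      tokens << [expr_beg ? :tBDOT3 : :tDOT3, ss.matched]
      expr_beg = true
    elsif ss.scan(/\.\./)
      tokens << [expr_beg ? :tBDOT2 : :tDOT2, ss.matched]
      expr_beg = true
    elsif ss.scan(/\d+/)
      tokens << [:tINTEGER, ss.matched]
      expr_beg = false              # an operand was just seen
    elsif ss.scan(/\w+/)
      tokens << [:tIDENTIFIER, ss.matched]
      expr_beg = false
    else
      raise ArgumentError, "unexpected input at #{ss.pos}: #{ss.rest.inspect}"
    end
  end

  tokens
end
```

Under these assumptions, `toy_lex("..42")` yields `[[:tBDOT2, ".."], [:tINTEGER, "42"]]`, while `toy_lex("1..42")` yields a `:tDOT2` in the middle. That is the distinction the speculative lexer test in the walkthrough is checking for.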