ruby_parser 3.15.1 → 3.18.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 0adcb0c06b7ee6d29360bd8dd9641dfcce867cdd7167f59bef5134c3092cb091
- data.tar.gz: ef51691d5a0e2f09a38a0c6d8bd8d3fccdebf19bd1c1ce41a99f761112a43c4a
+ metadata.gz: dad59e40690dfb746fea4399e13d05f97b43cf562c2559c67d1f9b5868da86c5
+ data.tar.gz: 0c14605ecf34929f6d4865c20f085b6ef2a35533a70e530f261f9d04f44998cb
  SHA512:
- metadata.gz: 85e6684d7a8a6207f1040ca97768d4792ba52e2f039eee5fa83047e8e5c83a7a913f98fc5384ac8b3387829078e294e22fb009163ac4eee272557cbed715849e
- data.tar.gz: 7e996bef3411d3f63dfd8e2af388bddb16dc8a4206bd309611aeda038feb4c66041fb2b7ba5136015d4b0453e34fdebf8c1f8d408d68dccb9031514ebff6f728
+ metadata.gz: 4fa69e55ac577e55fa2945ef9c135b47359a646e654bc8cf9f2ead973852adb97350cf67ac5da549592ea583f4824cd5f6930a167fa19cf3f914cb6d6888093d
+ data.tar.gz: 0c13eb02ffaf3b3c60ef358be2fa8f70b951e947170d3eeb8a94e5c37bec5dc8282376e4ab3f03de99e49b6e2aac2aa13169b31c30de1067e1a290c1fd65803a
checksums.yaml.gz.sig CHANGED
Binary file
data/History.rdoc CHANGED
@@ -1,3 +1,110 @@
+ === 3.18.1 / 2021-11-10
+
+ * 1 minor enhancement:
+
+ * All parser tests are now explicitly testing line numbers at every level.
+
+ * 3 bug fixes:
+
+ * Fixed endless method with noargs. (mitsuru)
+ * Fixed line numbers on some yield forms.
+ * Handle and clearly report if unifdef is missing.
+
+ === 3.18.0 / 2021-10-27
+
+ Holy crap... 58 commits! 2.7 and 3.0 are feature complete. Strings
+ & heredocs have been rewritten.
+
+ * 9 major enhancements:
+
+ * !!! Rewrote lexer (and friends) for strings, heredocs, and %*[] constructs.
+ * Massive overhaul on line numbers.
+ * Freeze input! Finally!!! No more modifying the input string for heredocs.
+ * Overhauled RPStringScanner. Removed OLD compatibility methods!
+ * Removed Sexp methods: value, to_sym, add, add_all, node_type, values.
+ * value moved to sexp_processor.
+ * Removed String#grep monkey-patch.
+ * Removed String#lineno monkey-patch.
+ * Removed string_to_pos, charpos, etc hacks for ancient ruby versions.
+ * Removed unread_many... NO! NO EDITING THE INPUT STRING!
+
+ * 31 minor enhancements:
+
+ * 2.7/3.0: many more pattern edge cases
+ * 2.7: Added `mlhs = rhs rescue expr`
+ * 2.7: refactored destructured args (`|(k,v)|`) and unfactored(?!) case_body/args.
+ * 3.0: excessed_comma
+ * 3.0: finished most everything: endless methods, patterns, etc.
+ * 3.0: refactored / added new pattern changes
+ * Added RubyLexer#in_heredoc? (ie, is there old_ss ?)
+ * Added RubyLexer#old_ss and old_lineno and removed much of SSStack(ish).
+ * Added Symbol#end_with? when necessary
+ * Added TALLY and DEBUG options for ss.getch and ss.scan
+ * Added ignore_body_comments to make parser productions more clear.
+ * Added support for no_kwarg (eg `def f(**nil)`).
+ * Added support for no_kwarg in blocks (eg `f { |**nil| }`).
+ * Augmented generated parser files to have frozen_string_literal comments and fixed tests.
+ * Broke out 3.0 parser into its own to ease development.
+ * Bumped dependencies on sexp_processor and oedipus_lex.
+ * Clean generated 3.x files.
+ * Extracted all string scanner methods to their own module.
+ * Fixed some precedence decls.
+ * Implemented most of pattern matching for 2.7+.
+ * Improve lex_state= to report location in verbose debug mode.
+ * Made it easier to debug with a particular version of ruby via rake.
+ * Make sure ripper uses the same version of ruby we specified.
+ * Moved all string/heredoc/etc code to ruby_lexer_strings.rb
+ * Remove warning from newer bisons.
+ * Sprinkled in some frozen_string_literal, but mostly helped by oedipus bump.
+ * Switch to comparing against ruby binary since ripper is buggy.
+ * bugs task should try both bug*.rb and bad*.rb.
+ * endless methods
+ * f_any_kwrest refactoring.
+ * refactored defn/defs
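For readers unfamiliar with the syntax named in the list above, the following is a minimal plain-Ruby sketch of several of the forms mentioned: endless methods, `**nil`, `case/in` patterns, and destructured block args. It is not taken from ruby_parser; all method names are hypothetical, and Ruby 3.0 or later is assumed so that every form parses.

```ruby
# Illustrative only: plain Ruby 3.0+ syntax, not ruby_parser code.
# All method names here are hypothetical.

# Endless method definition (3.0).
def square(x) = x * x

# no_kwarg: `**nil` declares that the method accepts no keywords (2.7+).
def no_keywords(a, **nil)
  a
end

# Pattern matching with `case/in` (2.7+).
def describe(value)
  case value
  in [Integer => first, *rest]
    "array starting with #{first}, #{rest.size} more"
  in {name: String => name}
    "hash named #{name}"
  else
    "something else"
  end
end

# Destructured block args, as in `|(k, v)|`.
pairs = [[:a, 1], [:b, 2]].map { |(k, v)| "#{k}=#{v}" }
```

Each of these is a distinct grammar production, which is presumably why they appear as separate changelog entries.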
+
+ * 15 bug fixes:
+
+ * Cleaned a bunch of old hacks. Initializing RubyLexer w/ Parser is cleaner now.
+ * Corrected some lex_state errors in process_token_keyword.
+ * Fixed ancient ruby2 change (use #lines) in ruby_parse_extract_error.
+ * Fixed bug where else without rescue only raises on 2.6+
+ * Fixed caller for getch and scan when DEBUG=1
+ * Fixed comments in the middle of message cascades.
+ * Fixed differences w/ symbol productions against ruby 2.7.
+ * Fixed dsym to use string_contents production.
+ * Fixed error in bdot2/3 in some edge cases. Fixed p_alt line.
+ * Fixed heredoc dedenting in the presence of empty lines. (mvz)
+ * Fixed some leading whitespace / comment processing
+ * Fixed up how class/module/defn/defs comments were collected.
+ * Overhauled ripper.rb to deal with buggy ripper w/ yydebug.
+ * Removed dsym from literal.
+ * Removed tUBANG lexeme but kept it distinct as a method name (eg: `def !@`).
+
+ === 3.17.0 / 2021-08-03
+
+ * 1 minor enhancement:
+
+ * Added support for arg forwarding (eg `def f(...); m(...); end`) (presidentbeef)
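As a reminder of what the arg forwarding entry above refers to, here is a small plain-Ruby sketch (Ruby 2.7+ assumed). The bodies of `m` and `f` are hypothetical; only the `def f(...); m(...); end` shape comes from the changelog.

```ruby
# Illustrative only: Ruby 2.7+ argument forwarding; method bodies are hypothetical.
def m(*args, **opts, &block)
  [args, opts, block ? block.call : nil]
end

# `...` forwards positional args, keywords, and the block unchanged.
def f(...)
  m(...)
end

result = f(1, 2, key: :value) { :from_block }
```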
+
+ === 3.16.0 / 2021-05-15
+
+ * 1 major enhancement:
+
+ * Added tentative 3.0 support.
+
+ * 3 minor enhancements:
+
+ * Added lexing for "beginless range" (bdots).
+ * Added parsing for bdots.
+ * Updated rake compare task to download xz files, bumped versions, etc
+
+ * 4 bug fixes:
+
+ * Bump rake dependency to >= 10, < 15. (presidentbeef)
+ * Bump sexp_processor dependency to 4.15.1+. (pravi)
+ * Fixed minor state mismatch at the end of parsing to make diffing a little cleaner.
+ * Fixed normalizer to deal with new bison token syntax
+
  === 3.15.1 / 2021-01-10

  * 1 bug fix:
data/Manifest.txt CHANGED
@@ -7,6 +7,7 @@ bin/ruby_parse
  bin/ruby_parse_extract_error
  compare/normalize.rb
  debugging.md
+ gauntlet.md
  lib/.document
  lib/rp_extensions.rb
  lib/rp_stringscanner.rb
@@ -26,9 +27,13 @@ lib/ruby26_parser.rb
  lib/ruby26_parser.y
  lib/ruby27_parser.rb
  lib/ruby27_parser.y
+ lib/ruby30_parser.rb
+ lib/ruby30_parser.y
+ lib/ruby3_parser.yy
  lib/ruby_lexer.rb
  lib/ruby_lexer.rex
  lib/ruby_lexer.rex.rb
+ lib/ruby_lexer_strings.rb
  lib/ruby_parser.rb
  lib/ruby_parser.yy
  lib/ruby_parser_extras.rb
data/README.rdoc CHANGED
@@ -32,6 +32,7 @@ Tested against 801,039 files from the latest of all rubygems (as of 2013-05):
  * 1.8 parser is at 99.9739% accuracy, 3.651 sigma
  * 1.9 parser is at 99.9940% accuracy, 4.013 sigma
  * 2.0 parser is at 99.9939% accuracy, 4.008 sigma
+ * 2.6 parser is at 99.9972% accuracy, 4.191 sigma

  == FEATURES/PROBLEMS:

data/Rakefile CHANGED
@@ -14,25 +14,37 @@ Hoe.add_include_dirs "../../minitest/dev/lib"
  Hoe.add_include_dirs "../../oedipus_lex/dev/lib"

  V2 = %w[20 21 22 23 24 25 26 27]
- V2.replace [V2.last] if ENV["FAST"] # HACK
+ V3 = %w[30]
+
+ VERS = V2 + V3
+
+ ENV["FAST"] = VERS.last if ENV["FAST"] && !VERS.include?(ENV["FAST"])
+ VERS.replace [ENV["FAST"]] if ENV["FAST"]

  Hoe.spec "ruby_parser" do
  developer "Ryan Davis", "ryand-ruby@zenspider.com"

  license "MIT"

- dependency "sexp_processor", "~> 4.9"
- dependency "rake", "< 11", :developer
- dependency "oedipus_lex", "~> 2.5", :developer
+ dependency "sexp_processor", "~> 4.16"
+ dependency "rake", [">= 10", "< 15"], :developer
+ dependency "oedipus_lex", "~> 2.6", :developer
+
+ # NOTE: Ryan!!! Stop trying to fix this dependency! Isolate just
+ # can't handle having a faux-gem half-installed! Stop! Just `gem
+ # install racc` and move on. Revisit this ONLY once racc-compiler
+ # gets split out.
+
+ dependency "racc", "~> 1.5", :developer

  require_ruby_version [">= 2.1", "< 4"]

  if plugin? :perforce then # generated files
- V2.each do |n|
+ VERS.each do |n|
  self.perforce_ignore << "lib/ruby#{n}_parser.rb"
  end

- V2.each do |n|
+ VERS.each do |n|
  self.perforce_ignore << "lib/ruby#{n}_parser.y"
  end

@@ -46,8 +58,44 @@ Hoe.spec "ruby_parser" do
  end
  end

+ def maybe_add_to_top path, string
+ file = File.read path
+
+ return if file.start_with? string
+
+ warn "Altering top of #{path}"
+ tmp_path = "#{path}.tmp"
+ File.open(tmp_path, "w") do |f|
+ f.puts string
+ f.puts
+
+ f.write file
+ # TODO: make this deal with encoding comments properly?
+ end
+ File.rename tmp_path, path
+ end
+
+ def unifdef?
+ @unifdef ||= system("which unifdef") or abort <<~EOM
+ unifdef not found!
+
+ Please install 'unifdef' package on your system or `rake generate` on a mac.
+ EOM
+ end
+
  V2.each do |n|
  file "lib/ruby#{n}_parser.y" => "lib/ruby_parser.yy" do |t|
+ unifdef?
+ cmd = 'unifdef -tk -DV=%s -UDEAD %s > %s || true' % [n, t.source, t.name]
+ sh cmd
+ end
+
+ file "lib/ruby#{n}_parser.rb" => "lib/ruby#{n}_parser.y"
+ end
+
+ V3.each do |n|
+ file "lib/ruby#{n}_parser.y" => "lib/ruby3_parser.yy" do |t|
+ unifdef?
  cmd = 'unifdef -tk -DV=%s -UDEAD %s > %s || true' % [n, t.source, t.name]
  sh cmd
  end
@@ -57,6 +105,12 @@ end

  file "lib/ruby_lexer.rex.rb" => "lib/ruby_lexer.rex"

+ task :parser do |t|
+ t.prerequisite_tasks.grep(Rake::FileTask).select(&:already_invoked).each do |f|
+ maybe_add_to_top f.name, "# frozen_string_literal: true"
+ end
+ end
+
  task :generate => [:lexer, :parser]

  task :clean do
@@ -65,6 +119,7 @@ task :clean do
  Dir["coverage.info"] +
  Dir["coverage"] +
  Dir["lib/ruby2*_parser.y"] +
+ Dir["lib/ruby3*_parser.y"] +
  Dir["lib/*.output"])
  end

@@ -92,7 +147,7 @@ end

  def dl v
  dir = v[/^\d+\.\d+/]
- url = "https://cache.ruby-lang.org/pub/ruby/#{dir}/ruby-#{v}.tar.bz2"
+ url = "https://cache.ruby-lang.org/pub/ruby/#{dir}/ruby-#{v}.tar.xz"
  path = File.basename url
  unless File.exist? path then
  system "curl -O #{url}"
@@ -104,7 +159,7 @@ def ruby_parse version
  rp_txt = "rp#{v}.txt"
  mri_txt = "mri#{v}.txt"
  parse_y = "parse#{v}.y"
- tarball = "ruby-#{version}.tar.bz2"
+ tarball = "ruby-#{version}.tar.xz"
  ruby_dir = "ruby-#{version}"
  diff = "diff#{v}.diff"
  rp_out = "lib/ruby#{v}_parser.output"
@@ -124,15 +179,18 @@ def ruby_parse version
  end
  end

+ desc "fetch all tarballs"
+ task :fetch => c_tarball
+
  file c_parse_y => c_tarball do
  in_compare do
  extract_glob = case version
- when /2\.7/
+ when /2\.7|3\.0/
  "{id.h,parse.y,tool/{id2token.rb,lib/vpath.rb}}"
  else
  "{id.h,parse.y,tool/{id2token.rb,vpath.rb}}"
  end
- system "tar yxf #{tarball} #{ruby_dir}/#{extract_glob}"
+ system "tar Jxf #{tarball} #{ruby_dir}/#{extract_glob}"

  Dir.chdir ruby_dir do
  if File.exist? "tool/id2token.rb" then
@@ -141,15 +199,20 @@ def ruby_parse version
  sh "expand parse.y > ../#{parse_y}"
  end

- ruby "-pi", "-e", 'gsub(/^%define\s+api\.pure/, "%pure-parser")', "../#{parse_y}"
+ ruby "-pi", "-e", 'gsub(/^%pure-parser/, "%define api.pure")', "../#{parse_y}"
  end
  sh "rm -rf #{ruby_dir}"
  end
  end

+ bison = Dir["/opt/homebrew/opt/bison/bin/bison",
+ "/usr/local/opt/bison/bin/bison",
+ `which bison`.chomp,
+ ].first
+
  file c_mri_txt => [c_parse_y, normalize] do
  in_compare do
- sh "bison -r all #{parse_y}"
+ sh "#{bison} -r all #{parse_y}"
  sh "./normalize.rb parse#{v}.output > #{mri_txt}"
  rm ["parse#{v}.output", "parse#{v}.tab.c"]
  end
@@ -190,17 +253,50 @@ def ruby_parse version
  end
  end

+ task :versions do
+ require "open-uri"
+ require "net/http" # avoid require issues in threads
+ require "net/https"
+
+ versions = %w[ 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 3.0 ]
+
+ base_url = "https://cache.ruby-lang.org/pub/ruby"
+
+ class Array
+ def human_sort
+ sort_by { |item| item.to_s.split(/(\d+)/).map { |e| [e.to_i, e] } }
+ end
+ end
+
+ versions = versions.map { |ver|
+ Thread.new {
+ URI
+ .parse("#{base_url}/#{ver}/")
+ .read
+ .scan(/ruby-\d+\.\d+\.\d+[-\w.]*?.tar.gz/)
+ .reject { |s| s =~ /-(?:rc|preview)\d/ }
+ .human_sort
+ .last
+ .delete_prefix("ruby-")
+ .delete_suffix ".tar.gz"
+ }
+ }.map(&:value).sort
+
+ puts versions.map { |v| "ruby_parse %p" % [v] }
+ end
+
  ruby_parse "2.0.0-p648"
- ruby_parse "2.1.9"
- ruby_parse "2.2.9"
+ ruby_parse "2.1.10"
+ ruby_parse "2.2.10"
  ruby_parse "2.3.8"
- ruby_parse "2.4.9"
- ruby_parse "2.5.8"
- ruby_parse "2.6.6"
- ruby_parse "2.7.1"
+ ruby_parse "2.4.10"
+ ruby_parse "2.5.9"
+ ruby_parse "2.6.8"
+ ruby_parse "2.7.4"
+ ruby_parse "3.0.2"

  task :debug => :isolate do
- ENV["V"] ||= V2.last
+ ENV["V"] ||= VERS.last
  Rake.application[:parser].invoke # this way we can have DEBUG set
  Rake.application[:lexer].invoke # this way we can have DEBUG set

@@ -215,7 +311,7 @@ task :debug => :isolate do
  time = (ENV["RP_TIMEOUT"] || 10).to_i

  n = ENV["BUG"]
- file = (n && "bug#{n}.rb") || ENV["F"] || ENV["FILE"] || "bug.rb"
+ file = (n && "bug#{n}.rb") || ENV["F"] || ENV["FILE"] || "debug.rb"
  ruby = ENV["R"] || ENV["RUBY"]

  if ruby then
@@ -238,19 +334,22 @@ task :debug => :isolate do
  end

  task :debug3 do
- file = ENV["F"] || "bug.rb"
- verbose = ENV["V"] ? "-v" : ""
+ file = ENV["F"] || "debug.rb"
+ version = ENV["V"] || ""
+ verbose = ENV["VERBOSE"] ? "-v" : ""
  munge = "./tools/munge.rb #{verbose}"

  abort "Need a file to parse, via: F=path.rb" unless file

  ENV.delete "V"

- sh "ruby -v"
- sh "ruby -y #{file} 2>&1 | #{munge} > tmp/ruby"
- sh "./tools/ripper.rb -d #{file} | #{munge} > tmp/rip"
+ ruby = "ruby#{version}"
+
+ sh "#{ruby} -v"
+ sh "#{ruby} -y #{file} 2>&1 | #{munge} > tmp/ruby"
+ sh "#{ruby} ./tools/ripper.rb -d #{file} | #{munge} > tmp/rip"
  sh "rake debug F=#{file} DEBUG=1 2>&1 | #{munge} > tmp/rp"
- sh "diff -U 999 -d tmp/{rip,rp}"
+ sh "diff -U 999 -d tmp/{ruby,rp}"
  end

  task :cmp do
@@ -262,16 +361,25 @@ task :cmp3 do
  end

  task :extract => :isolate do
- ENV["V"] ||= V2.last
+ ENV["V"] ||= VERS.last
  Rake.application[:parser].invoke # this way we can have DEBUG set

- file = ENV["F"] || ENV["FILE"]
+ file = ENV["F"] || ENV["FILE"] || abort("Need to provide F=<path>")

  ruby "-Ilib", "bin/ruby_parse_extract_error", file
  end

+ task :parse => :isolate do
+ ENV["V"] ||= VERS.last
+ Rake.application[:parser].invoke # this way we can have DEBUG set
+
+ file = ENV["F"] || ENV["FILE"] || abort("Need to provide F=<path>")
+
+ ruby "-Ilib", "bin/ruby_parse", file
+ end
+
  task :bugs do
- sh "for f in bug*.rb ; do #{Gem.ruby} -S rake debug F=$f && rm $f ; done"
+ sh "for f in bug*.rb bad*.rb ; do #{Gem.ruby} -S rake debug F=$f && rm $f ; done"
  end

  # vim: syntax=Ruby
data/bin/ruby_parse_extract_error CHANGED
@@ -21,7 +21,7 @@ class RubyParser
  src = ss.string
  pre_error = src[0...ss.pos]

- defs = pre_error.grep(/^ *(?:def|it)/)
+ defs = pre_error.lines.grep(/^ *(?:def|it)/)

  raise "can't figure out where the bad code starts" unless defs.last

data/compare/normalize.rb CHANGED
@@ -84,6 +84,7 @@ def munge s

  "' '", "tSPACE", # needs to be later to avoid bad hits

+ "%empty", "none", # newer bison
  "/* empty */", "none",
  /^\s*$/, "none",

@@ -140,6 +141,7 @@ def munge s
  '"do for block"', "kDO_BLOCK",
  '"do for condition"', "kDO_COND",
  '"do for lambda"', "kDO_LAMBDA",
+ "tLABEL", "kLABEL",

  # UGH
  "k_LINE__", "k__LINE__",
@@ -155,7 +157,10 @@ def munge s
  /\"(\w+) \(?modifier\)?\"/, proc { |x| "k#{$1.upcase}_MOD" },
  /\"(\w+)\"/, proc { |x| "k#{$1.upcase}" },

- /@(\d+)(\s+|$)/, "",
+ /\$?@(\d+)(\s+|$)/, "", # newer bison
+
+ # TODO: remove for 3.0 work:
+ "lex_ctxt ", "" # 3.0 production that's mostly noise right now
  ]

  renames.each_slice(2) do |(a, b)|
@@ -174,7 +179,7 @@ ARGF.each_line do |line|

  case line.strip
  when /^$/ then
- when /^(\d+) (\$?\w+): (.*)/ then # yacc
+ when /^(\d+) (\$?[@\w]+): (.*)/ then # yacc
  rule = $2
  order << rule unless rules.has_key? rule
  rules[rule] << munge($3)
@@ -199,7 +204,7 @@ ARGF.each_line do |line|
  when /^\cL/ then # byacc
  break
  else
- warn "unparsed: #{$.}: #{line.chomp}"
+ warn "unparsed: #{$.}: #{line.strip.inspect}"
  end
  end

data/debugging.md CHANGED
@@ -55,3 +55,136 @@ From there? Good luck. I'm currently trying to backtrack from rule
  reductions to state change differences. I'd like to figure out a way
  to go from this sort of diff to a reasonable test that checks state
  changes but I don't have that set up at this point.
+
+ ## Adding New Grammar Productions
+
+ Ruby adds stuff to the parser ALL THE TIME. It's actually hard to keep
+ up with, but I've added some tools and shown what a typical workflow
+ looks like. Let's say you want to add ruby 2.7's "beginless range" (eg
+ `..42`).
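For context on the feature being added, here is a short plain-Ruby sketch of what a beginless range means at runtime (Ruby 2.7+ assumed). It only illustrates the language feature; it says nothing about how ruby_parser lexes or parses it, and the variable names are hypothetical.

```ruby
# Illustrative only: runtime behavior of beginless ranges (Ruby 2.7+).
up_to = (..42)    # inclusive end, no lower bound
below = (...42)   # exclusive end, no lower bound

no_begin   = up_to.begin.nil?        # true: there is no starting value
covers_low = up_to.cover?(-1_000)    # true
covers_end = up_to.cover?(42)        # true
excl_end   = below.cover?(42)        # false

first_two = %w[a b c d][..1]         # same slice as [0..1]
```

Because `..` can now appear at the start of an expression, the lexer has to hand the parser a different token there than it does for the infix form.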
+
+ Whenever there's a language feature missing, I start with comparing
+ the parse trees between MRI and RP:
+
+ ### Structural Comparing
+
+ There's a bunch of rake tasks `compare27`, `compare26`, etc that try
+ to normalize and diff MRI's parse.y parse tree (just the structure of
+ the tree in yacc) to ruby\_parser's parse tree (racc). It's the first
+ thing I do when I'm adding a new version. Stub out all the version
+ differences, and then start to diff the structure and move
+ ruby\_parser towards the new changes.
+
+ Some differences are just gonna be there... but here's an example of a
+ real diff between MRI 2.7 and ruby_parser as of today:
+
+ ```diff
+ arg tDOT3 arg
+ arg tDOT2
+ arg tDOT3
+ - tBDOT2 arg
+ - tBDOT3 arg
+ arg tPLUS arg
+ arg tMINUS arg
+ arg tSTAR2 arg
+ ```
+
+ This is a new language feature that ruby_parser doesn't handle yet.
+ It's in MRI (the left hand side of the diff) but not ruby\_parser (the
+ right hand side) so it is a `-` or missing line.
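The way the `-` lines arise can be sketched with plain array differences. This is only an illustration using the production names from the diff above; it is not how the `compare27` task or `normalize.rb` is implemented, and the variable names are hypothetical.

```ruby
# Illustrative only: production lists copied from the example diff;
# NOT the implementation of the rake compare tasks.
mri = ["arg tDOT3 arg", "arg tDOT2", "arg tDOT3",
       "tBDOT2 arg", "tBDOT3 arg",
       "arg tPLUS arg", "arg tMINUS arg", "arg tSTAR2 arg"]

rp  = ["arg tDOT3 arg", "arg tDOT2", "arg tDOT3",
       "arg tPLUS arg", "arg tMINUS arg", "arg tSTAR2 arg"]

missing_from_rp = mri - rp   # these would show up as `-` lines
extra_in_rp     = rp - mri   # these would show up as `+` lines
```

Here `missing_from_rp` holds the two `tBDOT` productions and `extra_in_rp` is empty, which matches a diff containing only `-` lines.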
+
+ Some other diffs will have both `+` and `-` lines. That usually
+ happens when MRI has been refactoring the grammar. Sometimes I choose
+ to adapt those refactorings and sometimes it starts to get too
+ difficult to maintain multiple versions of ruby parsing in a single
+ file.
+
+ But! This structural comparing is always a place you should look when
+ ruby_parser is failing to parse something. Maybe it just hasn't been
+ implemented yet and the easiest place to look is the diff.
+
+ ### Starting Test First
+
+ The next thing I do is to add a parser test to cover that feature. I
+ usually start with the parser and work backwards towards the lexer as
+ needed, as I find it structures things properly and keeps things goal
+ oriented.
+
+ So, make a new parser test, usually in the versioned section of the
+ parser tests.
+
+ ```
+ def test_beginless2
+ rb = "..10\n; ..a\n; c"
+ pt = s(:block,
+ s(:dot2, nil, s(:lit, 0).line(1)).line(1),
+ s(:dot2, nil, s(:call, nil, :a).line(2)).line(2),
+ s(:call, nil, :c).line(3)).line(1)
+
+ assert_parse_line rb, pt, 1
+
+ flunk "not done yet"
+ end
+ ```
+
+ (In this case copied and modified the tests for open ranges from 2.6)
+ and run it to get my first error:
+
+ ```
+ % rake N=/beginless/
+
+ ...
+
+ E
+
+ Finished in 0.021814s, 45.8421 runs/s, 0.0000 assertions/s.
+
+ 1) Error:
+ TestRubyParserV27#test_whatevs:
+ Racc::ParseError: (string):1 :: parse error on value ".." (tDOT2)
+ GEMS/2.7.0/gems/racc-1.5.0/lib/racc/parser.rb:538:in `on_error'
+ WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1304:in `on_error'
+ (eval):3:in `_racc_do_parse_c'
+ (eval):3:in `do_parse'
+ WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1329:in `block in process'
+ RUBY/lib/ruby/2.7.0/timeout.rb:95:in `block in timeout'
+ RUBY/lib/ruby/2.7.0/timeout.rb:33:in `block in catch'
+ RUBY/lib/ruby/2.7.0/timeout.rb:33:in `catch'
+ RUBY/lib/ruby/2.7.0/timeout.rb:33:in `catch'
+ RUBY/lib/ruby/2.7.0/timeout.rb:110:in `timeout'
+ WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1317:in `process'
+ WORK/ruby_parser/dev/test/test_ruby_parser.rb:4198:in `assert_parse'
+ WORK/ruby_parser/dev/test/test_ruby_parser.rb:4221:in `assert_parse_line'
+ WORK/ruby_parser/dev/test/test_ruby_parser.rb:4451:in `test_whatevs'
+ ```
+
+ For starters, we know the missing production is for `tBDOT2 arg`. It
+ is currently blowing up because it is getting `tDOT2` and simply
+ doesn't know what to do with it, so it raises the error. As the diff
+ suggests, that's the wrong token to begin with, so it is probably time
+ to also create a lexer test:
+
+ ```
+ def test_yylex_bdot2
+ assert_lex3("..42",
+ s(:dot2, nil, s(:lit, 42)),
+
+ :tBDOT2, "..", EXPR_BEG,
+ :tINTEGER, "42", EXPR_NUM)
+
+ flunk "not done yet"
+ end
+ ```
+
+ This one is mostly speculative at this point. It says "if we're lexing
+ this string, we should get this sexp if we fully parse it, and the
+ lexical stream should look like this"... That last bit is mostly made
+ up at this point. Sometimes I don't know exactly what expression state
+ things should be in until I start really digging in.
+
+ At this point, I have 2 failing tests that are directing me in the
+ right direction. It's now a matter of digging through
+ `compare/parse26.y` to see how the lexer differs and implementing
+ it...
+
+ But this is a good start to the doco for now. I'll add more later.