ruby_parser 3.14.2 → 3.17.0
- checksums.yaml +4 -4
- checksums.yaml.gz.sig +0 -0
- data.tar.gz.sig +0 -0
- data/History.rdoc +60 -0
- data/Manifest.txt +4 -0
- data/Rakefile +83 -16
- data/bin/ruby_parse_extract_error +8 -3
- data/compare/normalize.rb +45 -5
- data/debugging.md +172 -0
- data/lib/ruby20_parser.rb +2953 -2924
- data/lib/ruby20_parser.y +99 -59
- data/lib/ruby21_parser.rb +3008 -2977
- data/lib/ruby21_parser.y +99 -59
- data/lib/ruby22_parser.rb +3011 -2976
- data/lib/ruby22_parser.y +99 -59
- data/lib/ruby23_parser.rb +2955 -2923
- data/lib/ruby23_parser.y +99 -59
- data/lib/ruby24_parser.rb +3024 -2984
- data/lib/ruby24_parser.y +99 -59
- data/lib/ruby25_parser.rb +3023 -2984
- data/lib/ruby25_parser.y +99 -59
- data/lib/ruby26_parser.rb +2954 -2913
- data/lib/ruby26_parser.y +100 -59
- data/lib/ruby27_parser.rb +7393 -0
- data/lib/ruby27_parser.y +2715 -0
- data/lib/ruby30_parser.rb +7393 -0
- data/lib/ruby30_parser.y +2715 -0
- data/lib/ruby_lexer.rb +90 -39
- data/lib/ruby_lexer.rex +6 -7
- data/lib/ruby_lexer.rex.rb +7 -9
- data/lib/ruby_parser.rb +4 -0
- data/lib/ruby_parser.yy +164 -59
- data/lib/ruby_parser_extras.rb +57 -18
- data/test/test_ruby_lexer.rb +64 -16
- data/test/test_ruby_parser.rb +277 -3
- data/tools/munge.rb +9 -4
- metadata +55 -36
- metadata.gz.sig +0 -0
    
        checksums.yaml
    CHANGED
    
    | @@ -1,7 +1,7 @@ | |
| 1 1 | 
             
            ---
         | 
| 2 2 | 
             
            SHA256:
         | 
| 3 | 
            -
              metadata.gz:  | 
| 4 | 
            -
              data.tar.gz:  | 
| 3 | 
            +
              metadata.gz: ff6be95e278654e341f5279fed2fd7f0c9a96d93b2fd23ba1ff4b181d593be18
         | 
| 4 | 
            +
              data.tar.gz: ab91b782eb2e77cdd855fa68f4699614b6160ebcca623dd8be25719b410b4206
         | 
| 5 5 | 
             
            SHA512:
         | 
| 6 | 
            -
              metadata.gz:  | 
| 7 | 
            -
              data.tar.gz:  | 
| 6 | 
            +
              metadata.gz: a469da9dadd1eeb35a48dbb34548e70feed8ca83b2d27e41c6bf940cf9dd779622fbddcc4b3c50534f46c6de42f1f085754739d76051413866ee6557fe84050d
         | 
| 7 | 
            +
              data.tar.gz: d182a507b167a6c9af4a7a48e748f55066462888b646a38a08ab12c4365a57645c5216baa2d64c99c8d7b6ebcb4e4b22219a9da9a7d0bd945c00fc21104d7343
         | 
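The digests in `checksums.yaml` cover the `metadata.gz` and `data.tar.gz` members of the `.gem` archive. As a minimal sketch, assuming the member has already been unpacked to a local path (the path and the helper name `sha256_matches?` are hypothetical, not part of the gem), a digest can be checked like this:

```ruby
require "digest"

# Hypothetical helper: true when the file's SHA256 hex digest equals the
# value recorded in checksums.yaml.
def sha256_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex
end

# Usage (path is an assumption; the digest is the new data.tar.gz value above):
# sha256_matches?("data.tar.gz",
#   "ab91b782eb2e77cdd855fa68f4699614b6160ebcca623dd8be25719b410b4206")
```

The SHA512 entries can be checked the same way with `Digest::SHA512`.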
    
        checksums.yaml.gz.sig
    CHANGED
    
    | Binary file | 
    
        data.tar.gz.sig
    CHANGED
    
    | Binary file | 
    
        data/History.rdoc
    CHANGED
    
    | @@ -1,3 +1,63 @@ | |
| 1 | 
            +
            === 3.16.0 / 2021-05-15
         | 
| 2 | 
            +
             | 
| 3 | 
            +
            * 1 major enhancement:
         | 
| 4 | 
            +
             | 
| 5 | 
            +
              * Added tentative 3.0 support.
         | 
| 6 | 
            +
             | 
| 7 | 
            +
            * 3 minor enhancements:
         | 
| 8 | 
            +
             | 
| 9 | 
            +
              * Added lexing for "beginless range" (bdots).
         | 
| 10 | 
            +
              * Added parsing for bdots.
         | 
| 11 | 
            +
              * Updated rake compare task to download xz files, bumped versions, etc
         | 
| 12 | 
            +
             | 
| 13 | 
            +
            * 4 bug fixes:
         | 
| 14 | 
            +
             | 
| 15 | 
            +
              * Bump rake dependency to >= 10, < 15. (presidentbeef)
         | 
| 16 | 
            +
              * Bump sexp_processor dependency to 4.15.1+. (pravi)
         | 
| 17 | 
            +
              * Fixed minor state mismatch at the end of parsing to make diffing a little cleaner.
         | 
| 18 | 
            +
              * Fixed normalizer to deal with new bison token syntax
         | 
| 19 | 
            +
             | 
| 20 | 
            +
            === 3.15.1 / 2021-01-10
         | 
| 21 | 
            +
             | 
| 22 | 
            +
            * 1 bug fix:
         | 
| 23 | 
            +
             | 
| 24 | 
            +
              * Bumped ruby version to include < 4 (trunk).
         | 
| 25 | 
            +
             | 
| 26 | 
            +
            === 3.15.0 / 2020-08-31
         | 
| 27 | 
            +
             | 
| 28 | 
            +
            * 1 major enhancement:
         | 
| 29 | 
            +
             | 
| 30 | 
            +
              * Added tentative 2.7 support.
         | 
| 31 | 
            +
             | 
| 32 | 
            +
            * 1 minor enhancement:
         | 
| 33 | 
            +
             | 
| 34 | 
            +
              * Improved ruby_parse_extract_error's handling of moving slow files out.
         | 
| 35 | 
            +
             | 
| 36 | 
            +
            * 22 bug fixes:
         | 
| 37 | 
            +
             | 
| 38 | 
            +
              * Bumped ruby version to include 3.0 (trunk).
         | 
| 39 | 
            +
              * Fix an error related to empty ensure bodies. (presidentbeef)
         | 
| 40 | 
            +
              * Fix handling of bad magic encoding comment.
         | 
| 41 | 
            +
              * Fixed SystemStackError when parsing a huoooge hash, caused by a splat arg.
         | 
| 42 | 
            +
              * Fixed a number of errors parsing do blocks in strange edge cases.
         | 
| 43 | 
            +
              * Fixed a string backslash lexing bug when the string is an invalid encoding. (nijikon, gmcgibbon)
         | 
| 44 | 
            +
              * Fixed bug assigning line number to some arg nodes.
         | 
| 45 | 
            +
              * Fixed bug concatinating string literals with differing encodings.
         | 
| 46 | 
            +
              * Fixed bug lexing heredoc w/ nasty mix of \r\n and \n.
         | 
| 47 | 
            +
              * Fixed bug lexing multiple codepoints in \u{0000 1111 2222} forms.
         | 
| 48 | 
            +
              * Fixed bug setting line numbers in empty xstrings in some contexts.
         | 
| 49 | 
            +
              * Fixed edge case on call w/ begin + do block as an arg.
         | 
| 50 | 
            +
              * Fixed handling of UTF BOM.
         | 
| 51 | 
            +
              * Fixed handling of lexer state across string interpolation braces.
         | 
| 52 | 
            +
              * Fixed infinite loop when lexing backslash+cr+newline (aka dos-files)
         | 
| 53 | 
            +
              * Fixed lambda + do block edge case.
         | 
| 54 | 
            +
              * Fixed lexing of some ?\M... and ?\C... edge cases.
         | 
| 55 | 
            +
              * Fixed more do/brace block edge case failures.
         | 
| 56 | 
            +
              * Fixed parsing bug where splat was used in the middle of a list.
         | 
| 57 | 
            +
              * Fixed parsing of interpolation in heredoc-like strings. (presidentbeef)
         | 
| 58 | 
            +
              * Fixed parsing some esoteric edge cases in op_asgn.
         | 
| 59 | 
            +
              * Fixed unicode processing in ident chars so now they better mix.
         | 
| 60 | 
            +
             | 
| 1 61 | 
             
            === 3.14.2 / 2020-02-06
         | 
| 2 62 |  | 
| 3 63 | 
             
            * 1 minor enhancement:
         | 
    
        data/Manifest.txt
    CHANGED
    
    
    
        data/Rakefile
    CHANGED
    
    | @@ -8,11 +8,12 @@ Hoe.plugin :racc | |
| 8 8 | 
             
            Hoe.plugin :isolate
         | 
| 9 9 | 
             
            Hoe.plugin :rdoc
         | 
| 10 10 |  | 
| 11 | 
            +
            Hoe.add_include_dirs "lib"
         | 
| 11 12 | 
             
            Hoe.add_include_dirs "../../sexp_processor/dev/lib"
         | 
| 12 13 | 
             
            Hoe.add_include_dirs "../../minitest/dev/lib"
         | 
| 13 14 | 
             
            Hoe.add_include_dirs "../../oedipus_lex/dev/lib"
         | 
| 14 15 |  | 
| 15 | 
            -
            V2   = %w[20 21 22 23 24 25 26]
         | 
| 16 | 
            +
            V2   = %w[20 21 22 23 24 25 26 27 30]
         | 
| 16 17 | 
             
            V2.replace [V2.last] if ENV["FAST"] # HACK
         | 
| 17 18 |  | 
| 18 19 | 
             
            Hoe.spec "ruby_parser" do
         | 
| @@ -20,11 +21,18 @@ Hoe.spec "ruby_parser" do | |
| 20 21 |  | 
| 21 22 | 
             
              license "MIT"
         | 
| 22 23 |  | 
| 23 | 
            -
              dependency "sexp_processor", "~> 4. | 
| 24 | 
            -
              dependency "rake", "<  | 
| 24 | 
            +
              dependency "sexp_processor", ["~> 4.15", ">= 4.15.1"]
         | 
| 25 | 
            +
              dependency "rake", [">= 10", "< 15"], :developer
         | 
| 25 26 | 
             
              dependency "oedipus_lex", "~> 2.5", :developer
         | 
| 26 27 |  | 
| 27 | 
            -
               | 
| 28 | 
            +
              # NOTE: Ryan!!! Stop trying to fix this dependency! Isolate just
         | 
| 29 | 
            +
              # can't handle having a faux-gem half-installed! Stop! Just `gem
         | 
| 30 | 
            +
              # install racc` and move on. Revisit this ONLY once racc-compiler
         | 
| 31 | 
            +
              # gets split out.
         | 
| 32 | 
            +
             | 
| 33 | 
            +
              dependency "racc", "~> 1.5", :developer
         | 
| 34 | 
            +
             | 
| 35 | 
            +
              require_ruby_version [">= 2.1", "< 4"]
         | 
| 28 36 |  | 
| 29 37 | 
             
              if plugin? :perforce then     # generated files
         | 
| 30 38 | 
             
                V2.each do |n|
         | 
| @@ -56,6 +64,8 @@ end | |
| 56 64 |  | 
| 57 65 | 
             
            file "lib/ruby_lexer.rex.rb" => "lib/ruby_lexer.rex"
         | 
| 58 66 |  | 
| 67 | 
            +
            task :generate => [:lexer, :parser]
         | 
| 68 | 
            +
             | 
| 59 69 | 
             
            task :clean do
         | 
| 60 70 | 
             
              rm_rf(Dir["**/*~"] +
         | 
| 61 71 | 
             
                    Dir["diff.diff"] + # not all diffs. bit me too many times
         | 
| @@ -89,7 +99,7 @@ end | |
| 89 99 |  | 
| 90 100 | 
             
            def dl v
         | 
| 91 101 | 
             
              dir = v[/^\d+\.\d+/]
         | 
| 92 | 
            -
              url = "https://cache.ruby-lang.org/pub/ruby/#{dir}/ruby-#{v}.tar. | 
| 102 | 
            +
              url = "https://cache.ruby-lang.org/pub/ruby/#{dir}/ruby-#{v}.tar.xz"
         | 
| 93 103 | 
             
              path = File.basename url
         | 
| 94 104 | 
             
              unless File.exist? path then
         | 
| 95 105 | 
             
                system "curl -O #{url}"
         | 
| @@ -101,7 +111,7 @@ def ruby_parse version | |
| 101 111 | 
             
              rp_txt    = "rp#{v}.txt"
         | 
| 102 112 | 
             
              mri_txt   = "mri#{v}.txt"
         | 
| 103 113 | 
             
              parse_y   = "parse#{v}.y"
         | 
| 104 | 
            -
              tarball   = "ruby-#{version}.tar. | 
| 114 | 
            +
              tarball   = "ruby-#{version}.tar.xz"
         | 
| 105 115 | 
             
              ruby_dir  = "ruby-#{version}"
         | 
| 106 116 | 
             
              diff      = "diff#{v}.diff"
         | 
| 107 117 | 
             
              rp_out    = "lib/ruby#{v}_parser.output"
         | 
| @@ -121,23 +131,40 @@ def ruby_parse version | |
| 121 131 | 
             
                end
         | 
| 122 132 | 
             
              end
         | 
| 123 133 |  | 
| 134 | 
            +
              desc "fetch all tarballs"
         | 
| 135 | 
            +
              task :fetch => c_tarball
         | 
| 136 | 
            +
             | 
| 124 137 | 
             
              file c_parse_y => c_tarball do
         | 
| 125 138 | 
             
                in_compare do
         | 
| 126 | 
            -
                   | 
| 139 | 
            +
                  extract_glob = case version
         | 
| 140 | 
            +
                                 when /2\.7|3\.0/
         | 
| 141 | 
            +
                                   "{id.h,parse.y,tool/{id2token.rb,lib/vpath.rb}}"
         | 
| 142 | 
            +
                                 else
         | 
| 143 | 
            +
                                   "{id.h,parse.y,tool/{id2token.rb,vpath.rb}}"
         | 
| 144 | 
            +
                                 end
         | 
| 145 | 
            +
                  system "tar Jxf #{tarball} #{ruby_dir}/#{extract_glob}"
         | 
| 146 | 
            +
             | 
| 127 147 | 
             
                  Dir.chdir ruby_dir do
         | 
| 128 148 | 
             
                    if File.exist? "tool/id2token.rb" then
         | 
| 129 149 | 
             
                      sh "ruby tool/id2token.rb --path-separator=.:./ id.h parse.y | expand > ../#{parse_y}"
         | 
| 130 150 | 
             
                    else
         | 
| 131 151 | 
             
                      sh "expand parse.y > ../#{parse_y}"
         | 
| 132 152 | 
             
                    end
         | 
| 153 | 
            +
             | 
| 154 | 
            +
                    ruby "-pi", "-e", 'gsub(/^%define\s+api\.pure/, "%pure-parser")', "../#{parse_y}"
         | 
| 133 155 | 
             
                  end
         | 
| 134 156 | 
             
                  sh "rm -rf #{ruby_dir}"
         | 
| 135 157 | 
             
                end
         | 
| 136 158 | 
             
              end
         | 
| 137 159 |  | 
| 160 | 
            +
              bison = Dir["/opt/homebrew/opt/bison/bin/bison",
         | 
| 161 | 
            +
                          "/usr/local/opt/bison/bin/bison",
         | 
| 162 | 
            +
                          `which bison`.chomp,
         | 
| 163 | 
            +
                         ].first
         | 
| 164 | 
            +
             | 
| 138 165 | 
             
              file c_mri_txt => [c_parse_y, normalize] do
         | 
| 139 166 | 
             
                in_compare do
         | 
| 140 | 
            -
                  sh "bison -r all #{parse_y}"
         | 
| 167 | 
            +
                  sh "#{bison} -r all #{parse_y}"
         | 
| 141 168 | 
             
                  sh "./normalize.rb parse#{v}.output > #{mri_txt}"
         | 
| 142 169 | 
             
                  rm ["parse#{v}.output", "parse#{v}.tab.c"]
         | 
| 143 170 | 
             
                end
         | 
| @@ -178,20 +205,54 @@ def ruby_parse version | |
| 178 205 | 
             
              end
         | 
| 179 206 | 
             
            end
         | 
| 180 207 |  | 
| 208 | 
            +
            task :versions do
         | 
| 209 | 
            +
              require "open-uri"
         | 
| 210 | 
            +
              require "net/http" # avoid require issues in threads
         | 
| 211 | 
            +
              require "net/https"
         | 
| 212 | 
            +
             | 
| 213 | 
            +
              versions = %w[ 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 3.0 ]
         | 
| 214 | 
            +
             | 
| 215 | 
            +
              base_url = "https://cache.ruby-lang.org/pub/ruby"
         | 
| 216 | 
            +
             | 
| 217 | 
            +
              class Array
         | 
| 218 | 
            +
                def human_sort
         | 
| 219 | 
            +
                  sort_by { |item| item.to_s.split(/(\d+)/).map { |e| [e.to_i, e] } }
         | 
| 220 | 
            +
                end
         | 
| 221 | 
            +
              end
         | 
| 222 | 
            +
             | 
| 223 | 
            +
              versions = versions.map { |ver|
         | 
| 224 | 
            +
                Thread.new {
         | 
| 225 | 
            +
                  URI
         | 
| 226 | 
            +
                    .parse("#{base_url}/#{ver}/")
         | 
| 227 | 
            +
                    .read
         | 
| 228 | 
            +
                    .scan(/ruby-\d+\.\d+\.\d+[-\w.]*?.tar.gz/)
         | 
| 229 | 
            +
                    .reject { |s| s =~ /-(?:rc|preview)\d/ }
         | 
| 230 | 
            +
                    .human_sort
         | 
| 231 | 
            +
                    .last
         | 
| 232 | 
            +
                    .delete_prefix("ruby-")
         | 
| 233 | 
            +
                    .delete_suffix ".tar.gz"
         | 
| 234 | 
            +
                }
         | 
| 235 | 
            +
              }.map(&:value).sort
         | 
| 236 | 
            +
             | 
| 237 | 
            +
              puts versions.map { |v| "ruby_parse %p" % [v] }
         | 
| 238 | 
            +
            end
         | 
| 239 | 
            +
             | 
| 181 240 | 
             
            ruby_parse "2.0.0-p648"
         | 
| 182 | 
            -
            ruby_parse "2.1. | 
| 183 | 
            -
            ruby_parse "2.2. | 
| 241 | 
            +
            ruby_parse "2.1.10"
         | 
| 242 | 
            +
            ruby_parse "2.2.10"
         | 
| 184 243 | 
             
            ruby_parse "2.3.8"
         | 
| 185 | 
            -
            ruby_parse "2.4. | 
| 186 | 
            -
            ruby_parse "2.5. | 
| 187 | 
            -
            ruby_parse "2.6. | 
| 244 | 
            +
            ruby_parse "2.4.10"
         | 
| 245 | 
            +
            ruby_parse "2.5.9"
         | 
| 246 | 
            +
            ruby_parse "2.6.8"
         | 
| 247 | 
            +
            ruby_parse "2.7.4"
         | 
| 248 | 
            +
            ruby_parse "3.0.2"
         | 
| 188 249 |  | 
| 189 250 | 
             
            task :debug => :isolate do
         | 
| 190 251 | 
             
              ENV["V"] ||= V2.last
         | 
| 191 252 | 
             
              Rake.application[:parser].invoke # this way we can have DEBUG set
         | 
| 192 253 | 
             
              Rake.application[:lexer].invoke # this way we can have DEBUG set
         | 
| 193 254 |  | 
| 194 | 
            -
               | 
| 255 | 
            +
              $:.unshift "lib"
         | 
| 195 256 | 
             
              require "ruby_parser"
         | 
| 196 257 | 
             
              require "pp"
         | 
| 197 258 |  | 
| @@ -214,8 +275,9 @@ task :debug => :isolate do | |
| 214 275 |  | 
| 215 276 | 
             
              begin
         | 
| 216 277 | 
             
                pp parser.process(ruby, file, time)
         | 
| 217 | 
            -
              rescue Racc::ParseError => e
         | 
| 278 | 
            +
              rescue ArgumentError, Racc::ParseError => e
         | 
| 218 279 | 
             
                p e
         | 
| 280 | 
            +
                puts e.backtrace.join "\n  "
         | 
| 219 281 | 
             
                ss = parser.lexer.ss
         | 
| 220 282 | 
             
                src = ss.string
         | 
| 221 283 | 
             
                lines = src[0..ss.pos].split(/\n/)
         | 
| @@ -232,12 +294,17 @@ task :debug3 do | |
| 232 294 |  | 
| 233 295 | 
             
              ENV.delete "V"
         | 
| 234 296 |  | 
| 297 | 
            +
              sh "ruby -v"
         | 
| 235 298 | 
             
              sh "ruby -y #{file} 2>&1 | #{munge} > tmp/ruby"
         | 
| 236 299 | 
             
              sh "./tools/ripper.rb -d #{file} | #{munge} > tmp/rip"
         | 
| 237 | 
            -
              sh "rake debug F=#{file} DEBUG=1  | 
| 300 | 
            +
              sh "rake debug F=#{file} DEBUG=1 2>&1 | #{munge} > tmp/rp"
         | 
| 238 301 | 
             
              sh "diff -U 999 -d tmp/{rip,rp}"
         | 
| 239 302 | 
             
            end
         | 
| 240 303 |  | 
| 304 | 
            +
            task :cmp do
         | 
| 305 | 
            +
              sh %(emacsclient --eval '(ediff-files "tmp/ruby" "tmp/rp")')
         | 
| 306 | 
            +
            end
         | 
| 307 | 
            +
             | 
| 241 308 | 
             
            task :cmp3 do
         | 
| 242 309 | 
             
              sh %(emacsclient --eval '(ediff-files3 "tmp/ruby" "tmp/rip" "tmp/rp")')
         | 
| 243 310 | 
             
            end
         | 
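The new `versions` task picks the latest release per series by scanning a directory listing, dropping `rc`/`preview` builds, and sorting "humanly" so that `2.1.10` ranks above `2.1.9`. The sketch below replays that pipeline offline. It is an illustration under assumptions: `LISTING` is a made-up stand-in for the fetched page, `human_sort` is written as a plain method instead of an `Array` extension, and the dots in `.tar.gz` are escaped here although the diffed code leaves them unescaped.

```ruby
# Natural ("human") sort: split each string into digit and non-digit runs so
# that numeric runs compare as integers instead of character by character.
def human_sort(items)
  items.sort_by { |item| item.to_s.split(/(\d+)/).map { |e| [e.to_i, e] } }
end

# Hypothetical stand-in for the directory listing fetched over HTTP.
LISTING = <<~HTML
  <a href="ruby-2.1.2.tar.gz">ruby-2.1.2.tar.gz</a>
  <a href="ruby-2.1.9.tar.gz">ruby-2.1.9.tar.gz</a>
  <a href="ruby-2.1.10.tar.gz">ruby-2.1.10.tar.gz</a>
  <a href="ruby-2.1.0-rc1.tar.gz">ruby-2.1.0-rc1.tar.gz</a>
HTML

# Returns the newest non-prerelease version string found in the listing.
def latest_release(listing)
  names = listing
    .scan(/ruby-\d+\.\d+\.\d+[-\w.]*?\.tar\.gz/)
    .reject { |s| s =~ /-(?:rc|preview)\d/ }
    .uniq
  human_sort(names).last.delete_prefix("ruby-").delete_suffix(".tar.gz")
end

puts latest_release(LISTING) # prints 2.1.10
```

A plain `sort` would order `ruby-2.1.10` before `ruby-2.1.2`, which is why the task needs the digit-aware comparison.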
    
        data/bin/ruby_parse_extract_error
    CHANGED
    
    | @@ -104,9 +104,14 @@ rescue Timeout::Error | |
| 104 104 | 
             
              warn "TIMEOUT parsing #{file}. Skipping."
         | 
| 105 105 |  | 
| 106 106 | 
             
              if $m then
         | 
| 107 | 
            -
                 | 
| 108 | 
            -
                 | 
| 109 | 
            -
                 | 
| 107 | 
            +
                base_dir, *rest = file.split("/")
         | 
| 108 | 
            +
                base_dir.sub!(/\.slow\.?.*/, "")
         | 
| 109 | 
            +
                base_dir += ".slow.#{time}"
         | 
| 110 | 
            +
             | 
| 111 | 
            +
                new_file = File.join(base_dir, *rest)
         | 
| 112 | 
            +
             | 
| 113 | 
            +
                FileUtils.mkdir_p File.dirname(new_file)
         | 
| 114 | 
            +
                FileUtils.move file, new_file, verbose:true
         | 
| 110 115 | 
             
              elsif $t then
         | 
| 111 116 | 
             
                File.unlink file
         | 
| 112 117 | 
             
              end
         | 
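The added lines in this hunk rewrite only the first path component: any existing `.slow...` suffix is stripped and a fresh `.slow.<time>` suffix is appended, so a file that times out again is moved to a new directory instead of piling up suffixes. A minimal sketch of just the path arithmetic, with the filesystem calls left out (the helper name `slow_path` is hypothetical, and what `time` holds in the script is assumed to be the timeout value):

```ruby
# Hypothetical helper mirroring the path computation in the hunk above.
# It does not create directories or move anything.
def slow_path(file, time)
  base_dir, *rest = file.split("/")
  # Drop a previous ".slow.N" suffix, then tag with the current value.
  base_dir = base_dir.sub(/\.slow\.?.*/, "") + ".slow.#{time}"
  File.join(base_dir, *rest)
end

puts slow_path("gems/foo/lib/foo.rb", 10)         # prints gems.slow.10/foo/lib/foo.rb
puts slow_path("gems.slow.10/foo/lib/foo.rb", 20) # prints gems.slow.20/foo/lib/foo.rb
```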
    
        data/compare/normalize.rb
    CHANGED
    
    | @@ -8,6 +8,10 @@ order = [] | |
| 8 8 |  | 
| 9 9 | 
             
            def munge s
         | 
| 10 10 | 
             
              renames = [
         | 
| 11 | 
            +
                         # unquote... wtf?
         | 
| 12 | 
            +
                         /`(.+?)'/,          proc { $1 },
         | 
| 13 | 
            +
                         /"'(.+?)'"/,        proc { "\"#{$1}\"" },
         | 
| 14 | 
            +
             | 
| 11 15 | 
             
                         "'='",             "tEQL",
         | 
| 12 16 | 
             
                         "'!'",             "tBANG",
         | 
| 13 17 | 
             
                         "'%'",             "tPERCENT",
         | 
| @@ -100,6 +104,43 @@ def munge s | |
| 100 104 |  | 
| 101 105 | 
             
                         "kVARIABLE",       "keyword_variable", # ugh: this is a rule name
         | 
| 102 106 |  | 
| 107 | 
            +
                         # 2.7 changes:
         | 
| 108 | 
            +
             | 
| 109 | 
            +
                         '"global variable"',          "tGVAR",
         | 
| 110 | 
            +
                         '"operator-assignment"',      "tOP_ASGN",
         | 
| 111 | 
            +
                         '"back reference"',           "tBACK_REF",
         | 
| 112 | 
            +
                         '"numbered reference"',       "tNTH_REF",
         | 
| 113 | 
            +
                         '"local variable or method"', "tIDENTIFIER",
         | 
| 114 | 
            +
                         '"constant"',                 "tCONSTANT",
         | 
| 115 | 
            +
             | 
| 116 | 
            +
                         '"(.."',                  "tBDOT2",
         | 
| 117 | 
            +
                         '"(..."',                 "tBDOT3",
         | 
| 118 | 
            +
                         '"char literal"',         "tCHAR",
         | 
| 119 | 
            +
                         '"literal content"',      "tSTRING_CONTENT",
         | 
| 120 | 
            +
                         '"string literal"',       "tSTRING_BEG",
         | 
| 121 | 
            +
                         '"symbol literal"',       "tSYMBEG",
         | 
| 122 | 
            +
                         '"backtick literal"',     "tXSTRING_BEG",
         | 
| 123 | 
            +
                         '"regexp literal"',       "tREGEXP_BEG",
         | 
| 124 | 
            +
                         '"word list"',            "tWORDS_BEG",
         | 
| 125 | 
            +
                         '"verbatim word list"',   "tQWORDS_BEG",
         | 
| 126 | 
            +
                         '"symbol list"',          "tSYMBOLS_BEG",
         | 
| 127 | 
            +
                         '"verbatim symbol list"', "tQSYMBOLS_BEG",
         | 
| 128 | 
            +
             | 
| 129 | 
            +
                         '"float literal"',        "tFLOAT",
         | 
| 130 | 
            +
                         '"imaginary literal"',    "tIMAGINARY",
         | 
| 131 | 
            +
                         '"integer literal"',      "tINTEGER",
         | 
| 132 | 
            +
                         '"rational literal"',     "tRATIONAL",
         | 
| 133 | 
            +
             | 
| 134 | 
            +
                         '"instance variable"',  "tIVAR",
         | 
| 135 | 
            +
                         '"class variable"',     "tCVAR",
         | 
| 136 | 
            +
                         '"terminator"',         "tSTRING_END", # TODO: switch this?
         | 
| 137 | 
            +
                         '"method"',             "tFID",
         | 
| 138 | 
            +
                         '"}"',                  "tSTRING_DEND",
         | 
| 139 | 
            +
             | 
| 140 | 
            +
                         '"do for block"',     "kDO_BLOCK",
         | 
| 141 | 
            +
                         '"do for condition"', "kDO_COND",
         | 
| 142 | 
            +
                         '"do for lambda"',    "kDO_LAMBDA",
         | 
| 143 | 
            +
             | 
| 103 144 | 
             
                         # UGH
         | 
| 104 145 | 
             
                         "k_LINE__",       "k__LINE__",
         | 
| 105 146 | 
             
                         "k_FILE__",       "k__FILE__",
         | 
| @@ -107,13 +148,12 @@ def munge s | |
| 107 148 |  | 
| 108 149 | 
             
                         '"defined?"',     "kDEFINED",
         | 
| 109 150 |  | 
| 110 | 
            -
             | 
| 111 151 | 
             
                         '"do (for condition)"', "kDO_COND",
         | 
| 112 152 | 
             
                         '"do (for lambda)"',    "kDO_LAMBDA",
         | 
| 113 153 | 
             
                         '"do (for block)"',     "kDO_BLOCK",
         | 
| 114 154 |  | 
| 115 | 
            -
                         /\"(\w+) \(modifier\) | 
| 116 | 
            -
                         /\"(\w+)\"/, | 
| 155 | 
            +
                         /\"(\w+) \(?modifier\)?\"/, proc { |x| "k#{$1.upcase}_MOD" },
         | 
| 156 | 
            +
                         /\"(\w+)\"/,                proc { |x| "k#{$1.upcase}" },
         | 
| 117 157 |  | 
| 118 158 | 
             
                         /@(\d+)(\s+|$)/,       "",
         | 
| 119 159 | 
             
                        ]
         | 
| @@ -134,7 +174,7 @@ ARGF.each_line do |line| | |
| 134 174 |  | 
| 135 175 | 
             
              case line.strip
         | 
| 136 176 | 
             
              when /^$/ then
         | 
| 137 | 
            -
              when /^(\d+) ( | 
| 177 | 
            +
              when /^(\d+) (\$?[@\w]+): (.*)/ then    # yacc
         | 
| 138 178 | 
             
                rule = $2
         | 
| 139 179 | 
             
                order << rule unless rules.has_key? rule
         | 
| 140 180 | 
             
                rules[rule] << munge($3)
         | 
| @@ -159,7 +199,7 @@ ARGF.each_line do |line| | |
| 159 199 | 
             
              when /^\cL/ then                     # byacc
         | 
| 160 200 | 
             
                break
         | 
| 161 201 | 
             
              else
         | 
| 162 | 
            -
                warn "unparsed: #{$.}: #{line. | 
| 202 | 
            +
                warn "unparsed: #{$.}: #{line.strip.inspect}"
         | 
| 163 203 | 
             
              end
         | 
| 164 204 | 
             
            end
         | 
| 165 205 |  | 
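The `renames` table in `munge` is a flat list of pattern/replacement pairs, where the replacement is either a literal token name or a proc that builds one from the match. The loop that applies the table is not part of this diff, so the following is only a sketch of how such a table can be applied in order. The names `RENAMES` and `apply_renames` are hypothetical, the table is a small excerpt, and the sketch passes the `MatchData` to a lambda instead of relying on `$1` as the diffed procs do.

```ruby
# Excerpt of a rename table as explicit pairs. Callable replacements receive
# the MatchData for the current match.
RENAMES = [
  ['"(.."',  "tBDOT2"],
  ['"(..."', "tBDOT3"],
  [/"(\w+) \(?modifier\)?"/, ->(m) { "k#{m[1].upcase}_MOD" }],
  [/"(\w+)"/,                ->(m) { "k#{m[1].upcase}" }],
]

# Applies each pair in table order, so specific patterns listed earlier
# take effect before the generic keyword pattern at the end.
def apply_renames(text, renames = RENAMES)
  renames.reduce(text) do |s, (pattern, replacement)|
    if replacement.respond_to?(:call)
      s.gsub(pattern) { replacement.call(Regexp.last_match) }
    else
      s.gsub(pattern, replacement)
    end
  end
end

puts apply_renames('"if (modifier)" "(.." "while"') # prints kIF_MOD tBDOT2 kWHILE
puts apply_renames('"unless modifier"')             # prints kUNLESS_MOD
```

The optional parentheses in `\(?modifier\)?` let one entry cover both the older `"if (modifier)"` spelling and the newer `"if modifier"` one.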
    
        data/debugging.md
    CHANGED
    
    | @@ -1,5 +1,44 @@ | |
| 1 1 | 
             
            # Quick Notes to Help with Debugging
         | 
| 2 2 |  | 
| 3 | 
            +
            ## Reducing
         | 
| 4 | 
            +
             | 
| 5 | 
            +
            One of the most important steps is reducing the code sample to a
         | 
| 6 | 
            +
            minimal reproduction. For example, one thing I'm debugging right now
         | 
| 7 | 
            +
            was reported as:
         | 
| 8 | 
            +
             | 
| 9 | 
            +
            ```ruby
         | 
| 10 | 
            +
            a, b, c, d, e, f, g, h, i, j = 1, *[p1, p2, p3], *[p1, p2, p3], *[p4, p5, p6]
         | 
| 11 | 
            +
            ```
         | 
| 12 | 
            +
             | 
| 13 | 
            +
            This original sample has 10 items on the left-hand-side (LHS) and 1 +
         | 
| 14 | 
            +
            3 groups of 3 (calls) on the RHS + 3 arrays + 3 splats. That's a lot.
         | 
| 15 | 
            +
             | 
| 16 | 
            +
            It's already been reported (perhaps incorrectly) that this has to do
         | 
| 17 | 
            +
            with multiple splats on the RHS, so let's focus on that. At a minimum
         | 
| 18 | 
            +
            the code can be reduced to 2 splats on the RHS and some
         | 
| 19 | 
            +
            experimentation shows that it needs a non-splat item to fail:
         | 
| 20 | 
            +
             | 
| 21 | 
            +
            ```
         | 
| 22 | 
            +
            _, _, _ = 1, *[2], *[3]
         | 
| 23 | 
            +
            ```
         | 
| 24 | 
            +
             | 
| 25 | 
            +
            and some intuition further removed the arrays:
         | 
| 26 | 
            +
             | 
| 27 | 
            +
            ```
         | 
| 28 | 
            +
            _, _, _ = 1, *2, *3
         | 
| 29 | 
            +
            ```
         | 
| 30 | 
            +
             | 
| 31 | 
            +
            the difference is huge and will make a ton of difference when
         | 
| 32 | 
            +
            debugging.
         | 
| 33 | 
            +
             | 
| 34 | 
            +
            ## Getting something to compare
         | 
| 35 | 
            +
             | 
| 36 | 
            +
            ```
         | 
| 37 | 
            +
            % rake debug3 F=file.rb
         | 
| 38 | 
            +
            ```
         | 
| 39 | 
            +
             | 
| 40 | 
            +
            TODO
         | 
| 41 | 
            +
             | 
| 3 42 | 
             
            ## Comparing against ruby / ripper:
         | 
| 4 43 |  | 
| 5 44 | 
             
            ```
         | 
| @@ -16,3 +55,136 @@ From there? Good luck. I'm currently trying to backtrack from rule | |
| 16 55 | 
             
            reductions to state change differences. I'd like to figure out a way
         | 
| 17 56 | 
             
            to go from this sort of diff to a reasonable test that checks state
         | 
| 18 57 | 
             
            changes but I don't have that set up at this point.
         | 
| 58 | 
            +
             | 
| 59 | 
            +
            ## Adding New Grammar Productions
         | 
| 60 | 
            +
             | 
| 61 | 
            +
            Ruby adds stuff to the parser ALL THE TIME. It's actually hard to keep
         | 
| 62 | 
            +
            up with, but I've added some tools and shown what a typical workflow
         | 
| 63 | 
            +
            looks like. Let's say you want to add Ruby 2.7's "beginless range" (e.g.
         | 
| 64 | 
            +
            `..42`).
         | 
| 65 | 
            +
             | 
| 66 | 
            +
            Whenever there's a language feature missing, I start with comparing
         | 
| 67 | 
            +
            the parse trees between MRI and RP:
         | 
| 68 | 
            +
             | 
| 69 | 
            +
            ### Structural Comparing
         | 
| 70 | 
            +
             | 
| 71 | 
            +
            There are a bunch of rake tasks (`compare27`, `compare26`, etc.) that try
         | 
| 72 | 
            +
            to normalize and diff MRI's parse.y parse tree (just the structure of
         | 
| 73 | 
            +
            the tree in yacc) to ruby\_parser's parse tree (racc). It's the first
         | 
| 74 | 
            +
            thing I do when I'm adding a new version. Stub out all the version
         | 
| 75 | 
            +
            differences, and then start to diff the structure and move
         | 
| 76 | 
            +
            ruby\_parser towards the new changes.
         | 
| 77 | 
            +
             | 
| 78 | 
            +
            Some differences are just gonna be there... but here's an example of a
         | 
| 79 | 
            +
            real diff between MRI 2.7 and ruby_parser as of today:
         | 
| 80 | 
            +
             | 
| 81 | 
            +
            ```diff
         | 
| 82 | 
            +
                 arg tDOT3 arg
         | 
| 83 | 
            +
                 arg tDOT2
         | 
| 84 | 
            +
                 arg tDOT3
         | 
| 85 | 
            +
            -    tBDOT2 arg
         | 
| 86 | 
            +
            -    tBDOT3 arg
         | 
| 87 | 
            +
                 arg tPLUS arg
         | 
| 88 | 
            +
                 arg tMINUS arg
         | 
| 89 | 
            +
                 arg tSTAR2 arg
         | 
| 90 | 
            +
            ```
         | 
| 91 | 
            +
             | 
| 92 | 
            +
            This is a new language feature that ruby_parser doesn't handle yet.
         | 
| 93 | 
            +
            It's in MRI (the left hand side of the diff) but not ruby\_parser (the
         | 
| 94 | 
            +
            right hand side) so it is a `-` or missing line.
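The idea behind reading that diff can be sketched with plain array subtraction. This is only an illustration with hand-written, hypothetical production lists; the real rake tasks derive their input from MRI's parse.y and ruby\_parser's grammar and do far more normalizing:

```ruby
# Hypothetical, hand-written production lists for illustration only.
mri_productions = ["arg tDOT2", "arg tDOT3", "tBDOT2 arg", "tBDOT3 arg", "arg tPLUS arg"]
rp_productions  = ["arg tDOT2", "arg tDOT3", "arg tPLUS arg"]

# Productions only in MRI are what ruby_parser still has to implement ("-" lines).
missing = mri_productions - rp_productions
# Productions only in ruby_parser would show up as "+" lines.
extra   = rp_productions - mri_productions

missing.each { |prod| puts "-    #{prod}" }
extra.each   { |prod| puts "+    #{prod}" }
```

With these lists, only the two `tBDOT` productions are printed, both as `-` lines.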
         | 
| 95 | 
            +
             | 
| 96 | 
            +
            Some other diffs will have both `+` and `-` lines. That usually
         | 
| 97 | 
            +
            happens when MRI has been refactoring the grammar. Sometimes I choose
         | 
| 98 | 
            +
            to adopt those refactorings and sometimes it starts to get too
         | 
| 99 | 
            +
            difficult to maintain multiple versions of ruby parsing in a single
         | 
| 100 | 
            +
            file.
         | 
| 101 | 
            +
             | 
| 102 | 
            +
            But! This structural comparing is always a place you should look when
         | 
| 103 | 
            +
            ruby_parser is failing to parse something. Maybe it just hasn't been
         | 
| 104 | 
            +
            implemented yet and the easiest place to look is the diff.
         | 
| 105 | 
            +
             | 
| 106 | 
            +
            ### Starting Test First
         | 
| 107 | 
            +
             | 
| 108 | 
            +
            The next thing I do is to add a parser test to cover that feature. I
         | 
| 109 | 
            +
            usually start with the parser and work backwards towards the lexer as
         | 
| 110 | 
            +
            needed, as I find it structures things properly and keeps things goal
         | 
| 111 | 
            +
            oriented.
         | 
| 112 | 
            +
             | 
| 113 | 
            +
            So, make a new parser test, usually in the versioned section of the
         | 
| 114 | 
            +
            parser tests.
         | 
| 115 | 
            +
             | 
| 116 | 
            +
            ```ruby
         | 
| 117 | 
            +
              def test_beginless2
         | 
| 118 | 
            +
                rb = "..10\n; ..a\n; c"
         | 
| 119 | 
            +
                pt = s(:block,
         | 
| 120 | 
            +
                       s(:dot2, nil, s(:lit, 10).line(1)).line(1),
         | 
| 121 | 
            +
                       s(:dot2, nil, s(:call, nil, :a).line(2)).line(2),
         | 
| 122 | 
            +
                       s(:call, nil, :c).line(3)).line(1)
         | 
| 123 | 
            +
             | 
| 124 | 
            +
                assert_parse_line rb, pt, 1
         | 
| 125 | 
            +
             | 
| 126 | 
            +
                flunk "not done yet"
         | 
| 127 | 
            +
              end
         | 
| 128 | 
            +
            ```
         | 
| 129 | 
            +
             | 
| 130 | 
            +
            (In this case I copied and modified the tests for open ranges from 2.6)
         | 
| 131 | 
            +
            and run it to get my first error:
         | 
| 132 | 
            +
             | 
| 133 | 
            +
            ```
         | 
| 134 | 
            +
            % rake N=/beginless/
         | 
| 135 | 
            +
             | 
| 136 | 
            +
            ...
         | 
| 137 | 
            +
             | 
| 138 | 
            +
            E
         | 
| 139 | 
            +
             | 
| 140 | 
            +
            Finished in 0.021814s, 45.8421 runs/s, 0.0000 assertions/s.
         | 
| 141 | 
            +
             | 
| 142 | 
            +
              1) Error:
         | 
| 143 | 
            +
            TestRubyParserV27#test_whatevs:
         | 
| 144 | 
            +
            Racc::ParseError: (string):1 :: parse error on value ".." (tDOT2)
         | 
| 145 | 
            +
                GEMS/2.7.0/gems/racc-1.5.0/lib/racc/parser.rb:538:in `on_error'
         | 
| 146 | 
            +
                WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1304:in `on_error'
         | 
| 147 | 
            +
                (eval):3:in `_racc_do_parse_c'
         | 
| 148 | 
            +
                (eval):3:in `do_parse'
         | 
| 149 | 
            +
                WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1329:in `block in process'
         | 
| 150 | 
            +
                RUBY/lib/ruby/2.7.0/timeout.rb:95:in `block in timeout'
         | 
| 151 | 
            +
                RUBY/lib/ruby/2.7.0/timeout.rb:33:in `block in catch'
         | 
| 152 | 
            +
                RUBY/lib/ruby/2.7.0/timeout.rb:33:in `catch'
         | 
| 153 | 
            +
                RUBY/lib/ruby/2.7.0/timeout.rb:33:in `catch'
         | 
| 154 | 
            +
                RUBY/lib/ruby/2.7.0/timeout.rb:110:in `timeout'
         | 
| 155 | 
            +
                WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1317:in `process'
         | 
| 156 | 
            +
                WORK/ruby_parser/dev/test/test_ruby_parser.rb:4198:in `assert_parse'
         | 
| 157 | 
            +
                WORK/ruby_parser/dev/test/test_ruby_parser.rb:4221:in `assert_parse_line'
         | 
| 158 | 
            +
                WORK/ruby_parser/dev/test/test_ruby_parser.rb:4451:in `test_whatevs'
         | 
| 159 | 
            +
            ```
         | 
| 160 | 
            +
             | 
| 161 | 
            +
            For starters, we know the missing production is for `tBDOT2 arg`. It
         | 
| 162 | 
            +
            is currently blowing up because it is getting `tDOT2` and simply
         | 
| 163 | 
            +
            doesn't know what to do with it, so it raises the error. As the diff
         | 
| 164 | 
            +
            suggests, that's the wrong token to begin with, so it is probably time
         | 
| 165 | 
            +
            to also create a lexer test:
         | 
| 166 | 
            +
             | 
| 167 | 
            +
            ```ruby
         | 
| 168 | 
            +
            def test_yylex_bdot2
         | 
| 169 | 
            +
              assert_lex3("..42",
         | 
| 170 | 
            +
                          s(:dot2, nil, s(:lit, 42)),
         | 
| 171 | 
            +
             | 
| 172 | 
            +
                          :tBDOT2,   "..", EXPR_BEG,
         | 
| 173 | 
            +
                          :tINTEGER, "42", EXPR_NUM)
         | 
| 174 | 
            +
             | 
| 175 | 
            +
              flunk "not done yet"
         | 
| 176 | 
            +
            end
         | 
| 177 | 
            +
            ```
         | 
| 178 | 
            +
             | 
| 179 | 
            +
            This one is mostly speculative at this point. It says "if we're lexing
         | 
| 180 | 
            +
            this string, we should get this sexp if we fully parse it, and the
         | 
| 181 | 
            +
            lexical stream should look like this"... That last bit is mostly made
         | 
| 182 | 
            +
            up at this point. Sometimes I don't know exactly what expression state
         | 
| 183 | 
            +
            things should be in until I start really digging in.
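To make the token distinction concrete, here is a toy lexer. It is a sketch under heavy assumptions and not ruby\_parser's implementation: `toy_lex` and `at_expr_start` are hypothetical names, and the single boolean is a crude stand-in for the real expression states such as `EXPR_BEG`. It only shows why `..` needs a different token when nothing precedes it:

```ruby
# Toy lexer, NOT ruby_parser's implementation: it only knows integers,
# identifiers, "..", and statement separators.
def toy_lex(src)
  tokens = []
  at_expr_start = true # hypothetical stand-in for the lexer's EXPR_BEG state

  src.scan(/\.\.|\d+|[a-z_]\w*|\n|;/) do |text|
    case text
    when ".."
      # At the start of an expression there is no left operand: beginless range.
      tokens << [at_expr_start ? :tBDOT2 : :tDOT2, text]
      at_expr_start = true
    when /\A\d+\z/
      tokens << [:tINTEGER, text]
      at_expr_start = false
    when "\n", ";"
      at_expr_start = true
    else
      tokens << [:tIDENTIFIER, text]
      at_expr_start = false
    end
  end

  tokens
end

p toy_lex("..42")  # => [[:tBDOT2, ".."], [:tINTEGER, "42"]]
p toy_lex("1..42") # => [[:tINTEGER, "1"], [:tDOT2, ".."], [:tINTEGER, "42"]]
```

The real lexer has many more states to consider, which is why the expected states in the test above are a guess until the digging is done.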
         | 
| 184 | 
            +
             | 
| 185 | 
            +
            At this point, I have 2 failing tests that are directing me in the
         | 
| 186 | 
            +
            right direction. It's now a matter of digging through
         | 
| 187 | 
            +
            `compare/parse27.y` to see how the lexer differs and implementing
         | 
| 188 | 
            +
            it...
         | 
| 189 | 
            +
             | 
| 190 | 
            +
            But this is a good start to the doco for now. I'll add more later.
         |