ruby_parser 3.15.0 → 3.18.0
- checksums.yaml +4 -4
- checksums.yaml.gz.sig +0 -0
- data/History.rdoc +101 -0
- data/Manifest.txt +5 -0
- data/README.rdoc +1 -0
- data/Rakefile +128 -30
- data/bin/ruby_parse_extract_error +1 -1
- data/compare/normalize.rb +8 -3
- data/debugging.md +133 -0
- data/gauntlet.md +106 -0
- data/lib/rp_extensions.rb +15 -36
- data/lib/rp_stringscanner.rb +20 -51
- data/lib/ruby20_parser.rb +3559 -3499
- data/lib/ruby20_parser.y +333 -248
- data/lib/ruby21_parser.rb +3650 -3614
- data/lib/ruby21_parser.y +328 -245
- data/lib/ruby22_parser.rb +3690 -3628
- data/lib/ruby22_parser.y +332 -247
- data/lib/ruby23_parser.rb +3629 -3573
- data/lib/ruby23_parser.y +332 -247
- data/lib/ruby24_parser.rb +3712 -3654
- data/lib/ruby24_parser.y +332 -247
- data/lib/ruby25_parser.rb +3712 -3654
- data/lib/ruby25_parser.y +332 -247
- data/lib/ruby26_parser.rb +3715 -3658
- data/lib/ruby26_parser.y +332 -246
- data/lib/ruby27_parser.rb +5009 -3722
- data/lib/ruby27_parser.y +928 -245
- data/lib/ruby30_parser.rb +8741 -0
- data/lib/ruby30_parser.y +3463 -0
- data/lib/ruby3_parser.yy +3467 -0
- data/lib/ruby_lexer.rb +273 -602
- data/lib/ruby_lexer.rex +28 -21
- data/lib/ruby_lexer.rex.rb +60 -24
- data/lib/ruby_lexer_strings.rb +638 -0
- data/lib/ruby_parser.rb +2 -0
- data/lib/ruby_parser.yy +969 -252
- data/lib/ruby_parser_extras.rb +297 -116
- data/test/test_ruby_lexer.rb +213 -129
- data/test/test_ruby_parser.rb +1288 -110
- data/tools/munge.rb +36 -8
- data/tools/ripper.rb +15 -10
- data.tar.gz.sig +0 -0
- metadata +48 -35
- metadata.gz.sig +1 -4
    
checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 36780d9d3244dd62d13430987076d5e81ae2e536d6d2bfd259f8a612da3d94cc
+  data.tar.gz: bec4b32e7f7a8d9ae8e3202f30230f351a2fedc6e2ac4e984260486dbb7529c6
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: f28d02d2b14687e365bab3a353348b93a9df993be2d1afd3f2783b5b97ca016a6ca2f834ef61ebb4a4eae3decc38e1351349679f951f901bef09c25f23d44322
+  data.tar.gz: 276ecce4db1f72ed2ce0d276679e65419225a46b885d0050aa7ba6382b45033ccd24b5006a0d382f0aecdbb6c5a5fd93e3e826adeafccc3c47ee051b76772eee

checksums.yaml.gz.sig CHANGED

Binary file
    
data/History.rdoc CHANGED

@@ -1,3 +1,104 @@
+=== 3.18.0 / 2021-10-27
+
+Holy crap... 58 commits! 2.7 and 3.0 are feature complete. Strings
+& heredocs have been rewritten.
+
+* 9 major enhancements:
+
+  * !!! Rewrote lexer (and friends) for strings, heredocs, and %*[] constructs.
+  * Massive overhaul on line numbers.
+  * Freeze input! Finally!!! No more modifying the input string for heredocs.
+  * Overhauled RPStringScanner. Removed OLD compatibility methods!
+  * Removed Sexp methods: value, to_sym, add, add_all, node_type, values.
+    * value moved to sexp_processor.
+  * Removed String#grep monkey-patch.
+  * Removed String#lineno monkey-patch.
+  * Removed string_to_pos, charpos, etc hacks for ancient ruby versions.
+  * Removed unread_many... NO! NO EDITING THE INPUT STRING!
+
+* 31 minor enhancements:
+
+  * 2.7/3.0: many more pattern edge cases
+  * 2.7: Added `mlhs = rhs rescue expr`
+  * 2.7: refactored destructured args (`|(k,v)|`) and unfactored(?!) case_body/args.
+  * 3.0: excessed_comma
+  * 3.0: finished most everything: endless methods, patterns, etc.
+  * 3.0: refactored / added new pattern changes
+  * Added RubyLexer#in_heredoc? (ie, is there old_ss ?)
+  * Added RubyLexer#old_ss and old_lineno and removed much of SSStack(ish).
+  * Added Symbol#end_with? when necessary
+  * Added TALLY and DEBUG options for ss.getch and ss.scan
+  * Added ignore_body_comments to make parser productions more clear.
+  * Added support for no_kwarg (eg `def f(**nil)`).
+  * Added support for no_kwarg in blocks (eg `f { |**nil| }`).
+  * Augmented generated parser files to have frozen_string_literal comments and fixed tests.
+  * Broke out 3.0 parser into its own to ease development.
+  * Bumped dependencies on sexp_processor and oedipus_lex.
+  * Clean generated 3.x files.
+  * Extracted all string scanner methods to their own module.
+  * Fixed some precedence decls.
+  * Implemented most of pattern matching for 2.7+.
+  * Improve lex_state= to report location in verbose debug mode.
+  * Made it easier to debug with a particular version of ruby via rake.
+  * Make sure ripper uses the same version of ruby we specified.
+  * Moved all string/heredoc/etc code to ruby_lexer_strings.rb
+  * Remove warning from newer bisons.
+  * Sprinkled in some frozen_string_literal, but mostly helped by oedipus bump.
+  * Switch to comparing against ruby binary since ripper is buggy.
+  * bugs task should try both bug*.rb and bad*.rb.
+  * endless methods
+  * f_any_kwrest refactoring.
+  * refactored defn/defs
+
+* 15 bug fixes:
+
+  * Cleaned a bunch of old hacks. Initializing RubyLexer w/ Parser is cleaner now.
+  * Corrected some lex_state errors in process_token_keyword.
+  * Fixed ancient ruby2 change (use #lines) in ruby_parse_extract_error.
+  * Fixed bug where else without rescue only raises on 2.6+
+  * Fixed caller for getch and scan when DEBUG=1
+  * Fixed comments in the middle of message cascades.
+  * Fixed differences w/ symbol productions against ruby 2.7.
+  * Fixed dsym to use string_contents production.
+  * Fixed error in bdot2/3 in some edge cases. Fixed p_alt line.
+  * Fixed heredoc dedenting in the presence of empty lines. (mvz)
+  * Fixed some leading whitespace / comment processing
+  * Fixed up how class/module/defn/defs comments were collected.
+  * Overhauled ripper.rb to deal with buggy ripper w/ yydebug.
+  * Removed dsym from literal.
+  * Removed tUBANG lexeme but kept it distinct as a method name (eg: `def !@`).
+
+=== 3.17.0 / 2021-08-03
+
+* 1 minor enhancement:
+
+  * Added support for arg forwarding (eg `def f(...); m(...); end`) (presidentbeef)
+
+=== 3.16.0 / 2021-05-15
+
+* 1 major enhancement:
+
+  * Added tentative 3.0 support.
+
+* 3 minor enhancements:
+
+  * Added lexing for "beginless range" (bdots).
+  * Added parsing for bdots.
+  * Updated rake compare task to download xz files, bumped versions, etc
+
+* 4 bug fixes:
+
+  * Bump rake dependency to >= 10, < 15. (presidentbeef)
+  * Bump sexp_processor dependency to 4.15.1+. (pravi)
+  * Fixed minor state mismatch at the end of parsing to make diffing a little cleaner.
+  * Fixed normalizer to deal with new bison token syntax
+
+=== 3.15.1 / 2021-01-10
+
+* 1 bug fix:
+
+  * Bumped ruby version to include < 4 (trunk).
+
 === 3.15.0 / 2020-08-31
 
 * 1 major enhancement:
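For orientation, the source forms named in the History.rdoc entries (arg forwarding, beginless ranges, `**nil`, endless methods, pattern matching) look roughly like this in plain Ruby. This is only a sketch of the syntax, assuming Ruby 3.0 or newer; it does not call ruby_parser, and every method and constant name in it is invented for illustration.

```ruby
# Plain Ruby (assumes 3.0+). All names below are illustrative only.

# Argument forwarding (3.17.0 entry): `...` passes every argument through.
def target(*args, **opts)
  [args, opts]
end

def forward(...)
  target(...)
end

# Beginless range, "bdots" (3.16.0 entries).
UP_TO_FIVE = (..5)

# no_kwarg (3.18.0 entry): `**nil` declares that no keywords are accepted.
def no_keywords(value, **nil)
  value
end

# Endless method definition (3.18.0 entry, Ruby 3.0 syntax).
def square(x) = x * x

# Pattern matching with case/in (3.18.0 entry, Ruby 2.7+ syntax).
def describe(data)
  case data
  in { name: String => name, tags: [first, *] }
    "#{name} tagged #{first}"
  in [Integer => a, Integer => b]
    "pair #{a},#{b}"
  else
    "unknown"
  end
end
```

Calling `no_keywords(1, a: 2)` raises `ArgumentError`, which is the behaviour `**nil` exists to declare.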
    
data/Manifest.txt CHANGED

@@ -7,6 +7,7 @@ bin/ruby_parse
 bin/ruby_parse_extract_error
 compare/normalize.rb
 debugging.md
+gauntlet.md
 lib/.document
 lib/rp_extensions.rb
 lib/rp_stringscanner.rb
@@ -26,9 +27,13 @@ lib/ruby26_parser.rb
 lib/ruby26_parser.y
 lib/ruby27_parser.rb
 lib/ruby27_parser.y
+lib/ruby30_parser.rb
+lib/ruby30_parser.y
+lib/ruby3_parser.yy
 lib/ruby_lexer.rb
 lib/ruby_lexer.rex
 lib/ruby_lexer.rex.rb
+lib/ruby_lexer_strings.rb
 lib/ruby_parser.rb
 lib/ruby_parser.yy
 lib/ruby_parser_extras.rb
    
data/README.rdoc CHANGED

@@ -32,6 +32,7 @@ Tested against 801,039 files from the latest of all rubygems (as of 2013-05):
 * 1.8 parser is at 99.9739% accuracy, 3.651 sigma
 * 1.9 parser is at 99.9940% accuracy, 4.013 sigma
 * 2.0 parser is at 99.9939% accuracy, 4.008 sigma
+* 2.6 parser is at 99.9972% accuracy, 4.191 sigma
 
 == FEATURES/PROBLEMS:
 
    
data/Rakefile CHANGED

@@ -14,25 +14,37 @@ Hoe.add_include_dirs "../../minitest/dev/lib"
 Hoe.add_include_dirs "../../oedipus_lex/dev/lib"
 
 V2   = %w[20 21 22 23 24 25 26 27]
-
+V3   = %w[30]
+
+VERS = V2 + V3
+
+ENV["FAST"] = VERS.last if ENV["FAST"] && !VERS.include?(ENV["FAST"])
+VERS.replace [ENV["FAST"]] if ENV["FAST"]
 
 Hoe.spec "ruby_parser" do
   developer "Ryan Davis", "ryand-ruby@zenspider.com"
 
   license "MIT"
 
-  dependency "sexp_processor", "~> 4.
-  dependency "rake", "<
-  dependency "oedipus_lex", "~> 2.
+  dependency "sexp_processor", "~> 4.16"
+  dependency "rake", [">= 10", "< 15"], :developer
+  dependency "oedipus_lex", "~> 2.6", :developer
+
+  # NOTE: Ryan!!! Stop trying to fix this dependency! Isolate just
+  # can't handle having a faux-gem half-installed! Stop! Just `gem
+  # install racc` and move on. Revisit this ONLY once racc-compiler
+  # gets split out.
 
-
+  dependency "racc", "~> 1.5", :developer
+
+  require_ruby_version [">= 2.1", "< 4"]
 
   if plugin? :perforce then     # generated files
-
+    VERS.each do |n|
       self.perforce_ignore << "lib/ruby#{n}_parser.rb"
     end
 
-
+    VERS.each do |n|
       self.perforce_ignore << "lib/ruby#{n}_parser.y"
     end
 
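The two `FAST` lines added in this hunk narrow the list of parser versions to a single entry. A minimal sketch of that selection logic follows, assuming the same `V2` and `V3` lists. `select_versions` is a hypothetical helper written for this note; the Rakefile does the same thing inline by mutating `ENV` and `VERS`.

```ruby
# Sketch only: `select_versions` is a hypothetical stand-in for the inline
# ENV["FAST"] handling in the Rakefile.
V2 = %w[20 21 22 23 24 25 26 27]
V3 = %w[30]

def select_versions(fast, all = V2 + V3)
  return all unless fast                     # FAST unset: keep every version
  fast = all.last unless all.include?(fast)  # unknown value (e.g. "1"): use the newest
  [fast]                                     # otherwise build just that one version
end

p select_versions(nil)   # all nine versions
p select_versions("26")  # only "26"
p select_versions("1")   # falls back to the last entry, "30"
```

Under this reading, `FAST=1 rake` limits generation to the newest parser and `FAST=26 rake` to the 2.6 parser.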
@@ -46,6 +58,23 @@ Hoe.spec "ruby_parser" do
   end
 end
 
+def maybe_add_to_top path, string
+  file = File.read path
+
+  return if file.start_with? string
+
+  warn "Altering top of #{path}"
+  tmp_path = "#{path}.tmp"
+  File.open(tmp_path, "w") do |f|
+    f.puts string
+    f.puts
+
+    f.write file
+    # TODO: make this deal with encoding comments properly?
+  end
+  File.rename tmp_path, path
+end
+
 V2.each do |n|
   file "lib/ruby#{n}_parser.y" => "lib/ruby_parser.yy" do |t|
     cmd = 'unifdef -tk -DV=%s -UDEAD %s > %s || true' % [n, t.source, t.name]
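To see what `maybe_add_to_top` does to a generated file, the sketch below copies the method from the hunk and runs it twice against a scratch file. The temporary directory, the file name `parser.rb`, and its contents are assumptions made for the demonstration.

```ruby
require "tmpdir"

# Copied from the Rakefile hunk so the example runs on its own.
def maybe_add_to_top path, string
  file = File.read path

  return if file.start_with? string   # already has the header: do nothing

  warn "Altering top of #{path}"
  tmp_path = "#{path}.tmp"
  File.open(tmp_path, "w") do |f|
    f.puts string                     # header line
    f.puts                            # blank separator line
    f.write file                      # original contents, unchanged
  end
  File.rename tmp_path, path          # swap the rewritten file into place
end

# Hypothetical scratch file standing in for a generated parser.
Dir.mktmpdir do |dir|
  path = File.join(dir, "parser.rb")
  File.write path, "class Parser; end\n"

  # The second call is a no-op because the header is already present.
  2.times { maybe_add_to_top path, "# frozen_string_literal: true" }

  puts File.read(path)
end
```

Writing to a `.tmp` file and renaming it means the target is never left half-written, and the `start_with?` guard keeps repeated rake runs from stacking headers.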
@@ -55,8 +84,23 @@ V2.each do |n|
   file "lib/ruby#{n}_parser.rb" => "lib/ruby#{n}_parser.y"
 end
 
+V3.each do |n|
+  file "lib/ruby#{n}_parser.y" => "lib/ruby3_parser.yy" do |t|
+    cmd = 'unifdef -tk -DV=%s -UDEAD %s > %s || true' % [n, t.source, t.name]
+    sh cmd
+  end
+
+  file "lib/ruby#{n}_parser.rb" => "lib/ruby#{n}_parser.y"
+end
+
 file "lib/ruby_lexer.rex.rb" => "lib/ruby_lexer.rex"
 
+task :parser do |t|
+  t.prerequisite_tasks.grep(Rake::FileTask).select(&:already_invoked).each do |f|
+    maybe_add_to_top f.name, "# frozen_string_literal: true"
+  end
+end
+
 task :generate => [:lexer, :parser]
 
 task :clean do
@@ -65,6 +109,7 @@ task :clean do
         Dir["coverage.info"] +
         Dir["coverage"] +
         Dir["lib/ruby2*_parser.y"] +
+        Dir["lib/ruby3*_parser.y"] +
         Dir["lib/*.output"])
 end
 
@@ -92,7 +137,7 @@ end
 
 def dl v
   dir = v[/^\d+\.\d+/]
-  url = "https://cache.ruby-lang.org/pub/ruby/#{dir}/ruby-#{v}.tar.
+  url = "https://cache.ruby-lang.org/pub/ruby/#{dir}/ruby-#{v}.tar.xz"
   path = File.basename url
   unless File.exist? path then
     system "curl -O #{url}"
@@ -104,7 +149,7 @@ def ruby_parse version
   rp_txt    = "rp#{v}.txt"
   mri_txt   = "mri#{v}.txt"
   parse_y   = "parse#{v}.y"
-  tarball   = "ruby-#{version}.tar.
+  tarball   = "ruby-#{version}.tar.xz"
   ruby_dir  = "ruby-#{version}"
   diff      = "diff#{v}.diff"
   rp_out    = "lib/ruby#{v}_parser.output"
@@ -124,15 +169,18 @@ def ruby_parse version
     end
   end
 
+  desc "fetch all tarballs"
+  task :fetch => c_tarball
+
   file c_parse_y => c_tarball do
     in_compare do
       extract_glob = case version
-                     when /2\.7/
+                     when /2\.7|3\.0/
                        "{id.h,parse.y,tool/{id2token.rb,lib/vpath.rb}}"
                      else
                        "{id.h,parse.y,tool/{id2token.rb,vpath.rb}}"
                      end
-      system "tar
+      system "tar Jxf #{tarball} #{ruby_dir}/#{extract_glob}"
 
       Dir.chdir ruby_dir do
         if File.exist? "tool/id2token.rb" then
@@ -141,15 +189,20 @@ def ruby_parse version
           sh "expand parse.y > ../#{parse_y}"
         end
 
-        ruby "-pi", "-e", 'gsub(/^%
+        ruby "-pi", "-e", 'gsub(/^%pure-parser/, "%define api.pure")', "../#{parse_y}"
       end
       sh "rm -rf #{ruby_dir}"
     end
   end
 
+  bison = Dir["/opt/homebrew/opt/bison/bin/bison",
+              "/usr/local/opt/bison/bin/bison",
+              `which bison`.chomp,
+             ].first
+
   file c_mri_txt => [c_parse_y, normalize] do
     in_compare do
-      sh "bison -r all #{parse_y}"
+      sh "#{bison} -r all #{parse_y}"
       sh "./normalize.rb parse#{v}.output > #{mri_txt}"
       rm ["parse#{v}.output", "parse#{v}.tab.c"]
     end
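The new `bison = Dir[...].first` assignment relies on `Dir[]` returning only the paths that exist, in the order the patterns were given. A small sketch of that idiom follows; `first_existing`, the scratch directory, and the file names are hypothetical and stand in for the Homebrew paths in the hunk.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper mirroring the `Dir[a, b, c].first` idiom:
# Dir[] drops candidates that do not exist, so .first is the first hit.
def first_existing(*candidates)
  Dir[*candidates].first
end

Dir.mktmpdir do |dir|
  missing = File.join(dir, "opt", "tool")   # never created
  present = File.join(dir, "usr", "tool")   # created just below
  FileUtils.mkdir_p File.dirname(present)
  File.write present, ""

  # An empty string (what `which` yields when nothing is found) matches nothing.
  p first_existing(missing, present, "")    # prints the path under usr/
  p first_existing(missing, "")             # prints nil
end
```

When no candidate exists the result is `nil`, so in the Rakefile the interpolated command would start with an empty program name. That case is presumably acceptable for a developer-only comparison task.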
@@ -190,17 +243,50 @@ def ruby_parse version
   end
 end
 
+task :versions do
+  require "open-uri"
+  require "net/http" # avoid require issues in threads
+  require "net/https"
+
+  versions = %w[ 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 3.0 ]
+
+  base_url = "https://cache.ruby-lang.org/pub/ruby"
+
+  class Array
+    def human_sort
+      sort_by { |item| item.to_s.split(/(\d+)/).map { |e| [e.to_i, e] } }
+    end
+  end
+
+  versions = versions.map { |ver|
+    Thread.new {
+      URI
+        .parse("#{base_url}/#{ver}/")
+        .read
+        .scan(/ruby-\d+\.\d+\.\d+[-\w.]*?.tar.gz/)
+        .reject { |s| s =~ /-(?:rc|preview)\d/ }
+        .human_sort
+        .last
+        .delete_prefix("ruby-")
+        .delete_suffix ".tar.gz"
+    }
+  }.map(&:value).sort
+
+  puts versions.map { |v| "ruby_parse %p" % [v] }
+end
+
 ruby_parse "2.0.0-p648"
-ruby_parse "2.1.
-ruby_parse "2.2.
+ruby_parse "2.1.10"
+ruby_parse "2.2.10"
 ruby_parse "2.3.8"
-ruby_parse "2.4.
-ruby_parse "2.5.
-ruby_parse "2.6.
-ruby_parse "2.7.
+ruby_parse "2.4.10"
+ruby_parse "2.5.9"
+ruby_parse "2.6.8"
+ruby_parse "2.7.4"
+ruby_parse "3.0.2"
 
 task :debug => :isolate do
-  ENV["V"] ||=
+  ENV["V"] ||= VERS.last
   Rake.application[:parser].invoke # this way we can have DEBUG set
   Rake.application[:lexer].invoke # this way we can have DEBUG set
 
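The `human_sort` helper in the new `versions` task exists because a plain string sort puts `2.6.10` before `2.6.8`. The sketch below reproduces the same sort key as free-standing methods rather than an `Array` monkey-patch, and runs it on a hard-coded list instead of fetching from cache.ruby-lang.org. The tarball names and the `latest_release` helper are assumptions for illustration.

```ruby
# Same sort key as Array#human_sort in the hunk, written as a plain method:
# split on digit runs and compare each piece as [integer value, text].
def human_sort(items)
  items.sort_by { |item| item.to_s.split(/(\d+)/).map { |e| [e.to_i, e] } }
end

# Hypothetical helper following the task's pipeline: drop rc/preview
# builds, take the highest release, strip the prefix and suffix.
def latest_release(names)
  human_sort(names.reject { |s| s =~ /-(?:rc|preview)\d/ })
    .last
    .delete_prefix("ruby-")
    .delete_suffix(".tar.gz")
end

# Hard-coded sample names; the real task scans a directory listing.
names = %w[
  ruby-2.6.8.tar.gz
  ruby-2.6.10.tar.gz
  ruby-2.6.9.tar.gz
  ruby-2.6.0-rc2.tar.gz
]

p names.sort.last        # lexical order picks "ruby-2.6.9.tar.gz"
p latest_release(names)  # numeric-aware order picks "2.6.10"
```

`delete_prefix` and `delete_suffix` need Ruby 2.5 or newer.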
| @@ -215,7 +301,7 @@ task :debug => :isolate do | |
| 215 301 | 
             
              time = (ENV["RP_TIMEOUT"] || 10).to_i
         | 
| 216 302 |  | 
| 217 303 | 
             
              n = ENV["BUG"]
         | 
| 218 | 
            -
              file = (n && "bug#{n}.rb") || ENV["F"] || ENV["FILE"] || " | 
| 304 | 
            +
              file = (n && "bug#{n}.rb") || ENV["F"] || ENV["FILE"] || "debug.rb"
         | 
| 219 305 | 
             
              ruby = ENV["R"] || ENV["RUBY"]
         | 
| 220 306 |  | 
| 221 307 | 
             
              if ruby then
         | 
| @@ -238,19 +324,22 @@ task :debug => :isolate do | |
| 238 324 | 
             
            end
         | 
| 239 325 |  | 
| 240 326 | 
             
            task :debug3 do
         | 
| 241 | 
            -
              file    = ENV["F"] || " | 
| 242 | 
            -
               | 
| 327 | 
            +
              file    = ENV["F"] || "debug.rb"
         | 
| 328 | 
            +
              version = ENV["V"] || ""
         | 
| 329 | 
            +
              verbose = ENV["VERBOSE"] ? "-v" : ""
         | 
| 243 330 | 
             
              munge    = "./tools/munge.rb #{verbose}"
         | 
| 244 331 |  | 
| 245 332 | 
             
              abort "Need a file to parse, via: F=path.rb" unless file
         | 
| 246 333 |  | 
| 247 334 | 
             
              ENV.delete "V"
         | 
| 248 335 |  | 
| 249 | 
            -
               | 
| 250 | 
            -
             | 
| 251 | 
            -
              sh " | 
| 336 | 
            +
              ruby = "ruby#{version}"
         | 
| 337 | 
            +
             | 
| 338 | 
            +
              sh "#{ruby} -v"
         | 
| 339 | 
            +
              sh "#{ruby} -y #{file} 2>&1 | #{munge} > tmp/ruby"
         | 
| 340 | 
            +
              sh "#{ruby} ./tools/ripper.rb -d #{file} | #{munge} > tmp/rip"
         | 
| 252 341 | 
             
              sh "rake debug F=#{file} DEBUG=1 2>&1 | #{munge} > tmp/rp"
         | 
| 253 | 
            -
              sh "diff -U 999 -d tmp/{ | 
| 342 | 
            +
              sh "diff -U 999 -d tmp/{ruby,rp}"
         | 
| 254 343 | 
             
            end
         | 
| 255 344 |  | 
| 256 345 | 
             
            task :cmp do
         | 
| @@ -262,16 +351,25 @@ task :cmp3 do | |
| 262 351 | 
             
            end
         | 
| 263 352 |  | 
| 264 353 | 
             
            task :extract => :isolate do
         | 
| 265 | 
            -
              ENV["V"] ||=  | 
| 354 | 
            +
              ENV["V"] ||= VERS.last
         | 
| 266 355 | 
             
              Rake.application[:parser].invoke # this way we can have DEBUG set
         | 
| 267 356 |  | 
| 268 | 
            -
              file = ENV["F"] || ENV["FILE"]
         | 
| 357 | 
            +
              file = ENV["F"] || ENV["FILE"] || abort("Need to provide F=<path>")
         | 
| 269 358 |  | 
| 270 359 | 
             
              ruby "-Ilib", "bin/ruby_parse_extract_error", file
         | 
| 271 360 | 
             
            end
         | 
| 272 361 |  | 
| 362 | 
            +
            task :parse => :isolate do
         | 
| 363 | 
            +
              ENV["V"] ||= VERS.last
         | 
| 364 | 
            +
              Rake.application[:parser].invoke # this way we can have DEBUG set
         | 
| 365 | 
            +
             | 
| 366 | 
            +
              file = ENV["F"] || ENV["FILE"] || abort("Need to provide F=<path>")
         | 
| 367 | 
            +
             | 
| 368 | 
            +
              ruby "-Ilib", "bin/ruby_parse", file
         | 
| 369 | 
            +
            end
         | 
| 370 | 
            +
             | 
| 273 371 | 
             
            task :bugs do
         | 
| 274 | 
            -
              sh "for f in bug*.rb ; do #{Gem.ruby} -S rake debug F=$f && rm $f ; done"
         | 
| 372 | 
            +
              sh "for f in bug*.rb bad*.rb ; do #{Gem.ruby} -S rake debug F=$f && rm $f ; done"
         | 
| 275 373 | 
             
            end
         | 
| 276 374 |  | 
| 277 375 | 
             
            # vim: syntax=Ruby
         | 
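The `:extract` and `:parse` tasks above now fail fast when neither `F` nor `FILE` is set, instead of passing `nil` along. A minimal sketch of that lookup follows. `file_from_env` is a hypothetical helper, not part of the Rakefile, and it raises where the Rakefile calls `abort` so the behavior can be checked in isolation:

```ruby
# Hypothetical helper mirroring the F / FILE lookup used by the :extract
# and :parse tasks. The Rakefile calls abort; this sketch raises instead
# so the failure can be rescued and tested.
def file_from_env env = ENV
  env["F"] || env["FILE"] || raise(ArgumentError, "Need to provide F=<path>")
end

# F wins over FILE when both are present.
file_from_env("F" => "bug1.rb", "FILE" => "other.rb")
```
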
    
        data/compare/normalize.rb
    CHANGED
    
    | @@ -84,6 +84,7 @@ def munge s | |
| 84 84 |  | 
| 85 85 | 
             
                         "' '",             "tSPACE", # needs to be later to avoid bad hits
         | 
| 86 86 |  | 
| 87 | 
            +
                         "%empty",          "none", # newer bison
         | 
| 87 88 | 
             
                         "/* empty */",     "none",
         | 
| 88 89 | 
             
                         /^\s*$/,           "none",
         | 
| 89 90 |  | 
| @@ -140,6 +141,7 @@ def munge s | |
| 140 141 | 
             
                         '"do for block"',     "kDO_BLOCK",
         | 
| 141 142 | 
             
                         '"do for condition"', "kDO_COND",
         | 
| 142 143 | 
             
                         '"do for lambda"',    "kDO_LAMBDA",
         | 
| 144 | 
            +
                         "tLABEL",             "kLABEL",
         | 
| 143 145 |  | 
| 144 146 | 
             
                         # UGH
         | 
| 145 147 | 
             
                         "k_LINE__",       "k__LINE__",
         | 
| @@ -155,7 +157,10 @@ def munge s | |
| 155 157 | 
             
                         /\"(\w+) \(?modifier\)?\"/, proc { |x| "k#{$1.upcase}_MOD" },
         | 
| 156 158 | 
             
                         /\"(\w+)\"/,                proc { |x| "k#{$1.upcase}" },
         | 
| 157 159 |  | 
| 158 | 
            -
                          | 
| 160 | 
            +
                         /\$?@(\d+)(\s+|$)/,    "", # newer bison
         | 
| 161 | 
            +
             | 
| 162 | 
            +
                         # TODO: remove for 3.0 work:
         | 
| 163 | 
            +
                         "lex_ctxt ", "" # 3.0 production that's mostly noise right now
         | 
| 159 164 | 
             
                        ]
         | 
| 160 165 |  | 
| 161 166 | 
             
              renames.each_slice(2) do |(a, b)|
         | 
| @@ -174,7 +179,7 @@ ARGF.each_line do |line| | |
| 174 179 |  | 
| 175 180 | 
             
              case line.strip
         | 
| 176 181 | 
             
              when /^$/ then
         | 
| 177 | 
            -
              when /^(\d+) ( | 
| 182 | 
            +
              when /^(\d+) (\$?[@\w]+): (.*)/ then    # yacc
         | 
| 178 183 | 
             
                rule = $2
         | 
| 179 184 | 
             
                order << rule unless rules.has_key? rule
         | 
| 180 185 | 
             
                rules[rule] << munge($3)
         | 
| @@ -199,7 +204,7 @@ ARGF.each_line do |line| | |
| 199 204 | 
             
              when /^\cL/ then                     # byacc
         | 
| 200 205 | 
             
                break
         | 
| 201 206 | 
             
              else
         | 
| 202 | 
            -
                warn "unparsed: #{$.}: #{line. | 
| 207 | 
            +
                warn "unparsed: #{$.}: #{line.strip.inspect}"
         | 
| 203 208 | 
             
              end
         | 
| 204 209 | 
             
            end
         | 
| 205 210 |  | 
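The rename table in `munge` is a flat list of pattern/replacement pairs walked with `each_slice(2)`. The sketch below shows that shape on a handful of the entries visible in this diff. `RENAMES_SKETCH` and `munge_sketch` are hypothetical names, the list is heavily trimmed, and the quoted-keyword rule uses the block argument rather than `$1`, so treat it as an illustration and not the file's actual implementation:

```ruby
# Hypothetical, trimmed-down rename table. Each pair is
# (pattern, replacement); a replacement may be a String or a Proc.
RENAMES_SKETCH = [
  "%empty",            "none",       # newer bison
  "/* empty */",       "none",
  '"do for block"',    "kDO_BLOCK",
  "tLABEL",            "kLABEL",
  /\$?@(\d+)(\s+|$)/,  "",           # newer bison
  # Assumption: uses the block argument instead of $1 as in the original.
  /\"(\w+)\"/,         proc { |x| "k#{x.delete('"').upcase}" },
]

def munge_sketch s
  s = s.dup                          # do not mutate the caller's string
  RENAMES_SKETCH.each_slice(2) do |(a, b)|
    if Proc === b then
      s.gsub!(a, &b)                 # computed replacement
    else
      s.gsub!(a, b)                  # literal replacement
    end
  end
  s.strip
end
```

Order matters in a table like this: specific string entries such as `"do for block"` are listed before the generic quoted-keyword regexp so the generic rule never sees them.
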
    
        data/debugging.md
    CHANGED
    
    | @@ -55,3 +55,136 @@ From there? Good luck. I'm currently trying to backtrack from rule | |
| 55 55 | 
             
            reductions to state change differences. I'd like to figure out a way
         | 
| 56 56 | 
             
            to go from this sort of diff to a reasonable test that checks state
         | 
| 57 57 | 
             
            changes but I don't have that set up at this point.
         | 
| 58 | 
            +
             | 
| 59 | 
            +
            ## Adding New Grammar Productions
         | 
| 60 | 
            +
             | 
| 61 | 
            +
            Ruby adds stuff to the parser ALL THE TIME. It's actually hard to keep
         | 
| 62 | 
            +
            up with, but I've added some tools and shown what a typical workflow
         | 
| 63 | 
            +
            looks like. Let's say you want to add ruby 2.7's "beginless range" (eg
         | 
| 64 | 
            +
            `..42`).
         | 
| 65 | 
            +
             | 
| 66 | 
            +
            Whenever there's a language feature missing, I start with comparing
         | 
| 67 | 
            +
            the parse trees between MRI and RP:
         | 
| 68 | 
            +
             | 
| 69 | 
            +
            ### Structural Comparing
         | 
| 70 | 
            +
             | 
| 71 | 
            +
            There's a bunch of rake tasks (`compare27`, `compare26`, etc.) that try
         | 
| 72 | 
            +
            to normalize and diff MRI's parse.y parse tree (just the structure of
         | 
| 73 | 
            +
            the tree in yacc) to ruby\_parser's parse tree (racc). It's the first
         | 
| 74 | 
            +
            thing I do when I'm adding a new version. Stub out all the version
         | 
| 75 | 
            +
            differences, and then start to diff the structure and move
         | 
| 76 | 
            +
            ruby\_parser towards the new changes.
         | 
| 77 | 
            +
             | 
| 78 | 
            +
            Some differences are just gonna be there... but here's an example of a
         | 
| 79 | 
            +
            real diff between MRI 2.7 and ruby_parser as of today:
         | 
| 80 | 
            +
             | 
| 81 | 
            +
            ```diff
         | 
| 82 | 
            +
                 arg tDOT3 arg
         | 
| 83 | 
            +
                 arg tDOT2
         | 
| 84 | 
            +
                 arg tDOT3
         | 
| 85 | 
            +
            -    tBDOT2 arg
         | 
| 86 | 
            +
            -    tBDOT3 arg
         | 
| 87 | 
            +
                 arg tPLUS arg
         | 
| 88 | 
            +
                 arg tMINUS arg
         | 
| 89 | 
            +
                 arg tSTAR2 arg
         | 
| 90 | 
            +
            ```
         | 
| 91 | 
            +
             | 
| 92 | 
            +
            This is a new language feature that ruby_parser doesn't handle yet.
         | 
| 93 | 
            +
            It's in MRI (the left hand side of the diff) but not ruby\_parser (the
         | 
| 94 | 
            +
            right hand side) so it is a `-` or missing line.
         | 
| 95 | 
            +
             | 
| 96 | 
            +
            Some other diffs will have both `+` and `-` lines. That usually
         | 
| 97 | 
            +
            happens when MRI has been refactoring the grammar. Sometimes I choose
         | 
| 98 | 
            +
            to adopt those refactorings and sometimes it starts to get too
         | 
| 99 | 
            +
            difficult to maintain multiple versions of ruby parsing in a single
         | 
| 100 | 
            +
            file.
         | 
| 101 | 
            +
             | 
| 102 | 
            +
            But! This structural comparing is always a place you should look when
         | 
| 103 | 
            +
            ruby_parser is failing to parse something. Maybe it just hasn't been
         | 
| 104 | 
            +
            implemented yet and the easiest place to look is the diff.
         | 
| 105 | 
            +
             | 
| 106 | 
            +
            ### Starting Test First
         | 
| 107 | 
            +
             | 
| 108 | 
            +
            The next thing I do is to add a parser test to cover that feature. I
         | 
| 109 | 
            +
            usually start with the parser and work backwards towards the lexer as
         | 
| 110 | 
            +
            needed, as I find it structures things properly and keeps things goal
         | 
| 111 | 
            +
            oriented.
         | 
| 112 | 
            +
             | 
| 113 | 
            +
            So, make a new parser test, usually in the versioned section of the
         | 
| 114 | 
            +
            parser tests.
         | 
| 115 | 
            +
             | 
| 116 | 
            +
            ```
         | 
| 117 | 
            +
              def test_beginless2
         | 
| 118 | 
            +
                rb = "..10\n; ..a\n; c"
         | 
| 119 | 
            +
                pt = s(:block,
         | 
| 120 | 
            +
                   s(:dot2, nil, s(:lit, 10).line(1)).line(1),
         | 
| 121 | 
            +
                       s(:dot2, nil, s(:call, nil, :a).line(2)).line(2),
         | 
| 122 | 
            +
                       s(:call, nil, :c).line(3)).line(1)
         | 
| 123 | 
            +
             | 
| 124 | 
            +
                assert_parse_line rb, pt, 1
         | 
| 125 | 
            +
             | 
| 126 | 
            +
                flunk "not done yet"
         | 
| 127 | 
            +
              end
         | 
| 128 | 
            +
            ```
         | 
| 129 | 
            +
             | 
| 130 | 
            +
            (In this case I copied and modified the tests for open ranges from 2.6)
         | 
| 131 | 
            +
            and run it to get my first error:
         | 
| 132 | 
            +
             | 
| 133 | 
            +
            ```
         | 
| 134 | 
            +
            % rake N=/beginless/
         | 
| 135 | 
            +
             | 
| 136 | 
            +
            ...
         | 
| 137 | 
            +
             | 
| 138 | 
            +
            E
         | 
| 139 | 
            +
             | 
| 140 | 
            +
            Finished in 0.021814s, 45.8421 runs/s, 0.0000 assertions/s.
         | 
| 141 | 
            +
             | 
| 142 | 
            +
              1) Error:
         | 
| 143 | 
            +
            TestRubyParserV27#test_whatevs:
         | 
| 144 | 
            +
            Racc::ParseError: (string):1 :: parse error on value ".." (tDOT2)
         | 
| 145 | 
            +
                GEMS/2.7.0/gems/racc-1.5.0/lib/racc/parser.rb:538:in `on_error'
         | 
| 146 | 
            +
                WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1304:in `on_error'
         | 
| 147 | 
            +
                (eval):3:in `_racc_do_parse_c'
         | 
| 148 | 
            +
                (eval):3:in `do_parse'
         | 
| 149 | 
            +
                WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1329:in `block in process'
         | 
| 150 | 
            +
                RUBY/lib/ruby/2.7.0/timeout.rb:95:in `block in timeout'
         | 
| 151 | 
            +
                RUBY/lib/ruby/2.7.0/timeout.rb:33:in `block in catch'
         | 
| 152 | 
            +
                RUBY/lib/ruby/2.7.0/timeout.rb:33:in `catch'
         | 
| 153 | 
            +
                RUBY/lib/ruby/2.7.0/timeout.rb:33:in `catch'
         | 
| 154 | 
            +
                RUBY/lib/ruby/2.7.0/timeout.rb:110:in `timeout'
         | 
| 155 | 
            +
                WORK/ruby_parser/dev/lib/ruby_parser_extras.rb:1317:in `process'
         | 
| 156 | 
            +
                WORK/ruby_parser/dev/test/test_ruby_parser.rb:4198:in `assert_parse'
         | 
| 157 | 
            +
                WORK/ruby_parser/dev/test/test_ruby_parser.rb:4221:in `assert_parse_line'
         | 
| 158 | 
            +
                WORK/ruby_parser/dev/test/test_ruby_parser.rb:4451:in `test_whatevs'
         | 
| 159 | 
            +
            ```
         | 
| 160 | 
            +
             | 
| 161 | 
            +
            For starters, we know the missing production is for `tBDOT2 arg`. It
         | 
| 162 | 
            +
            is currently blowing up because it is getting `tDOT2` and simply
         | 
| 163 | 
            +
            doesn't know what to do with it, so it raises the error. As the diff
         | 
| 164 | 
            +
            suggests, that's the wrong token to begin with, so it is probably time
         | 
| 165 | 
            +
            to also create a lexer test:
         | 
| 166 | 
            +
             | 
| 167 | 
            +
            ```
         | 
| 168 | 
            +
            def test_yylex_bdot2
         | 
| 169 | 
            +
              assert_lex3("..42",
         | 
| 170 | 
            +
                          s(:dot2, nil, s(:lit, 42)),
         | 
| 171 | 
            +
             | 
| 172 | 
            +
                          :tBDOT2,   "..", EXPR_BEG,
         | 
| 173 | 
            +
                          :tINTEGER, "42", EXPR_NUM)
         | 
| 174 | 
            +
             | 
| 175 | 
            +
              flunk "not done yet"
         | 
| 176 | 
            +
            end
         | 
| 177 | 
            +
            ```
         | 
| 178 | 
            +
             | 
| 179 | 
            +
            This one is mostly speculative at this point. It says "if we're lexing
         | 
| 180 | 
            +
            this string, we should get this sexp if we fully parse it, and the
         | 
| 181 | 
            +
            lexical stream should look like this"... That last bit is mostly made
         | 
| 182 | 
            +
            up at this point. Sometimes I don't know exactly what expression state
         | 
| 183 | 
            +
            things should be in until I start really digging in.
         | 
| 184 | 
            +
             | 
| 185 | 
            +
            At this point, I have 2 failing tests that are pointing me in the
         | 
| 186 | 
            +
            right direction. It's now a matter of digging through
         | 
| 187 | 
            +
            `compare/parse26.y` to see how the lexer differs and implementing
         | 
| 188 | 
            +
            it...
         | 
| 189 | 
            +
             | 
| 190 | 
            +
            But this is a good start to the doco for now. I'll add more later.
         |
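For context on why the expected sexps in the walkthrough above carry `nil` as the first child of `:dot2`, here is a small sketch of what a beginless range is at runtime. It uses only core Ruby (2.7 or newer is assumed) and does not touch ruby_parser:

```ruby
# A beginless range has no lower bound, so Range#begin is nil. This is
# the same nil that appears as the first child in s(:dot2, nil, ...).
r = (..42)

r.begin          # nil
r.end            # 42
r.cover?(-1_000) # true: anything up to and including 42 is covered
r.cover?(43)     # false

# The three-dot form excludes the end value.
x = (...42)
x.cover?(42)     # false
```
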