parser 2.0.0.pre2 → 2.0.0.pre3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: f4144b03e5cb0074bd4e6a62442b3954365d7ec8
-   data.tar.gz: 4276d7b16e60596143e0cc22b19ef0f048446af5
+   metadata.gz: 7d801efbc17526dd9207b668ac390eb9c02fab75
+   data.tar.gz: 198728fb3fb74976aaaa19cd2de1612b42ad2db8
  SHA512:
-   metadata.gz: 644d1bb91060042670f4602788785bec67c8cc594ffa44d78922e74ba8201d954789e9b8953632b8f2c13b11efe1a2588cdde7bed8b3718c40189dad766104c0
-   data.tar.gz: 75a5830a2a534f63663e83caf4333f5d462a2a591bc72adec2a09d61fc90a2f2d3ea106c26d37df225949beb6e6139c2b2033e76d1448ec0f365160bc28d6ba7
+   metadata.gz: 4a6e46bdb13a7348167df9f5a7c5e4f6ce31685e8f70a71f186becf91fe8f19ac3eeb70830ab6ecbf277667e6f269ab0dbeec8bc51f13ae340b6dc47e23b2b22
+   data.tar.gz: 068b0c06f919970b48529366a915f625902be5af227348af442b3fb89aab7455ae2293c2e327109a06cda4ee46b161659b44070a73fd6f636bab0a7f375f64b0
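The entries in `checksums.yaml` are hex-encoded SHA1 and SHA512 digests of the two archives packed inside the `.gem` file. A minimal sketch of recomputing such digests with Ruby's standard `digest` library follows; the `archive_digests` helper and the in-memory sample bytes are illustrative assumptions, not part of RubyGems.

```ruby
require 'digest'

# Hypothetical helper: returns the hex digests that checksums.yaml
# would record for one archive, given the archive's raw bytes.
def archive_digests(bytes)
  {
    'SHA1'   => Digest::SHA1.hexdigest(bytes),
    'SHA512' => Digest::SHA512.hexdigest(bytes),
  }
end

# Stand-in bytes; for a real check, read metadata.gz or data.tar.gz
# in binary mode with File.binread instead.
sample  = 'example archive bytes'
digests = archive_digests(sample)

puts digests['SHA1'].length    # 40 hex characters, like the SHA1 entries
puts digests['SHA512'].length  # 128 hex characters, like the SHA512 entries
```

Comparing the computed strings with the `+` lines above would confirm that an unpacked archive matches the released one.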
data/.yardopts CHANGED
@@ -3,10 +3,10 @@
  -M kramdown
  -o ./yardoc
  -r ./README.md
- --private
- --protected
  --asset ./doc/css/common.css:css/common.css
  --verbose
+ --api public
+ --exclude lib/parser/lexer.rb
  --exclude lib/parser/ruby18.rb
  --exclude lib/parser/ruby19.rb
  --exclude lib/parser/ruby20.rb
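With `--api public`, YARD generates documentation for objects carrying the `@api public` tag, and the tag applies to everything nested in a tagged namespace unless a child overrides it. A minimal sketch of that tagging convention, using a hypothetical `Example` module rather than Parser's own classes:

```ruby
##
# Hypothetical namespace, tagged as part of the public API.
#
# @api public
#
module Example
  ##
  # Documented: inherits `@api public` from the enclosing module.
  #
  # @return [Symbol]
  #
  def self.supported
    :yes
  end

  ##
  # Left out of the generated documentation when YARD runs with
  # `--api public`, because the tag is overridden here.
  #
  # @api private
  # @return [Symbol]
  #
  def self.helper
    :internal
  end
end
```

The tag only affects what YARD renders. It does not change Ruby visibility, so `Example.helper` remains callable.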
@@ -1,6 +1,61 @@
  Changelog
  =========
  
+ v2.0.0.pre3 (2013-07-26)
+ ------------------------
+
+ API modifications:
+ * lexer.rl: add simple explicit output encoding for strings. (Peter Zotov)
+
+ Features implemented:
+ * Source::Buffer: support for -(dos|unix|mac) and utf8-mac encodings. (Peter Zotov)
+ * Source::Range#resize. (Peter Zotov)
+ * Significantly improve speed for large (>100k) and very large (>1M) files. (Peter Zotov)
+
+ Bugs fixed:
+ * ruby21.y: fix typos. (Peter Zotov)
+ * builders/default: respect regexp encoding. (Peter Zotov)
+ * lexer.rl: literal EOF (\0, \x04, \x1a) inside literals and comments. (Peter Zotov)
+ * lexer.rl: "meth (lambda do end)" (1.8), "f x: -> do meth do end end": expr_cmdarg. (Peter Zotov)
+ * lexer.rl: "\<\<E\nE\r\r\n": extraneous CRs are ignored after heredoc delimiter. (Peter Zotov)
+ * lexer.rl: "%\nfoo\n": \n can be used as %-literal delimiter. (Peter Zotov)
+ * source/buffer, lexer.rl: convert CRLF to LF prior to lexing. (Peter Zotov)
+ * lexer.rl: "\<\<w; "\nfoo\nw\n"": interleaved heredoc and non-heredoc literals. (Peter Zotov)
+ * builders/default: 1.8 did not descend into &&/|| in conditional context. (Peter Zotov)
+ * lexer.rl: "1+a:a": respect context sensitivity in 1.8 label fallback. (Peter Zotov)
+ * lexer.rl: ruby 1.8 is context-sensitive wrt/ locals as well. (Peter Zotov)
+ * lexer.rl: "eof??a": expr_arg doesn't need space before character literal. (Peter Zotov)
+ * lexer.rl: interleaved heredoc and interpolated double-quoted string. (Peter Zotov)
+ * lexer.rl: "#{f:a}": interpolation starts expr_value, not expr_beg. (Peter Zotov)
+ * lexer.rl: "\cM" is "\r", not an error. (Peter Zotov)
+ * ruby{20,21}.y: constant op-assignment inside a def is not an error. (Peter Zotov)
+ * lexer.rl: "when Date:" fix label fallback for 1.8 mode. (Peter Zotov)
+ * ruby{19,20,21}.y: "->(scope){}; scope :foo": lambda identifier leakage. (Peter Zotov)
+ * lexer.rl: "eh ?\r\n": don't eat tEH if followed by CRLF. (Peter Zotov)
+ * lexer.rl: "f \<\<-TABLE\ndo |a,b|\nTABLE\nend": leave FSM after lexing heredoc. (Peter Zotov)
+ * lexer.rl: "foo %\n bar": don't % at expr_arg as tSTRING_BEG. (Peter Zotov)
+ * lexer.rl, lexer/literal: use lexer encoding for literal buffer. (Peter Zotov)
+ * lexer.rl: "\u{9}": one-digit braced unicode escapes. (Peter Zotov)
+ * Source::Buffer: don't chew \r from source lines. (Peter Zotov)
+ * builders/default: don't die in eh_keyword_map if else branch is empty. (Peter Zotov)
+ * lexer.rl: "0777_333": octal literals with internal underscores. (Peter Zotov)
+ * lexer.rl: "let [] {}": goto tLBRACE_ARG after any closing braces. (Peter Zotov)
+ * lexer.rl: "while not (1) do end": emit proper kDO* when in cond/cmdarg state. (Peter Zotov)
+ * lexer.rl: "rescue=>": correctly handle rescue+assoc at expr_beg. (Peter Zotov)
+ * lexer.rl: "puts 123do end": only trailing `_' and `e' in number are errors. (Peter Zotov)
+ * lexer.rl: "begin; rescue rescue1; end": accept barewords at expr_mid. (Peter Zotov)
+ * lexer.rl: "f.x!if 1": correct modifier handling in expr_arg. (Peter Zotov)
+ * lexer.rl: "=begin\n#=end\n=end": only recognize =end at bol. (Peter Zotov)
+ * builders/default: don't check for duplicate arguments in 1.8 mode. (Peter Zotov)
+ * Don't attempt to parse magic encoding comment in 1.8 mode. (Peter Zotov)
+ * lexer.rl: "\777": octal literals overflow. (Peter Zotov)
+ * lexer.rl: "foo;\n__END__", "\na:b": whitespace in expr_value. (Peter Zotov)
+ * lexer.rl: "\xE2\x80\x99": concatenation of byte escape sequences. (Peter Zotov)
+ * lexer.rl: "E10", "E4U": don't conflate floats and identifiers. (Peter Zotov)
+ * lexer.rl: "foo.bar= {1=>2}": return fid, = as separate tokens in expr_dot. (Peter Zotov)
+ * lexer.rl: "def defined?": properly return defined? in expr_fname. (Peter Zotov)
+ * lexer.rl: "Rainbows! do end", "foo.S?": allow bareword fid in expr_beg/dot. (Peter Zotov)
+
  v2.0.0.pre2 (2013-07-11)
  ------------------------
  
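Several of the lexer fixes in this changelog concern literal forms whose meaning is easy to check in Ruby itself. The sketch below evaluates a few of them with the running interpreter, assuming Ruby 1.9 or later and a UTF-8 source file. It shows what MRI makes of these literals and does not call Parser.

```ruby
# encoding: utf-8

# "\cM" is Control-M, that is, a carriage return.
control_m = "\cM"

# A braced Unicode escape may contain a single hex digit.
tab = "\u{9}"

# Underscores may separate digits inside an octal literal.
octal = 0777_333

# Adjacent byte escapes concatenate into one multibyte character.
quote = "\xE2\x80\x99"

p control_m == "\r"   # true
p tab == "\t"         # true
p octal == 0777333    # true
p quote == "\u2019"   # true
```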
data/Gemfile CHANGED
@@ -2,5 +2,3 @@ source 'https://rubygems.org'
  
  # Specify your gem's dependencies in parser.gemspec
  gemspec
-
- gem 'rubocop', :platform => [:ruby_19, :ruby_20]
data/README.md CHANGED
@@ -11,6 +11,8 @@ par or better than Ripper, Melbourne, JRubyParser or ruby\_parser.
  You can also use [unparser](https://github.com/mbj/unparser) to produce
  equivalent source code from Parser's ASTs.
  
+ Sponsored by [Evil Martians](http://evilmartians.com).
+
  ## Installation
  
  Most recent version of Parser is 2.0; however, per
@@ -151,6 +153,19 @@ Both `(begin)` and `(kwbegin)` nodes represent compound statements, that is, sev
  
  and so on.
  
+ ```
+ $ ruby-parse -e '(foo; bar)'
+ (begin
+   (send nil :foo)
+   (send nil :bar))
+ $ ruby-parse -e 'def x; foo; bar end'
+ (def :x
+   (args)
+   (begin
+     (send nil :foo)
+     (send nil :bar)))
+ ```
+
  Note that, despite its name, `kwbegin` node only has tangential relation to the `begin` keyword. Normally, Parser AST is semantic, that is, if two constructs look differently but behave identically, they get parsed to the same node. However, there exists a peculiar construct called post-loop in Ruby:
  
  ```
@@ -163,20 +178,59 @@ This specific syntactic construct, that is, keyword `begin..end` block followed
  
  [postloop]: http://rosettacode.org/wiki/Loops/Do-while#Ruby
  
+ ```
+ $ ruby-parse -e 'begin foo end while cond'
+ (while-post
+   (send nil :cond)
+   (kwbegin
+     (send nil :foo)))
+ $ ruby-parse -e 'foo while cond'
+ (while
+   (send nil :cond)
+   (send nil :foo))
+ $ ruby-parse -e '(foo) while cond'
+ (while
+   (send nil :cond)
+   (begin
+     (send nil :foo)))
+ ```
+
  (Parser also needs the `(kwbegin)` node type internally, and it is highly problematic to map it back to `(begin)`.)
  
  ## Known issues
  
+ Adding support for the following Ruby MRI features in Parser would needlessly complicate it, and as they are all very specific and rarely occurring corner cases, this is not done.
+
+ Parser has been extensively tested; in particular, it parses almost the entire [Rubygems][rg] corpus. For every issue, a breakdown of affected gems is offered.
+
+ [rg]: http://rubygems.org
+
  ### Void value expressions
  
- So-called "void value expressions" are not handled by Parser. For a description
+ Ruby MRI prohibits so-called "void value expressions". For a description
  of what a void value expression is, see [this
  gist](https://gist.github.com/JoshCheek/5625007) and [this Parser
  issue](https://github.com/whitequark/parser/issues/72).
  
- It is not clear which rules this piece of static analysis follows, or which
- problem does it solve. It is not implemented because there is no clear
- specification allowing us to verify the behavior.
+ It is unknown whether any gems are affected by this issue.
+
+ ### Invalid characters inside comments
+
+ Ruby MRI permits arbitrary non-7-bit characters to appear in comments regardless of source encoding.
+
+ As of 2013-07-25, there are about 180 affected gems.
+
+ ### \u escape in 1.8 mode
+
+ Ruby MRI 1.8 permits a bare `\u` escape sequence in a string and treats it like `u`. Ruby MRI 1.9 and later treat `\u` as a prefix for a Unicode escape sequence and do not allow it to appear bare. Parser follows the 1.9+ behavior.
+
+ As of 2013-07-25, affected gems are: activerdf, activerdf_net7, fastreader, gkellog-reddy.
+
+ ### Invalid Unicode escape sequences
+
+ Ruby MRI 1.9+ permits invalid UTF-8 sequences in Unicode escape sequences, such as `\u{d800}`.
+
+ As of 2013-07-25, affected gems are: aws_cloud_search.
  
  ## Contributors
  
@@ -0,0 +1,121 @@
+ require 'gauntlet'
+ require 'parser/all'
+ require 'shellwords'
+
+ class ParserGauntlet < Gauntlet
+   RUBY20 = 'ruby'
+   RUBY19 = 'ruby1.9.1'
+   RUBY18 = '/opt/rubies/ruby-1.8.7-p370/bin/ruby'
+
+   def try(parser, ruby, file, show_ok: false)
+     try_ruby = lambda do |e|
+       Process.spawn(%{#{ruby} -c #{Shellwords.escape file}},
+                     :err => '/dev/null', :out => '/dev/null')
+       _, status = Process.wait2
+
+       if status.success?
+         # Bug in Parser.
+         puts "Parser bug."
+         @result[file] = { parser.to_s => "#{e.class}: #{e.to_s}" }
+       else
+         # No, this file is not Ruby.
+         yield if block_given?
+       end
+     end
+
+     begin
+       parser.parse_file(file)
+
+     rescue Parser::SyntaxError => e
+       if e.diagnostic.location.resize(2).is?('<%')
+         puts "ERb."
+         return
+       end
+
+       try_ruby.call(e)
+
+     rescue ArgumentError, RegexpError,
+            Encoding::UndefinedConversionError => e
+       puts "#{file}: #{e.class}: #{e.to_s}"
+
+       try_ruby.call(e)
+
+     rescue Interrupt
+       raise
+
+     rescue Exception => e
+       puts "Parser bug: #{file} #{e.class}: #{e.to_s}"
+       @result[file] = { parser.to_s => "#{e.class}: #{e.to_s}" }
+
+     else
+       puts "Ok." if show_ok
+     end
+   end
+
+   def parse(name)
+     puts "GEM: #{name}"
+
+     @result = {}
+
+     if ENV.include?('FAST')
+       total_size = Dir["**/*.rb"].map(&File.method(:size)).reduce(:+)
+       if total_size > 300_000
+         puts "Skip."
+         return
+       end
+     end
+
+     Dir["**/*.rb"].each do |file|
+       next if File.directory? file
+
+       try(Parser::Ruby20, RUBY20, file) do
+         puts "Trying 1.9:"
+         try(Parser::Ruby19, RUBY19, file, show_ok: true) do
+           puts "Trying 1.8:"
+           try(Parser::Ruby18, RUBY18, file, show_ok: true) do
+             puts "Invalid syntax."
+           end
+         end
+       end
+     end
+
+     @result
+   end
+
+   def run(name)
+     data[name] = parse(name)
+     self.dirty = true
+   end
+
+   def should_skip?(name)
+     data[name] == {}
+   end
+
+   def load_yaml(*)
+     data = super
+     @was_errors = data.count { |_name, errs| errs != {} }
+
+     data
+   end
+
+   def shutdown
+     super
+
+     errors = data.count { |_name, errs| errs != {} }
+     total = data.count
+     percent = "%.5f" % [100 - errors.to_f / total * 100]
+     puts "!!! was: #{@was_errors} now: #{errors} total: #{total} frac: #{percent}%"
+   end
+ end
+
+ filter = ARGV.shift
+ filter = Regexp.new filter if filter
+
+ gauntlet = ParserGauntlet.new
+
+ if ENV.include? 'UPDATE'
+   gauntlet.source_index
+   gauntlet.update_gem_tarballs
+ end
+
+ gauntlet.run_the_gauntlet filter
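The gauntlet counts a Parser failure as a Parser bug only when MRI itself accepts the file, which it checks with `ruby -c`. A minimal standalone sketch of that cross-check follows. The `mri_accepts?` and `check_source` helpers are hypothetical names, and `RbConfig.ruby` and `File::NULL` are substituted for the hard-coded interpreter paths and `/dev/null` so that the snippet runs on whichever Ruby executes it.

```ruby
require 'rbconfig'
require 'shellwords'
require 'tempfile'

# Hypothetical helper: true if the given interpreter's syntax check
# (`ruby -c`) accepts the file.
def mri_accepts?(file, ruby = RbConfig.ruby)
  pid = Process.spawn("#{Shellwords.escape ruby} -c #{Shellwords.escape file}",
                      :err => File::NULL, :out => File::NULL)
  _, status = Process.wait2(pid)
  status.success?
end

# Hypothetical helper: writes the source to a temporary .rb file
# and runs the syntax check on it.
def check_source(source)
  Tempfile.create(['gauntlet', '.rb']) do |f|
    f.write(source)
    f.flush
    mri_accepts?(f.path)
  end
end

valid   = check_source("puts 1 + 2\n")
invalid = check_source("def broken(\n")

puts "valid source accepted:   #{valid}"
puts "invalid source accepted: #{invalid}"
```

Passing the child's pid to `Process.wait2` ties the status to the process that was just spawned, which matters if other children are running.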
@@ -7,35 +7,39 @@ if RUBY_VERSION < '1.9'
    require 'parser/compatibility/ruby1_8'
  end
  
- # Library namespace
+ ##
+ # @api public
+ #
  module Parser
    require 'parser/version'
  
    require 'parser/ast/node'
    require 'parser/ast/processor'
  
-   require 'parser/source/buffer'
-   require 'parser/source/range'
-
-   require 'parser/source/comment'
-   require 'parser/source/comment/associator'
-
-   require 'parser/source/rewriter'
-   require 'parser/source/rewriter/action'
-
-   require 'parser/source/map'
-   require 'parser/source/map/operator'
-   require 'parser/source/map/collection'
-   require 'parser/source/map/constant'
-   require 'parser/source/map/variable'
-   require 'parser/source/map/keyword'
-   require 'parser/source/map/definition'
-   require 'parser/source/map/send'
-   require 'parser/source/map/block'
-   require 'parser/source/map/condition'
-   require 'parser/source/map/ternary'
-   require 'parser/source/map/for'
-   require 'parser/source/map/rescue_body'
+   module Source
+     require 'parser/source/buffer'
+     require 'parser/source/range'
+
+     require 'parser/source/comment'
+     require 'parser/source/comment/associator'
+
+     require 'parser/source/rewriter'
+     require 'parser/source/rewriter/action'
+
+     require 'parser/source/map'
+     require 'parser/source/map/operator'
+     require 'parser/source/map/collection'
+     require 'parser/source/map/constant'
+     require 'parser/source/map/variable'
+     require 'parser/source/map/keyword'
+     require 'parser/source/map/definition'
+     require 'parser/source/map/send'
+     require 'parser/source/map/block'
+     require 'parser/source/map/condition'
+     require 'parser/source/map/ternary'
+     require 'parser/source/map/for'
+     require 'parser/source/map/rescue_body'
+   end
  
    require 'parser/syntax_error'
    require 'parser/diagnostic'
@@ -68,7 +72,7 @@ module Parser
      :regexp_options => 'unknown regexp options: %{options}',
      :cvar_name => "`%{name}' is not allowed as a class variable name",
      :ivar_name => "`%{name}' is not allowed as an instance variable name",
-     :trailing_underscore => "trailing `_' in number",
+     :trailing_in_number => "trailing `%{character}' in number",
      :empty_numeric => 'numeric literal without digits',
      :invalid_octal => 'invalid octal digit',
      :no_dot_digit_literal => 'no .<digit> floating literal anymore; put 0 before dot',
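The renamed `:trailing_in_number` message takes a `%{character}` placeholder, so one template can report either offending character (the changelog names `_` and `e`). How Parser fills these templates is not shown in this hunk. As a sketch under that stated assumption, Ruby's built-in `String#%` with a hash argument is enough to illustrate the substitution:

```ruby
# Message template copied from the hunk above.
template = "trailing `%{character}' in number"

# String#% substitutes %{name} references from a hash (Ruby 1.9+).
# Whether Parser itself renders messages this way is an assumption.
underscore = template % { :character => '_' }
exponent   = template % { :character => 'e' }

puts underscore  # trailing `_' in number
puts exponent    # trailing `e' in number
```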
@@ -107,6 +111,9 @@ module Parser
      :useless_else => 'else without rescue is useless',
    }.freeze
  
+   ##
+   # Verify that the current Ruby implementation supports Encoding.
+   # @raise [RuntimeError]
    def self.check_for_encoding_support
      unless defined?(Encoding)
        raise RuntimeError, 'Parsing 1.9 and later versions of Ruby is not supported on 1.8 due to the lack of Encoding support'
@@ -3,10 +3,13 @@ module Parser
  
      ##
      # {Parser::AST::Node} contains information about a single AST node and its
-     # child nodes, it extends the basic `AST::Node` class provided by the "ast"
-     # Gem.
+     # child nodes. It extends the basic [AST::Node](http://rdoc.info/gems/ast/AST/Node)
+     # class provided by gem [ast](http://rdoc.info/gems/ast).
+     #
+     # @api public
      #
      # @!attribute [r] location
+     # Source map for this Node.
      # @return [Parser::Source::Map]
      #
      class Node < ::AST::Node
  class Node < ::AST::Node
@@ -15,11 +18,10 @@ module Parser
        alias loc location
  
        ##
-       # Assigns various properties to the current AST node. Currently only the
+       # Assigns various properties to this AST node. Currently only the
        # location can be set.
        #
        # @param [Hash] properties
-       #
        # @option properties [Parser::Source::Map] :location Location information
        # of the node.
        #
@@ -1,6 +1,9 @@
  module Parser
    module AST
  
+     ##
+     # @api public
+     #
      class Processor < ::AST::Processor
        def process_regular_node(node)
          node.updated(nil, process_all(node))
@@ -1,6 +1,10 @@
  module Parser
  
    ##
+   # Base class for version-specific parsers.
+   #
+   # @api public
+   #
    # @!attribute [r] diagnostics
    # @return [Parser::Diagnostic::Engine]
    #
@@ -9,7 +13,9 @@ module Parser
    #
    class Base < Racc::Parser
      ##
-     # Parses a string of Ruby code and returns the AST.
+     # Parses a string of Ruby code and returns the AST. If the source
+     # cannot be parsed, {SyntaxError} is raised and a diagnostic is
+     # printed to `stderr`.
      #
      # @example
      # Parser::Base.parse('puts "hello"')
@@ -25,7 +31,6 @@ module Parser
        parser.diagnostics.all_errors_are_fatal = true
        parser.diagnostics.ignore_warnings = true
  
-       # Temporary, for manual testing convenience
        parser.diagnostics.consumer = lambda do |diagnostic|
          $stderr.puts(diagnostic.render)
        end
@@ -33,13 +38,20 @@ module Parser
        string = string.dup.force_encoding(parser.default_encoding)
  
        source_buffer = Source::Buffer.new(file, line)
-       source_buffer.source = string
+
+       if name == 'Parser::Ruby18'
+         source_buffer.raw_source = string
+       else
+         source_buffer.source = string
+       end
  
        parser.parse(source_buffer)
      end
  
      ##
-     # Parses Ruby source code by reading it from a file.
+     # Parses Ruby source code by reading it from a file. If the source
+     # cannot be parsed, {SyntaxError} is raised and a diagnostic is
+     # printed to `stderr`.
      #
      # @param [String] filename Path to the file to parse.
      # @see #parse
  # @see #parse
@@ -49,19 +61,8 @@ module Parser
      end
  
      attr_reader :diagnostics
-
      attr_reader :builder
-
-     ##
-     # @api internal
-     #
      attr_reader :static_env
-
-     ##
-     # The source file currently being parsed.
-     #
-     # @api internal
-     #
      attr_reader :source_buffer
  
      ##
67
68
  ##
@@ -155,14 +156,14 @@ module Parser
      end
  
      ##
-     # @api internal
+     # @api private
      # @return [TrueClass|FalseClass]
      #
      def in_def?
        @def_level > 0
      end
  
-     protected
+     private
  
      def next_token
        @lexer.advance