machete 0.4.0 → 0.5.0

data/.gitignore CHANGED
@@ -2,4 +2,6 @@ lib/machete/parser.rb
  doc/
  .yardoc
  *.rbc
+ .rbx/
  Gemfile.lock
+ .rbenv-version
data/CHANGELOG CHANGED
@@ -1,5 +1,18 @@
+ 0.5.0 (2012-08-19)
+ ------------------
+
+ * Support for matching symbols using "^=", "$=" and "*=" operators.
+ * Support for regexp literals and matching regexps using "*=" operator.
+ * Extended symbol grammar to cover symbols denoting instance and class variable
+   names (:@foo, :@@bar).
+ * Machete#matches? and Machete#find also accept the pattern in compiled
+   form (instance of the Machete::Matchers::Matcher class).
+ * Works in Rubinius 1.9 mode.
+ * Internal code improvements and fixes.
+
  0.4.0 (2011-10-18)
  ------------------
+
  * Support for "true", "false" and "nil" literals.
  * New "*=" operator matching part of a string.
  * Bundler support.
data/Gemfile CHANGED
@@ -1,3 +1,7 @@
  source "http://rubygems.org"

  gemspec
+
+ group :test do
+   gem "rake"
+ end
data/README.md CHANGED
@@ -1,12 +1,16 @@
  Machete
  =======

- Machete is a simple tool for matching Rubinius AST nodes against patterns. You can use it if you are writing any kind of tool that processes Ruby code and needs to do some work on specific types of nodes, needs to find patterns in the code, etc.
+ Machete is a simple tool for matching Rubinius AST nodes against patterns. You
+ can use it if you are writing any kind of tool that processes Ruby code and
+ needs to do some work on specific types of nodes, needs to find patterns in the
+ code, etc.

  Installation
  ------------

- You need to install [Rubinius](http://rubini.us/) first. You can then install Machete:
+ You need to install the current development version of
+ [Rubinius](http://rubini.us/) first. You can then install Machete:

      $ gem install machete

@@ -17,19 +21,21 @@ First, require the library:

      require "machete"

- You can now use one of two methods Machete offers: `Machete.matches?` and `Machete.find`.
+ You can now use one of two methods Machete offers: `Machete.matches?` and
+ `Machete.find`.

  The `Machete.matches?` method matches a Rubinius AST node against a pattern:

-     Machete.matches?('foo.bar'.to_ast, 'Send<receiver = Send<receiver = Self, name = :foo>, name = :bar>')
+     Machete.matches?('foo.bar'.to_ast, 'Send<name = :bar>')
      # => true

-     Machete.matches?('42'.to_ast, 'Send<receiver = Send<receiver = Self, name = :foo>, name = :bar>')
+     Machete.matches?('42'.to_ast, 'Send<name = :bar>')
      # => false

  (See below for pattern syntax description.)

- The `Machete.find` method finds all nodes in a Rubinius AST tree matching a pattern:
+ The `Machete.find` method finds all nodes in a Rubinius AST tree matching a
+ pattern:

      Machete.find('42 + 43 + 44'.to_ast, 'FixnumLiteral')
      # => [
@@ -38,12 +44,24 @@ The `Machete.find` method finds all nodes in a Rubinius AST tree matching a patt
      #   #<Rubinius::AST::FixnumLiteral:0x10c0 @value=42 @line=1>
      # ]

+ Both `Machete.matches?` and `Machete.find` also accept patterns in their
+ compiled form (instance of `Machete::Matchers::Matcher`):
+
+     Machete.matches?(
+       'foo.bar'.to_ast,
+       Machete::Matchers::NodeMatcher.new("Send",
+         :name => Machete::Matchers::LiteralMatcher.new(:bar)
+       )
+     )
+     # => true
+
  Pattern Syntax
  --------------

  ### Basics

- Rubinius AST consists of instances of classes that represent various types of nodes:
+ Rubinius AST consists of instances of classes that represent various types of
+ nodes:

      '42'.to_ast # => #<Rubinius::AST::FixnumLiteral:0xf28 @value=42 @line=1>
      '"abcd"'.to_ast # => #<Rubinius::AST::StringLiteral:0xf60 @line=1 @string="abcd">
@@ -58,25 +76,31 @@ To specify multiple alternatives, use the choice operator:
      Machete.matches?('42'.to_ast, 'FixnumLiteral | StringLiteral') # => true
      Machete.matches?('"abcd"'.to_ast, 'FixnumLiteral | StringLiteral') # => true

- If you don't care about the node type at all, use the `any` keyword (this is most useful when matching arrays — see below):
+ If you don't care about the node type at all, use the `any` keyword (this is
+ most useful when matching arrays — see below):

      Machete.matches?('42'.to_ast, 'any') # => true
      Machete.matches?('"abcd"'.to_ast, 'any') # => true

  ### Node Attributes

- If you want to match a specific attribute of a node, specify its value inside `<...>` right after the node name:
+ If you want to match a specific attribute of a node, specify its value inside
+ `<...>` right after the node name:

      Machete.matches?('42'.to_ast, 'FixnumLiteral<value = 42>') # => true
      Machete.matches?('45'.to_ast, 'FixnumLiteral<value = 42>') # => false

- The attribute value can be `true`, `false`, `nil`, integer, string, symbol, array or other pattern. The last option means you can easily match nested nodes recursively. You can also specify multiple attributes:
+ The attribute value can be `nil`, `true`, `false`, integer, symbol, string,
+ regexp, array or other pattern. The last option means you can easily match
+ nested nodes recursively. You can also specify multiple attributes:

      Machete.matches?('foo.bar'.to_ast, 'Send<receiver = Send<receiver = Self, name = :foo>, name = :bar>') # => true

- #### String Attributes
+ #### String And Symbol Attributes

- When matching string attributes values, you don't have to do a whole-string match using the `=` operator. You can also match the beginning, the end or a part of a string attribute value using the `^=`, `$=` and `*=` operators:
+ When matching string attributes values, you don't have to do a whole-string
+ match using the `=` operator. You can also match the beginning, the end or a
+ part of a string attribute value using the `^=`, `$=` and `*=` operators:

      Machete.matches?('"abcd"'.to_ast, 'StringLiteral<string ^= "ab">') # => true
      Machete.matches?('"efgh"'.to_ast, 'StringLiteral<string ^= "ab">') # => false
@@ -85,9 +109,31 @@ When matching string attributes values, you don't have to do a whole-string matc
      Machete.matches?('"abcd"'.to_ast, 'StringLiteral<string *= "bc">') # => true
      Machete.matches?('"efgh"'.to_ast, 'StringLiteral<string *= "bc">') # => false

+ Matching symbol attributes works in the same way:
+
+     Machete.matches?(':abcd'.to_ast, 'SymbolLiteral<value ^= :ab>') # => true
+     Machete.matches?(':efgh'.to_ast, 'SymbolLiteral<value ^= :ab>') # => false
+     Machete.matches?(':abcd'.to_ast, 'SymbolLiteral<value $= :cd>') # => true
+     Machete.matches?(':efgh'.to_ast, 'SymbolLiteral<value $= :cd>') # => false
+     Machete.matches?(':abcd'.to_ast, 'SymbolLiteral<value *= :bc>') # => true
+     Machete.matches?(':efgh'.to_ast, 'SymbolLiteral<value *= :bc>') # => false
+
+ In addition, you can match string and symbol attributes using regular
+ expressions together with the `*=` operator:
+
+     Machete.matches?('"abcd"'.to_ast, 'StringLiteral<string *= /bc/>') # => true
+     Machete.matches?('"efgh"'.to_ast, 'StringLiteral<string *= /bc/>') # => false
+
+     Machete.matches?(':abcd'.to_ast, 'SymbolLiteral<value *= /bc/>') # => true
+     Machete.matches?(':efgh'.to_ast, 'SymbolLiteral<value *= /bc/>') # => false
+
+ The regular expressions can take the `i`, `m` and `x` options with the same
+ semantics as in Ruby.
+
  #### Array Attributes

- When matching array attribute values, the simplest way is to specify the array elements exactly. They will be matched one-by-one.
+ When matching array attribute values, the simplest way is to specify the array
+ elements exactly. They will be matched one-by-one.

      Machete.matches?('[1, 2]'.to_ast, 'ArrayLiteral<body = [FixnumLiteral<value = 1>, FixnumLiteral<value = 2>]>') # => true

@@ -96,7 +142,9 @@ If you don't care about the node type of some array elements, you can use `any`:
      Machete.matches?('[1, 2]'.to_ast, 'ArrayLiteral<body = [any, FixnumLiteral<value = 2>]>') # => true
      Machete.matches?('["abcd", 2]'.to_ast, 'ArrayLiteral<body = [any, FixnumLiteral<value = 2>]>') # => true

- The best thing about array matching is that you can use quantifiers for elements: `*`, `+`, `?`, `{n}`, `{n,}`, `{,n}`, `{m,n}`. Their meaning is the same as in Perl-like regular expressions:
+ The best thing about array matching is that you can use quantifiers for
+ elements: `*`, `+`, `?`, `{n}`, `{n,}`, `{,n}`, `{m,n}`. Their meaning is the
+ same as in Perl-like regular expressions:

      Machete.matches?('[2]'.to_ast, 'ArrayLiteral<body = [any*, FixnumLiteral<value = 2>]>') # => true
      Machete.matches?('[1, 2]'.to_ast, 'ArrayLiteral<body = [any*, FixnumLiteral<value = 2>]>') # => true
@@ -126,7 +174,8 @@ The best thing about array matching is that you can use quantifiers for elements
      Machete.matches?('[1, 1, 2]'.to_ast, 'ArrayLiteral<body = [any{1,2}, FixnumLiteral<value = 2>]>') # => true
      Machete.matches?('[1, 1, 1, 2]'.to_ast, 'ArrayLiteral<body = [any{1,2}, FixnumLiteral<value = 2>]>') # => false

- There are also two unusual quantifiers: `{even}` and `{odd}`. They specify that the quantified expression must repeat an even or odd number of times:
+ There are also two unusual quantifiers: `{even}` and `{odd}`. They specify that
+ the quantified expression must repeat an even or odd number of times:

      Machete.matches?('[1, 2]'.to_ast, 'ArrayLiteral<body = [any{even}, FixnumLiteral<value = 2>]>') # => false
      Machete.matches?('[1, 1, 2]'.to_ast, 'ArrayLiteral<body = [any{even}, FixnumLiteral<value = 2>]>') # => true
@@ -134,26 +183,40 @@ There are also two unusual quantifiers: `{even}` and `{odd}`. They specify that
      Machete.matches?('[1, 2]'.to_ast, 'ArrayLiteral<body = [any{odd}, FixnumLiteral<value = 2>]>') # => true
      Machete.matches?('[1, 1, 2]'.to_ast, 'ArrayLiteral<body = [any{odd}, FixnumLiteral<value = 2>]>') # => false

- These quantifiers are best used when matching hashes containing a specific key or value. This is because in Rubinius AST both hash keys and values are flattened into one array and the only thing distinguishing them is even or odd position.
+ These quantifiers are best used when matching hashes containing a specific key
+ or value. This is because in Rubinius AST both hash keys and values are
+ flattened into one array and the only thing distinguishing them is even or odd
+ position.

  ### More Information

- For more details about the syntax see the `lib/machete/parser.y` file which contains the pattern parser.
+ For more details about the syntax see the `lib/machete/parser.y` file which
+ contains the pattern parser.

  FAQ
  ---

- **Why did you choose Rubinius AST as a base? Aren't there other tools for Ruby parsing which are not VM-specific?**
+ **Why did you choose Rubinius AST as a base? Aren't there other tools for Ruby
+ parsing which are not VM-specific?**

  There are three other tools which were considered but each has its issues:

- * [parse_tree](http://parsetree.rubyforge.org/) — unmaintained and unsupported for 1.9
- * [ruby_parser](http://parsetree.rubyforge.org/) — sometimes reports wrong line numbers for the nodes (this is a killer for some use cases)
- * [Ripper](http://rubyforge.org/projects/ripper/) — usable but the generated AST is too low level (the patterns would be too complex and low-level)
+ * [parse_tree](http://parsetree.rubyforge.org/) — unmaintained and unsupported
+   for 1.9
+ * [ruby_parser](http://parsetree.rubyforge.org/) — sometimes reports wrong line
+   numbers for the nodes (this is a killer for some use cases)
+ * [Ripper](http://rubyforge.org/projects/ripper/) — usable but the generated AST
+   is too low level (the patterns would be too complex and low-level)

  Rubinius AST is also by far the easiest to work with.

+ Compatibility
+ -------------
+
+ Machete is compatible with both the 1.8 and 1.9 modes of Rubinius.
+
  Acknowledgement
  ---------------

- The general idea and inspiration for the pattern syntax was taken from Python's [2to3](http://docs.python.org/library/2to3.html) tool.
+ The general idea and inspiration for the pattern syntax was taken from Python's
+ [2to3](http://docs.python.org/library/2to3.html) tool.
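
Note on the README changes: the `=`, `^=`, `$=` and `*=` operators documented above can be approximated in plain Ruby, without Rubinius. This is only a sketch inferred from the README examples; `attribute_matches?` is a hypothetical helper and not part of Machete, and the rule that a string operand matches only strings (and a symbol operand only symbols) is an assumption.

```ruby
# Hypothetical helper, not Machete's API: mimics the documented operators
# on plain String and Symbol values.
def attribute_matches?(value, operator, operand)
  return false unless value.is_a?(String) || value.is_a?(Symbol)

  if operand.is_a?(Regexp)
    # Regexp operands are only documented with "*="; anything else is
    # treated as "no match" in this sketch.
    return false unless operator == "*="
    return !(value.to_s =~ operand).nil?
  end

  # Assumption: a string operand matches only strings, a symbol only symbols.
  return false unless value.class == operand.class

  text = value.to_s
  part = operand.to_s
  case operator
  when "="  then text == part             # whole value
  when "^=" then text.start_with?(part)   # beginning
  when "$=" then text.end_with?(part)     # end
  when "*=" then text.include?(part)      # any part
  else raise ArgumentError, "unknown operator: #{operator}"
  end
end
```

With this helper, `attribute_matches?(:abcd, "^=", :ab)` and `attribute_matches?("abcd", "*=", /bc/)` hold, while `attribute_matches?(:efgh, "$=", :cd)` does not, mirroring the README examples.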
data/Rakefile CHANGED
@@ -5,8 +5,8 @@ desc "Generate the expression parser"
  task :parser do
    source = "lib/machete/parser.y"
    target = "lib/machete/parser.rb"
-   unless uptodate?(target, source)
-     system "racc -o #{target} #{source}" or exit 1
+   unless uptodate?(target, [source])
+     sh "racc -o #{target} #{source}"
    end
  end
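
Note on the `uptodate?` change: the sketch below assumes the Rakefile's `uptodate?` resolves to Ruby's standard `FileUtils.uptodate?(new, old_list)`, which compares one target against a list of source files. The temporary file names and the `uptodate_demo` method are placeholders for illustration, not files or code from the gem.

```ruby
require "fileutils"
require "tmpdir"

# Reports what FileUtils.uptodate? returns in three situations.
def uptodate_demo
  Dir.mktmpdir do |dir|
    source = File.join(dir, "parser.y")   # placeholder grammar file
    target = File.join(dir, "parser.rb")  # placeholder generated file
    File.write(source, "grammar")

    # A missing target is never up to date.
    missing = FileUtils.uptodate?(target, [source])

    File.write(target, "generated")
    now = Time.now
    File.utime(now - 60, now - 60, source)
    File.utime(now, now, target)
    # The target is newer than every listed source.
    fresh = FileUtils.uptodate?(target, [source])

    # Touching the source makes the target stale again.
    File.utime(now + 60, now + 60, source)
    stale = FileUtils.uptodate?(target, [source])

    { :missing => missing, :fresh => fresh, :stale => stale }
  end
end

UPTODATE_DEMO = uptodate_demo
```

Only the middle case reports the target as up to date. Rake's `sh` fails the task when the command exits with a non-zero status, which is presumably why the explicit `or exit 1` could be dropped.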
 
data/VERSION CHANGED
@@ -1 +1 @@
- 0.4.0
+ 0.5.0
@@ -3,53 +3,97 @@ require File.expand_path(File.dirname(__FILE__) + "/machete/parser")
  require File.expand_path(File.dirname(__FILE__) + "/machete/version")

  module Machete
-   # Matches a Rubinius AST node against a pattern.
-   #
-   # @param [Rubinius::AST::Node] node node to match
-   # @param [String] pattern pattern to match the node against (see {file:README.md} for syntax description)
-   #
-   # @example Successful match
-   #   Machete.matches?('foo.bar'.to_ast, 'Send<receiver = Send<receiver = Self, name = :foo>, name = :bar>')
-   #   # => true
-   #
-   # @example Failed match
-   #   Machete.matches?('42'.to_ast, 'Send<receiver = Send<receiver = Self, name = :foo>, name = :bar>')
-   #   # => false
-   #
-   # @return [Boolean] +true+ if the node matches the pattern, +false+ otherwise
-   #
-   # @raise [Machete::Parser::SyntaxError] if the pattern is invalid
-   def self.matches?(node, pattern)
-     Parser.new.parse(pattern).matches?(node)
-   end
+   class << self
+     # Matches a Rubinius AST node against a pattern.
+     #
+     # @param [Rubinius::AST::Node] node node to match
+     # @param [String, Machete::Matchers::Matcher] pattern pattern to match the
+     #   node against, either as a string (see {file:README.md} for syntax
+     #   description) or in compiled form
+     #
+     # @example Test using a string pattern
+     #   Machete.matches?('foo.bar'.to_ast, 'Send<name = :bar>')
+     #   # => true
+     #
+     #   Machete.matches?('42'.to_ast, 'Send<name = :bar>')
+     #   # => false
+     #
+     # @example Test using a compiled pattern
+     #   Machete.matches?(
+     #     'foo.bar'.to_ast,
+     #     Machete::Matchers::NodeMatcher.new("Send",
+     #       :name => Machete::Matchers::LiteralMatcher.new(:bar)
+     #     )
+     #   )
+     #   # => true
+     #
+     #   Machete.matches?(
+     #     '42'.to_ast,
+     #     Machete::Matchers::NodeMatcher.new("Send",
+     #       :name => Machete::Matchers::LiteralMatcher.new(:bar)
+     #     )
+     #   )
+     #   # => false
+     #
+     # @return [Boolean] +true+ if the node matches the pattern, +false+
+     #   otherwise
+     #
+     # @raise [Machete::Parser::SyntaxError] if the pattern is invalid
+     def matches?(node, pattern)
+       compiled_pattern(pattern).matches?(node)
+     end
+
+     # Finds all nodes in a Rubinius AST matching a pattern.
+     #
+     # @param [Rubinius::AST::Node] ast tree to search
+     # @param [String, Machete::Matchers::Matcher] pattern pattern to match the
+     #   nodes against, either as a string (see {file:README.md} for syntax
+     #   description) or in compiled form
+     #
+     # @example Search using a string pattern
+     #   Machete.find('42 + 43 + 44'.to_ast, 'FixnumLiteral')
+     #   # => [
+     #   #   #<Rubinius::AST::FixnumLiteral:0x10b0 @value=44 @line=1>,
+     #   #   #<Rubinius::AST::FixnumLiteral:0x10b8 @value=43 @line=1>,
+     #   #   #<Rubinius::AST::FixnumLiteral:0x10c0 @value=42 @line=1>
+     #   # ]
+     #
+     # @example Search using a compiled pattern
+     #   Machete.find(
+     #     '42 + 43 + 44'.to_ast,
+     #     Machete::Matchers::NodeMatcher.new("FixnumLiteral")
+     #   )
+     #   # => [
+     #   #   #<Rubinius::AST::FixnumLiteral:0x10b0 @value=44 @line=1>,
+     #   #   #<Rubinius::AST::FixnumLiteral:0x10b8 @value=43 @line=1>,
+     #   #   #<Rubinius::AST::FixnumLiteral:0x10c0 @value=42 @line=1>
+     #   # ]
+     #
+     # @return [Array] list of matching nodes (in unspecified order)
+     #
+     # @raise [Machete::Parser::SyntaxError] if the pattern is invalid
+     def find(ast, pattern)
+       matcher = compiled_pattern(pattern)

-   # Finds all nodes in a Rubinius AST matching a pattern.
-   #
-   # @param [Rubinius::AST::Node] ast tree to search
-   # @param [String] pattern pattern to match the nodes against (see {file:README.md} for syntax description)
-   #
-   # @example
-   #   Machete.find('42 + 43 + 44'.to_ast, 'FixnumLiteral')
-   #   # => [
-   #   #   #<Rubinius::AST::FixnumLiteral:0x10b0 @value=44 @line=1>,
-   #   #   #<Rubinius::AST::FixnumLiteral:0x10b8 @value=43 @line=1>,
-   #   #   #<Rubinius::AST::FixnumLiteral:0x10c0 @value=42 @line=1>
-   #   # ]
-   #
-   # @return [Array] list of matching nodes (in unspecified order)
-   #
-   # @raise [Machete::Parser::SyntaxError] if the pattern is invalid
-   def self.find(ast, pattern)
-     matcher = Parser.new.parse(pattern)
+       result = []
+       result << ast if matcher.matches?(ast)

-     result = []
-     result << ast if matcher.matches?(ast)
+       ast.walk(true) do |dummy, node|
+         result << node if matcher.matches?(node)
+         true
+       end

-     ast.walk(true) do |dummy, node|
-       result << node if matcher.matches?(node)
-       true
+       result
      end

-     result
+     private
+
+     def compiled_pattern(pattern)
+       if pattern.is_a?(String)
+         Parser.new.parse(pattern)
+       else
+         pattern
+       end
+     end
    end
  end
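
Note on `compiled_pattern`: the dispatch added above (parse strings, pass anything else through as an already compiled matcher) can be tried outside Rubinius with stand-in classes. Everything named below (`PatternSketch`, `SketchParser`, `LiteralSketchMatcher`) is hypothetical and exists only for this sketch; the stand-in "parser" simply treats the pattern string as a literal to compare against.

```ruby
# Hypothetical stand-in for a compiled matcher.
class LiteralSketchMatcher
  def initialize(expected)
    @expected = expected
  end

  def matches?(value)
    value == @expected
  end
end

# Hypothetical stand-in for the pattern parser: "compiles" a string into a
# matcher that compares against that same string.
module SketchParser
  def self.parse(source)
    LiteralSketchMatcher.new(source)
  end
end

module PatternSketch
  class << self
    # Accepts the pattern either as a String or as a compiled matcher.
    def matches?(value, pattern)
      compiled_pattern(pattern).matches?(value)
    end

    private

    # Strings are parsed; anything else is assumed to be compiled already.
    def compiled_pattern(pattern)
      if pattern.is_a?(String)
        SketchParser.parse(pattern)
      else
        pattern
      end
    end
  end
end
```

Both `PatternSketch.matches?("foo", "foo")` and `PatternSketch.matches?("foo", LiteralSketchMatcher.new("foo"))` go through the same code path. A plain `private` has no effect on methods defined with `def self.name`, so moving the methods into `class << self` is what allows the helper to be private.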
@@ -1,7 +1,5 @@
  module Machete
-   # @private
    module Matchers
-     # @private
      class Quantifier
        # :min should be always set, :max can be nil (meaning infinity)
        attr_reader :matcher, :min, :max, :step
@@ -19,8 +17,10 @@ module Machete
        end
      end

-     # @private
-     class ChoiceMatcher
+     class Matcher
+     end
+
+     class ChoiceMatcher < Matcher
        attr_reader :alternatives

        def initialize(alternatives)
@@ -36,8 +36,7 @@ module Machete
        end
      end

-     # @private
-     class NodeMatcher
+     class NodeMatcher < Matcher
        attr_reader :class_name, :attrs

        def initialize(class_name, attrs = {})
@@ -56,8 +55,7 @@ module Machete
        end
      end

-     # @private
-     class ArrayMatcher
+     class ArrayMatcher < Matcher
        attr_reader :items

        def initialize(items)
@@ -119,8 +117,7 @@ module Machete
        end
      end

-     # @private
-     class LiteralMatcher
+     class LiteralMatcher < Matcher
        attr_reader :literal

        def initialize(literal)
@@ -136,8 +133,7 @@ module Machete
        end
      end

-     # @private
-     class StringRegexpMatcher
+     class RegexpMatcher < Matcher
        attr_reader :regexp

        def initialize(regexp)
@@ -147,14 +143,28 @@ module Machete
        def ==(other)
          other.instance_of?(self.class) && @regexp == other.regexp
        end
+     end

+     class SymbolRegexpMatcher < RegexpMatcher
+       def matches?(node)
+         node.is_a?(Symbol) && node.to_s =~ @regexp
+       end
+     end
+
+     class StringRegexpMatcher < RegexpMatcher
        def matches?(node)
          node.is_a?(String) && node =~ @regexp
        end
      end

-     # @private
-     class AnyMatcher
+     class IndifferentRegexpMatcher < RegexpMatcher
+       def matches?(node)
+         (node.is_a?(Symbol) && node.to_s =~ @regexp) ||
+           (node.is_a?(String) && node =~ @regexp)
+       end
+     end
+
+     class AnyMatcher < Matcher
        def ==(other)
          other.instance_of?(self.class)
        end
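
Note on the regexp matchers: the three `RegexpMatcher` subclasses added above differ only in which value types they accept. The sketch below copies their `matches?` bodies into a separate `MatcherSketch` module (a hypothetical name, used so it does not clash with the real `Machete::Matchers`) and runs them on plain strings and symbols. The body of `initialize` is not visible in the diff, so storing the regexp in `@regexp` is an assumption.

```ruby
module MatcherSketch
  class Matcher
  end

  class RegexpMatcher < Matcher
    attr_reader :regexp

    # Assumed body: the diff shows only the method signature.
    def initialize(regexp)
      @regexp = regexp
    end

    def ==(other)
      other.instance_of?(self.class) && @regexp == other.regexp
    end
  end

  # Accepts symbols only.
  class SymbolRegexpMatcher < RegexpMatcher
    def matches?(node)
      node.is_a?(Symbol) && node.to_s =~ @regexp
    end
  end

  # Accepts strings only.
  class StringRegexpMatcher < RegexpMatcher
    def matches?(node)
      node.is_a?(String) && node =~ @regexp
    end
  end

  # Accepts both symbols and strings.
  class IndifferentRegexpMatcher < RegexpMatcher
    def matches?(node)
      (node.is_a?(Symbol) && node.to_s =~ @regexp) ||
        (node.is_a?(String) && node =~ @regexp)
    end
  end
end
```

Because `=~` returns a match position or `nil`, these `matches?` methods return a truthy or falsy value rather than strictly `true` or `false`; callers that need a boolean can wrap the call in `!!`. The `==` definition compares classes exactly, so a `StringRegexpMatcher` and an `IndifferentRegexpMatcher` built from the same regexp are not equal.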