machete 0.2.1 → 0.3.0
- data/CHANGELOG +9 -0
- data/README.md +78 -9
- data/VERSION +1 -1
- data/lib/machete/matchers.rb +128 -2
- data/lib/machete/parser.rb +382 -119
- data/lib/machete/parser.y +127 -49
- data/spec/machete/matchers_spec.rb +356 -0
- data/spec/machete/parser_spec.rb +202 -111
- metadata +5 -5
data/CHANGELOG
CHANGED
@@ -1,3 +1,12 @@
+0.3.0 (2011-09-27)
+------------------
+
+* Support for array matching, including quantifiers for array elements
+  ("*", "+", "?", "{n}", "{n,}", "{,n}", "{m,n}", "{even}", "{odd}").
+* New "^=" and "$=" operators matching beginning/end of a string.
+* New "any" keyword that matches any node.
+* Internal code improvements and fixes.
+
 0.2.1 (2011-08-17)
 ------------------
 
data/README.md
CHANGED
@@ -41,6 +41,8 @@ The `Machete.find` method finds all nodes in a Rubinius AST tree matching a patt
|
|
41
41
|
Pattern Syntax
|
42
42
|
--------------
|
43
43
|
|
44
|
+
### Basics
|
45
|
+
|
44
46
|
Rubinius AST consists of instances of classes that represent various types of nodes:
|
45
47
|
|
46
48
|
'42'.to_ast # => #<Rubinius::AST::FixnumLiteral:0xf28 @value=42 @line=1>
|
@@ -51,23 +53,90 @@ To match a specific node type, just use its class name in the pattern:
|
|
51
53
|
Machete.matches?('42'.to_ast, 'FixnumLiteral') # => true
|
52
54
|
Machete.matches?('"abcd"'.to_ast, 'FixnumLiteral') # => false
|
53
55
|
|
54
|
-
|
56
|
+
To specify multiple alternatives, use the choice operator:
|
57
|
+
|
58
|
+
Machete.matches?('42'.to_ast, 'FixnumLiteral | StringLiteral') # => true
|
59
|
+
Machete.matches?('"abcd"'.to_ast, 'FixnumLiteral | StringLiteral') # => true
|
60
|
+
|
61
|
+
If you don't care about the node type at all, use the `any` keyword (this is most useful when matching arrays — see below):
|
62
|
+
|
63
|
+
Machete.matches?('42'.to_ast, 'any') # => true
|
64
|
+
Machete.matches?('"abcd"'.to_ast, 'any') # => true
|
65
|
+
|
66
|
+
### Node Attributes
|
67
|
+
|
68
|
+
If you want to match a specific attribute of a node, specify its value inside `<...>` right after the node name:
|
55
69
|
|
56
70
|
Machete.matches?('42'.to_ast, 'FixnumLiteral<value = 42>') # => true
|
57
71
|
Machete.matches?('45'.to_ast, 'FixnumLiteral<value = 42>') # => false
|
58
72
|
|
59
|
-
The attribute value can be an integer, string, symbol or other pattern.
|
73
|
+
The attribute value can be an integer, string, symbol, array or other pattern. The last option means you can easily match nested nodes recursively. You can also specify multiple attributes:
|
60
74
|
|
61
|
-
Machete.matches?('foo.bar'.to_ast, 'Send<receiver = Send<receiver = Self, name = :foo>, name = :bar>')
|
62
|
-
# => true
|
75
|
+
Machete.matches?('foo.bar'.to_ast, 'Send<receiver = Send<receiver = Self, name = :foo>, name = :bar>') # => true
|
63
76
|
|
64
|
-
|
65
|
-
# => false
|
77
|
+
#### String Attributes
|
66
78
|
|
67
|
-
|
79
|
+
When matching string attribute values, you don't have to do a whole-string match using the `=` operator. You can also match the beginning or the end of a string attribute value using the `^=` or `$=` operators:
|
68
80
|
|
69
|
-
Machete.matches?('
|
70
|
-
Machete.matches?('"
|
81
|
+
Machete.matches?('"abcd"'.to_ast, 'StringLiteral<string ^= "ab">') # => true
|
82
|
+
Machete.matches?('"efgh"'.to_ast, 'StringLiteral<string ^= "ab">') # => false
|
83
|
+
Machete.matches?('"abcd"'.to_ast, 'StringLiteral<string $= "cd">') # => true
|
84
|
+
Machete.matches?('"efgh"'.to_ast, 'StringLiteral<string $= "cd">') # => false
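As a rough plain-Ruby illustration of the three string comparisons (`=`, `^=`, `$=`), the sketch below uses a hypothetical helper, `string_attr_matches?`, that is not part of Machete. It only mirrors the `String#start_with?` and `String#end_with?` checks that the `matchers.rb` change in this release relies on, and it assumes the attribute value is already a plain Ruby string.

```ruby
# Hypothetical helper (not part of Machete): compares a string attribute
# value against an expected string using one of the three operators.
def string_attr_matches?(operator, actual, expected)
  # Non-string attribute values never match in this sketch.
  return false unless actual.is_a?(String)

  case operator
  when "="  then actual == expected            # whole-string match
  when "^=" then actual.start_with?(expected)  # prefix match
  when "$=" then actual.end_with?(expected)    # suffix match
  else raise ArgumentError, "unknown operator: #{operator}"
  end
end

string_attr_matches?("^=", "abcd", "ab") # => true
string_attr_matches?("$=", "efgh", "cd") # => false
```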
|
85
|
+
|
86
|
+
#### Array Attributes
|
87
|
+
|
88
|
+
When matching array attribute values, the simplest way is to specify the array elements exactly. They will be matched one-by-one.
|
89
|
+
|
90
|
+
Machete.matches?('[1, 2]'.to_ast, 'ArrayLiteral<body = [FixnumLiteral<value = 1>, FixnumLiteral<value = 2>]>') # => true
|
91
|
+
|
92
|
+
If you don't care about the node type of some array elements, you can use `any`:
|
93
|
+
|
94
|
+
Machete.matches?('[1, 2]'.to_ast, 'ArrayLiteral<body = [any, FixnumLiteral<value = 2>]>') # => true
|
95
|
+
Machete.matches?('["abcd", 2]'.to_ast, 'ArrayLiteral<body = [any, FixnumLiteral<value = 2>]>') # => true
|
96
|
+
|
97
|
+
The best thing about array matching is that you can use quantifiers for elements: `*`, `+`, `?`, `{n}`, `{n,}`, `{,n}`, `{m,n}`. Their meaning is the same as in Perl-like regular expressions:
|
98
|
+
|
99
|
+
Machete.matches?('[2]'.to_ast, 'ArrayLiteral<body = [any*, FixnumLiteral<value = 2>]>') # => true
|
100
|
+
Machete.matches?('[1, 2]'.to_ast, 'ArrayLiteral<body = [any*, FixnumLiteral<value = 2>]>') # => true
|
101
|
+
Machete.matches?('[1, 1, 2]'.to_ast, 'ArrayLiteral<body = [any*, FixnumLiteral<value = 2>]>') # => true
|
102
|
+
|
103
|
+
Machete.matches?('[2]'.to_ast, 'ArrayLiteral<body = [any+, FixnumLiteral<value = 2>]>') # => false
|
104
|
+
Machete.matches?('[1, 2]'.to_ast, 'ArrayLiteral<body = [any+, FixnumLiteral<value = 2>]>') # => true
|
105
|
+
Machete.matches?('[1, 1, 2]'.to_ast, 'ArrayLiteral<body = [any+, FixnumLiteral<value = 2>]>') # => true
|
106
|
+
|
107
|
+
Machete.matches?('[2]'.to_ast, 'ArrayLiteral<body = [any?, FixnumLiteral<value = 2>]>') # => true
|
108
|
+
Machete.matches?('[1, 2]'.to_ast, 'ArrayLiteral<body = [any?, FixnumLiteral<value = 2>]>') # => true
|
109
|
+
|
110
|
+
Machete.matches?('[2]'.to_ast, 'ArrayLiteral<body = [any{1}, FixnumLiteral<value = 2>]>') # => false
|
111
|
+
Machete.matches?('[1, 2]'.to_ast, 'ArrayLiteral<body = [any{1}, FixnumLiteral<value = 2>]>') # => true
|
112
|
+
Machete.matches?('[1, 1, 2]'.to_ast, 'ArrayLiteral<body = [any{1}, FixnumLiteral<value = 2>]>') # => false
|
113
|
+
|
114
|
+
Machete.matches?('[2]'.to_ast, 'ArrayLiteral<body = [any{1,}, FixnumLiteral<value = 2>]>') # => false
|
115
|
+
Machete.matches?('[1, 2]'.to_ast, 'ArrayLiteral<body = [any{1,}, FixnumLiteral<value = 2>]>') # => true
|
116
|
+
Machete.matches?('[1, 1, 2]'.to_ast, 'ArrayLiteral<body = [any{1,}, FixnumLiteral<value = 2>]>') # => true
|
117
|
+
|
118
|
+
Machete.matches?('[2]'.to_ast, 'ArrayLiteral<body = [any{,1}, FixnumLiteral<value = 2>]>') # => true
|
119
|
+
Machete.matches?('[1, 2]'.to_ast, 'ArrayLiteral<body = [any{,1}, FixnumLiteral<value = 2>]>') # => true
|
120
|
+
Machete.matches?('[1, 1, 2]'.to_ast, 'ArrayLiteral<body = [any{,1}, FixnumLiteral<value = 2>]>') # => false
|
121
|
+
|
122
|
+
Machete.matches?('[2]'.to_ast, 'ArrayLiteral<body = [any{1,2}, FixnumLiteral<value = 2>]>') # => false
|
123
|
+
Machete.matches?('[1, 2]'.to_ast, 'ArrayLiteral<body = [any{1,2}, FixnumLiteral<value = 2>]>') # => true
|
124
|
+
Machete.matches?('[1, 1, 2]'.to_ast, 'ArrayLiteral<body = [any{1,2}, FixnumLiteral<value = 2>]>') # => true
|
125
|
+
Machete.matches?('[1, 1, 1, 2]'.to_ast, 'ArrayLiteral<body = [any{1,2}, FixnumLiteral<value = 2>]>') # => false
|
126
|
+
|
127
|
+
There are also two unusual quantifiers: `{even}` and `{odd}`. They specify that the quantified expression must repeat an even or odd number of times:
|
128
|
+
|
129
|
+
Machete.matches?('[1, 2]'.to_ast, 'ArrayLiteral<body = [any{even}, FixnumLiteral<value = 2>]>') # => false
|
130
|
+
Machete.matches?('[1, 1, 2]'.to_ast, 'ArrayLiteral<body = [any{even}, FixnumLiteral<value = 2>]>') # => true
|
131
|
+
|
132
|
+
Machete.matches?('[1, 2]'.to_ast, 'ArrayLiteral<body = [any{odd}, FixnumLiteral<value = 2>]>') # => true
|
133
|
+
Machete.matches?('[1, 1, 2]'.to_ast, 'ArrayLiteral<body = [any{odd}, FixnumLiteral<value = 2>]>') # => false
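One way to read all of these quantifiers is as a `(min, max, step)` triple, which is the shape of the `Quantifier` class added to `matchers.rb` in this release. The helper `quantifier_bounds` below is hypothetical, and the specific triples are an assumption inferred from the examples above rather than taken from `parser.y`; `nil` for `max` stands for "no upper bound".

```ruby
# Hypothetical helper (not part of Machete): translate a quantifier token
# into [min, max, step]. The triples are inferred from the README examples.
def quantifier_bounds(token)
  case token
  when "*"      then [0, nil, 1]   # zero or more
  when "+"      then [1, nil, 1]   # one or more
  when "?"      then [0, 1, 1]     # zero or one
  when "{even}" then [0, nil, 2]   # 0, 2, 4, ... repetitions
  when "{odd}"  then [1, nil, 2]   # 1, 3, 5, ... repetitions
  when /\A\{(\d+)\}\z/       then [$1.to_i, $1.to_i, 1] # exactly n
  when /\A\{(\d+),\}\z/      then [$1.to_i, nil, 1]     # at least n
  when /\A\{,(\d+)\}\z/      then [0, $1.to_i, 1]       # at most n
  when /\A\{(\d+),(\d+)\}\z/ then [$1.to_i, $2.to_i, 1] # between m and n
  else raise ArgumentError, "unknown quantifier: #{token}"
  end
end

quantifier_bounds("{1,2}")  # => [1, 2, 1]
quantifier_bounds("{even}") # => [0, nil, 2]
```

Under this reading, `{even}` and `{odd}` differ from `*` and `+` only in the step, which is why a single class with a `step` field can cover every quantifier.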
|
134
|
+
|
135
|
+
These quantifiers are best used when matching hashes containing a specific key or value. This is because in the Rubinius AST both hash keys and values are flattened into one array, and the only thing distinguishing them is whether they sit at an even or odd position.
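The position rule can be illustrated with plain Ruby arrays, without Rubinius. In the sketch below, `flatten_pairs`, `keys_of` and `values_of` are hypothetical helpers, and an ordinary `Hash` stands in for a hash literal node; the point is only that an element preceded by an even number of elements is a key, and one preceded by an odd number is a value.

```ruby
# Hypothetical helpers (not part of Machete) showing the interleaved layout.

# Interleave keys and values into one flat array: [k1, v1, k2, v2, ...].
def flatten_pairs(hash)
  hash.to_a.flatten(1)
end

# Keys sit at even indices (an even number of elements precedes them).
def keys_of(flat)
  flat.each_index.select(&:even?).map { |i| flat[i] }
end

# Values sit at odd indices (an odd number of elements precedes them).
def values_of(flat)
  flat.each_index.select(&:odd?).map { |i| flat[i] }
end

flat = flatten_pairs(:a => 1, :b => 2, :c => 3)
# flat is [:a, 1, :b, 2, :c, 3]
keys_of(flat)   # => [:a, :b, :c]
values_of(flat) # => [1, 2, 3]
```

So a pattern that puts `any{even}` in front of an element constrains that element to a key position, and `any{odd}` constrains it to a value position.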
|
136
|
+
|
137
|
+
### More Information
|
138
|
+
|
139
|
+
For more details about the syntax see the `lib/machete/parser.y` file which contains the pattern parser.
|
71
140
|
|
72
141
|
FAQ
|
73
142
|
---
|
data/VERSION
CHANGED
@@ -1 +1 @@
-0.
+0.3.0
data/lib/machete/matchers.rb
CHANGED
@@ -1,6 +1,24 @@
|
|
1
1
|
module Machete
|
2
2
|
# @private
|
3
3
|
module Matchers
|
4
|
+
# @private
|
5
|
+
class Quantifier
|
6
|
+
# :min should always be set, :max can be nil (meaning infinity)
|
7
|
+
attr_reader :matcher, :min, :max, :step
|
8
|
+
|
9
|
+
def initialize(matcher, min, max, step)
|
10
|
+
@matcher, @min, @max, @step = matcher, min, max, step
|
11
|
+
end
|
12
|
+
|
13
|
+
def ==(other)
|
14
|
+
other.instance_of?(self.class) &&
|
15
|
+
@matcher == other.matcher &&
|
16
|
+
@min == other.min &&
|
17
|
+
@max == other.max &&
|
18
|
+
@step == other.step
|
19
|
+
end
|
20
|
+
end
|
21
|
+
|
4
22
|
# @private
|
5
23
|
class ChoiceMatcher
|
6
24
|
attr_reader :alternatives
|
@@ -14,7 +32,7 @@ module Machete
|
|
14
32
|
end
|
15
33
|
|
16
34
|
def matches?(node)
|
17
|
-
alternatives.any? { |a| a.matches?(node) }
|
35
|
+
@alternatives.any? { |a| a.matches?(node) }
|
18
36
|
end
|
19
37
|
end
|
20
38
|
|
@@ -34,7 +52,70 @@ module Machete
|
|
34
52
|
|
35
53
|
def matches?(node)
|
36
54
|
node.class == Rubinius::AST.const_get(@class_name) &&
|
37
|
-
attrs.all? { |name, matcher| matcher.matches?(node.send(name)) }
|
55
|
+
@attrs.all? { |name, matcher| matcher.matches?(node.send(name)) }
|
56
|
+
end
|
57
|
+
end
|
58
|
+
|
59
|
+
# @private
|
60
|
+
class ArrayMatcher
|
61
|
+
attr_reader :items
|
62
|
+
|
63
|
+
def initialize(items)
|
64
|
+
@items = items
|
65
|
+
end
|
66
|
+
|
67
|
+
def ==(other)
|
68
|
+
other.instance_of?(self.class) && @items == other.items
|
69
|
+
end
|
70
|
+
|
71
|
+
def matches?(node)
|
72
|
+
return false unless node.is_a?(Array)
|
73
|
+
|
74
|
+
match(@items, node)
|
75
|
+
end
|
76
|
+
|
77
|
+
private
|
78
|
+
|
79
|
+
# Simple recursive algorithm based on the one for regexp matching
|
80
|
+
# described in Beautiful Code (Chapter 1).
|
81
|
+
def match(matchers, nodes)
|
82
|
+
if matchers.empty?
|
83
|
+
nodes.empty?
|
84
|
+
elsif !matchers[0].is_a?(Quantifier)
|
85
|
+
matchers[0].matches?(nodes[0]) && match(matchers[1..-1], nodes[1..-1])
|
86
|
+
else
|
87
|
+
quantifier = matchers[0]
|
88
|
+
|
89
|
+
# Too few elements?
|
90
|
+
return false if nodes.size < quantifier.min
|
91
|
+
|
92
|
+
# Make sure at least min elements match.
|
93
|
+
matches_min = nodes[0...quantifier.min].all? do |node|
|
94
|
+
quantifier.matcher.matches?(node)
|
95
|
+
end
|
96
|
+
return false unless matches_min
|
97
|
+
|
98
|
+
# Now try to match the remaining elements. The shortest match wins.
|
99
|
+
i = quantifier.min
|
100
|
+
max = if quantifier.max
|
101
|
+
[quantifier.max, nodes.size].min
|
102
|
+
else
|
103
|
+
nodes.size
|
104
|
+
end
|
105
|
+
while i <= max
|
106
|
+
return true if match(matchers[1..-1], nodes[i..-1])
|
107
|
+
|
108
|
+
matches_next = nodes[i...(i + quantifier.step)].all? do |node|
|
109
|
+
quantifier.matcher.matches?(node)
|
110
|
+
end
|
111
|
+
return false unless matches_next
|
112
|
+
|
113
|
+
i += quantifier.step
|
114
|
+
end
|
115
|
+
|
116
|
+
# No match found.
|
117
|
+
false
|
118
|
+
end
|
38
119
|
end
|
39
120
|
end
|
40
121
|
|
@@ -54,5 +135,50 @@ module Machete
|
|
54
135
|
@literal == node
|
55
136
|
end
|
56
137
|
end
|
138
|
+
|
139
|
+
# @private
|
140
|
+
class StartsWithMatcher
|
141
|
+
attr_reader :prefix
|
142
|
+
|
143
|
+
def initialize(prefix)
|
144
|
+
@prefix = prefix
|
145
|
+
end
|
146
|
+
|
147
|
+
def ==(other)
|
148
|
+
other.instance_of?(self.class) && @prefix == other.prefix
|
149
|
+
end
|
150
|
+
|
151
|
+
def matches?(node)
|
152
|
+
node.is_a?(String) && node.start_with?(@prefix)
|
153
|
+
end
|
154
|
+
end
|
155
|
+
|
156
|
+
# @private
|
157
|
+
class EndsWithMatcher
|
158
|
+
attr_reader :suffix
|
159
|
+
|
160
|
+
def initialize(suffix)
|
161
|
+
@suffix = suffix
|
162
|
+
end
|
163
|
+
|
164
|
+
def ==(other)
|
165
|
+
other.instance_of?(self.class) && @suffix == other.suffix
|
166
|
+
end
|
167
|
+
|
168
|
+
def matches?(node)
|
169
|
+
node.is_a?(String) && node.end_with?(@suffix)
|
170
|
+
end
|
171
|
+
end
|
172
|
+
|
173
|
+
# @private
|
174
|
+
class AnyMatcher
|
175
|
+
def ==(other)
|
176
|
+
other.instance_of?(self.class)
|
177
|
+
end
|
178
|
+
|
179
|
+
def matches?(node)
|
180
|
+
true
|
181
|
+
end
|
182
|
+
end
|
57
183
|
end
|
58
184
|
end
|
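The recursive matching in `ArrayMatcher#match` above can be tried out without Rubinius. The sketch below is a standalone approximation, not the gem's code: `Any`, `Equals` and `match_items` are hypothetical stand-ins, `Quantifier` is reduced to a `Struct`, and one guard is added that the diff does not have (a plain matcher fails on an empty node list instead of being handed `nil`).

```ruby
# Standalone approximation of the quantified array matching shown above.
# All names here are stand-ins so the sketch runs on any Ruby.
Quantifier = Struct.new(:matcher, :min, :max, :step)

class Any
  def matches?(_node)
    true
  end
end

class Equals
  def initialize(value)
    @value = value
  end

  def matches?(node)
    @value == node
  end
end

def match_items(matchers, nodes)
  # No matchers left: succeed only if every node has been consumed.
  return nodes.empty? if matchers.empty?

  head, *rest = matchers
  unless head.is_a?(Quantifier)
    # Guard added in this sketch: a plain matcher needs a node to consume.
    return false if nodes.empty?
    return head.matches?(nodes.first) && match_items(rest, nodes.drop(1))
  end

  # The first `min` nodes must exist and must all match.
  return false if nodes.size < head.min
  return false unless nodes.first(head.min).all? { |n| head.matcher.matches?(n) }

  # Try the shortest repetition first, then grow it by `step` nodes.
  i = head.min
  max = head.max ? [head.max, nodes.size].min : nodes.size
  while i <= max
    return true if match_items(rest, nodes.drop(i))
    return false unless nodes[i, head.step].all? { |n| head.matcher.matches?(n) }
    i += head.step
  end
  false
end

any_even = Quantifier.new(Any.new, 0, nil, 2) # assumed encoding of any{even}
match_items([any_even, Equals.new(2)], [1, 1, 2]) # => true
match_items([any_even, Equals.new(2)], [1, 2])    # => false
```

Because the loop returns as soon as the remaining matchers accept the remaining nodes, the shortest repetition that works wins, which matches the "shortest match wins" comment in the diff.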