cannonbol 1.1.0 → 1.3.0
- checksums.yaml +4 -4
- data/README.md +55 -37
- data/Rakefile +3 -3
- data/cannonbol.gemspec +2 -0
- data/config.ru +13 -0
- data/lib/cannonbol.rb +7 -676
- data/lib/cannonbol/cannonbol.rb +678 -0
- data/lib/cannonbol/version.rb +1 -1
- metadata +32 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: a8b090116c27ac8cf008c2a0b54b5aa2eb259bf4
+  data.tar.gz: 6899f4b674be76248a133b664870632eae5d48bd
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 9e820fa3e6b2758aa2e5fd67d7b9d24c6482bc96ad33b4b2763b071f3d73d5b8944fe5da77e56bdfd1fb03df303f4a13c024d3cb0884745fac100ddc4429baec
+  data.tar.gz: 201b97e9cdf60edbf9b3c5b84d9801b7883f06371f2a23c234616c4e0662ccc7d8eb716d9e52fb01cdf422ce458377881fc71fe550e8e33cff5dfca7d3d5f06a
data/README.md
CHANGED
@@ -9,6 +9,7 @@ CannonBol is a ruby DSL for patten matching based on SNOBOL4 and SPITBOL.
 * Complete SNOBOL4 + SPITBOL extensions!
 * Based on the well documented, proven SNOBOL4 language!
 * Simple syntax looks great within your ruby code!
+* Tested with ruby 1.9.3, 2.1.1 and [Opal](www.opalrb.org)
 
 ## Installation
 
@@ -34,28 +35,31 @@ Strings, Regexes and primitives are combined using & (concatenation) and | (alte
 
 Here is a simple pattern that matches a simple noun clause:
 
-("a" | "the") & /\s+/ & ("boy" | "girl")
+> ("a" | "the") & /\s+/ & ("boy" | "girl")
 
-
+This will match either "a" or "the" followed white space and then by "boy or "girl". Okay! Lets use it!
 
-("a" | "the") & /\s+/ & ("boy" | "girl").match?("he saw a boy going home")
+> ("a" | "the") & /\s+/ & ("boy" | "girl").match?("he saw a boy going home")
 => "a boy"
-("a" | "the") & /\s+/ & ("boy" | "girl").match?("he saw a big boy going home")
+> ("a" | "the") & /\s+/ & ("boy" | "girl").match?("he saw a big boy going home")
 => nil
 
-Now let's save the pieces of the match using the capture
+Now let's save the pieces of the match using the `capture?` (pronounced _capture IF_) method:
 
-article, noun = nil, nil
-pattern = ("a" | "the").capture? { |m| article = m } & /\s+/ & ("boy" | "girl").capture? { |m| noun = m }
-pattern.match?("he saw the girl going home")
-
+> article, noun = nil, nil;
+* pattern = ("a" | "the").capture? { |m| article = m } & /\s+/ & ("boy" | "girl").capture? { |m| noun = m };
+* pattern.match?("he saw the girl going home")
+=> the girl
+> noun
 => girl
-article
+> article
 => the
 
-The capture
+The `capture?` method and its friend `capture!` (pronounced _capture NOW_) have many powerful features.
+As shown above it can take a block which is passed the matching substring, _IF the match succeeds_.
+The other features of the capture method will be detailed [below.](Advanced capture techniques)
 
-Arrays can be turned into patterns using the match_any and match_all methods:
+Arrays can be turned into patterns using the `match_any` and `match_all` methods:
 
 ARTICLES = ["a", "the"]
 NOUNS = ["boy", "girl", "dog", "cat"]
@@ -64,7 +68,7 @@ Arrays can be turned into patterns using the match_any and match_all methods:
 [ARTICLES.match_any, [WS, [WS, ADJECTIVES.match_any, WS].match_all].match_any, NOUNS.match_any].match_all
 
 This is equivilent to
-
+
 ("a" | "the") & (WS | (WS & ("big" | "small" | "fierce" | "friendly") & WS)) & ("boy" | "girl" | "dog" | "cat")
 
 ### match? options
@@ -80,20 +84,30 @@ replace_with | nil | When a non-falsy value is supplied, the value will replace
 
 Example of replace with:
 
-"hello".match?("She said hello!")
+> "hello".match?("She said hello!")
 => hello
-"hello".match?("She said hello!", replace_with => "goodby")
+> "hello".match?("She said hello!", replace_with => "goodby")
 => She said goodby!
+
+#### Ignore case on a subpattern
+
+Sometimes its useful to run the matcher in the default case sensitive mode, and only turn off matching for one part of the pattern. To do this
+prefix a subpattern with a "-". For example
 
+> (-"GIRL" | "boy").match?("A big girl!")
+=> girl
+> (-"GIRL" | "boy").match?("A big BOY!")
+=> nil
+
 ### Patterns, Subjects, Cursors, Alternatives, and Backtracking
 
 A pattern is an object that responds to the match? method. Cannonbol adds the match? method to Ruby strings, and regexes, and provides a number of _primitive_ patterns. A pattern can be combined with another pattern using the &, and | operators. There are also several primitive patterns that take a pattern and create a new pattern. Here are some example patterns:
 
-"hello"
-/\s+/
+"hello" # matches any string containing hello
+/\s+/ # matches one or more white space characters
 "hello" & /\s+/ & "there" # matches "hello" and "there" seperated by white space
-"hello" | "goodby"
-ARB
+"hello" | "goodby" # matches EITHER "hello" or "there"
+ARB # a primitive pattern that matches anything (similar to /.*/)
 ("hello" | "goodby") & ARB & "Fred" # matches "hello" or "goodby" followed by any characters and finally "Fred"
 
 Patterns are just objects, so they can be assigned to variables:
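Since patterns are ordinary objects, `&` and `|` only need to return new pattern objects that remember their operands. The following is a minimal, self-contained sketch of that idea and of the backtracking it enables. It is not Cannonbol's implementation: the names `MiniPattern`, `Lit`, `Rx`, `Seq`, `Alt`, `each_end`, and `find_in` are hypothetical and exist only for this illustration.

```ruby
# Hypothetical miniature pattern DSL; none of these names are Cannonbol's API.
module MiniPattern
  # Concatenation: match self, then other.
  def &(other)
    Seq.new(self, other)
  end

  # Alternation: match self, or failing that, other.
  def |(other)
    Alt.new(self, other)
  end

  # Try each starting cursor in turn and return the first matched substring.
  def find_in(subject)
    (0..subject.length).each do |start|
      each_end(subject, start) { |stop| return subject[start...stop] }
    end
    nil
  end
end

class Lit
  include MiniPattern

  def initialize(text)
    @text = text
  end

  # Yield the cursor just past the literal if it matches at pos.
  def each_end(subject, pos)
    yield pos + @text.length if subject[pos, @text.length] == @text
  end
end

class Rx
  include MiniPattern

  def initialize(regex)
    @regex = /\G(?:#{regex.source})/ # \G anchors the regex at the cursor
  end

  def each_end(subject, pos)
    m = @regex.match(subject, pos)
    yield m.end(0) if m
  end
end

class Seq
  include MiniPattern

  def initialize(left, right)
    @left = left
    @right = right
  end

  # For every way the left side can end, try the right side from there.
  def each_end(subject, pos, &blk)
    @left.each_end(subject, pos) { |mid| @right.each_end(subject, mid, &blk) }
  end
end

class Alt
  include MiniPattern

  def initialize(left, right)
    @left = left
    @right = right
  end

  # Offer the left alternative first; the right one is the backtracking option.
  def each_end(subject, pos, &blk)
    @left.each_end(subject, pos, &blk)
    @right.each_end(subject, pos, &blk)
  end
end

# Patterns are plain objects, so they can be stored in variables and reused.
article = Lit.new("a") | Lit.new("the")
noun = Lit.new("boy") | Lit.new("girl")
noun_clause = article & Rx.new(/\s+/) & noun
```

With these definitions, `noun_clause.find_in("he saw a boy going home")` returns `"a boy"`, while the same call on `"he saw a big boy going home"` returns `nil`, mirroring the README's noun clause example.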
@@ -270,18 +284,19 @@ We can use this to clean up the palindrome pattern a little bit:
 
 Another way to get the capture variables is to interogate the value returned by match?. The value returned by match? is a subclass of string, that has some extra methods. One of these is the captured method which gives a hash of all the captured variables. For example:
 
-("dog" | "cat").capture?(:pet).match?("He had a dog named Spot.").captured[:pet]
+> ("dog" | "cat").capture?(:pet).match?("He had a dog named Spot.").captured[:pet]
 => dog
 
 You can also give a block to the match? method which will be called whether the block passes or not. For example:
 
-("dog" | "cat").capture?(:pet).match?("He had a dog named Spot."){ |match| match.captured[:pet] if match}
+> ("dog" | "cat").capture?(:pet).match?("He had a dog named Spot."){ |match| match.captured[:pet] if match}
 => dog
 
 The match? block can also explicitly name any capture variables you need to get the values of. So for example:
 
-pet_data = (POS(0) & ARBNO(("big" | "small").capture?(:size) | ("dog" | "cat").capture?(:pet) | LEN(1)) & RPOS(0))
-
+> pet_data = (POS(0) & ARBNO(("big" | "small").capture?(:size) | ("dog" | "cat").capture?(:pet) | LEN(1)) & RPOS(0))
+=> #<Cannonbol::Concat .... etc
+> pet_data.match?("He has a big dog!") { |m, pet, size| "type of pet: #{pet.upcase}, size: #{size.upcase}"}
 => type of pet: DOG, size: BIG
 
 If the match? block mentions capture variables that were not assigned in the match they get nil.
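Naming capture variables in the block's parameter list works because Ruby exposes a block's parameter names through `Proc#parameters`; the `Needle#thread` method in this diff reads `match_block.parameters[1..-1]` for that purpose. A minimal sketch of the lookup, assuming a plain hash of captures (the helper name `call_with_named_captures` is hypothetical, not part of Cannonbol):

```ruby
# Hypothetical helper: call the block with the match plus any captures it names.
def call_with_named_captures(match, captures, &block)
  # Proc#parameters returns pairs such as [:opt, :pet]; the first entry is the
  # match itself, so only the remaining names are looked up in the captures.
  names = (block.parameters[1..-1] || []).map { |_kind, name| name }
  block.call(match, *names.map { |name| captures[name] })
end

captures = { pet: "dog", size: "big" }

summary = call_with_named_captures("He has a big dog!", captures) do |m, pet, size|
  "type of pet: #{pet.upcase}, size: #{size.upcase}"
end

# A name with no matching capture simply receives nil.
missing = call_with_named_captures("He has a big dog!", captures) { |m, colour| colour }
```

Here `summary` is `"type of pet: DOG, size: BIG"` and `missing` is `nil`, matching the behaviour described above.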
@@ -289,14 +304,14 @@ If the match? block mentions capture variables that were not assigned in the mat
 #### Initializing capture variables
 
 When used as a parameter to a primitve the capture variable may be given an initial value. For example:
-
+
 LEN(baz: 12)
 
-would match LEN(12) if :baz had not yet been set.
+would match `LEN(12)` if :baz had not yet been set.
 
 A second way to initialize (or update capture variables) is to combine capture variables with a capture block like this:
 
-some_pattern.capture!(:baz) { |match, position, baz| baz || position * 2 } initializes :baz to position * 2
+some_pattern.capture!(:baz) { |match, position, baz| baz || position * 2 } # initializes :baz to position * 2
 
 If a symbol is specified in a capture!, and there is a block, then the symbol will be set to the value returned by the block.
 
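The update rule for `capture!` with a symbol and a block can be sketched with a plain hash standing in for the capture store. This is an illustration under that assumption only; `capture_now` is a hypothetical helper, not a Cannonbol method.

```ruby
# Hypothetical stand-in for the capture store; not Cannonbol's internals.
def capture_now(captures, name, match, position, initial = nil)
  # Use the stored value if the variable was already set, else the initial value.
  current = captures.key?(name) ? captures[name] : initial
  # With a block, the block's return value becomes the new value of the variable;
  # without one, the matched text itself is stored.
  captures[name] = block_given? ? yield(match, position, current) : match
end

captures = {}

# First call: :baz is unset, so the block falls through to position * 2.
capture_now(captures, :baz, "ab", 2) { |_match, position, baz| baz || position * 2 }
first = captures[:baz]

# Second call: :baz is already set, so the block keeps the existing value.
capture_now(captures, :baz, "abcd", 4) { |_match, position, baz| baz || position * 2 }
second = captures[:baz]
```

Both `first` and `second` are `4`: the block initializes `:baz` once and later calls leave it unchanged, which is the "initialize if not yet set" idiom the README example relies on.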
@@ -349,15 +364,13 @@ The difference is that FENCE will fail the whole match, but FENCE(pattern) will
 
 These can be used together to do some interesting things. For example
 
-pattern = POS(0) & SUCCEED & (FENCE(TAB(n: 1).capture!(:n) { |m, p, n| puts m; p+1 } | ABORT)) & FAIL
-pattern.match?("abcd")
-
-prints
-
+> pattern = POS(0) & SUCCEED & (FENCE(TAB(n: 1).capture!(:n) { |m, p, n| puts m; p+1 } | ABORT)) & FAIL;
+* pattern.match?("abcd")
 a
 ab
 abc
 abcd
+=> nil
 
 The SUCCEED and FAIL primitives keep forcing the matcher to retry. Eventually the TAB will fail causing the ABORT alternative to execute the matcher.
 
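The control flow of that example can be traced without the library. The sketch below is only a simulation of the retry loop, written under the assumption that each retry re-evaluates `TAB(n)` from cursor 0 with the updated `:n`; the names `MatchAborted` and `run_until_abort` are hypothetical.

```ruby
# Hypothetical simulation of the SUCCEED / FENCE / ABORT / FAIL loop above.
class MatchAborted < StandardError; end

def run_until_abort(subject)
  printed = []
  n = 1 # initial value of the capture variable, as in TAB(n: 1)
  loop do
    # TAB(n) cannot succeed once n is past the end of the subject,
    # so the ABORT alternative is taken and the whole match is abandoned.
    raise MatchAborted if n > subject.length
    printed << subject[0, n] # the capture! block sees the text up to position n
    n += 1                   # the block returns p + 1, which becomes the new :n
    # FAIL rejects this attempt; SUCCEED immediately offers another one.
  end
rescue MatchAborted
  [printed, nil] # an aborted match yields nil
end

printed, result = run_until_abort("abcd")
```

After the call, `printed` holds `["a", "ab", "abc", "abcd"]` and `result` is `nil`, which lines up with the four printed prefixes and the `=> nil` shown in the README output.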
@@ -397,19 +410,24 @@ Cannonbol can be used to easily translate the email BNF spec into an email addre
 
 So for example we can even parse an obscure email with groups and routes
 
-email = 'here is my "big fat \\\n groupen" : someone@catprint.com, Fred Nurph<@sub1.sub2@sub3.sub4:fred.nurph@catprint.com>;'
-
-match.
+> email = 'here is my "big fat \\\n groupen" : someone@catprint.com, Fred Nurph<@sub1.sub2@sub3.sub4:fred.nurph@catprint.com>;'
+=> here is my "big fat \\\n groupen" : someone@catprint.com, Fred Nurph<@sub1.sub2@sub3.sub4:fred.nurph@catprint.com>;
+> match = address.match?(email)
+=> here is my "big fat \\\n groupen" : someone@catprint.com, Fred Nurph<@sub1.sub2@sub3.sub4:fred.nurph@catprint.com>;
+> match.captured[:group_mailboxes].first.captured[:mailbox]
 => someone@catprint.com
-match.captured[:group_name]
+> match.captured[:group_name]
 => here is my "big fat \\\n groupen
 
 
 ## Development
 
-After checking out the repo, run `
+After checking out the repo, run `bundle install` to install dependencies.
+
+### Specs
 
-
+Run `bundle exec rspec` to run the tests on your server environment
+Run `bundle exec rake rackup` and then point your browser to your machine to run the tests in the opal
 
 ## Contributing
 
data/Rakefile
CHANGED
@@ -1,3 +1,3 @@
-require
-
-
+require 'bundler'
+Bundler.require
+Bundler::GemHelper.install_tasks
data/cannonbol.gemspec
CHANGED
@@ -33,4 +33,6 @@ Simple syntax looks great alongside ruby!
   spec.add_development_dependency "bundler", "~> 1.8"
   spec.add_development_dependency "rake", "~> 10.0"
   spec.add_development_dependency "rspec"
+  spec.add_development_dependency "opal-rspec"
+  spec.add_development_dependency "opal"
 end
data/config.ru
ADDED
@@ -0,0 +1,13 @@
+require 'bundler'
+Bundler.require
+
+require 'opal-rspec'
+Opal.append_path File.expand_path('../spec', __FILE__)
+
+run Opal::Server.new { |s|
+  s.main = 'opal/rspec/sprockets_runner'
+  s.append_path 'spec'
+  s.debug = false
+  s.index_path = 'spec/index.html.erb'
+}
+
data/lib/cannonbol.rb
CHANGED
@@ -1,678 +1,9 @@
-
-
-
-
-
-
-
-    attr_reader :match_start
-    attr_reader :match_end
-
-    def initialize(string, match_start, match_end, captured)
-      @cannonbol_string = string
-      @match_start = match_start
-      @match_end = match_end
-      @captured = captured.dup
-      super(@match_end < 0 ? "" : string[@match_start..@match_end])
-    end
-
-    def replace_match_with(s)
-      before_match = ""
-      before_match = @cannonbol_string[0..@match_start-1] if @match_start > 0
-      after_match = @cannonbol_string[@match_end+1..-1] || ""
-      before_match + s + after_match
-    end
-
+require_relative 'cannonbol/cannonbol'
+require_relative 'cannonbol/version'
+unless RUBY_ENGINE == 'opal'
+  begin
+    require 'opal'
+    Opal.append_path File.expand_path('..', __FILE__).untaint
+  rescue LoadError
   end
-
-  class Needle
-
-    attr_reader :cursor
-    attr_reader :string
-    attr_accessor :captures
-    attr_accessor :match_failed
-    attr_accessor :ignore_case
-
-    def initialize(string)
-      @string = string
-    end
-
-    def thread(pattern, opts = {}, &match_block)
-      @captures = {}
-      anchor = opts[:anchor]
-      raise_error = opts[:raise_error]
-      replace_with = opts[:replace_match_with]
-      ignore_case = opts[:ignore_case]
-      @cursor = -1
-      match = nil
-      begin
-        while !match and !match_failed and @cursor < @string.length-1
-          @cursor += 1
-          @starting_character = nil
-          @success_blocks = []
-          @ignore_case = ignore_case
-          match = pattern._match?(self)
-          break if !match and anchor
-        end
-      rescue MatchFailed
-      end
-      if match
-        @success_blocks.each(&:call)
-        match = MatchString.new(@string, @starting_character || @cursor, @cursor-1, @captures)
-      else
-        raise MatchFailed if raise_error
-      end
-      if match_block
-        match = match_block.call(*([match] + (match_block.parameters[1..-1] || []).collect { |param| @captures[param[1].to_sym] }))
-      elsif replace_with
-        match = match.replace_match_with(replace_with)
-      end
-      match
-    end
-
-    def capture(name, value)
-      @captures[name.to_sym] = value if name
-      value
-    end
-
-
-    def remaining_string
-      @string[@cursor..-1]
-    end
-
-    def push(length, &success_block)
-      thread_state = [@starting_character, @cursor, @success_blocks.dup, @ignore_case]
-      @starting_character ||= @cursor
-      @cursor += length
-      @success_blocks << success_block if success_block
-      thread_state
-    end
-
-    def pull(thread_state)
-      @starting_character, @cursor, @success_blocks, @ignore_case = thread_state if thread_state
-      nil
-    end
-
-  end
-
-  module Operators
-
-    def _match?(needle, *args, &block)
-      return if needle.match_failed
-      __match?(needle, *args, &block)
-    end
-
-    def match?(s, opts = {}, &match_block)
-      Needle.new(s).thread(self, opts, &match_block)
-    end
-
-    def |(pattern)
-      Choose.new(self, pattern)
-    end
-
-    def &(pattern)
-      Concat.new(self, pattern)
-    end
-
-    def -@
-      CaseSensitiveOff.new(self)
-    end
-
-    def capture?(opts = {}, &block)
-      OnSuccess.new(self, opts, &block)
-    end
-
-    def capture!(opts = {}, &block)
-      OnMatch.new(self, opts, &block)
-    end
-
-  end
-
-  class Pattern
-
-    include Operators
-
-    def __match?(needle)
-      []
-    end
-
-  end
-
-  class Choose < Pattern
-
-    def __match?(needle, i = 0, s = [])
-      while i < @params.length
-        s = @params[i]._match?(needle, *s)
-        return [i, s] if s
-        s = []
-        i += 1
-      end
-      nil
-    end
-
-    def initialize(p1, p2)
-      @params = [p1, p2]
-    end
-
-  end
-
-  class Concat < Pattern
-
-    def __match?(needle, i = 0, s = [])
-      while i < @params.length and i >= 0
-        s[i] = @params[i]._match?(needle, *(s[i] || []))
-        i = s[i] ? i+1 : i-1
-      end
-      [i-1, s] if i == @params.length
-    end
-
-    def initialize(p1, p2)
-      @params = [p1, p2]
-    end
-
-  end
-
-  class CaseSensitiveOff < Pattern
-
-    def initialize(pattern)
-      @pattern = pattern
-    end
-
-    def __match?(needle, thread=nil, s=[])
-      needle.pull(thread)
-      thread = needle.push(0)
-      needle.ignore_case = true
-      s = @pattern._match?(needle, *s)
-      return [thread, s] if s
-    end
-
-  end
-
-  class OnSuccess < Pattern
-
-    def initialize(pattern, opts, &block)
-      if opts.class == Hash
-        if opts.first
-          @capture_name = opts.first.first
-          @initial_capture_value = opts.first.last
-        end
-      else
-        @capture_name = opts
-      end
-      @pattern = pattern
-      @block = block
-    end
-
-    def __match?(needle, thread_state = nil, starting_cursor = nil, s=[])
-      needle.pull(thread_state)
-      starting_cursor ||= needle.cursor
-      if s = @pattern._match?(needle, *s)
-        ending_cursor = needle.cursor-1
-        push = needle.push(0) do
-          match_string = MatchString.new(needle.string, starting_cursor, ending_cursor, needle.captures)
-          capture_value = @capture_name && (needle.captures.has_key?(@capture_name) ? needle.captures[@capture_name] : @initial_capture_value)
-          if @block
-            match_string = @block.call(match_string, ending_cursor+1, capture_value)
-          elsif capture_value.class == Array
-            match_string = capture_value + [match_string]
-          end
-          needle.capture(@capture_name, match_string)
-        end
-        [ push, starting_cursor, s ]
-      end
-    end
-
-  end
-
-  class OnMatch < OnSuccess
-
-    def __match?(needle, starting_cursor = nil, s=[])
-      starting_cursor ||= needle.cursor
-      if s = @pattern._match?(needle, *s)
-        match_string = MatchString.new(needle.string, starting_cursor, needle.cursor-1, needle.captures)
-        capture_value = @capture_name && (needle.captures.has_key?(@capture_name) ? needle.captures[@capture_name] : @initial_capture_value)
-        match_string = @block.call(match_string, needle.cursor, capture_value) if @block
-        needle.capture(@capture_name, match_string)
-        [starting_cursor, s]
-      end
-    end
-
-  end
-
-  class Match < Pattern
-
-    def initialize(sub_pattern_or_name = nil, &block)
-      if block
-        @block = block
-      elsif sub_pattern_or_name and sub_pattern_or_name.class == Symbol
-        @name = sub_pattern_or_name
-      elsif sub_pattern_or_name and sub_pattern_or_name.respond_to? "_match?"
-        @pattern = sub_pattern_or_name
-      elsif sub_pattern_or_name and sub_pattern_or_name.respond_to? "to_s"
-        @pattern = sub_pattern_or_name.to_s
-      end
-    end
-
-    def __match?(needle, pattern = nil, s = [])
-      pattern ||= if @block
-        @block.call
-      elsif @name
-        needle.captures[@name] || ""
-      else
-        @pattern
-      end
-      existing_captures = needle.captures.dup
-      s = pattern._match?(needle, *s)
-      needle.captures = needle.captures.merge(existing_captures)
-      [pattern, s] if s
-    end
-
-  end
-
-  class Rem < Pattern
-
-    def __match?(needle, thread_state = nil)
-      if thread_state
-        needle_pull(thread_state)
-      else
-        [needle.push(needle.string.length-needle.cursor)]
-      end
-    end
-
-  end
-
-  class Arb < Pattern
-
-    def __match?(needle, match_length = 0, thread_state = nil)
-      needle.pull(thread_state)
-      if needle.remaining_string.length >= match_length
-        thread_state = needle.push(match_length)
-        match_length += 1
-        [match_length, thread_state]
-      end
-    end
-
-  end
-
-  class ParameterizedPattern < Pattern
-
-    def initialize(opts = nil, &block)
-      if opts.class == Hash
-        if opts.first
-          @param_name = opts.first.first
-          @initial_param_value = opts.first.last
-        end
-      else
-        @initial_param_value = opts
-      end
-      @block = block
-    end
-
-    def self.parameter(name, &post_processor)
-      @post_processor = post_processor
-      define_method(name) do |needle|
-        value = (@param_name && needle.captures.has_key?(@param_name)) ? needle.captures[@param_name] : @initial_param_value
-        value = @block.call(value) if @block
-        needle.capture(@param_name, value)
-        value = post_processor.call(value) if @post_processor
-        value
-      end
-    end
-
-  end
-
-  class Len < ParameterizedPattern
-
-    parameter :len
-
-    def __match?(needle, thread_state = nil)
-
-      if thread_state
-        needle.pull(thread_state)
-      else
-        len_temp = len(needle)
-        [needle.push(len_temp)] if needle.remaining_string.length >= len_temp
-      end
-
-    end
-
-  end
-
-  class Pos < ParameterizedPattern
-
-    parameter :pos
-
-    def __match?(needle, matched = nil)
-      return [true] if needle.cursor == pos(needle) and !matched
-    end
-
-  end
-
-  class RPos < ParameterizedPattern
-
-    parameter :pos
-
-    def __match?(needle, matched = nil)
-      return [true] if needle.string.length-needle.cursor == pos(needle) and !matched
-    end
-
-  end
-
-  class Tab < ParameterizedPattern
-
-    parameter :pos
-
-    def __match?(needle, thread_state = nil)
-
-      if thread_state
-        needle.pull(thread_state)
-      else
-        len = pos(needle) - needle.cursor
-        [needle.push(len)] if len >= 0 and needle.remaining_string.length >= len
-      end
-    end
-
-  end
-
-  class RTab < ParameterizedPattern
-
-    parameter :pos
-
-    def __match?(needle, thread_state = nil)
-      if thread_state
-        needle.pull(thread_state)
-      else
-        len = (needle.remaining_string.length - pos(needle))
-        [needle.push(len)] if len >= 0 and needle.remaining_string.length >= len
-      end
-    end
-
-  end
-
-  class Any < ParameterizedPattern
-
-    parameter :chars, &:split
-
-    def __match?(needle, thread_state = nil)
-      if thread_state
-        needle.pull(thread_state)
-      elsif chars(needle).include? needle.remaining_string[0..0]
-        [needle.push(1)]
-      end
-    end
-
-  end
-
-  class NotAny < ParameterizedPattern
-
-    parameter :chars, &:split
-
-    def __match?(needle, thread_state = nil)
-      if thread_state
-        needle.pull(thread_state)
-      elsif !(chars(needle).include? needle.remaining_string[0..0])
-        [needle.push(1)]
-      end
-    end
-
-  end
-
-  class Span < ParameterizedPattern
-
-    parameter :chars, &:split
-
-    def __match?(needle, match_length = nil, thread_state = nil)
-      unless match_length
-        the_chars, match_length = chars(needle), 0
-        while needle.remaining_string.length > match_length and the_chars.include? needle.remaining_string[match_length..match_length]
-          match_length += 1
-        end
-      end
-      needle.pull(thread_state)
-      if match_length > 0
-        thread_state = needle.push(match_length)
-        match_length -= 1
-        [match_length, thread_state]
-      end
-    end
-
-  end
-
-  class Break < ParameterizedPattern
-
-    parameter :chars, &:split
-
-    def __match?(needle, thread_state = nil)
-      if thread_state
-        needle.pull(thread_state)
-      else
-        the_chars, len = chars(needle), 0
-        while needle.remaining_string.length > len and !(the_chars.include? needle.remaining_string[len..len])
-          len += 1
-        end
-        [needle.push(len)]
-      end
-    end
-
-  end
-
-
-  class BreakX < ParameterizedPattern
-
-    parameter :chars, &:split
-
-    def __match?(needle, len = 0, thread_state = nil)
-      needle.pull(thread_state)
-      the_chars = chars(needle)
-      while needle.remaining_string.length > len and !(the_chars.include? needle.remaining_string[len..len])
-        len += 1
-      end
-      [len+1, needle.push(len)] if needle.remaining_string.length >= len
-    end
-
-  end
-
-  class Arbno < Match
-
-    def __match?(needle, pattern = nil, s = [[]])
-      return if s.length == 0
-      if pattern
-        existing_captures = needle.captures.dup
-        s[-1] = pattern._match?(needle, *(s.last))
-        s = s[-1] ? s + [[]] : s[0..-2]
-        needle.captures = needle.captures.merge(existing_captures)
-      else
-        if @block
-          pattern = @block.call
-        elsif @name
-          pattern = needle.captures[@name] || ""
-        else
-          pattern = @pattern
-        end
-      end
-      [pattern, s]
-    end
-
-  end
-
-  class FailPat < Pattern
-
-    def __match?(needle)
-    end
-
-  end
-
-  class Abort < Pattern
-
-    def __match?(needle)
-      raise MatchFailed
-    end
-
-  end
-
-  class Fence < Match
-
-    def __match?(needle, on_backtrack = nil)
-      if on_backtrack == :fail_match
-        needle.match_failed = true
-        return nil
-      elsif on_backtrack == :return_nil
-        return nil
-      elsif @block
-        pattern = @block.call
-      elsif @name
-        pattern = needle.captures[@name] || ""
-      elsif @pattern
-        pattern = @pattern
-      else
-        return [:fail_match]
-      end
-      return [:return_nil] if pattern._match?(needle)
-    end
-
-  end
-
-  class Succeed < Pattern
-    def _match?(needle, thread_state = nil)
-      needle.pull(thread_state)
-      [needle.push(0)]
-    end
-  end
-
-end
-
-class String
-
-  include Cannonbol::Operators
-
-  def __match?(needle, thread_state = nil)
-    if thread_state
-      needle.pull(thread_state)
-    elsif self.length == 0 or
-      (!needle.ignore_case and needle.remaining_string[0..self.length-1] == self) or
-      (needle.ignore_case and needle.remaining_string[0..self.length-1].upcase == self.upcase)
-      [needle.push(self.length)]
-    end
-  end
-
-end
-
-class Regexp
-
-  include Cannonbol::Operators
-
-  def __match?(needle, thread_state = nil)
-    if defined? Opal
-      options = ""
-      options += "m" if `#{self}.multiline`
-      options += "g" if `#{self}.global`
-      options += "i" if needle.ignore_case or `#{self}.ignoreCase`
-    else
-      options = self.options | (needle.ignore_case ? Regexp::IGNORECASE : 0)
-    end
-    @cannonbol_regex ||= Regexp.new("^#{self.source}", options )
-    if thread_state
-      needle.pull(thread_state)
-    elsif m = @cannonbol_regex.match(needle.remaining_string)
-      [needle.push(m[0].length)]
-    end
-  end
-
-end
-
-if defined? Opal
-
-  class Proc
-
-    def parameters
-      /.*function[^(]*\(([^)]*)\)/.match(`#{self}.toString()`)[1].split(",").collect { |param| [:req, param.strip.to_sym]}
-    end
-
-  end
-
-end
-
-module Enumerable
-
-  def match_any
-    if self.first
-      self[1..-1].inject(self.first) { |memo, item| memo | item }
-    else
-      FAIL
-    end
-  end
-
-  def match_all
-    self.inject("") { |memo, item| memo & item }
-  end
-
-end
-
-
-class Object
-
-  REM = Cannonbol::Rem.new
-
-  ARB = Cannonbol::Arb.new
-
-  FAIL = Cannonbol::FailPat.new
-
-  ABORT = Cannonbol::Abort.new
-
-  FENCE = Cannonbol::Fence.new
-
-  SUCCEED = Cannonbol::Succeed.new
-
-  def LEN(p={}, &block)
-    Cannonbol::Len.new(p, &block)
-  end
-
-  def POS(p=nil, &block)
-    Cannonbol::Pos.new(p, &block)
-  end
-
-  def RPOS(p=nil, &block)
-    Cannonbol::RPos.new(p, &block)
-  end
-
-  def TAB(p=nil, &block)
-    Cannonbol::Tab.new(p, &block)
-  end
-
-  def RTAB(p=nil, &block)
-    Cannonbol::RTab.new(p, &block)
-  end
-
-  def ANY(p=nil, &block)
-    Cannonbol::Any.new(p, &block)
-  end
-
-  def NOTANY(p=nil, &block)
-    Cannonbol::NotAny.new(p, &block)
-  end
-
-  def SPAN(p=nil, &block)
-    Cannonbol::Span.new(p, &block)
-  end
-
-  def BREAK(p=nil, &block)
-    Cannonbol::Break.new(p, &block)
-  end
-
-  def BREAKX(p=nil, &block)
-    Cannonbol::BreakX.new(p, &block)
-  end
-
-  def MATCH(p=nil, &block)
-    Cannonbol::Match.new(p, &block)
-  end
-
-  def ARBNO(p=nil, &block)
-    Cannonbol::Arbno.new(p, &block)
-  end
-
-  def FENCE(p=nil, &block)
-    Cannonbol::Fence.new(p, &block)
-  end
-
 end
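The new `lib/cannonbol.rb` wraps `require 'opal'` in `begin`/`rescue LoadError`, so the gem still loads when Opal is not installed. A minimal sketch of that optional-dependency idiom follows; the helper name `try_require` and the second library name are hypothetical and chosen only for the demonstration.

```ruby
# Hypothetical helper illustrating the optional-dependency idiom.
# Returns true when the library could be loaded, false when it is missing.
def try_require(name)
  require name
  true
rescue LoadError
  false
end

# "set" ships with Ruby, so this load succeeds.
have_set = try_require("set")

# A library that is not installed raises LoadError, which is swallowed.
have_missing = try_require("no_such_library_for_this_sketch")
```

Rescuing only `LoadError` keeps genuine errors raised while loading an installed library visible. Separately, the `.untaint` call in the new loader relies on Ruby's taint mechanism, which was deprecated in Ruby 2.7 and removed in 3.2, so that line would need adjusting on current Rubies.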