reg 0.4.6
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- data/COPYING +510 -0
- data/README +404 -0
- data/assert.rb +31 -0
- data/calc.reg +73 -0
- data/forward_to.rb +49 -0
- data/item_thattest.rb +47 -0
- data/numberset.rb +200 -0
- data/parser.txt +188 -0
- data/philosophy.txt +72 -0
- data/reg.gemspec +27 -0
- data/reg.rb +33 -0
- data/regarray.rb +675 -0
- data/regarrayold.rb +477 -0
- data/regbackref.rb +126 -0
- data/regbind.rb +74 -0
- data/regcase.rb +78 -0
- data/regcore.rb +379 -0
- data/regdeferred.rb +134 -0
- data/reggrid.csv +2 -1
- data/regguide.txt +416 -0
- data/reghash.rb +318 -0
- data/regitem_that.rb +146 -0
- data/regknows.rb +63 -0
- data/reglogic.rb +195 -0
- data/reglookab.rb +94 -0
- data/regold.rb +75 -0
- data/regpath.rb +74 -0
- data/regposition.rb +68 -0
- data/regprogress.rb +1067 -0
- data/regreplace.rb +114 -0
- data/regsugar.rb +230 -0
- data/regtest.rb +1075 -0
- data/regvar.rb +76 -0
- data/trace.rb +45 -0
- metadata +83 -0
data/philosophy.txt
ADDED
@@ -0,0 +1,72 @@
+
+
+That's a long story, and well worth telling.
+
+A long time ago, I wanted a better regexp than regexp. My search ended
+when I found an extremely obscure language called gema (the
+general-purpose matcher). I'm guessing that I'm the only person to ever
+take gema seriously. For a time, I became the world's foremost expert on
+gema. Gema is designed around the idea that all computation can be
+modeled as pattern and replacement. Everything in gema is pattern and
+replacement... essentially everything is done with regexps. I was
+fascinated with the idea. This seemed to me to be a much better model
+for most programming problems, which typically involve reading input,
+transforming it in some way, and writing it out again. Conventional
+languages (starting with fortran, and including ruby) are based around
+the idea of a program being a long string of formulas. This is great
+for math-heavy stuff, but most programming is really about data
+manipulation, not math.
+
+But there was trouble in paradise. Gema was wonderful, but weird. The
+syntax was cranky. The author had issued one version long ago then
+disappeared. Gema code was hard to read, in part because
+everythingwasalljammedtogether.
+Ifyouinsertspacestomakeitmorereadable,itchangesthesemanticsofyourprogram.
+There were strange problems that I never tracked down or fully
+characterized. The only data-type was the string. You had to be an
+expert at avoiding the invisible pitfalls of the language to get
+anywhere. But I did get surprisingly far. I managed to coax gema into
+becoming a true parser, and parsing a toy language.
+I wanted to write a compiler in gema. Yes, the whole compiler. And
+parsing the toy language was already straining its capabilities. It
+wasn't the data model; I actually figured out how to model all other
+data types using strings. A match-and-replace language is actually much
+better suited to most compiler tasks than an algol-like formula
+language.
+
+Eventually, I abandoned gema, determined to recreate its glory in a
+cleaner form. It was at about this time that I discovered ruby. The
+successor to gema was ruma, the ruby matcher. Ruma would be basically
+just like gema, but without the problems. Whitespace allowed between
+tokens. Proper quotation mechanisms, including nested quotes. And the
+language used in the actions (replacements) would be full ruby, instead
+of gema's inadequate and crude action language.
+
+Ruma got maybe halfway done... quite a ways, really. As part of ruma, I
+needed a ruby lexer to make sense of the actions. This turned out to be
+quite a lot harder than I had anticipated; I'm still working on that
+lexer.
+
+After grinding away at the lexer for a while, dreaming of ruma in the
+meantime, I had a brainstorm. Ruma, like gema, was to be a string-based
+language. It only operated on strings. In gema, that was just fine
+because everything was strings and you just had to live with that. But
+ruby has all these other types, a real type system. Wouldn't it be nice
+to have those sophisticated search capabilities for other types too?
+Well, since I proved to myself that all data types can be converted to
+strings, why not convert the ruby data into strings and then match that
+in ruma? Of course, it would be so much nicer to just do the matching
+on the data in its original form....
+
+The breakthrough came when I realized how malleable ruby really is. I
+had become accustomed to c, which I still love, but in so many ways
+it's so much more limited. I didn't really have to write my own parser
+and lexer; ruby could do it all for me. I just had to override a bunch
+of operators.
+
+After that, it was simple. All I do is override the right operators,
+and ruby does the parsing and hands me the match expressions in
+already-parsed form. Reg is amazingly small in the end. Most of the
+effort and code went into the array matcher, but at least as much
+functionality is to be had from the hash and object matchers, which
+were trivial.
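The closing paragraphs of philosophy.txt describe the core trick: override a few operators and let Ruby's own parser hand over the pattern in already-parsed form. The following is a minimal sketch of that idea only, not reg's actual code. The names `Pat`, `Item`, `Alt`, and `Rep` are hypothetical and were chosen for this illustration.

```ruby
# Hypothetical mini pattern library; none of these names come from reg.
module Pat
  # a | b builds an alternation node instead of computing a value
  def |(other) Alt.new(self, other) end
  # a * n builds a fixed-count repetition node
  def *(count) Rep.new(self, count) end
end

# Wraps any object that responds to === (Class, Regexp, Range, ...)
class Item
  include Pat
  attr_reader :matcher
  def initialize(matcher) @matcher = matcher end
  def ===(obj) @matcher === obj end
end

# Matches when either side matches
class Alt
  include Pat
  attr_reader :left, :right
  def initialize(left, right) @left, @right = left, right end
  def ===(obj) @left === obj || @right === obj end
end

# Matches an Array of exactly `count` items that each match `pattern`
class Rep
  include Pat
  attr_reader :pattern, :count
  def initialize(pattern, count) @pattern, @count = pattern, count end
  def ===(obj)
    ::Array === obj && obj.size == @count && obj.all? { |x| @pattern === x }
  end
end

# Ruby parses these expressions; the operators only assemble a tree.
number_or_name = Item.new(Integer) | Item.new(/\A[a-z]+\z/)
triple = number_or_name * 3
```

Because every node answers `===`, such a pattern can be used directly in a `case` statement against ordinary Ruby data, with no conversion to strings first.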
data/reg.gemspec
ADDED
@@ -0,0 +1,27 @@
+require 'rubygems'
+
+spec = Gem::Specification.new do |s|
+  s.name = 'reg'
+  s.rubyforge_project = 'reg'
+  s.version = '0.4.6'
+  s.summary = 'The reg pattern matching/replacement language'
+  s.files = %w[item_thattest.rb regbackref.rb regknows.rb regtest.rb
+               numberset.rb regbind.rb reglogic.rb regvar.rb
+               COPYING parser.txt regcase.rb reglookab.rb
+               README regcore.rb regold.rb reg.gemspec reggrid.csv
+               assert.rb philosophy.txt regdeferred.rb regpath.rb
+               regposition.rb trace.rb calc.reg reg.rb
+               regguide.txt regprogress.rb reghash.rb regreplace.rb
+               forward_to.rb regarray.rb regarrayold.rb regitem_that.rb regsugar.rb]
+  s.require_path = '.'
+  s.has_rdoc = false
+  s.requirements=["none"]
+  s.add_dependency("cursor", [">= 0.9"])
+  s.author = 'Caleb Clausen'
+end
+
+
+if $0==__FILE__
+  Gem::manage_gems
+  Gem::Builder.new(spec).build
+end
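The build stanza in reg.gemspec relies on `Gem::manage_gems` and `Gem::Builder`, which appear to have been dropped from later RubyGems releases. Below is a hedged sketch of how a similar spec might look under a recent RubyGems. The shortened file list is an assumption made for brevity, and the build call is left commented out because it needs the listed files on disk.

```ruby
require 'rubygems'
require 'rubygems/package'

spec = Gem::Specification.new do |s|
  s.name          = 'reg'
  s.version       = '0.4.6'
  s.summary       = 'The reg pattern matching/replacement language'
  s.authors       = ['Caleb Clausen']
  # Abbreviated for illustration; the real gem lists every file it ships.
  s.files         = %w[reg.rb regcore.rb regarray.rb]
  s.require_paths = ['.']
  s.add_dependency 'cursor', '>= 0.9'
end

# With the files present, this would write reg-0.4.6.gem:
# Gem::Package.build(spec)
```

The sketch omits `rubyforge_project` and `has_rdoc`; as far as can be told, current RubyGems no longer uses either setting.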
data/reg.rb
ADDED
@@ -0,0 +1,33 @@
+=begin copyright
+reg - the ruby extended grammar
+Copyright (C) 2005 Caleb Clausen
+
+This library is free software; you can redistribute it and/or
+modify it under the terms of the GNU Lesser General Public
+License as published by the Free Software Foundation; either
+version 2.1 of the License, or (at your option) any later version.
+
+This library is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+Lesser General Public License for more details.
+
+You should have received a copy of the GNU Lesser General Public
+License along with this library; if not, write to the Free Software
+Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+=end
+require 'regcore'
+require 'reglogic'
+require 'reghash'
+require 'regarray'
+#require 'regarrayold' #old bt engine
+require 'regprogress' #new bt engine
+#enable one engine or the other, but not both
+
+require 'regbackref'
+require 'regitem_that'
+require 'regknows'
+require 'regsugar'
+require 'regold' #will go away
+Kernel.instance_eval( &Reg::TLA_pirate)
+Reg::Sugar.include!
data/regarray.rb
ADDED
@@ -0,0 +1,675 @@
+=begin copyright
+reg - the ruby extended grammar
+Copyright (C) 2005 Caleb Clausen
+
+This library is free software; you can redistribute it and/or
+modify it under the terms of the GNU Lesser General Public
+License as published by the Free Software Foundation; either
+version 2.1 of the License, or (at your option) any later version.
+
+This library is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+Lesser General Public License for more details.
+
+You should have received a copy of the GNU Lesser General Public
+License along with this library; if not, write to the Free Software
+Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+=end
+
+require "assert"
+require "pp"
+
+module Reg
+  module Reg
+    def itemrange; 1..1 end #default match 1 item
+
+
+    #create a (vector) Reg that will match this pattern repeatedly.
+    #(creates a Reg::Repeat.)
+    #the argument determines the number of times to match.
+    #times may be a positive integer, zero, INFINITY, or a
+    #range over any of the above. if a range, the lower
+    #end may not be INFINITY! Reg#- and Reg#+ are shortcuts
+    #for the most common cases of multiplying by a range.
+    #(at least 0 and at most INFINITY.) watch out when
+    #multiplying with zero and INFINITY (including in a
+    #range), as you can easily create a situation where
+    #the number of matches to enumerate explodes exponentially,
+    #or even is infinite. i won't say too much here except
+    #that these are generally the same sorts of problems you
+    #can run into with Regexps as well.
+    def *(times=0..INFINITY)
+      Repeat.new(self,times)
+    end
+
+    #repeat this pattern up to atmost times. could match
+    #0 times as the minimum number of matches here is zero.
+    def -(atmost=1)
+      self*(0..atmost)
+    end
+
+    #repeat this pattern atleast times or more
+    def +(atleast=1)
+      self*(atleast..INFINITY)
+    end
+
+  end
+
+  #--------------------------
+  module Multiple
+    def ===(other)
+      method_missing(:===, other)
+    end
+
+    def maybe_multiple(needsmult) #better name needed
+      assert( needsmult.respond_to?( :mmatch))
+      class <<needsmult
+        undef_method :mmatch
+        include Multiple
+        #alias mmatch mmatch_multiple #this doesn't work... why?
+        def mmatch(a,s) mmatch_multiple(a,s); end #have to do this instead
+      end
+      assert( needsmult.respond_to?( :mmatch))
+    end
+
+    def maybe_multiples(*args) end
+    #we're already multiple; no need to try to become multiple again
+
+    def mmatch(arr,idx) #multiple match
+      abstract
+    end
+
+    #negated Reg::Multiple's are automatically lookaheads (not implemented yet)
+    def ~
+      ~(Lookahead.new self)
+    end
+
+
+    def starts_with
+      abstract
+    end
+
+    def ends_with
+      abstract
+    end
+
+    def matches_class
+      raise 'multiple regs match no single class'
+    end
+  end
+
+  #--------------------------
+  module Backtrace
+    # protected
+
+    def regs(ri) @regs[ri] end
+
+    def update_di(di,len); di+len; end
+    #--------------------------
+    $RegTraceEnable=$RegTraceDisable=nil
+    def trace_enabled?
+      @trace||=nil
+      $RegTraceEnable or (!$RegTraceDisable && @trace)
+    end
+
+    #--------------------------
+    def trace!
+      @trace=true
+      self
+    end
+
+    #--------------------------
+    def notrace!
+      @trace=false
+      self
+    end
+  end
+
+  #--------------------------
+  if false
+    class RR < ::Array
+      def inspect
+        [self,super].to_s
+      end
+
+      def rrflatten
+        result=[]
+        each{|i|
+          case i
+          when RR then result +=i.rrflatten
+          when Literal then result << i.unlit
+          else result << i
+          end
+        }
+      end
+
+      def +(other)
+        RR[*super]
+      end
+    end
+    Result=RR
+  else
+    RR=::Array
+  end
+
+  #--------------------------
+  class MatchSet
+
+    def next_match(ary,start)
+      abstract
+    end
+
+    def deep_copy
+      abstract
+    end
+
+    def ob_state
+      instance_variables.sort.map{|i| instance_variable_get i }
+    end
+
+    def ==(other)
+      self.class==other.class and ob_state==other.ob_state
+    end
+
+  end
+
+  #--------------------------
+  class SingleRepeatMatchSet < MatchSet
+    def initialize(startcnt,stepper,endcnt)
+      endcnt==startcnt and raise 'why even make it a set, then?'
+      (endcnt-startcnt)*stepper>0 or raise "tried to make null match set"
+      assert startcnt>=0
+      assert endcnt>=0
+      @matchtimes,@stepper,@endcnt=startcnt,stepper,endcnt
+    end
+
+    def next_match(arr,idx)
+      assert @stepper == -1 #'only greedy matching implemented for now'
+      @endcnt<=@matchtimes or return nil
+      assert @matchtimes >=0
+      result=[RR[arr[idx...idx+@matchtimes]], @matchtimes]
+      assert ::Array===result.first.first
+      @matchtimes+=@stepper
+
+      assert @matchtimes >=-1
+
+      assert ::Array===result.first.first
+      return result
+    end
+
+    def deep_copy
+      dup
+    end
+  end
+
+
+  #--------------------------
+  class Repeat
+    include Reg,Backtrace,Multiple
+
+    attr :times
+
+    def max_matches; @times.end end
+
+    def regs(ri) @reg end
+
+    def initialize(reg,times)
+      Integer===times and times=times..times
+      times.exclude_end? and times=times.begin..times.end-1
+      assert times.begin <= times.end
+      assert times.begin < INFINITY
+      assert times.begin >= 0
+      assert times.end >= 0
+      if Multiple===reg
+        class<<self
+          #alias mmatch mmatch_multiple #this doesn't work... why?
+          def mmatch(a,s) mmatch_multiple(a,s); end #have to do this instead
+        end
+      else
+        assert reg.itemrange==(1..1)
+        @itemrange=times
+      end
+      @reg,@times=reg,times
+    end
+
+    def itemrange
+      defined? @itemrange and return @itemrange
+
+      i=@reg.itemrange
+      rf,rl=i.first,i.last
+      tf,tl=times.first,times.last
+      @itemrange = rf*tf ..
+        if tl==0 or rl==0
+          0
+        elsif tl==INFINITY
+          #ought to emit warnings if trouble here...
+          #rl==INFINITY and maybe trouble
+          #rf==0 and trouble
+          INFINITY
+        elsif rl==INFINITY
+          #...and here
+          #maybe trouble #... combinatorial explosion
+          INFINITY
+        else
+          rl*tl
+        end
+    end
+
+
+    def enough_matches? matchcnt
+      @times===matchcnt
+    end
+
+    def inspect
+      if @times.end==INFINITY
+        "(#{@reg.inspect})+#{@times.begin}"
+      elsif @times.begin==0
+        "(#{@reg.inspect})-#{@times.end}"
+      elsif @times.begin==@times.end
+        "(#{@reg.inspect})*#{@times.begin}"
+      else
+        "(#{@reg.inspect})*(#{@times.begin}..#{@times.end})"
+      end
+    end
+
+    def subregs; @reg end
+
+    private
+
+  end
+
+
+
+
+  #--------------------------
+  class OrMatchSet < MatchSet
+    def initialize(orreg,idx,set,firstmatch)
+      @orreg,@idx,@set,@firstmatch=orreg,idx,set,firstmatch
+      assert @firstmatch.nil? || ::Array===@firstmatch.first.first
+    end
+
+    def ob_state
+      instance_variables.map{|i| instance_variable_get i }
+    end
+
+    def ==(other)
+      OrMatchSet===other and ob_state==other.ob_state
+    end
+
+    def next_match(ary,idx)
+      if @firstmatch
+        result,@firstmatch=@firstmatch,nil
+        assert ::Array===result
+        assert ::Array===result.first.first
+        assert 2==result.size
+        assert Integer===result.last
+        return result
+      end
+      @set and result= @set.next_match(ary,idx)
+      while result.nil?
+        @idx+=1
+        @idx >= @orreg.regs.size and return nil
+        x=@orreg.regs[@idx].mmatch(ary,idx)
+        @set,result=*if MatchSet===x then [x,x.next_match] else [nil,x] end
+      end
+      a=RR[nil]*@orreg.regs.size
+      a[idx]=result[0]
+      result[0]=a
+      assert ::Array===result.first.first
+      return result
+    end
+
+    def deep_copy
+      result=OrMatchSet.new(@orreg,@idx,@set && @set.deep_copy,@firstmatch)
+      assert self==result
+      return result
+    end
+  end
+
+  #--------------------------
+  class Or
+    def mmatch(arr,start)
+      assert start <= arr.size
+      @regs.each_with_index {|reg,i|
+        reg===arr[start] and
+          return OrMatchSet.new(self,i,nil,[arr[start]])
+      } unless start == arr.size
+      return nil
+    end
+
+    def itemrange
+      if true
+        min,max=INFINITY,0
+        @regs.each {|r|
+          min=r.itemrange.first if min>r.itemrange.first
+          max=r.itemrange.last if max<r.itemrange.last
+        }
+        return min..max
+      else
+        limits=@regs.map{|r|
+
+          i=(r.respond_to? :itemrange)? r.itemrange : 1..1
+          [i.first,i.last]
+        }.transpose
+        limits.first.sort.first .. limits.last.sort.last
+      end
+    end
+
+    private
+    def mmatch_multiple(arr,start)
+      mat=i=nil
+      @regs.each_with_index{|r,i|
+        if r.respond_to? :mmatch
+          mat=r.mmatch(arr,start) or next
+          if mat.respond_to? :next_match
+            return OrMatchSet.new(self,i,mat,mat.next_match(arr,start))
+          else
+            return OrMatchSet.new(self,i,nil,mat)
+          end
+        else
+          r===arr[start] and
+            return OrMatchSet.new(self,i,nil,[[[arr[start]]],1])
+        end
+      }
+
+      assert mat.nil?
+      return nil
+    end
+  end
+
+  #--------------------------
+  class Xor
+    def clean_result
+      huh
+    end
+
+    def itemrange
+      #min,max=INFINITY,0
+      #@regs.each {|r|
+      # min=[min,r.itemrange.first].sort.first
+      # max=[r.itemrange.last,max].sort.last
+      #}
+      #return min..max
+      limits=@regs.map{|r| i=r.itemrange; [i.first,i.last]}.transpose
+      limits.first.sort.first .. limits.last.sort.last
+    end
+
+    private
+=begin
+    def mmatch_multiple(arr,start)
+      mat=i=nil
+      count=0
+      @regs.each_with_index{|reg,idx|
+        if reg.respond_to? :mmatch
+          mat=reg.mmatch(arr,start) or next
+        else
+          reg===arr[start] or next
+          mat=[[arr[start]],1]
+        end
+        count==0 or return nil
+        count=1
+        assert mat
+      }
+
+      return nil unless mat
+      assert count==1
+      mat.respond_to? :next_match and return XorMatchSet.new(reg,idx,mat,huh)
+
+      a=RR[nil]*regs.size
+      a[idx]=mat[0]
+      mat[0]=a
+      assert huh
+      assert ::Array===mat.first.first
+      return mat
+    end
+=end
+
+    def mmatch_multiple arr, start
+      found=nil
+      @regs.each{|reg|
+        if m=reg.mmatch(arr, start)
+          return if found
+          found=m
+        end
+      }
+      return found
+    end
+
+  end
+
+  #--------------------------
+  class And
+    include Backtrace #shouldn't this be included only when needed?
+
+    def update_di(di,len) di; end
+
+
+    def clean_result
+      huh
+    end
+
+
+    def enough_matches? matchcnt
+      matchcnt==@regs.size
+    end
+
+    def itemrange
+      limits=@regs.map{|r| i=r.itemrange; [i.first,i.last]}.transpose
+      limits.first.sort.last .. limits.last.sort.last
+    end
+
+    private
+    def mmatch_multiple(arr,start)
+      #in this version, at least one of @regs is a multiple reg
+      assert( (0..arr.size).include?( start))
+      result,*bogus=huh.bt_match(arr,start,0,0,[RR[]])
+      result and AndMatchSet.new(self,result)
+    end
+  end
+
+  #--------------------------
+  class Array
+    include Reg,Backtrace
+
+    def max_matches; @regs.size end
+
+    def initialize(*regs)
+      @regs=regs
+    end
+
+    class <<self
+      alias new__nobooleans new
+      def new(*args)
+        # args.detect{|o| /^(AND|X?OR)$/.sym===o } or return new__nobooleans(*args)
+        # +[/^(AND|X?OR)$/.sym.splitter].match(args)
+        Pair===args.first and return OrderedHash.new(*args)
+        new__nobooleans(*args)
+      end
+      alias [] new
+    end
+
+    def matches_class; ::Array end
+
+    def -@ #subsequence inclusion
+      Subseq.new(*@regs)
+    end
+
+    def +@ #cvt to Reg::Array; that's what we are already....
+      self
+    end
+
+    def maybe_multiples(*args) end #never do anything for Reg::Array
+
+    def enough_matches? matchcnt
+      matchcnt==@regs.size
+    end
+
+    def +(reg)
+      #not right... + should not modify self
+      if self.class==reg.class
+        @regs.concat reg.regs
+      else
+        super
+      end
+    end
+
+    def inspect
+      "+["+ @regs.collect{|r| r.inspect}.join(', ') +"]"
+    end
+
+    def subregs; @regs end
+  end
+
+
+
+
+
+
+
+  #--------------------------
+  class Subseq < ::Reg::Array
+    include Multiple
+
+    def max_matches; @regs.size end
+
+    def initialize(*regs)
+      regs.each{|reg| Multiple===reg and class<<self
+        undef mmatch
+        def mmatch(a,s) mmatch_multiple(a,s) end
+      end}
+
+      @regs=regs
+    end
+
+
+    def inspect
+      super.sub( /^\+/,'-')
+    end
+
+    def itemrange
+      #add the ranges of the individual items
+      @itemrange ||= #some caching...
+        @regs.inject(0){|sum,ob| sum+ob.begin } ..
+        @regs.inject(0){|sum,ob| sum+ob.end }
+    end
+
+    def -@ #subsequence inclusion... that's what we are, do nothing
+      self
+    end
+
+    def +@ #cvt to Reg::Array
+      Array.new(*@regs)
+    end
+
+    private
+
+    #tla of +[], regproc{}
+    assign_TLA true, :Reg=>:Array
+    assign_TLA :Res=>:Subseq
+    #no need to alias the constant name 'Reg', too.
+    #ruby does it for us.
+  end
+
+
+
+  #--------------------------
+  class None; end
+  class <<None
+    include Reg
+    def new; self end
+
+    def *(times)
+      times===0 ? Many[0] : self
+    end
+
+    def ~; Any; end
+
+    def &(other); self; end
+
+    def |(other) other end
+    def ^(other) other end
+
+    def ===(other); false; end
+    def matches_class; self; end
+  end
+
+  if defined? $RegAnyEnable #disabled for now -- these optimizations are broken
+
+    #--------------------------
+    class Any; end
+    class <<Any #maybe all this can be in Object's meta-class....
+      include Reg
+
+      #any is a singleton
+      def new; self end
+
+      def *(times)
+        Many.new(times)
+      end
+
+      def ~; None; end
+
+      def &(other); other; end
+
+      def |(other); self; end
+      def ^(other); ~other end
+
+      def ===(other); true;end
+      def matches_class; ::Object end
+    end
+
+    #--------------------------
+    class Many
+      include Reg
+      include Multiple
+
+      class <<self
+        @@RAMs={}
+        alias uncached__new new
+        def new times=0..INFINITY
+          @@RAMs[times] ||= uncached__new times
+        end
+        alias [] new
+      end
+
+      def initialize(times=0..INFINITY)
+        Integer===times and times=times..times
+        @times=times
+      end
+
+      def mmatch(arr,start)
+        left=arr.size-start
+        beg=@times.begin
+        beg<=left and
+          SingleRepeatMatchSet.new([left,@times.end].max, -1, beg)
+      end
+
+      def subregs; Any end
+
+      def inspect; "Any*(#{@times})"; end
+    end
+
+    #--------------------------
+    class ::Object
+      def reg
+        Any
+      end
+    end
+    OB=Any
+    OBS=Many[]
+
+  else #traditional and uncomplicated version of OB and OBS
+    OB=::Object.reg
+    OBS=OB+0 #std abbreviation for 0 or more of anything
+    def OBS.inspect
+      "OBS"
+    end
+    def OB.inspect
+      "OB"
+    end
+  end
+
+
+
+end
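The comments on `Reg#*`, `Reg#-` and `Reg#+`, together with `Repeat#initialize` and `SingleRepeatMatchSet`, describe how a repetition count is normalized to an inclusive range and then tried greedily, largest count first. The standalone sketch below illustrates that reading under stated assumptions. The helper names `normalize_times`, `at_most`, `at_least`, and `greedy_counts` are hypothetical, and `INFINITY` is assumed here to be `Float::INFINITY` (reg defines its own constant elsewhere). Unlike `Many#mmatch` above, this sketch caps the count at the number of items left.

```ruby
INFINITY = Float::INFINITY # assumption; reg defines INFINITY in another file

# Mirrors the normalization steps in Repeat#initialize:
# an Integer n becomes n..n, and an exclusive range a...b becomes a..(b-1).
def normalize_times(times)
  times = times..times if Integer === times
  times = times.begin..(times.end - 1) if times.exclude_end?
  unless times.begin >= 0 && times.begin < INFINITY
    raise ArgumentError, 'lower bound must be finite and non-negative'
  end
  raise ArgumentError, 'empty range' unless times.begin <= times.end
  times
end

# Counterparts of Reg#- (at most n times) and Reg#+ (at least n times).
def at_most(n = 1)  normalize_times(0..n) end
def at_least(n = 1) normalize_times(n..INFINITY) end

# Greedy enumeration: try the largest feasible count first, then step down
# by one, never asking for more items than remain in the array.
def greedy_counts(times, items_left)
  upper = [times.end, items_left].min
  return [] if upper < times.begin
  upper.downto(times.begin).to_a
end
```

For example, with three items left, `greedy_counts(at_least(1), 3)` yields the counts 3, 2, 1 in that order, which corresponds to the stepper of -1 asserted in `SingleRepeatMatchSet#next_match`.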