RegExpr 0.0.1
- data/LICENSE +165 -0
- data/VERSION +1 -0
- data/lib/regexpr-uri.rb +62 -0
- data/lib/regexpr.rb +462 -0
- metadata +65 -0
data/LICENSE
ADDED
@@ -0,0 +1,165 @@
+GNU LESSER GENERAL PUBLIC LICENSE
+Version 3, 29 June 2007
+
+Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
+Everyone is permitted to copy and distribute verbatim copies
+of this license document, but changing it is not allowed.
+
+
+This version of the GNU Lesser General Public License incorporates
+the terms and conditions of version 3 of the GNU General Public
+License, supplemented by the additional permissions listed below.
+
+0. Additional Definitions.
+
+As used herein, "this License" refers to version 3 of the GNU Lesser
+General Public License, and the "GNU GPL" refers to version 3 of the GNU
+General Public License.
+
+"The Library" refers to a covered work governed by this License,
+other than an Application or a Combined Work as defined below.
+
+An "Application" is any work that makes use of an interface provided
+by the Library, but which is not otherwise based on the Library.
+Defining a subclass of a class defined by the Library is deemed a mode
+of using an interface provided by the Library.
+
+A "Combined Work" is a work produced by combining or linking an
+Application with the Library. The particular version of the Library
+with which the Combined Work was made is also called the "Linked
+Version".
+
+The "Minimal Corresponding Source" for a Combined Work means the
+Corresponding Source for the Combined Work, excluding any source code
+for portions of the Combined Work that, considered in isolation, are
+based on the Application, and not on the Linked Version.
+
+The "Corresponding Application Code" for a Combined Work means the
+object code and/or source code for the Application, including any data
+and utility programs needed for reproducing the Combined Work from the
+Application, but excluding the System Libraries of the Combined Work.
+
+1. Exception to Section 3 of the GNU GPL.
+
+You may convey a covered work under sections 3 and 4 of this License
+without being bound by section 3 of the GNU GPL.
+
+2. Conveying Modified Versions.
+
+If you modify a copy of the Library, and, in your modifications, a
+facility refers to a function or data to be supplied by an Application
+that uses the facility (other than as an argument passed when the
+facility is invoked), then you may convey a copy of the modified
+version:
+
+a) under this License, provided that you make a good faith effort to
+ensure that, in the event an Application does not supply the
+function or data, the facility still operates, and performs
+whatever part of its purpose remains meaningful, or
+
+b) under the GNU GPL, with none of the additional permissions of
+this License applicable to that copy.
+
+3. Object Code Incorporating Material from Library Header Files.
+
+The object code form of an Application may incorporate material from
+a header file that is part of the Library. You may convey such object
+code under terms of your choice, provided that, if the incorporated
+material is not limited to numerical parameters, data structure
+layouts and accessors, or small macros, inline functions and templates
+(ten or fewer lines in length), you do both of the following:
+
+a) Give prominent notice with each copy of the object code that the
+Library is used in it and that the Library and its use are
+covered by this License.
+
+b) Accompany the object code with a copy of the GNU GPL and this license
+document.
+
+4. Combined Works.
+
+You may convey a Combined Work under terms of your choice that,
+taken together, effectively do not restrict modification of the
+portions of the Library contained in the Combined Work and reverse
+engineering for debugging such modifications, if you also do each of
+the following:
+
+a) Give prominent notice with each copy of the Combined Work that
+the Library is used in it and that the Library and its use are
+covered by this License.
+
+b) Accompany the Combined Work with a copy of the GNU GPL and this license
+document.
+
+c) For a Combined Work that displays copyright notices during
+execution, include the copyright notice for the Library among
+these notices, as well as a reference directing the user to the
+copies of the GNU GPL and this license document.
+
+d) Do one of the following:
+
+0) Convey the Minimal Corresponding Source under the terms of this
+License, and the Corresponding Application Code in a form
+suitable for, and under terms that permit, the user to
+recombine or relink the Application with a modified version of
+the Linked Version to produce a modified Combined Work, in the
+manner specified by section 6 of the GNU GPL for conveying
+Corresponding Source.
+
+1) Use a suitable shared library mechanism for linking with the
+Library. A suitable mechanism is one that (a) uses at run time
+a copy of the Library already present on the user's computer
+system, and (b) will operate properly with a modified version
+of the Library that is interface-compatible with the Linked
+Version.
+
+e) Provide Installation Information, but only if you would otherwise
+be required to provide such information under section 6 of the
+GNU GPL, and only to the extent that such information is
+necessary to install and execute a modified version of the
+Combined Work produced by recombining or relinking the
+Application with a modified version of the Linked Version. (If
+you use option 4d0, the Installation Information must accompany
+the Minimal Corresponding Source and Corresponding Application
+Code. If you use option 4d1, you must provide the Installation
+Information in the manner specified by section 6 of the GNU GPL
+for conveying Corresponding Source.)
+
+5. Combined Libraries.
+
+You may place library facilities that are a work based on the
+Library side by side in a single library together with other library
+facilities that are not Applications and are not covered by this
+License, and convey such a combined library under terms of your
+choice, if you do both of the following:
+
+a) Accompany the combined library with a copy of the same work based
+on the Library, uncombined with any other library facilities,
+conveyed under the terms of this License.
+
+b) Give prominent notice with the combined library that part of it
+is a work based on the Library, and explaining where to find the
+accompanying uncombined form of the same work.
+
+6. Revised Versions of the GNU Lesser General Public License.
+
+The Free Software Foundation may publish revised and/or new versions
+of the GNU Lesser General Public License from time to time. Such new
+versions will be similar in spirit to the present version, but may
+differ in detail to address new problems or concerns.
+
+Each version is given a distinguishing version number. If the
+Library as you received it specifies that a certain numbered version
+of the GNU Lesser General Public License "or any later version"
+applies to it, you have the option of following the terms and
+conditions either of that published version or of any later version
+published by the Free Software Foundation. If the Library as you
+received it does not specify a version number of the GNU Lesser
+General Public License, you may choose any version of the GNU Lesser
+General Public License ever published by the Free Software Foundation.
+
+If the Library as you received it specifies that a proxy can decide
+whether future versions of the GNU Lesser General Public License shall
+apply, that proxy's public statement of acceptance of any version is
+permanent authorization for you to choose that version for the
+Library.
data/VERSION
ADDED
@@ -0,0 +1 @@
+0.0.1
data/lib/regexpr-uri.rb
ADDED
@@ -0,0 +1,62 @@
+require 'regexpr.rb'
+
+class Uri
+  class Flags< Array
+    class <<self
+      def new p
+        self[ p. split( /[&;]/) ]
+      end
+    end
+
+    def to_s
+      join '&'
+    end
+  end
+
+  class <<self
+    def new uri
+      uri. instance_of?( Uri) ? uri. dup : super( uri)
+    end
+  end
+
+  attr_accessor :scheme, :username, :password, :host, :port, :path, :file, :flags, :fragment
+  def initialize( uri) self. uri= uri end
+  def uri() self. pre+ self. request_uri+ (self. fragment ? '#'+ self. fragment : '') end
+  alias to_s uri
+  def pre() (self. scheme ? self. scheme+ '://' : '')+ self. serv end
+  def serv() (self. host|| '')+ (self. port ? ':'+ self. port : '') end
+
+  def request_uri
+    (self. path || '')+ (self. file || '')+ (self. flags ? '?'+ self. flags : '')
+  end
+
+  RegExpr = ::RegExpr[
+    :scheme => '( "http" | "ftp" ) "s" ? | "sftp" | "fish"',
+    :username => '[^:@]*',
+    :password => '[^@]*',
+    :host => 'hostname | ipv4',
+    :port => 'digit +',
+    :path => '( "/" ? [^?#]* "/" ) ?',
+    :file => '[^/?#] *',
+    :flags => '[^#] *',
+    :fragment => '. *',
+    'ipv4digits' => '0..255',
+    'ipv6digits' => '[0-9a-fA-F] {1,4}',
+    'userinfo' => 'username ( ":" password ) ?',
+    'domainlabel' => 'alphadigit ( ( alphadigit | "-" ) * alphadigit ) ?',
+    'hostname' => 'domainlabel ( "." | domainlabel ) *',
+    'ipv4' => 'ipv4digits ( "." ipv4digits ) {3,3}',
+    'request_uri' => '( path ? file ) ? ( "?" flags ) ?',
+    'serv' => 'host ( ":" port ? ) ?',
+    'pre' => '( ( scheme "://" ) ( userinfo "@" ) ? serv ) ?',
+    'uri' => 'pre request_uri ( "#" fragment ) ?',
+    'main' => 'uri'
+  ]
+  RegExpr. def self, :path, :uri, :pre, :serv, :request_uri
+end
+
+def Uri( uri) Uri.new uri end
+
+class String
+  def to_uri() Uri. new self end
+end
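The setters that `RegExpr. def` generates for `Uri` match the whole input against an anchored pattern and copy the captures into instance variables. A minimal sketch of that idea in plain Ruby, without the RegExpr library: `TinyUri` and its hand-written `PATTERN` are hypothetical stand-ins, and the pattern is far simpler than the grammar above (no user info, no separate file part).

```ruby
# Hypothetical stand-in for Uri: a hand-written, simplified pattern replaces
# the grammar that RegExpr would compile. Not part of the gem.
class TinyUri
  PATTERN = %r{\A(?:(?<scheme>[a-z]+)://(?<host>[^:/?#]+)(?::(?<port>\d+))?)?(?<path>[^?#]*)(?:\?(?<flags>[^#]*))?(?:#(?<fragment>.*))?\z}

  attr_reader :scheme, :host, :port, :path, :flags, :fragment

  def initialize(uri)
    self.uri = uri
  end

  # Match the whole string, then store every named capture in the
  # instance variable of the same name (nil when the part is absent).
  def uri=(val)
    m = PATTERN.match(val)
    raise ArgumentError, "not a URI: #{val.inspect}" unless m
    m.names.each { |n| instance_variable_set("@#{n}", m[n]) }
  end

  # Reassemble the parts in the same order the pattern reads them.
  def to_s
    s = +''
    s << "#{scheme}://#{host}" if scheme
    s << ":#{port}" if port
    s << path.to_s
    s << "?#{flags}" if flags
    s << "##{fragment}" if fragment
    s
  end
end

u = TinyUri.new('http://example.org:8080/a/b.html?x=1&y=2#top')
puts u.host   # prints "example.org"
puts u.flags  # prints "x=1&y=2"
```

Named captures keep the assignment generic, which is roughly what the generated `#{names}= *m[ 1.. -1]` line does with positional captures.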
data/lib/regexpr.rb
ADDED
@@ -0,0 +1,462 @@
+
+class RegExpr< Hash
+end
+
+class RegExpr::Segment
+  attr_accessor :value
+  def initialize( val) self.value= val end
+  def to_r() self.value.to_s end
+  def empty?() self.value.nil? end
+  def names() @value.names.flatten.compact end
+
+  def optimize
+    self.value= self.class.optimize self.value
+    self
+  end
+
+  class <<self
+    def optimize v
+      v= v.optimize
+      v= nil if v and v.empty?
+      v= v.value[ 0] if RegExpr::Block === v and v.hidden and v.size == 1
+      v
+    end
+
+    def deepest
+      self.class_eval do
+        def names() [] end
+      end
+    end
+
+    def novalue
+      self.class_eval do
+        def initialize() end
+        def empty?() false end
+        def to_r() '' end
+        def optimize() self end
+      end
+    end
+
+    def nooptimize
+      self.class_eval do
+        def optimize() self end
+      end
+    end
+  end
+end
+
+class RegExpr::Block< RegExpr::Segment
+  attr_accessor :name, :hidden
+  def hidden?() @hidden end
+  def optimize() self.dup.optimize! end
+  def push( *v) @value.push *v end
+  def pop() @value.pop end
+  def empty?() @value.empty? end
+  def size() @value.size end
+
+  def names
+    names= @value.collect &:names
+    names.push( name) unless self.hidden?
+    names.flatten.compact
+  end
+
+  def initialize *val
+    val= val[ 0] if ::Array === val[ 0] and val.size == 1
+    super val
+    @hidden= true
+  end
+
+  def optimize!
+    list, chars= [[]], RegExpr::Chars.new( '')
+
+    @value.each do |v|
+      v= self.class.optimize v
+      if RegExpr::Or === v
+        list.push []
+      else list[ -1].push v
+      end
+    end
+
+    list.delete_if do |v|
+      if (RegExpr::Chars === v[ 0] and v.size == 1 ) or RegExpr::Char === v[ 0]
+        chars+= v[ 0]
+      else false
+      end
+    end
+    chars= chars.optimize
+
+    values= []
+    list.each do |v|
+      values.push RegExpr::Or.new
+      values+= if v.size == 1 and RegExpr::Block === v[ 0] and v[ 0].hidden
+          v[ 0].value
+        else
+          v.collect do |w|
+            if RegExpr::Block === w and w.hidden
+              u= false
+              w.value.each do |i|
+                break unless u||= RegExpr::Or === i
+              end
+              u ? w : w.value
+            else w
+            end
+          end.flatten
+        end
+    end
+    values.push RegExpr::Or.new, chars if chars.size > 0
+    values.shift
+    @value= values
+    self
+  end
+
+  def to_r()
+    (@hidden ? '(?:%s)' : '(%s)')% @value.collect( &:to_r).join( '')
+  end
+end
+
+class RegExpr::Not< RegExpr::Segment
+  deepest
+  novalue
+  def to_r
+    if @value.instance_of? RegExpr::Chars
+      @value.not!
+      @value.to_s
+    else '(?!%s)'% @value
+    end
+  end
+end
+
+class RegExpr::Range< RegExpr::Segment
+  novalue
+  attr_accessor :v1, :v2
+  def names() [] end
+  def optimize() self.value.optimize end
+  def to_r() self.optimize.to_r end
+  def initialize( v1, v2) @v1, @v2= v1, v2 end
+
+  # algo stolen from Thomas Leitner
+  def value
+    a, b= @v1< @v2 ? [ @v1, @v2] : [ @v2, @v1]
+    arr= Array[ a]
+
+    af= a == 0 ? 1.0 : a.to_f
+    bf= b == 0 ? 1.0 : b.to_f
+    1.upto( b.to_s.length- 1) do |i|
+      pot= 10** i
+      num= (af/ pot).ceil* pot # next higher number with i zeros
+      arr.insert i, num if num < b # compare with the upper bound b, not the unsorted @v2
+      num= (bf/ pot).floor* pot # next lower number with i zeros
+      arr.insert -i, num if num > a # skip boundaries that fall below the range
+    end
+    arr.uniq!
+    arr.push b+ 1 # +1 -> to handle it in the same way as the other elements
+
+    result= RegExpr::Block.new
+    0.upto( arr. length- 2) do |i|
+      first= arr[ i].to_s
+      second= (arr[ i+ 1]- 1).to_s
+      result.push RegExpr::Or.new
+      0.upto( first.length- 1) do |j|
+        result.push( if first[ j] == second[ j]
+            RegExpr::Char.new first[ j].chr
+          else
+            RegExpr::Chars.new '%c-%c'% [ first[ j], second[ j] ]
+          end)
+      end
+    end
+    result. value. shift
+    result
+  end
+end
+
+class RegExpr::Chars< RegExpr::Segment
+  deepest
+  attr_reader :chars, :not
+  def to_r() '[%s]'% self. value end
+  def not?() @not end
+  def empty?() @chars. empty? end
+  def size() @chars. size end
+  def value=( val) @chars= (@not= val[ 0] == ?^) ? val[ 1.. -1] : val ; val end
+  def value() (self. not? ? '^' : '')+ (@chars) end
+  def not!() @not= !@not end
+  alias -@ not!
+
+  def split
+    chars= []
+    @chars. gsub( /\\-/) do |r|
+      chars. push ?-.ord
+      nil
+    end. gsub( /.-./) do |r|
+      chars+= (r[ 0].ord .. r[ 2].ord). to_a
+      nil
+    end. bytes. each do |c|
+      chars. push c
+    end
+    chars
+  end
+
+  def optimize!
+    b2chr= lambda do |b|
+      b = b.chr
+      "-[]".include?( b) ? '\%c'% b : b
+    end
+    chars= self. split. sort. uniq
+    @chars= ''
+    return self if chars.empty?
+    b= chars.shift
+    chars.each do |i|
+      if b+1 == i
+        unless @chars[ -1] == ?- and @chars[-2] != ?\\
+          @chars+= b2chr.call( b)+ '-'
+        end
+      else @chars+= b2chr.call b
+      end
+      b= i
+    end
+    @chars+= b2chr.call b
+    self
+  end
+
+  def optimize
+    n= self.dup.optimize!
+    if (n.size == 1 or (n.size == 2 and n.value[ 0] == ?\\ )) and not n.not?
+      RegExpr::Char.new n.chars[ -1]
+    else
+      n
+    end
+  end
+
+  def + b
+    chars= self.not? ? '^' : ''
+    chars+= if b.instance_of? RegExpr::Char
+        self.split.push b.value[ 0]
+      elsif self.not? == b.not?
+        self.split+ b.split
+      elsif self.not?
+        (0..255).to_a- self.split+ b.split
+      else
+        (0..255).to_a- b.split+ self.split
+      end.compact.uniq.collect {|i| i.chr }.join( '')
+    self.class.new chars
+  end
+end
+
+class RegExpr::Repeat< RegExpr::Segment
+  attr_reader :min, :max
+
+  def minandmax x
+    case x
+    when nil, '' then nil
+    else x.to_i
+    end
+  end
+
+  def optimize
+    super
+    r = (min == 1 and max == 1) ? @value : self
+    r
+  end
+
+  def initialize value, min= 1, max= min
+    super value
+    @min, @max= self.minandmax( min), self.minandmax( max)
+  end
+
+  def to_r
+    t= '{%s,%s}'% [ @min||'', @max||'' ]
+    return '' if '{0,0}' == t
+    t= Hash[ *%w<{,1} ? {0,1} ? {0,} * {,} * {1,} +>+ ['{1,1}', ''] ][ t]|| t
+    @value.to_r+ t
+  end
+end
+
+class RegExpr::Char< RegExpr::Segment
+  deepest
+  nooptimize
+  def to_r() ::Regexp.quote @value end
+  def size() 1 end
+
+  def self.new x
+    x= x.split( '').collect {|i| super i }
+    x.size == 1 ? x[ 0] : RegExpr::Block.new( x)
+  end
+end
+
+class RegExpr::Regexp< RegExpr::Segment
+  deepest
+  nooptimize
+  def to_r() @value. to_s end
+end
+
+class RegExpr::Or< RegExpr::Segment
+  deepest
+  novalue
+  def to_r() '|' end
+  def to_s() '|' end
+end
+
+class RegExpr::End< RegExpr::Segment
+  deepest
+  novalue
+  def to_r() '$' end
+  def to_s() '$' end
+end
+
+class RegExpr::Begin< RegExpr::Segment
+  deepest
+  novalue
+  def to_r() '^' end
+  def to_s() '^' end
+end
+
+class RegExpr::WildCard< RegExpr::Segment
+  deepest
+  nooptimize
+  def to_r() @value end
+  def to_s() @value end
+end
+
+class RegExpr
+  class <<self
+    STDEXP= Hash[
+      'loalpha' => '[a-z]',
+      'hialpha' => '[A-Z]',
+      'alpha' => 'loalpha | hialpha',
+      'digit' => '[0-9]',
+      'alphadigit' => 'alpha | digit',
+      'hexdigit' => 'digit | [a-fA-F]',
+      'octdigit' => '[0-7]',
+      'bindigit' => '[01]',
+      'space' => '[ \t\n\r\v]'
+    ]
+
+    def [] *vals
+      ret= super *vals
+      STDEXP.each {|k, v| ret[ k]||= v }
+      ret
+    end
+
+    def new *vals
+      ret= super *vals
+      STDEXP.each {|k, v| ret[ k]||= v }
+      ret
+    end
+  end
+
+  def to_r exp= :main
+    r = self.to_re( exp)
+    #r.optimize!
+    h, r = r.hidden?, r.to_r
+    r = r[ 1...-1] unless h
+    ::Regexp.new r
+  end
+
+  def to_re exp= :main
+    u= RegExpr::Block.new
+    t, u.hidden= if Symbol === exp
+        u.name= exp.to_sym
+        if self[ exp]
+          [ self[ exp], false]
+        else [ self[ exp.to_s], true]
+        end
+      else [ exp.to_s, true]
+      end
+
+    until !t or t.empty?
+      v, t= self.to_r_next t
+      case v
+      when ')' then return u, t
+      when RegExpr::Repeat then v.value= u.pop
+      end
+      u.push v
+    end
+    u
+  end
+
+  def to_r_next exp
+    exp.strip!
+    /^/ =~ exp[ 1.. -1] # primes $' with everything after the first character
+    t= case exp[ 0]
+      when ?^ then return RegExpr::Begin.new, exp[ 1.. -1]
+      when ?$ then return RegExpr::End.new, exp[ 1.. -1]
+      when ?\\
+        h= case exp[ 1]
+          when ?D, ?S, ?W, ?a, ?d.. ?f, ?n, ?r.. ?t, ?v, ?w
+            return RegExpr::WildCard.new( '\%c'% exp[ 1]), exp[ 2.. -1]
+          when ?x then 16
+          when ?o then 8
+          when ?b then 2
+          when ?0.. ?9
+            exp= 'XX'+ exp[ 1.. -1]
+            10
+          else raise ArgumentError, 'Unknown form "%s"'% exp
+          end
+        i= exp[ 2.. -1].to_i h
+        return RegExpr::Char.new( i.chr), exp[ (i.to_s( h). size+ 2).. -1]
+
+      when ?. then return RegExpr::WildCard.new( '.'), exp[ 1.. -1]
+
+      when ?0
+        case exp[ 1]
+        when ?x then %r<^0x([0-9a-f]+)>i.match exp
+          return '', $1.to_i( 16).to_s+ $'
+        when ?o then %r<^0o([0-7]+)>.match exp
+          return '', $1.to_i( 8).to_s+ $'
+        when ?b then %r<^0b([01]+)>.match exp
+          return '', $1.to_i( 2).to_s+ $'
+        else
+          case exp
+          when %r<^(\d+)\.\.(\d+)> then RegExpr::Range.new $1.to_i, $2.to_i
+          when %r<^(\d+,\d+|,\d+|\d+,?)> then RegExpr::Repeat.new '', *$1.split( ',')
+          else raise ArgumentError, 'Unknown form "%s"'% exp
+          end
+        end
+
+      when ?( then return self.to_re( exp[ 1.. -1])
+      when ?) then ')'
+      when ?| then RegExpr::Or.new
+
+      when ?+ then RegExpr::Repeat.new '', 1, nil
+      when ?* then RegExpr::Repeat.new '', nil
+      when ?? then RegExpr::Repeat.new '', 0, 1
+
+      when ?" then RegExpr::Char.new %r<^"((?:[^"]|\\")*)">.match( exp)[ 1]
+      when ?[ then RegExpr::Chars.new %r<^\[((?:[^\]]|\\\])*[^\\]|)\]>.match( exp)[ 1]
+      when ?/ then m= %r<^/((?:[^/]|\\/)*)/(im?|mi)?>.match exp # keep $' intact for the return below
+        RegExpr::Regexp.new ::Regexp.new( m[ 1],
+          (m[ 2].to_s.include?( 'i') ? ::Regexp::IGNORECASE : 0)+
+          (m[ 2].to_s.include?( 'm') ? ::Regexp::MULTILINE : 0))
+
+      else
+        case exp
+        when %r<^([a-z_][a-z_0-9]*\b)>i then self.to_re $1.to_sym
+        when %r<^(\d+)\.\.(\d+)> then RegExpr::Range.new $1.to_i, $2.to_i
+        when %r<^(\d+,\d+|,\d+|\d+,?)> then RegExpr::Repeat.new '', *$1.split( ',')
+        else raise ArgumentError, 'Unknown form "%s"'% exp
+        end
+      end
+    [ t, $' ]
+  end
+
+  def def cl= Class.new, *exp
+    exp= [ :main ] if exp.empty?
+    exp.each do |e|
+      re= self.to_re e
+      names= re.names.collect( &'@%s'.method(:%)).join ', '
+      re= ::Regexp.new '^%s$'% re.to_r
+      ev= <<-EOF
+        def #{e}= val
+          m= #{re.inspect}. match val
+          raise ArgumentError, 'Unallowed Chars! (%s =~ #{re.inspect})'% val. inspect unless m
+          #{names}= *m[ 1.. -1]
+        end
+      EOF
+      cl.class_eval ev
+    end
+    cl
+  end
+
+  def match( m, exp= :main) to_r( exp).match m end
+
+  def =~( x) to_r =~ x end
+end
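`RegExpr::Range#value` turns a numeric range such as `0..255` into an alternation of digit-wise character classes. The sketch below shows the same idea as a standalone script. It is not the gem's code: the helper names `range_stops`, `chunk_pattern` and `numeric_range_regexp` are made up for illustration, the splitting rule is a commonly used one rather than the one credited in the source comment, and only non-negative integers are handled.

```ruby
# Returns the sorted upper bounds of the sub-ranges that [a, b] is cut into.
# Inside each sub-range both bounds have the same number of digits, and the
# digits after the first differing one run from 0 to 9.
def range_stops(a, b)
  stops = [b]
  n = 1
  loop do
    stop = a - a % 10**n + 10**n - 1 # a with its last n digits set to 9
    break unless stop < b
    stops << stop
    n += 1
  end
  n = 1
  loop do
    stop = (b + 1) - (b + 1) % 10**n - 1 # one below (b + 1) rounded down to n zeros
    break unless stop > a
    stops << stop
    n += 1
  end
  stops.uniq.sort
end

# Builds the digit-wise pattern for one sub-range: equal digits stay literal,
# differing digits become a character class.
def chunk_pattern(first, last)
  first.to_s.chars.zip(last.to_s.chars).map do |x, y|
    x == y ? x : "[#{x}-#{y}]"
  end.join
end

# Joins the sub-range patterns into one anchored alternation.
def numeric_range_regexp(a, b)
  a, b = b, a if a > b
  start = a
  alternatives = range_stops(a, b).map do |stop|
    pattern = chunk_pattern(start, stop)
    start = stop + 1
    pattern
  end
  Regexp.new("\\A(?:#{alternatives.join('|')})\\z")
end

puts numeric_range_regexp(0, 255).source
# prints \A(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\z
```

Cutting at "all nines" and "all zeros" boundaries keeps the number of alternatives proportional to the number of digits instead of the size of the range.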
metadata
ADDED
@@ -0,0 +1,65 @@
+--- !ruby/object:Gem::Specification
+name: RegExpr
+version: !ruby/object:Gem::Version
+  prerelease: false
+  segments:
+  - 0
+  - 0
+  - 1
+  version: 0.0.1
+platform: ruby
+authors:
+- Denis Knauf
+autorequire:
+bindir: bin
+cert_chain: []
+
+date: 2010-03-19 00:00:00 +01:00
+default_executable:
+dependencies: []
+
+description: Write Regular Expressions in a Hash and generates an optimized Regex
+email: Denis.Knauf@gmail.com
+executables: []
+
+extensions: []
+
+extra_rdoc_files:
+- LICENSE
+files:
+- VERSION
+- lib/regexpr-uri.rb
+- lib/regexpr.rb
+- LICENSE
+has_rdoc: true
+homepage: http://github.com/DenisKnauf/RegExpr
+licenses: []
+
+post_install_message:
+rdoc_options:
+- --charset=UTF-8
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      segments:
+      - 0
+      version: "0"
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      segments:
+      - 0
+      version: "0"
+requirements: []
+
+rubyforge_project:
+rubygems_version: 1.3.6
+signing_key:
+specification_version: 3
+summary: Regular Expression Creator
+test_files: []
+
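The gem description speaks of writing regular expressions in a Hash. As a rough illustration of that idea only, the sketch below expands a table of named rules into one `Regexp` by textual substitution. The `<name>` reference notation, the `RULES` table and the `expand` helper are assumptions made for this example; RegExpr itself parses its own expression syntax instead.

```ruby
# Hypothetical rule table; "<name>" refers to another rule. This notation is
# invented for the sketch and would clash with named groups like (?<x>...).
RULES = {
  'digit' => '[0-9]',
  'octet' => '25[0-5]|2[0-4]<digit>|1<digit><digit>|[1-9]?<digit>',
  'ipv4'  => '<octet>(?:\.<octet>){3}'
}.freeze

# Recursively replaces every <name> with the referenced rule, wrapped in a
# non-capturing group so that alternations inside a rule stay contained.
def expand(rules, name, seen = [])
  raise ArgumentError, "circular rule: #{name}" if seen.include?(name)
  rules.fetch(name).gsub(/<[a-z_][a-z_0-9]*>/) do |ref|
    "(?:#{expand(rules, ref[1..-2], seen + [name])})"
  end
end

IPV4 = Regexp.new("\\A(?:#{expand(RULES, 'ipv4')})\\z")

puts IPV4.match?('192.168.0.1') # prints true
puts IPV4.match?('256.1.1.1')   # prints false
```

Keeping small rules in a table and composing them by name is what makes a long pattern such as the URI grammar readable; the price is an expansion step before the pattern can be used.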