reg 0.4.6

@@ -0,0 +1,134 @@
1
+ =begin copyright
2
+ reg - the ruby extended grammar
3
+ Copyright (C) 2005 Caleb Clausen
4
+
5
+ This library is free software; you can redistribute it and/or
6
+ modify it under the terms of the GNU Lesser General Public
7
+ License as published by the Free Software Foundation; either
8
+ version 2.1 of the License, or (at your option) any later version.
9
+
10
+ This library is distributed in the hope that it will be useful,
11
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
12
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
13
+ Lesser General Public License for more details.
14
+
15
+ You should have received a copy of the GNU Lesser General Public
16
+ License along with this library; if not, write to the Free Software
17
+ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
18
+ =end
19
+
+ #----------------------------------
+ module Kernel
+   def formula_value(*ctx) #hopefully, no-one else will ever use this same name....
+     self
+   end
+ end
+
+ module Reg
+   #----------------------------------
+   #BlankSlate,Formula,Deferred,and Const are courtesy of Jim Weirich.
+   #Slightly modified by me.
+   #see http://onestepback.org/index.cgi/Tech/Ruby/SlowingDownCalculations.rdoc
+
+
+
+
+   #----------------------------------
+   module BlankSlate;
+     module ClassMethods
+       def restore(*names)
+         names.each{|name| alias_method name, "##{name}"}
+       end
+       def hide(*names)
+         names.each do|name|
+           undef_method name if instance_methods.include?(name.to_s)
+         end
+       end
+     end
+
+     def BlankSlate.included(othermod)
+       othermod.instance_eval {
+         instance_methods.each { |m|
+           alias_method "##{m}", m #archive m
+           undef_method m unless m =~ /^__/ || m=='instance_eval'
+         }
+         extend BlankSlate::ClassMethods
+       }
+     end
+   end
+
+   #----------------------------------
+   module Formula
+     def method_missing(sym, *args, &block)
+       Deferred.new(self, mixmod, sym, args, block)
+     end
+     alias deferred method_missing
+
+     def mixmod; nil end #default is not contagious
+
+     def coerce(other)
+       [Const.new(other), self]
+     end
+
+     def formula_value(*ctx)
+       fail "Subclass Responsibility"
+     end
+   end
+
+   #----------------------------------
+   class Deferred
+     include BlankSlate
+     restore :inspect,:extend
+     include Formula
+     attr_reader :operation, :args, :target, :block
+
+     def initialize(target, mod, operation, args, block)
+       @target = target
+       @operation = operation
+       @args = args
+       @block = block
+       mod ||= args.find{|a| Formula===a }
+       mod and extend mod
+     end
+
+     def formula_value(*ctx)
+       @target.formula_value(*ctx).send(@operation, *eval_args(*ctx), &@block)
+     end
+
+     private
+
+     def eval_args(*ctx)
+       @args.collect { |a| a.formula_value(*ctx) }
+     end
+
+     class <<self
+       alias new__no_const new
+       def new(*args)
+         if args.size==1
+           Const.new( *args)
+         else
+           new__no_const( *args)
+         end
+       end
+     end
+
+     class Const
+       include BlankSlate
+       restore :inspect,:extend
+       include Formula
+
+       def initialize(value)
+         @formula_value = value
+       end
+       def formula_value(*ctx) @formula_value end
+
+       class<<self
+         alias [] new
+       end
+     end
+
+   end
+
+
+
+ end
@@ -0,0 +1 @@
1
+ Methods/classes Reg Or And Xor Not Hash OrderedHash Object OrderedObject Array Fixed Equals Literal Multiple Interpreter Subseq Repeat LookAhead LookBack Regexp Module Set Range Pair Position
2
+
-
1
3
 
@@ -0,0 +1,416 @@
1
+ Note: this document is half skeleton right now. Eventually, I'll
2
+ flesh this out into a more user-accessible description of Reg.
3
+
4
+ Another note: some of the things described here aren't working yet...
5
+ I'll try to mark such features with a triple-asterisk (***).
6
+
7
+ I'm going to assume that the reader of this document is well versed in
8
+ regular expressions. Reg is intended to be an expanded type of regular
9
+ expression language which extends Regexp-like capabilities to data types
10
+ other than String. (For instance, Files and Arrays.) (It can also enhance
+ the capabilities for Strings.... eventually, you will be able to use Reg as
+ a full-fledged lexing tool. For information on parsing, see parser.txt and
13
+ calc.reg.)
14
+
15
+ matchers and matching:
16
+ Matcher is my term for any Object that can respond to ===. In practice this is
17
+ almost all objects, since by default === forwards to == which is built in.
18
+ Sometimes, I use the term matcher as shorthand for what might be more properly
19
+ called an 'interesting matcher': one which responds to === in a different way
20
+ than ==. Of the built-in classes, objects of type Class, (and Module, its parent),
21
+ Regexp, and Range are interesting matchers. Reg extends these, and provides a
22
+ whole host of other interesting matchers, in the usual sense as well. This
23
+ includes matchers composed of other (usually smaller) matchers, and a mini-
24
+ language for composing the matchers.
25
+
26
+ Much of the mini-language, (a sugary layer, which makes writing matches somewhat
27
+ naturalistic) is in the module Reg::Reg. Reg::Reg can extend user-created
28
+ (interesting) matchers (giving them |, &, ^, ~, as well as +, -, * and others which
29
+ I haven't explained yet) in somewhat the same manner that Enumerable extends
30
+ each and Comparable extends <=>.
31
+
32
+ Matching means using a matcher. Typically, this would mean calling the matcher's
33
+ === method. Eventually, a whole bunch of other matching methods will be provided
34
+ too, but for now === is the only published match operator.
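
To make the idea of an 'interesting matcher' concrete, here is a small sketch in plain Ruby that does not load Reg at all. It shows === on the built-in matchers named above, plus one user-defined matcher. EvenMatcher, EVEN and describe are hypothetical names made up for this illustration; they are not part of Reg.

```ruby
# Built-in matchers: for Class, Regexp and Range, === differs from ==.
# For a plain Symbol it is just equality.
builtin_results = [
  Integer === 42,        # Class/Module: is 42 an instance?
  /foo/ === "seafood",   # Regexp: does the pattern occur in the string?
  (1..10) === 5,         # Range: does the range cover the value?
  :foo === :foo          # uninteresting matcher: plain equality
]

# A user-defined matcher only needs to respond to ===.
# EvenMatcher is a made-up example, not a Reg class.
class EvenMatcher
  def ===(other)
    other.is_a?(Integer) && other.even?
  end
end

EVEN = EvenMatcher.new

# case/when calls === on each candidate, so any matcher can be used there.
def describe(item)
  case item
  when EVEN    then "even integer"
  when Integer then "odd integer"
  when /foo/   then "string with foo"
  else              "something else"
  end
end
```

Because case/when is defined in terms of ===, any object written this way can be dropped into existing Ruby code without further ceremony.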
35
+
36
+
37
+
38
+
39
+
40
+
41
+ #reg actually returns a new matcher that responds to the various operators;
42
+ the original object is unchanged.
43
+ Note: in many contexts, it's not even necessary to use #reg. The right
44
+ side of a |,&,or ^ (where the left is already known to be a reg) need not
45
+ be made into a reg, so long as it responds to ===. Likewise with the items
46
+ inside Reg::Arrays and Reg::Subseqs, and the keys and values of
47
+ Reg::Hashes and Reg::Objects need not be regs, so long as they can respond
48
+ to ===. Examples:
49
+
50
+ unr=some_unreggy_matcher
51
+ unr.reg|String #.reg only needed on first
52
+ +[ unr, -[unr, String]+0 ] #no .reg req'd w/ elems of Reg::Array/Reg::Subseq
53
+ +{ unr=>unr } # or Reg::Hash
54
+ -{ unr=>unr } # or Reg::Object
55
+
56
+ Reg::Reg is mixed into the following built-in classes by default: Module,Class, Range, Regexp.
57
+ Class Set is extended to have an ===, but Reg::Reg is not used in it (yet).
58
+
59
+
60
+
61
+
62
+ regexp has just 4 constructs that create scalar expressions:
63
+ . [st] [^st] s
64
+ all other regexp constructs are for creating regexp vectors.
65
+ regexp scalars do not exist by themselves; they are merely subexpressions of a
66
+ larger regexp.
67
+
68
+
69
+
70
+
71
+ reg scalars
72
+ reg has a rich array of matchers for different types of ruby objects. there are
73
+ specialized matchers for arrays, hashes, strings, symbols, and others, as well as
+ the object matcher, suitable for use with just about any ruby object, user-
+ created or otherwise. ordinary objects can also serve as reg scalars; such an object will
76
+ match (normally) if == succeeds. actually, === is used internally, but it just
77
+ delegates to == by default.
78
+
79
+ objects that can respond to === (which is most, since by default it delegates to ==) can also serve as a scalar Reg; this is a nice capability for creating user-defined reg types. Among other
80
+ things, it means that most objects can serve as 'literal' Reg matchers that match only themselves.
81
+ (Note that if you want to use methods of Reg, such as Reg::Reg#*, on such 'literal' reg scalars, you'll have to invoke the reg method of the user-defined object. for instance, this won't work:
82
+ :foo*5 #error, no Symbol#*
83
+ but this will:
84
+ :foo.reg*5 #creates a wrapper Reg that forwards === to the wrapped object
85
+ )
86
+
87
+ reg vectors (will be met in full later)
88
+ subsequence(-[]) and iteration (*,+,-) are always vectors.
89
+ logic operators (~, &, |, ^) become vectors if any subexpression is a vector
90
+ (recursively).
91
+ array matchers(+[]) are NOT vectors (but, all vectors must ultimately be
92
+ contained in them)
93
+ certain other, more dynamic reg types (Reg::Variable,Reg::Constant,Reg::Proc) might
94
+ be able to be vectors if that feature is enabled by the user.
95
+
96
+
97
+ Reg by example:
98
+ I will use fragments of Reg to illustrate the various aspects of the language. We'll start with various
99
+ scalar matchers, followed by some vector matchers later on.
100
+
101
+ /foo/ #everybody knows what this matches, right?
102
+
103
+ You didn't know that a Regexp is also a Reg, did you? (*** Actually, Regexp serves a double role. When matching String-like data, it matches a variable number of characters. When matching Array-like data, it matches a single item in the Array.)
104
+
105
+ Logic:
106
+ (The &, |, ^, and ~ operators in the next few examples can apply to any Reg. Regexp is used for demonstration purposes because it is the only type of Reg so far introduced.)
107
+
108
+ ~/foo/ #matches on strings without foo (_and_ all non-Strings)
109
+
110
+ Not expressions (negations) invert their operand. This expression matches things that don't match /foo/. Note that that normally includes all non-String objects as well as String objects that don't match the Regexp.
111
+
112
+ I'm overriding the default meaning of Regexp#~ here. The original is still available as Regexp#cmp.
113
+
114
+ /foo/|/bar/ #roughly like /(?>foo)|(?>bar)/: strings with either /foo/ or /bar/
115
+
116
+ Or expressions (alternations) match if any of the alternatives match. The current treatment of alternation is charitably described as 'traditional' and 'non-greedy'. Alternatives are given the opportunity to match in the order that was specified in the or matcher. The amount consumed by the overall match is the amount of input matched by the first (leftmost) alternative which actually matches (within the larger expression).
117
+
118
+ (***This is not necessarily the longest that the overall expression can match. A future implementation may be greedier overall. Currently, if the first (tentative) match leads to the overall expression failing, the first matching branch is consulted for shorter matches before later alternatives which might be as long as (or longer than) the current match are attempted.)
119
+
120
+ (I've used the (?>) construct to demonstrate a subtle point about the use of Regexp in larger Reg expressions: there's no way to backtrack into a nested Regexp when backtracking through the larger Reg. Whatever the Regexp is able to first match is all it
121
+ can match; shorter matches cannot be considered. Mostly this is a consideration with sequences and subsequences when matching String-like data, which isn't supported yet.)
122
+
123
+ /foo/&/bar/ #sorta like /foo.*bar|bar.*foo/: strings with both /foo/ and /bar/
124
+
125
+ And expressions (conjunctions) match if all of the sub-expressions match. The amount consumed by the overall expression is the amount consumed by the longest sub-expression(s).
126
+
127
+ (*** Backtracking in & expressions is only sketchily understood and not implemented yet. The total number of ways a conjunction can match rises exponentially with the number of branches which are actually ambiguous for the current input. The compromise
128
+ implementation I would like to do does not enumerate all of these (very numerous) matches... instead, all possible overall lengths will be tried, at least. )
129
+
130
+ /foo/^/bar/ #vaguely like /foo|bar/&~(/foo/&/bar/): strings with /foo/ or /bar/ but not both
131
+ /foo/^/bar/^/baz/ #string matching one and only one of /foo/, /bar/, or /baz/
132
+
133
+ Xor expressions (exclusive alternations) match if one and only one of the branches actually matches. If there are only two branches, this is equivalent to matching if one branch matches and the other doesn't.
134
+ The amount consumed overall is the amount consumed by the only branch that matches.
135
+
136
+ Note that any of the three binary boolean operations can have more than two branches, as in the second example above. In the case of xor, the meaning of more than two branches is not necessarily obvious, but it is consistent: one and only one of the branches must match.
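
The truth tables described above can be sketched in plain Ruby without Reg. This is only a model of the scalar yes/no result; it says nothing about how much input is consumed or about backtracking. Combinator, any_of, all_of, one_of and none_of are hypothetical names, not Reg's classes or methods.

```ruby
# Each combinator counts how many of its branches match (via ===)
# and applies a rule to that count.
Combinator = Struct.new(:branches, :rule) do
  def ===(item)
    hits = branches.count { |branch| branch === item }
    rule.call(hits, branches.size)
  end
end

# Or: at least one branch matches.
def any_of(*branches)
  Combinator.new(branches, ->(hits, _total) { hits >= 1 })
end

# And: every branch matches.
def all_of(*branches)
  Combinator.new(branches, ->(hits, total) { hits == total })
end

# Xor: one and only one branch matches, however many branches there are.
def one_of(*branches)
  Combinator.new(branches, ->(hits, _total) { hits == 1 })
end

# Not: the branch does not match (true for non-Strings with a Regexp).
def none_of(*branches)
  Combinator.new(branches, ->(hits, _total) { hits.zero? })
end

either   = any_of(/foo/, /bar/)
both     = all_of(/foo/, /bar/)
just_one = one_of(/foo/, /bar/, /baz/)
not_foo  = none_of(/foo/)
```

With these definitions, just_one rejects "foobar" (two branches match) but accepts "bazaar" (only /baz/ matches), and not_foo accepts 42 because a non-String never matches /foo/.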
137
+
138
+
139
+ Matching symbols:
140
+
141
+ /dd/.sym
142
+
143
+ This example matches Symbols that contain two consecutive d's. The Reg::Symbol matcher permits all the capabilities of Regexps when matching Symbols, only with slightly longer syntax. Regexp#sym returns a Reg::Symbol that matches Symbols that the Regexp
144
+ would match if they were converted to strings.
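
A rough model of that behaviour, written in plain Ruby rather than with Reg: SymbolMatcher below is a hypothetical stand-in for what the text calls Reg::Symbol, and it assumes the only requirement is "the item is a Symbol and its string form matches the Regexp".

```ruby
# Hypothetical stand-in for a Regexp-backed Symbol matcher.
class SymbolMatcher
  def initialize(regexp)
    @regexp = regexp
  end

  # Match only Symbols, and only if their string form matches the Regexp.
  def ===(item)
    item.is_a?(Symbol) && @regexp.match?(item.to_s)
  end
end

dd = SymbolMatcher.new(/dd/)

results = {
  address:  dd === :address,   # contains "dd"
  dad:      dd === :dad,       # the two d's are not consecutive
  a_string: dd === "address"   # a String is not a Symbol
}
```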
145
+
146
+
147
+
148
+
149
+ ItemThat:
150
+
151
+ /foo/|item_that.has_attr?
152
+
153
+ Lest you think that only Strings can be matched, I've introduced a new type: Reg::ItemThat. This example matches Strings with 'foo' in them or Objects that respond to :has_attr? with a true value. Kernel#item_that returns a Reg::ItemThat, which is a relative
+ of Jim Weirich's Deferred class. Deferred objects respond to all methods with another Deferred object. The method(s) are not actually called, but saved up (with args) until a future time when they're all invoked at once. Almost all methods create another
+ Deferred operation in this way, except a magic method that performs all the deferred operations. Since ItemThat is what I call a matcher, the magic method is ===.
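
The record-now, run-later idea can be sketched in a few lines of modern Ruby. This is not Reg's ItemThat or Jim Weirich's Deferred; ItemSketch and item_sketch are hypothetical names, BasicObject is used in place of the BlankSlate module, and treating an exception during replay as "no match" is an assumption of this sketch.

```ruby
# A much-reduced recorder: unknown methods are stored, not executed.
class ItemSketch < BasicObject
  def initialize(calls = [])
    @calls = calls
  end

  # Record the call and hand back a new recorder, so calls can be chained.
  def method_missing(name, *args, &block)
    ::ItemSketch.new(@calls + [[name, args, block]])
  end

  # BasicObject defines ==, so it has to be deferred explicitly.
  def ==(other)
    method_missing(:==, other)
  end

  # The "magic" method: replay every recorded call against a real item.
  def ===(item)
    result = @calls.inject(item) do |value, (name, args, block)|
      value.__send__(name, *args, &block)
    end
    result ? true : false
  rescue ::StandardError
    false
  end
end

def item_sketch
  ItemSketch.new
end

small = item_sketch < 44               # nothing is compared yet
even  = item_sketch % 2 == 0           # two deferred calls: % then ==
three_chars = item_sketch.to_s.length == 3
```

Only when one of these objects is used with === (for instance, small === 5) are the stored calls sent to the item, in the order they were written.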
156
+
157
+ item_that.meth.another_meth(@args)==$something
158
+
159
+ This demonstrates some of the concepts of the last paragraph: deferred calls to item_that can be chained together. The above expression returns a matcher for items that respond to :meth, with an object that responds to another_meth, taking @args, and returning
+ a value equal to $something. This illustrates calling methods on item_that, chaining the calls, methods taking arguments, and even deferring overridable operators.
161
+
162
+ item_that<44
163
+
164
+ This is another example of deferring an operator. This creates a matcher that returns
165
+ true if compared to objects less than 44.
166
+
167
+ (item_that<44) & (item_that%2==0) #even numbers smaller than 44
168
+
169
+ Here we've got two item_that calls in one expression. &, |, and ~ can substitute for &&, ||, and ! in deferred (and non-deferred) boolean expressions. If you do that, be careful that the operands are really boolean (true, false, or nil). And be aware that it's no
170
+ longer a short-circuit operator.
171
+
172
+ I am using & because && is not overrideable. (& has the same meaning for booleans as &&, but isn't a short-circuit operator.) Note that parentheses are now necessary because of the inconveniently different precedence of &.
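
Both caveats (no short-circuit, different precedence) can be seen with ordinary Ruby values, without item_that. The variable names below are made up for the demonstration.

```ruby
# Record which operands actually get evaluated.
calls = []
noisy = ->(tag, value) { calls << tag; value }

# && stops as soon as the left side is false.
lazy_result = noisy.call(:left, false) && noisy.call(:right, true)
short_circuit_calls = calls.dup

# & is an ordinary method call, so both operands are evaluated first.
calls.clear
eager_result = noisy.call(:left, false) & noisy.call(:right, true)
eager_calls = calls.dup

# Precedence: & binds tighter than < and ==, so without parentheses
# the expression groups as (x < (44 & (x % 2))) == 0.
x = 10
unparenthesized = x < 44 & x % 2 == 0
parenthesized   = (x < 44) & (x % 2 == 0)

# On Integers, & is bitwise, and the result 0 is still truthy in Ruby.
bitwise = 1 & 2
```

For x = 10 the parenthesized form is true while the unparenthesized one is false, which is why the parentheses in the example above are not optional.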
173
+
174
+ ItemThat is not a Reg. ItemThat expressions that use +, -, *, ~, etc in them will get a deferred operator rather than the Reg meaning. Use reg_that (or #reg on an item_that expression) to make them capable of using the reg operators and methods.
175
+
176
+ item_that.deferred(:===, 'foo')
177
+
178
+ If you absolutely had to defer ===, or anything else, use the #deferred method. For the few
179
+ methods that ItemThat actually implements, deferred provides a way to 'escape' the method call.
180
+
181
+ item_is(Integer)<44
182
+
183
+ item_is is an alias for item_that. Either version can take a Module (well actually, any scalar matcher) parameter which imposes an additional constraint on the query. (The parameter makes more sense with item_is, however.) In this case, the constraint is that
+ the item value must also be an integer, so the whole expression will match any integer less than 44.
186
+
187
+ item_that{|x| (some_complicated_expression).has_property? x }
188
+
189
+ This illustrates the block form of item_that. Block item_that should be used where deferred-style queries
+ break down. (In a past version of reg, the block form was the only one available, and it was called proceq instead of item_that.) The block is consulted to see whether a given item matches. If the block returns false or nil or raises an exception, the match
+ fails. All other values indicate success. This form of item_that still returns a Deferred relative, meaning that most methods of the result will create a Deferred object, which behaves in the usual way. When the Deferred object is ultimately used to match,
+ the block in the center of it is executed first, and its result is passed through the chain of deferred methods attached to it, in (more or less) the order they were given in the source:
193
+
194
+ item_that{|x| y.z(x).w-x }.zero?|item_that.perfect?
195
+
196
+
197
+
198
+ tbd: item_that gotchas
199
+ ! != && || assignment
200
+ the methods it does know, which must be escaped by .deferred(:sym,... if you want that:
201
+ === coerce deferred __id__ __send__ extend mixmod reg inspect formula_value initialize eval_args
202
+
203
+ Hash matchers:
204
+
205
+ +{/foo/=>/bar/, /baz/=>/boff/|nil} or +[/foo/**/bar/, /baz/**(/boff/|nil)]
206
+
207
+ This is a hash matcher, which can match some patterns within hashes. Two forms are given, and they
208
+ actually have slightly different meanings, which I will explain in a moment. A hash matcher is a set of filters applied to key,value pairs in the hash. Each pair has to be matched by some filter, else the entire hash matcher fails. An expression like: /foo/=>/bar/ is a filter, which matches items with a key that matches /foo/ and value that matches /bar/. The above two hash matchers should match these hashes:
209
+
210
+ {"bazzx"=>"boffo the clown", "fool"=>"barfly" } {"cat food"=>"barf"}
211
+
212
+ but not these:
213
+
214
+ {"bazzx"=>"bof"} {"fool"=>"barfly", "quux"=>"zork"} {} {"foo"=>"boff"} {"baz"=>"bar"}
215
+
216
+ Each pair in the hash matched against must match some filter in the hash matcher. Also, each filter of the
217
+ hash matcher must match something in the hash (or be able to match the default value).
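
The two rules just stated can be modelled in plain Ruby and checked against the example hashes listed above. This is a sketch, not Reg's implementation: hash_match? is a hypothetical helper, filters are written as [key_matcher, value_matcher] pairs, and /boff/|nil is approximated by a lambda (Proc#=== calls the lambda).

```ruby
# Rule 1: every pair in the hash is matched by some filter.
# Rule 2: every filter matches some pair, or its value matcher
#         accepts the hash's default value.
def hash_match?(filters, hash)
  every_pair_matched = hash.all? do |key, value|
    filters.any? { |km, vm| km === key && vm === value }
  end

  every_filter_used = filters.all? do |km, vm|
    hash.any? { |key, value| km === key && vm === value } ||
      vm === hash.default
  end

  every_pair_matched && every_filter_used
end

boff_or_nil = ->(value) { value.nil? || /boff/ === value }

filters = [
  [/foo/, /bar/],
  [/baz/, boff_or_nil]
]

should_match = [
  { "bazzx" => "boffo the clown", "fool" => "barfly" },
  { "cat food" => "barf" }
]

should_not_match = [
  { "bazzx" => "bof" },
  { "fool" => "barfly", "quux" => "zork" },
  {},
  { "foo" => "boff" },
  { "baz" => "bar" }
]
```

Under this model {"cat food"=>"barf"} passes because the second filter has no pair to match but its value matcher accepts the default value nil, while {} fails because /bar/ does not accept nil.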
218
+
219
+ Hashes are unordered data structures. With the first form (+{a=>b}), the +@ operator is applied to a hash value, so the order of filters is not preserved. I interpret this to mean that the user wants Reg to determine an appropriate order in which to attempt filters. Reg attempts to assign a sensible order to filters in an unordered hash matcher according to the following rules, based on categorization of the key matchers:
220
+ first, keys of uninteresting matchers and Reg::Equal (and descendants).
221
+ then, keys of regs and other interesting matchers
222
+ then, key of OB (the catchall)
223
+
224
+ Within the second category, order is still unspecified. (So don't depend on order.)
225
+
226
+ Explicit order of filters may be needed in some cases, to assign greater priority to certain filters. That's what ordered matchers (the 2nd form, +[a**b]) are for. The order of filters is respected in ordered hash matchers. The first filter is given a chance to match first, followed by the second and so forth. Within ordered matchers, ** should be understood as a stand-in for =>. Note that unlike =>, ** is unfortunately very high precedence, so its arguments must be surrounded by () if they have operators in them... for
227
+ instance:
228
+
229
+ +[:a ** /b/|/c/] #eventually causes error... parsed like +[(:a**/b/)|/c/]
230
+
231
+
232
+ +[:a ** (/b/|/c/)] #right way
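
The grouping is a property of Ruby's parser, not of Reg, so it can be shown with any class that defines ** and |. Expr below is a hypothetical helper that just records how an expression was grouped.

```ruby
# Each operator returns a string showing how its operands were grouped.
Expr = Struct.new(:text) do
  def **(other)
    Expr.new("(#{text} ** #{other.text})")
  end

  def |(other)
    Expr.new("(#{text} | #{other.text})")
  end
end

a = Expr.new("a")
b = Expr.new("b")
c = Expr.new("c")

without_parens = (a ** b | c).text     # "((a ** b) | c)"
with_parens    = (a ** (b | c)).text   # "(a ** (b | c))"
```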
233
+
234
+
235
+ Empty hashes are matched only by matchers that can have an OB rule that matches the hash's default value. (Or by
236
+ +{}, which matches all Hashes.)
237
+
238
+
239
+ Here's a more complicated hash matcher, demonstrating that both key and value can be arbitrarily complicated expressions. This would match hashes where all the keys are symbols containing 'whatnot', and the values are strings containing 'what' or ('have'
240
+ and 'you'):
241
+ +{/whatnot/.sym => /what/|/have/&/you/}
242
+
243
+ Every item in a Hash must be accounted for by one of the rules in the matcher, else the match fails. (To disable this behavior, add this filter to your matcher: OB=>OB.)
244
+
245
+
246
+ Object matchers:
247
+
248
+ -{:@attr => /something/} (*** or -[:@attr ** /something/] )
249
+
250
+ An object matcher looks like a hash matcher, except you use - instead of +. Object matching is viewed as a special
251
+ case of hash matching, where the keys (of the object, not the matcher) are constrained to be symbols. In the matcher, the keys must match Symbols (or else they are irrelevant).
252
+ The symbol may denote an instance, class, or constant variable. (So, the above example shows the most
253
+ common of these, the instance variable.) The symbol may also (if lowercase) represent a property (method) to be
254
+ called. The keys of an object matcher must match symbols ( or arrays starting with symbols). The following are
255
+ equivalent:
256
+
257
+ -{:method => (40..50)}
258
+ item_that{|x| (40..50)===x.method }
259
+ item_that.method.in?(40..50) #if Object#in?(Enumerable) (from the facets gem) exists
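
As a rough model of the first form, here is a plain-Ruby sketch that handles only two kinds of key: instance variables (symbols starting with @) and zero-argument methods. ObjectSketch and Thermostat are hypothetical names; class variables, constants, symbol arrays and Reg keys are deliberately left out.

```ruby
# Every rule must hold, but the object may have other attributes
# that no rule mentions.
class ObjectSketch
  def initialize(rules)
    @rules = rules
  end

  def ===(obj)
    @rules.all? do |name, matcher|
      if name.to_s.start_with?("@")
        obj.instance_variable_defined?(name) &&
          matcher === obj.instance_variable_get(name)
      else
        obj.respond_to?(name) && matcher === obj.public_send(name)
      end
    end
  end
end

# A made-up class to match against.
class Thermostat
  attr_reader :setting

  def initialize(setting, label)
    @setting = setting
    @label = label
  end
end

comfy_hall = ObjectSketch.new(:setting => (40..50), :@label => /hall/)
```

The matcher reaches @label through instance_variable_get, so no reader method is needed for it; setting is reached by calling the method.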
260
+
261
+ *** If the key of an object matcher is an array beginning with a symbol (hereafter, a symbol array), the symbol denotes the method to be called, and the remaining elements denote the parameters to be passed to it. In this parameter list, most things will be passed through to the method unchanged. These will not: Backreferences
262
+ will be resolved. RegProc objects will be called at match-time (with the match progress...) Literal objects will
263
+ be unwrapped, and then used as-is, so that literal backreference or RegProc objects (or whatever) can be passed in when necessary.
264
+
265
+
266
+ -{[:method,:arg1,:arg2,:arg3] => /agent 99/}
267
+
268
+
269
+ (or.... maybe it could just be item_that, instead.) ***
270
+
271
+ I use the term property here, because one must exercise caution when calling methods in matchers this way. The general problem is one of side effects in matchers, which should be avoided. You can also make trouble for yourself with side effects in item_that expressions and probably other ways I haven't thought of. The result will probably not be what you expect. Do not call methods (using either form, or any other way) within matchers that cause changes of state, in either the current object or elsewhere. If you do, you may well see your side effect run many more times, and against many more objects, than you wanted it to. There are language-approved ways to cause side effects at (or really, after) match time: substitutions, later, side_effect and undo, even backref name binding can be viewed as a type of side effect.
272
+
273
+ *** As in the hash matcher, the key of an object matcher can be a Reg, allowing you to match patterns in variable or method names. (I will attempt to make matchers of Strings work just like matchers of Symbols do here...)
274
+
275
+ Unlike a Hash matcher, Object matchers do not have to account for every element. *** If you want to force every
276
+ instance variable to be accounted for, use a rule like this: OB=>None. (Note: OB is interpreted as /^@[^@]/ here.)
277
+
278
+ differences between ordered and unordered object matchers:
279
+ As with the hash matchers, an implicit order is assigned to filters in unordered object matchers, whereas the order specified in an ordered matcher is strictly respected.
280
+
281
+
282
+
283
+
284
+
285
+
286
+ Matching arrays:
287
+ +[/foo/,/bar/,/baz/]
288
+
289
+ This matches arrays containing exactly 3 (string) items. The first contains 'foo', the second 'bar', and the
290
+ third 'baz'.
291
+
292
+ A Reg::Array looks exactly like a normal array literal, except with the + in front. It provides an Array matcher with capabilities similar to what Regexp provides for String. Each element of the Reg::Array represents one or more (or maybe less!) items in the array to be matched. Normal scalar matchers, (like all those we have met thus far) always match just one array item.
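
For the all-scalar case, where every element of the pattern matches exactly one item, the behaviour can be approximated in plain Ruby as below. array_match? is a hypothetical helper and handles no repetition or subsequences.

```ruby
# Anchored at both ends: same length, and each matcher accepts
# the item in the same position.
def array_match?(matchers, array)
  array.is_a?(Array) &&
    array.size == matchers.size &&
    matchers.zip(array).all? { |matcher, item| matcher === item }
end

pattern = [/foo/, /bar/, /baz/]

exact     = array_match?(pattern, %w[food barn bazaar])
too_short = array_match?(pattern, %w[food barn])
wrong_mid = array_match?(pattern, ["food", 7, "bazaar"])
```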
293
+
294
+ scalar and vector matchers (and variable)
295
+ reg can operate on sequences of objects in a way similar to the way that regexp
296
+ operates on a sequence of characters. each operates on sequences of items, where
297
+ in one case the items are objects and in the other characters.
298
+
299
+ both regexp and reg provide constructs for matching just a single item (scalar
300
+ subexpressions) as well as matching multiple items (vector subexpressions). in
301
+ Regexp, a vector is a String and a scalar is a single character. in Reg, a vector
302
+ is an Array, and a scalar is any item that can be put into an Array (therefore,
303
+ any Object).
304
+
305
+ Vector matchers _don't_ have an === method. Instead, the mmatch
306
+ method matches data in a vector. Mmatch takes extra/different parameters to allow
307
+ more information to be made available to the matching method than === allows. The
308
+ calling conventions of mmatch are still changing; mmatch is at the moment considered
309
+ an 'internal' method, which shouldn't be used by users. In the future, this
310
+ interface will be published, to give users more options in creating and using
311
+ matchers.
312
+
313
+ (Vector matchers may be subdivided into multiple and variable varieties. Multiple
314
+ matchers are those that match a known number of items if they match at all. Variable
315
+ matchers might match any of a number of lengths.)
316
+
317
+ Confusingly, a Reg::Array is still a _scalar_ matcher. It matches a single item (of
318
+ type Array). The contents of the Reg::Array are a vector expression, which match the
319
+ contents of the Array. Reg::Array matches the entire Array contents. Unlike Regexp,
320
+ Reg::Array is effectively anchored on both ends to the underlying array. If you want
321
+ unanchored-like behavior, that can be simulated by putting OBS at both ends. (For
322
+ maximum Regexp-likeness, the first must really be OBS.l, the lazy form.)
323
+
324
+
325
+
326
+ Repetitions:
327
+
328
+ OB #matches any single object
329
+ OBS
330
+
331
+ +[/bar/-1]
332
+ +[/bar/.-]
333
+
334
+ +[/bar/+1]
335
+ +[/bar/.+]
336
+
337
+ +[/bar/+6]
338
+
339
+ +[/bar/*6]
340
+
341
+ the unary suffix forms of +,-,*
342
+
343
+
344
+
345
+ +[/bar/*(6..17)]
346
+
347
+ +[ -[/bar/,/baz/]+5 ]
348
+
349
+
350
+
351
+ backtracking:
352
+
353
+ +[Integer+1, Fixnum]
354
+
355
+
356
+
357
+ Subsequences:
358
+
359
+ +[ -[/bar/,/baz/] | -[/foo/,(1..10).reg*5] | :zork ]
360
+
361
+ +[ (-[/bar/,/baz/] & -[/bif/,/baf/,/bof/])*4 ] #parens needed because of low precedence of &
362
+
363
+ (un)Anchors:
364
+
365
+ +[OBS, -[/bar/,/baz/]+5, OBS]
366
+
367
+ The individual members of a Reg::Array (or Reg::Subseq) can be any type of Reg. Scalars always
368
+ match just one item. Vectors match a variable number of items. The two can be intermixed in
369
+ any combination needed within a sequence or subsequence.
370
+ +[(3..10).reg*5, -[:foo, /bar/-5]|/foof/, -{:foo=>item_that>66}, 14..88 ]
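
Mixing scalars with variable-length vectors is what makes backtracking necessary. The sketch below is a plain-Ruby model, not Reg's matching engine: Rep and seq_match? are hypothetical names, Rep stands in for a repetition with a minimum and maximum count, and the greedy-then-give-back strategy is an assumption of the sketch.

```ruby
# A repeated matcher with inclusive bounds on how many items it takes.
Rep = Struct.new(:matcher, :min, :max)

# Match pattern[p_idx..] against items[i_idx..], anchored at both ends.
def seq_match?(pattern, items, p_idx = 0, i_idx = 0)
  return i_idx == items.size if p_idx == pattern.size

  elem = pattern[p_idx]
  unless elem.is_a?(Rep)
    # Scalar: consumes exactly one item.
    return i_idx < items.size && elem === items[i_idx] &&
           seq_match?(pattern, items, p_idx + 1, i_idx + 1)
  end

  # Vector: count how many consecutive items the repetition could take.
  limit = [elem.max, items.size - i_idx].min
  run = 0
  run += 1 while run < limit && elem.matcher === items[i_idx + run]

  # Try the longest run first, then give items back one at a time.
  run.downto(elem.min).any? do |count|
    seq_match?(pattern, items, p_idx + 1, i_idx + count)
  end
end

many = Float::INFINITY

# One or more Integers followed by one more Integer.
ints_then_int = [Rep.new(Integer, 1, many), Integer]

# Unanchored search for a /bar/ item, with catch-all runs on both sides.
bar_anywhere = [Rep.new(Object, 0, many), /bar/, Rep.new(Object, 0, many)]
```

For [1, 2, 3] the repetition first grabs all three Integers, the trailing scalar then has nothing left, and the match succeeds only after the repetition gives one item back.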
371
+
372
+
373
+ more topics:
374
+
375
+ backreferences ***
376
+
377
+ named backreferences ***
378
+
379
+ substitutions ***
380
+
381
+
382
+ named subexpressions
383
+
384
+ Reg::var and recursive matching
385
+
386
+ dynamic creation of reg subexpressions
387
+
388
+ actions in the middle of a match (later, side_effect and undo) ***
389
+
390
+ sep and splitter
391
+
392
+ literals
393
+
394
+ :AND,:OR,:XOR ***
395
+
396
+ lookahead and negative lookahead ***
397
+
398
+ lookback ***
399
+
400
+ laziness ( (reg1+1).l ) ***
401
+
402
+ lexing ***
403
+
404
+ parsing => see parser.txt, calc.reg ***
405
+
406
+ the medium and long reg literal names syntaxes
407
+
408
+ avoiding sugar (and why you would want to)
409
+
410
+ Reg::Progress
411
+
412
+ MatchSets and next_match
413
+
414
+ depth-mostly matches ***
415
+
416
+ IntegerSet ***