regexp-examples 1.2.0 → 1.2.1
- checksums.yaml +4 -4
- data/.travis.yml +3 -0
- data/README.md +58 -45
- data/db/unicode_ranges_2.1.pstore +1 -0
- data/lib/core_extensions/regexp/examples.rb +1 -1
- data/lib/regexp-examples/groups.rb +6 -15
- data/lib/regexp-examples/version.rb +1 -1
- metadata +3 -3
- data/db/unicode_ranges_2.1.pstore +0 -0
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: d049ada3d2ac8bd900ed9b1d03066b28f805ae72
|
4
|
+
data.tar.gz: b68954dec1f8db36d531244ddb3dbbbb2a6edd92
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: a2db5421c17dfc6eec5e99caa78c95320432f8d1ace3d5647d02e8d8356518d360bd6409ec69ee8290e6da51b0ee1fbc30ca63dfe22855262ea0ba43e2e31e65
|
7
|
+
data.tar.gz: e9bb1f444f866dff086953c0e851846f54f9cdf2139e94bfa51f7430277c8201c79c34eaaf459cc6d6f637e0e0f53ac1ecbd2144f6ae7edeaa2a1d2334bef8d7
|
data/.travis.yml
CHANGED
data/README.md
CHANGED
@@ -4,7 +4,7 @@
|
|
4
4
|
[![Coverage Status](https://coveralls.io/repos/tom-lord/regexp-examples/badge.svg?branch=master)](https://coveralls.io/r/tom-lord/regexp-examples?branch=master)
|
5
5
|
[![Code Climate](https://codeclimate.com/github/tom-lord/regexp-examples/badges/gpa.svg)](https://codeclimate.com/github/tom-lord/regexp-examples)
|
6
6
|
|
7
|
-
Extends the Regexp class with the methods: `Regexp#examples` and `Regexp#random_example`
|
7
|
+
Extends the `Regexp` class with the methods: `Regexp#examples` and `Regexp#random_example`
|
8
8
|
|
9
9
|
`Regexp#examples` generates a list of all\* strings that will match the given regular expression.
|
10
10
|
|
@@ -19,6 +19,8 @@ If you'd like to understand how/why this gem works, please check out my [blog po
|
|
19
19
|
|
20
20
|
## Usage
|
21
21
|
|
22
|
+
#### Regexp#examples
|
23
|
+
|
22
24
|
```ruby
|
23
25
|
/a*/.examples #=> ['', 'a', 'aa']
|
24
26
|
/ab+/.examples #=> ['ab', 'abb', 'abbb']
|
@@ -36,11 +38,15 @@ If you'd like to understand how/why this gem works, please check out my [blog po
|
|
36
38
|
|
|
37
39
|
\u{28}\u2310\u25a0\u{5f}\u25a0\u{29}
|
38
40
|
/x.examples #=> ["(•_•)", "( •_•)>⌐■-■ ", "(⌐■_■)"]
|
41
|
+
```
|
39
42
|
|
40
|
-
|
43
|
+
#### Regexp#random_example
|
41
44
|
|
42
|
-
|
45
|
+
Obviously, you will get different (random) results if you try these yourself!
|
46
|
+
|
47
|
+
```ruby
|
43
48
|
/\w{10}@(hotmail|gmail)\.com/.random_example #=> "TTsJsiwzKS@gmail.com"
|
49
|
+
/5[1-5][0-9]{14}/.random_example #=> "5224028604559821" (Matches the MasterCard number format)
|
44
50
|
/\p{Greek}{80}/.random_example
|
45
51
|
#=> "ΖΆΧͷᵦμͷηϒϰΟᵝΔ΄θϔζΌψΨεκᴪΓΕπι϶ονϵΓϹᵦΟπᵡήϴϜΦϚϴϑ͵ϴΉϺ͵ϹϰϡᵠϝΤΏΨϹϊϻαώΞΰϰΑͼΈΘͽϙͽξΆΆΡΡΉΓς"
|
46
52
|
/written by tom lord/i.random_example #=> "WrITtEN bY tOM LORD"
|
@@ -109,7 +115,7 @@ Long answer:
|
|
109
115
|
* Octal characters, e.g. `/\10/`, `/\177/`
|
110
116
|
* Named properties, e.g. `/\p{L}/` ("Letter"), `/\p{Arabic}/` ("Arabic character")
|
111
117
|
, `/\p{^Ll}/` ("Not a lowercase letter"), `/\P{^Canadian\_Aboriginal}/` ("Not not a Canadian aboriginal character")
|
112
|
-
* ...Even between different ruby versions!! (e.g. `/\p{Arabic}/.examples(
|
118
|
+
* ...Even between different ruby versions!! (e.g. `/\p{Arabic}/.examples(max_group_results: 999)` will give you a different answer in ruby v2.1.x and v2.2.x)
|
113
119
|
* **Arbitrarily complex combinations of all the above!**
|
114
120
|
|
115
121
|
* Regexp options can also be used:
|
@@ -118,81 +124,88 @@ Long answer:
|
|
118
124
|
* Extended form examples: `/line1 #comment \n line2/x.examples #=> ["line1line2"]`
|
119
125
|
* Options toggling supported: `/before(?imx-imx)after/`, `/before(?imx-imx:subexpr)after/`
|
120
126
|
|
121
|
-
## Bugs and Not-Yet-Supported syntax
|
122
|
-
|
123
|
-
* There are some (rare) edge cases where backreferences do not work properly, e.g. `/(a\*)a\* \1/.examples` - which includes "aaaa aa". This is because each repeater is not context-aware, so the "greediness" logic is flawed. (E.g. in this case, the second `a\*` should always evaluate to an empty string, because the previous `a\*` was greedy!) However, patterns like this are highly unusual...
|
124
|
-
|
125
|
-
Since the Regexp language is so vast, it's quite likely I've missed something (please raise an issue if you find something)! The only missing feature that I'm currently aware of is:
|
126
|
-
* Conditional capture groups, e.g. `/(group1)? (?(1)yes|no)/.examples` (which *should* return: `["group1 yes", " no"]`)
|
127
|
-
|
128
|
-
Some of the most obscure regexp features are not even mentioned in the ruby docs. However, full documentation on all the intricate obscurities in the ruby (version 2.x) regexp parser can be found [here](https://raw.githubusercontent.com/k-takata/Onigmo/master/doc/RE).
|
129
|
-
|
130
|
-
## Impossible features ("illegal syntax")
|
131
|
-
|
132
|
-
The following features in the regex language can never be properly implemented into this gem because, put simply, they are not technically "regular"!
|
133
|
-
If you'd like to understand this in more detail, check out what I had to say in [my blog post](http://tom-lord.weebly.com/blog/reverse-engineering-regular-expressions) about this gem.
|
134
|
-
|
135
|
-
Using any of the following will raise a RegexpExamples::IllegalSyntax exception:
|
136
|
-
|
137
|
-
* Lookarounds, e.g. `/foo(?=bar)/`, `/foo(?!bar)/`, `/(?<=foo)bar/`, `/(?<\!foo)bar/`
|
138
|
-
* [Anchors](http://ruby-doc.org/core-2.2.0/Regexp.html#class-Regexp-label-Anchors) (`\b`, `\B`, `\G`, `^`, `\A`, `$`, `\z`, `\Z`), e.g. `/\bword\b/`, `/line1\n^line2/`
|
139
|
-
* However, a special case has been made to allow `^`, `\A` and `\G` at the start of a pattern; and to allow `$`, `\z` and `\Z` at the end of a pattern. In such cases, the characters are effectively just ignored.
|
140
|
-
* Subexpression calls (`\g`), e.g. `/(?<name> ... \g<name>\* )/`
|
141
|
-
|
142
|
-
(Note: Backreferences are not really "regular" either, but I got these to work with a bit of hackery.)
|
143
|
-
|
144
127
|
## Configuration Options
|
145
128
|
|
146
129
|
When generating examples, the gem uses 3 configurable values to limit how many examples are listed:
|
147
130
|
|
148
|
-
* `
|
149
|
-
*
|
131
|
+
* `max_repeater_variance` (default = `2`) restricts how many examples to return for each repeater. For example:
|
132
|
+
* `.*` is equivalent to `.{0,2}`
|
150
133
|
* `.+` is equivalent to `.{1,3}`
|
151
134
|
* `.{2,}` is equivalent to `.{2,4}`
|
152
135
|
* `.{,3}` is equivalent to `.{0,2}`
|
153
136
|
* `.{3,8}` is equivalent to `.{3,5}`
|
154
137
|
|
155
|
-
* `
|
138
|
+
* `max_group_results` (default = `5`) restricts how many characters to return for each "set". For example:
|
156
139
|
* `\d` is equivalent to `[01234]`
|
157
140
|
* `\w` is equivalent to `[abcde]`
|
158
141
|
* `[h-s]` is equivalent to `[hijkl]`
|
159
142
|
* `(1|2|3|4|5|6|7|8)` is equivalent to `[12345]`
|
160
143
|
|
161
|
-
* `
|
162
|
-
* `/
|
144
|
+
* `max_results_limit` (default = `10000`) restricts the maximum number of results that can possibly be generated. For example:
|
145
|
+
* `/c+r+a+z+y+ * B+I+G+ * r+e+g+e+x+/i.examples.length <= 10000` -- Attempting this will NOT freeze your system, even though
|
146
|
+
(by the above rules) this "should" attempt to generate **117546246144** examples.
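
The figure quoted above can be reproduced with a little arithmetic. The sketch below is one reading that happens to match it, not the gem's own code: it assumes each of the 13 case-insensitive letter repeaters contributes 6 variants (3 repeat counts in 2 cases) and each of the two ` *` repeaters contributes 3. The variable names are hypothetical.

```ruby
# Back-of-envelope count for /c+r+a+z+y+ * B+I+G+ * r+e+g+e+x+/i.
# Assumptions (mine, not taken from the gem's source):
#   - each letter repeater yields 3 lengths x 2 cases = 6 variants
#   - each " *" repeater yields 3 variants (0, 1 or 2 spaces)
letter_repeaters = "crazy".length + "BIG".length + "regex".length # 13
space_repeaters = 2

theoretical_total = (6**letter_repeaters) * (3**space_repeaters)
puts theoretical_total # prints 117546246144
```

That total is roughly seven orders of magnitude above the default `max_results_limit` of `10000`, which is why the cap matters for patterns like this.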
|
163
147
|
|
164
|
-
`Regexp#examples` makes use of *all* these options; `Regexp#
|
148
|
+
`Regexp#examples` makes use of *all* these options; `Regexp#random_example` only uses `max_repeater_variance`, since the other options are redundant.
|
165
149
|
|
166
150
|
To use an alternative value, simply pass the configuration option as follows:
|
167
151
|
|
168
152
|
```ruby
|
169
|
-
/a*/.examples(
|
153
|
+
/a*/.examples(max_repeater_variance: 5)
|
170
154
|
#=> ['', 'a', 'aa', 'aaa', 'aaaa', 'aaaaa']
|
171
|
-
/[F-X]/.examples(
|
155
|
+
/[F-X]/.examples(max_group_results: 10)
|
172
156
|
#=> ['F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O']
|
173
|
-
/[ab]{10}/.examples(
|
174
|
-
/[slow]{9}/.examples(
|
175
|
-
|
157
|
+
/[ab]{10}/.examples(max_results_limit: 64).length == 64 # NOT 1024
|
158
|
+
/[slow]{9}/.examples(max_results_limit: 9999999).length == 4**9 # 262144. Warning - this will take a while!
|
159
|
+
/.*/.random_example(max_repeater_variance: 50)
|
176
160
|
#=> "A very unlikely result!"
|
177
161
|
```
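
The `max_repeater_variance` equivalences listed earlier can be summarised as a small rule: start at the repeater's minimum, allow up to `max_repeater_variance` extra repeats, and never exceed the maximum written in the pattern. The sketch below only illustrates that rule. `effective_repeat_range` is a hypothetical helper, not a method of the gem, and the gem may compute this differently internally.

```ruby
# Hypothetical helper (not part of regexp-examples): the repeat counts a
# repeater is limited to, following the equivalences listed in the README.
# min/max are the bounds written in the pattern; max is nil when unbounded.
def effective_repeat_range(min, max, max_repeater_variance = 2)
  upper = min + max_repeater_variance
  upper = [upper, max].min unless max.nil?
  (min..upper)
end

p effective_repeat_range(0, nil)    # .*     prints 0..2
p effective_repeat_range(1, nil)    # .+     prints 1..3
p effective_repeat_range(2, nil)    # .{2,}  prints 2..4
p effective_repeat_range(0, 3)      # .{,3}  prints 0..2
p effective_repeat_range(3, 8)      # .{3,8} prints 3..5
p effective_repeat_range(0, nil, 5) # a* with max_repeater_variance: 5, prints 0..5
```

The last call lines up with the six results shown for `/a*/.examples(max_repeater_variance: 5)`.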
|
178
162
|
|
179
163
|
A sensible use case might be, for example, to generate all 1-5 digit strings:
|
180
164
|
|
181
165
|
```ruby
|
182
|
-
/\d{1,5}/.examples(
|
166
|
+
/\d{1,5}/.examples(max_repeater_variance: 4, max_group_results: 10, max_results_limit: 100000)
|
183
167
|
#=> ['0', '1', '2', ..., '99998', '99999']
|
184
168
|
```
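
As a sanity check on the option values, here is a quick count of how many 1-5 digit strings exist. This is plain arithmetic, independent of the gem, and says nothing about how the gem applies its limits.

```ruby
# Count every string of 1 to 5 decimal digits: 10 + 100 + ... + 100000.
total = (1..5).map { |length| 10**length }.reduce(:+)
puts total # prints 111110
```

If that count is right, `max_results_limit: 100000` sits slightly below the size of the full set, so a somewhat higher limit may be needed to see every string.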
|
185
169
|
|
186
|
-
Due to code optimisation, `Regexp#
|
187
|
-
(I.e. It's a _lot_ faster than using `/pattern/.
|
170
|
+
Due to code optimisation, `Regexp#random_example` runs pretty fast even on very complex patterns.
|
171
|
+
(I.e. It's typically a _lot_ faster than using `/pattern/.examples.sample(1)`.)
|
188
172
|
For instance, the following takes no more than ~ 1 second on my machine:
|
189
173
|
|
190
|
-
|
174
|
+
`/.*\w+\d{100}/.random_example(max_repeater_variance: 1000)`
|
175
|
+
|
176
|
+
## Bugs and TODOs
|
177
|
+
|
178
|
+
There are no known major bugs with this library. However, there are a few obscure issues that you *may* encounter:
|
179
|
+
|
180
|
+
* Conditional capture groups, e.g. `/(group1)? (?(1)yes|no)/.examples` are not yet supported. (This example *should* return: `["group1 yes", " no"]`)
|
181
|
+
* `\Z` should be interpreted like `\n?\z`; it's currently just interpreted like `\z`. (This basically just means you'll be missing a few examples.)
|
182
|
+
* Ideally, `regexp#examples` should always return up to `max_results_limit`. Currently, it usually "aborts" before this limit is reached.
|
183
|
+
(I.e. the exact number of examples generated can be hard to predict, for complex patterns.)
|
184
|
+
* There are some (rare) edge cases where backreferences do not work properly, e.g. `/(a*)a* \1/.examples` -
|
185
|
+
which includes `"aaaa aa"`. This is because each repeater is not context-aware, so the "greediness" logic is flawed.
|
186
|
+
(E.g. in this case, the second `a*` should always evaluate to an empty string, because the previous `a*` was greedy.)
|
187
|
+
However, patterns like this are highly unusual...
|
188
|
+
* Nested repeat operators are incorrectly parsed, e.g. `/b{2}{3}/` - which *should* be interpreted like `/b{6}/`. (However, there is probably no reason
|
189
|
+
to ever write regexes like this!)
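
For context on the conditional capture group item, the sketch below shows how Ruby's own regexp engine treats such a pattern, which is where the expected `["group1 yes", " no"]` comes from. It does not use the gem. The `\A` and `\z` anchors and the candidate strings are additions for illustration, so that only whole-string matches count.

```ruby
# Ruby's regexp engine accepts conditional groups: (?(1)yes|no) matches
# "yes" if capture group 1 took part in the match, otherwise "no".
# \A and \z are added here so that only whole-string matches count.
pattern = /\A(group1)? (?(1)yes|no)\z/

candidates = ["group1 yes", " no", "group1 no", " yes"]
matching = candidates.select { |candidate| !(pattern =~ candidate).nil? }

p matching
```

Only the two strings named in the README should survive the filter; the mixed combinations are rejected.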
|
190
|
+
|
191
|
+
Some of the most obscure regexp features are not even mentioned in [the ruby docs](http://ruby-doc.org/core/Regexp.html).
|
192
|
+
However, full documentation on all the intricate obscurities in the ruby (version 2.x) regexp parser can be found
|
193
|
+
[here](https://raw.githubusercontent.com/k-takata/Onigmo/master/doc/RE).
|
194
|
+
|
195
|
+
## Impossible features ("illegal syntax")
|
196
|
+
|
197
|
+
The following features in the regex language can never be properly implemented into this gem because, put simply, they are not technically "regular"!
|
198
|
+
If you'd like to understand this in more detail, check out what I had to say in [my blog post](http://tom-lord.weebly.com/blog/reverse-engineering-regular-expressions) about this gem.
|
191
199
|
|
192
|
-
|
200
|
+
Using any of the following will raise a `RegexpExamples::IllegalSyntax` exception:
|
193
201
|
|
194
|
-
*
|
195
|
-
* `\
|
202
|
+
* Lookarounds, e.g. `/foo(?=bar)/`, `/foo(?!bar)/`, `/(?<=foo)bar/`, `/(?<!foo)bar/`
|
203
|
+
* [Anchors](http://ruby-doc.org/core-2.2.0/Regexp.html#class-Regexp-label-Anchors) (`\b`, `\B`, `\G`, `^`, `\A`, `$`, `\z`, `\Z`), e.g. `/\bword\b/`, `/line1\n^line2/`
|
204
|
+
* Anchors are really just special cases of lookarounds!
|
205
|
+
* However, a special case has been made to allow `^`, `\A` and `\G` at the start of a pattern; and to allow `$`, `\z` and `\Z` at the end of a pattern. In such cases, the characters are effectively just ignored.
|
206
|
+
* Subexpression calls (`\g`), e.g. `/(?<name> ... \g<name>* )/`
|
207
|
+
|
208
|
+
(Note: Backreferences are not really "regular" either, but I got these to work with a bit of hackery.)
|
196
209
|
|
197
210
|
## Contributing
|
198
211
|
|
@@ -0,0 +1 @@
|
|
1
|
+
unicode_ranges_2.0.pstore
|
@@ -17,7 +17,7 @@ module CoreExtensions
|
|
17
17
|
RegexpExamples::ResultCountLimiters.configure!(
|
18
18
|
max_repeater_variance: config_options[:max_repeater_variance]
|
19
19
|
)
|
20
|
-
examples_by_method(:random_result).first
|
20
|
+
examples_by_method(:random_result).sample(1).first
|
21
21
|
end
|
22
22
|
|
23
23
|
private
|
@@ -20,27 +20,17 @@ module RegexpExamples
|
|
20
20
|
end
|
21
21
|
end
|
22
22
|
|
23
|
-
# A helper method for mixing in to Group classes...
|
24
|
-
# Needed because sometimes (for performance) group results are lazy enumerators;
|
25
|
-
# Meanwhile other times (again, for performance!) group results are just arrays
|
26
|
-
module ForceLazyEnumerators
|
27
|
-
def force_if_lazy(arr_or_enum)
|
28
|
-
arr_or_enum.respond_to?(:force) ? arr_or_enum.force : arr_or_enum
|
29
|
-
end
|
30
|
-
end
|
31
|
-
|
32
23
|
# A helper method for mixing in to Group classes...
|
33
24
|
# Needed for generating a complete results set when the ignorecase
|
34
25
|
# regexp option has been set
|
35
26
|
module GroupWithIgnoreCase
|
36
|
-
include ForceLazyEnumerators
|
37
27
|
attr_reader :ignorecase
|
38
28
|
def result
|
39
29
|
group_result = super
|
40
30
|
if ignorecase
|
41
|
-
|
42
|
-
|
43
|
-
.concat(
|
31
|
+
group_result
|
32
|
+
.to_a # In case of lazy enumerator
|
33
|
+
.concat(group_result.to_a.map(&:swapcase))
|
44
34
|
.uniq
|
45
35
|
else
|
46
36
|
group_result
|
@@ -52,9 +42,10 @@ module RegexpExamples
|
|
52
42
|
# Uses Array#sample to randomly choose one result from all
|
53
43
|
# possible examples
|
54
44
|
module RandomResultBySample
|
55
|
-
include ForceLazyEnumerators
|
56
45
|
def random_result
|
57
|
-
|
46
|
+
result
|
47
|
+
.to_a # In case of lazy enumerator
|
48
|
+
.sample(1)
|
58
49
|
end
|
59
50
|
end
|
60
51
|
|
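
The `groups.rb` hunks above replace the `ForceLazyEnumerators` mixin with plain `.to_a` calls. A minimal sketch of why that works, using only core Ruby: `to_a` returns an `Array` for both arrays and lazy enumerators, so no `respond_to?(:force)` check is needed. `with_case_variants` is a hypothetical name that mirrors the new `ignorecase` branch; it is not a method in the gem.

```ruby
# Hypothetical stand-in for the ignorecase branch shown in the diff.
# Works for an Array or an Enumerator::Lazy, since both respond to #to_a.
def with_case_variants(group_result)
  group_result
    .to_a # In case of lazy enumerator
    .concat(group_result.to_a.map(&:swapcase))
    .uniq
end

eager_results = %w[a b]
lazy_results = %w[a b].lazy.map { |s| s * 2 }

p with_case_variants(eager_results) # prints ["a", "b", "A", "B"]
p with_case_variants(lazy_results)  # prints ["aa", "bb", "AA", "BB"]

# Array#sample(1) returns a one-element Array rather than a bare element,
# which is why the caller still unwraps the value with .first.
picked = lazy_results.to_a.sample(1)
p picked.length # prints 1
```

One side effect to be aware of: `Array#to_a` returns the receiver itself, so `concat` mutates an array that is passed in directly, whereas a lazy enumerator is left untouched.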
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: regexp-examples
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 1.2.
|
4
|
+
version: 1.2.1
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Tom Lord
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2016-
|
11
|
+
date: 2016-05-18 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: bundler
|
@@ -101,7 +101,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
101
101
|
version: '0'
|
102
102
|
requirements: []
|
103
103
|
rubyforge_project:
|
104
|
-
rubygems_version: 2.
|
104
|
+
rubygems_version: 2.5.1
|
105
105
|
signing_key:
|
106
106
|
specification_version: 4
|
107
107
|
summary: Extends the Regexp class with '#examples' and '#random_example'
|
Binary file
|