regexp-examples 1.0.0 → 1.0.1
- checksums.yaml +4 -4
- data/README.md +16 -12
- data/lib/regexp-examples/parser.rb +7 -6
- data/lib/regexp-examples/version.rb +1 -1
- data/spec/regexp-examples_spec.rb +5 -1
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: fc845182adb1adaeed70de6139d27711a69dc81f
|
4
|
+
data.tar.gz: 3d1850382acaf7ee4c96c9acf924a585cec6939b
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 0b2a8ff8619ba8bc4186a27491dac7140ff8e0d7e4cb87ccfd8e9047b0f392c7152e2988666edaf4df6a1d2db0961ac1dfdc0af32b9d50c3ddbda6ff60814c97
|
7
|
+
data.tar.gz: f3690a8f6d2089b57a57246d9ec2b25252a2bee5972536ef095bf45358348ce1c322e3d3b04b69121e8530e155c2c9c4542f5f48cdc6a83ea6366eb246af3f94
|
data/README.md
CHANGED
@@ -15,7 +15,7 @@ For more detail on this, see [configuration options](#configuration-options).
|
|
15
15
|
## Usage
|
16
16
|
|
17
17
|
```ruby
|
18
|
-
/a*/.examples #=> [''
|
18
|
+
/a*/.examples #=> ['', 'a', 'aa']
|
19
19
|
/ab+/.examples #=> ['ab', 'abb', 'abbb']
|
20
20
|
/this|is|awesome/.examples #=> ['this', 'is', 'awesome']
|
21
21
|
/https?:\/\/(www\.)?github\.com/.examples #=> ['http://github.com',
|
@@ -23,7 +23,8 @@ For more detail on this, see [configuration options](#configuration-options).
|
|
23
23
|
/(I(N(C(E(P(T(I(O(N)))))))))*/.examples #=> ["", "INCEPTION", "INCEPTIONINCEPTION"]
|
24
24
|
/\x74\x68\x69\x73/.examples #=> ["this"]
|
25
25
|
/\u6829/.examples #=> ["栩"]
|
26
|
-
/what about (backreferences\?) \1/.examples
|
26
|
+
/what about (backreferences\?) \1/.examples
|
27
|
+
#=> ['what about backreferences? backreferences?']
|
27
28
|
```
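The documented outputs above can be sanity-checked without installing the gem. The sketch below uses only core Ruby; `fully_matches?` and the `documented` table are hypothetical helpers written for this illustration, not part of regexp-examples. It confirms only that each listed string is a full match of its pattern, not that the gem returns exactly these lists.

```ruby
# Pattern => example strings, copied from the README usage section above.
documented = {
  /a*/ => ['', 'a', 'aa'],
  /ab+/ => ['ab', 'abb', 'abbb'],
  /this|is|awesome/ => ['this', 'is', 'awesome'],
  /what about (backreferences\?) \1/ =>
    ['what about backreferences? backreferences?']
}

# Hypothetical helper: true when the whole string (not just a substring)
# matches the pattern. The non-capturing wrapper keeps group numbering intact,
# so \1 still refers to the first capture group.
def fully_matches?(regexp, string)
  /\A(?:#{regexp.source})\z/.match?(string)
end

all_ok = documented.all? do |regexp, examples|
  examples.all? { |example| fully_matches?(regexp, example) }
end

puts all_ok
```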
|
28
29
|
|
29
30
|
## Installation
|
@@ -45,9 +46,9 @@ Or install it yourself as:
|
|
45
46
|
## Supported syntax
|
46
47
|
|
47
48
|
* All forms of repeaters (quantifiers), e.g. `/a*/`, `/a+/`, `/a?/`, `/a{1,4}/`, `/a{3,}/`, `/a{,2}/`
|
48
|
-
* Reluctant and possissive repeaters work fine, too
|
49
|
+
* Reluctant and possissive repeaters work fine, too, e.g. `/a*?/`, `/a*+/`
|
49
50
|
* Boolean "Or" groups, e.g. `/a|b|c/`
|
50
|
-
* Character sets e.g. `/[abc]/` - including:
|
51
|
+
* Character sets, e.g. `/[abc]/` - including:
|
51
52
|
* Ranges, e.g.`/[A-Z0-9]/`
|
52
53
|
* Negation, e.g. `/[^a-z]/`
|
53
54
|
* Escaped characters, e.g. `/[\w\s\b]/`
|
@@ -57,14 +58,15 @@ Or install it yourself as:
|
|
57
58
|
* Capture groups, e.g. `/(group)/`
|
58
59
|
* Including named groups, e.g. `/(?<name>group)/`
|
59
60
|
* ...And backreferences(!!!), e.g. `/(this|that) \1/` `/(?<name>foo) \k<name>/`
|
60
|
-
* Groups work fine, even if nested or optional e.g. `/(even(this(works?))) \1 \2 \3/`, `/what about (this)? \1/`
|
61
|
+
* Groups work fine, even if nested or optional, e.g. `/(even(this(works?))) \1 \2 \3/`, `/what about (this)? \1/`
|
61
62
|
* Non-capture groups, e.g. `/(?:foo)/`
|
62
63
|
* Comment groups, e.g. `/foo(?#comment)bar/`
|
63
64
|
* Control characters, e.g. `/\ca/`, `/\cZ/`, `/\C-9/`
|
64
65
|
* Escape sequences, e.g. `/\x42/`, `/\x5word/`, `/#{"\x80".force_encoding("ASCII-8BIT")}/`
|
65
66
|
* Unicode characters, e.g. `/\u0123/`, `/\uabcd/`, `/\u{789}/`
|
66
67
|
* Octal characters, e.g. `/\10/`, `/\177/`
|
67
|
-
* Named properties, e.g. `/\p{L}/` ("Letter"), `/\p{Arabic}/` ("Arabic character")
|
68
|
+
* Named properties, e.g. `/\p{L}/` ("Letter"), `/\p{Arabic}/` ("Arabic character")
|
69
|
+
, `/\p{^Ll}/` ("Not a lowercase letter"), `\P{^Canadian_Aboriginal}` ("Not not a Canadian aboriginal character")
|
68
70
|
* **Arbitrarily complex combinations of all the above!**
|
69
71
|
|
70
72
|
* Regexp options can also be used:
|
@@ -76,11 +78,12 @@ Or install it yourself as:
|
|
76
78
|
## Bugs and Not-Yet-Supported syntax
|
77
79
|
|
78
80
|
* There are some (rare) edge cases where backreferences do not work properly, e.g. `/(a*)a* \1/.examples` - which includes "aaaa aa". This is because each repeater is not context-aware, so the "greediness" logic is flawed. (E.g. in this case, the second `a*` should always evaluate to an empty string, because the previous `a*` was greedy! However, patterns like this are highly unusual...
|
79
|
-
* Some named properties, e.g. `/\p{Arabic}/`, list non-matching examples for ruby 2.0/2.1 (as the definitions changed in ruby 2.2). This
|
81
|
+
* Some named properties, e.g. `/\p{Arabic}/`, list non-matching examples for ruby 2.0/2.1 (as the definitions changed in ruby 2.2). This will be fixed in version 1.1.0 (see the pending pull request)!
|
80
82
|
|
81
|
-
There are also some various (increasingly obscure) unsupported bits of syntax
|
83
|
+
There are also some various (increasingly obscure) unsupported bits of syntax; some of which I haven't yet investigated. Much of this is not even mentioned in the ruby docs! Full documentation on all the intricate obscurities in the ruby (version 2.x) regexp parser can be found [here](https://raw.githubusercontent.com/k-takata/Onigmo/master/doc/RE). To name a few:
|
82
84
|
* Conditional capture groups, e.g. `/(group1)? (?(1)yes|no)/.examples` (which *should* return: `["group1 yes", " no"]`)
|
83
|
-
* Back reference by
|
85
|
+
* Back reference by relative group number, e.g. `/(a)(b)(c)(d) \k<-2>/.examples` (which *should* return: `["abcd c"]`)
|
86
|
+
* Back reference using single quotes, and for group numbers, e.g. `/(a) \k'1'/.examples` (which is really just alternative syntax for `/(a) \1/`!)
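The three unsupported forms listed above are accepted by Ruby's own regexp engine, which is why the README can state what the examples *should* be. A minimal sketch in core Ruby (no gem required) that checks the expected strings against each pattern; the variable names are illustrative only, and the patterns are anchored with `\A` and `\z` so that only whole-string matches count.

```ruby
# Back reference by relative group number: \k<-2> counts backwards from the
# reference, so it points at the third group, (c).
relative = /\A(a)(b)(c)(d) \k<-2>\z/

# Back reference using single quotes and a group number: same meaning as \1.
quoted = /\A(a) \k'1'\z/

# Conditional capture group: "yes" if group 1 took part in the match,
# otherwise "no".
conditional = /\A(group1)? (?(1)yes|no)\z/

results = {
  relative: relative.match?("abcd c"),
  quoted: quoted.match?("a a"),
  conditional_yes: conditional.match?("group1 yes"),
  conditional_no: conditional.match?(" no")
}

puts results
```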
|
84
87
|
|
85
88
|
## Impossible features ("illegal syntax")
|
86
89
|
|
@@ -92,7 +95,7 @@ Using any of the following will raise a RegexpExamples::IllegalSyntax exception:
|
|
92
95
|
* Lookarounds, e.g. `/foo(?=bar)/`, `/foo(?!bar)/`, `/(?<=foo)bar/`, `/(?<!foo)bar/`
|
93
96
|
* [Anchors](http://ruby-doc.org/core-2.2.0/Regexp.html#class-Regexp-label-Anchors) (`\b`, `\B`, `\G`, `^`, `\A`, `$`, `\z`, `\Z`), e.g. `/\bword\b/`, `/line1\n^line2/`
|
94
97
|
* However, a special case has been made to allow `^`, `\A` and `\G` at the start of a pattern; and to allow `$`, `\z` and `\Z` at the end of pattern. In such cases, the characters are effectively just ignored.
|
95
|
-
* Subexpression calls, e.g. `/(?<name> ... \g<name>* )/`
|
98
|
+
* Subexpression calls (`\g`), e.g. `/(?<name> ... \g<name>* )/`
|
96
99
|
|
97
100
|
(Note: Backreferences are not really "regular" either, but I got these to work with a bit of hackery!)
|
98
101
|
|
@@ -137,8 +140,9 @@ A more sensible use case might be, for example, to generate one random 1-4 digit
|
|
137
140
|
## TODO
|
138
141
|
|
139
142
|
* Performance improvements:
|
140
|
-
* Use of lambdas/something (in [constants.rb](lib/regexp-examples/constants.rb)) to improve the library load time.
|
141
|
-
* (Maybe?) add a `max_examples` configuration option and use lazy evaluation, to ensure the method never "freezes"
|
143
|
+
* Use of lambdas/something (in [constants.rb](lib/regexp-examples/constants.rb)) to improve the library load time. See the pending pull request.
|
144
|
+
* (Maybe?) add a `max_examples` configuration option and use lazy evaluation, to ensure the method never "freezes".
|
145
|
+
* Potential future feature: `Regexp#random_example` - but implementing this properly is non-trivial, due to performance issues that need addressing first!
|
142
146
|
* Write a blog post about how this amazing gem works! :)
|
143
147
|
|
144
148
|
## Contributing
|
data/lib/regexp-examples/parser.rb
CHANGED
@@ -99,13 +99,14 @@ module RegexpExamples
|
|
99
99
|
@current_position += $1.length
|
100
100
|
sequence = $1.match(/\h{1,4}/)[0] # Strip off "{" and "}"
|
101
101
|
group = parse_single_char_group( parse_unicode_sequence(sequence) )
|
102
|
-
when rest_of_string =~ /\
|
103
|
-
@current_position += ($
|
102
|
+
when rest_of_string =~ /\A(p)\{(\^?)([^}]+)\}/i # Named properties
|
103
|
+
@current_position += ($2.length + $3.length + 2)
|
104
|
+
is_negative = ($1 == "P") ^ ($2 == "^") # Beware of double negatives! E.g. /\P{^Space}/
|
104
105
|
group = CharGroup.new(
|
105
|
-
if
|
106
|
-
CharSets::Any.dup - NamedPropertyCharMap[$
|
106
|
+
if is_negative
|
107
|
+
CharSets::Any.dup - NamedPropertyCharMap[$3.downcase]
|
107
108
|
else
|
108
|
-
NamedPropertyCharMap[$
|
109
|
+
NamedPropertyCharMap[$3.downcase]
|
109
110
|
end,
|
110
111
|
@ignorecase
|
111
112
|
)
|
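The double-negative handling in this hunk can be tried in isolation. The sketch below reuses the regexp and the XOR from the added lines, but `named_property_info` and `NAMED_PROPERTY` are hypothetical names invented for this illustration; the real parser works on instance state such as `@current_position` rather than returning a hash. As in the hunk, the input is assumed to start at the `p` or `P`, with the backslash already consumed.

```ruby
# Same pattern as the added `when` clause: $1 is "p" or "P", $2 is an
# optional "^", $3 is the property name.
NAMED_PROPERTY = /\A(p)\{(\^?)([^}]+)\}/i

# Hypothetical helper (not part of regexp-examples) that reports what the
# parser hunk computes for a named property.
def named_property_info(rest_of_string)
  match = NAMED_PROPERTY.match(rest_of_string)
  return nil unless match

  letter, caret, name = match.captures
  {
    name: name.downcase,                        # lookup key, case-insensitive
    negative: (letter == "P") ^ (caret == "^"), # XOR: two negations cancel
    consumed: caret.length + name.length + 2    # the "+ 2" covers "{" and "}"
  }
end

%w[p{Ll} p{^Ll} P{Ll} P{^Ll}].each do |input|
  puts "\\#{input} negative? #{named_property_info(input)[:negative]}"
end
```

Using XOR rather than OR is the point of the change: with OR, `\P{^Ll}` would be treated as negated even though its two negations cancel out.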
@@ -114,7 +115,7 @@ module RegexpExamples
|
|
114
115
|
when next_char == 'R' # Linebreak
|
115
116
|
group = CharGroup.new(["\r\n", "\n", "\v", "\f", "\r"], @ignorecase) # A bit hacky...
|
116
117
|
when next_char == 'g' # Subexpression call
|
117
|
-
raise IllegalSyntaxError, "Subexpression calls (
|
118
|
+
raise IllegalSyntaxError, "Subexpression calls (\\g) cannot be supported, as they are not regular"
|
118
119
|
when next_char =~ /[bB]/ # Anchors
|
119
120
|
raise IllegalSyntaxError, "Anchors ('\\#{next_char}') cannot be supported, as they are not regular"
|
120
121
|
when next_char =~ /[AG]/ # Start of string
|
data/spec/regexp-examples_spec.rb
CHANGED
@@ -80,6 +80,8 @@ RSpec.describe Regexp, "#examples" do
|
|
80
80
|
/[\\\]]/,
|
81
81
|
/[\n-\r]/,
|
82
82
|
/[\-]/,
|
83
|
+
/[-abc]/,
|
84
|
+
/[abc-]/,
|
83
85
|
/[%-+]/, # This regex is "supposed to" match some surprising things!!!
|
84
86
|
/['-.]/, # Test to ensure no "infinite loop" on character set expansion
|
85
87
|
/[[abc]]/, # Nested groups
|
@@ -180,7 +182,9 @@ RSpec.describe Regexp, "#examples" do
|
|
180
182
|
/\p{L}/,
|
181
183
|
/\p{Space}/,
|
182
184
|
/\p{AlPhA}/, # Checking case insensitivity
|
183
|
-
/\p{^Ll}
|
185
|
+
/\p{^Ll}/,
|
186
|
+
/\P{Ll}/,
|
187
|
+
/\P{^Ll}/ # Double negative!!
|
184
188
|
)
|
185
189
|
|
186
190
|
end
|
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: regexp-examples
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 1.0.
|
4
|
+
version: 1.0.1
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Tom Lord
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2015-03-
|
11
|
+
date: 2015-03-04 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: bundler
|