regexp-examples 1.0.1 → 1.0.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: fc845182adb1adaeed70de6139d27711a69dc81f
- data.tar.gz: 3d1850382acaf7ee4c96c9acf924a585cec6939b
+ metadata.gz: e4648ff5cf5c73b7916f58099a989ad58619e2d1
+ data.tar.gz: 9a9d53ceaf5a89f1f363124fad033b46b1489774
  SHA512:
- metadata.gz: 0b2a8ff8619ba8bc4186a27491dac7140ff8e0d7e4cb87ccfd8e9047b0f392c7152e2988666edaf4df6a1d2db0961ac1dfdc0af32b9d50c3ddbda6ff60814c97
- data.tar.gz: f3690a8f6d2089b57a57246d9ec2b25252a2bee5972536ef095bf45358348ce1c322e3d3b04b69121e8530e155c2c9c4542f5f48cdc6a83ea6366eb246af3f94
+ metadata.gz: 77997419f70d44cde2181c9a61f81a7e34456de573ab0bfbe46d1dcb35a6350472b3f09b4f9853aa657bca17af8b220d6179649227fda29e69a203f4a467b668
+ data.tar.gz: 5c830560b6485f7a02bb9ad41fdc0f316ca4b0685d079ea8706cf170a4298e88f113b8bb8eb9bfc0cce18e22e733e235f03ab42120c60bd7828628ae779f5666
data/README.md CHANGED
@@ -5,7 +5,7 @@
 
  Extends the Regexp class with the method: Regexp#examples
 
- This method generates a list of (some\*) strings that will match the given regular expression
+ This method generates a list of (some\*) strings that will match the given regular expression.
 
  \* If the regex has an infinite number of possible srings that match it, such as `/a*b+c{2,}/`,
  or a huge number of possible matches, such as `/.\w/`, then only a subset of these will be listed.
@@ -22,9 +22,15 @@ For more detail on this, see [configuration options](#configuration-options).
  # 'http://www.github.com', 'https://github.com', 'https://www.github.com']
  /(I(N(C(E(P(T(I(O(N)))))))))*/.examples #=> ["", "INCEPTION", "INCEPTIONINCEPTION"]
  /\x74\x68\x69\x73/.examples #=> ["this"]
- /\u6829/.examples #=> ["栩"]
  /what about (backreferences\?) \1/.examples
  #=> ['what about backreferences? backreferences?']
+ /
+ \u{28}\u2022\u{5f}\u2022\u{29}
+ |
+ \u{28}\u{20}\u2022\u{5f}\u2022\u{29}\u{3e}\u2310\u25a0\u{2d}\u25a0\u{20}
+ |
+ \u{28}\u2310\u25a0\u{5f}\u25a0\u{29}
+ /x.examples #=> ["(•_•)", "( •_•)>⌐■-■ ", "(⌐■_■)"]
  ```
 
  ## Installation
@@ -45,6 +51,10 @@ Or install it yourself as:
 
  ## Supported syntax
 
+ Short answer: **Everything** is supported, apart from "irregular" aspects of the regexp language -- see [impossible features](#impossible-features-illegal-syntax)
+
+ Long answer:
+
  * All forms of repeaters (quantifiers), e.g. `/a*/`, `/a+/`, `/a?/`, `/a{1,4}/`, `/a{3,}/`, `/a{,2}/`
  * Reluctant and possissive repeaters work fine, too, e.g. `/a*?/`, `/a*+/`
  * Boolean "Or" groups, e.g. `/a|b|c/`
@@ -57,8 +67,9 @@ Or install it yourself as:
  * Escaped characters, e.g. `/\n/`, `/\w/`, `/\D/` (and so on...)
  * Capture groups, e.g. `/(group)/`
  * Including named groups, e.g. `/(?<name>group)/`
- * ...And backreferences(!!!), e.g. `/(this|that) \1/` `/(?<name>foo) \k<name>/`
- * Groups work fine, even if nested or optional, e.g. `/(even(this(works?))) \1 \2 \3/`, `/what about (this)? \1/`
+ * And backreferences(!!!), e.g. `/(this|that) \1/` `/(?<name>foo) \k<name>/`
+ * ...even for the more "obscure" syntax, e.g. `/(?<future>the) \k'future'/`, `/(a)(b) \k<-1>/``
+ * ...and even if nested or optional, e.g. `/(even(this(works?))) \1 \2 \3/`, `/what about (this)? \1/`
  * Non-capture groups, e.g. `/(?:foo)/`
  * Comment groups, e.g. `/foo(?#comment)bar/`
  * Control characters, e.g. `/\ca/`, `/\cZ/`, `/\C-9/`
@@ -66,7 +77,7 @@ Or install it yourself as:
  * Unicode characters, e.g. `/\u0123/`, `/\uabcd/`, `/\u{789}/`
  * Octal characters, e.g. `/\10/`, `/\177/`
  * Named properties, e.g. `/\p{L}/` ("Letter"), `/\p{Arabic}/` ("Arabic character")
- , `/\p{^Ll}/` ("Not a lowercase letter"), `\P{^Canadian_Aboriginal}` ("Not not a Canadian aboriginal character")
+ , `/\p{^Ll}/` ("Not a lowercase letter"), `/\P{^Canadian_Aboriginal}/` ("Not not a Canadian aboriginal character")
  * **Arbitrarily complex combinations of all the above!**
 
  * Regexp options can also be used:
@@ -77,13 +88,13 @@ Or install it yourself as:
 
  ## Bugs and Not-Yet-Supported syntax
 
- * There are some (rare) edge cases where backreferences do not work properly, e.g. `/(a*)a* \1/.examples` - which includes "aaaa aa". This is because each repeater is not context-aware, so the "greediness" logic is flawed. (E.g. in this case, the second `a*` should always evaluate to an empty string, because the previous `a*` was greedy! However, patterns like this are highly unusual...
+ * There are some (rare) edge cases where backreferences do not work properly, e.g. `/(a*)a* \1/.examples` - which includes "aaaa aa". This is because each repeater is not context-aware, so the "greediness" logic is flawed. (E.g. in this case, the second `a*` should always evaluate to an empty string, because the previous `a*` was greedy! However, patterns like this are highly unusual...)
  * Some named properties, e.g. `/\p{Arabic}/`, list non-matching examples for ruby 2.0/2.1 (as the definitions changed in ruby 2.2). This will be fixed in version 1.1.0 (see the pending pull request)!
 
- There are also some various (increasingly obscure) unsupported bits of syntax; some of which I haven't yet investigated. Much of this is not even mentioned in the ruby docs! Full documentation on all the intricate obscurities in the ruby (version 2.x) regexp parser can be found [here](https://raw.githubusercontent.com/k-takata/Onigmo/master/doc/RE). To name a few:
+ Since the Regexp language is so vast, it's quite likely I've missed something (please raise an issue if you find something)! The only missing feature that I'm currently aware of is:
  * Conditional capture groups, e.g. `/(group1)? (?(1)yes|no)/.examples` (which *should* return: `["group1 yes", " no"]`)
- * Back reference by relative group number, e.g. `/(a)(b)(c)(d) \k<-2>/.examples` (which *should* return: `["abcd c"]`)
- * Back reference using single quotes, and for group numbers, e.g. `/(a) \k'1'/.examples` (which is really just alternative syntax for `/(a) \1/`!)
+
+ Some of the most obscure regexp features are not even mentioned in the ruby docs! However, full documentation on all the intricate obscurities in the ruby (version 2.x) regexp parser can be found [here](https://raw.githubusercontent.com/k-takata/Onigmo/master/doc/RE).
 
  ## Impossible features ("illegal syntax")
 
@@ -46,31 +46,60 @@ module RegexpExamples
  when '\\'
  group = parse_after_backslash_group
  when '^'
- if @current_position == 0
- group = PlaceHolderGroup.new # Ignore the "illegal" character
- else
- raise IllegalSyntaxError, "Anchors ('#{next_char}') cannot be supported, as they are not regular"
- end
+ group = parse_caret
  when '$'
- if @current_position == (regexp_string.length - 1)
- group = PlaceHolderGroup.new # Ignore the "illegal" character
- else
- raise IllegalSyntaxError, "Anchors ('#{next_char}') cannot be supported, as they are not regular"
- end
+ group = parse_dollar
  when /[#\s]/
- if @extended
- parse_extended_whitespace
- group = PlaceHolderGroup.new # Ignore the whitespace/comment
- else
- group = parse_single_char_group(next_char)
- end
+ group = parse_extended_whitespace
  else
  group = parse_single_char_group(next_char)
  end
  group
  end
 
+ def parse_repeater(group)
+ case next_char
+ when '*'
+ repeater = parse_star_repeater(group)
+ when '+'
+ repeater = parse_plus_repeater(group)
+ when '?'
+ repeater = parse_question_mark_repeater(group)
+ when '{'
+ repeater = parse_range_repeater(group)
+ else
+ repeater = parse_one_time_repeater(group)
+ end
+ repeater
+ end
+
+ def parse_caret
+ if @current_position == 0
+ return PlaceHolderGroup.new # Ignore the "illegal" character
+ else
+ raise_anchors_exception!
+ end
+ end
+
+ def parse_dollar
+ if @current_position == (regexp_string.length - 1)
+ return PlaceHolderGroup.new # Ignore the "illegal" character
+ else
+ raise_anchors_exception!
+ end
+ end
+
  def parse_extended_whitespace
+ if @extended
+ skip_whitespace
+ group = PlaceHolderGroup.new # Ignore the whitespace/comment
+ else
+ group = parse_single_char_group(next_char)
+ end
+ group
+ end
+
+ def skip_whitespace
  whitespace_chars = rest_of_string.match(/#.*|\s+/)[0]
  @current_position += whitespace_chars.length - 1
  end
@@ -81,9 +110,11 @@ module RegexpExamples
  when rest_of_string =~ /\A(\d{1,3})/
  @current_position += ($1.length - 1) # In case of 10+ backrefs!
  group = parse_backreference_group($1)
- when rest_of_string =~ /\Ak<([^>]+)>/ # Named capture group
+ when rest_of_string =~ /\Ak['<]([\w-]+)['>]/ # Named capture group
  @current_position += ($1.length + 2)
- group = parse_backreference_group($1)
+ # Check for RELATIVE group number, e.g. /(a)(b)(c)(d) \k<-2>/
+ group_id = ($1.to_i < 0) ? (@num_groups + $1.to_i + 1) : $1
+ group = parse_backreference_group(group_id)
  when BackslashCharMap.keys.include?(next_char)
  group = CharGroup.new(
  BackslashCharMap[next_char].dup,
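
The relative-backreference arithmetic added in the hunk above can be tried in isolation. The following is a minimal sketch, not code from the gem: `resolve_group_id` is a hypothetical helper that copies the ternary expression from the diff, with `num_groups` standing in for the parser's `@num_groups` counter (assumed here to be the number of capture groups opened before the backreference).

```ruby
# Hypothetical helper (not part of the gem): turns the text captured after
# "\k" into a group id, mirroring the ternary expression in the diff.
def resolve_group_id(reference, num_groups)
  offset = reference.to_i # a name such as "future" converts to 0
  # Only a negative number is treated as relative; anything else passes through.
  offset < 0 ? (num_groups + offset + 1) : reference
end

# /(a)(b)(c)(d) \k<-2>/ : four groups precede the reference
p resolve_group_id('-2', 4)
# /(ref1) (ref2) \k<-1>/ : two groups precede the reference
p resolve_group_id('-1', 2)
# Absolute numbers and names are returned unchanged, as strings
p resolve_group_id('1', 2)
p resolve_group_id('name', 3)
```

Note that, as in the diff's expression, a relative reference resolves to an Integer while every other reference stays a String; whether `BackReferenceGroup` normalises the two is not visible in this diff.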
@@ -117,18 +148,18 @@ module RegexpExamples
  when next_char == 'g' # Subexpression call
  raise IllegalSyntaxError, "Subexpression calls (\\g) cannot be supported, as they are not regular"
  when next_char =~ /[bB]/ # Anchors
- raise IllegalSyntaxError, "Anchors ('\\#{next_char}') cannot be supported, as they are not regular"
+ raise_anchors_exception!
  when next_char =~ /[AG]/ # Start of string
  if @current_position == 1
  group = PlaceHolderGroup.new
  else
- raise IllegalSyntaxError, "Anchors ('\\#{next_char}') cannot be supported, as they are not regular"
+ raise_anchors_exception!
  end
  when next_char =~ /[zZ]/ # End of string
  if @current_position == (regexp_string.length - 1)
  group = PlaceHolderGroup.new
  else
- raise IllegalSyntaxError, "Anchors ('\\#{next_char}') cannot be supported, as they are not regular"
+ raise_anchors_exception!
  end
  else
  group = parse_single_char_group( next_char )
@@ -136,31 +167,13 @@ module RegexpExamples
  group
  end
 
- def parse_repeater(group)
- case next_char
- when '*'
- repeater = parse_star_repeater(group)
- when '+'
- repeater = parse_plus_repeater(group)
- when '?'
- repeater = parse_question_mark_repeater(group)
- when '{'
- repeater = parse_range_repeater(group)
- else
- repeater = parse_one_time_repeater(group)
- end
- repeater
- end
-
  def parse_multi_group
  @current_position += 1
  @num_groups += 1
- group_id = nil # init
- previous_ignorecase = @ignorecase
- previous_multiline = @multiline
- previous_extended = @extended
- rest_of_string.match(
- /
+ remember_old_regexp_options do
+ group_id = nil # init
+ rest_of_string.match(
+ /
  \A
  (\?)? # Is it a "special" group, i.e. starts with a "?"?
  (
@@ -175,39 +188,48 @@ module RegexpExamples
  |[^>]+ # Named capture
  )
  |[mix]*-?[mix]* # Option toggle
- )?
- /x
- ) do |match|
- case
- when match[1].nil? # e.g. /(normal)/
- group_id = @num_groups.to_s
- when match[2] == ':' # e.g. /(?:nocapture)/
- @current_position += 2
- when match[2] == '#' # e.g. /(?#comment)/
- comment_group = rest_of_string.match(/.*?[^\\](?:\\{2})*\)/)[0]
- @current_position += comment_group.length
- when match[2] =~ /\A(?=[mix-]+)([mix]*)-?([mix]*)/ # e.g. /(?i-mx)/
- regexp_options_toggle($1, $2)
- @current_position += $&.length + 1
- if next_char == ':' # e.g. /(?i:subexpr)/
- @current_position += 1
- else
- return PlaceHolderGroup.new
+ )?
+ /x
+ ) do |match|
+ case
+ when match[1].nil? # e.g. /(normal)/
+ group_id = @num_groups.to_s
+ when match[2] == ':' # e.g. /(?:nocapture)/
+ @current_position += 2
+ when match[2] == '#' # e.g. /(?#comment)/
+ comment_group = rest_of_string.match(/.*?[^\\](?:\\{2})*\)/)[0]
+ @current_position += comment_group.length
+ when match[2] =~ /\A(?=[mix-]+)([mix]*)-?([mix]*)/ # e.g. /(?i-mx)/
+ regexp_options_toggle($1, $2)
+ @num_groups -= 1 # Toggle "groups" should not increase backref group count
+ @current_position += $&.length + 1
+ if next_char == ':' # e.g. /(?i:subexpr)/
+ @current_position += 1
+ else
+ return PlaceHolderGroup.new
+ end
+ when %w(! =).include?(match[2]) # e.g. /(?=lookahead)/, /(?!neglookahead)/
+ raise IllegalSyntaxError, "Lookaheads are not regular; cannot generate examples"
+ when %w(! =).include?(match[3]) # e.g. /(?<=lookbehind)/, /(?<!neglookbehind)/
+ raise IllegalSyntaxError, "Lookbehinds are not regular; cannot generate examples"
+ else # e.g. /(?<name>namedgroup)/
+ @current_position += (match[3].length + 3)
+ group_id = match[3]
  end
- when %w(! =).include?(match[2]) # e.g. /(?=lookahead)/, /(?!neglookahead)/
- raise IllegalSyntaxError, "Lookaheads are not regular; cannot generate examples"
- when %w(! =).include?(match[3]) # e.g. /(?<=lookbehind)/, /(?<!neglookbehind)/
- raise IllegalSyntaxError, "Lookbehinds are not regular; cannot generate examples"
- else # e.g. /(?<name>namedgroup)/
- @current_position += (match[3].length + 3)
- group_id = match[3]
  end
+ MultiGroup.new(parse, group_id)
  end
- groups = parse
+ end
+
+ def remember_old_regexp_options
+ previous_ignorecase = @ignorecase
+ previous_multiline = @multiline
+ previous_extended = @extended
+ group = yield
  @ignorecase = previous_ignorecase
  @multiline = previous_multiline
  @extended = previous_extended
- MultiGroup.new(groups, group_id)
+ group
  end
 
  def regexp_options_toggle(on, off)
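
The `remember_old_regexp_options` helper introduced in this hunk saves the option flags, yields, and then restores them, without an `ensure`. The sketch below illustrates that pattern with a hypothetical `OptionScope` class (not part of the gem) that models a single flag. It also shows a consequence that seems relevant to the `return PlaceHolderGroup.new` branch: an early `return` inside the block leaves the enclosing method immediately, so the restore step is skipped and the toggled option stays in effect. This reading of the intent is an assumption, not something the diff states.

```ruby
# Hypothetical stand-in for the parser; only one option flag is modelled.
class OptionScope
  attr_reader :ignorecase

  def initialize
    @ignorecase = false
  end

  # Runs the block, then puts the flag back to the value it had beforehand.
  # There is deliberately no `ensure`, matching the method in the diff.
  def remember_old_options
    previous_ignorecase = @ignorecase
    result = yield
    @ignorecase = previous_ignorecase
    result
  end

  def toggle_ignorecase!
    @ignorecase = !@ignorecase
  end

  # Mimics the early return for a bare toggle such as (?i): returning from
  # inside the block exits this method at once, so the restore never runs.
  def bare_toggle
    remember_old_options do
      toggle_ignorecase!
      return :placeholder
    end
  end
end

# Scoped case, comparable to (?i:subexpr): the flag is restored afterwards.
scope = OptionScope.new
inside = scope.remember_old_options do
  scope.toggle_ignorecase!
  scope.ignorecase
end

# Bare toggle case, comparable to (?i): the flag stays switched on.
sticky = OptionScope.new
result = sticky.bare_toggle

p [inside, scope.ignorecase, result, sticky.ignorecase]
```

If the restore were moved into an `ensure` clause, the bare-toggle case would also be reset, so the two forms would no longer behave differently.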
@@ -246,8 +268,8 @@ module RegexpExamples
  SingleCharGroup.new(char, @ignorecase)
  end
 
- def parse_backreference_group(match)
- BackReferenceGroup.new(match)
+ def parse_backreference_group(group_id)
+ BackReferenceGroup.new(group_id)
  end
 
  def parse_control_character(char)
@@ -308,6 +330,10 @@ module RegexpExamples
  repeater
  end
 
+ def raise_anchors_exception!
+ raise IllegalSyntaxError, "Anchors ('#{next_char}') cannot be supported, as they are not regular"
+ end
+
  def parse_one_time_repeater(group)
  OneTimeRepeater.new(group)
  end
@@ -1,3 +1,3 @@
  module RegexpExamples
- VERSION = '1.0.1'
+ VERSION = '1.0.2'
  end
@@ -98,7 +98,8 @@ RSpec.describe Regexp, "#examples" do
  /(normal)/,
  /(?:nocapture)/,
  /(?<name>namedgroup)/,
- /(?<name>namedgroup) \k<name>/
+ /(?<name>namedgroup) \k<name>/,
+ /(?<name>namedgroup) \k'name'/
  )
  end
 
@@ -124,7 +125,8 @@ RSpec.describe Regexp, "#examples" do
  /(a?(b?(c?(d?(e?)))))/,
  /(a)? \1/,
  /(a|(b)) \2/,
- /([ab]){2} \1/ # \1 should always be the LAST result of the capture group
+ /([ab]){2} \1/, # \1 should always be the LAST result of the capture group
+ /(ref1) (ref2) \k'1' \k<-1>/, # RELATIVE backref!
  )
  end
 
@@ -326,6 +328,7 @@ RSpec.describe Regexp, "#examples" do
  it { expect(/a(?i)b(?-i)c/.examples).to eq %w{abc aBc}}
  it { expect(/a(?x) b(?-x) c/.examples).to eq %w{ab\ c}}
  it { expect(/(?m)./.examples(max_group_results: 999)).to include "\n" }
+ it { expect(/(?i)(a)-\1/.examples).to eq %w{a-a A-A}} # Toggle "groups" should not increase backref group count
  end
  context "subexpression" do
  it { expect(/a(?i:b)c/.examples).to eq %w{abc aBc}}
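
The three patterns added to the spec rely on syntax that Ruby's own regexp engine already accepts, so their meaning can be checked without the gem. The sketch below uses only core Ruby; the variable names are illustrative, and the sample strings are assumptions chosen to exercise each pattern rather than output taken from `Regexp#examples`.

```ruby
# Core Ruby only; the gem is not required for these checks.
named    = /(?<name>namedgroup) \k'name'/ # single-quoted named backreference
relative = /(ref1) (ref2) \k'1' \k<-1>/   # numbered and relative backreferences
toggle   = /(?i)(a)-\1/                   # option toggle followed by one capture group

# \k'name' repeats whatever the named group captured.
puts named.match?('namedgroup namedgroup')

# \k'1' repeats group 1; \k<-1> repeats the closest preceding group (group 2).
puts relative.match?('ref1 ref2 ref1 ref2')
puts relative.match?('ref1 ref2 ref1 ref1')

# (?i) is not a capture group, so \1 still refers to (a).
captures = toggle.match('A-A').captures
p captures
```

The last check is the engine-side counterpart of the `@num_groups -= 1` line in the parser change: the toggle must not be counted, otherwise `\1` would point at the wrong group.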
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: regexp-examples
  version: !ruby/object:Gem::Version
- version: 1.0.1
+ version: 1.0.2
  platform: ruby
  authors:
  - Tom Lord
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2015-03-04 00:00:00.000000000 Z
+ date: 2015-03-07 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: bundler
@@ -85,7 +85,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  version: '0'
  requirements: []
  rubyforge_project:
- rubygems_version: 2.2.2
+ rubygems_version: 2.4.5
  signing_key:
  specification_version: 4
  summary: Extends the Regexp class with '#examples'