regexp-examples 0.4.1 → 0.4.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: e6b6bd1d3602690963b1f1ecb61dbb91b38d0847
-   data.tar.gz: c104bcd97175689467336fa4c11f4f187d483332
+   metadata.gz: 7e76c8504f718a77052f24bdd8785042ded1b584
+   data.tar.gz: 5f63fd56430a4f8f0f10fd9ffc4f82d7b1301b2e
  SHA512:
-   metadata.gz: 75662efea94170727d89f7f88335a02ab688cbe37a587a01d1c08b72ab69e64ff3bdc0ff2faef91ebb9622b0f465b67843620d3219639cc41156800f59d8ed22
-   data.tar.gz: 85b54784013793a0a192cf2b456eb48fab304be07c87149726f3379a7409ae2a3f3fdc66c592b526d6c634d4a2bea0a53b60c0ea226fd295191549a6e8f67f2c
+   metadata.gz: ead89c6fead001d78b5336a69fa044af6fce905664c48a1430594b811b3729d4ea341e369ef190a43159672c4ca16a75a3c92c9a068c3638774b6030d318932a
+   data.tar.gz: d9edc5da0d9f2d3a6b856f00b202e5c119a1733f130cdf40b27e70875771f3f08707a9f0523b607b9882220b39e486a844e2013d774660c17c3b0f51743db02a
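The `checksums.yaml` entries are hex digests of the two archives packed inside the gem. As a rough sketch only, a digest could be compared against one of these entries as below. The helper name `digest_matches?` and the `DIGEST_CLASSES` table are hypothetical (they are not part of this gem or of RubyGems), and reading the archive bytes out of the `.gem` file is left out.

```ruby
require 'digest/sha1'
require 'digest/sha2'

# Hypothetical lookup table, for illustration only.
DIGEST_CLASSES = { 'SHA1' => Digest::SHA1, 'SHA512' => Digest::SHA512 }.freeze

# Hypothetical helper: true when the bytes hash to the expected hex digest.
def digest_matches?(bytes, expected_hex, algorithm)
  digest_class = DIGEST_CLASSES.fetch(algorithm)
  digest_class.hexdigest(bytes) == expected_hex.downcase
end

# Placeholder bytes; in practice these would be the contents of
# metadata.gz or data.tar.gz extracted from the gem archive.
sample = 'abc'
puts digest_matches?(sample, Digest::SHA1.hexdigest(sample), 'SHA1')
```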
data/README.md CHANGED
@@ -29,14 +29,15 @@ For more detail on this, see [configuration options](#configuration-options).
  ## Supported syntax
 
  * All forms of repeaters (quantifiers), e.g. `/a*/`, `/a+/`, `/a?/`, `/a{1,4}/`, `/a{3,}/`, `/a{,2}/`
+ * Reluctant and possissive repeaters work fine, too - e.g. `/a*?/`, `/a*+/`
  * Boolean "Or" groups, e.g. `/a|b|c/`
  * Character sets (inluding ranges and negation!), e.g. `/[abc]/`, `/[A-Z0-9]/`, `/[^a-z]/`, `/[\w\s\b]/`
  * Escaped characters, e.g. `/\n/`, `/\w/`, `/\D/` (and so on...)
- * Non-capture groups, e.g. `/(?:foo)/`
  * Capture groups, e.g. `/(group)/`
  * Including named groups, e.g. `/(?<name>group)/`
  * ...And backreferences(!!!), e.g. `/(this|that) \1/` `/(?<name>foo) \k<name>/`
- * Groups work fine, even if nested! e.g. `/(even(this(works?))) \1 \2 \3/`
+ * Groups work fine, even if nested or optional e.g. `/(even(this(works?))) \1 \2 \3/`, `/what about (this)? \1/`
+ * Non-capture groups, e.g. `/(?:foo)/`
  * Control characters, e.g. `/\ca/`, `/\cZ/`, `/\C-9/`
  * Escape sequences, e.g. `/\x42/`, `/\x3D/`, `/\x5word/`, `/#{"\x80".force_encoding("ASCII-8BIT")}/`
  * Unicode characters, e.g. `/\u0123/`, `/\uabcd/`, `/\u{789}/`
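The new README bullet says reluctant and possessive repeaters are supported. A small plain-Ruby check (no gem required) suggests why a standalone quantifier can be treated the same in all three forms: when the whole string must match, each form accepts the same set of strings. The constant and method names here are illustrative only.

```ruby
# Each variant is anchored so that the whole string has to match.
VARIANTS = {
  greedy:     /\Aa*\z/,
  reluctant:  /\Aa*?\z/,
  possessive: /\Aa*+\z/
}.freeze

CANDIDATES = ['', 'a', 'aa', 'aaa', 'b', 'ab'].freeze

# Returns the candidates that the given regexp accepts.
def accepted_strings(regexp, candidates)
  candidates.select { |candidate| regexp.match?(candidate) }
end

VARIANTS.each do |name, regexp|
  puts "#{name}: #{accepted_strings(regexp, CANDIDATES).inspect}"
end
```

This equivalence is only shown for a quantifier on its own. Once more pattern follows, a possessive quantifier can reject strings that the greedy form accepts: `/\Aa*+a\z/` cannot match `"aa"`, because the possessive `a*+` consumes every `a` and never gives one back.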
@@ -44,27 +45,22 @@ For more detail on this, see [configuration options](#configuration-options).
 
  ## Bugs and Not-Yet-Supported syntax
 
- * Backreferences are replaced by the _first_ occurance of the group, not the _last_ (as it should be). This is quite a rare occurance, but for example:
- * `/(a|b){2} \1/.examples` incorrectly includes: `"ba b"` rather than the correct: `"ba a"`
-
  * Options, e.g. `/pattern/i`, `/foo.*bar/m` - Using options will currently just be ignored, for example:
  * `/test/i.examples` will NOT include `"TEST"`
  * `/white space/x.examples` will not strip out the whitespace from the pattern, i.e. this incorrectly returns `["white space"]` rather than `["whitespace"]`
 
  * Nested character classes, and the use of set intersection ([See here](http://www.ruby-doc.org/core-2.2.0/Regexp.html#class-Regexp-label-Character+Classes) for the official documentation on this.) For example:
  * `/[[abc]]/.examples` (which _should_ return `["a", "b", "c"]`)
- * `/[[a-d]&&[c-f]]/.examples` (which _should_ return: `["c", "d"]`)
+ * `/[[a-d]&&[c-f]]/.examples` (which _should_ return: `["c", "d"]`)
 
  * Extended groups are not yet supported, such as:
  * Including comments inside the pattern, i.e. `/(?#...)/`
  * Conditional capture groups, such as `/(group1) (?(1)yes|no)`
  * Options toggling, i.e. `/(?imx)/`, `/(?-imx)/`, `/(?imx: re)/` and `/(?-imx: re)/`
 
- * Possessive quantifiers, i.e. `/.?+/`, `/.*+/`, `/.++/`
-
  * The patterns: `/\10/` ... `/\77/` should match the octal representation of their character code, if there is no nth grouped subexpression. For example, `/\10/.examples` should return `["\x08"]`. Funnily enough, I did not think of this when writing my regexp parser.
 
- Full documentation on all the various other obscurities in the ruby (version 2.x) regexp parser can be found [here](https://raw.githubusercontent.com/k-takata/Onigmo/master/doc/RE).
+ There are loads more (increasingly obscure) unsupported bits of syntax, which I cannot be bothered to write out here. Full documentation on all the various other obscurities in the ruby (version 2.x) regexp parser can be found [here](https://raw.githubusercontent.com/k-takata/Onigmo/master/doc/RE).
 
  Using any of the following will raise a RegexpExamples::UnsupportedSyntax exception (until such time as they are implemented!):
 
@@ -105,8 +101,10 @@ When generating examples, the gem uses 2 configurable values to limit how many e
  To use an alternative value, simply pass the configuration option as follows:
 
  ```ruby
- /a*/.examples(max_repeater_variance: 5) #=> [''. 'a', 'aa', 'aaa', 'aaaa' 'aaaaa']
- /[F-X]/.examples(max_group_results: 10) #=> ['F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O']
+ /a*/.examples(max_repeater_variance: 5)
+ #=> [''. 'a', 'aa', 'aaa', 'aaaa' 'aaaaa']
+ /[F-X]/.examples(max_group_results: 10)
+ #=> ['F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O']
  ```
 
  _**WARNING**: Choosing huge numbers, along with a "complex" regex, could easily cause your system to freeze!_
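To make the two limits in this README hunk concrete, here is a minimal sketch that does not use the gem at all. The helpers `star_examples` and `char_range_examples` are hypothetical stand-ins, written on the assumption that `max_repeater_variance` caps how far a repeater may grow past its minimum and that `max_group_results` caps how many members of a character set are listed. The gem's real implementation may differ.

```ruby
# Hypothetical stand-in: examples for "char*" with a capped repeat count.
def star_examples(char, max_repeater_variance)
  (0..max_repeater_variance).map { |count| char * count }
end

# Hypothetical stand-in: the first few members of a character range.
def char_range_examples(first, last, max_group_results)
  (first..last).first(max_group_results)
end

p star_examples('a', 5)
p char_range_examples('F', 'X', 10)
```

Under these assumptions the first call yields six strings, from `""` up to `"aaaaa"`, and the second yields the ten letters `"F"` through `"O"`, which lines up with the results quoted in the README snippet.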
Binary file
@@ -20,6 +20,11 @@ module RegexpExamples
    subgroups = result
      .map(&:all_subgroups)
      .flatten
+
+   # Only save the LAST group from repeated capture groups, e.g. /([ab]){2}/
+   subgroups.delete_if do |subgroup|
+     subgroups.count { |other_subgroup| other_subgroup.group_id == subgroup.group_id } > 1
+   end
    GroupResult.new(result.join, nil, subgroups)
  end
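The comment in this hunk states the intent: when a capture group is repeated, only its last result should be kept. The sketch below shows one way to express that intent without mutating the array while iterating over it. `Subgroup` is a hypothetical stand-in for the gem's own result objects, and `keep_last_per_group` is not a method of the gem.

```ruby
# Hypothetical stand-in for the gem's subgroup objects.
Subgroup = Struct.new(:group_id, :text)

# Keep only the last entry for each group_id.
# Array#uniq keeps the first occurrence, so reverse before and after.
def keep_last_per_group(subgroups)
  subgroups.reverse.uniq(&:group_id).reverse
end

subgroups = [
  Subgroup.new('1', 'b'), # first iteration of group 1
  Subgroup.new('2', 'x'),
  Subgroup.new('1', 'a')  # last iteration of group 1
]
p keep_last_per_group(subgroups).map(&:to_a)
```

For the sample data this leaves group `2` with `"x"` and group `1` with `"a"`, so a later `\1` would be filled in from the final iteration.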
 
@@ -212,26 +212,26 @@ module RegexpExamples
 
  def parse_star_repeater(group)
    @current_position += 1
-   parse_non_greedy_repeater
+   parse_reluctant_or_possessive_repeater
    StarRepeater.new(group)
  end
 
  def parse_plus_repeater(group)
    @current_position += 1
-   parse_non_greedy_repeater
+   parse_reluctant_or_possessive_repeater
    PlusRepeater.new(group)
  end
 
- def parse_non_greedy_repeater
-   if next_char == '?'
-     # TODO: Delay this warning until after parsing, and only display if capture groups are used
-     warn "Warning: Non-greedy operators (*? and +?) might not work properly, when using capture groups"
+ def parse_reluctant_or_possessive_repeater
+   if next_char =~ /[?+]/
+     # Don't treat these repeaters any differently when generating examples
      @current_position += 1
    end
  end
 
  def parse_question_mark_repeater(group)
    @current_position += 1
+   parse_reluctant_or_possessive_repeater
    QuestionMarkRepeater.new(group)
  end
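The renamed method simply steps over a trailing `?` or `+` after a repeater, so the modifier has no effect on the generated examples. A standalone sketch of that idea follows. It assumes a plain string and an integer position rather than the parser's `@current_position` state, and `skip_quantifier_modifier` is a hypothetical name.

```ruby
# Hypothetical helper: given the position just after a repeater such as
# "*", "+" or "?", return the position after any "?" or "+" modifier.
def skip_quantifier_modifier(pattern, position)
  modifier = pattern[position] # nil when the pattern ends here
  ['?', '+'].include?(modifier) ? position + 1 : position
end

puts skip_quantifier_modifier('a*?b', 2) # modifier "?" is skipped
puts skip_quantifier_modifier('a*+b', 2) # modifier "+" is skipped
puts skip_quantifier_modifier('a*b', 2)  # nothing to skip
```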
 
@@ -241,7 +241,18 @@ module RegexpExamples
    min = match[1].to_i if match[1]
    has_comma = !match[2].nil?
    max = match[3].to_i if match[3]
-   RangeRepeater.new(group, min, has_comma, max)
+   repeater = RangeRepeater.new(group, min, has_comma, max)
+   parse_reluctant_or_possessive_range_repeater(repeater, min, has_comma, max)
+ end
+
+ def parse_reluctant_or_possessive_range_repeater(repeater, min, has_comma, max)
+   # .{1}? should be equivalent to (?:.{1})?, i.e. NOT a "non-greedy quantifier"
+   if min && !has_comma && !max && next_char == "?"
+     repeater = parse_question_mark_repeater(repeater)
+   else
+     parse_reluctant_or_possessive_repeater
+   end
+   repeater
  end
 
  def parse_one_time_repeater(group)
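The branch added in this hunk distinguishes an exact-count repeater followed by `?` (treated as an optional wrapper around the repeater) from every other case (where a trailing `?` or `+` is skipped). The decision can be sketched as a pure function. `range_repeater_suffix` and its symbol return values are hypothetical and exist only to illustrate the condition shown in the diff.

```ruby
# Hypothetical classifier mirroring the condition in the hunk above.
#   min, max  - parsed bounds (nil when absent)
#   has_comma - whether the braces contained a comma
#   next_char - the character following the closing brace (nil at end)
def range_repeater_suffix(min, has_comma, max, next_char)
  if min && !has_comma && !max && next_char == '?'
    :wrap_in_optional # e.g. a{1}? is handled like (?:a{1})?
  elsif ['?', '+'].include?(next_char)
    :skip_modifier    # e.g. a{1,2}? or a{1}+ : the modifier is ignored
  else
    :no_suffix
  end
end

p range_repeater_suffix(1, false, nil, '?')
p range_repeater_suffix(1, true, 2, '?')
p range_repeater_suffix(1, false, nil, nil)
```

This matches the spec added later in the diff, which expects `/a{1}?/.examples` to equal `["", "a"]`.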
@@ -1,3 +1,3 @@
  module RegexpExamples
-   VERSION = '0.4.1'
+   VERSION = '0.4.2'
  end
@@ -40,16 +40,28 @@ RSpec.describe Regexp, "#examples" do
  context 'returns matching strings' do
    context "for basic repeaters" do
      examples_exist_and_match(
-       /a/,
-       /a*/,
-       /a*?/,
+       /a/, # "one-time repeater"
+       /a*/, # greedy
+       /a*?/, # reluctant (non-greedy)
+       /a*+/, # possesive
        /a+/,
        /a+?/,
+       /a*+/,
        /a?/,
+       /a??/,
+       /a?+/,
        /a{1}/,
+       /a{1}?/,
+       /a{1}+/,
        /a{1,}/,
+       /a{1,}?/,
+       /a{1,}+/,
        /a{,2}/,
-       /a{1,2}/
+       /a{,2}?/,
+       /a{,2}+/,
+       /a{1,2}/,
+       /a{1,2}?/,
+       /a{1,2}+/
      )
    end
 
@@ -119,7 +131,8 @@ RSpec.describe Regexp, "#examples" do
      /(one)(two)(three)(four)(five)(six)(seven)(eight)(nine)(ten) \10\9\8\7\6\5\4\3\2\1/,
      /(a?(b?(c?(d?(e?)))))/,
      /(a)? \1/,
-     /(a|(b)) \2/
+     /(a|(b)) \2/,
+     /([ab]){2} \1/ # \1 should always be the LAST result of the capture group
    )
  end
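The new spec entry relies on how Ruby itself resolves a backreference to a repeated group: `\1` refers to whatever the group captured on its last iteration. This can be checked with plain Ruby, independent of the gem. The constant and method names below are illustrative only.

```ruby
# Anchored so that the whole string has to match.
REPEATED_GROUP = /\A([ab]){2} \1\z/

def repeated_group_match?(string)
  REPEATED_GROUP.match?(string)
end

puts repeated_group_match?('ba a') # \1 is "a", the last iteration
puts repeated_group_match?('ba b') # "b" was only the first iteration
```

The first call prints `true` and the second prints `false`, which agrees with the README bullet removed in this release (`"ba a"` is the correct example, `"ba b"` is not).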
 
@@ -221,9 +234,13 @@ RSpec.describe Regexp, "#examples" do
  context "exact examples match" do
    # More rigorous tests to assert that ALL examples are being listed
    context "default options" do
+     # Simple examples
      it { expect(/[ab]{2}/.examples).to eq ["aa", "ab", "ba", "bb"] }
      it { expect(/(a|b){2}/.examples).to eq ["aa", "ab", "ba", "bb"] }
      it { expect(/a+|b?/.examples).to eq ["a", "aa", "aaa", "", "b"] }
+
+     # a{1}? should be equivalent to (?:a{1})?, i.e. NOT a "non-greedy quantifier"
+     it { expect(/a{1}?/.examples).to eq ["", "a"] }
    end
    context "max_repeater_variance option" do
      it do
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: regexp-examples
  version: !ruby/object:Gem::Version
-   version: 0.4.1
+   version: 0.4.2
  platform: ruby
  authors:
  - Tom Lord
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2015-02-02 00:00:00.000000000 Z
+ date: 2015-02-03 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: bundler
@@ -86,7 +86,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
        version: '0'
  requirements: []
  rubyforge_project:
- rubygems_version: 2.2.2
+ rubygems_version: 2.4.5
  signing_key:
  specification_version: 4
  summary: Extends the Regexp class with '#examples'