regexp-examples 1.0.2 → 1.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: e4648ff5cf5c73b7916f58099a989ad58619e2d1
4
- data.tar.gz: 9a9d53ceaf5a89f1f363124fad033b46b1489774
3
+ metadata.gz: df7027b67d35ab27ac6577ffad4ccb2edc9852a9
4
+ data.tar.gz: 9b5d3556ec049d0ccde7d6e1d8ed8bcdced2d1dd
5
5
  SHA512:
6
- metadata.gz: 77997419f70d44cde2181c9a61f81a7e34456de573ab0bfbe46d1dcb35a6350472b3f09b4f9853aa657bca17af8b220d6179649227fda29e69a203f4a467b668
7
- data.tar.gz: 5c830560b6485f7a02bb9ad41fdc0f316ca4b0685d079ea8706cf170a4298e88f113b8bb8eb9bfc0cce18e22e733e235f03ab42120c60bd7828628ae779f5666
6
+ metadata.gz: ddcd3c9c084020ea7003cb86363ed40d7b511f743f0bb9169e9ba32dcb2aecce65d03819dee59a8889148a7c399b1a547ee07bfa9800f30fbf28d47ed7a50eb3
7
+ data.tar.gz: bf2714b21a7c81db479f5fda2b23d378abcea98d2f2843429006ae39402db96fc09d15cd233fbea8f11c17e3bca93062cef01f3ac51a947beac00076beb4acbb
data/README.md CHANGED
@@ -3,9 +3,11 @@
3
3
  [![Build Status](https://travis-ci.org/tom-lord/regexp-examples.svg?branch=master)](https://travis-ci.org/tom-lord/regexp-examples/builds)
4
4
  [![Coverage Status](https://coveralls.io/repos/tom-lord/regexp-examples/badge.svg?branch=master)](https://coveralls.io/r/tom-lord/regexp-examples?branch=master)
5
5
 
6
- Extends the Regexp class with the method: Regexp#examples
6
+ Extends the Regexp class with the methods: `Regexp#examples` and `Regexp#random_example`
7
7
 
8
- This method generates a list of (some\*) strings that will match the given regular expression.
8
+ `Regexp#examples` generates a list of all\* strings that will match the given regular expression.
9
+
10
+ `Regexp#random_example` returns one, random string (from all possible strings!!) that matches the regex.
9
11
 
10
12
  \* If the regex has an infinite number of possible strings that match it, such as `/a*b+c{2,}/`,
11
13
  or a huge number of possible matches, such as `/.\w/`, then only a subset of these will be listed.
@@ -31,6 +33,14 @@ For more detail on this, see [configuration options](#configuration-options).
31
33
  |
32
34
  \u{28}\u2310\u25a0\u{5f}\u25a0\u{29}
33
35
  /x.examples #=> ["(•_•)", "( •_•)>⌐■-■ ", "(⌐■_■)"]
36
+
37
+ ###################################################################################
38
+
39
+ # Obviously, you will get different results if you try these yourself!
40
+ /\w{10}@(hotmail|gmail)\.com/.random_example #=> "TTsJsiwzKS@gmail.com"
41
+ /\p{Greek}{80}/.random_example
42
+ #=> "ΖΆΧͷᵦμͷηϒϰΟᵝΔ΄θϔζΌψΨεκᴪΓΕπι϶ονϵΓϹᵦΟπᵡήϴϜΦϚϴϑ͵ϴΉϺ͵ϹϰϡᵠϝΤΏΨϹϊϻαώΞΰϰΑͼΈΘͽϙͽξΆΆΡΡΉΓς"
43
+ /written by tom lord/i.random_example #=> "WrITtEN bY tOM LORD"
34
44
  ```
35
45
 
36
46
  ## Installation
@@ -51,7 +61,7 @@ Or install it yourself as:
51
61
 
52
62
  ## Supported syntax
53
63
 
54
- Short answer: **Everything** is supported, apart from "irregular" aspects of the regexp language -- see [impossible features](#impossible-features-illegal-syntax)
64
+ Short answer: **Everything** is supported, apart from "irregular" aspects of the regexp language -- see [impossible features](#impossible-features-illegal-syntax).
55
65
 
56
66
  Long answer:
57
67
 
@@ -89,7 +99,7 @@ Long answer:
89
99
  ## Bugs and Not-Yet-Supported syntax
90
100
 
91
101
  * There are some (rare) edge cases where backreferences do not work properly, e.g. `/(a*)a* \1/.examples` - which includes "aaaa aa". This is because each repeater is not context-aware, so the "greediness" logic is flawed. (E.g. in this case, the second `a*` should always evaluate to an empty string, because the previous `a*` was greedy! However, patterns like this are highly unusual...)
92
- * Some named properties, e.g. `/\p{Arabic}/`, list non-matching examples for ruby 2.0/2.1 (as the definitions changed in ruby 2.2). This will be fixed in version 1.1.0 (see the pending pull request)!
102
+ * Some named properties, e.g. `/\p{Arabic}/`, list non-matching examples for ruby 2.0/2.1 (as the definitions changed in ruby 2.2). This will be fixed in version 1.1.1 (see the pending pull request)!
93
103
 
94
104
  Since the Regexp language is so vast, it's quite likely I've missed something (please raise an issue if you find something)! The only missing feature that I'm currently aware of is:
95
105
  * Conditional capture groups, e.g. `/(group1)? (?(1)yes|no)/.examples` (which *should* return: `["group1 yes", " no"]`)
@@ -127,6 +137,8 @@ When generating examples, the gem uses 2 configurable values to limit how many e
127
137
  * `[h-s]` is equivalent to `[hijkl]`
128
138
  * `(1|2|3|4|5|6|7|8)` is equivalent to `[12345]`
129
139
 
140
+ `Regexp#examples` makes use of *both* these options; `Regexp#random_example` only uses `max_repeater_variance`, since the other option is redundant!
141
+
130
142
  To use an alternative value, simply pass the configuration option as follows:
131
143
 
132
144
  ```ruby
@@ -134,26 +146,29 @@ To use an alternative value, simply pass the configuration option as follows:
134
146
  #=> ['', 'a', 'aa', 'aaa', 'aaaa', 'aaaaa']
135
147
  /[F-X]/.examples(max_group_results: 10)
136
148
  #=> ['F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O']
149
+ /.*/.random_example(max_repeater_variance: 50)
150
+ #=> "A very unlikely result!"
137
151
  ```
138
152
 
139
- _**WARNING**: Choosing huge numbers, along with a "complex" regex, could easily cause your system to freeze!_
153
+ _**WARNING**: Choosing huge numbers for `Regexp#examples`, along with a "complex" regex, could easily cause your system to freeze!_
140
154
 
141
155
  For example, if you try to generate a list of _all_ 5-letter words: `/\w{5}/.examples(max_group_results: 999)`, then since there are actually `63` "word" characters (upper/lower case letters, numbers and "\_"), this will try to generate `63**5 #=> 992436543` (almost 1 _billion_) examples!
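The arithmetic in that paragraph can be checked without the gem. The snippet below is only a sketch: `word_chars` is a name made up for illustration, and it assumes plain ASCII input, where `\w` matches letters, digits and the underscore.

```ruby
# Build the set of ASCII "word" characters that \w matches.
word_chars = [*'a'..'z', *'A'..'Z', *'0'..'9', '_']

puts word_chars.size      # number of distinct \w characters
puts word_chars.size**5   # number of distinct 5-character strings
```

The second number is just under `10**9`, which is why a full enumeration is impractical.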
142
156
 
143
157
  In other words, think twice before playing around with this config!
144
158
 
145
- A more sensible use case might be, for example, to generate one random 1-4 digit string:
159
+ A more sensible use case might be, for example, to generate all 1-4 digit strings:
160
+
161
+ `/\d{1,4}/.examples(max_repeater_variance: 3, max_group_results: 10)`
146
162
 
147
- `/\d{1,4}/.examples(max_repeater_variance: 3, max_group_results: 10).sample(1)`
163
+ Due to code optimisation, this is not something you need to worry about (much) for `Regexp#random_example`. For instance, the following takes no more than ~ 1 second on my machine:
148
164
 
149
- (Note: I may develop a much more efficient way to "generate one example" in a later release of this gem.)
165
+ `/.*\w+\d{100}/.random_example(max_repeater_variance: 1000)`
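To illustrate why a single random example stays cheap even for large repeat counts, here is a rough sketch. It is not the gem's implementation, and `random_repeat` is a hypothetical helper: it picks one repeat count up front and then samples one character per repetition, so the work grows with the length of the output rather than with the number of possible strings.

```ruby
# Hypothetical helper (not part of the gem): builds one random string by
# choosing a repeat count first, then sampling one character per repetition.
def random_repeat(chars, min_repeats, max_repeats)
  count = rand(min_repeats..max_repeats)
  Array.new(count) { chars.sample }.join
end

digits = ('0'..'9').to_a
sample = random_repeat(digits, 100, 100) # comparable to /\d{100}/
puts sample.length
```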
150
166
 
151
167
  ## TODO
152
168
 
153
169
  * Performance improvements:
154
170
  * Use of lambdas/something (in [constants.rb](lib/regexp-examples/constants.rb)) to improve the library load time. See the pending pull request.
155
171
  * (Maybe?) add a `max_examples` configuration option and use lazy evaluation, to ensure the method never "freezes".
156
- * Potential future feature: `Regexp#random_example` - but implementing this properly is non-trivial, due to performance issues that need addressing first!
157
172
  * Write a blog post about how this amazing gem works! :)
158
173
 
159
174
  ## Contributing
@@ -17,7 +17,7 @@ module RegexpExamples
17
17
 
18
18
  class << self
19
19
  attr_reader :max_repeater_variance, :max_group_results
20
- def configure!(max_repeater_variance, max_group_results)
20
+ def configure!(max_repeater_variance, max_group_results = nil)
21
21
  @max_repeater_variance = (max_repeater_variance || MaxRepeaterVarianceDefault)
22
22
  @max_group_results = (max_group_results || MaxGroupResultsDefault)
23
23
  end
@@ -44,7 +44,8 @@ module RegexpExamples
44
44
  Whitespace = [' ', "\t", "\n", "\r", "\v", "\f"]
45
45
  Control = (0..31).map(&:chr) | ["\x7f"]
46
46
  # Ensure that the "common" characters appear first in the array
47
- Any = Lower | Upper | Digit | Punct | (0..127).map(&:chr)
47
+ # Also, ensure "\n" comes first, to make it obvious when included
48
+ Any = ["\n"] | Lower | Upper | Digit | Punct | (0..127).map(&:chr)
48
49
  AnyNoNewLine = Any - ["\n"]
49
50
  end.freeze
50
51
 
@@ -1,12 +1,29 @@
1
1
  module CoreExtensions
2
2
  module Regexp
3
3
  module Examples
4
- def examples(config_options={})
5
- full_examples = RegexpExamples.map_results(
6
- RegexpExamples::Parser.new(source, options, config_options).parse
4
+ def examples(**config_options)
5
+ RegexpExamples::ResultCountLimiters.configure!(
6
+ config_options[:max_repeater_variance],
7
+ config_options[:max_group_results]
7
8
  )
8
- RegexpExamples::BackReferenceReplacer.new.substitute_backreferences(full_examples)
9
+ examples_by_method(:map_results)
10
+ end
11
+
12
+ def random_example(**config_options)
13
+ RegexpExamples::ResultCountLimiters.configure!(
14
+ config_options[:max_repeater_variance]
15
+ )
16
+ examples_by_method(:map_random_result).first
9
17
  end
18
+
19
+ private
20
+ def examples_by_method(method)
21
+ full_examples = RegexpExamples.public_send(
22
+ method,
23
+ RegexpExamples::Parser.new(source, options).parse
24
+ )
25
+ RegexpExamples::BackReferenceReplacer.new.substitute_backreferences(full_examples)
26
+ end
10
27
  end
11
28
  end
12
29
  end
@@ -37,7 +37,14 @@ module RegexpExamples
37
37
  end
38
38
  end
39
39
 
40
+ module RandomResultBySample
41
+ def random_result
42
+ result.sample(1)
43
+ end
44
+ end
45
+
40
46
  class SingleCharGroup
47
+ include RandomResultBySample
41
48
  prepend GroupWithIgnoreCase
42
49
  def initialize(char, ignorecase)
43
50
  @char = char
@@ -48,17 +55,19 @@ module RegexpExamples
48
55
  end
49
56
  end
50
57
 
51
- # Used as a workaround for when a grep is expected to be returned,
58
+ # Used as a workaround for when a group is expected to be returned,
52
59
  # but there are no results for the group.
53
60
  # i.e. PlaceHolderGroup.new.result == '' == SingleCharGroup.new('').result
54
61
  # (But using PlaceHolderGroup makes it clearer what the intention is!)
55
62
  class PlaceHolderGroup
63
+ include RandomResultBySample
56
64
  def result
57
65
  [GroupResult.new('')]
58
66
  end
59
67
  end
60
68
 
61
69
  class CharGroup
70
+ include RandomResultBySample
62
71
  prepend GroupWithIgnoreCase
63
72
  def initialize(chars, ignorecase)
64
73
  @chars = chars
@@ -74,6 +83,7 @@ module RegexpExamples
74
83
  end
75
84
 
76
85
  class DotGroup
86
+ include RandomResultBySample
77
87
  attr_reader :multiline
78
88
  def initialize(multiline)
79
89
  @multiline = multiline
@@ -94,30 +104,48 @@ module RegexpExamples
94
104
  @group_id = group_id
95
105
  end
96
106
 
97
- # Generates the result of each contained group
98
- # and adds the filled group of each result to
99
- # itself
100
107
  def result
101
- strings = @groups.map {|repeater| repeater.result}
108
+ result_by_method(:result)
109
+ end
110
+
111
+ def random_result
112
+ result_by_method(:random_result)
113
+ end
114
+
115
+ private
116
+ # Generates the result of each contained group
117
+ # and adds the filled group of each result to itself
118
+ def result_by_method(method)
119
+ strings = @groups.map {|repeater| repeater.public_send(method)}
102
120
  RegexpExamples.permutations_of_strings(strings).map do |result|
103
121
  GroupResult.new(result, group_id)
104
122
  end
105
123
  end
106
124
  end
107
125
 
108
- class MultiGroupEnd
109
- end
110
-
111
126
  class OrGroup
112
127
  def initialize(left_repeaters, right_repeaters)
113
128
  @left_repeaters = left_repeaters
114
129
  @right_repeaters = right_repeaters
115
130
  end
116
131
 
117
-
118
132
  def result
119
- left_result = RegexpExamples.map_results(@left_repeaters)
120
- right_result = RegexpExamples.map_results(@right_repeaters)
133
+ result_by_method(:map_results)
134
+ end
135
+
136
+ def random_result
137
+ # TODO: This logic is flawed in terms of choosing a truly "random" example!
138
+ # E.g. /a|b|c|d/.random_example will choose a letter with the following probabilities:
139
+ # a = 50%, b = 25%, c = 12.5%, d = 12.5%
140
+ # In order to fix this, I must either apply some weighted selection logic,
141
+ # or change how the OrGroup examples are generated - i.e. make this class work with >2 repeaters
142
+ result_by_method(:map_random_result).sample(1)
143
+ end
144
+
145
+ private
146
+ def result_by_method(method)
147
+ left_result = RegexpExamples.public_send(method, @left_repeaters)
148
+ right_result = RegexpExamples.public_send(method, @right_repeaters)
121
149
  left_result.concat(right_result).flatten.uniq.map do |result|
122
150
  GroupResult.new(result)
123
151
  end
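The skew described in the TODO comment of `OrGroup#random_result` can be reproduced with a small model. This is a sketch under a stated assumption, not the gem's code: it assumes `a|b|c|d` is represented as right-nested pairs and that each pair picks its left or right side with equal probability. `nested_probabilities` is a hypothetical name.

```ruby
# Hypothetical model (not the gem's code): a|b|c|d is assumed to nest to the
# right as ['a', ['b', ['c', 'd']]], and each pair picks a side 50/50.
def nested_probabilities(node, weight = Rational(1), acc = Hash.new(0))
  if node.is_a?(Array)
    left, right = node
    nested_probabilities(left, weight / 2, acc)
    nested_probabilities(right, weight / 2, acc)
  else
    acc[node] += weight
  end
  acc
end

probs = nested_probabilities(['a', ['b', ['c', 'd']]])
probs.each { |letter, prob| puts "#{letter}: #{(prob * 100).to_f}%" }

# A flat choice over all alternatives would be uniform instead:
flat = %w[a b c d]
puts flat.sample
```

Under this model the probabilities come out as 1/2, 1/4, 1/8 and 1/8, matching the figures in the comment. Sampling from a flat list of alternatives is one way to obtain a uniform choice.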
@@ -125,6 +153,7 @@ module RegexpExamples
125
153
  end
126
154
 
127
155
  class BackReferenceGroup
156
+ include RandomResultBySample
128
157
  attr_reader :id
129
158
  def initialize(id)
130
159
  @id = id
@@ -1,8 +1,6 @@
1
1
  module RegexpExamples
2
- # Given an array of arrays of strings,
3
- # returns all possible perutations,
4
- # for strings created by joining one
5
- # element from each array
2
+ # Given an array of arrays of strings, returns all possible permutations
3
+ # for strings, created by joining one element from each array
6
4
  #
7
5
  # For example:
8
6
  # permutations_of_strings [ ['a'], ['b'], ['c', 'd', 'e'] ] #=> ['abc', 'abd', 'abe']
@@ -29,8 +27,17 @@ module RegexpExamples
29
27
  end
30
28
 
31
29
  def self.map_results(repeaters)
30
+ generic_map_result(repeaters, :result)
31
+ end
32
+
33
+ def self.map_random_result(repeaters)
34
+ generic_map_result(repeaters, :random_result)
35
+ end
36
+
37
+ private
38
+ def self.generic_map_result(repeaters, method)
32
39
  repeaters
33
- .map {|repeater| repeater.result}
40
+ .map {|repeater| repeater.public_send(method)}
34
41
  .instance_eval do |partial_results|
35
42
  RegexpExamples.permutations_of_strings(partial_results)
36
43
  end
@@ -2,24 +2,19 @@ module RegexpExamples
2
2
  IllegalSyntaxError = Class.new(StandardError)
3
3
  class Parser
4
4
  attr_reader :regexp_string
5
- def initialize(regexp_string, regexp_options, config_options={})
5
+ def initialize(regexp_string, regexp_options)
6
6
  @regexp_string = regexp_string
7
7
  @ignorecase = !(regexp_options & Regexp::IGNORECASE).zero?
8
8
  @multiline = !(regexp_options & Regexp::MULTILINE).zero?
9
9
  @extended = !(regexp_options & Regexp::EXTENDED).zero?
10
10
  @num_groups = 0
11
11
  @current_position = 0
12
- ResultCountLimiters.configure!(
13
- config_options[:max_repeater_variance],
14
- config_options[:max_group_results]
15
- )
16
12
  end
17
13
 
18
14
  def parse
19
15
  repeaters = []
20
- while @current_position < regexp_string.length
16
+ until end_of_regexp
21
17
  group = parse_group(repeaters)
22
- break if group.is_a? MultiGroupEnd
23
18
  if group.is_a? OrGroup
24
19
  return [OneTimeRepeater.new(group)]
25
20
  end
@@ -35,8 +30,6 @@ module RegexpExamples
35
30
  case next_char
36
31
  when '('
37
32
  group = parse_multi_group
38
- when ')'
39
- group = parse_multi_end_group
40
33
  when '['
41
34
  group = parse_char_group
42
35
  when '.'
@@ -241,10 +234,6 @@ module RegexpExamples
241
234
  @extended = false if (off.include? "x")
242
235
  end
243
236
 
244
- def parse_multi_end_group
245
- MultiGroupEnd.new
246
- end
247
-
248
237
  def parse_char_group
249
238
  @current_position += 1 # Skip past opening "["
250
239
  chargroup_parser = ChargroupParser.new(rest_of_string)
@@ -345,6 +334,10 @@ module RegexpExamples
345
334
  def next_char
346
335
  regexp_string[@current_position]
347
336
  end
337
+
338
+ def end_of_regexp
339
+ next_char == ")" || @current_position >= regexp_string.length
340
+ end
348
341
  end
349
342
  end
350
343
 
@@ -1,12 +1,12 @@
1
1
  module RegexpExamples
2
2
  class BaseRepeater
3
- attr_reader :group
3
+ attr_reader :group, :min_repeats, :max_repeats
4
4
  def initialize(group)
5
5
  @group = group
6
6
  end
7
7
 
8
- def result(min_repeats, max_repeats)
9
- group_results = @group.result[0 .. RegexpExamples.MaxGroupResults-1]
8
+ def result
9
+ group_results = group.result[0 .. RegexpExamples.MaxGroupResults-1]
10
10
  results = []
11
11
  min_repeats.upto(max_repeats) do |repeats|
12
12
  if repeats.zero?
@@ -19,66 +19,60 @@ module RegexpExamples
19
19
  end
20
20
  results.flatten.uniq
21
21
  end
22
+
23
+ def random_result
24
+ result = []
25
+ rand(min_repeats..max_repeats).times { result << group.random_result }
26
+ result << [ GroupResult.new('') ] if result.empty? # in case of 0.times
27
+ RegexpExamples::permutations_of_strings(result)
28
+ end
22
29
  end
23
30
 
24
31
  class OneTimeRepeater < BaseRepeater
25
32
  def initialize(group)
26
33
  super
27
- end
28
-
29
- def result
30
- super(1, 1)
34
+ @min_repeats = 1
35
+ @max_repeats = 1
31
36
  end
32
37
  end
33
38
 
34
39
  class StarRepeater < BaseRepeater
35
40
  def initialize(group)
36
41
  super
37
- end
38
-
39
- def result
40
- super(0, RegexpExamples.MaxRepeaterVariance)
42
+ @min_repeats = 0
43
+ @max_repeats = RegexpExamples.MaxRepeaterVariance
41
44
  end
42
45
  end
43
46
 
44
47
  class PlusRepeater < BaseRepeater
45
48
  def initialize(group)
46
49
  super
47
- end
48
-
49
- def result
50
- super(1, RegexpExamples.MaxRepeaterVariance + 1)
50
+ @min_repeats = 1
51
+ @max_repeats = RegexpExamples.MaxRepeaterVariance + 1
51
52
  end
52
53
  end
53
54
 
54
55
  class QuestionMarkRepeater < BaseRepeater
55
56
  def initialize(group)
56
57
  super
57
- end
58
-
59
- def result
60
- super(0, 1)
58
+ @min_repeats = 0
59
+ @max_repeats = 1
61
60
  end
62
61
  end
63
62
 
64
63
  class RangeRepeater < BaseRepeater
65
64
  def initialize(group, min, has_comma, max)
66
65
  super(group)
67
- @min = min || 0
68
- if max
69
- # Prevent huge number of results in case of e.g. /.{1,100}/.examples
70
- @max = smallest(max, @min + RegexpExamples.MaxRepeaterVariance)
71
- elsif has_comma
72
- @max = @min + RegexpExamples.MaxRepeaterVariance
73
- else
74
- @max = @min
66
+ @min_repeats = min || 0
67
+ if max # e.g. {1,100} --> Treat as {1,3} or similar, to prevent a huge number of results
68
+ @max_repeats = smallest(max, @min_repeats + RegexpExamples.MaxRepeaterVariance)
69
+ elsif has_comma # e.g. {2,} --> Treat as {2,4} or similar
70
+ @max_repeats = @min_repeats + RegexpExamples.MaxRepeaterVariance
71
+ else # e.g. {3} --> Treat as {3,3}
72
+ @max_repeats = @min_repeats
75
73
  end
76
74
  end
77
75
 
78
- def result
79
- super(@min, @max)
80
- end
81
-
82
76
  private
83
77
  def smallest(x, y)
84
78
  (x < y) ? x : y
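The bounds logic in `RangeRepeater#initialize` can be tried in isolation. The version below is a standalone sketch: `repeat_bounds` and `MAX_REPEATER_VARIANCE` are hypothetical names, and the value `2` is an assumption chosen to agree with the "`{1,100}` --> `{1,3}`" comment, not a value confirmed by this diff.

```ruby
# Assumed value, for illustration only; the gem reads this from its configuration.
MAX_REPEATER_VARIANCE = 2

# Hypothetical standalone version of the bounds logic for {min}, {min,} and {min,max}.
def repeat_bounds(min, has_comma, max, variance = MAX_REPEATER_VARIANCE)
  min_repeats = min || 0
  max_repeats =
    if max          # e.g. {1,100}: cap the upper bound
      [max, min_repeats + variance].min
    elsif has_comma # e.g. {2,}: open-ended, so add the variance
      min_repeats + variance
    else            # e.g. {3}: exact count
      min_repeats
    end
  [min_repeats, max_repeats]
end

p repeat_bounds(1, true, 100)  # {1,100}
p repeat_bounds(2, true, nil)  # {2,}
p repeat_bounds(3, false, nil) # {3}
```

With the assumed variance of 2, these calls yield the pairs `[1, 3]`, `[2, 4]` and `[3, 3]`.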
@@ -1,3 +1,3 @@
1
1
  module RegexpExamples
2
- VERSION = '1.0.2'
2
+ VERSION = '1.1.0'
3
3
  end
@@ -0,0 +1,23 @@
1
+ RSpec.describe Regexp, "#random_example" do
2
+ def self.random_example_matches(*regexps)
3
+ regexps.each do |regexp|
4
+ it "random example for /#{regexp.source}/" do
5
+ random_example = regexp.random_example
6
+
7
+ expect(random_example).to be_a String # Not an Array!
8
+ expect(random_example).to match(Regexp.new("\\A(?:#{regexp.source})\\z", regexp.options))
9
+ end
10
+ end
11
+ end
12
+
13
+ context "smoke tests" do
14
+ # Just a few "smoke tests", to ensure the basic method isn't broken.
15
+ # Testing of the RegexpExamples::Parser class is all covered by Regexp#examples test already.
16
+ random_example_matches(
17
+ /\w{10}/,
18
+ /(we(need(to(go(deeper)?)?)?)?) \1/,
19
+ /case insensitive/i,
20
+ /front seat|back seat/, # Which seat will I take??
21
+ )
22
+ end
23
+ end
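The spec wraps each pattern in `\A(?:...)\z` before matching. The sketch below shows why both the anchors and the non-capturing group matter; `full_match?` is a hypothetical helper written for illustration, not something the gem provides.

```ruby
# Hypothetical helper mirroring the spec's check: wrap the pattern in
# \A(?:...)\z so that the whole string, not just a substring, must match.
def full_match?(regexp, string)
  anchored = Regexp.new("\\A(?:#{regexp.source})\\z", regexp.options)
  !!(string =~ anchored)
end

puts full_match?(/front seat|back seat/, "back seat")
puts full_match?(/front seat|back seat/, "the back seat")
puts full_match?(/case insensitive/i, "CASE Insensitive")

# Without the (?:...) group, the anchors would bind to only one alternative each:
loose = Regexp.new("\\A#{/front seat|back seat/.source}\\z")
puts !!("front seat!!" =~ loose)
```

Passing `regexp.options` through keeps flags such as `i` in effect for the anchored pattern.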
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: regexp-examples
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.0.2
4
+ version: 1.1.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Tom Lord
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2015-03-07 00:00:00.000000000 Z
11
+ date: 2015-03-08 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: bundler
@@ -64,6 +64,7 @@ files:
64
64
  - regexp-examples.gemspec
65
65
  - scripts/unicode_lister.rb
66
66
  - spec/regexp-examples_spec.rb
67
+ - spec/regexp-random_example_spec.rb
67
68
  - spec/spec_helper.rb
68
69
  homepage: http://rubygems.org/gems/regexp-examples
69
70
  licenses:
@@ -91,4 +92,5 @@ specification_version: 4
91
92
  summary: Extends the Regexp class with '#examples'
92
93
  test_files:
93
94
  - spec/regexp-examples_spec.rb
95
+ - spec/regexp-random_example_spec.rb
94
96
  - spec/spec_helper.rb