regexp-examples 1.3.1 → 1.3.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 94abdab1c267eae56ef6b2d7d9968891776cb64d
- data.tar.gz: 8e01129cc7d0d34d584f25e52950e89cea9d431b
+ metadata.gz: 7ce9ae460670e7c7525a38992990271751c311d4
+ data.tar.gz: f9501da52ba2b57b6ef92324f43ba7252a8d70f8
  SHA512:
- metadata.gz: 776ca69a842d56844c0fd2b9eb0d03d5999a1bce2018ff74ee0a92106347f5710dc8c0bab1ce9beb705393768be8709d1f2193bc23381385f386e037f68151fb
- data.tar.gz: 1c429180eec239661ab896a19a88835428ec74e05bb9af4ad6d63998d3d213510fc72d50d91be96eac9e78cad9e42c6f18d29f61b772b954fc48ea0b16b75a7e
+ metadata.gz: 5fabfe8bf9dcb41d4e1a6ceba7b8c98278ea44f12ea657409821abd25fbf856b11effc726d67d06c0d2c67e728caf24853a89e2e73f26f1c5fba58ea397478e9
+ data.tar.gz: 717755a835ca5f0a611c4a62ba4d8f4b90b7a422ce8d80136f79a97a71ee2f439b027c3fbd42f68ac79e65884940f0004b9f35a72197e8748cc6c7fef2680891
@@ -2,8 +2,9 @@ language: ruby
  rvm:
  - 2.0.0
  - 2.1.10
- - 2.2.6
- - 2.3.3
+ - 2.2.7
+ - 2.3.4
+ - 2.4.1
  - ruby-head
  matrix:
  allow_failures:
data/README.md CHANGED
@@ -15,7 +15,7 @@ or a huge number of possible matches, such as `/.\w/`, then only a subset of the

  For more detail on this, see [configuration options](#configuration-options).

- If you'd like to understand how/why this gem works, please check out my [blog post](http://tom-lord.weebly.com/blog/reverse-engineering-regular-expressions) about it.
+ If you'd like to understand how/why this gem works, please check out my [blog post](https://tom-lord.github.io/Reverse-Engineering-Regular-Expressions/) about it.

  ## Usage

@@ -100,12 +100,14 @@ Long answer:
  * Negation, e.g. `/[^a-z]/`
  * Escaped characters, e.g. `/[\w\s\b]/`
  * POSIX bracket expressions, e.g. `/[[:alnum:]]/`, `/[[:^space:]]/`
+ * ...Taking the current ruby version into account - e.g. the definition of `/[[:punct:]]/`
+ [changed](https://bugs.ruby-lang.org/issues/12577) in version `2.4.0`.
  * Set intersection, e.g. `/[[a-h]&&[f-z]]/`
  * Escaped characters, e.g. `/\n/`, `/\w/`, `/\D/` (and so on...)
  * Capture groups, e.g. `/(group)/`
  * Including named groups, e.g. `/(?<name>group)/`
  * And backreferences(!!!), e.g. `/(this|that) \1/` `/(?<name>foo) \k<name>/`
- * ...even for the more "obscure" syntax, e.g. `/(?<future>the) \k'future'/`, `/(a)(b) \k<-1>/``
+ * ...even for the more "obscure" syntax, e.g. `/(?<future>the) \k'future'/`, `/(a)(b) \k<-1>/`
  * ...and even if nested or optional, e.g. `/(even(this(works?))) \1 \2 \3/`, `/what about (this)? \1/`
  * Non-capture groups, e.g. `/(?:foo)/`
  * Comment groups, e.g. `/foo(?#comment)bar/`
@@ -178,15 +180,12 @@ For instance, the following takes no more than ~ 1 second on my machine:
  There are no known major bugs with this library. However, there are a few obscure issues that you *may* encounter:

  * Conditional capture groups, e.g. `/(group1)? (?(1)yes|no)/.examples` are not yet supported. (This example *should* return: `["group1 yes", " no"]`)
- * `\Z` should be interpreted like `\n?\z`; it's currently just interpreted like `\z`. (This basically just means you'll be missing a few examples.)
- * Ideally, `regexp#examples` should always return up to `max_results_limit`. Currenty, it usually "aborts" before this limit is reached.
- (I.e. the exact number of examples generated can be hard to predict, for complex patterns.)
- * There are some (rare) edge cases where backreferences do not work properly, e.g. `/(a*)a* \1/.examples` -
- which includes `"aaaa aa"`. This is because each repeater is not context-aware, so the "greediness" logic is flawed.
- (E.g. in this case, the second `a*` should always evaluate to an empty string, because the previous `a*` was greedy.)
- However, patterns like this are highly unusual...
  * Nested repeat operators are incorrectly parsed, e.g. `/b{2}{3}/` - which *should* be interpreted like `/b{6}/`. (However, there is probably no reason
  to ever write regexes like this!)
+ * A new ["absent operator" (`/(?~exp)/`)](https://medium.com/rubyinside/the-new-absent-operator-in-ruby-s-regular-expressions-7c3ef6cd0b99)
+ was added to Ruby version `2.4.1`. This gem does not yet support it (or gracefully fail when used).
+ * Ideally, `regexp#examples` should always return up to `max_results_limit`. Currenty, it usually "aborts" before this limit is reached.
+ (I.e. the exact number of examples generated can be hard to predict, for complex patterns.)

  Some of the most obscure regexp features are not even mentioned in [the ruby docs](http://ruby-doc.org/core/Regexp.html).
  However, full documentation on all the intricate obscurities in the ruby (version 2.x) regexp parser can be found
@@ -195,7 +194,7 @@ However, full documentation on all the intricate obscurities in the ruby (versio
  ## Impossible features ("illegal syntax")

  The following features in the regex language can never be properly implemented into this gem because, put simply, they are not technically "regular"!
- If you'd like to understand this in more detail, check out what I had to say in [my blog post](http://tom-lord.weebly.com/blog/reverse-engineering-regular-expressions) about this gem.
+ If you'd like to understand this in more detail, check out what I had to say in [my blog post](https://tom-lord.github.io/Reverse-Engineering-Regular-Expressions/) about this gem.

  Using any of the following will raise a `RegexpExamples::IllegalSyntax` exception:

@@ -15,7 +15,7 @@ module RegexpExamples
  include CharsetNegationHelper

  attr_reader :regexp_string, :current_position
- alias_method :length, :current_position
+ alias length current_position

  def initialize(regexp_string, is_sub_group: false)
  @regexp_string = regexp_string
@@ -85,10 +85,10 @@ module RegexpExamples

  def parse_after_backslash
  case next_char
- when *BackslashCharMap.keys
- BackslashCharMap[next_char]
  when 'b'
  ["\b"]
+ when *BackslashCharMap.keys
+ BackslashCharMap[next_char]
  else
  [next_char]
  end
@@ -21,10 +21,12 @@ module RegexpExamples
  # This is to prevent the system "freezing" when given instructions like:
  # /[ab]{30}/.examples
  # (Which would attempt to generate 2**30 == 1073741824 examples!!!)
- MAX_RESULTS_LIMIT_DEFAULT = 10000
+ MAX_RESULTS_LIMIT_DEFAULT = 10_000
  class << self
  attr_reader :max_repeater_variance, :max_group_results, :max_results_limit
- def configure!(max_repeater_variance: nil, max_group_results: nil, max_results_limit: nil)
+ def configure!(max_repeater_variance: nil,
+ max_group_results: nil,
+ max_results_limit: nil)
  @max_repeater_variance = (max_repeater_variance || MAX_REPEATER_VARIANCE_DEFAULT)
  @max_group_results = (max_group_results || MAX_GROUP_RESULTS_DEFAULT)
  @max_results_limit = (max_results_limit || MAX_RESULTS_LIMIT_DEFAULT)
@@ -35,9 +37,11 @@ module RegexpExamples
  def self.max_repeater_variance
  ResultCountLimiters.max_repeater_variance
  end
+
  def self.max_group_results
  ResultCountLimiters.max_group_results
  end
+
  def self.max_results_limit
  ResultCountLimiters.max_results_limit
  end
@@ -48,13 +52,11 @@ module RegexpExamples
  Lower = Array('a'..'z')
  Upper = Array('A'..'Z')
  Digit = Array('0'..'9')
- # Note: Punct should also include the following chars: $ + < = > ^ ` | ~
- # I.e. Punct = %w(! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \\ ] ^ _ ` { | } ~)
- # However, due to a ruby bug (!!) these do not work properly at the moment!
- Punct = %w(! " # % & ' ( ) * , - . / : ; ? @ [ \\ ] _ { })
+ Punct = %w[! " # % & ' ( ) * , - . / : ; ? @ [ \\ ] _ { }] \
+ | (RUBY_VERSION >= '2.4.0' ? %w[$ + < = > ^ ` | ~] : [])
  Hex = Array('a'..'f') | Array('A'..'F') | Digit
  Word = Lower | Upper | Digit | ['_']
- Whitespace = [' ', "\t", "\n", "\r", "\v", "\f"]
+ Whitespace = [' ', "\t", "\n", "\r", "\v", "\f"].freeze
  Control = (0..31).map(&:chr) | ["\x7f"]
  # Ensure that the "common" characters appear first in the array
  # Also, ensure "\n" comes first, to make it obvious when included
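
Note on the `Punct` change above: the character list is now assembled with an array union that depends on the running Ruby version. The sketch below reproduces that idea outside the gem. The constant names `BASE_PUNCT` and `EXTRA_PUNCT` are hypothetical, and it swaps the string comparison for `Gem::Version`, which is an assumption of this note rather than what the gem does (a plain string comparison would misorder a hypothetical `'10.0.0'`).

```ruby
# Hypothetical names; the two literals are copied from the hunk above.
BASE_PUNCT  = %w[! " # % & ' ( ) * , - . / : ; ? @ [ \\ ] _ { }].freeze
EXTRA_PUNCT = %w[$ + < = > ^ ` | ~].freeze

# Gem::Version compares version segments numerically, not as strings.
modern = Gem::Version.new(RUBY_VERSION) >= Gem::Version.new('2.4.0')

# Array#| is a set union: the extra characters are only added on Ruby >= 2.4.0.
punct = BASE_PUNCT | (modern ? EXTRA_PUNCT : [])

# Cross-check the list against the interpreter's own [[:punct:]] definition.
unmatched = punct.reject { |char| char =~ /\A[[:punct:]]\z/ }
puts "#{punct.size} characters, #{unmatched.size} not matched by [[:punct:]]"
```

Checking the list against `/[[:punct:]]/` at runtime is a cheap way to notice if the interpreter's definition and the hard-coded list drift apart again.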
@@ -131,7 +131,7 @@ module RegexpExamples
  end
  end

- alias_method :random_result, :result
+ alias random_result result
  end

  # A boolean "or" group.
@@ -162,7 +162,7 @@ module RegexpExamples
  max_results_limiter = MaxResultsLimiterBySum.new
  repeaters_list
  .map { |repeaters| RegexpExamples.generic_map_result(repeaters, method) }
- .map { |result| max_results_limiter.limit_results(result)}
+ .map { |result| max_results_limiter.limit_results(result) }
  .inject(:concat)
  .map { |result| GroupResult.new(result) }
  .uniq
@@ -184,7 +184,7 @@ module RegexpExamples
  # of /\1/ as being "__1__". It later gets updated.
  class BackReferenceGroup
  include RandomResultBySample
- PLACEHOLDER_FORMAT = '__%s__'
+ PLACEHOLDER_FORMAT = '__%s__'.freeze
  attr_reader :id
  def initialize(id)
  @id = id
@@ -44,7 +44,7 @@ module RegexpExamples
  end

  # For example:
- # Needed when generating examples for /[ab]{10}|{cd}{11}/
+ # Needed when generating examples for /[ab]{10}|{cd}{11}/
  # (here, results_count will reach 1024 + 2048 == 3072)
  class MaxResultsLimiterBySum < MaxResultsLimiter
  def initialize
@@ -5,35 +5,33 @@ module RegexpExamples

  def parse_after_backslash_group
  @current_position += 1
- case
- when rest_of_string =~ /\A(\d{1,3})/
+ if rest_of_string =~ /\A(\d{1,3})/
  parse_regular_backreference_group(Regexp.last_match(1))
- when rest_of_string =~ /\Ak['<]([\w-]+)['>]/
+ elsif rest_of_string =~ /\Ak['<]([\w-]+)['>]/
  parse_named_backreference_group(Regexp.last_match(1))
- when BackslashCharMap.keys.include?(next_char)
+ elsif BackslashCharMap.keys.include?(next_char)
  parse_backslash_special_char
- when rest_of_string =~ /\A(c|C-)(.)/
+ elsif rest_of_string =~ /\A(c|C-)(.)/
  parse_backslash_control_char(Regexp.last_match(1), Regexp.last_match(2))
- when rest_of_string =~ /\Ax(\h{1,2})/
+ elsif rest_of_string =~ /\Ax(\h{1,2})/
  parse_backslash_escape_sequence(Regexp.last_match(1))
- when rest_of_string =~ /\Au(\h{4}|\{\h{1,4}\})/
+ elsif rest_of_string =~ /\Au(\h{4}|\{\h{1,4}\})/
  parse_backslash_unicode_sequence(Regexp.last_match(1))
- when rest_of_string =~ /\A(p)\{(\^?)([^}]+)\}/i
+ elsif rest_of_string =~ /\A(p)\{(\^?)([^}]+)\}/i
  parse_backslash_named_property(
  Regexp.last_match(1), Regexp.last_match(2), Regexp.last_match(3)
  )
- when next_char == 'K' # Keep (special lookbehind that CAN be supported safely!)
+ elsif next_char == 'K' # Keep (special lookbehind that CAN be supported safely!)
  PlaceHolderGroup.new
- when next_char == 'R'
+ elsif next_char == 'R'
  parse_backslash_linebreak
- when next_char == 'g'
+ elsif next_char == 'g'
  parse_backslash_subexpresion_call
- when next_char =~ /[bB]/
+ elsif next_char =~ /[bB]/
  parse_backslash_anchor
- when next_char =~ /[AG]/
+ elsif next_char =~ /[AG]/
  parse_backslash_start_of_string
- when next_char =~ /[zZ]/
- # TODO: /\Z/ should be treated as /\n?/
+ elsif next_char =~ /[zZ]/
  parse_backslash_end_of_string
  else
  parse_single_char_group(next_char)
@@ -112,8 +110,8 @@ module RegexpExamples
  end

  def parse_backslash_subexpresion_call
- fail IllegalSyntaxError,
- 'Subexpression calls (\\g) cannot be supported, as they are not regular'
+ raise IllegalSyntaxError,
+ 'Subexpression calls (\\g) cannot be supported, as they are not regular'
  end

  def parse_backslash_anchor
@@ -130,15 +128,19 @@ module RegexpExamples

  def parse_backslash_end_of_string
  if @current_position == (regexp_string.length - 1)
- PlaceHolderGroup.new
+ if next_char == 'z'
+ PlaceHolderGroup.new
+ else # next_char == 'Z'
+ QuestionMarkRepeater.new(SingleCharGroup.new("\n", @ignorecase))
+ end
  else
  raise_anchors_exception!
  end
  end

  def raise_anchors_exception!
- fail IllegalSyntaxError,
- "Anchors ('#{next_char}') cannot be supported, as they are not regular"
+ raise IllegalSyntaxError,
+ "Anchors ('#{next_char}') cannot be supported, as they are not regular"
  end
  end
  end
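
Note on the `\Z` change above: the new branch treats a trailing `\Z` as an optional `"\n"`, which mirrors how Ruby's own matcher distinguishes the two anchors. The following is a plain-Ruby illustration of that difference; it does not call the gem, and the variable names are made up for this note.

```ruby
# \z matches only at the very end of the string.
# \Z also matches immediately before one trailing newline.
candidates = ['test', "test\n", "test\n\n"]

lower_z_matches = candidates.select { |s| s =~ /test\z/ }
upper_z_matches = candidates.select { |s| s =~ /test\Z/ }

p lower_z_matches # ["test"]
p upper_z_matches # ["test", "test\n"]
```

This is why generating `["test", "test\n"]` for `/test\Z/` is the natural set of examples, while `/test\z/` yields only `"test"`.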
@@ -28,15 +28,14 @@ module RegexpExamples
  )?
  /x
  ) do |match|
- case
- when match[1].nil? # e.g. /(normal)/
+ if match[1].nil? # e.g. /(normal)/
  group_id = @num_groups.to_s
- when match[2] == ':' # e.g. /(?:nocapture)/
+ elsif match[2] == ':' # e.g. /(?:nocapture)/
  @current_position += 2
- when match[2] == '#' # e.g. /(?#comment)/
+ elsif match[2] == '#' # e.g. /(?#comment)/
  comment_group = rest_of_string.match(/.*?[^\\](?:\\{2})*\)/)[0]
  @current_position += comment_group.length
- when match[2] =~ /\A(?=[mix-]+)([mix]*)-?([mix]*)/ # e.g. /(?i-mx)/
+ elsif match[2] =~ /\A(?=[mix-]+)([mix]*)-?([mix]*)/ # e.g. /(?i-mx)/
  regexp_options_toggle(Regexp.last_match(1), Regexp.last_match(2))
  @num_groups -= 1 # Toggle "groups" should not increase backref group count
  @current_position += $&.length + 1
@@ -45,12 +44,12 @@ module RegexpExamples
  else
  return PlaceHolderGroup.new
  end
- when %w(! =).include?(match[2]) # e.g. /(?=lookahead)/, /(?!neglookahead)/
- fail IllegalSyntaxError,
- 'Lookaheads are not regular; cannot generate examples'
- when %w(! =).include?(match[3]) # e.g. /(?<=lookbehind)/, /(?<!neglookbehind)/
- fail IllegalSyntaxError,
- 'Lookbehinds are not regular; cannot generate examples'
+ elsif %w[! =].include?(match[2]) # e.g. /(?=lookahead)/, /(?!neglookahead)/
+ raise IllegalSyntaxError,
+ 'Lookaheads are not regular; cannot generate examples'
+ elsif %w[! =].include?(match[3]) # e.g. /(?<=lookbehind)/, /(?<!neglookbehind)/
+ raise IllegalSyntaxError,
+ 'Lookbehinds are not regular; cannot generate examples'
  else # e.g. /(?<name>namedgroup)/
  @current_position += (match[3].length + 3)
  group_id = match[3]
@@ -92,7 +92,7 @@ module RegexpExamples
  private

  def smallest(x, y)
- (x < y) ? x : y
+ x < y ? x : y
  end
  end
  end
@@ -10,7 +10,7 @@ module RegexpExamples
  # Note: Only the first 128 results are listed, for performance.
  # Also, some groups seem to have no matches (weird!)
  # (Don't care about ruby micro version number)
- STORE_FILENAME = "unicode_ranges_#{RUBY_VERSION[0..2]}.pstore"
+ STORE_FILENAME = "unicode_ranges_#{RUBY_VERSION[0..2]}.pstore".freeze

  attr_reader :range_store

@@ -24,7 +24,7 @@ module RegexpExamples
  end
  end

- alias_method :[], :get
+ alias [] get

  private

@@ -40,7 +40,7 @@ module RegexpExamples
  def ranges_to_unicode(ranges)
  result = []
  ranges.each do |range|
- if range.is_a? Fixnum # Small hack to increase data compression
+ if range.is_a? Integer # Small hack to increase data compression
  result << hex_to_unicode(range.to_s(16))
  else
  range.each { |num| result << hex_to_unicode(num.to_s(16)) }
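
Note on the `Fixnum` to `Integer` change above: Ruby 2.4 unified `Fixnum` and `Bignum` into `Integer`, and the old constants were deprecated from that release on, so checking against `Integer` is the portable form. A minimal sketch of the same "single value or range" branching follows. The helper name `expand_ranges` is hypothetical and stands in for the gem's `ranges_to_unicode`, without the hex conversion step.

```ruby
# Hypothetical helper: flattens a mixed list of single code points and ranges.
def expand_ranges(ranges)
  ranges.flat_map do |range|
    # Integer covers both small and arbitrarily large values on every Ruby version.
    range.is_a?(Integer) ? [range] : range.to_a
  end
end

code_points = expand_ranges([65, (97..99), 2**70])
p code_points
```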
@@ -1,4 +1,4 @@
  # Gem version
  module RegexpExamples
- VERSION = '1.3.1'
+ VERSION = '1.3.2'.freeze
  end
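
Note on the `.freeze` additions in this release: assigning a string to a constant does not make the string itself immutable, so freezing it guards against accidental in-place mutation. A small sketch, using the hypothetical constant name `EXAMPLE_VERSION` rather than the gem's own `VERSION`:

```ruby
EXAMPLE_VERSION = '1.3.2'.freeze # hypothetical constant for illustration

mutation_blocked =
  begin
    EXAMPLE_VERSION << '-beta' # in-place mutation of a frozen string
    false
  rescue RuntimeError
    # On Ruby 2.5+ the error is a FrozenError, which is a RuntimeError subclass.
    true
  end

puts "frozen: #{EXAMPLE_VERSION.frozen?}, mutation blocked: #{mutation_blocked}"
```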
@@ -170,8 +170,9 @@ RSpec.describe Regexp, '#examples' do
  /\Glast-match/,
  /^start/,
  /end$/,
- /end\z/,
- /end\Z/
+ /end\z/
+ # Cannot test /end\Z/ with the generic method here,
+ # as it's a special case. Tested specially below.
  )
  end

@@ -303,6 +304,11 @@ RSpec.describe Regexp, '#examples' do
  it { expect(/a{1}?/.examples).to match_array ['', 'a'] }
  end

+ context 'end of string' do
+ it { expect(/test\z/.examples).to match_array %w(test) }
+ it { expect(/test\Z/.examples).to match_array ['test', "test\n"] }
+ end
+
  context 'backreferences and escaped octal combined' do
  it do
  expect(/(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)? \10\9\8\7\6\5\4\3\2\1/.examples)
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: regexp-examples
  version: !ruby/object:Gem::Version
- version: 1.3.1
+ version: 1.3.2
  platform: ruby
  authors:
  - Tom Lord
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2016-12-30 00:00:00.000000000 Z
+ date: 2017-06-06 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: bundler
@@ -100,7 +100,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  version: '0'
  requirements: []
  rubyforge_project:
- rubygems_version: 2.6.8
+ rubygems_version: 2.6.12
  signing_key:
  specification_version: 4
  summary: Extends the Regexp class with '#examples' and '#random_example'