RubyGems - list_matcher - Versions diffs - 1.0.1 → 1.0.2 - Mend

list_matcher 1.0.1 → 1.0.2

Files changed (7)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: 233f0ed367dd0fe761cff13a2998ce8310401d0c
-  data.tar.gz: a1cbb19b367ce7aa5279d68a519c16f613462ad2
+  metadata.gz: 7af9157cc2a2c829220e00e56fac87000c3bf5d4
+  data.tar.gz: d7dee5b5521351a63b7f61a5e30dd61cb69d6071
 SHA512:
-  metadata.gz: 8665a1df6177622c962ab021aa3a60177e8c2cd33dd6ca68a12f7cf96c3305a156ba8a91a7c6d763644b5d0b809fe4add5c373afb32c3496302dac09dac8eccd
-  data.tar.gz: fc2f7625c101f014ea1c8151387539b2a89636fcb4d5cedfee742ba99a5da450133d617a8ca71be9ce5ff6ce02b0772e11843095d8355efd7d9b67488cc612f9
+  metadata.gz: 085ce4508c859a091ea44ca3c5e5a68db8b184ad37abe599723f02492870f533db63a179e3049d326690a274706fd3dc1448189cea1bcd5455894723c0448b19
+  data.tar.gz: 99c1e1da688674e227234d03098573a48a273f2f46022eae0f254233030b0e8de1f16d9073f47c4c9b0576031c92cf3e5989e3e3181513801b422d890555e54a

data/README.md CHANGED

@@ -32,8 +32,8 @@ puts m.pattern %w( catttttttttt )                          # (?:cat{10})
 puts m.pattern %w( cat-t-t-t-t-t-t-t-t-t )                 # (?:ca(?:t-){9}t)
 puts m.pattern %w( catttttttttt batttttttttt )             # (?:[bc]at{10})
 puts m.pattern %w( cad bad dad )                           # (?:[b-d]ad)
-puts m.pattern %w( cat catalog )                           # (?:cat(?:alog)?+)
-puts m.pattern (1..31).to_a                                # (?:[4-9]|1\d?+|2\d?+|3[01]?+)
+puts m.pattern %w( cat catalog )                           # (?:cat(?:alog)?)
+puts m.pattern (1..31).to_a                                # (?:[4-9]|1\d?|2\d?|3[01]?)
 ```
 ## Description
@@ -48,8 +48,8 @@ are provided to minimize initializations and the number of times you specify opt
 class methods, either `pattern` which generates a string, or `rx`, which returns a `Regexp` object:
 ```ruby
-List::Matcher.pattern %( cat dog )   # "(?:cat|dog)"
-List::Matcher.rx      %( cat dog )   # /(?:cat|dog)/
+List::Matcher.pattern %w( cat dog )   # "(?:cat|dog)"
+List::Matcher.rx      %w( cat dog )   # /(?:cat|dog)/
 ```
 If you plan to generate multiple regexen, or have complicated options which you always use, you should generate a configured
@@ -318,6 +318,27 @@ List::Matcher.new symbols: { aw_nuts: '+++' }
 then you may want to vet your symbols. Vetting is not done by default because one assumes you've worked out
 your substitutions on your own time and we need not waste runtime checking them.
+### not_extended
+```ruby
+default: false
+```
+Under normal circumstances `List::Matcher` will escape simple space characters and `#` lest the pattern
+generated be included in an *extended* regular expression where these are meta-characters. If you find this
+makes the expressions unreadable or otherwise annoying, you can tell `List::Matcher` to explicitly generate
+a non-extended regular expression. This may safely be included in any sort of regular expression, but it will
+be wrapped with the modifier expression `(?-x:...)`.
+```ruby
+List::Matcher.pattern [ 'cat and dog', '# is sometimes called the pound symbol' ]
+# "(?:\\#\\ is\\ sometimes\\ called\\ the\\ pound\\ symbol|cat\\ and\\ dog)"
+List::Matcher.pattern [ 'cat and dog', '# is sometimes called the pound symbol' ], not_extended: true
+# "(?-x:cat and dog|# is sometimes called the pound symbol)"
+```
+Note that `List::Matcher` will continue to quote other white space characters.
 ## Benchmarks
 Efficiency isn't the principle purpose of List::Matcher, but in almost all cases List::Matcher
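
The new `not_extended` section in the README hunk above explains why spaces and `#` are escaped by default. A minimal sketch of that reasoning, using only Ruby's built-in `Regexp` and the two pattern strings quoted in the README; the `naive` pattern and all variable names are hypothetical and exist only for contrast:

```ruby
# Pattern strings copied verbatim from the README hunk above.
escaped      = "(?:\\#\\ is\\ sometimes\\ called\\ the\\ pound\\ symbol|cat\\ and\\ dog)"
not_extended = "(?-x:cat and dog|# is sometimes called the pound symbol)"

# Hypothetical hand-written pattern with unescaped spaces, for contrast.
naive = "(?:cat and dog)"

targets = ["cat and dog", "# is sometimes called the pound symbol"]

# Embed each pattern in an extended (/x) regex, where bare whitespace is
# ignored and an unescaped # starts a comment.
naive_rx        = Regexp.new("\\A #{naive} \\z", Regexp::EXTENDED)
escaped_rx      = Regexp.new("\\A #{escaped} \\z", Regexp::EXTENDED)
not_extended_rx = Regexp.new("\\A #{not_extended} \\z", Regexp::EXTENDED)

# The unescaped spaces are dropped, so the naive pattern no longer matches
# the phrase it was written for.
naive_matches_phrase = naive_rx.match?("cat and dog")
naive_matches_fused  = naive_rx.match?("catanddog")

# Both styles shown in the README survive being embedded.
escaped_ok      = targets.all? { |t| escaped_rx.match?(t) }
not_extended_ok = targets.all? { |t| not_extended_rx.match?(t) }

puts "naive matches 'cat and dog': #{naive_matches_phrase}"
puts "naive matches 'catanddog':   #{naive_matches_fused}"
puts "escaped pattern ok:          #{escaped_ok}"
puts "(?-x:...) pattern ok:        #{not_extended_ok}"
```

The `(?-x:...)` wrapper switches extended mode off for the group only, which is why the more readable form can still be dropped into a larger `/x` expression.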

data/lib/list_matcher.rb CHANGED

@@ -2,7 +2,8 @@ require "list_matcher/version"
 module List
   class Matcher
-    attr_reader :atomic, :backtracking, :bound, :case_insensitive, :strip, :left_bound, :right_bound, :word_test, :normalize_whitespace, :multiline, :name, :vet
+    attr_reader :atomic, :backtracking, :bound, :case_insensitive, :strip, :left_bound,
+      :right_bound, :word_test, :normalize_whitespace, :multiline, :name, :vet, :not_extended
     # convenience method for one-off regexen where there's no point in keeping
     # around a pattern generator
@@ -25,6 +26,7 @@ module List
           strip:                false,
           case_insensitive:     false,
           multiline:            false,
+          not_extended:         false,
           normalize_whitespace: false,
           symbols:              {},
           name:                 false,
@@ -35,6 +37,7 @@ module List
       @strip                = strip || normalize_whitespace
       @case_insensitive     = case_insensitive
       @multiline            = multiline
+      @not_extended         = not_extended
       @symbols              = deep_dup symbols
       @_bound               = bound
       @bound                = !!bound
@@ -65,11 +68,16 @@ module List
       elsif !( bound === false )
         raise "unfamiliar value for :bound option: #{bound.inspect}"
       end
+      symbols.keys.each do |k|
+        raise "symbols variable #{k} is neither a string, a symbol, nor a regex" unless k.is_a?(String) || k.is_a?(Symbol) || k.is_a?(Regexp)
+      end
       if normalize_whitespace
         @symbols[' '] = { pattern: '\s++' }
+      elsif not_extended
+        @symbols[' '] = { pattern: ' ' }
       end
-      symbols.keys.each do |k|
-        raise "symbols variable #{k} is neither a string, a symbol, nor a regex" unless k.is_a?(String) || k.is_a?(Symbol) || k.is_a?(Regexp)
+      if not_extended
+        @symbols['#'] = { pattern: '#' }
       end
       if vet
         Special.new( self, @symbols, [] ).verify
@@ -85,6 +93,7 @@ module List
         strip:                @strip,
         case_insensitive:     @case_insensitive,
         multiline:            @multiline,
+        not_extended:         @not_extended,
         normalize_whitespace: @normalize_whitespace,
         symbols:              @symbols,
         name:                 @name,
@@ -124,8 +133,8 @@ module List
     end
     def modifiers
-      ( @modifiers ||= if case_insensitive || multiline
-        [ ( 'i' if case_insensitive ), ( 'm' if multiline ) ].compact.join
+      ( @modifiers ||= if case_insensitive || multiline || not_extended
+        [ [ ( 'i' if case_insensitive ), ( 'm' if multiline ), ( '-x' if not_extended ) ].compact.join ]
       else
         [nil]
       end )[0]
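
The `modifiers` hunk above memoizes its result inside a one-element array so that a `nil` result is cached too, and the new version wraps the joined flag string in that array as well. A self-contained sketch of the idiom follows; `ModifierDemo` and its `computations` counter are hypothetical stand-ins, not part of the gem. Reading the removed lines, the 1.0.1 branch appears to have applied `[0]` to a bare string, which in Ruby yields only its first character.

```ruby
# Hypothetical stand-in for the memoization idiom in the hunk above.
class ModifierDemo
  attr_reader :computations

  def initialize(case_insensitive: false, multiline: false, not_extended: false)
    @case_insensitive = case_insensitive
    @multiline        = multiline
    @not_extended     = not_extended
    @computations     = 0 # counts how often the flags are actually computed
  end

  # Wrapping the value in an array makes the cached object truthy even when
  # the value itself is nil, so ||= does not recompute it on every call.
  def modifiers
    ( @modifiers ||= begin
      @computations += 1
      flags = [ ('i' if @case_insensitive), ('m' if @multiline), ('-x' if @not_extended) ].compact.join
      [ flags.empty? ? nil : flags ]
    end )[0]
  end
end

all_flags = ModifierDemo.new(case_insensitive: true, multiline: true, not_extended: true)
no_flags  = ModifierDemo.new

mods = all_flags.modifiers
2.times { no_flags.modifiers }

# Indexing a bare string with [0] keeps only the first character.
first_char_only = "im"[0]

# The combined flags form a valid inline option group.
rx = Regexp.new("(?#{mods}:cat)")

puts mods.inspect
puts no_flags.modifiers.inspect
puts no_flags.computations
puts first_char_only.inspect
puts rx.match?("CAT")
```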

data/lib/list_matcher/version.rb CHANGED

@@ -1,3 +1,3 @@
 module ListMatcher
-  VERSION = "1.0.1"
+  VERSION = "1.0.2"
 end

data/test/basic_test.rb CHANGED

@@ -245,4 +245,13 @@ class BasicTest < Minitest::Test
       List::Matcher.pattern %w(cat), symbols: { foo: '+' }, vet: true
     end
   end
+  def test_not_extended
+    m = List::Matcher.new not_extended: true
+    rx = m.pattern [ ' ', '#' ]
+    assert_equal '(?-x:#| )', rx
+    rx = Regexp.new rx
+    assert rx === ' '
+    assert rx === '#'
+  end
 end

data/test/doc_test.rb ADDED

@@ -0,0 +1,104 @@
+require "minitest/autorun"
+require "list_matcher"
+# test to make sure all the examples in the documentation work
+class DocTest < Minitest::Test
+  def test_all
+    m = List::Matcher.new
+    assert_equal '(?:cat|dog)', (m.pattern %w( cat dog ))
+    assert_equal '(?:[cr]at)', (m.pattern %w( cat rat ))
+    assert_equal '(?:ca(?:mel|t))', (m.pattern %w( cat camel ))
+    assert_equal '(?:(?:c|fl|spr)at)', (m.pattern %w( cat flat sprat ))
+    assert_equal '(?:cat{10})', (m.pattern %w( catttttttttt ))
+    assert_equal '(?:ca(?:t-){9}t)', (m.pattern %w( cat-t-t-t-t-t-t-t-t-t ))
+    assert_equal '(?:[bc]at{10})', (m.pattern %w( catttttttttt batttttttttt ))
+    assert_equal '(?:[b-d]ad)', (m.pattern %w( cad bad dad ))
+    assert_equal '(?:cat(?:alog)?)', (m.pattern %w( cat catalog ))
+    assert_equal '(?:[4-9]|1\d?|2\d?|3[01]?)', (m.pattern (1..31).to_a)
+    assert_equal "(?:cat|dog)", (List::Matcher.pattern %w( cat dog ))
+    assert_equal /(?:cat|dog)/, (List::Matcher.rx      %w( cat dog ))
+    m = List::Matcher.new normalize_whitespace: true, bound: true, case_insensitive: true, multiline: true, atomic: false, symbols: { num: '\d++' }
+    m2 = m.bud case_insensitive: false
+    assert !m2.case_insensitive
+    assert_equal "cat|dog", (List::Matcher.pattern %w(cat dog), atomic: false)
+    assert_equal "(?:cat|dog)", (List::Matcher.pattern %w(cat dog), atomic: true)
+    assert_equal "(?:cat|dog)", (List::Matcher.pattern %w( cat dog ))
+    assert_equal "(?>cat|dog)", (List::Matcher.pattern %w( cat dog ), backtracking: false)
+    assert_equal "(?:\\bcat\\b)", (List::Matcher.pattern %w(cat), bound: :word)
+    assert_equal "(?:\\bcat\\b)", (List::Matcher.pattern %w(cat), bound: true)
+    assert_equal "(?:^cat$)", (List::Matcher.pattern %w(cat), bound: :line)
+    assert_equal "(?:\\Acat\\z)", (List::Matcher.pattern %w(cat), bound: :string)
+    assert_equal "(?:(?<!\\d)[1-9](?:\\d\\d?)?(?!\\d))", (List::Matcher.pattern (1...1000).to_a, bound: { test: /\d/, left: '(?<!\d)', right: '(?!\d)'})
+    assert_equal "(?:(?:\\ ){5}cat(?:\\ ){5})", (List::Matcher.pattern ['     cat     '])
+    assert_equal "(?:cat)", (List::Matcher.pattern ['     cat     '], strip: true)
+    assert_equal "(?:C(?:AT|at)|cat)", (List::Matcher.pattern %w( Cat cat CAT ))
+    assert_equal "(?i:cat)", (List::Matcher.pattern %w( Cat cat CAT ), case_insensitive: true)
+    assert_equal "(?m:cat)", (List::Matcher.pattern %w(cat), multiline: true)
+    assert_equal "(?:\\ (?:\\ dog\\ walker|cat\\ \\ walker\\ )|camel\\ \\ walker)", (List::Matcher.pattern [ ' cat  walker ', '  dog walker', 'camel  walker' ])
+    assert_equal "(?:(?:ca(?:mel|t)|dog)\\s++walker)", (List::Matcher.pattern [ ' cat  walker ', '  dog walker', 'camel  walker' ], normalize_whitespace: true)
+    assert_equal "(?:(?:(?:Catch|Fahrenheit)\\ )?\\d++)", (List::Matcher.pattern [ 'Catch 22', '1984', 'Fahrenheit 451' ], symbols: { /\d+/ => '\d++' })
+    assert_equal "(?:(?:(?:Catch|Fahrenheit)\\ )?\\d++)", (List::Matcher.pattern [ 'Catch foo', 'foo', 'Fahrenheit foo' ], symbols: { 'foo' => '\d++' })
+    assert_equal "(?:(?:(?:Catch|Fahrenheit)\\ )?\\d++)", (List::Matcher.pattern [ 'Catch foo', 'foo', 'Fahrenheit foo' ], symbols: { foo: '\d++' })
+    assert_equal "(?<cat>cat)", (List::Matcher.pattern %w(cat), name: :cat)
+    m = List::Matcher.new atomic: false, bound: true
+    year      = m.pattern( (1901..2000).to_a, name: :year )
+    mday      = m.pattern( (1..31).to_a, name: :mday )
+    weekdays  = %w( Monday Tuesday Wednesday Thursday Friday Saturday Sunday )
+    weekdays += weekdays.map{ |w| w[0...3] }
+    wday      = m.pattern weekdays, case_insensitive: true, name: :wday
+    months    = %w( January February March April May June July August September October November December )
+    months   += months.map{ |w| w[0...3] }
+    mo        = m.pattern months, case_insensitive: true, name: :mo
+    date_20th_century = m.rx(
+      [
+        'wday, mo mday',
+        'wday, mo mday year',
+        'mo mday, year',
+        'mo year',
+        'mday mo year',
+        'wday',
+        'year',
+        'mday mo',
+        'mo mday',
+        'mo mday year'
+      ],
+      normalize_whitespace: true,
+      atomic: true,
+      symbols: {
+        year: year,
+        mday: mday,
+        wday: wday,
+        mo:   mo
+      }
+    )
+    assert m = date_20th_century.match('Friday')
+    assert_equal 'Friday', m[:wday]
+    assert_nil m[:year]
+    assert_nil m[:mo]
+    assert_nil m[:mday]
+    assert m = date_20th_century.match('August 27')
+    assert_equal 'August', m[:mo]
+    assert_equal '27', m[:mday]
+    assert_nil m[:year]
+    assert_nil m[:wday]
+    assert m = date_20th_century.match('May 6, 1969')
+    assert_equal 'May', m[:mo]
+    assert_equal '6', m[:mday]
+    assert_equal '1969', m[:year]
+    assert_nil m[:wday]
+    assert m = date_20th_century.match('1 Jan 2000')
+    assert_equal '1', m[:mday]
+    assert_equal 'Jan', m[:mo]
+    assert_equal '2000', m[:year]
+    assert_nil m[:wday]
+    assert_nil date_20th_century.match('this is not actually a date')
+    assert_equal "(?:\\#\\ is\\ sometimes\\ called\\ the\\ pound\\ symbol|cat\\ and\\ dog)", (List::Matcher.pattern [ 'cat and dog', '# is sometimes called the pound symbol' ])
+    assert_equal "(?-x:cat and dog|# is sometimes called the pound symbol)", (List::Matcher.pattern [ 'cat and dog', '# is sometimes called the pound symbol' ], not_extended: true)
+  end
+end
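
The doc test above compares generated pattern strings with the README comments. A complementary check, sketched here with plain `Regexp` and without the gem, is that a documented pattern accepts exactly the intended inputs. The anchoring with `\A` and `\z` and the 0..99 probe range are assumptions of this sketch, not something the test file does:

```ruby
# Pattern string as asserted in the doc test for (1..31).to_a.
pattern = '(?:[4-9]|1\d?|2\d?|3[01]?)'

# Possessive form shown in the removed README lines, for comparison.
old_pattern = '(?:[4-9]|1\d?+|2\d?+|3[01]?+)'

# Anchor both so that only whole-string matches count.
anchored     = Regexp.new("\\A#{pattern}\\z")
old_anchored = Regexp.new("\\A#{old_pattern}\\z")

# Probe every integer from 0 to 99 and keep those the pattern accepts.
matched     = (0..99).select { |n| anchored.match?(n.to_s) }
old_matched = (0..99).select { |n| old_anchored.match?(n.to_s) }

puts matched == (1..31).to_a
puts old_matched == matched
```

In this anchored check both forms accept the same set of strings; they differ only in whether the optional part may backtrack.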

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: list_matcher
 version: !ruby/object:Gem::Version
-  version: 1.0.1
+  version: 1.0.2
 platform: ruby
 authors:
 - dfhoughton
@@ -56,6 +56,7 @@ files:
 - list_matcher.gemspec
 - test/basic_test.rb
 - test/benchmarks.rb
+- test/doc_test.rb
 - test/stress.rb
 homepage: https://github.com/dfhoughton/list_matcher
 licenses:
@@ -84,4 +85,5 @@ summary: List::Matcher automates the generation of efficient regular expressions
 test_files:
 - test/basic_test.rb
 - test/benchmarks.rb
+- test/doc_test.rb
 - test/stress.rb