re 0.0.4 → 0.0.5
- data/README.rdoc +71 -7
- data/Rakefile +31 -3
- data/lib/re.rb +129 -38
- data/test/re_test.rb +58 -2
- metadata +2 -2
    
        data/README.rdoc
    CHANGED
    
    | @@ -1,5 +1,5 @@ | |
| 1 1 |  | 
| 2 | 
            -
            = Regular Expression Construction | 
| 2 | 
            +
            = Regular Expression Construction
         | 
| 3 3 |  | 
| 4 4 | 
             
            Complex regular expressions are hard to construct and even harder to
         | 
| 5 5 | 
             
            read.  The Re library allows users to construct complex regular
         | 
| @@ -8,7 +8,7 @@ following regular expression that will parse dates: | |
| 8 8 |  | 
| 9 9 | 
             
               /\A((?:19|20)[0-9]{2})[\- \/.](0[1-9]|1[012])[\- \/.](0[1-9]|[12][0-9]|3[01])\z/
         | 
| 10 10 |  | 
| 11 | 
            -
            Using the Re library,  | 
| 11 | 
            +
            Using the Re library, that regular expression can be built
         | 
| 12 12 | 
             
            incrementally from smaller, easier to understand expressions.
         | 
| 13 13 | 
             
            Perhaps something like this:
         | 
| 14 14 |  | 
| @@ -38,9 +38,11 @@ groups can be retrieved by name: | |
| 38 38 | 
             
              result.data(:month)  # => "01"
         | 
| 39 39 | 
             
              result.data(:day)    # => "23"
         | 
| 40 40 |  | 
| 41 | 
            -
            == Version | 
| 41 | 
            +
            == Version
         | 
| 42 42 |  | 
| 43 | 
            -
             | 
| 43 | 
            +
            This document describes Re version 0.0.5.
         | 
| 44 | 
            +
             | 
| 45 | 
            +
            == Usage
         | 
| 44 46 |  | 
| 45 47 | 
             
              include Re
         | 
| 46 48 |  | 
| @@ -51,7 +53,7 @@ groups can be retrieved by name: | |
| 51 53 | 
             
                puts "No Match"
         | 
| 52 54 | 
             
              end
         | 
| 53 55 |  | 
| 54 | 
            -
            == Examples | 
| 56 | 
            +
            == Examples
         | 
| 55 57 |  | 
| 56 58 | 
             
              re("a")                -- matches "a"
         | 
| 57 59 | 
             
              re("a") + re("b")      -- matches "ab"
         | 
| @@ -83,9 +85,70 @@ and character class functions. | |
| 83 85 |  | 
| 84 86 | 
             
            See Re.re, Re::Rexp, and Re::ConstructionMethods for details.
         | 
| 85 87 |  | 
| 86 | 
            -
            ==  | 
| 88 | 
            +
            == Performance
         | 
| 89 | 
            +
             | 
| 90 | 
            +
            We should say a word or two about performance.
         | 
| 91 | 
            +
             | 
| 92 | 
            +
            First of all, building regular expressions using Re is slow.  If you
         | 
| 93 | 
            +
            use Re to build regular expressions, you are encouraged to build the
         | 
| 94 | 
            +
            regular expression once and reuse it as needed.  This means you
         | 
| 95 | 
            +
            won't do a lot of inline expressions using Re, but rather assign the
         | 
| 96 | 
            +
            generated Re regular expression to a constant.  For example:
         | 
| 97 | 
            +
             | 
| 98 | 
            +
              PHONE_RE = re.digit.repeat(3).capture(:area) +
         | 
| 99 | 
            +
                           re("-") + 
         | 
| 100 | 
            +
                           re.digit.repeat(3).capture(:exchange) +
         | 
| 101 | 
            +
                           re("-") +
         | 
| 102 | 
            +
                           re.digit.repeat(4).capture(:subscriber)
         | 
| 103 | 
            +
             | 
| 104 | 
            +
            Alternatively, you can arrange for the regular expression to be
         | 
| 105 | 
            +
            constructed only when actually needed.  Something like:
         | 
| 106 | 
            +
             | 
| 107 | 
            +
              def phone_re
         | 
| 108 | 
            +
                @phone_re ||= re.digit.repeat(3).capture(:area) +
         | 
| 109 | 
            +
                                re("-") + 
         | 
| 110 | 
            +
                                re.digit.repeat(3).capture(:exchange) +
         | 
| 111 | 
            +
                                re("-") +
         | 
| 112 | 
            +
                                re.digit.repeat(4).capture(:subscriber)
         | 
| 113 | 
            +
              end
         | 
| 114 | 
            +
             | 
| 115 | 
            +
            That method constructs the phone number regular expression once and
         | 
| 116 | 
            +
            returns a cached value thereafter.  Just make sure you put the
         | 
| 117 | 
            +
            method in an object that is instantiated once (e.g. a class method).
         | 
| 118 | 
            +
             | 
| 119 | 
            +
            When used in matching, Re regular expressions perform fairly well
         | 
| 120 | 
            +
            compared to native regular expressions.  The overhead is a small
         | 
| 121 | 
            +
            number of extra method calls and the creation of a Re::Result object
         | 
| 122 | 
            +
            to return the match results.
         | 
| 123 | 
            +
             | 
| 124 | 
            +
            If regular expression performance is at a premium in your application,
         | 
| 125 | 
            +
            then you can still use Re to construct the regular expression and
         | 
| 126 | 
            +
            extract the raw Ruby Regexp object to be used for the actual
         | 
| 127 | 
            +
            matching.  You lose the ability to use named capture groups easily,
         | 
| 128 | 
            +
            but you get raw Ruby regular expression matching performance.
         | 
| 129 | 
            +
             | 
| 130 | 
            +
            For example, if you wanted to use the raw regular expression from
         | 
| 131 | 
            +
            PHONE_RE defined above, you could extract the regular expression
         | 
| 132 | 
            +
            like this:
         | 
| 133 | 
            +
             | 
| 134 | 
            +
              PHONE_REGEXP = PHONE_RE.regexp
         | 
| 135 | 
            +
             | 
| 136 | 
            +
            And then use it directly:
         | 
| 137 | 
            +
             | 
| 138 | 
            +
              if PHONE_REGEXP =~ string
         | 
| 139 | 
            +
                # blah blah blah
         | 
| 140 | 
            +
              end
         | 
| 141 | 
            +
             | 
| 142 | 
            +
            The above match runs at full Ruby matching speed.  If you still
         | 
| 143 | 
            +
            wanted named capture groups, you could do something like this:
         | 
| 144 | 
            +
             | 
| 145 | 
            +
              match_data = PHONE_REGEXP.match(string)
         | 
| 146 | 
            +
              area_code = match_data[PHONE_RE.name_map[:area]]
         | 
| 147 | 
            +
             | 
| 148 | 
            +
            == License and Copyright
         | 
| 87 149 |  | 
| 88 | 
            -
            Copyright 2009 by Jim Weirich (jim.weirich@gmail.com)
         | 
| 150 | 
            +
            Copyright 2009 by Jim Weirich (jim.weirich@gmail.com).
         | 
| 151 | 
            +
            All rights reserved.
         | 
| 89 152 |  | 
| 90 153 | 
             
            Re is provided under the MIT open source license (see MIT-LICENSE)
         | 
| 91 154 |  | 
| @@ -94,6 +157,7 @@ Re is provided under the MIT open source license (see MIT-LICENSE) | |
| 94 157 | 
             
            Documentation :: http://re-lib.rubyforge.org
         | 
| 95 158 | 
             
            Source        :: http://github.com/jimweirich/re
         | 
| 96 159 | 
             
            GemCutter     :: http://gemcutter.org/gems/re
         | 
| 160 | 
            +
            Download      :: http://rubyforge.org/frs/?group_id=9329
         | 
| 97 161 | 
             
            Bug Tracker   :: http://www.pivotaltracker.com/projects/47758
         | 
| 98 162 | 
             
            Author        :: jim.weirich@gmail.com
         | 
| 99 163 |  | 
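The name-map lookup described in the new Performance section can be tried without loading the Re gem. The sketch below is only an illustration: it uses a hand-written Regexp and Hash as stand-ins for `PHONE_RE.regexp` and `PHONE_RE.name_map`, and the names `PhoneSketch` and `field` are hypothetical, not part of Re.

```ruby
# Stand-ins for PHONE_RE.regexp and PHONE_RE.name_map from the README.
# Both values are written by hand here; in real use they would come from Re.
module PhoneSketch
  # Built once at load time and reused, as the README recommends.
  REGEXP   = /(\d{3})-(\d{3})-(\d{4})/
  NAME_MAP = { area: 1, exchange: 2, subscriber: 3 }.freeze

  # Returns the text captured for +name+, or nil if the string does not
  # match or the name is unknown.
  def self.field(string, name)
    match_data = REGEXP.match(string)
    return nil unless match_data

    index = NAME_MAP[name]
    index ? match_data[index] : nil
  end
end

puts PhoneSketch.field("555-867-5309", :area)
puts PhoneSketch.field("555-867-5309", :subscriber)
```

Matching goes through the plain Regexp, so the only extra cost over native matching is one Hash lookup per named field.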
    
        data/Rakefile
    CHANGED
    
    | @@ -14,8 +14,36 @@ Rake::TestTask.new(:test) do |t| | |
| 14 14 | 
             
              t.test_files = FileList['test/*_test.rb']
         | 
| 15 15 | 
             
            end
         | 
| 16 16 |  | 
| 17 | 
            -
             | 
| 17 | 
            +
            namespace "release" do
         | 
| 18 | 
            +
              task :new => [
         | 
| 19 | 
            +
                :readme,
         | 
| 20 | 
            +
                :check_non_beta,
         | 
| 21 | 
            +
                :check_all_committed,
         | 
| 22 | 
            +
                :tag_version,
         | 
| 23 | 
            +
                :gem,
         | 
| 24 | 
            +
                "publish:rdoc"
         | 
| 25 | 
            +
              ]
         | 
| 26 | 
            +
              
         | 
| 27 | 
            +
              task :check_all_committed do
         | 
| 28 | 
            +
                status = `git status`
         | 
| 29 | 
            +
                unless status =~ /nothing to commit/
         | 
| 30 | 
            +
                  fail "Outstanding Git Changes:\n#{status}"
         | 
| 31 | 
            +
                end
         | 
| 32 | 
            +
              end
         | 
| 33 | 
            +
              
         | 
| 34 | 
            +
              task :commit_new_version do
         | 
| 35 | 
            +
                sh "git commit -m 'bumped to version #{Re::VERSION}'"
         | 
| 36 | 
            +
              end
         | 
| 37 | 
            +
              
         | 
| 38 | 
            +
              task :not_already_tagged
         | 
| 18 39 |  | 
| 19 | 
            -
            task : | 
| 20 | 
            -
             | 
| 40 | 
            +
              task :tag_version => :not_already_tagged do
         | 
| 41 | 
            +
                sh "git tag re-#{Re::VERSION}"
         | 
| 42 | 
            +
                sh "git push --tags"
         | 
| 43 | 
            +
              end
         | 
| 44 | 
            +
              
         | 
| 45 | 
            +
              task :check_non_beta do
         | 
| 46 | 
            +
                fail "Must not be a beta version! Version is #{Re::VERSION}" if Re::Version::BETA
         | 
| 47 | 
            +
              end
         | 
| 21 48 | 
             
            end
         | 
| 49 | 
            +
            task :release => "release:new"
         | 
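The `check_all_committed` task above mixes the shell call with the decision logic. As a rough sketch, the decision can be pulled into a pure function that takes the `git status` text as an argument, which makes it testable without a repository. The helper name `outstanding_changes_message` is hypothetical and does not appear in the Rakefile.

```ruby
# Hypothetical helper, not part of the Rakefile: returns nil when the
# status text reports a clean tree, otherwise the failure message the
# task would raise.
def outstanding_changes_message(status)
  return nil if status =~ /nothing to commit/

  "Outstanding Git Changes:\n#{status}"
end

clean = "On branch master\nnothing to commit, working tree clean\n"
dirty = "On branch master\n\tmodified:   lib/re.rb\n"

p outstanding_changes_message(clean)
puts outstanding_changes_message(dirty)
```

Inside the task, the result could then be passed to `fail` whenever it is not nil.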
    
        data/lib/re.rb
    CHANGED
    
    | @@ -1,6 +1,6 @@ | |
| 1 1 | 
             
            #!/usr/bin/ruby -wKU
         | 
| 2 2 | 
             
            #
         | 
| 3 | 
            -
            # = Regular Expression Construction | 
| 3 | 
            +
            # = Regular Expression Construction
         | 
| 4 4 | 
             
            #
         | 
| 5 5 | 
             
            # Complex regular expressions are hard to construct and even harder to
         | 
| 6 6 | 
             
            # read.  The Re library allows users to construct complex regular
         | 
| @@ -9,7 +9,7 @@ | |
| 9 9 | 
             
            #
         | 
| 10 10 | 
             
            #    /\A((?:19|20)[0-9]{2})[\- \/.](0[1-9]|1[012])[\- \/.](0[1-9]|[12][0-9]|3[01])\z/
         | 
| 11 11 | 
             
            #
         | 
| 12 | 
            -
            # Using the Re library,  | 
| 12 | 
            +
            # Using the Re library, that regular expression can be built
         | 
| 13 13 | 
             
            # incrementally from smaller, easier to understand expressions.
         | 
| 14 14 | 
             
            # Perhaps something like this:
         | 
| 15 15 | 
             
            #
         | 
| @@ -39,7 +39,7 @@ | |
| 39 39 | 
             
            #   result.data(:month)  # => "01"
         | 
| 40 40 | 
             
            #   result.data(:day)    # => "23"
         | 
| 41 41 | 
             
            #
         | 
| 42 | 
            -
            # == Usage | 
| 42 | 
            +
            # == Usage
         | 
| 43 43 | 
             
            #
         | 
| 44 44 | 
             
            #   include Re
         | 
| 45 45 | 
             
            #
         | 
| @@ -50,7 +50,7 @@ | |
| 50 50 | 
             
            #     puts "No Match"
         | 
| 51 51 | 
             
            #   end
         | 
| 52 52 | 
             
            #
         | 
| 53 | 
            -
            # == Examples | 
| 53 | 
            +
            # == Examples
         | 
| 54 54 | 
             
            #
         | 
| 55 55 | 
             
            #   re("a")                -- matches "a"
         | 
| 56 56 | 
             
            #   re("a") + re("b")      -- matches "ab"
         | 
| @@ -82,9 +82,70 @@ | |
| 82 82 | 
             
            #
         | 
| 83 83 | 
             
            # See Re.re, Re::Rexp, and Re::ConstructionMethods for details.
         | 
| 84 84 | 
             
            #
         | 
| 85 | 
            -
            # ==  | 
| 85 | 
            +
            # == Performance
         | 
| 86 86 | 
             
            #
         | 
| 87 | 
            -
            #  | 
| 87 | 
            +
            # We should say a word or two about performance.
         | 
| 88 | 
            +
            #
         | 
| 89 | 
            +
            # First of all, building regular expressions using Re is slow.  If you
         | 
| 90 | 
            +
            # use Re to build regular expressions, you are encouraged to build the
         | 
| 91 | 
            +
            # regular expression once and reuse it as needed.  This means you
         | 
| 92 | 
            +
            # won't do a lot of inline expressions using Re, but rather assign the
         | 
| 93 | 
            +
            # generated Re regular expression to a constant.  For example:
         | 
| 94 | 
            +
            #
         | 
| 95 | 
            +
            #   PHONE_RE = re.digit.repeat(3).capture(:area) +
         | 
| 96 | 
            +
            #                re("-") + 
         | 
| 97 | 
            +
            #                re.digit.repeat(3).capture(:exchange) +
         | 
| 98 | 
            +
            #                re("-") +
         | 
| 99 | 
            +
            #                re.digit.repeat(4).capture(:subscriber)
         | 
| 100 | 
            +
            #
         | 
| 101 | 
            +
            # Alternatively, you can arrange for the regular expression to be
         | 
| 102 | 
            +
            # constructed only when actually needed.  Something like:
         | 
| 103 | 
            +
            #
         | 
| 104 | 
            +
            #   def phone_re
         | 
| 105 | 
            +
            #     @phone_re ||= re.digit.repeat(3).capture(:area) +
         | 
| 106 | 
            +
            #                     re("-") + 
         | 
| 107 | 
            +
            #                     re.digit.repeat(3).capture(:exchange) +
         | 
| 108 | 
            +
            #                     re("-") +
         | 
| 109 | 
            +
            #                     re.digit.repeat(4).capture(:subscriber)
         | 
| 110 | 
            +
            #   end
         | 
| 111 | 
            +
            #
         | 
| 112 | 
            +
            # That method constructs the phone number regular expression once and
         | 
| 113 | 
            +
            # returns a cached value thereafter.  Just make sure you put the
         | 
| 114 | 
            +
            # method in an object that is instantiated once (e.g. a class method).
         | 
| 115 | 
            +
            #
         | 
| 116 | 
            +
            # When used in matching, Re regular expressions perform fairly well
         | 
| 117 | 
            +
            # compared to native regular expressions.  The overhead is a small
         | 
| 118 | 
            +
            # number of extra method calls and the creation of a Re::Result object
         | 
| 119 | 
            +
            # to return the match results.
         | 
| 120 | 
            +
            #
         | 
| 121 | 
            +
            # If regular expression performance is at a premium in your application,
         | 
| 122 | 
            +
            # then you can still use Re to construct the regular expression and
         | 
| 123 | 
            +
            # extract the raw Ruby Regexp object to be used for the actual
         | 
| 124 | 
            +
            # matching.  You lose the ability to use named capture groups easily,
         | 
| 125 | 
            +
            # but you get raw Ruby regular expression matching performance.
         | 
| 126 | 
            +
            #
         | 
| 127 | 
            +
            # For example, if you wanted to use the raw regular expression from
         | 
| 128 | 
            +
            # PHONE_RE defined above, you could extract the regular expression
         | 
| 129 | 
            +
            # like this:
         | 
| 130 | 
            +
            #
         | 
| 131 | 
            +
            #   PHONE_REGEXP = PHONE_RE.regexp
         | 
| 132 | 
            +
            #
         | 
| 133 | 
            +
            # And then use it directly:
         | 
| 134 | 
            +
            #
         | 
| 135 | 
            +
            #   if PHONE_REGEXP =~ string
         | 
| 136 | 
            +
            #     # blah blah blah
         | 
| 137 | 
            +
            #   end
         | 
| 138 | 
            +
            #
         | 
| 139 | 
            +
            # The above match runs at full Ruby matching speed.  If you still
         | 
| 140 | 
            +
            # wanted named capture groups, you could do something like this:
         | 
| 141 | 
            +
            #
         | 
| 142 | 
            +
            #   match_data = PHONE_REGEXP.match(string)
         | 
| 143 | 
            +
            #   area_code = match_data[PHONE_RE.name_map[:area]]
         | 
| 144 | 
            +
            #
         | 
| 145 | 
            +
            # == License and Copyright
         | 
| 146 | 
            +
            #
         | 
| 147 | 
            +
            # Copyright 2009 by Jim Weirich (jim.weirich@gmail.com).
         | 
| 148 | 
            +
            # All rights reserved.
         | 
| 88 149 | 
             
            #
         | 
| 89 150 | 
             
            # Re is provided under the MIT open source license (see MIT-LICENSE)
         | 
| 90 151 | 
             
            #
         | 
| @@ -93,6 +154,7 @@ | |
| 93 154 | 
             
            # Documentation :: http://re-lib.rubyforge.org
         | 
| 94 155 | 
             
            # Source        :: http://github.com/jimweirich/re
         | 
| 95 156 | 
             
            # GemCutter     :: http://gemcutter.org/gems/re
         | 
| 157 | 
            +
            # Download      :: http://rubyforge.org/frs/?group_id=9329
         | 
| 96 158 | 
             
            # Bug Tracker   :: http://www.pivotaltracker.com/projects/47758
         | 
| 97 159 | 
             
            # Author        :: jim.weirich@gmail.com
         | 
| 98 160 | 
             
            #
         | 
| @@ -102,7 +164,7 @@ module Re | |
| 102 164 | 
             
                NUMBERS = [
         | 
| 103 165 | 
             
                  MAJOR = 0,
         | 
| 104 166 | 
             
                  MINOR = 0,
         | 
| 105 | 
            -
                  BUILD =  | 
| 167 | 
            +
                  BUILD = 5,
         | 
| 106 168 | 
             
                  BETA  = nil,
         | 
| 107 169 | 
             
                ].compact
         | 
| 108 170 | 
             
              end
         | 
| @@ -125,8 +187,15 @@ module Re | |
| 125 187 |  | 
| 126 188 | 
             
                # Return the text of the named capture data.
         | 
| 127 189 | 
             
                def [](name)
         | 
| 128 | 
            -
                  index =  | 
| 129 | 
            -
                  index ? @match_data[index | 
| 190 | 
            +
                  index = name_map[name]
         | 
| 191 | 
            +
                  index ? @match_data[index] : nil
         | 
| 192 | 
            +
                end
         | 
| 193 | 
            +
             | 
| 194 | 
            +
                private
         | 
| 195 | 
            +
             | 
| 196 | 
            +
                # Lazy eval map of names to capture indices.
         | 
| 197 | 
            +
                def name_map
         | 
| 198 | 
            +
                  @name_map ||= @rexp.name_map
         | 
| 130 199 | 
             
                end
         | 
| 131 200 | 
             
              end
         | 
| 132 201 |  | 
| @@ -154,29 +223,47 @@ module Re | |
| 154 223 | 
             
                  @level = level
         | 
| 155 224 | 
             
                  @capture_keys = keys
         | 
| 156 225 | 
             
                  @options = options
         | 
| 226 | 
            +
                  @greedy = true
         | 
| 157 227 | 
             
                end
         | 
| 158 228 |  | 
| 229 | 
            +
                # Does it match a string?  Returns Re::Result if match, nil
         | 
| 230 | 
            +
                # otherwise.
         | 
| 231 | 
            +
                def match(string)
         | 
| 232 | 
            +
                  md = regexp.match(string)
         | 
| 233 | 
            +
                  md ? Result.new(md, self) : nil
         | 
| 234 | 
            +
                end
         | 
| 235 | 
            +
                alias =~ match
         | 
| 236 | 
            +
             | 
| 159 237 | 
             
                # Return a Regexp from the constructed regular expression.
         | 
| 160 238 | 
             
                def regexp
         | 
| 161 239 | 
             
                  @regexp ||= Regexp.new(encoding)
         | 
| 162 240 | 
             
                end
         | 
| 163 241 |  | 
| 164 | 
            -
                #  | 
| 165 | 
            -
                 | 
| 166 | 
            -
             | 
| 167 | 
            -
                   | 
| 242 | 
            +
                # Is the current regular expression marked to be treated as greedy
         | 
| 243 | 
            +
                # when repeat operators are applied?
         | 
| 244 | 
            +
                def greedy?
         | 
| 245 | 
            +
                  @greedy
         | 
| 168 246 | 
             
                end
         | 
| 169 | 
            -
                alias =~ match
         | 
| 170 247 |  | 
| 171 | 
            -
                #  | 
| 172 | 
            -
                #  | 
| 248 | 
            +
                # Map of names to capture indices.  Use this to lookup names in
         | 
| 249 | 
            +
                # the match data returned from a regular expression match.
         | 
| 250 | 
            +
                def name_map
         | 
| 251 | 
            +
                  result = {}
         | 
| 252 | 
            +
                  capture_keys.each_with_index do |key, i|
         | 
| 253 | 
            +
                    result[key] = i + 1
         | 
| 254 | 
            +
                  end
         | 
| 255 | 
            +
                  result
         | 
| 256 | 
            +
                end
         | 
| 257 | 
            +
             | 
| 258 | 
            +
                # New regular expression that matches the concatenation of self
         | 
| 259 | 
            +
                # and other.
         | 
| 173 260 | 
             
                def +(other)
         | 
| 174 261 | 
             
                  Rexp.new(parenthesized_encoding(CONCAT) + other.parenthesized_encoding(CONCAT),
         | 
| 175 262 | 
             
                    CONCAT,
         | 
| 176 263 | 
             
                    capture_keys + other.capture_keys)
         | 
| 177 264 | 
             
                end
         | 
| 178 265 |  | 
| 179 | 
            -
                # New regular  | 
| 266 | 
            +
                # New regular expression that matches either self or other.
         | 
| 180 267 | 
             
                def |(other)
         | 
| 181 268 | 
             
                  Rexp.new(parenthesized_encoding(ALT) + "|" + other.parenthesized_encoding(ALT),
         | 
| 182 269 | 
             
                    ALT,
         | 
| @@ -187,28 +274,26 @@ module Re | |
| 187 274 | 
             
                def optional
         | 
| 188 275 | 
             
                  Rexp.new(parenthesized_encoding(POSTFIX) + "?", POSTFIX, capture_keys)
         | 
| 189 276 | 
             
                end
         | 
| 277 | 
            +
                
         | 
| 278 | 
            +
                # Mark the current regular expression with the non-greedy flag.
         | 
| 279 | 
            +
                # Repeats applied to this regular expression will be treated as
         | 
| 280 | 
            +
                # non-greedy repeats.  Note that +non_greedy+ has no effect unless
         | 
| 281 | 
            +
                # immediately followed by +many+, +one_or_more+, +repeat+,
         | 
| 282 | 
            +
                # +at_least+ or +at_most+.
         | 
| 283 | 
            +
                def non_greedy
         | 
| 284 | 
            +
                  @greedy = false
         | 
| 285 | 
            +
                  self
         | 
| 286 | 
            +
                end
         | 
| 190 287 |  | 
| 191 288 | 
             
                # New regular expression that matches self many (zero or more)
         | 
| 192 289 | 
             
                # times.
         | 
| 193 290 | 
             
                def many
         | 
| 194 | 
            -
                  Rexp.new(parenthesized_encoding(POSTFIX) + "*", POSTFIX, capture_keys)
         | 
| 195 | 
            -
                end
         | 
| 196 | 
            -
             | 
| 197 | 
            -
                # New regular expression that matches self many (zero or more)
         | 
| 198 | 
            -
                # times (non-greedy version).
         | 
| 199 | 
            -
                def many!
         | 
| 200 | 
            -
                  Rexp.new(parenthesized_encoding(POSTFIX) + "*?", POSTFIX, capture_keys)
         | 
| 291 | 
            +
                  Rexp.new(parenthesized_encoding(POSTFIX) + apply_greedy("*"), POSTFIX, capture_keys)
         | 
| 201 292 | 
             
                end
         | 
| 202 | 
            -
             | 
| 293 | 
            +
                
         | 
| 203 294 | 
             
                # New regular expression that matches self one or more times.
         | 
| 204 295 | 
             
                def one_or_more
         | 
| 205 | 
            -
                  Rexp.new(parenthesized_encoding(POSTFIX) + "+", POSTFIX, capture_keys)
         | 
| 206 | 
            -
                end
         | 
| 207 | 
            -
             | 
| 208 | 
            -
                # New regular expression that matches self one or more times
         | 
| 209 | 
            -
                # (non-greedy version).
         | 
| 210 | 
            -
                def one_or_more!
         | 
| 211 | 
            -
                  Rexp.new(parenthesized_encoding(POSTFIX) + "+?", POSTFIX, capture_keys)
         | 
| 296 | 
            +
                  Rexp.new(parenthesized_encoding(POSTFIX) + apply_greedy("+"), POSTFIX, capture_keys)
         | 
| 212 297 | 
             
                end
         | 
| 213 298 |  | 
| 214 299 | 
             
                # New regular expression that matches self between +min+ and +max+
         | 
| @@ -216,7 +301,7 @@ module Re | |
| 216 301 | 
             
                # exactly exactly +min+ times.
         | 
| 217 302 | 
             
                def repeat(min, max=nil)
         | 
| 218 303 | 
             
                  if min && max
         | 
| 219 | 
            -
                    Rexp.new(parenthesized_encoding(POSTFIX) + "{#{min},#{max}}", POSTFIX, capture_keys)
         | 
| 304 | 
            +
                    Rexp.new(parenthesized_encoding(POSTFIX) + apply_greedy("{#{min},#{max}}"), POSTFIX, capture_keys)
         | 
| 220 305 | 
             
                  else
         | 
| 221 306 | 
             
                    Rexp.new(parenthesized_encoding(POSTFIX) + "{#{min}}", POSTFIX, capture_keys)
         | 
| 222 307 | 
             
                  end
         | 
| @@ -224,12 +309,12 @@ module Re | |
| 224 309 |  | 
| 225 310 | 
             
                # New regular expression that matches self at least +min+ times.
         | 
| 226 311 | 
             
                def at_least(min)
         | 
| 227 | 
            -
                  Rexp.new(parenthesized_encoding(POSTFIX) + "{#{min},}", POSTFIX, capture_keys)
         | 
| 312 | 
            +
                  Rexp.new(parenthesized_encoding(POSTFIX) + apply_greedy("{#{min},}"), POSTFIX, capture_keys)
         | 
| 228 313 | 
             
                end
         | 
| 229 314 |  | 
| 230 315 | 
             
                # New regular expression that matches self at most +max+ times.
         | 
| 231 316 | 
             
                def at_most(max)
         | 
| 232 | 
            -
                  Rexp.new(parenthesized_encoding(POSTFIX) + "{0,#{max}}", POSTFIX, capture_keys)
         | 
| 317 | 
            +
                  Rexp.new(parenthesized_encoding(POSTFIX) + apply_greedy("{0,#{max}}"), POSTFIX, capture_keys)
         | 
| 233 318 | 
             
                end
         | 
| 234 319 |  | 
| 235 320 | 
             
                # New regular expression that matches self across the complete
         | 
| @@ -323,7 +408,13 @@ module Re | |
| 323 408 |  | 
| 324 409 | 
             
                protected
         | 
| 325 410 |  | 
| 326 | 
            -
                #  | 
| 411 | 
            +
                # Return the repeat op in either greedy or non-greedy form, as
         | 
| 412 | 
            +
                # determined by the greedy flag on the current regular expression.
         | 
| 413 | 
            +
                def apply_greedy(op)
         | 
| 414 | 
            +
                  greedy? ? op : "#{op}?"
         | 
| 415 | 
            +
                end
         | 
| 416 | 
            +
             | 
| 417 | 
            +
                # String encoding with grouping if needed.
         | 
| 327 418 | 
             
                #
         | 
| 328 419 | 
             
                # If the precedence of the current Regexp is less than the new
         | 
| 329 420 | 
             
                # precedence level, return the encoding wrapped in a non-capturing
         | 
| @@ -457,8 +548,8 @@ module Re | |
| 457 548 | 
             
                # Examples:
         | 
| 458 549 | 
             
                #
         | 
| 459 550 | 
             
                #   re.none("aieouy")                 -- matches non-vowels
         | 
| 460 | 
            -
                #   re. | 
| 461 | 
            -
                #   re. | 
| 551 | 
            +
                #   re.none("0-9")                    -- matches non-digits
         | 
| 552 | 
            +
                #   re.none("A-Z", "a-z", "0-9")      -- matches non-alphanumerics
         | 
| 462 553 | 
             
                #
         | 
| 463 554 | 
             
                def none(*chars)
         | 
| 464 555 | 
             
                  Rexp.new("[^" + char_class(chars)  + "]", GROUPED, [])
         | 
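
Note on the lib/re.rb changes above: the new apply_greedy helper appends "?" to a repeat operator when the expression is flagged non-greedy. The sketch below shows the same effect with plain Ruby Regexp objects only. The helper name lazy_or_greedy and its boolean argument are hypothetical stand-ins for illustration, not part of the Re library.

```ruby
# Hypothetical stand-in for the apply_greedy helper shown in the diff:
# return the repeat operator unchanged when greedy, or with a trailing "?"
# (the lazy form) when non-greedy.
def lazy_or_greedy(op, greedy)
  greedy ? op : "#{op}?"
end

greedy_re = Regexp.new("a" + lazy_or_greedy("{2,4}", true))   # same as /a{2,4}/
lazy_re   = Regexp.new("a" + lazy_or_greedy("{2,4}", false))  # same as /a{2,4}?/
zero_re   = Regexp.new("a" + lazy_or_greedy("{0,4}", false))  # same as /a{0,4}?/

# The greedy form takes as many repetitions as the interval allows.
greedy_match = greedy_re.match("aaaaa")[0]

# The lazy form stops as soon as the minimum count is satisfied.
lazy_match = lazy_re.match("aaaaa")[0]

# With a minimum of zero, the lazy form can match the empty string
# (behavior on Ruby 1.9 and later).
lazy_zero = zero_re.match("aaaa")[0]

puts greedy_match.inspect
puts lazy_match.inspect
puts lazy_zero.inspect
```

On Ruby 1.9 or later this prints "aaaa", "aa", and "" in that order, matching the expectations in the repeat and at_most tests added to re_test.rb.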
    
        data/test/re_test.rb
    CHANGED
    
    | @@ -1,5 +1,10 @@ | |
| 1 1 | 
             
            #!/usr/bin/env ruby
         | 
| 2 2 |  | 
| 3 | 
            +
            # Copyright 2009 by Jim Weirich (jim.weirich@gmail.com).
         | 
| 4 | 
            +
            # All rights reserved.
         | 
| 5 | 
            +
            #
         | 
| 6 | 
            +
            # Re is provided under the MIT open source license (see MIT-LICENSE)
         | 
| 7 | 
            +
             | 
| 3 8 | 
             
            require 'test/unit'
         | 
| 4 9 | 
             
            require 're'
         | 
| 5 10 |  | 
| @@ -59,7 +64,7 @@ class ReTest < Test::Unit::TestCase | |
| 59 64 | 
             
              end
         | 
| 60 65 |  | 
| 61 66 | 
             
              def test_non_greedy_many
         | 
| 62 | 
            -
                r =  re.any.many | 
| 67 | 
            +
                r =  re.any.non_greedy.many.capture(:x) + re("b")
         | 
| 63 68 | 
             
                result = r.match("xbxb")
         | 
| 64 69 | 
             
                assert result
         | 
| 65 70 | 
             
                assert_equal "x", result[:x]
         | 
| @@ -80,7 +85,7 @@ class ReTest < Test::Unit::TestCase | |
| 80 85 | 
             
              end
         | 
| 81 86 |  | 
| 82 87 | 
             
              def test_non_greedy_one_or_more
         | 
| 83 | 
            -
                r = re.any.one_or_more | 
| 88 | 
            +
                r = re.any.non_greedy.one_or_more.capture(:any) + re("b")
         | 
| 84 89 | 
             
                result = r.match("xbxb")
         | 
| 85 90 | 
             
                assert result
         | 
| 86 91 | 
             
                assert_equal "x", result[:any]
         | 
| @@ -102,6 +107,18 @@ class ReTest < Test::Unit::TestCase | |
| 102 107 | 
             
                assert r !~ "aaaaa"
         | 
| 103 108 | 
             
              end
         | 
| 104 109 |  | 
| 110 | 
            +
              def test_repeat_greedy
         | 
| 111 | 
            +
                r = re("a").repeat(2, 4)
         | 
| 112 | 
            +
                result = r =~ "aaaaa"
         | 
| 113 | 
            +
                assert_equal "aaaa", result.full_match
         | 
| 114 | 
            +
              end
         | 
| 115 | 
            +
             | 
| 116 | 
            +
              def test_repeat_non_greedy
         | 
| 117 | 
            +
                r = re("a").non_greedy.repeat(2, 4)
         | 
| 118 | 
            +
                result = r =~ "aaaaa"
         | 
| 119 | 
            +
                assert_equal "aa", result.full_match
         | 
| 120 | 
            +
              end
         | 
| 121 | 
            +
             | 
| 105 122 | 
             
              def test_at_least
         | 
| 106 123 | 
             
                r = re("a").at_least(2).all
         | 
| 107 124 | 
             
                assert r !~ "a"
         | 
| @@ -109,6 +126,18 @@ class ReTest < Test::Unit::TestCase | |
| 109 126 | 
             
                assert r =~ "aaaaaaaaaaaaaaaaaaaa"
         | 
| 110 127 | 
             
              end
         | 
| 111 128 |  | 
| 129 | 
            +
              def test_at_least_greedy
         | 
| 130 | 
            +
                r = re("a").at_least(2)
         | 
| 131 | 
            +
                result =  r =~ "aaaa"
         | 
| 132 | 
            +
                assert_equal "aaaa", result.full_match
         | 
| 133 | 
            +
              end
         | 
| 134 | 
            +
             | 
| 135 | 
            +
              def test_at_least_non_greedy
         | 
| 136 | 
            +
                r = re("a").non_greedy.at_least(2)
         | 
| 137 | 
            +
                result =  r =~ "aaa"
         | 
| 138 | 
            +
                assert_equal "aa", result.full_match
         | 
| 139 | 
            +
              end
         | 
| 140 | 
            +
             | 
| 112 141 | 
             
              def test_at_most
         | 
| 113 142 | 
             
                r = re("a").at_most(4).all
         | 
| 114 143 | 
             
                assert r =~ ""
         | 
| @@ -119,6 +148,24 @@ class ReTest < Test::Unit::TestCase | |
| 119 148 | 
             
                assert r !~ "aaaaa"
         | 
| 120 149 | 
             
              end
         | 
| 121 150 |  | 
| 151 | 
            +
              def test_at_most_greedy
         | 
| 152 | 
            +
                r = re("a").at_most(4)
         | 
| 153 | 
            +
                result = r =~ "aaaa"
         | 
| 154 | 
            +
                assert_equal "aaaa", result.full_match
         | 
| 155 | 
            +
              end
         | 
| 156 | 
            +
              
         | 
| 157 | 
            +
              def test_at_most_non_greedy
         | 
| 158 | 
            +
                r = re("a").non_greedy.at_most(4)
         | 
| 159 | 
            +
                result = r =~ "aaaa"
         | 
| 160 | 
            +
                if RUBY_VERSION < "1.9"
         | 
| 161 | 
            +
                  # Ruby 1.8.x seems to have a bug where non-greedy matches with
         | 
| 162 | 
            +
                  # intervals match at least one character.
         | 
| 163 | 
            +
                  assert_equal "a", result.full_match
         | 
| 164 | 
            +
                else
         | 
| 165 | 
            +
                  assert_equal "", result.full_match
         | 
| 166 | 
            +
                end
         | 
| 167 | 
            +
              end
         | 
| 168 | 
            +
              
         | 
| 122 169 | 
             
              def test_optional
         | 
| 123 170 | 
             
                r = re("a").optional.all
         | 
| 124 171 | 
             
                assert r =~ ""
         | 
| @@ -494,6 +541,15 @@ class ReTest < Test::Unit::TestCase | |
| 494 541 | 
             
                assert_equal "02", result[:month]
         | 
| 495 542 | 
             
                assert_equal "14", result[:day]
         | 
| 496 543 | 
             
              end
         | 
| 544 | 
            +
              
         | 
| 545 | 
            +
              def test_name_map_returns_map_of_keywords
         | 
| 546 | 
            +
                r = re("a").capture(:a) + re("b").capture(:b) + re("c").capture(:c)
         | 
| 547 | 
            +
                result = r.match("abc")
         | 
| 548 | 
            +
                assert result
         | 
| 549 | 
            +
                assert_equal 1, r.name_map[:a]
         | 
| 550 | 
            +
                assert_equal 2, r.name_map[:b]
         | 
| 551 | 
            +
                assert_equal 3, r.name_map[:c]
         | 
| 552 | 
            +
              end
         | 
| 497 553 |  | 
| 498 554 | 
             
              private
         | 
| 499 555 |  | 
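
Note on test_name_map_returns_map_of_keywords above: the test expects name_map to map each capture key to its 1-based group index. The following is a minimal sketch of that idea using plain Ruby only. It assumes the keys are kept in an ordered list in the order the groups appear; the variable names are illustrative and this is not necessarily how Re builds its map internally.

```ruby
# Assumed: capture keys are stored in the order their groups appear.
capture_keys = [:a, :b, :c]

# Build a key => group-index map; regexp groups are numbered from 1.
name_map = {}
capture_keys.each_with_index do |key, index|
  name_map[key] = index + 1
end

# Use the map to look up a numbered group by its key.
match = /(a)(b)(c)/.match("abc")
value_of_b = match[name_map[:b]]

puts name_map.inspect
puts value_of_b
```

Looking groups up through such a map lets callers use symbolic names while the underlying regular expression still uses ordinary numbered groups.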
    
        metadata
    CHANGED
    
    | @@ -1,7 +1,7 @@ | |
| 1 1 | 
             
            --- !ruby/object:Gem::Specification 
         | 
| 2 2 | 
             
            name: re
         | 
| 3 3 | 
             
            version: !ruby/object:Gem::Version 
         | 
| 4 | 
            -
              version: 0.0. | 
| 4 | 
            +
              version: 0.0.5
         | 
| 5 5 | 
             
            platform: ruby
         | 
| 6 6 | 
             
            authors: 
         | 
| 7 7 | 
             
            - Jim Weirich
         | 
| @@ -9,7 +9,7 @@ autorequire: | |
| 9 9 | 
             
            bindir: bin
         | 
| 10 10 | 
             
            cert_chain: []
         | 
| 11 11 |  | 
| 12 | 
            -
            date: 2009-12- | 
| 12 | 
            +
            date: 2009-12-31 00:00:00 -05:00
         | 
| 13 13 | 
             
            default_executable: 
         | 
| 14 14 | 
             
            dependencies: []
         | 
| 15 15 |  |