RubyGems - regex - Versions diffs - 1.0.0 - Mend

regex 1.0.0


Files changed (23)

data/HISTORY +11 -0
data/LICENSE +23 -0
data/MANIFEST +25 -0
data/README +45 -0
data/bin/regex +3 -0
data/lib/regex.rb +236 -0
data/lib/regex/command.rb +108 -0
data/lib/regex/extractor.rb +1 -0
data/lib/regex/string.rb +68 -0
data/lib/regex/templates/common.rb +13 -0
data/meta/authors +2 -0
data/meta/created +1 -0
data/meta/description +1 -0
data/meta/download +1 -0
data/meta/homepage +1 -0
data/meta/mailinglist +1 -0
data/meta/name +1 -0
data/meta/repository +1 -0
data/meta/summary +1 -0
data/meta/title +1 -0
data/meta/version +1 -0
data/test/demos/regex.rdoc +44 -0
metadata +87 -0

data/HISTORY ADDED

@@ -0,0 +1,11 @@
+= RELEASE HISTORY
+1.0.0 / 2010-02-10
+Initial release of Regex. Regex is a simple
+commandline Regular Expression tool.
+Changes:
+* Happy Birthday

data/LICENSE ADDED

@@ -0,0 +1,23 @@
+The MIT License
+Copyright (c) 2009 Thomas Sawyer
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.

data/MANIFEST ADDED

@@ -0,0 +1,25 @@
+HISTORY
+LICENSE
+MANIFEST
+README
+bin/regex
+lib/regex
+lib/regex.rb
+lib/regex/command.rb
+lib/regex/extractor.rb
+lib/regex/string.rb
+lib/regex/templates
+lib/regex/templates/common.rb
+meta/authors
+meta/created
+meta/description
+meta/download
+meta/homepage
+meta/mailinglist
+meta/name
+meta/repository
+meta/summary
+meta/title
+meta/version
+test/demos
+test/demos/regex.rdoc

data/README ADDED

@@ -0,0 +1,45 @@
+= Regex ("Like a Knife")
+* home: http://proutils.github.com/regex
+* work: http://github.com/proutils/regex
+== DESCRIPTION
+Yea, I know what you are going to say. "I can do that with ___" Fill in the blank
+with +grep+, +awk+, +sed+, +perl+, etc. But honestly, none of these tools are
+Language 2.0 (read "post-Ruby"). What I want is a simple commandline tool that
+gives me quick access to a Regular Expression engine. No more, no less.
+Now I could have written this tool in Perl. I'm sure it would be just as good, if not
+better, since Perl's Regular Expression engine rocks, or so I hear. But Ruby's is
+pretty good too, and getting better (with 1.9+). And since I know Ruby very
+well, well, that's what you get.
+== USAGE
+Okay, check it out. It's real simple. Supply a regular expression and a file to
+match upon to the +regex+ command.
+  $ regex '=begin.*?\n(.*)=end' sample.txt
+It does exactly what you think it would.
+Check out the <tt>--help</tt> and I'm sure the rest will come to you real quick.
+But if you want more information, then do us the good favor of jumping over
+to the <a href="http://proutils.github.com/regex">documentation wiki</a>.
+== STATUS
+This is a very early release. So don't expect every feature under the sun just yet, or
+that every detail is going to work peachy keen. But hey, if something needs fixing
+or a feature needs adding, well then get in there and send me a patch --open
+source software is built on *TEAM WORK*.
+And expect a potential for rapid change here at the beginning.
+== COPYRIGHT
+Copyright (c) 2010 Thomas Sawyer
+Regex is licensed under the terms of the MIT license.
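
The USAGE example in the README can be approximated in plain Ruby. This is a minimal sketch rather than the gem's implementation: it assumes the pattern is compiled in multiline mode (as lib/regex.rb does for string patterns), and both the sample text and the `first_match_groups` helper name are invented for illustration.

```ruby
# Hypothetical helper: return the capture groups of the first match,
# or the whole match when the pattern defines no groups.
def first_match_groups(pattern, text)
  md = Regexp.new(pattern, Regexp::MULTILINE).match(text)
  return [] unless md
  md.size > 1 ? md.captures : [md[0]]
end

# Made-up stand-in for the contents of sample.txt.
sample = <<~TEXT
  =begin rdoc
  Some embedded documentation.
  =end
  puts "code"
TEXT

puts first_match_groups('=begin.*?\n(.*)=end', sample)
```

Because the dot also matches newlines in multiline mode, the single group captures everything between the `=begin` line and the last `=end`.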

data/bin/regex ADDED

@@ -0,0 +1,3 @@
+#!/usr/bin/env ruby
+require 'regex'
+Regex::Command.main(*ARGV)

data/lib/regex.rb ADDED

@@ -0,0 +1,236 @@
+# = Text Extraction Class
+#
+# Extractor was designed particularly for extracting source code from embedded
+# comment blocks.
+#
+# Todo:
+#   - How can we handle embedded code in standard comments? Eg. #
+#
+class Regex
+  VERSION = "1.1"
+  # When the regular expression returns multiple groups,
+  # each is divided by the group deliminator.
+  # This is the default value.
+  DELIMINATOR_GROUP  = 29.chr + "\n"
+  # When using repeat mode, each match is divided by
+  # the record deliminator. This is the default value.
+  DELIMINATOR_RECORD = 30.chr + "\n"
+  require 'fileutils'
+  require 'open-uri'
+  require 'regex/string'
+  require 'regex/command'
+  # TODO: generalize to plugin
+  require 'regex/templates/common'
+  #
+  #attr_accessor :text
+  # Remove XML tags from search.
+  attr_accessor :unxml
+  # Regular expression.
+  attr_accessor :pattern
+  # Select built-in regular expression by name.
+  attr_accessor :template
+  # Index of expression return.
+  attr_accessor :index
+  # Ignore case.
+  attr_accessor :insensitive
+  # Repeat Match.
+  attr_accessor :repeat
+  # Output format.
+  attr_accessor :format
+  # DEPRECATE: Not needed anymore.
+  #def self.load(io, options={}, &block)
+  #  new(io, options, &block)
+  #end
+  # New extractor.
+  def initialize(io, options={})
+    @raw = (String === io ? io : io.read)
+    options.each do |k,v|
+      __send__("#{k}=", v)
+    end
+    yield(self) if block_given?
+  end
+  # Read file.
+  #def raw
+  #  @raw ||= open(@file) # File.read(@file)
+  #end
+  #--
+  # TODO: unxml is too primitive, use real xml parser like nokogiri
+  #++
+  def text
+    @text ||= (
+      if unxml
+        @raw.gsub(/\<(.*?)\>/, '')
+      else
+        @raw
+      end
+    )
+  end
+  #
+  def regex
+    @regex ||= (
+      if template
+        TEMPLATES.const_get(template.upcase)
+      else
+        case pattern
+        when Regexp
+          pattern
+        when String
+          flags = 0
+          flags |= Regexp::MULTILINE
+          flags |= Regexp::IGNORECASE if insensitive
+          Regexp.new(pattern, flags)
+        end
+      end
+    )
+  end
+  #
+  def to_s(format=nil)
+    case format
+    when :yaml
+      to_s_yaml
+    when :json
+      to_s_json
+    else
+      out = structure
+      if repeat
+        out = out.map{ |m| m.join(deliminator_group) }
+        out = out.join(deliminator_record) #.chomp("\n") + "\n"
+      else
+        out = out.join(deliminator_group) #.chomp("\n") + "\n"
+      end
+      out
+    end
+  end
+  #
+  def to_s_yaml
+    require 'yaml'
+    structure.to_yaml
+  end
+  #
+  def to_s_json
+    begin
+      require 'json'
+    rescue LoadError
+      require 'json_pure'
+    end
+    structure.to_json
+  end
+  # Structure the matchdata according to specified options.
+  def structure
+    repeat ? structure_repeat : structure_single
+  end
+  # Structure the matchdata for single match.
+  def structure_single
+    md = extract
+    if index
+      [md[index]]
+    elsif md.size > 1
+      md[1..-1]
+    else
+      [md[0]]
+    end
+  end
+  # Structure the matchdata for repeat matches.
+  def structure_repeat
+    out = extract
+    if index
+      out.map{ |md| [md[index]] }
+    else
+      out.map{ |md| md.size > 1 ? md[1..-1] : [md[0]] }
+    end
+  end
+  # Extract match from source text.
+  def extract
+    if repeat
+      extract_repeat
+    else
+      extract_single
+    end
+  end
+  #
+  #def extract_single
+  #  out = []
+  #  if md = matchdata
+  #    if index
+  #      out << md[index]
+  #    elsif md.size > 1
+  #      out = md[1..-1] #.join(deliminator_group)
+  #    else
+  #      out = md
+  #    end
+  #  end
+  #  return out
+  #end
+  # Extract single match from source text.
+  def extract_single
+    md = regex.match(text)
+    md ? md : []
+  end
+  #
+  #def matchdata
+  #  regex.match(text)
+  #end
+  #
+  #def extract_repeat
+  #  out = []
+  #  text.scan(regex) do
+  #    md = $~
+  #    if index
+  #      out << [md[index]]
+  #    elsif md.size > 1
+  #      out << md[1..-1] #.join(deliminator_group)
+  #    else
+  #      out << md
+  #    end
+  #  end
+  #  out #.join(deliminator_record)
+  #end
+  # Extract repeat matches from source text.
+  def extract_repeat
+    out = []
+    text.scan(regex) do
+      out << $~
+    end
+    out
+  end
+  def deliminator_group
+    DELIMINATOR_GROUP
+  end
+  def deliminator_record
+    DELIMINATOR_RECORD
+  end
+end
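
Two details of lib/regex.rb are easy to show in isolation: Regexp option flags are integers that combine with bitwise OR, and the output joins groups and records with the ASCII separator characters 29 and 30. The sketch below is standalone and does not load the gem; `build_regexp` and `format_matches` are hypothetical names, and the input string is made up.

```ruby
# Hypothetical helper: compile a string pattern, always multiline and
# optionally case-insensitive. The flags are OR'd into one integer.
def build_regexp(pattern, insensitive = false)
  flags = Regexp::MULTILINE
  flags |= Regexp::IGNORECASE if insensitive
  Regexp.new(pattern, flags)
end

# Hypothetical helper: collect every match as an array of groups (or the
# whole match when there are no groups), then join groups and records
# with the same separators the class uses as defaults.
def format_matches(regexp, text, group_sep: 29.chr + "\n", record_sep: 30.chr + "\n")
  records = []
  text.scan(regexp) do
    md = Regexp.last_match
    records << (md.size > 1 ? md.captures : [md[0]])
  end
  records.map { |groups| groups.join(group_sep) }.join(record_sep)
end

re = build_regexp('(\w+)=(\d+)', true)
p format_matches(re, "a=1 B=2")
```

Passing the separators as keyword arguments is a choice made for this sketch only; the class itself exposes them through `deliminator_group` and `deliminator_record`.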

data/lib/regex/command.rb ADDED

@@ -0,0 +1,108 @@
+require 'regex'
+class Regex
+  # Commandline interface.
+  #
+  class Command
+    #
+    attr :file
+    #
+    attr :format
+    #
+    attr :options
+    #
+    def self.main(*argv)
+      new(*argv).main
+    end
+    # New Command.
+    def initialize(*argv)
+      @file    = nil
+      @format  = nil
+      @options = {}
+      parse(*argv)
+    end
+    #
+    def parse(*argv)
+      parser.parse!(argv)
+      unless @options[:template]
+        @options[:pattern] = argv.shift
+      end
+      @file = argv.shift
+      if @file
+        unless File.file?(@file)
+          puts "No such file -- '#{file}'."
+          exit 1
+        end
+      end
+    end
+    # OptionParser instance.
+    def parser
+      require 'optparse'
+      @options = {}
+      OptionParser.new do |opt|
+        opt.on('--template', '-t NAME', "select a built-in regular expression") do |name|
+          @options[:template] = name
+        end
+        opt.on('--index', '-n INT', "return a specific match index") do |int|
+          @options[:index] = int.to_i
+        end
+        opt.on('--insensitive', '-i', "case insensitive matching") do
+          @options[:insensitive] = true
+        end
+        opt.on('--unxml', '-x', "ignore XML/HTML tags") do
+          @options[:unxml] = true
+        end
+        opt.on('--repeat', '-r', "find all matching occurrences") do
+          @options[:repeat] = true
+        end
+        opt.on('--yaml', '-y', "output in YAML format") do
+          @format = :yaml
+        end
+        opt.on('--json', '-j', "output in JSON format") do
+          @format = :json
+        end
+        opt.on_tail('--help', '-h', "display this lovely help message") do
+          puts opt
+          exit 0
+        end
+      end
+    end
+    #
+    def extraction
+      target = file ? File.new(file) : ARGF
+      Regex.new(target, options)
+    end
+    # Extract and display.
+    def main
+      begin
+        puts extraction.to_s(@format)
+      rescue => error
+        if $DEBUG
+          raise error
+        else
+          abort error.to_s
+        end
+      end
+    end
+  end
+end
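
The argument handling in lib/regex/command.rb follows a common OptionParser shape: switches fill an options hash, and what remains is read positionally as the pattern and then the file. A minimal sketch of that flow, assuming only Ruby's standard `optparse`; `parse_args` is a hypothetical helper that covers a subset of the switches, and the sample arguments are made up.

```ruby
require 'optparse'

# Hypothetical helper: parse a subset of the switches into an options hash
# and return it together with the optional file argument.
def parse_args(argv)
  options = {}
  parser = OptionParser.new do |opt|
    opt.on('-t', '--template NAME', 'select a built-in regular expression') { |name| options[:template] = name }
    opt.on('-n', '--index INT', Integer, 'return a specific match index') { |int| options[:index] = int }
    opt.on('-i', '--insensitive', 'case insensitive matching') { options[:insensitive] = true }
    opt.on('-r', '--repeat', 'find all matching occurrences') { options[:repeat] = true }
  end
  rest = parser.parse(argv) # non-destructive; returns the positional arguments
  # The first positional argument is the pattern unless a template was chosen.
  options[:pattern] = rest.shift unless options[:template]
  [options, rest.shift]
end

options, file = parse_args(['-i', '-n', '1', '(\w+)@', 'notes.txt'])
p options
p file
```

Unlike the original, this sketch lets OptionParser coerce the index with `Integer` instead of calling `to_i` in the block, so a non-numeric value raises `OptionParser::InvalidArgument` rather than silently becoming 0.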

data/lib/regex/extractor.rb ADDED

@@ -0,0 +1 @@
+

data/lib/regex/string.rb ADDED

@@ -0,0 +1,68 @@
+class Regex
+  # Extensions for String class.
+  # These methods are taken directly from Ruby Facets.
+  #
+  module String
+    # Provides a margin controlled string.
+    #
+    #   x = %Q{
+    #         | This
+    #         |   is
+    #         |     margin controlled!
+    #         }.margin
+    #
+    #
+    #   NOTE: This may still need a bit of tweaking.
+    #
+    #  CREDIT: Trans
+    def margin(n=0)
+      #d = /\A.*\n\s*(.)/.match( self )[1]
+      #d = /\A\s*(.)/.match( self)[1] unless d
+      md = /\A.*\n\s*(.)/.match(self) || /\A\s*(.)/.match(self)
+      return '' unless md
+      d = md[1]
+      if n == 0
+        gsub(/\n\s*\Z/,'').gsub(/^\s*[#{d}]/, '')
+      else
+        gsub(/\n\s*\Z/,'').gsub(/^\s*[#{d}]/, ' ' * n)
+      end
+    end
+    # Preserves relative tabbing.
+    # The first non-empty line ends up with n spaces before nonspace.
+    #
+    #  CREDIT: Gavin Sinclair
+    def tabto(n)
+      if self =~ /^( *)\S/
+        indent(n - $1.length)
+      else
+        self
+      end
+    end
+    # Indent left or right by n spaces.
+    # (This used to be called #tab and aliased as #indent.)
+    #
+    #  CREDIT: Gavin Sinclair
+    #  CREDIT: Trans
+    def indent(n)
+      if n >= 0
+        gsub(/^/, ' ' * n)
+      else
+        gsub(/^ {0,#{-n}}/, "")
+      end
+    end
+  end
+  class ::String #:nodoc:
+    include Regex::String
+  end
+end
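
The `indent` and `tabto` helpers in lib/regex/string.rb are small enough to try without extending the core String class. The sketch below restates them as standalone functions that take the string as an argument. That is an assumption made for illustration, since the gem mixes them into `::String`, and the sample text is made up.

```ruby
# Standalone restatement (hypothetical signature) of Regex::String#indent:
# add n spaces to every line, or remove up to -n leading spaces.
def indent(str, n)
  if n >= 0
    str.gsub(/^/, ' ' * n)
  else
    str.gsub(/^ {0,#{-n}}/, '')
  end
end

# Standalone restatement of Regex::String#tabto: shift the whole text so the
# first non-empty line starts at column n, preserving relative indentation.
def tabto(str, n)
  str =~ /^( *)\S/ ? indent(str, n - $1.length) : str
end

code = "    def foo\n      bar\n    end\n"
puts tabto(code, 2)
```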

data/lib/regex/templates/common.rb ADDED

@@ -0,0 +1,13 @@
+class Regex
+  #
+  module TEMPLATES
+    MLTAG      = /<([A-Z][A-Z0-9]*)\b[^>]*>(.*?)<\/\1>/i
+    IP         = /\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b/
+    EMAIL      = /([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)/i
+    USPHONE    = /(\d\d\d[-]|\(\d\d\d\))?(\d\d\d)[-](\d\d\d\d)/
+    RUBYBLOCK  = /^=begin\s+(.*?)\n(.*?)\n=end/mi
+    RUBYMETHOD = /\A\s*(\#.*?)^\s*def\s+(.*?)$/mi
+  end
+end
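
Templates are looked up by name with `const_get` after upcasing, so the command-line spelling of the name is case-insensitive. A minimal sketch of that lookup, assuming a stand-in module `SketchTemplates` (hypothetical) that holds one pattern copied from the file above; `template_for` and the sample phone number are also made up.

```ruby
# Hypothetical stand-in for Regex::TEMPLATES with a single copied pattern.
module SketchTemplates
  USPHONE = /(\d\d\d[-]|\(\d\d\d\))?(\d\d\d)[-](\d\d\d\d)/
end

# Resolve a template by name, upcasing first as Regex#regex does.
def template_for(name)
  SketchTemplates.const_get(name.upcase)
end

md = template_for('usphone').match('Call 555-867-5309 today')
p md.captures
```

An unknown name raises `NameError` from `const_get`; whether the command should rescue that and print a friendlier message is left open here.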

data/meta/authors ADDED

@@ -0,0 +1,2 @@
+Thomas Sawyer
+Tyler Rick

data/meta/created ADDED

@@ -0,0 +1 @@
+2006-05-09

data/meta/description ADDED

@@ -0,0 +1 @@
+Regex is a simple commandline Regular Expression tool.

data/meta/download ADDED

@@ -0,0 +1 @@
+http://github.com/proutils/regex/downloads

data/meta/homepage ADDED

@@ -0,0 +1 @@
+http://proutils.github.com/regex

data/meta/mailinglist ADDED

@@ -0,0 +1 @@
+http://groups.google.com/group/proutils/topics?hl=en

data/meta/name ADDED

@@ -0,0 +1 @@
+regex

data/meta/repository ADDED

@@ -0,0 +1 @@
+git://github.com/proutils/regex.git

data/meta/summary ADDED

@@ -0,0 +1 @@
+Regex is a simple commandline Regular Expression tool.

data/meta/title ADDED

@@ -0,0 +1 @@
+Regex

data/meta/version ADDED

@@ -0,0 +1 @@
+1.0.0

data/test/demos/regex.rdoc ADDED

@@ -0,0 +1,44 @@
+= Regex class
+Regex is really meant to be used on the commandline
+since it is really nothing more than a front end
+to Ruby's regular expression engine. But we will
+demonstrate its use in code just the same, and to
+help ensure code quality.
+First we need to require the Regex library.
+  require 'regex'
+Now let's create some material to work with.
+  text = "We will match against this string."
+Now we can create a Regex object using the text.
+We will also supply a matching pattern, as none of
+the matching functions will work without providing
+a pattern or the name of a built-in pattern template.
+  regex = Regex.new(text, :pattern=>'\w+')
+We can see that the Regex object has converted the pattern
+into the expected regular expression via the #regex method.
+  regex.regex.assert == /\w+/m
+Under the hood, Regex has split the process of matching,
+organizing and formatting the results into separate methods.
+We can use the #structure method to see the match results
+organized into uniform arrays.
+  regex.structure.assert == %w{We}
+Whereas the last use only returns a single match, if we turn
+on repeat mode we can see every word.
+  regex.repeat = true
+  regex.structure.assert == %w{We will match against this string}.map{ |e| [e] }
+Notice that repeat mode creates an array in an array.
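
The demo relies on an `assert` helper that is not part of core Ruby (presumably from the assertion library used to run the demos). The same single-versus-repeat behaviour can be reproduced with core Ruby alone. This is a rough sketch that mirrors the demo's text and pattern but does not load the gem; the variable names are chosen for illustration.

```ruby
text = "We will match against this string."

# A string pattern is compiled in multiline mode, as in the demo.
regexp = Regexp.new('\w+', Regexp::MULTILINE)

# Single mode: only the first match, wrapped in an array.
single = [regexp.match(text)[0]]

# Repeat mode: every match, each wrapped in its own array.
repeat = text.scan(regexp).map { |word| [word] }

p single
p repeat
```

With no capture groups in the pattern, `String#scan` yields whole matches, which is why each word ends up in its own one-element array.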

metadata ADDED

@@ -0,0 +1,87 @@
+--- !ruby/object:Gem::Specification
+name: regex
+version: !ruby/object:Gem::Version
+  prerelease: false
+  segments:
+  - 1
+  - 0
+  - 0
+  version: 1.0.0
+platform: ruby
+authors:
+- Thomas Sawyer
+- Tyler Rick
+autorequire:
+bindir: bin
+cert_chain: []
+date: 2010-02-12 00:00:00 -05:00
+default_executable:
+dependencies: []
+description: Regex is a simple commandline Regular Expression tool.
+email:
+executables:
+- regex
+extensions: []
+extra_rdoc_files:
+- README
+files:
+- HISTORY
+- LICENSE
+- MANIFEST
+- README
+- bin/regex
+- lib/regex.rb
+- lib/regex/command.rb
+- lib/regex/extractor.rb
+- lib/regex/string.rb
+- lib/regex/templates/common.rb
+- meta/authors
+- meta/created
+- meta/description
+- meta/download
+- meta/homepage
+- meta/mailinglist
+- meta/name
+- meta/repository
+- meta/summary
+- meta/title
+- meta/version
+- test/demos/regex.rdoc
+has_rdoc: true
+homepage: http://proutils.github.com/regex
+licenses: []
+post_install_message:
+rdoc_options:
+- --title
+- Regex API
+- --main
+- README
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      segments:
+      - 0
+      version: "0"
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      segments:
+      - 0
+      version: "0"
+requirements: []
+rubyforge_project: regex
+rubygems_version: 1.3.6.pre.3
+signing_key:
+specification_version: 3
+summary: Regex is a simple commandline Regular Expression tool.
+test_files: []