RubyGems - michel-randexp - Versions diffs - 0.1.4 - Mend

michel-randexp 0.1.4

Files changed (29)

data/CHANGELOG +23 -0
data/LICENSE +20 -0
data/README +82 -0
data/Rakefile +111 -0
data/TODO +4 -0
data/lib/randexp.rb +21 -0
data/lib/randexp/core_ext.rb +6 -0
data/lib/randexp/core_ext/array.rb +5 -0
data/lib/randexp/core_ext/integer.rb +5 -0
data/lib/randexp/core_ext/range.rb +9 -0
data/lib/randexp/core_ext/regexp.rb +7 -0
data/lib/randexp/dictionary.rb +24 -0
data/lib/randexp/parser.rb +99 -0
data/lib/randexp/randgen.rb +84 -0
data/lib/randexp/reducer.rb +114 -0
data/lib/randexp/wordlists/female_names.rb +23 -0
data/lib/randexp/wordlists/male_names.rb +23 -0
data/lib/randexp/wordlists/real_name.rb +33 -0
data/spec/regression/regexp_spec.rb +204 -0
data/spec/spec_helper.rb +8 -0
data/spec/unit/core_ext/regexp_spec.rb +9 -0
data/spec/unit/randexp/parser_spec.rb +77 -0
data/spec/unit/randexp/reducer_spec.rb +273 -0
data/spec/unit/randexp_spec.rb +164 -0
data/spec/unit/randgen_spec.rb +216 -0
data/wordlists/female_names +4275 -0
data/wordlists/male_names +1219 -0
data/wordlists/surnames +475 -0
metadata +84 -0

data/CHANGELOG ADDED

@@ -0,0 +1,23 @@
+== 0.1.5 "Michel de Graaf" 2010-01-16
+* added anychar *
+== 0.1.4 "Wally Wisoky" 2008-10-08
+* Added realistic name generation (Matt Aimonetti)
+* Fixed loadpath issues (Gerrit Kaiser)
+== 0.1.3 "Oological" 2008-07-08
+* Randgen.word should not return a string that does not match /^\w+$/
+== 0.1.2 "I'm Not Saying It's Not Beta"
+* Changed rand to Kernel#rand to avoid conflicting with rails (thanks agile!)
+== 0.1.1 "Still Quite Beta" 2008-07-20
+* Added Range#of method.
+* Heavy refactoring of the Parser.parse method.
+* Fixed the /\./ bug.
+== 0.1.0 "Very Beta" 2008-07-08
+* Initial version of randexp!
+* Has support for very simple regular expressions.
+* Randgen has limited methods.
+* Dictionary is reading from the local words file.

data/LICENSE ADDED

@@ -0,0 +1,20 @@
+Copyright (c) 2008 Ben Burkert
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

data/README ADDED

@@ -0,0 +1,82 @@
+Randexp
+    by Ben Burkert
+    http://github.com/benburkert/randexp
+== DESCRIPTION:
+Randexp makes it easy to generate random strings from most regular expressions.
+== REQUIREMENTS:
+* none!
+== INSTALL:
+  $ sudo gem install randexp
+== USAGE:
+Randexp adds the #generate (or #gen, for short) method to the Regexp class,
+which generates a 'random' string that will match your regular expression.
+  /abc|def/.gen
+    # => "def"
+== Valid Regexp's
+Randexp can only generate matching strings from simple regular expressions.
+Except for a few circumstances, wildcards are generally not allowed in the
+regular expression.  Random data is pretty domain specific, so trying to guess when to
+terminate a random pattern would produce unhelpful data:
+  >> /Aa{3}h*!/.gen
+      # => RuntimeError: Sorry, "h*" is too vague, try setting a range: "h{0,3}"
+  >> /Aa{3}h{3,15}!/.gen
+      => "Aaaahhhhh!"
+  >> /(never gonna (give you up|let you down), )*/.gen
+      => RuntimeError: Sorry, "(...)*" is too vague, try setting a range: "(...){0, 3}"
+  >> /(never gonna (give you up|let you down), ){3,5}/.gen
+      => "never gonna give you up, never gonna let you down, never gonna give you up, never gonna give you up, "
+The exception is word characters (\w), which generate a random word from the Dictionary class.
+  >> /\w+/.gen
+      => "groveling"
+= Primitives & Complex matches
+The single-character matchers supported are word characters (\w), whitespace (\s), and digits (\d).
+  >> /\d{50}/.gen
+      => "50315410741096763188525493528315906035878741451037"
+When a multiplicity constraint is placed on a word character, a word with the valid length is generated.
+  >> /\w{10}/.gen  # a word with 10 letters
+      => "Chaucerism"
+  >> /\w{5,15}/.gen
+      => "cabalistic"
+Complex matchers use the [:...:] syntax within the regular expression.
+  >> /[:sentence:]/.gen
+      => "Nonhearer demetricize toppiece filicic possessedness rhodizite zoomagnetism earwigginess steady"
+Complex matchers can also be added by extending the Randgen class.
+  class Randgen
+    def self.serial_number(options = {})
+      /XX\d{4}-\w-\d{5}/.gen
+    end
+  end
+  >> /[:serial_number:]/.gen
+      => "XX3770-M-33114"
+= Dictionary
+The Dictionary loads the local user's words file, allowing randomly generated words to be chosen from
+the thousands of entries in the words file.  Words are mapped by their length to allow words to be randomly
+chosen based on size.
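
Reviewer note (not part of the diff): the README's point about bounded quantifiers can be illustrated without loading the gem. The sketch below is only an approximation of the idea. `repeat_literal` is a hypothetical helper, not a randexp method, and the error text is paraphrased from the README example rather than taken from the reducer.

```ruby
# Hypothetical helper (not part of randexp): repeat a literal a random
# number of times, but only when the multiplicity is bounded.
def repeat_literal(literal, multiplicity)
  count =
    case multiplicity
    when Integer then multiplicity              # like /a{3}/
    when Range   then multiplicity.to_a.sample  # like /h{3,15}/
    else
      # An unbounded quantifier such as * gives no hint about when to stop.
      raise ArgumentError, "\"#{literal}#{multiplicity}\" is too vague, try setting a range"
    end
  literal * count
end

# Roughly what /Aa{3}h{3,15}!/ asks for: fixed parts plus one bounded run.
scream = "A" + repeat_literal("a", 3) + repeat_literal("h", 3..15) + "!"
puts scream
```

Passing `:*` as the multiplicity raises, which mirrors why the README asks for an explicit range such as `h{0,3}`.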

data/Rakefile ADDED

@@ -0,0 +1,111 @@
+require 'rubygems'
+require 'rake/gempackagetask'
+require 'rubygems/specification'
+require 'date'
+require "spec/rake/spectask"
+require 'rake/rdoctask'
+PROJECT_NAME = "randexp"
+GEM = "michel-randexp"
+GEM_VERSION = "0.1.4"
+AUTHOR = "Ben Burkert,Michel de Graaf"
+EMAIL = "ben@benburkert.com, michel@re-invention.nl"
+HOMEPAGE = "http://github.com/michel/randexp"
+TITLE = "Randexp Gem"
+SUMMARY = "Library for generating random strings."
+FILES = %w(LICENSE README Rakefile TODO CHANGELOG) + Dir.glob("{lib,spec}/**/*") + Dir.glob("wordlists/**/*")
+RDOC_FILES = %w(LICENSE README Rakefile TODO CHANGELOG) + Dir.glob("lib/**/*")
+RUBYFORGE_USER = "benburkert"
+spec = Gem::Specification.new do |s|
+  s.name = GEM
+  s.version = GEM_VERSION
+  s.platform = Gem::Platform::RUBY
+  s.has_rdoc = true
+  s.extra_rdoc_files = ["README", "LICENSE", 'TODO']
+  s.summary = SUMMARY
+  s.description = s.summary
+  s.author = AUTHOR
+  s.email = EMAIL
+  s.homepage = HOMEPAGE
+  s.require_path = 'lib'
+  s.autorequire = GEM
+  s.files = FILES
+end
+Rake::GemPackageTask.new(spec) do |package|
+  package.gem_spec = spec
+  package.need_zip = true
+  package.need_tar = true
+end
+desc "install the gem locally"
+task :install => [:package] do
+  sh %{sudo gem install pkg/#{GEM}-#{GEM_VERSION}}
+end
+desc "create a gemspec file"
+task :make_spec do
+  File.open("#{GEM}.gemspec", "w") do |file|
+    file.puts spec.to_ruby
+  end
+end
+##############################################################################
+# rSpec & rcov
+##############################################################################
+desc "Run all unit specs"
+Spec::Rake::SpecTask.new("specs:unit") do |t|
+  t.spec_opts = ["--format", "specdoc", "--colour"]
+  t.spec_files = Dir["spec/unit/**/*_spec.rb"].sort
+  t.rcov = true
+  t.rcov_opts << '--sort' << 'coverage' << '--sort-reverse'
+  t.rcov_opts << '--only-uncovered'
+  t.rcov_opts << '--output coverage/unit'
+end
+desc "Run all regression specs"
+Spec::Rake::SpecTask.new("specs:regression") do |t|
+  t.spec_opts = ["--format", "specdoc", "--colour"]
+  t.spec_files = Dir["spec/regression/**/*_spec.rb"].sort
+  t.rcov = true
+  t.rcov_opts << '--sort' << 'coverage' << '--sort-reverse'
+  t.rcov_opts << '--only-uncovered'
+  t.rcov_opts << '--output coverage/integration'
+end
+task :specs => ['specs:unit', 'specs:regression']
+##############################################################################
+# Documentation
+##############################################################################
+task :doc => "doc:rerdoc"
+namespace :doc do
+  Rake::RDocTask.new do |rdoc|
+    rdoc.rdoc_files.add(RDOC_FILES)
+    rdoc.main = 'README'
+    rdoc.title = TITLE
+    rdoc.rdoc_dir = "rdoc"
+    rdoc.options << '--line-numbers' << '--inline-source'
+  end
+  desc "rdoc to rubyforge"
+  task :rubyforge => :doc do
+    sh %{chmod -R 755 rdoc}
+    sh %{/usr/bin/scp -r -p rdoc/* #{RUBYFORGE_USER}@rubyforge.org:/var/www/gforge-projects/#{PROJECT_NAME}/#{GEM}}
+  end
+end
+##############################################################################
+# release
+##############################################################################
+task :release => [:specs, :package, :doc] do
+  sh %{rubyforge add_release #{PROJECT_NAME} #{GEM} "#{GEM_VERSION}" pkg/#{GEM}-#{GEM_VERSION}.gem}
+  %w[zip tgz].each do |ext|
+    sh %{rubyforge add_file #{PROJECT_NAME} #{GEM} "#{GEM_VERSION}" pkg/#{GEM}-#{GEM_VERSION}.#{ext}}
+  end
+end

data/TODO ADDED

@@ -0,0 +1,4 @@
+== Todo list
+* add a ~/.randexp dir for configuration
+* add [] syntax: /[aeiou]{4}/.gen
+* more generators for Randgen

data/lib/randexp.rb ADDED

@@ -0,0 +1,21 @@
+class Randexp
+  attr_accessor :sexp
+  def initialize(source)
+    @sexp = Randexp::Parser[source]
+  end
+  def reduce
+    Reducer[@sexp.dup]
+  end
+end
+dir = File.dirname(__FILE__) + '/randexp'
+require dir + '/core_ext'
+require dir + '/dictionary'
+require dir + '/parser'
+require dir + '/randgen'
+require dir + '/reducer'
+require dir + '/wordlists/female_names'
+require dir + '/wordlists/male_names'
+require dir + '/wordlists/real_name'

data/lib/randexp/core_ext.rb ADDED

@@ -0,0 +1,6 @@
+dir = File.dirname(__FILE__)
+require dir + '/core_ext/array'
+require dir + '/core_ext/integer'
+require dir + '/core_ext/range'
+require dir + '/core_ext/regexp'

data/lib/randexp/core_ext/array.rb ADDED

@@ -0,0 +1,5 @@
+class Array
+  def pick
+    at Kernel.rand(size)
+  end
+end

data/lib/randexp/core_ext/integer.rb ADDED

@@ -0,0 +1,5 @@
+class Integer
+  def of
+    (1..self).to_a.map { yield }
+  end
+end

data/lib/randexp/core_ext/range.rb ADDED

@@ -0,0 +1,9 @@
+class Range
+  def pick
+    to_a.pick
+  end
+  def of
+    pick.of { yield }
+  end
+end
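
Reviewer note (not part of the diff): the three core extensions above (Array#pick, Integer#of, Range#pick and Range#of) are small enough to try in isolation. The sketch below repeats their definitions as shown in the hunks so that it runs on its own; the `dice` and `word_count` names are only illustrative.

```ruby
# Definitions copied from the core_ext hunks above so the example is standalone.
class Array
  def pick
    at Kernel.rand(size)          # random element by index
  end
end

class Integer
  def of
    (1..self).to_a.map { yield }  # call the block self times, collect results
  end
end

class Range
  def pick
    to_a.pick                     # random member of the range
  end

  def of
    pick.of { yield }             # random count from the range, then Integer#of
  end
end

dice = 3.of { (1..6).pick }             # always three values, each in 1..6
word_count = (2..4).of { "word" }.size  # between two and four elements
puts dice.inspect
puts word_count
```

Range#of is what lets a range multiplicity such as `{3,5}` turn into "pick a count, then generate that many items".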

data/lib/randexp/core_ext/regexp.rb ADDED

@@ -0,0 +1,7 @@
+class Regexp
+  def generate
+    Randexp.new(source).reduce
+  end
+  alias_method :gen, :generate
+end

data/lib/randexp/dictionary.rb ADDED

@@ -0,0 +1,24 @@
+class Randexp::Dictionary
+  def self.load_dictionary
+    if File.exist?("/usr/share/dict/words")
+      File.read("/usr/share/dict/words").split
+    elsif File.exist?("/usr/dict/words")
+      File.read("/usr/dict/words").split
+    else
+      raise "words file not found"
+    end
+  end
+  def self.words(options = {})
+    case
+    when options.has_key?(:length)
+      words_by_length[options[:length]]
+    else
+      @@words ||= load_dictionary
+    end
+  end
+  def self.words_by_length
+    @@words_by_length ||= words.inject({}) {|h, w| (h[w.size] ||= []) << w; h }
+  end
+end
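
Reviewer note (not part of the diff): `words_by_length` builds a hash from word length to the words of that length. A minimal sketch of the same `inject` idiom, using a small in-memory list instead of the system words file (the list itself is an assumption made for illustration):

```ruby
# Stand-in for the contents of /usr/share/dict/words.
words = %w[a an at cat dog bird]

# Same grouping idiom as Dictionary.words_by_length: the hash is the
# accumulator, and each word is appended to the bucket for its size.
words_by_length = words.inject({}) { |h, w| (h[w.size] ||= []) << w; h }

puts words_by_length.inspect
# A length with no entries is simply missing from the hash.
puts words_by_length[99].inspect
```

Because a missing length yields `nil`, calling `pick` on the result raises; `Randgen.word` rescues that and falls back to building a word from random alphanumeric characters.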

data/lib/randexp/parser.rb ADDED

@@ -0,0 +1,99 @@
+class Randexp
+  class Parser
+    def self.parse(source)
+      case
+      when source =~ /^(.*)(\*|\*\?|\+|\+\?|\?)$/ && balanced?($1, $2)
+        parse_quantified($1, $2.to_sym)                                 # ends with *, +, or ?: /(..)?/
+      when source =~ /^(.*)\{(\d+)\,(\d+)\}$/ && balanced?($1, $2)
+        parse_quantified($1, ($2.to_i)..($3.to_i))                      #ends with a range: /(..){..,..}/
+      when source =~ /^(.*)\{(\d+)\}$/ && balanced?($1, $2)
+        parse_quantified($1, $2.to_i)                                   #ends with a fixed count: /..(..){..}/
+      when source =~ /^\((.*)\)\((.*)\)$/ && balanced?($1, $2)
+        union(parse($1), parse($2))                                     #balanced union: /(..)(..)/
+      when source =~ /^(\(.*\))\|(\(.*\))$/ && balanced?($1, $2)
+        intersection(parse($1), parse($2))                              #balanced intersection: /(..)|(..)/
+      when source =~ /^(.*)\|(.*)$/ && balanced?($1, $2)
+        intersection(parse($1), parse($2))                              #implied intersection: /..|../
+      when source =~ /^(.*)\|\((\(.*\))\)$/ && balanced?($1, $2)
+        intersection(parse($1), parse($2))                              #unbalanced intersection: /(..)|((...))/
+      when source =~ /^(.+)(\(.*\))$/ && balanced?($1, $2)
+        union(parse($1), parse($2))                                     #unbalanced union: /...(...)/
+      when source =~ /^\((.*)\)$/ && balanced?($1)
+        union(parse($1))                                                #explicit group: /(..)/
+      when source =~ /^([^()]*)(\(.*\))$/ && balanced?($1, $2)
+        union(parse($1), parse($2))                                     #implied group: /..(..)/
+      when source =~ /^(.*)\[\:(.*)\:\]$/
+        union(parse($1), random($2))                                    #custom random: /[:word:]/
+      when source =~  /(.*)\\(\.)$/ #-----
+        union(parse($1), literal($2))                                   # \.literal
+      when source =~ /^(.*)\\([wsdc])$/ || source =~ /(.*)(\.)$/
+        union(parse($1), random($2))                                    #reserved random: /..\w/ and .
+      when source =~ /^(.*)\\(.)$/ || source =~ /(.*)(.|\s)$/
+        union(parse($1), literal($2))                                   #end with literal or space: /... /
+      else
+        nil
+      end
+    end
+    def self.parse_quantified(source, multiplicity)
+      case source
+      when /^[^()]*$/     then quantify_rhs(parse(source), multiplicity)    #implied union: /...+/
+      when /^(\(.*\))$/   then quantify(parse(source), multiplicity)        #group: /(...)?/
+      when /^(.*\))$/     then quantify_rhs(parse(source), multiplicity)    #implied union: /...(...)?/
+      when /^(.*[^)]+)$/  then quantify_rhs(parse(source), multiplicity)    #implied union: /...(...)...?/
+      else quantify(parse(source), multiplicity)
+      end
+    end
+    class << self
+      alias_method :[], :parse
+    end
+    def self.balanced?(*args)
+      args.all? {|s| s.count('(') == s.count(')')}
+    end
+    def self.quantify_rhs(sexp, multiplicity)
+      case sexp.first
+      when :union
+        rhs = sexp.pop
+        sexp << quantify(rhs, multiplicity)
+      else
+        quantify(sexp, multiplicity)
+      end
+    end
+    def self.quantify(lhs, sym)
+      [:quantify, lhs, sym]
+    end
+    def self.union(lhs, *rhs)
+      if lhs.nil?
+        union(*rhs)
+      elsif rhs.empty?
+        lhs
+      elsif lhs.first == :union
+        rhs.each {|s| lhs << s}
+        lhs
+      else
+        [:union, lhs, *rhs]
+      end
+    end
+    def self.intersection(lhs, rhs)
+      if rhs.first == :intersection
+        [:intersection, lhs] + rhs[1..-1]
+      else
+        [:intersection, lhs, rhs]
+      end
+    end
+    def self.random(char)
+      [:random, char.to_sym]
+    end
+    def self.literal(word)
+      [:literal, word]
+    end
+  end
+end
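
Reviewer note (not part of the diff): the parser emits nested arrays (s-expressions) tagged `:union` for concatenation, `:intersection` for alternation, `:quantify`, `:literal`, and `:random`. The sketch below copies the logic of the constructor helpers into a standalone module so the tree shapes can be inspected without the gem. `SexpSketch` is a hypothetical name, and the tree is built by hand rather than by `Parser.parse`.

```ruby
# Standalone copy of the Parser helper logic, for illustration only.
module SexpSketch
  module_function

  def literal(word)
    [:literal, word]
  end

  def quantify(lhs, multiplicity)
    [:quantify, lhs, multiplicity]
  end

  # Concatenation: drops a nil left side and flattens nested unions.
  def union(lhs, *rhs)
    if lhs.nil?
      union(*rhs)
    elsif rhs.empty?
      lhs
    elsif lhs.first == :union
      rhs.each { |s| lhs << s }
      lhs
    else
      [:union, lhs, *rhs]
    end
  end

  # Alternation: flattens a nested alternation on the right side.
  def intersection(lhs, rhs)
    if rhs.first == :intersection
      [:intersection, lhs] + rhs[1..-1]
    else
      [:intersection, lhs, rhs]
    end
  end
end

# Hand-built tree for something shaped like "ab|c{2}".
ab   = SexpSketch.union(SexpSketch.union(nil, SexpSketch.literal("a")), SexpSketch.literal("b"))
cc   = SexpSketch.quantify(SexpSketch.literal("c"), 2)
tree = SexpSketch.intersection(ab, cc)
puts tree.inspect
```

Note that `union` mutates its left argument when that argument is already a union, which is why `Randexp#reduce` passes a `dup` of the parsed tree to the reducer.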

data/lib/randexp/randgen.rb ADDED

@@ -0,0 +1,84 @@
+require 'enumerator'
+class Randgen
+  WORDS_PER_SENTENCE = 3..20
+  SENTENCES_PER_PARAGRAPH = 3..8
+  def self.bool(options = {})
+    ['true', 'false'].pick
+  end
+  def self.any(options = {})
+    length = options[:length] || 1
+    s = ""
+    length.enum_for(:times).inject(s) do |result, index|
+      s << rand(93) + 33
+    end
+  end
+  def self.lchar(options = {})
+    ('a'..'z').to_a.pick
+  end
+  def self.uchar(options = {})
+    ('A'..'Z').to_a.pick
+  end
+  def self.char(options = {})
+    [lchar, uchar].pick
+  end
+  def self.whitespace(options = {})
+    ["\t", "\n", "\r", "\f"].pick
+  end
+  def self.digit(options = {})
+    ('0'..'9').to_a.pick
+  end
+  def self.alpha_numeric(options = {})
+    [char, digit].pick
+  end
+  def self.word(options = {})
+    begin
+      word = Randexp::Dictionary.words(options).pick
+    rescue
+      word = ''
+      options[:length].times { |iterator| word += alpha_numeric }
+    end until word =~ /^\w+$/
+    word
+  end
+  def self.first_name(options = {})
+    RealName.first_names(options).pick
+  end
+  def self.surname(options = {})
+    RealName.surnames(options).pick
+  end
+  class << self
+    alias_method :last_name, :surname
+  end
+  def self.name(options = {})
+    "#{first_name(options)} #{surname(options)}"
+  end
+  def self.sentence(options = {})
+    ((options[:length] || WORDS_PER_SENTENCE.pick).of { word } * " ").capitalize
+  end
+  def self.paragraph(options = {})
+    ((options[:length] || SENTENCES_PER_PARAGRAPH.pick).of { sentence } * ".  ") + "."
+  end
+  def self.phone_number(options = {})
+    case options[:length]
+    when 7  then  /\d{3}-\d{4}/.gen
+    when 10 then  /\d{3}-\d{3}-\d{4}/.gen
+    else          /(\d{3}-)?\d{3}-\d{4}/.gen
+    end
+  end
+end
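
Reviewer note (not part of the diff): two idioms in Randgen are easy to misread. In `any`, `s << rand(93) + 33` appends the integer `rand(93) + 33` as a character code, because `+` binds tighter than `<<`. In `sentence` and `paragraph`, `array * " "` is `Array#join`. The sketch below restates both with plain Ruby and a tiny word list; every `sketch_*` name and `SKETCH_WORDS` are hypothetical stand-ins, not gem API.

```ruby
SKETCH_WORDS = %w[alpha bravo charlie delta].freeze

# Like Randgen.any: append random printable character codes in 33..125.
def sketch_any(length = 1)
  s = +""                                   # unfrozen empty string
  length.times { s << Kernel.rand(93) + 33 } # parsed as s << (rand(93) + 33)
  s
end

# Like Randgen.sentence: join words with spaces, capitalize the first letter.
def sketch_sentence(length)
  (Array.new(length) { SKETCH_WORDS.sample } * " ").capitalize
end

# Like Randgen.paragraph: join sentences with ".  " and close with a period.
def sketch_paragraph(sentences, words_per_sentence)
  (Array.new(sentences) { sketch_sentence(words_per_sentence) } * ".  ") + "."
end

puts sketch_any(8)
puts sketch_sentence(5)
puts sketch_paragraph(3, 4)
```

The real methods draw words from the Dictionary and pick lengths from `WORDS_PER_SENTENCE` and `SENTENCES_PER_PARAGRAPH` when no `:length` option is given.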