regex 1.1.0 → 1.1.1

data/.ruby ADDED
@@ -0,0 +1,43 @@
1
+ ---
2
+ source:
3
+ - meta
4
+ authors:
5
+ - name: Thomas Sawyer
6
+ email: transfire@gmail.com
7
+ - name: Tyler Rick
8
+ copyrights: []
9
+ replacements: []
10
+ alternatives: []
11
+ requirements:
12
+ - name: detroit
13
+ groups:
14
+ - build
15
+ development: true
16
+ - name: qed
17
+ groups:
18
+ - test
19
+ development: true
20
+ dependencies: []
21
+ conflicts: []
22
+ repositories:
23
+ - uri: git://github.com/proutils/regex.git
24
+ scm: git
25
+ name: upstream
26
+ resources:
27
+ Website: http://rubyworks.github.com/regex
28
+ User Guide: http://wiki.github.com/rubyworks/regex
29
+ Source Code: http://github.com/rubyworks/regex
30
+ Mailing List: http://groups.google.com/group/rubyworks-mailinglist
31
+ extra: {}
32
+ load_path:
33
+ - lib
34
+ revision: 0
35
+ created: '2006-05-09'
36
+ summary: Regex is a simple command-line Regular Expression tool.
37
+ title: Regex
38
+ version: 1.1.1
39
+ name: regex
40
+ description: ! 'Regex is a simple command-line Regular Expression tool
41
+
42
+ that makes it easy to search documents for content matches.'
43
+ date: '2011-10-24'
@@ -0,0 +1,8 @@
1
+ --title "RegEx"
2
+ --readme README.rdoc
3
+ --protected
4
+ --private
5
+ lib/**/*.rb
6
+ -
7
+ [A-Z]*.*
8
+
@@ -0,0 +1,31 @@
1
+ = COPYRIGHT NOTICES
2
+
3
+ == Regex
4
+
5
+ Copyright:: (c) 2010 Thomas Sawyer, Rubyworks
6
+ License:: BSD-2-Clause
7
+ Website:: http://rubyworks.github.com/tapout
8
+
9
+ Copyright 2010 Thomas Sawyer. All rights reserved.
10
+
11
+ Redistribution and use in source and binary forms, with or without
12
+ modification, are permitted provided that the following conditions are met:
13
+
14
+ 1. Redistributions of source code must retain the above copyright notice,
15
+ this list of conditions and the following disclaimer.
16
+
17
+ 2. Redistributions in binary form must reproduce the above copyright
18
+ notice, this list of conditions and the following disclaimer in the
19
+ documentation and/or other materials provided with the distribution.
20
+
21
+ THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES,
22
+ INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
23
+ AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
24
+ COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
25
+ INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
26
+ NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
27
+ DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
28
+ OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
29
+ NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
30
+ EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
31
+
@@ -1,5 +1,17 @@
1
1
  = RELEASE HISTORY
2
2
 
3
+ == 1.1.1 / 2011-10-24
4
+
5
+ Maintenance release updates build configuration. This release
6
+ also adds a man-page and fixes one bug with single search output.
7
+
8
+ Changes:
9
+
10
+ * Modernize build configuration.
11
+ * Fix return value when no single match is found.
12
+ * Add man-page for help.
13
+
14
+
3
15
  == 1.1.0 / 2010-10-12
4
16
 
5
17
  This release adds a detailed output option, and corrects
@@ -18,8 +18,9 @@ well. Well that's what you get.
18
18
 
19
19
  == RESOURCES
20
20
 
21
- * Home: http://rubyworks.github.com/regex
22
- * Code: http://github.com/rubyworks/regex
21
+ * {Home}[http://rubyworks.github.com/regex]
22
+ * {Code}[http://github.com/rubyworks/regex]
23
+ * {Mail}[http://groups.google.com/groups/rubyworks-mailinglist]
23
24
 
24
25
 
25
26
  == USAGE
@@ -81,14 +82,23 @@ Check out the <code>--help</code> and I am sure the rest will be smooth sailing.
81
82
  But it you want more information, then do us the good favor of jumping over
82
83
  to the wiki[http://wiki.github.com/rubyworks/regex].
83
84
 
85
+
86
+ == OUTPUT
87
+
88
+ Regex has three output modes: YAML, JSON and standard text. The standard
89
+ text output is unique in that it utilizes special ASCII characters
90
+ to separate matches and regex groups. ASCII 30, called the *record separator*,
91
+ is used to separate repeat matches. ASCII 29, called the *group separator*,
92
+ is used to separate regular expression groups.
93
+
94
+
84
95
  == STATUS
85
96
 
86
- This is a very early release. So don't expect every feature under the sun just yet,
87
- or that every detail is going to work peachy. But hey, if something needs fixing
88
- or a feature needs adding, well then get in there and send me a patch. Open
89
- source software is built on *TEAM WORK*, baby.
97
+ The project is maturing but still a touch wet behind the ears. So don't be too surprised if
98
+ it doesn't have every feature under the sun just yet, or that every detail is going to work
99
+ absolutely peachy. But hey, if something needs fixing or a feature needs adding, well then get
100
+ in there and send me a patch. Open source software is built on *TEAM WORK*, right?
90
101
 
91
- Expect a potenial for rapid change here at the beginning.
92
102
 
93
103
  == COPYRIGHT
94
104
 
@@ -96,5 +106,5 @@ Copyright (c) 2010 Thomas Sawyer
96
106
 
97
107
  Regex is licensed under the terms of the Apache License, Version 2.0.
98
108
 
99
- See LICENSE file for details.
109
+ See COPYING.rdoc file for details.
100
110
 
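The OUTPUT section added to the README describes separator-delimited text output. The following is a minimal sketch of how a consumer might split such output. It assumes records end with ASCII 30 followed by a newline (as the `DELIMINATOR_RECORD` constant in the extractor suggests) and that groups inside a record are divided by ASCII 29, following the standard ASCII names. The group delimiter, the helper name `parse_separated`, and the sample string are assumptions, not part of the package.

```ruby
# Hypothetical consumer of the separator-delimited text output.
RECORD_SEP = 30.chr  # ASCII RS, the first byte of DELIMINATOR_RECORD
GROUP_SEP  = 29.chr  # ASCII GS (assumed group delimiter)

# Split raw output into records, then split each record into its groups.
# Whitespace-only fragments (such as the trailing newline) are dropped.
def parse_separated(text)
  text.split(RECORD_SEP)
      .map { |record| record.split(GROUP_SEP).map(&:strip) }
      .reject { |groups| groups.all?(&:empty?) }
end

# Made-up sample: two records with two groups each.
sample = "foo#{GROUP_SEP}bar#{RECORD_SEP}\nbaz#{GROUP_SEP}qux#{RECORD_SEP}\n"
p parse_separated(sample)
```

Control characters are used here because they are unlikely to occur in matched text, so no escaping of the matches themselves should be needed.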
@@ -1,19 +1,20 @@
1
1
  module Regex
2
- DIRECTORY = File.dirname(__FILE__)
3
-
4
2
  # Access to PACKAGE metadata.
5
- def self.package
6
- @package ||= (
3
+ def self.metadata
4
+ @metadata ||= (
7
5
  require 'yaml'
8
- YAML.load(File.new(DIRECTORY + '/regex/package.yml'))
6
+ YAML.load(File.new(File.dirname(__FILE__) + '/regex.yml'))
9
7
  )
10
8
  end
11
9
 
12
10
  # Need VERSION? You got it.
13
11
  def self.const_missing(name)
14
- package[name.to_s.downcase] || super(name)
12
+ metadata[name.to_s.downcase] || super(name)
15
13
  end
16
14
 
15
+ # TODO: This is only here to support broken Ruby 1.8.x.
16
+ VERSION = metadata['version']
17
+
17
18
  # Shortcut to create a new Regex::Extractor instance.
18
19
  def self.new(*io)
19
20
  Extractor.new(*io)
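The hunk above resolves constants such as `VERSION` from YAML metadata through `const_missing`. A minimal sketch of that pattern follows, assuming a stand-in module named `Demo` and metadata inlined as a string rather than read from `regex.yml`; both are hypothetical and only illustrate the lookup.

```ruby
require 'yaml'

# Hypothetical stand-in for the Regex module. The metadata is inlined
# here instead of being loaded from a regex.yml file.
module Demo
  METADATA_YAML = "---\nname: regex\nversion: 1.1.1\n"

  # Parse the metadata once and cache the resulting hash.
  def self.metadata
    @metadata ||= YAML.load(METADATA_YAML)
  end

  # Resolve unknown constants (Demo::NAME, Demo::VERSION) from the
  # metadata hash; anything else falls through to the usual NameError.
  def self.const_missing(name)
    metadata[name.to_s.downcase] || super(name)
  end
end

puts Demo::VERSION
puts Demo::NAME
```

Because the lookup happens only when a constant is missing, defining `VERSION` explicitly, as the diff does for Ruby 1.8.x, simply bypasses `const_missing` for that one name.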
@@ -0,0 +1,43 @@
1
+ ---
2
+ source:
3
+ - meta
4
+ authors:
5
+ - name: Thomas Sawyer
6
+ email: transfire@gmail.com
7
+ - name: Tyler Rick
8
+ copyrights: []
9
+ replacements: []
10
+ alternatives: []
11
+ requirements:
12
+ - name: detroit
13
+ groups:
14
+ - build
15
+ development: true
16
+ - name: qed
17
+ groups:
18
+ - test
19
+ development: true
20
+ dependencies: []
21
+ conflicts: []
22
+ repositories:
23
+ - uri: git://github.com/proutils/regex.git
24
+ scm: git
25
+ name: upstream
26
+ resources:
27
+ Website: http://rubyworks.github.com/regex
28
+ User Guide: http://wiki.github.com/rubyworks/regex
29
+ Source Code: http://github.com/rubyworks/regex
30
+ Mailing List: http://groups.google.com/group/rubyworks-mailinglist
31
+ extra: {}
32
+ load_path:
33
+ - lib
34
+ revision: 0
35
+ created: '2006-05-09'
36
+ summary: Regex is a simple command-line Regular Expression tool.
37
+ title: Regex
38
+ version: 1.1.1
39
+ name: regex
40
+ description: ! 'Regex is a simple command-line Regular Expression tool
41
+
42
+ that makes it easy to search documents for content matches.'
43
+ date: '2011-10-24'
@@ -16,6 +16,9 @@ module Regex
16
16
  # the record deliminator. This is the default value.
17
17
  DELIMINATOR_RECORD = 30.chr + "\n"
18
18
 
19
+ # TODO: Separate by file ?
20
+ # DELIMINATOR_FILE = 28.chr +" \n"
21
+
19
22
  #
20
23
  def self.input_cache(input)
21
24
  @input_cache ||= {}
@@ -41,6 +44,9 @@ module Regex
41
44
  # Select built-in regular expression by name.
42
45
  attr_accessor :template
43
46
 
47
+ # Is this a recursive search?
48
+ attr_accessor :recursive
49
+
44
50
  # Index of expression return.
45
51
  attr_accessor :index
46
52
 
@@ -53,7 +59,7 @@ module Regex
53
59
  # Escape expression.
54
60
  attr_accessor :escape
55
61
 
56
- # Repeat Match.
62
+ # Repeat Match (global).
57
63
  attr_accessor :repeat
58
64
 
59
65
  # Output format.
@@ -263,7 +269,7 @@ module Regex
263
269
 
264
270
  # Structure the matchdata for single match.
265
271
  def structure_single
266
- structure_repeat.first
272
+ structure_repeat.first || []
267
273
  end
268
274
 
269
275
  # Structure the matchdata for repeat matches.
@@ -281,9 +287,14 @@ module Regex
281
287
  def scan
282
288
  list = []
283
289
  io.each do |input|
284
- text = read(input)
285
- text.scan(regex) do
286
- list << Match.new(input, $~)
290
+ # TODO: limit to text files, how?
291
+ begin
292
+ text = read(input)
293
+ text.scan(regex) do
294
+ list << Match.new(input, $~)
295
+ end
296
+ rescue => err
297
+ warn(input.inspect + ' ' + err.to_s) if $VERBOSE
287
298
  end
288
299
  end
289
300
  list
@@ -333,6 +344,12 @@ module Regex
333
344
  opt.on('--search', '-s PATTERN', "search for regular expression") do |re|
334
345
  options[:pattern] = re
335
346
  end
347
+ opt.on('--recursive', '-R', 'search recursively through subdirectories') do
348
+ options[:recursive] = true
349
+ end
350
+ opt.on('--escape', '-e', 'make all patterns verbatim string matchers') do
351
+ options[:escape] = true
352
+ end
336
353
  opt.on('--index', '-n INT', "return a specific match index") do |int|
337
354
  options[:index] = int.to_i
338
355
  end
@@ -387,11 +404,17 @@ module Regex
387
404
  end
388
405
  end
389
406
 
390
- files = argv
391
-
392
- files.each do |file|
393
- if !File.file?(file)
394
- $stderr.puts "No such file -- '#{file}'."
407
+ files = []
408
+ argv.each do |file|
409
+ if File.directory?(file)
410
+ if options[:recursive]
411
+ rec_files = Dir[File.join(file, '**')].reject{ |d| File.directory?(d) }
412
+ files.concat(rec_files)
413
+ end
414
+ elsif File.file?(file)
415
+ files << file
416
+ else
417
+ $stderr.puts "Not a file -- '#{file}'."
395
418
  exit 1
396
419
  end
397
420
  end
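The file collection above globs `File.join(file, '**')` for the `--recursive` option. In Ruby's glob syntax, `**` descends into subdirectories only when it is followed by a slash, so that pattern appears to behave like a single-level `*`. The sketch below compares it with `'**', '*'` in a temporary directory; the helper `files_under` and the file names are hypothetical.

```ruby
require 'tmpdir'
require 'fileutils'

# Collect regular files under dir; pattern_parts are joined onto dir
# to form the glob pattern.
def files_under(dir, *pattern_parts)
  Dir[File.join(dir, *pattern_parts)].reject { |path| File.directory?(path) }.sort
end

Dir.mktmpdir do |dir|
  # Layout: dir/a.txt and dir/sub/b.txt
  FileUtils.mkdir_p(File.join(dir, 'sub'))
  File.write(File.join(dir, 'a.txt'), 'top')
  File.write(File.join(dir, 'sub', 'b.txt'), 'nested')

  shallow = files_under(dir, '**')       # pattern as written in the diff
  deep    = files_under(dir, '**', '*')  # recursive form

  puts shallow.map { |f| f.sub(dir + '/', '') }.inspect
  puts deep.map { |f| f.sub(dir + '/', '') }.inspect
end
```

If nested files are meant to be searched, the second form is the one that reaches them.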
@@ -1,4 +1,5 @@
1
1
  require 'stringio'
2
+ require 'optparse'
2
3
 
3
4
  module Regex
4
5
 
@@ -8,6 +9,9 @@ module Regex
8
9
  # Array of [search, replace] rules.
9
10
  attr_reader :rules
10
11
 
12
+ # Is this a recursive search?
13
+ attr_accessor :recursive
14
+
11
15
  # Make all patterns exact string matchers.
12
16
  attr_accessor :escape
13
17
 
@@ -23,6 +27,9 @@ module Regex
23
27
  # Make backups of files when they change.
24
28
  attr_accessor :backup
25
29
 
30
+ # Interactive replacement.
31
+ attr_accessor :interactive
32
+
26
33
  #
27
34
  def initialize(options={})
28
35
  @rules = []
@@ -40,12 +47,16 @@ module Regex
40
47
  def apply(*ios)
41
48
  ios.each do |io|
42
49
  original = (IO === io || StringIO === io ? io.read : io.to_s)
43
- generate = original
50
+ generate = original.to_s
44
51
  rules.each do |(pattern, replacement)|
45
- if pattern.global
46
- generate = generate.gsub(pattern.to_re, replacement)
47
- else
48
- generate = generate.sub(pattern.to_re, replacement)
52
+ begin
53
+ if pattern.global
54
+ generate = generate.gsub(pattern.to_re, replacement)
55
+ else
56
+ generate = generate.sub(pattern.to_re, replacement)
57
+ end
58
+ rescue => err
59
+ warn(io.inspect + ' ' + err.to_s) if $VERBOSE
49
60
  end
50
61
  end
51
62
  if original != generate
@@ -54,6 +65,20 @@ module Regex
54
65
  end
55
66
  end
56
67
 
68
+ #
69
+ # TODO: interactive mode needs to handle \1 style substitutions.
70
+ def interactive_gsub(string, pattern, replacement)
71
+ copy = string.dup
72
+ string.scan(pattern) do |match|
73
+ print "#{match} ? (Y/n)"
74
+ case ask
75
+ when 'y', 'Y', ''
76
+ copy[$~.begin(0)..$~.end(0)] = replacement
77
+ else
78
+ end
79
+ end
80
+ end
81
+
57
82
  private
58
83
 
59
84
  # Parse pattern matcher.
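The `interactive_gsub` method added in this hunk looks unfinished: it calls an `ask` helper that is not shown, replaces the inclusive range `begin(0)..end(0)` (one character past the match), edits `copy` using offsets taken from the original string, and does not return `copy`. A minimal sketch of one way to get the intended behavior is given below. It is not the author's implementation; the confirmation step is passed in as a block (an assumption) so the example runs without a terminal, and like the original it does not handle `\1` style substitutions.

```ruby
# Hypothetical variant of interactive_gsub. The caller's block decides,
# per match, whether the replacement is applied.
def interactive_gsub(string, pattern, replacement)
  # gsub with a block visits matches left to right and builds a new
  # string, so earlier replacements cannot shift later offsets.
  string.gsub(pattern) do |match|
    yield(match) ? replacement : match
  end
end

# Accept every digit except "2".
result = interactive_gsub('a1b2c3', /\d/, 'X') { |m| m != '2' }
puts result
```

In a command-line setting the block could print the match and read a y/n answer from standard input.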
@@ -92,7 +117,7 @@ module Regex
92
117
  replaces = []
93
118
  options = {}
94
119
  parser = OptionParser.new do |opt|
95
- opt.on('--subtitute', '-s PATTERN', 'search portion of substitution') do |search|
120
+ opt.on('--search', '-s PATTERN', 'search portion of substitution') do |search|
96
121
  searches << search
97
122
  end
98
123
  opt.on('--template', '-t NAME', 'search for built-in regular expression') do |name|
@@ -101,7 +126,10 @@ module Regex
101
126
  opt.on('--replace', '-r STRING', 'replacement string of substitution') do |replace|
102
127
  replaces << replace
103
128
  end
104
- opt.on('--escape', '-e', 'make all patterns exact string matchers') do
129
+ opt.on('--recursive', '-R', 'search recursively through subdirectories') do
130
+ options[:recursive] = true
131
+ end
132
+ opt.on('--escape', '-e', 'make all patterns verbatim string matchers') do
105
133
  options[:escape] = true
106
134
  end
107
135
  opt.on('--insensitive', '-i', 'make all patterns case-insensitive matchers') do
@@ -119,7 +147,10 @@ module Regex
119
147
  opt.on('-b', '--backup', 'backup any files that are changed') do
120
148
  options[:backup] = true
121
149
  end
122
- opt.on_tail('--debug', 'run in debug mode') do
150
+ opt.on('-i', '--interactive', 'interactive mode') do
151
+ options[:interactive] = true
152
+ end
153
+ opt.on_tail('--debug', 'run in debug mode') do
123
154
  $DEBUG = true
124
155
  end
125
156
  opt.on_tail('--help', '-h', 'display this lovely help message') do
@@ -129,10 +160,19 @@ module Regex
129
160
  end
130
161
  parser.parse!(argv)
131
162
 
132
- files = argv
133
- files.each do |file|
163
+ files = []
164
+
165
+ argv.each{ |file|
134
166
  raise "file does not exist -- #{file}" unless File.exist?(file)
135
- end
167
+ if File.directory?(file)
168
+ if options[:recursive]
169
+ files.concat Dir[File.join(file, '**')].reject{ |d| File.directory?(d) }
170
+ end
171
+ else
172
+ files << file
173
+ end
174
+ }
175
+
136
176
  targets = files.empty? ? [ARGF] : files.map{ |f| File.new(f) }
137
177
 
138
178
  unless searches.size == replaces.size
@@ -3,8 +3,9 @@ module Regex
3
3
  # = Templates
4
4
  #
5
5
  # TODO: What about regular expressions with variable content?
6
- # Should these be methods rather than constants? But then how
7
- # would we handle named substituions?
6
+ # But then how would we handle named substitutions?
7
+ #
8
+ # TODO: Should these be methods rather than constants?
8
9
  module Templates
9
10
 
10
11
  # Empty line.
@@ -13,6 +14,7 @@ module Regex
13
14
  # Blank line.
14
15
  BLANK = /^\s*$/
15
16
 
17
+ #
16
18
  NUMBER = /[-+]?[0-9]*\.?[0-9]+/
17
19
 
18
20
  # Markup language tag, e.g \<a>stuff</a>.
@@ -21,8 +23,8 @@ module Regex
21
23
  # IPv4 Address
22
24
  IPV4 = /\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b/
23
25
 
24
- # Username
25
- USERNAME = /^[a-zA-Z0-9_]{3,16}$/
26
+ # DNI (Spanish ID card)
27
+ DNI = /^\d{8}[A-Za-z]{1}$/
26
28
 
27
29
  # Email Address
28
30
  EMAIL = /([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)/i
@@ -51,6 +53,34 @@ module Regex
51
53
  # HTTP URL Address
52
54
  HTTP = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \?=.-]*)*\/?$/
53
55
 
56
+ # Validates Credit Card numbers, contains 16 numbers in groups of 4 separated
57
+ # by `-`, space or nothing.
58
+ CREDITCARD = /^(\d{4}-){3}\d{4}$|^(\d{4}\s){3}\d{4}$|^\d{16}$/
59
+
60
+ # MasterCard credit card
61
+ MASTERCARD = /^5[1-5]\d{14}$/
62
+
63
+ # Visa credit card.
64
+ VISA = /^4\d{15}$/
65
+
66
+ # TODO: Better name?
67
+ UNIXWORD = /^[a-zA-Z0-9_]*$/
68
+
69
+ # Username, at least 3 characters and no more than 16.
70
+ USERNAME = /^[a-zA-Z0-9_]{3,16}$/
71
+
72
+ # Twitter username
73
+ TWITTER_USERNMAE = /^([a-z0-9\_])+$/ix
74
+
75
+ # Github username
76
+ GITHUB_USERNAME = /^([a-z0-9\_\-])+$/ix
77
+
78
+ # Slideshare username
79
+ SLIDESHARE_USERNAME = /^([a-z0-9])+$/ix
80
+
81
+ # Del.icio.us username
82
+ DELICIOUS_USERNMAME = /^([a-z0-9\_\-])+$/ix
83
+
54
84
  # Ruby comment block.
55
85
  RUBYBLOCK = /^=begin\s*(.*?)\n(.*?)\n=end/m
56
86
 
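The card templates added above check digit layout only. A short sketch of how they classify a few inputs follows; the patterns are copied from the diff, while the sample strings are made up for illustration.

```ruby
# Patterns copied from the diff; the sample inputs are invented.
CREDITCARD = /^(\d{4}-){3}\d{4}$|^(\d{4}\s){3}\d{4}$|^\d{16}$/
MASTERCARD = /^5[1-5]\d{14}$/
VISA       = /^4\d{15}$/

samples = ['4111111111111111', '5500-0000-0000-0004', '1234 5678']

samples.each do |s|
  kinds = []
  kinds << 'creditcard' if s =~ CREDITCARD
  kinds << 'visa'       if s =~ VISA
  kinds << 'mastercard' if s =~ MASTERCARD
  puts "#{s}: #{kinds.empty? ? 'no match' : kinds.join(', ')}"
end
```

Note that `MASTERCARD` and `VISA` accept only undelimited digits, so a dash-separated number matches `CREDITCARD` but neither brand pattern. None of the patterns performs a checksum, so a match does not mean the number is valid.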
@@ -58,7 +88,7 @@ module Regex
58
88
  # TODO: Not quite right.
59
89
  RUBYMETHOD_WITH_COMMENT = /(^\ *\#.*?)^\s*def\s*(.*?)$/m
60
90
 
61
- #
91
+ # Ruby method definition.
62
92
  RUBYMETHOD = /^\ *def\s*(.*?)$/
63
93
 
64
94
  # By the legendary abigail. Fails to match if and only if it is matched against