regex 1.1.0 → 1.1.1

data/.ruby ADDED
@@ -0,0 +1,43 @@
1
+ ---
2
+ source:
3
+ - meta
4
+ authors:
5
+ - name: Thomas Sawyer
6
+ email: transfire@gmail.com
7
+ - name: Tyler Rick
8
+ copyrights: []
9
+ replacements: []
10
+ alternatives: []
11
+ requirements:
12
+ - name: detroit
13
+ groups:
14
+ - build
15
+ development: true
16
+ - name: qed
17
+ groups:
18
+ - test
19
+ development: true
20
+ dependencies: []
21
+ conflicts: []
22
+ repositories:
23
+ - uri: git://github.com/proutils/regex.git
24
+ scm: git
25
+ name: upstream
26
+ resources:
27
+ Website: http://rubyworks.github.com/regex
28
+ User Guide: http://wiki.github.com/rubyworks/regex
29
+ Source Code: http://github.com/rubyworks/regex
30
+ Mailing List: http://groups.google.com/group/rubyworks-mailinglist
31
+ extra: {}
32
+ load_path:
33
+ - lib
34
+ revision: 0
35
+ created: '2006-05-09'
36
+ summary: Regex is a simple command-line Regular Expression tool.
37
+ title: Regex
38
+ version: 1.1.1
39
+ name: regex
40
+ description: ! 'Regex is a simple command-line Regular Expression tool
41
+
42
+ that makes it easy to search documents for content matches.'
43
+ date: '2011-10-24'
@@ -0,0 +1,8 @@
1
+ --title "RegEx"
2
+ --readme README.rdoc
3
+ --protected
4
+ --private
5
+ lib/**/*.rb
6
+ -
7
+ [A-Z]*.*
8
+
@@ -0,0 +1,31 @@
1
+ = COPYRIGHT NOTICES
2
+
3
+ == Regex
4
+
5
+ Copyright:: (c) 2010 Thomas Sawyer, Rubyworks
6
+ License:: BSD-2-Clause
7
+ Website:: http://rubyworks.github.com/regex
8
+
9
+ Copyright 2010 Thomas Sawyer. All rights reserved.
10
+
11
+ Redistribution and use in source and binary forms, with or without
12
+ modification, are permitted provided that the following conditions are met:
13
+
14
+ 1. Redistributions of source code must retain the above copyright notice,
15
+ this list of conditions and the following disclaimer.
16
+
17
+ 2. Redistributions in binary form must reproduce the above copyright
18
+ notice, this list of conditions and the following disclaimer in the
19
+ documentation and/or other materials provided with the distribution.
20
+
21
+ THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES,
22
+ INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
23
+ AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
24
+ COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
25
+ INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
26
+ NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
27
+ DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
28
+ OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
29
+ NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
30
+ EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
31
+
@@ -1,5 +1,17 @@
1
1
  = RELEASE HISTORY
2
2
 
3
+ == 1.1.1 / 2011-10-24
4
+
5
+ Maintenance release updates build configuration. This release
6
+ also adds a man-page and fixes one bug with single search output.
7
+
8
+ Changes:
9
+
10
+ * Modernize build configuration.
11
+ * Fix return value when no single match is found.
12
+ * Add man-page for help.
13
+
14
+
3
15
  == 1.1.0 / 2010-10-12
4
16
 
5
17
  This release adds a detailed output option, and corrects
@@ -18,8 +18,9 @@ well. Well that's what you get.
18
18
 
19
19
  == RESOURCES
20
20
 
21
- * Home: http://rubyworks.github.com/regex
22
- * Code: http://github.com/rubyworks/regex
21
+ * {Home}[http://rubyworks.github.com/regex]
22
+ * {Code}[http://github.com/rubyworks/regex]
23
+ * {Mail}[http://groups.google.com/group/rubyworks-mailinglist]
23
24
 
24
25
 
25
26
  == USAGE
@@ -81,14 +82,23 @@ Check out the <code>--help</code> and I am sure the rest will be smooth sailing.
81
82
  But if you want more information, then do us the good favor of jumping over
82
83
  to the wiki[http://wiki.github.com/rubyworks/regex].
83
84
 
85
+
86
+ == OUTPUT
87
+
88
+ Regex has three output modes: YAML, JSON and standard text. The standard
89
+ text output is unique in that it utilizes special ASCII characters
90
+ to separate matches and regex groups. ASCII 30, called the *record separator*,
91
+ is used to separate repeat matches. ASCII 29, called the *group separator*,
92
+ is used to separate regular expression groups.
93
+
94
+
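As a rough illustration of consuming the standard text output, the sketch below splits a string on the two separator characters. It assumes each record ends with ASCII 30 plus a newline (mirroring `DELIMINATOR_RECORD = 30.chr + "\n"` in the extractor source) and that groups inside a record are joined by a bare ASCII 29. The `parse_text_output` helper and the sample data are hypothetical, and the real output may differ in detail.

```ruby
# Only the record delimiter value is taken from the extractor source;
# the group layout is an assumption made for this sketch.
GROUP_SEPARATOR  = 29.chr         # ASCII GS
RECORD_SEPARATOR = 30.chr + "\n"  # ASCII RS followed by a newline

# Split text output into an array of records, each an array of groups.
def parse_text_output(text)
  text.split(RECORD_SEPARATOR).map { |record| record.split(GROUP_SEPARATOR) }
end

# Two made-up matches with three groups each.
sample = ['2011', '10', '24'].join(GROUP_SEPARATOR) +
         RECORD_SEPARATOR +
         ['2010', '10', '12'].join(GROUP_SEPARATOR)

p parse_text_output(sample)
```

Using control characters as separators keeps the output easy to split even when the matched text contains commas, tabs, or newlines.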
84
95
  == STATUS
85
96
 
86
- This is a very early release. So don't expect every feature under the sun just yet,
87
- or that every detail is going to work peachy. But hey, if something needs fixing
88
- or a feature needs adding, well then get in there and send me a patch. Open
89
- source software is built on *TEAM WORK*, baby.
97
+ The project is maturing but still a touch wet behind the ears. So don't be too surprised if
98
+ it doesn't have every feature under the sun just yet, or if not every detail works
99
+ absolutely peachy. But hey, if something needs fixing or a feature needs adding, well then get
100
+ in there and send me a patch. Open source software is built on *TEAM WORK*, right?
90
101
 
91
- Expect a potenial for rapid change here at the beginning.
92
102
 
93
103
  == COPYRIGHT
94
104
 
@@ -96,5 +106,5 @@ Copyright (c) 2010 Thomas Sawyer
96
106
 
97
107
  Regex is licensed under the terms of the Apache License, Version 2.0.
98
108
 
99
- See LICENSE file for details.
109
+ See COPYING.rdoc file for details.
100
110
 
@@ -1,19 +1,20 @@
1
1
  module Regex
2
- DIRECTORY = File.dirname(__FILE__)
3
-
4
2
  # Access to PACKAGE metadata.
5
- def self.package
6
- @package ||= (
3
+ def self.metadata
4
+ @metadata ||= (
7
5
  require 'yaml'
8
- YAML.load(File.new(DIRECTORY + '/regex/package.yml'))
6
+ YAML.load(File.new(File.dirname(__FILE__) + '/regex.yml'))
9
7
  )
10
8
  end
11
9
 
12
10
  # Need VERSION? You got it.
13
11
  def self.const_missing(name)
14
- package[name.to_s.downcase] || super(name)
12
+ metadata[name.to_s.downcase] || super(name)
15
13
  end
16
14
 
15
+ # TODO: This is only here to support broken Ruby 1.8.x.
16
+ VERSION = metadata['version']
17
+
17
18
  # Shortcut to create a new Regex::Extractor instance.
18
19
  def self.new(*io)
19
20
  Extractor.new(*io)
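The metadata pattern in this hunk (lazy-load a YAML file once, then resolve unknown constants from it) can be sketched in isolation. The module name `RegexDemo` and the inline YAML string are hypothetical stand-ins for the real module and its `regex.yml` file; this is a minimal sketch, not the library's code.

```ruby
require 'yaml'

module RegexDemo
  # Inline stand-in for the YAML file that the real module reads from disk.
  METADATA_YAML = <<~YAML
    name: regex
    version: 1.1.1
  YAML

  # Parse the metadata once and memoize the resulting hash.
  def self.metadata
    @metadata ||= YAML.load(METADATA_YAML)
  end

  # Resolve unknown constants (e.g. NAME) from the metadata hash,
  # falling back to the default NameError for anything not listed.
  def self.const_missing(name)
    metadata[name.to_s.downcase] || super(name)
  end
end

puts RegexDemo::VERSION
puts RegexDemo::NAME
```

Defining `VERSION` explicitly, as the diff does, means that constant no longer goes through `const_missing` at all.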
@@ -0,0 +1,43 @@
1
+ ---
2
+ source:
3
+ - meta
4
+ authors:
5
+ - name: Thomas Sawyer
6
+ email: transfire@gmail.com
7
+ - name: Tyler Rick
8
+ copyrights: []
9
+ replacements: []
10
+ alternatives: []
11
+ requirements:
12
+ - name: detroit
13
+ groups:
14
+ - build
15
+ development: true
16
+ - name: qed
17
+ groups:
18
+ - test
19
+ development: true
20
+ dependencies: []
21
+ conflicts: []
22
+ repositories:
23
+ - uri: git://github.com/proutils/regex.git
24
+ scm: git
25
+ name: upstream
26
+ resources:
27
+ Website: http://rubyworks.github.com/regex
28
+ User Guide: http://wiki.github.com/rubyworks/regex
29
+ Source Code: http://github.com/rubyworks/regex
30
+ Mailing List: http://groups.google.com/group/rubyworks-mailinglist
31
+ extra: {}
32
+ load_path:
33
+ - lib
34
+ revision: 0
35
+ created: '2006-05-09'
36
+ summary: Regex is a simple command-line Regular Expression tool.
37
+ title: Regex
38
+ version: 1.1.1
39
+ name: regex
40
+ description: ! 'Regex is a simple command-line Regular Expression tool
41
+
42
+ that makes it easy to search documents for content matches.'
43
+ date: '2011-10-24'
@@ -16,6 +16,9 @@ module Regex
16
16
  # the record deliminator. This is the default value.
17
17
  DELIMINATOR_RECORD = 30.chr + "\n"
18
18
 
19
+ # TODO: Separate by file ?
20
+ # DELIMINATOR_FILE = 28.chr +" \n"
21
+
19
22
  #
20
23
  def self.input_cache(input)
21
24
  @input_cache ||= {}
@@ -41,6 +44,9 @@ module Regex
41
44
  # Select built-in regular expression by name.
42
45
  attr_accessor :template
43
46
 
47
+ # Is this a recursive search?
48
+ attr_accessor :recursive
49
+
44
50
  # Index of expression return.
45
51
  attr_accessor :index
46
52
 
@@ -53,7 +59,7 @@ module Regex
53
59
  # Escape expression.
54
60
  attr_accessor :escape
55
61
 
56
- # Repeat Match.
62
+ # Repeat Match (global).
57
63
  attr_accessor :repeat
58
64
 
59
65
  # Output format.
@@ -263,7 +269,7 @@ module Regex
263
269
 
264
270
  # Structure the matchdata for single match.
265
271
  def structure_single
266
- structure_repeat.first
272
+ structure_repeat.first || []
267
273
  end
268
274
 
269
275
  # Structure the matchdata for repeat matches.
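The `|| []` guard matters because `Array#first` returns `nil` on an empty array. A minimal sketch of the difference follows; the `structure_repeat` body here is a hypothetical stand-in, not the extractor's real implementation.

```ruby
# Hypothetical stand-in: one structured entry per match.
def structure_repeat(matches)
  matches.map { |m| [m] }
end

# With the guard, callers always receive an array, even with no matches.
def structure_single(matches)
  structure_repeat(matches).first || []
end

p structure_single(%w[a b])
p structure_single([])
```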
@@ -281,9 +287,14 @@ module Regex
281
287
  def scan
282
288
  list = []
283
289
  io.each do |input|
284
- text = read(input)
285
- text.scan(regex) do
286
- list << Match.new(input, $~)
290
+ # TODO: limit to text files, how?
291
+ begin
292
+ text = read(input)
293
+ text.scan(regex) do
294
+ list << Match.new(input, $~)
295
+ end
296
+ rescue => err
297
+ warn(input.inspect + ' ' + err.to_s) if $VERBOSE
287
298
  end
288
299
  end
289
300
  list
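The per-input `rescue` added to `scan` lets one unreadable input be skipped without aborting the whole search. A self-contained sketch of that idea is below; `scan_all` and the reader lambdas are hypothetical, standing in for the extractor's `read(input)` and `Match` objects.

```ruby
# Scan several named inputs, skipping any that raise while being read
# or scanned. Warnings are printed only when Ruby runs in verbose mode.
def scan_all(inputs, regex)
  list = []
  inputs.each do |name, reader|
    begin
      text = reader.call
      text.scan(regex) { list << [name, $~[0]] }
    rescue => err
      warn("#{name.inspect} #{err}") if $VERBOSE
    end
  end
  list
end

inputs = {
  'notes.txt' => -> { 'foo bar foo' },
  'broken'    => -> { raise IOError, 'unreadable' }
}

p scan_all(inputs, /foo/)
```

Because the failure is only reported under `$VERBOSE`, a silent skip is the default, which is worth keeping in mind when a search returns fewer matches than expected.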
@@ -333,6 +344,12 @@ module Regex
333
344
  opt.on('--search', '-s PATTERN', "search for regular expression") do |re|
334
345
  options[:pattern] = re
335
346
  end
347
+ opt.on('--recursive', '-R', 'search recursively through subdirectories') do
348
+ options[:recursive] = true
349
+ end
350
+ opt.on('--escape', '-e', 'make all patterns verbatim string matchers') do
351
+ options[:escape] = true
352
+ end
336
353
  opt.on('--index', '-n INT', "return a specific match index") do |int|
337
354
  options[:index] = int.to_i
338
355
  end
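For reference, the option wiring in this hunk can be exercised on its own with the standard `optparse` library. The `parse_options` wrapper is hypothetical; only the three flags shown in the diff are reproduced.

```ruby
require 'optparse'

# Parse a small subset of the command-line flags and return the
# collected options together with the remaining (file) arguments.
def parse_options(argv)
  options = {}
  parser = OptionParser.new do |opt|
    opt.on('--search', '-s PATTERN', 'search for regular expression') do |re|
      options[:pattern] = re
    end
    opt.on('--recursive', '-R', 'search recursively through subdirectories') do
      options[:recursive] = true
    end
    opt.on('--escape', '-e', 'make all patterns verbatim string matchers') do
      options[:escape] = true
    end
  end
  rest = parser.parse(argv) # non-destructive; returns leftover arguments
  [options, rest]
end

options, files = parse_options(['-R', '-s', 'foo', 'lib'])
p options
p files
```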
@@ -387,11 +404,17 @@ module Regex
387
404
  end
388
405
  end
389
406
 
390
- files = argv
391
-
392
- files.each do |file|
393
- if !File.file?(file)
394
- $stderr.puts "No such file -- '#{file}'."
407
+ files = []
408
+ argv.each do |file|
409
+ if File.directory?(file)
410
+ if options[:recursive]
411
+ rec_files = Dir[File.join(file, '**', '*')].reject{ |d| File.directory?(d) }
412
+ files.concat(rec_files)
413
+ end
414
+ elsif File.file?(file)
415
+ files << file
416
+ else
417
+ $stderr.puts "Not a file -- '#{file}'."
395
418
  exit 1
396
419
  end
397
420
  end
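A note on the glob used for recursion: in Ruby, `**` only descends into subdirectories when written as `**/`; a trailing `**` on its own behaves like `*` and matches a single level. The sketch below shows the file-collection step with a `**/*` pattern. The `collect_files` helper is hypothetical, it raises instead of calling `exit 1`, and like any default glob it skips dotfiles.

```ruby
require 'tmpdir'
require 'fileutils'

# Expand command-line paths into a flat list of files. Directories are
# walked only when `recursive` is true; otherwise they are ignored.
def collect_files(paths, recursive)
  files = []
  paths.each do |path|
    if File.directory?(path)
      next unless recursive
      found = Dir[File.join(path, '**', '*')].reject { |f| File.directory?(f) }
      files.concat(found)
    elsif File.file?(path)
      files << path
    else
      raise ArgumentError, "Not a file -- '#{path}'."
    end
  end
  files
end

# Build a small throwaway tree and list what the helper finds.
Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, 'a', 'b'))
  File.write(File.join(dir, 'top.txt'), 'x')
  File.write(File.join(dir, 'a', 'b', 'deep.txt'), 'y')
  puts collect_files([dir], true).map { |f| File.basename(f) }.sort
end
```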
@@ -1,4 +1,5 @@
1
1
  require 'stringio'
2
+ require 'optparse'
2
3
 
3
4
  module Regex
4
5
 
@@ -8,6 +9,9 @@ module Regex
8
9
  # Array of [search, replace] rules.
9
10
  attr_reader :rules
10
11
 
12
+ # Is this a recursive search?
13
+ attr_accessor :recursive
14
+
11
15
  # Make all patterns exact string matchers.
12
16
  attr_accessor :escape
13
17
 
@@ -23,6 +27,9 @@ module Regex
23
27
  # Make backups of files when they change.
24
28
  attr_accessor :backup
25
29
 
30
+ # Interactive replacement.
31
+ attr_accessor :interactive
32
+
26
33
  #
27
34
  def initialize(options={})
28
35
  @rules = []
@@ -40,12 +47,16 @@ module Regex
40
47
  def apply(*ios)
41
48
  ios.each do |io|
42
49
  original = (IO === io || StringIO === io ? io.read : io.to_s)
43
- generate = original
50
+ generate = original.to_s
44
51
  rules.each do |(pattern, replacement)|
45
- if pattern.global
46
- generate = generate.gsub(pattern.to_re, replacement)
47
- else
48
- generate = generate.sub(pattern.to_re, replacement)
52
+ begin
53
+ if pattern.global
54
+ generate = generate.gsub(pattern.to_re, replacement)
55
+ else
56
+ generate = generate.sub(pattern.to_re, replacement)
57
+ end
58
+ rescue => err
59
+ warn(io.inspect + ' ' + err.to_s) if $VERBOSE
49
60
  end
50
61
  end
51
62
  if original != generate
@@ -54,6 +65,20 @@ module Regex
54
65
  end
55
66
  end
56
67
 
68
+ #
69
+ # TODO: interactive mode needs to handle \1 style substitutions.
70
+ def interactive_gsub(string, pattern, replacement)
71
+ copy = string.dup
72
+ string.scan(pattern) do |match|
73
+ print "#{match} ? (Y/n)"
74
+ case ask
75
+ when 'y', 'Y', ''
76
+ copy[$~.begin(0)...$~.end(0)] = replacement
77
+ else
78
+ end
79
+ end
80
+ end
81
+
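One way to sidestep offset bookkeeping in an interactive replace is to let `String#gsub` with a block decide each match in turn, since the block's return value becomes the replacement. The sketch below is an alternative under stated assumptions, not the method in this diff: `prompt` is a hypothetical callable standing in for the `ask` helper, and `\1` style substitutions are still not handled.

```ruby
# Replace each match only when the prompt answers yes (or is empty).
# Returning the original match from the block leaves it untouched.
def interactive_gsub(string, pattern, replacement, prompt)
  string.gsub(pattern) do |match|
    case prompt.call(match)
    when 'y', 'Y', '' then replacement
    else match
    end
  end
end

# Scripted answers in place of real user input.
answers = ['y', 'n', ''].each
puts interactive_gsub('cat cat cat', /cat/, 'dog', ->(_match) { answers.next })
```

Because `gsub` builds a new string, earlier replacements cannot shift the positions of later matches, and the result is returned to the caller directly.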
57
82
  private
58
83
 
59
84
  # Parse pattern matcher.
@@ -92,7 +117,7 @@ module Regex
92
117
  replaces = []
93
118
  options = {}
94
119
  parser = OptionParser.new do |opt|
95
- opt.on('--subtitute', '-s PATTERN', 'search portion of substitution') do |search|
120
+ opt.on('--search', '-s PATTERN', 'search portion of substitution') do |search|
96
121
  searches << search
97
122
  end
98
123
  opt.on('--template', '-t NAME', 'search for built-in regular expression') do |name|
@@ -101,7 +126,10 @@ module Regex
101
126
  opt.on('--replace', '-r STRING', 'replacement string of substitution') do |replace|
102
127
  replaces << replace
103
128
  end
104
- opt.on('--escape', '-e', 'make all patterns exact string matchers') do
129
+ opt.on('--recursive', '-R', 'search recursively through subdirectories') do
130
+ options[:recursive] = true
131
+ end
132
+ opt.on('--escape', '-e', 'make all patterns verbatim string matchers') do
105
133
  options[:escape] = true
106
134
  end
107
135
  opt.on('--insensitive', '-i', 'make all patterns case-insensitive matchers') do
@@ -119,7 +147,10 @@ module Regex
119
147
  opt.on('-b', '--backup', 'backup any files that are changed') do
120
148
  options[:backup] = true
121
149
  end
122
- opt.on_tail('--debug', 'run in debug mode') do
150
+ opt.on('-i', '--interactive', 'interactive mode') do
151
+ options[:interactive] = true
152
+ end
153
+ opt.on_tail('--debug', 'run in debug mode') do
123
154
  $DEBUG = true
124
155
  end
125
156
  opt.on_tail('--help', '-h', 'display this lovely help message') do
@@ -129,10 +160,19 @@ module Regex
129
160
  end
130
161
  parser.parse!(argv)
131
162
 
132
- files = argv
133
- files.each do |file|
163
+ files = []
164
+
165
+ argv.each{ |file|
134
166
  raise "file does not exist -- #{file}" unless File.exist?(file)
135
- end
167
+ if File.directory?(file)
168
+ if options[:recursive]
169
+ files.concat Dir[File.join(file, '**', '*')].reject{ |d| File.directory?(d) }
170
+ end
171
+ else
172
+ files << file
173
+ end
174
+ }
175
+
136
176
  targets = files.empty? ? [ARGF] : files.map{ |f| File.new(f) }
137
177
 
138
178
  unless searches.size == replaces.size
@@ -3,8 +3,9 @@ module Regex
3
3
  # = Templates
4
4
  #
5
5
  # TODO: What about regular expressions with variable content?
6
- # Should these be methods rather than constants? But then how
7
- # would we handle named substituions?
6
+ # But then how would we handle named substitutions?
7
+ #
8
+ # TODO: Should these be methods rather than constants?
8
9
  module Templates
9
10
 
10
11
  # Empty line.
@@ -13,6 +14,7 @@ module Regex
13
14
  # Blank line.
14
15
  BLANK = /^\s*$/
15
16
 
17
+ #
16
18
  NUMBER = /[-+]?[0-9]*\.?[0-9]+/
17
19
 
18
20
  # Markup language tag, e.g \<a>stuff</a>.
@@ -21,8 +23,8 @@ module Regex
21
23
  # IPv4 Address
22
24
  IPV4 = /\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b/
23
25
 
24
- # Username
25
- USERNAME = /^[a-zA-Z0-9_]{3,16}$/
26
+ # Dni (spanish ID card)
27
+ DNI = /^\d{8}[A-Za-z]{1}$/
26
28
 
27
29
  # Email Address
28
30
  EMAIL = /([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)/i
@@ -51,6 +53,34 @@ module Regex
51
53
  # HTTP URL Address
52
54
  HTTP = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \?=.-]*)*\/?$/
53
55
 
56
+ # Validates Credit Card numbers, contains 16 numbers in groups of 4 separated
57
+ # by `-`, space or nothing.
58
+ CREDITCARD = /^(\d{4}-){3}\d{4}$|^(\d{4}\s){3}\d{4}$|^\d{16}$/
59
+
60
+ # MasterCard credit card
61
+ MASTERCARD = /^5[1-5]\d{14}$/
62
+
63
+ # Visa credit card.
64
+ VISA = /^4\d{15}$/
65
+
66
+ # TODO: Better name?
67
+ UNIXWORD = /^[a-zA-Z0-9_]*$/
68
+
69
+ # Username, at least 3 characters and no more than 16.
70
+ USERNAME = /^[a-zA-Z0-9_]{3,16}$/
71
+
72
+ # Twitter username
73
+ TWITTER_USERNMAE = /^([a-z0-9\_])+$/ix
74
+
75
+ # Github username
76
+ GITHUB_USERNAME = /^([a-z0-9\_\-])+$/ix
77
+
78
+ # Slideshare username
79
+ SLIDESHARE_USERNAME = /^([a-z0-9])+$/ix
80
+
81
+ # Del.icio.us username
82
+ DELICIOUS_USERNMAME = /^([a-z0-9\_\-])+$/ix
83
+
54
84
  # Ruby comment block.
55
85
  RUBYBLOCK = /^=begin\s*(.*?)\n(.*?)\n=end/m
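A quick check of how two of the new templates behave may help when using them for validation. The patterns are copied from this diff; `STRICT_VISA` is a hypothetical variant added only to show that Ruby's `^` and `$` anchor to lines, not to the whole string.

```ruby
# Patterns copied from the templates added in this diff.
VISA       = /^4\d{15}$/
CREDITCARD = /^(\d{4}-){3}\d{4}$|^(\d{4}\s){3}\d{4}$|^\d{16}$/

# Hypothetical stricter variant anchored to the entire string.
STRICT_VISA = /\A4\d{15}\z/

p CREDITCARD.match?('1234-5678-9012-3456')
p CREDITCARD.match?('1234 5678 9012 3456')
p CREDITCARD.match?('1234-5678 9012 3456') # mixed separators

multiline = "junk\n4111111111111111"
p VISA.match?(multiline)        # line anchors accept the second line
p STRICT_VISA.match?(multiline) # string anchors reject the input
```

For line-oriented searching of documents the `^`/`$` form is the natural fit; for validating a single value, `\A` and `\z` are the safer choice.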
56
86
 
@@ -58,7 +88,7 @@ module Regex
58
88
  # TODO: Not quite right.
59
89
  RUBYMETHOD_WITH_COMMENT = /(^\ *\#.*?)^\s*def\s*(.*?)$/m
60
90
 
61
- #
91
+ # Ruby method definition.
62
92
  RUBYMETHOD = /^\ *def\s*(.*?)$/
63
93
 
64
94
  # By the legendary abigail. Fails to match if and only if it is matched against