pattern_matcher 0.1.0 → 0.1.1
- data/README.md +140 -1
- data/lib/pattern_matcher.rb +15 -4
- data/lib/pattern_matcher/match.rb +8 -2
- data/lib/pattern_matcher/matcher.rb +0 -21
- data/lib/pattern_matcher/pattern.rb +18 -3
- data/lib/pattern_matcher/pattern_manager.rb +64 -0
- data/lib/pattern_matcher/version.rb +1 -1
- metadata +4 -3
data/README.md
CHANGED
@@ -3,4 +3,143 @@ Pattern Matcher
 
 Gem to help manage, test and run regex pattern matchers for a given project.
 
-Gem can be found on rubygems at https://rubygems.org/gems/pattern_matcher
+Gem can be found on rubygems at https://rubygems.org/gems/pattern_matcher
+
+
+=========
+**Code coverage for Ruby 1.9**
+
+* [Source Code]
+
+[Source Code]: https://github.com/aparker/pattern_matcher "Source Code @ GitHub"
+[leak-stopper]: https://rubygems.org/gems/leak_stopper "LeakStopper"
+
+
+PatternMatcher is a code tool that provides a framework for the storing, testing and running of common patterns. It can be used
+in a one-off fashion from the command line or can be included inside of your application. The underlying code uses regular
+expression matching but provides more human readable responses when matches are found.
+
+The intention is to provide structured management of system-wide matching strings that can then be easily reused and tested to
+ensure that intended behavior is maintained throughout the lifespan of the project.
+
+PatternMatcher is the result of abstracting the core logic of the [leak-stopper] gem and, as a result, has some behavior that, while generalized, is intended to support the functionality that LeakStopper provides.
+
+Getting started (code)
+---------------
+
+1. Add pattern_matcher to your `Gemfile` and `bundle install`:
+
+2. Initialize pattern_matcher
+
+```ruby
+PatternMatcher.configure do |config|
+  config.patterns_yml = File.join(File.dirname(__FILE__), "config", "patterns.yml")
+end
+```
+
+3. Call `match_patterns_to_text` to get a list of `PatternMatcher::Match` objects
+
+```ruby
+matches = PatternMatcher.match_patterns_to_text(some_text)
+matches.each do |match|
+  puts match.to_s
+end
+```
+
+4. Programmatically `add_pattern_hash` to expand the scope of patterns tested
+
+```ruby
+hash = {:pattern_id => "ANumber", "name" => "A Number", "regex" => "[0-9]", "description" => "Just a single number."}
+PatternMatcher.add_pattern_hash(hash)
+```
+
+Getting started (command line)
+---------------
+
+1. Download pattern_matcher by calling `gem install pattern_matcher`:
+
+```ruby
+gem 'simplecov', :require => false, :group => :test
+```
+
+## Example patterns.yml File
+
+The `patterns.yml` file is where patterns can be defined and maintained. It looks like this:
+
+```
+patterns:
+  SSN:
+    name: Social Security Number
+    regex: '[0-9]{3}-[0-9]{2}-[0-9]{4}'
+    description: Social Security Numbers.
+    valid_examples: ['111-22-1111', '222-11-2222']
+  PhoneNumber:
+    name: Seven Digit Phone Number
+    regex: '[0-9]{3}-[0-9]{4}'
+    description: A seven digit US phone number.
+    valid_examples: ['111-1234', '867-5309']
+```
+
+**Pattern Key** - Below the patterns node is the Pattern Key. This must be unique and identifies the pattern attributes on
+subsequent nodes.
+
+**name** - The descriptive title for the type of pattern this is.
+
+**regex** - A regular expression describing the pattern that is to be matched. This is a _required_ part of the pattern and is
+used directly to identify matches within analyzed text.
+
+**description** - A detailed description of what the pattern is.
+
+**valid_examples** - A list of known matches that the system can test the regex pattern against to ensure expected behavior.
+
+
+## The PatternMatcher::Match object
+
+Think of the PatternMatcher::Match object as an extension of the MatchData class, one that provides a deeper explanation.
+
+The **Match** object returns the name of the match along with a copy of the regex MatchData object.
+
+## Configuration
+
+To provide an appropriate level of flexibility and ease of use, there are a number of things that can be set as configuration
+options on PatternMatcher.
+
+**patterns_yml** - The source of the patterns that are to be applied to the matcher.
+
+
+## List/Report Known Patterns
+
+**List Patterns** - To view the entire list of pattern keys that PatternMatcher knows about, you can call:
+
+```
+> PatternMatcher.patterns.keys
+> ["SSN", "PhoneNumber", ...]
+```
+
+**Detailed List of Patterns** - looks at each pattern and calls `.to_s` on each pattern
+
+```
+> PatternMatcher.patterns.to_s
+> "Social Security Number (true) -- [0-9]{3}-[0-9]{2}-[0-9]{4} -- Social Security Numbers.
+Seven Digit Phone Number (true) -- [0-9]{3}-[0-9]{4} -- A seven digit US phone number."
+```
+
+**Details of a Single Pattern** - a richer view into the pattern.
+
+```
+> a_pattern = Pattern.new({:pattern_id => "SSN", "name" => "Social Security Number", "regex" => "[0-9]{3}-[0-9]{2}-[0-9]{4}", "description" => "A Social Security Number."})
+> a_pattern.to_s
+> "Social Security Number (true) -- [0-9]{3}-[0-9]{2}-[0-9]{4} -- A Social Security Number."
+```
+
+
+## Validate Patterns
+
+#### Debugging Patterns
+
+A core role of PatternMatcher is to help maintain the patterns that are important to you. This means making those patterns easy to see and understand, and giving you confidence that they are behaving as expected.
+
+```
+> PatternMatcher.proof_patterns
+> {"EmailAddress" => ["something#mail.com"]}
+```
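The proofing idea described in the README's "Validate Patterns" section can be sketched as follows. This is a minimal illustration, not the gem's implementation: the `PATTERNS` constant, the email regex, and this standalone `proof_patterns` method are hypothetical stand-ins that use plain hashes instead of `PatternMatcher::Pattern` objects.

```ruby
# Hypothetical stand-in data; the gem itself would read this from patterns.yml.
PATTERNS = {
  "SSN" => {
    regex: /[0-9]{3}-[0-9]{2}-[0-9]{4}/,
    valid_examples: ["111-22-1111", "222-11-2222"]
  },
  "EmailAddress" => {
    regex: /[^@\s]+@[^@\s]+\.[a-z]+/, # assumed regex, for illustration only
    valid_examples: ["someone@mail.com", "something#mail.com"]
  }
}

# Returns a hash of pattern id => examples that the pattern's regex failed to match.
# Patterns whose examples all match are left out of the result.
def proof_patterns(patterns)
  patterns.each_with_object({}) do |(id, pattern), errors|
    failures = pattern[:valid_examples].reject { |example| pattern[:regex].match(example) }
    errors[id] = failures unless failures.empty?
  end
end

p proof_patterns(PATTERNS)
```

Only the deliberately broken example (`something#mail.com`) is reported, which mirrors the shape of the result shown in the README.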
data/lib/pattern_matcher.rb
CHANGED
@@ -8,12 +8,13 @@ require "pattern_matcher/config"
 require "pattern_matcher/pattern"
 require "pattern_matcher/match"
 require "pattern_matcher/matcher"
+require "pattern_matcher/pattern_manager"
 
 module PatternMatcher
 
   def self.configure
     yield configuration if block_given?
-    @patterns =
+    @patterns = PatternManager.initialize_patterns
   end
 
   def self.configuration
@@ -26,7 +27,7 @@ module PatternMatcher
 
   def self.match_patterns_to_text(text)
     matches = []
-    @patterns.each do |pattern|
+    @patterns.to_a.each do |pattern|
       regex_match = Matcher.match_pattern_in_text(pattern, text)
       matches << Match.new({:name => pattern.name, :regex_match => regex_match}) if regex_match
     end
@@ -34,12 +35,22 @@ module PatternMatcher
   end
 
   def self.proof_patterns
-    pattern_errors =
+    pattern_errors = {}
     @patterns.each do |pattern|
-
+      failures = pattern.validate_all_examples
+      pattern_errors[pattern.pattern_id] = failures if (pattern.is_valid? && failures.count > 0)
     end
     pattern_errors
   end
 
+  def self.patterns
+    @patterns
+  end
+
+  def self.add_pattern_hash(hash)
+    pattern = Pattern.new(hash)
+    @patterns.add_pattern pattern if pattern.is_valid?
+  end
+
 end
 
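The flow of `match_patterns_to_text` in the hunks above (try every pattern against the text, keep a named result for each hit) can be sketched on its own. The `SketchPattern` and `SketchMatch` structs and the `to_s` format are assumptions made for this example; the gem's real `Pattern`, `Match`, and `Matcher` classes are only partly visible in this diff.

```ruby
# Illustrative structs; the gem's real Pattern and Match classes carry more behavior.
SketchPattern = Struct.new(:name, :regex)
SketchMatch = Struct.new(:name, :regex_match) do
  # Hypothetical format; the diff does not show the gem's Match#to_s.
  def to_s
    "#{name}: #{regex_match[0]}"
  end
end

# Runs each pattern against the text and collects one SketchMatch per pattern that hits.
def match_patterns_to_text(patterns, text)
  return [] if text.nil? || text.empty?

  patterns.each_with_object([]) do |pattern, matches|
    regex_match = pattern.regex.match(text)
    matches << SketchMatch.new(pattern.name, regex_match) if regex_match
  end
end

patterns = [
  SketchPattern.new("Social Security Number", /[0-9]{3}-[0-9]{2}-[0-9]{4}/),
  SketchPattern.new("Seven Digit Phone Number", /[0-9]{3}-[0-9]{4}/)
]

matches = match_patterns_to_text(patterns, "Call 867-5309 today")
matches.each { |match| puts match.to_s }
```

Keeping the `MatchData` inside the result object means callers can still reach offsets and capture groups while printing a human-readable name.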
data/lib/pattern_matcher/match.rb
CHANGED
@@ -5,9 +5,15 @@ module PatternMatcher
 
     def initialize(hash)
       if hash.is_a?(Hash)
-
-        @regex_match = hash[:regex_match]
+        extract_attributes_from_hash hash
       end
     end
+
+    private
+
+    def extract_attributes_from_hash(hash)
+      @name = hash[:name]
+      @regex_match = hash[:regex_match]
+    end
   end
 end
data/lib/pattern_matcher/matcher.rb
CHANGED
@@ -16,11 +16,6 @@ module PatternMatcher
       return regex.match text if is_valid_regex?(regex) && is_valid_text?(text)
     end
 
-    def self.initialize_patterns
-      raw_pattern_hash = load_yaml || {}
-      @@patterns = hash_to_patterns_array(raw_pattern_hash)
-    end
-
     private
 
     def self.is_valid_regex_string?(string)
@@ -35,21 +30,5 @@ module PatternMatcher
       !text.nil? && !text.empty?
     end
 
-    def self.load_yaml
-      YAML.load_file(PatternMatcher.configuration.patterns_yml) if PatternMatcher.configuration.patterns_yml
-    end
-
-    def self.hash_to_patterns_array(hash)
-      patterns = []
-      hash_patterns(hash).each do |a_pattern|
-        patterns << PatternMatcher::Pattern.new(a_pattern[1])
-      end
-      patterns
-    end
-
-    def self.hash_patterns(hash)
-      (hash["patterns"] || {}) if !hash.nil?
-    end
-
   end
 end
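`Matcher` keeps the regex-validity checks after this change, but their bodies fall outside the hunks. One way such a string-to-regex conversion could work is sketched below. The helper name mirrors `Matcher.string_to_regex`, yet this body is an assumption and not the gem's code.

```ruby
# Hypothetical helper: returns a Regexp, or nil when the string is missing or malformed.
def string_to_regex(string)
  return nil if string.nil? || string.empty?

  Regexp.new(string)
rescue RegexpError
  # Malformed pattern strings (for example an unclosed character class) end up here.
  nil
end

phone = string_to_regex("[0-9]{3}-[0-9]{4}")
broken = string_to_regex("[0-9")

puts phone.match("867-5309")[0]
puts broken.nil?
```

Returning `nil` instead of raising fits the `is_valid?` check in `Pattern`, which treats a `nil` regex as an invalid pattern.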
data/lib/pattern_matcher/pattern.rb
CHANGED
@@ -1,11 +1,12 @@
 module PatternMatcher
   class Pattern
 
-    attr_accessor :name, :regex_string, :description, :valid_examples
+    attr_accessor :pattern_id, :name, :regex_string, :description, :valid_examples
     attr_accessor :regex
 
     def initialize(hash)
-      if
+      if hash.is_a?(Hash)
+        @pattern_id = hash[:pattern_id]
         @name = hash["name"]
         @regex_string = hash["regex"]
         @regex = Matcher.string_to_regex(@regex_string)
@@ -22,8 +23,22 @@ module PatternMatcher
       failures
     end
 
+    def to_s
+      "#{@name} (#{is_valid?}) -- #{@regex_string} -- #{@description}"
+    end
+
+    def to_h
+      {
+        :name => @name,
+        :is_valid => is_valid?,
+        :regex_string => @regex_string,
+        :description => @description,
+        :valid_examples => @valid_examples
+      }
+    end
+
     def is_valid?
-      return !@regex.nil?
+      return !@regex.nil? && !@pattern_id.nil?
     end
 
     def pattern_example_valid?(example)
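The behavioral change in this file is that a pattern now needs both a usable regex and a `pattern_id` to count as valid. The sketch below reproduces only what the hunks show. `PatternSketch` is a hypothetical stand-in, the `"description"` key is assumed because that line is outside the diff, and `Regexp.new` replaces the gem's `Matcher.string_to_regex`.

```ruby
# Simplified stand-in for PatternMatcher::Pattern, limited to what the diff shows.
class PatternSketch
  def initialize(hash)
    @pattern_id = hash[:pattern_id]     # symbol key, as in the diff
    @name = hash["name"]                # string keys, as in the diff
    @regex_string = hash["regex"]
    @description = hash["description"]  # assumed key; this line is outside the hunk
    @regex = Regexp.new(@regex_string) unless @regex_string.nil?
  end

  # Valid only when both the regex and the pattern id are present.
  def is_valid?
    !@regex.nil? && !@pattern_id.nil?
  end

  def to_s
    "#{@name} (#{is_valid?}) -- #{@regex_string} -- #{@description}"
  end
end

with_id = PatternSketch.new({ :pattern_id => "SSN", "name" => "Social Security Number",
                              "regex" => "[0-9]{3}-[0-9]{2}-[0-9]{4}",
                              "description" => "A Social Security Number." })
without_id = PatternSketch.new({ "name" => "A Number", "regex" => "[0-9]" })

puts with_id.to_s
puts without_id.is_valid?
```

Note the mixed key types: `pattern_id` is read with a symbol key while `name` and `regex` are read with string keys, so a hash built entirely from symbol keys would produce an invalid pattern.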
data/lib/pattern_matcher/pattern_manager.rb
ADDED
@@ -0,0 +1,64 @@
+module PatternMatcher
+  class PatternManager
+
+    attr_accessor :patterns_hash
+
+    def self.initialize_patterns
+      raw_pattern_hash = load_yaml || {}
+      PatternManager.new_from_raw_hash raw_pattern_hash
+    end
+
+    def self.new_from_raw_hash(raw_hash)
+      patterns = {}
+      hash_patterns(raw_hash).each_pair do |a_key, a_pattern|
+        a_pattern[:pattern_id] = a_key
+        patterns[a_key] = PatternMatcher::Pattern.new(a_pattern)
+      end
+      PatternManager.new(patterns)
+    end
+
+    def initialize(pattern_hash)
+      @patterns_hash = pattern_hash
+    end
+
+    def add_pattern(pattern)
+      @patterns_hash[pattern.pattern_id] = pattern if pattern.is_valid?
+    end
+
+    def keys
+      @patterns_hash.keys
+    end
+
+    def to_s
+      ret_str = ""
+      @patterns_hash.values.each do |pattern|
+        ret_str += pattern.to_s + "\n"
+      end
+      ret_str
+    end
+
+    def to_a
+      @patterns_hash.values
+    end
+
+    def each
+      @patterns_hash.values.each do |pattern|
+        yield pattern
+      end
+    end
+
+    def count
+      @patterns_hash.keys.count || 0
+    end
+
+    private
+
+    def self.load_yaml
+      YAML.load_file(PatternMatcher.configuration.patterns_yml) if PatternMatcher.configuration.patterns_yml
+    end
+
+    def self.hash_patterns(hash)
+      (hash["patterns"] || {}) if !hash.nil?
+    end
+  end
+end
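The loading path in `PatternManager.new_from_raw_hash` (read the `patterns` node, then tag each entry with its key as `:pattern_id`) can be tried without the gem. The sketch below parses an inline YAML string instead of a configured file; `YAML_TEXT` and the local variable names are assumptions for the example, and plain hashes stand in for `Pattern` objects.

```ruby
require "yaml"

# Inline stand-in for a patterns.yml file; the gem reads the path from its configuration.
YAML_TEXT = <<~YAML
  patterns:
    SSN:
      name: Social Security Number
      regex: '[0-9]{3}-[0-9]{2}-[0-9]{4}'
    PhoneNumber:
      name: Seven Digit Phone Number
      regex: '[0-9]{3}-[0-9]{4}'
YAML

raw_hash = YAML.safe_load(YAML_TEXT) || {}

# Key each entry by its pattern id and copy that id into the attributes,
# mirroring the a_pattern[:pattern_id] = a_key step in the diff.
patterns_by_id = (raw_hash["patterns"] || {}).each_with_object({}) do |(key, attrs), by_id|
  by_id[key] = attrs.merge(:pattern_id => key)
end

p patterns_by_id.keys
```

Storing patterns in a hash keyed by id, rather than the array used in 0.1.0, is what makes `PatternMatcher.patterns.keys` and id-based replacement in `add_pattern` possible.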
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: pattern_matcher
 version: !ruby/object:Gem::Version
-  version: 0.1.
+  version: 0.1.1
 prerelease:
 platform: ruby
 authors:
@@ -9,7 +9,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2014-05-
+date: 2014-05-05 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rspec
@@ -69,11 +69,12 @@ files:
 - lib/pattern_matcher/match.rb
 - lib/pattern_matcher/matcher.rb
 - lib/pattern_matcher/pattern.rb
+- lib/pattern_matcher/pattern_manager.rb
 - lib/pattern_matcher/version.rb
 - lib/pattern_matcher.rb
 - LICENSE
 - README.md
-homepage: https://
+homepage: https://github.com/aparkerw/pattern_matcher
 licenses:
 - MIT
 post_install_message: