crm_formatter 2.4 → 2.5

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 83777a72332f62d45de21bbd2caa03eb61b05017b84fb57169c879074785c7d5
-   data.tar.gz: 0a7ff7909b01d40fdd053116105a88093f80aed75c25973e3799d8d205b4e6cc
+   metadata.gz: 3272fc47f3c2ccd991ce89ab4e7252b7783f306ebc86a3d03a0af49084509d92
+   data.tar.gz: f0854aa02ed045fde909d4e2240e9cb382240b8e8b2367f618faebb0dc8962de
  SHA512:
-   metadata.gz: b7c6eabaf39cd839ab291548141e74faec397b5d9405ef911a923010205e5ecab2881308deb20001f9cd8d1c4d555d5b0e923e6c872a4028ee2816fec2f1806c
-   data.tar.gz: fe6be76a0cf51ac51189423769871cf7c267a367bbae245626aa3c5e2837f1d1b9f30d8b4d69a08a82d673fe96e5405123cd4e76fa06804bcdb808e38092b189
+   metadata.gz: 55aee04e100de2b9862d9c09129599e9e5cb1460c1c078c1a1a15a649bea7e42a495f436f3034263e01560633640aae1e312daaf39ddd27ca8750aa30d7c0546
+   data.tar.gz: b5e4dd5923b92a24f7b7f235b1059e00f66b8bf20bfde316cbbe83982cfe25e2d2073dcfdbd68c80824c02784944695a177ef40a7712c788dd3ec8c2005cf838
data/README.md CHANGED
@@ -28,8 +28,125 @@ Or install it yourself as:
  ## Usage
 
  ### I. Basic Usage
+ Basic methods available are:
+ ```
+ format_addresses(array_of_addresses)
+ format_phones(array_of_phones)
+ format_propers(array_of_propers)
+ format_urls(array_of_urls)
+ ```
+
+ 1. Format Array of Proper Strings:
+ Use `format_propers` to format strings with proper nouns, such as (but not limited to):
+
+ Business Account Names (123 bmw-world => 123 BMW-World),
+
+ Proper Names (adam john booth => Adam John Booth),
+
+ Job Titles (marketing director => Marketing Director),
+
+ Article Titles (the 15 most useful ruby methods => The 15 Most Useful Ruby Methods)
+
+ ```
+ array_of_propers = [
+   '123 bmw-world',
+   'Car-world Kia',
+   'BUDGET - AUTOMOTORES ZONA & FRANCA, INC',
+   'DOWNTOWN CAR REPAIR, INC',
+   'Young Gmc Trucks',
+   'TEXAS TRAVEL, CO',
+   'youmans Chevrolet',
+   'Hot-Deal auto Insurance',
+   'quick auto approval, inc',
+   'yazell chevy',
+   'quick cAr LUBE',
+   'yAtEs AuTo maLL',
+   'YADKIN VALLEY COLLISION CO',
+   'XIT FORD INC'
+ ]
+
+ formatted_proper_hashes = CrmFormatter.format_propers(array_of_propers)
+ ```
+
+ Formatted Proper Strings:
+
+ ```
+ formatted_proper_hashes = [
+   {
+     :proper_status=>"formatted",
+     :proper=>"123 bmw-world",
+     :proper_f=>"123 BMW-World"
+   },
+   {
+     :proper_status=>"formatted",
+     :proper=>"Car-world Kia",
+     :proper_f=>"Car-World Kia"
+   },
+   {
+     :proper_status=>"formatted",
+     :proper=>"BUDGET - AUTOMOTORES ZONA & FRANCA, INC",
+     :proper_f=>"Budget - Automotores Zona & Franca, Inc"
+   },
+
+   {:proper_status=>"formatted",
+     :proper=>"DOWNTOWN CAR REPAIR, INC",
+     :proper_f=>"Downtown Car Repair, Inc"
+   },
+   {
+     :proper_status=>"formatted",
+     :proper=>"Young Gmc Trucks",
+     :proper_f=>"Young GMC Trucks"
+   },
+   {
+     :proper_status=>"formatted",
+     :proper=>"TEXAS TRAVEL, CO",
+     :proper_f=>"Texas Travel, Co"
+   },
+   {
+     :proper_status=>"formatted",
+     :proper=>"youmans Chevrolet",
+     :proper_f=>"Youmans Chevrolet"
+   },
+   {
+     :proper_status=>"formatted",
+     :proper=>"Hot-Deal auto Insurance",
+     :proper_f=>"Hot-Deal Auto Insurance"
+   },
+   {
+     :proper_status=>"formatted",
+     :proper=>"quick auto approval, inc",
+     :proper_f=>"Quick Auto Approval, Inc"
+   },
+   {
+     :proper_status=>"formatted",
+     :proper=>"yazell chevy",
+     :proper_f=>"Yazell Chevy"
+   },
+   {
+     :proper_status=>"formatted",
+     :proper=>"quick cAr LUBE",
+     :proper_f=>"Quick Car Lube"
+   },
+   {
+     :proper_status=>"formatted",
+     :proper=>"yAtEs AuTo maLL",
+     :proper_f=>"Yates Auto Mall"
+   },
+   {
+     :proper_status=>"formatted",
+     :proper=>"YADKIN VALLEY COLLISION CO",
+     :proper_f=>"Yadkin Valley Collision Co"
+   },
+   {
+     :proper_status=>"formatted",
+     :proper=>"XIT FORD INC",
+     :proper_f=>"Xit Ford Inc"
+   }
+ ]
+ ```
+
+ 2. Format Array of Phone Numbers:
 
- 1. Format Array of Phone Numbers:
  ```
  array_of_phones = %w[
    555-457-4391
@@ -44,6 +161,7 @@ formatted_phone_hashes = CrmFormatter.format_phones(array_of_phones)
  ```
 
  Formatted Phone Numbers:
+
  ```
  formatted_phone_hashes = [
    {
@@ -79,7 +197,8 @@ formatted_phone_hashes = [
  ]
  ```
 
- 2. Format Array of URLs:
+ 3. Format Array of URLs:
+
  ```
  array_of_urls = %w[
    sample01.com/staff
@@ -98,6 +217,7 @@ formatted_url_hashes = CrmFormatter.format_urls(array_of_urls)
  ```
 
  Formatted URLs:
+
  ```
  formatted_url_hashes = [
    {
@@ -173,7 +293,8 @@ formatted_url_hashes = [
  ]
  ```
 
- 3. Format Array of Addresses (each as a hash):
+ 4. Format Array of Addresses (each as a hash):
+
  ```
  array_of_addresses = [
    { street: '1234 EAST FAIR BOULEVARD', city: 'AUSTIN', state: 'TEXAS', zip: '78734' },
@@ -184,6 +305,7 @@ formatted_address_hashes = CrmFormatter.format_addresses(array_of_addresses)
  ```
 
  Formatted Addresses:
+
  ```
  formatted_address_hashes = [
    {
@@ -222,11 +344,13 @@ Advanced usage has ability to parse a CSV file or pass large data sets. It also
  Access advanced usage via `format_with_report(args)` method and pass a csv file_path or data hashes.
 
  1. Parse and Format CSV via File Path (Must be absolute path to root and follow the syntax as below)
+
  ```
  formatted_csv_results = CrmFormatter.format_with_report(file_path: './path/to/your/csv.csv')
  ```
 
  Parsed & Formatted CSV Results:
+
  ```
  formatted_csv_results = {
    stats:
@@ -282,6 +406,7 @@ formatted_csv_results = {
  ```
 
  2. Format Data Hashes
+
  ```
  data_hashes_array = [{ row_id: '1', url: 'abcacura.com/twitter', act_name: "Stanley Chevrolet Kaufman\x99_\xCC", street: '825 East Fair Street', city: 'Kaufman', state: 'Texas', zip: '75142', phone: "555-457-4391\r\n" }]
 
@@ -289,6 +414,7 @@ formatted_data_hash_results = CrmFormatter.format_with_report(data: data_hashes_
  ```
 
  Formatted Data Hashes Results:
+
  ```
  formatted_data_hash_results = { stats:
    {
data/Rakefile CHANGED
@@ -17,20 +17,38 @@ task :console do
 
    # formatted_data = format_with_report
    # formatted_phones = format_phones
-   formatted_urls = format_urls
+   # formatted_urls = format_urls
+   formatted_propers = format_propers
    # formatted_addresses = format_addresses
+   binding.pry
    IRB.start
  end
 
- def format_with_report
-   data = [{ row_id: '1', url: 'abcacura.com/twitter', act_name: "Stanley Chevrolet Kaufman\x99_\xCC", street: '825 East Fair Street', city: 'Kaufman', state: 'Texas', zip: '75142', phone: "555-457-4391\r\n" }]
-
-   file_path = './lib/crm_formatter/csv/seed.csv'
+ #############################################
+ def format_propers
+   array_of_propers = [
+     '123 bmw-world',
+     'Car-world Kia',
+     'BUDGET - AUTOMOTORES ZONA & FRANCA, INC',
+     'DOWNTOWN CAR REPAIR, INC',
+     'Young Gmc Trucks',
+     'TEXAS TRAVEL, CO',
+     'youmans Chevrolet',
+     'Hot-Deal auto Insurance',
+     'quick auto approval, inc',
+     'yazell chevy',
+     'quick cAr LUBE',
+     'yAtEs AuTo maLL',
+     'YADKIN VALLEY COLLISION CO',
+     'XIT FORD INC'
+   ]
 
-   # args = {data: data}
-   args = { file_path: file_path }
-   formatted_data = CrmFormatter.format_with_report(args)
+   formatted_propers = CrmFormatter.format_propers(array_of_propers)
+   formatted_propers
  end
+ #############################################
+
+
 
  def format_addresses
    array_of_addresses = [
@@ -41,6 +59,7 @@ def format_addresses
    formatted_addresses = CrmFormatter.format_addresses(array_of_addresses)
  end
 
+
  def format_phones
    array_of_phones = %w[
      555-457-4391 555-888-4391
@@ -51,6 +70,8 @@ def format_phones
    formatted_phones = CrmFormatter.format_phones(array_of_phones)
  end
 
+
+
  def format_urls
    array_of_urls = %w[
      sample01.com/staff
@@ -68,3 +89,14 @@ def format_urls
    array_of_urls = %w[sample01.com/staff www.sample02.net.com http://www.sample3.net www.sample04.net/contact_us http://sample05.net www.sample06.sofake www.sample07.com.sofake example08.not.real www.sample09.net/staff/management www.www.sample10.com]
    formatted_urls = CrmFormatter.format_urls(array_of_urls)
  end
+
+
+ def format_with_report
+   data = [{ row_id: '1', url: 'abcacura.com/twitter', act_name: "Stanley Chevrolet Kaufman\x99_\xCC", street: '825 East Fair Street', city: 'Kaufman', state: 'Texas', zip: '75142', phone: "555-457-4391\r\n" }]
+
+   file_path = './lib/crm_formatter/csv/seed.csv'
+
+   # args = {data: data}
+   args = { file_path: file_path }
+   formatted_data = CrmFormatter.format_with_report(args)
+ end
data/crm_formatter.gemspec CHANGED
@@ -33,6 +33,7 @@ Gem::Specification.new do |spec|
    spec.require_paths = ['lib']
 
    spec.required_ruby_version = '~> 2.5.1'
+   spec.add_development_dependency "pry", "~> 0.11.3"
    spec.add_dependency "utf8_sanitizer", "~> 2.0"
    spec.add_dependency 'activesupport', '~> 5.2', '>= 5.2.0'
    # spec.add_dependency "activesupport-inflector", ['~> 0.1.0']
@@ -44,7 +45,6 @@ Gem::Specification.new do |spec|
    spec.add_development_dependency 'rspec', '~> 3.7'
    spec.add_development_dependency 'rubocop', '~> 0.56.0'
    spec.add_development_dependency 'ruby-beautify', '~> 0.97.4'
-   spec.add_development_dependency "pry", "~> 0.11.3"
 
    # spec.add_runtime_dependency 'library', '~> 2.2'
    # spec.add_dependency 'activerecord', '>= 3.0'
data/lib/crm_formatter.rb CHANGED
@@ -3,21 +3,29 @@
  require 'crm_formatter/address'
  require 'crm_formatter/extensions'
  require 'crm_formatter/phone'
- require 'crm_formatter/version'
+ require 'crm_formatter/proper'
+ require 'crm_formatter/tools'
  require 'crm_formatter/web'
  require 'crm_formatter/wrap'
 
  # require 'crm_formatter/tools'
  # require 'crm_formatter/seed_criteria'
- require 'pry'
  require 'utf8_sanitizer'
+ require 'pry'
 
  module CrmFormatter
-   def self.format_with_report(args={})
-     formatted_data = self::Wrap.new.run(args)
-     formatted_data
+
+   ## Takes array of proper strings, returns array of proper hashes.
+   def self.format_propers(array_of_propers)
+     proper_obj = CrmFormatter::Proper.new
+
+     formatted_proper_hashes = array_of_propers.map do |string|
+       crmf_proper_hsh = proper_obj.format_proper(string)
+     end
+     formatted_proper_hashes
    end
 
+
    ## Takes array of address hashes, returns array of address hashes.
    def self.format_addresses(array_of_addresses)
      address_obj = CrmFormatter::Address.new
@@ -51,4 +59,10 @@ module CrmFormatter
      end
      formatted_url_hashes
    end
+
+   def self.format_with_report(args={})
+     formatted_data = self::Wrap.new.run(args)
+     formatted_data
+   end
+
  end
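Review note: in this release `pry` is declared only with `add_development_dependency` in the gemspec, while `lib/crm_formatter.rb` still runs `require 'pry'` unconditionally. If `pry` is not installed where the gem is loaded, that `require` raises `LoadError`. One possible way to keep the debugger optional is a guarded require. The sketch below is an illustration only: `optional_require` and `PRY_AVAILABLE` are hypothetical names and are not part of crm_formatter.

```ruby
# Hypothetical helper (not part of crm_formatter): try to load an optional
# library and report whether it is available instead of raising LoadError.
def optional_require(name)
  require name
  true
rescue LoadError
  false
end

# Only development setups that have pry installed get the debugger.
PRY_AVAILABLE = optional_require('pry')
puts "pry loaded: #{PRY_AVAILABLE}"
```

With a guard like this, a leftover `binding.pry` call would still fail when `pry` is absent, so debugging calls would need to be removed or wrapped as well.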
data/lib/crm_formatter/address.rb CHANGED
@@ -51,10 +51,7 @@ module CrmFormatter
      def format_street(street)
        street = street&.gsub(/\s/, ' ')&.strip
        return unless street.present?
-       # street = Wrap.new.letter_case_check(street)
-       return unless street.present?
-       # street = CrmFormatter::Tools.new.letter_case_check(street)
-       street = letter_case_check(street)
+       street = CrmFormatter::Tools.new.letter_case_check(street)
 
        street = " #{street} " # Adds white space, to match below, then strip.
        street&.gsub!('.', ' ')
@@ -178,18 +175,5 @@ module CrmFormatter
        zip
      end
 
-     def letter_case_check(str)
-       return unless str.present?
-       flashes = str&.gsub(/[^ A-Za-z]/, '')&.strip&.split(' ')
-       flash = flashes&.reject { |e| e.length < 3 }&.join(' ')
-
-       return str unless flash.present?
-       has_caps = flash.scan(/[A-Z]/).any?
-       has_lows = flash.scan(/[a-z]/).any?
-
-       return str unless !has_caps || !has_lows
-       str = str.split(' ')&.each { |el| el.capitalize! if el.gsub(/[^ A-Za-z]/, '')&.strip&.length > 2 }&.join(' ')
-       str
-     end
    end
  end
data/lib/crm_formatter/proper.rb ADDED
@@ -0,0 +1,26 @@
+ # frozen_string_literal: false
+
+ module CrmFormatter
+   class Proper
+
+     def format_proper(string)
+       str_hsh = { proper_status: nil, proper: string, proper_f: nil }
+       return str_hsh unless string.present?
+       str_hsh[:proper_f] = CrmFormatter::Tools.new.letter_case_check(string)
+
+       str_hsh = check_proper_status(str_hsh)
+       str_hsh
+     end
+
+     ####### COMPARE ORIGINAL AND FORMATTED PROPER ######
+     def check_proper_status(hsh)
+       proper = hsh[:proper]
+       proper_f = hsh[:proper_f]
+       status = 'invalid'
+       status = proper != proper_f ? 'formatted' : 'unchanged' if proper && proper_f
+       hsh[:proper_status] = status if status.present?
+       hsh
+     end
+
+   end
+ end
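The status rule in `check_proper_status` can be illustrated on its own. The following is a minimal sketch in plain Ruby (no ActiveSupport), not the gem's code: `proper_status` and `proper_hash` are hypothetical helper names. Note that in the diff, `format_proper` returns early with `proper_status: nil` for blank input, so the `'invalid'` branch applies only when `check_proper_status` receives a hash with a missing value.

```ruby
# Hypothetical stand-alone helpers mirroring the comparison rule in the diff:
# 'invalid' when either value is missing, 'unchanged' when both are equal,
# 'formatted' when the formatter altered the string.
def proper_status(original, formatted)
  return 'invalid' if original.nil? || formatted.nil?

  original == formatted ? 'unchanged' : 'formatted'
end

# Builds a hash with the same keys the README shows for format_propers output.
def proper_hash(original, formatted)
  { proper_status: proper_status(original, formatted), proper: original, proper_f: formatted }
end

p proper_hash('123 bmw-world', '123 BMW-World')
p proper_hash('Yazell Chevy', 'Yazell Chevy')
p proper_hash(nil, nil)
```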
data/lib/crm_formatter/tools.rb ADDED
@@ -0,0 +1,37 @@
+ # frozen_string_literal: false
+
+ module CrmFormatter
+   class Tools
+
+     def letter_case_check(str)
+       return unless str.present?
+       # str = str.upcase
+       str = str.upcase.split(' ')&.each { |el| el.capitalize! if el.gsub(/[^ A-Za-z]/, '')&.strip }&.join(' ')
+       str = capitalize_dashes(str)
+       str = check_for_brands(str)
+       str
+     end
+
+     def capitalize_dashes(str)
+       if str&.include?('-')
+         els = str.split(' ')
+         dash_els = els.select { |el| el != '-' && el.include?('-') }
+
+         dash_els.each do |el|
+           el_cap = el.split('-').map(&:capitalize).join('-')
+           str = str.gsub(el, el_cap)
+         end
+       end
+       str
+     end
+
+     def check_for_brands(str)
+       return unless str.present?
+       ['BMW', 'CDJR', 'CJDR', 'GMC', 'CJD'].map do |brand|
+         str = str.gsub(brand.capitalize, brand)
+       end
+       str
+     end
+
+   end
+ end
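The three steps in `Tools#letter_case_check` (capitalize each word, capitalize each part of a hyphenated word, restore known brand acronyms) can be tried without installing the gem. The sketch below is a plain-Ruby approximation under stated assumptions, not the gem's implementation: `ProperCaseSketch` and `letter_case` are hypothetical names, the blank check replaces ActiveSupport's `present?`, and the brand step uses a word-boundary regex (a deliberate deviation from the plain `gsub` in the diff) so that a brand is not rewritten inside a longer word.

```ruby
# Hypothetical stand-alone approximation of the casing steps in the diff.
module ProperCaseSketch
  # Brand list copied from check_for_brands in the diff.
  BRANDS = %w[BMW CDJR CJDR GMC CJD].freeze

  module_function

  def letter_case(str)
    # Plain-Ruby stand-in for ActiveSupport's `present?`.
    return nil if str.nil? || str.strip.empty?

    # Capitalize every word; hyphenated words are capitalized part by part.
    # The -1 limit keeps empty parts, so a lone '-' survives unchanged.
    words = str.split(' ').map do |word|
      word.split('-', -1).map(&:capitalize).join('-')
    end
    result = words.join(' ')

    # Restore brand acronyms, e.g. 'Bmw' becomes 'BMW' (whole words only).
    BRANDS.each { |brand| result = result.gsub(/\b#{brand.capitalize}\b/, brand) }
    result
  end
end

puts ProperCaseSketch.letter_case('123 bmw-world')
puts ProperCaseSketch.letter_case('Young Gmc Trucks')
puts ProperCaseSketch.letter_case('BUDGET - AUTOMOTORES ZONA & FRANCA, INC')
```

For the README inputs used here, the sketch yields the same strings the README lists as `proper_f`; other inputs may differ from the gem because of the word-boundary change.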
data/lib/crm_formatter/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: false
 
  module CrmFormatter
-   VERSION = "2.4"
+   VERSION = "2.5"
  end
metadata CHANGED
@@ -1,15 +1,29 @@
  --- !ruby/object:Gem::Specification
  name: crm_formatter
  version: !ruby/object:Gem::Version
-   version: '2.4'
+   version: '2.5'
  platform: ruby
  authors:
  - Adam Booth
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2018-06-27 00:00:00.000000000 Z
+ date: 2018-06-28 00:00:00.000000000 Z
  dependencies:
+ - !ruby/object:Gem::Dependency
+   name: pry
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 0.11.3
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 0.11.3
  - !ruby/object:Gem::Dependency
    name: utf8_sanitizer
    requirement: !ruby/object:Gem::Requirement
@@ -180,20 +194,6 @@ dependencies:
      - - "~>"
        - !ruby/object:Gem::Version
          version: 0.97.4
- - !ruby/object:Gem::Dependency
-   name: pry
-   requirement: !ruby/object:Gem::Requirement
-     requirements:
-     - - "~>"
-       - !ruby/object:Gem::Version
-         version: 0.11.3
-   type: :development
-   prerelease: false
-   version_requirements: !ruby/object:Gem::Requirement
-     requirements:
-     - - "~>"
-       - !ruby/object:Gem::Version
-         version: 0.11.3
  description: |-
    CrmFormatter is perfect for curating high-volume enterprise-scale web scraping, and integrates well with Nokogiri, Mechanize, and asynchronous jobs via Delayed_job or SideKick, to name a few. Web Scraping and Harvesting often gathers a lot of junk to sift through; presenting unexpected edge cases around each corner. CrmFormatter has been developed and refined during the past few years to focus on improving that task.
    It's also perfect for processing API data, Web Forms, and routine DB normalizing and scrubbing processes. Not only does it reformat Address, Phone, and Web data, it can also accept lists to scrub against, then providing detailed reports about how each piece of data compares with your criteria lists.
@@ -220,6 +220,8 @@ files:
  - lib/crm_formatter/csv/seed.csv
  - lib/crm_formatter/extensions.rb
  - lib/crm_formatter/phone.rb
+ - lib/crm_formatter/proper.rb
+ - lib/crm_formatter/tools.rb
  - lib/crm_formatter/version.rb
  - lib/crm_formatter/web.rb
  - lib/crm_formatter/wrap.rb