data_cleansing 0.6.0 → 0.6.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: d330ec38e5e452e5a6278a9086d2fbd010240d12
4
- data.tar.gz: 38dcd89391ba829b8d5b0d31fbf8f5b9c12ac6d0
3
+ metadata.gz: 963a448b6c2ab7dfe313dd911f19e83a4c4688a6
4
+ data.tar.gz: ff312d2b3a5dfc3bc20255f909bf3376ef720188
5
5
  SHA512:
6
- metadata.gz: d7a6e0ff6067d55da2910f494a7f1f3b0688460e794daa9cf7b5e0fdd8e1233df0251573404af4178b4e6d1559cb61960a252a8fad4155a4d773e41090f34867
7
- data.tar.gz: 29eb21b9b89ea762e405d80aefe947dc3c2066e563f3f07f2e491bee757cec5cdba5fa239741be8256ba701405278143c3811f61dd973503a0c7470bb547c0e5
6
+ metadata.gz: 8f899d278990a64a9ce9dcf51713b9143ba9485449ed793f4c96afe9a078dd0273308dfc2eaca41ce081c31ed2beba54173c0b6733c25ef9dfa16b4608e73b9a
7
+ data.tar.gz: c3d1addfece09468dc7f05981145e98f90ff46721c1b5e68e8b8b283f9b0d71485f0d63164115149b3a2dc84acece390fbbd5fa982b87c8689ad50316f2181bd
data/README.md CHANGED
@@ -22,11 +22,12 @@ or pull request.
22
22
  ## Features
23
23
 
24
24
  * Supports global cleansing definitions that can be associated with any Ruby,
25
- Rails, Mongoid, or other model.
26
- * Supports custom cleansing definitions that can be defined in-line using block.
27
- * A cleansing block can access the other attributes in the model in determining
28
- how to cleanse the current attribute
29
- * In a cleansing block other can also be modified if necessary
25
+ Rails, Mongoid, or other model
26
+ * Supports custom cleansing definitions that can be defined in-line
27
+ * A cleansing block can access the other attributes in the model while cleansing
28
+ the current attribute
29
+ * In a cleansing block other attributes in the model can be modified at the
30
+ same time
30
31
  * Cleansers are executed in the order they are defined. As a result multiple
31
32
  cleansers can be run against the same field and the order is preserved
32
33
  * Multiple cleansers can be specified for a list of attributes at the same time
@@ -34,9 +35,8 @@ or pull request.
34
35
  the child's cleansers
35
36
  * Cleansers can be called outside of a model instance for cases where fields
36
37
  need to be cleansed before the model is created, or needs to be found
37
- * Logging of data cleansing with the before and after values for troubleshooting.
38
- Depending on the log level all modified fields are logged, or just the ones
39
- completely wiped out to nil
38
+ * To aid troubleshooting the before and after values of cleansed attributes
39
+ are logged. The level of detail is fine-tuned using the log level
40
40
 
41
41
  ## ActiveRecord (ActiveModel) Features
42
42
 
@@ -67,11 +67,11 @@ u = User.new
67
67
  u.first_name = ' joe '
68
68
  u.last_name = "\n black\n"
69
69
  puts "Before data cleansing #{u.inspect}"
70
- # Before data cleansing Before data cleansing #<User:0x007fc9f1081980 @first_name=" joe ", @last_name="\n black\n">
70
+ # Before data cleansing #<User:0x007fc9f1081980 @first_name=" joe ", @last_name="\n black\n">
71
71
 
72
72
  u.cleanse_attributes!
73
73
  puts "After data cleansing #{u.inspect}"
74
- # After data cleansing After data cleansing #<User:0x007fc9f1081980 @first_name="joe", @last_name="black">
74
+ # After data cleansing #<User:0x007fc9f1081980 @first_name="joe", @last_name="black">
75
75
  ```
76
76
 
77
77
  ### Rails Example
@@ -154,6 +154,64 @@ puts "After data cleansing #{u.inspect}"
154
154
  # After data cleansing #<User:0x007fdd5a83a8f8 @first_name="joe", @last_name="black", @address1="2632 Brown St", @title="MR.", @gender="Male">
155
155
  ```
156
156
 
157
+ ## After Cleansing
158
+
159
+ It is sometimes useful to read or write multiple fields as part of a cleansing, or
160
+ where attributes need to be manipulated automatically once they have been cleansed.
161
+ For this purpose instance methods on the model can be registered for invocation once
162
+ all the attributes have been cleansed according to their :cleanse specifications.
163
+ Multiple methods can be registered and they are called in the order they are registered.
164
+
165
+ ```ruby
166
+ after_cleanse <instance_method_name>, <instance_method_name>, ...
167
+ ```
168
+
169
+ Example:
170
+ ```ruby
171
+ # Define a global cleanser
172
+ DataCleansing.register_cleaner(:strip) {|string| string.strip}
173
+
174
+ # 'users' table has the following columns :first_name, :last_name, :address1, :address2
175
+ class User < ActiveRecord::Base
176
+ include DataCleansing::Cleanse
177
+
178
+ # Use a global cleaner
179
+ cleanse :first_name, :last_name, :cleaner => :strip
180
+
181
+ # Define a once off cleaner
182
+ cleanse :address1, :address2, :cleaner => Proc.new {|string| string.strip}
183
+
184
+ # Once the above cleansing is complete call the instance method
185
+ after_cleanse :check_address
186
+
187
+ protected
188
+
189
+ # Method to be called once data cleansing is complete
190
+ def check_address
191
+ # Move address2 to address1 if Address1 is blank and address2 has a value
192
+ self.address1 = address2 if address1.blank? && !address2.blank?
193
+ end
194
+
195
+ end
196
+
197
+ # Create a User instance
198
+ u = User.new(:first_name => ' joe ', :last_name => "\n black\n", :address2 => "2632 Brown St \n")
199
+ puts "Before data cleansing #{u.attributes.inspect}"
200
+ u.cleanse_attributes!
201
+ puts "After data cleansing #{u.attributes.inspect}"
202
+ u.save!
203
+ ```
204
+
205
+ ## Recommendations
206
+
207
+ :data_cleanse blocks are ideal for cleansing a single attribute, and applying any
208
+ global or common cleansing algorithms.
209
+
210
+ Even though multiple attributes can be read or written in a single :data_cleanse
211
+ block, it is recommended to use the :after_cleanse method for working with multiple
212
+ attributes. It is much easier to read and understand the interactions between multiple
213
+ attributes in the :after_cleanse methods.
214
+
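An `after_cleanse` method such as `check_address` typically writes to model attributes. In Ruby, an assignment without an explicit receiver creates a method-local variable rather than calling the attribute writer, so writes inside such methods need `self.`. The standalone sketch below illustrates the difference with a plain Ruby class. `Contact`, `move_address_locally`, `move_address`, and the private `blank?` helper are hypothetical names used only for illustration; neither data_cleansing nor ActiveRecord is loaded.

```ruby
# Plain Ruby stand-in for a model (hypothetical; not part of data_cleansing)
class Contact
  attr_accessor :address1, :address2

  def initialize(address1, address2)
    @address1 = address1
    @address2 = address2
  end

  # Assigns to a method-local variable named address1;
  # the attribute itself is left untouched
  def move_address_locally
    address1 = address2 if blank?(@address1) && !blank?(@address2)
    address1
  end

  # Calls the address1= writer, so the attribute itself changes
  def move_address
    self.address1 = address2 if blank?(address1) && !blank?(address2)
  end

  private

  # Stand-in for ActiveSupport's blank? so the sketch runs on plain Ruby
  def blank?(value)
    value.nil? || value.to_s.strip.empty?
  end
end

local = Contact.new(nil, '2632 Brown St')
local.move_address_locally
local.address1 # => nil

fixed = Contact.new(nil, '2632 Brown St')
fixed.move_address
fixed.address1 # => "2632 Brown St"
```

The same rule would apply to any method registered with `after_cleanse`: reads can use the bare attribute name, writes need the `self.` receiver.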
157
215
  ## Rails configuration
158
216
 
159
217
  When DataCleansing is used in a Rails environment it can be configured using the
@@ -196,13 +254,22 @@ SemanticLogger.default_level = Rails.logger.level
196
254
  SemanticLogger.add_appender(Rails.logger)
197
255
  ```
198
256
 
199
- By changing the log level for DataCleansing the type of output for data
257
+ By changing the log level of DataCleansing itself the type of output for data
200
258
  cleansing can be controlled:
201
259
 
202
260
  * :trace or :debug to log all fields modified
203
261
  * :info to log only those fields which were nilled out
204
262
  * :warn or higher to disable logging of cleansing actions
205
263
 
264
+ Note:
265
+
266
+ * The logging of changes made to attributes only includes attributes cleansed
267
+ with :data_cleanse blocks. Attributes modified within :after_cleanse methods
268
+ are not logged
269
+
270
+ * It is not necessary to change the global log level to affect the logging detail
271
+ level in DataCleansing. DataCleansing log level is changed independently
272
+
206
273
  To change the log level, either use the Rails configuration approach, or set it
207
274
  directly:
208
275
 
@@ -212,9 +279,9 @@ DataCleansing.logger.level = :info
212
279
 
213
280
  ## Notes
214
281
 
215
- Cleaners are called in the order in which they are defined, so subsequent cleaners
216
- can assume that the previous cleaners have run and can therefore access or even
217
- modify previously cleaned attributes
282
+ * Cleaners are called in the order in which they are defined, so subsequent cleaners
283
+ can assume that the previous cleaners have run and can therefore access or even
284
+ modify previously cleaned attributes
218
285
 
219
286
  ## Installation
220
287
 
@@ -223,7 +290,7 @@ modify previously cleaned attributes
223
290
  Add the following line to Gemfile
224
291
 
225
292
  ```ruby
226
- gem 'data_validation'
293
+ gem 'data_cleansing'
227
294
  ```
228
295
 
229
296
  Install the Gem with bundler
@@ -120,109 +120,122 @@ module DataCleansing
120
120
  module InstanceMethods
121
121
  # Cleanse the attributes using specified cleaners
122
122
  # and execute after cleaners once complete
123
- def cleanse_attributes!
124
- # Collect parent cleaners first, starting with the top parent
125
- cleaners = [self.class.send(:data_cleansing_cleaners)]
126
- after_cleaners = [self.class.send(:data_cleansing_after_cleaners)]
127
- klass = self.class.superclass
128
- while klass != Object
129
- cleaners << klass.send(:data_cleansing_cleaners) if klass.respond_to?(:data_cleansing_cleaners)
130
- after_cleaners << klass.send(:data_cleansing_after_cleaners) if klass.respond_to?(:data_cleansing_after_cleaners)
131
- klass = klass.superclass
132
- end
133
- cleaners.reverse_each {|cleaner| data_cleansing_execute_cleaners(cleaner)}
123
+ #
124
+ # Returns fields changed whilst cleaning the attributes
125
+ #
126
+ # Note: At this time the changes returned do not include any fields
127
+ # modified in any of the after_cleaner methods
128
+ def cleanse_attributes!(verbose=DataCleansing.logger.debug?)
129
+ changes = {}
130
+ DataCleansing.logger.benchmark_info("#{self.class.name}#cleanse_attributes!", :payload => changes) do
131
+ # Collect parent cleaners first, starting with the top parent
132
+ cleaners = [self.class.send(:data_cleansing_cleaners)]
133
+ after_cleaners = [self.class.send(:data_cleansing_after_cleaners)]
134
+ klass = self.class.superclass
135
+ while klass != Object
136
+ cleaners << klass.send(:data_cleansing_cleaners) if klass.respond_to?(:data_cleansing_cleaners)
137
+ after_cleaners << klass.send(:data_cleansing_after_cleaners) if klass.respond_to?(:data_cleansing_after_cleaners)
138
+ klass = klass.superclass
139
+ end
140
+ # Capture all modified fields if log_level is :debug or :trace
141
+ cleaners.reverse_each {|cleaner| changes.merge!(data_cleansing_execute_cleaners(cleaner, verbose))}
134
142
 
135
- # Execute the after cleaners, starting with the parent after cleanse methods
136
- after_cleaners.reverse_each {|a| a.each {|method| send(method)} }
137
- true
143
+ # Execute the after cleaners, starting with the parent after cleanse methods
144
+ after_cleaners.reverse_each {|a| a.each {|method| send(method)} }
145
+ end
146
+ changes
138
147
  end
139
148
 
140
149
  private
141
150
 
142
151
  # Run each of the cleaners in the order they are listed in the array
143
- def data_cleansing_execute_cleaners(cleaners)
152
+ # Returns a hash of before and after values of what was cleansed
153
+ # Parameters
154
+ # cleaners
155
+ # List of cleaners to run
156
+ #
157
+ # verbose [true|false]
158
+ # Whether to include all the fields cleansed or just the fields that
159
+ # were cleansed to nil
160
+ def data_cleansing_execute_cleaners(cleaners, verbose = false)
144
161
  return false if cleaners.nil?
145
162
 
146
163
  # Capture all changes to attributes if the log level is :info or greater
147
- changes = {} if DataCleansing.logger.info?
148
- # Capture all modified fields if log_level is :debug or :trace
149
- verbose = DataCleansing.logger.debug?
150
-
151
- DataCleansing.logger.benchmark_info("cleanse_attributes!", :payload => changes) do
152
- cleaners.each do |cleaner_struct|
153
- params = cleaner_struct.params
154
- attrs = cleaner_struct.attributes
155
-
156
- # Special case to include :all fields
157
- # Only works with ActiveRecord based models, not supported with regular Ruby models
158
- if attrs.include?(:all) && defined?(ActiveRecord) && respond_to?(:attributes)
159
- attrs = attributes.keys.collect{|i| i.to_sym}
160
- attrs.delete(:id)
161
-
162
- # Remove serialized_attributes if any, from the :all condition
163
- if self.class.respond_to?(:serialized_attributes)
164
- serialized_attrs = self.class.serialized_attributes.keys
165
- attrs -= serialized_attrs.collect{|i| i.to_sym} if serialized_attrs
166
- end
164
+ changes = {}
165
+
166
+ cleaners.each do |cleaner_struct|
167
+ params = cleaner_struct.params
168
+ attrs = cleaner_struct.attributes
169
+
170
+ # Special case to include :all fields
171
+ # Only works with ActiveRecord based models, not supported with regular Ruby models
172
+ if attrs.include?(:all) && defined?(ActiveRecord) && respond_to?(:attributes)
173
+ attrs = attributes.keys.collect{|i| i.to_sym}
174
+ attrs.delete(:id)
175
+
176
+ # Remove serialized_attributes if any, from the :all condition
177
+ if self.class.respond_to?(:serialized_attributes)
178
+ serialized_attrs = self.class.serialized_attributes.keys
179
+ attrs -= serialized_attrs.collect{|i| i.to_sym} if serialized_attrs
180
+ end
167
181
 
168
- # Replace any encrypted attributes with their non-encrypted versions if any
169
- if defined?(SymmetricEncryption) && self.class.respond_to?(:encrypted_attributes)
170
- self.class.encrypted_attributes.each_pair do |clear, encrypted|
171
- if attrs.include?(encrypted.to_sym)
172
- attrs.delete(encrypted.to_sym)
173
- attrs << clear.to_sym
174
- end
182
+ # Replace any encrypted attributes with their non-encrypted versions if any
183
+ if defined?(SymmetricEncryption) && self.class.respond_to?(:encrypted_attributes)
184
+ self.class.encrypted_attributes.each_pair do |clear, encrypted|
185
+ if attrs.include?(encrypted.to_sym)
186
+ attrs.delete(encrypted.to_sym)
187
+ attrs << clear.to_sym
175
188
  end
176
189
  end
190
+ end
177
191
 
178
- # Explicitly remove specified attributes from cleansing
179
- if except = params[:except]
180
- attrs -= except
181
- end
182
-
192
+ # Explicitly remove specified attributes from cleansing
193
+ if except = params[:except]
194
+ attrs -= except
183
195
  end
184
196
 
185
- attrs.each do |attr|
186
- # Under ActiveModel for Rails and Mongoid need to retrieve raw value
187
- # before data type conversion
188
- value = if respond_to?(:read_attribute_before_type_cast) && has_attribute?(attr.to_s)
189
- read_attribute_before_type_cast(attr.to_s)
190
- else
191
- send(attr.to_sym)
192
- end
197
+ end
198
+
199
+ attrs.each do |attr|
200
+ # Under ActiveModel for Rails and Mongoid need to retrieve raw value
201
+ # before data type conversion
202
+ value = if respond_to?(:read_attribute_before_type_cast) && has_attribute?(attr.to_s)
203
+ read_attribute_before_type_cast(attr.to_s)
204
+ else
205
+ send(attr.to_sym)
206
+ end
193
207
 
194
- # No need to clean if attribute is nil
195
- unless value.nil?
196
- new_value = self.class.send(:data_cleansing_clean,cleaner_struct, value, self)
197
-
198
- if new_value != value
199
- # Update value only if it has changed
200
- send("#{attr.to_sym}=".to_sym, new_value)
201
-
202
- # Capture changed attributes
203
- if changes
204
- # Mask sensitive attributes when logging
205
- masked = DataCleansing.masked_attributes.include?(attr.to_sym)
206
- new_value = :masked if masked && !new_value.nil?
207
- if previous = changes[attr.to_sym]
208
- previous[:after] = new_value
209
- else
210
- if new_value.nil? || verbose
211
- changes[attr.to_sym] = {
212
- :before => masked ? :masked : value,
213
- :after => new_value
214
- }
215
- end
208
+ # No need to clean if attribute is nil
209
+ unless value.nil?
210
+ new_value = self.class.send(:data_cleansing_clean,cleaner_struct, value, self)
211
+
212
+ if new_value != value
213
+ # Update value only if it has changed
214
+ send("#{attr.to_sym}=".to_sym, new_value)
215
+
216
+ # Capture changed attributes
217
+ if changes
218
+ # Mask sensitive attributes when logging
219
+ masked = DataCleansing.masked_attributes.include?(attr.to_sym)
220
+ new_value = :masked if masked && !new_value.nil?
221
+ if previous = changes[attr.to_sym]
222
+ previous[:after] = new_value
223
+ else
224
+ if new_value.nil? || verbose
225
+ changes[attr.to_sym] = {
226
+ :before => masked ? :masked : value,
227
+ :after => new_value
228
+ }
216
229
  end
217
230
  end
218
231
  end
219
232
  end
220
-
221
233
  end
222
234
  end
223
235
  end
224
- nil
236
+ changes
225
237
  end
238
+
226
239
  end
227
240
 
228
241
  def self.included(base)
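The hunk above makes `cleanse_attributes!` return a hash of changes, keyed by attribute, with `:before` and `:after` values. A minimal sketch of the capture rules follows, assuming they work as the added lines read: an attribute is recorded when it was cleansed to nil or when `verbose` is true, masked attributes are reported as `:masked`, and a second change to an already recorded attribute only updates `:after`. `record_change` is a hypothetical helper written for this illustration, not a method of the gem.

```ruby
# Hypothetical helper mirroring the change-capture rules in the hunk above
def record_change(changes, attr, before, after, verbose, masked_attributes = [])
  masked = masked_attributes.include?(attr)
  # Never expose the new value of a masked attribute, unless it is nil
  after = :masked if masked && !after.nil?

  if (previous = changes[attr])
    # Attribute already recorded by an earlier cleaner: keep the first :before
    previous[:after] = after
  elsif after.nil? || verbose
    changes[attr] = { :before => (masked ? :masked : before), :after => after }
  end
  changes
end

# verbose = false: only attributes cleansed to nil are captured
quiet = {}
record_change(quiet, :first_name, ' joe ', 'joe', false)
record_change(quiet, :title, '   ', nil, false)

# verbose = true: every modified attribute is captured
detailed = {}
record_change(detailed, :first_name, ' joe ', 'joe', true)
record_change(detailed, :first_name, 'joe', 'Joe', true)
record_change(detailed, :ssn, ' 123 ', '123', true, [:ssn])
```

Here `quiet` ends up holding only `:title`, while `detailed` holds `:first_name` (from `' joe '` to `'Joe'`) and a fully masked `:ssn` entry.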
@@ -1,3 +1,3 @@
1
1
  module DataCleansing
2
- VERSION = "0.6.0"
2
+ VERSION = "0.6.1"
3
3
  end
data/test/test_db.sqlite3 CHANGED
Binary file
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: data_cleansing
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.6.0
4
+ version: 0.6.1
5
5
  platform: ruby
6
6
  authors:
7
7
  - Reid Morrison
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2013-07-30 00:00:00.000000000 Z
11
+ date: 2013-08-29 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: thread_safe