column_anonymizer 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: '01355341091ad9946164e31cd2b99be788e6c0c4feeaf57e660ec626dbe90ccd'
4
+ data.tar.gz: faf09021710551d4b0524da501b92e2d8ab263a4a1bc0a02069683a5afe3601d
5
+ SHA512:
6
+ metadata.gz: ccffa458954da6a422ae3a0d9cfe890fc9a2587d4f4df00b3bf1dc39cbfd2f322089c1ec563e5c8a2fb18d51781bcdeb8a484ee6479b6fc3ebcce245c86e1c87
7
+ data.tar.gz: 067fb6ee7eb334491934d792b9e7f84890bb6f81626b53bb1797850f373d3ae749ced9bf4757dd9d8ccbcb6a9f40a838c60eebe6e8b99716dfcc661de6cb9f98
data/.rspec ADDED
@@ -0,0 +1 @@
1
+ --require spec_helper
data/.rspec_status ADDED
@@ -0,0 +1,15 @@
1
+ example_id | status | run_time |
2
+ ----------------------------------------- | ------ | --------------- |
3
+ ./spec/column_anonymizer_spec.rb[1:1] | passed | 0.00027 seconds |
4
+ ./spec/column_anonymizer_spec.rb[1:2:1] | passed | 0.00042 seconds |
5
+ ./spec/column_anonymizer_spec.rb[1:2:2] | passed | 0.00003 seconds |
6
+ ./spec/column_anonymizer_spec.rb[1:2:3] | passed | 0.00002 seconds |
7
+ ./spec/column_anonymizer_spec.rb[1:3:1] | passed | 0.00034 seconds |
8
+ ./spec/column_anonymizer_spec.rb[1:4:1:1] | passed | 0.00099 seconds |
9
+ ./spec/column_anonymizer_spec.rb[1:4:1:2] | passed | 0.71609 seconds |
10
+ ./spec/column_anonymizer_spec.rb[1:4:1:3] | passed | 0.00079 seconds |
11
+ ./spec/column_anonymizer_spec.rb[1:4:1:4] | passed | 0.00008 seconds |
12
+ ./spec/column_anonymizer_spec.rb[1:4:1:5] | passed | 0.00013 seconds |
13
+ ./spec/column_anonymizer_spec.rb[1:4:1:6] | passed | 0.00026 seconds |
14
+ ./spec/column_anonymizer_spec.rb[1:4:2:1] | passed | 0.00003 seconds |
15
+ ./spec/column_anonymizer_spec.rb[1:4:2:2] | passed | 0.00003 seconds |
data/CHANGELOG.md ADDED
@@ -0,0 +1,49 @@
1
+ ## [Unreleased]
2
+
3
+ ### Added
4
+ - **Rake Tasks for Bulk Anonymization**: Process large datasets efficiently
5
+ - `rake column_anonymizer:anonymize_all` - Anonymize all models
6
+ - `rake column_anonymizer:anonymize_model[ModelName]` - Anonymize specific model
7
+ - `rake column_anonymizer:anonymize_where[Model,'condition']` - Conditional anonymization
8
+ - `rake column_anonymizer:preview` - Dry run without changes
9
+ - `rake column_anonymizer:stats` - Show statistics
10
+ - Progress tracking for long-running operations
11
+ - Error handling and recovery
12
+ - Confirmation prompts for conditional anonymization
13
+ - **Custom Anonymization Generators**: Register your own anonymization types
14
+ - `ColumnAnonymizer::Anonymizer.register(type, &block)` API
15
+ - Support for domain-specific data formats (credit cards, employee IDs, etc.)
16
+ - Flexible registration with blocks, lambdas, or callable objects
17
+ - New `rails generate column_anonymizer:initializer` command
18
+ - Comprehensive API: `register`, `unregister`, `all_generators`, `generator_exists?`
19
+ - Custom generators merge with built-in generators seamlessly
20
+ - **Automatic Model Scanner**: New `rails generate column_anonymizer:scan` command
21
+ - Automatically discovers all models with `encrypts` calls
22
+ - Intelligently guesses data types based on column names (email, phone, ssn, etc.)
23
+ - **Append-only mode**: Preserves existing formatting, comments, and order
24
+ - Inserts new columns under existing models
25
+ - Appends new models at the end of the file
26
+ - Shows detailed feedback about discovered and skipped columns
27
+ - `--scan` option for install generator: `rails generate column_anonymizer:install --scan`
28
+ - Comprehensive type guessing for common column name patterns
29
+
30
+ ### Changed
31
+ - Anonymizer now uses `BUILT_IN_GENERATORS` constant instead of `GENERATORS`
32
+ - `all_generators` method merges built-in and custom generators dynamically
33
+ - Scan generator now uses append-only strategy instead of regenerating entire file
34
+ - Preserves comments and custom formatting in YAML
35
+ - Git-friendly with minimal diffs
36
+ - Safe for team environments with organized config files
37
+ - Updated install generator to suggest scan option
38
+ - Enhanced README with scan generator and custom generators documentation
39
+
40
+ ## [0.1.0] - 2026-02-04
41
+
42
+ - Initial release with YAML-based configuration
43
+ - Add `SchemaLoader` to load encrypted column types from `config/encrypted_columns.yml`
44
+ - Add `Encryptable` module that reads column types from YAML schema
45
+ - Add `Anonymizer` class with built-in generators for common data types (email, phone, SSN, name, address, etc.)
46
+ - Add `anonymize_model` and `anonymize_model!` methods for intelligent data anonymization
47
+ - Add Rails generator for installing YAML configuration file
48
+ - Seamless integration with Rails 7+ Active Record Encryption via Railtie
49
+ - Support for all standard `encrypts` method options
@@ -0,0 +1,507 @@
1
+ # ✅ Custom Generators Implementation Complete
2
+
3
+ ## 🎯 Your Request
4
+ **"I want a way to add custom anonymization types from within the rails app. I am thinking it would be some ruby class somewhere that would define a custom type then it would get appended when we anonymize the data"**
5
+
6
+ ## ✅ Delivered!
7
+
8
+ You can now register custom anonymization types from within your Rails app using a simple, flexible API.
9
+
10
+ ---
11
+
12
+ ## 📦 What Was Created
13
+
14
+ ### 1. **Enhanced Anonymizer Class**
15
+ **File**: `lib/column_anonymizer/anonymizer.rb`
16
+
17
+ **New Features:**
18
+ - `register(type, &block)` - Register custom generators
19
+ - `unregister(type)` - Remove custom generators
20
+ - `all_generators` - Get built-in + custom generators merged
21
+ - `generator_exists?(type)` - Check if a generator exists
22
+ - `reset_custom_generators!` - Clear custom generators (for testing)
23
+ - `BUILT_IN_GENERATORS` constant (renamed from `GENERATORS`)
24
+ - `@custom_generators` class instance variable to store custom types
25
+
26
+ ### 2. **Initializer Generator**
27
+ **File**: `lib/generators/column_anonymizer/initializer/initializer_generator.rb`
28
+
29
+ Command: `rails generate column_anonymizer:initializer`
30
+
31
+ Creates: `config/initializers/column_anonymizer.rb` with:
32
+ - Example custom generators
33
+ - Helpful comments
34
+ - Common patterns
35
+ - List of built-in types
36
+
37
+ ### 3. **Initializer Template**
38
+ **File**: `lib/generators/column_anonymizer/initializer/templates/column_anonymizer.rb`
39
+
40
+ Includes examples for:
41
+ - Simple masking (credit cards)
42
+ - Using Faker
43
+ - Custom business logic
44
+ - Format-specific patterns
45
+ - Using Rails models
46
+ - Date/time generators
47
+ - Complex multi-field logic
48
+
49
+ ### 4. **Comprehensive Documentation**
50
+ - `CUSTOM_GENERATORS_GUIDE.md` (500+ lines) - Complete guide
51
+ - `CUSTOM_GENERATORS_QUICK_REF.md` - Quick reference
52
+ - Updated `README.md` - Custom generators section
53
+ - Updated `CHANGELOG.md` - Feature documentation
54
+
55
+ ---
56
+
57
+ ## 🚀 How It Works
58
+
59
+ ### Step 1: Generate Initializer
60
+ ```bash
61
+ rails generate column_anonymizer:initializer
62
+ ```
63
+
64
+ ### Step 2: Register Custom Types
65
+ ```ruby
66
+ # config/initializers/column_anonymizer.rb
67
+
68
+ ColumnAnonymizer::Anonymizer.register(:credit_card) do
69
+ "XXXX-XXXX-XXXX-#{rand(1000..9999)}"
70
+ end
71
+
72
+ ColumnAnonymizer::Anonymizer.register(:employee_id) do
73
+ "EMP-#{Time.now.year}-#{rand(10000..99999)}"
74
+ end
75
+ ```
76
+
77
+ ### Step 3: Use in YAML Config
78
+ ```yaml
79
+ # config/encrypted_columns.yml
80
+ User:
81
+   credit_card_number: credit_card  # ← Custom type!
82
+   employee_number: employee_id     # ← Custom type!
83
+ ```
84
+
85
+ ### Step 4: Anonymize
86
+ ```ruby
87
+ user = User.first
88
+ ColumnAnonymizer::Anonymizer.anonymize_model!(user)
89
+
90
+ # credit_card_number => "XXXX-XXXX-XXXX-4823"
91
+ # employee_number => "EMP-2026-47293"
92
+ ```
93
+
94
+ ---
95
+
96
+ ## 🎨 Registration Patterns
97
+
98
+ ### Block Syntax (Recommended)
99
+ ```ruby
100
+ ColumnAnonymizer::Anonymizer.register(:custom_type) do
101
+ "VALUE-#{SecureRandom.hex(4)}"
102
+ end
103
+ ```
104
+
105
+ ### Lambda Syntax
106
+ ```ruby
107
+ ColumnAnonymizer::Anonymizer.register(:uuid, -> { SecureRandom.uuid })
108
+ ```
109
+
110
+ ### Callable Object
111
+ ```ruby
112
+ class EmployeeIdGenerator
113
+ def self.call
114
+ "EMP-#{Time.now.year}-#{rand(10000..99999)}"
115
+ end
116
+ end
117
+
118
+ ColumnAnonymizer::Anonymizer.register(:employee_id, EmployeeIdGenerator)
119
+ ```
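The callable in the example above is a class with `self.call`, but a callable can also be an instance that carries state. The sketch below assumes the registry only ever invokes `call` with no arguments; `SequentialIdGenerator` is a hypothetical class for illustration and is not part of the gem.

```ruby
# Hypothetical stateful generator: an instance (not a class) responds to #call.
class SequentialIdGenerator
  def initialize(prefix, start: 1)
    @prefix = prefix
    @next = start
    @lock = Mutex.new # guards the counter if generators run on several threads
  end

  # Returns the next ID, zero-padded to five digits.
  def call
    @lock.synchronize do
      id = format("%s-%05d", @prefix, @next)
      @next += 1
      id
    end
  end
end

generator = SequentialIdGenerator.new("EMP")
first_id  = generator.call # => "EMP-00001"
second_id = generator.call # => "EMP-00002"
```

If the gem accepts any object that responds to `call`, as the "callable objects" wording suggests, such an instance could be passed as the second argument to `register`. That is an assumption worth checking against `lib/column_anonymizer/anonymizer.rb`.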
120
+
121
+ ---
122
+
123
+ ## 📋 Example Custom Generators
124
+
125
+ ### Credit Card Masking
126
+ ```ruby
127
+ ColumnAnonymizer::Anonymizer.register(:credit_card) do
128
+ "XXXX-XXXX-XXXX-#{rand(1000..9999)}"
129
+ end
130
+ ```
131
+
132
+ ### License Plate
133
+ ```ruby
134
+ ColumnAnonymizer::Anonymizer.register(:license_plate) do
135
+ letters = ('A'..'Z').to_a.sample(3).join
136
+ numbers = rand(100..999)
137
+ "#{letters}-#{numbers}"
138
+ end
139
+ ```
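One detail of the license plate example is easy to miss: `Array#sample(3)` draws without replacement, so the three letters are always distinct. If repeated letters should be possible, each position can be drawn independently. A small sketch of both variants (the `LETTERS` constant is introduced here only for illustration):

```ruby
LETTERS = ('A'..'Z').to_a.freeze

# Array#sample(3) picks three distinct elements, so "AAB" can never appear.
distinct = LETTERS.sample(3).join

# Drawing each position on its own allows repeated letters.
with_repeats = Array.new(3) { LETTERS.sample }.join

plate = "#{with_repeats}-#{rand(100..999)}"
```

Which variant is preferable depends on whether the anonymized values need to cover the full range of real plate formats.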
140
+
141
+ ### Account Number with Format
142
+ ```ruby
143
+ ColumnAnonymizer::Anonymizer.register(:account_number) do
144
+ prefix = "ACC"
145
+ year = Time.now.year
146
+ sequence = rand(10000..99999)
147
+ "#{prefix}#{year}#{sequence}"
148
+ end
149
+ ```
150
+
151
+ ### Using Faker
152
+ ```ruby
153
+ ColumnAnonymizer::Anonymizer.register(:company_name) do
154
+ Faker::Company.name
155
+ end
156
+ ```
157
+
158
+ ### From Database
159
+ ```ruby
160
+ ColumnAnonymizer::Anonymizer.register(:department_name) do
161
+ Department.active.pluck(:name).sample || "General"
162
+ end
163
+ ```
164
+
165
+ ---
166
+
167
+ ## 🔧 API Reference
168
+
169
+ ### Register a Generator
170
+ ```ruby
171
+ ColumnAnonymizer::Anonymizer.register(type, generator = nil, &block)
172
+
173
+ # With block
174
+ register(:type) { "value" }
175
+
176
+ # With lambda
177
+ register(:type, -> { "value" })
178
+
179
+ # With callable
180
+ register(:type, MyGenerator)
181
+ ```
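The signature `register(type, generator = nil, &block)` suggests that a positional callable and a block are two routes to the same stored value. The following is a minimal sketch of how such a method could behave, not the gem's actual code: `SketchRegistry`, the `ArgumentError` check, and the symbol normalization are all assumptions made for illustration.

```ruby
# Illustrative stand-in for a generator registry; not the gem's implementation.
module SketchRegistry
  @custom = {}

  class << self
    # Accepts either a callable as the second argument or a block.
    def register(type, generator = nil, &block)
      callable = generator || block
      unless callable.respond_to?(:call)
        raise ArgumentError, "generator for #{type.inspect} must respond to #call"
      end
      @custom[type.to_sym] = callable
    end

    def unregister(type)
      @custom.delete(type.to_sym)
    end

    def generator_exists?(type)
      @custom.key?(type.to_sym)
    end
  end
end

SketchRegistry.register(:random_id) { "id-#{rand(1000..9999)}" }
SketchRegistry.register(:fixed, -> { "REDACTED" })
```

Rejecting non-callables at registration time, as this sketch does, surfaces a misconfigured initializer at boot rather than in the middle of an anonymization run.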
182
+
183
+ ### Unregister a Generator
184
+ ```ruby
185
+ ColumnAnonymizer::Anonymizer.unregister(:type)
186
+ ```
187
+
188
+ ### Check if Generator Exists
189
+ ```ruby
190
+ ColumnAnonymizer::Anonymizer.generator_exists?(:type)
191
+ # => true/false
192
+ ```
193
+
194
+ ### Get All Generators
195
+ ```ruby
196
+ ColumnAnonymizer::Anonymizer.all_generators
197
+ # => { email: #<Proc>, phone: #<Proc>, credit_card: #<Proc>, ... }
198
+ ```
199
+
200
+ ### Reset Custom Generators (Testing)
201
+ ```ruby
202
+ ColumnAnonymizer::Anonymizer.reset_custom_generators!
203
+ ```
204
+
205
+ ---
206
+
207
+ ## 🧪 How Custom Generators Merge
208
+
209
+ ### Built-in Generators
210
+ ```ruby
211
+ BUILT_IN_GENERATORS = {
212
+ email: -> { Faker::Internet.email },
213
+ phone: -> { Faker::PhoneNumber.phone_number },
214
+ ssn: -> { Faker::IdNumber.ssn_valid },
215
+ name: -> { Faker::Name.name },
216
+ first_name: -> { Faker::Name.first_name },
217
+ last_name: -> { Faker::Name.last_name },
218
+ address: -> { Faker::Address.full_address },
219
+ text: -> { Faker::Lorem.paragraph }
220
+ }
221
+ ```
222
+
223
+ ### Custom Generators
224
+ ```ruby
225
+ @custom_generators = {
226
+ credit_card: -> { "XXXX-XXXX-XXXX-#{rand(1000..9999)}" },
227
+ employee_id: -> { "EMP-#{Time.now.year}-#{rand(10000..99999)}" }
228
+ }
229
+ ```
230
+
231
+ ### Merged (all_generators)
232
+ ```ruby
233
+ {
234
+ # Built-ins
235
+ email: -> { ... },
236
+ phone: -> { ... },
237
+ ssn: -> { ... },
238
+ name: -> { ... },
239
+ first_name: -> { ... },
240
+ last_name: -> { ... },
241
+ address: -> { ... },
242
+ text: -> { ... },
243
+
244
+ # Custom
245
+ credit_card: -> { ... },
246
+ employee_id: -> { ... }
247
+ }
248
+ ```
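The merged view relies on standard `Hash#merge` semantics: a new hash is returned, and for duplicate keys the value from the argument wins. The sketch below uses placeholder lambdas (not the gem's real generators) to show what happens when a custom type reuses a built-in key.

```ruby
built_in = {
  email: -> { "builtin@example.com" },
  phone: -> { "555-0100" }
}.freeze

custom = {
  email: -> { "custom@example.com" }, # same key as a built-in
  credit_card: -> { "XXXX-XXXX-XXXX-0000" }
}

# Hash#merge is non-destructive, so it works on a frozen receiver.
merged = built_in.merge(custom)

merged.keys           # => [:email, :phone, :credit_card]
merged[:email].call   # => "custom@example.com"
built_in[:email].call # => "builtin@example.com"
```

Because the built-in hash is never mutated, unregistering a custom `:email` generator would presumably restore the built-in behavior on the next merge.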
249
+
250
+ ---
251
+
252
+ ## 🔄 Complete Workflow
253
+
254
+ ### 1. Setup
255
+ ```bash
256
+ # Install gem
257
+ echo "gem 'column_anonymizer'" >> Gemfile
258
+ bundle install
259
+
260
+ # Generate config
261
+ rails g column_anonymizer:install --scan
262
+
263
+ # Generate initializer for custom types
264
+ rails g column_anonymizer:initializer
265
+ ```
266
+
267
+ ### 2. Define Custom Types
268
+ ```ruby
269
+ # config/initializers/column_anonymizer.rb
270
+
271
+ ColumnAnonymizer::Anonymizer.register(:credit_card) do
272
+ "XXXX-XXXX-XXXX-#{rand(1000..9999)}"
273
+ end
274
+
275
+ ColumnAnonymizer::Anonymizer.register(:employee_id) do
276
+ "EMP-#{Time.now.year}-#{rand(10000..99999)}"
277
+ end
278
+
279
+ ColumnAnonymizer::Anonymizer.register(:medical_record) do
280
+ "MRN#{rand(100000..999999)}"
281
+ end
282
+ ```
283
+
284
+ ### 3. Configure YAML
285
+ ```yaml
286
+ # config/encrypted_columns.yml
287
+ User:
288
+   email: email                           # Built-in
289
+   credit_card_number: credit_card        # Custom!
290
+
291
+ Employee:
292
+   employee_number: employee_id           # Custom!
293
+   ssn: ssn                               # Built-in
294
+
295
+ Patient:
296
+   medical_record_number: medical_record  # Custom!
297
+ ```
298
+
299
+ ### 4. Use in Models
300
+ ```ruby
301
+ class User < ApplicationRecord
302
+ encrypts :email
303
+ encrypts :credit_card_number
304
+ end
305
+
306
+ class Employee < ApplicationRecord
307
+ encrypts :employee_number
308
+ encrypts :ssn
309
+ end
310
+
311
+ class Patient < ApplicationRecord
312
+ encrypts :medical_record_number
313
+ end
314
+ ```
315
+
316
+ ### 5. Anonymize
317
+ ```ruby
318
+ user = User.first
319
+ ColumnAnonymizer::Anonymizer.anonymize_model!(user)
320
+
321
+ # Before:
322
+ # email: "john@example.com"
323
+ # credit_card_number: "4532-1234-5678-9012"
324
+
325
+ # After:
326
+ # email: "user_abc123@example.com"
327
+ # credit_card_number: "XXXX-XXXX-XXXX-7482"
328
+ ```
329
+
330
+ ---
331
+
332
+ ## ✨ Benefits
333
+
334
+ ### Flexibility
335
+ - ✅ Define any custom format
336
+ - ✅ Use Faker, Rails models, or custom logic
337
+ - ✅ Domain-specific anonymization
338
+
339
+ ### Simplicity
340
+ - ✅ Easy registration with blocks
341
+ - ✅ No complex class hierarchy
342
+ - ✅ Just define and use
343
+
344
+ ### Power
345
+ - ✅ Access to full Ruby/Rails environment
346
+ - ✅ Can use database queries
347
+ - ✅ Can use external libraries
348
+
349
+ ### Maintainability
350
+ - ✅ All custom types in one place (initializer)
351
+ - ✅ Easy to test
352
+ - ✅ Clear separation from built-ins
353
+
354
+ ---
355
+
356
+ ## 🧪 Testing
357
+
358
+ ```ruby
359
+ # spec/initializers/column_anonymizer_spec.rb
360
+
361
+ RSpec.describe "Custom Anonymizers" do
362
+   before do
363
+     ColumnAnonymizer::Anonymizer.reset_custom_generators!
364
+
365
+     ColumnAnonymizer::Anonymizer.register(:credit_card) do
366
+       "XXXX-XXXX-XXXX-#{rand(1000..9999)}"
367
+     end
368
+   end
369
+
370
+   it "registers custom generator" do
371
+     expect(ColumnAnonymizer::Anonymizer.generator_exists?(:credit_card)).to be true
372
+   end
373
+
374
+   it "generates correct format" do
375
+     generator = ColumnAnonymizer::Anonymizer.all_generators[:credit_card]
376
+     result = generator.call
377
+
378
+     # \A and \z anchor the whole string; ^ and $ would only anchor a line.
379
+     expect(result).to match(/\AXXXX-XXXX-XXXX-\d{4}\z/)
380
+   end
381
+
382
+   it "merges with built-in generators" do
383
+     generators = ColumnAnonymizer::Anonymizer.all_generators
384
+
385
+     expect(generators).to include(:email, :phone, :credit_card)
386
+   end
386
+ end
387
+ ```
388
+
389
+ ---
390
+
391
+ ## 📊 Technical Implementation
392
+
393
+ ### Class Structure
394
+ ```ruby
395
+ class Anonymizer
396
+   # Built-in generators (frozen)
396
+   BUILT_IN_GENERATORS = { ... }.freeze
397
+
398
+   # Custom generators (mutable)
399
+   @custom_generators = {}
400
+
401
+   class << self
402
+     # Registration API
403
+     def register(type, generator = nil, &block)
404
+       # Store in @custom_generators
405
+     end
406
+
407
+     # Merge built-in + custom
408
+     def all_generators
409
+       BUILT_IN_GENERATORS.merge(@custom_generators)
410
+     end
411
+
412
+     # Use merged generators
413
+     def anonymize_model(model_instance)
414
+       generators = all_generators
415
+       # Use generators[column_type]
416
+     end
417
+   end
419
+ end
420
+ ```
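The `anonymize_model` outline above leaves the per-column loop as a comment. One way that step might look is sketched here with stand-ins only: `SketchUser` replaces an Active Record model, `SKETCH_SCHEMA` replaces the mapping loaded from `config/encrypted_columns.yml`, and skipping unknown types is an assumption, not documented gem behavior.

```ruby
# Hypothetical stand-ins for a model, the YAML schema, and the merged generators.
SketchUser = Struct.new(:email, :credit_card_number, keyword_init: true)

SKETCH_SCHEMA = {
  "SketchUser" => { email: :email, credit_card_number: :credit_card }
}.freeze

SKETCH_GENERATORS = {
  email: -> { "user_#{rand(100_000..999_999)}@example.com" },
  credit_card: -> { "XXXX-XXXX-XXXX-#{rand(1000..9999)}" }
}.freeze

# Replace each configured column with a freshly generated value.
def sketch_anonymize(record, schema: SKETCH_SCHEMA, generators: SKETCH_GENERATORS)
  columns = schema.fetch(record.class.name, {})
  columns.each do |column, type|
    generator = generators[type]
    next unless generator # assumption: unknown types leave the column untouched

    record[column] = generator.call
  end
  record
end

user = sketch_anonymize(
  SketchUser.new(email: "john@example.com", credit_card_number: "4532-1234-5678-9012")
)
```

A real implementation would also have to decide whether to persist the record, which is presumably the difference between `anonymize_model` and `anonymize_model!`.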
421
+
422
+ ### Flow
423
+ ```
424
+ 1. Rails boots
425
+ 2. Initializer loads (config/initializers/column_anonymizer.rb)
426
+ 3. Custom generators registered via .register()
427
+ 4. Stored in @custom_generators
428
+ 5. When anonymizing:
429
+ - all_generators called
430
+ - BUILT_IN_GENERATORS.merge(@custom_generators)
431
+ - Merged hash used for anonymization
432
+ 6. Custom types override built-ins if same key
433
+ ```
434
+
435
+ ---
436
+
437
+ ## 📚 Documentation
438
+
439
+ ### Created
440
+ 1. **CUSTOM_GENERATORS_GUIDE.md** (500+ lines)
441
+ - Complete guide with examples
442
+ - All registration patterns
443
+ - Advanced use cases
444
+ - Best practices
445
+ - Troubleshooting
446
+
447
+ 2. **CUSTOM_GENERATORS_QUICK_REF.md**
448
+ - One-page reference
449
+ - Common patterns
450
+ - API methods
451
+ - Quick examples
452
+
453
+ 3. **README.md** - Updated
454
+ - Custom generators section
455
+ - Quick example
456
+ - Link to full guide
457
+
458
+ 4. **CHANGELOG.md** - Updated
459
+ - Feature documented
460
+ - API listed
461
+ - Changes noted
462
+
463
+ ---
464
+
465
+ ## 🎉 Summary
466
+
467
+ You can now add custom anonymization types easily:
468
+
469
+ ### Command
470
+ ```bash
471
+ rails generate column_anonymizer:initializer
472
+ ```
473
+
474
+ ### Register
475
+ ```ruby
476
+ ColumnAnonymizer::Anonymizer.register(:custom_type) do
477
+ # Your anonymization logic
478
+ end
479
+ ```
480
+
481
+ ### Use
482
+ ```yaml
483
+ Model:
484
+   column: custom_type
485
+ ```
486
+
487
+ **Simple, powerful, and flexible!** 🚀
488
+
489
+ ---
490
+
491
+ ## 📦 Files Summary
492
+
493
+ | File | Purpose |
494
+ |------|---------|
495
+ | `lib/column_anonymizer/anonymizer.rb` | Enhanced with registration API |
496
+ | `lib/generators/.../initializer_generator.rb` | Generator command |
497
+ | `lib/generators/.../templates/column_anonymizer.rb` | Template with examples |
498
+ | `CUSTOM_GENERATORS_GUIDE.md` | Complete documentation |
499
+ | `CUSTOM_GENERATORS_QUICK_REF.md` | Quick reference |
500
+ | `README.md` | Updated with custom generators |
501
+ | `CHANGELOG.md` | Feature documented |
502
+
503
+ ---
504
+
505
+ **Status:** ✅ **COMPLETE AND READY TO USE**
506
+ **Date:** February 5, 2026
507
+ **Implementation:** Custom generator registration system