orfeas_pam_dsl 0.6.0

data/README.md ADDED
@@ -0,0 +1,1365 @@
1
+ # PAM DSL - Privacy Attribute Matrix DSL
2
+
3
+ A declarative Domain-Specific Language (DSL) for defining privacy policies, PII fields, consent requirements, and retention rules using the Privacy Attribute Matrix (PAM) model for privacy-aware event monitoring.
4
+
5
+ ## Overview
6
+
7
+ PAM DSL provides a fluent, expressive way to define privacy policies using the Privacy Attribute Matrix (PAM) model. It can be used by privacy-aware monitoring systems like Lyra and helps you:
8
+
9
+ - **Define PII fields** with type and sensitivity classification
10
+ - **Specify processing purposes** with legal bases (GDPR-compliant)
11
+ - **Configure retention policies** with field-level granularity
12
+ - **Manage consent requirements** with expiration and granular control
13
+ - **Validate data access** against defined policies
14
+
15
+ ## Installation
16
+
17
+ Add to your Gemfile:
18
+
19
+ ```ruby
20
+ gem 'pam_dsl', path: 'gems/pam_dsl' # For monorepo
21
+ ```
22
+
23
+ ## Quick Start
24
+
25
+ ```ruby
26
+ require 'pam_dsl'
27
+
28
+ # Define a privacy policy
29
+ PamDsl.define_policy :user_data do
30
+ # Define PII fields
31
+ field :email, type: :email, sensitivity: :internal do
32
+ allow_for :authentication, :communication, :marketing
33
+ transform :display do |value|
34
+ "#{value[0]}***@#{value.split('@').last}"
35
+ end
36
+ end
37
+
38
+ field :ssn, type: :ssn, sensitivity: :restricted do
39
+ allow_for :identity_verification
40
+ transform :display do |value|
41
+ "***-**-#{value[-4..]}"
42
+ end
43
+ end
44
+
45
+ # Define processing purposes
46
+ purpose :authentication do
47
+ describe "User authentication and session management"
48
+ basis :contract
49
+ requires :email
50
+ end
51
+
52
+ purpose :marketing do
53
+ describe "Marketing communications and newsletters"
54
+ basis :consent
55
+ requires :email
56
+ optionally :name, :preferences
57
+ end
58
+
59
+ # Configure retention
60
+ retention do
61
+ default 7.years
62
+
63
+ for_model 'User' do
64
+ keep_for 10.years
65
+ field :email, duration: 2.years
66
+ on_expiry :anonymize
67
+ end
68
+ end
69
+
70
+ # Configure consent
71
+ consent do
72
+ for_purpose :marketing do
73
+ required!
74
+ granular!
75
+ withdrawable!
76
+ expires_in 1.year
77
+ describe "We'll send you product updates and offers"
78
+ end
79
+ end
80
+ end
81
+ ```
82
+
83
+ ## Detailed Usage
84
+
85
+ ### Defining Fields
86
+
87
+ Fields represent PII data with type classification and sensitivity levels:
88
+
89
+ ```ruby
90
+ PamDsl.define_policy :my_policy do
91
+ # Basic field definition
92
+ field :email, type: :email, sensitivity: :internal
93
+
94
+ # Field with allowed purposes
95
+ field :phone, type: :phone, sensitivity: :confidential do
96
+ allow_for :contact, :verification
97
+ end
98
+
99
+ # Field with transformations
100
+ field :credit_card, type: :credit_card, sensitivity: :restricted do
101
+ allow_for :payment_processing
102
+
103
+ # Transform for display
104
+ transform :display do |value|
105
+ "****-****-****-#{value[-4..]}"
106
+ end
107
+
108
+ # Transform for logging
109
+ transform :log do |value|
110
+ "***REDACTED***"
111
+ end
112
+
113
+ # Add metadata
114
+ meta :encryption_required, true
115
+ meta :pci_dss_scope, true
116
+ end
117
+ end
118
+ ```
119
+
120
+ **Sensitivity Levels:**
121
+ - `:public` - Publicly accessible
122
+ - `:internal` - Internal use only
123
+ - `:confidential` - Sensitive, requires protection
124
+ - `:restricted` - Highly restricted access
125
+
126
+ See [Sensitivity Levels and Legislative Background](#sensitivity-levels-and-legislative-background) for detailed regulatory mapping.
127
+
128
+ **PII Types:**
129
+ `:email`, `:name`, `:phone`, `:address`, `:ssn`, `:date_of_birth`, `:ip_address`, `:credit_card`, `:financial`, `:health`, `:biometric`, `:location`, `:identifier`, `:custom`
130
+
131
+ ### Defining Purposes
132
+
133
+ Purposes represent why you process personal data, aligned with GDPR requirements:
134
+
135
+ ```ruby
136
+ PamDsl.define_policy :my_policy do
137
+ purpose :account_management do
138
+ describe "Creating and managing user accounts"
139
+ basis :contract # GDPR Article 6(1)(b)
140
+ requires :email, :password_hash
141
+ optionally :phone, :preferences
142
+ end
143
+
144
+ purpose :analytics do
145
+ describe "Improving service quality and user experience"
146
+ basis :legitimate_interests # GDPR Article 6(1)(f)
147
+ requires :user_id
148
+ optionally :session_data, :interaction_events
149
+ meta :balancing_test_performed, true
150
+ meta :data_minimization, true
151
+ end
152
+
153
+ purpose :legal_compliance do
154
+ describe "Complying with tax and financial regulations"
155
+ basis :legal_obligation # GDPR Article 6(1)(c)
156
+ requires :transaction_history, :financial_records
157
+ end
158
+ end
159
+ ```
160
+
161
+ **Legal Bases (GDPR Article 6):**
162
+ - `:consent` - Data subject has given consent
163
+ - `:contract` - Processing necessary for contract
164
+ - `:legal_obligation` - Compliance with legal obligation
165
+ - `:vital_interests` - Protection of vital interests
166
+ - `:public_task` - Task in public interest
167
+ - `:legitimate_interests` - Legitimate interests
168
+
169
+ ### Retention Policies
170
+
171
+ Define how long data should be retained:
172
+
173
+ ```ruby
174
+ PamDsl.define_policy :my_policy do
175
+ retention do
176
+ # Set default retention
177
+ default 5.years
178
+
179
+ # Model-specific retention
180
+ for_model 'User' do
181
+ keep_for 7.years
182
+
183
+ # Field-level overrides
184
+ field :email, duration: 2.years
185
+ field :payment_info, duration: 10.years
186
+
187
+ # Deletion strategy
188
+ on_expiry :anonymize # or :hard_delete, :soft_delete, :archive
189
+ end
190
+
191
+ # Conditional retention
192
+ for_model 'Transaction' do
193
+ keep_for 10.years
194
+ when do |context|
195
+ context[:transaction_type] == 'financial'
196
+ end
197
+ end
198
+ end
199
+ end
200
+ ```
201
+
202
+ **Deletion Strategies:**
203
+ - `:hard_delete` - Permanently delete data
204
+ - `:soft_delete` - Mark as deleted but keep data
205
+ - `:anonymize` - Remove PII while keeping aggregated data
206
+ - `:archive` - Move to long-term storage
207
+
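The Quick Start policy suggests a precedence order for retention: a field-level `duration` overrides the model's `keep_for`, which in turn overrides `default`. The sketch below illustrates that lookup with plain hashes and integer years. It is not the gem's implementation; `RETENTION` and `retention_years` are hypothetical names, and the model-over-default fallback is an assumption inferred from the examples.

```ruby
# Hypothetical, simplified rule table (years as plain integers).
RETENTION = {
  default_years: 7,
  models: {
    "User" => { keep_for_years: 10, fields: { email: 2 } }
  }
}.freeze

# Resolve the retention period: field override, then model rule, then default.
def retention_years(model, field_name: nil, rules: RETENTION)
  model_rule = rules[:models][model]
  return rules[:default_years] unless model_rule

  # fetch falls back to the model-wide period when no field override exists
  model_rule[:fields].fetch(field_name, model_rule[:keep_for_years])
end

retention_years("User", field_name: :email) # => 2
retention_years("User")                     # => 10
retention_years("Order")                    # => 7
```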
208
+ ### Consent Management
209
+
210
+ Define consent requirements for purposes:
211
+
212
+ ```ruby
213
+ PamDsl.define_policy :my_policy do
214
+ consent do
215
+ for_purpose :marketing do
216
+ required! # Must have consent
217
+ granular! # Allow fine-grained consent
218
+ withdrawable! # Can be withdrawn anytime
219
+ expires_in 2.years # Consent expires after 2 years
220
+ describe "We'll send you marketing emails about new products"
221
+ end
222
+
223
+ for_purpose :analytics do
224
+ required! false # Optional consent
225
+ granular!
226
+ describe "Help us improve by sharing anonymous usage data"
227
+ end
228
+ end
229
+ end
230
+ ```
231
+
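To make the effect of `expires_in` concrete, the following sketch shows one way a consent-expiry check could work. It is an illustration under stated assumptions, not the gem's logic: `consent_valid?` and `SECONDS_PER_YEAR` are hypothetical, and a year is approximated as 365 days.

```ruby
SECONDS_PER_YEAR = 365 * 24 * 60 * 60 # approximation; ignores leap years

# Consent counts only if it was granted and has not yet expired.
def consent_valid?(granted:, granted_at:, expires_in_years:, now: Time.now)
  return false unless granted && granted_at

  now < granted_at + expires_in_years * SECONDS_PER_YEAR
end

now = Time.utc(2025, 1, 1)
consent_valid?(granted: true, granted_at: Time.utc(2024, 7, 1), expires_in_years: 1, now: now) # => true
consent_valid?(granted: true, granted_at: Time.utc(2023, 6, 1), expires_in_years: 1, now: now) # => false
```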
232
+ ### Custom Attributes (Metadata)
233
+
234
+ PAM DSL supports custom attributes via the `meta(key, value)` method at three levels: **policy**, **field**, and **purpose**. This allows you to extend policies with application-specific data without modifying the DSL core.
235
+
236
+ #### Policy-Level Metadata
237
+
238
+ Add organization-wide or policy-specific attributes:
239
+
240
+ ```ruby
241
+ PamDsl.define_policy :my_app do
242
+ # Policy-level custom attributes
243
+ meta :organization, "University of the Aegean"
244
+ meta :dpo_email, "dpo@aegean.gr"
245
+ meta :policy_version, "2.1"
246
+ meta :gdpr_compliant, true
247
+ meta :last_review_date, "2024-01-15"
248
+ meta :next_review_date, "2025-01-15"
249
+
250
+ # ... fields, purposes, etc.
251
+ end
252
+
253
+ # Access policy metadata
254
+ policy = PamDsl.policy(:my_app)
255
+ policy.metadata[:organization] # => "University of the Aegean"
256
+ policy.metadata[:gdpr_compliant] # => true
257
+ ```
258
+
259
+ #### Field-Level Metadata
260
+
261
+ Add field-specific attributes for compliance, encryption requirements, or custom categorization:
262
+
263
+ ```ruby
264
+ field :email, type: :email, sensitivity: :confidential do
265
+ allow_for :authentication, :marketing
266
+
267
+ # Custom attributes
268
+ meta :pii_category, "direct_identifier"
269
+ meta :encryption_required, true
270
+ meta :encryption_algorithm, "AES-256"
271
+ meta :anonymization_method, "hash_prefix"
272
+ meta :data_controller, "IT Department"
273
+ meta :cross_border_transfer, false
274
+ meta :third_party_sharing, ["analytics_provider"]
275
+ end
276
+
277
+ field :credit_card, type: :credit_card, sensitivity: :restricted do
278
+ allow_for :payment_processing
279
+
280
+ meta :pci_dss_scope, true
281
+ meta :tokenization_required, true
282
+ meta :storage_allowed, false # Store token only, not actual card
283
+ meta :processor, "Stripe"
284
+ end
285
+
286
+ # Access field metadata
287
+ field = policy.get_field(:email)
288
+ field.metadata[:encryption_required] # => true
289
+ field.metadata[:pii_category] # => "direct_identifier"
290
+ ```
291
+
292
+ #### Purpose-Level Metadata
293
+
294
+ Document compliance details for each processing purpose:
295
+
296
+ ```ruby
297
+ purpose :analytics do
298
+ describe "Aggregated analytics for service improvement"
299
+ basis :legitimate_interests
300
+ requires :usage_data
301
+ optionally :device_info
302
+
303
+ # Legitimate Interest Assessment (LIA) documentation
304
+ meta :lia_conducted, true
305
+ meta :lia_date, "2024-01-15"
306
+ meta :lia_outcome, "Approved - minimal privacy impact"
307
+ meta :balancing_test, "User benefit outweighs minimal data use"
308
+
309
+ # Data minimization
310
+ meta :data_minimization_review, "quarterly"
311
+ meta :aggregation_level, "daily"
312
+ meta :individual_identification, false
313
+ end
314
+
315
+ purpose :fraud_detection do
316
+ describe "Detecting and preventing fraudulent transactions"
317
+ basis :legitimate_interests
318
+ requires :transaction_data, :device_fingerprint
319
+
320
+ meta :automated_decision_making, true
321
+ meta :human_review_available, true
322
+ meta :profiling, true
323
+ meta :impact_assessment_required, true
324
+ meta :dpia_reference, "DPIA-2024-003"
325
+ end
326
+
327
+ # Access purpose metadata
328
+ purpose = policy.get_purpose(:analytics)
329
+ purpose.metadata[:lia_conducted] # => true
330
+ purpose.metadata[:dpia_reference] # => nil (set on :fraud_detection, not on :analytics)
331
+ ```
332
+
333
+ #### Common Use Cases for Metadata
334
+
335
+ | Use Case | Level | Example Keys |
336
+ |----------|-------|--------------|
337
+ | Compliance tracking | Policy | `:gdpr_compliant`, `:ccpa_compliant`, `:last_audit_date` |
338
+ | Encryption requirements | Field | `:encryption_required`, `:encryption_algorithm`, `:key_rotation` |
339
+ | Data categorization | Field | `:pii_category`, `:special_category_data`, `:children_data` |
340
+ | Cross-border transfers | Field | `:cross_border_transfer`, `:adequacy_decision`, `:sccs_required` |
341
+ | Third-party sharing | Field | `:third_party_sharing`, `:processors`, `:joint_controllers` |
342
+ | LIA documentation | Purpose | `:lia_conducted`, `:lia_date`, `:balancing_test` |
343
+ | DPIA references | Purpose | `:dpia_required`, `:dpia_reference`, `:impact_assessment_date` |
344
+ | Automated decisions | Purpose | `:automated_decision_making`, `:profiling`, `:human_review` |
345
+
346
+ #### Metadata in Exports
347
+
348
+ All metadata is preserved when exporting policies:
349
+
350
+ ```ruby
351
+ policy = PamDsl.policy(:my_app)
352
+ export = policy.to_h
353
+
354
+ # Metadata is included at each level
355
+ export[:metadata] # Policy metadata
356
+ export[:fields][:email][:metadata] # Field metadata
357
+ export[:purposes][:analytics][:metadata] # Purpose metadata
358
+ ```
359
+
360
+ This makes metadata available for compliance reporting, auditing, and integration with external systems.
361
+
362
+ ### Using Policies
363
+
364
+ ```ruby
365
+ # Get a defined policy
366
+ policy = PamDsl.policy(:user_data)
367
+
368
+ # Check if field is allowed for purpose
369
+ policy.allowed?(:email, :marketing) # => true
370
+ policy.allowed?(:ssn, :marketing) # => false
371
+
372
+ # Validate data access
373
+ begin
374
+ policy.validate_access!(
375
+ [:email, :name],
376
+ :marketing,
377
+ consent_granted: true,
378
+ consent_granted_at: 6.months.ago
379
+ )
380
+ rescue PamDsl::ConsentRequiredError => e
381
+ puts "Consent error: #{e.message}"
382
+ end
383
+
384
+ # Get field and apply transformation
385
+ field = policy.get_field(:email)
386
+ masked = field.apply_transformation(:display, "john@example.com")
387
+ # => "j***@example.com"
388
+
389
+ # Get sensitive fields
390
+ policy.sensitive_fields.each do |field|
391
+ puts "#{field.name} is #{field.sensitivity}"
392
+ end
393
+
394
+ # Get retention duration
395
+ duration = policy.retention_for('User', field_name: :email)
396
+ # => 2.years
397
+
398
+ # Export policy as hash
399
+ policy_hash = policy.to_h
400
+ ```
401
+
402
+ ### Advanced Examples
403
+
404
+ #### Multi-Purpose Field
405
+
406
+ ```ruby
407
+ field :email, type: :email, sensitivity: :internal do
408
+ allow_for :authentication, :communication, :account_recovery
409
+
410
+ transform :display do |value|
411
+ local, domain = value.split('@')
412
+ "#{local[0]}***@#{domain}"
413
+ end
414
+
415
+ transform :api_response do |value|
416
+ { email: value, verified: true }
417
+ end
418
+
419
+ meta :required, true
420
+ meta :unique, true
421
+ end
422
+ ```
423
+
424
+ #### Purpose with Complex Requirements
425
+
426
+ ```ruby
427
+ purpose :payment_processing do
428
+ describe "Processing customer payments securely"
429
+ basis :contract
430
+
431
+ requires :billing_address, :payment_method
432
+ optionally :billing_email, :invoice_preferences
433
+
434
+ meta :pci_dss_compliant, true
435
+ meta :encryption_required, true
436
+ meta :audit_logging, true
437
+ end
438
+ ```
439
+
440
+ #### Conditional Retention
441
+
442
+ ```ruby
443
+ retention do
444
+ for_model 'Contract' do
445
+ keep_for 10.years
446
+
447
+ when do |context|
448
+ # Keep active contracts longer
449
+ context[:status] == 'active'
450
+ end
451
+
452
+ on_expiry :archive
453
+ end
454
+
455
+ for_model 'SupportTicket' do
456
+ keep_for 3.years
457
+
458
+ when do |context|
459
+ # Apply the 3-year rule to non-escalated tickets only; escalated tickets are kept longer
460
+ !context[:escalated]
461
+ end
462
+ end
463
+ end
464
+ ```
465
+
466
+ ## Privacy Reporting
467
+
468
+ PAM DSL includes a reporting system that generates GDPR compliance documentation: policy summaries, Article 30 records, and event store analyses.
469
+
470
+ ### Quick Reports via Rake Tasks
471
+
472
+ ```bash
473
+ # Policy summary
474
+ bundle exec rake pam_dsl:report:policy
475
+
476
+ # GDPR Article 30 Records of Processing Activities
477
+ bundle exec rake pam_dsl:report:article_30
478
+
479
+ # Full compliance report
480
+ bundle exec rake pam_dsl:report:full
481
+
482
+ # Export to JSON
483
+ bundle exec rake "pam_dsl:report:export[reports/privacy_report.json]"
484
+
485
+ # PII analysis from event store (requires Lyra)
486
+ bundle exec rake pam_dsl:report:pii
487
+ bundle exec rake pam_dsl:report:retention
488
+ bundle exec rake pam_dsl:report:access_patterns
489
+ ```
490
+
491
+ ### Reporter Class
492
+
493
+ ```ruby
494
+ # Create a reporter
495
+ reporter = PamDsl::Reporter.new(
496
+ :my_policy,
497
+ organization: "My Company",
498
+ dpo_contact: "dpo@example.com",
499
+ event_store: Rails.configuration.event_store # Optional
500
+ )
501
+
502
+ # Generate reports (output to stdout)
503
+ reporter.policy_summary
504
+ reporter.article_30_report
505
+ reporter.full_report
506
+
507
+ # With event store integration
508
+ reporter.pii_analysis # Analyze PII in events
509
+ reporter.retention_check # Check retention compliance
510
+ reporter.access_patterns # Show access patterns by hour/operation
511
+
512
+ # Export
513
+ reporter.export_json("reports/privacy.json")
514
+ report_hash = reporter.to_h
515
+ ```
516
+
517
+ ### Report Contents
518
+
519
+ **Policy Summary:**
520
+ - PII fields with types, sensitivity levels, and transformations
521
+ - Processing purposes with legal bases
522
+ - Retention rules per model
523
+ - Sensitivity breakdown chart
524
+
525
+ **Article 30 Report:**
526
+ - Controller and DPO information
527
+ - Processing activities with legal basis citations (GDPR Art. 6(1)(a-f))
528
+ - Data categories and retention periods
529
+ - Data subject rights implementation status
530
+ - Technical and organizational measures
531
+
532
+ **Event Store Analysis (requires Lyra):**
533
+ - PII field occurrence counts
534
+ - Retention compliance status per model
535
+ - Access patterns by operation type and time
536
+
537
+ ## Policy Generation
538
+
539
+ Generate PAM DSL policies automatically from your codebase.
540
+
541
+ ### Generate from ActiveRecord Models
542
+
543
+ ```bash
544
+ # Scan models and generate policy
545
+ bundle exec rake "pam_dsl:generate:from_models[my_app_policy]"
546
+
547
+ # Generate basic template
548
+ bundle exec rake "pam_dsl:generate:policy[my_app_policy]"
549
+ ```
550
+
551
+ ### PolicyGenerator Class
552
+
553
+ ```ruby
554
+ generator = PamDsl::PolicyGenerator.new(
555
+ :my_app,
556
+ output_path: "config/initializers/pam_dsl_policy.rb"
557
+ )
558
+
559
+ # Generate from template
560
+ generator.generate
561
+
562
+ # Scan models for PII fields
563
+ generator.generate_from_models
564
+ ```
565
+
566
+ ### Detection Patterns
567
+
568
+ The generator detects PII fields by name patterns:
569
+
570
+ | Type | Patterns Detected |
571
+ |------|-------------------|
572
+ | Email | `email`, `email_address`, `user_email` |
573
+ | Phone | `phone`, `mobile`, `telephone`, `fax` |
574
+ | Name | `name`, `firstname`, `lastname`, `full_name` |
575
+ | Address | `address`, `street`, `city`, `postal_code`, `zip` |
576
+ | Financial | `iban`, `bic`, `account_number`, `routing_number` |
577
+ | Credit Card | `card_number`, `credit_card`, `cvv`, `card_` |
578
+ | Identifiers | `ssn`, `vat_number`, `tax_id`, `passport` |
579
+ | Location | `latitude`, `longitude`, `location`, `coordinates` |
580
+ | IP Address | `ip_address`, `ip`, `remote_ip` |
581
+ | Date of Birth | `dob`, `date_of_birth`, `birth_date`, `birthday` |
582
+
583
+ **Exclusion Patterns** (to reduce false positives):
584
+ - Timestamps: `*_at` (e.g., `created_at`, `email_sent_at`)
585
+ - Amounts: `*_amount` (e.g., `vat_amount`, `total_amount`)
586
+ - Foreign keys: `*_id` (e.g., `user_id`)
587
+ - Status fields: `*_status`, `*_reason`
588
+ - Boolean flags: `is_*`, `has_*`, `*_enabled`
589
+ - Security fields: `*_digest`, `*_token`, `encrypted_*`
590
+ - Code fields: `*_code` (e.g., `country_code`, but `postal_code` is whitelisted)
591
+
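The exclusion rules listed above can be pictured as a set of anchored patterns plus a small whitelist. The sketch below is only an approximation for illustration; `EXCLUSIONS`, `WHITELIST`, and `excluded?` are hypothetical names, and the gem's actual pattern set may differ.

```ruby
# Hypothetical patterns mirroring the exclusion list above.
EXCLUSIONS = [
  /_at\z/, /_amount\z/, /_id\z/,        # timestamps, amounts, foreign keys
  /_(status|reason)\z/,                 # status fields
  /\A(is|has)_/, /_enabled\z/,          # boolean flags
  /_(digest|token)\z/, /\Aencrypted_/,  # security fields
  /_code\z/                             # code fields
].freeze

# Names that look excluded but should still be treated as PII.
WHITELIST = %w[postal_code].freeze

def excluded?(field_name)
  name = field_name.to_s
  return false if WHITELIST.include?(name)

  EXCLUSIONS.any? { |pattern| name.match?(pattern) }
end

excluded?(:email_sent_at) # => true
excluded?(:country_code)  # => true
excluded?(:postal_code)   # => false
excluded?(:email)         # => false
```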
592
+ ### PIIDetector Configuration
593
+
594
+ The `PIIDetector` class provides automatic PII field detection with configurable matching behavior.
595
+
596
+ #### Matching Modes
597
+
598
+ PAM DSL supports two matching modes for PII detection:
599
+
600
+ | Mode | Setting | Behavior |
601
+ |------|---------|----------|
602
+ | **Partial** (default) | `partial_match = true` | Matches field names *containing* PII keywords with word boundaries |
603
+ | **Exact** | `partial_match = false` | Only matches specific known field names |
604
+
605
+ #### Partial Matching (Default)
606
+
607
+ Partial matching uses word boundary patterns to detect PII in compound field names:
608
+
609
+ ```ruby
610
+ # These are all detected as PII with partial matching enabled (default)
611
+ PamDsl::PIIDetector.contains_pii?(:email) # => true
612
+ PamDsl::PIIDetector.contains_pii?(:customer_email) # => true
613
+ PamDsl::PIIDetector.contains_pii?(:billing_phone) # => true
614
+ PamDsl::PIIDetector.contains_pii?(:user_name) # => true
615
+ PamDsl::PIIDetector.contains_pii?(:home_address) # => true
616
+ ```
617
+
618
+ Word boundaries are detected at:
619
+ - Start of string or after underscore (`_`)
620
+ - End of string, before underscore, or before uppercase (camelCase suffix)
621
+
622
+ **Important**: Prefer snake_case names for reliable detection. A PII keyword that follows a camelCase prefix (e.g., `customerEmail`) is not detected; use `customer_email` instead. A PII keyword followed by a camelCase suffix (e.g., `emailAddress`) is detected.
623
+
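The boundary rules can be expressed as a single regular expression per keyword. The sketch below is a rough illustration rather than the gem's matcher: `PII_KEYWORDS`, `boundary_pattern`, and `partial_pii_match?` are hypothetical, the keyword list is deliberately short, and no exclusion patterns are applied.

```ruby
# Hypothetical keyword list; the real detector knows many more.
PII_KEYWORDS = %w[email phone name address].freeze

# The keyword must start at the beginning of the string or after "_",
# and end at the end of the string, before "_", or before an uppercase letter.
def boundary_pattern(keyword)
  /(?:\A|_)#{Regexp.escape(keyword)}(?=\z|_|[A-Z])/
end

def partial_pii_match?(field_name)
  name = field_name.to_s
  PII_KEYWORDS.any? { |keyword| name.match?(boundary_pattern(keyword)) }
end

partial_pii_match?(:customer_email) # => true
partial_pii_match?(:emailAddress)   # => true  (camelCase suffix)
partial_pii_match?(:customerEmail)  # => false (camelCase prefix)
partial_pii_match?(:username)       # => false (no boundary before "name")
```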
624
+ #### Exact Matching
625
+
626
+ For stricter control, switch to exact matching mode:
627
+
628
+ ```ruby
629
+ # Configure exact matching
630
+ PamDsl::PIIDetector.partial_match = false
631
+
632
+ # Now only exact field names are detected
633
+ PamDsl::PIIDetector.contains_pii?(:email) # => true (exact match)
634
+ PamDsl::PIIDetector.contains_pii?(:user_email) # => true (in exact patterns)
635
+ PamDsl::PIIDetector.contains_pii?(:customer_email) # => false (not in exact patterns)
636
+ PamDsl::PIIDetector.contains_pii?(:billing_phone) # => false (not in exact patterns)
637
+
638
+ # Reset to default (partial matching)
639
+ PamDsl::PIIDetector.reset!
640
+ ```
641
+
642
+ #### When to Use Each Mode
643
+
644
+ | Use Case | Recommended Mode |
645
+ |----------|------------------|
646
+ | New applications with varied naming conventions | Partial (default) |
647
+ | Legacy databases with unpredictable field names | Partial (default) |
648
+ | Applications with strict naming conventions | Exact |
649
+ | Minimizing false positives in large schemas | Exact |
650
+ | GDPR compliance scanning | Partial (default) |
651
+
652
+ #### Configuration in Rails Initializer
653
+
654
+ ```ruby
655
+ # config/initializers/pam_dsl.rb
656
+
657
+ # Option 1: Use partial matching (default, recommended for most cases)
658
+ PamDsl::PIIDetector.partial_match = true
659
+
660
+ # Option 2: Use exact matching (stricter, fewer false positives)
661
+ PamDsl::PIIDetector.partial_match = false
662
+ ```
663
+
664
+ #### Direct PIIDetector Usage
665
+
666
+ ```ruby
667
+ # Check if a field contains PII
668
+ PamDsl::PIIDetector.contains_pii?(:customer_email) # => true
669
+
670
+ # Get the PII type
671
+ PamDsl::PIIDetector.pii_type(:customer_email) # => :email
672
+
673
+ # Get sensitivity level
674
+ PamDsl::PIIDetector.sensitivity(:customer_email) # => :confidential
675
+
676
+ # Detect PII in a hash of attributes
677
+ data = { customer_email: "test@example.com", order_id: 123 }
678
+ pii_fields = PamDsl::PIIDetector.detect(data)
679
+ # => { customer_email: { type: :email, value: "...", sensitive: false, sensitivity: :confidential } }
680
+
681
+ # Mask PII for display
682
+ PamDsl::PIIDetector.mask("test@example.com", :email) # => "t***@example.com"
683
+ ```
684
+
685
+ #### Extracting PII from Record Collections
686
+
687
+ The `extract_pii_from_records` method scans any collection of records for PII fields. It takes extractor lambdas instead of assuming a record type, so it works with any data source: Lyra events, RubyEventStore events, ActiveRecord models, or plain hashes.
688
+
689
+ ```ruby
690
+ # Basic usage with hash records
691
+ records = [
692
+ { id: 1, email: "alice@example.com", name: "Alice", status: "active" },
693
+ { id: 2, email: "bob@example.com", name: "Bob", status: "inactive" }
694
+ ]
695
+
696
+ inventory = PamDsl::PIIDetector.extract_pii_from_records(
697
+ records,
698
+ attribute_extractor: ->(r) { r }
699
+ )
700
+ # => { email: [{ field: :email, value: "alice@...", pii_type: :email, sensitivity: :confidential }, ...],
701
+ # name: [{ field: :name, value: "Alice", pii_type: :name, sensitivity: :internal }, ...] }
702
+ ```
703
+
704
+ **With Metadata Extraction** (for tracing PII back to source records):
705
+
706
+ ```ruby
707
+ # Extract PII with record metadata for audit trails
708
+ inventory = PamDsl::PIIDetector.extract_pii_from_records(
709
+ records,
710
+ attribute_extractor: ->(r) { r },
711
+ metadata_extractor: ->(r) { { record_id: r[:id], source: "import" } }
712
+ )
713
+ # Each entry includes: { field:, value:, pii_type:, sensitivity:, record_id:, source: }
714
+ ```
715
+
716
+ **With Event Store Events** (RubyEventStore, Lyra, or custom):
717
+
718
+ ```ruby
719
+ # RubyEventStore events
720
+ inventory = PamDsl::PIIDetector.extract_pii_from_records(
721
+ event_store.read.to_a,
722
+ attribute_extractor: ->(e) { e.data[:attributes] || {} },
723
+ metadata_extractor: ->(e) {
724
+ { event_id: e.event_id, timestamp: e.metadata[:timestamp] }
725
+ }
726
+ )
727
+
728
+ # Lyra events (Lyra provides a convenience wrapper)
729
+ inventory = Lyra::Privacy::PIIDetector.extract_from_event_stream(events)
730
+ ```
731
+
732
+ **With ActiveRecord Models**:
733
+
734
+ ```ruby
735
+ # Scan database records for PII
736
+ inventory = PamDsl::PIIDetector.extract_pii_from_records(
737
+ User.where(created_at: 1.month.ago..),
738
+ attribute_extractor: ->(u) { u.attributes },
739
+ metadata_extractor: ->(u) { { id: u.id, type: u.class.name } }
740
+ )
741
+ ```
742
+
743
+ **Return Value Structure**:
744
+
745
+ The method returns a hash grouped by PII type:
746
+
747
+ ```ruby
748
+ {
749
+ email: [
750
+ { field: :email, value: "alice@example.com", pii_type: :email,
751
+ sensitivity: :confidential, event_id: "evt-1", ... },
752
+ { field: :contact_email, value: "bob@example.com", ... }
753
+ ],
754
+ name: [
755
+ { field: :name, value: "Alice", pii_type: :name, sensitivity: :internal, ... }
756
+ ],
757
+ phone: [...]
758
+ }
759
+ ```
760
+
761
+ This structure enables:
762
+ - PII inventory reports for GDPR compliance
763
+ - Data lineage tracking
764
+ - Retention policy enforcement
765
+ - Audit trail generation
766
+
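As an example of the first use case, an inventory in the shape shown above can be reduced to a per-type summary with plain Ruby. The `inventory` literal below is hand-written sample data in that shape, and `summary` is a hypothetical name; nothing here calls the gem.

```ruby
# Sample data in the documented return shape (hand-written, not gem output).
inventory = {
  email: [
    { field: :email, value: "alice@example.com", pii_type: :email, sensitivity: :confidential },
    { field: :contact_email, value: "bob@example.com", pii_type: :email, sensitivity: :confidential }
  ],
  name: [
    { field: :name, value: "Alice", pii_type: :name, sensitivity: :internal }
  ]
}

# Count occurrences and list the distinct source fields per PII type.
summary = inventory.transform_values do |entries|
  {
    count: entries.size,
    fields: entries.map { |entry| entry[:field] }.uniq,
    sensitivities: entries.map { |entry| entry[:sensitivity] }.uniq
  }
end

summary[:email] # => { count: 2, fields: [:email, :contact_email], sensitivities: [:confidential] }
```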
767
+ ### PIIMasker
768
+
769
+ The `PIIMasker` class provides batch masking of PII fields in data structures. It uses `PIIDetector` for field detection and applies type-specific masking strategies.
770
+
771
+ #### Basic Usage
772
+
773
+ ```ruby
774
+ # Mask all PII in a hash (default: partial masking)
775
+ data = { email: "alice@example.com", name: "Alice Smith", status: "active" }
776
+ masked = PamDsl::PIIMasker.mask(data)
777
+ # => { email: "a***@example.com", name: "Alice ***", status: "active" }
778
+
779
+ # Full redaction mode
780
+ masked = PamDsl::PIIMasker.mask(data, strategy: :full)
781
+ # => { email: "[REDACTED]", name: "[REDACTED]", status: "active" }
782
+
783
+ # Redact only sensitive PII (ssn, credit_card, financial, health, biometric)
784
+ data = { email: "alice@example.com", ssn: "123-45-6789" }
785
+ masked = PamDsl::PIIMasker.mask(data, strategy: :redact_sensitive)
786
+ # => { email: "a***@example.com", ssn: "[REDACTED]" }
787
+ ```
788
+
789
+ #### Masking Strategies
790
+
791
+ | Strategy | Behavior |
792
+ |----------|----------|
793
+ | `:partial` (default) | Type-specific partial masking (e.g., `a***@example.com`) |
794
+ | `:full` | Complete redaction with `[REDACTED]` |
795
+ | `:redact_sensitive` | Full redaction for sensitive types, partial for others |
796
+
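The three strategies amount to a small dispatch on strategy and PII type. The following sketch shows that dispatch in isolation. It is an assumption-laden illustration, not `PIIMasker` itself: `mask_value` and `partial_mask` are hypothetical, the sensitive-type list is taken from the Basic Usage comment, and the generic first-character fallback for non-email types is invented for the example.

```ruby
SENSITIVE_TYPES = %i[ssn credit_card financial health biometric].freeze
REDACTED = "[REDACTED]"

# Type-specific partial masking; only :email is modelled here.
def partial_mask(value, type)
  case type
  when :email
    local, domain = value.split("@", 2)
    "#{local[0]}***@#{domain}"
  else
    "#{value[0]}***" # hypothetical generic fallback
  end
end

def mask_value(value, type, strategy: :partial)
  case strategy
  when :full then REDACTED
  when :redact_sensitive
    SENSITIVE_TYPES.include?(type) ? REDACTED : partial_mask(value, type)
  when :partial then partial_mask(value, type)
  else raise ArgumentError, "unknown strategy: #{strategy}"
  end
end

mask_value("alice@example.com", :email)                       # => "a***@example.com"
mask_value("alice@example.com", :email, strategy: :full)      # => "[REDACTED]"
mask_value("123-45-6789", :ssn, strategy: :redact_sensitive)  # => "[REDACTED]"
```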
797
+ #### Masking Individual Fields
798
+
799
+ ```ruby
800
+ # Mask a value by field name
801
+ PamDsl::PIIMasker.mask_field("alice@example.com", :email)
802
+ # => "a***@example.com"
803
+
804
+ # Works with compound field names (partial matching)
805
+ PamDsl::PIIMasker.mask_field("alice@example.com", :customer_email)
806
+ # => "a***@example.com"
807
+
808
+ # Non-PII fields return original value
809
+ PamDsl::PIIMasker.mask_field("active", :status)
810
+ # => "active"
811
+
812
+ # Mask by known PII type
813
+ PamDsl::PIIMasker.mask_by_type("123-45-6789", :ssn)
814
+ # => "***REDACTED***"
815
+ ```
816
+
817
+ #### Masking Record Collections
818
+
819
+ The `mask_records` method masks PII in any collection using extractors, similar to `extract_pii_from_records`:
820
+
821
+ ```ruby
822
+ # With hash records
823
+ records = [
824
+ { id: 1, email: "alice@example.com" },
825
+ { id: 2, email: "bob@example.com" }
826
+ ]
827
+
828
+ masked = PamDsl::PIIMasker.mask_records(
829
+ records,
830
+ attribute_extractor: ->(r) { r },
831
+ attribute_setter: ->(r, masked_attrs) { masked_attrs }
832
+ )
833
+ # => [{ id: 1, email: "a***@example.com" }, { id: 2, email: "b***@example.com" }]
834
+
835
+ # With custom objects
836
+ masked = PamDsl::PIIMasker.mask_records(
837
+ events,
838
+ attribute_extractor: ->(e) { e.data },
839
+ attribute_setter: ->(e, masked_data) { e.class.new(e.id, masked_data) },
840
+ strategy: :full
841
+ )
842
+ ```
843
+
844
+ #### Integration with Lyra
845
+
846
+ Lyra provides a convenience wrapper for masking events:
847
+
848
+ ```ruby
849
+ # Mask all events in a collection
850
+ masked_events = Lyra::Privacy::PIIMasker.mask_events(events)
851
+
852
+ # With full redaction
853
+ masked_events = Lyra::Privacy::PIIMasker.mask_events(events, strategy: :full)
854
+ ```
855
+
856
+ ### GDPRCompliance
857
+
858
+ The `GDPRCompliance` class provides comprehensive GDPR data subject rights functionality. It works with any event source through configurable extractors.
859
+
860
+ #### GDPR Rights Supported
861
+
862
+ | Right | Article | Method |
863
+ |-------|---------|--------|
864
+ | Access | Art. 15 | `data_export` |
865
+ | Erasure ("Right to be Forgotten") | Art. 17 | `right_to_be_forgotten_report` |
866
+ | Portability | Art. 20 | `portable_export` |
867
+ | Rectification | Art. 16 | `rectification_history` |
868
+ | Processing Records | Art. 30 | `processing_activities` |
869
+
870
+ #### Basic Usage
871
+
872
+ ```ruby
873
+ # With any event source (using extractors)
874
+ compliance = PamDsl::GDPRCompliance.new(
875
+ subject_id: user.id,
876
+ subject_type: 'User',
877
+ event_reader: ->(subject_id, subject_type) {
878
+ # Return events for this subject from your event store
879
+ EventStore.events_for_user(subject_id)
880
+ }
881
+ )
882
+
883
+ # Generate Subject Access Request (SAR) report
884
+ report = compliance.data_export
885
+ # => { subject: { id: 123, type: 'User' },
886
+ # events: [...],
887
+ # pii_inventory: { email: [...], name: [...] },
888
+ # data_lineage: { email: [{ timestamp: ..., operation: :created }, ...] } }
889
+
890
+ # Right to be forgotten analysis
891
+ erasure = compliance.right_to_be_forgotten_report
892
+ # => { total_events: 47, events_with_pii: 23, affected_models: ['User', 'Order'],
893
+ # deletion_strategy: :batch_deletion }
894
+
895
+ # Data portability export
896
+ json = compliance.portable_export(format: :json)
897
+ csv = compliance.portable_export(format: :csv)
898
+ xml = compliance.portable_export(format: :xml)
899
+ ```
900
+
901
+ #### Custom Extractors
902
+
903
+ For non-standard event formats, provide custom extractors:
904
+
905
+ ```ruby
906
+ # RubyEventStore example
907
+ compliance = PamDsl::GDPRCompliance.new(
908
+ subject_id: user.id,
909
+ event_reader: ->(sid, stype) {
910
+ event_store.read.stream("User$#{sid}").to_a
911
+ },
912
+ attribute_extractor: ->(e) { e.data[:attributes] || {} },
913
+ timestamp_extractor: ->(e) { e.metadata[:timestamp] },
914
+ operation_extractor: ->(e) { e.data[:operation]&.to_sym },
915
+ model_class_extractor: ->(e) { e.data[:model_class] },
916
+ model_id_extractor: ->(e) { e.data[:model_id] },
917
+ changes_extractor: ->(e) { e.data[:changes] || {} },
918
+ retention_policy: {
919
+ default: { duration: 7.years },
920
+ 'Invoice' => { duration: 10.years }
921
+ }
922
+ )
923
+ ```
924
+
925
+ #### Retention Compliance
926
+
927
+ Check if data retention policies are being followed:
928
+
929
+ ```ruby
930
+ compliance.retention_compliance_check
931
+ # => [
932
+ # { model_class: 'User', total_events: 5, expired_events: 0, compliance_status: :compliant },
933
+ # { model_class: 'Log', total_events: 100, expired_events: 45, compliance_status: :requires_action }
934
+ # ]
935
+ ```
936
+
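A result in this shape could be derived by grouping events per model and counting those older than the model's retention period. The sketch below shows that idea with plain hashes. It is not how `retention_compliance_check` is implemented; the `retention_compliance` helper, the event hash keys, and the 365-day year are all assumptions.

```ruby
SECONDS_PER_YEAR = 365 * 24 * 60 * 60 # approximation; ignores leap years

# events: array of { model_class:, timestamp: } hashes (hypothetical shape)
# retention_years: { "Model" => years, default: years }
def retention_compliance(events, retention_years:, now: Time.now)
  events.group_by { |event| event[:model_class] }.map do |model, model_events|
    years = retention_years.fetch(model, retention_years.fetch(:default))
    cutoff = now - years * SECONDS_PER_YEAR
    expired = model_events.count { |event| event[:timestamp] < cutoff }
    {
      model_class: model,
      total_events: model_events.size,
      expired_events: expired,
      compliance_status: expired.zero? ? :compliant : :requires_action
    }
  end
end

events = [
  { model_class: "User", timestamp: Time.utc(2024, 3, 1) },
  { model_class: "Log",  timestamp: Time.utc(2010, 1, 1) },
  { model_class: "Log",  timestamp: Time.utc(2024, 6, 1) }
]
report = retention_compliance(events, retention_years: { default: 7, "Log" => 1 }, now: Time.utc(2025, 1, 1))
```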
937
+ #### Consent Audit
938
+
939
+ Track and verify consent for data processing:
940
+
941
+ ```ruby
942
+ compliance.consent_audit
943
+ # => {
944
+ # current_consents: { marketing: { granted: true, timestamp: ... } },
945
+ # consent_history: [...],
946
+ # processing_legitimacy: [{ event_id: 'evt-1', has_consent: true, legitimate: true }, ...]
947
+ # }
948
+ ```
949
+
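One plausible way to derive `processing_legitimacy` is sketched below. It is an assumption, not the gem's actual logic: for each processing event, look up the most recent consent record for the same purpose at or before the event time. `SketchConsent`, `SketchProcessing`, and `sketch_processing_legitimacy` are hypothetical names, and the sketch treats consent as the only legal basis, so `legitimate` simply mirrors `has_consent`.

```ruby
# Hypothetical records: one consent decision and one processing event.
SketchConsent = Struct.new(:purpose, :granted, :timestamp)
SketchProcessing = Struct.new(:event_id, :purpose, :timestamp)

# For each event, find the latest consent decision that was in effect
# for its purpose when the event happened.
def sketch_processing_legitimacy(events, consent_history)
  events.map do |event|
    latest = consent_history
             .select { |c| c.purpose == event.purpose && c.timestamp <= event.timestamp }
             .max_by(&:timestamp)
    has_consent = !latest.nil? && latest.granted
    { event_id: event.event_id, has_consent: has_consent, legitimate: has_consent }
  end
end
```

With this rule, an event recorded after consent was withdrawn is flagged as not legitimate, and so is an event recorded before consent was first granted.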
950
+ #### Full Compliance Report
951
+
952
+ Generate a single report that combines the individual GDPR reports (data export, erasure, rectification, processing activities, retention, and consent):
953
+
954
+ ```ruby
955
+ compliance.full_report
956
+ # => {
957
+ # data_export: { ... },
958
+ # erasure_report: { ... },
959
+ # rectification_history: [...],
960
+ # processing_activities: [...],
961
+ # retention_compliance: [...],
962
+ # consent_audit: { ... }
963
+ # }
964
+ ```
965
+
966
+ #### Integration with Lyra
967
+
968
+ Lyra provides a convenience wrapper that auto-configures GDPRCompliance:
969
+
970
+ ```ruby
971
+ # Lyra automatically uses its event store and extractors
972
+ compliance = Lyra::Privacy::GDPRCompliance.new(subject_id: user.id)
973
+ report = compliance.data_export
974
+ ```
975
+
976
+ ### Generated Output
977
+
978
+ The generator creates a complete policy file with:
979
+ - Field definitions with appropriate types and sensitivity
980
+ - Auto-generated transformations for masking
981
+ - Suggested processing purposes based on field types
982
+ - Model-specific retention rules (10 years for financial models)
983
+ - Rails configuration boilerplate
984
+
985
+ ## Rails Integration
986
+
987
+ ### Configuration
988
+
989
+ ```ruby
990
+ # config/initializers/lyra.rb or pam_dsl.rb
991
+
992
+ # Define your policy
993
+ PamDsl.define_policy :my_app do
994
+ field :email, type: :email, sensitivity: :confidential
995
+ # ...
996
+ end
997
+
998
+ # Configure reporting defaults
999
+ Rails.application.config.pam_dsl.default_policy = :my_app
1000
+ Rails.application.config.pam_dsl.organization = "My Company Inc."
1001
+ Rails.application.config.pam_dsl.dpo_contact = "privacy@mycompany.com"
1002
+ ```
1003
+
1004
+ ### Available Rake Tasks
1005
+
1006
+ ```bash
1007
+ # Reporting
1008
+ rake pam_dsl:report:policy # Policy summary
1009
+ rake pam_dsl:report:article_30 # GDPR Article 30 report
1010
+ rake pam_dsl:report:full # Full compliance report
1011
+ rake pam_dsl:report:pii # PII analysis (requires Lyra)
1012
+ rake pam_dsl:report:retention # Retention compliance (requires Lyra)
1013
+ rake pam_dsl:report:access_patterns # Access patterns (requires Lyra)
1014
+ rake pam_dsl:report:export[path] # Export to JSON
1015
+
1016
+ # Generation
1017
+ rake pam_dsl:generate:policy[name] # Generate template policy
1018
+ rake pam_dsl:generate:from_models[name] # Generate from model scan
1019
+
1020
+ # Aliases
1021
+ rake privacy:report # Same as pam_dsl:report:full
1022
+ rake privacy:policy # Same as pam_dsl:report:policy
1023
+ rake privacy:article_30 # Same as pam_dsl:report:article_30
1024
+ ```
1025
+
1026
+ ## Integration with Lyra
1027
+
1028
+ A policy defined with PAM DSL can be referenced by name from Lyra-monitored models:
1029
+
1030
+ ```ruby
1031
+ # Define policy
1032
+ PamDsl.define_policy :university_system do
1033
+ field :student_id, type: :identifier, sensitivity: :internal
1034
+ field :email, type: :email, sensitivity: :internal
1035
+ field :ssn, type: :ssn, sensitivity: :restricted
1036
+
1037
+ purpose :enrollment do
1038
+ basis :contract
1039
+ requires :student_id, :email
1040
+ end
1041
+
1042
+ retention do
1043
+ for_model 'Student' do
1044
+ keep_for 10.years
1045
+ on_expiry :anonymize
1046
+ end
1047
+ end
1048
+ end
1049
+
1050
+ # Use in Lyra
1051
+ class Student < ApplicationRecord
1052
+ monitor_with_lyra privacy_policy: :university_system
1053
+ end
1054
+ ```
1055
+
1056
+ ## API Reference
1057
+
1058
+ ### PamDsl Module
1059
+
1060
+ - `PamDsl.define_policy(name, &block)` - Define a new policy
1061
+ - `PamDsl.policy(name)` - Get a defined policy
1062
+ - `PamDsl.reset!` - Clear all policies
1063
+
1064
+ ### Policy
1065
+
1066
+ - `field(name, type:, sensitivity:, &block)` - Define a field
1067
+ - `purpose(name, &block)` - Define a purpose
1068
+ - `retention(&block)` - Configure retention
1069
+ - `consent(&block)` - Configure consent
1070
+ - `meta(key, value)` - Add custom metadata to policy
1071
+ - `allowed?(field, purpose)` - Check if field is allowed for purpose
1072
+ - `validate_access!(fields, purpose, consent_granted:, consent_granted_at:)` - Validate access
1073
+ - `sensitive_fields` - Get all fields with confidential/restricted sensitivity
1074
+ - `restricted_fields` - Get all fields with restricted sensitivity
1075
+ - `metadata` - Access policy metadata hash
1076
+ - `to_h` - Export policy as hash (includes all metadata)
1077
+
1078
+ ### Field
1079
+
1080
+ - `allow_for(*purposes)` - Allow field for purposes
1081
+ - `transform(context, &block)` - Define transformation
1082
+ - `meta(key, value)` - Add custom metadata
1083
+ - `metadata` - Access field metadata hash
1084
+ - `sensitive?` - Check if field is confidential or restricted
1085
+ - `restricted?` - Check if field is restricted
1086
+ - `allowed_for?(purpose)` - Check if allowed for specific purpose
1087
+ - `apply_transformation(context, value)` - Apply defined transformation
1088
+
1089
+ ### Purpose
1090
+
1091
+ - `describe(text)` - Set description
1092
+ - `basis(legal_basis)` - Set legal basis
1093
+ - `requires(*fields)` - Define required fields
1094
+ - `optionally(*fields)` - Define optional fields
1095
+ - `meta(key, value)` - Add custom metadata
1096
+ - `metadata` - Access purpose metadata hash
1097
+ - `requires_consent?` - Check if purpose requires consent (basis is :consent)
1098
+ - `all_fields` - Get all fields (required + optional)
1099
+ - `requires_field?(field)` - Check if field is required
1100
+ - `allows_field?(field)` - Check if field is allowed
1101
+
1102
+ ### Retention
1103
+
1104
+ - `default(duration)` - Set default retention
1105
+ - `for_model(model_class, &block)` - Define model retention
1106
+ - `keep_for(duration)` - Set retention duration
1107
+ - `field(name, duration:)` - Set field retention
1108
+ - `on_expiry(strategy)` - Set deletion strategy
1109
+
1110
+ ### Consent
1111
+
1112
+ - `for_purpose(purpose, &block)` - Define consent requirement
1113
+ - `required!(value)` - Set if required
1114
+ - `granular!(value)` - Enable granular consent
1115
+ - `withdrawable!(value)` - Set if withdrawable
1116
+ - `expires_in(duration)` - Set expiration
1117
+
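The interaction between `expires_in` and the `consent_granted_at:` argument of `validate_access!` can be pictured with the standalone check below. It is a sketch under assumptions, not the gem's code: `sketch_consent_valid?` is a hypothetical helper, and durations are plain seconds rather than ActiveSupport durations.

```ruby
SKETCH_SECONDS_PER_DAY = 24 * 60 * 60

# Consent counts as valid when it was granted and, if an expiration is
# configured, the expiration window has not yet elapsed.
def sketch_consent_valid?(granted:, granted_at:, expires_in: nil, now: Time.now)
  return false unless granted && granted_at
  return true if expires_in.nil? # no expiration configured

  now < granted_at + expires_in
end

granted_at = Time.utc(2024, 1, 1)
one_year = 365 * SKETCH_SECONDS_PER_DAY
puts sketch_consent_valid?(granted: true, granted_at: granted_at,
                           expires_in: one_year, now: Time.utc(2024, 6, 1))
```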
1118
+ ### Reporter
1119
+
1120
+ - `Reporter.new(policy_name, organization:, dpo_contact:, event_store:, output:)` - Create reporter
1121
+ - `policy_summary` - Print policy summary
1122
+ - `article_30_report` - Print GDPR Article 30 report
1123
+ - `pii_analysis` - Analyze PII in event store
1124
+ - `retention_check` - Check retention compliance
1125
+ - `access_patterns` - Show access patterns
1126
+ - `full_report` - Print complete report
1127
+ - `export_json(path)` - Export to JSON file
1128
+ - `to_h` - Export as hash
1129
+
1130
+ ### PolicyGenerator
1131
+
1132
+ - `PolicyGenerator.new(name, output_path:)` - Create generator
1133
+ - `generate` - Generate template policy file
1134
+ - `generate_from_models` - Scan models and generate policy
1135
+ - `scan_models` - Detect PII fields in ActiveRecord models
1136
+
1137
+ ### PIIDetector
1138
+
1139
+ - `PIIDetector.detect(attributes)` - Detect PII in a hash, returns `{ field: { type:, value:, sensitivity: } }`
1140
+ - `PIIDetector.contains_pii?(field_name)` - Check if a field name is PII
1141
+ - `PIIDetector.pii_type(field_name)` - Get PII type for a field (`:email`, `:phone`, etc.)
1142
+ - `PIIDetector.sensitivity(field_name)` - Get sensitivity level for a field
1143
+ - `PIIDetector.sensitive?(pii_type)` - Check if PII type requires special protection
1144
+ - `PIIDetector.mask(value, pii_type)` - Mask a PII value for safe display
1145
+ - `PIIDetector.extract_pii_from_records(records, attribute_extractor:, metadata_extractor:)` - Extract PII from any record collection
1146
+ - `PIIDetector.partial_match=(bool)` - Enable/disable partial matching mode
1147
+ - `PIIDetector.reset!` - Reset to default settings
1148
+
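The difference between exact and partial matching can be illustrated with a small standalone sketch. The name list and matching rules here are assumptions made for illustration; the detector's real patterns may differ. `SKETCH_PII_NAMES` and `sketch_pii_type` are hypothetical.

```ruby
# Hypothetical field-name lists per PII type.
SKETCH_PII_NAMES = {
  email: %w[email email_address],
  phone: %w[phone phone_number mobile],
  ssn:   %w[ssn social_security_number]
}.freeze

# Return the PII type for a field name, or nil when nothing matches.
# Exact matches win; substring matches are tried only in partial mode.
def sketch_pii_type(field_name, partial_match: false)
  name = field_name.to_s.downcase
  exact = SKETCH_PII_NAMES.find { |_type, names| names.include?(name) }
  return exact.first if exact
  return nil unless partial_match

  partial = SKETCH_PII_NAMES.find { |_type, names| names.any? { |n| name.include?(n) } }
  partial&.first
end
```

In this sketch, partial matching catches names such as `billing_email`, but it can also produce false positives: `classname` contains the substring `ssn`. That trade-off is one reason to keep partial matching configurable.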
1149
+ ### PIIMasker
1150
+
1151
+ - `PIIMasker.mask(attributes, strategy:)` - Mask all PII in a hash
1152
+ - `PIIMasker.mask_field(value, field_name, strategy:)` - Mask a value by field name
1153
+ - `PIIMasker.mask_by_type(value, pii_type, strategy:)` - Mask a value by PII type
1154
+ - `PIIMasker.mask_records(records, attribute_extractor:, attribute_setter:, strategy:)` - Mask PII in a collection
1155
+
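As a rough illustration of hash-level masking, the standalone sketch below reuses the email and SSN display transformations from the Quick Start. It does not call the gem: `SKETCH_MASKERS`, `SKETCH_FIELD_TYPES`, and `sketch_mask` are hypothetical, and the field-to-type mapping is hard-coded instead of being detected.

```ruby
# Masking rules per PII type, taken from the Quick Start transformations.
SKETCH_MASKERS = {
  email: ->(v) { "#{v[0]}***@#{v.split('@').last}" },
  ssn:   ->(v) { "***-**-#{v[-4..]}" }
}.freeze

# Hypothetical mapping from field name to PII type.
SKETCH_FIELD_TYPES = { email: :email, ssn: :ssn }.freeze

# Return a copy of the attributes with known PII string values masked;
# everything else passes through unchanged.
def sketch_mask(attributes)
  attributes.to_h do |key, value|
    masker = SKETCH_MASKERS[SKETCH_FIELD_TYPES[key.to_sym]]
    [key, masker && value.is_a?(String) ? masker.call(value) : value]
  end
end

p sketch_mask(email: 'alice@example.com', ssn: '123-45-6789', age: 30)
```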
1156
+ ### GDPRCompliance
1157
+
1158
+ - `GDPRCompliance.new(subject_id:, subject_type:, event_reader:, **extractors)` - Create compliance handler
1159
+ - `data_export` - Right to Access (Art. 15) - Full data export
1160
+ - `right_to_be_forgotten_report` - Right to Erasure (Art. 17) - Deletion analysis
1161
+ - `portable_export(format:)` - Right to Portability (Art. 20) - Export as JSON/CSV/XML
1162
+ - `rectification_history` - Right to Rectification (Art. 16) - Correction history
1163
+ - `processing_activities` - Processing Records (Art. 30) - Activity documentation
1164
+ - `retention_compliance_check` - Check retention policy compliance
1165
+ - `consent_audit` - Audit consent records and legitimacy
1166
+ - `full_report` - Complete GDPR compliance report
1167
+
1168
+ ## Sensitivity Levels and Legislative Background
1169
+
1170
+ PAM DSL uses a four-tier sensitivity classification system that combines regulatory requirements from multiple frameworks. This section explains the legislative basis and practical implications of each level.
1171
+
1172
+ ### Regulatory Framework Alignment
1173
+
1174
+ The sensitivity levels are derived from three primary sources:
1175
+
1176
+ | Framework | Relevance | Key Articles/Sections |
1177
+ |-----------|-----------|----------------------|
1178
+ | **GDPR** (EU 2016/679) | Primary regulation for EU personal data | Art. 5, 6, 9, 32 |
1179
+ | **ISO/IEC 27001:2022** | Information security management | Annex A.5.12, A.5.13 |
1180
+ | **NIST SP 800-122** | US guidance on PII protection | Section 2.2 (PII examples), Section 3 (confidentiality impact levels) |
1181
+
1182
+ ### Sensitivity Level Definitions
1183
+
1184
+ #### `:public` - Publicly Accessible Data
1185
+
1186
+ **Definition**: Information that is intended for public disclosure or has no privacy implications.
1187
+
1188
+ **Regulatory Basis**:
1189
+ - GDPR Art. 9(2)(e): Exception permitting processing of special-category data "manifestly made public by the data subject"
1190
+ - ISO 27001: Public classification level
1191
+
1192
+ **Examples**: Published company addresses, public social media handles, product catalogs
1193
+
1194
+ **Requirements**: No special handling required
1195
+
1196
+ ---
1197
+
1198
+ #### `:internal` - Internal Use Only
1199
+
1200
+ **Definition**: Personal data that requires basic protection but poses low risk if disclosed.
1201
+
1202
+ **Regulatory Basis**:
1203
+ - GDPR Art. 5(1)(f): "Integrity and confidentiality" principle
1204
+ - GDPR Art. 32: Appropriate security measures
1205
+ - ISO 27001 A.5.12: Classification of information
1206
+
1207
+ **Examples**: Names, business email addresses, IP addresses, user preferences
1208
+
1209
+ **GDPR Category**: Regular personal data (Art. 6)
1210
+
1211
+ **Requirements**:
1212
+ - Access control (need-to-know basis)
1213
+ - Basic audit logging
1214
+ - Standard encryption in transit
1215
+
1216
+ ---
1217
+
1218
+ #### `:confidential` - Requires Protection
1219
+
1220
+ **Definition**: Personal data that could cause harm or distress if disclosed, requiring enhanced protection measures.
1221
+
1222
+ **Regulatory Basis**:
1223
+ - GDPR Art. 5(1)(f): Enhanced integrity and confidentiality
1224
+ - GDPR Art. 32(1)(a): Pseudonymization and encryption
1225
+ - GDPR Art. 35: May require Data Protection Impact Assessment (DPIA)
1226
+ - ISO 27001 A.5.13: Labeling of information
1227
+ - NIST SP 800-122: Moderate confidentiality impact
1228
+
1229
+ **Examples**: Personal email, phone numbers, physical addresses, date of birth, location data, financial transactions
1230
+
1231
+ **GDPR Category**: Regular personal data requiring enhanced protection (Art. 6)
1232
+
1233
+ **Requirements**:
1234
+ - Encryption at rest and in transit
1235
+ - Enhanced access controls with approval workflows
1236
+ - Comprehensive audit logging
1237
+ - Data minimization practices
1238
+ - Defined retention periods
1239
+ - Breach notification to the supervisory authority within 72 hours of becoming aware of the breach (Art. 33)
1240
+
1241
+ ---
1242
+
1243
+ #### `:restricted` - Highly Restricted Access
1244
+
1245
+ **Definition**: Sensitive personal data that could cause significant harm if disclosed, subject to the strictest regulatory requirements.
1246
+
1247
+ **Regulatory Basis**:
1248
+ - GDPR Art. 9: Special categories of personal data (prohibited unless exception applies)
1249
+ - GDPR Art. 10: Criminal conviction data
1250
+ - GDPR Art. 35(3)(b): DPIA mandatory for large-scale processing of special categories
1251
+ - PCI DSS: Payment card data requirements
1252
+ - HIPAA: Health information (US)
1253
+ - ISO 27001: Confidential/Restricted classification
1254
+ - NIST SP 800-122: High confidentiality impact
1255
+
1256
+ **GDPR Special Categories (Art. 9)**:
1257
+ - Racial or ethnic origin
1258
+ - Political opinions
1259
+ - Religious or philosophical beliefs
1260
+ - Trade union membership
1261
+ - Genetic data
1262
+ - Biometric data (for identification)
1263
+ - Health data
1264
+ - Sex life or sexual orientation
1265
+
1266
+ **Additional Restricted Data**:
1267
+ - Social Security Numbers (SSN) / National IDs
1268
+ - Credit card numbers (PCI DSS scope)
1269
+ - Bank account details (IBAN, account numbers)
1270
+ - Tax identifiers (VAT numbers, TIN)
1271
+ - Passport numbers
1272
+ - Driver's license numbers
1273
+
1274
+ **Requirements**:
1275
+ - Encryption mandatory (at rest and in transit)
1276
+ - Strict access controls with multi-factor authentication
1277
+ - Detailed audit trails with tamper protection
1278
+ - Data Protection Impact Assessment (DPIA) required
1279
+ - Explicit consent or legal exception documented (Art. 9(2))
1280
+ - Breach notification within 72 hours (Art. 33), plus communication to affected data subjects when the risk is high (Art. 34)
1281
+ - Appointed Data Protection Officer (DPO) oversight
1282
+ - Regular security assessments
1283
+ - Data retention strictly limited
1284
+
1285
+ ### PII Type to Sensitivity Mapping
1286
+
1287
+ The following table shows the default sensitivity assignments in PAM DSL:
1288
+
1289
+ | PII Type | Default Sensitivity | GDPR Category | Regulatory Notes |
1290
+ |----------|---------------------|---------------|------------------|
1291
+ | `name` | `:internal` | Regular (Art. 6) | Low risk in isolation |
1292
+ | `email` | `:confidential` | Regular (Art. 6) | Contact data, spam risk |
1293
+ | `phone` | `:confidential` | Regular (Art. 6) | Contact data, spam risk |
1294
+ | `address` | `:confidential` | Regular (Art. 6) | Physical location risk |
1295
+ | `ip_address` | `:internal` | Regular (Art. 6) | CJEU: Personal data when linkable |
1296
+ | `date_of_birth` | `:confidential` | Regular (Art. 6) | Age discrimination risk |
1297
+ | `location` | `:confidential` | Regular (Art. 6) | Movement tracking risk |
1298
+ | `ssn` | `:restricted` | National ID (Art. 87) | High identity theft risk |
1299
+ | `credit_card` | `:restricted` | Financial | PCI DSS requirements |
1300
+ | `financial` | `:restricted` | Financial | Bank account data |
1301
+ | `identifier` | `:restricted` | National ID | VAT, tax IDs, passports |
1302
+ | `health` | `:restricted` | Special (Art. 9) | GDPR explicit prohibition |
1303
+ | `biometric` | `:restricted` | Special (Art. 9) | GDPR explicit prohibition |
1304
+
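The table above can be expressed as a lookup for quick experiments. This is a standalone sketch rather than the gem's internal data structure: `SKETCH_DEFAULT_SENSITIVITY`, `SKETCH_LEVELS`, and `sketch_at_least?` are hypothetical names, and the ordering of the four levels from least to most sensitive is assumed from the order in which they are defined above.

```ruby
# Default sensitivity per PII type, copied from the table above.
SKETCH_DEFAULT_SENSITIVITY = {
  name: :internal, email: :confidential, phone: :confidential,
  address: :confidential, ip_address: :internal, date_of_birth: :confidential,
  location: :confidential, ssn: :restricted, credit_card: :restricted,
  financial: :restricted, identifier: :restricted, health: :restricted,
  biometric: :restricted
}.freeze

# Levels ordered from least to most sensitive (assumed ordering).
SKETCH_LEVELS = %i[public internal confidential restricted].freeze

# True when the default sensitivity of the PII type is at or above the level.
def sketch_at_least?(pii_type, level)
  actual = SKETCH_DEFAULT_SENSITIVITY.fetch(pii_type)
  SKETCH_LEVELS.index(actual) >= SKETCH_LEVELS.index(level)
end

puts sketch_at_least?(:email, :confidential)
puts sketch_at_least?(:name, :confidential)
```

Under this ordering, `sketch_at_least?(type, :confidential)` corresponds to the "confidential or restricted" meaning that the API reference gives for `sensitive?`.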
1305
+ ### Legal Bases by Sensitivity
1306
+
1307
+ | Sensitivity | Typical Legal Bases | GDPR Articles |
1308
+ |-------------|---------------------|---------------|
1309
+ | `:public` | Not applicable to non-personal data; public personal data still needs an Art. 6 basis | Art. 6(1) where applicable |
1310
+ | `:internal` | Contract, Legitimate Interest | Art. 6(1)(b), (f) |
1311
+ | `:confidential` | Contract, Consent, Legal Obligation | Art. 6(1)(a), (b), (c) |
1312
+ | `:restricted` | Explicit Consent or another Art. 9(2) exception, in addition to an Art. 6(1) basis | Art. 9(2)(a)-(j) |
1313
+
1314
+ ### Practical Implementation
1315
+
1316
+ ```ruby
1317
+ PamDsl.define_policy :gdpr_compliant do
1318
+ # Internal - basic personal data
1319
+ field :display_name, type: :name, sensitivity: :internal do
1320
+ allow_for :personalization, :communication
1321
+ meta :gdpr_basis, "Art. 6(1)(b) - Contract performance"
1322
+ end
1323
+
1324
+ # Confidential - requires enhanced protection
1325
+ field :email, type: :email, sensitivity: :confidential do
1326
+ allow_for :authentication, :account_recovery
1327
+ meta :gdpr_basis, "Art. 6(1)(b) - Contract performance"
1328
+ meta :encryption_required, true
1329
+ meta :retention_period, "Account lifetime + 2 years"
1330
+ end
1331
+
1332
+ # Restricted - special category data
1333
+ field :health_status, type: :health, sensitivity: :restricted do
1334
+ allow_for :medical_services
1335
+ meta :gdpr_basis, "Art. 9(2)(a) - Explicit consent"
1336
+ meta :dpia_required, true
1337
+ meta :encryption_algorithm, "AES-256"
1338
+ meta :access_approval_required, true
1339
+ end
1340
+
1341
+ # Restricted - financial identifier
1342
+ field :vat_number, type: :identifier, sensitivity: :restricted do
1343
+ allow_for :invoicing, :tax_compliance
1344
+ meta :gdpr_basis, "Art. 6(1)(c) - Legal obligation"
1345
+ meta :retention_period, "10 years (tax law)"
1346
+ end
1347
+ end
1348
+ ```
1349
+
1350
+ ### References
1351
+
1352
+ - **GDPR Full Text**: [EUR-Lex 2016/679](https://eur-lex.europa.eu/eli/reg/2016/679/oj)
1353
+ - **ISO/IEC 27001:2022**: Information Security Management Systems
1354
+ - **NIST SP 800-122**: Guide to Protecting the Confidentiality of PII
1355
+ - **EDPB Guidelines** (the EDPB succeeded the Article 29 Working Party in 2018): [EDPB Guidelines](https://edpb.europa.eu/our-work-tools/general-guidance/guidelines-recommendations-best-practices_en)
1356
+ - **PCI DSS v4.0**: Payment Card Industry Data Security Standard
1357
+ - **CJEU Breyer Case (C-582/14)**: IP addresses as personal data
1358
+
1359
+ ## License
1360
+
1361
+ MIT License - see LICENSE file
1362
+
1363
+ ## Contributing
1364
+
1365
+ This gem is part of the ORFEAS PhD thesis research. Contributions are welcome.