guideai-app 0.4.3-1 → 0.4.3-2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (45)
  1. package/dist/GuideAI.js +1 -1
  2. package/dist/GuideAI.js.map +1 -1
  3. package/dist/components/Microphone.d.ts +17 -0
  4. package/dist/components/TranscriptBox.d.ts +4 -5
  5. package/dist/components/TranscriptMessages.d.ts +8 -0
  6. package/dist/components/TranscriptTextInput.d.ts +10 -0
  7. package/dist/components/TranscriptToggle.d.ts +10 -0
  8. package/dist/index.d.ts +1 -1
  9. package/dist/metric/event-listner.d.ts +143 -0
  10. package/dist/styles/GuideAI.styles.d.ts +1 -1
  11. package/dist/types/GuideAI.types.d.ts +16 -12
  12. package/dist/utils/api.d.ts +4 -3
  13. package/dist/utils/constants.d.ts +4 -0
  14. package/dist/utils/elementInteractions.d.ts +5 -0
  15. package/dist/utils/highlightAndClick.d.ts +3 -0
  16. package/dist/utils/highlightThenClick.d.ts +2 -1
  17. package/dist/utils/hoverAndClick.d.ts +4 -0
  18. package/dist/utils/hoverThenClick.d.ts +2 -1
  19. package/dist/utils/logger.d.ts +6 -5
  20. package/package.json +4 -1
  21. package/.workflow-test +0 -1
  22. package/API_DATA_CONTRACTS.md +0 -516
  23. package/API_SESSIONID_TESTING.md +0 -215
  24. package/GuideAI.d.ts +0 -19
  25. package/GuideAI.js +0 -1
  26. package/GuideAI.js.LICENSE.txt +0 -16
  27. package/GuideAI.js.map +0 -1
  28. package/PII_HASHING_EPIC.md +0 -886
  29. package/PII_HASHING_STORIES_SUMMARY.md +0 -275
  30. package/PRODUCTION_RELEASE.md +0 -126
  31. package/SESSION_ID_VERIFICATION.md +0 -122
  32. package/VISIT_COUNT_TESTING.md +0 -453
  33. package/index.d.ts +0 -7
  34. package/jest.config.js +0 -26
  35. package/jest.setup.js +0 -21
  36. package/metadata-tracking-example.md +0 -324
  37. package/obfuscate.js +0 -40
  38. package/obfuscator.prod.json +0 -24
  39. package/rollup.config.js +0 -34
  40. package/structure.md +0 -128
  41. package/text-input-usage.md +0 -321
  42. package/transcript-toggle-usage.md +0 -267
  43. package/visit-tracking-usage.md +0 -134
  44. package/webpack.config.js +0 -55
  45. package/workflow-trigger-usage.md +0 -398
@@ -1,886 +0,0 @@
1
- # Epic: Implement PII Hashing for Enhanced Privacy & Security
2
-
3
- ## Overview
4
- Implement client-side hashing of Personally Identifiable Information (PII) before it is sent to backend APIs, to improve privacy, security, and GDPR/CCPA compliance. This keeps plain-text PII out of logs, monitoring tools, and third-party services.
5
-
6
- **Impact:**
7
- - ✅ Reduces PII exposure in server logs, APM tools, and monitoring systems
8
- - ✅ Improves GDPR/CCPA compliance through pseudonymization
9
- - ✅ Maintains user tracking capabilities across sessions
10
- - ✅ Reduces attack surface (leaked logs don't expose emails)
11
-
12
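- **Estimated Timeline:** 2 weeks of development (1 week FE, 1 week BE), followed by integration testing and production rollout (4 weeks end to end; see Rollout Plan)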
13
-
14
- ---
15
-
16
- ## Story Map
17
-
18
- ```
19
- Epic: PII Hashing
20
- ├── Frontend Stories (Guide AI Package)
21
- │ ├── FE-1: Create PII Hashing Utility
22
- │ ├── FE-2: Update UserMetadataTracker to Hash Emails
23
- │ ├── FE-3: Update API Layer for Hashed Identifiers
24
- │ ├── FE-4: Add Privacy Configuration Options
25
- │ └── FE-5: Add Unit & Integration Tests
26
-
27
- └── Backend Stories (BE Repo - High Level)
28
- ├── BE-1: Update API Endpoints for Hashed Email Handling
29
- ├── BE-2: Create Email Hash Mapping System
30
- ├── BE-3: Update Database Schema
31
- └── BE-4: Add Email Retrieval & Notification Logic
32
- ```
33
-
34
- ---
35
-
36
- # Frontend Stories (Detailed)
37
-
38
- ## FE-1: Create PII Hashing Utility
39
-
40
- **Priority:** P0 (Blocker)
41
- **Story Points:** 3
42
- **Owner:** Frontend Team
43
-
44
- ### User Story
45
- As a developer, I need a reusable utility function to hash PII data so that sensitive information is never sent in plain text to the backend.
46
-
47
- ### Acceptance Criteria
48
- - [ ] Create `src/utils/pii.ts` with hashing utilities
49
- - [ ] Implement `hashEmail()` function using SHA-256
50
- - [ ] Implement generic `hashPII()` function for other PII fields
51
- - [ ] Functions return consistent hashes (same input = same output)
52
- - [ ] Functions are async and use Web Crypto API
53
- - [ ] Include JSDoc documentation
54
- - [ ] Add TypeScript types for hashed data
55
- - [ ] Works in browser environments (no Node.js dependencies)
56
-
57
- ### Technical Details
58
-
59
- **New File:** `src/utils/pii.ts`
60
-
61
- ```typescript
62
- /**
63
- * PII Hashing Utilities
64
- *
65
- * Provides client-side hashing of Personally Identifiable Information (PII)
66
- * using SHA-256 before sending to backend. This prevents PII exposure in logs,
67
- * monitoring tools, and third-party services.
68
- */
69
-
70
- /**
71
- * Hash an email address using SHA-256
72
- * @param email - The email address to hash
73
- * @returns Promise resolving to hex-encoded hash
74
- */
75
- export async function hashEmail(email: string): Promise<string> {
76
- if (!email || typeof email !== 'string') {
77
- throw new Error('Invalid email: must be a non-empty string');
78
- }
79
-
80
- // Normalize email (lowercase, trim) for consistent hashing
81
- const normalizedEmail = email.toLowerCase().trim();
82
- return hashPII(normalizedEmail);
83
- }
84
-
85
- /**
86
- * Generic PII hashing function using SHA-256
87
- * @param data - The PII string to hash
88
- * @returns Promise resolving to hex-encoded hash
89
- */
90
- export async function hashPII(data: string): Promise<string> {
91
- if (!data || typeof data !== 'string') {
92
- throw new Error('Invalid data: must be a non-empty string');
93
- }
94
-
95
- try {
96
- // Use Web Crypto API (supported in all modern browsers)
97
- const encoder = new TextEncoder();
98
- const dataBuffer = encoder.encode(data);
99
- const hashBuffer = await crypto.subtle.digest('SHA-256', dataBuffer);
100
-
101
- // Convert to hex string
102
- const hashArray = Array.from(new Uint8Array(hashBuffer));
103
- const hashHex = hashArray
104
- .map(byte => byte.toString(16).padStart(2, '0'))
105
- .join('');
106
-
107
- return hashHex;
108
- } catch (error) {
109
- console.error('PII hashing failed:', error);
110
- throw new Error('Failed to hash PII data');
111
- }
112
- }
113
-
114
- /**
115
- * Check if a value is already hashed (64-char hex string = SHA-256)
116
- * @param value - The value to check
117
- * @returns True if value appears to be a SHA-256 hash
118
- */
119
- export function isHashed(value: string): boolean {
120
- if (!value || typeof value !== 'string') return false;
121
- // SHA-256 produces 64 hex characters
122
- return /^[a-f0-9]{64}$/i.test(value);
123
- }
124
- ```
125
-
126
- **Testing Strategy:**
127
- - Unit tests for each function
128
- - Test edge cases (empty string, null, undefined, non-string)
129
- - Test hash consistency (same email → same hash)
130
- - Test normalization (`Email@test.com` and `email@test.com` produce the same hash)
131
-
132
- **Dependencies:** None (uses built-in Web Crypto API)
133
-
134
- ---
135
-
136
- ## FE-2: Update UserMetadataTracker to Hash Emails
137
-
138
- **Priority:** P0 (Blocker)
139
- **Story Points:** 5
140
- **Owner:** Frontend Team
141
- **Depends On:** FE-1
142
-
143
- ### User Story
144
- As a developer, I need the metadata tracker to automatically hash email addresses before storing them locally and sending to the backend, so that PII is never exposed in plain text.
145
-
146
- ### Acceptance Criteria
147
- - [ ] Import hashing utilities in `metadata-tracker.tsx`
148
- - [ ] Hash email in `updateUserInfo()` method before storing
149
- - [ ] Hash email in `trackLogin()` method before creating updates
150
- - [ ] Hash email in all `trackVisit*()` methods before creating updates
151
- - [ ] Hash email in `trackCustomEvent()` method if present
152
- - [ ] Store hashed email in localStorage (not plain text)
153
- - [ ] Update pending updates to include hashed email
154
- - [ ] Maintain backward compatibility with existing stored data
155
- - [ ] Add logging for hashing operations (development mode only)
156
- - [ ] Handle hashing errors gracefully (don't break tracking)
157
-
158
- ### Technical Details
159
-
160
- **Files to Modify:**
161
- - `src/metric/metadata-tracker.tsx`
162
-
163
- **Changes Required:**
164
-
165
- 1. **Import hashing utility** (top of file):
166
- ```typescript
167
- import { hashEmail, isHashed } from '../utils/pii';
168
- ```
169
-
170
- 2. **Update `updateUserInfo()` method** (lines 121-159):
171
- ```typescript
172
- public async updateUserInfo(userInfo: Partial<UserMetadata>): Promise<void> {
173
- const timestamp = Date.now();
174
-
175
- // Extract email from user object if present
176
- const rawEmail = userInfo.user?.email || userInfo.email;
177
-
178
- // Hash email if provided
179
- let hashedEmail: string | undefined;
180
- if (rawEmail) {
181
- try {
182
- // Only hash if not already hashed
183
- hashedEmail = isHashed(rawEmail) ? rawEmail : await hashEmail(rawEmail);
184
- Logger.metadata('Email hashed for user info update', { hashedEmail });
185
- } catch (error) {
186
- Logger.error('Metadata', 'Failed to hash email', error);
187
- // Continue without email rather than exposing plain text
188
- }
189
- }
190
-
191
- // Create updated info with hashed email
192
- const updatedInfo = {
193
- ...userInfo,
194
- ...(hashedEmail && { emailHash: hashedEmail }),
195
- email: undefined // Drop the top-level plain-text email; a nested user.email, if present, must be stripped as well
196
- };
197
-
198
- // ... rest of method
199
- }
200
- ```
201
-
202
- 3. **Update `trackLogin()` method** (lines 161-195):
203
- ```typescript
204
- public async trackLogin(additionalInfo?: Partial<UserMetadata>): Promise<void> {
205
- if (!this.config.trackLogins) return;
206
-
207
- const timestamp = Date.now();
208
-
209
- // ... duplicate check logic ...
210
-
211
- // Reuse the previously stored email hash, if any (nothing is hashed here)
212
- let hashedEmail: string | undefined;
213
- if (this.metadata.emailHash) {
214
- hashedEmail = this.metadata.emailHash;
215
- }
216
-
217
- this.metadata = {
218
- ...this.metadata,
219
- ...additionalInfo,
220
- loginCount: (this.metadata.loginCount || 0) + 1,
221
- lastLogin: timestamp,
222
- lastVisit: timestamp
223
- };
224
-
225
- this.addPendingUpdate({
226
- type: 'login',
227
- timestamp,
228
- data: {
229
- sessionId: this.metadata.sessionId,
230
- ...(hashedEmail && { emailHash: hashedEmail }),
231
- loginCount: this.metadata.loginCount,
232
- lastLogin: timestamp,
233
- ...additionalInfo
234
- }
235
- });
236
-
237
- // ... rest of method
238
- }
239
- ```
240
-
241
- 4. **Similar updates for:**
242
- - `trackVisitIfNewSession()` (lines 197-253)
243
- - `trackVisitManually()` (lines 270-303)
244
- - `trackCustomEvent()` (lines 305-336)
245
-
246
- **Migration Strategy:**
247
- - Existing plain-text emails in localStorage will be hashed on next update
248
- - Add migration helper in `loadMetadata()` to detect and hash old emails
249
- - Keep system working during transition period
250
-
251
- **Error Handling:**
252
- - Wrap hashing calls in try-catch
253
- - Log errors but don't fail the tracking operation
254
- - If hashing fails, omit email rather than sending plain text
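The migration and error-handling rules above can be sketched as a pure planning step that runs before any hashing. This is a minimal illustration, not the package's actual `loadMetadata()` code: `StoredMetadata`, `MigrationPlan`, and `planEmailMigration` are hypothetical names, and the stored record shape is an assumption.

```typescript
// Hypothetical shape of the record persisted in localStorage.
interface StoredMetadata {
  email?: string;
  emailHash?: string;
  [key: string]: unknown;
}

const LEGACY_SHA256_HEX = /^[a-f0-9]{64}$/i;

type MigrationPlan =
  | { action: 'none'; record: StoredMetadata }
  | { action: 'promote'; record: StoredMetadata }
  | { action: 'hash'; record: StoredMetadata; rawEmail: string };

/**
 * Decide how to migrate one stored record. The returned `record` never
 * contains the plain-text `email` key, so persisting it is safe even if
 * the later hashing step fails.
 */
function planEmailMigration(stored: StoredMetadata): MigrationPlan {
  const { email, ...rest } = stored;

  // Nothing to migrate: no usable email value was stored.
  if (typeof email !== 'string' || email.length === 0) {
    return { action: 'none', record: rest };
  }

  // The legacy `email` field already holds a hash: move it to `emailHash`.
  if (LEGACY_SHA256_HEX.test(email)) {
    return {
      action: 'promote',
      record: { ...rest, emailHash: rest.emailHash ?? email.toLowerCase() },
    };
  }

  // Plain-text email: the caller hashes `rawEmail` and, on success,
  // sets `record.emailHash`. On failure it stores `record` unchanged.
  return { action: 'hash', record: rest, rawEmail: email };
}
```

With this split, the caller would pass `rawEmail` to `hashEmail()` inside a try-catch and, if hashing throws, persist `record` as returned, which omits the email rather than keeping it in plain text.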
255
-
256
- ---
257
-
258
- ## FE-3: Update API Layer for Hashed Identifiers
259
-
260
- **Priority:** P0 (Blocker)
261
- **Story Points:** 3
262
- **Owner:** Frontend Team
263
- **Depends On:** FE-1, FE-2
264
-
265
- ### User Story
266
- As a developer, I need the API layer to send hashed email identifiers instead of plain text, so that backend logs and monitoring systems never see PII.
267
-
268
- ### Acceptance Criteria
269
- - [ ] Update `MetadataUpdate` type to use `emailHash` instead of `email`
270
- - [ ] Update `sendMetadataUpdates()` to validate hashed format
271
- - [ ] Ensure no plain-text email is ever sent in API requests
272
- - [ ] Add logging to verify hashed data in requests (dev mode)
273
- - [ ] Update request body structure in API calls
274
- - [ ] Add JSDoc comments explaining hashed fields
275
-
276
- ### Technical Details
277
-
278
- **Files to Modify:**
279
- 1. `src/types/metadata.types.ts`
280
- 2. `src/utils/api.ts`
281
-
282
- **Changes in `metadata.types.ts`:**
283
- ```typescript
284
- export interface UserMetadata {
285
- // Core user identification
286
- userId?: string;
287
-
288
- // DEPRECATED: Use emailHash instead (for backward compatibility only)
289
- email?: string;
290
-
291
- // Hashed email identifier (SHA-256)
292
- // Use this for user identification instead of plain email
293
- emailHash?: string;
294
-
295
- userType?: 'agent' | 'admin' | 'manager' | 'customer' | 'guest' | string;
296
-
297
- // ... rest of interface
298
- }
299
- ```
300
-
301
- **Changes in `api.ts`:**
302
- ```typescript
303
- // Add validation before sending
304
- export const sendMetadataUpdates = async (
305
- updates: MetadataUpdate[],
306
- organizationKey: string,
307
- onError: (error: Error, context: string) => void
308
- ): Promise<boolean> => {
309
- if (updates.length === 0) return true;
310
-
311
- try {
312
- // Validate that no plain-text emails are being sent
313
- if (process.env.NODE_ENV === 'development') {
314
- updates.forEach(update => {
315
- if (update.data.email && !update.data.emailHash) {
316
- console.warn('⚠️ Plain-text email detected in metadata update!', update);
317
- }
318
- });
319
- }
320
-
321
- const requestData = {
322
- organizationKey,
323
- updates,
324
- batchTimestamp: Date.now(),
325
- updateCount: updates.length
326
- };
327
-
328
- Logger.apiCall('POST', '/metadata-updates', requestData);
329
-
330
- // ... rest of method
331
- }
332
- // ... error handling
333
- };
334
- ```
335
-
336
- **Validation Rules:**
337
- - Warn if `email` field exists without corresponding `emailHash`
338
- - Verify `emailHash` matches SHA-256 format (64 hex chars)
339
- - Only in development mode to avoid performance overhead
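The snippet in `api.ts` only covers the first rule. A minimal sketch of a check that covers the first two rules follows; `MetadataUpdateLike` and `findPiiViolations` are hypothetical names, and the update shape is a simplified assumption rather than the real `MetadataUpdate` type.

```typescript
// Hypothetical, simplified view of one queued update.
interface MetadataUpdateLike {
  type: string;
  data: { email?: string; emailHash?: string; [key: string]: unknown };
}

const EMAIL_HASH_FORMAT = /^[a-f0-9]{64}$/i;

/** Return one human-readable message per violated validation rule. */
function findPiiViolations(updates: MetadataUpdateLike[]): string[] {
  const problems: string[] = [];

  updates.forEach((update, index) => {
    const { email, emailHash } = update.data;

    // Rule 1: an `email` field without a corresponding `emailHash`.
    if (email !== undefined && emailHash === undefined) {
      problems.push(`update[${index}] (${update.type}): plain-text email without emailHash`);
    }

    // Rule 2: `emailHash` must look like a SHA-256 digest (64 hex chars).
    if (emailHash !== undefined && !EMAIL_HASH_FORMAT.test(emailHash)) {
      problems.push(`update[${index}] (${update.type}): emailHash is not 64 hex characters`);
    }
  });

  return problems;
}
```

To honor the third rule, the caller would run this only when `process.env.NODE_ENV === 'development'` and log each returned message with `console.warn`.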
340
-
341
- ---
342
-
343
- ## FE-4: Add Privacy Configuration Options
344
-
345
- **Priority:** P1 (High)
346
- **Story Points:** 2
347
- **Owner:** Frontend Team
348
- **Depends On:** FE-1, FE-2
349
-
350
- ### User Story
351
- As a developer integrating GuideAI, I want configuration options to control PII hashing behavior, so I can customize privacy settings based on my application's requirements.
352
-
353
- ### Acceptance Criteria
354
- - [ ] Add `enablePIIHashing` config option (default: true)
355
- - [ ] Add `piiFields` config option to specify which fields to hash
356
- - [ ] Add `allowPlainTextFallback` option for backward compatibility
357
- - [ ] Update `MetadataConfig` type with new options
358
- - [ ] Document new configuration options
359
- - [ ] Respect config in metadata tracker
360
-
361
- ### Technical Details
362
-
363
- **Files to Modify:**
364
- - `src/types/metadata.types.ts`
365
-
366
- **New Configuration:**
367
- ```typescript
368
- export interface MetadataConfig {
369
- // ... existing options ...
370
-
371
- // Privacy & PII Settings
372
-
373
- /**
374
- * Enable automatic PII hashing before storage/transmission
375
- * @default true
376
- */
377
- enablePIIHashing?: boolean;
378
-
379
- /**
380
- * List of PII fields to hash (if enablePIIHashing is true)
381
- * @default ['email']
382
- */
383
- piiFields?: Array<'email' | 'userId' | 'customerId' | string>;
384
-
385
- /**
386
- * Allow plain-text fallback if hashing fails
387
- * WARNING: Only enable for backward compatibility during migration
388
- * @default false
389
- */
390
- allowPlainTextFallback?: boolean;
391
- }
392
- ```
393
-
394
- **Usage Example:**
395
- ```typescript
396
- const tracker = new UserMetadataTracker('org-key', {
397
- enablePIIHashing: true, // Hash PII before sending
398
- piiFields: ['email', 'customerId'], // Which fields to hash
399
- allowPlainTextFallback: false, // Never send plain text
400
- });
401
- ```
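One way the tracker could respect these options is a small decision function applied per field. This is a sketch under assumptions: `PrivacyOptions` mirrors only the three options proposed above, while `resolvePrivacyOptions`, `decideField`, and the `FieldDecision` values are hypothetical.

```typescript
// Mirrors the three proposed options; the helper names are illustrative.
interface PrivacyOptions {
  enablePIIHashing?: boolean;
  piiFields?: string[];
  allowPlainTextFallback?: boolean;
}

type FieldDecision = 'hash' | 'send-plain' | 'omit';

/** Apply the documented defaults to any option left undefined. */
function resolvePrivacyOptions(options: PrivacyOptions = {}): Required<PrivacyOptions> {
  return {
    enablePIIHashing: options.enablePIIHashing ?? true,
    piiFields: options.piiFields ?? ['email'],
    allowPlainTextFallback: options.allowPlainTextFallback ?? false,
  };
}

/** Decide what to do with one field, given whether hashing it succeeded. */
function decideField(
  field: string,
  hashingSucceeded: boolean,
  options: PrivacyOptions = {}
): FieldDecision {
  const resolved = resolvePrivacyOptions(options);

  // Hashing disabled, or the field is not listed as PII: send it unchanged.
  if (!resolved.enablePIIHashing || !resolved.piiFields.includes(field)) {
    return 'send-plain';
  }
  if (hashingSucceeded) return 'hash';

  // Hashing failed: fall back to plain text only if explicitly allowed.
  return resolved.allowPlainTextFallback ? 'send-plain' : 'omit';
}
```

With the defaults, `decideField('email', false)` yields `'omit'`, which matches the rule that a failed hash must never result in a plain-text email being sent.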
402
-
403
- ---
404
-
405
- ## FE-5: Add Unit & Integration Tests
406
-
407
- **Priority:** P0 (Blocker)
408
- **Story Points:** 5
409
- **Owner:** Frontend Team
410
- **Depends On:** FE-1, FE-2, FE-3
411
-
412
- ### User Story
413
- As a developer, I need comprehensive tests for PII hashing functionality to ensure it works correctly and doesn't break existing features.
414
-
415
- ### Acceptance Criteria
416
- - [ ] Unit tests for `pii.ts` utility functions (90%+ coverage)
417
- - [ ] Unit tests for metadata tracker hashing logic (90%+ coverage)
418
- - [ ] Integration tests for end-to-end hashing flow
419
- - [ ] Tests for error handling and edge cases
420
- - [ ] Tests for backward compatibility with existing data
421
- - [ ] Tests verify no plain-text emails in API calls
422
- - [ ] Mock Web Crypto API for tests
423
- - [ ] All tests pass in CI/CD pipeline
424
-
425
- ### Technical Details
426
-
427
- **New Test Files:**
428
- 1. `src/utils/pii.test.ts` (unit tests)
429
- 2. `src/metric/metadata-tracker.pii.test.ts` (integration tests)
430
-
431
- **Test Cases:**
432
-
433
- **Unit Tests (`pii.test.ts`):**
434
- ```typescript
435
- describe('PII Hashing Utils', () => {
436
- describe('hashEmail', () => {
437
- it('should hash email to 64-char hex string', async () => {
438
- const hash = await hashEmail('user@example.com');
439
- expect(hash).toMatch(/^[a-f0-9]{64}$/);
440
- });
441
-
442
- it('should produce consistent hashes', async () => {
443
- const hash1 = await hashEmail('user@example.com');
444
- const hash2 = await hashEmail('user@example.com');
445
- expect(hash1).toBe(hash2);
446
- });
447
-
448
- it('should normalize emails (case-insensitive)', async () => {
449
- const hash1 = await hashEmail('User@Example.com');
450
- const hash2 = await hashEmail('user@example.com');
451
- expect(hash1).toBe(hash2);
452
- });
453
-
454
- it('should throw error for invalid input', async () => {
455
- await expect(hashEmail('')).rejects.toThrow();
456
- await expect(hashEmail(null as any)).rejects.toThrow();
457
- });
458
- });
459
-
460
- describe('isHashed', () => {
461
- it('should identify valid hashes', () => {
462
- const validHash = 'a'.repeat(64);
463
- expect(isHashed(validHash)).toBe(true);
464
- });
465
-
466
- it('should reject invalid hashes', () => {
467
- expect(isHashed('short')).toBe(false);
468
- expect(isHashed('user@example.com')).toBe(false);
469
- expect(isHashed('')).toBe(false);
470
- });
471
- });
472
- });
473
- ```
474
-
475
- **Integration Tests (`metadata-tracker.pii.test.ts`):**
476
- ```typescript
477
- describe('MetadataTracker PII Hashing', () => {
478
- it('should hash email in updateUserInfo', async () => {
479
- const tracker = new UserMetadataTracker('test-org');
480
- tracker.init();
481
-
482
- await tracker.updateUserInfo({ email: 'user@test.com' });
483
-
484
- const metadata = tracker.getMetadata();
485
- expect(metadata.email).toBeUndefined();
486
- expect(metadata.emailHash).toMatch(/^[a-f0-9]{64}$/);
487
- });
488
-
489
- it('should not send plain-text email to API', async () => {
490
- const mockSendMetadata = jest.fn();
491
- const tracker = new UserMetadataTracker('test-org'); // ... and mock the API call
492
-
493
- await tracker.updateUserInfo({ email: 'user@test.com' });
494
- await tracker.emitPendingUpdates();
495
-
496
- const apiCall = mockSendMetadata.mock.calls[0][0];
497
- const hasPlainEmail = JSON.stringify(apiCall).includes('user@test.com');
498
- expect(hasPlainEmail).toBe(false);
499
- });
500
-
501
- it('should handle hashing errors gracefully', async () => {
502
- // Mock crypto.subtle.digest to fail
503
- jest.spyOn(crypto.subtle, 'digest').mockRejectedValue(new Error('Crypto failed'));
504
-
505
- const tracker = new UserMetadataTracker('test-org');
506
-
507
- // Should not throw
508
- await expect(
509
- tracker.updateUserInfo({ email: 'user@test.com' })
510
- ).resolves.not.toThrow();
511
-
512
- // Email should be omitted (not sent in plain text)
513
- const updates = tracker.getPendingUpdates();
514
- expect(updates[0].data.email).toBeUndefined();
515
- expect(updates[0].data.emailHash).toBeUndefined();
516
- });
517
- });
518
- ```
519
-
520
- **Run Tests:**
521
- ```bash
522
- npm test -- --coverage
523
- ```
524
-
525
- ---
526
-
527
- # Backend Stories (High Level)
528
-
529
- ## BE-1: Update API Endpoints for Hashed Email Handling
530
-
531
- **Priority:** P0 (Blocker)
532
- **Story Points:** 8
533
- **Owner:** Backend Team
534
-
535
- ### User Story
536
- As a backend developer, I need to update API endpoints to receive and process hashed emails instead of plain-text emails, so that our logs and monitoring systems never contain PII.
537
-
538
- ### Acceptance Criteria
539
- - [ ] Update `/initialize-session` endpoint to accept `emailHash` instead of `email`
540
- - [ ] Update `/metadata-updates` endpoint to process `emailHash` field
541
- - [ ] Update `/conversations/:id/messages` endpoint if it handles user data
542
- - [ ] Validate that `emailHash` matches SHA-256 format (64 hex chars)
543
- - [ ] Add backward compatibility for transition period
544
- - [ ] Update API documentation/OpenAPI spec
545
- - [ ] Log hashed emails only, never plain text
546
- - [ ] Deploy to staging environment for testing
547
-
548
- ### Technical Notes
549
- - Accept both `email` and `emailHash` during migration (deprecate `email` later)
550
- - Validate hash format: `/^[a-f0-9]{64}$/i`
551
- - Update request validation schemas
552
- - Ensure no plain-text emails in application logs
553
- - Update error messages to not expose PII
554
-
555
- ### Backend Framework Considerations
556
- - If using Express/Node: Update request validators
557
- - If using Django/Python: Update serializers/validators
558
- - If using Spring/Java: Update DTOs and validation annotations
559
-
560
- ---
561
-
562
- ## BE-2: Create Email Hash Mapping System
563
-
564
- **Priority:** P0 (Blocker)
565
- **Story Points:** 13
566
- **Owner:** Backend Team
567
- **Depends On:** BE-1
568
-
569
- ### User Story
570
- As a backend developer, I need a secure system to store and retrieve the mapping between hashed emails and actual email addresses, so we can send notifications while keeping hashes in logs.
571
-
572
- ### Acceptance Criteria
573
- - [ ] Create `email_mappings` table/collection
574
- - [ ] Store mapping: `emailHash → encryptedEmail`
575
- - [ ] Encrypt actual emails at rest in database
576
- - [ ] Implement secure key management for encryption
577
- - [ ] Create API for email lookup by hash
578
- - [ ] Add strict access controls (only specific services can decrypt)
579
- - [ ] Implement audit logging for email decryption
580
- - [ ] Add cache layer for frequently accessed mappings (Redis)
581
- - [ ] Handle hash collisions (extremely unlikely but possible)
582
- - [ ] Add data retention policies
583
-
584
- ### Technical Notes
585
-
586
- **Database Schema (Example):**
587
- ```sql
588
- CREATE TABLE email_mappings (
589
- email_hash VARCHAR(64) PRIMARY KEY,
590
- encrypted_email BYTEA NOT NULL,
591
- encryption_key_id VARCHAR(50) NOT NULL,
592
- organization_key VARCHAR(100) NOT NULL,
593
- first_seen TIMESTAMP DEFAULT NOW(),
594
- last_seen TIMESTAMP DEFAULT NOW(),
595
- access_count INTEGER DEFAULT 0
596
- );
597
- CREATE INDEX idx_org_key ON email_mappings (organization_key);
598
- ```
599
-
600
- **Key Points:**
601
- - Use AES-256-GCM for email encryption
602
- - Store encryption key in KMS (AWS KMS, HashiCorp Vault, etc.)
603
- - Rotate encryption keys periodically
604
- - Log all email decryption operations
605
- - Implement rate limiting on lookups
606
- - Cache mappings in Redis with TTL
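The AES-256-GCM point can be illustrated with Node.js's built-in `crypto` module. The epic does not fix a backend language, so Node.js is an assumption here, as are the function names and the `iv || tag || ciphertext` storage layout. The randomly generated key is a placeholder for the sketch only; in the design above the key would come from a KMS.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

const IV_BYTES = 12;  // 96-bit nonce, the size recommended for GCM
const TAG_BYTES = 16; // 128-bit authentication tag (Node's default for GCM)

/** Encrypt an email; returns iv || tag || ciphertext as a single buffer. */
function encryptEmail(email: string, key: Buffer): Buffer {
  const iv = randomBytes(IV_BYTES); // must be unique per encryption under one key
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(email, 'utf8'), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]);
}

/** Decrypt a buffer produced by encryptEmail; throws if it was tampered with. */
function decryptEmail(blob: Buffer, key: Buffer): string {
  const iv = blob.subarray(0, IV_BYTES);
  const tag = blob.subarray(IV_BYTES, IV_BYTES + TAG_BYTES);
  const ciphertext = blob.subarray(IV_BYTES + TAG_BYTES);

  const decipher = createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag); // must be set before final()
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8');
}

// Placeholder 256-bit key for demonstration; never generate keys this way in production.
const demoKey = randomBytes(32);
const storedBlob = encryptEmail('user@example.com', demoKey);
console.log(decryptEmail(storedBlob, demoKey)); // prints "user@example.com"
```

The resulting buffer could be written to the `encrypted_email` column, with `encryption_key_id` recording which key was used so that rotated keys remain resolvable.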
607
-
608
- **API Design:**
609
- ```
610
- POST /internal/email-mappings
611
- - Register new hash → email mapping
612
- - Returns: success status
613
-
614
- GET /internal/email-mappings/{hash}
615
- - Retrieve actual email for notification
616
- - Requires internal service authentication
617
- - Returns: decrypted email
618
- ```
619
-
620
- ---
621
-
622
- ## BE-3: Update Database Schema
623
-
624
- **Priority:** P0 (Blocker)
625
- **Story Points:** 5
626
- **Owner:** Backend Team
627
- **Depends On:** BE-2
628
-
629
- ### User Story
630
- As a backend developer, I need to update database schemas to use `emailHash` instead of `email` for user identification in metadata/analytics tables.
631
-
632
- ### Acceptance Criteria
633
- - [ ] Add `email_hash` column to relevant tables
634
- - [ ] Create migration script for existing data
635
- - [ ] Hash existing plain-text emails in database
636
- - [ ] Add index on `email_hash` for performance
637
- - [ ] Update all queries to use `email_hash`
638
- - [ ] Keep `email` column temporarily for backward compatibility
639
- - [ ] Plan deprecation of `email` column (after migration complete)
640
- - [ ] Test migration on staging database first
641
- - [ ] Verify no performance degradation after migration
642
-
643
- ### Technical Notes
644
-
645
- **Tables to Update:**
646
- - `user_metadata`
647
- - `user_sessions`
648
- - `user_visits`
649
- - `conversation_logs`
650
- - Any other tables storing user identifiers
651
-
652
- **Migration Strategy:**
653
- ```sql
654
- -- Step 1: Add new column
655
- ALTER TABLE user_metadata
656
- ADD COLUMN email_hash VARCHAR(64);
657
-
658
- -- Step 2: Backfill hashes (run incrementally)
659
- -- Use same SHA-256 algorithm as frontend
660
-
661
- -- Step 3: Create index
662
- CREATE INDEX idx_email_hash ON user_metadata(email_hash);
663
-
664
- -- Step 4: Update application code to use email_hash
665
-
666
- -- Step 5: (Future) Drop email column after verification period
667
- -- ALTER TABLE user_metadata DROP COLUMN email;
668
- ```
669
-
670
- **Data Migration:**
671
- - Hash existing emails using same algorithm (SHA-256)
672
- - Normalize emails before hashing (lowercase, trim)
673
- - Store hashes in email_mappings table
674
- - Run migration in batches to avoid locking tables
675
- - Verify hash consistency between FE and BE
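A backfill hash that matches the frontend `hashEmail()` has to use the same normalization, UTF-8 encoding, and lowercase hex output. The sketch below assumes a Node.js backfill script; `hashEmailForBackfill` and `toBatches` are hypothetical helper names.

```typescript
import { createHash } from 'node:crypto';

/** SHA-256 of the normalized email, hex-encoded, matching the FE utility. */
function hashEmailForBackfill(email: string): string {
  // Same normalization as the frontend: lowercase, then trim.
  const normalized = email.toLowerCase().trim();
  return createHash('sha256').update(normalized, 'utf8').digest('hex');
}

/** Split rows into fixed-size batches so each UPDATE touches few rows. */
function toBatches<T>(rows: T[], batchSize: number): T[][] {
  if (batchSize < 1) throw new RangeError('batchSize must be at least 1');
  const batches: T[][] = [];
  for (let i = 0; i < rows.length; i += batchSize) {
    batches.push(rows.slice(i, i + batchSize));
  }
  return batches;
}
```

A simple consistency check is to hash one fixed address in the browser and in the backfill script and compare the two hex strings before running the migration.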
676
-
677
- ---
678
-
679
- ## BE-4: Add Email Retrieval & Notification Logic
680
-
681
- **Priority:** P1 (High)
682
- **Story Points:** 8
683
- **Owner:** Backend Team
684
- **Depends On:** BE-2, BE-3
685
-
686
- ### User Story
687
- As a backend developer, I need to implement secure email retrieval logic for sending notifications, so that we can contact users while keeping hashed identifiers in all other systems.
688
-
689
- ### Acceptance Criteria
690
- - [ ] Create service layer for email retrieval
691
- - [ ] Implement notification service that looks up emails by hash
692
- - [ ] Add authentication/authorization for email access
693
- - [ ] Log all email access attempts (audit trail)
694
- - [ ] Implement rate limiting on email lookups
695
- - [ ] Add monitoring/alerts for suspicious access patterns
696
- - [ ] Test email notifications work end-to-end
697
- - [ ] Document email retrieval API for internal services
698
-
699
- ### Technical Notes
700
-
701
- **Service Design:**
702
- ```python
703
- # Example (Python/Django)
704
- class EmailRetrievalService:
705
- def get_email_by_hash(self, email_hash: str, requesting_service: str) -> str:
706
- """
707
- Retrieve actual email from hash for notification purposes.
708
-
709
- Args:
710
- email_hash: SHA-256 hash of email
711
- requesting_service: Name of service requesting email
712
-
713
- Returns:
714
- Decrypted email address
715
-
716
- Raises:
717
- UnauthorizedException: If service not allowed
718
- EmailMapping.DoesNotExist: If hash not found (raised by objects.get())
719
- """
720
- # Check service authorization
721
- if not self.is_authorized(requesting_service):
722
- audit_log.warning(f"Unauthorized email access attempt by {requesting_service}")
723
- raise UnauthorizedException()
724
-
725
- # Retrieve mapping
726
- mapping = EmailMapping.objects.get(email_hash=email_hash)
727
-
728
- # Decrypt email
729
- email = self.decrypt_email(mapping.encrypted_email, mapping.encryption_key_id)
730
-
731
- # Audit log
732
- audit_log.info(f"Email retrieved for notification by {requesting_service}", extra={
733
- "email_hash": email_hash,
734
- "service": requesting_service
735
- })
736
-
737
- return email
738
- ```
739
-
740
- **Access Control:**
741
- - Only notification service can retrieve emails
742
- - Use service-to-service authentication (API keys, OAuth)
743
- - Implement IP whitelisting if the services run on a private network
744
- - Rate limit: max 100 lookups per minute per service
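The per-service limit could be enforced with a fixed-window counter, sketched below. `LookupRateLimiter` is a hypothetical class, the in-memory `Map` is an assumption suitable for a single process only, and a shared store such as Redis would be needed across instances. A fixed window is the simplest option; it allows short bursts at window boundaries that a sliding window would not.

```typescript
/** Fixed-window limiter: at most `limit` lookups per `windowMs` per service. */
class LookupRateLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(limit = 100, windowMs = 60000, now: () => number = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now; // injectable clock, which keeps the limiter testable
  }

  /** Returns true and counts the lookup if the service is under its limit. */
  tryAcquire(service: string): boolean {
    const current = this.now();
    const window = this.windows.get(service);

    // First lookup, or the previous window has expired: start a new window.
    if (!window || current - window.start >= this.windowMs) {
      this.windows.set(service, { start: current, count: 1 });
      return true;
    }

    if (window.count >= this.limit) return false;
    window.count += 1;
    return true;
  }
}
```

A retrieval service would call `tryAcquire(requestingService)` before decrypting and reject the request when it returns `false`.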
745
-
746
- **Monitoring:**
747
- - Alert on unusual access patterns
748
- - Track email retrieval frequency
749
- - Monitor for brute force attempts
750
- - Dashboard for email access audit logs
751
-
752
- ---
753
-
754
- # Definition of Done
755
-
756
- ## Frontend
757
- - [ ] All FE stories completed and tested
758
- - [ ] Unit tests pass with 90%+ coverage
759
- - [ ] Integration tests pass
760
- - [ ] No plain-text emails in localStorage
761
- - [ ] No plain-text emails in API requests (verified in network tab)
762
- - [ ] Backward compatible with existing installations
763
- - [ ] Documentation updated
764
- - [ ] Code reviewed and approved
765
- - [ ] Deployed to staging
766
- - [ ] QA tested
767
-
768
- ## Backend
769
- - [ ] All BE stories completed and tested
770
- - [ ] Database migration completed successfully
771
- - [ ] Email mappings working correctly
772
- - [ ] Notifications still function properly
773
- - [ ] No plain-text emails in application logs
774
- - [ ] Audit logging in place
775
- - [ ] API documentation updated
776
- - [ ] Load testing completed
777
- - [ ] Security review completed
778
- - [ ] Deployed to staging
779
- - [ ] QA tested
780
-
781
- ## System-Wide
782
- - [ ] End-to-end testing completed (FE → BE → DB)
783
- - [ ] Privacy audit passed
784
- - [ ] Performance benchmarks met
785
- - [ ] Rollback plan documented
786
- - [ ] Production deployment plan approved
787
- - [ ] Monitoring/alerts configured
788
- - [ ] Team training completed
789
-
790
- ---
791
-
792
- # Risks & Mitigations
793
-
794
- | Risk | Impact | Mitigation |
795
- |------|--------|------------|
796
- | Breaking existing installations | High | Maintain backward compatibility; gradual migration |
797
- | Hashing errors in browser | Medium | Graceful fallback; comprehensive error handling |
798
- | Performance impact of hashing | Low | SHA-256 is fast; test with load testing |
799
- | Database migration failure | High | Test on staging first; incremental migration; rollback plan |
800
- | Email mapping not found | Medium | Handle missing mappings gracefully; alert on failures |
801
- | Encryption key compromise | Critical | Use KMS; rotate keys regularly; audit access |
802
-
803
- ---
804
-
805
- # Testing Strategy
806
-
807
- ## Phase 1: Unit Testing
808
- - Test all new utilities in isolation
809
- - Mock external dependencies
810
- - Cover edge cases and error scenarios
811
-
812
- ## Phase 2: Integration Testing
813
- - Test FE → BE data flow
814
- - Verify hashed data in API calls
815
- - Test email retrieval for notifications
816
-
817
- ## Phase 3: Migration Testing
818
- - Test backward compatibility with existing data
819
- - Verify data migration scripts
820
- - Test rollback procedures
821
-
822
- ## Phase 4: Security Testing
823
- - Verify no PII in logs (grep for email patterns)
824
- - Test encryption/decryption
825
- - Audit access controls
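The "grep for email patterns" check can be automated with a small scanner. This is a rough sketch: `findLinesWithEmails` is a hypothetical helper, and the regular expression is a deliberately loose approximation of an email address, not a full RFC 5322 matcher.

```typescript
// Loose email pattern: good enough to flag likely leaks in log output.
const EMAIL_IN_LOG = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/;

/** Return the 1-based numbers of all log lines that contain an email-like string. */
function findLinesWithEmails(logText: string): number[] {
  const flagged: number[] = [];
  logText.split('\n').forEach((line, index) => {
    if (EMAIL_IN_LOG.test(line)) flagged.push(index + 1);
  });
  return flagged;
}
```

Run against a captured log sample, an empty result supports the "0 plain-text emails in server logs" success metric, while any returned line numbers point to where PII still leaks.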
826
-
827
- ## Phase 5: Performance Testing
828
- - Load test hashing operations
829
- - Database query performance with new indexes
830
- - Cache hit rates for email mappings
831
-
832
- ---
833
-
834
- # Rollout Plan
835
-
836
- ## Week 1: Frontend Development
837
- - Days 1-2: FE-1 (PII utility)
838
- - Days 3-4: FE-2 (Metadata tracker updates)
839
- - Day 5: FE-3 (API layer updates)
840
-
841
- ## Week 2: Backend Development + Testing
842
- - Days 1-2: BE-1 (API endpoints)
843
- - Days 3-4: BE-2 (Email mapping system)
844
- - Day 5: BE-3 (Database schema updates)
845
-
846
- ## Week 3: Integration & Testing
847
- - Days 1-2: FE-5 (Tests) + BE-4 (Email retrieval)
848
- - Days 3-4: Integration testing + QA
849
- - Day 5: Staging deployment + validation
850
-
851
- ## Week 4: Production Rollout
852
- - Day 1: Production deployment (FE first)
853
- - Day 2: Monitor for issues
854
- - Day 3: BE deployment (staged rollout)
855
- - Day 4: Full system verification
856
- - Day 5: Retrospective + documentation
857
-
858
- ---
859
-
860
- # Success Metrics
861
-
862
- - [ ] 0 plain-text emails in server logs (verified by log analysis)
863
- - [ ] 100% of new sessions use hashed emails
864
- - [ ] < 50ms latency impact from hashing operations
865
- - [ ] 0 notification delivery failures due to mapping issues
866
- - [ ] 99.9% uptime during migration
867
- - [ ] Privacy audit score improvement by 20%+
868
-
869
- ---
870
-
871
- # Questions & Decisions
872
-
873
- | Question | Decision | Date | Owner |
874
- |----------|----------|------|-------|
875
- | Which PII fields to hash? | Start with email only, expand later | TBD | Product |
876
- | Backward compatibility period? | 3 months | TBD | Engineering |
877
- | Encryption key rotation schedule? | Quarterly | TBD | Security |
878
- | What to do with existing logs? | Redact after migration | TBD | DevOps |
879
-
880
- ---
881
-
882
- **Epic Created:** [Date]
883
- **Last Updated:** [Date]
884
- **Epic Owner:** [Name]
885
- **Status:** Planning
886
-