email-origin-chain 1.0.0

Files changed (43)
  1. package/LICENSE +21 -0
  2. package/README.md +425 -0
  3. package/dist/detectors/crisp-detector.d.ts +11 -0
  4. package/dist/detectors/crisp-detector.js +46 -0
  5. package/dist/detectors/index.d.ts +5 -0
  6. package/dist/detectors/index.js +11 -0
  7. package/dist/detectors/new-outlook-detector.d.ts +10 -0
  8. package/dist/detectors/new-outlook-detector.js +112 -0
  9. package/dist/detectors/outlook-empty-header-detector.d.ts +16 -0
  10. package/dist/detectors/outlook-empty-header-detector.js +64 -0
  11. package/dist/detectors/outlook-fr-detector.d.ts +10 -0
  12. package/dist/detectors/outlook-fr-detector.js +119 -0
  13. package/dist/detectors/outlook-reverse-fr-detector.d.ts +13 -0
  14. package/dist/detectors/outlook-reverse-fr-detector.js +86 -0
  15. package/dist/detectors/registry.d.ts +25 -0
  16. package/dist/detectors/registry.js +81 -0
  17. package/dist/detectors/reply-detector.d.ts +11 -0
  18. package/dist/detectors/reply-detector.js +82 -0
  19. package/dist/detectors/types.d.ts +38 -0
  20. package/dist/detectors/types.js +2 -0
  21. package/dist/index.d.ts +6 -0
  22. package/dist/index.js +132 -0
  23. package/dist/inline-layer.d.ts +7 -0
  24. package/dist/inline-layer.js +116 -0
  25. package/dist/mime-layer.d.ts +15 -0
  26. package/dist/mime-layer.js +70 -0
  27. package/dist/types.d.ts +63 -0
  28. package/dist/types.js +2 -0
  29. package/dist/utils/cleaner.d.ts +16 -0
  30. package/dist/utils/cleaner.js +51 -0
  31. package/dist/utils.d.ts +17 -0
  32. package/dist/utils.js +221 -0
  33. package/docs/TEST_COVERAGE.md +54 -0
  34. package/docs/architecture/README.md +27 -0
  35. package/docs/architecture/phase1_cc_fix.md +223 -0
  36. package/docs/architecture/phase2_plugin_foundation.md +185 -0
  37. package/docs/architecture/phase3_fallbacks.md +62 -0
  38. package/docs/architecture/plugin_plan.md +318 -0
  39. package/docs/architecture/refactor_report.md +98 -0
  40. package/docs/detectors_usage.md +42 -0
  41. package/docs/walkthrough_address_fix.md +58 -0
  42. package/docs/walkthrough_deep_forward_fix.md +35 -0
  43. package/package.json +48 -0
package/docs/walkthrough_address_fix.md ADDED
@@ -0,0 +1,58 @@
1
+ # Address Normalization Fix
2
+
3
+ ## Summary
4
+ Fixed a critical issue where email addresses taken from MIME headers were not cleaned properly, which caused test failures. The `normalizeFrom` function now correctly handles complex Outlook address formats with nested `<mailto:...>` patterns.
5
+
6
+ ## Changes Made
7
+
8
+ ### [`src/utils.ts`](file:///c:/Users/Flo/.gemini/antigravity/ProjetsPerso/email-deepest-forward/src/utils.ts)
9
+ - **Rewrote `normalizeFrom` function** with preprocessing step
10
+ - Strips all `<mailto:...>` patterns using global regex
11
+ - Counts `<` and `>` brackets to remove excess trailing `>`
12
+ - Handles complex patterns like: `"Flo M." <florian.mezy@gmail.com<mailto:florian.mezy@gmail.com>>`
13
+ - Extracts email from `"Name" <email>` format when it appears in address field
14
+ - Validates extracted emails before returning
15
+
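The preprocessing steps listed above can be sketched roughly as follows. This is a minimal illustration, not the package's actual `normalizeFrom`: the function name `sketchNormalizeAddress`, the `SketchAddress` shape, and the loose validation regex are all assumptions made for this example, and the `email [email]` pattern is not covered.

```typescript
// Hypothetical sketch; NOT the package's real normalizeFrom implementation.
interface SketchAddress {
  displayName: string;
  address: string;
}

// Deliberately loose validity check, assumed for illustration only.
const SKETCH_EMAIL_RE = /^[^\s<>@]+@[^\s<>@]+\.[^\s<>@]+$/;

function sketchNormalizeAddress(raw: string | null | undefined): SketchAddress | null {
  if (!raw) return null;

  // 1. Strip every "<mailto:...>" fragment with a global regex.
  let cleaned = raw.replace(/<mailto:[^>]*>/gi, "");

  // 2. Count brackets and drop trailing ">" characters that have no matching "<".
  const opens = (cleaned.match(/</g) ?? []).length;
  let closes = (cleaned.match(/>/g) ?? []).length;
  while (closes > opens && />\s*$/.test(cleaned)) {
    cleaned = cleaned.replace(/>\s*$/, "");
    closes -= 1;
  }

  // 3. Split `"Name" <email>`; otherwise treat the whole string as the address.
  const m = cleaned.match(/^\s*"?([^"<]*?)"?\s*<([^<>]+)>\s*$/);
  const displayName = m ? m[1].trim() : "";
  const address = (m ? m[2] : cleaned).trim();

  // 4. Validate the extracted address before returning it.
  return SKETCH_EMAIL_RE.test(address) ? { displayName, address } : null;
}

// Placeholder address, same shape as the nested pattern described above.
console.log(sketchNormalizeAddress('"Jane D." <jane@example.com<mailto:jane@example.com>>'));
// { displayName: "Jane D.", address: "jane@example.com" }
```

Removing the `<mailto:...>` fragments first is what makes the bracket count meaningful: once they are gone, any surplus `>` can only be a leftover from the nested pattern.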
16
+ ### [`src/index.ts`](file:///c:/Users/Flo/.gemini/antigravity/ProjetsPerso/email-deepest-forward/src/index.ts)
17
+ - **Added `normalizeFrom` import** and calls in 3 locations:
18
+ 1. Line 44: Normalize `inlineResult.from` before using it
19
+ 2. Line 53: Normalize MIME metadata fallback `from`
20
+ 3. Line 67: Normalize history root entry `from`
21
+ - **Fixed object spread order**: Destructured `inlineResult` to exclude `from` field, preventing it from overwriting our normalized value
22
+
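The spread-order problem described above can be reproduced in a few lines. The sketch below assumes a simplified result shape (`SketchInlineResult`) and a stand-in cleaner (`sketchCleanFrom`); neither is taken from the package, and the exact buggy expression in `src/index.ts` is not shown in this document.

```typescript
// Hypothetical result shape; the package's real types may differ.
interface SketchInlineResult {
  from?: string | null;
  subject: string;
  text: string;
}

// Stand-in for the real normalizer, assumed for this sketch only.
function sketchCleanFrom(value: string | null | undefined): string | null {
  return value ? value.replace(/<mailto:[^>]*>/gi, "").trim() : null;
}

function buildBuggy(inline: SketchInlineResult) {
  // Bug pattern: the spread comes last, so the raw `from` overwrites the normalized one.
  return { from: sketchCleanFrom(inline.from), ...inline };
}

function buildFixed(inline: SketchInlineResult) {
  // Fix pattern: pull `from` out first, then spread only the remaining fields.
  const { from, ...rest } = inline;
  return { ...rest, from: sketchCleanFrom(from) };
}

const demoInline: SketchInlineResult = {
  from: "jane@example.com<mailto:jane@example.com>",
  subject: "Hello",
  text: "Body",
};

console.log(buildBuggy(demoInline).from); // raw value, still contains <mailto:...>
console.log(buildFixed(demoInline).from); // "jane@example.com"
```

Destructuring `from` out of the source object makes the result independent of property order, which is easier to keep correct than relying on where the spread sits.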
23
+ ### [`tests/utils.test.ts`](file:///c:/Users/Flo/.gemini/antigravity/ProjetsPerso/email-deepest-forward/tests/utils.test.ts) [NEW]
24
+ - Created comprehensive unit tests for `normalizeFrom`
25
+ - Tests cover:
26
+ - Standard `Name <email>` format
27
+ - `email [email]` pattern
28
+ - `email<mailto:email>` pattern
29
+ - `email<mailto:email>>` pattern (double bracket)
30
+ - Email extraction from name field
31
+ - Edge cases (null, empty, simple addresses)
32
+
33
+ ## Test Results
34
+
35
+ ### ✅ Passing (20/22)
36
+ - All `utils.test.ts` tests pass (8/8)
37
+ - All `crisp-fixtures.test.ts` tests pass
38
+ - All `basic.test.ts`, `comprehensive.test.ts`, `integration.test.ts` pass
39
+ - `empty-body-forward-anonymized.eml` test passes
40
+
41
+ ### ⚠️ Remaining Failures (2/22)
42
+ Both failures are **depth detection issues** (the original problem):
43
+
44
+ 1. **`complex-forward.eml`**
45
+ - Expected depth: 5
46
+ - Received depth: 4
47
+ - One forward level is not detected
48
+
49
+ 2. **`extreme-forward-anonymized.eml`**
50
+ - Expected depth: 4
51
+ - Received depth: 3
52
+ - One forward level is not detected
53
+
54
+ ## Next Steps
55
+ The address normalization is now working correctly. The remaining work is to:
56
+ 1. Debug why `complex-forward.eml` only detects 4 levels instead of 5
57
+ 2. Debug why `extreme-forward-anonymized.eml` only detects 3 levels instead of 4
58
+ 3. Examine the detector chain and the regex patterns in `OutlookEmptyHeaderDetector` and `OutlookReverseFrDetector`, the likely source of both misses
package/docs/walkthrough_deep_forward_fix.md ADDED
@@ -0,0 +1,35 @@
1
+ # Deep Forward Parsing Fix - Final Report
2
+
3
+ ## Summary
4
+ Successfully resolved all test failures in `eml-fixture.test.ts`, achieving a **100% pass rate (23/23 tests)**. The library now correctly detects deep nested forwards even in complex mixed scenarios (Outlook, English, French, Empty Headers).
5
+
6
+ ## Key Fixes Implemented
7
+
8
+ ### 1. Fix: Missing Outlook Forward (Empty Header)
9
+ **Problem**: The `OutlookEmptyHeaderDetector` was failing to detect blocks like:
10
+ ```text
11
+ ________________________________
12
+ De: Florian M.
13
+ ```
14
+ **Root Cause**: The regex `(?:_{30,}\s*)?` was "greedy": because `\s` also matches line breaks, it consumed the newline after the separator. The subsequent `\r?\nDe` then failed because it expected *another* newline, which wasn't there.
15
+ **Solution**: Modified `src/detectors/outlook-empty-header-detector.ts` to use `[ \t]*` instead of `\s*` so the separator no longer eats newlines, and improved stability against quoted-printable (QP) encoding errors.
16
+
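The difference between the two character classes can be checked in isolation. The snippet below is only an illustration of `\s*` versus `[ \t]*` on a separator line; it uses a shortened pattern, not the detector's real regex, and whether the full expression fails to match depends on parts of it that are not shown in this report.

```typescript
// Sample text shaped like the block quoted above: separator line, then "De:".
const sampleBlock = "_".repeat(32) + "\nDe: Florian M.";

// `\s*` can match line breaks, so the separator match runs past the end of the line.
const withWhitespaceClass = /_{30,}\s*/.exec(sampleBlock);

// `[ \t]*` matches only spaces and tabs, so the match stops before the line break.
const withBlankClass = /_{30,}[ \t]*/.exec(sampleBlock);

function matchEndsWithNewline(m: RegExpExecArray | null): boolean {
  return m !== null && /\n$/.test(m[0]);
}

console.log(matchEndsWithNewline(withWhitespaceClass)); // true
console.log(matchEndsWithNewline(withBlankClass)); // false
```

Restricting the class to horizontal whitespace leaves the line break available for whatever comes next in the pattern, which makes the expression easier to reason about line by line.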
17
+ ### 2. Fix: Missing "Intermediate" Levels (Registry Priority)
18
+ **Problem**: In mixed chains, standard detectors (Crisp, priority 0) were detecting *later* forwards first, effectively "skipping" earlier non-standard Outlook forwards.
19
+ **Root Cause**: `DetectorRegistry` returned the *first found* result based on priority, regardless of position in text.
20
+ **Solution**: Updated `src/detectors/registry.ts` to run ALL detectors and return the match that appears **earliest** in the text (smallest index). This ensures the chain is processed in the order the forwards appear in the text, so no intermediate level is skipped.
21
+
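The change in selection strategy can be sketched as below. The interfaces, the `makeDetector` helper, and the marker strings are hypothetical and exist only for this example; the sketch also assumes that a lower priority number runs first, which this report implies but does not state.

```typescript
// Hypothetical detector shape; the package's real interfaces may differ.
interface SketchMatch {
  detector: string;
  index: number; // offset of the forward header in the text
}

interface SketchDetector {
  id: string;
  priority: number; // assumption: lower number is tried first
  detect(text: string): SketchMatch | null;
}

function makeDetector(id: string, priority: number, marker: string): SketchDetector {
  return {
    id,
    priority,
    detect(text: string): SketchMatch | null {
      const index = text.indexOf(marker);
      return index === -1 ? null : { detector: id, index };
    },
  };
}

// Old behaviour as described: the first hit in priority order wins.
function firstByPriority(detectors: SketchDetector[], text: string): SketchMatch | null {
  const ordered = detectors.slice().sort((a, b) => a.priority - b.priority);
  for (const d of ordered) {
    const hit = d.detect(text);
    if (hit) return hit;
  }
  return null;
}

// New behaviour as described: run every detector, keep the smallest index.
function earliestInText(detectors: SketchDetector[], text: string): SketchMatch | null {
  let best: SketchMatch | null = null;
  for (const d of detectors) {
    const hit = d.detect(text);
    if (hit && (best === null || hit.index < best.index)) best = hit;
  }
  return best;
}

const demoDetectors = [
  makeDetector("standard", 0, "Begin forwarded message:"),
  makeDetector("outlook", 10, "De:"),
];
const demoText = "Hi\nDe: Alice\nbody\nBegin forwarded message:\nFrom: Bob";

console.log(firstByPriority(demoDetectors, demoText)?.detector); // "standard" (skips the earlier block)
console.log(earliestInText(demoDetectors, demoText)?.detector); // "outlook"
```

In this sketch a tie on `index` keeps the detector that comes first in the array; how the real registry breaks ties is not described here.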
22
+ ### 3. Fix: Truncated Body Text ("Florian" issue)
23
+ **Problem**: `OutlookReverseFrDetector` was returning only "Florian" instead of "Genial Yodjii... Florian".
24
+ **Root Cause**: The detector relied on finding a double newline (`\n\n`) to separate headers from body. In some Outlook blocks, the body starts immediately after the `Objet:` line with a single newline. The heuristic skipped the entire message body until it hit a double newline in the signature.
25
+ **Solution**: Updated `src/detectors/outlook-reverse-fr-detector.ts` to assume the body starts immediately after the `Objet` line if no clear separator is found.
26
+
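The two body-extraction heuristics can be compared on a small sample. Everything below is a sketch under assumptions: the function names are hypothetical, the sample message is made up, and "clear separator" is interpreted here as a blank line directly after the `Objet:` line.

```typescript
// Old heuristic as described: the body is everything after the first blank line.
function sketchBodyAfterBlankLine(block: string): string {
  const parts = block.split(/\r?\n\r?\n/);
  return parts.slice(1).join("\n\n").trim();
}

// New heuristic as described: the body starts right after the "Objet:" line,
// skipping a blank separator line only when one is actually present.
function sketchBodyAfterObjet(block: string): string {
  const lines = block.split(/\r?\n/);
  const objetIdx = lines.findIndex((line) => /^Objet\s*:/i.test(line));
  if (objetIdx === -1) return block.trim();

  let start = objetIdx + 1;
  while (start < lines.length && lines[start].trim() === "") start += 1;
  return lines.slice(start).join("\n").trim();
}

// Made-up sample: the body follows "Objet:" after a single newline,
// and the only blank line sits before the signature.
const demoBlock = [
  "De: Florian M.",
  "Envoyé: lundi",
  "Objet: Test",
  "Bonjour,",
  "voici le document.",
  "",
  "Florian",
].join("\n");

console.log(sketchBodyAfterBlankLine(demoBlock)); // "Florian"
console.log(sketchBodyAfterObjet(demoBlock)); // full body, from "Bonjour," through "Florian"
```

Anchoring on the last known header line avoids depending on blank lines, which Outlook does not always emit between headers and body.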
27
+ ## Verified Results
28
+
29
+ | Fixture | Expected Depth | Received Depth | Status |
30
+ | :--- | :--- | :--- | :--- |
31
+ | `complex-forward.eml` | 5 | 5 | ✅ PASS |
32
+ | `extreme-forward.eml` | 4 | 4 | ✅ PASS |
33
+ | `empty-body...eml` | 2 | 2 | ✅ PASS |
34
+
35
+ All other tests (`basic`, `utils`, `crisp`) continue to pass.
package/package.json ADDED
@@ -0,0 +1,48 @@
1
+ {
2
+ "name": "email-origin-chain",
3
+ "version": "1.0.0",
4
+ "description": "Uncover the full audit trail of your email threads. Recursively reconstructs the entire conversation history with instant access to the original sender and true source message.",
5
+ "main": "dist/index.js",
6
+ "types": "dist/index.d.ts",
7
+ "files": [
8
+ "dist",
9
+ "README.md",
10
+ "LICENSE",
11
+ "docs"
12
+ ],
13
+ "repository": {
14
+ "type": "git",
15
+ "url": "git+https://github.com/yodjii/email-origin-chain.git"
16
+ },
17
+ "bugs": {
18
+ "url": "https://github.com/yodjii/email-origin-chain/issues"
19
+ },
20
+ "homepage": "https://github.com/yodjii/email-origin-chain#readme",
21
+ "scripts": {
22
+ "build": "tsc",
23
+ "test": "jest",
24
+ "test:watch": "jest --watch",
25
+ "prepublishOnly": "npm run build"
26
+ },
27
+ "keywords": [
28
+ "email",
29
+ "parser",
30
+ "forward",
31
+ "mailparser"
32
+ ],
33
+ "author": "Flo (yodjii)",
34
+ "license": "MIT",
35
+ "devDependencies": {
36
+ "@types/jest": "^30.0.0",
37
+ "@types/mailparser": "^3.4.6",
38
+ "@types/node": "^25.0.10",
39
+ "jest": "^30.2.0",
40
+ "ts-jest": "^29.4.6",
41
+ "typescript": "^5.9.3"
42
+ },
43
+ "dependencies": {
44
+ "any-date-parser": "^2.2.3",
45
+ "email-forward-parser": "^1.7.2",
46
+ "mailparser": "^3.9.1"
47
+ }
48
+ }