goldenmatch 0.4.0 → 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (41)
  1. package/CHANGELOG.md +88 -0
  2. package/dist/core/index.cjs +7247 -5292
  3. package/dist/core/index.cjs.map +1 -1
  4. package/dist/core/index.d.cts +404 -5
  5. package/dist/core/index.d.ts +404 -5
  6. package/dist/core/index.js +6071 -4120
  7. package/dist/core/index.js.map +1 -1
  8. package/dist/index.cjs +7247 -5292
  9. package/dist/index.cjs.map +1 -1
  10. package/dist/index.d.cts +1 -1
  11. package/dist/index.d.ts +1 -1
  12. package/dist/index.js +6071 -4120
  13. package/dist/index.js.map +1 -1
  14. package/dist/node/index.cjs +2353 -551
  15. package/dist/node/index.cjs.map +1 -1
  16. package/dist/node/index.d.cts +1 -1
  17. package/dist/node/index.d.ts +1 -1
  18. package/dist/node/index.js +2339 -552
  19. package/dist/node/index.js.map +1 -1
  20. package/package.json +1 -1
  21. package/src/core/autoconfig.ts +215 -166
  22. package/src/core/autoconfigController.ts +501 -0
  23. package/src/core/autoconfigHistory.ts +163 -0
  24. package/src/core/autoconfigPolicy.ts +115 -0
  25. package/src/core/autoconfigRules.ts +821 -0
  26. package/src/core/complexityProfile.ts +411 -0
  27. package/src/core/index.ts +70 -1
  28. package/src/core/indicators.ts +491 -0
  29. package/tests/parity/controller-stoppoint-fixtures.json +1103 -0
  30. package/tests/parity/controller-stoppoint.parity.test.ts +174 -0
  31. package/tests/parity/indicators-fixtures.json +542 -0
  32. package/tests/parity/indicators.parity.test.ts +116 -0
  33. package/tests/unit/autoconfig-classifier.test.ts +3 -0
  34. package/tests/unit/autoconfig.test.ts +11 -3
  35. package/tests/unit/autoconfigController.test.ts +98 -0
  36. package/tests/unit/autoconfigHistory.test.ts +183 -0
  37. package/tests/unit/autoconfigPolicy.test.ts +113 -0
  38. package/tests/unit/autoconfigRules.indicators.test.ts +291 -0
  39. package/tests/unit/autoconfigRules.test.ts +281 -0
  40. package/tests/unit/complexityProfile.test.ts +240 -0
  41. package/tests/unit/indicators.test.ts +195 -0
package/CHANGELOG.md CHANGED
@@ -4,6 +4,94 @@ All notable changes to goldenmatch-js are documented in this file.
 
  Format follows [Keep a Changelog](https://keepachangelog.com/). Versioning follows [Semantic Versioning](https://semver.org/) (strict after v1.0.0).
 
+ ## [0.6.0] - 2026-05-10
+
+ Indicator-aware refit parity with Python `goldenmatch` v1.9 + v1.10.
+
+ ### Added
+
+ - `IndicatorContext` memoization layer (`src/core/indicators.ts`) and 5 pure
+ complexity indicators ported from Python `core/indicators.py`:
+ `computeColumnPriors`, `estimateSparseMatchSignal`,
+ `computeCorruptionScore`, `estimateFullPopHits`,
+ `computeCrossBlockingOverlap`, plus `computeIdentityCollisionSignal`
+ used by the collision-aware refit rule.
+ - 7 new indicator-aware refit rules in `autoconfigRules.ts`:
+ `ruleUniformHeavyBlocking`, `ruleBlockingFieldNullHeavy`,
+ `ruleRecallGapSuspected`, `ruleCollisionSignalTooHigh`,
+ `ruleSparseMatchExpand`, `ruleCrossBlockingDisagreement`,
+ `ruleCorruptionNormalize`.
+ - `DEFAULT_RULES_V1_10` — 14-rule list mirroring Python's `DEFAULT_RULES`
+ order. The legacy `DEFAULT_RULES_V1_7_V1_8` 7-rule list is still exported
+ for callers that opt into base-only behavior.
+ - `RuleContext.indicators` optional field carries the per-iteration
+ `IndicatorContext`; rules that need indicator signals are silent no-ops
+ when callers run the legacy v1.7/v1.8 rule list.
+ - `RefitPolicy.propose(profile, current, history, indicators?)` — fourth
+ positional argument (back-compat: defaults to `null`).
+
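Editorial sketch (not part of the diff): the 0.6.0 entries above describe a memoizing `IndicatorContext`, an optional `indicators` field on the rule context, and an optional trailing argument that defaults to `null`. The TypeScript below is a minimal illustration of that pattern only. Every name in it (`SketchIndicatorContext`, `ruleNullHeavySketch`, `proposeSketch`, the `nullRate` indicator, and the 0.5 cutoff) is hypothetical and is not taken from the package source.

```typescript
// Hypothetical sketch: none of these types are goldenmatch's real definitions.
type Row = Record<string, string | null>;

// Memoizes indicator values by name so several rules can share one computation.
class SketchIndicatorContext {
  private cache = new Map<string, number>();
  private rows: Row[];
  computeCalls = 0; // exposed only so the memoization is observable

  constructor(rows: Row[]) {
    this.rows = rows;
  }

  // Returns the cached value, computing it at most once per context.
  get(name: string, compute: (rows: Row[]) => number): number {
    const hit = this.cache.get(name);
    if (hit !== undefined) return hit;
    this.computeCalls += 1;
    const value = compute(this.rows);
    this.cache.set(name, value);
    return value;
  }
}

// Assumed stand-in indicator: fraction of rows whose value in `column` is null or missing.
function nullRate(column: string): (rows: Row[]) => number {
  return (rows) =>
    rows.length === 0 ? 0 : rows.filter((r) => r[column] == null).length / rows.length;
}

interface SketchRuleContext {
  blockingField: string;
  indicators?: SketchIndicatorContext | null;
}

// An indicator-aware rule: returns null (no proposal) when no context is supplied.
function ruleNullHeavySketch(ctx: SketchRuleContext): string | null {
  if (!ctx.indicators) return null; // legacy call path: silent no-op
  const rate = ctx.indicators.get(
    `nullRate:${ctx.blockingField}`,
    nullRate(ctx.blockingField),
  );
  // 0.5 is an illustrative cutoff, not a value from the changelog.
  return rate > 0.5 ? `swap blocking field ${ctx.blockingField}` : null;
}

// Back-compat shape: the trailing optional parameter defaults to null.
function proposeSketch(
  blockingField: string,
  indicators: SketchIndicatorContext | null = null,
): string | null {
  return ruleNullHeavySketch({ blockingField, indicators });
}
```

Callers written against the older signature keep working because omitting the last argument takes the no-op path, while a caller that passes a context pays for each indicator at most once per iteration.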
+ ### Changed
+
+ - `autoConfigureRows` rewrite: matchkey naming now matches Python
+ (`fuzzy_match` for weighted, `exact_<col>` for exact). Scorer selection
+ follows Python's `_SCORER_MAP` (e.g. `name → ensemble`,
+ `email → exact`). Adaptive threshold uses Python's formula plus the
+ post-build data-quality adjustment (avg_null > 0.15 → −0.05;
+ avg_len < 5 → +0.05).
+ - `buildBlocking` aligned with Python: prefers high-cardinality
+ exact-eligible columns (email/phone/zip/identifier/year) for static
+ blocking, falls back to multi-pass name blocking
+ (`soundex` + `substring:0:5` + `token_sort + substring:0:8`).
+ - Controller provisions a fresh `IndicatorContext` per iteration and
+ threads it into `policy.propose()` for v1.10 rule consumption.
+
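Editorial sketch (not part of the diff): the post-build data-quality adjustment quoted above (avg_null > 0.15 → −0.05; avg_len < 5 → +0.05) can be read as two independent nudges applied to a base threshold. The function below is an illustration under stated assumptions: the base threshold comes from a formula the changelog does not show, the two adjustments are assumed to stack, and the function name and the 4-decimal rounding are hypothetical.

```typescript
// Hypothetical helper: applies the two data-quality nudges quoted in the changelog.
// `base` is whatever the (unshown) adaptive-threshold formula produced.
function adjustThresholdSketch(base: number, avgNull: number, avgLen: number): number {
  let threshold = base;
  if (avgNull > 0.15) threshold -= 0.05; // sparse data: loosen the threshold
  if (avgLen < 5) threshold += 0.05; // very short values: tighten the threshold
  // Rounding to 4 decimals is an assumption, used here to hide float noise.
  return Math.round(threshold * 1e4) / 1e4;
}

// Example inputs: a null-heavy dataset and a short-value dataset.
const looser = adjustThresholdSketch(0.8, 0.2, 10);
const tighter = adjustThresholdSketch(0.8, 0.0, 3);
console.log(looser, tighter);
```

Under the stacking assumption, a dataset that is both null-heavy and short-valued ends up back at the base threshold, which is worth checking against the Python implementation before relying on it.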
+ ### Parity status
+
+ - Controller stoppoint parity: 6/6 datasets pass shape-level assertions,
+ 2/6 (`dirty_people`, `mixed_blocking`) byte-equal on the normalized
+ committed config. The remaining 4 diverge because Python's iteration
+ path hits a `ModuleNotFoundError` on subsequent iterations and falls
+ back to a virtual v0 entry (out-of-scope to replicate in TS).
+ - Indicators parity: 8/8 fixture datasets pass at 4-decimal tolerance
+ on the 5 indicators. Identity-collision signal is unit-tested only —
+ the TS pure-JS token-sort approximation diverges numerically from
+ Python's `rapidfuzz.token_sort_ratio` at sub-rule precision, but the
+ rule-firing boundary (rate > 0.75) is preserved.
+
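Editorial sketch (not part of the diff): the parity notes above distinguish two kinds of agreement, numeric agreement at 4-decimal tolerance and agreement on whether a rule fires at the `rate > 0.75` boundary. The helpers below illustrate one plausible reading of each check. The function names are hypothetical, and the half-unit-in-the-last-place tolerance is an assumption about what "4-decimal tolerance" means in the test suite.

```typescript
// Hypothetical check: two values agree when they differ by less than half a unit
// in the last kept decimal place (one common reading of "N-decimal tolerance").
function withinDecimalsSketch(a: number, b: number, decimals = 4): boolean {
  return Math.abs(a - b) < 0.5 * 10 ** -decimals;
}

// Hypothetical check: two implementations may disagree numerically yet still make
// the same rule-firing decision if both rates fall on the same side of the boundary.
function sameFiringDecisionSketch(tsRate: number, pyRate: number, boundary = 0.75): boolean {
  return tsRate > boundary === pyRate > boundary;
}

// Illustrative values only; they are not taken from the parity fixtures.
console.log(withinDecimalsSketch(0.12344, 0.12346)); // differ by 0.00002
console.log(sameFiringDecisionSketch(0.81, 0.79)); // both above the boundary
```

The second check is weaker than the first: it tolerates any numeric drift that does not cross the boundary, which matches the changelog's description of the identity-collision signal.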
+ ## [0.5.0] - 2026-05-10
+
+ Auto-config controller parity with Python `goldenmatch` v1.7 + v1.8.
+
+ ### Added
+
+ - `AutoConfigController` (async `.run()`) — iterative auto-config with
+ pathological-input gates, deterministic sampling, policy-driven refit loop,
+ and best-effort commit via `RunHistory.pickCommitted`.
+ - `ComplexityProfile` + sub-profiles (`DataProfile`, `DomainProfile`,
+ `MatchkeyProfile`, `BlockingProfile`, `ScoringProfile`, `ClusterProfile`,
+ `ProfileMeta`, `IndicatorsProfile`) with `HealthVerdict` rollup.
+ - `RunHistory` audit trail with `PolicyDecision` / `ErrorRecord` / `HistoryEntry`
+ and `pickCommitted(precisionCollapseFloor)` lexicographic commit selection.
+ - `HeuristicRefitPolicy` rule dispatcher + 7 base v1.7/v1.8 rules:
+ `ruleBlockingSingletonTrap`, `ruleBlockingTooCoarse`, `ruleBlockingKeySwap`,
+ `ruleLowReductionRatio`, `ruleLowTransitivity`, `ruleNoMatches`,
+ `ruleUnimodalScoring`.
+ - `StopReason` telemetry (8 variants matching Python).
+ - `autoConfigureRowsIterate(rows)` async iterative entry point.
+ - `AutoconfigOptions.iterate` field (default `false`; preserves pre-0.5.0
+ behavior).
+ - `getLastControllerRun()` debug accessor mirroring Python's
+ `_LAST_CONTROLLER_RUN` ContextVar.
+ - Parity test suite: 6 dataset fixtures generated from the Python sibling
+ via `packages/python/goldenmatch/scripts/emit_ts_parity_fixtures.py`.
+
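Editorial sketch (not part of the diff): the 0.5.0 entries above mention `pickCommitted(precisionCollapseFloor)` with "lexicographic commit selection" and a "best-effort commit". The changelog does not say which fields are compared or in what order, so the code below only illustrates the general technique: filter by a floor, fall back to all entries if nothing survives, then pick the maximum under a tuple comparison. The entry shape, the key order, and the fallback behavior are all assumptions.

```typescript
// Hypothetical entry shape; the real HistoryEntry fields are not shown in the changelog.
interface SketchEntry {
  iteration: number;
  healthRank: number; // assumed: higher is healthier
  precisionProxy: number; // assumed: estimated precision in [0, 1]
  recallProxy: number; // assumed: estimated recall in [0, 1]
}

// Compares two number tuples left to right; the first differing position decides.
function compareLexSketch(a: readonly number[], b: readonly number[]): number {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    if (x !== y) return x - y;
  }
  return a.length - b.length;
}

// Picks the best entry at or above the precision floor, or the best entry overall
// when none clears the floor (the "best-effort" fallback is an assumption).
function pickCommittedSketch(
  entries: SketchEntry[],
  precisionCollapseFloor: number,
): SketchEntry | undefined {
  const safe = entries.filter((e) => e.precisionProxy >= precisionCollapseFloor);
  const pool = safe.length > 0 ? safe : entries;
  // Assumed key order: health first, then recall, then prefer earlier iterations.
  const key = (e: SketchEntry): number[] => [e.healthRank, e.recallProxy, -e.iteration];
  let best: SketchEntry | undefined;
  for (const e of pool) {
    if (best === undefined || compareLexSketch(key(e), key(best)) > 0) best = e;
  }
  return best;
}
```

Encoding the preference order as a tuple keeps tie-breaking explicit: a later component is consulted only when every earlier component is equal.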
+ ### Deferred to v0.6.0 (Wave 2)
+
+ - 5 complexity indicators + `IndicatorContext` memoization.
+ - Indicator-aware refit rules (`ruleCorruptionNormalize`,
+ `ruleCrossBlockingDisagreement`, `ruleSparseMatchExpand`).
+ - Indicator-aware extensions to `ruleBlockingKeySwap` and `ruleNoMatches`.
+
  ## [0.4.0] - 2026-05-05
 
  ### BREAKING