@blamejs/core 0.7.104 → 0.7.105

package/CHANGELOG.md CHANGED
@@ -8,6 +8,8 @@ upgrading across more than a few patches at a time.
 
  ## v0.7.x
 
+ - **0.7.105** (2026-05-06) — `b.compliance.sanctions` — sanctions-list screening primitive. Operators handling KYC / payment / customer-onboarding flows screen names against the U.S. Treasury OFAC SDN list, EU CSL, UK HMT consolidated list, UN 1267 list, or operator-defined lists. The framework owns indexing + match algorithm; the operator owns the daily fetch + format-specific parsing (the framework intentionally does not vendor the list — it changes daily and has legal-distribution implications). **`b.compliance.sanctions.create({ entries, algorithm, fuzzy, ... })`** returns a screener with `screen(input)` (single record), `screenBulk(inputs)` (batch), `snapshot()` (rule-version digest for audit trails), `reload(newEntries)` (atomic index swap with diff), `entryById(id)` (lookup), and `size()`. Three match strategies: `exact` (fastest, no fuzz), `jaro-winkler` (default, threshold 0.85), `levenshtein` (edit-distance with cap). Match output: `{ match: bool, hits: [{ entryId, name, matchedOn, score, reason, listed, programs }], algorithm, ruleVersion, screenedAt }`. **`b.compliance.sanctions.fuzzy`** — pure algorithmic core: `normalize` (Unicode diacritic strip + lowercase + whitespace collapse), `tokenize`, `levenshtein` (cap + early-exit), `jaro` / `jaroWinkler`, `tokenSetSimilarity` (order-invariant bag-of-tokens), `substringContains` (token-bounded), `initialsMatch`. **`b.compliance.sanctions.aliases.expand(name, opts)`** — alias-expansion helper covering nicknames (Bill ↔ William, Mike ↔ Michael), transliteration variants (Mohamed ↔ Mohammed), reverse-order forms (Smith John / Smith, John), and initials (J. Smith). 32 built-in name pairs plus operator-extensible `extraPairs`. **`b.compliance.sanctions.fetcher.create({ screener, fetch, intervalMs, onRefreshed, onError })`** — periodic refresh worker that runs the operator's `fetch` callback, validates a non-empty result, and atomically reloads the screener via `screener.reload`. 
Audit emissions on every refresh state (`compliance.sanctions.refresh.started` / `completed` / `skipped` / `failed`). **Parser shims** for the canonical public list formats: `parseOfacCsvRow` / `parseOfacAliasRow` / `mergeAliases` (OFAC SDN), `parseEuCslEntry` (EU Consolidated Sanctions List XML), `parseUn1267Entry` (UN Security Council XML). Audit emissions: `compliance.sanctions.screened` (every screen call), `compliance.sanctions.matched` (when hits > 0). Test coverage: 39 cases across normalize / tokenize / Levenshtein / Jaro-Winkler / token-set / substring / initials / screen exact + jw + levenshtein / type filter / bulk / snapshot / reload / alias expansion / fetcher tick + failure modes.
+
  - **0.7.104** (2026-05-06) — `b.dsr` Data Subject Rights workflow primitive (~2000 LoC). End-to-end coordinator for GDPR Article 15-22 / CCPA / CPRA / LGPD / PIPEDA / UK-GDPR data-subject requests. **`b.dsr.create({ ticketStore, posture, identityResolver, sources, ... })`** returns a workflow instance with full ticket lifecycle: `submit(input)` resolves subject identity via the operator-supplied `identityResolver`, computes a posture-aware deadline (gdpr 30d / ccpa 45d / lgpd-br 15d / pipl-cn 15d / pipeda-ca 30d / appi-jp 30d / pdpa-sg 30d / uk-gdpr 30d), and persists a pending ticket. `process(ticketId, opts)` orchestrates per-source `query` (for access / portability / rectification) or `erase` (for erasure) callbacks; partial source failures land the ticket in `partially_completed` state with per-source error capture. `cancel` / `reject` (with required reason per GDPR) advance to terminal states. `expireOverdue()` sweep marks deadline-overdue tickets as `expired`. Seven request types: `access` / `erasure` / `portability` / `rectification` / `restriction` / `object` / `automated-decision`. **Verification ladder** (`minimal` / `secondary` / `strong`) per GDPR Art. 12(6) — minimum required level by request type with operator override; erasure / portability / rectification require `secondary` by default. **Receipt builder** (`buildReceipt(ticketId)`) — emits a canonical `blamejs.dsr.receipt/1` JSON envelope for completed/cancelled/rejected/expired tickets with optional operator-side `receiptSigner` hook for cryptographic attestation. **Portability bundle builder** (`buildPortabilityBundle(ticket)`) — `blamejs.dsr.portability/1` JSON shape with per-source data for access / portability requests. **Two ticket-store backends** ship: `memoryTicketStore()` for development / tests, `dbTicketStore({ db, table })` for production (auto-provisions a SQLite table with subject_email + status indexes, includes a `purgeExpired()` retention sweep). 
Audit emissions on every state transition (`dsr.ticket.submitted` / `in_progress` / `completed` / `partial` / `cancelled` / `rejected` / `expired` plus per-source `dsr.source.queried` / `erased` / `failed`). Test coverage: 38 cases across submit / process / cancel / reject / list / expire / portability / verification ladder / receipt / store backends.
 
  - **0.7.103** (2026-05-06) — W3C distributed tracing suite. End-to-end OTel-shaped tracing without a vendored OTel SDK: tracestate + Baggage parsers, span builder, OTLP/JSON exporter, HTTP-server span middleware, log correlation. **`b.observability.traceContext.parseTracestate / buildTracestate`** — W3C Trace Context §3.3 vendor data: enforces vendor-key shape (lcase-alnum + `_-*/`, optional `<tenant>@<system>`), value charset (printable ASCII excluding `,` and `=`), 32-entry cap, 512-char total cap, dup-key-keep-first per §3.3.1.5. **`b.observability.baggage.parse / build`** — W3C Baggage spec parser + builder for operator-supplied context (tenantId, region, experimentId, etc.) propagated across service boundaries. RFC 7230 tchar key grammar, percent-encoded UTF-8 values, optional per-entry properties (`key=value;property=value`), 64-entry / 8192-char caps. **`b.observability.tracer.create({ service, resource, onEnd })`** — OTel-shaped span builder. `tracer.start(name, opts)` returns a span with `setAttribute` / `setAttributes` / `addEvent` / `recordException` / `setStatus` / `end` / `isRecording` / `toJSON`. OTLP/JSON-compatible output (Trace v1) with `traceId` / `spanId` / `parentSpanId` / `name` / `kind` / `startTimeUnixNano` / `endTimeUnixNano` / `attributes` / `events` / `status` / `resource` / `scope` / `droppedAttributesCount` / `droppedEventsCount`. Attribute caps (128 keys, 1024-char values), event cap (128) per OTLP defaults. `tracer.startChildOf(parent, name)` derives child spans sharing the trace context. **`b.observability.tracer.spanToTraceparent(span)`** — emits the canonical W3C `traceparent` for outbound propagation. **`b.observability.otlpExporter.create({ endpoint, ... })`** — buffered OTLP/HTTP JSON span exporter. Batches spans (default 200), flushes on size + interval (default 5s), retries 5xx + 408/429 with exponential backoff, drops oldest on queue overflow (default 4096). 
Custom `fetchImpl` opt for testing or non-default HTTP transports; `allowedProtocols` opt for cleartext dev collectors. **`b.middleware.tracePropagate`** extended to also read inbound `tracestate` and stamp `req.trace.tracestate` as the parsed entries array (or `[]` when missing); when `setResponseHeader: true`, echoes both `traceparent` and `tracestate` on the response. **`b.middleware.spanHttpServer({ tracer, ... })`** — auto-creates a root server span per HTTP request, populates OTel `SEMCONV.HTTP_*` / `URL_*` / `SERVER_*` / `CLIENT_*` attributes, attaches the span to `req.span`, ends on response close, fires `onEnd(span.toJSON())` for export. `ignorePaths` (string + RegExp) keeps healthz / static-asset routes out of span volume; `captureRequestHeaders` / `captureResponseHeaders` lift named headers into the span as `http.request.header.*` / `http.response.header.*` attributes. **`b.middleware.traceLogCorrelation({ logger })`** — wraps a `b.log` instance for the request lifetime so every `info()` / `warn()` / `error()` / etc. emission inside the handler auto-includes `trace_id` + `span_id` from the active context (via `req.trace` + `req.span`). Pass-through when no trace context present. Internal sweep: `safeBuffer.TRACE_ID_HEX_RE` / `SPAN_ID_HEX_RE` / `RFC7230_TCHAR_RE` extracted as shared regex constants; `guard-mime` / `middleware/headers` / `observability` consolidated against the new shared constants.
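The 0.7.105 entry above names `jaro-winkler` as the default match strategy with a 0.85 threshold. As a rough illustration of what that score measures, here is a standalone sketch of Jaro similarity with the Winkler prefix boost. It is a condensed restatement written for this note, not the package's exported code, and it assumes inputs are already lowercased strings.

```javascript
// Standalone sketch: Jaro similarity plus the Winkler common-prefix boost.
function jaro(a, b) {
  if (a === b) return a.length === 0 ? 0 : 1;
  if (a.length === 0 || b.length === 0) return 0;
  // Characters count as matching only within this sliding window.
  const matchWindow = Math.max(0, Math.floor(Math.max(a.length, b.length) / 2) - 1);
  const aMatched = new Array(a.length).fill(false);
  const bMatched = new Array(b.length).fill(false);
  let matches = 0;
  for (let i = 0; i < a.length; i++) {
    const lo = Math.max(0, i - matchWindow);
    const hi = Math.min(b.length - 1, i + matchWindow);
    for (let j = lo; j <= hi; j++) {
      if (bMatched[j] || a[i] !== b[j]) continue;
      aMatched[i] = true;
      bMatched[j] = true;
      matches += 1;
      break;
    }
  }
  if (matches === 0) return 0;
  // Walk the matched characters in order; each out-of-order pair is half a transposition.
  let halfTranspositions = 0;
  let k = 0;
  for (let i = 0; i < a.length; i++) {
    if (!aMatched[i]) continue;
    while (!bMatched[k]) k += 1;
    if (a[i] !== b[k]) halfTranspositions += 1;
    k += 1;
  }
  const t = halfTranspositions / 2;
  return (matches / a.length + matches / b.length + (matches - t) / matches) / 3;
}

function jaroWinkler(a, b, prefixWeight = 0.1) {
  const j = jaro(a, b);
  if (j === 0) return 0;
  // Boost by the shared prefix, capped at 4 characters.
  let prefixLen = 0;
  const max = Math.min(a.length, b.length, 4);
  while (prefixLen < max && a[prefixLen] === b[prefixLen]) prefixLen += 1;
  return j + prefixLen * prefixWeight * (1 - j);
}

console.log(jaroWinkler("martha", "marhta").toFixed(4)); // 0.9611
console.log(jaroWinkler("mohamed", "mohammed") >= 0.85); // true
```

With a 0.85 threshold, a transliteration pair such as "mohamed" / "mohammed" clears the bar on the score alone, which is the kind of near-miss the fuzzy strategies are meant to catch.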
@@ -0,0 +1,167 @@
1
+ "use strict";
2
+ /**
3
+ * Alias-expansion helpers for sanctions screening.
4
+ *
5
+ * The OFAC SDN list / EU CSL / UK HMT consolidated list publish a
6
+ * primary name + a small set of formal aliases per entry. Real-world
7
+ * input doesn't match those forms exactly: people use nicknames
8
+ * (Bill / William, Mike / Michael), transliteration variants
9
+ * (Mohamed / Mohammed / Muhammad), and initials (J. Smith).
10
+ *
11
+ * This module expands a candidate name into the set of plausible
12
+ * forms that should screen-match against the same SDN entry. Operators
13
+ * call expand() before screen() to broaden the match scope:
14
+ *
15
+ * var aliases = b.compliance.sanctions.aliases.expand("Bill J. Smith");
16
+ * var result = screener.screen({
17
+ * name: "Bill J. Smith",
18
+ * aliases: aliases,
19
+ * });
20
+ *
21
+ * The expansion is deterministic + idempotent. Operators with
+ * domain-specific names (Cyrillic / Arabic) supply whole-name forms
+ * via opts.extra or token pairs via opts.extraPairs.
23
+ */
24
+
25
+ var fuzzy = require("./compliance-sanctions-fuzzy");
26
+
27
+ // Common nickname → formal-name pairs. The framework ships a focused
28
+ // table for English/European names; operators with non-Western lists
29
+ // extend via opts.extraPairs at expand() time.
30
+ var NICKNAME_PAIRS = Object.freeze([
31
+ ["bill", "william"],
32
+ ["bob", "robert"],
33
+ ["dick", "richard"],
34
+ ["mike", "michael"],
35
+ ["nick", "nicholas"],
36
+ ["tom", "thomas"],
37
+ ["jim", "james"],
38
+ ["jack", "john"],
39
+ ["chris", "christopher"],
40
+ ["dan", "daniel"],
41
+ ["dave", "david"],
42
+ ["matt", "matthew"],
43
+ ["alex", "alexander"],
44
+ ["sam", "samuel"],
45
+ ["pat", "patrick"],
46
+ ["tony", "anthony"],
47
+ ["ben", "benjamin"],
48
+ ["joe", "joseph"],
49
+ ["ed", "edward"],
50
+ ["fred", "frederick"],
51
+ ["greg", "gregory"],
52
+ ["liz", "elizabeth"],
53
+ ["beth", "elizabeth"],
54
+ ["meg", "margaret"],
55
+ ["maggie", "margaret"],
56
+ ["kate", "katherine"],
57
+ ["kathy", "katherine"],
58
+ ["sue", "susan"],
59
+ ["jen", "jennifer"],
60
+ ["jenny", "jennifer"],
61
+ ["nat", "natalie"],
62
+ ["mohamed", "mohammed"],
63
+ ["muhammad", "mohammed"],
64
+ ["abd", "abdul"],
65
+ ["abu", "abou"],
66
+ ["yusuf", "yousef"],
67
+ ["yasin", "yaseen"],
68
+ ["hussein", "hussain"],
69
+ ]);
70
+
71
+ function _expandNickname(token) {
72
+ var alts = [];
73
+ var lower = token.toLowerCase();
74
+ for (var i = 0; i < NICKNAME_PAIRS.length; i++) {
75
+ var pair = NICKNAME_PAIRS[i];
76
+ if (lower === pair[0]) alts.push(pair[1]);
77
+ else if (lower === pair[1]) alts.push(pair[0]);
78
+ }
79
+ return alts;
80
+ }
81
+
82
+ function _expandInitials(tokens) {
83
+ // Build "J. Smith" / "JS" forms
84
+ var alts = [];
85
+ if (tokens.length >= 2) {
86
+ var first = tokens[0];
87
+ var rest = tokens.slice(1).join(" ");
88
+ if (first.length > 1) {
89
+ // J Smith / J. Smith
90
+ alts.push(first.charAt(0) + " " + rest);
91
+ alts.push(first.charAt(0) + ". " + rest);
92
+ }
93
+ // Last + first
94
+ alts.push(tokens[tokens.length - 1] + " " + tokens.slice(0, -1).join(" "));
95
+ // Last, First
96
+ alts.push(tokens[tokens.length - 1] + ", " + tokens.slice(0, -1).join(" "));
97
+ }
98
+ if (tokens.length === 2) {
99
+ // Initials-only "JS"
100
+ alts.push(tokens[0].charAt(0) + tokens[1].charAt(0));
101
+ }
102
+ return alts;
103
+ }
104
+
105
+ function _expandTokenLevel(tokens) {
106
+ // For each token, swap with each plausible nickname/transliteration,
107
+ // emit the resulting full name.
108
+ var alts = [];
109
+ for (var i = 0; i < tokens.length; i++) {
110
+ var swaps = _expandNickname(tokens[i]);
111
+ for (var j = 0; j < swaps.length; j++) {
112
+ var newTokens = tokens.slice();
113
+ newTokens[i] = swaps[j];
114
+ alts.push(newTokens.join(" "));
115
+ }
116
+ }
117
+ return alts;
118
+ }
119
+
120
+ function expand(name, opts) {
121
+ opts = opts || {};
122
+ if (typeof name !== "string" || name.length === 0) return [];
123
+ var tokens = fuzzy.tokenize(name);
124
+ if (tokens.length === 0) return [];
125
+ var seen = Object.create(null);
126
+ var out = [];
127
+ function _add(s) {
128
+ if (typeof s !== "string" || s.length === 0) return;
129
+ var key = fuzzy.normalize(s);
130
+ if (key.length === 0) return;
131
+ if (seen[key]) return;
132
+ seen[key] = true;
133
+ out.push(s);
134
+ }
135
+ // 1. The original (normalised)
136
+ _add(tokens.join(" "));
137
+ // 2. Initial-form variants
138
+ var initials = _expandInitials(tokens);
139
+ for (var i = 0; i < initials.length; i++) _add(initials[i]);
140
+ // 3. Token-level nickname/transliteration swaps
141
+ var swaps = _expandTokenLevel(tokens);
142
+ for (var j = 0; j < swaps.length; j++) _add(swaps[j]);
143
+ // 4. Operator-supplied extras
144
+ if (Array.isArray(opts.extra)) {
145
+ for (var k = 0; k < opts.extra.length; k++) _add(opts.extra[k]);
146
+ }
147
+ if (Array.isArray(opts.extraPairs)) {
+ for (var p = 0; p < opts.extraPairs.length; p++) {
+ var pair = opts.extraPairs[p];
+ if (!Array.isArray(pair) || pair.length !== 2) continue;
+ if (typeof pair[0] !== "string" || typeof pair[1] !== "string") continue;
+ // Tokens are already normalized, so normalize the operator's pair
+ // the same way; otherwise ["Sasha", "Alexander"] never matches.
+ var left = fuzzy.normalize(pair[0]);
+ var right = fuzzy.normalize(pair[1]);
+ if (left.length === 0 || right.length === 0) continue;
+ for (var ti = 0; ti < tokens.length; ti++) {
+ if (tokens[ti] === left) {
+ var nt1 = tokens.slice(); nt1[ti] = right; _add(nt1.join(" "));
+ } else if (tokens[ti] === right) {
+ var nt2 = tokens.slice(); nt2[ti] = left; _add(nt2.join(" "));
+ }
+ }
+ }
+ }
161
+ return out;
162
+ }
163
+
164
+ module.exports = {
165
+ expand: expand,
166
+ NICKNAME_PAIRS: NICKNAME_PAIRS,
167
+ };
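To make the expansion order in `expand()` easier to follow, here is a reduced standalone sketch of the same idea: the original form, then an initial form and a reversed-order form, then token-level pair swaps, all de-duplicated. The two-row `PAIRS` table and the name `expandSketch` are hypothetical; the sketch lowercases and splits on whitespace instead of calling the fuzzy module, so it is an approximation rather than the module's behaviour.

```javascript
// Hypothetical two-row pair table; the module above ships a longer NICKNAME_PAIRS.
const PAIRS = [["bill", "william"], ["mohamed", "mohammed"]];

function expandSketch(name) {
  // Simplified tokenizer (assumption): lowercase + whitespace split only.
  const tokens = name.toLowerCase().split(/\s+/).filter(Boolean);
  if (tokens.length === 0) return [];
  const seen = new Set();
  const out = [];
  const add = (s) => {
    if (!seen.has(s)) { seen.add(s); out.push(s); }
  };

  add(tokens.join(" ")); // 1. the original form
  if (tokens.length >= 2) {
    const first = tokens[0];
    const rest = tokens.slice(1).join(" ");
    if (first.length > 1) add(first[0] + " " + rest); // 2a. first-name initial
    const last = tokens[tokens.length - 1];
    add(last + " " + tokens.slice(0, -1).join(" ")); // 2b. last name first
  }
  tokens.forEach((tok, i) => { // 3. token-level swaps, in either direction
    for (const [x, y] of PAIRS) {
      const swap = tok === x ? y : tok === y ? x : null;
      if (swap === null) continue;
      const copy = tokens.slice();
      copy[i] = swap;
      add(copy.join(" "));
    }
  });
  return out;
}

console.log(expandSketch("Bill Smith"));
// [ 'bill smith', 'b smith', 'smith bill', 'william smith' ]
```

One detail worth noticing in the module itself: `_add` keys its seen-set on `fuzzy.normalize(s)`, so forms that differ only in punctuation (for example "j smith" and "j. smith", or "smith john" and "smith, john") collapse to whichever was added first.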
@@ -0,0 +1,206 @@
1
+ "use strict";
2
+ /**
3
+ * b.compliance.sanctions.fetcher — periodic sanctions-list refresh
4
+ * helper.
5
+ *
6
+ * The framework intentionally does NOT vendor the sanctions list (it
7
+ * changes daily and has legal-distribution implications). Operators
8
+ * fetch from the canonical source on a schedule + reload the screener.
9
+ * This module wraps the schedule + comparison + reload-trigger logic
10
+ * so operators write one fetch callback instead of orchestrating it.
11
+ *
12
+ * var fetcher = b.compliance.sanctions.fetcher.create({
13
+ * screener: sdnScreener, // from sanctions.create
14
+ * intervalMs: C.TIME.hours(24),
15
+ * fetch: async function () {
16
+ * // Operator-supplied: hits treasury.gov, parses CSV, returns
17
+ * // canonical entry array.
18
+ * var rows = await downloadSdnCsv();
19
+ * return rows.map(function (row) {
+ * return b.compliance.sanctions.parseOfacCsvRow(row);
+ * });
20
+ * },
21
+ * onRefreshed: function (diff) {
22
+ * log.info("SDN list refreshed", diff);
23
+ * },
24
+ * onError: function (err) {
25
+ * pagerDuty.alert("SDN list fetch failed", err);
26
+ * },
27
+ * });
28
+ * fetcher.start();
29
+ * ...
30
+ * await fetcher.shutdown();
31
+ *
32
+ * Behavior:
33
+ * - On each tick, run fetch(); if it returns a non-empty array,
34
+ * swap the screener's index via screener.reload(entries).
35
+ * - If fetch() throws or returns empty, skip the swap and emit an
36
+ * audit event; the screener keeps the previous index. Operators
37
+ * can configure onError for paging.
38
+ * - The initial run at start() is on by default (opts.fetchOnStart
+ * defaults to true); operators that prefer to seed the screener
+ * from a cached file at boot pass fetchOnStart: false.
41
+ *
42
+ * Audit emissions:
43
+ * compliance.sanctions.refresh.started - every tick
+ * compliance.sanctions.refresh.completed - successful reload, with diff counts
+ * compliance.sanctions.refresh.skipped - fetch returned an empty or non-array result
+ * compliance.sanctions.refresh.failed - fetch or screener.reload threw
47
+ */
48
+
49
+ var C = require("./constants");
50
+ var lazyRequire = require("./lazy-require");
51
+ var safeAsync = require("./safe-async");
52
+ var validateOpts = require("./validate-opts");
53
+ var { defineClass } = require("./framework-error");
54
+
55
+ var SanctionsFetcherError = defineClass("SanctionsFetcherError", { alwaysPermanent: true });
56
+
57
+ var audit = lazyRequire(function () { return require("./audit"); });
58
+ var observability = lazyRequire(function () { return require("./observability"); });
59
+
60
+ function create(opts) {
61
+ validateOpts.requireObject(opts, "compliance.sanctions.fetcher", SanctionsFetcherError);
62
+ validateOpts(opts, [
63
+ "screener", "intervalMs", "fetch",
64
+ "onRefreshed", "onError",
65
+ "fetchOnStart", "audit",
66
+ ], "compliance.sanctions.fetcher.create");
67
+
68
+ if (!opts.screener || typeof opts.screener.reload !== "function") {
69
+ throw new SanctionsFetcherError("sanctions-fetcher/bad-screener",
70
+ "fetcher.create: screener must be a sanctions.create() instance");
71
+ }
72
+ if (typeof opts.fetch !== "function") {
73
+ throw new SanctionsFetcherError("sanctions-fetcher/bad-fetch",
74
+ "fetcher.create: fetch must be an async function returning entry[]");
75
+ }
76
+ validateOpts.optionalPositiveFinite(opts.intervalMs,
77
+ "fetcher.create: intervalMs", SanctionsFetcherError, "sanctions-fetcher/bad-opts");
78
+ validateOpts.optionalFunction(opts.onRefreshed,
79
+ "fetcher.create: onRefreshed", SanctionsFetcherError, "sanctions-fetcher/bad-opts");
80
+ validateOpts.optionalFunction(opts.onError,
81
+ "fetcher.create: onError", SanctionsFetcherError, "sanctions-fetcher/bad-opts");
82
+
83
+ var intervalMs = opts.intervalMs || C.TIME.hours(24);
84
+ var fetchOnStart = opts.fetchOnStart !== false;
85
+ var auditOn = opts.audit !== false;
86
+ var screener = opts.screener;
87
+ var fetchFn = opts.fetch;
88
+
89
+ var handle = null;
90
+ var stopping = false;
91
+ var lastSuccess = null;
92
+ var lastError = null;
93
+ var refreshCount = 0;
94
+ var failureCount = 0;
95
+
96
+ function _emitAudit(action, outcome, metadata) {
97
+ if (!auditOn) return;
98
+ try {
99
+ audit().safeEmit({
100
+ action: action,
101
+ outcome: outcome,
102
+ metadata: metadata || {},
103
+ });
104
+ } catch (_e) { /* drop-silent */ }
105
+ }
106
+
107
+ function _emitMetric(verb, n) {
108
+ try { observability().safeEvent("compliance.sanctions.fetcher." + verb, n || 1, {}); }
109
+ catch (_e) { /* drop-silent */ }
110
+ }
111
+
112
+ async function _tick() {
113
+ if (stopping) return;
114
+ _emitAudit("compliance.sanctions.refresh.started", "success", {
115
+ algorithm: screener.algorithm,
116
+ });
117
+ var entries;
118
+ try {
119
+ entries = await fetchFn();
120
+ } catch (e) {
121
+ failureCount += 1;
122
+ lastError = (e && e.message) || String(e);
123
+ _emitAudit("compliance.sanctions.refresh.failed", "failure", {
124
+ error: lastError, algorithm: screener.algorithm,
125
+ });
126
+ _emitMetric("failed", 1);
127
+ if (typeof opts.onError === "function") {
128
+ try { opts.onError(e); } catch (_e2) { /* operator hook */ }
129
+ }
130
+ return;
131
+ }
132
+ if (!Array.isArray(entries) || entries.length === 0) {
133
+ _emitAudit("compliance.sanctions.refresh.skipped", "success", {
134
+ reason: "fetch-returned-empty", algorithm: screener.algorithm,
135
+ });
136
+ _emitMetric("skipped", 1);
137
+ return;
138
+ }
139
+ var diff;
140
+ try { diff = screener.reload(entries); }
141
+ catch (e) {
142
+ failureCount += 1;
143
+ lastError = (e && e.message) || String(e);
144
+ _emitAudit("compliance.sanctions.refresh.failed", "failure", {
145
+ error: lastError, phase: "reload", algorithm: screener.algorithm,
146
+ });
147
+ _emitMetric("failed", 1);
148
+ if (typeof opts.onError === "function") {
149
+ try { opts.onError(e); } catch (_e2) { /* operator hook */ }
150
+ }
151
+ return;
152
+ }
153
+ refreshCount += 1;
154
+ lastSuccess = Date.now();
155
+ _emitAudit("compliance.sanctions.refresh.completed", "success", {
156
+ algorithm: screener.algorithm,
157
+ added: diff.addedIds.length,
158
+ removed: diff.removedIds.length,
159
+ newSize: diff.newSize,
160
+ });
161
+ _emitMetric("completed", 1);
162
+ if (typeof opts.onRefreshed === "function") {
163
+ try { opts.onRefreshed(diff); } catch (_e2) { /* operator hook */ }
164
+ }
165
+ }
166
+
167
+ function start() {
168
+ if (handle) return;
169
+ stopping = false;
170
+ if (fetchOnStart) {
171
+ // Fire-and-forget; the periodic ticker handles the rest.
172
+ _tick().catch(function () { /* drop-silent — see _tick */ });
173
+ }
174
+ handle = safeAsync.repeating(function () {
175
+ _tick().catch(function () { /* drop-silent */ });
176
+ }, intervalMs, { name: "sanctions-fetcher" });
177
+ }
178
+
179
+ async function shutdown() {
180
+ stopping = true;
181
+ if (handle) { handle.stop(); handle = null; }
182
+ }
183
+
184
+ function stats() {
185
+ return {
186
+ lastSuccess: lastSuccess,
187
+ lastError: lastError,
188
+ refreshCount: refreshCount,
189
+ failureCount: failureCount,
190
+ running: handle !== null,
191
+ };
192
+ }
193
+
194
+ return {
195
+ start: start,
196
+ shutdown: shutdown,
197
+ stats: stats,
198
+ // Test hook
199
+ _tickOnce: _tick,
200
+ };
201
+ }
202
+
203
+ module.exports = {
204
+ create: create,
205
+ SanctionsFetcherError: SanctionsFetcherError,
206
+ };
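The fetcher's `_tick` has three outcomes (failed, skipped, completed) and relies on `screener.reload` returning a diff with `addedIds`, `removedIds`, and `newSize`. The sketch below mirrors that control flow against an in-memory stand-in, which may be handy when unit-testing an operator `fetch` callback. `createStubScreener` and `refreshOnce` are hypothetical names, and the stub's entry shape (`{ id }`) is an assumption; the real screener may validate entries or return more fields.

```javascript
// Hypothetical in-memory stand-in for a screener: only reload() and size().
function createStubScreener(initial) {
  let index = new Map(initial.map((e) => [e.id, e]));
  return {
    size: () => index.size,
    // Build the next index completely, then swap the reference in one assignment.
    reload(entries) {
      const next = new Map(entries.map((e) => [e.id, e]));
      const addedIds = [...next.keys()].filter((id) => !index.has(id));
      const removedIds = [...index.keys()].filter((id) => !next.has(id));
      index = next;
      return { addedIds, removedIds, newSize: next.size };
    },
  };
}

// One refresh attempt; the previous index survives a failed or empty fetch.
async function refreshOnce(screener, fetchFn) {
  let entries;
  try {
    entries = await fetchFn();
  } catch (err) {
    return { outcome: "failed", error: String((err && err.message) || err) };
  }
  if (!Array.isArray(entries) || entries.length === 0) {
    return { outcome: "skipped" };
  }
  return { outcome: "completed", diff: screener.reload(entries) };
}

async function demo() {
  const screener = createStubScreener([{ id: "a" }, { id: "b" }]);
  const ok = await refreshOnce(screener, async () => [{ id: "b" }, { id: "c" }]);
  const empty = await refreshOnce(screener, async () => []);
  const bad = await refreshOnce(screener, async () => { throw new Error("boom"); });
  return { ok, empty, bad, size: screener.size() };
}

demo().then((r) => console.log(JSON.stringify(r, null, 2)));
```

Because the swap happens only after the new index is fully built, a throwing or empty fetch leaves the stub exactly as it was, which is the property the fetcher's "keeps the previous index" behaviour depends on.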
@@ -0,0 +1,297 @@
1
+ "use strict";
2
+ /**
3
+ * Fuzzy name-matching primitives for sanctions screening.
4
+ *
5
+ * Operators screening names against the OFAC SDN list / EU CSL /
6
+ * UK HMT consolidated list need to handle:
7
+ * - Transliteration variations (Mohamed / Mohammed / Muhammad)
8
+ * - Order-of-name variations (Smith John vs John Smith)
9
+ * - Initials vs full names (J. Smith vs John Smith)
10
+ * - Diacritical noise (Müller vs Muller)
11
+ * - Substring containment (the SDN entry "Acme Corp" matches a
12
+ * local record "Acme Corp Limited")
13
+ *
14
+ * This module exports the algorithmic core; b.compliance.sanctions
15
+ * orchestrates parser/index/match against it.
16
+ *
17
+ * Functions:
18
+ * normalize(name) → canonical lowercase form, diacritics
19
+ * stripped, multi-space collapsed
20
+ * tokenize(name) → array of normalized tokens
21
+ * levenshtein(a, b, capDist) → edit distance with O(min(a,b)) memory
22
+ * + early-exit when distance > capDist
23
+ * jaroWinkler(a, b, prefixWeight) → 0..1 similarity score per
+ * Jaro-Winkler (Winkler 1990); operators typically
+ * threshold at >= 0.85 for "probable match"
26
+ * tokenSetSimilarity(a, b) → bag-of-tokens overlap with token-pair
27
+ * Jaro-Winkler scoring; resilient to
28
+ * word order and missing/extra terms
29
+ *
30
+ * Performance: worst-case O(n*m) for Levenshtein (n,m = string lengths),
31
+ * O(n*m) for Jaro-Winkler. Operators screening against a list of N
32
+ * entries should pre-filter on token-set overlap before computing
33
+ * Jaro-Winkler on every candidate.
34
+ */
35
+
36
+ var validateOpts = require("./validate-opts");
37
+ var { defineClass } = require("./framework-error");
38
+
39
+ var FuzzyError = defineClass("FuzzyError", { alwaysPermanent: true });
40
+
41
+ // ---- normalize ----
42
+
43
+ // Diacritic-stripping table — covers the most common Latin Unicode
44
+ // ranges. The framework intentionally ships a focused table (not a
45
+ // full Unicode normalizer) so the LoC is bounded; operators with
46
+ // non-Latin lists install ICU normalizer in their pre-processing.
47
+ var _DIACRITIC_MAP = {
48
+ "à":"a","á":"a","â":"a","ã":"a","ä":"a","å":"a","ą":"a","ă":"a",
49
+ "ç":"c","ć":"c","č":"c","ĉ":"c",
50
+ "ď":"d","đ":"d",
51
+ "è":"e","é":"e","ê":"e","ë":"e","ę":"e","ě":"e","ĕ":"e",
52
+ "ğ":"g","ĝ":"g","ġ":"g",
53
+ "ĥ":"h",
54
+ "ì":"i","í":"i","î":"i","ï":"i","ı":"i","į":"i",
55
+ "ĵ":"j",
56
+ "ķ":"k",
57
+ "ĺ":"l","ľ":"l","ł":"l","ļ":"l",
58
+ "ñ":"n","ń":"n","ň":"n","ņ":"n",
59
+ "ò":"o","ó":"o","ô":"o","õ":"o","ö":"o","ø":"o","ő":"o",
60
+ "ŕ":"r","ř":"r",
61
+ "ś":"s","š":"s","ş":"s","ș":"s","ŝ":"s",
62
+ "ť":"t","ţ":"t","ț":"t",
63
+ "ù":"u","ú":"u","û":"u","ü":"u","ū":"u","ů":"u","ű":"u","ŭ":"u",
64
+ "ŵ":"w",
65
+ "ý":"y","ÿ":"y","ŷ":"y",
66
+ "ź":"z","ż":"z","ž":"z",
67
+ "ß":"ss","æ":"ae","œ":"oe",
68
+ "À":"A","Á":"A","Â":"A","Ã":"A","Ä":"A","Å":"A",
69
+ "Ç":"C","È":"E","É":"E","Ê":"E","Ë":"E",
70
+ "Ì":"I","Í":"I","Î":"I","Ï":"I",
71
+ "Ñ":"N",
72
+ "Ò":"O","Ó":"O","Ô":"O","Õ":"O","Ö":"O","Ø":"O",
73
+ "Ù":"U","Ú":"U","Û":"U","Ü":"U",
74
+ "Ý":"Y","Ÿ":"Y",
75
+ "Ž":"Z","Š":"S",
76
+ };
77
+
78
+ function normalize(name) {
79
+ if (typeof name !== "string") return "";
80
+ // 1. Lowercase first, so uppercase letters that are missing from the
+ // table (e.g. "Č", "Ł") still reach their lowercase mapping
+ var lowered = name.toLowerCase();
+ // 2. Strip diacritics
+ var lower = "";
+ for (var i = 0; i < lowered.length; i++) {
+ var ch = lowered.charAt(i);
+ lower += _DIACRITIC_MAP[ch] || ch;
+ }
88
+ // 3. Strip punctuation other than hyphen + apostrophe (preserved
89
+ // inside names like O'Brien / Al-Faisal)
90
+ var punctStripped = lower.replace(/[^\p{Letter}\p{Number}'\- ]+/gu, " "); // allow:regex-no-length-cap — caller bounds total input via tokenize() length cap
91
+ // 4. Collapse whitespace
92
+ var collapsed = punctStripped.replace(/\s+/g, " ").trim();
93
+ return collapsed;
94
+ }
95
+
96
+ function tokenize(name) {
97
+ if (typeof name !== "string") return [];
98
+ if (name.length > MAX_INPUT_LEN) {
99
+ throw new FuzzyError("fuzzy/input-too-long",
100
+ "tokenize: input exceeds " + MAX_INPUT_LEN + " char cap");
101
+ }
102
+ var n = normalize(name);
103
+ if (n.length === 0) return [];
104
+ return n.split(" ").filter(function (t) { return t.length > 0; });
105
+ }
106
+
107
+ var MAX_INPUT_LEN = 512; // allow:raw-byte-literal — name length sanity cap (operators can override fuzzy.create)
108
+
109
+ // ---- Levenshtein with cap + early-exit ----
110
+
111
+ function levenshtein(a, b, capDist) {
112
+ if (typeof a !== "string" || typeof b !== "string") {
113
+ throw new FuzzyError("fuzzy/bad-input",
114
+ "levenshtein: a + b must be strings");
115
+ }
116
+ // Trivial cases
117
+ if (a === b) return 0;
118
+ if (a.length === 0) return b.length;
119
+ if (b.length === 0) return a.length;
120
+
121
+ // Cap (Math.abs(a.length - b.length) is the lower bound; if this
122
+ // already exceeds cap we can skip the full DP)
123
+ if (typeof capDist === "number" && capDist >= 0) {
124
+ var lengthDelta = Math.abs(a.length - b.length);
125
+ if (lengthDelta > capDist) return capDist + 1;
126
+ }
127
+
128
+ // Two-row DP: O(min(a.length, b.length)) memory.
129
+ var s = a.length <= b.length ? a : b;
130
+ var t = a.length <= b.length ? b : a;
131
+ var prev = new Array(s.length + 1);
132
+ var curr = new Array(s.length + 1);
133
+ for (var i = 0; i <= s.length; i++) prev[i] = i;
134
+ for (var j = 1; j <= t.length; j++) {
135
+ curr[0] = j;
136
+ var rowMin = j;
137
+ for (var k = 1; k <= s.length; k++) {
138
+ var cost = s.charAt(k - 1) === t.charAt(j - 1) ? 0 : 1;
139
+ curr[k] = Math.min(
140
+ prev[k] + 1, // deletion
141
+ curr[k - 1] + 1, // insertion
142
+ prev[k - 1] + cost // substitution
143
+ );
144
+ if (curr[k] < rowMin) rowMin = curr[k];
145
+ }
146
+ if (typeof capDist === "number" && rowMin > capDist) return capDist + 1;
147
+ var swap = prev; prev = curr; curr = swap;
148
+ }
149
+ return prev[s.length];
150
+ }
151
+
152
+ // ---- Jaro and Jaro-Winkler ----
153
+
154
+ function jaro(a, b) {
155
+ if (typeof a !== "string" || typeof b !== "string") return 0;
156
+ if (a === b) return a.length === 0 ? 0 : 1;
157
+ if (a.length === 0 || b.length === 0) return 0;
158
+ var matchWindow = Math.max(0, Math.floor(Math.max(a.length, b.length) / 2) - 1); // allow:raw-byte-literal — Jaro match-window formula
159
+ var aMatched = new Array(a.length).fill(false);
160
+ var bMatched = new Array(b.length).fill(false);
161
+ var matches = 0;
162
+ for (var i = 0; i < a.length; i++) {
163
+ var lo = Math.max(0, i - matchWindow);
164
+ var hi = Math.min(b.length - 1, i + matchWindow);
165
+ for (var j = lo; j <= hi; j++) {
166
+ if (bMatched[j]) continue;
167
+ if (a.charAt(i) !== b.charAt(j)) continue;
168
+ aMatched[i] = true;
169
+ bMatched[j] = true;
170
+ matches += 1;
171
+ break;
172
+ }
173
+ }
174
+ if (matches === 0) return 0;
175
+ // Count transpositions
176
+ var t = 0;
177
+ var k = 0;
178
+ for (var ii = 0; ii < a.length; ii++) {
179
+ if (!aMatched[ii]) continue;
180
+ while (!bMatched[k]) k += 1;
181
+ if (a.charAt(ii) !== b.charAt(k)) t += 1;
182
+ k += 1;
183
+ }
184
+ var transpositions = t / 2;
185
+ return (matches / a.length + matches / b.length +
186
+ (matches - transpositions) / matches) / 3; // allow:raw-byte-literal — Jaro 3-term formula
187
+ }
188
+
189
+ function jaroWinkler(a, b, prefixWeight) {
190
+ // prefixWeight defaults to 0.1 per the original Winkler paper;
191
+ // operators can lower to reduce prefix bias.
192
+ var w = (typeof prefixWeight === "number" && isFinite(prefixWeight))
193
+ ? prefixWeight : 0.1;
194
+ if (w < 0 || w > 0.25) {
195
+ throw new FuzzyError("fuzzy/bad-prefix-weight",
196
+ "jaroWinkler: prefixWeight must be in [0, 0.25]");
197
+ }
198
+ var j = jaro(a, b);
199
+ if (j === 0) return 0;
200
+ // Common prefix up to 4 chars (Winkler's cap)
201
+ var maxPrefix = 4; // allow:raw-byte-literal — Jaro-Winkler prefix cap (Winkler 1990)
202
+ var prefixLen = 0;
203
+ var max = Math.min(a.length, b.length, maxPrefix);
204
+ for (var i = 0; i < max; i++) {
205
+ if (a.charAt(i) !== b.charAt(i)) break;
206
+ prefixLen += 1;
207
+ }
208
+ return j + prefixLen * w * (1 - j);
209
+ }
210
+
211
+ // ---- Token-set similarity ----
212
+
213
+ function tokenSetSimilarity(a, b, opts) {
214
+ opts = opts || {};
215
+ var prefixWeight = opts.prefixWeight;
216
+ var threshold = (typeof opts.threshold === "number" && isFinite(opts.threshold))
217
+ ? opts.threshold : 0.85;
218
+ var tokensA = tokenize(a);
219
+ var tokensB = tokenize(b);
220
+ if (tokensA.length === 0 || tokensB.length === 0) return 0;
221
+ // Greedy bipartite matching: for each token in A, find the best
222
+ // unmatched B token; sum & average. This is O(n*m) but the typical
223
+ // name has ≤ 5 tokens so it's bounded.
224
+ var bUsed = new Array(tokensB.length).fill(false);
225
+ var matchedScores = [];
226
+ for (var i = 0; i < tokensA.length; i++) {
227
+ var bestScore = 0;
228
+ var bestIdx = -1;
229
+ for (var j = 0; j < tokensB.length; j++) {
230
+ if (bUsed[j]) continue;
231
+ var s = jaroWinkler(tokensA[i], tokensB[j], prefixWeight);
232
+ if (s > bestScore) { bestScore = s; bestIdx = j; }
233
+ }
234
+ if (bestIdx !== -1 && bestScore >= threshold) {
235
+ bUsed[bestIdx] = true;
236
+ matchedScores.push(bestScore);
237
+ }
238
+ }
239
+ if (matchedScores.length === 0) return 0;
240
+ // Token-set similarity: average of the matched-pair scores, weighted
241
+ // by coverage of the smaller-token-side.
242
+ var avg = matchedScores.reduce(function (a2, b2) { return a2 + b2; }, 0) /
243
+ matchedScores.length;
244
+ var coverage = matchedScores.length / Math.min(tokensA.length, tokensB.length);
245
+ return avg * coverage;
246
+ }
247
+
248
+ // ---- Container helpers ----
249
+
250
+ // substringContains — true when the normalized form of `needle` is a
251
+ // whitespace-bounded substring of the normalized form of `haystack`.
252
+ // Useful for catching SDN entries like "Acme Corp" inside a fuller
253
+ // local record like "Acme Corp Limited Liability Company".
254
+ function substringContains(haystack, needle) {
+ var nn = normalize(needle);
+ // An empty needle would otherwise "match" an empty haystack.
+ if (nn.length === 0) return false;
+ var nh = " " + normalize(haystack) + " ";
+ return nh.indexOf(" " + nn + " ") !== -1;
+ }
259
+
260
+ // initialsMatch — true when the normalized form of `a` is shaped like
261
+ // "J Smith" / "J. Smith" and, token for token, matches the leading-character
262
+ // pattern of `b`. Catches the common "screen-typo" pattern where the
263
+ // user typed an initial instead of a full first name.
264
+ function initialsMatch(a, b) {
265
+ var ta = tokenize(a);
266
+ var tb = tokenize(b);
267
+ if (ta.length === 0 || tb.length === 0) return false;
268
+ if (ta.length !== tb.length) return false;
269
+ for (var i = 0; i < ta.length; i++) {
270
+ var x = ta[i];
271
+ var y = tb[i];
272
+ if (x === y) continue;
273
+ // Match if either side is a single char and matches the other's
274
+ // first char.
275
+ if (x.length === 1 && y.startsWith(x)) continue;
276
+ if (y.length === 1 && x.startsWith(y)) continue;
277
+ return false;
278
+ }
279
+ return true;
280
+ }
281
+
282
+ module.exports = {
283
+ normalize: normalize,
284
+ tokenize: tokenize,
285
+ levenshtein: levenshtein,
286
+ jaro: jaro,
287
+ jaroWinkler: jaroWinkler,
288
+ tokenSetSimilarity: tokenSetSimilarity,
289
+ substringContains: substringContains,
290
+ initialsMatch: initialsMatch,
291
+ FuzzyError: FuzzyError,
292
+ MAX_INPUT_LEN: MAX_INPUT_LEN,
293
+ };
294
+ // note: validateOpts intentionally not used in this file (pure
295
+ // algorithmic helpers); imported only to keep the require shape
296
+ // consistent with sister modules.
297
+ void validateOpts;
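The scoring rule in `tokenSetSimilarity` above (greedy best-match per token, then average times coverage of the shorter side) can be sketched in isolation. This is a minimal illustration rather than the package's code: `tokenSetScore` and `exactSim` are hypothetical names, and the token similarity is a plain equality check standing in for `jaroWinkler` so that the arithmetic is easy to follow.

```javascript
// Sketch only: tokenSetScore and exactSim are hypothetical helpers, not package API.
// `sim` stands in for jaroWinkler; any function returning a score in [0, 1] works.
function tokenSetScore(tokensA, tokensB, sim, threshold) {
  if (tokensA.length === 0 || tokensB.length === 0) return 0;
  var used = new Array(tokensB.length).fill(false);
  var scores = [];
  for (var i = 0; i < tokensA.length; i++) {
    var best = 0;
    var bestIdx = -1;
    // Greedy step: pick the best still-unmatched token on the B side.
    for (var j = 0; j < tokensB.length; j++) {
      if (used[j]) continue;
      var s = sim(tokensA[i], tokensB[j]);
      if (s > best) { best = s; bestIdx = j; }
    }
    // Pairs below the threshold are dropped, not averaged in.
    if (bestIdx !== -1 && best >= threshold) {
      used[bestIdx] = true;
      scores.push(best);
    }
  }
  if (scores.length === 0) return 0;
  var avg = scores.reduce(function (x, y) { return x + y; }, 0) / scores.length;
  // Coverage is measured against the shorter token list.
  var coverage = scores.length / Math.min(tokensA.length, tokensB.length);
  return avg * coverage;
}

// Stand-in similarity (assumption): 1 for identical tokens, 0 otherwise.
function exactSim(x, y) { return x === y ? 1 : 0; }

var full = tokenSetScore(["john", "smith"], ["smith", "john"], exactSim, 0.85);    // 1 (order-invariant)
var partial = tokenSetScore(["john", "smith"], ["john", "doe"], exactSim, 0.85);   // 0.5 (one of two tokens)
var contained = tokenSetScore(["acme"], ["acme", "corp", "ltd"], exactSim, 0.85);  // 1 (shorter side fully covered)
console.log(full, partial, contained);
```

Because coverage divides by the smaller token count, a one-token query that matches any single token of a longer name scores as a full match. Callers who find that too permissive may want to combine this score with a token-count check.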
@@ -0,0 +1,569 @@
1
+ "use strict";
2
+ /**
3
+ * b.compliance.sanctions — sanctions-list screening.
4
+ *
5
+ * Operators handling KYC / payment / customer-onboarding flows screen
6
+ * names against the U.S. Treasury OFAC Specially Designated Nationals
7
+ * list, the EU Consolidated Sanctions List (CSL), the UK HMT
8
+ * consolidated list, the UN 1267 Al-Qaida/Taliban list, and adjacent
9
+ * regulatory lists. The framework owns the indexing + match algorithm;
10
+ * the operator owns the daily fetch + format-specific parsing.
11
+ *
12
+ * var screener = b.compliance.sanctions.create({
13
+ * entries: parsedSdnList, // operator-supplied
14
+ * algorithm: "ofac-sdn", // | "eu-csl" | "uk-hmt" | "un-1267" |
15
+ * // "custom"
16
+ * fuzzy: {
17
+ * enabled: true,
18
+ * threshold: 0.85, // Jaro-Winkler threshold; 0..1
19
+ * strategy: "jaro-winkler", // | "levenshtein" | "exact"
20
+ * maxLevenshtein: 3, // max edit distance per "levenshtein"
21
+ * },
22
+ * audit: true,
23
+ * });
24
+ *
25
+ * var result = screener.screen({
26
+ * name: "John Smith",
27
+ * dateOfBirth: "1980-01-15",
28
+ * country: "US",
29
+ * type: "individual", // | "entity" | "vessel" | "aircraft"
30
+ * aliases: ["J Smith", "Jonny Smith"],
31
+ * });
32
+ * // → {
33
+ * // match: true | false,
34
+ * // hits: [{ entryId, name, matchedOn, score, reason, listed, programs, type, country }],
34
+ * // query, screenedAt, algorithm, ruleVersion, strategy, threshold,
36
+ * // }
37
+ *
38
+ * Entry shape (operator parses raw list into this canonical shape):
39
+ * {
40
+ * id: "OFAC-12345",
41
+ * primaryName: "JOHN SMITH",
42
+ * aliases: ["J SMITH", "JONNY SMITH"],
43
+ * type: "individual" | "entity" | "vessel" | "aircraft",
44
+ * programs: ["SDGT", "RUSSIA-EO13662"], // sanction programs
45
+ * listedAt: "2024-03-15",
46
+ * country: "RU",
47
+ * dateOfBirth: ["1980-01-15"], // optional disambiguator
48
+ * remarks: "...",
49
+ * // operator-side fields preserved verbatim:
50
+ * raw: <any>,
51
+ * }
52
+ *
53
+ * Audit emissions (audit namespace `compliance`):
54
+ * compliance.sanctions.screened — every screen() call (match or no-match)
55
+ * compliance.sanctions.matched — every screen() with at least one hit
56
+ *
57
+ * The framework does NOT vendor the list itself: list contents change
58
+ * daily and have legal-distribution implications. Operators fetch from
59
+ * the source (treasury.gov for OFAC, sanctionsmap.eu for EU CSL,
60
+ * gov.uk for HMT, scsanctions.un.org for UN 1267) on a daily schedule
61
+ * and pass the parsed array.
62
+ */
63
+
64
+ var lazyRequire = require("./lazy-require");
65
+ var validateOpts = require("./validate-opts");
66
+ var fuzzy = require("./compliance-sanctions-fuzzy");
67
+ var aliases = require("./compliance-sanctions-aliases");
68
+ var fetcher = require("./compliance-sanctions-fetcher");
69
+ var { defineClass } = require("./framework-error");
70
+
71
+ var SanctionsError = defineClass("SanctionsError", { alwaysPermanent: true });
72
+
73
+ var audit = lazyRequire(function () { return require("./audit"); });
74
+ var observability = lazyRequire(function () { return require("./observability"); });
75
+
76
+ var VALID_ALGORITHMS = Object.freeze([
77
+ "ofac-sdn", // U.S. Treasury Specially Designated Nationals
78
+ "eu-csl", // EU Consolidated Sanctions List
79
+ "uk-hmt", // UK HM Treasury consolidated
80
+ "un-1267", // UN Security Council 1267/1989/2253
81
+ "custom", // operator-defined list
82
+ ]);
83
+
84
+ var VALID_STRATEGIES = Object.freeze([
85
+ "jaro-winkler",
86
+ "levenshtein",
87
+ "exact",
88
+ ]);
89
+
90
+ var VALID_TYPES = Object.freeze([
91
+ "individual",
92
+ "entity",
93
+ "vessel",
94
+ "aircraft",
95
+ ]);
96
+
97
+ // ---- Parser shims ----
98
+ //
99
+ // Operators feed pre-parsed entries to create(); the framework also
100
+ // ships parser shims for the common public formats. Parsers run on
101
+ // the operator side (network fetch + format conversion) and return
102
+ // the canonical entry shape. The framework's parsers are minimal:
103
+ // just enough to extract id + primaryName + aliases + programs from
104
+ // the canonical XML/JSON shape that each sanctions authority ships.
105
+
106
+ // OFAC SDN — the Treasury distributes XML and CSV; we accept the
107
+ // parsed CSV-row shape (operator runs b.parsers.safeCsv). Each row:
108
+ // { ent_num, SDN_Name, SDN_Type, Program, Title, Call_Sign, ... }
109
+ function parseOfacCsvRow(row) {
110
+ if (!row || typeof row !== "object") return null;
111
+ if (!row.SDN_Name || row.ent_num === undefined) return null;
112
+ return {
113
+ id: "OFAC-" + String(row.ent_num),
114
+ primaryName: String(row.SDN_Name).trim(),
115
+ aliases: [], // OFAC distributes aliases in a separate alt-names file
116
+ type: _ofacTypeToCanonical(row.SDN_Type),
117
+ programs: row.Program ? String(row.Program).split(";").map(function (s) { return s.trim(); }).filter(Boolean) : [],
118
+ country: row.Country ? String(row.Country).trim() : null,
119
+ listedAt: row.Publish_Date ? String(row.Publish_Date) : null,
120
+ remarks: row.Remarks ? String(row.Remarks) : null,
121
+ raw: row,
122
+ };
123
+ }
124
+
125
+ function _ofacTypeToCanonical(t) {
126
+ switch (String(t || "").toLowerCase()) {
127
+ case "individual": return "individual";
128
+ case "entity": return "entity";
129
+ case "vessel": return "vessel";
130
+ case "aircraft": return "aircraft";
131
+ default: return "entity";
132
+ }
133
+ }
134
+
135
+ // OFAC alias rows from the alt-names file:
136
+ // { ent_num, alt_num, alt_type, alt_name, alt_remarks }
137
+ // merged into the primary entry by operator code via mergeAliases().
138
+ function parseOfacAliasRow(row) {
139
+ if (!row || typeof row !== "object") return null;
140
+ if (row.ent_num === undefined || !row.alt_name) return null;
141
+ return {
142
+ entId: "OFAC-" + String(row.ent_num),
143
+ altType: String(row.alt_type || "aka"),
144
+ altName: String(row.alt_name).trim(),
145
+ remarks: row.alt_remarks ? String(row.alt_remarks) : null,
146
+ };
147
+ }
148
+
149
+ function mergeAliases(entries, aliasRows) {
150
+ if (!Array.isArray(entries)) return [];
151
+ if (!Array.isArray(aliasRows)) return entries;
152
+ var byId = Object.create(null);
153
+ for (var i = 0; i < entries.length; i++) byId[entries[i].id] = entries[i];
154
+ for (var j = 0; j < aliasRows.length; j++) {
155
+ var alias = aliasRows[j];
156
+ var entry = byId[alias.entId];
157
+ if (entry) entry.aliases.push(alias.altName);
158
+ }
159
+ return entries;
160
+ }
161
+
162
+ // EU CSL — the EU distributes XML; operator parses with b.parsers.safeXml
163
+ // and feeds the per-entity dict (subjectType, nameAlias, regulation, etc.)
164
+ function parseEuCslEntry(entity) {
165
+ if (!entity || typeof entity !== "object") return null;
166
+ var nameAliases = entity.nameAlias || entity.NAMEALIAS || [];
167
+ if (!Array.isArray(nameAliases)) nameAliases = [nameAliases];
168
+ if (nameAliases.length === 0) return null;
169
+ var primary = nameAliases[0];
170
+ return {
171
+ id: "EU-CSL-" + String(entity.logicalId || entity.LOGICALID || ""),
172
+ primaryName: String(primary.wholeName || primary.WHOLENAME || "").trim(),
173
+ aliases: nameAliases.slice(1).map(function (a) {
174
+ return String(a.wholeName || a.WHOLENAME || "").trim();
175
+ }).filter(Boolean),
176
+ type: _euTypeToCanonical(entity.subjectType || entity.SUBJECTTYPE),
177
+ programs: entity.regulation ? [String(entity.regulation)] : [],
178
+ country: entity.country || null,
179
+ listedAt: entity.designationDate || null,
180
+ remarks: entity.remark || null,
181
+ raw: entity,
182
+ };
183
+ }
184
+
185
+ function _euTypeToCanonical(t) {
186
+ switch (String(t || "").toLowerCase()) {
187
+ case "person": return "individual";
188
+ case "enterprise": return "entity";
189
+ case "vessel": return "vessel";
190
+ case "aircraft": return "aircraft";
191
+ default: return "entity";
192
+ }
193
+ }
194
+
195
+ // UN 1267 list — XML-based, similar to EU shape but different field
196
+ // names. Operators parse the XML root then feed individual entries.
197
+ function parseUn1267Entry(entry) {
198
+ if (!entry || typeof entry !== "object") return null;
199
+ var name = entry.NAME || entry.name || entry.FIRST_NAME || "";
200
+ if (!name) return null;
201
+ var aliases = [];
202
+ if (Array.isArray(entry.ALIASES)) aliases = entry.ALIASES.slice();
203
+ else if (typeof entry.ALIAS_NAMES === "string") {
204
+ aliases = entry.ALIAS_NAMES.split(";").map(function (s) { return s.trim(); }).filter(Boolean);
205
+ }
206
+ return {
207
+ id: "UN-1267-" + String(entry.REFERENCE_NUMBER || entry.DATAID || ""),
208
+ primaryName: String(name).trim(),
209
+ aliases: aliases,
210
+ type: entry.NAME_TYPE === "Entity" ? "entity" : "individual",
211
+ programs: ["UN-1267"],
212
+ country: entry.COUNTRY || entry.NATIONALITY || null,
213
+ listedAt: entry.LISTED_ON || null,
214
+ remarks: entry.COMMENTS || null,
215
+ raw: entry,
216
+ };
217
+ }
218
+
219
+ // ---- Index + screen ----
220
+
221
+ function _normalizeEntry(e) {
222
+ // Defensive copy + normalise primaryName/aliases for fast match.
223
+ var norm = {
224
+ id: e.id,
225
+ primaryName: e.primaryName || "",
226
+ aliases: Array.isArray(e.aliases) ? e.aliases.slice() : [],
227
+ type: e.type || "entity",
228
+ programs: Array.isArray(e.programs) ? e.programs.slice() : [],
229
+ country: e.country || null,
230
+ listedAt: e.listedAt || null,
231
+ dateOfBirth: Array.isArray(e.dateOfBirth) ? e.dateOfBirth.slice() : (e.dateOfBirth ? [e.dateOfBirth] : []),
232
+ remarks: e.remarks || null,
233
+ raw: e.raw || null,
234
+ };
235
+ // Pre-normalize every name (primary + aliases) for the matcher
236
+ norm._allNamesNormalized = [norm.primaryName].concat(norm.aliases)
237
+ .map(fuzzy.normalize)
238
+ .filter(function (s) { return s.length > 0; });
239
+ return norm;
240
+ }
241
+
242
+ function create(opts) {
243
+ validateOpts.requireObject(opts, "compliance.sanctions", SanctionsError);
244
+ validateOpts(opts, [
245
+ "entries", "algorithm", "fuzzy", "audit", "ruleVersion",
246
+ ], "compliance.sanctions.create");
247
+
248
+ if (!Array.isArray(opts.entries)) {
249
+ throw new SanctionsError("sanctions/no-entries",
250
+ "compliance.sanctions.create: entries must be an array");
251
+ }
252
+ var algorithm = opts.algorithm || "custom";
253
+ if (VALID_ALGORITHMS.indexOf(algorithm) === -1) {
254
+ throw new SanctionsError("sanctions/bad-algorithm",
255
+ "compliance.sanctions.create: algorithm must be one of " +
256
+ VALID_ALGORITHMS.join(", "));
257
+ }
258
+ var fuzzyOpts = opts.fuzzy || {};
259
+ if (typeof fuzzyOpts !== "object" || Array.isArray(fuzzyOpts)) {
260
+ throw new SanctionsError("sanctions/bad-fuzzy",
261
+ "compliance.sanctions.create: fuzzy must be an object");
262
+ }
263
+ var fuzzyEnabled = fuzzyOpts.enabled !== false;
264
+ var fuzzyThreshold = (typeof fuzzyOpts.threshold === "number" && isFinite(fuzzyOpts.threshold))
265
+ ? fuzzyOpts.threshold : 0.85;
266
+ if (fuzzyThreshold < 0 || fuzzyThreshold > 1) {
267
+ throw new SanctionsError("sanctions/bad-threshold",
268
+ "compliance.sanctions.create: fuzzy.threshold must be in [0, 1]");
269
+ }
270
+ var fuzzyStrategy = fuzzyOpts.strategy || "jaro-winkler";
271
+ if (VALID_STRATEGIES.indexOf(fuzzyStrategy) === -1) {
272
+ throw new SanctionsError("sanctions/bad-strategy",
273
+ "compliance.sanctions.create: fuzzy.strategy must be one of " +
274
+ VALID_STRATEGIES.join(", "));
275
+ }
276
+ var maxLevenshtein = (typeof fuzzyOpts.maxLevenshtein === "number" && isFinite(fuzzyOpts.maxLevenshtein))
277
+ ? fuzzyOpts.maxLevenshtein : 3; // allow:raw-byte-literal — default edit-distance cap (operator-tunable)
278
+ var auditOn = opts.audit !== false;
279
+ var ruleVersion = opts.ruleVersion || ("entries:" + opts.entries.length);
280
+
281
+ // Index: normalize every entry once, up front, so screen() only compares
282
+ // pre-normalized strings: O(N*K) comparisons, where N is the entry count and
283
+ // K is the number of names (primary + aliases) per entry. For a 30k-entry
284
+ // list with ~3 names each, the index holds ~90k normalized strings.
285
+ var index = opts.entries.map(_normalizeEntry);
286
+
287
+ function _emitAudit(action, outcome, metadata) {
288
+ if (!auditOn) return;
289
+ try {
290
+ audit().safeEmit({
291
+ action: action,
292
+ outcome: outcome,
293
+ metadata: metadata || {},
294
+ });
295
+ } catch (_e) { /* drop-silent — audit sink */ }
296
+ }
297
+
298
+ function _emitMetric(verb, n, labels) {
299
+ try { observability().safeEvent("compliance.sanctions." + verb, n || 1, labels || {}); }
300
+ catch (_e) { /* drop-silent */ }
301
+ }
302
+
303
+ function _exactMatch(qNorm, candidate) {
304
+ for (var i = 0; i < candidate._allNamesNormalized.length; i++) {
305
+ if (candidate._allNamesNormalized[i] === qNorm) return 1.0;
306
+ }
307
+ return 0;
308
+ }
309
+
310
+ function _jaroWinklerMatch(qNorm, candidate) {
311
+ var bestScore = 0;
312
+ var bestName = "";
313
+ for (var i = 0; i < candidate._allNamesNormalized.length; i++) {
314
+ var name = candidate._allNamesNormalized[i];
315
+ var s = fuzzy.tokenSetSimilarity(qNorm, name, {
316
+ threshold: fuzzyThreshold,
317
+ });
318
+ if (s > bestScore) {
319
+ bestScore = s;
320
+ bestName = name;
321
+ }
322
+ // Also try direct Jaro-Winkler on the whole strings
323
+ var s2 = fuzzy.jaroWinkler(qNorm, name);
324
+ if (s2 > bestScore) {
325
+ bestScore = s2;
326
+ bestName = name;
327
+ }
328
+ // Substring containment scores 0.92 (high but below exact)
329
+ if (fuzzy.substringContains(name, qNorm)) {
330
+ if (0.92 > bestScore) { bestScore = 0.92; bestName = name; } // allow:raw-byte-literal — substring-match score weight
331
+ }
332
+ if (fuzzy.substringContains(qNorm, name)) {
333
+ if (0.92 > bestScore) { bestScore = 0.92; bestName = name; } // allow:raw-byte-literal — substring-match score weight
334
+ }
335
+ }
336
+ return { score: bestScore, name: bestName };
337
+ }
338
+
339
+ function _levenshteinMatch(qNorm, candidate) {
340
+ var bestScore = 0;
341
+ var bestName = "";
342
+ for (var i = 0; i < candidate._allNamesNormalized.length; i++) {
343
+ var name = candidate._allNamesNormalized[i];
344
+ var dist = fuzzy.levenshtein(qNorm, name, maxLevenshtein);
345
+ if (dist > maxLevenshtein) continue;
346
+ // Distance → score: distance 0 → 1.0; a distance equal to the longer string's length → 0.0.
347
+ var maxLen = Math.max(qNorm.length, name.length);
348
+ if (maxLen === 0) continue;
349
+ var score = Math.max(0, 1 - dist / maxLen);
350
+ if (score > bestScore) { bestScore = score; bestName = name; }
351
+ }
352
+ return { score: bestScore, name: bestName };
353
+ }
354
+
355
+ function screen(input) {
356
+ if (!input || typeof input !== "object") {
357
+ throw new SanctionsError("sanctions/bad-input",
358
+ "screen: input must be an object");
359
+ }
360
+ if (typeof input.name !== "string" || input.name.length === 0) {
361
+ throw new SanctionsError("sanctions/no-name",
362
+ "screen: input.name is required");
363
+ }
364
+ if (input.name.length > fuzzy.MAX_INPUT_LEN) {
365
+ throw new SanctionsError("sanctions/name-too-long",
366
+ "screen: input.name exceeds " + fuzzy.MAX_INPUT_LEN + " char cap");
367
+ }
368
+ if (input.type !== undefined && VALID_TYPES.indexOf(input.type) === -1) {
369
+ throw new SanctionsError("sanctions/bad-type",
370
+ "screen: input.type must be one of " + VALID_TYPES.join(", "));
371
+ }
372
+ var queryName = fuzzy.normalize(input.name);
373
+ var queryAliases = Array.isArray(input.aliases)
374
+ ? input.aliases.map(fuzzy.normalize).filter(function (s) { return s.length > 0; })
375
+ : [];
376
+ var queryNames = [queryName].concat(queryAliases);
377
+
378
+ var hits = [];
379
+ var screenedAt = Date.now();
380
+
381
+ for (var c = 0; c < index.length; c++) {
382
+ var candidate = index[c];
383
+ // Type filter: when input.type is set, skip candidates of
384
+ // the wrong type unless candidate is an entity (entities can
385
+ // be matched regardless to catch operator-side type errors).
386
+ if (input.type && candidate.type !== input.type &&
387
+ candidate.type !== "entity") {
388
+ continue;
389
+ }
390
+
391
+ var bestForCandidate = { score: 0, name: "" };
392
+ for (var qi = 0; qi < queryNames.length; qi++) {
393
+ var qn = queryNames[qi];
394
+ var match;
395
+ if (!fuzzyEnabled || fuzzyStrategy === "exact") {
396
+ var exact = _exactMatch(qn, candidate);
397
+ match = { score: exact, name: candidate.primaryName };
398
+ } else if (fuzzyStrategy === "jaro-winkler") {
399
+ match = _jaroWinklerMatch(qn, candidate);
400
+ } else {
401
+ match = _levenshteinMatch(qn, candidate);
402
+ }
403
+ if (match.score > bestForCandidate.score) {
404
+ bestForCandidate = match;
405
+ }
406
+ }
407
+ if (bestForCandidate.score >= fuzzyThreshold) {
408
+ hits.push({
409
+ entryId: candidate.id,
410
+ name: candidate.primaryName,
411
+ matchedOn: bestForCandidate.name,
412
+ score: bestForCandidate.score,
413
+ reason: bestForCandidate.score >= 0.99 ? "exact-or-near-exact" :
414
+ bestForCandidate.score >= 0.92 ? "substring-or-token-match" :
415
+ "fuzzy",
416
+ listed: candidate.listedAt,
417
+ programs: candidate.programs,
418
+ type: candidate.type,
419
+ country: candidate.country,
420
+ });
421
+ }
422
+ }
423
+ // Sort hits by descending score
424
+ hits.sort(function (a, b) { return b.score - a.score; });
425
+
426
+ var matched = hits.length > 0;
427
+ var result = {
428
+ match: matched,
429
+ hits: hits,
430
+ query: { name: input.name, type: input.type || null,
431
+ country: input.country || null,
432
+ dateOfBirth: input.dateOfBirth || null },
433
+ screenedAt: screenedAt,
434
+ algorithm: algorithm,
435
+ ruleVersion: ruleVersion,
436
+ strategy: fuzzyEnabled ? fuzzyStrategy : "exact",
437
+ threshold: fuzzyThreshold,
438
+ };
439
+ _emitAudit("compliance.sanctions.screened", "success", {
440
+ algorithm: algorithm, matched: matched,
441
+ hits: hits.length, ruleVersion: ruleVersion,
442
+ });
443
+ if (matched) {
444
+ _emitAudit("compliance.sanctions.matched", "success", {
445
+ algorithm: algorithm, hits: hits.length,
446
+ topScore: hits[0].score, topProgram: hits[0].programs && hits[0].programs[0],
447
+ });
448
+ _emitMetric("matched", 1, { algorithm: algorithm });
449
+ }
450
+ _emitMetric("screened", 1, { algorithm: algorithm });
451
+ return result;
452
+ }
453
+
454
+ function size() { return index.length; }
455
+ function entryById(id) {
456
+ for (var i = 0; i < index.length; i++) {
457
+ if (index[i].id === id) return index[i];
458
+ }
459
+ return null;
460
+ }
461
+
462
+ // screenBulk — convenience wrapper that screens an array of inputs
463
+ // and returns the per-input result array. Operators screening a
464
+ // batch of records (KYC list import, periodic re-screen of existing
465
+ // customers) call this once instead of looping; the wrapper still
466
+ // emits one audit event per input so the audit chain stays per-row.
467
+ function screenBulk(inputs) {
468
+ if (!Array.isArray(inputs)) {
469
+ throw new SanctionsError("sanctions/bad-bulk",
470
+ "screenBulk: inputs must be an array");
471
+ }
472
+ var out = [];
473
+ for (var i = 0; i < inputs.length; i++) {
474
+ out.push(screen(inputs[i]));
475
+ }
476
+ return out;
477
+ }
478
+
479
+ // snapshot: returns an id-derived digest + count of the active
480
+ // rule index, useful for compliance audit trails ("we screened
481
+ // ticket X against rule snapshot SHA-3 abcd..."). The snapshot is a
482
+ // truncated SHA-3-512 of the sorted entry ids; collisions are
483
+ // ignorable for the audit-trail use case (operators store the
484
+ // ruleVersion + entry count alongside).
485
+ function snapshot() {
486
+ var crypto = require("crypto");
487
+ var ids = index.map(function (e) { return e.id; }).sort();
488
+ var hash = crypto.createHash("sha3-512");
489
+ for (var i = 0; i < ids.length; i++) hash.update(ids[i]);
490
+ return {
491
+ algorithm: algorithm,
492
+ ruleVersion: ruleVersion,
493
+ entryCount: index.length,
494
+ digest: hash.digest("hex").slice(0, 32), // allow:raw-byte-literal — first 32 hex chars (128 bits) of SHA-3 digest, sufficient for snapshot identity
495
+ digestAlg: "sha3-512-trunc128",
496
+ capturedAt: Date.now(),
497
+ };
498
+ }
499
+
500
+ // reload — atomically swap the index to a fresh entry list. Returns
501
+ // a diff describing how the index changed (added / removed). The
502
+ // operator's daily-fetch worker uses this; the swap is atomic from
503
+ // the caller's perspective (screen() always sees the old or new
504
+ // index, never a partial state).
505
+ function reload(newEntries) {
506
+ if (!Array.isArray(newEntries)) {
507
+ throw new SanctionsError("sanctions/bad-reload",
508
+ "reload: newEntries must be an array");
509
+ }
510
+ var oldIds = Object.create(null);
511
+ for (var i = 0; i < index.length; i++) oldIds[index[i].id] = true;
512
+ var newIndex = newEntries.map(_normalizeEntry);
513
+ var newIds = Object.create(null);
514
+ for (var j = 0; j < newIndex.length; j++) newIds[newIndex[j].id] = true;
515
+ var added = [];
516
+ var removed = [];
517
+ for (var k = 0; k < newIndex.length; k++) {
518
+ if (!oldIds[newIndex[k].id]) added.push(newIndex[k].id);
519
+ }
520
+ for (var l = 0; l < index.length; l++) {
521
+ if (!newIds[index[l].id]) removed.push(index[l].id);
522
+ }
523
+ // Atomic swap (single reference assignment)
524
+ index = newIndex;
525
+ ruleVersion = "entries:" + index.length + ";reloadedAt:" + Date.now();
526
+ _emitAudit("compliance.sanctions.reloaded", "success", {
527
+ added: added.length, removed: removed.length,
528
+ newSize: index.length, ruleVersion: ruleVersion,
529
+ });
530
+ _emitMetric("reloaded", 1, { algorithm: algorithm });
531
+ return {
532
+ addedIds: added,
533
+ removedIds: removed,
534
+ newSize: index.length,
535
+ ruleVersion: ruleVersion,
536
+ };
537
+ }
538
+
539
+ return {
540
+ screen: screen,
541
+ screenBulk: screenBulk,
542
+ snapshot: snapshot,
543
+ reload: reload,
544
+ size: size,
545
+ entryById: entryById,
546
+ algorithm: algorithm,
547
+ ruleVersion: ruleVersion,
548
+ threshold: fuzzyThreshold,
549
+ strategy: fuzzyEnabled ? fuzzyStrategy : "exact",
550
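+ // Exposed for tests + advanced operator workflows. Note: this reference and the ruleVersion property above are captured at create() time; reload() does not update them.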
551
+ _index: index,
552
+ };
553
+ }
554
+
555
+ module.exports = {
556
+ create: create,
557
+ parseOfacCsvRow: parseOfacCsvRow,
558
+ parseOfacAliasRow: parseOfacAliasRow,
559
+ mergeAliases: mergeAliases,
560
+ parseEuCslEntry: parseEuCslEntry,
561
+ parseUn1267Entry: parseUn1267Entry,
562
+ fuzzy: fuzzy,
563
+ aliases: aliases,
564
+ fetcher: fetcher,
565
+ VALID_ALGORITHMS: VALID_ALGORITHMS,
566
+ VALID_STRATEGIES: VALID_STRATEGIES,
567
+ VALID_TYPES: VALID_TYPES,
568
+ SanctionsError: SanctionsError,
569
+ };
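Under the `levenshtein` strategy, a candidate must pass two gates: the edit distance must not exceed `maxLevenshtein`, and the derived score `1 - dist / maxLen` must still reach `fuzzy.threshold`. The sketch below works through that arithmetic. It assumes a plain dynamic-programming edit distance (`editDistance`, a hypothetical stand-in for the package's capped `fuzzy.levenshtein`) and a hypothetical `distanceToScore` helper that mirrors the formula in `_levenshteinMatch`.

```javascript
// Sketch only: editDistance and distanceToScore are hypothetical helpers.
// Plain O(n*m) edit distance; the package's version adds a cap and early exit.
function editDistance(a, b) {
  var prev = [];
  for (var j = 0; j <= b.length; j++) prev.push(j);
  for (var i = 1; i <= a.length; i++) {
    var cur = [i];
    for (var k = 1; k <= b.length; k++) {
      var cost = a[i - 1] === b[k - 1] ? 0 : 1;
      cur.push(Math.min(
        prev[k] + 1,        // deletion
        cur[k - 1] + 1,     // insertion
        prev[k - 1] + cost  // substitution (or match)
      ));
    }
    prev = cur;
  }
  return prev[b.length];
}

// Same mapping as in _levenshteinMatch: normalize by the longer string's length.
function distanceToScore(a, b) {
  var maxLen = Math.max(a.length, b.length);
  if (maxLen === 0) return 0;
  return Math.max(0, 1 - editDistance(a, b) / maxLen);
}

var threshold = 0.85; // the default fuzzy.threshold

// One edit in a 5-character name: 1 - 1/5 = 0.8, below the default threshold.
var shortScore = distanceToScore("smith", "smyth");

// Two edits in a 17-character name: 1 - 2/17 ≈ 0.882, above the default threshold.
var longScore = distanceToScore("alexander petrov", "aleksander petrov");

console.log(shortScore >= threshold, longScore >= threshold); // false true
```

The practical consequence is that short names tolerate fewer typos than long ones at the same threshold, even when the distance is well inside `maxLevenshtein`. Operators who choose this strategy may want to lower `fuzzy.threshold` accordingly.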
package/lib/compliance.js CHANGED
@@ -29,6 +29,7 @@
29
29
  */
30
30
 
31
31
  var lazyRequire = require("./lazy-require");
32
+ var sanctions = require("./compliance-sanctions");
32
33
  var { ComplianceError } = require("./framework-error");
33
34
 
34
35
  var audit = lazyRequire(function () { return require("./audit"); });
@@ -305,6 +306,7 @@ module.exports = {
305
306
  posturesByDomain: posturesByDomain,
306
307
  posturesByJurisdiction: posturesByJurisdiction,
307
308
  list: list,
309
+ sanctions: sanctions,
308
310
  KNOWN_POSTURES: KNOWN_POSTURES,
309
311
  REGIME_MAP: REGIME_MAP,
310
312
  ComplianceError: ComplianceError,
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@blamejs/core",
3
- "version": "0.7.104",
3
+ "version": "0.7.105",
4
4
  "description": "The Node framework that owns its stack.",
5
5
  "license": "Apache-2.0",
6
6
  "author": "blamejs contributors",
@@ -2,10 +2,10 @@
2
2
  "$schema": "http://cyclonedx.org/schema/bom-1.5.schema.json",
3
3
  "bomFormat": "CycloneDX",
4
4
  "specVersion": "1.5",
5
- "serialNumber": "urn:uuid:5afd6c98-92aa-4383-a13c-7d4aed07fdbc",
5
+ "serialNumber": "urn:uuid:503d52be-ebde-43d9-99cf-866e8585557a",
6
6
  "version": 1,
7
7
  "metadata": {
8
- "timestamp": "2026-05-06T11:00:18.264Z",
8
+ "timestamp": "2026-05-06T11:23:52.585Z",
9
9
  "lifecycles": [
10
10
  {
11
11
  "phase": "build"
@@ -19,14 +19,14 @@
19
19
  }
20
20
  ],
21
21
  "component": {
22
- "bom-ref": "@blamejs/core@0.7.104",
22
+ "bom-ref": "@blamejs/core@0.7.105",
23
23
  "type": "library",
24
24
  "name": "blamejs",
25
- "version": "0.7.104",
25
+ "version": "0.7.105",
26
26
  "scope": "required",
27
27
  "author": "blamejs contributors",
28
28
  "description": "The Node framework that owns its stack.",
29
- "purl": "pkg:npm/%40blamejs/core@0.7.104",
29
+ "purl": "pkg:npm/%40blamejs/core@0.7.105",
30
30
  "properties": [],
31
31
  "externalReferences": [
32
32
  {
@@ -54,7 +54,7 @@
54
54
  "components": [],
55
55
  "dependencies": [
56
56
  {
57
- "ref": "@blamejs/core@0.7.104",
57
+ "ref": "@blamejs/core@0.7.105",
58
58
  "dependsOn": []
59
59
  }
60
60
  ]