namespace-guard 0.8.2 → 0.10.0
- package/README.md +52 -10
- package/dist/index.d.mts +71 -6
- package/dist/index.d.ts +71 -6
- package/dist/index.js +1784 -2
- package/dist/index.mjs +1780 -1
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -5,7 +5,7 @@
|
|
|
5
5
|
[](https://www.typescriptlang.org/)
|
|
6
6
|
[](https://opensource.org/licenses/MIT)
|
|
7
7
|
|
|
8
|
-
**[Live Demo](https://paultendo.github.io/namespace-guard/)**
|
|
8
|
+
**[Live Demo](https://paultendo.github.io/namespace-guard/)** - try it in your browser | **[Blog Post](https://paultendo.github.io/posts/namespace-guard-launch/)** - why this exists
|
|
9
9
|
|
|
10
10
|
**Check slug/handle uniqueness across multiple database tables with reserved name protection.**
|
|
11
11
|
|
|
@@ -249,7 +249,7 @@ const result = await guard.check("admin");
|
|
|
249
249
|
// { available: false, reason: "reserved", category: "system", message: "That's a system route." }
|
|
250
250
|
```
|
|
251
251
|
|
|
252
|
-
You can also use a single string message for all categories, or mix
|
|
252
|
+
You can also use a single string message for all categories, or mix - categories without a specific message fall back to the default.
|
|
253
253
|
|
|
254
254
|
## Async Validators
|
|
255
255
|
|
|
@@ -279,7 +279,7 @@ Validators run sequentially and stop at the first rejection. They receive the no
|
|
|
279
279
|
|
|
280
280
|
### Built-in Profanity Validator
|
|
281
281
|
|
|
282
|
-
Use `createProfanityValidator` for a turnkey profanity filter
|
|
282
|
+
Use `createProfanityValidator` for a turnkey profanity filter - supply your own word list:
|
|
283
283
|
|
|
284
284
|
```typescript
|
|
285
285
|
import { createNamespaceGuard, createProfanityValidator } from "namespace-guard";
|
|
@@ -295,7 +295,7 @@ const guard = createNamespaceGuard({
|
|
|
295
295
|
}, adapter);
|
|
296
296
|
```
|
|
297
297
|
|
|
298
|
-
No words are bundled
|
|
298
|
+
No words are bundled - use any word list you like (e.g., the `bad-words` npm package, your own list, or an external API wrapped in a custom validator).
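As a rough illustration of the "custom validator" option mentioned above: going by the `validators` hook type in the declarations, a validator is an async function that returns `{ available: false, message }` to reject a value or `null` to accept it. The sketch below is hypothetical. `containsBlockedWord`, `createBlocklistValidator`, and the default message are names invented for this example and are not part of namespace-guard.

```typescript
// Hypothetical helpers, not part of namespace-guard.
// The Validator type mirrors the hook shape shown in the type declarations.
type Rejection = { available: false; message: string };
type Validator = (value: string) => Promise<Rejection | null>;

// Case-insensitive substring match against a caller-supplied word list.
// Empty strings are skipped so they cannot match every input.
function containsBlockedWord(value: string, words: string[]): boolean {
  const candidate = value.toLowerCase();
  return words.some((w) => w.length > 0 && candidate.includes(w.toLowerCase()));
}

// Wrap the check in the async shape a `validators` entry is expected to have.
function createBlocklistValidator(
  words: string[],
  message = "That name is not allowed.", // assumed wording, for illustration
): Validator {
  return async (value) =>
    containsBlockedWord(value, words) ? { available: false, message } : null;
}
```

A function built this way could presumably sit in the `validators` array next to the built-in validators, since it has the same return shape.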
|
|
299
299
|
|
|
300
300
|
### Built-in Homoglyph Validator
|
|
301
301
|
|
|
@@ -324,6 +324,44 @@ createHomoglyphValidator({
|
|
|
324
324
|
|
|
325
325
|
The built-in `CONFUSABLE_MAP` contains 613 character pairs generated from [Unicode TR39 confusables.txt](https://unicode.org/reports/tr39/) plus supplemental Latin small capitals. It covers Cyrillic, Greek, Armenian, Cherokee, IPA, Coptic, Lisu, Canadian Syllabics, Georgian, and 20+ other scripts. The map is exported for inspection or extension, and is regenerable for new Unicode versions with `npx tsx scripts/generate-confusables.ts`.
|
|
326
326
|
|
|
327
|
+
#### CONFUSABLE_MAP_FULL
|
|
328
|
+
|
|
329
|
+
For standalone use without NFKC normalization, `CONFUSABLE_MAP_FULL` (~1,400 entries) includes every single-character-to-Latin mapping from TR39 with no NFKC filtering. This is the right map when your pipeline does not run NFKC before confusable detection, which is the case for most real-world systems: TR39's skeleton algorithm uses NFD, Chromium's IDN spoof checker uses NFD, Rust's `confusable_idents` lint runs on NFC, and django-registration applies the confusable map to raw input with no normalization at all.
|
|
330
|
+
|
|
331
|
+
```typescript
|
|
332
|
+
import { CONFUSABLE_MAP_FULL } from "namespace-guard";
|
|
333
|
+
|
|
334
|
+
// Contains everything in CONFUSABLE_MAP, plus:
|
|
335
|
+
// - ~766 entries where NFKC agrees with TR39 (mathematical alphanumerics, fullwidth forms)
|
|
336
|
+
// - 31 entries where TR39 and NFKC disagree on the target letter
|
|
337
|
+
CONFUSABLE_MAP_FULL["\u017f"]; // "f" (Long S: TR39 visual mapping)
|
|
338
|
+
CONFUSABLE_MAP_FULL["\u{1D41A}"]; // "a" (Mathematical Bold Small A)
|
|
339
|
+
```
|
|
340
|
+
|
|
341
|
+
#### `skeleton()` and `areConfusable()`
|
|
342
|
+
|
|
343
|
+
The TR39 Section 4 skeleton algorithm computes a normalized form of a string for confusable comparison. Two strings that look alike will produce the same skeleton. This is the same algorithm used by ICU's SpoofChecker, Chromium's IDN spoof checker, and the Rust compiler's `confusable_idents` lint.
|
|
344
|
+
|
|
345
|
+
```typescript
|
|
346
|
+
import { skeleton, areConfusable, CONFUSABLE_MAP } from "namespace-guard";
|
|
347
|
+
|
|
348
|
+
// Compute skeletons for comparison
|
|
349
|
+
skeleton("paypal"); // "paypal"
|
|
350
|
+
skeleton("\u0440\u0430ypal"); // "paypal" (Cyrillic р and а)
|
|
351
|
+
skeleton("pay\u200Bpal"); // "paypal" (zero-width space stripped)
|
|
352
|
+
skeleton("\u017f"); // "f" (Long S via TR39 visual mapping)
|
|
353
|
+
|
|
354
|
+
// Compare two strings directly
|
|
355
|
+
areConfusable("paypal", "\u0440\u0430ypal"); // true
|
|
356
|
+
areConfusable("google", "g\u043e\u043egle"); // true (Cyrillic о)
|
|
357
|
+
areConfusable("hello", "world"); // false
|
|
358
|
+
|
|
359
|
+
// Use CONFUSABLE_MAP for NFKC-first pipelines
|
|
360
|
+
skeleton("\u017f", { map: CONFUSABLE_MAP }); // "\u017f" (Long S not in filtered map)
|
|
361
|
+
```
|
|
362
|
+
|
|
363
|
+
By default, `skeleton()` uses `CONFUSABLE_MAP_FULL` (the complete TR39 map), which matches the NFD-based pipeline specified by TR39. Pass `{ map: CONFUSABLE_MAP }` if your pipeline runs NFKC normalization before calling `skeleton()`.
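To make the skeleton steps concrete, here is a minimal self-contained sketch of the sequence listed in the type declarations (NFD, strip ignorable characters, map, NFD again, lowercase). It is not the library's implementation. `TINY_MAP` is a three-entry stand-in for the real confusable map, `IGNORABLE` covers only a handful of Default_Ignorable_Code_Point characters, and the `sketch*` function names are made up for this example.

```typescript
// Illustrative stand-in for the real confusable map (assumption: three entries only).
const TINY_MAP: Record<string, string> = {
  "\u0440": "p", // Cyrillic er
  "\u0430": "a", // Cyrillic a
  "\u043e": "o", // Cyrillic o
};

// Small subset of Default_Ignorable_Code_Point: soft hyphen, zero-width
// characters, word joiner, and BOM. The full property is much larger.
const IGNORABLE = /[\u00AD\u200B-\u200F\u2060\uFEFF]/g;

function sketchSkeleton(input: string, map: Record<string, string> = TINY_MAP): string {
  // Steps 1-2: NFD normalize, then drop ignorable characters.
  const stripped = input.normalize("NFD").replace(IGNORABLE, "");
  // Step 3: map code point by code point (Array.from keeps astral characters intact).
  const mapped = Array.from(stripped, (ch) => map[ch] ?? ch).join("");
  // Steps 4-5: reapply NFD, then lowercase.
  return mapped.normalize("NFD").toLowerCase();
}

// Two strings are treated as confusable when their skeletons are equal.
function sketchAreConfusable(a: string, b: string): boolean {
  return sketchSkeleton(a) === sketchSkeleton(b);
}
```

With the tiny map above, `sketchSkeleton("\u0440\u0430ypal")` and `sketchSkeleton("pay\u200Bpal")` both reduce to `"paypal"`. A character missing from the map simply passes through unchanged.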
|
|
364
|
+
|
|
327
365
|
### How the anti-spoofing pipeline works
|
|
328
366
|
|
|
329
367
|
Most confusable-detection libraries apply a character map in isolation. namespace-guard uses a three-stage pipeline where each stage is aware of the others:
|
|
@@ -335,13 +373,13 @@ Input → NFKC normalize → Confusable map → Mixed-script reject
|
|
|
335
373
|
|
|
336
374
|
**Stage 1: NFKC normalization** collapses full-width characters (`Ｉ` → `I`), ligatures (`ﬁ` → `fi`), superscripts, and other Unicode compatibility forms to their canonical equivalents. This runs first, before any confusable check.
|
|
337
375
|
|
|
338
|
-
**Stage 2: Confusable map** catches characters that survive NFKC but visually mimic Latin letters
|
|
376
|
+
**Stage 2: Confusable map** catches characters that survive NFKC but visually mimic Latin letters - Cyrillic `а` for `a`, Greek `ο` for `o`, Cherokee `Ꭺ` for `A`, and 600+ others from the Unicode Consortium's [confusables.txt](https://unicode.org/Public/security/latest/confusables.txt).
|
|
339
377
|
|
|
340
378
|
**Stage 3: Mixed-script rejection** (`rejectMixedScript: true`) blocks identifiers that mix Latin with non-Latin scripts (Hebrew, Arabic, Devanagari, Thai, Georgian, Ethiopic, etc.) even if the specific characters aren't in the confusable map. This catches novel homoglyphs that the map doesn't cover.
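One way to picture a mixed-script check is with Unicode property escapes in a regular expression. The sketch below is an assumption about how such a check could look, not the code behind `rejectMixedScript`, and `mixesLatinWithOtherScript` is a hypothetical name. It flags a string only when Latin letters and letters from another script appear together, and it ignores digits, hyphens, and other non-letters.

```typescript
// Unicode property escapes; both need the `u` flag.
const LATIN_LETTER = /\p{Script=Latin}/u;
const ANY_LETTER = /\p{L}/u;

// Hypothetical check: true when the input mixes Latin letters with
// letters from any other script.
function mixesLatinWithOtherScript(input: string): boolean {
  let sawLatin = false;
  let sawOther = false;
  for (const ch of Array.from(input)) {
    if (!ANY_LETTER.test(ch)) continue; // skip digits, punctuation, marks
    if (LATIN_LETTER.test(ch)) sawLatin = true;
    else sawOther = true;
  }
  return sawLatin && sawOther;
}
```

Under this sketch a purely Hebrew or purely Latin identifier passes, and a Latin identifier with a single Hebrew letter inserted is flagged even though that letter has no entry in a confusable map.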
|
|
341
379
|
|
|
342
380
|
#### Why NFKC-aware filtering matters
|
|
343
381
|
|
|
344
|
-
The key insight: TR39's confusables.txt and NFKC normalization sometimes disagree. For example, Unicode says capital `I` (U+0049) is confusable with lowercase `l`
|
|
382
|
+
The key insight: TR39's confusables.txt and NFKC normalization sometimes disagree. For example, Unicode says capital `I` (U+0049) is confusable with lowercase `l` - visually true in many fonts. But NFKC maps Mathematical Bold `𝐈` (U+1D408) to `I`, not `l`. If you naively ship the TR39 mapping (`𝐈` → `l`), the confusable check will never see that character - NFKC already converted it to `I` in stage 1.
|
|
345
383
|
|
|
346
384
|
We found 31 entries where this happens:
|
|
347
385
|
|
|
@@ -354,7 +392,7 @@ We found 31 entries where this happens:
|
|
|
354
392
|
| 11 Mathematical I variants | `l` | `i` | NFKC |
|
|
355
393
|
| 12 Mathematical 0/1 variants | `o`/`l` | `0`/`1` | NFKC |
|
|
356
394
|
|
|
357
|
-
These entries are dead code in any pipeline that runs NFKC first
|
|
395
|
+
These entries are dead code in any pipeline that runs NFKC first - and worse, they encode the *wrong* mapping. The generate script (`scripts/generate-confusables.ts`) automatically detects and excludes them.
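The disagreement described above can be reproduced with `String.prototype.normalize`. The sketch below shows one plausible way to classify a TR39-style entry against NFKC. The real logic in `scripts/generate-confusables.ts` is not shown in this diff, so `classifyEntry` and the `Verdict` labels are hypothetical. The sample characters are the ones named in the text: Long S (U+017F), Mathematical Bold Capital I (U+1D408), Mathematical Bold Small A (U+1D41A), and Cyrillic а (U+0430).

```typescript
// Hypothetical classification of a single confusable entry.
//   keep      - NFKC does not reduce the character to an ASCII letter or digit
//   redundant - NFKC already yields the same target as TR39
//   conflict  - NFKC yields a different ASCII target than TR39
type Verdict = "keep" | "redundant" | "conflict";

function classifyEntry(ch: string, tr39Target: string): Verdict {
  // What an NFKC-first pipeline would see after normalizing and lowercasing.
  const folded = ch.normalize("NFKC").toLowerCase();
  if (!/^[a-z0-9]$/.test(folded)) return "keep";
  return folded === tr39Target ? "redundant" : "conflict";
}

// Entries taken from the examples in the surrounding text.
const samples: Array<[string, string]> = [
  ["\u017f", "f"],    // Long S: NFKC gives "s"
  ["\u{1D408}", "l"], // Mathematical Bold Capital I: NFKC gives "I"
  ["\u{1D41A}", "a"], // Mathematical Bold Small A: NFKC gives "a"
  ["\u0430", "a"],    // Cyrillic a: unchanged by NFKC
];

const verdicts = samples.map(([ch, target]) => classifyEntry(ch, target));
```

Here the first two samples come out as `"conflict"`, the third as `"redundant"`, and the fourth as `"keep"`, which lines up with the split between the filtered map and the full map.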
|
|
358
396
|
|
|
359
397
|
## Unicode Normalization
|
|
360
398
|
|
|
@@ -429,7 +467,7 @@ const result = await guard.check("acme-corp");
|
|
|
429
467
|
|
|
430
468
|
### Composing Strategies
|
|
431
469
|
|
|
432
|
-
Combine multiple strategies
|
|
470
|
+
Combine multiple strategies - candidates are interleaved round-robin:
|
|
433
471
|
|
|
434
472
|
```typescript
|
|
435
473
|
suggest: {
|
|
@@ -503,10 +541,10 @@ npx namespace-guard check acme-corp
|
|
|
503
541
|
# ✓ acme-corp is available
|
|
504
542
|
|
|
505
543
|
npx namespace-guard check admin
|
|
506
|
-
# ✗ admin
|
|
544
|
+
# ✗ admin - That name is reserved. Try another one.
|
|
507
545
|
|
|
508
546
|
npx namespace-guard check "a"
|
|
509
|
-
# ✗ a
|
|
547
|
+
# ✗ a - Use 2-30 lowercase letters, numbers, or hyphens.
|
|
510
548
|
```
|
|
511
549
|
|
|
512
550
|
### With a config file
|
|
@@ -761,7 +799,10 @@ import {
|
|
|
761
799
|
createNamespaceGuard,
|
|
762
800
|
createProfanityValidator,
|
|
763
801
|
createHomoglyphValidator,
|
|
802
|
+
skeleton,
|
|
803
|
+
areConfusable,
|
|
764
804
|
CONFUSABLE_MAP,
|
|
805
|
+
CONFUSABLE_MAP_FULL,
|
|
765
806
|
normalize,
|
|
766
807
|
type NamespaceConfig,
|
|
767
808
|
type NamespaceSource,
|
|
@@ -771,6 +812,7 @@ import {
|
|
|
771
812
|
type FindOneOptions,
|
|
772
813
|
type OwnershipScope,
|
|
773
814
|
type SuggestStrategyName,
|
|
815
|
+
type SkeletonOptions,
|
|
774
816
|
} from "namespace-guard";
|
|
775
817
|
```
|
|
776
818
|
|
package/dist/index.d.mts
CHANGED
|
@@ -6,14 +6,14 @@ type NamespaceSource = {
|
|
|
6
6
|
column: string;
|
|
7
7
|
/** Column name for the primary key (default: "id", or "_id" for Mongoose) */
|
|
8
8
|
idColumn?: string;
|
|
9
|
-
/** Scope key for ownership checks
|
|
9
|
+
/** Scope key for ownership checks - allows users to update their own slug without a false collision */
|
|
10
10
|
scopeKey?: string;
|
|
11
11
|
};
|
|
12
12
|
/** Built-in suggestion strategy names. */
|
|
13
13
|
type SuggestStrategyName = "sequential" | "random-digits" | "suffix-words" | "short-random" | "scramble" | "similar";
|
|
14
14
|
/** Configuration for a namespace guard instance. */
|
|
15
15
|
type NamespaceConfig = {
|
|
16
|
-
/** Reserved names
|
|
16
|
+
/** Reserved names - flat list, Set, or categorized record */
|
|
17
17
|
reserved?: Set<string> | string[] | Record<string, string[]>;
|
|
18
18
|
/** Data sources to check for collisions */
|
|
19
19
|
sources: NamespaceSource[];
|
|
@@ -35,7 +35,7 @@ type NamespaceConfig = {
|
|
|
35
35
|
/** Message shown when a purely numeric identifier is rejected (default: "Identifiers cannot be purely numeric.") */
|
|
36
36
|
purelyNumeric?: string;
|
|
37
37
|
};
|
|
38
|
-
/** Async validation hooks
|
|
38
|
+
/** Async validation hooks - run after format/reserved checks, before DB */
|
|
39
39
|
validators?: Array<(value: string) => Promise<{
|
|
40
40
|
available: false;
|
|
41
41
|
message: string;
|
|
@@ -62,7 +62,7 @@ type FindOneOptions = {
|
|
|
62
62
|
/** Use case-insensitive matching */
|
|
63
63
|
caseInsensitive?: boolean;
|
|
64
64
|
};
|
|
65
|
-
/** Database adapter interface
|
|
65
|
+
/** Database adapter interface - implement this for your ORM or query builder. */
|
|
66
66
|
type NamespaceAdapter = {
|
|
67
67
|
findOne: (source: NamespaceSource, value: string, options?: FindOneOptions) => Promise<Record<string, unknown> | null>;
|
|
68
68
|
};
|
|
@@ -82,6 +82,13 @@ type CheckResult = {
|
|
|
82
82
|
category?: string;
|
|
83
83
|
suggestions?: string[];
|
|
84
84
|
};
|
|
85
|
+
/** Options for the `skeleton()` and `areConfusable()` functions. */
|
|
86
|
+
type SkeletonOptions = {
|
|
87
|
+
/** Confusable character map to use.
|
|
88
|
+
* Default: `CONFUSABLE_MAP_FULL` (complete TR39 map, no NFKC filtering).
|
|
89
|
+
* Pass `CONFUSABLE_MAP` if your pipeline runs NFKC before calling skeleton(). */
|
|
90
|
+
map?: Record<string, string>;
|
|
91
|
+
};
|
|
85
92
|
/**
|
|
86
93
|
* Normalize a raw identifier: trims whitespace, applies NFKC Unicode normalization,
|
|
87
94
|
* lowercases, and strips leading `@` symbols.
|
|
@@ -108,7 +115,7 @@ declare function normalize(raw: string, options?: {
|
|
|
108
115
|
/**
|
|
109
116
|
* Create a validator that rejects identifiers containing profanity or offensive words.
|
|
110
117
|
*
|
|
111
|
-
* Supply your own word list
|
|
118
|
+
* Supply your own word list - no words are bundled with the library.
|
|
112
119
|
* The returned function is compatible with `config.validators`.
|
|
113
120
|
*
|
|
114
121
|
* @param words - Array of words to block
|
|
@@ -146,6 +153,22 @@ declare function createProfanityValidator(words: string[], options?: {
|
|
|
146
153
|
* Regenerate: `npx tsx scripts/generate-confusables.ts`
|
|
147
154
|
*/
|
|
148
155
|
declare const CONFUSABLE_MAP: Record<string, string>;
|
|
156
|
+
/**
|
|
157
|
+
* Complete TR39 confusable mapping: every single-character mapping to a
|
|
158
|
+
* lowercase Latin letter or digit from confusables.txt, with no NFKC filtering.
|
|
159
|
+
*
|
|
160
|
+
* Use this when your pipeline does NOT run NFKC normalization before confusable
|
|
161
|
+
* detection (which is most real-world systems: TR39 skeleton uses NFD, Chromium
|
|
162
|
+
* uses NFD, Rust uses NFC, django-registration uses no normalization at all).
|
|
163
|
+
*
|
|
164
|
+
* Includes ~1,400 entries vs CONFUSABLE_MAP's ~613 NFKC-deduped entries.
|
|
165
|
+
* The additional entries cover characters that NFKC normalization would handle
|
|
166
|
+
* (mathematical alphanumerics, fullwidth forms, etc.) plus the 31 entries where
|
|
167
|
+
* TR39 and NFKC disagree on the target letter.
|
|
168
|
+
*
|
|
169
|
+
* Regenerate: `npx tsx scripts/generate-confusables.ts`
|
|
170
|
+
*/
|
|
171
|
+
declare const CONFUSABLE_MAP_FULL: Record<string, string>;
|
|
149
172
|
/**
|
|
150
173
|
* Create a validator that rejects identifiers containing homoglyph/confusable characters.
|
|
151
174
|
*
|
|
@@ -179,6 +202,48 @@ declare function createHomoglyphValidator(options?: {
|
|
|
179
202
|
available: false;
|
|
180
203
|
message: string;
|
|
181
204
|
} | null>;
|
|
205
|
+
/**
|
|
206
|
+
* Compute the TR39 Section 4 skeleton of a string for confusable comparison.
|
|
207
|
+
*
|
|
208
|
+
* Implements `internalSkeleton`:
|
|
209
|
+
* 1. NFD normalize
|
|
210
|
+
* 2. Remove Default_Ignorable_Code_Point characters
|
|
211
|
+
* 3. Replace each character via the confusable map
|
|
212
|
+
* 4. Reapply NFD
|
|
213
|
+
* 5. Lowercase
|
|
214
|
+
*
|
|
215
|
+
* The default map is `CONFUSABLE_MAP_FULL` (the complete TR39 mapping without
|
|
216
|
+
* NFKC filtering), which matches the NFD-based pipeline used by ICU, Chromium,
|
|
217
|
+
* and the TR39 spec itself. Pass `{ map: CONFUSABLE_MAP }` if your pipeline
|
|
218
|
+
* runs NFKC normalization before calling skeleton().
|
|
219
|
+
*
|
|
220
|
+
* @param input - The string to skeletonize
|
|
221
|
+
* @param options - Optional settings (custom confusable map)
|
|
222
|
+
* @returns The skeleton string for comparison
|
|
223
|
+
*
|
|
224
|
+
* @example
|
|
225
|
+
* ```ts
|
|
226
|
+
* skeleton("paypal") === skeleton("\u0440\u0430ypal") // true (Cyrillic р/а)
|
|
227
|
+
* skeleton("pay\u200Bpal") === skeleton("paypal") // true (zero-width stripped)
|
|
228
|
+
* ```
|
|
229
|
+
*/
|
|
230
|
+
declare function skeleton(input: string, options?: SkeletonOptions): string;
|
|
231
|
+
/**
|
|
232
|
+
* Check whether two strings are visually confusable by comparing their TR39 skeletons.
|
|
233
|
+
*
|
|
234
|
+
* @param a - First string
|
|
235
|
+
* @param b - Second string
|
|
236
|
+
* @param options - Optional settings (custom confusable map)
|
|
237
|
+
* @returns `true` if the strings produce the same skeleton
|
|
238
|
+
*
|
|
239
|
+
* @example
|
|
240
|
+
* ```ts
|
|
241
|
+
* areConfusable("paypal", "\u0440\u0430ypal") // true
|
|
242
|
+
* areConfusable("google", "g\u043e\u043egle") // true
|
|
243
|
+
* areConfusable("hello", "world") // false
|
|
244
|
+
* ```
|
|
245
|
+
*/
|
|
246
|
+
declare function areConfusable(a: string, b: string, options?: SkeletonOptions): boolean;
|
|
182
247
|
/**
|
|
183
248
|
* Create a namespace guard instance for checking slug/handle uniqueness
|
|
184
249
|
* across multiple database tables with reserved name protection.
|
|
@@ -227,4 +292,4 @@ declare function createNamespaceGuard(config: NamespaceConfig, adapter: Namespac
|
|
|
227
292
|
/** The guard instance returned by `createNamespaceGuard`. */
|
|
228
293
|
type NamespaceGuard = ReturnType<typeof createNamespaceGuard>;
|
|
229
294
|
|
|
230
|
-
export { CONFUSABLE_MAP, type CheckResult, type FindOneOptions, type NamespaceAdapter, type NamespaceConfig, type NamespaceGuard, type NamespaceSource, type OwnershipScope, type SuggestStrategyName, createHomoglyphValidator, createNamespaceGuard, createProfanityValidator, normalize };
|
|
295
|
+
export { CONFUSABLE_MAP, CONFUSABLE_MAP_FULL, type CheckResult, type FindOneOptions, type NamespaceAdapter, type NamespaceConfig, type NamespaceGuard, type NamespaceSource, type OwnershipScope, type SkeletonOptions, type SuggestStrategyName, areConfusable, createHomoglyphValidator, createNamespaceGuard, createProfanityValidator, normalize, skeleton };
|
package/dist/index.d.ts
CHANGED
|
@@ -6,14 +6,14 @@ type NamespaceSource = {
|
|
|
6
6
|
column: string;
|
|
7
7
|
/** Column name for the primary key (default: "id", or "_id" for Mongoose) */
|
|
8
8
|
idColumn?: string;
|
|
9
|
-
/** Scope key for ownership checks
|
|
9
|
+
/** Scope key for ownership checks - allows users to update their own slug without a false collision */
|
|
10
10
|
scopeKey?: string;
|
|
11
11
|
};
|
|
12
12
|
/** Built-in suggestion strategy names. */
|
|
13
13
|
type SuggestStrategyName = "sequential" | "random-digits" | "suffix-words" | "short-random" | "scramble" | "similar";
|
|
14
14
|
/** Configuration for a namespace guard instance. */
|
|
15
15
|
type NamespaceConfig = {
|
|
16
|
-
/** Reserved names
|
|
16
|
+
/** Reserved names - flat list, Set, or categorized record */
|
|
17
17
|
reserved?: Set<string> | string[] | Record<string, string[]>;
|
|
18
18
|
/** Data sources to check for collisions */
|
|
19
19
|
sources: NamespaceSource[];
|
|
@@ -35,7 +35,7 @@ type NamespaceConfig = {
|
|
|
35
35
|
/** Message shown when a purely numeric identifier is rejected (default: "Identifiers cannot be purely numeric.") */
|
|
36
36
|
purelyNumeric?: string;
|
|
37
37
|
};
|
|
38
|
-
/** Async validation hooks
|
|
38
|
+
/** Async validation hooks - run after format/reserved checks, before DB */
|
|
39
39
|
validators?: Array<(value: string) => Promise<{
|
|
40
40
|
available: false;
|
|
41
41
|
message: string;
|
|
@@ -62,7 +62,7 @@ type FindOneOptions = {
|
|
|
62
62
|
/** Use case-insensitive matching */
|
|
63
63
|
caseInsensitive?: boolean;
|
|
64
64
|
};
|
|
65
|
-
/** Database adapter interface
|
|
65
|
+
/** Database adapter interface - implement this for your ORM or query builder. */
|
|
66
66
|
type NamespaceAdapter = {
|
|
67
67
|
findOne: (source: NamespaceSource, value: string, options?: FindOneOptions) => Promise<Record<string, unknown> | null>;
|
|
68
68
|
};
|
|
@@ -82,6 +82,13 @@ type CheckResult = {
|
|
|
82
82
|
category?: string;
|
|
83
83
|
suggestions?: string[];
|
|
84
84
|
};
|
|
85
|
+
/** Options for the `skeleton()` and `areConfusable()` functions. */
|
|
86
|
+
type SkeletonOptions = {
|
|
87
|
+
/** Confusable character map to use.
|
|
88
|
+
* Default: `CONFUSABLE_MAP_FULL` (complete TR39 map, no NFKC filtering).
|
|
89
|
+
* Pass `CONFUSABLE_MAP` if your pipeline runs NFKC before calling skeleton(). */
|
|
90
|
+
map?: Record<string, string>;
|
|
91
|
+
};
|
|
85
92
|
/**
|
|
86
93
|
* Normalize a raw identifier: trims whitespace, applies NFKC Unicode normalization,
|
|
87
94
|
* lowercases, and strips leading `@` symbols.
|
|
@@ -108,7 +115,7 @@ declare function normalize(raw: string, options?: {
|
|
|
108
115
|
/**
|
|
109
116
|
* Create a validator that rejects identifiers containing profanity or offensive words.
|
|
110
117
|
*
|
|
111
|
-
* Supply your own word list
|
|
118
|
+
* Supply your own word list - no words are bundled with the library.
|
|
112
119
|
* The returned function is compatible with `config.validators`.
|
|
113
120
|
*
|
|
114
121
|
* @param words - Array of words to block
|
|
@@ -146,6 +153,22 @@ declare function createProfanityValidator(words: string[], options?: {
|
|
|
146
153
|
* Regenerate: `npx tsx scripts/generate-confusables.ts`
|
|
147
154
|
*/
|
|
148
155
|
declare const CONFUSABLE_MAP: Record<string, string>;
|
|
156
|
+
/**
|
|
157
|
+
* Complete TR39 confusable mapping: every single-character mapping to a
|
|
158
|
+
* lowercase Latin letter or digit from confusables.txt, with no NFKC filtering.
|
|
159
|
+
*
|
|
160
|
+
* Use this when your pipeline does NOT run NFKC normalization before confusable
|
|
161
|
+
* detection (which is most real-world systems: TR39 skeleton uses NFD, Chromium
|
|
162
|
+
* uses NFD, Rust uses NFC, django-registration uses no normalization at all).
|
|
163
|
+
*
|
|
164
|
+
* Includes ~1,400 entries vs CONFUSABLE_MAP's ~613 NFKC-deduped entries.
|
|
165
|
+
* The additional entries cover characters that NFKC normalization would handle
|
|
166
|
+
* (mathematical alphanumerics, fullwidth forms, etc.) plus the 31 entries where
|
|
167
|
+
* TR39 and NFKC disagree on the target letter.
|
|
168
|
+
*
|
|
169
|
+
* Regenerate: `npx tsx scripts/generate-confusables.ts`
|
|
170
|
+
*/
|
|
171
|
+
declare const CONFUSABLE_MAP_FULL: Record<string, string>;
|
|
149
172
|
/**
|
|
150
173
|
* Create a validator that rejects identifiers containing homoglyph/confusable characters.
|
|
151
174
|
*
|
|
@@ -179,6 +202,48 @@ declare function createHomoglyphValidator(options?: {
|
|
|
179
202
|
available: false;
|
|
180
203
|
message: string;
|
|
181
204
|
} | null>;
|
|
205
|
+
/**
|
|
206
|
+
* Compute the TR39 Section 4 skeleton of a string for confusable comparison.
|
|
207
|
+
*
|
|
208
|
+
* Implements `internalSkeleton`:
|
|
209
|
+
* 1. NFD normalize
|
|
210
|
+
* 2. Remove Default_Ignorable_Code_Point characters
|
|
211
|
+
* 3. Replace each character via the confusable map
|
|
212
|
+
* 4. Reapply NFD
|
|
213
|
+
* 5. Lowercase
|
|
214
|
+
*
|
|
215
|
+
* The default map is `CONFUSABLE_MAP_FULL` (the complete TR39 mapping without
|
|
216
|
+
* NFKC filtering), which matches the NFD-based pipeline used by ICU, Chromium,
|
|
217
|
+
* and the TR39 spec itself. Pass `{ map: CONFUSABLE_MAP }` if your pipeline
|
|
218
|
+
* runs NFKC normalization before calling skeleton().
|
|
219
|
+
*
|
|
220
|
+
* @param input - The string to skeletonize
|
|
221
|
+
* @param options - Optional settings (custom confusable map)
|
|
222
|
+
* @returns The skeleton string for comparison
|
|
223
|
+
*
|
|
224
|
+
* @example
|
|
225
|
+
* ```ts
|
|
226
|
+
* skeleton("paypal") === skeleton("\u0440\u0430ypal") // true (Cyrillic р/а)
|
|
227
|
+
* skeleton("pay\u200Bpal") === skeleton("paypal") // true (zero-width stripped)
|
|
228
|
+
* ```
|
|
229
|
+
*/
|
|
230
|
+
declare function skeleton(input: string, options?: SkeletonOptions): string;
|
|
231
|
+
/**
|
|
232
|
+
* Check whether two strings are visually confusable by comparing their TR39 skeletons.
|
|
233
|
+
*
|
|
234
|
+
* @param a - First string
|
|
235
|
+
* @param b - Second string
|
|
236
|
+
* @param options - Optional settings (custom confusable map)
|
|
237
|
+
* @returns `true` if the strings produce the same skeleton
|
|
238
|
+
*
|
|
239
|
+
* @example
|
|
240
|
+
* ```ts
|
|
241
|
+
* areConfusable("paypal", "\u0440\u0430ypal") // true
|
|
242
|
+
* areConfusable("google", "g\u043e\u043egle") // true
|
|
243
|
+
* areConfusable("hello", "world") // false
|
|
244
|
+
* ```
|
|
245
|
+
*/
|
|
246
|
+
declare function areConfusable(a: string, b: string, options?: SkeletonOptions): boolean;
|
|
182
247
|
/**
|
|
183
248
|
* Create a namespace guard instance for checking slug/handle uniqueness
|
|
184
249
|
* across multiple database tables with reserved name protection.
|
|
@@ -227,4 +292,4 @@ declare function createNamespaceGuard(config: NamespaceConfig, adapter: Namespac
|
|
|
227
292
|
/** The guard instance returned by `createNamespaceGuard`. */
|
|
228
293
|
type NamespaceGuard = ReturnType<typeof createNamespaceGuard>;
|
|
229
294
|
|
|
230
|
-
export { CONFUSABLE_MAP, type CheckResult, type FindOneOptions, type NamespaceAdapter, type NamespaceConfig, type NamespaceGuard, type NamespaceSource, type OwnershipScope, type SuggestStrategyName, createHomoglyphValidator, createNamespaceGuard, createProfanityValidator, normalize };
|
|
295
|
+
export { CONFUSABLE_MAP, CONFUSABLE_MAP_FULL, type CheckResult, type FindOneOptions, type NamespaceAdapter, type NamespaceConfig, type NamespaceGuard, type NamespaceSource, type OwnershipScope, type SkeletonOptions, type SuggestStrategyName, areConfusable, createHomoglyphValidator, createNamespaceGuard, createProfanityValidator, normalize, skeleton };
|