name-tools 0.1.0

package/README.md ADDED
@@ -0,0 +1,912 @@
1
+ # name-tools
2
+
3
+ [![npm version](https://img.shields.io/npm/v/name-tools.svg)](https://www.npmjs.com/package/name-tools)
4
+ [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
5
+
6
+ A lightweight, zero-dependency utility library for parsing, formatting, and manipulating person names. Now with **entity classification** to distinguish between people, organizations, families, and compound names.
7
+
8
+ **[Interactive Demo](https://bobpritchett.github.io/name-tools/)**
9
+
10
+ ## Features
11
+
12
+ - **Entity Classification** - Automatically detect if input is a person, organization, family, or compound name
13
+ - Parse full names into components (prefix, first, middle, last, suffix, nickname)
14
+ - **Single** formatting entry point with presets + options (`formatName`)
15
+ - **Gender guessing** from first names using US Social Security Administration birth data (140+ years of statistics)
16
+ - Smart no-break spacing for nicer UI rendering (NBSP/NNBSP, optional HTML entities output)
17
+ - Render lists/couples from arrays of names
18
+ - Extract specific parts (first name, last name, nickname)
19
+ - **Reversed name support** (e.g., "Smith, John, Jr.")
20
+ - **Email recipient list parsing** (`parseNameList` for To/CC lines)
21
+ - Comprehensive data sets of common prefixes and suffixes
22
+ - Full TypeScript support with type definitions
23
+ - Zero dependencies (gender data is tree-shakeable)
24
+ - Lightweight and fast
25
+
26
+ ## Installation
27
+
28
+ ```bash
29
+ pnpm add name-tools
30
+ ```
31
+
32
+ ## Quick Start
33
+
34
+ ```javascript
35
+ import { parseName, formatName, parseNameList } from "name-tools";
36
+
37
+ // Parse and classify a name - returns a typed entity
38
+ const person = parseName("Dr. William Frederick Richardson Jr.");
39
+ console.log(person);
40
+ // {
41
+ // kind: 'person',
42
+ // given: 'William',
43
+ // middle: 'Frederick',
44
+ // family: 'Richardson',
45
+ // honorific: 'Dr.',
46
+ // suffix: 'Jr.',
47
+ // meta: { confidence: 1, reasons: ['PERSON_STANDARD_FORMAT'], ... }
48
+ // }
49
+
50
+ // Detect organizations automatically
51
+ const org = parseName("Acme Corporation, Inc.");
52
+ console.log(org.kind); // 'organization'
53
+ console.log(org.legalForm); // 'Inc.'
54
+
55
+ // Detect compound names (couples)
56
+ const couple = parseName("Bob & Mary Smith");
57
+ console.log(couple.kind); // 'compound'
58
+ console.log(couple.sharedFamily); // 'Smith'
59
+
60
+ // Format a name (single entry point)
61
+ console.log(formatName("Dr. William Frederick Richardson Jr."));
62
+ // "William Richardson, Jr."
63
+
64
+ console.log(
65
+ formatName("Dr. William Frederick Richardson Jr.", { preset: "formalFull" })
66
+ );
67
+ // "Dr. William Frederick Richardson, Jr."
68
+
69
+ // Parse email recipient lists (To/CC lines)
70
+ const recipients = parseNameList("John Smith <john@example.com>; Jane Doe");
71
+ // Returns array of parsed recipients with emails and classified names
72
+ ```
73
+
74
+ ## API Reference
75
+
76
+ ### Entity Classification
77
+
78
+ The library classifies input into one of these entity types:
79
+
80
+ | Kind | Description | Example |
81
+ | -------------- | ---------------------------------------------- | -------------------------------- |
82
+ | `person` | Individual human name | "Dr. John Smith Jr." |
83
+ | `organization` | Company, institution, trust, foundation | "Acme Corp, Inc.", "First Bank" |
84
+ | `family` | Family or household | "The Smith Family", "The Smiths" |
85
+ | `compound` | Multiple people in one field | "Bob & Mary Smith" |
86
+ | `unknown` | Could not be classified | "@handle", ambiguous input |
87
+ | `rejected` | Strict mode rejection (not the requested type) | - |
88
+
89
+ ### Entity Type Structures
90
+
91
+ Each entity type has specific fields. All types include a `meta` property with parsing metadata.
92
+
93
+ #### `PersonName`
94
+
95
+ | Field | Type | Description |
96
+ | ----------- | ---------- | ------------------------------------------ |
97
+ | `kind` | `'person'` | Entity type discriminator |
98
+ | `honorific` | `string?` | Title/honorific (Dr., Mr., etc.) |
99
+ | `given` | `string?` | Given/first name |
100
+ | `middle` | `string?` | Middle name(s) |
101
+ | `family` | `string?` | Family/last name |
102
+ | `suffix` | `string?` | Suffix (Jr., PhD, etc.) |
103
+ | `nickname` | `string?` | Nickname |
104
+ | `particles` | `string[]?`| Surname particles (von, de, etc.) |
105
+ | `reversed` | `boolean?` | Whether name was in reversed format |
106
+ | `meta` | `ParseMeta`| Parsing metadata |
107
+
108
+ #### `OrganizationName`
109
+
110
+ | Field | Type | Description |
111
+ | --------------- | ---------------- | ------------------------------------- |
112
+ | `kind` | `'organization'` | Entity type discriminator |
113
+ | `baseName` | `string` | Base name without legal suffix |
114
+ | `legalForm` | `LegalForm?` | Detected legal form (Inc, LLC, etc.) |
115
+ | `legalSuffixRaw`| `string?` | Raw legal suffix as written |
116
+ | `aka` | `string[]?` | Alternate names (d/b/a) |
117
+ | `meta` | `ParseMeta` | Parsing metadata |
118
+
119
+ #### `FamilyName`
120
+
121
+ | Field | Type | Description |
122
+ | ------------ | ----------------------------- | ----------------------------------- |
123
+ | `kind` | `'family'` \| `'household'` | Entity type discriminator |
124
+ | `article` | `'The'?` | Leading article if present |
125
+ | `familyName` | `string` | Core family/surname |
126
+ | `style` | `'familyWord'` \| `'pluralSurname'` | How the family was expressed |
127
+ | `familyWord` | `'Family'` \| `'Household'?` | The word used (if style is familyWord) |
128
+ | `meta` | `ParseMeta` | Parsing metadata |
129
+
130
+ #### `CompoundName`
131
+
132
+ | Field | Type | Description |
133
+ | -------------- | --------------------------------- | -------------------------------- |
134
+ | `kind` | `'compound'` | Entity type discriminator |
135
+ | `connector` | `'&'` \| `'and'` \| `'+'` \| `'et'` \| `'unknown'` | The connector detected |
136
+ | `members` | `(PersonName \| UnknownName)[]` | Parsed members |
137
+ | `sharedFamily` | `string?` | Shared family name if inferred |
138
+ | `meta` | `ParseMeta` | Parsing metadata |
139
+
140
+ #### `UnknownName`
141
+
142
+ | Field | Type | Description |
143
+ | ------- | ----------- | ---------------------------------------- |
144
+ | `kind` | `'unknown'` | Entity type discriminator |
145
+ | `text` | `string` | Best-effort normalized text |
146
+ | `guess` | `NameKind?` | Best guess at what it might be |
147
+ | `meta` | `ParseMeta` | Parsing metadata |
148
+
149
+ #### `RejectedName`
150
+
151
+ | Field | Type | Description |
152
+ | ------------ | ------------ | ---------------------------------------- |
153
+ | `kind` | `'rejected'` | Entity type discriminator |
154
+ | `rejectedAs` | `NameKind` | What kind it would have been classified as |
155
+ | `meta` | `ParseMeta` | Parsing metadata |
156
+
157
+ #### `ParseMeta` (included in all entities)
158
+
159
+ | Field | Type | Description |
160
+ | ------------ | -------------- | ---------------------------------------- |
161
+ | `raw` | `string` | Exact input string |
162
+ | `normalized` | `string` | Normalized string used for parsing |
163
+ | `confidence` | `0` \| `0.25` \| `0.5` \| `0.75` \| `1` | Overall confidence in classification |
164
+ | `reasons` | `ReasonCode[]` | Reason codes explaining the classification |
165
+ | `warnings` | `string[]?` | Human-readable warnings |
166
+ | `locale` | `string?` | Locale hint (default: "en") |
167
+
168
+ ---
169
+
170
+ ### Parsing Functions
171
+
172
+ #### `parseName(input, options?)`
173
+
174
+ ```ts
175
+ parseName(input: string, options?: ParseOptions): ParsedNameEntity
176
+ ```
177
+
178
+ Parse and classify a name string into a typed entity.
179
+
180
+ **Parameters:**
181
+
182
+ | Parameter | Type | Description |
183
+ | ------------------ | -------- | -------------------------------------------- |
184
+ | `input` | `string` | The name string to parse |
185
+ | `options.locale` | `string` | Locale hint (default: `'en'`) |
186
+ | `options.strictKind` | `'person'` | If set, reject non-person entities |
187
+
188
+ **Returns:** `ParsedNameEntity` - One of `PersonName`, `OrganizationName`, `FamilyName`, `CompoundName`, `UnknownName`, or `RejectedName`
189
+
190
+ ```javascript
191
+ // Person
192
+ const person = parseName("Mr. John Robert Smith Jr.");
193
+ // { kind: 'person', honorific: 'Mr.', given: 'John', middle: 'Robert', family: 'Smith', suffix: 'Jr.', meta: {...} }
194
+
195
+ // Organization
196
+ const org = parseName("Smith Family Trust");
197
+ // { kind: 'organization', baseName: 'Smith Family Trust', legalForm: 'Trust', meta: {...} }
198
+
199
+ // Compound name (couple)
200
+ const couple = parseName("John and Mary Smith");
201
+ // { kind: 'compound', connector: 'and', members: [...], sharedFamily: 'Smith', meta: {...} }
202
+
203
+ // Reversed format
204
+ const reversed = parseName("Smith, John, Jr.");
205
+ // { kind: 'person', given: 'John', family: 'Smith', suffix: 'Jr.', reversed: true, meta: {...} }
206
+ ```
207
+
208
+ ---
209
+
210
+ #### `parsePersonName(fullName)`
211
+
212
+ ```ts
213
+ parsePersonName(fullName: string): ParsedName
214
+ ```
215
+
216
+ Parse a name string into legacy `ParsedName` format (for use with `formatName`).
217
+
218
+ **Parameters:**
219
+
220
+ | Parameter | Type | Description |
221
+ | ---------- | -------- | ------------------------ |
222
+ | `fullName` | `string` | The name string to parse |
223
+
224
+ **Returns:** `ParsedName` with `prefix`, `first`, `middle`, `last`, `suffix`, `nickname` fields
225
+
226
+ ```javascript
227
+ const parsed = parsePersonName("Dr. John William Smith Jr.");
228
+ // { prefix: 'Dr.', first: 'John', middle: 'William', last: 'Smith', suffix: 'Jr.' }
229
+ ```
230
+
231
+ ---
232
+
233
+ #### `parseNameList(input, options?)`
234
+
235
+ ```ts
236
+ parseNameList(input: string, options?: ParseListOptions): ParsedRecipient[]
237
+ ```
238
+
239
+ Parse email recipient lists (To/CC lines) into individual recipients.
240
+
241
+ **Parameters:**
242
+
243
+ | Parameter | Type | Description |
244
+ | ------------------ | -------- | -------------------------------------------- |
245
+ | `input` | `string` | The recipient list string |
246
+ | `options.locale` | `string` | Locale hint (default: `'en'`) |
247
+ | `options.strictKind` | `'person'` | If set, reject non-person entities |
248
+
249
+ **Returns:** `ParsedRecipient[]` - Array of recipients with `raw`, `display`, `email`, and `meta` fields
250
+
251
+ **Supports:**
252
+
253
+ - Semicolon separator (Outlook default)
254
+ - Comma separator (context-aware, respects reversed names)
255
+ - Newline separator
256
+ - Email formats: `Name <email>`, `email (Name)`, bare emails
257
+ - Reversed names: `Smith, John` won't split on the comma
258
+
259
+ ```javascript
260
+ const recipients = parseNameList(
261
+ "John Smith <john@example.com>; Jane Doe <jane@example.com>, Bob"
262
+ );
263
+ // [
264
+ // { raw: 'John Smith <john@example.com>', display: { kind: 'person', ... }, email: 'john@example.com', meta: {...} },
265
+ // { raw: 'Jane Doe <jane@example.com>', display: { kind: 'person', ... }, email: 'jane@example.com', meta: {...} },
266
+ // { raw: 'Bob', display: { kind: 'person', given: 'Bob', ... }, meta: {...} }
267
+ // ]
268
+ ```
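
As a rough illustration of the three email formats listed above, the sketch below separates a display name from an address with regular expressions. `splitNameAndEmail` is a hypothetical helper written for this example; it is not part of name-tools, and the library's own parsing may work differently.

```javascript
// Hypothetical helper for illustration only; not part of name-tools.
function splitNameAndEmail(raw) {
  const text = raw.trim();
  let m;

  // Format 1: Name <email>
  if ((m = text.match(/^(.*?)\s*<([^<>\s]+@[^<>\s]+)>$/))) {
    // Strip optional surrounding quotes; an empty name becomes undefined.
    return { name: m[1].replace(/^"|"$/g, "") || undefined, email: m[2] };
  }

  // Format 2: email (Name)
  if ((m = text.match(/^(\S+@\S+)\s*\((.+)\)$/))) {
    return { name: m[2], email: m[1] };
  }

  // Format 3: bare email
  if (/^\S+@\S+$/.test(text)) {
    return { email: text };
  }

  // No email found: treat the whole text as a name.
  return { name: text };
}

console.log(splitNameAndEmail("John Smith <john@example.com>"));
console.log(splitNameAndEmail("sales@company.com (Sales Team)"));
console.log(splitNameAndEmail("Jane Doe"));
```

The name part returned here is still raw text; in the library it would then be classified, which is what the `display` field in the output above reflects.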
269
+
270
+ ---
271
+
272
+ #### `getFirstName(fullName)`
273
+
274
+ ```ts
275
+ getFirstName(fullName: string): string | undefined
276
+ ```
277
+
278
+ Extract just the first/given name from a full name string.
279
+
280
+ **Parameters:**
281
+
282
+ | Parameter | Type | Description |
283
+ | ---------- | -------- | ------------------------ |
284
+ | `fullName` | `string` | The full name to extract from |
285
+
286
+ **Returns:** `string | undefined` - The first name, or `undefined` if not found
287
+
288
+ ```javascript
289
+ getFirstName("William Frederick Richardson"); // "William"
290
+ getFirstName("Dr. John Smith Jr."); // "John"
291
+ ```
292
+
293
+ ---
294
+
295
+ #### `getLastName(fullName)`
296
+
297
+ ```ts
298
+ getLastName(fullName: string): string | undefined
299
+ ```
300
+
301
+ Extract just the last/family name from a full name string.
302
+
303
+ **Parameters:**
304
+
305
+ | Parameter | Type | Description |
306
+ | ---------- | -------- | ------------------------ |
307
+ | `fullName` | `string` | The full name to extract from |
308
+
309
+ **Returns:** `string | undefined` - The last name, or `undefined` if not found
310
+
311
+ ```javascript
312
+ getLastName("William Frederick Richardson"); // "Richardson"
313
+ getLastName("Dr. John van der Berg Jr."); // "van der Berg"
314
+ ```
315
+
316
+ ---
317
+
318
+ #### `getNickname(fullName)`
319
+
320
+ ```ts
321
+ getNickname(fullName: string): string | undefined
322
+ ```
323
+
324
+ Extract the nickname from a full name string (if present).
325
+
326
+ **Parameters:**
327
+
328
+ | Parameter | Type | Description |
329
+ | ---------- | -------- | ------------------------ |
330
+ | `fullName` | `string` | The full name to extract from |
331
+
332
+ **Returns:** `string | undefined` - The nickname, or `undefined` if not found
333
+
334
+ ```javascript
335
+ getNickname('William "Bill" Smith'); // "Bill"
336
+ getNickname("Robert (Bob) Jones"); // "Bob"
337
+ ```
338
+
339
+ ---
340
+
341
+ #### `entityToLegacy(entity)`
342
+
343
+ ```ts
344
+ entityToLegacy(entity: ParsedNameEntity): ParsedName | null
345
+ ```
346
+
347
+ Convert a `ParsedNameEntity` to legacy `ParsedName` format (for use with `formatName`).
348
+
349
+ **Parameters:**
350
+
351
+ | Parameter | Type | Description |
352
+ | --------- | ------------------ | -------------------------- |
353
+ | `entity` | `ParsedNameEntity` | The entity to convert |
354
+
355
+ **Returns:** `ParsedName | null` - Legacy format, or `null` if not a person
356
+
357
+ ```javascript
358
+ const entity = parseName("Dr. John Smith Jr.");
359
+ const legacy = entityToLegacy(entity);
360
+ // { prefix: 'Dr.', first: 'John', last: 'Smith', suffix: 'Jr.' }
361
+ ```
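
The field mapping implied by the example above (`honorific` to `prefix`, `given` to `first`, `family` to `last`) can be sketched as follows. `toLegacy` is a hypothetical stand-in written for this example, not the library's `entityToLegacy`; in particular, how the real function treats `particles` is not shown here.

```javascript
// Hypothetical re-implementation for illustration; the real entityToLegacy may differ.
function toLegacy(entity) {
  // Only person entities have a legacy representation.
  if (entity.kind !== "person") return null;

  const legacy = {
    prefix: entity.honorific,
    first: entity.given,
    middle: entity.middle,
    last: entity.family,
    suffix: entity.suffix,
    nickname: entity.nickname,
  };

  // Drop fields that were not present on the entity.
  return Object.fromEntries(
    Object.entries(legacy).filter(([, value]) => value !== undefined)
  );
}

console.log(
  toLegacy({ kind: "person", honorific: "Dr.", given: "John", family: "Smith", suffix: "Jr." })
);
// { prefix: 'Dr.', first: 'John', last: 'Smith', suffix: 'Jr.' }

console.log(toLegacy({ kind: "organization", baseName: "Acme" })); // null
```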
362
+
363
+ ---
364
+
365
+ ### Type Guards
366
+
367
+ Type guard functions for TypeScript type narrowing:
368
+
369
+ | Function | Returns `true` when |
370
+ | ------------------ | ---------------------------------- |
371
+ | `isPerson(entity)` | `entity.kind === 'person'` |
372
+ | `isOrganization(entity)` | `entity.kind === 'organization'` |
373
+ | `isFamily(entity)` | `entity.kind === 'family'` |
374
+ | `isCompound(entity)` | `entity.kind === 'compound'` |
375
+ | `isUnknown(entity)` | `entity.kind === 'unknown'` |
376
+ | `isRejected(entity)` | `entity.kind === 'rejected'` |
377
+
378
+ ```typescript
379
+ import { parseName, isPerson, isOrganization } from "name-tools";
380
+
381
+ const entity = parseName("Dr. John Smith Jr."); // any input string
382
+
383
+ if (isPerson(entity)) {
384
+ // TypeScript knows: entity is PersonName
385
+ console.log(entity.given, entity.family);
386
+ } else if (isOrganization(entity)) {
387
+ // TypeScript knows: entity is OrganizationName
388
+ console.log(entity.baseName, entity.legalForm);
389
+ }
390
+ ```
391
+
392
+ ---
393
+
394
+ ### Formatting Functions
395
+
396
+ #### `formatName(input, options?)`
397
+
398
+ ```ts
399
+ formatName(
400
+ input: string | ParsedName | Array<string | ParsedName>,
401
+ options?: NameFormatOptions
402
+ ): string
403
+ ```
404
+
405
+ Format a name (or array of names) according to a preset or custom options.
406
+
407
+ **Parameters:**
408
+
409
+ | Parameter | Type | Description |
410
+ | --------- | ---- | ----------- |
411
+ | `input` | `string \| ParsedName \| Array` | Name(s) to format |
412
+ | `options` | `NameFormatOptions` | Formatting options (see below) |
413
+
414
+ **Returns:** `string` - The formatted name
415
+
416
+ **Key Options:**
417
+
418
+ | Option | Type | Default | Description |
419
+ | --------------- | ----------------------------------------- | --------------- | ---------------------------------------- |
420
+ | `preset` | `NamePreset` | `'display'` | Preset format (see Preset Matrix below) |
421
+ | `output` | `'text' \| 'html'` | `'text'` | Output format |
422
+ | `typography` | `'plain' \| 'ui' \| 'fine'` | `'ui'` | Typography level for spacing/punctuation |
423
+ | `noBreak` | `'none' \| 'smart' \| 'all'` | `'smart'` | Non-breaking space behavior |
424
+ | `join` | `'none' \| 'list' \| 'couple'` | `'none'` | Array rendering mode |
425
+ | `conjunction` | `'and' \| '&' \| string` | `'and'` | Word between last two names |
426
+ | `oxfordComma` | `boolean` | `true` | Use Oxford comma in lists |
427
+ | `prefer` | `'auto' \| 'nickname' \| 'first' \| 'fullGiven'` | `'auto'` | Which given name to prefer |
428
+ | `middle` | `'full' \| 'initial' \| 'none'` | varies | Middle name handling |
429
+ | `prefix` | `'include' \| 'omit' \| 'auto'` | varies | Honorific handling |
430
+ | `suffix` | `'include' \| 'omit' \| 'auto'` | varies | Suffix handling |
431
+ | `order` | `'given-family' \| 'family-given' \| 'auto'` | `'given-family'` | Name order |
432
+
433
+ ```javascript
434
+ formatName("Dr. John Franklin Jr.");
435
+ // "John Franklin, Jr."
436
+
437
+ formatName("Dr. John Franklin Jr.", { preset: "alphabetical" });
438
+ // "Franklin, John, Jr."
439
+
440
+ formatName(["John Smith", "Jane Doe"], { join: "list" });
441
+ // "John Smith and Jane Doe"
442
+ ```
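
To make the `join: "list"`, `conjunction`, and `oxfordComma` options more concrete, here is a small sketch of how such a list join is commonly implemented. `joinNames` is a hypothetical helper, and the behavior for three or more names is an assumption based on the option descriptions, not output taken from the library.

```javascript
// Hypothetical helper; illustrates the documented options, not the library's code.
function joinNames(names, { conjunction = "and", oxfordComma = true } = {}) {
  if (names.length <= 1) return names.join("");
  // Two names never take a comma: "A and B".
  if (names.length === 2) return `${names[0]} ${conjunction} ${names[1]}`;

  // Three or more: comma-separate all but the last, then add the conjunction.
  const head = names.slice(0, -1).join(", ");
  const comma = oxfordComma ? "," : "";
  return `${head}${comma} ${conjunction} ${names[names.length - 1]}`;
}

console.log(joinNames(["John Smith", "Jane Doe"]));
// "John Smith and Jane Doe"
console.log(joinNames(["Ann", "Bob", "Cy"]));
// "Ann, Bob, and Cy"
console.log(joinNames(["Ann", "Bob", "Cy"], { conjunction: "&", oxfordComma: false }));
// "Ann, Bob & Cy"
```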
443
+
444
+ #### Preset Matrix (quick pick)
445
+
446
+ Example input used below:
447
+
448
+ ```javascript
449
+ const input = "Dr. William Frederick Richardson Jr.";
450
+ ```
451
+
452
+ | preset | intent | defaults (high level) | output example |
453
+ | ------------------- | ------------------------ | --------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------- |
454
+ | `display` (default) | best UI display name | prefix omit, middle none, suffix auto, given-family | `formatName(input)` → `William Richardson, Jr.` |
455
+ | `preferredDisplay` | nickname + family for UI | prefer nickname, middle none, suffix auto | `formatName(input, { preset: "preferredDisplay" })` → `William Richardson, Jr.` _(falls back to first if no nickname)_ |
456
+ | `informal` | given + family | prefix omit, middle none, suffix omit | `formatName(input, { preset: "informal" })` → `William Richardson` |
457
+ | `firstOnly` | given only | prefix omit, middle none, suffix omit | `formatName(input, { preset: "firstOnly" })` → `William` |
458
+ | `preferredFirst` | nickname only | prefer nickname, middle none, suffix omit | `formatName(input, { preset: "preferredFirst" })` → `William` _(falls back to first if no nickname)_ |
459
+ | `formalFull` | full formal name | prefix include, middle full, suffix include, given-family | `formatName(input, { preset: "formalFull" })` → `Dr. William Frederick Richardson, Jr.` |
460
+ | `formalShort` | title + family | prefix include, middle none, suffix omit | `formatName(input, { preset: "formalShort" })` → `Dr. Richardson` |
461
+ | `expandedFull` | full formal name with full given | prefix include, prefer fullGiven, middle none, suffix include | `formatName('Thomas A. (Thomas Alva) Edison', { preset: "expandedFull" })` → `Thomas Alva Edison` |
462
+ | `alphabetical` | sortable family-first | family-given, middle initial, suffix auto | `formatName(input, { preset: "alphabetical" })` → `Richardson, William F., Jr.` |
463
+ | `library` | alphabetical with full given | family-given, middle initial, suffix auto, append fullGiven | `formatName('Thomas A. (Thomas Alva) Edison', { preset: "library" })` → `Edison, Thomas A. (Thomas Alva)` |
464
+ | `initialed` | initials + family | middle initial, suffix omit | `formatName(input, { preset: "initialed" })` → `W. F. Richardson` |
465
+
466
+ See `NameFormatOptions` for presets, typography, no-break behavior, and array rendering.
467
+
468
+ ### Data Sets & Utilities
469
+
470
+ #### Data Sets
471
+
472
+ | Export | Type | Description |
473
+ | ----------------------- | ------------------- | ------------------------------------------------ |
474
+ | `PARTICLES` | `readonly string[]` | Surname particles: "von", "van", "de", "la", etc. |
475
+ | `MULTI_WORD_PARTICLES` | `readonly string[]` | Multi-word particles: "de la", "van der", etc. |
476
+ | `COMMON_SURNAMES` | `readonly string[]` | ~1000 most common US surnames |
477
+ | `COMMON_FIRST_NAMES` | `readonly string[]` | ~1000 most common US first names |
478
+
479
+ #### Utility Functions
480
+
481
+ ```ts
482
+ isParticle(str: string): boolean
483
+ ```
484
+
485
+ Check if a string is a surname particle (case-insensitive).
486
+
487
+ ```javascript
488
+ isParticle("von"); // true
489
+ isParticle("Van"); // true
490
+ isParticle("Smith"); // false
491
+ ```
492
+
493
+ ---
494
+
495
+ ```ts
496
+ isMultiWordParticle(words: string[]): string | null
497
+ ```
498
+
499
+ Check if an array of words starts with a multi-word particle. Returns the matched particle or `null`.
500
+
501
+ ```javascript
502
+ isMultiWordParticle(["de", "la", "Cruz"]); // "de la"
503
+ isMultiWordParticle(["van", "der", "Berg"]); // "van der"
504
+ isMultiWordParticle(["Smith", "Jones"]); // null
505
+ ```
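
The prefix check described above can be sketched in a few lines. `matchMultiWordParticle` and `SAMPLE_PARTICLES` are hypothetical names used only for this example; the sample list is deliberately tiny, and case-insensitive matching is an assumption carried over from `isParticle`.

```javascript
// Sample list for this sketch only; the library's MULTI_WORD_PARTICLES may differ.
const SAMPLE_PARTICLES = ["de la", "van der", "van den", "von der"];

// Hypothetical helper: returns the particle the word list starts with, or null.
function matchMultiWordParticle(words) {
  const lowered = words.map((w) => w.toLowerCase());

  // Try particles with more words first so the longest match wins.
  const candidates = [...SAMPLE_PARTICLES].sort(
    (a, b) => b.split(" ").length - a.split(" ").length
  );

  for (const particle of candidates) {
    const parts = particle.split(" ");
    // Every word of the particle must match the same position in the input.
    if (parts.every((part, i) => lowered[i] === part)) return particle;
  }
  return null;
}

console.log(matchMultiWordParticle(["de", "la", "Cruz"])); // "de la"
console.log(matchMultiWordParticle(["Smith", "Jones"])); // null
```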
506
+
507
+ ---
508
+
509
+ ```ts
510
+ isCommonSurname(str: string): boolean
511
+ ```
512
+
513
+ Check if a string is a common US surname (case-insensitive).
514
+
515
+ ```javascript
516
+ isCommonSurname("Smith"); // true
517
+ isCommonSurname("Xyzzy"); // false
518
+ ```
519
+
520
+ ---
521
+
522
+ ```ts
523
+ isCommonFirstName(str: string): boolean
524
+ ```
525
+
526
+ Check if a string is a common US first name (case-insensitive).
527
+
528
+ ```javascript
529
+ isCommonFirstName("John"); // true
530
+ isCommonFirstName("Xyzzy"); // false
531
+ ```
532
+
533
+ ## TypeScript Support
534
+
535
+ This library is written in TypeScript and includes full type definitions.
536
+
537
+ ```typescript
538
+ import {
539
+ parseName,
540
+ formatName,
541
+ parseNameList,
542
+ ParsedNameEntity,
543
+ PersonName,
544
+ OrganizationName,
545
+ CompoundName,
546
+ ParsedRecipient,
547
+ isPerson,
548
+ NameFormatOptions,
549
+ } from "name-tools";
550
+
551
+ // Entity classification with type narrowing
552
+ const entity: ParsedNameEntity = parseName("John Franklin Jr.");
553
+
554
+ if (isPerson(entity)) {
555
+ const person: PersonName = entity;
556
+ console.log(person.given, person.family); // "John", "Franklin"
557
+ }
558
+
559
+ // Formatting
560
+ const formatted: string = formatName("John Franklin Jr.", {
561
+ preset: "display",
562
+ } satisfies NameFormatOptions);
563
+ ```
564
+
565
+ ## Use Cases
566
+
567
+ - User registration and profile systems
568
+ - Contact management systems
569
+ - Email address generators
570
+ - Mailing list formatters
571
+ - Name-based sorting and indexing
572
+ - Form validation and processing
573
+ - Business card and document generation
574
+ - CRM data normalization
575
+ - Donor/customer record deduplication
576
+
577
+ ## Examples
578
+
579
+ ### Entity Classification
580
+
581
+ ```javascript
582
+ import { parseName } from "name-tools";
583
+
584
+ function processEntry(input) {
585
+ const entity = parseName(input);
586
+
587
+ switch (entity.kind) {
588
+ case "person":
589
+ return `Hello, ${entity.given}!`;
590
+ case "organization":
591
+ return `Business: ${entity.baseName}`;
592
+ case "family":
593
+ return `The ${entity.familyName} household`;
594
+ case "compound":
595
+ return `${entity.members.length} people`;
596
+ default:
597
+ return input; // unknown/rejected
598
+ }
599
+ }
600
+
601
+ processEntry("Dr. John Smith"); // "Hello, John!"
602
+ processEntry("Acme Corp, LLC"); // "Business: Acme Corp, LLC"
603
+ processEntry("The Smith Family"); // "The Smith household"
604
+ processEntry("Bob & Mary Smith"); // "2 people"
605
+ ```
606
+
607
+ ### Strict Mode (Person Only)
608
+
609
+ ```javascript
610
+ import { parseName } from "name-tools";
611
+
612
+ // Reject non-person entities
613
+ const result = parseName("Acme Corp, Inc.", { strictKind: "person" });
614
+ console.log(result.kind); // 'rejected'
615
+ console.log(result.rejectedAs); // 'organization'
616
+ ```
617
+
618
+ ### Building an Email Address
619
+
620
+ ```javascript
621
+ import { getFirstName, getLastName } from "name-tools";
622
+
623
+ function createEmail(fullName, domain) {
624
+ const first = getFirstName(fullName)?.toLowerCase();
625
+ const last = getLastName(fullName)?.toLowerCase();
626
+ return first && last ? `${first}.${last}@${domain}` : undefined; // either helper may return undefined
627
+ }
628
+
629
+ createEmail("John Franklin Jr.", "example.com");
630
+ // "john.franklin@example.com"
631
+ ```
632
+
633
+ ### Formatting for Display
634
+
635
+ ```javascript
636
+ import { parseName, formatName, isPerson } from "name-tools";
637
+
638
+ function displayName(fullName) {
639
+ const entity = parseName(fullName);
640
+
641
+ if (!isPerson(entity)) {
642
+ return fullName; // Return as-is for non-persons
643
+ }
644
+
645
+ // Use a more formal preset for VIPs with titles
646
+ if (entity.honorific) {
647
+ return formatName(fullName, { preset: "formalFull" });
648
+ }
649
+
650
+ // Otherwise the default display preset
651
+ return formatName(fullName);
652
+ }
653
+
654
+ displayName("Dr. William Frederick Richardson Jr.");
655
+ // "Dr. William Frederick Richardson, Jr."
656
+
657
+ displayName("William Frederick Richardson Jr.");
658
+ // "William Richardson, Jr."
659
+ ```
660
+
661
+ ### Sorting Names
662
+
663
+ ```javascript
664
+ import { formatName } from "name-tools";
665
+
666
+ const names = ["John Franklin Jr.", "Alice Johnson", "Dr. Charlie Brown"];
667
+
668
+ const sorted = names
669
+ .map((name) => ({
670
+ original: name,
671
+ sortKey: formatName(name, { preset: "alphabetical" }),
672
+ }))
673
+ .sort((a, b) => a.sortKey.localeCompare(b.sortKey))
674
+ .map((item) => item.original);
675
+
676
+ // ["Dr. Charlie Brown", "John Franklin Jr.", "Alice Johnson"]
677
+ ```
678
+
679
+ ### Processing Email Recipient Lists
680
+
681
+ ```javascript
682
+ import { parseNameList, isPerson } from "name-tools";
683
+
684
+ const toLine =
685
+ "John Smith <john@example.com>; Jane Doe; sales@company.com; The Smiths";
686
+
687
+ const recipients = parseNameList(toLine);
688
+
689
+ for (const r of recipients) {
690
+ if (r.email) {
691
+ console.log(`Email: ${r.email}`);
692
+ }
693
+ if (r.display && isPerson(r.display)) {
694
+ console.log(`Name: ${r.display.given} ${r.display.family}`);
695
+ }
696
+ }
697
+ ```
698
+
699
+ ## Running the Demo Locally & Troubleshooting
700
+
701
+ ### Opening the Demo Page
702
+
703
+ To test your changes live in the demo page:
704
+
705
+ 1. Build the library and copy the latest code to the demo:
706
+ ```bash
707
+ pnpm run build
708
+ ```
709
+ 2. Open `docs/index.html` in your browser.
710
+
711
+ #### CORS Error: "Access to script at ... has been blocked by CORS policy"
712
+
713
+ Modern browsers block JavaScript modules and some resource loading when opening an HTML file directly from disk (`file://`), due to security (CORS) restrictions. If you see errors like:
714
+
715
+ ```
716
+ Access to script at 'file:///C:/.../script.js' from origin 'null' has been blocked by CORS policy
717
+ ```
718
+
719
+ **Solution:** Serve the `docs/` folder with a local web server instead of opening the file directly.
720
+
721
+ ##### Quick Start with a Local Server
722
+
723
+ From your project root, run one of the following:
724
+
725
+ - Using pnpm (`dlx` downloads `serve` on demand, so no prior install is needed):
726
+ ```bash
727
+ pnpm dlx serve docs
728
+ ```
729
+ - Using npx (npm):
730
+ ```bash
731
+ npx serve docs
732
+ ```
733
+ - Using Python (if installed):
734
+ ```bash
735
+ python -m http.server 8000 --directory docs
736
+ ```
737
+
738
+ Then open your browser to the address shown in the terminal (e.g., http://localhost:3000 or http://localhost:8000).
739
+
740
+ ---
741
+
742
+ ## Development
743
+
744
+ ### Setup
745
+
746
+ ```bash
747
+ # Clone the repository
748
+ git clone https://github.com/BobPritchett/name-tools.git
749
+ cd name-tools
750
+
751
+ # Install dependencies
752
+ pnpm install
753
+
754
+ # Build the library
755
+ pnpm run build
756
+
757
+ # Run tests
758
+ pnpm test
759
+
760
+ # Development mode (watch for changes)
761
+ pnpm run dev
762
+ ```
763
+
764
+ ### Project Structure
765
+
766
+ ```
767
+ name-tools/
768
+ ├── src/ # Source code (TypeScript)
769
+ │ ├── data/ # Data sets (prefixes, suffixes, legal forms)
770
+ │ ├── detectors/ # Entity detection modules
771
+ │ │ ├── person.ts # Person name detection
772
+ │ │ ├── organization.ts # Organization detection
773
+ │ │ ├── family.ts # Family/household detection
774
+ │ │ └── compound.ts # Compound name detection
775
+ │ ├── gender/ # Gender probability module
776
+ │ │ ├── GenderDB.ts # Binary trie lookup class
777
+ │ │ ├── all.ts # Full dataset entry point
778
+ │ │ ├── coverage99.ts # 99% coverage entry point
779
+ │ │ ├── coverage95.ts # 95% coverage entry point
780
+ │ │ └── data/ # Generated binary data (base64)
781
+ │ ├── classifier.ts # Main classification orchestrator
782
+ │ ├── parsers.ts # Parsing functions
783
+ │ ├── formatters.ts # Formatting functions
784
+ │ ├── list-parser.ts # Recipient list parsing
785
+ │ ├── types.ts # TypeScript type definitions
786
+ │ └── index.ts # Main entry point
787
+ ├── scripts/ # Build scripts
788
+ │ └── build-gender-data.js # Generates binary trie from SSA data
789
+ ├── data/ # SSA source data (yobYYYY.txt files)
790
+ ├── dist/ # Compiled output (generated)
791
+ ├── docs/ # Demo site (GitHub Pages)
792
+ ├── tests/ # Unit tests
793
+ └── package.json # Package configuration
794
+ ```
795
+
796
+ ### Rebuilding Gender Probability Data
797
+
798
+ The library includes gender probability data derived from US Social Security Administration birth name records. The data is pre-built and included in the source, but you can rebuild it if you need to update or customize it.
799
+
800
+ #### Step 1: Download SSA Data
801
+
802
+ 1. Go to the [SSA Baby Names page](https://www.ssa.gov/oact/babynames/limits.html)
803
+ 2. Click **"National data"** to download `names.zip` (~9 MB)
804
+ 3. Extract the zip file contents into the `data/` folder in this project
805
+ 4. You should now have files like `data/yob1880.txt`, `data/yob1881.txt`, ..., `data/yob2023.txt`
806
+
807
+ The zip contains ~145 files covering births from 1880 to present.
808
+
809
+ #### Step 2: Build the Data
810
+
811
+ ```bash
812
+ pnpm build:gender
813
+ ```
814
+
815
+ This generates three tree-shakeable data files with different coverage levels:
816
+
817
+ | File | Coverage | Names | Size |
818
+ | ------------------------------- | -------- | ------- | ------- |
819
+ | `src/gender/data/all.ts` | 100% | ~83,000 | ~757 KB |
820
+ | `src/gender/data/coverage99.ts` | 99% | ~27,000 | ~257 KB |
821
+ | `src/gender/data/coverage95.ts` | 95% | ~7,000 | ~75 KB |
822
+
823
+ #### Configuration
824
+
825
+ Edit `scripts/build-gender-data.js` to customize:
826
+
827
+ ```javascript
828
+ const CONFIG = {
829
+ // Minimum occurrences to include a name (higher = smaller file)
830
+ MIN_OCCURRENCES: 10,
831
+
832
+ // Coverage levels to generate
833
+ COVERAGE_LEVELS: {
834
+ all: 1.0, // 100% - all names meeting threshold
835
+ coverage99: 0.99, // 99% of population
836
+ coverage95: 0.95, // 95% of population
837
+ },
838
+ };
839
+ ```
840
+
841
+ **How the filters interact:**
842
+
843
+ 1. `MIN_OCCURRENCES` is applied **first** to all names - any name with fewer than `MIN_OCCURRENCES` total occurrences across all years is discarded
844
+ 2. `COVERAGE_LEVELS` then filters the remaining names by cumulative population coverage (sorted by popularity)
845
+
846
+ So `MIN_OCCURRENCES` affects **all** output files, not just the "all" dataset. Increasing it removes rare names from every coverage level, making all files smaller but potentially missing some valid obscure names.
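
The two-stage filtering described above can be sketched as follows. This is an illustrative reconstruction, not the code in `scripts/build-gender-data.js`: `selectNames` is a hypothetical function, and it assumes coverage is measured against the population that remains after the `MIN_OCCURRENCES` cut.

```javascript
// Hypothetical sketch of the filter order; not the actual build script.
function selectNames(totals, minOccurrences, coverage) {
  // Stage 1: drop names below the occurrence threshold, most popular first.
  const kept = Object.entries(totals)
    .filter(([, count]) => count >= minOccurrences)
    .sort((a, b) => b[1] - a[1]);

  // Stage 2: keep the most popular names until their cumulative share
  // of the remaining population reaches the coverage target.
  const population = kept.reduce((sum, [, count]) => sum + count, 0);
  const selected = [];
  let covered = 0;
  for (const [name, count] of kept) {
    if (covered >= coverage * population) break;
    selected.push(name);
    covered += count;
  }
  return selected;
}

// Made-up counts, for illustration only.
const totals = { John: 900, Mary: 80, Alex: 15, Zed: 5, Qix: 3 };

console.log(selectNames(totals, 10, 1.0)); // [ 'John', 'Mary', 'Alex' ]
console.log(selectNames(totals, 10, 0.95)); // [ 'John', 'Mary' ]
console.log(selectNames(totals, 20, 1.0)); // [ 'John', 'Mary' ]
```

The last call shows the point made above: raising the threshold removes `Alex` even from the 100% dataset.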
847
+
848
+ #### Usage
849
+
850
+ Import the data size you need - tree-shaking drops the unused ones:
851
+
852
+ ```typescript
853
+ // Smallest bundle (~75 KB). Import from exactly one entry point.
854
+ import { createGenderDB } from "name-tools/gender/coverage95";
855
+
856
+ // Medium bundle (~257 KB)
857
+ // import { createGenderDB } from "name-tools/gender/coverage99";
858
+
859
+ // Full dataset (~757 KB)
860
+ // import { createGenderDB } from "name-tools/gender/all";
861
+
862
+ const db = createGenderDB();
863
+
864
+ db.getMaleProbability("John"); // ~0.996 (male)
865
+ db.getMaleProbability("Mary"); // ~0.004 (female)
866
+ db.getMaleProbability("Chris"); // ~0.7 (leans male)
867
+
868
+ // Guess gender with 80% confidence threshold (default)
869
+ db.guessGender("John"); // 'male' (>80% male probability)
870
+ db.guessGender("Mary"); // 'female' (<20% male probability)
871
+ db.guessGender("Chris"); // 'unknown' (found, but between 20-80%)
872
+ db.guessGender("Alex", 0.6); // custom threshold: 'male' | 'female' | 'unknown'
873
+ db.guessGender("Xyzzy"); // null (not found in database)
874
+
875
+ // Check if a name exists in the database (useful for first-name validation)
876
+ db.has("Chris"); // true
877
+ db.has("Xyzzy"); // false
878
+ ```
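
The threshold behavior in the comments above can be read as a simple rule on the male probability. The sketch below is an assumption inferred from those comments (a symmetric threshold with strict comparisons); `guessFromProbability` is a hypothetical function and not the `GenderDB` implementation.

```javascript
// Hypothetical sketch of the threshold rule; not the GenderDB implementation.
function guessFromProbability(maleProbability, threshold = 0.8) {
  if (maleProbability === null) return null; // name not in the database
  if (maleProbability > threshold) return "male";
  if (maleProbability < 1 - threshold) return "female";
  return "unknown"; // found, but not decisive either way
}

console.log(guessFromProbability(0.996)); // "male"
console.log(guessFromProbability(0.004)); // "female"
console.log(guessFromProbability(0.7)); // "unknown"
console.log(guessFromProbability(0.7, 0.6)); // "male"
console.log(guessFromProbability(null)); // null
```

Lowering the threshold trades accuracy for coverage: more names get a definite answer, but a larger share of those answers will be wrong.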
879
+
880
+ ### Building for NPM and GitHub Pages
881
+
882
+ This project uses a dual-output setup:
883
+
884
+ 1. **NPM Package**: The `dist/` folder contains the compiled library code
885
+
886
+ - Built with `tsup` in both CommonJS and ESM formats
887
+ - Includes TypeScript type definitions
888
+
889
+ 2. **GitHub Pages Demo**: The `docs/` folder contains a static demo site
890
+ - Available at: https://bobpritchett.github.io/name-tools/
891
+
892
+ ## Contributing
893
+
894
+ Contributions are welcome! Please feel free to submit a Pull Request.
895
+
896
+ ## License
897
+
898
+ MIT © 2025-2026 Bob Pritchett
899
+
900
+ ## Author
901
+
902
+ **Bob Pritchett**
903
+
904
+ - GitHub: [@BobPritchett](https://github.com/BobPritchett)
905
+
906
+ ## Acknowledgments
907
+
908
+ Built with:
909
+
910
+ - [TypeScript](https://www.typescriptlang.org/)
911
+ - [tsup](https://github.com/egoist/tsup) - Zero-config TypeScript bundler
912
+ - [Vitest](https://vitest.dev/) - Unit testing framework