rfc-bcp47 0.0.0
- package/LICENSE +21 -0
- package/README.md +200 -0
- package/dist/index.cjs +1115 -0
- package/dist/index.cjs.map +1 -0
- package/dist/index.d.cts +129 -0
- package/dist/index.d.ts +129 -0
- package/dist/index.js +1104 -0
- package/dist/index.js.map +1 -0
- package/package.json +73 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
The MIT License (MIT)

Copyright (c) Gabriel Llamas

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,200 @@
# bcp47

<p align="center">
  <img src="docs/anatomy-screenshot.png" alt="BCP 47 language tag anatomy" width="765">
</p>

> Zero-dependency [BCP 47](https://www.rfc-editor.org/info/bcp47) / [RFC 5646](https://datatracker.ietf.org/doc/html/rfc5646) language tag toolkit for JavaScript and TypeScript

- **Parse** any BCP 47 language tag into a structured, typed object
- **Stringify** a tag object back into a well-formed language tag string
- **Canonicalize** with case normalization and [IANA registry](https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry) data (deprecated subtags, suppress-script, extlang)
- **Match** language tags with `filter` and `lookup` per [RFC 4647](https://datatracker.ietf.org/doc/html/rfc4647)
- **Extension U/T** extraction for Unicode locales ([RFC 6067](https://datatracker.ietf.org/doc/html/rfc6067)) and transformed content ([RFC 6497](https://datatracker.ietf.org/doc/html/rfc6497))
- **Accept-Language** header parsing per [RFC 9110](https://datatracker.ietf.org/doc/html/rfc9110#section-12.5.4)
- **WCAG-ready**: use `parse()` to validate `lang` attributes per [WCAG 2.x SC 3.1.1](https://www.w3.org/WAI/WCAG22/Understanding/language-of-page) and [SC 3.1.2](https://www.w3.org/WAI/WCAG22/Understanding/language-of-parts)
- **TypeScript-first** with full type inference and strict types out of the box
- **Zero dependencies**, tree-shakeable, works in Node.js and browsers

## Install

```bash
npm install bcp47
```

## Operators

Tree-shakeable operators: import only what you need.

### parse / stringify

```ts
import { parse, stringify } from 'bcp47';

const tag = parse('en-Latn-US');

if (tag?.type === 'langtag') {
  tag.langtag.language // 'en'
  tag.langtag.script // 'Latn'
  tag.langtag.region // 'US'
}

stringify(tag!); // 'en-Latn-US'

parse('invalid!'); // null
```

`parse` returns one of three tag types or `null` for invalid input:

| `type` | When | Fields available |
|--------|------|-----------------|
| `'langtag'` | Standard language tags (`en-US`, `zh-Hant-TW`) | `langtag.language`, `langtag.script`, `langtag.region`, `langtag.extlang`, `langtag.variant`, `langtag.extension`, `langtag.privateuse` |
| `'privateuse'` | Private use tags (`x-custom`) | `privateuse` |
| `'grandfathered'` | Legacy registered tags (`i-klingon`) | `grandfathered.type`, `grandfathered.tag` |
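
The three tag shapes in the table above lend themselves to a discriminated union keyed on `type`. The following is a minimal sketch of that idea, not the package's actual declarations: the names `LangtagResult`, `PrivateUseResult`, `GrandfatheredResult`, `ParsedTag`, and `describeTag` are hypothetical, and the field types are assumptions, since the table only lists field names.

```ts
// Hypothetical shapes modeled on the table above; all field types are assumed.
interface LangtagResult {
  type: 'langtag';
  langtag: {
    language: string;
    script: string | null;
    region: string | null;
  };
}

interface PrivateUseResult {
  type: 'privateuse';
  privateuse: string[];
}

interface GrandfatheredResult {
  type: 'grandfathered';
  grandfathered: { type: string; tag: string };
}

type ParsedTag = LangtagResult | PrivateUseResult | GrandfatheredResult;

// Narrowing on `type` tells the compiler which fields exist in each branch.
function describeTag(tag: ParsedTag | null): string {
  if (tag === null) return 'invalid';
  switch (tag.type) {
    case 'langtag':
      return `language=${tag.langtag.language}`;
    case 'privateuse':
      return `private use: ${tag.privateuse.join('-')}`;
    case 'grandfathered':
      return `grandfathered: ${tag.grandfathered.tag}`;
  }
}
```

Checking `type` first is what makes field access such as `tag.langtag.region` type-safe without casts.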

### langtag

Build a tag from known parts without parsing a string. Validates subtags and throws `RangeError` on invalid input:

```ts
import { langtag, stringify } from 'bcp47';

const tag = langtag('en', { region: 'US' });
stringify(tag); // 'en-US'

langtag('!!!'); // RangeError: invalid language
```
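
To illustrate the validate-then-build idea, here is a minimal sketch that checks subtag shapes against the RFC 5646 ABNF and throws `RangeError` on a mismatch. It is not the package's implementation: `buildTag` and `TagParts` are hypothetical names, only script and region are handled, and it returns a string directly instead of a tag object.

```ts
// Hypothetical option bag; the real `langtag` options are not listed in this README.
interface TagParts {
  script?: string;
  region?: string;
}

// Subtag shapes from the RFC 5646 ABNF (extlang, variants, extensions omitted).
const LANGUAGE = /^[a-z]{2,8}$/i;
const SCRIPT = /^[a-z]{4}$/i;
const REGION = /^(?:[a-z]{2}|\d{3})$/i;

function buildTag(language: string, parts: TagParts = {}): string {
  if (!LANGUAGE.test(language)) {
    throw new RangeError(`Invalid language subtag: ${language}`);
  }
  const subtags = [language];
  if (parts.script !== undefined) {
    if (!SCRIPT.test(parts.script)) {
      throw new RangeError(`Invalid script subtag: ${parts.script}`);
    }
    subtags.push(parts.script);
  }
  if (parts.region !== undefined) {
    if (!REGION.test(parts.region)) {
      throw new RangeError(`Invalid region subtag: ${parts.region}`);
    }
    subtags.push(parts.region);
  }
  // Subtags are joined in the order the grammar requires: language, script, region.
  return subtags.join('-');
}
```

Note that this only checks syntax; whether a subtag is actually registered with IANA is a separate question.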

### canonicalize

Reduce equivalent tags to a single canonical form. It handles case normalization, deprecated subtags, suppress-script, extlang promotion, and extension ordering using [IANA registry](https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry) data:

```ts
import { canonicalize } from 'bcp47';

canonicalize('iw'); // 'he' (deprecated language)
canonicalize('zh-cmn'); // 'cmn' (extlang to preferred)
canonicalize('en-Latn'); // 'en' (suppress-script)
canonicalize('de-DD'); // 'de-DE' (deprecated region)
```
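
Of the steps listed above, case normalization is the only one that needs no registry data. The following is a minimal sketch of the casing convention from RFC 5646 §2.1.1, assuming well-formed input. `normalizeCase` is a hypothetical helper, and the registry-driven mappings (deprecated subtags, suppress-script, extlang) are out of scope here.

```ts
// Casing convention from RFC 5646 §2.1.1:
// - everything is lowercase by default,
// - 2-character subtags that are not first and not after a singleton are uppercase,
// - 4-character subtags that are not first and not after a singleton are titlecase.
function normalizeCase(tag: string): string {
  const result: string[] = [];
  let afterSingleton = false;

  tag.split('-').forEach((subtag, index) => {
    const lower = subtag.toLowerCase();
    if (index === 0 || afterSingleton || lower.length === 1) {
      result.push(lower);
    } else if (lower.length === 2) {
      result.push(lower.toUpperCase());
    } else if (lower.length === 4) {
      result.push(lower.charAt(0).toUpperCase() + lower.slice(1));
    } else {
      result.push(lower);
    }
    // Once a singleton (such as `x` or `u`) appears, later subtags stay lowercase.
    if (lower.length === 1) afterSingleton = true;
  });

  return result.join('-');
}

normalizeCase('EN-latn-us'); // 'en-Latn-US'
normalizeCase('en-ca-x-ca'); // 'en-CA-x-ca'
```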

### filter

Find all matching tags with subtag-aware filtering per [RFC 4647 §3.3.2](https://datatracker.ietf.org/doc/html/rfc4647#section-3.3.2):

```ts
import { filter } from 'bcp47';

const tags = ['de', 'de-DE', 'de-Latn-DE', 'de-AT', 'en-US', 'fr-FR'];

filter(tags, 'de-DE'); // ['de-DE', 'de-Latn-DE'] (skips Latn to match DE)
filter(tags, 'de'); // ['de', 'de-DE', 'de-Latn-DE', 'de-AT'] (all German)
filter(tags, '*-DE'); // ['de-DE', 'de-Latn-DE'] (* wildcard = any language)
```
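
The "skips Latn to match DE" behavior comes from the extended filtering algorithm in RFC 4647 §3.3.2. Below is a minimal sketch of that algorithm for a single range, written from the RFC rather than from the package source. `matchesExtendedRange` and `filterTags` are hypothetical names.

```ts
// Extended filtering (RFC 4647 §3.3.2) for one tag against one range.
function matchesExtendedRange(tag: string, range: string): boolean {
  const tagParts = tag.toLowerCase().split('-');
  const rangeParts = range.toLowerCase().split('-');

  // The first subtags must match unless the range starts with a wildcard.
  if (rangeParts[0] !== '*' && rangeParts[0] !== tagParts[0]) return false;

  let t = 1;
  let r = 1;
  while (r < rangeParts.length) {
    const rangePart = rangeParts[r];
    if (rangePart === '*') {
      r++; // A wildcard matches any run of subtags, including none.
      continue;
    }
    const tagPart = tagParts[t];
    if (tagPart === undefined) return false; // Tag ran out before the range did.
    if (tagPart === rangePart) {
      t++;
      r++;
    } else if (tagPart.length === 1) {
      return false; // Never skip past a singleton (extension or private use).
    } else {
      t++; // Skip a non-matching subtag in the tag, e.g. `Latn`.
    }
  }
  return true;
}

function filterTags(tags: string[], range: string): string[] {
  return tags.filter((tag) => matchesExtendedRange(tag, range));
}
```

Because singletons are never skipped, a range such as `de-DE` does not match `de-x-DE`.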

### lookup

Find the single best match via progressive truncation per [RFC 4647 §3.4](https://datatracker.ietf.org/doc/html/rfc4647#section-3.4):

```ts
import { lookup } from 'bcp47';

const tags = ['en', 'en-US', 'fr', 'de'];

lookup(tags, 'en-US-x-custom'); // 'en-US' (truncates to match)
lookup(tags, 'fr-CA'); // 'fr' (truncates to match)
lookup(tags, 'ja', 'en'); // 'en' (default fallback)
```
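
Progressive truncation means dropping subtags from the end of the range until something matches. Below is a minimal sketch of the RFC 4647 §3.4 procedure, written from the RFC rather than from the package source. `lookupTag` is a hypothetical name, and it always takes an array of ranges for simplicity.

```ts
// Lookup (RFC 4647 §3.4): try each range in priority order, truncating from the end.
function lookupTag(
  available: string[],
  ranges: string[],
  fallback: string | null = null,
): string | null {
  // Compare case-insensitively but return the tag as the caller spelled it.
  const byLowercase = new Map<string, string>();
  for (const tag of available) byLowercase.set(tag.toLowerCase(), tag);

  for (const range of ranges) {
    if (range === '*') continue; // The wildcard range is skipped during lookup.
    let parts = range.toLowerCase().split('-');
    while (parts.length > 0) {
      const hit = byLowercase.get(parts.join('-'));
      if (hit !== undefined) return hit;
      parts = parts.slice(0, -1);
      // A trailing singleton (such as `x`) is removed together with its content.
      if (parts.length > 0 && (parts[parts.length - 1] ?? '').length === 1) {
        parts = parts.slice(0, -1);
      }
    }
  }
  return fallback;
}
```

For `en-US-x-custom` the candidates tried are `en-US-x-custom`, then `en-US` (the dangling `x` is dropped along with `custom`), then `en`.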

Pair with `acceptLanguage` for HTTP content negotiation:

```ts
import { acceptLanguage, lookup } from 'bcp47';

const prefs = acceptLanguage('fr-CA, en-US;q=0.8, en;q=0.5');
const best = lookup(['en', 'en-US', 'fr', 'fr-CA'], prefs.map((p) => p.tag));
// 'fr-CA'
```

### extensionU / extensionT

Extract Unicode locale and transformed content extensions:

```ts
import { parse, extensionU, extensionT } from 'bcp47';

extensionU(parse('de-DE-u-co-phonebk-ca-buddhist')!);
// { attributes: [], keywords: { co: 'phonebk', ca: 'buddhist' } }

extensionT(parse('und-t-it-m0-ungegn')!);
// { source: 'it', fields: { m0: 'ungegn' } }
```
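
To show where the attributes and keywords come from, here is a minimal sketch that splits the `u` extension of a well-formed tag by subtag length: two-character subtags are keys, longer subtags are either attributes (before the first key) or parts of the current key's value. It works on the raw string instead of a parsed tag, and `extractUnicodeExtension` is a hypothetical helper, not the package's `extensionU`. Storing an empty string for a key without a value is a choice made for this sketch.

```ts
interface UnicodeExtension {
  attributes: string[];
  keywords: Record<string, string>;
}

function extractUnicodeExtension(tag: string): UnicodeExtension | null {
  const subtags = tag.toLowerCase().split('-');

  // Find the `u` singleton; anything after `x` is private use and is ignored.
  let start = -1;
  for (let i = 0; i < subtags.length; i++) {
    if (subtags[i] === 'x') break;
    if (subtags[i] === 'u') {
      start = i + 1;
      break;
    }
  }
  if (start === -1) return null;

  const attributes: string[] = [];
  const keywords: Record<string, string> = {};
  let currentKey: string | null = null;

  for (let i = start; i < subtags.length; i++) {
    const subtag = subtags[i] ?? '';
    if (subtag.length === 1) break; // The next singleton ends the `u` extension.
    if (subtag.length === 2) {
      currentKey = subtag;
      keywords[currentKey] = '';
    } else if (currentKey === null) {
      attributes.push(subtag);
    } else {
      const previous = keywords[currentKey] ?? '';
      keywords[currentKey] = previous === '' ? subtag : `${previous}-${subtag}`;
    }
  }
  return { attributes, keywords };
}
```

Multi-part values are joined with hyphens, so `en-u-ca-islamic-civil` yields `ca: 'islamic-civil'` in this sketch.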

### acceptLanguage

Parse HTTP `Accept-Language` headers:

```ts
import { acceptLanguage } from 'bcp47';

acceptLanguage('fr-CA, en-US;q=0.8, en;q=0.5, *;q=0.1');
// [
//   { tag: 'fr-CA', quality: 1.0 },
//   { tag: 'en-US', quality: 0.8 },
//   { tag: 'en', quality: 0.5 },
//   { tag: '*', quality: 0.1 }
// ]
```
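
The header format itself is simple: comma-separated language ranges, each with an optional `q` weight that defaults to 1. The following is a minimal sketch of such a parser under those rules from RFC 9110 §12.5.4. `parseAcceptLanguage` and `LanguagePreference` are hypothetical names, and ignoring malformed `q` parameters is a simplification made for this sketch, not documented package behavior.

```ts
interface LanguagePreference {
  tag: string;
  quality: number;
}

function parseAcceptLanguage(header: string): LanguagePreference[] {
  const entries: LanguagePreference[] = [];

  for (const item of header.split(',')) {
    const [rawTag = '', ...params] = item.split(';').map((part) => part.trim());
    if (rawTag === '') continue; // Skip empty list members.

    let quality = 1; // A missing weight means q=1.
    for (const param of params) {
      const match = /^q=([01](?:\.\d{0,3})?)$/i.exec(param);
      // Malformed q parameters are ignored here; values above 1 are clamped.
      if (match) quality = Math.min(1, Number(match[1]));
    }
    entries.push({ tag: rawTag, quality });
  }

  // Highest weight first; Array.prototype.sort is stable, so ties keep header order.
  return entries.sort((a, b) => b.quality - a.quality);
}
```

The resulting `tag` values can then be passed, in order, to a lookup or filter step.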

See the [`examples/`](./examples) folder for more usage patterns.

## Operator Reference

| Operator | Description |
|----------|-------------|
| `parse(tag)` | Parse a BCP 47 tag string into a structured object. Returns `null` for invalid input. |
| `stringify(tag)` | Convert a parsed tag object back into a well-formed string. |
| `langtag(language, options?)` | Create a langtag object with sensible defaults. Throws `RangeError` on invalid input. |
| `canonicalize(tag)` | Normalize casing, sort extensions, apply [IANA registry](https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry) mappings (deprecated subtags, suppress-script, extlang). Returns `null` for invalid input. |
| `filter(tags, patterns)` | Subtag-aware filtering with `*` wildcard support per RFC 4647 §3.3.2. Returns matched tags. |
| `lookup(tags, preferences, defaultValue?)` | Lookup per RFC 4647 §3.4. Returns first match or `defaultValue`/`null`. |
| `extensionU(tag)` | Extract Unicode locale attributes and keywords from the `u` extension. Takes a `BCP47Tag` (not a string). Returns `null` if absent. |
| `extensionT(tag)` | Extract transformed content data from the `t` extension. Takes a `BCP47Tag` (not a string). Returns `null` if absent. |
| `acceptLanguage(header)` | Parse an `Accept-Language` header into sorted `{ tag, quality }` entries. |

## CLDR Key References

Typed constants mapping extension keys to human-readable descriptions, sourced from the [CLDR BCP 47 data](https://github.com/unicode-org/cldr/tree/main/common/bcp47). Zero runtime cost: tree-shaken if unused.

| Constant | Description |
|----------|-------------|
| `UNICODE_LOCALE_KEYS` | U extension keys → descriptions (e.g. `ca` → `'Calendar'`, `nu` → `'Numbering system'`) |
| `TRANSFORM_KEYS` | T extension keys → descriptions (e.g. `m0` → `'Transform mechanism'`, `s0` → `'Transform source'`) |

## Choosing an Operator

| I want to... | Use |
|--------------|-----|
| Validate a language tag string | `parse(tag) !== null` |
| Read subtags (language, script, region) | `parse(tag)` → access `.langtag.*` |
| Build a tag from known parts | `langtag(language, options)` → `stringify(tag)` |
| Normalize casing and deprecated subtags | `canonicalize(tag)` |
| Read Unicode locale preferences (calendar, collation) | `parse(tag)` → `extensionU(parsedTag)` |
| Read transformed content metadata (source language) | `parse(tag)` → `extensionT(parsedTag)` |
| Find all locales matching a preference | `filter(tags, patterns)` |
| Pick the single best locale for a user | `lookup(tags, preferences, defaultValue)` |
| Parse an HTTP Accept-Language header | `acceptLanguage(header)` → `lookup()` or `filter()` |

> **Note:** `canonicalize` and `acceptLanguage` take strings. `extensionU` and `extensionT` take a pre-parsed `BCP47Tag` from `parse()`. This avoids re-parsing when you need multiple operations on the same tag.

## Changelog

See [CHANGELOG.md](./CHANGELOG.md) for breaking changes and release history.

## License

[MIT](./LICENSE)