chinese-characters-decomposition 0.0.0

Files changed (5)
  1. package/README.md +81 -0
  2. package/ccd.json +254030 -0
  3. package/index.d.ts +51 -0
  4. package/index.js +1 -0
  5. package/package.json +64 -0
package/README.md ADDED
@@ -0,0 +1,81 @@
+ # chinese-characters-decomposition
+
+ Typed JSON dataset of Chinese character decompositions (CCD), with a parser
+ for the source TSV.
+
+ The data comes from the [Wikimedia Commons CCD table][ccd] (21,169 entries),
+ which lists each character together with its stroke count, structural
+ composition type, sub-components, Cangjie signature, and Kangxi radical.
+
+ [ccd]: https://commons.wikimedia.org/wiki/Commons:Chinese_characters_decomposition
+
+ ## Install
+
+ ```sh
+ npm install chinese-characters-decomposition
+ ```
+
+ The package ships a compiled ESM bundle (`index.js`) with TypeScript
+ declarations (`index.d.ts`), so it works in any modern Node.js, TypeScript,
+ or bundled environment. If you only need the raw data, import
+ `chinese-characters-decomposition/ccd.json` to skip the JS wrapper.
+
+ ## Usage
+
+ ```ts
+ import { ccd, type CcdEntry } from "chinese-characters-decomposition";
+
+ const entry: CcdEntry | undefined = ccd.find((e) => e.component === "好");
+ // {
+ // component: "好",
+ // strokes: 6,
+ // compositionType: "吅",
+ // leftComponent: "女",
+ // leftStrokes: 3,
+ // rightComponent: "子",
+ // rightStrokes: 3,
+ // signature: "VND",
+ // notes: "/",
+ // section: "女",
+ // }
+ ```
+
+ Or import the raw JSON directly:
+
+ ```ts
+ import dataset from "chinese-characters-decomposition/ccd.json";
+ ```
+
+ ## Entry shape
+
+ | Field | Type | Description |
+ | ----------------- | ---------------- | -------------------------------------------------------------- |
+ | `component` | `string` | The character being decomposed. |
+ | `strokes` | `number` | Total stroke count. |
+ | `compositionType` | `string` | Structure indicator (`一`, `吕`, `回`, `吅`, `咒`, …, or `*`). |
+ | `leftComponent` | `string \| null` | First sub-component (`null` when atomic). |
+ | `leftStrokes` | `number` | Stroke count of the first sub-component. |
+ | `rightComponent` | `string \| null` | Second sub-component (`null` when atomic). |
+ | `rightStrokes` | `number` | Stroke count of the second sub-component. |
+ | `signature` | `string` | Cangjie-style code (may be empty). |
+ | `notes` | `string` | Free-form source notes (`/`, `*/`, `?/`, …). |
+ | `section` | `string \| null` | Kangxi radical the character is filed under. |
+
+ In the source TSV, missing/atomic sub-components and sections are encoded as
+ `*`; this parser converts them to `null`.
+
+ ## Regenerating `ccd.json` and the bundle
+
+ `scripts/parse.ts` rebuilds `ccd.json` from `data/ccd.tsv`, and `vp pack`
+ bundles `src/index.ts` into `index.js` + `index.d.ts`. Both run
+ automatically on `npm publish` via `prepublishOnly`.
+
+ ```sh
+ bun run parse # regenerate ccd.json only
+ bun run build # regenerate ccd.json and rebuild index.js + index.d.ts
+ ```
+
+ ## License
+
+ MIT. The underlying CCD table is distributed under the terms of the
+ [Wikimedia Commons][ccd] project.
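
The README above says the parser turns `*` in the source TSV into `null` for sub-components and sections. The following is a minimal sketch of that conversion, not the package's actual `scripts/parse.ts`. It assumes one tab-separated row per character with columns in the same order as the entry-shape table; `Entry`, `starToNull`, and `parseRow` are hypothetical names, and the second sample row is synthetic, made up only to exercise the `*` case.

```typescript
// Hypothetical mirror of the documented entry shape; the real package
// exports its own `CcdEntry` type.
interface Entry {
  component: string;
  strokes: number;
  compositionType: string;
  leftComponent: string | null;
  leftStrokes: number;
  rightComponent: string | null;
  rightStrokes: number;
  signature: string;
  notes: string;
  section: string | null;
}

// "*" marks a missing or atomic value in the source; map it to null.
const starToNull = (value: string): string | null =>
  value === "*" ? null : value;

// Assumption: exactly ten tab-separated columns, in entry-shape order.
function parseRow(line: string): Entry {
  const cols = line.split("\t");
  if (cols.length !== 10) {
    throw new Error(`expected 10 columns, got ${cols.length}`);
  }
  return {
    component: cols[0],
    strokes: Number.parseInt(cols[1], 10),
    compositionType: cols[2], // kept as-is, "*" included, per the table
    leftComponent: starToNull(cols[3]),
    leftStrokes: Number.parseInt(cols[4], 10),
    rightComponent: starToNull(cols[5]),
    rightStrokes: Number.parseInt(cols[6], 10),
    signature: cols[7],
    notes: cols[8],
    section: starToNull(cols[9]),
  };
}

const rows = [
  // Built from the field values shown in the README's usage example.
  "好\t6\t吅\t女\t3\t子\t3\tVND\t/\t女",
  // Synthetic row (not real CCD data) with "*" placeholders.
  ["X", "1", "*", "*", "0", "*", "0", "", "/", "*"].join("\t"),
].map(parseRow);

// Index by character so repeated lookups avoid a linear `find`.
const byChar = new Map<string, Entry>(
  rows.map((e): [string, Entry] => [e.component, e]),
);

console.log(byChar.get("好")?.rightComponent); // prints "子"
console.log(byChar.get("X")?.leftComponent); // prints null
```

The same `Map` index would presumably work over the package's exported `ccd` array as well, which may be worthwhile when looking up many characters, since `ccd.find` scans the dataset on every call.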