micromark-extension-cjk-friendly-util 1.0.0 → 1.1.0

package/README.md CHANGED
@@ -1,6 +1,6 @@
  # micromark-extension-cjk-friendly-util
 
- [![Version](https://img.shields.io/npm/v/micromark-extension-cjk-friendly-util)](https://npmjs.com/package/micromark-extension-cjk-friendly-util) [![NPM Downloads](https://img.shields.io/npm/dw/micromark-extension-cjk-friendly-util)](https://npmjs.com/package/micromark-extension-cjk-friendly-util) [![NPM Last Update](https://img.shields.io/npm/last-update/micromark-extension-cjk-friendly-util)](https://npmjs.com/package/micromark-extension-cjk-friendly-util)
+ [![Version](https://img.shields.io/npm/v/micromark-extension-cjk-friendly-util)](https://npmjs.com/package/micromark-extension-cjk-friendly-util) ![Node Current](https://img.shields.io/node/v/micromark-extension-cjk-friendly-util) [![NPM Downloads](https://img.shields.io/npm/dm/micromark-extension-cjk-friendly-util)](https://npmjs.com/package/micromark-extension-cjk-friendly-util) [![NPM Last Update](https://img.shields.io/npm/last-update/micromark-extension-cjk-friendly-util)](https://npmjs.com/package/micromark-extension-cjk-friendly-util)
 
  A utility library package for [micromark-extension-cjk-friendly](https://npmjs.com/package/micromark-extension-cjk-friendly), which is internally used by [remark-cjk-friendly](https://npmjs.com/package/remark-cjk-friendly), and its related packages.
 
@@ -42,21 +42,21 @@ CommonMark issue: https://github.com/commonmark/commonmark-spec/issues/650
 
  ## Runtime Requirements / <span lang="ja">実行環境の要件</span> / <span lang="zh-Hans-CN">运行环境要求</span> / <span lang="ko">실행 환경 요구 사항</span>
 
- This package uses the [`v` flag of the regular expression](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicodeSets) introduced in ES2024 to determine whether the character is an emoji or not.
+ This package is ESM-only. It requires Node.js 16 or later.
 
- <span lang="ja">本パッケージは文字が絵文字かどうかを判定するために、ES2024で導入された[正規表現の`v`フラグ](https://developer.mozilla.org/ja/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicodeSets)を使用しています。</span>
+ <span lang="ja">本パッケージはESM専用です。Node.js 16以上が必要です。</span>
 
- <span lang="zh-CN">本包使用 ES2024 引入的[`v` 标志](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicodeSets) 来判断字符是否为 emoji。</span>
+ <span lang="zh-Hans-CN">此包仅支持ESM。需要Node.js 16或更高版本。</span>
 
- <span lang="ko">이 패키지는 ES2024에서 도입된 [`v` 플래그](https://developer.mozilla.org/ko/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicodeSets)를 사용하여 문자가 이모지인지 여부를 판단합니다.</span>
+ <span lang="ko">이 패키지는 ESM 전용입니다. Node.js 16 이상이 필요합니다.</span>
 
- It makes this package compatible only with relatively recent browsers and Node.js:
+ This package uses the [`v` flag for regular expressions](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicodeSets) introduced in ES2024, when available, to determine whether a character is an emoji. In the compatible environments listed below, emoji detection follows the Unicode version supported by the runtime. Otherwise, it falls back to a snapshot taken as of Unicode 16.
 
- <span lang="ja">このため、本パッケージは、次のような比較的新しいブラウザやNode.jsでしか動作しません。</span>
+ <span lang="ja">本パッケージは文字が絵文字かどうかを判定するために、ES2024で導入された[正規表現の`v`フラグ](https://developer.mozilla.org/ja/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicodeSets)が利用可能であれば使用します。以下の対応環境の場合、ランタイムが対応しているUnicodeバージョンに準拠します。それ以外の場合、Unicode 16時点のスナップショットにフォールバックします。</span>
 
- <span lang="zh-CN">因此,本包只兼容比较新的浏览器和 Node.js:</span>
+ <span lang="zh-Hans-CN">本包使用ES2024引入的[正则表达式`v`标志](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicodeSets)(如果可用)来判断字符是否为表情符号。在以下兼容环境中,将遵循运行时支持的Unicode版本。否则,将回退到Unicode 16的快照。</span>
 
- <span lang="ko">따라서, 패키지는 비교적 최신 브라우저와 Node.js에서만 작동합니다.</span>
+ <span lang="ko">이 패키지는 문자가 이모지인지 판단하기 위해 ES2024에서 도입된 [정규표현식 `v` 플래그](https://developer.mozilla.org/ko/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicodeSets)를 사용할 수 있다면 사용합니다. 다음 호환 환경에서는 런타임이 지원하는 Unicode 버전을 따릅니다. 그렇지 않은 경우, Unicode 16 시점의 스냅샷으로 폴백합니다.</span>
 
  - Chrome / Edge 112 or later
  - Firefox 116 or later
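The fallback described in the new README text can be sketched as a feature probe: constructing a `RegExp` with the `v` flag throws a `SyntaxError` on runtimes that predate it. The names `supportsVFlag`, `makeEmojiTest`, and `isEmojiLike` are hypothetical, and the two-range fallback table is only a stand-in for the full Unicode 16 snapshot that the package ships.

```javascript
// Hypothetical helper: probe for the ES2024 `v` (unicodeSets) flag.
function supportsVFlag() {
  try {
    new RegExp("^\\p{RGI_Emoji}", "v");
    return true;
  } catch (e) {
    // Engines without the flag reject it with a SyntaxError.
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// Hypothetical factory: choose the runtime-backed test when possible,
// otherwise a static table (only two sample ranges are listed here).
function makeEmojiTest() {
  if (supportsVFlag()) {
    const regex = new RegExp("^\\p{RGI_Emoji}", "v");
    return (cp) => regex.test(String.fromCodePoint(cp));
  }
  return (cp) =>
    (0x231a <= cp && cp <= 0x231b) || (0x1f600 <= cp && cp <= 0x1f64f);
}

const isEmojiLike = makeEmojiTest();
console.log(supportsVFlag(), isEmojiLike(0x1f600), isEmojiLike(0x41));
```

Probing once and caching the chosen function keeps the per-character cost to a single call, which matters because the check runs for every character next to an emphasis delimiter.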
@@ -100,7 +100,7 @@ This package provides a function and a namespace based on the original micromark
  | `classifyCharacter` | function | [micromark-util-character](https://npmjs.com/package/micromark-util-character) | (same) | Tells whether a character is not only a punctuation or whitespace but also a CJK or variation selector |
  | `constantsEx` | namespace | [micromark-util-symbol](https://npmjs.com/package/micromark-util-symbol) | `constants` | Constants meaning CJK and variation selectors; use it and the original `constants` together. |
 
- Also, this package provides some utility functions to check whether a character belongs to the category defined in the specification (e.g. CJK code point without variation selector), or to help you fetch the Unicode Code Point of a character around the emphasis mark.
+ Also, this package provides some utility functions to check whether a character belongs to the category defined in the specification (e.g. CJK character), or to help you fetch the Unicode Code Point of a character around the emphasis mark.
 
  ## Specification / <span lang="ja">規格書</span> / <span lang="zh-Hans-CN">规范</span> / <span lang="ko">규정서</span>
 
@@ -108,11 +108,11 @@ https://github.com/tats-u/markdown-cjk-friendly/blob/main/specification.md (Engl
 
  ## Related packages / <span lang="ja">関連パッケージ</span> / <span lang="zh-Hans-CN">相关包</span> / <span lang="ko">관련 패키지</span>
 
- - [micromark-extension-cjk-friendly](https://npmjs.com/package/micromark-extension-cjk-friendly) [![Version](https://img.shields.io/npm/v/micromark-extension-cjk-friendly)](https://npmjs.com/package/micromark-extension-cjk-friendly) [![NPM Downloads](https://img.shields.io/npm/dw/micromark-extension-cjk-friendly)](https://npmjs.com/package/micromark-extension-cjk-friendly) [![NPM Last Update](https://img.shields.io/npm/last-update/micromark-extension-cjk-friendly)](https://npmjs.com/package/micromark-extension-cjk-friendly)
- - [remark-cjk-friendly](https://npmjs.com/package/remark-cjk-friendly) [![Version](https://img.shields.io/npm/v/remark-cjk-friendly)](https://npmjs.com/package/remark-cjk-friendly) [![NPM Downloads](https://img.shields.io/npm/dw/remark-cjk-friendly)](https://npmjs.com/package/remark-cjk-friendly) [![NPM Last Update](https://img.shields.io/npm/last-update/remark-cjk-friendly)](https://npmjs.com/package/remark-cjk-friendly)
- - [markdown-it-cjk-friendly](https://npmjs.com/package/markdown-it-cjk-friendly) [![Version](https://img.shields.io/npm/v/markdown-it-cjk-friendly)](https://npmjs.com/package/markdown-it-cjk-friendly) [![NPM Downloads](https://img.shields.io/npm/dw/markdown-it-cjk-friendly)](https://npmjs.com/package/markdown-it-cjk-friendly) [![NPM Last Update](https://img.shields.io/npm/last-update/markdown-it-cjk-friendly)](https://npmjs.com/package/markdown-it-cjk-friendly)
- - [remark-cjk-friendly](https://npmjs.com/package/remark-cjk-friendly) [![Version](https://img.shields.io/npm/v/remark-cjk-friendly)](https://npmjs.com/package/remark-cjk-friendly) [![NPM Downloads](https://img.shields.io/npm/dw/remark-cjk-friendly)](https://npmjs.com/package/remark-cjk-friendly) [![NPM Last Update](https://img.shields.io/npm/last-update/remark-cjk-friendly)](https://npmjs.com/package/remark-cjk-friendly)
- - [micromark-extension-cjk-friendly-gfm-strikethrough](https://npmjs.com/package/micromark-extension-cjk-friendly-gfm-strikethrough) [![Version](https://img.shields.io/npm/v/micromark-extension-cjk-friendly-gfm-strikethrough)](https://npmjs.com/package/micromark-extension-cjk-friendly-gfm-strikethrough) [![NPM Downloads](https://img.shields.io/npm/dw/micromark-extension-cjk-friendly-gfm-strikethrough)](https://npmjs.com/package/micromark-extension-cjk-friendly-gfm-strikethrough) [![NPM Last Update](https://img.shields.io/npm/last-update/micromark-extension-cjk-friendly-gfm-strikethrough)](https://npmjs.com/package/micromark-extension-cjk-friendly-gfm-strikethrough)
+ - [micromark-extension-cjk-friendly](https://npmjs.com/package/micromark-extension-cjk-friendly) [![Version](https://img.shields.io/npm/v/micromark-extension-cjk-friendly)](https://npmjs.com/package/micromark-extension-cjk-friendly) ![Node Current](https://img.shields.io/node/v/micromark-extension-cjk-friendly) [![NPM Downloads](https://img.shields.io/npm/dm/micromark-extension-cjk-friendly)](https://npmjs.com/package/micromark-extension-cjk-friendly) [![NPM Last Update](https://img.shields.io/npm/last-update/micromark-extension-cjk-friendly)](https://npmjs.com/package/micromark-extension-cjk-friendly)
+ - [remark-cjk-friendly](https://npmjs.com/package/remark-cjk-friendly) [![Version](https://img.shields.io/npm/v/remark-cjk-friendly)](https://npmjs.com/package/remark-cjk-friendly) ![Node Current](https://img.shields.io/node/v/remark-cjk-friendly) [![NPM Downloads](https://img.shields.io/npm/dm/remark-cjk-friendly)](https://npmjs.com/package/remark-cjk-friendly) [![NPM Last Update](https://img.shields.io/npm/last-update/remark-cjk-friendly)](https://npmjs.com/package/remark-cjk-friendly)
+ - [markdown-it-cjk-friendly](https://npmjs.com/package/markdown-it-cjk-friendly) [![Version](https://img.shields.io/npm/v/markdown-it-cjk-friendly)](https://npmjs.com/package/markdown-it-cjk-friendly) ![Node Current](https://img.shields.io/node/v/markdown-it-cjk-friendly) [![NPM Downloads](https://img.shields.io/npm/dm/markdown-it-cjk-friendly)](https://npmjs.com/package/markdown-it-cjk-friendly) [![NPM Last Update](https://img.shields.io/npm/last-update/markdown-it-cjk-friendly)](https://npmjs.com/package/markdown-it-cjk-friendly)
+ - [remark-cjk-friendly](https://npmjs.com/package/remark-cjk-friendly) [![Version](https://img.shields.io/npm/v/remark-cjk-friendly)](https://npmjs.com/package/remark-cjk-friendly) ![Node Current](https://img.shields.io/node/v/remark-cjk-friendly) [![NPM Downloads](https://img.shields.io/npm/dm/remark-cjk-friendly)](https://npmjs.com/package/remark-cjk-friendly) [![NPM Last Update](https://img.shields.io/npm/last-update/remark-cjk-friendly)](https://npmjs.com/package/remark-cjk-friendly)
+ - [micromark-extension-cjk-friendly-gfm-strikethrough](https://npmjs.com/package/micromark-extension-cjk-friendly-gfm-strikethrough) [![Version](https://img.shields.io/npm/v/micromark-extension-cjk-friendly-gfm-strikethrough)](https://npmjs.com/package/micromark-extension-cjk-friendly-gfm-strikethrough) ![Node Current](https://img.shields.io/node/v/micromark-extension-cjk-friendly-gfm-strikethrough) [![NPM Downloads](https://img.shields.io/npm/dm/micromark-extension-cjk-friendly-gfm-strikethrough)](https://npmjs.com/package/micromark-extension-cjk-friendly-gfm-strikethrough) [![NPM Last Update](https://img.shields.io/npm/last-update/micromark-extension-cjk-friendly-gfm-strikethrough)](https://npmjs.com/package/micromark-extension-cjk-friendly-gfm-strikethrough)
 
  ## Contributing / <span lang="ja">貢献</span> / <span lang="zh-Hans-CN">贡献</span> / <span lang="ko">기여</span>
 
@@ -1,4 +1,7 @@
- import { type classifyCharacter } from "./classifyCharacter.js";
+ import { classifyCharacter } from './classifyCharacter.js';
+ import 'micromark-util-symbol';
+ import 'micromark-util-types';
+
  type Category = ReturnType<typeof classifyCharacter>;
  /**
  * `true` if the code point represents a [Unicode whitespace character](https://spec.commonmark.org/0.31.2/#unicode-whitespace-character).
@@ -6,40 +9,41 @@ type Category = ReturnType<typeof classifyCharacter>;
  * @param category the return value of `classifyCharacter`.
  * @returns `true` if the code point represents a Unicode whitespace character
  */
- export declare function isUnicodeWhitespace(category: Category): boolean;
+ declare function isUnicodeWhitespace(category: Category): boolean;
  /**
  * `true` if the code point represents a [non-CJK punctuation character](https://github.com/tats-u/markdown-cjk-friendly/blob/main/specification.md#non-cjk-punctuation-character).
  *
  * @param category the return value of `classifyCharacter`.
  * @returns `true` if the code point represents a non-CJK punctuation character
  */
- export declare function isNonCjkPunctuation(category: Category): boolean;
+ declare function isNonCjkPunctuation(category: Category): boolean;
  /**
- * `true` if the code point represents a [CJK character (CJK code point without variation selector)](https://github.com/tats-u/markdown-cjk-friendly/blob/main/specification.md#cjk-code-point-without-variation-selector).
+ * `true` if the code point represents a [CJK character](https://github.com/tats-u/markdown-cjk-friendly/blob/main/specification.md#cjk-character).
  *
  * @param category the return value of `classifyCharacter`.
  * @returns `true` if the code point represents a CJK character
  */
- export declare function isCjk(category: Category): boolean;
+ declare function isCjk(category: Category): boolean;
  /**
- * `true` if the code point represents an [IVS (Ideographic Variation Selector)](https://github.com/tats-u/markdown-cjk-friendly/blob/main/specification.md#ivs).
+ * `true` if the code point represents an [Ideographic Variation Selector](https://github.com/tats-u/markdown-cjk-friendly/blob/main/specification.md#ideographi-variation-selector).
  *
  * @param category the return value of `classifyCharacter`.
  * @returns `true` if the code point represents an IVS
  */
- export declare function isIvs(category: Category): boolean;
+ declare function isIvs(category: Category): boolean;
  /**
- * `true` if the code point represents a [SVS (Standard Variation Selector/Sequence) that can follow CJK](https://github.com/tats-u/markdown-cjk-friendly/blob/main/specification.md#svs-that-can-follow-cjk).
+ * `true` if the code point represents a [Standard Variation Selector that can follow CJK](https://github.com/tats-u/markdown-cjk-friendly/blob/main/specification.md#svs-that-can-follow-cjk).
  *
  * @param category the return value of `classifyCharacter`.
- * @returns `true` if the code point represents an SVS that can follow CJK
+ * @returns `true` if the code point represents a Standard Variation Selector that can follow CJK
  */
- export declare function isSvsFollowingCjk(category: Category): boolean;
+ declare function isSvsFollowingCjk(category: Category): boolean;
  /**
  * `true` if the code point represents a [Unicode whitespace character](https://spec.commonmark.org/0.31.2/#unicode-whitespace-character) or a [Unicode punctuation character](https://spec.commonmark.org/0.31.2/#unicode-punctuation-character).
  *
  * @param category the return value of `classifyCharacter`.
  * @returns `true` if the code point represents a space or punctuation
  */
- export declare function isSpaceOrPunctuation(category: Category): boolean;
- export {};
+ declare function isSpaceOrPunctuation(category: Category): boolean;
+
+ export { isCjk, isIvs, isNonCjkPunctuation, isSpaceOrPunctuation, isSvsFollowingCjk, isUnicodeWhitespace };
@@ -0,0 +1,44 @@
+ // src/categoryUtil.ts
+ import { constants as constants2 } from "micromark-util-symbol";
+
+ // src/classifyCharacter.ts
+ import { markdownLineEndingOrSpace } from "micromark-util-character";
+ import { constants, codes } from "micromark-util-symbol";
+ var constantsEx;
+ ((constantsEx2) => {
+   constantsEx2.spaceOrPunctuation = 3;
+   constantsEx2.cjk = 4096;
+   constantsEx2.cjkPunctuation = 4098;
+   constantsEx2.ivs = 8192;
+   constantsEx2.cjkOrIvs = 12288;
+   constantsEx2.svsFollowingCjk = 16384;
+   constantsEx2.variationSelector = 28672;
+ })(constantsEx || (constantsEx = {}));
+
+ // src/categoryUtil.ts
+ function isUnicodeWhitespace(category) {
+   return Boolean(category & constants2.characterGroupWhitespace);
+ }
+ function isNonCjkPunctuation(category) {
+   return (category & constantsEx.cjkPunctuation) === constants2.characterGroupPunctuation;
+ }
+ function isCjk(category) {
+   return Boolean(category & constantsEx.cjk);
+ }
+ function isIvs(category) {
+   return category === constantsEx.ivs;
+ }
+ function isSvsFollowingCjk(category) {
+   return category === constantsEx.svsFollowingCjk;
+ }
+ function isSpaceOrPunctuation(category) {
+   return Boolean(category & constantsEx.spaceOrPunctuation);
+ }
+ export {
+   isCjk,
+   isIvs,
+   isNonCjkPunctuation,
+   isSpaceOrPunctuation,
+   isSvsFollowingCjk,
+   isUnicodeWhitespace
+ };
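The predicates in this bundle treat the category as a bit field. A minimal sketch of the arithmetic follows, assuming that `characterGroupWhitespace` is 1 and `characterGroupPunctuation` is 2 in `micromark-util-symbol`; both values are hard-coded so the snippet runs without dependencies, and the local names `WHITESPACE`, `PUNCTUATION`, `ex`, and `hex` are illustrative only.

```javascript
// Assumed values of the original micromark constants.
const WHITESPACE = 1;  // constants.characterGroupWhitespace (assumption)
const PUNCTUATION = 2; // constants.characterGroupPunctuation (assumption)

// Values copied from the constantsEx assignments in the diff.
const ex = { spaceOrPunctuation: 3, cjk: 4096, cjkPunctuation: 4098, ivs: 8192, cjkOrIvs: 12288 };

const hex = (n) => "0x" + n.toString(16);
console.log(hex(ex.cjk), hex(ex.cjkPunctuation), hex(ex.ivs)); // 0x1000 0x1002 0x2000

// cjkPunctuation is the CJK bit combined with the punctuation bit.
console.log(ex.cjkPunctuation === (ex.cjk | PUNCTUATION)); // true
console.log(ex.cjkOrIvs === (ex.cjk | ex.ivs)); // true

// Same expressions as the bundled predicates, with the local constants.
const isNonCjkPunctuation = (category) => (category & ex.cjkPunctuation) === PUNCTUATION;
const isCjk = (category) => Boolean(category & ex.cjk);

console.log(isNonCjkPunctuation(PUNCTUATION)); // true: punctuation bit only
console.log(isNonCjkPunctuation(ex.cjkPunctuation)); // false: CJK bit is also set
console.log(isCjk(ex.cjkPunctuation)); // true
```

Masking with `cjkPunctuation` before comparing is what lets `isNonCjkPunctuation` reject CJK punctuation while still accepting ordinary punctuation.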
@@ -1,17 +1,18 @@
- import type { Code } from "micromark-util-types";
+ import { Code } from 'micromark-util-types';
+
  /**
  * Check if `uc` is CJK or IVS
  *
  * @param uc code point
  * @returns `true` if `uc` is CJK, `null` if IVS, or `false` if neither
  */
- export declare function cjkOrIvs(uc: Code): boolean | null;
+ declare function cjkOrIvs(uc: Code): boolean | null;
  /**
  * Check whether the character code represents a Standard Variation Sequence that can follow an ideographic character.
  *
  * U+FE0E is used for some CJK symbols (e.g. U+3299) that can also be
  */
- export declare const svsFollowingCjk: (code: Code) => boolean;
+ declare const svsFollowingCjk: (code: Code) => boolean;
  /**
  * Check whether the character code represents Unicode punctuation.
  *
@@ -31,7 +32,7 @@ export declare const svsFollowingCjk: (code: Code) => boolean;
  * @returns
  * Whether it matches.
  */
- export declare const unicodePunctuation: (code: Code) => boolean;
+ declare const unicodePunctuation: (code: Code) => boolean;
  /**
  * Check whether the character code represents Unicode whitespace.
  *
@@ -52,4 +53,6 @@ export declare const unicodePunctuation: (code: Code) => boolean;
  * @returns
  * Whether it matches.
  */
- export declare const unicodeWhitespace: (code: Code) => boolean;
+ declare const unicodeWhitespace: (code: Code) => boolean;
+
+ export { cjkOrIvs, svsFollowingCjk, unicodePunctuation, unicodeWhitespace };
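The declaration of `cjkOrIvs` allows three results, so a plain truthiness check would merge the IVS case (`null`) with the "neither" case (`false`). A minimal sketch of a caller that keeps them apart; `describeCjkOrIvs` is a hypothetical helper, not part of the package.

```javascript
// Hypothetical helper: map the tri-state result of `cjkOrIvs` to a label.
// Strict comparisons are used so that `null` is not confused with `false`.
function describeCjkOrIvs(result) {
  if (result === true) return "cjk";
  if (result === null) return "ivs";
  return "other";
}

console.log(describeCjkOrIvs(true));  // "cjk"
console.log(describeCjkOrIvs(null));  // "ivs"
console.log(describeCjkOrIvs(false)); // "other"
```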
@@ -0,0 +1,54 @@
+ // src/characterWithNonBmp.ts
+ import { eastAsianWidthType } from "get-east-asian-width";
+ var isEmoji = function(uc) {
+   if (this.fn !== null) {
+     return this.fn(uc);
+   }
+   try {
+     const regex = new RegExp("^\\p{RGI_Emoji}", "v");
+     this.fn = (uc_) => regex.test(String.fromCodePoint(uc_));
+   } catch (e) {
+     if (!(e instanceof SyntaxError)) {
+       throw e;
+     }
+ this.fn = (cp) => 8986 <= cp && cp <= 8987 || 9193 <= cp && cp <= 9196 || cp === 9200 || cp === 9203 || 9725 <= cp && cp <= 9726 || 9748 <= cp && cp <= 9749 || 9800 <= cp && cp <= 9811 || cp === 9855 || cp === 9875 || cp === 9889 || 9898 <= cp && cp <= 9899 || 9917 <= cp && cp <= 9918 || 9924 <= cp && cp <= 9925 || cp === 9934 || cp === 9940 || cp === 9962 || 9970 <= cp && cp <= 9971 || cp === 9973 || cp === 9978 || cp === 9981 || cp === 9989 || 9994 <= cp && cp <= 9995 || cp === 10024 || cp === 10060 || cp === 10062 || 10067 <= cp && cp <= 10069 || cp === 10071 || 10133 <= cp && cp <= 10135 || cp === 10160 || cp === 10175 || 11035 <= cp && cp <= 11036 || cp === 11088 || cp === 11093 || cp === 126980 || cp === 127183 || cp === 127374 || 127377 <= cp && cp <= 127386 || cp === 127489 || cp === 127514 || cp === 127535 || 127538 <= cp && cp <= 127542 || 127544 <= cp && cp <= 127546 || 127568 <= cp && cp <= 127569 || 127744 <= cp && cp <= 127756 || 127757 <= cp && cp <= 127758 || cp === 127759 || cp === 127760 || cp === 127761 || cp === 127762 || 127763 <= cp && cp <= 127765 || 127766 <= cp && cp <= 127768 || cp === 127769 || cp === 127770 || cp === 127771 || cp === 127772 || 127773 <= cp && cp <= 127774 || 127775 <= cp && cp <= 127776 || 127789 <= cp && cp <= 127791 || 127792 <= cp && cp <= 127793 || 127794 <= cp && cp <= 127795 || 127796 <= cp && cp <= 127797 || 127799 <= cp && cp <= 127818 || cp === 127819 || 127820 <= cp && cp <= 127823 || cp === 127824 || 127825 <= cp && cp <= 127867 || cp === 127868 || 127870 <= cp && cp <= 127871 || 127872 <= cp && cp <= 127891 || 127904 <= cp && cp <= 127940 || cp === 127941 || cp === 127942 || cp === 127943 || cp === 127944 || cp === 127945 || cp === 127946 || 127951 <= cp && cp <= 127955 || 127968 <= cp && cp <= 127971 || cp === 127972 || 127973 <= cp && cp <= 127984 || cp === 127988 || 127992 <= cp && cp <= 128007 || cp === 128008 || 128009 <= cp && cp <= 128011 || 128012 <= cp && cp <= 128014 || 128015 <= cp && cp <= 
128016 || 128017 <= cp && cp <= 128018 || cp === 128019 || cp === 128020 || cp === 128021 || cp === 128022 || 128023 <= cp && cp <= 128041 || cp === 128042 || 128043 <= cp && cp <= 128062 || cp === 128064 || 128066 <= cp && cp <= 128100 || cp === 128101 || 128102 <= cp && cp <= 128107 || 128108 <= cp && cp <= 128109 || 128110 <= cp && cp <= 128172 || cp === 128173 || 128174 <= cp && cp <= 128181 || 128182 <= cp && cp <= 128183 || 128184 <= cp && cp <= 128235 || 128236 <= cp && cp <= 128237 || cp === 128238 || cp === 128239 || 128240 <= cp && cp <= 128244 || cp === 128245 || 128246 <= cp && cp <= 128247 || cp === 128248 || 128249 <= cp && cp <= 128252 || 128255 <= cp && cp <= 128258 || cp === 128259 || 128260 <= cp && cp <= 128263 || cp === 128264 || cp === 128265 || 128266 <= cp && cp <= 128276 || cp === 128277 || 128278 <= cp && cp <= 128299 || 128300 <= cp && cp <= 128301 || 128302 <= cp && cp <= 128317 || 128331 <= cp && cp <= 128334 || 128336 <= cp && cp <= 128347 || 128348 <= cp && cp <= 128359 || cp === 128378 || 128405 <= cp && cp <= 128406 || cp === 128420 || 128507 <= cp && cp <= 128511 || cp === 128512 || 128513 <= cp && cp <= 128518 || 128519 <= cp && cp <= 128520 || 128521 <= cp && cp <= 128525 || cp === 128526 || cp === 128527 || cp === 128528 || cp === 128529 || 128530 <= cp && cp <= 128532 || cp === 128533 || cp === 128534 || cp === 128535 || cp === 128536 || cp === 128537 || cp === 128538 || cp === 128539 || 128540 <= cp && cp <= 128542 || cp === 128543 || 128544 <= cp && cp <= 128549 || 128550 <= cp && cp <= 128551 || 128552 <= cp && cp <= 128555 || cp === 128556 || cp === 128557 || 128558 <= cp && cp <= 128559 || 128560 <= cp && cp <= 128563 || cp === 128564 || cp === 128565 || cp === 128566 || 128567 <= cp && cp <= 128576 || 128577 <= cp && cp <= 128580 || 128581 <= cp && cp <= 128591 || cp === 128640 || 128641 <= cp && cp <= 128642 || 128643 <= cp && cp <= 128645 || cp === 128646 || cp === 128647 || cp === 128648 || cp === 128649 || 128650 <= cp 
&& cp <= 128651 || cp === 128652 || cp === 128653 || cp === 128654 || cp === 128655 || cp === 128656 || 128657 <= cp && cp <= 128659 || cp === 128660 || cp === 128661 || cp === 128662 || cp === 128663 || cp === 128664 || 128665 <= cp && cp <= 128666 || 128667 <= cp && cp <= 128673 || cp === 128674 || cp === 128675 || 128676 <= cp && cp <= 128677 || cp === 128678 || 128679 <= cp && cp <= 128685 || 128686 <= cp && cp <= 128689 || cp === 128690 || 128691 <= cp && cp <= 128693 || cp === 128694 || 128695 <= cp && cp <= 128696 || 128697 <= cp && cp <= 128702 || cp === 128703 || cp === 128704 || 128705 <= cp && cp <= 128709 || cp === 128716 || cp === 128720 || 128721 <= cp && cp <= 128722 || cp === 128725 || 128726 <= cp && cp <= 128727 || cp === 128732 || 128733 <= cp && cp <= 128735 || 128747 <= cp && cp <= 128748 || 128756 <= cp && cp <= 128758 || 128759 <= cp && cp <= 128760 || cp === 128761 || cp === 128762 || 128763 <= cp && cp <= 128764 || 128992 <= cp && cp <= 129003 || cp === 129008 || cp === 129292 || 129293 <= cp && cp <= 129295 || 129296 <= cp && cp <= 129304 || 129305 <= cp && cp <= 129310 || cp === 129311 || 129312 <= cp && cp <= 129319 || 129320 <= cp && cp <= 129327 || cp === 129328 || 129329 <= cp && cp <= 129330 || 129331 <= cp && cp <= 129338 || 129340 <= cp && cp <= 129342 || cp === 129343 || 129344 <= cp && cp <= 129349 || 129351 <= cp && cp <= 129355 || cp === 129356 || 129357 <= cp && cp <= 129359 || 129360 <= cp && cp <= 129374 || 129375 <= cp && cp <= 129387 || 129388 <= cp && cp <= 129392 || cp === 129393 || cp === 129394 || 129395 <= cp && cp <= 129398 || 129399 <= cp && cp <= 129400 || cp === 129401 || cp === 129402 || cp === 129403 || 129404 <= cp && cp <= 129407 || 129408 <= cp && cp <= 129412 || 129413 <= cp && cp <= 129425 || 129426 <= cp && cp <= 129431 || 129432 <= cp && cp <= 129442 || 129443 <= cp && cp <= 129444 || 129445 <= cp && cp <= 129450 || 129451 <= cp && cp <= 129453 || 129454 <= cp && cp <= 129455 || 129456 <= cp && cp <= 
129465 || 129466 <= cp && cp <= 129471 || cp === 129472 || 129473 <= cp && cp <= 129474 || 129475 <= cp && cp <= 129482 || cp === 129483 || cp === 129484 || 129485 <= cp && cp <= 129487 || 129488 <= cp && cp <= 129510 || 129511 <= cp && cp <= 129535 || 129648 <= cp && cp <= 129651 || cp === 129652 || 129653 <= cp && cp <= 129655 || 129656 <= cp && cp <= 129658 || 129659 <= cp && cp <= 129660 || 129664 <= cp && cp <= 129666 || 129667 <= cp && cp <= 129670 || 129671 <= cp && cp <= 129672 || cp === 129673 || cp === 129679 || 129680 <= cp && cp <= 129685 || 129686 <= cp && cp <= 129704 || 129705 <= cp && cp <= 129708 || 129709 <= cp && cp <= 129711 || 129712 <= cp && cp <= 129718 || 129719 <= cp && cp <= 129722 || 129723 <= cp && cp <= 129725 || cp === 129726 || cp === 129727 || 129728 <= cp && cp <= 129730 || 129731 <= cp && cp <= 129733 || cp === 129734 || 129742 <= cp && cp <= 129743 || 129744 <= cp && cp <= 129750 || 129751 <= cp && cp <= 129753 || 129754 <= cp && cp <= 129755 || cp === 129756 || cp === 129759 || 129760 <= cp && cp <= 129767 || cp === 129768 || cp === 129769 || 129776 <= cp && cp <= 129782 || 129783 <= cp && cp <= 129784;
+   }
+   return this.fn(uc);
+ }.bind({
+   fn: null
+ });
+ function cjkOrIvs(uc) {
+   if (!uc || uc < 0) {
+     return false;
+   }
+   const eaw = eastAsianWidthType(uc);
+   switch (eaw) {
+     case "fullwidth":
+     case "halfwidth":
+       return true;
+     // never be emoji
+     case "wide":
+       return !isEmoji(uc);
+     case "narrow":
+       return false;
+     case "ambiguous":
+       return 917760 <= uc && uc <= 917999 ? null : false;
+     case "neutral":
+       return /^\p{sc=Hangul}/u.test(String.fromCodePoint(uc));
+   }
+ }
+ var svsFollowingCjk = regexCheck(/[\uFE00-\uFE02\uFE0E]/u);
+ var unicodePunctuation = regexCheck(/\p{P}|\p{S}/u);
+ var unicodeWhitespace = regexCheck(/\s/);
+ function regexCheck(regex) {
+   return check;
+   function check(code) {
+     return code !== null && code > -1 && regex.test(String.fromCodePoint(code));
+   }
+ }
+ export {
+   cjkOrIvs,
+   svsFollowingCjk,
+   unicodePunctuation,
+   unicodeWhitespace
+ };
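Two details of this bundle are easier to see in isolation: the `regexCheck` guard that rejects `null` and negative codes before calling `String.fromCodePoint`, and the decimal bounds `917760` and `917999`, which are U+E0100 and U+E01EF in hexadecimal. The sketch below restates both without dependencies; `IVS_START`, `IVS_END`, and `isInIvsRange` are illustrative names that do not appear in the package.

```javascript
// Same shape as the bundled `regexCheck`: guard first, then test the regex.
function regexCheck(regex) {
  return function check(code) {
    // String.fromCodePoint throws a RangeError for negative input,
    // so null and negative codes are rejected before conversion.
    return code !== null && code > -1 && regex.test(String.fromCodePoint(code));
  };
}

const svsFollowingCjk = regexCheck(/[\uFE00-\uFE02\uFE0E]/u);
console.log(svsFollowingCjk(0xfe00)); // true
console.log(svsFollowingCjk(0xfe0f)); // false: U+FE0F is not in the class
console.log(svsFollowingCjk(null));   // false
console.log(svsFollowingCjk(-1));     // false

// Illustrative names for the range checked in the "ambiguous" branch.
const IVS_START = 0xe0100; // 917760
const IVS_END = 0xe01ef;   // 917999
const isInIvsRange = (cp) => IVS_START <= cp && cp <= IVS_END;
console.log(isInIvsRange(917760), isInIvsRange(918000)); // true false
```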
@@ -1,6 +1,7 @@
- import { constants } from "micromark-util-symbol";
- import type { Code } from "micromark-util-types";
- export declare namespace constantsEx {
+ import { constants } from 'micromark-util-symbol';
+ import { Code } from 'micromark-util-types';
+
+ declare namespace constantsEx {
  const spaceOrPunctuation: 3;
  const cjk: 4096;
  const cjkPunctuation: 4098;
@@ -23,4 +24,6 @@ export declare namespace constantsEx {
  * @returns
  * Group.
  */
- export declare function classifyCharacter(code: Code): typeof constants.characterGroupWhitespace | typeof constants.characterGroupPunctuation | typeof constantsEx.cjk | typeof constantsEx.cjkPunctuation | typeof constantsEx.ivs | typeof constantsEx.svsFollowingCjk | 0;
+ declare function classifyCharacter(code: Code): typeof constants.characterGroupWhitespace | typeof constants.characterGroupPunctuation | typeof constantsEx.cjk | typeof constantsEx.cjkPunctuation | typeof constantsEx.ivs | typeof constantsEx.svsFollowingCjk | 0;
+
+ export { classifyCharacter, constantsEx };
@@ -0,0 +1,90 @@
+ // src/classifyCharacter.ts
+ import { markdownLineEndingOrSpace } from "micromark-util-character";
+ import { constants, codes } from "micromark-util-symbol";
+
+ // src/characterWithNonBmp.ts
+ import { eastAsianWidthType } from "get-east-asian-width";
+ var isEmoji = function(uc) {
+   if (this.fn !== null) {
+     return this.fn(uc);
+   }
+   try {
+     const regex = new RegExp("^\\p{RGI_Emoji}", "v");
+     this.fn = (uc_) => regex.test(String.fromCodePoint(uc_));
+   } catch (e) {
+     if (!(e instanceof SyntaxError)) {
+       throw e;
+     }
+ this.fn = (cp) => 8986 <= cp && cp <= 8987 || 9193 <= cp && cp <= 9196 || cp === 9200 || cp === 9203 || 9725 <= cp && cp <= 9726 || 9748 <= cp && cp <= 9749 || 9800 <= cp && cp <= 9811 || cp === 9855 || cp === 9875 || cp === 9889 || 9898 <= cp && cp <= 9899 || 9917 <= cp && cp <= 9918 || 9924 <= cp && cp <= 9925 || cp === 9934 || cp === 9940 || cp === 9962 || 9970 <= cp && cp <= 9971 || cp === 9973 || cp === 9978 || cp === 9981 || cp === 9989 || 9994 <= cp && cp <= 9995 || cp === 10024 || cp === 10060 || cp === 10062 || 10067 <= cp && cp <= 10069 || cp === 10071 || 10133 <= cp && cp <= 10135 || cp === 10160 || cp === 10175 || 11035 <= cp && cp <= 11036 || cp === 11088 || cp === 11093 || cp === 126980 || cp === 127183 || cp === 127374 || 127377 <= cp && cp <= 127386 || cp === 127489 || cp === 127514 || cp === 127535 || 127538 <= cp && cp <= 127542 || 127544 <= cp && cp <= 127546 || 127568 <= cp && cp <= 127569 || 127744 <= cp && cp <= 127756 || 127757 <= cp && cp <= 127758 || cp === 127759 || cp === 127760 || cp === 127761 || cp === 127762 || 127763 <= cp && cp <= 127765 || 127766 <= cp && cp <= 127768 || cp === 127769 || cp === 127770 || cp === 127771 || cp === 127772 || 127773 <= cp && cp <= 127774 || 127775 <= cp && cp <= 127776 || 127789 <= cp && cp <= 127791 || 127792 <= cp && cp <= 127793 || 127794 <= cp && cp <= 127795 || 127796 <= cp && cp <= 127797 || 127799 <= cp && cp <= 127818 || cp === 127819 || 127820 <= cp && cp <= 127823 || cp === 127824 || 127825 <= cp && cp <= 127867 || cp === 127868 || 127870 <= cp && cp <= 127871 || 127872 <= cp && cp <= 127891 || 127904 <= cp && cp <= 127940 || cp === 127941 || cp === 127942 || cp === 127943 || cp === 127944 || cp === 127945 || cp === 127946 || 127951 <= cp && cp <= 127955 || 127968 <= cp && cp <= 127971 || cp === 127972 || 127973 <= cp && cp <= 127984 || cp === 127988 || 127992 <= cp && cp <= 128007 || cp === 128008 || 128009 <= cp && cp <= 128011 || 128012 <= cp && cp <= 128014 || 128015 <= cp && cp <= 
128016 || 128017 <= cp && cp <= 128018 || cp === 128019 || cp === 128020 || cp === 128021 || cp === 128022 || 128023 <= cp && cp <= 128041 || cp === 128042 || 128043 <= cp && cp <= 128062 || cp === 128064 || 128066 <= cp && cp <= 128100 || cp === 128101 || 128102 <= cp && cp <= 128107 || 128108 <= cp && cp <= 128109 || 128110 <= cp && cp <= 128172 || cp === 128173 || 128174 <= cp && cp <= 128181 || 128182 <= cp && cp <= 128183 || 128184 <= cp && cp <= 128235 || 128236 <= cp && cp <= 128237 || cp === 128238 || cp === 128239 || 128240 <= cp && cp <= 128244 || cp === 128245 || 128246 <= cp && cp <= 128247 || cp === 128248 || 128249 <= cp && cp <= 128252 || 128255 <= cp && cp <= 128258 || cp === 128259 || 128260 <= cp && cp <= 128263 || cp === 128264 || cp === 128265 || 128266 <= cp && cp <= 128276 || cp === 128277 || 128278 <= cp && cp <= 128299 || 128300 <= cp && cp <= 128301 || 128302 <= cp && cp <= 128317 || 128331 <= cp && cp <= 128334 || 128336 <= cp && cp <= 128347 || 128348 <= cp && cp <= 128359 || cp === 128378 || 128405 <= cp && cp <= 128406 || cp === 128420 || 128507 <= cp && cp <= 128511 || cp === 128512 || 128513 <= cp && cp <= 128518 || 128519 <= cp && cp <= 128520 || 128521 <= cp && cp <= 128525 || cp === 128526 || cp === 128527 || cp === 128528 || cp === 128529 || 128530 <= cp && cp <= 128532 || cp === 128533 || cp === 128534 || cp === 128535 || cp === 128536 || cp === 128537 || cp === 128538 || cp === 128539 || 128540 <= cp && cp <= 128542 || cp === 128543 || 128544 <= cp && cp <= 128549 || 128550 <= cp && cp <= 128551 || 128552 <= cp && cp <= 128555 || cp === 128556 || cp === 128557 || 128558 <= cp && cp <= 128559 || 128560 <= cp && cp <= 128563 || cp === 128564 || cp === 128565 || cp === 128566 || 128567 <= cp && cp <= 128576 || 128577 <= cp && cp <= 128580 || 128581 <= cp && cp <= 128591 || cp === 128640 || 128641 <= cp && cp <= 128642 || 128643 <= cp && cp <= 128645 || cp === 128646 || cp === 128647 || cp === 128648 || cp === 128649 || 128650 <= cp 
&& cp <= 128651 || cp === 128652 || cp === 128653 || cp === 128654 || cp === 128655 || cp === 128656 || 128657 <= cp && cp <= 128659 || cp === 128660 || cp === 128661 || cp === 128662 || cp === 128663 || cp === 128664 || 128665 <= cp && cp <= 128666 || 128667 <= cp && cp <= 128673 || cp === 128674 || cp === 128675 || 128676 <= cp && cp <= 128677 || cp === 128678 || 128679 <= cp && cp <= 128685 || 128686 <= cp && cp <= 128689 || cp === 128690 || 128691 <= cp && cp <= 128693 || cp === 128694 || 128695 <= cp && cp <= 128696 || 128697 <= cp && cp <= 128702 || cp === 128703 || cp === 128704 || 128705 <= cp && cp <= 128709 || cp === 128716 || cp === 128720 || 128721 <= cp && cp <= 128722 || cp === 128725 || 128726 <= cp && cp <= 128727 || cp === 128732 || 128733 <= cp && cp <= 128735 || 128747 <= cp && cp <= 128748 || 128756 <= cp && cp <= 128758 || 128759 <= cp && cp <= 128760 || cp === 128761 || cp === 128762 || 128763 <= cp && cp <= 128764 || 128992 <= cp && cp <= 129003 || cp === 129008 || cp === 129292 || 129293 <= cp && cp <= 129295 || 129296 <= cp && cp <= 129304 || 129305 <= cp && cp <= 129310 || cp === 129311 || 129312 <= cp && cp <= 129319 || 129320 <= cp && cp <= 129327 || cp === 129328 || 129329 <= cp && cp <= 129330 || 129331 <= cp && cp <= 129338 || 129340 <= cp && cp <= 129342 || cp === 129343 || 129344 <= cp && cp <= 129349 || 129351 <= cp && cp <= 129355 || cp === 129356 || 129357 <= cp && cp <= 129359 || 129360 <= cp && cp <= 129374 || 129375 <= cp && cp <= 129387 || 129388 <= cp && cp <= 129392 || cp === 129393 || cp === 129394 || 129395 <= cp && cp <= 129398 || 129399 <= cp && cp <= 129400 || cp === 129401 || cp === 129402 || cp === 129403 || 129404 <= cp && cp <= 129407 || 129408 <= cp && cp <= 129412 || 129413 <= cp && cp <= 129425 || 129426 <= cp && cp <= 129431 || 129432 <= cp && cp <= 129442 || 129443 <= cp && cp <= 129444 || 129445 <= cp && cp <= 129450 || 129451 <= cp && cp <= 129453 || 129454 <= cp && cp <= 129455 || 129456 <= cp && cp <= 
129465 || 129466 <= cp && cp <= 129471 || cp === 129472 || 129473 <= cp && cp <= 129474 || 129475 <= cp && cp <= 129482 || cp === 129483 || cp === 129484 || 129485 <= cp && cp <= 129487 || 129488 <= cp && cp <= 129510 || 129511 <= cp && cp <= 129535 || 129648 <= cp && cp <= 129651 || cp === 129652 || 129653 <= cp && cp <= 129655 || 129656 <= cp && cp <= 129658 || 129659 <= cp && cp <= 129660 || 129664 <= cp && cp <= 129666 || 129667 <= cp && cp <= 129670 || 129671 <= cp && cp <= 129672 || cp === 129673 || cp === 129679 || 129680 <= cp && cp <= 129685 || 129686 <= cp && cp <= 129704 || 129705 <= cp && cp <= 129708 || 129709 <= cp && cp <= 129711 || 129712 <= cp && cp <= 129718 || 129719 <= cp && cp <= 129722 || 129723 <= cp && cp <= 129725 || cp === 129726 || cp === 129727 || 129728 <= cp && cp <= 129730 || 129731 <= cp && cp <= 129733 || cp === 129734 || 129742 <= cp && cp <= 129743 || 129744 <= cp && cp <= 129750 || 129751 <= cp && cp <= 129753 || 129754 <= cp && cp <= 129755 || cp === 129756 || cp === 129759 || 129760 <= cp && cp <= 129767 || cp === 129768 || cp === 129769 || 129776 <= cp && cp <= 129782 || 129783 <= cp && cp <= 129784;
19
+ }
20
+ return this.fn(uc);
21
+ }.bind({
22
+ fn: null
23
+ });
24
+ function cjkOrIvs(uc) {
25
+ if (!uc || uc < 0) {
26
+ return false;
27
+ }
28
+ const eaw = eastAsianWidthType(uc);
29
+ switch (eaw) {
30
+ case "fullwidth":
31
+ case "halfwidth":
32
+ return true;
33
+ // never be emoji
34
+ case "wide":
35
+ return !isEmoji(uc);
36
+ case "narrow":
37
+ return false;
38
+ case "ambiguous":
39
+ return 917760 <= uc && uc <= 917999 ? null : false;
40
+ case "neutral":
41
+ return /^\p{sc=Hangul}/u.test(String.fromCodePoint(uc));
42
+ }
43
+ }
44
+ var svsFollowingCjk = regexCheck(/[\uFE00-\uFE02\uFE0E]/u);
45
+ var unicodePunctuation = regexCheck(/\p{P}|\p{S}/u);
46
+ var unicodeWhitespace = regexCheck(/\s/);
47
+ function regexCheck(regex) {
48
+ return check;
49
+ function check(code) {
50
+ return code !== null && code > -1 && regex.test(String.fromCodePoint(code));
51
+ }
52
+ }
53
+
54
+ // src/classifyCharacter.ts
55
+ var constantsEx;
56
+ ((constantsEx2) => {
57
+ constantsEx2.spaceOrPunctuation = 3;
58
+ constantsEx2.cjk = 4096;
59
+ constantsEx2.cjkPunctuation = 4098;
60
+ constantsEx2.ivs = 8192;
61
+ constantsEx2.cjkOrIvs = 12288;
62
+ constantsEx2.svsFollowingCjk = 16384;
63
+ constantsEx2.variationSelector = 28672;
64
+ })(constantsEx || (constantsEx = {}));
65
+ function classifyCharacter(code) {
66
+ if (code === codes.eof || markdownLineEndingOrSpace(code) || unicodeWhitespace(code)) {
67
+ return constants.characterGroupWhitespace;
68
+ }
69
+ let value = 0;
70
+ if (code >= 4352) {
71
+ if (svsFollowingCjk(code)) {
72
+ return constantsEx.svsFollowingCjk;
73
+ }
74
+ switch (cjkOrIvs(code)) {
75
+ case null:
76
+ return constantsEx.ivs;
77
+ case true:
78
+ value |= constantsEx.cjk;
79
+ break;
80
+ }
81
+ }
82
+ if (unicodePunctuation(code)) {
83
+ value |= constants.characterGroupPunctuation;
84
+ }
85
+ return value;
86
+ }
87
+ export {
88
+ classifyCharacter,
89
+ constantsEx
90
+ };
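Note on the `isEmoji` helper added in this file: it appears to build the `\p{RGI_Emoji}` regex (which needs the `v` flag) on first use, and to fall back to a hard-coded range table when `new RegExp` throws a `SyntaxError`. A minimal sketch of that pattern follows. The names `makeLazyPredicate` and `isEmojiSketch` are hypothetical, and the one-block fallback range is a drastically shortened stand-in for the real table, used for illustration only.

```typescript
type Predicate = (codePoint: number) => boolean;

// Hypothetical helper: builds the preferred predicate on first call and
// falls back when the runtime rejects the regex syntax or flags.
function makeLazyPredicate(build: () => Predicate, fallback: Predicate): Predicate {
  let cached: Predicate | null = null;

  const tryBuild = (): Predicate => {
    try {
      return build();
    } catch (e) {
      // Only swallow "unsupported syntax" errors; rethrow anything else.
      if (!(e instanceof SyntaxError)) {
        throw e;
      }
      return fallback;
    }
  };

  return (codePoint) => {
    if (cached === null) {
      cached = tryBuild();
    }
    return cached(codePoint);
  };
}

const isEmojiSketch = makeLazyPredicate(
  () => {
    // Constructed from a string so that engines without the `v` flag fail
    // at call time instead of when the module is parsed.
    const regex = new RegExp("^\\p{RGI_Emoji}", "v");
    return (codePoint) => regex.test(String.fromCodePoint(codePoint));
  },
  // Assumption: truncated fallback covering only U+1F600..U+1F64F.
  (codePoint) => 0x1f600 <= codePoint && codePoint <= 0x1f64f,
);

console.log(isEmojiSketch(0x1f600), isEmojiSketch(0x3042));
```

Building the regex from a string rather than a literal is what makes the fallback reachable: a `/…/v` literal would be a parse error for the whole module on older engines.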
@@ -1,4 +1,5 @@
1
- import type { Code, Point, TokenizeContext } from "micromark-util-types";
1
+ import { Code, Point, TokenizeContext } from 'micromark-util-types';
2
+
2
3
  /**
3
4
  * Check if the given code is a [High-Surrogate Code Unit](https://www.unicode.org/glossary/#high_surrogate_code_unit).
4
5
  *
@@ -7,7 +8,7 @@ import type { Code, Point, TokenizeContext } from "micromark-util-types";
7
8
  * @param code Code.
8
9
  * @returns `true` if the code is a High-Surrogate Code Unit, `false` otherwise.
9
10
  */
10
- export declare function isCodeHighSurrogate(code: Code): code is Exclude<Code, null>;
11
+ declare function isCodeHighSurrogate(code: Code): code is Exclude<Code, null>;
11
12
  /**
12
13
  * Check if the given code is a [Low-Surrogate Code Unit](https://www.unicode.org/glossary/#low_surrogate_code_unit).
13
14
  *
@@ -17,7 +18,7 @@ export declare function isCodeHighSurrogate(code: Code): code is Exclude<Code, n
17
18
  * @returns
18
19
  * True if the code is a Low-Surrogate Code Unit, false otherwise.
19
20
  */
20
- export declare function isCodeLowSurrogate(code: Code): code is Exclude<Code, null>;
21
+ declare function isCodeLowSurrogate(code: Code): code is Exclude<Code, null>;
21
22
  /**
22
23
  * If `code` is a [Low-Surrogate Code Unit](https://www.unicode.org/glossary/#low_surrogate_code_unit), try to get a genuine previous [Unicode Scalar Value](https://www.unicode.org/glossary/#unicode_scalar_value) corresponding to the Low-Surrogate Code Unit.
23
24
  * @param code a tentative previous [code unit](https://www.unicode.org/glossary/#code_unit) less than 65,536, including a Low-Surrogate one
@@ -25,7 +26,7 @@ export declare function isCodeLowSurrogate(code: Code): code is Exclude<Code, nu
25
26
  * @param sliceSerialize `this.sliceSerialize` (`this` = `TokenizeContext`)
26
27
  * @returns a value greater than 65,535 if the previous code point represents a [Supplementary Character](https://www.unicode.org/glossary/#supplementary_character), or `code` otherwise
27
28
  */
28
- export declare function tryGetGenuinePreviousCode(code: Exclude<Code, null>, nowPoint: Point, sliceSerialize: TokenizeContext["sliceSerialize"]): Exclude<Code, null>;
29
+ declare function tryGetGenuinePreviousCode(code: Exclude<Code, null>, nowPoint: Point, sliceSerialize: TokenizeContext["sliceSerialize"]): Exclude<Code, null>;
29
30
  /**
30
31
  * Try to get the [Unicode Code Point](https://www.unicode.org/glossary/#code_point) two positions before the current position.
31
32
  *
@@ -34,7 +35,36 @@ export declare function tryGetGenuinePreviousCode(code: Exclude<Code, null>, now
34
35
  * @param sliceSerialize `this.sliceSerialize` (`this` = `TokenizeContext`)
35
36
  * @returns a value greater than 65,535 if the code point two positions before represents a [Supplementary Character](https://www.unicode.org/glossary/#supplementary_character), a value less than 65,536 for a [BMP Character](https://www.unicode.org/glossary/#bmp_character), or `null` if not found
36
37
  */
37
- export declare function tryGetCodeTwoBefore(previousCode: Exclude<Code, null>, nowPoint: Point, sliceSerialize: TokenizeContext["sliceSerialize"]): Code;
38
+ declare function tryGetCodeTwoBefore(previousCode: Exclude<Code, null>, nowPoint: Point, sliceSerialize: TokenizeContext["sliceSerialize"]): Code;
39
+ /**
40
+ * Lazily get the [Unicode Code Point](https://www.unicode.org/glossary/#code_point) two positions before the current position only if necessary.
41
+ *
42
+ * @see {@link tryGetCodeTwoBefore}
43
+ */
44
+ declare class TwoPreviousCode {
45
+ readonly previousCode: Exclude<Code, null>;
46
+ readonly nowPoint: Point;
47
+ readonly sliceSerialize: TokenizeContext["sliceSerialize"];
48
+ private cachedValue;
49
+ /**
50
+ * @see {@link tryGetCodeTwoBefore}
51
+ *
52
+ * @param previousCode a previous code point. Should be greater than 65,535 if it represents a [Supplementary Character](https://www.unicode.org/glossary/#supplementary_character).
53
+ * @param nowPoint `this.now()` (`this` = `TokenizeContext`)
54
+ * @param sliceSerialize `this.sliceSerialize` (`this` = `TokenizeContext`)
55
+ */
56
+ constructor(previousCode: Exclude<Code, null>, nowPoint: Point, sliceSerialize: TokenizeContext["sliceSerialize"]);
57
+ /**
58
+ * Returns the return value of {@link tryGetCodeTwoBefore}.
59
+ *
60
+ * If the value has not been computed yet, it will be computed and cached.
61
+ *
62
+ * @see {@link tryGetCodeTwoBefore}
63
+ *
64
+ * @returns a value greater than 65,535 if the code point two positions before represents a [Supplementary Character](https://www.unicode.org/glossary/#supplementary_character), a value less than 65,536 for a [BMP Character](https://www.unicode.org/glossary/#bmp_character), or `null` if not found
65
+ */
66
+ value(): Code;
67
+ }
38
68
  /**
39
69
  * If `code` is a [High-Surrogate Code Unit](https://www.unicode.org/glossary/#high_surrogate_code_unit), try to get a genuine next [Unicode Scalar Value](https://www.unicode.org/glossary/#unicode_scalar_value) corresponding to the High-Surrogate Code Unit.
40
70
  * @param code a tentative next [code unit](https://www.unicode.org/glossary/#code_unit) less than 65,536, including a High-Surrogate one
@@ -42,4 +72,6 @@ export declare function tryGetCodeTwoBefore(previousCode: Exclude<Code, null>, n
42
72
  * @param sliceSerialize `this.sliceSerialize` (`this` = `TokenizeContext`)
43
73
  * @returns a value greater than 65,535 if the next code point represents a [Supplementary Character](https://www.unicode.org/glossary/#supplementary_character), or `code` otherwise
44
74
  */
45
- export declare function tryGetGenuineNextCode(code: Exclude<Code, null>, nowPoint: Point, sliceSerialize: TokenizeContext["sliceSerialize"]): Exclude<Code, null>;
75
+ declare function tryGetGenuineNextCode(code: Exclude<Code, null>, nowPoint: Point, sliceSerialize: TokenizeContext["sliceSerialize"]): Exclude<Code, null>;
76
+
77
+ export { TwoPreviousCode, isCodeHighSurrogate, isCodeLowSurrogate, tryGetCodeTwoBefore, tryGetGenuineNextCode, tryGetGenuinePreviousCode };
@@ -0,0 +1,104 @@
1
+ var __defProp = Object.defineProperty;
2
+ var __defNormalProp = (obj, key, value) => key in obj ? __defProp(obj, key, { enumerable: true, configurable: true, writable: true, value }) : obj[key] = value;
3
+ var __publicField = (obj, key, value) => __defNormalProp(obj, typeof key !== "symbol" ? key + "" : key, value);
4
+
5
+ // src/codeUtil.ts
6
+ function isCodeHighSurrogate(code) {
7
+ return Boolean(code && code >= 55296 && code <= 56319);
8
+ }
9
+ function isCodeLowSurrogate(code) {
10
+ return Boolean(code && code >= 56320 && code <= 57343);
11
+ }
12
+ function tryGetGenuinePreviousCode(code, nowPoint, sliceSerialize) {
13
+ if (nowPoint._bufferIndex < 2) {
14
+ return code;
15
+ }
16
+ const previousBuffer = sliceSerialize({
17
+ // take 2 characters (code units)
18
+ start: { ...nowPoint, _bufferIndex: nowPoint._bufferIndex - 2 },
19
+ end: nowPoint
20
+ });
21
+ const previousCandidate = previousBuffer.codePointAt(0);
22
+ return previousCandidate && previousCandidate >= 65536 ? previousCandidate : code;
23
+ }
24
+ function tryGetCodeTwoBefore(previousCode, nowPoint, sliceSerialize) {
25
+ const previousWidth = previousCode >= 65536 ? 2 : 1;
26
+ if (nowPoint._bufferIndex < 1 + previousWidth) {
27
+ return null;
28
+ }
29
+ const idealStart = nowPoint._bufferIndex - previousWidth - 2;
30
+ const twoPreviousBuffer = sliceSerialize({
31
+ // take 1--2 character
32
+ start: {
33
+ ...nowPoint,
34
+ _bufferIndex: idealStart >= 0 ? idealStart : 0
35
+ },
36
+ end: {
37
+ ...nowPoint,
38
+ _bufferIndex: nowPoint._bufferIndex - previousWidth
39
+ }
40
+ });
41
+ const twoPreviousLast = twoPreviousBuffer.charCodeAt(
42
+ twoPreviousBuffer.length - 1
43
+ );
44
+ if (Number.isNaN(twoPreviousLast)) {
45
+ return null;
46
+ }
47
+ if (twoPreviousBuffer.length < 2 || twoPreviousLast < 56320 || 57343 < twoPreviousLast) {
48
+ return twoPreviousLast;
49
+ }
50
+ const twoPreviousCandidate = twoPreviousBuffer.codePointAt(0);
51
+ if (twoPreviousCandidate && twoPreviousCandidate >= 65536) {
52
+ return twoPreviousCandidate;
53
+ }
54
+ return twoPreviousLast;
55
+ }
56
+ var TwoPreviousCode = class {
57
+ /**
58
+ * @see {@link tryGetCodeTwoBefore}
59
+ *
60
+ * @param previousCode a previous code point. Should be greater than 65,535 if it represents a [Supplementary Character](https://www.unicode.org/glossary/#supplementary_character).
61
+ * @param nowPoint `this.now()` (`this` = `TokenizeContext`)
62
+ * @param sliceSerialize `this.sliceSerialize` (`this` = `TokenizeContext`)
63
+ */
64
+ constructor(previousCode, nowPoint, sliceSerialize) {
65
+ this.previousCode = previousCode;
66
+ this.nowPoint = nowPoint;
67
+ this.sliceSerialize = sliceSerialize;
68
+ __publicField(this, "cachedValue");
69
+ }
70
+ /**
71
+ * Returns the return value of {@link tryGetCodeTwoBefore}.
72
+ *
73
+ * If the value has not been computed yet, it will be computed and cached.
74
+ *
75
+ * @see {@link tryGetCodeTwoBefore}
76
+ *
77
+ * @returns a value greater than 65,535 if the code point two positions before represents a [Supplementary Character](https://www.unicode.org/glossary/#supplementary_character), a value less than 65,536 for a [BMP Character](https://www.unicode.org/glossary/#bmp_character), or `null` if not found
78
+ */
79
+ value() {
80
+ if (this.cachedValue === void 0) {
81
+ this.cachedValue = tryGetCodeTwoBefore(
82
+ this.previousCode,
83
+ this.nowPoint,
84
+ this.sliceSerialize
85
+ );
86
+ }
87
+ return this.cachedValue;
88
+ }
89
+ };
90
+ function tryGetGenuineNextCode(code, nowPoint, sliceSerialize) {
91
+ const nextCandidate = sliceSerialize({
92
+ start: nowPoint,
93
+ end: { ...nowPoint, _bufferIndex: nowPoint._bufferIndex + 2 }
94
+ }).codePointAt(0);
95
+ return nextCandidate && nextCandidate >= 65536 ? nextCandidate : code;
96
+ }
97
+ export {
98
+ TwoPreviousCode,
99
+ isCodeHighSurrogate,
100
+ isCodeLowSurrogate,
101
+ tryGetCodeTwoBefore,
102
+ tryGetGenuineNextCode,
103
+ tryGetGenuinePreviousCode
104
+ };
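Note on the surrogate handling above: `tryGetGenuinePreviousCode` and `tryGetCodeTwoBefore` look back over UTF-16 code units and use `codePointAt(0)` to recover a supplementary code point when the preceding unit is a low surrogate. The sketch below shows the same idea on a plain string. It is an assumption-laden simplification: `previousCodePoint` is a hypothetical name, and it takes a string and an index instead of micromark's `sliceSerialize` and `_bufferIndex`.

```typescript
// Hypothetical helper: returns the code point that ends just before `index`
// (counted in UTF-16 code units), or null at the start of the string.
function previousCodePoint(text: string, index: number): number | null {
  if (index < 1) {
    return null;
  }
  const unit = text.charCodeAt(index - 1);
  const isLowSurrogate = unit >= 0xdc00 && unit <= 0xdfff;
  if (index >= 2 && isLowSurrogate) {
    // Read from two units back; a valid pair yields a value >= 0x10000.
    const candidate = text.codePointAt(index - 2);
    if (candidate !== undefined && candidate >= 0x10000) {
      return candidate;
    }
  }
  // BMP character, or an unpaired surrogate returned as-is.
  return unit;
}

// "a", then U+20BB7 (one code point, two code units), then "b".
const sample = "a" + String.fromCodePoint(0x20bb7) + "b";
console.log(previousCodePoint(sample, 3)); // code point ending at unit index 3
```

Returning the lone unit for an unpaired surrogate mirrors the diff's behavior of falling back to `code` or `twoPreviousLast` when no value of 65,536 or more is found.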
package/dist/index.d.ts CHANGED
@@ -1,3 +1,5 @@
1
- export { isCjk, isIvs, isNonCjkPunctuation, isSpaceOrPunctuation, isSvsFollowingCjk, isUnicodeWhitespace, } from "./categoryUtil.js";
2
- export { classifyCharacter, constantsEx } from "./classifyCharacter.js";
3
- export { isCodeHighSurrogate, isCodeLowSurrogate, tryGetGenuineNextCode, tryGetGenuinePreviousCode, tryGetCodeTwoBefore, } from "./codeUtil.js";
1
+ export { isCjk, isIvs, isNonCjkPunctuation, isSpaceOrPunctuation, isSvsFollowingCjk, isUnicodeWhitespace } from './categoryUtil.js';
2
+ export { classifyCharacter, constantsEx } from './classifyCharacter.js';
3
+ export { isCodeHighSurrogate, isCodeLowSurrogate, tryGetCodeTwoBefore, tryGetGenuineNextCode, tryGetGenuinePreviousCode } from './codeUtil.js';
4
+ import 'micromark-util-symbol';
5
+ import 'micromark-util-types';
package/dist/index.js CHANGED
@@ -1,123 +1,183 @@
1
- import * as __WEBPACK_EXTERNAL_MODULE_micromark_util_symbol_0f33a452__ from "micromark-util-symbol";
2
- import * as __WEBPACK_EXTERNAL_MODULE_micromark_util_character_bdee8a5c__ from "micromark-util-character";
3
- import * as __WEBPACK_EXTERNAL_MODULE_get_east_asian_width_c890dec0__ from "get-east-asian-width";
4
- function cjkOrIvs(uc) {
5
- if (!uc || uc < 0) return false;
6
- const eaw = (0, __WEBPACK_EXTERNAL_MODULE_get_east_asian_width_c890dec0__.eastAsianWidthType)(uc);
7
- switch(eaw){
8
- case "fullwidth":
9
- case "halfwidth":
10
- return true;
11
- case "wide":
12
- return !/^\p{RGI_Emoji}/v.test(String.fromCodePoint(uc));
13
- case "narrow":
14
- return false;
15
- case "ambiguous":
16
- return 0xe0100 <= uc && uc <= 0xe01ef && null;
17
- case "neutral":
18
- return /^\p{sc=Hangul}/u.test(String.fromCodePoint(uc));
1
+ // src/categoryUtil.ts
2
+ import { constants as constants2 } from "micromark-util-symbol";
3
+
4
+ // src/classifyCharacter.ts
5
+ import { markdownLineEndingOrSpace } from "micromark-util-character";
6
+ import { constants, codes } from "micromark-util-symbol";
7
+
8
+ // src/characterWithNonBmp.ts
9
+ import { eastAsianWidthType } from "get-east-asian-width";
10
+ var isEmoji = function(uc) {
11
+ if (this.fn !== null) {
12
+ return this.fn(uc);
13
+ }
14
+ try {
15
+ const regex = new RegExp("^\\p{RGI_Emoji}", "v");
16
+ this.fn = (uc_) => regex.test(String.fromCodePoint(uc_));
17
+ } catch (e) {
18
+ if (!(e instanceof SyntaxError)) {
19
+ throw e;
19
20
  }
21
+ this.fn = (cp) => 8986 <= cp && cp <= 8987 || 9193 <= cp && cp <= 9196 || cp === 9200 || cp === 9203 || 9725 <= cp && cp <= 9726 || 9748 <= cp && cp <= 9749 || 9800 <= cp && cp <= 9811 || cp === 9855 || cp === 9875 || cp === 9889 || 9898 <= cp && cp <= 9899 || 9917 <= cp && cp <= 9918 || 9924 <= cp && cp <= 9925 || cp === 9934 || cp === 9940 || cp === 9962 || 9970 <= cp && cp <= 9971 || cp === 9973 || cp === 9978 || cp === 9981 || cp === 9989 || 9994 <= cp && cp <= 9995 || cp === 10024 || cp === 10060 || cp === 10062 || 10067 <= cp && cp <= 10069 || cp === 10071 || 10133 <= cp && cp <= 10135 || cp === 10160 || cp === 10175 || 11035 <= cp && cp <= 11036 || cp === 11088 || cp === 11093 || cp === 126980 || cp === 127183 || cp === 127374 || 127377 <= cp && cp <= 127386 || cp === 127489 || cp === 127514 || cp === 127535 || 127538 <= cp && cp <= 127542 || 127544 <= cp && cp <= 127546 || 127568 <= cp && cp <= 127569 || 127744 <= cp && cp <= 127756 || 127757 <= cp && cp <= 127758 || cp === 127759 || cp === 127760 || cp === 127761 || cp === 127762 || 127763 <= cp && cp <= 127765 || 127766 <= cp && cp <= 127768 || cp === 127769 || cp === 127770 || cp === 127771 || cp === 127772 || 127773 <= cp && cp <= 127774 || 127775 <= cp && cp <= 127776 || 127789 <= cp && cp <= 127791 || 127792 <= cp && cp <= 127793 || 127794 <= cp && cp <= 127795 || 127796 <= cp && cp <= 127797 || 127799 <= cp && cp <= 127818 || cp === 127819 || 127820 <= cp && cp <= 127823 || cp === 127824 || 127825 <= cp && cp <= 127867 || cp === 127868 || 127870 <= cp && cp <= 127871 || 127872 <= cp && cp <= 127891 || 127904 <= cp && cp <= 127940 || cp === 127941 || cp === 127942 || cp === 127943 || cp === 127944 || cp === 127945 || cp === 127946 || 127951 <= cp && cp <= 127955 || 127968 <= cp && cp <= 127971 || cp === 127972 || 127973 <= cp && cp <= 127984 || cp === 127988 || 127992 <= cp && cp <= 128007 || cp === 128008 || 128009 <= cp && cp <= 128011 || 128012 <= cp && cp <= 128014 || 128015 <= cp && cp <= 
128016 || 128017 <= cp && cp <= 128018 || cp === 128019 || cp === 128020 || cp === 128021 || cp === 128022 || 128023 <= cp && cp <= 128041 || cp === 128042 || 128043 <= cp && cp <= 128062 || cp === 128064 || 128066 <= cp && cp <= 128100 || cp === 128101 || 128102 <= cp && cp <= 128107 || 128108 <= cp && cp <= 128109 || 128110 <= cp && cp <= 128172 || cp === 128173 || 128174 <= cp && cp <= 128181 || 128182 <= cp && cp <= 128183 || 128184 <= cp && cp <= 128235 || 128236 <= cp && cp <= 128237 || cp === 128238 || cp === 128239 || 128240 <= cp && cp <= 128244 || cp === 128245 || 128246 <= cp && cp <= 128247 || cp === 128248 || 128249 <= cp && cp <= 128252 || 128255 <= cp && cp <= 128258 || cp === 128259 || 128260 <= cp && cp <= 128263 || cp === 128264 || cp === 128265 || 128266 <= cp && cp <= 128276 || cp === 128277 || 128278 <= cp && cp <= 128299 || 128300 <= cp && cp <= 128301 || 128302 <= cp && cp <= 128317 || 128331 <= cp && cp <= 128334 || 128336 <= cp && cp <= 128347 || 128348 <= cp && cp <= 128359 || cp === 128378 || 128405 <= cp && cp <= 128406 || cp === 128420 || 128507 <= cp && cp <= 128511 || cp === 128512 || 128513 <= cp && cp <= 128518 || 128519 <= cp && cp <= 128520 || 128521 <= cp && cp <= 128525 || cp === 128526 || cp === 128527 || cp === 128528 || cp === 128529 || 128530 <= cp && cp <= 128532 || cp === 128533 || cp === 128534 || cp === 128535 || cp === 128536 || cp === 128537 || cp === 128538 || cp === 128539 || 128540 <= cp && cp <= 128542 || cp === 128543 || 128544 <= cp && cp <= 128549 || 128550 <= cp && cp <= 128551 || 128552 <= cp && cp <= 128555 || cp === 128556 || cp === 128557 || 128558 <= cp && cp <= 128559 || 128560 <= cp && cp <= 128563 || cp === 128564 || cp === 128565 || cp === 128566 || 128567 <= cp && cp <= 128576 || 128577 <= cp && cp <= 128580 || 128581 <= cp && cp <= 128591 || cp === 128640 || 128641 <= cp && cp <= 128642 || 128643 <= cp && cp <= 128645 || cp === 128646 || cp === 128647 || cp === 128648 || cp === 128649 || 128650 <= cp 
&& cp <= 128651 || cp === 128652 || cp === 128653 || cp === 128654 || cp === 128655 || cp === 128656 || 128657 <= cp && cp <= 128659 || cp === 128660 || cp === 128661 || cp === 128662 || cp === 128663 || cp === 128664 || 128665 <= cp && cp <= 128666 || 128667 <= cp && cp <= 128673 || cp === 128674 || cp === 128675 || 128676 <= cp && cp <= 128677 || cp === 128678 || 128679 <= cp && cp <= 128685 || 128686 <= cp && cp <= 128689 || cp === 128690 || 128691 <= cp && cp <= 128693 || cp === 128694 || 128695 <= cp && cp <= 128696 || 128697 <= cp && cp <= 128702 || cp === 128703 || cp === 128704 || 128705 <= cp && cp <= 128709 || cp === 128716 || cp === 128720 || 128721 <= cp && cp <= 128722 || cp === 128725 || 128726 <= cp && cp <= 128727 || cp === 128732 || 128733 <= cp && cp <= 128735 || 128747 <= cp && cp <= 128748 || 128756 <= cp && cp <= 128758 || 128759 <= cp && cp <= 128760 || cp === 128761 || cp === 128762 || 128763 <= cp && cp <= 128764 || 128992 <= cp && cp <= 129003 || cp === 129008 || cp === 129292 || 129293 <= cp && cp <= 129295 || 129296 <= cp && cp <= 129304 || 129305 <= cp && cp <= 129310 || cp === 129311 || 129312 <= cp && cp <= 129319 || 129320 <= cp && cp <= 129327 || cp === 129328 || 129329 <= cp && cp <= 129330 || 129331 <= cp && cp <= 129338 || 129340 <= cp && cp <= 129342 || cp === 129343 || 129344 <= cp && cp <= 129349 || 129351 <= cp && cp <= 129355 || cp === 129356 || 129357 <= cp && cp <= 129359 || 129360 <= cp && cp <= 129374 || 129375 <= cp && cp <= 129387 || 129388 <= cp && cp <= 129392 || cp === 129393 || cp === 129394 || 129395 <= cp && cp <= 129398 || 129399 <= cp && cp <= 129400 || cp === 129401 || cp === 129402 || cp === 129403 || 129404 <= cp && cp <= 129407 || 129408 <= cp && cp <= 129412 || 129413 <= cp && cp <= 129425 || 129426 <= cp && cp <= 129431 || 129432 <= cp && cp <= 129442 || 129443 <= cp && cp <= 129444 || 129445 <= cp && cp <= 129450 || 129451 <= cp && cp <= 129453 || 129454 <= cp && cp <= 129455 || 129456 <= cp && cp <= 
129465 || 129466 <= cp && cp <= 129471 || cp === 129472 || 129473 <= cp && cp <= 129474 || 129475 <= cp && cp <= 129482 || cp === 129483 || cp === 129484 || 129485 <= cp && cp <= 129487 || 129488 <= cp && cp <= 129510 || 129511 <= cp && cp <= 129535 || 129648 <= cp && cp <= 129651 || cp === 129652 || 129653 <= cp && cp <= 129655 || 129656 <= cp && cp <= 129658 || 129659 <= cp && cp <= 129660 || 129664 <= cp && cp <= 129666 || 129667 <= cp && cp <= 129670 || 129671 <= cp && cp <= 129672 || cp === 129673 || cp === 129679 || 129680 <= cp && cp <= 129685 || 129686 <= cp && cp <= 129704 || 129705 <= cp && cp <= 129708 || 129709 <= cp && cp <= 129711 || 129712 <= cp && cp <= 129718 || 129719 <= cp && cp <= 129722 || 129723 <= cp && cp <= 129725 || cp === 129726 || cp === 129727 || 129728 <= cp && cp <= 129730 || 129731 <= cp && cp <= 129733 || cp === 129734 || 129742 <= cp && cp <= 129743 || 129744 <= cp && cp <= 129750 || 129751 <= cp && cp <= 129753 || 129754 <= cp && cp <= 129755 || cp === 129756 || cp === 129759 || 129760 <= cp && cp <= 129767 || cp === 129768 || cp === 129769 || 129776 <= cp && cp <= 129782 || 129783 <= cp && cp <= 129784;
22
+ }
23
+ return this.fn(uc);
24
+ }.bind({
25
+ fn: null
26
+ });
27
+ function cjkOrIvs(uc) {
28
+ if (!uc || uc < 0) {
29
+ return false;
30
+ }
31
+ const eaw = eastAsianWidthType(uc);
32
+ switch (eaw) {
33
+ case "fullwidth":
34
+ case "halfwidth":
35
+ return true;
36
+ // never be emoji
37
+ case "wide":
38
+ return !isEmoji(uc);
39
+ case "narrow":
40
+ return false;
41
+ case "ambiguous":
42
+ return 917760 <= uc && uc <= 917999 ? null : false;
43
+ case "neutral":
44
+ return /^\p{sc=Hangul}/u.test(String.fromCodePoint(uc));
45
+ }
20
46
  }
21
- const svsFollowingCjk = regexCheck(/[\uFE00-\uFE02\uFE0E]/u);
22
- const unicodePunctuation = regexCheck(/\p{P}|\p{S}/u);
23
- const unicodeWhitespace = regexCheck(/\s/);
47
+ var svsFollowingCjk = regexCheck(/[\uFE00-\uFE02\uFE0E]/u);
48
+ var unicodePunctuation = regexCheck(/\p{P}|\p{S}/u);
49
+ var unicodeWhitespace = regexCheck(/\s/);
24
50
  function regexCheck(regex) {
25
- return check;
26
- function check(code) {
27
- return null !== code && code > -1 && regex.test(String.fromCodePoint(code));
28
- }
51
+ return check;
52
+ function check(code) {
53
+ return code !== null && code > -1 && regex.test(String.fromCodePoint(code));
54
+ }
29
55
  }
30
- (function(constantsEx) {
31
- constantsEx.spaceOrPunctuation = 3;
32
- constantsEx.cjk = 0x1000;
33
- constantsEx.cjkPunctuation = 0x1002;
34
- constantsEx.ivs = 0x2000;
35
- constantsEx.cjkOrIvs = 0x3000;
36
- constantsEx.svsFollowingCjk = 0x4000;
37
- constantsEx.variationSelector = 0x7000;
38
- })(classifyCharacter_constantsEx || (classifyCharacter_constantsEx = {}));
56
+
57
+ // src/classifyCharacter.ts
58
+ var constantsEx;
59
+ ((constantsEx2) => {
60
+ constantsEx2.spaceOrPunctuation = 3;
61
+ constantsEx2.cjk = 4096;
62
+ constantsEx2.cjkPunctuation = 4098;
63
+ constantsEx2.ivs = 8192;
64
+ constantsEx2.cjkOrIvs = 12288;
65
+ constantsEx2.svsFollowingCjk = 16384;
66
+ constantsEx2.variationSelector = 28672;
67
+ })(constantsEx || (constantsEx = {}));
39
68
  function classifyCharacter(code) {
40
- if (code === __WEBPACK_EXTERNAL_MODULE_micromark_util_symbol_0f33a452__.codes.eof || (0, __WEBPACK_EXTERNAL_MODULE_micromark_util_character_bdee8a5c__.markdownLineEndingOrSpace)(code) || unicodeWhitespace(code)) return __WEBPACK_EXTERNAL_MODULE_micromark_util_symbol_0f33a452__.constants.characterGroupWhitespace;
41
- let value = 0;
42
- if (code >= 0x1100) {
43
- if (svsFollowingCjk(code)) return classifyCharacter_constantsEx.svsFollowingCjk;
44
- switch(cjkOrIvs(code)){
45
- case null:
46
- return classifyCharacter_constantsEx.ivs;
47
- case true:
48
- value |= classifyCharacter_constantsEx.cjk;
49
- break;
50
- }
69
+ if (code === codes.eof || markdownLineEndingOrSpace(code) || unicodeWhitespace(code)) {
70
+ return constants.characterGroupWhitespace;
71
+ }
72
+ let value = 0;
73
+ if (code >= 4352) {
74
+ if (svsFollowingCjk(code)) {
75
+ return constantsEx.svsFollowingCjk;
51
76
  }
52
- if (unicodePunctuation(code)) value |= __WEBPACK_EXTERNAL_MODULE_micromark_util_symbol_0f33a452__.constants.characterGroupPunctuation;
53
- return value;
77
+ switch (cjkOrIvs(code)) {
78
+ case null:
79
+ return constantsEx.ivs;
80
+ case true:
81
+ value |= constantsEx.cjk;
82
+ break;
83
+ }
84
+ }
85
+ if (unicodePunctuation(code)) {
86
+ value |= constants.characterGroupPunctuation;
87
+ }
88
+ return value;
54
89
  }
55
- var classifyCharacter_constantsEx;
90
+
91
+ // src/categoryUtil.ts
56
92
  function isUnicodeWhitespace(category) {
57
- return Boolean(category & __WEBPACK_EXTERNAL_MODULE_micromark_util_symbol_0f33a452__.constants.characterGroupWhitespace);
93
+ return Boolean(category & constants2.characterGroupWhitespace);
58
94
  }
59
95
  function isNonCjkPunctuation(category) {
60
- return (category & classifyCharacter_constantsEx.cjkPunctuation) === __WEBPACK_EXTERNAL_MODULE_micromark_util_symbol_0f33a452__.constants.characterGroupPunctuation;
96
+ return (category & constantsEx.cjkPunctuation) === constants2.characterGroupPunctuation;
61
97
  }
62
98
  function isCjk(category) {
63
- return Boolean(category & classifyCharacter_constantsEx.cjk);
99
+ return Boolean(category & constantsEx.cjk);
64
100
  }
65
101
  function isIvs(category) {
66
- return category === classifyCharacter_constantsEx.ivs;
102
+ return category === constantsEx.ivs;
67
103
  }
68
104
  function isSvsFollowingCjk(category) {
69
- return category === classifyCharacter_constantsEx.svsFollowingCjk;
105
+ return category === constantsEx.svsFollowingCjk;
70
106
  }
71
107
  function isSpaceOrPunctuation(category) {
72
- return Boolean(category & classifyCharacter_constantsEx.spaceOrPunctuation);
108
+ return Boolean(category & constantsEx.spaceOrPunctuation);
73
109
  }
110
+
111
+ // src/codeUtil.ts
74
112
  function isCodeHighSurrogate(code) {
75
- return Boolean(code && code >= 0xd800 && code <= 0xdbff);
113
+ return Boolean(code && code >= 55296 && code <= 56319);
76
114
  }
77
115
  function isCodeLowSurrogate(code) {
78
- return Boolean(code && code >= 0xdc00 && code <= 0xdfff);
116
+ return Boolean(code && code >= 56320 && code <= 57343);
79
117
  }
80
118
  function tryGetGenuinePreviousCode(code, nowPoint, sliceSerialize) {
81
- if (nowPoint._bufferIndex < 2) return code;
82
- const previousBuffer = sliceSerialize({
83
- start: {
84
- ...nowPoint,
85
- _bufferIndex: nowPoint._bufferIndex - 2
86
- },
87
- end: nowPoint
88
- });
89
- const previousCandidate = previousBuffer.codePointAt(0);
90
- return previousCandidate && previousCandidate >= 65536 ? previousCandidate : code;
119
+ if (nowPoint._bufferIndex < 2) {
120
+ return code;
121
+ }
122
+ const previousBuffer = sliceSerialize({
123
+ // take 2 characters (code units)
124
+ start: { ...nowPoint, _bufferIndex: nowPoint._bufferIndex - 2 },
125
+ end: nowPoint
126
+ });
127
+ const previousCandidate = previousBuffer.codePointAt(0);
128
+ return previousCandidate && previousCandidate >= 65536 ? previousCandidate : code;
91
129
  }
92
130
  function tryGetCodeTwoBefore(previousCode, nowPoint, sliceSerialize) {
93
- const previousWidth = previousCode >= 65536 ? 2 : 1;
94
- if (nowPoint._bufferIndex < 1 + previousWidth) return null;
95
- const idealStart = nowPoint._bufferIndex - previousWidth - 2;
96
- const twoPreviousBuffer = sliceSerialize({
97
- start: {
98
- ...nowPoint,
99
- _bufferIndex: idealStart >= 0 ? idealStart : 0
100
- },
101
- end: {
102
- ...nowPoint,
103
- _bufferIndex: nowPoint._bufferIndex - previousWidth
104
- }
105
- });
106
- const twoPreviousLast = twoPreviousBuffer.charCodeAt(twoPreviousBuffer.length - 1);
107
- if (Number.isNaN(twoPreviousLast)) return null;
108
- if (twoPreviousBuffer.length < 2 || twoPreviousLast < 0xdc00 || 0xdfff < twoPreviousLast) return twoPreviousLast;
109
- const twoPreviousCandidate = twoPreviousBuffer.codePointAt(0);
110
- if (twoPreviousCandidate && twoPreviousCandidate >= 65536) return twoPreviousCandidate;
131
+ const previousWidth = previousCode >= 65536 ? 2 : 1;
132
+ if (nowPoint._bufferIndex < 1 + previousWidth) {
133
+ return null;
134
+ }
135
+ const idealStart = nowPoint._bufferIndex - previousWidth - 2;
136
+ const twoPreviousBuffer = sliceSerialize({
137
+ // take 1--2 character
138
+ start: {
139
+ ...nowPoint,
140
+ _bufferIndex: idealStart >= 0 ? idealStart : 0
141
+ },
142
+ end: {
143
+ ...nowPoint,
144
+ _bufferIndex: nowPoint._bufferIndex - previousWidth
145
+ }
146
+ });
147
+ const twoPreviousLast = twoPreviousBuffer.charCodeAt(
148
+ twoPreviousBuffer.length - 1
149
+ );
150
+ if (Number.isNaN(twoPreviousLast)) {
151
+ return null;
152
+ }
153
+ if (twoPreviousBuffer.length < 2 || twoPreviousLast < 56320 || 57343 < twoPreviousLast) {
111
154
  return twoPreviousLast;
155
+ }
156
+ const twoPreviousCandidate = twoPreviousBuffer.codePointAt(0);
157
+ if (twoPreviousCandidate && twoPreviousCandidate >= 65536) {
158
+ return twoPreviousCandidate;
159
+ }
160
+ return twoPreviousLast;
112
161
  }
113
162
  function tryGetGenuineNextCode(code, nowPoint, sliceSerialize) {
114
- const nextCandidate = sliceSerialize({
115
- start: nowPoint,
116
- end: {
117
- ...nowPoint,
118
- _bufferIndex: nowPoint._bufferIndex + 2
119
- }
120
- }).codePointAt(0);
121
- return nextCandidate && nextCandidate >= 65536 ? nextCandidate : code;
163
+ const nextCandidate = sliceSerialize({
164
+ start: nowPoint,
165
+ end: { ...nowPoint, _bufferIndex: nowPoint._bufferIndex + 2 }
166
+ }).codePointAt(0);
167
+ return nextCandidate && nextCandidate >= 65536 ? nextCandidate : code;
122
168
  }
123
- export { classifyCharacter, classifyCharacter_constantsEx as constantsEx, isCjk, isCodeHighSurrogate, isCodeLowSurrogate, isIvs, isNonCjkPunctuation, isSpaceOrPunctuation, isSvsFollowingCjk, isUnicodeWhitespace, tryGetCodeTwoBefore, tryGetGenuineNextCode, tryGetGenuinePreviousCode };
169
+ export {
170
+ classifyCharacter,
171
+ constantsEx,
172
+ isCjk,
173
+ isCodeHighSurrogate,
174
+ isCodeLowSurrogate,
175
+ isIvs,
176
+ isNonCjkPunctuation,
177
+ isSpaceOrPunctuation,
178
+ isSvsFollowingCjk,
179
+ isUnicodeWhitespace,
180
+ tryGetCodeTwoBefore,
181
+ tryGetGenuineNextCode,
182
+ tryGetGenuinePreviousCode
183
+ };
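The surrogate-pair recovery done by `tryGetGenuinePreviousCode` above can be sketched on a plain string. This is a minimal illustration, not the package's API: `genuinePreviousCode`, `text`, and `index` are hypothetical names, and a string index stands in for micromark's `nowPoint._bufferIndex` and `sliceSerialize`.

```javascript
// Hypothetical helper (not exported by the package): works on a plain string
// instead of micromark's Point / sliceSerialize pair.
function genuinePreviousCode(text, index) {
  // Code unit immediately before `index`, i.e. what a tokenizer that reads
  // UTF-16 code units would report as the previous code.
  const code = text.charCodeAt(index - 1);
  if (index < 2) {
    // Fewer than two code units available, so no surrogate pair can precede.
    return code;
  }
  // Take the two code units before `index` and decode them as one code point.
  const candidate = text.slice(index - 2, index).codePointAt(0);
  // A value of 0x10000 (65536) or more means the two units formed a surrogate pair.
  return candidate !== undefined && candidate >= 0x10000 ? candidate : code;
}

// "a", then U+20BB7 (two code units), then "b".
const sample = "a\u{20BB7}b";
const beforeB = genuinePreviousCode(sample, 3);
const beforePair = genuinePreviousCode(sample, 1);
```

With the sample string, the code before index 3 is the full supplementary code point rather than its lone low surrogate, while the code before index 1 is just the code unit for "a".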
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "micromark-extension-cjk-friendly-util",
3
- "version": "1.0.0",
3
+ "version": "1.1.0",
4
4
  "type": "module",
5
5
  "exports": {
6
6
  ".": {
@@ -8,6 +8,8 @@
8
8
  "default": "./dist/index.js"
9
9
  }
10
10
  },
11
+ "module": "./dist/index.js",
12
+ "types": "./dist/index.d.ts",
11
13
  "files": [
12
14
  "dist",
13
15
  "LICENSE",
@@ -44,14 +46,20 @@
44
46
  }
45
47
  },
46
48
  "engines": {
47
- "node": ">=20"
49
+ "node": ">=16"
48
50
  },
49
51
  "scripts": {
50
- "build": "rslib build",
51
- "dev": "rslib build --watch",
52
- "test": "vitest --run",
52
+ "build:rslib": "rslib build",
53
+ "build": "tsup",
54
+ "build:lib": "tsup",
55
+ "dev:rslib": "rslib build --watch",
56
+ "dev": "tsup --watch",
57
+ "dev:lib": "tsup --watch",
58
+ "test": "vitest run",
59
+ "test:lib": "vitest run",
53
60
  "test:up": "vitest -u",
54
61
  "test:watch": "vitest watch",
62
+ "test:lib:watch": "vitest watch",
55
63
  "lint:type": "tsc --noEmit"
56
64
  }
57
65
  }
package/dist/index.cjs DELETED
@@ -1,169 +0,0 @@
1
- "use strict";
2
- var __webpack_require__ = {};
3
- (()=>{
4
- __webpack_require__.d = function(exports1, definition) {
5
- for(var key in definition)if (__webpack_require__.o(definition, key) && !__webpack_require__.o(exports1, key)) Object.defineProperty(exports1, key, {
6
- enumerable: true,
7
- get: definition[key]
8
- });
9
- };
10
- })();
11
- (()=>{
12
- __webpack_require__.o = function(obj, prop) {
13
- return Object.prototype.hasOwnProperty.call(obj, prop);
14
- };
15
- })();
16
- (()=>{
17
- __webpack_require__.r = function(exports1) {
18
- if ('undefined' != typeof Symbol && Symbol.toStringTag) Object.defineProperty(exports1, Symbol.toStringTag, {
19
- value: 'Module'
20
- });
21
- Object.defineProperty(exports1, '__esModule', {
22
- value: true
23
- });
24
- };
25
- })();
26
- var __webpack_exports__ = {};
27
- __webpack_require__.r(__webpack_exports__);
28
- __webpack_require__.d(__webpack_exports__, {
29
- constantsEx: ()=>classifyCharacter_constantsEx,
30
- isCjk: ()=>isCjk,
31
- isSpaceOrPunctuation: ()=>isSpaceOrPunctuation,
32
- isNonCjkPunctuation: ()=>isNonCjkPunctuation,
33
- isSvsFollowingCjk: ()=>isSvsFollowingCjk,
34
- tryGetGenuinePreviousCode: ()=>tryGetGenuinePreviousCode,
35
- isUnicodeWhitespace: ()=>isUnicodeWhitespace,
36
- isCodeHighSurrogate: ()=>isCodeHighSurrogate,
37
- isIvs: ()=>isIvs,
38
- tryGetCodeTwoBefore: ()=>tryGetCodeTwoBefore,
39
- classifyCharacter: ()=>classifyCharacter,
40
- isCodeLowSurrogate: ()=>isCodeLowSurrogate,
41
- tryGetGenuineNextCode: ()=>tryGetGenuineNextCode
42
- });
43
- const external_micromark_util_symbol_namespaceObject = require("micromark-util-symbol");
44
- const external_micromark_util_character_namespaceObject = require("micromark-util-character");
45
- const external_get_east_asian_width_namespaceObject = require("get-east-asian-width");
46
- function cjkOrIvs(uc) {
47
- if (!uc || uc < 0) return false;
48
- const eaw = (0, external_get_east_asian_width_namespaceObject.eastAsianWidthType)(uc);
49
- switch(eaw){
50
- case "fullwidth":
51
- case "halfwidth":
52
- return true;
53
- case "wide":
54
- return !/^\p{RGI_Emoji}/v.test(String.fromCodePoint(uc));
55
- case "narrow":
56
- return false;
57
- case "ambiguous":
58
- return 0xe0100 <= uc && uc <= 0xe01ef && null;
59
- case "neutral":
60
- return /^\p{sc=Hangul}/u.test(String.fromCodePoint(uc));
61
- }
62
- }
63
- const svsFollowingCjk = regexCheck(/[\uFE00-\uFE02\uFE0E]/u);
64
- const unicodePunctuation = regexCheck(/\p{P}|\p{S}/u);
65
- const unicodeWhitespace = regexCheck(/\s/);
66
- function regexCheck(regex) {
67
- return check;
68
- function check(code) {
69
- return null !== code && code > -1 && regex.test(String.fromCodePoint(code));
70
- }
71
- }
72
- (function(constantsEx) {
73
- constantsEx.spaceOrPunctuation = 3;
74
- constantsEx.cjk = 0x1000;
75
- constantsEx.cjkPunctuation = 0x1002;
76
- constantsEx.ivs = 0x2000;
77
- constantsEx.cjkOrIvs = 0x3000;
78
- constantsEx.svsFollowingCjk = 0x4000;
79
- constantsEx.variationSelector = 0x7000;
80
- })(classifyCharacter_constantsEx || (classifyCharacter_constantsEx = {}));
81
- function classifyCharacter(code) {
82
- if (code === external_micromark_util_symbol_namespaceObject.codes.eof || (0, external_micromark_util_character_namespaceObject.markdownLineEndingOrSpace)(code) || unicodeWhitespace(code)) return external_micromark_util_symbol_namespaceObject.constants.characterGroupWhitespace;
83
- let value = 0;
84
- if (code >= 0x1100) {
85
- if (svsFollowingCjk(code)) return classifyCharacter_constantsEx.svsFollowingCjk;
86
- switch(cjkOrIvs(code)){
87
- case null:
88
- return classifyCharacter_constantsEx.ivs;
89
- case true:
90
- value |= classifyCharacter_constantsEx.cjk;
91
- break;
92
- }
93
- }
94
- if (unicodePunctuation(code)) value |= external_micromark_util_symbol_namespaceObject.constants.characterGroupPunctuation;
95
- return value;
96
- }
97
- var classifyCharacter_constantsEx;
98
- function isUnicodeWhitespace(category) {
99
- return Boolean(category & external_micromark_util_symbol_namespaceObject.constants.characterGroupWhitespace);
100
- }
101
- function isNonCjkPunctuation(category) {
102
- return (category & classifyCharacter_constantsEx.cjkPunctuation) === external_micromark_util_symbol_namespaceObject.constants.characterGroupPunctuation;
103
- }
104
- function isCjk(category) {
105
- return Boolean(category & classifyCharacter_constantsEx.cjk);
106
- }
107
- function isIvs(category) {
108
- return category === classifyCharacter_constantsEx.ivs;
109
- }
110
- function isSvsFollowingCjk(category) {
111
- return category === classifyCharacter_constantsEx.svsFollowingCjk;
112
- }
113
- function isSpaceOrPunctuation(category) {
114
- return Boolean(category & classifyCharacter_constantsEx.spaceOrPunctuation);
115
- }
116
- function isCodeHighSurrogate(code) {
117
- return Boolean(code && code >= 0xd800 && code <= 0xdbff);
118
- }
119
- function isCodeLowSurrogate(code) {
120
- return Boolean(code && code >= 0xdc00 && code <= 0xdfff);
121
- }
122
- function tryGetGenuinePreviousCode(code, nowPoint, sliceSerialize) {
123
- if (nowPoint._bufferIndex < 2) return code;
124
- const previousBuffer = sliceSerialize({
125
- start: {
126
- ...nowPoint,
127
- _bufferIndex: nowPoint._bufferIndex - 2
128
- },
129
- end: nowPoint
130
- });
131
- const previousCandidate = previousBuffer.codePointAt(0);
132
- return previousCandidate && previousCandidate >= 65536 ? previousCandidate : code;
133
- }
134
- function tryGetCodeTwoBefore(previousCode, nowPoint, sliceSerialize) {
135
- const previousWidth = previousCode >= 65536 ? 2 : 1;
136
- if (nowPoint._bufferIndex < 1 + previousWidth) return null;
137
- const idealStart = nowPoint._bufferIndex - previousWidth - 2;
138
- const twoPreviousBuffer = sliceSerialize({
139
- start: {
140
- ...nowPoint,
141
- _bufferIndex: idealStart >= 0 ? idealStart : 0
142
- },
143
- end: {
144
- ...nowPoint,
145
- _bufferIndex: nowPoint._bufferIndex - previousWidth
146
- }
147
- });
148
- const twoPreviousLast = twoPreviousBuffer.charCodeAt(twoPreviousBuffer.length - 1);
149
- if (Number.isNaN(twoPreviousLast)) return null;
150
- if (twoPreviousBuffer.length < 2 || twoPreviousLast < 0xdc00 || 0xdfff < twoPreviousLast) return twoPreviousLast;
151
- const twoPreviousCandidate = twoPreviousBuffer.codePointAt(0);
152
- if (twoPreviousCandidate && twoPreviousCandidate >= 65536) return twoPreviousCandidate;
153
- return twoPreviousLast;
154
- }
155
- function tryGetGenuineNextCode(code, nowPoint, sliceSerialize) {
156
- const nextCandidate = sliceSerialize({
157
- start: nowPoint,
158
- end: {
159
- ...nowPoint,
160
- _bufferIndex: nowPoint._bufferIndex + 2
161
- }
162
- }).codePointAt(0);
163
- return nextCandidate && nextCandidate >= 65536 ? nextCandidate : code;
164
- }
165
- var __webpack_export_target__ = exports;
166
- for(var __webpack_i__ in __webpack_exports__)__webpack_export_target__[__webpack_i__] = __webpack_exports__[__webpack_i__];
167
- if (__webpack_exports__.__esModule) Object.defineProperty(__webpack_export_target__, '__esModule', {
168
- value: true
169
- });
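The category predicates in the listing above (`isCjk`, `isNonCjkPunctuation`, `isSpaceOrPunctuation`) are plain bitmask tests. The sketch below re-creates them standalone. The `constantsEx` values are copied from the listing; the values 1 and 2 for micromark's whitespace and punctuation groups are assumptions, chosen to be consistent with `spaceOrPunctuation = 3`.

```javascript
// Assumed values for micromark's constants.characterGroupWhitespace and
// constants.characterGroupPunctuation (not shown in the listing).
const whitespace = 1;
const punctuation = 2;

// Values copied from the constantsEx assignments in the listing.
const cjk = 0x1000;
const cjkPunctuation = 0x1002; // cjk | punctuation
const spaceOrPunctuation = 3; // whitespace | punctuation

// True when the CJK bit is set, regardless of the punctuation bit.
const isCjk = (category) => Boolean(category & cjk);

// True only when the punctuation bit is set and the CJK bit is clear.
const isNonCjkPunctuation = (category) =>
  (category & cjkPunctuation) === punctuation;

// True when either the whitespace or the punctuation bit is set.
const isSpaceOrPunctuation = (category) =>
  Boolean(category & spaceOrPunctuation);

// Illustrative categories: a CJK punctuation mark and an ASCII one.
const cjkMark = cjk | punctuation;
const asciiMark = punctuation;
```

Masking with `cjkPunctuation` before comparing lets one expression check two bits at once: the punctuation bit must be set and the CJK bit must be clear.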