npm - micromark-extension-cjk-friendly-util - Versions diffs - 1.0.0 - Mend

micromark-extension-cjk-friendly-util 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10)

package/LICENSE +28 -0
package/README.md +139 -0
package/dist/categoryUtil.d.ts +45 -0
package/dist/characterWithNonBmp.d.ts +55 -0
package/dist/classifyCharacter.d.ts +26 -0
package/dist/codeUtil.d.ts +45 -0
package/dist/index.cjs +169 -0
package/dist/index.d.ts +3 -0
package/dist/index.js +123 -0
package/package.json +57 -0

package/LICENSE ADDED

@@ -0,0 +1,28 @@
+Copyright (c) 2025 Tatsunori Uchino <tats.u@live.jp>
+MIT LICENSE
+Based on micromark's sub-packages (micromark-util-character, micromark-util-symbol, and micromark-util-classify-character)
+(The MIT License)
+Copyright (c) Titus Wormer <tituswormer@gmail.com>
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+'Software'), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

package/README.md ADDED

@@ -0,0 +1,139 @@
+# micromark-extension-cjk-friendly-util
+[![Version](https://img.shields.io/npm/v/micromark-extension-cjk-friendly-util)](https://npmjs.com/package/micromark-extension-cjk-friendly-util) [![NPM Downloads](https://img.shields.io/npm/dw/micromark-extension-cjk-friendly-util)](https://npmjs.com/package/micromark-extension-cjk-friendly-util) [![NPM Last Update](https://img.shields.io/npm/last-update/micromark-extension-cjk-friendly-util)](https://npmjs.com/package/micromark-extension-cjk-friendly-util)
+A utility library for [micromark-extension-cjk-friendly](https://npmjs.com/package/micromark-extension-cjk-friendly), which is used internally by [remark-cjk-friendly](https://npmjs.com/package/remark-cjk-friendly), and for its related packages.
+## Problem / <span lang="ja">問題</span> / <span lang="zh-Hans-CN">问题</span> / <span lang="ko">문제점</span>
+CommonMark fails to recognize `**` delimiters like the following as emphasis marks in Japanese, Chinese, and Korean text.
+<span lang="ja">CommonMarkには、日本語・中国語・韓国語内の次のような強調記号(`**`)が強調記号として認識されない問題があります。</span>
+<span lang="zh-Hans-CN">CommonMark存在以下问题：在中文、日语和韩语文本中，强调标记`**`不会被识别为强调标记。</span>
+<span lang="ko">CommonMark는 일본어, 중국어, 한국어에서 다음과 같은 강조 표시 `**`가 강조 표시로 인식되지 않는 문제가 있습니다.</span>
+```md
+**このアスタリスクは強調記号として認識されず、そのまま表示されます。**この文のせいで。
+**该星号不会被识别，而是直接显示。**这是因为它没有被识别为强调符号。
+**이 별표는 강조 표시로 인식되지 않고 그대로 표시됩니다(이 괄호 때문에)**이 문장 때문에.
+```
+This problem occurs because the character just inside the closing `**` is a (Japanese or Chinese) punctuation mark (。) or a parenthesis, and the character just outside it is neither a space nor a punctuation mark.
+<span lang="ja">これが起こった原因は、終了側の`**`のすぐ内側が約物（。やカッコ）、かつ外側が約物や空白以外の文字であるためです。</span>
+<span lang="zh-Hans-CN">这个问题是因为在`**`的结束部分，内侧字符是标点符号（。）或括号，而外侧字符不是空格或标点符号。</span>
+<span lang="ko">이 문제는 `**` 바로 안쪽의 문자가 (일본어나 중국어) 문장 부호(。) 또는 괄호이고 바깥쪽 문자가 공백이나 문장 부호가 아니기 때문에 발생합니다.</span>
+Of course, not only the end side but also the start side has the same issue.
+<span lang="ja">もちろん終了側だけでなく、開始側も同様の問題が存在します。</span>
+<span lang="zh-Hans-CN">当然，不仅是结束侧，开始侧也存在同样的问题。</span>
+<span lang="ko">물론 끝나는 부분뿐만 아니라 시작하는 부분에서도 동일한 문제가 있습니다.</span>
+CommonMark issue: https://github.com/commonmark/commonmark-spec/issues/650
+## Runtime Requirements / <span lang="ja">実行環境の要件</span> / <span lang="zh-Hans-CN">运行环境要求</span> / <span lang="ko">실행 환경 요구 사항</span>
+This package uses the [`v` flag of the regular expression](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicodeSets) introduced in ES2024 to determine whether the character is an emoji or not.
+<span lang="ja">本パッケージは文字が絵文字かどうかを判定するために、ES2024で導入された[正規表現の`v`フラグ](https://developer.mozilla.org/ja/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicodeSets)を使用しています。</span>
+<span lang="zh-CN">本包使用 ES2024 引入的[`v` 标志](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicodeSets) 来判断字符是否为 emoji。</span>
+<span lang="ko">이 패키지는 ES2024에서 도입된 [`v` 플래그](https://developer.mozilla.org/ko/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicodeSets)를 사용하여 문자가 이모지인지 여부를 판단합니다.</span>
+As a result, this package works only with relatively recent browsers and Node.js versions:
+<span lang="ja">このため、本パッケージは、次のような比較的新しいブラウザやNode.jsでしか動作しません。</span>
+<span lang="zh-CN">因此，本包只兼容比较新的浏览器和 Node.js:</span>
+<span lang="ko">따라서, 이 패키지는 비교적 최신 브라우저와 Node.js에서만 작동합니다.</span>
+- Chrome / Edge 112 or later
+- Firefox 116 or later
+- Safari 17 or later
+- Node.js 20 or later
+## Installation / <span lang="ja">インストール</span> / <span lang="zh-Hans-CN">安装</span> / <span lang="ko">설치</span>
+Install `micromark-extension-cjk-friendly-util` via [npm](https://www.npmjs.com/):
+<span lang="ja">`micromark-extension-cjk-friendly-util`を[npm](https://www.npmjs.com/)でインストールしてください。</span>
+<span lang="zh-Hans-CN">通过 [npm](https://www.npmjs.com/) 安装 `micromark-extension-cjk-friendly-util`。</span>
+<span lang="ko">`micromark-extension-cjk-friendly-util`를 [npm](https://www.npmjs.com/)으로 설치하세요.</span>
+```bash
+npm install micromark-extension-cjk-friendly-util
+```
+If you use another package manager, please replace `npm install` with the command of the package manager you use (e.g. `pnpm add` or `yarn add`).
+<span lang="ja">npm以外のパッケージマネージャを使う場合は、`npm install`を当該パッケージマネージャのコマンド（例：`pnpm add`・`yarn add`）に置き換えてください。</span>
+<span lang="zh-Hans-CN">如果使用其他包管理器，请将 `npm install` 替换为相应包管理器的命令（例如：`pnpm add`、`yarn add`）。</span>
+<span lang="ko">다른 패키지 매니저를 사용하는 경우 `npm install`을 해당 패키지 매니저의 명령어(예: `pnpm add`, `yarn add`)로 바꾸어 주세요.</span>
+## Usage / <span lang="ja">使い方</span> / <span lang="zh-Hans-CN">用法</span> / <span lang="ko">사용법</span>
+> [!IMPORTANT]
+> Most people do not have to use this package directly. Did you mean:
+>
+> - [remark-cjk-friendly](https://npmjs.com/package/remark-cjk-friendly)
+> - [micromark-extension-cjk-friendly](https://npmjs.com/package/micromark-extension-cjk-friendly)
+This package provides a function and a namespace based on the original micromark-related packages:
+| Name | Type | Derived from | Original Name | Description |
+| --- | --- | --- | --- | --- |
+| `classifyCharacter` | function | [micromark-util-classify-character](https://npmjs.com/package/micromark-util-classify-character) | (same) | Classifies a character as whitespace, punctuation, CJK, or a variation selector |
+| `constantsEx` | namespace | [micromark-util-symbol](https://npmjs.com/package/micromark-util-symbol) | `constants` | Constants for CJK characters and variation selectors; use them together with the original `constants`. |
+Also, this package provides some utility functions to check whether a character belongs to the category defined in the specification (e.g. CJK code point without variation selector), or to help you fetch the Unicode Code Point of a character around the emphasis mark.
+## Specification / <span lang="ja">規格書</span> / <span lang="zh-Hans-CN">规范</span> / <span lang="ko">규정서</span>
+https://github.com/tats-u/markdown-cjk-friendly/blob/main/specification.md (English)
+## Related packages / <span lang="ja">関連パッケージ</span> / <span lang="zh-Hans-CN">相关包</span> / <span lang="ko">관련 패키지</span>
+- [micromark-extension-cjk-friendly](https://npmjs.com/package/micromark-extension-cjk-friendly) [![Version](https://img.shields.io/npm/v/micromark-extension-cjk-friendly)](https://npmjs.com/package/micromark-extension-cjk-friendly) [![NPM Downloads](https://img.shields.io/npm/dw/micromark-extension-cjk-friendly)](https://npmjs.com/package/micromark-extension-cjk-friendly) [![NPM Last Update](https://img.shields.io/npm/last-update/micromark-extension-cjk-friendly)](https://npmjs.com/package/micromark-extension-cjk-friendly)
+- [remark-cjk-friendly](https://npmjs.com/package/remark-cjk-friendly) [![Version](https://img.shields.io/npm/v/remark-cjk-friendly)](https://npmjs.com/package/remark-cjk-friendly) [![NPM Downloads](https://img.shields.io/npm/dw/remark-cjk-friendly)](https://npmjs.com/package/remark-cjk-friendly) [![NPM Last Update](https://img.shields.io/npm/last-update/remark-cjk-friendly)](https://npmjs.com/package/remark-cjk-friendly)
+- [markdown-it-cjk-friendly](https://npmjs.com/package/markdown-it-cjk-friendly) [![Version](https://img.shields.io/npm/v/markdown-it-cjk-friendly)](https://npmjs.com/package/markdown-it-cjk-friendly) [![NPM Downloads](https://img.shields.io/npm/dw/markdown-it-cjk-friendly)](https://npmjs.com/package/markdown-it-cjk-friendly) [![NPM Last Update](https://img.shields.io/npm/last-update/markdown-it-cjk-friendly)](https://npmjs.com/package/markdown-it-cjk-friendly)
+- [micromark-extension-cjk-friendly-gfm-strikethrough](https://npmjs.com/package/micromark-extension-cjk-friendly-gfm-strikethrough) [![Version](https://img.shields.io/npm/v/micromark-extension-cjk-friendly-gfm-strikethrough)](https://npmjs.com/package/micromark-extension-cjk-friendly-gfm-strikethrough) [![NPM Downloads](https://img.shields.io/npm/dw/micromark-extension-cjk-friendly-gfm-strikethrough)](https://npmjs.com/package/micromark-extension-cjk-friendly-gfm-strikethrough) [![NPM Last Update](https://img.shields.io/npm/last-update/micromark-extension-cjk-friendly-gfm-strikethrough)](https://npmjs.com/package/micromark-extension-cjk-friendly-gfm-strikethrough)
+## Contributing / <span lang="ja">貢献</span> / <span lang="zh-Hans-CN">贡献</span> / <span lang="ko">기여</span>
+### Setup
+Install the dependencies:
+```bash
+pnpm install
+```
+### Get Started
+Build the library:
+```bash
+pnpm build
+```
+Build the library in watch mode:
+```bash
+pnpm dev
+```
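
The README's explanation of why the closing `**` fails can be illustrated with a simplified sketch of CommonMark's right-flanking rule. The helper names below are hypothetical and are not part of this package; the sketch only looks at the single characters on each side of the delimiter run and treats the start or end of the input (`undefined`) as whitespace.

```typescript
// Hypothetical helpers for illustration only; not exported by this package.
// The regular expressions are built from strings so the sketch does not depend
// on the compiler's regex-literal target checks.
const whitespacePattern = new RegExp("\\s");
const punctuationPattern = new RegExp("[\\p{P}\\p{S}]", "u");

function isWhitespaceLike(ch: string | undefined): boolean {
  // Start or end of input counts as whitespace for flanking purposes.
  return ch === undefined || whitespacePattern.test(ch);
}

function isPunctuationLike(ch: string | undefined): boolean {
  return ch !== undefined && punctuationPattern.test(ch);
}

// Simplified CommonMark rule: a delimiter run is right-flanking (and so may
// close emphasis) if it is not preceded by whitespace, and either it is not
// preceded by punctuation, or it is followed by whitespace or punctuation.
function isRightFlanking(
  before: string | undefined,
  after: string | undefined,
): boolean {
  if (isWhitespaceLike(before)) return false;
  if (!isPunctuationLike(before)) return true;
  return isWhitespaceLike(after) || isPunctuationLike(after);
}

// "。**こ": punctuation inside, a letter outside, so `**` cannot close.
console.log(isRightFlanking("。", "こ")); // false
// "す**こ": no punctuation inside, so `**` can close.
console.log(isRightFlanking("す", "こ")); // true
```

Under this rule the closing `**` in the README samples is rejected only because of the character pair around it, which is the case the CJK-friendly extension is meant to relax.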

package/dist/categoryUtil.d.ts ADDED

@@ -0,0 +1,45 @@
+import { type classifyCharacter } from "./classifyCharacter.js";
+type Category = ReturnType<typeof classifyCharacter>;
+/**
+ * `true` if the code point represents a [Unicode whitespace character](https://spec.commonmark.org/0.31.2/#unicode-whitespace-character).
+ *
+ * @param category the return value of `classifyCharacter`.
+ * @returns `true` if the code point represents a Unicode whitespace character
+ */
+export declare function isUnicodeWhitespace(category: Category): boolean;
+/**
+ * `true` if the code point represents a [non-CJK punctuation character](https://github.com/tats-u/markdown-cjk-friendly/blob/main/specification.md#non-cjk-punctuation-character).
+ *
+ * @param category the return value of `classifyCharacter`.
+ * @returns `true` if the code point represents a non-CJK punctuation character
+ */
+export declare function isNonCjkPunctuation(category: Category): boolean;
+/**
+ * `true` if the code point represents a [CJK character (CJK code point without variation selector)](https://github.com/tats-u/markdown-cjk-friendly/blob/main/specification.md#cjk-code-point-without-variation-selector).
+ *
+ * @param category the return value of `classifyCharacter`.
+ * @returns `true` if the code point represents a CJK character
+ */
+export declare function isCjk(category: Category): boolean;
+/**
+ * `true` if the code point represents an [IVS (Ideographic Variation Selector)](https://github.com/tats-u/markdown-cjk-friendly/blob/main/specification.md#ivs).
+ *
+ * @param category the return value of `classifyCharacter`.
+ * @returns `true` if the code point represents an IVS
+ */
+export declare function isIvs(category: Category): boolean;
+/**
+ * `true` if the code point represents an [SVS (Standard Variation Selector/Sequence) that can follow CJK](https://github.com/tats-u/markdown-cjk-friendly/blob/main/specification.md#svs-that-can-follow-cjk).
+ *
+ * @param category the return value of `classifyCharacter`.
+ * @returns `true` if the code point represents an SVS that can follow CJK
+ */
+export declare function isSvsFollowingCjk(category: Category): boolean;
+/**
+ * `true` if the code point represents a [Unicode whitespace character](https://spec.commonmark.org/0.31.2/#unicode-whitespace-character) or a [Unicode punctuation character](https://spec.commonmark.org/0.31.2/#unicode-punctuation-character).
+ *
+ * @param category the return value of `classifyCharacter`.
+ * @returns `true` if the code point represents a space or punctuation
+ */
+export declare function isSpaceOrPunctuation(category: Category): boolean;
+export {};
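
The category predicates declared above operate on a bit mask rather than on a code point. A minimal sketch of how two of them behave, assuming the numeric values shown elsewhere in this diff (`constantsEx.cjk = 0x1000`, `constantsEx.cjkPunctuation = 0x1002`) and the values 1 and 2 for micromark's whitespace and punctuation groups. The functions are local re-statements for illustration, not imports from the package.

```typescript
// Values copied from this diff; declared locally so the sketch is self-contained.
const characterGroupPunctuation = 2; // micromark's punctuation group
const cjk = 0x1000; // constantsEx.cjk
const cjkPunctuation = 0x1002; // constantsEx.cjkPunctuation (cjk | punctuation)

function isNonCjkPunctuation(category: number): boolean {
  // Keep only the CJK and punctuation bits, then require that the
  // punctuation bit is set and the CJK bit is not.
  return (category & cjkPunctuation) === characterGroupPunctuation;
}

function isCjk(category: number): boolean {
  // Any category carrying the CJK bit counts, including CJK punctuation.
  return Boolean(category & cjk);
}

// Category values as the bundled classifyCharacter would combine them.
const plainPunctuation = characterGroupPunctuation; // punctuation bit only
const cjkWithPunctuation = cjk | characterGroupPunctuation; // both bits
const cjkLetter = cjk; // CJK bit only

console.log(isNonCjkPunctuation(plainPunctuation)); // true
console.log(isNonCjkPunctuation(cjkWithPunctuation)); // false
console.log(isCjk(cjkWithPunctuation)); // true
console.log(isCjk(cjkLetter)); // true
```

Because CJK punctuation carries both bits, it answers `true` to `isCjk` and `false` to `isNonCjkPunctuation`, which is what lets callers treat it differently from ASCII punctuation.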

package/dist/characterWithNonBmp.d.ts ADDED

@@ -0,0 +1,55 @@
+import type { Code } from "micromark-util-types";
+/**
+ * Check if `uc` is CJK or IVS
+ *
+ * @param uc code point
+ * @returns `true` if `uc` is CJK, `null` if IVS, or `false` if neither
+ */
+export declare function cjkOrIvs(uc: Code): boolean | null;
+/**
+ * Check whether the character code represents a Standard Variation Selector (SVS) that can follow an ideographic character.
+ *
+ * U+FE0E is used for some CJK symbols (e.g. U+3299) that can also be rendered as emoji.
+ */
+export declare const svsFollowingCjk: (code: Code) => boolean;
+/**
+ * Check whether the character code represents Unicode punctuation.
+ *
+ * A **Unicode punctuation** is a character in the Unicode `Pc` (Punctuation,
+ * Connector), `Pd` (Punctuation, Dash), `Pe` (Punctuation, Close), `Pf`
+ * (Punctuation, Final quote), `Pi` (Punctuation, Initial quote), `Po`
+ * (Punctuation, Other), or `Ps` (Punctuation, Open) categories, or an ASCII
+ * punctuation (see `asciiPunctuation`). The bundled implementation also
+ * matches characters in the Unicode `S` (Symbol) categories (`\p{S}`).
+ *
+ * See:
+ * **\[UNICODE]**:
+ * [The Unicode Standard](https://www.unicode.org/versions/).
+ * Unicode Consortium.
+ *
+ * @param code
+ *   Code.
+ * @returns
+ *   Whether it matches.
+ */
+export declare const unicodePunctuation: (code: Code) => boolean;
+/**
+ * Check whether the character code represents Unicode whitespace.
+ *
+ * Note that this does not handle micromark-specific markdown whitespace characters.
+ * See `markdownLineEndingOrSpace` to check that.
+ *
+ * A **Unicode whitespace** is a character in the Unicode `Zs` (Separator,
+ * Space) category, or U+0009 CHARACTER TABULATION (HT), U+000A LINE FEED (LF),
+ * U+000C FORM FEED (FF), or U+000D CARRIAGE RETURN (CR) (**\[UNICODE]**).
+ *
+ * See:
+ * **\[UNICODE]**:
+ * [The Unicode Standard](https://www.unicode.org/versions/).
+ * Unicode Consortium.
+ *
+ * @param code
+ *   Code.
+ * @returns
+ *   Whether it matches.
+ */
+export declare const unicodeWhitespace: (code: Code) => boolean;
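
The two kinds of variation selector distinguished by these declarations can be sketched as plain range checks. The ranges are the ones that appear in the bundled code in this diff (U+E0100 to U+E01EF for IVS; U+FE00 to U+FE02 and U+FE0E for SVS that can follow CJK). The function names are hypothetical and exist only for this illustration.

```typescript
// Hypothetical stand-alone checks; the package itself goes through
// `cjkOrIvs` and `svsFollowingCjk` instead.
function isIvsCodePoint(codePoint: number): boolean {
  // Variation Selectors Supplement, used for Ideographic Variation Sequences.
  return 0xe0100 <= codePoint && codePoint <= 0xe01ef;
}

function isSvsFollowingCjkCodePoint(codePoint: number): boolean {
  // VS1 to VS3, plus U+FE0E (the text presentation selector).
  return (0xfe00 <= codePoint && codePoint <= 0xfe02) || codePoint === 0xfe0e;
}

console.log(isIvsCodePoint(0xe0100)); // true
console.log(isIvsCodePoint(0xfe00)); // false
console.log(isSvsFollowingCjkCodePoint(0xfe0e)); // true
console.log(isSvsFollowingCjkCodePoint(0xfe0f)); // false
```

Note that U+FE0F (the emoji presentation selector) is deliberately outside both ranges in the bundled pattern.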

package/dist/classifyCharacter.d.ts ADDED

@@ -0,0 +1,26 @@
+import { constants } from "micromark-util-symbol";
+import type { Code } from "micromark-util-types";
+export declare namespace constantsEx {
+    const spaceOrPunctuation: 3;
+    const cjk: 4096;
+    const cjkPunctuation: 4098;
+    const ivs: 8192;
+    const cjkOrIvs: 12288;
+    const svsFollowingCjk: 16384;
+    const variationSelector: 28672;
+}
+/**
+ * Classify whether a code represents whitespace, punctuation, or something
+ * else.
+ *
+ * Used for attention (emphasis, strong), whose sequences can open or close
+ * based on the class of surrounding characters.
+ *
+ * > 👉 **Note**: eof (`null`) is seen as whitespace.
+ *
+ * @param code
+ *   Code.
+ * @returns
+ *   Group.
+ */
+export declare function classifyCharacter(code: Code): typeof constants.characterGroupWhitespace | typeof constants.characterGroupPunctuation | typeof constantsEx.cjk | typeof constantsEx.cjkPunctuation | typeof constantsEx.ivs | typeof constantsEx.svsFollowingCjk | 0;

package/dist/codeUtil.d.ts ADDED

@@ -0,0 +1,45 @@
+import type { Code, Point, TokenizeContext } from "micromark-util-types";
+/**
+ * Check if the given code is a [High-Surrogate Code Unit](https://www.unicode.org/glossary/#high_surrogate_code_unit).
+ *
+ * A High-Surrogate Code Unit is the _first_ half of a [Surrogate Pair](https://www.unicode.org/glossary/#surrogate_pair).
+ *
+ * @param code Code.
+ * @returns `true` if the code is a High-Surrogate Code Unit, `false` otherwise.
+ */
+export declare function isCodeHighSurrogate(code: Code): code is Exclude<Code, null>;
+/**
+ * Check if the given code is a [Low-Surrogate Code Unit](https://www.unicode.org/glossary/#low_surrogate_code_unit).
+ *
+ * A Low-Surrogate Code Unit is the _second_ half of a [Surrogate Pair](https://www.unicode.org/glossary/#surrogate_pair).
+ * @param code
+ *   The character code to check.
+ * @returns
+ *   True if the code is a Low-Surrogate Code Unit, false otherwise.
+ */
+export declare function isCodeLowSurrogate(code: Code): code is Exclude<Code, null>;
+/**
+ * If `code` is a [Low-Surrogate Code Unit](https://www.unicode.org/glossary/#low_surrogate_code_unit), try to get a genuine previous [Unicode Scalar Value](https://www.unicode.org/glossary/#unicode_scalar_value) corresponding to the Low-Surrogate Code Unit.
+ * @param code a tentative previous [code unit](https://www.unicode.org/glossary/#code_unit) less than 65,536, including a Low-Surrogate one
+ * @param nowPoint `this.now()` (`this` = `TokenizeContext`)
+ * @param sliceSerialize `this.sliceSerialize` (`this` = `TokenizeContext`)
+ * @returns a value greater than 65,535 if the previous code point represents a [Supplementary Character](https://www.unicode.org/glossary/#supplementary_character), or `code` otherwise
+ */
+export declare function tryGetGenuinePreviousCode(code: Exclude<Code, null>, nowPoint: Point, sliceSerialize: TokenizeContext["sliceSerialize"]): Exclude<Code, null>;
+/**
+ * Try to get the [Unicode Code Point](https://www.unicode.org/glossary/#code_point) two positions before the current position.
+ *
+ * @param previousCode a previous code point. Should be greater than 65,535 if it represents a [Supplementary Character](https://www.unicode.org/glossary/#supplementary_character).
+ * @param nowPoint `this.now()` (`this` = `TokenizeContext`)
+ * @param sliceSerialize `this.sliceSerialize` (`this` = `TokenizeContext`)
+ * @returns a value greater than 65,535 if the code point two positions before represents a [Supplementary Character](https://www.unicode.org/glossary/#supplementary_character), a value less than 65,536 for a [BMP Character](https://www.unicode.org/glossary/#bmp_character), or `null` if not found
+ */
+export declare function tryGetCodeTwoBefore(previousCode: Exclude<Code, null>, nowPoint: Point, sliceSerialize: TokenizeContext["sliceSerialize"]): Code;
+/**
+ * If `code` is a [High-Surrogate Code Unit](https://www.unicode.org/glossary/#high_surrogate_code_unit), try to get a genuine next [Unicode Scalar Value](https://www.unicode.org/glossary/#unicode_scalar_value) corresponding to the High-Surrogate Code Unit.
+ * @param code a tentative next [code unit](https://www.unicode.org/glossary/#code_unit) less than 65,536, including a High-Surrogate one
+ * @param nowPoint `this.now()` (`this` = `TokenizeContext`)
+ * @param sliceSerialize `this.sliceSerialize` (`this` = `TokenizeContext`)
+ * @returns a value greater than 65,535 if the next code point represents a [Supplementary Character](https://www.unicode.org/glossary/#supplementary_character), or `code` otherwise
+ */
+export declare function tryGetGenuineNextCode(code: Exclude<Code, null>, nowPoint: Point, sliceSerialize: TokenizeContext["sliceSerialize"]): Exclude<Code, null>;
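
The surrogate helpers exist because a tokenizer that walks UTF-16 code units sees only half of a supplementary character at a time. The following sketch shows the same idea on a plain string instead of micromark's `Point` and `sliceSerialize`. `codePointBefore` is a hypothetical analogue of `tryGetGenuinePreviousCode`, not the package's implementation.

```typescript
function isHighSurrogate(unit: number): boolean {
  return unit >= 0xd800 && unit <= 0xdbff;
}

function isLowSurrogate(unit: number): boolean {
  return unit >= 0xdc00 && unit <= 0xdfff;
}

// Hypothetical helper: returns the code point that ends just before `index`
// (counted in UTF-16 code units), or null at the start of the string.
function codePointBefore(text: string, index: number): number | null {
  if (index <= 0) return null;
  const last = text.charCodeAt(index - 1);
  if (isLowSurrogate(last) && index >= 2) {
    const first = text.charCodeAt(index - 2);
    if (isHighSurrogate(first)) {
      // Combine the pair into a scalar value above U+FFFF.
      return (first - 0xd800) * 0x400 + (last - 0xdc00) + 0x10000;
    }
  }
  // BMP character (or an unpaired surrogate): return the unit unchanged.
  return last;
}

// U+20BB7 occupies two code units (0xD842, 0xDFB7), so `**` starts at index 2.
const supplementary = "\uD842\uDFB7**";
console.log(codePointBefore(supplementary, 2) === 0x20bb7); // true

// U+6F22 is a BMP character, so `**` starts at index 1.
const bmp = "\u6F22**";
console.log(codePointBefore(bmp, 1) === 0x6f22); // true
```

Returning the unit unchanged when no pair is found mirrors the "or `code` otherwise" behaviour documented for the declared functions.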

package/dist/index.cjs ADDED

@@ -0,0 +1,169 @@
+"use strict";
+var __webpack_require__ = {};
+(()=>{
+    __webpack_require__.d = function(exports1, definition) {
+        for(var key in definition)if (__webpack_require__.o(definition, key) && !__webpack_require__.o(exports1, key)) Object.defineProperty(exports1, key, {
+            enumerable: true,
+            get: definition[key]
+        });
+    };
+})();
+(()=>{
+    __webpack_require__.o = function(obj, prop) {
+        return Object.prototype.hasOwnProperty.call(obj, prop);
+    };
+})();
+(()=>{
+    __webpack_require__.r = function(exports1) {
+        if ('undefined' != typeof Symbol && Symbol.toStringTag) Object.defineProperty(exports1, Symbol.toStringTag, {
+            value: 'Module'
+        });
+        Object.defineProperty(exports1, '__esModule', {
+            value: true
+        });
+    };
+})();
+var __webpack_exports__ = {};
+__webpack_require__.r(__webpack_exports__);
+__webpack_require__.d(__webpack_exports__, {
+    constantsEx: ()=>classifyCharacter_constantsEx,
+    isCjk: ()=>isCjk,
+    isSpaceOrPunctuation: ()=>isSpaceOrPunctuation,
+    isNonCjkPunctuation: ()=>isNonCjkPunctuation,
+    isSvsFollowingCjk: ()=>isSvsFollowingCjk,
+    tryGetGenuinePreviousCode: ()=>tryGetGenuinePreviousCode,
+    isUnicodeWhitespace: ()=>isUnicodeWhitespace,
+    isCodeHighSurrogate: ()=>isCodeHighSurrogate,
+    isIvs: ()=>isIvs,
+    tryGetCodeTwoBefore: ()=>tryGetCodeTwoBefore,
+    classifyCharacter: ()=>classifyCharacter,
+    isCodeLowSurrogate: ()=>isCodeLowSurrogate,
+    tryGetGenuineNextCode: ()=>tryGetGenuineNextCode
+});
+const external_micromark_util_symbol_namespaceObject = require("micromark-util-symbol");
+const external_micromark_util_character_namespaceObject = require("micromark-util-character");
+const external_get_east_asian_width_namespaceObject = require("get-east-asian-width");
+function cjkOrIvs(uc) {
+    if (!uc || uc < 0) return false;
+    const eaw = (0, external_get_east_asian_width_namespaceObject.eastAsianWidthType)(uc);
+    switch(eaw){
+        case "fullwidth":
+        case "halfwidth":
+            return true;
+        case "wide":
+            return !/^\p{RGI_Emoji}/v.test(String.fromCodePoint(uc));
+        case "narrow":
+            return false;
+        case "ambiguous":
+            return 0xe0100 <= uc && uc <= 0xe01ef && null;
+        case "neutral":
+            return /^\p{sc=Hangul}/u.test(String.fromCodePoint(uc));
+    }
+}
+const svsFollowingCjk = regexCheck(/[\uFE00-\uFE02\uFE0E]/u);
+const unicodePunctuation = regexCheck(/\p{P}|\p{S}/u);
+const unicodeWhitespace = regexCheck(/\s/);
+function regexCheck(regex) {
+    return check;
+    function check(code) {
+        return null !== code && code > -1 && regex.test(String.fromCodePoint(code));
+    }
+}
+(function(constantsEx) {
+    constantsEx.spaceOrPunctuation = 3;
+    constantsEx.cjk = 0x1000;
+    constantsEx.cjkPunctuation = 0x1002;
+    constantsEx.ivs = 0x2000;
+    constantsEx.cjkOrIvs = 0x3000;
+    constantsEx.svsFollowingCjk = 0x4000;
+    constantsEx.variationSelector = 0x7000;
+})(classifyCharacter_constantsEx || (classifyCharacter_constantsEx = {}));
+function classifyCharacter(code) {
+    if (code === external_micromark_util_symbol_namespaceObject.codes.eof || (0, external_micromark_util_character_namespaceObject.markdownLineEndingOrSpace)(code) || unicodeWhitespace(code)) return external_micromark_util_symbol_namespaceObject.constants.characterGroupWhitespace;
+    let value = 0;
+    if (code >= 0x1100) {
+        if (svsFollowingCjk(code)) return classifyCharacter_constantsEx.svsFollowingCjk;
+        switch(cjkOrIvs(code)){
+            case null:
+                return classifyCharacter_constantsEx.ivs;
+            case true:
+                value |= classifyCharacter_constantsEx.cjk;
+                break;
+        }
+    }
+    if (unicodePunctuation(code)) value |= external_micromark_util_symbol_namespaceObject.constants.characterGroupPunctuation;
+    return value;
+}
+var classifyCharacter_constantsEx;
+function isUnicodeWhitespace(category) {
+    return Boolean(category & external_micromark_util_symbol_namespaceObject.constants.characterGroupWhitespace);
+}
+function isNonCjkPunctuation(category) {
+    return (category & classifyCharacter_constantsEx.cjkPunctuation) === external_micromark_util_symbol_namespaceObject.constants.characterGroupPunctuation;
+}
+function isCjk(category) {
+    return Boolean(category & classifyCharacter_constantsEx.cjk);
+}
+function isIvs(category) {
+    return category === classifyCharacter_constantsEx.ivs;
+}
+function isSvsFollowingCjk(category) {
+    return category === classifyCharacter_constantsEx.svsFollowingCjk;
+}
+function isSpaceOrPunctuation(category) {
+    return Boolean(category & classifyCharacter_constantsEx.spaceOrPunctuation);
+}
+function isCodeHighSurrogate(code) {
+    return Boolean(code && code >= 0xd800 && code <= 0xdbff);
+}
+function isCodeLowSurrogate(code) {
+    return Boolean(code && code >= 0xdc00 && code <= 0xdfff);
+}
+function tryGetGenuinePreviousCode(code, nowPoint, sliceSerialize) {
+    if (nowPoint._bufferIndex < 2) return code;
+    const previousBuffer = sliceSerialize({
+        start: {
+            ...nowPoint,
+            _bufferIndex: nowPoint._bufferIndex - 2
+        },
+        end: nowPoint
+    });
+    const previousCandidate = previousBuffer.codePointAt(0);
+    return previousCandidate && previousCandidate >= 65536 ? previousCandidate : code;
+}
+function tryGetCodeTwoBefore(previousCode, nowPoint, sliceSerialize) {
+    const previousWidth = previousCode >= 65536 ? 2 : 1;
+    if (nowPoint._bufferIndex < 1 + previousWidth) return null;
+    const idealStart = nowPoint._bufferIndex - previousWidth - 2;
+    const twoPreviousBuffer = sliceSerialize({
+        start: {
+            ...nowPoint,
+            _bufferIndex: idealStart >= 0 ? idealStart : 0
+        },
+        end: {
+            ...nowPoint,
+            _bufferIndex: nowPoint._bufferIndex - previousWidth
+        }
+    });
+    const twoPreviousLast = twoPreviousBuffer.charCodeAt(twoPreviousBuffer.length - 1);
+    if (Number.isNaN(twoPreviousLast)) return null;
+    if (twoPreviousBuffer.length < 2 || twoPreviousLast < 0xdc00 || 0xdfff < twoPreviousLast) return twoPreviousLast;
+    const twoPreviousCandidate = twoPreviousBuffer.codePointAt(0);
+    if (twoPreviousCandidate && twoPreviousCandidate >= 65536) return twoPreviousCandidate;
+    return twoPreviousLast;
+}
+function tryGetGenuineNextCode(code, nowPoint, sliceSerialize) {
+    const nextCandidate = sliceSerialize({
+        start: nowPoint,
+        end: {
+            ...nowPoint,
+            _bufferIndex: nowPoint._bufferIndex + 2
+        }
+    }).codePointAt(0);
+    return nextCandidate && nextCandidate >= 65536 ? nextCandidate : code;
+}
+var __webpack_export_target__ = exports;
+for(var __webpack_i__ in __webpack_exports__)__webpack_export_target__[__webpack_i__] = __webpack_exports__[__webpack_i__];
+if (__webpack_exports__.__esModule) Object.defineProperty(__webpack_export_target__, '__esModule', {
+    value: true
+});
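
One expression in the bundled `cjkOrIvs` is easy to misread: in the `"ambiguous"` branch it returns `0xe0100 <= uc && uc <= 0xe01ef && null`. Because `&&` yields its first falsy operand, this evaluates to `null` when `uc` is inside the IVS range and to `false` otherwise. A small sketch with hypothetical function names makes the two outcomes explicit:

```typescript
// Same expression as in the bundle, isolated for illustration.
function ivsOrFalse(uc: number): false | null {
  return 0xe0100 <= uc && uc <= 0xe01ef && null;
}

// Equivalent spelled-out form.
function ivsOrFalseExplicit(uc: number): false | null {
  if (uc < 0xe0100 || uc > 0xe01ef) return false;
  return null;
}

console.log(ivsOrFalse(0xe0100)); // null
console.log(ivsOrFalse(0x00e9)); // false
console.log(ivsOrFalse(0xe01ef) === ivsOrFalseExplicit(0xe01ef)); // true
```

This is how `classifyCharacter` can tell the three cases apart with a single `switch`: `true` for CJK, `null` for IVS, and `false` for everything else.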

package/dist/index.d.ts ADDED

@@ -0,0 +1,3 @@
+export { isCjk, isIvs, isNonCjkPunctuation, isSpaceOrPunctuation, isSvsFollowingCjk, isUnicodeWhitespace, } from "./categoryUtil.js";
+export { classifyCharacter, constantsEx } from "./classifyCharacter.js";
+export { isCodeHighSurrogate, isCodeLowSurrogate, tryGetGenuineNextCode, tryGetGenuinePreviousCode, tryGetCodeTwoBefore, } from "./codeUtil.js";

package/dist/index.js ADDED

@@ -0,0 +1,123 @@
+import * as __WEBPACK_EXTERNAL_MODULE_micromark_util_symbol_0f33a452__ from "micromark-util-symbol";
+import * as __WEBPACK_EXTERNAL_MODULE_micromark_util_character_bdee8a5c__ from "micromark-util-character";
+import * as __WEBPACK_EXTERNAL_MODULE_get_east_asian_width_c890dec0__ from "get-east-asian-width";
+function cjkOrIvs(uc) {
+    if (!uc || uc < 0) return false;
+    const eaw = (0, __WEBPACK_EXTERNAL_MODULE_get_east_asian_width_c890dec0__.eastAsianWidthType)(uc);
+    switch(eaw){
+        case "fullwidth":
+        case "halfwidth":
+            return true;
+        case "wide":
+            return !/^\p{RGI_Emoji}/v.test(String.fromCodePoint(uc));
+        case "narrow":
+            return false;
+        case "ambiguous":
+            return 0xe0100 <= uc && uc <= 0xe01ef && null;
+        case "neutral":
+            return /^\p{sc=Hangul}/u.test(String.fromCodePoint(uc));
+    }
+}
+const svsFollowingCjk = regexCheck(/[\uFE00-\uFE02\uFE0E]/u);
+const unicodePunctuation = regexCheck(/\p{P}|\p{S}/u);
+const unicodeWhitespace = regexCheck(/\s/);
+function regexCheck(regex) {
+    return check;
+    function check(code) {
+        return null !== code && code > -1 && regex.test(String.fromCodePoint(code));
+    }
+}
+(function(constantsEx) {
+    constantsEx.spaceOrPunctuation = 3;
+    constantsEx.cjk = 0x1000;
+    constantsEx.cjkPunctuation = 0x1002;
+    constantsEx.ivs = 0x2000;
+    constantsEx.cjkOrIvs = 0x3000;
+    constantsEx.svsFollowingCjk = 0x4000;
+    constantsEx.variationSelector = 0x7000;
+})(classifyCharacter_constantsEx || (classifyCharacter_constantsEx = {}));
+function classifyCharacter(code) {
+    if (code === __WEBPACK_EXTERNAL_MODULE_micromark_util_symbol_0f33a452__.codes.eof || (0, __WEBPACK_EXTERNAL_MODULE_micromark_util_character_bdee8a5c__.markdownLineEndingOrSpace)(code) || unicodeWhitespace(code)) return __WEBPACK_EXTERNAL_MODULE_micromark_util_symbol_0f33a452__.constants.characterGroupWhitespace;
+    let value = 0;
+    if (code >= 0x1100) {
+        if (svsFollowingCjk(code)) return classifyCharacter_constantsEx.svsFollowingCjk;
+        switch(cjkOrIvs(code)){
+            case null:
+                return classifyCharacter_constantsEx.ivs;
+            case true:
+                value |= classifyCharacter_constantsEx.cjk;
+                break;
+        }
+    }
+    if (unicodePunctuation(code)) value |= __WEBPACK_EXTERNAL_MODULE_micromark_util_symbol_0f33a452__.constants.characterGroupPunctuation;
+    return value;
+}
+var classifyCharacter_constantsEx;
+function isUnicodeWhitespace(category) {
+    return Boolean(category & __WEBPACK_EXTERNAL_MODULE_micromark_util_symbol_0f33a452__.constants.characterGroupWhitespace);
+}
+function isNonCjkPunctuation(category) {
+    return (category & classifyCharacter_constantsEx.cjkPunctuation) === __WEBPACK_EXTERNAL_MODULE_micromark_util_symbol_0f33a452__.constants.characterGroupPunctuation;
+}
+function isCjk(category) {
+    return Boolean(category & classifyCharacter_constantsEx.cjk);
+}
+function isIvs(category) {
+    return category === classifyCharacter_constantsEx.ivs;
+}
+function isSvsFollowingCjk(category) {
+    return category === classifyCharacter_constantsEx.svsFollowingCjk;
+}
+function isSpaceOrPunctuation(category) {
+    return Boolean(category & classifyCharacter_constantsEx.spaceOrPunctuation);
+}
+function isCodeHighSurrogate(code) {
+    return Boolean(code && code >= 0xd800 && code <= 0xdbff);
+}
+function isCodeLowSurrogate(code) {
+    return Boolean(code && code >= 0xdc00 && code <= 0xdfff);
+}
+function tryGetGenuinePreviousCode(code, nowPoint, sliceSerialize) {
+    if (nowPoint._bufferIndex < 2) return code;
+    const previousBuffer = sliceSerialize({
+        start: {
+            ...nowPoint,
+            _bufferIndex: nowPoint._bufferIndex - 2
+        },
+        end: nowPoint
+    });
+    const previousCandidate = previousBuffer.codePointAt(0);
+    return previousCandidate && previousCandidate >= 65536 ? previousCandidate : code;
+}
+function tryGetCodeTwoBefore(previousCode, nowPoint, sliceSerialize) {
+    const previousWidth = previousCode >= 65536 ? 2 : 1;
+    if (nowPoint._bufferIndex < 1 + previousWidth) return null;
+    const idealStart = nowPoint._bufferIndex - previousWidth - 2;
+    const twoPreviousBuffer = sliceSerialize({
+        start: {
+            ...nowPoint,
+            _bufferIndex: idealStart >= 0 ? idealStart : 0
+        },
+        end: {
+            ...nowPoint,
+            _bufferIndex: nowPoint._bufferIndex - previousWidth
+        }
+    });
+    const twoPreviousLast = twoPreviousBuffer.charCodeAt(twoPreviousBuffer.length - 1);
+    if (Number.isNaN(twoPreviousLast)) return null;
+    if (twoPreviousBuffer.length < 2 || twoPreviousLast < 0xdc00 || 0xdfff < twoPreviousLast) return twoPreviousLast;
+    const twoPreviousCandidate = twoPreviousBuffer.codePointAt(0);
+    if (twoPreviousCandidate && twoPreviousCandidate >= 65536) return twoPreviousCandidate;
+    return twoPreviousLast;
+}
+function tryGetGenuineNextCode(code, nowPoint, sliceSerialize) {
+    const nextCandidate = sliceSerialize({
+        start: nowPoint,
+        end: {
+            ...nowPoint,
+            _bufferIndex: nowPoint._bufferIndex + 2
+        }
+    }).codePointAt(0);
+    return nextCandidate && nextCandidate >= 65536 ? nextCandidate : code;
+}
+export { classifyCharacter, classifyCharacter_constantsEx as constantsEx, isCjk, isCodeHighSurrogate, isCodeLowSurrogate, isIvs, isNonCjkPunctuation, isSpaceOrPunctuation, isSvsFollowingCjk, isUnicodeWhitespace, tryGetCodeTwoBefore, tryGetGenuineNextCode, tryGetGenuinePreviousCode };
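
The `"wide"` branch of `cjkOrIvs` relies on a regular expression with the `v` flag (`\p{RGI_Emoji}`), which is why the README lists minimum runtime versions. A consumer that wants to fail early on an older runtime could probe for the flag first. This is a minimal sketch with hypothetical helper names; the package itself does not export such a check.

```typescript
// Hypothetical helper: reports whether this runtime accepts the RegExp `v` flag.
function supportsUnicodeSetsFlag(): boolean {
  try {
    // Built from strings so that an unsupported flag throws at run time
    // instead of making the whole file fail to parse.
    new RegExp("\\p{RGI_Emoji}", "v");
    return true;
  } catch {
    return false;
  }
}

// Same kind of test as the bundled code: does the text start with an RGI emoji?
function startsWithRgiEmoji(text: string): boolean {
  if (!supportsUnicodeSetsFlag()) {
    throw new Error("RegExp `v` flag is not supported by this runtime");
  }
  return new RegExp("^\\p{RGI_Emoji}", "v").test(text);
}

if (supportsUnicodeSetsFlag()) {
  console.log(startsWithRgiEmoji("\u{1F600}")); // true on runtimes with `v` support
  console.log(startsWithRgiEmoji("\u6F22")); // false on runtimes with `v` support
} else {
  console.log("This runtime cannot evaluate \\p{RGI_Emoji}");
}
```

Constructing the pattern with `new RegExp` is a design choice for the probe only; the bundle uses a regex literal, which older engines reject at parse time.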

package/package.json ADDED

@@ -0,0 +1,57 @@
+{
+  "name": "micromark-extension-cjk-friendly-util",
+  "version": "1.0.0",
+  "type": "module",
+  "exports": {
+    ".": {
+      "types": "./dist/index.d.ts",
+      "default": "./dist/index.js"
+    }
+  },
+  "files": [
+    "dist",
+    "LICENSE",
+    "README.md"
+  ],
+  "repository": {
+    "url": "https://github.com/tats-u/markdown-cjk-friendly"
+  },
+  "license": "MIT",
+  "author": "Tatsunori Uchino <tats.u@live.jp> (https://github.com/tats-u)",
+  "bugs": "https://github.com/tats-u/markdown-cjk-friendly/issues",
+  "keywords": [
+    "micromark-extension",
+    "micromark",
+    "markdown",
+    "japanese",
+    "chinese",
+    "korean",
+    "cjk"
+  ],
+  "description": "common library for micromark-extension-cjk-friendly and its related packages",
+  "sideEffects": false,
+  "dependencies": {
+    "get-east-asian-width": "^1.3.0",
+    "micromark-util-character": "^2.0.0",
+    "micromark-util-symbol": "^2.0.0"
+  },
+  "devDependencies": {
+    "micromark-util-types": "^2.0.0"
+  },
+  "peerDependenciesMeta": {
+    "micromark-util-types": {
+      "optional": true
+    }
+  },
+  "engines": {
+    "node": ">=20"
+  },
+  "scripts": {
+    "build": "rslib build",
+    "dev": "rslib build --watch",
+    "test": "vitest --run",
+    "test:up": "vitest -u",
+    "test:watch": "vitest watch",
+    "lint:type": "tsc --noEmit"
+  }
+}