@peaceroad/markdown-it-cjk-breaks-mod 0.1.7 → 0.1.9
- package/README.md +86 -94
- package/index.js +84 -67
- package/package.json +2 -2
package/README.md
CHANGED
|
@@ -1,133 +1,125 @@
|
|
|
1
|
-
# markdown-it-cjk-breaks
|
|
1
|
+
# @peaceroad/markdown-it-cjk-breaks-mod
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
`@peaceroad/markdown-it-cjk-breaks-mod` is a markdown-it plugin that suppresses line breaks between CJK text and optionally injects spacing after configured punctuation when a break is removed. It is designed for mixed Japanese/CJK + ASCII documents where default newline handling often produces unwanted spaces or breaks.
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
This package is a fork lineage of [`markdown-it-cjk-breaks`](https://github.com/markdown-it/markdown-it-cjk-breaks) and [`@sup39/markdown-it-cjk-breaks`](https://www.npmjs.com/package/@sup39/markdown-it-cjk-breaks). It keeps the original CJK break suppression behavior, adds the `either` mode introduced by `@sup39`, and extends it with punctuation-spacing controls and softbreak normalization for plugin-heavy markdown-it pipelines.
|
|
6
6
|
|
|
7
|
-
|
|
7
|
+
## Install
|
|
8
8
|
|
|
9
|
-
|
|
9
|
+
```
|
|
10
|
+
npm i @peaceroad/markdown-it-cjk-breaks-mod
|
|
11
|
+
```
|
|
10
12
|
|
|
11
|
-
|
|
13
|
+
## Quick Start
|
|
12
14
|
|
|
13
15
|
```js
|
|
14
16
|
import MarkdownIt from 'markdown-it';
|
|
15
17
|
import cjkBreaks from '@peaceroad/markdown-it-cjk-breaks-mod';
|
|
16
18
|
|
|
17
|
-
|
|
18
|
-
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
|
|
19
|
+
const md = MarkdownIt({ html: true }).use(cjkBreaks);
|
|
20
|
+
|
|
21
|
+
md.render('あおえ\nうい');
|
|
22
|
+
// <p>あおえうい</p>
|
|
23
|
+
```
|
|
24
|
+
|
|
25
|
+
## Options
|
|
26
|
+
|
|
27
|
+
- `either`
|
|
28
|
+
Type: `boolean`
|
|
29
|
+
Default: `false`
|
|
30
|
+
Remove a break when either side (instead of both sides) is CJK-width (`F/W/H`), still excluding Hangul.
|
|
31
|
+
Origin: inherited from `@sup39/markdown-it-cjk-breaks`.
|
|
32
|
+
|
|
33
|
+
The options below are extensions added by this project:
|
|
34
|
+
|
|
35
|
+
- `normalizeSoftBreaks`
|
|
36
|
+
Type: `boolean`
|
|
37
|
+
Default: `false`
|
|
38
|
+
Split newline-containing `text` tokens into explicit `softbreak` tokens before processing. Useful with plugins that rewrite inline tokens.
|
|
39
|
+
- `spaceAfterPunctuation`
|
|
40
|
+
Type: `'half' | 'full' | string`
|
|
41
|
+
Default: disabled
|
|
42
|
+
Insert spacing only when this plugin removes a break after a target sequence. `'half'` => `' '`, `'full'` => `\u3000`.
|
|
43
|
+
- `spaceAfterPunctuationTargets`
|
|
44
|
+
Type: `string | string[] | [] | null | false`
|
|
45
|
+
Default: `['!', '?', '⁉', '!?', '?!', '!?', '?!', '.', ':']`
|
|
46
|
+
Replace the target sequence set. `[]`, `null`, or `false` explicitly disable target matching.
|
|
47
|
+
- `spaceAfterPunctuationTargetsAdd`
|
|
48
|
+
Type: `string | string[]`
|
|
49
|
+
Default: unset
|
|
50
|
+
Append target sequences after base resolution.
|
|
51
|
+
- `spaceAfterPunctuationTargetsRemove`
|
|
52
|
+
Type: `string | string[]`
|
|
53
|
+
Default: unset
|
|
54
|
+
Remove sequences from the resolved target list.
|
|
55
|
+
|
|
56
|
+
## Punctuation Spacing Examples
|
|
57
|
+
|
|
58
|
+
```js
|
|
59
|
+
import MarkdownIt from 'markdown-it';
|
|
60
|
+
import cjkBreaks from '@peaceroad/markdown-it-cjk-breaks-mod';
|
|
24
61
|
|
|
25
|
-
// Half-width spacing for ASCII-friendly mixes
|
|
26
62
|
const mdHalf = MarkdownIt({ html: true }).use(cjkBreaks, {
|
|
27
|
-
|
|
28
|
-
|
|
63
|
+
either: true,
|
|
64
|
+
spaceAfterPunctuation: 'half'
|
|
29
65
|
});
|
|
66
|
+
|
|
30
67
|
mdHalf.render('こんにちは!\nWorld');
|
|
31
68
|
// <p>こんにちは! World</p>
|
|
32
69
|
|
|
33
|
-
|
|
34
|
-
|
|
35
|
-
|
|
36
|
-
|
|
37
|
-
|
|
70
|
+
const mdFull = MarkdownIt({ html: true }).use(cjkBreaks, {
|
|
71
|
+
either: true,
|
|
72
|
+
spaceAfterPunctuation: 'full'
|
|
73
|
+
});
|
|
74
|
+
|
|
75
|
+
mdFull.render('こんにちは!\nWorld');
|
|
76
|
+
// <p>こんにちは! World</p>
|
|
38
77
|
|
|
39
|
-
// Custom punctuation triggers (replaces defaults)
|
|
40
78
|
const mdCustom = MarkdownIt({ html: true }).use(cjkBreaks, {
|
|
79
|
+
either: true,
|
|
41
80
|
spaceAfterPunctuation: 'half',
|
|
42
81
|
spaceAfterPunctuationTargets: ['??']
|
|
43
82
|
});
|
|
83
|
+
|
|
44
84
|
mdCustom.render('Hello??\nWorld');
|
|
45
85
|
// <p>Hello?? World</p>
|
|
46
86
|
```
|
|
47
87
|
|
|
48
|
-
|
|
49
|
-
Even with stock markdown-it, emphasis markers can leave inline `text` tokens that still embed `\n`. When `normalizeSoftBreaks: true`, those tokens are split back into proper `softbreak` entries before CJK suppression runs, so a trailing `***漢***\n字` behaves the same way regardless of how markdown-it represented it internally.
|
|
88
|
+
## Softbreak Normalization Example
|
|
50
89
|
|
|
51
90
|
```js
|
|
52
|
-
|
|
53
|
-
|
|
54
|
-
normalizeSoftBreaks: true,
|
|
55
|
-
either: true
|
|
56
|
-
});
|
|
57
|
-
mdStrongJaFriendly.render('**漢**\nb');
|
|
58
|
-
// <p><strong>漢</strong>b</p>
|
|
59
|
-
```
|
|
60
|
-
|
|
61
|
-
`@peaceroad/markdown-it-strong-ja` also emit newline-containing `text` nodes after their own rewrites. The same option keeps behavior consistent no matter which order you register plugins.
|
|
62
|
-
|
|
63
|
-
## sup39's additional features
|
|
64
|
-
|
|
65
|
-
- [@sup39/markdown-it-cjk-breaks](https://npmjs.com/package/@sup39/markdown-it-cjk-breaks)
|
|
91
|
+
import MarkdownIt from 'markdown-it';
|
|
92
|
+
import cjkBreaks from '@peaceroad/markdown-it-cjk-breaks-mod';
|
|
66
93
|
|
|
67
|
-
|
|
94
|
+
const md = MarkdownIt({ html: true }).use(cjkBreaks, {
|
|
95
|
+
either: true,
|
|
96
|
+
normalizeSoftBreaks: true
|
|
97
|
+
});
|
|
68
98
|
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
var cjk_breaks = require('markdown-it-cjk-breaks');
|
|
72
|
-
|
|
73
|
-
md.use(cjk_breaks, {either: true}); // << set either to true
|
|
74
|
-
|
|
75
|
-
md.render(`
|
|
76
|
-
あおえ
|
|
77
|
-
うい
|
|
78
|
-
aoe
|
|
79
|
-
ui
|
|
80
|
-
`);
|
|
81
|
-
|
|
82
|
-
// returns:
|
|
83
|
-
//
|
|
84
|
-
//<p>あおえういaoe <!-- linebreak between `い` and `a` is removed -->
|
|
85
|
-
//ui</p>
|
|
99
|
+
md.render('**漢**\nb');
|
|
100
|
+
// <p><strong>漢</strong>b</p>
|
|
86
101
|
```
|
|
87
102
|
|
|
88
|
-
##
|
|
103
|
+
## Behavior Notes
|
|
89
104
|
|
|
90
|
-
-
|
|
105
|
+
- Break suppression follows CSS Text Level 3 style rules used by upstream: ZWSP-adjacent breaks are removed first; otherwise width-class checks are applied with Hangul exclusion.
|
|
106
|
+
- Punctuation spacing is never global formatting. It only runs when this plugin actually removes the break.
|
|
107
|
+
- The second punctuation pass handles inline markup boundaries (inline code, links/autolinks, images, inline HTML) when a raw newline boundary is verifiably present.
|
|
108
|
+
- Matching is fail-closed: if raw boundary reconstruction cannot be proven, no space is inserted.
|
|
109
|
+
- If a `softbreak` is still active between candidate tokens, spacing insertion is skipped.
|
|
91
110
|
|
|
92
|
-
|
|
111
|
+
## Compatibility
|
|
93
112
|
|
|
94
|
-
|
|
113
|
+
- Module format: ESM (`"type": "module"`).
|
|
114
|
+
- Runtime: works in Node.js ESM environments and browser/VSCode bundling setups that support ESM dependencies.
|
|
115
|
+
- Runtime plugin code uses no Node-only APIs (`fs`, `path`, etc.); those are confined to tests.
|
|
116
|
+
- For plugin chains that rewrite inline text (for example `@peaceroad/markdown-it-strong-ja`), prefer `normalizeSoftBreaks: true` for stable behavior.
|
|
95
117
|
|
|
96
|
-
|
|
97
|
-
|
|
98
|
-
- If the character immediately before or immediately after the segment break is the zero-width space character (U+200B), then the break is removed, leaving behind the zero-width space.
|
|
99
|
-
- Otherwise, if the East Asian Width property [UAX11] of both the character before and after the segment break is F, W, or H (not A), and neither side is Hangul, then the segment break is removed.
|
|
100
|
-
- Otherwise, the segment break is converted to a space (U+0020).
|
|
101
|
-
|
|
102
|
-
## Install
|
|
103
|
-
|
|
104
|
-
```bash
|
|
105
|
-
yarn add markdown-it-cjk-breaks
|
|
106
|
-
```
|
|
107
|
-
|
|
108
|
-
|
|
109
|
-
## Usage
|
|
110
|
-
|
|
111
|
-
```js
|
|
112
|
-
var md = require('markdown-it')();
|
|
113
|
-
var cjk_breaks = require('markdown-it-cjk-breaks');
|
|
114
|
-
|
|
115
|
-
md.use(cjk_breaks);
|
|
116
|
-
|
|
117
|
-
md.render(`
|
|
118
|
-
あおえ
|
|
119
|
-
うい
|
|
120
|
-
aoe
|
|
121
|
-
ui
|
|
122
|
-
`);
|
|
123
|
-
|
|
124
|
-
// returns:
|
|
125
|
-
//
|
|
126
|
-
//<p>あおえうい
|
|
127
|
-
//aoe
|
|
128
|
-
//ui</p>
|
|
129
|
-
```
|
|
118
|
+
## Upstream And Credits
|
|
130
119
|
|
|
120
|
+
- Original: [markdown-it/markdown-it-cjk-breaks](https://github.com/markdown-it/markdown-it-cjk-breaks)
|
|
121
|
+
- Fork enhancement (`either`): [@sup39/markdown-it-cjk-breaks](https://www.npmjs.com/package/@sup39/markdown-it-cjk-breaks)
|
|
122
|
+
- Current package: [@peaceroad/markdown-it-cjk-breaks-mod](https://github.com/peaceroad/p7d-markdown-it-cjk-breaks-mod)
|
|
131
123
|
|
|
132
124
|
## License
|
|
133
125
|
|
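The break-suppression rules stated in the README (ZWSP-adjacent breaks removed first, then width-class checks with Hangul exclusion, with `either` relaxing "both sides" to "either side") can be sketched roughly as follows. This is a minimal sketch, not the package's code: `shouldRemoveBreak`, `isCjkWidth`, and `isHangul` are hypothetical names, and the regex ranges are a deliberately small stand-in for a real East Asian Width lookup.

```js
// Hypothetical stand-in for an East Asian Width lookup. It only covers a few
// well-known ranges (kana, CJK ideographs, Hangul syllables, fullwidth forms);
// a real implementation would consult the full Unicode width table.
function isCjkWidth(ch) {
  return /[\u3041-\u30FF\u4E00-\u9FFF\uAC00-\uD7A3\uFF01-\uFF60]/.test(ch);
}

// Reduced Hangul check (syllables block only), for illustration.
function isHangul(ch) {
  return /[\uAC00-\uD7A3]/.test(ch);
}

// Decide whether the line break between `prev` and `next` should be dropped.
function shouldRemoveBreak(prev, next, either) {
  if (!prev || !next) return false;
  // Rule 1: a zero-width space on either side always removes the break.
  if (prev === '\u200b' || next === '\u200b') return true;
  // Rule 2: Hangul on either side keeps the break.
  if (isHangul(prev) || isHangul(next)) return false;
  const prevWide = isCjkWidth(prev);
  const nextWide = isCjkWidth(next);
  // Default mode needs both sides to be CJK-width; `either` needs only one.
  return either ? (prevWide || nextWide) : (prevWide && nextWide);
}
```

With these assumptions, a break between `い` and `a` is kept in the default mode and removed when `either` is true, which mirrors the behavior the README attributes to the `either` option.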
package/index.js
CHANGED
|
@@ -11,7 +11,6 @@ const DEFAULT_PUNCTUATION_CONFIG = create_punctuation_config(DEFAULT_PUNCTUATION
|
|
|
11
11
|
const HANGUL_RE = /[\u1100-\u11FF\u302E\u302F\u3131-\u318E\u3200-\u321E\u3260-\u327E\uA960-\uA97C\uAC00-\uD7A3\uD7B0-\uD7C6\uD7CB-\uD7FB\uFFA0-\uFFBE\uFFC2-\uFFC7\uFFCA-\uFFCF\uFFD2-\uFFD7\uFFDA-\uFFDC]/;
|
|
12
12
|
/* eslint-enable max-len */
|
|
13
13
|
const WHITESPACE_RE = /\s/;
|
|
14
|
-
const WHITESPACE_LEAD_RE = /^\s/;
|
|
15
14
|
|
|
16
15
|
|
|
17
16
|
function is_surrogate(c1, c2) {
|
|
@@ -34,6 +33,7 @@ function create_punctuation_config(targets) {
|
|
|
34
33
|
for (var i = 0; i < targets.length; i++) {
|
|
35
34
|
var value = targets[i];
|
|
36
35
|
if (typeof value !== 'string' || value.length === 0) continue;
|
|
36
|
+
if (sequences.has(value)) continue;
|
|
37
37
|
sequences.add(value);
|
|
38
38
|
var valueLength = value.length;
|
|
39
39
|
if (valueLength > maxLength) maxLength = valueLength;
|
|
@@ -45,7 +45,9 @@ function create_punctuation_config(targets) {
|
|
|
45
45
|
if (endChar) endCharMap[endChar] = true;
|
|
46
46
|
}
|
|
47
47
|
|
|
48
|
-
lengths.
|
|
48
|
+
if (lengths.length > 1) {
|
|
49
|
+
lengths.sort(function (a, b) { return b - a; });
|
|
50
|
+
}
|
|
49
51
|
return { sequences: sequences, maxLength: maxLength, endCharMap: endCharMap, lengths: lengths };
|
|
50
52
|
}
|
|
51
53
|
|
|
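The two hunks above add duplicate skipping and a guarded descending sort of `lengths`. A minimal, self-contained sketch of that shape is shown here. The function name `createPunctuationConfig` is illustrative, and the way distinct lengths and end characters are collected is an assumption, since those lines are not part of the diff.

```js
// Sketch of a punctuation config builder (assumed shape, not the package code).
function createPunctuationConfig(targets) {
  const sequences = new Set();
  const endCharMap = Object.create(null);
  const lengths = [];
  let maxLength = 0;

  for (const value of targets) {
    // Ignore non-strings and empty strings.
    if (typeof value !== 'string' || value.length === 0) continue;
    // Skip duplicates so repeated targets do no extra work.
    if (sequences.has(value)) continue;
    sequences.add(value);

    if (value.length > maxLength) maxLength = value.length;
    // Assumption: only distinct sequence lengths are recorded.
    if (!lengths.includes(value.length)) lengths.push(value.length);

    // Record the final code point as a cheap pre-filter key.
    const chars = Array.from(value);
    endCharMap[chars[chars.length - 1]] = true;
  }

  // Sort longest first, and only when there is something to sort.
  if (lengths.length > 1) lengths.sort((a, b) => b - a);

  return { sequences, maxLength, endCharMap, lengths };
}
```

Sorting longest first means a later suffix check tries the most specific sequence before shorter ones.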
@@ -126,9 +128,9 @@ function matches_punctuation_sequence(trailing, punctuationConfig, skipEndCharCh
|
|
|
126
128
|
if (!trailing || !punctuationConfig || punctuationConfig.maxLength === 0) return false;
|
|
127
129
|
|
|
128
130
|
var sequences = punctuationConfig.sequences;
|
|
129
|
-
var endCharMap = punctuationConfig.endCharMap;
|
|
130
131
|
var lengths = punctuationConfig.lengths;
|
|
131
132
|
if (!skipEndCharCheck) {
|
|
133
|
+
var endCharMap = punctuationConfig.endCharMap;
|
|
132
134
|
var endChar = get_last_char(trailing);
|
|
133
135
|
if (!endCharMap[endChar]) return false;
|
|
134
136
|
}
|
|
@@ -136,7 +138,7 @@ function matches_punctuation_sequence(trailing, punctuationConfig, skipEndCharCh
|
|
|
136
138
|
for (var i = 0; i < lengths.length; i++) {
|
|
137
139
|
var len = lengths[i];
|
|
138
140
|
if (len > trailingLength) continue;
|
|
139
|
-
var fragment = trailing.slice(-len);
|
|
141
|
+
var fragment = len === trailingLength ? trailing : trailing.slice(-len);
|
|
140
142
|
if (sequences.has(fragment)) return true;
|
|
141
143
|
}
|
|
142
144
|
return false;
|
|
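The suffix check in this hunk can be illustrated in isolation. The sketch below assumes a hand-written config object with the same `sequences`, `maxLength`, and `lengths` fields and omits the end-character pre-filter; `matchesPunctuationSequence` is an illustrative name.

```js
// Return true when `trailing` ends with one of the configured sequences.
function matchesPunctuationSequence(trailing, config) {
  if (!trailing || !config || config.maxLength === 0) return false;
  const trailingLength = trailing.length;
  for (const len of config.lengths) {
    if (len > trailingLength) continue;
    // Avoid allocating a new string when the whole text is the candidate.
    const fragment = len === trailingLength ? trailing : trailing.slice(-len);
    if (config.sequences.has(fragment)) return true;
  }
  return false;
}

// Hand-built config for the example (lengths sorted longest first).
const exampleConfig = {
  sequences: new Set(['!', '?', '!?']),
  maxLength: 2,
  lengths: [2, 1],
};
```

Because only the distinct lengths are tried, the cost per call is bounded by the number of lengths rather than the number of target sequences.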
@@ -160,6 +162,12 @@ function is_printable_ascii(ch) {
|
|
|
160
162
|
}
|
|
161
163
|
|
|
162
164
|
|
|
165
|
+
function has_leading_whitespace(text) {
|
|
166
|
+
if (!text) return false;
|
|
167
|
+
return WHITESPACE_RE.test(text.charAt(0));
|
|
168
|
+
}
|
|
169
|
+
|
|
170
|
+
|
|
163
171
|
function is_fullwidth_or_wide(ch) {
|
|
164
172
|
var width = get_cjk_width_class(ch);
|
|
165
173
|
return width === 'F' || width === 'W';
|
|
@@ -208,9 +216,8 @@ function process_inlines(tokens, ctx, inlineToken) {
|
|
|
208
216
|
var normalizeSoftBreaks = ctx.normalizeSoftBreaks;
|
|
209
217
|
var punctuationSpace = ctx.punctuationSpace;
|
|
210
218
|
var punctuationConfig = ctx.punctuationConfig;
|
|
211
|
-
var maxPunctuationLength = ctx.maxPunctuationLength;
|
|
212
219
|
var considerInlineBoundaries = ctx.considerInlineBoundaries;
|
|
213
|
-
var needsPunctuation = punctuationSpace && punctuationConfig && maxPunctuationLength > 0;
|
|
220
|
+
var needsPunctuation = punctuationSpace && punctuationConfig && ctx.maxPunctuationLength > 0;
|
|
214
221
|
var punctuationEndCharMap = punctuationConfig ? punctuationConfig.endCharMap : null;
|
|
215
222
|
|
|
216
223
|
if (!tokens || tokens.length === 0) return;
|
|
@@ -225,13 +232,13 @@ function process_inlines(tokens, ctx, inlineToken) {
|
|
|
225
232
|
if (!widthCache) widthCache = Object.create(null);
|
|
226
233
|
var cached = widthCache[ch];
|
|
227
234
|
if (cached !== undefined) return cached;
|
|
228
|
-
var width =
|
|
235
|
+
var width = eastAsianWidth(ch);
|
|
236
|
+
width = width === 'F' || width === 'W' || width === 'H' ? width : '';
|
|
229
237
|
widthCache[ch] = width;
|
|
230
238
|
return width;
|
|
231
239
|
}
|
|
232
240
|
|
|
233
241
|
var lastTextContent = '';
|
|
234
|
-
var hasLastText = false;
|
|
235
242
|
var sawEmptySinceLast = false;
|
|
236
243
|
|
|
237
244
|
for (i = 0; i < tokens.length; i++) {
|
|
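The cached width lookup in this hunk normalizes anything other than `F`, `W`, or `H` to an empty string before storing it. A small sketch of that memoization pattern follows. The lookup function is injected, so `toyLookup` here is a hypothetical stand-in for the real width function, and `createWidthClassCache` is an illustrative name.

```js
// Build a memoized width-class getter around an injected lookup function.
function createWidthClassCache(lookupWidth) {
  let cache = null; // allocated lazily on first use
  return function getWidthClass(ch) {
    if (!ch) return '';
    if (!cache) cache = Object.create(null);
    const cached = cache[ch];
    if (cached !== undefined) return cached;
    let width = lookupWidth(ch);
    // Keep only the classes that matter for break suppression.
    width = width === 'F' || width === 'W' || width === 'H' ? width : '';
    cache[ch] = width;
    return width;
  };
}

// Hypothetical toy lookup that counts how often it is called.
let lookups = 0;
function toyLookup(ch) {
  lookups++;
  return ch === 'あ' ? 'W' : 'Na';
}

const getWidthClass = createWidthClassCache(toyLookup);
```

Storing the normalized empty string (rather than skipping the cache) means narrow characters are also looked up only once.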
@@ -253,7 +260,7 @@ function process_inlines(tokens, ctx, inlineToken) {
|
|
|
253
260
|
skippedEmptyAfter = nextSkippedEmpty ? nextSkippedEmpty[i] : false;
|
|
254
261
|
}
|
|
255
262
|
|
|
256
|
-
if (
|
|
263
|
+
if (lastTextContent) {
|
|
257
264
|
c1 = lastTextContent.charCodeAt(lastTextContent.length - 2);
|
|
258
265
|
c2 = lastTextContent.charCodeAt(lastTextContent.length - 1);
|
|
259
266
|
last = lastTextContent.slice(is_surrogate(c1, c2) ? -2 : -1);
|
|
@@ -261,12 +268,10 @@ function process_inlines(tokens, ctx, inlineToken) {
|
|
|
261
268
|
|
|
262
269
|
var nextIdx = nextTextIndex[i];
|
|
263
270
|
if (nextIdx !== -1) {
|
|
264
|
-
var nextContent = tokens[nextIdx].content
|
|
265
|
-
|
|
266
|
-
|
|
267
|
-
|
|
268
|
-
next = nextContent.slice(0, is_surrogate(c1, c2) ? 2 : 1);
|
|
269
|
-
}
|
|
271
|
+
var nextContent = tokens[nextIdx].content;
|
|
272
|
+
c1 = nextContent.charCodeAt(0);
|
|
273
|
+
c2 = nextContent.charCodeAt(1);
|
|
274
|
+
next = nextContent.slice(0, is_surrogate(c1, c2) ? 2 : 1);
|
|
270
275
|
}
|
|
271
276
|
|
|
272
277
|
remove_break = false;
|
|
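The added lines read the first two UTF-16 code units of the next text before deciding how many to slice. A minimal sketch of surrogate-aware first and last character extraction is given here; the body of `isSurrogatePair` is an assumption (the diff only shows calls to `is_surrogate`), and the helper names are illustrative.

```js
// True when c1/c2 form a UTF-16 high/low surrogate pair.
function isSurrogatePair(c1, c2) {
  return c1 >= 0xD800 && c1 <= 0xDBFF && c2 >= 0xDC00 && c2 <= 0xDFFF;
}

// Last character of `text`, keeping astral characters intact.
function lastChar(text) {
  const c1 = text.charCodeAt(text.length - 2);
  const c2 = text.charCodeAt(text.length - 1);
  return text.slice(isSurrogatePair(c1, c2) ? -2 : -1);
}

// First character of `text`, measured from its own leading code units.
function firstChar(text) {
  const c1 = text.charCodeAt(0);
  const c2 = text.charCodeAt(1);
  return text.slice(0, isSurrogatePair(c1, c2) ? 2 : 1);
}
```

Out-of-range `charCodeAt` calls return `NaN`, so the comparisons are false and single-unit strings fall through to the one-unit slice.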
@@ -299,10 +304,9 @@ function process_inlines(tokens, ctx, inlineToken) {
|
|
|
299
304
|
|
|
300
305
|
if (remove_break) {
|
|
301
306
|
var insertPunctuationSpace = false;
|
|
302
|
-
if (needsPunctuation &&
|
|
307
|
+
if (needsPunctuation && lastTextContent && nextIdx !== -1 && next !== '\u200b') {
|
|
303
308
|
if (punctuationEndCharMap[last]) {
|
|
304
|
-
|
|
305
|
-
if (matches_punctuation_sequence(trailing, punctuationConfig, true)) {
|
|
309
|
+
if (matches_punctuation_sequence(lastTextContent, punctuationConfig, true)) {
|
|
306
310
|
if (!nextWidthComputed) {
|
|
307
311
|
nextWidthClass = get_cached_width_class(next);
|
|
308
312
|
}
|
|
@@ -321,7 +325,6 @@ function process_inlines(tokens, ctx, inlineToken) {
|
|
|
321
325
|
if (considerInlineBoundaries) sawEmptySinceLast = true;
|
|
322
326
|
} else {
|
|
323
327
|
lastTextContent = token.content;
|
|
324
|
-
hasLastText = true;
|
|
325
328
|
if (considerInlineBoundaries) sawEmptySinceLast = false;
|
|
326
329
|
}
|
|
327
330
|
}
|
|
@@ -367,21 +370,29 @@ function split_text_token(token) {
|
|
|
367
370
|
var parts = [];
|
|
368
371
|
var content = token.content;
|
|
369
372
|
var start = 0;
|
|
373
|
+
var reusedToken = false;
|
|
374
|
+
|
|
375
|
+
function push_text_part(text) {
|
|
376
|
+
if (!text) return;
|
|
377
|
+
if (!reusedToken) {
|
|
378
|
+
token.content = text;
|
|
379
|
+
parts.push(token);
|
|
380
|
+
reusedToken = true;
|
|
381
|
+
return;
|
|
382
|
+
}
|
|
383
|
+
parts.push(clone_text_token(TokenConstructor, token, text));
|
|
384
|
+
}
|
|
370
385
|
|
|
371
386
|
for (var pos = 0; pos < content.length; pos++) {
|
|
372
387
|
if (content.charCodeAt(pos) !== 0x0A) continue;
|
|
373
388
|
|
|
374
|
-
if (pos > start)
|
|
375
|
-
parts.push(clone_text_token(TokenConstructor, token, content.slice(start, pos)));
|
|
376
|
-
}
|
|
389
|
+
if (pos > start) push_text_part(content.slice(start, pos));
|
|
377
390
|
|
|
378
391
|
parts.push(create_softbreak_token(TokenConstructor, token));
|
|
379
392
|
start = pos + 1;
|
|
380
393
|
}
|
|
381
394
|
|
|
382
|
-
if (start < content.length)
|
|
383
|
-
parts.push(clone_text_token(TokenConstructor, token, content.slice(start)));
|
|
384
|
-
}
|
|
395
|
+
if (start < content.length) push_text_part(content.slice(start));
|
|
385
396
|
|
|
386
397
|
return parts;
|
|
387
398
|
}
|
|
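The reworked `split_text_token` reuses the original token for the first text part and clones only for later parts. The sketch below shows the same idea with plain objects standing in for markdown-it tokens; the object spread and the literal softbreak object are assumptions replacing `clone_text_token` and `create_softbreak_token`, whose bodies are not in the diff.

```js
// Split a text token on '\n' into text parts and softbreak markers.
function splitTextToken(token) {
  const parts = [];
  const content = token.content;
  let start = 0;
  let reusedToken = false;

  function pushTextPart(text) {
    if (!text) return;
    if (!reusedToken) {
      // First text part: mutate and reuse the original token object.
      token.content = text;
      parts.push(token);
      reusedToken = true;
      return;
    }
    // Later parts: shallow copy with the new content.
    parts.push({ ...token, content: text });
  }

  for (let pos = 0; pos < content.length; pos++) {
    if (content.charCodeAt(pos) !== 0x0A) continue;
    if (pos > start) pushTextPart(content.slice(start, pos));
    parts.push({ type: 'softbreak', content: '' });
    start = pos + 1;
  }

  if (start < content.length) pushTextPart(content.slice(start));
  return parts;
}
```

Reusing the first token saves one allocation per split and keeps any references held elsewhere pointing at a live token, at the cost of mutating the input.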
@@ -422,10 +433,14 @@ function apply_missing_punctuation_spacing(tokens, inlineToken, punctuationSpace
|
|
|
422
433
|
if (!inlineToken || !inlineToken.content) return;
|
|
423
434
|
if (inlineToken.content.indexOf('\n') === -1) return;
|
|
424
435
|
if (!tokens || tokens.length === 0) return;
|
|
425
|
-
|
|
426
|
-
if (maxPunctuationLength <= 0) return;
|
|
436
|
+
if (punctuationConfig.maxLength <= 0) return;
|
|
427
437
|
var endCharMap = punctuationConfig.endCharMap;
|
|
428
438
|
|
|
439
|
+
if (tokens.length === 1) {
|
|
440
|
+
apply_single_text_token_spacing(tokens, inlineToken, punctuationSpace, punctuationConfig);
|
|
441
|
+
return;
|
|
442
|
+
}
|
|
443
|
+
|
|
429
444
|
var rawSearchState = { pos: 0 };
|
|
430
445
|
|
|
431
446
|
for (var idx = 0; idx < tokens.length; idx++) {
|
|
@@ -434,15 +449,20 @@ function apply_missing_punctuation_spacing(tokens, inlineToken, punctuationSpace
|
|
|
434
449
|
|
|
435
450
|
var endChar = get_last_char(current.content);
|
|
436
451
|
if (!endCharMap[endChar]) continue;
|
|
437
|
-
|
|
438
|
-
if (!matches_punctuation_sequence(trailing, punctuationConfig, true)) continue;
|
|
452
|
+
if (!matches_punctuation_sequence(current.content, punctuationConfig, true)) continue;
|
|
439
453
|
|
|
440
454
|
var nextInfo = find_next_visible_token(tokens, idx + 1);
|
|
441
455
|
if (!nextInfo) continue;
|
|
442
|
-
if (nextInfo.token.type === 'text' &&
|
|
443
|
-
if (
|
|
444
|
-
|
|
445
|
-
if (!raw_boundary_includes_newline(
|
|
456
|
+
if (nextInfo.token.type === 'text' && has_leading_whitespace(nextInfo.token.content)) continue;
|
|
457
|
+
if (nextInfo.hasActiveBreak) continue;
|
|
458
|
+
|
|
459
|
+
if (!raw_boundary_includes_newline(
|
|
460
|
+
inlineToken.content,
|
|
461
|
+
current.content,
|
|
462
|
+
nextInfo.betweenMarkup,
|
|
463
|
+
nextInfo.fragment,
|
|
464
|
+
rawSearchState
|
|
465
|
+
)) {
|
|
446
466
|
continue;
|
|
447
467
|
}
|
|
448
468
|
|
|
@@ -450,50 +470,48 @@ function apply_missing_punctuation_spacing(tokens, inlineToken, punctuationSpace
|
|
|
450
470
|
idx = nextInfo.index;
|
|
451
471
|
}
|
|
452
472
|
|
|
453
|
-
if (tokens.length === 1) {
|
|
454
|
-
apply_single_text_token_spacing(tokens, inlineToken, punctuationSpace, punctuationConfig);
|
|
455
|
-
}
|
|
456
|
-
}
|
|
457
|
-
|
|
458
|
-
function has_active_break(tokens, fromIdx, nextIdx) {
|
|
459
|
-
for (var idx = fromIdx + 1; idx < nextIdx; idx++) {
|
|
460
|
-
var token = tokens[idx];
|
|
461
|
-
if (!token) continue;
|
|
462
|
-
if (token.type === 'softbreak') return true;
|
|
463
|
-
if (token.type === 'text' && token.content === '\n') return true;
|
|
464
|
-
}
|
|
465
|
-
return false;
|
|
466
473
|
}
|
|
467
474
|
|
|
468
|
-
|
|
469
|
-
function raw_boundary_includes_newline(source, tokens, fromIdx, nextIdx, afterFragment, state) {
|
|
475
|
+
function raw_boundary_includes_newline(source, beforeFragment, betweenFragment, afterFragment, state) {
|
|
470
476
|
if (!source || !afterFragment) return false;
|
|
471
|
-
|
|
472
|
-
|
|
473
|
-
|
|
474
|
-
|
|
475
|
-
|
|
476
|
-
|
|
477
|
-
|
|
478
|
-
|
|
479
|
-
|
|
480
|
-
|
|
481
|
-
|
|
482
|
-
|
|
483
|
-
|
|
484
|
-
return true;
|
|
477
|
+
if (!beforeFragment) return false;
|
|
478
|
+
betweenFragment = betweenFragment || '';
|
|
479
|
+
if (Array.isArray(afterFragment)) {
|
|
480
|
+
for (var i = 0; i < afterFragment.length; i++) {
|
|
481
|
+
var fragment = afterFragment[i];
|
|
482
|
+
if (!fragment) continue;
|
|
483
|
+
var candidate = beforeFragment + betweenFragment + '\n' + fragment;
|
|
484
|
+
var startPos = source.indexOf(candidate, state.pos);
|
|
485
|
+
if (startPos === -1) continue;
|
|
486
|
+
state.pos = startPos + candidate.length - fragment.length;
|
|
487
|
+
return true;
|
|
488
|
+
}
|
|
489
|
+
return false;
|
|
485
490
|
}
|
|
486
|
-
|
|
491
|
+
var fragment = afterFragment;
|
|
492
|
+
var candidate = beforeFragment + betweenFragment + '\n' + fragment;
|
|
493
|
+
var startPos = source.indexOf(candidate, state.pos);
|
|
494
|
+
if (startPos === -1) return false;
|
|
495
|
+
state.pos = startPos + candidate.length - fragment.length;
|
|
496
|
+
return true;
|
|
487
497
|
}
|
|
488
498
|
|
|
489
499
|
|
|
490
500
|
function find_next_visible_token(tokens, startIdx) {
|
|
501
|
+
var hasActiveBreak = false;
|
|
502
|
+
var betweenMarkup = '';
|
|
491
503
|
for (var idx = startIdx; idx < tokens.length; idx++) {
|
|
492
504
|
var token = tokens[idx];
|
|
493
505
|
if (!token) continue;
|
|
506
|
+
if (!hasActiveBreak && (token.type === 'softbreak' || (token.type === 'text' && token.content === '\n'))) {
|
|
507
|
+
hasActiveBreak = true;
|
|
508
|
+
}
|
|
494
509
|
var fragment = derive_after_fragment(token);
|
|
495
|
-
if (!fragment)
|
|
496
|
-
|
|
510
|
+
if (!fragment) {
|
|
511
|
+
if (token.markup) betweenMarkup += token.markup;
|
|
512
|
+
continue;
|
|
513
|
+
}
|
|
514
|
+
return { index: idx, token: token, fragment: fragment, hasActiveBreak: hasActiveBreak, betweenMarkup: betweenMarkup };
|
|
497
515
|
}
|
|
498
516
|
return null;
|
|
499
517
|
}
|
|
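The rewritten `raw_boundary_includes_newline` rebuilds a candidate string (`before + between + '\n' + after`) and searches the raw inline source from a moving cursor. A condensed sketch of that check, merging the array and string branches into one loop, might look like this; `rawBoundaryIncludesNewline` and the sample strings are illustrative.

```js
// Return true only if the raw source provably contains a newline between
// the `before` text and one of the `after` fragments, searching from state.pos.
function rawBoundaryIncludesNewline(source, before, between, after, state) {
  if (!source || !before || !after) return false;
  const candidates = Array.isArray(after) ? after : [after];
  for (const fragment of candidates) {
    if (!fragment) continue;
    const candidate = before + (between || '') + '\n' + fragment;
    const startPos = source.indexOf(candidate, state.pos);
    if (startPos === -1) continue;
    // Advance the cursor to the start of the matched `after` fragment.
    state.pos = startPos + candidate.length - fragment.length;
    return true;
  }
  // Fail closed: no provable newline boundary, so report false.
  return false;
}
```

Because the cursor only moves forward, repeated fragments earlier in the source are not matched again on later calls.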
@@ -509,7 +527,7 @@ function derive_after_fragment(token) {
|
|
|
509
527
|
if (markup && content) fragments.push(markup + content);
|
|
510
528
|
if (markup) fragments.push(markup);
|
|
511
529
|
if (content) fragments.push(content);
|
|
512
|
-
return fragments;
|
|
530
|
+
return fragments.length > 0 ? fragments : '';
|
|
513
531
|
}
|
|
514
532
|
if (token.type === 'image') return '![';
|
|
515
533
|
if (token.type === 'link_open') return token.markup || '[';
|
|
@@ -549,7 +567,6 @@ function apply_single_text_token_spacing(tokens, inlineToken, punctuationSpace,
|
|
|
549
567
|
if (maxPunctuationLength <= 0) return;
|
|
550
568
|
|
|
551
569
|
var segments = inlineToken.content.split('\n');
|
|
552
|
-
if (segments.length < 2) return;
|
|
553
570
|
var cumulativeLength = 0;
|
|
554
571
|
var offsetDelta = 0;
|
|
555
572
|
var updatedContent = token.content;
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@peaceroad/markdown-it-cjk-breaks-mod",
|
|
3
|
-
"version": "0.1.
|
|
3
|
+
"version": "0.1.9",
|
|
4
4
|
"type": "module",
|
|
5
5
|
"description": "Suppress linebreaks between east asian (Especially Japanese) characters",
|
|
6
6
|
"repository": {
|
|
@@ -22,7 +22,7 @@
|
|
|
22
22
|
"eastasianwidth": "^0.3.0"
|
|
23
23
|
},
|
|
24
24
|
"devDependencies": {
|
|
25
|
-
"@peaceroad/markdown-it-strong-ja": "^0.
|
|
25
|
+
"@peaceroad/markdown-it-strong-ja": "^0.8.1",
|
|
26
26
|
"markdown-it": "^14.1.0"
|
|
27
27
|
}
|
|
28
28
|
}
|