@peaceroad/markdown-it-strong-ja 0.5.6 → 0.6.1
- package/README.md +88 -17
- package/index.js +478 -71
- package/package.json +2 -2
package/README.md
CHANGED
|
@@ -20,11 +20,88 @@ md.render('HTMLは*「HyperText Markup Language」*の略です。')
|
|
|
20
20
|
|
|
21
21
|
Notice. This plugin assumes that you use it together with markdown-it-attrs. If you do not, pass `{mditAttrs: false}`: `use(mditStrongJa, {mditAttrs: false})`.
|
|
22
22
|
|
|
23
|
+
### How this differs from vanilla markdown-it
|
|
24
|
+
|
|
25
|
+
Default output pairs `*` / `**` as it scans left-to-right: when a line contains Japanese (hiragana / katakana / kanji / fullwidth punctuation), japanese-only mode treats the leading `**` aggressively; English-only lines follow markdown-it style pairing. Pick one mode for the examples below:
|
|
26
|
+
|
|
27
|
+
- `mode: 'japanese-only'` (default) … Japanese ⇒ aggressive, English-only ⇒ markdown-it compatible
|
|
28
|
+
- `mode: 'aggressive'` … always aggressive (lead `**` pairs greedily)
|
|
29
|
+
- `mode: 'compatible'` … markdown-it compatible (lead `**` stays literal)
|
|
30
|
+
|
|
31
|
+
```js
|
|
32
|
+
const mdDefault = mdit().use(mditStrongJa) // mode: 'japanese-only'
|
|
33
|
+
const mdCompat = mdit().use(mditStrongJa, { mode: 'compatible' }) // markdown-it pairing
|
|
34
|
+
const mdAggressive = mdit().use(mditStrongJa, { mode: 'aggressive' }) // always pair leading **
|
|
35
|
+
```
|
|
36
|
+
|
|
37
|
+
Default (japanese-only) pairs aggressively only when Japanese is present. Aggressive always pairs the leading `**`, and compatible matches markdown-it.
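
The mode decision described above can be sketched as a small standalone function. This is a minimal sketch, not the plugin's exported API: the name `pairsAggressively` is hypothetical, while the character class is the `REG_JAPANESE` pattern that appears in `index.js` in this diff.

```js
// Sketch only: mirrors the mode decision described above.
// `pairsAggressively` is a hypothetical helper, not part of the plugin's API.
const JAPANESE = /[\p{Script=Hiragana}\p{Script=Katakana}\p{Script=Han}\u3000-\u303F\uFF00-\uFFEF]/u

// Returns true when the leading `**` would be paired aggressively.
const pairsAggressively = (src, mode = 'japanese-only') => {
  const normalized = typeof mode === 'string' ? mode.toLowerCase() : 'japanese-only'
  if (normalized === 'aggressive') return true
  if (normalized === 'compatible') return false
  // japanese-only: aggressive only when the text contains Japanese characters
  return JAPANESE.test(src)
}

console.log(pairsAggressively('**あああ。**iii**'))            // true
console.log(pairsAggressively('**aaa.**iii**'))               // false
console.log(pairsAggressively('**aaa.**iii**', 'aggressive')) // true
console.log(pairsAggressively('**「test」**', 'compatible'))   // false
```

The sketch only answers "aggressive or compatible?" for a given input; the actual pairing of `*` runs is done elsewhere in the plugin.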
|
|
38
|
+
|
|
39
|
+
Japanese-first pairing around punctuation and mixed sentences: leading/trailing Japanese quotes or brackets (`「`, `」`, `(`, `、` etc.) are wrapped even when the same pattern would stay literal in markdown-it. Mixed sentences here mean one line that contains multiple `*` runs; Japanese text keeps the leading `**` aggressive, while English-only stays compatible unless you pick aggressive mode.
|
|
40
|
+
|
|
41
|
+
- Punctuation:
|
|
42
|
+
- Input: `**「test」**`
|
|
43
|
+
- Output (default): `<p><strong>「test」</strong></p>`
|
|
44
|
+
- Output (aggressive): `<p><strong>「test」</strong></p>`
|
|
45
|
+
- Output (compatible): `<p>**「test」**</p>`
|
|
46
|
+
|
|
47
|
+
- Mixed sentence (multiple `*` runs): English-only stays markdown-it compatible unless you pick aggressive mode; earlier `**` runs can remain literal while later ones pair.
|
|
48
|
+
- Input (Japanese mixed): `**あああ。**iii**`
|
|
49
|
+
- Output (default): `<p><strong>あああ。</strong>iii**</p>`
|
|
50
|
+
- Output (aggressive): `<p><strong>あああ。</strong>iii**</p>`
|
|
51
|
+
- Output (compatible): `<p>**あああ。<strong>iii</strong></p>`
|
|
52
|
+
- Input (English-only): `**aaa.**iii**`
|
|
53
|
+
- Output (default): `<p>**aaa.<strong>iii</strong></p>`
|
|
54
|
+
- Output (aggressive): `<p><strong>aaa.</strong>iii**</p>`
|
|
55
|
+
- Output (compatible): `<p>**aaa.<strong>iii</strong></p>`
|
|
56
|
+
- Input (English-only, two `**` runs): `**aaa.**eee.**eeee**`
|
|
57
|
+
- Output (default): `<p>**aaa.**eee.<strong>eeee</strong></p>`
|
|
58
|
+
- Output (aggressive): `<p><strong>aaa.</strong>eee.<strong>eeee</strong></p>`
|
|
59
|
+
- Output (compatible): `<p>**aaa.**eee.<strong>eeee</strong></p>`
|
|
60
|
+
|
|
61
|
+
Inline links, HTML, and code spans stay intact (see the Link and Inline code examples below): the plugin re-wraps `[label](url)` / `[label][]` after pairing to avoid broken emphasis tokens around anchors, inline HTML, or inline code. This also covers clusters of `*` with no spaces around the link or code span.
|
|
62
|
+
|
|
63
|
+
- Link (cluster of `*` without spaces):
|
|
64
|
+
- Input (English-only): `string**[text](url)**`
|
|
65
|
+
- Output (default): `<p>string**<a href="url">text</a>**</p>`
|
|
66
|
+
- Output (aggressive): `<p>string<strong><a href="url">text</a></strong></p>`
|
|
67
|
+
- Output (compatible): `<p>string**<a href="url">text</a>**</p>`
|
|
68
|
+
- Input (Japanese mixed): `これは**[text](url)**です`
|
|
69
|
+
- Output (default/aggressive): `<p>これは<strong><a href="url">text</a></strong>です</p>`
|
|
70
|
+
- Output (compatible): `<p>これは**<a href="url">text</a>**です</p>`
|
|
71
|
+
- Inline code (cluster of `*` without spaces):
|
|
72
|
+
- Input (English-only): `` **aa`code`**aa ``
|
|
73
|
+
- Output (default): `<p>**aa<code>code</code>**aa</p>`
|
|
74
|
+
- Output (aggressive): `<p><strong>aa<code>code</code></strong>aa</p>`
|
|
75
|
+
- Output (compatible): `<p>**aa<code>code</code>**aa</p>`
|
|
76
|
+
- Input (Japanese mixed): `` これは**`code`**です ``
|
|
77
|
+
- Output (default/aggressive): `<p>これは<strong><code>code</code></strong>です</p>`
|
|
78
|
+
- Output (compatible): `<p>これは**<code>code</code>**です</p>`
|
|
79
|
+
|
|
80
|
+
### Known differences from vanilla markdown-it
|
|
81
|
+
|
|
82
|
+
This section collects other cases that diverge from vanilla markdown-it.
|
|
83
|
+
|
|
84
|
+
The plugin keeps pairing aggressively in Japanese contexts, which can diverge from markdown-it when markup spans newlines or mixes nested markers.
|
|
85
|
+
|
|
86
|
+
- Multiline + nested emphasis (markdown-it leaves trailing `**`):
|
|
87
|
+
|
|
88
|
+
```markdown
|
|
89
|
+
***強調と*入れ子*の検証***を行う。
|
|
90
|
+
```
|
|
91
|
+
|
|
92
|
+
- markdown-it: `<p><em><em><em>強調と</em>入れ子</em>の検証</em>**を行う。</p>`
|
|
93
|
+
- markdown-it-strong-ja (default/aggressive): `<p><em><strong>強調と<em>入れ子</em>の検証</strong></em>を行う。</p>`
|
|
94
|
+
- If you want markdown-it behavior here, use `mode: 'compatible'`.
|
|
95
|
+
|
|
96
|
+
Notice. The plugin keeps inline HTML / angle-bracket regions intact so rendered HTML keeps correct nesting (for example, it avoids mis-nesting in inputs like `**aaa<code>**bbb</code>` when HTML output is enabled).
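
A rough illustration of the guard behind this notice, assuming a plain string and inclusive character offsets rather than the plugin's internal token list. The helper name `spanContainsCodeTag` and its signature are hypothetical; the tag prefixes it looks for are the ones checked by `hasCodeTagInside` in `index.js` in this diff.

```js
// Sketch only: a simplified, string-based version of the guard described above.
// `start` and `end` are hypothetical inclusive offsets of the candidate emphasis span.
const spanContainsCodeTag = (src, start, end) => {
  const needles = ['<code', '</code', '<pre', '</pre']
  return needles.some((needle) => {
    const pos = src.indexOf(needle, start)
    return pos !== -1 && pos <= end
  })
}

const input = '**aaa<code>**bbb</code>'
// The opening `**` starts at offset 0 and the closing `**` ends at offset 12.
console.log(spanContainsCodeTag(input, 0, 12))    // true: `<code` sits inside the span
console.log(spanContainsCodeTag('**aaa**', 0, 6)) // false
```

When such a span is detected with HTML output enabled, the diff shows the plugin skipping the pairing so the `<code>` element is not split by a `<strong>` tag.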
|
|
97
|
+
|
|
98
|
+
|
|
99
|
+
|
|
23
100
|
## Example
|
|
24
101
|
|
|
25
102
|
The following examples are for strong. The process for em is roughly the same.
|
|
26
103
|
|
|
27
|
-
|
|
104
|
+
````markdown
|
|
28
105
|
[Markdown]
|
|
29
106
|
HTMLは「**HyperText Markup Language**」の略です。
|
|
30
107
|
[HTML]
|
|
@@ -116,6 +193,11 @@ HTMLは**「HyperText <b>Markup</b> Language」**
|
|
|
116
193
|
[HTML:true]
|
|
117
194
|
<p>HTMLは<strong>「HyperText <b>Markup</b> Language」</strong></p>
|
|
118
195
|
|
|
196
|
+
[Markdown]
|
|
197
|
+
これは**[text](url)**と**`code`**と**<b>HTML</b>**です
|
|
198
|
+
[HTML html:true]
|
|
199
|
+
<p>これは<strong><a href="url">text</a></strong>と<strong><code>code</code></strong>と<strong><b>HTML</b></strong>です</p>
|
|
200
|
+
|
|
119
201
|
|
|
120
202
|
[Markdown]
|
|
121
203
|
HTMLは「**HyperText Markup Language**」
|
|
@@ -152,25 +234,12 @@ a****b
|
|
|
152
234
|
a****
|
|
153
235
|
[HTML]
|
|
154
236
|
<p>a****</p>
|
|
155
|
-
|
|
237
|
+
````
|
|
156
238
|
|
|
157
|
-
## Options
|
|
158
239
|
|
|
159
|
-
### disallowMixed
|
|
240
|
+
### disallowMixed (legacy)
|
|
160
241
|
|
|
161
|
-
|
|
162
|
-
|
|
163
|
-
```js
|
|
164
|
-
const md = mdit.use(mditStrongJa)
|
|
165
|
-
md.render('string**[text](url)**')
|
|
166
|
-
// <p>string<strong><a href="url">text</a></strong></p>
|
|
167
|
-
```
|
|
168
|
-
|
|
169
|
-
```js
|
|
170
|
-
const md = mdit.use(mditStrongJa, { disallowMixed: true })
|
|
171
|
-
md.render('string**[text](url)**')
|
|
172
|
-
// <p>string**<a href="url">text</a>**</p>
|
|
173
|
-
```
|
|
242
|
+
`disallowMixed: true` is kept for back-compat: it forces compatible pairing for English/mixed contexts that contain markdown links, HTML tags, inline code, or math expressions while staying aggressive for Japanese-only text. Prefer `mode` for new setups; enable this only if you need the legacy compat-first behavior in mixed English.
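
To make "markdown links, HTML tags, inline code, or math expressions" concrete, the sketch below runs the `REG_MARKDOWN_HTML` pattern from `index.js` in this diff against a few sample strings. The samples are illustrative, and how the plugin applies the pattern internally is not shown in this diff, so treat this as an approximation of what counts as such a fragment.

```js
// Pattern copied from index.js in this diff; the sample inputs are illustrative only.
const REG_MARKDOWN_HTML = /^\[[^\[\]]+\]\([^)]+\)$|^<([a-zA-Z][a-zA-Z0-9]*)[^>]*>([^<]+<\/\1>)$|^`[^`]+`$|^\$[^$]+\$$/

const samples = ['[text](url)', '<b>HTML</b>', '`code`', '$x^2$', 'plain text']
for (const sample of samples) {
  // Logs true for the link, HTML, code, and math samples, and false for plain text.
  console.log(sample, REG_MARKDOWN_HTML.test(sample))
}
```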
|
|
174
243
|
|
|
175
244
|
### coreRulesBeforePostprocess
|
|
176
245
|
|
|
@@ -187,3 +256,5 @@ const md = mdit()
|
|
|
187
256
|
- Default: `[]`
|
|
188
257
|
- Specify `['cjk_breaks']` (or other rule names) when you rely on plugins such as `@peaceroad/markdown-it-cjk-breaks-mod` and need them to run first.
|
|
189
258
|
- Pass an empty array if you do not want `mditStrongJa` to reorder any core rules.
|
|
259
|
+
|
|
260
|
+
Most setups can leave this option untouched; use it only when you must keep another plugin's core rule ahead of `strong_ja_postprocess`.
|
package/index.js
CHANGED
|
@@ -22,10 +22,18 @@ const CHAR_CLOSE_CURLY = 0x7D // }
|
|
|
22
22
|
|
|
23
23
|
const REG_ATTRS = /{[^{}\n!@#%^&*()]+?}$/
|
|
24
24
|
const REG_ASCII_PUNCT = /[!-/:-@[-`{-~]/g
|
|
25
|
-
const REG_JAPANESE =
|
|
25
|
+
const REG_JAPANESE = /[\p{Script=Hiragana}\p{Script=Katakana}\p{Script=Han}\u3000-\u303F\uFF00-\uFFEF]/u // hiragana | katakana | kanji | CJK punctuation and fullwidth forms (emoji excluded)
|
|
26
26
|
|
|
27
27
|
const REG_MARKDOWN_HTML = /^\[[^\[\]]+\]\([^)]+\)$|^<([a-zA-Z][a-zA-Z0-9]*)[^>]*>([^<]+<\/\1>)$|^`[^`]+`$|^\$[^$]+\$$/ // for mixed-language context detection
|
|
28
28
|
|
|
29
|
+
const hasCjkBreaksRule = (md) => {
|
|
30
|
+
if (!md || !md.core || !md.core.ruler || !Array.isArray(md.core.ruler.__rules__)) return false
|
|
31
|
+
if (md.__strongJaHasCjkBreaks === true) return true
|
|
32
|
+
const found = md.core.ruler.__rules__.some((rule) => rule && typeof rule.name === 'string' && rule.name.indexOf('cjk_breaks') !== -1)
|
|
33
|
+
if (found) md.__strongJaHasCjkBreaks = true
|
|
34
|
+
return found
|
|
35
|
+
}
|
|
36
|
+
|
|
29
37
|
const hasBackslash = (state, start) => {
|
|
30
38
|
if (start <= 0) return false
|
|
31
39
|
if (state.__strongJaHasBackslash === false) return false
|
|
@@ -268,6 +276,15 @@ const findInlineLinkRange = (pos, ranges, kind) => {
|
|
|
268
276
|
const useCache = ranges.length > 32
|
|
269
277
|
const cache = useCache ? getInlineRangeCacheMap(ranges, kind, false) : null
|
|
270
278
|
if (cache && cache.has(pos)) return cache.get(pos)
|
|
279
|
+
const first = ranges[0]
|
|
280
|
+
const last = ranges[ranges.length - 1]
|
|
281
|
+
if (pos < first.start || pos > last.end) {
|
|
282
|
+
if (useCache) {
|
|
283
|
+
const storeCache = getInlineRangeCacheMap(ranges, kind, true)
|
|
284
|
+
storeCache.set(pos, null)
|
|
285
|
+
}
|
|
286
|
+
return null
|
|
287
|
+
}
|
|
271
288
|
let left = 0
|
|
272
289
|
let right = ranges.length - 1
|
|
273
290
|
let found = null
|
|
@@ -387,8 +404,7 @@ const setToken = (state, inlines, opt, attrsEnabled) => {
|
|
|
387
404
|
const src = state.src
|
|
388
405
|
const mapFromPos = createLineMapper(state)
|
|
389
406
|
let i = 0
|
|
390
|
-
let
|
|
391
|
-
let attrsIsTextTag = ''
|
|
407
|
+
let lastTextToken = null
|
|
392
408
|
while (i < inlines.length) {
|
|
393
409
|
let type = inlines[i].type
|
|
394
410
|
let tag = ''
|
|
@@ -406,12 +422,23 @@ const setToken = (state, inlines, opt, attrsEnabled) => {
|
|
|
406
422
|
const startToken = state.push(type, tag, 1)
|
|
407
423
|
startToken.markup = tag === 'strong' ? '**' : '*'
|
|
408
424
|
startToken.map = mapFromPos(inlines[i].s, inlines[i].e)
|
|
409
|
-
attrsIsText = true
|
|
410
|
-
attrsIsTextTag = tag
|
|
411
425
|
}
|
|
412
426
|
|
|
413
427
|
if (type === 'html_inline') {
|
|
414
|
-
|
|
428
|
+
const content = src.slice(inlines[i].s, inlines[i].e + 1)
|
|
429
|
+
if (lastTextToken && inlines[i].s > 0) {
|
|
430
|
+
const prevChar = src.charAt(inlines[i].s - 1)
|
|
431
|
+
if (prevChar === ' ' || prevChar === '\t') {
|
|
432
|
+
if (!lastTextToken.content.endsWith(prevChar)) {
|
|
433
|
+
lastTextToken.content += prevChar
|
|
434
|
+
}
|
|
435
|
+
}
|
|
436
|
+
}
|
|
437
|
+
const htmlToken = state.push('html_inline', '', 0)
|
|
438
|
+
htmlToken.content = content
|
|
439
|
+
htmlToken.map = mapFromPos(inlines[i].s, inlines[i].e)
|
|
440
|
+
i++
|
|
441
|
+
continue
|
|
415
442
|
}
|
|
416
443
|
if (type === 'text') {
|
|
417
444
|
let content = src.slice(inlines[i].s, inlines[i].e + 1)
|
|
@@ -424,18 +451,25 @@ const setToken = (state, inlines, opt, attrsEnabled) => {
|
|
|
424
451
|
continue
|
|
425
452
|
}
|
|
426
453
|
}
|
|
427
|
-
|
|
428
|
-
|
|
429
|
-
|
|
430
|
-
|
|
431
|
-
|
|
432
|
-
|
|
433
|
-
|
|
434
|
-
|
|
435
|
-
|
|
454
|
+
const attrMatch = attrsEnabled && content.length > 0 && content.charCodeAt(content.length - 1) === CHAR_CLOSE_CURLY && REG_ATTRS.test(content)
|
|
455
|
+
? content.match(/^(.*?)(\s+{[^{}\n!@#%^&*()]+?})$/)
|
|
456
|
+
: null
|
|
457
|
+
if (attrMatch) {
|
|
458
|
+
const textPart = attrMatch[1] ? attrMatch[1].replace(/[ \t]+$/, '') : ''
|
|
459
|
+
const attrPart = attrMatch[2]
|
|
460
|
+
if (textPart && textPart.length > 0) {
|
|
461
|
+
const textToken = state.push(type, '', 0)
|
|
462
|
+
textToken.content = textPart
|
|
463
|
+
textToken.map = mapFromPos(inlines[i].s, inlines[i].s + textPart.length)
|
|
464
|
+
lastTextToken = textToken
|
|
465
|
+
}
|
|
466
|
+
const attrsToken = state.push(type, '', 0)
|
|
467
|
+
let attrsContent = attrPart.replace(/^\s+/, '')
|
|
468
|
+
if (attrsContent.indexOf('\\') !== -1) {
|
|
469
|
+
const hasBackslashBeforeCurlyAttribute = attrsContent.match(/(\\+){/)
|
|
436
470
|
if (hasBackslashBeforeCurlyAttribute) {
|
|
437
471
|
if (hasBackslashBeforeCurlyAttribute[1].length === 1) {
|
|
438
|
-
|
|
472
|
+
attrsContent = attrsContent.replace(/\\{/, '{')
|
|
439
473
|
} else {
|
|
440
474
|
let backSlashNum = Math.floor(hasBackslashBeforeCurlyAttribute[1].length / 2)
|
|
441
475
|
let k = 0
|
|
@@ -444,47 +478,77 @@ const setToken = (state, inlines, opt, attrsEnabled) => {
|
|
|
444
478
|
backSlash += '\\'
|
|
445
479
|
k++
|
|
446
480
|
}
|
|
447
|
-
|
|
481
|
+
attrsContent = attrsContent.replace(/\\+{/, backSlash + '{')
|
|
448
482
|
}
|
|
449
|
-
} else {
|
|
450
|
-
attrsToken.content = content
|
|
451
483
|
}
|
|
452
|
-
attrsIsText = false
|
|
453
|
-
attrsIsTextTag = ''
|
|
454
|
-
attrsToken.map = mapFromPos(inlines[i].s, inlines[i].e)
|
|
455
|
-
i++
|
|
456
|
-
continue
|
|
457
484
|
}
|
|
485
|
+
attrsToken.content = attrsContent
|
|
486
|
+
attrsToken.map = mapFromPos(inlines[i].s + content.length - attrPart.length, inlines[i].e)
|
|
487
|
+
i++
|
|
488
|
+
continue
|
|
458
489
|
}
|
|
459
490
|
if (isPlainTextContent(content)) {
|
|
460
491
|
const textToken = state.push(type, '', 0)
|
|
461
492
|
textToken.content = content
|
|
462
|
-
|
|
493
|
+
textToken.map = mapFromPos(inlines[i].s, inlines[i].e)
|
|
494
|
+
lastTextToken = textToken
|
|
463
495
|
i++
|
|
464
496
|
continue
|
|
465
497
|
}
|
|
466
498
|
|
|
467
|
-
const
|
|
468
|
-
|
|
469
|
-
|
|
470
|
-
|
|
471
|
-
|
|
472
|
-
|
|
473
|
-
|
|
474
|
-
|
|
499
|
+
const hasOnlySimpleNewline = attrsEnabled && (content.indexOf('{') !== -1 || content.indexOf('}') !== -1) &&
|
|
500
|
+
content.indexOf('\n') !== -1 &&
|
|
501
|
+
content.indexOf('`') === -1 &&
|
|
502
|
+
content.indexOf('$') === -1 &&
|
|
503
|
+
content.indexOf('<') === -1 &&
|
|
504
|
+
content.indexOf('>') === -1 &&
|
|
505
|
+
content.indexOf('[') === -1 &&
|
|
506
|
+
content.indexOf(']') === -1 &&
|
|
507
|
+
content.indexOf('(') === -1 &&
|
|
508
|
+
content.indexOf(')') === -1 &&
|
|
509
|
+
content.indexOf('^') === -1 &&
|
|
510
|
+
content.indexOf('~') === -1 &&
|
|
511
|
+
content.indexOf('\\') === -1
|
|
512
|
+
|
|
513
|
+
if (hasOnlySimpleNewline) {
|
|
514
|
+
const textToken = state.push(type, '', 0)
|
|
515
|
+
textToken.content = content
|
|
516
|
+
textToken.map = mapFromPos(inlines[i].s, inlines[i].e)
|
|
517
|
+
lastTextToken = textToken
|
|
518
|
+
i++
|
|
519
|
+
continue
|
|
520
|
+
}
|
|
521
|
+
|
|
522
|
+
const childTokens = []
|
|
523
|
+
state.md.inline.parse(content, state.md, state.env, childTokens)
|
|
524
|
+
let j = 0
|
|
525
|
+
while (j < childTokens.length) {
|
|
526
|
+
const t = childTokens[j]
|
|
527
|
+
if (t.type === 'softbreak' && !opt.mdBreaks) {
|
|
528
|
+
const hasCjk = opt.hasCjkBreaks === true
|
|
529
|
+
if (hasCjk) {
|
|
530
|
+
const prevToken = childTokens[j - 1]
|
|
531
|
+
const nextToken = childTokens[j + 1]
|
|
532
|
+
const prevChar = prevToken && prevToken.content ? prevToken.content.slice(-1) : ''
|
|
533
|
+
const nextChar = nextToken && nextToken.content ? nextToken.content.charAt(0) : ''
|
|
534
|
+
const isAsciiWord = nextChar >= '0' && nextChar <= 'z' && /[A-Za-z0-9]/.test(nextChar)
|
|
535
|
+
if (isAsciiWord && isJapanese(prevChar) && !isJapanese(nextChar)) {
|
|
475
536
|
t.type = 'text'
|
|
476
537
|
t.tag = ''
|
|
477
|
-
t.content = '
|
|
478
|
-
}
|
|
479
|
-
if (!attrsEnabled && t.tag === 'br') {
|
|
480
|
-
t.tag = ''
|
|
481
|
-
t.content = '\n'
|
|
538
|
+
t.content = ' '
|
|
482
539
|
}
|
|
483
|
-
const token = state.push(t.type, t.tag, t.nesting)
|
|
484
|
-
copyInlineTokenFields(token, t)
|
|
485
|
-
j++
|
|
486
540
|
}
|
|
487
541
|
}
|
|
542
|
+
if (!attrsEnabled && t.tag === 'br') {
|
|
543
|
+
t.tag = ''
|
|
544
|
+
t.content = '\n'
|
|
545
|
+
}
|
|
546
|
+
const token = state.push(t.type, t.tag, t.nesting)
|
|
547
|
+
copyInlineTokenFields(token, t)
|
|
548
|
+
if (t.type === 'text') {
|
|
549
|
+
lastTextToken = token
|
|
550
|
+
}
|
|
551
|
+
j++
|
|
488
552
|
}
|
|
489
553
|
}
|
|
490
554
|
|
|
@@ -492,8 +556,6 @@ const setToken = (state, inlines, opt, attrsEnabled) => {
|
|
|
492
556
|
const closeToken = state.push(type, tag, -1)
|
|
493
557
|
closeToken.markup = tag === 'strong' ? '**' : '*'
|
|
494
558
|
closeToken.map = mapFromPos(inlines[i].s, inlines[i].e)
|
|
495
|
-
attrsIsText = false
|
|
496
|
-
attrsIsTextTag = ''
|
|
497
559
|
}
|
|
498
560
|
|
|
499
561
|
i++
|
|
@@ -508,17 +570,12 @@ const pushInlines = (inlines, s, e, len, type, tag, tagType) => {
|
|
|
508
570
|
ep: e,
|
|
509
571
|
len: len,
|
|
510
572
|
type: type,
|
|
511
|
-
check:
|
|
573
|
+
check: false
|
|
512
574
|
}
|
|
513
575
|
if (tag) inline.tag = [tag, tagType]
|
|
514
576
|
inlines.push(inline)
|
|
515
577
|
}
|
|
516
578
|
|
|
517
|
-
const isAsciiPunctuationCode = (code) => {
|
|
518
|
-
return (code >= 33 && code <= 47) || (code >= 58 && code <= 64) ||
|
|
519
|
-
(code >= 91 && code <= 96) || (code >= 123 && code <= 126)
|
|
520
|
-
}
|
|
521
|
-
|
|
522
579
|
const findNextAsciiPunctuation = (src, start, max) => {
|
|
523
580
|
REG_ASCII_PUNCT.lastIndex = start
|
|
524
581
|
const match = REG_ASCII_PUNCT.exec(src)
|
|
@@ -650,11 +707,13 @@ const createInlines = (state, start, max, opt) => {
|
|
|
650
707
|
// HTML tags
|
|
651
708
|
if (htmlEnabled && currentChar === CHAR_LT) {
|
|
652
709
|
if (!isEscaped) {
|
|
710
|
+
const guardHtml = srcLen - n > 8192
|
|
711
|
+
const maxScanEnd = guardHtml ? Math.min(srcLen, n + 8192) : srcLen
|
|
653
712
|
let foundClosingTag = false
|
|
654
713
|
let i = n + 1
|
|
655
714
|
while (i < srcLen) {
|
|
656
715
|
i = src.indexOf('>', i)
|
|
657
|
-
if (i === -1 || i >=
|
|
716
|
+
if (i === -1 || i >= maxScanEnd) break
|
|
658
717
|
if (!hasBackslash(state, i)) {
|
|
659
718
|
hasText = processTextSegment(inlines, textStart, n, hasText)
|
|
660
719
|
let tag = src.slice(n + 1, i)
|
|
@@ -725,6 +784,8 @@ const setStrong = (state, inlines, marks, n, memo, opt, nestTracker, refRanges,
|
|
|
725
784
|
const hasInlineLinkRanges = inlineLinkRanges && inlineLinkRanges.length > 0
|
|
726
785
|
const hasRefRanges = refRanges && refRanges.length > 0
|
|
727
786
|
const inlinesLength = inlines.length
|
|
787
|
+
const leadingCompat = opt.leadingAsterisk === false
|
|
788
|
+
const conservativePunctuation = opt.disallowMixed === true
|
|
728
789
|
if (opt.disallowMixed === true) {
|
|
729
790
|
let i = n + 1
|
|
730
791
|
while (i < inlinesLength) {
|
|
@@ -767,6 +828,10 @@ const setStrong = (state, inlines, marks, n, memo, opt, nestTracker, refRanges,
|
|
|
767
828
|
}
|
|
768
829
|
}
|
|
769
830
|
|
|
831
|
+
if (state.md && state.md.options && state.md.options.html && hasCodeTagInside(state, inlines, n, i)) {
|
|
832
|
+
return [n, nest]
|
|
833
|
+
}
|
|
834
|
+
|
|
770
835
|
nest = checkNest(inlines, marks, n, i, nestTracker)
|
|
771
836
|
if (nest === -1) return [n, nest]
|
|
772
837
|
|
|
@@ -798,7 +863,12 @@ const setStrong = (state, inlines, marks, n, memo, opt, nestTracker, refRanges,
|
|
|
798
863
|
let strongNum = Math.trunc(Math.min(inlines[n].len, inlines[i].len) / 2)
|
|
799
864
|
|
|
800
865
|
if (inlines[i].len > 1) {
|
|
801
|
-
|
|
866
|
+
const hasJapaneseContext = isJapanese(state.src[inlines[n].s - 1] || '') || isJapanese(state.src[inlines[i].e + 1] || '')
|
|
867
|
+
const needsPunctuationCheck = (conservativePunctuation && !hasJapaneseContext) || hasHtmlLikePunctuation(state, inlines, n, i) || hasAngleBracketInside(state, inlines, n, i)
|
|
868
|
+
if (needsPunctuationCheck && hasPunctuationOrNonJapanese(state, inlines, n, i, opt, refRanges, hasRefRanges)) {
|
|
869
|
+
if (leadingCompat) {
|
|
870
|
+
return [n, nest]
|
|
871
|
+
}
|
|
802
872
|
if (memo.inlineMarkEnd) {
|
|
803
873
|
marks.push(...createMarks(state, inlines, i, inlinesLength - 1, memo, opt, refRanges, inlineLinkRanges))
|
|
804
874
|
if (inlines[i].len === 0) { i++; continue }
|
|
@@ -881,7 +951,17 @@ const isPunctuation = (ch) => {
|
|
|
881
951
|
(code >= 91 && code <= 96) || (code >= 123 && code <= 126) || code === 32
|
|
882
952
|
}
|
|
883
953
|
|
|
884
|
-
|
|
954
|
+
const isAsciiPunctuationCode = (code) => {
|
|
955
|
+
if (code < 33 || code > 126) return false
|
|
956
|
+
return (code <= 47) || (code >= 58 && code <= 64) || (code >= 91 && code <= 96) || (code >= 123)
|
|
957
|
+
}
|
|
958
|
+
|
|
959
|
+
const isUnicodePunctuation = (ch) => {
|
|
960
|
+
if (!ch) return false
|
|
961
|
+
return /\p{P}/u.test(ch)
|
|
962
|
+
}
|
|
963
|
+
|
|
964
|
+
// Check if character is Japanese (hiragana, katakana, kanji, CJK punctuation/fullwidth)
|
|
885
965
|
// Uses fast Unicode range checks for common cases, falls back to REG_JAPANESE for complex Unicode
|
|
886
966
|
const isJapanese = (ch) => {
|
|
887
967
|
if (!ch) return false
|
|
@@ -896,6 +976,26 @@ const isJapanese = (ch) => {
|
|
|
896
976
|
REG_JAPANESE.test(ch)
|
|
897
977
|
}
|
|
898
978
|
|
|
979
|
+
const hasJapaneseText = (str) => {
|
|
980
|
+
if (!str) return false
|
|
981
|
+
return REG_JAPANESE.test(str)
|
|
982
|
+
}
|
|
983
|
+
|
|
984
|
+
const resolveLeadingAsterisk = (state, opt, start, max) => {
|
|
985
|
+
const modeRaw = opt.mode || 'japanese-only'
|
|
986
|
+
const mode = typeof modeRaw === 'string' ? modeRaw.toLowerCase() : 'japanese-only'
|
|
987
|
+
if (mode === 'aggressive') return true
|
|
988
|
+
if (mode === 'compatible') return false
|
|
989
|
+
let hasJapanese = state.__strongJaHasJapanese
|
|
990
|
+
if (hasJapanese === undefined) {
|
|
991
|
+
hasJapanese = hasJapaneseText(state.src.slice(0, max))
|
|
992
|
+
state.__strongJaHasJapanese = hasJapanese
|
|
993
|
+
}
|
|
994
|
+
if (opt.disallowMixed === true) return hasJapanese
|
|
995
|
+
|
|
996
|
+
return hasJapanese
|
|
997
|
+
}
|
|
998
|
+
|
|
899
999
|
// Check if character is English (letters, numbers) or other non-Japanese characters
|
|
900
1000
|
// Uses REG_JAPANESE to exclude Japanese characters
|
|
901
1001
|
const isEnglish = (ch) => {
|
|
@@ -928,6 +1028,9 @@ const hasPunctuationOrNonJapanese = (state, inlines, n, i, opt, refRanges, hasRe
|
|
|
928
1028
|
const openPrevChar = src[inlines[n].s - 1] || ''
|
|
929
1029
|
const openNextChar = src[inlines[n].e + 1] || ''
|
|
930
1030
|
let checkOpenNextChar = isPunctuation(openNextChar)
|
|
1031
|
+
if (!checkOpenNextChar && opt.leadingAsterisk === false && isUnicodePunctuation(openNextChar)) {
|
|
1032
|
+
checkOpenNextChar = true
|
|
1033
|
+
}
|
|
931
1034
|
if (hasRefRanges && checkOpenNextChar && (openNextChar === '[' || openNextChar === ']')) {
|
|
932
1035
|
const openNextRange = findRefRangeIndex(inlines[n].e + 1, refRanges)
|
|
933
1036
|
if (openNextRange !== -1) {
|
|
@@ -936,6 +1039,9 @@ const hasPunctuationOrNonJapanese = (state, inlines, n, i, opt, refRanges, hasRe
|
|
|
936
1039
|
}
|
|
937
1040
|
const closePrevChar = src[inlines[i].s - 1] || ''
|
|
938
1041
|
let checkClosePrevChar = isPunctuation(closePrevChar)
|
|
1042
|
+
if (!checkClosePrevChar && opt.leadingAsterisk === false && isUnicodePunctuation(closePrevChar)) {
|
|
1043
|
+
checkClosePrevChar = true
|
|
1044
|
+
}
|
|
939
1045
|
if (hasRefRanges && checkClosePrevChar && (closePrevChar === '[' || closePrevChar === ']')) {
|
|
940
1046
|
const closePrevRange = findRefRangeIndex(inlines[i].s - 1, refRanges)
|
|
941
1047
|
if (closePrevRange !== -1) {
|
|
@@ -944,7 +1050,10 @@ const hasPunctuationOrNonJapanese = (state, inlines, n, i, opt, refRanges, hasRe
|
|
|
944
1050
|
}
|
|
945
1051
|
const closeNextChar = src[inlines[i].e + 1] || ''
|
|
946
1052
|
const isLastInline = i === inlines.length - 1
|
|
947
|
-
|
|
1053
|
+
let checkCloseNextChar = isLastInline || isPunctuation(closeNextChar) || closeNextChar === '\n'
|
|
1054
|
+
if (!checkCloseNextChar && opt.leadingAsterisk === false && isUnicodePunctuation(closeNextChar)) {
|
|
1055
|
+
checkCloseNextChar = true
|
|
1056
|
+
}
|
|
948
1057
|
|
|
949
1058
|
if (opt.disallowMixed === false) {
|
|
950
1059
|
if (isEnglish(openPrevChar) || isEnglish(closeNextChar)) {
|
|
@@ -958,12 +1067,52 @@ const hasPunctuationOrNonJapanese = (state, inlines, n, i, opt, refRanges, hasRe
|
|
|
958
1067
|
return result
|
|
959
1068
|
}
|
|
960
1069
|
|
|
1070
|
+
const hasHtmlLikePunctuation = (state, inlines, n, i) => {
|
|
1071
|
+
const src = state.src
|
|
1072
|
+
const chars = [
|
|
1073
|
+
src[inlines[n].e + 1] || '',
|
|
1074
|
+
src[inlines[i].s - 1] || '',
|
|
1075
|
+
src[inlines[i].e + 1] || ''
|
|
1076
|
+
]
|
|
1077
|
+
for (let idx = 0; idx < chars.length; idx++) {
|
|
1078
|
+
const ch = chars[idx]
|
|
1079
|
+
if (ch === '<' || ch === '>') return true
|
|
1080
|
+
}
|
|
1081
|
+
return false
|
|
1082
|
+
}
|
|
1083
|
+
|
|
1084
|
+
const hasAngleBracketInside = (state, inlines, n, i) => {
|
|
1085
|
+
const src = state.src
|
|
1086
|
+
const start = inlines[n].s
|
|
1087
|
+
const end = inlines[i].e
|
|
1088
|
+
const ltPos = src.indexOf('<', start)
|
|
1089
|
+
if (ltPos !== -1 && ltPos <= end) return true
|
|
1090
|
+
const gtPos = src.indexOf('>', start)
|
|
1091
|
+
return gtPos !== -1 && gtPos <= end
|
|
1092
|
+
}
|
|
1093
|
+
|
|
1094
|
+
const hasCodeTagInside = (state, inlines, n, i) => {
|
|
1095
|
+
const src = state.src
|
|
1096
|
+
const start = inlines[n].s
|
|
1097
|
+
const end = inlines[i].e
|
|
1098
|
+
const codeOpen = src.indexOf('<code', start)
|
|
1099
|
+
if (codeOpen !== -1 && codeOpen <= end) return true
|
|
1100
|
+
const codeClose = src.indexOf('</code', start)
|
|
1101
|
+
if (codeClose !== -1 && codeClose <= end) return true
|
|
1102
|
+
const preOpen = src.indexOf('<pre', start)
|
|
1103
|
+
if (preOpen !== -1 && preOpen <= end) return true
|
|
1104
|
+
const preClose = src.indexOf('</pre', start)
|
|
1105
|
+
return preClose !== -1 && preClose <= end
|
|
1106
|
+
}
|
|
1107
|
+
|
|
961
1108
|
const setEm = (state, inlines, marks, n, memo, opt, sNest, nestTracker, refRanges, inlineLinkRanges) => {
|
|
962
1109
|
const hasInlineLinkRanges = inlineLinkRanges && inlineLinkRanges.length > 0
|
|
963
1110
|
const hasRefRanges = refRanges && refRanges.length > 0
|
|
964
1111
|
const inlinesLength = inlines.length
|
|
965
1112
|
const emOpenRange = hasRefRanges ? findRefRangeIndex(inlines[n].s, refRanges) : -1
|
|
966
1113
|
const openLinkRange = hasInlineLinkRanges ? findInlineLinkRange(inlines[n].s, inlineLinkRanges) : null
|
|
1114
|
+
const leadingCompat = opt.leadingAsterisk === false
|
|
1115
|
+
const conservativePunctuation = leadingCompat || opt.disallowMixed === true
|
|
967
1116
|
if (opt.disallowMixed === true && !sNest) {
|
|
968
1117
|
let i = n + 1
|
|
969
1118
|
while (i < inlinesLength) {
|
|
@@ -1014,6 +1163,10 @@ const setEm = (state, inlines, marks, n, memo, opt, sNest, nestTracker, refRange
|
|
|
1014
1163
|
}
|
|
1015
1164
|
}
|
|
1016
1165
|
|
|
1166
|
+
if (state.md && state.md.options && state.md.options.html && hasCodeTagInside(state, inlines, n, i)) {
|
|
1167
|
+
return [n, nest]
|
|
1168
|
+
}
|
|
1169
|
+
|
|
1017
1170
|
const emNum = Math.min(inlines[n].len, inlines[i].len)
|
|
1018
1171
|
|
|
1019
1172
|
if (!sNest && emNum !== 1) return [n, sNest, memo]
|
|
@@ -1035,7 +1188,11 @@ const setEm = (state, inlines, marks, n, memo, opt, sNest, nestTracker, refRange
|
|
|
1035
1188
|
if (nest === -1) return [n, nest]
|
|
1036
1189
|
|
|
1037
1190
|
if (emNum === 1) {
|
|
1038
|
-
|
|
1191
|
+
const needsPunctuationCheckClose = conservativePunctuation || hasHtmlLikePunctuation(state, inlines, n, i) || hasAngleBracketInside(state, inlines, n, i)
|
|
1192
|
+
if (needsPunctuationCheckClose && hasPunctuationOrNonJapanese(state, inlines, n, i, opt, refRanges, hasRefRanges)) {
|
|
1193
|
+
if (leadingCompat) {
|
|
1194
|
+
return [n, nest]
|
|
1195
|
+
}
|
|
1039
1196
|
if (memo.inlineMarkEnd) {
|
|
1040
1197
|
marks.push(...createMarks(state, inlines, i, inlinesLength - 1, memo, opt, refRanges, inlineLinkRanges))
|
|
1041
1198
|
|
|
@@ -1228,6 +1385,16 @@ const strongJa = (state, silent, opt) => {
|
|
|
1228
1385
|
|
|
1229
1386
|
const attrsEnabled = opt.mditAttrs && hasMditAttrs(state)
|
|
1230
1387
|
|
|
1388
|
+
const leadingAsterisk = resolveLeadingAsterisk(state, opt, start, originalMax)
|
|
1389
|
+
|
|
1390
|
+
if (leadingAsterisk === false) {
|
|
1391
|
+
return false
|
|
1392
|
+
}
|
|
1393
|
+
|
|
1394
|
+
const runtimeOpt = leadingAsterisk === opt.leadingAsterisk
|
|
1395
|
+
? opt
|
|
1396
|
+
: { ...opt, leadingAsterisk }
|
|
1397
|
+
|
|
1231
1398
|
if (start === 0) {
|
|
1232
1399
|
state.__strongJaRefRangeCache = null
|
|
1233
1400
|
state.__strongJaInlineLinkRangeCache = null
|
|
@@ -1283,35 +1450,40 @@ const strongJa = (state, silent, opt) => {
|
|
|
1283
1450
|
|
|
1284
1451
|
let refRanges = []
|
|
1285
1452
|
const hasReferenceDefinitions = state.__strongJaReferenceCount > 0
|
|
1453
|
+
const refScanStart = 0
   if (hasReferenceDefinitions) {
-    const
-    if (
-
-
-
-
+    const firstRefBracket = state.src.indexOf('[', refScanStart)
+    if (firstRefBracket !== -1 && firstRefBracket < max) {
+      const refCache = state.__strongJaRefRangeCache
+      if (refCache && refCache.max === max && refCache.start === refScanStart) {
+        refRanges = refCache.ranges
+      } else {
+        refRanges = computeReferenceRanges(state, refScanStart, max)
+        state.__strongJaRefRangeCache = { start: refScanStart, max, ranges: refRanges }
+      }
+      if (refRanges.length > 0) {
+        state.__strongJaHasCollapsedRefs = true
+      }
     }
   }
-  if (refRanges.length > 0) {
-    state.__strongJaHasCollapsedRefs = true
-  }
 
   let inlineLinkRanges = null
-  const
+  const inlineLinkScanStart = 0
+  const inlineLinkCandidatePos = state.src.indexOf('](', inlineLinkScanStart)
   const hasInlineLinkCandidate = inlineLinkCandidatePos !== -1 && inlineLinkCandidatePos < max
   if (hasInlineLinkCandidate) {
     const inlineCache = state.__strongJaInlineLinkRangeCache
-    if (inlineCache && inlineCache.max === max && inlineCache.start
+    if (inlineCache && inlineCache.max === max && inlineCache.start === inlineLinkScanStart) {
       inlineLinkRanges = inlineCache.ranges
     } else {
-      inlineLinkRanges = computeInlineLinkRanges(state,
-      state.__strongJaInlineLinkRangeCache = { start, max, ranges: inlineLinkRanges }
+      inlineLinkRanges = computeInlineLinkRanges(state, inlineLinkScanStart, max)
+      state.__strongJaInlineLinkRangeCache = { start: inlineLinkScanStart, max, ranges: inlineLinkRanges }
     }
     if (inlineLinkRanges.length > 0) {
       state.__strongJaHasInlineLinks = true
     }
   }
-  let inlines = createInlines(state, start, max,
+  let inlines = createInlines(state, start, max, runtimeOpt)
 
   const memo = {
     html: state.md.options.html,
@@ -1321,11 +1493,11 @@ const strongJa = (state, silent, opt) => {
     inlineMarkEnd: src.charCodeAt(max - 1) === CHAR_ASTERISK,
   }
 
-  let marks = createMarks(state, inlines, 0, inlines.length, memo,
+  let marks = createMarks(state, inlines, 0, inlines.length, memo, runtimeOpt, refRanges, inlineLinkRanges)
 
   inlines = mergeInlinesAndMarks(inlines, marks)
 
-  setToken(state, inlines,
+  setToken(state, inlines, runtimeOpt, attrsEnabled)
 
   if (inlineLinkRanges && inlineLinkRanges.length > 0) {
     const labelSources = []
@@ -1335,8 +1507,15 @@ const strongJa = (state, silent, opt) => {
       labelSources.push(src.slice(range.start + 1, range.end))
     }
     if (labelSources.length > 0) {
+      restoreLabelWhitespace(state.tokens, labelSources)
       state.tokens.__strongJaInlineLabelSources = labelSources
       state.tokens.__strongJaInlineLabelIndex = 0
+      if (state.env) {
+        if (!state.env.__strongJaInlineLabelSourceList) {
+          state.env.__strongJaInlineLabelSourceList = []
+        }
+        state.env.__strongJaInlineLabelSourceList.push(labelSources)
+      }
     }
   }
 
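
The reference-range and inline-link-range lookups in the `strongJa` hunks above follow the same caching pattern: the ranges are recomputed only when the cached `start`/`max` pair differs from the current scan window. Below is a minimal sketch of that pattern in isolation. The names `getCachedRanges`, `cacheKey`, and `computeRanges` are hypothetical and introduced only for illustration; they are not part of the package.

```js
// Hypothetical helper illustrating the start/max-keyed cache seen in the diff.
// `computeRanges` stands in for computeReferenceRanges / computeInlineLinkRanges.
const getCachedRanges = (state, cacheKey, start, max, computeRanges) => {
  const cache = state[cacheKey]
  if (cache && cache.max === max && cache.start === start) {
    return cache.ranges // same scan window: reuse the cached ranges
  }
  const ranges = computeRanges(state, start, max)
  state[cacheKey] = { start, max, ranges }
  return ranges
}

// Usage: a repeated call with the same window does not recompute.
let calls = 0
const fakeCompute = () => { calls++; return [{ start: 0, end: 3 }] }
const fakeState = {}
const first = getCachedRanges(fakeState, '__demoCache', 0, 10, fakeCompute)
const second = getCachedRanges(fakeState, '__demoCache', 0, 10, fakeCompute)
const third = getCachedRanges(fakeState, '__demoCache', 0, 12, fakeCompute)
```

Only the first and third calls invoke `fakeCompute`, because the second call sees an unchanged `start`/`max` pair.
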
@@ -1705,9 +1884,52 @@ const removeGhostLabelText = (tokens, linkCloseIndex, labelText) => {
   }
 }
 
+const restoreLabelWhitespace = (tokens, labelSources) => {
+  if (!tokens || !labelSources || labelSources.length === 0) return
+  let labelIdx = 0
+  for (let i = 0; i < tokens.length && labelIdx < labelSources.length; i++) {
+    if (tokens[i].type !== 'link_open') continue
+    const closeIdx = findLinkCloseIndex(tokens, i)
+    if (closeIdx === -1) continue
+    const labelSource = labelSources[labelIdx] || ''
+    if (!labelSource) {
+      labelIdx++
+      continue
+    }
+    let cursor = 0
+    for (let pos = i + 1; pos < closeIdx; pos++) {
+      const t = tokens[pos]
+      const markup = t.markup || ''
+      const text = t.content || ''
+      const startPos = cursor
+      if (t.type === 'text') {
+        cursor += text.length
+      } else if (t.type === 'code_inline') {
+        cursor += markup.length + text.length + markup.length
+      } else if (markup) {
+        cursor += markup.length
+      }
+      if ((t.type === 'strong_open' || t.type === 'em_open') && startPos > 0) {
+        const prevToken = tokens[pos - 1]
+        if (prevToken && prevToken.type === 'text' && prevToken.content && !prevToken.content.endsWith(' ')) {
+          const hasSpaceBefore = startPos - 1 >= 0 && startPos - 1 < labelSource.length && labelSource[startPos - 1] === ' '
+          const hasSpaceAt = startPos >= 0 && startPos < labelSource.length && labelSource[startPos] === ' '
+          if (hasSpaceBefore || hasSpaceAt) {
+            prevToken.content += ' '
+          }
+        }
+      }
+    }
+    labelIdx++
+  }
+}
+
 const convertInlineLinks = (tokens, state) => {
   if (!tokens || tokens.length === 0) return
-
+  let labelSources = tokens.__strongJaInlineLabelSources
+  if ((!labelSources || labelSources.length === 0) && state && state.env && Array.isArray(state.env.__strongJaInlineLabelSourceList) && state.env.__strongJaInlineLabelSourceList.length > 0) {
+    labelSources = state.env.__strongJaInlineLabelSourceList.shift()
+  }
   let labelSourceIndex = tokens.__strongJaInlineLabelIndex || 0
   let i = 0
   while (i < tokens.length) {
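
The core idea of `restoreLabelWhitespace` can be tried on plain objects without markdown-it. The sketch below is a simplified, single-label variant: it walks the tokens of one link label, tracks an approximate offset into the raw label text, and re-adds a space before `strong_open` / `em_open` when the raw label had one. The function name `restoreSpaceBeforeEmphasis` and the token literals are assumptions made for illustration; the real function additionally locates each label through `link_open` tokens and `findLinkCloseIndex`.

```js
// Simplified sketch of the whitespace-restoration idea (not the package's function).
// Assumption: `labelTokens` are the tokens between link_open and link_close,
// and `labelSource` is the raw label text sliced from the Markdown source.
const restoreSpaceBeforeEmphasis = (labelTokens, labelSource) => {
  let cursor = 0 // approximate offset into labelSource
  for (let pos = 0; pos < labelTokens.length; pos++) {
    const t = labelTokens[pos]
    const markup = t.markup || ''
    const text = t.content || ''
    const startPos = cursor
    if (t.type === 'text') cursor += text.length
    else if (t.type === 'code_inline') cursor += markup.length * 2 + text.length
    else if (markup) cursor += markup.length
    if ((t.type !== 'strong_open' && t.type !== 'em_open') || startPos === 0) continue
    const prev = labelTokens[pos - 1]
    if (!prev || prev.type !== 'text' || !prev.content || prev.content.endsWith(' ')) continue
    // Re-add the space if the raw label had one just before or at this offset.
    if (labelSource[startPos - 1] === ' ' || labelSource[startPos] === ' ') {
      prev.content += ' '
    }
  }
}

// Label "foo **bar**" whose text token lost its trailing space.
const demoTokens = [
  { type: 'text', content: 'foo' },
  { type: 'strong_open', markup: '**' },
  { type: 'text', content: 'bar' },
  { type: 'strong_close', markup: '**' },
]
restoreSpaceBeforeEmphasis(demoTokens, 'foo **bar**')

// Label "foo**bar**" has no space in the source, so nothing is added.
const tightTokens = [
  { type: 'text', content: 'foo' },
  { type: 'strong_open', markup: '**' },
  { type: 'text', content: 'bar' },
  { type: 'strong_close', markup: '**' },
]
restoreSpaceBeforeEmphasis(tightTokens, 'foo**bar**')
```

Because the text token is mutated in place, the restored space is visible to whatever renders the tokens afterwards.
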
@@ -1797,6 +2019,35 @@ const convertInlineLinks = (tokens, state) => {
       i++
       continue
     }
+    if (currentLabelSource) {
+      const linkCloseIdx = findLinkCloseIndex(tokens, i)
+      if (linkCloseIdx !== -1) {
+        let cursor = 0
+        for (let pos = i + 1; pos < linkCloseIdx; pos++) {
+          const t = tokens[pos]
+          const markup = t.markup || ''
+          const text = t.content || ''
+          const startPos = cursor
+          if (t.type === 'text') {
+            cursor += text.length
+          } else if (t.type === 'code_inline') {
+            cursor += markup.length + text.length + markup.length
+          } else if (markup) {
+            cursor += markup.length
+          }
+          if ((t.type === 'strong_open' || t.type === 'em_open') && startPos > 0) {
+            const prevToken = tokens[pos - 1]
+            if (prevToken && prevToken.type === 'text' && prevToken.content && !prevToken.content.endsWith(' ')) {
+              const labelHasSpaceBefore = startPos - 1 >= 0 && startPos - 1 < currentLabelSource.length && currentLabelSource[startPos - 1] === ' '
+              const labelHasSpaceAt = startPos >= 0 && startPos < currentLabelSource.length && currentLabelSource[startPos] === ' '
+              if (labelHasSpaceBefore || labelHasSpaceAt) {
+                prevToken.content += ' '
+              }
+            }
+          }
+        }
+      }
+    }
     if (needsPlaceholder && currentLabelSource) {
       removeGhostLabelText(tokens, nextIndex - 1, currentLabelSource)
     }
@@ -2001,9 +2252,11 @@ const mditStrongJa = (md, option) => {
     mditAttrs: true, //markdown-it-attrs
     mdBreaks: md.options.breaks,
     disallowMixed: false, //Non-Japanese text handling
+    mode: 'japanese-only', // 'japanese-only' | 'aggressive' | 'compatible'
     coreRulesBeforePostprocess: [] // e.g. ['cjk_breaks'] when CJK line-break plugins are active
   }
   if (option) Object.assign(opt, option)
+  opt.hasCjkBreaks = hasCjkBreaksRule(md)
   const rawCoreRules = opt.coreRulesBeforePostprocess
   const hasCoreRuleConfig = Array.isArray(rawCoreRules)
     ? rawCoreRules.length > 0
@@ -2016,6 +2269,139 @@ const mditStrongJa = (md, option) => {
     return strongJa(state, silent, opt)
   })
 
+  // Trim trailing spaces that remain after markdown-it-attrs strips `{...}`
+  // Trim trailing spaces only at the very end of inline content (after attrs/core rules have run).
+  const trimInlineTrailingSpaces = (state) => {
+    if (!state || !state.tokens) return
+    for (let i = 0; i < state.tokens.length; i++) {
+      const token = state.tokens[i]
+      if (!token || token.type !== 'inline' || !token.children || token.children.length === 0) continue
+      let idx = token.children.length - 1
+      while (idx >= 0 && (!token.children[idx] || (token.children[idx].type === 'text' && token.children[idx].content === ''))) {
+        idx--
+      }
+      if (idx < 0) continue
+      const tail = token.children[idx]
+      if (!tail || tail.type !== 'text' || !tail.content) continue
+      const trimmed = tail.content.replace(/[ \t]+$/, '')
+      if (trimmed !== tail.content) {
+        tail.content = trimmed
+      }
+    }
+  }
+  const hasTextJoinRule = Array.isArray(md.core?.ruler?.__rules__)
+    ? md.core.ruler.__rules__.some((rule) => rule && rule.name === 'text_join')
+    : false
+  if (hasTextJoinRule) {
+    md.core.ruler.after('text_join', 'strong_ja_trim_trailing_spaces', trimInlineTrailingSpaces)
+  } else {
+    md.core.ruler.after('inline', 'strong_ja_trim_trailing_spaces', trimInlineTrailingSpaces)
+  }
+
+  const normalizeSoftbreakSpacing = (state) => {
+    if (!state || opt.hasCjkBreaks !== true) return
+    if (!state.tokens || state.tokens.length === 0) return
+    for (let i = 0; i < state.tokens.length; i++) {
+      const token = state.tokens[i]
+      if (!token || token.type !== 'inline' || !token.children || token.children.length === 0) continue
+      for (let j = 0; j < token.children.length; j++) {
+        const child = token.children[j]
+        if (!child || child.type !== 'text' || !child.content) continue
+        if (child.content.indexOf('\n') === -1) continue
+        let normalized = ''
+        for (let idx = 0; idx < child.content.length; idx++) {
+          const ch = child.content[idx]
+          if (ch === '\n') {
+            const prevChar = idx > 0 ? child.content[idx - 1] : ''
+            const nextChar = idx + 1 < child.content.length ? child.content[idx + 1] : ''
+            const isAsciiWord = nextChar && nextChar >= '0' && nextChar <= 'z' && /[A-Za-z0-9]/.test(nextChar)
+            const shouldReplace = isAsciiWord && nextChar !== '{' && nextChar !== '\\' && isJapanese(prevChar) && !isJapanese(nextChar)
+            if (shouldReplace) {
+              normalized += ' '
+              continue
+            }
+          }
+          normalized += ch
+        }
+        if (normalized !== child.content) {
+          child.content = normalized
+        }
+      }
+    }
+  }
+  if (hasTextJoinRule) {
+    md.core.ruler.after('text_join', 'strong_ja_softbreak_spacing', normalizeSoftbreakSpacing)
+  } else {
+    md.core.ruler.after('inline', 'strong_ja_softbreak_spacing', normalizeSoftbreakSpacing)
+  }
+
+  const restoreSoftbreaksAfterCjk = (state) => {
+    if (!state) return
+    if (!state.md || state.md.__strongJaRestoreSoftbreaksForAttrs !== true) return
+    if (opt.hasCjkBreaks !== true) return
+    if (!state.tokens || state.tokens.length === 0) return
+    for (let i = 0; i < state.tokens.length; i++) {
+      const token = state.tokens[i]
+      if (!token || token.type !== 'inline' || !token.children || token.children.length === 0) continue
+      const children = token.children
+      for (let j = 0; j < children.length; j++) {
+        const child = children[j]
+        if (!child || child.type !== 'text' || child.content !== '') continue
+        // Find previous non-empty text content to inspect the trailing character.
+        let prevChar = ''
+        for (let k = j - 1; k >= 0; k--) {
+          const prev = children[k]
+          if (prev && prev.type === 'text' && prev.content) {
+            prevChar = prev.content.charAt(prev.content.length - 1)
+            break
+          }
+        }
+        if (!prevChar || !isJapanese(prevChar)) continue
+        const next = children[j + 1]
+        if (!next || next.type !== 'text' || !next.content) continue
+        const nextChar = next.content.charAt(0)
+        if (nextChar !== '{') continue
+        child.type = 'softbreak'
+        child.tag = ''
+        child.content = '\n'
+        child.markup = ''
+        child.info = ''
+      }
+    }
+  }
+
+  const registerRestoreSoftbreaks = () => {
+    if (md.__strongJaRestoreRegistered) return
+    const anchorRule = hasTextJoinRule ? 'text_join' : 'inline'
+    const added = md.core.ruler.after(anchorRule, 'strong_ja_restore_softbreaks', restoreSoftbreaksAfterCjk)
+    if (added !== false) {
+      md.__strongJaRestoreRegistered = true
+      md.__strongJaRestoreSoftbreaksForAttrs = opt.mditAttrs === false
+      if (opt.hasCjkBreaks) {
+        moveRuleAfter(md.core.ruler, 'strong_ja_restore_softbreaks', 'cjk_breaks')
+        md.__strongJaRestoreReordered = true
+      }
+      if (!md.__strongJaPatchCorePush) {
+        md.__strongJaPatchCorePush = true
+        const originalPush = md.core.ruler.push.bind(md.core.ruler)
+        md.core.ruler.push = (name, fn, options) => {
+          const res = originalPush(name, fn, options)
+          if (name && name.indexOf && name.indexOf('cjk_breaks') !== -1) {
+            opt.hasCjkBreaks = true
+            moveRuleAfter(md.core.ruler, 'strong_ja_restore_softbreaks', name)
+            md.__strongJaRestoreReordered = true
+          }
+          return res
+        }
+      }
+      if (opt.hasCjkBreaks) {
+        moveRuleAfter(md.core.ruler, 'strong_ja_restore_softbreaks', 'cjk_breaks')
+        md.__strongJaRestoreReordered = true
+      }
+    }
+  }
+  registerRestoreSoftbreaks()
+
   md.core.ruler.after('inline', 'strong_ja_postprocess', (state) => {
     const targets = state.env.__strongJaPostProcessTargets
     if (!targets || targets.length === 0) return
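
The newline rule inside `normalizeSoftbreakSpacing` can be illustrated on a plain string: a newline becomes a space only when the character before it is Japanese and the character after it is an ASCII letter or digit. The sketch below is an approximation under stated assumptions. `isJapaneseChar` is a stand-in written for this example (the package's own `isJapanese` helper is not shown in this diff), and `normalizeSoftbreaks` is a hypothetical name. The real rule also runs only when a `cjk_breaks` rule is registered and operates on the text children of inline tokens.

```js
// Stand-in for the package's isJapanese helper (an assumption; the real
// definition is not part of this diff). Covers CJK punctuation, kana,
// common kanji, and fullwidth forms.
const isJapaneseChar = (ch) => /[\u3000-\u303F\u3040-\u30FF\u4E00-\u9FFF\uFF00-\uFFEF]/.test(ch)

// Replace a newline with a space when Japanese text is followed by an
// ASCII letter or digit on the next line; leave every other newline alone.
const normalizeSoftbreaks = (content) => {
  let normalized = ''
  for (let idx = 0; idx < content.length; idx++) {
    const ch = content[idx]
    if (ch === '\n') {
      const prevChar = idx > 0 ? content[idx - 1] : ''
      const nextChar = idx + 1 < content.length ? content[idx + 1] : ''
      if (/[A-Za-z0-9]/.test(nextChar) && isJapaneseChar(prevChar)) {
        normalized += ' '
        continue
      }
    }
    normalized += ch
  }
  return normalized
}

const mixed = normalizeSoftbreaks('日本語\nEnglish')       // Japanese then ASCII word
const japaneseOnly = normalizeSoftbreaks('日本語\n日本語') // Japanese on both sides
const attrsLike = normalizeSoftbreaks('日本語\n{.cls}')    // next line starts with "{"
```

Only the first input changes. A newline followed by `{` is left alone, which matches the `nextChar !== '{'` guard in the diff.
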
@@ -2043,6 +2429,9 @@ const mditStrongJa = (md, option) => {
       delete tokens.__strongJaInlineLabelSources
       delete tokens.__strongJaInlineLabelIndex
     }
+    if (state.env && state.env.__strongJaInlineLabelSourceList) {
+      delete state.env.__strongJaInlineLabelSourceList
+    }
     delete state.env.__strongJaPostProcessTargets
     delete state.env.__strongJaPostProcessTargetSet
   })
@@ -2097,3 +2486,21 @@ function moveRuleBefore(ruler, ruleName, beforeName) {
   rules.splice(beforeIdx, 0, rule)
   ruler.__cache__ = null
 }
+
+function moveRuleAfter(ruler, ruleName, afterName) {
+  if (!ruler || !ruler.__rules__) return
+  const rules = ruler.__rules__
+  let fromIdx = -1
+  let afterIdx = -1
+  for (let idx = 0; idx < rules.length; idx++) {
+    if (rules[idx].name === ruleName) fromIdx = idx
+    if (rules[idx].name === afterName) afterIdx = idx
+    if (fromIdx !== -1 && afterIdx !== -1) break
+  }
+  if (fromIdx === -1 || afterIdx === -1 || fromIdx === afterIdx + 1) return
+
+  const rule = rules.splice(fromIdx, 1)[0]
+  const targetIdx = fromIdx < afterIdx ? afterIdx - 1 : afterIdx
+  rules.splice(targetIdx + 1, 0, rule)
+  ruler.__cache__ = null
+}
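
To see what `moveRuleAfter` does to rule order, it can be run against a plain object that mimics the `__rules__` / `__cache__` fields it touches. The function body below is copied from the diff. The `fakeRuler` object is an assumption made for this example and is not a real markdown-it `Ruler`; those two fields are internals of markdown-it, so the sketch only demonstrates the index arithmetic.

```js
// moveRuleAfter as added in this diff.
function moveRuleAfter(ruler, ruleName, afterName) {
  if (!ruler || !ruler.__rules__) return
  const rules = ruler.__rules__
  let fromIdx = -1
  let afterIdx = -1
  for (let idx = 0; idx < rules.length; idx++) {
    if (rules[idx].name === ruleName) fromIdx = idx
    if (rules[idx].name === afterName) afterIdx = idx
    if (fromIdx !== -1 && afterIdx !== -1) break
  }
  if (fromIdx === -1 || afterIdx === -1 || fromIdx === afterIdx + 1) return

  const rule = rules.splice(fromIdx, 1)[0]
  // Removing an earlier element shifts the anchor one slot to the left.
  const targetIdx = fromIdx < afterIdx ? afterIdx - 1 : afterIdx
  rules.splice(targetIdx + 1, 0, rule)
  ruler.__cache__ = null
}

// Hypothetical stand-in for a core ruler: only the fields the function reads.
const fakeRuler = {
  __rules__: [
    { name: 'strong_ja_restore_softbreaks' },
    { name: 'text_join' },
    { name: 'cjk_breaks' },
  ],
  __cache__: {},
}
moveRuleAfter(fakeRuler, 'strong_ja_restore_softbreaks', 'cjk_breaks')
const order = fakeRuler.__rules__.map((r) => r.name).join(',')
```

After the call the restore rule sits directly behind `cjk_breaks`, and `__cache__` is reset to `null` so the ruler rebuilds its compiled chain on next use.
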
package/package.json
CHANGED
@@ -1,7 +1,7 @@
 {
   "name": "@peaceroad/markdown-it-strong-ja",
   "description": "This is a plugin for markdown-it. It is an alternative to the standard `**` (strong) and `*` (em) processing. It also processes strings that cannot be converted by the standard.",
-  "version": "0.
+  "version": "0.6.1",
   "main": "index.js",
   "type": "module",
   "files": [
@@ -19,7 +19,7 @@
     "markdown-it": "^14.1.0"
   },
   "devDependencies": {
-    "@peaceroad/markdown-it-cjk-breaks-mod": "^0.1.
+    "@peaceroad/markdown-it-cjk-breaks-mod": "^0.1.4",
     "@peaceroad/markdown-it-hr-sandwiched-semantic-container": "^0.8.0",
     "markdown-it-attrs": "^4.3.1",
     "markdown-it-sub": "^2.0.0",