@peaceroad/markdown-it-strong-ja 0.5.5 → 0.6.0
- package/README.md +88 -17
- package/index.js +525 -86
- package/package.json +2 -2
package/README.md
CHANGED
|
@@ -20,11 +20,88 @@ md.render('HTMLは*「HyperText Markup Language」*の略です。')
|
|
|
20
20
|
|
|
21
21
|
Notice. This plugin assumes that you use markdown-it-attrs alongside it. If you do not, disable the integration with `use(mditStrongJa, {mditAttrs: false})`.
|
|
22
22
|
|
|
23
|
+
### How this differs from vanilla markdown-it
|
|
24
|
+
|
|
25
|
+
Default output pairs `*` / `**` as it scans left-to-right: when a line contains Japanese (hiragana / katakana / kanji / fullwidth punctuation), japanese-only mode treats the leading `**` aggressively; English-only lines follow markdown-it style pairing. Pick one mode for the examples below:
|
|
26
|
+
|
|
27
|
+
- `mode: 'japanese-only'` (default) … Japanese ⇒ aggressive, English-only ⇒ markdown-it compatible
|
|
28
|
+
- `mode: 'aggressive'` … always aggressive (lead `**` pairs greedily)
|
|
29
|
+
- `mode: 'compatible'` … markdown-it compatible (lead `**` stays literal)
|
|
30
|
+
|
|
31
|
+
```js
|
|
32
|
+
const mdDefault = mdit().use(mditStrongJa) // mode: 'japanese-only'
|
|
33
|
+
const mdCompat = mdit().use(mditStrongJa, { mode: 'compatible' }) // markdown-it pairing
|
|
34
|
+
const mdAggressive = mdit().use(mditStrongJa, { mode: 'aggressive' }) // always pair leading **
|
|
35
|
+
```
|
|
36
|
+
|
|
37
|
+
Default (japanese-only) pairs aggressively only when Japanese is present. Aggressive always pairs the leading `**`, and compatible matches markdown-it.
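The mode selection can be sketched without loading the plugin. The snippet below is a minimal illustration, not the plugin's API: `pairsLeadingAggressively` is a hypothetical name, and the logic is a simplified reading of `resolveLeadingAsterisk` and `REG_JAPANESE` as they appear in this release's `index.js` changes.

```js
// Pattern copied from REG_JAPANESE in this release's index.js:
// hiragana, katakana, kanji, CJK punctuation, fullwidth forms.
const REG_JAPANESE = /[\p{Script=Hiragana}\p{Script=Katakana}\p{Script=Han}\u3000-\u303F\uFF00-\uFFEF]/u

// Hypothetical helper (not exported by the plugin): decides whether a
// leading `**` should be paired aggressively for the given source text.
function pairsLeadingAggressively(src, mode = 'japanese-only') {
  const normalized = typeof mode === 'string' ? mode.toLowerCase() : 'japanese-only'
  if (normalized === 'aggressive') return true
  if (normalized === 'compatible') return false
  // japanese-only (default): aggressive only when Japanese text is present
  return REG_JAPANESE.test(src || '')
}

console.log(pairsLeadingAggressively('**あああ。**iii**'))            // true
console.log(pairsLeadingAggressively('**aaa.**iii**'))                // false
console.log(pairsLeadingAggressively('**aaa.**iii**', 'aggressive'))  // true
console.log(pairsLeadingAggressively('**「test」**', 'compatible'))   // false
```

The sketch only answers "aggressive or not"; the actual pairing and token generation happen elsewhere in the plugin.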
|
|
38
|
+
|
|
39
|
+
Japanese-first pairing around punctuation and mixed sentences: leading/trailing Japanese quotes or brackets (`「`, `」`, `(`, `、` etc.) are wrapped even when the same pattern would stay literal in markdown-it. Mixed sentences here mean one line that contains multiple `*` runs; Japanese text keeps the leading `**` aggressive, while English-only stays compatible unless you pick aggressive mode.
|
|
40
|
+
|
|
41
|
+
- Punctuation:
|
|
42
|
+
- Input: `**「test」**`
|
|
43
|
+
- Output (default): `<p><strong>「test」</strong></p>`
|
|
44
|
+
- Output (aggressive): `<p><strong>「test」</strong></p>`
|
|
45
|
+
- Output (compatible): `<p>**「test」**</p>`
|
|
46
|
+
|
|
47
|
+
- Mixed sentence (multiple `*` runs): English-only stays markdown-it compatible unless you pick aggressive mode; earlier `**` runs can remain literal while later ones pair.
|
|
48
|
+
- Input (Japanese mixed): `**あああ。**iii**`
|
|
49
|
+
- Output (default): `<p><strong>あああ。</strong>iii**</p>`
|
|
50
|
+
- Output (aggressive): `<p><strong>あああ。</strong>iii**</p>`
|
|
51
|
+
- Output (compatible): `<p>**あああ。<strong>iii</strong></p>`
|
|
52
|
+
- Input (English-only): `**aaa.**iii**`
|
|
53
|
+
- Output (default): `<p>**aaa.<strong>iii</strong></p>`
|
|
54
|
+
- Output (aggressive): `<p><strong>aaa.</strong>iii**</p>`
|
|
55
|
+
- Output (compatible): `<p>**aaa.<strong>iii</strong></p>`
|
|
56
|
+
- Input (English-only, two `**` runs): `**aaa.**eee.**eeee**`
|
|
57
|
+
- Output (default): `<p>**aaa.**eee.<strong>eeee</strong></p>`
|
|
58
|
+
- Output (aggressive): `<p><strong>aaa.</strong>eee.<strong>eeee</strong></p>`
|
|
59
|
+
- Output (compatible): `<p>**aaa.**eee.<strong>eeee</strong></p>`
|
|
60
|
+
|
|
61
|
+
Inline links, inline HTML, and inline code stay intact (see the Link / Inline code examples below): the plugin re-wraps `[label](url)` / `[label][]` after pairing to avoid broken emphasis tokens around anchors, inline HTML, or inline code. This also covers clusters of `*` with no spaces around the link or code span.
|
|
62
|
+
|
|
63
|
+
- Link (cluster of `*` without spaces):
|
|
64
|
+
- Input (English-only): `string**[text](url)**`
|
|
65
|
+
- Output (default): `<p>string**<a href="url">text</a>**</p>`
|
|
66
|
+
- Output (aggressive): `<p>string<strong><a href="url">text</a></strong></p>`
|
|
67
|
+
- Output (compatible): `<p>string**<a href="url">text</a>**</p>`
|
|
68
|
+
- Input (Japanese mixed): `これは**[text](url)**です`
|
|
69
|
+
- Output (default/aggressive): `<p>これは<strong><a href="url">text</a></strong>です</p>`
|
|
70
|
+
- Output (compatible): `<p>これは**<a href="url">text</a>**です</p>`
|
|
71
|
+
- Inline code (cluster of `*` without spaces):
|
|
72
|
+
- Input (English-only): `` **aa`code`**aa ``
|
|
73
|
+
- Output (default): `<p>**aa<code>code</code>**aa</p>`
|
|
74
|
+
- Output (aggressive): `<p><strong>aa<code>code</code></strong>aa</p>`
|
|
75
|
+
- Output (compatible): `<p>**aa<code>code</code>**aa</p>`
|
|
76
|
+
- Input (Japanese mixed): `` これは**`code`**です ``
|
|
77
|
+
- Output (default/aggressive): `<p>これは<strong><code>code</code></strong>です</p>`
|
|
78
|
+
- Output (compatible): `<p>これは**<code>code</code>**です</p>`
|
|
79
|
+
|
|
80
|
+
### Known differences from vanilla markdown-it
|
|
81
|
+
|
|
82
|
+
This section collects other cases that diverge from vanilla markdown-it.
|
|
83
|
+
|
|
84
|
+
The plugin keeps pairing aggressively in Japanese contexts, which can diverge from markdown-it when markup spans newlines or mixes nested markers.
|
|
85
|
+
|
|
86
|
+
- Multiline + nested emphasis (markdown-it leaves trailing `**`):
|
|
87
|
+
|
|
88
|
+
```markdown
|
|
89
|
+
***強調と*入れ子*の検証***を行う。
|
|
90
|
+
```
|
|
91
|
+
|
|
92
|
+
- markdown-it: `<p><em><em><em>強調と</em>入れ子</em>の検証</em>**を行う。</p>`
|
|
93
|
+
- markdown-it-strong-ja (default/aggressive): `<p><em><strong>強調と<em>入れ子</em>の検証</strong></em>を行う。</p>`
|
|
94
|
+
- If you want markdown-it behavior here, use `mode: 'compatible'`.
|
|
95
|
+
|
|
96
|
+
Notice. The plugin keeps inline HTML / angle-bracket regions intact so rendered HTML keeps correct nesting (for example, it avoids mis-nesting in inputs like `**aaa<code>**bbb</code>` when HTML output is enabled).
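As a rough illustration of the guard behind this notice, the sketch below checks whether a `<code>` or `<pre>` tag starts between an opening and a closing marker. `hasCodeTagBetween` is a hypothetical helper modeled on `hasCodeTagInside` from this release's `index.js`; the index arguments are supplied by hand here, whereas the plugin derives them from its internal inline records.

```js
// Hypothetical helper modeled on hasCodeTagInside in index.js.
// Returns true when a <code>/<pre> open or close tag starts between the
// opening marker (index `start`) and the closing marker (index `end`), inclusive.
function hasCodeTagBetween(src, start, end) {
  const needles = ['<code', '</code', '<pre', '</pre']
  return needles.some((needle) => {
    const pos = src.indexOf(needle, start)
    return pos !== -1 && pos <= end
  })
}

// The first `**` starts at index 0; the second `**` ends at index 12.
console.log(hasCodeTagBetween('**aaa<code>**bbb</code>', 0, 12)) // true
// Here the second `**` ends at index 6, before the <code> tag begins.
console.log(hasCodeTagBetween('**aaa**<code>bbb</code>', 0, 6))  // false
```

In the diff, a `true` result with `html` enabled makes `setStrong` / `setEm` return early, so the markers are left unpaired rather than wrapped across the tag boundary.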
|
|
97
|
+
|
|
98
|
+
|
|
99
|
+
|
|
23
100
|
## Example
|
|
24
101
|
|
|
25
102
|
The following examples are for strong. The process for em is roughly the same.
|
|
26
103
|
|
|
27
|
-
|
|
104
|
+
````markdown
|
|
28
105
|
[Markdown]
|
|
29
106
|
HTMLは「**HyperText Markup Language**」の略です。
|
|
30
107
|
[HTML]
|
|
@@ -116,6 +193,11 @@ HTMLは**「HyperText <b>Markup</b> Language」**
|
|
|
116
193
|
[HTML:true]
|
|
117
194
|
<p>HTMLは<strong>「HyperText <b>Markup</b> Language」</strong></p>
|
|
118
195
|
|
|
196
|
+
[Markdown]
|
|
197
|
+
これは**[text](url)**と**`code`**と**<b>HTML</b>**です
|
|
198
|
+
[HTML html:true]
|
|
199
|
+
<p>これは<strong><a href="url">text</a></strong>と<strong><code>code</code></strong>と<strong><b>HTML</b></strong>です</p>
|
|
200
|
+
|
|
119
201
|
|
|
120
202
|
[Markdown]
|
|
121
203
|
HTMLは「**HyperText Markup Language**」
|
|
@@ -152,25 +234,12 @@ a****b
|
|
|
152
234
|
a****
|
|
153
235
|
[HTML]
|
|
154
236
|
<p>a****</p>
|
|
155
|
-
|
|
237
|
+
````
|
|
156
238
|
|
|
157
|
-
## Options
|
|
158
239
|
|
|
159
|
-
### disallowMixed
|
|
240
|
+
### disallowMixed (legacy)
|
|
160
241
|
|
|
161
|
-
|
|
162
|
-
|
|
163
|
-
```js
|
|
164
|
-
const md = mdit.use(mditStrongJa)
|
|
165
|
-
md.render('string**[text](url)**')
|
|
166
|
-
// <p>string<strong><a href="url">text</a></strong></p>
|
|
167
|
-
```
|
|
168
|
-
|
|
169
|
-
```js
|
|
170
|
-
const md = mdit.use(mditStrongJa, { disallowMixed: true })
|
|
171
|
-
md.render('string**[text](url)**')
|
|
172
|
-
// <p>string**<a href="url">text</a>**</p>
|
|
173
|
-
```
|
|
242
|
+
`disallowMixed: true` is kept for back-compat: it forces compatible pairing for English/mixed contexts that contain markdown links, HTML tags, inline code, or math expressions while staying aggressive for Japanese-only text. Prefer `mode` for new setups; enable this only if you need the legacy compat-first behavior in mixed English.
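The four kinds of content named here (markdown links, HTML tags, inline code, math) correspond to the alternatives of the `REG_MARKDOWN_HTML` pattern in `index.js`. The sketch below only exercises that pattern on sample strings; `isMarkupOnly` is a hypothetical name, and how the plugin combines the result with its language checks is not shown in this diff.

```js
// Pattern copied from REG_MARKDOWN_HTML in index.js
// (commented there as "for mixed-language context detection").
const REG_MARKDOWN_HTML = /^\[[^\[\]]+\]\([^)]+\)$|^<([a-zA-Z][a-zA-Z0-9]*)[^>]*>([^<]+<\/\1>)$|^`[^`]+`$|^\$[^$]+\$$/

// Hypothetical helper: true when the content is exactly one markdown link,
// one simple HTML element, one inline code span, or one inline math expression.
const isMarkupOnly = (content) => REG_MARKDOWN_HTML.test(content)

for (const sample of ['[text](url)', '<b>HTML</b>', '`code`', '$x+y$', 'plain text']) {
  console.log(sample, isMarkupOnly(sample))
}
```

Only the last sample, `plain text`, fails to match; the other four each hit one alternative of the pattern.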
|
|
174
243
|
|
|
175
244
|
### coreRulesBeforePostprocess
|
|
176
245
|
|
|
@@ -187,3 +256,5 @@ const md = mdit()
|
|
|
187
256
|
- Default: `[]`
|
|
188
257
|
- Specify `['cjk_breaks']` (or other rule names) when you rely on plugins such as `@peaceroad/markdown-it-cjk-breaks-mod` and need them to run first.
|
|
189
258
|
- Pass an empty array if you do not want `mditStrongJa` to reorder any core rules.
|
|
259
|
+
|
|
260
|
+
Most setups can leave this option untouched; use it only when you must keep another plugin's core rule ahead of `strong_ja_postprocess`.
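To make the ordering idea concrete, here is a small self-contained sketch that works on a plain list of rule names. It does not touch markdown-it's ruler and is not the plugin's implementation: `moveRulesBefore` is a hypothetical helper, and the sample rule list is made up for illustration.

```js
// Hypothetical helper: returns a new rule-name list in which every rule named
// in `before` precedes `anchor`. Names that are missing, or already ahead of
// the anchor, are left where they are.
function moveRulesBefore(ruleNames, before, anchor = 'strong_ja_postprocess') {
  const result = ruleNames.slice()
  for (const name of before) {
    const from = result.indexOf(name)
    const to = result.indexOf(anchor)
    if (from === -1 || to === -1 || from < to) continue
    result.splice(from, 1)     // take the rule out of its current slot
    result.splice(to, 0, name) // re-insert it directly ahead of the anchor
  }
  return result
}

// Illustrative rule list (assumed names, except the two mentioned in the README).
const rules = ['inline', 'strong_ja_postprocess', 'cjk_breaks']
console.log(moveRulesBefore(rules, ['cjk_breaks']))
// → ['inline', 'cjk_breaks', 'strong_ja_postprocess']
console.log(moveRulesBefore(rules, []))
// → unchanged: an empty list reorders nothing
```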
|
package/index.js
CHANGED
|
@@ -22,10 +22,18 @@ const CHAR_CLOSE_CURLY = 0x7D // }
|
|
|
22
22
|
|
|
23
23
|
const REG_ATTRS = /{[^{}\n!@#%^&*()]+?}$/
|
|
24
24
|
const REG_ASCII_PUNCT = /[!-/:-@[-`{-~]/g
|
|
25
|
-
const REG_JAPANESE =
|
|
25
|
+
const REG_JAPANESE = /[\p{Script=Hiragana}\p{Script=Katakana}\p{Script=Han}\u3000-\u303F\uFF00-\uFFEF]/u // hiragana | katakana | kanji | CJK punctuation and fullwidth forms (emoji excluded)
|
|
26
26
|
|
|
27
27
|
const REG_MARKDOWN_HTML = /^\[[^\[\]]+\]\([^)]+\)$|^<([a-zA-Z][a-zA-Z0-9]*)[^>]*>([^<]+<\/\1>)$|^`[^`]+`$|^\$[^$]+\$$/ // for mixed-language context detection
|
|
28
28
|
|
|
29
|
+
const hasCjkBreaksRule = (md) => {
|
|
30
|
+
if (!md || !md.core || !md.core.ruler || !Array.isArray(md.core.ruler.__rules__)) return false
|
|
31
|
+
if (md.__strongJaHasCjkBreaks === true) return true
|
|
32
|
+
const found = md.core.ruler.__rules__.some((rule) => rule && typeof rule.name === 'string' && rule.name.indexOf('cjk_breaks') !== -1)
|
|
33
|
+
if (found) md.__strongJaHasCjkBreaks = true
|
|
34
|
+
return found
|
|
35
|
+
}
|
|
36
|
+
|
|
29
37
|
const hasBackslash = (state, start) => {
|
|
30
38
|
if (start <= 0) return false
|
|
31
39
|
if (state.__strongJaHasBackslash === false) return false
|
|
@@ -268,6 +276,15 @@ const findInlineLinkRange = (pos, ranges, kind) => {
|
|
|
268
276
|
const useCache = ranges.length > 32
|
|
269
277
|
const cache = useCache ? getInlineRangeCacheMap(ranges, kind, false) : null
|
|
270
278
|
if (cache && cache.has(pos)) return cache.get(pos)
|
|
279
|
+
const first = ranges[0]
|
|
280
|
+
const last = ranges[ranges.length - 1]
|
|
281
|
+
if (pos < first.start || pos > last.end) {
|
|
282
|
+
if (useCache) {
|
|
283
|
+
const storeCache = getInlineRangeCacheMap(ranges, kind, true)
|
|
284
|
+
storeCache.set(pos, null)
|
|
285
|
+
}
|
|
286
|
+
return null
|
|
287
|
+
}
|
|
271
288
|
let left = 0
|
|
272
289
|
let right = ranges.length - 1
|
|
273
290
|
let found = null
|
|
@@ -293,16 +310,7 @@ const findInlineLinkRange = (pos, ranges, kind) => {
|
|
|
293
310
|
}
|
|
294
311
|
|
|
295
312
|
const copyInlineTokenFields = (dest, src) => {
|
|
296
|
-
|
|
297
|
-
if (src.map) dest.map = src.map
|
|
298
|
-
dest.level = src.level
|
|
299
|
-
if (src.children) dest.children = src.children
|
|
300
|
-
dest.content = src.content
|
|
301
|
-
dest.markup = src.markup
|
|
302
|
-
if (src.info) dest.info = src.info
|
|
303
|
-
if (src.meta) dest.meta = src.meta
|
|
304
|
-
dest.block = src.block
|
|
305
|
-
dest.hidden = src.hidden
|
|
313
|
+
Object.assign(dest, src)
|
|
306
314
|
}
|
|
307
315
|
|
|
308
316
|
const registerPostProcessTarget = (state) => {
|
|
@@ -352,16 +360,51 @@ function isPlainTextContent(content) {
|
|
|
352
360
|
if (code === CHAR_BACKSLASH || code === CHAR_NEWLINE || code === CHAR_TAB) {
|
|
353
361
|
return false
|
|
354
362
|
}
|
|
355
|
-
if (code
|
|
363
|
+
if (code === CHAR_BACKTICK || code === CHAR_DOLLAR || code === CHAR_LT || code === CHAR_GT) {
|
|
364
|
+
return false
|
|
365
|
+
}
|
|
366
|
+
if (code === CHAR_OPEN_BRACKET || code === CHAR_CLOSE_BRACKET || code === CHAR_OPEN_PAREN || code === CHAR_CLOSE_PAREN) {
|
|
367
|
+
return false
|
|
368
|
+
}
|
|
369
|
+
if (code === 0x5E || code === 0x7E) {
|
|
370
|
+
return false
|
|
371
|
+
}
|
|
356
372
|
}
|
|
357
373
|
return true
|
|
358
374
|
}
|
|
359
375
|
|
|
376
|
+
// Cache newline positions for lightweight map generation
|
|
377
|
+
const getLineOffsets = (state) => {
|
|
378
|
+
if (state.__strongJaLineOffsets) return state.__strongJaLineOffsets
|
|
379
|
+
const offsets = []
|
|
380
|
+
const src = state.src || ''
|
|
381
|
+
for (let i = 0; i < src.length; i++) {
|
|
382
|
+
if (src.charCodeAt(i) === CHAR_NEWLINE) offsets.push(i)
|
|
383
|
+
}
|
|
384
|
+
state.__strongJaLineOffsets = offsets
|
|
385
|
+
return offsets
|
|
386
|
+
}
|
|
387
|
+
|
|
388
|
+
const createLineMapper = (state) => {
|
|
389
|
+
const offsets = getLineOffsets(state)
|
|
390
|
+
let idx = 0
|
|
391
|
+
const maxIdx = offsets.length
|
|
392
|
+
return (startPos, endPos) => {
|
|
393
|
+
const start = startPos === undefined || startPos === null ? 0 : startPos
|
|
394
|
+
const end = endPos === undefined || endPos === null ? start : endPos
|
|
395
|
+
while (idx < maxIdx && offsets[idx] < start) idx++
|
|
396
|
+
const startLine = idx
|
|
397
|
+
let endIdx = idx
|
|
398
|
+
while (endIdx < maxIdx && offsets[endIdx] < end) endIdx++
|
|
399
|
+
return [startLine, endIdx]
|
|
400
|
+
}
|
|
401
|
+
}
|
|
402
|
+
|
|
360
403
|
const setToken = (state, inlines, opt, attrsEnabled) => {
|
|
361
404
|
const src = state.src
|
|
405
|
+
const mapFromPos = createLineMapper(state)
|
|
362
406
|
let i = 0
|
|
363
|
-
let
|
|
364
|
-
let attrsIsTextTag = ''
|
|
407
|
+
let lastTextToken = null
|
|
365
408
|
while (i < inlines.length) {
|
|
366
409
|
let type = inlines[i].type
|
|
367
410
|
let tag = ''
|
|
@@ -378,12 +421,24 @@ const setToken = (state, inlines, opt, attrsEnabled) => {
|
|
|
378
421
|
if (isOpen) {
|
|
379
422
|
const startToken = state.push(type, tag, 1)
|
|
380
423
|
startToken.markup = tag === 'strong' ? '**' : '*'
|
|
381
|
-
|
|
382
|
-
attrsIsTextTag = tag
|
|
424
|
+
startToken.map = mapFromPos(inlines[i].s, inlines[i].e)
|
|
383
425
|
}
|
|
384
426
|
|
|
385
427
|
if (type === 'html_inline') {
|
|
386
|
-
|
|
428
|
+
const content = src.slice(inlines[i].s, inlines[i].e + 1)
|
|
429
|
+
if (lastTextToken && inlines[i].s > 0) {
|
|
430
|
+
const prevChar = src.charAt(inlines[i].s - 1)
|
|
431
|
+
if (prevChar === ' ' || prevChar === '\t') {
|
|
432
|
+
if (!lastTextToken.content.endsWith(prevChar)) {
|
|
433
|
+
lastTextToken.content += prevChar
|
|
434
|
+
}
|
|
435
|
+
}
|
|
436
|
+
}
|
|
437
|
+
const htmlToken = state.push('html_inline', '', 0)
|
|
438
|
+
htmlToken.content = content
|
|
439
|
+
htmlToken.map = mapFromPos(inlines[i].s, inlines[i].e)
|
|
440
|
+
i++
|
|
441
|
+
continue
|
|
387
442
|
}
|
|
388
443
|
if (type === 'text') {
|
|
389
444
|
let content = src.slice(inlines[i].s, inlines[i].e + 1)
|
|
@@ -391,22 +446,30 @@ const setToken = (state, inlines, opt, attrsEnabled) => {
|
|
|
391
446
|
if (isAllAsterisks(content)) {
|
|
392
447
|
const asteriskToken = state.push(type, '', 0)
|
|
393
448
|
asteriskToken.content = content
|
|
449
|
+
asteriskToken.map = mapFromPos(inlines[i].s, inlines[i].e)
|
|
394
450
|
i++
|
|
395
451
|
continue
|
|
396
452
|
}
|
|
397
453
|
}
|
|
398
|
-
|
|
399
|
-
|
|
400
|
-
|
|
401
|
-
|
|
402
|
-
|
|
403
|
-
|
|
404
|
-
|
|
405
|
-
|
|
406
|
-
|
|
454
|
+
const attrMatch = attrsEnabled && content.length > 0 && content.charCodeAt(content.length - 1) === CHAR_CLOSE_CURLY && REG_ATTRS.test(content)
|
|
455
|
+
? content.match(/^(.*?)(\s+{[^{}\n!@#%^&*()]+?})$/)
|
|
456
|
+
: null
|
|
457
|
+
if (attrMatch) {
|
|
458
|
+
const textPart = attrMatch[1] ? attrMatch[1].replace(/[ \t]+$/, '') : ''
|
|
459
|
+
const attrPart = attrMatch[2]
|
|
460
|
+
if (textPart && textPart.length > 0) {
|
|
461
|
+
const textToken = state.push(type, '', 0)
|
|
462
|
+
textToken.content = textPart
|
|
463
|
+
textToken.map = mapFromPos(inlines[i].s, inlines[i].s + textPart.length)
|
|
464
|
+
lastTextToken = textToken
|
|
465
|
+
}
|
|
466
|
+
const attrsToken = state.push(type, '', 0)
|
|
467
|
+
let attrsContent = attrPart.replace(/^\s+/, '')
|
|
468
|
+
if (attrsContent.indexOf('\\') !== -1) {
|
|
469
|
+
const hasBackslashBeforeCurlyAttribute = attrsContent.match(/(\\+){/)
|
|
407
470
|
if (hasBackslashBeforeCurlyAttribute) {
|
|
408
471
|
if (hasBackslashBeforeCurlyAttribute[1].length === 1) {
|
|
409
|
-
|
|
472
|
+
attrsContent = attrsContent.replace(/\\{/, '{')
|
|
410
473
|
} else {
|
|
411
474
|
let backSlashNum = Math.floor(hasBackslashBeforeCurlyAttribute[1].length / 2)
|
|
412
475
|
let k = 0
|
|
@@ -415,50 +478,84 @@ const setToken = (state, inlines, opt, attrsEnabled) => {
|
|
|
415
478
|
backSlash += '\\'
|
|
416
479
|
k++
|
|
417
480
|
}
|
|
418
|
-
|
|
481
|
+
attrsContent = attrsContent.replace(/\\+{/, backSlash + '{')
|
|
419
482
|
}
|
|
420
|
-
} else {
|
|
421
|
-
attrsToken.content = content
|
|
422
483
|
}
|
|
423
|
-
attrsIsText = false
|
|
424
|
-
attrsIsTextTag = ''
|
|
425
|
-
i++
|
|
426
|
-
continue
|
|
427
484
|
}
|
|
485
|
+
attrsToken.content = attrsContent
|
|
486
|
+
attrsToken.map = mapFromPos(inlines[i].s + content.length - attrPart.length, inlines[i].e)
|
|
487
|
+
i++
|
|
488
|
+
continue
|
|
428
489
|
}
|
|
429
490
|
if (isPlainTextContent(content)) {
|
|
430
491
|
const textToken = state.push(type, '', 0)
|
|
431
492
|
textToken.content = content
|
|
493
|
+
textToken.map = mapFromPos(inlines[i].s, inlines[i].e)
|
|
494
|
+
lastTextToken = textToken
|
|
432
495
|
i++
|
|
433
496
|
continue
|
|
434
497
|
}
|
|
435
498
|
|
|
436
|
-
const
|
|
437
|
-
|
|
438
|
-
|
|
439
|
-
|
|
440
|
-
|
|
441
|
-
|
|
442
|
-
|
|
443
|
-
|
|
444
|
-
|
|
445
|
-
|
|
446
|
-
|
|
447
|
-
|
|
448
|
-
|
|
499
|
+
const hasOnlySimpleNewline = attrsEnabled && (content.indexOf('{') !== -1 || content.indexOf('}') !== -1) &&
|
|
500
|
+
content.indexOf('\n') !== -1 &&
|
|
501
|
+
content.indexOf('`') === -1 &&
|
|
502
|
+
content.indexOf('$') === -1 &&
|
|
503
|
+
content.indexOf('<') === -1 &&
|
|
504
|
+
content.indexOf('>') === -1 &&
|
|
505
|
+
content.indexOf('[') === -1 &&
|
|
506
|
+
content.indexOf(']') === -1 &&
|
|
507
|
+
content.indexOf('(') === -1 &&
|
|
508
|
+
content.indexOf(')') === -1 &&
|
|
509
|
+
content.indexOf('^') === -1 &&
|
|
510
|
+
content.indexOf('~') === -1 &&
|
|
511
|
+
content.indexOf('\\') === -1
|
|
512
|
+
|
|
513
|
+
if (hasOnlySimpleNewline) {
|
|
514
|
+
const textToken = state.push(type, '', 0)
|
|
515
|
+
textToken.content = content
|
|
516
|
+
textToken.map = mapFromPos(inlines[i].s, inlines[i].e)
|
|
517
|
+
lastTextToken = textToken
|
|
518
|
+
i++
|
|
519
|
+
continue
|
|
520
|
+
}
|
|
521
|
+
|
|
522
|
+
const childTokens = []
|
|
523
|
+
state.md.inline.parse(content, state.md, state.env, childTokens)
|
|
524
|
+
let j = 0
|
|
525
|
+
while (j < childTokens.length) {
|
|
526
|
+
const t = childTokens[j]
|
|
527
|
+
if (t.type === 'softbreak' && !opt.mdBreaks) {
|
|
528
|
+
const hasCjk = opt.hasCjkBreaks === true
|
|
529
|
+
if (hasCjk) {
|
|
530
|
+
const prevToken = childTokens[j - 1]
|
|
531
|
+
const nextToken = childTokens[j + 1]
|
|
532
|
+
const prevChar = prevToken && prevToken.content ? prevToken.content.slice(-1) : ''
|
|
533
|
+
const nextChar = nextToken && nextToken.content ? nextToken.content.charAt(0) : ''
|
|
534
|
+
const isAsciiWord = nextChar >= '0' && nextChar <= 'z' && /[A-Za-z0-9]/.test(nextChar)
|
|
535
|
+
if (isAsciiWord && isJapanese(prevChar) && !isJapanese(nextChar)) {
|
|
536
|
+
t.type = 'text'
|
|
537
|
+
t.tag = ''
|
|
538
|
+
t.content = ' '
|
|
539
|
+
}
|
|
449
540
|
}
|
|
450
|
-
const token = state.push(t.type, t.tag, t.nesting)
|
|
451
|
-
copyInlineTokenFields(token, t)
|
|
452
|
-
j++
|
|
453
541
|
}
|
|
542
|
+
if (!attrsEnabled && t.tag === 'br') {
|
|
543
|
+
t.tag = ''
|
|
544
|
+
t.content = '\n'
|
|
545
|
+
}
|
|
546
|
+
const token = state.push(t.type, t.tag, t.nesting)
|
|
547
|
+
copyInlineTokenFields(token, t)
|
|
548
|
+
if (t.type === 'text') {
|
|
549
|
+
lastTextToken = token
|
|
550
|
+
}
|
|
551
|
+
j++
|
|
454
552
|
}
|
|
455
553
|
}
|
|
456
554
|
|
|
457
555
|
if (isClose) {
|
|
458
556
|
const closeToken = state.push(type, tag, -1)
|
|
459
557
|
closeToken.markup = tag === 'strong' ? '**' : '*'
|
|
460
|
-
|
|
461
|
-
attrsIsTextTag = ''
|
|
558
|
+
closeToken.map = mapFromPos(inlines[i].s, inlines[i].e)
|
|
462
559
|
}
|
|
463
560
|
|
|
464
561
|
i++
|
|
@@ -473,17 +570,12 @@ const pushInlines = (inlines, s, e, len, type, tag, tagType) => {
|
|
|
473
570
|
ep: e,
|
|
474
571
|
len: len,
|
|
475
572
|
type: type,
|
|
476
|
-
check:
|
|
573
|
+
check: false
|
|
477
574
|
}
|
|
478
575
|
if (tag) inline.tag = [tag, tagType]
|
|
479
576
|
inlines.push(inline)
|
|
480
577
|
}
|
|
481
578
|
|
|
482
|
-
const isAsciiPunctuationCode = (code) => {
|
|
483
|
-
return (code >= 33 && code <= 47) || (code >= 58 && code <= 64) ||
|
|
484
|
-
(code >= 91 && code <= 96) || (code >= 123 && code <= 126)
|
|
485
|
-
}
|
|
486
|
-
|
|
487
579
|
const findNextAsciiPunctuation = (src, start, max) => {
|
|
488
580
|
REG_ASCII_PUNCT.lastIndex = start
|
|
489
581
|
const match = REG_ASCII_PUNCT.exec(src)
|
|
@@ -615,11 +707,13 @@ const createInlines = (state, start, max, opt) => {
|
|
|
615
707
|
// HTML tags
|
|
616
708
|
if (htmlEnabled && currentChar === CHAR_LT) {
|
|
617
709
|
if (!isEscaped) {
|
|
710
|
+
const guardHtml = srcLen - n > 8192
|
|
711
|
+
const maxScanEnd = guardHtml ? Math.min(srcLen, n + 8192) : srcLen
|
|
618
712
|
let foundClosingTag = false
|
|
619
713
|
let i = n + 1
|
|
620
714
|
while (i < srcLen) {
|
|
621
715
|
i = src.indexOf('>', i)
|
|
622
|
-
if (i === -1 || i >=
|
|
716
|
+
if (i === -1 || i >= maxScanEnd) break
|
|
623
717
|
if (!hasBackslash(state, i)) {
|
|
624
718
|
hasText = processTextSegment(inlines, textStart, n, hasText)
|
|
625
719
|
let tag = src.slice(n + 1, i)
|
|
@@ -690,6 +784,8 @@ const setStrong = (state, inlines, marks, n, memo, opt, nestTracker, refRanges,
|
|
|
690
784
|
const hasInlineLinkRanges = inlineLinkRanges && inlineLinkRanges.length > 0
|
|
691
785
|
const hasRefRanges = refRanges && refRanges.length > 0
|
|
692
786
|
const inlinesLength = inlines.length
|
|
787
|
+
const leadingCompat = opt.leadingAsterisk === false
|
|
788
|
+
const conservativePunctuation = opt.disallowMixed === true
|
|
693
789
|
if (opt.disallowMixed === true) {
|
|
694
790
|
let i = n + 1
|
|
695
791
|
while (i < inlinesLength) {
|
|
@@ -732,6 +828,10 @@ const setStrong = (state, inlines, marks, n, memo, opt, nestTracker, refRanges,
|
|
|
732
828
|
}
|
|
733
829
|
}
|
|
734
830
|
|
|
831
|
+
if (state.md && state.md.options && state.md.options.html && hasCodeTagInside(state, inlines, n, i)) {
|
|
832
|
+
return [n, nest]
|
|
833
|
+
}
|
|
834
|
+
|
|
735
835
|
nest = checkNest(inlines, marks, n, i, nestTracker)
|
|
736
836
|
if (nest === -1) return [n, nest]
|
|
737
837
|
|
|
@@ -763,7 +863,12 @@ const setStrong = (state, inlines, marks, n, memo, opt, nestTracker, refRanges,
|
|
|
763
863
|
let strongNum = Math.trunc(Math.min(inlines[n].len, inlines[i].len) / 2)
|
|
764
864
|
|
|
765
865
|
if (inlines[i].len > 1) {
|
|
766
|
-
|
|
866
|
+
const hasJapaneseContext = isJapanese(state.src[inlines[n].s - 1] || '') || isJapanese(state.src[inlines[i].e + 1] || '')
|
|
867
|
+
const needsPunctuationCheck = (conservativePunctuation && !hasJapaneseContext) || hasHtmlLikePunctuation(state, inlines, n, i) || hasAngleBracketInside(state, inlines, n, i)
|
|
868
|
+
if (needsPunctuationCheck && hasPunctuationOrNonJapanese(state, inlines, n, i, opt, refRanges, hasRefRanges)) {
|
|
869
|
+
if (leadingCompat) {
|
|
870
|
+
return [n, nest]
|
|
871
|
+
}
|
|
767
872
|
if (memo.inlineMarkEnd) {
|
|
768
873
|
marks.push(...createMarks(state, inlines, i, inlinesLength - 1, memo, opt, refRanges, inlineLinkRanges))
|
|
769
874
|
if (inlines[i].len === 0) { i++; continue }
|
|
@@ -846,7 +951,17 @@ const isPunctuation = (ch) => {
|
|
|
846
951
|
(code >= 91 && code <= 96) || (code >= 123 && code <= 126) || code === 32
|
|
847
952
|
}
|
|
848
953
|
|
|
849
|
-
|
|
954
|
+
const isAsciiPunctuationCode = (code) => {
|
|
955
|
+
if (code < 33 || code > 126) return false
|
|
956
|
+
return (code <= 47) || (code >= 58 && code <= 64) || (code >= 91 && code <= 96) || (code >= 123)
|
|
957
|
+
}
|
|
958
|
+
|
|
959
|
+
const isUnicodePunctuation = (ch) => {
|
|
960
|
+
if (!ch) return false
|
|
961
|
+
return /\p{P}/u.test(ch)
|
|
962
|
+
}
|
|
963
|
+
|
|
964
|
+
// Check if character is Japanese (hiragana, katakana, kanji, CJK punctuation/fullwidth)
|
|
850
965
|
// Uses fast Unicode range checks for common cases, falls back to REG_JAPANESE for complex Unicode
|
|
851
966
|
const isJapanese = (ch) => {
|
|
852
967
|
if (!ch) return false
|
|
@@ -861,6 +976,26 @@ const isJapanese = (ch) => {
|
|
|
861
976
|
REG_JAPANESE.test(ch)
|
|
862
977
|
}
|
|
863
978
|
|
|
979
|
+
const hasJapaneseText = (str) => {
|
|
980
|
+
if (!str) return false
|
|
981
|
+
return REG_JAPANESE.test(str)
|
|
982
|
+
}
|
|
983
|
+
|
|
984
|
+
const resolveLeadingAsterisk = (state, opt, start, max) => {
|
|
985
|
+
const modeRaw = opt.mode || 'japanese-only'
|
|
986
|
+
const mode = typeof modeRaw === 'string' ? modeRaw.toLowerCase() : 'japanese-only'
|
|
987
|
+
if (mode === 'aggressive') return true
|
|
988
|
+
if (mode === 'compatible') return false
|
|
989
|
+
let hasJapanese = state.__strongJaHasJapanese
|
|
990
|
+
if (hasJapanese === undefined) {
|
|
991
|
+
hasJapanese = hasJapaneseText(state.src.slice(0, max))
|
|
992
|
+
state.__strongJaHasJapanese = hasJapanese
|
|
993
|
+
}
|
|
994
|
+
if (opt.disallowMixed === true) return hasJapanese
|
|
995
|
+
|
|
996
|
+
return hasJapanese
|
|
997
|
+
}
|
|
998
|
+
|
|
864
999
|
// Check if character is English (letters, numbers) or other non-Japanese characters
|
|
865
1000
|
// Uses REG_JAPANESE to exclude Japanese characters
|
|
866
1001
|
const isEnglish = (ch) => {
|
|
@@ -893,6 +1028,9 @@ const hasPunctuationOrNonJapanese = (state, inlines, n, i, opt, refRanges, hasRe
|
|
|
893
1028
|
const openPrevChar = src[inlines[n].s - 1] || ''
|
|
894
1029
|
const openNextChar = src[inlines[n].e + 1] || ''
|
|
895
1030
|
let checkOpenNextChar = isPunctuation(openNextChar)
|
|
1031
|
+
if (!checkOpenNextChar && opt.leadingAsterisk === false && isUnicodePunctuation(openNextChar)) {
|
|
1032
|
+
checkOpenNextChar = true
|
|
1033
|
+
}
|
|
896
1034
|
if (hasRefRanges && checkOpenNextChar && (openNextChar === '[' || openNextChar === ']')) {
|
|
897
1035
|
const openNextRange = findRefRangeIndex(inlines[n].e + 1, refRanges)
|
|
898
1036
|
if (openNextRange !== -1) {
|
|
@@ -901,6 +1039,9 @@ const hasPunctuationOrNonJapanese = (state, inlines, n, i, opt, refRanges, hasRe
|
|
|
901
1039
|
}
|
|
902
1040
|
const closePrevChar = src[inlines[i].s - 1] || ''
|
|
903
1041
|
let checkClosePrevChar = isPunctuation(closePrevChar)
|
|
1042
|
+
if (!checkClosePrevChar && opt.leadingAsterisk === false && isUnicodePunctuation(closePrevChar)) {
|
|
1043
|
+
checkClosePrevChar = true
|
|
1044
|
+
}
|
|
904
1045
|
if (hasRefRanges && checkClosePrevChar && (closePrevChar === '[' || closePrevChar === ']')) {
|
|
905
1046
|
const closePrevRange = findRefRangeIndex(inlines[i].s - 1, refRanges)
|
|
906
1047
|
if (closePrevRange !== -1) {
|
|
@@ -909,7 +1050,10 @@ const hasPunctuationOrNonJapanese = (state, inlines, n, i, opt, refRanges, hasRe
|
|
|
909
1050
|
}
|
|
910
1051
|
const closeNextChar = src[inlines[i].e + 1] || ''
|
|
911
1052
|
const isLastInline = i === inlines.length - 1
|
|
912
|
-
|
|
1053
|
+
let checkCloseNextChar = isLastInline || isPunctuation(closeNextChar) || closeNextChar === '\n'
|
|
1054
|
+
if (!checkCloseNextChar && opt.leadingAsterisk === false && isUnicodePunctuation(closeNextChar)) {
|
|
1055
|
+
checkCloseNextChar = true
|
|
1056
|
+
}
|
|
913
1057
|
|
|
914
1058
|
if (opt.disallowMixed === false) {
|
|
915
1059
|
if (isEnglish(openPrevChar) || isEnglish(closeNextChar)) {
|
|
@@ -923,12 +1067,52 @@ const hasPunctuationOrNonJapanese = (state, inlines, n, i, opt, refRanges, hasRe
|
|
|
923
1067
|
return result
|
|
924
1068
|
}
|
|
925
1069
|
|
|
1070
|
+
const hasHtmlLikePunctuation = (state, inlines, n, i) => {
|
|
1071
|
+
const src = state.src
|
|
1072
|
+
const chars = [
|
|
1073
|
+
src[inlines[n].e + 1] || '',
|
|
1074
|
+
src[inlines[i].s - 1] || '',
|
|
1075
|
+
src[inlines[i].e + 1] || ''
|
|
1076
|
+
]
|
|
1077
|
+
for (let idx = 0; idx < chars.length; idx++) {
|
|
1078
|
+
const ch = chars[idx]
|
|
1079
|
+
if (ch === '<' || ch === '>') return true
|
|
1080
|
+
}
|
|
1081
|
+
return false
|
|
1082
|
+
}
|
|
1083
|
+
|
|
1084
|
+
const hasAngleBracketInside = (state, inlines, n, i) => {
|
|
1085
|
+
const src = state.src
|
|
1086
|
+
const start = inlines[n].s
|
|
1087
|
+
const end = inlines[i].e
|
|
1088
|
+
const ltPos = src.indexOf('<', start)
|
|
1089
|
+
if (ltPos !== -1 && ltPos <= end) return true
|
|
1090
|
+
const gtPos = src.indexOf('>', start)
|
|
1091
|
+
return gtPos !== -1 && gtPos <= end
|
|
1092
|
+
}
|
|
1093
|
+
|
|
1094
|
+
const hasCodeTagInside = (state, inlines, n, i) => {
|
|
1095
|
+
const src = state.src
|
|
1096
|
+
const start = inlines[n].s
|
|
1097
|
+
const end = inlines[i].e
|
|
1098
|
+
const codeOpen = src.indexOf('<code', start)
|
|
1099
|
+
if (codeOpen !== -1 && codeOpen <= end) return true
|
|
1100
|
+
const codeClose = src.indexOf('</code', start)
|
|
1101
|
+
if (codeClose !== -1 && codeClose <= end) return true
|
|
1102
|
+
const preOpen = src.indexOf('<pre', start)
|
|
1103
|
+
if (preOpen !== -1 && preOpen <= end) return true
|
|
1104
|
+
const preClose = src.indexOf('</pre', start)
|
|
1105
|
+
return preClose !== -1 && preClose <= end
|
|
1106
|
+
}
|
|
1107
|
+
|
|
926
1108
|
const setEm = (state, inlines, marks, n, memo, opt, sNest, nestTracker, refRanges, inlineLinkRanges) => {
|
|
927
1109
|
const hasInlineLinkRanges = inlineLinkRanges && inlineLinkRanges.length > 0
|
|
928
1110
|
const hasRefRanges = refRanges && refRanges.length > 0
|
|
929
1111
|
const inlinesLength = inlines.length
|
|
930
1112
|
const emOpenRange = hasRefRanges ? findRefRangeIndex(inlines[n].s, refRanges) : -1
|
|
931
1113
|
const openLinkRange = hasInlineLinkRanges ? findInlineLinkRange(inlines[n].s, inlineLinkRanges) : null
|
|
1114
|
+
const leadingCompat = opt.leadingAsterisk === false
|
|
1115
|
+
const conservativePunctuation = leadingCompat || opt.disallowMixed === true
|
|
932
1116
|
if (opt.disallowMixed === true && !sNest) {
|
|
933
1117
|
let i = n + 1
|
|
934
1118
|
while (i < inlinesLength) {
|
|
@@ -979,6 +1163,10 @@ const setEm = (state, inlines, marks, n, memo, opt, sNest, nestTracker, refRange
|
|
|
979
1163
|
}
|
|
980
1164
|
}
|
|
981
1165
|
|
|
1166
|
+
if (state.md && state.md.options && state.md.options.html && hasCodeTagInside(state, inlines, n, i)) {
|
|
1167
|
+
return [n, nest]
|
|
1168
|
+
}
|
|
1169
|
+
|
|
982
1170
|
const emNum = Math.min(inlines[n].len, inlines[i].len)
|
|
983
1171
|
|
|
984
1172
|
if (!sNest && emNum !== 1) return [n, sNest, memo]
|
|
@@ -1000,7 +1188,11 @@ const setEm = (state, inlines, marks, n, memo, opt, sNest, nestTracker, refRange
|
|
|
1000
1188
|
if (nest === -1) return [n, nest]
|
|
1001
1189
|
|
|
1002
1190
|
if (emNum === 1) {
|
|
1003
|
-
|
|
1191
|
+
const needsPunctuationCheckClose = conservativePunctuation || hasHtmlLikePunctuation(state, inlines, n, i) || hasAngleBracketInside(state, inlines, n, i)
|
|
1192
|
+
if (needsPunctuationCheckClose && hasPunctuationOrNonJapanese(state, inlines, n, i, opt, refRanges, hasRefRanges)) {
|
|
1193
|
+
if (leadingCompat) {
|
|
1194
|
+
return [n, nest]
|
|
1195
|
+
}
|
|
1004
1196
|
if (memo.inlineMarkEnd) {
|
|
1005
1197
|
marks.push(...createMarks(state, inlines, i, inlinesLength - 1, memo, opt, refRanges, inlineLinkRanges))
|
|
1006
1198
|
|
|
@@ -1193,6 +1385,16 @@ const strongJa = (state, silent, opt) => {
 
   const attrsEnabled = opt.mditAttrs && hasMditAttrs(state)
 
+  const leadingAsterisk = resolveLeadingAsterisk(state, opt, start, originalMax)
+
+  if (leadingAsterisk === false) {
+    return false
+  }
+
+  const runtimeOpt = leadingAsterisk === opt.leadingAsterisk
+    ? opt
+    : { ...opt, leadingAsterisk }
+
   if (start === 0) {
     state.__strongJaRefRangeCache = null
     state.__strongJaInlineLinkRangeCache = null
@@ -1248,35 +1450,40 @@ const strongJa = (state, silent, opt) => {
 
   let refRanges = []
   const hasReferenceDefinitions = state.__strongJaReferenceCount > 0
+  const refScanStart = 0
   if (hasReferenceDefinitions) {
-    const
-    if (
-
-
-
-
+    const firstRefBracket = state.src.indexOf('[', refScanStart)
+    if (firstRefBracket !== -1 && firstRefBracket < max) {
+      const refCache = state.__strongJaRefRangeCache
+      if (refCache && refCache.max === max && refCache.start === refScanStart) {
+        refRanges = refCache.ranges
+      } else {
+        refRanges = computeReferenceRanges(state, refScanStart, max)
+        state.__strongJaRefRangeCache = { start: refScanStart, max, ranges: refRanges }
+      }
+      if (refRanges.length > 0) {
+        state.__strongJaHasCollapsedRefs = true
+      }
     }
   }
-  if (refRanges.length > 0) {
-    state.__strongJaHasCollapsedRefs = true
-  }
 
   let inlineLinkRanges = null
-  const
+  const inlineLinkScanStart = 0
+  const inlineLinkCandidatePos = state.src.indexOf('](', inlineLinkScanStart)
   const hasInlineLinkCandidate = inlineLinkCandidatePos !== -1 && inlineLinkCandidatePos < max
   if (hasInlineLinkCandidate) {
     const inlineCache = state.__strongJaInlineLinkRangeCache
-    if (inlineCache && inlineCache.max === max && inlineCache.start
+    if (inlineCache && inlineCache.max === max && inlineCache.start === inlineLinkScanStart) {
       inlineLinkRanges = inlineCache.ranges
     } else {
-      inlineLinkRanges = computeInlineLinkRanges(state,
-      state.__strongJaInlineLinkRangeCache = { start, max, ranges: inlineLinkRanges }
+      inlineLinkRanges = computeInlineLinkRanges(state, inlineLinkScanStart, max)
+      state.__strongJaInlineLinkRangeCache = { start: inlineLinkScanStart, max, ranges: inlineLinkRanges }
     }
     if (inlineLinkRanges.length > 0) {
       state.__strongJaHasInlineLinks = true
     }
   }
-  let inlines = createInlines(state, start, max,
+  let inlines = createInlines(state, start, max, runtimeOpt)
 
   const memo = {
     html: state.md.options.html,
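The hunk above pins both scan starts to `0`, so the range caches are keyed on a constant start plus `max`. A minimal sketch of that lookup follows, assuming a hypothetical `computeRanges` helper that only records `](` positions (the plugin's real `computeInlineLinkRanges` is not shown in this diff) and a plain `state.cache` field in place of `state.__strongJaInlineLinkRangeCache`:

```js
// Hypothetical stand-in for computeInlineLinkRanges: records the positions of `](`.
let computeCalls = 0
const computeRanges = (src, scanStart, max) => {
  computeCalls++
  const ranges = []
  let pos = src.indexOf('](', scanStart)
  while (pos !== -1 && pos < max) {
    ranges.push(pos)
    pos = src.indexOf('](', pos + 1)
  }
  return ranges
}

// Cache lookup keyed on (start, max); start is fixed at 0 as in the hunk above.
const getRanges = (state, max) => {
  const scanStart = 0
  const cache = state.cache
  if (cache && cache.max === max && cache.start === scanStart) return cache.ranges
  const ranges = computeRanges(state.src, scanStart, max)
  state.cache = { start: scanStart, max, ranges }
  return ranges
}

const state = { src: '[a](x) *b* [c](y)', cache: null }
const first = getRanges(state, state.src.length)
const second = getRanges(state, state.src.length) // served from the cache
console.log(first, second === first, computeCalls)
```

With a constant start, a second call for the same `max` reuses the stored ranges regardless of where the inline rule was invoked. In the sketch, `computeCalls` stays at 1 after two lookups.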
@@ -1286,11 +1493,11 @@ const strongJa = (state, silent, opt) => {
     inlineMarkEnd: src.charCodeAt(max - 1) === CHAR_ASTERISK,
   }
 
-  let marks = createMarks(state, inlines, 0, inlines.length, memo,
+  let marks = createMarks(state, inlines, 0, inlines.length, memo, runtimeOpt, refRanges, inlineLinkRanges)
 
   inlines = mergeInlinesAndMarks(inlines, marks)
 
-  setToken(state, inlines,
+  setToken(state, inlines, runtimeOpt, attrsEnabled)
 
   if (inlineLinkRanges && inlineLinkRanges.length > 0) {
     const labelSources = []
@@ -1300,8 +1507,15 @@ const strongJa = (state, silent, opt) => {
       labelSources.push(src.slice(range.start + 1, range.end))
     }
     if (labelSources.length > 0) {
+      restoreLabelWhitespace(state.tokens, labelSources)
       state.tokens.__strongJaInlineLabelSources = labelSources
       state.tokens.__strongJaInlineLabelIndex = 0
+      if (state.env) {
+        if (!state.env.__strongJaInlineLabelSourceList) {
+          state.env.__strongJaInlineLabelSourceList = []
+        }
+        state.env.__strongJaInlineLabelSourceList.push(labelSources)
+      }
     }
   }
 
@@ -1375,13 +1589,10 @@ const adjustTokenLevels = (tokens, startIdx, endIdx, delta) => {
 
 const cloneTextToken = (source, content) => {
   const newToken = new Token('text', '', 0)
+  Object.assign(newToken, source)
   newToken.content = content
-  newToken.
-  newToken.
-  newToken.info = source.info
-  newToken.meta = source.meta ? {...source.meta} : null
-  newToken.block = source.block
-  newToken.hidden = source.hidden
+  if (source.meta) newToken.meta = { ...source.meta }
+  if (source.map) newToken.map = source.map
   return newToken
 }
 
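The `cloneTextToken` change above replaces field-by-field copying with `Object.assign`. The following is a minimal sketch of that pattern, assuming a hypothetical `FakeToken` class in place of markdown-it's `Token` (only the fields used here are modelled). It illustrates that `meta` ends up shallow-copied while `map` stays shared by reference:

```js
// Hypothetical stand-in for markdown-it's Token class (illustration only).
class FakeToken {
  constructor(type, tag, nesting) {
    this.type = type
    this.tag = tag
    this.nesting = nesting
    this.content = ''
    this.level = 0
    this.meta = null
    this.map = null
  }
}

// Same shape as the new cloneTextToken: copy everything, then override selected fields.
const cloneText = (source, content) => {
  const copy = new FakeToken('text', '', 0)
  Object.assign(copy, source)
  copy.content = content
  if (source.meta) copy.meta = { ...source.meta } // detach meta from the source
  if (source.map) copy.map = source.map // map is still the same array
  return copy
}

const original = new FakeToken('text', '', 0)
original.level = 2
original.meta = { note: 'x' }
original.map = [3, 4]

const cloned = cloneText(original, 'new text')
cloned.meta.note = 'changed'
console.log(original.meta.note, cloned.level, cloned.map === original.map)
```

Because `Object.assign` copies every own enumerable property, the clone in this sketch also inherits the source's `type`, so passing a non-text source would yield a non-text clone.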
@@ -1673,9 +1884,52 @@ const removeGhostLabelText = (tokens, linkCloseIndex, labelText) => {
   }
 }
 
+const restoreLabelWhitespace = (tokens, labelSources) => {
+  if (!tokens || !labelSources || labelSources.length === 0) return
+  let labelIdx = 0
+  for (let i = 0; i < tokens.length && labelIdx < labelSources.length; i++) {
+    if (tokens[i].type !== 'link_open') continue
+    const closeIdx = findLinkCloseIndex(tokens, i)
+    if (closeIdx === -1) continue
+    const labelSource = labelSources[labelIdx] || ''
+    if (!labelSource) {
+      labelIdx++
+      continue
+    }
+    let cursor = 0
+    for (let pos = i + 1; pos < closeIdx; pos++) {
+      const t = tokens[pos]
+      const markup = t.markup || ''
+      const text = t.content || ''
+      const startPos = cursor
+      if (t.type === 'text') {
+        cursor += text.length
+      } else if (t.type === 'code_inline') {
+        cursor += markup.length + text.length + markup.length
+      } else if (markup) {
+        cursor += markup.length
+      }
+      if ((t.type === 'strong_open' || t.type === 'em_open') && startPos > 0) {
+        const prevToken = tokens[pos - 1]
+        if (prevToken && prevToken.type === 'text' && prevToken.content && !prevToken.content.endsWith(' ')) {
+          const hasSpaceBefore = startPos - 1 >= 0 && startPos - 1 < labelSource.length && labelSource[startPos - 1] === ' '
+          const hasSpaceAt = startPos >= 0 && startPos < labelSource.length && labelSource[startPos] === ' '
+          if (hasSpaceBefore || hasSpaceAt) {
+            prevToken.content += ' '
+          }
+        }
+      }
+    }
+    labelIdx++
+  }
+}
+
 const convertInlineLinks = (tokens, state) => {
   if (!tokens || tokens.length === 0) return
-
+  let labelSources = tokens.__strongJaInlineLabelSources
+  if ((!labelSources || labelSources.length === 0) && state && state.env && Array.isArray(state.env.__strongJaInlineLabelSourceList) && state.env.__strongJaInlineLabelSourceList.length > 0) {
+    labelSources = state.env.__strongJaInlineLabelSourceList.shift()
+  }
   let labelSourceIndex = tokens.__strongJaInlineLabelIndex || 0
   let i = 0
   while (i < tokens.length) {
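The cursor walk in `restoreLabelWhitespace` above can be tried in isolation. The sketch below is an assumption-laden reduction: `restoreSpaces` is a hypothetical name, it takes only the tokens between `link_open` and `link_close`, and the tokens are plain objects rather than markdown-it `Token` instances. It walks the label source alongside the tokens and re-adds a space that was lost in front of an opening `*` or `**`:

```js
// Hypothetical reduction of the inner loop of restoreLabelWhitespace.
// `children` are the tokens inside one link label; `labelSource` is the raw label text.
function restoreSpaces(children, labelSource) {
  let cursor = 0 // offset into labelSource covered so far
  for (let pos = 0; pos < children.length; pos++) {
    const t = children[pos]
    const markup = t.markup || ''
    const text = t.content || ''
    const startPos = cursor
    if (t.type === 'text') {
      cursor += text.length
    } else if (t.type === 'code_inline') {
      cursor += markup.length * 2 + text.length
    } else if (markup) {
      cursor += markup.length
    }
    if ((t.type === 'strong_open' || t.type === 'em_open') && startPos > 0) {
      const prev = children[pos - 1]
      if (prev && prev.type === 'text' && prev.content && !prev.content.endsWith(' ')) {
        // The source had a space just before (or at) the emphasis marker.
        if (labelSource[startPos - 1] === ' ' || labelSource[startPos] === ' ') {
          prev.content += ' '
        }
      }
    }
  }
  return children
}

const makeLabel = (firstText) => [
  { type: 'text', content: firstText },
  { type: 'strong_open', markup: '**' },
  { type: 'text', content: 'bar' },
  { type: 'strong_close', markup: '**' }
]

const spaced = restoreSpaces(makeLabel('foo'), 'foo **bar**')
const tight = restoreSpaces(makeLabel('foo'), 'foo**bar**')
console.log(JSON.stringify(spaced[0].content), JSON.stringify(tight[0].content))
```

Once a space has been dropped from a text token, every later offset in the label is off by one. That drift may be why the original checks both `startPos - 1` and `startPos`; the diff itself does not state the reason.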
@@ -1765,6 +2019,35 @@ const convertInlineLinks = (tokens, state) => {
       i++
       continue
     }
+    if (currentLabelSource) {
+      const linkCloseIdx = findLinkCloseIndex(tokens, i)
+      if (linkCloseIdx !== -1) {
+        let cursor = 0
+        for (let pos = i + 1; pos < linkCloseIdx; pos++) {
+          const t = tokens[pos]
+          const markup = t.markup || ''
+          const text = t.content || ''
+          const startPos = cursor
+          if (t.type === 'text') {
+            cursor += text.length
+          } else if (t.type === 'code_inline') {
+            cursor += markup.length + text.length + markup.length
+          } else if (markup) {
+            cursor += markup.length
+          }
+          if ((t.type === 'strong_open' || t.type === 'em_open') && startPos > 0) {
+            const prevToken = tokens[pos - 1]
+            if (prevToken && prevToken.type === 'text' && prevToken.content && !prevToken.content.endsWith(' ')) {
+              const labelHasSpaceBefore = startPos - 1 >= 0 && startPos - 1 < currentLabelSource.length && currentLabelSource[startPos - 1] === ' '
+              const labelHasSpaceAt = startPos >= 0 && startPos < currentLabelSource.length && currentLabelSource[startPos] === ' '
+              if (labelHasSpaceBefore || labelHasSpaceAt) {
+                prevToken.content += ' '
+              }
+            }
+          }
+        }
+      }
+    }
     if (needsPlaceholder && currentLabelSource) {
       removeGhostLabelText(tokens, nextIndex - 1, currentLabelSource)
     }
@@ -1969,9 +2252,11 @@ const mditStrongJa = (md, option) => {
     mditAttrs: true, //markdown-it-attrs
     mdBreaks: md.options.breaks,
     disallowMixed: false, //Non-Japanese text handling
+    mode: 'japanese-only', // 'japanese-only' | 'aggressive' | 'compatible'
     coreRulesBeforePostprocess: [] // e.g. ['cjk_breaks'] when CJK line-break plugins are active
   }
   if (option) Object.assign(opt, option)
+  opt.hasCjkBreaks = hasCjkBreaksRule(md)
   const rawCoreRules = opt.coreRulesBeforePostprocess
   const hasCoreRuleConfig = Array.isArray(rawCoreRules)
     ? rawCoreRules.length > 0
@@ -1984,6 +2269,139 @@ const mditStrongJa = (md, option) => {
     return strongJa(state, silent, opt)
   })
 
+  // Trim trailing spaces that remain after markdown-it-attrs strips `{...}`
+  // Trim trailing spaces only at the very end of inline content (after attrs/core rules have run).
+  const trimInlineTrailingSpaces = (state) => {
+    if (!state || !state.tokens) return
+    for (let i = 0; i < state.tokens.length; i++) {
+      const token = state.tokens[i]
+      if (!token || token.type !== 'inline' || !token.children || token.children.length === 0) continue
+      let idx = token.children.length - 1
+      while (idx >= 0 && token.children[idx] && token.children[idx].type !== 'text') {
+        idx--
+      }
+      if (idx < 0) continue
+      const child = token.children[idx]
+      if (!child.content) continue
+      const trimmed = child.content.replace(/[ \t]+$/, '')
+      if (trimmed !== child.content) {
+        child.content = trimmed
+      }
+    }
+  }
+  const hasTextJoinRule = Array.isArray(md.core?.ruler?.__rules__)
+    ? md.core.ruler.__rules__.some((rule) => rule && rule.name === 'text_join')
+    : false
+  if (hasTextJoinRule) {
+    md.core.ruler.after('text_join', 'strong_ja_trim_trailing_spaces', trimInlineTrailingSpaces)
+  } else {
+    md.core.ruler.after('inline', 'strong_ja_trim_trailing_spaces', trimInlineTrailingSpaces)
+  }
+
+  const normalizeSoftbreakSpacing = (state) => {
+    if (!state || opt.hasCjkBreaks !== true) return
+    if (!state.tokens || state.tokens.length === 0) return
+    for (let i = 0; i < state.tokens.length; i++) {
+      const token = state.tokens[i]
+      if (!token || token.type !== 'inline' || !token.children || token.children.length === 0) continue
+      for (let j = 0; j < token.children.length; j++) {
+        const child = token.children[j]
+        if (!child || child.type !== 'text' || !child.content) continue
+        if (child.content.indexOf('\n') === -1) continue
+        let normalized = ''
+        for (let idx = 0; idx < child.content.length; idx++) {
+          const ch = child.content[idx]
+          if (ch === '\n') {
+            const prevChar = idx > 0 ? child.content[idx - 1] : ''
+            const nextChar = idx + 1 < child.content.length ? child.content[idx + 1] : ''
+            const isAsciiWord = nextChar && nextChar >= '0' && nextChar <= 'z' && /[A-Za-z0-9]/.test(nextChar)
+            const shouldReplace = isAsciiWord && nextChar !== '{' && nextChar !== '\\' && isJapanese(prevChar) && !isJapanese(nextChar)
+            if (shouldReplace) {
+              normalized += ' '
+              continue
+            }
+          }
+          normalized += ch
+        }
+        if (normalized !== child.content) {
+          child.content = normalized
+        }
+      }
+    }
+  }
+  if (hasTextJoinRule) {
+    md.core.ruler.after('text_join', 'strong_ja_softbreak_spacing', normalizeSoftbreakSpacing)
+  } else {
+    md.core.ruler.after('inline', 'strong_ja_softbreak_spacing', normalizeSoftbreakSpacing)
+  }
+
+  const restoreSoftbreaksAfterCjk = (state) => {
+    if (!state) return
+    if (!state.md || state.md.__strongJaRestoreSoftbreaksForAttrs !== true) return
+    if (opt.hasCjkBreaks !== true) return
+    if (!state.tokens || state.tokens.length === 0) return
+    for (let i = 0; i < state.tokens.length; i++) {
+      const token = state.tokens[i]
+      if (!token || token.type !== 'inline' || !token.children || token.children.length === 0) continue
+      const children = token.children
+      for (let j = 0; j < children.length; j++) {
+        const child = children[j]
+        if (!child || child.type !== 'text' || child.content !== '') continue
+        // Find previous non-empty text content to inspect the trailing character.
+        let prevChar = ''
+        for (let k = j - 1; k >= 0; k--) {
+          const prev = children[k]
+          if (prev && prev.type === 'text' && prev.content) {
+            prevChar = prev.content.charAt(prev.content.length - 1)
+            break
+          }
+        }
+        if (!prevChar || !isJapanese(prevChar)) continue
+        const next = children[j + 1]
+        if (!next || next.type !== 'text' || !next.content) continue
+        const nextChar = next.content.charAt(0)
+        if (nextChar !== '{') continue
+        child.type = 'softbreak'
+        child.tag = ''
+        child.content = '\n'
+        child.markup = ''
+        child.info = ''
+      }
+    }
+  }
+
+  const registerRestoreSoftbreaks = () => {
+    if (md.__strongJaRestoreRegistered) return
+    const anchorRule = hasTextJoinRule ? 'text_join' : 'inline'
+    const added = md.core.ruler.after(anchorRule, 'strong_ja_restore_softbreaks', restoreSoftbreaksAfterCjk)
+    if (added !== false) {
+      md.__strongJaRestoreRegistered = true
+      md.__strongJaRestoreSoftbreaksForAttrs = opt.mditAttrs === false
+      if (opt.hasCjkBreaks) {
+        moveRuleAfter(md.core.ruler, 'strong_ja_restore_softbreaks', 'cjk_breaks')
+        md.__strongJaRestoreReordered = true
+      }
+      if (!md.__strongJaPatchCorePush) {
+        md.__strongJaPatchCorePush = true
+        const originalPush = md.core.ruler.push.bind(md.core.ruler)
+        md.core.ruler.push = (name, fn, options) => {
+          const res = originalPush(name, fn, options)
+          if (name && name.indexOf && name.indexOf('cjk_breaks') !== -1) {
+            opt.hasCjkBreaks = true
+            moveRuleAfter(md.core.ruler, 'strong_ja_restore_softbreaks', name)
+            md.__strongJaRestoreReordered = true
+          }
+          return res
+        }
+      }
+      if (opt.hasCjkBreaks) {
+        moveRuleAfter(md.core.ruler, 'strong_ja_restore_softbreaks', 'cjk_breaks')
+        md.__strongJaRestoreReordered = true
+      }
+    }
+  }
+  registerRestoreSoftbreaks()
+
   md.core.ruler.after('inline', 'strong_ja_postprocess', (state) => {
     const targets = state.env.__strongJaPostProcessTargets
     if (!targets || targets.length === 0) return
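The newline handling in `normalizeSoftbreakSpacing` above can be exercised without markdown-it. The sketch below is a reduction under stated assumptions: `isJapaneseChar` is a hypothetical regex stand-in for the plugin's `isJapanese` helper (which this hunk does not show), and `normalizeNewlines` is a hypothetical name for the per-string loop. A newline becomes a space only when a Japanese character precedes it and an ASCII letter or digit follows it:

```js
// Hypothetical stand-in for isJapanese: CJK punctuation, kana, common kanji, fullwidth forms.
const isJapaneseChar = (ch) => /[\u3000-\u30ff\u4e00-\u9fff\uff00-\uffef]/.test(ch)

// Per-string version of the loop in normalizeSoftbreakSpacing.
function normalizeNewlines(text) {
  let out = ''
  for (let idx = 0; idx < text.length; idx++) {
    const ch = text[idx]
    if (ch === '\n') {
      const prev = idx > 0 ? text[idx - 1] : ''
      const next = idx + 1 < text.length ? text[idx + 1] : ''
      // Japanese before the break, ASCII word character after it.
      if (prev !== '' && isJapaneseChar(prev) && /[A-Za-z0-9]/.test(next)) {
        out += ' '
        continue
      }
    }
    out += ch
  }
  return out
}

const mixed = normalizeNewlines('日本語\nEnglish')
const asciiOnly = normalizeNewlines('abc\ndef')
const japaneseOnly = normalizeNewlines('日本語\n日本語')
console.log(JSON.stringify(mixed), JSON.stringify(asciiOnly), JSON.stringify(japaneseOnly))
```

In the plugin, the rule returns early unless `opt.hasCjkBreaks` is true, so this substitution only applies when a `cjk_breaks` core rule is registered.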
@@ -2011,6 +2429,9 @@ const mditStrongJa = (md, option) => {
       delete tokens.__strongJaInlineLabelSources
       delete tokens.__strongJaInlineLabelIndex
     }
+    if (state.env && state.env.__strongJaInlineLabelSourceList) {
+      delete state.env.__strongJaInlineLabelSourceList
+    }
     delete state.env.__strongJaPostProcessTargets
     delete state.env.__strongJaPostProcessTargetSet
   })
@@ -2065,3 +2486,21 @@ function moveRuleBefore(ruler, ruleName, beforeName) {
   rules.splice(beforeIdx, 0, rule)
   ruler.__cache__ = null
 }
+
+function moveRuleAfter(ruler, ruleName, afterName) {
+  if (!ruler || !ruler.__rules__) return
+  const rules = ruler.__rules__
+  let fromIdx = -1
+  let afterIdx = -1
+  for (let idx = 0; idx < rules.length; idx++) {
+    if (rules[idx].name === ruleName) fromIdx = idx
+    if (rules[idx].name === afterName) afterIdx = idx
+    if (fromIdx !== -1 && afterIdx !== -1) break
+  }
+  if (fromIdx === -1 || afterIdx === -1 || fromIdx === afterIdx + 1) return
+
+  const rule = rules.splice(fromIdx, 1)[0]
+  const targetIdx = fromIdx < afterIdx ? afterIdx - 1 : afterIdx
+  rules.splice(targetIdx + 1, 0, rule)
+  ruler.__cache__ = null
+}
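The index arithmetic in `moveRuleAfter` above is easy to get wrong by one, so here is a small standalone sketch of the same splice logic. It assumes a plain array of `{ name }` objects instead of a markdown-it ruler, and `moveAfter` and `names` are hypothetical helpers introduced only for illustration:

```js
// Same splice arithmetic as moveRuleAfter, applied to a plain array of { name } objects.
function moveAfter(rules, ruleName, afterName) {
  const fromIdx = rules.findIndex((r) => r.name === ruleName)
  const afterIdx = rules.findIndex((r) => r.name === afterName)
  // Nothing to do if either rule is missing or the rule already follows its anchor.
  if (fromIdx === -1 || afterIdx === -1 || fromIdx === afterIdx + 1) return rules
  const [rule] = rules.splice(fromIdx, 1)
  // Removing an element that sat before the anchor shifts the anchor one slot left.
  const targetIdx = fromIdx < afterIdx ? afterIdx - 1 : afterIdx
  rules.splice(targetIdx + 1, 0, rule)
  return rules
}

const names = (rules) => rules.map((r) => r.name).join(',')

const forward = moveAfter([{ name: 'a' }, { name: 'b' }, { name: 'c' }], 'a', 'c')
const backward = moveAfter([{ name: 'a' }, { name: 'b' }, { name: 'c' }], 'c', 'a')
console.log(names(forward), names(backward))
```

The real function also resets `ruler.__cache__`, which the sketch omits because a plain array has no compiled rule cache.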
package/package.json
CHANGED
@@ -1,7 +1,7 @@
 {
   "name": "@peaceroad/markdown-it-strong-ja",
   "description": "This is a plugin for markdown-it. It is an alternative to the standard `**` (strong) and `*` (em) processing. It also processes strings that cannot be converted by the standard.",
-  "version": "0.
+  "version": "0.6.0",
   "main": "index.js",
   "type": "module",
   "files": [
@@ -19,7 +19,7 @@
     "markdown-it": "^14.1.0"
   },
   "devDependencies": {
-    "@peaceroad/markdown-it-cjk-breaks-mod": "^0.1.
+    "@peaceroad/markdown-it-cjk-breaks-mod": "^0.1.4",
     "@peaceroad/markdown-it-hr-sandwiched-semantic-container": "^0.8.0",
     "markdown-it-attrs": "^4.3.1",
     "markdown-it-sub": "^2.0.0",