@peaceroad/markdown-it-strong-ja 0.7.1 → 0.8.0
- package/README.md +314 -195
- package/index.js +18 -48
- package/package.json +23 -5
- package/src/token-compat.js +108 -46
- package/src/token-core.js +467 -92
- package/src/token-link-utils.js +104 -400
- package/src/token-postprocess/fastpaths.js +349 -0
- package/src/token-postprocess/guards.js +436 -0
- package/src/token-postprocess/orchestrator.js +733 -0
- package/src/token-postprocess.js +1 -340
- package/src/token-utils.js +192 -148
package/README.md
CHANGED
|
@@ -1,274 +1,393 @@
|
|
|
1
1
|
# p7d-markdown-it-strong-ja
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
`@peaceroad/markdown-it-strong-ja` is a `markdown-it` plugin that extends `*` / `**` emphasis handling for Japanese text, while keeping normal Markdown behavior as close to `markdown-it` as possible.
|
|
4
4
|
|
|
5
|
-
##
|
|
5
|
+
## Install
|
|
6
6
|
|
|
7
|
-
```
|
|
8
|
-
|
|
9
|
-
|
|
10
|
-
|
|
11
|
-
|
|
7
|
+
```bash
|
|
8
|
+
npm i @peaceroad/markdown-it-strong-ja
|
|
9
|
+
```
|
|
10
|
+
|
|
11
|
+
## Quick Start
|
|
12
12
|
|
|
13
|
-
|
|
14
|
-
|
|
13
|
+
```js
|
|
14
|
+
import MarkdownIt from 'markdown-it'
|
|
15
|
+
import strongJa from '@peaceroad/markdown-it-strong-ja'
|
|
15
16
|
|
|
17
|
+
const md = MarkdownIt().use(strongJa)
|
|
16
18
|
|
|
17
|
-
md.render('
|
|
18
|
-
// <p
|
|
19
|
+
md.render('和食では**「だし」**が料理の土台です。')
|
|
20
|
+
// <p>和食では<strong>「だし」</strong>が料理の土台です。</p>
|
|
19
21
|
```
|
|
20
22
|
|
|
21
|
-
|
|
23
|
+
## Scope and Modes
|
|
22
24
|
|
|
23
|
-
|
|
25
|
+
This plugin targets asterisk emphasis markers (`*`, `**`). It does not replace all inline parsing behavior of `markdown-it`. The goal is to help only where emphasis tends to break in Japanese text. When input is heavily malformed, the plugin prefers safe output and leaves markers as literal text instead of forcing unstable HTML.
|
|
24
26
|
|
|
25
|
-
|
|
27
|
+
Underscore emphasis (`_`, `__`) is intentionally left to plain `markdown-it`. strong-ja does not add custom delimiter-direction logic for `_` runs, and underscore-heavy malformed spans are handled fail-safe (kept conservative rather than force-rewritten).
|
|
26
28
|
|
|
27
|
-
|
|
28
|
-
- `mode: 'aggressive'` … always aggressive (lead `**` pairs greedily)
|
|
29
|
-
- `mode: 'compatible'` … markdown-it compatible (lead `**` stays literal)
|
|
29
|
+
Mode selection controls how aggressively the plugin helps:
|
|
30
30
|
|
|
31
|
-
|
|
32
|
-
|
|
33
|
-
|
|
34
|
-
|
|
35
|
-
|
|
31
|
+
- `japanese` (default): alias of `japanese-boundary-guard`. This is the recommended mode for mixed Japanese/English prose.
|
|
32
|
+
- `japanese-boundary`: keeps markdown-it as baseline and enables Japanese-context local relaxation around `*` runs. It does not apply the mixed JA/EN single-`*` guard. Link/ref postprocess repairs are enabled. Target behavior is JP-friendly conservative recovery.
|
|
33
|
+
- `japanese-boundary-guard`: includes everything from `japanese-boundary`, plus an extra mixed JA/EN guard for space-adjacent ASCII segments (for patterns like `* English*`, `** "English"**`, `*** [English](u)***`). The guard is applied consistently across all `*` run lengths (single `*` and longer runs). Link/ref postprocess repairs are enabled. Target behavior is JP-friendly mixed-text safety.
|
|
34
|
+
- `aggressive`: more permissive than the baseline-first modes and the most eager mode for early opener recovery. Japanese local relaxation and link/ref postprocess repairs are enabled. Target behavior is maximum recovery.
|
|
35
|
+
- `compatible`: keeps plain markdown-it delimiter decisions as-is. It does not run Japanese local relaxation and skips link/ref postprocess repairs. Output stays aligned with plain `markdown-it` under the same plugin stack.
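The mode list above can be summarized as a small lookup table. This is an illustrative sketch only, not the plugin's internals: `MODE_ALIASES`, `MODE_FEATURES`, `resolveMode`, and `describeMode` are hypothetical names, the flags record only what the list states, and throwing on an unknown mode is an assumption.

```js
// Hypothetical summary of the documented modes; not the plugin's actual code.
const MODE_ALIASES = new Map([['japanese', 'japanese-boundary-guard']])

// Flags mirror the mode descriptions. `mixedGuard` is omitted where the
// description does not state it explicitly.
const MODE_FEATURES = new Map([
  ['japanese-boundary', { localRelaxation: true, mixedGuard: false, linkRefRepairs: true }],
  ['japanese-boundary-guard', { localRelaxation: true, mixedGuard: true, linkRefRepairs: true }],
  ['aggressive', { localRelaxation: true, linkRefRepairs: true }],
  ['compatible', { localRelaxation: false, linkRefRepairs: false }]
])

// Resolve an alias to its canonical mode name (default: 'japanese').
function resolveMode(mode = 'japanese') {
  const resolved = MODE_ALIASES.get(mode) ?? mode
  if (!MODE_FEATURES.has(resolved)) {
    // Assumption: this sketch rejects unknown names; the plugin may differ.
    throw new Error(`Unknown mode: ${mode}`)
  }
  return resolved
}

// Look up the documented feature flags for a mode or alias.
function describeMode(mode) {
  return MODE_FEATURES.get(resolveMode(mode))
}
```

Keeping the alias separate from the feature table makes it explicit that `japanese` adds no behavior of its own.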
|
|
36
36
|
|
|
37
|
-
|
|
37
|
+
### What `japanese-boundary` and `japanese-boundary-guard` Share
|
|
38
38
|
|
|
39
|
-
|
|
40
|
-
- Pick `compatible` for markdown-it behavior everywhere.
|
|
41
|
-
- Pick `japanese` to be aggressive only when Japanese text is present.
|
|
42
|
-
- Pick `aggressive` if you want leading `**` to always pair.
|
|
39
|
+
The following behavior is shared by both modes (`japanese` is an alias of `japanese-boundary-guard`):
|
|
43
40
|
|
|
44
|
-
|
|
41
|
+
- baseline-first decisions on top of `markdown-it`
|
|
42
|
+
- Japanese-context local relaxation (same-line neighborhood only)
|
|
43
|
+
- single-`*` direction correction for malformed opener/closer flips
|
|
44
|
+
- token-only postprocess repairs around links/references (except `compatible`)
|
|
45
|
+
- fail-safe behavior: low-confidence spans are preserved
|
|
45
46
|
|
|
46
|
-
|
|
47
|
-
- Input: `**「test」**`
|
|
48
|
-
- Output (default/aggressive/compatible/markdown-it): `<p><strong>「test」</strong></p>`
|
|
49
|
-
- Input: `これは**「test」**です`
|
|
50
|
-
- Output (default/aggressive): `<p>これは<strong>「test」</strong>です</p>`
|
|
51
|
-
- Output (compatible/markdown-it): `<p>これは**「test」**です</p>`
|
|
47
|
+
Representative shared outputs:
|
|
52
48
|
|
|
53
|
-
-
|
|
54
|
-
|
|
55
|
-
- Output (default/aggressive): `<p><strong>あああ。</strong>iii**</p>`
|
|
56
|
-
- Output (compatible/markdown-it): `<p>**あああ。<strong>iii</strong></p>`
|
|
57
|
-
- Input (English-only): `**aaa.**iii**`
|
|
58
|
-
- Output (aggressive): `<p><strong>aaa.</strong>iii**</p>`
|
|
59
|
-
- Output (default/compatible/markdown-it): `<p>**aaa.<strong>iii</strong></p>`
|
|
60
|
-
- Input (English-only, two `**` runs): `**aaa.**eee.**eeee**`
|
|
61
|
-
- Output (aggressive): `<p><strong>aaa.</strong>eee.<strong>eeee</strong></p>`
|
|
62
|
-
- Output (default/compatible/markdown-it): `<p>**aaa.**eee.<strong>eeee</strong></p>`
|
|
49
|
+
- Input: `*味噌汁。*umai*`
|
|
50
|
+
- `japanese-boundary` / `japanese-boundary-guard`: `<p><em>味噌汁。</em>umai*</p>`
|
|
63
51
|
|
|
64
|
-
|
|
52
|
+
- Input: `説明文ではこれは**[寿司](url)**です。`
|
|
53
|
+
- `japanese-boundary` / `japanese-boundary-guard`: `<p>説明文ではこれは<strong><a href="url">寿司</a></strong>です。</p>`
|
|
65
54
|
|
|
66
|
-
|
|
67
|
-
- Input (English-only): `string**[text](url)**`
|
|
68
|
-
- Output (aggressive): `<p>string<strong><a href="url">text</a></strong></p>`
|
|
69
|
-
- Output (default/compatible/markdown-it): `<p>string**<a href="url">text</a>**</p>`
|
|
70
|
-
- Input (Japanese mixed): `これは**[text](url)**です`
|
|
71
|
-
- Output (default/aggressive): `<p>これは<strong><a href="url">text</a></strong>です</p>`
|
|
72
|
-
- Output (compatible/markdown-it): `<p>これは**<a href="url">text</a>**です</p>`
|
|
73
|
-
- Inline code (cluster of `*` without spaces):
|
|
74
|
-
- Input (English-only): `` **aa`code`**aa ``
|
|
75
|
-
- Output (aggressive): `<p><strong>aa<code>code</code></strong>aa</p>`
|
|
76
|
-
- Output (default/compatible/markdown-it): `<p>**aa<code>code</code>**aa</p>`
|
|
77
|
-
- Input (Japanese mixed): `` これは**`code`**です ``
|
|
78
|
-
- Output (default/aggressive): `<p>これは<strong><code>code</code></strong>です</p>`
|
|
79
|
-
- Output (compatible/markdown-it): `<p>これは**<code>code</code>**です</p>`
|
|
55
|
+
### What Only `japanese-boundary-guard` Adds
|
|
80
56
|
|
|
81
|
-
|
|
57
|
+
`japanese-boundary-guard` adds an extra mixed JA/EN suppression guard:
|
|
82
58
|
|
|
59
|
+
- target: space-adjacent + ASCII-start segments (plain / quoted / link / code wrappers)
|
|
60
|
+
- goal: reduce unnatural conversions such as `* English*` or `* \`English\`*`
|
|
61
|
+
- applied consistently across run lengths (`*`, `**`, `***`, ...)
|
|
83
62
|
|
|
63
|
+
Representative differences:
|
|
84
64
|
|
|
85
|
-
|
|
65
|
+
- Input: `日本語です。* English* です。`
|
|
66
|
+
- `japanese-boundary`: `<p>日本語です。<em> English</em> です。</p>`
|
|
67
|
+
- `japanese-boundary-guard`: `<p>日本語です。* English* です。</p>`
|
|
86
68
|
|
|
87
|
-
|
|
69
|
+
- Input: `和食では* \`umami\`*を使う。`
|
|
70
|
+
- `japanese-boundary`: `<p>和食では<em> <code>umami</code></em>を使う。</p>`
|
|
71
|
+
- `japanese-boundary-guard`: `<p>和食では* <code>umami</code>*を使う。</p>`
|
|
88
72
|
|
|
89
|
-
|
|
90
|
-
[Markdown]
|
|
91
|
-
HTMLは「**HyperText Markup Language**」の略です。
|
|
92
|
-
[HTML]
|
|
93
|
-
<p>HTMLは「<strong>HyperText Markup Language</strong>」の略です。</p>
|
|
73
|
+
### Mode Selection Guide (Practical)
|
|
94
74
|
|
|
75
|
+
- default for user-facing prose: `japanese` (`japanese-boundary-guard`)
|
|
76
|
+
- strict markdown-it parity: `compatible`
|
|
77
|
+
- maximum recovery over predictability: `aggressive`
|
|
78
|
+
- niche use without guard suppression: `japanese-boundary`
|
|
95
79
|
|
|
96
|
-
|
|
97
|
-
HTMLは**「HyperText Markup Language」**の略です。
|
|
98
|
-
[HTML]
|
|
99
|
-
<p>HTMLは<strong>「HyperText Markup Language」</strong>の略です。</p>
|
|
80
|
+
### Example Corpus Notes
|
|
100
81
|
|
|
82
|
+
Detailed cases and visual outputs:
|
|
101
83
|
|
|
102
|
-
|
|
103
|
-
|
|
104
|
-
|
|
105
|
-
|
|
84
|
+
- `example/README.md`
|
|
85
|
+
- `example/mixed-ja-en-stars-mode.html`
|
|
86
|
+
- `example/mixed-ja-en-stars-mode.txt`
|
|
87
|
+
- `example/inline-wrapper-matrix.html`
|
|
106
88
|
|
|
89
|
+
## How `japanese` (`japanese-boundary-guard`) Decides (Step by Step)
|
|
107
90
|
|
|
108
|
-
|
|
109
|
-
|
|
110
|
-
[HTML]
|
|
111
|
-
<p>HTMLは<strong>「HyperText <em>Markup</em> <code>Language</code>」</strong>の略です。</p>
|
|
91
|
+
This section follows the runtime flow for `mode: 'japanese'` (which resolves to `japanese-boundary-guard`).
|
|
92
|
+
The flow has three layers: Step 1 builds the baseline with plain `markdown-it`; Steps 2-8 apply helper logic only where needed; Step 9 repairs link/reference-adjacent breakage.
|
|
112
93
|
|
|
94
|
+
Terms used below:
|
|
113
95
|
|
|
114
|
-
|
|
115
|
-
|
|
96
|
+
- Opening marker: `*` or `**` that starts emphasis.
|
|
97
|
+
- Closing marker: `*` or `**` that ends emphasis.
|
|
98
|
+
- Run: a contiguous group of the same marker (`*`, `**`, `***`, ...).
|
|
99
|
+
- Line: text split by `\n`.
|
|
116
100
|
|
|
117
|
-
|
|
118
|
-
[HTML]
|
|
119
|
-
<p>HTMLは**「HyperText Mark</p>
|
|
120
|
-
<p>up Language」**の略です。</p>
|
|
101
|
+
### Step 1: Build the baseline with plain `markdown-it`
|
|
121
102
|
|
|
103
|
+
`markdown-it` runs first. If it can already parse a pattern (including cross-line `**...**`), that baseline structure is kept.
|
|
122
104
|
|
|
123
|
-
|
|
124
|
-
HTMLは\**「HyperText Markup Language」**の略です。
|
|
125
|
-
[HTML]
|
|
126
|
-
<p>HTMLは**「HyperText Markup Language」**の略です。</p>
|
|
105
|
+
Example:
|
|
127
106
|
|
|
107
|
+
- Input: `カツ**丼も\n人気**です`
|
|
108
|
+
- `markdown-it` / `japanese` / `compatible`: `<p>カツ<strong>丼も\n人気</strong>です</p>`
|
|
128
109
|
|
|
129
|
-
|
|
130
|
-
HTMLは\\**「HyperText Markup Language」**の略です。
|
|
131
|
-
[HTML]
|
|
132
|
-
<p>HTMLは\<strong>「HyperText Markup Language」</strong>の略です。</p>
|
|
110
|
+
Positioning:
|
|
133
111
|
|
|
112
|
+
- `mode: 'compatible'` mostly uses this baseline as-is.
|
|
113
|
+
- Other modes (`japanese`, `japanese-boundary`, `japanese-boundary-guard`, `aggressive`) may add helper logic in later steps.
|
|
134
114
|
|
|
135
|
-
|
|
136
|
-
HTMLは\\\**「HyperText Markup Language」**の略です。
|
|
137
|
-
[HTML]
|
|
138
|
-
<p>HTMLは\**「HyperText Markup Language」**の略です。</p>
|
|
115
|
+
### Step 2: Decide whether Japanese helper logic should run
|
|
139
116
|
|
|
117
|
+
This decision is made per `*` run. `japanese` does not rewrite the whole line blindly. It checks non-whitespace characters adjacent to each run and only enters helper logic when local Japanese context exists.
|
|
140
118
|
|
|
141
|
-
|
|
142
|
-
HTMLは`**`は**「HyperText Markup Language」**の略です。
|
|
143
|
-
[HTML]
|
|
144
|
-
<p>HTMLは<code>**</code>は<strong>「HyperText Markup Language」</strong>の略です。</p>
|
|
119
|
+
Japanese context here is mainly Hiragana, Katakana, Kanji (Han), and fullwidth punctuation/symbols. If adjacent context is mostly ASCII letters/numbers, the Step 1 result is kept.
|
|
145
120
|
|
|
146
|
-
|
|
147
|
-
HTMLは`**`は**「HyperText** <b>Markup</b> Language」の略です。
|
|
148
|
-
[HTML:false]
|
|
149
|
-
<p>HTMLは<code>**</code>は<strong>「HyperText</strong> <b>Markup</b> Language」の略です。</p>
|
|
150
|
-
[HTML:true]
|
|
151
|
-
<p>HTMLは<code>**</code>は<strong>「HyperText</strong> <b>Markup</b> Language」の略です。</p>
|
|
121
|
+
Example that stays on baseline:
|
|
152
122
|
|
|
123
|
+
- Input: `**sushi.**umami**`
|
|
124
|
+
- Output (`japanese`): `<p>**sushi.<strong>umami</strong></p>`
|
|
125
|
+
- Why: the characters adjacent to each `**` run are ASCII, so no local Japanese context is found.
|
|
153
126
|
|
|
154
|
-
|
|
155
|
-
HTMLは`**`は**「HyperText <b>Markup</b> Language」**の略です。
|
|
156
|
-
[HTML:false]
|
|
157
|
-
<p>HTMLは<code>**</code>は<strong>「HyperText <b>Markup</b> Language」</strong>の略です。</p>
|
|
158
|
-
[HTML:true]
|
|
159
|
-
<p>HTMLは<code>**</code>は<strong>「HyperText <b>Markup</b> Language」</strong>の略です。</p>
|
|
127
|
+
Example that proceeds to helper logic:
|
|
160
128
|
|
|
129
|
+
- Input: `**味噌汁。**umami**`
|
|
130
|
+
- Why: local Japanese context is adjacent.
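The per-run check described in Step 2 could be approximated as follows. This is a minimal sketch under stated assumptions, not the plugin's implementation: `JAPANESE_CHAR`, `nearestNonSpace`, and `classifyStarRuns` are hypothetical names, the character class is a rough approximation of "Hiragana, Katakana, Kanji, and fullwidth punctuation/symbols", and backslash escapes and non-BMP characters are ignored.

```js
// Hypothetical approximation of "Japanese context": Hiragana, Katakana, Han,
// plus the CJK punctuation block and the halfwidth/fullwidth forms block.
const JAPANESE_CHAR = /[\p{Script=Hiragana}\p{Script=Katakana}\p{Script=Han}\u3000-\u303F\uFF00-\uFFEF]/u

// Return the nearest non-whitespace character on the same line, or '' if none.
function nearestNonSpace(text, start, step) {
  for (let i = start; i >= 0 && i < text.length; i += step) {
    const ch = text[i]
    if (ch === '\n') return '' // never look past a line break
    if (!/\s/.test(ch)) return ch
  }
  return ''
}

// Find each contiguous `*` run and report whether Japanese text is adjacent.
function classifyStarRuns(text) {
  const runs = []
  for (const match of text.matchAll(/\*+/g)) {
    const start = match.index
    const end = start + match[0].length
    const before = nearestNonSpace(text, start - 1, -1)
    const after = nearestNonSpace(text, end, 1)
    runs.push({
      start,
      length: match[0].length,
      japaneseContext: JAPANESE_CHAR.test(before) || JAPANESE_CHAR.test(after)
    })
  }
  return runs
}
```

Applied to the two inputs above, every run in `**sushi.**umami**` reports no Japanese context, while the first two runs in `**味噌汁。**umami**` do.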
|
|
161
131
|
|
|
162
|
-
|
|
163
|
-
```
|
|
164
|
-
HTMLは`**`は**「HyperText Markup Language」**の略です。
|
|
165
|
-
```
|
|
166
|
-
[HTML:false]
|
|
167
|
-
<pre><code>HTMLは`**`は**「HyperText Markup Language」**の略です。
|
|
168
|
-
</code></pre>
|
|
169
|
-
[HTML:true]
|
|
170
|
-
<pre><code>HTMLは`**`は**「HyperText Markup Language」**の略です。
|
|
171
|
-
</code></pre>
|
|
132
|
+
### Step 3: Keep valid `markdown-it` direction decisions
|
|
172
133
|
|
|
134
|
+
`japanese` is baseline-first. It does not overwrite already-stable direction decisions. It only adds candidates where malformed input is likely to misdirect pairing.
|
|
173
135
|
|
|
174
|
-
|
|
175
|
-
HTMLは**「HyperText <b>Markup</b> Language」**
|
|
176
|
-
[HTML:false]
|
|
177
|
-
<p>HTMLは<strong>「HyperText <b>Markup</b> Language」</strong></p>
|
|
178
|
-
[HTML:true]
|
|
179
|
-
<p>HTMLは<strong>「HyperText <b>Markup</b> Language」</strong></p>
|
|
136
|
+
Example that stays as-is:
|
|
180
137
|
|
|
181
|
-
|
|
182
|
-
|
|
183
|
-
[HTML html:true]
|
|
184
|
-
<p>これは<strong><a href="url">text</a></strong>と<strong><code>code</code></strong>と<strong><b>HTML</b></strong>です</p>
|
|
138
|
+
- Input: `*寿司*は人気です。`
|
|
139
|
+
- Output: `<p><em>寿司</em>は人気です。</p>`
|
|
185
140
|
|
|
141
|
+
Example that continues:
|
|
186
142
|
|
|
187
|
-
|
|
188
|
-
|
|
189
|
-
[HTML]
|
|
190
|
-
<p>HTMLは「<strong>HyperText Markup Language</strong>」</p>
|
|
143
|
+
- Input: `*味噌汁。*umai*`
|
|
144
|
+
- Why: leaving the first `*` literal can make the later pair win (`*味噌汁。<em>umai</em>`), so local correction checks whether Japanese-side pairing should be preferred.
|
|
191
145
|
|
|
192
|
-
|
|
193
|
-
HTMLは**「HyperText Markup Language」**。
|
|
194
|
-
[HTML]
|
|
195
|
-
<p>HTMLは<strong>「HyperText Markup Language」</strong>。</p>
|
|
146
|
+
### Step 4: Use same-line local context only
|
|
196
147
|
|
|
197
|
-
|
|
198
|
-
HTMLは**「HyperText Markup Language」**
|
|
199
|
-
[HTML]
|
|
200
|
-
<p>HTMLは<strong>「HyperText Markup Language」</strong></p>
|
|
148
|
+
Local helper checks only read non-whitespace characters on the same line. They do not bridge across `\n`.
|
|
201
149
|
|
|
150
|
+
Example:
|
|
202
151
|
|
|
203
|
-
|
|
204
|
-
|
|
205
|
-
|
|
206
|
-
<p>HTMLは<strong>「HyperText Markup Language」</strong>。</p>
|
|
152
|
+
- Input: `*味噌汁。\n*umai*`
|
|
153
|
+
- Output (`japanese`): `<p>*味噌汁。\n<em>umai</em></p>`
|
|
154
|
+
- Why: the first `*` does not see the next line.
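The same-line restriction can be pictured as clamping every lookup to the bounds of the current line. The helpers below are a hypothetical sketch (`sameLineBounds` and `starsOnSameLine` are not names from the plugin) and only illustrate why the first `*` in the example cannot pair with markers on the next line.

```js
// Hypothetical helper: bounds [start, end) of the line containing `index`.
function sameLineBounds(text, index) {
  const start = text.lastIndexOf('\n', index - 1) + 1
  const nextBreak = text.indexOf('\n', index)
  return [start, nextBreak === -1 ? text.length : nextBreak]
}

// Positions of `*` characters visible from `index`, limited to its own line.
function starsOnSameLine(text, index) {
  const [start, end] = sameLineBounds(text, index)
  const positions = []
  for (let i = start; i < end; i++) {
    if (text[i] === '*') positions.push(i)
  }
  return positions
}
```

For `*味噌汁。\n*umai*`, the marker at index 0 sees only itself, while the marker at index 6 sees the pair on the second line.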
|
|
207
155
|
|
|
208
|
-
|
|
209
|
-
***強調と*入れ子*の検証***を行う。
|
|
210
|
-
[HTML]
|
|
211
|
-
<p><em><em><em>強調と</em>入れ子</em>の検証</em>**を行う。</p>
|
|
156
|
+
### Step 5 (`japanese-boundary-guard` only): Suppress mixed JA/EN over-conversion
|
|
212
157
|
|
|
213
|
-
|
|
214
|
-
****
|
|
215
|
-
[HTML]
|
|
216
|
-
<hr>
|
|
158
|
+
This step exists only in `japanese-boundary-guard`. It suppresses emphasis when the segment is space-adjacent and ASCII-start, to avoid unnatural emphasis around English fragments.
|
|
217
159
|
|
|
218
|
-
|
|
219
|
-
a****b
|
|
220
|
-
[HTML]
|
|
221
|
-
<p>a****b</p>
|
|
160
|
+
Representative differences:
|
|
222
161
|
|
|
223
|
-
|
|
224
|
-
|
|
225
|
-
|
|
226
|
-
<p>a****</p>
|
|
227
|
-
````
|
|
162
|
+
- Input: `日本語です。* English* です。`
|
|
163
|
+
- `japanese-boundary`: `<p>日本語です。<em> English</em> です。</p>`
|
|
164
|
+
- `japanese-boundary-guard`: `<p>日本語です。* English* です。</p>`
|
|
228
165
|
|
|
166
|
+
- Input: `和食では* \`umami\`*を使う。`
|
|
167
|
+
- `japanese-boundary`: `<p>和食では<em> <code>umami</code></em>を使う。</p>`
|
|
168
|
+
- `japanese-boundary-guard`: `<p>和食では* <code>umami</code>*を使う。</p>`
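One way to read "space-adjacent and ASCII-start" is as a test on the text just inside the opening marker. The predicate below is a hedged sketch of that reading, not the plugin's actual guard: `GUARDED_START` and `looksLikeGuardedSegment` are hypothetical names, and the set of wrapper characters (quote, `[`, backtick) is an assumption based on the plain / quoted / link / code cases mentioned in this README.

```js
// Hypothetical pattern: leading space or tab, an optional wrapper character
// (quote, link bracket, or code backtick), then an ASCII letter or digit.
const GUARDED_START = /^[ \t]+["'[`]?[A-Za-z0-9]/

// `inner` is the text between an opening `*` run and its candidate closer.
function looksLikeGuardedSegment(inner) {
  return GUARDED_START.test(inner)
}
```

Under this reading, the inner text ` English` and the code-wrapped variant are suppressed, while `味噌汁。` is not, which matches the outputs listed above for `japanese-boundary-guard`.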
|
|
229
169
|
|
|
230
|
-
###
|
|
170
|
+
### Step 6: Apply extra direction correction only to single `*`
|
|
231
171
|
|
|
232
|
-
|
|
172
|
+
Extra direction correction is applied only to run length `1` (`*`), where malformed inputs most often flip opener/closer direction.
|
|
233
173
|
|
|
234
|
-
|
|
235
|
-
|
|
236
|
-
|
|
237
|
-
|
|
238
|
-
|
|
239
|
-
|
|
174
|
+
Example:
|
|
175
|
+
|
|
176
|
+
- Input: `*味噌汁。*umai*`
|
|
177
|
+
- `japanese` / `aggressive`: `<p><em>味噌汁。</em>umai*</p>`
|
|
178
|
+
- `compatible` / `markdown-it`: `<p>*味噌汁。<em>umai</em></p>`
|
|
179
|
+
|
|
180
|
+
Additional boundary rule:
|
|
181
|
+
|
|
182
|
+
- Backward scan for previous single-`*` stops at sentence punctuation (`。`, `！`, `？`, `.`, `!`, `?`, `‼`, `⁇`, `⁈`, `⁉`) unless that punctuation is immediately adjacent to the current marker.
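The boundary rule can be sketched as a backward scan with two stop conditions. This is an illustration under assumptions, not the plugin's code: `SENTENCE_PUNCTUATION`, `isSingleStar`, and `findPreviousSingleStar` are hypothetical names, the set includes both ASCII and fullwidth `!` / `?` as this sketch's reading of the list, and stopping at `\n` follows the same-line rule from Step 4.

```js
// Sentence punctuation that stops the scan (this sketch's reading of the list).
const SENTENCE_PUNCTUATION = new Set(['。', '.', '!', '?', '！', '？', '‼', '⁇', '⁈', '⁉'])

// A single `*` is one that is not part of a longer run.
function isSingleStar(text, i) {
  return text[i] === '*' && text[i - 1] !== '*' && text[i + 1] !== '*'
}

// Return the index of the previous single `*` on the same line, or -1.
function findPreviousSingleStar(text, markerIndex) {
  for (let i = markerIndex - 1; i >= 0; i--) {
    const ch = text[i]
    if (ch === '\n') return -1 // do not bridge lines
    if (isSingleStar(text, i)) return i
    const adjacent = i === markerIndex - 1
    // Punctuation blocks the scan unless it touches the current marker.
    if (SENTENCE_PUNCTUATION.has(ch) && !adjacent) return -1
  }
  return -1
}
```

In `*味噌汁。*umai*`, the `。` sits directly before the second `*`, so the scan continues and finds the opener at index 0. In `*寿司。次は*`, the `。` is not adjacent to the final marker, so the scan stops.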
|
|
183
|
+
|
|
184
|
+
### Step 7: Do not apply Step 6 single-star correction to `**` and longer runs
|
|
185
|
+
|
|
186
|
+
Runs of `**` and longer (`***`, `****`, `*****+`) still use baseline `markdown-it` decisions and Japanese relaxations. Only the single-star-specific correction from Step 6 is excluded.
|
|
187
|
+
|
|
188
|
+
Example:
|
|
189
|
+
|
|
190
|
+
- Input: `**味噌汁。**umami**という表現を使います。`
|
|
191
|
+
- `japanese`: `<p><strong>味噌汁。</strong>umami**という表現を使います。</p>`
|
|
192
|
+
- `compatible`: `<p>**味噌汁。<strong>umami</strong>という表現を使います。</p>`
|
|
193
|
+
|
|
194
|
+
### Step 8: Build emphasis pairs normally; keep literals when forcing is unsafe
|
|
195
|
+
|
|
196
|
+
After direction candidates are fixed, normal inline pairing builds final tokens. If forcing tags looks unsafe, markers are left literal.
|
|
197
|
+
|
|
198
|
+
Example:
|
|
199
|
+
|
|
200
|
+
- Input: `**[**[x](v)](u)**`
|
|
201
|
+
- Output: `<p><strong>[</strong><a href="v">x</a>](u)**</p>`
|
|
202
|
+
|
|
203
|
+
### Step 9: Repair link/reference-adjacent breakage after pairing
|
|
204
|
+
|
|
205
|
+
Steps 1-8 decide marker direction and pairing. Step 9 is a separate phase that only adjusts malformed spans around links/references. It is controlled by the `postprocess` option.
|
|
206
|
+
|
|
207
|
+
#### Step 9-1: Collapsed reference matching follows `markdown-it` normalization
|
|
208
|
+
|
|
209
|
+
##### 9-1A: Collapsed reference matching (`[label][]`)
|
|
210
|
+
|
|
211
|
+
Collapsed reference matching (`[label][]`) follows `markdown-it` key normalization. strong-ja does not force matching by deleting `*`/`**` markers from labels.
|
|
212
|
+
|
|
213
|
+
Mismatch example:
|
|
214
|
+
|
|
215
|
+
```markdown
|
|
216
|
+
献立は「[**寿司**][]」です。
|
|
217
|
+
|
|
218
|
+
[寿司]: https://example.com/
|
|
240
219
|
```
|
|
241
220
|
|
|
242
|
-
|
|
243
|
-
|
|
244
|
-
|
|
221
|
+
```html
|
|
222
|
+
<p>献立は「[<strong>寿司</strong>][]」です。</p>
|
|
223
|
+
```
|
|
245
224
|
|
|
246
|
-
|
|
225
|
+
Match example:
|
|
247
226
|
|
|
248
|
-
|
|
227
|
+
```markdown
|
|
228
|
+
献立は「[**寿司**][]」です。
|
|
249
229
|
|
|
250
|
-
|
|
230
|
+
[**寿司**]: https://example.com/
|
|
231
|
+
```
|
|
251
232
|
|
|
252
|
-
```
|
|
253
|
-
|
|
254
|
-
postprocess: false
|
|
255
|
-
})
|
|
233
|
+
```html
|
|
234
|
+
<p>献立は「<a href="https://example.com/"><strong>寿司</strong></a>」です。</p>
|
|
256
235
|
```
|
|
257
236
|
|
|
237
|
+
##### 9-1B: Inline link handling (`[text](url)`)
|
|
238
|
+
|
|
239
|
+
- `[text](url)` does not do collapsed-reference label matching.
|
|
240
|
+
- Step 9 only adjusts malformed `*` / `**` wrappers around links.
|
|
241
|
+
- It never forces matching by deleting markers.
|
|
242
|
+
|
|
243
|
+
Examples:
|
|
244
|
+
|
|
245
|
+
- Input: `メニューではmenu**[ramen](url)**と書きます。`
|
|
246
|
+
- `japanese` / `japanese-boundary` / `japanese-boundary-guard`: `<p>メニューではmenu**<a href="url">ramen</a>**と書きます。</p>`
|
|
247
|
+
- `aggressive`: `<p>メニューではmenu<strong><a href="url">ramen</a></strong>と書きます。</p>`
|
|
248
|
+
- `compatible` / `markdown-it`: `<p>メニューではmenu**<a href="url">ramen</a>**と書きます。</p>`
|
|
249
|
+
|
|
250
|
+
- Input: `説明文ではこれは**[寿司](url)**です。`
|
|
251
|
+
- `japanese` / `japanese-boundary` / `japanese-boundary-guard` / `aggressive`: `<p>説明文ではこれは<strong><a href="url">寿司</a></strong>です。</p>`
|
|
252
|
+
- `compatible` / `markdown-it`: `<p>説明文ではこれは**<a href="url">寿司</a>**です。</p>`
|
|
253
|
+
|
|
254
|
+
##### 9-1C: Inline code / symbol wrapper handling
|
|
255
|
+
|
|
256
|
+
- Input: `昼食は**\`code\`**の話です。`
|
|
257
|
+
- `japanese` / `japanese-boundary` / `japanese-boundary-guard` / `aggressive`: `<p>昼食は<strong><code>code</code></strong>の話です。</p>`
|
|
258
|
+
- `compatible` / `markdown-it`: `<p>昼食は**<code>code</code>**の話です。</p>`
|
|
259
|
+
|
|
260
|
+
- Input: `注記では**aa\`stock\`**aaという記法を試します。`
|
|
261
|
+
- `japanese` / `japanese-boundary` / `japanese-boundary-guard` / `compatible` / `markdown-it`: `<p>注記では**aa<code>stock</code>**aaという記法を試します。</p>`
|
|
262
|
+
- `aggressive`: `<p>注記では<strong>aa<code>stock</code></strong>aaという記法を試します。</p>`
|
|
263
|
+
|
|
264
|
+
- Input: `お店の場所は**{}()**です。`
|
|
265
|
+
- `japanese` / `japanese-boundary` / `japanese-boundary-guard` / `aggressive`: `<p>お店の場所は<strong>{}()</strong>です。</p>`
|
|
266
|
+
- `compatible` / `markdown-it`: `<p>お店の場所は**{}()**です。</p>`
|
|
267
|
+
|
|
268
|
+
#### Step 9-2: Which modes run Step 9
|
|
269
|
+
|
|
270
|
+
Step 9 runs in:
|
|
271
|
+
|
|
272
|
+
- `japanese-boundary`
|
|
273
|
+
- `japanese-boundary-guard` (therefore also `japanese`)
|
|
274
|
+
- `aggressive`
|
|
275
|
+
|
|
276
|
+
Step 9 is skipped in:
|
|
277
|
+
|
|
278
|
+
- `compatible` (to keep plain `markdown-it` parity)
|
|
279
|
+
|
|
280
|
+
Step 9 mainly targets malformed `*` / `**` around links and collapsed references. Spans that cross inline code, inline HTML, images, or autolinks are kept as-is.
|
|
281
|
+
|
|
282
|
+
#### Step 9-3: Why Step 9 can skip rewrites or normalize tokens
|
|
283
|
+
|
|
284
|
+
Step 9 is intentionally conservative. It prefers stable output over maximum conversion, so it skips rewrites when:
|
|
285
|
+
|
|
286
|
+
- emphasis/link repair signals are weak
|
|
287
|
+
- the span is low-confidence (`***` noise, underscore-heavy mix, code involvement, wrapper imbalance)
|
|
288
|
+
- the malformed shape does not match known safe repair patterns
|
|
289
|
+
|
|
290
|
+
Even when rewrite succeeds, token arrangement can be normalized while rendered HTML stays equivalent. For example, `[` / `]` / `[]` may become separate text tokens. The runtime path is strict token-only (no inline reparse fallback).
|
|
291
|
+
|
|
292
|
+
Example (low-confidence span is preserved):
|
|
293
|
+
|
|
294
|
+
- Input: `注記では**aa\`stock\`***tail*です。`
|
|
295
|
+
- `japanese` / `compatible`: `<p>注記では**aa<code>stock</code>**<em>tail</em>です。</p>`
|
|
296
|
+
- Reason: mixed `**` and `*` around code is low-confidence, so literal `**` is preserved.
|
|
297
|
+
|
|
298
|
+
In short, for ambiguous malformed input, strong-ja prioritizes safe/readable output over maximum conversion.
|
|
299
|
+
|
|
300
|
+
## Behavior Examples
|
|
301
|
+
|
|
302
|
+
Representative cases only (full corpus: `test/readme-mode.txt`).
|
|
303
|
+
|
|
304
|
+
Supporting visuals:
|
|
305
|
+
|
|
306
|
+
- `example/inline-wrapper-matrix.html`
|
|
307
|
+
- `example/mixed-ja-en-stars-mode.html`
|
|
308
|
+
|
|
309
|
+
### 1) Baseline Japanese punctuation case
|
|
310
|
+
|
|
311
|
+
- Input: `**「だし」**は和食の基本です。`
|
|
312
|
+
- `japanese` / `aggressive`: `<p><strong>「だし」</strong>は和食の基本です。</p>`
|
|
313
|
+
- `compatible` / `markdown-it`: `<p>**「だし」**は和食の基本です。</p>`
|
|
314
|
+
|
|
315
|
+
### 2) Mixed JA/EN mode differences
|
|
316
|
+
|
|
317
|
+
- Input: `**天ぷら。**crunch**という表現を使います。`
|
|
318
|
+
- `japanese` / `aggressive`: `<p><strong>天ぷら。</strong>crunch**という表現を使います。</p>`
|
|
319
|
+
- `compatible` / `markdown-it`: `<p>**天ぷら。<strong>crunch</strong>という表現を使います。</p>`
|
|
320
|
+
|
|
321
|
+
- Input: `日本語です。* English* です。`
|
|
322
|
+
- `japanese-boundary`: `<p>日本語です。<em> English</em> です。</p>`
|
|
323
|
+
- `japanese-boundary-guard` / `compatible`: `<p>日本語です。* English* です。</p>`
|
|
324
|
+
|
|
325
|
+
### 3) Safety-first malformed handling
|
|
326
|
+
|
|
327
|
+
- Input: `**[**[x](v)](u)**`
|
|
328
|
+
- All modes: `<p><strong>[</strong><a href="v">x</a>](u)**</p>`
|
|
329
|
+
|
|
330
|
+
- Input: `注記では**aa\`stock\`***tail*です。`
|
|
331
|
+
- `japanese` / `compatible`: `<p>注記では**aa<code>stock</code>**<em>tail</em>です。</p>`
|
|
332
|
+
- Low-confidence span: keep literal `**` instead of risky forced conversion.
|
|
333
|
+
|
|
334
|
+
### 4) Inline link/code adjacency
|
|
335
|
+
|
|
336
|
+
- Input: `説明文ではこれは**[ラーメン](url)**です。`
|
|
337
|
+
- `japanese` / `aggressive`: `<p>説明文ではこれは<strong><a href="url">ラーメン</a></strong>です。</p>`
|
|
338
|
+
- `compatible` / `markdown-it`: `<p>説明文ではこれは**<a href="url">ラーメン</a>**です。</p>`
|
|
339
|
+
|
|
340
|
+
- Input: `注記では**aa\`stock\`**aaという記法を試します。`
|
|
341
|
+
- `japanese` / `compatible` / `markdown-it`: `<p>注記では**aa<code>stock</code>**aaという記法を試します。</p>`
|
|
342
|
+
- `aggressive`: `<p>注記では<strong>aa<code>stock</code></strong>aaという記法を試します。</p>`
|
|
343
|
+
|
|
344
|
+
### 5) Pure-English malformed tail (`aggressive` delta)
|
|
345
|
+
|
|
346
|
+
- Input: `broken **tail [aa**aa***Text***and*More*bb**bb](https://x.test) after`
|
|
347
|
+
- `japanese` / `compatible` / `markdown-it`:
|
|
348
|
+
`<p>broken **tail <a href="https://x.test">aa<strong>aa</strong><em>Text</em><em><em>and</em>More</em>bb**bb</a> after</p>`
|
|
349
|
+
- `aggressive`:
|
|
350
|
+
`<p>broken **tail <a href="https://x.test">aa<strong>aa</strong><em>Text</em><strong>and<em>More</em>bb</strong>bb</a> after</p>`
|
|
351
|
+
|
|
352
|
+
## Options
|
|
353
|
+
|
|
354
|
+
### `mode`
|
|
355
|
+
|
|
356
|
+
- Type: `'japanese' | 'japanese-boundary' | 'japanese-boundary-guard' | 'aggressive' | 'compatible'`
|
|
357
|
+
- Default: `'japanese'`
|
|
358
|
+
|
|
359
|
+
### `mditAttrs`
|
|
360
|
+
|
|
361
|
+
- Type: `boolean`
|
|
258
362
|
- Default: `true`
|
|
259
|
-
- Set `false`
|
|
363
|
+
- Set `false` if your stack does not use `markdown-it-attrs`.
|
|
260
364
|
|
|
261
|
-
###
|
|
365
|
+
### `postprocess`
|
|
262
366
|
|
|
263
|
-
|
|
367
|
+
- Type: `boolean`
|
|
368
|
+
- Default: `true`
|
|
369
|
+
- Set `false` to disable link/reference postprocess repairs.
|
|
370
|
+
- In `mode: 'compatible'`, repairs are skipped even when this is `true`.
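The interaction between `mode` and `postprocess` described above reduces to a two-condition check. The function name `shouldRunPostprocess` is hypothetical and the defaults are taken from this Options section; treat it as a sketch of the documented behavior rather than the plugin's code.

```js
// Hypothetical helper: decide whether link/reference repairs would run.
function shouldRunPostprocess({ mode = 'japanese', postprocess = true } = {}) {
  // `compatible` always skips repairs to keep plain markdown-it parity.
  if (mode === 'compatible') return false
  // Assumption: only an explicit `false` disables repairs in this sketch.
  return postprocess !== false
}
```

With no options the result is `true`; with `{ mode: 'compatible', postprocess: true }` it is `false`.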
|
|
264
371
|
|
|
265
|
-
|
|
266
|
-
|
|
267
|
-
|
|
268
|
-
|
|
269
|
-
|
|
270
|
-
```
|
|
372
|
+
### `coreRulesBeforePostprocess`
|
|
373
|
+
|
|
374
|
+
- Type: `string[]`
|
|
375
|
+
- Default: `[]`
|
|
376
|
+
- Names of core rules that must run before `strong_ja_token_postprocess`.
|
|
271
377
|
|
|
378
|
+
### `patchCorePush`
|
|
379
|
+
|
|
380
|
+
- Type: `boolean`
|
|
272
381
|
- Default: `true`
|
|
273
|
-
-
|
|
274
|
-
|
|
382
|
+
- Helper hook to keep rule order stable when `mditAttrs: false` and `cjk_breaks` is registered later.
|
|
383
|
+
|
|
384
|
+
### About `markdown-it` `breaks`
|
|
385
|
+
|
|
386
|
+
`breaks` is controlled by `markdown-it` itself. This plugin does not override `md.options.breaks`. However, with `cjk_breaks`, compatibility handling may adjust softbreak-related tokens, so rendered line-break behavior can still differ in some cases.
|
|
387
|
+
|
|
388
|
+
## Notes
|
|
389
|
+
|
|
390
|
+
- Use `state.env.__strongJaTokenOpt` to override options per render.
|
|
391
|
+
- Overrides are merged with plugin options, but setup-time behavior (such as rule registration/order) cannot be switched at render time.
|
|
392
|
+
- This is an ESM plugin (`type: module`) and works in Node.js, browser bundlers, and VS Code extension pipelines that use `markdown-it` ESM.
|
|
393
|
+
- The `scanDelims` patch is applied once per `MarkdownIt` prototype within a process.
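The per-render override noted above can be pictured as a merge of plugin options with `env.__strongJaTokenOpt`. The helper below is a hypothetical sketch: `effectiveOptions` is not part of the plugin, and a shallow merge is an assumption about how "merged with plugin options" behaves.

```js
// Hypothetical helper: combine setup-time options with per-render overrides.
function effectiveOptions(pluginOptions, env) {
  const overrides = (env && env.__strongJaTokenOpt) || {}
  // Assumption: later keys win in a shallow merge.
  return { ...pluginOptions, ...overrides }
}
```

In `markdown-it`, the `env` object is the second argument to `md.render`, so a per-render override would be passed as `md.render(src, { __strongJaTokenOpt: { mode: 'compatible' } })`.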
|