@peaceroad/markdown-it-strong-ja 0.7.2 → 0.8.1
- package/README.md +326 -195
- package/index.js +27 -40
- package/package.json +26 -6
- package/src/token-compat.js +71 -22
- package/src/token-core.js +521 -132
- package/src/token-link-utils.js +434 -539
- package/src/token-postprocess/broken-ref.js +475 -0
- package/src/token-postprocess/fastpaths.js +349 -0
- package/src/token-postprocess/guards.js +499 -0
- package/src/token-postprocess/orchestrator.js +672 -0
- package/src/token-postprocess.js +1 -334
- package/src/token-utils.js +215 -142
package/README.md
CHANGED
|
@@ -1,274 +1,405 @@
|
|
|
1
1
|
# p7d-markdown-it-strong-ja
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
`@peaceroad/markdown-it-strong-ja` is a `markdown-it` plugin that extends `*` / `**` emphasis handling for Japanese text, while keeping normal Markdown behavior as close to `markdown-it` as possible.
|
|
4
4
|
|
|
5
|
-
##
|
|
5
|
+
## Install
|
|
6
6
|
|
|
7
|
-
```
|
|
8
|
-
|
|
9
|
-
|
|
10
|
-
|
|
11
|
-
|
|
7
|
+
```bash
|
|
8
|
+
npm i @peaceroad/markdown-it-strong-ja
|
|
9
|
+
```
|
|
10
|
+
|
|
11
|
+
## Quick Start
|
|
12
12
|
|
|
13
|
-
|
|
14
|
-
|
|
13
|
+
```js
|
|
14
|
+
import MarkdownIt from 'markdown-it'
|
|
15
|
+
import strongJa from '@peaceroad/markdown-it-strong-ja'
|
|
15
16
|
|
|
17
|
+
const md = MarkdownIt().use(strongJa)
|
|
16
18
|
|
|
17
|
-
md.render('
|
|
18
|
-
// <p
|
|
19
|
+
md.render('和食では**「だし」**が料理の土台です。')
|
|
20
|
+
// <p>和食では<strong>「だし」</strong>が料理の土台です。</p>
|
|
19
21
|
```
|
|
20
22
|
|
|
21
|
-
|
|
23
|
+
## Scope and Modes
|
|
22
24
|
|
|
23
|
-
|
|
25
|
+
This plugin targets asterisk emphasis markers (`*`, `**`). It does not replace all inline parsing behavior of `markdown-it`. The goal is to help only where emphasis tends to break in Japanese text. When input is heavily malformed, the plugin prefers safe output and leaves markers as literal text instead of forcing unstable HTML.
|
|
24
26
|
|
|
25
|
-
|
|
27
|
+
Underscore emphasis (`_`, `__`) is intentionally left to plain `markdown-it`. strong-ja does not add custom delimiter-direction logic for `_` runs, and underscore-heavy malformed spans are handled fail-safe (kept conservative rather than force-rewritten).
|
|
26
28
|
|
|
27
|
-
|
|
28
|
-
- `mode: 'aggressive'` … always aggressive (lead `**` pairs greedily)
|
|
29
|
-
- `mode: 'compatible'` … markdown-it compatible (lead `**` stays literal)
|
|
29
|
+
Mode selection controls how aggressively the plugin helps:
|
|
30
30
|
|
|
31
|
-
|
|
32
|
-
|
|
33
|
-
|
|
34
|
-
|
|
35
|
-
|
|
31
|
+
- `japanese` (default): alias of `japanese-boundary-guard`. This is the recommended mode for mixed Japanese/English prose.
|
|
32
|
+
- `japanese-boundary`: keeps markdown-it as baseline and enables Japanese-context local relaxation around `*` runs. It does not apply the mixed JA/EN single-`*` guard. Link/ref postprocess repairs are enabled. Target behavior is JP-friendly conservative recovery.
|
|
33
|
+
- `japanese-boundary-guard`: includes everything from `japanese-boundary`, plus an extra mixed JA/EN guard for space-adjacent ASCII segments (for patterns like `* English*`, `** "English"**`, `*** [English](u)***`). The guard applies consistently across all `*` run lengths (`*`, `**`, `***`, ...). Link/ref postprocess repairs are enabled. Target behavior is JP-friendly mixed-text safety.
|
|
34
|
+
- `aggressive`: more permissive than the baseline-first modes and the most eager mode for early opener recovery. Japanese local relaxation and link/ref postprocess repairs are enabled. Target behavior is maximum recovery.
|
|
35
|
+
- `compatible`: keeps plain markdown-it delimiter decisions as-is. It does not run Japanese local relaxation and skips link/ref postprocess repairs. Output stays aligned with plain `markdown-it` under the same plugin stack.
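The mode list above can be summarized as a small lookup table. The sketch below is only an illustration of the documented behavior: `MODE_ALIASES`, `MODE_FEATURES`, and `describeMode` are hypothetical names, not part of the plugin's API or its internal data structures.

```js
// Hypothetical summary of the documented modes (not the plugin's internals).
const MODE_ALIASES = new Map([['japanese', 'japanese-boundary-guard']])

const MODE_FEATURES = new Map([
  ['japanese-boundary', { localRelaxation: true, mixedGuard: false, postprocess: true }],
  ['japanese-boundary-guard', { localRelaxation: true, mixedGuard: true, postprocess: true }],
  ['aggressive', { localRelaxation: true, mixedGuard: false, postprocess: true }],
  ['compatible', { localRelaxation: false, mixedGuard: false, postprocess: false }]
])

// Resolve an alias and return the feature flags described in the list above.
function describeMode(mode = 'japanese') {
  const resolved = MODE_ALIASES.get(mode) || mode
  const features = MODE_FEATURES.get(resolved)
  if (!features) throw new Error(`unknown mode: ${mode}`)
  return { mode: resolved, ...features }
}

console.log(describeMode('japanese'))
```

Keeping the alias separate from the feature flags mirrors how the README describes `japanese` as a name for `japanese-boundary-guard` rather than as a distinct mode.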
|
|
36
36
|
|
|
37
|
-
|
|
37
|
+
### What `japanese-boundary` and `japanese-boundary-guard` Share
|
|
38
38
|
|
|
39
|
-
|
|
40
|
-
- Pick `compatible` for markdown-it behavior everywhere.
|
|
41
|
-
- Pick `japanese` to be aggressive only when Japanese text is present.
|
|
42
|
-
- Pick `aggressive` if you want leading `**` to always pair.
|
|
39
|
+
The following behavior is shared by both modes (`japanese` is an alias of `japanese-boundary-guard`):
|
|
43
40
|
|
|
44
|
-
|
|
41
|
+
- baseline-first decisions on top of `markdown-it`
|
|
42
|
+
- Japanese-context local relaxation (same-line neighborhood only)
|
|
43
|
+
- single-`*` direction correction for malformed opener/closer flips
|
|
44
|
+
- token-only postprocess repairs around links/references (except `compatible`)
|
|
45
|
+
- fail-safe behavior: low-confidence spans are preserved
|
|
45
46
|
|
|
46
|
-
|
|
47
|
-
- Input: `**「test」**`
|
|
48
|
-
- Output (default/aggressive/compatible/markdown-it): `<p><strong>「test」</strong></p>`
|
|
49
|
-
- Input: `これは**「test」**です`
|
|
50
|
-
- Output (default/aggressive): `<p>これは<strong>「test」</strong>です</p>`
|
|
51
|
-
- Output (compatible/markdown-it): `<p>これは**「test」**です</p>`
|
|
47
|
+
Representative shared outputs:
|
|
52
48
|
|
|
53
|
-
-
|
|
54
|
-
|
|
55
|
-
- Output (default/aggressive): `<p><strong>あああ。</strong>iii**</p>`
|
|
56
|
-
- Output (compatible/markdown-it): `<p>**あああ。<strong>iii</strong></p>`
|
|
57
|
-
- Input (English-only): `**aaa.**iii**`
|
|
58
|
-
- Output (aggressive): `<p><strong>aaa.</strong>iii**</p>`
|
|
59
|
-
- Output (default/compatible/markdown-it): `<p>**aaa.<strong>iii</strong></p>`
|
|
60
|
-
- Input (English-only, two `**` runs): `**aaa.**eee.**eeee**`
|
|
61
|
-
- Output (aggressive): `<p><strong>aaa.</strong>eee.<strong>eeee</strong></p>`
|
|
62
|
-
- Output (default/compatible/markdown-it): `<p>**aaa.**eee.<strong>eeee</strong></p>`
|
|
49
|
+
- Input: `*味噌汁。*umai*`
|
|
50
|
+
- `japanese-boundary` / `japanese-boundary-guard`: `<p><em>味噌汁。</em>umai*</p>`
|
|
63
51
|
|
|
64
|
-
|
|
52
|
+
- Input: `説明文ではこれは**[寿司](url)**です。`
|
|
53
|
+
- `japanese-boundary` / `japanese-boundary-guard`: `<p>説明文ではこれは<strong><a href="url">寿司</a></strong>です。</p>`
|
|
65
54
|
|
|
66
|
-
|
|
67
|
-
- Input (English-only): `string**[text](url)**`
|
|
68
|
-
- Output (aggressive): `<p>string<strong><a href="url">text</a></strong></p>`
|
|
69
|
-
- Output (default/compatible/markdown-it): `<p>string**<a href="url">text</a>**</p>`
|
|
70
|
-
- Input (Japanese mixed): `これは**[text](url)**です`
|
|
71
|
-
- Output (default/aggressive): `<p>これは<strong><a href="url">text</a></strong>です</p>`
|
|
72
|
-
- Output (compatible/markdown-it): `<p>これは**<a href="url">text</a>**です</p>`
|
|
73
|
-
- Inline code (cluster of `*` without spaces):
|
|
74
|
-
- Input (English-only): `` **aa`code`**aa ``
|
|
75
|
-
- Output (aggressive): `<p><strong>aa<code>code</code></strong>aa</p>`
|
|
76
|
-
- Output (default/compatible/markdown-it): `<p>**aa<code>code</code>**aa</p>`
|
|
77
|
-
- Input (Japanese mixed): `` これは**`code`**です ``
|
|
78
|
-
- Output (default/aggressive): `<p>これは<strong><code>code</code></strong>です</p>`
|
|
79
|
-
- Output (compatible/markdown-it): `<p>これは**<code>code</code>**です</p>`
|
|
55
|
+
### What Only `japanese-boundary-guard` Adds
|
|
80
56
|
|
|
81
|
-
|
|
57
|
+
`japanese-boundary-guard` adds an extra mixed JA/EN suppression guard:
|
|
82
58
|
|
|
59
|
+
- target: space-adjacent + ASCII-start segments (plain / quoted / link / code wrappers)
|
|
60
|
+
- goal: reduce unnatural conversions such as `* English*` or `* \`English\`*`
|
|
61
|
+
- applied consistently across run lengths (`*`, `**`, `***`, ...)
|
|
83
62
|
|
|
63
|
+
Representative differences:
|
|
84
64
|
|
|
85
|
-
|
|
65
|
+
- Input: `日本語です。* English* です。`
|
|
66
|
+
- `japanese-boundary`: `<p>日本語です。<em> English</em> です。</p>`
|
|
67
|
+
- `japanese-boundary-guard`: `<p>日本語です。* English* です。</p>`
|
|
86
68
|
|
|
87
|
-
|
|
69
|
+
- Input: `和食では* \`umami\`*を使う。`
|
|
70
|
+
- `japanese-boundary`: `<p>和食では<em> <code>umami</code></em>を使う。</p>`
|
|
71
|
+
- `japanese-boundary-guard`: `<p>和食では* <code>umami</code>*を使う。</p>`
|
|
88
72
|
|
|
89
|
-
|
|
90
|
-
[Markdown]
|
|
91
|
-
HTMLは「**HyperText Markup Language**」の略です。
|
|
92
|
-
[HTML]
|
|
93
|
-
<p>HTMLは「<strong>HyperText Markup Language</strong>」の略です。</p>
|
|
73
|
+
### Mode Selection Guide (Practical)
|
|
94
74
|
|
|
75
|
+
- default for user-facing prose: `japanese` (`japanese-boundary-guard`)
|
|
76
|
+
- strict markdown-it parity: `compatible`
|
|
77
|
+
- maximum recovery over predictability: `aggressive`
|
|
78
|
+
- niche use without guard suppression: `japanese-boundary`
|
|
95
79
|
|
|
96
|
-
|
|
97
|
-
HTMLは**「HyperText Markup Language」**の略です。
|
|
98
|
-
[HTML]
|
|
99
|
-
<p>HTMLは<strong>「HyperText Markup Language」</strong>の略です。</p>
|
|
80
|
+
### Example Corpus Notes
|
|
100
81
|
|
|
82
|
+
Detailed cases and visual outputs:
|
|
101
83
|
|
|
102
|
-
|
|
103
|
-
|
|
104
|
-
|
|
105
|
-
|
|
84
|
+
- `example/README.md`
|
|
85
|
+
- `example/mixed-ja-en-stars-mode.html`
|
|
86
|
+
- `example/mixed-ja-en-stars-mode.txt`
|
|
87
|
+
- `example/inline-wrapper-matrix.html`
|
|
106
88
|
|
|
89
|
+
## How `japanese` (`japanese-boundary-guard`) Decides (Step by Step)
|
|
107
90
|
|
|
108
|
-
|
|
109
|
-
|
|
110
|
-
[HTML]
|
|
111
|
-
<p>HTMLは<strong>「HyperText <em>Markup</em> <code>Language</code>」</strong>の略です。</p>
|
|
91
|
+
This section follows the runtime flow for `mode: 'japanese'` (which resolves to `japanese-boundary-guard`).
|
|
92
|
+
The flow has three layers: Step 1 builds the baseline with plain `markdown-it`; Steps 2-8 apply helper logic only where needed; Step 9 repairs link/reference-adjacent breakage.
|
|
112
93
|
|
|
94
|
+
Terms used below:
|
|
113
95
|
|
|
114
|
-
|
|
115
|
-
|
|
96
|
+
- Opening marker: `*` or `**` that starts emphasis.
|
|
97
|
+
- Closing marker: `*` or `**` that ends emphasis.
|
|
98
|
+
- Run: a contiguous group of the same marker (`*`, `**`, `***`, ...).
|
|
99
|
+
- Line: text split by `\n`.
|
|
116
100
|
|
|
117
|
-
|
|
118
|
-
[HTML]
|
|
119
|
-
<p>HTMLは**「HyperText Mark</p>
|
|
120
|
-
<p>up Language」**の略です。</p>
|
|
101
|
+
### TL;DR
|
|
121
102
|
|
|
103
|
+
- Baseline: start from plain `markdown-it` delimiter pairing.
|
|
104
|
+
- Local helper path: only `*` runs with local Japanese context enter strong-ja boundary logic.
|
|
105
|
+
- Mixed-text guard: `japanese-boundary-guard` additionally suppresses mixed JA/EN over-conversion.
|
|
106
|
+
- Postprocess: token-only repairs run only for malformed link/reference-adjacent spans.
|
|
122
107
|
|
|
123
|
-
[Markdown]
|
|
124
|
-
HTMLは\**「HyperText Markup Language」**の略です。
|
|
125
|
-
[HTML]
|
|
126
|
-
<p>HTMLは**「HyperText Markup Language」**の略です。</p>
|
|
127
108
|
|
|
109
|
+
### Step 1: Build the baseline with plain `markdown-it`
|
|
128
110
|
|
|
129
|
-
|
|
130
|
-
HTMLは\\**「HyperText Markup Language」**の略です。
|
|
131
|
-
[HTML]
|
|
132
|
-
<p>HTMLは\<strong>「HyperText Markup Language」</strong>の略です。</p>
|
|
111
|
+
`markdown-it` runs first. If it can already parse a pattern (including cross-line `**...**`), that baseline structure is kept.
|
|
133
112
|
|
|
113
|
+
Example:
|
|
134
114
|
|
|
135
|
-
|
|
136
|
-
|
|
137
|
-
[HTML]
|
|
138
|
-
<p>HTMLは\**「HyperText Markup Language」**の略です。</p>
|
|
115
|
+
- Input: `カツ**丼も\n人気**です`
|
|
116
|
+
- `markdown-it` / `japanese` / `compatible`: `<p>カツ<strong>丼も\n人気</strong>です</p>`
|
|
139
117
|
|
|
118
|
+
Positioning:
|
|
140
119
|
|
|
141
|
-
|
|
142
|
-
|
|
143
|
-
[HTML]
|
|
144
|
-
<p>HTMLは<code>**</code>は<strong>「HyperText Markup Language」</strong>の略です。</p>
|
|
120
|
+
- `mode: 'compatible'` mostly uses this baseline as-is.
|
|
121
|
+
- Other modes (`japanese`, `japanese-boundary`, `japanese-boundary-guard`, `aggressive`) may add helper logic in later steps.
|
|
145
122
|
|
|
146
|
-
|
|
147
|
-
HTMLは`**`は**「HyperText** <b>Markup</b> Language」の略です。
|
|
148
|
-
[HTML:false]
|
|
149
|
-
<p>HTMLは<code>**</code>は<strong>「HyperText</strong> <b>Markup</b> Language」の略です。</p>
|
|
150
|
-
[HTML:true]
|
|
151
|
-
<p>HTMLは<code>**</code>は<strong>「HyperText</strong> <b>Markup</b> Language」の略です。</p>
|
|
123
|
+
### Step 2: Decide whether Japanese helper logic should run
|
|
152
124
|
|
|
125
|
+
This decision is made per `*` run. `japanese` does not rewrite the whole line blindly. It checks non-whitespace characters adjacent to each run and only enters helper logic when local Japanese context exists.
|
|
153
126
|
|
|
154
|
-
|
|
155
|
-
HTMLは`**`は**「HyperText <b>Markup</b> Language」**の略です。
|
|
156
|
-
[HTML:false]
|
|
157
|
-
<p>HTMLは<code>**</code>は<strong>「HyperText <b>Markup</b> Language」</strong>の略です。</p>
|
|
158
|
-
[HTML:true]
|
|
159
|
-
<p>HTMLは<code>**</code>は<strong>「HyperText <b>Markup</b> Language」</strong>の略です。</p>
|
|
127
|
+
Japanese context here is mainly Hiragana, Katakana, Kanji (Han), and fullwidth punctuation/symbols. If adjacent context is mostly ASCII letters/numbers, the Step 1 result is kept.
|
|
160
128
|
|
|
129
|
+
Example that stays on baseline:
|
|
161
130
|
|
|
162
|
-
|
|
163
|
-
|
|
164
|
-
|
|
165
|
-
```
|
|
166
|
-
[HTML:false]
|
|
167
|
-
<pre><code>HTMLは`**`は**「HyperText Markup Language」**の略です。
|
|
168
|
-
</code></pre>
|
|
169
|
-
[HTML:true]
|
|
170
|
-
<pre><code>HTMLは`**`は**「HyperText Markup Language」**の略です。
|
|
171
|
-
</code></pre>
|
|
131
|
+
- Input: `**sushi.**umami**`
|
|
132
|
+
- Output (`japanese`): `<p>**sushi.<strong>umami</strong></p>`
|
|
133
|
+
- Why: local context is ASCII-side.
|
|
172
134
|
|
|
135
|
+
Example that proceeds to helper logic:
|
|
173
136
|
|
|
174
|
-
|
|
175
|
-
|
|
176
|
-
[HTML:false]
|
|
177
|
-
<p>HTMLは<strong>「HyperText <b>Markup</b> Language」</strong></p>
|
|
178
|
-
[HTML:true]
|
|
179
|
-
<p>HTMLは<strong>「HyperText <b>Markup</b> Language」</strong></p>
|
|
137
|
+
- Input: `**味噌汁。**umami**`
|
|
138
|
+
- Why: local Japanese context is adjacent.
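A rough way to picture the per-run check in Step 2 is a function that looks at the nearest non-whitespace character on each side of a `*` run. This is a minimal sketch under stated assumptions, not the plugin's implementation: the names `JAPANESE_CHAR`, `neighbor`, and `hasLocalJapaneseContext` are hypothetical, the Unicode ranges for punctuation are a guess at "fullwidth punctuation/symbols", and only BMP characters are handled.

```js
// Hiragana, Katakana, Han, plus CJK/fullwidth punctuation blocks.
// The explicit ranges are an assumption made for this sketch.
const JAPANESE_CHAR = /[\p{Script=Hiragana}\p{Script=Katakana}\p{Script=Han}\u3001-\u303F\u30A0-\u30FF\uFF00-\uFFEF]/u

// Nearest non-whitespace character from `index` in direction `step` (+1 / -1).
// Stops at a line break and returns '' when nothing is found.
function neighbor(text, index, step) {
  for (let i = index; i >= 0 && i < text.length; i += step) {
    const ch = text[i]
    if (ch === '\n') return ''
    if (ch !== ' ' && ch !== '\t') return ch
  }
  return ''
}

// True when the run occupying [runStart, runEnd) has Japanese context on either side.
function hasLocalJapaneseContext(text, runStart, runEnd) {
  const before = neighbor(text, runStart - 1, -1)
  const after = neighbor(text, runEnd, 1)
  return JAPANESE_CHAR.test(before) || JAPANESE_CHAR.test(after)
}

console.log(hasLocalJapaneseContext('**sushi.**umami**', 0, 2))
console.log(hasLocalJapaneseContext('**味噌汁。**umami**', 0, 2))
```

With the two inputs from the examples above, the ASCII-only line reports no Japanese context for its leading run, while the `味噌汁。` line does.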
|
|
180
139
|
|
|
181
|
-
|
|
182
|
-
これは**[text](url)**と**`code`**と**<b>HTML</b>**です
|
|
183
|
-
[HTML html:true]
|
|
184
|
-
<p>これは<strong><a href="url">text</a></strong>と<strong><code>code</code></strong>と<strong><b>HTML</b></strong>です</p>
|
|
140
|
+
### Step 3: Keep valid `markdown-it` direction decisions
|
|
185
141
|
|
|
142
|
+
`japanese` is baseline-first. It does not overwrite already-stable direction decisions. It only adds candidates where malformed input is likely to misdirect pairing.
|
|
186
143
|
|
|
187
|
-
|
|
188
|
-
HTMLは「**HyperText Markup Language**」
|
|
189
|
-
[HTML]
|
|
190
|
-
<p>HTMLは「<strong>HyperText Markup Language</strong>」</p>
|
|
144
|
+
Example that stays as-is:
|
|
191
145
|
|
|
192
|
-
|
|
193
|
-
|
|
194
|
-
[HTML]
|
|
195
|
-
<p>HTMLは<strong>「HyperText Markup Language」</strong>。</p>
|
|
146
|
+
- Input: `*寿司*は人気です。`
|
|
147
|
+
- Output: `<p><em>寿司</em>は人気です。</p>`
|
|
196
148
|
|
|
197
|
-
|
|
198
|
-
HTMLは**「HyperText Markup Language」**
|
|
199
|
-
[HTML]
|
|
200
|
-
<p>HTMLは<strong>「HyperText Markup Language」</strong></p>
|
|
149
|
+
Example that continues:
|
|
201
150
|
|
|
151
|
+
- Input: `*味噌汁。*umai*`
|
|
152
|
+
- Why: leaving the first `*` literal can make the later pair win (`*味噌汁。<em>umai</em>`), so local correction checks whether Japanese-side pairing should be preferred.
|
|
202
153
|
|
|
203
|
-
|
|
204
|
-
HTMLは**「HyperText Markup Language」**。
|
|
205
|
-
[HTML]
|
|
206
|
-
<p>HTMLは<strong>「HyperText Markup Language」</strong>。</p>
|
|
154
|
+
### Step 4: Use same-line local context only
|
|
207
155
|
|
|
208
|
-
|
|
209
|
-
***強調と*入れ子*の検証***を行う。
|
|
210
|
-
[HTML]
|
|
211
|
-
<p><em><em><em>強調と</em>入れ子</em>の検証</em>**を行う。</p>
|
|
156
|
+
Local helper checks only read non-whitespace characters on the same line. They do not bridge across `\n`.
|
|
212
157
|
|
|
213
|
-
|
|
214
|
-
****
|
|
215
|
-
[HTML]
|
|
216
|
-
<hr>
|
|
158
|
+
Example:
|
|
217
159
|
|
|
218
|
-
|
|
219
|
-
|
|
220
|
-
|
|
221
|
-
<p>a****b</p>
|
|
160
|
+
- Input: `*味噌汁。\n*umai*`
|
|
161
|
+
- Output (`japanese`): `<p>*味噌汁。\n<em>umai</em></p>`
|
|
162
|
+
- Why: the first `*` does not see the next line.
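The same-line restriction can be sketched by first computing the bounds of the line that contains a marker and then listing only the `*` runs inside those bounds. The helper names `lineBounds` and `starRunsOnSameLine` are hypothetical and exist only for this illustration.

```js
// Return the [start, end) bounds of the line that contains `index`.
function lineBounds(text, index) {
  const start = index > 0 ? text.lastIndexOf('\n', index - 1) + 1 : 0
  const nl = text.indexOf('\n', index)
  return [start, nl === -1 ? text.length : nl]
}

// List the `*` runs that share a line with the character at `index`.
function starRunsOnSameLine(text, index) {
  const [start, end] = lineBounds(text, index)
  const runs = []
  for (const m of text.slice(start, end).matchAll(/\*+/g)) {
    runs.push({ index: start + m.index, length: m[0].length })
  }
  return runs
}

const text = '*味噌汁。\n*umai*'
console.log(starRunsOnSameLine(text, 0)) // runs visible from the first line
console.log(starRunsOnSameLine(text, 6)) // runs visible from the second line
```

For the example input, the first `*` sees only itself on its line, so no same-line partner exists for it, while the second line contains a complete pair.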
|
|
222
163
|
|
|
223
|
-
|
|
224
|
-
a****
|
|
225
|
-
[HTML]
|
|
226
|
-
<p>a****</p>
|
|
227
|
-
````
|
|
164
|
+
### Step 5 (`japanese-boundary-guard` only): Suppress mixed JA/EN over-conversion
|
|
228
165
|
|
|
166
|
+
This step exists only in `japanese-boundary-guard`. It suppresses emphasis when the segment is space-adjacent and ASCII-start, to avoid unnatural emphasis around English fragments.
|
|
229
167
|
|
|
230
|
-
|
|
168
|
+
Representative differences:
|
|
231
169
|
|
|
232
|
-
|
|
170
|
+
- Input: `日本語です。* English* です。`
|
|
171
|
+
- `japanese-boundary`: `<p>日本語です。<em> English</em> です。</p>`
|
|
172
|
+
- `japanese-boundary-guard`: `<p>日本語です。* English* です。</p>`
|
|
233
173
|
|
|
234
|
-
|
|
235
|
-
|
|
236
|
-
|
|
237
|
-
|
|
238
|
-
|
|
239
|
-
|
|
174
|
+
- Input: `和食では* \`umami\`*を使う。`
|
|
175
|
+
- `japanese-boundary`: `<p>和食では<em> <code>umami</code></em>を使う。</p>`
|
|
176
|
+
- `japanese-boundary-guard`: `<p>和食では* <code>umami</code>*を使う。</p>`
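As a rough approximation of the "space-adjacent and ASCII-start" condition, the check below looks at the text between the markers and asks whether it begins with whitespace followed by a printable ASCII character. `isGuardCandidate` is a hypothetical name, and the real guard may consider more context than this sketch does.

```js
// True when the inner segment starts with whitespace followed by printable ASCII,
// e.g. the inner text of `* English*` or `* \`English\`*`.
function isGuardCandidate(inner) {
  return /^[ \t]+[\x21-\x7E]/.test(inner)
}

for (const inner of [' English', ' "English"', ' [English](u)', ' `umami`', 'English', ' 寿司']) {
  console.log(JSON.stringify(inner), isGuardCandidate(inner))
}
```

Plain, quoted, link, and code-wrapped English fragments all start with printable ASCII after the space, which is why a single character-class test is enough for this illustration.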
|
|
177
|
+
|
|
178
|
+
### Step 6: Apply extra direction correction only to single `*`
|
|
179
|
+
|
|
180
|
+
Extra direction correction is applied only to run length `1` (`*`), where malformed inputs most often flip opener/closer direction.
|
|
181
|
+
|
|
182
|
+
Example:
|
|
183
|
+
|
|
184
|
+
- Input: `*味噌汁。*umai*`
|
|
185
|
+
- `japanese` / `aggressive`: `<p><em>味噌汁。</em>umai*</p>`
|
|
186
|
+
- `compatible` / `markdown-it`: `<p>*味噌汁。<em>umai</em></p>`
|
|
187
|
+
|
|
188
|
+
Additional boundary rule:
|
|
189
|
+
|
|
190
|
+
- Backward scan for previous single-`*` stops at sentence punctuation (`。`, `!`, `?`, `.`, `!`, `?`, `‼`, `⁇`, `⁈`, `⁉`) unless that punctuation is immediately adjacent to the current marker.
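The boundary rule can be pictured as a backward scan that gives up when it meets sentence punctuation, except when that punctuation sits directly before the current marker. This is a simplified sketch, not the plugin's code: `SENTENCE_PUNCT`, `isSingleStar`, and `findPreviousSingleStar` are hypothetical names, and including the fullwidth forms `.`, `!`, `?` in the set is an assumption.

```js
// Sentence punctuation that ends the backward scan (fullwidth forms are assumed).
const SENTENCE_PUNCT = new Set(['。', '.', '!', '?', '.', '!', '?', '‼', '⁇', '⁈', '⁉'])

// A single `*` is a star with no `*` directly before or after it.
function isSingleStar(line, i) {
  return line[i] === '*' && line[i - 1] !== '*' && line[i + 1] !== '*'
}

// Index of the previous single `*` on the line, or -1 when the scan is stopped.
function findPreviousSingleStar(line, markerIndex) {
  for (let i = markerIndex - 1; i >= 0; i--) {
    if (isSingleStar(line, i)) return i
    const adjacent = i === markerIndex - 1
    if (SENTENCE_PUNCT.has(line[i]) && !adjacent) return -1
  }
  return -1
}

console.log(findPreviousSingleStar('*味噌汁。*umai*', 5))
console.log(findPreviousSingleStar('*寿司。うまい*', 7))
```

In the first call the `。` is immediately adjacent to the marker, so the scan continues and reaches the opening `*`. In the second call the `。` lies in the middle of the span, so the scan stops before reaching any earlier marker.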
|
|
191
|
+
|
|
192
|
+
### Step 7: Do not apply Step 6 single-star correction to `**` and longer runs
|
|
193
|
+
|
|
194
|
+
Runs of `**` and longer (`***`, `****`, `*****+`) still use baseline `markdown-it` decisions and Japanese relaxations. Only the single-star-specific correction from Step 6 is excluded.
|
|
195
|
+
|
|
196
|
+
Example:
|
|
197
|
+
|
|
198
|
+
- Input: `**味噌汁。**umami**という表現を使います。`
|
|
199
|
+
- `japanese`: `<p><strong>味噌汁。</strong>umami**という表現を使います。</p>`
|
|
200
|
+
- `compatible`: `<p>**味噌汁。<strong>umami</strong>という表現を使います。</p>`
|
|
201
|
+
|
|
202
|
+
### Step 8: Build emphasis pairs normally; keep literals when forcing is unsafe
|
|
203
|
+
|
|
204
|
+
After direction candidates are fixed, normal inline pairing builds final tokens. If forcing tags looks unsafe, markers are left literal.
|
|
205
|
+
|
|
206
|
+
Example:
|
|
207
|
+
|
|
208
|
+
- Input: `**[**[x](v)](u)**`
|
|
209
|
+
- Output: `<p><strong>[</strong><a href="v">x</a>](u)**</p>`
|
|
210
|
+
|
|
211
|
+
### Step 9: Repair link/reference-adjacent breakage after pairing
|
|
212
|
+
|
|
213
|
+
Steps 1-8 decide marker direction and pairing. Step 9 is a separate phase that only adjusts malformed spans around links/references. Option name: `postprocess`.
|
|
214
|
+
|
|
215
|
+
#### Step 9-1: Collapsed reference matching follows `markdown-it` normalization
|
|
216
|
+
|
|
217
|
+
##### 9-1A: Collapsed reference matching (`[label][]`)
|
|
218
|
+
|
|
219
|
+
Collapsed reference matching (`[label][]`) follows `markdown-it` key normalization. strong-ja does not force matching by deleting `*`/`**` markers from labels.
|
|
220
|
+
|
|
221
|
+
Mismatch example:
|
|
222
|
+
|
|
223
|
+
```markdown
|
|
224
|
+
献立は「[**寿司**][]」です。
|
|
225
|
+
|
|
226
|
+
[寿司]: https://example.com/
|
|
240
227
|
```
|
|
241
228
|
|
|
242
|
-
|
|
243
|
-
|
|
244
|
-
|
|
229
|
+
```html
|
|
230
|
+
<p>献立は「[<strong>寿司</strong>][]」です。</p>
|
|
231
|
+
```
|
|
245
232
|
|
|
246
|
-
|
|
233
|
+
Match example:
|
|
247
234
|
|
|
248
|
-
|
|
235
|
+
```markdown
|
|
236
|
+
献立は「[**寿司**][]」です。
|
|
249
237
|
|
|
250
|
-
|
|
238
|
+
[**寿司**]: https://example.com/
|
|
239
|
+
```
|
|
251
240
|
|
|
252
|
-
```
|
|
253
|
-
|
|
254
|
-
postprocess: false
|
|
255
|
-
})
|
|
241
|
+
```html
|
|
242
|
+
<p>献立は「<a href="https://example.com/"><strong>寿司</strong></a>」です。</p>
|
|
256
243
|
```
|
|
257
244
|
|
|
245
|
+
##### 9-1B: Inline link handling (`[text](url)`)
|
|
246
|
+
|
|
247
|
+
- `[text](url)` does not do collapsed-reference label matching.
|
|
248
|
+
- Step 9 only adjusts malformed `*` / `**` wrappers around links.
|
|
249
|
+
- It never forces matching by deleting markers.
|
|
250
|
+
|
|
251
|
+
Examples:
|
|
252
|
+
|
|
253
|
+
- Input: `メニューではmenu**[ramen](url)**と書きます。`
|
|
254
|
+
- `japanese` / `japanese-boundary` / `japanese-boundary-guard`: `<p>メニューではmenu**<a href="url">ramen</a>**と書きます。</p>`
|
|
255
|
+
- `aggressive`: `<p>メニューではmenu<strong><a href="url">ramen</a></strong>と書きます。</p>`
|
|
256
|
+
- `compatible` / `markdown-it`: `<p>メニューではmenu**<a href="url">ramen</a>**と書きます。</p>`
|
|
257
|
+
|
|
258
|
+
- Input: `説明文ではこれは**[寿司](url)**です。`
|
|
259
|
+
- `japanese` / `japanese-boundary` / `japanese-boundary-guard` / `aggressive`: `<p>説明文ではこれは<strong><a href="url">寿司</a></strong>です。</p>`
|
|
260
|
+
- `compatible` / `markdown-it`: `<p>説明文ではこれは**<a href="url">寿司</a>**です。</p>`
|
|
261
|
+
|
|
262
|
+
##### 9-1C: Inline code / symbol wrapper handling
|
|
263
|
+
|
|
264
|
+
- Input: `昼食は**\`code\`**の話です。`
|
|
265
|
+
- `japanese` / `japanese-boundary` / `japanese-boundary-guard` / `aggressive`: `<p>昼食は<strong><code>code</code></strong>の話です。</p>`
|
|
266
|
+
- `compatible` / `markdown-it`: `<p>昼食は**<code>code</code>**の話です。</p>`
|
|
267
|
+
|
|
268
|
+
- Input: `注記では**aa\`stock\`**aaという記法を試します。`
|
|
269
|
+
- `japanese` / `japanese-boundary` / `japanese-boundary-guard` / `compatible` / `markdown-it`: `<p>注記では**aa<code>stock</code>**aaという記法を試します。</p>`
|
|
270
|
+
- `aggressive`: `<p>注記では<strong>aa<code>stock</code></strong>aaという記法を試します。</p>`
|
|
271
|
+
|
|
272
|
+
- Input: `お店の場所は**{}()**です。`
|
|
273
|
+
- `japanese` / `japanese-boundary` / `japanese-boundary-guard` / `aggressive`: `<p>お店の場所は<strong>{}()</strong>です。</p>`
|
|
274
|
+
- `compatible` / `markdown-it`: `<p>お店の場所は**{}()**です。</p>`
|
|
275
|
+
|
|
276
|
+
#### Step 9-2: Which modes run Step 9
|
|
277
|
+
|
|
278
|
+
Step 9 runs in:
|
|
279
|
+
|
|
280
|
+
- `japanese-boundary`
|
|
281
|
+
- `japanese-boundary-guard` (therefore also `japanese`)
|
|
282
|
+
- `aggressive`
|
|
283
|
+
|
|
284
|
+
Step 9 is skipped in:
|
|
285
|
+
|
|
286
|
+
- `compatible` (to keep plain `markdown-it` parity)
|
|
287
|
+
|
|
288
|
+
Step 9 mainly targets malformed `*` / `**` around links and collapsed refs. Spans that cross inline code, inline HTML, images, or autolinks are kept as-is.
|
|
289
|
+
|
|
290
|
+
#### Step 9-3: Why Step 9 can skip rewrites or normalize tokens
|
|
291
|
+
|
|
292
|
+
Step 9 is intentionally conservative. It prefers stable output over maximum conversion, so it skips rewrites when:
|
|
293
|
+
|
|
294
|
+
- emphasis/link repair signals are weak
|
|
295
|
+
- the span is low-confidence (`***` noise, underscore-heavy mix, code involvement, wrapper imbalance)
|
|
296
|
+
- the malformed shape does not match known safe repair patterns
|
|
297
|
+
|
|
298
|
+
Even when rewrite succeeds, token arrangement can be normalized while rendered HTML stays equivalent. For example, `[` / `]` / `[]` may become separate text tokens. The runtime path is strict token-only (no inline reparse fallback).
|
|
299
|
+
|
|
300
|
+
Example (low-confidence span is preserved):
|
|
301
|
+
|
|
302
|
+
- Input: `注記では**aa\`stock\`***tail*です。`
|
|
303
|
+
- `japanese` / `compatible`: `<p>注記では**aa<code>stock</code>**<em>tail</em>です。</p>`
|
|
304
|
+
- Reason: mixed `**` and `*` around code is low-confidence, so literal `**` is preserved.
|
|
305
|
+
|
|
306
|
+
In short, for ambiguous malformed input, strong-ja prioritizes safe/readable output over maximum conversion.
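To make the "skip when low-confidence" idea concrete, the sketch below flags a span using the four signals listed above (`***` noise, underscore-heavy mix, code involvement, wrapper imbalance). It is a loose illustration only: `isLowConfidenceSpan` is a hypothetical name, the underscore threshold is invented for the example, and the plugin works on tokens rather than on raw strings.

```js
// Heuristic, string-based stand-in for the low-confidence signals named above.
function isLowConfidenceSpan(span) {
  const count = (re) => (span.match(re) || []).length
  const tripleStarNoise = /\*{3,}/.test(span)
  const underscoreHeavy = count(/_/g) >= 2 // threshold is an assumption
  const touchesCode = span.includes('`')
  const wrapperImbalance = count(/\[/g) !== count(/\]/g)
  return tripleStarNoise || underscoreHeavy || touchesCode || wrapperImbalance
}

console.log(isLowConfidenceSpan('**aa`stock`***tail*'))
console.log(isLowConfidenceSpan('**[寿司](url)**'))
```

The first input combines a `***` run with inline code, so a conservative repair phase would leave it alone; the second is a clean wrapper around a link and does not trip any of the signals.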
|
|
307
|
+
|
|
308
|
+
## Behavior Examples
|
|
309
|
+
|
|
310
|
+
Representative cases only (full corpus: `test/readme-mode.txt`).
|
|
311
|
+
|
|
312
|
+
Supporting visuals:
|
|
313
|
+
|
|
314
|
+
- `example/inline-wrapper-matrix.html`
|
|
315
|
+
- `example/mixed-ja-en-stars-mode.html`
|
|
316
|
+
|
|
317
|
+
### 1) Baseline Japanese punctuation case
|
|
318
|
+
|
|
319
|
+
- Input: `**「だし」**は和食の基本です。`
|
|
320
|
+
- `japanese` / `aggressive`: `<p><strong>「だし」</strong>は和食の基本です。</p>`
|
|
321
|
+
- `compatible` / `markdown-it`: `<p>**「だし」**は和食の基本です。</p>`
|
|
322
|
+
|
|
323
|
+
### 2) Mixed JA/EN mode differences
|
|
324
|
+
|
|
325
|
+
- Input: `**天ぷら。**crunch**という表現を使います。`
|
|
326
|
+
- `japanese` / `aggressive`: `<p><strong>天ぷら。</strong>crunch**という表現を使います。</p>`
|
|
327
|
+
- `compatible` / `markdown-it`: `<p>**天ぷら。<strong>crunch</strong>という表現を使います。</p>`
|
|
328
|
+
|
|
329
|
+
- Input: `日本語です。* English* です。`
|
|
330
|
+
- `japanese-boundary`: `<p>日本語です。<em> English</em> です。</p>`
|
|
331
|
+
- `japanese-boundary-guard` / `compatible`: `<p>日本語です。* English* です。</p>`
|
|
332
|
+
|
|
333
|
+
### 3) Safety-first malformed handling
|
|
334
|
+
|
|
335
|
+
- Input: `**[**[x](v)](u)**`
|
|
336
|
+
- All modes: `<p><strong>[</strong><a href="v">x</a>](u)**</p>`
|
|
337
|
+
|
|
338
|
+
- Input: `注記では**aa\`stock\`***tail*です。`
|
|
339
|
+
- `japanese` / `compatible`: `<p>注記では**aa<code>stock</code>**<em>tail</em>です。</p>`
|
|
340
|
+
- Low-confidence span: keep literal `**` instead of risky forced conversion.
|
|
341
|
+
|
|
342
|
+
### 4) Inline link/code adjacency
|
|
343
|
+
|
|
344
|
+
- Input: `説明文ではこれは**[ラーメン](url)**です。`
|
|
345
|
+
- `japanese` / `aggressive`: `<p>説明文ではこれは<strong><a href="url">ラーメン</a></strong>です。</p>`
|
|
346
|
+
- `compatible` / `markdown-it`: `<p>説明文ではこれは**<a href="url">ラーメン</a>**です。</p>`
|
|
347
|
+
|
|
348
|
+
- Input: `注記では**aa\`stock\`**aaという記法を試します。`
|
|
349
|
+
- `japanese` / `compatible` / `markdown-it`: `<p>注記では**aa<code>stock</code>**aaという記法を試します。</p>`
|
|
350
|
+
- `aggressive`: `<p>注記では<strong>aa<code>stock</code></strong>aaという記法を試します。</p>`
|
|
351
|
+
|
|
352
|
+
### 5) Pure-English malformed tail (`aggressive` delta)
|
|
353
|
+
|
|
354
|
+
- Input: `broken **tail [aa**aa***Text***and*More*bb**bb](https://x.test) after`
|
|
355
|
+
- `japanese` / `compatible` / `markdown-it`:
|
|
356
|
+
`<p>broken **tail <a href="https://x.test">aa<strong>aa</strong><em>Text</em><em><em>and</em>More</em>bb**bb</a> after</p>`
|
|
357
|
+
- `aggressive`:
|
|
358
|
+
`<p>broken **tail <a href="https://x.test">aa<strong>aa</strong><em>Text</em><strong>and<em>More</em>bb</strong>bb</a> after</p>`
|
|
359
|
+
|
|
360
|
+
## Options
|
|
361
|
+
|
|
362
|
+
### `mode`
|
|
363
|
+
|
|
364
|
+
- Type: `'japanese' | 'japanese-boundary' | 'japanese-boundary-guard' | 'aggressive' | 'compatible'`
|
|
365
|
+
- Default: `'japanese'`
|
|
366
|
+
|
|
367
|
+
### `mditAttrs`
|
|
368
|
+
|
|
369
|
+
- Type: `boolean`
|
|
258
370
|
- Default: `true`
|
|
259
|
-
- Set `false`
|
|
371
|
+
- Set `false` if your stack does not use `markdown-it-attrs`.
|
|
260
372
|
|
|
261
|
-
###
|
|
373
|
+
### `postprocess`
|
|
262
374
|
|
|
263
|
-
|
|
375
|
+
- Type: `boolean`
|
|
376
|
+
- Default: `true`
|
|
377
|
+
- Set `false` to disable link/reference postprocess repairs.
|
|
378
|
+
- In `mode: 'compatible'`, repairs are skipped even when this is `true`.
|
|
379
|
+
- Repairs stay local to malformed link/reference-adjacent spans; valid inputs such as `[w](u) *string* [w](u)` are left unchanged.
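The interaction between `postprocess` and `mode` described above reduces to a single condition. The function name `shouldRunPostprocess` is hypothetical; only the option names and defaults come from this README.

```js
// Repairs run only when `postprocess` is not disabled and the mode is not `compatible`.
function shouldRunPostprocess({ mode = 'japanese', postprocess = true } = {}) {
  return postprocess !== false && mode !== 'compatible'
}

console.log(shouldRunPostprocess())
console.log(shouldRunPostprocess({ mode: 'compatible', postprocess: true }))
console.log(shouldRunPostprocess({ mode: 'aggressive', postprocess: false }))
```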
|
|
264
380
|
|
|
265
|
-
|
|
266
|
-
|
|
267
|
-
|
|
268
|
-
|
|
269
|
-
|
|
270
|
-
|
|
381
|
+
### `coreRulesBeforePostprocess`
|
|
382
|
+
|
|
383
|
+
- Type: `string[]`
|
|
384
|
+
- Default: `[]`
|
|
385
|
+
- Names of core rules that must run before `strong_ja_token_postprocess`.
|
|
386
|
+
|
|
387
|
+
### `patchCorePush`
|
|
271
388
|
|
|
389
|
+
- Type: `boolean`
|
|
272
390
|
- Default: `true`
|
|
273
|
-
-
|
|
274
|
-
|
|
391
|
+
- Helper hook to keep rule order stable when `mditAttrs: false` and `cjk_breaks` is registered later.
|
|
392
|
+
|
|
393
|
+
### About `markdown-it` `breaks`
|
|
394
|
+
|
|
395
|
+
`breaks` is controlled by `markdown-it` itself. This plugin does not override `md.options.breaks`. However, with `cjk_breaks`, compatibility handling may adjust softbreak-related tokens, so rendered line-break behavior can still differ in some cases.
|
|
396
|
+
|
|
397
|
+
## Notes
|
|
398
|
+
|
|
399
|
+
- Use `state.env.__strongJaTokenOpt` to override options per render.
|
|
400
|
+
- Overrides are merged with plugin options, but setup-time behavior (such as rule registration/order) cannot be switched at render time and cannot be retrofitted after the first `.use(...)` on the same `MarkdownIt` instance.
|
|
401
|
+
- `mode` and `postprocess` take effect at render time. `mditAttrs`, `patchCorePush`, and `coreRulesBeforePostprocess` take effect at setup time and are fixed by the first `.use(...)` on a `MarkdownIt` instance.
|
|
402
|
+
- This is an ESM plugin (`type: module`) and is tested against `markdown-it` 14.x in Node.js, browser bundlers, and VS Code extension pipelines that use `markdown-it` ESM.
|
|
403
|
+
- The implementation relies on `markdown-it` internal ESM modules / core rule internals (`lib/token.mjs`, `lib/common/utils.mjs`, `ruler.__rules__`) plus a `scanDelims` prototype patch, so internal `markdown-it` changes may require plugin updates.
|
|
404
|
+
- The `scanDelims` patch is applied once per `MarkdownIt` prototype within the same process.
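The split between render-time and setup-time options can be sketched as a merge that only lets render-time keys through. This is a simplified model under stated assumptions: `RUNTIME_KEYS` and `effectiveOptions` are hypothetical names, and the plugin's actual merge may carry setup-time keys along without acting on them.

```js
// Options that the notes above describe as effective per render.
const RUNTIME_KEYS = ['mode', 'postprocess']

// Merge per-render overrides from `env.__strongJaTokenOpt` into the plugin options,
// ignoring keys that only matter at setup time.
function effectiveOptions(pluginOptions, env = {}) {
  const override = env.__strongJaTokenOpt || {}
  const merged = { ...pluginOptions }
  for (const key of RUNTIME_KEYS) {
    if (override[key] !== undefined) merged[key] = override[key]
  }
  return merged
}

const base = { mode: 'japanese', postprocess: true, mditAttrs: true }
console.log(effectiveOptions(base, { __strongJaTokenOpt: { mode: 'compatible', mditAttrs: false } }))
```

In this model the `mode` override is applied for the render, while the `mditAttrs` override is dropped because rule registration has already happened.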
|
|
405
|
+
|