@peaceroad/markdown-it-figure-with-p-caption 0.16.1 → 0.18.0
- package/README.md +22 -17
- package/embeds/detect.js +182 -0
- package/embeds/providers.js +30 -0
- package/index.js +429 -344
- package/package.json +7 -6
package/README.md
CHANGED
|
@@ -15,10 +15,10 @@ Optionally, you can auto-number image and table caption paragraphs starting from
|
|
|
15
15
|
- Pure image paragraphs (`![alt](src)`) become `<figure class="f-img">` blocks as soon as a caption paragraph (previous or next) or an auto-detected caption exists.
|
|
16
16
|
- Auto detection runs per image paragraph when `autoCaptionDetection` is `true` (default). The priority is:
|
|
17
17
|
1. Caption paragraphs immediately before or after the image (standard syntax).
|
|
18
|
-
2. Image `alt` text that
|
|
18
|
+
2. Image `alt` text that `p7d-markdown-it-p-captions` recognizes as an image caption start (`Figure. `, `Figure 1. `, `図 `, `図1 `, etc.).
|
|
19
19
|
3. Image `title` attribute that matches the same labels.
|
|
20
20
|
4. Optional fallbacks (`autoAltCaption`, `autoTitleCaption`) that inject the label when the alt/title lacks one.
|
|
21
|
-
- `autoAltCaption`: `false` (default), `true`, or a string label. `true`
|
|
21
|
+
- `autoAltCaption`: `false` (default), `true`, or a string label. `true` uses locale-aware generated-label defaults from `p7d-markdown-it-p-captions`, so the label text and punctuation stay aligned with the upstream caption language data. A string is treated as a label stem that must be recognized by `p7d-markdown-it-p-captions`; setup throws if it cannot be parsed as an image caption label. This plugin appends the default joint/space unless the string already ends with punctuation such as `.` / `。` / `:`. Empty alt text does not generate a fallback caption.
|
|
22
22
|
- `autoTitleCaption`: same behavior but sourced from the image `title`. It stays off by default so other plugins can keep using the `title` attribute for metadata.
|
|
23
23
|
- Set `autoCaptionDetection: false` to disable the auto-caption workflow entirely.
|
|
24
24
|
- Multi-image paragraphs are still wrapped as one figure when `multipleImages: true` (default). Layout-specific classes help with styling:
|
|
@@ -41,7 +41,7 @@ Optionally, you can auto-number image and table caption paragraphs starting from
|
|
|
41
41
|
|
|
42
42
|
### Blockquote
|
|
43
43
|
|
|
44
|
-
- Captioned blockquotes (e.g.,
|
|
44
|
+
- Captioned blockquotes (e.g., `Source. A paragraph.` written immediately before or after `> ...`) become `<figure class="f-blockquote">` while keeping the original blockquote intact.
|
|
45
45
|
|
|
46
46
|
### Video & Audio
|
|
47
47
|
|
|
@@ -50,9 +50,9 @@ Optionally, you can auto-number image and table caption paragraphs starting from
|
|
|
50
50
|
|
|
51
51
|
### Embedded content by iframe
|
|
52
52
|
|
|
53
|
-
- Inline HTML `<iframe>` elements become `<figure class="f-video">` when they point to known video hosts (YouTube `youtube.com`
|
|
53
|
+
- Inline HTML `<iframe>` elements become `<figure class="f-video">` when they point to known video hosts (YouTube `www.youtube.com`, `youtube.com`, `www.youtube-nocookie.com`, `youtube-nocookie.com`, Vimeo `player.vimeo.com`).
|
|
54
54
|
- `<div>` wrappers are treated as iframe-type embeds only when the same HTML block contains an `<iframe ...>` tag (for example common video wrapper markup).
|
|
55
|
-
- Blockquote-based social embeds (Twitter/X `twitter-tweet`, Mastodon `mastodon-embed`, Bluesky `bluesky-embed`, Instagram `instagram-media`, Tumblr `text-post-media`) are treated like iframe-type embeds when their
|
|
55
|
+
- Blockquote-based social embeds (Twitter/X `twitter-tweet`, Mastodon `mastodon-embed`, Bluesky `bluesky-embed`, Instagram `instagram-media`, Tumblr `text-post-media`) are treated like iframe-type embeds when their class list contains one of those provider classes. Extra classes on the same blockquote do not block detection. By default they become `<figure class="f-img">` so the caption label behaves like an image label (quote labels can also be used). You can override that figure class with `figureClassThatWrapsIframeTypeBlockquote` or the global `allIframeTypeFigureClassName`.
|
|
56
56
|
- `p7d-markdown-it-p-captions` ships with a `Slide.` label. When you use it (for example with Speaker Deck or SlideShare iframes), the `<figure>` wrapper automatically switches to `f-slide` (or whatever you set via `figureClassThatWrapsSlides`) so slides can get their own layout. If `allIframeTypeFigureClassName` is also configured, that class takes precedence even for slides, so you get a uniform embed wrapper without touching the slide option.
|
|
57
57
|
- All other iframes fall back to `<figure class="f-iframe">` unless you override the class via `allIframeTypeFigureClassName`.
|
|
58
58
|
|
|
@@ -60,7 +60,8 @@ Optionally, you can auto-number image and table caption paragraphs starting from
|
|
|
60
60
|
|
|
61
61
|
- The label inside the figcaption (the `span` element used for the label) is generated by `p7d-markdown-it-p-captions`, not by this plugin. By default the class name is formed by combining `classPrefix` with the mark name, producing names such as `f-img-label`, `f-video-label`, `f-blockquote-label`, and `f-slide-label`.
|
|
62
62
|
- With `markdown-it-attrs`, attributes attached to image-only paragraphs (for example ` {.foo #bar}`) are forwarded to the generated `<figure>`.
|
|
63
|
-
- `styleProcess` controls parsing of a trailing `{...}` block from the last text token of an image-only paragraph in this plugin's own scanner. It is a narrow fallback parser, not full `markdown-it-attrs` parity, and attributes already attached to paragraph tokens by `markdown-it-attrs` are still forwarded.
|
|
63
|
+
- `styleProcess` controls parsing of a trailing `{...}` block from the last text token of an image-only paragraph in this plugin's own scanner. It supports simple `.class`, `#id`, bare attributes, and quoted `key="value with spaces"` / `key='value with spaces'` pairs. It is still a narrow fallback parser, not full `markdown-it-attrs` parity, and attributes already attached to paragraph tokens by `markdown-it-attrs` are still forwarded.
|
|
64
|
+
- Attribute forwarding is not sanitization. If you render untrusted Markdown, keep using an HTML sanitizer or a trusted-host policy appropriate for your application; this plugin only decides which already-parsed or narrowly parsed attributes move onto `<figure>`.
|
|
64
65
|
- Attributes attached to caption paragraphs stay on the converted `<figcaption>` token after paragraph-to-figcaption conversion.
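The trailing-attrs syntax that `styleProcess` is described as accepting (`.class`, `#id`, bare attributes, quoted `key="value with spaces"` pairs) can be illustrated with a small standalone parser. This is only a sketch under stated assumptions: `parseTrailingAttrs` is a hypothetical name, the regular expressions are not taken from the plugin, and the plugin's real scanner may accept or reject inputs differently.

```js
// Hypothetical sketch, NOT the plugin's parser: read a trailing `{...}` block
// and return [name, value] pairs, or null when no trailing block exists.
const parseTrailingAttrs = (text) => {
  const block = text.match(/\{([^{}]*)\}\s*$/)
  if (!block) return null
  const attrs = []
  const classes = []
  // Alternatives: .class | #id | key, key=value, key="v w", key='v w'
  const tokenReg = /\.([^\s.#="']+)|#([^\s.#="']+)|([^\s="'.#]+)(?:=(?:"([^"]*)"|'([^']*)'|([^\s"']+)))?/g
  let m
  while ((m = tokenReg.exec(block[1])) !== null) {
    if (m[1] !== undefined) classes.push(m[1])
    else if (m[2] !== undefined) attrs.push(['id', m[2]])
    else attrs.push([m[3], m[4] ?? m[5] ?? m[6] ?? '']) // bare attribute gets ''
  }
  if (classes.length) attrs.unshift(['class', classes.join(' ')])
  return attrs
}

const parsed = parseTrailingAttrs('![cat](cat.jpg) {.notice #hero loading=lazy title="A sleepy cat"}')
```

As the README bullet on sanitization notes, a parser like this only decides which attributes move onto `<figure>`; it does not make untrusted input safe.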
|
|
65
66
|
|
|
66
67
|
## Behavior Customization
|
|
@@ -75,7 +76,8 @@ Optionally, you can auto-number image and table caption paragraphs starting from
|
|
|
75
76
|
|
|
76
77
|
### Wrapping without captions
|
|
77
78
|
|
|
78
|
-
- `
|
|
79
|
+
- `imageOnlyParagraphWithoutCaption`: turn valid image-only paragraphs into `<figure>` elements even when no caption paragraph/auto caption is present. This includes single-image paragraphs and, when `multipleImages` is enabled, multi-image paragraphs that receive classes such as `f-img-horizontal`, `f-img-vertical`, or `f-img-multiple`. This is independent of automatic detection.
|
|
80
|
+
- `oneImageWithoutCaption`: legacy alias for `imageOnlyParagraphWithoutCaption`. When both are provided, `imageOnlyParagraphWithoutCaption` wins.
|
|
79
81
|
- `videoWithoutCaption`, `audioWithoutCaption`, `iframeWithoutCaption`, `iframeTypeBlockquoteWithoutCaption`: wrap the respective media blocks even when no caption is present.
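The alias rule for `imageOnlyParagraphWithoutCaption` / `oneImageWithoutCaption` can be pictured as a small resolver. This is a minimal sketch, assuming the new name wins whenever it is set at all; `resolveImageOnlyWrap` is a hypothetical helper, and how the plugin treats `undefined` or non-boolean values is not stated in the README.

```js
// Hypothetical resolver: prefer the new option name, fall back to the legacy alias.
const resolveImageOnlyWrap = (option = {}) => {
  if (option.imageOnlyParagraphWithoutCaption !== undefined) {
    return !!option.imageOnlyParagraphWithoutCaption
  }
  return !!option.oneImageWithoutCaption
}

const bothSet = resolveImageOnlyWrap({
  imageOnlyParagraphWithoutCaption: false,
  oneImageWithoutCaption: true,
})
const legacyOnly = resolveImageOnlyWrap({ oneImageWithoutCaption: true })
```

With both options present, the explicit `false` on the new name overrides the legacy `true`, which matches the documented precedence.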
|
|
80
82
|
|
|
81
83
|
### Caption text helpers (delegated to `p7d-markdown-it-p-captions`)
|
|
@@ -85,20 +87,23 @@ Every option below is forwarded verbatim to `p7d-markdown-it-p-captions`, which
|
|
|
85
87
|
- `strongFilename` / `dquoteFilename`: pull out filenames from captions using `**filename**` or `"filename"` syntax and wrap them in `<strong class="f-*-filename">`.
|
|
86
88
|
- `jointSpaceUseHalfWidth`: replace full-width space between Japanese labels and caption body with half-width space.
|
|
87
89
|
- `bLabel` / `strongLabel`: emphasize the label span itself.
|
|
88
|
-
- `removeUnnumberedLabel`: drop the leading
|
|
90
|
+
- `removeUnnumberedLabel`: drop the leading label entirely when no label number is present. Use `removeUnnumberedLabelExceptMarks` to keep specific labels (e.g., `['blockquote']` keeps `Quote. `).
|
|
89
91
|
- `removeMarkNameInCaptionClass`: replace `.f-img-label` / `.f-table-label` with the generic `.f-label`.
|
|
90
92
|
- `wrapCaptionBody`: wrap the non-label caption text in a span element.
|
|
91
93
|
- `hasNumClass`: add a class to the label span element when the label has a number.
|
|
92
94
|
- `labelClassFollowsFigure`: mirror the resolved `<figure>` class onto the `figcaption` spans (`f-embed-label`, `f-embed-label-joint`, `f-embed-body`, etc.) when you want captions styled alongside the wrapper.
|
|
93
95
|
- `figureToLabelClassMap`: extend `labelClassFollowsFigure` by mapping specific figure classes (e.g., `f-embed`) to custom caption label classes such as `caption-embed caption-social` for fine-grained control. When this map is provided and `labelClassFollowsFigure` is not set explicitly, figure-following mode is enabled automatically.
|
|
94
96
|
- `labelPrefixMarker`: allow a leading marker before labels (string or array, e.g., `*Figure. ...`). Arrays are limited to two markers; extras are ignored.
|
|
97
|
+
- `languages`: optional available caption-recognition catalogs delegated to `p7d-markdown-it-p-captions` (default: `['en', 'ja']`). Most users can leave this unset. Set it only when you want to restrict or extend which labels can be recognized (for example English `Figure.` and Japanese `図 `) and which catalogs are available for generated fallback labels. It is separate from the active locale used to choose among those available catalogs.
|
|
98
|
+
- Automatic image-label fallback text and punctuation (`Figure. `, `図 `, etc.) are generated from `p7d-markdown-it-p-captions` locale metadata, not from a local hardcoded map in this plugin.
|
|
99
|
+
- Generated fallback label tie-break is resolved once per render. Prefer passing the active locale through `env.locale` or `env.preferredLocales`. Compatibility fallbacks are `preferredLanguages`, `env.preferredLanguages`, `env.lang`, and `env.language`. If none of those selects an available catalog, this plugin finally uses a cheap document-script heuristic that skips a leading hyphen-fenced frontmatter block (`---` or longer, spaces allowed before newline), then falls back to the raw `languages` order. This tie-break only affects generated fallback labels; it does not change the caption-recognition dictionaries selected by `languages`. Compatibility note: for generated fallback labels, `env.locale` / `env.preferredLocales` intentionally take precedence over the legacy `preferredLanguages` option so a shared `md` instance can render different documents with different active locales.
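The tie-break order for generated fallback labels can be sketched as a lookup over candidate sources. This is an illustrative sketch, not the plugin's code: `firstAvailable` and `pickFallbackLocale` are hypothetical names, matching is simplified to the primary language subtag, the relative order inside each group is assumed to follow the order listed above, and the document-script heuristic step is omitted.

```js
// Hypothetical helper: first candidate whose primary subtag ("ja" from "ja-JP")
// names an available catalog, or '' when none matches.
const firstAvailable = (candidates, languages) => {
  const list = Array.isArray(candidates) ? candidates : [candidates]
  for (const candidate of list) {
    if (typeof candidate !== 'string' || candidate === '') continue
    const primary = candidate.toLowerCase().split('-')[0]
    if (languages.includes(primary)) return primary
  }
  return ''
}

// Hypothetical resolver following the documented priority order.
const pickFallbackLocale = (env = {}, option = {}) => {
  const languages = option.languages || ['en', 'ja']
  const sources = [
    env.locale,                // preferred
    env.preferredLocales,      // preferred
    option.preferredLanguages, // compatibility fallbacks from here on
    env.preferredLanguages,
    env.lang,
    env.language,
  ]
  for (const source of sources) {
    if (source === undefined || source === null) continue
    const hit = firstAvailable(source, languages)
    if (hit) return hit
  }
  // The plugin tries a document-script heuristic at this point (omitted here),
  // then falls back to the raw `languages` order.
  return languages[0]
}
```

Because the active locale comes from `env`, one shared `md` instance could render one document with `{ locale: 'ja' }` and the next with `{ locale: 'en' }` without changing its options.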
|
|
95
100
|
|
|
96
101
|
### Automatic numbering
|
|
97
102
|
|
|
98
103
|
- `autoLabelNumberSets`: enable numbering per media type. Pass an array such as `['img']`, `['table']`, or `['img', 'table']`.
|
|
99
104
|
- `autoLabelNumber`: shorthand for turning numbering on for both images and tables without passing the array yourself. Provide `autoLabelNumberSets` explicitly (e.g., `['img']`) when you need finer control—the explicit array always wins.
|
|
100
105
|
- Counters start at `1` near the top of the document and increment sequentially per media type. Figures and tables keep independent counters even when mixed together.
|
|
101
|
-
- The counter only advances when a real caption exists (paragraph, auto-detected alt/title, or fallback text). Figures emitted solely because of `oneImageWithoutCaption` stay unnumbered.
|
|
106
|
+
- The counter only advances when a real caption exists (paragraph, auto-detected alt/title, or fallback text). Figures emitted solely because of `imageOnlyParagraphWithoutCaption` / `oneImageWithoutCaption` stay unnumbered.
|
|
102
107
|
- Manual numbers inside the caption text (e.g., `Figure 5.`) always win. The plugin updates its internal counter so the next automatic number becomes `6`. This applies to captions sourced from paragraphs, auto detection, and fallback captions.
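The counter rules above (independent counters per media type, manual numbers winning and moving the counter forward) can be sketched as follows. `createCounters` and `resolveLabelNumber` are hypothetical names used for illustration only; the plugin's internal bookkeeping is not shown in this README.

```js
// Hypothetical counter store: one independent counter per media type.
const createCounters = () => ({ img: 0, table: 0 })

// Return the number to display. A manual number wins and resets the counter,
// so the next automatic number continues from it.
const resolveLabelNumber = (counters, mark, manualNumber) => {
  if (Number.isInteger(manualNumber)) {
    counters[mark] = manualNumber
    return manualNumber
  }
  counters[mark] += 1
  return counters[mark]
}

const counters = createCounters()
const numbers = [
  resolveLabelNumber(counters, 'img'),    // first figure
  resolveLabelNumber(counters, 'table'),  // first table, separate counter
  resolveLabelNumber(counters, 'img', 5), // caption written as "Figure 5."
  resolveLabelNumber(counters, 'img'),    // next automatic figure number
]
```

In this sketch a captionless figure would simply never call `resolveLabelNumber`, which is how it stays unnumbered.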
|
|
103
108
|
|
|
104
109
|
## Basic Usage
|
|
@@ -126,12 +131,12 @@ Auto label numbering for images and tables.
|
|
|
126
131
|
```js
|
|
127
132
|
const figureOption = {
|
|
128
133
|
// Opinionated defaults
|
|
129
|
-
|
|
134
|
+
imageOnlyParagraphWithoutCaption: true,
|
|
130
135
|
videoWithoutCaption: true,
|
|
131
136
|
audioWithoutCaption: true,
|
|
132
137
|
iframeWithoutCaption: true,
|
|
133
138
|
iframeTypeBlockquoteWithoutCaption: true,
|
|
134
|
-
removeUnnumberedLabelExceptMarks: ['blockquote'], // keep
|
|
139
|
+
removeUnnumberedLabelExceptMarks: ['blockquote'], // keep `Quote.` labels even when unnumbered
|
|
135
140
|
allIframeTypeFigureClassName: 'f-embed', // apply a uniform class to every iframe-style embed
|
|
136
141
|
autoLabelNumber: true,
|
|
137
142
|
|
|
@@ -145,7 +150,7 @@ If there is no label number, the label will also be deleted.
|
|
|
145
150
|
|
|
146
151
|
```js
|
|
147
152
|
const figureOption = {
|
|
148
|
-
|
|
153
|
+
imageOnlyParagraphWithoutCaption: true,
|
|
149
154
|
videoWithoutCaption: true,
|
|
150
155
|
audioWithoutCaption: true,
|
|
151
156
|
iframeWithoutCaption: true,
|
|
@@ -173,7 +178,7 @@ const md = mdit({ html: true }).use(mditFigureWithPCaption, figureOption)
|
|
|
173
178
|
[HTML]
|
|
174
179
|
<p><img src="figure.jpg" alt="A single cat"></p>
|
|
175
180
|
|
|
176
|
-
<!-- Above: If oneImageWithoutCaption is true, this img element has wrapped into figure element without caption. -->
|
|
181
|
+
<!-- Above: If imageOnlyParagraphWithoutCaption (or its legacy alias oneImageWithoutCaption) is true, this img element is wrapped in a figure element without a caption. -->
|
|
177
182
|
|
|
178
183
|
|
|
179
184
|
[Markdown]
|
|
@@ -438,7 +443,7 @@ A paragraph.
|
|
|
438
443
|
|
|
439
444
|
### Styles
|
|
440
445
|
|
|
441
|
-
This example uses `classPrefix: 'custom'` and leaves `styleProcess: true` so a trailing `{.notice}` block moves onto the `<figure>` wrapper. This fallback only handles the final trailing attrs block on an image-only paragraph; for broader attrs syntax support, keep using `markdown-it-attrs`.
|
|
446
|
+
This example uses `classPrefix: 'custom'` and leaves `styleProcess: true` so a trailing `{.notice}` block moves onto the `<figure>` wrapper. This fallback only handles the final trailing attrs block on an image-only paragraph; it supports quoted values with spaces, but for broader attrs syntax support, keep using `markdown-it-attrs`.
|
|
442
447
|
|
|
443
448
|
```
|
|
444
449
|
[Markdown]
|
|
@@ -454,7 +459,7 @@ Figure. Highlighted cat.
|
|
|
454
459
|
|
|
455
460
|
### Automatic detection fallbacks
|
|
456
461
|
|
|
457
|
-
`autoCaptionDetection` combined with `autoAltCaption` / `autoTitleCaption` can still generate caption text even when the original alt/title lacks labels. The corresponding attributes are cleared after conversion so the figcaption becomes the canonical source.
|
|
462
|
+
`autoCaptionDetection` combined with `autoAltCaption` / `autoTitleCaption` can still generate caption text even when the original alt/title lacks labels, as long as the alt/title body is non-empty. The corresponding attributes are cleared after conversion so the figcaption becomes the canonical source. When these fallbacks are `true`, the generated label text and punctuation come from `p7d-markdown-it-p-captions` locale metadata rather than a local hardcoded map. When these fallbacks are strings, the string must be a label stem recognized as an image caption label by `p7d-markdown-it-p-captions`; invalid strings fail during plugin setup instead of producing a stray paragraph.
|
|
458
463
|
|
|
459
464
|
```
|
|
460
465
|
[Markdown]
|
|
@@ -496,7 +501,7 @@ $ pwd
|
|
|
496
501
|
|
|
497
502
|
### Captionless conversion toggles
|
|
498
503
|
|
|
499
|
-
If `oneImageWithoutCaption` is enabled, a single image paragraph will be wrapped with `<figure class="f-img">` even without a caption.
|
|
504
|
+
If `imageOnlyParagraphWithoutCaption` (or the legacy alias `oneImageWithoutCaption`) is enabled, a single image paragraph will be wrapped with `<figure class="f-img">` even without a caption. Multi-image image-only paragraphs can also be wrapped, in which case the normal layout classes such as `f-img-horizontal`, `f-img-vertical`, or `f-img-multiple` are used.
|
|
500
505
|
|
|
501
506
|
```
|
|
502
507
|
[Markdown]
|
|
@@ -508,7 +513,7 @@ If `oneImageWithoutCaption` is enabled, a single image paragraph will be wrapped
|
|
|
508
513
|
</figure>
|
|
509
514
|
```
|
|
510
515
|
|
|
511
|
-
If `videoWithoutCaption` is enabled,
|
|
516
|
+
If `videoWithoutCaption` is enabled, `<video>` elements and iframes pointing to known video hosts (such as `www.youtube.com`, `youtube.com`, `www.youtube-nocookie.com`, or Vimeo) will be wrapped with `<figure class="f-video">`.
|
|
512
517
|
|
|
513
518
|
```
|
|
514
519
|
[Markdown]
|
package/embeds/detect.js
ADDED
|
@@ -0,0 +1,182 @@
|
|
|
1
|
+
import {
|
|
2
|
+
BLOCKQUOTE_EMBED_CLASS_NAMES,
|
|
3
|
+
HTML_EMBED_CANDIDATES,
|
|
4
|
+
VIDEO_IFRAME_HOSTS,
|
|
5
|
+
} from './providers.js'
|
|
6
|
+
|
|
7
|
+
const htmlRegCache = new Map()
|
|
8
|
+
const openingClassAttrReg = /^<[^>]*?\bclass=(?:"([^"]*)"|'([^']*)')/i
|
|
9
|
+
const openingSrcAttrReg = /^<[^>]*?\bsrc=(?:"([^"]*)"|'([^']*)')/i
|
|
10
|
+
const endBlockquoteScriptReg = /<\/blockquote> *<script[^>]*?><\/script>$/i
|
|
11
|
+
const targetHtmlHintReg = /<(?:video|audio|iframe|blockquote|div)\b/i
|
|
12
|
+
const blueskyEmbedHintReg = /bluesky-embed/i
|
|
13
|
+
const videoTagHintReg = /<video\b/i
|
|
14
|
+
const audioTagHintReg = /<audio\b/i
|
|
15
|
+
const iframeTagHintReg = /<iframe\b/i
|
|
16
|
+
const blockquoteTagHintReg = /<blockquote\b/i
|
|
17
|
+
const divTagHintReg = /<div\b/i
|
|
18
|
+
const iframeTagReg = /<iframe(?=[\s>])/i
|
|
19
|
+
|
|
20
|
+
const getHtmlReg = (tag) => {
|
|
21
|
+
const cached = htmlRegCache.get(tag)
|
|
22
|
+
if (cached) return cached
|
|
23
|
+
const regexStr = `^<${tag} ?[^>]*?>[\\s\\S]*?<\\/${tag}>(\\n| *?)(<script [^>]*?>(?:<\\/script>)?)? *(\\n|$)`
|
|
24
|
+
const reg = new RegExp(regexStr, 'i')
|
|
25
|
+
htmlRegCache.set(tag, reg)
|
|
26
|
+
return reg
|
|
27
|
+
}
|
|
28
|
+
|
|
29
|
+
const getHtmlDetectionHints = (content) => {
|
|
30
|
+
const source = typeof content === 'string' ? content : ''
|
|
31
|
+
const hasTargetHtmlHint = targetHtmlHintReg.test(source)
|
|
32
|
+
const hasBlueskyHint = blueskyEmbedHintReg.test(source)
|
|
33
|
+
if (!hasTargetHtmlHint && !hasBlueskyHint) {
|
|
34
|
+
return null
|
|
35
|
+
}
|
|
36
|
+
const hasVideoHint = videoTagHintReg.test(source)
|
|
37
|
+
const hasAudioHint = audioTagHintReg.test(source)
|
|
38
|
+
const hasIframeHint = iframeTagHintReg.test(source)
|
|
39
|
+
const hasBlockquoteHint = blockquoteTagHintReg.test(source)
|
|
40
|
+
const hasDivHint = divTagHintReg.test(source)
|
|
41
|
+
return {
|
|
42
|
+
hasBlueskyHint,
|
|
43
|
+
hasVideoHint,
|
|
44
|
+
hasAudioHint,
|
|
45
|
+
hasIframeHint,
|
|
46
|
+
hasBlockquoteHint,
|
|
47
|
+
hasDivHint,
|
|
48
|
+
hasIframeTag: hasIframeHint || (hasDivHint && iframeTagReg.test(source)),
|
|
49
|
+
}
|
|
50
|
+
}
|
|
51
|
+
|
|
52
|
+
const appendHtmlBlockNewlineIfNeeded = (token, hasTag) => {
|
|
53
|
+
if ((hasTag[2] && hasTag[3] !== '\n') || (hasTag[1] !== '\n' && hasTag[2] === undefined)) {
|
|
54
|
+
token.content += '\n'
|
|
55
|
+
}
|
|
56
|
+
}
|
|
57
|
+
|
|
58
|
+
const consumeBlockquoteEmbedScript = (tokens, token, startIndex) => {
|
|
59
|
+
let addedContent = ''
|
|
60
|
+
let i = startIndex + 1
|
|
61
|
+
while (i < tokens.length) {
|
|
62
|
+
const nextToken = tokens[i]
|
|
63
|
+
if (nextToken.type === 'inline' && endBlockquoteScriptReg.test(nextToken.content)) {
|
|
64
|
+
addedContent += nextToken.content + '\n'
|
|
65
|
+
if (tokens[i + 1] && tokens[i + 1].type === 'paragraph_close') {
|
|
66
|
+
tokens.splice(i + 1, 1)
|
|
67
|
+
}
|
|
68
|
+
nextToken.content = ''
|
|
69
|
+
if (nextToken.children) {
|
|
70
|
+
for (let j = 0; j < nextToken.children.length; j++) {
|
|
71
|
+
nextToken.children[j].content = ''
|
|
72
|
+
}
|
|
73
|
+
}
|
|
74
|
+
break
|
|
75
|
+
}
|
|
76
|
+
if (nextToken.type === 'paragraph_open') {
|
|
77
|
+
addedContent += '\n'
|
|
78
|
+
tokens.splice(i, 1)
|
|
79
|
+
continue
|
|
80
|
+
}
|
|
81
|
+
i++
|
|
82
|
+
}
|
|
83
|
+
token.content += addedContent
|
|
84
|
+
}
|
|
85
|
+
|
|
86
|
+
const getOpeningAttrValue = (content, reg) => {
|
|
87
|
+
if (typeof content !== 'string' || content.charCodeAt(0) !== 0x3c) return ''
|
|
88
|
+
const match = content.match(reg)
|
|
89
|
+
if (!match) return ''
|
|
90
|
+
return match[1] || match[2] || ''
|
|
91
|
+
}
|
|
92
|
+
|
|
93
|
+
const hasKnownBlockquoteEmbedClass = (content) => {
|
|
94
|
+
const classAttr = getOpeningAttrValue(content, openingClassAttrReg)
|
|
95
|
+
if (!classAttr) return false
|
|
96
|
+
let start = 0
|
|
97
|
+
while (start < classAttr.length) {
|
|
98
|
+
while (start < classAttr.length && classAttr.charCodeAt(start) <= 0x20) start++
|
|
99
|
+
if (start >= classAttr.length) break
|
|
100
|
+
let end = start + 1
|
|
101
|
+
while (end < classAttr.length && classAttr.charCodeAt(end) > 0x20) end++
|
|
102
|
+
if (BLOCKQUOTE_EMBED_CLASS_NAMES.has(classAttr.slice(start, end))) return true
|
|
103
|
+
start = end + 1
|
|
104
|
+
}
|
|
105
|
+
return false
|
|
106
|
+
}
|
|
107
|
+
|
|
108
|
+
const isKnownVideoIframe = (content) => {
|
|
109
|
+
const src = getOpeningAttrValue(content, openingSrcAttrReg)
|
|
110
|
+
if (!src || src.slice(0, 8).toLowerCase() !== 'https://') return false
|
|
111
|
+
const slashIndex = src.indexOf('/', 8)
|
|
112
|
+
const host = (slashIndex === -1 ? src.slice(8) : src.slice(8, slashIndex)).toLowerCase()
|
|
113
|
+
return VIDEO_IFRAME_HOSTS.has(host)
|
|
114
|
+
}
|
|
115
|
+
|
|
116
|
+
const detectHtmlTagCandidate = (tokens, token, startIndex, detector, hints, result) => {
|
|
117
|
+
if (detector.requiresIframeTag && !hints.hasIframeTag) return ''
|
|
118
|
+
const hasTagHint = !!(detector.hintKey && hints[detector.hintKey])
|
|
119
|
+
const allowBlueskyFallback = detector.candidate === 'blockquote' && hints.hasBlueskyHint
|
|
120
|
+
if (!hasTagHint && !allowBlueskyFallback) return ''
|
|
121
|
+
const hasTag = hasTagHint ? token.content.match(getHtmlReg(detector.lookupTag)) : null
|
|
122
|
+
const isBlueskyFallback = detector.candidate === 'blockquote' && !hasTag && hints.hasBlueskyHint
|
|
123
|
+
if (!hasTag && !isBlueskyFallback) return ''
|
|
124
|
+
if (hasTag) {
|
|
125
|
+
appendHtmlBlockNewlineIfNeeded(token, hasTag)
|
|
126
|
+
if (detector.treatAsVideoIframe) {
|
|
127
|
+
result.isVideoIframe = true
|
|
128
|
+
}
|
|
129
|
+
return detector.matchedTag || detector.candidate
|
|
130
|
+
}
|
|
131
|
+
consumeBlockquoteEmbedScript(tokens, token, startIndex)
|
|
132
|
+
return 'blockquote'
|
|
133
|
+
}
|
|
134
|
+
|
|
135
|
+
const resolveHtmlWrapWithoutCaption = (matchedTag, result, htmlWrapWithoutCaption) => {
|
|
136
|
+
if (!htmlWrapWithoutCaption) return false
|
|
137
|
+
if (matchedTag === 'blockquote') {
|
|
138
|
+
return !!(result.isIframeTypeBlockquote && htmlWrapWithoutCaption.iframeTypeBlockquote)
|
|
139
|
+
}
|
|
140
|
+
if (matchedTag === 'iframe' && result.isVideoIframe) {
|
|
141
|
+
return !!(htmlWrapWithoutCaption.video || htmlWrapWithoutCaption.iframe)
|
|
142
|
+
}
|
|
143
|
+
return !!htmlWrapWithoutCaption[matchedTag]
|
|
144
|
+
}
|
|
145
|
+
|
|
146
|
+
export const detectHtmlFigureCandidate = (tokens, token, startIndex, htmlWrapWithoutCaption) => {
|
|
147
|
+
if (!token || token.type !== 'html_block') return null
|
|
148
|
+
const hints = getHtmlDetectionHints(token.content)
|
|
149
|
+
if (!hints) return null
|
|
150
|
+
|
|
151
|
+
const result = {
|
|
152
|
+
isVideoIframe: false,
|
|
153
|
+
isIframeTypeBlockquote: false,
|
|
154
|
+
}
|
|
155
|
+
|
|
156
|
+
let matchedTag = ''
|
|
157
|
+
for (let i = 0; i < HTML_EMBED_CANDIDATES.length; i++) {
|
|
158
|
+
matchedTag = detectHtmlTagCandidate(tokens, token, startIndex, HTML_EMBED_CANDIDATES[i], hints, result)
|
|
159
|
+
if (matchedTag) break
|
|
160
|
+
}
|
|
161
|
+
if (!matchedTag) return null
|
|
162
|
+
|
|
163
|
+
if (matchedTag === 'blockquote') {
|
|
164
|
+
if (!hasKnownBlockquoteEmbedClass(token.content)) return null
|
|
165
|
+
result.isIframeTypeBlockquote = true
|
|
166
|
+
}
|
|
167
|
+
|
|
168
|
+
if (matchedTag === 'iframe' && isKnownVideoIframe(token.content)) {
|
|
169
|
+
result.isVideoIframe = true
|
|
170
|
+
}
|
|
171
|
+
|
|
172
|
+
return {
|
|
173
|
+
type: 'html',
|
|
174
|
+
tagName: matchedTag,
|
|
175
|
+
en: startIndex,
|
|
176
|
+
replaceInsteadOfWrap: false,
|
|
177
|
+
wrapWithoutCaption: resolveHtmlWrapWithoutCaption(matchedTag, result, htmlWrapWithoutCaption),
|
|
178
|
+
canWrap: true,
|
|
179
|
+
isVideoIframe: result.isVideoIframe,
|
|
180
|
+
isIframeTypeBlockquote: result.isIframeTypeBlockquote,
|
|
181
|
+
}
|
|
182
|
+
}
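The class check in `hasKnownBlockquoteEmbedClass` can be exercised in isolation to see why extra classes on the same blockquote do not block detection. The sketch below is a simplified stand-in, not the code above: it keeps the same opening-tag regex shape but splits on whitespace instead of scanning character codes, and `EMBED_CLASSES` / `hasEmbedClass` are local, hypothetical names.

```js
// Local copy of the provider class names listed in providers.js.
const EMBED_CLASSES = new Set([
  'twitter-tweet',
  'instagram-media',
  'text-post-media',
  'bluesky-embed',
  'mastodon-embed',
])

// Same shape as openingClassAttrReg: class attribute of the opening tag only.
const classAttrReg = /^<[^>]*?\bclass=(?:"([^"]*)"|'([^']*)')/i

// Simplified check: any whole class token must equal a known provider class.
const hasEmbedClass = (html) => {
  const match = typeof html === 'string' ? html.match(classAttrReg) : null
  if (!match) return false
  const classAttr = match[1] || match[2] || ''
  return classAttr.split(/\s+/).some((name) => EMBED_CLASSES.has(name))
}
```

Matching whole tokens means `class="wide twitter-tweet dark"` is accepted while a lookalike such as `class="twitter-tweet-like"` is not.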
|
package/embeds/providers.js
ADDED
|
@@ -0,0 +1,30 @@
|
|
|
1
|
+
export const HTML_EMBED_CANDIDATES = Object.freeze([
|
|
2
|
+
{ candidate: 'video', lookupTag: 'video', hintKey: 'hasVideoHint' },
|
|
3
|
+
{ candidate: 'audio', lookupTag: 'audio', hintKey: 'hasAudioHint' },
|
|
4
|
+
{ candidate: 'iframe', lookupTag: 'iframe', hintKey: 'hasIframeHint' },
|
|
5
|
+
{ candidate: 'blockquote', lookupTag: 'blockquote', hintKey: 'hasBlockquoteHint' },
|
|
6
|
+
{
|
|
7
|
+
candidate: 'div',
|
|
8
|
+
lookupTag: 'div',
|
|
9
|
+
hintKey: 'hasDivHint',
|
|
10
|
+
requiresIframeTag: true,
|
|
11
|
+
matchedTag: 'iframe',
|
|
12
|
+
treatAsVideoIframe: true,
|
|
13
|
+
},
|
|
14
|
+
])
|
|
15
|
+
|
|
16
|
+
export const BLOCKQUOTE_EMBED_CLASS_NAMES = new Set([
|
|
17
|
+
'twitter-tweet',
|
|
18
|
+
'instagram-media',
|
|
19
|
+
'text-post-media',
|
|
20
|
+
'bluesky-embed',
|
|
21
|
+
'mastodon-embed',
|
|
22
|
+
])
|
|
23
|
+
|
|
24
|
+
export const VIDEO_IFRAME_HOSTS = new Set([
|
|
25
|
+
'www.youtube.com',
|
|
26
|
+
'youtube.com',
|
|
27
|
+
'www.youtube-nocookie.com',
|
|
28
|
+
'youtube-nocookie.com',
|
|
29
|
+
'player.vimeo.com',
|
|
30
|
+
])
|
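To show how `VIDEO_IFRAME_HOSTS` is consumed, here is a standalone sketch of the host check that `isKnownVideoIframe` in `detect.js` applies to an iframe `src`. It copies the slicing logic from that function; `VIDEO_HOSTS` and `isKnownVideoSrc` are local names for this example, and the `src` extraction from the HTML block is left out.

```js
// Local copy of the host list from providers.js.
const VIDEO_HOSTS = new Set([
  'www.youtube.com',
  'youtube.com',
  'www.youtube-nocookie.com',
  'youtube-nocookie.com',
  'player.vimeo.com',
])

// Accept only https URLs whose host part exactly matches a known video host.
const isKnownVideoSrc = (src) => {
  if (typeof src !== 'string' || src.slice(0, 8).toLowerCase() !== 'https://') return false
  const slashIndex = src.indexOf('/', 8)
  const host = (slashIndex === -1 ? src.slice(8) : src.slice(8, slashIndex)).toLowerCase()
  return VIDEO_HOSTS.has(host)
}
```

Because the comparison is an exact set lookup on the text between `https://` and the next `/`, an `http://` URL, another subdomain, or a host written with an explicit port is not treated as a known video host.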