mark-deco 0.1.0

package/README.md ADDED
@@ -0,0 +1,1424 @@
1
+ # MarkDeco
2
+
3
+ A high-performance Markdown to HTML conversion library written in TypeScript.
4
+
5
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
6
+ [![CI](https://github.com/kekyo/mark-deco/actions/workflows/ci.yml/badge.svg)](https://github.com/kekyo/mark-deco/actions/workflows/ci.yml)
7
+
8
+ |Package|npm|
9
+ |:----|:----|
10
+ |`mark-deco`|[![npm version](https://img.shields.io/npm/v/mark-deco.svg)](https://www.npmjs.com/package/mark-deco)|
11
+ |`mark-deco-cli`|[![npm version](https://img.shields.io/npm/v/mark-deco-cli.svg)](https://www.npmjs.com/package/mark-deco-cli)|
12
+
13
+ [(日本語はこちら)](./README_ja.md)
14
+
15
+ ## What is this?
16
+
17
+ A high-performance Markdown to HTML conversion library written in TypeScript.
18
+ It interprets GitHub Flavored Markdown (GFM) and outputs HTML.
19
+ It supports frontmatter parsing, heading analysis, source code formatting, oEmbed/card/Mermaid graph rendering, and custom code block processing through plugin extensions.
20
+
21
+ * Can be used to render HTML from Markdown input.
22
+ * Simple interface makes it very easy to use.
23
+ * Highly independent with minimal runtime requirements. Works in both Node.js and browser environments.
24
+ * Built-in plugins support oEmbed, cards, and Mermaid.js.
25
+
26
+ ## Installation
27
+
28
+ ```bash
29
+ npm install mark-deco
30
+ ```
31
+
32
+ ## Basic Usage
33
+
34
+ Here's the simplest usage example:
35
+
36
+ ```typescript
37
+ import { createMarkdownProcessor, createCachedFetcher } from 'mark-deco';
38
+
39
+ // Create a memory-cached fetcher
40
+ const fetcher = createCachedFetcher('MyApp/1.0');
41
+
42
+ // Create MarkDeco processor
43
+ const processor = createMarkdownProcessor({
44
+ fetcher
45
+ });
46
+
47
+ // Markdown to convert
48
+ const markdown = `---
49
+ title: Sample Article
50
+ author: John Doe
51
+ ---
52
+
53
+ # Hello World
54
+
55
+ This is a test article.`;
56
+
57
+ // Render HTML from Markdown input
58
+ const result = await processor.process(
59
+ markdown,
60
+ "id"); // ID prefix for HTML elements (described later)
61
+
62
+ // Generated HTML
63
+ console.log(result.html);
64
+ // Extracted frontmatter information (described later)
65
+ console.log(result.frontmatter);
66
+ // Extracted heading information (described later)
67
+ console.log(result.headingTree);
68
+ ```
69
+
70
+ This will render HTML like this:
71
+
72
+ ```html
73
+ <h1 id="id-1">Hello World</h1>
74
+ <p>This is a test article.</p>
75
+ ```
76
+
77
+ A "fetcher" is an abstraction for external server access. It's primarily used by oEmbed and card plugins for external API calls and page scraping. See details below.
78
+ The argument passed to the fetcher is a user agent string, which is applied to HTTP request headers when accessing external servers.
79
+
80
+ HTML converted by the MarkDeco processor is formatted in a readable manner. Advanced options allow fine-tuning of formatting conditions (described later).
81
+
82
+ ### Aborting Processor Operations
83
+
84
+ While the MarkDeco processor engine itself doesn't access external servers, plugins may access external servers as needed (e.g., when using oEmbed APIs or performing page scraping).
85
+
86
+ To make such operations cancellable, pass a standard `AbortSignal` instance (defined by the WHATWG DOM specification rather than ECMAScript) in the processor options:
87
+
88
+ ```typescript
89
+ // Abort controller
90
+ const abortController = new AbortController();
91
+
92
+ // ...
93
+
94
+ // Convert Markdown to HTML
95
+ const result = await processor.process(
96
+ markdown, "id",
97
+ { // Specify processor options
98
+ signal: abortController.signal, // Cancellation support
99
+ });
100
+ ```
101
+
102
+ For usage of `AbortController` and `AbortSignal`, refer to the Web API documentation (for example, MDN).
103
+
104
+ ----
105
+
106
+ ## Frontmatter Information Extraction
107
+
108
+ MarkDeco automatically parses "YAML frontmatter" at the beginning of Markdown files and provides it as processing results. Frontmatter is used to describe article metadata (title, author, tags, publication date, etc.).
109
+
110
+ ```typescript
111
+ const markdown = `---
112
+ title: "Sample Article"
113
+ author: "John Doe"
114
+ date: "2024-01-15"
115
+ tags:
116
+ - markdown
117
+ - processor
118
+ published: true
119
+ ---
120
+
121
+ # Article Content
122
+
123
+ This article discusses...`;
124
+
125
+ const result = await processor.process(markdown, "id");
126
+
127
+ // Access frontmatter data
128
+ console.log(result.frontmatter.title); // "Sample Article"
129
+ console.log(result.frontmatter.author); // "John Doe"
130
+ console.log(result.frontmatter.date); // "2024-01-15" (quoted in YAML, so typically parsed as a string rather than a Date)
131
+ console.log(result.frontmatter.tags); // ["markdown", "processor"]
132
+ console.log(result.frontmatter.published); // true
133
+
134
+ // Generated HTML doesn't include frontmatter
135
+ console.log(result.html); // "<h1 id="id-1">Article Content</h1>..."
136
+ ```
137
+
138
+ Frontmatter data can be utilized for:
139
+
140
+ * Blog article metadata management
141
+ * Template engine integration
142
+ * Article search and filtering
143
+ * SEO information extraction
144
+ * Custom rendering logic control
145
+
146
+ Note: The MarkDeco processor itself doesn't use frontmatter information. Plugins may use this information depending on their implementation.
147
+
148
+ ## Heading ID Generation and Heading Information Extraction
149
+
150
+ The processor automatically generates unique IDs for all headings (H1-H6) in the document. These IDs can be used for anchor links and navigation. ID generation supports two modes and includes advanced processing for non-ASCII characters.
151
+
152
+ IDs are embedded in HTML and also exposed through the `headingTree` property of processing results. This information can be used for table of contents generation, document structure analysis, etc:
153
+
154
+ ```typescript
155
+ const markdown = `# Introduction
156
+
157
+ This is the introduction section.
158
+
159
+ # Usage
160
+
161
+ Explains basic usage.
162
+
163
+ ## Subsection
164
+
165
+ This is an H2 heading subsection.
166
+
167
+ # Conclusion
168
+
169
+ This is the conclusion section.`;
170
+
171
+ const result = await processor.process(markdown, "id");
172
+
173
+ // Output heading information (described later)
174
+ console.log(result.headingTree);
175
+ ```
176
+
177
+ Example HTML generated by the above code (hierarchical heading IDs enabled by default):
178
+
179
+ ```html
180
+ <h1 id="id-1">Introduction</h1>
181
+ <p>This is the introduction section.</p>
182
+ <h1 id="id-2">Usage</h1>
183
+ <p>Explains basic usage.</p>
184
+ <h2 id="id-2-1">Subsection</h2>
185
+ <p>This is an H2 heading subsection.</p>
186
+ <h1 id="id-3">Conclusion</h1>
187
+ <p>This is the conclusion section.</p>
188
+ ```
189
+
190
+ ### Hierarchical Heading IDs
191
+
192
+ Heading IDs can reflect the heading hierarchy. When the `useHierarchicalHeadingId` option is `true` (the default), each heading receives a hierarchical number based on its level.
193
+
194
+ This feature generates heading IDs in the following format:
195
+
196
+ | Heading Level | ID Format | Example |
197
+ |------------|--------|-----|
198
+ | H1 | `id-N` | `id-1`, `id-2`, `id-3` |
199
+ | H2 | `id-N-M` | `id-1-1`, `id-1-2`, `id-2-1` |
200
+ | H3 | `id-N-M-L` | `id-1-1-1`, `id-1-1-2`, `id-1-2-1` |
201
+
202
+ This makes heading structure clear and useful for navigation and table of contents generation.
203
+
204
+ When `useHierarchicalHeadingId` is set to `false`, ID numbers are assigned sequentially rather than hierarchically:
205
+
206
+ ```typescript
207
+ const result = await processor.process(
208
+ markdown, "id", {
209
+ // Disable hierarchical heading IDs
210
+ useHierarchicalHeadingId: false
211
+ });
212
+ ```
213
+
214
+ Example heading ID generation with sequential numbering:
215
+
216
+ | Heading Level | ID Format | Example |
217
+ |------------|--------|-----|
218
+ | All headings | `id-N` | `id-1`, `id-2`, `id-3`, `id-4`, `id-5` |
219
+
220
+ Note: In sequential mode, all headings in the document are assigned numbers sequentially regardless of heading level.
221
+
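The two numbering modes described above can be sketched as follows. This is a minimal illustration, not MarkDeco's implementation: `assignHeadingIds` is a hypothetical helper, and the way it fills skipped levels (an H1 followed directly by an H3) is an assumption.

```typescript
// Hypothetical helper for illustration; not part of the mark-deco API.
function assignHeadingIds(
  levels: number[], // heading levels in document order, e.g. [1, 1, 2, 1]
  prefix: string,
  hierarchical: boolean
): string[] {
  const counters: number[] = []; // counters[i] is the current number at level i + 1
  let sequence = 0;
  return levels.map((level) => {
    if (!hierarchical) {
      // Sequential mode: one running counter, regardless of heading level.
      sequence += 1;
      return `${prefix}-${sequence}`;
    }
    // Leaving a deeper level discards its counters, so numbering restarts there.
    counters.length = Math.min(counters.length, level);
    if (counters.length === level) {
      counters[level - 1] = (counters[level - 1] ?? 0) + 1;
    } else {
      // Assumption: skipped levels are filled with 1.
      while (counters.length < level) counters.push(1);
    }
    return `${prefix}-${counters.join("-")}`;
  });
}

// Levels of the sample document above: H1, H1, H2, H1.
console.log(assignHeadingIds([1, 1, 2, 1], "id", true));
console.log(assignHeadingIds([1, 1, 2, 1], "id", false));
```

For the sample document, the hierarchical call yields `id-1`, `id-2`, `id-2-1`, `id-3`, matching the HTML shown earlier, and the sequential call yields `id-1` through `id-4`.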
222
+ ### Custom ID Prefix
223
+
224
+ You can customize the prefix used for IDs. This keeps IDs unique when you concatenate several generated HTML fragments into one page:
225
+
226
+ ```typescript
227
+ // Specify ID prefix as the second argument
228
+ // Generated IDs: "id-1", "id-2", "id-3", etc.
229
+ const resultWithId = await processor.process(markdown, "id");
230
+
231
+ // Generated IDs: "section-1", "section-2", "section-3", etc.
232
+ const resultWithSection = await processor.process(markdown, "section");
233
+
234
+ // Content-based IDs (<h?> tags only, described later)
235
+ const resultWithContentIds = await processor.process(markdown, "id", {
236
+ useContentStringHeaderId: true
237
+ });
238
+
239
+ // Example of making IDs completely unique when generating multiple HTMLs:
240
+ // "id0-1", "id0-2", "id1-1", "id1-2", "id2-1" ... (index starts at 0)
241
+ const results = await Promise.all(
242
+ markdowns.map((markdown, index) => processor.process(markdown, `id${index}`)));
243
+ ```
244
+
245
+ ### Content-Based IDs
246
+
247
+ You can enable content-based IDs that generate IDs from heading text:
248
+
249
+ ```typescript
250
+ // Generate human-readable IDs from heading text
251
+ const markdown = `# Hello world
252
+
253
+ ## Another section
254
+
255
+ ### Subsection`;
256
+
257
+ const result = await processor.process(markdown, "id", {
258
+ useContentStringHeaderId: true
259
+ });
260
+ ```
261
+
262
+ Result:
263
+
264
+ ```html
265
+ <h1 id="hello-world">Hello world</h1>
266
+ <h2 id="hello-world-another-section">Another section</h2>
267
+ <h3 id="hello-world-another-section-subsection">Subsection</h3>
268
+ ```
269
+
270
+ When using content-based IDs, the processor applies the following steps in order so that arbitrary heading text still yields a usable ID:
271
+
272
+ #### Step 1: Unicode Normalization and Accent Removal
273
+
274
+ Normalizes European language accents to ASCII equivalent characters:
275
+
276
+ * Input: "Café Naïve"
277
+ * Output: "cafe-naive"
278
+
279
+ * Input: "Résumé"
280
+ * Output: "resume"
281
+
282
+ #### Step 2: Control Character Processing
283
+
284
+ Converts escape sequences and control characters to hyphens:
285
+
286
+ * Input: "Section\n\nTitle"
287
+ * Output: "section-title"
288
+
289
+ * Input: "Hello\tWorld"
290
+ * Output: "hello-world"
291
+
292
+ #### Step 3: ASCII Character Extraction
293
+
294
+ Removes non-ASCII characters (Japanese, Chinese, emojis, etc.):
295
+
296
+ * Input: "Hello 世界 World"
297
+ * Output: "hello-world"
298
+
299
+ * Input: "🎉 lucky time!"
300
+ * Output: "lucky-time"
301
+
302
+ #### Step 4: Invalid ID Fallback
303
+
304
+ When the resulting ID is empty or too short (fewer than 3 characters), the processor falls back to a unique numbered ID:
305
+
306
+ * Input: "こんにちは" (Japanese only)
307
+ * Output: "id-1" (fallback)
308
+
309
+ * Input: "🎉" (emoji only)
310
+ * Output: "id-2" (fallback)
311
+
312
+ * Input: "A" (too short)
313
+ * Output: "id-3" (fallback)
314
+
315
+ ### ID Generation Examples
316
+
317
+ |Input Heading|Generated ID|Processing|
318
+ |:----|:----|:----|
319
+ |`"Hello World"`|`"hello-world"`|Standard processing|
320
+ |`"Café Naïve"`|`"cafe-naive"`|Unicode normalization|
321
+ |`"Section\n\nTwo"`|`"section-two"`|Control character processing|
322
+ |`"Hello 世界"`|`"hello"`|Non-ASCII removal|
323
+ |`"こんにちは"`|`"id-1"`|Fallback (non-ASCII only)|
324
+ |`"🎉 パーティー"`|`"id-2"`|Fallback (emoji + Japanese)|
325
+ |`"A"`|`"id-3"`|Fallback (too short)|
326
+
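The four steps above can be sketched as a single function. This is a minimal sketch under stated assumptions, not MarkDeco's actual code: `contentBasedId` and its `fallbackId` parameter are hypothetical names, and the exact normalization form and regular expressions are assumptions chosen to reproduce the examples in the table.

```typescript
// Hypothetical sketch; mark-deco's internal implementation may differ.
function contentBasedId(text: string, fallbackId: string): string {
  const slug = text
    .normalize("NFKD") // Step 1: split accented letters into base letter + combining mark
    .replace(/[\u0300-\u036f]/g, "") // ...then drop the combining marks
    .replace(/[\u0000-\u001f\u007f]+/g, "-") // Step 2: control characters become hyphens
    .replace(/[^\x20-\x7e]+/g, "") // Step 3: remove the remaining non-ASCII characters
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse spaces and punctuation into single hyphens
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
  // Step 4: fall back when the result is empty or shorter than 3 characters.
  return slug.length >= 3 ? slug : fallbackId;
}

console.log(contentBasedId("Café Naïve", "id-1"));
console.log(contentBasedId("Hello 世界", "id-1"));
console.log(contentBasedId("こんにちは", "id-1"));
```

The caller supplies the fallback ID, so the numbered fallback (`id-1`, `id-2`, ...) can come from whichever counter the surrounding code already maintains.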
327
+ Note: Many sites adopt content-based IDs, but MarkDeco doesn't enable them by default.
328
+ IDs built from non-English text are hard to read and manage, and search engines no longer give them much weight.
329
+
330
+ ## Fetcher and Cache System
331
+
332
+ MarkDeco provides a fetcher system that uniformly manages external server access. All external server access (oEmbed API calls, page scraping, etc.) is executed through fetchers, and responses are automatically cached.
333
+
334
+ ### Fetcher Types
335
+
336
+ MarkDeco provides two types of fetchers:
337
+
338
+ ```typescript
339
+ import { createCachedFetcher, createDirectFetcher } from 'mark-deco';
340
+
341
+ // Cached fetcher (recommended)
342
+ const cachedFetcher = createCachedFetcher('MyApp/1.0');
343
+
344
+ // Direct fetcher (no cache)
345
+ const directFetcher = createDirectFetcher('MyApp/1.0');
346
+ ```
347
+
348
+ ### Cache Storage Selection
349
+
350
+ You can choose from three types of cache storage:
351
+
352
+ #### Memory Storage (Default)
353
+
354
+ ```typescript
355
+ import { createCachedFetcher, createMemoryCacheStorage } from 'mark-deco';
356
+
357
+ const memoryStorage = createMemoryCacheStorage();
358
+ const fetcher = createCachedFetcher(
359
+ 'MyApp/1.0', // User agent
360
+ 60000, // Timeout (milliseconds)
361
+ memoryStorage // Cache storage
362
+ );
363
+ ```
364
+
365
+ #### Local Storage (Browser Environment)
366
+
367
+ ```typescript
368
+ import { createCachedFetcher, createLocalCacheStorage } from 'mark-deco';
369
+
370
+ const localCacheStorage = createLocalCacheStorage('myapp:');
371
+ const fetcher = createCachedFetcher('MyApp/1.0', 60000, localCacheStorage);
372
+ ```
373
+
374
+ #### File System Storage (Node.js Environment)
375
+
376
+ ```typescript
377
+ import { createCachedFetcher, createFileSystemCacheStorage } from 'mark-deco';
378
+
379
+ // Specify cache file storage location
380
+ const fileStorage = createFileSystemCacheStorage('./cache');
381
+ const fetcher = createCachedFetcher('MyApp/1.0', 60000, fileStorage);
382
+ ```
383
+
384
+ ### Cache Options
385
+
386
+ You can control detailed cache behavior:
387
+
388
+ ```typescript
389
+ const fetcher = createCachedFetcher(
390
+ 'MyApp/1.0',
391
+ 60000,
392
+ fileStorage, {
393
+ cache: true, // Enable/disable cache
394
+ cacheTTL: 30 * 60 * 1000, // Cache time-to-live (30 minutes)
395
+ cacheFailures: true, // Cache failed requests too
396
+ failureCacheTTL: 5 * 60 * 1000 // Failure cache time-to-live (5 minutes)
397
+ }
398
+ );
399
+ ```
400
+
401
+ Cache behavior on fetch failures:
402
+
403
+ * Success cache: Successful responses are retained until the specified TTL and aren't deleted even on failures.
404
+ * Failure cache: When `cacheFailures` is `true`, failures are also cached with a separate TTL (`failureCacheTTL`).
405
+ * Old data protection: Existing success cache isn't affected even if new requests fail.
406
+
407
+ Caching reduces duplicate requests to the same URL and improves performance.
408
+
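The three cache rules above can be sketched as a small in-memory class. This is an illustration only: `ResponseCacheSketch` and its method names are hypothetical, the current time is passed in explicitly to keep the example deterministic, and MarkDeco's real cache storage may be organized differently. Only the option names (`cacheTTL`, `cacheFailures`, `failureCacheTTL`) come from the example above.

```typescript
// Hypothetical sketch of the described behavior; not mark-deco's implementation.
interface CachePolicy {
  cacheTTL: number; // time-to-live for successful responses (ms)
  cacheFailures: boolean; // whether failures are cached at all
  failureCacheTTL: number; // time-to-live for failures (ms)
}

type Lookup =
  | { kind: "success"; body: string }
  | { kind: "failure"; error: string }
  | undefined;

class ResponseCacheSketch {
  private readonly policy: CachePolicy;
  // Successes and failures live in separate maps, so a failure never overwrites a success.
  private readonly successes = new Map<string, { body: string; expiresAt: number }>();
  private readonly failures = new Map<string, { error: string; expiresAt: number }>();

  constructor(policy: CachePolicy) {
    this.policy = policy;
  }

  recordSuccess(url: string, body: string, now: number): void {
    this.successes.set(url, { body, expiresAt: now + this.policy.cacheTTL });
    this.failures.delete(url); // a fresh success makes an older failure irrelevant
  }

  recordFailure(url: string, error: string, now: number): void {
    if (!this.policy.cacheFailures) return;
    this.failures.set(url, { error, expiresAt: now + this.policy.failureCacheTTL });
  }

  lookup(url: string, now: number): Lookup {
    const success = this.successes.get(url);
    if (success && success.expiresAt > now) return { kind: "success", body: success.body };
    const failure = this.failures.get(url);
    if (failure && failure.expiresAt > now) return { kind: "failure", error: failure.error };
    return undefined; // nothing usable is cached; the caller would fetch again
  }
}
```

Keeping failures in their own map is one simple way to get the "old data protection" property: a later failed request cannot evict a success entry that is still within its TTL.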
409
+ ## Built-in Plugins
410
+
411
+ MarkDeco has a plugin system. Plugins extend the Markdown to HTML conversion process with additional rendering. The built-in plugins are:
412
+
413
+ |Plugin Name|Details|
414
+ |:----|:----|
415
+ |`oembed`|Accesses oEmbed API from specified URLs and renders HTML with obtained metadata|
416
+ |`card`|Scrapes specified URL pages and renders HTML with obtained metadata|
417
+ |`mermaid`|Enables graph drawing with code written in `mermaid.js` graph syntax|
418
+
419
+ To use plugins, specify them as follows:
420
+
421
+ ```typescript
422
+ import {
423
+ createMarkdownProcessor, createCachedFetcher,
424
+ createOEmbedPlugin, defaultProviderList } from 'mark-deco';
425
+
426
+ // Create fetcher
427
+ const fetcher = createCachedFetcher('MyApp/1.0');
428
+
429
+ // Generate oEmbed plugin
430
+ const oembedPlugin = createOEmbedPlugin(defaultProviderList);
431
+
432
+ const processor = createMarkdownProcessor({
433
+ plugins: [ oembedPlugin ], // Specify plugins to use
434
+ fetcher // Specify fetcher
435
+ });
436
+
437
+ const markdown = `# Media Embedding Test
438
+
439
+ Specify YouTube video URL (short URL supported)
440
+
441
+ \`\`\`oembed
442
+ https://youtu.be/1La4QzGeaaQ
443
+ \`\`\``;
444
+
445
+ // Embed YouTube video
446
+ const result = await processor.process(markdown, "id");
447
+
448
+ // Embedded HTML is generated
449
+ console.log(result.html);
450
+ ```
451
+
452
+ Plugins are invoked through Markdown code block syntax.
453
+ A code block's language name normally selects syntax highlighting for program code, but when it matches a registered plugin name, the block's content is passed to that plugin instead.
454
+
455
+ ### oEmbed Plugin
456
+
457
+ "oEmbed" is a standard API format that lets a website provide an embeddable representation of its URLs for display on other sites. Major platforms like YouTube and Flickr provide oEmbed APIs, so appropriate embedded content can be retrieved just by specifying a URL.
458
+
459
+ Using the oEmbed plugin, you can easily embed YouTube videos, Flickr photos, social media posts, etc.
460
+
461
+ ```typescript
462
+ import {
463
+ createMarkdownProcessor, createCachedFetcher,
464
+ createOEmbedPlugin, defaultProviderList } from 'mark-deco';
465
+
466
+ // Create fetcher
467
+ const fetcher = createCachedFetcher('MyApp/1.0');
468
+
469
+ // Generate oEmbed plugin using default provider list
470
+ const oembedPlugin = createOEmbedPlugin(defaultProviderList);
471
+ const processor = createMarkdownProcessor({
472
+ plugins: [ oembedPlugin ],
473
+ fetcher
474
+ });
475
+
476
+ const markdown = `# Media Embedding Test
477
+
478
+ YouTube video (short URL supported)
479
+
480
+ \`\`\`oembed
481
+ https://youtu.be/1La4QzGeaaQ
482
+ \`\`\`
483
+
484
+ Flickr photo
485
+
486
+ \`\`\`oembed
487
+ https://flickr.com/photos/bees/2362225867/
488
+ \`\`\`
489
+
490
+ Short URL (automatic redirect resolution)
491
+
492
+ \`\`\`oembed
493
+ https://bit.ly/example-site-page
494
+ \`\`\``;
495
+
496
+ const result = await processor.process(markdown, "id");
497
+ ```
498
+
499
+ Example generated HTML (for YouTube video):
500
+
501
+ ```html
502
+ <div class="oembed-container oembed-video">
503
+ <div class="oembed-header">
504
+ <div class="oembed-title">Sample Video Title</div>
505
+ <div class="oembed-author">by Channel Name</div>
506
+ <div class="oembed-provider">from YouTube</div>
507
+ </div>
508
+ <div class="oembed-content">
509
+ <iframe src="https://www.youtube.com/embed/[VIDEO_ID]"
510
+ frameborder="0" allowfullscreen>
511
+ <!-- Provider-specific implementation ... -->
512
+ </iframe>
513
+ </div>
514
+ <div class="oembed-footer">
515
+ <a href="https://youtu.be/[VIDEO_ID]" target="_blank" rel="noopener noreferrer">
516
+ Watch on YouTube
517
+ </a>
518
+ </div>
519
+ </div>
520
+ ```
521
+
522
+ #### Supported Providers
523
+
524
+ The oEmbed plugin includes a "default provider list" published at `https://oembed.com/providers.json`. You can also specify your own list. Major providers include:
525
+
526
+ |Provider|Supported Domains|Content|
527
+ |:----|:----|:----|
528
+ |YouTube|`youtube.com`, `youtu.be`|Video embedding|
529
+ |Vimeo|`vimeo.com`|Video embedding|
530
+ |Twitter/X|`twitter.com`, `x.com`|Tweet embedding|
531
+ |Instagram|`instagram.com`|Post embedding|
532
+ |Flickr|`flickr.com`|Photo embedding|
533
+ |TikTok|`tiktok.com`|Video embedding|
534
+ |Spotify|`spotify.com`|Music/playlist embedding|
535
+ |SoundCloud|`soundcloud.com`|Audio embedding|
536
+ |Reddit|`reddit.com`|Post embedding|
537
+ |Others|Many sites|Various content embedding|
538
+
539
+ The default provider list is large. To reduce bundle size, prepare your own list instead. When your code doesn't reference `defaultProviderList`, bundlers should be able to drop that data through tree shaking.
540
+
541
+ #### Display Field Order Control
542
+
543
+ The oEmbed plugin allows fine control over displayed metadata items and their display order using the `displayFields` option:
544
+
545
+ ```typescript
546
+ // Custom display order: embedded content first, then title, finally external link
547
+ const customOrderOEmbedPlugin = createOEmbedPlugin(
548
+ defaultProviderList, {
549
+ displayFields: {
550
+ 'embeddedContent': 1, // Display 1st
551
+ 'title': 2, // Display 2nd
552
+ 'externalLink': 3, // Display 3rd
553
+ } // Other items won't be output
554
+ });
555
+
556
+ // Default order when displayFields is undefined
557
+ const defaultOEmbedPlugin = createOEmbedPlugin(
558
+ defaultProviderList, { });
559
+ ```
560
+
561
+ * Numbers for each field represent display item order. They don't need to be sequential, and smaller numbers are output first.
562
+ * When `displayFields` isn't specified, all metadata items are rendered.
563
+
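The ordering rule above (smaller numbers first, gaps allowed, unlisted fields omitted) can be sketched as a sort over the `displayFields` entries. `orderedFields` is a hypothetical helper written for illustration, not a MarkDeco export.

```typescript
// Hypothetical helper illustrating the ordering rule; not part of the mark-deco API.
function orderedFields(displayFields: Record<string, number>): string[] {
  return Object.entries(displayFields)
    .sort(([, orderA], [, orderB]) => orderA - orderB) // smaller numbers are output first
    .map(([fieldName]) => fieldName); // fields that were not listed never appear
}

// Numbers don't need to be sequential; only their relative order matters.
console.log(orderedFields({ title: 20, embeddedContent: 10, externalLink: 30 }));
```

With the values above, `embeddedContent` comes first, then `title`, then `externalLink`.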
564
+ Available display control options:
565
+
566
+ |Field|Description|CSS Class|Default Order|
567
+ |:----|:----|:----|:----|
568
+ |`title`|Content title|`.oembed-title`|`1`|
569
+ |`author`|Author information|`.oembed-author`|`2`|
570
+ |`provider`|Provider information|`.oembed-provider`|`3`|
571
+ |`description`|Description text|`.oembed-description`|`4`|
572
+ |`thumbnail`|Thumbnail image|`.oembed-thumbnail`|`5`|
573
+ |`embeddedContent`|Embedded content (videos, etc.)|`.oembed-content`|`6`|
574
+ |`externalLink`|External link|`a[href]`|`7`|
575
+
576
+ #### Link URL Control
577
+
578
+ The oEmbed plugin allows control over URLs used for external links in generated content through the `useMetadataUrlLink` option:
579
+
580
+ ```typescript
581
+ // Use URL written in Markdown
582
+ const providedLinkOEmbedPlugin = createOEmbedPlugin(
583
+ defaultProviderList, {
584
+ useMetadataUrlLink: false // Use URL written in Markdown
585
+ });
586
+
587
+ // Use canonical URL from metadata
588
+ const metadataLinkOEmbedPlugin = createOEmbedPlugin(
589
+ defaultProviderList, {
590
+ useMetadataUrlLink: true // Use oEmbed metadata `web_page` URL
591
+ });
592
+ ```
593
+
594
+ Link URL selection priority:
595
+
596
+ |`useMetadataUrlLink`|URL Source Priority|Purpose|
597
+ |:----|:----|:----|
598
+ |`false`|Written URL|Preserve original URL (short links, etc.) (default)|
599
+ |`true`|oEmbed `web_page` URL --> Written URL|Use provider canonical URL|
600
+
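The priority table above amounts to a small selection function. The sketch below assumes the oEmbed response is available as an object with an optional `web_page` property; `selectLinkUrl` and `OEmbedMetadataSketch` are hypothetical names used only for illustration.

```typescript
// Hypothetical sketch of the priority rule; not mark-deco's implementation.
interface OEmbedMetadataSketch {
  web_page?: string; // canonical URL reported by the provider, if any
}

function selectLinkUrl(
  writtenUrl: string, // URL as written in the Markdown code block
  metadata: OEmbedMetadataSketch,
  useMetadataUrlLink: boolean
): string {
  if (!useMetadataUrlLink) return writtenUrl;
  // Prefer the provider's canonical URL, falling back to the written URL.
  return metadata.web_page ?? writtenUrl;
}

console.log(selectLinkUrl("https://youtu.be/1La4QzGeaaQ", {}, true));
```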
601
+ #### Redirect Resolution
602
+
603
+ URLs specified in Markdown are automatically resolved to normalized URLs when they are short URLs or redirected URLs.
604
+ This is because oEmbed provider lists may only match normalized URLs:
605
+
606
+ ```markdown
607
+ \`\`\`oembed
608
+ https://youtu.be/1La4QzGeaaQ # --> Resolved to https://youtube.com/watch?v=1La4QzGeaaQ
609
+ \`\`\`
610
+
611
+ \`\`\`oembed
612
+ https://bit.ly/shortened-link # --> Resolved to normalized URL
613
+ \`\`\`
614
+ ```
615
+
616
+ Redirects may be executed multiple times, and provider list matching is performed for each redirect.
617
+
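The loop described above, following redirects and checking the provider list at every hop, can be sketched without any network access. In this illustration the `redirects` map stands in for real HTTP redirect lookups, and `resolveWithProviderMatch`, `providerPatterns`, and `maxHops` are hypothetical names; the hop limit is an assumption, not a documented MarkDeco setting.

```typescript
// Hypothetical sketch; a Map replaces real HTTP redirect responses.
function resolveWithProviderMatch(
  startUrl: string,
  redirects: Map<string, string>, // url -> redirect target
  providerPatterns: RegExp[], // patterns from a provider list (non-global regexes)
  maxHops = 5 // assumed safety limit against redirect loops
): string | undefined {
  let url = startUrl;
  for (let hop = 0; hop <= maxHops; hop++) {
    // Provider matching is attempted at every hop, including the starting URL.
    if (providerPatterns.some((pattern) => pattern.test(url))) return url;
    const next = redirects.get(url);
    if (next === undefined) return undefined; // no more redirects and no provider matched
    url = next;
  }
  return undefined; // gave up after too many redirects
}

const redirects = new Map([
  ["https://bit.ly/shortened-link", "https://youtu.be/1La4QzGeaaQ"],
  ["https://youtu.be/1La4QzGeaaQ", "https://www.youtube.com/watch?v=1La4QzGeaaQ"],
]);
const patterns = [/^https:\/\/www\.youtube\.com\/watch/];
console.log(resolveWithProviderMatch("https://bit.ly/shortened-link", redirects, patterns));
```

When no provider matches at any hop, the function returns `undefined`, which is where a caller would switch to the fallback display.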
618
+ Note: This may not work correctly in browser environments due to CORS constraints.
619
+
620
+ #### Fallback Display
621
+
622
+ When a specified URL is from an unsupported provider, appropriate link display is generated:
623
+
624
+ ```html
625
+ <div class="oembed-container oembed-fallback">
626
+ <div class="oembed-header">
627
+ <div class="oembed-title">External Content</div>
628
+ <div class="oembed-provider">from example.com</div>
629
+ </div>
630
+ <div class="oembed-content">
631
+ <a href="https://example.com/content" target="_blank" rel="noopener noreferrer">
632
+ View content on example.com
633
+ </a>
634
+ </div>
635
+ </div>
636
+ ```
637
+
638
+ #### CSS Classes
639
+
640
+ HTML generated by the oEmbed plugin includes CSS classes for styling:
641
+
642
+ |CSS Class|Applied Element|Description|
643
+ |:----|:----|:----|
644
+ |`.oembed-container`| Overall container | Container for entire oEmbed embedding |
645
+ |`.oembed-video`| Container | Additional class for video content |
646
+ |`.oembed-photo`| Container | Additional class for photo content |
647
+ |`.oembed-link`| Container | Additional class for link content |
648
+ |`.oembed-rich`| Container | Additional class for rich content |
649
+ |`.oembed-header`| Header section | Container for title/author/provider info |
650
+ |`.oembed-title`| Title element | Content title |
651
+ |`.oembed-author`| Author element | Author/channel name, etc. |
652
+ |`.oembed-provider`| Provider element | Service provider name |
653
+ |`.oembed-description`| Description element | Content description |
654
+ |`.oembed-thumbnail`| Thumbnail element | Thumbnail image |
655
+ |`.oembed-content`| Embedded element | iframe or actual content |
656
+ |`.oembed-footer`| Footer section | External links, etc. |
657
+ |`.oembed-fallback`| Fallback element | Fallback display for unsupported sites |
658
+
659
+ ### Card Plugin
660
+
661
+ The card plugin scrapes specified URL pages and renders metadata groups. Even pages without oEmbed APIs can be scraped for information and displayed in a unified format.
662
+
663
+ By default, it extracts Open Graph Protocol (OGP) metadata from pages and generates designable HTML for rich preview cards. Other metadata formats can also be flexibly supported by writing extraction rules.
664
+
665
+ ```typescript
666
+ import { createMarkdownProcessor, createCardPlugin, createCachedFetcher } from 'mark-deco';
667
+
668
+ // Create fetcher
669
+ const fetcher = createCachedFetcher('MyApp/1.0');
670
+
671
+ // Generate card plugin
672
+ const cardPlugin = createCardPlugin();
673
+
674
+ const processor = createMarkdownProcessor({
675
+ plugins: [ cardPlugin ],
676
+ fetcher
677
+ });
678
+
679
+ const markdown = `# Product Review
680
+
681
+ ## GitHub Repository
682
+ \`\`\`card
683
+ https://github.com/kekyo/async-primitives
684
+ \`\`\`
685
+
686
+ ## eBay Product
687
+ \`\`\`card
688
+ https://www.ebay.com/itm/167556314958
689
+ \`\`\``;
690
+
691
+ const result = await processor.process(markdown, "id");
692
+
693
+ // Rich card HTML is generated
694
+ console.log(result.html);
695
+ ```
696
+
697
+ Example generated HTML (varies depending on metadata obtained from target page):
698
+
699
+ ```html
700
+ <div class="card-container">
701
+ <a href="[URL]" target="_blank" rel="noopener noreferrer" class="card-link">
702
+ <div class="card-image">
703
+ <img src="[IMAGE_URL]" alt="[TITLE]" loading="lazy" />
704
+ </div>
705
+ <div class="card-body">
706
+ <div class="card-header">
707
+ <div class="card-title">[TITLE]</div>
708
+ <div class="card-provider">
709
+ <img src="[FAVICON]" alt="" class="card-favicon" />
710
+ <span>[SITE_NAME]</span>
711
+ </div>
712
+ </div>
713
+ <div class="card-description">[DESCRIPTION]</div>
714
+ </div>
715
+ </a>
716
+ </div>
717
+ ```
718
+
719
+ Note: OGP is a standard specification that lets social networking services and other consumers obtain webpage metadata uniformly. Instead of each site describing metadata in its own way, it provides information such as `title`, `description`, `image`, and `site_name` in a common HTML meta tag format. This allows the card plugin to obtain metadata in a consistent format from many websites and render cards uniformly.
720
+
721
+ #### Metadata Extraction Rules
722
+
723
+ The card plugin supports rule definitions for extracting metadata. Rules match URL patterns with regular expressions and extract data with CSS selectors:
724
+
725
+ ```typescript
726
+ import { createCardPlugin } from 'mark-deco';
727
+
728
+ const cardPlugin = createCardPlugin({
729
+ scrapingRules: [
730
+ {
731
+ pattern: '^https?://example\\.com/', // URL pattern
732
+ siteName: 'Example Site',
733
+ fields: { // Field configuration group
734
+ title: { // `title` field configuration
735
+ rules: [{ selector: 'h1.main-title', method: 'text' }]
736
+ },
737
+ description: { // `description` field configuration
738
+ rules: [{ selector: '.description', method: 'text' }]
739
+ },
740
+ image: { // `image` field configuration
741
+ rules: [{ selector: '.hero-image img', method: 'attr', attr: 'src' }]
742
+ }
743
+ }
744
+ }
745
+ ]
746
+ });
747
+ ```
748
+
749
+ `FieldConfig` (field configuration):
750
+ - `required`: Whether this field is required (boolean)
751
+ - `rules`: Array of extraction rules. Tried from top to bottom, and the first successful rule's result is used
752
+
753
+ `FieldRule` (extraction rule):
754
+ - `selector`: CSS selector (string or array)
755
+ - `method`: Extraction method (`text`, `attr`, `html`)
756
+ - `attr`: Attribute name when using `attr` method
757
+ - `multiple`: Whether multiple elements can be extracted (boolean)
758
+ - `processor`: Post-processing logic (regex, filters, currency formatting, etc.)
759
+
760
+ Each field's rules are defined as arrays and tried from top to bottom. Once a result is obtained from the first rule, extraction for that field is complete, and subsequent rules aren't executed.
761
+
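The "first successful rule wins" behavior can be sketched as follows. To stay self-contained, the page is modeled as a map from CSS selector to already-extracted text instead of a parsed HTML document; `extractFirstMatch`, `SelectorRule`, and that page model are hypothetical simplifications, not the plugin's real types.

```typescript
// Hypothetical sketch; a Map of selector -> text stands in for a parsed HTML page.
type SelectorRule = { selector: string };

function extractFirstMatch(
  page: Map<string, string>,
  rules: SelectorRule[]
): string | undefined {
  for (const rule of rules) {
    const value = page.get(rule.selector)?.trim();
    // The first rule that yields a non-empty value ends extraction for this field.
    if (value !== undefined && value !== "") return value;
  }
  return undefined; // no rule produced a value
}

const page = new Map([
  ["h1.main-title", "  "], // present but empty, so this rule is skipped
  ["meta[property=\"og:title\"]", "Example Product"],
]);
console.log(
  extractFirstMatch(page, [
    { selector: "h1.main-title" },
    { selector: "meta[property=\"og:title\"]" },
    { selector: "title" },
  ])
);
```

Listing the most specific selector first and a generic one last gives a natural fallback chain for each field.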
762
+ Custom metadata extraction rule groups are applied before standard OGP metadata extraction, enabling more accurate information acquisition.
763
+
764
+ #### Extraction Method Selection
765
+
766
+ The `method` field in extraction rules specifies how to obtain data from HTML elements. Three methods are available:
767
+
768
+ |Extraction Method|Description|Usage Example|
769
+ |:----|:----|:----|
770
+ |`text`|Get element text content (HTML tags removed)|`<span>Hello World</span>` --> `"Hello World"`|
771
+ |`attr`|Get element attribute value|`<img src="image.jpg">` --> `"image.jpg"` (attr: `src`)|
772
+ |`html`|Get element inner HTML (including HTML tags)|`<div><b>Bold</b> text</div>` --> `"<b>Bold</b> text"`|
773
+
774
+ Specific usage examples for each method:
775
+
776
+ ```typescript
777
+ // Text extraction example
778
+ {
779
+ selector: 'h1.title',
780
+ method: 'text' // Default value
781
+ }
782
+
783
+ // Attribute extraction example
784
+ {
785
+ selector: 'meta[property="og:image"]',
786
+ method: 'attr',
787
+ attr: 'content' // Specify attribute name to get
788
+ }
789
+
790
+ // HTML extraction example
791
+ {
792
+ selector: '.rich-content',
793
+ method: 'html' // Get including HTML tags
794
+ }
795
+ ```
796
+
797
+ When `method` is omitted, `text` is used by default. When using the `attr` method, specify the attribute name with the `attr` field; if `attr` is omitted, `href` is used.
798
+
799
+ #### Post-processing Logic
800
+
801
+ You can use the `processor` field to perform post-processing on extracted data. It can be specified in two formats:
802
+
803
+ |Format|Details|
804
+ |:----|:----|
805
+ |Configuration object|Choose from several fixed, built-in processing methods. Because no code is involved, entire rule sets can be serialized as JSON.|
806
+ |Function|You can write custom processing with functions. Any post-processing can be handled.|
807
+
808
+ ##### Configuration Object Format
809
+
810
+ ```typescript
811
+ {
812
+ selector: '.price',
813
+ method: 'text',
814
+ processor: {
815
+ type: 'currency',
816
+ params: {
817
+ symbol: '$',
818
+ locale: 'en-US'
819
+ }
820
+ }
821
+ }
822
+ ```
823
+
824
+ Available configuration types:
825
+
826
+ |Type|Description|Parameter Example|Result Example|
827
+ |:----|:----|:----|:----|
828
+ |`regex`|String conversion with regular expressions|`replace: [{ pattern: '^Prefix:\\s*', replacement: '' }]`|Prefix removal|
829
+ |`filter`|Value filtering by conditions|`contains: 'keep', excludeContains: ['exclude']`|Extract values containing/not containing specific strings|
830
+ |`slice`|Partial array retrieval|`start: 0, end: 3`|Get only first 3 elements|
831
+ |`first`|Get only first value|(no parameters)|`['a', 'b', 'c']` --> `'a'`|
832
+ |`currency`|Currency formatting|`symbol: '$', locale: 'en-US'`|`'19.99'` --> `'$19.99'`|
833
+
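To make the table concrete, here is a minimal sketch of how three of the configuration types (`regex`, `filter`, `slice`) might be applied to an array of extracted strings. The parameter names follow the examples in this section, but `applyProcessor` and the `ProcessorConfigSketch` type are hypothetical, and the plugin's real semantics may differ in detail.

```typescript
// Hypothetical sketch of three configuration types; not mark-deco's implementation.
type ProcessorConfigSketch =
  | { type: "regex"; params: { replace: { pattern: string; replacement: string }[] } }
  | { type: "filter"; params: { contains?: string; excludeContains?: string[]; minLength?: number } }
  | { type: "slice"; params: { start?: number; end?: number } };

function applyProcessor(values: string[], config: ProcessorConfigSketch): string[] {
  switch (config.type) {
    case "regex": {
      const { replace } = config.params;
      // Apply each replacement in order to every value.
      return values.map((value) =>
        replace.reduce(
          (text, { pattern, replacement }) => text.replace(new RegExp(pattern), replacement),
          value
        )
      );
    }
    case "filter": {
      const { contains, excludeContains = [], minLength = 0 } = config.params;
      return values.filter(
        (value) =>
          (contains === undefined || value.includes(contains)) &&
          !excludeContains.some((word) => value.includes(word)) &&
          value.length >= minLength
      );
    }
    case "slice":
      return values.slice(config.params.start, config.params.end);
  }
}

console.log(
  applyProcessor(["Prefix: Hello"], {
    type: "regex",
    params: { replace: [{ pattern: "^Prefix:\\s*", replacement: "" }] },
  })
);
```

Because each configuration is plain data, a rule set built this way can be stored or transmitted as JSON and interpreted later.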
834
+ Composite processing example:
835
+
836
+ ```typescript
837
+ {
838
+ selector: '.feature-list li',
839
+ method: 'text',
840
+ multiple: true, // Get multiple elements
841
+ processor: {
842
+ type: 'filter',
843
+ params: {
844
+ excludeContains: ['advertise', 'buy'], // Exclude advertising text
845
+ minLength: 10 // Exclude items shorter than 10 characters
846
+ }
847
+ }
848
+ }
849
+ ```
850
+
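The effect of the filter parameters used above can be approximated with a small stand-alone function. This is a sketch under stated assumptions: the `applyFilter` helper is hypothetical, matching is assumed to be a case-sensitive substring test, and all given conditions are assumed to be combined with AND.

```typescript
// Illustration only: approximate semantics of the `filter` processor type.
interface FilterParamsSketch {
  contains?: string;
  excludeContains?: string[];
  minLength?: number;
}

const applyFilter = (values: string[], params: FilterParamsSketch): string[] =>
  values.filter((value) => {
    // Keep only values that contain the required substring, when given.
    if (params.contains !== undefined && !value.includes(params.contains)) return false;
    // Drop values that contain any excluded substring.
    if (params.excludeContains?.some((word) => value.includes(word))) return false;
    // Drop values shorter than the minimum length.
    if (params.minLength !== undefined && value.length < params.minLength) return false;
    return true;
  });

const featureTexts = applyFilter(
  [
    'Water resistant up to 50 meters',
    'buy now and save',
    'Light',
    'Battery lasts two days'
  ],
  { excludeContains: ['advertise', 'buy'], minLength: 10 }
);
```

Here the second entry is dropped because it contains `buy`, and the third because it is shorter than 10 characters.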
851
+ ##### Function Format
852
+
853
+ In function format, the processor receives an array of extracted values and a processing context, and returns the processed result:
854
+
855
+ ```typescript
856
+ {
857
+ selector: '.brand-info',
858
+ method: 'text',
859
+ processor: (values, context) => {
860
+ // `values` is an array of extracted text
861
+ // `context` contains information available for conversion
862
+ if (values.length === 0 || !values[0]) return undefined;
863
+ const text = values[0];
864
+ const match = text.match(/ブランド:\s*([^の]+)/); // "ブランド" is Japanese for "brand"; the capture stops at the particle "の"
865
+ return match?.[1]?.trim();
866
+ }
867
+ }
868
+ ```
869
+
870
+ The `context` argument passed to function format `processor` contains the following information:
871
+
872
+ |Property|Type|Description|Usage Example|
873
+ |:----|:----|:----|:----|
874
+ |`$`|`Cheerio`|Cheerio instance for entire page|`context.$.html()` to get entire page HTML|
875
+ |`$head`|`Cheerio`|Cheerio instance for HTML head section|`context.$head('meta[name="description"]')` to get metadata|
876
+ |`url`|`string`|Processing page URL|`context.url` for domain extraction or ASIN extraction|
877
+ |`locale`|`string`|Page language/region information|`context.locale` for language-specific processing|
878
+
879
+ Practical examples using `context`:
880
+
881
+ ```typescript
882
+ // Example extracting ID from URL
883
+ processor: (values, context) => {
884
+ const match = context.url.match(/\/dp\/([A-Z0-9]{10,})/);
885
+ return match ? match[1] : undefined;
886
+ }
887
+
888
+ // Example extracting domain name
889
+ processor: (values, context) => {
890
+ try {
891
+ const url = new URL(context.url);
892
+ return url.hostname.replace(/^www\./, '');
893
+ } catch {
894
+ return 'Unknown Site';
895
+ }
896
+ }
897
+
898
+ // Example performing language-specific processing
899
+ processor: (values, context) => {
900
+ const isJapanese = context.locale?.startsWith('ja');
901
+ return isJapanese
902
+ ? values[0]?.replace(/ブランド:\s*/, '')
903
+ : values[0]?.replace(/Brand:\s*/, '');
904
+ }
905
+ ```
906
+
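The processors above can be exercised outside the plugin with a hand-built context object. The sketch below assumes a reduced context type (`ProcessorContextLike`, with only `url` and `locale`) and a sample URL invented for the illustration; the real context also carries the Cheerio instances listed in the table.

```typescript
// Hypothetical subset of the processor context (only the fields used here).
interface ProcessorContextLike {
  url: string;
  locale?: string;
}

type ProcessorFn = (
  values: string[],
  context: ProcessorContextLike
) => string | undefined;

// Extract an ID from a `/dp/<ID>` URL segment.
const extractProductId: ProcessorFn = (_values, context) => {
  const match = context.url.match(/\/dp\/([A-Z0-9]{10,})/);
  return match ? match[1] : undefined;
};

// Strip a locale-dependent label from the first extracted value.
const stripBrandLabel: ProcessorFn = (values, context) => {
  const isJapanese = context.locale?.startsWith('ja') ?? false;
  return isJapanese
    ? values[0]?.replace(/ブランド:\s*/, '')
    : values[0]?.replace(/Brand:\s*/, '');
};

// Sample context built by hand (the URL is made up for this example).
const sampleContext: ProcessorContextLike = {
  url: 'https://www.example.com/dp/B000000000',
  locale: 'en-US'
};

const productId = extractProductId([], sampleContext);
const brandName = stripBrandLabel(['Brand: Acme'], sampleContext);
```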
907
+ #### Display Field Order Control
908
+
909
+ The card plugin's `displayFields` option controls which metadata items are displayed and in what order:
910
+
911
+ ```typescript
912
+ const cardPlugin = createCardPlugin({
913
+ displayFields: {
914
+ 'image': 1, // Display field name `image` first
915
+ 'title': 2, // Display field name `title` second
916
+ 'description': 3, // Display field name `description` third
917
+ // (Other metadata items won't be displayed even if obtained)
918
+ }
919
+ });
920
+ ```
921
+
922
+ Metadata field names follow the field names in metadata extraction rules.
923
+
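The ordering rule can be pictured with a small stand-alone function. This is only a sketch of the described behavior: `orderFields` is a hypothetical helper, and skipping listed fields that have no extracted value is an assumption.

```typescript
// Illustration only: order metadata entries by their `displayFields` rank
// and drop every field that is not listed.
const orderFields = (
  metadata: Record<string, string>,
  displayFields: Record<string, number>
): Array<[string, string]> => {
  // Sort the listed field names by their numeric rank.
  const names = Object.keys(displayFields).sort(
    (a, b) => (displayFields[a] ?? 0) - (displayFields[b] ?? 0)
  );
  const ordered: Array<[string, string]> = [];
  for (const fieldName of names) {
    const value = metadata[fieldName];
    // Assumption: listed fields without an extracted value are skipped.
    if (value !== undefined) ordered.push([fieldName, value]);
  }
  return ordered;
};

const orderedFields = orderFields(
  { title: 'Example', price: '$10', image: 'https://example.com/a.png' },
  { image: 1, title: 2, description: 3 }
);
```

In this sample, `price` is dropped because it is not listed, and `description` is skipped because no value was extracted for it.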
924
+ #### Link URL Control
925
+
926
+ The card plugin allows control over URLs used for clickable links in generated cards through the `useMetadataUrlLink` option:
927
+
928
+ ```typescript
929
+ // Use URL written in Markdown
930
+ const providedLinkCardPlugin = createCardPlugin({
931
+ useMetadataUrlLink: false // Use URL written in Markdown
932
+ });
933
+
934
+ // Use metadata URL (canonical URL of retrieved page)
935
+ const metadataLinkCardPlugin = createCardPlugin({
936
+ useMetadataUrlLink: true // Use OGP metadata canonical URL
937
+ });
938
+ ```
939
+
940
+ Link URL selection priority:
941
+
942
+ |`useMetadataUrlLink`|URL Source Priority|Purpose|
943
+ |:----|:----|:----|
944
+ |`false`|Written URL|Preserve original URL with tracking parameters (default)|
945
+ |`true`|Extended canonical URL --> OGP URL --> Source URL --> Written URL|Prefer a normalized URL|
946
+
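The priority chain in the table amounts to taking the first available candidate. A minimal sketch, assuming hypothetical field names (`canonicalUrl`, `ogUrl`, `sourceUrl`, `writtenUrl`) that stand in for the four sources named above:

```typescript
// Hypothetical field names; the plugin's real metadata keys may differ.
interface LinkCandidates {
  canonicalUrl?: string; // "Extended canonical URL"
  ogUrl?: string;        // "OGP URL"
  sourceUrl?: string;    // "Source URL"
  writtenUrl: string;    // URL as written in Markdown
}

// Pick the link URL: the written URL when `useMetadataUrlLink` is false,
// otherwise the first available candidate in priority order.
const selectLinkUrl = (
  candidates: LinkCandidates,
  useMetadataUrlLink: boolean
): string =>
  useMetadataUrlLink
    ? candidates.canonicalUrl ??
      candidates.ogUrl ??
      candidates.sourceUrl ??
      candidates.writtenUrl
    : candidates.writtenUrl;

const linkCandidates: LinkCandidates = {
  ogUrl: 'https://example.com/item',
  writtenUrl: 'https://example.com/item?utm_source=newsletter'
};

const normalizedLink = selectLinkUrl(linkCandidates, true);
const writtenLink = selectLinkUrl(linkCandidates, false);
```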
947
+ #### Fallback Processing
948
+
949
+ When a network error occurs during scraping, the plugin renders a fallback card instead. Here's an example of the output when CORS restrictions prevent information retrieval:
950
+
951
+ ```html
952
+ <div class="card-container card-fallback">
953
+ <div class="card-body">
954
+ <div class="card-header">
955
+ <div class="card-title">📄 External Content</div>
956
+ <div class="card-provider">example.com</div>
957
+ </div>
958
+ <div class="card-description">
959
+ CORS Restriction - This site blocks cross-origin requests in browsers
960
+ </div>
961
+ <div class="card-content">
962
+ <a href="[URL]" target="_blank" rel="noopener noreferrer" class="card-external-link">
963
+ → Open example.com in new tab
964
+ </a>
965
+ </div>
966
+ </div>
967
+ </div>
968
+ ```
969
+
970
+ #### CSS Classes
971
+
972
+ HTML generated by the card plugin includes CSS classes for styling:
973
+
974
+ |CSS Class|Applied Element|Description|
975
+ |:----|:----|:----|
976
+ |`.card-container`| Overall container | Container for entire card |
977
+ |`.card-amazon`| Container | Additional class for Amazon products |
978
+ |`.card-fallback`| Container | Additional class for fallback display |
979
+ |`.card-link`| Link element | Clickable link for entire card |
980
+ |`.card-image`| Image container | Image display area |
981
+ |`.card-body`| Body section | Card main content area |
982
+ |`.card-header`| Header section | Container for title/provider info |
983
+ |`.card-title`| Title element | Card title |
984
+ |`.card-provider`| Provider element | Site name/favicon area |
985
+ |`.card-favicon`| Favicon element | Site favicon image |
986
+ |`.card-description`| Description element | Card description |
987
+ |`.card-content`| Content element | Additional content for fallback |
988
+ |`.card-external-link`| External link element | External link for fallback |
989
+ |`.card-{fieldName}`| Specific field | Classes corresponding to each field name (e.g., `.card-price`, `.card-rating`) |
990
+
991
+ Field-specific class naming convention:
992
+
993
+ Classes in `.card-{fieldName}` format are automatically generated based on field names defined in metadata extraction rules. For example, the `price` field gets `.card-price`, and the `rating` field gets `.card-rating`.
994
+
995
+ ### Mermaid Plugin
996
+
997
+ Using the Mermaid plugin, you can create diagrams and flowcharts using [mermaid.js](https://mermaid.js.org/) notation:
998
+
999
+ ```typescript
1000
+ import { createMarkdownProcessor, createMermaidPlugin, createCachedFetcher } from 'mark-deco';
1001
+
1002
+ // Create fetcher
1003
+ const fetcher = createCachedFetcher('MyApp/1.0');
1004
+
1005
+ // Generate Mermaid plugin
1006
+ const mermaidPlugin = createMermaidPlugin();
1007
+
1008
+ const processor = createMarkdownProcessor({
1009
+ plugins: [mermaidPlugin],
1010
+ fetcher
1011
+ });
1012
+
1013
+ const markdown = `# Diagram Example
1014
+
1015
+ \`\`\`mermaid
1016
+ graph TD
1017
+ A[Start] --> B{Decision}
1018
+ B -->|Yes| C[Action1]
1019
+ B -->|No| D[Action2]
1020
+ C --> E[End]
1021
+ D --> E
1022
+ \`\`\``;
1023
+
1024
+ const result = await processor.process(markdown, "id");
1025
+
1026
+ // Contains <div class="mermaid">...</div>
1027
+ console.log(result.html);
1028
+ ```
1029
+
1030
+ The following HTML is generated:
1031
+
1032
+ ```html
1033
+ <div class="mermaid-wrapper">
1034
+ <style> { ... } </style>
1035
+ <div class="mermaid" id="id-1">graph TD
1036
+ A[Start] --&gt; B{Decision}
1037
+ B --&gt;|Yes| C[Action1]
1038
+ B --&gt;|No| D[Action2]
1039
+ C --&gt; E[End]
1040
+ D --&gt; E</div>
1041
+ </div>
1042
+ ```
1043
+
1044
+ Note that the Mermaid plugin does not generate SVG graphics itself; it only produces the HTML elements that Mermaid consumes. The generated HTML alone is therefore not enough to draw a diagram: the page that displays it must also load the Mermaid script.
1045
+
1046
+ Generated HTML has the following characteristics:
1047
+
1048
+ * Diagram code is properly HTML-escaped to prevent XSS attacks.
1049
+ * Wrapped with `mermaid-wrapper` class and includes styles that override SVG size constraints.
1050
+ * Unique IDs are assigned by default, allowing proper identification when multiple diagrams exist.
1051
+
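The escaping and wrapping described in the list above can be sketched as follows. The `escapeHtml` and `wrapMermaid` helpers are hypothetical and written only for this illustration; the plugin's actual output also contains a `<style>` element, which is omitted here.

```typescript
// Illustration only: escape diagram source so it can be embedded in HTML.
const escapeHtml = (text: string): string =>
  text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');

// Wrap the escaped code in the wrapper/diagram structure shown above.
const wrapMermaid = (code: string, id: string): string =>
  `<div class="mermaid-wrapper">` +
  `<div class="mermaid" id="${escapeHtml(id)}">${escapeHtml(code)}</div>` +
  `</div>`;

const wrappedDiagram = wrapMermaid('A[Start] --> B{Decision}', 'id-1');
```

Because `>` becomes `&gt;`, the arrow `-->` appears as `--&gt;` in the markup, as in the generated HTML above.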
1052
+ For details on loading the Mermaid script, refer to the Mermaid documentation. Here's a simple example:
1053
+
1054
+ ```html
1055
+ <!DOCTYPE html>
1056
+ <html>
1057
+ <head>
1058
+ <title>Mermaid Rendering</title>
1059
+ <!-- Mermaid.js CDN -->
1060
+ <script src="https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.min.js"></script>
1061
+ </head>
1062
+ <body>
1063
+ <div id="content">
1064
+ <!-- Insert processor converted result HTML here -->
1065
+ </div>
1066
+ <script>
1067
+ // Initialize Mermaid
1068
+ mermaid.initialize({
1069
+ startOnLoad: true,
1070
+ theme: 'default'
1071
+ });
1072
+ </script>
1073
+ </body>
1074
+ </html>
1075
+ ```
1076
+
1077
+ #### Considerations for Dynamically Updating HTML in Browser Environments
1078
+
1079
+ Normally, Mermaid scans the page once when it loads, parses the Mermaid code it finds, and replaces it with rendered SVG.
1080
+ As a result, when you update the DOM dynamically (via `innerHTML`, etc.), SVG is not generated for the new content automatically. You need to trigger Mermaid rendering manually.
1081
+
1082
+ Here's a simple example of dynamically updating HTML containing Mermaid diagrams:
1083
+
1084
+ ```typescript
1085
+ // Function to process Markdown
1086
+ const processAndUpdate = async () => {
1087
+ // ...
1088
+
1089
+ // Execute MarkDeco processor
1090
+ const result = await processor.process(markdown, "id");
1091
+
1092
+ // Update DOM
1093
+ document.getElementById('output').innerHTML = result.html;
1094
+
1095
+ // If Mermaid diagrams exist
1096
+ if (result.html.includes('class="mermaid"')) {
1097
+ // Wait for DOM update completion (100ms)
1098
+ setTimeout(() => {
1099
+ // Render the not-yet-processed diagrams as SVG (`mermaid.init` is deprecated as of Mermaid v10; `mermaid.run` replaces it)
1100
+ window.mermaid.init(
1101
+ undefined,
1102
+ document.querySelectorAll('.mermaid:not([data-processed="true"])'));
1103
+ }, 100);
1104
+ }
1105
+ };
1106
+ ```
1107
+
1108
+ ## Creating Custom Plugins
1109
+
1110
+ MarkDeco allows you to implement and use your own plugins, not just built-in ones.
1111
+
1112
+ Plugins can intercept Markdown code block notation. In the following example, processing is delegated to a plugin named `foobar`:
1113
+
1114
+ ```markdown
1115
+ \`\`\`foobar
1116
+ Custom plugin directive text...
1117
+ \`\`\`
1118
+ ```
1119
+
1120
+ The text "Custom plugin directive text..." inside the code block is passed to the plugin. The plugin should interpret this text and provide custom functionality:
1121
+
1122
+ ```typescript
1123
+ import { createMarkdownProcessor } from 'mark-deco';
+ import type { Plugin, PluginContext } from 'mark-deco';
1124
+
1125
+ // Define custom plugin as a function
1126
+ const createFooBarPlugin = (): Plugin => {
1127
+ // content contains code block text, context contains information and functions needed for operations
1128
+ const processBlock = async (content: string, context: PluginContext): Promise<string> => {
1129
+ // Implement custom processing (this example simply outputs text in a div; a real plugin should HTML-escape `content`)
1130
+ return `<div class="custom-block">${content}</div>`;
1131
+ };
1132
+ // Return Plugin object
1133
+ return {
1134
+ name: 'foobar', // Plugin name
1135
+ processBlock // Plugin handler
1136
+ };
1137
+ };
1138
+
1139
+ // Generate and register plugin (`fetcher` is created with `createCachedFetcher`, as in the earlier examples)
1140
+ const fooBarPlugin = createFooBarPlugin();
1141
+ const processor = createMarkdownProcessor({
1142
+ plugins: [ fooBarPlugin ],
1143
+ fetcher
1144
+ });
1145
+ ```
1146
+
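Since code block text comes straight from the Markdown source, a plugin that embeds it in HTML will usually want to escape it first. The following is a minimal sketch of that idea. To keep it self-contained, it uses a local stand-in type (`MinimalPlugin`) instead of importing `Plugin` from `mark-deco`, and the helper names are hypothetical.

```typescript
// Local stand-in for the `Plugin` shape described above (name + processBlock).
interface MinimalPlugin {
  readonly name: string;
  readonly processBlock: (content: string, context: unknown) => Promise<string>;
}

// Escape characters that would otherwise be interpreted as HTML markup.
const escapeText = (text: string): string =>
  text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');

// Synchronous rendering step, kept separate so it is easy to test.
const renderSafeBlock = (content: string): string =>
  `<div class="custom-block">${escapeText(content)}</div>`;

// Same structure as `createFooBarPlugin`, but with escaped output.
const createSafeFooBarPlugin = (): MinimalPlugin => ({
  name: 'foobar',
  processBlock: async (content) => renderSafeBlock(content)
});

const safePlugin = createSafeFooBarPlugin();
const renderedBlock = renderSafeBlock('<script>alert(1)</script>');
```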
1147
+ ### PluginContext
1148
+
1149
+ The plugin's `processBlock` method receives a `PluginContext` as the second argument. This object contains the functionality and data needed for plugin processing:
1150
+
1151
+ ```typescript
1152
+ interface PluginContext {
1153
+ /** Logger instance for log output */
1154
+ readonly logger: Logger;
1155
+ /** AbortSignal for processing cancellation (undefined if not specified) */
1156
+ readonly signal: AbortSignal | undefined;
1157
+ /** Frontmatter data extracted from Markdown */
1158
+ readonly frontmatter: FrontmatterData;
1159
+ /** Unique ID generation function (prefix + sequential number format) */
1160
+ readonly getUniqueId: () => string;
1161
+ /** Fetcher for HTTP requests */
1162
+ readonly fetcher: FetcherType;
1163
+ }
1164
+ ```
1165
+
1166
+ Usage of each property:
1167
+
1168
+ |Property|Type|Description|
1168
+ |:----|:----|:----|
1170
+ |`logger`|`Logger`| Used for debug info and error log output |
1171
+ |`signal`|`AbortSignal \| undefined`| Used for supporting long-running process cancellation |
1172
+ |`frontmatter`|`FrontmatterData`| Conditional branching based on frontmatter data |
1173
+ |`getUniqueId`|`() => string`| Used for assigning unique IDs to HTML elements |
1174
+ |`fetcher`|`FetcherType`| Used for external API access and page scraping |
1175
+
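As a rough sketch of how these properties might be used together, the function below checks for cancellation, branches on frontmatter, and assigns a unique ID. It relies on local stand-in types rather than the real `PluginContext`: `FrontmatterLike` is an assumed shape for `FrontmatterData`, the `draft` key is hypothetical, and `logger` and `fetcher` are left out because their exact interfaces are not shown here.

```typescript
// Assumed shape; the real `FrontmatterData` type may differ.
type FrontmatterLike = Record<string, unknown>;

// Local stand-in mirroring part of the `PluginContext` interface above.
interface NoteContext {
  readonly signal: { readonly aborted: boolean } | undefined;
  readonly frontmatter: FrontmatterLike;
  readonly getUniqueId: () => string;
}

// Core of a hypothetical `processBlock` handler. An async wrapper such as
// `async (content, context) => renderNote(content, context)` would adapt it.
const renderNote = (content: string, context: NoteContext): string => {
  // Stop early when the caller has cancelled processing.
  if (context.signal?.aborted) {
    throw new Error('Processing was aborted');
  }
  // Branch on frontmatter data (the `draft` key is hypothetical).
  const label = context.frontmatter['draft'] === true ? 'Draft note' : 'Note';
  // Give the generated element a unique ID.
  const id = context.getUniqueId();
  // `content` is assumed to be already HTML-escaped here.
  return `<aside id="${id}" class="note"><strong>${label}:</strong> ${content}</aside>`;
};

// Hand-built context for trying the function outside the processor.
let nextNoteId = 0;
const sampleNoteContext: NoteContext = {
  signal: undefined,
  frontmatter: { draft: true },
  getUniqueId: () => `id-${++nextNoteId}`
};

const noteHtml = renderNote('Check the figures.', sampleNoteContext);
```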
1176
+ ### Custom Unified Plugins
1293
+
1294
+ You can extend `unified` processing capabilities by adding `remark` and `rehype` plugins. This is an advanced feature that requires knowledge of `unified`, `remark`, and `rehype`:
1295
+
1296
+ ```typescript
1297
+ import remarkMath from 'remark-math';
+ import remarkToc from 'remark-toc';
1298
+ import rehypeKatex from 'rehype-katex';
1299
+ import rehypeHighlight from 'rehype-highlight';
1300
+
1301
+ const result = await processor.process(
1302
+ markdown, "id", {
1303
+ // Advanced options
1304
+ advancedOptions: {
1305
+ // Add remark plugins (processed before GFM)
1306
+ remarkPlugins: [
1307
+ remarkMath, // Add math support
1308
+ [remarkToc, { tight: true }] // Add table of contents with options
1309
+ ],
1310
+ // Add rehype plugins (processed after HTML generation)
1311
+ rehypePlugins: [
1312
+ rehypeKatex, // Render math with KaTeX
1313
+ [rehypeHighlight, { // Syntax highlighting with options
1314
+ detect: true,
1315
+ subset: ['javascript', 'typescript', 'python']
1316
+ }]
1317
+ ]
1318
+ }
1319
+ });
1320
+ ```
1321
+
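If no existing plugin fits, you can write your own: a unified plugin is a function that returns a transformer for the syntax tree. The sketch below is a hypothetical `rehype`-style plugin that makes every link open in a new tab. To stay self-contained it declares minimal stand-ins for the hast node types instead of importing them, and passing it through `rehypePlugins` in `advancedOptions` is assumed to work like the plugins in the example above.

```typescript
// Minimal stand-ins for hast node types; a real project would use the
// types from the `hast` package instead.
interface ElementNode {
  type: 'element';
  tagName: string;
  properties: Record<string, unknown>;
  children: TreeNode[];
}
interface TextNode {
  type: 'text';
  value: string;
}
interface RootNode {
  type: 'root';
  children: TreeNode[];
}
type TreeNode = ElementNode | TextNode;

// Hypothetical plugin: returns a transformer that mutates the tree in place.
const rehypeExternalTarget = () => (tree: RootNode): void => {
  const visit = (nodes: TreeNode[]): void => {
    for (const node of nodes) {
      if (node.type !== 'element') continue;
      if (node.tagName === 'a') {
        node.properties['target'] = '_blank';
        node.properties['rel'] = ['noopener', 'noreferrer'];
      }
      visit(node.children);
    }
  };
  visit(tree.children);
};

// Hand-built tree equivalent to <p><a href="https://example.com">link</a></p>.
const sampleLink: ElementNode = {
  type: 'element',
  tagName: 'a',
  properties: { href: 'https://example.com' },
  children: [{ type: 'text', value: 'link' }]
};
const sampleTree: RootNode = {
  type: 'root',
  children: [
    { type: 'element', tagName: 'p', properties: {}, children: [sampleLink] }
  ]
};

rehypeExternalTarget()(sampleTree);
```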
1322
+ ----
1323
+
1324
+ ## CLI Application
1325
+
1326
+ MarkDeco includes a CLI application for processing Markdown files from the command line. It supports reading from standard input, file processing, and detailed customization using configuration files.
1327
+
1328
+ ### Installation
1329
+
1330
+ ```bash
1331
+ # Global installation
1332
+ npm install -g mark-deco-cli
1333
+
1334
+ # Or run directly with npx
1335
+ npx mark-deco-cli input.md
1336
+ ```
1337
+
1338
+ ### Basic Usage
1339
+
1340
+ ```bash
1341
+ # From standard input to standard output
1342
+ echo "# Hello World" | mark-deco-cli
1343
+
1344
+ # Process file
1345
+ mark-deco-cli -i input.md
1346
+
1347
+ # Save output to file
1348
+ mark-deco-cli -i input.md -o output.html
1349
+ ```
1350
+
1351
+ ### Command Line Options
1352
+
1353
+ ```
1354
+ Options:
1355
+ -i, --input <file> Input Markdown file (default: standard input)
1356
+ -o, --output <file> Output HTML file (default: standard output)
1357
+ -c, --config <file> Configuration file path
1358
+ -p, --plugins <plugins...> Enable specific plugins (oembed, card, mermaid)
1359
+ --no-plugins Disable all standard plugins
1360
+ --unique-id-prefix <prefix> Unique ID prefix (default: "section")
1361
+ --hierarchical-heading-id Use hierarchical heading IDs (default: true)
1362
+ --content-based-heading-id Use content-based heading IDs (default: false)
1363
+ --frontmatter-output <file> Output frontmatter as JSON to specified file
1364
+ --heading-tree-output <file> Output heading tree as JSON to specified file
1365
+ -h, --help Display help
1366
+ -V, --version Display version
1367
+ ```
1368
+
1369
+ ### Usage Examples
1370
+
1371
+ ```bash
1372
+ # Basic Markdown processing
1373
+ echo "# Hello World" | mark-deco-cli
1374
+
1375
+ # File processing with custom ID prefix
1376
+ mark-deco-cli -i document.md --unique-id-prefix "doc"
1377
+
1378
+ # Disable all plugins
1379
+ mark-deco-cli -i simple.md --no-plugins
1380
+
1381
+ # Enable only specific plugins
1382
+ mark-deco-cli -i content.md -p oembed mermaid
1383
+
1384
+ # Use configuration file
1385
+ mark-deco-cli -i content.md -c config.json
1386
+
1387
+ # Output frontmatter and HTML separately
1388
+ mark-deco-cli -i article.md -o article.html --frontmatter-output metadata.json
1389
+ ```
1390
+
1391
+ ### Configuration File
1392
+
1393
+ You can specify default options in a JSON configuration file:
1394
+
1395
+ ```json
1396
+ {
1397
+ "plugins": ["oembed", "card", "mermaid"],
1398
+ "uniqueIdPrefix": "section",
1399
+ "hierarchicalHeadingId": true,
1400
+ "contentBasedHeadingId": false,
1401
+ "oembed": {
1402
+ "enabled": true,
1403
+ "timeout": 5000
1404
+ },
1405
+ "card": {
1406
+ "enabled": true
1407
+ },
1408
+ "mermaid": {
1409
+ "enabled": true,
1410
+ "theme": "default"
1411
+ }
1412
+ }
1413
+ ```
1414
+
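When generating or validating such a file from a script, it can help to describe its shape as a type. The `CliConfigSketch` interface below is inferred only from the sample above and is not an official schema; the CLI may accept additional or different keys. The `parseConfig` helper is likewise hypothetical.

```typescript
// Hypothetical type inferred from the sample configuration above.
interface CliConfigSketch {
  plugins?: string[];
  uniqueIdPrefix?: string;
  hierarchicalHeadingId?: boolean;
  contentBasedHeadingId?: boolean;
  oembed?: { enabled?: boolean; timeout?: number };
  card?: { enabled?: boolean };
  mermaid?: { enabled?: boolean; theme?: string };
}

// Parse a JSON string and make sure the top level is an object.
const parseConfig = (json: string): CliConfigSketch => {
  const parsed: unknown = JSON.parse(json);
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    throw new Error('Configuration must be a JSON object');
  }
  return parsed as CliConfigSketch;
};

const sampleConfig = parseConfig(
  '{"plugins": ["mermaid"], "uniqueIdPrefix": "doc"}'
);
// Fall back to the documented default prefix when the key is absent.
const idPrefix = sampleConfig.uniqueIdPrefix ?? 'section';
```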
1415
+ ----
1416
+
1417
+ ## License
1418
+
1419
+ Under MIT.
1420
+
1421
+ ## Changelog
1422
+
1423
+ * 0.1.0:
1424
+ * First public release.