markdown-to-jsx 8.0.0 → 9.0.0

package/README.md CHANGED
@@ -1,20 +1,38 @@
1
- **markdown-to-jsx**
1
+ [![npm version](https://badge.fury.io/js/markdown-to-jsx.svg)](https://badge.fury.io/js/markdown-to-jsx) [![downloads](https://badgen.net/npm/dy/markdown-to-jsx)](https://npm-stat.com/charts.html?package=markdown-to-jsx)
2
2
 
3
- The most lightweight, customizable React markdown component.
3
+ `markdown-to-jsx` is a GFM- and CommonMark-compliant markdown parser and compiler toolchain for JavaScript and TypeScript-based projects. It is extremely fast: large documents can be processed quickly enough for real-time interactivity.
4
4
 
5
- [![npm version](https://badge.fury.io/js/markdown-to-jsx.svg)](https://badge.fury.io/js/markdown-to-jsx) [![downloads](https://badgen.net/npm/dy/markdown-to-jsx)](https://npm-stat.com/charts.html?package=markdown-to-jsx)
5
+ Some special features of the library:
6
+
7
+ - Arbitrary HTML is supported and parsed into the appropriate JSX representation
8
+ without `dangerouslySetInnerHTML`
9
+
10
+ - Any HTML tags rendered by the compiler and/or `<Markdown>` component can be overridden to include additional props or even a different HTML representation entirely.
11
+
12
+ - All GFM special syntaxes are supported, including tables, task lists, strikethrough, autolinks, tag filtering, and more.
13
+
14
+ - Fenced code blocks with [highlight.js](https://highlightjs.org/) support; see [Syntax highlighting](#syntax-highlighting) for instructions on setting up highlight.js.
15
+
16
+ <h2>Table of Contents</h2>
6
17
 
7
18
  <!-- TOC -->
8
19
 
9
20
  - [Upgrading](#upgrading)
21
+ - [From v8.x to v9.x](#from-v8x-to-v9x)
10
22
  - [From v7.x to v8.x](#from-v7x-to-v8x)
11
23
  - [Installation](#installation)
12
24
  - [Usage](#usage)
13
- - [Parsing Options](#parsing-options)
25
+ - [Entry Points](#entry-points)
26
+ - [Main](#main)
27
+ - [React](#react)
28
+ - [HTML](#html)
29
+ - [Markdown](#markdown)
30
+ - [Library Options](#library-options)
14
31
  - [options.forceBlock](#optionsforceblock)
15
32
  - [options.forceInline](#optionsforceinline)
16
33
  - [options.wrapper](#optionswrapper)
17
34
  - [Other useful recipes](#other-useful-recipes)
35
+ - [options.wrapperProps](#optionswrapperprops)
18
36
  - [options.forceWrapper](#optionsforcewrapper)
19
37
  - [options.overrides - Void particular banned tags](#optionsoverrides---void-particular-banned-tags)
20
38
  - [options.overrides - Override Any HTML Tag's Representation](#optionsoverrides---override-any-html-tags-representation)
@@ -24,43 +42,124 @@ The most lightweight, customizable React markdown component.
24
42
  - [options.renderRule](#optionsrenderrule)
25
43
  - [options.sanitizer](#optionssanitizer)
26
44
  - [options.slugify](#optionsslugify)
27
- - [options.namedCodesToUnicode](#optionsnamedcodestounicode)
28
45
  - [options.disableAutoLink](#optionsdisableautolink)
29
46
  - [options.disableParsingRawHTML](#optionsdisableparsingrawhtml)
30
- - [options.ast](#optionsast)
47
+ - [options.tagfilter](#optionstagfilter)
31
48
  - [Syntax highlighting](#syntax-highlighting)
32
49
  - [Handling shortcodes](#handling-shortcodes)
33
50
  - [Getting the smallest possible bundle size](#getting-the-smallest-possible-bundle-size)
34
51
  - [Usage with Preact](#usage-with-preact)
35
- - [Gotchas](#gotchas)
36
- - [Passing props to stringified React components](#passing-props-to-stringified-react-components)
37
- - [Significant indentation inside arbitrary HTML](#significant-indentation-inside-arbitrary-html)
52
+ - [AST Anatomy](#ast-anatomy)
53
+ - [Node Types](#node-types)
54
+ - [Example AST Structure](#example-ast-structure)
55
+ - [Type Checking](#type-checking)
56
+ - [Gotchas](#gotchas)
57
+ - [Passing props to stringified React components](#passing-props-to-stringified-react-components)
58
+ - [Significant indentation inside arbitrary HTML](#significant-indentation-inside-arbitrary-html)
38
59
  - [Code blocks](#code-blocks)
39
- - [Using The Compiler Directly](#using-the-compiler-directly)
40
60
  - [Changelog](#changelog)
41
61
  - [Donate](#donate)
42
62
 
43
63
  <!-- /TOC -->
44
64
 
45
- ---
65
+ ## Upgrading
46
66
 
47
- `markdown-to-jsx` offers the following additional benefits over simple markdown parsing:
67
+ ### From v8.x to v9.x
48
68
 
49
- - Arbitrary HTML is supported and parsed into the appropriate JSX representation
50
- without `dangerouslySetInnerHTML`
69
+ **Breaking Changes:**
51
70
 
52
- - Any HTML tags rendered by the compiler and/or `<Markdown>` component can be overridden to include additional
53
- props or even a different HTML representation entirely.
71
+ - **`ast` option removed**: The `ast: true` option on `compiler()` has been removed. Use the new `parser()` function instead to access the AST directly.
54
72
 
55
- - GFM task list support.
73
+ ```typescript
74
+ // Before (v8)
75
+ import { compiler } from 'markdown-to-jsx'
76
+ const ast = compiler('# Hello world', { ast: true })
56
77
 
57
- - Fenced code blocks with [highlight.js](https://highlightjs.org/) support; see [Syntax highlighting](#syntax-highlighting) for instructions on setting up highlight.js.
78
+ // After (v9)
79
+ import { parser } from 'markdown-to-jsx'
80
+ const ast = parser('# Hello world')
81
+ ```
58
82
 
59
- All this clocks in at around 7.5 kB gzipped, which is a fraction of the size of most other React markdown components.
83
+ - **`namedCodesToUnicode` option removed**: The `namedCodesToUnicode` option has been removed. All named HTML entities are now supported by default via the full entity list, so custom entity mappings are no longer needed.
60
84
 
61
- Requires React >= 0.14.
85
+ ```typescript
86
+ // Before (v8)
87
+ import { compiler } from 'markdown-to-jsx'
88
+ compiler('&le; symbol', { namedCodesToUnicode: { le: '\u2264' } })
62
89
 
63
- ## Upgrading
90
+ // After (v9)
91
+ import { compiler } from 'markdown-to-jsx'
92
+ compiler('&le; symbol') // All entities supported automatically
93
+ ```
94
+
95
+ - **`tagfilter` enabled by default**: Dangerous HTML tags (`script`, `iframe`, `style`, `title`, `textarea`, `xmp`, `noembed`, `noframes`, `plaintext`) are now escaped by default in both HTML string output and React JSX output. Previously these tags were rendered as JSX elements in React output.
96
+
97
+ ```typescript
98
+ // Before (v8) - tags rendered as JSX elements
99
+ compiler('<script>alert("xss")</script>') // Rendered as <script> element
100
+
101
+ // After (v9) - tags escaped by default
102
+ compiler('<script>alert("xss")</script>') // Renders as <span>&lt;script&gt;</span>
103
+
104
+ // To restore old behavior:
105
+ compiler('<script>alert("xss")</script>', { tagfilter: false })
106
+ ```
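The escaping described above follows the idea of GFM's tagfilter extension: the leading `<` of a filtered tag is replaced with `&lt;` so the tag is shown as text instead of being interpreted. Below is a minimal sketch of that idea on a plain string, assuming a regex-based approach. `FILTERED_TAGS`, `FILTERED_TAG_START`, and `escapeFilteredTags` are hypothetical names used for illustration only; they are not exports of `markdown-to-jsx`, and the library's internal implementation may differ.

```typescript
// Tag names taken from the list in the breaking-change note above.
const FILTERED_TAGS: string[] = [
  'script', 'iframe', 'style', 'title', 'textarea',
  'xmp', 'noembed', 'noframes', 'plaintext',
]

// Matches a "<" that opens or closes one of the filtered tags.
// The lookahead keeps the tag name out of the match, so only "<" is replaced.
const FILTERED_TAG_START = new RegExp(
  '<(?=/?(?:' + FILTERED_TAGS.join('|') + ')(?:[\\s/>]|$))',
  'gi'
)

// Hypothetical helper: escape only the leading "<" of filtered tags.
function escapeFilteredTags(html: string): string {
  return html.replace(FILTERED_TAG_START, '&lt;')
}
```

Tags outside the list, such as `<strong>`, pass through this sketch unchanged, which mirrors why `overrides` is still needed for tags like `object`.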
107
+
108
+ **New Features:**
109
+
110
+ - **New `parser` function**: Provides direct access to the parsed AST without rendering. This is the recommended way to get AST nodes.
111
+
112
+ - **New entry points**: React-specific, HTML-specific, and markdown-specific entry points are now available for better tree-shaking and separation of concerns.
113
+
114
+ ```typescript
115
+ // React-specific usage
116
+ import Markdown, { compiler, parser } from 'markdown-to-jsx/react'
117
+
118
+ // HTML string output
119
+ import { compiler, astToHTML, parser } from 'markdown-to-jsx/html'
120
+
121
+ // Markdown string output (round-trip compilation)
122
+ import { compiler, astToMarkdown, parser } from 'markdown-to-jsx/markdown'
123
+ ```
124
+
125
+ **Migration Guide:**
126
+
127
+ 1. **Replace `compiler(..., { ast: true })` with `parser()`**:
128
+
129
+ ```typescript
130
+ // Before
131
+ import { compiler } from 'markdown-to-jsx'
132
+ const ast = compiler(markdown, { ast: true })
133
+
134
+ // After
135
+ import { parser } from 'markdown-to-jsx'
136
+ const ast = parser(markdown)
137
+ ```
138
+
139
+ 2. **Migrate React imports to `/react` entry point** (optional but recommended):
140
+
141
+ ```typescript
142
+ // Before
143
+ import Markdown, { compiler } from 'markdown-to-jsx'
144
+
145
+ // After (recommended)
146
+ import Markdown, { compiler } from 'markdown-to-jsx/react'
147
+ ```
148
+
149
+ 3. **Remove `namedCodesToUnicode` option**: All named HTML entities are now supported automatically, so you can remove any custom entity mappings.
150
+
151
+ ```typescript
152
+ // Before
153
+ compiler('&le; symbol', { namedCodesToUnicode: { le: '\u2264' } })
154
+
155
+ // After
156
+ compiler('&le; symbol') // Works automatically
157
+ ```
158
+
159
+ **Note:** The main entry point (`markdown-to-jsx`) continues to work for backward compatibility, but React code there is deprecated and will be removed in a future major release. Consider migrating to `markdown-to-jsx/react` for React-specific usage.
160
+
161
+ <details>
162
+ <summary>### Older Migration Guides</summary>
64
163
 
65
164
  ### From v7.x to v8.x
66
165
 
@@ -86,6 +185,8 @@ if (node.type === RuleType.textBolded) { ... }
86
185
  if (node.type === RuleType.textFormatted && node.bold) { ... }
87
186
  ```
88
187
 
188
+ </details>
189
+
89
190
  ## Installation
90
191
 
91
192
  Install `markdown-to-jsx` with your favorite package manager.
@@ -100,7 +201,7 @@ npm i markdown-to-jsx
100
201
 
101
202
  ES6-style usage\*:
102
203
 
103
- ```jsx
204
+ ```tsx
104
205
  import Markdown from 'markdown-to-jsx'
105
206
  import React from 'react'
106
207
  import { render } from 'react-dom'
@@ -116,7 +217,75 @@ render(<Markdown># Hello world!</Markdown>, document.body)
116
217
 
117
218
  \* **NOTE: JSX does not natively preserve newlines in multiline text. In general, writing markdown directly in JSX is discouraged and it's a better idea to keep your content in separate .md files and require them, perhaps using webpack's [raw-loader](https://github.com/webpack-contrib/raw-loader).**
118
219
 
119
- ### Parsing Options
220
+ ### Entry Points
221
+
222
+ `markdown-to-jsx` provides multiple entry points for different use cases:
223
+
224
+ #### Main
225
+
226
+ The legacy default entry point exports everything, including the React compiler and component:
227
+
228
+ ```tsx
229
+ import Markdown, { compiler, parser } from 'markdown-to-jsx'
230
+ ```
231
+
232
+ _The React code in this entry point is deprecated and will be removed in a future major release; migrate to `markdown-to-jsx/react`._
233
+
234
+ #### React
235
+
236
+ For React-specific usage, import from the `/react` entry point:
237
+
238
+ ```tsx
239
+ import Markdown, { compiler, parser, astToJSX } from 'markdown-to-jsx/react'
240
+
241
+ // Use compiler for markdown → JSX
242
+ const jsxElement = compiler('# Hello world')
243
+
244
+ const markdown = `# Hello world`
245
+
246
+ function App() {
247
+ return <Markdown children={markdown} />
248
+ }
249
+
250
+ // Or use parser + astToJSX for total control
251
+ const ast = parser('# Hello world')
252
+ const jsxElement2 = astToJSX(ast)
253
+ ```
254
+
255
+ #### HTML
256
+
257
+ For HTML string output (server-side rendering), import from the `/html` entry point:
258
+
259
+ ```tsx
260
+ import { compiler, html, parser } from 'markdown-to-jsx/html'
261
+
262
+ // Convenience function that combines parsing and HTML rendering
263
+ const htmlString = compiler('# Hello world')
264
+ // Returns: '<h1>Hello world</h1>'
265
+
266
+ // Or use parser + html separately for more control
267
+ const ast = parser('# Hello world')
268
+ const htmlString2 = html(ast)
269
+ ```
270
+
271
+ #### Markdown
272
+
273
+ For markdown-to-markdown compilation (normalization and formatting), import from the `/markdown` entry point:
274
+
275
+ ```typescript
276
+ import { compiler, astToMarkdown, parser } from 'markdown-to-jsx/markdown'
277
+
278
+ // Convenience function that parses and recompiles markdown
279
+ const normalizedMarkdown = compiler('# Hello world\n\nExtra spaces!')
280
+ // Returns: '# Hello world\n\nExtra spaces!\n'
281
+
282
+ // Or work with AST directly
283
+ const ast = parser('# Hello world')
284
+ const normalizedMarkdown2 = astToMarkdown(ast)
285
+ // Returns: '# Hello world\n'
286
+ ```
287
+
288
+ ### Library Options
120
289
 
121
290
  #### options.forceBlock
122
291
 
@@ -134,37 +303,36 @@ But this string would be considered "block" due to the existence of a header tag
134
303
 
135
304
  However, if you really want all input strings to be treated as "block" layout, simply pass `options.forceBlock = true` like this:
136
305
 
137
- ```jsx
138
- ;<Markdown options={{ forceBlock: true }}>Hello there old chap!</Markdown>
306
+ ```tsx
307
+ <Markdown options={{ forceBlock: true }}>Hello there old chap!</Markdown>
139
308
 
140
309
  // or
141
310
 
142
311
  compiler('Hello there old chap!', { forceBlock: true })
143
312
 
144
313
  // renders
145
- ;<p>Hello there old chap!</p>
314
+ <p>Hello there old chap!</p>
146
315
  ```
147
316
 
148
317
  #### options.forceInline
149
318
 
150
319
  The inverse is also available by passing `options.forceInline = true`:
151
320
 
152
- ```jsx
153
- ;<Markdown options={{ forceInline: true }}># You got it babe!</Markdown>
321
+ ```tsx
322
+ <Markdown options={{ forceInline: true }}># You got it babe!</Markdown>
154
323
 
155
324
  // or
156
-
157
325
  compiler('# You got it babe!', { forceInline: true })
158
326
 
159
327
  // renders
160
- ;<span># You got it babe!</span>
328
+ <span># You got it babe!</span>
161
329
  ```
162
330
 
163
331
  #### options.wrapper
164
332
 
165
333
  When there are multiple children to be rendered, the compiler will wrap the output in a `div` by default. You can override this default by setting the `wrapper` option to either a string (React Element) or a component.
166
334
 
167
- ```jsx
335
+ ```tsx
168
336
  const str = '# Heck Yes\n\nThis is great!'
169
337
 
170
338
  <Markdown options={{ wrapper: 'article' }}>
@@ -187,7 +355,7 @@ compiler(str, { wrapper: 'article' });
187
355
 
188
356
  To get an array of children back without a wrapper, set `wrapper` to `null`. This is particularly useful when using `compiler(…)` directly.
189
357
 
190
- ```jsx
358
+ ```tsx
191
359
  compiler('One\n\nTwo\n\nThree', { wrapper: null })
192
360
 
193
361
  // returns
@@ -196,11 +364,29 @@ compiler('One\n\nTwo\n\nThree', { wrapper: null })
196
364
 
197
365
  To render children at the same DOM level as `<Markdown>` with no HTML wrapper, set `wrapper` to `React.Fragment`. This will still wrap your children in a React node for the purposes of rendering, but the wrapper element won't show up in the DOM.
198
366
 
367
+ #### options.wrapperProps
368
+
369
+ Props to apply to the wrapper element when `wrapper` is used.
370
+
371
+ ```tsx
372
+ <Markdown options={{
373
+ wrapper: 'article',
374
+ wrapperProps: { className: 'post', 'data-testid': 'markdown-content' }
375
+ }}>
376
+ # Hello World
377
+ </Markdown>
378
+
379
+ // renders
380
+ <article class="post" data-testid="markdown-content">
381
+ <h1>Hello World</h1>
382
+ </article>
383
+ ```
384
+
199
385
  #### options.forceWrapper
200
386
 
201
387
  By default, the compiler does not wrap the rendered contents if there is only a single child. You can change this by setting `forceWrapper` to `true`. If the child is inline, it will not necessarily be wrapped in a `span`.
202
388
 
203
- ```jsx
389
+ ```tsx
204
390
  // Using `forceWrapper` with a single, inline child…
205
391
  <Markdown options={{ wrapper: 'aside', forceWrapper: true }}>
206
392
  Mumble, mumble…
@@ -213,7 +399,15 @@ By default, the compiler does not wrap the rendered contents if there is only a
213
399
 
214
400
  #### options.overrides - Void particular banned tags
215
401
 
216
- Pass the `options.overrides` prop to the compiler or `<Markdown>` component with an implementation that return `null` for tags you wish to exclude from the rendered output. It is recommended to void `script`, `iframe`, `object`, and `style` tags to avoid XSS attacks when working with user-generated content. For example, to void the `iframe` tag:
402
+ Pass the `options.overrides` prop to the compiler or `<Markdown>` component with an implementation that returns `null` for tags you wish to exclude from the rendered output. This removes those tags from the output completely.
403
+
404
+ **Note**: The `tagfilter` option provides default escaping of dangerous tags (`script`, `iframe`, `style`, `title`, `textarea`, `xmp`, `noembed`, `noframes`, `plaintext`). Use `overrides` when you need to:
405
+
406
+ - Remove additional tags not covered by `tagfilter` (like `object`)
407
+ - Have more control over tag removal vs. escaping
408
+ - Disable `tagfilter` but still want to remove specific tags
409
+
410
+ For example, to void the `iframe` tag:
217
411
 
218
412
  ```tsx
219
413
  import Markdown from 'markdown-to-jsx'
@@ -230,13 +424,13 @@ render(
230
424
  // renders: ""
231
425
  ```
232
426
 
233
- The library does not void any tags by default to avoid surprising behavior for personal use cases.
427
+ The library does not void any tags by default (except through `tagfilter` escaping), allowing you to choose the appropriate security approach for your use case.
234
428
 
235
429
  #### options.overrides - Override Any HTML Tag's Representation
236
430
 
237
431
  Pass the `options.overrides` prop to the compiler or `<Markdown>` component to seamlessly revise the rendered representation of any HTML tag. You can choose to change the component itself, add/change props, or both.
238
432
 
239
- ```jsx
433
+ ```tsx
240
434
  import Markdown from 'markdown-to-jsx'
241
435
  import React from 'react'
242
436
  import { render } from 'react-dom'
@@ -304,7 +498,7 @@ One of the most interesting use cases enabled by the HTML syntax processing in `
304
498
 
305
499
  By adding an override for the components you plan to use in markdown documents, it's possible to dynamically render almost anything. One possible scenario could be writing documentation:
306
500
 
307
- ```jsx
501
+ ```tsx
308
502
  import Markdown from 'markdown-to-jsx'
309
503
  import React from 'react'
310
504
  import { render } from 'react-dom'
@@ -339,7 +533,7 @@ render(
339
533
 
340
534
  In the following case, `DatePicker` could simply run `parseInt()` on the passed `startTime` for example:
341
535
 
342
- ```jsx
536
+ ```tsx
343
537
  import Markdown from 'markdown-to-jsx'
344
538
  import React from 'react'
345
539
  import { render } from 'react-dom'
@@ -376,7 +570,7 @@ render(
376
570
 
377
571
  Another possibility is to use something like [recompose's `withProps()` HOC](https://github.com/acdlite/recompose/blob/main/docs/API.md#withprops) to create various pregenerated scenarios and then reference them by name in the markdown:
378
572
 
379
- ```jsx
573
+ ```tsx
380
574
  import Markdown from 'markdown-to-jsx'
381
575
  import React from 'react'
382
576
  import { render } from 'react-dom'
@@ -496,7 +690,7 @@ By default a lightweight URL sanitizer function is provided to avoid common atta
496
690
 
497
691
  This can be overridden and replaced with a custom sanitizer if desired via `options.sanitizer`:
498
692
 
499
- ```jsx
693
+ ```tsx
500
694
  // sanitizer in this situation would receive:
501
695
  // ('javascript:alert("foo")', 'a', 'href')
502
696
 
@@ -513,9 +707,9 @@ compiler('[foo](javascript:alert("foo"))', {
513
707
 
514
708
  #### options.slugify
515
709
 
516
- By default, a [lightweight deburring function](https://github.com/probablyup/markdown-to-jsx/blob/bc2f57412332dc670f066320c0f38d0252e0f057/index.js#L261-L275) is used to generate an HTML id from headings. You can override this by passing a function to `options.slugify`. This is helpful when you are using non-alphanumeric characters (e.g. Chinese or Japanese characters) in headings. For example:
710
+ By default, a [lightweight deburring function](https://github.com/quantizor/markdown-to-jsx/blob/bc2f57412332dc670f066320c0f38d0252e0f057/index.js#L261-L275) is used to generate an HTML id from headings. You can override this by passing a function to `options.slugify`. This is helpful when you are using non-alphanumeric characters (e.g. Chinese or Japanese characters) in headings. For example:
517
711
 
518
- ```jsx
712
+ ```tsx
519
713
  <Markdown options={{ slugify: str => str }}># 中文</Markdown>
520
714
 
521
715
  // or
@@ -528,44 +722,11 @@ compiler('# 中文', { slugify: str => str })
528
722
 
529
723
  The original function is available as a library export called `slugify`.
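A custom `slugify` can do more than return its input unchanged. Below is a minimal sketch, assuming you want lowercase, hyphen-separated ids that keep non-Latin characters. `unicodeSlugify` is a hypothetical helper name, not a library export.

```typescript
// Hypothetical helper: builds an id from heading text while keeping
// non-ASCII characters (for example Chinese or Japanese) intact.
function unicodeSlugify(input: string): string {
  return input
    .trim()
    .toLowerCase()
    // Drop ASCII punctuation, but keep hyphens so existing dashes survive.
    .replace(/[!-,.\/:-@\[-`{-~]/g, '')
    // Collapse each run of whitespace into a single hyphen.
    .replace(/\s+/g, '-')
}
```

It could then be passed as `options={{ slugify: unicodeSlugify }}` or as the `slugify` option of `compiler`.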
530
724
 
531
- #### options.namedCodesToUnicode
532
-
533
- By default only a couple of named html codes are converted to unicode characters:
534
-
535
- - `&` (`&amp;`)
536
- - `'` (`&apos;`)
537
- - `>` (`&gt;`)
538
- - `<` (`&lt;`)
539
- - ` ` (`&nbsp;`)
540
- - `"` (`&quot;`)
541
-
542
- Some projects require to extend this map of named codes and unicode characters. To customize this list with additional html codes pass the option namedCodesToUnicode as object with the code names needed as in the example below:
543
-
544
- ```jsx
545
- <Markdown options={{ namedCodesToUnicode: {
546
- le: '\u2264',
547
- ge: '\u2265',
548
- '#39': '\u0027',
549
- } }}>This text is &le; than this text.</Markdown>;
550
-
551
- // or
552
-
553
- compiler('This text is &le; than this text.', namedCodesToUnicode: {
554
- le: '\u2264',
555
- ge: '\u2265',
556
- '#39': '\u0027',
557
- });
558
-
559
- // renders:
560
-
561
- <p>This text is ≤ than this text.</p>
562
- ```
563
-
564
725
  #### options.disableAutoLink
565
726
 
566
727
  By default, bare URLs in the markdown document will be converted into an anchor tag. This behavior can be disabled if desired.
567
728
 
568
- ```jsx
729
+ ```tsx
569
730
  <Markdown options={{ disableAutoLink: true }}>
570
731
  The URL https://quantizor.dev will not be rendered as an anchor tag.
571
732
  </Markdown>
@@ -588,7 +749,7 @@ compiler(
588
749
 
589
750
  By default, raw HTML is parsed to JSX. This behavior can be disabled if desired.
590
751
 
591
- ```jsx
752
+ ```tsx
592
753
  <Markdown options={{ disableParsingRawHTML: true }}>
593
754
  This text has <span>html</span> in it but it won't be rendered
594
755
  </Markdown>;
@@ -602,37 +763,45 @@ compiler('This text has <span>html</span> in it but it won't be rendered', { dis
602
763
  <span>This text has &lt;span&gt;html&lt;/span&gt; in it but it won't be rendered</span>
603
764
  ```
604
765
 
605
- #### options.ast
766
+ #### options.tagfilter
606
767
 
607
- When `ast: true`, the compiler returns the parsed AST structure instead of rendered JSX. **This is the first time the AST is accessible to users!**
768
+ By default, dangerous HTML tags are filtered and escaped to prevent XSS attacks. This applies to both HTML string output and React JSX output. The following tags are filtered: `script`, `iframe`, `style`, `title`, `textarea`, `xmp`, `noembed`, `noframes`, `plaintext`.
608
769
 
609
770
  ```tsx
610
- import { compiler } from 'markdown-to-jsx'
611
- import type { MarkdownToJSX } from 'markdown-to-jsx'
612
-
613
- // Get the AST directly
614
- const ast = compiler('# Hello world', { ast: true })
615
-
616
- // TypeScript: AST is MarkdownToJSX.AST[]
617
- console.log(ast) // Array of parsed nodes with types
618
-
619
- // You can manipulate, transform, or analyze the AST before rendering
771
+ // Tags are escaped by default (GFM-compliant)
772
+ compiler('<script>alert("xss")</script>')
773
+ // HTML output: '<span>&lt;script&gt;</span>'
774
+ // React output: <span>&lt;script&gt;</span>
775
+
776
+ // Disable tag filtering:
777
+ compiler('<script>alert("xss")</script>', { tagfilter: false })
778
+ // HTML output: '<script></script>'
779
+ // React output: <script></script>
620
780
  ```
621
781
 
622
- The AST format is `MarkdownToJSX.AST[]` and enables:
782
+ **Note**: Even when `tagfilter` is disabled, other security measures remain active:
623
783
 
624
- - AST manipulation and transformation
625
- - Custom rendering logic without re-parsing
626
- - Caching parsed AST for performance
627
- - Linting or validation of markdown structure
628
-
629
- When footnotes are present, the returned value will be an object with `ast` and `footnotes` properties instead of just the AST array.
784
+ - URL sanitization preventing `javascript:` and `vbscript:` schemes in `href` and `src` attributes
785
+ - Protection against `data:` URLs (except safe `data:image/*` MIME types)
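The URL checks listed above can be approximated in a custom `options.sanitizer`. Below is a minimal sketch, assuming the `(value, tag, attribute)` argument order shown in the `options.sanitizer` section. The return contract (returning `null` to reject a value) is an assumption of this sketch, and `sketchSanitizer` is a hypothetical name; the built-in sanitizer may behave differently.

```typescript
// Hypothetical sanitizer: returns the value when it looks safe, otherwise null.
function sketchSanitizer(
  value: string,
  _tag: string,
  attribute: string
): string | null {
  // Only URL-bearing attributes are checked in this sketch.
  if (attribute !== 'href' && attribute !== 'src') return value

  // Strip control characters and whitespace that can disguise a scheme.
  const normalized = value.replace(/[\u0000-\u0020]+/g, '').toLowerCase()

  // Reject script-executing schemes.
  if (/^(javascript|vbscript):/.test(normalized)) return null

  // Simplification: allow every data:image/* URL and reject all other data: URLs.
  if (/^data:/.test(normalized) && !/^data:image\//.test(normalized)) return null

  return value
}
```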
630
786
 
631
787
  ### Syntax highlighting
632
788
 
633
789
  When using [fenced code blocks](https://www.markdownguide.org/extended-syntax/#syntax-highlighting) with language annotation, that language will be added to the `<code>` element as `class="lang-${language}"`. For best results, you can use `options.overrides` to provide an appropriate syntax highlighting integration like this one using `highlight.js`:
634
790
 
635
- ````jsx
791
+ ```html
792
+ <!-- Add the following tags to your page <head> to automatically load hljs and styles: -->
793
+ <link
794
+ rel="stylesheet"
795
+ href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.11.1/styles/obsidian.min.css"
796
+ />
797
+
798
+ <script
799
+ crossorigin
800
+ src="https://unpkg.com/@highlightjs/cdn-assets@11.9.0/highlight.min.js"
801
+ ></script>
802
+ ```
803
+
804
+ ````tsx
636
805
  import { Markdown, RuleType } from 'markdown-to-jsx'
637
806
 
638
807
  const mdContainingFencedCodeBlock = '```js\nconsole.log("Hello world!");\n```\n'
@@ -650,23 +819,6 @@ function App() {
650
819
  )
651
820
  }
652
821
 
653
- /**
654
- * Add the following tags to your page <head> to automatically load hljs and styles:
655
-
656
- <link
657
- rel="stylesheet"
658
- href="https://unpkg.com/@highlightjs/cdn-assets@11.9.0/styles/nord.min.css"
659
- />
660
-
661
- * NOTE: for best performance, load individual languages you need instead of all
662
- of them. See their docs for more info: https://highlightjs.org/
663
-
664
- <script
665
- crossorigin
666
- src="https://unpkg.com/@highlightjs/cdn-assets@11.9.0/highlight.min.js"
667
- ></script>
668
- */
669
-
670
822
  function SyntaxHighlightedCode(props) {
671
823
  const ref = React.useRef<HTMLElement | null>(null)
672
824
 
@@ -747,9 +899,174 @@ Here are instructions for some of the popular bundlers:
747
899
 
748
900
  Everything will work just fine! Simply [alias `react` to `preact/compat`](https://preactjs.com/guide/v10/switching-to-preact#setting-up-compat), as you are probably already doing.
749
901
 
750
- ## Gotchas
902
+ ### AST Anatomy
903
+
904
+ The Abstract Syntax Tree (AST) is a structured representation of parsed markdown. Each node in the AST has a `type` property that identifies its kind, and type-specific properties.
905
+
906
+ **Important:** The first node in the AST is typically a `RuleType.refCollection` node that contains all reference definitions found in the document, including footnotes (stored with keys prefixed with `^`). This node is skipped during rendering but is useful for accessing reference data. Footnotes are automatically extracted from the refCollection and rendered in a `<footer>` element by both `compiler()` and `astToJSX()`.
907
+
908
+ #### Node Types
909
+
910
+ The AST consists of the following node types (use `RuleType` to check node types):
911
+
912
+ **Block-level nodes:**
913
+
914
+ - `RuleType.heading` - Headings (`# Heading`)
915
+ ```tsx
916
+ { type: RuleType.heading, level: 1, id: "heading", children: [...] }
917
+ ```
918
+ - `RuleType.paragraph` - Paragraphs
919
+ ```tsx
920
+ { type: RuleType.paragraph, children: [...] }
921
+ ```
922
+ - `RuleType.codeBlock` - Fenced code blocks (```)
923
+ ```tsx
924
+ { type: RuleType.codeBlock, lang: "javascript", text: "code content" }
925
+ ```
926
+ - `RuleType.blockQuote` - Blockquotes (`>`)
927
+ ```tsx
928
+ { type: RuleType.blockQuote, children: [...], alert?: "note" }
929
+ ```
930
+ - `RuleType.orderedList` / `RuleType.unorderedList` - Lists
931
+ ```tsx
932
+ { type: RuleType.orderedList, items: [[...]], start?: 1 }
933
+ { type: RuleType.unorderedList, items: [[...]] }
934
+ ```
935
+ - `RuleType.table` - Tables
936
+ ```tsx
937
+ { type: RuleType.table, header: [...], cells: [[...]], align: [...] }
938
+ ```
939
+ - `RuleType.htmlBlock` - HTML blocks
940
+ ```tsx
941
+ { type: RuleType.htmlBlock, tag: "div", attrs: {}, children: [...] }
942
+ ```
943
+
944
+ **Inline nodes:**
945
+
946
+ - `RuleType.text` - Plain text
947
+ ```tsx
948
+ { type: RuleType.text, text: "Hello world" }
949
+ ```
950
+ - `RuleType.textFormatted` - Bold, italic, etc.
951
+ ```tsx
952
+ { type: RuleType.textFormatted, tag: "strong", children: [...] }
953
+ ```
954
+ - `RuleType.codeInline` - Inline code (`` ` ``)
955
+ ```tsx
956
+ { type: RuleType.codeInline, text: "code" }
957
+ ```
958
+ - `RuleType.link` - Links
959
+ ```tsx
960
+ { type: RuleType.link, target: "https://example.com", children: [...] }
961
+ ```
962
+ - `RuleType.image` - Images
963
+ ```tsx
964
+ { type: RuleType.image, target: "image.png", alt: "description" }
965
+ ```
966
+
967
+ **Other nodes:**
968
+
969
+ - `RuleType.breakLine` - Hard line breaks (two trailing spaces at the end of a line)
970
+ - `RuleType.breakThematic` - Horizontal rules (`---`)
971
+ - `RuleType.gfmTask` - GFM task list items (`- [ ]`)
972
+ - `RuleType.ref` - Reference definition node (not rendered, stored in refCollection)
973
+ - `RuleType.refCollection` - Reference definitions collection (appears at AST root, includes footnotes with `^` prefix)
974
+ - `RuleType.footnote` - Footnote definition node (not rendered, stored in refCollection)
975
+ - `RuleType.footnoteReference` - Footnote reference (`[^identifier]`)
976
+ - `RuleType.frontmatter` - YAML frontmatter blocks
977
+ ```tsx
978
+ { type: RuleType.frontmatter, text: "---\ntitle: My Title\n---" }
979
+ ```
980
+ - `RuleType.htmlComment` - HTML comment nodes
981
+ ```tsx
982
+ { type: RuleType.htmlComment, text: "<!-- comment -->" }
983
+ ```
984
+ - `RuleType.htmlSelfClosing` - Self-closing HTML tags
985
+ ```tsx
986
+ { type: RuleType.htmlSelfClosing, tag: "img", attrs: { src: "image.png" } }
987
+ ```
988
+
989
+ #### Example AST Structure
990
+
991
+ ````tsx
992
+ import { parser, RuleType } from 'markdown-to-jsx'
993
+
994
+ const ast = parser(`# Hello World
995
+
996
+ This is a **paragraph** with [a link](https://example.com).
997
+
998
+ [linkref]: https://example.com
999
+
1000
+ ```javascript
1001
+ console.log('code')
1002
+ ```
1003
+
1004
+ `)
1005
+
1006
+ // AST structure:
1007
+ [
1008
+ // Reference collection (first node, if references exist)
1009
+ {
1010
+ type: RuleType.refCollection,
1011
+ refs: {
1012
+ linkref: { target: 'https://example.com', title: undefined },
1013
+ },
1014
+ },
1015
+ {
1016
+ type: RuleType.heading,
1017
+ level: 1,
1018
+ id: 'hello-world',
1019
+ children: [{ type: RuleType.text, text: 'Hello World' }],
1020
+ },
1021
+ {
1022
+ type: RuleType.paragraph,
1023
+ children: [
1024
+ { type: RuleType.text, text: 'This is a ' },
1025
+ {
1026
+ type: RuleType.textFormatted,
1027
+ tag: 'strong',
1028
+ children: [{ type: RuleType.text, text: 'paragraph' }],
1029
+ },
1030
+ { type: RuleType.text, text: ' with ' },
1031
+ {
1032
+ type: RuleType.link,
1033
+ target: 'https://example.com',
1034
+ children: [{ type: RuleType.text, text: 'a link' }],
1035
+ },
1036
+ { type: RuleType.text, text: '.' },
1037
+ ],
1038
+ },
1039
+ {
1040
+ type: RuleType.codeBlock,
1041
+ lang: 'javascript',
1042
+ text: "console.log('code')",
1043
+ },
1044
+ ]
1045
+
1046
+ ````
1047
+
1048
+ #### Type Checking
1049
+
1050
+ Use the `RuleType` enum to identify AST nodes:
1051
+
1052
+ ```tsx
1053
+ import { RuleType, type MarkdownToJSX } from 'markdown-to-jsx'
1054
+
1055
+ if (node.type === RuleType.heading) {
1056
+ const heading = node as MarkdownToJSX.HeadingNode
1057
+ console.log(`Heading level ${heading.level}: ${heading.id}`)
1058
+ }
1059
+ ```
1060
+
1061
+ **When to use `compiler` vs `parser` vs `<Markdown>`:**
1062
+
1063
+ - Use `<Markdown>` when you need a simple React component that renders markdown to JSX.
1064
+ - Use `compiler` when you need React JSX output from markdown (the component uses this internally).
1065
+ - Use `parser` + `astToJSX` when you need the AST for custom processing before rendering to JSX, or just the AST itself.
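Custom processing on the AST is usually a plain tree walk. Below is a minimal sketch that collects headings for a table of contents, assuming the node shapes listed under Node Types. To stay self-contained it uses local string-tagged types (`SketchNode` and related names are hypothetical); real code would compare `node.type` against `RuleType` members instead.

```typescript
// Hypothetical, simplified node types mirroring the shapes shown in this README.
type SketchText = { type: 'text'; text: string }
type SketchParent = {
  type: 'heading' | 'paragraph' | 'textFormatted' | 'link'
  children: SketchNode[]
  level?: number
  id?: string
}
type SketchNode = SketchText | SketchParent

type TocEntry = { level: number; id: string; title: string }

// Concatenates the text of a node list, descending into children.
function plainText(nodes: SketchNode[]): string {
  return nodes
    .map(node => (node.type === 'text' ? node.text : plainText(node.children)))
    .join('')
}

// Collects top-level headings in document order.
function collectHeadings(ast: SketchNode[]): TocEntry[] {
  const entries: TocEntry[] = []
  for (const node of ast) {
    if (node.type === 'heading') {
      entries.push({
        level: node.level ?? 1,
        id: node.id ?? '',
        title: plainText(node.children),
      })
    }
  }
  return entries
}
```

The same walk could be run on the output of `parser()` before handing the AST to `astToJSX`, so the document is parsed only once.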
751
1066
 
752
- ### Passing props to stringified React components
1067
+ ### Gotchas
1068
+
1069
+ #### Passing props to stringified React components
753
1070
 
754
1071
  Using the [`options.overrides`](#optionsoverrides---rendering-arbitrary-react-components) functionality to render React components, props are passed into the component in stringified form. It is up to you to parse the string to make use of the data.
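A minimal sketch of such parsing, assuming one numeric prop and one JSON-encoded prop. `parseNumberProp` and `parseJsonProp` are hypothetical helpers, not part of the library.

```typescript
// Hypothetical helper: converts a stringified numeric prop, with a fallback
// for missing or non-numeric input.
function parseNumberProp(value: string | undefined, fallback: number): number {
  const parsed = parseInt(value ?? '', 10)
  return isNaN(parsed) ? fallback : parsed
}

// Hypothetical helper: decodes a JSON-encoded prop, with a fallback
// for missing or malformed input.
function parseJsonProp<T>(value: string | undefined, fallback: T): T {
  if (value === undefined) return fallback
  try {
    return JSON.parse(value) as T
  } catch {
    return fallback
  }
}
```

Inside a component, `parseNumberProp(props.startTime, Date.now())` would turn the string form of a `startTime` prop into a number.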
755
1072
 
@@ -795,7 +1112,7 @@ const Table: React.FC<
795
1112
  */
796
1113
  ```
797
1114
 
798
- ### Significant indentation inside arbitrary HTML
1115
+ #### Significant indentation inside arbitrary HTML
799
1116
 
800
1117
  People usually write HTML like this:
801
1118
 
@@ -831,40 +1148,14 @@ The two leading spaces in front of "# Hello" would be left-trimmed from all line
831
1148
  <div>
832
1149
  ```js
833
1150
  var some = code();
834
- ``\`
1151
+ ```
835
1152
  </div>
836
1153
  ````
837
1154
 
838
- ## Using The Compiler Directly
839
-
840
- If desired, the compiler function is a "named" export on the `markdown-to-jsx` module:
841
-
842
- ```jsx
843
- import { compiler } from 'markdown-to-jsx'
844
- import React from 'react'
845
- import { render } from 'react-dom'
846
-
847
- render(compiler('# Hello world!'), document.body)
848
-
849
- /*
850
- renders:
851
-
852
- <h1>Hello world!</h1>
853
- */
854
- ```
855
-
856
- It accepts the following arguments:
857
-
858
- ```js
859
- compiler(markdown: string, options: object?)
860
- ```
861
-
862
1155
  ## Changelog
863
1156
 
864
- See [Github Releases](https://github.com/probablyup/markdown-to-jsx/releases).
1157
+ See [GitHub Releases](https://github.com/quantizor/markdown-to-jsx/releases).
865
1158
 
866
1159
  ## Donate
867
1160
 
868
- Like this library? It's developed entirely on a volunteer basis; chip in a few bucks if you can via the Sponsor link!
869
-
870
- MIT
1161
+ Like this library? It's developed entirely on a volunteer basis; chip in a few bucks if you can via the [Sponsor link](https://github.com/sponsors/quantizor)!