@ai-react-markdown/core 1.4.2 → 1.4.3

package/README.md CHANGED
@@ -96,6 +96,8 @@ function StreamingChat({ content, isStreaming }: { content: string; isStreaming:
  | `Typography` | `AIMarkdownTypographyComponent` | `DefaultTypography` | Typography wrapper component. |
  | `ExtraStyles` | `AIMarkdownExtraStylesComponent` | `undefined` | Optional extra style wrapper rendered between typography and content. |
  | `documentId` | `string` | auto via `useId()` | Stable id for the _logical markdown document_ this `<AIMarkdown>` is rendering. Used as the id namespace for clobberable attributes (`id`, hash hrefs) so two documents on the same page do not cross-link (footnote `[^1]` in message A won't scroll to `[^1]` in message B). When one document is split into chunks rendered by multiple `<AIMarkdown>` instances, pass the SAME `documentId` to every chunk so prefixes align. The value is passed through `encodeURIComponent` before being injected into HTML attributes, so any string is safe (React's `useId()` output, your own opaque ids, user-supplied UUIDs). Long ids (>16 chars, e.g. UUIDs) are hashed via MurmurHash3 to a short Base62 form **inside the rendered `id="…"`/`href="#…"` prefix only** to keep HTML compact; `state.documentId` itself and registry keying via `useDocumentRegistry` stay raw, so deep linking and any consumer code reading `documentId` are unaffected. |
+ | `urlTransform` | `UrlTransform \| null` | `defaultUrlTransform` | Override the URL allowlist applied to `href`, `src`, and similar attributes. The default mirrors GitHub: `http`, `https`, `irc`, `ircs`, `mailto`, `xmpp`. Pass a function defined at module scope (or memoized) to permit additional schemes; see [Custom URL Schemes and Sanitization](#custom-url-schemes-and-sanitization). |
+ | `sanitizeSchema` | `SanitizeSchema` | library default | Override the `rehype-sanitize` schema. Build with [`extendSanitizeSchema`](#custom-url-schemes-and-sanitization) so the library's cross-chunk tag and KaTeX className allowlists survive; hand-rolling silently drops them. |
 
100
102
  ## Configuration
101
103
 
@@ -192,6 +194,107 @@ function MyHelper({ documentId }: { documentId: string }) {
192
194
  }
193
195
  ```
194
196
 
197
+ ## Custom URL Schemes and Sanitization
198
+
199
+ By default `<AIMarkdown>` only renders links and images whose URLs use the standard set of safe protocols (`http`, `https`, `irc`, `ircs`, `mailto`, `xmpp`). Anything else — `javascript:`, `data:`, or your own `myapp://` — is stripped. This protects against XSS in LLM-generated markdown but also means private application schemes are unreachable without configuration.
200
+
201
+ ### The Two-Gate Model
202
+
203
+ Sanitization runs in **two independent gates** (defense in depth):
204
+
+ 1. **`urlTransform`**: runs first, on every URL-bearing attribute, and rewrites disallowed URLs to `''`.
+ 2. **`rehype-sanitize` schema**: runs second, and drops the entire `href`/`src`/`cite` attribute when the protocol is not in its own allowlist.
207
+
208
+ For a private scheme to render, **both gates must permit it**. Allowing only one is the most common pitfall.
209
+
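The interaction between the two gates can be sketched with a simplified model. This is not the library's implementation: `schemeOf`, `gate1`, `gate2`, and `renderHref` are hypothetical names, and the real gates handle more cases. It only illustrates why permitting a scheme in one gate is not enough.

```typescript
// Simplified model of the two gates; hypothetical names, not library code.
const DEFAULT_PROTOCOLS = ['http', 'https', 'irc', 'ircs', 'mailto', 'xmpp'];

// Return the lowercased scheme of an absolute URL, or null for relative URLs.
function schemeOf(url: string): string | null {
  const idx = url.indexOf(':');
  if (idx <= 0) return null;
  const candidate = url.slice(0, idx);
  return /^[a-z][a-z0-9+.-]*$/i.test(candidate) ? candidate.toLowerCase() : null;
}

// Gate 1 (models urlTransform): rewrite disallowed URLs to ''.
function gate1(url: string, allowed: string[]): string {
  const scheme = schemeOf(url);
  return scheme === null || allowed.includes(scheme) ? url : '';
}

// Gate 2 (models the sanitize schema): drop the attribute entirely,
// represented here as undefined, when the protocol is not allowed.
function gate2(url: string, allowed: string[]): string | undefined {
  const scheme = schemeOf(url);
  return scheme === null || allowed.includes(scheme) ? url : undefined;
}

// The value that would end up in href after both gates have run.
function renderHref(url: string, g1: string[], g2: string[]): string | undefined {
  return gate2(gate1(url, g1), g2);
}

const WITH_MYAPP = [...DEFAULT_PROTOCOLS, 'myapp'];

const onlyGate1 = renderHref('myapp://open', WITH_MYAPP, DEFAULT_PROTOCOLS);
const onlyGate2 = renderHref('myapp://open', DEFAULT_PROTOCOLS, WITH_MYAPP);
const bothGates = renderHref('myapp://open', WITH_MYAPP, WITH_MYAPP);
```

In this model, `onlyGate1` ends up with no attribute at all, `onlyGate2` ends up with an empty string, and only `bothGates` keeps the original URL.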
210
+ ### Allowing a Custom Scheme
211
+
212
+ Define both gates at module scope so their reference identity is stable across renders (this keeps the per-block memo cache warm):
213
+
+ ```tsx
+ import AIMarkdown, {
+   defaultUrlTransform,
+   extendSanitizeSchema,
+   type UrlTransform,
+ } from '@ai-react-markdown/core';
+
+ // Gate 1: compose with the default so https/mailto/etc. still work.
+ const ALLOWED = /^myapp:/i;
+ const URL_TRANSFORM: UrlTransform = (url, key, node) =>
+   ALLOWED.test(url) ? url : defaultUrlTransform(url, key, node);
+
+ // Gate 2: extend the library schema so it permits the scheme on href + src.
+ const SCHEMA = extendSanitizeSchema((s) => {
+   s.protocols.href.push('myapp');
+   s.protocols.src.push('myapp');
+ });
+
+ // `markdown` is passed in as a prop so the example is self-contained.
+ function App({ markdown }: { markdown: string }) {
+   return (
+     <AIMarkdown
+       content={markdown}
+       urlTransform={URL_TRANSFORM}
+       sanitizeSchema={SCHEMA}
+     />
+   );
+ }
+ ```
241
+
242
+ ### `extendSanitizeSchema((draft) => Schema | void)`
243
+
244
+ Hands you a deep clone of the library's default sanitize schema. Mutate it freely (the original singleton is never touched) or return a replacement object. Library invariants — cross-chunk coordination tags (`cross-chunk-link`, `cross-chunk-image`, `footnote-sup`), the KaTeX `math-inline` / `math-display` className allowlist, the `<mark>` allowance — survive untouched. **Hand-rolling a schema that doesn't spread these invariants silently breaks coordinated rendering**, which is why the helper is the recommended path.
245
+
+ ```tsx
+ const SCHEMA = extendSanitizeSchema((s) => {
+   s.tagNames.push('my-widget'); // add a tag
+   s.protocols.href.push('myapp'); // permit a protocol
+   s.attributes['my-widget'] = ['data-id', 'data-mode']; // allow attributes
+   // No `return` needed; mutate-only is fine.
+ });
+ ```
254
+
255
+ **Footguns** (also documented in JSDoc):
256
+
+ - Returning `null` is treated like returning nothing (the mutated draft is used).
+ - Reassigning the local parameter (`s = { ... }`) does NOT replace the draft; JS only rebinds the local. Either mutate the original or `return` an explicit value.
+ - Throwing inside the modifier propagates uncaught. Usually fine because the helper is called once at module load.
260
+
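The first two footguns follow from ordinary JavaScript semantics and can be reproduced without the library. The sketch below uses a hypothetical stand-in, `extendSketch`, that mirrors only the contract documented here (deep clone, mutate or return, `null` treated like no return value). The `Schema` type and `DEFAULT_SCHEMA` are likewise invented for illustration and are far smaller than the real schema.

```typescript
// Hypothetical stand-in for the documented contract; not the library's code.
type Schema = { tagNames: string[] };

const DEFAULT_SCHEMA: Schema = { tagNames: ['mark'] };

function extendSketch(modify: (draft: Schema) => Schema | void | null): Schema {
  // Deep clone so the default object is never mutated.
  const draft = JSON.parse(JSON.stringify(DEFAULT_SCHEMA)) as Schema;
  const result = modify(draft);
  // A returned object replaces the draft; undefined or null keeps the draft.
  return typeof result === 'object' && result !== null ? result : draft;
}

// Mutating the draft works.
const mutated = extendSketch((s) => {
  s.tagNames.push('my-widget');
});

// Rebinding the parameter only changes the local variable; the draft is unchanged.
const rebound = extendSketch((s) => {
  s = { tagNames: ['only-this'] };
});

// Returning an explicit object replaces the draft.
const returned = extendSketch(() => ({ tagNames: ['only-this'] }));

// Returning null falls back to the (mutated) draft.
const nulled = extendSketch((s) => {
  s.tagNames.push('x');
  return null;
});
```

Here `rebound.tagNames` still contains only `mark`, while `returned.tagNames` contains only `only-this`.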
261
+ ### Reference Stability and the Cache
262
+
263
+ Both `urlTransform` and `sanitizeSchema` are tracked by the per-block memo cache. Defining them inline:
264
+
+ ```tsx
+ // 🚫 Anti-pattern: discards the entire markdown cache on every parent re-render.
+ <AIMarkdown
+   urlTransform={(url, k, n) => /* … */}
+   sanitizeSchema={extendSanitizeSchema((s) => /* … */)}
+ />
+ ```
272
+
273
+ … creates fresh references every render, invalidates the cache, and undermines streaming performance. In development the library will `console.warn` after detecting 3+ identity flips on either prop. The warning is dead-code-eliminated in production builds. Define both values at module scope, or memoize with `useMemo` if they depend on state.
274
+
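Why an inline function defeats a cache keyed on reference identity can be shown with a toy memo. This is an illustration only: `createRenderer` is a hypothetical single-entry cache, and the library's per-block cache is presumably more elaborate.

```typescript
type Transform = (url: string) => string;

// Hypothetical single-entry memo keyed on content and on the identity of `transform`.
function createRenderer() {
  let lastTransform: Transform | undefined;
  let lastContent: string | undefined;
  let cached = '';
  let misses = 0;
  return {
    render(content: string, transform: Transform): string {
      if (transform !== lastTransform || content !== lastContent) {
        misses += 1; // cache miss: recompute
        cached = transform(content);
        lastTransform = transform;
        lastContent = content;
      }
      return cached;
    },
    get misses(): number {
      return misses;
    },
  };
}

// Module-scope function: the same reference on every "render".
const STABLE: Transform = (url) => url;

const stableRenderer = createRenderer();
for (let i = 0; i < 3; i++) stableRenderer.render('same content', STABLE);

// Inline arrow function: a new reference on every "render".
const inlineRenderer = createRenderer();
for (let i = 0; i < 3; i++) inlineRenderer.render('same content', (url) => url);
```

With identical content, the stable reference misses once and the inline version misses on all three calls.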
275
+ ### Regex Escaping for `+` / `-` / `.` in Scheme Names
276
+
277
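+ Per RFC 3986, scheme names may contain `+`, `-`, and `.`. In a regex, `+` and `.` are metacharacters (`-` is special only inside a character class), so escape them. Write `/^web\+app:/i`, **not** `/^web+app:/i`: the latter reads `b+` as "one or more `b`", so it matches `webapp:`, `webbapp:`, … and never matches the literal `web+app:`, silently changing the allowlist.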
278
+
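The difference between the two patterns is easy to check directly. The snippet also includes `schemeRegex`, a hypothetical helper (not part of the library) that escapes a literal scheme name before building the pattern, which avoids hand-escaping altogether.

```typescript
const UNESCAPED = /^web+app:/i; // `b+` means "one or more b"
const ESCAPED = /^web\+app:/i; // `\+` means a literal plus sign

const unescapedMatchesLiteral = UNESCAPED.test('web+app://x');
const unescapedMatchesWebapp = UNESCAPED.test('webapp://x');
const unescapedMatchesWebbbapp = UNESCAPED.test('webbbapp://x');
const escapedMatchesLiteral = ESCAPED.test('web+app://x');
const escapedMatchesWebapp = ESCAPED.test('webapp://x');

// Hypothetical helper: build an anchored, case-insensitive scheme pattern
// from a literal scheme name by escaping every regex metacharacter.
function schemeRegex(scheme: string): RegExp {
  const escaped = scheme.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp(`^${escaped}:`, 'i');
}

const WEB_APP = schemeRegex('web+app');
const MY_APP = schemeRegex('my.app');
```

The unescaped pattern rejects `web+app://x` and accepts `webapp://x`, which is the opposite of what was intended.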
279
+ ### Escape Hatch: Hand-rolled Schema
280
+
281
+ If you need full control, the library's default schema is also exported as `sanitizeSchema`. **Spread it** so cross-chunk and KaTeX additions survive:
282
+
+ ```tsx
+ import AIMarkdown, { sanitizeSchema } from '@ai-react-markdown/core';
+
+ const fullCustom = {
+   ...sanitizeSchema, // ← required to keep cross-chunk + math invariants
+   // your overrides here
+ };
+ ```
291
+
292
+ The `extendSanitizeSchema` helper exists precisely because consumers tend to forget that spread. Prefer the helper unless you have a specific reason.
293
+
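One more thing to watch when hand-rolling: object spread is shallow, so overriding a nested key such as `protocols` replaces it wholesale. The sketch below uses `defaultSchemaSketch`, a small hypothetical stand-in for the exported default (the real schema has many more entries), to show the lossy and the merged variants side by side.

```typescript
// Hypothetical stand-in for the exported default schema.
const defaultSchemaSketch = {
  tagNames: ['a', 'mark'],
  protocols: {
    href: ['http', 'https', 'mailto'],
    src: ['http', 'https'],
  },
};

// Lossy: the top-level spread is kept, but `protocols` is replaced wholesale,
// so the default href protocols and the whole `src` entry are gone.
const lossy = {
  ...defaultSchemaSketch,
  protocols: { href: ['myapp'] },
};

// Merged: spread each nested level that is being overridden and copy the
// array instead of pushing into the shared default.
const merged = {
  ...defaultSchemaSketch,
  protocols: {
    ...defaultSchemaSketch.protocols,
    href: [...defaultSchemaSketch.protocols.href, 'myapp'],
  },
};
```

`merged.protocols.href` keeps `http`, `https`, and `mailto` alongside `myapp`, and the default object itself is left unmodified.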
294
+ ### API Stability of `UrlTransform` and `SanitizeSchema`
295
+
296
+ Both prop types track their respective upstream packages — `UrlTransform` follows `react-markdown`'s shape and `SanitizeSchema` follows `rehype-sanitize`'s. They may evolve with those packages' major versions. Hand-construct schemas via the helpers (rather than typing your own from scratch) and you'll inherit any upstream-driven changes automatically.
297
+
195
298
  ## Hooks
196
299
 
197
300
  ### `useAIMarkdownRenderState<TConfig>()`
@@ -462,6 +565,7 @@ The metadata and render state providers are deliberately separated so that metad
  - `AIMarkdownVariant`
  - `AIMarkdownColorScheme`
  - `AIMDContentPreprocessor`
+ - `UrlTransform`, `SanitizeSchema` -- prop-type aliases for the URL handling props (track upstream `react-markdown` / `rehype-sanitize` shapes)
  - `PartialDeep`
  - Cross-chunk registry types: `Registry`, `ChunkData`, `FootnoteDef`, `LinkDef`, `RefRecord`, `RefKind`
467
571
 
@@ -470,6 +574,12 @@ The metadata and render state providers are deliberately separated so that metad
  - `AIMarkdownRenderExtraSyntax`
  - `AIMarkdownRenderDisplayOptimizeAbility`
  - `defaultAIMarkdownRenderConfig`
+ - `defaultUrlTransform` -- the library's built-in URL-allowlist transform; compose with this when supplying a custom `urlTransform`
+ - `sanitizeSchema` -- the library's built-in `rehype-sanitize` schema; spread this when hand-rolling a custom schema (or use `extendSanitizeSchema` instead)
+
+ ### Helpers
+
+ - `extendSanitizeSchema((draft) => Schema | void)` -- mutate-and-return factory that produces a sanitize schema from a deep clone of the library default; preserves cross-chunk and KaTeX invariants
473
583
 
474
584
  ### Hooks (re-exported)
475
585