npm - flux-md - Versions diffs - 0.5.1 → 0.6.0 - Mend

flux-md 0.5.1 → 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (17)

package/CHANGELOG.md +265 -2
package/README.md +230 -19
package/package.json +20 -5
package/src/block-props.ts +96 -0
package/src/client.ts +1 -1
package/src/dom.ts +430 -0
package/src/element.ts +339 -0
package/src/renderers/CodeBlock.tsx +62 -5
package/src/renderers/Math.tsx +5 -3
package/src/renderers/Mermaid.tsx +4 -3
package/src/solid.tsx +70 -0
package/src/svelte.ts +55 -0
package/src/types-core.ts +138 -0
package/src/types-react.ts +14 -0
package/src/types.ts +7 -150
package/src/vue.ts +100 -0
package/src/wasm/flux_md_core_bg.wasm +0 -0

package/CHANGELOG.md CHANGED

@@ -4,6 +4,268 @@ Notable changes to flux-md. Format based on
 [Keep a Changelog](https://keepachangelog.com/); this project aims to follow
 [Semantic Versioning](https://semver.org/).
+## 0.6.0 — 2026-05-28
+### Added — flux-md is no longer React-only
+The core (`FluxClient` + the WASM worker) was always framework-neutral; only
+the renderer was React-bound. This release adds five new entry points, each
+**thin lifecycle glue** over one new framework-agnostic DOM renderer — none
+re-implements the subscribe/diff loop, and none destroys your client (you own
+the worker/stream).
+- **`flux-md/dom`** — the foundation. `mountFluxMarkdown(client, container,
+  options?) → { destroy(), refresh() }` incrementally patches a DOM subtree
+  using the parser's stable block IDs: a committed block's node is never
+  recreated (so one-shot work like syntax highlighting and the copy-button
+  listener runs exactly once), only the streaming tail re-renders. Reuses the
+  in-house highlighter for deferred code, applies your `sanitize` hook to the
+  open/speculative tail, and batches patches per `requestAnimationFrame`.
+  Block-kind overrides via `components` (`(props) => HTMLElement | string`);
+  tag-level overrides remain React-only.
+- **`flux-md/element`** — `defineFluxMarkdown(tag = "flux-markdown")` defines a
+  `<flux-markdown>` custom element. Light DOM (your markdown CSS applies),
+  SSR-safe (no auto-register), and usable three ways: a caller-owned `client`
+  property, a self-owned client driven by `append()`/`finalize()`, or zero-JS
+  via a `src` URL it fetch-streams / inline text / a `markdown` attribute.
+  Config flags map to tri-state attributes (`gfm-math`, `dir-auto`, …). Covers
+  **Angular** with `CUSTOM_ELEMENTS_SCHEMA` — no separate package.
+- **`flux-md/vue`** — a `<FluxMarkdown>` component + `useFluxMarkdown`
+  composable (Vue 3, optional peer dep).
+- **`flux-md/svelte`** — a `fluxMarkdown` action, `use:fluxMarkdown={{ client }}`
+  (Svelte 4 and 5, optional peer dep).
+- **`flux-md/solid`** — a `<FluxMarkdown>` component (Solid, optional peer dep).
+  Newest binding: its mount/teardown glue is tested, but the JSX component shell
+  has only been exercised via a real `vite-plugin-solid` build, not in CI — the
+  `flux-md/dom` mount inside `onMount`/`onCleanup` is the fallback if your Solid
+  toolchain trips on it.
+Purely additive — existing `flux-md` / `flux-md/react` / `flux-md/client` users
+are unaffected (the React renderer and core are byte-identical; the only change
+to existing code was a type-only import repoint so the neutral entry points
+typecheck without React). `vue`, `svelte`, and `solid-js` join `react` as
+optional peer dependencies — import only the binding you need. See the new
+"Framework bindings" section in the README. 65 → 85 tests.
+## 0.5.6 — 2026-05-28
+### Performance
+- **`ContainerCache` now handles multi-paragraph inner content.** A blockquote
+  or GitHub alert with blank `>` lines inside (`> [!NOTE]\n> Para one.\n>\n>
+  Para two.\n`) used to drop the cache and fall back to the O(n²) full path
+  the moment the first blank arrived. The cache now closes the current
+  paragraph on a blank `>` and starts a new one, preserving the
+  streaming-O(new bytes) shape across multi-paragraph inner content. Each
+  completed inner paragraph is pre-rendered into a growing
+  `committed_paras_html` string; the single-paragraph fast path (the bench's
+  `big_blockquote` / `big_alert`) is unchanged within noise.
+- **`ListCache` now handles loose lists.** A flat list with blank lines
+  between siblings (`- one\n\n- two\n\n- three\n`) is a CommonMark "loose"
+  list — every item body gets wrapped in `<p>…</p>` — and the cache used to
+  bail on the first blank. The cache now flips to loose on the first
+  blank-then-marker sequence, re-renders prior cached items with `<p>`
+  wrappers from stored source spans (one-time O(items)), and continues the
+  streaming-O(new bytes) shape from there. Tight→loose is sticky.
+  50 KB loose-list bench, before-fix → after-fix:
+  | chunk |  before  |  after  | speedup |
+  |------:|---------:|--------:|--------:|
+  |  16   | 5593 ms  | 21 ms   | ~272×   |
+  | 256   |  355 ms  |  7 ms   | ~49×    |
+  Tight `big_list` perf is unchanged within bench noise.
+### Added
+- **React `CodeBlock` default renderer ships a copy-to-clipboard button.**
+  Closed code blocks now show an icon + "Copy" in their header (the existing
+  "streaming" pill takes that slot until close, so streaming code is never
+  copy-clickable mid-arrival). Click → copies the decoded source via
+  `navigator.clipboard.writeText` → swaps to a checkmark + "Copied" for
+  1.5 s → reverts. Native `<button>` (keyboard-reachable), `aria-label`
+  toggles between "Copy code" and "Copied" with `aria-live="polite"`,
+  guards against `navigator.clipboard` being absent (SSR / insecure context)
+  and rejected `writeText` promises (permission denied) — both leave the
+  button silently usable. No new dependency.
+### Documentation
+- README quickstart now uses `useState(() => new FluxClient())` + an
+  unmount-only destroy effect instead of `useMemo(() => new FluxClient(),
+  [])` + cleanup-on-stream-change (which destroyed the client when the
+  `stream` prop changed, leaking a freed parser on the next append).
+- New "when to enable each flag" guide for `ParserConfig` with concrete
+  LLM-output triggers (`gfmMath` when `$…$` arrives, `componentTags` for
+  `<Thinking>` blocks, etc.) — so a reader picks flags without reading the
+  full reference further down.
+- `Alert` block-kind override example added to the `components` docs.
+- `sanitize` example mirrors the realistic memoize-at-module-scope pattern
+  from the live demo (a fresh arrow each render busts the per-block memo).
+- New "Performance" section pointing to CHANGELOG / `examples/bench.rs` for
+  numbers (no numbers baked into the README — those rot).
+## 0.5.5 — 2026-05-28
+### Performance
+- 1× memcpy in the paragraph / container cache assembly (was 2×). Both caches
+  were building the block HTML in two stages — concatenate
+  `committed + active` into an intermediate `String`, then concatenate
+  `<p>` + that into the output — so a long open paragraph or container did two
+  memcpys of the committed inner per append. The fix builds directly into the
+  output buffer and trims trailing whitespace in-place; the container case
+  backs out a provisional `<p>` opener if the body content turns out to be
+  empty (preserving the empty-body fix from 0.5.4). Output is byte-identical.
+  200 KB bench (best of 7), chunk=16:
+  | shape           | 0.5.4    | 0.5.5    | speedup |
+  |-----------------|---------:|---------:|--------:|
+  | `long_paragraph`| 142 ms   | **96 ms**| 1.48× |
+  | `emphasis_para` | 170 ms   | **116 ms**| 1.47× |
+  | `big_blockquote`| 213 ms   | **157 ms**| 1.36× |
+  | `big_alert`     | 343 ms   | **237 ms**| 1.45× |
+  Modest wins at every chunk size for the affected caches; the
+  table / list / fence caches are unchanged (they were already 1× memcpy).
+## 0.5.4 — 2026-05-28
+### Fixed (mid-stream rendering)
+- **GFM tables now form during streaming, not just at finalize.** Streaming a
+  table char-by-char (or in any chunking where the delimiter row's `\n` lands
+  in a different chunk than the row's content) used to leave the block as a
+  `<p>` spanning both lines until `.finalize()` ran. The paragraph cache's
+  delimiter-detection walked from the line AFTER the cut and so missed a
+  delimiter row that completed inside the line the cut had advanced into. The
+  fix re-checks the line containing the cut whenever it has just completed,
+  guarded by a cheap `bytes[cut..].contains('\n')` so long open paragraphs
+  without interior `\n` still take the O(new bytes) per-call path.
+- **Open alerts/blockquotes with an empty body no longer render an empty
+  `<p></p>`.** A `> [!NOTE]\n` shown mid-stream now matches the full renderer:
+  `<div class="markdown-alert ...">…<p class="...title">Note</p></div>` with
+  no empty body paragraph. The container cache was wrapping the body in
+  `<p>…</p>` unconditionally, even when the body was empty.
+Both bugs only manifested *before* `finalize()`. The post-finalize output —
+what every existing parity test checks — was already correct, which is why
+neither was caught earlier. A new `tests/midstream_parity.rs` asserts that the
+streamed view of an open block matches what one-shot parsing produces for the
+same prefix (tables, alerts, blockquotes, lists, code fences, math fences).
+### Performance
+- `big_table` at the artificial `chunk=16` stress case is ~280 ms (was ~145 ms
+  in 0.5.3). The 145 ms was the *incorrect* path: the paragraph cache treated
+  the whole 200 KB table as a single growing paragraph until finalize, never
+  engaging the table cache. The 280 ms is the cost of correctly emitting the
+  table mid-stream at the smallest chunk size. Every realistic LLM streaming
+  chunk size (≥64 bytes) is unchanged — `big_table` at chunk=64 is 73 ms,
+  chunk=256 is 38 ms, etc.
+## 0.5.3 — 2026-05-28
+### Performance
+- **Streaming long open resumable containers is now O(n).** A long
+  `> [!NOTE]` alert, a `>`-quoted explanation, or a flat bullet/ordered list
+  used to re-run scan + inline render over the whole growing inner on every
+  append (O(n²)). Three new tail caches mirror the existing fence/table
+  pattern:
+  - `ContainerCache` — single-paragraph blockquote / GitHub alert. Wraps
+    the existing paragraph-cache (inline-boundary commit) with a
+    `>`-stripped inner buffer; the wrapper HTML (`<blockquote>` /
+    alert `<div>`) is built once at arm time, each new `> ` line is
+    stripped once into the inner buffer, only the unsettled inline tail is
+    re-rendered. Bails on a blank `>`-line (paragraph break inside the
+    container), lazy continuation, or `\r`.
+  - `ListCache` — tight, flat list (the LLM-emit shape: one sibling marker
+    per line, no blanks, no continuation, no nesting). Opener
+    (`<ul>` / `<ol start=N>`) pre-rendered at arm time; each new sibling
+    line renders directly into the cache as a tight `<li>…</li>` (GFM
+    task-list `[ ] `/`[x] ` supported). Bails on the first blank line
+    (loose-list signal), non-marker line, over-edge marker (nested), or
+    foreign-family marker — the full path handles those.
+  Measured at 50 KB (best of 7), before → after:
+  | shape           | chunk=16          | chunk=256       |
+  |-----------------|-------------------|-----------------|
+  | `big_blockquote`| 5164 → **22 ms**  | 332 → **8.5 ms**|
+  | `big_list`      | 6141 → **18 ms**  | 391 → **7.4 ms**|
+  | `big_alert`     | 6298 → **28 ms**  | 404 → **11 ms** |
+  At 200 KB, `big_list` chunk=256 was extrapolating to ~6.2 s before the
+  cache; now **36 ms** (~170×). Every realistic streaming shape now has a
+  flat chunk-size curve.
+  Output is byte-identical. Parity gated by `tests/container_cache.rs`
+  (blockquote + all five alert kinds, dir_auto, CRLF, lazy continuation,
+  multi-paragraph fallback, 400-line stress) and `tests/list_cache.rs` (5
+  marker families, ordered with non-default start, dir_auto, CRLF, loose /
+  nested / multi-line fallback, 400-item stress).
+### Documentation
+- Reworded the "future plugin slot" comments in `renderers/Math.tsx` and
+  `renderers/Mermaid.tsx`. The actual extension path is the
+  `components.MathBlock` / `components.Mermaid` overrides, which already
+  works end-to-end.
+### Known limitations
+- The three new caches disarm when `gfmFootnotes` is on, mirroring
+  `TableCache` from 0.5.2: cell-level `[^x]` occurrence ids would diverge
+  across the cache vs. full-reparse boundary. Footnotes + a long container
+  / table stays on the full O(n²) path — rare combination, may be lifted
+  in a later release by tracking per-cache footnote-occ deltas.
+- The blockquote/alert cache covers the *single-paragraph* inner case (the
+  realistic LLM shape). A long open container with a multi-block inner
+  (lists inside, fenced code inside, etc.) still routes through the full
+  path. The bench's `big_blockquote` / `big_alert` are single-paragraph
+  shapes — what these caches were built for.
+## 0.5.2 — 2026-05-28
+### Performance
+- **Streaming a long GFM table is now O(n) at every chunk size.** Tables already
+  rendered visually incrementally (header at the delimiter row, rows append as
+  they arrive) — but `render_table` re-walked every row on every append, so the
+  total work was O(n²) once chunks exceeded ~30 bytes (a row). The fix is an
+  incremental `TableCache` that mirrors the existing code/math `FenceCache`:
+  `<thead>` is pre-rendered once, each newly-complete `<tr>` is folded into the
+  cached prefix, and only the trailing partial row is re-rendered each append.
+  Output is byte-identical; parity gated by `tests/table_cache.rs` (every chunk
+  size 1..=9 × char-by-char against one-shot, with alignments, inline markdown,
+  link refs, CRLF fallback, and a 400-row stress case).
+  Measured on a 200 KB table (best of 7 — chunk varies on each row):
+  | chunk |  before  | after | speedup |
+  |------:|---------:|------:|--------:|
+  |    16 |   143 ms | 145 ms | ~1× (was already fast) |
+  |    64 | 20807 ms |  78 ms | **267×** |
+  |   128 | 10414 ms |  54 ms | **193×** |
+  |   256 |  5373 ms |  40 ms | **134×** |
+  |   512 |  2608 ms |  34 ms |  **77×** |
+  |  1024 |  1322 ms |  31 ms |  **43×** |
+  The pre-fix bench printed only chunks 16 and 256, which hid the regression
+  (16 was fine, 256 was the cliff floor). The bench now sweeps 16/64/128/256/
+  512/1024 so the next regression in this shape can't slip in unnoticed.
+  Footnotes are the one combination still on the full O(n²) path: the
+  cell-level `[^x]` occurrence counter would diverge across the
+  cache/full-reparse boundary, so the cache disarms when `gfmFootnotes` is on
+  (rare enough to defer to a later release).
 ## 0.5.1 — 2026-05-27
 ### Performance
@@ -14,8 +276,9 @@ Notable changes to flux-md. Format based on
   two-level lookup (committed, then the uncommitted tail), and folded in place
   via `Rc::make_mut` once the render's clone is dropped. A 235 KB
   reference-definition stream at 16-byte chunks: **~1,395 ms → ~53 ms** (~26×).
-  This was the last remaining O(n²) streaming shape — every realistic shape is
-  now O(n). Output is unchanged.
+  This was believed to be the last remaining O(n²) streaming shape; in fact a
+  long open GFM table was still O(n²) (fixed in 0.5.2 — `big_table` at
+  chunk=256 went from ~5,400 ms to ~40 ms). Output is unchanged.
 ## 0.5.0 — 2026-05-27
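
Reviewer note on the O(n²) to O(n) claims in the changelog hunks above: the 0.5.2, 0.5.3, and 0.5.6 entries each describe replacing a path that re-walks the whole growing block on every append with a cache that only processes newly arrived bytes. The sketch below is a toy cost model, not flux-md code. `bytesTouched` and its parameters are hypothetical names, and it counts bytes scanned rather than measuring time.

```typescript
// Hypothetical cost model (not part of flux-md): count how many bytes a
// renderer scans when `total` bytes arrive in appends of `chunk` bytes.
function bytesTouched(total: number, chunk: number, incremental: boolean): number {
  let seen = 0; // bytes received so far
  let work = 0; // bytes scanned across all appends
  while (seen < total) {
    const step = Math.min(chunk, total - seen);
    seen += step;
    // Full path: re-scan everything received so far on each append.
    // Cached path: scan only the bytes that just arrived.
    work += incremental ? step : seen;
  }
  return work;
}

// 50 KB input at 16-byte chunks, the shape used in the 0.5.3 and 0.5.6 benches.
const full = bytesTouched(50000, 16, false);
const cached = bytesTouched(50000, 16, true);
console.log({ full, cached });
```

Under this model the full path scans 78,150,000 bytes and the cached path scans 50,000. Real timings depend on per-byte cost and cache bookkeeping, so the ratio is not expected to match the bench tables; the point is only that full-path work grows with the square of the input while cached work grows linearly.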

package/README.md CHANGED

@@ -2,6 +2,8 @@
 Zero-dep streaming markdown for the browser. Rust→WASM core, one Web Worker per stream, incremental parse with speculative closure for mid-stream constructs.
+Drop in a streaming-aware renderer — **React, Vue, Svelte, Solid, a framework-agnostic `<flux-markdown>` Web Component, or the vanilla DOM mount** — wire each LLM stream to a `FluxClient`, and the markdown renders incrementally off the main thread, block by block, with stable identities so unchanged blocks never re-reconcile.
 Parsing runs entirely **off the main thread** — each stream gets its own pooled Web Worker, so many concurrent LLM responses render without contending for the UI thread. On each token the parser re-parses only the **active tail**, not the whole document, and heavy renderers (syntax highlighting, math, mermaid) are **deferred until a block closes**. The result is low retained memory and a main thread that stays responsive while streaming. See [the live demo](https://md.hsingh.app/).
 ## Install
@@ -16,8 +18,10 @@ import.meta.url)`** pattern, so any bundler with asset-module support resolves
 them: **Vite** (the reference setup), **webpack 5**, **Rollup** (with asset
 modules), and **Parcel**. Next.js (webpack/turbopack) should work but is
 untested — file an issue if it doesn't. It is **browser-only** (it constructs
-Web Workers); it does not run under SSR/RSC. `react` is an optional peer
-dependency — only needed if you import `flux-md/react`.
+Web Workers); it does not run under SSR/RSC. The framework packages — `react`,
+`vue`, `svelte`, `solid-js` — are all **optional** peer dependencies; you only
+need the one whose binding you import. The core (`flux-md`, `flux-md/client`,
+`flux-md/dom`, `flux-md/element`) needs none.
 ## Quick start
@@ -37,11 +41,13 @@ client.finalize();
 In React:
 ```tsx
-import { useEffect, useMemo } from "react";
+import { useEffect, useState } from "react";
 import { FluxClient, FluxMarkdown } from "flux-md";
 export function ChatMessage({ stream }: { stream: AsyncIterable<string> }) {
-  const client = useMemo(() => new FluxClient(), []);
+  // One client per component instance. Destroy on unmount, not on stream change.
+  const [client] = useState(() => new FluxClient());
+  useEffect(() => () => client.destroy(), [client]);
   useEffect(() => {
     let cancelled = false;
@@ -52,11 +58,8 @@ export function ChatMessage({ stream }: { stream: AsyncIterable<string> }) {
       }
       if (!cancelled) client.finalize();
     })();
-    return () => {
-      cancelled = true;
-      client.destroy();
-    };
-  }, [stream]);
+    return () => { cancelled = true; };
+  }, [client, stream]);
   return <FluxMarkdown client={client} />;
 }
@@ -64,6 +67,166 @@ export function ChatMessage({ stream }: { stream: AsyncIterable<string> }) {
 Multiple concurrent streams just need multiple clients — each runs in its own worker, so they don't share main-thread budget.
+## Framework bindings
+`FluxClient` is framework-neutral — it owns the worker and exposes
+`subscribe`/`getSnapshot`. Pick a renderer to put its blocks on screen. Every
+binding below is thin glue over the same incremental DOM renderer, so they
+share one identity contract: a committed block's node is never recreated, only
+the streaming tail re-renders.
+**One ownership rule across all bindings:** the renderer's teardown (React
+unmount, `handle.destroy()`, element disconnect, etc.) frees only the rendered
+DOM and the subscription — it **never** destroys the client. You call
+`client.destroy()` when you're done with the stream. (React's `<FluxMarkdown>`,
+documented [below](#fluxmarkdown-react), is the same.)
+### Vanilla / any framework — `flux-md/dom`
+```ts
+import { FluxClient } from "flux-md/client";
+import { mountFluxMarkdown } from "flux-md/dom";
+const client = new FluxClient();
+const handle = mountFluxMarkdown(client, document.getElementById("out")!, {
+  stickToBottom: true,
+});
+// Feed it from a fetch/SSE reader:
+const reader = (await fetch("/api/chat")).body!.getReader();
+const dec = new TextDecoder();
+for (;;) {
+  const { value, done } = await reader.read();
+  if (done) break;
+  client.append(dec.decode(value, { stream: true })); // stream:true carries multibyte across chunks
+}
+client.append(dec.decode());
+client.finalize();
+// Teardown: destroy BOTH — the renderer and the client you created.
+handle.destroy();
+client.destroy();
+```
+`mountFluxMarkdown(client, container, options?)` returns `{ destroy(), refresh() }`.
+Options: `components`, `sanitize`, `virtualize`, `stickToBottom`, `highlightCode`
+(default true), `batch` (default true — one DOM write per `requestAnimationFrame`).
+Block-kind overrides use `components` keyed by block-kind (`CodeBlock`, `Table`,
+`Alert`, `Component`, …) with values `(props) => HTMLElement | string`. Tag-level
+(lowercase `a`/`table`/`code`) overrides are **React-only** — there's no virtual
+tree on the fast `innerHTML` path; a block-kind override can rewrite the `html`
+it's handed instead.
+### Web Component `<flux-markdown>` — `flux-md/element`
+The universal binding — plain HTML, Angular, or any framework that renders DOM.
+Register once, then use the element:
+```ts
+import { defineFluxMarkdown } from "flux-md/element";
+defineFluxMarkdown(); // defines <flux-markdown>; pass a custom tag name if you like
+```
+```html
+<!-- zero-JS streaming straight from a URL -->
+<flux-markdown src="/api/post.md" gfm-math stick-to-bottom></flux-markdown>
+<!-- one-shot from inline text -->
+<flux-markdown># Hello **world**</flux-markdown>
+```
+```js
+// or caller-owned streaming — drive your own client:
+const el = document.querySelector("flux-markdown");
+el.client = myFluxClient;             // element subscribes; never destroys it
+el.components = { Thinking: (p) => myNode(p) };
+myFluxClient.append(delta);
+```
+Config flags are **tri-state attributes**: absent = library default;
+`gfm-math` / `gfm-math="true"` / `="1"` = on; `gfm-math="false"` / `="0"` = off
+(the only way to turn off a default-on flag such as `gfm-alerts`). It renders in
+light DOM so your markdown CSS applies, and `defineFluxMarkdown` is a no-op under
+SSR (no `customElements`). A self-owned element (`src` / `markdown` / inline
+text / `append()`) is torn down on disconnect; a caller-supplied `client` is left
+alone.
+**Angular** consumes the same element — no separate package:
+```ts
+import { Component, CUSTOM_ELEMENTS_SCHEMA } from "@angular/core";
+import { defineFluxMarkdown } from "flux-md/element";
+defineFluxMarkdown(); // once at bootstrap
+@Component({
+  standalone: true,
+  schemas: [CUSTOM_ELEMENTS_SCHEMA],
+  template: `<flux-markdown [attr.src]="url" stick-to-bottom></flux-markdown>`,
+})
+export class Answer { url = "/api/post.md"; }
+```
+### Vue 3 — `flux-md/vue`
+```vue
+<script setup lang="ts">
+import { onBeforeUnmount } from "vue";
+import { FluxClient } from "flux-md/client";
+import { FluxMarkdown } from "flux-md/vue";
+const client = new FluxClient();
+// feed client.append(delta) from your stream, then client.finalize()
+onBeforeUnmount(() => client.destroy());
+</script>
+<template>
+  <FluxMarkdown :client="client" stick-to-bottom />
+</template>
+```
+Props: `client` (required), `components`, `sanitize`, `virtualize`,
+`stickToBottom`. There's also a `useFluxMarkdown` composable returning a
+`container` ref if you'd rather mount into your own element.
+### Svelte (4 & 5) — `flux-md/svelte`
+A Svelte action — works in both v4 and v5, no `.svelte` build step:
+```svelte
+<script lang="ts">
+  import { onDestroy } from "svelte";
+  import { FluxClient } from "flux-md/client";
+  import { fluxMarkdown } from "flux-md/svelte";
+  const client = new FluxClient();
+  // feed client.append(delta) then client.finalize()
+  onDestroy(() => client.destroy());
+</script>
+<div use:fluxMarkdown={{ client, stickToBottom: true }} />
+```
+### Solid — `flux-md/solid`
+```tsx
+import { onCleanup } from "solid-js";
+import { FluxClient } from "flux-md/client";
+import { FluxMarkdown } from "flux-md/solid";
+const client = new FluxClient();
+// feed client.append(delta) then client.finalize()
+onCleanup(() => client.destroy());
+<FluxMarkdown client={client} stickToBottom />;
+```
+The Solid binding's mount/teardown logic is tested, but its JSX component shell
+has so far only been exercised through a real Solid (`vite-plugin-solid`) build
+in development, not in CI — treat it as the newest of the bindings and file an
+issue if your Solid setup trips on it. The component is a thin `ref`'d `<div>`;
+if you hit a transform edge, `mountFluxMarkdown` from `flux-md/dom` inside
+`onMount`/`onCleanup` is the zero-surprise fallback.
 ## What it does
 | Concern | flux-md | conventional main-thread renderer |
@@ -73,7 +236,7 @@ Multiple concurrent streams just need multiple clients — each runs in its own
 | Block identity across chunks | Stable monotonic IDs | New keys on every render |
 | Mid-stream unclosed `` ``` `` / `*` / `**` | Speculatively closed in render, replaced cleanly | Often renders raw or breaks |
 | Heavy renderers (syntax, math, mermaid) | Deferred until block close | Re-run per chunk |
-| XSS sanitization | Allowlist in Rust + URL scheme check | rehype-sanitize on JS thread |
+| XSS sanitization | Allowlist in Rust + URL scheme check | Downstream sanitizer pass on the JS thread |
 ## Public API
@@ -114,6 +277,25 @@ Omitted fields use the defaults above, so `new FluxClient()` is unchanged.
 Config is applied when the stream's parser is created and is **immutable** for
 that stream (`reset()` keeps it; use a new client for different flags).
+When to enable each flag:
+- `gfmAutolinks` — on by default. Leave it on unless you want strict CommonMark.
+- `gfmAlerts` — on by default. Leave it on unless you want strict CommonMark.
+- `gfmMath: true` — when your LLM emits `$…$` or `$$…$$` (or LaTeX `\(…\)` /
+  `\[…\]`). flux-md emits KaTeX-ready markup; you bring the KaTeX pass (or
+  `components.MathBlock`).
+- `gfmFootnotes: true` — when your input uses `[^1]` references and `[^1]:`
+  definitions. Off by default; see the footnote streaming caveat above.
+- `dirAuto: true` — when content can be RTL / mixed-direction. Emits per-block
+  `dir="auto"` so the browser detects direction independently per block.
+- `unsafeHtml: true` — only when rendering trusted HTML. For untrusted /
+  LLM-produced HTML, pair this with `<FluxMarkdown sanitize={…} />` (DOMPurify or
+  similar — see [Security](#security)).
+- `componentTags: ["Thinking", …]` — when your LLM emits custom tags like
+  `<Thinking>…</Thinking>` and you want their inner content parsed as markdown
+  and dispatched to a React component. Safe without `unsafeHtml` (attributes are
+  sanitized; allowlisted tags only).
 **Footnotes** (`gfmFootnotes`) work in streaming with one honest caveat: a
 `[^1]` reference renders speculatively the moment it's seen (committed blocks
 can't re-render), and the footnote **section is emitted at finalize**. So a
@@ -160,15 +342,15 @@ Subscribes to a `FluxClient`, renders each block keyed by its stable parser-assi
 #### Custom components / overrides
-Pass a `components` map to replace how elements render — the same idea as
-react-markdown's `components` prop, but the keys come in **two namespaces**:
+Pass a `components` map to replace how elements render. Keys come in **two
+namespaces**:
 ```tsx
 import { useMemo } from "react";
 import { FluxClient, FluxMarkdown, type Components } from "flux-md";
 function Message({ client }: { client: FluxClient }) {
-  // ⚠️ Memoize (or hoist to module scope). A fresh object every render busts
+  // Memoize (or hoist to module scope). A fresh object every render busts
   // FluxMarkdown's block memo, so every block re-parses on every patch.
   const components: Components = useMemo(
     () => ({
@@ -181,6 +363,15 @@ function Message({ client }: { client: FluxClient }) {
       CodeBlock: ({ text, language, open }) => (
         <MyCodeBlockWithCopyButton code={text} lang={language} streaming={open} />
       ),
+      // GitHub alerts (`> [!NOTE]` / `[!TIP]` / `[!WARNING]` / `[!CAUTION]` /
+      // `[!IMPORTANT]`) — swap in your own callout component. The alert kind
+      // is on `block.kind.data.kind`; `html` is the rendered inner body.
+      Alert: ({ block, html }) => (
+        <MyCallout kind={(block.kind.data as { kind: string }).kind}>
+          <div dangerouslySetInnerHTML={{ __html: html }} />
+        </MyCallout>
+      ),
     }),
     [],
   );
@@ -313,8 +504,8 @@ styles them, and they're overridable as a block kind via `components.Alert`.
 By design, not yet, or only partially:
 - **Raw HTML in markdown** — escaped by default, not passed through. (Security
-  default. A `setUnsafeHtml(true)` opt-in exists but must never be enabled for
-  untrusted input.)
+  default. The `unsafeHtml: true` config flag disables the escape but must never
+  be enabled for untrusted input without a `sanitize` hook.)
 - **Forward link references when streaming** — a `[ref]` used *before* its later
   `[ref]: url` definition can't resolve until the definition arrives; one-shot
   parsing handles it fully, streaming converges once the definition streams in.
@@ -326,13 +517,28 @@ By design, not yet, or only partially:
 - **Syntax highlighting on open code blocks** — deferred until close. This is a
   deliberate perf choice.
+## Performance
+Every realistic streaming shape (long paragraph, fenced code block, GFM table,
+blockquote/alert, flat list, math fence, reference-heavy document) parses in
+**O(n) total work**, not O(n²) — at every chunk size from 16 bytes (char-by-char)
+up. Each shape has an incremental cache that mirrors the structure of the block
+so that an append only does work proportional to the *newly arrived* bytes, not
+the growing tail. See [CHANGELOG.md](./CHANGELOG.md) for per-shape numbers and
+the regression that prompted each cache; the canonical bench is
+`crates/flux-md-core/examples/bench.rs` (`cargo run --release --example bench`).
+Headline numbers are not durable across machines, but the curve is: chunk size
+shouldn't change the order of magnitude for any shape. If you hit one that does,
+file an issue with the input and chunking — that's the next bench scenario.
 ## Security
 flux-md is XSS-safe by default — its HTML output is meant to be injected via
 `innerHTML` without a downstream sanitizer:
-- **Raw HTML is escaped** (the `unsafe_html` / `setUnsafeHtml(true)` opt-in
-  disables this; **never enable it for untrusted input**).
+- **Raw HTML is escaped** (the `unsafeHtml: true` config flag disables this;
+  **never enable it for untrusted input without a `sanitize` hook**).
 - **Dangerous URL schemes are neutralized** in `<a href>` and `<img src>` —
   `javascript:`, `vbscript:`, `data:text/html`, `data:text/javascript` become
   `#`. The check runs on the *decoded* URL and strips characters browsers
@@ -352,12 +558,17 @@ that returns raw HTML), **bring a real sanitizer** and pass it via
 `<FluxMarkdown sanitize={…} />`. flux-md applies it to every block's HTML before
 injection — **including the streaming (open) tail**, which the raw-`innerHTML`
 fast path would otherwise expose. flux-md stays zero-dep; you choose the
-sanitizer:
+sanitizer. The realistic pattern (matches the live demo):
 ```tsx
 import DOMPurify from "dompurify";
-<FluxMarkdown client={client} sanitize={(html) => DOMPurify.sanitize(html)} />
+// Hoist to module scope (or wrap in useCallback). A fresh arrow each render
+// busts FluxMarkdown's per-block memo and re-runs every block through sanitize.
+const sanitize = (html: string) => DOMPurify.sanitize(html);
+// …then in your component:
+<FluxMarkdown client={client} sanitize={sanitize} />
 ```
 The built-in code/math renderers operate on already-escaped content and are not