mathpix-markdown-it 2.0.38 → 2.0.40
- package/README.md +3 -0
- package/doc/changelog.md +128 -0
- package/es5/browser/auto-render.js +1 -1
- package/es5/bundle.js +4 -4
- package/es5/index.js +4 -4
- package/lib/components/mathpix-markdown/index.js +2 -1
- package/lib/components/mathpix-markdown/index.js.map +1 -1
- package/lib/index.d.ts +2 -1
- package/lib/index.js +3 -1
- package/lib/index.js.map +1 -1
- package/lib/markdown/common/consts.d.ts +5 -0
- package/lib/markdown/common/consts.js +17 -5
- package/lib/markdown/common/consts.js.map +1 -1
- package/lib/markdown/common/convert-math-to-html.d.ts +10 -0
- package/lib/markdown/common/convert-math-to-html.js +163 -41
- package/lib/markdown/common/convert-math-to-html.js.map +1 -1
- package/lib/markdown/common/labels.d.ts +9 -1
- package/lib/markdown/common/labels.js +82 -37
- package/lib/markdown/common/labels.js.map +1 -1
- package/lib/markdown/common/reset-mmd-state.d.ts +4 -0
- package/lib/markdown/common/reset-mmd-state.js +30 -0
- package/lib/markdown/common/reset-mmd-state.js.map +1 -0
- package/lib/markdown/common.d.ts +3 -0
- package/lib/markdown/common.js +34 -23
- package/lib/markdown/common.js.map +1 -1
- package/lib/markdown/highlight/highlight-math-token.js +1 -0
- package/lib/markdown/highlight/highlight-math-token.js.map +1 -1
- package/lib/markdown/index.js +22 -8
- package/lib/markdown/index.js.map +1 -1
- package/lib/markdown/mathpix-markdown-plugins.js +21 -1
- package/lib/markdown/mathpix-markdown-plugins.js.map +1 -1
- package/lib/markdown/md-block-rule/begin-tabular/common.d.ts +15 -1
- package/lib/markdown/md-block-rule/begin-tabular/common.js +57 -11
- package/lib/markdown/md-block-rule/begin-tabular/common.js.map +1 -1
- package/lib/markdown/md-block-rule/begin-tabular/index.d.ts +3 -0
- package/lib/markdown/md-block-rule/begin-tabular/index.js +79 -20
- package/lib/markdown/md-block-rule/begin-tabular/index.js.map +1 -1
- package/lib/markdown/md-block-rule/begin-tabular/multi-column-row.d.ts +3 -1
- package/lib/markdown/md-block-rule/begin-tabular/multi-column-row.js +15 -9
- package/lib/markdown/md-block-rule/begin-tabular/multi-column-row.js.map +1 -1
- package/lib/markdown/md-block-rule/begin-tabular/parse-tabular.d.ts +2 -1
- package/lib/markdown/md-block-rule/begin-tabular/parse-tabular.js +177 -73
- package/lib/markdown/md-block-rule/begin-tabular/parse-tabular.js.map +1 -1
- package/lib/markdown/md-block-rule/begin-tabular/sub-cell.d.ts +1 -0
- package/lib/markdown/md-block-rule/begin-tabular/sub-cell.js +11 -23
- package/lib/markdown/md-block-rule/begin-tabular/sub-cell.js.map +1 -1
- package/lib/markdown/md-block-rule/begin-tabular/sub-code.d.ts +0 -6
- package/lib/markdown/md-block-rule/begin-tabular/sub-code.js +10 -21
- package/lib/markdown/md-block-rule/begin-tabular/sub-code.js.map +1 -1
- package/lib/markdown/md-block-rule/begin-tabular/sub-math.d.ts +13 -5
- package/lib/markdown/md-block-rule/begin-tabular/sub-math.js +132 -93
- package/lib/markdown/md-block-rule/begin-tabular/sub-math.js.map +1 -1
- package/lib/markdown/md-block-rule/begin-tabular/sub-tabular.d.ts +3 -1
- package/lib/markdown/md-block-rule/begin-tabular/sub-tabular.js +44 -36
- package/lib/markdown/md-block-rule/begin-tabular/sub-tabular.js.map +1 -1
- package/lib/markdown/md-block-rule/begin-tabular/tabular-td.d.ts +11 -3
- package/lib/markdown/md-block-rule/begin-tabular/tabular-td.js +207 -57
- package/lib/markdown/md-block-rule/begin-tabular/tabular-td.js.map +1 -1
- package/lib/markdown/md-block-rule/mmd-html-block.js +11 -2
- package/lib/markdown/md-block-rule/mmd-html-block.js.map +1 -1
- package/lib/markdown/md-core-rules/set-positions.js +90 -15
- package/lib/markdown/md-core-rules/set-positions.js.map +1 -1
- package/lib/markdown/md-inline-rule/core-inline.js +41 -9
- package/lib/markdown/md-inline-rule/core-inline.js.map +1 -1
- package/lib/markdown/md-inline-rule/tabular.js +5 -2
- package/lib/markdown/md-inline-rule/tabular.js.map +1 -1
- package/lib/markdown/md-latex-footnotes/block-rule.js +72 -3
- package/lib/markdown/md-latex-footnotes/block-rule.js.map +1 -1
- package/lib/markdown/md-latex-lists-env/re-level.js +39 -22
- package/lib/markdown/md-latex-lists-env/re-level.js.map +1 -1
- package/lib/markdown/md-renderer-rules/render-tabular.js +115 -36
- package/lib/markdown/md-renderer-rules/render-tabular.js.map +1 -1
- package/lib/markdown/md-svg-to-base64/base64.js +8 -8
- package/lib/markdown/md-svg-to-base64/base64.js.map +1 -1
- package/lib/markdown/md-theorem/block-rule.js +10 -6
- package/lib/markdown/md-theorem/block-rule.js.map +1 -1
- package/lib/markdown/mdPluginRaw.js +24 -3
- package/lib/markdown/mdPluginRaw.js.map +1 -1
- package/lib/markdown/mdPluginTOC.js +30 -4
- package/lib/markdown/mdPluginTOC.js.map +1 -1
- package/lib/markdown/mdPluginTableTabular.js +46 -1
- package/lib/markdown/mdPluginTableTabular.js.map +1 -1
- package/lib/markdown/utils.js +3 -0
- package/lib/markdown/utils.js.map +1 -1
- package/lib/mathjax/index.js +3 -3
- package/lib/mathjax/index.js.map +1 -1
- package/lib/mathpix-markdown-model/index.d.ts +4 -0
- package/lib/mathpix-markdown-model/index.js +2 -1
- package/lib/mathpix-markdown-model/index.js.map +1 -1
- package/package.json +1 -1
- package/pr-specs/2026-04-global-state-cleanup-and-perf.md +212 -0
- package/pr-specs/2026-04-optimize-tabular-parsing.md +211 -0
- package/pr-specs/2026-05-footnote-perf-and-parser-invariants.md +246 -0
- package/pr-specs/2026-05-tabular-vertical-align-bracket.md +270 -0
- package/lib/markdown/mdPluginSeparateForBlock.d.ts +0 -2
- package/lib/markdown/mdPluginSeparateForBlock.js +0 -209
- package/lib/markdown/mdPluginSeparateForBlock.js.map +0 -1
package/README.md
CHANGED
@@ -913,6 +913,7 @@ The `MathpixMarkdown` React element accepts the following props:
 | `showPageBreaks` | boolean;*`false`* | Hidden tags will be shown in html like page break |
 | `centerImages` | boolean;*`true`* | Center align images by default |
 | `centerTables` | boolean;*`true`* | Center align tables by default |
+| `defaultCellVerticalAlign` | "top" \| "middle" \| "bottom";*`undefined`* | Fallback vertical alignment for tabular cells without an explicit `\begin{tabular}[t/c/b]{...}` bracket. Per-column `m`/`p`/`b` and any explicit `[t]/[c]/[b]` source bracket always override. Unset → no override. |
 | `validateLink` | function;*`null`* | The function `(url: string) => void` to override md link validator |
 | `enableCodeBlockRuleForLatexCommands` | boolean;*`false`* | By default, if latex commands are indented (4 spaces / 1 tab) they do not become `Code Blocks`. |
 | `parserErrors` | [ParserErrors](https://github.com/Mathpix/mathpix-markdown-it#parsererrors);*`{}`* | Sets options to output parser errors for equations and tabular |
@@ -947,6 +948,7 @@ The `MathpixMarkdown` React element accepts the following props:
 | `showPageBreaks` | boolean;*`false`* | Hidden tags will be shown in html like page break |
 | `centerImages` | boolean;*`true`* | Center align images by default |
 | `centerTables` | boolean;*`true`* | Center align tables by default |
+| `defaultCellVerticalAlign` | "top" \| "middle" \| "bottom";*`undefined`* | Fallback vertical alignment for tabular cells without an explicit `\begin{tabular}[t/c/b]{...}` bracket. Per-column `m`/`p`/`b` and any explicit `[t]/[c]/[b]` source bracket always override. Unset → no override. |
 | `validateLink` | function;*`null`* | The function `(url: string) => void` to override md link validator |
 | `enableCodeBlockRuleForLatexCommands` | boolean;*`false`* | By default, if latex commands are indented (4 spaces / 1 tab) they do not become `Code Blocks`. |
 | `parserErrors` | [ParserErrors](https://github.com/Mathpix/mathpix-markdown-it#parsererrors);*`{}`* | Sets options to output parser errors for equations and tabular |
@@ -978,6 +980,7 @@ The `MathpixMarkdown` React element accepts the following props:
 | `include_speech` | boolean *`false`* | outputs speech `<speech>...</speech` |
 | `md_separators` | `{column: ' ', row: ' <br> '}`| Separators for Markdown tables |
 | `table_markdown` | `{math_as_ascii: false, math_inline_delimiters: ['$','$']}`| By default, math goes into Markdown tables as latex and is enclosed in `$...$` delimiters. If `math_as_ascii` is set to `true`, then math will be represented as asciimath |
+| `skipMathToHtml` | boolean *`false`* | When `true`, skips SVG serialization and `token.mathEquation` storage. Overrides `include_svg`; other MathJax outputs (`mathml`, `asciimath`, `linearmath`, etc.) still respect their own `include_*` flags. Intended for callers that walk the token tree directly and never read the serialized math HTML. |
 
 ### TOutputMathJax
 
package/doc/changelog.md
CHANGED
@@ -1,3 +1,131 @@
+# May 2026
+
+## [2.0.40] - Tabular vertical-align bracket and footnote performance
+
+- Tabular vertical alignment:
+  - Parse the optional `[t]/[c]/[b]` bracket on `\begin{tabular}` (standard LaTeX2e syntax) and use it as the row-level vertical-align default for `l/c/r/S` columns. Per-column `m`/`p`/`b` continues to override.
+  - Cell-level inference: when an outer cell's content includes a nested `\begin{tabular}[t/c/b]`, the outer `<td>` inherits that vertical-align (matching LaTeX baseline semantics — the cell containing the `[t]` inner tabular sits at the top of the row). Per-column `m`/`p`/`b` on the outer column still wins. Cell-level inference overrides the row-level bracket for that single cell.
+  - In `forLatex`, every `td_open` of a tabular with an effective bracket carries `meta.parentBracket` (`'t'`/`'c'`/`'b'`) — the bracket of THIS table, set on every cell of that table. Consumers walking forLatex tokens see parent context directly on each `<td>` without re-deriving from the parent `table_open`. `AddTd` and `AddTdSubTable` accept an optional `meta?: TTdMeta` parameter to attach this and other forLatex-specific cell info.
+  - New option `defaultCellVerticalAlign?: 'top' | 'middle' | 'bottom'`. HTML rendering: applies as the fallback for `\begin{tabular}` blocks without an explicit bracket. Explicit source bracket always wins. Default unset is byte-identical to legacy on existing MMD. `'middle'` propagates to regular `l/c/r/S` cells (matches existing default), but is a no-op for `\multicolumn` / `\multirow` cells (preserves legacy no-vertical-align on multicol).
+  - `forLatex` round-trip: for `'top'`/`'bottom'` (top-level only) the option's value is injected into `tableOpen.meta.bracket` so the consumer can serialize `\begin{tabular}[pos]{...}`. Nested absent-bracket tabulars stay bracket-less to preserve round-trip.
+  - `\multicolumn` / `\multirow` cells inherit `'t'`/`'b'` from any source (bracket or option), and `'c'` only from an explicit source bracket — never from option `'middle'`. Plain `\multicolumn{}` / `\multirow{}` in an absent-bracket tabular continues to emit no `vertical-align` (legacy).
+  - Diagbox cells always render with `vertical-align: middle` regardless of the outer tabular's bracket: `getSubTabular` flags wrappers with `hasDiagbox`, the parser skips its own vertical-align emit, and `render-tabular` adds `middle` once. Removes the duplicate `vertical-align: middle;` from existing diagbox snapshots.
+  - Explicit `\multirow[t/c/b]` always wins over the row-level default and emits explicit `vertical-align`. Fixes a regression where `\multirow[c]` inside `\begin{tabular}[t]{...}` silently inherited the outer `[t]` instead of honoring the user's explicit `[c]`. Two existing `\multirow[c]` snapshots in `_tabular/_data_digbox.js` updated to include the now-explicit `vertical-align: middle`.
+  - **Breaking change** to the exported `openTag` / `openTagG` regexes in `md-block-rule/begin-tabular`: the optional `[pos]` bracket is now a capture group. New shape is `match[1]` = bracket pos (`t`/`c`/`b` or `undefined`), `match[2]` = column spec. Previously `match[1]` = column spec. Consumers calling `openTag.exec(src)[1]` / `src.match(openTag)[1]` for the column spec must read `[2]` instead. `openTagTabular` and `BEGIN_TABULAR_INLINE_RE` (presence-check regexes) likewise allow the optional bracket but keep their existing capture groups.
+  - `getParams` (column-spec parser) now skips an optional `[pos]` before `{` and returns the normalized bracket position.
+
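Note on the `openTag` breaking change above: the capture-group shift can be illustrated with a small standalone sketch. The regex below is hypothetical and written only for this note (the library's actual `openTag` pattern is not shown in this diff and may differ, for example in how it handles nested braces such as `p{3cm}`); the helper name `parseTabularOpen` is likewise an assumption.

```typescript
// Hypothetical pattern, NOT the library's `openTag`: group 1 is the optional
// [t]/[c]/[b] bracket, group 2 is the column spec (nested braces unsupported).
const openTagSketch = /\\begin\s*\{tabular\}\s*(?:\[([tcb])\])?\s*\{([^}]*)\}/;

interface TabularOpen {
  pos: "t" | "c" | "b" | undefined;
  columnSpec: string;
}

// Reads the bracket from match[1] and the column spec from match[2],
// mirroring the new group layout described in the changelog entry.
function parseTabularOpen(src: string): TabularOpen | null {
  const match = openTagSketch.exec(src);
  if (match === null) {
    return null;
  }
  return {
    pos: match[1] as TabularOpen["pos"], // undefined when no bracket is present
    columnSpec: match[2] ?? "",
  };
}

const withBracket = parseTabularOpen("\\begin{tabular}[t]{|l|c|}");
const withoutBracket = parseTabularOpen("\\begin{tabular}{lcr}");
```

Code that previously read the column spec from `match[1]` would, under this layout, receive the bracket position (or `undefined`) instead, which is why deep-import consumers need to switch to `match[2]`.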
+- Footnote rule performance:
+
+  - `latex_footnote_block` / `latex_footnotetext_block`: per-state position cache + per-line token guard turn the O(N×M) accumulation scan into one O(|src|) sweep per parse and O(1) per subsequent block-start. ~120× speedup on a 2.45 MB MMD with 706 long tabular blocks (worst case for the pre-change Phase 1 scan); HTML output byte-identical.
+  - `setChildrenPositions`: per-child `Object.isExtensible` guard before `.positions` assignment fixes `TypeError` thrown by frozen `SHARED_*_CLOSE` singletons inside `tabular_inline` subtrees, restoring `markdownToHTMLSegments({ addPositionsToTokens: true })` on documents with inline subtables. `link_open` branch split into strict-triple `[text](url)` (legacy snapshot-pinned math) + span fallback for fancy contents (`[**bold**](url)`, `` [`code`](url) ``, `[](url)`) — fixes silent NaN/off-by-N positions that existed on master.
+  - `BeginTheorem`: env-name validation hoisted above `state.push` in non-silent mode — unregistered environments no longer leave unmatched `<div class="theorem_block">` wrappers in the rendered HTML. Silent-mode terminator probes preserved (required by `\newtheorem` ↔ `\begin{NAME}` adjacent-line handshake).
+  - Behavior change for unregistered `\begin{NAME}…\end{NAME}` (e.g. TikZ): previously the rule emitted an unmatched `<div class="theorem_block">` wrapper around a math-block fallback `<span class="math-block equation-number" number="0"></span>`. Now the wrapper is gone; the inner placeholder is unchanged. `.equation-number` element count is unchanged vs master. Register via `\newtheorem{NAME}{…}` to get the body rendered.
+  - Behavior change for `markdownToHTMLSegments` consumers: same documents emit more segments than before because the previously unmatched wrapper was preventing segment delimiters from breaking at natural boundaries. Output bytes are unchanged; segment counts are not.
+  - Behavior change for highlights consumers: fancy-link span fallback (`[**bold**](url)` etc. with overlapping `highlights:`) emits empty `<span class="mmd-highlight"></span>` wrappers around markup-only inner tokens (strong_open/strong_close). Filter empty `.mmd-highlight` matches if iterating.
+
+See `pr-specs/2026-05-footnote-perf-and-parser-invariants.md` for design and known limitations.
+
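Note on the `setChildrenPositions` entry above: the failure mode and the guard can be sketched in isolation. The token shape, the `SHARED_CLOSE_SKETCH` constant, and the function name below are assumptions made for this note; the real rule computes actual source offsets rather than the placeholder range used here.

```typescript
// Minimal token shape for illustration; real markdown-it tokens carry more fields.
interface TokenLike {
  type: string;
  positions?: { start: number; end: number };
}

// Stand-in for a frozen shared close-token singleton (hypothetical name).
const SHARED_CLOSE_SKETCH: TokenLike = Object.freeze({ type: "td_close" });

// Assigns a placeholder position to every child that can accept new
// properties, and skips frozen singletons instead of throwing a TypeError.
function setPositionsSafely(children: TokenLike[], offset: number): number {
  let assigned = 0;
  for (const child of children) {
    if (!Object.isExtensible(child)) {
      continue; // frozen/sealed object: adding `.positions` would throw in strict mode
    }
    child.positions = { start: offset, end: offset };
    assigned += 1;
  }
  return assigned;
}

const textToken: TokenLike = { type: "text" };
const assignedCount = setPositionsSafely([textToken, SHARED_CLOSE_SKETCH], 10);
```

The trade-off is that shared close tokens never carry positions, which appears acceptable here because a singleton reused across many cells could not hold a per-cell position anyway.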
+# April 2026
+
+## [2.0.39] - Optimize tabular parsing memory and performance
+
+- Algorithms:
+  - Rewrote `getSubMath()` from recursive to iterative single-pass (O(N×M) → O(N+M)); `getMathTableContent()` now uses `parts[]` + `join()` instead of repeated slice+concat. The `startPos: number = 0` optional parameter is preserved for signature compatibility with deep-import consumers.
+  - `colsToFixWidth` in the tabular parser converted from `Array` + `.includes()` + `.push()` to `Set<number>` for O(1) dedup-on-insert. Previous code was O(N²) in cell count for wide tables; Set path is O(N). Converted to array once at `tableOpen.meta` assignment.
+  - Removed two dead `.split('').join('')` round-trips in `common.ts` (`getColumnLines` and `getColumnAlign`) — identity operations that allocated a per-call character array. The `.split('').join(' ')` call on the next line is NOT a no-op and is preserved.
+  - `mathTable`, `subTabular`, `extractedCodeBlocks` converted from Array + `findIndex()` to Map for O(1) lookups.
+  - `labelsByKey` + `labelsByUuid` Map indexes; `labelsList` export kept as a deprecated backward-compatible `Proxy` that returns a version-cached snapshot of `labelsByKey.values()` — snapshot is rebuilt only when the underlying map changes. Mutations (`.push`, index assignment) target the throwaway target array and are effectively ignored.
+  - `diagboxById` reverse Map + `ClearDiagboxTable()`.
+  - `buildInlineCodePositionSet()` returns `Set<number>` for O(1) position checks in `findEndMarker` (previously O(n×m) per character).
+  - `tagRegexCache` memoizes HTML block regexes; fixed `lastIndex` corruption by swapping `.test()` on g-flag regex for `.match()`.
+  - `utf8Encode`: `parts[]` + `join()` instead of O(n²) string concat.
+  - `SetItemizeLevelTokens`: saves/restores only `outMath` with `try/finally`.
+  - `mathTablePush` accepts both `(id, content)` and `({id, content})` forms (backward-compatible overload).
+  - `mathpixMarkdownPlugin`: shared `envToInline` object per table to avoid hundreds of thousands of object copies on large documents.
+
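Note on the `tagRegexCache` entry above: the `lastIndex` problem is standard JavaScript behavior and is easy to reproduce outside the library. The pattern and input below are made up for this note; only the `.test()` versus `.match()` behavior is the point.

```typescript
// A global-flag regex keeps state in `lastIndex` between `.test()` calls,
// so reusing a cached instance can yield alternating results.
const cachedTagRe = /<div\b/g;
const htmlLine = '<div class="x">';

const firstTest = cachedTagRe.test(htmlLine); // matches and advances lastIndex
const secondTest = cachedTagRe.test(htmlLine); // resumes after the match, finds nothing

// `String.prototype.match` with a g-flag regex restarts from index 0 on every
// call, so repeated calls on the same cached regex give the same answer.
const firstMatch = htmlLine.match(cachedTagRe) !== null;
const secondMatch = htmlLine.match(cachedTagRe) !== null;
```

An alternative would be to drop the `g` flag from cached presence-check regexes; the changelog states the library chose `.match()`, and this sketch only illustrates why the original combination was unsafe.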
+- Per-parse math cache:
+  - Added `state.env.__mathpix` cache (following markdown-it-footnote convention) that deduplicates identical `inline_math` / `display_math` expressions within a single parse. No persistence between parses, no public API options.
+  - Cache exclusions: `equation_math` / `equation_math_not_number` (numbering side effects), `inline_mathML` / `display_mathML` (different MathJax path), `return_asciimath` tokens (ascii extraction side effects).
+  - Cache bypass via `beginCacheBypass` / `endCacheBypass` when `outMath` is temporarily mutated (e.g. `SetItemizeLevelTokens` for `forDocx`).
+  - Accessibility IDs (`mjx-mml-*`) regenerated on cache hit so every token keeps a unique DOM id.
+  - Cache hits mark the returned result with `_labelsRegistered: true`; `convertMathToHtml` then skips the per-label `state.md.inline.parse()` + `addIntoLabelsList()` loop (the two are idempotent for the same key+content). `idLabels` is still recomputed from `Object.keys(token.labels)`.
+
+- Token-tree retention fixes:
+  - `mdPluginTOC`: stored the parse state on a module-level `gstate` variable so the TOC render rule could reach the top-level token list. The reference was never cleared and pinned the entire token tree across unrelated parses. The token list is now stashed on `state.env[TOC_ENV_KEY]` and released with the env when the parse ends.
+  - `coreInline`: rebound `state.env` to a fresh object inside the inline loop. That desynced state.env from the env reference the caller of `md.render(src, env)` still held, so parse-time mutations (TOC / cache) became invisible to render rules. Now mutates state.env in place and uses a private `inlineEnv` for the nested `inline.parse()` call. The same pattern was applied to the deeper recursive walker `walkInlineInTokens` (footnote / tabular deep-walk paths).
+
+- Per-parse cross-plugin state reset (`reset_mmd_global_state` core-ruler hook, before `normalize`):
+  - Module-level state in sub-plugins (TOC slug registry, theorem/figure/section counters, labels Map, footnote list, itemize marker token trees, list-depth stack, size counter, MathJax equation counter) was previously cleared only at `md.use(plugin)` time or inside the `initMathpixMarkdown.parse` / `renderer.render` wrappers. Direct users of `markdownIt().use(mathpixMarkdownPlugin)` who reused one md instance across documents saw drift: extra `-2`/`-3` TOC slug suffixes, bumped theorem/section numbers, stale `\ref{}` IDs, stale footnote refs, retention of old `\renewcommand{\labelitemi}` token trees.
+  - The new hook clears all of the above at the start of every `md.parse()`. It respects `renderElement.startLine` and skips on partial re-renders so cross-references inside an enclosing parse are preserved.
+  - Also fixes a latent leak in `parse-error.ts` — `ParseErrorList` had a `ClearParseErrorList()` function that was never called anywhere; tabular parse errors accumulated monotonically.
+  - Exported `resetMmdGlobalState()` from the package root so one-shot converters (e.g. DOCX export) can release module-level state immediately after render without waiting for the next parse. Module-level state that render needs (labels, theorems, footnotes, etc.) is otherwise retained until the next `md.parse()` fires the hook.
+
+- Segment balance fix in `markdownToHtmlPipelineSegments`:
+  - The segments renderer tracked a single `pendingCloseTag` + `pendingLevel` pair. A nested same-type same-level `_open` (e.g. md-theorem wraps an inner `paragraph_open` at level 0 inside the outer `paragraph_open` class `theorem_block`) caused the first `paragraph_close` to terminate the segment mid-block, producing `<div><div>...</div>` in one segment and `</div></div>...` in the next.
+  - Added a `pendingDepth` counter: nested opens of the same type at the same level now increment depth; the segment closes only when depth drops back to zero. Covered by `tests/_html-segments.js` across 38 scenarios exercising all block rules from `mmdRules.ts`.
+
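Note on the segment balance fix above: the depth-counter idea can be sketched on a simplified token stream. Everything here is an assumption for illustration (the `BlockToken` shape with a pre-rendered `html` field, the function name, and the way segments are accumulated); the real renderer works on markdown-it tokens and render rules.

```typescript
// Simplified block token: `html` stands in for whatever the token renders to.
interface BlockToken {
  type: string;
  level: number;
  html: string;
}

// Splits a flat token stream into top-level segments. Nested opens of the
// same type at the same level bump `pendingDepth`, so the segment only ends
// when the matching number of closes has been seen.
function splitSegments(tokens: BlockToken[]): string[] {
  const segments: string[] = [];
  let current = "";
  let pendingOpen = "";
  let pendingClose: string | null = null;
  let pendingLevel = 0;
  let pendingDepth = 0;

  for (const token of tokens) {
    current += token.html;
    if (pendingClose === null) {
      if (token.type.endsWith("_open")) {
        pendingOpen = token.type;
        pendingClose = token.type.replace(/_open$/, "_close");
        pendingLevel = token.level;
        pendingDepth = 1;
      } else {
        segments.push(current); // self-contained token forms its own segment
        current = "";
      }
      continue;
    }
    if (token.level !== pendingLevel) {
      continue; // inner content of the pending block
    }
    if (token.type === pendingOpen) {
      pendingDepth += 1;
    } else if (token.type === pendingClose) {
      pendingDepth -= 1;
      if (pendingDepth === 0) {
        segments.push(current);
        current = "";
        pendingClose = null;
      }
    }
  }
  if (current !== "") {
    segments.push(current); // unbalanced tail, kept rather than dropped
  }
  return segments;
}

const theoremLike: BlockToken[] = [
  { type: "paragraph_open", level: 0, html: "<div>" },
  { type: "paragraph_open", level: 0, html: "<div>" },
  { type: "inline", level: 1, html: "x" },
  { type: "paragraph_close", level: 0, html: "</div>" },
  { type: "paragraph_close", level: 0, html: "</div>" },
  { type: "paragraph_open", level: 0, html: "<div>" },
  { type: "inline", level: 1, html: "y" },
  { type: "paragraph_close", level: 0, html: "</div>" },
];
const segments = splitSegments(theoremLike);
```

Without the counter, the first `paragraph_close` would end the segment after `<div><div>x</div>`, which is the unbalanced split the changelog describes.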
+- Additional parse-only retention fixes:
+  - `cleanup_math_cache` core-ruler hook (pushed, end of pipeline) clears `state.env.__mathpix`. Previously the per-parse math dedup cache was only initialized, never released, so MathJax html/svg strings for every unique expression stayed on env until the caller dropped it (200+ MB on math-heavy docs in long-lived processes).
+  - `mdPluginTOC.grab_state` stashes `state.tokens` on `state.env[TOC_ENV_KEY]` only when the document actually used `[[toc]]` — detected by a one-pass scan of inline-token children for `toc_body`. Documents without `[[toc]]` no longer pay the cost of retaining the whole token tree on env.
+
+- Two-hook tabular-state cleanup:
+  - `reset_tabular_state` core-ruler hook (before `normalize`) clears tabular module-level state at the start of every `md.parse()`.
+  - New `cleanup_tabular_state` hook (pushed, end of core pipeline) drops parse-only caches (`subTabular`, `mathTable`, `extractedCodeBlocks`, `diagboxTable`, column-style intern cache) at the end of parse — they're never read during render. Both hooks respect `renderElement.startLine` for partial renders.
+
+- Per-token allocation reduction:
+  - Pre-interned 16 border-style strings (`border-{top,bottom,left,right}-style`: solid / double / dashed / none) replace per-cell template-literal allocations.
+  - `columnStyleCache` per-parse intern for the composed `<td>` style string.
+  - `getSharedCellAttrs` / `getSharedTableOpenAttrs` / `getSharedTbodyOpenAttrs` / `getSharedTrOpenAttrs` return read-only shared attrs arrays keyed by (style, isEmpty) / (extraClass, numCol). Shared arrays carry the non-enumerable `Symbol.for('mathpix.tabular.attrsShared')` marker; mutation sites (`tokenAttrSet` in the tabular renderer, `addAttributesToParentTokenByType` in utils) clone-on-write before writing.
+  - Frozen singleton close-token markers: `SHARED_TD_CLOSE`, `SHARED_TR_CLOSE`, `SHARED_TABLE_CLOSE`, and `SHARED_TBODY_CLOSE` (non-forLatex only — under `forLatex`, `tbody_close` carries a per-table `latex` payload and is allocated per-instance). The multi-column branch of `parse-tabular.ts` also pushes `SHARED_TD_CLOSE` directly instead of allocating a fresh close-token per cell.
+  - `addStyle` / `addHLineIntoStyle` check the input attrs for the `attrsSharedMarker` symbol and clone before mutating so that callers which pass in a shared-attrs array do not corrupt the cached object.
+  - `StatePushTabulars` no longer assigns `content` / `children = []` onto open/close markers — those fields are never read on markers and assignment would throw on the frozen close singletons.
+  - Replaced `res = res.concat(...)` with in-place `res.push(...)` inside the tabular construction loop to remove intermediate array allocations.
+  - `applyTypesetResultToToken` drops `svg` from `token.mathData` when `options.highlights` is not set — the field is only read by `renderMathHighlight` (active under highlights); the default render rule uses `token.mathEquation`. The highlight path re-populates `mathData.svg` in `convertMathToHtmlWithHighlight`.
+  - `OuterData` returns `null` for empty `labels` instead of cloning an empty `{}` onto every math token.
+
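Note on the shared-attrs entries above: the clone-on-write pattern with a non-enumerable symbol marker can be sketched as follows. The symbol key is the one named in the changelog; the cache, the helper names, and the attrs type are assumptions for this note and do not reproduce the library's `getSharedCellAttrs` or `tokenAttrSet`.

```typescript
type Attrs = [string, string][];

// Same registry key the changelog mentions; everything else is illustrative.
const attrsSharedMarker = Symbol.for("mathpix.tabular.attrsShared");
const sharedAttrsCache = new Map<string, Attrs>();

// Returns one cached attrs array per style string and tags it as shared
// with a non-enumerable symbol property.
function getSharedAttrsSketch(style: string): Attrs {
  let attrs = sharedAttrsCache.get(style);
  if (attrs === undefined) {
    attrs = [["style", style]];
    Object.defineProperty(attrs, attrsSharedMarker, { value: true, enumerable: false });
    sharedAttrsCache.set(style, attrs);
  }
  return attrs;
}

// Sets an attribute, cloning first when the array is marked as shared so the
// cached copy is never mutated. Callers must use the returned array.
function setAttrCloneOnWrite(attrs: Attrs, name: string, value: string): Attrs {
  const target: Attrs = attrsSharedMarker in attrs
    ? attrs.map(([k, v]): [string, string] => [k, v]) // copy drops the marker
    : attrs;
  const existing = target.find(([k]) => k === name);
  if (existing !== undefined) {
    existing[1] = value;
  } else {
    target.push([name, value]);
  }
  return target;
}

const sharedCell = getSharedAttrsSketch("text-align: left");
const customized = setAttrCloneOnWrite(sharedCell, "class", "highlight");
const mutatedAgain = setAttrCloneOnWrite(customized, "class", "other");
```

The second write goes straight to `customized` because the clone no longer carries the marker, so the copy cost is paid at most once per token.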
+- Output gating in the tabular renderer:
+  - `renderInlineTokenBlock` and `renderNonTableTokenIntoCell` build each output only when the caller requested it via a shared `computeOutputGates(options)` helper: `needHtml` (`!forMD && include_table_html !== false`), `needTsv` (`include_tsv`), `needCsv` (`include_csv`), `needMd` (`forMD || include_table_markdown`), `needSmoothed` (`forPptx`). Both call sites use the same helper so gates cannot drift. Every `result += ...`, array push, `cellMd +=`, and `formatTsvCell` / `formatCsvCell` call is gated on the corresponding flag.
+  - Leaf-token handling still calls `slf.renderInline([token], options, env)` even when `needHtml` is false — the `latex_list_item_open` render rule sets `token.meta.itemizeLevel` as a side effect that `handleListTokensForCellMarkdown` reads to emit list markers.
+  - `renderTabularInline` short-circuits early when `forMD: true` and neither TSV/CSV/markdown output is requested, avoiding an empty `<div class="inline-tabular"></div>` wrapper.
+
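Note on the output-gating entry above: the gate expressions quoted in the changelog can be written out as a small helper. This is a sketch under assumptions, not the library's `computeOutputGates`: the flat `GateOptions` shape is invented for this note, and in the real options object some of these flags may live under a nested key such as `outMath`.

```typescript
// Hypothetical flattened option shape; only the flag names come from the changelog.
interface GateOptions {
  forMD?: boolean;
  forPptx?: boolean;
  include_table_html?: boolean;
  include_tsv?: boolean;
  include_csv?: boolean;
  include_table_markdown?: boolean;
}

interface OutputGates {
  needHtml: boolean;
  needTsv: boolean;
  needCsv: boolean;
  needMd: boolean;
  needSmoothed: boolean;
}

// Mirrors the gate formulas listed in the changelog entry, computed once so
// every render site branches on the same booleans.
function computeOutputGatesSketch(options: GateOptions): OutputGates {
  return {
    needHtml: !options.forMD && options.include_table_html !== false,
    needTsv: Boolean(options.include_tsv),
    needCsv: Boolean(options.include_csv),
    needMd: Boolean(options.forMD || options.include_table_markdown),
    needSmoothed: Boolean(options.forPptx),
  };
}

const defaultGates = computeOutputGatesSketch({});
const markdownGates = computeOutputGatesSketch({ forMD: true, include_tsv: true });
```

Note that under these formulas HTML is on by default (an unset `include_table_html` counts as requested), while TSV, CSV, and Markdown are opt-in.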
+- HTML-visual attrs skipped for non-HTML outputs:
+  - `td_open` style / `_empty` class, `tr_open` border-reset style, `table_open` `class='tabular'`, and the `table_tabular` class + text-align style on the wrapping `paragraph_open` are HTML/CSS-only. When the caller sets `forMD` or `forLatex`, `AddTd` / `AddTdSubTable` / `getMultiColumnMultiRow` / `StatePushParagraphOpen` skip those assignments. Multicol/multirow cells still carry `colspan` / `rowspan`; `paragraph_open.data-align` is preserved for `forLatex`.
+
+- New public option:
+  - `outMath.skipMathToHtml` (default `false`). Declared on the exported `TOutputMath` type. When `true`, `applyTypesetResultToToken` skips `token.mathEquation` and `typesetMathForToken` passes `include_svg: false` to MathJax so the SVG string is never serialized. Takes precedence over `include_svg`; other MathJax outputs respect their own `include_*` flags. Intended for callers that walk the token tree directly and never read the serialized math HTML. The per-token outMath clone used here is memoized via `WeakMap` to avoid ~49K spread allocations on large documents.
+
+- Review-follow-up cleanups:
+  - `computeOutputGates(options)` helper extracted so both tabular render sites use identical gating.
+  - `attrsSharedMarker` centralized in `common/consts.ts` (was duplicated in `tabular-td.ts`, `utils.ts`, `render-tabular.ts`).
+  - `getSharedTableOpenAttrs(extraClass, skipVisual=true)` now also drops `class='tabular'` under `skipVisual` (previously leaked the HTML-only class for subtable cases).
+  - `getSubTabular` guards the direct Map lookup by a UUID-pattern regex so UUID-looking cell text cannot collide with a stored key.
+  - `subTabular` / `mathTable` module-level Maps marked `const` (never reassigned; only `.set` / `.clear` / `.get`).
+  - Regression test pins `envToInline` render isolation between blocks sharing `state.env`.
+
+- Cleanup:
+  - Removed dead file `src/markdown/mdPluginSeparateForBlock.ts` (and its `lib/*` artifacts). It was never registered with markdown-it; its two core rules (`separateForBlock`, `separateBeforeBlock`) shipped in the initial 2019 commit and never wired in.
+
+- Benchmark (16 MB MMD with 13,713 tabular blocks, ~479K `<td>` cells, ~49K inline math expressions):
+
+  Full SVG/HTML render path:
+
+  | Stage                   |  Before |  After |            Δ |
+  |-------------------------|--------:|-------:|-------------:|
+  | Peak heap (html held)   | 2597 MB | 778 MB | −1819 (−70%) |
+  | Heap after drop html    | 1887 MB |  68 MB | −1819 (−96%) |
+  | Parse time              |  17.9 s | 14.6 s |         −18% |
+
+  Token-only path (`forMD: true`, `outMath.skipMathToHtml: true`):
+
+  | Stage                   |  Before |  After |            Δ |
+  |-------------------------|--------:|-------:|-------------:|
+  | Peak heap               | 2597 MB | 443 MB | −2154 (−83%) |
+  | Heap after drop output  | 1887 MB |  81 MB | −1806 (−96%) |
+  | Serialized output size  |  355 MB | 165 MB |         −190 |
+
+- Docs:
+  - Implementation details in `pr-specs/2026-04-optimize-tabular-parsing.md` and `pr-specs/2026-04-global-state-cleanup-and-perf.md`.
+
 # March 2026
 
 ## [2.0.38] - Fix infinite loop in `inlineMmdIcon` and `inlineDiagbox` silent mode