npm - pdfnative - Versions diffs - 1.2.0 → 1.3.0 - Mend

pdfnative 1.2.0 → 1.3.0

Files changed (26)

package/README.md +41 -14
package/dist/index.cjs +2753 -442
package/dist/index.cjs.map +1 -1
package/dist/index.d.cts +603 -29
package/dist/index.d.ts +603 -29
package/dist/index.js +2727 -443
package/dist/index.js.map +1 -1
package/dist/worker/index.cjs +2003 -121
package/dist/worker/index.cjs.map +1 -1
package/dist/worker/index.js +2003 -121
package/dist/worker/index.js.map +1 -1
package/fonts/noto-color-emoji-data.d.ts +28 -0
package/fonts/noto-color-emoji-data.js +26 -0
package/fonts/noto-ethiopic-data.d.ts +13 -0
package/fonts/noto-ethiopic-data.js +64 -0
package/fonts/noto-khmer-data.d.ts +13 -0
package/fonts/noto-khmer-data.js +64 -0
package/fonts/noto-myanmar-data.d.ts +13 -0
package/fonts/noto-myanmar-data.js +64 -0
package/fonts/noto-sinhala-data.d.ts +13 -0
package/fonts/noto-sinhala-data.js +64 -0
package/fonts/noto-telugu-data.d.ts +13 -0
package/fonts/noto-telugu-data.js +64 -0
package/fonts/noto-tibetan-data.d.ts +13 -0
package/fonts/noto-tibetan-data.js +64 -0
package/package.json +22 -5

package/README.md CHANGED

@@ -21,7 +21,7 @@ pdfnative ships as three coordinated packages — pick whichever entry point fit
 | Package | Latest | Use it for |
 |---|:---:|---|
-| [`pdfnative`](https://www.npmjs.com/package/pdfnative) | **v1.2.0** | The library itself — call from Node, browsers, Workers, Deno, Bun. |
+| [`pdfnative`](https://www.npmjs.com/package/pdfnative) | **v1.3.0** | The library itself — call from Node, browsers, Workers, Deno, Bun. |
 | [`pdfnative-cli`](https://www.npmjs.com/package/pdfnative-cli) | **v0.3.0** | Render JSON → PDF, sign (RSA + ECDSA-SHA256, RFC 3161 detection), inspect, and verify CMS signatures from the shell. New in v0.3.0: `--watch`, `--template`, `--font {latin,emoji}`, auto signature placeholder. |
 | [`pdfnative-mcp`](https://www.npmjs.com/package/pdfnative-mcp) | **v0.3.0** | Use pdfnative from Claude Desktop, Cursor, Continue, Zed (or any stdio MCP client) — **9 structured tools** including the new `inspect_pdf`, a `pdfA` flag on every doc tool, multi-script `lang`, and per-tool `outputSchema` (MCP 2025-06-18). |
@@ -37,23 +37,24 @@ Detailed docs: [CLI guide](docs/guides/cli.md) · [MCP guide](docs/guides/mcp.md
 - **Zero dependencies** — built from scratch in pure TypeScript. Zero runtime dependencies, tree-shakeable, auditable
 - **ISO 32000-1 compliant** — valid xref tables, /Info metadata, proper font embedding
-- **16 Unicode scripts** — Thai, Japanese, Chinese (SC), Korean, Greek, Devanagari, Turkish, Vietnamese, Polish, Arabic, Hebrew, Cyrillic, Georgian, Armenian, Bengali, Tamil
+- **22 Unicode scripts** — Thai, Japanese, Chinese (SC), Korean, Greek, Devanagari, Telugu, Turkish, Vietnamese, Polish, Arabic, Hebrew, Cyrillic, Georgian, Armenian, Bengali, Tamil, Sinhala, Tibetan, Khmer, Myanmar, Ethiopic
 - **Thai OpenType shaping** — GSUB substitution + GPOS mark-to-base + mark-to-mark positioning
 - **Arabic positional shaping** — GSUB isolated/initial/medial/final forms + lam-alef ligatures
-- **BiDi text layout** — Unicode Bidirectional Algorithm (UAX #9) with glyph mirroring, isolates (LRI/RLI/FSI/PDI), and explicit embeddings (LRE/RLE/LRO/RLO/PDF)
-- **USE-lite cluster classifier** — public API (`classifyUseCategory`, `classifyClusters`) with per-script tables for Devanagari, Bengali, Tamil (v1.2.0; shaper rewire lands in v1.3)
+- **BiDi text layout** — Unicode Bidirectional Algorithm (UAX #9) with glyph mirroring, isolates (LRI/RLI/FSI/PDI), and explicit embeddings (LRE/RLE/LRO/RLO/PDF) including character-level X4–X5 overrides (v1.3.0)
+- **USE-lite shaping** — `classifyUseCategory` / `classifyClusters` drive joiner classification across the Devanagari, Bengali, and Tamil shapers, fixing nukta+virama, half-form, eyelash-ra, and ya-phalaa edge cases (v1.3.0)
+- **Colour emoji (COLRv1)** — opt-in Noto Color Emoji subset; solid + linear + radial gradient layers rendered as native PDF Form XObjects; monochrome fallback when not registered (v1.3.0). Variation selectors, ZWJ/ZWNJ, and skin-tone modifiers no longer leave tofu, and glyph `/BBox` is computed from contour bounds so emoji are never clipped (v1.3.0). [Guide →](docs/guides/colour-emoji.md)
 - **Multi-font fallback** — automatic cross-script font switching with continuation bias
 - **TTF subsetting** — only used glyphs embedded (dramatic file size reduction)
 - **Tagged PDF / PDF/A** — structure tree, /ActualText, XMP metadata, sRGB OutputIntent (PDF/A-1b, 2b, 2u, 3b with embedded file attachments)
 - **PDF Encryption** — AES-128 (V4/R4) and AES-256 (V5/R6), owner + user passwords, granular permissions
-- **Free-form document builder** — headings, paragraphs, lists, tables, images, barcodes, SVG paths, form fields, spacers, page breaks, table of contents
+- **Free-form document builder** — headings, paragraphs, lists, tables, images, barcodes, SVG paths, form fields, spacers, page breaks, table of contents. Configurable block limit via `layout.maxBlocks` (default 100 000) for very large reports (v1.3.0)
 - **Smart tables** — multi-page slicing with repeated headers, auto-wrap on column overflow, zebra striping, captions, and smart auto-fit columns (v1.2.0). [Guide →](docs/guides/tables.md)
 - **Barcode & QR code generation** — Code 128, EAN-13, QR Code, Data Matrix, PDF417 — pure PDF path operators (no images)
 - **SVG path rendering** — path, rect, circle, ellipse, line, polyline, polygon as native PDF operators
 - **AcroForm fields** — text, multiline, checkbox, radio, dropdown, listbox with appearance streams (ISO 32000-1 §12.7)
 - **Digital signatures** — CMS/PKCS#7 detached signatures with RSA + ECDSA, SHA-256/384/512, X.509 parsing (ISO 32000-1 §12.8). One-call placeholder injection via `addSignaturePlaceholder()` (v1.2.0)
-- **Streaming output** — AsyncGenerator-based progressive PDF emission with configurable chunk size, plus object-boundary page-by-page streaming (`buildPDFStreamPageByPage()`, v1.2.0)
-- **PDF parser & modifier** — read existing PDFs (tokenizer, xref, object parser, FlateDecode inflate) + incremental modification
+- **Streaming output** — AsyncGenerator-based progressive PDF emission with configurable chunk size, object-boundary page-by-page streaming, and **true constant-memory streaming** (`buildDocumentPDFStreamTrue()`, v1.3.0) where the full PDF binary never materialises. [Guide →](docs/guides/streaming.md)
+- **PDF parser & modifier** — read existing PDFs (tokenizer, xref, object parser, FlateDecode inflate) + incremental modification. Read-only PDF/UA structural checker `validatePdfUA()` (ISO 14289-1: MarkInfo, StructTree, ParentTree, Lang, per-page MCID uniqueness) (v1.3.0)
 - **Image embedding** — JPEG (DCTDecode) and PNG (FlateDecode) with auto-scaling and alignment
 - **Hyperlinks** — PDF link annotations (/URI) with URL validation, blue underlined text, tagged /Link
 - **Header/footer templates** — configurable `PageTemplate` with left/center/right zones and `{page}`/`{pages}`/`{date}`/`{title}` placeholders
@@ -62,7 +63,7 @@ Detailed docs: [CLI guide](docs/guides/cli.md) · [MCP guide](docs/guides/mcp.md
 - **FlateDecode compression** — zlib stream compression (50–90% size reduction), zero-dependency, platform-native
 - **Web Worker support** — off-main-thread generation for large datasets
 - **Tree-shakeable** — ESM + CJS dual build with TypeScript declarations
-- **95%+ test coverage** — 1822+ tests across 53 files, fuzz suite, performance benchmarks
+- **95%+ test coverage** — 1982+ tests across 71 files, fuzz suite, dual-mode visual-regression suite, performance benchmarks
 - **NPM provenance** — signed builds via GitHub Actions OIDC
 - **On-device generation** — runs in Node, browsers, Workers, Deno, Bun. No SaaS round-trip; documents never leave the calling process unless your application explicitly sends them
 - **No telemetry, no network calls** — verifiable in source. The library never opens a socket, fetches remote fonts, or phones home
@@ -86,7 +87,7 @@ npm install pdfnative
 - ❓ **FAQ:** [docs/guides/faq.md](docs/guides/faq.md) — fonts, encryption, signatures, comparisons.
 - 🛠️ **Troubleshooting:** [docs/guides/troubleshooting.md](docs/guides/troubleshooting.md) — common pitfalls.
 - 🎮 **Playgrounds:** [docs/playgrounds/extreme-scripts.html](docs/playgrounds/extreme-scripts.html) (live BiDi/Indic stress tests) and [docs/playgrounds/medical-800.html](docs/playgrounds/medical-800.html) (800-page Web Worker showcase).
-- 🧪 **Sample PDFs:** [scripts/generators/](scripts/generators/) — ~140 sample PDFs across 23 categories (see [Sample PDFs](#sample-pdfs) below).
+- 🧪 **Sample PDFs:** [scripts/generators/](scripts/generators/) — ~187 sample PDFs across 32 categories (see [Sample PDFs](#sample-pdfs) below).
 ## Why pdfnative?
@@ -201,6 +202,12 @@ registerFonts({
   hy: () => import('pdfnative/fonts/noto-armenian-data.js'),
   bn: () => import('pdfnative/fonts/noto-bengali-data.js'),
   ta: () => import('pdfnative/fonts/noto-tamil-data.js'),
+  te: () => import('pdfnative/fonts/noto-telugu-data.js'), // v1.3.0
+  si: () => import('pdfnative/fonts/noto-sinhala-data.js'), // v1.3.0
+  bo: () => import('pdfnative/fonts/noto-tibetan-data.js'), // v1.3.0
+  km: () => import('pdfnative/fonts/noto-khmer-data.js'), // v1.3.0
+  my: () => import('pdfnative/fonts/noto-myanmar-data.js'), // v1.3.0
+  am: () => import('pdfnative/fonts/noto-ethiopic-data.js'), // v1.3.0
   // v1.1.0+ — optional Latin fallback for PDF/A documents with curly quotes,
   // em-dash, ellipsis, etc. (activates automatically when needed):
   latin: () => import('pdfnative/fonts/noto-sans-data.js'),
@@ -420,7 +427,7 @@ See [scripts/README.md](scripts/README.md) for the modular generator architectur
 | `sample-hy.pdf` | Armenian |
 | `sample-bn.pdf` | Bengali (GSUB conjuncts + GPOS marks) |
 | `sample-ta.pdf` | Tamil (GSUB + split vowel decomposition) |
-| `sample-multi.pdf` | Mixed: all 16 scripts in one PDF |
+| `sample-multi.pdf` | Mixed: all 22 scripts in one PDF |
 | `sample-pagination.pdf` | 200 rows, multi-page layout |
 ### Diverse Use Cases (non-financial)
@@ -515,8 +522,14 @@ See [scripts/README.md](scripts/README.md) for the modular generator architectur
 | `doc-bengali.pdf` | Bengali document (GSUB conjuncts + GPOS marks) |
 | `doc-tamil.pdf` | Tamil document (GSUB substitution + split vowels) |
 | `doc-devanagari.pdf` | Hindi (Devanagari) document — GSUB conjuncts, reph reordering, matra reordering, split vowels |
+| `doc-telugu.pdf` | Telugu document (virama conjuncts + GPOS marks, no reph) |
+| `doc-sinhala.pdf` | Sinhala document (virama conjuncts + pre-base kombuva reordering) |
+| `doc-tibetan.pdf` | Tibetan document (vertical subjoined-consonant stacking) |
+| `doc-khmer.pdf` | Khmer document (USE-lite: coeng subscripts, pre-base vowels) |
+| `doc-myanmar.pdf` | Myanmar document (USE-lite: medials, pre-base reordering) |
+| `doc-amharic.pdf` | Amharic/Ethiopic document (syllabic abugida, no reordering) |
 | `doc-chinese-catalog.pdf` | Chinese product catalog (tables, ordering info) |
-| `doc-multi-language.pdf` | Multi-language: EN + Arabic + Japanese in one PDF |
+| `doc-multi-language.pdf` | Multi-language showcase: all 22 Unicode scripts in one PDF |
 | `doc-invoice.pdf` | Invoice template (line items, totals, payment link) |
 | `doc-report-multipage.pdf` | 3-page technical report (7 sections, 4 tables) |
 | `doc-contract-bilingual.pdf` | Bilingual EN/AR contract (legal sections, signatures) |
@@ -698,6 +711,10 @@ See [scripts/README.md](scripts/README.md) for the modular generator architectur
 |----------|-------------|
 | `buildDocumentPDFStream(params, layout?, streamOpts?)` | Stream document PDF as `AsyncGenerator<Uint8Array>` |
 | `buildPDFStream(params, layout?, streamOpts?)` | Stream table PDF as `AsyncGenerator<Uint8Array>` |
+| `buildDocumentPDFStreamTrue(params, layout?, streamOpts?)` | **True constant-memory** document streaming — frees each part as it yields (v1.3.0) |
+| `buildPDFStreamTrue(params, layout?, streamOpts?)` | **True constant-memory** table streaming (v1.3.0) |
+| `buildDocumentPDFStreamPageByPage(params, layout?)` | Stream document PDF chunked at PDF object boundaries |
+| `buildPDFStreamPageByPage(params, layout?)` | Stream table PDF chunked at PDF object boundaries |
 | `validateDocumentStreamable(params, layout?)` | Validate document is compatible with streaming (no TOC, no `{pages}`) |
 | `validateTableStreamable(params, layout?)` | Validate table is compatible with streaming |
 | `chunkBinaryString(str, chunkSize)` | Split binary string into `Uint8Array` chunks |
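
The `chunkBinaryString(str, chunkSize)` row above describes splitting a binary string into `Uint8Array` chunks. As a rough illustration of that idea only, the sketch below uses a hypothetical stand-in named `chunkBinaryStringSketch`. It is not the library's implementation, and it assumes each character of the input carries exactly one byte (char code 0 to 255).

```typescript
// Hypothetical stand-in, not pdfnative's code: split a binary string
// (one byte per character) into Uint8Array chunks of at most chunkSize bytes.
function chunkBinaryStringSketch(str: string, chunkSize: number): Uint8Array[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < str.length; offset += chunkSize) {
    const end = Math.min(offset + chunkSize, str.length);
    const chunk = new Uint8Array(end - offset);
    for (let i = offset; i < end; i++) {
      // Keep only the low byte; assumes the input really is a binary string.
      chunk[i - offset] = str.charCodeAt(i) & 0xff;
    }
    chunks.push(chunk);
  }
  return chunks;
}

// A 9-byte header split into chunks of up to 4 bytes.
const parts = chunkBinaryStringSketch("%PDF-1.7\n", 4);
const totalBytes = parts.reduce((sum, part) => sum + part.length, 0);
```

Only the final chunk may be shorter than `chunkSize`, so a consumer can treat every earlier chunk as full-sized.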
@@ -731,6 +748,7 @@ See [scripts/README.md](scripts/README.md) for the modular generator architectur
 | `isRef(v)` / `isDict(v)` / `isArray(v)` / `isStream(v)` | Type guards for parsed PDF values |
 | `dictGet(dict, key)` / `dictGetName(dict, key)` | Dictionary value accessors |
 | `inflateSync(data)` | Decompress FlateDecode data (zlib inflate) |
+| `validatePdfUA(bytes)` | Read-only PDF/UA structural checker — returns `{ valid, errors, warnings }` (v1.3.0) |
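
The row above says `validatePdfUA(bytes)` returns `{ valid, errors, warnings }`. The sketch below shows one way such a result might be condensed for a CI log. The interface `PdfUAReportSketch`, the helper `summarizePdfUAReport`, and the assumption that `errors` and `warnings` are string arrays are all hypothetical; the diff names the three fields but not their element types.

```typescript
// Assumed shape: the README only names the three fields. Treating the
// lists as string arrays is a guess made for this sketch.
interface PdfUAReportSketch {
  valid: boolean;
  errors: string[];
  warnings: string[];
}

// Condense a report into a one-line summary suitable for a build log.
function summarizePdfUAReport(report: PdfUAReportSketch): string {
  const status = report.valid ? "PASS" : "FAIL";
  return `${status}: ${report.errors.length} error(s), ${report.warnings.length} warning(s)`;
}

// Hand-written sample data, not real validator output.
const sampleReport: PdfUAReportSketch = {
  valid: false,
  errors: ["missing /Lang entry"],
  warnings: [],
};
const summary = summarizePdfUAReport(sampleReport);
```

In real use the object passed to the helper would come from the validator call rather than a literal, after checking the actual return type in `index.d.ts`.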
 ### Document Block Types
@@ -811,6 +829,11 @@ const pdf = buildPDFBytes(params, { compress: true });
 | `shapeBengaliText(str, fontData)` | Bengali GSUB conjuncts + GPOS marks |
 | `shapeTamilText(str, fontData)` | Tamil GSUB + split vowel decomposition |
 | `shapeDevanagariText(str, fontData)` | Devanagari cluster shaping + GSUB/GPOS |
+| `shapeTeluguText(str, fontData)` | Telugu GSUB conjuncts + GPOS marks (v1.3.0) |
+| `shapeSinhalaText(str, fontData)` | Sinhala conjuncts + pre-base reorder + GSUB/GPOS (v1.3.0) |
+| `shapeTibetanText(str, fontData)` | Tibetan vertical subjoined stacking (v1.3.0) |
+| `shapeKhmerText(str, fontData)` | Khmer USE-lite — coeng subscripts + pre-base vowels (v1.3.0) |
+| `shapeMyanmarText(str, fontData)` | Myanmar USE-lite — medials + virama stacking (v1.3.0) |
 | `detectFallbackLangs(texts, primaryLang)` | Detect needed fallback fonts |
 | `detectCharLang(codePoint)` | Map codepoint to preferred font language |
 | `splitTextByFont(str, fontEntries)` | Multi-font text run splitting |
@@ -821,6 +844,10 @@ const pdf = buildPDFBytes(params, { compress: true });
 | `shapeArabicText(str, fontData)` | Arabic GSUB positional shaping |
 | `containsArabic(text)` | Detect Arabic content |
 | `containsHebrew(text)` | Detect Hebrew content |
+| `containsTelugu(text)` | Detect Telugu content (v1.3.0) |
+| `isTeluguCodepoint(cp)` | Telugu codepoint predicate (v1.3.0) |
+| `containsSinhala(text)` / `containsTibetan(text)` / `containsKhmer(text)` / `containsMyanmar(text)` / `containsEthiopic(text)` | Detect script content (v1.3.0) |
+| `isSinhalaCodepoint(cp)` / `isTibetanCodepoint(cp)` / `isKhmerCodepoint(cp)` / `isMyanmarCodepoint(cp)` / `isEthiopicCodepoint(cp)` | Codepoint predicates (v1.3.0) |
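
The `contains…(text)` and `is…Codepoint(cp)` helpers added in this hunk are described only as script detectors and codepoint predicates. A minimal sketch of how such checks can work is a range test against each script's primary Unicode block. The names `SCRIPT_BLOCKS`, `isScriptCodepoint`, and `containsScript` are hypothetical, and the library's real predicates may also cover extended blocks that this sketch leaves out.

```typescript
// Primary Unicode blocks only; extended and supplement blocks are
// deliberately omitted, so this is narrower than a full script check.
const SCRIPT_BLOCKS = {
  telugu: [0x0c00, 0x0c7f],
  sinhala: [0x0d80, 0x0dff],
  tibetan: [0x0f00, 0x0fff],
  myanmar: [0x1000, 0x109f],
  ethiopic: [0x1200, 0x137f],
  khmer: [0x1780, 0x17ff],
} as const;

type ScriptName = keyof typeof SCRIPT_BLOCKS;

// True when cp falls inside the script's primary block.
function isScriptCodepoint(script: ScriptName, cp: number): boolean {
  const [start, end] = SCRIPT_BLOCKS[script];
  return cp >= start && cp <= end;
}

// True when any UTF-16 code unit of text falls inside the block.
// All six blocks sit in the Basic Multilingual Plane, so code units
// are enough and surrogate halves can never match.
function containsScript(script: ScriptName, text: string): boolean {
  for (let i = 0; i < text.length; i++) {
    if (isScriptCodepoint(script, text.charCodeAt(i))) {
      return true;
    }
  }
  return false;
}

const hasTelugu = containsScript("telugu", "abc \u0C24\u0C46");
const hasKhmer = containsScript("khmer", "hello");
```

A table keyed by script name keeps the six predicates in one place, which is convenient when a fallback-font selector needs to test a run of text against several scripts in turn.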
 ### Layout Constants
@@ -971,9 +998,9 @@ src/
     ├── worker-api.ts     # Worker/main-thread dispatch
     └── pdf-worker.ts     # Self-contained worker entry
-fonts/                    # Pre-built font data modules (16 scripts)
+fonts/                    # Pre-built font data modules (22 scripts)
 tools/                    # CLI: build-font-data.cjs (TTF → JS module)
-scripts/                  # Modular sample PDF generation (23 generators, 140+ PDFs)
+scripts/                  # Modular sample PDF generation (32 generators, 187+ PDFs)
 tests/                    # 1726+ tests (48 files: unit + integration + fuzz + parser)
 bench/                    # Performance benchmarks (vitest bench)
 ```
@@ -1191,7 +1218,7 @@ pdfnative targets ES2020 and works in any environment that supports `Uint8Array`
 ## Origin
-pdfnative was born inside [**plika.app**](https://plika.app) — a personal finance application where high-quality, multi-language PDF generation (bank statements, transaction reports) was a core requirement. Rather than depending on heavy third-party libraries, the PDF engine was built from scratch with zero dependencies, strict ISO compliance, and native support for 16 Unicode scripts.
+pdfnative was born inside [**plika.app**](https://plika.app) — a personal finance application where high-quality, multi-language PDF generation (bank statements, transaction reports) was a core requirement. Rather than depending on heavy third-party libraries, the PDF engine was built from scratch with zero dependencies, strict ISO compliance, and native support for 22 Unicode scripts.
 The decision was then made to extract the engine into an independent open-source library so that everyone can benefit from production-grade PDF generation — not just plika.app users.