pdfnative 1.1.0 → 1.3.0

package/README.md CHANGED
@@ -15,25 +15,46 @@
15
15
 
16
16
  Pure native PDF generation library — zero vendor dependencies. ISO 32000-1 (PDF 1.7) compliant.
17
17
 
18
+ ## Ecosystem
19
+
20
+ pdfnative ships as three coordinated packages — pick whichever entry point fits your workflow:
21
+
22
+ | Package | Latest | Use it for |
23
+ |---|:---:|---|
24
+ | [`pdfnative`](https://www.npmjs.com/package/pdfnative) | **v1.3.0** | The library itself — call from Node, browsers, Workers, Deno, Bun. |
25
+ | [`pdfnative-cli`](https://www.npmjs.com/package/pdfnative-cli) | **v0.3.0** | Render JSON → PDF, sign (RSA + ECDSA-SHA256, RFC 3161 detection), inspect, and verify CMS signatures from the shell. New in v0.3.0: `--watch`, `--template`, `--font {latin,emoji}`, auto signature placeholder. |
26
+ | [`pdfnative-mcp`](https://www.npmjs.com/package/pdfnative-mcp) | **v0.3.0** | Use pdfnative from Claude Desktop, Cursor, Continue, Zed (or any stdio MCP client) — **9 structured tools** including the new `inspect_pdf`, a `pdfA` flag on every doc tool, multi-script `lang`, and per-tool `outputSchema` (MCP 2025-06-18). |
27
+
28
+ ```bash
29
+ npm install pdfnative # library
30
+ npm install -g pdfnative-cli # CLI
31
+ npm install -g pdfnative-mcp # MCP server
32
+ ```
33
+
34
+ Detailed docs: [CLI guide](docs/guides/cli.md) · [MCP guide](docs/guides/mcp.md) · [Onboarding cheatsheet](docs/guides/onboarding.md).
35
+
18
36
  ## Highlights
19
37
 
20
38
  - **Zero dependencies** — built from scratch in pure TypeScript. Zero runtime dependencies, tree-shakeable, auditable
21
39
  - **ISO 32000-1 compliant** — valid xref tables, /Info metadata, proper font embedding
22
- - **16 Unicode scripts** — Thai, Japanese, Chinese (SC), Korean, Greek, Devanagari, Turkish, Vietnamese, Polish, Arabic, Hebrew, Cyrillic, Georgian, Armenian, Bengali, Tamil
40
+ - **22 Unicode scripts** — Thai, Japanese, Chinese (SC), Korean, Greek, Devanagari, Telugu, Turkish, Vietnamese, Polish, Arabic, Hebrew, Cyrillic, Georgian, Armenian, Bengali, Tamil, Sinhala, Tibetan, Khmer, Myanmar, Ethiopic
23
41
  - **Thai OpenType shaping** — GSUB substitution + GPOS mark-to-base + mark-to-mark positioning
24
42
  - **Arabic positional shaping** — GSUB isolated/initial/medial/final forms + lam-alef ligatures
25
- - **BiDi text layout** — simplified Unicode Bidirectional Algorithm (UAX #9) with glyph mirroring
43
+ - **BiDi text layout** — Unicode Bidirectional Algorithm (UAX #9) with glyph mirroring, isolates (LRI/RLI/FSI/PDI), and explicit embeddings (LRE/RLE/LRO/RLO/PDF) including character-level X4–X5 overrides (v1.3.0)
44
+ - **USE-lite shaping** — `classifyUseCategory` / `classifyClusters` drive joiner classification across the Devanagari, Bengali, and Tamil shapers, fixing nukta+virama, half-form, eyelash-ra, and ya-phalaa edge cases (v1.3.0)
45
+ - **Colour emoji (COLRv1)** — opt-in Noto Color Emoji subset; solid + linear + radial gradient layers rendered as native PDF Form XObjects; monochrome fallback when not registered (v1.3.0). Variation selectors, ZWJ/ZWNJ, and skin-tone modifiers no longer leave tofu, and glyph `/BBox` is computed from contour bounds so emoji are never clipped (v1.3.0). [Guide →](docs/guides/colour-emoji.md)
26
46
  - **Multi-font fallback** — automatic cross-script font switching with continuation bias
27
47
  - **TTF subsetting** — only used glyphs embedded (dramatic file size reduction)
28
48
  - **Tagged PDF / PDF/A** — structure tree, /ActualText, XMP metadata, sRGB OutputIntent (PDF/A-1b, 2b, 2u, 3b with embedded file attachments)
29
49
  - **PDF Encryption** — AES-128 (V4/R4) and AES-256 (V5/R6), owner + user passwords, granular permissions
30
- - **Free-form document builder** — headings, paragraphs, lists, tables, images, barcodes, SVG paths, form fields, spacers, page breaks, table of contents
50
+ - **Free-form document builder** — headings, paragraphs, lists, tables, images, barcodes, SVG paths, form fields, spacers, page breaks, table of contents. Configurable block limit via `layout.maxBlocks` (default 100 000) for very large reports (v1.3.0)
51
+ - **Smart tables** — multi-page slicing with repeated headers, auto-wrap on column overflow, zebra striping, captions, and smart auto-fit columns (v1.2.0). [Guide →](docs/guides/tables.md)
31
52
  - **Barcode & QR code generation** — Code 128, EAN-13, QR Code, Data Matrix, PDF417 — pure PDF path operators (no images)
32
53
  - **SVG path rendering** — path, rect, circle, ellipse, line, polyline, polygon as native PDF operators
33
54
  - **AcroForm fields** — text, multiline, checkbox, radio, dropdown, listbox with appearance streams (ISO 32000-1 §12.7)
34
- - **Digital signatures** — CMS/PKCS#7 detached signatures with RSA + ECDSA, SHA-256/384/512, X.509 parsing (ISO 32000-1 §12.8)
35
- - **Streaming output** — AsyncGenerator-based progressive PDF emission with configurable chunk size
36
- - **PDF parser & modifier** — read existing PDFs (tokenizer, xref, object parser, FlateDecode inflate) + incremental modification
55
+ - **Digital signatures** — CMS/PKCS#7 detached signatures with RSA + ECDSA, SHA-256/384/512, X.509 parsing (ISO 32000-1 §12.8). One-call placeholder injection via `addSignaturePlaceholder()` (v1.2.0)
56
+ - **Streaming output** — AsyncGenerator-based progressive PDF emission with configurable chunk size, object-boundary page-by-page streaming, and **true constant-memory streaming** (`buildDocumentPDFStreamTrue()`, v1.3.0) where the full PDF binary never materialises. [Guide →](docs/guides/streaming.md)
57
+ - **PDF parser & modifier** — read existing PDFs (tokenizer, xref, object parser, FlateDecode inflate) + incremental modification. Read-only PDF/UA structural checker `validatePdfUA()` (ISO 14289-1: MarkInfo, StructTree, ParentTree, Lang, per-page MCID uniqueness) (v1.3.0)
37
58
  - **Image embedding** — JPEG (DCTDecode) and PNG (FlateDecode) with auto-scaling and alignment
38
59
  - **Hyperlinks** — PDF link annotations (/URI) with URL validation, blue underlined text, tagged /Link
39
60
  - **Header/footer templates** — configurable `PageTemplate` with left/center/right zones and `{page}`/`{pages}`/`{date}`/`{title}` placeholders
@@ -42,7 +63,7 @@ Pure native PDF generation library — zero vendor dependencies. ISO 32000-1 (PD
42
63
  - **FlateDecode compression** — zlib stream compression (50–90% size reduction), zero-dependency, platform-native
43
64
  - **Web Worker support** — off-main-thread generation for large datasets
44
65
  - **Tree-shakeable** — ESM + CJS dual build with TypeScript declarations
45
- - **95%+ test coverage** — 1588+ tests across 40 files, fuzz suite, performance benchmarks
66
+ - **95%+ test coverage** — 1982+ tests across 71 files, fuzz suite, dual-mode visual-regression suite, performance benchmarks
46
67
  - **NPM provenance** — signed builds via GitHub Actions OIDC
47
68
  - **On-device generation** — runs in Node, browsers, Workers, Deno, Bun. No SaaS round-trip; documents never leave the calling process unless your application explicitly sends them
48
69
  - **No telemetry, no network calls** — verifiable in source. The library never opens a socket, fetches remote fonts, or phones home
@@ -66,7 +87,7 @@ npm install pdfnative
66
87
  - ❓ **FAQ:** [docs/guides/faq.md](docs/guides/faq.md) — fonts, encryption, signatures, comparisons.
67
88
  - 🛠️ **Troubleshooting:** [docs/guides/troubleshooting.md](docs/guides/troubleshooting.md) — common pitfalls.
68
89
  - 🎮 **Playgrounds:** [docs/playgrounds/extreme-scripts.html](docs/playgrounds/extreme-scripts.html) (live BiDi/Indic stress tests) and [docs/playgrounds/medical-800.html](docs/playgrounds/medical-800.html) (800-page Web Worker showcase).
69
- - 🧪 **Sample PDFs:** [scripts/generators/](scripts/generators/) — ~140 sample PDFs across 23 categories (see [Sample PDFs](#sample-pdfs) below).
90
+ - 🧪 **Sample PDFs:** [scripts/generators/](scripts/generators/) — ~187 sample PDFs across 32 categories (see [Sample PDFs](#sample-pdfs) below).
70
91
 
71
92
  ## Why pdfnative?
72
93
 
@@ -181,6 +202,12 @@ registerFonts({
181
202
  hy: () => import('pdfnative/fonts/noto-armenian-data.js'),
182
203
  bn: () => import('pdfnative/fonts/noto-bengali-data.js'),
183
204
  ta: () => import('pdfnative/fonts/noto-tamil-data.js'),
205
+ te: () => import('pdfnative/fonts/noto-telugu-data.js'), // v1.3.0
206
+ si: () => import('pdfnative/fonts/noto-sinhala-data.js'), // v1.3.0
207
+ bo: () => import('pdfnative/fonts/noto-tibetan-data.js'), // v1.3.0
208
+ km: () => import('pdfnative/fonts/noto-khmer-data.js'), // v1.3.0
209
+ my: () => import('pdfnative/fonts/noto-myanmar-data.js'), // v1.3.0
210
+ am: () => import('pdfnative/fonts/noto-ethiopic-data.js'), // v1.3.0
184
211
  // v1.1.0+ — optional Latin fallback for PDF/A documents with curly quotes,
185
212
  // em-dash, ellipsis, etc. (activates automatically when needed):
186
213
  latin: () => import('pdfnative/fonts/noto-sans-data.js'),
@@ -400,7 +427,7 @@ See [scripts/README.md](scripts/README.md) for the modular generator architectur
400
427
  | `sample-hy.pdf` | Armenian |
401
428
  | `sample-bn.pdf` | Bengali (GSUB conjuncts + GPOS marks) |
402
429
  | `sample-ta.pdf` | Tamil (GSUB + split vowel decomposition) |
403
- | `sample-multi.pdf` | Mixed: all 16 scripts in one PDF |
430
+ | `sample-multi.pdf` | Mixed: all 22 scripts in one PDF |
404
431
  | `sample-pagination.pdf` | 200 rows, multi-page layout |
405
432
 
406
433
  ### Diverse Use Cases (non-financial)
@@ -495,8 +522,14 @@ See [scripts/README.md](scripts/README.md) for the modular generator architectur
495
522
  | `doc-bengali.pdf` | Bengali document (GSUB conjuncts + GPOS marks) |
496
523
  | `doc-tamil.pdf` | Tamil document (GSUB substitution + split vowels) |
497
524
  | `doc-devanagari.pdf` | Hindi (Devanagari) document — GSUB conjuncts, reph reordering, matra reordering, split vowels |
525
+ | `doc-telugu.pdf` | Telugu document (virama conjuncts + GPOS marks, no reph) |
526
+ | `doc-sinhala.pdf` | Sinhala document (virama conjuncts + pre-base kombuva reordering) |
527
+ | `doc-tibetan.pdf` | Tibetan document (vertical subjoined-consonant stacking) |
528
+ | `doc-khmer.pdf` | Khmer document (USE-lite: coeng subscripts, pre-base vowels) |
529
+ | `doc-myanmar.pdf` | Myanmar document (USE-lite: medials, pre-base reordering) |
530
+ | `doc-amharic.pdf` | Amharic/Ethiopic document (syllabic abugida, no reordering) |
498
531
  | `doc-chinese-catalog.pdf` | Chinese product catalog (tables, ordering info) |
499
- | `doc-multi-language.pdf` | Multi-language: EN + Arabic + Japanese in one PDF |
532
+ | `doc-multi-language.pdf` | Multi-language showcase: all 22 Unicode scripts in one PDF |
500
533
  | `doc-invoice.pdf` | Invoice template (line items, totals, payment link) |
501
534
  | `doc-report-multipage.pdf` | 3-page technical report (7 sections, 4 tables) |
502
535
  | `doc-contract-bilingual.pdf` | Bilingual EN/AR contract (legal sections, signatures) |
@@ -678,6 +711,10 @@ See [scripts/README.md](scripts/README.md) for the modular generator architectur
678
711
  |----------|-------------|
679
712
  | `buildDocumentPDFStream(params, layout?, streamOpts?)` | Stream document PDF as `AsyncGenerator<Uint8Array>` |
680
713
  | `buildPDFStream(params, layout?, streamOpts?)` | Stream table PDF as `AsyncGenerator<Uint8Array>` |
714
+ | `buildDocumentPDFStreamTrue(params, layout?, streamOpts?)` | **True constant-memory** document streaming — frees each part as it yields (v1.3.0) |
715
+ | `buildPDFStreamTrue(params, layout?, streamOpts?)` | **True constant-memory** table streaming (v1.3.0) |
716
+ | `buildDocumentPDFStreamPageByPage(params, layout?)` | Stream document PDF chunked at PDF object boundaries |
717
+ | `buildPDFStreamPageByPage(params, layout?)` | Stream table PDF chunked at PDF object boundaries |
681
718
  | `validateDocumentStreamable(params, layout?)` | Validate document is compatible with streaming (no TOC, no `{pages}`) |
682
719
  | `validateTableStreamable(params, layout?)` | Validate table is compatible with streaming |
683
720
  | `chunkBinaryString(str, chunkSize)` | Split binary string into `Uint8Array` chunks |
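The streaming builders in the table above are documented as returning `AsyncGenerator<Uint8Array>`. A minimal sketch of consuming such a stream without buffering the whole PDF might look like the following. `pipeChunks` and `fakePdfStream` are hypothetical names used only for illustration; the stand-in generator takes the place of a real call such as `buildDocumentPDFStream(params)`, whose parameters are not shown here.

```typescript
// Forward every chunk to a sink as soon as it arrives and return the byte count.
// Only one chunk is held at a time, so memory use does not grow with PDF size.
async function pipeChunks(
  stream: AsyncIterable<Uint8Array>,
  write: (chunk: Uint8Array) => void | Promise<void>,
): Promise<number> {
  let total = 0;
  for await (const chunk of stream) {
    await write(chunk); // wait for the sink before pulling the next chunk
    total += chunk.length;
  }
  return total;
}

// Hypothetical stand-in for a pdfnative stream (assumption: same chunk type).
async function* fakePdfStream(): AsyncGenerator<Uint8Array> {
  yield new Uint8Array([0x25, 0x50, 0x44, 0x46]); // "%PDF"
  yield new Uint8Array([0x2d]); // "-"
}
```

In a real application the `write` callback would typically hand each chunk to a file handle or HTTP response; awaiting it gives the sink a chance to apply backpressure.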
@@ -711,6 +748,7 @@ See [scripts/README.md](scripts/README.md) for the modular generator architectur
711
748
  | `isRef(v)` / `isDict(v)` / `isArray(v)` / `isStream(v)` | Type guards for parsed PDF values |
712
749
  | `dictGet(dict, key)` / `dictGetName(dict, key)` | Dictionary value accessors |
713
750
  | `inflateSync(data)` | Decompress FlateDecode data (zlib inflate) |
751
+ | `validatePdfUA(bytes)` | Read-only PDF/UA structural checker — returns `{ valid, errors, warnings }` (v1.3.0) |
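The table row above states that `validatePdfUA(bytes)` returns `{ valid, errors, warnings }`. As a rough sketch of how a caller might turn that result into a readable report, assuming (this is not confirmed by the diff) that `errors` and `warnings` are arrays of strings; `PdfUAReport` and `summarizeReport` are hypothetical names:

```typescript
// Assumed result shape: only the three field names come from the README.
interface PdfUAReport {
  valid: boolean;
  errors: string[]; // assumption: plain string messages
  warnings: string[]; // assumption: plain string messages
}

// Build a multi-line text summary: status line first, then errors, then warnings.
function summarizeReport(report: PdfUAReport): string {
  const status = report.valid ? "PASS" : "FAIL";
  const lines = [`PDF/UA structural check: ${status}`];
  for (const e of report.errors) lines.push(`  error: ${e}`);
  for (const w of report.warnings) lines.push(`  warning: ${w}`);
  return lines.join("\n");
}
```

If the real entries turn out to be objects rather than strings, only the two `for` loops would need to change.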
714
752
 
715
753
  ### Document Block Types
716
754
 
@@ -791,6 +829,11 @@ const pdf = buildPDFBytes(params, { compress: true });
791
829
  | `shapeBengaliText(str, fontData)` | Bengali GSUB conjuncts + GPOS marks |
792
830
  | `shapeTamilText(str, fontData)` | Tamil GSUB + split vowel decomposition |
793
831
  | `shapeDevanagariText(str, fontData)` | Devanagari cluster shaping + GSUB/GPOS |
832
+ | `shapeTeluguText(str, fontData)` | Telugu GSUB conjuncts + GPOS marks (v1.3.0) |
833
+ | `shapeSinhalaText(str, fontData)` | Sinhala conjuncts + pre-base reorder + GSUB/GPOS (v1.3.0) |
834
+ | `shapeTibetanText(str, fontData)` | Tibetan vertical subjoined stacking (v1.3.0) |
835
+ | `shapeKhmerText(str, fontData)` | Khmer USE-lite — coeng subscripts + pre-base vowels (v1.3.0) |
836
+ | `shapeMyanmarText(str, fontData)` | Myanmar USE-lite — medials + virama stacking (v1.3.0) |
794
837
  | `detectFallbackLangs(texts, primaryLang)` | Detect needed fallback fonts |
795
838
  | `detectCharLang(codePoint)` | Map codepoint to preferred font language |
796
839
  | `splitTextByFont(str, fontEntries)` | Multi-font text run splitting |
@@ -801,6 +844,10 @@ const pdf = buildPDFBytes(params, { compress: true });
801
844
  | `shapeArabicText(str, fontData)` | Arabic GSUB positional shaping |
802
845
  | `containsArabic(text)` | Detect Arabic content |
803
846
  | `containsHebrew(text)` | Detect Hebrew content |
847
+ | `containsTelugu(text)` | Detect Telugu content (v1.3.0) |
848
+ | `isTeluguCodepoint(cp)` | Telugu codepoint predicate (v1.3.0) |
849
+ | `containsSinhala(text)` / `containsTibetan(text)` / `containsKhmer(text)` / `containsMyanmar(text)` / `containsEthiopic(text)` | Detect script content (v1.3.0) |
850
+ | `isSinhalaCodepoint(cp)` / `isTibetanCodepoint(cp)` / `isKhmerCodepoint(cp)` / `isMyanmarCodepoint(cp)` / `isEthiopicCodepoint(cp)` | Codepoint predicates (v1.3.0) |
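To illustrate what the codepoint predicates and `contains…` detectors above are for, here is a simplified sketch based on Unicode block ranges. It is not pdfnative's implementation: `SCRIPT_BLOCKS`, `isInBlock`, and `containsScript` are hypothetical names, and only each script's main Unicode block is covered (extension and supplement blocks are left out).

```typescript
type ScriptName = "telugu" | "sinhala" | "tibetan" | "myanmar" | "ethiopic" | "khmer";

// Main Unicode block per script (inclusive ranges).
const SCRIPT_BLOCKS: Record<ScriptName, readonly [number, number]> = {
  telugu: [0x0c00, 0x0c7f],
  sinhala: [0x0d80, 0x0dff],
  tibetan: [0x0f00, 0x0fff],
  myanmar: [0x1000, 0x109f],
  ethiopic: [0x1200, 0x137f],
  khmer: [0x1780, 0x17ff],
};

// Codepoint predicate: true when cp lies inside the script's main block.
function isInBlock(cp: number, script: ScriptName): boolean {
  const [lo, hi] = SCRIPT_BLOCKS[script];
  return cp >= lo && cp <= hi;
}

// Detector: true when at least one character of text belongs to the script.
function containsScript(text: string, script: ScriptName): boolean {
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp !== undefined && isInBlock(cp, script)) return true;
  }
  return false;
}
```

A detector of this kind lets a caller decide which font data to load before building the document.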
804
851
 
805
852
  ### Layout Constants
806
853
 
@@ -821,9 +868,11 @@ pdfnative ships as a library, but two official companion packages cover the most
821
868
 
822
869
  ### pdfnative-cli — command-line interface
823
870
 
824
- [`pdfnative-cli`](https://github.com/Nizoka/pdfnative-cli) v0.2.0 is the **official CLI**. It exposes four commands — `render`, `sign`, `inspect`, **`verify`** — covering the full `pdfnative` v1.0.5 surface for use in shell scripts, Makefiles, GitHub Actions, and Docker images. Zero extra runtime dependencies, npm-provenance-signed.
871
+ [`pdfnative-cli`](https://github.com/Nizoka/pdfnative-cli) v0.3.0 is the **official CLI**. It exposes four commands — `render`, `sign`, `inspect`, **`verify`** — covering the full `pdfnative` v1.1.0 surface for use in shell scripts, Makefiles, GitHub Actions, and Docker images. Zero extra runtime dependencies, npm-provenance-signed.
825
872
 
826
- **Highlights (v0.2.0):** hybrid `flags + --layout file.json` model, encryption (AES-128/256), watermarks (text + image), header/footer page templates with `{page}/{pages}/{date}/{title}`, PDF/A-3 attachments (Factur-X / ZUGFeRD pattern), multilingual fonts via `--lang`, table-variant rendering, signing metadata + intermediate cert chains, `inspect --verbose/--pages/--check`, and a brand-new `verify` command for byte-range integrity and certificate-chain validation. **100 % backward-compatible** with v0.1.0.
873
+ **New in v0.3.0:** ECDSA-SHA256 (P-256) signing fully wired, real CMS/PKCS#7 verification with RFC 3161 timestamp detection, automatic signature-placeholder injection on `sign`, plus three iteration-friendly `render` flags (`--watch`, `--template`, `--font {latin,emoji}`). **100 % backward-compatible** with v0.2.0.
874
+
875
+ **Previously (v0.2.0):** hybrid `flags + --layout file.json` model, encryption (AES-128/256), watermarks (text + image), header/footer page templates with `{page}/{pages}/{date}/{title}`, PDF/A-3 attachments (Factur-X / ZUGFeRD pattern), multilingual fonts via `--lang`, table-variant rendering, signing metadata + intermediate cert chains, `inspect --verbose/--pages/--check`, and the `verify` command.
827
876
 
828
877
  ```bash
829
878
  # render with full layout coverage (encryption + watermark + PDF/A-2b)
@@ -845,11 +894,13 @@ npx pdfnative-cli inspect --input signed.pdf \
845
894
  --check pdfa --check signed --format json
846
895
  ```
847
896
 
848
- See the [CLI Guide](https://pdfnative.dev/guides/cli.html) for the full v0.2.0 reference, security model, recipes, and the `--conformance` → `--tagged` migration path. Try the [interactive CLI playground](https://pdfnative.dev/playgrounds/cli.html) to build commands without leaving the browser.
897
+ See the [CLI Guide](https://pdfnative.dev/guides/cli.html) for the full v0.3.0 reference, security model, recipes, and the `--conformance` → `--tagged` migration path. Try the [interactive CLI playground](https://pdfnative.dev/playgrounds/cli.html) to build commands without leaving the browser.
849
898
 
850
899
  ### pdfnative-mcp — Model Context Protocol server
851
900
 
852
- [`pdfnative-mcp`](https://github.com/Nizoka/pdfnative-mcp) is a **Model Context Protocol server** that bridges pdfnative to any MCP-compatible AI client. Once configured, your AI assistant can generate PDFs, embed barcodes, create forms, sign documents, and render international text — all without writing code.
901
+ [`pdfnative-mcp`](https://github.com/Nizoka/pdfnative-mcp) v0.3.0 is a **Model Context Protocol server** that bridges pdfnative to any MCP-compatible AI client. Once configured, your AI assistant can generate PDFs, embed barcodes, create forms, sign documents, render international text, and **inspect** existing PDFs — all without writing code.
902
+
903
+ **New in v0.3.0:** `inspect_pdf` (the 9th tool, structured metadata/page/signature/PDF/A report), `pdfA` flag on every document tool, multi-script `lang` (`string | string[] | csv`) with bundled `latin` + `emoji` codes, `add_table` `autoFitColumns`/`clipCells`, and per-tool `outputSchema` per the MCP 2025-06-18 spec.
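The multi-script `lang` parameter above is described as accepting `string | string[] | csv`. One plausible way to normalise those three forms into a single list is sketched below. `normalizeLang` is a hypothetical helper, not part of pdfnative-mcp, and the trimming and de-duplication behaviour is an assumption.

```typescript
// Accept a single code, a comma-separated string, or an array of codes,
// and return a de-duplicated list that preserves first-seen order.
function normalizeLang(lang: string | string[]): string[] {
  const raw = Array.isArray(lang) ? lang : lang.split(",");
  const seen = new Set<string>();
  const out: string[] = [];
  for (const item of raw) {
    const code = item.trim();
    if (code !== "" && !seen.has(code)) {
      seen.add(code);
      out.push(code);
    }
  }
  return out;
}
```

For example, `normalizeLang("te, km,latin")` and `normalizeLang(["te", "km", "latin"])` both yield the same three-element list.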
853
904
 
854
905
  ```bash
855
906
  npx -y pdfnative-mcp
@@ -860,13 +911,14 @@ npx -y pdfnative-mcp
860
911
  | Tool | Purpose |
861
912
  |------|---------|
862
913
  | `generate_basic_pdf` | Multi-page documents from structured blocks (headings, paragraphs, lists) |
863
- | `add_table` | Tabular reports from column headers and data rows |
914
+ | `add_table` | Tabular reports from column headers and data rows (v0.3.0: `autoFitColumns`, `clipCells`) |
864
915
  | `add_barcode` | QR Code, Code 128, EAN-13, Data Matrix, PDF417 |
865
- | `add_international_text` | 16 non-Latin scripts with BiDi & OpenType shaping |
916
+ | `add_international_text` | 16 non-Latin scripts with BiDi & OpenType shaping (v0.3.0: multi-script `lang`) |
866
917
  | `add_form` | Interactive AcroForm PDFs (text, checkbox, radio, dropdown) |
867
918
  | `embed_image` | Embed a JPEG or PNG image (base64) |
868
919
  | `prepare_signature_placeholder` | PDF with a `/Sig` field ready to be signed |
869
920
  | `sign_pdf` | CMS/PKCS#7 digital signatures (RSA-SHA256 / ECDSA-SHA256) |
921
+ | `inspect_pdf` | **New in v0.3.0** — structured PDF report (metadata, pages, signatures, PDF/A) |
870
922
 
871
923
  ### Claude Desktop configuration
872
924
 
@@ -946,9 +998,9 @@ src/
946
998
  ├── worker-api.ts # Worker/main-thread dispatch
947
999
  └── pdf-worker.ts # Self-contained worker entry
948
1000
 
949
- fonts/ # Pre-built font data modules (16 scripts)
1001
+ fonts/ # Pre-built font data modules (22 scripts)
950
1002
  tools/ # CLI: build-font-data.cjs (TTF → JS module)
951
- scripts/ # Modular sample PDF generation (23 generators, 140+ PDFs)
1003
+ scripts/ # Modular sample PDF generation (32 generators, 187+ PDFs)
952
1004
  tests/ # 1726+ tests (48 files: unit + integration + fuzz + parser)
953
1005
  bench/ # Performance benchmarks (vitest bench)
954
1006
  ```
@@ -1166,7 +1218,7 @@ pdfnative targets ES2020 and works in any environment that supports `Uint8Array`
1166
1218
 
1167
1219
  ## Origin
1168
1220
 
1169
- pdfnative was born inside [**plika.app**](https://plika.app) — a personal finance application where high-quality, multi-language PDF generation (bank statements, transaction reports) was a core requirement. Rather than depending on heavy third-party libraries, the PDF engine was built from scratch with zero dependencies, strict ISO compliance, and native support for 16 Unicode scripts.
1221
+ pdfnative was born inside [**plika.app**](https://plika.app) — a personal finance application where high-quality, multi-language PDF generation (bank statements, transaction reports) was a core requirement. Rather than depending on heavy third-party libraries, the PDF engine was built from scratch with zero dependencies, strict ISO compliance, and native support for 22 Unicode scripts.
1170
1222
 
1171
1223
  The decision was then made to extract the engine into an independent open-source library so that everyone can benefit from production-grade PDF generation — not just plika.app users.
1172
1224