pdfnative 1.4.0 → 1.5.0

package/README.md CHANGED
@@ -22,9 +22,9 @@ pdfnative ships as four coordinated packages — pick whichever entry point fits
 
  | Package | Latest | Use it for |
  |---|:---:|---|
- | [`pdfnative`](https://www.npmjs.com/package/pdfnative) | **v1.3.0** | The library itself — call from Node, browsers, Workers, Deno, Bun. |
+ | [`pdfnative`](https://www.npmjs.com/package/pdfnative) | **v1.5.0** | The library itself — call from Node, browsers, Workers, Deno, Bun. |
  | [`pdfnative-cli`](https://www.npmjs.com/package/pdfnative-cli) | **v1.1.0** | Render JSON → PDF, sign (RSA + ECDSA-SHA256), inspect, verify (PAdES-T + OCSP/CRL), batch, and emit JSON Schemas from the shell. Built on pdfnative 1.3.0: 22 scripts + COLRv1 emoji, `--stream-true`, `--max-blocks`, `inspect --pdfua`, and an agent-native `--json`/`E_*`/`--dry-run`/`--summary` contract. |
- | [`pdfnative-mcp`](https://www.npmjs.com/package/pdfnative-mcp) | **v1.2.0** | Use pdfnative from Claude Desktop, Cursor, Continue, Zed (or any stdio MCP client) — **14 production tools** including `validate_pdf`, `verify_pdf`, `add_attachment`, `extract_attachments`, and `extract_text`; plus watermark support, Unicode `normalize`, token-frugal read modes (`verbosity` / `fields`), `pdfA` flags, and per-tool `_meta.apiVersion`. Built on pdfnative 1.3.0. |
+ | [`pdfnative-mcp`](https://www.npmjs.com/package/pdfnative-mcp) | **v1.3.0** | Use pdfnative from Claude Desktop, Cursor, Continue, Zed (or any stdio MCP client) — **17 production tools** including the v1.3.0 page-tree trio `merge_pdfs`, `split_pdf`, `extract_pages`, plus `validate_pdf`, `verify_pdf`, `add_attachment`, `extract_attachments`, and `extract_text`; watermark support, Unicode `normalize`, token-frugal read modes (`verbosity` / `fields`), `pdfA` flags, enriched authoring options (`outline`, `pageLabels`, nested lists, `viewerPreferences`, `cellBorders`, `cellVAlign`), a constant-time `node:crypto` signing provider, DNS-rebinding-protected HTTP transport, and per-tool `_meta.apiVersion`. Built on pdfnative 1.4.0. |
  | [`pdfnative-react`](https://www.npmjs.com/package/pdfnative-react) | **v0.2.0** | Write PDFs as declarative JSX — `<Document>`, `<Table>`, `<Barcode>`… compiled on-device to pdfnative blocks by a custom React reconciler. Render hooks (`usePdf`), client components (`PDFViewer`), and a token-frugal `DocSpec` for AI agents. |
 
  ```bash
@@ -53,14 +53,18 @@ Detailed docs: [CLI guide](docs/guides/cli.md) · [MCP guide](docs/guides/mcp.md
  - **Free-form document builder** — headings, paragraphs, lists (incl. **nested / hierarchical** bullet & numbered lists, v1.4.0), tables, images, barcodes, SVG paths, form fields, spacers, page breaks, table of contents. Configurable block limit via `layout.maxBlocks` (default 100 000) for very large reports (v1.3.0)
  - **Smart tables** — multi-page slicing with repeated headers, auto-wrap on column overflow, zebra striping, captions, and smart auto-fit columns (v1.2.0), plus per-cell **borders** (`cellBorders`) and **vertical alignment** (`cellVAlign` / `ColumnDef.vAlign`, v1.4.0). [Guide →](docs/guides/tables.md)
  - **Barcode & QR code generation** — Code 128, EAN-13, QR Code, Data Matrix, PDF417 — pure PDF path operators (no images)
- - **SVG path rendering** — path, rect, circle, ellipse, line, polyline, polygon as native PDF operators
+ - **SVG rendering** — path, rect, circle, ellipse, line, polyline, polygon as native PDF operators, plus `<text>` elements rendered as upright PDF text with `x`/`y` positioning and `text-anchor` (start/middle/end) support (v1.5.0)
  - **AcroForm fields** — text, multiline, checkbox, radio, dropdown, listbox with appearance streams (ISO 32000-1 §12.7)
  - **Digital signatures** — CMS/PKCS#7 detached signatures with RSA + ECDSA, SHA-256/384/512, X.509 parsing (ISO 32000-1 §12.8). One-call placeholder injection via `addSignaturePlaceholder()` (v1.2.0). Pluggable **native crypto provider** (`setCryptoProvider()` / `PdfSignOptions.provider`, v1.4.0) for constant-time, hardware-backed signing (`node:crypto` / Web Crypto / HSM)
  - **Streaming output** — AsyncGenerator-based progressive PDF emission with configurable chunk size, object-boundary page-by-page streaming, and **true constant-memory streaming** (`buildDocumentPDFStreamTrue()`, v1.3.0) where the full PDF binary never materialises. One-call `streamToFile()` drains any stream to disk with back-pressure and `AbortSignal` support (v1.4.0). [Guide →](docs/guides/streaming.md)
  - **Document outline & page labels** — nested bookmarks (`/Outlines` tree, with bold/italic/colour, collapsible nodes via `open: false`, explicit or `outline: 'auto'` from headings) and logical page numbering (`/PageLabels`: decimal, roman, alpha, prefixes, custom start) (v1.4.0). [Guide →](docs/guides/outlines.md)
  - **Viewer preferences** — `PdfLayoutOptions.viewerPreferences` controls initial `/PageLayout` & `/PageMode` plus the `/ViewerPreferences` dict (hide toolbar/menubar, fit/center window, display doc title, non-full-screen mode, reading direction, print scaling) — PDF/A-safe (v1.4.0). [Guide →](docs/guides/viewer-preferences.md)
  - **Font-data validator** — opt-in `validateFontData()` structurally checks custom font modules (SFNT magic, base64 integrity, cmap coverage, glyph-id range, width array, finite metrics) and returns `{ valid, errors, warnings }` (v1.4.0). [Guide →](docs/guides/font-validation.md)
- - **PDF parser & modifier** — read existing PDFs (tokenizer, xref, object parser, FlateDecode inflate) + incremental modification. Read-only PDF/UA structural checker `validatePdfUA()` (ISO 14289-1: MarkInfo, StructTree, ParentTree, Lang, per-page MCID uniqueness) (v1.3.0). **Page-tree manipulation** (v1.4.0): `mergePdfs()`, `splitPdf()`, `extractPages()` rebuild a clean object graph (inherited attributes resolved, annotations/signatures optionally dropped, deterministic trailer `/ID`, bounded-depth copy, 256 MiB output cap via `maxOutputSize`). [Guide →](docs/guides/pdf-manipulation.md)
+ - **PDF parser & modifier** — read existing PDFs (tokenizer, xref, object parser, FlateDecode inflate) + incremental modification. Read-only PDF/UA structural checker `validatePdfUA()` (ISO 14289-1: MarkInfo, StructTree, ParentTree, Lang, per-page MCID uniqueness) (v1.3.0). **Page-tree manipulation** (v1.4.0): `mergePdfs()`, `splitPdf()`, `extractPages()` rebuild a clean object graph (inherited attributes resolved, annotations/signatures optionally dropped, deterministic trailer `/ID`, bounded-depth copy, 256 MiB output cap via `maxOutputSize`). **Round-trip readers** (v1.5.0): `getPageLabels()` parses `/PageLabels` back into a typed `PageLabelRange[]`; `getAnnotations()` / `getPageRef()` read page annotations, and `PdfModifier.addAnnotation()` injects new ones incrementally. [Guide →](docs/guides/pdf-manipulation.md)
+ - **Markup annotations** — typed annotation model (text, highlight, underline, strikeout, squiggly, square, circle, line, freetext) via `buildAnnotation()` / `buildAnnotationBody()`, plus `PdfReader.getAnnotations()` and `PdfModifier.addAnnotation()` for round-trip read/write (v1.5.0). [Guide →](docs/guides/annotations.md)
+ - **Layout debug & inspection** — opt-in `layout: { debug: true }` overlays margin / content / cell boxes for visual layout debugging; `inspectDocumentLayout()` returns a programmatic per-page block-geometry report. Byte-identical when debug is off (v1.5.0). [Guide →](docs/guides/debugging.md)
+ - **Math & technical symbols** — bundleable math font under lang `'math'`; mathematical operators, Greek, arrows, and technical symbols route automatically via script detection (v1.5.0)
+ - **Font-data tooling** — `pdfnative/tools` exposes `compileFontData()` / `parseFontData()` to build and introspect font-data modules programmatically (v1.5.0)
  - **Image embedding** — JPEG (DCTDecode) and PNG (FlateDecode) with auto-scaling and alignment
  - **Hyperlinks** — PDF link annotations (/URI) with URL validation, blue underlined text, tagged /Link
  - **Header/footer templates** — configurable `PageTemplate` with left/center/right zones and `{page}`/`{pages}`/`{date}`/`{title}` placeholders
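
The `text-anchor` (start/middle/end) support that this hunk adds to SVG `<text>` rendering amounts to shifting the drawing origin by a fraction of the measured text width. The following is a minimal sketch of that arithmetic for left-to-right text, not pdfnative's implementation: `anchoredStartX` is a hypothetical helper, and the text width is assumed to be measured by the caller in the same units as `x`.

```typescript
// Hypothetical helper for illustration; not part of pdfnative's documented API.
type TextAnchor = "start" | "middle" | "end";

/**
 * Return the x coordinate at which drawing should begin so that a
 * left-to-right text run is anchored at `x` per SVG `text-anchor` semantics.
 */
function anchoredStartX(
  x: number,
  textWidth: number,
  anchor: TextAnchor = "start",
): number {
  switch (anchor) {
    case "middle":
      return x - textWidth / 2; // centre the run on x
    case "end":
      return x - textWidth; // the run ends at x
    default:
      return x; // "start": the run begins at x
  }
}

console.log(anchoredStartX(100, 40, "middle")); // 80
```

A 40-unit-wide run anchored at `x = 100` therefore starts at 100, 80, or 60 for `start`, `middle`, and `end` respectively.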
@@ -69,11 +73,11 @@ Detailed docs: [CLI guide](docs/guides/cli.md) · [MCP guide](docs/guides/mcp.md
  - **FlateDecode compression** — zlib stream compression (50–90% size reduction), zero-dependency, platform-native
  - **Web Worker support** — off-main-thread generation for large datasets
  - **Tree-shakeable** — ESM + CJS dual build with TypeScript declarations
- - **95%+ test coverage** — 2165+ tests across 83 files, fuzz suite, dual-mode visual-regression suite, performance benchmarks
+ - **95%+ test coverage** — 2218+ tests across 93 files, fuzz suite, dual-mode visual-regression suite, performance benchmarks
  - **NPM provenance** — signed builds via GitHub Actions OIDC
  - **On-device generation** — runs in Node, browsers, Workers, Deno, Bun. No SaaS round-trip; documents never leave the calling process unless your application explicitly sends them
  - **No telemetry, no network calls** — verifiable in source. The library never opens a socket, fetches remote fonts, or phones home
- - **AI client integration** — use pdfnative from Claude Desktop, Cursor, Continue, and Zed via [`pdfnative-mcp`](https://github.com/Nizoka/pdfnative-mcp) — **14 production tools** (generate, tables, barcodes, forms, sign, verify, validate, attachments, extraction, inspect)
+ - **AI client integration** — use pdfnative from Claude Desktop, Cursor, Continue, and Zed via [`pdfnative-mcp`](https://github.com/Nizoka/pdfnative-mcp) — **17 production tools** (generate, tables, barcodes, forms, sign, verify, validate, attachments, extraction, inspect, plus page-tree `merge_pdfs` / `split_pdf` / `extract_pages`)
  - **Command-line interface** — render, sign, verify, inspect, and batch-render PDFs from the shell with [`pdfnative-cli`](https://github.com/Nizoka/pdfnative-cli) — zero-config, scriptable, agent-native (`--json`/`E_*`/`--dry-run`), ideal for CI/CD pipelines
  - **React renderer** — author PDFs as declarative JSX with [`pdfnative-react`](https://github.com/Nizoka/pdfnative-react): `<Document>`/`<Table>`/`<Barcode>` components, `usePdf`/`PDFViewer` client hooks, on-device rendering with no DOM or headless browser
 
@@ -93,8 +97,8 @@ npm install pdfnative
  - ♿ **Accessibility:** [docs/guides/accessibility.md](docs/guides/accessibility.md) — tagged PDF, PDF/UA, PDF/A.
  - ❓ **FAQ:** [docs/guides/faq.md](docs/guides/faq.md) — fonts, encryption, signatures, comparisons.
  - 🛠️ **Troubleshooting:** [docs/guides/troubleshooting.md](docs/guides/troubleshooting.md) — common pitfalls.
- - 🎮 **Playgrounds:** [docs/playgrounds/extreme-scripts.html](docs/playgrounds/extreme-scripts.html) (live BiDi/Indic stress tests), [docs/playgrounds/medical-800.html](docs/playgrounds/medical-800.html) (800-page Web Worker showcase), and [docs/playgrounds/toolkit.html](docs/playgrounds/toolkit.html) (v1.4.0 bookmarks, page labels, viewer prefs, nested lists, cell borders, merge/split/extract).
- - 🧪 **Sample PDFs:** [scripts/generators/](scripts/generators/) — ~201 sample PDFs across 36 categories (see [Sample PDFs](#sample-pdfs) below).
+ - 🎮 **Playgrounds:** eight interactive demos at [docs/playgrounds/](docs/playgrounds/) — [extreme-scripts](docs/playgrounds/extreme-scripts.html) (live BiDi/Indic stress tests), [all-scripts](docs/playgrounds/all-scripts.html) (every Unicode script), [medical-800](docs/playgrounds/medical-800.html) (800-page Web Worker showcase), [toolkit](docs/playgrounds/toolkit.html) (v1.4.0 bookmarks, page labels, viewer prefs, nested lists, cell borders, merge/split/extract), plus [cli](docs/playgrounds/cli.html), [mcp](docs/playgrounds/mcp.html) and [react](docs/playgrounds/react.html) ecosystem explorers.
+ - 🧪 **Sample PDFs:** [scripts/generators/](scripts/generators/) — ~210 sample PDFs across 41 categories (see [Sample PDFs](#sample-pdfs) below).
 
  ## Why pdfnative?
 
@@ -432,7 +436,7 @@ Generate sample PDFs for all supported languages to visually verify output:
  npm run test:generate
  ```
 
- This creates **150+ PDF files** in `test-output/` (git-ignored), organized in twenty-five categories (including `emoji/` and `pdfa-latin/` added in v1.1.0).
+ This creates **~210 PDF files** in `test-output/` (git-ignored), organized in twenty-nine categories (including `emoji/` and `pdfa-latin/` added in v1.1.0, and `math/`, `svg/`, `debug/`, `annotations/`, `tools/` added in v1.5.0).
  See [scripts/README.md](scripts/README.md) for the modular generator architecture.
 
  ### Financial Statements (per language)
@@ -727,12 +731,36 @@ See [scripts/README.md](scripts/README.md) for the modular generator architectur
  | `encodePDF417(data, ecLevel?)` | Encode data into PDF417 codewords (ISO 15438) |
  | `renderPDF417(data, x, y, w, h, ecLevel?)` | Render PDF417 barcode as PDF path operators |
 
- ### SVG Path Rendering
+ ### SVG Rendering
 
  | Function | Description |
  |----------|-------------|
  | `parseSvgPath(d)` | Parse SVG path `d` attribute into segments |
- | `renderSvg(segments, options?)` | Render SVG segments as PDF path operators |
+ | `renderSvg(segments, options?)` | Render SVG segments (paths + `<text>`) as PDF operators |
+
+ ### Markup Annotations
+
+ | Function | Description |
+ |----------|-------------|
+ | `buildAnnotation(annot, objNum)` | Build a full markup annotation indirect object (v1.5.0) |
+ | `buildAnnotationBody(annot)` | Build a markup annotation dictionary body (for the modifier) (v1.5.0) |
+
+ Supported `MarkupAnnotation` types: `text`, `highlight`, `underline`, `strikeout`, `squiggly`, `square`, `circle`, `line`, `freetext`.
+
+ ### Layout Debug & Inspection
+
+ | Function | Description |
+ |----------|-------------|
+ | `inspectDocumentLayout(params, layout?)` | Return a programmatic per-page block-geometry `LayoutInspection` (v1.5.0) |
+
+ Enable the visual overlay via `layout: { debug: true }` or a granular `LayoutDebugOptions` (`showMargins` / `showContentBounds` / `showCells`). Byte-identical when debug is off.
+
+ ### Font-Data Tools (`pdfnative/tools`)
+
+ | Function | Description |
+ |----------|-------------|
+ | `compileFontData(buffer, opts?)` | Compile a TTF/OTF `Uint8Array` into a font-data module source string (v1.5.0) |
+ | `parseFontData(buffer, opts?)` | Parse a TTF/OTF `Uint8Array` into a `FontDataObject` (metrics, cmap, widths, glyph coverage) (v1.5.0) |
 
  ### AcroForm Fields
 
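
The Layout Debug & Inspection text added in the hunk above says the overlay can be switched on with `layout: { debug: true }` or with a granular `LayoutDebugOptions` (`showMargins` / `showContentBounds` / `showCells`). One way such a boolean-or-object option could be normalised is sketched below. The three field names come from the README text; `resolveDebug`, the "`true` enables every overlay" rule, and the "unspecified flags are off" default are assumptions made for illustration, not the library's documented behaviour.

```typescript
// Field names are taken from the README text in this diff;
// the resolution rules below are assumed for illustration only.
interface LayoutDebugOptions {
  showMargins?: boolean;
  showContentBounds?: boolean;
  showCells?: boolean;
}

type ResolvedDebug = Required<LayoutDebugOptions>;

/** Hypothetical normaliser: collapse `boolean | options` into explicit flags. */
function resolveDebug(
  debug: boolean | LayoutDebugOptions | undefined,
): ResolvedDebug {
  if (debug === true) {
    // Assumption: `debug: true` turns on every overlay.
    return { showMargins: true, showContentBounds: true, showCells: true };
  }
  if (!debug) {
    // `false` or `undefined`: no overlay at all.
    return { showMargins: false, showContentBounds: false, showCells: false };
  }
  // Granular object: assumption is that unspecified flags stay off.
  return {
    showMargins: debug.showMargins ?? false,
    showContentBounds: debug.showContentBounds ?? false,
    showCells: debug.showCells ?? false,
  };
}

console.log(resolveDebug({ showCells: true }));
```

Resolving the option once up front keeps the rendering path free of repeated `typeof` checks, which fits the "byte-identical when debug is off" guarantee the README states.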
@@ -802,6 +830,10 @@ See [scripts/README.md](scripts/README.md) for the modular generator architectur
  | `mergePdfs(sources, opts?)` | Merge multiple PDFs into one, rebuilding a clean object graph; `opts.maxOutputSize` caps output at 256 MiB by default (v1.4.0) |
  | `splitPdf(src, ranges, opts?)` | Split a PDF into multiple documents by inclusive 0-based page ranges (v1.4.0) |
  | `extractPages(src, indices, opts?)` | Extract specific pages (0-based) into a new PDF (v1.4.0) |
+ | `reader.getPageLabels()` | Parse an existing `/PageLabels` number tree into `PageLabelRange[]` or `null` (v1.5.0) |
+ | `reader.getAnnotations(pageIndex)` | Read a page's annotations into `ParsedAnnotation[]` (v1.5.0) |
+ | `reader.getPageRef(pageIndex)` | Get the indirect `PdfRef` for a page (v1.5.0) |
+ | `modifier.addAnnotation(pageIndex, body)` | Inject a new annotation on a page via incremental update (v1.5.0) |
 
  ### Document Block Types
 
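
The `reader.getPageLabels()` row added in the hunk above returns the parsed `/PageLabels` ranges. Turning such ranges into the label a viewer displays follows the page-label rules of ISO 32000-1 (each range applies from its first page until the next range begins). The sketch below illustrates that mapping under stated assumptions: the diff names the `PageLabelRange` type but not its fields, so `LabelRangeSketch`, its field names, and `labelForPage` are hypothetical, and the decimal fallback for an uncovered page is a choice made here.

```typescript
// Assumed shape for illustration; the real PageLabelRange fields are not shown in this diff.
type LabelStyle = "decimal" | "roman" | "ROMAN" | "alpha" | "ALPHA" | "none";

interface LabelRangeSketch {
  startPage: number; // 0-based index of the first page in the range
  style: LabelStyle;
  prefix?: string;
  start?: number; // numeric value of the first label in the range (default 1)
}

function toRoman(n: number): string {
  const table: Array<[number, string]> = [
    [1000, "m"], [900, "cm"], [500, "d"], [400, "cd"], [100, "c"], [90, "xc"],
    [50, "l"], [40, "xl"], [10, "x"], [9, "ix"], [5, "v"], [4, "iv"], [1, "i"],
  ];
  let out = "";
  for (const [value, symbol] of table) {
    while (n >= value) {
      out += symbol;
      n -= value;
    }
  }
  return out;
}

// PDF alphabetic labels run a..z, then aa..zz, then aaa..zzz, and so on.
function toAlpha(n: number): string {
  const letter = String.fromCharCode(97 + ((n - 1) % 26));
  return letter.repeat(Math.floor((n - 1) / 26) + 1);
}

function labelForPage(ranges: LabelRangeSketch[], pageIndex: number): string {
  // The active range is the one with the greatest startPage <= pageIndex.
  let active: LabelRangeSketch | undefined;
  for (const r of ranges) {
    if (r.startPage <= pageIndex && (active === undefined || r.startPage > active.startPage)) {
      active = r;
    }
  }
  if (active === undefined) return String(pageIndex + 1); // assumed fallback
  const n = (active.start ?? 1) + (pageIndex - active.startPage);
  const prefix = active.prefix ?? "";
  switch (active.style) {
    case "decimal": return prefix + String(n);
    case "roman": return prefix + toRoman(n);
    case "ROMAN": return prefix + toRoman(n).toUpperCase();
    case "alpha": return prefix + toAlpha(n);
    case "ALPHA": return prefix + toAlpha(n).toUpperCase();
    default: return prefix; // "none": the label is the prefix alone
  }
}

const ranges: LabelRangeSketch[] = [
  { startPage: 0, style: "roman" },
  { startPage: 4, style: "decimal" },
  { startPage: 10, style: "decimal", prefix: "A-", start: 8 },
];
console.log(labelForPage(ranges, 2), labelForPage(ranges, 4), labelForPage(ranges, 11));
// iii 1 A-9
```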
@@ -902,6 +934,7 @@ const pdf = buildPDFBytes(params, { compress: true });
  | `isTeluguCodepoint(cp)` | Telugu codepoint predicate (v1.3.0) |
  | `containsSinhala(text)` / `containsTibetan(text)` / `containsKhmer(text)` / `containsMyanmar(text)` / `containsEthiopic(text)` | Detect script content (v1.3.0) |
  | `isSinhalaCodepoint(cp)` / `isTibetanCodepoint(cp)` / `isKhmerCodepoint(cp)` / `isMyanmarCodepoint(cp)` / `isEthiopicCodepoint(cp)` | Codepoint predicates (v1.3.0) |
+ | `containsMath(text)` / `isMathCodepoint(cp)` | Detect / test mathematical symbols → lang `'math'` (v1.5.0) |
 
  ### Layout Constants
 
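
The `containsMath(text)` / `isMathCodepoint(cp)` row added in the hunk above describes script detection that routes mathematical symbols to lang `'math'`. Detection of this kind is typically a Unicode block range check. The sketch below shows the idea with a few well-known blocks; which ranges pdfnative actually covers is not stated in this diff, so the block list and the names `looksLikeMathCodepoint` and `hasMathSymbols` are illustrative assumptions rather than the library's functions.

```typescript
// Illustrative block list only; the ranges pdfnative really uses are not shown in this diff.
const MATH_BLOCKS: ReadonlyArray<readonly [number, number]> = [
  [0x2190, 0x21ff], // Arrows
  [0x2200, 0x22ff], // Mathematical Operators
  [0x2300, 0x23ff], // Miscellaneous Technical
  [0x2a00, 0x2aff], // Supplemental Mathematical Operators
];

/** Hypothetical predicate: is this code point inside one of the blocks above? */
function looksLikeMathCodepoint(cp: number): boolean {
  return MATH_BLOCKS.some(([lo, hi]) => cp >= lo && cp <= hi);
}

/** Hypothetical detector: does the text contain at least one such symbol? */
function hasMathSymbols(text: string): boolean {
  // Array.from iterates by code point, so astral characters are not split.
  return Array.from(text).some((ch) => looksLikeMathCodepoint(ch.codePointAt(0) ?? 0));
}

console.log(hasMathSymbols("\u2211 x\u1d62"), hasMathSymbols("plain text")); // true false
```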
@@ -950,7 +983,7 @@ See the [CLI Guide](https://pdfnative.dev/guides/cli.html) for the full v1.1.0 r
 
  ### pdfnative-mcp — Model Context Protocol server
 
- [`pdfnative-mcp`](https://github.com/Nizoka/pdfnative-mcp) v1.2.0 is a **Model Context Protocol server** that bridges pdfnative to any MCP-compatible AI client. Once configured, your AI assistant can generate PDFs, embed barcodes, create forms, sign and verify documents, validate PDF/UA structure, embed and extract attachments, extract text, render international text, and inspect existing PDFs — all without writing code.
+ [`pdfnative-mcp`](https://github.com/Nizoka/pdfnative-mcp) v1.3.0 is a **Model Context Protocol server** that bridges pdfnative to any MCP-compatible AI client. Once configured, your AI assistant can generate PDFs, embed barcodes, create forms, sign and verify documents, validate PDF/UA structure, embed and extract attachments, extract text, render international text, merge, split and extract pages, and inspect existing PDFs — all without writing code.
 
  **v1.0.0:** first stable MCP release with 12 tools, `verify_pdf`, `add_attachment` (Factur-X / ZUGFeRD PDF/A-3), `extract_text`, smart-table options, auto-placeholder signing, and `_meta.apiVersion`.
 
@@ -958,6 +991,8 @@ See the [CLI Guide](https://pdfnative.dev/guides/cli.html) for the full v1.1.0 r
 
  **v1.2.0:** adds `extract_attachments`, watermark options on document tools, Unicode `normalize` (NFC/NFD/NFKC/NFKD), token-frugal read modes (`verbosity`/`fields`), and returns base64 PDF bytes once via a `resource` block.
 
+ **v1.3.0:** adds the page-tree trio `merge_pdfs` / `split_pdf` / `extract_pages` (**17 tools** total), enriched authoring options (`outline`, `pageLabels`, nested lists, `viewerPreferences`, `cellBorders`, `cellVAlign`), a constant-time `node:crypto` signing provider, and DNS-rebinding protection on the HTTP transport — all via the pdfnative 1.4.0 engine.
+
  ```bash
  npx -y pdfnative-mcp
  ```
@@ -979,6 +1014,9 @@ npx -y pdfnative-mcp
  | `add_attachment` | **v1.0.0** — PDF/A-3 with embedded files (Factur-X / ZUGFeRD) |
  | `extract_attachments` | **v1.2.0** — extract embedded files (optionally metadata-only) |
  | `extract_text` | **v1.0.0** — best-effort plain-text extraction from a non-encrypted PDF |
+ | `merge_pdfs` | **v1.3.0** — concatenate 2–50 PDFs into one via the page-tree API |
+ | `split_pdf` | **v1.3.0** — split one PDF into one document per page range (multi-output) |
+ | `extract_pages` | **v1.3.0** — pull an arbitrary, order-preserving page subset (max 5000) into a new PDF |
  | `inspect_pdf` | Structured PDF report (metadata, pages, signatures, PDF/A, attachments, placeholder state) |
 
  ### Claude Desktop configuration
@@ -1085,7 +1123,7 @@ src/
 
  fonts/ # Pre-built font data modules (22 scripts)
  tools/ # CLI: build-font-data.cjs (TTF → JS module)
- scripts/ # Modular sample PDF generation (36 generators, 201+ PDFs)
+ scripts/ # Modular sample PDF generation (41 generators, 210+ PDFs)
  tests/ # 1726+ tests (48 files: unit + integration + fuzz + parser)
  bench/ # Performance benchmarks (vitest bench)
  ```
@@ -1098,9 +1136,9 @@ cd pdfnative
  npm install
 
  npm run build # tsup → dist/ (ESM + CJS + .d.ts)
- npm run test # vitest run (1588+ tests)
+ npm run test # vitest run (2218+ tests)
  npm run test:coverage # vitest with v8 coverage (95%+)
- npm run test:generate # Generate 150+ sample PDFs → test-output/
+ npm run test:generate # Generate ~210 sample PDFs → test-output/
  npm run lint # ESLint 9 + typescript-eslint strict
  npm run typecheck # tsc --noEmit (src/)
  npm run typecheck:tests # tsc --project tsconfig.test.json
@@ -1113,7 +1151,7 @@ npm run bench # Performance benchmarks (vitest bench)
 
  | Metric | Value |
  |--------|-------|
- | Tests | 1588+ (40 files) |
+ | Tests | 2218+ (93 files) |
  | Statement coverage | 95.41% |
  | Branch coverage | 87.79% |
  | Function coverage | 98.5% |