docgen-utils 1.0.19 → 1.0.21
- package/README.md +65 -18
- package/dist/bundle.js +27905 -26887
- package/dist/bundle.min.js +253 -263
- package/dist/cli.js +3696 -2584
- package/dist/packages/cli/commands/export-docs.d.ts.map +1 -1
- package/dist/packages/cli/commands/export-docs.js +161 -12
- package/dist/packages/cli/commands/export-docs.js.map +1 -1
- package/dist/packages/cli/commands/export-slides.d.ts.map +1 -1
- package/dist/packages/cli/commands/export-slides.js +11 -7
- package/dist/packages/cli/commands/export-slides.js.map +1 -1
- package/dist/packages/cli/index.js.map +1 -1
- package/dist/packages/docs/common.d.ts +2 -0
- package/dist/packages/docs/common.d.ts.map +1 -1
- package/dist/packages/docs/convert.d.ts.map +1 -1
- package/dist/packages/docs/convert.js +6 -15
- package/dist/packages/docs/convert.js.map +1 -1
- package/dist/packages/docs/create-document.d.ts.map +1 -1
- package/dist/packages/docs/create-document.js +8 -2
- package/dist/packages/docs/create-document.js.map +1 -1
- package/dist/packages/docs/import-docx.d.ts.map +1 -1
- package/dist/packages/docs/import-docx.js +170 -83
- package/dist/packages/docs/import-docx.js.map +1 -1
- package/dist/packages/docs/parse-colors.d.ts +0 -5
- package/dist/packages/docs/parse-colors.d.ts.map +1 -1
- package/dist/packages/docs/parse-colors.js +2 -2
- package/dist/packages/docs/parse-colors.js.map +1 -1
- package/dist/packages/docs/parse-css.d.ts +0 -9
- package/dist/packages/docs/parse-css.d.ts.map +1 -1
- package/dist/packages/docs/parse-css.js +4 -6
- package/dist/packages/docs/parse-css.js.map +1 -1
- package/dist/packages/docs/parse-helpers.d.ts +0 -1
- package/dist/packages/docs/parse-helpers.d.ts.map +1 -1
- package/dist/packages/docs/parse-helpers.js +1 -1
- package/dist/packages/docs/parse-helpers.js.map +1 -1
- package/dist/packages/docs/parse-inline.d.ts +0 -13
- package/dist/packages/docs/parse-inline.d.ts.map +1 -1
- package/dist/packages/docs/parse-inline.js +7 -7
- package/dist/packages/docs/parse-inline.js.map +1 -1
- package/dist/packages/docs/parse-layout.d.ts.map +1 -1
- package/dist/packages/docs/parse-layout.js +1 -14
- package/dist/packages/docs/parse-layout.js.map +1 -1
- package/dist/packages/docs/parse-special.js +1 -1
- package/dist/packages/docs/parse-special.js.map +1 -1
- package/dist/packages/docs/parse.d.ts.map +1 -1
- package/dist/packages/docs/parse.js +120 -130
- package/dist/packages/docs/parse.js.map +1 -1
- package/dist/packages/shared/zip-guard.d.ts +37 -0
- package/dist/packages/shared/zip-guard.d.ts.map +1 -0
- package/dist/packages/shared/zip-guard.js +101 -0
- package/dist/packages/shared/zip-guard.js.map +1 -0
- package/dist/packages/slides/convert.d.ts +1 -3
- package/dist/packages/slides/convert.d.ts.map +1 -1
- package/dist/packages/slides/convert.js +8 -74
- package/dist/packages/slides/convert.js.map +1 -1
- package/dist/packages/slides/createPresentation.d.ts +1 -1
- package/dist/packages/slides/createPresentation.d.ts.map +1 -1
- package/dist/packages/slides/createPresentation.js +1 -10
- package/dist/packages/slides/createPresentation.js.map +1 -1
- package/dist/packages/slides/import-pptx.d.ts.map +1 -1
- package/dist/packages/slides/import-pptx.js +78 -9
- package/dist/packages/slides/import-pptx.js.map +1 -1
- package/dist/packages/slides/parse.d.ts +0 -22
- package/dist/packages/slides/parse.d.ts.map +1 -1
- package/dist/packages/slides/parse.js +106 -44
- package/dist/packages/slides/parse.js.map +1 -1
- package/dist/packages/slides/transform.d.ts.map +1 -1
- package/dist/packages/slides/transform.js +1 -5
- package/dist/packages/slides/transform.js.map +1 -1
- package/dist/packages/slides/vendor/VENDORING.md +2 -2
- package/package.json +15 -8
- package/dist/packages/cli/commands/common.d.ts +0 -2
- package/dist/packages/cli/commands/common.d.ts.map +0 -1
- package/dist/packages/cli/commands/common.js +0 -22
- package/dist/packages/cli/commands/common.js.map +0 -1
package/README.md
CHANGED
@@ -1,6 +1,6 @@
 # DocGen
 
-Converts HTML into DOCX, PPTX and vice versa.
+Converts HTML into DOCX, PPTX and vice versa. Runs in browsers (via bundled JS) and Node.js (via CLI). Published as `docgen-utils` on npm.
 
 ## Key Components
 
@@ -8,41 +8,52 @@ Converts HTML into DOCX, PPTX and vice versa.
 
 | File | Description |
 | ---------- | -------------------------------------------------------------- |
-| `build.sh` | Builds the library
+| `build.sh` | Builds the library (TypeScript → esbuild bundles) |
 | `dist/` | Output directory containing production-ready minified JS files |
 
+### Package Structure
+
+| Package | Description |
+| ------------------ | --------------------------------------------------------------------------------- |
+| `packages/docs/` | DOCX pipeline — 13 TypeScript files for HTML↔DOCX conversion |
+| `packages/slides/` | PPTX pipeline — 8 TypeScript files + vendored PptxGenJS for HTML↔PPTX conversion |
+| `packages/shared/` | Cross-cutting utilities — DOM parser shim, proxy-aware HTTP client, font mappings |
+| `packages/cli/` | Node.js CLI (`docgen` command) — dispatches to import/export subcommands |
+
 ## Usage
 
-
+## Install dependencies
 
 ```bash
-npm
+npm install --registry=https://registry.npmjs.org
 ```
 
-###
-The CLI is used in the agent sandbox to transform artifacts.
+### Build
 
 ```bash
-
-node dist/cli.js import pptx --file=file.pptx --out-dir=./output
-node dist/cli.js export docs --file=file.html --out-dir=./output
-node dist/cli.js export slides --files=slide-1.html,slide-2.html --out-dir=./output
+npm run build
 ```
 
-###
+### CLI
+
+The CLI is used in the agent sandbox to transform artifacts.
 
 ```bash
-
-
+alias docgen='npx tsx packages/cli'
+docgen import docx --file=file.docx --out-dir=./output [--name=<filename>]
+docgen import pptx --file=file.pptx --out-dir=./output [--name=<filename>]
+docgen export docs --file=file.html --out-dir=./output [--name=<filename>] [--pageless]
+docgen export slides --files=slide-1.html,slide-2.html --out-dir=./output [--name=<filename>]
 ```
 
 ## Visual Comparison
 
 The `output` directory contains the rendered output in target formats. e.g. DOCX or PPTX vs HTML
 
-- Files in `test-data/docs/` → converted to DOCX → `docx-render.jpg`
-- Files in `test-data/slides/` → converted to PPTX → `pptx-render.jpg`
-- Files in `test-data/pptx/` → imported to HTML → `html-render.jpg`
+- Files in `test-data/docs/` (27 HTML files) → converted to DOCX → `docx-render.jpg`
+- Files in `test-data/slides/` (90 HTML files) → converted to PPTX → `pptx-render.jpg`
+- Files in `test-data/pptx/` (145 PPTX files) → imported to HTML → `html-render.jpg`
+- Files in `test-data/docx/` (22 DOCX files) → imported to HTML → `html-render.jpg`
 
 ### Prerequisites
 
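The `docgen export slides` invocation added in the hunk above takes a comma-separated `--files` list plus `--out-dir` and an optional `--name`. As a rough illustration of how a caller might assemble that argument list, here is a minimal sketch. The helper `buildExportSlidesArgs` and its option names are hypothetical and not part of `docgen-utils`; only the flag names come from the README text in the diff.

```typescript
// Hypothetical helper, not shipped with docgen-utils: it only assembles argv strings
// using the flags shown in the README (--files, --out-dir, --name).
interface ExportSlidesOptions {
  files: string[]; // slide HTML files, in presentation order
  outDir: string;  // value for --out-dir
  name?: string;   // optional value for --name
}

function buildExportSlidesArgs(opts: ExportSlidesOptions): string[] {
  if (opts.files.length === 0) {
    throw new Error("at least one slide file is required");
  }
  const result = [
    "export",
    "slides",
    `--files=${opts.files.join(",")}`,
    `--out-dir=${opts.outDir}`,
  ];
  // --name is optional in the README, so it is appended only when provided.
  if (opts.name !== undefined) {
    result.push(`--name=${opts.name}`);
  }
  return result;
}

const slideArgs = buildExportSlidesArgs({
  files: ["slide-1.html", "slide-2.html"],
  outDir: "./output",
});
// prints "docgen export slides --files=slide-1.html,slide-2.html --out-dir=./output"
console.log(["docgen", ...slideArgs].join(" "));
```

The resulting array could be handed to a process launcher such as Node's `child_process.spawn`. Note that a plain comma join assumes no file name itself contains a comma.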
@@ -57,6 +68,8 @@ brew install --cask libreoffice
 brew install poppler
 # Chromium for Playwright
 npx playwright install chromium
+# Install fonts for accurate rendering
+npx tsx scripts/install-fonts/index.ts
 ```
 
 ### Usage
@@ -79,11 +92,39 @@ Import a PPTX file (PPTX -> HTML):
 npm run generate-output -- test-data/pptx/presentation.pptx
 ```
 
+Import a DOCX file (DOCX -> HTML):
+
+```bash
+npm run generate-output -- test-data/docx/document.docx
+```
+
 Process multiple files:
 
 ```bash
-npm run generate-all-docs-output # All docs
-npm run generate-all-slides-output # All slides
+npm run generate-all-docs-output # All docs (HTML→DOCX)
+npm run generate-all-slides-output # All slides (HTML→PPTX)
+npm run generate-all-docx-output # All DOCX imports (DOCX→HTML)
+npm run generate-all-pptx-output # All PPTX imports (PPTX→HTML)
+```
+
+### Roundtrip Testing
+
+Full roundtrip tests convert HTML→DOCX/PPTX→HTML and compare the result:
+
+```bash
+npm run roundtrip -- test-data/docs/doc-1.html # Single doc roundtrip
+npm run roundtrip -- test-data/slides/slide-1.html # Single slide roundtrip
+npm run roundtrip-all-docs # All docs (no AI diff)
+npm run roundtrip-all-slides # All slides (no AI diff)
+```
+
+### AI-Powered Visual Diff
+
+Compare two rendered images using Claude vision API (requires `ANTHROPIC_API_KEY`):
+
+```bash
+npm run ai-diff <image1.jpg> <image2.jpg>
+npm run ai-diff -- <image1.jpg> <image2.jpg> --question="Is the text shadow rendering correctly?"
 ```
 
 ### Output
@@ -110,10 +151,16 @@ output/
 │ ├── diff.jpg # Visual diff highlighting differences
 │ ├── output.html # Generated HTML (all slides concatenated)
 │ └── report.json # Comparison metrics
+├── metrics.json # Aggregated quality metrics across all tests
 └── ...
 ```
 
+### Quality Metrics
+
+A pre-commit hook automatically aggregates all `report.json` files into `output/metrics.json`, tracking quality baselines by conversion type (docx-export, pptx-export, docx-import, pptx-import).
+
 **Metrics explained:**
 
 - **pixelDiff.percentDiff** - Percentage of pixels that differ between the two images (lower is better)
 - **ssim.mssim** - Structural Similarity Index (0-1, higher is better). Values above 0.9 indicate very similar images
+- **Quality Score** - Composite score 0-100 for roundtrip tests (target ≥85)
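To make the metrics in the hunk above more concrete, the following sketch shows one way a percent-of-differing-pixels value could be computed and how per-test reports might be averaged by conversion type. It is an illustration under stated assumptions, not the package's implementation. Only the field names `pixelDiff.percentDiff` and `ssim.mssim` and the four conversion types come from the README. The `type` field, the `Baseline` shape, and both function names are hypothetical, and the exact-match pixel comparison is a simplification (real image-diff tools typically apply a tolerance threshold).

```typescript
// Assumed report shape: `type` is a hypothetical grouping key; the metric field
// names are the ones the README mentions.
type ConversionType = "docx-export" | "pptx-export" | "docx-import" | "pptx-import";

interface Report {
  type: ConversionType;
  pixelDiff: { percentDiff: number };
  ssim: { mssim: number };
}

interface Baseline {
  count: number;
  meanPercentDiff: number;
  meanMssim: number;
}

// Percentage of pixels whose RGBA bytes differ between two same-sized images.
function percentDiff(a: Uint8Array, b: Uint8Array): number {
  if (a.length !== b.length || a.length % 4 !== 0) {
    throw new Error("buffers must hold same-sized RGBA data");
  }
  const pixels = a.length / 4;
  if (pixels === 0) return 0;
  let differing = 0;
  for (let i = 0; i < a.length; i += 4) {
    if (a[i] !== b[i] || a[i + 1] !== b[i + 1] || a[i + 2] !== b[i + 2] || a[i + 3] !== b[i + 3]) {
      differing += 1;
    }
  }
  return (differing / pixels) * 100;
}

// Averages each metric per conversion type; types with no reports are omitted.
function aggregateReports(reports: Report[]): Partial<Record<ConversionType, Baseline>> {
  const sums = new Map<ConversionType, { count: number; diff: number; mssim: number }>();
  for (const r of reports) {
    const s = sums.get(r.type) ?? { count: 0, diff: 0, mssim: 0 };
    s.count += 1;
    s.diff += r.pixelDiff.percentDiff;
    s.mssim += r.ssim.mssim;
    sums.set(r.type, s);
  }
  const out: Partial<Record<ConversionType, Baseline>> = {};
  sums.forEach((s, type) => {
    out[type] = {
      count: s.count,
      meanPercentDiff: s.diff / s.count,
      meanMssim: s.mssim / s.count,
    };
  });
  return out;
}

// Two 2-pixel images that differ only in the second pixel.
const imgA = new Uint8Array([0, 0, 0, 255, 10, 10, 10, 255]);
const imgB = new Uint8Array([0, 0, 0, 255, 99, 10, 10, 255]);
console.log(percentDiff(imgA, imgB)); // prints 50
```

A plain mean is only one possible baseline. How the actual hook weights or combines these values, and how the composite Quality Score is derived, is not stated in the README.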