docxmlater 1.12.0 → 1.14.0

package/README.md CHANGED
@@ -1,1581 +1,223 @@
1
- # docXMLater - Professional DOCX Framework
2
-
3
- [![npm version](https://img.shields.io/npm/v/docxmlater.svg)](https://www.npmjs.com/package/docxmlater)
4
- [![Tests](https://img.shields.io/badge/tests-1119%20passing-brightgreen)](https://github.com/ItMeDiaTech/docXMLater)
5
- [![TypeScript](https://img.shields.io/badge/TypeScript-5.7-blue)](https://www.typescriptlang.org/)
6
- [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
7
-
8
- A comprehensive, production-ready TypeScript/JavaScript library for creating, reading, and manipulating Microsoft Word (.docx) documents programmatically. Full OpenXML compliance with extensive API coverage and robust test suite.
9
-
10
- Built for professional documentation work, docXMLater provides a complete solution for programmatic DOCX manipulation with an intuitive API and helper functions for all aspects of document creation and modification.
11
-
12
- ## Latest Updates - v1.4.3
13
-
14
- **Critical Fix: Header/Footer Parsing & Enhanced Management:**
15
-
16
- ### What's New in v1.4.3
17
-
18
- - **Header/Footer Parsing Fix:** Documents with headers/footers now load and save correctly
19
- - Fixed [Content_Types].xml corruption when loading documents with headers/footers
20
- - Headers and footers are now properly parsed when loading existing documents
21
- - All header/footer XML files are correctly declared in [Content_Types].xml
22
- - **New Header/Footer Management Methods:**
23
- - `removeHeader(type)` - Remove specific header (default/first/even)
24
- - `removeFooter(type)` - Remove specific footer (default/first/even)
25
- - `clearHeaders()` - Remove all headers while preserving footers
26
- - `clearFooters()` - Remove all footers while preserving headers
27
- - **Round-Trip Support:** Full load-modify-save cycle for documents with headers/footers
28
- - **MS Word Compliance:** Follows ECMA-376 specifications for header/footer handling
29
-
30
- ### Previous Updates - v1.3.0
31
-
32
- - **TOC Parsing:** Parse Table of Contents from existing documents with full SDT support
33
- - **TOC Modification:** Modify TOC field instructions (add/remove switches)
34
- - **Complete Feature Set:** All 102 major features implemented
35
- - **Table Styles:** Full support with 12 conditional formatting types
36
- - **Content Controls:** 9 control types supported
37
- - **Field Types:** 11 field types (PAGE, NUMPAGES, DATE, TIME, FILENAME, AUTHOR, TITLE, REF, HYPERLINK, SEQ, TC/XE)
38
- - **Production Ready:** Full ECMA-376 compliance
39
-
40
- **Test Results:** 1,122/1,150 tests passing (97.6% pass rate - 1,122 core features validated)
41
-
42
- ## Quick Start
43
-
44
- ```bash
45
- npm install docxmlater
46
- ```
47
-
48
- ```typescript
49
- import { Document } from "docxmlater";
50
-
51
- // Create document
52
- const doc = Document.create();
53
- doc.createParagraph("Hello World").setStyle("Title");
54
-
55
- // Save document
56
- await doc.save("output.docx");
57
- ```
58
-
59
- ## Complete API Reference
60
-
61
- ### Document Operations
62
-
63
- | Method | Description | Example |
64
- | --------------------------------- | ----------------------- | ------------------------------------------------ |
65
- | `Document.create(options?)` | Create new document | `const doc = Document.create()` |
66
- | `Document.createEmpty()` | Create minimal document | `const doc = Document.createEmpty()` |
67
- | `Document.load(path)` | Load from file | `const doc = await Document.load('file.docx')` |
68
- | `Document.loadFromBuffer(buffer)` | Load from buffer | `const doc = await Document.loadFromBuffer(buf)` |
69
- | `save(path)` | Save to file | `await doc.save('output.docx')` |
70
- | `toBuffer()` | Export as buffer | `const buffer = await doc.toBuffer()` |
71
- | `dispose()` | Clean up resources | `doc.dispose()` |
72
-
73
- ### Content Creation
74
-
75
- | Method | Description | Example |
76
- | -------------------------------- | ------------------------ | -------------------------------- |
77
- | `createParagraph(text?)` | Add paragraph | `doc.createParagraph('Text')` |
78
- | `createTable(rows, cols)` | Add table | `doc.createTable(3, 4)` |
79
- | `addParagraph(para)` | Add existing paragraph | `doc.addParagraph(myPara)` |
80
- | `addTable(table)` | Add existing table | `doc.addTable(myTable)` |
81
- | `addImage(image)` | Add image | `doc.addImage(myImage)` |
82
- | `addTableOfContents(toc?)` | Add TOC | `doc.addTableOfContents()` |
83
- | `insertParagraphAt(index, para)` | Insert at position | `doc.insertParagraphAt(0, para)` |
84
- | `insertTableAt(index, table)` | Insert table at position | `doc.insertTableAt(5, table)` |
85
- | `insertTocAt(index, toc)` | Insert TOC at position | `doc.insertTocAt(0, toc)` |
86
-
87
- ### Content Manipulation
88
-
89
- | Method | Description | Returns |
90
- | --------------------------------- | ---------------------------- | --------- |
91
- | `replaceParagraphAt(index, para)` | Replace paragraph | `boolean` |
92
- | `replaceTableAt(index, table)` | Replace table | `boolean` |
93
- | `moveElement(fromIndex, toIndex)` | Move element to new position | `boolean` |
94
- | `swapElements(index1, index2)` | Swap two elements | `boolean` |
95
- | `removeTocAt(index)` | Remove TOC element | `boolean` |
96
-
97
- ### Content Retrieval
98
-
99
- | Method | Description | Returns |
100
- | ---------------------- | --------------------------------------- | ------------------------------------------ |
101
- | `getParagraphs()` | Get top-level paragraphs | `Paragraph[]` |
102
- | `getAllParagraphs()` | Get all paragraphs (recursive) | `Paragraph[]` |
103
- | `getTables()` | Get top-level tables | `Table[]` |
104
- | `getAllTables()` | Get all tables (recursive) | `Table[]` |
105
- | `getBodyElements()` | Get all body elements | `BodyElement[]` |
106
- | `getParagraphCount()` | Count paragraphs | `number` |
107
- | `getTableCount()` | Count tables | `number` |
108
- | `getHyperlinks()` | Get all links | `Array<{hyperlink, paragraph}>` |
109
- | `getBookmarks()` | Get all bookmarks | `Array<{bookmark, paragraph}>` |
110
- | `getImages()` | Get all images | `Array<{image, relationshipId, filename}>` |
111
-
112
- **Note**: The `getAllParagraphs()` and `getAllTables()` methods recursively search inside tables and SDTs (Structured Document Tags), while `getParagraphs()` and `getTables()` return only top-level elements.
113
-
114
- **Example - Recursive Element Access:**
115
-
116
- ```typescript
117
- import { Document, Hyperlink } from 'docxmlater';
118
-
119
- // Load document with complex structure (tables, SDTs, nested content)
120
- const doc = await Document.load('complex.docx');
121
-
122
- // Get only top-level paragraphs (misses nested content)
123
- const topLevel = doc.getParagraphs();
124
- console.log(`Top-level paragraphs: ${topLevel.length}`); // e.g., 37
125
-
126
- // Get ALL paragraphs including those in tables and SDTs
127
- const allParas = doc.getAllParagraphs();
128
- console.log(`All paragraphs: ${allParas.length}`); // e.g., 52
129
-
130
- // Apply formatting to ALL paragraphs (including nested ones)
131
- for (const para of allParas) {
132
- para.setSpaceAfter(120); // Set 6pt spacing after each paragraph
133
- }
134
-
135
- // Get all tables including those inside SDTs
136
- const allTables = doc.getAllTables();
137
- for (const table of allTables) {
138
- table.setWidth(5000).setWidthType('pct'); // Set to 100% width
139
- }
140
-
141
- // Find all hyperlinks in the entire document
142
- let hyperlinkCount = 0;
143
- for (const para of allParas) {
144
- for (const content of para.getContent()) {
145
- if (content instanceof Hyperlink) {
146
- hyperlinkCount++;
147
- content.setFormatting({ color: '0000FF' }); // Make all links blue
148
- }
149
- }
150
- }
151
- console.log(`Updated ${hyperlinkCount} hyperlinks`);
152
- ```
153
-
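The difference between top-level and recursive retrieval can be illustrated with a minimal, self-contained sketch. The `BodyNode` type and both functions are hypothetical stand-ins written for this illustration; they are not docxmlater's classes or its actual traversal code.

```typescript
// Hypothetical model of a document body (NOT docxmlater's real types).
type BodyNode =
  | { kind: "paragraph"; text: string }
  | { kind: "table"; cells: BodyNode[][] } // one inner array of nodes per cell
  | { kind: "sdt"; content: BodyNode[] };

// Top-level only: ignores anything nested in tables or SDTs.
function topLevelParagraphs(body: BodyNode[]): string[] {
  const out: string[] = [];
  for (const node of body) {
    if (node.kind === "paragraph") out.push(node.text);
  }
  return out;
}

// Recursive: descends into table cells and SDT content.
function collectParagraphs(body: BodyNode[]): string[] {
  const out: string[] = [];
  for (const node of body) {
    if (node.kind === "paragraph") {
      out.push(node.text);
    } else if (node.kind === "table") {
      for (const cell of node.cells) out.push(...collectParagraphs(cell));
    } else {
      out.push(...collectParagraphs(node.content));
    }
  }
  return out;
}

const sampleBody: BodyNode[] = [
  { kind: "paragraph", text: "Intro" },
  {
    kind: "table",
    cells: [
      [{ kind: "paragraph", text: "Cell A" }],
      [{ kind: "paragraph", text: "Cell B" }],
    ],
  },
  { kind: "sdt", content: [{ kind: "paragraph", text: "Inside SDT" }] },
];

const topLevelCount = topLevelParagraphs(sampleBody).length; // 1
const allCount = collectParagraphs(sampleBody).length; // 4
```

The gap between the two counts is why formatting applied through a top-level accessor can silently miss paragraphs inside tables.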
154
- ### Content Removal
155
-
156
- | Method | Description | Returns |
157
- | ------------------------------ | ------------------ | --------- |
158
- | `removeParagraph(paraOrIndex)` | Remove paragraph | `boolean` |
159
- | `removeTable(tableOrIndex)` | Remove table | `boolean` |
160
- | `clearParagraphs()` | Remove all content | `this` |
161
-
162
- ### Search & Replace
163
-
164
- | Method | Description | Options |
165
- | -------------------------------------- | --------------------- | ------------------------------ |
166
- | `findText(text, options?)` | Find text occurrences | `{caseSensitive?, wholeWord?}` |
167
- | `replaceText(find, replace, options?)` | Replace all text | `{caseSensitive?, wholeWord?}` |
168
- | `updateHyperlinkUrls(urlMap)` | Update hyperlink URLs | `Map<oldUrl, newUrl>` |
169
-
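As a rough illustration of what `caseSensitive` and `wholeWord` options usually mean, here is a standalone sketch over a plain string. The `findOffsets` function, its defaults (case-insensitive, substring matching), and the use of the regex `\b` word boundary are assumptions made for this example; the README does not state how docxmlater implements matching or what its defaults are.

```typescript
interface MatchOptions {
  caseSensitive?: boolean; // assumed default: false
  wholeWord?: boolean; // assumed default: false
}

// Escape regex metacharacters so the needle is matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return the start offset of every match of `needle` in `haystack`.
function findOffsets(
  haystack: string,
  needle: string,
  options: MatchOptions = {}
): number[] {
  if (needle.length === 0) return [];
  const literal = escapeRegExp(needle);
  const pattern = options.wholeWord ? `\\b${literal}\\b` : literal;
  const flags = options.caseSensitive ? "g" : "gi";
  const re = new RegExp(pattern, flags);
  const offsets: number[] = [];
  let match: RegExpExecArray | null;
  while ((match = re.exec(haystack)) !== null) {
    offsets.push(match.index);
  }
  return offsets;
}

const sampleText = "Cat concat cat";
const anyCase = findOffsets(sampleText, "cat"); // [0, 7, 11]
const wholeWords = findOffsets(sampleText, "cat", { wholeWord: true }); // [0, 11]
const exactCase = findOffsets(sampleText, "cat", { caseSensitive: true }); // [7, 11]
```

Note that `\b` only behaves as a word boundary next to ASCII word characters, so a production implementation would likely need something more careful for non-English text.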
170
- ### Style Application
171
-
172
- | Method | Description | Returns |
173
- | ------------------------------------- | -------------------------------- | ------------------- |
174
- | `applyStyleToAll(styleId, predicate)` | Apply style to matching elements | `number` |
175
- | `findElementsByStyle(styleId)` | Find all elements using a style | `Array<Para\|Cell>` |
176
-
177
- **Example:**
178
-
179
- ```typescript
180
- // Apply Heading1 to all paragraphs containing "Chapter"
181
- const count = doc.applyStyleToAll("Heading1", (el) => {
182
- return el instanceof Paragraph && el.getText().includes("Chapter");
183
- });
184
-
185
- // Find all Heading1 elements
186
- const headings = doc.findElementsByStyle("Heading1");
187
- ```
188
-
189
- ### Document Statistics
190
-
191
- | Method | Description | Returns |
192
- | ----------------------------------- | ------------------- | ------------------------------ |
193
- | `getWordCount()` | Total word count | `number` |
194
- | `getCharacterCount(includeSpaces?)` | Character count | `number` |
195
- | `estimateSize()` | Size estimation | `{totalEstimatedMB, warning?}` |
196
- | `getSizeStats()` | Detailed size stats | `{elements, size, warnings}` |
197
-
198
- ### Text Formatting
199
-
200
- | Property | Values | Example |
201
- | ------------- | -------------------------------- | ----------------------- |
202
- | `bold` | `true/false` | `{bold: true}` |
203
- | `italic` | `true/false` | `{italic: true}` |
204
- | `underline` | `'single'/'double'/'dotted'/etc` | `{underline: 'single'}` |
205
- | `strike` | `true/false` | `{strike: true}` |
206
- | `font` | Font name | `{font: 'Arial'}` |
207
- | `size` | Points | `{size: 12}` |
208
- | `color` | Hex color | `{color: 'FF0000'}` |
209
- | `highlight` | Color name | `{highlight: 'yellow'}` |
210
- | `subscript` | `true/false` | `{subscript: true}` |
211
- | `superscript` | `true/false` | `{superscript: true}` |
212
- | `smallCaps` | `true/false` | `{smallCaps: true}` |
213
- | `allCaps` | `true/false` | `{allCaps: true}` |
214
-
215
- ### Paragraph Operations
216
-
217
- #### Creating Detached Paragraphs
218
-
219
- Create paragraphs independently before adding to a document:
220
-
221
- ```typescript
222
- // Create empty paragraph
223
- const para1 = Paragraph.create();
224
-
225
- // Create with text
226
- const para2 = Paragraph.create("Hello World");
227
-
228
- // Create with text and formatting
229
- const para3 = Paragraph.create("Centered text", { alignment: "center" });
230
-
231
- // Create with just formatting
232
- const para4 = Paragraph.create({
233
- alignment: "right",
234
- spacing: { before: 240 },
235
- });
236
-
237
- // Create with style
238
- const heading = Paragraph.createWithStyle("Chapter 1", "Heading1");
239
-
240
- // Create with both run and paragraph formatting
241
- const important = Paragraph.createFormatted(
242
- "Important Text",
243
- { bold: true, color: "FF0000" },
244
- { alignment: "center" }
245
- );
246
-
247
- // Add to document later
248
- doc.addParagraph(para1);
249
- doc.addParagraph(heading);
250
- ```
251
-
252
- #### Paragraph Factory Methods
253
-
254
- | Method | Description | Example |
255
- | --------------------------------------------------- | --------------------------- | --------------------------------------- |
256
- | `Paragraph.create(text?, formatting?)` | Create detached paragraph | `Paragraph.create('Text')` |
257
- | `Paragraph.create(formatting?)` | Create with formatting only | `Paragraph.create({alignment: 'left'})` |
258
- | `Paragraph.createWithStyle(text, styleId)` | Create with style | `Paragraph.createWithStyle('', 'H1')` |
259
- | `Paragraph.createEmpty()` | Create empty paragraph | `Paragraph.createEmpty()` |
260
- | `Paragraph.createFormatted(text, run?, paragraph?)` | Create with dual formatting | See example above |
261
-
262
- #### Paragraph Formatting Methods
263
-
264
- | Method | Description | Values |
265
- | ------------------------------ | ------------------- | ----------------------------------- |
266
- | `setAlignment(align)` | Text alignment | `'left'/'center'/'right'/'justify'` |
267
- | `setLeftIndent(twips)` | Left indentation | Twips value |
268
- | `setRightIndent(twips)` | Right indentation | Twips value |
269
- | `setFirstLineIndent(twips)` | First line indent | Twips value |
270
- | `setSpaceBefore(twips)` | Space before | Twips value |
271
- | `setSpaceAfter(twips)` | Space after | Twips value |
272
- | `setLineSpacing(twips, rule?)` | Line spacing | Twips + rule |
273
- | `setStyle(styleId)` | Apply style | Style ID |
274
- | `setKeepNext()` | Keep with next | - |
275
- | `setKeepLines()` | Keep lines together | - |
276
- | `setPageBreakBefore()` | Page break before | - |
277
-
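Most of the methods above take twips. A twip is 1/20 of a point, so there are 1440 twips per inch. The helpers below are a small local sketch for converting between units; their names are chosen for this example and are not claimed to be docxmlater exports.

```typescript
const TWIPS_PER_POINT = 20;
const TWIPS_PER_INCH = 1440; // 72 points per inch * 20 twips per point

function pointsToTwipsSketch(points: number): number {
  return Math.round(points * TWIPS_PER_POINT);
}

function twipsToPointsSketch(twips: number): number {
  return twips / TWIPS_PER_POINT;
}

function inchesToTwipsSketch(inches: number): number {
  return Math.round(inches * TWIPS_PER_INCH);
}

const sixPointGap = pointsToTwipsSketch(6); // 120, the value used with setSpaceAfter earlier
const halfInchIndent = inchesToTwipsSketch(0.5); // 720
```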
278
- #### Paragraph Manipulation Methods
279
-
280
- | Method | Description | Returns |
281
- | -------------------------------------- | ----------------------- | ----------- |
282
- | `insertRunAt(index, run)` | Insert run at position | `this` |
283
- | `removeRunAt(index)` | Remove run at position | `boolean` |
284
- | `replaceRunAt(index, run)` | Replace run at position | `boolean` |
285
- | `findText(text, options?)` | Find text in runs | `number[]` |
286
- | `replaceText(find, replace, options?)` | Replace text in runs | `number` |
287
- | `mergeWith(otherPara)` | Merge another paragraph | `this` |
288
- | `clone()` | Clone paragraph | `Paragraph` |
289
-
290
- **Example:**
291
-
292
- ```typescript
293
- const para = doc.createParagraph("Hello World");
294
-
295
- // Find and replace
296
- const indices = para.findText("World"); // [1]
297
- const count = para.replaceText("World", "Universe", { caseSensitive: true });
298
-
299
- // Manipulate runs
300
- para.insertRunAt(0, new Run("Start: ", { bold: true }));
301
- para.replaceRunAt(1, new Run("HELLO", { allCaps: true }));
302
-
303
- // Merge paragraphs
304
- const para2 = Paragraph.create(" More text");
305
- para.mergeWith(para2); // Combines runs
306
- ```
307
-
308
- ### Run (Text Span) Operations
309
-
310
- | Method | Description | Returns |
311
- | ------------------------------- | ------------------------- | ------- |
312
- | `clone()` | Clone run with formatting | `Run` |
313
- | `insertText(index, text)` | Insert text at position | `this` |
314
- | `appendText(text)` | Append text to end | `this` |
315
- | `replaceText(start, end, text)` | Replace text range | `this` |
316
-
317
- **Example:**
318
-
319
- ```typescript
320
- const run = new Run("Hello World", { bold: true });
321
-
322
- // Text manipulation
323
- run.insertText(6, "Beautiful "); // "Hello Beautiful World"
324
- run.appendText("!"); // "Hello Beautiful World!"
325
- run.replaceText(0, 5, "Hi"); // "Hi Beautiful World!"
326
-
327
- // Clone for reuse
328
- const copy = run.clone();
329
- copy.setColor("FF0000"); // Original unchanged
330
- ```
331
-
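The comments in the example above are consistent with zero-based indices and an exclusive end position. The following plain-string model checks that reading; it is an assumption inferred from those comments, not a statement of how `Run` is implemented.

```typescript
// Insert `insert` before the character at `index` (zero-based).
function insertAt(text: string, index: number, insert: string): string {
  return text.slice(0, index) + insert + text.slice(index);
}

// Replace characters in [start, end) with `replacement` (end is exclusive).
function replaceRange(
  text: string,
  start: number,
  end: number,
  replacement: string
): string {
  return text.slice(0, start) + replacement + text.slice(end);
}

let runText = "Hello World";
runText = insertAt(runText, 6, "Beautiful "); // "Hello Beautiful World"
runText = runText + "!"; // "Hello Beautiful World!"
runText = replaceRange(runText, 0, 5, "Hi"); // "Hi Beautiful World!"
```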
332
- ### Table Operations
333
-
334
- | Method | Description | Example |
335
- | ----------------------- | -------------------- | ---------------------------------------- |
336
- | `getRow(index)` | Get table row | `table.getRow(0)` |
337
- | `getCell(row, col)` | Get table cell | `table.getCell(0, 1)` |
338
- | `addRow()` | Add new row | `table.addRow()` |
339
- | `removeRow(index)` | Remove row | `table.removeRow(2)` |
340
- | `insertColumn(index)` | Insert column | `table.insertColumn(1)` |
341
- | `removeColumn(index)` | Remove column | `table.removeColumn(3)` |
342
- | `setWidth(twips)` | Set table width | `table.setWidth(8640)` |
343
- | `setAlignment(align)` | Table alignment | `table.setAlignment('center')` |
344
- | `setAllBorders(border)` | Set all borders | `table.setAllBorders({style: 'single'})` |
345
- | `setBorders(borders)` | Set specific borders | `table.setBorders({top: {...}})` |
346
-
347
- #### Advanced Table Operations
348
-
349
- | Method | Description | Returns |
350
- | ------------------------------------------------ | ------------------------- | ------------ |
351
- | `mergeCells(startRow, startCol, endRow, endCol)` | Merge cells | `this` |
352
- | `splitCell(row, col)` | Remove cell spanning | `this` |
353
- | `moveCell(fromRow, fromCol, toRow, toCol)` | Move cell contents | `this` |
354
- | `swapCells(row1, col1, row2, col2)` | Swap two cells | `this` |
355
- | `setColumnWidth(index, width)` | Set specific column width | `this` |
356
- | `setColumnWidths(widths)` | Set all column widths | `this` |
357
- | `insertRows(startIndex, count)` | Insert multiple rows | `TableRow[]` |
358
- | `removeRows(startIndex, count)` | Remove multiple rows | `boolean` |
359
- | `clone()` | Clone entire table | `Table` |
360
-
361
- **Example:**
362
-
363
- ```typescript
364
- const table = doc.createTable(3, 3);
365
-
366
- // Merge cells horizontally (row 0, columns 0-2)
367
- table.mergeCells(0, 0, 0, 2);
368
-
369
- // Move cell contents
370
- table.moveCell(1, 1, 2, 2);
371
-
372
- // Swap cells
373
- table.swapCells(0, 0, 2, 2);
374
-
375
- // Batch row operations
376
- table.insertRows(1, 3); // Insert 3 rows at position 1
377
- table.removeRows(4, 2); // Remove 2 rows starting at position 4
378
-
379
- // Set column widths
380
- table.setColumnWidth(0, 2000); // First column = 2000 twips
381
- table.setColumnWidths([2000, 3000, 2000]); // All columns
382
-
383
- // Clone table for reuse
384
- const tableCopy = table.clone();
385
- ```
386
-
387
- ### Table Cell Operations
388
-
389
- | Method | Description | Example |
390
- | ----------------------------- | --------------------- | ------------------------------------- |
391
- | `createParagraph(text?)` | Add paragraph to cell | `cell.createParagraph('Text')` |
392
- | `setShading(shading)` | Cell background | `cell.setShading({fill: 'E0E0E0'})` |
393
- | `setVerticalAlignment(align)` | Vertical align | `cell.setVerticalAlignment('center')` |
394
- | `setColumnSpan(cols)` | Merge columns | `cell.setColumnSpan(3)` |
395
- | `setRowSpan(rows)` | Merge rows | `cell.setRowSpan(2)` |
396
- | `setBorders(borders)` | Cell borders | `cell.setBorders({top: {...}})` |
397
- | `setWidth(width, type?)` | Cell width | `cell.setWidth(2000, 'dxa')` |
398
-
399
- ### Style Management
400
-
401
- | Method | Description | Example |
402
- | ----------------------------- | ------------------ | ---------------------------------- |
403
- | `addStyle(style)` | Add custom style | `doc.addStyle(myStyle)` |
404
- | `getStyle(styleId)` | Get style by ID | `doc.getStyle('Heading1')` |
405
- | `hasStyle(styleId)` | Check style exists | `doc.hasStyle('CustomStyle')` |
406
- | `getStyles()` | Get all styles | `doc.getStyles()` |
407
- | `removeStyle(styleId)` | Remove style | `doc.removeStyle('OldStyle')` |
408
- | `updateStyle(styleId, props)` | Update style | `doc.updateStyle('Normal', {...})` |
409
-
410
- #### Style Manipulation
411
-
412
- | Method | Description | Returns |
413
- | ------------------------ | ----------------------------------- | ------- |
414
- | `style.clone()` | Clone style | `Style` |
415
- | `style.mergeWith(other)` | Merge properties from another style | `this` |
416
-
417
- **Example:**
418
-
419
- ```typescript
420
- // Clone a style
421
- const heading1 = doc.getStyle("Heading1");
422
- const customHeading = heading1.clone();
423
- customHeading.setRunFormatting({ color: "FF0000" });
424
-
425
- // Merge styles
426
- const baseStyle = Style.createNormalStyle();
427
- const overrideStyle = Style.create({
428
- styleId: "Override",
429
- name: "Override",
430
- type: "paragraph",
431
- runFormatting: { bold: true, color: "FF0000" },
432
- });
433
- baseStyle.mergeWith(overrideStyle); // baseStyle now has bold red text
434
- ```
435
-
436
- #### Built-in Styles
437
-
438
- - `Normal` - Default paragraph
439
- - `Title` - Document title
440
- - `Subtitle` - Document subtitle
441
- - `Heading1` through `Heading9` - Section headings
442
- - `ListParagraph` - List items
443
-
444
- ### List Management
445
-
446
- | Method | Description | Returns |
447
- | --------------------------------------- | ----------------------- | ------- |
448
- | `createBulletList(levels?, bullets?)` | Create bullet list | `numId` |
449
- | `createNumberedList(levels?, formats?)` | Create numbered list | `numId` |
450
- | `createMultiLevelList()` | Create multi-level list | `numId` |
451
-
452
- ### Table of Contents (TOC)
453
-
454
- #### Basic TOC Creation
455
-
456
- | Method | Description | Example |
457
- | -------------------------- | ---------------------- | -------------------------- |
458
- | `addTableOfContents(toc?)` | Add TOC to document | `doc.addTableOfContents()` |
459
- | `insertTocAt(index, toc)` | Insert TOC at position | `doc.insertTocAt(0, toc)` |
460
- | `removeTocAt(index)` | Remove TOC at position | `doc.removeTocAt(0)` |
461
-
462
- #### TOC Factory Methods
463
-
464
- | Method | Description | Example |
465
- | -------------------------------------------------------- | ------------------------ | ---------------------------------------------------- |
466
- | `TableOfContents.createStandard(title?)` | Standard TOC (3 levels) | `TableOfContents.createStandard()` |
467
- | `TableOfContents.createSimple(title?)` | Simple TOC (2 levels) | `TableOfContents.createSimple()` |
468
- | `TableOfContents.createDetailed(title?)` | Detailed TOC (4 levels) | `TableOfContents.createDetailed()` |
469
- | `TableOfContents.createHyperlinked(title?)` | Hyperlinked TOC | `TableOfContents.createHyperlinked()` |
470
- | `TableOfContents.createNoPageNumbers(opts?)` | TOC without page numbers | `TableOfContents.createNoPageNumbers()` |
471
- | `TableOfContents.createWithStyles(styles, opts?)` | TOC with specific styles | `TableOfContents.createWithStyles(['H1','H3'])` |
472
- | `TableOfContents.createFlat(title?, styles?)` | Flat TOC (no indent) | `TableOfContents.createFlat()` |
473
- | `TableOfContents.createNumbered(title?, format?)` | Numbered TOC | `TableOfContents.createNumbered('TOC', 'roman')` |
474
- | `TableOfContents.createWithSpacing(spacing, opts?)` | TOC with custom spacing | `TableOfContents.createWithSpacing(120)` |
475
- | `TableOfContents.createWithHyperlinkColor(color, opts?)` | Custom hyperlink color | `TableOfContents.createWithHyperlinkColor('FF0000')` |
476
-
477
- **Note:** All TOC elements are automatically wrapped in an SDT (Structured Document Tag) for native Word integration. This enables Word's "Update Table" button and provides better compatibility with Microsoft Word's TOC features.
478
-
479
- #### TOC Configuration Methods
480
-
481
- | Method | Description | Values |
482
- | --------------------------------- | ------------------------------ | -------------------------- |
483
- | `setIncludeStyles(styles)` | Select specific heading styles | `['Heading1', 'Heading3']` |
484
- | `setNumbered(numbered, format?)` | Enable/disable numbering | `(true, 'roman')` |
485
- | `setNoIndent(noIndent)` | Remove indentation | `true/false` |
486
- | `setCustomIndents(indents)` | Custom indents per level | `[0, 200, 400]` (twips) |
487
- | `setSpaceBetweenEntries(spacing)` | Spacing between entries | `120` (twips) |
488
- | `setHyperlinkColor(color)` | Hyperlink color | `'0000FF'` (default blue) |
489
- | `setHideInWebLayout(hide)` | Hide page numbers in web view | `true/false` |
490
- | `configure(options)` | Bulk configuration | See example below |
491
-
492
- #### TOC Properties
493
-
494
- | Property | Type | Default | Description |
495
- | --------------------- | ------------------------------------ | --------------------- | ---------------------------------- |
496
- | `title` | `string` | `'Table of Contents'` | TOC title |
497
- | `levels` | `number` (1-9) | `3` | Heading levels to include |
498
- | `includeStyles` | `string[]` | `undefined` | Specific styles (overrides levels) |
499
- | `showPageNumbers` | `boolean` | `true` | Show page numbers |
500
- | `useHyperlinks` | `boolean` | `false` | Use hyperlinks instead of page #s |
501
- | `hideInWebLayout` | `boolean` | `false` | Hide page numbers in web layout |
502
- | `numbered` | `boolean` | `false` | Number TOC entries |
503
- | `numberingFormat` | `'decimal'/'roman'/'alpha'` | `'decimal'` | Numbering format |
504
- | `noIndent` | `boolean` | `false` | Remove all indentation |
505
- | `customIndents` | `number[]` | `undefined` | Custom indents in twips |
506
- | `spaceBetweenEntries` | `number` | `0` | Spacing in twips |
507
- | `hyperlinkColor` | `string` | `'0000FF'` | Hyperlink color (hex without #) |
508
- | `tabLeader` | `'dot'/'hyphen'/'underscore'/'none'` | `'dot'` | Tab leader character |
509
-
510
- **Example:**
511
-
512
- ```typescript
513
- // Basic TOC
514
- const simpleToc = TableOfContents.createStandard();
515
- doc.addTableOfContents(simpleToc);
516
-
517
- // Select specific styles (e.g., only Heading1 and Heading3)
518
- const customToc = TableOfContents.createWithStyles(["Heading1", "Heading3"]);
519
-
520
- // Flat TOC with no indentation
521
- const flatToc = TableOfContents.createFlat("Contents");
522
-
523
- // Numbered TOC with roman numerals
524
- const numberedToc = TableOfContents.createNumbered(
525
- "Table of Contents",
526
- "roman"
527
- );
528
-
529
- // Custom hyperlink color (red instead of blue)
530
- const coloredToc = TableOfContents.createWithHyperlinkColor("FF0000");
531
-
532
- // Advanced configuration
533
- const toc = TableOfContents.create()
534
- .setIncludeStyles(["Heading1", "Heading2", "Heading3"])
535
- .setNumbered(true, "decimal")
536
- .setSpaceBetweenEntries(120) // 6pt spacing
537
- .setHyperlinkColor("0000FF")
538
- .setNoIndent(false);
539
-
540
- // Or use configure() for bulk settings
541
- toc.configure({
542
- title: "Table of Contents",
543
- includeStyles: ["Heading1", "CustomHeader"],
544
- numbered: true,
545
- numberingFormat: "alpha",
546
- spaceBetweenEntries: 100,
547
- hyperlinkColor: "FF0000",
548
- noIndent: true,
549
- });
550
-
551
- // Insert at specific position
552
- doc.insertTocAt(0, toc);
553
- ```
554
-
555
- ### Image Handling
556
-
557
- | Method | Description | Example |
558
- | --------------------------------------- | ------------------ | -------------------------------- |
559
- | `Image.fromFile(path, width?, height?)` | Load from file | `Image.fromFile('pic.jpg')` |
560
- | `Image.fromBuffer(buffer, ext, w?, h?)` | Load from buffer | `Image.fromBuffer(buf, 'png')` |
561
- | `setWidth(emus, maintainRatio?)` | Set width | `img.setWidth(inchesToEmus(3))` |
562
- | `setHeight(emus, maintainRatio?)` | Set height | `img.setHeight(inchesToEmus(2))` |
563
- | `setSize(width, height)` | Set dimensions | `img.setSize(w, h)` |
564
- | `setRotation(degrees)` | Rotate image | `img.setRotation(90)` |
565
- | `setAltText(text)` | Accessibility text | `img.setAltText('Description')` |
566
-
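Image dimensions are given in EMUs (English Metric Units): 914,400 per inch, or 12,700 per point. The sketch below shows the conversion and one plausible reading of a `maintainRatio` flag, namely scaling the height by the same factor as the width. The function names are local to this example (the README's own `inchesToEmus` is not reimplemented here), and the ratio behavior is an assumption rather than documented library behavior.

```typescript
const EMUS_PER_INCH = 914400;
const EMUS_PER_POINT = 12700; // 914400 / 72

interface EmuSize {
  width: number;
  height: number;
}

function toEmusFromInches(inches: number): number {
  return Math.round(inches * EMUS_PER_INCH);
}

// Resize to a target width, scaling height by the same factor.
function resizeKeepingRatio(size: EmuSize, targetWidth: number): EmuSize {
  const factor = targetWidth / size.width;
  return { width: targetWidth, height: Math.round(size.height * factor) };
}

const original: EmuSize = {
  width: toEmusFromInches(4),
  height: toEmusFromInches(2),
};
const resized = resizeKeepingRatio(original, toEmusFromInches(3));
// resized is 3in x 1.5in: { width: 2743200, height: 1371600 }
```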
567
- ### Hyperlinks
568
-
569
- | Method | Description | Example |
570
- | ------------------------------------------------- | ---------------- | ---------------------------------------------------------- |
571
- | `Hyperlink.createExternal(url, text, format?)` | Web link | `Hyperlink.createExternal('https://example.com', 'Click')` |
572
- | `Hyperlink.createEmail(email, text?, format?)` | Email link | `Hyperlink.createEmail('user@example.com')` |
573
- | `Hyperlink.createInternal(anchor, text, format?)` | Internal link | `Hyperlink.createInternal('Section1', 'Go to')` |
574
- | `para.addHyperlink(hyperlink)` | Add to paragraph | `para.addHyperlink(link)` |
575
-
576
- ### Headers & Footers
577
-
578
- | Method | Description | Example |
579
- | ---------------------------- | ------------------------ | -------------------------------- |
580
- | `setHeader(header)` | Set default header | `doc.setHeader(myHeader)` |
581
- | `setFooter(footer)` | Set default footer | `doc.setFooter(myFooter)` |
582
- | `setFirstPageHeader(header)` | First page header | `doc.setFirstPageHeader(header)` |
583
- | `setFirstPageFooter(footer)` | First page footer | `doc.setFirstPageFooter(footer)` |
584
- | `setEvenPageHeader(header)` | Even page header | `doc.setEvenPageHeader(header)` |
585
- | `setEvenPageFooter(footer)` | Even page footer | `doc.setEvenPageFooter(footer)` |
586
- | `removeHeader(type)` | Remove specific header | `doc.removeHeader('default')` |
587
- | `removeFooter(type)` | Remove specific footer | `doc.removeFooter('first')` |
588
- | `clearHeaders()` | Remove all headers | `doc.clearHeaders()` |
589
- | `clearFooters()` | Remove all footers | `doc.clearFooters()` |
590
-
591
- ### Page Setup
592
-
593
- | Method | Description | Example |
594
- | ------------------------------------- | ----------------- | ------------------------------------- |
595
- | `setPageSize(width, height, orient?)` | Page dimensions | `doc.setPageSize(12240, 15840)` |
596
- | `setPageOrientation(orientation)` | Page orientation | `doc.setPageOrientation('landscape')` |
597
- | `setMargins(margins)` | Page margins | `doc.setMargins({top: 1440, ...})` |
598
- | `setLanguage(language)` | Document language | `doc.setLanguage('en-US')` |
599
-
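The `12240, 15840` in the `setPageSize` example is US Letter (8.5 x 11 in) expressed in twips, and `1440` is a one-inch margin. A short sketch of the arithmetic follows; the type and helper names are hypothetical, and whether `setPageOrientation` swaps width and height for you is not stated in the README, so the swap here is purely illustrative.

```typescript
interface PageSizeTwips {
  width: number;
  height: number;
}

const LETTER_PORTRAIT: PageSizeTwips = { width: 12240, height: 15840 }; // 8.5 x 11 in
const A4_PORTRAIT: PageSizeTwips = { width: 11906, height: 16838 }; // 210 x 297 mm, rounded

// Return a landscape version of a page size (wider than tall).
function toLandscape(size: PageSizeTwips): PageSizeTwips {
  return size.width >= size.height
    ? { width: size.width, height: size.height }
    : { width: size.height, height: size.width };
}

// Horizontal space left for content after left and right margins.
function contentWidth(size: PageSizeTwips, left: number, right: number): number {
  return size.width - left - right;
}

const letterLandscape = toLandscape(LETTER_PORTRAIT); // { width: 15840, height: 12240 }
const usableWidth = contentWidth(LETTER_PORTRAIT, 1440, 1440); // 9360 twips = 6.5 in
```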
600
- ### Document Properties
601
-
602
- | Method | Description | Properties |
603
- | ---------------------- | ------------ | ------------------------------------- |
604
- | `setProperties(props)` | Set metadata | `{title, subject, creator, keywords}` |
605
- | `getProperties()` | Get metadata | Returns all properties |
606
-
607
- ### Advanced Features
608
-
609
- #### Bookmarks
610
-
611
- | Method | Description |
612
- | ---------------------------------------- | ------------------- |
613
- | `createBookmark(name)` | Create bookmark |
614
- | `createHeadingBookmark(text)` | Auto-named bookmark |
615
- | `getBookmark(name)` | Get by name |
616
- | `hasBookmark(name)` | Check existence |
617
- | `addBookmarkToParagraph(para, bookmark)` | Add to paragraph |
618
-
619
- #### Comments
620
-
621
- | Method | Description |
622
- | ------------------------------------------- | ----------------- |
623
- | `createComment(author, content, initials?)` | Add comment |
624
- | `createReply(parentId, author, content)` | Reply to comment |
625
- | `getComment(id)` | Get by ID |
626
- | `getAllComments()` | Get all top-level |
627
- | `addCommentToParagraph(para, comment)` | Add to paragraph |
628
-
629
- #### Track Changes
630
-
631
- | Method | Description |
632
- | ------------------------------------ | ----------------- |
633
- | `trackInsertion(para, author, text)` | Track insertion |
634
- | `trackDeletion(para, author, text)` | Track deletion |
635
- | `isTrackingChanges()` | Check if tracking |
636
- | `getRevisionStats()` | Get statistics |
637
-
638
- #### Footnotes & Endnotes
639
-
640
- | Method | Description |
641
- | -------------------------- | ---------------- |
642
- | `FootnoteManager.create()` | Manage footnotes |
643
- | `EndnoteManager.create()` | Manage endnotes |
644
-
645
- #### Document Helper Functions
646
-
647
- High-level helper methods for common document formatting tasks:
648
-
649
- | Method | Description |
650
- | ---------------------------------------- | ------------------------------------------------------------------------------------------- |
651
- | `applyCustomFormattingToExistingStyles(options?)`| Modify Heading1, Heading2, Heading3, Normal, and List Paragraph styles with custom or default formatting (Verdana font, specific spacing, single line spacing). Heading2 paragraphs are wrapped in tables. Accepts optional configuration for full customization. |
652
- | `applyStylesFromObjects(...styles)` | Helper function that accepts Style objects and applies their formatting. Converts Style objects to configuration format and calls applyCustomFormattingToExistingStyles(). |
653
- | `applyStandardTableFormatting(options?)` | Comprehensive table formatting helper: autofit to window, format header row (shading, bold, centered), apply cell margins, conditionally recolor cells based on existing shading. One-call solution for standardizing all tables in document. |
654
- | `hideTableOfContentsPageNumbers()` | Hide page numbers in all TOC elements in the document; returns count of TOCs modified |
655
- | `wrapParagraphInTable(para, options?)` | Wrap a paragraph in a 1x1 table with optional shading, margins, and width settings |
656
- | `isParagraphInTable(para)` | Check if a paragraph is inside a table; returns `{inTable: boolean, cell?: TableCell}` |
657
- | `updateAllHyperlinkColors(color)` | Set all hyperlinks in the document to a specific color (e.g., '0000FF' for blue) |
658
- | `removeAllHeadersFooters()` | Remove all headers and footers from the document; returns count of headers/footers removed |
659
-
660
- **Example - Using Helper Functions:**
661
-
662
- ```typescript
663
- import { Document } from 'docxmlater';
664
-
665
- const doc = await Document.load('document.docx');
666
-
667
- // Apply default formatting to standard styles (Verdana, current spacing)
668
- const results = doc.applyCustomFormattingToExistingStyles();
669
- console.log(`Modified styles:`, results);
670
- // Output: { heading1: true, heading2: true, heading3: true, normal: true, listParagraph: true }
671
-
672
- // Or apply custom formatting with config objects
673
- doc.applyCustomFormattingToExistingStyles({
674
- heading1: {
675
- run: { font: 'Arial', size: 16, bold: true, color: '000000' },
676
- paragraph: { spacing: { before: 0, after: 200, line: 240, lineRule: 'auto' } }
677
- },
678
- heading2: {
679
- run: { font: 'Arial', size: 14, bold: true },
680
- paragraph: { spacing: { before: 100, after: 100 } },
681
- tableOptions: { shading: '808080', marginLeft: 150, marginRight: 150 }
682
- }
683
- });
684
-
685
- // Alternative: Use Style objects instead of config
686
- import { Style } from 'docxmlater';
687
-
688
- const h1 = new Style({
689
- styleId: 'Heading1',
690
- name: 'Heading 1',
691
- type: 'paragraph',
692
- runFormatting: { font: 'Arial', size: 16, bold: true },
693
- paragraphFormatting: { spacing: { before: 0, after: 200 } }
694
- });
695
-
696
- const h2 = Style.createHeadingStyle(2);
697
- h2.setRunFormatting({ font: 'Arial', size: 14, bold: true });
698
- h2.setHeading2TableOptions({ shading: '808080', marginLeft: 150, marginRight: 150 });
699
-
700
- doc.applyStylesFromObjects(h1, h2);
701
-
702
- // Hide page numbers in all TOC elements
703
- const tocCount = doc.hideTableOfContentsPageNumbers();
704
- console.log(`Hid page numbers in ${tocCount} TOCs`);
705
-
706
- // Set all hyperlinks to blue
707
- doc.updateAllHyperlinkColors('0000FF');
708
-
709
- // Wrap a specific paragraph in a table
710
- const para = doc.getParagraphs()[0];
711
- doc.wrapParagraphInTable(para, {
712
- shading: 'BFBFBF', // Gray background
713
- marginLeft: 101, // 5pt margins
714
- marginRight: 101,
715
- tableWidthPercent: 5000 // 100% width
716
- });
717
-
718
- // Check if a paragraph is in a table
719
- const { inTable, cell } = doc.isParagraphInTable(para);
720
- if (inTable && cell) {
721
- console.log('Paragraph is in a table cell');
722
- cell.setShading({ fill: 'FFFF00' }); // Change to yellow
723
- }
724
-
725
- // Remove all headers and footers
726
- const removedCount = doc.removeAllHeadersFooters();
727
- console.log(`Removed ${removedCount} headers and footers`);
728
-
729
- await doc.save('formatted.docx');
730
- ```
731
-
732
- **Note on `applyCustomFormattingToExistingStyles(options?)`:**
733
-
734
- This helper function modifies existing style definitions with custom or default formatting. All parameters are optional.
735
-
736
- **Default formatting (when no options provided):**
737
- - **Heading1**: 18pt black bold Verdana, left aligned, 0pt before/12pt after, single line spacing
738
- - **Heading2**: 14pt black bold Verdana, left aligned, 6pt before/after, single line spacing, wrapped in gray tables (100% width, 0.08" margins)
739
- - **Heading3**: 12pt black bold Verdana, left aligned, 3pt before/after, single line spacing (no table wrapping)
740
- - **Normal**: 12pt Verdana, left aligned, 3pt before/after, single line spacing
741
- - **List Paragraph**: 12pt Verdana, left aligned, 0pt before/3pt after, single line spacing, 0.25" bullet indent/0.50" text indent, contextual spacing enabled
742
-
743
- **Customization:**
744
- - Pass custom `ApplyCustomFormattingOptions` to override any style's run formatting (font, size, bold, italic, underline, color) or paragraph formatting (alignment, spacing, indentation, contextual spacing)
745
- - Heading2 table options are configurable: shading, margins, and table width
746
- - All styles have italic and underline formatting cleared to ensure style consistency
747
-
748
- **Heading2 table wrapping behavior:**
749
- - Empty Heading2 paragraphs are skipped (not wrapped in tables)
750
- - Heading2 paragraphs already in tables have their cell formatted with provided or default options
751
- - Heading2 paragraphs not in tables are wrapped in new 1x1 tables
752
- - Table appearance is fully customizable via `options.heading2.tableOptions`
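
The three wrapping cases above can be sketched as a small decision function. This is only an illustration of the documented behavior, not the library's implementation: the name `decideHeading2Action`, the action labels, and the choice to treat whitespace-only text as empty and to check emptiness first are all assumptions.

```typescript
// Hypothetical sketch of the Heading2 wrapping rules described above.
// None of these names come from docxmlater itself.
type Heading2Action = "skip" | "format-existing-cell" | "wrap-in-new-table";

function decideHeading2Action(text: string, inTable: boolean): Heading2Action {
  // Assumption: whitespace-only paragraphs count as empty and are checked first.
  if (text.trim().length === 0) {
    return "skip";
  }
  // A paragraph already inside a table only has its cell formatted.
  return inTable ? "format-existing-cell" : "wrap-in-new-table";
}
```

In real code the `inTable` flag would presumably come from `isParagraphInTable(para)`, which the table above documents as returning `{inTable, cell?}`.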
753
-
754
- **Note:** Per ECMA-376 §17.7.2, direct formatting in document.xml overrides style definitions. This method automatically clears conflicting direct formatting to ensure style changes take effect.
755
-
756
- ---
757
-
758
- **Note on `applyStandardTableFormatting(options?)`:**
759
-
760
- This comprehensive helper function standardizes table formatting across an entire document in a single call. Perfect for ensuring consistent table appearance.
761
-
762
- **Default behavior (when no parameter provided):**
763
- - **All tables**: Autofit to window width
764
- - **First row (header)**: Gray background (#E9E9E9), bold, centered, Verdana 12pt, black text, 3pt spacing
765
- - **All cells**: 0pt top/bottom margin, 0.08" (115 twips) left/right margin
766
- - **Single-cell tables**: Skipped by default
767
- - **Conditional shading**: All colored cells (except white) become the same gray as header (#E9E9E9)
768
-
769
- **Key Features:**
770
- - **Autofit to window**: Tables automatically resize to fit the window width
771
- - **Header row formatting**: Comprehensive first-row styling with custom options
772
- - **Cell margins**: Consistent internal padding for all cells
773
- - **Conditional recoloring**: Cells with colors (excluding white and header color) can be automatically recolored
774
- - **Smart skipping**: Optionally skip 1x1 tables to avoid formatting placeholder tables
775
-
776
- **Usage Examples:**
777
-
778
- ```typescript
779
- import { Document } from 'docxmlater';
780
-
781
- const doc = await Document.load('document.docx');
782
-
783
- // Example 1: Use default gray color (E9E9E9) - simplest approach
784
- const result = doc.applyStandardTableFormatting();
785
- console.log(`Processed ${result.tablesProcessed} tables`);
786
- console.log(`Formatted ${result.headerRowsFormatted} header rows`);
787
- console.log(`Recolored ${result.cellsRecolored} cells`);
788
-
789
- // Example 2: Custom color - both header AND conditional cells use same color
790
- doc.applyStandardTableFormatting('D9D9D9'); // Light gray
791
-
792
- // Example 3: Different colors for header vs data cells (advanced)
793
- doc.applyStandardTableFormatting({
794
- headerRowShading: '4472C4', // Blue header
795
- conditionalShading: {
796
- replacementColor: 'FFD700', // Gold for other colored cells
797
- applyFormatting: true // Apply bold/centered/Verdana formatting
798
- }
799
- });
800
-
801
- // Example 4: Full customization
802
- doc.applyStandardTableFormatting({
803
- autofitToWindow: true,
804
- headerRowShading: 'CCCCCC',
805
- headerRowFormatting: {
806
- bold: true,
807
- alignment: 'left', // Left-aligned instead of centered
808
- font: 'Arial',
809
- size: 11,
810
- color: '000000',
811
- spacingBefore: 40,
812
- spacingAfter: 40
813
- },
814
- cellMargins: {
815
- top: 0,
816
- bottom: 0,
817
- left: 100, // Custom left margin
818
- right: 100 // Custom right margin
819
- },
820
- conditionalShading: {
821
- replacementColor: 'F0F0F0',
822
- applyFormatting: false // Just recolor, don't format
823
- },
824
- skipSingleCellTables: true // Skip 1x1 tables
825
- });
826
-
827
- await doc.save('formatted-tables.docx');
828
- ```
829
-
830
- **Conditional Shading Logic:**
831
-
832
- By default, colored cells are automatically recolored to match the header row color:
833
-
834
- 1. **Cells are evaluated**: Only data rows (not header row) are checked
835
- 2. **Exclusion criteria**: Cells with white (#FFFFFF) or header row color are skipped
836
- 3. **Matching cells**: All other colored cells are recolored to the same color as the header
837
- 4. **Automatic formatting**: Matching cells also receive:
838
- - Bold text
839
- - Centered alignment
840
- - Verdana 12pt font
841
- - 3pt spacing before/after
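
The evaluation steps above amount to a per-cell predicate. The following is a minimal sketch under stated assumptions: `decideCellShading` and `normalizeHex` are hypothetical names, and treating unshaded cells as skipped and comparing colors case-insensitively are guesses about details the README does not spell out.

```typescript
// Hypothetical sketch of the conditional shading rules; not the library's code.
type ShadingDecision = "skip" | "recolor";

// Normalize "#e9e9e9" and "E9E9E9" to the same form before comparing.
function normalizeHex(color: string): string {
  return color.replace(/^#/, "").toUpperCase();
}

function decideCellShading(
  cellFill: string | undefined,
  headerColor: string
): ShadingDecision {
  // Assumption: a cell with no fill at all is left untouched.
  if (cellFill === undefined) {
    return "skip";
  }
  const fill = normalizeHex(cellFill);
  // White cells and cells already in the header color are excluded.
  if (fill === "FFFFFF" || fill === normalizeHex(headerColor)) {
    return "skip";
  }
  // Every other colored cell is recolored to match the header.
  return "recolor";
}
```

Only data rows would be passed through such a check, since the header row is formatted separately.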
842
-
843
- **Example scenario:**
844
- ```typescript
845
- // Document has tables with various cell colors:
846
- // - White cells (#FFFFFF)
847
- // - Yellow highlights (#FFFF00)
848
- // - Green highlights (#00FF00)
849
-
850
- // Simple: Use same color for header AND highlighted cells
851
- doc.applyStandardTableFormatting('E9E9E9'); // Gray
852
-
853
- // Result:
854
- // - White cells: Stay white
855
- // - Header row: Gray (E9E9E9)
856
- // - Yellow highlights: Become gray (E9E9E9) with bold/centered formatting
857
- // - Green highlights: Become gray (E9E9E9) with bold/centered formatting
858
- ```
859
-
860
- **Return value:**
861
- ```typescript
862
- {
863
- tablesProcessed: number; // Number of tables modified
864
- headerRowsFormatted: number; // Number of header rows formatted
865
- cellsRecolored: number; // Number of cells recolored by conditional shading
866
- }
867
- ```
868
-
869
-
870
- **Related helpers:**
871
- - Use `hideTableOfContentsPageNumbers()` to hide page numbers in TOCs
872
- - Use `updateAllHyperlinkColors(color)` to change hyperlink colors
873
-
874
- ### Low-Level Document Parts
875
-
876
- | Method | Description | Example |
877
- | ---------------------------- | --------------------- | ------------------------------------------------- |
878
- | `getPart(partName)` | Get document part | `doc.getPart('word/document.xml')` |
879
- | `setPart(partName, content)` | Set document part | `doc.setPart('custom.xml', data)` |
880
- | `removePart(partName)` | Remove part | `doc.removePart('custom.xml')` |
881
- | `listParts()` | List all parts | `const parts = await doc.listParts()` |
882
- | `partExists(partName)` | Check part exists | `if (await doc.partExists('...'))` |
883
- | `getContentTypes()` | Get content types | `const types = await doc.getContentTypes()` |
884
- | `addContentType(part, type)` | Register content type | `doc.addContentType('.json', 'application/json')` |
885
-
886
- ### Unit Conversion Utilities
887
-
888
- #### Twips Conversions
889
- | Function | Description | Example |
890
- | ------------------------- | ------------------- | --------------------------- |
891
- | `twipsToPoints(twips)` | Twips to points | `twipsToPoints(240)` // 12 |
892
- | `twipsToInches(twips)` | Twips to inches | `twipsToInches(1440)` // 1 |
893
- | `twipsToCm(twips)` | Twips to cm | `twipsToCm(1440)` // 2.54 |
894
- | `twipsToEmus(twips)` | Twips to EMUs | `twipsToEmus(1440)` |
895
-
896
- #### EMUs (English Metric Units) Conversions
897
- | Function | Description | Example |
898
- | --------------------------- | -------------------- | ----------------------------- |
899
- | `emusToTwips(emus)` | EMUs to twips | `emusToTwips(914400)` // 1440 |
900
- | `emusToInches(emus)` | EMUs to inches | `emusToInches(914400)` // 1 |
901
- | `emusToCm(emus)` | EMUs to cm | `emusToCm(914400)` // 2.54 |
902
- | `emusToPoints(emus)` | EMUs to points | `emusToPoints(914400)` // 72 |
903
- | `emusToPixels(emus, dpi?)` | EMUs to pixels | `emusToPixels(914400)` // 96 |
904
-
905
- #### Points Conversions
906
- | Function | Description | Example |
907
- | ------------------------ | ------------------ | -------------------------- |
908
- | `pointsToTwips(points)` | Points to twips | `pointsToTwips(12)` // 240 |
909
- | `pointsToEmus(points)` | Points to EMUs | `pointsToEmus(72)` |
910
- | `pointsToInches(points)` | Points to inches | `pointsToInches(72)` // 1 |
911
- | `pointsToCm(points)` | Points to cm | `pointsToCm(72)` // 2.54 |
912
-
913
- #### Inches Conversions
914
- | Function | Description | Example |
915
- | ----------------------------- | ------------------- | ----------------------------- |
916
- | `inchesToTwips(inches)` | Inches to twips | `inchesToTwips(1)` // 1440 |
917
- | `inchesToEmus(inches)` | Inches to EMUs | `inchesToEmus(1)` // 914400 |
918
- | `inchesToPoints(inches)` | Inches to points | `inchesToPoints(1)` // 72 |
919
- | `inchesToCm(inches)` | Inches to cm | `inchesToCm(1)` // 2.54 |
920
- | `inchesToPixels(inches, dpi)` | Inches to pixels | `inchesToPixels(1, 96)` // 96 |
921
-
922
- #### Centimeters Conversions
923
- | Function | Description | Example |
924
- | ----------------------- | ---------------- | --------------------------- |
925
- | `cmToTwips(cm)` | cm to twips | `cmToTwips(2.54)` // 1440 |
926
- | `cmToEmus(cm)` | cm to EMUs | `cmToEmus(2.54)` // 914400 |
927
- | `cmToInches(cm)` | cm to inches | `cmToInches(2.54)` // 1 |
928
- | `cmToPoints(cm)` | cm to points | `cmToPoints(2.54)` // 72 |
929
- | `cmToPixels(cm, dpi?)` | cm to pixels | `cmToPixels(2.54, 96)` // 96|
930
-
931
- #### Pixels Conversions
932
- | Function | Description | Example |
933
- | ---------------------------- | ------------------- | ------------------------------ |
934
- | `pixelsToEmus(pixels, dpi?)` | Pixels to EMUs | `pixelsToEmus(96)` // 914400 |
935
- | `pixelsToInches(pixels, dpi?)`| Pixels to inches | `pixelsToInches(96, 96)` // 1 |
936
- | `pixelsToTwips(pixels, dpi?)`| Pixels to twips | `pixelsToTwips(96, 96)` // 1440|
937
- | `pixelsToCm(pixels, dpi?)` | Pixels to cm | `pixelsToCm(96, 96)` // 2.54 |
938
- | `pixelsToPoints(pixels, dpi?)`| Pixels to points | `pixelsToPoints(96, 96)` // 72 |
939
-
940
- **Note:** Default DPI is 96 for pixel conversions
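
The example values in the tables above all follow from a handful of fixed ratios: 1 inch = 1440 twips = 914400 EMUs = 72 points = 2.54 cm, and 1 point = 20 twips. A self-contained sketch of that arithmetic is shown below. The `sketch*` function names are hypothetical and the rounding behavior is an assumption; the library's own helpers may round differently.

```typescript
// Illustrative constants only; these are not taken from docxmlater's source.
const TWIPS_PER_INCH = 1440;
const TWIPS_PER_POINT = 20;
const EMUS_PER_INCH = 914400;
const CM_PER_INCH = 2.54;
const DEFAULT_DPI = 96;

// Assumption: results in integer units are rounded to the nearest whole value.
function sketchInchesToTwips(inches: number): number {
  return Math.round(inches * TWIPS_PER_INCH);
}

function sketchTwipsToPoints(twips: number): number {
  return twips / TWIPS_PER_POINT;
}

function sketchEmusToPixels(emus: number, dpi: number = DEFAULT_DPI): number {
  return Math.round((emus / EMUS_PER_INCH) * dpi);
}

function sketchCmToEmus(cm: number): number {
  return Math.round((cm / CM_PER_INCH) * EMUS_PER_INCH);
}
```

With these ratios, the 0.08" cell margin quoted earlier works out to 115.2 twips, which matches the 115 twips figure once rounded.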
941
-
942
- ### ZIP Archive Helper Methods
943
-
944
- #### File Operations
945
- | Method | Description | Example |
946
- | ------------------------------- | ------------------------- | -------------------------------------------- |
947
- | `addFile(path, content)` | Add file to archive | `handler.addFile('doc.xml', xmlContent)` |
948
- | `updateFile(path, content)` | Update existing file | `handler.updateFile('doc.xml', newContent)` |
949
- | `removeFile(path)` | Remove file from archive | `handler.removeFile('old.xml')` |
950
- | `renameFile(oldPath, newPath)` | Rename file | `handler.renameFile('a.xml', 'b.xml')` |
951
- | `copyFile(srcPath, destPath)` | Copy file | `handler.copyFile('a.xml', 'copy-a.xml')` |
952
- | `moveFile(srcPath, destPath)` | Move file | `handler.moveFile('a.xml', 'folder/a.xml')` |
953
-
954
- #### File Retrieval
955
- | Method | Description | Returns |
956
- | ------------------------- | ---------------------- | --------------- |
957
- | `getFile(path)` | Get file object | `ZipFile` |
958
- | `getFileAsString(path)` | Get file as string | `string` |
959
- | `getFileAsBuffer(path)` | Get file as buffer | `Buffer` |
960
- | `hasFile(path)` | Check if file exists | `boolean` |
961
- | `getFilePaths()` | Get all file paths | `string[]` |
962
- | `getAllFiles()` | Get all files | `FileMap` |
963
-
964
- #### Batch Operations
965
- | Method | Description | Returns |
966
- | ------------------------------- | ---------------------------- | -------------- |
967
- | `removeFiles(paths[])` | Remove multiple files | `number` |
968
- | `getFilesByExtension(ext)` | Get files by extension | `ZipFile[]` |
969
- | `getTextFiles()` | Get all text files | `ZipFile[]` |
970
- | `getBinaryFiles()` | Get all binary files | `ZipFile[]` |
971
- | `getMediaFiles()` | Get media files | `ZipFile[]` |
972
-
973
- #### Archive Information
974
- | Method | Description | Returns |
975
- | ------------------ | ------------------------- | ------------------------ |
976
- | `getFileCount()` | Count files in archive | `number` |
977
- | `getTotalSize()` | Get total size in bytes | `number` |
978
- | `getStats()` | Get detailed statistics | `{fileCount, size, ...}` |
979
- | `isEmpty()` | Check if archive is empty | `boolean` |
980
-
981
- #### Import/Export
982
- | Method | Description | Returns |
983
- | -------------------------------- | ------------------------ | -------------------- |
984
- | `exportFile(internal, external)` | Export file from archive | `Promise<void>` |
985
- | `importFile(external, internal)` | Import file to archive | `Promise<void>` |
986
-
987
- ## Common Recipes
988
-
989
- ### Create a Simple Document
990
-
991
- ```typescript
992
- const doc = Document.create();
993
- doc.createParagraph("Title").setStyle("Title");
994
- doc.createParagraph("This is a simple document.");
995
- await doc.save("simple.docx");
996
- ```
997
-
998
- ### Add Formatted Text
999
-
1000
- ```typescript
1001
- const para = doc.createParagraph();
1002
- para.addText("Bold", { bold: true });
1003
- para.addText(" and ");
1004
- para.addText("Colored", { color: "FF0000" });
1005
- ```
1006
-
1007
- ### Create a Table with Borders
1008
-
1009
- ```typescript
1010
- const table = doc.createTable(3, 3);
1011
- table.setAllBorders({ style: "single", size: 8, color: "000000" });
1012
- table.getCell(0, 0)?.createParagraph("Header 1");
1013
- table.getRow(0)?.getCell(0)?.setShading({ fill: "4472C4" });
1014
- ```
1015
-
1016
- ### Insert an Image
1017
-
1018
- ```typescript
1019
- import { Image, inchesToEmus } from "docxmlater";
1020
-
1021
- const image = Image.fromFile("./photo.jpg");
1022
- image.setWidth(inchesToEmus(4), true); // 4 inches, maintain ratio
1023
- doc.addImage(image);
1024
- ```
1025
-
1026
- ### Add a Hyperlink
1027
-
1028
- ```typescript
1029
- const para = doc.createParagraph();
1030
- para.addText("Visit ");
1031
- para.addHyperlink(
1032
- Hyperlink.createExternal("https://example.com", "our website")
1033
- );
1034
- ```
1035
-
1036
- ### Search and Replace Text
1037
-
1038
- ```typescript
1039
- // Find all occurrences
1040
- const results = doc.findText("old text", { caseSensitive: true });
1041
- console.log(`Found ${results.length} occurrences`);
1042
-
1043
- // Replace all
1044
- const count = doc.replaceText("old text", "new text", { wholeWord: true });
1045
- console.log(`Replaced ${count} occurrences`);
1046
- ```
1047
-
1048
- ### Load and Modify Existing Document
1049
-
1050
- ```typescript
1051
- const doc = await Document.load("existing.docx");
1052
- doc.createParagraph("Added paragraph");
1053
-
1054
- // Update all hyperlinks
1055
- const urlMap = new Map([["https://old-site.com", "https://new-site.com"]]);
1056
- doc.updateHyperlinkUrls(urlMap);
1057
-
1058
- await doc.save("modified.docx");
1059
- ```
1060
-
1061
- ### Create Lists
1062
-
1063
- ```typescript
1064
- // Bullet list
1065
- const bulletId = doc.createBulletList(3);
1066
- doc.createParagraph("First item").setNumbering(bulletId, 0);
1067
- doc.createParagraph("Second item").setNumbering(bulletId, 0);
1068
-
1069
- // Numbered list
1070
- const numberId = doc.createNumberedList(3);
1071
- doc.createParagraph("Step 1").setNumbering(numberId, 0);
1072
- doc.createParagraph("Step 2").setNumbering(numberId, 0);
1073
- ```
1074
-
1075
- ### Apply Custom Styles
1076
-
1077
- ```typescript
1078
- import { Style } from "docxmlater";
1079
-
1080
- const customStyle = Style.create({
1081
- styleId: "CustomHeading",
1082
- name: "Custom Heading",
1083
- basedOn: "Normal",
1084
- runFormatting: { bold: true, size: 14, color: "2E74B5" },
1085
- paragraphFormatting: { alignment: "center", spaceAfter: 240 },
1086
- });
1087
-
1088
- doc.addStyle(customStyle);
1089
- doc.createParagraph("Custom Styled Text").setStyle("CustomHeading");
1090
- ```
1091
-
1092
- ### Build Content with Detached Paragraphs
1093
-
1094
- Create paragraphs independently and add them conditionally:
1095
-
1096
- ```typescript
1097
- import { Paragraph } from "docxmlater";
1098
-
1099
- // Create reusable paragraph templates
1100
- const warningTemplate = Paragraph.createFormatted(
1101
- "WARNING: ",
1102
- { bold: true, color: "FF6600" },
1103
- { spacing: { before: 120, after: 120 } }
1104
- );
1105
-
1106
- // Clone and customize
1107
- const warning1 = warningTemplate.clone();
1108
- warning1.addText("Please read the documentation before proceeding.");
1109
-
1110
- // Build content from data
1111
- const items = [
1112
- { title: "First Item", description: "Description here" },
1113
- { title: "Second Item", description: "Another description" },
1114
- ];
1115
-
1116
- items.forEach((item, index) => {
1117
- const titlePara = Paragraph.create(`${index + 1}. `);
1118
- titlePara.addText(item.title, { bold: true });
1119
-
1120
- const descPara = Paragraph.create(item.description, {
1121
- indentation: { left: 360 },
1122
- });
1123
-
1124
- doc.addParagraph(titlePara);
1125
- doc.addParagraph(descPara);
1126
- });
1127
-
1128
- // See examples/advanced/detached-paragraphs.ts for more patterns
1129
- ```
1130
-
1131
- ### Add Headers and Footers
1132
-
1133
- ```typescript
1134
- import { Header, Footer, Field } from "docxmlater";
1135
-
1136
- // Header with the document title
1137
- const header = Header.create();
1138
- header.addParagraph("Document Title").setAlignment("center");
1139
-
1140
- // Footer with page numbers
1141
- const footer = Footer.create();
1142
- const footerPara = footer.addParagraph();
1143
- footerPara.addText("Page ");
1144
- footerPara.addField(Field.create({ type: "PAGE" }));
1145
- footerPara.addText(" of ");
1146
- footerPara.addField(Field.create({ type: "NUMPAGES" }));
1147
-
1148
- doc.setHeader(header);
1149
- doc.setFooter(footer);
1150
- ```
1151
-
1152
- ### Work with Document Statistics
1153
-
1154
- ```typescript
1155
- // Get word and character counts
1156
- console.log("Words:", doc.getWordCount());
1157
- console.log("Characters:", doc.getCharacterCount());
1158
- console.log("Characters (no spaces):", doc.getCharacterCount(false));
1159
-
1160
- // Check document size
1161
- const size = doc.estimateSize();
1162
- if (size.warning) {
1163
- console.warn(size.warning);
1164
- }
1165
- console.log(`Estimated size: ${size.totalEstimatedMB} MB`);
1166
- ```
1167
-
1168
- ### Handle Large Documents Efficiently
1169
-
1170
- ```typescript
1171
- const doc = Document.create({
1172
- maxMemoryUsagePercent: 80,
1173
- maxRssMB: 2048,
1174
- maxImageCount: 50,
1175
- maxTotalImageSizeMB: 100,
1176
- });
1177
-
1178
- // Process document...
1179
-
1180
- // Clean up resources after saving
1181
- await doc.save("large-document.docx");
1182
- doc.dispose(); // Free memory
1183
- ```
1184
-
1185
- ### Direct XML Access (Advanced)
1186
-
1187
- ```typescript
1188
- // Get raw XML
1189
- const documentXml = await doc.getPart("word/document.xml");
1190
- console.log(documentXml?.content);
1191
-
1192
- // Modify raw XML (use with caution)
1193
- await doc.setPart("word/custom.xml", "<custom>data</custom>");
1194
- await doc.addContentType("/word/custom.xml", "application/xml");
1195
-
1196
- // List all parts
1197
- const parts = await doc.listParts();
1198
- console.log("Document contains:", parts.length, "parts");
1199
- ```
1200
-
1201
- ## Features
1202
-
1203
- - **Full OpenXML Compliance** - Follows ECMA-376 standard
1204
- - **TypeScript First** - Complete type definitions
1205
- - **Memory Efficient** - Handles large documents with streaming
1206
- - **Atomic Saves** - Prevents corruption with temp file pattern
1207
- - **Rich Formatting** - Complete text and paragraph formatting
1208
- - **Tables** - Full support with borders, shading, merging
1209
- - **Images** - PNG, JPEG, GIF with sizing and positioning
1210
- - **Hyperlinks** - External, internal, and email links
1211
- - **Styles** - 13 built-in styles + custom style creation
1212
- - **Lists** - Bullets, numbering, multi-level
1213
- - **Headers/Footers** - Different first/even/odd pages
1214
- - **Search & Replace** - With case and whole word options
1215
- - **Document Stats** - Word count, character count, size estimation
1216
- - **Track Changes** - Insertions and deletions with authors
1217
- - **Comments** - With replies and threading
1218
- - **Bookmarks** - For internal navigation
1219
- - **Low-level Access** - Direct ZIP and XML manipulation
1220
-
1221
- ## Performance
1222
-
1223
- - Process 100+ page documents efficiently
1224
- - Atomic save pattern prevents corruption
1225
- - Memory management for large files
1226
- - Lazy loading of document parts
1227
- - Resource cleanup with `dispose()`
1228
-
1229
- ## Testing
1230
-
1231
- ```bash
1232
- npm test # Run all tests
1233
- npm run test:watch # Watch mode
1234
- npm run test:coverage # Coverage report
1235
- ```
1236
-
1237
- **Current:** 1,119 tests passing (97.3% pass rate) | 100% core functionality covered
1238
-
1239
- ## Development
1240
-
1241
- ```bash
1242
- # Install dependencies
1243
- npm install
1244
-
1245
- # Build TypeScript
1246
- npm run build
1247
-
1248
- # Run examples
1249
- npx ts-node examples/simple-document.ts
1250
- ```
1251
-
1252
- ## Project Structure
1253
-
1254
- ```text
1255
- src/
1256
- ├── core/ # Document, Parser, Generator, Validator
1257
- ├── elements/ # Paragraph, Run, Table, Image, Hyperlink
1258
- ├── formatting/ # Style, NumberingManager
1259
- ├── xml/ # XMLBuilder, XMLParser
1260
- ├── zip/ # ZipHandler for DOCX manipulation
1261
- └── utils/ # Validation, Units conversion
1262
-
1263
- examples/
1264
- ├── 01-basic/ # Simple document creation
1265
- ├── 02-text/ # Text formatting examples
1266
- ├── 03-tables/ # Table examples
1267
- ├── 04-styles/ # Style examples
1268
- ├── 05-images/ # Image handling
1269
- ├── 06-complete/ # Full document examples
1270
- └── 07-hyperlinks/ # Link examples
1271
- ```
1272
-
1273
- ## Hierarchy
1274
-
1275
- ```text
1276
- w:document (root)
1277
- └── w:body (body container)
1278
- ├── w:p (paragraph) [1..n]
1279
- │ ├── w:pPr (paragraph properties) [0..1]
1280
- │ │ ├── w:pStyle (style reference)
1281
- │ │ ├── w:jc (justification/alignment)
1282
- │ │ ├── w:ind (indentation)
1283
- │ │ └── w:spacing (spacing before/after)
1284
- │ ├── w:r (run) [1..n]
1285
- │ │ ├── w:rPr (run properties) [0..1]
1286
- │ │ │ ├── w:b (bold)
1287
- │ │ │ ├── w:i (italic)
1288
- │ │ │ ├── w:u (underline)
1289
- │ │ │ ├── w:sz (font size)
1290
- │ │ │ └── w:color (text color)
1291
- │ │ └── w:t (text content) [1]
1292
- │ ├── w:hyperlink (hyperlink) [0..n]
1293
- │ │ └── w:r (run with hyperlink text)
1294
- │ └── w:drawing (embedded image/shape) [0..n]
1295
- ├── w:tbl (table) [0..n]
1296
- │ ├── w:tblPr (table properties)
1297
- │ └── w:tr (table row) [1..n]
1298
- │ └── w:tc (table cell) [1..n]
1299
- │ └── w:p (paragraph in cell)
1300
- └── w:sectPr (section properties) [1] (must be last child of w:body)
1301
- ```
1302
-
1303
- ## Requirements
1304
-
1305
- - Node.js 16+
1306
- - TypeScript 5.0+ (for development)
1307
-
1308
- ## Installation Options
1309
-
1310
- ```bash
1311
- # NPM
1312
- npm install docxmlater
1313
-
1314
- # Yarn
1315
- yarn add docxmlater
1316
-
1317
- # PNPM
1318
- pnpm add docxmlater
1319
- ```
1320
-
1321
- ## Troubleshooting
1322
-
1323
- ### XML Corruption in Text
1324
-
1325
- **Problem**: Text displays with XML tags like `Important Information<w:t xml:space="preserve">1` in Word.
1326
-
1327
- **Cause**: Passing XML-like strings to text methods instead of using the API properly.
1328
-
1329
- ```typescript
1330
- // WRONG - Will display escaped XML as literal text
1331
- paragraph.addText("Important Information<w:t>1</w:t>");
1332
- // Displays as: "Important Information<w:t>1</w:t>"
1333
-
1334
- // CORRECT - Use separate text runs
1335
- paragraph.addText("Important Information");
1336
- paragraph.addText("1");
1337
- // Displays as: "Important Information1"
1338
-
1339
- // Or combine in one call
1340
- paragraph.addText("Important Information 1");
1341
- ```
1342
-
1343
- **Detection**: Use the corruption detection utility to find issues:
1344
-
1345
- ```typescript
1346
- import { detectCorruptionInDocument } from "docxmlater";
1347
-
1348
- const doc = await Document.load("file.docx");
1349
- const report = detectCorruptionInDocument(doc);
1350
-
1351
- if (report.isCorrupted) {
1352
- console.log(report.summary);
1353
- report.locations.forEach((loc) => {
1354
- console.log(`Paragraph ${loc.paragraphIndex}, Run ${loc.runIndex}:`);
1355
- console.log(` Original: ${loc.text}`);
1356
- console.log(` Fixed: ${loc.suggestedFix}`);
1357
- });
1358
- }
1359
- ```
1360
-
1361
- **Auto-Cleaning**: XML patterns are automatically removed by default for defensive data handling:
1362
-
1363
- ```typescript
1364
- // Default behavior - auto-clean enabled
1365
- const run = new Run("Text<w:t>value</w:t>");
1366
- // Result: "Textvalue" (XML tags removed automatically)
1367
-
1368
- // Disable auto-cleaning (for debugging)
1369
- const rawRun = new Run("Text<w:t>value</w:t>", { cleanXmlFromText: false });
1370
- // Result: "Text<w:t>value</w:t>" (XML tags preserved, will display in Word)
1371
- ```
1372
-
1373
- **Why This Happens**: The framework correctly escapes XML special characters per the XML specification. When you pass XML tags as text, they are properly escaped (`<` becomes `&lt;`) and Word displays them as literal text, not as markup.
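
To make the escaping step concrete, here is a minimal sketch of standard XML text escaping. The function name `escapeXmlText` is hypothetical and this is not the framework's own serializer; it only shows why markup passed as text ends up displayed literally.

```typescript
// Hypothetical helper illustrating XML text escaping; not docxmlater's code.
function escapeXmlText(text: string): string {
  return text
    .replace(/&/g, "&amp;") // must run first so later entities are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

const escaped = escapeXmlText("Important Information<w:t>1</w:t>");
console.log(escaped);
```

The escaped string is what gets written inside the `<w:t>` element, so Word shows the angle brackets as ordinary characters instead of interpreting them as a nested run.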
1374
-
1375
- **The Right Approach**: Use the framework's API methods instead of embedding XML:
1376
-
1377
- - Use `paragraph.addText()` multiple times for separate text runs
1378
- - Use formatting options: `{bold: true}`, `{italic: true}`, etc.
1379
- - Use `paragraph.addHyperlink()` for links
1380
- - Don't pass XML strings to text methods
1381
- - Don't try to embed `<w:t>` or other XML tags in your text
1382
-
1383
- For more details, see the [corruption detection examples](examples/troubleshooting/).
1384
-
1385
- ### Layout Conflicts (Massive Whitespace)
1386
-
1387
- **Problem**: Documents show massive whitespace between paragraphs when opened in Word, even though the XML looks correct.
1388
-
1389
- **Cause**: The `pageBreakBefore` property conflicting with `keepNext`/`keepLines` properties. When a paragraph has both `pageBreakBefore` and keep properties set to true, Word's layout engine tries to satisfy contradictory constraints (insert break vs. keep together), resulting in massive whitespace as it struggles to resolve the conflict.
1390
-
1391
- **Why This Causes Problems**:
1392
-
1393
- - `pageBreakBefore` tells Word to insert a page break before the paragraph
1394
- - `keepNext` tells Word to keep the paragraph with the next one (no break)
1395
- - `keepLines` tells Word to keep all lines together (no break)
1396
- - The combination creates layout conflicts that manifest as massive whitespace
1397
-
1398
- **Automatic Conflict Resolution** (v0.28.2+):
1399
-
1400
- The framework now automatically prevents these conflicts by **prioritizing keep properties over page breaks**:
1401
-
1402
- ```typescript
1403
- // When setting keepNext or keepLines, pageBreakBefore is automatically cleared
1404
- const para = new Paragraph()
1405
- .addText("Content")
1406
- .setPageBreakBefore(true) // Set to true
1407
- .setKeepNext(true); // Automatically clears pageBreakBefore
1408
-
1409
- // Result: keepNext=true, pageBreakBefore=false (conflict resolved)
1410
- ```
1411
-
1412
- **Why This Priority?**
1413
-
1414
- - Keep properties (`keepNext`/`keepLines`) represent explicit user intent to keep content together
1415
- - Page breaks are often layout hints that may conflict with document flow
1416
- - Removing `pageBreakBefore` eliminates whitespace while preserving the user's intention
1417
-
1418
- **Parsing Documents**:
1419
-
1420
- When loading existing DOCX files with conflicts, they are automatically resolved:
1421
-
1422
- ```typescript
1423
- // Load document with conflicts
1424
- const doc = await Document.load("document-with-conflicts.docx");
1425
-
1426
- // Conflicts are automatically resolved during parsing
1427
- // keepNext/keepLines take priority, pageBreakBefore is removed
1428
- ```
1429
-
1430
- **How It Works**:
1431
-
1432
- 1. When `setKeepNext(true)` is called, `pageBreakBefore` is automatically set to `false`
1433
- 2. When `setKeepLines(true)` is called, `pageBreakBefore` is automatically set to `false`
1434
- 3. When parsing documents, if both properties exist, `pageBreakBefore` is cleared
1435
- 4. Keep properties win because they represent explicit user intent
1436
-
1437
- **Manual Override**:
1438
-
1439
- If you need a page break despite keep properties, set it after:
1440
-
1441
- ```typescript
1442
- const para = new Paragraph()
1443
- .setKeepNext(true) // Set first
1444
- .setPageBreakBefore(true); // Override - you explicitly want this conflict
1445
-
1446
- // But note: This will cause layout issues (whitespace) in Word
1447
- ```
1448
-
1449
- ## Known Limitations
1450
-
1451
- While docXMLater provides comprehensive DOCX manipulation capabilities, there are some features that are not yet fully implemented:
1452
-
1453
- ### 1. Table Row Spanning with vMerge
1454
-
1455
- **Status:** FULLY IMPLEMENTED ✅
1456
-
1457
- **What Works:**
1458
- - Column spanning (horizontal cell merging) is fully supported
1459
- - Row spanning (vertical cell merging) is now fully implemented
1460
- - Both horizontal and vertical merging can be combined
1461
- - Uses Word's proper `vMerge` attribute ('restart' and 'continue')
1462
-
1463
- **Usage:**
1464
- ```typescript
1465
- // Merge cells horizontally (column spanning)
1466
- table.mergeCells(0, 0, 0, 2); // Merge columns 0-2 in row 0
1467
-
1468
- // Merge cells vertically (row spanning)
1469
- table.mergeCells(0, 0, 2, 0); // Merge rows 0-2 in column 0
1470
-
1471
- // Merge both horizontally and vertically (2x2 block)
1472
- table.mergeCells(0, 0, 1, 1); // Merge 2x2 block starting at (0,0)
1473
- ```
1474
-
1475
- ### 2. Structured Document Tags (SDT) Parsing
1476
-
1477
- **Status:** FULLY IMPLEMENTED ✅
1478
-
1479
- **What Works:**
1480
- - Complete SDT parsing from existing documents
1481
- - All 9 control types supported (richText, plainText, comboBox, dropDownList, datePicker, checkbox, picture, buildingBlock, group)
1482
- - SDT properties fully extracted (id, tag, lock, alias, controlType)
1483
- - Nested content parsing (paragraphs, tables, nested SDTs)
1484
- - Preserves element order using XMLParser's `_orderedChildren` metadata
1485
- - Round-trip operations fully supported
1486
-
1487
- **Control Types Supported:**
1488
- - **Rich Text** - Multi-formatted text content
1489
- - **Plain Text** - Simple text with optional multiLine support
1490
- - **Combo Box** - User-editable dropdown with list items
1491
- - **Dropdown List** - Fixed selection from list items
1492
- - **Date Picker** - Date selection with format and calendar type
1493
- - **Checkbox** - Boolean selection with custom checked/unchecked states
1494
- - **Picture** - Image content control
1495
- - **Building Block** - Gallery and category-based content
1496
- - **Group** - Grouping of other controls
1497
-
1498
- **Usage:**
1499
- ```typescript
1500
- // Load documents with SDTs - fully parsed
1501
- const doc = await Document.load('document-with-sdts.docx');
1502
-
1503
- // Access parsed SDT content
1504
- const sdts = doc.getBodyElements().filter(el => el instanceof StructuredDocumentTag);
1505
- for (const sdt of sdts) {
1506
- console.log('ID:', sdt.getId());
1507
- console.log('Tag:', sdt.getTag());
1508
- console.log('Type:', sdt.getControlType());
1509
- console.log('Content:', sdt.getContent());
1510
- }
1511
-
1512
- // Create new SDTs programmatically
1513
- const sdt = new StructuredDocumentTag({
1514
- id: 123456,
1515
- tag: 'MyControl',
1516
- controlType: 'richText',
1517
- alias: 'Rich Text Control'
1518
- });
1519
- sdt.addContent(paragraph);
1520
- ```
1521
-
1522
- All known limitations have been resolved! For feature requests or bug reports, please visit our [GitHub Issues](https://github.com/ItMeDiaTech/docXMLater/issues).
1523
-
1524
- ## Contributing
1525
-
1526
- Contributions are welcome! Please read our [Contributing Guide](CONTRIBUTING.md).
1527
-
1528
- 1. Fork the repository
1529
- 2. Create your feature branch (`git checkout -b feature/amazing-feature`)
1530
- 3. Commit changes (`git commit -m 'Add amazing feature'`)
1531
- 4. Push to branch (`git push origin feature/amazing-feature`)
1532
- 5. Open a Pull Request
1533
-
1534
- ## Recent Updates (v1.3.0)
1535
-
1536
- **Enhanced Document Parsing & Helper Functions:**
1537
-
1538
- - **TOC Parsing** - Parse Table of Contents from existing DOCX files
1539
- - Extract TOC field instructions with all switches (\h, \u, \z, \n, \o, \t)
1540
- - Detect SDT wrappers with `docPartGallery="Table of Contents"`
1541
- - Create `TableOfContentsElement` objects from parsed TOCs
1542
- - Support for modifying TOC field instructions in loaded documents
1543
- - **removeAllHeadersFooters() Helper** - New document helper function
1544
- - Removes all headers and footers from the document
1545
- - Deletes header/footer XML files and relationships
1546
- - Returns count of removed headers/footers
1547
- - **Enhanced Test Suite** - 1,119/1,150 tests passing (97.3% pass rate)
1548
- - **Documentation Updates** - Complete API reference for new helper functions
1549
-
1550
- **Previous Enhancements (v1.2.0):**
1551
- - 5 advanced document helper functions
1552
- - Enhanced document modification capabilities
1553
- - Improved paragraph and table wrapping utilities
1554
-
1555
- ## License
1556
-
1557
- MIT © DiaTech
1558
-
1559
- ## Acknowledgments
1560
-
1561
- - Built with [JSZip](https://stuk.github.io/jszip/) for ZIP handling
1562
- - Follows [ECMA-376](https://www.ecma-international.org/publications-and-standards/standards/ecma-376/) Office Open XML standard
1563
- - Inspired by [python-docx](https://python-docx.readthedocs.io/) and [docx](https://github.com/dolanmiu/docx)
1564
-
1565
- ## Support
1566
-
1567
- - **Documentation**: [Full Docs](https://github.com/ItMeDiaTech/docXMLater/tree/main/docs)
1568
- - **Examples**: [Example Code](https://github.com/ItMeDiaTech/docXMLater/tree/main/examples)
1569
- - **Issues**: [GitHub Issues](https://github.com/ItMeDiaTech/docXMLater/issues)
1570
- - **Discussions**: [GitHub Discussions](https://github.com/ItMeDiaTech/docXMLater/discussions)
1571
-
1572
- ## Quick Links
1573
-
1574
- - [NPM Package](https://www.npmjs.com/package/docxmlater)
1575
- - [GitHub Repository](https://github.com/ItMeDiaTech/docXMLater)
1576
- - [API Reference](https://github.com/ItMeDiaTech/docXMLater/tree/main/docs/api)
1577
- - [Change Log](https://github.com/ItMeDiaTech/docXMLater/blob/main/CHANGELOG.md)
1578
-
1579
- ---
1580
-
1581
- **Ready to create amazing Word documents?** Start with our [examples](https://github.com/ItMeDiaTech/docXMLater/tree/main/examples) or dive into the [API Reference](#complete-api-reference) above!
1
+ # DOCX Header Line Break Processor
2
+
3
+ A TypeScript utility using the docXMLater framework to automatically insert line breaks after Header 2 elements within 1x1 tables in Microsoft Word documents.
4
+
5
+ ## Understanding Bullet Points in DOCX/XML
6
+
7
+ ### Structure Overview
8
+
9
+ Bullet points in DOCX files involve two main components:
10
+
11
+ 1. **Numbering Definitions** (`word/numbering.xml`)
12
+ ```xml
13
+ <w:abstractNum w:abstractNumId="1">
14
+ <w:lvl w:ilvl="0">
15
+ <w:numFmt w:val="bullet"/>
16
+ <w:lvlText w:val="•"/>
17
+ <w:lvlJc w:val="left"/>
18
+ </w:lvl>
19
+ </w:abstractNum>
20
+ ```
21
+
22
+ 2. **Paragraph References** (`word/document.xml`)
23
+ ```xml
24
+ <w:p>
25
+ <w:pPr>
26
+ <w:numPr>
27
+ <w:ilvl w:val="0"/>
28
+ <w:numId w:val="1"/>
29
+ </w:numPr>
30
+ </w:pPr>
31
+ <w:r>
32
+ <w:t>Bullet point text</w:t>
33
+ </w:r>
34
+ </w:p>
35
+ ```
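The paragraph reference shown above can be produced from code. The following is a minimal sketch, not part of the docXMLater API: `escapeXml` and `bulletParagraph` are hypothetical helper names, and the `numId` passed in is assumed to match a `<w:num>` entry that already exists in `word/numbering.xml`.

```typescript
// Hypothetical helpers for illustration only; not part of docXMLater.

// Escape the characters that are significant inside XML text nodes.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build a <w:p> that references a numbering definition by numId and level.
// Assumption: the numId is already defined in word/numbering.xml.
function bulletParagraph(text: string, numId: number, level: number = 0): string {
  return (
    "<w:p>" +
    "<w:pPr><w:numPr>" +
    `<w:ilvl w:val="${level}"/>` +
    `<w:numId w:val="${numId}"/>` +
    "</w:numPr></w:pPr>" +
    `<w:r><w:t>${escapeXml(text)}</w:t></w:r>` +
    "</w:p>"
  );
}

const xml = bulletParagraph("Costs < budget", 1);
console.log(xml);
```

Escaping the text keeps characters such as `<` from being read as markup when the paragraph is written into `word/document.xml`.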
36
+
37
+ ### Common Windows Bullet Symbols
38
+
39
+ | Symbol | Unicode | Name | Usage |
40
+ |--------|---------|------|-------|
41
+ | • | U+2022 | Bullet | Default bullet |
42
+ | ○ | U+25CB | White Circle | Secondary level |
43
+ | ▪ | U+25AA | Black Square | Tertiary level |
44
+ | ▫ | U+25AB | White Square | Alternative |
45
+ | ◆ | U+25C6 | Black Diamond | Emphasis |
46
+ | ➢ | U+27A2 | Arrow | Direction/action |
47
+ | ✓ | U+2713 | Check Mark | Completed items |
48
+
49
+ ## Features
50
+
51
+ - Detects Header 2 (Heading2 style) within 1x1 tables
52
+ - Checks for existing line breaks between table and next element
53
+ - ✅ Inserts line break only when needed
54
+ - ✅ Preserves document structure and formatting
55
+ - Supports both low-level XML manipulation and high-level API
56
+
57
+ ## Installation
58
+
59
+ ```bash
60
+ # Clone or create the project
61
+ mkdir docx-processor
62
+ cd docx-processor
63
+
64
+ # Install dependencies
65
+ npm install docxmlater jszip
66
+ npm install -D typescript ts-node @types/node
67
+
68
+ # Or using the provided package.json
69
+ npm install
70
+ ```
71
+
72
+ ## Usage
73
+
74
+ ### Command Line
75
+
76
+ ```bash
77
+ # Basic usage
78
+ ts-node process-headers-in-tables.ts input.docx
79
+
80
+ # With custom output file
81
+ ts-node process-headers-in-tables.ts input.docx output.docx
82
+
83
+ # Verbose mode
84
+ ts-node process-headers-in-tables.ts input.docx output.docx --verbose
85
+ ```
86
+
87
+ ### As a Module
88
+
89
+ ```typescript
90
+ import { HeaderTableProcessor } from './process-headers-in-tables';
91
+
92
+ const processor = new HeaderTableProcessor({
93
+ inputFile: 'document.docx',
94
+ outputFile: 'processed.docx',
95
+ verbose: true
96
+ });
97
+
98
+ await processor.process();
99
+ ```
100
+
101
+ ## How It Works
102
+
103
+ ### Detection Logic
104
+
105
+ 1. **Table Identification**: Finds all `<w:tbl>` elements
106
+ 2. **1x1 Verification**: Counts rows (`<w:tr>`) and cells (`<w:tc>`)
107
+ 3. **Header 2 Check**: Looks for `<w:pStyle w:val="Heading2">`
108
+ 4. **Gap Analysis**: Examines content after table for existing breaks
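The first three detection steps can be sketched with plain string matching. This is an illustrative sketch under stated assumptions, not the processor's actual implementation: `inspectTable` and `findQualifyingTables` are hypothetical names, and the regex-based scan assumes tables are not nested (a nested `<w:tbl>` would end the match early).

```typescript
// Hypothetical sketch of the detection steps; not the processor's real code.
interface TableCheck {
  rows: number;
  cells: number;
  hasHeading2: boolean;
}

function inspectTable(tableXml: string): TableCheck {
  // \b keeps <w:trPr> and <w:tcPr> from being counted as rows or cells.
  const rows = (tableXml.match(/<w:tr\b/g) ?? []).length;
  const cells = (tableXml.match(/<w:tc\b/g) ?? []).length;
  const hasHeading2 = /<w:pStyle\s+w:val="Heading2"\s*\/>/.test(tableXml);
  return { rows, cells, hasHeading2 };
}

// Return the XML of every 1x1 table that contains a Heading2 paragraph.
// Assumption: no nested tables inside the matched span.
function findQualifyingTables(documentXml: string): string[] {
  const tables = documentXml.match(/<w:tbl\b[\s\S]*?<\/w:tbl>/g) ?? [];
  return tables.filter((table) => {
    const check = inspectTable(table);
    return check.rows === 1 && check.cells === 1 && check.hasHeading2;
  });
}
```

A real XML parser is more robust than regular expressions; the string approach is used here only to keep the example self-contained.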
109
+
110
+ ### Line Break Insertion
111
+
112
+ The processor inserts an empty paragraph when:
113
+ - Table is exactly 1x1
114
+ - Contains Header 2 style
115
+ - No line break exists after table
116
+ - Next element is not a section break
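The last two conditions above can be sketched as a check on the XML that follows a qualifying table. This is a hedged illustration: `needsBreakAfter` and `insertBreakAfterTable` are hypothetical names, an "existing line break" is assumed to mean a paragraph with no runs, and a section break is assumed to appear as a directly following `<w:sectPr>` element.

```typescript
// Hypothetical sketch of the gap analysis; assumptions are noted inline.
const EMPTY_PARAGRAPH =
  '<w:p><w:pPr><w:spacing w:after="0" w:before="0" w:line="240" w:lineRule="auto"/></w:pPr></w:p>';

function needsBreakAfter(xmlAfterTable: string): boolean {
  const rest = xmlAfterTable.replace(/^\s+/, "");
  // Assumption: a section break shows up as a directly following <w:sectPr>.
  if (/^<w:sectPr\b/.test(rest)) return false;
  // Match the next paragraph, either self-closing or with a closing tag.
  const next = rest.match(/^<w:p\b[^>]*\/>|^<w:p\b[\s\S]*?<\/w:p>/);
  // Assumption: a paragraph without any <w:r> run already acts as the break.
  if (next && !/<w:r\b/.test(next[0])) return false;
  return true;
}

// tableEnd is the index just past the table's closing </w:tbl> tag.
function insertBreakAfterTable(documentXml: string, tableEnd: number): string {
  const before = documentXml.slice(0, tableEnd);
  const after = documentXml.slice(tableEnd);
  return needsBreakAfter(after) ? before + EMPTY_PARAGRAPH + after : documentXml;
}
```

Returning the input unchanged when no break is needed makes the operation safe to run more than once on the same document.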
117
+
118
+ ### XML Structure Added
119
+
120
+ ```xml
121
+ <!-- Empty paragraph for line break -->
122
+ <w:p w:rsidR="00AB12CD" w:rsidRDefault="00AB12CD">
123
+ <w:pPr>
124
+ <w:spacing w:after="0" w:before="0" w:line="240" w:lineRule="auto"/>
125
+ </w:pPr>
126
+ </w:p>
127
+ ```
128
+
129
+ ## Architecture
130
+
131
+ ### Low-Level API (ZipHandler)
132
+ - Direct XML manipulation
133
+ - Full control over document structure
134
+ - Best for complex transformations
135
+
136
+ ### High-Level API (Document)
137
+ - Object-oriented approach
138
+ - Type-safe operations
139
+ - Simpler for basic edits
140
+
141
+ ## Examples
142
+
143
+ ### Example 1: Processing Multiple Files
144
+
145
+ ```typescript
146
+ const files = ['doc1.docx', 'doc2.docx', 'doc3.docx'];
147
+
148
+ for (const file of files) {
149
+ const processor = new HeaderTableProcessor({
150
+ inputFile: file,
151
+ verbose: false
152
+ });
153
+ await processor.process();
154
+ console.log(`Processed: ${file}`);
155
+ }
156
+ ```
157
+
158
+ ### Example 2: Custom Processing Logic
159
+
160
+ ```typescript
161
+ class CustomProcessor extends HeaderTableProcessor {
162
+ protected createEmptyParagraph(): string {
163
+ // Custom spacing or formatting
164
+ return `<w:p>
165
+ <w:pPr>
166
+ <w:spacing w:after="200" w:before="100"/>
167
+ </w:pPr>
168
+ </w:p>`;
169
+ }
170
+ }
171
+ ```
172
+
173
+ ## Troubleshooting
174
+
175
+ ### Document Won't Open
176
+ - Validate XML syntax
177
+ - Check for unclosed tags
178
+ - Verify RSID format (8 hex characters)
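The RSID check can be automated with a small validator. The sketch below assumes only the 8-hex-character format stated above; `isValidRsid` and `generateRsid` are hypothetical helper names, not docXMLater functions.

```typescript
// Hypothetical helpers for illustration; not part of docXMLater.

// An RSID attribute value must be exactly 8 hexadecimal characters.
function isValidRsid(value: string): boolean {
  return /^[0-9A-Fa-f]{8}$/.test(value);
}

// Produce a random 8-character uppercase hex string, left-padded with zeros.
function generateRsid(): string {
  const n = Math.floor(Math.random() * 0x100000000);
  return ("00000000" + n.toString(16).toUpperCase()).slice(-8);
}

console.log(isValidRsid("00AB12CD"));
console.log(isValidRsid("AB12CD"));
```

Running such a check over every `w:rsidR` and `w:rsidRDefault` value before saving can catch malformed attributes early.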
179
+
180
+ ### Line Breaks Not Appearing
181
+ - Confirm Header 2 style name matches exactly
182
+ - Check table structure (must be 1x1)
183
+ - Verify output file is being saved
184
+
185
+ ### Performance Issues
186
+ - Use buffer operations for large files
187
+ - Process in batches for multiple documents
188
+ - Consider streaming for files > 10MB
189
+
190
+ ## Testing
191
+
192
+ Create a test document with:
193
+ 1. Regular paragraphs
194
+ 2. 1x1 table with Header 2
195
+ 3. 2x2 table with Header 2 (should be ignored)
196
+ 4. 1x1 table without Header 2 (should be ignored)
197
+
198
+ Run the processor and verify only the 1x1 table with Header 2 gets a line break.
199
+
200
+ ## Dependencies
201
+
202
+ - `docxmlater` (docXMLater framework) - TypeScript DOCX manipulation
203
+ - `jszip` - ZIP file handling
204
+ - `typescript` - TypeScript compiler
205
+ - `ts-node` - TypeScript execution
206
+
207
+ ## License
208
+
209
+ MIT
210
+
211
+ ## Contributing
212
+
213
+ 1. Fork the repository
214
+ 2. Create your feature branch
215
+ 3. Test your changes
216
+ 4. Submit a pull request
217
+
218
+ ## Notes
219
+
220
+ - The docXMLater framework is installed from the npm package `docxmlater`
220
+ - Repository: https://github.com/ItMeDiaTech/docXMLater
222
+ - This implementation uses low-level ZIP/XML manipulation for precise control
223
+ - Generated RSIDs (revision save IDs) follow Word's 8-hex-digit format, which Word uses to identify editing sessions when comparing or merging documents