xml-twig 1.3.17 → 1.4.0

package/README.md CHANGED
@@ -18,22 +18,21 @@ When you need to read a XML file, then you have two principles:
18
18
  This module tries to combine both principles. The XML document can be read in chunks and within a chunk you have all the nice features and functions you know from a DOM based parser.
19
19
 
20
20
  ## Dependencies
21
- XML documents are read either with [sax](https://www.npmjs.com/package/sax), [node-expat](https://www.npmjs.com/package/node-expat) or [saxophone](https://www.npmjs.com/package/saxophone) parser. More parser may be added in future releases. By default the `sax` parser is used. However, I clearly recommend using the `node-expat` parser. All other parsers I tested, are not compliant to XML standards.
21
+ XML documents are read either with [sax](https://www.npmjs.com/package/sax) or [node-expat](https://www.npmjs.com/package/node-expat) parser. More parser may be added in future releases. By default the `sax` parser is used. However, I clearly recommend using the `node-expat` parser. All other parsers I tested, are not compliant to XML standards.
22
22
 
23
- **NOTE: The `node-expat` and `saxophone` modules are not automatically installed with this module. Install the parser by yourself, if you like to use it**
23
+ **NOTE: The `node-expat` module is not automatically installed with this module. Install the parser by yourself, if you like to use it**
24
24
 
25
25
  ## Installation
26
26
 
27
- Install module like any other node module and optionally `node-expat` and/or `saxophone`:
27
+ Install module like any other node module and optionally `node-expat`:
28
28
  ```bash
29
29
  npm install xml-twig
30
30
 
31
31
  # and optionally
32
32
  npm install node-expat
33
- npm install saxophone
34
33
 
35
34
  ```
36
- In my tests I parsed a 900 MB big XML file, the `node-expat` is faster than `sax` (node-expat: around 2:30 Minutes, sax: around 3:40 Minutes). However, you may run into problems when you try to install the `node-expat` parser. That's the reason why `node-expat` parser is not installed automatically. `saxophone` is even a little faster (around 2:10 Minutes) than `node-expat`.
35
+ In my tests I parsed a 900 MB big XML file, the `node-expat` is faster than `sax` (node-expat: around 2:30 Minutes, sax: around 3:40 Minutes). However, you may run into problems when you try to install the `node-expat` parser. That's the reason why `node-expat` parser is not installed automatically.
37
36
 
38
37
  ## How to use it
39
38
 
@@ -388,10 +387,9 @@ Accessing Twig-Elements by [XML-Path](https://www.w3.org/TR/xpath/) language is
388
387
 
389
388
  As already mentioned above, I recommend the `expat` parser. The other parser may work for your purpose, however they have several limitations and bugs:
390
389
 
391
- - `sax` and `saxophone` do not support UTF-16 encoding. I did not test other encodings, because [W3C Recommendations](https://www.w3.org/TR/xml/#charencoding) defines only UTF-8 and UTF-16 as required
390
+ - `sax` does not support UTF-16 encoding. I did not test other encodings, because [W3C Recommendations](https://www.w3.org/TR/xml/#charencoding) defines only UTF-8 and UTF-16 as required
392
391
  - `sax` misinterpret character entities
393
- - `saxophone` fails on `<!DOCTYPE>` element
394
- - Properties `currentLine` and `currentColumn` are not available with `saxophone`
392
+
395
393
 
396
394
 
397
395
 
package/doc/twig.md CHANGED
@@ -83,7 +83,7 @@ You can specify a <code>function</code> or a <code>event</code> name</p>
83
83
  <dt><a href="#ElementConditionFilter">ElementConditionFilter</a> ⇒ <code>boolean</code></dt>
84
84
  <dd><p>Custom filter function to select desired elements</p>
85
85
  </dd>
86
- <dt><a href="#Parser">Parser</a> ⇒ <code><a href="https://www.npmjs.com/package/sax">sax</a></code> | <code><a href="https://www.npmjs.com/package/node-expat">node-expat</a></code> | <code><a href="https://www.npmjs.com/package/saxophone">saxophone</a></code></dt>
86
+ <dt><a href="#Parser">Parser</a> ⇒ <code><a href="https://www.npmjs.com/package/sax">sax</a></code> | <code><a href="https://www.npmjs.com/package/node-expat">node-expat</a></code></dt>
87
87
  <dd></dd>
88
88
  <dt><a href="#AttributeCondition">AttributeCondition</a> : <code>string</code> | <code>RegExp</code> | <code><a href="#AttributeConditionFilter">AttributeConditionFilter</a></code></dt>
89
89
  <dd><p>Optional condition to get attributes<br> </p>
@@ -133,6 +133,7 @@ You can specify a <code>function</code> or a <code>event</code> name</p>
133
133
  * [.isRoot](#Twig+isRoot) ⇒ <code>boolean</code>
134
134
  * [.hasChildren](#Twig+hasChildren) ⇒ <code>boolean</code>
135
135
  * [.index](#Twig+index) ⇒ <code>number</code>
136
+ * [.path](#Twig+path) ⇒ <code>string</code>
136
137
  * [.name](#Twig+name) ⇒ <code>string</code>
137
138
  * [.tag](#Twig+tag) ⇒ <code>string</code>
138
139
  * [.text](#Twig+text) ⇒ <code>string</code>
@@ -299,6 +300,13 @@ The position in `#children` array. For root object 0
299
300
 
300
301
  **Kind**: instance property of [<code>Twig</code>](#Twig)
301
302
  **Returns**: <code>number</code> - Position of element in parent
303
+ <a name="Twig+path"></a>
304
+
305
+ ### twig.path ⇒ <code>string</code>
306
+ The X-Path position of the element
307
+
308
+ **Kind**: instance property of [<code>Twig</code>](#Twig)
309
+ **Returns**: <code>string</code> - X-Path
302
310
  <a name="Twig+name"></a>
303
311
 
304
312
  ### twig.name ⇒ <code>string</code>
@@ -758,6 +766,7 @@ Common function to filter Twig element
758
766
  * [.isRoot](#Twig+isRoot) ⇒ <code>boolean</code>
759
767
  * [.hasChildren](#Twig+hasChildren) ⇒ <code>boolean</code>
760
768
  * [.index](#Twig+index) ⇒ <code>number</code>
769
+ * [.path](#Twig+path) ⇒ <code>string</code>
761
770
  * [.name](#Twig+name) ⇒ <code>string</code>
762
771
  * [.tag](#Twig+tag) ⇒ <code>string</code>
763
772
  * [.text](#Twig+text) ⇒ <code>string</code>
@@ -924,6 +933,13 @@ The position in `#children` array. For root object 0
924
933
 
925
934
  **Kind**: instance property of [<code>Twig</code>](#Twig)
926
935
  **Returns**: <code>number</code> - Position of element in parent
936
+ <a name="Twig+path"></a>
937
+
938
+ ### twig.path ⇒ <code>string</code>
939
+ The X-Path position of the element
940
+
941
+ **Kind**: instance property of [<code>Twig</code>](#Twig)
942
+ **Returns**: <code>string</code> - X-Path
927
943
  <a name="Twig+name"></a>
928
944
 
929
945
  ### twig.name ⇒ <code>string</code>
@@ -1484,7 +1500,7 @@ Optional settings for the Twig parser
1484
1500
 
1485
1501
  | Name | Type | Description |
1486
1502
  | --- | --- | --- |
1487
- | [method] | <code>&#x27;sax&#x27;</code> \| <code>&#x27;expat&#x27;</code> \| <code>&#x27;saxophone&#x27;</code> | The underlying parser. Either `'sax'`, `'expat'` or `'saxophone'`. |
1503
+ | [method] | <code>&#x27;sax&#x27;</code> \| <code>&#x27;expat&#x27;</code> | The underlying parser. Either `'sax'`, `'expat'`. |
1488
1504
  | [xmlns] | <code>boolean</code> | If `true`, then namespaces are accessible by `namespace` property. |
1489
1505
  | [trim] | <code>boolean</code> | If `true`, then turn any whitespace into a single space. Text and comments are trimmed. |
1490
1506
  | [resumeAfterError] | <code>boolean</code> | If `true` then parser continues reading after an error. Otherwise it throws exception. |
@@ -1558,15 +1574,15 @@ Custom filter function to select desired elements
1558
1574
 
1559
1575
  <a name="Parser"></a>
1560
1576
 
1561
- ## Parser ⇒ [<code>sax</code>](https://www.npmjs.com/package/sax) \| [<code>node-expat</code>](https://www.npmjs.com/package/node-expat) \| [<code>saxophone</code>](https://www.npmjs.com/package/saxophone)
1577
+ ## Parser ⇒ [<code>sax</code>](https://www.npmjs.com/package/sax) \| [<code>node-expat</code>](https://www.npmjs.com/package/node-expat)
1562
1578
  **Kind**: global typedef
1563
- **Returns**: [<code>sax</code>](https://www.npmjs.com/package/sax) \| [<code>node-expat</code>](https://www.npmjs.com/package/node-expat) \| [<code>saxophone</code>](https://www.npmjs.com/package/saxophone) - The parser Object
1579
+ **Returns**: [<code>sax</code>](https://www.npmjs.com/package/sax) \| [<code>node-expat</code>](https://www.npmjs.com/package/node-expat) - The parser Object
1564
1580
  **Properties**
1565
1581
 
1566
1582
  | Name | Type | Description |
1567
1583
  | --- | --- | --- |
1568
- | [currentLine] | <code>number</code> | The currently processed line in the XML-File.<br/>Not available on `saxophone` parser. |
1569
- | [currentColumn] | <code>number</code> | The currently processed column in the XML-File.<br/>Not available on `saxophone` parser. |
1584
+ | [currentLine] | <code>number</code> | The currently processed line in the XML-File. |
1585
+ | [currentColumn] | <code>number</code> | The currently processed column in the XML-File. |
1570
1586
 
1571
1587
  <a name="AttributeCondition"></a>
1572
1588
 
package/package.json CHANGED
@@ -5,7 +5,7 @@
5
5
  },
6
6
  "name": "xml-twig",
7
7
  "description": "Node module for processing huge XML documents in tree mode",
8
- "version": "1.3.17",
8
+ "version": "1.4.0",
9
9
  "main": "twig.js",
10
10
  "directories": {
11
11
  "doc": "doc"
@@ -13,8 +13,7 @@
13
13
  "devDependencies": {
14
14
  "jsdoc-to-markdown": "^9.0.0",
15
15
  "luxon": "^3.5.0",
16
- "node-expat": "^2.4.1",
17
- "saxophone": "^0.8.0"
16
+ "node-expat": "^2.4.1"
18
17
  },
19
18
  "scripts": {
20
19
  "test": "node demo.js"
@@ -31,7 +31,7 @@ async function parse(method) {
31
31
 
32
32
  const main = async () => {
33
33
 
34
- for (let method of ["sax", "expat", "saxophone"]) {
34
+ for (let method of ["sax", "expat"]) {
35
35
  console.log(`Running with ${method}...`);
36
36
  printNE = true;
37
37
  for (let i = 0; i <= 5; i++)
@@ -77,21 +77,6 @@ Finished with expat in 02:33.270
77
77
  Finished with expat in 02:33.231
78
78
  Finished with expat in 02:38.269
79
79
 
80
- Running with saxophone...
81
- 25 NE in 00:12.874
82
- 50 NE in 00:27.736
83
- 75 NE in 00:41.591
84
- 100 NE in 00:58.430
85
- 125 NE in 01:20.685
86
- 150 NE in 01:43.568
87
- 175 NE in 01:58.438
88
- Finished with saxophone in 02:11.667
89
- Finished with saxophone in 02:07.623
90
- Finished with saxophone in 02:09.538
91
- Finished with saxophone in 02:09.965
92
- Finished with saxophone in 02:12.81
93
- Finished with saxophone in 02:09.792
94
-
95
80
 
96
81
  Good old Perl XML::Twig
97
82
  25 NE in 1:14
package/test.js CHANGED
@@ -35,14 +35,14 @@ function piHandler(elt) {
35
35
  const main = async () => {
36
36
 
37
37
  for (let file of ["bookstore", "breakfast-menu"]) {
38
- for (let method of ["sax", "expat", "saxophone"])
38
+ for (let method of ["sax", "expat"])
39
39
  await parse(`${__dirname}/samples/${file}.xml`, { tag: twig.Any, function: anyHandler }, { method: method });
40
40
  }
41
41
 
42
- for (let method of ["sax", "expat", "saxophone"])
42
+ for (let method of ["sax", "expat"])
43
43
  await parse(`${__dirname}/samples/xmlns.xml`, { tag: twig.Any, function: nsHandler }, { method: method, xmlns: true });
44
44
 
45
- for (let method of ["sax", "expat", "saxophone"])
45
+ for (let method of ["sax", "expat"])
46
46
  await parse(`${__dirname}/samples/processingInstruction.xml`, { tag: twig.Root, function: piHandler }, { method: method });
47
47
 
48
48
  }
package/twig.js CHANGED
@@ -1,6 +1,5 @@
1
1
  const SAX = 'sax';
2
2
  const EXPAT = ['expat', 'node-expat'];
3
- const SAXOPHONE = 'saxophone';
4
3
 
5
4
  let tree;
6
5
  let current;
@@ -20,13 +19,6 @@ let current;
20
19
  * @see {@link https://www.npmjs.com/package/node-expat|node-expat}
21
20
  */
22
21
 
23
- /**
24
- * @external saxophone
25
- * @see {@link https://www.npmjs.com/package/saxophone|saxophone}
26
- * @see {@link https://www.npmjs.com/package/@alexbosworth/saxophone|@alexbosworth/saxophone}
27
- * @see {@link https://www.npmjs.com/package/@pirxpilot/saxophone|@pirxpilot/saxophone}
28
- */
29
-
30
22
  /**
31
23
  * @external libxmljs
32
24
  * Though module looks promising, it is not implemented, because it does not support Streams.
@@ -66,7 +58,7 @@ const Any = new AnyHandler();
66
58
  /**
67
59
  * Optional settings for the Twig parser
68
60
  * @typedef ParserOptions
69
- * @property {'sax' | 'expat' | 'saxophone'} [method] - The underlying parser. Either `'sax'`, `'expat'` or `'saxophone'`.
61
+ * @property {'sax' | 'expat'} [method] - The underlying parser. Either `'sax'`, `'expat'`.
70
62
  * @property {boolean} [xmlns] - If `true`, then namespaces are accessible by `namespace` property.
71
63
  * @property {boolean} [trim] - If `true`, then turn any whitespace into a single space. Text and comments are trimmed.
72
64
  * @property {boolean} [resumeAfterError] - If `true` then parser continues reading after an error. Otherwise it throws exception.
@@ -129,9 +121,9 @@ const Any = new AnyHandler();
129
121
 
130
122
  /**
131
123
  * @typedef Parser
132
- * @property {number} [currentLine] - The currently processed line in the XML-File.<br/>Not available on `saxophone` parser.
133
- * @property {number} [currentColumn] - The currently processed column in the XML-File.<br/>Not available on `saxophone` parser.
134
- * @returns {external:sax|external:node-expat|external:saxophone} The parser Object
124
+ * @property {number} [currentLine] - The currently processed line in the XML-File.
125
+ * @property {number} [currentColumn] - The currently processed column in the XML-File.
126
+ * @returns {external:sax|external:node-expat} The parser Object
135
127
  */
136
128
 
137
129
  /**
@@ -264,64 +256,6 @@ function createParser(handler, options = {}) {
264
256
  parser.emit("finish");
265
257
  });
266
258
 
267
- } else if (options.method === SAXOPHONE) {
268
- const Saxophone = require('saxophone');
269
- //const Saxophone = require('@alexbosworth/saxophone');
270
- //const Saxophone = require('@pirxpilot/saxophone');
271
- parser = new Saxophone();
272
-
273
- parser.on("tagclose", onClose.bind(null, handler, options));
274
- parser.on("tagopen", onStart.bind(null, {
275
- handler: Array.isArray(handler) ? handler : [handler],
276
- options: options,
277
- namespaces: namespaces,
278
- parser: parser,
279
- Saxophone: Saxophone
280
- }));
281
-
282
- parser.on("cdata", function (str) {
283
- current.text = options.trim ? str.contents.trim() : str.contents;
284
- });
285
-
286
- parser.on('processinginstruction', function (pi) {
287
- if (pi.contents.startsWith('xml ')) {
288
- let declaration = {};
289
- for (let item of pi.contents.split(' ')) {
290
- let [k, v] = item.split('=');
291
- if (k === 'xml') continue;
292
- declaration[k] = v.replaceAll('"', '').replaceAll("'", '');
293
- }
294
- tree = new Twig(null);
295
- Object.defineProperty(tree, 'declaration', {
296
- value: declaration,
297
- writable: false,
298
- enumerable: true
299
- });
300
- } else if (tree.PI === undefined) {
301
- let instruction = { body: {} };
302
- for (let item of pi.contents.split(' ')) {
303
- let [k, v] = item.split('=');
304
- if (v === undefined) {
305
- instruction.name = k;
306
- } else {
307
- instruction.body[k] = v.replaceAll('"', '').replaceAll("'", '');
308
- }
309
- }
310
- Object.defineProperty(tree, 'PI', {
311
- value: { target: instruction.name, data: instruction.body },
312
- writable: false,
313
- enumerable: true
314
- });
315
- }
316
-
317
- });
318
-
319
- parser.on('finish', function () {
320
- // saxophone parser does not emit 'end' Event
321
- tree = undefined;
322
- current = undefined;
323
- });
324
-
325
259
  } else {
326
260
  throw new UnsupportedParser(options.method);
327
261
  }
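
Note on the hunk above: with the `saxophone` branch removed, any `method` other than `'sax'` or one of the `EXPAT` aliases falls through to `UnsupportedParser`. The sketch below mirrors that selection logic in isolation. `resolveMethod` is a hypothetical helper, not a function in `twig.js`, and the `'sax'` default is an assumption taken from the README's statement that `sax` is used by default. The constants and the error message are copied from the diff.

```javascript
// Constants as they appear in twig.js 1.4.0 (see the first twig.js hunk).
const SAX = 'sax';
const EXPAT = ['expat', 'node-expat'];

// Same message as the 1.4.0 UnsupportedParser class in the diff.
class UnsupportedParser extends TypeError {
  constructor(t) {
    super(`Parser '${t}' is not supported. Use 'expat', 'sax' (default)`);
  }
}

// Hypothetical helper: maps a user-supplied `method` to a parser family.
// Assumption: an omitted method means 'sax', as the README describes.
function resolveMethod(method = SAX) {
  if (method === SAX) return 'sax';
  if (EXPAT.includes(method)) return 'expat';
  throw new UnsupportedParser(method);
}

console.log(resolveMethod());             // default parser family
console.log(resolveMethod('node-expat')); // alias of 'expat'

try {
  resolveMethod('saxophone');             // removed in 1.4.0
} catch (err) {
  console.log(err.message);
}
```

Code that still passes `{ method: 'saxophone' }` after upgrading should therefore expect a `TypeError` subclass rather than a silent fallback.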
@@ -335,16 +269,10 @@ function createParser(handler, options = {}) {
335
269
  // Common events
336
270
  parser.on('text', function (str) {
337
271
  if (current === undefined || current === null) return;
338
- if (options.method === SAXOPHONE) {
339
- current.text = options.trim ? str.contents.trim() : str.contents;
340
- } else {
341
- current.text = options.trim ? str.trim() : str;
342
- }
272
+ current.text = options.trim ? str.trim() : str;
343
273
  });
344
274
 
345
275
  parser.on("comment", function (str) {
346
- if (options.method === SAXOPHONE)
347
- str = str.contents;
348
276
  if (current.hasOwnProperty('comment')) {
349
277
  if (typeof current.comment === 'string') {
350
278
  current.comment = [current.comment, str.trim()];
@@ -362,14 +290,10 @@ function createParser(handler, options = {}) {
362
290
  });
363
291
 
364
292
  parser.on('error', function (err) {
365
- if (options.method === SAXOPHONE) {
366
- console.error(err);
367
- } else {
368
- console.error(`error at line [${parser.currentLine}], column [${parser.currentColumn}]`, err);
369
- if (options.resumeAfterError) {
370
- parser.underlyingParser.error = null;
371
- parser.underlyingParser.resume();
372
- }
293
+ console.error(`error at line [${parser.currentLine}], column [${parser.currentColumn}]`, err);
294
+ if (options.resumeAfterError) {
295
+ parser.underlyingParser.error = null;
296
+ parser.underlyingParser.resume();
373
297
  }
374
298
  });
375
299
 
@@ -390,9 +314,6 @@ function onStart(binds, node, attrs) {
390
314
  const options = binds.options;
391
315
  let namespaces = binds.namespaces;
392
316
 
393
- if (attrs === undefined && options.method === SAXOPHONE)
394
- attrs = binds.Saxophone.parseAttrs(node.attrs);
395
-
396
317
  let attrNS = {};
397
318
  if (options.xmlns && attrs !== undefined) {
398
319
  for (let key of Object.keys(attrs).filter(x => !(x.startsWith('xmlns:') && name.includes(':'))))
@@ -431,7 +352,7 @@ function onStart(binds, node, attrs) {
431
352
  }
432
353
 
433
354
  if (options.xmlns) {
434
- if (EXPAT.concat(SAXOPHONE).includes(options.method)) {
355
+ if (EXPAT.includes(options.method)) {
435
356
  for (let key of Object.keys(attrs).filter(x => x.startsWith('xmlns:')))
436
357
  namespaces[key.split(':')[1]] = attrs[key];
437
358
  }
@@ -446,8 +367,6 @@ function onStart(binds, node, attrs) {
446
367
  }
447
368
  }
448
369
  }
449
- if (options.method === SAXOPHONE && node.isSelfClosing)
450
- binds.parser.emit("tagclose", node);
451
370
  }
452
371
 
453
372
  /**
@@ -460,8 +379,6 @@ function onClose(handler, options, name) {
460
379
  current.close();
461
380
  let purge = true;
462
381
 
463
- if (options.method === SAXOPHONE)
464
- name = name.name;
465
382
  for (let hndl of Array.isArray(handler) ? handler : [handler]) {
466
383
  if (hndl.tag instanceof AnyHandler) {
467
384
  if (typeof hndl.function === 'function') hndl.function(current ?? tree);
@@ -710,6 +627,37 @@ class Twig {
710
627
  return this.isRoot ? 0 : this.#parent.#children.indexOf(this);
711
628
  }
712
629
 
630
+ /**
631
+ * The X-Path position of the element
632
+ * NOTE: Applies only to currently loaded elements.
633
+ * @returns {string} X-Path
634
+ */
635
+ get path() {
636
+ if (this.isRoot)
637
+ return `/${this.#name}`;
638
+
639
+ let ret = [];
640
+ if (this.#parent.children(this.#name).length > 1) {
641
+ let sameChildren = this.#parent.children(this.#name);
642
+ ret.unshift(`${this.#name}[${sameChildren.indexOf(this) + 1}]`);
643
+ } else {
644
+ ret.unshift(this.#name);
645
+ }
646
+ if (!this.isRoot) {
647
+ let parent = this.#parent;
648
+ while (!parent.isRoot) {
649
+ if (parent.#parent.children(parent.#name).length > 1) {
650
+ let sameChildren = parent.#parent.children(parent.#name);
651
+ ret.unshift(`${parent.#name}[${sameChildren.indexOf(parent) + 1}]`);
652
+ } else {
653
+ ret.unshift(parent.#name);
654
+ }
655
+ parent = parent.#parent;
656
+ }
657
+ }
658
+ return '/' + ret.join('/');
659
+ }
660
+
713
661
  /**
714
662
  * Returns the name of the element.
715
663
  * @returns {string} Element name
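
Note on the new `path` getter above: the following is a minimal, self-contained sketch of the same walk, using a hypothetical `Node` class in place of `Twig` (only `name`, `parent` and `children` are modelled, and `childrenNamed` stands in for `children(name)`). It follows the logic as shown in the hunk: a 1-based position is appended only when same-named siblings exist, and the loop stops once it reaches the root. Reading the hunk as written, the root's name therefore appears only in the root's own path. The sketch reproduces that behaviour and does not try to correct it.

```javascript
// Hypothetical stand-in for Twig; not the library's class.
class Node {
  constructor(name, parent = null) {
    this.name = name;
    this.parent = parent;
    this.children = [];
    if (parent) parent.children.push(this);
  }

  get isRoot() {
    return this.parent === null;
  }

  // Stand-in for Twig's children(name): direct children with the given name.
  childrenNamed(name) {
    return this.children.filter(c => c.name === name);
  }

  get path() {
    if (this.isRoot) return `/${this.name}`;

    const steps = [];
    // Walk upwards and stop before the root, as the getter in the diff does.
    for (let node = this; !node.isRoot; node = node.parent) {
      const same = node.parent.childrenNamed(node.name);
      steps.unshift(same.length > 1
        ? `${node.name}[${same.indexOf(node) + 1}]` // XPath positions are 1-based
        : node.name);
    }
    return '/' + steps.join('/');
  }
}

const root = new Node('bookstore');
const book1 = new Node('book', root);
const book2 = new Node('book', root);
const title = new Node('title', book2);

console.log(root.path);  // /bookstore
console.log(book1.path); // /book[1]
console.log(title.path); // /book[2]/title
```

As the JSDoc note in the hunk says, positions are computed from the elements currently loaded, so in chunked parsing an index reflects only the siblings still held in memory.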
@@ -1296,7 +1244,7 @@ class UnsupportedParser extends TypeError {
1296
1244
  * @param {string} t Parser type
1297
1245
  */
1298
1246
  constructor(t) {
1299
- super(`Parser '${t}' is not supported. Use 'expat', 'sax' (default) or 'saxophone'`);
1247
+ super(`Parser '${t}' is not supported. Use 'expat', 'sax' (default)`);
1300
1248
  }
1301
1249
  }
1302
1250