@rgrove/parse-xml 2.0.4 → 4.0.0
- package/LICENSE +1 -1
- package/README.md +84 -337
- package/dist/browser.js +774 -0
- package/dist/browser.js.map +7 -0
- package/dist/global.min.js +10 -0
- package/dist/global.min.js.map +7 -0
- package/dist/index.d.ts +24 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +50 -0
- package/dist/index.js.map +1 -0
- package/dist/lib/Parser.d.ts +218 -0
- package/dist/lib/Parser.d.ts.map +1 -0
- package/dist/lib/Parser.js +638 -0
- package/dist/lib/Parser.js.map +1 -0
- package/dist/lib/StringScanner.d.ts +97 -0
- package/dist/lib/StringScanner.d.ts.map +1 -0
- package/dist/lib/StringScanner.js +210 -0
- package/dist/lib/StringScanner.js.map +1 -0
- package/dist/lib/XmlCdata.d.ts +8 -0
- package/dist/lib/XmlCdata.d.ts.map +1 -0
- package/dist/lib/XmlCdata.js +15 -0
- package/dist/lib/XmlCdata.js.map +1 -0
- package/dist/lib/XmlComment.d.ts +16 -0
- package/dist/lib/XmlComment.d.ts.map +1 -0
- package/dist/lib/XmlComment.js +23 -0
- package/dist/lib/XmlComment.js.map +1 -0
- package/dist/lib/XmlDocument.d.ts +29 -0
- package/dist/lib/XmlDocument.d.ts.map +1 -0
- package/dist/lib/XmlDocument.js +47 -0
- package/dist/lib/XmlDocument.js.map +1 -0
- package/dist/lib/XmlElement.d.ts +40 -0
- package/dist/lib/XmlElement.d.ts.map +1 -0
- package/dist/lib/XmlElement.js +51 -0
- package/dist/lib/XmlElement.js.map +1 -0
- package/dist/lib/XmlNode.d.ts +74 -0
- package/dist/lib/XmlNode.d.ts.map +1 -0
- package/dist/lib/XmlNode.js +96 -0
- package/dist/lib/XmlNode.js.map +1 -0
- package/dist/lib/XmlProcessingInstruction.d.ts +22 -0
- package/dist/lib/XmlProcessingInstruction.d.ts.map +1 -0
- package/dist/lib/XmlProcessingInstruction.js +25 -0
- package/dist/lib/XmlProcessingInstruction.js.map +1 -0
- package/dist/lib/XmlText.d.ts +16 -0
- package/dist/lib/XmlText.d.ts.map +1 -0
- package/dist/lib/XmlText.js +23 -0
- package/dist/lib/XmlText.js.map +1 -0
- package/dist/lib/syntax.d.ts +69 -0
- package/dist/lib/syntax.d.ts.map +1 -0
- package/dist/lib/syntax.js +133 -0
- package/dist/lib/syntax.js.map +1 -0
- package/dist/lib/types.d.ts +5 -0
- package/dist/lib/types.d.ts.map +1 -0
- package/dist/lib/types.js +3 -0
- package/dist/lib/types.js.map +1 -0
- package/package.json +36 -22
- package/src/index.ts +30 -0
- package/src/lib/Parser.ts +819 -0
- package/src/lib/StringScanner.ts +254 -0
- package/src/lib/XmlCdata.ts +11 -0
- package/src/lib/XmlComment.ts +26 -0
- package/src/lib/XmlDocument.ts +57 -0
- package/src/lib/XmlElement.ts +81 -0
- package/src/lib/XmlNode.ts +107 -0
- package/src/lib/XmlProcessingInstruction.ts +35 -0
- package/src/lib/XmlText.ts +26 -0
- package/src/lib/syntax.ts +136 -0
- package/src/lib/types.ts +2 -0
- package/CHANGELOG.md +0 -89
- package/dist/commonjs/index.js +0 -434
- package/dist/commonjs/lib/syntax.js +0 -262
- package/dist/umd/parse-xml.min.js +0 -1
- package/src/index.js +0 -451
- package/src/lib/syntax.js +0 -263
package/LICENSE
CHANGED
@@ -1,6 +1,6 @@
 ISC License
 
-Copyright
+Copyright Ryan Grove <ryan@wonko.com>
 
 Permission to use, copy, modify, and/or distribute this software for any purpose
 with or without fee is hereby granted, provided that the above copyright notice
package/README.md
CHANGED
@@ -2,28 +2,13 @@
 
 A fast, safe, compliant XML parser for Node.js and browsers.
 
-[](https://badge.fury.io/js/%40rgrove%2Fparse-xml)
-
-
-
-
-
--
-- [Features](#features)
-- [Not Features](#not-features)
-- [Examples](#examples)
-- [Basic Usage](#basic-usage)
-- [Friendly Errors](#friendly-errors)
-- [API](#api)
-- [Nodes](#nodes)
-- [`cdata`](#cdata)
-- [`comment`](#comment)
-- [`document`](#document)
-- [`element`](#element)
-- [`text`](#text)
-- [Why another XML parser?](#why-another-xml-parser)
-- [Benchmark](#benchmark)
-- [License](#license)
+[](https://badge.fury.io/js/%40rgrove%2Fparse-xml) [](https://bundlephobia.com/result?p=@rgrove/parse-xml) [](https://github.com/rgrove/parse-xml/actions/workflows/ci.yml)
+
+## Links
+
+- [API Docs](https://rgrove.github.io/parse-xml/)
+- [GitHub](https://github.com/rgrove/parse-xml)
+- [npm](https://www.npmjs.com/package/@rgrove/parse-xml)
 
 ## Installation
 
@@ -31,83 +16,84 @@ A fast, safe, compliant XML parser for Node.js and browsers.
 npm install @rgrove/parse-xml
 ```
 
-Or, if you like living dangerously, you can load [the minified
-in a browser via [Unpkg][] and use the `parseXml` global.
-
-[umd]:https://unpkg.com/@rgrove/parse-xml/dist/umd/parse-xml.min.js
-[Unpkg]:https://unpkg.com/
+Or, if you like living dangerously, you can load [the minified bundle](https://unpkg.com/@rgrove/parse-xml/dist/global.min.js) in a browser via [Unpkg](https://unpkg.com/) and use the `parseXml` global.
 
 ## Features
 
-- Returns
+- Returns a convenient [object tree](#basic-usage) representing an XML document.
 
-- Works great in Node.js
-browsers if you provide polyfills for `Object.assign()`, `Object.freeze()`,
-and `String.fromCodePoint()`.
+- Works great in Node.js and browsers.
 
-- Provides [helpful, detailed error messages](#friendly-errors) with context
-when a document is not well-formed.
+- Provides [helpful, detailed error messages](#friendly-errors) with context when a document is not well-formed.
 
-- Mostly conforms to [XML 1.0 (Fifth Edition)](https://www.w3.org/TR/2008/REC-xml-20081126/)
-as a non-validating parser (see [below](#not-features) for details).
+- Mostly conforms to [XML 1.0 (Fifth Edition)](https://www.w3.org/TR/2008/REC-xml-20081126/) as a non-validating parser (see [below](#not-features) for details).
 
 - Passes all relevant tests in the [XML Conformance Test Suite](https://www.w3.org/XML/Test/).
 
--
-and has no dependencies.
+- Written in TypeScript and compiled to ES2020 JavaScript for Node.js and ES2017 JavaScript for browsers. The browser build is also optimized for minification.
 
-
+- Extremely [fast](#benchmark) and surprisingly [small](https://bundlephobia.com/result?p=@rgrove/parse-xml).
 
-
-parts of the spec aren't very useful or aren't safe when the XML being parsed
-comes from an untrusted source. However, those parts of XML that _are_
-implemented behave as defined in the spec.
+- Zero dependencies.
 
-
-document tree:
+## Not Features
 
-
-- Document type definitions
-- Processing instructions
+This parser currently discards document type declarations (`<!DOCTYPE ... >`) and all their contents, because they're rarely useful and some of their features aren't safe when the XML being parsed comes from an untrusted source.
 
-In addition, the only supported character encoding is UTF-8.
+In addition, the only supported character encoding is UTF-8 because it's not feasible (or useful) to support other character encodings in JavaScript.
 
 ## Examples
 
 ### Basic Usage
 
+**ESM**
+
 ```js
-
+import { parseXml } from '@rgrove/parse-xml';
 parseXml('<kittens fuzzy="yes">I like fuzzy kittens.</kittens>');
 ```
 
-**
+**CommonJS**
+
+```js
+const { parseXml } = require('@rgrove/parse-xml');
+parseXml('<kittens fuzzy="yes">I like fuzzy kittens.</kittens>');
+```
+
+The result is an [`XmlDocument`](https://rgrove.github.io/parse-xml/classes/XmlDocument.html) instance containing the parsed document, with a structure that looks like this (some properties and methods are excluded for clarity; see the [API docs](https://rgrove.github.io/parse-xml/) for details):
 
 ```js
 {
-  type:
+  type: 'document',
   children: [
     {
-      type:
-      name:
+      type: 'element',
+      name: 'kittens',
       attributes: {
-        fuzzy:
+        fuzzy: 'yes'
       },
       children: [
         {
-          type:
-          text:
+          type: 'text',
+          text: 'I like fuzzy kittens.'
         }
-      ]
+      ],
+      parent: { ... },
+      isRootNode: true
     }
   ]
 }
 ```
 
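The tree shape shown in the README example above lends itself to a small recursive traversal. The following is a minimal sketch that works on a plain object literal mirroring the documented structure rather than on a real `XmlDocument`; `collectText` is a hypothetical helper for illustration and is not part of parse-xml.

```javascript
// Plain object mirroring the documented tree shape (not produced by parse-xml).
const doc = {
  type: 'document',
  children: [
    {
      type: 'element',
      name: 'kittens',
      attributes: { fuzzy: 'yes' },
      children: [{ type: 'text', text: 'I like fuzzy kittens.' }]
    }
  ]
};

// Hypothetical helper: concatenates the text of every `text` node, depth-first.
function collectText(node) {
  if (node.type === 'text') return node.text;
  // Nodes without children (or with an empty list) contribute nothing.
  return (node.children || []).map(collectText).join('');
}

console.log(collectText(doc));
```

Because every node carries a `type` and container nodes carry `children`, the same pattern should extend to counting elements or searching by `name`, assuming the real objects expose these properties as the README describes.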
+All parse-xml objects have `toJSON()` methods that return JSON-serializable objects, so you can easily convert an XML document to JSON:
+
+```js
+let json = JSON.stringify(parseXml(xml));
+```
+
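The `toJSON()` approach relies on standard JavaScript behavior: `JSON.stringify()` calls an object's `toJSON()` method when one exists, so a node can leave out its `parent` back-reference and avoid a circular-structure error. The sketch below illustrates that mechanism only; `SketchNode` is a hypothetical stand-in and not the library's implementation.

```javascript
// Hypothetical stand-in for a tree node; not the parse-xml implementation.
class SketchNode {
  constructor(type, props = {}) {
    this.type = type;
    this.parent = null;
    Object.assign(this, props);
  }

  // JSON.stringify() calls this method; omitting `parent` breaks the cycle.
  toJSON() {
    const { parent, ...rest } = this;
    return rest;
  }
}

const root = new SketchNode('element', { name: 'kittens', children: [] });
const text = new SketchNode('text', { text: 'I like fuzzy kittens.' });
text.parent = root; // the child points back at its parent, creating a cycle
root.children.push(text);

const json = JSON.stringify(root);
console.log(json);
```

Without the `toJSON()` method, serializing `root` would throw a `TypeError` because of the `parent` cycle.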
 ### Friendly Errors
 
-When something goes wrong, parse-xml throws an error that tells you exactly what
-happened and shows you where the problem is so you can fix it.
+When something goes wrong, parse-xml throws an error that tells you exactly what happened and shows you where the problem is so you can fix it.
 
 ```js
 parseXml('<foo><bar>baz</foo>');
@@ -137,329 +123,90 @@ In addition to a helpful message, error objects have the following properties:
 
 - **pos** _Number_
 
-Character position where the error occurred relative to the beginning of the
-input (0-based).
-
-## API
-
-### `parseXml(xml: string, options?: object) => object`
-
-Parses an XML document and returns an object tree.
-
-#### Options
-
-The following options may be provided as properties of the `options` argument:
-
-- **ignoreUndefinedEntities** _Boolean_ (default: `false`)
-
-When `true`, an undefined named entity like `&bogus;` will be left as is
-instead of causing a parse error.
-
-- **preserveCdata** _Boolean_ (default: `false`)
-
-When `true`, CDATA sections will be preserved in the document tree as nodes
-of type `cdata`. Otherwise CDATA sections will be represented as nodes of
-type `text`.
-
-- **preserveComments** _Boolean_ (default: `false`)
-
-When `true`, comments will be preserved in the document tree as nodes of
-type `comment`. Otherwise comments will not be included in the document
-tree.
-
-- **resolveUndefinedEntity** _Function_
-
-When an undefined named entity is encountered, this function will be called
-with the entity as its only argument. It should return a string value with
-which to replace the entity, or `null` or `undefined` to treat the entity as
-undefined (which may result in a parse error depending on the value of
-`ignoreUndefinedEntities`).
-
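The `resolveUndefinedEntity` contract described in the removed 2.0.4 README text (one argument in, a replacement string or `null`/`undefined` out) can be sketched as a standalone function. This is an illustration only: the entity table is made up, and it assumes the entity arrives in its full `&name;` form, which the README implies but does not state outright.

```javascript
// Hypothetical lookup table of custom entities; these names are illustrative.
const customEntities = {
  '&copy;': '\u00A9',
  '&nbsp;': '\u00A0'
};

// Resolver following the documented contract: returns a replacement string,
// or undefined so the entity is still treated as undefined.
function resolveUndefinedEntity(entity) {
  return Object.prototype.hasOwnProperty.call(customEntities, entity)
    ? customEntities[entity]
    : undefined;
}

console.log(resolveUndefinedEntity('&copy;'));
console.log(resolveUndefinedEntity('&bogus;'));
```

Per the option list above, such a function would be passed as a property of the `options` argument, for example `parseXml(xml, { resolveUndefinedEntity })`. Whether an unresolved entity then raises a parse error depends on `ignoreUndefinedEntities`.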
-## Nodes
-
-An XML document is parsed into a tree of node objects. Each node has the
-following common properties:
-
-- **parent** _Object?_
-
-Reference to this node's parent node, or `null` if this node is the
-`document` node (which has no parent).
-
-- **type** _String_
-
-Node type.
-
-Each node also has a `toJSON()` method that returns a serializable
-representation of the node without the `parent` property (in order to avoid
-circular references). This means you can safely pass any node to
-`JSON.stringify()` to serialize it and its children as JSON.
-
-### `cdata`
-
-A CDATA section. Only emitted when the `preserveCdata` option is `true` (by
-default, CDATA sections become `text` nodes).
-
-#### Properties
-
-
-- **text** _String_
-
-Unescaped text content of the CDATA section.
-
-#### Example
-
-```xml
-<![CDATA[kittens are fuzzy & cute]]>
-```
-
-```js
-{
-  type: "cdata",
-  text: "kittens are fuzzy & cute",
-  parent: { ... }
-}
-```
-
-### `comment`
-
-A comment. Only emitted when the `preserveComments` option is `true`.
-
-#### Properties
-
-- **content** _String_
-
-Comment text.
-
-#### Example
-
-```xml
-<!-- I'm a comment! -->
-```
-
-```js
-{
-  type: "comment",
-  content: "I'm a comment!",
-  parent: { ... }
-}
-```
-
-### `document`
-
-The top-level node of an XML document.
-
-#### Properties
-
-- **children** _Object[]_
-
-Array of child nodes.
-
-#### Example
-
-```xml
-<root />
-```
-
-```js
-{
-  type: "document",
-  children: [
-    {
-      type: "element",
-      name: "root",
-      attributes: {},
-      children: [],
-      parent: { ... }
-    }
-  ],
-  parent: null
-}
-```
-
-### `element`
-
-An element.
-
-Note that since parse-xml doesn't implement [XML Namespaces](https://www.w3.org/TR/REC-xml-names/),
-no special treatment is given to namespace prefixes in element and attribute
-names.
-
-In other words, `<foo:bar foo:baz="quux" />` will result in the element name
-"foo:bar" and the attribute name "foo:baz".
-
-#### Properties
-
-- **attributes** _Object_
-
-Hash of attribute names to values.
-
-Attribute names in this object are always in alphabetical order regardless
-of their order in the document, and values are normalized and unescaped.
-Values are always strings.
-
-- **children** _Object[]_
-
-Array of child nodes.
-
-- **name** _String_
-
-Name of the element as given in the start and/or end tags.
-
-- **preserveWhitespace** _Boolean?_
-
-This property will be set to `true` if the special
-[`xml:space`](https://www.w3.org/TR/2008/REC-xml-20081126/#sec-white-space)
-attribute on this element or on the closest parent with an `xml:space`
-attribute has the value "preserve". This indicates that whitespace in the
-text content of this element should be preserved rather than normalized.
-
-If neither this element nor any of its ancestors has an `xml:space`
-attribute set to "preserve", or if the closest `xml:space` attribute is set
-to "default", this property will not be defined.
-
-#### Example
-
-```xml
-<kittens description="fuzzy &amp; cute">I &lt;3 kittens</kittens>
-```
-
-```js
-{
-  type: "element",
-  name: "kittens",
-  attributes: {
-    description: "fuzzy & cute"
-  },
-  children: [
-    {
-      type: "text",
-      text: "I <3 kittens",
-      parent: { ... }
-    }
-  ],
-  parent: { ... }
-}
-```
-
-### `text`
-
-Text content inside an element.
-
-#### Properties
-
-- **text** _String_
-
-Unescaped text content.
-
-#### Example
-
-```xml
-kittens are fuzzy &amp; cute
-```
-
-```js
-{
-  type: "text"
-  text: "kittens are fuzzy & cute",
-  parent: { ... }
-}
-```
+Character position where the error occurred relative to the beginning of the input (0-based).
 
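To make the meaning of a 0-based `pos` concrete, the sketch below maps a character position to a 1-based line and column by counting newlines. `lineAndColumn` is a hypothetical helper written for this illustration; the hunk above shows only `pos`, so this is not a description of how parse-xml computes or exposes such values.

```javascript
// Hypothetical helper: converts a 0-based character position into a
// 1-based line and column by counting the newlines that precede `pos`.
function lineAndColumn(input, pos) {
  const before = input.slice(0, pos);
  const line = before.split('\n').length;
  // Column is the distance from the character after the last newline.
  const column = pos - (before.lastIndexOf('\n') + 1) + 1;
  return { line, column };
}

const xml = '<foo>\n  <bar>baz</foo>';
const pos = xml.indexOf('</foo>'); // where the mismatched end tag starts
console.log(pos, lineAndColumn(xml, pos));
```

For this input the end tag begins at position 16, which corresponds to line 2, column 11.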
 ## Why another XML parser?
 
-There are many XML parsers for Node, and some of them are good. However, most of
-them suffer from one or more of the following shortcomings:
+There are many XML parsers for Node, and some of them are good. However, most of them suffer from one or more of the following shortcomings:
 
 - Native dependencies.
 
-- Loose, non-standard
-unexpected or even unsafe results when given input the author didn't
-anticipate.
+- Loose, non-standard parsing behavior that can lead to unexpected or even unsafe results when given input the author didn't anticipate.
 
-- Kitchen sink APIs that tightly couple a parser with DOM manipulation
-functions, a stringifier, or other tooling that isn't directly related to
-parsing.
+- Kitchen sink APIs that tightly couple a parser with DOM manipulation functions, a stringifier, or other tooling that isn't directly related to parsing and consuming XML.
 
-- Stream-based parsing. This is great in the rare case that you need to parse
-truly enormous documents, but can be a pain to work with when all you want
-is an object tree.
+- Stream-based parsing. This is great in the rare case that you need to parse truly enormous documents, but can be a pain to work with when all you want is a node tree.
 
 - Poor error handling.
 
 - Too big or too Node-specific to work well in browsers.
 
-parse-xml's goal is to be a small, fast, safe,
-non-streaming, non-validating, browser-friendly parser, because I think this is
-an under-served niche.
+parse-xml's goal is to be a small, fast, safe, compliant, non-streaming, non-validating, browser-friendly parser, because I think this is an under-served niche.
 
-I think parse-xml demonstrates that it's not necessary to jettison the spec
-entirely or to write complex code in order to implement a small, fast XML
-parser.
+I think parse-xml demonstrates that it's not necessary to jettison the spec entirely or to write complex code in order to implement a small, fast XML parser.
 
 Also, it was fun.
 
 ## Benchmark
 
-Here's how parse-xml stacks up against two comparable libraries, [libxmljs2]
-(which is based on the native libxml library) and [xmldoc] (which is based on
-[sax-js]).
+Here's how parse-xml stacks up against two comparable libraries, [libxmljs2](https://github.com/marudor/libxmljs2) (which is based on the native libxml library) and [xmldoc](https://github.com/nfarina/xmldoc) (which is based on [sax-js](https://github.com/isaacs/sax-js)).
 
-[
-[sax-js]:https://github.com/isaacs/sax-js
-[xmldoc]:https://github.com/nfarina/xmldoc
+While libxmljs2 is faster at parsing medium and large documents, its performance comes at the expense of a large C dependency, no browser support, and a [history of security vulnerabilities](https://www.cvedetails.com/vulnerability-list/vendor_id-1962/product_id-3311/Xmlsoft-Libxml2.html) in the underlying libxml2 library.
 
 ```
-Node.js
-
+Node.js v18.9.1 / Darwin arm64
+Apple M1 Max
 
 Running "Small document (291 bytes)" suite...
+Progress: 100%
 
-@rgrove/parse-xml
-
+@rgrove/parse-xml 4.0.0:
+189 074 ops/s, ±0.10% | fastest
 
-libxmljs2 0.
-
+libxmljs2 0.30.1 (native):
+74 006 ops/s, ±0.32% | 60.86% slower
 
-xmldoc 1.
-
+xmldoc 1.2.0 (sax-js):
+68 045 ops/s, ±0.08% | slowest, 64.01% slower
 
 Finished 3 cases!
-Fastest: @rgrove/parse-xml
-Slowest: xmldoc 1.
+Fastest: @rgrove/parse-xml 4.0.0
+Slowest: xmldoc 1.2.0 (sax-js)
 
 Running "Medium document (72081 bytes)" suite...
+Progress: 100%
 
-@rgrove/parse-xml
-
+@rgrove/parse-xml 4.0.0:
+1 066 ops/s, ±0.11% | 49.12% slower
 
-libxmljs2 0.
-
+libxmljs2 0.30.1 (native):
+2 095 ops/s, ±2.68% | fastest
 
-xmldoc 1.
-
+xmldoc 1.2.0 (sax-js):
+459 ops/s, ±0.10% | slowest, 78.09% slower
 
 Finished 3 cases!
-Fastest: libxmljs2 0.
-Slowest: xmldoc 1.
+Fastest: libxmljs2 0.30.1 (native)
+Slowest: xmldoc 1.2.0 (sax-js)
 
 Running "Large document (1162464 bytes)" suite...
+Progress: 100%
 
-@rgrove/parse-xml
-
+@rgrove/parse-xml 4.0.0:
+91 ops/s, ±0.11% | 51.85% slower
 
-libxmljs2 0.
-
+libxmljs2 0.30.1 (native):
+189 ops/s, ±0.99% | fastest
 
-xmldoc 1.
-
+xmldoc 1.2.0 (sax-js):
+39 ops/s, ±0.08% | slowest, 79.37% slower
 
 Finished 3 cases!
-Fastest: libxmljs2 0.
-Slowest: xmldoc 1.
+Fastest: libxmljs2 0.30.1 (native)
+Slowest: xmldoc 1.2.0 (sax-js)
 ```
 
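The "% slower" figures in the benchmark output above appear to be relative to the fastest case in each suite. The short sketch below recomputes two of them from the reported ops/s values under that assumption; `percentSlower` is a hypothetical name, not something taken from the benchmark tooling.

```javascript
// Relative slowdown against the fastest case, as a percentage string.
// Assumes "slower" means 1 - (ops/s of this case / ops/s of the fastest case).
function percentSlower(opsPerSec, fastestOpsPerSec) {
  return ((1 - opsPerSec / fastestOpsPerSec) * 100).toFixed(2);
}

// Figures copied from the "Small document (291 bytes)" suite.
const libxmljs2Small = percentSlower(74006, 189074);
const xmldocSmall = percentSlower(68045, 189074);

console.log(`libxmljs2: ${libxmljs2Small}% slower`);
console.log(`xmldoc: ${xmldocSmall}% slower`);
```

Under that assumption the computed values match the reported 60.86% and 64.01%, and the same formula reproduces the medium and large suite percentages.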
-See the [parse-xml-benchmark](https://github.com/rgrove/parse-xml-benchmark)
-repo for instructions on running this benchmark yourself.
+See the [parse-xml-benchmark](https://github.com/rgrove/parse-xml-benchmark) repo for instructions on running this benchmark yourself.
 
 ## License
 