@xmldom/xmldom 0.8.1 → 0.9.0-beta.1
- package/CHANGELOG.md +58 -0
- package/index.d.ts +42 -40
- package/lib/conventions.js +208 -8
- package/lib/dom-parser.js +237 -52
- package/lib/dom.js +266 -82
- package/lib/entities.js +2 -0
- package/lib/index.js +2 -0
- package/lib/sax.js +33 -22
- package/package.json +7 -7
- package/readme.md +183 -166
package/CHANGELOG.md
CHANGED
|
@@ -4,6 +4,64 @@ All notable changes to this project will be documented in this file.
|
|
|
4
4
|
|
|
5
5
|
This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
|
6
6
|
|
|
7
|
+
## [0.9.0-beta.1](https://github.com/xmldom/xmldom/compare/0.8.2...0.9.0-beta.1)
|
|
8
|
+
|
|
9
|
+
### Fixed
|
|
10
|
+
|
|
11
|
+
**Only use HTML rules if mimeType matches** [`#338`](https://github.com/xmldom/xmldom/pull/338), fixes [`#203`](https://github.com/xmldom/xmldom/issues/203)
|
|
12
|
+
|
|
13
|
+
In the living specs for parsing XML and HTML that this library is trying to implement,
|
|
14
|
+
there is a distinction between the different types of documents being parsed:
|
|
15
|
+
Many rules differ between XML and HTML documents when parsing, constructing and serializing them.
|
|
16
|
+
|
|
17
|
+
So far xmldom was always "detecting" whether "the HTML rules should be applied" by looking at the current namespace. So from the first time the HTML default namespace (`http://www.w3.org/1999/xhtml`) was found, every node was treated as being part of an HTML document. This misconception is the root cause of quite a few reported bugs.
|
|
18
|
+
|
|
19
|
+
BREAKING CHANGE: HTML rules are no longer applied just because of the namespace, but require the `mimeType` argument passed to `DOMParser.parseFromString(source, mimeType)` to match `'text/html'`. Doing so applies all rules for the casing of tag and attribute names when parsing, creating nodes and searching nodes.
|
|
20
|
+
|
|
21
|
+
BREAKING CHANGE: Correct the return type of `DOMParser.parseFromString` to `Document | undefined`. In case of parsing errors it was always possible that "the returned `Document`" had not been created. If you are using TypeScript you now need to handle those cases.
|
|
22
|
+
|
|
23
|
+
BREAKING CHANGE: The instance property `DOMParser.options` is no longer available, instead use the individual `readonly` property per option (`assign`, `domHandler`, `errorHandler`, `normalizeLineEndings`, `locator`, `xmlns`). Those also provide the default value if the option was not passed. The `locator` option is now just a boolean (default remains `true`).
|
|
24
|
+
|
|
25
|
+
BREAKING CHANGE: The following methods no longer allow a (non spec compliant) boolean argument to toggle "HTML rules":
|
|
26
|
+
- `XMLSerializer.serializeToString`
|
|
27
|
+
- `Node.toString`
|
|
28
|
+
- `Document.toString`
|
|
29
|
+
|
|
30
|
+
The following interfaces have been implemented:
|
|
31
|
+
`DOMImplementation` now implements all methods defined in the DOM spec, but not all of the behavior is implemented (see docstring):
|
|
32
|
+
- `createDocument` creates an "XML Document" (prototype: `Document`, property `type` is `'xml'`)
|
|
33
|
+
- `createHTMLDocument` creates an "HTML Document" (type/prototype: `Document`, property `type` is `'html'`).
|
|
34
|
+
- when no argument is passed or the first argument is a string, the basic nodes for an HTML structure are created, as specified
|
|
35
|
+
- when the first argument is `false` no child nodes are created
|
|
36
|
+
|
|
37
|
+
`Document` now has two new readonly properties as specified in the DOM spec:
|
|
38
|
+
- `contentType` which is the mime-type that was used to create the document
|
|
39
|
+
- `type` which is either the string literal `'xml'` or `'html'`
|
|
40
|
+
|
|
41
|
+
`MIME_TYPE` (`/lib/conventions.js`):
|
|
42
|
+
- `hasDefaultHTMLNamespace` tests whether the provided string is one of the mime types that imply the default HTML namespace: `text/html` or `application/xhtml+xml`
|
|
43
|
+
|
|
44
|
+
Thank you [@weiwu-zhang](https://github.com/weiwu-zhang) for your contributions
|
|
45
|
+
|
|
46
|
+
### Chore
|
|
47
|
+
|
|
48
|
+
- update multiple devDependencies
|
|
49
|
+
|
|
50
|
+
## [0.8.2](https://github.com/xmldom/xmldom/compare/0.8.1...0.8.2)
|
|
51
|
+
|
|
52
|
+
### Fixed
|
|
53
|
+
- fix(dom): Serialize `>` as specified (#395) [`#58`](https://github.com/xmldom/xmldom/issues/58)
|
|
54
|
+
|
|
55
|
+
### Other
|
|
56
|
+
- docs: Add `nodeType` values to public interface description [`#396`](https://github.com/xmldom/xmldom/pull/396)
|
|
57
|
+
- test: Add executable examples for node and typescript [`#317`](https://github.com/xmldom/xmldom/pull/317)
|
|
58
|
+
- fix(dom): Serialize `>` as specified [`#395`](https://github.com/xmldom/xmldom/pull/395)
|
|
59
|
+
- chore: Add minimal `Object.assign` ponyfill [`#379`](https://github.com/xmldom/xmldom/pull/379)
|
|
60
|
+
- docs: Refine release documentation [`#378`](https://github.com/xmldom/xmldom/pull/378)
|
|
61
|
+
- chore: update various dev dependencies
|
|
62
|
+
|
|
63
|
+
Thank you [@niklasl](https://github.com/niklasl), [@cburatto](https://github.com/cburatto), [@SheetJSDev](https://github.com/SheetJSDev), [@pyrsmk](https://github.com/pyrsmk) for your contributions
|
|
64
|
+
|
|
7
65
|
## [0.8.1](https://github.com/xmldom/xmldom/compare/0.8.0...0.8.1)
|
|
8
66
|
|
|
9
67
|
### Fixes
|
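The `Document | undefined` return type described in the changelog entries above means callers can no longer assume that a document exists after parsing. The following is a minimal sketch of one way to handle that, not the library's own recommendation. `parseOrThrow` and `stubParser` are hypothetical names introduced here for illustration. The helper accepts any object exposing `parseFromString(source, mimeType)`; in real use that would presumably be a `DOMParser` instance from `@xmldom/xmldom`. The stub's behavior for an empty source is an assumption made only so the sketch runs without the package installed.

```javascript
/**
 * Hypothetical helper (not part of xmldom): wraps `parseFromString`
 * and turns a missing document into an exception.
 */
function parseOrThrow(parser, source, mimeType) {
  var doc = parser.parseFromString(source, mimeType)
  if (doc === undefined) {
    throw new Error('parsing produced no document for mime type ' + mimeType)
  }
  return doc
}

// Stand-in parser, used only to keep the sketch self-contained.
// It is an assumption for illustration and does not mirror xmldom internals.
var stubParser = {
  parseFromString: function (source, mimeType) {
    return source === '' ? undefined : { contentType: mimeType }
  },
}

var doc = parseOrThrow(stubParser, '<p>hi</p>', 'text/html')
console.log(doc.contentType) // text/html
```

Centralizing the `undefined` check in one place keeps the rest of the calling code free of repeated guards, at the cost of using exceptions for control flow.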
package/index.d.ts
CHANGED
|
@@ -1,43 +1,45 @@
|
|
|
1
1
|
/// <reference lib="dom" />
|
|
2
2
|
|
|
3
|
-
declare module
|
|
4
|
-
|
|
5
|
-
|
|
6
|
-
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
|
|
10
|
-
|
|
11
|
-
|
|
12
|
-
|
|
13
|
-
|
|
14
|
-
|
|
15
|
-
|
|
16
|
-
|
|
17
|
-
|
|
18
|
-
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
|
|
26
|
-
|
|
27
|
-
|
|
28
|
-
|
|
29
|
-
|
|
30
|
-
|
|
31
|
-
|
|
32
|
-
|
|
33
|
-
|
|
34
|
-
|
|
35
|
-
|
|
36
|
-
|
|
37
|
-
|
|
38
|
-
|
|
39
|
-
|
|
40
|
-
|
|
41
|
-
|
|
42
|
-
|
|
3
|
+
declare module '@xmldom/xmldom' {
|
|
4
|
+
var DOMParser: DOMParserStatic
|
|
5
|
+
var XMLSerializer: XMLSerializerStatic
|
|
6
|
+
var DOMImplementation: DOMImplementationStatic
|
|
7
|
+
|
|
8
|
+
interface DOMImplementationStatic {
|
|
9
|
+
new (): DOMImplementation
|
|
10
|
+
}
|
|
11
|
+
|
|
12
|
+
interface DOMParserStatic {
|
|
13
|
+
new (): DOMParser
|
|
14
|
+
new (options: DOMParserOptions): DOMParser
|
|
15
|
+
}
|
|
16
|
+
|
|
17
|
+
interface XMLSerializerStatic {
|
|
18
|
+
new (): XMLSerializer
|
|
19
|
+
}
|
|
20
|
+
|
|
21
|
+
interface DOMParser {
|
|
22
|
+
parseFromString(source: string, mimeType?: string): Document | undefined
|
|
23
|
+
}
|
|
24
|
+
|
|
25
|
+
interface XMLSerializer {
|
|
26
|
+
serializeToString(node: Node): string
|
|
27
|
+
}
|
|
28
|
+
|
|
29
|
+
interface DOMParserOptions {
|
|
30
|
+
errorHandler?: ErrorHandlerFunction | ErrorHandlerObject
|
|
31
|
+
locator?: boolean
|
|
32
|
+
normalizeLineEndings?: (source: string) => string
|
|
33
|
+
xmlns?: Record<string, string | null | undefined>
|
|
34
|
+
}
|
|
35
|
+
|
|
36
|
+
interface ErrorHandlerFunction {
|
|
37
|
+
(level: 'warn' | 'error' | 'fatalError', msg: string): void
|
|
38
|
+
}
|
|
39
|
+
|
|
40
|
+
interface ErrorHandlerObject {
|
|
41
|
+
warning?: (msg: string) => void
|
|
42
|
+
error?: (msg: string) => void
|
|
43
|
+
fatalError?: (msg: string) => void
|
|
44
|
+
}
|
|
43
45
|
}
|
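The `errorHandler` option in the typings above accepts either a single function taking `(level, msg)` or an object with optional `warning`, `error` and `fatalError` callbacks. Below is a small sketch of the function form. `createErrorCollector` is a hypothetical helper, not part of xmldom, that records messages so they can be inspected after parsing. The handler is called directly here so the example runs without the package installed; the messages are made up.

```javascript
// Hypothetical helper (not part of xmldom): collects parser diagnostics.
function createErrorCollector() {
  var entries = []
  return {
    entries: entries,
    // Matches the `ErrorHandlerFunction` shape: (level, msg) => void
    handler: function (level, msg) {
      entries.push({ level: level, msg: msg })
    },
  }
}

var collector = createErrorCollector()
// With xmldom installed, the handler would be passed as
// `new DOMParser({ errorHandler: collector.handler })`.
collector.handler('warn', 'example warning')
collector.handler('fatalError', 'example fatal error')

var fatal = collector.entries.filter(function (entry) {
  return entry.level === 'fatalError'
})
console.log(fatal.length) // 1
```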
package/lib/conventions.js
CHANGED
|
@@ -9,7 +9,7 @@
|
|
|
9
9
|
*
|
|
10
10
|
* @template T
|
|
11
11
|
* @param {T} object the object to freeze
|
|
12
|
-
* @param {Pick<ObjectConstructor, 'freeze'> =
|
|
12
|
+
* @param {Pick<ObjectConstructor, 'freeze'>} [oc=Object] `Object` by default,
|
|
13
13
|
* allows to inject custom object constructor for tests
|
|
14
14
|
* @returns {Readonly<T>}
|
|
15
15
|
*
|
|
@@ -22,6 +22,180 @@ function freeze(object, oc) {
|
|
|
22
22
|
return oc && typeof oc.freeze === 'function' ? oc.freeze(object) : object
|
|
23
23
|
}
|
|
24
24
|
|
|
25
|
+
/**
|
|
26
|
+
* Since we can not rely on `Object.assign` we provide a simplified version
|
|
27
|
+
* that is sufficient for our needs.
|
|
28
|
+
*
|
|
29
|
+
* @param {Object} target
|
|
30
|
+
* @param {Object | null | undefined} source
|
|
31
|
+
*
|
|
32
|
+
* @returns {Object} target
|
|
33
|
+
* @throws TypeError if target is not an object
|
|
34
|
+
*
|
|
35
|
+
* @see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/assign
|
|
36
|
+
* @see https://tc39.es/ecma262/multipage/fundamental-objects.html#sec-object.assign
|
|
37
|
+
*/
|
|
38
|
+
function assign(target, source) {
|
|
39
|
+
if (target === null || typeof target !== 'object') {
|
|
40
|
+
throw new TypeError('target is not an object')
|
|
41
|
+
}
|
|
42
|
+
for (var key in source) {
|
|
43
|
+
if (Object.prototype.hasOwnProperty.call(source, key)) {
|
|
44
|
+
target[key] = source[key]
|
|
45
|
+
}
|
|
46
|
+
}
|
|
47
|
+
return target
|
|
48
|
+
}
|
|
49
|
+
|
|
50
|
+
/**
|
|
51
|
+
* A number of attributes are boolean attributes.
|
|
52
|
+
* The presence of a boolean attribute on an element represents the `true` value,
|
|
53
|
+
* and the absence of the attribute represents the `false` value.
|
|
54
|
+
*
|
|
55
|
+
* If the attribute is present, its value must either be the empty string
|
|
56
|
+
* or a value that is an ASCII case-insensitive match for the attribute's canonical name,
|
|
57
|
+
* with no leading or trailing whitespace.
|
|
58
|
+
*
|
|
59
|
+
* Note: The values `"true"` and `"false"` are not allowed on boolean attributes.
|
|
60
|
+
* To represent a `false` value, the attribute has to be omitted altogether.
|
|
61
|
+
*
|
|
62
|
+
* @see https://html.spec.whatwg.org/#boolean-attributes
|
|
63
|
+
* @see https://html.spec.whatwg.org/#attributes-3
|
|
64
|
+
*/
|
|
65
|
+
var HTML_BOOLEAN_ATTRIBUTES = freeze({
|
|
66
|
+
allowfullscreen: true,
|
|
67
|
+
async: true,
|
|
68
|
+
autofocus: true,
|
|
69
|
+
autoplay: true,
|
|
70
|
+
checked: true,
|
|
71
|
+
controls: true,
|
|
72
|
+
default: true,
|
|
73
|
+
defer: true,
|
|
74
|
+
disabled: true,
|
|
75
|
+
formnovalidate: true,
|
|
76
|
+
hidden: true,
|
|
77
|
+
ismap: true,
|
|
78
|
+
itemscope: true,
|
|
79
|
+
loop: true,
|
|
80
|
+
multiple: true,
|
|
81
|
+
muted: true,
|
|
82
|
+
nomodule: true,
|
|
83
|
+
novalidate: true,
|
|
84
|
+
open: true,
|
|
85
|
+
playsinline: true,
|
|
86
|
+
readonly: true,
|
|
87
|
+
required: true,
|
|
88
|
+
reversed: true,
|
|
89
|
+
selected: true,
|
|
90
|
+
})
|
|
91
|
+
|
|
92
|
+
/**
|
|
93
|
+
* Check if `name` is matching one of the HTML boolean attribute names.
|
|
94
|
+
* This method doesn't check if such attributes are allowed in the context of the current document/parsing.
|
|
95
|
+
*
|
|
96
|
+
* @param {string} name
|
|
97
|
+
* @return {boolean}
|
|
98
|
+
* @see HTML_BOOLEAN_ATTRIBUTES
|
|
99
|
+
* @see https://html.spec.whatwg.org/#boolean-attributes
|
|
100
|
+
* @see https://html.spec.whatwg.org/#attributes-3
|
|
101
|
+
*/
|
|
102
|
+
function isHTMLBooleanAttribute(name) {
|
|
103
|
+
return HTML_BOOLEAN_ATTRIBUTES.hasOwnProperty(name.toLowerCase())
|
|
104
|
+
}
|
|
105
|
+
|
|
106
|
+
/**
|
|
107
|
+
* Void elements only have a start tag; end tags must not be specified for void elements.
|
|
108
|
+
* These elements should be written as self closing like this: `<area />`.
|
|
109
|
+
* This should not be confused with optional tags that HTML allows to omit the end tag for
|
|
110
|
+
* (like `li`, `tr` and others), which can have content after them,
|
|
111
|
+
* so they can not be written as self closing.
|
|
112
|
+
* xmldom does not have any logic for optional end tags cases and will report them as a warning.
|
|
113
|
+
* Content that would go into the unopened element will instead be added as a sibling text node.
|
|
114
|
+
*
|
|
115
|
+
* @type {Readonly<{area: boolean, col: boolean, img: boolean, wbr: boolean, link: boolean, hr: boolean, source: boolean, br: boolean, input: boolean, param: boolean, meta: boolean, embed: boolean, track: boolean, base: boolean}>}
|
|
116
|
+
* @see https://html.spec.whatwg.org/#void-elements
|
|
117
|
+
* @see https://html.spec.whatwg.org/#optional-tags
|
|
118
|
+
*/
|
|
119
|
+
var HTML_VOID_ELEMENTS = freeze({
|
|
120
|
+
area: true,
|
|
121
|
+
base: true,
|
|
122
|
+
br: true,
|
|
123
|
+
col: true,
|
|
124
|
+
embed: true,
|
|
125
|
+
hr: true,
|
|
126
|
+
img: true,
|
|
127
|
+
input: true,
|
|
128
|
+
link: true,
|
|
129
|
+
meta: true,
|
|
130
|
+
param: true,
|
|
131
|
+
source: true,
|
|
132
|
+
track: true,
|
|
133
|
+
wbr: true,
|
|
134
|
+
})
|
|
135
|
+
|
|
136
|
+
/**
|
|
137
|
+
* Check if `tagName` is matching one of the HTML void element names.
|
|
138
|
+
* This method doesn't check if such tags are allowed
|
|
139
|
+
* in the context of the current document/parsing.
|
|
140
|
+
*
|
|
141
|
+
* @param {string} tagName
|
|
142
|
+
* @return {boolean}
|
|
143
|
+
* @see HTML_VOID_ELEMENTS
|
|
144
|
+
* @see https://html.spec.whatwg.org/#void-elements
|
|
145
|
+
*/
|
|
146
|
+
function isHTMLVoidElement(tagName) {
|
|
147
|
+
return HTML_VOID_ELEMENTS.hasOwnProperty(tagName.toLowerCase())
|
|
148
|
+
}
|
|
149
|
+
|
|
150
|
+
/**
|
|
151
|
+
* Tag names that are raw text elements according to HTML spec.
|
|
152
|
+
* The value denotes whether they are escapable or not.
|
|
153
|
+
*
|
|
154
|
+
* @see isHTMLEscapableRawTextElement
|
|
155
|
+
* @see isHTMLRawTextElement
|
|
156
|
+
* @see https://html.spec.whatwg.org/#raw-text-elements
|
|
157
|
+
* @see https://html.spec.whatwg.org/#escapable-raw-text-elements
|
|
158
|
+
*/
|
|
159
|
+
var HTML_RAW_TEXT_ELEMENTS = freeze({
|
|
160
|
+
script: false,
|
|
161
|
+
style: false,
|
|
162
|
+
textarea: true,
|
|
163
|
+
title: true,
|
|
164
|
+
})
|
|
165
|
+
|
|
166
|
+
/**
|
|
167
|
+
* Check if `tagName` is matching one of the HTML raw text element names.
|
|
168
|
+
* This method doesn't check if such tags are allowed
|
|
169
|
+
* in the context of the current document/parsing.
|
|
170
|
+
*
|
|
171
|
+
* @param {string} tagName
|
|
172
|
+
* @return {boolean}
|
|
173
|
+
* @see isHTMLEscapableRawTextElement
|
|
174
|
+
* @see HTML_RAW_TEXT_ELEMENTS
|
|
175
|
+
* @see https://html.spec.whatwg.org/#raw-text-elements
|
|
176
|
+
* @see https://html.spec.whatwg.org/#escapable-raw-text-elements
|
|
177
|
+
*/
|
|
178
|
+
function isHTMLRawTextElement(tagName) {
|
|
179
|
+
var key = tagName.toLowerCase();
|
|
180
|
+
return HTML_RAW_TEXT_ELEMENTS.hasOwnProperty(key) && !HTML_RAW_TEXT_ELEMENTS[key];
|
|
181
|
+
}
|
|
182
|
+
/**
|
|
183
|
+
* Check if `tagName` is matching one of the HTML escapable raw text element names.
|
|
184
|
+
* This method doesn't check if such tags are allowed
|
|
185
|
+
* in the context of the current document/parsing.
|
|
186
|
+
*
|
|
187
|
+
* @param {string} tagName
|
|
188
|
+
* @return {boolean}
|
|
189
|
+
* @see isHTMLRawTextElement
|
|
190
|
+
* @see HTML_RAW_TEXT_ELEMENTS
|
|
191
|
+
* @see https://html.spec.whatwg.org/#raw-text-elements
|
|
192
|
+
* @see https://html.spec.whatwg.org/#escapable-raw-text-elements
|
|
193
|
+
*/
|
|
194
|
+
function isHTMLEscapableRawTextElement(tagName) {
|
|
195
|
+
var key = tagName.toLowerCase();
|
|
196
|
+
return HTML_RAW_TEXT_ELEMENTS.hasOwnProperty(key) && HTML_RAW_TEXT_ELEMENTS[key];
|
|
197
|
+
}
|
|
198
|
+
|
|
25
199
|
/**
|
|
26
200
|
* All mime types that are allowed as input to `DOMParser.parseFromString`
|
|
27
201
|
*
|
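The lookup helpers added in the hunk above all lower-case their input before checking a frozen table. The sketch below reproduces `HTML_VOID_ELEMENTS`, `HTML_RAW_TEXT_ELEMENTS` and their predicates from the diff so the results for a few inputs can be seen. Two simplifications are assumptions of this sketch: it uses plain `Object.freeze` instead of the library's `freeze` wrapper, and it goes through a local `hasOwn` helper (a name introduced here) rather than calling `.hasOwnProperty` on the table directly.

```javascript
var HTML_VOID_ELEMENTS = Object.freeze({
  area: true, base: true, br: true, col: true, embed: true, hr: true, img: true,
  input: true, link: true, meta: true, param: true, source: true, track: true, wbr: true,
})

// The value denotes whether the raw text element is escapable.
var HTML_RAW_TEXT_ELEMENTS = Object.freeze({
  script: false,
  style: false,
  textarea: true,
  title: true,
})

// Local helper for this sketch only.
function hasOwn(table, key) {
  return Object.prototype.hasOwnProperty.call(table, key)
}

function isHTMLVoidElement(tagName) {
  return hasOwn(HTML_VOID_ELEMENTS, tagName.toLowerCase())
}

function isHTMLRawTextElement(tagName) {
  var key = tagName.toLowerCase()
  return hasOwn(HTML_RAW_TEXT_ELEMENTS, key) && !HTML_RAW_TEXT_ELEMENTS[key]
}

function isHTMLEscapableRawTextElement(tagName) {
  var key = tagName.toLowerCase()
  return hasOwn(HTML_RAW_TEXT_ELEMENTS, key) && HTML_RAW_TEXT_ELEMENTS[key]
}

console.log(isHTMLVoidElement('BR')) // true
console.log(isHTMLVoidElement('li')) // false
console.log(isHTMLRawTextElement('script')) // true
console.log(isHTMLRawTextElement('textarea')) // false
console.log(isHTMLEscapableRawTextElement('TextArea')) // true
```

Keeping raw text and escapable raw text elements in one table with a boolean value lets both predicates share a single lookup.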
|
@@ -47,14 +221,32 @@ var MIME_TYPE = freeze({
|
|
|
47
221
|
* @param {string} [value]
|
|
48
222
|
* @returns {boolean}
|
|
49
223
|
*
|
|
50
|
-
* @see https://www.iana.org/assignments/media-types/text/html
|
|
51
|
-
* @see https://en.wikipedia.org/wiki/HTML
|
|
52
|
-
* @see https://developer.mozilla.org/en-US/docs/Web/API/DOMParser/parseFromString
|
|
53
|
-
* @see https://html.spec.whatwg.org/multipage/dynamic-markup-insertion.html#dom-domparser-parsefromstring
|
|
224
|
+
* @see [IANA MimeType registration](https://www.iana.org/assignments/media-types/text/html)
|
|
225
|
+
* @see [Wikipedia](https://en.wikipedia.org/wiki/HTML)
|
|
226
|
+
* @see [`DOMParser.parseFromString` @ MDN](https://developer.mozilla.org/en-US/docs/Web/API/DOMParser/parseFromString)
|
|
227
|
+
* @see [`DOMParser.parseFromString` @ HTML Specification](https://html.spec.whatwg.org/multipage/dynamic-markup-insertion.html#dom-domparser-parsefromstring)
|
|
228
|
+
*/
|
|
54
229
|
isHTML: function (value) {
|
|
55
230
|
return value === MIME_TYPE.HTML
|
|
56
231
|
},
|
|
57
232
|
|
|
233
|
+
/**
|
|
234
|
+
* For both the `text/html` and the `application/xhtml+xml` mime type
|
|
235
|
+
* the spec defines that the HTML namespace is provided as the default in some cases.
|
|
236
|
+
*
|
|
237
|
+
* @param {string} mimeType
|
|
238
|
+
* @returns {boolean}
|
|
239
|
+
*
|
|
240
|
+
* @see https://dom.spec.whatwg.org/#dom-document-createelement
|
|
241
|
+
* @see https://dom.spec.whatwg.org/#dom-domimplementation-createdocument
|
|
242
|
+
* @see https://dom.spec.whatwg.org/#dom-domimplementation-createhtmldocument
|
|
243
|
+
*/
|
|
244
|
+
hasDefaultHTMLNamespace: function (mimeType) {
|
|
245
|
+
return (
|
|
246
|
+
MIME_TYPE.isHTML(mimeType) || mimeType === MIME_TYPE.XML_XHTML_APPLICATION
|
|
247
|
+
)
|
|
248
|
+
},
|
|
249
|
+
|
|
58
250
|
/**
|
|
59
251
|
* `application/xml`, the standard mime type for XML documents.
|
|
60
252
|
*
|
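The hunk above separates two questions: whether a mime type is HTML (`isHTML`) and whether it implies the default HTML namespace (`hasDefaultHTMLNamespace`). A reduced sketch of `MIME_TYPE` follows, containing only the two constants and two methods involved. The string values are taken from the changelog entry for `hasDefaultHTMLNamespace`, and plain `Object.freeze` stands in for the library's `freeze` wrapper as an assumption of this sketch.

```javascript
// Reduced stand-in for the library's MIME_TYPE table (illustration only).
var MIME_TYPE = Object.freeze({
  HTML: 'text/html',
  XML_XHTML_APPLICATION: 'application/xhtml+xml',

  isHTML: function (value) {
    return value === MIME_TYPE.HTML
  },

  hasDefaultHTMLNamespace: function (mimeType) {
    return (
      MIME_TYPE.isHTML(mimeType) || mimeType === MIME_TYPE.XML_XHTML_APPLICATION
    )
  },
})

console.log(MIME_TYPE.isHTML('application/xhtml+xml')) // false
console.log(MIME_TYPE.hasDefaultHTMLNamespace('application/xhtml+xml')) // true
console.log(MIME_TYPE.hasDefaultHTMLNamespace('text/html')) // true
console.log(MIME_TYPE.hasDefaultHTMLNamespace('text/xml')) // false
```

Both checks use strict equality, so in this sketch a differently cased string such as `'TEXT/HTML'` does not match.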
|
@@ -139,6 +331,14 @@ var NAMESPACE = freeze({
|
|
|
139
331
|
XMLNS: 'http://www.w3.org/2000/xmlns/',
|
|
140
332
|
})
|
|
141
333
|
|
|
142
|
-
exports.
|
|
143
|
-
exports.
|
|
144
|
-
exports.
|
|
334
|
+
exports.assign = assign
|
|
335
|
+
exports.freeze = freeze
|
|
336
|
+
exports.HTML_BOOLEAN_ATTRIBUTES = HTML_BOOLEAN_ATTRIBUTES
|
|
337
|
+
exports.HTML_RAW_TEXT_ELEMENTS = HTML_RAW_TEXT_ELEMENTS
|
|
338
|
+
exports.HTML_VOID_ELEMENTS = HTML_VOID_ELEMENTS
|
|
339
|
+
exports.isHTMLBooleanAttribute = isHTMLBooleanAttribute
|
|
340
|
+
exports.isHTMLRawTextElement = isHTMLRawTextElement
|
|
341
|
+
exports.isHTMLEscapableRawTextElement = isHTMLEscapableRawTextElement
|
|
342
|
+
exports.isHTMLVoidElement = isHTMLVoidElement
|
|
343
|
+
exports.MIME_TYPE = MIME_TYPE
|
|
344
|
+
exports.NAMESPACE = NAMESPACE
|
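The `assign` helper exported above is a deliberately reduced version of `Object.assign`: it takes a single source, copies only own enumerable string keys, and rejects a target that is not an object. The sketch below repeats the function body from the diff and exercises those properties. The `defaults` and `merged` objects are illustrative values, not options handling taken from the library.

```javascript
function assign(target, source) {
  if (target === null || typeof target !== 'object') {
    throw new TypeError('target is not an object')
  }
  for (var key in source) {
    if (Object.prototype.hasOwnProperty.call(source, key)) {
      target[key] = source[key]
    }
  }
  return target
}

// Merge into a fresh object so the defaults stay untouched.
var defaults = { locator: true }
var merged = assign(assign({}, defaults), { locator: false })
console.log(merged.locator) // false
console.log(defaults.locator) // true

// A null or undefined source is accepted and copies nothing.
console.log(Object.keys(assign({}, null)).length) // 0

// Inherited properties are skipped; only own keys are copied.
var child = Object.create({ inherited: 1 })
child.own = 2
console.log(Object.keys(assign({}, child))) // [ 'own' ]

// A primitive target throws, whereas Object.assign would wrap it.
var rejected = false
try {
  assign('text', {})
} catch (error) {
  rejected = error instanceof TypeError
}
console.log(rejected) // true
```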