xml-stream-editor 0.1.1 → 0.2.1

package/CHANGELOG.md CHANGED
@@ -1,6 +1,30 @@
1
1
  CHANGELOG
2
2
  ===
3
3
 
4
+ 0.2.1
5
+ ---
6
+
7
+ Validate selector strings passed to `createXMLEditor` (for now, very basic.
8
+ Just making sure there is only one space between element names in each
9
+ selector).
10
+
11
+ Fix issue where in some cases a selector would match against the
12
+ suffixes/endings of elements, and not always the full element name
13
+ (e.g., the selector `"stake"` would sometimes match elements like
14
+ `<mistake>`).
15
+
16
+ 0.2.0
17
+ ---
18
+
19
+ Significantly improve performance of selector matching.
20
+
21
+ Add validation checks for created or modified XML element attribute names.
22
+
23
+ Add config option, currently just with 1. the ability to disable validation
24
+ of outgoing XML, and 2. configuring the "saxes" parser.
25
+
26
+ Hopefully more helpful, additional text in README.md.
27
+
4
28
  0.1.1
5
29
  ---
6
30
 
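The suffix-matching problem described in the 0.2.1 entry can be reproduced with plain strings. The sketch below is an illustration under assumptions, not the library's code: `naiveMatches`, `toPath`, and `delimitedMatches` are hypothetical helper names, and the `@` separator is borrowed from the selector module that appears later in this diff.

```javascript
// Hypothetical helpers for illustration only; not part of xml-stream-editor.

// 0.1.1-style check: join element names with spaces, then compare suffixes.
const naiveMatches = (stack, selector) =>
  stack.join(' ').endsWith(selector)

// 0.2.x-style check: prefix every name with a character that cannot appear
// in an XML name, so a suffix match has to begin at an element-name boundary.
const SEP = '@'
const toPath = (names) => names.map((n) => SEP + n).join('')
const delimitedMatches = (stack, selector) =>
  toPath(stack).endsWith(toPath(selector.trim().split(/ +/)))

const stack = ['menu', 'mistake']
const naiveResult = naiveMatches(stack, 'stake')            // matches a name ending
const fixedResult = delimitedMatches(stack, 'stake')        // requires a whole name
const exactResult = delimitedMatches(stack, 'menu mistake') // full path still matches
```

With the separator in place, a selector can only match whole element names at the end of the path, which is the behavior the changelog entry describes.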
package/README.md CHANGED
@@ -1,5 +1,4 @@
1
- xml-stream-editor
2
- ===
1
+ # xml-stream-editor
3
2
 
4
3
  Library to edit xml files in a streaming manner. Inspired by
5
4
  [xml-stream](https://www.npmjs.com/package/xml-stream), but 1. allows using
@@ -11,23 +10,85 @@ allows you to modify XML without needing to buffer the XML files in memory.
11
10
  For small to mid-sized XML files buffering is fine. But when editing very large
12
11
  files (e.g., multi-Gb files) buffering can be a problem or an absolute blocker.
13
12
 
14
- Usage
15
- ---
13
+ ## Usage
16
14
 
17
15
  `xml-stream-editor` is designed to be used with node's stream systems
18
16
  by subclassing [`stream.Transform`](https://nodejs.org/api/stream.html#class-streamtransform),
19
17
  so it can be used with the [streams promises API](https://nodejs.org/api/stream.html#streams-promises-api)
20
18
  and stdlib interfaces like [`stream.pipeline`](https://nodejs.org/api/stream.html#streampipelinestreams-options).
21
19
 
22
- The main way to use `xml-stream-editor` is to 1. select which XML elements
23
- you want to edit using simple declarative selectors (like _very_ simple XPath
24
- rules or CSS selectors), and 2. write functions to be called with each
25
- matching XML element in the document. Those functions then either edit and
26
- return the provided element, or remove the element from the document
27
- by returning nothing.
20
+ The main way to use `xml-stream-editor` is to:
28
21
 
29
- Example
30
- ---
22
+ 1. select which XML elements you want to edit using simple declarative selectors
23
+ (like _very_ simple XPath rules or CSS selectors), and
24
+ 2. write functions to be called with each matching XML element in the document.
25
+ Those functions then either edit and return the provided element, or remove
26
+ the element from the document by returning nothing.
27
+
28
+ ### Calling xml-stream-editor
29
+
30
+ The main way to call `xml-stream-editor` is by importing `createXMLEditor`,
31
+ passing that function an object with `selectors` (strings that describe
32
+ which elements to edit) as keys, and values being functions that get passed
33
+ matching elements (to edit or delete those elements).
34
+
35
+ ### Element Selectors
36
+
37
+ You choose which XML elements to edit by writing (simple, limited)
37
+ CSS-selector-like statements. For example, the selector `parent child` will match
39
+ all `<child>` elements that are _immediate_ children of `<parent>` nodes.
40
+ **Note**, this is a little different than CSS selectors, where the selector
41
+ `div a` would match `<a>` elements that were contained in `<div>` elements,
42
+ regardless of whether the `<a>` was an immediate child or more deeply nested.
43
+
44
+ ### Editing Elements
45
+
46
+ Each element that matches a given selector is passed to the matching
47
+ function, with the signature `(elm: Element) => Element | undefined`,
48
+ and elements are structured as follows (as typescript):
49
+
50
+ ```typescript
51
+ interface Element {
52
+ name: string
53
+ text?: string
54
+ attributes: Record<string, string>
55
+ children: Element[]
56
+ }
57
+ ```
58
+
59
+ ### Options / Configuration
60
+
61
+ In addition to a `rules` argument, `createXMLEditor` can also take
62
+ a second `Options` argument. This object has the following parameters.
63
+
64
+ ```typescript
65
+ interface Options {
66
+ // Whether to check and enforce the validity of created and modified
67
+ // XML element names and attributes. If true, will throw an error
68
+ // if you create an XML element with a disallowed name (e.g.,
69
+ // <no spaces allowed>) or with an invalid attribute name
70
+ // (<my-elm a:b:c="too many namespaces" d@y="no @ in attr names">)
71
+ //
72
+ // This only checks the syntax of the XML element names and attributes.
73
+ // It does not perform any further validation, like if used namespaces
74
+ // are valid.
75
+ //
76
+ // default: `true`
77
+ validate: boolean // true
78
+
79
+ // Options defined by the "saxes" library, and passed to the "saxes" parser
80
+ //
81
+ // https://github.com/lddubeau/saxes/blob/4968bd09b5fd0270a989c69913614b0e640dae1b/src/saxes.ts#L557
82
+ // https://www.npmjs.com/package/saxes
83
+ saxes?: SaxesOptions
84
+ }
85
+
86
+ // The createXMLEditor function takes the options object as an optional
87
+ // second argument.
88
+ const transformer = createXMLEditor(rules, options)
89
+ ```
90
+
91
+ ## Examples
31
92
 
32
93
  Start with this input as `simpsons.xml`:
33
94
 
@@ -54,49 +115,53 @@ import { createReadStream } from 'node:fs'
54
115
  import { pipeline } from 'node:stream/promises'
55
116
  import { createXMLEditor, newElement } from 'xml-stream-editor'
56
117
 
57
- (async () => {
58
- const config = {
59
- // Map element selectors to editing functions
60
- "main character": (elm) => {
61
- switch (elm.text) {
62
- case "Marge Simpson":
63
- elm.attributes["hair"] = "blue"
64
- break
65
- case "Homer Simpson":
66
- elm.text += " (Sr.)"
67
- break
68
- case "Lisa Simpson":
69
- elm.text = ""
70
-
71
- const instrumentElm = newElement("instrument")
72
- instrumentElm.text = "saxophone"
73
- elm.children.push(instrumentElm)
74
-
75
- const nameElm = newElement("name")
76
- nameElm.text = "Lisa Simpson"
77
- elm.children.push(nameElm)
78
- break
79
- case "Bart Simpson":
80
- // Remove the node by not returning an element.
81
- return
82
- }
83
- return elm
118
+ // The keys of this object are selector strings, and the
119
+ // values are functions that get called with matching elements.
120
+ const rules = {
121
+ "main character": (elm) => {
122
+ switch (elm.text) {
123
+ case "Marge Simpson":
124
+ elm.attributes["hair"] = "blue"
125
+ break
126
+ case "Homer Simpson":
127
+ elm.text += " (Sr.)"
128
+ break
129
+ case "Lisa Simpson":
130
+ elm.text = ""
131
+
132
+ // Create an <instrument> element and make it a child element.
133
+ const instrumentElm = newElement("instrument")
134
+ instrumentElm.text = "saxophone"
135
+ elm.children.push(instrumentElm)
136
+
137
+ // Also create a new <name> element, and also make it a child
138
+ // element.
139
+ const nameElm = newElement("name")
140
+ nameElm.text = "Lisa Simpson"
141
+ elm.children.push(nameElm)
142
+ break
143
+ case "Bart Simpson":
144
+ // Remove the node by not returning an element.
145
+ return
84
146
  }
147
+ return elm
85
148
  }
86
- await pipeline(
87
- createReadStream("simpsons.xml"), // above example
88
- createXMLEditor(config),
89
- process.stdout
90
- )
91
- })()
149
+ }
150
+ await pipeline(
151
+ createReadStream("simpsons.xml"), // above example
152
+ createXMLEditor(rules),
153
+ process.stdout
154
+ )
92
155
  ```
93
156
 
94
- And you'll find this printed to `STDOUT` (reformatted):
157
+ And you'll find this printed to `STDOUT` (reformatted and annotated):
95
158
 
96
159
  ```xml
97
160
  <?xml version="1.0" encoding="UTF-8"?>
98
161
  <simpsons decade="90s" locale="US">
99
162
  <main>
163
+ <!-- These character elements were edited because they're
164
+ children of the main element (i.e., "main character"). -->
100
165
  <character sex="female" hair="blue">Marge Simpson</character>
101
166
  <character sex="male">Homer Simpson (Sr.)</character>
102
167
  <character sex="female">
@@ -104,16 +169,22 @@ And you'll find this printed to `STDOUT` (reformatted):
104
169
  <name>Lisa Simpson</name>
105
170
  </character>
106
171
  <character sex="female">Maggie Simpson</character>
172
+ <!-- There is no <character>Bart Simpson</character>
173
+ element anymore because the `case "Bart Simpson":`
174
+ case didn't return an element from the function. -->
107
175
  </main>
108
176
  <side>
177
+ <!-- These side character elements were not edited or affected
178
+ at all because they didn't match the given selector
179
+ (i.e., they are not "character" elements that are direct
180
+ children of "main" elements). -->
109
181
  <character sex="male">Disco Stu</character>
110
182
  <character sex="male" title="Dr.">Julius Hibbert</character>
111
183
  </side>
112
184
  </simpsons>
113
185
  ```
114
186
 
115
- Notes
116
- ---
187
+ ## Notes
117
188
 
118
189
  Nested editing functions are not supported. You can define as many editing
119
190
  rules as you'd like, but only one rule can be matching the xml document
@@ -128,33 +199,34 @@ import { createReadStream } from 'node:fs'
128
199
  import { pipeline } from 'node:stream/promises'
129
200
  import { createXMLEditor, newElement } from 'xml-stream-editor'
130
201
 
131
- (async () => {
132
- const config = {
133
- // This rule will match first, since the "main" element will be
134
- // identified first during parsing.
135
- "main character": (elm) => {
136
- // editing goes here
137
- return elm
138
- },
139
- // And as a result, this rule will never be applied during editing
140
- // (since anytime "character" would match a <character> element,
141
- // that <character> element will have already been matched by the
142
- // above "main character" selector.
143
- "character": (elm) => {
144
- // this function would never be called in this document.
145
- return elm
146
- },
147
- }
148
- await pipeline(
149
- createReadStream("simpsons.xml"), // above example
150
- createXMLEditor(config),
151
- process.stdout
152
- )
153
- })()
202
+ const rules = {
203
+ // This rule will match first, since the "main" element will be
204
+ // identified first during parsing.
205
+ "main character": (elm) => {
206
+ // editing goes here
207
+ return elm
208
+ },
209
+ // And as a result, this rule will never match the <character>
210
+ // elements inside <main>, since anytime the "character" selector
211
+ // would match one of those elements, that <character> element will
212
+ // have already been matched by the above "main character" selector.
213
+ //
214
+ // However, this selector would match (and so this function would
215
+ // be called with) the two <character> elements that are children
216
+ // of the <side> element ("Disco Stu" and "Julius Hibbert").
217
+ "character": (elm) => {
218
+ // this function is only called for the <side> characters in this document.
219
+ return elm
220
+ },
221
+ }
222
+ await pipeline(
223
+ createReadStream("simpsons.xml"), // above example
224
+ createXMLEditor(rules),
225
+ process.stdout
226
+ )
154
227
  ```
155
228
 
156
- Motivation
157
- ---
229
+ ## Motivation
158
230
 
159
231
  `xml-stream-editor` was built to handle the extremely large XML files
160
232
  generated by [Brave Software's PageGraph system](https://github.com/brave/brave-browser/wiki/PageGraph),
package/dist/element.js ADDED
@@ -0,0 +1,69 @@
1
+ import xnv from 'xml-name-validator';
2
+ const isValidName = xnv.qname;
3
+ export class Element {
4
+ attributes;
5
+ children = [];
6
+ name;
7
+ text;
8
+ constructor(name, attributes) {
9
+ this.name = name;
10
+ this.attributes = attributes
11
+ ? JSON.parse(JSON.stringify(attributes))
12
+ : Object.create(null);
13
+ }
14
+ validate() {
15
+ if (typeof this.name !== 'string') {
16
+ return [false, new Error('No name provided for element')];
17
+ }
18
+ if (!isValidName(this.name)) {
19
+ return [false, new Error(`"${this.name}" is not a valid element name`)];
20
+ }
21
+ if (typeof this.attributes !== 'object' || this.attributes === null) {
22
+ return [false, new Error('"attributes" property is not an object')];
23
+ }
24
+ for (const attrName of Object.keys(this.attributes)) {
25
+ if (!isValidName(attrName)) {
26
+ return [false, new Error(`"${attrName}" is not a valid attribute name`)];
27
+ }
28
+ }
29
+ for (const child of this.children) {
30
+ const [isChildValid, childError] = child.validate();
31
+ if (!isChildValid) {
32
+ return [false, childError];
33
+ }
34
+ }
35
+ return [true, undefined];
36
+ }
37
+ }
38
+ export class ParsedElement extends Element {
39
+ children = [];
40
+ static fromSaxesNode(node) {
41
+ // Here we check if each attribute name is simple (and so just a
42
+ // string), or in the namespace representation the "saxes" library
43
+ // uses (in which case attrValue will be a SaxesAttributeNS
44
+ // object, that we have to unpack a bit)
45
+ const attributes = Object.create(null);
46
+ if (node.attributes) {
47
+ for (const [attrName, attrValue] of Object.entries(node.attributes)) {
48
+ if (typeof attrValue === 'string') {
49
+ attributes[attrName] = attrValue;
50
+ continue;
51
+ }
52
+ attributes[attrValue.name] = attrValue.value;
53
+ }
54
+ }
55
+ return new ParsedElement(node.name, attributes);
56
+ }
57
+ clone() {
58
+ const cloneElm = new ParsedElement(this.name, this.attributes);
59
+ cloneElm.text = this.text;
60
+ cloneElm.children = [];
61
+ for (const aChildElm of this.children) {
62
+ cloneElm.children.push(aChildElm.clone());
63
+ }
64
+ return cloneElm;
65
+ }
66
+ }
67
+ export const newElement = (name, attributes) => {
68
+ return new Element(name, attributes);
69
+ };
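Editing functions receive an element with the shape shown above (`name`, `text`, `attributes`, `children`) and either return an element to keep it or return nothing to drop it. The sketch below exercises that contract without the library: `makeElement` and `editCharacter` are hypothetical names, and the plain-object factory is only a stand-in for the `Element` class.

```javascript
// Stand-in for the Element shape (name, text, attributes, children).
// A plain object factory used only for this illustration.
const makeElement = (name, text, attributes = {}) => ({
  name,
  text,
  attributes: { ...attributes },
  children: [],
})

// An editing function: return the element to keep it in the output,
// or return nothing (undefined) to remove it from the document.
const editCharacter = (elm) => {
  if (elm.text === 'Bart Simpson') {
    return undefined
  }
  if (elm.text === 'Marge Simpson') {
    elm.attributes.hair = 'blue'
  }
  return elm
}

const marge = editCharacter(
  makeElement('character', 'Marge Simpson', { sex: 'female' }))
const bart = editCharacter(makeElement('character', 'Bart Simpson'))
```

In the real transformer the function is handed a clone of the parsed element, so mutating the argument in place, as `editCharacter` does, is the expected style.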
package/dist/index.js CHANGED
@@ -1,2 +1,2 @@
1
- import { createXMLEditor, newElement, } from './xml-stream-editor.js';
2
- export { createXMLEditor, newElement, };
1
+ export { Element, newElement } from './element.js';
2
+ export { createXMLEditor, } from './xml-stream-editor.js';
package/dist/markup.js CHANGED
@@ -1,6 +1,4 @@
1
1
  import xmlescape from 'xml-escape';
2
- import xnv from 'xml-name-validator';
3
- export const isValidName = xnv.name;
4
2
  export const toAttrValue = (value) => {
5
3
  return xmlescape(value);
6
4
  };
package/dist/selector.js ADDED
@@ -0,0 +1,60 @@
1
+ // Represents the user provided selector strings, for defining which
2
+ // XML elements in the XML document they want to edit.
3
+ //
4
+ // We modify the (simplified) XML paths used to i. allow users to define
5
+ // which XML elements they want to edit, and ii. track the position of
6
+ // each parsed XML element in the incoming XML document.
7
+ //
8
+ // This allows us to quickly check whether a user-provided "selector"
9
+ // string matches the current XML parse stack with a simple .endsWith()
10
+ // call (specifically pathToJustParsedXMLElement.endsWith(userProvidedSelector)).
11
+ import xnv from 'xml-name-validator';
12
+ // Single character string that cannot appear in XML element names.
13
+ const pathSeparator = '@';
14
+ const process = (elementPath) => {
15
+ const collapsedWhiteSpace = elementPath.trim().replace(/ +/g, ' ');
16
+ return collapsedWhiteSpace.split(' ').map(x => pathSeparator + x).join('');
17
+ };
18
+ const validate = (selector) => {
19
+ for (const elmName of selector.split(' ')) {
20
+ if (xnv.name(elmName) === true) {
21
+ continue;
22
+ }
23
+ const msg = `Selector "${selector}" contains invalid name "${elmName}"`;
24
+ return [false, new Error(msg)];
25
+ }
26
+ return [true, undefined];
27
+ };
28
+ // Simple class used for tracking the path to an element in an XML document,
29
+ // when parsing the XML document.
30
+ //
31
+ // Mostly this is just wrapping how we track the position of each element
32
+ // in the XML document as we're parsing it, and annotating that path
33
+ // in a way that makes it easy to check if a SelectorRule matches the
34
+ // leaf-element in that path.
35
+ export class ElementPath {
36
+ path;
37
+ pathForMatching;
38
+ constructor(path) {
39
+ this.path = path;
40
+ this.pathForMatching = process(path);
41
+ }
42
+ append(elmName) {
43
+ return new ElementPath(this.path + ' ' + elmName);
44
+ }
45
+ matches(selector) {
46
+ return this.pathForMatching.endsWith(selector.text);
47
+ }
48
+ }
49
+ export class SelectorRule {
50
+ text;
51
+ pathForMatching;
52
+ constructor(selector) {
53
+ const [isValid, err] = validate(selector);
54
+ if (!isValid) {
55
+ throw err;
56
+ }
57
+ this.text = process(selector);
58
+ this.pathForMatching = process(this.text);
59
+ }
60
+ }
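The selector validation above rejects doubled spaces as a side effect of splitting on a single space: the split produces an empty string, and an empty string is not a valid XML name. The sketch below shows this under one loud assumption: `looksLikeXmlName` is a simplified ASCII-only regex standing in for the `xml-name-validator` check, and `validateSelector` is a hypothetical name.

```javascript
// ASSUMPTION: simplified, ASCII-only stand-in for xml-name-validator's
// `name` check. The real XML Name grammar accepts many more characters.
const looksLikeXmlName = (s) => /^[A-Za-z_:][A-Za-z0-9_:.-]*$/.test(s)

// Mirrors the [isValid, error] tuple convention used in the diff above.
const validateSelector = (selector) => {
  for (const elmName of selector.split(' ')) {
    if (!looksLikeXmlName(elmName)) {
      const msg = `Selector "${selector}" contains invalid name "${elmName}"`
      return [false, new Error(msg)]
    }
  }
  return [true, undefined]
}

const [okSingle] = validateSelector('main character')
// Two spaces: split(' ') yields ['main', '', 'character'], and '' is invalid.
const [okDouble, errDouble] = validateSelector('main  character')
```

Returning a tuple instead of throwing lets the caller decide where to raise the error, which in the diff happens in the `SelectorRule` constructor.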
package/dist/xml-stream-editor.js CHANGED
@@ -1,56 +1,25 @@
1
1
  import { strict as assert } from 'node:assert';
2
2
  import { Transform } from 'node:stream';
3
3
  import { SaxesParser } from 'saxes';
4
- import { isValidName, toAttrValue, toBodyText, toCloseTag, toOpenTag } from './markup.js';
5
- export const newElement = (name) => {
6
- if (isValidName(name) === false) {
7
- throw new Error(`"${name}" is not a valid XML element name`);
8
- }
9
- return {
10
- name: name,
11
- text: undefined,
12
- attributes: {},
13
- children: [],
14
- };
15
- };
16
- const cloneElement = (elm) => {
17
- const newElm = newElement(elm.name);
18
- newElm.text = elm.text;
19
- newElm.attributes = JSON.parse(JSON.stringify(elm.attributes));
20
- newElm.children = elm.children.map(cloneElement);
21
- return newElm;
22
- };
23
- const elementForNode = (node) => {
24
- // Here we check if each attribute name is simple (and so just a
25
- // string), or in the namespace representation the "saxes" library
26
- // uses (in which case attrValue will be a SaxesAttributeNS
27
- // object, that we have to unpack a bit)
28
- const attributes = {};
29
- if (node.attributes) {
30
- for (const [attrName, attrValue] of Object.entries(node.attributes)) {
31
- if (typeof attrValue === 'string') {
32
- attributes[attrName] = attrValue;
33
- continue;
34
- }
35
- attributes[attrValue.name] = attrValue.value;
36
- }
37
- }
38
- return elementForNameAndAttrs(node.name, attributes);
39
- };
40
- const elementForNameAndAttrs = (name, attrs) => {
41
- const newElm = newElement(name);
42
- if (attrs) {
43
- newElm.attributes = attrs;
44
- }
45
- return newElm;
46
- };
4
+ import { ParsedElement } from './element.js';
5
+ import { toAttrValue, toBodyText, toCloseTag, toOpenTag } from './markup.js';
6
+ import { ElementPath, SelectorRule } from './selector.js';
47
7
  class XMLStreamEditorTransformer extends Transform {
8
+ // Default options, used if the caller doesn't provide any options (or
9
+ // merged into the provided options if the user only sets some options).
10
+ static defaultOptions = {
11
+ validate: true,
12
+ saxes: undefined,
13
+ };
14
+ // The configuration options, including possible options to pass to
15
+ // the (above) saxes parser at instantiation.
16
+ #options;
48
17
  // Used to track how deep in the XML tree the parser is, so that we can
49
18
  // check newly parsed elements against the passed editor rules.
50
- #elmStack = [];
51
- // This is a map of (VERY) simple xpaths (i.e., only XML element names;
52
- // no attributes, no name spaces, etc).
53
- #config;
19
+ #parseStack = [];
20
+ // This is a map of objects that represent simple xpaths (i.e., only XML
21
+ // element names; no attributes, no name spaces, etc).
22
+ #rules;
54
23
  // Handle to the 'saxes' xml parser object.
55
24
  #xmlParser;
56
25
  // If set, tracks the current element in the parser stack that matches
@@ -60,18 +29,35 @@ class XMLStreamEditorTransformer extends Transform {
60
29
  // Store any errors we've been passed by the saxes parser so that we
61
30
  // can pass it along in the transformer callback next time we get data.
62
31
  #error;
32
+ #pushParsedElementToStack(element) {
33
+ const topOfStackElm = this.#parseStack.at(-1);
34
+ // We prefix every element name in the parse stack with '@' (a character
35
+ // that isn't valid in an XML element name) so that we can easily
36
+ // check if a selector matches the parse stack by just checking if
37
+ // the selector matches the right end of the stack path.
38
+ const pathToElement = topOfStackElm
39
+ ? topOfStackElm.path.append(element.name)
40
+ : new ElementPath(element.name);
41
+ this.#parseStack.push({
42
+ element: element,
43
+ path: pathToElement,
44
+ });
45
+ }
63
46
  // Checks to see if the current editor stack (which tracks the current
64
47
  // element being parsed in the input XML stream, along with its parent
65
48
  // elements) matches any of the passed editor rules.
66
- #doesStackMatchEditorRule() {
67
- const currentElementPath = this.#elmStack.map(x => x.name).join(' ');
68
- for (const [selector, editorFunc] of Object.entries(this.#config)) {
69
- if (currentElementPath.endsWith(selector)) {
49
+ #doesStackMatchEditingRule() {
50
+ const topOfStack = this.#parseStack.at(-1);
51
+ // This method is only called after pushing an element to the stack,
52
+ // so this is guaranteed to be true
53
+ assert(topOfStack);
54
+ for (const [selectorRule, editorFunc] of this.#rules.entries()) {
55
+ if (topOfStack.path.matches(selectorRule)) {
70
56
  // The depth of the root of this subtree in the stack
71
- const depth = this.#elmStack.length - 1;
57
+ const depth = this.#parseStack.length - 1;
72
58
  assert(depth >= 0);
73
- const elmToEdit = this.#elmStack[depth];
74
- return { selector: selector, func: editorFunc, element: elmToEdit };
59
+ const elmToEdit = this.#parseStack[depth].element;
60
+ return { selector: selectorRule, func: editorFunc, element: elmToEdit };
75
61
  }
76
62
  }
77
63
  return null;
@@ -89,12 +75,19 @@ class XMLStreamEditorTransformer extends Transform {
89
75
  }
90
76
  this.push(toCloseTag(element.name));
91
77
  }
92
- #writeSubtreeToStream() {
78
+ #callUserFuncOnCompletedElementAndWriteToStream() {
93
79
  assert(this.#elmToEditInfo);
94
- const clonedElm = cloneElement(this.#elmToEditInfo.element);
80
+ const clonedElm = this.#elmToEditInfo.element.clone();
81
+ const editElmFunc = this.#elmToEditInfo.func;
95
82
  try {
96
- const editedElm = this.#elmToEditInfo.func(clonedElm);
83
+ const editedElm = editElmFunc(clonedElm);
97
84
  if (editedElm) {
85
+ if (this.#options.validate === true) {
86
+ const [isValid, error] = editedElm.validate();
87
+ if (!isValid) {
88
+ throw error;
89
+ }
90
+ }
98
91
  this.#writeElementToStream(editedElm);
99
92
  }
100
93
  this.#elmToEditInfo = undefined;
@@ -115,14 +108,14 @@ class XMLStreamEditorTransformer extends Transform {
115
108
  // and append ourselves to the stack.
116
109
  // 3. We are NOT the root of a subtree to be edited, in which case
117
110
  // we just add ourselves to the stack.
118
- const newElement = elementForNode(node);
119
- this.#elmStack.push(newElement);
111
+ const newElement = ParsedElement.fromSaxesNode(node);
112
+ this.#pushParsedElementToStack(newElement);
120
113
  // Check for case one
121
114
  if (this.#isInSubtreeToBeEdited()) {
122
115
  return;
123
116
  }
124
117
  // Check for case two, if we're at the root of a subtree to edit.
125
- const matchingElementInfo = this.#doesStackMatchEditorRule();
118
+ const matchingElementInfo = this.#doesStackMatchEditingRule();
126
119
  if (matchingElementInfo !== null) {
127
120
  this.#elmToEditInfo = matchingElementInfo;
128
121
  return;
@@ -140,9 +133,9 @@ class XMLStreamEditorTransformer extends Transform {
140
133
  // print the text out immediately.
141
134
  // Check for case one
142
135
  if (this.#isInSubtreeToBeEdited()) {
143
- const topOfStack = this.#elmStack.at(-1);
136
+ const topOfStack = this.#parseStack.at(-1);
144
137
  assert(topOfStack);
145
- topOfStack.text = text;
138
+ topOfStack.element.text = text;
146
139
  return;
147
140
  }
148
141
  // Otherwise we're in case two, and can print the text out immediately.
@@ -176,36 +169,58 @@ class XMLStreamEditorTransformer extends Transform {
176
169
  // 3. We've completed a CHILD NODE in a subtree being edited,
177
170
  // in which case we append this node to our buffered subtree
178
171
  // and pop it off the stack.
179
- const completedElm = this.#elmStack.pop();
172
+ const completedStackElement = this.#parseStack.pop();
173
+ const completedElm = completedStackElement?.element;
180
174
  assert(completedElm);
181
175
  // Check for case one
182
176
  if (this.#isInSubtreeToBeEdited() === false) {
177
+ // Write the closing tag of the just-completed element
178
+ // to the write stream.
183
179
  this.push(toCloseTag(node.name));
184
180
  return;
185
181
  }
186
182
  // Check for case two
187
183
  assert(this.#elmToEditInfo);
188
184
  if (completedElm === this.#elmToEditInfo.element) {
189
- this.#writeSubtreeToStream();
185
+ this.#callUserFuncOnCompletedElementAndWriteToStream();
190
186
  return;
191
187
  }
192
188
  // Otherwise, we must be in case three
193
- assert(this.#elmStack.length > 0);
194
- this.#elmStack.at(-1)?.children.push(completedElm);
189
+ const topOfStack = this.#parseStack.at(-1);
190
+ assert(topOfStack);
191
+ topOfStack.element.children.push(completedElm);
195
192
  });
196
193
  }
197
- constructor(config, saxesOptions) {
194
+ constructor(editingRules, options) {
198
195
  super();
199
- this.#config = config;
200
- this.#xmlParser = new SaxesParser(saxesOptions);
196
+ const defaultOptions = XMLStreamEditorTransformer.defaultOptions;
197
+ const mergedOptions = {
198
+ validate: options?.validate ?? defaultOptions.validate,
199
+ saxes: options?.saxes ?? defaultOptions.saxes,
200
+ };
201
+ this.#options = mergedOptions;
202
+ this.#rules = new Map();
203
+ for (const [selector, editFunc] of Object.entries(editingRules)) {
204
+ // This will throw if one of the user-provided selectors
205
+ // is invalid.
206
+ const parsedSelector = new SelectorRule(selector);
207
+ this.#rules.set(parsedSelector, editFunc);
208
+ }
209
+ this.#xmlParser = new SaxesParser(this.#options.saxes);
201
210
  this.#configureParserCallbacks();
202
211
  }
203
212
  _transform(chunk, encoding, callback) {
213
+ // Don't do any parsing if something threw an error parsing the previous
214
+ // chunk.
204
215
  if (this.#error) {
205
216
  callback(this.#error);
206
217
  return;
207
218
  }
208
219
  this.#xmlParser.write(chunk);
220
+ // And, similarly, don't continue parsing if we've caught any errors
221
+ // parsing the current chunk. This looks a little redundant, but because
222
+ // the XML from the input stream is parsed asynchronously, this is
223
+ // just an attempt to catch and handle an error as quickly as possible.
209
224
  if (this.#error) {
210
225
  callback(this.#error);
211
226
  return;
@@ -213,6 +228,10 @@ class XMLStreamEditorTransformer extends Transform {
213
228
  callback();
214
229
  }
215
230
  }
216
- export const createXMLEditor = (config, saxesOptions) => {
217
- return new XMLStreamEditorTransformer(config, saxesOptions);
231
+ // This is the entry point to the library, and is designed / named
232
+ // to mirror the naming of transformers in the standard lib
233
+ // (e.g., createGzip, createDeflate, etc in the stdlib zlib module,
234
+ // or createHmac, createECDH, etc in the stdlib crypto module).
235
+ export const createXMLEditor = (rules, options) => {
236
+ return new XMLStreamEditorTransformer(rules, options);
218
237
  };
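The constructor change above merges caller options with defaults using `??`. A minimal sketch of why that operator matters here, assuming only the two option keys shown in the diff (`mergeOptions` and `mergeWithOr` are hypothetical names, and the second exists purely for contrast):

```javascript
const defaultOptions = { validate: true, saxes: undefined }

// `??` falls back only on null/undefined, so an explicit `false` survives.
const mergeOptions = (options) => ({
  validate: options?.validate ?? defaultOptions.validate,
  saxes: options?.saxes ?? defaultOptions.saxes,
})

// For contrast only: `||` treats `false` as "not provided" and would
// silently re-enable validation.
const mergeWithOr = (options) => ({
  validate: options?.validate || defaultOptions.validate,
  saxes: options?.saxes || defaultOptions.saxes,
})

const noOptions = mergeOptions(undefined)
const disabled = mergeOptions({ validate: false })
const reEnabled = mergeWithOr({ validate: false })
```

Because `validate` defaults to `true`, nullish coalescing is what allows `{ validate: false }` to actually turn validation off.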
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "xml-stream-editor",
3
- "version": "0.1.1",
3
+ "version": "0.2.1",
4
4
  "description": "A streaming xml editor.",
5
5
  "main": "dist/index.js",
6
6
  "files": [
@@ -18,7 +18,7 @@
18
18
  "type": "module",
19
19
  "types": "src/types.d.ts",
20
20
  "repository": {
21
- "url": "https://github.com/pes10k/xml-stream-editor.git"
21
+ "url": "git+https://github.com/pes10k/xml-stream-editor.git"
22
22
  },
23
23
  "keywords": [
24
24
  "xml",
package/src/types.d.ts CHANGED
@@ -2,16 +2,41 @@ import { Transform } from 'node:stream'
2
2
 
3
3
  import { SaxesOptions } from 'saxes'
4
4
 
5
- export declare interface Element {
6
- name: string
7
- text?: string
5
+ export declare class Element {
6
+ constructor (name: string, attributes?: Record<string, string>)
8
7
  attributes: Record<string, string>
9
8
  children: Element[]
9
+ name: string
10
+ text?: string
11
+ }
12
+
13
+ export declare interface Options {
14
+ // Whether to check and enforce the validity of created and modified
15
+ // XML element names and attributes. If true, will throw an error
16
+ // if you create an XML element with a disallowed name (e.g.,
17
+ // <no spaces allowed>) or with an invalid attribute name
18
+ // (<my-elm a:b:c="too many namespaces" d@y="no @ in attr names">)
19
+ //
20
+ // This only checks the syntax of the XML element names and attributes.
21
+ // It does not perform any further validation, like if used namespaces
22
+ // are valid.
23
+ //
24
+ // default: `true`
25
+ validate: boolean // true
26
+
27
+ // Options defined by the "saxes" library, and passed to the "saxes" parser
28
+ //
29
+ // eslint-disable-next-line max-len
30
+ // https://github.com/lddubeau/saxes/blob/4968bd09b5fd0270a989c69913614b0e640dae1b/src/saxes.ts#L557
31
+ // https://www.npmjs.com/package/saxes
32
+ saxes?: SaxesOptions
10
33
  }
11
34
 
12
35
  export type Selector = string
13
36
  export type EditorFunc = (elm: Element) => Element | undefined
14
- export type Config = Record<Selector, EditorFunc>
37
+ export type EditingRules = Record<Selector, EditorFunc>
38
+ // Just a wrapper for `new Element(name)`, mostly a remnant of a previous
39
+ // implementation approach.
15
40
  export declare const newElement: (name: string) => Element
16
41
  export declare const createXMLEditor: (
17
- config: Config, saxesOptions?: SaxesOptions) => Transform
42
+ editingRules: EditingRules, options?: Options) => Transform