xml-sax-ts 0.2.0 → 0.4.0

package/README.md CHANGED
@@ -82,6 +82,31 @@ const obj = builder.getResult();
  // { item: ["1", "2"] }
  ```
 
+ ### Object to XML
+
+ ```ts
+ import { objectToXml } from "xml-sax-ts";
+
+ const xml = objectToXml({
+   root: {
+     "@_id": "1",
+     item: ["1", "2"],
+   }
+ });
+
+ // <root id="1"><item>1</item><item>2</item></root>
+ ```
+
+ ```ts
+ import { buildObject, objectToXml, parseXmlString } from "xml-sax-ts";
+
+ const root = parseXmlString("<root id='1'><item>1</item></root>");
+ const obj = buildObject(root);
+ const xml = objectToXml(obj, { rootName: "root" });
+
+ // <root id="1"><item>1</item></root>
+ ```
+
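The mapping in the examples above (keys prefixed with `@_` become attributes, arrays become repeated elements) can be illustrated with a small standalone sketch. This is not the `xml-sax-ts` implementation: `sketchObjectToXml`, `renderElement`, and `escapeXml` are hypothetical names, and the escaping rules are an assumption made only for the illustration.

```ts
// Illustrative sketch only; it does not import or reproduce xml-sax-ts internals.
type SketchValue = string | number | boolean | SketchValue[] | { [key: string]: SketchValue };

const ATTR_PREFIX = "@_"; // same as the documented default `attributePrefix`
const TEXT_KEY = "#text"; // same as the documented default `textKey`

// Assumed escaping rules for text and attribute values.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function renderElement(name: string, value: SketchValue): string {
  // An array becomes one element per item, all sharing the same name.
  if (Array.isArray(value)) {
    return value.map((item) => renderElement(name, item)).join("");
  }
  // A primitive becomes an element with text content.
  if (typeof value !== "object") {
    return `<${name}>${escapeXml(String(value))}</${name}>`;
  }
  let attrs = "";
  let body = "";
  for (const key of Object.keys(value)) {
    const child = value[key];
    if (key.indexOf(ATTR_PREFIX) === 0) {
      // Prefixed keys are written as attributes on the current element.
      attrs += ` ${key.slice(ATTR_PREFIX.length)}="${escapeXml(String(child))}"`;
    } else if (key === TEXT_KEY) {
      body += escapeXml(String(child));
    } else {
      body += renderElement(key, child);
    }
  }
  return `<${name}${attrs}>${body}</${name}>`;
}

function sketchObjectToXml(obj: { [key: string]: SketchValue }): string {
  return Object.keys(obj).map((name) => renderElement(name, obj[name])).join("");
}

const sketchXml = sketchObjectToXml({ root: { "@_id": "1", item: ["1", "2"] } });
// sketchXml is '<root id="1"><item>1</item><item>2</item></root>'
```

The sketch ignores namespaces, `rootName`, and `arrayElements`; those options are documented in the API section of the README.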
  ### Serialize to XML
 
  ```ts
@@ -102,6 +127,120 @@ const xml = serializeXml(
  // </root>
  ```
 
+ ## Benchmarking
+
+ Run the reproducible benchmark harness:
+
+ ```bash
+ npm run bench
+ ```
+
+ Quick run (fewer rounds):
+
+ ```bash
+ npm run bench:quick
+ ```
+
+ The benchmark now runs multiple rounds and reports median/mean/stddev for better comparability.
+
+ - `xml-sax-ts:sax` scenarios measure streaming event parsing
+ - `xml-sax-ts:sax` scenarios include explicit `xmlns=true/false` modes
+ - `xml-sax-ts:sax ... no-position` shows upper-bound throughput with `trackPosition: false`
+ - `comparable:*` scenarios run minimal equivalent feature sets for fair `xml-sax-ts` vs `saxes` comparison
+ - `xml-sax-ts:tree` scenario measures full tree parsing (`parseXmlString`)
+ - `sax` and `saxes` scenarios provide common SAX parser comparisons
+ - `fast-xml-parser` scenarios measure object parsing on the same input corpus
+
+ `fast-xml-parser`, `sax`, and `saxes` are included as dev dependencies so comparison is available out of the box.
+
+ Example output includes a direct ratio line:
+
+ `Comparable parse ratio (xml-sax-ts:sax vs fast-xml-parser:object): ...x`
+
+ Note: SAX event parsing and object materialization are not identical workloads. Use the tree scenario for a closer semantic comparison.
+
+ ### Benchmark Methodology
+
+ - Benchmark command: `npm run bench`
+ - Runtime: Node `v24.7.0`
+ - Benchmark config defaults: `BENCH_ROUNDS=5`, `BENCH_MIN_MS=1200`, `BENCH_WARMUP=10`
+ - Corpus: repeated fixture corpus (`basic.xml`, `mixed.xml`, `namespaces.xml`) plus an entity-heavy synthetic case
+ - Output metric: median ops/s across rounds (with mean and stddev also shown)
+
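The methodology above reports a median across rounds, with mean and stddev alongside. A minimal sketch of how such per-round numbers could be summarized follows. `summarizeRounds` is a hypothetical helper, not part of the benchmark harness, and the use of the population standard deviation is an assumption; the harness may use the sample form instead.

```ts
// Illustrative sketch: summarize per-round throughput samples (ops/s).
interface RoundSummary {
  median: number;
  mean: number;
  stddev: number; // population standard deviation (an assumption)
}

function summarizeRounds(opsPerSecond: number[]): RoundSummary {
  if (opsPerSecond.length === 0) {
    throw new Error("at least one round is required");
  }
  // Copy before sorting so the caller's array is left untouched.
  const sorted = opsPerSecond.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const mean = sorted.reduce((sum, v) => sum + v, 0) / sorted.length;
  const variance =
    sorted.reduce((sum, v) => sum + (v - mean) * (v - mean), 0) / sorted.length;
  return { median, mean, stddev: Math.sqrt(variance) };
}

const exampleSummary = summarizeRounds([2, 4, 4, 4, 5, 5, 7, 9]);
// exampleSummary is { median: 4.5, mean: 5, stddev: 2 }
```

The median is less sensitive than the mean to a single slow round, which is a common reason to lead with it when comparing runs.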
+ ### Benchmark Environment
+
+ - Published sample run device: MacBook Pro M4
+ - Memory: 48 GB RAM
+ - CPU: 14-core CPU
+ - GPU: 20-core GPU
+
+ GPU is not used by these Node.js parser benchmarks, but is listed for full machine disclosure.
+
+ Latest sample (`npm run bench` defaults, Node `v24.7.0`):
+
+ | Scenario | Median ops/s |
+ | --- | ---: |
+ | `xml-sax-ts:sax single-feed xmlns=true` | 15,155.48 |
+ | `xml-sax-ts:sax single-feed xmlns=false` | 21,178.68 |
+ | `xml-sax-ts:sax single-feed xmlns=false no-position` | 22,230.83 |
+ | `sax:single-feed xmlns=false` | 8,357.12 |
+ | `saxes:single-feed xmlns=false` | 23,296.03 |
+ | `xml-sax-ts:tree parseXmlString` | 8,833.38 |
+ | `fast-xml-parser:object parse` | 6,128.40 |
+
+ Comparable minimal feature scenarios (fair `saxes` parity check):
+
+ | Scenario | Median ops/s |
+ | --- | ---: |
+ | `comparable:xml-sax-ts single-feed xmlns=false position=false` | 22,637.22 |
+ | `comparable:saxes single-feed xmlns=false position=false` | 23,305.98 |
+ | `comparable:xml-sax-ts single-feed xmlns=true position=false` | 16,468.14 |
+ | `comparable:saxes single-feed xmlns=true position=false` | 11,868.48 |
+
+ - `xml-sax-ts:sax (xmlns=false)` vs `sax (xmlns=false)`: `2.534x`
+ - `xml-sax-ts:sax (xmlns=true)` vs `sax (xmlns=true)`: `2.989x`
+ - `xml-sax-ts:sax (xmlns=false)` vs `saxes (xmlns=false)`: `0.909x`
+ - `comparable minimal (xmlns=false, xml-sax-ts vs saxes)`: `0.971x`
+ - `comparable minimal (xmlns=true, xml-sax-ts vs saxes)`: `1.388x`
+ - `xml-sax-ts:tree` vs `fast-xml-parser:object`: `1.441x`
+
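Most of the ratios listed above can be reproduced from the median ops/s values in the two tables by dividing the `xml-sax-ts` figure by the comparison parser's figure. The sketch below shows that arithmetic; `throughputRatio` is a hypothetical helper, and the three-decimal rounding is inferred from the published figures rather than taken from the harness. The `sax (xmlns=true)` ratio cannot be checked this way because that scenario has no row in the tables.

```ts
// Illustrative sketch: ratio of two median ops/s values, formatted like the README list.
function throughputRatio(candidateOps: number, baselineOps: number): string {
  if (baselineOps <= 0) {
    throw new Error("baseline throughput must be positive");
  }
  return (candidateOps / baselineOps).toFixed(3) + "x";
}

// Values copied from the sample tables above.
const vsSax = throughputRatio(21178.68, 8357.12); // "2.534x"
const vsSaxes = throughputRatio(21178.68, 23296.03); // "0.909x"
const comparablePlain = throughputRatio(22637.22, 23305.98); // "0.971x"
const comparableXmlns = throughputRatio(16468.14, 11868.48); // "1.388x"
const treeVsObject = throughputRatio(8833.38, 6128.4); // "1.441x"
```

A ratio above `1.000x` means the first-named scenario completed more operations per second than the second.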
+ Benchmark visualization (same sample run):
+
+ ```mermaid
+ xychart-beta
+     title "SAX Throughput (xmlns=false)"
+     x-axis ["xml-sax-ts", "xml-sax-ts no-position", "sax", "saxes"]
+     y-axis "ops/s" 0 --> 24000
+     bar [21178.68, 22230.83, 8357.12, 23296.03]
+ ```
+
+ ```mermaid
+ xychart-beta
+     title "Object/Tree Throughput"
+     x-axis ["xml-sax-ts tree", "fast-xml-parser object"]
+     y-axis "ops/s" 0 --> 9000
+     bar [8833.38, 6128.40]
+ ```
+
+ ```mermaid
+ xychart-beta
+     title "Comparable Minimal (position=false)"
+     x-axis ["xml-sax-ts xmlns=false", "saxes xmlns=false", "xml-sax-ts xmlns=true", "saxes xmlns=true"]
+     y-axis "ops/s" 0 --> 24000
+     bar [22637.22, 23305.98, 16468.14, 11868.48]
+ ```
+
+ Legend: `xml-sax-ts` bars are the first bars in each chart.
+
+ Best fair-comparison read:
+
+ - Use `comparable:*` scenarios for `xml-sax-ts` vs `saxes` parity checks.
+ - `xml-sax-ts ... no-position` is useful for peak throughput, but not a default-to-default comparison.
+
+ These values are machine-dependent; rerun on your hardware for release-quality numbers.
+
+ Current status for this environment: comparable runs show `xml-sax-ts` at `0.971x` of `saxes` on `xmlns=false` and `1.388x` on `xmlns=true`.
+
  ## API
 
  ### `XmlSaxParser`
@@ -122,6 +261,8 @@ new XmlSaxParser(options?: ParserOptions)
  | `xmlns` | `boolean` | `true` | Enable namespace resolution |
  | `includeNamespaceAttributes` | `boolean` | `false` | Include `xmlns:*` attributes in tag output |
  | `allowDoctype` | `boolean` | `true` | Allow `<!DOCTYPE …>` declarations |
+ | `coalesceText` | `boolean` | `false` | Merge adjacent text callbacks into one event |
+ | `trackPosition` | `boolean` | `true` | Track line/column; disable for faster parsing |
  | `onOpenTag` | `function` | — | Called for each opening / self-closing tag |
  | `onCloseTag` | `function` | — | Called for each closing tag |
  | `onText` | `function` | — | Called for text nodes |
@@ -131,6 +272,16 @@ new XmlSaxParser(options?: ParserOptions)
  | `onDoctype` | `function` | — | Called for DOCTYPE declarations |
  | `onError` | `function` | — | Called on parse errors |
 
+ By default (`coalesceText: false`), streaming input can produce multiple consecutive `onText` callbacks that are logically adjacent. Enable `coalesceText: true` to receive one merged text callback per structural boundary.
+
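The effect described above can be pictured as buffering text chunks and emitting them only when a structural boundary (such as a tag) is reached. The sketch below illustrates that idea in isolation. `TextCoalescer` is a hypothetical class, not an `xml-sax-ts` export, and the parser's real buffering strategy may differ.

```ts
// Illustrative sketch: merge adjacent text chunks until a structural boundary.
type TextHandler = (text: string) => void;

class TextCoalescer {
  private chunks: string[] = [];
  private readonly emit: TextHandler;

  constructor(emit: TextHandler) {
    this.emit = emit;
  }

  // Called for every raw text chunk, e.g. one per streamed input slice.
  push(chunk: string): void {
    if (chunk.length > 0) {
      this.chunks.push(chunk);
    }
  }

  // Called at a structural boundary (open tag, close tag, end of input).
  flush(): void {
    if (this.chunks.length === 0) {
      return;
    }
    const merged = this.chunks.join("");
    this.chunks = [];
    this.emit(merged);
  }
}

const receivedText: string[] = [];
const coalescer = new TextCoalescer((text) => {
  receivedText.push(text);
});
coalescer.push("Hel"); // first streamed slice
coalescer.push("lo"); // second slice, logically adjacent
coalescer.flush(); // boundary reached: one merged callback
coalescer.flush(); // nothing buffered: no callback
// receivedText is ["Hello"]
```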
+ `trackPosition` controls line/column tracking for parser errors. When set to `false`, parsing is faster and `XmlSaxError` still reports `offset`, while `line` and `column` are set to `0`.
+
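Because `offset` is still reported with `trackPosition: false`, a caller that keeps the original input could recover a line and column only when an error actually occurs. The sketch below shows one way to do that. `positionFromOffset` is a hypothetical helper, and it assumes that `offset` is a zero-based index into the input string and that lines and columns are counted from 1; neither assumption is confirmed by the README.

```ts
// Illustrative sketch: derive line/column from an offset after the fact.
interface SketchPosition {
  offset: number;
  line: number; // 1-based (an assumption)
  column: number; // 1-based (an assumption)
}

function positionFromOffset(source: string, offset: number): SketchPosition {
  let line = 1;
  let column = 1;
  const end = Math.min(Math.max(offset, 0), source.length);
  for (let i = 0; i < end; i++) {
    if (source.charCodeAt(i) === 10) {
      // A line feed starts a new line and resets the column.
      line++;
      column = 1;
    } else {
      column++;
    }
  }
  return { offset, line, column };
}

const recovered = positionFromOffset("<a>\n  <b>", 6);
// recovered is { offset: 6, line: 2, column: 3 }
```

This moves the cost of position tracking from every parsed character to the error path only.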
+ Event payload note (breaking change): with `xmlns: false`, parser events now emit plain-mode tag shapes aligned with `saxes` performance semantics.
+
+ - `onOpenTag(tag).attributes` values are strings (not `XmlAttribute` objects)
+ - `onOpenTag(tag)` and `onCloseTag(tag)` omit `prefix`, `local`, and `uri`
+ - With `xmlns: true`, full namespace metadata remains present
+
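Code that must handle both modes could normalize attribute values before use. The sketch below assumes the namespaced attribute object carries its text in a `value` field. That field name and the `AttributeObjectSketch` type are hypothetical stand-ins, since the diff does not show the shape of `XmlAttribute`.

```ts
// Illustrative sketch: read an attribute's text in either payload shape.
interface AttributeObjectSketch {
  value: string; // hypothetical field; check the real XmlAttribute type
}

type AttributePayload = string | AttributeObjectSketch;

function attributeText(attr: AttributePayload): string {
  // Plain mode (xmlns: false) yields a string; namespaced mode yields an object.
  return typeof attr === "string" ? attr : attr.value;
}

const plainId = attributeText("1");
const namespacedId = attributeText({ value: "1" });
// plainId and namespacedId are both "1"
```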
  ### `parseXmlString(xml, options?)`
 
  Convenience function that parses a complete XML string into an `XmlNode` tree using `XmlSaxParser` + `TreeBuilder` internally.
@@ -157,6 +308,24 @@ Streaming builder that produces the same object shape as `buildObject` without b
  | `arrayElements` | `Set\<string\> \| (name: string, path: string[]) => boolean` | — | Force specific elements to always be arrays |
  | `coalesceText` | `boolean` | `true` | Merge adjacent text nodes into a single string |
 
+ ### `buildXmlNode(obj, options?)`
+
+ Converts a plain object into an `XmlNode` tree using the same attribute/text conventions as `buildObject`.
+
+ ### `objectToXml(obj, options?)`
+
+ Builds an `XmlNode` with `buildXmlNode` and serializes it with `serializeXml`.
+
+ #### `XmlBuilderOptions`
+
+ | Option | Type | Default | Description |
+ | ------------------ | ------------------------------------------------------------ | --------- | ---------------------------------------------- |
+ | `attributePrefix` | `string` | `"@_"` | Prefix for attribute keys |
+ | `textKey` | `string` | `"#text"` | Key used for text nodes |
+ | `stripNamespaces` | `boolean` | `false` | Strip namespace prefixes from names |
+ | `arrayElements` | `Set\<string\> \| (name: string, path: string[]) => boolean` | — | Force specific elements to always be arrays |
+ | `rootName` | `string` | — | Root element name when object has multiple keys |
+
  ### `serializeXml(node, options?)`
 
  Serializes an `XmlNode` back to an XML string.
@@ -176,7 +345,7 @@ Custom error class thrown on parse errors. Includes `offset`, `line`, and `colum
 
  ### Exported types
 
- `OpenTag` · `CloseTag` · `XmlAttribute` · `ProcessingInstruction` · `Doctype` · `XmlNode` · `XmlChild` · `XmlPosition` · `ParserOptions` · `SerializeOptions` · `ObjectBuilderOptions` · `ArrayElementSelector` · `XmlObjectMap` · `XmlObjectValue`
+ `OpenTag` · `CloseTag` · `XmlAttribute` · `ProcessingInstruction` · `Doctype` · `XmlNode` · `XmlChild` · `XmlPosition` · `ParserOptions` · `SerializeOptions` · `ObjectBuilderOptions` · `ArrayElementSelector` · `XmlObjectMap` · `XmlObjectValue` · `XmlBuilderOptions` · `XmlInputObject` · `XmlInputValue` · `ObjectToXmlOptions`
 
  ## Features