moxml 0.1.8 → 0.1.10
- checksums.yaml +4 -4
- data/.rubocop_todo.yml +22 -39
- data/README.adoc +51 -20
- data/docs/_config.yml +3 -3
- data/docs/_guides/index.adoc +15 -7
- data/docs/_guides/modifying-xml.adoc +0 -1
- data/docs/_guides/node-api-consistency.adoc +572 -0
- data/docs/_guides/parsing-xml.adoc +0 -1
- data/docs/_guides/xml-declaration.adoc +450 -0
- data/docs/_pages/adapter-compatibility.adoc +1 -1
- data/docs/_pages/adapters/headed-ox.adoc +9 -9
- data/docs/_pages/adapters/index.adoc +0 -1
- data/docs/_pages/adapters/libxml.adoc +1 -2
- data/docs/_pages/adapters/nokogiri.adoc +1 -2
- data/docs/_pages/adapters/oga.adoc +1 -2
- data/docs/_pages/adapters/ox.adoc +2 -1
- data/docs/_pages/adapters/rexml.adoc +2 -3
- data/docs/_pages/best-practices.adoc +0 -1
- data/docs/_pages/compatibility.adoc +0 -1
- data/docs/_pages/configuration.adoc +0 -1
- data/docs/_pages/error-handling.adoc +0 -1
- data/docs/_pages/headed-ox-limitations.adoc +16 -0
- data/docs/_pages/installation.adoc +0 -1
- data/docs/_pages/node-api-reference.adoc +93 -4
- data/docs/_pages/performance.adoc +0 -1
- data/docs/_pages/quick-start.adoc +0 -1
- data/docs/_pages/thread-safety.adoc +0 -1
- data/docs/_references/document-api.adoc +0 -1
- data/docs/_tutorials/basic-usage.adoc +0 -1
- data/docs/_tutorials/builder-pattern.adoc +0 -1
- data/docs/_tutorials/namespace-handling.adoc +0 -1
- data/docs/_tutorials/xpath-queries.adoc +0 -1
- data/lib/moxml/adapter/customized_rexml/formatter.rb +2 -2
- data/lib/moxml/adapter/libxml.rb +34 -4
- data/lib/moxml/adapter/nokogiri.rb +50 -2
- data/lib/moxml/adapter/oga.rb +80 -3
- data/lib/moxml/adapter/ox.rb +70 -7
- data/lib/moxml/adapter/rexml.rb +45 -10
- data/lib/moxml/attribute.rb +6 -0
- data/lib/moxml/context.rb +18 -1
- data/lib/moxml/declaration.rb +9 -0
- data/lib/moxml/doctype.rb +33 -0
- data/lib/moxml/document.rb +14 -0
- data/lib/moxml/document_builder.rb +7 -0
- data/lib/moxml/element.rb +6 -0
- data/lib/moxml/error.rb +5 -5
- data/lib/moxml/node.rb +73 -1
- data/lib/moxml/processing_instruction.rb +6 -0
- data/lib/moxml/version.rb +1 -1
- data/lib/moxml/xpath/compiler.rb +2 -0
- data/lib/moxml/xpath/errors.rb +1 -1
- data/spec/integration/shared_examples/node_wrappers/declaration_behavior.rb +0 -3
- data/spec/moxml/declaration_preservation_spec.rb +217 -0
- data/spec/moxml/doctype_spec.rb +19 -3
- data/spec/performance/memory_usage_spec.rb +3 -2
- metadata +5 -3
- data/.ruby-version +0 -1
data/docs/_pages/headed-ox-limitations.adoc
CHANGED
|
@@ -77,6 +77,7 @@ all_attrs = books.flat_map { |book| book.attributes.values }
 ----
 
 **Test failures:**
+
 * `spec/moxml/xpath/compiler_spec.rb:189` - Attribute axis wildcards
 * `spec/moxml/xpath/axes_spec.rb:220` - Attribute + predicate combinations
 
@@ -85,6 +86,7 @@ all_attrs = books.flat_map { |book| book.attributes.values }
 **Status:** Not implemented in HeadedOx adapter
 
 **What's missing:**
+
 * `adapter.namespace(node)` - Get primary namespace of element
 * `adapter.namespace_definitions(node)` - Get all namespace definitions
 * `node.namespace` - Access element's namespace
@@ -114,6 +116,7 @@ end
 None. These operations require Ox enhancements.
 
 **Test failures:**
+
 * `spec/integration/shared_examples/edge_cases.rb:102` - Default namespace changes
 * `spec/integration/shared_examples/edge_cases.rb:120` - Recursive namespace definitions
 * `spec/integration/shared_examples/integration_workflows.rb:98` - Complex namespace scenarios
@@ -148,6 +151,7 @@ value = attr&.value
 ----
 
 **Test failures:**
+
 * `spec/integration/shared_examples/edge_cases.rb:134` - Attributes with same local name
 
 === 4. Parent Node Setter
@@ -189,6 +193,7 @@ new_parent.add_child(node) # Add to new parent
 **Note:** This workaround is used internally where needed, but the getter/setter syntax is not supported.
 
 **Test failures:**
+
 * `spec/integration/shared_examples/integration_workflows.rb:122` - Complex modifications
 
 === 5. CDATA End Marker Escaping
@@ -224,6 +229,7 @@ doc.create_cdata(safe_content)
 ----
 
 **Test failures:**
+
 * `spec/integration/shared_examples/edge_cases.rb:41` - CDATA nested markers
 * `spec/integration/shared_examples/node_wrappers/cdata_behavior.rb:44` - CDATA escaping
 
@@ -263,6 +269,7 @@ second_title = titles[1].text # Works correctly
 ----
 
 **Test failures:**
+
 * `spec/moxml/adapter/headed_ox_spec.rb:77` - String functions in predicates
 * `spec/moxml/adapter/headed_ox_spec.rb:84` - Position functions
 * `spec/moxml/adapter/headed_ox_spec.rb:304` - last() function
@@ -277,6 +284,7 @@ second_title = titles[1].text # Works correctly
 **Why it fails:**
 
 When using `//*` to select all elements, HeadedOx returns 6 elements while Nokogiri returns 7+. This is likely due to differences in:
+
 * Document node counting
 * Text node inclusion/exclusion
 * Ox's internal DOM structure
@@ -297,6 +305,7 @@ result = doc.xpath("//*")
 Use specific element names instead of wildcards.
 
 **Test failures:**
+
 * `spec/moxml/xpath/compiler_spec.rb:160` - Descendant-or-self wildcards
 
 === 8. Namespace-Aware XPath with Predicates
@@ -340,6 +349,7 @@ result = items.select { |item| item['id'] == '123' }
 ----
 
 **Test failures:**
+
 * `spec/integration/shared_examples/integration_workflows.rb:69` - XPath queries
 
 == Ox Enhancement Requirements
@@ -477,6 +487,7 @@ Ensure element counting matches other parsers' conventions when using wildcard s
 === If Ox Adds Namespace API (v1.3)
 
 With namespace methods (`namespace()`, `namespace_definitions()`):
+
 * **Target:** 99.5% pass rate
 * **Adds:** 4 more passing tests
 * **Still limited:** Parent setter, CDATA escaping, attribute wildcards
@@ -484,6 +495,7 @@ With namespace methods (`namespace()`, `namespace_definitions()`):
 === If Ox Adds Reparenting API (v1.4)
 
 With `reparent(new_parent)` method:
+
 * **Target:** 99.6% pass rate
 * **Adds:** 1 more passing test
 * **Still limited:** CDATA escaping, attribute wildcards
@@ -491,6 +503,7 @@ With `reparent(new_parent)` method:
 === If Ox Fixes CDATA Escaping (v1.5)
 
 With proper `]]>` handling:
+
 * **Target:** 99.7% pass rate
 * **Adds:** 2 more passing tests
 * **Still limited:** Attribute wildcards
@@ -498,6 +511,7 @@ With proper `]]>` handling:
 === Full Feature Parity (v2.0)
 
 Would require:
+
 * All Ox enhancements above
 * XPath parser support for `@*` wildcard
 * Investigation and fixes for text content access
@@ -546,11 +560,13 @@ Total passing: **1,992 / 2,008** (99.20%)
 HeadedOx v1.2 successfully delivers on its core promise: **fast XML parsing with comprehensive XPath support**. The 99.20% pass rate demonstrates excellent compatibility with Moxml's test suite, with the 0.80% of failures representing clear architectural boundaries in the Ox gem rather than bugs in HeadedOx.
 
 **Use HeadedOx when:**
+
 - Speed + XPath coverage matter most
 - Basic namespace queries are sufficient
 - DOM is mostly read-only
 
 **Use Nokogiri/Oga when:**
+
 - Need full namespace API
 - Heavy DOM modifications required
 - 100% feature parity is critical
data/docs/_pages/node-api-reference.adoc
CHANGED
|
@@ -1,10 +1,77 @@
 ---
-title: Node API
-
-
+title: Node API Reference
+:toc:
+:toclevels: 3
 ---
 
-== Node API
+== Node API Reference
+
+This reference documents the API of all node types in Moxml. For a guide on API consistency and safe coding patterns, see the link:../guides/node-api-consistency[Node API Consistency Guide].
+
+== Node Identity: The #identifier Method
+
+All node types in Moxml support the `#identifier` method, which returns the primary identifier for a node:
+
+[cols="1,2,1"]
+|===
+| Node Type | #identifier Returns | Example
+
+| Element
+| The tag name
+| `"book"`, `"title"`
+
+| Attribute
+| The attribute name
+| `"id"`, `"class"`
+
+| ProcessingInstruction
+| The PI target
+| `"xml-stylesheet"`
+
+| Text, Comment, Cdata
+| `nil` (no identifier)
+| `nil`
+
+| Declaration
+| `nil` (no identifier)
+| `nil`
+
+| Document
+| `nil` (no identifier)
+| `nil`
+|===
+
+**Example usage:**
+
+[source,ruby]
+----
+element = doc.at_xpath("//book")
+puts element.identifier # => "book"
+
+attr = element.attribute("id")
+puts attr.identifier # => "id"
+
+pi = doc.children.find { |n| n.processing_instruction? }
+puts pi.identifier # => "xml-stylesheet"
+
+text = element.children.find { |n| n.text? }
+puts text.identifier # => nil
+----
+
+**Safe iteration over mixed nodes:**
+
+[source,ruby]
+----
+doc.root.children.each do |node|
+  if id = node.identifier
+    puts "#{node.class.name.split('::').last}: #{id}"
+  else
+    puts "#{node.class.name.split('::').last}: (no identifier)"
+  end
+end
+----
+
+== Common Node Methods
 
 == XML objects and their methods
 
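Reviewer note: the `#identifier` contract documented in this hunk (a name for elements, attributes and processing instructions, `nil` for content-only nodes) can be sketched in plain Ruby. The classes below (`SketchElement`, `SketchAttribute`, `SketchText`) and the `describe_node` helper are hypothetical stand-ins introduced only for illustration; they are not Moxml's wrappers and make no claim about how Moxml implements the method.

```ruby
# Hypothetical stand-ins for node wrappers; NOT Moxml's real classes.
SketchElement = Struct.new(:name) do
  def identifier
    name # an element is identified by its tag name
  end
end

SketchAttribute = Struct.new(:name, :value) do
  def identifier
    name # an attribute is identified by its attribute name
  end
end

SketchText = Struct.new(:content) do
  def identifier
    nil # text carries content but no identifier
  end
end

# Label any node without type checks: a nil identifier is an expected value.
def describe_node(node)
  label = node.class.name.sub("Sketch", "")
  id = node.identifier
  id ? "#{label}: #{id}" : "#{label}: (no identifier)"
end

nodes = [
  SketchElement.new("book"),
  SketchAttribute.new("id", "42"),
  SketchText.new("hello"),
]
nodes.each { |node| puts describe_node(node) }
```

Because every stand-in answers `identifier`, the caller never needs `respond_to?` checks; it only has to treat `nil` as a normal result.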
@@ -48,3 +115,25 @@ See also:
 
 * link:../guides/working-with-documents[Working with documents guide]
 * link:../guides/advanced-features[Advanced features guide]
+=== Doctype nodes
+
+Doctype nodes represent DOCTYPE declarations in XML documents.
+
+[source,ruby]
+----
+doctype = doc.create_doctype("html", "-//W3C//DTD HTML 4.01//EN",
+                             "http://www.w3.org/TR/html4/strict.dtd")
+doctype.name # => "html"
+doctype.external_id # => "-//W3C//DTD HTML 4.01//EN"
+doctype.system_id # => "http://www.w3.org/TR/html4/strict.dtd"
+doctype.identifier # => "html"
+----
+
+*Available methods:*
+
+* `name` - Returns the DOCTYPE name (root element name)
+* `external_id` - Returns the PUBLIC identifier (or nil)
+* `system_id` - Returns the SYSTEM identifier (DTD URI, or nil)
+* `identifier` - Returns the primary identifier (same as `name`)
+
+All Doctype accessor methods are fully implemented across all 6 adapters.
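Reviewer note: the adapter hunks in this diff add `doctype_name`, `doctype_external_id` and `doctype_system_id` to each adapter, and the Oga adapter maps `external_id` onto the native `public_id`. The sketch below shows one way a wrapper could delegate to such adapter methods so callers see a single API. `NativeA`, `NativeB`, `AdapterA`, `AdapterB` and `DoctypeSketch` are hypothetical names; the actual `Moxml::Doctype` changes are not shown in this excerpt, so this is an assumption about the shape, not the gem's code.

```ruby
# Hypothetical native doctype objects from two different parsers.
NativeA = Struct.new(:name, :external_id, :system_id)
NativeB = Struct.new(:name, :public_id, :system_id) # exposes public_id instead

# Each adapter translates the shared accessor names to its native API.
module AdapterA
  def self.doctype_name(native)
    native.name
  end

  def self.doctype_external_id(native)
    native.external_id
  end

  def self.doctype_system_id(native)
    native.system_id
  end
end

module AdapterB
  def self.doctype_name(native)
    native.name
  end

  def self.doctype_external_id(native)
    native.public_id # the only accessor that differs between the two
  end

  def self.doctype_system_id(native)
    native.system_id
  end
end

# Wrapper with one public API, regardless of the backing parser.
class DoctypeSketch
  def initialize(native, adapter)
    @native = native
    @adapter = adapter
  end

  def name
    @adapter.doctype_name(@native)
  end

  def external_id
    @adapter.doctype_external_id(@native)
  end

  def system_id
    @adapter.doctype_system_id(@native)
  end

  # Primary identifier: same as the DOCTYPE name.
  def identifier
    name
  end
end

public_id = "-//W3C//DTD HTML 4.01//EN"
dtd_uri = "http://www.w3.org/TR/html4/strict.dtd"
a = DoctypeSketch.new(NativeA.new("html", public_id, dtd_uri), AdapterA)
b = DoctypeSketch.new(NativeB.new("html", public_id, dtd_uri), AdapterB)
puts a.external_id == b.external_id
```

Keeping the parser-specific naming inside the adapter means the wrapper stays identical for every backend.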
|
data/lib/moxml/adapter/customized_rexml/formatter.rb
CHANGED
|
@@ -166,7 +166,7 @@ module Moxml
   end
 
   # Then write regular attributes
-  node.attributes.each do |name, attr|
+  node.attributes.each do |name, attr| # rubocop:disable Style/CombinableLoops
     next if name.to_s.start_with?("xmlns:") || name.to_s == "xmlns"
 
     output << " "
@@ -180,7 +180,7 @@ module Moxml
     value = attr.respond_to?(:value) ? attr.value : attr
     output << escape_attribute_value(value.to_s)
     output << "\""
-  end
+  end # rubocop:enable Style/CombinableLoops
 end
 
 def escape_attribute_value(value)
data/lib/moxml/adapter/libxml.rb
CHANGED
|
@@ -332,7 +332,13 @@ module Moxml
 
 def document(node)
   native_node = unpatch_node(node)
-  native_node
+  return nil unless native_node
+
+  # Handle documents themselves
+  return native_node if native_node.is_a?(::LibXML::XML::Document)
+
+  # For other nodes, return their document
+  native_node.doc
 end
 
 def root(document)
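Reviewer note: the new `document(node)` body is a three-step guard chain: no node yields `nil`, a document returns itself, and any other node reports its owning document. A minimal sketch of that ordering, using hypothetical stand-ins (`SketchDocument`, `SketchNode`, `owning_document`) rather than LibXML classes:

```ruby
# Hypothetical stand-ins; the real adapter checks ::LibXML::XML::Document.
class SketchDocument; end
SketchNode = Struct.new(:doc)

def owning_document(native_node)
  # 1. Nothing to resolve.
  return nil unless native_node

  # 2. A document is its own owner.
  return native_node if native_node.is_a?(SketchDocument)

  # 3. Any other node reports the document it belongs to.
  native_node.doc
end

doc = SketchDocument.new
puts owning_document(SketchNode.new(doc)).equal?(doc)
```

The document check has to come before the `doc` call because, in this sketch, a document stand-in has no `doc` accessor of its own.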
@@ -782,6 +788,20 @@ module Moxml
   end
 end
 
+# Doctype accessor methods
+def doctype_name(native)
+  # LibXML uses DoctypeWrapper which stores the values
+  native.name
+end
+
+def doctype_external_id(native)
+  native.external_id
+end
+
+def doctype_system_id(native)
+  native.system_id
+end
+
 def xpath(node, expression, namespaces = nil)
   native_node = unpatch_node(node)
   return [] unless native_node
@@ -831,7 +851,16 @@ module Moxml
 if native_node.is_a?(::LibXML::XML::Document)
   output = +""
 
-
+  # Check if we should include declaration
+  # Priority: explicit no_declaration option > default (include)
+  should_include_decl = if options.key?(:no_declaration)
+    !options[:no_declaration]
+  else
+    # Default: include declaration
+    true
+  end
+
+  if should_include_decl
     # Check if declaration was explicitly managed
     if native_node.instance_variable_defined?(:@moxml_declaration)
       decl = native_node.instance_variable_get(:@moxml_declaration)
@@ -1134,7 +1163,7 @@ module Moxml
 # Add namespace definitions (only on this element, not ancestors)
 if elem.respond_to?(:namespaces)
   seen_ns = {}
-  elem.namespaces.
+  elem.namespaces.each do |ns|
     prefix = ns.prefix
     uri = ns.href
     next if seen_ns.key?(prefix)
@@ -1301,7 +1330,7 @@ module Moxml
 # - On child elements, output namespace definitions that override parent namespaces
 if elem.respond_to?(:namespaces) && elem.namespaces.respond_to?(:definitions)
   # Get parent's namespace definitions to detect overrides
-  parent_ns_defs = if !include_ns && elem.respond_to?(:parent) && elem.parent
+  parent_ns_defs = if !include_ns && elem.respond_to?(:parent) && elem.parent && !elem.parent.is_a?(::LibXML::XML::Document)
     parent_namespaces = {}
     if elem.parent.respond_to?(:namespaces)
       elem.parent.namespaces.each do |ns|
@@ -1444,6 +1473,7 @@ module Moxml
   node.each_child do |child|
     collect_ns_from_subtree(child, ns_defs) if child.element?
   end
+  ns_defs
 end
 
 def build_xpath_namespaces(node, user_namespaces)
data/lib/moxml/adapter/nokogiri.rb
CHANGED
|
@@ -221,6 +221,23 @@ module Moxml
 end
 
 def add_child(element, child)
+  # Special handling for declarations on Nokogiri documents
+  if element.is_a?(::Nokogiri::XML::Document) &&
+     child.is_a?(::Nokogiri::XML::ProcessingInstruction) &&
+     child.name == "xml"
+    # Set document's xml_decl property
+    version = declaration_attribute(child, "version") || "1.0"
+    encoding = declaration_attribute(child, "encoding")
+    standalone = declaration_attribute(child, "standalone")
+
+    # Nokogiri's xml_decl can only be set via instance variable
+    element.instance_variable_set(:@xml_decl, {
+      version: version,
+      encoding: encoding,
+      standalone: standalone,
+    }.compact)
+  end
+
   if node_type(child) == :doctype
     # avoid exceptions: cannot reparent Nokogiri::XML::DTD there
     element.create_internal_subset(
@@ -240,6 +257,14 @@ module Moxml
 end
 
 def remove(node)
+  # Special handling for declarations on Nokogiri documents
+  if node.is_a?(::Nokogiri::XML::ProcessingInstruction) &&
+     node.name == "xml" &&
+     node.parent.is_a?(::Nokogiri::XML::Document)
+    # Clear document's xml_decl when removing declaration
+    node.parent.instance_variable_set(:@xml_decl, nil)
+  end
+
   node.remove
 end
 
@@ -296,6 +321,19 @@ module Moxml
   node.namespace_definitions
 end
 
+# Doctype accessor methods
+def doctype_name(native)
+  native.name
+end
+
+def doctype_external_id(native)
+  native.external_id
+end
+
+def doctype_system_id(native)
+  native.system_id
+end
+
 def xpath(node, expression, namespaces = nil)
   node.xpath(expression, namespaces).to_a
 rescue ::Nokogiri::XML::XPath::SyntaxError => e
@@ -328,8 +366,18 @@ module Moxml
 if options[:indent].to_i.positive?
   save_options |= ::Nokogiri::XML::Node::SaveOptions::FORMAT
 end
-
-
+
+# Handle declaration option
+# Priority:
+# 1. Explicit no_declaration option
+# 2. Check Nokogiri's internal @xml_decl (when remove is called, this becomes nil)
+if options.key?(:no_declaration)
+  save_options |= ::Nokogiri::XML::Node::SaveOptions::NO_DECLARATION if options[:no_declaration]
+elsif node.respond_to?(:instance_variable_get) &&
+      node.instance_variable_defined?(:@xml_decl)
+  # Nokogiri's internal state - if nil, declaration was removed
+  xml_decl = node.instance_variable_get(:@xml_decl)
+  save_options |= ::Nokogiri::XML::Node::SaveOptions::NO_DECLARATION if xml_decl.nil?
 end
 
 node.to_xml(
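Reviewer note: the hunk above builds a bitmask of save options, with an explicit `no_declaration` option taking priority over the tracked `@xml_decl` state. The sketch below isolates that decision as a pure function so the priority order is easy to test. `FORMAT` and `NO_DECLARATION` are stand-in values chosen for this illustration (real code would reference `::Nokogiri::XML::Node::SaveOptions`), and `save_flags`, `decl_tracked` and `xml_decl` are hypothetical names.

```ruby
# Stand-in flag values for illustration only; real code would use
# ::Nokogiri::XML::Node::SaveOptions::FORMAT and ::NO_DECLARATION.
FORMAT = 1
NO_DECLARATION = 2

# options      - serialization options hash (:indent, :no_declaration)
# decl_tracked - true when the document carries a tracked declaration slot
# xml_decl     - the tracked value; nil means the declaration was removed
def save_flags(options, decl_tracked:, xml_decl: nil)
  flags = 0
  flags |= FORMAT if options[:indent].to_i.positive?

  if options.key?(:no_declaration)
    # 1. An explicit option always wins, even when it is false.
    flags |= NO_DECLARATION if options[:no_declaration]
  elsif decl_tracked
    # 2. Otherwise fall back to the tracked state.
    flags |= NO_DECLARATION if xml_decl.nil?
  end

  flags
end

puts save_flags({ indent: 2 }, decl_tracked: true, xml_decl: nil)
```

Testing `options.key?` rather than the option's truthiness is what lets `no_declaration: false` force the declaration back on after it was removed.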
|
data/lib/moxml/adapter/oga.rb
CHANGED
|
@@ -10,7 +10,10 @@ module Moxml
 class Oga < Base
   class << self
     def set_root(doc, element)
-
+      # Clear existing root element if any - Oga's NodeSet needs special handling
+      # We need to manually remove elements since NodeSet doesn't support clear or delete_if
+      elements_to_remove = doc.children.select { |child| child.is_a?(::Oga::XML::Element) }
+      elements_to_remove.each { |elem| doc.children.delete(elem) }
       doc.children << element
     end
 
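Reviewer note: the new `set_root` collects the existing element children first and deletes them in a second pass, then appends the new root. A minimal sketch of that two-pass pattern, using a plain `Array` and hypothetical structs (`SketchElement`, `SketchComment`, `replace_root`) in place of Oga's `NodeSet` and node classes:

```ruby
# Hypothetical stand-ins for document children; not Oga classes.
SketchElement = Struct.new(:name)
SketchComment = Struct.new(:text)

def replace_root(children, new_root)
  # Pass 1: snapshot the elements to drop, so the collection is not
  # mutated while it is being iterated.
  stale = children.select { |child| child.is_a?(SketchElement) }

  # Pass 2: remove them one by one, then append the new root.
  stale.each { |elem| children.delete(elem) }
  children << new_root
end

children = [SketchComment.new("header"), SketchElement.new("old_root")]
replace_root(children, SketchElement.new("new_root"))
puts children.map { |child| child.class.name }.join(", ")
```

Non-element children such as comments are left in place, so only the root element is swapped.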
@@ -247,6 +250,13 @@ module Moxml
     child_or_text
   end
 
+  # Special handling for declarations on Oga documents
+  if element.is_a?(::Oga::XML::Document) &&
+     child.is_a?(::Oga::XML::XmlDeclaration)
+    # Set as document's xml_declaration
+    element.instance_variable_set(:@xml_declaration, child)
+  end
+
   element.children << child
 end
 
@@ -273,6 +283,13 @@ module Moxml
 end
 
 def remove(node)
+  # Special handling for declarations on Oga documents
+  if node.is_a?(::Oga::XML::XmlDeclaration) &&
+     node.parent.is_a?(::Oga::XML::Document)
+    # Clear document's xml_declaration when removing declaration
+    node.parent.instance_variable_set(:@xml_declaration, nil)
+  end
+
   node.remove
 end
 
@@ -348,6 +365,19 @@ module Moxml
   node.namespaces.values
 end
 
+# Doctype accessor methods
+def doctype_name(native)
+  native.name
+end
+
+def doctype_external_id(native)
+  native.public_id
+end
+
+def doctype_system_id(native)
+  native.system_id
+end
+
 def xpath(node, expression, namespaces = nil)
   node.xpath(expression, {},
              namespaces: namespaces&.transform_keys(&:to_s)).to_a
@@ -371,8 +401,55 @@ module Moxml
     )
   end
 
-  def serialize(node,
-  #
+  def serialize(node, options = {})
+    # Oga's XmlGenerator doesn't support options directly
+    # We need to handle declaration options ourselves for Document nodes
+    if node.is_a?(::Oga::XML::Document)
+      # Check if we should include declaration
+      # Priority: explicit option > existence of xml_declaration node
+      should_include_decl = if options.key?(:no_declaration)
+        !options[:no_declaration]
+      elsif options.key?(:declaration)
+        options[:declaration]
+      else
+        # Default: include if document has xml_declaration node
+        node.xml_declaration ? true : false
+      end
+
+      if should_include_decl && !node.xml_declaration
+        # Need to add declaration - create default one
+        output = +""
+        output << '<?xml version="1.0" encoding="UTF-8"?>'
+        output << "\n"
+
+        # Serialize doctype if present
+        output << node.doctype.to_xml << "\n" if node.doctype
+
+        # Serialize children
+        node.children.each do |child|
+          output << ::Moxml::Adapter::CustomizedOga::XmlGenerator.new(child).to_xml
+        end
+
+        return output
+      elsif !should_include_decl
+        # Skip xml_declaration
+        output = +""
+
+        # Serialize doctype if present
+        output << node.doctype.to_xml << "\n" if node.doctype
+
+        # Serialize root and other children
+        node.children.each do |child|
+          next if child.is_a?(::Oga::XML::XmlDeclaration)
+
+          output << ::Moxml::Adapter::CustomizedOga::XmlGenerator.new(child).to_xml
+        end
+
+        return output
+      end
+    end
+
+    # Default: use XmlGenerator
     ::Moxml::Adapter::CustomizedOga::XmlGenerator.new(node).to_xml
   end
 end
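Reviewer note: the Oga `serialize` above makes two decisions in sequence. It first decides whether a declaration is wanted (`no_declaration` beats `declaration`, which beats the presence of a declaration node), then picks one of three output paths. The sketch below restates only that control flow as pure functions. `include_declaration?`, `serialization_path` and the returned symbols are hypothetical names for this illustration, and the sketch coerces `options[:declaration]` to a boolean, which the diffed code does not do.

```ruby
# Decide whether the output should carry an XML declaration.
# has_declaration: whether the document already holds a declaration node.
def include_declaration?(options, has_declaration)
  if options.key?(:no_declaration)
    !options[:no_declaration]            # highest priority
  elsif options.key?(:declaration)
    options[:declaration] ? true : false # second priority (coerced here)
  else
    has_declaration                      # default: mirror the document
  end
end

# Map that decision onto the three branches of the serializer.
def serialization_path(options, has_declaration)
  wanted = include_declaration?(options, has_declaration)

  if wanted && !has_declaration
    :prepend_default_declaration # emit a default declaration, then children
  elsif !wanted
    :skip_declaration            # emit children, skipping any declaration
  else
    :default_generator           # the document already has what is wanted
  end
end

[
  [{}, true],
  [{}, false],
  [{ declaration: true }, false],
  [{ no_declaration: true }, true],
].each do |options, has_declaration|
  puts "#{options.inspect}, #{has_declaration} -> #{serialization_path(options, has_declaration)}"
end
```

One consequence worth noting: with no options and no declaration node, the decision is "not wanted", so the serializer takes the skip path rather than the default generator.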