nori 2.8.0 → 2.9.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +12 -1
- data/README.md +60 -2
- data/lib/nori/version.rb +1 -1
- data/lib/nori/xml_utility_node.rb +68 -8
- data/lib/nori.rb +12 -6
- metadata +1 -1
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: d2ffd841cec28588977a4cf413ade2a33ceba13466046a1299e27016231db0ef
|
|
4
|
+
data.tar.gz: 51edbdf4ab6413ca7583bb02e05a88ad13fac97741797cf5dde6135edd99d5f8
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 1700483ec9b91f559ab0fa138c5760fe237f7bf92f5b927bbde60dc437175af5e8c8cc12998195bd875ea9272f62d25727e3fe85a5954918cc0ea31aa5a413a3
|
|
7
|
+
data.tar.gz: 8713852dfe40adfe96ae00debff569d58f6d8c394fd48fe8df7ad34a767e886605df4163d15cd8573922926ad7abb9f3af1135ddd7180301172dfef5b6b718c6
|
data/CHANGELOG.md
CHANGED
|
@@ -5,6 +5,16 @@ All notable changes to this project will be documented in this file.
|
|
|
5
5
|
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
|
|
6
6
|
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
|
7
7
|
|
|
8
|
+
## [2.9.0] - 2026-07-05
|
|
9
|
+
|
|
10
|
+
### Changed
|
|
11
|
+
|
|
12
|
+
* The `:standards` profile no longer applies schema-less typing. Under `Nori.new(standards: true)`, `:advanced_typecasting` now defaults to `false` (an explicit `advanced_typecasting: true` still wins), and the bare un-namespaced `type=` and `nil=` attributes become ordinary attributes instead of casting instructions. The `type=` conversions (`integer`, `date`, `decimal`, `array`, `file`, ...) no longer run, and only a prefixed `xsi:nil="true"` marks an element nil. These conventions come from Rails' `Hash.from_xml` (inherited via crack and merb), not from any XML spec. Without a schema, character data is just text: knowing that `<id>123</id>` holds an integer is the business of a schema-aware layer, not a guess from string shape. Parsing without the profile is unchanged.
|
|
13
|
+
|
|
14
|
+
### Added
|
|
15
|
+
|
|
16
|
+
* Add the `:serializable` profile option. `Nori.new(serializable: true)` returns plain, directly-serializable data with no custom value classes. A text node that also has attributes becomes a Hash in the XML JSON convention (`{"#text" => content}` merged with the existing `@`-prefixed attributes) instead of a `Nori::StringWithAttributes`, and a text node without attributes becomes a plain `String`. A `type="file"` node is no longer decoded into a `Nori::StringIOFile`; it folds into text and attributes like any other node, leaving the base64 content untouched for the caller to decode. That `type=` decoding is a Rails/ActiveSupport `Hash.from_xml` convention Nori inherited via crack and merb, not part of any XML spec, which is why the plain-data profile drops it. With this, values survive `to_json`, `to_yaml`, and `Marshal`. Opt-in on the 2.x line, probably the default in Nori 3.0. Reported by @ArnoldMEDLINQ ([#107](https://github.com/savonrb/nori/issues/107)), thanks to @dub357 for pinpointing the cause and finding [#106](https://github.com/savonrb/nori/pull/106), and thanks to @ekzobrain whose hash-shape approach this builds on.
|
|
17
|
+
|
|
8
18
|
## [2.8.0] - 2026-07-04
|
|
9
19
|
|
|
10
20
|
### Added
|
|
@@ -255,4 +265,5 @@ Please make sure to read the updated README for how to use the new version.
|
|
|
255
265
|
## 0.1.0 2009-03-28
|
|
256
266
|
* Initial release.
|
|
257
267
|
|
|
258
|
-
[2.
|
|
268
|
+
[2.9.0]: https://github.com/savonrb/nori/compare/v2.8.0...v2.9.0
|
|
269
|
+
[2.8.0]: https://github.com/savonrb/nori/compare/v2.7.1...v2.8.0
|
data/README.md
CHANGED
|
@@ -2,8 +2,8 @@ Nori
|
|
|
2
2
|
====
|
|
3
3
|
|
|
4
4
|
[](https://github.com/savonrb/nori/actions/workflows/test.yml)
|
|
5
|
-
[](https://rubygems.org/gems/nori)
|
|
6
|
+
[](https://coveralls.io/github/savonrb/nori?branch=main)
|
|
7
7
|
|
|
8
8
|
Really simple XML parsing ripped from Crack, which ripped it from Merb.
|
|
9
9
|
|
|
@@ -50,6 +50,11 @@ result["foo"].attributes
|
|
|
50
50
|
# => {"bar"=>"baz"}
|
|
51
51
|
```
|
|
52
52
|
|
|
53
|
+
Because `StringWithAttributes` hides the attributes on an accessor, they are
|
|
54
|
+
lost on a plain `to_json` and callers have to type-check each value. The
|
|
55
|
+
[`serializable`](#serializable) profile returns a plain Hash instead and will
|
|
56
|
+
probably be the default in 3.0.
|
|
57
|
+
|
|
53
58
|
## advanced_typecasting
|
|
54
59
|
|
|
55
60
|
Nori can automatically convert string values to `TrueClass`, `FalseClass`, `Time`, `Date`, and `DateTime`:
|
|
@@ -183,3 +188,56 @@ Members of the profile:
|
|
|
183
188
|
Nori.new(:standards => true).parse('<foo bar="baz"/>')
|
|
184
189
|
# => {"foo"=>""}
|
|
185
190
|
```
|
|
191
|
+
|
|
192
|
+
- **no schema-less typing.** Without a schema, character data is just text, so
|
|
193
|
+
the profile implies `advanced_typecasting: false` (overridable) and ignores
|
|
194
|
+
the bare `type=` and `nil=` attribute conventions that Nori inherited from
|
|
195
|
+
Rails' `Hash.from_xml`. A `type="integer"` no longer casts, a `type="array"`
|
|
196
|
+
no longer folds children into an Array, and both stay visible as ordinary
|
|
197
|
+
attributes. Only the prefixed `xsi:nil="true"` still marks an element nil,
|
|
198
|
+
because that convention comes from XML Schema Instance itself.
|
|
199
|
+
|
|
200
|
+
```ruby
|
|
201
|
+
Nori.new(:standards => true).parse('<id type="integer">123</id>')
|
|
202
|
+
# => {"id"=>"123"} (with "type" accessible via #attributes)
|
|
203
|
+
```
|
|
204
|
+
|
|
205
|
+
## serializable
|
|
206
|
+
|
|
207
|
+
`serializable` is a profile that makes Nori return plain, directly-serializable
|
|
208
|
+
data with no custom value classes. It is opt-in on the 2.x line and will
|
|
209
|
+
probably become the default in 3.0.
|
|
210
|
+
|
|
211
|
+
```ruby
|
|
212
|
+
Nori.new(:serializable => true)
|
|
213
|
+
```
|
|
214
|
+
|
|
215
|
+
Members of the profile:
|
|
216
|
+
|
|
217
|
+
- **text nodes with attributes become a Hash.** A tag with both text content
|
|
218
|
+
and attributes maps to the XML JSON convention (`{"#text" => content}` merged
|
|
219
|
+
with the `@`-prefixed attributes) instead of a `Nori::StringWithAttributes`.
|
|
220
|
+
This is the same `@`-keyed shape element nodes already use, so the attributes
|
|
221
|
+
survive `to_json`, `to_yaml`, and `Marshal`. A text node without attributes
|
|
222
|
+
stays a plain `String`.
|
|
223
|
+
|
|
224
|
+
```ruby
|
|
225
|
+
Nori.new(:serializable => true).parse('<foo bar="baz">Content</foo>')
|
|
226
|
+
# => {"foo"=>{"#text"=>"Content", "@bar"=>"baz"}}
|
|
227
|
+
|
|
228
|
+
Nori.new(:serializable => true).parse('<foo>Content</foo>')
|
|
229
|
+
# => {"foo"=>"Content"}
|
|
230
|
+
```
|
|
231
|
+
|
|
232
|
+
- **`type="file"` nodes are not decoded.** Without the profile, a node carrying
|
|
233
|
+
`type="file"` is base64-decoded into a `Nori::StringIOFile` with its filename
|
|
234
|
+
and content type. The profile skips that. The node folds into text and
|
|
235
|
+
attributes like any other, leaving the base64 content as a plain `String` for
|
|
236
|
+
the caller to decode. This `type=` decoding is a Rails/ActiveSupport
|
|
237
|
+
`Hash.from_xml` convention Nori inherited through crack and merb, not part of
|
|
238
|
+
any XML specification, so the plain-data profile leaves it behind.
|
|
239
|
+
|
|
240
|
+
```ruby
|
|
241
|
+
Nori.new(:serializable => true).parse('<doc type="file" name="x.pdf">aGVsbG8=</doc>')
|
|
242
|
+
# => {"doc"=>{"#text"=>"aGVsbG8=", "@type"=>"file", "@name"=>"x.pdf"}}
|
|
243
|
+
```
|
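The README's claim that attributes are lost on a plain `to_json` can be reproduced without Nori. The sketch below is only an illustration: `TaggedString` is a hypothetical stand-in for `Nori::StringWithAttributes` (assumed here to be a `String` subclass with an `attributes` accessor), and the Hash literal copies the shape from the README example above.

```ruby
require "json"

# Hypothetical stand-in for Nori::StringWithAttributes: a String subclass
# that carries attributes on an accessor. This is not Nori's actual class.
class TaggedString < String
  attr_accessor :attributes
end

tagged = TaggedString.new("Content")
tagged.attributes = { "bar" => "baz" }

# JSON serializes only the string content, so the accessor data is dropped.
legacy_json = JSON.generate({ "foo" => tagged })

# The :serializable shape keeps text and attributes as plain Hash entries.
plain = { "foo" => { "#text" => "Content", "@bar" => "baz" } }
plain_json = JSON.generate(plain)

puts legacy_json
puts plain_json
puts JSON.parse(plain_json) == plain
```

Because the second value is built only from `Hash` and `String`, parsing the JSON back yields an equal structure, which is what the profile is meant to guarantee.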
data/lib/nori/xml_utility_node.rb
CHANGED
|
@@ -92,12 +92,11 @@ class Nori
|
|
|
92
92
|
attributes = Hash[*intermediate]
|
|
93
93
|
end
|
|
94
94
|
|
|
95
|
-
|
|
96
|
-
@type = self.class.available_typecasts.include?(attributes["type"]) ? attributes.delete("type") : attributes["type"]
|
|
95
|
+
@type = bare_type(attributes)
|
|
97
96
|
|
|
98
97
|
@nil_element = false
|
|
99
98
|
attributes.keys.each do |key|
|
|
100
|
-
if result =
|
|
99
|
+
if result = nil_attribute_pattern.match(key)
|
|
101
100
|
@nil_element = attributes.delete(key) == "true"
|
|
102
101
|
attributes.delete("xmlns:#{result[2]}") if result[1]
|
|
103
102
|
end
|
|
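The nil detection in the hunk above now goes through `nil_attribute_pattern`, which picks one of two regular expressions. A minimal sketch of how those two patterns treat a few attribute names follows. The regexes are copied from the diff; the constant names and the sample keys are made up for illustration.

```ruby
# The two patterns returned by nil_attribute_pattern, copied from the diff.
LEGACY_NIL    = /^((.*):)?nil$/ # bare "nil" or any prefixed form
STANDARDS_NIL = /^((.+):)nil$/  # prefixed form only

%w[nil xsi:nil foo:nil type].each do |key|
  legacy    = LEGACY_NIL.match(key)
  standards = STANDARDS_NIL.match(key)
  # Capture group 2 holds the prefix, which the caller uses to drop the
  # matching "xmlns:<prefix>" declaration.
  puts format("%-8s legacy=%-5s standards=%-5s prefix=%p",
              key, !legacy.nil?, !standards.nil?, legacy && legacy[2])
end
```

The only difference is the bare `nil` key: the legacy pattern accepts it with no prefix captured, while the standards pattern requires a non-empty prefix before the colon.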
@@ -130,12 +129,17 @@ class Nori
|
|
|
130
129
|
# Converts the node into a hash with the node name as its single key.
|
|
131
130
|
#
|
|
132
131
|
# The value depends on the shape of the node. A node typed as "file"
|
|
133
|
-
# becomes a {StringIOFile}
|
|
134
|
-
#
|
|
132
|
+
# becomes a {StringIOFile}, unless the +:serializable+ profile is enabled.
|
|
133
|
+
# That profile returns plain data only, so a file node folds into text and
|
|
134
|
+
# attributes like any other node and the base64 content is left undecoded.
|
|
135
|
+
# A node with text content becomes a typecast scalar. Every other node
|
|
136
|
+
# folds its children into an array or a hash. Under the +:standards+
|
|
137
|
+
# profile no bare type attribute is honored ({#bare_type}), so file
|
|
138
|
+
# decoding, array folding and typecasting never happen there.
|
|
135
139
|
#
|
|
136
140
|
# @return [Hash{String => Object}] the node name mapped to its value
|
|
137
141
|
def to_hash
|
|
138
|
-
return { name => file_value } if @type == "file"
|
|
142
|
+
return { name => file_value } if @type == "file" && !@options[:serializable]
|
|
139
143
|
return { name => text_value } if @text
|
|
140
144
|
|
|
141
145
|
groups = group_children
|
|
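The changed guard in `to_hash` can be illustrated in isolation. This is a simplified sketch, not Nori's code: `node_value` is a hypothetical helper, a plain `StringIO` stands in for `Nori::StringIOFile`, and the non-file branch returns the raw text without the attribute handling the real method performs.

```ruby
require "stringio"

# Simplified sketch of the to_hash guard. A node typed as "file" is decoded
# only when the :serializable profile is off.
def node_value(type, text, options)
  if type == "file" && !options[:serializable]
    StringIO.new(text.unpack1("m")) # "m" is Ruby's base64 unpack directive
  else
    text                            # content is left undecoded
  end
end

decoded = node_value("file", "aGVsbG8=", {})
plain   = node_value("file", "aGVsbG8=", { serializable: true })

puts decoded.string
puts plain
```

With the profile on, the caller receives the base64 text unchanged and decides whether and how to decode it.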
@@ -211,6 +215,36 @@ class Nori
|
|
|
211
215
|
|
|
212
216
|
private
|
|
213
217
|
|
|
218
|
+
# The value of the bare (un-namespaced) type attribute, or nil under
|
|
219
|
+
# the +:standards+ profile. A recognized type is consumed from the
|
|
220
|
+
# attributes because typecasting replaces it with the value it
|
|
221
|
+
# describes. An unrecognized type stays visible as an ordinary
|
|
222
|
+
# attribute. The bare attribute is a Rails +Hash.from_xml+ convention
|
|
223
|
+
# rather than XML, so the +:standards+ profile never reads it and the
|
|
224
|
+
# attribute passes through as ordinary data.
|
|
225
|
+
#
|
|
226
|
+
# @param attributes [Hash{String => String}] the element's attributes
|
|
227
|
+
# @return [String, nil] the type name, or nil when there is none to honor
|
|
228
|
+
def bare_type(attributes)
|
|
229
|
+
return nil if @options[:standards]
|
|
230
|
+
|
|
231
|
+
if self.class.available_typecasts.include?(attributes["type"])
|
|
232
|
+
attributes.delete("type")
|
|
233
|
+
else
|
|
234
|
+
attributes["type"]
|
|
235
|
+
end
|
|
236
|
+
end
|
|
237
|
+
|
|
238
|
+
# The attribute forms that declare an element nil. The prefixed form
|
|
239
|
+
# is the XML Schema Instance convention (xsi:nil). The bare +nil+ form
|
|
240
|
+
# is a Rails +Hash.from_xml+ convention, so the +:standards+ profile
|
|
241
|
+
# only accepts the prefixed form.
|
|
242
|
+
#
|
|
243
|
+
# @return [Regexp] the pattern, with the prefix in capture group 2
|
|
244
|
+
def nil_attribute_pattern
|
|
245
|
+
@options[:standards] ? /^((.+):)nil$/ : /^((.*):)?nil$/
|
|
246
|
+
end
|
|
247
|
+
|
|
214
248
|
# Decodes the base64 content of a node typed as "file" into a
|
|
215
249
|
# {StringIOFile} carrying the filename and content type attributes.
|
|
216
250
|
def file_value
|
|
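To see what `bare_type` does to the attribute hash, here is a standalone sketch of the same logic. It assumes an abbreviated list of recognized typecasts (the real list comes from `available_typecasts` and is not shown in this diff) and passes the options in explicitly instead of reading `@options`.

```ruby
# Assumed subset of the recognized typecasts, for illustration only.
AVAILABLE_TYPECASTS = %w[integer boolean date file array].freeze

# Standalone version of bare_type that takes the options as an argument.
def bare_type(attributes, options)
  return nil if options[:standards]

  if AVAILABLE_TYPECASTS.include?(attributes["type"])
    attributes.delete("type") # recognized: consumed from the attributes
  else
    attributes["type"]        # unrecognized: stays visible as an attribute
  end
end

known   = { "type" => "integer", "id" => "1" }
unknown = { "type" => "widget" }
strict  = { "type" => "integer" }

p bare_type(known, {}), known
p bare_type(unknown, {}), unknown
p bare_type(strict, { standards: true }), strict
```

Under the standards option the method returns early, so the `type` attribute is neither read nor removed and passes through as ordinary data.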
@@ -278,14 +312,40 @@ class Nori
|
|
|
278
312
|
value.is_a?(String) ? string_with_attributes(value) : value
|
|
279
313
|
end
|
|
280
314
|
|
|
281
|
-
#
|
|
282
|
-
#
|
|
315
|
+
# Combines a string +value+ with the node's attributes in the shape the
|
|
316
|
+
# active output profile calls for.
|
|
317
|
+
#
|
|
318
|
+
# By default the value is a {StringWithAttributes}: a String carrying the
|
|
319
|
+
# node's attributes on {StringWithAttributes#attributes}. Under the
|
|
320
|
+
# +:serializable+ profile the value becomes plain, directly-serializable
|
|
321
|
+
# data instead, so no custom String subclass is returned.
|
|
322
|
+
#
|
|
323
|
+
# @param value [String] the typecast text content of the node
|
|
324
|
+
# @return [StringWithAttributes, Hash{String => String}, String] the value
|
|
325
|
+
# in the configured representation
|
|
283
326
|
def string_with_attributes(value)
|
|
327
|
+
return serializable_value(value) if @options[:serializable]
|
|
328
|
+
|
|
284
329
|
string = StringWithAttributes.new(value)
|
|
285
330
|
string.attributes = attributes
|
|
286
331
|
string
|
|
287
332
|
end
|
|
288
333
|
|
|
334
|
+
# The +:serializable+ representation of a string +value+ and the node's
|
|
335
|
+
# attributes. A node with attributes maps to the XML JSON convention
|
|
336
|
+
# (+{"#text" => value}+ merged with the "@"-prefixed attributes) and a
|
|
337
|
+
# node without attributes maps to the plain String. The attribute keys go
|
|
338
|
+
# through the same prefixing and tag conversion as element-node attributes
|
|
339
|
+
# ({#prefixed_attributes}), so every node kind shares one convention.
|
|
340
|
+
#
|
|
341
|
+
# @param value [String] the typecast text content of the node
|
|
342
|
+
# @return [Hash{String => String}, String] the hash shape when the node
|
|
343
|
+
# has attributes, otherwise the plain +value+
|
|
344
|
+
def serializable_value(value)
|
|
345
|
+
return value if attributes.empty?
|
|
346
|
+
{ "#text" => value }.merge(prefixed_attributes)
|
|
347
|
+
end
|
|
348
|
+
|
|
289
349
|
def try_to_convert(value, &block)
|
|
290
350
|
block.call(value)
|
|
291
351
|
rescue ArgumentError
|
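The shape produced by `serializable_value` can be sketched outside the class. In the version below, `prefixed` is a hypothetical stand-in for `prefixed_attributes` that models only the `@` prefix (the tag conversion mentioned in the comment above is left out), and the attributes are passed as an argument rather than read from the node.

```ruby
# Hypothetical stand-in for prefixed_attributes: only the "@" prefix is
# modeled here; Nori's tag conversion is not.
def prefixed(attributes)
  attributes.to_h { |key, value| ["@#{key}", value] }
end

# Same branching as serializable_value in the diff.
def serializable_value(value, attributes)
  return value if attributes.empty?

  { "#text" => value }.merge(prefixed(attributes))
end

with_attrs    = serializable_value("Content", { "bar" => "baz" })
without_attrs = serializable_value("Content", {})

p with_attrs
p without_attrs
p Marshal.load(Marshal.dump(with_attrs)) == with_attrs
```

Since the result is either a `String` or a `Hash` of strings, it round-trips through `Marshal` without any custom class being involved.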
data/lib/nori.rb
CHANGED
|
@@ -25,6 +25,7 @@ class Nori
|
|
|
25
25
|
:convert_dashes_to_underscores => true,
|
|
26
26
|
:scrub_xml => true,
|
|
27
27
|
:standards => false,
|
|
28
|
+
:serializable => false,
|
|
28
29
|
:parser => :nokogiri
|
|
29
30
|
}
|
|
30
31
|
|
|
@@ -56,17 +57,22 @@ class Nori
|
|
|
56
57
|
#
|
|
57
58
|
# The profile groups the spec-correct behaviors under a single opt-in.
|
|
58
59
|
# It turns on the XML string-value model for empty elements
|
|
59
|
-
# (+:consistent_empty_tags+ with an empty-string +:empty_tag_value+)
|
|
60
|
-
#
|
|
61
|
-
#
|
|
62
|
-
#
|
|
63
|
-
#
|
|
60
|
+
# (+:consistent_empty_tags+ with an empty-string +:empty_tag_value+),
|
|
61
|
+
# turns off +:advanced_typecasting+ (schema-less values are text, their
|
|
62
|
+
# types are the business of a schema-aware layer) and, in the parsers,
|
|
63
|
+
# honors xml:space. These are defaults, so an explicit option passed by
|
|
64
|
+
# the caller still wins. When the profile is off the hash is empty and
|
|
65
|
+
# parsing is unchanged.
|
|
64
66
|
#
|
|
65
67
|
# @param options [Hash] the options passed to {#initialize}
|
|
66
68
|
# @return [Hash] the implied defaults, or +{}+ when the profile is off
|
|
67
69
|
def standards_defaults(options)
|
|
68
70
|
return {} unless options[:standards]
|
|
69
|
-
{
|
|
71
|
+
{
|
|
72
|
+
:consistent_empty_tags => true,
|
|
73
|
+
:empty_tag_value => "",
|
|
74
|
+
:advanced_typecasting => false
|
|
75
|
+
}
|
|
70
76
|
end
|
|
71
77
|
|
|
72
78
|
def load_parser(parser)
|
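The comment on `standards_defaults` says the implied values are defaults, so an explicit caller option still wins. A minimal sketch of one way that precedence can work is shown below. The merge order in `resolve` is an assumption, since the code that consumes `standards_defaults` is not part of this diff, and `DEFAULTS` is an abbreviated, partly assumed copy of the library defaults.

```ruby
# Abbreviated defaults; :advanced_typecasting => true is assumed from the
# changelog wording, not shown in this diff.
DEFAULTS = { advanced_typecasting: true, standards: false, serializable: false }.freeze

# Same body as the method in the diff, written with symbol-key shorthand.
def standards_defaults(options)
  return {} unless options[:standards]

  { consistent_empty_tags: true, empty_tag_value: "", advanced_typecasting: false }
end

# Assumed merge order: library defaults, then profile defaults, then caller.
def resolve(options)
  DEFAULTS.merge(standards_defaults(options)).merge(options)
end

p resolve({})[:advanced_typecasting]
p resolve({ standards: true })[:advanced_typecasting]
p resolve({ standards: true, advanced_typecasting: true })[:advanced_typecasting]
```

Merging the caller's options last is what lets `advanced_typecasting: true` override the profile, and an empty hash from `standards_defaults` leaves the result identical to the plain defaults.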