nori 2.8.0 → 2.9.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: ac639b2fdf06325300d7a65de3e8de963d6b6deed94eba8a5c5560ec99e29e28
- data.tar.gz: 5837f3e91314f284171ac6b97ebd21d8b6d2bcd769aea32c854c4818523a15da
+ metadata.gz: d2ffd841cec28588977a4cf413ade2a33ceba13466046a1299e27016231db0ef
+ data.tar.gz: 51edbdf4ab6413ca7583bb02e05a88ad13fac97741797cf5dde6135edd99d5f8
  SHA512:
- metadata.gz: b315e0f90e75a5d225d83b0248b7d5e554ad1eab8e06b37e7cd25c5dacce146bfff6ef3e40e7ff37d4e07eb142b255b6d7a026512cd1f0d94622dcc6669d0ff0
- data.tar.gz: 5fa332d507687ea25c1f2517944aff167916275f149ed590d097bc04aec39b3cf57532eeab5cee37167df4c69e68267908f80ef3c223dd84bf3c1baf881e3815
+ metadata.gz: 1700483ec9b91f559ab0fa138c5760fe237f7bf92f5b927bbde60dc437175af5e8c8cc12998195bd875ea9272f62d25727e3fe85a5954918cc0ea31aa5a413a3
+ data.tar.gz: 8713852dfe40adfe96ae00debff569d58f6d8c394fd48fe8df7ad34a767e886605df4163d15cd8573922926ad7abb9f3af1135ddd7180301172dfef5b6b718c6
data/CHANGELOG.md CHANGED
@@ -5,6 +5,16 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+ ## [2.9.0] - 2026-07-05
+
+ ### Changed
+
+ * The `:standards` profile no longer applies schema-less typing. Under `Nori.new(standards: true)`, `:advanced_typecasting` now defaults to `false` (an explicit `advanced_typecasting: true` still wins), and the bare un-namespaced `type=` and `nil=` attributes become ordinary attributes instead of casting instructions. No more `type=` conversions (`integer`, `date`, `decimal`, `array`, `file`, ...), and only a prefixed `xsi:nil="true"` marks an element nil. These conventions come from Rails' `Hash.from_xml` (inherited via crack and merb), not from any XML spec. Without a schema, character data is just text. Knowing that `<id>123</id>` holds an integer is the business of a schema-aware layer, not a guess from string shape. Parsing without the profile is completely unchanged.
+
+ ### Added
+
+ * Add the `:serializable` profile option. `Nori.new(serializable: true)` returns plain, directly-serializable data with no custom value classes. A text node that also has attributes becomes a Hash in the XML JSON convention (`{"#text" => content}` merged with our existing `@`-prefixed attributes) instead of a `Nori::StringWithAttributes`, and a text node without attributes becomes a plain `String`. A `type="file"` node is no longer decoded into a `Nori::StringIOFile`; it folds into text and attributes like any other node, leaving the base64 content untouched for the caller to decode. That `type=` decoding is a Rails/ActiveSupport `Hash.from_xml` convention Nori inherited via crack and merb, not part of any XML spec, which is why the plain-data profile drops it. With this, values survive `to_json`, `to_yaml`, and `Marshal`. Opt-in on the 2.x line, probably the default in Nori 3.0. Reported by @ArnoldMEDLINQ ([#107](https://github.com/savonrb/nori/issues/107)), thanks to @dub357 for pinpointing the cause and finding [#106](https://github.com/savonrb/nori/pull/106), and thanks to @ekzobrain whose hash-shape approach this builds on.
+
  ## [2.8.0] - 2026-07-04
 
  ### Added
@@ -255,4 +265,5 @@ Please make sure to read the updated README for how to use the new version.
  ## 0.1.0 2009-03-28
  * Initial release.
 
- [2.8.0]: https://github.com/savonrb/nori/compare/v2.7.1...2.8.0
+ [2.9.0]: https://github.com/savonrb/nori/compare/v2.8.0...v2.9.0
+ [2.8.0]: https://github.com/savonrb/nori/compare/v2.7.1...v2.8.0
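The 2.9.0 changelog entry says values now survive `to_json` under the `:serializable` profile. The following is a minimal sketch of the underlying problem, using only Ruby's standard `json` library. `AttributedString` is a hypothetical stand-in for `Nori::StringWithAttributes` (a String subclass with an `attributes` accessor), not the gem's class.

```ruby
require "json"

# Hypothetical stand-in for Nori::StringWithAttributes: a String subclass
# that keeps the XML attributes on a plain accessor.
class AttributedString < String
  attr_accessor :attributes
end

legacy = AttributedString.new("Content")
legacy.attributes = { "bar" => "baz" }

# The generator treats the value as an ordinary string, so the accessor
# is never consulted and the attribute is dropped.
legacy_json = JSON.generate({ "foo" => legacy })
# => {"foo":"Content"}

# The hash shape described in the changelog keeps the attribute as data.
plain = { "foo" => { "#text" => "Content", "@bar" => "baz" } }
plain_json = JSON.generate(plain)

# Parsing the JSON back yields the same structure.
round_trip = JSON.parse(plain_json)
```

In this sketch the attribute only survives the round trip in the hash shape, which is the motivation the changelog gives for the new profile.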
data/README.md CHANGED
@@ -2,8 +2,8 @@ Nori
  ====
 
  [![CI](https://github.com/savonrb/nori/actions/workflows/test.yml/badge.svg)](https://github.com/savonrb/nori/actions/workflows/test.yml)
- [![Gem Version](https://badge.fury.io/rb/nori.svg)](http://badge.fury.io/rb/nori)
- [![Code Climate](https://codeclimate.com/github/savonrb/nori.svg)](https://codeclimate.com/github/savonrb/nori)
+ [![Gem Version](https://img.shields.io/gem/v/nori.svg)](https://rubygems.org/gems/nori)
+ [![Coverage Status](https://coveralls.io/repos/github/savonrb/nori/badge.svg?branch=main)](https://coveralls.io/github/savonrb/nori?branch=main)
 
  Really simple XML parsing ripped from Crack, which ripped it from Merb.
 
@@ -50,6 +50,11 @@ result["foo"].attributes
  # => {"bar"=>"baz"}
  ```
 
+ Because `StringWithAttributes` hides the attributes on an accessor, they are
+ lost on a plain `to_json` and callers have to type-check each value. The
+ [`serializable`](#serializable) profile returns a plain Hash instead and will
+ probably be the default in 3.0.
+
  ## advanced_typecasting
 
  Nori can automatically convert string values to `TrueClass`, `FalseClass`, `Time`, `Date`, and `DateTime`:
@@ -183,3 +188,56 @@ Members of the profile:
  Nori.new(:standards => true).parse('<foo bar="baz"/>')
  # => {"foo"=>""}
  ```
+
+ - **no schema-less typing.** Without a schema, character data is just text, so
+ the profile implies `advanced_typecasting: false` (overridable) and ignores
+ the bare `type=` and `nil=` attribute conventions that Nori inherited from
+ Rails' `Hash.from_xml`. A `type="integer"` no longer casts, a `type="array"`
+ no longer folds children into an Array, and both stay visible as ordinary
+ attributes. Only the prefixed `xsi:nil="true"` still marks an element nil,
+ because that convention comes from XML Schema Instance itself.
+
+ ```ruby
+ Nori.new(:standards => true).parse('<id type="integer">123</id>')
+ # => {"id"=>"123"} (with "type" accessible via #attributes)
+ ```
+
+ ## serializable
+
+ `serializable` is a profile that makes Nori return plain, directly-serializable
+ data with no custom value classes. It is opt-in on the 2.x line and will
+ probably become the default in 3.0.
+
+ ```ruby
+ Nori.new(:serializable => true)
+ ```
+
+ Members of the profile:
+
+ - **text nodes with attributes become a Hash.** A tag with both text content
+ and attributes maps to the XML JSON convention (`{"#text" => content}` merged
+ with the `@`-prefixed attributes) instead of a `Nori::StringWithAttributes`.
+ This is the same `@`-keyed shape element nodes already use, so the attributes
+ survive `to_json`, `to_yaml`, and `Marshal`. A text node without attributes
+ stays a plain `String`.
+
+ ```ruby
+ Nori.new(:serializable => true).parse('<foo bar="baz">Content</foo>')
+ # => {"foo"=>{"#text"=>"Content", "@bar"=>"baz"}}
+
+ Nori.new(:serializable => true).parse('<foo>Content</foo>')
+ # => {"foo"=>"Content"}
+ ```
+
+ - **`type="file"` nodes are not decoded.** Without the profile, a node carrying
+ `type="file"` is base64-decoded into a `Nori::StringIOFile` with its filename
+ and content type. The profile skips that. The node folds into text and
+ attributes like any other, leaving the base64 content as a plain `String` for
+ the caller to decode. This `type=` decoding is a Rails/ActiveSupport
+ `Hash.from_xml` convention Nori inherited through crack and merb, not part of
+ any XML specification, so the plain-data profile leaves it behind.
+
+ ```ruby
+ Nori.new(:serializable => true).parse('<doc type="file" name="x.pdf">aGVsbG8=</doc>')
+ # => {"doc"=>{"#text"=>"aGVsbG8=", "@type"=>"file", "@name"=>"x.pdf"}}
+ ```
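The README additions above describe how a text node is shaped under the `serializable` profile. As a rough, standalone illustration of that shaping rule (it does not call Nori), the sketch below uses a hypothetical helper `shape_text_node`. It only prefixes attribute keys with `@` and omits the tag conversion the gem applies to attribute names.

```ruby
# Hypothetical helper, not part of Nori: folds a text node's content and
# attributes into the plain-data shape the README describes.
def shape_text_node(text, attributes)
  # A text node without attributes stays a plain String.
  return text if attributes.empty?

  # Attribute keys get the "@" prefix; the text goes under "#text".
  prefixed = attributes.each_with_object({}) do |(key, value), result|
    result["@#{key}"] = value
  end
  { "#text" => text }.merge(prefixed)
end

with_attributes = shape_text_node("Content", { "bar" => "baz" })
# => {"#text"=>"Content", "@bar"=>"baz"}

without_attributes = shape_text_node("Content", {})
# => "Content"

# A type="file" node is shaped like any other; the base64 text is untouched.
file_node = shape_text_node("aGVsbG8=", { "type" => "file", "name" => "x.pdf" })

# Decoding is left to the caller, here with String#unpack1 and the
# base64 directive "m" from Ruby core.
decoded = file_node["#text"].unpack1("m")
# => "hello"
```

Since the result contains only Hash and String values, a caller can pass it to any serializer without type-checking each value first.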
data/lib/nori/version.rb CHANGED
@@ -1,3 +1,3 @@
  class Nori
- VERSION = '2.8.0'
+ VERSION = '2.9.0'
  end
data/lib/nori/xml_utility_node.rb CHANGED
@@ -92,12 +92,11 @@ class Nori
  attributes = Hash[*intermediate]
  end
 
- # leave the type alone if we don't know what it is
- @type = self.class.available_typecasts.include?(attributes["type"]) ? attributes.delete("type") : attributes["type"]
+ @type = bare_type(attributes)
 
  @nil_element = false
  attributes.keys.each do |key|
- if result = /^((.*):)?nil$/.match(key)
+ if result = nil_attribute_pattern.match(key)
  @nil_element = attributes.delete(key) == "true"
  attributes.delete("xmlns:#{result[2]}") if result[1]
  end
@@ -130,12 +129,17 @@ class Nori
  # Converts the node into a hash with the node name as its single key.
  #
  # The value depends on the shape of the node. A node typed as "file"
- # becomes a {StringIOFile}. A node with text content becomes a typecast
- # scalar. Every other node folds its children into an array or a hash.
+ # becomes a {StringIOFile}, unless the +:serializable+ profile is enabled.
+ # That profile returns plain data only, so a file node folds into text and
+ # attributes like any other node and the base64 content is left undecoded.
+ # A node with text content becomes a typecast scalar. Every other node
+ # folds its children into an array or a hash. Under the +:standards+
+ # profile no bare type attribute is honored ({#bare_type}), so file
+ # decoding, array folding and typecasting never happen there.
  #
  # @return [Hash{String => Object}] the node name mapped to its value
  def to_hash
- return { name => file_value } if @type == "file"
+ return { name => file_value } if @type == "file" && !@options[:serializable]
  return { name => text_value } if @text
 
  groups = group_children
@@ -211,6 +215,36 @@ class Nori
 
  private
 
+ # The value of the bare (un-namespaced) type attribute, or nil under
+ # the +:standards+ profile. A recognized type is consumed from the
+ # attributes because typecasting replaces it with the value it
+ # describes. An unrecognized type stays visible as an ordinary
+ # attribute. The bare attribute is a Rails +Hash.from_xml+ convention
+ # rather than XML, so the +:standards+ profile never reads it and the
+ # attribute passes through as ordinary data.
+ #
+ # @param attributes [Hash{String => String}] the element's attributes
+ # @return [String, nil] the type name, or nil when there is none to honor
+ def bare_type(attributes)
+ return nil if @options[:standards]
+
+ if self.class.available_typecasts.include?(attributes["type"])
+ attributes.delete("type")
+ else
+ attributes["type"]
+ end
+ end
+
+ # The attribute forms that declare an element nil. The prefixed form
+ # is the XML Schema Instance convention (xsi:nil). The bare +nil+ form
+ # is a Rails +Hash.from_xml+ convention, so the +:standards+ profile
+ # only accepts the prefixed form.
+ #
+ # @return [Regexp] the pattern, with the prefix in capture group 2
+ def nil_attribute_pattern
+ @options[:standards] ? /^((.+):)nil$/ : /^((.*):)?nil$/
+ end
+
  # Decodes the base64 content of a node typed as "file" into a
  # {StringIOFile} carrying the filename and content type attributes.
  def file_value
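The two helpers added in this hunk, `bare_type` and `nil_attribute_pattern`, can be exercised outside the gem. The sketch below copies both regular expressions verbatim from the diff and reimplements `bare_type` as a free function. `KNOWN_TYPECASTS` is an illustrative subset taken from the type names in the changelog and stands in for `available_typecasts`. The `standards:` keyword and the helper `nil_attribute?` are assumptions of this sketch, not the gem's interface.

```ruby
# Copied from nil_attribute_pattern in the diff.
LEGACY_NIL    = /^((.*):)?nil$/ # bare nil= or any prefixed form
STANDARDS_NIL = /^((.+):)nil$/  # prefixed form only, prefix must be non-empty

# Illustrative subset; stands in for the class-level available_typecasts.
KNOWN_TYPECASTS = %w[integer date decimal array file].freeze

# Hypothetical helper: does this attribute name declare an element nil?
def nil_attribute?(name, standards:)
  pattern = standards ? STANDARDS_NIL : LEGACY_NIL
  !pattern.match(name).nil?
end

# Free-function version of bare_type. Mutates +attributes+ like the original:
# a recognized type is consumed, an unrecognized one stays in place.
def bare_type(attributes, standards:)
  return nil if standards

  if KNOWN_TYPECASTS.include?(attributes["type"])
    attributes.delete("type")
  else
    attributes["type"]
  end
end

nil_attribute?("nil", standards: false)     # => true
nil_attribute?("nil", standards: true)      # => false
nil_attribute?("xsi:nil", standards: true)  # => true
nil_attribute?(":nil", standards: true)     # => false
STANDARDS_NIL.match("xsi:nil")[2]           # => "xsi"

legacy_attrs = { "type" => "integer", "id" => "7" }
legacy_type  = bare_type(legacy_attrs, standards: false)
# legacy_type => "integer", and "type" is removed from legacy_attrs

standards_attrs = { "type" => "integer", "id" => "7" }
standards_type  = bare_type(standards_attrs, standards: true)
# standards_type => nil, and standards_attrs is left untouched
```

The only differences between the two patterns are the dropped `?` and the change from `.*` to `.+`, which together require a non-empty prefix before `:nil`.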
@@ -278,14 +312,40 @@ class Nori
  value.is_a?(String) ? string_with_attributes(value) : value
  end
 
- # Wraps a string value so the node's attributes stay accessible
- # through {StringWithAttributes#attributes}.
+ # Combines a string +value+ with the node's attributes in the shape the
+ # active output profile calls for.
+ #
+ # By default the value is a {StringWithAttributes}: a String carrying the
+ # node's attributes on {StringWithAttributes#attributes}. Under the
+ # +:serializable+ profile the value becomes plain, directly-serializable
+ # data instead, so no custom String subclass is returned.
+ #
+ # @param value [String] the typecast text content of the node
+ # @return [StringWithAttributes, Hash{String => String}, String] the value
+ # in the configured representation
  def string_with_attributes(value)
+ return serializable_value(value) if @options[:serializable]
+
  string = StringWithAttributes.new(value)
  string.attributes = attributes
  string
  end
 
+ # The +:serializable+ representation of a string +value+ and the node's
+ # attributes. A node with attributes maps to the XML JSON convention
+ # (+{"#text" => value}+ merged with the "@"-prefixed attributes) and a
+ # node without attributes maps to the plain String. The attribute keys go
+ # through the same prefixing and tag conversion as element-node attributes
+ # ({#prefixed_attributes}), so every node kind shares one convention.
+ #
+ # @param value [String] the typecast text content of the node
+ # @return [Hash{String => String}, String] the hash shape when the node
+ # has attributes, otherwise the plain +value+
+ def serializable_value(value)
+ return value if attributes.empty?
+ { "#text" => value }.merge(prefixed_attributes)
+ end
+
  def try_to_convert(value, &block)
  block.call(value)
  rescue ArgumentError
data/lib/nori.rb CHANGED
@@ -25,6 +25,7 @@ class Nori
  :convert_dashes_to_underscores => true,
  :scrub_xml => true,
  :standards => false,
+ :serializable => false,
  :parser => :nokogiri
  }
 
@@ -56,17 +57,22 @@ class Nori
  #
  # The profile groups the spec-correct behaviors under a single opt-in.
  # It turns on the XML string-value model for empty elements
- # (+:consistent_empty_tags+ with an empty-string +:empty_tag_value+) and,
- # in the parsers, xml:space honoring. These are defaults, so an explicit
- # +:consistent_empty_tags+ or +:empty_tag_value+ passed by the caller
- # still wins. When the profile is off the hash is empty and parsing is
- # unchanged.
+ # (+:consistent_empty_tags+ with an empty-string +:empty_tag_value+),
+ # turns off +:advanced_typecasting+ (schema-less values are text, their
+ # types are the business of a schema-aware layer) and, in the parsers,
+ # honors xml:space. These are defaults, so an explicit option passed by
+ # the caller still wins. When the profile is off the hash is empty and
+ # parsing is unchanged.
  #
  # @param options [Hash] the options passed to {#initialize}
  # @return [Hash] the implied defaults, or +{}+ when the profile is off
  def standards_defaults(options)
  return {} unless options[:standards]
- { :consistent_empty_tags => true, :empty_tag_value => "" }
+ {
+ :consistent_empty_tags => true,
+ :empty_tag_value => "",
+ :advanced_typecasting => false
+ }
  end
 
  def load_parser(parser)
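The doc comment on `standards_defaults` says the profile values are defaults and that an explicit option from the caller still wins. The merge itself is not part of this diff, so the sketch below assumes the obvious layering: base defaults, then profile defaults, then the caller's options. `BASE_DEFAULTS` is a trimmed, illustrative hash and `effective_options` is a hypothetical name.

```ruby
# Trimmed, illustrative base defaults (assumption: typecasting is on
# unless the profile or the caller turns it off).
BASE_DEFAULTS = { advanced_typecasting: true, standards: false }.freeze

# Same body as the method in the diff, written as a free function.
def standards_defaults(options)
  return {} unless options[:standards]

  {
    consistent_empty_tags: true,
    empty_tag_value: "",
    advanced_typecasting: false
  }
end

# Assumed layering: later merges override earlier ones, so the caller's
# explicit options take precedence over the profile defaults.
def effective_options(options = {})
  BASE_DEFAULTS.merge(standards_defaults(options)).merge(options)
end

off      = effective_options({})
profile  = effective_options({ standards: true })
override = effective_options({ standards: true, advanced_typecasting: true })

off[:advanced_typecasting]      # => true
profile[:advanced_typecasting]  # => false
override[:advanced_typecasting] # => true
```

Under this assumed ordering, `Nori.new(standards: true, advanced_typecasting: true)` would keep typecasting on, matching the changelog's note that an explicit `advanced_typecasting: true` still wins.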
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: nori
  version: !ruby/object:Gem::Version
- version: 2.8.0
+ version: 2.9.0
  platform: ruby
  authors:
  - Daniel Harrington