rexml 3.2.8 → 3.3.6

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: cef1cb55ed6a23bce0af7d46fce791961558ee885c57f1089207deb53e18ff95
-   data.tar.gz: 4fbe36d9e4422d266c2ffee71fab703e6408e9a2b5533ca0ae49411483f98368
+   metadata.gz: 4b79c22060286dad847e18d30b4b336bda21d2772ccb35413fb9ba51a0012ed2
+   data.tar.gz: feb56a4a3071541e983acd33b8baa6b9052f8d67d871102cfe6e69773a0cfcfe
  SHA512:
-   metadata.gz: 60da703f26ff7eaa6cf537634ad68e4c3246605c26c54ea96147fae59bb6c17eeab5c234e5e67301e342472ce8dd3cb671321168cddc212839b9347c8de62f91
-   data.tar.gz: 4d6eeaba460a22bcdc3512ac53504fe87f829cb783069fbebb989fb8f1585fbdfe9ed357c8fbda894fb68777d04f871b9073aa6d13c31eccc6b3b22db10c2ce1
+   metadata.gz: b615c95f8624212e151443ad03ba9b64f39aee8a200ea212150a10116340157cfda1bf974ab3d03161c0fb37d866e8c1c69ccc6a9549a13398452b32166af2d8
+   data.tar.gz: db7dcac658e1f51f30575c24d6f36dc256349331fa1951c8fdfaf214baf97a5a446a1fcc411358a76d2c6fc36388ec8b1178adeacc3225d16d5d95ac53a8c4b3
data/NEWS.md CHANGED
@@ -1,5 +1,246 @@
  # News
  
+ ## 3.3.6 - 2024-08-22 {#version-3-3-6}
+
+ ### Improvements
+
+ * Removed duplicated entity expansions for performance.
+   * GH-194
+   * Patch by Viktor Ivarsson.
+
+ * Improved namespace conflicted attribute check performance. It was
+   too slow for deep elements.
+   * Reported by l33thaxor.
+
+ ### Fixes
+
+ * Fixed a bug that default entity expansions are counted for
+   security check. Default entity expansions should not be counted
+   because they don't have a security risk.
+   * GH-198
+   * GH-199
+   * Patch by Viktor Ivarsson.
+
+ * Fixed a parser bug that parameter entity references in internal
+   subsets are expanded. It's not allowed in the XML specification.
+   * GH-191
+   * Patch by NAITOH Jun.
+
+ * Fixed a stream parser bug that user-defined entity references in
+   text aren't expanded.
+   * GH-200
+   * Patch by NAITOH Jun.
+
+ ### Thanks
+
+ * Viktor Ivarsson
+
+ * NAITOH Jun
+
+ * l33thaxor
+
+ ## 3.3.5 - 2024-08-12 {#version-3-3-5}
+
+ ### Fixes
+
+ * Fixed a bug that `REXML::Security.entity_expansion_text_limit`
+   check has wrong text size calculation in SAX and pull parsers.
+   * GH-193
+   * GH-195
+   * Reported by Viktor Ivarsson.
+   * Patch by NAITOH Jun.
+
+ ### Thanks
+
+ * Viktor Ivarsson
+
+ * NAITOH Jun
+
+ ## 3.3.4 - 2024-08-01 {#version-3-3-4}
+
+ ### Fixes
+
+ * Fixed a bug that `REXML::Security` isn't defined when
+   `REXML::Parsers::StreamParser` is used and
+   `rexml/parsers/streamparser` is only required.
+   * GH-189
+   * Patch by takuya kodama.
+
+ ### Thanks
+
+ * takuya kodama
+
+ ## 3.3.3 - 2024-08-01 {#version-3-3-3}
+
+ ### Improvements
+
+ * Added support for detecting invalid XML that has unsupported
+   content before root element.
+   * GH-184
+   * Patch by NAITOH Jun.
+
+ * Added support for `REXML::Security.entity_expansion_limit=` and
+   `REXML::Security.entity_expansion_text_limit=` in SAX2 and pull
+   parsers.
+   * GH-187
+   * Patch by NAITOH Jun.
+
+ * Added more tests for invalid XMLs.
+   * GH-183
+   * Patch by Watson.
+
+ * Added more performance tests.
+   * Patch by Watson.
+
+ * Improved parse performance.
+   * GH-186
+   * Patch by tomoya ishida.
+
+ ### Thanks
+
+ * NAITOH Jun
+
+ * Watson
+
+ * tomoya ishida
+
+ ## 3.3.2 - 2024-07-16 {#version-3-3-2}
+
+ ### Improvements
+
+ * Improved parse performance.
+   * GH-160
+   * Patch by NAITOH Jun.
+
+ * Improved parse performance.
+   * GH-169
+   * GH-170
+   * GH-171
+   * GH-172
+   * GH-173
+   * GH-174
+   * GH-175
+   * GH-176
+   * GH-177
+   * Patch by Watson.
+
+ * Added support for raising a parse exception when an XML has extra
+   content after the root element.
+   * GH-161
+   * Patch by NAITOH Jun.
+
+ * Added support for raising a parse exception when an XML
+   declaration exists in wrong position.
+   * GH-162
+   * Patch by NAITOH Jun.
+
+ * Removed a needless space after XML declaration in pretty print mode.
+   * GH-164
+   * Patch by NAITOH Jun.
+
+ * Stopped emitting `:text` event after the root element.
+   * GH-167
+   * Patch by NAITOH Jun.
+
+ ### Fixes
+
+ * Fixed a bug that SAX2 parser doesn't expand predefined entities for
+   `characters` callback.
+   * GH-168
+   * Patch by NAITOH Jun.
+
+ ### Thanks
+
+ * NAITOH Jun
+
+ * Watson
+
+ ## 3.3.1 - 2024-06-25 {#version-3-3-1}
+
+ ### Improvements
+
+ * Added support for detecting malformed top-level comments.
+   * GH-145
+   * Patch by Hiroya Fujinami.
+
+ * Improved `REXML::Element#attribute` performance.
+   * GH-146
+   * Patch by Hiroya Fujinami.
+
+ * Added support for detecting malformed `<!-->` comments.
+   * GH-147
+   * Patch by Hiroya Fujinami.
+
+ * Added support for detecting unclosed `DOCTYPE`.
+   * GH-152
+   * Patch by Hiroya Fujinami.
+
+ * Added `changelog_uri` metadata to gemspec.
+   * GH-156
+   * Patch by fynsta.
+
+ * Improved parse performance.
+   * GH-157
+   * GH-158
+   * Patch by NAITOH Jun.
+
+ ### Fixes
+
+ * Fixed a bug that large XML can't be parsed.
+   * GH-154
+   * Patch by NAITOH Jun.
+
+ * Fixed a bug that private constants are visible.
+   * GH-155
+   * Patch by NAITOH Jun.
+
+ ### Thanks
+
+ * Hiroya Fujinami
+
+ * NAITOH Jun
+
+ * fynsta
+
+ ## 3.3.0 - 2024-06-11 {#version-3-3-0}
+
+ ### Improvements
+
+ * Added support for strscan 0.7.0 installed with Ruby 2.6.
+   * GH-142
+   * Reported by Fernando Trigoso.
+
+ ### Thanks
+
+ * Fernando Trigoso
+
+ ## 3.2.9 - 2024-06-09 {#version-3-2-9}
+
+ ### Improvements
+
+ * Added support for old strscan.
+   * GH-132
+   * Reported by Adam.
+
+ * Improved attribute value parse performance.
+   * GH-135
+   * Patch by NAITOH Jun.
+
+ * Improved `REXML::Node#each_recursive` performance.
+   * GH-134
+   * GH-139
+   * Patch by Hiroya Fujinami.
+
+ * Improved text parse performance.
+   * Reported by mprogrammer.
+
+ ### Thanks
+
+ * Adam
+ * NAITOH Jun
+ * Hiroya Fujinami
+ * mprogrammer
+
  ## 3.2.8 - 2024-05-16 {#version-3-2-8}
  
  ### Fixes
@@ -30,7 +271,7 @@
  
  * Improved parse performance when an attribute has many `<`s.
  
-   * GH-124
+   * GH-126
  
  ### Fixes
  
@@ -65,7 +306,6 @@
  * jcavalieri
  * DuKewu
  
-
  ## 3.2.6 - 2023-07-27 {#version-3-2-6}
  
  ### Improvements
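
The 3.3.3 and 3.3.5 entries above refer to `REXML::Security.entity_expansion_limit=` and `REXML::Security.entity_expansion_text_limit=`. The following is a minimal sketch of setting both before parsing a document that declares its own entity. The limit values and the sample XML are illustrative assumptions, not values recommended by the changelog.

```ruby
require "rexml/document"

# Illustrative limits only (assumed for this sketch, not upstream defaults).
REXML::Security.entity_expansion_limit = 100       # max number of entity expansions
REXML::Security.entity_expansion_text_limit = 1024 # max size of expanded text

xml = <<~XML
  <!DOCTYPE root [
    <!ENTITY greeting "hello">
  ]>
  <root>&greeting; &amp; goodbye</root>
XML

doc = REXML::Document.new(xml)

# Element#text returns the text with entity references expanded:
# &greeting; is user-defined, &amp; is one of the predefined entities.
text = doc.root.text
puts text
```

Both setters are process-wide, so an application would normally set them once at startup rather than around each parse.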
data/lib/rexml/element.rb CHANGED
@@ -7,14 +7,6 @@ require_relative "xpath"
  require_relative "parseexception"
  
  module REXML
-   # An implementation note about namespaces:
-   # As we parse, when we find namespaces we put them in a hash and assign
-   # them a unique ID. We then convert the namespace prefix for the node
-   # to the unique ID. This makes namespace lookup much faster for the
-   # cost of extra memory use. We save the namespace prefix for the
-   # context node and convert it back when we write it.
-   @@namespaces = {}
-
    # An \REXML::Element object represents an XML element.
    #
    # An element:
@@ -449,9 +441,14 @@ module REXML
      # Related: #root_node, #document.
      #
      def root
-       return elements[1] if self.kind_of? Document
-       return self if parent.kind_of? Document or parent.nil?
-       return parent.root
+       target = self
+       while target
+         return target.elements[1] if target.kind_of? Document
+         parent = target.parent
+         return target if parent.kind_of? Document or parent.nil?
+         target = parent
+       end
+       nil
      end
  
      # :call-seq:
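
The hunk above replaces the recursive `parent.root` call with a loop over the parent chain, which avoids one stack frame per nesting level. The observable behaviour of `REXML::Element#root` should be unchanged. A small usage sketch, with an invented sample document:

```ruby
require "rexml/document"

doc = REXML::Document.new("<a><b><c/></b></a>")
c = doc.root.elements[1].elements[1] # the nested <c/> element

# From any depth, #root walks up the parent chain to the root element.
root_of_c = c.root
puts root_of_c.name

# An element that has no parent is its own root.
orphan = REXML::Element.new("x")
orphan_root = orphan.root
puts orphan_root.name
```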
@@ -627,8 +624,12 @@ module REXML
        else
          prefix = "xmlns:#{prefix}" unless prefix[0,5] == 'xmlns'
        end
-       ns = attributes[ prefix ]
-       ns = parent.namespace(prefix) if ns.nil? and parent
+       ns = nil
+       target = self
+       while ns.nil? and target
+         ns = target.attributes[prefix]
+         target = target.parent
+       end
        ns = '' if ns.nil? and prefix == 'xmlns'
        return ns
      end
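
This hunk makes the same recursion-to-loop change for `REXML::Element#namespace`: the lookup starts at the element itself and moves to each ancestor until a matching `xmlns` declaration is found. A brief sketch of the resulting lookups; the document and the `urn:example:p` URI are made up for illustration:

```ruby
require "rexml/document"

doc = REXML::Document.new('<a xmlns:p="urn:example:p"><b><c/></b></a>')
c = doc.root.elements[1].elements[1] # <c/>, two levels below the declaration

# The prefix "p" is declared on the ancestor <a>, not on <c/> itself.
inherited = c.namespace("p")

# A prefix that is declared nowhere in the ancestor chain yields nil.
missing = c.namespace("q")

# With no default namespace declared anywhere, the result is an empty string.
default_ns = c.namespace

puts inherited
```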
@@ -1284,16 +1285,11 @@ module REXML
      #   document.root.attribute("x", "a") # => a:x='a:x'
      #
      def attribute( name, namespace=nil )
-       prefix = nil
-       if namespaces.respond_to? :key
-         prefix = namespaces.key(namespace) if namespace
-       else
-         prefix = namespaces.index(namespace) if namespace
-       end
+       prefix = namespaces.key(namespace) if namespace
        prefix = nil if prefix == 'xmlns'
  
        ret_val =
-         attributes.get_attribute( "#{prefix ? prefix + ':' : ''}#{name}" )
+         attributes.get_attribute( prefix ? "#{prefix}:#{name}" : name )
  
        return ret_val unless ret_val.nil?
        return nil if prefix.nil?
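
The context line at the top of this hunk comes from the method's own documentation example. Expanded into a runnable sketch, it shows how `REXML::Element#attribute` maps a namespace URI back to its prefix before looking the attribute up:

```ruby
require "rexml/document"

document = REXML::Document.new('<root xmlns:a="a" a:x="a:x" x="x"/>')
root = document.root

# Without a namespace argument, the unprefixed attribute is returned.
plain = root.attribute("x")

# With a namespace URI, the URI is mapped to its prefix ("a") and
# the attribute "a:x" is looked up instead.
namespaced = root.attribute("x", "a")

# A name that is not present returns nil.
absent = root.attribute("nosuch")

puts plain.value
puts namespaced.value
```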
@@ -2388,17 +2384,6 @@ module REXML
        elsif old_attr.kind_of? Hash
          old_attr[value.prefix] = value
        elsif old_attr.prefix != value.prefix
-         # Check for conflicting namespaces
-         if value.prefix != "xmlns" and old_attr.prefix != "xmlns"
-           old_namespace = old_attr.namespace
-           new_namespace = value.namespace
-           if old_namespace == new_namespace
-             raise ParseException.new(
-                     "Namespace conflict in adding attribute \"#{value.name}\": "+
-                     "Prefix \"#{old_attr.prefix}\" = \"#{old_namespace}\" and "+
-                     "prefix \"#{value.prefix}\" = \"#{new_namespace}\"")
-           end
-         end
          store value.name, {old_attr.prefix => old_attr,
                             value.prefix => value}
        else
data/lib/rexml/entity.rb CHANGED
@@ -12,6 +12,7 @@ module REXML
      EXTERNALID = "(?:(?:(SYSTEM)\\s+#{SYSTEMLITERAL})|(?:(PUBLIC)\\s+#{PUBIDLITERAL}\\s+#{SYSTEMLITERAL}))"
      NDATADECL = "\\s+NDATA\\s+#{NAME}"
      PEREFERENCE = "%#{NAME};"
+     PEREFERENCE_RE = /#{PEREFERENCE}/um
      ENTITYVALUE = %Q{((?:"(?:[^%&"]|#{PEREFERENCE}|#{REFERENCE})*")|(?:'([^%&']|#{PEREFERENCE}|#{REFERENCE})*'))}
      PEDEF = "(?:#{ENTITYVALUE}|#{EXTERNALID})"
      ENTITYDEF = "(?:#{ENTITYVALUE}|(?:#{EXTERNALID}(#{NDATADECL})?))"
@@ -19,7 +20,7 @@ module REXML
      GEDECL = "<!ENTITY\\s+#{NAME}\\s+#{ENTITYDEF}\\s*>"
      ENTITYDECL = /\s*(?:#{GEDECL})|(?:#{PEDECL})/um
  
-     attr_reader :name, :external, :ref, :ndata, :pubid
+     attr_reader :name, :external, :ref, :ndata, :pubid, :value
  
      # Create a new entity. Simple entities can be constructed by passing a
      # name, value to the constructor; this creates a generic, plain entity
@@ -68,14 +69,11 @@ module REXML
      end
  
      # Evaluates to the unnormalized value of this entity; that is, replacing
-     # all entities -- both %ent; and &ent; entities. This differs from
-     # +value()+ in that +value+ only replaces %ent; entities.
+     # &ent; entities.
      def unnormalized
        document.record_entity_expansion unless document.nil?
-       v = value()
-       return nil if v.nil?
-       @unnormalized = Text::unnormalize(v, parent)
-       @unnormalized
+       return nil if @value.nil?
+       @unnormalized = Text::unnormalize(@value, parent)
      end
  
      #once :unnormalized
@@ -121,46 +119,6 @@ module REXML
        write rv
        rv
      end
-
-     PEREFERENCE_RE = /#{PEREFERENCE}/um
-     # Returns the value of this entity. At the moment, only internal entities
-     # are processed. If the value contains internal references (IE,
-     # %blah;), those are replaced with their values. IE, if the doctype
-     # contains:
-     # <!ENTITY % foo "bar">
-     # <!ENTITY yada "nanoo %foo; nanoo>
-     # then:
-     # doctype.entity('yada').value #-> "nanoo bar nanoo"
-     def value
-       @resolved_value ||= resolve_value
-     end
-
-     def parent=(other)
-       @resolved_value = nil
-       super
-     end
-
-     private
-     def resolve_value
-       return nil if @value.nil?
-       return @value unless @value.match?(PEREFERENCE_RE)
-
-       matches = @value.scan(PEREFERENCE_RE)
-       rv = @value.clone
-       if @parent
-         sum = 0
-         matches.each do |entity_reference|
-           entity_value = @parent.entity( entity_reference[0] )
-           if sum + entity_value.bytesize > Security.entity_expansion_text_limit
-             raise "entity expansion has grown too large"
-           else
-             sum += entity_value.bytesize
-           end
-           rv.gsub!( /%#{entity_reference.join};/um, entity_value )
-         end
-       end
-       rv
-     end
    end
  
    # This is a set of entity constants -- the ones defined in the XML
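
After these hunks, `REXML::Entity#value` is a plain reader for the declared replacement text, while `#unnormalized` still expands `&ent;` references through the parent DOCTYPE. A short sketch of the difference, using an invented DOCTYPE with two general entities. Parameter entities are deliberately left out, since their handling is what changed between the two versions.

```ruby
require "rexml/document"

doc = REXML::Document.new(
  '<!DOCTYPE r [<!ENTITY a "x"><!ENTITY b "&a;y">]><r/>'
)
entity = doc.doctype.entities["b"]

# The declared replacement text, with the &a; reference left in place.
raw = entity.value

# The same text with general entity references expanded via the DOCTYPE.
expanded = entity.unnormalized

puts raw
puts expanded
```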
data/lib/rexml/formatters/pretty.rb CHANGED
@@ -111,7 +111,7 @@ module REXML
        # itself, then we don't need a carriage return... which makes this
        # logic more complex.
        node.children.each { |child|
-         next if child == node.children[-1] and child.instance_of?(Text)
+         next if child.instance_of?(Text)
          unless child == node.children[0] or child.instance_of?(Text) or
            (child == node.children[1] and !node.children[0].writethis)
            output << "\n"
data/lib/rexml/node.rb CHANGED
@@ -52,10 +52,14 @@ module REXML
  
      # Visit all subnodes of +self+ recursively
      def each_recursive(&block) # :yields: node
-       self.elements.each {|node|
-         block.call(node)
-         node.each_recursive(&block)
-       }
+       stack = []
+       each { |child| stack.unshift child if child.node_type == :element }
+       until stack.empty?
+         child = stack.pop
+         yield child
+         n = stack.size
+         child.each { |grandchild| stack.insert n, grandchild if grandchild.node_type == :element }
+       end
      end
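
The rewritten `each_recursive` uses an explicit stack instead of recursion but still visits elements in document (pre-order) order: children are inserted so that the next `pop` returns the first unvisited child. A minimal check of that ordering on an invented document:

```ruby
require "rexml/document"

doc = REXML::Document.new("<a><b><c/></b><d/></a>")

# Collect element names in the order each_recursive yields them.
names = []
doc.each_recursive { |element| names << element.name }

puts names.join(" ")
```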
  
      # Find (and return) first subnode (recursively) for which the block