rexml 3.2.8 → 3.3.6
- checksums.yaml +4 -4
- data/NEWS.md +242 -2
- data/lib/rexml/element.rb +16 -31
- data/lib/rexml/entity.rb +5 -47
- data/lib/rexml/formatters/pretty.rb +1 -1
- data/lib/rexml/node.rb +8 -4
- data/lib/rexml/parsers/baseparser.rb +220 -61
- data/lib/rexml/parsers/pullparser.rb +4 -0
- data/lib/rexml/parsers/sax2parser.rb +6 -19
- data/lib/rexml/parsers/streamparser.rb +8 -10
- data/lib/rexml/parsers/treeparser.rb +9 -21
- data/lib/rexml/rexml.rb +1 -1
- data/lib/rexml/source.rb +72 -15
- data/lib/rexml/text.rb +34 -14
- metadata +6 -5
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 4b79c22060286dad847e18d30b4b336bda21d2772ccb35413fb9ba51a0012ed2
+  data.tar.gz: feb56a4a3071541e983acd33b8baa6b9052f8d67d871102cfe6e69773a0cfcfe
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: b615c95f8624212e151443ad03ba9b64f39aee8a200ea212150a10116340157cfda1bf974ab3d03161c0fb37d866e8c1c69ccc6a9549a13398452b32166af2d8
+  data.tar.gz: db7dcac658e1f51f30575c24d6f36dc256349331fa1951c8fdfaf214baf97a5a446a1fcc411358a76d2c6fc36388ec8b1178adeacc3225d16d5d95ac53a8c4b3
data/NEWS.md
CHANGED
@@ -1,5 +1,246 @@
 # News
 
+## 3.3.6 - 2024-08-22 {#version-3-3-6}
+
+### Improvements
+
+* Removed duplicated entity expansions for performance.
+  * GH-194
+  * Patch by Viktor Ivarsson.
+
+* Improved namespace conflicted attribute check performance. It was
+  too slow for deep elements.
+  * Reported by l33thaxor.
+
+### Fixes
+
+* Fixed a bug that default entity expansions are counted for
+  security check. Default entity expansions should not be counted
+  because they don't have a security risk.
+  * GH-198
+  * GH-199
+  * Patch Viktor Ivarsson
+
+* Fixed a parser bug that parameter entity references in internal
+  subsets are expanded. It's not allowed in the XML specification.
+  * GH-191
+  * Patch by NAITOH Jun.
+
+* Fixed a stream parser bug that user-defined entity references in
+  text aren't expanded.
+  * GH-200
+  * Patch by NAITOH Jun.
+
+### Thanks
+
+* Viktor Ivarsson
+
+* NAITOH Jun
+
+* l33thaxor
+
+## 3.3.5 - 2024-08-12 {#version-3-3-5}
+
+### Fixes
+
+* Fixed a bug that `REXML::Security.entity_expansion_text_limit`
+  check has wrong text size calculation in SAX and pull parsers.
+  * GH-193
+  * GH-195
+  * Reported by Viktor Ivarsson.
+  * Patch by NAITOH Jun.
+
+### Thanks
+
+* Viktor Ivarsson
+
+* NAITOH Jun
+
+## 3.3.4 - 2024-08-01 {#version-3-3-4}
+
+### Fixes
+
+* Fixed a bug that `REXML::Security` isn't defined when
+  `REXML::Parsers::StreamParser` is used and
+  `rexml/parsers/streamparser` is only required.
+  * GH-189
+  * Patch by takuya kodama.
+
+### Thanks
+
+* takuya kodama
+
+## 3.3.3 - 2024-08-01 {#version-3-3-3}
+
+### Improvements
+
+* Added support for detecting invalid XML that has unsupported
+  content before root element
+  * GH-184
+  * Patch by NAITOH Jun.
+
+* Added support for `REXML::Security.entity_expansion_limit=` and
+  `REXML::Security.entity_expansion_text_limit=` in SAX2 and pull
+  parsers
+  * GH-187
+  * Patch by NAITOH Jun.
+
+* Added more tests for invalid XMLs.
+  * GH-183
+  * Patch by Watson.
+
+* Added more performance tests.
+  * Patch by Watson.
+
+* Improved parse performance.
+  * GH-186
+  * Patch by tomoya ishida.
+
+### Thanks
+
+* NAITOH Jun
+
+* Watson
+
+* tomoya ishida
+
+## 3.3.2 - 2024-07-16 {#version-3-3-2}
+
+### Improvements
+
+* Improved parse performance.
+  * GH-160
+  * Patch by NAITOH Jun.
+
+* Improved parse performance.
+  * GH-169
+  * GH-170
+  * GH-171
+  * GH-172
+  * GH-173
+  * GH-174
+  * GH-175
+  * GH-176
+  * GH-177
+  * Patch by Watson.
+
+* Added support for raising a parse exception when an XML has extra
+  content after the root element.
+  * GH-161
+  * Patch by NAITOH Jun.
+
+* Added support for raising a parse exception when an XML
+  declaration exists in wrong position.
+  * GH-162
+  * Patch by NAITOH Jun.
+
+* Removed needless a space after XML declaration in pretty print mode.
+  * GH-164
+  * Patch by NAITOH Jun.
+
+* Stopped to emit `:text` event after the root element.
+  * GH-167
+  * Patch by NAITOH Jun.
+
+### Fixes
+
+* Fixed a bug that SAX2 parser doesn't expand predefined entities for
+  `characters` callback.
+  * GH-168
+  * Patch by NAITOH Jun.
+
+### Thanks
+
+* NAITOH Jun
+
+* Watson
+
+## 3.3.1 - 2024-06-25 {#version-3-3-1}
+
+### Improvements
+
+* Added support for detecting malformed top-level comments.
+  * GH-145
+  * Patch by Hiroya Fujinami.
+
+* Improved `REXML::Element#attribute` performance.
+  * GH-146
+  * Patch by Hiroya Fujinami.
+
+* Added support for detecting malformed `<!-->` comments.
+  * GH-147
+  * Patch by Hiroya Fujinami.
+
+* Added support for detecting unclosed `DOCTYPE`.
+  * GH-152
+  * Patch by Hiroya Fujinami.
+
+* Added `changlog_uri` metadata to gemspec.
+  * GH-156
+  * Patch by fynsta.
+
+* Improved parse performance.
+  * GH-157
+  * GH-158
+  * Patch by NAITOH Jun.
+
+### Fixes
+
+* Fixed a bug that large XML can't be parsed.
+  * GH-154
+  * Patch by NAITOH Jun.
+
+* Fixed a bug that private constants are visible.
+  * GH-155
+  * Patch by NAITOH Jun.
+
+### Thanks
+
+* Hiroya Fujinami
+
+* NAITOH Jun
+
+* fynsta
+
+## 3.3.0 - 2024-06-11 {#version-3-3-0}
+
+### Improvements
+
+* Added support for strscan 0.7.0 installed with Ruby 2.6.
+  * GH-142
+  * Reported by Fernando Trigoso.
+
+### Thanks
+
+* Fernando Trigoso
+
+## 3.2.9 - 2024-06-09 {#version-3-2-9}
+
+### Improvements
+
+* Added support for old strscan.
+  * GH-132
+  * Reported by Adam.
+
+* Improved attribute value parse performance.
+  * GH-135
+  * Patch by NAITOH Jun.
+
+* Improved `REXML::Node#each_recursive` performance.
+  * GH-134
+  * GH-139
+  * Patch by Hiroya Fujinami.
+
+* Improved text parse performance.
+  * Reported by mprogrammer.
+
+### Thanks
+
+* Adam
+* NAITOH Jun
+* Hiroya Fujinami
+* mprogrammer
+
 ## 3.2.8 - 2024-05-16 {#version-3-2-8}
 
 ### Fixes
@@ -30,7 +271,7 @@
 
 * Improved parse performance when an attribute has many `<`s.
 
-  * GH-
+  * GH-126
 
 ### Fixes
 
@@ -65,7 +306,6 @@
 * jcavalieri
 * DuKewu
 
-
 ## 3.2.6 - 2023-07-27 {#version-3-2-6}
 
 ### Improvements
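The 3.3.6 fix above says that expansions of default (predefined) entities should not count toward the security check. As a rough illustration of that rule only, the sketch below counts general entity references in a string while skipping the five entities predefined by XML. The helper `count_user_entity_references` and the constant `PREDEFINED_ENTITY_NAMES` are hypothetical names invented for this example; this is not how REXML implements the check.

```ruby
# Names of the five entities predefined by the XML specification.
PREDEFINED_ENTITY_NAMES = %w[lt gt amp quot apos].freeze

# Hypothetical helper (not part of REXML): counts general entity references
# in +text+, ignoring the predefined ones. Character references such as
# "&#38;" are not matched by the pattern and so are never counted.
def count_user_entity_references(text)
  text.scan(/&([A-Za-z_][\w.\-]*);/).count do |(name)|
    !PREDEFINED_ENTITY_NAMES.include?(name)
  end
end

sample = "a &lt; b &foo; &amp; &bar;"
user_references = count_user_entity_references(sample)
puts user_references
```

Only `&foo;` and `&bar;` contribute to the count here, which is the behaviour the changelog entry describes for the limit check.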
data/lib/rexml/element.rb
CHANGED
@@ -7,14 +7,6 @@ require_relative "xpath"
 require_relative "parseexception"
 
 module REXML
-  # An implementation note about namespaces:
-  # As we parse, when we find namespaces we put them in a hash and assign
-  # them a unique ID. We then convert the namespace prefix for the node
-  # to the unique ID. This makes namespace lookup much faster for the
-  # cost of extra memory use. We save the namespace prefix for the
-  # context node and convert it back when we write it.
-  @@namespaces = {}
-
   # An \REXML::Element object represents an XML element.
   #
   # An element:
@@ -449,9 +441,14 @@ module REXML
     # Related: #root_node, #document.
     #
     def root
-
-
-
+      target = self
+      while target
+        return target.elements[1] if target.kind_of? Document
+        parent = target.parent
+        return target if parent.kind_of? Document or parent.nil?
+        target = parent
+      end
+      nil
     end
 
     # :call-seq:
@@ -627,8 +624,12 @@ module REXML
       else
         prefix = "xmlns:#{prefix}" unless prefix[0,5] == 'xmlns'
       end
-      ns =
-
+      ns = nil
+      target = self
+      while ns.nil? and target
+        ns = target.attributes[prefix]
+        target = target.parent
+      end
       ns = '' if ns.nil? and prefix == 'xmlns'
       return ns
     end
@@ -1284,16 +1285,11 @@ module REXML
     # document.root.attribute("x", "a") # => a:x='a:x'
     #
     def attribute( name, namespace=nil )
-      prefix =
-      if namespaces.respond_to? :key
-        prefix = namespaces.key(namespace) if namespace
-      else
-        prefix = namespaces.index(namespace) if namespace
-      end
+      prefix = namespaces.key(namespace) if namespace
       prefix = nil if prefix == 'xmlns'
 
       ret_val =
-        attributes.get_attribute(
+        attributes.get_attribute( prefix ? "#{prefix}:#{name}" : name )
 
       return ret_val unless ret_val.nil?
       return nil if prefix.nil?
@@ -2388,17 +2384,6 @@ module REXML
       elsif old_attr.kind_of? Hash
         old_attr[value.prefix] = value
       elsif old_attr.prefix != value.prefix
-        # Check for conflicting namespaces
-        if value.prefix != "xmlns" and old_attr.prefix != "xmlns"
-          old_namespace = old_attr.namespace
-          new_namespace = value.namespace
-          if old_namespace == new_namespace
-            raise ParseException.new(
-              "Namespace conflict in adding attribute \"#{value.name}\": "+
-              "Prefix \"#{old_attr.prefix}\" = \"#{old_namespace}\" and "+
-              "prefix \"#{value.prefix}\" = \"#{new_namespace}\"")
-          end
-        end
         store value.name, {old_attr.prefix => old_attr,
           value.prefix => value}
       else
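The new `root` and `namespace` bodies in element.rb both walk up the parent chain in a loop. The sketch below mirrors that pattern on a plain `Struct`, so it runs without REXML. `SketchElement`, `lookup_namespace`, and `topmost` are hypothetical names for this example, and the `Document` special case that the real `root` handles is left out.

```ruby
# Hypothetical stand-in for an element: a name, a hash of attributes,
# and a link to the parent (nil at the top). Not a REXML class.
SketchElement = Struct.new(:name, :attributes, :parent)

# Walk up the ancestors until one of them declares the namespace prefix,
# in the spirit of the loop added to Element#namespace.
def lookup_namespace(element, prefix)
  key = prefix.empty? ? "xmlns" : "xmlns:#{prefix}"
  ns = nil
  target = element
  while ns.nil? && target
    ns = target.attributes[key]
    target = target.parent
  end
  # An undeclared default namespace resolves to the empty string.
  ns = "" if ns.nil? && key == "xmlns"
  ns
end

# Follow parent links to the topmost element without recursion.
def topmost(element)
  target = element
  target = target.parent while target.parent
  target
end

top  = SketchElement.new("a", { "xmlns:x" => "urn:example:x" }, nil)
mid  = SketchElement.new("b", {}, top)
leaf = SketchElement.new("c", {}, mid)

puts lookup_namespace(leaf, "x")
puts topmost(leaf).name
```

A loop of this shape keeps the call stack flat however deep the element is nested, which is presumably relevant to the "too slow for deep elements" note in the 3.3.6 changelog.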
data/lib/rexml/entity.rb
CHANGED
@@ -12,6 +12,7 @@ module REXML
     EXTERNALID = "(?:(?:(SYSTEM)\\s+#{SYSTEMLITERAL})|(?:(PUBLIC)\\s+#{PUBIDLITERAL}\\s+#{SYSTEMLITERAL}))"
     NDATADECL = "\\s+NDATA\\s+#{NAME}"
     PEREFERENCE = "%#{NAME};"
+    PEREFERENCE_RE = /#{PEREFERENCE}/um
     ENTITYVALUE = %Q{((?:"(?:[^%&"]|#{PEREFERENCE}|#{REFERENCE})*")|(?:'([^%&']|#{PEREFERENCE}|#{REFERENCE})*'))}
     PEDEF = "(?:#{ENTITYVALUE}|#{EXTERNALID})"
     ENTITYDEF = "(?:#{ENTITYVALUE}|(?:#{EXTERNALID}(#{NDATADECL})?))"
@@ -19,7 +20,7 @@ module REXML
     GEDECL = "<!ENTITY\\s+#{NAME}\\s+#{ENTITYDEF}\\s*>"
     ENTITYDECL = /\s*(?:#{GEDECL})|(?:#{PEDECL})/um
 
-    attr_reader :name, :external, :ref, :ndata, :pubid
+    attr_reader :name, :external, :ref, :ndata, :pubid, :value
 
     # Create a new entity. Simple entities can be constructed by passing a
     # name, value to the constructor; this creates a generic, plain entity
@@ -68,14 +69,11 @@ module REXML
     end
 
     # Evaluates to the unnormalized value of this entity; that is, replacing
-    #
-    # +value()+ in that +value+ only replaces %ent; entities.
+    # &ent; entities.
     def unnormalized
       document.record_entity_expansion unless document.nil?
-
-
-      @unnormalized = Text::unnormalize(v, parent)
-      @unnormalized
+      return nil if @value.nil?
+      @unnormalized = Text::unnormalize(@value, parent)
     end
 
     #once :unnormalized
@@ -121,46 +119,6 @@ module REXML
       write rv
       rv
     end
-
-    PEREFERENCE_RE = /#{PEREFERENCE}/um
-    # Returns the value of this entity. At the moment, only internal entities
-    # are processed. If the value contains internal references (IE,
-    # %blah;), those are replaced with their values. IE, if the doctype
-    # contains:
-    # <!ENTITY % foo "bar">
-    # <!ENTITY yada "nanoo %foo; nanoo>
-    # then:
-    # doctype.entity('yada').value #-> "nanoo bar nanoo"
-    def value
-      @resolved_value ||= resolve_value
-    end
-
-    def parent=(other)
-      @resolved_value = nil
-      super
-    end
-
-    private
-    def resolve_value
-      return nil if @value.nil?
-      return @value unless @value.match?(PEREFERENCE_RE)
-
-      matches = @value.scan(PEREFERENCE_RE)
-      rv = @value.clone
-      if @parent
-        sum = 0
-        matches.each do |entity_reference|
-          entity_value = @parent.entity( entity_reference[0] )
-          if sum + entity_value.bytesize > Security.entity_expansion_text_limit
-            raise "entity expansion has grown too large"
-          else
-            sum += entity_value.bytesize
-          end
-          rv.gsub!( /%#{entity_reference.join};/um, entity_value )
-        end
-      end
-      rv
-    end
   end
 
   # This is a set of entity constants -- the ones defined in the XML
data/lib/rexml/formatters/pretty.rb
CHANGED
@@ -111,7 +111,7 @@ module REXML
         # itself, then we don't need a carriage return... which makes this
         # logic more complex.
         node.children.each { |child|
-          next if child
+          next if child.instance_of?(Text)
           unless child == node.children[0] or child.instance_of?(Text) or
             (child == node.children[1] and !node.children[0].writethis)
             output << "\n"
data/lib/rexml/node.rb
CHANGED
@@ -52,10 +52,14 @@ module REXML
 
     # Visit all subnodes of +self+ recursively
     def each_recursive(&block) # :yields: node
-
-
-
-
+      stack = []
+      each { |child| stack.unshift child if child.node_type == :element }
+      until stack.empty?
+        child = stack.pop
+        yield child
+        n = stack.size
+        child.each { |grandchild| stack.insert n, grandchild if grandchild.node_type == :element }
+      end
     end
 
     # Find (and return) first subnode (recursively) for which the block
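The new `each_recursive` body replaces recursion with an explicit stack while still visiting nodes in document (pre-order) order. The sketch below reproduces the same stack handling on a plain `Struct` tree, so it runs without REXML. `TreeNode` and `each_descendant` are hypothetical names for this example, and the `node_type == :element` filter from the diff is omitted because every node in the sketch is treated as an element.

```ruby
# Hypothetical stand-in for an element node; not a REXML class.
TreeNode = Struct.new(:name, :children)

# Yield every descendant of +node+ in pre-order using an explicit stack.
def each_descendant(node)
  stack = []
  # unshift reverses the children, so pop returns the first child first.
  node.children.each { |child| stack.unshift(child) }
  until stack.empty?
    current = stack.pop
    yield current
    # Inserting each grandchild at the old top position keeps the first
    # grandchild nearest the end, so it is popped (visited) next.
    n = stack.size
    current.children.each { |grandchild| stack.insert(n, grandchild) }
  end
end

tree = TreeNode.new("a", [
  TreeNode.new("b", [TreeNode.new("c", []), TreeNode.new("d", [])]),
  TreeNode.new("e", []),
])

order = []
each_descendant(tree) { |node| order << node.name }
puts order.join(" ")
```

The starting node itself is not yielded, only its descendants, which matches the shape of the loop in the diff.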