compare-xml 0.5.2 → 0.6

Files changed (5)
  1. checksums.yaml +4 -4
  2. data/README.md +116 -119
  3. data/lib/compare-xml.rb +149 -125
  4. data/lib/compare-xml/version.rb +1 -1
  5. metadata +12 -12
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 33a222cacf011fd953938812fd99be01832a61ca
4
- data.tar.gz: fd30d6713bddf059806814e28faf431132697c60
3
+ metadata.gz: 24ddb2d55335e31c8bca6b26447717c51f907db9
4
+ data.tar.gz: 1eeccbd5f186e26e09553c577b8713a4a9eaa3d3
5
5
  SHA512:
6
- metadata.gz: cc7042aa3c3ff8b69a6ebcbf76d8a443d5fc7f4d38379f12cbd91d6fe8b908d6d096f70dd22945231e0403ff742cd04f9c0423237fcb3b0715a89baf92d3a93c
7
- data.tar.gz: 54d723bfb1c797083b103328bfee70ad40ad302c15355106755da9d22d9e58a854f56a859032a3af771a1a042b16727b4082b119832f3f1b5ea0c770b1bd849d
6
+ metadata.gz: 85b1a91be2f641993997dbd9afe0c95a28f359fbe936d89d19afb8cd3e9998b3975ace6fc3e7a1a3b0a0684d6114e8ded56fa90c900673890e92d7d20415b799
7
+ data.tar.gz: e0548edf277dd3967495f03b7d3d3e836d2fa80a56752034f691f953270e71650cb88b5af25338fe99d48bb95c66cf9d545703c7c3fd34b2270c84865ea0824b
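The hunk above only swaps digests, so a reader may want to check a downloaded archive against them. The following is a minimal sketch, assuming the entries in `checksums.yaml` are plain hex digests of the archive bytes. `archive_digests` and `checksum_matches?` are hypothetical helper names, not part of RubyGems or compare-xml, and the sample string stands in for the real contents of `data.tar.gz`.

```ruby
require 'digest/sha1'
require 'digest/sha2'

# Hypothetical helper: computes the two digest kinds listed in checksums.yaml
# for a given byte string (e.g. the contents of data.tar.gz).
def archive_digests(bytes)
  {
    'SHA1'   => Digest::SHA1.hexdigest(bytes),
    'SHA512' => Digest::SHA512.hexdigest(bytes)
  }
end

# Hypothetical helper: true when the computed digest equals the expected hex
# string, ignoring surrounding whitespace and letter case.
def checksum_matches?(bytes, algorithm, expected_hex)
  archive_digests(bytes).fetch(algorithm) == expected_hex.strip.downcase
end

# Stand-in data; a real check would read the archive with File.binread.
sample = 'stand-in for the contents of data.tar.gz'
digests = archive_digests(sample)
puts digests['SHA1'].length    # 40 hex characters
puts digests['SHA512'].length  # 128 hex characters
```

Comparing both digests guards against a mismatch in either section of the file.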
data/README.md CHANGED
@@ -68,31 +68,59 @@ CompareXML has a variety of options that can be invoked as an optional argument,
68
68
  CompareXML.equivalent?(doc1, doc2, {ignore_comments: false, verbose: true, ...})
69
69
  ```
70
70
 
71
- - `ignore_attr_order: {true|false}` default: **`true`**
71
+ - `collapse_whitespace: {true|false}` default: **`true`** [→ read more ←](#collapse_whitespace)
72
+ - when `true`, trims and collapses whitespace
73
+
74
+ - `ignore_attr_order: {true|false}` default: **`true`** [→ read more ←](#ignore_attr_order)
72
75
  - when `true`, ignores attribute order within tags
73
76
 
74
- - `ignore_attrs: {css}` default: **`{}`**
77
+ - `ignore_attr_content: [string1, string2, ...]` default: **`[]`** [→ read more ←](#ignore_attr_content)
78
+ - when provided, ignores all attributes that contain substrings `string1`, `string2`, etc.
79
+
80
+ - `ignore_attrs: [css_selector1, css_selector2, ...]` default: **`[]`** [→ read more ←](#ignore_attrs)
75
81
  - when provided, ignores specific *attributes* using [CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp)
76
82
 
77
- - `ignore_comments: {true|false}` default: **`true`**
83
+ - `ignore_comments: {true|false}` default: **`true`** [...](#ignore_comments)
78
84
  - when `true`, ignores comments, such as `<!-- comment -->`
79
85
 
80
- - `ignore_nodes: {css}` default: **`{}`**
86
+ - `ignore_nodes: [css_selector1, css_selector2, ...]` default: **`[]`** [&rarr; read more &larr;](#ignore_nodes)
81
87
  - when provided, ignores specific *nodes* using [CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp)
82
88
 
83
- - `ignore_text_nodes: {true|false}` default: **`false`**
89
+ - `ignore_text_nodes: {true|false}` default: **`false`** [&rarr; read more &larr;](#ignore_text_nodes)
84
90
  - when `true`, ignores all text content within a document
85
91
 
86
- - `collapse_whitespace: {true|false}` default: **`true`**
87
- - when `true`, trims and collapses whitespace
88
-
89
- - `verbose: {true|false}` default: **`false`**
92
+ - `verbose: {true|false}` default: **`false`** [&rarr; read more &larr;](#verbose)
90
93
  - when `true`, instead of a boolean, `CompareXML.equivalent?` returns an array of discrepancies.
91
94
 
92
95
 
93
96
  ## Options in Depth
94
97
 
95
- - `ignore_attr_order: {true|false}` default: **`true`**
98
+ - <a id="collapse_whitespace"></a>`collapse_whitespace: {true|false}` default: **`true`**
99
+
100
+ When `true`, all text content within the document is trimmed (i.e. space removed from left and right) and whitespace is collapsed (i.e. tabs, new lines, multiple whitespace characters are replaced by a single whitespace).
101
+
102
+ **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {collapse_whitespace: true})`
103
+
104
+ **Example:** When `true` the following HTML strings are considered equal:
105
+
106
+ <a href="/admin"> SOME TEXT CONTENT </a>
107
+ <a href="/index"> SOME TEXT CONTENT </a>
108
+
109
+ **Example:** When `true` the following HTML strings are considered equal:
110
+
111
+ <html>
112
+ <title>
113
+ This is my title
114
+ </title>
115
+ </html>
116
+
117
+ <html><title>This is my title</title></html>
118
+
119
+
120
+ ----------
121
+
122
+
123
+ - <a id="ignore_attr_order"></a>`ignore_attr_order: {true|false}` default: **`true`**
96
124
 
97
125
  When `true`, all attributes are sorted before comparison and only attributes of the same type are compared.
98
126
 
@@ -120,8 +148,30 @@ CompareXML.equivalent?(doc1, doc2, {ignore_comments: false, verbose: true, ...})
120
148
  target="_blank" == target="_blank"
121
149
 
122
150
 
151
+ ----------
152
+
153
+
154
+ - <a id="ignore_attr_content"></a>`ignore_attr_content: [string1, string2, ...]` default: **`[]`**
155
+
156
+ When provided, ignores all **attributes** that contain any of the given substrings. **Note:** types of attributes still have to match (i.e. `<p>` = `<p>`, `<div>` = `<div>`, etc).
157
+
158
+ **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {ignore_attr_content: ['button']})`
159
+
160
+ **Example:** With `ignore_attr_content: ['button']` the following HTML strings are considered equal:
161
+
162
+ <a href="/admin" id="button_1" class="blue button">Link</a>
163
+ <a href="/admin" id="button_2" class="info button">Link</a>
164
+
165
+ **Example:** With `ignore_attr_content: ['menu']` the following HTML strings are considered equal:
123
166
 
124
- - `ignore_attrs: {css}` default: **`{}`**
167
+ <a class="menu left" data-scope="abrth$menu" role="side-menu">Link</a>
168
+ <a class="main menu" data-scope="ergeh$menu" role="main-menu">Link</a>
169
+
170
+
171
+ ----------
172
+
173
+
174
+ - <a id="ignore_attrs"></a>`ignore_attrs: [css_selector1, css_selector2, ...]` default: **`[]`**
125
175
 
126
176
  When provided, ignores all **attributes** that satisfy a particular rule using [CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp).
127
177
 
@@ -138,8 +188,10 @@ CompareXML.equivalent?(doc1, doc2, {ignore_comments: false, verbose: true, ...})
138
188
  <a href="https://google.com" class="primary button rounded">Link</a>
139
189
 
140
190
 
191
+ ----------
192
+
141
193
 
142
- - `ignore_comments: {true|false}` default: **`true`**
194
+ - <a id="ignore_comments"></a>`ignore_comments: {true|false}` default: **`true`**
143
195
 
144
196
  When `true`, ignores comments, such as `<!-- This is a comment -->`.
145
197
 
@@ -156,8 +208,10 @@ CompareXML.equivalent?(doc1, doc2, {ignore_comments: false, verbose: true, ...})
156
208
  <a href="/admin">Link</a>
157
209
 
158
210
 
211
+ ----------
159
212
 
160
- - `ignore_nodes: {css}` default: **`{}`**
213
+
214
+ - <a id="ignore_nodes"></a>`ignore_nodes: [css_selector1, css_selector2, ...]` default: **`[]`**
161
215
 
162
216
  When provided, ignores all **nodes** that satisfy a particular rule using [CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp).
163
217
 
@@ -174,8 +228,10 @@ CompareXML.equivalent?(doc1, doc2, {ignore_comments: false, verbose: true, ...})
174
228
  <a href="/admin"><i class="icon info"></i><b>Message:</b> Link</a>
175
229
 
176
230
 
231
+ ----------
232
+
177
233
 
178
- - `ignore_text_nodes: {true|false}` default: **`false`**
234
+ - <a id="ignore_text_nodes"></a>`ignore_text_nodes: {true|false}` default: **`false`**
179
235
 
180
236
  When `true`, ignores all text content. Text content is anything that is included between an opening and a closing tag, e.g. `<tag>THIS IS TEXT CONTENT</tag>`.
181
237
 
@@ -192,127 +248,68 @@ CompareXML.equivalent?(doc1, doc2, {ignore_comments: false, verbose: true, ...})
192
248
  <i class="icon"> </i> <b>Message:</b>
193
249
 
194
250
 
251
+ ----------
195
252
 
196
- - `collapse_whitespace: {true|false}` default: **`true`**
197
-
198
- When `true`, all text content within the document is trimmed (i.e. space removed from left and right) and whitespace is collapsed (i.e. tabs, new lines, multiple whitespace characters are replaced by a single whitespace).
199
-
200
- **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {collapse_whitespace: true})`
201
253
 
202
- **Example:** When `true` the following HTML strings are considered equal:
203
-
204
- <a href="/admin"> SOME TEXT CONTENT </a>
205
- <a href="/index"> SOME TEXT CONTENT </a>
206
-
207
- **Example:** When `true` the following HTML strings are considered equal:
208
-
209
- <html>
210
- <title>
211
- This is my title
212
- </title>
213
- </html>
214
-
215
- <html><title>This is my title</title></html>
216
-
217
-
218
-
219
- - `verbose: {true|false}` default: **`false`**
254
+ - <a id="verbose"></a>`verbose: {true|false}` default: **`false`**
220
255
 
221
256
  When `true`, instead of returning a boolean value `CompareXML.equivalent?` returns an array of all errors encountered when performing a comparison.
222
257
 
223
- > **Warning:** When `true`, the comparison takes longer! Not only because more processing is required to produce meaningful error messages, but also because in this mode, comparison does **NOT** stop when a first error is encountered, because the goal is to capture as many discrepancies as possible.
258
+ > **Warning:** When `true`, the comparison takes longer, both because more processing is required to produce meaningful differences and because comparison does **NOT** stop at the first difference: the goal is to capture as many differences as possible.
224
259
 
225
260
  **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {verbose: true})`
226
261
 
227
262
  **Example:** When `true` given the following HTML strings:
228
263
 
229
- <!DOCTYPE html>
230
- <html lang="en">
231
- <head><title>TITLE</title></head>
232
- <body>
233
- <h1>SOME HEADING</h1>
234
- <div id="content">
235
- <h2><i class="fa fa-cogs"></i> ANOTHER HEADING</h2>
236
- <p>Extra content</p>
237
- </div>
238
- <div class="window">
239
- <a href="/admin" rel="icon">Link</a>
240
- </div>
241
- <blockquote>Some fancy quote <cite>Author Name</cite></blockquote>
242
- <p>Some more text</p>
243
- <p>Yet more text</p>
244
- <p>Too much text</p>
245
- <!-- The footer is below -->
246
- <p class="footer">FOOTER</p>
247
- </body>
248
- </html>
249
-
250
- <!DOCTYPE html>
251
- <html lang="en">
252
- <head><title>ANOTHER TITLE</title></head>
253
- <body>
254
- <h1 id="main">SOME HEADING</h1>
255
- <div id="content">
256
- <h2><i class="fa fa-cogs"></i> ANOTHER HEADING</h2>
257
- <p>Extra content</p>
258
- </div>
259
- <div class="window">
260
- <a rel="button" href="/admin">Link</a>
261
- </div>
262
- <blockquote>Some fancy quote</blockquote>
263
- <p>Some more text</p>
264
- <p>Yet more text</p>
265
- <p>Too much text</p>
266
- <!-- This is the footer -->
267
- <div class="footer">FOOTER</div>
268
- </body>
269
- </html>
264
+ ![diffing](https://dl.dropboxusercontent.com/u/1001101/input.png)
270
265
 
271
266
  `CompareXML.equivalent?(doc1, doc2, {verbose: true})` will produce an array shown below.
272
267
 
273
- [
274
- "html:head:title",
275
- "TITLE",
276
- 10,
277
- "ANOTHER TITLE",
278
- "html:head:title"
279
- ],
280
- [
281
- "html:body:h1",
282
- nil,
283
- 2,
284
- "id=\"main\"",
285
- "html:body:h1"
286
- ],
287
- [
288
- "html:body:div(2):a",
289
- "rel=\"button\"",
290
- 4,
291
- "rel=\"icon\"",
292
- "html:body:div(2):a"
293
- ],
294
- [
295
- "html:body:blockquote:cite",
296
- "cite",
297
- 3,
298
- nil,
299
- "html:body:blockquote:cite"
300
- ],
301
- [
302
- "html:body:p(4)",
303
- "p",
304
- 8,
305
- "div",
306
- "html:body:div(3)"
307
- ]
308
-
309
- The structure of the array is as follows:
310
-
311
- [left_node_location, left_content, error_code, right_content, right_node_location]
268
+ ```ruby
269
+ [
270
+ {
271
+ node1: '<title>TITLE</title>',
272
+ node2: '<title>ANOTHER TITLE</title>',
273
+ diff1: 'TITLE',
274
+ diff2: 'ANOTHER TITLE',
275
+ },
276
+ {
277
+ node1: '<h1>SOME HEADING</h1>',
278
+ node2: '<h1 id="main">SOME HEADING</h1>',
279
+ diff1: nil,
280
+ diff2: 'id="main"',
281
+ },
282
+ {
283
+ node1: '<a href="/admin" rel="icon">Link</a>',
284
+ node2: '<a rel="button" href="/admin">Link</a>',
285
+ diff1: 'rel="icon"',
286
+ diff2: 'rel="button"',
287
+ },
288
+ {
289
+ node1: '<cite>Author Name</cite>',
290
+ node2: nil,
291
+ diff1: '<cite>Author Name</cite>',
292
+ diff2: nil,
293
+ },
294
+ {
295
+ node1: '<p class="footer">FOOTER</p>',
296
+ node2: '<div class="footer">FOOTER</div>',
297
+ diff1: 'p',
298
+ diff2: 'div',
299
+ }
300
+ ]
301
+ ```
302
+
303
+ The structure of each hash inside the array is:
304
+
305
+ node1: [Nokogiri::XML::Node] left node that contains the difference
306
+ node2: [Nokogiri::XML::Node] right node that contains the difference
307
+ diff1: [Nokogiri::XML::Node|String] left difference
308
+ diff2: [Nokogiri::XML::Node|String] right difference
312
309
 
313
310
  **Node location** of `html:body:p(4)` means that the element in question is `<p>`, its hierarchical ancestors are `html > body`, and it is the **4th** `<p>` tag. That is, it could be found in
314
311
 
315
- <html><body><p>one</p>...<p>two</p>...<p>three</p>...<p>TARGET</p></body></html>
312
+ <html><body><p>one</p>...<p>two</p>...<p>three</p>...<p>TARGET</p></body></html>
316
313
 
317
314
  > **Note:** `p(4)` means that it is the fourth tag of type `<p>`, but there could be many other tags of other types between `p(3)` and `p(4)`.
318
315
 
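The README hunks above change the `verbose: true` result from positional arrays to hashes keyed by `node1`, `node2`, `diff1` and `diff2`. As a minimal sketch of how a caller might consume that shape, the hypothetical helper `describe_differences` below (not part of compare-xml) turns each hash into one readable line. Plain strings stand in for the `Nokogiri::XML::Node` values the README describes, on the assumption that calling `to_s` on a real node yields its markup.

```ruby
# Hypothetical helper, not part of compare-xml: formats the 0.6-style array of
# difference hashes into one human-readable line per difference.
def describe_differences(differences)
  differences.map do |d|
    # A nil diff means the counterpart is absent on that side.
    left    = d[:diff1].nil? ? '(missing)' : d[:diff1].to_s
    right   = d[:diff2].nil? ? '(missing)' : d[:diff2].to_s
    # Use whichever containing node is present to give context.
    context = (d[:node1] || d[:node2]).to_s
    "#{left} != #{right} in #{context}"
  end
end

# Sample data copied from the README example; strings stand in for nodes.
differences = [
  { node1: '<title>TITLE</title>', node2: '<title>ANOTHER TITLE</title>',
    diff1: 'TITLE', diff2: 'ANOTHER TITLE' },
  { node1: '<cite>Author Name</cite>', node2: nil,
    diff1: '<cite>Author Name</cite>', diff2: nil }
]

puts describe_differences(differences)
```

Because the entries are now keyed hashes, a caller no longer needs to remember positional indexes or numeric error codes to read a difference.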
data/lib/compare-xml.rb CHANGED
@@ -5,67 +5,71 @@ module CompareXML
5
5
 
6
6
  # default options used by the module; all of these can be overridden
7
7
  DEFAULTS_OPTS = {
8
+ # when true, trims and collapses whitespace in text nodes and comments to a single space
9
+ # when false, all whitespace is preserved as it is without any changes
10
+ collapse_whitespace: true,
11
+
8
12
  # when true, attribute order is not important (all attributes are sorted before comparison)
9
13
  # when false, attributes are compared in order and comparison stops on the first mismatch
10
14
  ignore_attr_order: true,
11
15
 
16
+ # contains an array of user specified strings that is used to ignore any attributes
17
+ # whose content contains a string from this array (e.g. "good automobile" contains "mobile")
18
+ ignore_attr_content: [],
19
+
12
20
  # contains an array of user-specified CSS rules used to perform attribute exclusions
13
21
  # for this to work, a CSS rule MUST contain the attribute to be excluded,
14
22
  # i.e. a[href] will exclude all "href" attributes contained in <a> tags.
15
- ignore_attrs: {},
23
+ ignore_attrs: [],
16
24
 
17
25
  # when true ignores XML and HTML comments
18
26
  # when false, all comments are compared to their counterparts
19
27
  ignore_comments: true,
20
28
 
21
29
  # contains an array of user-specified CSS rules used to perform node exclusions
22
- ignore_nodes: {},
30
+ ignore_nodes: [],
23
31
 
24
32
  # when true, ignores all text nodes (although blank text nodes are always ignored)
25
33
  # when false, all text nodes are compared to their counterparts (except the empty ones)
26
34
  ignore_text_nodes: false,
27
35
 
28
- # when true, trims and collapses whitespace in text nodes and comments to a single space
29
- # when false, all whitespace is preserved as it is without any changes
30
- collapse_whitespace: true,
31
-
32
36
  # when true, provides a list of all error messages encountered in comparisons
33
37
  # when false, execution stops when the first error is encountered with no error messages
34
38
  verbose: false
35
39
  }
36
40
 
37
- # used internally only in order to differentiate equivalence for inequivalence
38
- EQUIVALENT = 1
39
41
 
40
- # a list of all possible inequivalence types for nodes
41
- # these are returned in the errors array to differentiate error types.
42
- MISSING_ATTRIBUTE = 2 # attribute is missing its counterpart
43
- MISSING_NODE = 3 # node is missing its counterpart
44
- UNEQUAL_ATTRIBUTES = 4 # attributes are not equal
45
- UNEQUAL_COMMENTS = 5 # comment contents are not equal
46
- UNEQUAL_DOCUMENTS = 6 # document types are not equal
47
- UNEQUAL_ELEMENTS = 7 # nodes have the same type but are not equal
48
- UNEQUAL_NODES_TYPES = 8 # nodes do not have the same type
49
- UNEQUAL_TEXT_CONTENTS = 9 # text contents are not equal
42
+ class << self
50
43
 
44
+ # used internally only in order to differentiate equivalence from inequivalence
45
+ EQUIVALENT = 1
51
46
 
52
- class << self
47
+ # a list of all possible inequivalence types for nodes
48
+ # these are returned in the differences array to differentiate error types.
49
+ MISSING_ATTRIBUTE = 2 # attribute is missing its counterpart
50
+ MISSING_NODE = 3 # node is missing its counterpart
51
+ UNEQUAL_ATTRIBUTES = 4 # attributes are not equal
52
+ UNEQUAL_COMMENTS = 5 # comment contents are not equal
53
+ UNEQUAL_DOCUMENTS = 6 # document types are not equal
54
+ UNEQUAL_ELEMENTS = 7 # nodes have the same type but are not equal
55
+ UNEQUAL_NODES_TYPES = 8 # nodes do not have the same type
56
+ UNEQUAL_TEXT_CONTENTS = 9 # text node contents are not equal
53
57
 
54
58
  ##
55
59
  # Determines whether two XML documents or fragments are equal to each other.
56
60
  # The two parameters could be any type of XML documents, or fragments
57
61
  # or node sets or even text nodes - any subclass of Nokogiri::XML::Node.
58
62
  #
59
- # @param [Nokogiri::XML::Node] n1 left attribute
60
- # @param [Nokogiri::XML::Node] n2 right attribute
63
+ # @param [Nokogiri::XML::Element] n1 left node element
64
+ # @param [Nokogiri::XML::Element] n2 right node element
61
65
  # @param [Hash] opts user-overridden options
62
66
  #
63
- # @return true if equal, [Array] errors otherwise
67
+ # @return true if equal, [Array] differences otherwise
64
68
  #
65
69
  def equivalent?(n1, n2, opts = {})
66
- opts, errors = DEFAULTS_OPTS.merge(opts), []
67
- result = compareNodes(n1, n2, opts, errors)
68
- opts[:verbose] ? errors : result == EQUIVALENT
70
+ opts, differences = DEFAULTS_OPTS.merge(opts), []
71
+ result = compareNodes(n1, n2, opts, differences)
72
+ opts[:verbose] ? differences : result == EQUIVALENT
69
73
  end
70
74
 
71
75
 
@@ -75,36 +79,38 @@ module CompareXML
75
79
  # Compares two nodes for equivalence. The nodes could be any subclass
76
80
  # of Nokogiri::XML::Node including node sets and document fragments.
77
81
  #
78
- # @param [Nokogiri::XML::Node] n1 left attribute
79
- # @param [Nokogiri::XML::Node] n2 right attribute
82
+ # @param [Nokogiri::XML::Node] n1 left node
83
+ # @param [Nokogiri::XML::Node] n2 right node
80
84
  # @param [Hash] opts user-overridden options
81
- # @param [Array] errors inequivalence messages
85
+ # @param [Array] differences inequivalence messages
86
+ # @param [int] status comparison status code (EQUIVALENT by default)
82
87
  #
83
88
  # @return type of equivalence (from equivalence constants)
84
89
  #
85
- def compareNodes(n1, n2, opts, errors, status = EQUIVALENT)
90
+ def compareNodes(n1, n2, opts, differences, status = EQUIVALENT)
86
91
  if n1.class == n2.class
87
92
  case n1
88
93
  when Nokogiri::XML::Comment
89
- compareCommentNodes(n1, n2, opts, errors)
94
+ compareCommentNodes(n1, n2, opts, differences)
90
95
  when Nokogiri::HTML::Document
91
- compareDocumentNodes(n1, n2, opts, errors)
96
+ compareDocumentNodes(n1, n2, opts, differences)
92
97
  when Nokogiri::XML::Element
93
- status = compareElementNodes(n1, n2, opts, errors)
98
+ status = compareElementNodes(n1, n2, opts, differences)
94
99
  when Nokogiri::XML::Text
95
- status = compareTextNodes(n1, n2, opts, errors)
100
+ status = compareTextNodes(n1, n2, opts, differences)
96
101
  else
97
- status = compareChildren(n1.children, n2.children, opts, errors)
102
+ if n1.is_a?(Nokogiri::XML::Node) || n1.is_a?(Nokogiri::XML::NodeSet)
103
+ status = compareChildren(n1.children, n2.children, opts, differences)
104
+ else
105
+ raise 'Comparison only allowed between objects of type Nokogiri::XML::Node and Nokogiri::XML::NodeSet.'
106
+ end
98
107
  end
99
- elsif n1.nil?
100
- status = MISSING_NODE
101
- errors << [nodePath(n2), nil, status, n2.name, nodePath(n2)] if opts[:verbose]
102
- elsif n2.nil?
108
+ elsif n1.nil? || n2.nil?
103
109
  status = MISSING_NODE
104
- errors << [nodePath(n1), n1.name, status, nil, nodePath(n1)] if opts[:verbose]
110
+ addDifference(n1, n2, n1, n2, opts, differences)
105
111
  else
106
112
  status = UNEQUAL_NODES_TYPES
107
- errors << [nodePath(n1), n1.class, status, n2.class, nodePath(n2)] if opts[:verbose]
113
+ addDifference(n1, n2, n1, n2, opts, differences)
108
114
  end
109
115
  status
110
116
  end
@@ -113,20 +119,21 @@ module CompareXML
113
119
  ##
114
120
  # Compares two nodes of type Nokogiri::HTML::Comment.
115
121
  #
116
- # @param [Nokogiri::XML::Comment] n1 left attribute
117
- # @param [Nokogiri::XML::Comment] n2 right attribute
122
+ # @param [Nokogiri::XML::Comment] n1 left comment
123
+ # @param [Nokogiri::XML::Comment] n2 right comment
118
124
  # @param [Hash] opts user-overridden options
119
- # @param [Array] errors inequivalence messages
125
+ # @param [Array] differences inequivalence messages
126
+ # @param [int] status comparison status code (EQUIVALENT by default)
120
127
  #
121
128
  # @return type of equivalence (from equivalence constants)
122
129
  #
123
- def compareCommentNodes(n1, n2, opts, errors, status = EQUIVALENT)
130
+ def compareCommentNodes(n1, n2, opts, differences, status = EQUIVALENT)
124
131
  return true if opts[:ignore_comments]
125
132
  t1, t2 = n1.content, n2.content
126
133
  t1, t2 = collapse(t1), collapse(t2) if opts[:collapse_whitespace]
127
134
  unless t1 == t2
128
135
  status = UNEQUAL_COMMENTS
129
- errors << [nodePath(n1.parent), t1, status, t2, nodePath(n2.parent)] if opts[:verbose]
136
+ addDifference(n1, n2, t1, t2, opts, differences)
130
137
  end
131
138
  status
132
139
  end
@@ -135,19 +142,20 @@ module CompareXML
135
142
  ##
136
143
  # Compares two nodes of type Nokogiri::HTML::Document.
137
144
  #
138
- # @param [Nokogiri::XML::Document] n1 left attribute
139
- # @param [Nokogiri::XML::Document] n2 right attribute
145
+ # @param [Nokogiri::XML::Document] n1 left document
146
+ # @param [Nokogiri::XML::Document] n2 right document
140
147
  # @param [Hash] opts user-overridden options
141
- # @param [Array] errors inequivalence messages
148
+ # @param [Array] differences inequivalence messages
149
+ # @param [int] status comparison status code (EQUIVALENT by default)
142
150
  #
143
151
  # @return type of equivalence (from equivalence constants)
144
152
  #
145
- def compareDocumentNodes(n1, n2, opts, errors, status = EQUIVALENT)
153
+ def compareDocumentNodes(n1, n2, opts, differences, status = EQUIVALENT)
146
154
  if n1.name == n2.name
147
- status = compareChildren(n1.children, n2.children, opts, errors)
155
+ status = compareChildren(n1.children, n2.children, opts, differences)
148
156
  else
149
157
  status = UNEQUAL_DOCUMENTS
150
- errors << [nodePath(n1), n1, status, n2, nodePath(n2)] if opts[:verbose]
158
+ addDifference(n1, n2, n1, n2, opts, differences)
151
159
  end
152
160
  status
153
161
  end
@@ -159,11 +167,12 @@ module CompareXML
159
167
  # @param [Nokogiri::XML::NodeSet] n1_set left set of Nokogiri::XML::Node elements
160
168
  # @param [Nokogiri::XML::NodeSet] n2_set right set of Nokogiri::XML::Node elements
161
169
  # @param [Hash] opts user-overridden options
162
- # @param [Array] errors inequivalence messages
170
+ # @param [Array] differences inequivalence messages
171
+ # @param [int] status comparison status code (EQUIVALENT by default)
163
172
  #
164
173
  # @return type of equivalence (from equivalence constants)
165
174
  #
166
- def compareChildren(n1_set, n2_set, opts, errors, status = EQUIVALENT)
175
+ def compareChildren(n1_set, n2_set, opts, differences, status = EQUIVALENT)
167
176
  i = 0; j = 0
168
177
  while i < n1_set.length || j < n2_set.length
169
178
  if !n1_set[i].nil? && nodeExcluded?(n1_set[i], opts)
@@ -171,7 +180,7 @@ module CompareXML
171
180
  elsif !n2_set[j].nil? && nodeExcluded?(n2_set[j], opts)
172
181
  j += 1 # increment counter if right node is excluded
173
182
  else
174
- result = compareNodes(n1_set[i], n2_set[j], opts, errors)
183
+ result = compareNodes(n1_set[i], n2_set[j], opts, differences)
175
184
  status = result unless result == EQUIVALENT
176
185
 
177
186
  # return false so that this subtree could halt comparison on error
@@ -194,22 +203,23 @@ module CompareXML
194
203
  # - compares element attributes
195
204
  # - recursively compares element children
196
205
  #
197
- # @param [Nokogiri::XML::Element] n1 left attribute
198
- # @param [Nokogiri::XML::Element] n2 right attribute
206
+ # @param [Nokogiri::XML::Element] n1 left node element
207
+ # @param [Nokogiri::XML::Element] n2 right node element
199
208
  # @param [Hash] opts user-overridden options
200
- # @param [Array] errors inequivalence messages
209
+ # @param [Array] differences inequivalence messages
210
+ # @param [int] status comparison status code (EQUIVALENT by default)
201
211
  #
202
212
  # @return type of equivalence (from equivalence constants)
203
213
  #
204
- def compareElementNodes(n1, n2, opts, errors, status = EQUIVALENT)
214
+ def compareElementNodes(n1, n2, opts, differences, status = EQUIVALENT)
205
215
  if n1.name == n2.name
206
- result = compareAttributeSets(n1.attribute_nodes, n2.attribute_nodes, opts, errors)
207
- status = result unless result == EQUIVALENT
208
- result = compareChildren(n1.children, n2.children, opts, errors)
216
+ result = compareAttributeSets(n1, n2, n1.attribute_nodes, n2.attribute_nodes, opts, differences)
217
+ return result unless result == EQUIVALENT
218
+ result = compareChildren(n1.children, n2.children, opts, differences)
209
219
  status = result unless result == EQUIVALENT
210
220
  else
211
221
  status = UNEQUAL_ELEMENTS
212
- errors << [nodePath(n1), n1.name, status, n2.name, nodePath(n2)] if opts[:verbose]
222
+ addDifference(n1, n2, n1.name, n2.name, opts, differences)
213
223
  end
214
224
  status
215
225
  end
@@ -218,41 +228,44 @@ module CompareXML
218
228
  ##
219
229
  # Compares two nodes of type Nokogiri::XML::Text.
220
230
  #
221
- # @param [Nokogiri::XML::Text] n1 left attribute
222
- # @param [Nokogiri::XML::Text] n2 right attribute
231
+ # @param [Nokogiri::XML::Text] n1 left text node
232
+ # @param [Nokogiri::XML::Text] n2 right text node
223
233
  # @param [Hash] opts user-overridden options
224
- # @param [Array] errors inequivalence messages
234
+ # @param [Array] differences inequivalence messages
235
+ # @param [int] status comparison status code (EQUIVALENT by default)
225
236
  #
226
237
  # @return type of equivalence (from equivalence constants)
227
238
  #
228
- def compareTextNodes(n1, n2, opts, errors, status = EQUIVALENT)
239
+ def compareTextNodes(n1, n2, opts, differences, status = EQUIVALENT)
229
240
  return true if opts[:ignore_text_nodes]
230
241
  t1, t2 = n1.content, n2.content
231
242
  t1, t2 = collapse(t1), collapse(t2) if opts[:collapse_whitespace]
232
243
  unless t1 == t2
233
244
  status = UNEQUAL_TEXT_CONTENTS
234
- errors << [nodePath(n1.parent), t1, status, t2, nodePath(n2.parent)] if opts[:verbose]
245
+ addDifference(n1.parent, n2.parent, t1, t2, opts, differences)
235
246
  end
236
247
  status
237
248
  end
238
249
 
239
250
 
240
251
  ##
241
- # Compares two sets of Nokogiri::XML::Node attributes.
252
+ # Compares two sets of Nokogiri::XML::Element attributes.
242
253
  #
254
+ # @param [Nokogiri::XML::Element] n1 left node element
255
+ # @param [Nokogiri::XML::Element] n2 right node element
243
256
  # @param [Array] a1_set left attribute set
244
257
  # @param [Array] a2_set right attribute set
245
258
  # @param [Hash] opts user-overridden options
246
- # @param [Array] errors inequivalence messages
259
+ # @param [Array] differences inequivalence messages
247
260
  #
248
261
  # @return type of equivalence (from equivalence constants)
249
262
  #
250
- def compareAttributeSets(a1_set, a2_set, opts, errors)
263
+ def compareAttributeSets(n1, n2, a1_set, a2_set, opts, differences)
251
264
  return false unless a1_set.length == a2_set.length || opts[:verbose]
252
265
  if opts[:ignore_attr_order]
253
- compareSortedAttributeSets(a1_set, a2_set, opts, errors)
266
+ compareSortedAttributeSets(n1, n2, a1_set, a2_set, opts, differences)
254
267
  else
255
- compareUnsortedAttributeSets(a1_set, a2_set, opts, errors)
268
+ compareUnsortedAttributeSets(n1, n2, a1_set, a2_set, opts, differences)
256
269
  end
257
270
  end
258
271
 
@@ -262,29 +275,34 @@ module CompareXML
262
275
  # When the attributes are sorted, only attributes of the same type are compared
263
276
  # to each other, and missing attributes can be easily detected.
264
277
  #
278
+ # @param [Nokogiri::XML::Element] n1 left node element
279
+ # @param [Nokogiri::XML::Element] n2 right node element
265
280
  # @param [Array] a1_set left attribute set
266
281
  # @param [Array] a2_set right attribute set
267
282
  # @param [Hash] opts user-overridden options
268
- # @param [Array] errors inequivalence messages
283
+ # @param [Array] differences inequivalence messages
284
+ # @param [int] status comparison status code (EQUIVALENT by default)
269
285
  #
270
286
  # @return type of equivalence (from equivalence constants)
271
287
  #
272
- def compareSortedAttributeSets(a1_set, a2_set, opts, errors, status = EQUIVALENT)
288
+ def compareSortedAttributeSets(n1, n2, a1_set, a2_set, opts, differences, status = EQUIVALENT)
273
289
  a1_set, a2_set = a1_set.sort_by { |a| a.name }, a2_set.sort_by { |a| a.name }
274
290
  i = j = 0
275
291
 
276
292
  while i < a1_set.length || j < a2_set.length
293
+
277
294
  if a1_set[i].nil?
278
- result = compareAttributes(nil, a2_set[j], opts, errors); j += 1
295
+ result = compareAttributes(n1, n2, nil, a2_set[j], opts, differences); j += 1
279
296
  elsif a2_set[j].nil?
280
- result = compareAttributes(a1_set[i], nil, opts, errors); i += 1
297
+ result = compareAttributes(n1, n2, a1_set[i], nil, opts, differences); i += 1
281
298
  elsif a1_set[i].name < a2_set[j].name
282
- result = compareAttributes(a1_set[i], nil, opts, errors); i += 1
299
+ result = compareAttributes(n1, n2, a1_set[i], nil, opts, differences); i += 1
283
300
  elsif a1_set[i].name > a2_set[j].name
284
- result = compareAttributes(nil, a2_set[j], opts, errors); j += 1
301
+ result = compareAttributes(n1, n2, nil, a2_set[j], opts, differences); j += 1
285
302
  else
286
- result = compareAttributes(a1_set[i], a2_set[j], opts, errors); i += 1; j += 1
303
+ result = compareAttributes(n1, n2, a1_set[i], a2_set[j], opts, differences); i += 1; j += 1
287
304
  end
305
+
288
306
  status = result unless result == EQUIVALENT
289
307
  break unless status == EQUIVALENT || opts[:verbose]
290
308
  end
@@ -293,21 +311,24 @@ module CompareXML
293
311
 
294
312
 
295
313
  ##
296
- # Compares two sets of Nokogiri::XML::Node attributes without sorting them.
314
+ # Compares two sets of Nokogiri::XML::Element attributes without sorting them.
297
315
  # As a result attributes of different types may be compared, and even if all
298
316
  # attributes are identical in both sets, if their order is different,
299
317
  # the comparison will stop as soon as two unequal attributes are found.
300
318
  #
319
+ # @param [Nokogiri::XML::Element] n1 left node element
320
+ # @param [Nokogiri::XML::Element] n2 right node element
301
321
  # @param [Array] a1_set left attribute set
302
322
  # @param [Array] a2_set right attribute set
303
323
  # @param [Hash] opts user-overridden options
304
- # @param [Array] errors inequivalence messages
324
+ # @param [Array] differences inequivalence messages
325
+ # @param [int] status comparison status code (EQUIVALENT by default)
305
326
  #
306
327
  # @return type of equivalence (from equivalence constants)
307
328
  #
308
- def compareUnsortedAttributeSets(a1_set, a2_set, opts, errors, status = EQUIVALENT)
329
+ def compareUnsortedAttributeSets(n1, n2, a1_set, a2_set, opts, differences, status = EQUIVALENT)
309
330
  [a1_set.length, a2_set.length].max.times do |i|
310
- result = compareAttributes(a1_set[i], a2_set[i], opts, errors)
331
+ result = compareAttributes(n1, n2, a1_set[i], a2_set[i], opts, differences)
311
332
  status = result unless result == EQUIVALENT
312
333
  break unless status == EQUIVALENT
313
334
  end
@@ -318,29 +339,33 @@ module CompareXML
318
339
  ##
319
340
  # Compares two attributes by name and value.
320
341
  #
342
+ # @param [Nokogiri::XML::Element] n1 left node element
343
+ # @param [Nokogiri::XML::Element] n2 right node element
321
344
  # @param [Nokogiri::XML::Attr] a1 left attribute
322
345
  # @param [Nokogiri::XML::Attr] a2 right attribute
323
346
  # @param [Hash] opts user-overridden options
324
- # @param [Array] errors inequivalence messages
347
+ # @param [Array] differences inequivalence messages
348
+ # @param [int] status comparison status code (EQUIVALENT by default)
325
349
  #
326
350
  # @return type of equivalence (from equivalence constants)
327
351
  #
328
- def compareAttributes(a1, a2, opts, errors, status = EQUIVALENT)
352
+ def compareAttributes(n1, n2, a1, a2, opts, differences, status = EQUIVALENT)
329
353
  if a1.nil?
330
354
  status = MISSING_ATTRIBUTE
331
- errors << [nodePath(a2.parent), nil, status, "#{a2.name}=\"#{a2.value}\"", nodePath(a2.parent)] if opts[:verbose]
355
+ addDifference(n1, n2, nil, "#{a2.name}=\"#{a2.value}\"", opts, differences)
332
356
  elsif a2.nil?
333
357
  status = MISSING_ATTRIBUTE
334
- errors << [nodePath(a1.parent), "#{a1.name}=\"#{a1.value}\"", status, nil, nodePath(a1.parent)] if opts[:verbose]
358
+ addDifference(n1, n2, "#{a1.name}=\"#{a1.value}\"", nil, opts, differences)
335
359
  elsif a1.name == a2.name
336
360
  return status if attrsExcluded?(a1, a2, opts)
361
+ return status if attrContentExcluded?(a1, a2, opts)
337
362
  if a1.value != a2.value
338
363
  status = UNEQUAL_ATTRIBUTES
339
- errors << [nodePath(a1.parent), "#{a1.name}=\"#{a1.value}\"", status, "#{a2.name}=\"#{a2.value}\"", nodePath(a2.parent)] if opts[:verbose]
364
+ addDifference(n1, n2, "#{a1.name}=\"#{a1.value}\"", "#{a2.name}=\"#{a2.value}\"", opts, differences)
340
365
  end
341
366
  else
342
367
  status = UNEQUAL_ATTRIBUTES
343
- errors << [nodePath(a1.parent), a1.name, status, a2.name, nodePath(a2.parent)] if opts[:verbose]
368
+ addDifference(n1, n2, "#{a1.name}=\"#{a1.value}\"", "#{a2.name}=\"#{a2.value}\"", opts, differences)
344
369
  end
345
370
  status
346
371
  end
@@ -353,7 +378,7 @@ module CompareXML
353
378
  # Several types of nodes are considered ignored:
354
379
  # - comments (only in +ignore_comments+ mode)
355
380
  # - text nodes (only in +ignore_text_nodes+ mode OR when a text node is empty)
356
- # - node matches a user-specified css rule from +ignore_comments+
381
+ # - node matches a user-specified css rule from +ignore_nodes+
357
382
  #
358
383
  # @param [Nokogiri::XML::Node] n node being tested for exclusion
359
384
  # @param [Hash] opts user-overridden options
@@ -361,11 +386,10 @@ module CompareXML
361
386
  # @return true if excluded, false otherwise
362
387
  #
363
388
  def nodeExcluded?(n, opts)
389
+ return true if n.is_a?(Nokogiri::XML::DTD)
364
390
  return true if n.is_a?(Nokogiri::XML::Comment) && opts[:ignore_comments]
365
391
  return true if n.is_a?(Nokogiri::XML::Text) && (opts[:ignore_text_nodes] || collapse(n.content).empty?)
366
- opts[:ignore_nodes].each do |css|
367
- return true if n.xpath('../*').css(css).include?(n)
368
- end
392
+ opts[:ignore_nodes].each { |css| return true if n.parent.css(css).include? n }
369
393
  false
370
394
  end
371
395
 
@@ -393,43 +417,43 @@ module CompareXML
393
417
 
394
418
 
395
419
  ##
396
- # Produces the hierarchical ancestral path of a node in the following format: <html:body:div(3):h2:b(2)>.
397
- # This means that the element is located in:
398
- #
399
- # <html>
400
- # <body>
401
- # <div>...</div>
402
- # <div>...</div>
403
- # <div>
404
- # <h2>
405
- # <b>...</b>
406
- # <b>TARGET</b>
407
- # </h2>
408
- # </div>
409
- # </body>
410
- # </html>
411
- #
412
- # Note that the counts of element locations only apply to elements of the same type. For example, div(3) means
413
- # that it is the 3rd <div> element in the <body>, but there could be many other elements in between the three
414
- # <div> elements.
415
- #
416
- # When +ignore_comments+ mode is disabled, mismatching comments will show up as <...:comment>.
417
- #
418
- # @param [Nokogiri::XML::Node] n node for which to determine a hierarchical path
420
+ # Checks whether two given attributes should be excluded, based on their content.
421
+ # Returns true only if each attribute value contains at least one of the
422
+ # excluded strings from +ignore_attr_content+ (not necessarily the same one).
423
+ #
424
+ # @param [Nokogiri::XML::Attr] a1 left attribute
425
+ # @param [Nokogiri::XML::Attr] a2 right attribute
426
+ # @param [Hash] opts user-overridden options
419
427
  #
420
428
  # @return true if excluded, false otherwise
421
429
  #
422
- def nodePath(n)
423
- name = n.name
430
+ def attrContentExcluded?(a1, a2, opts)
431
+ a1_excluded, a2_excluded = false, false
432
+ opts[:ignore_attr_content].each do |content|
433
+ a1_excluded = a1_excluded || a1.value.include?(content)
434
+ a2_excluded = a2_excluded || a2.value.include?(content)
435
+ return true if a1_excluded && a2_excluded
436
+ end
437
+ false
438
+ end
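The behaviour of `attrContentExcluded?` can be tried without Nokogiri. The sketch below reuses the control flow shown in the diff, under these assumptions: `ValueAttr` and `attr_content_excluded?` are hypothetical names, and the struct only mimics the `value` reader of `Nokogiri::XML::Attr`.

```ruby
# Hypothetical stand-in for Nokogiri::XML::Attr.
ValueAttr = Struct.new(:name, :value)

# Same control flow as attrContentExcluded? in the diff: each flag stays true
# once its attribute value has matched any of the excluded strings.
def attr_content_excluded?(a1, a2, opts)
  a1_excluded = a2_excluded = false
  opts[:ignore_attr_content].each do |content|
    a1_excluded ||= a1.value.include?(content)
    a2_excluded ||= a2.value.include?(content)
    return true if a1_excluded && a2_excluded
  end
  false
end

opts = { ignore_attr_content: ['session', 'token'] }

# Both values contain 'session'.
both = attr_content_excluded?(
  ValueAttr.new('href', '/a?session=1'), ValueAttr.new('href', '/b?session=2'), opts
)
# Only the left value contains an excluded string.
one_side = attr_content_excluded?(
  ValueAttr.new('href', '/a?session=1'), ValueAttr.new('href', '/b'), opts
)
# The two values match different excluded strings.
mixed = attr_content_excluded?(
  ValueAttr.new('href', '/a?session=1'), ValueAttr.new('href', '/b?token=2'), opts
)

puts "both=#{both} one_side=#{one_side} mixed=#{mixed}"
```

The `mixed` case is worth noting: because the flags accumulate across iterations, the pair is excluded even though the two values match different strings from the list.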
424
439
 
425
- # find the index of the node if there are several of the same type
426
- siblings = n.xpath("../#{name}")
427
- name += "(#{siblings.index(n) + 1})" if siblings.length > 1
428
440
 
429
- if defined? n.parent
430
- status = "#{nodePath(n.parent)}:#{name}"
431
- status = status[1..-1] if status[0] == ':'
432
- status
441
+ ##
442
+ # Records a difference between two nodes in the differences array.
443
+ # Nothing is recorded unless +verbose+ mode is enabled.
444
+ #
445
+ # @param [Nokogiri::XML::Node] node1 left node
446
+ # @param [Nokogiri::XML::Node] node2 right node
447
+ # @param [String] diff1 left diffing value
448
+ # @param [String] diff2 right diffing value
449
+ # @param [Hash] opts user-overridden options
450
+ # @param [Array] differences inequivalence messages
451
+ #
452
+ # @return the differences array in +verbose+ mode, nil otherwise
453
+ #
454
+ def addDifference(node1, node2, diff1, diff2, opts, differences)
455
+ if opts[:verbose]
456
+ differences << {node1: node1, node2: node2, diff1: diff1, diff2: diff2}
433
457
  end
434
458
  end
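In 0.6 each recorded difference is a hash with the keys `node1`, `node2`, `diff1` and `diff2`, replacing the positional array built with `nodePath` in 0.5.2. A minimal sketch of how a caller might read those entries follows; `add_difference` is a hypothetical snake_case copy of `addDifference`, and plain strings stand in for the Nokogiri nodes.

```ruby
# Hypothetical copy of addDifference from the diff: an entry is appended only
# when opts[:verbose] is truthy.
def add_difference(node1, node2, diff1, diff2, opts, differences)
  differences << { node1: node1, node2: node2, diff1: diff1, diff2: diff2 } if opts[:verbose]
end

differences = []

# Verbose call: recorded. Strings are placeholders for Nokogiri nodes.
add_difference('<a id="1"/>', '<a id="2"/>', 'id="1"', 'id="2"', { verbose: true }, differences)
# Non-verbose call: ignored.
add_difference('<b/>', '<b x="1"/>', nil, 'x="1"', { verbose: false }, differences)

differences.each do |d|
  puts "left: #{d[:diff1].inspect}  right: #{d[:diff2].inspect}"
end
```

Keeping the nodes themselves in the entry, rather than a precomputed path string, leaves it to the caller to decide how to locate or render each side.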
435
459
 
data/lib/compare-xml/version.rb CHANGED
@@ -1,3 +1,3 @@
1
1
  module CompareXML
2
- VERSION = '0.5.2'
2
+ VERSION = '0.6'
3
3
  end
metadata CHANGED
@@ -1,55 +1,55 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: compare-xml
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.5.2
4
+ version: '0.6'
5
5
  platform: ruby
6
6
  authors:
7
7
  - Vadim Kononov
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2016-04-06 00:00:00.000000000 Z
11
+ date: 2016-04-29 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: bundler
15
15
  requirement: !ruby/object:Gem::Requirement
16
16
  requirements:
17
- - - ~>
17
+ - - "~>"
18
18
  - !ruby/object:Gem::Version
19
19
  version: '1.11'
20
20
  type: :development
21
21
  prerelease: false
22
22
  version_requirements: !ruby/object:Gem::Requirement
23
23
  requirements:
24
- - - ~>
24
+ - - "~>"
25
25
  - !ruby/object:Gem::Version
26
26
  version: '1.11'
27
27
  - !ruby/object:Gem::Dependency
28
28
  name: rake
29
29
  requirement: !ruby/object:Gem::Requirement
30
30
  requirements:
31
- - - ~>
31
+ - - "~>"
32
32
  - !ruby/object:Gem::Version
33
33
  version: '11.1'
34
34
  type: :development
35
35
  prerelease: false
36
36
  version_requirements: !ruby/object:Gem::Requirement
37
37
  requirements:
38
- - - ~>
38
+ - - "~>"
39
39
  - !ruby/object:Gem::Version
40
40
  version: '11.1'
41
41
  - !ruby/object:Gem::Dependency
42
42
  name: nokogiri
43
43
  requirement: !ruby/object:Gem::Requirement
44
44
  requirements:
45
- - - ~>
45
+ - - "~>"
46
46
  - !ruby/object:Gem::Version
47
47
  version: '1.6'
48
48
  type: :runtime
49
49
  prerelease: false
50
50
  version_requirements: !ruby/object:Gem::Requirement
51
51
  requirements:
52
- - - ~>
52
+ - - "~>"
53
53
  - !ruby/object:Gem::Version
54
54
  version: '1.6'
55
55
  description: CompareXML is a fast, lightweight and feature-rich tool that will solve
@@ -61,7 +61,7 @@ executables: []
61
61
  extensions: []
62
62
  extra_rdoc_files: []
63
63
  files:
64
- - .gitignore
64
+ - ".gitignore"
65
65
  - Gemfile
66
66
  - LICENSE.txt
67
67
  - README.md
@@ -81,17 +81,17 @@ require_paths:
81
81
  - lib
82
82
  required_ruby_version: !ruby/object:Gem::Requirement
83
83
  requirements:
84
- - - '>='
84
+ - - ">="
85
85
  - !ruby/object:Gem::Version
86
86
  version: '0'
87
87
  required_rubygems_version: !ruby/object:Gem::Requirement
88
88
  requirements:
89
- - - '>='
89
+ - - ">="
90
90
  - !ruby/object:Gem::Version
91
91
  version: '0'
92
92
  requirements: []
93
93
  rubyforge_project:
94
- rubygems_version: 2.6.2
94
+ rubygems_version: 2.5.2
95
95
  signing_key:
96
96
  specification_version: 4
97
97
  summary: A customizable tool that compares two instances of Nokogiri::XML::Node for