compare-xml 0.5.2 → 0.6

Files changed (5)
  1. checksums.yaml +4 -4
  2. data/README.md +116 -119
  3. data/lib/compare-xml.rb +149 -125
  4. data/lib/compare-xml/version.rb +1 -1
  5. metadata +12 -12
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 33a222cacf011fd953938812fd99be01832a61ca
4
- data.tar.gz: fd30d6713bddf059806814e28faf431132697c60
3
+ metadata.gz: 24ddb2d55335e31c8bca6b26447717c51f907db9
4
+ data.tar.gz: 1eeccbd5f186e26e09553c577b8713a4a9eaa3d3
5
5
  SHA512:
6
- metadata.gz: cc7042aa3c3ff8b69a6ebcbf76d8a443d5fc7f4d38379f12cbd91d6fe8b908d6d096f70dd22945231e0403ff742cd04f9c0423237fcb3b0715a89baf92d3a93c
7
- data.tar.gz: 54d723bfb1c797083b103328bfee70ad40ad302c15355106755da9d22d9e58a854f56a859032a3af771a1a042b16727b4082b119832f3f1b5ea0c770b1bd849d
6
+ metadata.gz: 85b1a91be2f641993997dbd9afe0c95a28f359fbe936d89d19afb8cd3e9998b3975ace6fc3e7a1a3b0a0684d6114e8ded56fa90c900673890e92d7d20415b799
7
+ data.tar.gz: e0548edf277dd3967495f03b7d3d3e836d2fa80a56752034f691f953270e71650cb88b5af25338fe99d48bb95c66cf9d545703c7c3fd34b2270c84865ea0824b
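The digests recorded in `checksums.yaml` can be checked against a locally extracted archive with Ruby's standard `digest` library. The following is a minimal sketch under stated assumptions: the helper name `digest_matches?` is hypothetical, and a temporary file stands in for `data.tar.gz` so the snippet runs on its own.

```ruby
require 'digest'
require 'tempfile'

# Hypothetical helper: true when the file's digest equals the expected hex string.
def digest_matches?(path, expected_hex, algorithm = Digest::SHA512)
  algorithm.file(path).hexdigest == expected_hex
end

# A temporary file stands in for data.tar.gz (assumption for illustration).
Tempfile.create('data') do |f|
  f.write('abc')
  f.flush
  expected = Digest::SHA1.hexdigest('abc')
  puts digest_matches?(f.path, expected, Digest::SHA1)
end
```

For a real check, the expected value would be the `SHA512` entry for `data.tar.gz` shown above and the path would point at the extracted file.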
data/README.md CHANGED
@@ -68,31 +68,59 @@ CompareXML has a variety of options that can be invoked as an optional argument,
68
68
  CompareXML.equivalent?(doc1, doc2, {ignore_comments: false, verbose: true, ...})
69
69
  ```
70
70
 
71
- - `ignore_attr_order: {true|false}` default: **`true`**
71
+ - `collapse_whitespace: {true|false}` default: **`true`** [→ read more ←](#collapse_whitespace)
72
+ - when `true`, trims and collapses whitespace
73
+
74
+ - `ignore_attr_order: {true|false}` default: **`true`** [→ read more ←](#ignore_attr_order)
72
75
  - when `true`, ignores attribute order within tags
73
76
 
74
- - `ignore_attrs: {css}` default: **`{}`**
77
+ - `ignore_attr_content: [string1, string2, ...]` default: **`[]`** [→ read more ←](#ignore_attr_content)
78
+ - when provided, ignores all attributes that contain substrings `string1`, `string2`, etc.
79
+
80
+ - `ignore_attrs: [css_selector1, css_selector2, ...]` default: **`[]`** [→ read more ←](#ignore_attrs)
75
81
  - when provided, ignores specific *attributes* using [CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp)
76
82
 
77
- - `ignore_comments: {true|false}` default: **`true`**
83
+ - `ignore_comments: {true|false}` default: **`true`** [...](#ignore_comments)
78
84
  - when `true`, ignores comments, such as `<!-- comment -->`
79
85
 
80
- - `ignore_nodes: {css}` default: **`{}`**
86
+ - `ignore_nodes: [css_selector1, css_selector2, ...]` default: **`[]`** [&rarr; read more &larr;](#ignore_nodes)
81
87
  - when provided, ignores specific *nodes* using [CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp)
82
88
 
83
- - `ignore_text_nodes: {true|false}` default: **`false`**
89
+ - `ignore_text_nodes: {true|false}` default: **`false`** [&rarr; read more &larr;](#ignore_text_nodes)
84
90
  - when `true`, ignores all text content within a document
85
91
 
86
- - `collapse_whitespace: {true|false}` default: **`true`**
87
- - when `true`, trims and collapses whitespace
88
-
89
- - `verbose: {true|false}` default: **`false`**
92
+ - `verbose: {true|false}` default: **`false`** [&rarr; read more &larr;](#verbose)
90
93
  - when `true`, instead of a boolean, `CompareXML.equivalent?` returns an array of discrepancies.
91
94
 
92
95
 
93
96
  ## Options in Depth
94
97
 
95
- - `ignore_attr_order: {true|false}` default: **`true`**
98
+ - <a id="collapse_whitespace"></a>`collapse_whitespace: {true|false}` default: **`true`**
99
+
100
+ When `true`, all text content within the document is trimmed (i.e. space removed from left and right) and whitespace is collapsed (i.e. tabs, new lines, multiple whitespace characters are replaced by a single whitespace).
101
+
102
+ **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {collapse_whitespace: true})`
103
+
104
+ **Example:** When `true` the following HTML strings are considered equal:
105
+
106
+ <a href="/admin"> SOME TEXT CONTENT </a>
107
+ <a href="/admin"> SOME TEXT CONTENT </a>
108
+
109
+ **Example:** When `true` the following HTML strings are considered equal:
110
+
111
+ <html>
112
+ <title>
113
+ This is my title
114
+ </title>
115
+ </html>
116
+
117
+ <html><title>This is my title</title></html>
118
+
119
+
120
+ ----------
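The trimming and collapsing described for `collapse_whitespace` can be approximated in a few lines of plain Ruby. This is a sketch only: the function name `collapse_whitespace` is a stand-in chosen for illustration and is not taken from the gem's implementation, which is not shown here.

```ruby
# Illustrative stand-in for the gem's whitespace handling (not its actual code):
# trim both ends, then squeeze every run of whitespace into a single space.
def collapse_whitespace(text)
  text.strip.gsub(/\s+/, ' ')
end

multiline = "\n    This is my title\n  "
inline    = 'This is my title'

# Both strings normalize to the same value, so they would compare as equal.
puts collapse_whitespace(multiline) == collapse_whitespace(inline)
```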
121
+
122
+
123
+ - <a id="ignore_attr_order"></a>`ignore_attr_order: {true|false}` default: **`true`**
96
124
 
97
125
  When `true`, all attributes are sorted before comparison and only attributes of the same type are compared.
98
126
 
@@ -120,8 +148,30 @@ CompareXML.equivalent?(doc1, doc2, {ignore_comments: false, verbose: true, ...})
120
148
  target="_blank" == target="_blank"
121
149
 
122
150
 
151
+ ----------
152
+
153
+
154
+ - <a id="ignore_attr_content"></a>`ignore_attr_content: [string1, string2, ...]` default: **`[]`**
155
+
156
+ When provided, ignores all **attributes** that contain any of the given substrings. **Note:** types of attributes still have to match (i.e. `<p>` = `<p>`, `<div>` = `<div>`, etc).
157
+
158
+ **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {ignore_attr_content: ['button']})`
159
+
160
+ **Example:** With `ignore_attr_content: ['button']` the following HTML strings are considered equal:
161
+
162
+ <a href="/admin" id="button_1" class="blue button">Link</a>
163
+ <a href="/admin" id="button_2" class="info button">Link</a>
164
+
165
+ **Example:** With `ignore_attr_content: ['menu']` the following HTML strings are considered equal:
123
166
 
124
- - `ignore_attrs: {css}` default: **`{}`**
167
+ <a class="menu left" data-scope="abrth$menu" role="side-menu">Link</a>
168
+ <a class="main menu" data-scope="ergeh$menu" role="main-menu">Link</a>
169
+
170
+
171
+ ----------
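One way to picture `ignore_attr_content` is as a filter applied to each element's attributes before they are compared. The sketch below models attributes as a plain hash; `strip_matching_attrs` is a hypothetical helper, not part of compare-xml, and the gem's real matching logic may differ.

```ruby
# Hypothetical helper, not part of compare-xml: drop every attribute whose
# value contains one of the given substrings, then compare what remains.
def strip_matching_attrs(attrs, substrings)
  attrs.reject { |_name, value| substrings.any? { |s| value.include?(s) } }
end

left  = { 'href' => '/admin', 'id' => 'button_1', 'class' => 'blue button' }
right = { 'href' => '/admin', 'id' => 'button_2', 'class' => 'info button' }

ignored = ['button']
# Only href survives on both sides, so the remaining attributes are equal.
puts strip_matching_attrs(left, ignored) == strip_matching_attrs(right, ignored)
```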
172
+
173
+
174
+ - <a id="ignore_attrs"></a>`ignore_attrs: [css_selector1, css_selector2, ...]` default: **`[]`**
125
175
 
126
176
  When provided, ignores all **attributes** that satisfy a particular rule using [CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp).
127
177
 
@@ -138,8 +188,10 @@ CompareXML.equivalent?(doc1, doc2, {ignore_comments: false, verbose: true, ...})
138
188
  <a href="https://google.com" class="primary button rounded">Link</a>
139
189
 
140
190
 
191
+ ----------
192
+
141
193
 
142
- - `ignore_comments: {true|false}` default: **`true`**
194
+ - <a id="ignore_comments"></a>`ignore_comments: {true|false}` default: **`true`**
143
195
 
144
196
  When `true`, ignores comments, such as `<!-- This is a comment -->`.
145
197
 
@@ -156,8 +208,10 @@ CompareXML.equivalent?(doc1, doc2, {ignore_comments: false, verbose: true, ...})
156
208
  <a href="/admin">Link</a>
157
209
 
158
210
 
211
+ ----------
159
212
 
160
- - `ignore_nodes: {css}` default: **`{}`**
213
+
214
+ - <a id="ignore_nodes"></a>`ignore_nodes: [css_selector1, css_selector2, ...]` default: **`[]`**
161
215
 
162
216
  When provided, ignores all **nodes** that satisfy a particular rule using [CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp).
163
217
 
@@ -174,8 +228,10 @@ CompareXML.equivalent?(doc1, doc2, {ignore_comments: false, verbose: true, ...})
174
228
  <a href="/admin"><i class="icon info"></i><b>Message:</b> Link</a>
175
229
 
176
230
 
231
+ ----------
232
+
177
233
 
178
- - `ignore_text_nodes: {true|false}` default: **`false`**
234
+ - <a id="ignore_text_nodes"></a>`ignore_text_nodes: {true|false}` default: **`false`**
179
235
 
180
236
  When `true`, ignores all text content. Text content is anything that is included between an opening and a closing tag, e.g. `<tag>THIS IS TEXT CONTENT</tag>`.
181
237
 
@@ -192,127 +248,68 @@ CompareXML.equivalent?(doc1, doc2, {ignore_comments: false, verbose: true, ...})
192
248
  <i class="icon"> </i> <b>Message:</b>
193
249
 
194
250
 
251
+ ----------
195
252
 
196
- - `collapse_whitespace: {true|false}` default: **`true`**
197
-
198
- When `true`, all text content within the document is trimmed (i.e. space removed from left and right) and whitespace is collapsed (i.e. tabs, new lines, multiple whitespace characters are replaced by a single whitespace).
199
-
200
- **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {collapse_whitespace: true})`
201
253
 
202
- **Example:** When `true` the following HTML strings are considered equal:
203
-
204
- <a href="/admin"> SOME TEXT CONTENT </a>
205
- <a href="/index"> SOME TEXT CONTENT </a>
206
-
207
- **Example:** When `true` the following HTML strings are considered equal:
208
-
209
- <html>
210
- <title>
211
- This is my title
212
- </title>
213
- </html>
214
-
215
- <html><title>This is my title</title></html>
216
-
217
-
218
-
219
- - `verbose: {true|false}` default: **`false`**
254
+ - <a id="verbose"></a>`verbose: {true|false}` default: **`false`**
220
255
 
221
256
  When `true`, instead of returning a boolean value `CompareXML.equivalent?` returns an array of all errors encountered when performing a comparison.
222
257
 
223
- > **Warning:** When `true`, the comparison takes longer! Not only because more processing is required to produce meaningful error messages, but also because in this mode, comparison does **NOT** stop when a first error is encountered, because the goal is to capture as many discrepancies as possible.
258
+ > **Warning:** When `true`, the comparison takes longer, both because more processing is required to produce meaningful differences and because, in this mode, comparison does **NOT** stop at the first difference: the goal is to capture as many differences as possible.
224
259
 
225
260
  **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {verbose: true})`
226
261
 
227
262
  **Example:** When `true` given the following HTML strings:
228
263
 
229
- <!DOCTYPE html>
230
- <html lang="en">
231
- <head><title>TITLE</title></head>
232
- <body>
233
- <h1>SOME HEADING</h1>
234
- <div id="content">
235
- <h2><i class="fa fa-cogs"></i> ANOTHER HEADING</h2>
236
- <p>Extra content</p>
237
- </div>
238
- <div class="window">
239
- <a href="/admin" rel="icon">Link</a>
240
- </div>
241
- <blockquote>Some fancy quote <cite>Author Name</cite></blockquote>
242
- <p>Some more text</p>
243
- <p>Yet more text</p>
244
- <p>Too much text</p>
245
- <!-- The footer is below -->
246
- <p class="footer">FOOTER</p>
247
- </body>
248
- </html>
249
-
250
- <!DOCTYPE html>
251
- <html lang="en">
252
- <head><title>ANOTHER TITLE</title></head>
253
- <body>
254
- <h1 id="main">SOME HEADING</h1>
255
- <div id="content">
256
- <h2><i class="fa fa-cogs"></i> ANOTHER HEADING</h2>
257
- <p>Extra content</p>
258
- </div>
259
- <div class="window">
260
- <a rel="button" href="/admin">Link</a>
261
- </div>
262
- <blockquote>Some fancy quote</blockquote>
263
- <p>Some more text</p>
264
- <p>Yet more text</p>
265
- <p>Too much text</p>
266
- <!-- This is the footer -->
267
- <div class="footer">FOOTER</div>
268
- </body>
269
- </html>
264
+ ![diffing](https://dl.dropboxusercontent.com/u/1001101/input.png)
270
265
 
271
266
  `CompareXML.equivalent?(doc1, doc2, {verbose: true})` will produce an array shown below.
272
267
 
273
- [
274
- "html:head:title",
275
- "TITLE",
276
- 10,
277
- "ANOTHER TITLE",
278
- "html:head:title"
279
- ],
280
- [
281
- "html:body:h1",
282
- nil,
283
- 2,
284
- "id=\"main\"",
285
- "html:body:h1"
286
- ],
287
- [
288
- "html:body:div(2):a",
289
- "rel=\"button\"",
290
- 4,
291
- "rel=\"icon\"",
292
- "html:body:div(2):a"
293
- ],
294
- [
295
- "html:body:blockquote:cite",
296
- "cite",
297
- 3,
298
- nil,
299
- "html:body:blockquote:cite"
300
- ],
301
- [
302
- "html:body:p(4)",
303
- "p",
304
- 8,
305
- "div",
306
- "html:body:div(3)"
307
- ]
308
-
309
- The structure of the array is as follows:
310
-
311
- [left_node_location, left_content, error_code, right_content, right_node_location]
268
+ ```ruby
269
+ [
270
+ {
271
+ node1: '<title>TITLE</title>',
272
+ node2: '<title>ANOTHER TITLE</title>',
273
+ diff1: 'TITLE',
274
+ diff2: 'ANOTHER TITLE',
275
+ },
276
+ {
277
+ node1: '<h1>SOME HEADING</h1>',
278
+ node2: '<h1 id="main">SOME HEADING</h1>',
279
+ diff1: nil,
280
+ diff2: 'id="main"',
281
+ },
282
+ {
283
+ node1: '<a href="/admin" rel="icon">Link</a>',
284
+ node2: '<a rel="button" href="/admin">Link</a>',
285
+ diff1: 'rel="icon"',
286
+ diff2: 'rel="button"',
287
+ },
288
+ {
289
+ node1: '<cite>Author Name</cite>',
290
+ node2: nil,
291
+ diff1: '<cite>Author Name</cite>',
292
+ diff2: nil,
293
+ },
294
+ {
295
+ node1: '<p class="footer">FOOTER</p>',
296
+ node2: '<div class="footer">FOOTER</div>',
297
+ diff1: 'p',
298
+ diff2: 'div',
299
+ }
300
+ ]
301
+ ```
302
+
303
+ The structure of each hash inside the array is:
304
+
305
+ node1: [Nokogiri::XML::Node] left node that contains the difference
306
+ node2: [Nokogiri::XML::Node] right node that contains the difference
307
+ diff1: [Nokogiri::XML::Node|String] left difference
308
+ diff2: [Nokogiri::XML::Node|String] right difference
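Each entry in the verbose result can be handled like any other Ruby hash. The sketch below uses a hand-written array with string values in place of real Nokogiri nodes, and the `describe` formatter is hypothetical; it only illustrates reading the `diff1` and `diff2` keys.

```ruby
# Hand-written stand-in for the array returned in verbose mode; in real use
# node1/node2 would be Nokogiri nodes rather than strings.
differences = [
  { node1: '<title>TITLE</title>', node2: '<title>ANOTHER TITLE</title>',
    diff1: 'TITLE', diff2: 'ANOTHER TITLE' },
  { node1: '<h1>SOME HEADING</h1>', node2: '<h1 id="main">SOME HEADING</h1>',
    diff1: nil, diff2: 'id="main"' }
]

# Hypothetical formatter: one line per difference, nil rendered as "(missing)".
def describe(difference)
  left  = difference[:diff1] || '(missing)'
  right = difference[:diff2] || '(missing)'
  "#{left} != #{right}"
end

report = differences.map { |d| describe(d) }
puts report
```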
312
309
 
313
310
  **Node location** of `html:body:p(4)` means that the element in question is `<p>`, its hierarchical ancestors are `html > body`, and it is the **4th** `<p>` tag. That is, it could be found in
314
311
 
315
- <html><body><p>one</p>...<p>two</p>...<p>three</p>...<p>TARGET</p></body></html>
312
+ <html><body><p>one</p>...<p>two</p>...<p>three</p>...<p>TARGET</p></body></html>
316
313
 
317
314
  > **Note:** `p(4)` means that it is the fourth tag of type `<p>`, but there could be many other tags of other types between `p(3)` and `p(4)`.
318
315
 
data/lib/compare-xml.rb CHANGED
@@ -5,67 +5,71 @@ module CompareXML
5
5
 
6
6
  # default options used by the module; all of these can be overridden
7
7
  DEFAULTS_OPTS = {
8
+ # when true, trims and collapses whitespace in text nodes and comments to a single space
9
+ # when false, all whitespace is preserved as it is without any changes
10
+ collapse_whitespace: true,
11
+
8
12
  # when true, attribute order is not important (all attributes are sorted before comparison)
9
13
  # when false, attributes are compared in order and comparison stops on the first mismatch
10
14
  ignore_attr_order: true,
11
15
 
16
+ # contains an array of user specified strings that is used to ignore any attributes
17
+ # whose content contains a string from this array (e.g. "good automobile" contains "mobile")
18
+ ignore_attr_content: [],
19
+
12
20
  # contains an array of user-specified CSS rules used to perform attribute exclusions
13
21
  # for this to work, a CSS rule MUST contain the attribute to be excluded,
14
22
  # i.e. a[href] will exclude all "href" attributes contained in <a> tags.
15
- ignore_attrs: {},
23
+ ignore_attrs: [],
16
24
 
17
25
  # when true ignores XML and HTML comments
18
26
  # when false, all comments are compared to their counterparts
19
27
  ignore_comments: true,
20
28
 
21
29
  # contains an array of user-specified CSS rules used to perform node exclusions
22
- ignore_nodes: {},
30
+ ignore_nodes: [],
23
31
 
24
32
  # when true, ignores all text nodes (although blank text nodes are always ignored)
25
33
  # when false, all text nodes are compared to their counterparts (except the empty ones)
26
34
  ignore_text_nodes: false,
27
35
 
28
- # when true, trims and collapses whitespace in text nodes and comments to a single space
29
- # when false, all whitespace is preserved as it is without any changes
30
- collapse_whitespace: true,
31
-
32
36
  # when true, provides a list of all error messages encountered in comparisons
33
37
  # when false, execution stops when the first error is encountered with no error messages
34
38
  verbose: false
35
39
  }
36
40
 
37
- # used internally only in order to differentiate equivalence for inequivalence
38
- EQUIVALENT = 1
39
41
 
40
- # a list of all possible inequivalence types for nodes
41
- # these are returned in the errors array to differentiate error types.
42
- MISSING_ATTRIBUTE = 2 # attribute is missing its counterpart
43
- MISSING_NODE = 3 # node is missing its counterpart
44
- UNEQUAL_ATTRIBUTES = 4 # attributes are not equal
45
- UNEQUAL_COMMENTS = 5 # comment contents are not equal
46
- UNEQUAL_DOCUMENTS = 6 # document types are not equal
47
- UNEQUAL_ELEMENTS = 7 # nodes have the same type but are not equal
48
- UNEQUAL_NODES_TYPES = 8 # nodes do not have the same type
49
- UNEQUAL_TEXT_CONTENTS = 9 # text contents are not equal
42
+ class << self
50
43
 
44
+ # used internally only, to differentiate equivalence from inequivalence
45
+ EQUIVALENT = 1
51
46
 
52
- class << self
47
+ # a list of all possible inequivalence types for nodes
48
+ # these are returned in the differences array to differentiate error types.
49
+ MISSING_ATTRIBUTE = 2 # attribute is missing its counterpart
50
+ MISSING_NODE = 3 # node is missing its counterpart
51
+ UNEQUAL_ATTRIBUTES = 4 # attributes are not equal
52
+ UNEQUAL_COMMENTS = 5 # comment contents are not equal
53
+ UNEQUAL_DOCUMENTS = 6 # document types are not equal
54
+ UNEQUAL_ELEMENTS = 7 # nodes have the same type but are not equal
55
+ UNEQUAL_NODES_TYPES = 8 # nodes do not have the same type
56
+ UNEQUAL_TEXT_CONTENTS = 9 # text node contents are not equal
53
57
 
54
58
  ##
55
59
  # Determines whether two XML documents or fragments are equal to each other.
56
60
  # The two parameters could be any type of XML documents, or fragments
57
61
  # or node sets or even text nodes - any subclass of Nokogiri::XML::Node.
58
62
  #
59
- # @param [Nokogiri::XML::Node] n1 left attribute
60
- # @param [Nokogiri::XML::Node] n2 right attribute
63
+ # @param [Nokogiri::XML::Element] n1 left node element
64
+ # @param [Nokogiri::XML::Element] n2 right node element
61
65
  # @param [Hash] opts user-overridden options
62
66
  #
63
- # @return true if equal, [Array] errors otherwise
67
+ # @return true if equal, [Array] differences otherwise
64
68
  #
65
69
  def equivalent?(n1, n2, opts = {})
66
- opts, errors = DEFAULTS_OPTS.merge(opts), []
67
- result = compareNodes(n1, n2, opts, errors)
68
- opts[:verbose] ? errors : result == EQUIVALENT
70
+ opts, differences = DEFAULTS_OPTS.merge(opts), []
71
+ result = compareNodes(n1, n2, opts, differences)
72
+ opts[:verbose] ? differences : result == EQUIVALENT
69
73
  end
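The option handling in `equivalent?` relies on `Hash#merge`: user-supplied keys override the defaults, and anything not supplied falls back to the default value. A small sketch of that pattern follows, assuming a trimmed-down defaults hash named `DEFAULTS` and a hypothetical wrapper `effective_options` (neither name comes from the gem).

```ruby
# Trimmed-down copy of the defaults for illustration; only three keys shown.
DEFAULTS = { collapse_whitespace: true, ignore_attrs: [], verbose: false }.freeze

# Same pattern as equivalent?: user options win, unspecified keys fall back.
def effective_options(user_opts = {})
  DEFAULTS.merge(user_opts)
end

opts = effective_options(verbose: true, ignore_attrs: ['a[rel]'])
puts opts.inspect
```

Because `merge` returns a new hash, the defaults themselves are left untouched between calls.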
70
74
 
71
75
 
@@ -75,36 +79,38 @@ module CompareXML
75
79
  # Compares two nodes for equivalence. The nodes could be any subclass
76
80
  # of Nokogiri::XML::Node including node sets and document fragments.
77
81
  #
78
- # @param [Nokogiri::XML::Node] n1 left attribute
79
- # @param [Nokogiri::XML::Node] n2 right attribute
82
+ # @param [Nokogiri::XML::Node] n1 left node
83
+ # @param [Nokogiri::XML::Node] n2 right node
80
84
  # @param [Hash] opts user-overridden options
81
- # @param [Array] errors inequivalence messages
85
+ # @param [Array] differences inequivalence messages
86
+ # @param [int] status comparison status code (EQUIVALENT by default)
82
87
  #
83
88
  # @return type of equivalence (from equivalence constants)
84
89
  #
85
- def compareNodes(n1, n2, opts, errors, status = EQUIVALENT)
90
+ def compareNodes(n1, n2, opts, differences, status = EQUIVALENT)
86
91
  if n1.class == n2.class
87
92
  case n1
88
93
  when Nokogiri::XML::Comment
89
- compareCommentNodes(n1, n2, opts, errors)
94
+ compareCommentNodes(n1, n2, opts, differences)
90
95
  when Nokogiri::HTML::Document
91
- compareDocumentNodes(n1, n2, opts, errors)
96
+ compareDocumentNodes(n1, n2, opts, differences)
92
97
  when Nokogiri::XML::Element
93
- status = compareElementNodes(n1, n2, opts, errors)
98
+ status = compareElementNodes(n1, n2, opts, differences)
94
99
  when Nokogiri::XML::Text
95
- status = compareTextNodes(n1, n2, opts, errors)
100
+ status = compareTextNodes(n1, n2, opts, differences)
96
101
  else
97
- status = compareChildren(n1.children, n2.children, opts, errors)
102
+ if n1.is_a?(Nokogiri::XML::Node) || n1.is_a?(Nokogiri::XML::NodeSet)
103
+ status = compareChildren(n1.children, n2.children, opts, differences)
104
+ else
105
+ raise 'Comparison only allowed between objects of type Nokogiri::XML::Node and Nokogiri::XML::NodeSet.'
106
+ end
98
107
  end
99
- elsif n1.nil?
100
- status = MISSING_NODE
101
- errors << [nodePath(n2), nil, status, n2.name, nodePath(n2)] if opts[:verbose]
102
- elsif n2.nil?
108
+ elsif n1.nil? || n2.nil?
103
109
  status = MISSING_NODE
104
- errors << [nodePath(n1), n1.name, status, nil, nodePath(n1)] if opts[:verbose]
110
+ addDifference(n1, n2, n1, n2, opts, differences)
105
111
  else
106
112
  status = UNEQUAL_NODES_TYPES
107
- errors << [nodePath(n1), n1.class, status, n2.class, nodePath(n2)] if opts[:verbose]
113
+ addDifference(n1, n2, n1, n2, opts, differences)
108
114
  end
109
115
  status
110
116
  end
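The branching in `compareNodes` reduces to three cases: same class (delegate to a type-specific comparison), one side missing, or mismatched classes. The sketch below reproduces only that skeleton with plain Ruby objects; `classify` and the `same_text` lambda are hypothetical stand-ins, and the status codes are copied from the constants shown in this diff.

```ruby
# Stand-in status codes mirroring the constants in the diff.
EQUIVALENT          = 1
MISSING_NODE        = 3
UNEQUAL_NODES_TYPES = 8

# Simplified decision skeleton: the per-type comparison is passed in as a
# block, because the real type-specific methods are out of scope here.
def classify(left, right)
  if left.class == right.class
    yield(left, right)
  elsif left.nil? || right.nil?
    MISSING_NODE
  else
    UNEQUAL_NODES_TYPES
  end
end

same_text = ->(a, b) { a == b ? EQUIVALENT : 9 } # 9 = UNEQUAL_TEXT_CONTENTS

puts classify('a', 'a', &same_text) # same class, equal content
puts classify('a', nil, &same_text) # one side missing
puts classify('a', :a, &same_text)  # different classes
```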
@@ -113,20 +119,21 @@ module CompareXML
113
119
  ##
114
120
  # Compares two nodes of type Nokogiri::XML::Comment.
115
121
  #
116
- # @param [Nokogiri::XML::Comment] n1 left attribute
117
- # @param [Nokogiri::XML::Comment] n2 right attribute
122
+ # @param [Nokogiri::XML::Comment] n1 left comment
123
+ # @param [Nokogiri::XML::Comment] n2 right comment
118
124
  # @param [Hash] opts user-overridden options
119
- # @param [Array] errors inequivalence messages
125
+ # @param [Array] differences inequivalence messages
126
+ # @param [int] status comparison status code (EQUIVALENT by default)
120
127
  #
121
128
  # @return type of equivalence (from equivalence constants)
122
129
  #
123
- def compareCommentNodes(n1, n2, opts, errors, status = EQUIVALENT)
130
+ def compareCommentNodes(n1, n2, opts, differences, status = EQUIVALENT)
124
131
  return status if opts[:ignore_comments]
125
132
  t1, t2 = n1.content, n2.content
126
133
  t1, t2 = collapse(t1), collapse(t2) if opts[:collapse_whitespace]
127
134
  unless t1 == t2
128
135
  status = UNEQUAL_COMMENTS
129
- errors << [nodePath(n1.parent), t1, status, t2, nodePath(n2.parent)] if opts[:verbose]
136
+ addDifference(n1, n2, t1, t2, opts, differences)
130
137
  end
131
138
  status
132
139
  end
@@ -135,19 +142,20 @@ module CompareXML
135
142
  ##
136
143
  # Compares two nodes of type Nokogiri::HTML::Document.
137
144
  #
138
- # @param [Nokogiri::XML::Document] n1 left attribute
139
- # @param [Nokogiri::XML::Document] n2 right attribute
145
+ # @param [Nokogiri::XML::Document] n1 left document
146
+ # @param [Nokogiri::XML::Document] n2 right document
140
147
  # @param [Hash] opts user-overridden options
141
- # @param [Array] errors inequivalence messages
148
+ # @param [Array] differences inequivalence messages
149
+ # @param [int] status comparison status code (EQUIVALENT by default)
142
150
  #
143
151
  # @return type of equivalence (from equivalence constants)
144
152
  #
145
- def compareDocumentNodes(n1, n2, opts, errors, status = EQUIVALENT)
153
+ def compareDocumentNodes(n1, n2, opts, differences, status = EQUIVALENT)
146
154
  if n1.name == n2.name
147
- status = compareChildren(n1.children, n2.children, opts, errors)
155
+ status = compareChildren(n1.children, n2.children, opts, differences)
148
156
  else
149
157
  status = UNEQUAL_DOCUMENTS
150
- errors << [nodePath(n1), n1, status, n2, nodePath(n2)] if opts[:verbose]
158
+ addDifference(n1, n2, n1, n2, opts, differences)
151
159
  end
152
160
  status
153
161
  end
@@ -159,11 +167,12 @@ module CompareXML
159
167
  # @param [Nokogiri::XML::NodeSet] n1_set left set of Nokogiri::XML::Node elements
160
168
  # @param [Nokogiri::XML::NodeSet] n2_set right set of Nokogiri::XML::Node elements
161
169
  # @param [Hash] opts user-overridden options
162
- # @param [Array] errors inequivalence messages
170
+ # @param [Array] differences inequivalence messages
171
+ # @param [int] status comparison status code (EQUIVALENT by default)
163
172
  #
164
173
  # @return type of equivalence (from equivalence constants)
165
174
  #
166
- def compareChildren(n1_set, n2_set, opts, errors, status = EQUIVALENT)
175
+ def compareChildren(n1_set, n2_set, opts, differences, status = EQUIVALENT)
167
176
  i = 0; j = 0
168
177
  while i < n1_set.length || j < n2_set.length
169
178
  if !n1_set[i].nil? && nodeExcluded?(n1_set[i], opts)
@@ -171,7 +180,7 @@ module CompareXML
171
180
  elsif !n2_set[j].nil? && nodeExcluded?(n2_set[j], opts)
172
181
  j += 1 # increment counter if right node is excluded
173
182
  else
174
- result = compareNodes(n1_set[i], n2_set[j], opts, errors)
183
+ result = compareNodes(n1_set[i], n2_set[j], opts, differences)
175
184
  status = result unless result == EQUIVALENT
176
185
 
177
186
  # return false so that this subtree could halt comparison on error
@@ -194,22 +203,23 @@ module CompareXML
194
203
  # - compares element attributes
195
204
  # - recursively compares element children
196
205
  #
197
- # @param [Nokogiri::XML::Element] n1 left attribute
198
- # @param [Nokogiri::XML::Element] n2 right attribute
206
+ # @param [Nokogiri::XML::Element] n1 left node element
207
+ # @param [Nokogiri::XML::Element] n2 right node element
199
208
  # @param [Hash] opts user-overridden options
200
- # @param [Array] errors inequivalence messages
209
+ # @param [Array] differences inequivalence messages
210
+ # @param [int] status comparison status code (EQUIVALENT by default)
201
211
  #
202
212
  # @return type of equivalence (from equivalence constants)
203
213
  #
204
- def compareElementNodes(n1, n2, opts, errors, status = EQUIVALENT)
214
+ def compareElementNodes(n1, n2, opts, differences, status = EQUIVALENT)
205
215
  if n1.name == n2.name
206
- result = compareAttributeSets(n1.attribute_nodes, n2.attribute_nodes, opts, errors)
207
- status = result unless result == EQUIVALENT
208
- result = compareChildren(n1.children, n2.children, opts, errors)
216
+ result = compareAttributeSets(n1, n2, n1.attribute_nodes, n2.attribute_nodes, opts, differences)
217
+ return result unless result == EQUIVALENT
218
+ result = compareChildren(n1.children, n2.children, opts, differences)
209
219
  status = result unless result == EQUIVALENT
210
220
  else
211
221
  status = UNEQUAL_ELEMENTS
212
- errors << [nodePath(n1), n1.name, status, n2.name, nodePath(n2)] if opts[:verbose]
222
+ addDifference(n1, n2, n1.name, n2.name, opts, differences)
213
223
  end
214
224
  status
215
225
  end
@@ -218,41 +228,44 @@ module CompareXML
218
228
  ##
219
229
  # Compares two nodes of type Nokogiri::XML::Text.
220
230
  #
221
- # @param [Nokogiri::XML::Text] n1 left attribute
222
- # @param [Nokogiri::XML::Text] n2 right attribute
231
+ # @param [Nokogiri::XML::Text] n1 left text node
232
+ # @param [Nokogiri::XML::Text] n2 right text node
223
233
  # @param [Hash] opts user-overridden options
224
- # @param [Array] errors inequivalence messages
234
+ # @param [Array] differences inequivalence messages
235
+ # @param [int] status comparison status code (EQUIVALENT by default)
225
236
  #
226
237
  # @return type of equivalence (from equivalence constants)
227
238
  #
228
- def compareTextNodes(n1, n2, opts, errors, status = EQUIVALENT)
239
+ def compareTextNodes(n1, n2, opts, differences, status = EQUIVALENT)
229
240
  return status if opts[:ignore_text_nodes]
230
241
  t1, t2 = n1.content, n2.content
231
242
  t1, t2 = collapse(t1), collapse(t2) if opts[:collapse_whitespace]
232
243
  unless t1 == t2
233
244
  status = UNEQUAL_TEXT_CONTENTS
234
- errors << [nodePath(n1.parent), t1, status, t2, nodePath(n2.parent)] if opts[:verbose]
245
+ addDifference(n1.parent, n2.parent, t1, t2, opts, differences)
235
246
  end
236
247
  status
237
248
  end
238
249
 
239
250
 
240
251
  ##
241
- # Compares two sets of Nokogiri::XML::Node attributes.
252
+ # Compares two sets of Nokogiri::XML::Element attributes.
242
253
  #
254
+ # @param [Nokogiri::XML::Element] n1 left node element
255
+ # @param [Nokogiri::XML::Element] n2 right node element
243
256
  # @param [Array] a1_set left attribute set
244
257
  # @param [Array] a2_set right attribute set
245
258
  # @param [Hash] opts user-overridden options
246
- # @param [Array] errors inequivalence messages
259
+ # @param [Array] differences inequivalence messages
247
260
  #
248
261
  # @return type of equivalence (from equivalence constants)
249
262
  #
250
- def compareAttributeSets(a1_set, a2_set, opts, errors)
263
+ def compareAttributeSets(n1, n2, a1_set, a2_set, opts, differences)
251
264
  return false unless a1_set.length == a2_set.length || opts[:verbose]
252
265
  if opts[:ignore_attr_order]
253
- compareSortedAttributeSets(a1_set, a2_set, opts, errors)
266
+ compareSortedAttributeSets(n1, n2, a1_set, a2_set, opts, differences)
254
267
  else
255
- compareUnsortedAttributeSets(a1_set, a2_set, opts, errors)
268
+ compareUnsortedAttributeSets(n1, n2, a1_set, a2_set, opts, differences)
256
269
  end
257
270
  end
258
271
 
@@ -262,29 +275,34 @@ module CompareXML
262
275
  # When the attributes are sorted, only attributes of the same type are compared
263
276
  # to each other, and missing attributes can be easily detected.
264
277
  #
278
+ # @param [Nokogiri::XML::Element] n1 left node element
279
+ # @param [Nokogiri::XML::Element] n2 right node element
265
280
  # @param [Array] a1_set left attribute set
266
281
  # @param [Array] a2_set right attribute set
267
282
  # @param [Hash] opts user-overridden options
268
- # @param [Array] errors inequivalence messages
283
+ # @param [Array] differences inequivalence messages
284
+ # @param [int] status comparison status code (EQUIVALENT by default)
269
285
  #
270
286
  # @return type of equivalence (from equivalence constants)
271
287
  #
272
- def compareSortedAttributeSets(a1_set, a2_set, opts, errors, status = EQUIVALENT)
288
+ def compareSortedAttributeSets(n1, n2, a1_set, a2_set, opts, differences, status = EQUIVALENT)
273
289
  a1_set, a2_set = a1_set.sort_by { |a| a.name }, a2_set.sort_by { |a| a.name }
274
290
  i = j = 0
275
291
 
276
292
  while i < a1_set.length || j < a2_set.length
293
+
277
294
  if a1_set[i].nil?
278
- result = compareAttributes(nil, a2_set[j], opts, errors); j += 1
295
+ result = compareAttributes(n1, n2, nil, a2_set[j], opts, differences); j += 1
279
296
  elsif a2_set[j].nil?
280
- result = compareAttributes(a1_set[i], nil, opts, errors); i += 1
297
+ result = compareAttributes(n1, n2, a1_set[i], nil, opts, differences); i += 1
281
298
  elsif a1_set[i].name < a2_set[j].name
282
- result = compareAttributes(a1_set[i], nil, opts, errors); i += 1
299
+ result = compareAttributes(n1, n2, a1_set[i], nil, opts, differences); i += 1
283
300
  elsif a1_set[i].name > a2_set[j].name
284
- result = compareAttributes(nil, a2_set[j], opts, errors); j += 1
301
+ result = compareAttributes(n1, n2, nil, a2_set[j], opts, differences); j += 1
285
302
  else
286
- result = compareAttributes(a1_set[i], a2_set[j], opts, errors); i += 1; j += 1
303
+ result = compareAttributes(n1, n2, a1_set[i], a2_set[j], opts, differences); i += 1; j += 1
287
304
  end
305
+
288
306
  status = result unless result == EQUIVALENT
289
307
  break unless status == EQUIVALENT || opts[:verbose]
290
308
  end
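Sorting both attribute sets by name turns the comparison into a merge-style walk, which is what makes missing attributes easy to detect. The sketch below models attributes as `[name, value]` pairs; `sorted_attr_diff` is a hypothetical function that collects the mismatches instead of calling the gem's `compareAttributes`.

```ruby
# Model attributes as [name, value] pairs and report what a sorted walk finds.
def sorted_attr_diff(left, right)
  left, right = left.sort_by(&:first), right.sort_by(&:first)
  i = j = 0
  found = []
  while i < left.length || j < right.length
    l, r = left[i], right[j]
    if l.nil? || (r && l.first > r.first)
      found << [nil, r]; j += 1 # attribute exists only on the right
    elsif r.nil? || l.first < r.first
      found << [l, nil]; i += 1 # attribute exists only on the left
    else
      found << [l, r] unless l.last == r.last
      i += 1; j += 1            # same name on both sides: values were compared
    end
  end
  found
end

left  = [['href', '/admin'], ['rel', 'icon']]
right = [['rel', 'button'], ['href', '/admin'], ['id', 'main']]
p sorted_attr_diff(left, right)
```

Here `href` matches, `id` is reported as present only on the right, and `rel` is reported as a value mismatch, regardless of the order in which the attributes were written.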
@@ -293,21 +311,24 @@ module CompareXML
293
311
 
294
312
 
295
313
  ##
296
- # Compares two sets of Nokogiri::XML::Node attributes without sorting them.
314
+ # Compares two sets of Nokogiri::XML::Element attributes without sorting them.
297
315
  # As a result attributes of different types may be compared, and even if all
298
316
  # attributes are identical in both sets, if their order is different,
299
317
  # the comparison will stop as soon as two unequal attributes are found.
300
318
  #
319
+ # @param [Nokogiri::XML::Element] n1 left node element
320
+ # @param [Nokogiri::XML::Element] n2 right node element
301
321
  # @param [Array] a1_set left attribute set
302
322
  # @param [Array] a2_set right attribute set
303
323
  # @param [Hash] opts user-overridden options
304
- # @param [Array] errors inequivalence messages
324
+ # @param [Array] differences inequivalence messages
325
+ # @param [int] status comparison status code (EQUIVALENT by default)
305
326
  #
306
327
  # @return type of equivalence (from equivalence constants)
307
328
  #
308
- def compareUnsortedAttributeSets(a1_set, a2_set, opts, errors, status = EQUIVALENT)
329
+ def compareUnsortedAttributeSets(n1, n2, a1_set, a2_set, opts, differences, status = EQUIVALENT)
309
330
  [a1_set.length, a2_set.length].max.times do |i|
310
- result = compareAttributes(a1_set[i], a2_set[i], opts, errors)
331
+ result = compareAttributes(n1, n2, a1_set[i], a2_set[i], opts, differences)
311
332
  status = result unless result == EQUIVALENT
312
333
  break unless status == EQUIVALENT
313
334
  end
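
To illustrate the doc comment above: without sorting, attributes are compared position by position, so two sets with identical attributes in a different order are reported as different. The sketch below is a simplification under stated assumptions. `PosAttr` and `first_positional_mismatch` are hypothetical names, and the sketch returns an index where the gem returns an equivalence constant.

```ruby
# Hypothetical stand-in for Nokogiri::XML::Attr; structs compare by member values.
PosAttr = Struct.new(:name, :value)

# Compares attributes position by position and stops at the first mismatch.
# Returns the index of the first mismatch, or nil if both lists match.
def first_positional_mismatch(a1_set, a2_set)
  [a1_set.length, a2_set.length].max.times do |i|
    return i unless a1_set[i] == a2_set[i]
  end
  nil
end

id_attr    = PosAttr.new('id', '1')
class_attr = PosAttr.new('class', 'x')

first_positional_mismatch([id_attr, class_attr], [id_attr, class_attr]) # same order
first_positional_mismatch([id_attr, class_attr], [class_attr, id_attr]) # reordered
```

The first call finds no mismatch. The second stops at index 0 even though both sets hold the same attributes.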
@@ -318,29 +339,33 @@ module CompareXML
318
339
  ##
319
340
  # Compares two attributes by name and value.
320
341
  #
342
+ # @param [Nokogiri::XML::Element] n1 left node element
343
+ # @param [Nokogiri::XML::Element] n2 right node element
321
344
  # @param [Nokogiri::XML::Attr] a1 left attribute
322
345
  # @param [Nokogiri::XML::Attr] a2 right attribute
323
346
  # @param [Hash] opts user-overridden options
324
- # @param [Array] errors inequivalence messages
347
+ # @param [Array] differences inequivalence messages
348
+ # @param [int] status comparison status code (EQUIVALENT by default)
325
349
  #
326
350
  # @return type of equivalence (from equivalence constants)
327
351
  #
328
- def compareAttributes(a1, a2, opts, errors, status = EQUIVALENT)
352
+ def compareAttributes(n1, n2, a1, a2, opts, differences, status = EQUIVALENT)
329
353
  if a1.nil?
330
354
  status = MISSING_ATTRIBUTE
331
- errors << [nodePath(a2.parent), nil, status, "#{a2.name}=\"#{a2.value}\"", nodePath(a2.parent)] if opts[:verbose]
355
+ addDifference(n1, n2, nil, "#{a2.name}=\"#{a2.value}\"", opts, differences)
332
356
  elsif a2.nil?
333
357
  status = MISSING_ATTRIBUTE
334
- errors << [nodePath(a1.parent), "#{a1.name}=\"#{a1.value}\"", status, nil, nodePath(a1.parent)] if opts[:verbose]
358
+ addDifference(n1, n2, "#{a1.name}=\"#{a1.value}\"", nil, opts, differences)
335
359
  elsif a1.name == a2.name
336
360
  return status if attrsExcluded?(a1, a2, opts)
361
+ return status if attrContentExcluded?(a1, a2, opts)
337
362
  if a1.value != a2.value
338
363
  status = UNEQUAL_ATTRIBUTES
339
- errors << [nodePath(a1.parent), "#{a1.name}=\"#{a1.value}\"", status, "#{a2.name}=\"#{a2.value}\"", nodePath(a2.parent)] if opts[:verbose]
364
+ addDifference(n1, n2, "#{a1.name}=\"#{a1.value}\"", "#{a2.name}=\"#{a2.value}\"", opts, differences)
340
365
  end
341
366
  else
342
367
  status = UNEQUAL_ATTRIBUTES
343
- errors << [nodePath(a1.parent), a1.name, status, a2.name, nodePath(a2.parent)] if opts[:verbose]
368
+ addDifference(n1, n2, "#{a1.name}=\"#{a1.value}\"", "#{a2.name}=\"#{a2.value}\"", opts, differences)
344
369
  end
345
370
  status
346
371
  end
@@ -353,7 +378,7 @@ module CompareXML
353
378
  # Several types of nodes are considered ignored:
354
379
  # - comments (only in +ignore_comments+ mode)
355
380
  # - text nodes (only in +ignore_text_nodes+ mode OR when a text node is empty)
356
- # - node matches a user-specified css rule from +ignore_comments+
381
+ # - node matches a user-specified css rule from +ignore_nodes+
357
382
  #
358
383
  # @param [Nokogiri::XML::Node] n node being tested for exclusion
359
384
  # @param [Hash] opts user-overridden options
@@ -361,11 +386,10 @@ module CompareXML
361
386
  # @return true if excluded, false otherwise
362
387
  #
363
388
  def nodeExcluded?(n, opts)
389
+ return true if n.is_a?(Nokogiri::XML::DTD)
364
390
  return true if n.is_a?(Nokogiri::XML::Comment) && opts[:ignore_comments]
365
391
  return true if n.is_a?(Nokogiri::XML::Text) && (opts[:ignore_text_nodes] || collapse(n.content).empty?)
366
- opts[:ignore_nodes].each do |css|
367
- return true if n.xpath('../*').css(css).include?(n)
368
- end
392
+ opts[:ignore_nodes].each { |css| return true if n.parent.css(css).include? n }
369
393
  false
370
394
  end
371
395
 
@@ -393,43 +417,43 @@ module CompareXML
393
417
 
394
418
 
395
419
  ##
396
- # Produces the hierarchical ancestral path of a node in the following format: <html:body:div(3):h2:b(2)>.
397
- # This means that the element is located in:
398
- #
399
- # <html>
400
- # <body>
401
- # <div>...</div>
402
- # <div>...</div>
403
- # <div>
404
- # <h2>
405
- # <b>...</b>
406
- # <b>TARGET</b>
407
- # </h2>
408
- # </div>
409
- # </body>
410
- # </html>
411
- #
412
- # Note that the counts of element locations only apply to elements of the same type. For example, div(3) means
413
- # that it is the 3rd <div> element in the <body>, but there could be many other elements in between the three
414
- # <div> elements.
415
- #
416
- # When +ignore_comments+ mode is disabled, mismatching comments will show up as <...:comment>.
417
- #
418
- # @param [Nokogiri::XML::Node] n node for which to determine a hierarchical path
420
+ # Checks whether two given attributes should be excluded, based on their content.
421
+ # Checks whether both attributes contain content that should be excluded, and
422
+ # returns true only if an excluded string is contained in both attribute values.
423
+ #
424
+ # @param [Nokogiri::XML::Attr] a1 left attribute
425
+ # @param [Nokogiri::XML::Attr] a2 right attribute
426
+ # @param [Hash] opts user-overridden options
419
427
  #
420
428
  # @return true if excluded, false otherwise
421
429
  #
422
- def nodePath(n)
423
- name = n.name
430
+ def attrContentExcluded?(a1, a2, opts)
431
+ a1_excluded, a2_excluded = false, false
432
+ opts[:ignore_attr_content].each do |content|
433
+ a1_excluded = a1_excluded || a1.value.include?(content)
434
+ a2_excluded = a2_excluded || a2.value.include?(content)
435
+ return true if a1_excluded && a2_excluded
436
+ end
437
+ false
438
+ end
424
439
 
425
- # find the index of the node if there are several of the same type
426
- siblings = n.xpath("../#{name}")
427
- name += "(#{siblings.index(n) + 1})" if siblings.length > 1
428
440
 
429
- if defined? n.parent
430
- status = "#{nodePath(n.parent)}:#{name}"
431
- status = status[1..-1] if status[0] == ':'
432
- status
441
+ ##
442
+ # Records a difference between two nodes when +verbose+ mode is enabled,
443
+ # i.e. appends a hash holding both nodes and both diffing values to +differences+.
444
+ #
445
+ # @param [Nokogiri::XML::Node] node1 left node
446
+ # @param [Nokogiri::XML::Node] node2 right node
447
+ # @param [String] diff1 left diffing value
448
+ # @param [String] diff2 right diffing value
449
+ # @param [Hash] opts user-overridden options
450
+ # @param [Array] differences inequivalence messages
451
+ #
452
+ # @return the +differences+ array in verbose mode, nil otherwise
453
+ #
454
+ def addDifference(node1, node2, diff1, diff2, opts, differences)
455
+ if opts[:verbose]
456
+ differences << {node1: node1, node2: node2, diff1: diff1, diff2: diff2}
433
457
  end
434
458
  end
435
459
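
The two helpers added in this hunk, `attrContentExcluded?` and `addDifference`, can be tried in isolation. The sketch below mirrors their logic as it appears in the diff, with some assumptions: `ContentAttr` is a hypothetical stand-in for `Nokogiri::XML::Attr`, the snake_case method names are illustrative, and the ignore list is passed directly instead of through `opts[:ignore_attr_content]`.

```ruby
# Hypothetical stand-in for Nokogiri::XML::Attr.
ContentAttr = Struct.new(:name, :value)

# Mirrors attrContentExcluded?: true once both values have matched
# some entry of the ignore list.
def attr_content_excluded?(a1, a2, ignore_attr_content)
  a1_excluded = a2_excluded = false
  ignore_attr_content.each do |content|
    a1_excluded ||= a1.value.include?(content)
    a2_excluded ||= a2.value.include?(content)
    return true if a1_excluded && a2_excluded
  end
  false
end

# Mirrors addDifference: records a hash only in verbose mode.
def add_difference(node1, node2, diff1, diff2, opts, differences)
  differences << { node1: node1, node2: node2, diff1: diff1, diff2: diff2 } if opts[:verbose]
end

a = ContentAttr.new('href', '/session/abc123')
b = ContentAttr.new('href', '/session/xyz789')

attr_content_excluded?(a, b, ['/session/'])  # true: both values contain the string
attr_content_excluded?(a, b, ['abc'])        # false: only the left value matches
attr_content_excluded?(a, b, ['abc', 'xyz']) # true: the flags accumulate across entries
```

The last call is worth noting. Because the two flags persist across iterations, the pair is excluded when each value matches a different entry, which is broader than the doc comment's "an excluded string is contained in both attribute values".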
 
@@ -1,3 +1,3 @@
1
1
  module CompareXML
2
- VERSION = '0.5.2'
2
+ VERSION = '0.6'
3
3
  end
metadata CHANGED
@@ -1,55 +1,55 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: compare-xml
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.5.2
4
+ version: '0.6'
5
5
  platform: ruby
6
6
  authors:
7
7
  - Vadim Kononov
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2016-04-06 00:00:00.000000000 Z
11
+ date: 2016-04-29 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: bundler
15
15
  requirement: !ruby/object:Gem::Requirement
16
16
  requirements:
17
- - - ~>
17
+ - - "~>"
18
18
  - !ruby/object:Gem::Version
19
19
  version: '1.11'
20
20
  type: :development
21
21
  prerelease: false
22
22
  version_requirements: !ruby/object:Gem::Requirement
23
23
  requirements:
24
- - - ~>
24
+ - - "~>"
25
25
  - !ruby/object:Gem::Version
26
26
  version: '1.11'
27
27
  - !ruby/object:Gem::Dependency
28
28
  name: rake
29
29
  requirement: !ruby/object:Gem::Requirement
30
30
  requirements:
31
- - - ~>
31
+ - - "~>"
32
32
  - !ruby/object:Gem::Version
33
33
  version: '11.1'
34
34
  type: :development
35
35
  prerelease: false
36
36
  version_requirements: !ruby/object:Gem::Requirement
37
37
  requirements:
38
- - - ~>
38
+ - - "~>"
39
39
  - !ruby/object:Gem::Version
40
40
  version: '11.1'
41
41
  - !ruby/object:Gem::Dependency
42
42
  name: nokogiri
43
43
  requirement: !ruby/object:Gem::Requirement
44
44
  requirements:
45
- - - ~>
45
+ - - "~>"
46
46
  - !ruby/object:Gem::Version
47
47
  version: '1.6'
48
48
  type: :runtime
49
49
  prerelease: false
50
50
  version_requirements: !ruby/object:Gem::Requirement
51
51
  requirements:
52
- - - ~>
52
+ - - "~>"
53
53
  - !ruby/object:Gem::Version
54
54
  version: '1.6'
55
55
  description: CompareXML is a fast, lightweight and feature-rich tool that will solve
@@ -61,7 +61,7 @@ executables: []
61
61
  extensions: []
62
62
  extra_rdoc_files: []
63
63
  files:
64
- - .gitignore
64
+ - ".gitignore"
65
65
  - Gemfile
66
66
  - LICENSE.txt
67
67
  - README.md
@@ -81,17 +81,17 @@ require_paths:
81
81
  - lib
82
82
  required_ruby_version: !ruby/object:Gem::Requirement
83
83
  requirements:
84
- - - '>='
84
+ - - ">="
85
85
  - !ruby/object:Gem::Version
86
86
  version: '0'
87
87
  required_rubygems_version: !ruby/object:Gem::Requirement
88
88
  requirements:
89
- - - '>='
89
+ - - ">="
90
90
  - !ruby/object:Gem::Version
91
91
  version: '0'
92
92
  requirements: []
93
93
  rubyforge_project:
94
- rubygems_version: 2.6.2
94
+ rubygems_version: 2.5.2
95
95
  signing_key:
96
96
  specification_version: 4
97
97
  summary: A customizable tool that compares two instances of Nokogiri::XML::Node for