RubyGems - compare-xml - Versions diffs - 0.5.1 - Mend

compare-xml 0.5.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (12)

checksums.yaml ADDED

@@ -0,0 +1,7 @@
+---
+SHA1:
+  metadata.gz: eb7ad2fa6ba45154479d1129ad6ce84337b331fa
+  data.tar.gz: eb3a5b0f7caedccb403c1a5504baa80b4e5ff7a8
+SHA512:
+  metadata.gz: d46798f576e812ad39b3604c8b41f01d531f6490ba1690facb1cd978c200139d8b1e58a17b98a2d47fbbd76cc011487f9850bb68f5a5e28f5d3156d41a7c9f7c
+  data.tar.gz: 91993f87fca6eb40ec302bb598a88b85fa36e1876e1c9ef88396b2f52360e8bed89597400d9803ceaa7a455e8d5930750a9239d4610f607efe46f01d03c5dc31
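
The checksums above are digests of the two archives packed inside the `.gem` file. As a minimal sketch of how such a digest could be checked with Ruby's standard `digest` library, the snippet below compares a file's SHA512 against an expected hex string. The helper name `checksum_matches?` and the file path in the usage note are hypothetical and are not part of the gem.

```ruby
require 'digest/sha2'

# Hypothetical helper (not part of compare-xml): returns true when the
# SHA512 digest of the file at +path+ equals +expected_hex+.
# The comparison is case-insensitive because hex digests may be upper-cased.
def checksum_matches?(path, expected_hex)
  Digest::SHA512.file(path).hexdigest == expected_hex.downcase
end
```

Assuming the archive has been extracted next to the script, a call such as `checksum_matches?('data.tar.gz', expected)` would be made with the SHA512 value listed in `checksums.yaml`.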

data/.gitignore ADDED

@@ -0,0 +1,13 @@
+*.DS_Store
+*thumbs.db
+/*.gem
+/.bundle/
+/.idea/
+/.yardoc
+/_yardoc/
+/coverage/
+/doc/
+/Gemfile.lock
+/pkg/
+/spec/reports/
+/tmp/

data/Gemfile ADDED

@@ -0,0 +1,4 @@
+source 'https://rubygems.org'
+# Specify your gem's dependencies in compare-xml.gemspec
+gemspec

data/LICENSE.txt ADDED

@@ -0,0 +1,21 @@
+The MIT License (MIT)
+Copyright (c) 2016 Vadim Kononov
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.

data/README.md ADDED

@@ -0,0 +1,358 @@
+# CompareXML
+[![Gem Version](https://badge.fury.io/rb/compare-xml.svg)](https://rubygems.org/gems/compare-xml)
+CompareXML is a fast, lightweight and feature-rich tool that will solve your XML/HTML comparison or diffing needs. Its purpose is to compare two instances of `Nokogiri::XML::Node` or `Nokogiri::XML::NodeSet` for equality or equivalency.
+**Features**
+ - Fast, light-weight and highly customizable
+ - Compares XML/HTML documents and document fragments
+ - Can either produce detailed diffing discrepancies or execute silently
+ - Has the ability to exclude specific nodes or attributes from all comparisons
+## Installation
+Add this line to your application's Gemfile:
+```ruby
+gem 'compare-xml'
+```
+And then execute:
+    $ bundle
+Or install it yourself as:
+    $ gem install compare-xml
+## Usage
+Using CompareXML is as simple as
+```ruby
+CompareXML.equivalent?(doc1, doc2)
+```
+where `doc1` and `doc2` are instances of `Nokogiri::XML::Node` or `Nokogiri::XML::NodeSet`.
+**Example**
+Suppose you have two files `1.html` and `2.html` that you would like to compare. You could do it as follows:
+```ruby
+doc1 = Nokogiri::HTML(open('1.html'))
+doc2 = Nokogiri::HTML(open('2.html'))
+puts CompareXML.equivalent?(doc1, doc2)
+```
+The above code will print `true` or `false` depending on the result of the comparison.
+> If you are using CompareXML in a script, then you need to require it manually with:
+```ruby
+require 'compare-xml'
+```
+## Options
+CompareXML has a variety of options that can be invoked as an optional argument, e.g.:
+```ruby
+CompareXML.equivalent?(doc1, doc2, {squeeze_whitespace: true, verbose: true})
+```
+----------
+- ####`ignore_attr_order: {true|false}` default: **`true`**
+    When `true`, all attributes are sorted by name before comparison and only attributes with the same name are compared.
+	**Usage Example:** `CompareXML.equivalent?(doc1, doc2, {ignore_attr_order: true})`
+    **Example:** When `true` the following HTML strings are considered equal:
+		<a href="/admin" class="button" target="_blank">Link</a>
+		<a class="button" target="_blank" href="/admin">Link</a>
+	**Example:** When `false` the above HTML strings are compared as follows:
+		href="/admin" != class="button"
+	The comparison of the `<a>` element will stop at this point, since a discrepancy is found.
+	**Example:** When `true` the following HTML strings are compared as follows:
+		<a href="/admin" class="button" target="_blank">Link</a>
+		<a class="button" target="_blank" href="/admin" rel="nofollow">Link</a>
+		class="button"  == class="button"
+		href="/admin"   == href="/admin"
+		                != rel="nofollow"
+		target="_blank" == target="_blank"
+----------
+- ####`ignore_attrs: {css}` default: **`{}`**
+    When provided, ignores all **attributes** that satisfy a particular rule using [CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp).
+	**Usage Example:** `CompareXML.equivalent?(doc1, doc2, {ignore_attrs: ['a[rel="nofollow"]', 'input[type="hidden"]']})`
+    **Example:** With `ignore_attrs: ['a[rel="nofollow"]', 'a[target]']` the following HTML strings are considered equal:
+		<a href="/admin" class="button" target="_blank">Link</a>
+		<a href="/admin" class="button" target="_self" rel="nofollow">Link</a>
+	 **Example:** With `ignore_attrs: ['a[href^="http"]', 'a[class*="button"]']` the following HTML strings are considered equal:
+		<a href="http://google.ca" class="primary button">Link</a>
+		<a href="https://google.com" class="primary button rounded">Link</a>
+----------
+- ####`ignore_comments: {true|false}` default: **`true`**
+    When `true`, ignores comments, such as `<!-- This is a comment -->`.
+	**Usage Example:** `CompareXML.equivalent?(doc1, doc2, {ignore_comments: true})`
+    **Example:** When `true` the following HTML strings are considered equal:
+		<!-- This is a comment -->
+		<!-- This is another comment -->
+	**Example:** When `true` the following HTML strings are considered equal:
+		<a href="/admin"><!-- This is a comment -->Link</a>
+		<a href="/admin">Link</a>
+----------
+- ####`ignore_nodes: {css}` default: **`{}`**
+    When provided, ignores all **nodes** that satisfy a particular rule using [CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp).
+	**Usage Example:** `CompareXML.equivalent?(doc1, doc2, {ignore_nodes: ['script', 'object']})`
+    **Example:** With `ignore_nodes: ['a[rel="nofollow"]', 'a[target]']` the following HTML strings are considered equal:
+		<a href="/admin" class="icon" target="_blank">Link 1</a>
+		<a href="/index" class="button" target="_self" rel="nofollow">Link 2</a>
+	 **Example:** With `ignore_nodes: ['b', 'i']` the following HTML strings are considered equal:
+		<a href="/admin"><i class="icon bulb"></i><b>Warning:</b> Link</a>
+		<a href="/admin"><i class="icon info"></i><b>Message:</b> Link</a>
+----------
+- ####`ignore_text_nodes: {true|false}` default: **`false`**
+    When `true`, ignores all text content. Text content is anything that is included between an opening and a closing tag, e.g. `<tag>THIS IS TEXT CONTENT</tag>`.
+	**Usage Example:** `CompareXML.equivalent?(doc1, doc2, {ignore_text_nodes: true})`
+    **Example:** When `true` the following HTML strings are considered equal:
+		<a href="/admin">SOME TEXT CONTENT</a>
+		<a href="/admin">DIFFERENT TEXT CONTENT</a>
+	**Example:** When `true` the following HTML strings are considered equal:
+		<i class="icon"></i>  <b>Warning:</b>
+		<i class="icon">  </i>    <b>Message:</b>
+----------
+- ####`squeeze_whitespace: {true|false}` default: **`true`**
+    When `true`, all text content within the document is trimmed (i.e. space removed from left and right) and whitespace is squeezed (i.e. tabs, new lines, multiple whitespaces are all replaced by a single whitespace).
+	**Usage Example:** `CompareXML.equivalent?(doc1, doc2, {squeeze_whitespace: true})`
+    **Example:** When `true` the following HTML strings are considered equal:
+		<a href="/admin">   SOME TEXT CONTENT   </a>
+		<a href="/admin"> SOME    TEXT    CONTENT </a>
+	**Example:** When `true` the following HTML strings are considered equal:
+		<html>
+			<title>
+				This is my title
+			</title>
+		</html>
+		<html><title>This is my title</title></html>
+----------
+- ####`verbose: {true|false}` default: **`false`**
+    When `true`, instead of returning a boolean value  `CompareXML.equivalent?` returns an array of all errors encountered when performing a comparison.
+	> **Warning:** When `true`, the comparison takes longer! Not only because more processing is required to produce meaningful error messages, but also because in this mode, comparison does **NOT** stop when a first error is encountered, because the goal is to capture as many discrepancies as possible.
+	**Usage Example:** `CompareXML.equivalent?(doc1, doc2, {verbose: true})`
+    **Example:** When `true` given the following HTML strings:
+		<!DOCTYPE html>
+		<html lang="en">
+		<head><title>TITLE</title></head>
+		<body>
+			<h1>SOME HEADING</h1>
+			<div id="content">
+			    <h2><i class="fa fa-cogs"></i> ANOTHER HEADING</h2>
+			    <p>Extra content</p>
+			</div>
+			<div class="window">
+			    <a href="/admin" rel="icon">Link</a>
+			</div>
+			<blockquote>Some fancy quote <cite>Author Name</cite></blockquote>
+			<p>Some more text</p>
+			<p>Yet more text</p>
+			<p>Too much text</p>
+			<!-- The footer is below -->
+			<p class="footer">FOOTER</p>
+		</body>
+		</html>
+		<!DOCTYPE html>
+		<html lang="en">
+		<head><title>ANOTHER TITLE</title></head>
+		<body>
+			<h1 id="main">SOME HEADING</h1>
+			<div id="content">
+			    <h2><i class="fa fa-cogs"></i> ANOTHER HEADING</h2>
+			    <p>Extra content</p>
+			</div>
+			<div class="window">
+			    <a rel="button" href="/admin">Link</a>
+			</div>
+			<blockquote>Some fancy quote</blockquote>
+			<p>Some more text</p>
+			<p>Yet more text</p>
+			<p>Too much text</p>
+			<!-- This is the footer -->
+			<div class="footer">FOOTER</div>
+		</body>
+		</html>
+	`CompareXML.equivalent?(doc1, doc2, {verbose: true})` will produce an array shown below.
+		[
+			"html:head:title",
+			"TITLE",
+			9,
+			"ANOTHER TITLE",
+			"html:head:title"
+		],
+		[
+			"html:body:h1",
+			nil,
+			2,
+			"id=\"main\"",
+			"html:body:h1"
+		],
+		[
+			"html:body:div(2):a",
+			"rel=\"icon\"",
+			4,
+			"rel=\"button\"",
+			"html:body:div(2):a"
+		],
+		[
+			"html:body:blockquote:cite",
+			"cite",
+			3,
+			nil,
+			"html:body:blockquote:cite"
+		],
+		[
+			"html:body:p(4)",
+			"p",
+			7,
+			"div",
+			"html:body:div(3)"
+		]
+	The structure of the array is as follows:
+	    [left_node_location, left_content, error_code, right_content, right_node_location]
+	**Node location** of `html:body:p(4)` means that the element in question is `<p>`, its hierarchical ancestors are `html > body`, and it is the **4th** `<p>` tag. That is, it could be found in
+        <html><body><p>one</p>...<p>two</p>...<p>three</p>...<p>TARGET</p></body></html>
+	> **Note:** `p(4)` means that it is the fourth tag of type `<p>`, but there could be many other tags of other types between `p(3)` and `p(4)`.
+	**Node content** displays the discrepancy in content (which could be the name of the tag, attributes, text content, comments, etc)
+	**Error code** is a numeric value that indicates the type of a discrepancy. CompareXML implements the following error codes
+	```ruby
+    EQUIVALENT = 1              # nodes are equal (for internal use only)
+    MISSING_ATTRIBUTE = 2       # attribute is missing its counterpart
+    MISSING_NODE = 3            # node is missing its counterpart
+    UNEQUAL_ATTRIBUTES = 4      # attributes are not equal
+    UNEQUAL_COMMENTS = 5        # comment contents are not equal
+    UNEQUAL_DOCUMENTS = 6       # document types are not equal
+    UNEQUAL_ELEMENTS = 7        # nodes have the same type but are not equal
+    UNEQUAL_NODES_TYPES = 8     # nodes do not have the same type
+    UNEQUAL_TEXT_CONTENTS = 9   # text contents are not equal
+	```
+	Here is an example of how these could be used:
+    ```ruby
+    case error_code
+      when CompareXML::UNEQUAL_ATTRIBUTES
+        '!='
+      when CompareXML::MISSING_ATTRIBUTE
+        '?'
+    end
+    ```
+## Contributing
+1. Fork it
+2. Create your feature branch (`git checkout -b my-new-feature`)
+3. Commit your changes (`git commit -am 'Add some feature'`)
+4. Push to the branch (`git push origin my-new-feature`)
+5. Create new Pull Request
+## Credits
+This gem was inspired by [Michael B. Klein](https://github.com/mbklein)'s gem [`equivalent-xml`](https://github.com/mbklein/equivalent-xml) - another excellent tool for XML comparison.
+## License
+The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
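
The README's `verbose` option describes each discrepancy as a five-element array. As a rough illustration of how a caller might turn those entries into readable messages, the sketch below maps the documented numeric codes to labels. It uses plain integers so that it runs without the gem installed; the constant `ERROR_LABELS`, the helper `describe_discrepancy`, and the label wording are hypothetical, and the sample entries are copied from the README output.

```ruby
# Numeric codes as listed in the README; the label text is illustrative only.
ERROR_LABELS = {
  2 => 'missing attribute',
  3 => 'missing node',
  4 => 'unequal attributes',
  5 => 'unequal comments',
  6 => 'unequal documents',
  7 => 'unequal elements',
  8 => 'unequal node types',
  9 => 'unequal text contents'
}.freeze

# Hypothetical helper: formats one entry of the form
# [left_node_location, left_content, error_code, right_content, right_node_location].
def describe_discrepancy(entry)
  left_path, left, code, right, right_path = entry
  label = ERROR_LABELS.fetch(code, "unknown code #{code}")
  "#{label}: #{left_path} (#{left.inspect}) vs #{right_path} (#{right.inspect})"
end

errors = [
  ['html:body:h1', nil, 2, 'id="main"', 'html:body:h1'],
  ['html:body:blockquote:cite', 'cite', 3, nil, 'html:body:blockquote:cite']
]
errors.each { |entry| puts describe_discrepancy(entry) }
```

In real use the `errors` array would come from `CompareXML.equivalent?(doc1, doc2, {verbose: true})`, and the `CompareXML::` constants could replace the integer keys.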

data/Rakefile ADDED

@@ -0,0 +1,2 @@
+require 'bundler/gem_tasks'
+task :default => :spec

data/bin/console ADDED

@@ -0,0 +1,14 @@
+#!/usr/bin/env ruby
+require 'bundler/setup'
+require 'compare-xml'
+# You can add fixtures and/or initialization code here to make experimenting
+# with your gem easier. You can also use a different console, if you like.
+# (If you use this, don't forget to add pry to your Gemfile!)
+# require 'pry'
+# Pry.start
+require 'irb'
+IRB.start

data/bin/setup ADDED

@@ -0,0 +1,8 @@
+#!/usr/bin/env bash
+set -euo pipefail
+IFS=$'\n\t'
+set -vx
+bundle install
+# Do any other automated setup that you need to do here

data/compare-xml.gemspec ADDED

@@ -0,0 +1,25 @@
+# coding: utf-8
+lib = File.expand_path('../lib', __FILE__)
+$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
+require 'compare-xml/version'
+Gem::Specification.new do |spec|
+  spec.name          = 'compare-xml'
+  spec.version       = CompareXML::VERSION
+  spec.authors       = ['Vadim Kononov']
+  spec.email         = ['vadim@poetic.com']
+  spec.summary       = %q{A customizable tool that compares two instances of Nokogiri::XML::Node for equality or equivalency.}
+  spec.description   = %q{CompareXML is a fast, lightweight and feature-rich tool that will solve your XML/HTML comparison or diffing needs. its purpose is to compare two instances of Nokogiri::XML::Node or Nokogiri::XML::NodeSet for equality or equivalency.}
+  spec.homepage      = 'https://github.com/vkononov/compare-xml-xml'
+  spec.license       = 'MIT'
+  spec.files         = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
+  spec.bindir        = 'exe'
+  spec.executables   = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
+  spec.require_paths = ['lib']
+  spec.add_development_dependency 'bundler', '~> 1.11'
+  spec.add_development_dependency 'rake', '~> 11.1'
+  spec.add_runtime_dependency 'nokogiri', '~> 1.6'
+end
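
The gemspec pins its dependencies with the pessimistic operator (`~>`). As a small illustration of what `'~> 1.6'` accepts, the sketch below uses RubyGems' own `Gem::Requirement` and `Gem::Version` classes; the helper name `allowed?` is hypothetical.

```ruby
require 'rubygems'

# '~> 1.6' accepts 1.6 and any later 1.x release, but rejects 2.0 and above.
requirement = Gem::Requirement.new('~> 1.6')

# Hypothetical helper: true when +version+ satisfies +requirement+.
def allowed?(requirement, version)
  requirement.satisfied_by?(Gem::Version.new(version))
end

puts allowed?(requirement, '1.5.9')   # => false
puts allowed?(requirement, '1.6.0')   # => true
puts allowed?(requirement, '1.10.2')  # => true
puts allowed?(requirement, '2.0.0')   # => false
```

The same reasoning applies to the development dependencies: `'~> 11.1'` for rake accepts 11.1 and later 11.x releases, but not 12.0.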

data/lib/compare-xml.rb ADDED

@@ -0,0 +1,452 @@
+require 'compare-xml/version'
+require 'nokogiri'
+module CompareXML
+  # default options used by the module; all of these can be overridden
+  DEFAULTS_OPTS = {
+      # when true, attribute order is not important (all attributes are sorted before comparison)
+      # when false, attributes are compared in order and comparison stops on the first mismatch
+      ignore_attr_order: true,
+      # contains an array of user-specified CSS rules used to perform attribute exclusions
+      # for this to work, a CSS rule MUST contain the attribute to be excluded,
+      # i.e. a[href] will exclude all "href" attributes contained in <a> tags.
+      ignore_attrs: {},
+      # when true ignores XML and HTML comments
+      # when false, all comments are compared to their counterparts
+      ignore_comments: true,
+      # contains an array of user-specified CSS rules used to perform node exclusions
+      ignore_nodes: {},
+      # when true, ignores all text nodes (although blank text nodes are always ignored)
+      # when false, all text nodes are compared to their counterparts (except the empty ones)
+      ignore_text_nodes: false,
+      # when true, trims and squeezes whitespace in text nodes and comments to a single space
+      # when false, all whitespace is preserved as it is without any changes
+      squeeze_whitespace: true,
+      # when true, provides a list of all error messages encountered in comparisons
+      # when false, execution stops when the first error is encountered with no error messages
+      verbose: false
+  }
+  # used internally only in order to differentiate equivalence from inequivalence
+  EQUIVALENT = 1
+  # a list of all possible inequivalence types for nodes
+  # these are returned in the errors array to differentiate error types.
+  MISSING_ATTRIBUTE = 2       # attribute is missing its counterpart
+  MISSING_NODE = 3            # node is missing its counterpart
+  UNEQUAL_ATTRIBUTES = 4      # attributes are not equal
+  UNEQUAL_COMMENTS = 5        # comment contents are not equal
+  UNEQUAL_DOCUMENTS = 6       # document types are not equal
+  UNEQUAL_ELEMENTS = 7        # nodes have the same type but are not equal
+  UNEQUAL_NODES_TYPES = 8     # nodes do not have the same type
+  UNEQUAL_TEXT_CONTENTS = 9   # text contents are not equal
+  class << self
+    ##
+    # Determines whether two XML documents or fragments are equal to each other.
+    # The two parameters could be any type of XML documents, or fragments
+    # or node sets or even text nodes - any subclass of Nokogiri::XML::Node.
+    #
+    #   @param [Nokogiri::XML::Node] n1 left node
+    #   @param [Nokogiri::XML::Node] n2 right node
+    #   @param [Hash] opts user-overridden options
+    #
+    #   @return [Boolean, Array] true or false by default; the array of errors when opts[:verbose] is set
+    #
+    def equivalent?(n1, n2, opts = {})
+      opts, errors = DEFAULTS_OPTS.merge(opts), []
+      result = compareNodes(n1, n2, opts, errors)
+      opts[:verbose] ? errors : result == EQUIVALENT
+    end
+    private
+    ##
+    # Compares two nodes for equivalence. The nodes could be any subclass
+    # of Nokogiri::XML::Node including node sets and document fragments.
+    #
+    #   @param [Nokogiri::XML::Node] n1 left node
+    #   @param [Nokogiri::XML::Node] n2 right node
+    #   @param [Hash] opts user-overridden options
+    #   @param [Array] errors inequivalence messages
+    #
+    #   @return type of equivalence (from equivalence constants)
+    #
+    def compareNodes(n1, n2, opts, errors, status = EQUIVALENT)
+      if n1.class == n2.class
+        case n1
+          when Nokogiri::XML::Comment
+            status = compareCommentNodes(n1, n2, opts, errors)
+          when Nokogiri::HTML::Document
+            status = compareDocumentNodes(n1, n2, opts, errors)
+          when Nokogiri::XML::Element
+            status = compareElementNodes(n1, n2, opts, errors)
+          when Nokogiri::XML::Text
+            status = compareTextNodes(n1, n2, opts, errors)
+          else
+            status = compareChildren(n1.children, n2.children, opts, errors)
+        end
+      elsif n1.nil?
+        status = MISSING_NODE
+        errors << [nodePath(n2), nil, status, n2.name, nodePath(n2)] if opts[:verbose]
+      elsif n2.nil?
+        status = MISSING_NODE
+        errors << [nodePath(n1), n1.name, status, nil, nodePath(n1)] if opts[:verbose]
+      else
+        status = UNEQUAL_NODES_TYPES
+        errors << [nodePath(n1), n1.class, status, n2.class, nodePath(n2)] if opts[:verbose]
+      end
+      status
+    end
+    ##
+    # Compares two nodes of type Nokogiri::XML::Comment.
+    #
+    #   @param [Nokogiri::XML::Comment] n1 left comment
+    #   @param [Nokogiri::XML::Comment] n2 right comment
+    #   @param [Hash] opts user-overridden options
+    #   @param [Array] errors inequivalence messages
+    #
+    #   @return type of equivalence (from equivalence constants)
+    #
+    def compareCommentNodes(n1, n2, opts, errors, status = EQUIVALENT)
+      return status if opts[:ignore_comments]
+      t1, t2 = n1.content, n2.content
+      t1, t2 = squeeze(t1), squeeze(t2) if opts[:squeeze_whitespace]
+      unless t1 == t2
+        status = UNEQUAL_COMMENTS
+        errors << [nodePath(n1.parent), t1, status, t2, nodePath(n2.parent)] if opts[:verbose]
+      end
+      status
+    end
+    ##
+    # Compares two nodes of type Nokogiri::HTML::Document.
+    #
+    #   @param [Nokogiri::XML::Document] n1 left document
+    #   @param [Nokogiri::XML::Document] n2 right document
+    #   @param [Hash] opts user-overridden options
+    #   @param [Array] errors inequivalence messages
+    #
+    #   @return type of equivalence (from equivalence constants)
+    #
+    def compareDocumentNodes(n1, n2, opts, errors, status = EQUIVALENT)
+      if n1.name == n2.name
+        status = compareChildren(n1.children, n2.children, opts, errors)
+      else
+        status = UNEQUAL_DOCUMENTS
+        errors << [nodePath(n1), n1, status, n2, nodePath(n2)] if opts[:verbose]
+      end
+      status
+    end
+    ##
+    # Compares two sets of Nokogiri::XML::NodeSet elements.
+    #
+    #   @param [Nokogiri::XML::NodeSet] n1_set left set of Nokogiri::XML::Node elements
+    #   @param [Nokogiri::XML::NodeSet] n2_set right set of Nokogiri::XML::Node elements
+    #   @param [Hash] opts user-overridden options
+    #   @param [Array] errors inequivalence messages
+    #
+    #   @return type of equivalence (from equivalence constants)
+    #
+    def compareChildren(n1_set, n2_set, opts, errors, status = EQUIVALENT)
+      i = 0; j = 0
+      while i < n1_set.length || j < n2_set.length
+        if !n1_set[i].nil? && nodeExcluded?(n1_set[i], opts)
+          i += 1 # increment counter if left node is excluded
+        elsif !n2_set[j].nil? && nodeExcluded?(n2_set[j], opts)
+          j += 1 # increment counter if right node is excluded
+        else
+          result = compareNodes(n1_set[i], n2_set[j], opts, errors)
+          status = result unless result == EQUIVALENT
+          # return false so that this subtree could halt comparison on error
+          # but neighbours of parents' subtrees could still be compared (in verbose mode)
+          return false if status == UNEQUAL_NODES_TYPES || status == UNEQUAL_ELEMENTS
+          # stop execution if a single error is found (unless in verbose mode)
+          break unless status == EQUIVALENT || opts[:verbose]
+          # increment both counters when both nodes have been compared
+          i += 1; j += 1
+        end
+      end
+      status
+    end
+    ##
+    # Compares two nodes of type Nokogiri::XML::Element.
+    # - compares element attributes
+    # - recursively compares element children
+    #
+    #   @param [Nokogiri::XML::Element] n1 left element
+    #   @param [Nokogiri::XML::Element] n2 right element
+    #   @param [Hash] opts user-overridden options
+    #   @param [Array] errors inequivalence messages
+    #
+    #   @return type of equivalence (from equivalence constants)
+    #
+    def compareElementNodes(n1, n2, opts, errors, status = EQUIVALENT)
+      if n1.name == n2.name
+        result = compareAttributeSets(n1.attribute_nodes, n2.attribute_nodes, opts, errors)
+        status = result unless result == EQUIVALENT
+        result = compareChildren(n1.children, n2.children, opts, errors)
+        status = result unless result == EQUIVALENT
+      else
+        status = UNEQUAL_ELEMENTS
+        errors << [nodePath(n1), n1.name, status, n2.name, nodePath(n2)] if opts[:verbose]
+      end
+      status
+    end
+    ##
+    # Compares two nodes of type Nokogiri::XML::Text.
+    #
+    #   @param [Nokogiri::XML::Text] n1 left text node
+    #   @param [Nokogiri::XML::Text] n2 right text node
+    #   @param [Hash] opts user-overridden options
+    #   @param [Array] errors inequivalence messages
+    #
+    #   @return type of equivalence (from equivalence constants)
+    #
+    def compareTextNodes(n1, n2, opts, errors, status = EQUIVALENT)
+      return status if opts[:ignore_text_nodes]
+      t1, t2 = n1.content, n2.content
+      t1, t2 = squeeze(t1), squeeze(t2) if opts[:squeeze_whitespace]
+      unless t1 == t2
+        status = UNEQUAL_TEXT_CONTENTS
+        errors << [nodePath(n1.parent), t1, status, t2, nodePath(n2.parent)] if opts[:verbose]
+      end
+      status
+    end
+    ##
+    # Compares two sets of Nokogiri::XML::Node attributes.
+    #
+    #   @param [Array] a1_set left attribute set
+    #   @param [Array] a2_set right attribute set
+    #   @param [Hash] opts user-overridden options
+    #   @param [Array] errors inequivalence messages
+    #
+    #   @return type of equivalence (from equivalence constants)
+    #
+    def compareAttributeSets(a1_set, a2_set, opts, errors)
+      return false unless a1_set.length == a2_set.length || opts[:verbose]
+      if opts[:ignore_attr_order]
+        compareSortedAttributeSets(a1_set, a2_set, opts, errors)
+      else
+        compareUnsortedAttributeSets(a1_set, a2_set, opts, errors)
+      end
+    end
+    ##
+    # Compares two sets of Nokogiri::XML::Node attributes by sorting them first.
+    # When the attributes are sorted, only attributes of the same type are compared
+    # to each other, and missing attributes can be easily detected.
+    #
+    #   @param [Array] a1_set left attribute set
+    #   @param [Array] a2_set right attribute set
+    #   @param [Hash] opts user-overridden options
+    #   @param [Array] errors inequivalence messages
+    #
+    #   @return type of equivalence (from equivalence constants)
+    #
+    def compareSortedAttributeSets(a1_set, a2_set, opts, errors, status = EQUIVALENT)
+      a1_set, a2_set = a1_set.sort_by { |a| a.name }, a2_set.sort_by { |a| a.name }
+      i = j = 0
+      while i < a1_set.length || j < a2_set.length
+        if a1_set[i].nil?
+          result = compareAttributes(nil, a2_set[j], opts, errors); j += 1
+        elsif a2_set[j].nil?
+          result = compareAttributes(a1_set[i], nil, opts, errors); i += 1
+        elsif a1_set[i].name < a2_set[j].name
+          result = compareAttributes(a1_set[i], nil, opts, errors); i += 1
+        elsif a1_set[i].name > a2_set[j].name
+          result = compareAttributes(nil, a2_set[j], opts, errors); j += 1
+        else
+          result = compareAttributes(a1_set[i], a2_set[j], opts, errors); i += 1; j += 1
+        end
+        status = result unless result == EQUIVALENT
+        break unless status == EQUIVALENT || opts[:verbose]
+      end
+      status
+    end
+    ##
+    # Compares two sets of Nokogiri::XML::Node attributes without sorting them.
+    # As a result attributes of different types may be compared, and even if all
+    # attributes are identical in both sets, if their order is different,
+    # the comparison will stop as soon as two unequal attributes are found.
+    #
+    #   @param [Array] a1_set left attribute set
+    #   @param [Array] a2_set right attribute set
+    #   @param [Hash] opts user-overridden options
+    #   @param [Array] errors inequivalence messages
+    #
+    #   @return type of equivalence (from equivalence constants)
+    #
+    def compareUnsortedAttributeSets(a1_set, a2_set, opts, errors, status = EQUIVALENT)
+      [a1_set.length, a2_set.length].max.times do |i|
+        result = compareAttributes(a1_set[i], a2_set[i], opts, errors)
+        status = result unless result == EQUIVALENT
+        break unless status == EQUIVALENT
+      end
+      status
+    end
+    ##
+    # Compares two attributes by name and value.
+    #
+    #   @param [Nokogiri::XML::Attr] a1 left attribute
+    #   @param [Nokogiri::XML::Attr] a2 right attribute
+    #   @param [Hash] opts user-overridden options
+    #   @param [Array] errors inequivalence messages
+    #
+    #   @return type of equivalence (from equivalence constants)
+    #
+    def compareAttributes(a1, a2, opts, errors, status = EQUIVALENT)
+      if a1.nil?
+        status = MISSING_ATTRIBUTE
+        errors << [nodePath(a2.parent), nil, status, "#{a2.name}=\"#{a2.value}\"", nodePath(a2.parent)] if opts[:verbose]
+      elsif a2.nil?
+        status = MISSING_ATTRIBUTE
+        errors << [nodePath(a1.parent), "#{a1.name}=\"#{a1.value}\"", status, nil, nodePath(a1.parent)] if opts[:verbose]
+      elsif a1.name == a2.name
+        return status if attrsExcluded?(a1, a2, opts)
+        if a1.value != a2.value
+          status = UNEQUAL_ATTRIBUTES
+          errors << [nodePath(a1.parent), "#{a1.name}=\"#{a1.value}\"", status, "#{a2.name}=\"#{a2.value}\"", nodePath(a2.parent)] if opts[:verbose]
+        end
+      else
+        status = UNEQUAL_ATTRIBUTES
+        errors << [nodePath(a1.parent), a1.name, status, a2.name, nodePath(a2.parent)] if opts[:verbose]
+      end
+      status
+    end
+    ##
+    # Determines if a node should be excluded from the comparison. When a node is excluded,
+    # it is completely ignored, as if it did not exist.
+    #
+    # Several types of nodes are considered ignored:
+    # - comments (only in +ignore_comments+ mode)
+    # - text nodes (only in +ignore_text_nodes+ mode OR when a text node is empty)
+    # - node matches a user-specified css rule from +ignore_nodes+
+    #
+    #   @param [Nokogiri::XML::Node] n node being tested for exclusion
+    #   @param [Hash] opts user-overridden options
+    #
+    #   @return true if excluded, false otherwise
+    #
+    def nodeExcluded?(n, opts)
+      return true if n.is_a?(Nokogiri::XML::Comment) && opts[:ignore_comments]
+      return true if n.is_a?(Nokogiri::XML::Text) && (opts[:ignore_text_nodes] || squeeze(n.content).empty?)
+      opts[:ignore_nodes].each do |css|
+        return true if n.xpath('../*').css(css).include?(n)
+      end
+      false
+    end
+    ##
+    # Checks whether two given attributes should be excluded, based on a user-specified css rule.
+    # If true, only the specified attributes are ignored; all remaining attributes are still compared.
+    # The CSS rule is used to locate the node that contains the attributes to be excluded.
+    # The CSS rule MUST contain the name of the attribute to be ignored.
+    #
+    #   @param [Nokogiri::XML::Attr] a1 left attribute
+    #   @param [Nokogiri::XML::Attr] a2 right attribute
+    #   @param [Hash] opts user-overridden options
+    #
+    #   @return true if excluded, false otherwise
+    #
+    def attrsExcluded?(a1, a2, opts)
+      opts[:ignore_attrs].each do |css|
+        if css.include?(a1.name) && css.include?(a2.name)
+          return true if a1.parent.xpath('../*').css(css).include?(a1.parent) && a2.parent.xpath('../*').css(css).include?(a2.parent)
+        end
+      end
+      false
+    end
+    ##
+    # Produces the hierarchical ancestral path of a node in the following format: <html:body:div(3):h2:b(2)>.
+    # This means that the element is located in:
+    #
+    #   <html>
+    #     <body>
+    #       <div>...</div>
+    #       <div>...</div>
+    #       <div>
+    #         <h2>
+    #           <b>...</b>
+    #           <b>TARGET</b>
+    #         </h2>
+    #       </div>
+    #     </body>
+    #   </html>
+    #
+    # Note that the counts of element locations only apply to elements of the same type. For example, div(3) means
+    # that it is the 3rd <div> element in the <body>, but there could be many other elements in between the three
+    # <div> elements.
+    #
+    # When +ignore_comments+ mode is disabled, mismatching comments will show up as <...:comment>.
+    #
+    #   @param [Nokogiri::XML::Node] n node for which to determine a hierarchical path
+    #
+    #   @return [String] hierarchical path of the node
+    #
+    def nodePath(n)
+      name = n.name
+      # find the index of the node if there are several of the same type
+      siblings = n.xpath("../#{name}")
+      name += "(#{siblings.index(n) + 1})" if siblings.length > 1
+      if defined? n.parent
+        status = "#{nodePath(n.parent)}:#{name}"
+        status = status[1..-1] if status[0] == ':'
+        status
+      end
+    end
+    ##
+    # Strips the whitespace (from beginning and end) and squeezes it,
+    # i.e. multiple spaces, new lines and tabs are all squeezed to a single space.
+    #
+    #   @param [String] text string to squeeze
+    #
+    #   @return squeezed string
+    #
+    def squeeze(text)
+      text = text.to_s unless text.is_a? String
+      text.strip.gsub(/\s+/, ' ')
+    end
+  end
+end
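
The sorted attribute comparison in `compareSortedAttributeSets` walks two name-sorted lists in step, which is what lets it report an attribute missing on one side instead of a spurious mismatch. The sketch below shows the same merge-style walk under simplifying assumptions: attributes are plain `{name => value}` hashes rather than `Nokogiri::XML::Attr` objects, and the method name `diff_attributes` and the result symbols are hypothetical.

```ruby
# Hypothetical stand-in for the gem's sorted comparison. Returns a list of
# [kind, attribute_name] pairs describing every discrepancy found.
def diff_attributes(left, right)
  a = left.sort   # array of [name, value] pairs ordered by name
  b = right.sort
  i = j = 0
  diffs = []
  while i < a.length || j < b.length
    if j >= b.length || (i < a.length && a[i][0] < b[j][0])
      # name exists only on the left side
      diffs << [:missing_on_right, a[i][0]]
      i += 1
    elsif i >= a.length || a[i][0] > b[j][0]
      # name exists only on the right side
      diffs << [:missing_on_left, b[j][0]]
      j += 1
    else
      # same name on both sides: compare the values
      diffs << [:unequal, a[i][0]] unless a[i][1] == b[j][1]
      i += 1
      j += 1
    end
  end
  diffs
end

left  = { 'href' => '/admin', 'class' => 'button', 'target' => '_blank' }
right = { 'class' => 'button', 'target' => '_blank', 'href' => '/admin', 'rel' => 'nofollow' }
p diff_attributes(left, right)  # => [[:missing_on_left, "rel"]]
```

Unlike the gem, this sketch always collects every discrepancy, which corresponds to the verbose mode; a non-verbose variant would return as soon as the first difference is found.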

data/lib/compare-xml/version.rb ADDED

@@ -0,0 +1,3 @@
+module CompareXML
+  VERSION = '0.5.1'
+end

metadata ADDED

@@ -0,0 +1,99 @@
+--- !ruby/object:Gem::Specification
+name: compare-xml
+version: !ruby/object:Gem::Version
+  version: 0.5.1
+platform: ruby
+authors:
+- Vadim Kononov
+autorequire:
+bindir: exe
+cert_chain: []
+date: 2016-04-05 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: bundler
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.11'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.11'
+- !ruby/object:Gem::Dependency
+  name: rake
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '11.1'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '11.1'
+- !ruby/object:Gem::Dependency
+  name: nokogiri
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.6'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.6'
+description: CompareXML is a fast, lightweight and feature-rich tool that will solve
+  your XML/HTML comparison or diffing needs. its purpose is to compare two instances
+  of Nokogiri::XML::Node or Nokogiri::XML::NodeSet for equality or equivalency.
+email:
+- vadim@poetic.com
+executables: []
+extensions: []
+extra_rdoc_files: []
+files:
+- ".gitignore"
+- Gemfile
+- LICENSE.txt
+- README.md
+- Rakefile
+- bin/console
+- bin/setup
+- compare-xml.gemspec
+- lib/compare-xml.rb
+- lib/compare-xml/version.rb
+homepage: https://github.com/vkononov/compare-xml-xml
+licenses:
+- MIT
+metadata: {}
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
+requirements: []
+rubyforge_project:
+rubygems_version: 2.5.2
+signing_key:
+specification_version: 4
+summary: A customizable tool that compares two instances of Nokogiri::XML::Node for
+  equality or equivalency.
+test_files: []