
json-diff 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (16)

checksums.yaml ADDED

@@ -0,0 +1,7 @@
+---
+SHA1:
+  metadata.gz: 29a90b679f5bf30e17ce0b2cabd2963a9a68ea03
+  data.tar.gz: 06c46f1cb99d0564522c30507df1f65fbcca003b
+SHA512:
+  metadata.gz: 613a0223292d0d84d7bf32a46b5f58468336251351fb1d243032172162c24a53c22f56474c1873898c492930d2283227302345febbfccdbdd9cad4c152474174
+  data.tar.gz: c1457cdf32d04f6f368b49b7dde261751ac1c9ef7350578e766ae63a598afa37d10d7e8edb4eb30a2590df9c0cec3bfb3f5dcdfb64b54dcd30f1211161525ee7

data/.rspec ADDED

@@ -0,0 +1 @@
+--color

data/Gemfile ADDED

@@ -0,0 +1,6 @@
+source "http://rubygems.org"
+gemspec
+group :test do
+  gem 'rake'
+end

data/LICENSE ADDED

@@ -0,0 +1,21 @@
+Copyright (c) 2015 Captain Train
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

data/Makefile ADDED

@@ -0,0 +1,4 @@
+install:
+	gem build json-diff.gemspec && sudo gem install ./json-diff-*.gem
+.PHONY: install

data/README.md ADDED

@@ -0,0 +1,47 @@
+# `json-diff`
+*Take two Ruby objects that can be serialized to JSON. Output an array of operations (additions, deletions, moves) that would convert the first one to the second one.*
+```bash
+gem install json-diff  # Or `gem 'json-diff'` in your Gemfile.
+```
+```ruby
+require 'json-diff'
+JsonDiff.diff(1, 2)
+#> [{:op => :replace, :path => "/", :value => 2}]
+```
+Outputs [RFC6902][]. Look at [hana][] for a JSON patch algorithm that can use this output.
+[RFC6902]: http://www.rfc-editor.org/rfc/rfc6902.txt
+[hana]: https://github.com/tenderlove/hana
+# Heart
+- Recursive similarity computation between any two Ruby values.
+- For arrays, match elements above a certain level of similarity pairwise, and treat them as a move.
+  - Matching happens highest-similarity first.
+  - Move operations are generated by detecting rings in the list of moved elements (e.g., A → B → C → A).
+Pros:
+- For lists which are not necessarily ordered, this approach yields far better results than LCS.
+- Move operations require no custom code to match elements.
+Cons:
+- The quality of the output depends heavily on the similarity algorithm. Empirically, the default yields sensible output; a user-defined similarity procedure (planned, not yet supported) could improve it.
+- There is a computational overhead to the default similarity computation that scales with the total number of entities in the structure.
+# Plans & Bugs
+Roughly ordered by priority.
+- Support adding a custom procedure which computes similarities.
+- Support LCS as an option. (The default will remain what yields the best results, regardless of the time it takes.)
+- Support specifying a depth for similarity computation.
+---
+See the LICENSE file for licensing information.
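
The README's "Heart" section describes pairing array elements highest-similarity first. The following is a minimal sketch of that idea, not the gem's code: the helper name `greedy_pairs` and the stand-in similarity (1.0 for equal values, 0.0 otherwise) are assumptions made for illustration, and the gem's own `array_pairing` orders its candidates differently.

```ruby
# Hypothetical helper, not part of the gem: greedily pair elements of two
# arrays, highest similarity first, keeping only pairs at or above `threshold`.
def greedy_pairs(before, after, threshold: 0.5)
  # Stand-in similarity (an assumption): 1.0 for equal values, 0.0 otherwise.
  similarity = ->(a, b) { a == b ? 1.0 : 0.0 }

  # Score every (before index, after index) couple.
  candidates = before.each_index.flat_map do |i|
    after.each_index.map { |j| [i, j, similarity.call(before[i], after[j])] }
  end

  used_before = {}
  used_after = {}
  pairs = []
  # Visit couples from most to least similar; ties keep index order.
  candidates.sort_by { |i, j, sim| [-sim, i, j] }.each do |i, j, sim|
    next if sim < threshold || used_before[i] || used_after[j]
    used_before[i] = used_after[j] = true
    pairs << [i, j]
  end

  {
    pairs: pairs,
    removed: before.each_index.reject { |i| used_before[i] },
    added: after.each_index.reject { |j| used_after[j] },
  }
end

greedy_pairs([1, 2, 3, 4, 5], [6, 4, 3, 2])
# => {pairs: [[1, 3], [2, 2], [3, 1]], removed: [0, 4], added: [0]}
```

Unpaired elements of the old array become removals and unpaired elements of the new array become additions, which matches the shape of the pairing structure documented in `diff.rb`.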

data/Rakefile ADDED

@@ -0,0 +1,9 @@
+$:.push File.expand_path("../lib", __FILE__)
+require 'bundler'
+Bundler::GemHelper.install_tasks
+require 'rspec/core/rake_task'
+RSpec::Core::RakeTask.new(:spec)
+task default: :spec

data/json-diff.gemspec ADDED

@@ -0,0 +1,15 @@
+$LOAD_PATH.push File.expand_path('../lib', __FILE__)
+require 'json-diff/version'
+Gem::Specification.new do |s|
+  s.name        = 'json-diff'
+  s.license     = 'MIT'
+  s.version     = JsonDiff::VERSION
+  s.platform    = Gem::Platform::RUBY
+  s.authors     = ['Captain Train']
+  s.email       = ['ttyl@captaintrain.com']
+  s.homepage    = 'http://github.com/captaintrain/json-diff'
+  s.summary     = %q{Compute the difference between two JSON-serializable Ruby objects.}
+  s.description = %q{Take two Ruby objects that can be serialized to JSON. Output an array of operations (additions, deletions, moves) that would convert the first one to the second one.}
+  s.files       = `git ls-files`.split("\n")
+end

data/lib/json-diff.rb ADDED

@@ -0,0 +1,4 @@
+require 'json-diff/diff'
+require 'json-diff/index-map'
+require 'json-diff/operation'
+require 'json-diff/version'

data/lib/json-diff/diff.rb ADDED

@@ -0,0 +1,303 @@
+module JsonDiff
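+  # Compute the list of operations that turn `before` into `after`.
+  # Options: :path (JSON pointer prefix, default '/'), :additions and
+  # :moves (set either to false to omit those operations).
+  def self.diff(before, after, opts = {})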
+    path = opts[:path] || '/'
+    include_addition = (opts[:additions] == nil) ? true : opts[:additions]
+    include_moves = (opts[:moves] == nil) ? true : opts[:moves]
+    changes = []
+    if before.is_a?(Hash)
+      if !after.is_a?(Hash)
+        changes << replace(path, before, after)
+      else
+        lost = before.keys - after.keys
+        lost.each do |key|
+          inner_path = extend_json_pointer(path, key)
+          changes << remove(inner_path, before[key])
+        end
+        if include_addition
+          gained = after.keys - before.keys
+          gained.each do |key|
+            inner_path = extend_json_pointer(path, key)
+            changes << add(inner_path, after[key])
+          end
+        end
+        kept = before.keys & after.keys
+        kept.each do |key|
+          inner_path = extend_json_pointer(path, key)
+          changes += diff(before[key], after[key], opts.merge(path: inner_path))
+        end
+      end
+    elsif before.is_a?(Array)
+      if !after.is_a?(Array)
+        changes << replace(path, before, after)
+      elsif before.size == 0
+        if include_addition
+          after.each_with_index do |item, index|
+            inner_path = extend_json_pointer(path, index)
+            changes << add(inner_path, item)
+          end
+        end
+      elsif after.size == 0
+        before.each do |item|
+          # Delete elements from the start.
+          inner_path = extend_json_pointer(path, 0)
+          changes << remove(inner_path, item)
+        end
+      else
+        pairing = array_pairing(before, after)
+        # FIXME: detect replacements.
+        # All detected moves that do not reach the similarity limit are deleted
+        # and re-added.
+        pairing[:pairs].select! do |pair|
+          sim = pair[2]
+          kept = (sim >= 0.5)
+          if !kept
+            pairing[:removed] << pair[0]
+            pairing[:added] << pair[1]
+          end
+          kept
+        end
+        array_changes(pairing)
+        pairing[:removed].each do |before_index|
+          inner_path = extend_json_pointer(path, before_index)
+          changes << remove(inner_path, before[before_index])
+        end
+        pairing[:pairs].each do |pair|
+          before_index, after_index, orig_before, orig_after = pair
+          inner_before_path = extend_json_pointer(path, before_index)
+          inner_after_path = extend_json_pointer(path, after_index)
+          if before_index != after_index && include_moves
+            changes << move(inner_before_path, inner_after_path)
+          end
+          changes += diff(before[orig_before], after[orig_after], opts.merge(path: inner_after_path))
+        end
+        if include_addition
+          pairing[:added].each do |after_index|
+            inner_path = extend_json_pointer(path, after_index)
+            changes << add(inner_path, after[after_index])
+          end
+        end
+      end
+    else
+      if before != after
+        changes << replace(path, before, after)
+      end
+    end
+    changes
+  end
+  # {pairs: [[before index, after index, similarity]],
+  #  removed: [before index],
+  #  added: [after index]}
+  def self.array_pairing(before, after)
+    # Array containing the array of similarities from before to after.
+    similarities = before.map do |before_item|
+      after.map do |after_item|
+        similarity(before_item, after_item)
+      end
+    end
+    # Array containing the array of couples of indices, sorted by similarity.
+    indices = before.map.with_index do |before_item, before_index|
+      after.map.with_index do |after_item, after_index|
+        [before_index, after_index]
+      end
+    end
+    # Sort them in O(n^2 log(n)).
+    indices.map! do |couples|
+      couples.sort! do |a, b|
+        a_before_index = a[0]
+        b_before_index = b[0]
+        a_after_index = a[1]
+        b_after_index = b[1]
+        similarities[b_before_index][b_after_index] <=> similarities[a_before_index][a_after_index]
+      end
+    end
+    # Sort the toplevel.
+    indices.sort! do |a, b|
+      a_top_before_index = a[0][0]
+      a_top_after_index = a[0][1]
+      b_top_before_index = b[0][0]
+      b_top_after_index = b[0][1]
+      similarities[b_top_before_index][b_top_after_index] <=> similarities[a_top_before_index][a_top_after_index]
+    end
+    # Map from indices to boolean (true if paired).
+    before_paired = {}
+    after_paired = {}
+    num_pairs = [before.size, after.size].min
+    pairs = (0...num_pairs).map do |_|
+      unpaired_before_index = indices.index { |a| !before_paired[a[0][0]] }
+      unpaired_after_index = indices[unpaired_before_index].index { |a| !after_paired[a[1]] }
+      unpaired_couple = indices[unpaired_before_index][unpaired_after_index]
+      before_paired[unpaired_couple[0]] = true
+      after_paired[unpaired_couple[1]] = true
+      [unpaired_couple[0], unpaired_couple[1],
+        similarities[unpaired_couple[0]][unpaired_couple[1]]]
+    end
+    if before.size < after.size
+      added = after.map.with_index { |_, i| i} - after_paired.keys
+      removed = []
+    else
+      removed = before.map.with_index { |_, i| i } - before_paired.keys
+      added = []
+    end
+    {
+      pairs: pairs,
+      removed: removed,
+      added: added,
+    }
+  end
+  # Compute an arbitrary notion (from 0.0 to 1.0) of how probable it is that
+  # `after` is a modified version of `before`.
+  def self.similarity(before, after)
+    return 0.0 if before.class != after.class
+    # FIXME: call custom similarity procedure.
+    if before.is_a?(Hash)
+      if before.size == 0
+        if after.size == 0
+          return 1.0
+        else
+          return 0.0
+        end
+      end
+      # Average the similarity of the values found under each key of `before`.
+      # We don't consider key renames.
+      similarities = []
+      before.each do |before_key, before_item|
+        similarities << similarity(before_item, after[before_key])
+      end
+      similarities.reduce(:+) / similarities.size
+    elsif before.is_a?(Array)
+      return 1.0 if before.size == 0
+      # The most likely match between an element of the old list and one of the
+      # new list is presumably the right one, so for each old element we take
+      # its maximum similarity with any new element, then average those maxima.
+      similarities = before.map do |before_item|
+        after.map do |after_item|
+          similarity(before_item, after_item)
+        end.max || 0.0
+      end
+      similarities.reduce(:+) / similarities.size
+    elsif before == after
+      1.0
+    else
+      0.0
+    end
+  end
+  # Input:
+  # {pairs: [[before index, after index, similarity]],
+  #  removed: [before index],
+  #  added: [after index]}
+  #
+  # Output:
+  # {removed: [before index],
+  #  pairs: [[before index, after index,
+  #    original before index, original after index]],
+  #  added: [after index]}
+  def self.array_changes(pairing)
+    # We perform removals starting from the highest index.
+    # That way, a removal never shifts the index of a removal still to come.
+    pairing[:removed].sort!.reverse!
+    pairing[:added].sort!
+    # First, map indices from before to after removals.
+    removal_map = IndexMaps.new
+    pairing[:removed].each { |rm| removal_map.removal(rm) }
+    # And map indices from after to before additions
+    # (removals, since it is reversed).
+    addition_map = IndexMaps.new
+    pairing[:added].each { |ad| addition_map.removal(ad) }
+    moves = {}
+    orig_before = {}
+    orig_after = {}
+    pairing[:pairs].each do |before, after|
+      mapped_before = removal_map.map(before)
+      mapped_after = addition_map.map(after)
+      orig_before[mapped_before] = before
+      orig_after[mapped_after] = after
+      moves[mapped_before] = mapped_after
+    end
+    # Now, detect rings within the pairs.
+    # The proof is, if whatever was at position i was sent to position j,
+    # whatever was at position j cannot have stayed at j.
+    # By induction, there is a ring.
+    # Oh, and a piece of the proof is that the arrays have the same length.
+    # Trivially. Right. Hey, this is not an interview!
+    rings = []
+    while moves.size > 0
+      # i goes to j. j goes to (…). k goes to i.
+      ring = []
+      pair = moves.shift
+      origin, target = pair
+      first_origin = origin
+      while target != first_origin
+        ring << origin
+        origin = target
+        target = moves[target]
+        moves.delete(origin)
+      end
+      ring << origin
+      rings << ring
+    end
+    # rings is of the form [[i,j,k], …]
+    # Finally, we can register the moves.
+    # The idea is, if the whole ring moves instantaneously,
+    # no element outside of the ring changed position.
+    pairs = []
+    rings.each do |ring|
+      orig_ring = ring.map { |i| [orig_before[i], orig_after[i]] }
+      ring_map = IndexMaps.new
+      len = ring.size
+      i = 0
+      while i < len
+        ni = (i + 1) % len  # next i
+        if ring[i] != ring[ni]
+          pairs << [ring[i], ring[ni], orig_ring[i][0], orig_ring[ni][1]]
+        end
+        ring_map.removal(ring[i])
+        ring_map.addition(ring[ni])
+        j = i + 1
+        while j < len
+          ring[j] = ring_map.map(ring[j])
+          j += 1
+        end
+        i += 1
+      end
+    end
+    pairing[:pairs] = pairs
+    pairing
+  end
+end
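
The ring detection in `array_changes` can be hard to follow inline. Below is a simplified sketch of the same idea, assuming the moves form a permutation given as an `{origin => target}` hash; `rings_of` is a hypothetical name and the function is not the gem's implementation.

```ruby
# Hypothetical helper: split a permutation, given as {origin => target},
# into rings (cycles). Elements that stay in place form one-element rings.
def rings_of(moves)
  remaining = moves.dup
  rings = []
  until remaining.empty?
    # Start a new ring from any move that has not been consumed yet.
    origin, target = remaining.shift
    ring = [origin]
    # Follow targets until we come back to the start of the ring.
    while target != ring.first
      raise ArgumentError, 'moves must form a permutation' unless remaining.key?(target)
      ring << target
      target = remaining.delete(target)
    end
    rings << ring
  end
  rings
end

rings_of({ 0 => 2, 2 => 1, 1 => 0, 3 => 3 })
# => [[0, 2, 1], [3]]
```

Each ring can then be turned into a chain of `move` operations without disturbing elements outside the ring, which is the property the comments in `array_changes` rely on.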

data/lib/json-diff/index-map.rb ADDED

@@ -0,0 +1,51 @@
+module JsonDiff
+  class IndexMaps
+    def initialize
+      @maps = []
+    end
+    def addition(index)
+      @maps << AdditionIndexMap.new(index)
+    end
+    def removal(index)
+      @maps << RemovalIndexMap.new(index)
+    end
+    def map(index)
+      @maps.each do |map|
+        index = map.map(index)
+      end
+      index
+    end
+  end
+  class IndexMap
+    def initialize(pivot)
+      @pivot = pivot
+    end
+    def map(index)
+      if index >= @pivot
+        index + 1
+      else
+        index
+      end
+    end
+  end
+  class AdditionIndexMap < IndexMap
+  end
+  class RemovalIndexMap < IndexMap
+    def map(index)
+      if index >= @pivot
+        index - 1
+      else
+        index
+      end
+    end
+  end
+end
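
To illustrate what the pivot-based maps compute, here is a small stand-alone sketch. The helper `index_after_removals` is hypothetical and only mirrors the removal rule above (`index >= pivot` shifts down by one), applied from the highest removed index to the lowest as `array_changes` does.

```ruby
# Hypothetical helper: where does the element at `index` land once the
# positions listed in `removed` (original indices) have been deleted?
def index_after_removals(index, removed)
  # Apply removals from the highest index down, so that each pivot is still
  # expressed in the coordinates of the array it is applied to.
  removed.sort.reverse.reduce(index) do |current, pivot|
    current >= pivot ? current - 1 : current
  end
end

# In [1, 2, 3, 4, 5], removing indices 4 and 0 leaves [2, 3, 4]:
index_after_removals(3, [4, 0])     # => 2
index_after_removals(4, [0, 1, 3])  # => 1
```

The result is only meaningful for an `index` that is not itself among the removed positions.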

data/lib/json-diff/operation.rb ADDED

@@ -0,0 +1,47 @@
+module JsonDiff
+  # Convert a list of strings or numbers to an RFC6901 JSON pointer.
+  # http://tools.ietf.org/html/rfc6901
+  def self.json_pointer(path)
+    escaped_path = path.map do |key|
+      if key.is_a?(String)
+        key.gsub('~', '~0')
+           .gsub('/', '~1')
+      else
+        key.to_s
+      end
+    end.join('/')
+    "/#{escaped_path}"
+  end
+  # Add a key to a JSON pointer.
+  def self.extend_json_pointer(pointer, key)
+    if pointer == '/'
+      json_pointer([key])
+    else
+      pointer + json_pointer([key])
+    end
+  end
+  def self.add(path, value)
+    {op: :add, path: path, value: value}
+  end
+  def self.remove(path, value)
+    if value != nil
+      {op: :remove, path: path, value: value}
+    else
+      {op: :remove, path: path}
+    end
+  end
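+  # `before` is accepted because every caller in diff.rb passes it;
+  # only the new value is emitted in the operation.
+  def self.replace(path, before, after)
+    {op: :replace, path: path, value: after}
+  end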
+  def self.move(source, target)
+    {op: :move, from: source, path: target}
+  end
+end
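
The escaping rule used by `json_pointer` comes from RFC 6901: `~` is encoded as `~0` first, then `/` as `~1`. A stand-alone sketch follows. The names `escape_pointer_token` and `pointer_for` are hypothetical, and unlike the gem this version also escapes non-string keys after converting them with `to_s`.

```ruby
# Hypothetical stand-alone version of the RFC 6901 escaping rule.
# Order matters: "~" must be escaped before "/", otherwise the "~" in a
# freshly produced "~1" would be escaped a second time.
def escape_pointer_token(key)
  key.to_s.gsub('~', '~0').gsub('/', '~1')
end

# Build a pointer from a list of keys and array indices.
def pointer_for(keys)
  keys.map { |key| "/#{escape_pointer_token(key)}" }.join
end

pointer_for(['a/b', 0, 'm~n'])  # => "/a~1b/0/m~0n"
pointer_for([])                 # => ""
```

In RFC 6901 the empty string refers to the whole document, whereas the code in this package uses `'/'` as its root path, so the two differ for the top-level value.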

data/lib/json-diff/version.rb ADDED

@@ -0,0 +1,3 @@
+module JsonDiff
+  VERSION = '0.1.0'
+end

data/spec/json-diff/diff_spec.rb ADDED

@@ -0,0 +1,57 @@
+require 'spec_helper'
+describe JsonDiff do
+  it "should be able to diff two empty arrays" do
+    diff = JsonDiff.diff([], [])
+    expect(diff).to eql([])
+  end
+  it "should be able to diff an empty array with a filled one" do
+    diff = JsonDiff.diff([], [1, 2, 3])
+    expect(diff).to eql([
+      {op: :add, path: "/0", value: 1},
+      {op: :add, path: "/1", value: 2},
+      {op: :add, path: "/2", value: 3},
+    ])
+  end
+  it "should be able to diff a filled array with an empty one" do
+    diff = JsonDiff.diff([1, 2, 3], [])
+    expect(diff).to eql([
+      {op: :remove, path: "/0", value: 1},
+      {op: :remove, path: "/0", value: 2},
+      {op: :remove, path: "/0", value: 3},
+    ])
+  end
+  it "should be able to diff a 1-array with a filled one" do
+    diff = JsonDiff.diff([0], [1, 2, 3])
+    expect(diff).to eql([
+      {op: :remove, path: "/0", value: 0},
+      {op: :add, path: "/0", value: 1},
+      {op: :add, path: "/1", value: 2},
+      {op: :add, path: "/2", value: 3},
+    ])
+  end
+  it "should be able to diff a filled array with a 1-array" do
+    diff = JsonDiff.diff([1, 2, 3], [0])
+    expect(diff).to eql([
+      {op: :remove, path: "/2", value: 3},
+      {op: :remove, path: "/1", value: 2},
+      {op: :remove, path: "/0", value: 1},
+      {op: :add, path: "/0", value: 0},
+    ])
+  end
+  it "should be able to diff two integer arrays" do
+    diff = JsonDiff.diff([1, 2, 3, 4, 5], [6, 4, 3, 2])
+    expect(diff).to eql([
+      {op: :remove, path: "/4", value: 5},
+      {op: :remove, path: "/0", value: 1},
+      {op: :move, from: "/0", path: "/2"},
+      {op: :move, from: "/1", path: "/0"},
+      {op: :add, path: "/0", value: 6},
+    ])
+  end
+end
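
One way to sanity-check the expected operation lists in this spec is to replay them, in order, against the original array. The sketch below assumes flat arrays and paths of the form `/<index>` only; `apply_array_ops` is a hypothetical helper, not part of the gem and not a full RFC 6902 implementation.

```ruby
# Hypothetical mini-applier for flat arrays; paths must look like "/<index>".
def apply_array_ops(array, ops)
  result = array.dup
  index_of = ->(pointer) { Integer(pointer[1..-1]) }
  ops.each do |op|
    case op[:op]
    when :remove
      result.delete_at(index_of.call(op[:path]))
    when :add
      result.insert(index_of.call(op[:path]), op[:value])
    when :move
      # Take the element out first, then insert it at the target index.
      item = result.delete_at(index_of.call(op[:from]))
      result.insert(index_of.call(op[:path]), item)
    else
      raise ArgumentError, "unsupported op: #{op[:op].inspect}"
    end
  end
  result
end

ops = [
  {op: :remove, path: "/4", value: 5},
  {op: :remove, path: "/0", value: 1},
  {op: :move, from: "/0", path: "/2"},
  {op: :move, from: "/1", path: "/0"},
  {op: :add, path: "/0", value: 6},
]
apply_array_ops([1, 2, 3, 4, 5], ops)
# => [6, 4, 3, 2]
```

Replaying the operations from the "two integer arrays" example yields the target array, which shows that each path is interpreted against the array as it stands after the preceding operations.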

data/spec/spec_helper.rb ADDED

@@ -0,0 +1,4 @@
+$LOAD_PATH << File.join(File.dirname(__FILE__), '..', 'lib')
+require 'rubygems'
+require 'json-diff'

metadata ADDED

@@ -0,0 +1,60 @@
+--- !ruby/object:Gem::Specification
+name: json-diff
+version: !ruby/object:Gem::Version
+  version: 0.1.0
+platform: ruby
+authors:
+- Captain Train
+autorequire:
+bindir: bin
+cert_chain: []
+date: 2016-06-11 00:00:00.000000000 Z
+dependencies: []
+description: Take two Ruby objects that can be serialized to JSON. Output an array
+  of operations (additions, deletions, moves) that would convert the first one to
+  the second one.
+email:
+- ttyl@captaintrain.com
+executables: []
+extensions: []
+extra_rdoc_files: []
+files:
+- .rspec
+- Gemfile
+- LICENSE
+- Makefile
+- README.md
+- Rakefile
+- json-diff.gemspec
+- lib/json-diff.rb
+- lib/json-diff/diff.rb
+- lib/json-diff/index-map.rb
+- lib/json-diff/operation.rb
+- lib/json-diff/version.rb
+- spec/json-diff/diff_spec.rb
+- spec/spec_helper.rb
+homepage: http://github.com/captaintrain/json-diff
+licenses:
+- MIT
+metadata: {}
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+requirements: []
+rubyforge_project:
+rubygems_version: 2.0.14
+signing_key:
+specification_version: 4
+summary: Compute the difference between two JSON-serializable Ruby objects.
+test_files: []