RubyGems - data_structures_rmolinari - Versions diffs - 0.4.1 → 0.4.2 - Mend

data_structures_rmolinari 0.4.1 → 0.4.2

Files changed (13)

checksums.yaml +4 -4
data/CHANGELOG.md +21 -3
data/README.md +141 -0
data/Rakefile +16 -0
data/ext/c_disjoint_union/disjoint_union.c +412 -0
data/ext/c_disjoint_union/extconf.rb +12 -0
data/lib/data_structures_rmolinari/algorithms.rb +103 -0
data/lib/data_structures_rmolinari/max_priority_search_tree.rb +200 -58
data/lib/data_structures_rmolinari/min_priority_search_tree.rb +187 -0
data/lib/data_structures_rmolinari/{generic_segment_tree.rb → segment_tree_template.rb} +0 -0
data/lib/data_structures_rmolinari/shared.rb +5 -16
data/lib/data_structures_rmolinari.rb +6 -3
metadata +12 -5

data/lib/data_structures_rmolinari/min_priority_search_tree.rb ADDED

@@ -0,0 +1,187 @@
+require 'must_be'
+require 'set'
+require_relative 'shared'
+# A priority search tree (PST) stores a set, P, of two-dimensional points (x,y) in a way that allows efficient answers to certain
+# questions about P.
+#
+# This is a _Minimal_ Priority Search Tree (MinPST), a slight variant of the MaxPST. Where a MaxPST can answer queries about
+# regions infinite in the positive y direction, a MinPST can handle regions infinite in the negative y direction. (A MinmaxPST can
+# handle both kinds of region but has not been implemented.)
+#
+# The PST data structure was introduced in 1985 by Edward McCreight. Later, De, Maheshwari, Nandy, and Smid showed how to construct
+# a PST in-place (using only O(1) extra memory), at the expense of some slightly more complicated code for the various supported
+# operations. It is their approach that we have implemented. See the class +MaxPrioritySearchTree+ for more details.
+#
+# Here we implement the MinPST by adding a thin layer of code over a MaxPST and reflecting all points through the x-axis.
+#
+# This means a few things.
+# - The extra bookkeeping means that performance will be slightly slower than for the MaxPST. It is unlikely to be
+#   noticeable in practice.
+# - MaxPST builds the tree structure in place, modifying the data array passed to it. Indeed, this is the point of the approach of De
+#   et al. But we don't do that, as we create a separate array of Points.
+# - Whereas the implementation of MaxPST means that client code gets the same (x, y) objects back in results as it passed into the
+#   constructor, that's not the case here.
+#   - we map each point in the input - which is an object responding to +#x+ and +#y+ - to an instance of +Point+, and will return
+#    (different) instances of +Point+ in response to queries.
+#   - client code is unlikely to care, but be aware of this, just in case.
+#
+# Given a set of n points, we can answer the following questions quickly:
+#
+# - +smallest_x_in_se+: for x0 and y0, what is the "leftmost" point (x, y) in P satisfying x >= x0 and y <= y0?
+# - +largest_x_in_sw+: for x0 and y0, what is the "rightmost" point (x, y) in P satisfying x <= x0 and y <= y0?
+# - +smallest_y_in_se+: for x0 and y0, what is the "lowest" point (x, y) in P satisfying x >= x0 and y <= y0?
+# - +smallest_y_in_sw+: for x0 and y0, what is the "lowest" point (x, y) in P satisfying x <= x0 and y <= y0?
+# - +smallest_y_in_3_sided+: for x0, x1, and y0, what is the lowest point (x, y) in P satisfying x >= x0, x <= x1 and y <= y0?
+# - +enumerate_3_sided+: for x0, x1, and y0, enumerate all points in P satisfying x >= x0, x <= x1 and y <= y0.
+#
+# (Here, "leftmost/rightmost" means "minimal/maximal x", and "lowest" means "minimal y".)
+#
+# The first 5 operations take O(log n) time and O(1) extra space.
+#
+# The final operation (enumerate) takes O(m + log n) time and O(1) extra space, where m is the number of points that are enumerated.
+#
+# As with the MaxPST, the MinPST can be constructed to be "dynamic" and provide a +delete_top!+ operation running in O(log n) time.
+#
+# In the current implementation no two points can share an x-value. This (rather severe) restriction can be relaxed with some more
+# complicated code, but it hasn't been written yet. See issue #9.
+#
+# References:
+# * E.M. McCreight, _Priority search trees_, SIAM J. Comput., 14(2):257-276, 1985.
+# * M. De, A. Maheshwari, S. C. Nandy, M. Smid, _An In-Place Priority Search Tree_, 23rd Canadian Conference on Computational
+#   Geometry, 2011
+class DataStructuresRMolinari::MinPrioritySearchTree
+  include Shared
+  include BinaryTreeArithmetic
+  # Construct a MinPST from the collection of points in +data+.
+  #
+  # @param data [Array] the set P of points as an array. The internal data structure is constructed in-place inside this array
+  #     without cloning it. Indeed, each element of data is replaced by a different object.
+  #   - Each element of the array must respond to +#x+ and +#y+.
+  #   - The +x+ values must be distinct. We raise a +Shared::DataError+ if this isn't the case.
+  #     - This is a restriction that simplifies some of the algorithm code. It can be removed at the cost of some extra work. Issue
+  #       #9.
+  #
+  # @param verify [Boolean] when truthy, check that the properties of a PST are satisfied after construction, raising an exception
+  #        if not.
+  def initialize(data, dynamic: false, verify: false)
+    (0...(data.size)).each do |i|
+      data[i] = flip data[i]
+    end
+    @max_pst = DataStructuresRMolinari::MaxPrioritySearchTree.new(data, dynamic:, verify:)
+  end
+  ########################################
+  # "Lowest" points in SE and SW quadrants
+  # Return the "lowest" point in P to the "southeast" of (x0, y0).
+  #
+  # Let Q = [x0, infty) X (-infty, y0] be the southeast quadrant defined by the point (x0, y0) and let P be the points in this data
+  # structure. Define p* as
+  #
+  # - (infty, infty) if Q \intersect P is empty and
+  # - the lowest (min-y) point in Q \intersect P otherwise, breaking ties by preferring smaller values of x
+  #
+  # This method returns p* in O(log n) time and O(1) extra space.
+  def smallest_y_in_se(x0, y0)
+    flip @max_pst.largest_y_in_ne(x0, -y0)
+  end
+  # Return the "lowest" point in P to the "southwest" of (x0, y0).
+  #
+  # Let Q = (-infty, x0] X (-infty, y0] be the southwest quadrant defined by the point (x0, y0) and let P be the points in this data
+  # structure. Define p* as
+  #
+  # - (-infty, infty) if Q \intersect P is empty and
+  # - the lowest (min-y) point in Q \intersect P otherwise, breaking ties by preferring smaller values of x
+  #
+  # This method returns p* in O(log n) time and O(1) extra space.
+  def smallest_y_in_sw(x0, y0)
+    flip @max_pst.largest_y_in_nw(x0, -y0)
+  end
+  ########################################
+  # Leftmost SE and Rightmost SW
+  # Return the leftmost (min-x) point in P to the southeast of (x0, y0).
+  #
+  # Let Q = [x0, infty) X (-infty, y0] be the southeast quadrant defined by the point (x0, y0) and let P be the points in this data
+  # structure. Define p* as
+  #
+  # - (infty, -infty) if Q \intersect P is empty and
+  # - the leftmost (min-x) point in Q \intersect P otherwise.
+  #
+  # This method returns p* in O(log n) time and O(1) extra space.
+  def smallest_x_in_se(x0, y0)
+    flip @max_pst.smallest_x_in_ne(x0, -y0)
+  end
+  # Return the rightmost (max-x) point in P to the southwest of (x0, y0).
+  #
+  # Let Q = (-infty, x0] X (-infty, y0] be the southwest quadrant defined by the point (x0, y0) and let P be the points in this data
+  # structure. Define p* as
+  #
+  # - (-infty, -infty) if Q \intersect P is empty and
+  # - the rightmost (max-x) point in Q \intersect P otherwise.
+  #
+  # This method returns p* in O(log n) time and O(1) extra space.
+  def largest_x_in_sw(x0, y0)
+    flip @max_pst.largest_x_in_nw(x0, -y0)
+  end
+  ########################################
+  # Lowest 3 Sided
+  # Return the lowest point of P in the box bounded by x0, x1, and y0.
+  #
+  # Let Q = [x0, x1] X (-infty, y0] be the "three-sided" box bounded by x0, x1, and y0, and let P be the set of points in the
+  # MinPST. (Note that Q is empty if x1 < x0.) Define p* as
+  #
+  # - (infty, infty) if Q \intersect P is empty and
+  # - the lowest (min-y) point in Q \intersect P otherwise, breaking ties by preferring smaller x values.
+  #
+  # This method returns p* in O(log n) time and O(1) extra space.
+  def smallest_y_in_3_sided(x0, x1, y0)
+    flip @max_pst.largest_y_in_3_sided(x0, x1, -y0)
+  end
+  ########################################
+  # Enumerate 3 sided
+  # Enumerate the points of P in the box bounded by x0, x1, and y0.
+  #
+  # Let Q = [x0, x1] X (-infty, y0] be the "three-sided" box bounded by x0, x1, and y0, and let P be the set of points in the
+  # MinPST. (Note that Q is empty if x1 < x0.) We find and enumerate all the points in Q \intersect P.
+  #
+  # If the calling code provides a block then we +yield+ each point to it. Otherwise we return a set containing all the points in
+  # the intersection.
+  #
+  # This method runs in O(m + log n) time and O(1) extra space, where m is the number of points found.
+  def enumerate_3_sided(x0, x1, y0)
+    if block_given?
+      @max_pst.enumerate_3_sided(x0, x1, -y0) { |point| yield(flip point) }
+    else
+      Set.new( @max_pst.enumerate_3_sided(x0, x1, -y0).map { |pt| flip pt })
+    end
+  end
+  ########################################
+  # Delete top
+  # Delete the top (min-y) element of the PST. This is possible only for dynamic PSTs.
+  #
+  # It runs in guaranteed O(log n) time, where n is the size of the PST when it was initially constructed. As elements are deleted
+  # the internal tree structure is no longer guaranteed to be balanced and so we cannot guarantee operation in O(log n') time, where
+  # n' is the current size. In practice, "random" deletion is likely to leave the tree almost balanced.
+  #
+  # @return [Point] the top element that was deleted
+  def delete_top!
+    flip @max_pst.delete_top!
+  end
+  # (x, y) -> (x, -y)
+  private def flip(point)
+    Point.new(point.x, -point.y)
+  end
+end
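
The reflection trick used by this file can be sketched without the gem. The example below is a minimal illustration, not the gem's code: `Pt`, `NaiveMaxStructure`, and `NaiveMinStructure` are hypothetical names, the "max" structure is a brute-force O(n) stand-in for a real MaxPST, and an empty result is reported as `nil` rather than as a point with infinite coordinates.

```ruby
require 'set'

# Hypothetical point type for this sketch; not the gem's Point.
Pt = Struct.new(:x, :y)

# Brute-force stand-in for a MaxPST: O(n) per query, for illustration only.
class NaiveMaxStructure
  def initialize(points)
    @points = points
  end

  # Highest point with x >= x0 and y >= y0, ties broken by smaller x; nil if none.
  def largest_y_in_ne(x0, y0)
    @points.select { |p| p.x >= x0 && p.y >= y0 }.max_by { |p| [p.y, -p.x] }
  end

  # All points with x0 <= x <= x1 and y >= y0.
  def enumerate_3_sided(x0, x1, y0)
    @points.select { |p| p.x.between?(x0, x1) && p.y >= y0 }
  end
end

# A "min" structure built as a thin layer over the "max" one: every point is
# reflected through the x-axis on the way in and again on the way out.
class NaiveMinStructure
  def initialize(points)
    @max = NaiveMaxStructure.new(points.map { |p| flip(p) })
  end

  # Lowest point with x >= x0 and y <= y0. Note the negated y0 in the delegated call.
  def smallest_y_in_se(x0, y0)
    result = @max.largest_y_in_ne(x0, -y0)
    result && flip(result)
  end

  # Yields each point with x0 <= x <= x1 and y <= y0, or returns them as a Set.
  def enumerate_3_sided(x0, x1, y0)
    found = @max.enumerate_3_sided(x0, x1, -y0).map { |p| flip(p) }
    return Set.new(found) unless block_given?

    found.each { |p| yield p }
  end

  private

  # (x, y) -> (x, -y)
  def flip(point)
    Pt.new(point.x, -point.y)
  end
end

points = [Pt.new(1, 5), Pt.new(2, 1), Pt.new(3, 4), Pt.new(4, 2)]
min = NaiveMinStructure.new(points)

lowest_se = min.smallest_y_in_se(3, 4)   # the point (4, 2)
in_box = min.enumerate_3_sided(1, 3, 4)  # a Set holding (2, 1) and (3, 4)
```

Because the query bound is negated before delegation and each result is negated again on return, the condition y <= y0 on the original points becomes y >= -y0 on the reflected ones, which is the kind of query a max structure answers.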

data/lib/data_structures_rmolinari/{generic_segment_tree.rb → segment_tree_template.rb} RENAMED

File without changes

data/lib/data_structures_rmolinari/shared.rb CHANGED

@@ -4,7 +4,11 @@ module Shared
   INFINITY = Float::INFINITY
   # An (x, y) coordinate pair.
-  Point = Struct.new(:x, :y)
+  Point = Struct.new(:x, :y) do
+    def to_s
+      "[#{x}, #{y}]"
+    end
+  end
   # @private
@@ -50,21 +54,6 @@ module Shared
       l
     end
-    # i has no children
-    private def leaf?(i)
-      i > @last_non_leaf
-    end
-    # i has exactly one child (the left)
-    private def one_child?(i)
-      i == @parent_of_one_child
-    end
-    # i has two children
-    private def two_children?(i)
-      i <= @last_parent_of_two_children
-    end
     # i is the left child of its parent.
     private def left_child?(i)
       (i & 1).zero?
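
The first hunk above passes a block to `Struct.new` to give `Point` a custom `to_s`. A minimal sketch of the same pattern follows; the name `LabeledPoint` is hypothetical and chosen only to avoid clashing with the gem's `Point`.

```ruby
# Struct.new accepts a block whose methods are defined on the new struct class.
LabeledPoint = Struct.new(:x, :y) do
  # Compact form used by puts and by string interpolation.
  def to_s
    "[#{x}, #{y}]"
  end
end

pt = LabeledPoint.new(3, -1)
puts pt                    # prints [3, -1]
message = "top: #{pt}"     # interpolation calls to_s, giving "top: [3, -1]"
puts message
```

Only `to_s` is overridden here, so `inspect` (used by `p` and by irb) keeps the default struct output.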

data/lib/data_structures_rmolinari.rb CHANGED

@@ -2,18 +2,21 @@ require 'forwardable'
 require_relative 'data_structures_rmolinari/shared'
+# A namespace to hold the provided classes. We want to avoid polluting the global namespace with names like "Heap"
 module DataStructuresRMolinari
   # A struct responding to +.x+ and +.y+.
   Point = Shared::Point
 end
 # These define classes inside module DataStructuresRMolinari
+require_relative 'data_structures_rmolinari/algorithms'
 require_relative 'data_structures_rmolinari/disjoint_union'
-require_relative 'data_structures_rmolinari/generic_segment_tree'
+require_relative 'data_structures_rmolinari/c_disjoint_union' # version as a C extension
+require_relative 'data_structures_rmolinari/segment_tree_template'
 require_relative 'data_structures_rmolinari/heap'
 require_relative 'data_structures_rmolinari/max_priority_search_tree'
+require_relative 'data_structures_rmolinari/min_priority_search_tree'
-# A namespace to hold the provided classes. We want to avoid polluting the global namespace with names like "Heap"
 module DataStructuresRMolinari
   ########################################
   # Concrete instances of Segment Tree
@@ -73,7 +76,7 @@ module DataStructuresRMolinari
     #   - If there is more than one entry with that value, return one of the indices. There is no guarantee as to which one.
     #   - Return +nil+ if i > j
     def index_of_max_val_on(i, j)
-      @structure.query_on(i, j)&.first # discard the value part of the pair
+      @structure.query_on(i, j)&.first # discard the value part of the pair, which is a bookkeeping detail
     end
   end
 end
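
The `&.first` call in the last hunk handles two cases at once: it drops the value from an (index, value) pair, and it passes `nil` through when the range is empty. The sketch below illustrates this under stated assumptions: `PairQuery` and `MaxIndexFinder` are hypothetical stand-ins, and the linear scan replaces the real segment tree.

```ruby
# Hypothetical stand-in for the wrapped structure; scans linearly instead of using a tree.
class PairQuery
  def initialize(values)
    @values = values
  end

  # Returns [index, value] for a maximal value on i..j, or nil when i > j.
  def query_on(i, j)
    return nil if i > j

    (i..j).map { |k| [k, @values[k]] }.max_by { |_, v| v }
  end
end

# Hypothetical wrapper exposing only the index of the maximum.
class MaxIndexFinder
  def initialize(values)
    @structure = PairQuery.new(values)
  end

  # &. skips the call to #first when query_on returns nil, so nil is returned as-is.
  def index_of_max_val_on(i, j)
    @structure.query_on(i, j)&.first
  end
end

finder = MaxIndexFinder.new([3, 9, 4, 1])
whole = finder.index_of_max_val_on(0, 3)  # 1, the index of the value 9
tail = finder.index_of_max_val_on(2, 3)   # 2, the index of the value 4
empty = finder.index_of_max_val_on(3, 2)  # nil, because i > j
```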

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: data_structures_rmolinari
 version: !ruby/object:Gem::Version
-  version: 0.4.1
+  version: 0.4.2
 platform: ruby
 authors:
 - Rory Molinari
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2023-01-12 00:00:00.000000000 Z
+date: 2023-01-26 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: must_be
@@ -77,15 +77,22 @@ description: |
   See the homepage for more details.
 email: rorymolinari@gmail.com
 executables: []
-extensions: []
+extensions:
+- ext/c_disjoint_union/extconf.rb
 extra_rdoc_files: []
 files:
 - CHANGELOG.md
+- README.md
+- Rakefile
+- ext/c_disjoint_union/disjoint_union.c
+- ext/c_disjoint_union/extconf.rb
 - lib/data_structures_rmolinari.rb
+- lib/data_structures_rmolinari/algorithms.rb
 - lib/data_structures_rmolinari/disjoint_union.rb
-- lib/data_structures_rmolinari/generic_segment_tree.rb
 - lib/data_structures_rmolinari/heap.rb
 - lib/data_structures_rmolinari/max_priority_search_tree.rb
+- lib/data_structures_rmolinari/min_priority_search_tree.rb
+- lib/data_structures_rmolinari/segment_tree_template.rb
 - lib/data_structures_rmolinari/shared.rb
 homepage: https://github.com/rmolinari/data_structures
 licenses:
@@ -106,7 +113,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 3.3.26
+rubygems_version: 3.4.5
 signing_key:
 specification_version: 4
 summary: Several miscellaneous data structures I have implemented to learn about them.