data_structures_rmolinari 0.4.1 → 0.4.3
- checksums.yaml +4 -4
- data/CHANGELOG.md +25 -3
- data/README.md +141 -0
- data/Rakefile +16 -0
- data/ext/c_disjoint_union/disjoint_union.c +424 -0
- data/ext/c_disjoint_union/extconf.rb +12 -0
- data/lib/data_structures_rmolinari/algorithms.rb +103 -0
- data/lib/data_structures_rmolinari/max_priority_search_tree.rb +200 -58
- data/lib/data_structures_rmolinari/min_priority_search_tree.rb +187 -0
- data/lib/data_structures_rmolinari/{generic_segment_tree.rb → segment_tree_template.rb} +0 -0
- data/lib/data_structures_rmolinari/shared.rb +5 -16
- data/lib/data_structures_rmolinari.rb +6 -3
- metadata +12 -5
@@ -0,0 +1,187 @@
+require 'must_be'
+require 'set'
+require_relative 'shared'
+
+# A priority search tree (PST) stores a set, P, of two-dimensional points (x,y) in a way that allows efficient answers to certain
+# questions about P.
+#
+# This is a _Minimal_ Priority Search Tree (MinPST), a slight variant of the MaxPST. Where a MaxPST can answer queries about
+# regions infinite in the positive y direction, a MinPST can handle regions infinite in the negative y direction. (A MinmaxPST can
+# handle both kinds of region but has not been implemented.)
+#
+# The PST data structure was introduced in 1985 by Edward McCreight. Later, De, Maheshwari, Nandy, and Smid showed how to construct
+# a PST in-place (using only O(1) extra memory), at the expense of some slightly more complicated code for the various supported
+# operations. It is their approach that we have implemented. See the class +MaxPrioritySearchTree+ for more details.
+#
+# Here we implement the MinPST by adding a thin layer of code over a MaxPST and reflecting all points through the x-axis.
+#
+# This means a few things.
+# - The extra bookkeeping means that performance will be slightly slower than for the MaxPST. It is unlikely to be
+#   noticeable in practice.
+# - MaxPST builds the tree structure in place, modifying the data array passed to it. Indeed, this is the point of the approach of De
+#   et al. But we don't do that, as we create a separate array of Points.
+# - Whereas the implementation of MaxPST means that client code gets the same (x, y) objects back in results as it passed into the
+#   constructor, that's not the case here.
+#   - we map each point in the input - which is an object responding to +#x+ and +#y+ - to an instance of +Point+, and will return
+#     (different) instances of +Point+ in response to queries.
+#   - client code is unlikely to care, but be aware of this, just in case.
+#
+# Given a set of n points, we can answer the following questions quickly:
+#
+# - +smallest_x_in_se+: for x0 and y0, what is the "leftmost" point (x, y) in P satisfying x >= x0 and y <= y0?
+# - +largest_x_in_sw+: for x0 and y0, what is the "rightmost" point (x, y) in P satisfying x <= x0 and y <= y0?
+# - +smallest_y_in_se+: for x0 and y0, what is the "lowest" point (x, y) in P satisfying x >= x0 and y <= y0?
+# - +smallest_y_in_sw+: for x0 and y0, what is the "lowest" point (x, y) in P satisfying x <= x0 and y <= y0?
+# - +smallest_y_in_3_sided+: for x0, x1, and y0, what is the "lowest" point (x, y) in P satisfying x >= x0, x <= x1 and y <= y0?
+# - +enumerate_3_sided+: for x0, x1, and y0, enumerate all points in P satisfying x >= x0, x <= x1 and y <= y0.
+#
+# (Here, "leftmost/rightmost" means "minimal/maximal x", and "lowest" means "minimal y".)
+#
+# The first 5 operations take O(log n) time and O(1) extra space.
+#
+# The final operation (enumerate) takes O(m + log n) time and O(1) extra space, where m is the number of points that are enumerated.
+#
+# As with the MaxPST, the MinPST can be constructed to be "dynamic" and provide a +delete_top!+ operation running in O(log n) time.
+#
+# In the current implementation no two points can share an x-value. This (rather severe) restriction can be relaxed with some more
+# complicated code, but it hasn't been written yet. See issue #9.
+#
+# References:
+# * E.M. McCreight, _Priority search trees_, SIAM J. Comput., 14(2):257-276, 1985.
+# * M. De, A. Maheshwari, S. C. Nandy, M. Smid, _An In-Place Priority Search Tree_, 23rd Canadian Conference on Computational
+#   Geometry, 2011
+class DataStructuresRMolinari::MinPrioritySearchTree
+  include Shared
+  include BinaryTreeArithmetic
+
+  # Construct a MinPST from the collection of points in +data+.
+  #
+  # @param data [Array] the set P of points as an array. The internal data structure is constructed in-place inside this array
+  #   without cloning it. Indeed, each element of data is replaced by a different object.
+  #   - Each element of the array must respond to +#x+ and +#y+.
+  #   - The +x+ values must be distinct. We raise a +Shared::DataError+ if this isn't the case.
+  #     - This is a restriction that simplifies some of the algorithm code. It can be removed at the cost of some extra work. Issue
+  #       #9.
+  #
+  # @param verify [Boolean] when truthy, check that the properties of a PST are satisfied after construction, raising an exception
+  #   if not.
+  def initialize(data, dynamic: false, verify: false)
+    (0...(data.size)).each do |i|
+      data[i] = flip data[i]
+    end
+    @max_pst = DataStructuresRMolinari::MaxPrioritySearchTree.new(data, dynamic:, verify:)
+  end
+
+  ########################################
+  # "Lowest" points in SE and SW quadrants
+
+  # Return the "lowest" point in P to the "southeast" of (x0, y0).
+  #
+  # Let Q = [x0, infty) X (-infty, y0] be the southeast quadrant defined by the point (x0, y0) and let P be the points in this data
+  # structure. Define p* as
+  #
+  # - (infty, infty) if Q \intersect P is empty and
+  # - the lowest (min-y) point in Q \intersect P otherwise, breaking ties by preferring smaller values of x
+  #
+  # This method returns p* in O(log n) time and O(1) extra space.
+  def smallest_y_in_se(x0, y0)
+    flip @max_pst.largest_y_in_ne(x0, -y0)
+  end
+
+  # Return the "lowest" point in P to the "southwest" of (x0, y0).
+  #
+  # Let Q = (-infty, x0] X (-infty, y0] be the southwest quadrant defined by the point (x0, y0) and let P be the points in this data
+  # structure. Define p* as
+  #
+  # - (-infty, infty) if Q \intersect P is empty and
+  # - the lowest (min-y) point in Q \intersect P otherwise, breaking ties by preferring smaller values of x
+  #
+  # This method returns p* in O(log n) time and O(1) extra space.
+  def smallest_y_in_sw(x0, y0)
+    flip @max_pst.largest_y_in_nw(x0, -y0)
+  end
+
+  ########################################
+  # Leftmost SE and Rightmost SW
+
+  # Return the leftmost (min-x) point in P to the southeast of (x0, y0).
+  #
+  # Let Q = [x0, infty) X (-infty, y0] be the southeast quadrant defined by the point (x0, y0) and let P be the points in this data
+  # structure. Define p* as
+  #
+  # - (infty, -infty) if Q \intersect P is empty and
+  # - the leftmost (min-x) point in Q \intersect P otherwise.
+  #
+  # This method returns p* in O(log n) time and O(1) extra space.
+  def smallest_x_in_se(x0, y0)
+    flip @max_pst.smallest_x_in_ne(x0, -y0)
+  end
+
+  # Return the rightmost (max-x) point in P to the southwest of (x0, y0).
+  #
+  # Let Q = (-infty, x0] X (-infty, y0] be the southwest quadrant defined by the point (x0, y0) and let P be the points in this data
+  # structure. Define p* as
+  #
+  # - (-infty, -infty) if Q \intersect P is empty and
+  # - the rightmost (max-x) point in Q \intersect P otherwise.
+  #
+  # This method returns p* in O(log n) time and O(1) extra space.
+  def largest_x_in_sw(x0, y0)
+    flip @max_pst.largest_x_in_nw(x0, -y0)
+  end
+
+  ########################################
+  # Lowest 3 Sided
+
+  # Return the lowest point of P in the box bounded by x0, x1, and y0.
+  #
+  # Let Q = [x0, x1] X (-infty, y0] be the "three-sided" box bounded by x0, x1, and y0, and let P be the set of points in the
+  # MinPST. (Note that Q is empty if x1 < x0.) Define p* as
+  #
+  # - (infty, infty) if Q \intersect P is empty and
+  # - the lowest (min-y) point in Q \intersect P otherwise, breaking ties by preferring smaller x values.
+  #
+  # This method returns p* in O(log n) time and O(1) extra space.
+  def smallest_y_in_3_sided(x0, x1, y0)
+    flip @max_pst.largest_y_in_3_sided(x0, x1, -y0)
+  end
+
+  ########################################
+  # Enumerate 3 sided
+
+  # Enumerate the points of P in the box bounded by x0, x1, and y0.
+  #
+  # Let Q = [x0, x1] X (-infty, y0] be the "three-sided" box bounded by x0, x1, and y0, and let P be the set of points in the
+  # MinPST. (Note that Q is empty if x1 < x0.) We find and enumerate all the points in Q \intersect P.
+  #
+  # If the calling code provides a block then we +yield+ each point to it. Otherwise we return a set containing all the points in
+  # the intersection.
+  #
+  # This method runs in O(m + log n) time and O(1) extra space, where m is the number of points found.
+  def enumerate_3_sided(x0, x1, y0)
+    if block_given?
+      @max_pst.enumerate_3_sided(x0, x1, -y0) { |point| yield(flip point) }
+    else
+      Set.new( @max_pst.enumerate_3_sided(x0, x1, -y0).map { |pt| flip pt })
+    end
+  end
+
+  ########################################
+  # Delete top
+
+  # Delete the top (min-y) element of the PST. This is possible only for dynamic PSTs.
+  #
+  # It runs in guaranteed O(log n) time, where n is the size of the PST when it was initially constructed. As elements are deleted
+  # the internal tree structure is no longer guaranteed to be balanced and so we cannot guarantee operation in O(log n') time, where
+  # n' is the current size. In practice, "random" deletion is likely to leave the tree almost balanced.
+  #
+  # @return [Point] the top element that was deleted
+  def delete_top!
+    flip @max_pst.delete_top!
+  end
+
+  # (x, y) -> (x, -y)
+  private def flip(point)
+    Point.new(point.x, -point.y)
+  end
+end
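The reflection idea behind the MinPST can be seen in isolation in the sketch below. It is only an illustration under stated assumptions: +NaiveMaxPST+, +NaiveMinPST+, and +Pt+ are hypothetical stand-ins invented here, not classes from the gem, and the "max" queries are plain linear scans rather than O(log n) tree searches. What it does show is that negating y on the way in, negating y0 in each query, and negating y again on the way out turns a "north" query into a "south" one.

```ruby
require 'set'

# Hypothetical stand-ins for illustration; these are not the gem's classes.
# Queries are linear scans rather than the O(log n) tree searches of a real PST.
Pt = Struct.new(:x, :y)

class NaiveMaxPST
  def initialize(points)
    @points = points
  end

  # Highest point with x >= x0 and y >= y0, or (INFINITY, -INFINITY) if none exists.
  def largest_y_in_ne(x0, y0)
    hits = @points.select { |p| p.x >= x0 && p.y >= y0 }
    hits.max_by(&:y) || Pt.new(Float::INFINITY, -Float::INFINITY)
  end

  # All points with x0 <= x <= x1 and y >= y0.
  def enumerate_3_sided(x0, x1, y0)
    @points.select { |p| p.x.between?(x0, x1) && p.y >= y0 }
  end
end

class NaiveMinPST
  def initialize(points)
    # Reflect into a fresh array, so the caller's array is left untouched.
    @max_pst = NaiveMaxPST.new(points.map { |p| flip(p) })
  end

  # Lowest point with x >= x0 and y <= y0.
  def smallest_y_in_se(x0, y0)
    flip @max_pst.largest_y_in_ne(x0, -y0)
  end

  # All points with x0 <= x <= x1 and y <= y0, as a Set.
  def enumerate_3_sided(x0, x1, y0)
    Set.new(@max_pst.enumerate_3_sided(x0, x1, -y0).map { |p| flip(p) })
  end

  private

  # (x, y) -> (x, -y)
  def flip(point)
    Pt.new(point.x, -point.y)
  end
end

pst = NaiveMinPST.new([Pt.new(1, 5), Pt.new(2, 3), Pt.new(4, 7)])
lowest = pst.smallest_y_in_se(1, 6)       # lowest point with x >= 1 and y <= 6
found = pst.enumerate_3_sided(1, 4, 5)    # points with 1 <= x <= 4 and y <= 5
```

Unlike the constructor in the diff above, this sketch maps the input into a new array instead of overwriting the caller's elements; that is a choice made for the illustration, not a description of the gem's behaviour. Note also that the "nothing found" sentinel is flipped along with everything else, so an empty southeast query comes back as (infty, infty).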
File without changes
@@ -4,7 +4,11 @@ module Shared
   INFINITY = Float::INFINITY
 
   # An (x, y) coordinate pair.
-  Point = Struct.new(:x, :y)
+  Point = Struct.new(:x, :y) do
+    def to_s
+      "[#{x}, #{y}]"
+    end
+  end
 
   # @private
 
@@ -50,21 +54,6 @@ module Shared
       l
     end
 
-    # i has no children
-    private def leaf?(i)
-      i > @last_non_leaf
-    end
-
-    # i has exactly one child (the left)
-    private def one_child?(i)
-      i == @parent_of_one_child
-    end
-
-    # i has two children
-    private def two_children?(i)
-      i <= @last_parent_of_two_children
-    end
-
     # i is the left child of its parent.
     private def left_child?(i)
       (i & 1).zero?
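The practical effect of giving +Point+ a +to_s+ is easiest to see with string interpolation. The sketch below is a minimal, self-contained illustration; +SketchPoint+ is a hypothetical name used here so that the example does not depend on the gem being loaded.

```ruby
# SketchPoint is a hypothetical stand-in that mirrors the Struct definition in the hunk above.
SketchPoint = Struct.new(:x, :y) do
  # Compact coordinate-pair rendering, used by string interpolation and puts.
  def to_s
    "[#{x}, #{y}]"
  end
end

pt = SketchPoint.new(3, -4)
label = "lowest point: #{pt}" # interpolation calls to_s
```

Defining +to_s+ in the block passed to +Struct.new+ leaves +inspect+ and equality untouched, so only the human-readable rendering changes.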
@@ -2,18 +2,21 @@ require 'forwardable'
 
 require_relative 'data_structures_rmolinari/shared'
 
+# A namespace to hold the provided classes. We want to avoid polluting the global namespace with names like "Heap"
 module DataStructuresRMolinari
   # A struct responding to +.x+ and +.y+.
   Point = Shared::Point
 end
 
 # These define classes inside module DataStructuresRMolinari
+require_relative 'data_structures_rmolinari/algorithms'
 require_relative 'data_structures_rmolinari/disjoint_union'
-require_relative 'data_structures_rmolinari/
+require_relative 'data_structures_rmolinari/c_disjoint_union' # version as a C extension
+require_relative 'data_structures_rmolinari/segment_tree_template'
 require_relative 'data_structures_rmolinari/heap'
 require_relative 'data_structures_rmolinari/max_priority_search_tree'
+require_relative 'data_structures_rmolinari/min_priority_search_tree'
 
-# A namespace to hold the provided classes. We want to avoid polluting the global namespace with names like "Heap"
 module DataStructuresRMolinari
   ########################################
   # Concrete instances of Segment Tree
@@ -73,7 +76,7 @@ module DataStructuresRMolinari
     # - If there is more than one entry with that value, return one of the indices. There is no guarantee as to which one.
     # - Return +nil+ if i > j
     def index_of_max_val_on(i, j)
-      @structure.query_on(i, j)&.first # discard the value part of the pair
+      @structure.query_on(i, j)&.first # discard the value part of the pair, which is a bookkeeping
     end
   end
 end
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: data_structures_rmolinari
 version: !ruby/object:Gem::Version
-  version: 0.4.
+  version: 0.4.3
 platform: ruby
 authors:
 - Rory Molinari
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2023-01-
+date: 2023-01-27 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: must_be
@@ -77,15 +77,22 @@ description: |
   See the homepage for more details.
 email: rorymolinari@gmail.com
 executables: []
-extensions:
+extensions:
+- ext/c_disjoint_union/extconf.rb
 extra_rdoc_files: []
 files:
 - CHANGELOG.md
+- README.md
+- Rakefile
+- ext/c_disjoint_union/disjoint_union.c
+- ext/c_disjoint_union/extconf.rb
 - lib/data_structures_rmolinari.rb
+- lib/data_structures_rmolinari/algorithms.rb
 - lib/data_structures_rmolinari/disjoint_union.rb
-- lib/data_structures_rmolinari/generic_segment_tree.rb
 - lib/data_structures_rmolinari/heap.rb
 - lib/data_structures_rmolinari/max_priority_search_tree.rb
+- lib/data_structures_rmolinari/min_priority_search_tree.rb
+- lib/data_structures_rmolinari/segment_tree_template.rb
 - lib/data_structures_rmolinari/shared.rb
 homepage: https://github.com/rmolinari/data_structures
 licenses:
@@ -106,7 +113,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
   - !ruby/object:Gem::Version
     version: '0'
 requirements: []
-rubygems_version: 3.
+rubygems_version: 3.4.5
 signing_key:
 specification_version: 4
 summary: Several miscellaneous data structures I have implemented to learn about them.