data_structures_rmolinari 0.4.4 → 0.5.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 943ac55678a074cc0da3667dccbb07ee7d203639233f53bd8587af7fd8cd062e
4
- data.tar.gz: ad235e5f4714e699f1cf5f113dd4b3a356a194cced5a74b60e17c5e3a896e01b
3
+ metadata.gz: 7682f6d3b0779f347ce0797f55f33b9d7dcc7bd9c2039fc2fd6f865eb72e085a
4
+ data.tar.gz: d717e5e36f79ddc4ecb605a59b475b7114359dea7476445590deb300f7915bd4
5
5
  SHA512:
6
- metadata.gz: a68de76c88c67fadc42752610c695b1f0b8fd17f34db9c806291aeab4c933fe84c6523615deb4197e1c9fa6d36dce30987cc4e8896a2b0c1700b7e72b5bd2fff
7
- data.tar.gz: 9063d89a98d599f27db2585bf383dbfb13e8f927abce64ac7eafb2edd70c490ddad1f1fc51e0f11c24adf29f28ab8c56548a6db264b15ace239c63b1a2ce5a01
6
+ metadata.gz: c3ffd9a4f67f55b7a2df1c949cf2288c06fcae416d5ff03a10307a1b79c3dae1daa74e2576d5e190c989adeea47b046426fad8c3c64199aadf22ba500b317f36
7
+ data.tar.gz: 8380d6117f2955da9362395f8315f5121b4f7afba2f69aabb1981a01b675cbbed81d07c10b5745409080c2588c92df3d676ec36efa128571234a74dceef0e20d
data/CHANGELOG.md CHANGED
@@ -2,6 +2,17 @@
2
2
 
3
3
  ## [Unreleased]
4
4
 
5
+ ## [0.5.0] 2023-02-03
6
+
7
+ - SegmentTree
8
+ - Reorganize the code into a SegmentTree submodule.
9
+ - Provide a convenience method for getting concrete instances.
10
+
11
+ - README.md
12
+ - Add some simple example code for the data types.
13
+
14
+ ## [0.4.4] 2023-02-02
15
+
5
16
  - Disjoint Union
6
17
  - C extension: use Convenient Containers rather than my janky Dynamic Array attempt.
7
18
 
data/README.md CHANGED
@@ -14,18 +14,6 @@ The code is available as a gem: https://rubygems.org/gems/data_structures_rmolin
14
14
  The right way to organize the code is not obvious to me. For now the data structures are all defined in the module
15
15
  `DataStructuresRMolinari` to avoid polluting the global namespace.
16
16
 
17
- Example usage after the gem is installed:
18
- ```
19
- require 'data_structures_rmolinari`
20
-
21
- # Pull what we need out of the namespace
22
- MaxPrioritySearchTree = DataStructuresRMolinari::MaxPrioritySearchTree
23
- Point = DataStructuresRMolinari::Point # anything responding to :x and :y is fine
24
-
25
- pst = MaxPrioritySearchTree.new([Point.new(1, 1)])
26
- puts pst.largest_y_in_ne(0, 0) # "Point(1,1)"
27
- ```
28
-
29
17
  # Implementations
30
18
 
31
19
  ## Disjoint Union
@@ -42,6 +30,23 @@ It also provides
42
30
  For more details see https://en.wikipedia.org/wiki/Disjoint-set_data_structure and the paper [[TvL1984]](#references) by Tarjan and
43
31
  van Leeuwen.
44
32
 
33
+ ``` ruby
34
+ require 'data_structures_rmolinari'
35
+ DisjointUnion = DataStructuresRMolinari::DisjointUnion
36
+
37
+ # Create an instance over the "universe" 0, 1, ..., 9.
38
+ du = DisjointUnion.new(10)
39
+ du.subset_count # => 10; each element starts out in its own subset
40
+
41
+ du.unite(2, 3) # say that 2 and 3 are actually in the same subset
42
+ du.subset_count # => 9
43
+ du.find(2) == du.find(3) # => true
44
+
45
+ du.unite(4, 5)
46
+ du.unite(3, 4) # now 2, 3, 4, and 5 are all in the same subset
47
+ du.subset_count # => 7
48
+ ```
49
+
45
50
  ## Heap
46
51
 
47
52
  This is a standard binary heap with an `update` method, suitable for use as a priority queue. There are several supported
@@ -63,6 +68,24 @@ allows the insertion of duplicate items (which is sometimes useful) and slightly
63
68
 
64
69
  See https://en.wikipedia.org/wiki/Binary_heap and the paper by Edelkamp, Elmasry, and Katajainen [[EEK2017]](#references).
65
70
 
71
+ ``` ruby
72
+ require 'data_structures_rmolinari'
73
+ Heap = DataStructuresRMolinari::Heap
74
+
75
+ data = [4, 3, 2, 1]
76
+
77
+ heap = Heap.new
78
+
79
+ # Insert the elements of data, each with itself as priority.
80
+ data.each { |v| heap.insert(v, v) }
81
+
82
+ heap.top # => 1, since we have a min-heap.
83
+ heap.pop # => 1
84
+ heap.top # => 2; with 1 gone, this is the element with least priority
85
+ heap.update(3, -3)
86
+ heap.top # => 3; now 3 is the element with least priority
87
+ ```
88
+
66
89
  ## Priority Search Tree
67
90
 
68
91
  A PST stores a set P of two-dimensional points in a way that allows certain queries about P to be answered efficiently. The data
@@ -96,13 +119,27 @@ regions.
96
119
  By default these data structures are immutable: once constructed they cannot be changed. But there is a constructor option that
97
120
  makes the instance "dynamic". This allows us to delete the element at the root of the tree - the one with largest y value (smallest
98
121
  for MinPST) - with the `delete_top!` method. This operation is important in certain algorithms, such as enumerating all maximal
99
- empty rectangles (see the second paper by De et al.[[DMNS2013]](#references)) Note that points can still not be added to the PST in
122
+ empty rectangles (see the second paper by De et al. [[DMNS2013]](#references)). Note that points can still not be added to the PST in
100
123
  any case, and choosing the dynamic option makes certain internal bookkeeping operations slower.
101
124
 
102
125
  In [[DMNS2013]](#references) De et al. generalize the in-place structure to a _Min-max Priority Search Tree_ (MinmaxPST) that can
103
126
  answer queries in all four quadrants and both "kinds" of 3-sided boxes. Having one of these would save the trouble of constructing
104
127
  both a MaxPST and MinPST. But the presentation is hard to follow in places and the paper's pseudocode is buggy.[^minmaxpst]
105
128
 
129
+ ``` ruby
130
+ require 'data_structures_rmolinari'
131
+ MaxPST = DataStructuresRMolinari::MaxPrioritySearchTree
132
+ Point = Shared::Point # simple (x, y) struct. Anything responding to #x and #y will work
133
+
134
+ data = [Point.new(0, 0), Point.new(1, 2), Point.new(2, 1)]
135
+ pst = MaxPST.new(data)
136
+
137
+ pst.largest_y_in_ne(0, 0) # => #<struct Shared::Point x=1, y=2>
138
+ pst.largest_y_in_ne(1, 1) # => #<struct Shared::Point x=1, y=2>
139
+ pst.largest_y_in_ne(1.5, 1) # => #<struct Shared::Point x=2, y=1>
140
+ pst.largest_y_in_3_sided(-0.5, 0.5, 0) # => #<struct Shared::Point x=0, y=0>
141
+ ```
142
+
106
143
  ## Segment Tree
107
144
 
108
145
  A segment tree stores information related to subintervals of a certain array. For example, a segment tree can be used to find the
@@ -112,11 +149,37 @@ arbitrary subarrays.
112
149
 
113
150
  An excellent description of the idea is found at https://cp-algorithms.com/data_structures/segment_tree.html.
114
151
 
115
- Generic code is provided in `SegmentTreeTemplate`. Concrete classes provide a handful of simple lambdas and constants to the
116
- template class's initializer. Figuring out the details requires some knowledge of the internal mechanisms of a segment tree, for
117
- which the link at cp-algorithms.com is very helpful. See the definitions of the concrete classes `MaxValSegmentTree` and
152
+ Generic code is provided in `SegmentTree::SegmentTreeTemplate` and its equivalent (and faster) C-based sibling,
153
+ `SegmentTree::CSegmentTreeTemplate` (see [below](#c-extensions)).
154
+
155
+ Writing a concrete segment tree class just means providing some simple lambdas and constants to the template class's
156
+ initializer. Figuring out the details requires some knowledge of the internal mechanisms of a segment tree, for which the link at
157
+ cp-algorithms.com is very helpful. See the implementations of the concrete classes `MaxValSegmentTree` and
118
158
  `IndexOfMaxValSegmentTree` for examples.
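As a rough illustration of what "providing some simple lambdas and constants" can look like, here is a minimal, self-contained sketch. `ToySegmentTree` is a hypothetical stand-in written for this example, not the gem's `SegmentTreeTemplate`; it only mirrors the keyword arguments (`combine:`, `single_cell_array_val:`, `size:`, `identity:`) and the `query_on`/`update_at` methods that appear in this diff. `SumSegmentTree` and `sum_on` are likewise made up for illustration.

```ruby
# Hypothetical stand-in for the template class: a recursive segment tree stored in an array.
class ToySegmentTree
  def initialize(combine:, single_cell_array_val:, size:, identity:)
    @combine = combine
    @single_cell_array_val = single_cell_array_val
    @size = size
    @identity = identity
    @tree = []
    build(1, 0, @size - 1) if @size.positive?
  end

  # Combined value over the closed interval i..j, or the identity if the interval is empty.
  def query_on(i, j)
    return @identity if i > j || @size.zero?

    query(1, 0, @size - 1, i, j)
  end

  # Re-read the value at idx through the lambda and recompute the ancestors of that leaf.
  def update_at(idx)
    update(1, 0, @size - 1, idx)
  end

  private

  def build(node, lo, hi)
    if lo == hi
      @tree[node] = @single_cell_array_val.call(lo)
    else
      mid = (lo + hi) / 2
      build(2 * node, lo, mid)
      build(2 * node + 1, mid + 1, hi)
      @tree[node] = @combine.call(@tree[2 * node], @tree[2 * node + 1])
    end
  end

  def query(node, lo, hi, i, j)
    return @identity if j < lo || hi < i      # no overlap with this node's interval
    return @tree[node] if i <= lo && hi <= j  # node's interval lies inside i..j

    mid = (lo + hi) / 2
    @combine.call(query(2 * node, lo, mid, i, j), query(2 * node + 1, mid + 1, hi, i, j))
  end

  def update(node, lo, hi, idx)
    if lo == hi
      @tree[node] = @single_cell_array_val.call(lo)
    else
      mid = (lo + hi) / 2
      if idx <= mid
        update(2 * node, lo, mid, idx)
      else
        update(2 * node + 1, mid + 1, hi, idx)
      end
      @tree[node] = @combine.call(@tree[2 * node], @tree[2 * node + 1])
    end
  end
end

# A made-up concrete class: sums over subintervals.
class SumSegmentTree
  def initialize(template_klass, data)
    @structure = template_klass.new(
      combine: ->(a, b) { a + b },
      single_cell_array_val: ->(i) { data[i] }, # closes over data, so updates are visible
      size: data.size,
      identity: 0
    )
  end

  def sum_on(i, j)
    @structure.query_on(i, j)
  end

  def update_at(idx)
    @structure.update_at(idx)
  end
end

data = [1, -3, 2, 1, 5, -9]
tree = SumSegmentTree.new(ToySegmentTree, data)
tree.sum_on(0, 2) # => 0
tree.sum_on(1, 4) # => 5

data[1] = 3
tree.update_at(1)
tree.sum_on(0, 2) # => 6
```

The only sum-specific parts are the `combine` lambda and the `identity` value; switching to a maximum would mean `[a, b].max` and negative infinity, as in `MaxValSegmentTree`.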
119
159
 
160
+ Since there are several concrete "types" and two underlying generic implementations there is a convenience method on the `SegmentTree`
161
+ module to get instances.
162
+
163
+ ``` ruby
164
+ require 'data_structures_rmolinari'
165
+ SegmentTree = DataStructuresRMolinari::SegmentTree # namespace module
166
+
167
+ data = [1, -3, 2, 1, 5, -9]
168
+
169
+ # Get a segment tree instance that will answer "max over this subinterval" questions about data.
170
+ # Here we get one using the ruby implementation of the generic functionality.
171
+ #
172
+ # We offer :index_of_max as an alternative to :max. This will construct an instance that answers
173
+ # questions of the form "an index of the maximum value over this subinterval".
174
+ #
175
+ # To use the version written in C, put :c instead of :ruby.
176
+ seg_tree = SegmentTree.construct(data, :max, :ruby)
177
+
178
+ seg_tree.max_on(0, 2) # => 2
179
+ seg_tree.max_on(1, 4) # => 5
180
+ # ..etc..
181
+ ```
182
+
120
183
  ## Algorithms
121
184
 
122
185
  The Algorithms submodule contains some algorithms using the data structures.
@@ -130,13 +193,12 @@ The Algorithms submodule contains some algorithms using the data structures.
130
193
 
131
194
  # C Extensions
132
195
 
133
- As another learning process I have implemented several of these data structures as C extensions. The class names have a "C" prefixed
134
- and they can be required like their pure Ruby versions. They have the same APIs as their Ruby cousins.
196
+ As another learning process I have implemented several of these data structures as C extensions. The APIs are the same.
135
197
 
136
198
  ## Disjoint Union
137
199
 
138
- A benchmark suggests that a long sequence of `unite` operations is about 3 times as fast with the `CDisjointUnion` as with
139
- `DisjointUnion`.
200
+ The C version is called `CDisjointUnion`. A benchmark suggests that a long sequence of `unite` operations is about 3 times as fast
201
+ with `CDisjointUnion` as with `DisjointUnion`.
140
202
 
141
203
  The implementation uses the remarkable Convenient Containers library from Jackson Allan [[Allan]](#references).
142
204
 
@@ -145,16 +207,21 @@ The implementation uses the remarkable Convenient Containers library from Jackso
145
207
  `CSegmentTreeTemplate` is the C implementation of the generic class. Concrete classes are built on top of this in Ruby, just as with
146
208
  the pure Ruby `SegmentTreeTemplate` class.
147
209
 
148
- A benchmark suggests that a long sequence of `max_on` operations against a max-val Segment Tree is about 4 times as fast with the C
149
- version as with the Ruby version. I'm a bit suprised the improvment isn't larger, but we must remember that the C code must still
150
- interact with the Ruby objects in the underlying data array, and must "combine" them, etc., by calling Ruby lambdas.
210
+ A benchmark suggests that a long sequence of `max_on` operations against a max-val Segment Tree is about 4 times as fast with C as
211
+ with Ruby. I'm a bit surprised the improvement isn't larger, but remember that the C code must still interact with the Ruby objects in
212
+ the underlying data array, and must combine them, etc., via Ruby lambdas.
151
213
 
152
214
  # References
153
- - [Allan] Allan, J., _CC: Convenient Containers_, https://github.com/JacksonAllan/CC, retrieved 2023-02-01.
154
- - [TvL1984] Tarjan, Robert E., van Leeuwen, J., _Worst-case Analysis of Set Union Algorithms_, Journal of the ACM, v31:2 (1984), pp 245–281.
155
- - [EEK2017] Edelkamp, S., Elmasry, A., Katajainen, J., _Optimizing Binary Heaps_, Theory Comput Syst (2017), vol 61, pp 606-636, DOI 10.1007/s00224-017-9760-2.
156
- - [McC1985] McCreight, E. M., _Priority Search Trees_, SIAM J. Comput., 14(2):257-276, 1985.
157
- - [DMNS2011] De, M., Maheshwari, A., Nandy, S. C., Smid, M., _An In-Place Priority Search Tree_, 23rd Canadian Conference on Computational Geometry, 2011.
158
- - [DMNS2013] De, M., Maheshwari, A., Nandy, S. C., Smid, M., _An In-Place Min-max Priority Search Tree_, Computational Geometry, v46 (2013), pp 310-327.
215
+ - [Allan] Allan, J., _CC: Convenient Containers_, https://github.com/JacksonAllan/CC, (retrieved 2023-02-01).
216
+ - [TvL1984] Tarjan, Robert E., van Leeuwen, J., _Worst-case Analysis of Set Union Algorithms_, Journal of the ACM, v31:2 (1984), pp
217
+ 245–281, https://dl.acm.org/doi/10.1145/62.2160 (retrieved 2022-02-01).
218
+ - [EEK2017] Edelkamp, S., Elmasry, A., Katajainen, J., _Optimizing Binary Heaps_, Theory Comput Syst (2017), vol 61, pp 606-636, DOI
219
+ 10.1007/s00224-017-9760-2, https://kclpure.kcl.ac.uk/portal/files/87388857/TheoryComputingSzstems.pdf (retrieved 2022-02-02).
220
+ - [McC1985] McCreight, E. M., _Priority Search Trees_, SIAM J. Comput., 14(2):257-276, 1985,
221
+ http://www.cs.duke.edu/courses/fall08/cps234/handouts/SMJ000257.pdf (retrieved 2023-02-02).
222
+ - [DMNS2011] De, M., Maheshwari, A., Nandy, S. C., Smid, M., _An In-Place Priority Search Tree_, 23rd Canadian Conference on
223
+ Computational Geometry, 2011, http://www.cs.carleton.ca/~michiel/inplace_pst.pdf (retrieved 2023-02-02).
224
+ - [DMNS2013] De, M., Maheshwari, A., Nandy, S. C., Smid, M., _An In-Place Min-max Priority Search Tree_, Computational Geometry, v46
225
+ (2013), pp 310-327, https://people.scs.carleton.ca/~michiel/MinMaxPST.pdf (retrieved 2023-02-02).
159
226
 
160
227
  [^minmaxpst]: See the comments in the fragmentary class `MinMaxPrioritySearchTree` for further details.
@@ -353,7 +353,8 @@ static VALUE segment_tree_update_at(VALUE self, VALUE idx) {
353
353
  * (see SegmentTreeTemplate)
354
354
  */
355
355
  void Init_c_segment_tree_template() {
356
- VALUE cSegmentTreeTemplate = rb_define_class_under(mDataStructuresRMolinari, "CSegmentTreeTemplate", rb_cObject);
356
+ VALUE mSegmentTree = rb_define_module_under(mDataStructuresRMolinari, "SegmentTree");
357
+ VALUE cSegmentTreeTemplate = rb_define_class_under(mSegmentTree, "CSegmentTreeTemplate", rb_cObject);
357
358
 
358
359
  rb_define_alloc_func(cSegmentTreeTemplate, segment_tree_alloc);
359
360
  rb_define_method(cSegmentTreeTemplate, "c_initialize", segment_tree_init, 4);
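For readers less familiar with the Ruby C API: `rb_define_module_under` returns the module if it is already defined under the given parent, so the two new lines amount to nesting the class one level deeper. A rough Ruby-level picture of the resulting namespace (a sketch for illustration, not code from the gem) would be:

```ruby
# Ruby-level equivalent of the namespace the two C calls set up (illustrative only).
module DataStructuresRMolinari
  # rb_define_module_under(mDataStructuresRMolinari, "SegmentTree")
  module SegmentTree
    # rb_define_class_under(mSegmentTree, "CSegmentTreeTemplate", rb_cObject)
    class CSegmentTreeTemplate
    end
  end
end

# The class is reachable through the SegmentTree namespace...
DataStructuresRMolinari::SegmentTree::CSegmentTreeTemplate.name
# => "DataStructuresRMolinari::SegmentTree::CSegmentTreeTemplate"

# ...and no longer sits directly under the top-level module.
DataStructuresRMolinari.const_defined?(:CSegmentTreeTemplate, false) # => false
```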
@@ -1,4 +1,4 @@
1
- # A collection of algorithms that use the module's data structures but don't belong as a method on one of the data structures
1
+ # Algorithms that use the module's data structures but don't belong as a method on one of the data structures
2
2
  module DataStructuresRMolinari::Algorithms
3
3
  include Shared
4
4
 
@@ -11,12 +11,12 @@ module DataStructuresRMolinari::Algorithms
11
11
  #
12
12
  # A _maximal empty rectangle_ (MER) for P is an empty rectangle for P not properly contained in any other.
13
13
  #
14
- # We enumerate all maximal empty rectangles for P, yielding each as (left, right, bottom, top) to a block. The algorithm is due to
15
- # De, M., Maheshwari, A., Nandy, S. C., Smid, M., _An In-Place Min-max Priority Search Tree_, Computational Geometry, v46 (2013),
16
- # pp 310-327.
14
+ # We enumerate all maximal empty rectangles for P, yielding each as (left, right, bottom, top). The algorithm is due to De, M.,
15
+ # Maheshwari, A., Nandy, S. C., Smid, M., _An In-Place Min-max Priority Search Tree_, Computational Geometry, v46 (2013), pp
16
+ # 310-327.
17
17
  #
18
18
  # It runs in O(m log n) time, where m is the number of MERs enumerated and n is the number of points in P. (Constructing the
19
- # MaxPST below takes O(n log^2 n) time, but m = O(n^2) so we are still O(m log n) overall.)
19
+ # MaxPST takes O(n log^2 n) time, but m = O(n^2) so we are still O(m log n) overall.)
20
20
  #
21
21
  # @param points [Array] an array of points in the x-y plane. Each must respond to +x+ and +y+.
22
22
  def self.maximal_empty_rectangles(points)
@@ -3,110 +3,13 @@ require 'must_be'
3
3
  require_relative 'shared'
4
4
  require_relative 'c_segment_tree_template'
5
5
 
6
- # The template of Segment Tree, which can be used for various interval-related purposes, like efficiently finding the sum (or min or
7
- # max) on a arbitrary subarray of a given array.
6
+ # The underlying functionality of the Segment Tree data type, implemented in C as a Ruby extension.
8
7
  #
9
- # There is an excellent description of the data structure at https://cp-algorithms.com/data_structures/segment_tree.html. The
10
- # Wikipedia article (https://en.wikipedia.org/wiki/Segment_tree) appears to describe a different data structure which is sometimes
11
- # called an "interval tree."
12
- #
13
- # For more details (and some close-to-metal analysis of run time, especially for large datasets) see
14
- # https://en.algorithmica.org/hpc/data-structures/segment-trees/. In particular, this shows how to do a bottom-up implementation,
15
- # which is faster, at least for large datasets and cache-relevant compiled code. These issues don't really apply to code written in
16
- # Ruby.
17
- #
18
- # This is a generic implementation, intended to allow easy configuration for concrete instances. See the parameters to the
19
- # initializer and the definitions of concrete realisations like MaxValSegmentTree.
20
- #
21
- # We do O(n) work to build the internal data structure at initialization. Then we answer queries in O(log n) time.
8
+ # See SegmentTreeTemplate for more information.
22
9
  class DataStructuresRMolinari::CSegmentTreeTemplate
23
-
24
- # Construct a concrete instance of a Segment Tree. See details at the links above for the underlying concepts here.
25
- # @param combine a lambda that takes two values and munges them into a combined value.
26
- # - For example, if we are calculating sums over subintervals, combine.call(a, b) = a + b, while if we are doing maxima we will
27
- # return max(a, b).
28
- # - Things get more complicated when we are calculating, say, the _index_ of the maximal value in a subinterval. Now it is not
29
- # enough simply to store that index at each tree node, because to combine the indices from two child nodes we need to know
30
- # both the index of the maximal element in each child node's interval, but also the maximal values themselves, so we know
31
- # which one "wins" for the parent node. This affects the sort of work we need to do when combining and the value provided by
32
- # the +single_cell_array_val+ lambda.
33
- # @param single_cell_array_val a lambda that takes an index i and returns the value we need to store in the #build
34
- # operation for the subinterval i..i.
35
- # - This will often simply be the value data[i], but in some cases it will be something else. For example, when we are
36
- # calculating the index of the maximal value on each subinterval we need [i, data[i]] here.
37
- # - If +update_at+ is called later, this lambda must close over the underlying data in a way that captures the updated value.
38
- # @param size the size of the underlying data array, used in certain internal arithmetic.
39
- # @param identity the value to return when we are querying on an empty interval
40
- # - for sums, this will be zero; for maxima, this will be -Infinity, etc
10
+ # (see SegmentTreeTemplate::initialize)
41
11
  def initialize(combine:, single_cell_array_val:, size:, identity:)
42
12
  # having sorted out the keyword arguments, pass them more easily to the C layer.
43
13
  c_initialize(combine, single_cell_array_val, size, identity)
44
14
  end
45
15
  end
46
-
47
- # A segment tree that for an array A(0...n) answers questions of the form "what is the maximum value in the subinterval A(i..j)?"
48
- # in O(log n) time.
49
- #
50
- # C version
51
- #
52
- # TODO: share the definition with (non-C) MasValSegmentTree. The only difference is the class of the underlying segment tree
53
- # template.
54
- module DataStructuresRMolinari
55
- class CMaxValSegmentTree
56
- extend Forwardable
57
-
58
- # Tell the tree that the value at idx has changed
59
- def_delegator :@structure, :update_at
60
-
61
- # @param data an object that contains values at integer indices based at 0, via +data[i]+.
62
- # - This will usually be an Array, but it could also be a hash or a proc.
63
- def initialize(data)
64
- @structure = CSegmentTreeTemplate.new(
65
- combine: ->(a, b) { [a, b].max },
66
- single_cell_array_val: ->(i) { data[i] },
67
- size: data.size,
68
- identity: -Shared::INFINITY
69
- )
70
- end
71
-
72
- # The maximum value in A(i..j).
73
- #
74
- # The arguments must be integers in 0...(A.size)
75
- # @return the largest value in A(i..j) or -Infinity if i > j.
76
- def max_on(i, j)
77
- @structure.query_on(i, j)
78
- end
79
- end
80
-
81
- # A segment tree that for an array A(0...n) answers questions of the form "what is the index of the maximal value in the
82
- # subinterval A(i..j)?" in O(log n) time.
83
- #
84
- # C version
85
- class CIndexOfMaxValSegmentTree
86
- extend Forwardable
87
-
88
- # Tell the tree that the value at idx has changed
89
- def_delegator :@structure, :update_at
90
-
91
- # @param (see MaxValSegmentTree#initialize)
92
- def initialize(data)
93
- @structure = CSegmentTreeTemplate.new(
94
- combine: ->(p1, p2) { p1[1] >= p2[1] ? p1 : p2 },
95
- single_cell_array_val: ->(i) { [i, data[i]] },
96
- size: data.size,
97
- identity: nil
98
- )
99
- end
100
-
101
- # The index of the maximum value in A(i..j)
102
- #
103
- # The arguments must be integers in 0...(A.size)
104
- # @return (Integer, nil) the index of the largest value in A(i..j) or +nil+ if i > j.
105
- # - If there is more than one entry with that value, return one the indices. There is no guarantee as to which one.
106
- # - Return +nil+ if i > j
107
- def index_of_max_val_on(i, j)
108
- @structure.query_on(i, j)&.first # discard the value part of the pair, which is a bookkeeping
109
- end
110
- end
111
-
112
- end
@@ -89,5 +89,7 @@ class DataStructuresRMolinari::DisjointUnion
89
89
  else
90
90
  @d[e] = f
91
91
  end
92
+
93
+ nil
92
94
  end
93
95
  end
@@ -0,0 +1,126 @@
1
+ require_relative 'shared'
2
+
3
+ # A namespace to hold the various bits and bobs related to the SegmentTree implementation
4
+ module DataStructuresRMolinari::SegmentTree
5
+ end
6
+
7
+ require_relative 'segment_tree_template' # Ruby implementation of the generic API
8
+ require_relative 'c_segment_tree_template' # C implementation of the generic API
9
+
10
+ # Segment Tree: various concrete implementations
11
+ #
12
+ # There is an excellent description of the data structure at https://cp-algorithms.com/data_structures/segment_tree.html. The
13
+ # Wikipedia article (https://en.wikipedia.org/wiki/Segment_tree) appears to describe a different data structure which is sometimes
14
+ # called an "interval tree."
15
+ #
16
+ # For more details (and some close-to-metal analysis of run time, especially for large datasets) see
17
+ # https://en.algorithmica.org/hpc/data-structures/segment-trees/. In particular, this shows how to do a bottom-up implementation,
18
+ # which is faster, at least for large datasets and cache-relevant compiled code. These issues don't really apply to code written in
19
+ # Ruby.
20
+ #
21
+ # Here we provide several concrete segment tree implementations built on top of the template (generic) versions. Each instance is
22
+ # backed either by the pure Ruby SegmentTreeTemplate or its C-based sibling CSegmentTreeTemplate.
23
+ module DataStructuresRMolinari
24
+ module SegmentTree
25
+ # A convenience method to construct a Segment Tree that, for a given array A(0...size), answers questions of the kind given by
26
+ # operation, using the template written in lang
27
+ #
28
+ # - @param data: the array A.
29
+ # - It must respond to +#size+ and to +#[]+ with non-negative integer arguments.
30
+ # - @param operation: a supported "style" of Segment Tree
31
+ # - for now, must be one of these (but you can write your own concrete version)
32
+ # - +:max+: implementing +max_on(i, j)+, returning the maximum value in A(i..j)
33
+ # - +:index_of_max+: implementing +index_of_max_val_on(i, j)+, returning an index corresponding to the maximum value in
34
+ # A(i..j).
35
+ # - @param lang: the language in which the underlying "template" is written
36
+ # - +:c+ or +:ruby+
37
+ # - the C version will run faster but for now may be buggier and harder to debug
38
+ module_function def construct(data, operation, lang)
39
+ operation.must_be_in [:max, :index_of_max]
40
+ lang.must_be_in [:ruby, :c]
41
+
42
+ klass = operation == :max ? MaxValSegmentTree : IndexOfMaxValSegmentTree
43
+ template = lang == :ruby ? SegmentTreeTemplate : CSegmentTreeTemplate
44
+
45
+ klass.new(template, data)
46
+ end
47
+
48
+ # A segment tree that for an array A(0...n) answers questions of the form "what is the maximum value in the subinterval A(i..j)?"
49
+ # in O(log n) time.
50
+ class MaxValSegmentTree
51
+ extend Forwardable
52
+
53
+ # Tell the tree that the value at idx has changed
54
+ def_delegator :@structure, :update_at
55
+
56
+ # @param template_klass the "template" class that provides the generic implementation of the Segment Tree functionality.
57
+ # @param data an object that contains values at integer indices based at 0, via +data[i]+.
58
+ # - This will usually be an Array, but it could also be a hash or a proc.
59
+ def initialize(template_klass, data)
60
+ data.must_be_a Enumerable
61
+
62
+ @structure = template_klass.new(
63
+ combine: ->(a, b) { [a, b].max },
64
+ single_cell_array_val: ->(i) { data[i] },
65
+ size: data.size,
66
+ identity: -Shared::INFINITY
67
+ )
68
+ end
69
+
70
+ # The maximum value in A(i..j).
71
+ #
72
+ # The arguments must be integers in 0...(A.size)
73
+ # @return the largest value in A(i..j) or -Infinity if i > j.
74
+ def max_on(i, j)
75
+ @structure.query_on(i, j)
76
+ end
77
+ end
78
+
79
+ # A segment tree that for an array A(0...n) answers questions of the form "what is the index of the maximal value in the
80
+ # subinterval A(i..j)?" in O(log n) time.
81
+ class IndexOfMaxValSegmentTree
82
+ extend Forwardable
83
+
84
+ # Tell the tree that the value at idx has changed
85
+ def_delegator :@structure, :update_at
86
+
87
+ # @param (see MaxValSegmentTree#initialize)
88
+ def initialize(template_klass, data)
89
+ data.must_be_a Enumerable
90
+
91
+ @structure = template_klass.new(
92
+ combine: ->(p1, p2) { p1[1] >= p2[1] ? p1 : p2 },
93
+ single_cell_array_val: ->(i) { [i, data[i]] },
94
+ size: data.size,
95
+ identity: nil
96
+ )
97
+ end
98
+
99
+ # The index of the maximum value in A(i..j)
100
+ #
101
+ # The arguments must be integers in 0...(A.size)
102
+ # @return (Integer, nil) the index of the largest value in A(i..j) or +nil+ if i > j.
103
+ # - If there is more than one entry with that value, return one of the indices. There is no guarantee as to which one.
104
+ # - Return +nil+ if i > j
105
+ def index_of_max_val_on(i, j)
106
+ @structure.query_on(i, j)&.first # discard the value part of the pair, which is just bookkeeping
107
+ end
108
+ end
109
+
110
+ # The underlying functionality of the Segment Tree data type, implemented in C as a Ruby extension.
111
+ #
112
+ # See SegmentTreeTemplate for more information.
113
+ #
114
+ # Implementation note
115
+ #
116
+ # The functionality is entirely written in C. But we write the constructor in Ruby because keyword arguments are difficult to
117
+ # parse on the C side.
118
+ class CSegmentTreeTemplate
119
+ # (see SegmentTreeTemplate::initialize)
120
+ def initialize(combine:, single_cell_array_val:, size:, identity:)
121
+ # having sorted out the keyword arguments, pass them more easily to the C layer.
122
+ c_initialize(combine, single_cell_array_val, size, identity)
123
+ end
124
+ end
125
+ end
126
+ end
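The concrete classes in this file now receive the template class as a constructor argument, so anything that accepts the same keyword arguments and responds to `query_on` can back them. The sketch below illustrates that with a deliberately naive stand-in, `LinearScanTemplate` (hypothetical, O(n) per query, written only for this example), and a copy of the `[index, value]` pair logic from `IndexOfMaxValSegmentTree`. The `must_be` checks and the `update_at` delegation are left out, and `IndexOfMaxSketch` is a made-up name.

```ruby
# Hypothetical stand-in for SegmentTreeTemplate / CSegmentTreeTemplate. It takes the same
# constructor keywords and offers query_on, but simply folds over i..j on every query.
class LinearScanTemplate
  def initialize(combine:, single_cell_array_val:, size:, identity:)
    @combine = combine
    @single_cell_array_val = single_cell_array_val
    @size = size
    @identity = identity
  end

  def query_on(i, j)
    return @identity if i > j

    (i..j).map { |k| @single_cell_array_val.call(k) }
          .reduce { |a, b| @combine.call(a, b) }
  end
end

# Same shape as the IndexOfMaxValSegmentTree in this diff, minus validation.
class IndexOfMaxSketch
  def initialize(template_klass, data)
    @structure = template_klass.new(
      # Each cell carries [index, value] so that combining two cells can compare the
      # values and still report an index.
      combine: ->(p1, p2) { p1[1] >= p2[1] ? p1 : p2 },
      single_cell_array_val: ->(i) { [i, data[i]] },
      size: data.size,
      identity: nil
    )
  end

  def index_of_max_val_on(i, j)
    @structure.query_on(i, j)&.first # keep the index, drop the value
  end
end

data = [1, -3, 2, 1, 5, -9]
tree = IndexOfMaxSketch.new(LinearScanTemplate, data)
tree.index_of_max_val_on(0, 2) # => 2
tree.index_of_max_val_on(1, 4) # => 4
tree.index_of_max_val_on(3, 2) # => nil (empty interval)
```

Passing a different template class in place of `LinearScanTemplate` is the same kind of swap that `SegmentTree.construct` performs through its `lang` argument.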
@@ -1,7 +1,7 @@
1
1
  require_relative 'shared'
2
2
 
3
- # The template of Segment Tree, which can be used for various interval-related purposes, like efficiently finding the sum (or min or
4
- # max) on a arbitrary subarray of a given array.
3
+ # A generic implementation of Segment Tree, which can be used for various interval-related purposes, like efficiently finding the
4
+ # sum (or min or max) on an arbitrary subarray of a given array.
5
5
  #
6
6
  # There is an excellent description of the data structure at https://cp-algorithms.com/data_structures/segment_tree.html. The
7
7
  # Wikipedia article (https://en.wikipedia.org/wiki/Segment_tree) appears to describe a different data structure which is sometimes
@@ -16,7 +16,7 @@ require_relative 'shared'
16
16
  # initializer and the definitions of concrete realisations like MaxValSegmentTree.
17
17
  #
18
18
  # We do O(n) work to build the internal data structure at initialization. Then we answer queries in O(log n) time.
19
- class DataStructuresRMolinari::SegmentTreeTemplate
19
+ class DataStructuresRMolinari::SegmentTree::SegmentTreeTemplate
20
20
  include Shared
21
21
  include Shared::BinaryTreeArithmetic
22
22
 
@@ -14,77 +14,12 @@ require_relative 'data_structures_rmolinari/algorithms'
14
14
  require_relative 'data_structures_rmolinari/disjoint_union'
15
15
  require_relative 'data_structures_rmolinari/c_disjoint_union' # version as a C extension
16
16
 
17
- require_relative 'data_structures_rmolinari/segment_tree_template'
18
- require_relative 'data_structures_rmolinari/c_segment_tree_template_impl'
17
+ require_relative 'data_structures_rmolinari/segment_tree'
19
18
 
20
19
  require_relative 'data_structures_rmolinari/heap'
21
20
  require_relative 'data_structures_rmolinari/max_priority_search_tree'
22
21
  require_relative 'data_structures_rmolinari/min_priority_search_tree'
23
22
 
24
23
  module DataStructuresRMolinari
25
- ########################################
26
- # Concrete instances of Segment Tree
27
- #
28
- # @todo consider moving these into generic_segment_tree.rb and renaming that file
29
-
30
- # A segment tree that for an array A(0...n) answers questions of the form "what is the maximum value in the subinterval A(i..j)?"
31
- # in O(log n) time.
32
- class MaxValSegmentTree
33
- extend Forwardable
34
-
35
- # Tell the tree that the value at idx has changed
36
- def_delegator :@structure, :update_at
37
-
38
- # @param data an object that contains values at integer indices based at 0, via +data[i]+.
39
- # - This will usually be an Array, but it could also be a hash or a proc.
40
- def initialize(data)
41
- data.must_be_a Enumerable
42
-
43
- @structure = SegmentTreeTemplate.new(
44
- combine: ->(a, b) { [a, b].max },
45
- single_cell_array_val: ->(i) { data[i] },
46
- size: data.size,
47
- identity: -Shared::INFINITY
48
- )
49
- end
50
-
51
- # The maximum value in A(i..j).
52
- #
53
- # The arguments must be integers in 0...(A.size)
54
- # @return the largest value in A(i..j) or -Infinity if i > j.
55
- def max_on(i, j)
56
- @structure.query_on(i, j)
57
- end
58
- end
59
-
60
- # A segment tree that for an array A(0...n) answers questions of the form "what is the index of the maximal value in the
61
- # subinterval A(i..j)?" in O(log n) time.
62
- class IndexOfMaxValSegmentTree
63
- extend Forwardable
64
-
65
- # Tell the tree that the value at idx has changed
66
- def_delegator :@structure, :update_at
67
-
68
- # @param (see MaxValSegmentTree#initialize)
69
- def initialize(data)
70
- data.must_be_a Enumerable
71
-
72
- @structure = SegmentTreeTemplate.new(
73
- combine: ->(p1, p2) { p1[1] >= p2[1] ? p1 : p2 },
74
- single_cell_array_val: ->(i) { [i, data[i]] },
75
- size: data.size,
76
- identity: nil
77
- )
78
- end
79
-
80
- # The index of the maximum value in A(i..j)
81
- #
82
- # The arguments must be integers in 0...(A.size)
83
- # @return (Integer, nil) the index of the largest value in A(i..j) or +nil+ if i > j.
84
- # - If there is more than one entry with that value, return one the indices. There is no guarantee as to which one.
85
- # - Return +nil+ if i > j
86
- def index_of_max_val_on(i, j)
87
- @structure.query_on(i, j)&.first # discard the value part of the pair, which is a bookkeeping
88
- end
89
- end
24
+ # Add things here if needed
90
25
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: data_structures_rmolinari
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.4.4
4
+ version: 0.5.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Rory Molinari
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2023-02-02 00:00:00.000000000 Z
11
+ date: 2023-02-03 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: must_be
@@ -97,6 +97,7 @@ files:
97
97
  - lib/data_structures_rmolinari/heap.rb
98
98
  - lib/data_structures_rmolinari/max_priority_search_tree.rb
99
99
  - lib/data_structures_rmolinari/min_priority_search_tree.rb
100
+ - lib/data_structures_rmolinari/segment_tree.rb
100
101
  - lib/data_structures_rmolinari/segment_tree_template.rb
101
102
  - lib/data_structures_rmolinari/shared.rb
102
103
  homepage: https://github.com/rmolinari/data_structures