data_structures_rmolinari 0.4.4 → 0.5.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 943ac55678a074cc0da3667dccbb07ee7d203639233f53bd8587af7fd8cd062e
4
- data.tar.gz: ad235e5f4714e699f1cf5f113dd4b3a356a194cced5a74b60e17c5e3a896e01b
3
+ metadata.gz: a41b03a0d46982d64fc1ec5648174df24c6c9b0593ff93354ad90fc65ff84e8e
4
+ data.tar.gz: 544aa69540c9cef54c4eb6402b260bafbbeddf7dc5e770bf428976c11161c58e
5
5
  SHA512:
6
- metadata.gz: a68de76c88c67fadc42752610c695b1f0b8fd17f34db9c806291aeab4c933fe84c6523615deb4197e1c9fa6d36dce30987cc4e8896a2b0c1700b7e72b5bd2fff
7
- data.tar.gz: 9063d89a98d599f27db2585bf383dbfb13e8f927abce64ac7eafb2edd70c490ddad1f1fc51e0f11c24adf29f28ab8c56548a6db264b15ace239c63b1a2ce5a01
6
+ metadata.gz: 19efca577bb0c3c9524cf09b6f63c0e06beab70bfd20d1dbafa5034009be9bcb965ac5b8cee0ba0a39ecdcb646c401a54b363852a4572613941f7a34da174593
7
+ data.tar.gz: 05606e9c142a9bcb69a9fd555586b0ad8fe76bf7572af3dd82003de673d78b671b9a5edf1e5d4a2e4c7d5c6d5e41bef2ed1679573600725b74a25d4f3665c6ee
data/CHANGELOG.md CHANGED
@@ -2,6 +2,17 @@
2
2
 
3
3
  ## [Unreleased]
4
4
 
5
+ ## [0.5.0] 2023-02-03
6
+
7
+ - SegmentTree
8
+ - Reorganize the code into a SegmentTree submodule.
9
+ - Provide a convenience method for getting concrete instances.
10
+
11
+ - README.md
12
+ - Add some simple example code for the data types.
13
+
14
+ ## [0.4.4] 2023-02-02
15
+
5
16
  - Disjoint Union
6
17
  - C extension: use Convenient Containers rather than my janky Dynamic Array attempt.
7
18
 
data/README.md CHANGED
@@ -9,23 +9,13 @@ as fast as possible.
9
9
 
10
10
  The code is available as a gem: https://rubygems.org/gems/data_structures_rmolinari.
11
11
 
12
+ It is distributed under the MIT license.
13
+
12
14
  ## Usage
13
15
 
14
16
  The right way to organize the code is not obvious to me. For now the data structures are all defined in the module
15
17
  `DataStructuresRMolinari` to avoid polluting the global namespace.
16
18
 
17
- Example usage after the gem is installed:
18
- ```
19
- require 'data_structures_rmolinari`
20
-
21
- # Pull what we need out of the namespace
22
- MaxPrioritySearchTree = DataStructuresRMolinari::MaxPrioritySearchTree
23
- Point = DataStructuresRMolinari::Point # anything responding to :x and :y is fine
24
-
25
- pst = MaxPrioritySearchTree.new([Point.new(1, 1)])
26
- puts pst.largest_y_in_ne(0, 0) # "Point(1,1)"
27
- ```
28
-
29
19
  # Implementations
30
20
 
31
21
  ## Disjoint Union
@@ -42,6 +32,23 @@ It also provides
42
32
  For more details see https://en.wikipedia.org/wiki/Disjoint-set_data_structure and the paper [[TvL1984]](#references) by Tarjan and
43
33
  van Leeuwen.
44
34
 
+ ``` ruby
+ require 'data_structures_rmolinari'
+ DisjointUnion = DataStructuresRMolinari::DisjointUnion
+
+ # Create an instance over the "universe" 0, 1, ..., 9.
+ du = DisjointUnion.new(10)
+ du.subset_count # => 10; each element starts out in its own subset
+
+ du.unite(2, 3) # say that 2 and 3 are actually in the same subset
+ du.subset_count # => 9
+ du.find(2) == du.find(3) # => true
+
+ du.unite(4, 5)
+ du.unite(3, 4) # now 2, 3, 4, and 5 are all in the same subset
+ du.subset_count # => 7
+ ```
51
+
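The behaviour shown in the example above can be reproduced with a very small disjoint-set forest. The following is a minimal pure-Ruby sketch for illustration only; it is not the gem's implementation. The class name `TinyDisjointUnion` is hypothetical, and quietly ignoring a `unite` of two elements that are already together is a choice made for this sketch, not a statement about the gem's behaviour.

```ruby
# Illustrative only: a tiny disjoint-set forest, not the gem's code.
class TinyDisjointUnion
  attr_reader :subset_count

  def initialize(size)
    @parent = Array.new(size) { |i| i } # each element starts as its own root
    @rank = Array.new(size, 0)          # upper bound on the height of each tree
    @subset_count = size
  end

  # Representative of the subset containing e, using path halving.
  def find(e)
    while @parent[e] != e
      @parent[e] = @parent[@parent[e]] # point e at its grandparent
      e = @parent[e]
    end
    e
  end

  # Merge the subsets containing e and f (sketch assumption: no-op if already together).
  def unite(e, f)
    re = find(e)
    rf = find(f)
    return if re == rf

    re, rf = rf, re if @rank[re] < @rank[rf] # attach the shallower tree under the deeper one
    @rank[re] += 1 if @rank[re] == @rank[rf]
    @parent[rf] = re
    @subset_count -= 1
  end
end

du = TinyDisjointUnion.new(10)
du.unite(2, 3)
du.unite(4, 5)
du.unite(3, 4)
du.subset_count          # => 7
du.find(2) == du.find(5) # => true
```

Union by rank together with path halving is one of the combinations analysed in the Tarjan and van Leeuwen paper cited above.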
45
52
  ## Heap
46
53
 
47
54
  This is a standard binary heap with an `update` method, suitable for use as a priority queue. There are several supported
@@ -63,6 +70,24 @@ allows the insertion of duplicate items (which is sometimes useful) and slightly
63
70
 
64
71
  See https://en.wikipedia.org/wiki/Binary_heap and the paper by Edelkamp, Elmasry, and Katajainen [[EEK2017]](#references).
65
72
 
+ ``` ruby
+ require 'data_structures_rmolinari'
+ Heap = DataStructuresRMolinari::Heap
+
+ data = [4, 3, 2, 1]
+
+ heap = Heap.new
+
+ # Insert the elements of data, each with itself as priority.
+ data.each { |v| heap.insert(v, v) }
+
+ heap.top # => 1, since we have a min-heap.
+ heap.pop # => 1
+ heap.top # => 2; with 1 gone, this is the element with least priority
+ heap.update(3, -3)
+ heap.top # => 3; now 3 is the element with least priority
+ ```
90
+
66
91
  ## Priority Search Tree
67
92
 
68
93
  A PST stores a set P of two-dimensional points in a way that allows certain queries about P to be answered efficiently. The data
@@ -96,13 +121,27 @@ regions.
96
121
  By default these data structures are immutable: once constructed they cannot be changed. But there is a constructor option that
97
122
  makes the instance "dynamic". This allows us to delete the element at the root of the tree - the one with largest y value (smallest
98
123
  for MinPST) - with the `delete_top!` method. This operation is important in certain algorithms, such as enumerating all maximal
99
- empty rectangles (see the second paper by De et al.[[DMNS2013]](#references)) Note that points can still not be added to the PST in
124
+ empty rectangles (see the second paper by De et al. [[DMNS2013]](#references)). Note that points still cannot be added to the PST in
100
125
  any case, and choosing the dynamic option makes certain internal bookkeeping operations slower.
101
126
 
102
127
  In [[DMNS2013]](#references) De et al. generalize the in-place structure to a _Min-max Priority Search Tree_ (MinmaxPST) that can
103
128
  answer queries in all four quadrants and both "kinds" of 3-sided boxes. Having one of these would save the trouble of constructing
104
129
  both a MaxPST and MinPST. But the presentation is hard to follow in places and the paper's pseudocode is buggy.[^minmaxpst]
105
130
 
+ ``` ruby
+ require 'data_structures_rmolinari'
+ MaxPST = DataStructuresRMolinari::MaxPrioritySearchTree
+ Point = Shared::Point # simple (x, y) struct. Anything responding to #x and #y will work
+
+ data = [Point.new(0, 0), Point.new(1, 2), Point.new(2, 1)]
+ pst = MaxPST.new(data)
+
+ pst.largest_y_in_ne(0, 0) # => #<struct Shared::Point x=1, y=2>
+ pst.largest_y_in_ne(1, 1) # => #<struct Shared::Point x=1, y=2>
+ pst.largest_y_in_ne(1.5, 1) # => #<struct Shared::Point x=2, y=1>
+ pst.largest_y_in_3_sided(-0.5, 0.5, 0) # => #<struct Shared::Point x=0, y=0>
+ ```
144
+
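To make the meaning of these queries concrete, here is a brute-force reference that returns the same answers as the example above by scanning every point. It is a sketch for checking intuition, not the gem's algorithm. The struct `Pt` and the free-standing method signatures are hypothetical, and the closed (inclusive) region boundaries are inferred from the example outputs rather than taken from the gem's documentation.

```ruby
# Illustrative only: linear-scan versions of two PST queries.
Pt = Struct.new(:x, :y)

# Highest point in the "northeast" quadrant x >= x0, y >= y0 (nil if none).
def largest_y_in_ne(points, x0, y0)
  points.select { |p| p.x >= x0 && p.y >= y0 }.max_by(&:y)
end

# Highest point in the three-sided box x0 <= x <= x1, y >= y0 (nil if none).
def largest_y_in_3_sided(points, x0, x1, y0)
  points.select { |p| p.x.between?(x0, x1) && p.y >= y0 }.max_by(&:y)
end

data = [Pt.new(0, 0), Pt.new(1, 2), Pt.new(2, 1)]

largest_y_in_ne(data, 0, 0)                # => the point (1, 2)
largest_y_in_ne(data, 1.5, 1)              # => the point (2, 1)
largest_y_in_3_sided(data, -0.5, 0.5, 0)   # => the point (0, 0)
```

Each call here inspects every point, whereas the purpose of a priority search tree is to answer such queries efficiently without a full scan.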
106
145
  ## Segment Tree
107
146
 
108
147
  A segment tree stores information related to subintervals of a certain array. For example, a segment tree can be used to find the
@@ -112,11 +151,37 @@ arbitrary subarrays.
112
151
 
113
152
  An excellent description of the idea is found at https://cp-algorithms.com/data_structures/segment_tree.html.
114
153
 
115
- Generic code is provided in `SegmentTreeTemplate`. Concrete classes provide a handful of simple lambdas and constants to the
116
- template class's initializer. Figuring out the details requires some knowledge of the internal mechanisms of a segment tree, for
117
- which the link at cp-algorithms.com is very helpful. See the definitions of the concrete classes `MaxValSegmentTree` and
154
+ Generic code is provided in `SegmentTree::SegmentTreeTemplate` and its equivalent (and faster) C-based sibling,
155
+ `SegmentTree::CSegmentTreeTemplate` (see [below](#c-extensions)).
156
+
157
+ Writing a concrete segment tree class just means providing some simple lambdas and constants to the template class's
158
+ initializer. Figuring out the details requires some knowledge of the internal mechanisms of a segment tree, for which the link at
159
+ cp-algorithms.com is very helpful. See the implementations of the concrete classes `MaxValSegmentTree` and
118
160
  `IndexOfMaxValSegmentTree` for examples.
119
161
 
162
+ Since there are several concrete "types" and two underlying generic implementations, there is a convenience method on the `SegmentTree`
163
+ module to get instances.
164
+
+ ``` ruby
+ require 'data_structures_rmolinari'
+ SegmentTree = DataStructuresRMolinari::SegmentTree # namespace module
+
+ data = [1, -3, 2, 1, 5, -9]
+
+ # Get a segment tree instance that will answer "max over this subinterval?" questions about data.
+ # Here we get one using the ruby implementation of the generic functionality.
+ #
+ # Put :index_of_max in place of :max to get an instance that returns "an index of the maximum value
+ # over this subinterval".
+ #
+ # To use the generic code written in C, put :c instead of :ruby.
+ seg_tree = SegmentTree.construct(data, :max, :ruby)
+
+ seg_tree.max_on(0, 2) # => 2
+ seg_tree.max_on(1, 4) # => 5
+ # ..etc..
+ ```
184
+
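The idea of a generic template configured by "simple lambdas and constants" can be sketched in a few lines of plain Ruby. This is an illustration under stated assumptions, not the gem's `SegmentTreeTemplate`: the class name `TinySegmentTree`, the method `query_on`, and the keyword names `combine:`, `single_cell_val:`, and `identity:` are all hypothetical.

```ruby
# Illustrative only: a small generic segment tree configured by lambdas.
class TinySegmentTree
  # combine:         lambda merging the values of two adjacent subintervals
  # single_cell_val: lambda giving the value for the one-element interval at index i
  # identity:        value v such that combine.(v, x) == x for every x
  def initialize(data, combine:, single_cell_val:, identity:)
    @combine = combine
    @single_cell_val = single_cell_val
    @identity = identity
    @size = data.size
    @tree = Array.new(4 * [@size, 1].max)
    build(1, 0, @size - 1) unless @size.zero?
  end

  # Combined value over the closed interval of indices i..j.
  def query_on(i, j)
    return @identity if i > j

    query(1, 0, @size - 1, i, j)
  end

  private

  def build(node, lo, hi)
    if lo == hi
      @tree[node] = @single_cell_val.call(lo)
    else
      mid = (lo + hi) / 2
      build(2 * node, lo, mid)
      build(2 * node + 1, mid + 1, hi)
      @tree[node] = @combine.call(@tree[2 * node], @tree[2 * node + 1])
    end
  end

  def query(node, lo, hi, i, j)
    return @identity if j < lo || hi < i     # no overlap with i..j
    return @tree[node] if i <= lo && hi <= j # node interval lies inside i..j

    mid = (lo + hi) / 2
    @combine.call(query(2 * node, lo, mid, i, j), query(2 * node + 1, mid + 1, hi, i, j))
  end
end

data = [1, -3, 2, 1, 5, -9]

# A "max over this subinterval" tree is just a choice of lambdas and an identity.
max_tree = TinySegmentTree.new(
  data,
  combine: ->(a, b) { [a, b].max },
  single_cell_val: ->(i) { data[i] },
  identity: -Float::INFINITY
)
max_tree.query_on(0, 2) # => 2
max_tree.query_on(1, 4) # => 5
```

Swapping in `combine: ->(a, b) { a + b }` with `identity: 0` would give a range-sum tree from the same class, which is the kind of reuse the template approach is meant to allow.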
120
185
  ## Algorithms
121
186
 
122
187
  The Algorithms submodule contains some algorithms using the data structures.
@@ -130,13 +195,12 @@ The Algorithms submodule contains some algorithms using the data structures.
130
195
 
131
196
  # C Extensions
132
197
 
133
- As another learning process I have implemented several of these data structures as C extensions. The class names have a "C" prefixed
134
- and they can be required like their pure Ruby versions. They have the same APIs as their Ruby cousins.
198
+ As another learning process I have implemented several of these data structures as C extensions. The APIs are the same.
135
199
 
136
200
  ## Disjoint Union
137
201
 
138
- A benchmark suggests that a long sequence of `unite` operations is about 3 times as fast with the `CDisjointUnion` as with
139
- `DisjointUnion`.
202
+ The C version is called `CDisjointUnion`. A benchmark suggests that a long sequence of `unite` operations is about 3 times as fast
203
+ with `CDisjointUnion` as with `DisjointUnion`.
140
204
 
141
205
  The implementation uses the remarkable Convenient Containers library from Jackson Allan [[Allan]](#references).
142
206
 
@@ -145,16 +209,21 @@ The implementation uses the remarkable Convenient Containers library from Jackso
145
209
  `CSegmentTreeTemplate` is the C implementation of the generic class. Concrete classes are built on top of this in Ruby, just as with
146
210
  the pure Ruby `SegmentTreeTemplate` class.
147
211
 
148
- A benchmark suggests that a long sequence of `max_on` operations against a max-val Segment Tree is about 4 times as fast with the C
149
- version as with the Ruby version. I'm a bit suprised the improvment isn't larger, but we must remember that the C code must still
150
- interact with the Ruby objects in the underlying data array, and must "combine" them, etc., by calling Ruby lambdas.
212
+ A benchmark suggests that a long sequence of `max_on` operations against a max-val Segment Tree is about 4 times as fast with C as
213
+ with Ruby. I'm a bit surprised the improvement isn't larger, but remember that the C code must still interact with the Ruby objects in
214
+ the underlying data array, and must combine them, etc., via Ruby lambdas.
151
215
 
152
216
  # References
153
- - [Allan] Allan, J., _CC: Convenient Containers_, https://github.com/JacksonAllan/CC, retrieved 2023-02-01.
154
- - [TvL1984] Tarjan, Robert E., van Leeuwen, J., _Worst-case Analysis of Set Union Algorithms_, Journal of the ACM, v31:2 (1984), pp 245–281.
155
- - [EEK2017] Edelkamp, S., Elmasry, A., Katajainen, J., _Optimizing Binary Heaps_, Theory Comput Syst (2017), vol 61, pp 606-636, DOI 10.1007/s00224-017-9760-2.
156
- - [McC1985] McCreight, E. M., _Priority Search Trees_, SIAM J. Comput., 14(2):257-276, 1985.
157
- - [DMNS2011] De, M., Maheshwari, A., Nandy, S. C., Smid, M., _An In-Place Priority Search Tree_, 23rd Canadian Conference on Computational Geometry, 2011.
158
- - [DMNS2013] De, M., Maheshwari, A., Nandy, S. C., Smid, M., _An In-Place Min-max Priority Search Tree_, Computational Geometry, v46 (2013), pp 310-327.
217
+ - [Allan] Allan, J., _CC: Convenient Containers_, https://github.com/JacksonAllan/CC (retrieved 2023-02-01).
218
+ - [TvL1984] Tarjan, Robert E., van Leeuwen, J., _Worst-case Analysis of Set Union Algorithms_, Journal of the ACM, v31:2 (1984), pp
219
+ 245–281, https://dl.acm.org/doi/10.1145/62.2160 (retrieved 2022-02-01).
220
+ - [EEK2017] Edelkamp, S., Elmasry, A., Katajainen, J., _Optimizing Binary Heaps_, Theory Comput Syst (2017), vol 61, pp 606-636, DOI
221
+ 10.1007/s00224-017-9760-2, https://kclpure.kcl.ac.uk/portal/files/87388857/TheoryComputingSzstems.pdf (retrieved 2022-02-02).
222
+ - [McC1985] McCreight, E. M., _Priority Search Trees_, SIAM J. Comput., 14(2):257-276, 1985,
223
+ http://www.cs.duke.edu/courses/fall08/cps234/handouts/SMJ000257.pdf (retrieved 2023-02-02).
224
+ - [DMNS2011] De, M., Maheshwari, A., Nandy, S. C., Smid, M., _An In-Place Priority Search Tree_, 23rd Canadian Conference on
225
+ Computational Geometry, 2011, http://www.cs.carleton.ca/~michiel/inplace_pst.pdf (retrieved 2023-02-02).
226
+ - [DMNS2013] De, M., Maheshwari, A., Nandy, S. C., Smid, M., _An In-Place Min-max Priority Search Tree_, Computational Geometry, v46
227
+ (2013), pp 310-327, https://people.scs.carleton.ca/~michiel/MinMaxPST.pdf (retrieved 2023-02-02).
159
228
 
160
229
  [^minmaxpst]: See the comments in the fragmentary class `MinMaxPrioritySearchTree` for further details.
@@ -353,7 +353,8 @@ static VALUE segment_tree_update_at(VALUE self, VALUE idx) {
353
353
  * (see SegmentTreeTemplate)
354
354
  */
355
355
  void Init_c_segment_tree_template() {
356
- VALUE cSegmentTreeTemplate = rb_define_class_under(mDataStructuresRMolinari, "CSegmentTreeTemplate", rb_cObject);
356
+ VALUE mSegmentTree = rb_define_module_under(mDataStructuresRMolinari, "SegmentTree");
357
+ VALUE cSegmentTreeTemplate = rb_define_class_under(mSegmentTree, "CSegmentTreeTemplate", rb_cObject);
357
358
 
358
359
  rb_define_alloc_func(cSegmentTreeTemplate, segment_tree_alloc);
359
360
  rb_define_method(cSegmentTreeTemplate, "c_initialize", segment_tree_init, 4);
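
In Ruby terms, the `rb_define_module_under` and `rb_define_class_under` calls in this hunk nest the class roughly as sketched below. This only illustrates the resulting namespace: the real class is defined in C, and the empty class body here is a stand-in that omits all of its methods.

```ruby
# Illustrative only: the namespace that the C extension sets up, written in Ruby.
module DataStructuresRMolinari
  module SegmentTree
    # Stand-in for the class defined in C; superclass is Object, as with rb_cObject.
    class CSegmentTreeTemplate
    end
  end
end

DataStructuresRMolinari::SegmentTree::CSegmentTreeTemplate.name
# => "DataStructuresRMolinari::SegmentTree::CSegmentTreeTemplate"
```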