data_structures_rmolinari 0.4.4 → 0.5.1
- checksums.yaml +4 -4
- data/CHANGELOG.md +11 -0
- data/README.md +98 -29
- data/ext/c_segment_tree_template/segment_tree_template.c +2 -1
- data/ext/cc.h +3879 -0
- data/ext/shared.h +33 -0
- data/lib/data_structures_rmolinari/algorithms.rb +5 -5
- data/lib/data_structures_rmolinari/c_segment_tree_template_impl.rb +3 -100
- data/lib/data_structures_rmolinari/disjoint_union.rb +2 -0
- data/lib/data_structures_rmolinari/segment_tree.rb +126 -0
- data/lib/data_structures_rmolinari/segment_tree_template.rb +3 -3
- data/lib/data_structures_rmolinari.rb +2 -67
- metadata +5 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: a41b03a0d46982d64fc1ec5648174df24c6c9b0593ff93354ad90fc65ff84e8e
|
4
|
+
data.tar.gz: 544aa69540c9cef54c4eb6402b260bafbbeddf7dc5e770bf428976c11161c58e
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 19efca577bb0c3c9524cf09b6f63c0e06beab70bfd20d1dbafa5034009be9bcb965ac5b8cee0ba0a39ecdcb646c401a54b363852a4572613941f7a34da174593
|
7
|
+
data.tar.gz: 05606e9c142a9bcb69a9fd555586b0ad8fe76bf7572af3dd82003de673d78b671b9a5edf1e5d4a2e4c7d5c6d5e41bef2ed1679573600725b74a25d4f3665c6ee
|
data/CHANGELOG.md
CHANGED
@@ -2,6 +2,17 @@
|
|
2
2
|
|
3
3
|
## [Unreleased]
|
4
4
|
|
5
|
+
## [0.5.0] 2023-02-03
|
6
|
+
|
7
|
+
- SegmentTree
|
8
|
+
- Reorganize the code into a SegmentTree submodule.
|
9
|
+
- Provide a convenience method for getting concrete instances.
|
10
|
+
|
11
|
+
- README.md
|
12
|
+
- Add some simple example code for the data types.
|
13
|
+
|
14
|
+
## [0.4.4] 2023-02-02
|
15
|
+
|
5
16
|
- Disjoint Union
|
6
17
|
- C extension: use Convenient Containers rather than my janky Dynamic Array attempt.
|
7
18
|
|
data/README.md
CHANGED
@@ -9,23 +9,13 @@ as fast as possible.
|
|
9
9
|
|
10
10
|
The code is available as a gem: https://rubygems.org/gems/data_structures_rmolinari.
|
11
11
|
|
12
|
+
It is distributed under the MIT license.
|
13
|
+
|
12
14
|
## Usage
|
13
15
|
|
14
16
|
The right way to organize the code is not obvious to me. For now the data structures are all defined in the module
|
15
17
|
`DataStructuresRMolinari` to avoid polluting the global namespace.
|
16
18
|
|
17
|
-
Example usage after the gem is installed:
|
18
|
-
```
|
19
|
-
require 'data_structures_rmolinari`
|
20
|
-
|
21
|
-
# Pull what we need out of the namespace
|
22
|
-
MaxPrioritySearchTree = DataStructuresRMolinari::MaxPrioritySearchTree
|
23
|
-
Point = DataStructuresRMolinari::Point # anything responding to :x and :y is fine
|
24
|
-
|
25
|
-
pst = MaxPrioritySearchTree.new([Point.new(1, 1)])
|
26
|
-
puts pst.largest_y_in_ne(0, 0) # "Point(1,1)"
|
27
|
-
```
|
28
|
-
|
29
19
|
# Implementations
|
30
20
|
|
31
21
|
## Disjoint Union
|
@@ -42,6 +32,23 @@ It also provides
|
|
42
32
|
For more details see https://en.wikipedia.org/wiki/Disjoint-set_data_structure and the paper [[TvL1984]](#references) by Tarjan and
|
43
33
|
van Leeuwen.
|
44
34
|
|
35
|
+
``` ruby
|
36
|
+
require 'data_structures_rmolinari'
|
37
|
+
DisjointUnion = DataStructuresRMolinari::DisjointUnion
|
38
|
+
|
39
|
+
# Create an instance over the "universe" 0, 1, ..., 9.
|
40
|
+
du = DisjointUnion.new(10)
|
41
|
+
du.subset_count # => 10; each element starts out in its own subset
|
42
|
+
|
43
|
+
du.unite(2, 3) # say that 2 and 3 are actually in the same subset
|
44
|
+
du.subset_count # => 9
|
45
|
+
du.find(2) == du.find(3) # => true
|
46
|
+
|
47
|
+
du.unite(4, 5)
|
48
|
+
du.unite(3, 4) # now 2, 3, 4, and 5 are all in the same subset
|
49
|
+
du.subset_count # => 7
|
50
|
+
```
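For readers who want to see the mechanism behind `unite`, `find`, and `subset_count`, here is a minimal pure-Ruby sketch of a disjoint-set forest with union by rank and path compression. The class name `SketchDisjointUnion` is hypothetical and this is not the gem's implementation; in particular, uniting two elements that already share a subset is silently ignored here, which may differ from the gem's behavior.

```ruby
# Hypothetical sketch, NOT the gem's DisjointUnion: a disjoint-set forest
# with union by rank and path compression.
class SketchDisjointUnion
  attr_reader :subset_count

  def initialize(size)
    @parent = Array.new(size) { |i| i } # every element starts as its own root
    @rank = Array.new(size, 0)          # upper bound on the height of each tree
    @subset_count = size
  end

  # Return the representative of the subset containing e.
  def find(e)
    root = e
    root = @parent[root] until @parent[root] == root

    # Path compression: point every node on the path directly at the root.
    until @parent[e] == root
      next_up = @parent[e]
      @parent[e] = root
      e = next_up
    end
    root
  end

  # Merge the subsets containing e and f. Assumption: a no-op if already merged.
  def unite(e, f)
    e_root = find(e)
    f_root = find(f)
    return if e_root == f_root

    # Union by rank: hang the shallower tree under the deeper one.
    e_root, f_root = f_root, e_root if @rank[e_root] < @rank[f_root]
    @parent[f_root] = e_root
    @rank[e_root] += 1 if @rank[e_root] == @rank[f_root]
    @subset_count -= 1
  end
end

du = SketchDisjointUnion.new(10)
du.unite(2, 3)
du.unite(4, 5)
du.unite(3, 4)
puts du.subset_count          # 7
puts du.find(2) == du.find(5) # true
```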
|
51
|
+
|
45
52
|
## Heap
|
46
53
|
|
47
54
|
This is a standard binary heap with an `update` method, suitable for use as a priority queue. There are several supported
|
@@ -63,6 +70,24 @@ allows the insertion of duplicate items (which is sometimes useful) and slightly
|
|
63
70
|
|
64
71
|
See https://en.wikipedia.org/wiki/Binary_heap and the paper by Edelkamp, Elmasry, and Katajainen [[EEK2017]](#references).
|
65
72
|
|
73
|
+
``` ruby
|
74
|
+
require 'data_structures_rmolinari'
|
75
|
+
Heap = DataStructuresRMolinari::Heap
|
76
|
+
|
77
|
+
data = [4, 3, 2, 1]
|
78
|
+
|
79
|
+
heap = Heap.new
|
80
|
+
|
81
|
+
# Insert the elements of data, each with itself as priority.
|
82
|
+
data.each { |v| heap.insert(v, v) }
|
83
|
+
|
84
|
+
heap.top # => 1, since we have a min-heap.
|
85
|
+
heap.pop # => 1
|
86
|
+
heap.top # => 2; with 1 gone, this is the element with least priority
|
87
|
+
heap.update(3, -3)
|
88
|
+
heap.top # => 3; now 3 is the element with least priority
|
89
|
+
```
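The `update` call above is only cheap if the heap can locate an item without scanning. One common way to do that, sketched below, is to keep a hash from each item to its current array position alongside the usual binary heap array. `SketchHeap` is a hypothetical name and this is not the gem's code; as a simplifying assumption it rejects duplicate items.

```ruby
# Hypothetical sketch, NOT the gem's Heap: a binary min-heap that tracks each
# item's position so that update can find it in O(1) and re-sift in O(log n).
class SketchHeap
  def initialize
    @items = []    # [item, priority] pairs in heap order
    @index_of = {} # item => its current index in @items
  end

  def insert(item, priority)
    raise ArgumentError, 'duplicate item' if @index_of.key?(item) # assumption

    @items << [item, priority]
    @index_of[item] = @items.size - 1
    sift_up(@items.size - 1)
  end

  # The item with the smallest priority, or nil if the heap is empty.
  def top
    @items.first&.first
  end

  # Remove and return the item with the smallest priority.
  def pop
    return nil if @items.empty?

    swap(0, @items.size - 1)
    item, _priority = @items.pop
    @index_of.delete(item)
    sift_down(0) unless @items.empty?
    item
  end

  # Change the priority of an item already in the heap.
  def update(item, new_priority)
    idx = @index_of.fetch(item)
    old_priority = @items[idx][1]
    @items[idx][1] = new_priority
    new_priority < old_priority ? sift_up(idx) : sift_down(idx)
  end

  private

  def swap(i, j)
    @items[i], @items[j] = @items[j], @items[i]
    @index_of[@items[i][0]] = i
    @index_of[@items[j][0]] = j
  end

  def sift_up(idx)
    while idx.positive?
      parent = (idx - 1) / 2
      break if @items[parent][1] <= @items[idx][1]

      swap(idx, parent)
      idx = parent
    end
  end

  def sift_down(idx)
    loop do
      smallest = idx
      [2 * idx + 1, 2 * idx + 2].each do |child|
        smallest = child if child < @items.size && @items[child][1] < @items[smallest][1]
      end
      break if smallest == idx

      swap(idx, smallest)
      idx = smallest
    end
  end
end

heap = SketchHeap.new
[4, 3, 2, 1].each { |v| heap.insert(v, v) }
heap.pop            # removes 1
heap.update(3, -3)
puts heap.top       # 3
```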
|
90
|
+
|
66
91
|
## Priority Search Tree
|
67
92
|
|
68
93
|
A PST stores a set P of two-dimensional points in a way that allows certain queries about P to be answered efficiently. The data
|
@@ -96,13 +121,27 @@ regions.
|
|
96
121
|
By default these data structures are immutable: once constructed they cannot be changed. But there is a constructor option that
|
97
122
|
makes the instance "dynamic". This allows us to delete the element at the root of the tree - the one with largest y value (smallest
|
98
123
|
for MinPST) - with the `delete_top!` method. This operation is important in certain algorithms, such as enumerating all maximal
|
99
|
-
empty rectangles (see the second paper by De et al
|
124
|
+
empty rectangles (see the second paper by De et al. [[DMNS2013]](#references)). Note that points still cannot be added to the PST in
|
100
125
|
any case, and choosing the dynamic option makes certain internal bookkeeping operations slower.
|
101
126
|
|
102
127
|
In [[DMNS2013]](#references) De et al. generalize the in-place structure to a _Min-max Priority Search Tree_ (MinmaxPST) that can
|
103
128
|
answer queries in all four quadrants and both "kinds" of 3-sided boxes. Having one of these would save the trouble of constructing
|
104
129
|
both a MaxPST and MinPST. But the presentation is hard to follow in places and the paper's pseudocode is buggy.[^minmaxpst]
|
105
130
|
|
131
|
+
``` ruby
|
132
|
+
require 'data_structures_rmolinari'
|
133
|
+
MaxPST = DataStructuresRMolinari::MaxPrioritySearchTree
|
134
|
+
Point = Shared::Point # simple (x, y) struct. Anything responding to #x and #y will work
|
135
|
+
|
136
|
+
data = [Point.new(0, 0), Point.new(1, 2), Point.new(2, 1)]
|
137
|
+
pst = MaxPST.new(data)
|
138
|
+
|
139
|
+
pst.largest_y_in_ne(0, 0) # => #<struct Shared::Point x=1, y=2>
|
140
|
+
pst.largest_y_in_ne(1, 1) # => #<struct Shared::Point x=1, y=2>
|
141
|
+
pst.largest_y_in_ne(1.5, 1) # => #<struct Shared::Point x=2, y=1>
|
142
|
+
pst.largest_y_in_3_sided(-0.5, 0.5, 0) # => #<struct Shared::Point x=0, y=0>
|
143
|
+
```
|
144
|
+
|
106
145
|
## Segment Tree
|
107
146
|
|
108
147
|
A segment tree stores information related to subintervals of a certain array. For example, a segment tree can be used to find the
|
@@ -112,11 +151,37 @@ arbitrary subarrays.
|
|
112
151
|
|
113
152
|
An excellent description of the idea is found at https://cp-algorithms.com/data_structures/segment_tree.html.
|
114
153
|
|
115
|
-
Generic code is provided in `SegmentTreeTemplate
|
116
|
-
|
117
|
-
|
154
|
+
Generic code is provided in `SegmentTree::SegmentTreeTemplate` and its equivalent (and faster) C-based sibling,
|
155
|
+
`SegmentTree::CSegmentTreeTemplate` (see [below](#c-extensions)).
|
156
|
+
|
157
|
+
Writing a concrete segment tree class just means providing some simple lambdas and constants to the template class's
|
158
|
+
initializer. Figuring out the details requires some knowledge of the internal mechanisms of a segment tree, for which the link at
|
159
|
+
cp-algorithms.com is very helpful. See the implementations of the concrete classes `MaxValSegmentTree` and
|
118
160
|
`IndexOfMaxValSegmentTree` for examples.
|
119
161
|
|
162
|
+
Since there are several concrete "types" and two underlying generic implementations, there is a convenience method on the `SegmentTree`
|
163
|
+
module to get instances.
|
164
|
+
|
165
|
+
``` ruby
|
166
|
+
require 'data_structures_rmolinari'
|
167
|
+
SegmentTree = DataStructuresRMolinari::SegmentTree # namespace module
|
168
|
+
|
169
|
+
data = [1, -3, 2, 1, 5, -9]
|
170
|
+
|
171
|
+
# Get a segment tree instance that will answer "max over this subinterval?" questions about data.
|
172
|
+
# Here we get one using the ruby implementation of the generic functionality.
|
173
|
+
#
|
174
|
+
# Put :index_of_max in place of :max to get an instance that returns "an index of the maximum value
|
175
|
+
# over this subinterval".
|
176
|
+
#
|
177
|
+
# To use the generic code written in C, put :c instead of :ruby.
|
178
|
+
seg_tree = SegmentTree.construct(data, :max, :ruby)
|
179
|
+
|
180
|
+
seg_tree.max_on(0, 2) # => 2
|
181
|
+
seg_tree.max_on(1, 4) # => 5
|
182
|
+
# ..etc..
|
183
|
+
```
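To make "providing some simple lambdas and constants" concrete, here is a self-contained sketch of a generic segment tree that takes a `combine` lambda and an `identity` value. The class `SketchSegmentTree`, its keyword arguments, and the method `query_on` are hypothetical names for illustration; the gem's template classes take different initializer arguments, so treat this only as a picture of the idea.

```ruby
# Hypothetical sketch, NOT the gem's SegmentTreeTemplate: an iterative,
# array-backed segment tree parameterized by a combine lambda and an identity.
class SketchSegmentTree
  # combine:  associative lambda merging the values of two adjacent intervals
  # identity: value e such that combine.(e, x) == x == combine.(x, e)
  def initialize(data, combine:, identity:)
    @size = data.size
    @combine = combine
    @identity = identity

    # Leaves live at indices @size...(2 * @size); node i has children 2i and 2i + 1.
    @tree = Array.new(2 * @size, identity)
    data.each_with_index { |v, i| @tree[@size + i] = v }
    (@size - 1).downto(1) do |i|
      @tree[i] = combine.call(@tree[2 * i], @tree[2 * i + 1])
    end
  end

  # Combined value over the closed interval left..right of the original data.
  def query_on(left, right)
    raise ArgumentError, 'bad interval' unless left.between?(0, right) && right < @size

    from_left = @identity
    from_right = @identity
    lo = left + @size
    hi = right + @size + 1 # work with the half-open range lo...hi
    while lo < hi
      if lo.odd?
        from_left = @combine.call(from_left, @tree[lo])
        lo += 1
      end
      if hi.odd?
        hi -= 1
        from_right = @combine.call(@tree[hi], from_right)
      end
      lo /= 2
      hi /= 2
    end
    @combine.call(from_left, from_right)
  end
end

data = [1, -3, 2, 1, 5, -9]

# A "max over this subinterval" tree: combine with max, identity -infinity.
max_tree = SketchSegmentTree.new(
  data,
  combine: ->(a, b) { a > b ? a : b },
  identity: -Float::INFINITY
)
puts max_tree.query_on(0, 2) # 2
puts max_tree.query_on(1, 4) # 5

# The same generic code gives a "sum over this subinterval" tree.
sum_tree = SketchSegmentTree.new(data, combine: ->(a, b) { a + b }, identity: 0)
puts sum_tree.query_on(0, 2) # 0
```

Swapping the lambda and identity is all it takes to answer a different kind of question, which is the design point behind having one template and several thin concrete classes.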
|
184
|
+
|
120
185
|
## Algorithms
|
121
186
|
|
122
187
|
The Algorithms submodule contains some algorithms using the data structures.
|
@@ -130,13 +195,12 @@ The Algorithms submodule contains some algorithms using the data structures.
|
|
130
195
|
|
131
196
|
# C Extensions
|
132
197
|
|
133
|
-
As another learning process I have implemented several of these data structures as C extensions. The
|
134
|
-
and they can be required like their pure Ruby versions. They have the same APIs as their Ruby cousins.
|
198
|
+
As another learning process I have implemented several of these data structures as C extensions. The APIs are the same.
|
135
199
|
|
136
200
|
## Disjoint Union
|
137
201
|
|
138
|
-
A benchmark suggests that a long sequence of `unite` operations is about 3 times as fast
|
139
|
-
`DisjointUnion`.
|
202
|
+
The C version is called `CDisjointUnion`. A benchmark suggests that a long sequence of `unite` operations is about 3 times as fast
|
203
|
+
with `CDisjointUnion` as with `DisjointUnion`.
|
140
204
|
|
141
205
|
The implementation uses the remarkable Convenient Containers library from Jackson Allan [[Allan]](#references).
|
142
206
|
|
@@ -145,16 +209,21 @@ The implementation uses the remarkable Convenient Containers library from Jackso
|
|
145
209
|
`CSegmentTreeTemplate` is the C implementation of the generic class. Concrete classes are built on top of this in Ruby, just as with
|
146
210
|
the pure Ruby `SegmentTreeTemplate` class.
|
147
211
|
|
148
|
-
A benchmark suggests that a long sequence of `max_on` operations against a max-val Segment Tree is about 4 times as fast with
|
149
|
-
|
150
|
-
|
212
|
+
A benchmark suggests that a long sequence of `max_on` operations against a max-val Segment Tree is about 4 times as fast with C as
|
213
|
+
with Ruby. I'm a bit surprised the improvement isn't larger, but remember that the C code must still interact with the Ruby objects in
|
214
|
+
the underlying data array, and must combine them, etc., via Ruby lambdas.
|
151
215
|
|
152
216
|
# References
|
153
|
-
- [Allan] Allan, J., _CC: Convenient Containers_, https://github.com/JacksonAllan/CC, retrieved 2023-02-01.
|
154
|
-
- [TvL1984] Tarjan, Robert E., van Leeuwen, J., _Worst-case Analysis of Set Union Algorithms_, Journal of the ACM, v31:2 (1984), pp
|
155
|
-
|
156
|
-
- [
|
157
|
-
-
|
158
|
-
- [
|
217
|
+
- [Allan] Allan, J., _CC: Convenient Containers_, https://github.com/JacksonAllan/CC, (retrieved 2023-02-01).
|
218
|
+
- [TvL1984] Tarjan, Robert E., van Leeuwen, J., _Worst-case Analysis of Set Union Algorithms_, Journal of the ACM, v31:2 (1984), pp
|
219
|
+
245–281, https://dl.acm.org/doi/10.1145/62.2160 (retrieved 2022-02-01).
|
220
|
+
- [EEK2017] Edelkamp, S., Elmasry, A., Katajainen, J., _Optimizing Binary Heaps_, Theory Comput Syst (2017), vol 61, pp 606-636, DOI
|
221
|
+
10.1007/s00224-017-9760-2, https://kclpure.kcl.ac.uk/portal/files/87388857/TheoryComputingSzstems.pdf (retrieved 2022-02-02).
|
222
|
+
- [McC1985] McCreight, E. M., _Priority Search Trees_, SIAM J. Comput., 14(2):257-276, 1985,
|
223
|
+
http://www.cs.duke.edu/courses/fall08/cps234/handouts/SMJ000257.pdf (retrieved 2023-02-02).
|
224
|
+
- [DMNS2011] De, M., Maheshwari, A., Nandy, S. C., Smid, M., _An In-Place Priority Search Tree_, 23rd Canadian Conference on
|
225
|
+
Computational Geometry, 2011, http://www.cs.carleton.ca/~michiel/inplace_pst.pdf (retrieved 2023-02-02).
|
226
|
+
- [DMNS2013] De, M., Maheshwari, A., Nandy, S. C., Smid, M., _An In-Place Min-max Priority Search Tree_, Computational Geometry, v46
|
227
|
+
(2013), pp 310-327, https://people.scs.carleton.ca/~michiel/MinMaxPST.pdf (retrieved 2023-02-02).
|
159
228
|
|
160
229
|
[^minmaxpst]: See the comments in the fragmentary class `MinMaxPrioritySearchTree` for further details.
|
data/ext/c_segment_tree_template/segment_tree_template.c
CHANGED
@@ -353,7 +353,8 @@ static VALUE segment_tree_update_at(VALUE self, VALUE idx) {
|
|
353
353
|
* (see SegmentTreeTemplate)
|
354
354
|
*/
|
355
355
|
void Init_c_segment_tree_template() {
|
356
|
-
VALUE
|
356
|
+
VALUE mSegmentTree = rb_define_module_under(mDataStructuresRMolinari, "SegmentTree");
|
357
|
+
VALUE cSegmentTreeTemplate = rb_define_class_under(mSegmentTree, "CSegmentTreeTemplate", rb_cObject);
|
357
358
|
|
358
359
|
rb_define_alloc_func(cSegmentTreeTemplate, segment_tree_alloc);
|
359
360
|
rb_define_method(cSegmentTreeTemplate, "c_initialize", segment_tree_init, 4);
|