veb_tree 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: e23be52707f931f14be4a781a5b3772bc222c23c3843c59313b074fd64eb2e9c
4
+ data.tar.gz: 626f915e7d226ae97e08b6d8c448326716d3cbc12f88a2af37ba7e62c65b9d81
5
+ SHA512:
6
+ metadata.gz: b90a39d5c9480a9b1ac3174b1d078421634cd70e361154c00599edf6906370c4250d2491ee69e166f1c149a45847ad910484e4b2fe9537c6d4f26cf9c74afe05
7
+ data.tar.gz: d8b9ee66e04ababef90de5f0cfe2f12c2d1da3bbc6146b97899219dfc592bcba95c47293026bc798cb74e505c19f67b409c7640cd64b9aeecf1e3fad5737ab46
data/CHANGELOG.md ADDED
@@ -0,0 +1,38 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## [0.1.0] - 2025-01-XX
9
+
10
+ ### Added
11
+ - Initial release of VebTree gem
12
+ - Full Van Emde Boas tree implementation in C++17
13
+ - Ruby bindings for all core operations
14
+ - O(log log U) performance for insert, delete, search, successor, predecessor
15
+ - O(1) performance for min/max queries
16
+ - Lazy cluster allocation for memory efficiency
17
+ - Enumerable support for iteration
18
+ - Comprehensive test suite
19
+ - Full API documentation
20
+
21
+ ### Features
22
+ - `VebTree::Tree.new(universe_size)` - Constructor with automatic power-of-2 rounding
23
+ - `#insert(key)` - Insert element
24
+ - `#delete(key)` - Delete element
25
+ - `#include?(key)` / `#member?(key)` - Membership test
26
+ - `#min` / `#max` - Get minimum/maximum elements (O(1))
27
+ - `#successor(key)` - Get next larger element
28
+ - `#predecessor(key)` - Get next smaller element
29
+ - `#size` - Get number of elements
30
+ - `#empty?` - Check if tree is empty
31
+ - `#clear` - Remove all elements
32
+ - `#each` - Iterate over elements in sorted order
33
+ - `#to_a` - Convert to sorted array
34
+
35
+ ### Platform Support
36
+ - Ruby 2.7+ on Linux (GCC 7+, Clang 5+)
37
+ - Ruby 2.7+ on macOS (Xcode Command Line Tools)
38
+ - Ruby 2.7+ on Windows (MinGW-w64, MSVC 2017+)
data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 [Your Name]
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,96 @@
1
+ # VebTree - Van Emde Boas Tree
2
+
3
+ A high-performance Van Emde Boas (vEB) tree implementation for Ruby with a C++ core, providing **O(log log U)** time complexity for integer set operations.
4
+
5
+ ## Features
6
+
7
+ - **Blazing Fast**: O(log log U) operations for insert, delete, search, successor, and predecessor
8
+ - **Native Performance**: Core algorithm implemented in C++17
9
+ - **Simple API**: Clean, idiomatic Ruby interface
10
+ - **Memory Efficient**: Lazy cluster allocation
11
+ - **Battle Tested**: Comprehensive test suite
12
+
13
+ ## Installation
14
+
15
+ ### Requirements
16
+
17
+ - Ruby 2.7 or higher
18
+ - C++17 compatible compiler:
19
+ - **Linux**: GCC 7+ or Clang 5+
20
+ - **macOS**: Xcode Command Line Tools
21
+ - **Windows**: MinGW-w64 or MSVC 2017+
22
+
23
+ ### Install via RubyGems
24
+ ```bash
25
+ gem install veb_tree
26
+ ```
27
+
28
+ ### Install from Source
29
+ ```bash
30
+ git clone https://github.com/yourusername/veb_tree.git
31
+ cd veb_tree
32
+ bundle install
33
+ rake compile
34
+ rake test
35
+ gem build veb_tree.gemspec
36
+ gem install veb_tree-*.gem
37
+ ```
38
+
39
+ ### Quick Start
40
+ ```ruby
41
+ require 'veb_tree'
42
+
43
+ # Create a tree with universe size (will round to next power of 2)
44
+ tree = VebTree::Tree.new(1000) # Actual size: 1024
45
+
46
+ # Insert elements
47
+ tree.insert(42)
48
+ tree.insert(100)
49
+ tree.insert(7)
50
+ tree.insert(500)
51
+
52
+ # Check membership - O(log log U)
53
+ tree.include?(42) # => true
54
+ tree.include?(99) # => false
55
+
56
+ # Min/Max - O(1)
57
+ tree.min # => 7
58
+ tree.max # => 500
59
+
60
+ # Successor/Predecessor - O(log log U)
61
+ tree.successor(42) # => 100
62
+ tree.predecessor(100) # => 42
63
+
64
+ # Size and empty check
65
+ tree.size # => 4
66
+ tree.empty? # => false
67
+
68
+ # Iterate in sorted order
69
+ tree.each { |key| puts key }
70
+ # Prints 7, 42, 100, 500 (one key per line)
71
+
72
+ # Convert to array
73
+ tree.to_a # => [7, 42, 100, 500]
74
+
75
+ # Delete elements
76
+ tree.delete(42) # => true
77
+ tree.delete(42) # => false (not present)
78
+
79
+ # Clear all elements
80
+ tree.clear
81
+ ```
82
+
83
+ ### API Reference
84
+ #### `VebTree::Tree.new(universe_size)`
85
+
86
+ Creates a new Van Emde Boas tree.
87
+ - universe_size (Integer): Exclusive upper bound on the keys that can be stored. It is
88
+   rounded up to the next power of 2, so valid keys run from 0 to the rounded size minus 1.
89
+ - Returns: New VebTree::Tree instance
90
+ - Raises: ArgumentError if universe_size is not positive
91
+
92
+ #### Example
93
+ ```ruby
94
+ tree = VebTree::Tree.new(100) # Actual universe: 128
95
+ tree.universe_size # => 128
96
+ ```
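The rounding described in the API reference can be reproduced with integer arithmetic alone. The sketch below is illustrative only: `next_power_of_two` is a hypothetical helper, not part of the gem's API, and it assumes the argument is a positive Integer.

```ruby
# Hypothetical helper (not part of veb_tree): rounds a positive Integer up
# to the nearest power of two without going through floating-point log2.
def next_power_of_two(n)
  unless n.is_a?(Integer) && n.positive?
    raise ArgumentError, "universe size must be a positive Integer"
  end

  # (n - 1).bit_length is the number of bits needed to represent n - 1,
  # which is exactly the exponent of the smallest power of two >= n.
  1 << (n - 1).bit_length
end
```

Under these assumptions, 100 maps to 128 and 1000 maps to 1024, matching the sizes shown in the README examples, and an exact power of two is returned unchanged.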
data/ext/veb_tree/extconf.rb ADDED
@@ -0,0 +1,40 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "mkmf"
4
+
5
+ # Check for C++ compiler
6
+ unless find_executable("g++") || find_executable("clang++")
7
+ abort "C++ compiler not found. Please install a C++ compiler."
8
+ end
9
+
10
+ # Set C++ compiler
11
+ RbConfig::MAKEFILE_CONFIG["CXX"] = ENV["CXX"] || (find_executable("g++") ? "g++" : "clang++")
12
+
13
+ # C++17 standard
14
+ $CXXFLAGS << " -std=c++17 -Wall -Wextra -O3"
15
+
16
+ # Enable optimizations
17
+ $CXXFLAGS << " -march=native" if ENV["NATIVE_ARCH"] == "1"
18
+
19
+ # Debug flags if requested
20
+ if ENV["DEBUG"] == "1"
21
+ $CXXFLAGS << " -g -O0 -DDEBUG"
22
+ else
23
+ $CXXFLAGS << " -DNDEBUG"
24
+ end
25
+
26
+ # Platform-specific settings
27
+ case RUBY_PLATFORM
28
+ when /darwin/
29
+ # macOS specific flags
30
+ $CXXFLAGS << " -stdlib=libc++"
31
+ when /linux/
32
+ # Linux specific flags
33
+ $LDFLAGS << " -lstdc++"
34
+ when /mingw|mswin/
35
+ # Windows specific flags
36
+ $CXXFLAGS << " -static-libgcc -static-libstdc++"
37
+ end
38
+
39
+ # Create Makefile
40
+ create_makefile("veb_tree/veb_tree")
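The flag selection in this extconf.rb can be checked in isolation by expressing it as a pure function. This is a sketch under stated assumptions: `cxx_flags` is a hypothetical helper that does not exist in the gem, it takes any Hash-like environment, and it leaves out the platform-specific flags.

```ruby
# Hypothetical helper mirroring the environment-driven flag selection in
# extconf.rb. `env` is any Hash-like object (ENV works too); mkmf is not used.
def cxx_flags(env)
  flags = %w[-std=c++17 -Wall -Wextra -O3]

  # Opt-in CPU-specific tuning, as in the original script.
  flags << "-march=native" if env["NATIVE_ARCH"] == "1"

  if env["DEBUG"] == "1"
    flags.concat(%w[-g -O0 -DDEBUG])
  else
    flags << "-DNDEBUG"
  end

  flags.join(" ")
end
```

With `DEBUG=1` both `-O3` and `-O0` end up on the command line. GCC documents that the last `-O` option takes effect, so the debug build should be unoptimized there; other compilers are worth checking separately.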
data/ext/veb_tree/veb_tree_ext.cpp ADDED
@@ -0,0 +1,532 @@
1
+ #include "veb_tree_ext.h"
2
+ #include <algorithm>
3
+
4
+ namespace VebTree {
5
+
6
+ // ============================================================================
7
+ // VEBTree Implementation
8
+ // ============================================================================
9
+
10
+ VEBTree::VEBTree(uint64_t universe_size)
11
+ : universe_(universe_size),
12
+ size_(0),
13
+ min_(NIL),
14
+ max_(NIL),
15
+ is_base_case_(universe_size <= 2),
16
+ sqrt_size_(0) {
17
+
18
+ if (universe_size == 0) {
19
+ throw std::invalid_argument("Universe size must be greater than 0");
20
+ }
21
+
22
+ // Check if power of 2
23
+ if ((universe_size & (universe_size - 1)) != 0) {
24
+ throw std::invalid_argument("Universe size must be a power of 2");
25
+ }
26
+
27
+ if (!is_base_case_) {
28
+ // Calculate sqrt of universe size
29
+ sqrt_size_ = 1ULL << (uint64_t)(std::log2((double)universe_size) / 2.0);
30
+ uint64_t num_clusters = universe_size / sqrt_size_;
31
+
32
+ // Initialize clusters vector (but don't create trees yet - lazy allocation)
33
+ clusters_.resize(num_clusters);
34
+
35
+ // Summary tree
36
+ summary_ = std::make_unique<VEBTree>(num_clusters);
37
+ }
38
+ }
39
+
40
+ void VEBTree::empty_insert(uint64_t key) {
41
+ min_ = max_ = static_cast<int64_t>(key);
42
+ size_ = 1;
43
+ }
44
+
45
+ void VEBTree::empty_delete() {
46
+ min_ = max_ = NIL;
47
+ size_ = 0;
48
+ }
49
+
50
+ bool VEBTree::insert(uint64_t key) {
51
+ if (key >= universe_) {
52
+ throw std::out_of_range("Key exceeds universe size");
53
+ }
54
+
55
+ // Check if already present
56
+ if (contains(key)) {
57
+ return false;
58
+ }
59
+
60
+ // Empty tree
61
+ if (min_ == NIL) {
62
+ empty_insert(key);
63
+ return true;
64
+ }
65
+
66
+ // Base case
67
+ if (is_base_case_) {
68
+ if (static_cast<int64_t>(key) < min_) {
69
+ min_ = static_cast<int64_t>(key);
70
+ }
71
+ if (static_cast<int64_t>(key) > max_) {
72
+ max_ = static_cast<int64_t>(key);
73
+ }
74
+ size_++;
75
+ return true;
76
+ }
77
+
78
+ // Make sure key is not min/max
79
+ if (static_cast<int64_t>(key) < min_) {
80
+ uint64_t temp = static_cast<uint64_t>(min_);
81
+ min_ = static_cast<int64_t>(key);
82
+ key = temp;
83
+ }
84
+
85
+ if (static_cast<int64_t>(key) > max_) {
86
+ max_ = static_cast<int64_t>(key);
87
+ }
88
+
89
+ // Recursive case
90
+ uint64_t h = high(key);
91
+ uint64_t l = low(key);
92
+
93
+ // Lazy allocation of cluster
94
+ if (!clusters_[h]) {
95
+ clusters_[h] = std::make_unique<VEBTree>(sqrt_size_);
96
+ }
97
+
98
+ // If cluster was empty, update summary
99
+ if (clusters_[h]->min_ == NIL) {
100
+ summary_->insert(h);
101
+ clusters_[h]->empty_insert(l);
102
+ } else {
103
+ clusters_[h]->insert(l);
104
+ }
105
+
106
+ size_++;
107
+ return true;
108
+ }
109
+
110
+ bool VEBTree::remove(uint64_t key) {
111
+ if (key >= universe_) {
112
+ return false;
113
+ }
114
+
115
+ if (min_ == NIL) {
116
+ return false; // Tree is empty
117
+ }
118
+
119
+ if (!contains(key)) {
120
+ return false;
121
+ }
122
+
123
+ // Base case: universe size 2
124
+ if (is_base_case_) {
125
+ if (static_cast<int64_t>(key) == min_ && static_cast<int64_t>(key) == max_) {
126
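+ min_ = max_ = NIL; // size_ is decremented once below; empty_delete() would zero it first and the decrement would underflow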
127
+ } else if (static_cast<int64_t>(key) == min_) {
128
+ min_ = max_;
129
+ } else {
130
+ max_ = min_;
131
+ }
132
+ size_--;
133
+ return true;
134
+ }
135
+
136
+ // Only one element
137
+ if (size_ == 1) {
138
+ empty_delete();
139
+ return true;
140
+ }
141
+
142
+ // If deleting min, replace with successor
143
+ if (static_cast<int64_t>(key) == min_) {
144
+ int64_t first_cluster = summary_->min();
145
+ key = index(static_cast<uint64_t>(first_cluster), static_cast<uint64_t>(clusters_[first_cluster]->min()));
146
+ min_ = static_cast<int64_t>(key);
147
+ }
148
+
149
+ // Recursive delete
150
+ uint64_t h = high(key);
151
+ uint64_t l = low(key);
152
+
153
+ if (clusters_[h]) {
154
+ clusters_[h]->remove(l);
155
+
156
+ // If cluster is now empty, remove from summary
157
+ if (clusters_[h]->min_ == NIL) {
158
+ summary_->remove(h);
159
+ clusters_[h].reset(); // Free memory
160
+
161
+ // Update max if we deleted it
162
+ if (static_cast<int64_t>(key) == max_) {
163
+ int64_t summary_max = summary_->max();
164
+ if (summary_max == NIL) {
165
+ max_ = min_;
166
+ } else {
167
+ max_ = index(static_cast<uint64_t>(summary_max), static_cast<uint64_t>(clusters_[summary_max]->max()));
168
+ }
169
+ }
170
+ } else if (static_cast<int64_t>(key) == max_) {
171
+ // Update max but cluster not empty
172
+ max_ = index(h, static_cast<uint64_t>(clusters_[h]->max()));
173
+ }
174
+ }
175
+
176
+ size_--;
177
+ return true;
178
+ }
179
+
180
+ bool VEBTree::contains(uint64_t key) const {
181
+ if (key >= universe_) {
182
+ return false;
183
+ }
184
+
185
+ if (static_cast<int64_t>(key) == min_ || static_cast<int64_t>(key) == max_) {
186
+ return true;
187
+ }
188
+
189
+ if (is_base_case_) {
190
+ return false;
191
+ }
192
+
193
+ uint64_t h = high(key);
194
+ uint64_t l = low(key);
195
+
196
+ if (clusters_[h]) {
197
+ return clusters_[h]->contains(l);
198
+ }
199
+
200
+ return false;
201
+ }
202
+
203
+ int64_t VEBTree::min() const {
204
+ return min_;
205
+ }
206
+
207
+ int64_t VEBTree::max() const {
208
+ return max_;
209
+ }
210
+
211
+ int64_t VEBTree::successor(uint64_t key) const {
212
+ if (min_ == NIL) {
213
+ return NIL;
214
+ }
215
+
216
+ // Base case
217
+ if (is_base_case_) {
218
+ if (static_cast<int64_t>(key) < min_) {
219
+ return min_;
220
+ } else if (static_cast<int64_t>(key) < max_) {
221
+ return max_;
222
+ } else {
223
+ return NIL;
224
+ }
225
+ }
226
+
227
+ // If key < min, min is the successor
228
+ if (static_cast<int64_t>(key) < min_) {
229
+ return min_;
230
+ }
231
+
232
+ uint64_t h = high(key);
233
+ uint64_t l = low(key);
234
+
235
+ // Check if successor is in same cluster
236
+ if (clusters_[h] && static_cast<int64_t>(l) < clusters_[h]->max()) {
237
+ int64_t offset = clusters_[h]->successor(l);
238
+ return static_cast<int64_t>(index(h, static_cast<uint64_t>(offset)));
239
+ }
240
+
241
+ // Successor is in next cluster
242
+ int64_t succ_cluster = summary_->successor(h);
243
+ if (succ_cluster == NIL) {
244
+ return NIL;
245
+ }
246
+
247
+ int64_t offset = clusters_[succ_cluster]->min();
248
+ return static_cast<int64_t>(index(static_cast<uint64_t>(succ_cluster), static_cast<uint64_t>(offset)));
249
+ }
250
+
251
+ int64_t VEBTree::predecessor(uint64_t key) const {
252
+ if (max_ == NIL) {
253
+ return NIL;
254
+ }
255
+
256
+ // Base case
257
+ if (is_base_case_) {
258
+ if (static_cast<int64_t>(key) > max_) {
259
+ return max_;
260
+ } else if (static_cast<int64_t>(key) > min_) {
261
+ return min_;
262
+ } else {
263
+ return NIL;
264
+ }
265
+ }
266
+
267
+ // If key > max, max is the predecessor
268
+ if (static_cast<int64_t>(key) > max_) {
269
+ return max_;
270
+ }
271
+
272
+ uint64_t h = high(key);
273
+ uint64_t l = low(key);
274
+
275
+ // Check if predecessor is in same cluster
276
+ if (clusters_[h] && static_cast<int64_t>(l) > clusters_[h]->min()) {
277
+ int64_t offset = clusters_[h]->predecessor(l);
278
+ return static_cast<int64_t>(index(h, static_cast<uint64_t>(offset)));
279
+ }
280
+
281
+ // Predecessor might be in previous cluster
282
+ int64_t pred_cluster = summary_->predecessor(h);
283
+ if (pred_cluster == NIL) {
284
+ // Predecessor might be min
285
+ if (static_cast<int64_t>(key) > min_) {
286
+ return min_;
287
+ }
288
+ return NIL;
289
+ }
290
+
291
+ int64_t offset = clusters_[pred_cluster]->max();
292
+ return static_cast<int64_t>(index(static_cast<uint64_t>(pred_cluster), static_cast<uint64_t>(offset)));
293
+ }
294
+
295
+ void VEBTree::clear() {
296
+ min_ = max_ = NIL;
297
+ size_ = 0;
298
+
299
+ if (!is_base_case_) {
300
+ summary_->clear();
301
+ for (auto& cluster : clusters_) {
302
+ cluster.reset();
303
+ }
304
+ }
305
+ }
306
+
307
+ std::vector<uint64_t> VEBTree::to_vector() const {
308
+ std::vector<uint64_t> result;
309
+ result.reserve(size_);
310
+
311
+ if (min_ == NIL) {
312
+ return result;
313
+ }
314
+
315
+ int64_t current = min_;
316
+ while (current != NIL) {
317
+ result.push_back(static_cast<uint64_t>(current));
318
+ if (current == max_) break;
319
+ current = successor(static_cast<uint64_t>(current));
320
+ }
321
+
322
+ return result;
323
+ }
324
+
325
+ // ============================================================================
326
+ // Ruby Wrapper Implementation
327
+ // ============================================================================
328
+
329
+ static const rb_data_type_t veb_tree_type = {
330
+ "VebTree::Tree",
331
+ {
332
+ nullptr, // dmark
333
+ [](void* ptr) { delete static_cast<VEBTree*>(ptr); }, // dfree
334
+ nullptr, // dsize
335
+ nullptr, // dcompact
336
+ },
337
+ nullptr, // parent
338
+ nullptr, // data
339
+ RUBY_TYPED_FREE_IMMEDIATELY
340
+ };
341
+
342
+ VEBTree* TreeWrapper::get_tree(VALUE self) {
343
+ VEBTree* tree;
344
+ TypedData_Get_Struct(self, VEBTree, &veb_tree_type, tree);
345
+ return tree;
346
+ }
347
+
348
+ VALUE TreeWrapper::rb_alloc(VALUE klass) {
349
+ return TypedData_Wrap_Struct(klass, &veb_tree_type, nullptr);
350
+ }
351
+
352
+ VALUE TreeWrapper::rb_initialize(VALUE self, VALUE universe_size) {
353
+ Check_Type(universe_size, T_FIXNUM);
354
+
355
+ uint64_t u = NUM2ULL(universe_size);
356
+
357
+ if (u == 0) {
358
+ rb_raise(rb_eArgError, "Universe size must be greater than 0");
359
+ }
360
+
361
+ // Round up to next power of 2
362
+ uint64_t rounded = 1ULL << (uint64_t)std::ceil(std::log2((double)u));
363
+
364
+ if (rounded != u) {
365
+ rb_warn("Universe size %llu rounded up to next power of 2: %llu",
366
+ (unsigned long long)u, (unsigned long long)rounded);
367
+ }
368
+
369
+ VEBTree* tree = nullptr;
370
+ try {
371
+ tree = new VEBTree(rounded);
372
+ } catch (const std::exception& e) {
373
+ rb_raise(rb_eRuntimeError, "Failed to create VEB tree: %s", e.what());
374
+ }
375
+
376
+ RTYPEDDATA_DATA(self) = tree;
377
+ return self;
378
+ }
379
+
380
+ VALUE TreeWrapper::rb_insert(VALUE self, VALUE key) {
381
+ Check_Type(key, T_FIXNUM);
382
+
383
+ VEBTree* tree = get_tree(self);
384
+ uint64_t k = NUM2ULL(key);
385
+
386
+ try {
387
+ bool inserted = tree->insert(k);
388
+ return inserted ? Qtrue : Qfalse;
389
+ } catch (const std::out_of_range& e) {
390
+ rb_raise(rb_eArgError, "Key out of range: %s", e.what());
391
+ } catch (const std::exception& e) {
392
+ rb_raise(rb_eRuntimeError, "Insert failed: %s", e.what());
393
+ }
394
+
395
+ return Qfalse;
396
+ }
397
+
398
+ VALUE TreeWrapper::rb_delete(VALUE self, VALUE key) {
399
+ Check_Type(key, T_FIXNUM);
400
+
401
+ VEBTree* tree = get_tree(self);
402
+ uint64_t k = NUM2ULL(key);
403
+
404
+ try {
405
+ bool deleted = tree->remove(k);
406
+ return deleted ? Qtrue : Qfalse;
407
+ } catch (const std::exception& e) {
408
+ rb_raise(rb_eRuntimeError, "Delete failed: %s", e.what());
409
+ }
410
+
411
+ return Qfalse;
412
+ }
413
+
414
+ VALUE TreeWrapper::rb_include(VALUE self, VALUE key) {
415
+ Check_Type(key, T_FIXNUM);
416
+
417
+ VEBTree* tree = get_tree(self);
418
+ uint64_t k = NUM2ULL(key);
419
+
420
+ return tree->contains(k) ? Qtrue : Qfalse;
421
+ }
422
+
423
+ VALUE TreeWrapper::rb_size(VALUE self) {
424
+ VEBTree* tree = get_tree(self);
425
+ return ULL2NUM(tree->size());
426
+ }
427
+
428
+ VALUE TreeWrapper::rb_universe_size(VALUE self) {
429
+ VEBTree* tree = get_tree(self);
430
+ return ULL2NUM(tree->universe_size());
431
+ }
432
+
433
+ VALUE TreeWrapper::rb_min(VALUE self) {
434
+ VEBTree* tree = get_tree(self);
435
+ int64_t m = tree->min();
436
+ return m == VEBTree::NIL ? Qnil : LL2NUM(m);
437
+ }
438
+
439
+ VALUE TreeWrapper::rb_max(VALUE self) {
440
+ VEBTree* tree = get_tree(self);
441
+ int64_t m = tree->max();
442
+ return m == VEBTree::NIL ? Qnil : LL2NUM(m);
443
+ }
444
+
445
+ VALUE TreeWrapper::rb_successor(VALUE self, VALUE key) {
446
+ Check_Type(key, T_FIXNUM);
447
+
448
+ VEBTree* tree = get_tree(self);
449
+ uint64_t k = NUM2ULL(key);
450
+
451
+ int64_t succ = tree->successor(k);
452
+ return succ == VEBTree::NIL ? Qnil : LL2NUM(succ);
453
+ }
454
+
455
+ VALUE TreeWrapper::rb_predecessor(VALUE self, VALUE key) {
456
+ Check_Type(key, T_FIXNUM);
457
+
458
+ VEBTree* tree = get_tree(self);
459
+ uint64_t k = NUM2ULL(key);
460
+
461
+ int64_t pred = tree->predecessor(k);
462
+ return pred == VEBTree::NIL ? Qnil : LL2NUM(pred);
463
+ }
464
+
465
+ VALUE TreeWrapper::rb_empty(VALUE self) {
466
+ VEBTree* tree = get_tree(self);
467
+ return tree->empty() ? Qtrue : Qfalse;
468
+ }
469
+
470
+ VALUE TreeWrapper::rb_clear(VALUE self) {
471
+ VEBTree* tree = get_tree(self);
472
+ tree->clear();
473
+ return self;
474
+ }
475
+
476
+ VALUE TreeWrapper::rb_to_a(VALUE self) {
477
+ VEBTree* tree = get_tree(self);
478
+ std::vector<uint64_t> elements = tree->to_vector();
479
+
480
+ VALUE arr = rb_ary_new_capa(elements.size());
481
+ for (uint64_t elem : elements) {
482
+ rb_ary_push(arr, ULL2NUM(elem));
483
+ }
484
+
485
+ return arr;
486
+ }
487
+
488
+ VALUE TreeWrapper::rb_each(VALUE self) {
489
+ VEBTree* tree = get_tree(self);
490
+
491
+ if (!rb_block_given_p()) {
492
+ return rb_enumeratorize(self, ID2SYM(rb_intern("each")), 0, nullptr);
493
+ }
494
+
495
+ std::vector<uint64_t> elements = tree->to_vector();
496
+ for (uint64_t elem : elements) {
497
+ rb_yield(ULL2NUM(elem));
498
+ }
499
+
500
+ return self;
501
+ }
502
+
503
+ void TreeWrapper::define_class(VALUE module) {
504
+ VALUE cTree = rb_define_class_under(module, "Tree", rb_cObject);
505
+
506
+ rb_define_alloc_func(cTree, rb_alloc);
507
+ rb_define_method(cTree, "initialize", RUBY_METHOD_FUNC(rb_initialize), 1);
508
+ rb_define_method(cTree, "insert", RUBY_METHOD_FUNC(rb_insert), 1);
509
+ rb_define_method(cTree, "delete", RUBY_METHOD_FUNC(rb_delete), 1);
510
+ rb_define_method(cTree, "include?", RUBY_METHOD_FUNC(rb_include), 1);
511
+ rb_define_alias(cTree, "member?", "include?");
512
+ rb_define_method(cTree, "size", RUBY_METHOD_FUNC(rb_size), 0);
513
+ rb_define_method(cTree, "universe_size", RUBY_METHOD_FUNC(rb_universe_size), 0);
514
+ rb_define_method(cTree, "min", RUBY_METHOD_FUNC(rb_min), 0);
515
+ rb_define_method(cTree, "max", RUBY_METHOD_FUNC(rb_max), 0);
516
+ rb_define_method(cTree, "successor", RUBY_METHOD_FUNC(rb_successor), 1);
517
+ rb_define_method(cTree, "predecessor", RUBY_METHOD_FUNC(rb_predecessor), 1);
518
+ rb_define_method(cTree, "empty?", RUBY_METHOD_FUNC(rb_empty), 0);
519
+ rb_define_method(cTree, "clear", RUBY_METHOD_FUNC(rb_clear), 0);
520
+ rb_define_method(cTree, "to_a", RUBY_METHOD_FUNC(rb_to_a), 0);
521
+ rb_define_method(cTree, "each", RUBY_METHOD_FUNC(rb_each), 0);
522
+
523
+ // Include Enumerable
524
+ rb_include_module(cTree, rb_mEnumerable);
525
+ }
526
+
527
+ } // namespace VebTree
528
+
529
+ extern "C" void Init_veb_tree() {
530
+ VALUE mVebTree = rb_define_module("VebTree");
531
+ VebTree::TreeWrapper::define_class(mVebTree);
532
+ }
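The `VEBTree` constructor in this file splits a universe of size U into clusters of width 2^floor(log2(U)/2) plus a summary over the cluster indices, and stops at U <= 2. The depth of that recursion is what the O(log log U) bound counts. A minimal Ruby sketch of the depth calculation follows; `veb_depth` is a hypothetical name, and the universe size is assumed to be a power of two.

```ruby
# Hypothetical helper: number of recursive levels below the root for a
# universe of the given size, following the same split as the constructor.
def veb_depth(universe_size)
  depth = 0
  u = universe_size

  while u > 2
    log_u = u.bit_length - 1        # exact log2 for a power of two
    cluster_width = 1 << (log_u / 2)
    num_clusters = u / cluster_width

    # An operation descends into either a cluster or the summary;
    # follow whichever has the larger universe to get the worst case.
    u = [cluster_width, num_clusters].max
    depth += 1
  end

  depth
end
```

For example, a 2^32 universe bottoms out after 5 levels and a universe of 1024 after 4, which illustrates how slowly the depth grows.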
data/ext/veb_tree/veb_tree_ext.h ADDED
@@ -0,0 +1,109 @@
1
+ #ifndef VEB_TREE_EXT_H
2
+ #define VEB_TREE_EXT_H
3
+
4
+ #include <ruby.h>
5
+ #include <cstdint>
6
+ #include <memory>
7
+ #include <stdexcept>
8
+ #include <cmath>
9
+ #include <vector>
10
+ #include <limits>
11
+
12
+ extern "C" {
13
+ void Init_veb_tree();
14
+ }
15
+
16
+ namespace VebTree {
17
+
18
+ /**
19
+ * Van Emde Boas Tree - Full Implementation
20
+ *
21
+ * Provides O(log log U) operations for integer sets
22
+ * where U is the universe size (must be power of 2)
23
+ */
24
+ class VEBTree {
25
+ public:
26
+ // NIL sentinel value - must be public for TreeWrapper access
27
+ static constexpr int64_t NIL = -1;
28
+
29
+ explicit VEBTree(uint64_t universe_size);
30
+ ~VEBTree() = default;
31
+
32
+ // Core operations
33
+ bool insert(uint64_t key);
34
+ bool remove(uint64_t key);
35
+ bool contains(uint64_t key) const;
36
+
37
+ // Min/Max - O(1)
38
+ int64_t min() const;
39
+ int64_t max() const;
40
+
41
+ // Successor/Predecessor - O(log log U)
42
+ int64_t successor(uint64_t key) const;
43
+ int64_t predecessor(uint64_t key) const;
44
+
45
+ // Utility
46
+ uint64_t size() const { return size_; }
47
+ uint64_t universe_size() const { return universe_; }
48
+ bool empty() const { return size_ == 0; }
49
+ void clear();
50
+
51
+ // For enumeration
52
+ std::vector<uint64_t> to_vector() const;
53
+
54
+ private:
55
+ uint64_t universe_;
56
+ uint64_t size_;
57
+
58
+ // vEB tree structure
59
+ int64_t min_;
60
+ int64_t max_;
61
+
62
+ // For base case (universe <= 2)
63
+ bool is_base_case_;
64
+
65
+ // For recursive case
66
+ std::unique_ptr<VEBTree> summary_;
67
+ std::vector<std::unique_ptr<VEBTree>> clusters_;
68
+ uint64_t sqrt_size_;
69
+
70
+ // Helper functions
71
+ uint64_t high(uint64_t x) const { return x / sqrt_size_; }
72
+ uint64_t low(uint64_t x) const { return x % sqrt_size_; }
73
+ uint64_t index(uint64_t high, uint64_t low) const { return high * sqrt_size_ + low; }
74
+
75
+ void empty_insert(uint64_t key);
76
+ void empty_delete();
77
+ };
78
+
79
+ /**
80
+ * Ruby wrapper class
81
+ */
82
+ class TreeWrapper {
83
+ public:
84
+ static void define_class(VALUE module);
85
+
86
+ private:
87
+ static VALUE rb_alloc(VALUE klass);
88
+ static void rb_free(void* ptr);
89
+ static VALUE rb_initialize(VALUE self, VALUE universe_size);
90
+ static VALUE rb_insert(VALUE self, VALUE key);
91
+ static VALUE rb_delete(VALUE self, VALUE key);
92
+ static VALUE rb_include(VALUE self, VALUE key);
93
+ static VALUE rb_size(VALUE self);
94
+ static VALUE rb_universe_size(VALUE self);
95
+ static VALUE rb_min(VALUE self);
96
+ static VALUE rb_max(VALUE self);
97
+ static VALUE rb_successor(VALUE self, VALUE key);
98
+ static VALUE rb_predecessor(VALUE self, VALUE key);
99
+ static VALUE rb_empty(VALUE self);
100
+ static VALUE rb_clear(VALUE self);
101
+ static VALUE rb_to_a(VALUE self);
102
+ static VALUE rb_each(VALUE self);
103
+
104
+ static VEBTree* get_tree(VALUE self);
105
+ };
106
+
107
+ } // namespace VebTree
108
+
109
+ #endif // VEB_TREE_EXT_H
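The private `high`, `low`, and `index` helpers declared in this header map a key to a cluster number and an offset within that cluster, and back again. The Ruby model below is a sketch for illustration; `cluster_width`, `split_key`, and `join_key` are hypothetical names, and the universe size is assumed to be a power of two greater than 2.

```ruby
# Illustrative model of the high/low/index helpers (names are hypothetical).
# Width of each cluster: 2^floor(log2(universe_size) / 2).
def cluster_width(universe_size)
  log_u = universe_size.bit_length - 1 # exact log2 for a power of two
  1 << (log_u / 2)
end

# Returns [high, low]: the cluster number and the offset inside it.
def split_key(key, universe_size)
  width = cluster_width(universe_size)
  [key / width, key % width]
end

# Inverse of split_key: rebuilds the key from cluster number and offset.
def join_key(high, low, universe_size)
  high * cluster_width(universe_size) + low
end
```

In a universe of 1024 the cluster width is 32, so key 500 lands in cluster 15 at offset 20, and `join_key(15, 20, 1024)` gives back 500.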
data/lib/veb_tree/pure_ruby.rb ADDED
@@ -0,0 +1,277 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'set'
4
+
5
+ module VebTree
6
+ class PureRuby
7
+ attr_reader :universe_size, :size
8
+
9
+ NIL_VALUE = -1
10
+
11
+ def initialize(universe_size)
12
+ raise ArgumentError, "Universe size must be positive" if universe_size <= 0
13
+
14
+ @universe_size = next_power_of_2(universe_size)
15
+ warn "Universe size #{universe_size} rounded up to #{@universe_size}" if @universe_size != universe_size
16
+
17
+ @size = 0
18
+ @min = NIL_VALUE
19
+ @max = NIL_VALUE
20
+
21
+ # Base case
22
+ if @universe_size <= 2
23
+ @base_case = true
24
+ return
25
+ end
26
+
27
+ @base_case = false
28
+
29
+ # Calculate sqrt
30
+ log_u = Math.log2(@universe_size).to_i
31
+ @sqrt_size = 1 << (log_u / 2)
32
+ @num_clusters = @universe_size / @sqrt_size
33
+
34
+ # Lazy allocation
35
+ @clusters = Array.new(@num_clusters)
36
+ @summary = nil
37
+ end
38
+
39
+ def insert(key)
40
+ validate_key(key)
41
+ return false if include?(key)
42
+
43
+ # Empty tree
44
+ if @min == NIL_VALUE
45
+ @min = @max = key
46
+ @size += 1
47
+ return true
48
+ end
49
+
50
+ # Base case
51
+ if @base_case
52
+ @min = key if key < @min
53
+ @max = key if key > @max
54
+ @size += 1
55
+ return true
56
+ end
57
+
58
+ # Ensure key is not min
59
+ if key < @min
60
+ key, @min = @min, key
61
+ end
62
+
63
+ @max = key if key > @max
64
+
65
+ # Recursive insert
66
+ h = high(key)
67
+ l = low(key)
68
+
69
+ # Lazy create cluster
70
+ @clusters[h] ||= PureRuby.new(@sqrt_size)
71
+
72
+ # If cluster was empty, update summary
73
+ if @clusters[h].min.nil? # public #min returns nil (never NIL_VALUE) for an empty cluster
74
+ @summary ||= PureRuby.new(@num_clusters)
75
+ @summary.insert(h)
76
+ @clusters[h].instance_variable_set(:@min, l)
77
+ @clusters[h].instance_variable_set(:@max, l)
78
+ @clusters[h].instance_variable_set(:@size, 1)
79
+ else
80
+ @clusters[h].insert(l)
81
+ end
82
+
83
+ @size += 1
84
+ true
85
+ end
86
+
87
+ def delete(key)
88
+ return false unless include?(key)
89
+
90
+ # Base case
91
+ if @base_case
92
+ if key == @min && key == @max
93
+ @min = @max = NIL_VALUE
94
+ elsif key == @min
95
+ @min = @max
96
+ else
97
+ @max = @min
98
+ end
99
+ @size -= 1
100
+ return true
101
+ end
102
+
103
+ # Only one element
104
+ if @size == 1
105
+ @min = @max = NIL_VALUE
106
+ @size = 0
107
+ return true
108
+ end
109
+
110
+ # Replace min with successor if deleting min
111
+ if key == @min
112
+ first_cluster = @summary.min
113
+ key = index(first_cluster, @clusters[first_cluster].min)
114
+ @min = key
115
+ end
116
+
117
+ # Recursive delete
118
+ h = high(key)
119
+ l = low(key)
120
+
121
+ @clusters[h].delete(l) if @clusters[h]
122
+
123
+ # If cluster is empty, remove from summary
124
+ if @clusters[h] && @clusters[h].min.nil? # public #min returns nil for an empty cluster
125
+ @summary.delete(h)
126
+ @clusters[h] = nil
127
+
128
+ # Update max if necessary
129
+ if key == @max
130
+ summary_max = @summary.max
131
+ if summary_max.nil? # public #max returns nil for an empty summary
132
+ @max = @min
133
+ else
134
+ @max = index(summary_max, @clusters[summary_max].max)
135
+ end
136
+ end
137
+ elsif key == @max && @clusters[h]
138
+ @max = index(h, @clusters[h].max)
139
+ end
140
+
141
+ @size -= 1
142
+ true
143
+ end
144
+
145
+ def include?(key)
146
+ return false if key < 0 || key >= @universe_size
147
+ return true if key == @min || key == @max
148
+ return false if @base_case
149
+
150
+ h = high(key)
151
+ @clusters[h] && @clusters[h].include?(low(key))
152
+ end
153
+ alias member? include?
154
+
155
+ def min
156
+ @min == NIL_VALUE ? nil : @min
157
+ end
158
+
159
+ def max
160
+ @max == NIL_VALUE ? nil : @max
161
+ end
162
+
163
+ def successor(key)
164
+ return nil if @min == NIL_VALUE
165
+
166
+ # Base case
167
+ if @base_case
168
+ return @min if key < @min
169
+ return @max if key < @max
170
+ return nil
171
+ end
172
+
173
+ return @min if key < @min
174
+
175
+ h = high(key)
176
+ l = low(key)
177
+
178
+ # Check same cluster
179
+ if @clusters[h] && l < @clusters[h].max
180
+ offset = @clusters[h].successor(l)
181
+ return index(h, offset)
182
+ end
183
+
184
+ # Next cluster
185
+ succ_cluster = @summary&.successor(h) # summary is allocated lazily and may still be nil
185
+ return nil if succ_cluster.nil?
187
+
188
+ offset = @clusters[succ_cluster].min
189
+ index(succ_cluster, offset)
190
+ end
191
+
192
+ def predecessor(key)
193
+ return nil if @max == NIL_VALUE
194
+
195
+ # Base case
196
+ if @base_case
197
+ return @max if key > @max
198
+ return @min if key > @min
199
+ return nil
200
+ end
201
+
202
+ return @max if key > @max
203
+
204
+ h = high(key)
205
+ l = low(key)
206
+
207
+ # Check same cluster
208
+ if @clusters[h] && l > @clusters[h].min
209
+ offset = @clusters[h].predecessor(l)
210
+ return index(h, offset)
211
+ end
212
+
213
+ # Previous cluster
214
+ pred_cluster = @summary&.predecessor(h) # summary is allocated lazily and may still be nil
214
+ return @min if pred_cluster.nil? && key > @min
215
+ return nil if pred_cluster.nil?
217
+
218
+ offset = @clusters[pred_cluster].max
219
+ index(pred_cluster, offset)
220
+ end
221
+
222
+ def empty?
223
+ @size == 0
224
+ end
225
+
226
+ def clear
227
+ @min = @max = NIL_VALUE
228
+ @size = 0
229
+ unless @base_case
230
+ @summary&.clear
231
+ @clusters.fill(nil)
232
+ end
233
+ self
234
+ end
235
+
236
+ def each
237
+ return enum_for(:each) unless block_given?
238
+
239
+ current = @min
240
+ while current && current != NIL_VALUE
241
+ yield current
242
+ break if current == @max
243
+ current = successor(current)
244
+ end
245
+
246
+ self
247
+ end
248
+
249
+ def to_a
250
+ each.to_a
251
+ end
252
+
253
+ private
254
+
255
+ def high(x)
256
+ x / @sqrt_size
257
+ end
258
+
259
+ def low(x)
260
+ x % @sqrt_size
261
+ end
262
+
263
+ def index(high, low)
264
+ high * @sqrt_size + low
265
+ end
266
+
267
+ def validate_key(key)
268
+ raise ArgumentError, "Key must be non-negative" if key < 0
269
+ raise ArgumentError, "Key #{key} exceeds universe size #{@universe_size}" if key >= @universe_size
270
+ end
271
+
272
+ def next_power_of_2(n)
273
+ return 1 if n <= 1
274
+ 2 ** (Math.log2(n).ceil)
275
+ end
276
+ end
277
+ end
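A simple reference model is handy for cross-checking a vEB implementation such as this one. The sketch below is a hypothetical test helper, not something shipped with the gem. It keeps keys in a sorted array and assumes the same strict semantics the README shows, where `successor(42)` returns the next larger stored key and `predecessor(100)` the next smaller one.

```ruby
# Hypothetical test helper: a sorted-array model with strict
# successor/predecessor semantics, useful as an oracle in tests.
class SortedArrayModel
  def initialize
    @keys = []
  end

  # Inserts key; returns false if it was already present, true otherwise.
  def insert(key)
    i = lower_bound(key)
    return false if @keys[i] == key

    @keys.insert(i, key)
    true
  end

  # Smallest stored key strictly greater than `key`, or nil if none.
  def successor(key)
    @keys.bsearch { |k| k > key }
  end

  # Largest stored key strictly less than `key`, or nil if none.
  def predecessor(key)
    i = lower_bound(key)
    i.zero? ? nil : @keys[i - 1]
  end

  # Copy of the stored keys in ascending order.
  def to_a
    @keys.dup
  end

  private

  # Index of the first stored key >= key, or the array size if there is none.
  def lower_bound(key)
    @keys.bsearch_index { |k| k >= key } || @keys.size
  end
end
```

Running the same sequence of operations against this model and against a tree, then comparing the results, is one way to catch disagreements on edge cases such as a tree holding a single key.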
data/lib/veb_tree/version.rb ADDED
@@ -0,0 +1,5 @@
1
+ # frozen_string_literal: true
2
+
3
+ module VebTree
4
+ VERSION = "0.1.0"
5
+ end
data/lib/veb_tree.rb ADDED
@@ -0,0 +1,32 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "veb_tree/version"
4
+
5
+ module VebTree
6
+ class Error < StandardError; end
7
+
8
+ # Load the native extension
9
+ begin
10
+ require_relative "veb_tree/veb_tree"
11
+ NATIVE_EXTENSION_LOADED = true
12
+ rescue LoadError => e
13
+ raise Error, <<~MSG
14
+ Failed to load VebTree native extension!
15
+
16
+ Error: #{e.message}
17
+
18
+ VebTree requires a C++17 compatible compiler to build the native extension.
19
+
20
+ Requirements:
21
+ - Linux: GCC 7+ or Clang 5+
22
+ - macOS: Xcode Command Line Tools
23
+ - Windows: MinGW-w64 or MSVC 2017+
24
+
25
+ To install:
26
+ 1. Install a C++ compiler for your platform
27
+ 2. Run: gem install veb_tree
28
+
29
+ For more help, see: https://github.com/yourusername/veb_tree
30
+ MSG
31
+ end
32
+ end
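This loader raises when the native extension cannot be loaded, although the package also ships `lib/veb_tree/pure_ruby.rb`, which is never required here. One possible alternative, sketched purely as an assumption and not as the gem's actual behavior, is to fall back to a second file when the first fails to load. `load_backend` and both of its path arguments are hypothetical.

```ruby
# Hypothetical loader (not the gem's behavior): tries the native extension
# first and falls back to another implementation if it cannot be loaded.
# Returns a symbol describing which backend was used.
def load_backend(native_path, fallback_path)
  require native_path
  :native
rescue LoadError
  # A LoadError from the fallback itself is deliberately left to propagate.
  require fallback_path
  :pure_ruby
end
```

Whether a silent fallback is desirable is a design choice: it keeps the gem usable without a compiler, but it can hide a broken build and changes the performance profile.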
metadata ADDED
@@ -0,0 +1,91 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: veb_tree
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - pixelcaliber
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2025-10-03 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: rake
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - "~>"
18
+ - !ruby/object:Gem::Version
19
+ version: '13.0'
20
+ type: :development
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - "~>"
25
+ - !ruby/object:Gem::Version
26
+ version: '13.0'
27
+ - !ruby/object:Gem::Dependency
28
+ name: minitest
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: '5.0'
34
+ type: :development
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: '5.0'
41
+ description: "VebTree is a production-quality Van Emde Boas tree implementation providing
42
+ \nO(log log U) time complexity for insert, delete, search, successor, and \npredecessor
43
+ operations on integer sets. The core algorithm is implemented \nin C++17 for maximum
44
+ performance with an idiomatic Ruby API.\n\nPerfect for applications requiring fast
45
+ integer set operations, range queries,\nand successor/predecessor lookups within
46
+ a bounded universe.\n"
47
+ email:
48
+ - abhinav.1e4@gmail.com
49
+ executables: []
50
+ extensions:
51
+ - ext/veb_tree/extconf.rb
52
+ extra_rdoc_files: []
53
+ files:
54
+ - CHANGELOG.md
55
+ - LICENSE
56
+ - README.md
57
+ - ext/veb_tree/extconf.rb
58
+ - ext/veb_tree/veb_tree_ext.cpp
59
+ - ext/veb_tree/veb_tree_ext.h
60
+ - lib/veb_tree.rb
61
+ - lib/veb_tree/pure_ruby.rb
62
+ - lib/veb_tree/version.rb
63
+ homepage: https://github.com/abhinvv1/Van-Emde-Boas-tree
64
+ licenses:
65
+ - MIT
66
+ metadata:
67
+ homepage_uri: https://github.com/abhinvv1/Van-Emde-Boas-tree
68
+ source_code_uri: https://github.com/abhinvv1/Van-Emde-Boas-tree
69
+ changelog_uri: https://github.com/abhinvv1/Van-Emde-Boas-tree/blob/main/CHANGELOG.md
70
+ bug_tracker_uri: https://github.com/abhinvv1/Van-Emde-Boas-tree/issues
71
+ documentation_uri: https://rubydoc.info/gems/veb_tree
72
+ post_install_message:
73
+ rdoc_options: []
74
+ require_paths:
75
+ - lib
76
+ required_ruby_version: !ruby/object:Gem::Requirement
77
+ requirements:
78
+ - - ">="
79
+ - !ruby/object:Gem::Version
80
+ version: 2.7.0
81
+ required_rubygems_version: !ruby/object:Gem::Requirement
82
+ requirements:
83
+ - - ">="
84
+ - !ruby/object:Gem::Version
85
+ version: '0'
86
+ requirements: []
87
+ rubygems_version: 3.0.9
88
+ signing_key:
89
+ specification_version: 4
90
+ summary: High-performance Van Emde Boas tree for integer sets with O(log log U) operations
91
+ test_files: []