d_heap 0.5.0 → 0.6.0
 checksums.yaml +4 -4
 data/.github/workflows/main.yml +2 -2
 data/.gitignore +1 -0
 data/.rubocop.yml +1 -1
 data/.yardopts +10 -0
 data/CHANGELOG.md +19 -6
 data/Gemfile +4 -0
 data/Gemfile.lock +10 -1
 data/N +7 -0
 data/README.md +185 -231
 data/benchmarks/push_n.yml +10 -6
 data/benchmarks/push_n_pop_n.yml +27 -10
 data/benchmarks/push_pop.yml +5 -0
 data/bin/bench_charts +13 -0
 data/d_heap.gemspec +1 -1
 data/ext/d_heap/d_heap.c +435 -140
 data/ext/d_heap/extconf.rb +3 -4
 data/images/push_n.png +0 -0
 data/images/push_n_pop_n.png +0 -0
 data/images/push_pop.png +0 -0
 data/images/wikipedia-min-heap.png +0 -0
 data/lib/benchmark_driver/runner/ips_zero_fail.rb +89 -51
 data/lib/d_heap.rb +81 -18
 data/lib/d_heap/benchmarks/implementations.rb +30 -28
 data/lib/d_heap/benchmarks/rspec_matchers.rb +29 -51
 data/lib/d_heap/version.rb +1 -1
 metadata +10 -4
 data/ext/d_heap/d_heap.h +0 -50
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: …
-  data.tar.gz: …
+  metadata.gz: 35213e5ac430b07cf2b43a7f065ff5c409506022835a5326cb2bfa25daa7f210
+  data.tar.gz: e87a64fb9fd6eb8bdd281d8fe289b7f4993f4bde9a671b0b414aca194b724691
 SHA512:
-  metadata.gz: …
-  data.tar.gz: …
+  metadata.gz: 77518eb11bf8dd5fa8a29ad88f48c30650ff375ae9b74001fa59d30f634feddf5c4f3ef8d791dc4064afc1d9d68b9b463eaf0e778e927655e0da0c0a9da6fdee
+  data.tar.gz: 2911b20a882d8b6f577bda9a388a9a755a093cfd3c1aaaaa355aa5be97ffe506935620043adb115158565ead6941e52f789cdd714e4932aff4916412c60a3aee
data/.github/workflows/main.yml
CHANGED
@@ -1,4 +1,4 @@
-name: …
+name: CI
 
 on: [push, pull_request]
 
@@ -7,7 +7,7 @@ jobs:
     strategy:
       fail-fast: false
      matrix:
-        ruby: [2.5, 2.6, 2.7, 3.0]
+        ruby: [2.4, 2.5, 2.6, 2.7, 3.0]
        os: [ubuntu, macos]
        experimental: [false]
    runs-on: ${{ matrix.os }}-latest
data/.gitignore
CHANGED
data/.rubocop.yml
CHANGED
data/.yardopts
ADDED
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,17 @@
 ## Current/Unreleased
 
+## Release v0.6.0 (2021-01-24)
+
+* 🔥 **Breaking**: `#initialize` uses a keyword argument for `d`
+* ✨ Added `#initialize(capacity: capa)` to set initial capacity.
+* ✨ Added `peek_with_score` and `peek_score`
+* ✨ Added `pop_with_score` and `each_pop(with_score: true)`
+* ✨ Added `pop_all_below(max_score, array = [])`
+* ✨ Added aliases for `shift` and `next`
+* 📈 Added benchmark charts to README, and `bin/bench_charts` to generate them.
+  * requires `gruff` which requires `rmagick` which requires `imagemagick`
+* 📝 Many documentation updates and fixes.
+
 ## Release v0.5.0 (2021-01-17)
 
 * 🔥 **Breaking**: reversed order of `#push` arguments to `value, score`.
@@ -8,19 +20,20 @@
 * ✨ Added aliases for `deq`, `enq`, `first`, `pop_below`, `length`, and
   `count`, to mimic other classes in ruby's stdlib.
 * ⚡️♻️ More performance improvements:
-…
+  * Created an `ENTRY` struct and store both the score and the value pointer in
+    the same `ENTRY *entries` array.
+  * Reduced unnecessary allocations or copies in both sift loops. A similar
+    refactoring also sped up the pure ruby benchmark implementation.
+  * Compiling with `-O3`.
 * 📝 Updated (and in some cases, fixed) yardoc
 * ♻️ Moved aliases and less performance sensitive code into ruby.
 * ♻️ DRY up push/insert methods
 
 ## Release v0.4.0 (2021-01-12)
 
+* 🔥 **Breaking**: Scores must be `Integer` or convertable to `Float`
+  * ⚠️ `Integer` scores must fit in `-ULONG_LONG_MAX` to `+ULONG_LONG_MAX`.
 * ⚡️ Big performance improvements, by using C `long double *cscores` array
-* ⚡️ Scores must be `Integer` in `-uint64..+uint64`, or convertable to `Float`
 * ⚡️ many many (so many) updates to benchmarks
 * ✨ Added `DHeap#clear`
 * 🐛 Fixed `DHeap#initialize_copy` and `#freeze`

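The score-returning methods listed in the v0.6.0 changelog entry above can be illustrated with a small pure-Ruby model. `SimpleHeap` is a hypothetical stand-in (a sorted array, not the gem's C d-ary heap) written only to sketch the expected semantics of `peek_with_score`, `pop_with_score`, and `pop_all_below`; the method names come from the changelog, everything else is an assumption.

```ruby
# A minimal pure-Ruby stand-in (NOT the d_heap C extension) sketching the
# semantics of the score-returning methods added in v0.6.0.
class SimpleHeap
  def initialize
    @entries = [] # kept sorted by score; a real DHeap uses a d-ary heap
  end

  def push(value, score = value)
    score = Float(score) unless score.is_a?(Integer)
    # insert before the first entry with a larger score, keeping order
    index = @entries.bsearch_index { |(_, s)| score < s } || @entries.size
    @entries.insert(index, [value, score])
    self
  end

  def peek_with_score
    @entries.first # [value, score], or nil when empty
  end

  def peek_score
    @entries.first&.last
  end

  def pop_with_score
    @entries.shift
  end

  def pop
    @entries.shift&.first
  end

  # pops every entry whose score is below max_score, appending to array
  def pop_all_below(max_score, array = [])
    array << pop while peek_score && peek_score < max_score
    array
  end
end

heap = SimpleHeap.new
heap.push(:a, 3).push(:b, 1).push(:c, 2)
heap.peek_score       # => 1
heap.pop_with_score   # => [:b, 1]
heap.pop_all_below(3) # => [:c]
```

The real gem presumably implements these in C over its `ENTRY *entries` array; this sketch only fixes the observable behavior.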
data/Gemfile
CHANGED
data/Gemfile.lock
CHANGED
@@ -1,15 +1,22 @@
 PATH
   remote: .
   specs:
-    d_heap (0.5.0)
+    d_heap (0.6.0)
 
 GEM
   remote: https://rubygems.org/
   specs:
     ast (2.4.1)
     benchmark_driver (0.15.16)
+    benchmark_driver-output-gruff (0.3.1)
+      benchmark_driver (>= 0.12.0)
+      gruff
     coderay (1.1.3)
     diff-lcs (1.4.4)
+    gruff (0.12.1)
+      histogram
+      rmagick
+    histogram (0.2.4.1)
     method_source (1.0.0)
     parallel (1.19.2)
     parser (2.7.2.0)
@@ -25,6 +32,7 @@ GEM
     rake
     regexp_parser (1.8.2)
     rexml (3.2.3)
+    rmagick (4.1.2)
     rspec (3.10.0)
       rspec-core (~> 3.10.0)
       rspec-expectations (~> 3.10.0)
@@ -59,6 +67,7 @@ PLATFORMS
 
 DEPENDENCIES
   benchmark_driver
+  benchmark_driver-output-gruff
   d_heap!
   perf
   priority_queue_cxx
data/N
ADDED
data/README.md
CHANGED
@@ -1,8 +1,21 @@
-# DHeap
+# DHeap - Fast d-ary heap for ruby
+
+[![Gem Version](https://badge.fury.io/rb/d_heap.svg)](https://badge.fury.io/rb/d_heap)
+[![Build Status](https://github.com/nevans/d_heap/workflows/CI/badge.svg)](https://github.com/nevans/d_heap/actions?query=workflow%3ACI)
+[![Maintainability](https://api.codeclimate.com/v1/badges/ff274acd0683c99c03e1/maintainability)](https://codeclimate.com/github/nevans/d_heap/maintainability)
 
 A fast [_d_-ary heap][d-ary heap] [priority queue] implementation for ruby,
 implemented as a C extension.
 
+From [wikipedia](https://en.wikipedia.org/wiki/Heap_(data_structure)):
+
+> A heap is a specialized tree-based data structure which is essentially an
+> almost complete tree that satisfies the heap property: in a min heap, for any
+> given node C, if P is a parent node of C, then the key (the value) of P is
+> less than or equal to the key of C. The node at the "top" of the heap (with no
+> parents) is called the root node.
+
+![tree representation of a min heap](images/wikipedia-min-heap.png)
+
 With a regular queue, you expect "FIFO" behavior: first in, first out. With a
 stack you expect "LIFO": last in first out. A priority queue has a score for
 each element and elements are popped in order by score. Priority queues are
@@ -13,14 +26,16 @@ management, for [Huffman coding], and various graph search algorithms such as
 The _d_-ary heap data structure is a generalization of the [binary heap], in
 which the nodes have _d_ children instead of 2. This allows for "insert" and
 "decrease priority" operations to be performed more quickly with the tradeoff of
-slower delete minimum. Additionally, _d_-ary heaps can…
-…behavior than binary heaps, allowing them to run more…
-…despite slower worst-case time complexity. In the worst…
-…requires only `O(log n / log d)` operations to push, with…
-…requires `O(d log n / log d)`.
+slower delete minimum or "increase priority". Additionally, _d_-ary heaps can
+have better memory cache behavior than binary heaps, allowing them to run more
+quickly in practice despite slower worst-case time complexity. In the worst
+case, a _d_-ary heap requires only `O(log n / log d)` operations to push, with
+the tradeoff that pop requires `O(d log n / log d)`.
 
 Although you should probably just use the default _d_ value of `4` (see the
-analysis below), it's always advisable to benchmark your specific use-case.
+analysis below), it's always advisable to benchmark your specific use-case. In
+particular, if you push items more than you pop, higher values for _d_ can give
+a faster total runtime.
 
 [d-ary heap]: https://en.wikipedia.org/wiki/D-ary_heap
 [priority queue]: https://en.wikipedia.org/wiki/Priority_queue
@@ -33,26 +48,43 @@ analysis below), it's always advisable to benchmark your specific use-case.
 
 ## Usage
 
-…
+The basic API is `#push(object, score)` and `#pop`. Please read the
+[gem documentation] for more details and other methods.
+
+Quick reference for some common methods:
 
 * `heap << object` adds a value, with `Float(object)` as its score.
 * `heap.push(object, score)` adds a value with an extrinsic score.
 * `heap.pop` removes and returns the value with the minimum score.
-* `heap.pop_lte(…
+* `heap.pop_lte(max_score)` pops only if the next score is `<=` the argument.
 * `heap.peek` to view the minimum value without popping it.
 * `heap.clear` to remove all items from the heap.
 * `heap.empty?` returns true if the heap is empty.
 * `heap.size` returns the number of items in the heap.
 
-…object is still in the heap, it will not be reevaluated again. The score must
-either be `Integer` or `Float` or convertable to a `Float` via `Float(score)`
-(i.e. it should implement `#to_f`).
+If the score changes while the object is still in the heap, it will not be
+reevaluated again.
 
+The score must either be `Integer` or `Float` or convertable to a `Float` via
+`Float(score)` (i.e. it should implement `#to_f`). Constraining scores to
+numeric values gives more than 50% speedup under some benchmarks! _n.b._
+`Integer` _scores must have an absolute value that fits into_ `unsigned long
+long`. This is compiler and architecture dependent, but with gcc on an IA-64
+system it's 64 bits, which gives a range of -18,446,744,073,709,551,615 to
++18,446,744,073,709,551,615, which is more than enough to store e.g. POSIX time
+in nanoseconds.
+
+_Comparing arbitrary objects via_ `a <=> b` _was the original design and may be
+added back in a future version,_ if (and only if) _it can be done without
+impacting the speed of numeric comparisons. The speedup from this constraint is
+huge!_
+
+[gem documentation]: https://rubydoc.info/gems/d_heap/DHeap
+
+### Examples
+
+```ruby
+# create some example objects to place in our heap
 Task = Struct.new(:id, :time) do
   def to_f; time.to_f end
 end
@@ -61,72 +93,42 @@ t2 = Task.new(2, Time.now + 50)
 t3 = Task.new(3, Time.now + 60)
 t4 = Task.new(4, Time.now + 5)
 
-# …
+# create the heap
+require "d_heap"
+heap = DHeap.new
+
+# push with an explicit score (which might be extrinsic to the value)
+heap.push t1, t1.to_f
+
+# the score will be implicitly cast with Float, so any object with #to_f
+heap.push t2, t2
 
-# …
-heap…
-heap.push t4, t4 # score can be implicitly cast with Float
+# if the object has an intrinsic score via #to_f, "<<" is the simplest API
+heap << t3 << t4
 
-# …
+# pop returns the lowest scored item, and removes it from the heap
 heap.pop  # => #<struct Task id=4, time=2021-01-17 17:02:22.5574 -0500>
 heap.pop  # => #<struct Task id=2, time=2021-01-17 17:03:07.5574 -0500>
+
+# peek returns the lowest scored item, without removing it from the heap
 heap.peek # => #<struct Task id=3, time=2021-01-17 17:03:17.5574 -0500>
 heap.pop  # => #<struct Task id=3, time=2021-01-17 17:03:17.5574 -0500>
-heap.pop  # => #<struct Task id=1, time=2021-01-17 17:07:17.5574 -0500>
-heap.empty? # => true
-heap.pop # => nil
-```
 
-_…into_ `unsigned long long`. _This is architecture dependent but on an IA-64
-system this is 64 bits, which gives a range of -18,446,744,073,709,551,615 to
-+18,446,744,073,709,551,615. Comparing arbitrary objects via_ `a <=> b` _was the
-original design and may be added back in a future version,_ if (and only if) _it
-can be done without impacting the speed of numeric comparisons._
+# pop_lte handles the common "h.pop if h.peek_score < max" pattern
+heap.pop_lte(Time.now + 65) # => nil
 
-heap.…
-
-# …
-# "a <=> b" is *much* slower than comparing numbers, so it isn't used.
-class Event
-  include Comparable
-  attr_reader :time, :payload
-  alias_method :to_time, :time
-
-  def initialize(time, payload)
-    @time = time.to_time
-    @payload = payload
-    freeze
-  end
-
-  def to_f
-    time.to_f
-  end
-
-  def <=>(other)
-    to_f <=> other.to_f
-  end
-end
-
-heap << comparable_max # sorts last, using <=>
-heap << comparable_min # sorts first, using <=>
-heap << comparable_mid # sorts in the middle, using <=>
-heap.pop # => comparable_min
-heap.pop # => comparable_mid
-heap.pop # => comparable_max
+# the heap size can be inspected with size and empty?
+heap.empty? # => false
+heap.size # => 1
+heap.pop # => #<struct Task id=1, time=2021-01-17 17:07:17.5574 -0500>
 heap.empty? # => true
+heap.size # => 0
+
+# popping from an empty heap returns nil
 heap.pop # => nil
 ```
 
-…score is less than or equal to `max`.
-
-Read the [API documentation] for more detailed documentation and examples.
-
-[API documentation]: https://rubydoc.info/gems/d_heap/DHeap
+Please see the [gem documentation] for more methods and more examples.
 
 ## Installation
 
@@ -153,104 +155,74 @@ for insert is `O(n)` because it may need to `memcpy` a significant portion of
 the array.
 
 The standard way to implement a priority queue is with a binary heap. Although
-this increases the time for `pop`…
-…
-heap implementation was much slower than inserting into and popping from a fully
-sorted array. The reasons for this surprising result: Although it is `O(n)`,
-`memcpy` has a _very_ small constant factor, and calling `<=>` from ruby code
-has relatively _much_ larger constant factors. If your queue contains only a
-few thousand items, the overhead of those extra calls to `<=>` is _far_ more
-than occasionally calling `memcpy`. In the worst case, a _d_-heap will require
-`d + 1` times more comparisons for each push + pop than a `bsearch` + `insert`
-sorted array.
-
-Moving the sift-up and sift-down code into C helps some. But much more helpful
-is optimizing the comparison of numeric scores, so `a <=> b` never needs to be
-called. I'm hopeful that MJIT will eventually obsolete this C-extension. This
-can be hot-spot code, and a basic ruby implementation could perform well if
-`<=>` had much lower overhead.
+this increases the time complexity for `pop` alone, it reduces the combined time
+complexity for `push` + `pop`. Using a d-ary heap with d > 2
+makes the tree shorter but broader, which reduces push to `O(log n / log d)`
+while increasing the comparisons needed by sift-down to `O(d log n / log d)`.
 
+However, I was disappointed when my best ruby heap implementation ran much more
+slowly than the naive approach, even for heaps containing ten thousand items.
+Although it _is_ `O(n)`, `memcpy` is _very_ fast, while calling `<=>` from ruby
+has _much_ higher overhead. And a _d_-heap needs `d + 1` times more comparisons
+for each push + pop than `bsearch` + `insert`.
 
+Additionally, when researching how other systems handle their scheduling, I was
+inspired by reading go's "timer.go" implementation to experiment with a 4-ary
+heap instead of the traditional binary heap.
 
-(used by pop).
+## Benchmarks
 
-So pushing a new element is `O(log n / log d)`.
-* Swap down performs as many as d comparisons per swap: `O(d)`.
-  So popping the min element is `O(d log n / log d)`.
-
-Assuming every inserted element is eventually deleted from the root, d=4
-requires the fewest comparisons for combined insert and delete:
-
-* (1 + 2) / lg 2 = 4.328085
-* (1 + 3) / lg 3 = 3.640957
-* (1 + 4) / lg 4 = 3.606738
-* (1 + 5) / lg 5 = 3.728010
-* (1 + 6) / lg 6 = 3.906774
-* etc...
+_See `bin/benchmarks` and `docs/benchmarks.txt`, as well as `bin/profile` and
+`docs/profile.txt` for much more detail or updated results. These benchmarks
+were measured with v0.5.0 and ruby 2.7.2 without MJIT enabled._
 
+These benchmarks use very simple implementations for a pure-ruby heap and an
+array that is kept sorted using `Array#bsearch_index` and `Array#insert`. For
+comparison, I also compare to the [priority_queue_cxx gem] which uses the [C++
+STL priority_queue], and another naive implementation that uses `Array#min` and
+`Array#delete_at` with an unsorted array.
 
+In these benchmarks, `DHeap` runs faster than all other implementations for
+every scenario and every value of N, although the difference is usually more
+noticable at higher values of N. The pure ruby heap implementation is
+competitive for `push` alone at every value of N, but is significantly slower
+than bsearch + insert for push + pop, until N is _very_ large (somewhere between
+10k and 100k)!
 
+[priority_queue_cxx gem]: https://rubygems.org/gems/priority_queue_cxx
+[C++ STL priority_queue]: http://www.cplusplus.com/reference/queue/priority_queue/
 
+Three different scenarios are measured:
 
-…provide better cache locality. Because the heap is a complete binary tree, the
-elements can be stored in an array, without the need for tree or list pointers.
+### push N items onto an empty heap
 
-…those objects simply delegate comparison to internal Numeric values. And it is
-often useful to use external scores for otherwise uncomparable values. So
-`DHeap` uses twice as many entries (one for score and one for value)
-as an array which only stores values.
+...but never pop (clearing between each set of pushes).
 
+![bar graph for push_n benchmarks](./images/push_n.png)
 
-…`docs/profile.txt` for more details or updated results. These benchmarks were
-measured with v0.5.0 and ruby 2.7.2 without MJIT enabled._
+### push N items onto an empty heap then pop all N
 
-…also shown.
+Although this could be used for heap sort, we're unlikely to choose heap sort
+over Ruby's quick sort implementation. I'm using this scenario to represent
+the amortized cost of creating a heap and (eventually) draining it.
 
-* push N values but never pop (clearing between each set of pushes).
-* push N values and then pop N values.
-  Although this could be used for heap sort, we're unlikely to choose heap sort
-  over Ruby's quick sort implementation. I'm using this scenario to represent
-  the amortized cost of creating a heap and (eventually) draining it.
-* For a heap of size N, repeatedly push and pop while keeping a stable size.
-  This is a _very simple_ approximation for how most scheduler/timer heaps
-  would be used. Usually when a timer fires it will be quickly replaced by a
-  new timer, and the overall count of timers will remain roughly stable.
+![bar graph for push_n_pop_n benchmarks](./images/push_n_pop_n.png)
 
+### push and pop on a heap with N values
+
+Repeatedly push and pop while keeping a stable heap size. This is a _very
+simplistic_ approximation for how most scheduler/timer heaps might be used.
+Usually when a timer fires it will be quickly replaced by a new timer, and the
+overall count of timers will remain roughly stable.
+
+![bar graph for push_pop benchmarks](./images/push_pop.png)
 
+### numbers
+
+Even for very small values of N, `DHeap` runs faster than the other
+implementations for each scenario, although the difference is still relatively
+small. The pure ruby binary heap is 2x or more slower than bsearch + insert for
+the common push/pop scenario.
 
 == push N (N=5) ==========================================================
 push N (c_dheap): 1969700.7 i/s
@@ -341,78 +313,68 @@ the linear time complexity to keep a sorted array dominates.
 queue size = 5000000:   2664897.7 i/s - 2.74x slower
 queue size = 10000000:  2137927.6 i/s - 3.42x slower
 
-## …
-…
-Additionally, when used to sort timers, we can reasonably assume that:
-* New timers usually sort after most existing timers.
-* Most timers will be canceled before executing.
-* Canceled timers usually sort after most existing timers.
-
-So, if we are able to delete an item without searching for it, by keeping a map
-of positions within the heap, most timers can be inserted and deleted in O(1)
-time. Canceling a non-leaf timer can be further optimized by marking it as
-canceled without immediately removing it from the heap. If the timer is
-rescheduled before we garbage collect, adjusting its position will usually be
-faster than a delete and re-insert.
+## Analysis
+
+### Time complexity
+
+There are two fundamental heap operations: sift-up (used by push) and sift-down
+(used by pop).
+
+* A _d_-ary heap will have `log n / log d` layers, so both sift operations can
+  perform as many as `log n / log d` writes, when a member sifts the entire
+  length of the tree.
+* Sift-up makes one comparison per layer, so push runs in `O(log n / log d)`.
+* Sift-down makes d comparisons per layer, so pop runs in `O(d log n / log d)`.
+
+So, in the simplest case of running balanced push/pop while maintaining the same
+heap size, `(1 + d) log n / log d` comparisons are made. In the worst case,
+when every sift traverses every layer of the tree, `d=4` requires the fewest
+comparisons for combined insert and delete:
+
+* (1 + 2) lg n / lg d ≈ 4.328085 lg n
+* (1 + 3) lg n / lg d ≈ 3.640957 lg n
+* (1 + 4) lg n / lg d ≈ 3.606738 lg n
+* (1 + 5) lg n / lg d ≈ 3.728010 lg n
+* (1 + 6) lg n / lg d ≈ 3.906774 lg n
+* (1 + 7) lg n / lg d ≈ 4.111187 lg n
+* (1 + 8) lg n / lg d ≈ 4.328085 lg n
+* (1 + 9) lg n / lg d ≈ 4.551196 lg n
+* (1 + 10) lg n / lg d ≈ 4.777239 lg n
+* etc...
+
+See https://en.wikipedia.org/wiki/D-ary_heap#Analysis for deeper analysis.
+
+### Space complexity
+
+Space usage is linear, regardless of d. However higher d values may
+provide better cache locality. Because the heap is a complete tree, the
+elements can be stored in an array, without the need for tree or list pointers.
+
+Ruby can compare Numeric values _much_ faster than other ruby objects, even if
+those objects simply delegate comparison to internal Numeric values. And it is
+often useful to use external scores for otherwise uncomparable values. So
+`DHeap` uses twice as many entries (one for score and one for value)
+as an array which only stores values.
+
+## Thread safety
+
+`DHeap` is _not_ thread-safe, so concurrent access from multiple threads needs
+to take precautions such as locking access behind a mutex.
 
 ## Alternative data structures
 
 As always, you should run benchmarks with your expected scenarios to determine
-which is…
+which is best for your application.
 
-Depending on…
-…
+Depending on your use-case, maintaining a sorted `Array` using `#bsearch_index`
+and `#insert` might be just fine! Even `min` plus `delete` with an unsorted
+array can be very fast on small queues. Although insertions run with `O(n)`,
+`memcpy` is so fast on modern hardware that your dataset might not be large
+enough for it to matter.
 
-More complex heap variants, e.g. [Fibonacci heap],…
+More complex heap variants, e.g. [Fibonacci heap], allow heaps to be split and
+merged which gives some graph algorithms a lower amortized time complexity. But
+in practice, _d_-ary heaps have much lower overhead and often run faster.
 
 [Fibonacci heap]: https://en.wikipedia.org/wiki/Fibonacci_heap

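The worst-case comparison coefficients tabulated in the analysis section can be checked numerically. A combined push + pop costs `(1 + d) log n / log d` comparisons; expressed per `ln n`, the coefficient is `(1 + d) / ln d`, which reproduces the listed constants and confirms that `d = 4` minimizes the combined cost. A minimal sketch (the table values are from the README; the script itself is illustrative):

```ruby
# Reproduce the worst-case comparison-count coefficients from the analysis:
# combined push + pop costs (1 + d) * log(n) / log(d) comparisons,
# i.e. a coefficient of (1 + d) / ln(d) per ln(n).
costs = (2..10).to_h { |d| [d, (1 + d) / Math.log(d)] }

costs.each do |d, c|
  puts format("d=%2d  (1 + %d) / ln %d ≈ %.6f", d, d, d, c)
end

# d = 4 gives the smallest coefficient (≈ 3.606738)
best_d = costs.min_by { |_, c| c }.first
best_d # => 4
```

Note that d = 2 and d = 8 tie (both ≈ 4.328085), matching the table above.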
@@ -432,25 +394,17 @@ complex than a heap, it may be necessary for enormous values of N.
 
 ## TODOs...
 
-_TODO:_ Also ~~included is~~ _will include_ `DHeap::…
-basic heap with an internal `Hash`, which maps…
+_TODO:_ Also ~~included is~~ _will include_ `DHeap::Map`, which augments the
+basic heap with an internal `Hash`, which maps objects to their position in the
+heap. This enforces a uniqueness constraint on items on the heap, and also
+allows items to be more efficiently deleted or adjusted. However maintaining
+the hash does lead to a small drop in normal `#push` and `#pop` performance.
 
 _TODO:_ Also ~~included is~~ _will include_ `DHeap::Lazy`, which contains some
 features that are loosely inspired by go's timers. e.g.: It lazily sifts its
 heap after deletion and adjustments, to achieve faster average runtime for *add*
 and *cancel* operations.
 
-Additionally, I was inspired by reading go's "timer.go" implementation to
-experiment with a 4-ary heap instead of the traditional binary heap. In the
-case of timers, new timers are usually scheduled to run after most of the
-existing timers. And timers are usually canceled before they have a chance to
-run. While a binary heap holds 50% of its elements in its last layer, 75% of a
-4-ary heap will have no children. That diminishes the extra comparison overhead
-during sift-down.
-
 ## Development
 
 After checking out the repo, run `bin/setup` to install dependencies. Then, run