sorted_containers 0.1.0 → 1.0.0
- checksums.yaml +4 -4
- data/.rubocop.yml +3 -0
- data/CHANGELOG.md +5 -1
- data/README.md +31 -17
- data/lib/sorted_containers/core_extensions.rb +56 -0
- data/lib/sorted_containers/sorted_array.rb +987 -208
- data/lib/sorted_containers/sorted_hash.rb +461 -53
- data/lib/sorted_containers/sorted_set.rb +310 -71
- data/lib/sorted_containers/version.rb +1 -1
- data/lib/sorted_containers.rb +1 -0
- metadata +4 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 3402cc1b71fc79ef75e11c571123f82a2e79063688413922f8635dffaed9e179
+  data.tar.gz: 7a07b07ca3062d60cf0a92a60e1b3299629aa3d2ae3aaac46d81e8f9f8118d18
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 441f144945782fa0100b2814ad0282efce28d948058b1e5219a24674b1c05591e8adf93642f3ad1fa4c5d31d12e7961102219c52943507b37aae2c827d8488fd
+  data.tar.gz: 97021f40b0c465854c087772739d731d32dc22029f2387532c8878f681327f7403c61505a844bab3b6acc2a698947dca3cb3bc185116a611341375a4154d8f8b
data/.rubocop.yml
CHANGED
data/CHANGELOG.md
CHANGED
data/README.md
CHANGED
@@ -1,42 +1,56 @@
 # SortedContainers
 
-
+[Documentation](https://www.rubydoc.info/gems/sorted_containers/0.1.1)
+
+[![Gem Version](https://badge.fury.io/rb/sorted_containers.svg)](https://badge.fury.io/rb/sorted_containers)
+
+SortedContainers is a fast implementation of sorted arrays, sets, and hashes in pure Ruby. It is based on the [sortedcontainers](https://grantjenks.com/docs/sortedcontainers/) Python library by Grant Jenks.
 
 SortedContainers provides three main classes: `SortedArray`, `SortedSet`, and `SortedHash`. Each class is a drop-in replacement for the corresponding Ruby class, but with the added benefit of maintaining the elements in sorted order.
 
-SortedContainers exploits the fact that modern computers are
+SortedContainers exploits the fact that modern computers are good at shifting arrays in memory. We sacrifice theoretical time complexity for practical performance. In practice, SortedContainers is fast.
 
 ## How it works
 
-
+Modern computers are good at shifting arrays. For that reason, it's often faster to keep an array sorted than to use the usual tree-based data structures.
+
+For example, if you have the array `[1,2,4,5]` and want to insert the element `3`, you can shift `4, 5` to the right and insert `3` in the correct position. This is a `O(n)` operation, but in practice it's fast.
 
-
+You also save memory by not having to store pointers to children nodes, and you benefit from the cache locality of arrays. When you iterate over a sorted array, you are more likely to access elements that are close together in memory.
 
-But we can do better if we have a lot of elements. We can break up the array
+But we can do better if we have a lot of elements. We can break up the array so fewer elements have to be moved when a new element is inserted. For example, if you have the array `[[1,2,4],[5,6,7]]` and you want to insert the element `3`, you can insert `3` into the first array to get `[[1,2,3,4],[5,6,7]]` and only the element `4` has to be shifted.
 
-This often outperforms the more common tree-based data structures like red-black trees with `O(log n)` insertions and
+This often outperforms the more common tree-based data structures like red-black trees with `O(log n)` insertions, deletions, and lookups. We sacrifice theoretical time complexity for practical performance.
 
-
+The size of the subarrays is a trade-off. You can modify how big you want to subarrays by setting the `load_factor`. The default is set to `DEFAULT_LOAD_FACTOR = 1000`. The subarray is split when its size is `2*load_factor`. There is no perfect value. The ideal value will depend on your use case and may require some experimentation.
+
+SortedSet and SortedHash are implemented using a SortedArray to keep track of the order, and then also use a standard Set and Hash for quick lookups.
 
 ## Benchmarks
-
-Performance comparison against [SortedSet](https://github.com/knu/sorted_set) a C extension red-black tree implementation. Every test was run 5 times and the average was taken.
 
-
+[SortedSet](https://github.com/knu/sorted_set) is a C extension red-black tree implementation. It is the fastest Ruby implementation of a sorted set that I could find. I used it as a benchmark to compare the performance of SortedContainers.
+
+Every test was run 5 times and the average was taken.
 
-
+You can see that SortedContainers has comparable performance for add and delete, and much better performance for iteration, initialization, and include.
 
 - MacBook Pro (16-inch, 2019)
 - 2.6 GHz 6-Core Intel Core i7, 16 GB 2667 MHz DDR4
 - Ruby 3.2.2
 - SortedContainers 0.1.0
 - SortedSet 1.0.3
+
 ### Results (Lower is better)
-
-<img src="benchmark/
-
-<img src="benchmark/
-
+
+<img src="https://github.com/GarrisonJ/sorted_containers/blob/main/benchmark/initialize_performance_comparison.png?raw=true" width="50%">
+
+<img src="https://github.com/GarrisonJ/sorted_containers/blob/main/benchmark/add_performance_comparison.png?raw=true" width="50%">
+
+<img src="https://github.com/GarrisonJ/sorted_containers/blob/main/benchmark/delete_performance_comparison.png?raw=true" width="50%">
+
+<img src="https://github.com/GarrisonJ/sorted_containers/blob/main/benchmark/iteration_performance_comparison.png?raw=true" width="50%">
+
+<img src="https://github.com/GarrisonJ/sorted_containers/blob/main/benchmark/include_performance_comparison.png?raw=true" width="50%">
 
 ## Installation
 
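The "How it works" text in this hunk describes keeping several sorted sublists and splitting one when it grows too large. The following is a minimal sketch of that idea under stated assumptions, not the gem's actual implementation: `ChunkedSortedList`, `add`, and `chunks` are hypothetical names, and the split threshold (a sublist is halved once its size reaches `2 * load_factor`) is taken from the README wording.

```ruby
# Hypothetical sketch of the chunked sorted-array idea from the README.
# This is NOT the gem's SortedArray; names and details are assumptions.
class ChunkedSortedList
  def initialize(load_factor: 1000)
    @load_factor = load_factor
    @lists = [[]] # each sublist is sorted, and the sublists are in order
  end

  # Inserts value, shifting elements only inside one sublist.
  def add(value)
    # First sublist whose largest element is >= value, else the last one.
    idx = @lists.bsearch_index { |l| !l.empty? && l.last >= value } ||
          (@lists.size - 1)
    list = @lists[idx]
    # Binary search for the insertion point inside that sublist.
    pos = list.bsearch_index { |x| x >= value } || list.size
    list.insert(pos, value)
    split(idx) if list.size >= 2 * @load_factor
    self
  end

  # Flat, sorted view of all elements.
  def to_a
    @lists.flatten(1)
  end

  # Copy of the internal sublists, for inspection only.
  def chunks
    @lists.map(&:dup)
  end

  private

  # Replaces the sublist at idx with its two halves.
  def split(idx)
    list = @lists[idx]
    half = list.size / 2
    @lists[idx, 1] = [list[0...half], list[half..]]
  end
end

list = ChunkedSortedList.new(load_factor: 3)
[1, 2, 4, 5, 6, 7].each { |v| list.add(v) }
p list.chunks # => [[1, 2, 4], [5, 6, 7]]
list.add(3)
p list.chunks # => [[1, 2, 3, 4], [5, 6, 7]]
```

With a small `load_factor` the split is easy to observe. Inserting `3` moves only `4`, which matches the README's own example.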
@@ -53,7 +67,7 @@ bundle install
 ```
 
 Or install it yourself as:
-
+
 ```bash
 gem install sorted_containers
 ```
data/lib/sorted_containers/core_extensions.rb
ADDED
@@ -0,0 +1,56 @@
+# frozen_string_literal: true
+
+# Array class is being extended to include methods for converting
+# an Array to a SortedSet, SortedHash, and SortedArray.
+class Array
+  # Converts the array to a SortedSet.
+  #
+  # @param load_factor [Integer] The load factor for the SortedSet.
+  # @return [SortedContainers::SortedSet] The new SortedSet.
+  def to_sorted_set(load_factor: SortedContainers::SortedArray::DEFAULT_LOAD_FACTOR)
+    SortedContainers::SortedSet.new(self, load_factor: load_factor)
+  end
+
+  # Converts the array to a SortedHash.
+  #
+  # @param load_factor [Integer] The load factor for the SortedHash.
+  # @return [SortedContainers::SortedHash] The new SortedHash.
+  def to_sorted_h(load_factor: SortedContainers::SortedArray::DEFAULT_LOAD_FACTOR)
+    hash = SortedContainers::SortedHash.new(load_factor: load_factor)
+    hash.merge!(self)
+  end
+
+  # Converts the array to a SortedArray.
+  #
+  # @param load_factor [Integer] The load factor for the SortedArray.
+  # @return [SortedContainers::SortedArray] The new SortedArray.
+  def to_sorted_a(load_factor: SortedContainers::SortedArray::DEFAULT_LOAD_FACTOR)
+    SortedContainers::SortedArray.new(self, load_factor: load_factor)
+  end
+end
+
+# Hash class is being extended to include a method for converting
+# a Hash to a SortedHash.
+class Hash
+  # Converts the hash to a SortedHash.
+  #
+  # @param load_factor [Integer] The load factor for the SortedHash.
+  # @return [SortedContainers::SortedHash] The new SortedHash.
+  def to_sorted_h(load_factor: SortedContainers::SortedArray::DEFAULT_LOAD_FACTOR)
+    hash = SortedContainers::SortedHash.new(load_factor: load_factor)
+    hash.merge!(self)
+    hash
+  end
+end
+
+# Set class is being extended to include a method for converting
+# a Set to a SortedSet.
+class Set
+  # Converts the set to a SortedSet.
+  #
+  # @param load_factor [Integer] The load factor for the SortedSet.
+  # @return [SortedContainers::SortedSet] The new SortedSet.
+  def to_sorted_set(load_factor: SortedContainers::SortedArray::DEFAULT_LOAD_FACTOR)
+    SortedContainers::SortedSet.new(self, load_factor: load_factor)
+  end
+end
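The conversion helpers added in this file share one shape: a core class gains a method that builds the sorted container from `self` and forwards an optional `load_factor:` keyword whose default comes from a constant. The sketch below shows that shape in isolation, assuming a stand-in container so it runs without the gem. `Sketch::SortedList` and `to_sketch_sorted_list` are hypothetical names and are not part of sorted_containers.

```ruby
# Stand-in container so the example is self-contained.
# Hypothetical; the real gem provides SortedContainers::SortedArray.
module Sketch
  class SortedList
    DEFAULT_LOAD_FACTOR = 1000

    attr_reader :load_factor

    def initialize(items = [], load_factor: DEFAULT_LOAD_FACTOR)
      @load_factor = load_factor
      @items = items.sort
    end

    # Sorted copy of the stored elements.
    def to_a
      @items.dup
    end
  end
end

# Reopening a core class, mirroring the shape of Array#to_sorted_a
# in the diff. The method name here is deliberately different.
class Array
  def to_sketch_sorted_list(load_factor: Sketch::SortedList::DEFAULT_LOAD_FACTOR)
    Sketch::SortedList.new(self, load_factor: load_factor)
  end
end

default_list = [3, 1, 2].to_sketch_sorted_list
tuned_list = [3, 1, 2].to_sketch_sorted_list(load_factor: 2)
p default_list.to_a       # => [1, 2, 3]
p tuned_list.load_factor  # => 2
```

With the gem loaded, the diff suggests the corresponding call would be `[3, 1, 2].to_sorted_a(load_factor: 2)`.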