swiss_hash 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +7 -0
- data/LICENSE.txt +21 -0
- data/README.md +142 -0
- data/ext/swiss_hash/extconf.rb +5 -0
- data/ext/swiss_hash/swiss_hash.c +880 -0
- data/lib/swiss_hash/version.rb +5 -0
- data/lib/swiss_hash.rb +28 -0
- metadata +64 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA256:
+  metadata.gz: 9846e358fdd8020ef06ef05446b3684c86b658b2643a4301dab6fdd65c22aaa6
+  data.tar.gz: 66e6452775731dc9eca53cfaa33c63cd434499452bc5c30c1924ca1414e6f27b
+SHA512:
+  metadata.gz: a3b691649208e0eaada137990e286a319050cf5b64ed88802b94d2579020ddff519c77cb813e5d09e8b026c892581847131b65b246e431910ba61a3e6484eeef
+  data.tar.gz: ffc257a83083611a0dc70fe9232c78a0ead1ef226956b887a1132e27e88d037d420977918c534f179a6598c0c2e55fadddc2340fb8c925a5ef0621d4c5798402
data/LICENSE.txt
ADDED
@@ -0,0 +1,21 @@
+The MIT License (MIT)
+
+Copyright (c) 2025 Roman Haidarov
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,142 @@
+# SwissHash
+
+Swiss Table hash map implementation as a Ruby C extension. Based on Go 1.24 Swiss Table design with SIMD support (SSE2/NEON).
+
+## Installation
+
+```
+gem install swiss_hash
+```
+
+## Usage
+
+```ruby
+require "swiss_hash"
+
+h = SwissHash::Hash.new
+h["key"] = "value"
+h["key"] # => "value"
+h.delete("key")
+h.stats # => { capacity: 16, size: 0, ... }
+```
+
+## Performance Results
+
+Benchmarks on Ruby 3.1.7 / arm64-darwin24 with NEON SIMD support (median of 3 runs):
+
+### Insert Performance
+- **String keys**: SwissHash is **18.9-28.6% faster** across all dataset sizes
+- **Sequential integers**: 24.6% slower (1k), 3.7% slower (10k), **11.5% faster** (100k)
+- **Random integers**: 11.6% slower (1k), 4.7% slower (10k), **6.3% faster** (100k)
+
+### Lookup Performance
+- **Sequential integers**: SwissHash is **7.3-10.6% slower** across all sizes
+- **String keys**: SwissHash is **13.0-25.2% slower** across all sizes
+
+### Mixed Workloads
+- **Delete+reinsert (25%)**: 9.1% slower (1k), 2.5% slower (10k), **13.5% faster** (100k)
+- **Mixed operations (70% read, 20% write, 10% delete)**: 2.0% slower (1k), 0.5% faster (10k), **11.6% faster** (100k)
+
+### Memory Usage
+For 100,000 elements:
+- **SwissHash**: 2,176 KB native memory + 4 GC slots
+- **Ruby Hash**: 3 GC slots (native memory not directly measurable)
+- **Load factor**: 76.3% (efficient memory usage)
+- **GC pressure**: Equal (0 GC runs during insertion)
+
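The capacity and memory figures above follow directly from the table geometry used by the C extension: 8-slot groups, a power-of-two group count, a 7/8 maximum load factor, and one control byte plus a two-`VALUE` slot per bucket. A small sketch reproduces the 100k-element numbers, assuming 8-byte `VALUE`s:

```ruby
GROUP_SIZE     = 8
MAX_LOAD       = 7.0 / 8.0  # 87.5% max load factor
BYTES_PER_SLOT = 1 + 2 * 8  # ctrl byte + {key, value} pair of 8-byte VALUEs

# Smallest power-of-two group count whose capacity can hold `n`
# entries without exceeding the 7/8 load limit.
def capacity_for(n)
  groups = 1
  groups *= 2 while groups * GROUP_SIZE * MAX_LOAD < n
  groups * GROUP_SIZE
end

cap  = capacity_for(100_000)            # => 131072
load = 100_000.0 / cap                  # => ~0.763 (the 76.3% above)
kb   = cap * BYTES_PER_SLOT / 1024      # => 2176 KB native memory
```

The 2,176 KB figure matches `capacity * (sizeof(uint8_t) + sizeof(Slot))` as reported by the extension's `memsize` callback.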
+## Features
+
+- **SIMD optimized**: Uses SSE2 on x86_64 and NEON on ARM64
+- **Memory efficient**: Swiss Table layout with 87.5% max load factor
+- **Tombstone compaction**: Automatic cleanup of deleted entries
+- **Ruby compatibility**: Supports frozen string keys and all Ruby object types
+- **Reentrancy guard**: Prevents reentrant modification of the table during `#hash`/`#eql?` callbacks (note: this is not cross-thread synchronization)
+
+## API
+
+```ruby
+hash = SwissHash::Hash.new(capacity = 16)
+
+# Basic operations
+hash[key] = value
+hash[key]        # get
+hash.delete(key) # delete, returns old value or nil
+
+# Enumeration
+hash.each { |k, v| ... }
+hash.keys
+hash.values
+
+# Size and status
+hash.size
+hash.empty?
+hash.key?(key)
+
+# Maintenance
+hash.clear
+hash.compact! # remove tombstones
+
+# Debugging
+hash.stats # => detailed statistics hash
+```
+
+## Usage Recommendations
+
+SwissHash shows clear advantages in specific scenarios:
+- **Heavy string-key insertion** into hash tables of any size (consistent 19-29% improvement)
+- **Large dataset operations** (>100k elements), where it improves on most operations
+- **Write-heavy workloads** with frequent delete-and-reinsert patterns
+- **Cases where predictable memory usage is important**
+
+⚠️ **Important**: SwissHash is consistently **7-25% slower for all lookup operations**. For read-intensive applications, stick with the standard Ruby Hash.
+
+## Technical Analysis: Why Ruby Hash is Faster for Lookups
+
+The performance difference in lookup operations comes from several architectural factors:
+
+### Ruby VM Optimizations
+- **Specialized fast paths**: The Ruby VM has highly optimized inline implementations for common key types (Fixnum, Symbol, String)
+- **Method caching**: VM-level inline caching for hash access patterns reduces method dispatch overhead
+- **Bytecode optimizations**: Hash access is optimized at the bytecode level with specialized opcodes
+- **Memory locality**: Ruby Hash uses a simpler layout optimized for CPU cache performance
+
+### SwissHash Overhead
+- **C extension boundaries**: Each lookup requires a `TypedData_Get_Struct()` call and Ruby-C interface overhead
+- **Complex hash computation**: SwissHash uses sophisticated hashing (wyhash for strings, custom mixers) versus Ruby's simpler approach
+- **Two-level lookup**: The Swiss Table's H1/H2 split requires additional bit manipulation and SIMD operations
+- **SIMD overhead**: NEON/SSE2 operations have setup costs that don't pay off for small group scans
+- **Additional equality checks**: The `keys_equal()` function adds indirection compared to the VM's direct comparison
+
+### Architecture Trade-offs
+- **Swiss Table design**: Optimized for high load factors and cache efficiency, but adds complexity to the critical lookup path
+- **Group-based probing**: While theoretically faster, the overhead of SIMD operations and multiple memory accesses hurts performance for typical Ruby workloads
+- **Memory indirection**: SwissHash's separate control-byte array adds memory accesses versus Ruby Hash's integrated layout
+
+This explains why SwissHash excels at write-heavy operations (where its design advantages matter) but loses to Ruby's VM-optimized lookups.
+
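The H1/H2 split mentioned above is easy to state concretely: the 64-bit hash is cut into a 57-bit H1, which selects the probe group, and a 7-bit H2 tag stored in the control byte and compared eight slots at a time. A minimal sketch mirroring the `H1`/`H2` macros in the C source:

```ruby
H2_MASK = 0x7F

def h1(hash)
  hash >> 7        # upper 57 bits: picks the probe group
end

def h2(hash)
  hash & H2_MASK   # low 7 bits: tag stored in the control byte
end

hash = 0xDEADBEEF
# A control byte equal to h2(hash) only marks a slot as a candidate;
# the full key comparison happens afterwards on candidates.
```

Because the two parts are disjoint bit ranges, `(h1(x) << 7) | h2(x)` always reconstructs the original hash.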
+## Build
+
+```
+rake compile
+```
+
+## Analysis Tools
+
+The project includes a comprehensive benchmark with statistical validation:
+
+```bash
+ruby -Ilib benchmark.rb
+```
+
+The benchmark measures (7 iterations, 2 warmup, GC disabled):
+- Insert performance (sequential integers, strings, random integers)
+- Lookup performance (with 3x repetition for signal amplification)
+- Delete-with-reinsertion operations
+- Mixed workloads (70/20/10 read/write/delete)
+- Memory usage and GC pressure analysis
+
+Results are statistically validated through multiple runs.
+
+## License
+
+MIT
data/ext/swiss_hash/swiss_hash.c
ADDED
@@ -0,0 +1,880 @@
+#include <ruby.h>
+#include <ruby/encoding.h>
+#include <stdint.h>
+#include <string.h>
+#include <stdlib.h>
+
+#ifndef RB_UNLIKELY
+#if defined(__GNUC__) || defined(__clang__)
+#define RB_UNLIKELY(x) __builtin_expect(!!(x), 0)
+#else
+#define RB_UNLIKELY(x) (x)
+#endif
+#endif
+
+static uint64_t swiss_hash_seed0;
+static uint64_t swiss_hash_seed1;
+
+static void init_hash_seed(void) {
+    VALUE seed_val = rb_hash(INT2FIX(0));
+    uint64_t base = (uint64_t)NUM2LONG(seed_val);
+
+    /* splitmix64-style finalizer, applied twice to derive two seeds */
+    uint64_t s = base ^ 0x6a09e667f3bcc908ULL;
+    s ^= s >> 30;
+    s *= 0xbf58476d1ce4e5b9ULL;
+    s ^= s >> 27;
+    s *= 0x94d049bb133111ebULL;
+    s ^= s >> 31;
+    swiss_hash_seed0 = s;
+
+    s ^= s >> 30;
+    s *= 0xbf58476d1ce4e5b9ULL;
+    s ^= s >> 27;
+    s *= 0x94d049bb133111ebULL;
+    s ^= s >> 31;
+    swiss_hash_seed1 = s;
+}
+
+static inline uint64_t _wyr8(const uint8_t *p) {
+    uint64_t v;
+    memcpy(&v, p, 8);
+    return v;
+}
+static inline uint64_t _wyr4(const uint8_t *p) {
+    uint32_t v;
+    memcpy(&v, p, 4);
+    return v;
+}
+static inline uint64_t _wyr3(const uint8_t *p, size_t k) {
+    return ((uint64_t)p[0]) << 16 | ((uint64_t)p[k >> 1]) << 8 | p[k - 1];
+}
+
+static inline uint64_t _wymix(uint64_t a, uint64_t b) {
+#if defined(_MSC_VER) && defined(_M_X64)
+    uint64_t hi;
+    uint64_t lo = _umul128(a, b, &hi);
+    return hi ^ lo;
+#elif defined(__SIZEOF_INT128__)
+    __uint128_t r = (__uint128_t)a * b;
+    return (uint64_t)(r >> 64) ^ (uint64_t)r;
+#else
+    uint64_t lo = a * b;
+    uint32_t a_hi = (uint32_t)(a >> 32), a_lo = (uint32_t)a;
+    uint32_t b_hi = (uint32_t)(b >> 32), b_lo = (uint32_t)b;
+    uint64_t cross1 = (uint64_t)a_hi * b_lo;
+    uint64_t cross2 = (uint64_t)a_lo * b_hi;
+    uint64_t hi_approx = (uint64_t)a_hi * b_hi + (cross1 >> 32) + (cross2 >> 32);
+    return hi_approx ^ lo;
+#endif
+}
+
+static inline uint64_t wyhash(const void *data, size_t len, uint64_t seed) {
+    static const uint64_t s0 = 0xa0761d6478bd642fULL;
+    static const uint64_t s1 = 0xe7037ed1a0b428dbULL;
+    static const uint64_t s2 = 0x8ebc6af09c88c6e3ULL;
+    static const uint64_t s3 = 0x589965cc75374cc3ULL;
+
+    const uint8_t *p = (const uint8_t *)data;
+    uint64_t a, b;
+
+    seed ^= _wymix(seed ^ s0, s1);
+
+    if (len <= 16) {
+        if (len >= 4) {
+            a = (_wyr4(p) << 32) | _wyr4(p + ((len >> 3) << 2));
+            b = (_wyr4(p + len - 4) << 32) | _wyr4(p + len - 4 - ((len >> 3) << 2));
+        } else if (len > 0) {
+            a = _wyr3(p, len);
+            b = 0;
+        } else {
+            a = b = 0;
+        }
+    } else if (len <= 48) {
+        size_t i = 0;
+        for (; i + 16 <= len; i += 16) {
+            seed = _wymix(_wyr8(p + i) ^ s1, _wyr8(p + i + 8) ^ seed);
+        }
+        a = _wyr8(p + len - 16);
+        b = _wyr8(p + len - 8);
+    } else {
+        size_t i = 0;
+        uint64_t see1 = seed, see2 = seed;
+        for (; i + 48 <= len; i += 48) {
+            seed = _wymix(_wyr8(p + i) ^ s1, _wyr8(p + i + 8) ^ seed);
+            see1 = _wymix(_wyr8(p + i + 16) ^ s2, _wyr8(p + i + 24) ^ see1);
+            see2 = _wymix(_wyr8(p + i + 32) ^ s3, _wyr8(p + i + 40) ^ see2);
+        }
+        for (; i + 16 <= len; i += 16) {
+            seed = _wymix(_wyr8(p + i) ^ s1, _wyr8(p + i + 8) ^ seed);
+        }
+        seed ^= see1 ^ see2;
+        a = _wyr8(p + len - 16);
+        b = _wyr8(p + len - 8);
+    }
+
+    return _wymix(s1 ^ len, _wymix(a ^ s1, b ^ seed));
+}
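For 1-3 byte inputs, `_wyr3` avoids per-length branching by always reading three positions: the first byte, the middle byte (`len / 2`), and the last byte, which may overlap for very short inputs. A Ruby sketch of the same read:

```ruby
# Mirrors _wyr3: for a 1..3 byte input, pack first byte, middle byte
# (index k/2), and last byte into a single 24-bit value.
def wyr3(bytes)
  k = bytes.length
  (bytes[0] << 16) | (bytes[k >> 1] << 8) | bytes[k - 1]
end

wyr3("abc".bytes)  # => 0x616263 ("a", "b", "c")
wyr3("a".bytes)    # => 0x616161 (all three reads hit the same byte)
```

The overlap for 1- and 2-byte inputs is intentional: the same three reads are valid for every length in range, so the short-input path stays branch-free in the C code.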
+
+#define GROUP_SIZE 8
+#define CTRL_EMPTY 0x80
+#define CTRL_DELETED 0xFE
+#define H2_MASK 0x7F
+#define MAX_LOAD_NUM 7
+#define MAX_LOAD_DEN 8
+#define TOMBSTONE_COMPACT_DIVISOR 4
+
+#if defined(__x86_64__) || defined(_M_X64)
+#define SWISS_USE_SSE2 1
+#include <emmintrin.h>
+#elif defined(__aarch64__) || defined(_M_ARM64)
+#define SWISS_USE_NEON 1
+#include <arm_neon.h>
+#else
+#define SWISS_USE_PORTABLE 1
+#endif
+
+#ifdef SWISS_USE_SSE2
+
+static inline __m128i ctrl_load(const uint8_t *ctrl) {
+    return _mm_loadl_epi64((const __m128i *)ctrl);
+}
+
+static inline uint32_t ctrl_match_h2_vec(__m128i cv, uint8_t h2) {
+    __m128i cmp = _mm_cmpeq_epi8(cv, _mm_set1_epi8((char)h2));
+    return (uint32_t)_mm_movemask_epi8(cmp) & 0xFF;
+}
+
+static inline uint32_t ctrl_match_empty_vec(__m128i cv) {
+    __m128i cmp = _mm_cmpeq_epi8(cv, _mm_set1_epi8((char)CTRL_EMPTY));
+    return (uint32_t)_mm_movemask_epi8(cmp) & 0xFF;
+}
+
+static inline uint32_t ctrl_match_empty_or_deleted_vec(__m128i cv) {
+    return (uint32_t)_mm_movemask_epi8(cv) & 0xFF;
+}
+
+static inline uint32_t ctrl_match_empty(const uint8_t *ctrl) {
+    return ctrl_match_empty_vec(ctrl_load(ctrl));
+}
+
+#elif defined(SWISS_USE_NEON)
+
+static inline uint32_t neon_movemask(uint8x8_t v) {
+    static const uint8_t power_of_two[8] = {1, 2, 4, 8, 16, 32, 64, 128};
+    uint8x8_t bits = vand_u8(v, vld1_u8(power_of_two));
+    bits = vpadd_u8(bits, bits);
+    bits = vpadd_u8(bits, bits);
+    bits = vpadd_u8(bits, bits);
+    return (uint32_t)vget_lane_u8(bits, 0);
+}
+
+static inline uint8x8_t ctrl_load(const uint8_t *ctrl) {
+    return vld1_u8(ctrl);
+}
+
+static inline uint32_t ctrl_match_h2_vec(uint8x8_t cv, uint8_t h2) {
+    return neon_movemask(vceq_u8(cv, vdup_n_u8(h2)));
+}
+
+static inline uint32_t ctrl_match_empty_vec(uint8x8_t cv) {
+    return neon_movemask(vceq_u8(cv, vdup_n_u8(CTRL_EMPTY)));
+}
+
+static inline uint32_t ctrl_match_empty_or_deleted_vec(uint8x8_t cv) {
+    uint8x8_t msb = vshr_n_u8(cv, 7);
+    uint8x8_t match = vceq_u8(msb, vdup_n_u8(1));
+    return neon_movemask(match);
+}
+
+static inline uint32_t ctrl_match_empty(const uint8_t *ctrl) {
+    return ctrl_match_empty_vec(ctrl_load(ctrl));
+}
+
+#else /* portable */
+
+static inline uint32_t ctrl_match_h2_raw(const uint8_t *ctrl, uint8_t h2) {
+    uint64_t c;
+    memcpy(&c, ctrl, 8);
+    uint64_t broadcast = 0x0101010101010101ULL * h2;
+    uint64_t xored = c ^ broadcast;
+    uint64_t result = (xored - 0x0101010101010101ULL) & ~xored & 0x8080808080808080ULL;
+    uint32_t mask = 0;
+    for (int i = 0; i < 8; i++) {
+        if (result & (0x80ULL << (i * 8)))
+            mask |= (1u << i);
+    }
+    return mask;
+}
+
+static inline uint32_t ctrl_match_empty_raw(const uint8_t *ctrl) {
+    uint32_t mask = 0;
+    for (int i = 0; i < 8; i++) {
+        if (ctrl[i] == CTRL_EMPTY)
+            mask |= (1u << i);
+    }
+    return mask;
+}
+
+static inline uint32_t ctrl_match_empty_or_deleted_raw(const uint8_t *ctrl) {
+    uint32_t mask = 0;
+    for (int i = 0; i < 8; i++) {
+        if (ctrl[i] & 0x80)
+            mask |= (1u << i);
+    }
+    return mask;
+}
+
+static inline uint32_t ctrl_match_empty(const uint8_t *ctrl) {
+    return ctrl_match_empty_raw(ctrl);
+}
+
+#endif /* SIMD selection */
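The portable `ctrl_match_h2_raw` fallback is the classic SWAR zero-byte trick: XOR the group's eight control bytes against a broadcast of `h2`, then detect the zero bytes with `(x - 0x0101…) & ~x & 0x8080…`. A Ruby sketch of the same computation (masking to 64 bits, since Ruby integers are unbounded); note that borrow propagation can occasionally flag a non-matching byte, which is harmless in the table because every candidate slot is confirmed by a full key comparison:

```ruby
M64  = 0xFFFFFFFFFFFFFFFF
LOW  = 0x0101010101010101
HIGH = 0x8080808080808080

# Bit i of the result is set when ctrl[i] == h2: zero bytes of
# (ctrl ^ broadcast) get their high bit set by the SWAR expression.
def ctrl_match_h2(ctrl, h2)
  c = ctrl.each_with_index.sum { |b, i| b << (8 * i) } # little-endian pack
  x = (c ^ (LOW * h2)) & M64
  zeros = ((x - LOW) & ~x & HIGH) & M64
  (0...8).sum { |i| zeros[8 * i + 7] << i }            # collect high bits
end

ctrl = [5, 0x80, 5, 0xFE, 9, 5, 0x80, 1] # h2 tags / EMPTY / DELETED bytes
ctrl_match_h2(ctrl, 5)  # => 0b00100101 (slots 0, 2, and 5 are candidates)
```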
+
+static inline int ctz32(uint32_t v) {
+#if defined(__GNUC__) || defined(__clang__)
+    return __builtin_ctz(v);
+#elif defined(_MSC_VER)
+    unsigned long idx;
+    _BitScanForward(&idx, v);
+    return (int)idx;
+#else
+    int n = 0;
+    while (!(v & 1)) {
+        v >>= 1;
+        n++;
+    }
+    return n;
+#endif
+}
+
+typedef struct {
+    VALUE key;
+    VALUE value;
+} Slot;
+
+typedef struct {
+    uint8_t *ctrl;
+    Slot *slots;
+    size_t capacity;
+    size_t num_groups;
+    size_t group_mask;
+    size_t size;
+    size_t growth_left;
+    size_t tombstone_count;
+    uint8_t mutating;
+} SwissHash;
+
+#define MUTATE_GUARD_BEGIN(sh) \
+    do { \
+        if ((sh)->mutating) { \
+            rb_raise(rb_eRuntimeError, "SwissHash: reentrant modification detected " \
+                     "(#hash or #eql? callback modified the same table)"); \
+        } \
+        (sh)->mutating = 1; \
+    } while (0)
+
+#define MUTATE_GUARD_END(sh) \
+    do { \
+        (sh)->mutating = 0; \
+    } while (0)
+
+#define FIBONACCI_HASH_C 0x9E3779B97F4A7C15ULL
+
+static inline uint64_t compute_hash(VALUE key) {
+    uint64_t v;
+
+    if (FIXNUM_P(key)) {
+        v = (uint64_t)FIX2LONG(key) ^ swiss_hash_seed0;
+        return v * FIBONACCI_HASH_C;
+    }
+
+    if (SYMBOL_P(key)) {
+        v = (uint64_t)SYM2ID(key) ^ swiss_hash_seed0;
+        return v * FIBONACCI_HASH_C;
+    }
+
+    if (RB_TYPE_P(key, T_STRING)) {
+        const char *ptr = RSTRING_PTR(key);
+        long len = RSTRING_LEN(key);
+        int enc_idx = rb_enc_get_index(key);
+        uint64_t str_seed = swiss_hash_seed0 ^ (uint64_t)enc_idx;
+        return wyhash(ptr ? ptr : (const char *)"", (size_t)len, str_seed);
+    }
+
+    v = (uint64_t)NUM2LONG(rb_hash(key));
+    v ^= swiss_hash_seed1;
+    v ^= v >> 33;
+    v *= 0xff51afd7ed558ccdULL;
+    v ^= v >> 33;
+    v *= 0xc4ceb9fe1a85ec53ULL;
+    v ^= v >> 33;
+    return v;
+}
+
+#define H1(hash) ((hash) >> 7)
+#define H2(hash) ((uint8_t)((hash) & H2_MASK))
+
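`compute_hash` finishes Fixnum and Symbol keys with Fibonacci (multiplicative) hashing: one 64-bit multiply by `0x9E3779B97F4A7C15` (roughly 2^64 divided by the golden ratio) spreads low-entropy sequential keys across the high bits that `H1` consumes. A sketch of that path, with a fixed seed standing in for the process-wide `swiss_hash_seed0`:

```ruby
M64 = 0xFFFFFFFFFFFFFFFF
FIBONACCI_HASH_C = 0x9E3779B97F4A7C15 # ~2**64 / golden ratio, odd

# Mirrors the Fixnum path of compute_hash: XOR with a seed (fixed
# here purely for illustration), then one multiplicative mix.
def fixnum_hash(key, seed = 0x12345678)
  ((key ^ seed) * FIBONACCI_HASH_C) & M64
end
```

Because the multiplier is odd, multiplication modulo 2^64 is a bijection: distinct keys always yield distinct 64-bit hashes, and collisions can only appear once `H1`/`H2` truncate the result.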
+static inline int keys_equal(VALUE a, VALUE b) {
+    if (a == b)
+        return 1;
+
+    int ta = TYPE(a);
+    if (ta == T_FIXNUM || ta == T_SYMBOL)
+        return 0;
+
+    if (ta == T_STRING && RB_TYPE_P(b, T_STRING)) {
+        long la = RSTRING_LEN(a);
+        long lb = RSTRING_LEN(b);
+        if (la != lb)
+            return 0;
+
+        if (rb_enc_compatible(a, b)) {
+            return memcmp(RSTRING_PTR(a), RSTRING_PTR(b), (size_t)la) == 0;
+        }
+        return rb_eql(a, b);
+    }
+
+    return rb_eql(a, b);
+}
+
+static inline VALUE prepare_key(VALUE key) {
+    if (RB_TYPE_P(key, T_STRING) && !OBJ_FROZEN(key)) {
+        key = rb_str_new_frozen(key);
+    }
+    return key;
+}
+
+static void swiss_free_arrays(SwissHash *sh) {
+    free(sh->ctrl);
+    sh->ctrl = NULL;
+    free(sh->slots);
+    sh->slots = NULL;
+}
+
+static void swiss_init(SwissHash *sh, size_t min_capacity) {
+    size_t min_groups = (min_capacity + GROUP_SIZE - 1) / GROUP_SIZE;
+    size_t num_groups = 1;
+    while (num_groups < min_groups)
+        num_groups <<= 1;
+    if (num_groups < 2)
+        num_groups = 2;
+
+    size_t capacity = num_groups * GROUP_SIZE;
+
+    sh->num_groups = num_groups;
+    sh->group_mask = num_groups - 1;
+    sh->capacity = capacity;
+    sh->size = 0;
+    sh->tombstone_count = 0;
+    sh->mutating = 0;
+    sh->growth_left = capacity * MAX_LOAD_NUM / MAX_LOAD_DEN;
+
+    sh->ctrl = (uint8_t *)malloc(capacity);
+    sh->slots = (Slot *)calloc(capacity, sizeof(Slot));
+
+    if (!sh->ctrl || !sh->slots) {
+        free(sh->ctrl);
+        free(sh->slots);
+        sh->ctrl = NULL;
+        sh->slots = NULL;
+        rb_raise(rb_eNoMemError, "failed to allocate SwissHash");
+    }
+
+    memset(sh->ctrl, CTRL_EMPTY, capacity);
+}
+
+typedef struct {
+    size_t group_idx;
+    size_t stride;
+    size_t group_mask;
+} ProbeSeq;
+
+static inline ProbeSeq probe_start(uint64_t h1, size_t group_mask) {
+    ProbeSeq ps;
+    ps.group_idx = (size_t)h1 & group_mask;
+    ps.stride = 0;
+    ps.group_mask = group_mask;
+    return ps;
+}
+
+static inline void probe_next(ProbeSeq *ps) {
+    ps->stride++;
+    ps->group_idx = (ps->group_idx + ps->stride) & ps->group_mask;
+}
+
+#define GROUP_OFF(gi) ((gi) * GROUP_SIZE)
+
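`probe_next` implements triangular probing: the stride grows by one on every step, so group indices follow start, start+1, start+3, start+6, ... modulo the group count. With a power-of-two group count this sequence is a permutation, so every group is visited before any repeats, which is what guarantees the probe loops below terminate. A quick Ruby check of that property:

```ruby
# Visit groups exactly as probe_start/probe_next do, for a table with
# group_mask + 1 groups (a power of two).
def probe_sequence(h1, group_mask)
  idx = h1 & group_mask
  seq = [idx]
  stride = 0
  group_mask.times do
    stride += 1
    idx = (idx + stride) & group_mask
    seq << idx
  end
  seq
end

probe_sequence(3, 7)  # visits all 8 groups exactly once
```

This relies on the number-theoretic fact that triangular numbers n(n+1)/2 cover every residue modulo a power of two; it would not hold for an arbitrary group count, which is why `swiss_init` rounds the group count up to a power of two.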
+static VALUE *swiss_lookup(SwissHash *sh, VALUE key) {
+    uint64_t hash = compute_hash(key);
+    uint8_t h2 = H2(hash);
+    ProbeSeq ps = probe_start(H1(hash), sh->group_mask);
+
+    for (;;) {
+        size_t off = GROUP_OFF(ps.group_idx);
+
+#if defined(SWISS_USE_SSE2)
+        __m128i cv = ctrl_load(sh->ctrl + off);
+        uint32_t match = ctrl_match_h2_vec(cv, h2);
+        uint32_t empty = ctrl_match_empty_vec(cv);
+#elif defined(SWISS_USE_NEON)
+        uint8x8_t cv = ctrl_load(sh->ctrl + off);
+        uint32_t match = ctrl_match_h2_vec(cv, h2);
+        uint32_t empty = ctrl_match_empty_vec(cv);
+#else
+        uint32_t match = ctrl_match_h2_raw(sh->ctrl + off, h2);
+        uint32_t empty = ctrl_match_empty_raw(sh->ctrl + off);
+#endif
+        while (match) {
+            int slot = ctz32(match);
+            Slot *s = &sh->slots[off + slot];
+            if (keys_equal(s->key, key)) {
+                return &s->value;
+            }
+            match &= match - 1;
+        }
+
+        if (empty)
+            return NULL;
+        probe_next(&ps);
+    }
+}
+
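The inner `while (match)` loop above walks the candidate bitmask lowest-bit-first: `ctz32` yields the next candidate slot and `match &= match - 1` clears it, so the loop runs once per set bit. The same idiom in Ruby:

```ruby
# Return the set-bit indices of `mask` from lowest to highest, the way
# the lookup loop turns a group match bitmask into slot offsets.
def set_bits(mask)
  out = []
  while mask != 0
    out << ((mask & -mask).bit_length - 1) # index of lowest set bit (ctz)
    mask &= mask - 1                       # clear lowest set bit
  end
  out
end

set_bits(0b00100101)  # => [0, 2, 5]
```

`mask & -mask` isolates the lowest set bit and `bit_length - 1` converts it to an index, standing in for the `__builtin_ctz` / `_BitScanForward` intrinsics used in the C source.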
+static void swiss_grow(SwissHash *sh);
+static void swiss_compact(SwissHash *sh);
+
+static inline void swiss_insert_rehash(SwissHash *sh, uint64_t hash, VALUE key, VALUE value) {
+    uint8_t h2 = H2(hash);
+    ProbeSeq ps = probe_start(H1(hash), sh->group_mask);
+
+    for (;;) {
+        size_t off = GROUP_OFF(ps.group_idx);
+        uint32_t empty_mask = ctrl_match_empty(sh->ctrl + off);
+        if (empty_mask) {
+            size_t idx = off + ctz32(empty_mask);
+            sh->ctrl[idx] = h2;
+            sh->slots[idx].key = key;
+            sh->slots[idx].value = value;
+            sh->size++;
+            sh->growth_left--;
+            return;
+        }
+        probe_next(&ps);
+    }
+}
+
+static inline int should_compact(SwissHash *sh) {
+    return sh->tombstone_count >= sh->capacity / TOMBSTONE_COMPACT_DIVISOR;
+}
+
+static void swiss_compact(SwissHash *sh) {
+    size_t cap = sh->capacity;
+    size_t old_size = sh->size;
+
+    uint8_t *old_ctrl = sh->ctrl;
+    Slot *old_slots = sh->slots;
+
+    sh->ctrl = (uint8_t *)malloc(cap);
+    sh->slots = (Slot *)calloc(cap, sizeof(Slot));
+
+    if (!sh->ctrl || !sh->slots) {
+        free(sh->ctrl);
+        free(sh->slots);
+        sh->ctrl = old_ctrl;
+        sh->slots = old_slots;
+        sh->size = old_size;
+        sh->growth_left = 0;
+        rb_raise(rb_eNoMemError, "failed to compact SwissHash");
+    }
+
+    memset(sh->ctrl, CTRL_EMPTY, cap);
+    sh->size = 0;
+    sh->growth_left = cap * MAX_LOAD_NUM / MAX_LOAD_DEN;
+    sh->tombstone_count = 0;
+
+    for (size_t i = 0; i < cap; i++) {
+        uint8_t c = old_ctrl[i];
+        if (c != CTRL_EMPTY && c != CTRL_DELETED) {
+            uint64_t hash = compute_hash(old_slots[i].key);
+            swiss_insert_rehash(sh, hash, old_slots[i].key, old_slots[i].value);
+        }
+    }
+
+    free(old_ctrl);
+    free(old_slots);
+}
+
+static VALUE swiss_insert(SwissHash *sh, VALUE key, VALUE value) {
+    if (sh->growth_left == 0) {
+        if (should_compact(sh)) {
+            swiss_compact(sh);
+        } else {
+            swiss_grow(sh);
+        }
+    }
+
+    uint64_t hash = compute_hash(key);
+    uint8_t h2 = H2(hash);
+    ProbeSeq ps = probe_start(H1(hash), sh->group_mask);
+
+    size_t insert_idx = (size_t)-1;
+
+    for (;;) {
+        size_t off = GROUP_OFF(ps.group_idx);
+
+#if defined(SWISS_USE_SSE2)
+        __m128i cv = ctrl_load(sh->ctrl + off);
+        uint32_t match = ctrl_match_h2_vec(cv, h2);
+        uint32_t empty = ctrl_match_empty_vec(cv);
+        uint32_t avail = ctrl_match_empty_or_deleted_vec(cv);
+#elif defined(SWISS_USE_NEON)
+        uint8x8_t cv = ctrl_load(sh->ctrl + off);
+        uint32_t match = ctrl_match_h2_vec(cv, h2);
+        uint32_t empty = ctrl_match_empty_vec(cv);
+        uint32_t avail = ctrl_match_empty_or_deleted_vec(cv);
+#else
+        uint32_t match = ctrl_match_h2_raw(sh->ctrl + off, h2);
+        uint32_t empty = ctrl_match_empty_raw(sh->ctrl + off);
+        uint32_t avail = ctrl_match_empty_or_deleted_raw(sh->ctrl + off);
+#endif
+        while (match) {
+            int slot = ctz32(match);
+            size_t idx = off + slot;
+            if (keys_equal(sh->slots[idx].key, key)) {
+                sh->slots[idx].value = value;
+                return value;
+            }
+            match &= match - 1;
+        }
+
+        if (insert_idx == (size_t)-1 && avail) {
+            insert_idx = off + ctz32(avail);
+        }
+
+        if (empty)
+            break;
+        probe_next(&ps);
+    }
+
+    MUTATE_GUARD_BEGIN(sh);
+
+    if (sh->ctrl[insert_idx] == CTRL_EMPTY) {
+        sh->growth_left--;
+    } else {
+        sh->tombstone_count--;
+    }
+    sh->ctrl[insert_idx] = h2;
+    sh->slots[insert_idx].key = key;
+    sh->slots[insert_idx].value = value;
+    sh->size++;
+
+    MUTATE_GUARD_END(sh);
+    return value;
+}
+
+static VALUE swiss_delete(SwissHash *sh, VALUE key) {
+    uint64_t hash = compute_hash(key);
+    uint8_t h2 = H2(hash);
+    ProbeSeq ps = probe_start(H1(hash), sh->group_mask);
+
+    for (;;) {
+        size_t off = GROUP_OFF(ps.group_idx);
+
+#if defined(SWISS_USE_SSE2)
+        __m128i cv = ctrl_load(sh->ctrl + off);
+        uint32_t match = ctrl_match_h2_vec(cv, h2);
+        uint32_t empty = ctrl_match_empty_vec(cv);
+#elif defined(SWISS_USE_NEON)
+        uint8x8_t cv = ctrl_load(sh->ctrl + off);
+        uint32_t match = ctrl_match_h2_vec(cv, h2);
+        uint32_t empty = ctrl_match_empty_vec(cv);
+#else
+        uint32_t match = ctrl_match_h2_raw(sh->ctrl + off, h2);
+        uint32_t empty = ctrl_match_empty_raw(sh->ctrl + off);
+#endif
+        while (match) {
+            int slot = ctz32(match);
+            size_t idx = off + slot;
+            if (keys_equal(sh->slots[idx].key, key)) {
+                VALUE old_value = sh->slots[idx].value;
+
+                MUTATE_GUARD_BEGIN(sh);
+                sh->ctrl[idx] = CTRL_DELETED;
+                sh->slots[idx].key = Qnil;
+                sh->slots[idx].value = Qnil;
+                sh->size--;
+                sh->tombstone_count++;
+                MUTATE_GUARD_END(sh);
+
+                return old_value;
+            }
+            match &= match - 1;
+        }
+
+        if (empty)
+            return Qnil;
+        probe_next(&ps);
+    }
+}
+
+static void swiss_grow(SwissHash *sh) {
+    size_t old_cap = sh->capacity;
+    size_t old_size = sh->size;
+
+    uint8_t *old_ctrl = sh->ctrl;
+    Slot *old_slots = sh->slots;
+
+    size_t new_num_groups = sh->num_groups * 2;
+    size_t new_cap = new_num_groups * GROUP_SIZE;
+
+    sh->ctrl = (uint8_t *)malloc(new_cap);
+    sh->slots = (Slot *)calloc(new_cap, sizeof(Slot));
+
+    if (!sh->ctrl || !sh->slots) {
+        free(sh->ctrl);
+        free(sh->slots);
+        sh->ctrl = old_ctrl;
+        sh->slots = old_slots;
+        sh->size = old_size;
+        sh->growth_left = 0;
+        rb_raise(rb_eNoMemError, "failed to grow SwissHash");
+    }
+
+    memset(sh->ctrl, CTRL_EMPTY, new_cap);
+
+    sh->num_groups = new_num_groups;
+    sh->group_mask = new_num_groups - 1;
+    sh->capacity = new_cap;
+    sh->size = 0;
+    sh->growth_left = new_cap * MAX_LOAD_NUM / MAX_LOAD_DEN;
+    sh->tombstone_count = 0;
+
+    for (size_t i = 0; i < old_cap; i++) {
+        uint8_t c = old_ctrl[i];
+        if (c != CTRL_EMPTY && c != CTRL_DELETED) {
+            uint64_t hash = compute_hash(old_slots[i].key);
+            swiss_insert_rehash(sh, hash, old_slots[i].key, old_slots[i].value);
+        }
+    }
+
+    free(old_ctrl);
+    free(old_slots);
+}

static void swiss_hash_mark(void *ptr) {
  SwissHash *sh = (SwissHash *)ptr;
  if (!sh || !sh->ctrl)
    return;

  for (size_t i = 0; i < sh->capacity; i++) {
    uint8_t c = sh->ctrl[i];
    if (c != CTRL_EMPTY && c != CTRL_DELETED) {
      rb_gc_mark(sh->slots[i].key);
      rb_gc_mark(sh->slots[i].value);
    }
  }
}

static void swiss_hash_free(void *ptr) {
  SwissHash *sh = (SwissHash *)ptr;
  if (sh) {
    swiss_free_arrays(sh);
    free(sh);
  }
}

static size_t swiss_hash_memsize(const void *ptr) {
  const SwissHash *sh = (const SwissHash *)ptr;
  size_t s = sizeof(SwissHash);
  if (sh && sh->ctrl) {
    s += sh->capacity * (sizeof(uint8_t) + sizeof(Slot));
  }
  return s;
}

static const rb_data_type_t swiss_hash_type = {
    "SwissHash",
    {swiss_hash_mark, swiss_hash_free, swiss_hash_memsize},
    NULL,
    NULL,
    RUBY_TYPED_FREE_IMMEDIATELY};

static VALUE swiss_hash_alloc(VALUE klass) {
  SwissHash *sh = ALLOC(SwissHash);
  memset(sh, 0, sizeof(SwissHash));
  return TypedData_Wrap_Struct(klass, &swiss_hash_type, sh);
}

static VALUE swiss_hash_initialize(int argc, VALUE *argv, VALUE self) {
  VALUE capacity_val;
  rb_scan_args(argc, argv, "01", &capacity_val);

  SwissHash *sh;
  TypedData_Get_Struct(self, SwissHash, &swiss_hash_type, sh);

  size_t capacity = NIL_P(capacity_val) ? 16 : NUM2SIZET(capacity_val);
  swiss_init(sh, capacity);

  return self;
}

static VALUE swiss_hash_aset(VALUE self, VALUE key, VALUE value) {
  SwissHash *sh;
  TypedData_Get_Struct(self, SwissHash, &swiss_hash_type, sh);
  if (RB_UNLIKELY(!(FIXNUM_P(key) || SYMBOL_P(key)))) {
    key = prepare_key(key);
  }
  return swiss_insert(sh, key, value);
}

static VALUE swiss_hash_aref(VALUE self, VALUE key) {
  SwissHash *sh;
  TypedData_Get_Struct(self, SwissHash, &swiss_hash_type, sh);
  VALUE *val = swiss_lookup(sh, key);
  return val ? *val : Qnil;
}

static VALUE swiss_hash_delete(VALUE self, VALUE key) {
  SwissHash *sh;
  TypedData_Get_Struct(self, SwissHash, &swiss_hash_type, sh);
  return swiss_delete(sh, key);
}

static VALUE swiss_hash_size(VALUE self) {
  SwissHash *sh;
  TypedData_Get_Struct(self, SwissHash, &swiss_hash_type, sh);
  return SIZET2NUM(sh->size);
}

static VALUE swiss_hash_empty_p(VALUE self) {
  SwissHash *sh;
  TypedData_Get_Struct(self, SwissHash, &swiss_hash_type, sh);
  return sh->size == 0 ? Qtrue : Qfalse;
}

static VALUE swiss_hash_clear(VALUE self) {
  SwissHash *sh;
  TypedData_Get_Struct(self, SwissHash, &swiss_hash_type, sh);

  memset(sh->ctrl, CTRL_EMPTY, sh->capacity);
  memset(sh->slots, 0, sh->capacity * sizeof(Slot));
  sh->size = 0;
  sh->growth_left = sh->capacity * MAX_LOAD_NUM / MAX_LOAD_DEN;
  sh->tombstone_count = 0;

  return self;
}

static VALUE swiss_hash_each(VALUE self) {
  SwissHash *sh;
  TypedData_Get_Struct(self, SwissHash, &swiss_hash_type, sh);

  RETURN_ENUMERATOR(self, 0, 0);

  for (size_t i = 0; i < sh->capacity; i++) {
    uint8_t c = sh->ctrl[i];
    if (c != CTRL_EMPTY && c != CTRL_DELETED) {
      rb_yield_values(2, sh->slots[i].key, sh->slots[i].value);
    }
  }

  return self;
}

static VALUE swiss_hash_keys(VALUE self) {
  SwissHash *sh;
  TypedData_Get_Struct(self, SwissHash, &swiss_hash_type, sh);
  VALUE ary = rb_ary_new_capa(sh->size);

  for (size_t i = 0; i < sh->capacity; i++) {
    uint8_t c = sh->ctrl[i];
    if (c != CTRL_EMPTY && c != CTRL_DELETED) {
      rb_ary_push(ary, sh->slots[i].key);
    }
  }
  return ary;
}

static VALUE swiss_hash_values(VALUE self) {
  SwissHash *sh;
  TypedData_Get_Struct(self, SwissHash, &swiss_hash_type, sh);
  VALUE ary = rb_ary_new_capa(sh->size);

  for (size_t i = 0; i < sh->capacity; i++) {
    uint8_t c = sh->ctrl[i];
    if (c != CTRL_EMPTY && c != CTRL_DELETED) {
      rb_ary_push(ary, sh->slots[i].value);
    }
  }
  return ary;
}

static VALUE swiss_hash_key_p(VALUE self, VALUE key) {
  SwissHash *sh;
  TypedData_Get_Struct(self, SwissHash, &swiss_hash_type, sh);
  VALUE *val = swiss_lookup(sh, key);
  return val ? Qtrue : Qfalse;
}

static VALUE swiss_hash_compact_bang(VALUE self) {
  SwissHash *sh;
  TypedData_Get_Struct(self, SwissHash, &swiss_hash_type, sh);

  if (sh->tombstone_count > 0) {
    swiss_compact(sh);
  }

  return self;
}

static VALUE swiss_hash_stats(VALUE self) {
  SwissHash *sh;
  TypedData_Get_Struct(self, SwissHash, &swiss_hash_type, sh);

  double load = sh->capacity > 0 ? (double)sh->size / sh->capacity : 0.0;

  VALUE hash = rb_hash_new();
  rb_hash_aset(hash, ID2SYM(rb_intern("capacity")), SIZET2NUM(sh->capacity));
  rb_hash_aset(hash, ID2SYM(rb_intern("size")), SIZET2NUM(sh->size));
  rb_hash_aset(hash, ID2SYM(rb_intern("num_groups")), SIZET2NUM(sh->num_groups));
  rb_hash_aset(hash, ID2SYM(rb_intern("load_factor")), DBL2NUM(load));
  rb_hash_aset(hash, ID2SYM(rb_intern("memory_bytes")), SIZET2NUM(swiss_hash_memsize(sh)));
  rb_hash_aset(hash, ID2SYM(rb_intern("growth_left")), SIZET2NUM(sh->growth_left));
  rb_hash_aset(hash, ID2SYM(rb_intern("tombstones")), SIZET2NUM(sh->tombstone_count));

#ifdef SWISS_USE_SSE2
  rb_hash_aset(hash, ID2SYM(rb_intern("simd")), rb_str_new_cstr("SSE2"));
#elif defined(SWISS_USE_NEON)
  rb_hash_aset(hash, ID2SYM(rb_intern("simd")), rb_str_new_cstr("NEON"));
#else
  rb_hash_aset(hash, ID2SYM(rb_intern("simd")), rb_str_new_cstr("portable/SWAR"));
#endif
  rb_hash_aset(hash, ID2SYM(rb_intern("layout")), rb_str_new_cstr("hybrid"));

  return hash;
}

void Init_swiss_hash(void) {
  init_hash_seed();

  VALUE mSwissHash = rb_define_module("SwissHash");
  VALUE cHash = rb_define_class_under(mSwissHash, "Hash", rb_cObject);

  rb_define_alloc_func(cHash, swiss_hash_alloc);
  rb_define_method(cHash, "initialize", swiss_hash_initialize, -1);
  rb_define_method(cHash, "[]=", swiss_hash_aset, 2);
  rb_define_method(cHash, "store", swiss_hash_aset, 2);
  rb_define_method(cHash, "[]", swiss_hash_aref, 1);
  rb_define_method(cHash, "delete", swiss_hash_delete, 1);
  rb_define_method(cHash, "size", swiss_hash_size, 0);
  rb_define_method(cHash, "length", swiss_hash_size, 0);
  rb_define_method(cHash, "empty?", swiss_hash_empty_p, 0);
  rb_define_method(cHash, "clear", swiss_hash_clear, 0);
  rb_define_method(cHash, "each", swiss_hash_each, 0);
  rb_define_method(cHash, "keys", swiss_hash_keys, 0);
  rb_define_method(cHash, "values", swiss_hash_values, 0);
  rb_define_method(cHash, "key?", swiss_hash_key_p, 1);
  rb_define_method(cHash, "has_key?", swiss_hash_key_p, 1);
  rb_define_method(cHash, "include?", swiss_hash_key_p, 1);
  rb_define_method(cHash, "compact!", swiss_hash_compact_bang, 0);
  rb_define_method(cHash, "stats", swiss_hash_stats, 0);
}
data/lib/swiss_hash.rb
ADDED
@@ -0,0 +1,28 @@
# frozen_string_literal: true

require_relative "swiss_hash/version"
require_relative "swiss_hash/swiss_hash.bundle"

module SwissHash
  class Hash
    alias_method :count, :size

    def merge!(other)
      other.each { |k, v| self[k] = v }
      self
    end
    alias_method :update, :merge!

    def to_h
      hash = {}
      each { |k, v| hash[k] = v }
      hash
    end

    def inspect
      s = stats
      "#<SwissHash::Hash size=#{s[:size]} capacity=#{s[:capacity]} load=#{(s[:load_factor] * 100).round(1)}%>"
    end
    alias_method :to_s, :inspect
  end
end
metadata
ADDED
@@ -0,0 +1,64 @@
--- !ruby/object:Gem::Specification
name: swiss_hash
version: !ruby/object:Gem::Version
  version: 0.1.0
platform: ruby
authors:
- Roman Haidarov
autorequire:
bindir: bin
cert_chain: []
date: 2026-03-23 00:00:00.000000000 Z
dependencies:
- !ruby/object:Gem::Dependency
  name: rake-compiler
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.0'
description:
email:
- roman.haidarov@hey.com
executables: []
extensions:
- ext/swiss_hash/extconf.rb
extra_rdoc_files: []
files:
- LICENSE.txt
- README.md
- ext/swiss_hash/extconf.rb
- ext/swiss_hash/swiss_hash.c
- lib/swiss_hash.rb
- lib/swiss_hash/version.rb
homepage: https://github.com/roman-haidarov/swiss_hash
licenses:
- MIT
metadata: {}
post_install_message:
rdoc_options: []
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: 3.0.0
required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '0'
requirements: []
rubygems_version: 3.3.27
signing_key:
specification_version: 4
summary: Swiss Table hash map as a Ruby C extension
test_files: []