maxmind-db-rust 0.2.1 → 0.4.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 235d81c8e26c962ed6f63803a64596e8d1b5234cac812a75a0edb053cb44a26a
4
- data.tar.gz: 0d6aaec6f63c9c5ffcf7b7c8394d898902badf9de5295bc65b50ff5c8ca5e75d
3
+ metadata.gz: 854691c99c81b7d9574c780a7e10f80eea69788bb4dc95cce41361e5163d6f28
4
+ data.tar.gz: '086f7efa6e3620e3fd15d66b87964076647e7510551d58d98e4f484c69400173'
5
5
  SHA512:
6
- metadata.gz: 2cbb91c71256ed6f8146d559aad0c7442fa516c5826d69ddd504059076d28f3de0b856ecd904f9835412617a9c07dd47b930e9050e4517b26d1268af4ec2a39c
7
- data.tar.gz: 432ec4c933e5a33c2c6ff994c52149c409be29b0d547ff0e76c2b2897e41efcf0090594aa970367c379907bed90eaed07d285920e0cf28dc767696632d849409
6
+ metadata.gz: f20905fca81e3742ff75da2e5758fbd198d61dd5c7c82b23dd8ed9a731bf7a46dc707c6660590adb895e2c022b6f6d07edb7916eab774d4461faccd2de66ec2e
7
+ data.tar.gz: 888d4623247bf07a0c8058c669230032ebcef98b0efcc03920f1196cb1695361f2a868250ffeafb6fe3ed7913038b38f4b419d9ef5e3a1d5f3caecd0b1ec1bb0
data/CHANGELOG.md CHANGED
@@ -5,6 +5,37 @@ All notable changes to this project will be documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
+ ## [0.4.0] - 2026-04-25
9
+
10
+ ### Performance
11
+
12
+ - Restored lookup performance with a generic bounded cache of frozen Ruby strings reused across decoded keys and scalar values.
13
+ - Removed hardcoded interned string tables in favor of the generic string cache.
14
+ - Simplified decoding so lookups and iteration use the same `maxminddb` decode path again.
15
+ - Reduced repeated cache-root lookup overhead with a thread-local `OnceCell` for the Ruby-owned string cache roots.
16
+ - Borrowed decoded map keys directly during deserialization to avoid `Cow` overhead in the hot decode path.
17
+ - Upgraded `maxminddb` crate to 0.28.0, which includes several performance
18
+ improvements.
19
+
20
+ ## [0.3.0] - 2026-02-22
21
+
22
+ ### Changed
23
+
24
+ - Improved lookup performance by using a generic bounded key cache for decoded map keys.
25
+ - Improved `IPAddr` lookup performance by decoding packed bytes from `IPAddr#hton` directly.
26
+ - Switched map-key cache hashing to `FxHashMap` for faster key-cache access.
27
+ - Switched map-key cache roots to a Ruby-owned cache array with Rust key-to-index lookups.
28
+ - Refactored duplicated prefix and `within` decode paths in the Rust reader for simpler maintenance.
29
+ - Refactored duplicate database file-open error handling shared by MMAP and MEMORY modes.
30
+ - Updated Rust and Ruby dependencies.
31
+ - Added Ruby 4.0 coverage to CI workflows.
32
+
33
+ ### Fixed
34
+
35
+ - Made extension initialization idempotent across `MaxMind::DB` class/module loading modes to avoid typed-data incompatibility when the extension is loaded more than once.
36
+ - When loaded with the official `MaxMind::DB` class, `MaxMind::DB::Rust` now uses anonymous module creation to preserve canonical module naming.
37
+ - Scoped Rust dependency cache per Ruby version in CI tests and stopped caching `target/` in the test workflow to avoid cross-version artifact contamination.
38
+
8
39
  ## [0.2.1] - 2025-12-18
9
40
 
10
41
  ### Changed
data/README.md CHANGED
@@ -3,18 +3,19 @@
3
3
  [![Test](https://github.com/oschwald/maxmind-db-rust-ruby/actions/workflows/test.yml/badge.svg)](https://github.com/oschwald/maxmind-db-rust-ruby/actions/workflows/test.yml)
4
4
  [![Lint](https://github.com/oschwald/maxmind-db-rust-ruby/actions/workflows/lint.yml/badge.svg)](https://github.com/oschwald/maxmind-db-rust-ruby/actions/workflows/lint.yml)
5
5
 
6
- A high-performance Rust-based Ruby gem for reading MaxMind DB files. Provides API compatibility with the official `maxmind-db` gem while leveraging Rust for superior performance.
6
+ A Ruby gem for reading MaxMind DB files, implemented in Rust.
7
+ It keeps the API close to the official `maxmind-db` gem while adding Rust-backed performance.
7
8
 
8
9
  > **Note:** This is an unofficial library and is not endorsed by MaxMind. For the official Ruby library, see [maxmind-db](https://github.com/maxmind/MaxMind-DB-Reader-ruby).
9
10
 
10
11
  ## Features
11
12
 
12
- - **High Performance**: Rust-based implementation provides significantly faster lookups than pure Ruby
13
- - **API Compatible**: Familiar API similar to the official MaxMind::DB gem
14
- - **Thread-Safe**: Safe to use from multiple threads
15
- - **Memory Modes**: Support for both memory-mapped (MMAP) and in-memory modes
16
- - **Iterator Support**: Iterate over all networks in the database (extension feature)
17
- - **Type Support**: Works with both String and IPAddr objects
13
+ - Rust implementation focused on fast lookups
14
+ - API modeled after the official `maxmind-db` gem
15
+ - Thread-safe lookups
16
+ - Supports MMAP and in-memory modes
17
+ - Includes network iteration support
18
+ - Accepts both `String` and `IPAddr` inputs
18
19
 
19
20
  ## Installation
20
21
 
@@ -277,30 +278,31 @@ Metadata attributes:
277
278
 
278
279
  ## Comparison with Official Gem
279
280
 
280
- | Feature | maxmind-db (official) | maxmind-db-rust (this gem) |
281
- | ---------------- | --------------------- | -------------------------- |
282
- | Implementation | Pure Ruby | Rust with Ruby bindings |
283
- | Performance | Baseline | 10-50x faster |
284
- | API | MaxMind::DB | MaxMind::DB::Rust |
285
- | MODE_FILE | ✓ | ✗ |
286
- | MODE_MEMORY | ✓ | ✓ |
287
- | MODE_AUTO | ✓ | ✓ |
288
- | MODE_MMAP | ✗ | ✓ |
289
- | Iterator support | ✗ | ✓ |
290
- | Thread-safe | ✓ | ✓ |
281
+ | Feature | maxmind-db (official) | maxmind-db-rust (this gem) |
282
+ | ---------------- | --------------------- | ------------------------------------------ |
283
+ | Implementation | Pure Ruby | Rust with Ruby bindings |
284
+ | Performance | Baseline | Faster lookup throughput in our benchmarks |
285
+ | API | MaxMind::DB | MaxMind::DB::Rust |
286
+ | MODE_FILE | ✓ | ✗ |
287
+ | MODE_MEMORY | ✓ | ✓ |
288
+ | MODE_AUTO | ✓ | ✓ |
289
+ | MODE_MMAP | ✗ | ✓ |
290
+ | Iterator support | ✗ | ✓ |
291
+ | Thread-safe | ✓ | ✓ |
291
292
 
292
293
  ## Performance
293
294
 
294
- Expected performance characteristics (will vary based on hardware):
295
+ Lookup performance depends on hardware, Ruby version, database, and workload.
295
296
 
296
- - Single-threaded lookups: 300,000 - 500,000 lookups/second
297
- - Significantly faster than pure Ruby implementations
298
- - Memory-mapped mode (MMAP) provides best performance
299
- - Fully thread-safe for concurrent lookups
297
+ - In this project’s random-lookup benchmarks, this gem is consistently faster than the official Ruby implementation.
298
+ - On `/var/lib/GeoIP/GeoIP2-City.mmdb` in this environment, random lookup throughput was about `47x` higher than the official gem.
299
+ - `MODE_MMAP` and `MODE_MEMORY` both perform well; which is faster can vary by environment.
300
+ - For reproducible numbers on your own data, run `benchmark/compare_lookups.rb` against your database.
301
+ - Safe for concurrent lookups across threads.
300
302
 
301
303
  ## Development
302
304
 
303
- Interested in contributing? See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed developer documentation, including:
305
+ See [CONTRIBUTING.md](CONTRIBUTING.md) for developer documentation, including:
304
306
 
305
307
  - Development setup and prerequisites
306
308
  - Building and testing the extension
@@ -321,11 +323,11 @@ bundle exec rake test
321
323
 
322
324
  ## Contributing
323
325
 
324
- 1. Fork it
325
- 2. Create your feature branch (`git checkout -b my-new-feature`)
326
- 3. Commit your changes (`git commit -am 'Add some feature'`)
326
+ 1. Fork the repository
327
+ 2. Create a feature branch (`git checkout -b my-new-feature`)
328
+ 3. Commit your changes (`git commit -am 'Describe your change'`)
327
329
  4. Push to the branch (`git push origin my-new-feature`)
328
- 5. Create a new Pull Request
330
+ 5. Open a Pull Request
329
331
 
330
332
  ## License
331
333
 
@@ -12,9 +12,10 @@ name = "maxmind_db_rust"
12
12
  crate-type = ["cdylib"]
13
13
 
14
14
  [dependencies]
15
- arc-swap = "1.7"
15
+ arc-swap = "1.9"
16
16
  ipnetwork = "0.21"
17
17
  magnus = "0.8"
18
- maxminddb = { version = "0.27", features = ["unsafe-str-decode"] }
18
+ maxminddb = { version = "0.28", features = ["unsafe-str-decode"] }
19
19
  memmap2 = "0.9"
20
+ rustc-hash = "2.1"
20
21
  serde = "1.0"
@@ -5,17 +5,19 @@ use ::maxminddb as maxminddb_crate;
5
5
  use arc_swap::{ArcSwapOption, Guard};
6
6
  use ipnetwork::IpNetwork;
7
7
  use magnus::{
8
- error::Error, prelude::*, scan_args::get_kwargs, scan_args::scan_args, value::Lazy,
9
- ExceptionClass, IntoValue, RArray, RClass, RHash, RModule, RString, Symbol, Value,
8
+ error::Error, prelude::*, scan_args::get_kwargs, scan_args::scan_args, ExceptionClass,
9
+ IntoValue, RArray, RClass, RHash, RModule, RString, Symbol, Value,
10
10
  };
11
11
  use maxminddb_crate::{MaxMindDbError, Reader as MaxMindReader, Within};
12
12
  use memmap2::Mmap;
13
+ use rustc_hash::FxHasher;
13
14
  use serde::de::{self, Deserialize, DeserializeSeed, Deserializer, MapAccess, SeqAccess, Visitor};
14
15
  use std::{
15
- borrow::Cow,
16
+ cell::{OnceCell, RefCell},
16
17
  collections::BTreeMap,
17
18
  fmt,
18
19
  fs::File,
20
+ hash::{Hash, Hasher},
19
21
  io::Read as IoRead,
20
22
  net::IpAddr,
21
23
  path::Path,
@@ -30,177 +32,96 @@ use std::{
30
32
  const ERR_CLOSED_DB: &str = "Attempt to read from a closed MaxMind DB.";
31
33
  const ERR_BAD_DATA: &str =
32
34
  "The MaxMind DB file's data section contains bad data (unknown data type or corrupt data)";
35
+ const STRING_CACHE_ROOTS_CONST: &str = "__STRING_CACHE_ROOTS__";
36
+ const MAP_KEY_ROOTS_CONST: &str = "__MAP_KEY_ROOTS__";
37
+ const STRING_CACHE_MAX: usize = 4096;
38
+ const STRING_CACHE_MIN_LEN: usize = 2;
39
+ const STRING_CACHE_MAX_LEN: usize = 64;
40
+
41
+ #[derive(Default)]
42
+ struct StringCacheEntry {
43
+ hash: u64,
44
+ value: String,
45
+ }
33
46
 
34
- macro_rules! define_interned_keys {
35
- ( $( $const_ident:ident => $str:expr ),* $(,)? ) => {
36
- $(
37
- static $const_ident: Lazy<RString> = Lazy::new(|ruby| {
38
- let s = ruby.str_new($str);
39
- s.freeze();
40
- s
41
- });
42
- )*
43
-
44
- fn interned_key(ruby: &magnus::Ruby, key: &str) -> Option<Value> {
45
- match key.len() {
46
- 2 => match key.as_bytes() {
47
- b"en" => Some(ruby.get_inner(&$crate::EN_KEY).as_value()),
48
- b"es" => Some(ruby.get_inner(&$crate::ES_KEY).as_value()),
49
- b"fr" => Some(ruby.get_inner(&$crate::FR_KEY).as_value()),
50
- b"ja" => Some(ruby.get_inner(&$crate::JA_KEY).as_value()),
51
- b"ru" => Some(ruby.get_inner(&$crate::RU_KEY).as_value()),
52
- b"AF" => Some(ruby.get_inner(&$crate::AF_KEY).as_value()),
53
- b"AN" => Some(ruby.get_inner(&$crate::AN_KEY).as_value()),
54
- b"AS" => Some(ruby.get_inner(&$crate::AS_KEY).as_value()),
55
- b"EU" => Some(ruby.get_inner(&$crate::EU_KEY).as_value()),
56
- b"NA" => Some(ruby.get_inner(&$crate::NA_KEY).as_value()),
57
- b"OC" => Some(ruby.get_inner(&$crate::OC_KEY).as_value()),
58
- b"SA" => Some(ruby.get_inner(&$crate::SA_KEY).as_value()),
59
- b"US" => Some(ruby.get_inner(&$crate::US_VAL).as_value()),
60
- b"CN" => Some(ruby.get_inner(&$crate::CN_VAL).as_value()),
61
- b"JP" => Some(ruby.get_inner(&$crate::JP_VAL).as_value()),
62
- b"DE" => Some(ruby.get_inner(&$crate::DE_VAL).as_value()),
63
- b"IN" => Some(ruby.get_inner(&$crate::IN_VAL).as_value()),
64
- b"GB" => Some(ruby.get_inner(&$crate::GB_VAL).as_value()),
65
- b"FR" => Some(ruby.get_inner(&$crate::FR_VAL).as_value()),
66
- b"BR" => Some(ruby.get_inner(&$crate::BR_VAL).as_value()),
67
- b"IT" => Some(ruby.get_inner(&$crate::IT_VAL).as_value()),
68
- b"CA" => Some(ruby.get_inner(&$crate::CA_VAL).as_value()),
69
- b"RU" => Some(ruby.get_inner(&$crate::RU_VAL).as_value()),
70
- b"KR" => Some(ruby.get_inner(&$crate::KR_VAL).as_value()),
71
- b"AU" => Some(ruby.get_inner(&$crate::AU_VAL).as_value()),
72
- b"ES" => Some(ruby.get_inner(&$crate::ES_VAL).as_value()),
73
- b"MX" => Some(ruby.get_inner(&$crate::MX_VAL).as_value()),
74
- b"ID" => Some(ruby.get_inner(&$crate::ID_VAL).as_value()),
75
- b"TR" => Some(ruby.get_inner(&$crate::TR_VAL).as_value()),
76
- _ => None,
77
- },
78
- 4 => match key.as_bytes() {
79
- b"city" => Some(ruby.get_inner(&$crate::CITY_KEY).as_value()),
80
- b"code" => Some(ruby.get_inner(&$crate::CODE_KEY).as_value()),
81
- _ => None,
82
- },
83
- 5 => match key.as_bytes() {
84
- b"names" => Some(ruby.get_inner(&$crate::NAMES_KEY).as_value()),
85
- b"pt-BR" => Some(ruby.get_inner(&$crate::PT_BR_KEY).as_value()),
86
- b"zh-CN" => Some(ruby.get_inner(&$crate::ZH_CN_KEY).as_value()),
87
- _ => None,
88
- },
89
- 6 => match key.as_bytes() {
90
- b"postal" => Some(ruby.get_inner(&$crate::POSTAL_KEY).as_value()),
91
- b"traits" => Some(ruby.get_inner(&$crate::TRAITS_KEY).as_value()),
92
- _ => None,
93
- },
94
- 7 => match key.as_bytes() {
95
- b"country" => Some(ruby.get_inner(&$crate::COUNTRY_KEY).as_value()),
96
- b"network" => Some(ruby.get_inner(&$crate::NETWORK_KEY).as_value()),
97
- _ => None,
98
- },
99
- 8 => match key.as_bytes() {
100
- b"location" => Some(ruby.get_inner(&$crate::LOCATION_KEY).as_value()),
101
- b"iso_code" => Some(ruby.get_inner(&$crate::ISO_CODE_KEY).as_value()),
102
- b"latitude" => Some(ruby.get_inner(&$crate::LATITUDE_KEY).as_value()),
103
- _ => None,
104
- },
105
- 9 => match key.as_bytes() {
106
- b"continent" => Some(ruby.get_inner(&$crate::CONTINENT_KEY).as_value()),
107
- b"longitude" => Some(ruby.get_inner(&$crate::LONGITUDE_KEY).as_value()),
108
- b"time_zone" => Some(ruby.get_inner(&$crate::TIME_ZONE_KEY).as_value()),
109
- _ => None,
110
- },
111
- 10 => match key.as_bytes() {
112
- b"geoname_id" => Some(ruby.get_inner(&$crate::GEONAME_ID_KEY).as_value()),
113
- b"metro_code" => Some(ruby.get_inner(&$crate::METRO_CODE_KEY).as_value()),
114
- b"confidence" => Some(ruby.get_inner(&$crate::CONFIDENCE_KEY).as_value()),
115
- _ => None,
116
- },
117
- 12 => match key.as_bytes() {
118
- b"subdivisions" => Some(ruby.get_inner(&$crate::SUBDIVISIONS_KEY).as_value()),
119
- _ => None,
120
- },
121
- 15 => match key.as_bytes() {
122
- b"accuracy_radius" => Some(ruby.get_inner(&$crate::ACCURACY_RADIUS_KEY).as_value()),
123
- _ => None,
124
- },
125
- 18 => match key.as_bytes() {
126
- b"registered_country" => Some(ruby.get_inner(&$crate::REGISTERED_COUNTRY_KEY).as_value()),
127
- b"population_density" => Some(ruby.get_inner(&$crate::POPULATION_DENSITY_KEY).as_value()),
128
- _ => None,
129
- },
130
- 19 => match key.as_bytes() {
131
- b"represented_country" => Some(ruby.get_inner(&$crate::REPRESENTED_COUNTRY_KEY).as_value()),
132
- b"is_anonymous_proxy" => Some(ruby.get_inner(&$crate::IS_ANONYMOUS_PROXY_KEY).as_value()),
133
- _ => None,
134
- },
135
- 21 => match key.as_bytes() {
136
- b"is_satellite_provider" => Some(ruby.get_inner(&$crate::IS_SATELLITE_PROVIDER_KEY).as_value()),
137
- _ => None,
138
- },
139
- _ => None,
140
- }
141
- }
142
- };
47
+ struct StringCache {
48
+ entries: Box<[StringCacheEntry]>,
49
+ }
50
+
51
+ impl StringCache {
52
+ fn new() -> Self {
53
+ let entries = (0..STRING_CACHE_MAX)
54
+ .map(|_| StringCacheEntry::default())
55
+ .collect::<Vec<_>>()
56
+ .into_boxed_slice();
57
+ Self { entries }
58
+ }
59
+ }
60
+
61
+ thread_local! {
62
+ static STRING_CACHE: RefCell<StringCache> = RefCell::new(StringCache::new());
63
+ static STRING_CACHE_ROOTS: OnceCell<RArray> = const { OnceCell::new() };
64
+ }
65
+
66
+ #[inline]
67
+ fn string_cache_roots_owner(ruby: &magnus::Ruby) -> RArray {
68
+ let value = rust_module(ruby)
69
+ .const_get::<_, Value>(STRING_CACHE_ROOTS_CONST)
70
+ .expect("string cache roots constant should exist");
71
+ RArray::from_value(value).expect("string cache roots constant should be an array")
143
72
  }
144
73
 
145
- define_interned_keys!(
146
- CITY_KEY => "city",
147
- CONTINENT_KEY => "continent",
148
- COUNTRY_KEY => "country",
149
- REGISTERED_COUNTRY_KEY => "registered_country",
150
- REPRESENTED_COUNTRY_KEY => "represented_country",
151
- SUBDIVISIONS_KEY => "subdivisions",
152
- LOCATION_KEY => "location",
153
- POSTAL_KEY => "postal",
154
- TRAITS_KEY => "traits",
155
- NAMES_KEY => "names",
156
- GEONAME_ID_KEY => "geoname_id",
157
- ISO_CODE_KEY => "iso_code",
158
- CONFIDENCE_KEY => "confidence",
159
- ACCURACY_RADIUS_KEY => "accuracy_radius",
160
- LATITUDE_KEY => "latitude",
161
- LONGITUDE_KEY => "longitude",
162
- TIME_ZONE_KEY => "time_zone",
163
- METRO_CODE_KEY => "metro_code",
164
- POPULATION_DENSITY_KEY => "population_density",
165
- EN_KEY => "en",
166
- ES_KEY => "es",
167
- FR_KEY => "fr",
168
- JA_KEY => "ja",
169
- PT_BR_KEY => "pt-BR",
170
- RU_KEY => "ru",
171
- ZH_CN_KEY => "zh-CN",
172
- // Common keys
173
- CODE_KEY => "code",
174
- NETWORK_KEY => "network",
175
- IS_ANONYMOUS_PROXY_KEY => "is_anonymous_proxy",
176
- IS_SATELLITE_PROVIDER_KEY => "is_satellite_provider",
177
- // Continent codes
178
- AF_KEY => "AF",
179
- AN_KEY => "AN",
180
- AS_KEY => "AS",
181
- EU_KEY => "EU",
182
- NA_KEY => "NA",
183
- OC_KEY => "OC",
184
- SA_KEY => "SA",
185
- // Major Country ISO codes
186
- US_VAL => "US",
187
- CN_VAL => "CN",
188
- JP_VAL => "JP",
189
- DE_VAL => "DE",
190
- IN_VAL => "IN",
191
- GB_VAL => "GB",
192
- FR_VAL => "FR",
193
- BR_VAL => "BR",
194
- IT_VAL => "IT",
195
- CA_VAL => "CA",
196
- RU_VAL => "RU", // Already defined above as RU_KEY? No, RU_KEY is "ru" (lang), this is "RU" (country)
197
- KR_VAL => "KR",
198
- AU_VAL => "AU",
199
- ES_VAL => "ES", // "ES" (country) vs "es" (lang). ES_KEY is "es".
200
- MX_VAL => "MX",
201
- ID_VAL => "ID",
202
- TR_VAL => "TR",
203
- );
74
+ #[inline]
75
+ fn init_thread_string_cache_roots(ruby: &magnus::Ruby) -> RArray {
76
+ let roots = ruby.ary_new_capa(STRING_CACHE_MAX);
77
+ for _ in 0..STRING_CACHE_MAX {
78
+ roots
79
+ .push(ruby.qnil().as_value())
80
+ .expect("string cache roots initialization should succeed");
81
+ }
82
+ string_cache_roots_owner(ruby)
83
+ .push(roots.as_value())
84
+ .expect("string cache roots owner should retain per-thread roots");
85
+ roots
86
+ }
87
+
88
+ #[inline]
89
+ fn string_cache_roots(ruby: &magnus::Ruby) -> RArray {
90
+ STRING_CACHE_ROOTS.with(|roots| *roots.get_or_init(|| init_thread_string_cache_roots(ruby)))
91
+ }
92
+
93
+ #[inline]
94
+ fn cached_string(ruby: &magnus::Ruby, value: &str) -> Value {
95
+ if !(STRING_CACHE_MIN_LEN..=STRING_CACHE_MAX_LEN).contains(&value.len()) {
96
+ return ruby.str_new(value).into_value_with(ruby);
97
+ }
98
+
99
+ let mut hasher = FxHasher::default();
100
+ value.hash(&mut hasher);
101
+ let hash = hasher.finish();
102
+ let slot = (hash as usize) & (STRING_CACHE_MAX - 1);
103
+
104
+ STRING_CACHE.with(|cache_cell| {
105
+ let mut cache = cache_cell.borrow_mut();
106
+ let entry = &mut cache.entries[slot];
107
+ if entry.hash == hash && entry.value == value {
108
+ return string_cache_roots(ruby)
109
+ .entry::<Value>(slot as isize)
110
+ .expect("string cache roots lookup should succeed");
111
+ }
112
+
113
+ let string = ruby.str_new(value);
114
+ string.freeze();
115
+ let cached = string.as_value();
116
+ string_cache_roots(ruby)
117
+ .store(slot as isize, cached)
118
+ .expect("string cache roots update should succeed");
119
+ entry.hash = hash;
120
+ entry.value.clear();
121
+ entry.value.push_str(value);
122
+ cached
123
+ })
124
+ }
204
125
 
205
126
  /// Wrapper that owns the Ruby value produced by deserializing a MaxMind record
206
127
  #[derive(Clone)]
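
Note on the hunk above: `cached_string` behaves like a direct-mapped cache. The string hash selects one of `STRING_CACHE_MAX` slots, a hit requires both the hash and the stored text to match, and a miss overwrites whatever was in the slot. The following is a minimal, std-only sketch of that idea, not the gem's implementation. It assumes `Rc<str>` as a stand-in for the frozen Ruby string and uses `DefaultHasher` in place of `FxHasher`; `DirectMappedCache`, `Entry`, and `get` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::rc::Rc;

// Slot count must be a power of two so the mask below works; mirrors STRING_CACHE_MAX.
const CACHE_SLOTS: usize = 4096;
// Length bounds mirror STRING_CACHE_MIN_LEN and STRING_CACHE_MAX_LEN.
const MIN_LEN: usize = 2;
const MAX_LEN: usize = 64;

#[derive(Default)]
struct Entry {
    hash: u64,
    // Stand-in for the frozen Ruby string kept alive in the roots array.
    value: Option<Rc<str>>,
}

struct DirectMappedCache {
    entries: Vec<Entry>,
}

impl DirectMappedCache {
    fn new() -> Self {
        Self {
            entries: (0..CACHE_SLOTS).map(|_| Entry::default()).collect(),
        }
    }

    /// Returns a shared string for `value`, reusing the cached one on a hit.
    fn get(&mut self, value: &str) -> Rc<str> {
        // Strings outside the length bounds bypass the cache entirely.
        if !(MIN_LEN..=MAX_LEN).contains(&value.len()) {
            return Rc::from(value);
        }

        let mut hasher = DefaultHasher::new();
        value.hash(&mut hasher);
        let hash = hasher.finish();
        let slot = (hash as usize) & (CACHE_SLOTS - 1);

        let entry = &mut self.entries[slot];
        if entry.hash == hash {
            if let Some(cached) = &entry.value {
                // Compare the text as well, so a hash collision is never a false hit.
                if &**cached == value {
                    return Rc::clone(cached);
                }
            }
        }

        // Miss: overwrite the slot, evicting any previous occupant.
        let fresh: Rc<str> = Rc::from(value);
        entry.hash = hash;
        entry.value = Some(Rc::clone(&fresh));
        fresh
    }
}
```

Because each slot holds a single entry, memory stays bounded without any eviction bookkeeping. The cost is that two hot strings mapping to the same slot will keep evicting each other.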
@@ -331,18 +252,14 @@ impl<'de, 'ruby> Visitor<'de> for RubyValueVisitor<'ruby> {
331
252
  where
332
253
  E: de::Error,
333
254
  {
334
- let val = interned_key(self.ruby, value)
335
- .unwrap_or_else(|| self.ruby.str_new(value).into_value_with(self.ruby));
336
- Ok(RubyDecodedValue::new(val))
255
+ Ok(RubyDecodedValue::new(cached_string(self.ruby, value)))
337
256
  }
338
257
 
339
258
  fn visit_string<E>(self, value: String) -> Result<Self::Value, E>
340
259
  where
341
260
  E: de::Error,
342
261
  {
343
- let val = interned_key(self.ruby, &value)
344
- .unwrap_or_else(|| self.ruby.str_new(&value).into_value_with(self.ruby));
345
- Ok(RubyDecodedValue::new(val))
262
+ Ok(RubyDecodedValue::new(cached_string(self.ruby, &value)))
346
263
  }
347
264
 
348
265
  fn visit_bytes<E>(self, value: &[u8]) -> Result<Self::Value, E>
@@ -386,10 +303,9 @@ impl<'de, 'ruby> Visitor<'de> for RubyValueVisitor<'ruby> {
386
303
  Some(cap) => self.ruby.hash_new_capa(cap),
387
304
  None => self.ruby.hash_new(),
388
305
  };
389
- while let Some(key) = map.next_key::<Cow<'de, str>>()? {
306
+ while let Some(key) = map.next_key::<&'de str>()? {
390
307
  let value = map.next_value_seed(RubyValueSeed { ruby: self.ruby })?;
391
- let key_val = interned_key(self.ruby, key.as_ref())
392
- .unwrap_or_else(|| self.ruby.str_new(key.as_ref()).into_value_with(self.ruby));
308
+ let key_val = cached_string(self.ruby, key);
393
309
  hash.aset(key_val, value.into_value())
394
310
  .map_err(|e| de::Error::custom(e.to_string()))?;
395
311
  }
@@ -424,28 +340,12 @@ impl ReaderSource {
424
340
  ReaderSource::Mmap(reader) => {
425
341
  let result = reader.lookup(ip)?;
426
342
  let network = result.network()?;
427
- let prefix = network.prefix();
428
-
429
- let prefix_len = if ip.is_ipv4() && network.is_ipv6() {
430
- 0
431
- } else {
432
- prefix as usize
433
- };
434
-
435
- (result.decode()?, prefix_len)
343
+ (result.decode()?, prefix_len_for_ip_network(ip, network))
436
344
  }
437
345
  ReaderSource::Memory(reader) => {
438
346
  let result = reader.lookup(ip)?;
439
347
  let network = result.network()?;
440
- let prefix = network.prefix();
441
-
442
- let prefix_len = if ip.is_ipv4() && network.is_ipv6() {
443
- 0
444
- } else {
445
- prefix as usize
446
- };
447
-
448
- (result.decode()?, prefix_len)
348
+ (result.decode()?, prefix_len_for_ip_network(ip, network))
449
349
  }
450
350
  };
451
351
  Ok((result, prefix_len))
@@ -464,15 +364,12 @@ impl ReaderSource {
464
364
  match self {
465
365
  ReaderSource::Mmap(reader) => {
466
366
  let iter = reader.within(network, Default::default())?;
467
- // SAFETY: the iterator holds a reference into `reader`. We'll store an Arc guard
468
- // alongside it so the reader outlives the transmuted iterator.
469
367
  Ok(ReaderWithin::Mmap(unsafe {
470
368
  std::mem::transmute::<Within<'_, Mmap>, Within<'static, Mmap>>(iter)
471
369
  }))
472
370
  }
473
371
  ReaderSource::Memory(reader) => {
474
372
  let iter = reader.within(network, Default::default())?;
475
- // SAFETY: same as above, the Arc guard keeps the reader alive.
476
373
  Ok(ReaderWithin::Memory(unsafe {
477
374
  std::mem::transmute::<Within<'_, Vec<u8>>, Within<'static, Vec<u8>>>(iter)
478
375
  }))
@@ -490,40 +387,43 @@ enum ReaderWithin {
490
387
  impl ReaderWithin {
491
388
  fn next(&mut self) -> Option<Result<(IpNetwork, RubyDecodedValue), MaxMindDbError>> {
492
389
  match self {
493
- ReaderWithin::Mmap(iter) => loop {
494
- match iter.next() {
495
- None => return None,
496
- Some(Err(e)) => return Some(Err(e)),
497
- Some(Ok(lookup_result)) => {
498
- let network = match lookup_result.network() {
499
- Ok(n) => n,
500
- Err(e) => return Some(Err(e)),
501
- };
502
- match lookup_result.decode::<RubyDecodedValue>() {
503
- Ok(Some(data)) => return Some(Ok((network, data))),
504
- Ok(None) => continue, // Skip networks without data
505
- Err(e) => return Some(Err(e)),
506
- }
507
- }
508
- }
509
- },
510
- ReaderWithin::Memory(iter) => loop {
511
- match iter.next() {
512
- None => return None,
513
- Some(Err(e)) => return Some(Err(e)),
514
- Some(Ok(lookup_result)) => {
515
- let network = match lookup_result.network() {
516
- Ok(n) => n,
517
- Err(e) => return Some(Err(e)),
518
- };
519
- match lookup_result.decode::<RubyDecodedValue>() {
520
- Ok(Some(data)) => return Some(Ok((network, data))),
521
- Ok(None) => continue, // Skip networks without data
522
- Err(e) => return Some(Err(e)),
523
- }
524
- }
390
+ ReaderWithin::Mmap(iter) => next_within_result(iter),
391
+ ReaderWithin::Memory(iter) => next_within_result(iter),
392
+ }
393
+ }
394
+ }
395
+
396
+ #[inline]
397
+ // prefix_len_for_ip_network uses 0 as a sentinel for ip.is_ipv4() && network.is_ipv6().
398
+ // In this case, 0 is not a real prefix length; it signals an IPv4-in-IPv6 mapping path,
399
+ // and callers must treat it specially (distinct from "no network found").
400
+ fn prefix_len_for_ip_network(ip: IpAddr, network: IpNetwork) -> usize {
401
+ if ip.is_ipv4() && network.is_ipv6() {
402
+ 0
403
+ } else {
404
+ network.prefix() as usize
405
+ }
406
+ }
407
+
408
+ #[inline]
409
+ fn next_within_result<S: AsRef<[u8]>>(
410
+ iter: &mut Within<'static, S>,
411
+ ) -> Option<Result<(IpNetwork, RubyDecodedValue), MaxMindDbError>> {
412
+ loop {
413
+ match iter.next() {
414
+ None => return None,
415
+ Some(Err(e)) => return Some(Err(e)),
416
+ Some(Ok(lookup_result)) => {
417
+ let network = match lookup_result.network() {
418
+ Ok(n) => n,
419
+ Err(e) => return Some(Err(e)),
420
+ };
421
+ match lookup_result.decode::<RubyDecodedValue>() {
422
+ Ok(Some(data)) => return Some(Ok((network, data))),
423
+ Ok(None) => continue, // Skip networks without data
424
+ Err(e) => return Some(Err(e)),
525
425
  }
526
- },
426
+ }
527
427
  }
528
428
  }
529
429
  }
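
Note on the hunk above: `prefix_len_for_ip_network` returns `0` as a sentinel when an IPv4 address resolves to an IPv6 network. The sketch below shows that rule in isolation, using only the standard library. `Network` is a hypothetical stand-in for `ipnetwork::IpNetwork`, and `prefix_len_for` is an illustrative name, not part of the gem.

```rust
use std::net::IpAddr;

/// Hypothetical stand-in for `ipnetwork::IpNetwork`: a network address plus prefix length.
struct Network {
    addr: IpAddr,
    prefix: u8,
}

/// Returns 0 when an IPv4 lookup lands in an IPv6 network, otherwise the network's prefix.
fn prefix_len_for(ip: IpAddr, network: &Network) -> usize {
    if ip.is_ipv4() && network.addr.is_ipv6() {
        // Sentinel value: not a real prefix length, so callers must special-case it.
        0
    } else {
        usize::from(network.prefix)
    }
}
```

For example, looking up `1.2.3.4` against a network whose address is `::` with prefix 96 yields `0`, while the same address against `1.2.3.0/24` yields `24`.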
@@ -854,7 +754,6 @@ impl Reader {
854
754
  format!("Failed to iterate: {}", e),
855
755
  )
856
756
  })?;
857
-
858
757
  // Get IPAddr class
859
758
  let ipaddr_class = ruby.class_object().const_get::<_, RClass>("IPAddr")?;
860
759
 
@@ -935,6 +834,32 @@ fn parse_ip_address_fast(value: Value, ruby: &magnus::Ruby) -> Result<IpAddr, Er
935
834
  }
936
835
 
937
836
  // Slow path: Try as IPAddr object
837
+ if let Ok(ipaddr_class) = ruby.class_object().const_get::<_, RClass>("IPAddr") {
838
+ if value.is_kind_of(ipaddr_class) {
839
+ let packed: Value = value.funcall("hton", ())?;
840
+ if let Some(packed_str) = RString::from_value(packed) {
841
+ // SAFETY: `bytes` is used immediately and `packed`/`packed_str` stay alive and
842
+ // unmodified through the end of this match. This block must not introduce calls
843
+ // that could move, collect, or mutate the Ruby string between `as_slice()` and
844
+ // the final byte-pattern match handling.
845
+ let bytes = unsafe { packed_str.as_slice() };
846
+ return match bytes {
847
+ [a, b, c, d] => Ok(IpAddr::from([*a, *b, *c, *d])),
848
+ [a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13, a14, a15] => {
849
+ Ok(IpAddr::from([
850
+ *a0, *a1, *a2, *a3, *a4, *a5, *a6, *a7, *a8, *a9, *a10, *a11, *a12,
851
+ *a13, *a14, *a15,
852
+ ]))
853
+ }
854
+ _ => Err(Error::new(
855
+ ruby.exception_arg_error(),
856
+ format!("'{}' does not appear to be an IPv4 or IPv6 address", value),
857
+ )),
858
+ };
859
+ }
860
+ }
861
+ }
862
+
938
863
  if let Ok(ipaddr_obj) = value.funcall::<_, _, String>("to_s", ()) {
939
864
  return IpAddr::from_str(&ipaddr_obj).map_err(|_| {
940
865
  Error::new(
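
Note on the hunk above: the new `IPAddr` path reads the packed bytes returned by `IPAddr#hton` and builds an address from their length (4 bytes for IPv4, 16 for IPv6) instead of going through `to_s` and string parsing. Here is a minimal std-only sketch of just the byte-to-address step, with the Ruby interaction left out. `ip_from_packed` is a hypothetical helper name, and it returns `None` where the gem raises an argument error.

```rust
use std::net::IpAddr;

/// Builds an address from network-order bytes: 4 bytes for IPv4, 16 for IPv6.
/// Any other length is rejected with `None`.
fn ip_from_packed(bytes: &[u8]) -> Option<IpAddr> {
    match bytes.len() {
        4 => {
            let mut octets = [0u8; 4];
            octets.copy_from_slice(bytes);
            Some(IpAddr::from(octets))
        }
        16 => {
            let mut octets = [0u8; 16];
            octets.copy_from_slice(bytes);
            Some(IpAddr::from(octets))
        }
        _ => None,
    }
}
```

The length check doubles as validation, so no text formatting or parsing is involved in this step.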
@@ -964,24 +889,7 @@ fn ipv6_in_ipv4_error(ip: &IpAddr) -> String {
964
889
  /// Open a MaxMind DB using memory-mapped I/O (MODE_MMAP)
965
890
  fn open_database_mmap(path: &str) -> Result<Reader, Error> {
966
891
  let ruby = magnus::Ruby::get().expect("Ruby VM should be available in Ruby context");
967
-
968
- let file = File::open(Path::new(path)).map_err(|e| match e.kind() {
969
- std::io::ErrorKind::NotFound => {
970
- let errno = ruby
971
- .class_object()
972
- .const_get::<_, RModule>("Errno")
973
- .expect("Errno module should exist");
974
- let enoent = errno
975
- .const_get::<_, RClass>("ENOENT")
976
- .expect("Errno::ENOENT should exist");
977
- Error::new(
978
- ExceptionClass::from_value(enoent.as_value())
979
- .expect("ENOENT should convert to ExceptionClass"),
980
- e.to_string(),
981
- )
982
- }
983
- _ => Error::new(ruby.exception_io_error(), e.to_string()),
984
- })?;
892
+ let file = open_database_file(path, &ruby)?;
985
893
 
986
894
  let mmap = unsafe { Mmap::map(&file) }.map_err(|e| {
987
895
  Error::new(
@@ -989,7 +897,6 @@ fn open_database_mmap(path: &str) -> Result<Reader, Error> {
989
897
  format!("Failed to memory-map database file: {}", e),
990
898
  )
991
899
  })?;
992
-
993
900
  let reader = MaxMindReader::from_source(mmap).map_err(|_| {
994
901
  Error::new(
995
902
  ExceptionClass::from_value(invalid_database_error().as_value())
@@ -1007,24 +914,7 @@ fn open_database_mmap(path: &str) -> Result<Reader, Error> {
1007
914
  /// Open a MaxMind DB by loading entire file into memory (MODE_MEMORY)
1008
915
  fn open_database_memory(path: &str) -> Result<Reader, Error> {
1009
916
  let ruby = magnus::Ruby::get().expect("Ruby VM should be available in Ruby context");
1010
-
1011
- let mut file = File::open(Path::new(path)).map_err(|e| match e.kind() {
1012
- std::io::ErrorKind::NotFound => {
1013
- let errno = ruby
1014
- .class_object()
1015
- .const_get::<_, RModule>("Errno")
1016
- .expect("Errno module should exist");
1017
- let enoent = errno
1018
- .const_get::<_, RClass>("ENOENT")
1019
- .expect("Errno::ENOENT should exist");
1020
- Error::new(
1021
- ExceptionClass::from_value(enoent.as_value())
1022
- .expect("ENOENT should convert to ExceptionClass"),
1023
- e.to_string(),
1024
- )
1025
- }
1026
- _ => Error::new(ruby.exception_io_error(), e.to_string()),
1027
- })?;
917
+ let mut file = open_database_file(path, &ruby)?;
1028
918
 
1029
919
  let mut buffer = Vec::new();
1030
920
  file.read_to_end(&mut buffer).map_err(|e| {
@@ -1048,21 +938,51 @@ fn open_database_memory(path: &str) -> Result<Reader, Error> {
1048
938
  Ok(create_reader(ReaderSource::Memory(reader)))
1049
939
  }
1050
940
 
941
+ fn open_database_file(path: &str, ruby: &magnus::Ruby) -> Result<File, Error> {
942
+ File::open(Path::new(path)).map_err(|e| {
943
+ if e.kind() == std::io::ErrorKind::NotFound {
944
+ open_not_found_error(ruby, e)
945
+ } else {
946
+ Error::new(ruby.exception_io_error(), e.to_string())
947
+ }
948
+ })
949
+ }
950
+
951
+ fn open_not_found_error(ruby: &magnus::Ruby, err: std::io::Error) -> Error {
952
+ let errno = ruby
953
+ .class_object()
954
+ .const_get::<_, RModule>("Errno")
955
+ .expect("Errno module should exist");
956
+ let enoent = errno
957
+ .const_get::<_, RClass>("ENOENT")
958
+ .expect("Errno::ENOENT should exist");
959
+ Error::new(
960
+ ExceptionClass::from_value(enoent.as_value())
961
+ .expect("ENOENT should convert to ExceptionClass"),
962
+ err.to_string(),
963
+ )
964
+ }
965
+
1051
966
  /// Get the InvalidDatabaseError class
1052
967
  fn invalid_database_error() -> RClass {
1053
968
  let ruby = magnus::Ruby::get().expect("Ruby VM should be available in Ruby context");
969
+ let rust = rust_module(&ruby);
970
+ rust.const_get::<_, RClass>("InvalidDatabaseError")
971
+ .expect("InvalidDatabaseError class should exist")
972
+ }
973
+
974
+ fn rust_module(ruby: &magnus::Ruby) -> RModule {
1054
975
  let maxmind = ruby
1055
976
  .class_object()
1056
977
  .const_get::<_, RModule>("MaxMind")
1057
978
  .expect("MaxMind module should exist");
1058
979
  let db = maxmind
1059
- .const_get::<_, RModule>("DB")
1060
- .expect("MaxMind::DB module should exist");
1061
- let rust = db
1062
- .const_get::<_, RModule>("Rust")
1063
- .expect("MaxMind::DB::Rust module should exist");
1064
- rust.const_get::<_, RClass>("InvalidDatabaseError")
1065
- .expect("InvalidDatabaseError class should exist")
980
+ .const_get::<_, Value>("DB")
981
+ .expect("MaxMind::DB constant should exist");
982
+ let rust_value = db
983
+ .funcall::<_, _, Value>("const_get", ("Rust",))
984
+ .expect("MaxMind::DB::Rust constant should exist");
985
+ RModule::from_value(rust_value).expect("MaxMind::DB::Rust should be a module")
1066
986
  }
1067
987
 
1068
988
  #[magnus::init]
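Note on the hunk above: `open_database_file` turns a missing-file error into `Errno::ENOENT` and every other open failure into `IOError`. The classification step can be sketched without the Ruby bindings. This is a minimal std-only illustration, not the gem's code: `OpenError`, `classify`, and `open_file` are hypothetical names standing in for the magnus exception classes and helpers.

```rust
use std::fs::File;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for the two Ruby exception classes used in the diff.
#[derive(Debug, PartialEq)]
enum OpenError {
    /// Corresponds to raising Errno::ENOENT.
    NotFound(String),
    /// Corresponds to raising IOError.
    Io(String),
}

/// Pick the error category from the I/O error kind, keeping the message.
fn classify(err: &io::Error) -> OpenError {
    if err.kind() == io::ErrorKind::NotFound {
        OpenError::NotFound(err.to_string())
    } else {
        OpenError::Io(err.to_string())
    }
}

/// Open a file and map any failure through `classify`.
fn open_file(path: &str) -> Result<File, OpenError> {
    File::open(Path::new(path)).map_err(|e| classify(&e))
}
```

Keeping the branch on `ErrorKind::NotFound` in one helper means callers that rescue a missing path can be handled separately from permission or device errors.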
@@ -1076,11 +996,26 @@ fn init(ruby: &magnus::Ruby) -> Result<(), Error> {
1076
996
  let rust = match db_value {
1077
997
  Ok(existing) if existing.is_kind_of(ruby.class_class()) => {
1078
998
  // MaxMind::DB exists as a Class (official gem loaded first)
1079
- // Define Rust module directly as a constant on the class using funcall
1080
- let rust_mod = ruby.define_module("MaxMindDBRustTemp")?;
1081
- // Use const_set via funcall on the existing class/module
1082
- let _ = existing.funcall::<_, _, Value>("const_set", ("Rust", rust_mod))?;
1083
- rust_mod
999
+ // Reuse existing Rust constant if present to avoid replacing classes.
1000
+ if let Ok(rust_value) = existing.funcall::<_, _, Value>("const_get", ("Rust", false)) {
1001
+ RModule::from_value(rust_value).ok_or_else(|| {
1002
+ Error::new(
1003
+ ruby.exception_type_error(),
1004
+ "MaxMind::DB::Rust exists but is not a module",
1005
+ )
1006
+ })?
1007
+ } else {
1008
+ // Define Rust module directly as a constant on the class.
1009
+ let rust_value: Value = ruby.module_new().as_value();
1010
+ let rust_mod = RModule::from_value(rust_value).ok_or_else(|| {
1011
+ Error::new(
1012
+ ruby.exception_type_error(),
1013
+ "Failed to create anonymous module for MaxMind::DB::Rust",
1014
+ )
1015
+ })?;
1016
+ let _ = existing.funcall::<_, _, Value>("const_set", ("Rust", rust_mod))?;
1017
+ rust_mod
1018
+ }
1084
1019
  }
1085
1020
  Ok(existing) => {
1086
1021
  // MaxMind::DB exists as a Module (our gem loaded first)
@@ -1096,6 +1031,22 @@ fn init(ruby: &magnus::Ruby) -> Result<(), Error> {
1096
1031
  }
1097
1032
  };
1098
1033
 
1034
+ if rust
1035
+ .const_get::<_, Value>(STRING_CACHE_ROOTS_CONST)
1036
+ .is_err()
1037
+ {
1038
+ rust.const_set(STRING_CACHE_ROOTS_CONST, ruby.ary_new())?;
1039
+ }
1040
+
1041
+ if rust.const_get::<_, Value>(MAP_KEY_ROOTS_CONST).is_ok() {
1042
+ let _ = rust.funcall::<_, _, Value>("send", ("remove_const", MAP_KEY_ROOTS_CONST))?;
1043
+ }
1044
+
1045
+ // The extension can be loaded more than once from different paths.
1046
+ // Reusing previously defined classes avoids typed-data incompatibilities.
1047
+ if rust.const_get::<_, Value>("Reader").is_ok() {
1048
+ return Ok(());
1049
+ }
1099
1050
  // Define InvalidDatabaseError
1100
1051
  let runtime_error = ruby.exception_runtime_error();
1101
1052
  rust.define_error("InvalidDatabaseError", runtime_error)?;
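Note on the `init` hunks above: the new code reuses an existing `Rust` constant when it is a module, rejects it when it is not, and returns early when `Reader` is already defined so a second load does not redefine classes. The sketch below models that control flow with a plain map. It is an assumption-laden illustration only: `ConstTable`, `Kind`, `ensure_module`, and this `init` are hypothetical and flatten Ruby's nested constant lookup into a single table.

```rust
use std::collections::HashMap;

/// Hypothetical kinds of constant a namespace can hold.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Module,
    Class,
}

/// Hypothetical stand-in for a Ruby constant table.
#[derive(Default)]
struct ConstTable {
    entries: HashMap<String, Kind>,
}

impl ConstTable {
    /// Reuse `name` if it is already a module, fail if it is something else,
    /// otherwise define it. Returns Ok(true) only when it was newly created.
    fn ensure_module(&mut self, name: &str) -> Result<bool, String> {
        match self.entries.get(name) {
            Some(Kind::Module) => Ok(false),
            Some(_) => Err(format!("{} exists but is not a module", name)),
            None => {
                self.entries.insert(name.to_string(), Kind::Module);
                Ok(true)
            }
        }
    }
}

/// Idempotent initialization: returns Ok(true) on the first load and
/// Ok(false) when `Reader` already exists and nothing is redefined.
fn init(table: &mut ConstTable) -> Result<bool, String> {
    table.ensure_module("Rust")?;
    if table.entries.contains_key("Reader") {
        return Ok(false);
    }
    table.entries.insert("Reader".to_string(), Kind::Class);
    Ok(true)
}
```

Calling `init` twice on the same table leaves it unchanged after the first call, which mirrors the intent stated in the diff's comment about loading the extension from different paths.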
metadata CHANGED
@@ -1,14 +1,13 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: maxmind-db-rust
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.2.1
4
+ version: 0.4.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Gregory Oschwald
8
- autorequire:
9
8
  bindir: bin
10
9
  cert_chain: []
11
- date: 2025-12-18 00:00:00.000000000 Z
10
+ date: 1980-01-02 00:00:00.000000000 Z
12
11
  dependencies:
13
12
  - !ruby/object:Gem::Dependency
14
13
  name: rb_sys
@@ -30,14 +29,14 @@ dependencies:
30
29
  requirements:
31
30
  - - "~>"
32
31
  - !ruby/object:Gem::Version
33
- version: '5.0'
32
+ version: '6.0'
34
33
  type: :development
35
34
  prerelease: false
36
35
  version_requirements: !ruby/object:Gem::Requirement
37
36
  requirements:
38
37
  - - "~>"
39
38
  - !ruby/object:Gem::Version
40
- version: '5.0'
39
+ version: '6.0'
41
40
  - !ruby/object:Gem::Dependency
42
41
  name: rake
43
42
  requirement: !ruby/object:Gem::Requirement
@@ -178,7 +177,6 @@ metadata:
178
177
  homepage_uri: https://github.com/oschwald/maxmind-db-rust-ruby
179
178
  source_code_uri: https://github.com/oschwald/maxmind-db-rust-ruby
180
179
  rubygems_mfa_required: 'true'
181
- post_install_message:
182
180
  rdoc_options: []
183
181
  require_paths:
184
182
  - lib
@@ -193,8 +191,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
193
191
  - !ruby/object:Gem::Version
194
192
  version: '0'
195
193
  requirements: []
196
- rubygems_version: 3.5.22
197
- signing_key:
194
+ rubygems_version: 4.0.6
198
195
  specification_version: 4
199
196
  summary: Unofficial high-performance Rust-based MaxMind DB reader for Ruby
200
197
  test_files: []