maxmind-geoip2 0.3.0 → 0.4.0

Files changed (68)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +5 -0
  3. data/Gemfile +1 -0
  4. data/Gemfile.lock +2 -0
  5. data/lib/maxmind/geoip2/client.rb +23 -10
  6. data/maxmind-geoip2.gemspec +3 -2
  7. metadata +18 -65
  8. data/test/data/LICENSE +0 -4
  9. data/test/data/MaxMind-DB-spec.md +0 -570
  10. data/test/data/MaxMind-DB-test-metadata-pointers.mmdb +0 -0
  11. data/test/data/README.md +0 -4
  12. data/test/data/bad-data/README.md +0 -7
  13. data/test/data/bad-data/libmaxminddb/libmaxminddb-offset-integer-overflow.mmdb +0 -0
  14. data/test/data/bad-data/maxminddb-golang/cyclic-data-structure.mmdb +0 -0
  15. data/test/data/bad-data/maxminddb-golang/invalid-bytes-length.mmdb +0 -1
  16. data/test/data/bad-data/maxminddb-golang/invalid-data-record-offset.mmdb +0 -0
  17. data/test/data/bad-data/maxminddb-golang/invalid-map-key-length.mmdb +0 -0
  18. data/test/data/bad-data/maxminddb-golang/invalid-string-length.mmdb +0 -1
  19. data/test/data/bad-data/maxminddb-golang/metadata-is-an-uint128.mmdb +0 -1
  20. data/test/data/bad-data/maxminddb-golang/unexpected-bytes.mmdb +0 -0
  21. data/test/data/perltidyrc +0 -12
  22. data/test/data/source-data/GeoIP2-Anonymous-IP-Test.json +0 -48
  23. data/test/data/source-data/GeoIP2-City-Test.json +0 -12852
  24. data/test/data/source-data/GeoIP2-Connection-Type-Test.json +0 -102
  25. data/test/data/source-data/GeoIP2-Country-Test.json +0 -15916
  26. data/test/data/source-data/GeoIP2-DensityIncome-Test.json +0 -14
  27. data/test/data/source-data/GeoIP2-Domain-Test.json +0 -452
  28. data/test/data/source-data/GeoIP2-Enterprise-Test.json +0 -687
  29. data/test/data/source-data/GeoIP2-ISP-Test.json +0 -12593
  30. data/test/data/source-data/GeoIP2-Precision-Enterprise-Test.json +0 -2061
  31. data/test/data/source-data/GeoIP2-Static-IP-Score-Test.json +0 -2132
  32. data/test/data/source-data/GeoIP2-User-Count-Test.json +0 -2837
  33. data/test/data/source-data/GeoLite2-ASN-Test.json +0 -37
  34. data/test/data/source-data/README +0 -15
  35. data/test/data/test-data/GeoIP2-Anonymous-IP-Test.mmdb +0 -0
  36. data/test/data/test-data/GeoIP2-City-Test-Broken-Double-Format.mmdb +0 -0
  37. data/test/data/test-data/GeoIP2-City-Test-Invalid-Node-Count.mmdb +0 -0
  38. data/test/data/test-data/GeoIP2-City-Test.mmdb +0 -0
  39. data/test/data/test-data/GeoIP2-Connection-Type-Test.mmdb +0 -0
  40. data/test/data/test-data/GeoIP2-Country-Test.mmdb +0 -0
  41. data/test/data/test-data/GeoIP2-DensityIncome-Test.mmdb +0 -0
  42. data/test/data/test-data/GeoIP2-Domain-Test.mmdb +0 -0
  43. data/test/data/test-data/GeoIP2-Enterprise-Test.mmdb +0 -0
  44. data/test/data/test-data/GeoIP2-ISP-Test.mmdb +0 -0
  45. data/test/data/test-data/GeoIP2-Precision-Enterprise-Test.mmdb +0 -0
  46. data/test/data/test-data/GeoIP2-Static-IP-Score-Test.mmdb +0 -0
  47. data/test/data/test-data/GeoIP2-User-Count-Test.mmdb +0 -0
  48. data/test/data/test-data/GeoLite2-ASN-Test.mmdb +0 -0
  49. data/test/data/test-data/MaxMind-DB-no-ipv4-search-tree.mmdb +0 -0
  50. data/test/data/test-data/MaxMind-DB-string-value-entries.mmdb +0 -0
  51. data/test/data/test-data/MaxMind-DB-test-broken-pointers-24.mmdb +0 -0
  52. data/test/data/test-data/MaxMind-DB-test-broken-search-tree-24.mmdb +0 -0
  53. data/test/data/test-data/MaxMind-DB-test-decoder.mmdb +0 -0
  54. data/test/data/test-data/MaxMind-DB-test-ipv4-24.mmdb +0 -0
  55. data/test/data/test-data/MaxMind-DB-test-ipv4-28.mmdb +0 -0
  56. data/test/data/test-data/MaxMind-DB-test-ipv4-32.mmdb +0 -0
  57. data/test/data/test-data/MaxMind-DB-test-ipv6-24.mmdb +0 -0
  58. data/test/data/test-data/MaxMind-DB-test-ipv6-28.mmdb +0 -0
  59. data/test/data/test-data/MaxMind-DB-test-ipv6-32.mmdb +0 -0
  60. data/test/data/test-data/MaxMind-DB-test-metadata-pointers.mmdb +0 -0
  61. data/test/data/test-data/MaxMind-DB-test-mixed-24.mmdb +0 -0
  62. data/test/data/test-data/MaxMind-DB-test-mixed-28.mmdb +0 -0
  63. data/test/data/test-data/MaxMind-DB-test-mixed-32.mmdb +0 -0
  64. data/test/data/test-data/MaxMind-DB-test-nested.mmdb +0 -0
  65. data/test/data/test-data/README.md +0 -26
  66. data/test/data/test-data/maps-with-pointers.raw +0 -0
  67. data/test/data/test-data/write-test-data.pl +0 -641
  68. data/test/data/tidyall.ini +0 -5
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 42699226c0ab8ec1152f4c56929c830916d7c226097082ae95959199d8796aa8
4
- data.tar.gz: 97d399a1c58fb932171cfb4aa5c6b19cfb64af382771e2cdfc693e72bba2abd1
3
+ metadata.gz: 0a47cb7dfbd53cdb3f5f0f25fb5f68905707fe8740aa1d67e3ff95253e81a38f
4
+ data.tar.gz: 394081ede69fde35160b2229a8d4ebc71db53d1d2da2d1e4d082f143cfae5118
5
5
  SHA512:
6
- metadata.gz: cd7fac3bc1738b562586e23722eb62919a3d157a32619511ca2246b0b23c91eabcf9a3565e9f33ff080680bdb9e42a6cfaf984689de9231e8cc2b4edab5b5fa9
7
- data.tar.gz: 05ce00df88e5de2ecfc7fceacc868fdf5d0b371091d7da5aae63089f9699c72195aaf70175fcdd8e16c3af4d251cd2cf8b78f06e462c2936ea66cc82a4ed622d
6
+ metadata.gz: c0002fb43de06a93818ad574e93fd967dfa215f3fdd8d4e1d3ed4fd5e46465996dce74e3f0140a2b3779e9c346923031c7900db525abe179595fb3b6b80db2c6
7
+ data.tar.gz: 91a6cd3cf34eb576b5778827936b6379b751758e25dda5bd59df510f0623c37ec7bd725c1e75c308bbf3df3ff1797a3e830042fe8913ca7e8d04ca68ab273712
data/CHANGELOG.md CHANGED
@@ -1,5 +1,10 @@
1
1
  # Changelog
2
2
 
3
+ ## 0.4.0 (2020-03-06)
4
+
5
+ * HTTP connections are now persistent. There is a new parameter that
6
+ controls the maximum number of connections the client will use.
7
+
3
8
  ## 0.3.0 (2020-03-04)
4
9
 
5
10
  * Modules are now always be defined. Previously we used a shorthand syntax
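
The 0.4.0 entry above describes a bounded pool of persistent connections. The sketch below illustrates that pattern using only the Ruby standard library. It is not the `connection_pool` gem or the gem's own client code: `TinyPool` and `FakeConnection` are hypothetical names, and unlike a production pool this one creates all of its connections up front and has no checkout timeout.

```ruby
# Hypothetical stand-in for a persistent HTTP client; it counts how many
# requests were sent over it so that reuse is visible.
class FakeConnection
  attr_reader :requests

  def initialize
    @requests = 0
  end

  def get(path)
    @requests += 1
    "GET #{path}"
  end
end

# Hypothetical fixed-size pool: exactly `size` connections exist, and each
# one is held by a single caller at a time.
class TinyPool
  def initialize(size:)
    @queue = Queue.new
    # The block given to `new` builds one connection per call.
    size.times { @queue << yield }
  end

  # Borrow a connection for the duration of the block, then hand it back,
  # even if the block raises.
  def with
    conn = @queue.pop
    begin
      yield conn
    ensure
      @queue << conn
    end
  end
end

CREATED = []
POOL = TinyPool.new(size: 2) do
  conn = FakeConnection.new
  CREATED << conn
  conn
end

# Five requests are served by only two connections.
RESPONSES = Array.new(5) do |i|
  POOL.with { |client| client.get("/geoip/v2.1/country/192.0.2.#{i}") }
end
```

In the gem itself, the diff below shows the pool size being set through the new `pool_size:` keyword of `MaxMind::GeoIP2::Client.new`, which defaults to 5.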
data/Gemfile CHANGED
@@ -2,6 +2,7 @@
2
2
 
3
3
  source 'https://rubygems.org'
4
4
 
5
+ gem 'connection_pool'
5
6
  gem 'http'
6
7
  gem 'maxmind-db'
7
8
  gem 'minitest', group: :development
data/Gemfile.lock CHANGED
@@ -4,6 +4,7 @@ GEM
4
4
  addressable (2.7.0)
5
5
  public_suffix (>= 2.0.2, < 5.0)
6
6
  ast (2.4.0)
7
+ connection_pool (2.2.2)
7
8
  crack (0.4.3)
8
9
  safe_yaml (~> 1.0.0)
9
10
  domain_name (0.5.20190701)
@@ -58,6 +59,7 @@ PLATFORMS
58
59
  ruby
59
60
 
60
61
  DEPENDENCIES
62
+ connection_pool
61
63
  http
62
64
  maxmind-db
63
65
  minitest
data/lib/maxmind/geoip2/client.rb CHANGED
@@ -1,5 +1,6 @@
1
1
  # frozen_string_literal: true
2
2
 
3
+ require 'connection_pool'
3
4
  require 'http'
4
5
  require 'json'
5
6
  require 'maxmind/geoip2/errors'
@@ -79,6 +80,8 @@ module MaxMind
79
80
  # @param proxy_username [String] proxy username to use, if any.
80
81
  #
81
82
  # @param proxy_password [String] proxy password to use, if any.
83
+ #
84
+ # @param pool_size [Integer] HTTP connection pool size
82
85
  def initialize(
83
86
  account_id:,
84
87
  license_key:,
@@ -88,7 +91,8 @@ module MaxMind
88
91
  proxy_address: '',
89
92
  proxy_port: 0,
90
93
  proxy_username: '',
91
- proxy_password: ''
94
+ proxy_password: '',
95
+ pool_size: 5
92
96
  )
93
97
  @account_id = account_id
94
98
  @license_key = license_key
@@ -99,6 +103,11 @@ module MaxMind
99
103
  @proxy_port = proxy_port
100
104
  @proxy_username = proxy_username
101
105
  @proxy_password = proxy_password
106
+ @pool_size = pool_size
107
+
108
+ @connection_pool = ConnectionPool.new(size: @pool_size) do
109
+ make_http_client.persistent("https://#{@host}")
110
+ end
102
111
  end
103
112
  # rubocop:enable Metrics/ParameterLists
104
113
 
@@ -221,11 +230,7 @@ module MaxMind
221
230
  model_class.new(record, @locales)
222
231
  end
223
232
 
224
- # rubocop:disable Metrics/CyclomaticComplexity
225
- # rubocop:disable Metrics/PerceivedComplexity
226
- def get(endpoint, ip_address)
227
- url = 'https://' + @host + '/geoip/v2.1/' + endpoint + '/' + ip_address
228
-
233
+ def make_http_client
229
234
  headers = HTTP.basic_auth(user: @account_id, pass: @license_key)
230
235
  .headers(
231
236
  accept: 'application/json',
@@ -242,9 +247,19 @@ module MaxMind
242
247
  proxy = timeout.via(@proxy_address, opts)
243
248
  end
244
249
 
245
- response = proxy.get(url)
250
+ proxy
251
+ end
252
+
253
+ def get(endpoint, ip_address)
254
+ url = '/geoip/v2.1/' + endpoint + '/' + ip_address
255
+
256
+ response = nil
257
+ body = nil
258
+ @connection_pool.with do |client|
259
+ response = client.get(url)
260
+ body = response.to_s
261
+ end
246
262
 
247
- body = response.to_s
248
263
  is_json = response.headers[:content_type]&.include?('json')
249
264
 
250
265
  if response.status.client_error?
@@ -263,8 +278,6 @@ module MaxMind
263
278
 
264
279
  handle_success(endpoint, body, is_json)
265
280
  end
266
- # rubocop:enable Metrics/CyclomaticComplexity
267
- # rubocop:enable Metrics/PerceivedComplexity
268
281
 
269
282
  # rubocop:disable Metrics/CyclomaticComplexity
270
283
  def handle_client_error(endpoint, status, body, is_json)
data/maxmind-geoip2.gemspec CHANGED
@@ -5,7 +5,7 @@ Gem::Specification.new do |s|
5
5
  s.files = Dir['**/*']
6
6
  s.name = 'maxmind-geoip2'
7
7
  s.summary = 'A gem for interacting with the GeoIP2 webservices and databases.'
8
- s.version = '0.3.0'
8
+ s.version = '0.4.0'
9
9
 
10
10
  s.description = 'A gem for interacting with the GeoIP2 webservices and databases. MaxMind provides geolocation data as downloadable databases as well as through a webservice.'
11
11
  s.email = 'support@maxmind.com'
@@ -14,12 +14,13 @@ Gem::Specification.new do |s|
14
14
  s.metadata = {
15
15
  'bug_tracker_uri' => 'https://github.com/maxmind/GeoIP2-ruby/issues',
16
16
  'changelog_uri' => 'https://github.com/maxmind/GeoIP2-ruby/blob/master/CHANGELOG.md',
17
- 'documentation_uri' => 'https://github.com/maxmind/GeoIP2-ruby',
17
+ 'documentation_uri' => 'https://www.rubydoc.info/gems/maxmind-geoip2',
18
18
  'homepage_uri' => 'https://github.com/maxmind/GeoIP2-ruby',
19
19
  'source_code_uri' => 'https://github.com/maxmind/GeoIP2-ruby',
20
20
  }
21
21
  s.required_ruby_version = '>= 2.4.0'
22
22
 
23
+ s.add_runtime_dependency 'connection_pool', ['~> 2.2']
23
24
  s.add_runtime_dependency 'http', ['~> 4.3']
24
25
  s.add_runtime_dependency 'maxmind-db', ['~> 1.1']
25
26
  end
metadata CHANGED
@@ -1,15 +1,29 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: maxmind-geoip2
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.3.0
4
+ version: 0.4.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - William Storey
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2020-03-04 00:00:00.000000000 Z
11
+ date: 2020-03-06 00:00:00.000000000 Z
12
12
  dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: connection_pool
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - "~>"
18
+ - !ruby/object:Gem::Version
19
+ version: '2.2'
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - "~>"
25
+ - !ruby/object:Gem::Version
26
+ version: '2.2'
13
27
  - !ruby/object:Gem::Dependency
14
28
  name: http
15
29
  requirement: !ruby/object:Gem::Requirement
@@ -79,67 +93,6 @@ files:
79
93
  - lib/maxmind/geoip2/record/subdivision.rb
80
94
  - lib/maxmind/geoip2/record/traits.rb
81
95
  - maxmind-geoip2.gemspec
82
- - test/data/LICENSE
83
- - test/data/MaxMind-DB-spec.md
84
- - test/data/MaxMind-DB-test-metadata-pointers.mmdb
85
- - test/data/README.md
86
- - test/data/bad-data/README.md
87
- - test/data/bad-data/libmaxminddb/libmaxminddb-offset-integer-overflow.mmdb
88
- - test/data/bad-data/maxminddb-golang/cyclic-data-structure.mmdb
89
- - test/data/bad-data/maxminddb-golang/invalid-bytes-length.mmdb
90
- - test/data/bad-data/maxminddb-golang/invalid-data-record-offset.mmdb
91
- - test/data/bad-data/maxminddb-golang/invalid-map-key-length.mmdb
92
- - test/data/bad-data/maxminddb-golang/invalid-string-length.mmdb
93
- - test/data/bad-data/maxminddb-golang/metadata-is-an-uint128.mmdb
94
- - test/data/bad-data/maxminddb-golang/unexpected-bytes.mmdb
95
- - test/data/perltidyrc
96
- - test/data/source-data/GeoIP2-Anonymous-IP-Test.json
97
- - test/data/source-data/GeoIP2-City-Test.json
98
- - test/data/source-data/GeoIP2-Connection-Type-Test.json
99
- - test/data/source-data/GeoIP2-Country-Test.json
100
- - test/data/source-data/GeoIP2-DensityIncome-Test.json
101
- - test/data/source-data/GeoIP2-Domain-Test.json
102
- - test/data/source-data/GeoIP2-Enterprise-Test.json
103
- - test/data/source-data/GeoIP2-ISP-Test.json
104
- - test/data/source-data/GeoIP2-Precision-Enterprise-Test.json
105
- - test/data/source-data/GeoIP2-Static-IP-Score-Test.json
106
- - test/data/source-data/GeoIP2-User-Count-Test.json
107
- - test/data/source-data/GeoLite2-ASN-Test.json
108
- - test/data/source-data/README
109
- - test/data/test-data/GeoIP2-Anonymous-IP-Test.mmdb
110
- - test/data/test-data/GeoIP2-City-Test-Broken-Double-Format.mmdb
111
- - test/data/test-data/GeoIP2-City-Test-Invalid-Node-Count.mmdb
112
- - test/data/test-data/GeoIP2-City-Test.mmdb
113
- - test/data/test-data/GeoIP2-Connection-Type-Test.mmdb
114
- - test/data/test-data/GeoIP2-Country-Test.mmdb
115
- - test/data/test-data/GeoIP2-DensityIncome-Test.mmdb
116
- - test/data/test-data/GeoIP2-Domain-Test.mmdb
117
- - test/data/test-data/GeoIP2-Enterprise-Test.mmdb
118
- - test/data/test-data/GeoIP2-ISP-Test.mmdb
119
- - test/data/test-data/GeoIP2-Precision-Enterprise-Test.mmdb
120
- - test/data/test-data/GeoIP2-Static-IP-Score-Test.mmdb
121
- - test/data/test-data/GeoIP2-User-Count-Test.mmdb
122
- - test/data/test-data/GeoLite2-ASN-Test.mmdb
123
- - test/data/test-data/MaxMind-DB-no-ipv4-search-tree.mmdb
124
- - test/data/test-data/MaxMind-DB-string-value-entries.mmdb
125
- - test/data/test-data/MaxMind-DB-test-broken-pointers-24.mmdb
126
- - test/data/test-data/MaxMind-DB-test-broken-search-tree-24.mmdb
127
- - test/data/test-data/MaxMind-DB-test-decoder.mmdb
128
- - test/data/test-data/MaxMind-DB-test-ipv4-24.mmdb
129
- - test/data/test-data/MaxMind-DB-test-ipv4-28.mmdb
130
- - test/data/test-data/MaxMind-DB-test-ipv4-32.mmdb
131
- - test/data/test-data/MaxMind-DB-test-ipv6-24.mmdb
132
- - test/data/test-data/MaxMind-DB-test-ipv6-28.mmdb
133
- - test/data/test-data/MaxMind-DB-test-ipv6-32.mmdb
134
- - test/data/test-data/MaxMind-DB-test-metadata-pointers.mmdb
135
- - test/data/test-data/MaxMind-DB-test-mixed-24.mmdb
136
- - test/data/test-data/MaxMind-DB-test-mixed-28.mmdb
137
- - test/data/test-data/MaxMind-DB-test-mixed-32.mmdb
138
- - test/data/test-data/MaxMind-DB-test-nested.mmdb
139
- - test/data/test-data/README.md
140
- - test/data/test-data/maps-with-pointers.raw
141
- - test/data/test-data/write-test-data.pl
142
- - test/data/tidyall.ini
143
96
  - test/test_client.rb
144
97
  - test/test_model_country.rb
145
98
  - test/test_model_names.rb
@@ -151,7 +104,7 @@ licenses:
151
104
  metadata:
152
105
  bug_tracker_uri: https://github.com/maxmind/GeoIP2-ruby/issues
153
106
  changelog_uri: https://github.com/maxmind/GeoIP2-ruby/blob/master/CHANGELOG.md
154
- documentation_uri: https://github.com/maxmind/GeoIP2-ruby
107
+ documentation_uri: https://www.rubydoc.info/gems/maxmind-geoip2
155
108
  homepage_uri: https://github.com/maxmind/GeoIP2-ruby
156
109
  source_code_uri: https://github.com/maxmind/GeoIP2-ruby
157
110
  post_install_message:
@@ -170,7 +123,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
170
123
  version: '0'
171
124
  requirements: []
172
125
  rubyforge_project:
173
- rubygems_version: 2.7.6
126
+ rubygems_version: 2.7.6.2
174
127
  signing_key:
175
128
  specification_version: 4
176
129
  summary: A gem for interacting with the GeoIP2 webservices and databases.
@@ -1,4 +0,0 @@
1
- This work is licensed under the Creative Commons Attribution-ShareAlike 3.0
2
- Unported License. To view a copy of this license, visit
3
- http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative
4
- Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.
@@ -1,570 +0,0 @@
1
- ---
2
- layout: default
3
- title: MaxMind DB File Format Specification
4
- version: v2.0
5
- ---
6
- # MaxMind DB File Format Specification
7
-
8
- ## Description
9
-
10
- The MaxMind DB file format is a database format that maps IPv4 and IPv6
11
- addresses to data records using an efficient binary search tree.
12
-
13
- ## Version
14
-
15
- This spec documents **version 2.0** of the MaxMind DB binary format.
16
-
17
- The version number consists of separate major and minor version numbers. It
18
- should not be considered a decimal number. In other words, version 2.10 comes
19
- after version 2.9.
20
-
21
- Code which is capable of reading a given major version of the format should
22
- not be broken by minor version changes to the format.
23
-
24
- ## Overview
25
-
26
- The binary database is split into three parts:
27
-
28
- 1. The binary search tree. Each level of the tree corresponds to a single bit
29
- in the 128 bit representation of an IPv6 address.
30
- 2. The data section. These are the values returned to the client for a
31
- specific IP address, e.g. "US", "New York", or a more complex map type made up
32
- of multiple fields.
33
- 3. Database metadata. Information about the database itself.
34
-
35
- ## Database Metadata
36
-
37
- This portion of the database is stored at the end of the file. It is
38
- documented first because understanding some of the metadata is key to
39
- understanding how the other sections work.
40
-
41
- This section can be found by looking for a binary sequence matching
42
- "\xab\xcd\xefMaxMind.com". The *last* occurrence of this string in the file
43
- marks the end of the data section and the beginning of the metadata. Since we
44
- allow for arbitrary binary data in the data section, some other piece of data
45
- could contain these values. This is why you need to find the last occurrence
46
- of this sequence.
47
-
48
- The maximum allowable size for the metadata section, including the marker that
49
- starts the metadata, is 128KiB.
50
-
51
- The metadata is stored as a map data structure. This structure is described
52
- later in the spec. Changing a key's data type or removing a key would
53
- constitute a major version change for this spec.
54
-
55
- Except where otherwise specified, each key listed is required for the database
56
- to be considered valid.
57
-
58
- Adding a key constitutes a minor version change. Removing a key or changing
59
- its type constitutes a major version change.
60
-
61
- The list of known keys for the current version of the format is as follows:
62
-
63
- ### node\_count
64
-
65
- This is an unsigned 32-bit integer indicating the number of nodes in the
66
- search tree.
67
-
68
- ### record\_size
69
-
70
- This is an unsigned 16-bit integer. It indicates the number of bits in a
71
- record in the search tree. Note that each node consists of *two* records.
72
-
73
- ### ip\_version
74
-
75
- This is an unsigned 16-bit integer which is always 4 or 6. It indicates
76
- whether the database contains IPv4 or IPv6 address data.
77
-
78
- ### database\_type
79
-
80
- This is a string that indicates the structure of each data record associated
81
- with an IP address. The actual definition of these structures is left up to
82
- the database creator.
83
-
84
- Names starting with "GeoIP" are reserved for use by MaxMind (and "GeoIP" is a
85
- trademark anyway).
86
-
87
- ### languages
88
-
89
- An array of strings, each of which is a locale code. A given record may
90
- contain data items that have been localized to some or all of these
91
- locales. Records should not contain localized data for locales not included in
92
- this array.
93
-
94
- This is an optional key, as this may not be relevant for all types of data.
95
-
96
- ### binary\_format\_major\_version
97
-
98
- This is an unsigned 16-bit integer indicating the major version number for the
99
- database's binary format.
100
-
101
- ### binary\_format\_minor\_version
102
-
103
- This is an unsigned 16-bit integer indicating the minor version number for the
104
- database's binary format.
105
-
106
- ### build\_epoch
107
-
108
- This is an unsigned 64-bit integer that contains the database build timestamp
109
- as a Unix epoch value.
110
-
111
- ### description
112
-
113
- This key will always point to a map. The keys of that map will be language
114
- codes, and the values will be a description in that language as a UTF-8
115
- string.
116
-
117
- The codes may include additional information such as script or country
118
- identifiers, like "zh-TW" or "mn-Cyrl-MN". The additional identifiers will be
119
- separated by a dash character ("-").
120
-
121
- This key is optional. However, creators of databases are strongly
122
- encouraged to include a description in at least one language.
123
-
124
- ### Calculating the Search Tree Section Size
125
-
126
- The formula for calculating the search tree section size *in bytes* is as
127
- follows:
128
-
129
- ( ( $record_size * 2 ) / 8 ) * $number_of_nodes
130
-
131
- The end of the search tree marks the beginning of the data section.
132
-
133
- ## Binary Search Tree Section
134
-
135
- The database file starts with a binary search tree. The number of nodes in the
136
- tree is dependent on how many unique netblocks are needed for the particular
137
- database. For example, the city database needs many more small netblocks than
138
- the country database.
139
-
140
- The top most node is always located at the beginning of the search tree
141
- section's address space. The top node is node 0.
142
-
143
- Each node consists of two records, each of which is a pointer to an address in
144
- the file.
145
-
146
- The pointers can point to one of three things. First, it may point to another
147
- node in the search tree address space. These pointers are followed as part of
148
- the IP address search algorithm, described below.
149
-
150
- The pointer can point to a value equal to `$number_of_nodes`. If this is the
151
- case, it means that the IP address we are searching for is not in the
152
- database.
153
-
154
- Finally, it may point to an address in the data section. This is the data
155
- relevant to the given netblock.
156
-
157
- ### Node Layout
158
-
159
- Each node in the search tree consists of two records, each of which is a
160
- pointer. The record size varies by database, but inside a single database node
161
- records are always the same size. A record may be anywhere from 24 to 128 bits
162
- long, depending on the number of nodes in the tree. These pointers are
163
- stored in big-endian format (most significant byte first).
164
-
165
- Here are some examples of how the records are laid out in a node for 24, 28,
166
- and 32 bit records. Larger record sizes follow this same pattern.
167
-
168
- #### 24 bits (small database), one node is 6 bytes
169
-
170
- | <------------- node --------------->|
171
- | 23 .. 0 | 23 .. 0 |
172
-
173
- #### 28 bits (medium database), one node is 7 bytes
174
-
175
- | <------------- node --------------->|
176
- | 23 .. 0 | 27..24 | 27..24 | 23 .. 0 |
177
-
178
- Note 4 bits of each pointer are combined into the middle byte. For both
179
- records, they are prepended and end up in the most significant position.
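
The 28-bit layout described above can be sketched in Ruby as follows. This is an illustration, not code from the gem: `read_node_28` and the sample node are hypothetical. Reading the diagram left to right, it assumes the upper four bits of the middle byte belong to the left record and the lower four bits to the right record.

```ruby
# Split one 7-byte node (two 28-bit records) into its left and right
# record values. Bytes are big-endian, as the spec text states.
def read_node_28(node_bytes)
  b = node_bytes.bytes
  raise ArgumentError, 'expected 7 bytes' unless b.size == 7

  middle = b[3]
  # The high nibble of the middle byte becomes bits 27..24 of the left record.
  left = ((middle & 0xF0) << 20) | (b[0] << 16) | (b[1] << 8) | b[2]
  # The low nibble of the middle byte becomes bits 27..24 of the right record.
  right = ((middle & 0x0F) << 24) | (b[4] << 16) | (b[5] << 8) | b[6]
  [left, right]
end

# Hypothetical node holding left = 0x1ABCDEF and right = 0x2123456.
NODE = [0xAB, 0xCD, 0xEF, 0x12, 0x12, 0x34, 0x56].pack('C*')
LEFT, RIGHT = read_node_28(NODE)
```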
180
-
181
- #### 32 bits (large database), one node is 8 bytes
182
-
183
- | <------------- node --------------->|
184
- | 31 .. 0 | 31 .. 0 |
185
-
186
- ### Search Lookup Algorithm
187
-
188
- The first step is to convert the IP address to its big-endian binary
189
- representation. For an IPv4 address, this becomes 32 bits. For IPv6 you get
190
- 128 bits.
191
-
192
- The leftmost bit corresponds to the first node in the search tree. For each
193
- bit, a value of 0 means we choose the left record in a node, and a value of 1
194
- means we choose the right record.
195
-
196
- The record value is always interpreted as an unsigned integer. The maximum
197
- size of the integer is dependent on the number of bits in a record (24, 28, or
198
- 32).
199
-
200
- If the record value is a number that is less than the *number of nodes* (not
201
- in bytes, but the actual node count) in the search tree (this is stored in the
202
- database metadata), then the value is a node number. In this case, we find
203
- that node in the search tree and repeat the lookup algorithm from there.
204
-
205
- If the record value is equal to the number of nodes, that means that we do not
206
- have any data for the IP address, and the search ends here.
207
-
208
- If the record value is *greater* than the number of nodes in the search tree,
209
- then it is an actual pointer value pointing into the data section. The value
210
- of the pointer is relative to the start of the data section, *not* the
211
- start of the file.
212
-
213
- In order to determine where in the data section we should start looking, we use
214
- the following formula:
215
-
216
- $data_section_offset = ( $record_value - $node_count ) - 16
217
-
218
- The 16 is the size of the data section separator. We subtract it because we
219
- want to permit pointing to the first byte of the data section. Recall that
220
- the record value cannot equal the node count as that means there is no
221
- data. Instead, we choose to start values that go to the data section at
222
- `$node_count + 16`. (This has the side effect that record values
223
- `$node_count + 1` through `$node_count + 15` inclusive are not valid).
224
-
225
- This is best demonstrated by an example:
226
-
227
- Let's assume we have a 24-bit tree with 1,000 nodes. Each node contains 48
228
- bits, or 6 bytes. The size of the tree is 6,000 bytes.
229
-
230
- When a record in the tree contains a number that is less than 1,000, this
231
- is a *node number*, and we look up that node. If a record contains a value
232
- greater than or equal to 1,016, we know that it is a data section value. We
233
- subtract the node count (1,000) and then subtract 16 for the data section
234
- separator, giving us the number 0, the first byte of the data section.
235
-
236
- If a record contained the value 6,000, this formula would give us an offset of
237
- 4,984 into the data section.
238
-
239
- In order to determine where in the file this offset really points to, we also
240
- need to know where the data section starts. This can be calculated by
241
- determining the size of the search tree in bytes and then adding an additional
242
- 16 bytes for the data section separator:
243
-
244
- $offset_in_file = $data_section_offset
245
- + $search_tree_size_in_bytes
246
- + 16
247
-
248
- Since we subtract and then add 16, the final formula to determine the
249
- offset in the file can be simplified to:
250
-
251
- $offset_in_file = ( $record_value - $node_count )
252
- + $search_tree_size_in_bytes
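
The formulas above, applied to the spec's own example (a 24-bit tree with 1,000 nodes), can be checked with a short Ruby sketch. The method and constant names are hypothetical and only mirror the `$variables` used in the spec text.

```ruby
SEPARATOR_SIZE = 16 # bytes of NULLs between the search tree and the data section

# ( ( $record_size * 2 ) / 8 ) * $number_of_nodes
def search_tree_size(record_size, node_count)
  (record_size * 2 / 8) * node_count
end

# $data_section_offset = ( $record_value - $node_count ) - 16
def data_section_offset(record_value, node_count)
  if record_value < node_count + SEPARATOR_SIZE
    raise ArgumentError, 'record value does not point into the data section'
  end

  record_value - node_count - SEPARATOR_SIZE
end

# Simplified form: ( $record_value - $node_count ) + $search_tree_size_in_bytes
def offset_in_file(record_value, record_size, node_count)
  data_section_offset(record_value, node_count) # validates the record value
  (record_value - node_count) + search_tree_size(record_size, node_count)
end

TREE_SIZE = search_tree_size(24, 1_000)             # 6,000 bytes
FIRST_DATA_BYTE = data_section_offset(1_016, 1_000) # offset 0
EXAMPLE_OFFSET = data_section_offset(6_000, 1_000)  # offset 4,984
FIRST_FILE_OFFSET = offset_in_file(1_016, 24, 1_000)
```

The first byte of the data section lands at file offset 6,016, which is the tree size plus the 16-byte separator.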
253
-
254
- ### IPv4 addresses in an IPv6 tree
255
-
256
- When storing IPv4 addresses in an IPv6 tree, they are stored as-is, so they
257
- occupy the first 32-bits of the address space (from 0 to 2**32 - 1).
258
-
259
- Creators of databases should decide on a strategy for handling the various
260
- mappings between IPv4 and IPv6.
261
-
262
- The strategy that MaxMind uses for its GeoIP databases is to include a pointer
263
- from the `::ffff:0:0/96` subnet to the root node of the IPv4 address space in
264
- the tree. This accounts for the
265
- [IPv4-mapped IPv6 address](http://en.wikipedia.org/wiki/IPv6#IPv4-mapped_IPv6_addresses).
266
-
267
- MaxMind also includes a pointer from the `2002::/16` subnet to the root node
268
- of the IPv4 address space in the tree. This accounts for the
269
- [6to4 mapping](http://en.wikipedia.org/wiki/6to4) subnet.
270
-
271
- Database creators are encouraged to document whether they are doing something
272
- similar for their databases.
273
-
274
- The Teredo subnet cannot be accounted for in the tree. Instead, code that
275
- searches the tree can offer to decode the IPv4 portion of a Teredo address and
276
- look that up.
277
-
278
- ## Data Section Separator
279
-
280
- There are 16 bytes of NULLs in between the search tree and the data
281
- section. This separator exists in order to make it possible for a verification
282
- tool to distinguish between the two sections.
283
-
284
- This separator is not considered part of the data section itself. In other
285
- words, the data section starts at `$size_of_search_tree + 16` bytes in the
286
- file.
287
-
288
- ## Output Data Section
289
-
290
- Each output data field has an associated type, and that type is encoded as a
291
- number that begins the data field. Some types are variable length. In those
292
- cases, the type indicator is also followed by a length. The data payload
293
- always comes at the end of the field.
294
-
295
- All binary data is stored in big-endian format.
296
-
297
- Note that the *interpretation* of a given data type's meaning is decided by
298
- higher-level APIs, not by the binary format itself.
299
-
300
- ### pointer - 1
301
-
302
- A pointer to another part of the data section's address space. The pointer
303
- will point to the beginning of a field. It is illegal for a pointer to point
304
- to another pointer.
305
-
306
- Pointer values start from the beginning of the data section, *not* the
307
- beginning of the file.
308
-
309
- ### UTF-8 string - 2
310
-
311
- A variable length byte sequence that contains valid utf8. If the length is
312
- zero then this is an empty string.
313
-
314
- ### double - 3
315
-
316
- This is stored as an IEEE-754 double (binary64) in big-endian format. The
317
- length of a double is always 8 bytes.
318
-
319
- ### bytes - 4
320
-
321
- A variable length byte sequence containing any sort of binary data. If the
322
- length is zero then this a zero-length byte sequence.
323
-
324
- This is not currently used but may be used in the future to embed non-text
325
- data (images, etc.).
326
-
327
- ### integer formats
328
-
329
- Integers are stored in variable length binary fields.
330
-
331
- We support 16-bit, 32-bit, 64-bit, and 128-bit unsigned integers. We also
332
- support 32-bit signed integers.
333
-
334
- A 128-bit integer can use up to 16 bytes, but may use fewer. Similarly, a
335
- 32-bit integer may use from 0-4 bytes. The number of bytes used is determined
336
- by the length specifier in the control byte. See below for details.
337
-
338
- A length of zero always indicates the number 0.
339
-
340
- When storing a signed integer, the left-most bit is the sign. A 1 is negative
341
- and a 0 is positive.
342
-
343
- The type numbers for our integer types are:
344
-
345
- * unsigned 16-bit int - 5
346
- * unsigned 32-bit int - 6
347
- * signed 32-bit int - 8
348
- * unsigned 64-bit int - 9
349
- * unsigned 128-bit int - 10
350
-
351
- The unsigned 32-bit and 128-bit types may be used to store IPv4 and IPv6
352
- addresses, respectively.
353
-
354
- The signed 32-bit integers are stored using the 2's complement representation.
355
-
356
- ### map - 7
357
-
358
- A map data type contains a set of key/value pairs. Unlike other data types,
359
- the length information for maps indicates how many key/value pairs it
360
- contains, not its length in bytes. This size can be zero.
361
-
362
- See below for the algorithm used to determine the number of pairs in the
363
- hash. This algorithm is also used to determine the length of a field's
364
- payload.
365
-
366
- ### array - 11
367
-
368
- An array type contains a set of ordered values. The length information for
369
- arrays indicates how many values it contains, not its length in bytes. This
370
- size can be zero.
371
-
372
- This type uses the same algorithm as maps for determining the length of a
373
- field's payload.
374
-
375
- ### data cache container - 12
376
-
377
- This is a special data type that marks a container used to cache repeated
378
- data. For example, instead of repeating the string "United States" over and
379
- over in the database, we store it in the cache container and use pointers
380
- *into* this container instead.
381
-
382
- Nothing in the database will ever contain a pointer to this field
383
- itself. Instead, various fields will point into the container.
384
-
385
- The primary reason for making this a separate data type versus simply inlining
386
- the cached data is so that a database dumper tool can skip this cache when
387
- dumping the data section. The cache contents will end up being dumped as
388
- pointers into it are followed.
389
-
390
- ### end marker - 13
391
-
392
- The end marker marks the end of the data section. It is not strictly
393
- necessary, but including this marker allows a data section deserializer to
394
- process a stream of input, rather than having to find the end of the section
395
- before beginning the deserialization.
396
-
397
- This data type is not followed by a payload, and its size is always zero.
398
-
399
- ### boolean - 14
400
-
401
- A true or false value. The length information for a boolean type will always
402
- be 0 or 1, indicating the value. There is no payload for this field.
403
-
404
- ### float - 15
405
-
406
- This is stored as an IEEE-754 float (binary32) in big-endian format. The
407
- length of a float is always 4 bytes.
408
-
409
- This type is provided primarily for completeness. Because of the way floating
410
- point numbers are stored, this type can easily lose precision when serialized
411
- and then deserialized. If this is an issue for you, consider using a double
412
- instead.
413
-
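The precision caveat can be seen with a short Ruby sketch. The helper name `decode_float` is hypothetical. The `"g"` directive of `String#unpack1` reads a big-endian single-precision float, and `"G"` is used here only to contrast with a big-endian double-precision round trip.

```ruby
# Hypothetical helper: decode a 4-byte big-endian IEEE-754 binary32 payload.
def decode_float(payload)
  raise ArgumentError, "a float payload is always 4 bytes" unless payload.bytesize == 4

  payload.unpack1("g")
end

# Round-tripping 0.1 through binary32 loses precision...
single = decode_float([0.1].pack("g"))
# ...while a binary64 round trip returns the original Ruby Float.
double = [0.1].pack("G").unpack1("G")

puts single == 0.1
puts double == 0.1
```

The first comparison prints `false` and the second prints `true`, which is why a double is the safer choice when exact round-tripping matters.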
414
- ### Data Field Format
415
-
416
- Each field starts with a control byte. This control byte provides information
417
- about the field's data type and payload size.
418
-
419
- The first three bits of the control byte tell you what type the field is. If
420
- these bits are all 0, then this is an "extended" type, which means that the
421
- *next* byte contains the actual type. Otherwise, the first three bits will
422
- contain a number from 1 to 7, the actual type for the field.
423
-
424
- We've tried to assign the most commonly used types as numbers 1-7 as an
425
- optimization.
426
-
427
- With an extended type, the type number in the second byte is the number
428
- minus 7. In other words, an array (type 11) will be stored with a 0 for the
429
- type in the first byte and a 4 in the second.
430
-
431
- Here is an example of how the control byte may combine with the next byte to
432
- tell us the type:
433
-
434
- 001XXXXX pointer
435
- 010XXXXX UTF-8 string
436
- 110XXXXX unsigned 32-bit int (ASCII)
437
- 000XXXXX 00000011 unsigned 128-bit int (binary)
438
- 000XXXXX 00000100 array
439
- 000XXXXX 00000110 end marker
440
-
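A minimal Ruby sketch of the type lookup described above, under the assumption that the field is available as a binary string. The name `decode_type` is hypothetical. It returns the type number together with the count of type-specifying bytes consumed.

```ruby
# Hypothetical helper: read the type number from the start of a field.
# Returns [type_number, bytes_consumed].
def decode_type(field)
  type = field.getbyte(0) >> 5 # top three bits of the control byte
  return [type, 1] unless type.zero?

  # Extended type: the next byte stores the type number minus 7.
  [field.getbyte(1) + 7, 2]
end
```

With the examples from the table, `decode_type([0b00100000].pack("C*"))` yields `[1, 1]` (pointer) and `decode_type([0b00000000, 0b00000100].pack("C*"))` yields `[11, 2]` (array).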
441
- #### Payload Size
442
-
443
- The next five bits in the control byte tell you how long the data field's
444
- payload is, except for maps, arrays, and pointers, which use this size
445
- information a bit differently. See below.
446
-
447
- If the five bits are smaller than 29, then those bits are the payload size in
448
- bytes. For example:
449
-
450
- 01000010 UTF-8 string - 2 bytes long
451
- 01011100 UTF-8 string - 28 bytes long
452
- 11000001 unsigned 32-bit int - 1 byte long
453
- 00000011 00000011 unsigned 128-bit int - 3 bytes long
454
-
455
- If the five bits are equal to 29, 30, or 31, then use the following algorithm
456
- to calculate the payload size.
457
-
458
- If the value is 29, then the size is 29 + *the next byte after the
459
- type-specifying bytes, read as an unsigned integer*.
460
-
461
- If the value is 30, then the size is 285 + *the next two bytes after the
462
- type-specifying bytes, read as a single unsigned integer*.
463
-
464
- If the value is 31, then the size is 65,821 + *the next three bytes after the
465
- type-specifying bytes, read as a single unsigned integer*.
466
-
467
- Some examples:
468
-
469
- 01011101 00110011 UTF-8 string - 80 bytes long
470
-
471
- In this case, the last five bits of the control byte equal 29. We treat the
472
- next byte as an unsigned integer. The next byte is 51, so the total size is
473
- (29 + 51) = 80.
474
-
475
- 01011110 00110011 00110011 UTF-8 string - 13,392 bytes long
476
-
477
- The last five bits of the control byte equal 30. We treat the next two bytes
478
- as a single unsigned integer. The next two bytes equal 13,107, so the total
479
- size is (285 + 13,107) = 13,392.
480
-
481
- 01011111 00110011 00110011 00110011 UTF-8 string - 3,421,264 bytes long
482
-
483
- The last five bits of the control byte equal 31. We treat the next three bytes
484
- as a single unsigned integer. The next three bytes equal 3,355,443, so the
485
- total size is (65,821 + 3,355,443) = 3,421,264.
486
-
487
- This means that the maximum payload size for a single field is 16,843,036
488
- bytes.
489
-
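The size algorithm above can be sketched in Ruby like this. The names `SIZE_EXTENSIONS` and `decode_size` are hypothetical, and the sketch assumes the extra size bytes are big-endian and that `following` holds the bytes that come right after the type-specifying bytes.

```ruby
# Size marker => [number of extra size bytes, base value to add].
SIZE_EXTENSIONS = { 29 => [1, 29], 30 => [2, 285], 31 => [3, 65_821] }.freeze

# Hypothetical helper: compute the payload size for a control byte.
# Returns [size, extra_size_bytes_consumed].
def decode_size(control, following)
  size = control & 0b0001_1111 # last five bits of the control byte
  return [size, 0] if size < 29

  extra_bytes, base = SIZE_EXTENSIONS.fetch(size)
  extra = following.byteslice(0, extra_bytes).bytes.reduce(0) { |acc, b| (acc << 8) | b }
  [base + extra, extra_bytes]
end
```

Feeding it the worked examples reproduces the same numbers: `decode_size(0b01011101, [0x33].pack("C*"))` gives `[80, 1]`, and three `0xFF` bytes after a marker of 31 give the maximum of 16,843,036.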
490
- The binary number types always have a known size, but for consistency's sake,
491
- the control byte will always specify the correct size for these types.
492
-
493
- #### Maps
494
-
495
- Maps use the size in the control byte (and any following bytes) to indicate
496
- the number of key/value pairs in the map, not the size of the payload in
497
- bytes.
498
-
499
- This means that the maximum number of pairs for a single map is 16,843,036.
500
-
501
- Maps are laid out with each key followed by its value, followed by the next
502
- pair, etc.
503
-
504
- The keys are **always** UTF-8 strings. The values may be any data type,
505
- including maps or pointers.
506
-
507
- Once we know the number of pairs, we can look at each pair in turn to
508
- determine the size of the key and the key name, as well as the value's type
509
- and payload.
510
-
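To illustrate the pair-by-pair walk, here is a deliberately small Ruby sketch. The name `decode_field` is hypothetical, and the sketch handles only UTF-8 strings (type 2), unsigned 32-bit integers (type 6), and maps (type 7), all with sizes below 29. Pointers, extended types, and the size extensions are left out, and integers are assumed to be big-endian.

```ruby
# Hypothetical helper: decode one field starting at `offset` in a binary
# string. Returns [decoded_value, offset_of_next_field].
def decode_field(buf, offset)
  control = buf.getbyte(offset)
  type = control >> 5
  size = control & 0b0001_1111
  raise ArgumentError, "size extensions are not handled in this sketch" if size >= 29

  offset += 1
  case type
  when 2 # UTF-8 string: size is the payload length in bytes
    [buf.byteslice(offset, size).force_encoding(Encoding::UTF_8), offset + size]
  when 6 # unsigned 32-bit int: size is the payload length in bytes
    value = buf.byteslice(offset, size).bytes.reduce(0) { |acc, b| (acc << 8) | b }
    [value, offset + size]
  when 7 # map: size is the number of key/value pairs, not a byte length
    map = {}
    size.times do
      key, offset = decode_field(buf, offset)
      value, offset = decode_field(buf, offset)
      map[key] = value
    end
    [map, offset]
  else
    raise ArgumentError, "type #{type} is not handled in this sketch"
  end
end

# A map with two pairs: "en" => "Hello" and "id" => 7.
sample = [0xE2, 0x42].pack("C*") + "en" + [0x45].pack("C*") + "Hello" +
         [0x42].pack("C*") + "id" + [0xC1, 0x07].pack("C*")
decoded, next_offset = decode_field(sample, 0)
```

Because the map's control byte carries a pair count rather than a byte length, the decoder only learns where the map ends by decoding every key and value in turn.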
511
- #### Pointers
512
-
513
- Pointers use the last five bits in the control byte to calculate the pointer
514
- value.
515
-
516
- To calculate the pointer value, we start by subdividing the five bits into two
517
- groups. The first two bits indicate the size, and the next three bits are part
518
- of the value, so we end up with a control byte breaking down like this:
519
- 001SSVVV.
520
-
521
- The size can be 0, 1, 2, or 3.
522
-
523
- If the size is 0, the pointer is built by appending the next byte to the last
524
- three bits to produce an 11-bit value.
525
-
526
- If the size is 1, the pointer is built by appending the next two bytes to the
527
- last three bits to produce a 19-bit value + 2048.
528
-
529
- If the size is 2, the pointer is built by appending the next three bytes to the
530
- last three bits to produce a 27-bit value + 526336.
531
-
532
- Finally, if the size is 3, the pointer's value is contained in the next four
533
- bytes as a 32-bit value. In this case, the last three bits of the control byte
534
- are ignored.
535
-
536
- This means that we are limited to 4GB of address space for pointers, so the
537
- data section size for the database is limited to 4GB.
538
-
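The pointer rules above translate into the following Ruby sketch. The names `POINTER_BASES` and `decode_pointer` are hypothetical, and `following` is assumed to hold the bytes right after the control byte.

```ruby
# Value added to the assembled bits for pointer sizes 0, 1, and 2.
POINTER_BASES = [0, 2048, 526_336].freeze

# Hypothetical helper: decode a pointer from a 001SSVVV control byte.
# Returns [pointer_value, extra_bytes_consumed].
def decode_pointer(control, following)
  size = (control >> 3) & 0b11
  extra_bytes = size + 1
  low = following.byteslice(0, extra_bytes).bytes.reduce(0) { |acc, b| (acc << 8) | b }

  # Size 3: the value is the next four bytes; the VVV bits are ignored.
  return [low, 4] if size == 3

  high = (control & 0b111) << (8 * extra_bytes)
  [(high | low) + POINTER_BASES[size], extra_bytes]
end
```

For instance, `decode_pointer(0b00100111, [0xFF].pack("C*"))` gives `[2047, 1]`, the largest size-0 pointer, and `decode_pointer(0b00101000, [0x00, 0x00].pack("C*"))` gives `[2048, 2]`, the smallest size-1 pointer, so the added constants keep the ranges from overlapping.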
539
- ## Reference Implementations
540
-
541
- ### Writer
542
-
543
- * [Perl](https://github.com/maxmind/MaxMind-DB-Writer-perl)
544
-
545
- ### Reader
546
-
547
- * [C](https://github.com/maxmind/libmaxminddb)
548
- * [C#](https://github.com/maxmind/MaxMind-DB-Reader-dotnet)
549
- * [Java](https://github.com/maxmind/MaxMind-DB-Reader-java)
550
- * [Perl](https://github.com/maxmind/MaxMind-DB-Reader-perl)
551
- * [PHP](https://github.com/maxmind/MaxMind-DB-Reader-php)
552
- * [Python](https://github.com/maxmind/MaxMind-DB-Reader-python)
553
- * [Ruby](https://github.com/maxmind/MaxMind-DB-Reader-ruby)
554
-
555
- ## Authors
556
-
557
- This specification was created by the following authors:
558
-
559
- * Greg Oschwald \<goschwald@maxmind.com\>
560
- * Dave Rolsky \<drolsky@maxmind.com\>
561
- * Boris Zentner \<bzentner@maxmind.com\>
562
-
563
- ## License
564
-
565
- This work is licensed under the Creative Commons Attribution-ShareAlike 3.0
566
- Unported License. To view a copy of this license, visit
567
- [http://creativecommons.org/licenses/by-sa/3.0/](http://creativecommons.org/licenses/by-sa/3.0/)
568
- or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain
569
- View, California, 94041, USA.
570
-