hoov_vin 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 89cea6349464b8a846d7278e95ceb71f633dd8e86b33963eace43993b7c9a577
4
+ data.tar.gz: eee44128336e248e192cfcdd702954625b50db1da7a8a17137de644a2e5c5f0d
5
+ SHA512:
6
+ metadata.gz: 02a46a71bc71764d28c745af2966a057d25fb2638f30c8ca0cbcced195210276f63d67a1a32ec573d6cd3e4d282274b968abea057053ab3eab42b9cde587b680
7
+ data.tar.gz: ee32ff79ef7106ca1ab4685f081a8e18f02b03a68873f146ce42a6e3681b75cfb46e6721f0c94f12fc24aa99a30620ffb1f17571b01b61ab9ac675a136aac348
data/README.md ADDED
@@ -0,0 +1,287 @@
1
+ <div align="center">
2
+ <picture>
3
+ <source media="(prefers-color-scheme: dark)" srcset="docs/logo-dark-mode.png">
4
+ <source media="(prefers-color-scheme: light)" srcset="docs/logo-light-mode.png">
5
+ <img width=200>
6
+ </picture>
7
+ <h1>VIN</h1>
8
+ <p><i>noun ‧ <strong>V</strong>ersatile <strong>I</strong>dentification <strong>N</strong>umber</i></p>
9
+ <p><strong>A customizable Redis-powered Ruby client for generating unique, monotonically-increasing integer IDs, for use in distributed systems and databases.</strong></p>
10
+ <a href="https://github.com/hoovbr/vin/releases">
11
+ <img alt="Latest Release" src="https://img.shields.io/github/v/release/hoovbr/vin?sort=semver">
12
+ </a>
13
+ <a href="https://codeclimate.com/github/hoovbr/vin/maintainability">
14
+ <img src="https://api.codeclimate.com/v1/badges/790449fb5d05f6a134a5/maintainability" />
15
+ </a>
16
+ <a href="https://codeclimate.com/github/hoovbr/vin/test_coverage">
17
+ <img src="https://api.codeclimate.com/v1/badges/790449fb5d05f6a134a5/test_coverage" />
18
+ </a>
19
+ <a href="https://github.com/hoovbr/vin/actions/workflows/push.yml">
20
+ <img alt="Tests & Linter" src="https://github.com/hoovbr/vin/actions/workflows/push.yml/badge.svg">
21
+ </a>
22
+ <a href="https://github.com/hoovbr/vin/issues">
23
+ <img alt="Issues" src="https://img.shields.io/github/issues/hoovbr/vin?color=#86D492" />
24
+ </a>
25
+ <a href="https://twitter.com/intent/follow?screen_name=hoovbr">
26
+ <img src="https://img.shields.io/twitter/follow/hoovbr?&logo=twitter" alt="Follow on Twitter">
27
+ </a>
28
+ <img src="https://views.whatilearened.today/views/github/hoovbr/vin.svg">
29
+
30
+ <p align="center">
31
+ <a href="#demo">View Demo</a>
32
+ ·
33
+ <a href="https://github.com/hoovbr/vin/issues/new/choose">Report Bug</a>
34
+ ·
35
+ <a href="https://github.com/hoovbr/vin/issues/new/choose">Request Feature</a>
36
+ </p>
37
+ </div>
38
+
39
+ A customizable Redis-powered Ruby client for generating unique, monotonically-increasing integer IDs, for use in distributed systems and databases. Based heavily off of [Icicle](https://github.com/intenthq/icicle/), [Twitter Snowflake](https://en.wikipedia.org/wiki/Snowflake_ID), and [Dogtag](https://github.com/zillyinc/dogtag).
40
+
41
+ # Requirements
42
+
43
+ - Ruby 3+
44
+ - Redis 5+
45
+ - If you are going to store the IDs in a database, make sure it can store 64-bit integers (e.g. PostgreSQL, MySQL, etc.)
46
+
47
+ ## Demo
48
+
49
+ <details><summary>Click here to view a simple demo</summary>
50
+ <p>
51
+
52
+ The gif below demonstrates how the ID generation works:
53
+
54
+ <div align="center">
55
+ <img alt="Demo" src="https://github.com/hoovbr/vin/assets/8419048/dc9fe71f-7d6d-4ba5-bd8e-fe81a280928a">
56
+ </div>
57
+
58
+ </p>
59
+ </details>
60
+
61
+ # Installation
62
+
63
+ Add this gem to your `Gemfile`:
64
+
65
+ ```ruby
66
+ gem "hoov_vin"
67
+ ```
68
+
69
+ And then run `bundle install` to install it.
70
+
71
+ # Usage
72
+
73
+ Follow the steps below to get started with VIN in your Ruby on Rails project. These steps assume your project is not yet live in production, so that you're free to make changes to your database schema and drop your existing database records.
74
+
75
+ 1. Make sure the primary key type is set to `:bigint` when generating new models
76
+
77
+ To achieve this, create or update your `config/initializers/generators.rb` file:
78
+
79
+ ```ruby
80
+ Rails.application.config.generators do |g|
81
+ g.orm :active_record, primary_key_type: :bigint
82
+ end
83
+ ```
84
+
85
+ This [happens to be the default](https://edgeguides.rubyonrails.org/active_record_basics.html#schema-conventions) for PostgreSQL and MySQL, but it's not the default for SQLite, so it's good to always be explicit.
86
+
87
+ 2. Set up the VIN generator
88
+
89
+ Update your `config/application.rb` file to initialize the VIN generator singleton:
90
+
91
+ ```ruby
92
+
93
+ require "vin"
94
+
95
+ module YourApp
96
+ class Application < Rails::Application
97
+
98
+ # CAUTION: Avoid modifying the values below without fully understanding the implications for previously generated IDs.
99
+ config.id_generator = VIN.new(config: VIN::Config.new(
100
+ custom_epoch: 1_672_531_200_000,
101
+ timestamp_bits: 40,
102
+ logical_shard_id_bits: 3,
103
+ data_type_bits: 9,
104
+ sequence_bits: 11,
105
+ logical_shard_id_range: 0..0,
106
+ ))
107
+ end
108
+ end
109
+ ```
110
+
111
+ To understand what each of these values means, see the [Configuration](#configuration) section below.
112
+
113
+ 3. Automatically generate and assign the VIN to models before saving them to the database
114
+
115
+ Create a new file in `app/models/concerns/has_vin.rb`:
116
+
117
+ ```ruby
118
+ module HasVin
119
+ extend ActiveSupport::Concern
120
+
121
+ included do
122
+ before_create :set_vin_if_needed
123
+ end
124
+
125
+ private
126
+
127
+ def set_vin_if_needed
128
+ id_generator = Rails.application.config.id_generator
129
+ self.id ||= id_generator.generate_id(self.class::VIN_DATA_TYPE)
130
+ end
131
+ end
132
+ ```
133
+
134
+ This will guarantee that the VIN is generated and assigned to the model before it's saved to the database. The `VIN_DATA_TYPE` constant is used to differentiate between different types of models, so that they don't share the same ID space. For example, you might want to use a different `VIN_DATA_TYPE` for `User` models than you would for `Post` models.
135
+
136
+ Note that this assumes all your models are using a primary key named `id`. If you're not following the Rails convention of using `id` as the primary key, or if you're using composite primary keys, you'll need to modify this code to work with your specific setup. This could be one way to do it:
137
+
138
+ ```ruby
139
+
140
+ def set_vin_if_needed
141
+ # If using composite primary keys in Rails 7.1 and later
142
+ return if defined?(self.class.primary_key) && self.class.primary_key.is_a?(Array)
143
+ # If using composite primary keys in Rails 7.0 and earlier
144
+ return if defined?(self.class.primary_keys)
145
+ id_generator = Rails.application.config.id_generator
146
+ self.id ||= id_generator.generate_id(self.class::VIN_DATA_TYPE)
147
+ end
148
+
149
+ ```
150
+
151
+ 4. Include the `HasVin` module in your base `ApplicationRecord` class
152
+
153
+ Create or update your base ActiveRecord abstract class, such as `app/models/application_record.rb`:
154
+
155
+ ```ruby
156
+ class ApplicationRecord < ActiveRecord::Base
157
+ self.abstract_class = true # If targeting Rails 6 or earlier
158
+ primary_abstract_class # If targeting Rails 7 or later
159
+ include HasVin
160
+ end
161
+ ```
162
+
163
+ This will make sure the `HasVin` module is included in all your models.
164
+
165
+ >_**Note:** If you already have an existing codebase and database records, make sure you write the appropriate migration to change the primary key type to `:bigint` across the board, as well as migrate your existing records' IDs to VIN IDs._
166
+
167
+ ## Usage outside of ActiveRecord/Rails context
168
+
169
+ ```ruby
170
+ vin = VIN.new
171
+ data_type = 0
172
+ vin.generate_id(data_type) # => 63801071700541441
173
+ ```
174
+
175
+ ```ruby
176
+ count = 100
177
+ vin.generate_ids(data_type, count) # => [63801199693922306, 63801199693922307, … 98 other IDs … ]
178
+ ```
179
+
180
+ ```ruby
181
+ id_number = vin.generate_id(data_type) # => 63801532235120742
182
+ id = VIN::Id.new(id: id_number) # => #<VIN::Id:0x0000000108452ff0…>
183
+ id.data_type # => 0
184
+ id.sequence # => 102
185
+ id.logical_shard_id # => 0
186
+ id.custom_timestamp # => 7605735330, time since the custom epoch in milliseconds
187
+ id.timestamp.to_time # => 2023-09-27 22:16:15.33 -0300 (Ruby Time object)
188
+ id.timestamp.epoch # => 1688258040000, the custom epoch in milliseconds since the UNIX epoch
189
+ ```
190
+
191
+ # Configuration
192
+
193
+ The VIN generator can be configured with the following parameters:
194
+
195
+ - `custom_epoch` or `VIN_CUSTOM_EPOCH` env var: The custom epoch is the timestamp that will be used as the starting point for generating VINs. It's expressed in milliseconds since the UNIX epoch (Jan 1st, 1970, 12:00 AM UTC). Example value: `1_672_531_200_000` (Jan 1st, 2023, 12:00 AM UTC). This value shouldn't be in the future, and should never be changed once it has been configured.
196
+ - `timestamp_bits` or `VIN_TIMESTAMP_BITS` env var: The number of bits to use for the timestamp. The more bits you use, the more time you'll have before the timestamp overflows. Example value: `40` (40 bits gives us 1099511627776 milliseconds, or 34.8 years, enough time for any of us to retire 😇).
197
+ - `logical_shard_id_bits` or `VIN_LOGICAL_SHARD_ID_BITS` env var: The number of bits to use for the logical shard ID. The more bits you use, the more machines generating IDs you'll be able to have. Example value: `3` (3 bits gives us 8 logical shards, which means you can have 8 different servers generating ids).
198
+ - `data_type_bits` or `VIN_DATA_TYPE_BITS` env var: The number of bits to use for the data type. The more bits you use, the more different types of models (tables) you'll be able to have. Example value: `9` (9 bits gives us 512 different data types).
199
+ - `sequence_bits` or `VIN_SEQUENCE_BITS` env var: The number of bits to use for the sequence. The more bits you use, the more IDs you'll be able to generate per millisecond per logical shard. Example value: `11` (11 bits gives us 2048 IDs per millisecond per logical shard).
200
+ - `logical_shard_id_range` or `VIN_LOGICAL_SHARD_ID_RANGE_MIN` + `VIN_LOGICAL_SHARD_ID_RANGE_MAX` env vars: The range of logical shard IDs to use. Example value: `0..7` (8 logical shards, numbered 0 through 7). Note that this must conform with the `logical_shard_id_bits` value. This parameter is optional, and defaults to `0..0` (a single logical shard with ID 0).
201
+ - `VIN_REDIS_URL` or `REDIS_URL` env var: The Redis URL to use for the Redis connection. Example value: `redis://localhost:6379/0`. This parameter is optional, and defaults to `redis://127.0.0.1:6379`.
202
+
203
+ **Note:** the sum of the `timestamp_bits`, `logical_shard_id_bits`, `data_type_bits`, and `sequence_bits` values must be 63. The remaining bit is used for the sign bit.
204
+
205
+ # FAQ
206
+
207
+ ## Why not use incremental IDs?
208
+
209
+ Using incremental IDs in databases can have its drawbacks and limitations. One key reason to reconsider their use is the potential for data leakage and security vulnerabilities. Incremental IDs are predictable and sequential, making it easier for malicious actors to guess or access sensitive data by simply incrementing the ID. This can compromise data privacy and expose confidential information about the system, how many records exist, etc. Additionally, when databases are distributed or sharded, managing incremental IDs across multiple servers can lead to synchronization challenges and performance bottlenecks. Moreover, if records are ever deleted or the database is restructured, gaps in the sequence may arise, causing inconsistencies and complicating data analysis. Lastly, incremental IDs are not universally unique, which inevitably leads to collisions amongst different database tables, and can cause confusion or mistakes when debugging, analyzing, or manipulating data.
210
+
211
+ ## Why not use UUIDs?
212
+
213
+ UUIDs (Universally Unique Identifiers) solve the problem of predictability and security, and also the generation of IDs in distributed systems, but they are long and complex, which can increase storage requirements and slow down indexing and query performance. Storing them as strings can also make them difficult to work with, and takes up more space than storing integer IDs. Although they can be encoded as integers too, they still take up 128 bits of storage when in integer format. Lastly, sorting them doesn't provide any usefulness, and their meaningless nature doesn't help with debugging or data analysis.
214
+
215
+ ## Why not use ULIDs?
216
+
217
+ ULIDs (Universally Unique Lexicographically Sortable Identifiers) are the second best alternative, as they are sortable by time, don't impose immediate generation problems in distributed systems, and can also be encoded as integers. However, there are still a few drawbacks, such as taking up 128 bits of storage, which may not be necessary if they are being used as database primary keys. Lastly, time is the only useful information encoded in them, so they don't provide any additional context or meaning to the data.
218
+
219
+ ## Why use VINs?
220
+
221
+ At this point you can probably guess why we created VINs. They are the best at solving each weakness of the options listed above:
222
+
223
+ - VINs are not predictable, so they don't carry the security and privacy vulnerabilities that come with incremental IDs.
224
+ - VINs have zero collision probability, making them universally unique across the entire database.
225
+ - This comes with the drawback of a self-imposed bottleneck on generation. However, this is only an issue at absurd scales (thousands of record creations per millisecond, per server), and can be easily overcome by increasing the number of sequence bits or shards.
226
+ - VINs are 64-bit integers, making them more space-efficient than UUIDs and ULIDs, which take 128 bits at best.
227
+ - VINs can be sorted, yielding a chronologically ordered list, thanks to the monotonically-increasing nature of the IDs.
228
+ - VINs encode additional context and meaning about the records they identify, such as the timestamp, data type, and shard ID, which can be used to identify the source of the data and helps when optimizing distributed systems and debugging.
229
+ - VINs are fully customizable. As you could see in the [Configuration](#configuration) section, you can customize the number of bits used for each component of the VIN, allowing you to optimize the VIN for your specific use case.
230
+
231
+ ## How does it work?
232
+
233
+ ### How are the IDs generated?
234
+
235
+ The IDs are composed of 64 bits, which are divided into 4 components: timestamp, shard ID (aka machine ID), data type, and sequence. It's important that the ID starts with the timestamp component, as that's what guarantees the IDs are sortable by time.
236
+
237
+ The number of bits that each of these components takes up can be customized as seen in the [Configuration](#configuration) section, but for the sake of this example, we'll use 40 bits for the timestamp, 3 bits for the shard ID, 9 bits for the data type, and 11 bits for the sequence. This adds up to 63 bits; since we're working with a signed integer, the first bit is reserved as the sign bit. This results in this binary representation:
238
+
239
+ ```no-highlight
240
+ +----------------------+----------+--------------+----------------+
241
+ | Timestamp | Shard ID | Data Type | Sequence |
242
+ | (40 bits) | (3 bits) | (9 bits) | (11 bits) |
243
+ +----------------------+----------+--------------+----------------+
244
+ ```
245
+
246
+ This is then converted to a decimal number, which is what we use as the ID. The timestamp is the number of milliseconds since the custom epoch defined by you during the configuration. The shard ID is a number that uniquely identifies the machine that generated the ID. The data type is a number that uniquely identifies the model that this ID will belong to. The sequence is a number that is incremented every time an ID is generated, and is reset to 0 every millisecond, a strategy used to avoid collisions.
247
+
248
+ ### How are the IDs automatically assigned to records?
249
+
250
+ In Rails, when you create a new record, the `create` method is called on the model class, which creates the record in memory and then calls `save` on it. The `save` method will either call `create` or `update` depending on whether the record is new or not. If the record is not new, it will already have an ID assigned to it, in which case our method `set_vin_if_needed` in `HasVin` won't do anything. However, if the record is new, it will not have an ID assigned to it, in which case our method will generate and assign a VIN to it. This happens before the record gets sent to the database, so the database will not generate an ID for it.
251
+
252
+ ## What about the performance?
253
+
254
+ Compared to the benefits of having VINs, the performance impact is negligible. The only cost is the time it takes to generate the VIN, which is around 0.039 ms (no, that's not a typo: it's less than 1/25th of a millisecond).
255
+
256
+ ## Any issues I should be aware of?
257
+
258
+ Be careful of using VIN IDs with JavaScript, since it [doesn't handle 64 bit integers well](http://stackoverflow.com/questions/9643626/javascript-cant-handle-64-bit-integers-can-it). You'll probably want to work with them as strings.
259
+
260
+ Also, when two IDs are generated within the same millisecond, their relative order is only guaranteed if they were generated by the same machine, for the same data type. This is expected and follows from the order of the bits: the IDs are only sortable by time.
261
+
262
+ # Development
263
+
264
+ After checking out the repo, run `bundle install` to install dependencies. Then, run `bundle exec rake spec` to run the tests.
265
+
266
+ To install this gem onto your local machine, run `bundle exec rake install`.
267
+
268
+ To bump the lib's version, run `bundle exec rake bump[1.2.3]` (replacing the value with the desired version).
269
+
270
+ To release a new version, update the version number (via `bundle exec rake bump` as explained above), and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).
271
+
272
+ # TODO
273
+
274
+ - Support multiple Redis servers
275
+ - Replace the lua script with Ruby code.
276
+
277
+ # Contributing
278
+
279
+ If you spot something wrong, missing, or if you'd like to propose improvements to this project, please open an Issue or a Pull Request with your ideas and we promise to get back to you within 24 hours! 😇
280
+
281
+ This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](CODE_OF_CONDUCT.md).
282
+
283
+ For a list of issues worth tackling check out: https://github.com/hoovbr/vin/issues
284
+
285
+ # Popularity
286
+
287
+ <img width=500 src="https://api.star-history.com/svg?repos=hoovbr/vin&type=Date">
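The bit layout described in the README's "How are the IDs generated?" section can be checked with a short stand-alone sketch. It assumes the example widths (40/3/9/11) and reuses the component values from the README's `VIN::Id` example; the constant and method names are hypothetical and are not part of the gem.

```ruby
# Bit widths copied from the README example. The constant and method names
# below are hypothetical and are not part of the gem's API.
SEQUENCE_BITS  = 11
DATA_TYPE_BITS = 9
SHARD_BITS     = 3

DATA_TYPE_SHIFT = SEQUENCE_BITS                                # 11
SHARD_SHIFT     = SEQUENCE_BITS + DATA_TYPE_BITS               # 20
TIMESTAMP_SHIFT = SEQUENCE_BITS + DATA_TYPE_BITS + SHARD_BITS  # 23

# Combine the four components into one integer, timestamp in the high bits.
def pack_id(timestamp:, shard:, data_type:, sequence:)
  (timestamp << TIMESTAMP_SHIFT) |
    (shard << SHARD_SHIFT) |
    (data_type << DATA_TYPE_SHIFT) |
    sequence
end

# Component values shown in the README's VIN::Id example.
id = pack_id(timestamp: 7_605_735_330, shard: 0, data_type: 0, sequence: 102)
puts id # 63801532235120742, the same ID as in the README example

# A later millisecond sorts after an earlier one, whatever the other fields hold.
earlier = pack_id(timestamp: 1_000, shard: 7, data_type: 511, sequence: 2_047)
later   = pack_id(timestamp: 1_001, shard: 0, data_type: 0, sequence: 0)
puts earlier < later # true
```

Because the timestamp occupies the highest bits, even the largest possible shard, data type, and sequence values cannot push an ID past one generated in the next millisecond.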
data/lib/vin/config.rb ADDED
@@ -0,0 +1,94 @@
1
+ class VIN
2
+ class Config
3
+ # Expressed in milliseconds.
4
+ attr_reader :custom_epoch
5
+
6
+ # For instance, 40 bits gives us 1099511627776 milliseconds, or 34.8 years. Enough time to last us until 2057, enough time for any of us to retire.
7
+ attr_reader :timestamp_bits
8
+
9
+ # For instance, 3 bits gives us 8 logical shards, which means we can have 8 different servers generating ids.
10
+ attr_reader :logical_shard_id_bits
11
+
12
+ # For instance, 9 bits gives us 512 different data types.
13
+ attr_reader :data_type_bits
14
+
15
+ # For instance, 11 bits gives us 2048 ids per millisecond per logical shard.
16
+ attr_reader :sequence_bits
17
+
18
+ # Defaults to allowing all logical shard ids to be generated by this server.
19
+ attr_reader :logical_shard_id_range
20
+
21
+ def initialize(
22
+ custom_epoch: nil,
23
+ timestamp_bits: nil,
24
+ logical_shard_id_bits: nil,
25
+ data_type_bits: nil,
26
+ sequence_bits: nil,
27
+ logical_shard_id_range: nil
28
+ )
29
+ @custom_epoch = custom_epoch || ENV.fetch("VIN_CUSTOM_EPOCH").to_i
30
+ @timestamp_bits = timestamp_bits || ENV.fetch("VIN_TIMESTAMP_BITS").to_i
31
+ @logical_shard_id_bits = logical_shard_id_bits || ENV.fetch("VIN_LOGICAL_SHARD_ID_BITS").to_i
32
+ @data_type_bits = data_type_bits || ENV.fetch("VIN_DATA_TYPE_BITS").to_i
33
+ @sequence_bits = sequence_bits || ENV.fetch("VIN_SEQUENCE_BITS").to_i
34
+ @logical_shard_id_range = logical_shard_id_range || fetch_allowed_range!
35
+ end
36
+
37
+ def min_logical_shard_id
38
+ 0
39
+ end
40
+
41
+ def max_logical_shard_id
42
+ @max_logical_shard_id ||= ~(-1 << logical_shard_id_bits)
43
+ end
44
+
45
+ def logical_shard_id_allowed_range
46
+ @logical_shard_id_allowed_range ||= (min_logical_shard_id..max_logical_shard_id)
47
+ end
48
+
49
+ def min_data_type
50
+ 0
51
+ end
52
+
53
+ def max_data_type
54
+ @max_data_type ||= ~(-1 << data_type_bits)
55
+ end
56
+
57
+ def data_type_allowed_range
58
+ @data_type_allowed_range ||= (min_data_type..max_data_type)
59
+ end
60
+
61
+ def max_sequence
62
+ @max_sequence ||= ~(-1 << sequence_bits)
63
+ end
64
+
65
+ def sequence_shift
66
+ 0
67
+ end
68
+
69
+ def data_type_shift
70
+ @data_type_shift ||= sequence_bits
71
+ end
72
+
73
+ def logical_shard_id_shift
74
+ @logical_shard_id_shift ||= (sequence_bits + data_type_bits)
75
+ end
76
+
77
+ def timestamp_shift
78
+ @timestamp_shift ||= (sequence_bits + data_type_bits + logical_shard_id_bits)
79
+ end
80
+
81
+ def fetch_allowed_range!
82
+ range = Range.new(
83
+ ENV.fetch("VIN_LOGICAL_SHARD_ID_RANGE_MIN", 0).to_i,
84
+ ENV.fetch("VIN_LOGICAL_SHARD_ID_RANGE_MAX", 0).to_i,
85
+ )
86
+ # rubocop:disable Style/BitwisePredicate
87
+ unless (logical_shard_id_allowed_range.to_a & range.to_a) == range.to_a
88
+ raise(ArgumentError, "VIN_LOGICAL_SHARD_ID_RANGE_MIN and VIN_LOGICAL_SHARD_ID_RANGE_MAX env vars compose a range outside the allowed range of #{logical_shard_id_allowed_range} defined by the number of bits in VIN_LOGICAL_SHARD_ID_BITS env var.")
89
+ end
90
+ # rubocop:enable Style/BitwisePredicate
91
+ range
92
+ end
93
+ end
94
+ end
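The `~(-1 << bits)` idiom used by `max_logical_shard_id`, `max_data_type`, and `max_sequence` yields the largest value that fits in the given number of bits. The sketch below applies it to the README's example widths and recomputes the capacity figures quoted in the comments; the `layout` hash and the helper name are illustrative assumptions.

```ruby
# Largest unsigned value that fits in `bits` bits; same idiom as config.rb.
def max_value_for(bits)
  ~(-1 << bits)
end

# Example widths from the README; the hash itself is only for illustration.
layout = { timestamp: 40, logical_shard_id: 3, data_type: 9, sequence: 11 }

layout.each do |name, bits|
  puts "#{name}: #{bits} bits, values 0..#{max_value_for(bits)}"
end

# The README requires the widths to sum to 63 (the 64th bit is the sign bit).
total_bits = layout.values.sum
puts "total: #{total_bits} bits"

# Lifetime of a 40-bit millisecond counter, using 365.25-day years.
ms_per_year = 1000.0 * 60 * 60 * 24 * 365.25
years = ((1 << layout[:timestamp]) / ms_per_year).round(1)
puts "timestamp overflows after about #{years} years"
```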
@@ -0,0 +1,73 @@
1
+ require "vin/config"
2
+
3
+ class VIN
4
+ class Generator
5
+ attr_reader :data_type, :count, :config, :custom_timestamp
6
+
7
+ def initialize(config:)
8
+ @config = config
9
+ end
10
+
11
+ def generate_ids(data_type, count = 1, timestamp: nil)
12
+ raise(ArgumentError, "data_type must be an integer") unless data_type.is_a?(Integer)
13
+
14
+ unless config.data_type_allowed_range.include?(data_type)
15
+ raise(ArgumentError, "data_type is outside the allowed range of #{config.data_type_allowed_range}")
16
+ end
17
+
18
+ raise(ArgumentError, "count must be an integer") unless count.is_a?(Integer)
19
+ raise(ArgumentError, "count must be a positive number") if count < 1
20
+
21
+ if timestamp
22
+ validate_timestamp!(timestamp)
23
+ end
24
+
25
+ @data_type = data_type
26
+ @count = count
27
+ @custom_timestamp = timestamp
28
+
29
+ result = response.sequence.map do |sequence|
30
+ (
31
+ shifted_timestamp |
32
+ shifted_logical_shard_id |
33
+ shifted_data_type |
34
+ (sequence << config.sequence_shift)
35
+ )
36
+ end
37
+ # After generating a batch of IDs, we reset the response object so that it generates new IDs later with a new request.
38
+ @response = nil
39
+ result
40
+ end
41
+
42
+ private
43
+
44
+ def shifted_timestamp
45
+ timestamp = if custom_timestamp
46
+ # Custom timestamp is in Unix milliseconds (absolute time)
47
+ # Convert it to be relative to custom epoch
48
+ milliseconds_from_custom_epoch = custom_timestamp - config.custom_epoch
49
+ Timestamp.new(milliseconds_from_custom_epoch, epoch: config.custom_epoch)
50
+ else
51
+ Timestamp.from_redis(response.seconds, response.microseconds_part)
52
+ end
53
+ timestamp.with_epoch(config.custom_epoch).milliseconds << config.timestamp_shift
54
+ end
55
+
56
+ def validate_timestamp!(timestamp)
57
+ raise(ArgumentError, "timestamp must be an integer (milliseconds)") unless timestamp.is_a?(Integer)
58
+ raise(ArgumentError, "timestamp cannot be before the custom epoch (#{config.custom_epoch}ms since Unix epoch)") if timestamp < config.custom_epoch
59
+ end
60
+
61
+ def shifted_data_type
62
+ data_type << config.data_type_shift
63
+ end
64
+
65
+ def shifted_logical_shard_id
66
+ response.logical_shard_id << config.logical_shard_id_shift
67
+ end
68
+
69
+ def response
70
+ @response ||= Request.new(config, data_type, count, custom_timestamp: custom_timestamp).response
71
+ end
72
+ end
73
+ end
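When a caller passes `timestamp:`, the generator treats it as absolute Unix milliseconds, validates it, and rebases it onto the custom epoch before shifting. A minimal stand-alone sketch of that path follows; the helper name and its keyword arguments are hypothetical, and the epoch and shift are the README's example values.

```ruby
# Hypothetical helper mirroring the checks in validate_timestamp! and the
# arithmetic in shifted_timestamp; it is not part of the gem.
def shifted_timestamp_for(unix_ms, custom_epoch:, timestamp_shift:)
  raise ArgumentError, "timestamp must be an integer (milliseconds)" unless unix_ms.is_a?(Integer)
  raise ArgumentError, "timestamp cannot be before the custom epoch" if unix_ms < custom_epoch

  # Rebase onto the custom epoch, then move into the high bits of the ID.
  (unix_ms - custom_epoch) << timestamp_shift
end

custom_epoch = 1_672_531_200_000 # Jan 1st, 2023, 12:00 AM UTC (README example)
one_second_later = custom_epoch + 1_000

shifted = shifted_timestamp_for(one_second_later, custom_epoch: custom_epoch, timestamp_shift: 23)
puts shifted # 1000 << 23 = 8388608000

begin
  shifted_timestamp_for(custom_epoch - 1, custom_epoch: custom_epoch, timestamp_shift: 23)
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```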
data/lib/vin/id.rb ADDED
@@ -0,0 +1,48 @@
1
+ class VIN
2
+ class Id
3
+ attr_reader :id, :config
4
+
5
+ def initialize(id:, config: nil)
6
+ @id = id
7
+ @config = config || VIN::Config.new
8
+ end
9
+
10
+ def custom_timestamp
11
+ (id & timestamp_map) >> config.timestamp_shift
12
+ end
13
+
14
+ def timestamp
15
+ @timestamp ||= Timestamp.new(custom_timestamp, epoch: config.custom_epoch)
16
+ end
17
+
18
+ def logical_shard_id
19
+ (id & logical_shard_id_map) >> config.logical_shard_id_shift
20
+ end
21
+
22
+ def data_type
23
+ (id & data_type_map) >> config.data_type_shift
24
+ end
25
+
26
+ def sequence
27
+ (id & sequence_map) >> config.sequence_shift
28
+ end
29
+
30
+ private
31
+
32
+ def sequence_map
33
+ ~(-1 << config.sequence_bits) << config.sequence_shift
34
+ end
35
+
36
+ def data_type_map
37
+ ~(-1 << config.data_type_bits) << config.data_type_shift
38
+ end
39
+
40
+ def logical_shard_id_map
41
+ (~(-1 << config.logical_shard_id_bits)) << config.logical_shard_id_shift
42
+ end
43
+
44
+ def timestamp_map
45
+ ~(-1 << config.timestamp_bits) << config.timestamp_shift
46
+ end
47
+ end
48
+ end
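`VIN::Id` recovers each component by masking the relevant bits and shifting them down. The following sketch reproduces that mask-and-shift approach without the gem, assuming the README's example widths are hard-coded in place of `VIN::Config`; `FIELD_LAYOUT` and `decode` are hypothetical names.

```ruby
# Hypothetical stand-alone decoder. Each entry is [bit width, shift], using
# the README's example widths instead of reading them from VIN::Config.
FIELD_LAYOUT = {
  sequence: [11, 0],
  data_type: [9, 11],
  logical_shard_id: [3, 20],
  custom_timestamp: [40, 23],
}.freeze

def decode(id)
  FIELD_LAYOUT.transform_values do |(bits, shift)|
    mask = ~(-1 << bits) << shift # `bits` ones, moved to the field's position
    (id & mask) >> shift
  end
end

# ID taken from the README's VIN::Id example.
parts = decode(63_801_532_235_120_742)
puts parts.inspect
```

With those widths, the README's example ID decodes to sequence 102, data type 0, logical shard 0, and a custom timestamp of 7605735330 milliseconds, matching the values the README lists.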
@@ -0,0 +1,24 @@
1
+ require "erb"
2
+ require "vin/config"
3
+
4
+ class VIN
5
+ module LuaScript
6
+ LUA_SCRIPT_PATH = "lua/id-generation.lua.erb".freeze
7
+
8
+ def self.generate_file(config: nil)
9
+ config ||= VIN::Config.new
10
+ binding = binding()
11
+ binding.local_variable_set(:config, config)
12
+ @generate_file ||= ERB.new(
13
+ File.read(
14
+ File.expand_path("../../#{LUA_SCRIPT_PATH}", File.dirname(__FILE__)),
15
+ ),
16
+ ).result(binding)
17
+ end
18
+
19
+ # Used in tests to ensure that the file is regenerated.
20
+ def self.reset_cache
21
+ @generate_file = nil
22
+ end
23
+ end
24
+ end
@@ -0,0 +1,14 @@
1
+ class VIN
2
+ module Mixins
3
+ module Redis
4
+ DEFAULT_REDIS_URL = "redis://127.0.0.1:6379".freeze
5
+
6
+ def redis
7
+ # TODO: Redis config for multiple servers
8
+ @redis ||= ::Redis.new(
9
+ url: ENV["VIN_REDIS_URL"] || ENV["REDIS_URL"] || DEFAULT_REDIS_URL,
10
+ )
11
+ end
12
+ end
13
+ end
14
+ end
@@ -0,0 +1,59 @@
1
+ class VIN
2
+ class Request
3
+ include VIN::Mixins::Redis
4
+
5
+ MAX_TRIES = 5
6
+
7
+ attr_reader :data_type, :count, :config, :custom_timestamp
8
+
9
+ def initialize(config, data_type, count = 1, custom_timestamp: nil)
10
+ raise(ArgumentError, "data_type must be a number") unless data_type.is_a?(Numeric)
11
+ unless config.data_type_allowed_range.include?(data_type)
12
+ raise(ArgumentError, "data_type is outside the allowed range of #{config.data_type_allowed_range}")
13
+ end
14
+ raise(ArgumentError, "count must be a number") unless count.is_a?(Numeric)
15
+ raise(ArgumentError, "count must be greater than zero") unless count.positive?
16
+
17
+ @tries = 0
18
+ @data_type = data_type
19
+ @count = count
20
+ @config = config
21
+ @custom_timestamp = custom_timestamp
22
+ end
23
+
24
+ def response
25
+ Response.new(try_redis_response)
26
+ end
27
+
28
+ private
29
+
30
+ def lua_script_sha
31
+ @@lua_script_sha ||= redis.script(:load, LuaScript.generate_file(config: config))
32
+ end
33
+
34
+ def lua_keys
35
+ @lua_keys ||= [data_type, count, custom_timestamp].compact
36
+ end
37
+
38
+ # NOTE: If too many requests come in inside of a millisecond the Lua script
39
+ # will lock for 1ms and throw an error. This is meant to retry in those cases.
40
+ def try_redis_response
41
+ @tries += 1
42
+ redis_response
43
+ rescue Redis::CommandError => e
44
+ raise(e) unless @tries < MAX_TRIES
45
+
46
+ # Clear out the cache of the Lua script SHA to force a reload. This
47
+ # is necessary after a Redis restart
48
+ @@lua_script_sha = nil
49
+
50
+ # Sleep longer on each try (quadratic backoff: tries * tries / 900 seconds)
51
+ sleep((@tries * @tries).to_f / 900)
52
+ retry
53
+ end
54
+
55
+ def redis_response
56
+ @redis_response ||= redis.evalsha(lua_script_sha, keys: lua_keys)
57
+ end
58
+ end
59
+ end
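The retry logic in `try_redis_response` can be illustrated without Redis. This is a generic sketch under stated assumptions: `with_retries` is a hypothetical helper, and it rescues `StandardError` so the example stays independent of the redis gem, whereas the gem rescues `Redis::CommandError` and also clears its cached script SHA.

```ruby
# Hypothetical generic helper, not part of the gem.
def with_retries(max_tries: 5)
  tries = 0
  begin
    tries += 1
    yield(tries)
  rescue StandardError
    # Give up and re-raise once the attempt budget is used.
    raise unless tries < max_tries

    # Same delay formula as the gem: tries squared over 900, in seconds
    # (about 1.1 ms, 4.4 ms, 10 ms, then 17.8 ms).
    sleep((tries * tries).to_f / 900)
    retry
  end
end

# Simulate a lock that clears on the third attempt.
result = with_retries do |attempt|
  raise "lock still held" if attempt < 3

  "succeeded on attempt #{attempt}"
end
puts result
```

The delays grow with the square of the attempt number, so the total wait across all five tries stays around 33 ms.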
@@ -0,0 +1,41 @@
1
+ class VIN
2
+ class Response
3
+ START_SEQUENCE_INDEX = 0
4
+ END_SEQUENCE_INDEX = 1
5
+ LOGICAL_SHARD_ID_INDEX = 2
6
+ SECONDS_INDEX = 3
7
+ MICROSECONDS_INDEX = 4
8
+
9
+ def initialize(redis_response)
10
+ @redis_response = redis_response
11
+ end
12
+
13
+ def sequence
14
+ start_sequence..end_sequence
15
+ end
16
+
17
+ def start_sequence
18
+ redis_response[START_SEQUENCE_INDEX]
19
+ end
20
+
21
+ def end_sequence
22
+ redis_response[END_SEQUENCE_INDEX]
23
+ end
24
+
25
+ def logical_shard_id
26
+ redis_response[LOGICAL_SHARD_ID_INDEX]
27
+ end
28
+
29
+ def seconds
30
+ redis_response[SECONDS_INDEX]
31
+ end
32
+
33
+ def microseconds_part
34
+ redis_response[MICROSECONDS_INDEX]
35
+ end
36
+
37
+ private
38
+
39
+ attr_reader :redis_response
40
+ end
41
+ end
@@ -0,0 +1,48 @@
1
+ class VIN
2
+ class Timestamp
3
+ ONE_SECOND_IN_MILLIS = 1_000
4
+ ONE_MILLI_IN_MICRO_SECS = 1_000
5
+
6
+ attr_reader :milliseconds, :epoch
7
+
8
+ def initialize(milliseconds, epoch: 0)
9
+ @milliseconds = milliseconds
10
+ @epoch = epoch
11
+ end
12
+
13
+ def seconds
14
+ (milliseconds / ONE_SECOND_IN_MILLIS).floor
15
+ end
16
+
17
+ def microseconds_part
18
+ (milliseconds - (seconds * ONE_SECOND_IN_MILLIS)) * ONE_MILLI_IN_MICRO_SECS
19
+ end
20
+
21
+ alias to_i milliseconds
22
+
23
+ def to_time
24
+ Time.at(with_unix_epoch.seconds, with_unix_epoch.microseconds_part)
25
+ end
26
+
27
+ def with_unix_epoch
28
+ @with_unix_epoch ||= with_epoch(0)
29
+ end
30
+
31
+ def with_epoch(new_epoch)
32
+ new_milliseconds = milliseconds - (new_epoch - epoch)
33
+
34
+ self.class.new(new_milliseconds, epoch: new_epoch)
35
+ end
36
+
37
+ def self.from_redis(seconds_part, microseconds_part)
38
+ # NOTE: we're dropping the microseconds here because we don't need that
39
+ # level of precision
40
+ milliseconds = (
41
+ (seconds_part * ONE_SECOND_IN_MILLIS) +
42
+ (microseconds_part / ONE_MILLI_IN_MICRO_SECS)
43
+ )
44
+
45
+ new(milliseconds)
46
+ end
47
+ end
48
+ end
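The arithmetic in `Timestamp.from_redis` and `Timestamp#with_epoch` can be worked through with plain integers. The sketch below assumes a seconds/microseconds pair like the one `from_redis` receives and uses the custom epoch shown by the README's `VIN::Id` example; the helper names are hypothetical.

```ruby
# Seconds plus a microseconds part, reduced to whole milliseconds
# (the sub-millisecond remainder is dropped, as in from_redis).
def unix_ms_from_redis(seconds, microseconds)
  (seconds * 1_000) + (microseconds / 1_000)
end

# Re-express a millisecond count against a different epoch, as with_epoch does.
def rebase(milliseconds, from_epoch:, to_epoch:)
  milliseconds - (to_epoch - from_epoch)
end

custom_epoch = 1_688_258_040_000 # epoch shown by the README's VIN::Id example

unix_ms   = unix_ms_from_redis(1_695_863_775, 330_999)
custom_ms = rebase(unix_ms, from_epoch: 0, to_epoch: custom_epoch)
puts unix_ms   # 1695863775330
puts custom_ms # 7605735330

# Rebasing back to the Unix epoch gives a value Ruby's Time can interpret.
back_to_unix = rebase(custom_ms, from_epoch: custom_epoch, to_epoch: 0)
time = Time.at(back_to_unix / 1_000, back_to_unix % 1_000, :millisecond).utc
puts time.strftime("%Y-%m-%d %H:%M:%S.%L UTC") # 2023-09-28 01:16:15.330 UTC
```

The result is the same instant the README prints as `2023-09-27 22:16:15.33 -0300`.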
@@ -0,0 +1,3 @@
1
+ class VIN
2
+ VERSION = "1.0.0".freeze
3
+ end
data/lib/vin.rb ADDED
@@ -0,0 +1,41 @@
1
+ require "redis"
2
+ require "vin/mixins/redis"
3
+
4
+ class VIN
5
+ extend VIN::Mixins::Redis
6
+
7
+ def initialize(config: nil)
8
+ @config = config || VIN::Config.new
9
+ end
10
+
11
+ def generate_id(data_type, timestamp: nil)
12
+ generator.generate_ids(data_type, 1, timestamp: timestamp).first
13
+ end
14
+
15
+ def generate_ids(data_type, count, timestamp: nil)
16
+ ids = []
17
+ # The Lua script can't always return as many IDs as you may want. So we loop
18
+ # until we have the exact amount.
19
+ while ids.length < count
20
+ initial_id_count = ids.length
21
+ ids += generator.generate_ids(data_type, count - ids.length, timestamp: timestamp)
22
+ # Ensure the ids array keeps growing as infinite loop insurance
23
+ return ids unless ids.length > initial_id_count
24
+ end
25
+ ids
26
+ end
27
+
28
+ private
29
+
30
+ def generator
31
+ @generator ||= Generator.new(config: @config)
32
+ end
33
+ end
34
+
35
+ require "vin/generator"
36
+ require "vin/id"
37
+ require "vin/lua_script"
38
+ require "vin/request"
39
+ require "vin/response"
40
+ require "vin/timestamp"
41
+ require "vin/config"
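Reviewer note: the accumulate-until-full loop in `generate_ids` can be exercised without Redis. This is a minimal sketch under stated assumptions: `CappedGenerator` is a hypothetical stand-in for the gem's generator that returns at most `cap` IDs per call, and `collect_ids` mirrors the loop above with the `data_type` and `timestamp` arguments left out.

```ruby
# Hypothetical stand-in for the gem's generator: returns at most `cap` IDs per call.
class CappedGenerator
  attr_reader :calls

  def initialize(cap)
    @cap = cap
    @next_id = 0
    @calls = 0
  end

  def generate_ids(count)
    @calls += 1
    Array.new([count, @cap].min) { @next_id += 1 }
  end
end

# Same loop shape as VIN#generate_ids, minus data_type and timestamp.
def collect_ids(generator, count)
  ids = []
  while ids.length < count
    before = ids.length
    ids += generator.generate_ids(count - ids.length)
    # Bail out if a call produced nothing, so the loop cannot spin forever.
    return ids unless ids.length > before
  end
  ids
end

generator = CappedGenerator.new(4)
ids = collect_ids(generator, 10)
stalled = collect_ids(CappedGenerator.new(0), 3)
```

With a cap of 4, ten IDs take three calls (4, 4, then 2). A generator that returns nothing makes the loop return early with a short array, so callers that need an exact count may want to check the length of the result.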
data/lua/id-generation.lua.erb ADDED
@@ -0,0 +1,93 @@
1
+ local last_logical_shard_id_key = 'vin-generator-last-logical-shard-id'
2
+ local max_sequence = <%= config.max_sequence %>
3
+
4
+ local data_type = tonumber(KEYS[1])
5
+ local num_ids = tonumber(KEYS[2])
6
+ local custom_timestamp = tonumber(KEYS[3]) -- Optional custom timestamp in Unix milliseconds (absolute time)
7
+
8
+ -- Allow one server to act as multiple shards
9
+ local logical_shard_id_min = <%= config.logical_shard_id_range.min %>
10
+ local logical_shard_id_max = <%= config.logical_shard_id_range.max %>
11
+
12
+ local logical_shard_id = nil
13
+ if redis.call('EXISTS', last_logical_shard_id_key) == 0 then
14
+ logical_shard_id = logical_shard_id_min
15
+ else
16
+ local last_shard_id = tonumber(redis.call('GET', last_logical_shard_id_key))
17
+
18
+ if last_shard_id >= logical_shard_id_max or last_shard_id < logical_shard_id_min then
19
+ logical_shard_id = logical_shard_id_min
20
+ else
21
+ logical_shard_id = last_shard_id + 1
22
+ end
23
+ end
24
+
25
+ redis.call('SET', last_logical_shard_id_key, logical_shard_id)
26
+
27
+ --[[
28
+ Scope lock and sequence keys to the logical shard and the specific data_type being requested,
29
+ so that any per-millisecond limitation applies per shard. The whole "pure function" limitation keeps us
30
+ from using a random shard_id here, so the shard ID is round-robined above by advancing a Redis key
31
+ on each call.
32
+ ]]--
33
+ local lock_key = 'vin-generator-lock-' .. logical_shard_id .. '-' .. data_type
34
+ local sequence_key = 'vin-generator-sequence-' .. logical_shard_id .. '-' .. data_type
35
+
36
+ if redis.call('EXISTS', lock_key) == 1 then
37
+ redis.log(redis.LOG_NOTICE, 'VIN: Cannot generate ID, waiting for lock to expire.')
38
+ return redis.error_reply('VIN: Cannot generate ID, waiting for lock to expire.')
39
+ end
40
+
41
+ -- Increment by a set number
42
+ local end_sequence = redis.call('INCRBY', sequence_key, num_ids)
43
+ local start_sequence = end_sequence - num_ids + 1
44
+
45
+ if end_sequence >= max_sequence then
46
+ --[[
47
+ As the sequence is about to roll around, we can't generate another ID until we're sure we're not in the same
48
+ millisecond since we last rolled. This is because we may have already generated an ID with the same time and
49
+ sequence, and we cannot allow even the smallest possibility of duplicates. It's also because if we roll the sequence
50
+ around, we will start generating IDs with smaller values than the ones previously in this millisecond - that would
51
+ break our k-ordering guarantees!
52
+
53
+ The only way we can handle this is to block for a millisecond, as we can't store the time due to the purity constraints
54
+ of Redis Lua scripts.
55
+
56
+ In addition to a neat side-effect of handling leap seconds (where milliseconds will last a little bit longer to bring
57
+ time back to where it should be) because Redis uses system time internally to expire keys, this prevents any duplicate
58
+ IDs from being generated if the rate of generation is greater than the maximum sequence per millisecond.
59
+
60
+ Note that it blocks even if it rolled around *not* in the same millisecond; this is because unless we do this, the
61
+ IDs won't remain ordered.
62
+ --]]
63
+ redis.log(redis.LOG_NOTICE, 'VIN: Rolling sequence back to the start, locking for 1ms.')
64
+ redis.call('SET', sequence_key, '-1')
65
+ redis.call('PSETEX', lock_key, 1, 'lock')
66
+ end_sequence = max_sequence
67
+ end
68
+
69
+ --[[
70
+ The TIME command MUST be called after anything that mutates state, or the Redis server will error the script out.
71
+ This is to ensure the script is "pure" in the sense that randomness or time based input will not change the
72
+ outcome of the writes.
73
+
74
+ See the "Scripts as pure functions" section at http://redis.io/commands/eval for more information.
75
+ --]]
76
+ local seconds, microseconds
77
+ if custom_timestamp then
78
+ -- Custom timestamp is already in Unix milliseconds (absolute time)
79
+ seconds = math.floor(custom_timestamp / 1000)
80
+ microseconds = (custom_timestamp % 1000) * 1000
81
+ else
82
+ local time = redis.call('TIME')
83
+ seconds = tonumber(time[1])
84
+ microseconds = tonumber(time[2])
85
+ end
86
+
87
+ return {
88
+ start_sequence,
89
+ end_sequence, -- Doesn't need conversion, the result of INCRBY or the variable set is always a number.
90
+ logical_shard_id,
91
+ seconds,
92
+ microseconds
93
+ }
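Reviewer note: two steps of the Lua script above, the round-robin shard selection and the `INCRBY` range reservation, can be modeled in plain Ruby. This is a sketch under stated assumptions, not the gem's code: a Hash stands in for Redis, every name is hypothetical, and the lock and sequence-rollover branch is left out.

```ruby
# In-memory model of two steps of the Lua script; a Hash stands in for Redis.
# All names are hypothetical and the lock/rollover branch is omitted.
LAST_SHARD_KEY = "last-logical-shard-id"

# Round-robin: start at min, advance by one per call, and wrap back to min
# after max or when the stored value falls outside the configured range.
def next_shard_id(store, min, max)
  last = store[LAST_SHARD_KEY]
  shard = if last.nil? || last >= max || last < min
    min
  else
    last + 1
  end
  store[LAST_SHARD_KEY] = shard
end

# Reserve num_ids consecutive sequence numbers, as INCRBY does,
# and return the inclusive [start, end] pair.
def reserve_sequences(store, key, num_ids)
  end_sequence = store.fetch(key, 0) + num_ids
  store[key] = end_sequence
  [end_sequence - num_ids + 1, end_sequence]
end

store = {}
shards = Array.new(5) { next_shard_id(store, 0, 2) }
first_range = reserve_sequences(store, "sequence-0-1", 3)
second_range = reserve_sequences(store, "sequence-0-1", 2)
```

Each call moves to the next shard in the range, so consecutive requests for the same data type land on different sequence keys. Within one key, back-to-back reservations never overlap.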
metadata ADDED
@@ -0,0 +1,74 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: hoov_vin
3
+ version: !ruby/object:Gem::Version
4
+ version: 1.0.0
5
+ platform: ruby
6
+ authors:
7
+ - Roger Oba
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2025-08-19 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: redis
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - "~>"
18
+ - !ruby/object:Gem::Version
19
+ version: '5'
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - "~>"
25
+ - !ruby/object:Gem::Version
26
+ version: '5'
27
+ description: A customizable Redis-powered Ruby client for generating unique, monotonically-increasing
28
+ integer IDs, for use in distributed systems and databases. Powered by Redis, drawing
29
+ heavy inspiration from Icicle, Twitter Snowflake, and Dogtag.
30
+ email: roger@hoov.com.br
31
+ executables: []
32
+ extensions: []
33
+ extra_rdoc_files: []
34
+ files:
35
+ - README.md
36
+ - lib/vin.rb
37
+ - lib/vin/config.rb
38
+ - lib/vin/generator.rb
39
+ - lib/vin/id.rb
40
+ - lib/vin/lua_script.rb
41
+ - lib/vin/mixins/redis.rb
42
+ - lib/vin/request.rb
43
+ - lib/vin/response.rb
44
+ - lib/vin/timestamp.rb
45
+ - lib/vin/version.rb
46
+ - lua/id-generation.lua.erb
47
+ homepage: https://github.com/hoovbr/vin
48
+ licenses:
49
+ - MIT
50
+ metadata:
51
+ homepage_uri: https://github.com/hoovbr/vin
52
+ source_code_uri: https://github.com/hoovbr/vin
53
+ changelog_uri: https://github.com/hoovbr/vin/blob/main/CHANGELOG.md
54
+ rubygems_mfa_required: 'true'
55
+ post_install_message:
56
+ rdoc_options: []
57
+ require_paths:
58
+ - lib
59
+ required_ruby_version: !ruby/object:Gem::Requirement
60
+ requirements:
61
+ - - ">="
62
+ - !ruby/object:Gem::Version
63
+ version: '3.2'
64
+ required_rubygems_version: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - ">="
67
+ - !ruby/object:Gem::Version
68
+ version: '0'
69
+ requirements: []
70
+ rubygems_version: 3.5.3
71
+ signing_key:
72
+ specification_version: 4
73
+ summary: A Redis-powered Ruby ID generation client
74
+ test_files: []