encoded_id 1.0.0.rc5 → 1.0.0.rc7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (38)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +99 -3
  3. data/README.md +86 -329
  4. data/context/encoded_id.md +437 -0
  5. data/lib/encoded_id/alphabet.rb +34 -3
  6. data/lib/encoded_id/blocklist.rb +100 -0
  7. data/lib/encoded_id/encoders/base_configuration.rb +154 -0
  8. data/lib/encoded_id/encoders/hashid.rb +527 -0
  9. data/lib/encoded_id/encoders/hashid_configuration.rb +40 -0
  10. data/lib/encoded_id/encoders/hashid_consistent_shuffle.rb +110 -0
  11. data/lib/encoded_id/encoders/hashid_ordinal_alphabet_separator_guards.rb +244 -0
  12. data/lib/encoded_id/encoders/hashid_salt.rb +51 -0
  13. data/lib/encoded_id/encoders/my_sqids.rb +454 -0
  14. data/lib/encoded_id/encoders/sqids.rb +59 -0
  15. data/lib/encoded_id/encoders/sqids_configuration.rb +22 -0
  16. data/lib/encoded_id/encoders/sqids_with_blocklist_mode.rb +54 -0
  17. data/lib/encoded_id/hex_representation.rb +29 -14
  18. data/lib/encoded_id/reversible_id.rb +115 -82
  19. data/lib/encoded_id/version.rb +3 -1
  20. data/lib/encoded_id.rb +34 -4
  21. metadata +34 -26
  22. data/.devcontainer/Dockerfile +0 -9
  23. data/.devcontainer/compose.yml +0 -8
  24. data/.devcontainer/devcontainer.json +0 -8
  25. data/.standard.yml +0 -2
  26. data/Gemfile +0 -36
  27. data/Rakefile +0 -20
  28. data/Steepfile +0 -5
  29. data/ext/encoded_id/extconf.rb +0 -3
  30. data/ext/encoded_id/extension.c +0 -123
  31. data/ext/encoded_id/hashids.c +0 -939
  32. data/ext/encoded_id/hashids.h +0 -139
  33. data/lib/encoded_id/hash_id.rb +0 -227
  34. data/lib/encoded_id/hash_id_consistent_shuffle.rb +0 -27
  35. data/lib/encoded_id/hash_id_salt.rb +0 -15
  36. data/lib/encoded_id/ordinal_alphabet_separator_guards.rb +0 -90
  37. data/rbs_collection.yaml +0 -24
  38. data/sig/encoded_id.rbs +0 -189
@@ -0,0 +1,437 @@
1
+ # EncodedId Ruby Gem - Technical Documentation
2
+
3
+ ## Overview
4
+
5
+ `encoded_id` is a Ruby gem that provides reversible obfuscation of numerical and hexadecimal IDs into human-readable strings suitable for use in URLs. It offers a secure way to hide sequential database IDs from users while maintaining the ability to decode them back to their original values.
6
+
7
+ ## Key Features
8
+
9
+ - **Reversible Encoding**: Unlike UUIDs, encoded IDs can be decoded back to their original numeric values
10
+ - **Multiple ID Support**: Encode multiple numeric IDs in a single string
11
+ - **Algorithm Choice**: Supports both Hashids and Sqids encoding algorithms
12
+ - **Human-Readable Format**: Character grouping and configurable separators for better readability
13
+ - **Character Mapping**: Handles easily confused characters (0/O, 1/I/l) through equivalence mapping
14
+ - **Performance Optimized**: Uses an optimized Hashids implementation
15
+ - **Profanity Protection**: Built-in blocklist support to prevent offensive words in generated IDs
16
+ - **Customizable**: Configurable alphabets, lengths, and formatting options
17
+ - **Blocklist Modes**: Three modes for controlling blocklist checking performance
18
+
19
+ ## Quick Reference
20
+
21
+ ```ruby
22
+ # Sqids encoder (default, no salt required)
23
+ coder = EncodedId::ReversibleId.sqids(min_length: 10)
24
+ id = coder.encode(123) # => "p5w9-z27j-k8"
25
+ nums = coder.decode(id) # => [123]
26
+
27
+ # Hashids encoder (requires salt)
28
+ coder = EncodedId::ReversibleId.hashid(salt: "my-salt", min_length: 8)
29
+ id = coder.encode([78, 45]) # => "z2j7-0dmw"
30
+ nums = coder.decode(id) # => [78, 45]
31
+
32
+ # UUID encoding (experimental)
33
+ hex_id = coder.encode_hex("9a566b8b-8618-42ab-8db7-a5a0276401fd")
34
+ uuid = coder.decode_hex(hex_id).first
35
+ ```
36
+
37
+ ## Core API
38
+
39
+ ### EncodedId::ReversibleId
40
+
41
+ The main class for encoding and decoding IDs.
42
+
43
+ #### Factory Methods (Recommended)
44
+
45
+ Factory methods provide the cleanest way to create encoders:
46
+
47
+ ```ruby
48
+ # Sqids encoder (default, no salt required)
49
+ coder = EncodedId::ReversibleId.sqids(
50
+ min_length: 10,
51
+ blocklist: ["bad", "words"]
52
+ )
53
+
54
+ # Hashids encoder (requires salt)
55
+ coder = EncodedId::ReversibleId.hashid(
56
+ salt: "my-salt",
57
+ min_length: 8,
58
+ blocklist: ["bad", "words"]
59
+ )
60
+ ```
61
+
62
+ Both factory methods accept all configuration options described below.
63
+
64
+ #### Constructor (Alternative)
65
+
66
+ You can also use the constructor with explicit configuration objects:
67
+
68
+ ```ruby
69
+ # Using Sqids configuration
70
+ config = EncodedId::Encoders::SqidsConfiguration.new(
71
+ min_length: 8, # Minimum length of encoded string
72
+ split_at: 4, # Split encoded string every X characters
73
+ split_with: "-", # Character to split with
74
+ alphabet: EncodedId::Alphabet.modified_crockford,
75
+ hex_digit_encoding_group_size: 4,
76
+ max_length: 128, # Maximum length limit
77
+ max_inputs_per_id: 32, # Maximum IDs to encode together
78
+ blocklist: nil, # Words to prevent in IDs
79
+ blocklist_mode: :length_threshold, # :always, :length_threshold, or :raise_if_likely
80
+ blocklist_max_length: 32 # Max length for :length_threshold mode
81
+ )
82
+ coder = EncodedId::ReversibleId.new(config)
83
+
84
+ # Using Hashids configuration (requires salt)
85
+ config = EncodedId::Encoders::HashidConfiguration.new(
86
+ salt: "my-salt", # Required for Hashids (min 4 chars)
87
+ min_length: 8,
88
+ # ... other options same as above
89
+ )
90
+ coder = EncodedId::ReversibleId.new(config)
91
+ ```
92
+
93
+ **Note**: As of v1.0.0, the default encoder is `:sqids`. For backwards compatibility with pre-v1 versions, use `ReversibleId.hashid()`.
94
+
95
+ #### Key Methods
96
+
97
+ ##### encode(values)
98
+ Encodes one or more integer IDs into an obfuscated string.
99
+
100
+ ```ruby
101
+ coder = EncodedId::ReversibleId.sqids
102
+
103
+ # Single ID
104
+ coder.encode(123) # => "p5w9-z27j"
105
+
106
+ # Multiple IDs
107
+ coder.encode([78, 45]) # => "z2j7-0dmw"
108
+ ```
109
+
110
+ ##### decode(encoded_id, downcase: false)
111
+ Decodes an encoded string back to original IDs.
112
+
113
+ ```ruby
114
+ coder.decode("p5w9-z27j") # => [123]
115
+ coder.decode("z2j7-0dmw") # => [78, 45]
116
+
117
+ # Case-sensitive by default (v1.0.0+)
118
+ coder.decode("p5w9-z27J") # => [] (case doesn't match)
119
+
120
+ # For case-insensitive matching (pre-v1 behavior)
121
+ coder.decode("p5w9-z27J", downcase: true) # => [123]
122
+ ```
123
+
124
+ **Note**: As of v1.0.0, decoding is case-sensitive by default (`downcase: false`). Set `downcase: true` for backwards compatibility.
125
+
126
+ ##### encode_hex(hex_strings) (Experimental)
127
+ Encodes hexadecimal strings (like UUIDs).
128
+
129
+ ```ruby
130
+ # Encode UUID
131
+ coder.encode_hex("9a566b8b-8618-42ab-8db7-a5a0276401fd")
132
+ # => "5jjy-c8d9-hxp2-qsve-rgh9-rxnt-7nb5-tve7-bf84-vr"
133
+
134
+ # With larger group size for shorter output
135
+ coder = EncodedId::ReversibleId.sqids(hex_digit_encoding_group_size: 32)
136
+ coder.encode_hex("9a566b8b-8618-42ab-8db7-a5a0276401fd")
137
+ # => "vr7m-qra8-m5y6-dkgj-5rqr-q44e-gp4a-52"
138
+ ```
139
+
140
+ ##### decode_hex(encoded_id, downcase: false) (Experimental)
141
+ Decodes back to hexadecimal strings.
142
+
143
+ ```ruby
144
+ coder.decode_hex("w72a-y0az") # => ["10f8c"]
145
+
146
+ # For case-insensitive decoding (pre-v1 behavior)
147
+ coder.decode_hex("W72A-Y0AZ", downcase: true) # => ["10f8c"]
148
+ ```
149
+
150
+ ### EncodedId::Alphabet
151
+
152
+ Class for creating custom alphabets.
153
+
154
+ #### Predefined Alphabets
155
+
156
+ ```ruby
157
+ # Default: modified Crockford Base32
158
+ # Characters: "0123456789abcdefghjkmnpqrstuvwxyz"
159
+ # Excludes: i, l, o (easily confused)
160
+ # Equivalences: {"o"=>"0", "i"=>"j", "l"=>"1", ...}
161
+ EncodedId::Alphabet.modified_crockford
162
+ ```
163
+
164
+ #### Custom Alphabets
165
+
166
+ ```ruby
167
+ # Simple custom alphabet
168
+ alphabet = EncodedId::Alphabet.new("0123456789abcdef")
169
+
170
+ # With character equivalences
171
+ alphabet = EncodedId::Alphabet.new(
172
+ "0123456789ABCDEF",
173
+ {"a"=>"A", "b"=>"B", "c"=>"C", "d"=>"D", "e"=>"E", "f"=>"F"}
174
+ )
175
+
176
+ # Greek alphabet example
177
+ alphabet = EncodedId::Alphabet.new("αβγδεζηθικλμνξοπρστυφχψω")
178
+ coder = EncodedId::ReversibleId.sqids(alphabet: alphabet)
179
+ coder.encode(123) # => "θεαψ-ζκυο"
180
+ ```
181
+
182
+ ### EncodedId::Blocklist
183
+
184
+ Class for managing profanity/word blocklists.
185
+
186
+ #### Predefined Blocklists
187
+
188
+ ```ruby
189
+ # Empty blocklist (no filtering)
190
+ EncodedId::Blocklist.empty
191
+
192
+ # Minimal blocklist (~50 common profane words)
193
+ EncodedId::Blocklist.minimal
194
+
195
+ # Full Sqids default blocklist (comprehensive)
196
+ EncodedId::Blocklist.sqids_blocklist
197
+
198
+ # Use in configuration
199
+ coder = EncodedId::ReversibleId.sqids(
200
+ blocklist: EncodedId::Blocklist.minimal
201
+ )
202
+ ```
203
+
204
+ #### Custom Blocklists
205
+
206
+ ```ruby
207
+ # From array
208
+ blocklist = EncodedId::Blocklist.new(["bad", "offensive", "words"])
209
+
210
+ # Merge blocklists
211
+ combined = EncodedId::Blocklist.minimal.merge(
212
+ EncodedId::Blocklist.new(["custom", "words"])
213
+ )
214
+
215
+ # Filter for specific alphabet (automatic with configuration)
216
+ filtered = blocklist.filter_for_alphabet(EncodedId::Alphabet.modified_crockford)
217
+ ```
218
+
219
+ **Note**: Blocklists are automatically filtered to only include words possible with your configured alphabet. This optimization improves performance.
220
+
221
+ ## Configuration Options
222
+
223
+ ### Basic Options
224
+
225
+ - **min_length**: Minimum length of encoded string (default: 8)
226
+ - **max_length**: Maximum allowed length (default: 128) to prevent DoS attacks
227
+ - **max_inputs_per_id**: Maximum IDs encodable together (default: 32)
228
+ - **hex_digit_encoding_group_size**: Group size for hex encoding (default: 4)
229
+
230
+ ### Encoder Selection
231
+
232
+ ```ruby
233
+ # Sqids encoder (default, no salt required)
234
+ coder = EncodedId::ReversibleId.sqids
235
+
236
+ # Hashids encoder (requires salt - minimum 4 characters)
237
+ coder = EncodedId::ReversibleId.hashid(salt: "my-salt-minimum-4-chars")
238
+ ```
239
+
240
+ **Important**:
241
+ - As of v1.0.0, `:sqids` is the default encoder
242
+ - **Sqids**: No salt required, automatically avoids blocklisted words via iteration
243
+ - **Hashids**: Salt required (min 4 chars), raises exception if blocklisted word appears
244
+ - Hashids and Sqids produce different encodings and are **not compatible**
245
+ - Do NOT change encoders after going to production with existing encoded IDs
246
+
247
+ ### Blocklist Configuration
248
+
249
+ #### Blocklist Modes
250
+
251
+ Control how blocklist checking behaves to balance performance and safety:
252
+
253
+ ```ruby
254
+ # :length_threshold (default) - Check blocklist only until encoded length reaches blocklist_max_length
255
+ # Best for most use cases - prevents performance issues with very long IDs
256
+ coder = EncodedId::ReversibleId.sqids(
257
+ blocklist: EncodedId::Blocklist.minimal,
258
+ blocklist_mode: :length_threshold,
259
+ blocklist_max_length: 32 # Stop checking after 32 characters
260
+ )
261
+
262
+ # :always - Always check blocklist regardless of encoded length
263
+ # Can be slow for long IDs or large blocklists
264
+ coder = EncodedId::ReversibleId.hashid(
265
+ salt: "my-salt",
266
+ blocklist: ["bad", "words"],
267
+ blocklist_mode: :always
268
+ )
269
+
270
+ # :raise_if_likely - Raise error at configuration time if settings likely cause blocklist collisions
271
+ # Prevents configurations that would cause performance issues
272
+ coder = EncodedId::ReversibleId.sqids(
273
+ min_length: 8,
274
+ blocklist: ["bad", "words"],
275
+ blocklist_mode: :raise_if_likely
276
+ )
277
+ # Raises InvalidConfigurationError if min_length > blocklist_max_length
278
+ ```
279
+
280
+ **Blocklist Behavior by Encoder**:
281
+ - **Sqids**: Iteratively regenerates to avoid blocklisted words (may impact encoding performance)
282
+ - **Hashids**: Raises `EncodedId::BlocklistError` if a blocklisted word appears
283
+
284
+ **Recommendation**: Use `:length_threshold` mode (default) for best balance of performance and safety.
285
+
286
+ ### Formatting Options
287
+
288
+ ```ruby
289
+ # Custom splitting
290
+ coder = EncodedId::ReversibleId.sqids(
291
+ split_at: 3, # Group every 3 chars
292
+ split_with: "." # Use dots
293
+ )
294
+ coder.encode(123) # => "p5w.9z2.7j"
295
+
296
+ # No splitting
297
+ coder = EncodedId::ReversibleId.sqids(split_at: nil)
298
+ coder.encode(123) # => "p5w9z27j"
299
+ ```
300
+
301
+ ## Exception Handling
302
+
303
+ | Exception | Description |
304
+ |-----------|-------------|
305
+ | `EncodedId::InvalidConfigurationError` | Invalid configuration parameters |
306
+ | `EncodedId::InvalidAlphabetError` | Invalid alphabet (< 16 unique chars) |
307
+ | `EncodedId::EncodedIdFormatError` | Invalid encoded ID format |
308
+ | `EncodedId::EncodedIdLengthError` | Encoded ID exceeds max_length |
309
+ | `EncodedId::InvalidInputError` | Invalid input (negative integers, too many inputs) |
310
+ | `EncodedId::SaltError` | Invalid salt (too short, only for Hashids) |
311
+ | `EncodedId::BlocklistError` | Generated ID contains blocklisted word (Hashids only) |
312
+
313
+ ## Usage Examples
314
+
315
+ ### Basic Usage
316
+ ```ruby
317
+ # Initialize with Sqids (no salt needed)
318
+ coder = EncodedId::ReversibleId.sqids
319
+
320
+ # Encode/decode cycle
321
+ encoded = coder.encode(123) # => "p5w9-z27j"
322
+ decoded = coder.decode(encoded) # => [123]
323
+ original_id = decoded.first # => 123
324
+ ```
325
+
326
+ ### Multiple IDs
327
+ ```ruby
328
+ # Encode multiple IDs in one string
329
+ encoded = coder.encode([78, 45, 92]) # => "z2j7-0dmw-kf8p"
330
+ decoded = coder.decode(encoded) # => [78, 45, 92]
331
+ ```
332
+
333
+ ### With Hashids and Blocklist
334
+ ```ruby
335
+ coder = EncodedId::ReversibleId.hashid(
336
+ salt: "my-app-salt",
337
+ min_length: 12,
338
+ blocklist: EncodedId::Blocklist.minimal,
339
+ blocklist_mode: :length_threshold
340
+ )
341
+
342
+ encoded = coder.encode(123)
343
+ # Raises BlocklistError if result contains blocklisted word
344
+ ```
345
+
346
+ ### Custom Configuration
347
+ ```ruby
348
+ # Highly customized Sqids instance
349
+ coder = EncodedId::ReversibleId.sqids(
350
+ min_length: 12,
351
+ split_at: 3,
352
+ split_with: ".",
353
+ alphabet: EncodedId::Alphabet.new("0123456789ABCDEF"),
354
+ blocklist: ["BAD", "FAKE"],
355
+ blocklist_mode: :length_threshold,
356
+ blocklist_max_length: 32
357
+ )
358
+ ```
359
+
360
+ ### Hex Encoding (UUIDs)
361
+ ```ruby
362
+ # For encoding UUIDs efficiently
363
+ coder = EncodedId::ReversibleId.sqids(hex_digit_encoding_group_size: 32)
364
+
365
+ uuid = "550e8400-e29b-41d4-a716-446655440000"
366
+ encoded = coder.encode_hex(uuid)
367
+ decoded = coder.decode_hex(encoded).first # => original UUID (without hyphens)
368
+ ```
369
+
370
+ ## Performance Considerations
371
+
372
+ 1. **Algorithm Choice**:
373
+ - Hashids: Faster encoding, especially with blocklists
374
+ - Sqids: Faster decoding, automatically avoids blocklisted words
375
+
376
+ 2. **Blocklist Impact**:
377
+ - Large blocklists slow encoding, especially with Sqids (which iterates to avoid words)
378
+ - Hashids may raise exceptions requiring retry logic
379
+ - Use `blocklist_mode: :length_threshold` for best performance
380
+ - `:always` mode can significantly impact encoding speed for long IDs
381
+ - Blocklists are automatically filtered for your alphabet, improving performance
382
+
383
+ 3. **Blocklist Mode Performance**:
384
+ - `:length_threshold` (default): Only checks blocklist for IDs ≤ `blocklist_max_length` (default: 32)
385
+ - `:always`: Checks all IDs regardless of length (can be slow)
386
+ - `:raise_if_likely`: Validates configuration at initialization to prevent performance issues
387
+
388
+ 4. **Length vs Performance**: Longer minimum lengths may require more computation
389
+
390
+ 5. **Memory Usage**: The gem uses optimized implementations to minimize memory allocation
391
+
392
+ ## Version Compatibility
393
+
394
+ **v1.0.0 Breaking Changes:**
395
+
396
+ 1. **Default encoder**: Changed from `:hashids` to `:sqids`
397
+ 2. **Case sensitivity**: `decode` is now case-sensitive by default (`downcase: false`)
398
+ - Pre-v1: `decode("ABC")` and `decode("abc")` were equivalent
399
+ - v1.0.0+: These produce different results unless `downcase: true`
400
+ 3. **Salt requirement**: Sqids (default) doesn't require salt; Hashids still requires salt
401
+ 4. **Migration**: For backwards compatibility with pre-v1:
402
+ ```ruby
403
+ coder = EncodedId::ReversibleId.hashid(salt: "your-salt")
404
+ decoded = coder.decode(id, downcase: true)
405
+ ```
406
+
407
+ ## Security Notes
408
+
409
+ **Important**: Encoded IDs are NOT cryptographically secure. They provide obfuscation, not encryption, so do not rely on them for security purposes. Anyone who learns the configuration (the salt for Hashids, the alphabet for Sqids) can decode them, and they may be reversible by brute force even without it.
410
+
411
+ Use encoded IDs for:
412
+ - Hiding sequential database IDs
413
+ - Creating user-friendly URLs
414
+ - Making ID enumeration harder (not a substitute for authorization checks)
415
+ - Obscuring business metrics (user counts, order volumes)
416
+
417
+ Do NOT use for:
418
+ - Secure tokens
419
+ - Authentication
420
+ - Sensitive data protection
421
+ - Cryptographic purposes
422
+
423
+ ## Installation
424
+
425
+ ```ruby
426
+ # Gemfile
427
+ gem 'encoded_id'
428
+ ```
429
+
430
+ ## Best Practices
431
+
432
+ 1. **Consistent Configuration**: Once in production, don't change salt, encoder, or alphabet
433
+ 2. **Error Handling**: Always handle potential exceptions when decoding user input
434
+ 3. **Length Limits**: Set appropriate max_length to prevent DoS attacks
435
+ 4. **Validation**: Validate decoded IDs before using them in database queries
436
+ 5. **Blocklist Mode**: Use `:length_threshold` (default) for production - best performance/safety balance
437
+ 6. **Factory Methods**: Prefer `ReversibleId.sqids()` and `ReversibleId.hashid()` over constructor
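Reviewer sketch (not part of the gem): the `split_at` / `split_with` formatting and the character equivalences described in the added documentation can be illustrated in plain Ruby. The helper names `humanize` and `normalize` and the `EQUIVALENCES` constant are hypothetical, chosen only for illustration. The gem's actual implementation may differ, and only three of the documented equivalences are shown.

```ruby
# Hypothetical helpers for illustration; these are NOT part of the encoded_id API.

# Three of the equivalences listed for the modified Crockford alphabet.
EQUIVALENCES = {"o" => "0", "i" => "j", "l" => "1"}.freeze

# Insert `split_with` after every `split_at` characters.
# Returns the string unchanged when splitting is disabled.
def humanize(id, split_at: 4, split_with: "-")
  return id if split_at.nil? || split_with.nil? || split_at < 1

  id.scan(/.{1,#{split_at}}/).join(split_with)
end

# Strip the separator, then map confusable characters onto alphabet characters.
def normalize(id, split_with: "-", equivalences: EQUIVALENCES)
  id.gsub(split_with, "").each_char.map { |char| equivalences.fetch(char, char) }.join
end

humanize("p5w9z27j")                               # => "p5w9-z27j"
humanize("p5w9z27j", split_at: 3, split_with: ".") # => "p5w.9z2.7j"
humanize("p5w9z27j", split_at: nil)                # => "p5w9z27j"
normalize("o123-ilab")                             # => "0123j1ab"
```

The sketch keeps formatting separate from encoding, which is why the same underlying string can be shown with different separators without changing what it decodes to.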
@@ -1,10 +1,14 @@
1
1
  # frozen_string_literal: true
2
2
 
3
+ # rbs_inline: enabled
4
+
3
5
  module EncodedId
6
+ # Represents a character set (alphabet) used for encoding IDs, with optional character equivalences.
4
7
  class Alphabet
5
8
  MIN_UNIQUE_CHARACTERS = 16
6
9
 
7
10
  class << self
11
+ # @rbs return: Alphabet
8
12
  def modified_crockford
9
13
  new(
10
14
  "0123456789abcdefghjkmnpqrstuvwxyz",
@@ -16,15 +20,21 @@ module EncodedId
16
20
  )
17
21
  end
18
22
 
23
+ # @rbs return: Alphabet
19
24
  def alphanum
20
25
  new("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890")
21
26
  end
22
27
  end
23
28
 
29
+ # @rbs @unique_characters: Array[String]
30
+ # @rbs @characters: String
31
+ # @rbs @equivalences: Hash[String, String]?
32
+
33
+ # @rbs (String | Array[String] characters, ?Hash[String, String]? equivalences) -> void
24
34
  def initialize(characters, equivalences = nil)
25
35
  raise_invalid_alphabet! unless valid_input_characters?(characters)
26
36
  @unique_characters = unique_character_alphabet(characters)
27
- raise_invalid_alphabet! unless valid_characters?
37
+ raise_invalid_characters! unless valid_characters?
28
38
  raise_character_set_too_small! unless sufficient_characters?
29
39
  raise_invalid_equivalences! unless valid_equivalences?(equivalences)
30
40
 
@@ -32,24 +42,31 @@ module EncodedId
32
42
  @equivalences = equivalences
33
43
  end
34
44
 
35
- attr_reader :unique_characters, :characters, :equivalences
45
+ attr_reader :unique_characters #: Array[String]
46
+ attr_reader :characters #: String
47
+ attr_reader :equivalences #: Hash[String, String]?
36
48
 
49
+ # @rbs (String character) -> bool
37
50
  def include?(character)
38
51
  unique_characters.include?(character)
39
52
  end
40
53
 
54
+ # @rbs return: Array[String]
41
55
  def to_a
42
56
  unique_characters.dup
43
57
  end
44
58
 
59
+ # @rbs return: String
45
60
  def to_s
46
61
  @characters.dup
47
62
  end
48
63
 
64
+ # @rbs return: String
49
65
  def inspect
50
66
  "#<#{self.class.name} chars: #{unique_characters.inspect}>"
51
67
  end
52
68
 
69
+ # @rbs return: Integer
53
70
  def size
54
71
  unique_characters.size
55
72
  end
@@ -57,23 +74,29 @@ module EncodedId
57
74
 
58
75
  private
59
76
 
77
+ # @rbs (String | Array[String] characters) -> bool
60
78
  def valid_input_characters?(characters)
61
79
  return false unless characters.is_a?(Array) || characters.is_a?(String)
62
80
  characters.size > 0
63
81
  end
64
82
 
83
+ # @rbs (String | Array[String] characters) -> Array[String]
65
84
  def unique_character_alphabet(characters)
66
85
  (characters.is_a?(Array) ? characters : characters.chars).uniq
67
86
  end
68
87
 
88
+ # @rbs return: bool
69
89
  def valid_characters?
70
90
  unique_characters.size > 0 && unique_characters.grep(/\s|\0/).size == 0
71
91
  end
72
92
 
93
+ # @rbs return: bool
73
94
  def sufficient_characters?
74
95
  unique_characters.size >= MIN_UNIQUE_CHARACTERS
75
96
  end
76
97
 
98
+ # Validates that equivalences map keys to values in the alphabet, and that keys are not already in the alphabet
99
+ # @rbs (Hash[String, String]? equivalences) -> bool
77
100
  def valid_equivalences?(equivalences)
78
101
  return true if equivalences.nil?
79
102
  return false unless equivalences.is_a?(Hash)
@@ -82,14 +105,22 @@ module EncodedId
82
105
  (unique_characters & equivalences.keys).empty? && (equivalences.values - unique_characters).empty?
83
106
  end
84
107
 
108
+ # @rbs return: void
85
109
  def raise_invalid_alphabet!
86
- raise InvalidAlphabetError, "Alphabet must be a string or array and not contain whitespace."
110
+ raise InvalidAlphabetError, "Alphabet must be a populated string or array"
111
+ end
112
+
113
+ # @rbs return: void
114
+ def raise_invalid_characters!
115
+ raise InvalidAlphabetError, "Alphabet must not contain whitespace or null characters."
87
116
  end
88
117
 
118
+ # @rbs return: void
89
119
  def raise_character_set_too_small!
90
120
  raise InvalidAlphabetError, "Alphabet must contain at least #{MIN_UNIQUE_CHARACTERS} unique characters."
91
121
  end
92
122
 
123
+ # @rbs return: void
93
124
  def raise_invalid_equivalences!
94
125
  raise InvalidConfigurationError, "Character equivalences must be a hash or nil and contain mappings to valid alphabet characters."
95
126
  end
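Reviewer sketch (not part of the gem): the order of the validation checks visible in this hunk can be summarised as a stand-alone function. The name `first_alphabet_problem` and the returned symbols are hypothetical. The real class raises `InvalidAlphabetError` for the first three checks and `InvalidConfigurationError` for equivalences, and part of `valid_equivalences?` is elided between hunks, so only the checks shown above are reproduced here.

```ruby
# Hypothetical stand-alone summary of the checks visible in the diff;
# NOT the gem's API. Returns nil when every check passes, otherwise a
# symbol naming the first failed check.
MIN_UNIQUE_CHARACTERS = 16

def first_alphabet_problem(characters, equivalences = nil)
  # 1. Input must be a non-empty String or Array.
  populated = (characters.is_a?(Array) || characters.is_a?(String)) && characters.size > 0
  return :not_populated unless populated

  # 2. No whitespace or null characters among the unique characters.
  unique = (characters.is_a?(Array) ? characters : characters.chars).uniq
  return :whitespace_or_null if unique.grep(/\s|\0/).any?

  # 3. At least MIN_UNIQUE_CHARACTERS unique characters.
  return :too_small if unique.size < MIN_UNIQUE_CHARACTERS

  # 4. Equivalences are optional, but if given must be a Hash whose keys are
  #    outside the alphabet and whose values are inside it.
  return nil if equivalences.nil?
  return :invalid_equivalences unless equivalences.is_a?(Hash)

  keys_clash = !(unique & equivalences.keys).empty?
  values_missing = !(equivalences.values - unique).empty?
  (keys_clash || values_missing) ? :invalid_equivalences : nil
end

first_alphabet_problem("0123456789abcdef")               # => nil
first_alphabet_problem("abc")                            # => :too_small
first_alphabet_problem("0123456789abcdef", {"a" => "b"}) # => :invalid_equivalences
```

Splitting the old single whitespace message into separate "populated" and "invalid characters" errors, as the diff does, lets a caller tell an empty alphabet apart from one that contains whitespace.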
@@ -0,0 +1,100 @@
1
+ # frozen_string_literal: true
2
+
3
+ # rbs_inline: enabled
4
+
5
+ module EncodedId
6
+ # A blocklist of words that should not appear in encoded IDs.
7
+ class Blocklist
8
+ include Enumerable #[String]
9
+
10
+ # @rbs @words: Set[String]
11
+ # @rbs self.@empty: Blocklist
12
+ # @rbs self.@minimal: Blocklist
13
+
14
+ class << self
15
+ # @rbs () -> Blocklist
16
+ def sqids_blocklist
17
+ new(::Sqids::DEFAULT_BLOCKLIST)
18
+ end
19
+
20
+ # @rbs () -> Blocklist
21
+ def empty
22
+ @empty ||= new([])
23
+ end
24
+
25
+ # @rbs () -> Blocklist
26
+ def minimal
27
+ @minimal ||= new([
28
+ "ass", "cum", "fag", "fap", "fck", "fuk", "jiz", "pis", "poo", "sex",
29
+ "tit", "xxx", "anal", "anus", "ball", "blow", "butt", "clit", "cock",
30
+ "coon", "cunt", "dick", "dyke", "fart", "fuck", "jerk", "jizz", "jugs",
31
+ "kike", "kunt", "muff", "nigg", "nigr", "piss", "poon", "poop", "porn",
32
+ "pube", "pusy", "quim", "rape", "scat", "scum", "shit", "slut", "suck",
33
+ "turd", "twat", "vag", "wank", "whor"
34
+ ])
35
+ end
36
+ end
37
+
38
+ attr_reader :words #: Set[String]
39
+
40
+ # @rbs (?(Array[String] | Set[String]) words) -> void
41
+ def initialize(words = [])
42
+ @words = if words.is_a?(Array) || words.is_a?(Set)
43
+ Set.new(words.map(&:to_s).map(&:downcase))
44
+ else
45
+ Set.new
46
+ end
47
+ end
48
+
49
+ # @rbs () { (String) -> void } -> void
50
+ def each(&block)
51
+ @words.each(&block)
52
+ end
53
+
54
+ # @rbs (String word) -> bool
55
+ def include?(word)
56
+ @words.include?(word.to_s.downcase)
57
+ end
58
+
59
+ # @rbs (String string) -> (String | false)
60
+ def blocks?(string)
61
+ return false if empty?
62
+
63
+ downcased_string = string.to_s.downcase
64
+ @words.each do |word|
65
+ return word if downcased_string.include?(word)
66
+ end
67
+ false
68
+ end
69
+
70
+ # @rbs () -> Integer
71
+ def size
72
+ @words.size
73
+ end
74
+
75
+ # @rbs () -> bool
76
+ def empty?
77
+ @words.empty?
78
+ end
79
+
80
+ # @rbs (Blocklist other_blocklist) -> Blocklist
81
+ def merge(other_blocklist)
82
+ self.class.new(to_a + other_blocklist.to_a)
83
+ end
84
+
85
+ # Filters the blocklist to only include words that can be formed from the given alphabet.
86
+ # Only keeps words where ALL characters exist in the alphabet (case-insensitive).
87
+ # Maintains minimum 3-character length requirement.
88
+ #
89
+ # @rbs (Alphabet | String alphabet) -> Blocklist
90
+ def filter_for_alphabet(alphabet)
91
+ alphabet_chars = Set.new(
92
+ alphabet.is_a?(Alphabet) ? alphabet.unique_characters : alphabet.to_s.chars
93
+ )
94
+
95
+ self.class.new(
96
+ @words.select { |word| word.length >= 3 && word.chars.to_set.subset?(alphabet_chars) }
97
+ )
98
+ end
99
+ end
100
+ end
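Reviewer sketch (not part of the gem): the two behaviours that matter most in `Blocklist`, case-insensitive substring matching in `blocks?` and alphabet filtering in `filter_for_alphabet`, can be exercised without the gem. The free functions `blocked_word` and `filter_words_for_alphabet` below are hypothetical stand-ins that operate on a plain `Set`. They mirror the logic shown in the diff but are not the class itself.

```ruby
require "set"

# Hypothetical stand-ins for Blocklist#blocks? and Blocklist#filter_for_alphabet;
# NOT the gem's API. `words` is a Set of lowercase strings.

# Returns the first blocklisted word contained in `string` (case-insensitive),
# or false when none is found.
def blocked_word(words, string)
  downcased = string.to_s.downcase
  words.find { |word| downcased.include?(word) } || false
end

# Keeps only words of 3+ characters whose characters all exist in the alphabet.
def filter_words_for_alphabet(words, alphabet_string)
  allowed = Set.new(alphabet_string.chars)
  words.select { |word| word.length >= 3 && word.chars.to_set.subset?(allowed) }
end

words = Set.new(%w[bad fake words ab])
crockford = "0123456789abcdefghjkmnpqrstuvwxyz" # no i, l, or o

blocked_word(words, "xxBAD12")  # => "bad"
blocked_word(words, "p5w9z27j") # => false

# "words" contains "o", which is not in the alphabet; "ab" is shorter than 3.
filter_words_for_alphabet(words, crockford).to_a.sort # => ["bad", "fake"]
```

Because a word containing a character outside the alphabet can never appear in an encoded ID, dropping it up front shrinks the set that every substring scan has to walk.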