thai_id_utils 0.2.0 → 0.3.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: a190bd99a1b5194f6dd8b8fb9b54ff187f8612acef24e17d86fbbe058f404e49
-   data.tar.gz: a13c400601b9ae3f4426fe8dfb619403cbd52ae511552b37c5617f0ae0518c8f
+   metadata.gz: 550dacaa281beb7bdba770c9dd0b54cf252f87b9aaf56fdfcaa9dbf7f3b22589
+   data.tar.gz: 3afd4ab32f6836e8995b473f2352a39d1b7e137c8aae7fac3a8815b21f3bce74
  SHA512:
-   metadata.gz: fbfe9afb6a1f1d13f4b8baa45e04287ba46b2c28abdf192fff6a7dd6debc745f55fefc7ea332fd601df7d9e3fdc41194e90da539648eb8038c19e4c72ede7232
-   data.tar.gz: 3503c0c47a6b022689f4a46958b03caf7e269b18131422aa2e6db107ce64d8c51fb9da675084844ef27aee650af22beecd8313ba679d81e92db0960c76eaf13c
+   metadata.gz: 2689046128e65c474735e500e843fe7470e3dc06cfb4c76c9aa7adee1ffbd80be38ecb1163d4f928aad94418137e14dce5aa0cf5ba8ba63acd5e47e730c26b24
+   data.tar.gz: 551c61957b236cdcd61c44d47e4f5d2eb02da24bf3fc9f8e5e36825c40a337cb0284eb301958a571440eaa0f21091570e52ec04cd1bc214c0ba8ce605f4abfcf
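The `checksums.yaml` entries are hex-encoded SHA-256 and SHA-512 digests of the gem's `metadata.gz` and `data.tar.gz` members. A minimal sketch of comparing bytes against such a digest with Ruby's standard `digest` library follows. The helper name `digest_matches?` is hypothetical, and the sample payload is a stand-in rather than the real archive.

```ruby
require 'digest'

# Hypothetical helper: compare the SHA-256 of some bytes with an expected hex digest.
def digest_matches?(bytes, expected_hex)
  Digest::SHA256.hexdigest(bytes) == expected_hex.downcase
end

# Stand-in payload; for a real check, read the archive member with File.binread.
sample = 'thai_id_utils'
expected = Digest::SHA256.hexdigest(sample)

puts digest_matches?(sample, expected)        # the payload matches its own digest
puts digest_matches?("#{sample}x", expected)  # any change to the bytes breaks the match
```

The same pattern applies to the SHA-512 entries by swapping in `Digest::SHA512`.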
data/CHANGELOG.md CHANGED
@@ -5,6 +5,27 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+ ## [0.3.0] - 2026-03-10
+
+ ### Added
+ - `DISTRICT_COUNTS` constant — maps all 77 province codes to their number of
+ administrative districts (amphoe/khet), used to constrain district generation
+ - `LASER_HARDWARE_VERSIONS` constant — known chip hardware-version prefixes
+ (`JC`, `AA`, `BB`, `GC`) observed on issued Thai ID cards
+ - `province_codes` — returns all valid 2-digit province code strings
+ - `generate_laser_id(hardware_version:, box_id:, position:)` — generates a
+ random, structurally valid laser ID in `XXN-NNNNNNN-NN` format
+
+ ### Changed
+ - `generate` now accepts a `province_code:` keyword argument (default: random
+ valid province). When `office_code:` is not given, the district code is
+ constrained to the province's known range via `DISTRICT_COUNTS`. Passing
+ `office_code:` explicitly retains the previous behaviour unchanged.
+
+ ### Fixed
+ - `generate` default no longer produces impossible province codes (e.g. `'00'`,
+ `'28'`). All generated IDs now have a geographically valid province by default.
+
  ## [0.2.0] - 2025-06-15
 
  ### Added
@@ -35,6 +56,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
  - `category_description(category)` — human-readable description of ID category codes (0–8)
  - `InvalidIDError` — raised on invalid IDs passed to `decode`
 
+ [0.3.0]: https://github.com/chayuto/thai_id_utils/compare/v0.2.0...v0.3.0
  [0.2.0]: https://github.com/chayuto/thai_id_utils/compare/v0.1.2...v0.2.0
  [0.1.2]: https://github.com/chayuto/thai_id_utils/compare/v0.1.1...v0.1.2
  [0.1.1]: https://github.com/chayuto/thai_id_utils/compare/v0.1.0...v0.1.1
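The "Fixed" entry can be illustrated with a short sketch. In 0.2.0 the default office code was drawn with `rand(1..9_999)` and zero-padded to four digits, so its first two digits (the province part) could be any value from `00` to `99`. The block below counts how many of those office codes start with a known province. The list of valid codes is an assumption reconstructed from the key ranges of `DISTRICT_COUNTS` as they appear in this diff; `PROVINCE_RANGES` and `VALID_PROVINCES` are hypothetical names, not part of the gem.

```ruby
# Assumption: province codes reconstructed from the DISTRICT_COUNTS keys in this diff.
PROVINCE_RANGES = [10..27, 30..49, 50..58, 60..67, 70..77, 80..86, 90..96].freeze
VALID_PROVINCES = PROVINCE_RANGES.flat_map(&:to_a).map(&:to_s).freeze

# Every office code the 0.2.0 default could draw, zero-padded as in the old code.
old_office_codes = (1..9_999).map { |n| format('%04d', n) }

# Split them by whether the two-digit province prefix is a known province.
valid, invalid = old_office_codes.partition { |code| VALID_PROVINCES.include?(code[0, 2]) }

puts "known provinces:        #{VALID_PROVINCES.size}"
puts "valid province prefix:  #{valid.size}"
puts "impossible prefix:      #{invalid.size}"
puts "examples of impossible: #{invalid.values_at(0, -1).inspect}"
```

Under that assumption, a noticeable share of the old defaults carried a province prefix such as `00` or `28`, which is what the new `province_code:` default avoids.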
data/README.md CHANGED
@@ -1,34 +1,218 @@
  # Thai ID Utils
 
- Thai ID Utils is a zero-dependency Ruby gem for validating and decoding Thai national ID numbers. It provides a simple API to check the official modulus-11 checksum, extract the embedded components (category, office code, district code, sequence), and get a human-readable description of the category code.
-
- Thai ID Utils เป็น Ruby gem ที่ไม่ต้องพึ่งพาไลบรารีเสริม สำหรับตรวจสอบความถูกต้องและถอดรหัสหมายเลขบัตรประชาชนไทย โดยมี API ที่ใช้งานง่ายสำหรับตรวจสอบ checksum ตามมาตรฐาน modulus-11 ดึงส่วนประกอบต่างๆ (ประเภทผู้ลงทะเบียน รหัสหน่วยงาน รหัสอำเภอ และหมายเลขลำดับ) และแสดงคำอธิบายของรหัสประเภทในรูปแบบอ่านง่าย
+ Thai ID Utils is a zero-dependency Ruby gem for validating and decoding Thai national ID numbers.
+
+ Thai ID Utils เป็น Ruby gem ที่ไม่ต้องพึ่งพาไลบรารีเสริม สำหรับตรวจสอบความถูกต้องและถอดรหัสหมายเลขบัตรประชาชนไทย
+
+ [![Gem Version](https://badge.fury.io/rb/thai_id_utils.svg)](https://badge.fury.io/rb/thai_id_utils)
+
+ ---
+
+ ## Features / ฟีเจอร์
+
+ - Checksum validation (modulus-11 algorithm)
+ - Component decoding (category, province, district, sequence)
+ - Province name lookup for all 77 provinces
+ - Random valid ID generation — province-constrained by default, population-weighted sampling ready
+ - Human-readable category descriptions (0–8)
+ - Laser ID validation, decoding, and generation
+ - Buddhist Era ↔ Common Era date conversion
+
+ ---
+
+ ## Installation / การติดตั้ง
+
+ ```ruby
+ gem 'thai_id_utils'
+ ```
+
+ Or install directly:
+
+ ```sh
+ gem install thai_id_utils
+ ```
+
+ ---
 
  ## Usage / วิธีใช้งาน
 
  ```ruby
  require "thai_id_utils"
+ ```
+
+ ### Validate an ID / ตรวจสอบความถูกต้อง
 
- id = "3012304567082"
+ ```ruby
+ ThaiIdUtils.valid?("3012304567082") # => true
+ ThaiIdUtils.valid?("1234567890123") # => false
+ ```
+
+ ### Decode an ID / ถอดรหัสส่วนประกอบ
+
+ ```ruby
+ info = ThaiIdUtils.decode("3012304567082")
+ # => {
+ #   category: 3,
+ #   office_code: "0123",
+ #   province_code: "01",
+ #   province_name: nil, # nil if province code not recognized
+ #   district_code: "23",
+ #   sequence: "04567",
+ #   registration_code: "08"
+ # }
+ ```
 
- # Validate checksum / ตรวจสอบความถูกต้องของ checksum
- if ThaiIdUtils.valid?(id)
-   puts "Valid!"
- else
-   puts "Invalid ID"
- end
+ Raises `ThaiIdUtils::InvalidIDError` if the ID fails checksum validation.
 
- # Decode components / ถอดรหัสส่วนประกอบ
- info = ThaiIdUtils.decode(id)
- # => { category: 1, office_code: "6099", district_code: "99", sequence: "00257" }
- puts info.inspect
+ ### Province Name Lookup / ค้นหาชื่อจังหวัด
 
- # Get category description / คำอธิบายประเภท
- desc = ThaiIdUtils.category_description(info[:category])
+ ```ruby
+ ThaiIdUtils.province_name("10") # => "Bangkok"
+ ThaiIdUtils.province_name("83") # => "Phuket"
+ ThaiIdUtils.province_name("50") # => "Chiang Mai"
+ ThaiIdUtils.province_name("99") # => nil
+ ```
+
+ All 77 Thai provinces are supported. Use `province_codes` to get the full list:
+
+ ```ruby
+ ThaiIdUtils.province_codes
+ # => ["10", "11", "12", ..., "96"] (77 codes)
+ ```
+
+ ### Category Description / คำอธิบายประเภทบัตร
+
+ ```ruby
+ ThaiIdUtils.category_description(1)
  # => "Thai nationals who were born after 1 January 1984 and had their birth notified within the given deadline (15 days)."
- puts desc
 
- # Generate a new random valid ID / สร้างหมายเลขบัตรประชาชนใหม่แบบสุ่มที่ถูกต้อง
- new_id = ThaiIdUtils.generate
- puts new_id # => e.g. "3601205234518"
- ```
+ ThaiIdUtils.category_description(6)
+ # => "Foreign nationals who are living in Thailand temporarily and illegal migrants"
+ ```
+
+ ### Generate a Valid ID / สร้างหมายเลขบัตรประชาชนแบบสุ่ม
+
+ ```ruby
+ ThaiIdUtils.generate
+ # => "1105312345671" (random valid province, district within that province's range)
+
+ # Pin to a specific province (district randomised within province's known range)
+ ThaiIdUtils.generate(province_code: "10") # Bangkok
+ ThaiIdUtils.generate(province_code: "83") # Phuket (3 districts)
+
+ # Full override via office_code bypasses province validation (backwards compatible)
+ ThaiIdUtils.generate(category: 1, office_code: "1001", sequence: "00001")
+
+ # Raises ArgumentError for unknown province codes
+ ThaiIdUtils.generate(province_code: "99") # => ArgumentError
+ ```
+
+ The default generates a geographically valid ID — province code is sampled uniformly from the 77 known codes and the district code is constrained to that province's actual district count via `DISTRICT_COUNTS`.
+
+ ### Laser ID Validation / ตรวจสอบเลขเลเซอร์
+
+ ```ruby
+ ThaiIdUtils.laser_id_valid?("JC1-0002507-15") # => true
+ ThaiIdUtils.laser_id_valid?("INVALID") # => false
+ ```
+
+ Format: `XXN-NNNNNNN-NN` (two uppercase letters, one digit, hyphen, 7 digits, hyphen, 2 digits)
+
+ ### Laser ID Decoding / ถอดรหัสเลขเลเซอร์
+
+ ```ruby
+ ThaiIdUtils.laser_id_decode("JC1-0002507-15")
+ # => {
+ #   hardware_version: "JC1",
+ #   box_id: "0002507",
+ #   position: "15"
+ # }
+ ```
+
+ Raises `ThaiIdUtils::InvalidIDError` if the format is invalid.
+
+ ### Generate a Laser ID / สร้างเลขเลเซอร์
+
+ ```ruby
+ ThaiIdUtils.generate_laser_id
+ # => "JC2-0483921-07" (random, always matches LASER_ID_FORMAT)
+
+ # Override individual components
+ ThaiIdUtils.generate_laser_id(hardware_version: "JC1", box_id: 2507, position: 15)
+ # => "JC1-0002507-15"
+ ```
+
+ Known hardware version prefixes (`LASER_HARDWARE_VERSIONS`): `JC`, `AA`, `BB`, `GC`.
+ The laser ID is a supply-chain tracking code with no mathematical link to the citizen ID.
+
+ ### Buddhist Era Conversion / แปลงปี พ.ศ. ↔ ค.ศ.
+
+ ```ruby
+ ThaiIdUtils.be_to_ce(2567) # => 2024
+ ThaiIdUtils.ce_to_be(2024) # => 2567
+ ```
+
+ ---
+
+ ## API Reference / สรุป API
+
+ | Method | Description |
+ |---|---|
+ | `valid?(id)` | Returns `true` if the 13-digit ID passes checksum |
+ | `decode(id)` | Returns a hash of decoded components; raises `InvalidIDError` on failure |
+ | `generate(category:, province_code:, office_code:, district_code:, sequence:)` | Generates a random valid 13-digit ID; defaults to a valid province |
+ | `province_name(code)` | Returns province name for a 2-digit code, or `nil` |
+ | `province_codes` | Returns all 77 valid 2-digit province code strings |
+ | `category_description(n)` | Returns human-readable category description |
+ | `laser_id_valid?(laser_id)` | Returns `true` if the laser ID format matches |
+ | `laser_id_decode(laser_id)` | Returns decoded laser ID hash; raises `InvalidIDError` on failure |
+ | `generate_laser_id(hardware_version:, box_id:, position:)` | Generates a random valid laser ID |
+ | `be_to_ce(year)` | Converts Buddhist Era year to Common Era |
+ | `ce_to_be(year)` | Converts Common Era year to Buddhist Era |
+
+ **Constants**
+
+ | Constant | Description |
+ |---|---|
+ | `PROVINCE_CODES` | Hash mapping 77 province codes to English names |
+ | `DISTRICT_COUNTS` | Hash mapping province codes to their district count |
+ | `LASER_HARDWARE_VERSIONS` | Array of known chip hardware-version prefixes |
+ | `LASER_ID_FORMAT` | Regex for laser ID format validation |
+
+ ---
+
+ ## Synthetic Dataset / ชุดข้อมูลสังเคราะห์
+
+ A 350,000-row fully synthetic dataset generated with this gem is published on HuggingFace:
+
+ **[huggingface.co/datasets/chayuto/thai-id-synthetic](https://huggingface.co/datasets/chayuto/thai-id-synthetic)**
+
+ - 332,500 valid IDs + 17,500 invalid IDs (bad checksum, impossible province, wrong category, wrong length)
+ - Population-weighted province sampling (NSO 2023), realistic category distribution
+ - train / test splits (315K / 35K)
+ - No real citizen data — 100% synthetic
+
+ To regenerate:
+
+ ```sh
+ cd dataset
+ ruby generate.rb --count 350000 --invalid-ratio 0.05 --seed 42
+ ```
+
+ ---
+
+ ## Development / การพัฒนา
+
+ ```sh
+ # Run tests
+ rake
+
+ # Or directly
+ ruby -Ilib -Itest test/test_thai_id_utils.rb
+ ```
+
+ ---
+
+ ## License / สัญญาอนุญาต
+
+ [MIT License](LICENSE)
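The README's `valid?` and `decode` examples can be reproduced without the gem. The sketch below is a standalone re-implementation, assuming the checksum rule visible in this diff's `generate` method (weights 13 down to 2, check digit `(11 - sum % 11) % 10`) and the field layout implied by the README's decoded hash. The method names `checksum_digit`, `valid_thai_id?`, and `split_thai_id` are hypothetical and are not the gem's API.

```ruby
# Standalone sketch, not the gem's own source.

# Compute the check digit for the first 12 digits of an ID.
def checksum_digit(first12)
  sum = first12.each_char.with_index.sum { |ch, i| ch.to_i * (13 - i) }
  (11 - (sum % 11)) % 10
end

# True when the input is 13 digits and its last digit equals the computed check digit.
def valid_thai_id?(id)
  s = id.to_s
  return false unless s.match?(/\A\d{13}\z/)

  checksum_digit(s[0, 12]) == s[12].to_i
end

# Slice the positional fields; layout assumed from the README's decode example.
def split_thai_id(id)
  s = id.to_s
  {
    category: s[0].to_i,
    office_code: s[1, 4],
    province_code: s[1, 2],
    district_code: s[3, 2],
    sequence: s[5, 5],
    registration_code: s[10, 2]
  }
end

puts valid_thai_id?('3012304567082')
puts valid_thai_id?('1234567890123')
p split_thai_id('3012304567082')
```

For `3012304567082` the weighted sum is 229, so the check digit is `(11 - 229 % 11) % 10 = 2`, which matches the final digit.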
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module ThaiIdUtils
-   VERSION = '0.2.0'
+   VERSION = '0.3.0'
  end
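Because the default behaviour of `generate` changes in this release, a consumer that depends on province-constrained output may want to require at least 0.3.0. The sketch below uses RubyGems' own `Gem::Version` and `Gem::Requirement` classes to show how a pessimistic constraint treats the two versions; the choice of `~> 0.3.0` is an illustrative assumption, not a recommendation from the gem's author.

```ruby
require 'rubygems'

old_version = Gem::Version.new('0.2.0')
new_version = Gem::Version.new('0.3.0')

# "~> 0.3.0" accepts 0.3.x releases and rejects anything below 0.3.0 or from 0.4.0 up.
requirement = Gem::Requirement.new('~> 0.3.0')

puts new_version > old_version
puts requirement.satisfied_by?(new_version)
puts requirement.satisfied_by?(old_version)
```

In a Gemfile the equivalent line would be `gem 'thai_id_utils', '~> 0.3.0'`.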
data/lib/thai_id_utils.rb CHANGED
@@ -67,9 +67,37 @@ module ThaiIdUtils
    }.freeze
    # rubocop:enable Layout/LineLength
 
+   # Mapping of province codes to the number of administrative districts
+   # (amphoe for provinces, khet for Bangkok). Used to constrain district code
+   # generation to realistic ranges within generate().
+   # Counts are approximate and reflect post-2011 administrative divisions.
+   DISTRICT_COUNTS = {
+     '10' => 50, '11' => 11, '12' => 6, '13' => 7, '14' => 16,
+     '15' => 7, '16' => 11, '17' => 6, '18' => 8, '19' => 13,
+     '20' => 11, '21' => 8, '22' => 10, '23' => 7, '24' => 11,
+     '25' => 7, '26' => 4, '27' => 9,
+     '30' => 32, '31' => 23, '32' => 17, '33' => 22, '34' => 25,
+     '35' => 9, '36' => 16, '37' => 7, '38' => 8, '39' => 6,
+     '40' => 26, '41' => 20, '42' => 14, '43' => 18, '44' => 13,
+     '45' => 20, '46' => 18, '47' => 18, '48' => 12, '49' => 7,
+     '50' => 25, '51' => 8, '52' => 13, '53' => 9, '54' => 8,
+     '55' => 15, '56' => 9, '57' => 18, '58' => 7,
+     '60' => 15, '61' => 8, '62' => 11, '63' => 8, '64' => 9,
+     '65' => 9, '66' => 12, '67' => 11,
+     '70' => 10, '71' => 13, '72' => 10, '73' => 7, '74' => 7,
+     '75' => 3, '76' => 8, '77' => 8,
+     '80' => 23, '81' => 8, '82' => 8, '83' => 3, '84' => 19,
+     '85' => 5, '86' => 8,
+     '90' => 16, '91' => 7, '92' => 10, '93' => 11, '94' => 12,
+     '95' => 8, '96' => 9
+   }.freeze
+
    LASER_ID_FORMAT = /\A[A-Z]{2}\d-\d{7}-\d{2}\z/.freeze
 
-   # Validate a Thai national ID using Thailand’s modulus-11 checksum algorithm.
+   # Known chip hardware-version prefixes observed on issued Thai ID cards.
+   LASER_HARDWARE_VERSIONS = %w[JC AA BB GC].freeze
+
+   # Validate a Thai national ID using Thailand's modulus-11 checksum algorithm.
    #
    # @param id [String, Integer] 13-digit Thai national ID number
    # @return [Boolean] true if the checksum is valid, false otherwise
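The two laser-ID constants in this hunk answer different questions: `LASER_ID_FORMAT` checks shape only, while `LASER_HARDWARE_VERSIONS` lists prefixes that have been observed. A minimal sketch of the distinction follows, with both constants copied from the diff. The helper names `laser_format_ok?` and `known_hardware_prefix?` are hypothetical and not part of the gem.

```ruby
# Constants copied from the diff; helpers below are illustrative only.
LASER_ID_FORMAT = /\A[A-Z]{2}\d-\d{7}-\d{2}\z/.freeze
LASER_HARDWARE_VERSIONS = %w[JC AA BB GC].freeze

# Shape check: two uppercase letters, a digit, 7 digits, 2 digits, hyphen-separated.
def laser_format_ok?(laser_id)
  LASER_ID_FORMAT.match?(laser_id.to_s)
end

# Prefix check: are the first two characters one of the observed chip prefixes?
def known_hardware_prefix?(laser_id)
  LASER_HARDWARE_VERSIONS.include?(laser_id.to_s[0, 2])
end

['JC1-0002507-15', 'ZZ9-0000001-01', 'INVALID'].each do |candidate|
  puts "#{candidate}: format=#{laser_format_ok?(candidate)} known_prefix=#{known_hardware_prefix?(candidate)}"
end
```

A string such as `ZZ9-0000001-01` passes the format regex even though `ZZ` is not among the listed prefixes, so callers that care about observed hardware versions would need the second check as well.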
@@ -112,39 +140,58 @@ module ThaiIdUtils
    end
 
    # Generate a random, valid 13-digit Thai national ID.
-   # Any component can be overridden; the rest is randomized and the checksum is computed.
+   # Any component can be overridden; the rest is randomised and the checksum
+   # is computed. When neither +office_code+ nor +province_code+ is given, a
+   # valid province is selected at random and a district code within that
+   # province's known range is generated.
    #
    # @param category [Integer] ID category (1–8), default: random 1–6
-   # @param office_code [Integer, String, nil] 4-digit registrar code, default: random
-   # @param district_code [String, nil] 2-digit district override within office_code
+   # @param province_code [String, nil] 2-digit province code (e.g. "10").
+   #   Must be a key in PROVINCE_CODES. Ignored when +office_code+ is given.
+   # @param office_code [Integer, String, nil] 4-digit registrar code override.
+   #   When supplied, bypasses province_code and district_code validation.
+   # @param district_code [String, nil] 2-digit district override (applied on
+   #   top of whatever office_code is built).
    # @param sequence [Integer, String, nil] 5-digit personal sequence, default: random
    # @return [String] a valid 13-digit Thai national ID
-   # rubocop:disable Metrics/AbcSize
+   # @raise [ArgumentError] if province_code is given but not in PROVINCE_CODES
+   # rubocop:disable Metrics/AbcSize, Metrics/PerceivedComplexity
    def self.generate(category: rand(1..6),
+                     province_code: PROVINCE_CODES.keys.sample,
                      office_code: nil,
                      district_code: nil,
                      sequence: nil)
-     # Build and override office_code/district_code
-     office_code = format('%04d', office_code || rand(1..9_999))
+     office_code = if office_code
+                     format('%04d', office_code)
+                   else
+                     pcode = province_code.to_s
+                     raise ArgumentError, "Unknown province_code: #{pcode.inspect}" unless PROVINCE_CODES.key?(pcode)
+
+                     "#{pcode}#{format('%02d', rand(1..DISTRICT_COUNTS[pcode]))}"
+                   end
      office_code[2..3] = district_code.to_s.rjust(2, '0') if district_code
 
-     # Sequence (5 digits) and classification (2 digits)
      sequence = format('%05d', sequence || rand(0..99_999))
      classification = format('%02d', rand(0..99))
 
-     # First 12 digits: category + office_code + sequence + classification
      digits = [category.to_i] +
               office_code.chars.map(&:to_i) +
               sequence.chars.map(&:to_i) +
               classification.chars.map(&:to_i)
 
-     # Checksum
      sum = digits.each_with_index.sum { |d, i| d * (13 - i) }
      check = (11 - (sum % 11)) % 10
 
      (digits + [check]).join
    end
-   # rubocop:enable Metrics/AbcSize
+   # rubocop:enable Metrics/AbcSize, Metrics/PerceivedComplexity
+
+   # Return all valid 2-digit province code strings.
+   #
+   # @return [Array<String>] all keys of PROVINCE_CODES
+   def self.province_codes
+     PROVINCE_CODES.keys
+   end
 
    # Return the human-readable description for a Thai ID category code.
    #
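The new `generate` flow (pick a province, draw a district within that province's count, append sequence and classification, then compute the check digit) can be traced in a reduced standalone sketch. It uses a three-entry excerpt of `DISTRICT_COUNTS` and an injectable `Random` so runs are repeatable. `DISTRICT_COUNTS_EXCERPT`, `sketch_generate`, and the `rng:` parameter are hypothetical names introduced for illustration and are not part of the gem.

```ruby
# Excerpt of the DISTRICT_COUNTS table shown in this diff (three of 77 entries).
DISTRICT_COUNTS_EXCERPT = { '10' => 50, '75' => 3, '83' => 3 }.freeze

# Reduced sketch of the province-constrained path; not the gem's own method.
def sketch_generate(province_code:, category: 1, rng: Random.new)
  count = DISTRICT_COUNTS_EXCERPT.fetch(province_code) do
    raise ArgumentError, "Unknown province_code: #{province_code.inspect}"
  end

  # District is drawn from 1..count, so it never exceeds the province's range.
  office_code = "#{province_code}#{format('%02d', rng.rand(1..count))}"
  sequence = format('%05d', rng.rand(0..99_999))
  classification = format('%02d', rng.rand(0..99))

  # First 12 digits, then the modulus-11 check digit as in the diff.
  digits = "#{category}#{office_code}#{sequence}#{classification}".chars.map(&:to_i)
  sum = digits.each_with_index.sum { |d, i| d * (13 - i) }
  check = (11 - (sum % 11)) % 10

  (digits + [check]).join
end

id = sketch_generate(province_code: '83', rng: Random.new(42))
puts id
puts "province=#{id[1, 2]} district=#{id[3, 2]}"
```

As far as the diff shows, an explicit `district_code:` is written over the last two digits of the office code without a range check, so only the fully defaulted path is constrained by `DISTRICT_COUNTS`.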
@@ -205,4 +252,19 @@ module ThaiIdUtils
        position: parts[2]
      }
    end
+
+   # Generate a random, valid Thai ID card laser ID.
+   # Format: XXN-NNNNNNN-NN (e.g., JC1-0002507-15)
+   #
+   # @param hardware_version [String, nil] full 3-char chip code (e.g. "JC1").
+   #   Defaults to a random prefix from LASER_HARDWARE_VERSIONS + digit 1–3.
+   # @param box_id [Integer, nil] distribution box number (1–9,999,999)
+   # @param position [Integer, nil] slot within the box (1–60)
+   # @return [String] a laser ID string matching LASER_ID_FORMAT
+   def self.generate_laser_id(hardware_version: nil, box_id: nil, position: nil)
+     hw = hardware_version || "#{LASER_HARDWARE_VERSIONS.sample}#{rand(1..3)}"
+     box = format('%07d', box_id || rand(1..9_999_999))
+     pos = format('%02d', position || rand(1..60))
+     "#{hw}-#{box}-#{pos}"
+   end
  end
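The string assembly in `generate_laser_id` relies on `format` zero-padding, which the sketch below isolates. It mirrors the three formatting steps from the diff in a hypothetical helper named `sketch_laser_id` and checks the result against a copy of the format regex. The out-of-range inputs are illustrative assumptions chosen to show that, as far as the diff shows, explicit overrides are not validated.

```ruby
# Copy of the format regex from the diff, held in a local for this sketch.
laser_format = /\A[A-Z]{2}\d-\d{7}-\d{2}\z/

# Hypothetical helper mirroring the padding steps; all three parts are required here.
def sketch_laser_id(hardware_version:, box_id:, position:)
  "#{hardware_version}-#{format('%07d', box_id)}-#{format('%02d', position)}"
end

padded = sketch_laser_id(hardware_version: 'JC1', box_id: 2507, position: 15)
too_wide = sketch_laser_id(hardware_version: 'JC1', box_id: 12_345_678, position: 15)
bad_prefix = sketch_laser_id(hardware_version: 'jc', box_id: 1, position: 1)

[padded, too_wide, bad_prefix].each do |laser_id|
  puts "#{laser_id} matches format: #{laser_format.match?(laser_id)}"
end
```

With in-range integers the padded result lines up with the README's `"JC1-0002507-15"` example; an 8-digit box number or a malformed `hardware_version` produces a string that no longer matches the format.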
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: thai_id_utils
  version: !ruby/object:Gem::Version
-   version: 0.2.0
+   version: 0.3.0
  platform: ruby
  authors:
  - Chayut Orapinpatipat
@@ -25,11 +25,14 @@ dependencies:
        - !ruby/object:Gem::Version
          version: '5.0'
  description: |
-   Zero-dependency Ruby utilities for:
+   Zero-dependency Ruby utilities for Thai national ID numbers:
    • checksum validation (modulus-11),
-   • component decoding (category, office_code, district_code, sequence),
-   random valid ID generation,
-   human-readable category descriptions.
+   • component decoding (category, province, district, sequence),
+   province-constrained valid ID generation with DISTRICT_COUNTS,
+   province name lookup for all 77 provinces,
+   • laser ID validation, decoding, and generation,
+   • human-readable category descriptions (0–8),
+   • Buddhist Era ↔ Common Era date conversion.
  email:
  - chayut_o@hotmail.com
  executables: []
@@ -67,5 +70,5 @@ requirements: []
  rubygems_version: 3.5.22
  signing_key:
  specification_version: 4
- summary: Validate and decode Thai national ID numbers
+ summary: Validate, decode, and generate Thai national ID numbers
  test_files: []