unlocodes 0.1.3 → 0.1.4

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: a4c9de4b3b24e90847e45242a5199691e8d5284377e8f6476f6fb01b653a729d
-   data.tar.gz: 4903c23bdc9d3336471da5afc6ec1c0e1c711d75824ce822e6a089dbb54970b2
+   metadata.gz: 2a6607eb57e2219a161a11a3a5504d8b5ba3ba727b6a89c27d2a0de65b20dc36
+   data.tar.gz: 43fc8e683ac3af63d1f2a126313c6a9ec935ff6bacd03ecaa3fda39909bcef13
  SHA512:
-   metadata.gz: 2938d6ba840b7b93731afd97081478dce93cbd5b82357eb56032a943c000728a9d484b9fec9a5deffaa30916b99ba4588d33add5ce64e9489cd671d9a2ad0994
-   data.tar.gz: 76333326f2218ed42eebe6c729b05609c3e14b874854c9171dac3260d022c521a52e23e4e290a26565432936a837f548b6ff1a464c064258059741ad2c9ada01
+   metadata.gz: 4e6da102b2aea85d965a601c2539400a4d978ac04723cbc989f8bd867628352ce26d03fa26b181a66ccd652d618f68e13ee5d668a3dac41971058fcacdca25d8
+   data.tar.gz: 9fa6a655a4c2ed962975e2311920264ba3657311163156843be590e639592fa480715493bfee7e7c9f59019d8ccee60c4c29f95cae65e8fec7828b17ef53f7be
data/README.adoc CHANGED
@@ -1,15 +1,17 @@
- = unlocode
+ = unlocodes
  :toc: macro
- :homepage: https://github.com/metanorma/unlocode
+ :homepage: https://github.com/metanorma/unlocodes

- `unlocode` is a Ruby gem that exposes the UN/LOCODE dataset (United Nations Code for Trade and Transport Locations) as a queryable in-memory registry.
+ `unlocodes` is a Ruby gem that exposes the UN/LOCODE dataset (United Nations Code for Trade and Transport Locations) as a queryable in-memory registry.

  The dataset is sourced from the UNECE/UNCEFACT LOCODE vocabulary, version-tagged `2025-1`: https://opensource.unicc.org/un/unece/uncefact/vocab-locode/-/tags/2025-1

- The full 2025-1 dataset (115,928 LOCODEs across 249 countries) is bundled inside the gem as `lib/unlocode/data/locode.jsonld` and loads lazily into a typed registry on first use.
+ The full 2025-1 dataset (115,928 LOCODEs across 249 countries, ~39 MB) is bundled inside the gem as `lib/unlocodes/data/locode.jsonld` and loads lazily into a typed registry on first use.

  == Installation

+ Ruby 3.1 or newer is required.
+
  Add to your Gemfile:

  [source,ruby]
@@ -17,18 +19,78 @@ Add to your Gemfile:
  gem 'unlocodes'
  ----

+ Or install directly:
+
+ [source,sh]
+ ----
+ gem install unlocodes
+ ----
+
  == Usage

  [source,ruby]
  ----
  require 'unlocodes'

- Unlocodes.find('CNSHA') # => #<Unlocodes::Entry code="CNSHA" ...>
- Unlocodes.find('NLRTM').functions.map(&:code) # => ["B", "R", "T", "A", "P"]
- Unlocodes.where(country: 'CN').count # => 1670
- Unlocodes.where(function: 'B').count # => 17912 (sea ports)
- Unlocodes.where(function: 'A').count # => 9009 (airports)
- Unlocodes.find('CNPDG').coordinates # => #<Coordinates lat=31.2333 lon=121.5000>
+ # Lookup by 5-char LOCODE (case-insensitive)
+ Unlocodes.find('CNSHA') # => #<Unlocodes::Entry code="CNSHA" name="Shanghai Hongqiao International Apt">
+ Unlocodes['NLRTM'].name # => "Rotterdam" ([] is an alias for find)
+
+ # Filters — single value or array (any-of)
+ Unlocodes.where(country: 'CN').count # => 1670
+ Unlocodes.where(country: %w[CN HK]).count # => entries in China OR Hong Kong
+ Unlocodes.where(function: 'B').count # => 17912 (sea ports)
+ Unlocodes.where(function: 'A').count # => 9009 (airports)
+ Unlocodes.where(function: %w[B A]).count # => 25136 (port OR airport)
+ Unlocodes.where(subdivision: 'CNSH').count # => entries in the Shanghai subdivision
+
+ # Filters combine
+ Unlocodes.where(country: 'CN', function: 'A').count # => CN airports
+
+ # Name search — Regexp (substring) or String (case-insensitive equality)
+ Unlocodes.where(name: /shanghai/i).map(&:code)
+ Unlocodes.where(name: 'rotterdam').map(&:code)
+
+ # Iteration
+ Unlocodes.each { |e| puts e.code if e.port? }
+
+ # Country listing
+ Unlocodes.countries # => ["AD", "AE", ..., "ZW"] (249 codes)
+ Unlocodes.counts_by_country.first(5) # => [["US", 20852], ["FR", 14325], ...]
+
+ # Indexed lookups (faster than `where` for a single value)
+ Unlocodes.registry.by_country('CN').size # => 1670
+ Unlocodes.registry.by_function('B').size # => 17912
+ ----
+
+ === What's in an Entry
+
+ Each `Unlocodes::Entry` exposes the fields the JSON-LD vocabulary populates:
+
+ [cols="1,2", options="header"]
+ |===
+ |Attribute|Description
+ |`code`|5-char LOCODE (ISO country + 3-char location)
+ |`country`|ISO 3166-1 alpha-2 country code
+ |`subdivision`|ISO 3166-2 country subdivision code (e.g. `CNSH`)
+ |`name`|Display name
+ |`function_codes`|Array of single-letter function codes (see below)
+ |`latitude`, `longitude`|WGS-84 decimal degrees
+ |===
+
+ Convenience predicates and value-type accessors:
+
+ [source,ruby]
+ ----
+ entry = Unlocodes.find('NLRTM')
+
+ entry.port? # => true (function code B)
+ entry.airport? # => true (function code A)
+ entry.rail_terminal? # => true (function code R)
+ entry.road_terminal? # => true (function code T)
+ entry.functions.map(&:code) # => ["B", "R", "T", "A", "P"]
+ entry.functions.first.description # => "Port (sea)"
+ entry.coordinates # => #<Unlocodes::Coordinates ...>
  ----

  === Function codes
@@ -49,40 +111,89 @@ The vocabulary publishes functions as `unlcdf:1`..`unlcdf:9`. The gem maps them
  |9|O|Other / border crossing
  |===

- Use the letters in `where(function: ...)`:
+ Use the letters in `where(function: ...)`. Passing an array matches any-of (OR), not all-of (AND):

  [source,ruby]
  ----
  Unlocodes.where(function: 'B').count # sea ports
- Unlocodes.where(function: %w[B A]).count # entries that are both port and airport
+ Unlocodes.where(function: %w[B A]).count # port OR airport (union)
+ # For AND (entries that are BOTH port AND airport), filter in Ruby:
+ Unlocodes.entries.count { |e| e.port? && e.airport? } # => 1785
  ----

- === Distances
+ === Coordinates and distances

  [source,ruby]
  ----
- shanghai = Unlocodes.find('CNPDG').coordinates
+ shanghai = Unlocodes.find('CNPDG').coordinates # Shanghai Pudong
  rotterdam = Unlocodes.find('NLRTM').coordinates
- shanghai.distance_to(rotterdam) # => 8922.0 (km, great-circle)
+
+ shanghai.to_s # => "31.2333 121.5000"
+ shanghai.distance_to(rotterdam) # => 8922.7 (km, great-circle via haversine)
  ----

- == Data refresh
+ === Fields NOT in the bundled vocabulary

- To refresh the bundled dataset when a new UNCEFACT edition is released:
+ The JSON-LD vocabulary only publishes `code`, `country`, `subdivision`, `name`, `functions`, and coordinates. The UN/LOCODE manual additionally defines a `status` change indicator (AA, RL, XX, …), IATA code, change date, and remarks — these are in the per-country CSV files upstream, not the JSON-LD vocab, and are intentionally not modelled here.
+
+ Calling `where(status: ...)` or `where(iata: ...)` raises `ArgumentError` rather than silently returning empty results.
+
+ === Which edition is bundled?
+
+ [source,ruby]
+ ----
+ Unlocodes.data_tag # => "2025-1"
+ ----
+
+ This reads `lib/unlocodes/data/SOURCE_TAG` at runtime. The `Unlocodes::Status` value type is still shipped for callers who want to consult the manual's status-code descriptions out-of-band.
+
+ == Staying current with upstream
+
+ UN/LOCODE is published twice-yearly (typically `YYYY-1` around Q1, `YYYY-2` around Q3). The gem ships two workflows to keep the bundled data fresh:
+
+ === `check-upstream` (scheduled, weekly)
+
+ `.github/workflows/check-upstream.yml` runs every Monday, asks the upstream GitLab project for its latest tag, and compares against the bundled `SOURCE_TAG`. When a new edition is detected it opens a GitHub issue labelled `data-update` describing the diff and the next steps. The workflow also runs on manual dispatch.
+
+ === `update-data` (manual dispatch)
+
+ `.github/workflows/update-data.yml` is what a maintainer runs after `check-upstream` flags a new edition. It takes the new tag as input, fetches the data, commits to a branch, and opens a PR. Merging that PR does not, by itself, publish a new gem — it just lands the new data on `main`. To ship a new gem version, trigger the `release` workflow (see below).
+
+ End-to-end refresh flow:
+
+ . `check-upstream` opens issue: "new tag `2025-2` available, bundled is `2025-1`"
+ . Maintainer runs `update-data` workflow with `tag=2025-2`
+ . Workflow opens PR `chore/data-2025-2` with the refreshed `locode.jsonld` and `SOURCE_TAG`
+ . Maintainer reviews and merges the PR
+ . Maintainer triggers the `release` workflow with `next_version=patch` to publish a new gem version
+
+ Manual local equivalent of the workflow:

  [source,sh]
  ----
- bundle exec rake unlocode:fetch # default tag: 2025-1
- UNLOCODE_TAG=2025-2 rake unlocode:fetch
+ bundle exec rake unlocodes:fetch # default tag: 2025-1
+ UNLOCODE_TAG=2025-2 bundle exec rake unlocodes:fetch
  ----

  If the upstream tag layout changes, point `UNLOCODE_PATH` at the full JSON-LD URL:

  [source,sh]
  ----
- UNLOCODE_PATH=https://example.org/path/unlocode.jsonld rake unlocode:fetch
+ UNLOCODE_PATH=https://example.org/path/unlocode.jsonld bundle exec rake unlocodes:fetch
  ----

+ == Development
+
+ [source,sh]
+ ----
+ bundle install # install dev deps
+ bundle exec rake # spec + rubocop
+ bundle exec rspec spec/path/to_spec.rb:42 # one example
+ bundle exec rake unlocodes:fetch # refresh bundled data
+ ----
+
+ The 39 MB dataset at `lib/unlocodes/data/locode.jsonld` is committed so the gem works offline. The first call to `Unlocodes.registry` in a process pays the parse cost (~3-5 s); subsequent calls are cached.
+
  == License

  BSD-2-Clause. See link:LICENSE[].
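
Reviewer note on the README's distance example: it describes `distance_to` as great-circle via haversine, but `coordinates.rb` is not part of this diff. The following is only a minimal sketch of the haversine formula, not the gem's implementation. The `haversine_km` helper, the 6371 km mean Earth radius, and the Rotterdam coordinates are assumptions made for illustration.

```ruby
# Hypothetical helper, not the gem's API: great-circle distance via haversine.
EARTH_RADIUS_KM = 6371.0 # assumed mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2)
  to_rad = ->(deg) { deg * Math::PI / 180.0 }
  d_lat = to_rad.(lat2 - lat1)
  d_lon = to_rad.(lon2 - lon1)
  # a is the squared half-chord length between the two points
  a = Math.sin(d_lat / 2)**2 +
      Math.cos(to_rad.(lat1)) * Math.cos(to_rad.(lat2)) * Math.sin(d_lon / 2)**2
  2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a))
end

shanghai = [31.2333, 121.5] # values shown in the README for CNPDG
rotterdam = [51.9167, 4.5]  # assumed coordinates for NLRTM
distance = haversine_km(*shanghai, *rotterdam)
puts distance.round(1)
```

With these inputs the result is roughly 8,900 km, in the same range as the README's figure; the exact value depends on the radius and coordinates used.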
data/lib/unlocodes/data/SOURCE_TAG ADDED
@@ -0,0 +1 @@
+ 2025-1
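
Reviewer note: the README says the `check-upstream` workflow compares the latest upstream tag against this file. The workflow itself is not in the diff, so the sketch below only illustrates one way such a comparison could work, assuming tags always follow the `YYYY-N` pattern. `parse_tag` and `newer_edition?` are hypothetical names.

```ruby
# Hypothetical helpers: compare UN/LOCODE edition tags of the form "YYYY-N".
TAG_PATTERN = /\A(\d{4})-(\d+)\z/

def parse_tag(tag)
  match = TAG_PATTERN.match(tag.to_s.strip)
  raise ArgumentError, "unexpected tag format: #{tag.inspect}" unless match

  [match[1].to_i, match[2].to_i]
end

# True when `upstream` names a later edition than `bundled`.
def newer_edition?(upstream, bundled)
  (parse_tag(upstream) <=> parse_tag(bundled)) == 1
end

# The bundled value carries a trailing newline, as written to SOURCE_TAG.
puts newer_edition?('2025-2', "2025-1\n")
```

Comparing the parsed `[year, edition]` pairs rather than the raw strings keeps the ordering numeric if an edition number ever reaches two digits.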
data/lib/unlocodes/data/fetcher.rb CHANGED
@@ -7,18 +7,23 @@ require 'fileutils'
  module Unlocodes
    module Data
      # Downloads the UNCEFACT UN/LOCODE vocabulary from the upstream GitLab
-     # repository and stores it as `lib/unlocode/data/locode.jsonld`.
+     # repository and stores it as `lib/unlocodes/data/locode.jsonld`.
      #
      # The upstream tag (e.g. "2025-1") points at a snapshot of the
      # `vocab-locode` project whose `vocab/unlocode.jsonld` is the canonical
      # full dataset. Override the source URL with the `UNLOCODE_PATH` env
      # variable if a different edition's path layout applies.
+     #
+     # After a successful fetch, the tag is written to `SOURCE_TAG` next to
+     # the data file so the gem (and the check-upstream workflow) always
+     # knows which edition is bundled.
      module Fetcher
        UPSTREAM_HOST = 'opensource.unicc.org'
        UPSTREAM_TAG_PATH = '/un/unece/uncefact/vocab-locode/-/raw/%<tag>s'
-       OUTPUT_PATH = File.expand_path('locode.jsonld', __dir__)
+       DATA_DIR = File.expand_path(__dir__)
+       OUTPUT_PATH = File.join(DATA_DIR, 'locode.jsonld')
+       SOURCE_TAG_PATH = File.join(DATA_DIR, 'SOURCE_TAG')

-       DEFAULT_FILENAME = 'vocab/unlocode.jsonld'
        CANDIDATE_PATHS = [
          'vocab/unlocode.jsonld',
          'vocab/unlocode-vocab.jsonld',
@@ -32,7 +37,7 @@ module Unlocodes
        def call(tag:)
          uri = resolve_uri(tag)
          data = download(uri)
-         write(data)
+         write(data, tag: tag)
          warn "Fetched UN/LOCODE #{tag} (#{data.bytesize} bytes) -> #{OUTPUT_PATH}"
          OUTPUT_PATH
        end
@@ -67,9 +72,10 @@ module Unlocodes
          end
        end

-       def write(data)
-         FileUtils.mkdir_p(File.dirname(OUTPUT_PATH))
+       def write(data, tag:)
+         FileUtils.mkdir_p(DATA_DIR)
          File.write(OUTPUT_PATH, data)
+         File.write(SOURCE_TAG_PATH, "#{tag}\n")
        end
      end
    end
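
Reviewer note: the new `write(data, tag:)` stores the dataset and its edition tag side by side. The sketch below reproduces that pattern against a temporary directory so it can be run without touching the gem. `write_dataset` and `read_tag` are hypothetical stand-ins, and the JSON payload is a placeholder rather than real LOCODE data.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical stand-in for Fetcher#write that takes the target directory
# as an argument so it can run against a temporary directory.
def write_dataset(dir, data, tag:)
  FileUtils.mkdir_p(dir)
  File.write(File.join(dir, 'locode.jsonld'), data)
  File.write(File.join(dir, 'SOURCE_TAG'), "#{tag}\n")
end

# Returns the stored tag without its trailing newline, or nil if absent.
def read_tag(dir)
  File.read(File.join(dir, 'SOURCE_TAG'), encoding: 'UTF-8').strip
rescue Errno::ENOENT
  nil
end

result = Dir.mktmpdir do |dir|
  before = read_tag(dir)
  write_dataset(dir, '{"@graph": []}', tag: '2025-2') # placeholder payload
  [before, read_tag(dir)]
end
p result
```

Because the data file is written before the tag, a failure between the two writes would leave a new dataset next to an old `SOURCE_TAG`; whether that window matters in practice is a judgment call for the maintainers.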
data/lib/unlocodes/registry.rb CHANGED
@@ -18,13 +18,13 @@ module Unlocodes
      def_delegators :@entries, :size, :count, :to_a, :empty?

      # Map of `#where` filter keys to the Entry attribute they read.
+     # Limited to attributes the JSON-LD vocabulary actually populates —
+     # status / iata / name_without_diacritics are NOT in the vocab, so
+     # they're intentionally absent here.
      SCALAR_FILTERS = {
        code: :code,
        country: :country,
-       status: :status_code,
-       status_code: :status_code,
-       subdivision: :subdivision,
-       iata: :iata
+       subdivision: :subdivision
      }.freeze

      def initialize(entries = [])
@@ -108,10 +108,6 @@ module Unlocodes
        by_function_index[letter.to_s.upcase] || []
      end

-     def by_status(code)
-       by_status_index[code.to_s.upcase] || []
-     end
-
      private

      def by_code
@@ -132,12 +128,6 @@ module Unlocodes
        end
      end

-     def by_status_index
-       @by_status_index ||= entries.each_with_object({}) do |e, h|
-         (h[e.status_code.to_s.upcase] ||= []) << e if e.status_code
-       end
-     end
-
      def apply_filter(scope, key, value)
        if SCALAR_FILTERS.key?(key)
          filter_scalar(scope, SCALAR_FILTERS.fetch(key), value)
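
Reviewer note: only the first lines of `apply_filter` are visible in this hunk, so how unknown keys are rejected is not shown. The sketch below is a miniature of a whitelist-driven `where` that matches the behaviour the README describes: arrays match any-of, filters combine, and keys outside the whitelist raise `ArgumentError`. `SketchEntry`, `FILTERS`, and this `where` are hypothetical, and the sample entries carry illustrative values only.

```ruby
# Hypothetical miniature of a whitelist-driven `where`; not the gem's code.
SketchEntry = Struct.new(:code, :country, :subdivision, keyword_init: true)

FILTERS = { code: :code, country: :country, subdivision: :subdivision }.freeze

def where(entries, **criteria)
  # Each criterion narrows the scope further, so criteria combine with AND.
  criteria.reduce(entries) do |scope, (key, value)|
    attribute = FILTERS.fetch(key) do
      raise ArgumentError, "unknown filter: #{key.inspect}"
    end
    # A single value or an array: an entry matches if it equals any of them.
    wanted = Array(value).map { |v| v.to_s.upcase }
    scope.select { |entry| wanted.include?(entry[attribute].to_s.upcase) }
  end
end

entries = [
  SketchEntry.new(code: 'CNSHA', country: 'CN', subdivision: 'CNSH'),
  SketchEntry.new(code: 'HKHKG', country: 'HK', subdivision: nil),
  SketchEntry.new(code: 'NLRTM', country: 'NL', subdivision: 'NLZH')
]

p where(entries, country: %w[CN HK]).map(&:code)
begin
  where(entries, status: 'AA')
rescue ArgumentError => e
  puts e.message
end
```

Raising on an unknown key surfaces typos and removed filters such as `status:` immediately, instead of returning an empty result that looks like a legitimate answer.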
data/lib/unlocodes/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module Unlocodes
-   VERSION = '0.1.3'
+   VERSION = '0.1.4'
  end
data/lib/unlocodes.rb CHANGED
@@ -9,12 +9,18 @@ require_relative 'unlocodes/version'
  # Vendored UN/LOCODE dataset as a queryable Ruby registry.
  #
  # The dataset is sourced from the UNECE/UNCEFACT LOCODE vocabulary published at
- # https://service.unece.org/trade/locode/ and distributed by this gem as a
- # bundled, offline JSON-LD representation. The registry loads once per process
- # and exposes a typed query API over `Unlocodes::Entry` instances.
+ # https://opensource.unicc.org/un/unece/uncefact/vocab-locode and distributed
+ # by this gem as a bundled, offline JSON-LD representation. The registry loads
+ # once per process and exposes a typed query API over `Unlocodes::Entry`
+ # instances.
+ #
+ # Which edition is bundled? See {Unlocodes.data_tag} (read from
+ # `lib/unlocodes/data/SOURCE_TAG`).
  module Unlocodes
    extend SingleForwardable

+   SOURCE_TAG_PATH = File.expand_path('unlocodes/data/SOURCE_TAG', __dir__)
+
    class << self
      # @return [Unlocodes::Registry] the process-wide registry, loaded lazily
      def registry
@@ -25,6 +31,17 @@ module Unlocodes
      def reset_registry!
        @registry = nil
      end
+
+     # The upstream UNCEFACT vocabulary tag bundled with this gem version
+     # (e.g. "2025-1"). Read from `lib/unlocodes/data/SOURCE_TAG` at runtime.
+     # @return [String, nil]
+     def data_tag
+       return @data_tag if defined?(@data_tag)
+
+       @data_tag = File.read(SOURCE_TAG_PATH, encoding: 'UTF-8').strip
+     rescue Errno::ENOENT
+       nil
+     end
    end

    def_delegators :registry, :find, :where, :each, :size, :count, :countries
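
Reviewer note: in `data_tag` as added here, the `rescue` branch returns `nil` without assigning `@data_tag`, so when `SOURCE_TAG` is missing the `defined?` guard never fires and the file is probed again on every call. That is probably harmless, but if caching the miss is intended, a variant along these lines would do it. `TagReader` and its `reads` counter are hypothetical and exist only to make the behaviour observable.

```ruby
require 'tmpdir'

# Hypothetical reader class; the gem uses a module-level method instead.
class TagReader
  attr_reader :reads

  def initialize(path)
    @path = path
    @reads = 0
  end

  def data_tag
    return @data_tag if defined?(@data_tag)

    @reads += 1
    @data_tag = begin
      File.read(@path, encoding: 'UTF-8').strip
    rescue Errno::ENOENT
      nil # this nil becomes the memoized value, so the miss is cached
    end
  end
end

Dir.mktmpdir do |dir|
  missing = TagReader.new(File.join(dir, 'SOURCE_TAG')) # file does not exist
  2.times { missing.data_tag }
  p missing.reads
end
```

The `defined?` guard is what allows a cached `nil` to be told apart from "not yet read", which `@data_tag ||= ...` could not do.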
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: unlocodes
  version: !ruby/object:Gem::Version
-   version: 0.1.3
+   version: 0.1.4
  platform: ruby
  authors:
  - Ribose Inc.
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2026-07-01 00:00:00.000000000 Z
+ date: 2026-07-02 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: json
@@ -55,6 +55,7 @@ files:
  - lib/unlocodes/coordinates.rb
  - lib/unlocodes/data.rb
  - lib/unlocodes/data/README.adoc
+ - lib/unlocodes/data/SOURCE_TAG
  - lib/unlocodes/data/fetcher.rb
  - lib/unlocodes/data/locode.jsonld
  - lib/unlocodes/entry.rb
@@ -63,13 +64,13 @@ files:
  - lib/unlocodes/registry.rb
  - lib/unlocodes/status.rb
  - lib/unlocodes/version.rb
- homepage: https://github.com/metanorma/unlocode
+ homepage: https://github.com/metanorma/unlocodes
  licenses:
  - BSD-2-Clause
  metadata:
-   homepage_uri: https://github.com/metanorma/unlocode
-   source_code_uri: https://github.com/metanorma/unlocode
-   bug_tracker_uri: https://github.com/metanorma/unlocode/issues
+   homepage_uri: https://github.com/metanorma/unlocodes
+   source_code_uri: https://github.com/metanorma/unlocodes
+   bug_tracker_uri: https://github.com/metanorma/unlocodes/issues
    rubygems_mfa_required: 'true'
  post_install_message:
  rdoc_options: []