relaton-index 0.2.19 → 0.2.20
- checksums.yaml +4 -4
- data/CLAUDE.md +64 -0
- data/README.adoc +25 -1
- data/lib/relaton/index/file_io.rb +24 -2
- data/lib/relaton/index/type.rb +71 -16
- data/lib/relaton/index/version.rb +1 -1
- data/relaton-index.gemspec +1 -1
- metadata +10 -5
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 84483e9b27f8e48c618e2b6b7d124e618deebba9fa2ffb34d234027c3738ad32
+  data.tar.gz: '084e50ecfa029fb812e9552189ef795a901d5683b2f79baf643d55da0af93a78'
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 198b94038473955e9c03f9e21b2e5876e9d36e94f72e11aec80d78350d988be5332cbbaceacf3e4253281b107afc4be370b66b89000df9d42f5dbdb17e5d8016
+  data.tar.gz: 38690e3090d278e9204200afefceac74eef9a4b5b40509fa2bea89d5c24053256c0f97c9c678f7642ea398caeb88b9710b7d272b8e55afc1bc6c1ca9d2a23fa1
data/CLAUDE.md
ADDED
@@ -0,0 +1,64 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+relaton-index is a Ruby gem that provides indexing and searching of Relaton document references. It maps document identifiers to file paths, supporting both local index creation (for publishing) and remote index consumption (downloading from URLs with 24-hour caching).
+
+## Commands
+
+```bash
+# Run all tests (default rake task)
+rake spec
+
+# Run linting
+rake rubocop
+
+# Run specific test file
+bundle exec rspec spec/relaton/type_spec.rb
+
+# Run specific test by name
+bundle exec rspec spec/relaton/file_io_spec.rb -e "fetch_and_save"
+
+# Install dependencies
+bin/setup
+
+# Interactive console
+bin/console
+```
+
+## Architecture
+
+### Core Classes (all under `Relaton::Index` module in `lib/relaton/index/`)
+
+- **`Relaton::Index`** (module, `lib/relaton/index.rb`) — Static API entry point. Delegates to Pool and Config. Main methods: `find_or_create`, `close`, `configure`.
+
+- **Pool** — Object pool that caches Type instances by document type (`:ISO`, `:IEC`, `:IHO`, etc.). Reuses existing indexes if parameters match, recreates if they change.
+
+- **Type** — Represents one index for a document type. Holds an array of `{id:, file:}` hashes. Provides `add_or_update`, `search` (string substring match or block), and `save`.
+
+- **FileIO** — Handles reading/writing/downloading index files. Three modes based on `@url`: string URL (download and cache to `~/.relaton/{type}/`), `true` (read local file from `~/.relaton/{type}/`), `nil` (read from current directory). Uses class-level Mutex for thread-safe downloads. Validates index format on load.
+
+- **FileStorage** — Storage abstraction module with `ctime`, `read`, `write`, `remove`. Can be replaced via `Config.storage=` for custom backends (e.g., S3).
+
+- **Config** — Global configuration: `storage`, `storage_dir`, `filename` (default: "index.yaml").
+
+### Data Flow
+
+1. `Relaton::Index.find_or_create(:TYPE, url:, file:, id_keys:, pubid_class:)` → Pool looks up or creates Type
+2. Type lazily loads index via FileIO on first access
+3. FileIO either reads local YAML or downloads ZIP from URL, extracts, validates format
+4. Search matches against `:id` field (string comparison via `include?` or custom block)
+5. `save` writes index as YAML to local file
+
+### Index Format
+
+YAML array of hashes with `:id` (string or structured hash) and `:file` (path string). Supports backward compatibility with old string-based format and newer pubid object format.
+
+### Key Design Decisions
+
+- Remote indexes cached for 24 hours at `~/.relaton/{type}/index.yaml`
+- Thread safety via `@@mutex` in FileIO prevents concurrent downloads of the same file
+- Pubid deserialization is optional — when `pubid_class` is provided, string IDs are converted to structured objects
+- Index format validation checks for required `:id` and `:file` keys, with automatic recovery (re-download or removal) on corruption
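The index format and substring search described in the added CLAUDE.md can be illustrated with a minimal, standalone sketch. It does not call the gem: `SAMPLE_YAML`, `valid_index?`, and the sample file paths are hypothetical names made up for illustration, and the validation shown is only a guess at what "checks for required `:id` and `:file` keys" might amount to.

```ruby
require "yaml"

# Hypothetical sample data in the documented shape:
# a YAML array of hashes with :id and :file keys.
SAMPLE_YAML = <<~YAML
  - :id: ISO 1
    :file: data/iso-1.yaml
  - :id: ISO 10
    :file: data/iso-10.yaml
YAML

# Hypothetical helper (not the gem's API): true when the payload is an
# array whose entries all carry the required :id and :file keys.
def valid_index?(entries)
  entries.is_a?(Array) &&
    entries.all? { |e| e.is_a?(Hash) && e.key?(:id) && e.key?(:file) }
end

entries = YAML.safe_load(SAMPLE_YAML, permitted_classes: [Symbol])

# Substring match on string IDs, as in step 4 of the data flow.
matches = entries.select { |e| e[:id].include?("ISO 1") }
puts valid_index?(entries)    # true
puts matches.map { |e| e[:id] }.inspect # ["ISO 1", "ISO 10"]
```

Because the match is a substring test, a query for "ISO 1" also returns "ISO 10"; callers that need an exact match would pass a block instead.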
data/README.adoc
CHANGED
@@ -24,7 +24,7 @@ If bundler is not being used to manage dependencies, install the gem by executin
 
 === Creating an index object
 
-The gem provides the `Relaton::Index.find_or_create {type}, url: {url}, file: {filename}, id_keys: {keys}` method to create an index object. The first argument is the type of dataset (ISO, IEC, IHO, etc.). The second argument is the URL to the zipped remote index file. The third argument is the filename of the local index file. The fourth argument is an array of ID's parts names. The URL, filename, and
+The gem provides the `Relaton::Index.find_or_create {type}, url: {url}, file: {filename}, id_keys: {keys}, pubid_class: {class}` method to create an index object. The first argument is the type of dataset (ISO, IEC, IHO, etc.). The second argument is the URL to the zipped remote index file. The third argument is the filename of the local index file. The fourth argument is an array of ID's parts names. The fifth argument is a class that implements `Pubid::Core::Identifier` for deserializing ID hashes into structured identifier objects. The URL, filename, keys, and pubid_class are optional.
 
 If the URL is specified and the local file in a `/{home}/.relaton/{type}` dir doesn't exist or is outdated, the index file will be downloaded from the URL saved as a local file and an index object will be created from the file. If the file in the `/{home}/.relaton/{type}` exists and is actual, the index object will be created from the local file.
 
@@ -97,6 +97,30 @@ end
 # => [{ id: "B-4 2.19.0", file: "data/b-4_2_19_0.xml" }]
 ----
 
+=== Using pubid_class for structured identifiers
+
+The `pubid_class` option allows index entries to be deserialized into structured identifier objects instead of plain strings or hashes. The class must include `Pubid::Core::Identifier` and provide a `.create(**hash)` factory method.
+
+When `pubid_class` is specified, each `:id` hash in the index is converted into a pubid object, enabling structured access to identifier components (e.g., publisher, number, part, year).
+
+[source,ruby]
+----
+require 'relaton/index'
+require 'pubid-iso'
+
+# Create an index with pubid_class to deserialize IDs into Pubid::Iso::Identifier objects
+index = Relaton::Index.find_or_create :ISO,
+  url: "https://raw.githubusercontent.com/relaton/relaton-data-iso/main/index-v2.zip",
+  pubid_class: Pubid::Iso::Identifier
+
+# Search returns entries with structured pubid objects as IDs
+results = index.search "ISO 1"
+results.first[:id]
+# => #<Pubid::Iso::Identifier: ISO 1>
+results.first[:id].publisher
+# => "ISO"
+----
+
 === Remove all index records
 
 This method removes all records from the index object. The index file is not removed.
data/lib/relaton/index/file_io.rb
CHANGED
@@ -7,6 +7,7 @@ module Relaton
     #
     class FileIO
       attr_reader :url, :pubid_class
+      attr_accessor :sorted
 
       @@file_locks = {}
       @@file_locks_mutex = Mutex.new
@@ -28,6 +29,7 @@ module Relaton
         @filename = filename
         @id_keys = id_keys || []
         @pubid_class = pubid_class
+        @sorted = false
       end
 
       #
@@ -117,7 +119,15 @@ module Relaton
       def deserialize_pubid(index)
         return index unless @pubid_class
 
-
+        @sorted = true
+        prev_number = nil
+        index.map do |r|
+          id = @pubid_class.create(**r[:id])
+          num = get_id_number id
+          @sorted = false if prev_number && prev_number > num
+          prev_number = num
+          { id: id, file: r[:file] }
+        end
       end
 
       def warn_local_index_error(reason)
@@ -183,12 +193,24 @@ module Relaton
       # @return [void]
       #
       def save(index)
-        yaml = index.map do |item|
+        yaml = sort_structured_index(index).map do |item|
           item.transform_values { |value| value.is_a?(Pubid::Core::Identifier::Base) ? value.to_h : value }
         end.to_yaml
         Index.config.storage.write file, yaml
       end
 
+      def sort_structured_index(index)
+        if @pubid_class && index.first&.dig(:id).is_a?(Pubid::Core::Identifier::Base)
+          index.sort_by { |item| get_id_number item[:id] }
+        else
+          index
+        end
+      end
+
+      def get_id_number(id)
+        id.respond_to?(:base) && id.base ? id.base.number.to_s : id.number.to_s
+      end
+
       #
       # Remove index file from storage
       #
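The sort key in the hunks above is `number.to_s`, so both the sortedness check in `deserialize_pubid` and the `sort_by` in `sort_structured_index` compare strings rather than integers. A small standalone sketch of what that ordering looks like follows; `FakeId`, `id_number`, and `sorted_by_number?` are hypothetical stand-ins, not part of the gem or of pubid.

```ruby
# Hypothetical stand-in for a pubid object; only #number is modelled.
FakeId = Struct.new(:number)

# Same idea as the diff's get_id_number for an ID without a #base:
# the number rendered as a String.
def id_number(id)
  id.number.to_s
end

# One pass over adjacent pairs: true when no key is smaller than its predecessor.
def sorted_by_number?(ids)
  ids.each_cons(2).all? { |a, b| id_number(a) <= id_number(b) }
end

ids = [FakeId.new(9), FakeId.new(10), FakeId.new(100)]

# Numerically ascending, but not ascending as strings ("9" > "10").
puts sorted_by_number?(ids) # false

# Sorting by the string key gives lexicographic order.
sorted = ids.sort_by { |id| id_number(id) }
puts sorted.map(&:number).inspect # [10, 100, 9]
puts sorted_by_number?(sorted)    # true
```

The ordering is lexicographic, which looks unusual for numbers but is self-consistent as long as every consumer of the sorted index compares the same string keys.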
data/lib/relaton/index/type.rb
CHANGED
@@ -10,9 +10,11 @@ module Relaton
       # @param [String, Symbol] type type of index (ISO, IEC, etc.)
       # @param [String, nil] url external URL to index, used to fetch index for searching files
       # @param [String, nil] file output file name
-      # @param [
+      # @param [Array<Symbol>] id_keys keys of identifier to be used for sorting index
+      # format of index file is checked if id_keys all is provided at least in one of the IDs
+      # @param [Pubid::Core::Identifier::Base, nil] pubid class for deserialization
       #
-      def initialize(type, url = nil, file = nil, id_keys = nil, pubid_class = nil)
+      def initialize(type, url = nil, file = nil, id_keys = nil, pubid_class = nil) # rubocop:disable Metrics/ParameterLists
         @file = file
         filename = file || Index.config.filename
         @file_io = FileIO.new type.to_s.downcase, url, filename, id_keys, pubid_class
@@ -45,11 +47,15 @@ module Relaton
       # @return [void]
       #
       def add_or_update(id, file)
-
+        key = id.to_s
+        item = id_lookup[key]
         if item
           item[:file] = file
         else
-
+          new_item = { id: id, file: file }
+          index << new_item
+          id_lookup[key] = new_item
+          @file_io.sorted = false
         end
       end
 
@@ -60,18 +66,11 @@ module Relaton
       #
       # @return [Array<Hash>] search results
       #
-      def search(id = nil)
-
-
-
-
-            if i[:id].is_a?(String)
-              id.is_a?(String) ? i[:id].include?(id) : i[:id].include?(id.to_s)
-            else
-              id.is_a?(String) ? i[:id].to_s.include?(id) : i[:id] == id
-            end
-          end
-        end
+      def search(id = nil, &block)
+        items = search_candidates(id)
+        return items.select(&block) if block
+
+        items.select { |i| match_item(i, id) }
       end
 
       #
@@ -91,6 +90,7 @@ module Relaton
       def remove_file
         @file_io.remove
         @index = nil
+        @id_lookup = nil
       end
 
       #
@@ -100,6 +100,61 @@ module Relaton
       #
       def remove_all
         @index = []
+        @id_lookup = nil
+        @file_io.sorted = true
+      end
+
+      private
+
+      def id_lookup
+        @id_lookup ||= index.each_with_object({}) do |item, h|
+          h[item[:id].to_s] = item
+        end
+      end
+
+      def search_candidates(id)
+        # index needs to be created to check if sorted
+        idx = index
+        if @file_io.sorted && id && !id.is_a?(String)
+          candidates_by_number(id)
+        else
+          idx
+        end
+      end
+
+      def candidates_by_number(id)
+        target = get_id_number(id)
+        left = bsearch_left(target)
+        return [] unless left
+
+        right = bsearch_right(target)
+        index[left...right]
+      end
+
+      def get_id_number(id)
+        id.respond_to?(:base) && id.base ? id.base.number.to_s : id.number.to_s
+      end
+
+      def bsearch_left(target)
+        index.bsearch_index do |item|
+          get_id_number(item[:id]) >= target
+        end
+      end
+
+      def bsearch_right(target)
+        index.bsearch_index do |item|
+          get_id_number(item[:id]) > target
+        end || index.size
+      end
+
+      def match_item(item, id)
+        if item[:id].is_a?(String)
+          item[:id].include?(id.is_a?(String) ? id : id.to_s)
+        elsif id.is_a?(String)
+          item[:id].to_s.include?(id)
+        else
+          item[:id] == id
+        end
       end
     end
   end
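The `bsearch_left` / `bsearch_right` pair added to `type.rb` narrows a sorted index to the slice of entries sharing one number before the exact comparison runs. The following is a minimal sketch of that equal-range lookup on plain data, assuming the array is already sorted by a string key; `Entry` and `equal_range` are hypothetical names, and only `Array#bsearch_index` from the Ruby core library is used.

```ruby
# Hypothetical entry type; :key plays the role of the stringified ID number.
Entry = Struct.new(:key, :file)

# Returns the contiguous slice of entries whose key equals target.
# Assumes `sorted` is ordered by key using String comparison.
def equal_range(sorted, target)
  # First position whose key is >= target (nil when every key is smaller).
  left = sorted.bsearch_index { |e| e.key >= target }
  return [] unless left

  # First position whose key is > target; falls back to the array size
  # when the matching run extends to the end.
  right = sorted.bsearch_index { |e| e.key > target } || sorted.size
  sorted[left...right]
end

entries = [
  Entry.new("1", "a.yaml"),
  Entry.new("10", "b.yaml"),
  Entry.new("10", "c.yaml"),
  Entry.new("9", "d.yaml")
]

puts equal_range(entries, "10").map(&:file).inspect # ["b.yaml", "c.yaml"]
puts equal_range(entries, "9").map(&:file).inspect  # ["d.yaml"]
puts equal_range(entries, "2").inspect              # []
```

Two binary searches cost O(log n) each, so the later `select` only has to scan the entries with a matching number instead of the whole index. The lookup is only valid while the array stays sorted, which is presumably why the diff resets the `sorted` flag whenever a new entry is appended.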
data/relaton-index.gemspec
CHANGED
@@ -31,7 +31,7 @@ Gem::Specification.new do |spec|
   spec.require_paths = ["lib"]
 
   spec.add_dependency "openssl", "~> 3.3.2"
-  spec.add_dependency "pubid-core", "~> 1.15.
+  spec.add_dependency "pubid-core", "~> 1.15.6"
   spec.add_dependency "relaton-logger", "~> 0.2.0"
   spec.add_dependency "rubyzip", "~> 2.3.0"
 
metadata
CHANGED
@@ -1,13 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: relaton-index
 version: !ruby/object:Gem::Version
-  version: 0.2.
+  version: 0.2.20
 platform: ruby
 authors:
 - Ribose Inc.
+autorequire:
 bindir: exe
 cert_chain: []
-date:
+date: 2026-02-09 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: openssl
@@ -29,14 +30,14 @@ dependencies:
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: 1.15.
+        version: 1.15.6
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: 1.15.
+        version: 1.15.6
 - !ruby/object:Gem::Dependency
   name: relaton-logger
   requirement: !ruby/object:Gem::Requirement
@@ -65,6 +66,7 @@ dependencies:
     - - "~>"
       - !ruby/object:Gem::Version
         version: 2.3.0
+description:
 email:
 - open.source@ribose.com
 executables: []
@@ -73,6 +75,7 @@ extra_rdoc_files: []
 files:
 - ".rspec"
 - ".rubocop.yml"
+- CLAUDE.md
 - Gemfile
 - LICENSE.txt
 - README.adoc
@@ -93,6 +96,7 @@ licenses:
 metadata:
   homepage_uri: https://github.com/relaton/relaton-index
   source_code_uri: https://github.com/relaton/relaton-index
+post_install_message:
 rdoc_options: []
 require_paths:
 - lib
@@ -107,7 +111,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
   - !ruby/object:Gem::Version
     version: '0'
 requirements: []
-rubygems_version: 3.
+rubygems_version: 3.5.22
+signing_key:
 specification_version: 4
 summary: Relaton Index is a library for indexing Relaton files.
 test_files: []