lutaml-store 0.2.0 → 0.2.1

Files changed (91)
  1. checksums.yaml +4 -4
  2. data/.rubocop_todo.yml +11 -175
  3. data/README.adoc +233 -1124
  4. data/lib/lutaml/store/adapter/base.rb +4 -0
  5. data/lib/lutaml/store/adapter/memory.rb +8 -0
  6. data/lib/lutaml/store/cache_store.rb +9 -6
  7. data/lib/lutaml/store/format.rb +19 -0
  8. data/lib/lutaml/store/http_cache.rb +3 -13
  9. data/lib/lutaml/store/model_registration.rb +5 -2
  10. data/lib/lutaml/store/model_registry.rb +22 -20
  11. data/lib/lutaml/store/package_store.rb +2 -18
  12. data/lib/lutaml/store/package_transport/base.rb +48 -0
  13. data/lib/lutaml/store/package_transport/directory_transport.rb +196 -0
  14. data/lib/lutaml/store/package_transport/zip_transport.rb +178 -0
  15. data/lib/lutaml/store/package_transport.rb +11 -438
  16. data/lib/lutaml/store/version.rb +1 -1
  17. metadata +12 -77
  18. data/.github/workflows/main.yml +0 -27
  19. data/.gitignore +0 -12
  20. data/CORRECTED_HTTP_CACHE_IMPLEMENTATION.md +0 -209
  21. data/CORRECTED_HTTP_CACHE_PLAN.md +0 -164
  22. data/Gemfile +0 -15
  23. data/Gemfile.lock +0 -227
  24. data/TODO.impl/0-lutaml-store-self-quality.md +0 -112
  25. data/TODO.impl/1-lutaml-hal-migration.md +0 -96
  26. data/TODO.impl/2-glossarist-migration.md +0 -359
  27. data/TODO.impl/3-lutaml-jsonschema-migration.md +0 -273
  28. data/bin/console +0 -11
  29. data/bin/setup +0 -8
  30. data/demo/Gemfile +0 -15
  31. data/demo/Gemfile.lock +0 -61
  32. data/demo/README.adoc +0 -301
  33. data/demo/data/vcards/co/contact_10_thompson.data +0 -1
  34. data/demo/data/vcards/co/contact_10_thompson.meta +0 -1
  35. data/demo/data/vcards/co/contact_1_doe.data +0 -1
  36. data/demo/data/vcards/co/contact_1_doe.meta +0 -1
  37. data/demo/data/vcards/co/contact_2_smith.data +0 -1
  38. data/demo/data/vcards/co/contact_2_smith.meta +0 -1
  39. data/demo/data/vcards/co/contact_3_johnson.data +0 -1
  40. data/demo/data/vcards/co/contact_3_johnson.meta +0 -1
  41. data/demo/data/vcards/co/contact_4_garcia.data +0 -1
  42. data/demo/data/vcards/co/contact_4_garcia.meta +0 -1
  43. data/demo/data/vcards/co/contact_5_wilson.data +0 -1
  44. data/demo/data/vcards/co/contact_5_wilson.meta +0 -1
  45. data/demo/data/vcards/co/contact_6_brown.data +0 -1
  46. data/demo/data/vcards/co/contact_6_brown.meta +0 -1
  47. data/demo/data/vcards/co/contact_7_davis.data +0 -1
  48. data/demo/data/vcards/co/contact_7_davis.meta +0 -1
  49. data/demo/data/vcards/co/contact_8_anderson.data +0 -1
  50. data/demo/data/vcards/co/contact_8_anderson.meta +0 -1
  51. data/demo/data/vcards/co/contact_9_taylor.data +0 -1
  52. data/demo/data/vcards/co/contact_9_taylor.meta +0 -1
  53. data/demo/data/vcards.db +0 -0
  54. data/demo/pottery_class_demo.rb +0 -164
  55. data/demo/vcard_models.rb +0 -140
  56. data/demo/vcard_store_demo.rb +0 -526
  57. data/lutaml-store.gemspec +0 -36
  58. data/plan.adoc +0 -606
  59. data/spec/lutaml/store/adapter_interface_spec.rb +0 -89
  60. data/spec/lutaml/store/anti_pattern_guard_spec.rb +0 -35
  61. data/spec/lutaml/store/anti_pattern_spec.rb +0 -78
  62. data/spec/lutaml/store/autoload_spec.rb +0 -34
  63. data/spec/lutaml/store/cache_store_spec.rb +0 -271
  64. data/spec/lutaml/store/compression_spec.rb +0 -78
  65. data/spec/lutaml/store/config_enhanced_spec.rb +0 -158
  66. data/spec/lutaml/store/corrected_http_cache_integration_spec.rb +0 -336
  67. data/spec/lutaml/store/custom_serializer_spec.rb +0 -108
  68. data/spec/lutaml/store/database_store_spec.rb +0 -279
  69. data/spec/lutaml/store/file_io_spec.rb +0 -220
  70. data/spec/lutaml/store/format/yamls_spec.rb +0 -80
  71. data/spec/lutaml/store/format_round_trip_spec.rb +0 -110
  72. data/spec/lutaml/store/format_spec.rb +0 -70
  73. data/spec/lutaml/store/http_cache_entry_spec.rb +0 -203
  74. data/spec/lutaml/store/http_cache_hal_integration_spec.rb +0 -404
  75. data/spec/lutaml/store/http_cache_spec.rb +0 -422
  76. data/spec/lutaml/store/http_header_processor_spec.rb +0 -290
  77. data/spec/lutaml/store/import_spec.rb +0 -90
  78. data/spec/lutaml/store/integrity_spec.rb +0 -157
  79. data/spec/lutaml/store/key_collision_serializer_spec.rb +0 -98
  80. data/spec/lutaml/store/load_save_spec.rb +0 -107
  81. data/spec/lutaml/store/lutaml_model_integration_spec.rb +0 -291
  82. data/spec/lutaml/store/model_serializer_spec.rb +0 -140
  83. data/spec/lutaml/store/package_definition_spec.rb +0 -89
  84. data/spec/lutaml/store/package_store_spec.rb +0 -153
  85. data/spec/lutaml/store/package_transport/directory_transport_spec.rb +0 -139
  86. data/spec/lutaml/store/package_transport/zip_transport_spec.rb +0 -85
  87. data/spec/lutaml/store/store_spec.rb +0 -182
  88. data/spec/lutaml/store_spec.rb +0 -21
  89. data/spec/spec_helper.rb +0 -16
  90. data/spec/support/simple_test_model.rb +0 -15
  91. data/spec/support/yamls_test_model.rb +0 -35
@@ -1,112 +0,0 @@
1
- # TODO.impl/0-lutaml-store-self-quality.md
2
-
3
- # lutaml-store Internal Quality Fixes (prerequisite for migrations)
4
-
5
- ## Completed
6
-
7
- All anti-patterns eliminated from `lib/`:
8
-
9
- | Anti-pattern | Remaining in lib/ |
10
- |---|---|
11
- | `instance_variable_get/set` | 0 (only a comment reference) |
12
- | `respond_to?` | 0 (only a comment reference) |
13
- | `send` on private methods | 0 |
14
-
15
- ### Changes made
16
-
17
- 1. **Created `ModelSerializer`** — unified serialization/deserialization into one class, eliminating `respond_to?` chains and duplicated code across `DatabaseStore`, `CompositeModelHandler`, `Serializer`.
18
- 2. **Added `Store#emit_event`** public method — eliminated `instance_variable_get(:@events)`.
19
- 3. **Fixed `AttributeUpdater`** — uses proper constructors instead of `instance_variable_set/get`.
20
- 4. **Fixed `CacheStore`** — uses proper factory methods.
21
- 5. **Removed `respond_to?` from adapters** — calls methods directly on `Adapter::Base` interface.
22
- 6. **Removed `CacheInspector`** — unused code.
23
- 7. **Added custom serializer support** — `ModelRegistration` accepts `serializer:` option for models with non-standard serialization (e.g., glossarist's `key_value` DSL).
24
-
25
- ## Anti-pattern audit (current state)
26
-
27
- ### 1. `instance_variable_get` / `instance_variable_set` — breaks encapsulation
28
-
29
- | File | Line | Usage |
30
- |---|---|---|
31
- | `database_store.rb` | 375 | `@store.instance_variable_get(:@events)&.emit(event, data)` |
32
- | `model_store.rb` | 230 | Same pattern — reaching into `@store`'s internals |
33
- | `attribute_updater.rb` | 228 | `model.instance_variable_set(var, upgraded_model.instance_variable_get(var))` — polymorphic upgrade hack |
34
- | `cache_store.rb` | 45 | `entry.instance_variable_set(:@created_at, ...)` — bypassing constructor |
35
-
36
- **Fix:** Expose proper public APIs on the objects being accessed:
37
- - `Store` should expose `emit_event(event, data)` as a public method.
38
- - `AttributeUpdater#try_polymorphic_upgrade` must use proper constructors or
39
- `Lutaml::Model::Serializable`'s own API — never copy instance variables.
40
- - `CacheStore` should use factory methods or proper attribute setters.
41
-
42
- ### 2. `respond_to?` — poor typing / duck-typing smell
43
-
44
- **24 occurrences across the codebase.** The worst offenders:
45
-
46
- | File | Pattern | Fix |
47
- |---|---|---|
48
- | `composite_model_handler.rb` L102-133 | `respond_to?(:to_hash)`, `respond_to?(:from_hash)` | All models are `Lutaml::Model::Serializable` — use `is_a?` checks against a serialization protocol, or better, use `to_hash` directly via the type system |
49
- | `database_store.rb` L288-327 | Same serialization dispatch via `respond_to?` | Extract a `SerializationAdapter` that knows how to (de)serialize `Lutaml::Model::Serializable` |
50
- | `attribute_updater.rb` L87-129 | `respond_to?(setter_method)` | Use the model's attribute metadata from `Lutaml::Model::Serializable.attributes` |
51
- | `serializer.rb` | 14 occurrences | Full rewrite needed — see below |
52
- | `http_cache.rb` L112-142 | `@adapter.respond_to?(:clear)` | All adapters inherit from `Adapter::Base` which defines `#clear` — just call it |
53
-
54
- **Fix strategy:**
55
- - Create a `Serializable` protocol module (`Lutaml::Store::Serializable`) that
56
- formalizes the serialization contract (`to_store_hash`, `from_store_hash`).
57
- - Replace all `respond_to?` with type-based dispatch or method calls on known base classes.
58
- - For attribute validation, use `model.class.attributes` (from Lutaml::Model).
59
-
60
- ### 3. Duplicated serialization logic
61
-
62
- `DatabaseStore#serialize_model`, `DatabaseStore#deserialize_model`,
63
- `CompositeModelHandler#serialize_model`, `CompositeModelHandler#deserialize_model`
64
- all contain identical `respond_to?` chains for `(to_hash|to_h|to_s)` and
65
- `(from_hash|from_h|new)`.
66
-
67
- **Fix:** Extract a single `Lutaml::Store::ModelSerializer` class:
68
-
69
- ```ruby
70
- class ModelSerializer
71
-   def serialize(model)
72
-     model.to_hash.merge("_class" => model.class.name)
73
-   end
74
-
75
-   def deserialize(data, expected_class)
76
-     klass = Object.const_get(data["_class"])
77
-     klass.from_hash(data.except("_class", "_composite_models"))
78
-   end
79
- end
80
- ```
81
-
82
- Since all registered models are `Lutaml::Model::Serializable`, they all have
83
- `to_hash` and `from_hash`. No need for duck-typing fallback chains.
84
-
85
- ### 4. Event emission through encapsulation violation
86
-
87
- Both `DatabaseStore` and `ModelStore` emit events via:
88
- ```ruby
89
- @store.instance_variable_get(:@events)&.emit(event, data)
90
- ```
91
-
92
- **Fix:** Add `Store#emit_event(event, data)` as a public method. The `events`
93
- object is an internal implementation detail and should not be leaked.
94
-
95
- ### 5. `CacheInspector` — untested, unused?
96
-
97
- Check if `cache_inspector.rb` is actually used anywhere. If not, remove it.
98
-
99
- ### 6. Open/closed principle violations
100
-
101
- - `AttributeUpdater#try_polymorphic_upgrade` uses `instance_variable_set` and
102
- `model.extend(registered_class)` — this is metaprogramming that breaks OCP.
103
- Instead, create a new model instance of the subclass and replace the reference.
104
-
105
- ## Implementation order
106
-
107
- 1. **Add `Store#emit_event` public method** — eliminates `instance_variable_get(:@events)`.
108
- 2. **Create `ModelSerializer`** — unify all serialization/deserialization into one class. Eliminates `respond_to?` chains and duplicated code.
109
- 3. **Fix `AttributeUpdater`** — remove `try_polymorphic_upgrade`'s `instance_variable_set/get`. Use proper factory pattern.
110
- 4. **Fix `CacheStore`** — use proper constructor/factory instead of `instance_variable_set`.
111
- 5. **Remove `respond_to?` from adapters** — call methods directly on `Adapter::Base` interface.
112
- 6. **Audit specs** — add specs that verify no `send`, no `instance_variable_get/set`, no `respond_to?` regressions.
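Item 6 of the implementation order above asks for specs that guard against regressions. A minimal sketch of such a guard follows, assuming a plain text scan of the source is acceptable. `BANNED` and `anti_pattern_hits` are hypothetical names chosen for illustration and are not taken from the gem's own specs.

```ruby
# Patterns the audit above treats as anti-patterns.
# The labels and regexes are illustrative assumptions.
BANNED = {
  "instance_variable_get/set" => /\.instance_variable_(get|set)\b/,
  "respond_to?"               => /\.respond_to\?/,
  "send"                      => /\.send\(/,
}.freeze

# Returns [label, line_number] pairs for every banned call in `source`.
# Comment-only lines are skipped, since the audit tolerates comment references.
def anti_pattern_hits(source)
  source.each_line.with_index(1).flat_map do |line, number|
    next [] if line.strip.start_with?("#")

    BANNED.select { |_label, pattern| line.match?(pattern) }
          .keys
          .map { |label| [label, number] }
  end
end

sample = <<~RUBY
  # respond_to? is mentioned only in this comment
  @store.instance_variable_get(:@events)&.emit(event, data)
  @store.emit_event(event, data)
RUBY

hits = anti_pattern_hits(sample)
```

In a real spec the same helper could be run over each file matched by `Dir.glob("lib/**/*.rb")`, with the example failing whenever the result is non-empty. A text scan cannot tell a private `send` from a legitimate one, so it is only a coarse tripwire.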
@@ -1,96 +0,0 @@
1
- # TODO.impl/1-lutaml-hal-migration.md
2
-
3
- # Migration Plan: lutaml-hal → lutaml-store
4
-
5
- ## Completed
6
-
7
- ### Anti-pattern elimination (all done)
8
-
9
- All `instance_variable_set/get`, `send(:private_method)`, and `respond_to?` calls
10
- have been eliminated from `lib/lutaml/hal/`.
11
-
12
- | File | Changes |
13
- |---|---|
14
- | `resource.rb` | Added `embedded_data=` setter; `from_embedded` uses `public_send` instead of `instance_variable_set` |
15
- | `page.rb` | `instance_variable_get` → `public_send(Hal::REGISTER_ID_ATTR_NAME)` |
16
- | `model_register.rb` | All methods used by Link made public; `instance_variable_set` → `embedded_data=`; `instance_variable_set` for register ID → `public_send(setter)`; `send(key)` → `public_send(key)`; `respond_to?(:status)` → `faraday_response?(response) && response.status == 304` |
17
- | `link.rb` | Removed all `register.send(:private_method, ...)` calls (4 instances); `instance_variable_get` → `public_send`; `respond_to?(:embedded_data)` → `is_a?(Resource)` type check |
18
- | `global_register.rb` | Removed `respond_to?` checks from `clear_all_caches` and `cache_stats` |
19
- | `cache/cache_metadata.rb` | Extracted `ResponseHeaders` and `ResponseStatus` modules replacing `respond_to?` proc lambdas |
20
- | `cache/cache_manager.rb` | `respond_to?` → `rescue NoMethodError` for optional methods; `is_a?` type check for HttpCache; proper HttpCache API delegation (`get(:get, url)`, `set(:get, url, {}, response)`) |
21
- | `rate_limiter.rb` | Simplified with `is_a?` type check |
22
-
23
- ### Cache architecture fixes
24
-
25
- - `http_aware?` now requires explicit opt-in (`http_aware == true`) instead of defaulting to true when HttpCache is available
26
- - `CacheManager` properly delegates to `HttpCache`'s method/url API (`get(:get, url)`, `set(:get, url, {}, response)`, `delete(:get, url)`)
27
- - Non-http-aware path uses `SimpleCacheStore` (in-memory, no serialization) instead of `CacheStore` (which JSON-serializes and can't handle `CacheEntry` objects)
28
- - Removed `create_basic_cache` method (no longer needed)
29
-
30
- ### Spec fixes
31
-
32
- - `cache_integration_spec.rb`: Fixed `REGISTER_ID_ATTR_NAME` (requires `lutaml/hal` instead of redefining constant); added `status: 200` to mock responses; fixed cache key in legacy test; removed `instance_variable_get`/`send` usage
33
- - `cache_manager_spec.rb`: Updated stats tests to stub `stats` (not `cache_info`); updated `http_aware_cache?` test to use `is_a?` instead of `respond_to?`; fixed `get_from_http_cache` test signature
34
- - `cache_configuration_spec.rb`: Fixed `http_aware?` nil default expectation; fixed adapter_config error type
35
- - `cache_metadata_spec.rb`: Added `status: 200` to test doubles
36
-
37
- ### Test results
38
-
39
- - **lutaml-hal**: 210 examples, 0 failures
40
- - **lutaml-store**: 248 examples, 1 pre-existing failure (vary header spec) + 25 pre-existing failures in HTTP cache integration specs
41
- - **Anti-patterns in lib code**: 0 `instance_variable_set/get`, 0 `send`, 0 `respond_to?`
42
-
43
- ## Implemented in lutaml-hal#15
44
-
45
- The realized-object cache (relaton/w3c_api#11) is working and opt-in
46
- (`ModelRegister.new(..., cache: {...})`; no config keeps the previous
47
- behavior). Delivered across four commits on `rt-add-lutaml-store`:
48
-
49
- ### Phase 1: Make caching work / unify through lutaml-store — done (adapted)
50
-
51
- - The cache path was crashing; fixed: require `lutaml/store` (so its autoloads
52
- resolve), route `Link#realize` through the public `cache_manager` API, add
53
- `Client#get_by_url_with_headers`, remove `Client`'s legacy `@cache` Hash,
54
- make HTTP-aware caching explicit opt-in.
55
- - **Deviation:** instead of a `HalStore` wrapper, `CacheManager` uses
56
- lutaml-store's `CacheStore` directly for persistent adapters and keeps
57
- `SimpleCacheStore` for the in-memory adapter (so cache hits avoid
58
- serialization). `SimpleCacheStore` was therefore retained, not removed.
59
-
60
- ### Phase 2: Register ID / embedded data as proper attributes — done
61
-
62
- - Dropped the `Hal::REGISTER_ID_ATTR_NAME` constant and all dynamic
63
- `instance_variable_get/set("@#{...}")`; `Resource`/`Link`/`LinkSet` use
64
- `attr_accessor :_global_register_id`, embedded data is an `attr_accessor`.
65
- - Also removed the remaining `register.send(:private_method)` calls from
66
- `Link` by making `ModelRegister#find_matching_model_class` public.
67
-
68
- ### Phase 3: Persistence + URL keying — done (alternative to model-registry)
69
-
70
- - **URL keying:** `CacheManager` canonicalizes relative URLs to absolute before
71
- keying, so a resource fetched by endpoint path and the same resource realized
72
- from an absolute link href share one entry (the core repeated-realize fix).
73
- - **Persistence — decision:** rather than registering HAL resource classes in
74
- lutaml-store's model registry, `CacheEntry` gained a JSON storage form
75
- (records the model class + lutaml-model JSON). With a `filesystem`/`sqlite`
76
- adapter the cache persists via `Lutaml::Store::CacheStore` and rebuilds models
77
- on retrieval. This keeps the cache shape (`URL => entry`) and was chosen over
78
- the model-registry approach, whose `fetch(model:, key:)` API fits poorly with
79
- a heterogeneous URL→object lookup. Note: persisted entries require **named**
80
- resource classes with explicit `key_value` mappings.
81
-
82
- ## Still remaining
83
-
84
- ### HTTP-aware response cache (deferred)
85
-
86
- Backing the HTTP-aware mode with lutaml-store's `HttpCache` (ETag / 304
87
- revalidation against a cached response) is left as scaffolding
88
- (`create_http_cache` / `*_http_cache` in `CacheManager`). It needs a realized
89
- model to be reconstructed from a cached response (i.e. knowing the resource
90
- class at read time) and is not required for the realized-object caching in #11.
91
-
92
- ### Release coordination
93
-
94
- lutaml-hal#15 depends on lutaml-store via a `path:` Gemfile entry. lutaml-store
95
- needs a tagged release before lutaml-hal can depend on a published version and
96
- the PR can ship.
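The URL keying described in Phase 3 above can be illustrated with Ruby's standard `URI.join`. This is a sketch under assumptions: `canonical_cache_key`, the plain Hash used as a cache, and the example host are hypothetical and do not reflect how `CacheManager` is actually written.

```ruby
require "uri"

# Hypothetical helper: resolve a possibly relative URL against the API base,
# so that both spellings of the same resource map to one cache key.
def canonical_cache_key(base_url, url)
  URI.join(base_url, url).to_s
end

cache = {}
base = "https://api.example.org/"

# Entry stored when the resource is fetched by endpoint path.
cache[canonical_cache_key(base, "/groups/109735")] = :realized_group

# The absolute link href resolves to the same key, so this lookup is a hit.
hit = cache[canonical_cache_key(base, "https://api.example.org/groups/109735")]
```

Keying on the resolved absolute URL is what lets a resource fetched by path and the same resource realized from a link share one entry.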
@@ -1,359 +0,0 @@
1
- # TODO.impl/2-glossarist-migration.md
2
-
3
- # Migration Plan: glossarist-ruby → lutaml-store
4
-
5
- ## Current state analysis
6
-
7
- ### How glossarist stores LutaML model objects
8
-
9
- Glossarist is the most complex of the three repos. It manages glossary/terminology
10
- concepts with multiple versions (v1, v2, v3) and several storage mechanisms:
11
-
12
- 1. **Filesystem YAML persistence** — `ConceptManager` reads/writes concept YAML
13
- files from a directory structure. Concepts are stored as:
14
- - `concept/{uuid}.yaml` — the managed concept
15
- - `localized_concept/{uuid}.yaml` — per-language localizations
16
- - Grouped files: `{uuid}.yaml` containing concept + all localizations
17
-
18
- 2. **ZIP package (GCR format)** — `GcrPackage` reads/writes concepts from ZIP
19
- archives containing `metadata.yaml`, `concepts/*.yaml`, optional compiled
20
- formats (TBX, JSON-LD, Turtle), and dataset assets.
21
-
22
- 3. **In-memory collections** — `ManagedConceptCollection`, `Collection`,
23
- `Collections::Collection`, `Collections::TypedCollection` — all use plain
24
- Arrays/Hashes to hold models in memory.
25
-
26
- ### Architecture (current)
27
-
28
- ```
29
- Glossarist module
30
- ├── Collection (v1)
31
- │ ├── @index (Hash id → Concept)
32
- │ ├── @path (String — filesystem path)
33
- │ ├── load_concepts → Dir.glob → Concept.from_yaml
34
- │ └── save_concepts → File.write(Psych.dump)
35
-
36
- ├── ManagedConceptCollection (v2/v3)
37
- │ ├── @managed_concepts (Array)
38
- │ ├── @managed_concepts_ids (Hash id → uuid)
39
- │ ├── load_from_files → ConceptManager
40
- │ └── save_to_files → ConceptManager
41
-
42
- ├── ConceptManager
43
- │ ├── path, localized_concepts_path
44
- │ ├── load_from_files → Dir.glob → ConceptDocument.from_yamls → ManagedConcept
45
- │ ├── save_to_files → File.write(to_yaml)
46
- │ ├── save_grouped_concepts_to_files
47
- │ └── Versioned concept document classes (v2, v3)
48
-
49
- ├── GcrPackage
50
- │ ├── write → Zip::File → concept YAMLs + metadata
51
- │ ├── read → Zip::File → parse → ManagedConcept instances
52
- │ └── Compiled format generation (TBX, JSON-LD, Turtle)
53
-
54
- ├── ConceptCollector
55
- │ └── Static methods for scanning directories, detecting schema versions
56
-
57
- └── Model classes (all Lutaml::Model::Serializable):
58
-     ├── ManagedConcept (key: identifier/uuid)
59
-     ├── LocalizedConcept (inherits Concept)
60
-     ├── ConceptData
61
-     ├── Designation::Base (polymorphic)
62
-     ├── DetailedDefinition
63
-     ├── ConceptSource
64
-     └── ... many more
65
- ```
66
-
67
- ### Key observations
68
-
69
- **Glossarist has no storage abstraction.** File I/O is scattered across:
70
- - `Collection#load_concepts`, `Collection#save_concept_to_file`
71
- - `ConceptManager#load_concept_from_file`, `ConceptManager#save_concept_to_file`
72
- - `GcrPackage#write`, `GcrPackage#read`
73
- - `ConceptCollector` (static methods doing Dir.glob)
74
-
75
- **Every storage operation is ad-hoc:**
76
- - YAML serialization uses `Lutaml::Model::Serializable#to_yaml` / `.from_yaml`
77
- - File naming uses concept UUIDs
78
- - Directory layout is version-dependent
79
- - ZIP packaging duplicates the filesystem logic
80
-
81
- ### Problems identified
82
-
83
- | Category | Issue | Location |
84
- |---|---|---|
85
- | **No storage abstraction** | File I/O is in `ConceptManager`, `Collection`, `GcrPackage`, `ConceptCollector` — no single persistence layer | Multiple |
86
- | **MECE violation** | `ManagedConceptCollection` manages an Array + a separate id→uuid Hash — this is a database, but hand-rolled | `managed_concept_collection.rb` |
87
- | **DRY violation** | YAML file I/O patterns repeated in `Collection`, `ConceptManager`, `GcrPackage` | Multiple |
88
- | **Schema version branching** | Version detection (v1/v2/v3) is sprinkled through `ConceptCollector`, `ConceptManager`, `SchemaMigration` | Multiple |
89
- | **Lazy loading missing** | `Collection` has a TODO: "Add support for lazy concept loading" — all concepts loaded eagerly into memory | `collection.rb:4` |
90
- | **OCP violation** | Adding a new storage backend (e.g., database) requires modifying `ConceptManager`, `Collection`, and `GcrPackage` | Multiple |
91
-
92
- ## Migration strategy
93
-
94
- ### Phase 1: Define a `ConceptStore` interface backed by lutaml-store
95
-
96
- The core insight: Glossarist's concept management is a CRUD store with:
97
- - **Model:** `ManagedConcept` (key: `uuid`)
98
- - **Composite models:** `LocalizedConcept` instances per language
99
- - **Backends:** Filesystem YAML, ZIP archive, and (future) SQLite
100
-
101
- ```ruby
102
- module Glossarist
103
-   class ConceptStore
104
-     def initialize(adapter:, schema_version: "3")
105
-       @store = Lutaml::Store.new(
106
-         adapter: adapter,
107
-         models: [
108
-           {
109
-             model: Glossarist::ManagedConcept,
110
-             key: :uuid,
111
-             polymorphic_class_key: nil
112
-           },
113
-           {
114
-             model: Glossarist::LocalizedConcept,
115
-             key: :uuid
116
-           }
117
-         ]
118
-       )
119
-       @schema_version = schema_version
120
-     end
121
-
122
-     # CRUD operations
123
-     def save(managed_concept) = @store.save(managed_concept)
124
-     def fetch(uuid) = @store.fetch(model: ManagedConcept, uuid: uuid)
125
-     def fetch_by_id(id) = where(model: ManagedConcept, identifier: id).first
126
-     def update(uuid, **attrs) = @store.update(model: ManagedConcept, uuid: uuid, attributes: attrs)
127
-     def delete(uuid) = @store.destroy(model: ManagedConcept, uuid: uuid)
128
-
129
-     # Query
130
-     def all = @store.all(model: ManagedConcept)
131
-     def where(model:, **conditions) = @store.where(model: model, **conditions)
132
-     def count = @store.count(model: ManagedConcept)
133
-
134
-     # Composite model access
135
-     def fetch_localized(managed_concept_uuid, lang)
136
-       concept = fetch(managed_concept_uuid)
137
-       concept&.localization(lang)
138
-     end
139
-   end
140
- end
141
- ```
142
-
143
- ### Phase 2: Implement filesystem adapter for Glossarist
144
-
145
- Glossarist's filesystem layout is specific:
146
-
147
- ```
148
- concepts/
149
-   concept/
150
-     {uuid}.yaml
151
-   localized_concept/
152
-     {uuid}.yaml
153
- ```
154
-
155
- This maps to lutaml-store's FileSystem adapter with custom path resolution:
156
-
157
- ```ruby
158
- store = Glossarist::ConceptStore.new(
159
-   adapter: {
160
-     type: :filesystem,
161
-     path: "/path/to/concepts",
162
-     extension: "yaml",
163
-     # Custom naming: use uuid as filename, organize by model type
164
-     naming_strategy: :uuid,
165
-     directory_layout: {
166
-       ManagedConcept => "concept",
167
-       LocalizedConcept => "localized_concept"
168
-     }
169
-   }
170
- )
171
- ```
172
-
173
- This requires extending lutaml-store's FileSystem adapter to support:
174
- - **Custom directory layout** (model type → subdirectory)
175
- - **Custom key-to-filename mapping** (UUID-based naming)
176
- - **YAML serialization** (already supported by lutaml-model)
177
-
178
- Alternatively, add a `Glossarist::Adapters::YamlFilesystem` adapter that
179
- implements `Lutaml::Store::Adapter::Base`.
180
-
181
- ### Phase 3: Implement ZIP archive adapter
182
-
183
- GCR packages are ZIP archives. Create a `Glossarist::Adapters::ZipArchive`
184
- adapter:
185
-
186
- ```ruby
187
- class ZipArchive < Lutaml::Store::Adapter::Base
188
-   def initialize(config)
189
-     @zip_path = config[:path]
190
-     # ...
191
-   end
192
-
193
-   def save(key, data, metadata = {})
194
-     Zip::File.open(@zip_path, create: true) do |zf|
195
-       zf.get_output_stream(key_to_entry_name(key)) do |f|
196
-         f.write(data.to_yaml)
197
-       end
198
-     end
199
-   end
200
-
201
-   def load(key)
202
-     Zip::File.open(@zip_path) do |zf|
203
-       entry = zf.find_entry(key_to_entry_name(key))
204
-       entry&.get_input_stream&.read
205
-     end
206
-   end
207
-   # ... implement remaining Adapter::Base methods
208
- end
209
- ```
210
-
211
- This makes `GcrPackage` a thin wrapper around `ConceptStore` with a ZIP adapter.
212
-
213
- ### Phase 4: Migrate `ManagedConceptCollection`
214
-
215
- Replace the hand-rolled Array + Hash index with `ConceptStore`:
216
-
217
- ```ruby
218
- class ManagedConceptCollection
219
-   def initialize(store:)
220
-     @store = store # Glossarist::ConceptStore
221
-   end
222
-
223
-   def fetch(uuid) = @store.fetch(uuid)
224
-   def store(concept) = @store.save(concept)
225
-   def each(&block) = @store.all.each(&block)
226
-
227
-   # Remove: @managed_concepts array, @managed_concepts_ids hash
228
-   # Remove: load_from_files, save_to_files (delegate to store)
229
- end
230
- ```
231
-
232
- ### Phase 5: Migrate `Collection` (v1)
233
-
234
- The v1 `Collection` follows the same pattern but is simpler:
235
-
236
- ```ruby
237
- class Collection
238
-   def initialize(store:)
239
-     @store = store
240
-   end
241
-
242
-   def fetch(id) = @store.fetch_by_id(id)
243
-   def store(concept) = @store.save(concept)
244
-   # Remove: @index, @path, load_concepts, save_concepts
245
- end
246
- ```
247
-
248
- ### Phase 6: Migrate `GcrPackage`
249
-
250
- `GcrPackage` becomes a factory that creates a `ConceptStore` with a ZIP adapter:
251
-
252
- ```ruby
253
- class GcrPackage
254
-   def self.load(zip_path)
255
-     store = ConceptStore.new(adapter: { type: :zip, path: zip_path })
256
-     metadata = load_metadata(zip_path)
257
-     concepts = store.all
258
-     new(zip_path, metadata, concepts)
259
-   end
260
-
261
-   def self.create(concepts:, metadata:, output_path:, **opts)
262
-     store = ConceptStore.new(adapter: { type: :zip, path: output_path })
263
-     concepts.each { |concept| store.save(concept) }
264
-     write_metadata(output_path, metadata)
265
-     # Compiled format generation stays here — it's a presentation concern
266
-   end
267
- end
268
- ```
269
-
270
- ### Phase 7: Migrate `ConceptCollector`
271
-
272
- `ConceptCollector` currently has 230 lines of directory-scanning logic.
273
- With lutaml-store, most of this becomes:
274
-
275
- ```ruby
276
- def collect(dir)
277
-   store = ConceptStore.new(adapter: { type: :filesystem, path: dir })
278
-   store.all
279
- end
280
- ```
281
-
282
- Version detection logic moves into the adapter layer.
283
-
284
- ## Completed
285
-
286
- ### Phase 1: ConceptStore with custom serializer (done)
287
-
288
- Created `Glossarist::ConceptStore` backed by `Lutaml::Store` with full CRUD.
289
-
290
- **Problem:** `ManagedConcept` uses `key_value` DSL that maps both `uuid` and
291
- `identifier` to the same `"id"` hash key — lossy for storage.
292
-
293
- **Solution:** Added pluggable serializer support to lutaml-store:
294
- - `ModelRegistration` now accepts `serializer:` option
295
- - `ModelSerializer` delegates to custom serializer when registered
296
- - Created `Glossarist::ConceptSerializer` — attribute-based serialization that
297
- uses attribute names as hash keys (bypasses `key_value` mappings entirely)
298
-
299
- | File | Change |
300
- |---|---|
301
- | `lutaml-store/lib/lutaml/store/model_registration.rb` | Added `serializer` attr_reader and option |
302
- | `lutaml-store/lib/lutaml/store/model_serializer.rb` | Accepts `registration` param, delegates to custom serializer |
303
- | `lutaml-store/lib/lutaml/store/database_store.rb` | Passes registration to serialize/deserialize calls |
304
- | `lutaml-store/spec/lutaml/store/custom_serializer_spec.rb` | 4 specs for custom serializer feature |
305
- | `glossarist/lib/glossarist/concept_serializer.rb` | Attribute-based serializer for ManagedConcept |
306
- | `glossarist/lib/glossarist/concept_store.rb` | CRUD store backed by lutaml-store with custom serializer |
307
- | `glossarist/lib/glossarist.rb` | Autoloads for ConceptSerializer, ConceptStore |
308
- | `glossarist/Gemfile` | Fixed lutaml-store path (`../../lutaml/lutaml-store`) |
309
- | `glossarist/glossarist.gemspec` | Added `lutaml-store ~> 0.1.0` dependency |
310
- | `glossarist/spec/unit/concept_serializer_spec.rb` | 4 specs for serializer round-trip |
311
- | `glossarist/spec/unit/concept_store_spec.rb` | 12 specs for full CRUD |
312
-
313
- ### Test results
314
-
315
- - **lutaml-store:** 19 core specs pass (database_store + custom_serializer)
316
- - **glossarist:** 1154 examples, 0 failures (no regressions)
317
-
318
- ## Remaining (future work)
319
-
320
- | File | Purpose |
321
- |---|---|
322
- | `lib/glossarist/concept_store.rb` | Glossarist-specific store backed by `Lutaml::Store` |
323
- | `lib/glossarist/adapters/yaml_filesystem.rb` | Custom filesystem adapter for Glossarist's YAML layout |
324
- | `lib/glossarist/adapters/zip_archive.rb` | ZIP archive adapter for GCR packages |
325
-
326
- ## Files to modify
327
-
328
- | File | Change |
329
- |---|---|
330
- | `managed_concept_collection.rb` | Replace Array/Hash with `ConceptStore`; remove `load_from_files`/`save_to_files` |
331
- | `collection.rb` | Replace `@index` with `ConceptStore`; remove `load_concepts`/`save_concepts` |
332
- | `gcr_package.rb` | Use `ConceptStore` with ZIP adapter; keep compiled format generation |
333
- | `concept_manager.rb` | Simplify to delegate to `ConceptStore`; remove raw File I/O |
334
- | `concept_collector.rb` | Replace directory scanning with `ConceptStore.new(...).all` |
335
- | `glossarist.rb` | Add autoloads for new classes |
336
-
337
- ## Spec coverage needed
338
-
339
- 1. **ConceptStore CRUD** — save, fetch, update, delete for ManagedConcept
340
- 2. **Composite model storage** — LocalizedConcept stored independently
341
- 3. **Filesystem adapter** — reads/writes Glossarist's directory layout
342
- 4. **ZIP adapter** — round-trip through GCR packages
343
- 5. **Schema version handling** — v1, v2, v3 concepts load correctly
344
- 6. **Lazy loading** — concepts loaded on demand, not all at once
345
- 7. **Migration backward compatibility** — existing YAML files still readable
346
-
347
- ## Risks
348
-
349
- - **High complexity:** Glossarist has v1/v2/v3 schema migration paths. The
350
- storage layer must handle all versions transparently.
351
- - **Medium risk:** ZIP adapter must handle streaming mode for large glossaries.
352
- - **Low risk:** Replacing `ManagedConceptCollection`'s Array — straightforward
353
- delegation.
354
-
355
- ## Dependencies
356
-
357
- - **lutaml-store** must first support custom filesystem layouts and YAML
358
- serialization natively. See `0-lutaml-store-self-quality.md`.
359
- - **lutaml-store adapter interface** may need extension for ZIP archive support.
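The key-collision problem behind the custom serializer (Phase 1 under "Completed" above) can be shown without lutaml-model. The following is a sketch under assumptions: `Concept` is a plain `Struct` standing in for `ManagedConcept`, and `AttributeSerializer` with its `serialize`/`deserialize` methods is a hypothetical shape, not the documented interface of the `serializer:` option.

```ruby
# Stand-in for ManagedConcept; the real class is a Lutaml::Model::Serializable.
Concept = Struct.new(:uuid, :identifier, keyword_init: true)

# Lossy mapping: both attributes are written to the same "id" key,
# so the second write overwrites the first.
def lossy_hash(concept)
  { "id" => concept.uuid }.merge("id" => concept.identifier)
end

# Hypothetical attribute-based serializer: one hash key per attribute name,
# so no two attributes can collide.
module AttributeSerializer
  def self.serialize(concept)
    concept.to_h.transform_keys(&:to_s)
  end

  def self.deserialize(hash)
    Concept.new(**hash.transform_keys(&:to_sym))
  end
end

concept = Concept.new(uuid: "c0ffee-uuid", identifier: "3.1.1")
lossy = lossy_hash(concept)
round_trip = AttributeSerializer.deserialize(AttributeSerializer.serialize(concept))
```

With the lossy mapping the `uuid` cannot be recovered from the stored hash, whereas the attribute-keyed form round-trips both values. That is the property a store needs when `uuid` is the record key.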