glossarist 2.6.5 → 2.6.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (155)
  1. checksums.yaml +4 -4
  2. data/.github/workflows/release.yml +1 -4
  3. data/.rubocop_todo.yml +53 -2
  4. data/CLAUDE.md +27 -2
  5. data/README.adoc +532 -56
  6. data/config.yml +68 -1
  7. data/glossarist.gemspec +2 -0
  8. data/lib/glossarist/citation.rb +26 -123
  9. data/lib/glossarist/cli/compare_command.rb +106 -0
  10. data/lib/glossarist/cli/export_command.rb +11 -14
  11. data/lib/glossarist/cli/validate_command.rb +111 -20
  12. data/lib/glossarist/cli.rb +18 -0
  13. data/lib/glossarist/collections/bibliography_collection.rb +4 -2
  14. data/lib/glossarist/collections/localization_collection.rb +2 -0
  15. data/lib/glossarist/comparison_result.rb +35 -0
  16. data/lib/glossarist/concept.rb +1 -1
  17. data/lib/glossarist/concept_collector.rb +44 -0
  18. data/lib/glossarist/concept_comparator.rb +72 -0
  19. data/lib/glossarist/concept_data.rb +20 -0
  20. data/lib/glossarist/concept_diff.rb +15 -0
  21. data/lib/glossarist/concept_document.rb +11 -0
  22. data/lib/glossarist/concept_manager.rb +19 -5
  23. data/lib/glossarist/concept_ref.rb +13 -0
  24. data/lib/glossarist/concept_reference.rb +12 -19
  25. data/lib/glossarist/concept_validator.rb +6 -1
  26. data/lib/glossarist/context_configuration.rb +90 -0
  27. data/lib/glossarist/dataset_validator.rb +8 -4
  28. data/lib/glossarist/designation/abbreviation.rb +0 -2
  29. data/lib/glossarist/designation/base.rb +21 -1
  30. data/lib/glossarist/designation/expression.rb +3 -0
  31. data/lib/glossarist/designation/letter_symbol.rb +0 -4
  32. data/lib/glossarist/designation/prefix.rb +17 -0
  33. data/lib/glossarist/designation/suffix.rb +17 -0
  34. data/lib/glossarist/designation/symbol.rb +0 -2
  35. data/lib/glossarist/gcr_metadata.rb +7 -14
  36. data/lib/glossarist/gcr_package.rb +35 -23
  37. data/lib/glossarist/gcr_validator.rb +38 -17
  38. data/lib/glossarist/glossary_definition.rb +5 -0
  39. data/lib/glossarist/localized_concept.rb +8 -0
  40. data/lib/glossarist/managed_concept.rb +39 -6
  41. data/lib/glossarist/managed_concept_data.rb +22 -2
  42. data/lib/glossarist/non_verb_rep.rb +21 -6
  43. data/lib/glossarist/pronunciation.rb +32 -0
  44. data/lib/glossarist/rdf/ext/jsonld_transform_ext.rb +208 -0
  45. data/lib/glossarist/rdf/ext/mapping_ext.rb +37 -0
  46. data/lib/glossarist/rdf/ext/mapping_rule_ext.rb +27 -0
  47. data/lib/glossarist/rdf/ext/member_rule_ext.rb +34 -0
  48. data/lib/glossarist/rdf/ext/turtle_transform_ext.rb +222 -0
  49. data/lib/glossarist/rdf/ext.rb +39 -0
  50. data/lib/glossarist/rdf/gloss_citation.rb +36 -0
  51. data/lib/glossarist/rdf/gloss_concept.rb +58 -0
  52. data/lib/glossarist/rdf/gloss_concept_date.rb +24 -0
  53. data/lib/glossarist/rdf/gloss_concept_reference.rb +29 -0
  54. data/lib/glossarist/rdf/gloss_concept_source.rb +37 -0
  55. data/lib/glossarist/rdf/gloss_designation.rb +146 -0
  56. data/lib/glossarist/rdf/gloss_detailed_definition.rb +24 -0
  57. data/lib/glossarist/rdf/gloss_grammar_info.rb +57 -0
  58. data/lib/glossarist/rdf/gloss_locality.rb +25 -0
  59. data/lib/glossarist/rdf/gloss_localized_concept.rb +67 -0
  60. data/lib/glossarist/rdf/gloss_non_verbal_rep.rb +31 -0
  61. data/lib/glossarist/rdf/gloss_pronunciation.rb +32 -0
  62. data/lib/glossarist/rdf/gloss_reference.rb +55 -0
  63. data/lib/glossarist/rdf/namespaces/glossarist_namespace.rb +12 -0
  64. data/lib/glossarist/rdf/namespaces/iso_thes_namespace.rb +12 -0
  65. data/lib/glossarist/rdf/namespaces/owl_namespace.rb +12 -0
  66. data/lib/glossarist/rdf/namespaces/prov_namespace.rb +12 -0
  67. data/lib/glossarist/rdf/namespaces/rdf_namespace.rb +12 -0
  68. data/lib/glossarist/rdf/namespaces/skosxl_namespace.rb +12 -0
  69. data/lib/glossarist/rdf/namespaces.rb +8 -2
  70. data/lib/glossarist/rdf/relationships.rb +19 -0
  71. data/lib/glossarist/rdf/v3/configuration.rb +15 -0
  72. data/lib/glossarist/rdf/v3.rb +79 -0
  73. data/lib/glossarist/rdf.rb +22 -2
  74. data/lib/glossarist/reference_extractor.rb +15 -24
  75. data/lib/glossarist/reference_resolver.rb +3 -3
  76. data/lib/glossarist/related_concept.rb +2 -10
  77. data/lib/glossarist/schema_migration.rb +39 -0
  78. data/lib/glossarist/sts/term_mapper.rb +2 -2
  79. data/lib/glossarist/transforms/concept_to_gloss_transform.rb +355 -0
  80. data/lib/glossarist/transforms.rb +2 -2
  81. data/lib/glossarist/urn_resolver.rb +13 -1
  82. data/lib/glossarist/v1/concept.rb +18 -11
  83. data/lib/glossarist/v2/citation.rb +36 -0
  84. data/lib/glossarist/v2/concept_data.rb +46 -0
  85. data/lib/glossarist/v2/concept_document.rb +18 -0
  86. data/lib/glossarist/v2/concept_ref.rb +8 -0
  87. data/lib/glossarist/v2/concept_source.rb +16 -0
  88. data/lib/glossarist/v2/configuration.rb +13 -0
  89. data/lib/glossarist/v2/detailed_definition.rb +14 -0
  90. data/lib/glossarist/v2/localized_concept.rb +9 -0
  91. data/lib/glossarist/v2/managed_concept.rb +25 -0
  92. data/lib/glossarist/v2/managed_concept_data.rb +49 -0
  93. data/lib/glossarist/v2/related_concept.rb +15 -0
  94. data/lib/glossarist/v2.rb +28 -0
  95. data/lib/glossarist/v3/bibliography_entry.rb +19 -0
  96. data/lib/glossarist/v3/bibliography_file.rb +27 -0
  97. data/lib/glossarist/v3/citation.rb +30 -0
  98. data/lib/glossarist/v3/concept_data.rb +46 -0
  99. data/lib/glossarist/v3/concept_document.rb +18 -0
  100. data/lib/glossarist/v3/concept_ref.rb +8 -0
  101. data/lib/glossarist/v3/concept_source.rb +16 -0
  102. data/lib/glossarist/v3/configuration.rb +13 -0
  103. data/lib/glossarist/v3/detailed_definition.rb +14 -0
  104. data/lib/glossarist/v3/image_entry.rb +21 -0
  105. data/lib/glossarist/v3/image_file.rb +31 -0
  106. data/lib/glossarist/v3/localized_concept.rb +9 -0
  107. data/lib/glossarist/v3/managed_concept.rb +26 -0
  108. data/lib/glossarist/v3/managed_concept_data.rb +34 -0
  109. data/lib/glossarist/v3/related_concept.rb +15 -0
  110. data/lib/glossarist/v3.rb +36 -0
  111. data/lib/glossarist/validation/asset_index.rb +4 -3
  112. data/lib/glossarist/validation/bibliography_index.rb +61 -30
  113. data/lib/glossarist/validation/rules/asciidoc_xref_rule.rb +2 -15
  114. data/lib/glossarist/validation/rules/authoritative_source_rule.rb +2 -15
  115. data/lib/glossarist/validation/rules/base.rb +5 -0
  116. data/lib/glossarist/validation/rules/bibliography_yaml_rule.rb +2 -3
  117. data/lib/glossarist/validation/rules/citation_completeness_rule.rb +5 -27
  118. data/lib/glossarist/validation/rules/dataset_context.rb +8 -3
  119. data/lib/glossarist/validation/rules/date_validity_rule.rb +1 -1
  120. data/lib/glossarist/validation/rules/designation_status_rule.rb +0 -1
  121. data/lib/glossarist/validation/rules/designation_type_rule.rb +1 -5
  122. data/lib/glossarist/validation/rules/domain_ref_rule.rb +37 -0
  123. data/lib/glossarist/validation/rules/domain_target_rule.rb +56 -0
  124. data/lib/glossarist/validation/rules/gcr_context.rb +12 -13
  125. data/lib/glossarist/validation/rules/image_reference_rule.rb +2 -17
  126. data/lib/glossarist/validation/rules/locality_completeness_rule.rb +58 -0
  127. data/lib/glossarist/validation/rules/localization_consistency_rule.rb +72 -0
  128. data/lib/glossarist/validation/rules/localization_presence_rule.rb +1 -1
  129. data/lib/glossarist/validation/rules/model_validity_rule.rb +71 -0
  130. data/lib/glossarist/validation/rules/orphaned_bibliography_rule.rb +1 -13
  131. data/lib/glossarist/validation/rules/orphaned_images_rule.rb +16 -11
  132. data/lib/glossarist/validation/rules/ref_shape_rule.rb +68 -0
  133. data/lib/glossarist/validation/rules/related_concept_cycle_rule.rb +1 -3
  134. data/lib/glossarist/validation/rules/related_concept_symmetry_rule.rb +1 -3
  135. data/lib/glossarist/validation/rules/related_concept_target_rule.rb +64 -0
  136. data/lib/glossarist/validation/rules/schema_version_rule.rb +41 -0
  137. data/lib/glossarist/validation/rules/source_type_rule.rb +1 -15
  138. data/lib/glossarist/validation/rules/source_urn_format_rule.rb +65 -0
  139. data/lib/glossarist/validation/rules/uuid_format_rule.rb +33 -0
  140. data/lib/glossarist/validation/rules.rb +10 -43
  141. data/lib/glossarist/validation/validation_issue.rb +14 -11
  142. data/lib/glossarist/validation_result.rb +12 -22
  143. data/lib/glossarist/version.rb +1 -1
  144. data/lib/glossarist.rb +10 -0
  145. data/memory/project-status.md +43 -0
  146. data/scripts/migrate_dataset.rb +180 -0
  147. data/scripts/migrate_isotc204_to_v3.rb +134 -0
  148. data/scripts/migrate_isotc211_to_v3.rb +153 -0
  149. data/scripts/migrate_osgeo_to_v3.rb +155 -0
  150. data/scripts/upgrade_dataset_to_v3.rb +47 -0
  151. metadata +112 -6
  152. data/TODO.integration/01-gcr-package-cli.md +0 -180
  153. data/lib/glossarist/rdf/skos_concept.rb +0 -43
  154. data/lib/glossarist/rdf/skos_vocabulary.rb +0 -25
  155. data/lib/glossarist/transforms/concept_to_skos_transform.rb +0 -131
data/README.adoc CHANGED
@@ -101,7 +101,7 @@ related:: Array of <<related-concept,RelatedConcept>>
101
101
  status:: Enum for the normative status of the term.
102
102
  dates:: Array of <<concept-date,ConceptDate>>
103
103
  localized_concepts:: Hash of all localizations where keys are language codes and values are the UUIDs of the localized concepts.
104
- groups:: Array of groups in string format
104
+ domains:: Array of <<concept-reference,ConceptReference>> upper concepts (subject areas, concept schemes, organizing concepts) that this concept belongs to across all languages. Each domain is a typed reference (e.g. `{ concept_id: "103", ref_type: "domain" }`).
105
105
  localizations:: Hash of all localizations for this concept where keys are language codes and values are instances of <<localized-concept,LocalizedConcept>>.
106
106
 
107
107
  There are two ways to initialize and populate a managed concept
@@ -118,9 +118,8 @@ concept = Glossarist::ManagedConcept.new({
118
118
  "eng" => "<uuid>"
119
119
  },
120
120
  "localizations" => <Array of localized concepts or localized concept hashes>,
121
- "groups" => [
122
- "foo",
123
- "bar",
121
+ "domains" => [
122
+ { "concept_id" => "103", "ref_type" => "domain" },
124
123
  ],
125
124
  },
126
125
  })
@@ -132,7 +131,9 @@ concept = Glossarist::ManagedConcept.new({
132
131
  ----
133
132
  concept = Glossarist::ManagedConcept.new
134
133
  concept.id = "123"
135
- concept.groups = ["foo", "bar"]
134
+ concept.data.domains = [
135
+ Glossarist::ConceptReference.new(concept_id: "103", ref_type: "domain"),
136
+ ]
136
137
  concept.localizations = <Array of localized concepts or localized concept hashes>
137
138
  ----
138
139
 
@@ -146,55 +147,252 @@ Localized concept has the following fields
146
147
  id:: An optional identifier for the term, to be used in cross-references.
147
148
  uuid:: UUID for the concept
148
149
  designations:: Array of <<designation,Designations>> under which the term being defined is known. This method will also accept an array of hashes for designation and will convert them to their respective classes.
149
- domain:: An optional semantic domain for the term being defined, in case the term is ambiguous between several semantic domains.
150
+ domain:: URI reference to the subject area or section concept. Can be a relative URI (e.g. `section-103-01`), a URN (e.g. `urn:iec:std:iec:60050-103-01`), or a URL (e.g. `https://www.electropedia.org/iev/iev.nsf/display?openform&ievref=103-01`). This is the per-language upper concept reference — the subject area for this specific localization. Different languages may assign the same abstract concept to different domains.
151
+ related:: Array of <<related-concept,RelatedConcept>> holding per-language concept relationships. Concept hierarchies can differ across languages (e.g. Russian distinguishes голубой and синий as coordinate basic colors, while English unifies them under "blue"). Language-specific broader/narrower/equivalent relationships go here.
150
152
  subject:: Subject of the term.
151
153
  definition:: Array of <<detailed-definition,Detailed Definition>> of the term.
152
154
  non_verb_rep:: Array of <<non-verbal,non-verbal>> representations used to help define the term.
153
155
  notes:: Zero or more notes about the term. A note is in <<detailed-definition,Detailed Definition>> format.
154
156
  examples:: Zero or more examples of how the term is to be used in <<detailed-definition,Detailed Definition>> format.
155
157
  language_code:: The language of the localization, as an ISO-639 3-letter code.
158
+ script:: The script of the localization, as an ISO 15924 4-letter code (e.g. `Hans` for Simplified Chinese, `Latn` for Latin, `Cyrl` for Cyrillic). Optional — when omitted, the default script for the language is assumed.
159
+ system:: The ISO 24229 conversion system code used to produce this localization (e.g. `Var:jpn-Hrkt:Latn:Hepburn-1886` for Hepburn-romanized Japanese). Optional — only set when the localization is a romanization or transliteration.
156
160
  entry_status:: Entry status of the concept. Must be one of the following: +notValid+, +valid+, +superseded+, +retired+.
157
161
  classification:: Classification of the concept. Must be one of the following: +preferred+, +admitted+, +deprecated+.
158
162
 
159
163
  [[id,designation]]
160
- === Designation::Base
164
+ === Designation
165
+
166
+ A name under which a managed term is known. Designations follow an
167
+ inheritance hierarchy based on ISO 10241-1 and the Metanorma concept model.
168
+
169
+ ==== Designation::Base (common to all types)
170
+
171
+ designation:: String — the term text or symbol.
172
+ normative_status:: Enum — one of `preferred`, `admitted`, `deprecated`, `superseded`.
173
+ geographical_area:: String — geographic usage region (ISO 3166-1 country code).
174
+ language:: String — language of this designation (ISO 639 code). Usually inherited from the LocalizedConcept's `language_code`, but can differ for borrowed terms.
175
+ script:: String — script of the designation text (ISO 15924 code, e.g. `Hani` for Kanji, `Latn` for Latin, `Cyrl` for Cyrillic).
176
+ system:: String — ISO 24229 conversion system code used to produce this designation (e.g. `Var:jpn-Hrkt:Latn:Hepburn-1886` for Hepburn romanization). Optional — only set when the designation is a romanization or transliteration.
177
+ international:: Boolean — whether the designation is used internationally.
178
+ absent:: Boolean — whether the designation is intentionally absent in this language.
179
+ pronunciation:: Collection of `Pronunciation` entries — phonetic or romanized representations of the designation.
180
+ sources:: Collection of `ConceptSource` entries — bibliographic sources for this designation (ISO 10241-1 §6.8).
181
+ term_type:: Enum (ISO 12620) — optional classification of the designation's term type. See <<iso12620-term-types>> below.
182
+ related:: Collection of `RelatedConcept` entries — term-level (designation-to-designation) relationships within the same concept entry. Used for linking abbreviated forms to full forms, short forms to expanded forms, etc. (TBX xref types).
183
+ +
184
+ Each `Pronunciation` entry has:
185
+ +
186
+ [cols="1,1,2"]
187
+ |===
188
+ |Attribute |Standard |Description
189
+
190
+ |`content` |— |The pronunciation text
191
+ |`language` |ISO 639 |Language/dialect being pronounced (3-letter code)
192
+ |`script` |ISO 15924 |Script of the pronunciation text (4-letter code)
193
+ |`country` |ISO 3166-1 |Country variant (2-letter code, optional)
194
+ |`system` |ISO 24229 |Conversion system code or identifier (e.g. `IPA`, `Var:jpn-Hrkt:Latn:Hepburn-1886`)
195
+ |===
196
+ +
197
+ Example:
198
+ +
199
+ [,yaml]
200
+ ----
201
+ pronunciation:
202
+ - content: "toːkjoː"
203
+ language: jpn
204
+ script: Latn
205
+ system: IPA
206
+ - content: "Tōkyō"
207
+ language: jpn
208
+ script: Latn
209
+ system: "Var:jpn-Hrkt:Latn:Hepburn-1886"
210
+ ----
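The attribute table above gives each `Pronunciation` field a code length. A minimal sketch of checking those shapes on a YAML entry, using only Ruby's standard library. The `PRONUNCIATION_SHAPES` patterns (including the letter case) and the `shape_errors` helper are assumptions made for illustration and are not part of glossarist; the second entry deliberately uses a too-short language code.

```ruby
require "yaml"

# Illustrative shape checks only; glossarist's own validation may differ.
PRONUNCIATION_SHAPES = {
  "language" => /\A[a-z]{3}\z/,      # ISO 639 3-letter code
  "script"   => /\A[A-Z][a-z]{3}\z/, # ISO 15924 4-letter code
  "country"  => /\A[A-Z]{2}\z/,      # ISO 3166-1 2-letter code (optional)
}.freeze

# Returns a list of human-readable problems for one pronunciation entry.
def shape_errors(entry)
  errors = []
  errors << "content is missing" if entry["content"].to_s.empty?
  PRONUNCIATION_SHAPES.each do |field, pattern|
    value = entry[field]
    next if value.nil? # absent fields are not checked in this sketch
    unless value.to_s.match?(pattern)
      errors << "#{field} #{value.inspect} has an unexpected shape"
    end
  end
  errors
end

doc = YAML.safe_load(<<~YAML)
  pronunciation:
    - content: "toːkjoː"
      language: jpn
      script: Latn
      system: IPA
    - content: "Tōkyō"
      language: jp   # deliberately malformed for the demonstration
      script: Latn
      system: "Var:jpn-Hrkt:Latn:Hepburn-1886"
YAML

doc["pronunciation"].each do |entry|
  puts "#{entry['content']}: #{shape_errors(entry).inspect}"
end
```

The `system` field is left unchecked because the README allows both short identifiers such as `IPA` and full ISO 24229 codes.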
211
+
212
+ ==== Designation::Expression (text-based, inherits Base)
213
+
214
+ prefix:: String — text before the designation.
215
+ usage_info:: String — disambiguation text for the designation.
216
+ field_of_application:: String — IEC "specific use", appears in angle brackets after the designation (e.g. "in communication theory").
217
+ grammar_info:: Array of GrammarInfo — gender, number, part of speech.
218
+
219
+ ==== Designation::Abbreviation (inherits Expression)
220
+
221
+ acronym:: Boolean — is this an acronym?
222
+ initialism:: Boolean — is this an initialism?
223
+ truncation:: Boolean — is this a truncation?
224
+
225
+ ==== Designation::Symbol (inherits Base)
226
+
227
+ No additional attributes beyond Base.
161
228
 
162
- A name under which a managed term is known.
229
+ ==== Designation::LetterSymbol (inherits Symbol)
163
230
 
164
- Methods::
165
- `from_h(options)`::: Creates a new designation instance based on the specified type.
231
+ text:: String — the letter symbol text.
232
+
233
+ ==== Designation::GraphicalSymbol (inherits Symbol)
234
+
235
+ text:: String — description of the symbol.
236
+ image:: String — the graphical symbol (emoji, path, or data URL).
237
+
238
+ ==== Factory Method
239
+
240
+ `Designation::Base.from_h(options)` creates a new designation instance based on the specified type.
166
241
 
167
242
  Parameters::
168
243
  * options (Hash) - The options for creating the designation.
169
- * "type" (String) - The type of designation (expression, symbol, abbreviation, graphical_symbol, letter_symbol). Note: type key should be string and not a symbol so { type: "expression" } will not work.
244
+ * "type" (String) - The type of designation (`expression`, `symbol`, `abbreviation`, `graphical_symbol`, `letter_symbol`). Note: the `"type"` key must be a string, not a symbol, so `{ type: "expression" }` will not work.
170
245
  * Additional options depend on the specific designation type.
171
246
 
172
247
  Returns::
173
- Designation::{type}::: A new instance of specified type. e.g `Glossarist::Designation::Base.from_h("type" => "expression")` will return `Glossarist::Designation::Expression`
248
+ Designation::{type}::: A new instance of the specified type.
174
249
 
175
250
  Example
176
251
  [,ruby]
177
252
  ----
178
- # Example usage of Designation::Base class
253
+ # Expression with field of application
254
+ expr = Designation::Base.from_h({
255
+ "type" => "expression",
256
+ "designation" => "information",
257
+ "normative_status" => "preferred",
258
+ "field_of_application" => "in communication theory",
259
+ })
260
+
261
+ # International abbreviation
262
+ abbr = Designation::Base.from_h({
263
+ "type" => "abbreviation",
264
+ "designation" => "ISO",
265
+ "international" => true,
266
+ "acronym" => true,
267
+ })
268
+ ----
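The note about string keys can be illustrated with a self-contained sketch of a type-keyed factory. Everything in the `SketchDesignation` module is a hypothetical stand-in that mirrors the class names in the README; it is not the glossarist implementation, which may dispatch differently.

```ruby
# Hypothetical stand-in classes; not the glossarist implementation.
module SketchDesignation
  class Base
    attr_reader :attributes

    def initialize(attributes = {})
      @attributes = attributes
    end

    # Picks the concrete class from the string "type" key only.
    def self.from_h(options)
      type = options["type"]
      klass = TYPES.fetch(type) do
        raise ArgumentError, "unknown designation type: #{type.inspect}"
      end
      klass.new(options.reject { |key, _| key == "type" })
    end
  end

  class Expression < Base; end
  class Abbreviation < Expression; end
  class Symbol < Base; end
  class LetterSymbol < Symbol; end
  class GraphicalSymbol < Symbol; end

  # Lookup table from the "type" string to the class to instantiate.
  TYPES = {
    "expression" => Expression,
    "abbreviation" => Abbreviation,
    "symbol" => Symbol,
    "letter_symbol" => LetterSymbol,
    "graphical_symbol" => GraphicalSymbol,
  }.freeze
end

expr = SketchDesignation::Base.from_h(
  "type" => "expression",
  "designation" => "information",
)
puts expr.class

begin
  # A symbol key is never read, so the lookup sees a nil type.
  SketchDesignation::Base.from_h(type: "expression")
rescue ArgumentError => e
  puts e.message
end
```

In this sketch the symbol-keyed call fails loudly; whether the real `from_h` raises or returns something else for a missing `"type"` is not stated in the README.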
269
+
270
+ [[iso12620-term-types]]
271
+ ==== ISO 12620 Term Types
272
+
273
+ The `term_type` attribute on `Designation::Base` classifies designations
274
+ according to ISO 12620 (also used as TBX `termType`). This is orthogonal to
275
+ the structural designation `type` (expression/abbreviation/symbol): the
276
+ structural type determines how the designation is serialized, while
277
+ `term_type` provides ISO 12620 semantic classification.
278
+
279
+ [cols="1,2"]
280
+ |===
281
+ |Term type |Description
282
+
283
+ |`abbreviation`
284
+ |A shortened form of a word or phrase (general category)
285
+
286
+ |`acronym`
287
+ |An abbreviation pronounced as a word (e.g. NATO, laser)
179
288
 
180
- attributes_for_expression = { designation: "foobar", geographical_area: "abc", normative_status: "status" }
181
- designation_expression = Designation::Base.from_h({ "type" => "expression" }.merge(attributes_for_expression))
289
+ |`clipped_term`
290
+ |A term formed by clipping part of a longer term (e.g. "phone" from "telephone")
182
291
 
183
- attributes_for_abbreviation = { designation: "foobar", geographical_area: "abc", normative_status: "status", international: true }
184
- designation_abbreviation = Designation::Base.from_h({ "type" => "abbreviation" }.merge(attributes_for_abbreviation))
292
+ |`common_name`
293
+ |A name in common use for a concept (e.g. "water" vs H₂O)
185
294
 
295
+ |`entry_term`
296
+ |The headword or main term in a terminological entry
297
+
298
+ |`equation`
299
+ |A mathematical equation used as a designation
300
+
301
+ |`formula`
302
+ |A chemical or mathematical formula (e.g. H₂O, E=mc²)
303
+
304
+ |`full_form`
305
+ |The complete, unabbreviated form of a designation (e.g. "World Wide Web")
306
+
307
+ |`initialism`
308
+ |An abbreviation pronounced letter by letter (e.g. "URL", "FBI")
309
+
310
+ |`internationalism`
311
+ |A term used with the same meaning across many languages (e.g. "computer", "algorithm")
312
+
313
+ |`international_scientific_term`
314
+ |A term established by international scientific agreement (e.g. "hydrogen")
315
+
316
+ |`logical_expression`
317
+ |A logical or Boolean expression used as a designation
318
+
319
+ |`part_number`
320
+ |A part number or catalog identifier used as a designation
321
+
322
+ |`phraseological_unit`
323
+ |A multi-word expression or phrase functioning as a term (e.g. "software engineering")
324
+
325
+ |`transcribed_form`
326
+ |A designation produced by phonetic transcription from another script
327
+
328
+ |`transliterated_form`
329
+ |A designation produced by transliteration from another script (e.g. "Moskva" from "Москва")
330
+
331
+ |`short_form`
332
+ |A shortened form of a designation that is not an abbreviation (e.g. "US" for "United States")
333
+
334
+ |`shortcut`
335
+ |A keyboard shortcut or command sequence (e.g. "Ctrl+V" for paste)
336
+
337
+ |`sku`
338
+ |A stock keeping unit identifier
339
+
340
+ |`standard_text`
341
+ |A standardized text passage used as a designation
342
+
343
+ |`symbol`
344
+ |A non-verbal symbol representing the concept (e.g. Ω for ohm)
345
+
346
+ |`synonym`
347
+ |A term with the same meaning in the same language, used as an alternative designation
348
+
349
+ |`synonymous_phrase`
350
+ |A phrase that is synonymous with the preferred designation
351
+
352
+ |`variant`
353
+ |A spelling, regional, or stylistic variant of another designation
354
+ |===
355
+
356
+ ==== Designation-Level Relationships (TBX xref)
357
+
358
+ Designations can have intra-entry relationships — links between
359
+ designations of the *same* concept. These correspond to TBX `xref`
360
+ elements on term information groups (`<tig>`).
361
+
362
+ [cols="1,2"]
363
+ |===
364
+ |Relationship type |Description
365
+
366
+ |`abbreviated_form_for`
367
+ |This designation is an abbreviated form of the target (e.g. "WWW" → "World Wide Web")
368
+
369
+ |`short_form_for`
370
+ |This designation is a short form of the target (e.g. "US" → "United States of America")
371
+ |===
372
+
373
+ Example:
374
+ [,yaml]
375
+ ----
376
+ terms:
377
+ - designation: WWW
378
+ type: abbreviation
379
+ term_type: acronym
380
+ related:
381
+ - type: abbreviated_form_for
382
+ content: "World Wide Web"
383
+ - designation: World Wide Web
384
+ type: expression
385
+ term_type: full_form
186
386
  ----
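Because these relationships link designations of the same concept, one possible consistency check is that every `related` entry points at a designation present in the same `terms` list. A minimal sketch under that assumption; `dangling_designation_links` is a hypothetical helper, not a glossarist validation rule, and it assumes `content` holds the target designation text as in the example above.

```ruby
require "yaml"

# Hypothetical helper: lists [designation, link type, target] triples whose
# target text matches no designation in the same entry.
def dangling_designation_links(terms)
  known = terms.map { |term| term["designation"] }
  terms.flat_map do |term|
    Array(term["related"])
      .reject { |link| known.include?(link["content"]) }
      .map { |link| [term["designation"], link["type"], link["content"]] }
  end
end

entry = YAML.safe_load(<<~YAML)
  terms:
    - designation: WWW
      type: abbreviation
      term_type: acronym
      related:
        - type: abbreviated_form_for
          content: "World Wide Web"
    - designation: World Wide Web
      type: expression
      term_type: full_form
YAML

p dangling_designation_links(entry["terms"])
```

An empty result means every link resolved within the entry.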
187
387
 
188
388
  [[id,related-concept]]
189
389
  === RelatedConcept
190
390
 
191
- A term related to the current term.
192
-
193
- Following fields are available for the Related Concept
391
+ A concept related to the current concept with a typed relationship.
194
392
 
195
- type:: An enum to denote the relation of the term to the current term.
196
- content:: The designation of the related term.
197
- ref:: A <<citation, citation>> of the related term, in a Termbase.
393
+ type:: Enum giving the relationship type (see <<relationship-types,Relationship Types>> below).
394
+ content:: Free-text string describing the related concept.
395
+ ref:: A <<concept-ref,ConceptRef>> reference to the related concept (`source + id`, no version or locality).
198
396
 
199
397
  There are two ways to initialize and populate a related concept
200
398
 
@@ -205,7 +403,7 @@ There are two ways to initialize and populate a related concept
205
403
  related_concept = Glossarist::RelatedConcept.new({
206
404
  content: "Test content",
207
405
  type: :supersedes,
208
- ref: <concept citation>
406
+ ref: Glossarist::ConceptRef.new(source: "IEC", id: "102-03-02"),
209
407
  })
210
408
  ----
211
409
 
@@ -216,7 +414,209 @@ related_concept = Glossarist::RelatedConcept.new({
216
414
  related_concept = Glossarist::RelatedConcept.new
217
415
  related_concept.type = "supersedes"
218
416
  related_concept.content = "designation of the related concept"
219
- related_concept.ref = <Citation object>
417
+ related_concept.ref = Glossarist::ConceptRef.new(source: "IEC", id: "102-03-02")
418
+ ----
419
+
420
+ [[relationship-types]]
421
+ ==== Relationship Types
422
+
423
+ Relationship types are drawn from ISO 10241-1, ISO 25964/SKOS, and ISO 12620/TBX. The table below shows each type with its provenance and cross-standard equivalents.
424
+
425
+ [cols="1,1,1,1,1"]
426
+ |===
427
+ |Glossarist type |Category |ISO 10241-1 |ISO 25964 / SKOS |ISO 12620 / TBX
428
+
429
+ |`deprecates`
430
+ |Lifecycle
431
+ |deprecates
432
+ |—
433
+ |—
434
+
435
+ |`supersedes`
436
+ |Lifecycle
437
+ |supersedes
438
+ |—
439
+ |—
440
+
441
+ |`superseded_by`
442
+ |Lifecycle
443
+ |superseded by
444
+ |—
445
+ |—
446
+
447
+ |`broader`
448
+ |Hierarchical
449
+ |broader concept
450
+ |BT (broaderTerm)
451
+ |broaderTerm
452
+
453
+ |`narrower`
454
+ |Hierarchical
455
+ |narrower concept
456
+ |NT (narrowerTerm)
457
+ |narrowerTerm
458
+
459
+ |`broader_generic`
460
+ |Hierarchical (generic)
461
+ |—
462
+ |BTG (broaderGeneric, is-a)
463
+ |broaderTermGeneric
464
+
465
+ |`narrower_generic`
466
+ |Hierarchical (generic)
467
+ |—
468
+ |NTG (narrowerGeneric)
469
+ |narrowerTermGeneric
470
+
471
+ |`broader_partitive`
472
+ |Hierarchical (partitive)
473
+ |—
474
+ |BTP (broaderPartitive, part-whole)
475
+ |broaderTermPartitive
476
+
477
+ |`narrower_partitive`
478
+ |Hierarchical (partitive)
479
+ |—
480
+ |NTP (narrowerPartitive)
481
+ |narrowerTermPartitive
482
+
483
+ |`broader_instantial`
484
+ |Hierarchical (instantial)
485
+ |—
486
+ |BTI (broaderInstantial, instance-of)
487
+ |broaderTermInstantial
488
+
489
+ |`narrower_instantial`
490
+ |Hierarchical (instantial)
491
+ |—
492
+ |NTI (narrowerInstantial)
493
+ |narrowerTermInstantial
494
+
495
+ |`equivalent`
496
+ |Equivalence
497
+ |equivalent
498
+ |exactMatch
499
+ |—
500
+
501
+ |`close_match`
502
+ |Approx. equiv.
503
+ |—
504
+ |closeMatch
505
+ |—
506
+
507
+ |`broad_match`
508
+ |Cross-vocab mapping
509
+ |—
510
+ |broadMatch
511
+ |—
512
+
513
+ |`narrow_match`
514
+ |Cross-vocab mapping
515
+ |—
516
+ |narrowMatch
517
+ |—
518
+
519
+ |`related_match`
520
+ |Cross-vocab mapping
521
+ |—
522
+ |relatedMatch
523
+ |—
524
+
525
+ |`compare`
526
+ |Comparative
527
+ |compare
528
+ |—
529
+ |—
530
+
531
+ |`contrast`
532
+ |Comparative
533
+ |contrast
534
+ |—
535
+ |—
536
+
537
+ |`see`
538
+ |Associative
539
+ |see also
540
+ |RT (relatedTerm)
541
+ |crossReference
542
+
543
+ |`related_concept`
544
+ |Associative
545
+ |—
546
+ |—
547
+ |relatedConcept
548
+
549
+ |`related_concept_broader`
550
+ |Associative (broader)
551
+ |—
552
+ |—
553
+ |relatedConceptBroader
554
+
555
+ |`related_concept_narrower`
556
+ |Associative (narrower)
557
+ |—
558
+ |—
559
+ |relatedConceptNarrower
560
+
561
+ |`sequentially_related_concept`
562
+ |Associative (sequential)
563
+ |—
564
+ |—
565
+ |sequentiallyRelatedConcept
566
+
567
+ |`spatially_related_concept`
568
+ |Associative (spatial)
569
+ |—
570
+ |—
571
+ |spatiallyRelatedConcept
572
+
573
+ |`temporally_related_concept`
574
+ |Associative (temporal)
575
+ |—
576
+ |—
577
+ |temporallyRelatedConcept
578
+
579
+ |`homograph`
580
+ |Lexical
581
+ |—
582
+ |—
583
+ |homograph
584
+
585
+ |`false_friend`
586
+ |Lexical
587
+ |—
588
+ |—
589
+ |falseFriend
590
+ |===
591
+
592
+ [[id,concept-reference]]
593
+ === ConceptReference
594
+
595
+ A typed reference to another concept, either local (within the same glossary) or external (in another concept registry).
596
+
597
+ term:: String — the display text for the referenced concept.
598
+ concept_id:: String — the identifier of the target concept.
599
+ source:: String — the registry URI prefix for external references (e.g. `urn:iec:std:iec:60050`).
600
+ ref_type:: String — the reference type: `local`, `designation`, or `urn`.
601
+ urn:: String — a direct URN for the target concept (e.g. `urn:iec:std:iec:60050-102-01-01`).
602
+
603
+ Local references use `concept_id` without `source`. External references use `source` + `concept_id` or a direct `urn`.
604
+
605
+ [,ruby]
606
+ ----
607
+ # Local reference
608
+ ref = Glossarist::ConceptReference.new(term: "latitude", concept_id: "200", ref_type: "local")
609
+
610
+ # External reference via URN
611
+ ref = Glossarist::ConceptReference.new(
612
+ term: "equality",
613
+ concept_id: "102-01-01",
614
+ source: "urn:iec:std:iec:60050",
615
+ ref_type: "urn",
616
+ )
617
+
618
+ ref.local? # => false
619
+ ref.external? # => true
220
620
  ----
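The local/external rule stated above can be written out as two predicates. This is a sketch with a plain `Struct`; `SketchReference` is a hypothetical stand-in, and the real `Glossarist::ConceptReference#local?` and `#external?` may apply different conditions.

```ruby
# Hypothetical stand-in for illustration; not Glossarist::ConceptReference.
SketchReference = Struct.new(:term, :concept_id, :source, :ref_type, :urn,
                             keyword_init: true) do
  # Local: identified by concept_id alone, with no source and no URN.
  def local?
    !concept_id.nil? && source.nil? && urn.nil?
  end

  # External: a registry source plus concept_id, or a direct URN.
  def external?
    !urn.nil? || (!source.nil? && !concept_id.nil?)
  end
end

local = SketchReference.new(term: "latitude", concept_id: "200", ref_type: "local")
remote = SketchReference.new(
  term: "equality",
  concept_id: "102-01-01",
  source: "urn:iec:std:iec:60050",
  ref_type: "urn",
)

puts local.local?
puts remote.external?
```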
221
621
 
222
622
  [[id,concept-date]]
@@ -258,7 +658,7 @@ A definition of the managed term.
258
658
  It has the following attributes:
259
659
 
260
660
  content:: The text of the definition of the managed term.
261
- sources:: List of Bibliographic references(<<citation,Citation>>) for this particular definition of the managed term.
661
+ sources:: List of <<concept-source,bibliographic sources>> for this particular definition of the managed term.
262
662
 
263
663
  There are two ways to initialize and populate a detailed definition
264
664
 
@@ -284,52 +684,119 @@ detailed_definition.sources = [<list of citations>]
284
684
  [[id,citation]]
285
685
  === Citation
286
686
 
287
- Citation can be either structured or unstructured. A citation is structured if its reference contains one or all of the following keys `{ id: "id", source: "source", version: "version"}` and is unstructured if its reference is plain text. This also has 2 methods `structured?` and `plain?` to check if citation is structured or not.
687
+ A bibliographic citation used in <<concept-source,ConceptSource>> entries. The
688
+ `ref` field is always a structured hash (`Citation::Ref`), never a plain string.
288
689
 
289
690
  Citation has the following attributes.
290
691
 
291
- ref:: A hash or string based on type of citation. Hash if citation is structured or string if citation is plain.
292
- clause:: Referred clause of the document.
293
- link:: Link to document.
692
+ ref:: A `Citation::Ref` object (always a hash). See below.
693
+ locality:: A `Locality` with `type`, `reference_from`, and `reference_to`.
694
+ link:: URL link to the source document.
695
+ original:: Pre-parsed original citation text.
696
+ custom_locality:: Array of `{ name:, value: }` pairs for non-standard locality types.
294
697
 
295
- There are two ways to initialize and populate a Citation
698
+ `Citation::Ref` inner class:
699
+
700
+ source:: String — the document series or termbase (e.g. `"ISO"`, `"IEC"`).
701
+ id:: String — the document identifier within the series (e.g. `"1087-1:2000"`).
702
+ version:: String — optional version of the document.
703
+
704
+ `Citation#label` returns a display string combining source and id.
296
705
 
297
- 1. Setting the fields by using a hash while initializing
298
- +
299
706
  [,ruby]
300
707
  ----
301
- # Unstructured Citation
302
- citation = Glossarist::Citation.new({
303
- ref: "plain text reference",
304
- clause: "clause",
305
- link: "link",
306
- })
708
+ # Create a Citation
709
+ citation = Glossarist::Citation.new(
710
+   ref: Glossarist::Citation::Ref.new(source: "ISO", id: "9000"),
711
+   locality: Glossarist::Locality.new(type: "clause", reference_from: "3.3.3"),
712
+   link: "https://www.iso.org/standard/45481.html",
713
+ )
307
714
 
308
- # Structured Citation
309
- citation = Glossarist::Citation.new({
310
-   ref: { id: "123", source: "source", version: "1.1" },
311
-   clause: "clause",
312
-   link: "link",
313
- })
715
+ citation.label # => "ISO 9000"
314
716
  ----
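As a rough illustration of the `label` behaviour described above, the standalone sketch below joins `source` and `id` using plain Ruby structs. `SketchRef` and `SketchCitation` are hypothetical stand-ins, not Glossarist classes, and the gem's actual implementation may differ (for example in how it handles a missing `source`).

```ruby
# Hypothetical stand-ins for illustration only; these are not the Glossarist classes.
SketchRef = Struct.new(:source, :id, :version, keyword_init: true)

SketchCitation = Struct.new(:ref, :link, keyword_init: true) do
  # Join whichever of source and id are present with a single space.
  def label
    [ref&.source, ref&.id].compact.join(" ")
  end
end

sketch = SketchCitation.new(ref: SketchRef.new(source: "ISO", id: "9000"))
puts sketch.label
```

Keeping `version` out of the label mirrors the README example, where only the source and id appear in the display string.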
315
717
 
316
- 2. Setting the fields after creating an object
317
- +
718
+ YAML serialization (always this shape):
719
+
720
+ [,yaml]
721
+ ----
722
+ origin:
723
+   ref:
724
+     source: ISO
725
+     id: "1087-1:2000"
726
+   locality:
727
+     type: clause
728
+     reference_from: "3.2.9"
729
+   link: https://www.iso.org/standard/20057.html
730
+ ----
731
+
732
+ === ConceptRef
733
+
734
+ A concept reference used in <<related-concept,RelatedConcept>> entries. Has
735
+ `source` and `id` only — no version, locality, or link.
736
+
737
+ [,ruby]
738
+ ----
739
+ ref = Glossarist::ConceptRef.new(source: "IEC", id: "102-01-01")
740
+ ----
741
+
742
+ === V2 and V3 Namespaces
743
+
744
+ Glossarist provides versioned model namespaces for forward and backward
745
+ compatibility:
746
+
747
+ * `Glossarist::V2::*` — loads V2 datasets where `ref` may be a plain string.
748
+ Maps string refs to a `text` attribute internally.
749
+ * `Glossarist::V3::*` — canonical V3 models with structured `Citation::Ref`
750
+ hashes, `BibliographyEntry`, `BibliographyFile`, `ImageEntry`, `ImageFile`.
751
+
752
+ The `ConceptDocument.for_version("2"|"3")` method dispatches to the correct
753
+ namespace. `ConceptCollector.detect_schema_version(dir)` auto-detects from the
754
+ data by checking `schema_version` in the first concept file.
755
+
756
+ === Concept Comparison
757
+
758
+ Compare two concept datasets to identify additions, removals, and changes:
759
+
760
+ [,bash]
761
+ ----
762
+ glossarist compare path/to/v1 path/to/v2
763
+ ----
764
+
765
+ Ruby API:
766
+
318
767
  [,ruby]
319
768
  ----
320
- citation = Glossarist::Citation.new
321
- citation.ref = <plain or structured ref>
322
- citation.clause = "some clause"
769
+ comparator = Glossarist::ConceptComparator.new
770
+ result = comparator.compare(old_concepts, new_concepts)
771
+ result.matched_count # concepts present in both
772
+ result.new_only_count # concepts only in new set
773
+ result.old_only_count # concepts only in old set
774
+ result.diffs # Array of ConceptDiff with similarity scores
323
775
  ----
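To make the three counts concrete, here is a minimal standalone sketch of the underlying set logic. It assumes concepts are matched by an `id` key and are represented as plain hashes. `sketch_compare` is a hypothetical helper, and the real `ConceptComparator` may match concepts differently and additionally computes similarity scores.

```ruby
# Hypothetical sketch: concepts are plain hashes matched on their "id" key.
def sketch_compare(old_concepts, new_concepts)
  old_by_id = old_concepts.to_h { |c| [c["id"], c] }
  new_by_id = new_concepts.to_h { |c| [c["id"], c] }

  matched_ids = old_by_id.keys & new_by_id.keys
  {
    matched_count: matched_ids.size,
    old_only_count: (old_by_id.keys - new_by_id.keys).size,
    new_only_count: (new_by_id.keys - old_by_id.keys).size,
    # ids present in both sets whose content differs
    changed_ids: matched_ids.reject { |id| old_by_id[id] == new_by_id[id] },
  }
end

old_set = [{ "id" => "1", "term" => "entity" }, { "id" => "2", "term" => "object" }]
new_set = [{ "id" => "2", "term" => "item" }, { "id" => "3", "term" => "property" }]

summary = sketch_compare(old_set, new_set)
p summary
```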
324
776
 
325
777
  === NonVerbRep
326
778
 
327
- Non-verbal Representation have the following fields
779
+ Non-verbal representations are associated resources (images, tables, formulas) used to help define a concept (ISO 10241-1 §6.5). They live outside the concept model and are referenced by URI. Resources can be shared across concepts; each one either belongs to the dataset package (relative path) or is externally referenced (URL/URN).
328
780
 
329
- image:: An image used to help define a term.
330
- table:: A table used to help define a term.
331
- formula:: A formula used to help define a term.
332
- sources:: Bibliographic <<concept-source,concept source>> for the non-verbal representation of the term.
781
+ type:: String. The type of representation: `image`, `table`, or `formula`.
782
+ ref:: String. URI reference to the resource (relative path within the GCR package, URN, or URL).
783
+ text:: String. Optional text description or alt text.
784
+ sources:: Collection of <<concept-source,ConceptSource>> entries giving the bibliographic sources for the representation.
785
+
786
+ Example:
787
+ +
788
+ [,yaml]
789
+ ----
790
+ non_verbal_rep:
791
+   - type: image
792
+     ref: assets/images/figure-1.svg
793
+     text: Diagram showing the concept hierarchy
794
+   - type: formula
795
+     ref: urn:gcr:assets:formula-eq1
796
+     sources:
797
+       - type: authoritative
798
+         status: identical
799
+ ----
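The three kinds of `ref` mentioned above (relative path, URN, URL) can be told apart by their URI scheme. The sketch below is one possible way to do that. `sketch_ref_kind` is a hypothetical helper and not part of Glossarist, and treating every non-`urn` scheme as a URL is a simplifying assumption.

```ruby
# Hypothetical helper: classify a non-verbal representation ref by its URI scheme.
def sketch_ref_kind(ref)
  # A URI scheme is a letter followed by letters, digits, "+", "." or "-", then ":".
  scheme = ref[/\A([a-z][a-z0-9+.\-]*):/i, 1]

  if scheme.nil?
    :relative_path # no scheme, so resolved inside the dataset package
  elsif scheme.casecmp?("urn")
    :urn
  else
    :url # assumption: any other scheme is an external URL
  end
end

refs = [
  "assets/images/figure-1.svg",
  "urn:gcr:assets:formula-eq1",
  "https://www.iso.org/standard/20057.html",
]
kinds = refs.map { |r| sketch_ref_kind(r) }
p kinds
```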
333
800
 
334
801
  [[id,concept-source]]
335
802
  === ConceptSource
@@ -338,7 +805,7 @@ Concept Source has the following fields
338
805
 
339
806
  status:: The status of the managed term in the present context, relative to the term as found in the bibliographic source.
340
807
  type:: The type of the managed term in the present context.
341
- origin:: The bibliographic <<citation,citation>> for the managed term. This is also aliased as `ref`.
808
+ origin:: The bibliographic <<citation,citation>> for the managed term.
342
809
  modification:: A description of the modification to the cited definition of the term, if any, as it is to be applied in the present context.
343
810
 
344
811
 
@@ -863,6 +1330,15 @@ Glossarist::Validation::Rules::Registry.register(MyCustomRule)
863
1330
  Custom rules are automatically picked up by `DatasetValidator`, `GcrValidator`,
864
1331
  and `ConceptValidator` on the next validation run.
865
1332
 
1333
+ === compare
1334
+
1335
+ Compare two concept datasets and report differences.
1336
+
1337
+ [,bash]
1338
+ ----
1339
+ glossarist compare PATH_OLD PATH_NEW [--format text|json|yaml]
1340
+ ----
1341
+
866
1342
  === upgrade
867
1343
 
868
1344
  Upgrade a dataset to the current schema version.