unitsdb 0.1.1 → 2.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (75)
  1. checksums.yaml +4 -4
  2. data/.github/workflows/dependent-repos.json +5 -0
  3. data/.github/workflows/depenedent-gems.yml +16 -0
  4. data/.gitmodules +3 -0
  5. data/.rspec +2 -1
  6. data/.rubocop_todo.yml +168 -15
  7. data/Gemfile +3 -2
  8. data/README.adoc +803 -1
  9. data/exe/unitsdb +7 -0
  10. data/lib/unitsdb/cli.rb +88 -0
  11. data/lib/unitsdb/commands/_modify.rb +22 -0
  12. data/lib/unitsdb/commands/base.rb +26 -0
  13. data/lib/unitsdb/commands/check_si.rb +124 -0
  14. data/lib/unitsdb/commands/get.rb +133 -0
  15. data/lib/unitsdb/commands/normalize.rb +81 -0
  16. data/lib/unitsdb/commands/release.rb +73 -0
  17. data/lib/unitsdb/commands/search.rb +219 -0
  18. data/lib/unitsdb/commands/si_formatter.rb +485 -0
  19. data/lib/unitsdb/commands/si_matcher.rb +470 -0
  20. data/lib/unitsdb/commands/si_ttl_parser.rb +100 -0
  21. data/lib/unitsdb/commands/si_updater.rb +212 -0
  22. data/lib/unitsdb/commands/ucum/check.rb +126 -0
  23. data/lib/unitsdb/commands/ucum/formatter.rb +141 -0
  24. data/lib/unitsdb/commands/ucum/matcher.rb +301 -0
  25. data/lib/unitsdb/commands/ucum/update.rb +84 -0
  26. data/lib/unitsdb/commands/ucum/updater.rb +98 -0
  27. data/lib/unitsdb/commands/ucum/xml_parser.rb +34 -0
  28. data/lib/unitsdb/commands/ucum.rb +43 -0
  29. data/lib/unitsdb/commands/validate/identifiers.rb +42 -0
  30. data/lib/unitsdb/commands/validate/references.rb +318 -0
  31. data/lib/unitsdb/commands/validate/si_references.rb +109 -0
  32. data/lib/unitsdb/commands/validate.rb +40 -0
  33. data/lib/unitsdb/config.rb +19 -0
  34. data/lib/unitsdb/database.rb +662 -0
  35. data/lib/unitsdb/dimension.rb +19 -25
  36. data/lib/unitsdb/dimension_details.rb +20 -0
  37. data/lib/unitsdb/dimension_reference.rb +8 -0
  38. data/lib/unitsdb/dimensions.rb +4 -6
  39. data/lib/unitsdb/errors.rb +13 -0
  40. data/lib/unitsdb/external_reference.rb +14 -0
  41. data/lib/unitsdb/identifier.rb +8 -0
  42. data/lib/unitsdb/localized_string.rb +17 -0
  43. data/lib/unitsdb/prefix.rb +11 -12
  44. data/lib/unitsdb/prefix_reference.rb +10 -0
  45. data/lib/unitsdb/prefixes.rb +4 -6
  46. data/lib/unitsdb/quantities.rb +4 -27
  47. data/lib/unitsdb/quantity.rb +12 -24
  48. data/lib/unitsdb/quantity_reference.rb +4 -7
  49. data/lib/unitsdb/root_unit_reference.rb +14 -0
  50. data/lib/unitsdb/scale.rb +17 -0
  51. data/lib/unitsdb/scale_properties.rb +12 -0
  52. data/lib/unitsdb/scale_reference.rb +10 -0
  53. data/lib/unitsdb/scales.rb +12 -0
  54. data/lib/unitsdb/si_derived_base.rb +13 -14
  55. data/lib/unitsdb/symbol_presentations.rb +14 -0
  56. data/lib/unitsdb/ucum.rb +198 -0
  57. data/lib/unitsdb/unit.rb +20 -26
  58. data/lib/unitsdb/unit_reference.rb +5 -8
  59. data/lib/unitsdb/unit_system.rb +8 -10
  60. data/lib/unitsdb/unit_system_reference.rb +10 -0
  61. data/lib/unitsdb/unit_systems.rb +4 -16
  62. data/lib/unitsdb/units.rb +4 -6
  63. data/lib/unitsdb/utils.rb +84 -0
  64. data/lib/unitsdb/version.rb +1 -1
  65. data/lib/unitsdb.rb +13 -10
  66. data/unitsdb.gemspec +6 -3
  67. metadata +120 -12
  68. data/lib/unitsdb/dimension_quantity.rb +0 -28
  69. data/lib/unitsdb/dimension_symbol.rb +0 -22
  70. data/lib/unitsdb/prefix_symbol.rb +0 -12
  71. data/lib/unitsdb/root_unit.rb +0 -17
  72. data/lib/unitsdb/root_units.rb +0 -20
  73. data/lib/unitsdb/symbol.rb +0 -17
  74. data/lib/unitsdb/unit_symbol.rb +0 -15
  75. data/lib/unitsdb/unitsdb.rb +0 -6
data/README.adoc CHANGED
@@ -26,7 +26,809 @@ to access and manipulate the UnitsDB content.
26
26
  $ gem install unitsdb
27
27
  ----
28
28
 
29
- == Usage
29
+
30
+
31
+ == UnitsDB version support
32
+
33
+ === General
34
+
35
+ This library supports the UnitsDB 2.0.0 format only.
36
+
37
+ The version of the YAML files is stored in the `version` field of the `*.yaml`
38
+ files. The library checks this version when loading the database and raises an
39
+ error if the version is not 2.0.0.
40
+
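The version check described above can be sketched in plain Ruby. This is a minimal illustration, not the gem's implementation: the error class, the method name, and the default field name `schema_version` (borrowed from the unified release file example) are assumptions made for this sketch.

```ruby
require "yaml"

SUPPORTED_SCHEMA_VERSION = "2.0.0"

# Hypothetical error class for this sketch; not the gem's own error type.
class UnsupportedVersionError < StandardError; end

# Returns the document if it declares the supported version, raises otherwise.
# `key` names the field to inspect (an assumption; adjust to the actual files).
def check_schema_version!(doc, key: "schema_version")
  found = doc[key].to_s
  return doc if found == SUPPORTED_SCHEMA_VERSION

  raise UnsupportedVersionError,
        "expected #{key} #{SUPPORTED_SCHEMA_VERSION}, got #{found.inspect}"
end

# Usage: parse a small in-memory document and check it.
doc = YAML.safe_load("schema_version: 2.0.0\nunits: []\n")
check_schema_version!(doc)
```

Failing fast at load time keeps a 1.0.0-format file from being half-parsed into 2.0.0 models.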
41
+ === UnitsDB 2.0.0 features
42
+
43
+ ==== General
44
+
45
+ UnitsDB 2.0.0 introduces several significant improvements over version 1.0.0.
46
+
47
+ ==== UnitsML identifiers
48
+
49
+ From UnitsDB 2.0.0, all entities now have an organization-independent identifier
50
+ that is unique across all entities in the scope of `unitsml`.
51
+
52
+ ==== New Content
53
+
54
+ Version 2.0.0 includes several new additions:
55
+
56
+ * New dimensions (fluence, phase, fuel efficiency, etc.)
57
+ * New quantities (like emission_rate, fluence, kerma_rate, etc.)
58
+ * Formal structure for scale definitions
59
+ * Additional symbols and improved representation
60
+
61
+ ==== Multilingual support
62
+
63
+ Version 2.0.0 adds support for localized names in multiple languages:
64
+
65
+ * Names are now structured as objects with `value` and `lang` properties
66
+ * English (en) is the primary language for all entries
67
+ * French (fr) translations for units and quantities are available
68
+
69
+ [source,ruby]
70
+ ----
71
+ # Accessing multilingual names (UnitsDB 2.0.0)
72
+ meter = db.find_by_type(id: "NISTu1", type: "units")
73
+ english_names = meter.names.select { |n| n.lang == "en" }.map(&:value)
74
+ french_names = meter.names.select { |n| n.lang == "fr" }.map(&:value)
75
+ ----
76
+
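A common follow-up is to prefer one language and fall back to English when no translation exists. The sketch below uses a stand-in `LocalizedName` struct rather than the gem's classes; only the `value` and `lang` properties come from the text above, and the helper name `names_for` is hypothetical.

```ruby
# Stand-in for a localized name entry; not the gem's own class.
LocalizedName = Struct.new(:value, :lang)

# Returns the names in `lang`, falling back to `fallback` when none exist.
def names_for(names, lang, fallback: "en")
  preferred = names.select { |n| n.lang == lang }
  preferred = names.select { |n| n.lang == fallback } if preferred.empty?
  preferred.map(&:value)
end

names = [LocalizedName.new("meter", "en"), LocalizedName.new("mètre", "fr")]
names_for(names, "fr") # => ["mètre"]
names_for(names, "de") # => ["meter"] (no German entry, so English is used)
```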
77
+ ==== Enhanced symbol representation
78
+
79
+ Symbols now have more comprehensive representation formats:
80
+
81
+ * All entities with symbols have representations in multiple formats (ASCII, Unicode, HTML, MathML, LaTeX)
82
+ * Prefixes now use the same symbol structure as units with a collection of symbol objects
83
+ * Dimensions now use `symbols` instead of `dim_symbols`
84
+
85
+ ==== External references
86
+
87
+ The 2.0.0 format includes a formalized approach to external references:
88
+
89
+ * The `references` field links to external resources like the SI Digital Framework
90
+ * More consistent structure for references between entities
91
+
92
+
93
+ === Unified UnitsDB release file format
94
+
95
+ While the UnitsDB database is maintained in separate YAML files for easier
96
+ management (`units.yaml`, `quantities.yaml`, etc.), the unified release file
97
+ consolidates all data into a single YAML file for improved user convenience.
98
+
99
+ Syntax:
100
+
101
+ [source,yaml]
102
+ ----
103
+ schema_version: 2.0.0
104
+ version: 1.0.0 # Release version (in semantic format x.y.z)
105
+ dimensions:
106
+ - identifiers: [...]
107
+ length: {...}
108
+ names: [...]
109
+ - ...
110
+ prefixes:
111
+ - identifiers: [...]
112
+ name: ...
113
+ symbols: [...]
114
+ - ...
115
+ quantities:
116
+ - identifiers: [...]
117
+ quantity_type: ...
118
+ names: [...]
119
+ - ...
120
+ units:
121
+ - identifiers: [...]
122
+ names: [...]
123
+ symbols: [...]
124
+ - ...
125
+ unit_systems:
126
+ - identifiers: [...]
127
+ name: ...
128
+ - ...
129
+ ----
130
+
131
+ There are several advantages to using the unified file format:
132
+
133
+ * Simplified usage: removes the need to load and manage multiple files
134
+ * Consistency: All data is guaranteed to be compatible and consistently integrated
135
+
136
+ The unified file maintains the same structure and relationships as the separate
137
+ files, with all entities organized by type under their respective top-level keys,
138
+ while preserving all identifiers, references, and properties from the original
139
+ database.
140
+
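To make the consolidation concrete, here is a rough sketch of how separately parsed documents could be merged under the top-level keys shown in the syntax above. It assumes each per-type document carries a `schema_version` field and one list keyed by its entity type; the method name `unify` is hypothetical and this is not how the gem itself builds release files.

```ruby
require "yaml"

ENTITY_KEYS = %w[dimensions prefixes quantities units unit_systems].freeze

# Builds a unified document from per-type parsed YAML documents.
# `docs` maps an entity key (e.g. "units") to that file's parsed hash.
def unify(docs, version:)
  schema_versions = docs.values.map { |d| d["schema_version"] }.uniq
  unless schema_versions.size == 1
    raise ArgumentError, "expected one schema version, got #{schema_versions.inspect}"
  end

  unified = { "schema_version" => schema_versions.first, "version" => version }
  # Entity types with no input document end up as empty lists.
  ENTITY_KEYS.each { |key| unified[key] = docs.fetch(key, {}).fetch(key, []) }
  unified
end

docs = {
  "units" => YAML.safe_load("schema_version: 2.0.0\nunits:\n- short: meter\n"),
  "prefixes" => YAML.safe_load("schema_version: 2.0.0\nprefixes:\n- name: kilo\n")
}
puts unify(docs, version: "1.0.0").to_yaml
```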
141
+
142
+
143
+ == Usage: CLI
144
+
145
+ The UnitsDB gem includes a command-line utility for working with UnitsDB data.
146
+ This tool provides several commands for validating and normalizing UnitsDB
147
+ content.
148
+
149
+ === Installation
150
+
151
+ The `unitsdb` command is automatically installed when you install the gem.
152
+
153
+ === Available commands
154
+
155
+ ==== Database validation
156
+
157
+ The UnitsDB CLI provides several validation subcommands to ensure database
158
+ integrity and correctness. These commands help identify potential issues in the
159
+ database structure and content.
160
+
161
+ ===== References validation
162
+
163
+ Validates that all references within the database exist and point to actual entities:
164
+
165
+ [source,sh]
166
+ ----
167
+ # Validate all references
168
+ $ unitsdb validate references --database=/path/to/unitsdb/data
169
+
170
+ # Show valid references too (not just errors)
171
+ $ unitsdb validate references --print-valid --database=/path/to/unitsdb/data
172
+
173
+ # Show detailed registry contents for debugging
174
+ $ unitsdb validate references --debug-registry --database=/path/to/unitsdb/data
175
+ ----
176
+
177
+ This command checks all entity references (unit references, quantity references,
178
+ dimension references, etc.) to ensure they point to existing entities within the
179
+ database. It reports any "dangling" references that point to non-existent
180
+ entities, which could cause issues in applications using the database.
181
+
182
+ Options:
183
+
184
+ `--database`, `-d`:: Path to UnitsDB database (required)
185
+ `--debug-registry`:: Show registry contents for debugging
185
+ `--print-valid`:: Print valid references too, not just invalid ones
187
+
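The idea of a "dangling" reference can be illustrated with a few lines of Ruby. This is only a sketch of the concept: the `"id"` and `"refs"` field names are placeholders chosen for the example and do not reflect the actual UnitsDB schema or the command's internals.

```ruby
require "set"

# Returns [owner_id, missing_id] pairs for references that resolve to nothing.
# `entities` is an array of hashes with an "id" and an optional "refs" list.
def dangling_references(entities)
  known = entities.map { |e| e["id"] }.to_set
  entities.flat_map do |entity|
    entity.fetch("refs", [])
          .reject { |ref| known.include?(ref) }
          .map { |ref| [entity["id"], ref] }
  end
end

entities = [
  { "id" => "u1", "refs" => %w[q1 q9] }, # q9 does not exist
  { "id" => "q1" }
]
dangling_references(entities) # => [["u1", "q9"]]
```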
188
+ ===== Identifiers validation
189
+
190
+ Checks for uniqueness of identifier fields to prevent duplicate IDs:
191
+
192
+ [source,sh]
193
+ ----
194
+ $ unitsdb validate identifiers --database=/path/to/unitsdb/data
195
+ ----
196
+
197
+ This command ensures that each identifier within an entity type (units,
198
+ prefixes, quantities, etc.) is unique. Duplicate identifiers could lead to
199
+ ambiguity and unexpected behavior when referencing entities by ID.
200
+
201
+ Options:
202
+
203
+ `--database`, `-d`:: Path to UnitsDB database (required)
204
+
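A uniqueness check of this kind amounts to grouping identifiers and keeping the groups with more than one member. The sketch below works on a plain list of ID strings and records positions so duplicates can be located; the method name is hypothetical and the real command operates on the loaded database rather than on bare strings.

```ruby
# Groups entries by id and keeps only ids that occur more than once.
# Returns { id => [index, ...] } so each duplicate can be located.
def duplicate_ids(ids)
  ids.each_with_index
     .group_by { |id, _index| id }
     .transform_values { |pairs| pairs.map { |_id, index| index } }
     .select { |_id, indexes| indexes.size > 1 }
end

duplicate_ids(%w[NISTu1 NISTu2 NISTu1]) # => {"NISTu1"=>[0, 2]}
```

The same grouping applies to any key that must be unique within an entity type, such as an external reference URI.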
205
+ ===== SI references validation
206
+
207
+ Validates that each SI digital framework reference is unique per entity type:
208
+
209
+ [source,sh]
210
+ ----
211
+ $ unitsdb validate si_references --database=/path/to/unitsdb/data
212
+ ----
213
+
214
+ This command checks that each SI digital framework URI is referenced by at most
215
+ one entity of each type. Multiple entities of the same type referencing the same
216
+ SI URI could cause issues with mapping and conversion processes.
217
+
218
+ The command reports:
219
+
220
+ * Any duplicate SI references within each entity type
221
+ * The entities that share the same SI reference
222
+ * Their position in the database for easy location
223
+
224
+ Options:
225
+
226
+ `--database`, `-d`:: Path to UnitsDB database (required)
227
+
228
+ === Examples of validation commands
229
+
230
+ * Check identifiers for uniqueness:
231
+ +
232
+ [source,sh]
233
+ ----
234
+ $ unitsdb validate identifiers --database=/path/to/unitsdb/data
235
+ ----
236
+
237
+ * Validate references in a specific directory:
238
+ +
239
+ [source,sh]
240
+ ----
241
+ $ unitsdb validate references --database=/path/to/unitsdb/data
242
+ ----
243
+
244
+ * Check for duplicate SI references:
245
+ +
246
+ [source,sh]
247
+ ----
248
+ $ unitsdb validate si_references --database=/path/to/unitsdb/data
249
+ ----
250
+
251
+
252
+ ==== Database Modification (_modify)
253
+
254
+ Commands that modify the database are grouped under the `_modify` namespace:
255
+
256
+ [source,sh]
257
+ ----
258
+ # Normalize YAML file format
259
+ $ unitsdb _modify normalize [INPUT] [OUTPUT] --database=/path/to/unitsdb/data
260
+ $ unitsdb _modify normalize --all --database=/path/to/unitsdb/data
261
+
262
+ # Sort by different ID types
263
+ $ unitsdb _modify normalize --sort=nist [INPUT] [OUTPUT] --database=/path/to/unitsdb/data
264
+ $ unitsdb _modify normalize --sort=unitsml [INPUT] [OUTPUT] --database=/path/to/unitsdb/data
265
+ $ unitsdb _modify normalize --sort=short [INPUT] [OUTPUT] --database=/path/to/unitsdb/data
266
+ $ unitsdb _modify normalize --sort=none [INPUT] [OUTPUT] --database=/path/to/unitsdb/data
267
+ ----
268
+
269
+ Options:
270
+
271
+ `--all`, `-a`:: Process all YAML files in the repository
272
+ `--database`, `-d`:: Path to UnitsDB database (required)
273
+ `--sort`:: Sort units by: 'short' (name), 'nist' (ID, default), 'unitsml' (ID), or 'none'
274
+
275
+
276
+ ==== Search
277
+
278
+ Searches for entities in the database and displays ID and ID Type information for each result:
279
+
280
+ [source,sh]
281
+ ----
282
+ # Search by text content
283
+ $ unitsdb search meter --database=/path/to/unitsdb/data
284
+ $ unitsdb search meter --type=units --database=/path/to/unitsdb/data
285
+
286
+ # Search by ID
287
+ $ unitsdb search any-query --id=NISTu1 --database=/path/to/unitsdb/data
288
+ $ unitsdb search any-query --id=NISTu1 --id_type=nist --database=/path/to/unitsdb/data
289
+
290
+ # Output in different formats
291
+ $ unitsdb search meter --format=json --database=/path/to/unitsdb/data
292
+ $ unitsdb search kilo --format=yaml --database=/path/to/unitsdb/data
293
+ ----
294
+
295
+ Options:
296
+
297
+ `--type`, `-t`:: Entity type to search (units, prefixes, quantities, dimensions, unit_systems)
298
+ `--id`, `-i`:: Search for an entity with a specific identifier
299
+ `--id_type`:: Filter the ID search by identifier type
300
+ `--format`:: Output format (text, json, yaml) - default is text
301
+ `--database`, `-d`:: Path to UnitsDB database (required)
302
+
303
+ ==== Get
304
+
305
+ Retrieves and displays the full details of a specific entity by its identifier:
306
+
307
+ [source,sh]
308
+ ----
309
+ # Get entity details by ID
310
+ $ unitsdb get meter --database=/path/to/unitsdb/data
311
+ $ unitsdb get m --database=/path/to/unitsdb/data
312
+
313
+ # Get entity with specific ID type
314
+ $ unitsdb get meter --id_type=si --database=/path/to/unitsdb/data
315
+
316
+ # Output in different formats
317
+ $ unitsdb get kilogram --format=json --database=/path/to/unitsdb/data
318
+ $ unitsdb get second --format=yaml --database=/path/to/unitsdb/data
319
+ ----
320
+
321
+ Options:
322
+
323
+ `--id_type`:: Filter the search by identifier type
324
+ `--format`:: Output format (text, json, yaml) - default is text
325
+ `--database`, `-d`:: Path to UnitsDB database (required)
326
+
327
+ ==== Check references to SI Digital Framework
328
+
329
+ Performs a comprehensive check of entities in the BIPM's SI digital framework
330
+ TTL files against UnitsDB database entities.
331
+
332
+ This combined command checks in both directions to ensure UnitsDB is a strict
333
+ superset of the SI digital framework:
334
+
335
+ * From SI to UnitsDB: Ensures every TTL entity is referenced by at least one
336
+ UnitsDB entity
337
+
338
+ * From UnitsDB to SI: Identifies UnitsDB entities that should reference TTL
339
+ entities
340
+
341
+ [source,sh]
342
+ ----
343
+ # Check all entity types and generate a report
344
+ $ unitsdb check_si --database=spec/fixtures/unitsdb --ttl-dir=spec/fixtures/bipm-si-ttl
345
+
346
+ # Check a specific entity type (units, quantities, or prefixes)
347
+ $ unitsdb check_si --entity-type=units \
348
+ --database=spec/fixtures/unitsdb \
349
+ --ttl-dir=spec/fixtures/bipm-si-ttl
350
+
351
+ # Check in a specific direction only
352
+ $ unitsdb check_si --direction=from_si \
353
+ --database=spec/fixtures/unitsdb \
354
+ --ttl-dir=spec/fixtures/bipm-si-ttl
355
+
356
+ # Update references and write to output directory
357
+ $ unitsdb check_si --output-updated-database=new_unitsdb \
358
+ --database=spec/fixtures/unitsdb \
359
+ --ttl-dir=spec/fixtures/bipm-si-ttl
360
+
361
+ # Include potential matches when updating references (default: false)
362
+ $ unitsdb check_si --include-potential-matches \
363
+ --output-updated-database=new_unitsdb \
364
+ --database=spec/fixtures/unitsdb \
365
+ --ttl-dir=spec/fixtures/bipm-si-ttl
366
+ ----
367
+
368
+ Options:
369
+
370
+ `--database`, `-d`:: Path to UnitsDB database (required)
371
+
372
+ `--ttl-dir`, `-t`:: Path to the directory containing SI digital framework TTL
373
+ files (required)
374
+
375
+ `--entity-type`, `-e`:: Entity type to check (units, quantities, or prefixes).
376
+ If not specified, all types are checked
377
+
378
+ `--output-updated-database`, `-o`:: Directory path to write updated YAML files
379
+ with added SI references
380
+
381
+ `--direction`, `-r`:: Direction to check: 'to_si' (UnitsDB→TTL), 'from_si'
382
+ (TTL→UnitsDB), or 'both' (default)
383
+
384
+ `--include-potential-matches`, `-p`:: Include potential matches when updating
385
+ references (default: false)
386
+
387
+
388
+ ==== Check references to UCUM
389
+
390
+ Performs a comprehensive check of entities in the UCUM XML file against UnitsDB
391
+ database entities and updates UnitsDB with UCUM references.
392
+
393
+ UCUM supports the following entity types:
394
+
395
+ * Base units
396
+ * Units
397
+ * Prefixes
398
+
399
+ UCUM provides dimensions as part of their unit definitions but not as
400
+ uniquely referenceable entities.
401
+
402
+ This combined command checks in both directions to ensure UnitsDB supports
403
+ every UCUM entity:
404
+
405
+ * From UCUM to UnitsDB: Ensures every UCUM entity is referenced by at least one
406
+ UnitsDB entity
407
+
408
+ * From UnitsDB to UCUM: Identifies UnitsDB entities that should reference UCUM
409
+ entities
410
+
411
+ There are two commands:
412
+
413
+ * `ucum check`: Checks for matches between UnitsDB and UCUM entities and reports results
414
+
415
+ * `ucum update`: Updates UnitsDB entities with references to matching UCUM entities
416
+
417
+ [source,sh]
418
+ ----
419
+ # Check all entity types and generate a report
420
+ $ unitsdb ucum check --database=spec/fixtures/unitsdb --ucum-file=spec/fixtures/ucum/ucum-essence.xml
421
+
422
+ # Check a specific entity type (units or prefixes)
423
+ $ unitsdb ucum check --entity-type=units \
424
+ --database=spec/fixtures/unitsdb \
425
+ --ucum-file=spec/fixtures/ucum/ucum-essence.xml
426
+
427
+ # Check in a specific direction only
428
+ $ unitsdb ucum check --direction=from_ucum \
429
+ --database=spec/fixtures/unitsdb \
430
+ --ucum-file=spec/fixtures/ucum/ucum-essence.xml
431
+ ----
432
+
433
+ Options:
434
+
435
+ `--database`, `-d`:: Path to UnitsDB database (required)
436
+
437
+ `--ucum-file`, `-u`:: Path to the UCUM essence XML file (required)
438
+
439
+ `--entity-type`, `-e`:: Entity type to check (units or prefixes).
440
+ If not specified, all types are checked.
441
+
442
+ `--direction`, `-r`:: Direction to check: `to_ucum` (UnitsDB→UCUM), `from_ucum`
443
+ (UCUM→UnitsDB), or `both` (default)
444
+
445
+ `--output-updated-database`, `-o`:: Directory path to write updated YAML files
446
+ with added UCUM references
447
+
448
+ `--include-potential-matches`, `-p`:: Include potential matches when updating
449
+ references (default: false)
450
+
451
+
452
+ ==== Update UCUM references
453
+
454
+ [source,sh]
455
+ ----
456
+ # Update all entity types with UCUM references
457
+ $ unitsdb ucum update --database=spec/fixtures/unitsdb \
458
+ --ucum-file=spec/fixtures/ucum/ucum-essence.xml \
459
+ --output-dir=new_unitsdb
460
+
461
+ # Update a specific entity type (units or prefixes)
462
+ $ unitsdb ucum update --entity-type=units \
463
+ --database=spec/fixtures/unitsdb \
464
+ --ucum-file=spec/fixtures/ucum/ucum-essence.xml \
465
+ --output-dir=new_unitsdb
466
+
467
+ # Include potential matches when updating references (default: false)
468
+ $ unitsdb ucum update --include-potential-matches \
469
+ --database=spec/fixtures/unitsdb \
470
+ --ucum-file=spec/fixtures/ucum/ucum-essence.xml \
471
+ --output-dir=new_unitsdb
472
+ ----
473
+
474
+ Options:
475
+
476
+ `--database`, `-d`:: Path to UnitsDB database (required)
477
+
478
+ `--ucum-file`, `-u`:: Path to the UCUM essence XML file (required)
479
+
480
+ `--entity-type`, `-e`:: Entity type to update (units or prefixes).
481
+ If not specified, all types are updated
482
+
483
+ `--output-dir`, `-o`:: Directory path to write updated YAML files
484
+ (defaults to database path)
485
+
486
+ `--include-potential-matches`, `-p`:: Include potential matches when updating
487
+ references (default: false)
488
+
489
+ ==== Release
490
+
491
+ Creates release files for UnitsDB in unified formats:
492
+
493
+ [source,sh]
494
+ ----
495
+ # Create both unified YAML and ZIP archive
496
+ $ unitsdb release --database=/path/to/unitsdb/data
497
+
498
+ # Create only unified YAML file
499
+ $ unitsdb release --format=yaml --database=/path/to/unitsdb/data
500
+
501
+ # Create only ZIP archive
502
+ $ unitsdb release --format=zip --database=/path/to/unitsdb/data
503
+
504
+ # Specify output directory
505
+ $ unitsdb release --output-dir=/path/to/output --database=/path/to/unitsdb/data
506
+
507
+ # Specify a version (required)
508
+ $ unitsdb release --version=2.1.0 --database=/path/to/unitsdb/data
509
+ ----
510
+
511
+ This command creates release files for UnitsDB in two formats:
512
+
513
+ . A unified YAML file that combines all database files into a single file
514
+
515
+ . A ZIP archive containing all individual database files
516
+
517
+ The command verifies that all files have the same schema version before creating
518
+ the release files. The output files are named with the schema version (e.g.,
519
+ `unitsdb-2.1.0.yaml` and `unitsdb-2.1.0.zip`).
520
+
521
+ Options:
522
+
523
+ `--format`, `-f`:: Output format: 'yaml' (single file), 'zip' (archive), or
524
+ 'all' (both). Default is 'all'.
525
+
526
+ `--output-dir`, `-o`:: Directory to output release files. Default is current
527
+ directory.
528
+
529
+ `--database`, `-d`:: Path to UnitsDB database (required)
530
+
531
+ ===== Match types in check_si
532
+
533
+ The `check_si` command classifies matches into two categories:
534
+
535
+ **Exact matches**::
536
+ These are high-confidence matches based on exact name or label equivalence.
537
+
538
+ ** `short_to_name`: UnitsDB short name matches SI name
539
+ ** `short_to_label`: UnitsDB short name matches SI label
540
+ ** `name_to_name`: UnitsDB name matches SI name
541
+ ** `name_to_label`: UnitsDB name matches SI label
542
+ ** `name_to_alt_label`: UnitsDB name matches SI alternative label
543
+
544
+ **Potential matches**::
545
+ These are lower-confidence matches that require manual verification.
546
+
547
+ ** `symbol_match`: Only the symbols match, not the names
548
+ ** `partial_match`: Incomplete match (e.g., "sidereal_day" vs "day")
549
+
550
+ When using `--include-potential-matches`, both exact and potential matches will
551
+ be included in the reference updates. Without this flag, only exact matches are
552
+ used for automatic updates.
553
+
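The effect of the flag can be sketched as a filter over classified matches. The match type names below are taken from the lists above; the method names and the `{ type: ... }` hash shape are assumptions for illustration, not the command's actual data structures.

```ruby
EXACT_MATCH_TYPES = %w[
  short_to_name short_to_label name_to_name name_to_label name_to_alt_label
].freeze
POTENTIAL_MATCH_TYPES = %w[symbol_match partial_match].freeze

# Maps a match type name to its confidence category.
def match_category(match_type)
  return :exact if EXACT_MATCH_TYPES.include?(match_type)
  return :potential if POTENTIAL_MATCH_TYPES.include?(match_type)

  raise ArgumentError, "unknown match type: #{match_type}"
end

# Selects the matches that would be applied to reference updates.
def matches_to_apply(matches, include_potential: false)
  matches.select do |m|
    category = match_category(m[:type])
    category == :exact || (include_potential && category == :potential)
  end
end

matches = [{ type: "name_to_label" }, { type: "symbol_match" }]
matches_to_apply(matches).size                          # => 1
matches_to_apply(matches, include_potential: true).size # => 2
```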
554
+ ===== SI References Workflow
555
+
556
+ When the BIPM updates its SI Digital Framework TTL files, follow these steps
557
+ to ensure UnitsDB remains a strict superset:
558
+
559
+ . Verify unreferenced TTL entries:
560
+
561
+ ** Run this:
562
+ +
563
+ [source,sh]
564
+ ----
565
+ $ unitsdb check_si --database=/path/to/unitsdb/data --ttl-dir=/path/to/si-framework
566
+ ----
567
+
568
+ ** Look for entries in the "SI [Entity Type] not mapped to our database" section
569
+
570
+ ** These are TTL entities that are not currently referenced by any UnitsDB entity
571
+
572
+ . For each unreferenced TTL entry:
573
+
574
+ ** Search for matching entities in UnitsDB:
575
+ +
576
+ [source,sh]
577
+ ----
578
+ $ unitsdb search "entity_name" --database=/path/to/unitsdb/data
579
+ ----
580
+
581
+ ** If a match exists:
582
+
583
+ *** Update its references manually in the appropriate YAML file
584
+ *** Add a new reference with `authority: "si-digital-framework"` and the TTL URI
585
+
586
+ ** If no match exists:
587
+
588
+ *** Create a new entity in the appropriate YAML file (`units.yaml`,
589
+ `quantities.yaml`, or `prefixes.yaml`)
590
+
591
+ *** Include the necessary reference to the TTL entity
592
+
593
+ . Verify all references are complete:
594
+
595
+ ** Run this again:
596
+ +
597
+ [source,sh]
598
+ ----
599
+ $ unitsdb check_si --database=/path/to/unitsdb/data --ttl-dir=/path/to/si-framework
600
+ ----
601
+
602
+ ** Confirm no entries appear in the "SI [Entity Type] not mapped to our database" section
603
+
604
+ ** If needed, run with the output option to automatically add missing references:
605
+ +
606
+ [source,sh]
607
+ ----
608
+ $ unitsdb check_si --output-updated-database=/path/to/output/dir \
609
+ --database=/path/to/unitsdb/data \
610
+ --ttl-dir=/path/to/si-framework
611
+ ----
612
+
613
+ . Verify reference uniqueness:
614
+
615
+ ** Run:
616
+ +
617
+ [source,sh]
618
+ ----
619
+ $ unitsdb validate si_references --database=/path/to/unitsdb/data
620
+ ----
621
+
622
+ ** This checks that each SI URI is used by at most one entity of each type
623
+
624
+ ** Fix any duplicate references found
625
+
626
+ The `check_si` command ensures every entity in the BIPM's SI Digital Framework
627
+ is properly referenced in UnitsDB:
628
+
629
+ * It verifies that every TTL entity has at least one corresponding UnitsDB
630
+ entity referencing it
631
+
632
+ * It identifies UnitsDB entities that should reference SI Digital Framework but
633
+ don't yet
634
+
635
+ * It can automatically update YAML files with proper references when used with
636
+ the `--output-updated-database` option
637
+
638
+ * It correctly differentiates between exact and potential matches, with
639
+ symbol-to-symbol and partial matches always classified as potential
640
+
641
+
642
+ == Usage: Ruby
643
+
644
+ === Loading the database
645
+
646
+ The primary way to load the UnitsDB data is through the `Database.from_db`
647
+ method, which reads data from YAML files:
648
+
649
+ [source,ruby]
650
+ ----
651
+ require 'unitsdb'
652
+
653
+ # Load from the UnitsDB data directory
654
+ db = Unitsdb::Database.from_db('/path/to/unitsdb/data')
655
+
656
+ # Access different collections
657
+ units = db.units
658
+ prefixes = db.prefixes
659
+ dimensions = db.dimensions
660
+ quantities = db.quantities
661
+ unit_systems = db.unit_systems
662
+ ----
663
+
664
+ === Database search methods
665
+
666
+ The UnitsDB Ruby gem provides several methods for searching and retrieving
667
+ entities.
668
+
669
+ ==== Search by text content
670
+
671
+ The `search` method allows you to find entities containing specific text in
672
+ their identifiers, names, or descriptions:
673
+
674
+ [source,ruby]
675
+ ----
676
+ # Search across all entity types
677
+ results = db.search(text: "meter")
678
+
679
+ # Search within a specific entity type
680
+ units_with_meter = db.search(text: "meter", type: "units")
681
+ ----
682
+
683
+ ==== Find entity by ID
684
+
685
+ The `get_by_id` method finds an entity with a specific identifier across all
686
+ entity types:
687
+
688
+ [source,ruby]
689
+ ----
690
+ # Find by ID across all entity types
691
+ meter_entity = db.get_by_id(id: "NISTu1")
692
+
693
+ # Find by ID with specific identifier type
694
+ meter_entity = db.get_by_id(id: "NISTu1", type: "nist")
695
+ ----
696
+
697
+ ==== Find entity by ID within a specific type collection
698
+
699
+ The `find_by_type` method searches for an entity by ID within a specific entity
700
+ type collection:
701
+
702
+ [source,ruby]
703
+ ----
704
+ # Find unit with specific ID
705
+ meter_unit = db.find_by_type(id: "NISTu1", type: "units")
706
+ ----
707
+
708
+ ==== Find entities by symbol
709
+
710
+ The `find_by_symbol` method allows you to search for units and prefixes by their
711
+ symbol representation:
712
+
713
+ [source,ruby]
714
+ ----
715
+ # Find all entities with symbol "m"
716
+ matching_entities = db.find_by_symbol("m")
717
+
718
+ # Find only units with symbol "m"
719
+ matching_units = db.find_by_symbol("m", "units")
720
+
721
+ # Find only prefixes with symbol "k"
722
+ matching_prefixes = db.find_by_symbol("k", "prefixes")
723
+ ----
724
+
725
+ This method performs case-insensitive exact matches on the ASCII representation
726
+ of symbols. It's useful for finding units or prefixes when you know the symbol
727
+ but not the name or identifier.
728
+
729
+ Parameters:
730
+
731
+ `symbol` (String)::
732
+ The symbol to search for
733
+
734
+ `entity_type` (String, Symbol, nil)::
735
+ Optional. Limit search to a specific entity type ("units" or "prefixes"). If
736
+ nil, searches both.
737
+
738
+ Returns:
739
+
740
+ * An array of entities (Unit or Prefix objects) with matching symbols
741
+ * Empty array if no matches are found
742
+
743
+ NOTE: This method only searches units and prefixes, as these are the only entity
744
+ types that have symbol representations.
745
+
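The matching rule described above (case-insensitive, exact, on ASCII symbols, optionally limited to one entity type) can be approximated as follows. The `SymbolEntity` struct and `find_by_symbol_sketch` are stand-ins invented for this example; they are not the gem's classes or its implementation.

```ruby
# Stand-in entity: `kind` is "units" or "prefixes"; not the gem's own class.
SymbolEntity = Struct.new(:kind, :name, :ascii_symbols)

# Case-insensitive exact match on ASCII symbols, optionally limited by type.
def find_by_symbol_sketch(entities, symbol, entity_type = nil)
  wanted = symbol.downcase
  entities.select do |e|
    (entity_type.nil? || e.kind == entity_type.to_s) &&
      e.ascii_symbols.any? { |s| s.downcase == wanted }
  end
end

entities = [
  SymbolEntity.new("units", "meter", ["m"]),
  SymbolEntity.new("prefixes", "milli", ["m"]),
  SymbolEntity.new("prefixes", "mega", ["M"]),
  SymbolEntity.new("prefixes", "kilo", ["k"])
]
find_by_symbol_sketch(entities, "m").map(&:name)         # => ["meter", "milli", "mega"]
find_by_symbol_sketch(entities, "m", :units).map(&:name) # => ["meter"]
```

In this sketch, case-insensitive matching means "m" also matches symbols written as "M", so callers that care about case would need to filter the results further.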
746
+ === Main classes
747
+
748
+ The UnitsDB Ruby gem provides the following main classes.
749
+
750
+ ==== Database
751
+
752
+ The `Database` class is the main container that holds all UnitsML components. It
753
+ loads and provides access to units, prefixes, dimensions, quantities, and unit
754
+ systems.
755
+
756
+ [source,ruby]
757
+ ----
758
+ # Access database collections
759
+ db.units # => Array of Unit objects
760
+ db.prefixes # => Array of Prefix objects
761
+ db.dimensions # => Array of Dimension objects
762
+ db.quantities # => Array of Quantity objects
763
+ db.unit_systems # => Array of UnitSystem objects
764
+ ----
765
+
766
+ ==== Unit
767
+
768
+ The `Unit` class represents units of measure with their properties and
769
+ relationships:
770
+
771
+ * Identifiers
772
+ * Short name
773
+ * Whether it's a root unit or can be prefixed
774
+ * Dimension reference
775
+ * Unit system references
776
+ * Unit names
777
+ * Symbol presentations
778
+ * Quantity references
779
+ * SI derived bases
780
+ * Root unit references
781
+
782
+ ==== Prefix
783
+
784
+ The `Prefix` class represents prefixes for units (like kilo-, mega-, etc.):
785
+
786
+ * Identifiers
787
+ * Name
788
+ * Symbol presentations
789
+ * Base (e.g., 10)
790
+ * Power (e.g., 3 for kilo)
791
+
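The base and power together determine the multiplication factor a prefix applies. A minimal sketch, using a stand-in struct rather than the gem's `Prefix` class (the `factor` method is an illustration and is not claimed to exist in the gem):

```ruby
# Stand-in for a prefix with a numeric base and power.
PrefixSketch = Struct.new(:name, :base, :power) do
  # Multiplication factor: base raised to power.
  # Integer#** returns a Rational for negative exponents, so no precision is lost.
  def factor
    base**power
  end
end

PrefixSketch.new("kilo", 10, 3).factor   # => 1000
PrefixSketch.new("kibi", 2, 10).factor   # => 1024
PrefixSketch.new("milli", 10, -3).factor # => (1/1000)
```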
792
+ ==== Dimension
793
+
794
+ The `Dimension` class represents physical dimensions (like length, mass, etc.):
795
+
796
+ * Identifiers
797
+ * Whether it's dimensionless
798
+ * Basic dimensions (length, mass, time, etc.)
799
+ * Dimension details (power, symbol, dimension symbols)
800
+ * Short name
801
+
802
+ ==== UnitSystem
803
+
804
+ The `UnitSystem` class represents systems of units (like SI, Imperial, etc.):
805
+
806
+ * Identifiers
807
+ * Name
808
+ * Short name
809
+ * Whether it's acceptable
810
+
811
+ ==== Quantity
812
+
813
+ The `Quantity` class represents physical quantities that can be measured:
814
+
815
+ * Identifiers
816
+ * Quantity type
817
+ * Quantity names
818
+ * Short name
819
+ * Unit references
820
+ * Dimension reference
821
+
822
+ === Database files
823
+
824
+ The `Database.from_db` method reads the following YAML files:
825
+
826
+ * `prefixes.yaml` - Contains prefix definitions (e.g., kilo-, mega-)
827
+ * `dimensions.yaml` - Contains dimension definitions (e.g., length, mass)
828
+ * `units.yaml` - Contains unit definitions (e.g., meter, kilogram)
829
+ * `quantities.yaml` - Contains quantity definitions (e.g., length, mass)
830
+ * `unit_systems.yaml` - Contains unit system definitions (e.g., SI, Imperial)
831
+
30
832
 
31
833
 
32
834