unitsdb 2.1.1 → 2.2.1

Files changed (87)
  1. checksums.yaml +4 -4
  2. data/.github/workflows/release.yml +8 -1
  3. data/.gitignore +2 -0
  4. data/.gitmodules +4 -3
  5. data/.rubocop.yml +13 -8
  6. data/.rubocop_todo.yml +217 -100
  7. data/CLAUDE.md +55 -0
  8. data/Gemfile +4 -1
  9. data/README.adoc +283 -16
  10. data/data/dimensions.yaml +1864 -0
  11. data/data/prefixes.yaml +874 -0
  12. data/data/quantities.yaml +3715 -0
  13. data/data/scales.yaml +97 -0
  14. data/data/schemas/dimensions-schema.yaml +153 -0
  15. data/data/schemas/prefixes-schema.yaml +155 -0
  16. data/data/schemas/quantities-schema.yaml +117 -0
  17. data/data/schemas/scales-schema.yaml +106 -0
  18. data/data/schemas/unit_systems-schema.yaml +116 -0
  19. data/data/schemas/units-schema.yaml +215 -0
  20. data/data/unit_systems.yaml +78 -0
  21. data/data/units.yaml +14052 -0
  22. data/exe/unitsdb +7 -1
  23. data/lib/unitsdb/cli.rb +42 -15
  24. data/lib/unitsdb/commands/_modify.rb +40 -4
  25. data/lib/unitsdb/commands/base.rb +6 -2
  26. data/lib/unitsdb/commands/check_si/si_formatter.rb +488 -0
  27. data/lib/unitsdb/commands/check_si/si_matcher.rb +487 -0
  28. data/lib/unitsdb/commands/check_si/si_ttl_parser.rb +103 -0
  29. data/lib/unitsdb/commands/check_si/si_updater.rb +254 -0
  30. data/lib/unitsdb/commands/check_si.rb +54 -35
  31. data/lib/unitsdb/commands/get.rb +11 -10
  32. data/lib/unitsdb/commands/normalize.rb +21 -7
  33. data/lib/unitsdb/commands/qudt/check.rb +150 -0
  34. data/lib/unitsdb/commands/qudt/formatter.rb +194 -0
  35. data/lib/unitsdb/commands/qudt/matcher.rb +746 -0
  36. data/lib/unitsdb/commands/qudt/ttl_parser.rb +403 -0
  37. data/lib/unitsdb/commands/qudt/update.rb +126 -0
  38. data/lib/unitsdb/commands/qudt/updater.rb +189 -0
  39. data/lib/unitsdb/commands/qudt.rb +82 -0
  40. data/lib/unitsdb/commands/release.rb +12 -9
  41. data/lib/unitsdb/commands/search.rb +12 -11
  42. data/lib/unitsdb/commands/ucum/check.rb +42 -29
  43. data/lib/unitsdb/commands/ucum/formatter.rb +2 -1
  44. data/lib/unitsdb/commands/ucum/matcher.rb +23 -9
  45. data/lib/unitsdb/commands/ucum/update.rb +14 -13
  46. data/lib/unitsdb/commands/ucum/updater.rb +40 -6
  47. data/lib/unitsdb/commands/ucum/xml_parser.rb +0 -2
  48. data/lib/unitsdb/commands/ucum.rb +44 -4
  49. data/lib/unitsdb/commands/validate/identifiers.rb +2 -4
  50. data/lib/unitsdb/commands/validate/qudt_references.rb +111 -0
  51. data/lib/unitsdb/commands/validate/references.rb +36 -19
  52. data/lib/unitsdb/commands/validate/si_references.rb +3 -5
  53. data/lib/unitsdb/commands/validate/ucum_references.rb +105 -0
  54. data/lib/unitsdb/commands/validate.rb +67 -11
  55. data/lib/unitsdb/commands.rb +20 -0
  56. data/lib/unitsdb/database.rb +90 -52
  57. data/lib/unitsdb/dimension.rb +1 -4
  58. data/lib/unitsdb/dimension_details.rb +0 -1
  59. data/lib/unitsdb/dimensions.rb +0 -2
  60. data/lib/unitsdb/errors.rb +7 -0
  61. data/lib/unitsdb/prefix.rb +0 -4
  62. data/lib/unitsdb/prefix_reference.rb +0 -2
  63. data/lib/unitsdb/prefixes.rb +0 -1
  64. data/lib/unitsdb/quantities.rb +0 -2
  65. data/lib/unitsdb/quantity.rb +0 -6
  66. data/lib/unitsdb/qudt.rb +100 -0
  67. data/lib/unitsdb/root_unit_reference.rb +0 -3
  68. data/lib/unitsdb/scale.rb +0 -4
  69. data/lib/unitsdb/scale_reference.rb +0 -2
  70. data/lib/unitsdb/scales.rb +0 -2
  71. data/lib/unitsdb/si_derived_base.rb +0 -2
  72. data/lib/unitsdb/ucum.rb +14 -10
  73. data/lib/unitsdb/unit.rb +0 -10
  74. data/lib/unitsdb/unit_reference.rb +0 -2
  75. data/lib/unitsdb/unit_system.rb +1 -3
  76. data/lib/unitsdb/unit_system_reference.rb +0 -2
  77. data/lib/unitsdb/unit_systems.rb +0 -2
  78. data/lib/unitsdb/units.rb +0 -2
  79. data/lib/unitsdb/utils.rb +32 -21
  80. data/lib/unitsdb/version.rb +5 -1
  81. data/lib/unitsdb.rb +62 -14
  82. data/unitsdb.gemspec +6 -3
  83. metadata +52 -13
  84. data/lib/unitsdb/commands/si_formatter.rb +0 -485
  85. data/lib/unitsdb/commands/si_matcher.rb +0 -470
  86. data/lib/unitsdb/commands/si_ttl_parser.rb +0 -100
  87. data/lib/unitsdb/commands/si_updater.rb +0 -212
data/README.adoc CHANGED
@@ -19,6 +19,11 @@ https://github.com/unitsml/unitsdb.
19
19
  This repository contains the Ruby codebase for UnitsDB, which is used
20
20
  to access and manipulate the UnitsDB content.
21
21
 
22
+ The UnitsDB gem ships with the UnitsDB data bundled internally. When you install
23
+ the gem, the YAML data files are included automatically, so you do not need to
24
+ obtain or configure a separate data source. The data is accessed via
25
+ `Unitsdb.data_dir` or the convenience method `Unitsdb.database`.
26
+
22
27
  == Install
23
28
 
24
29
  [source,sh]
@@ -26,7 +31,79 @@ to access and manipulate the UnitsDB content.
26
31
  $ gem install unitsdb
27
32
  ----
28
33
 
34
+ The UnitsDB gem ships with the UnitsDB YAML data files bundled inside the gem.
35
+ Because the gem bundles immutable data, **the UnitsDB data must be released
36
+ (tagged) before the gem can be released with updated data**. The gem version
37
+ (`Unitsdb::VERSION`) and the data version (`Unitsdb::UNITS_DATA_VERSION`)
38
+ are independent: the gem can be patched without changing data, and the data
39
+ can be updated independently of the gem.
40
+
41
+
42
+
43
+ == How the Database Works
44
+
45
+ The `Unitsdb::Database` class loads all UnitsDB YAML files and provides
46
+ methods for searching and querying the data.
47
+
48
+ [source,ruby]
49
+ ----
50
+ require 'unitsdb'
29
51
 
52
+ # Access the pre-loaded bundled database (recommended)
53
+ db = Unitsdb.database
54
+
55
+ # Or load from a specific path
56
+ db = Unitsdb::Database.from_db('/path/to/data')
57
+ ----
58
+
59
+ === Lazy Loading and Caching
60
+
61
+ The bundled database (`Unitsdb.database`) is loaded on first access and cached
62
+ for subsequent calls. The path to the bundled data is available via:
63
+
64
+ [source,ruby]
65
+ ----
66
+ Unitsdb.data_dir # => Path to the data/ directory inside the gem
67
+ ----
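The load-on-first-access behaviour described in this hunk can be illustrated with plain memoization. The sketch below is not taken from the gem's source: `BundledData`, `load_database`, and `load_count` are hypothetical names, and a placeholder hash stands in for the parsed YAML.

```ruby
# Hypothetical stand-in for the gem's top-level module; not the Unitsdb source.
module BundledData
  class << self
    # Number of times the (pretend) YAML load has run; for demonstration only.
    attr_reader :load_count

    # Directory that would hold the bundled YAML files (illustrative path).
    def data_dir
      File.join(__dir__ || Dir.pwd, "data")
    end

    # Load on first access, then hand back the cached object on later calls.
    def database
      @database ||= load_database
    end

    private

    def load_database
      @load_count = (@load_count || 0) + 1
      # Placeholder for the parsed collections.
      { source: data_dir, units: [], prefixes: [] }
    end
  end
end

first = BundledData.database
second = BundledData.database
puts first.equal?(second)   # both calls return the same cached object
puts BundledData.load_count
```

With this pattern the expensive load runs once per process, which is presumably why the README recommends `Unitsdb.database` over repeated `Database.from_db` calls.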
68
+
69
+ === Database Structure
70
+
71
+ The database contains collections for each entity type:
72
+
73
+ [source,ruby]
74
+ ----
75
+ db.units # Array of Unit objects
76
+ db.prefixes # Array of Prefix objects
77
+ db.dimensions # Array of Dimension objects
78
+ db.quantities # Array of Quantity objects
79
+ db.unit_systems # Array of UnitSystem objects
80
+ db.scales # Array of Scale objects
81
+ ----
82
+
83
+ === Searching the Database
84
+
85
+ [source,ruby]
86
+ ----
87
+ # Search by text (searches names, identifiers, descriptions)
88
+ results = db.search(text: "meter")
89
+
90
+ # Find by exact ID
91
+ unit = db.get_by_id(id: "NISTu1")
92
+
93
+ # Find by symbol
94
+ units = db.find_by_symbol("m")
95
+ ----
96
+
97
+ === Validation
98
+
99
+ [source,ruby]
100
+ ----
101
+ # Check identifier uniqueness
102
+ dups = db.validate_uniqueness
103
+
104
+ # Validate all references exist
105
+ invalid_refs = db.validate_references
106
+ ----
30
107
 
31
108
  == UnitsDB version support
32
109
 
@@ -38,6 +115,24 @@ The version of the YAML files are stored in the `version` field of the `*.yaml`
38
115
  files. The library checks this version when loading the database and raises an
39
116
  error if the version is not 2.0.0.
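A version gate of the kind described here might look like the following minimal sketch. It assumes the version lives in a top-level `version` key, as the text above says; `UnsupportedVersionError` and `load_checked` are hypothetical names, not the library's API.

```ruby
require "yaml"

# Version named in the README text.
SUPPORTED_SCHEMA_VERSION = "2.0.0"

# Hypothetical error class for this sketch.
class UnsupportedVersionError < StandardError; end

# Parse YAML text and reject any document whose `version` field differs
# from the supported one.
def load_checked(yaml_text)
  doc = YAML.safe_load(yaml_text)
  version = doc["version"].to_s
  unless version == SUPPORTED_SCHEMA_VERSION
    raise UnsupportedVersionError, "unsupported version #{version.inspect}"
  end
  doc
end

doc = load_checked("version: 2.0.0\nunits: []\n")
puts doc["version"]

begin
  load_checked("version: 1.0.0\nunits: []\n")
rescue UnsupportedVersionError => e
  puts "rejected: #{e.message}"
end
```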
40
117
 
118
+ The `unitsdb-ruby` gem version tracks the bundled data version via
119
+ `Unitsdb::UNITS_DATA_VERSION`. For example, `unitsdb-ruby v2.1.2` bundles
120
+ `unitsdb` data at `v2.0.0`. When the gem is released with an updated data
121
+ submodule, both the gem version and `UNITS_DATA_VERSION` are updated together.
122
+
123
+ The `data/` submodule is pinned to a specific tag in the
124
+ https://github.com/unitsml/unitsdb[UnitsDB repository] (e.g. `refs/tags/v2.0.0`).
125
+ This means:
126
+
127
+ * Every bundled version of `unitsdb-ruby` ships with a known, immutable version
128
+ of the UnitsDB data.
129
+ * A new `unitsdb-ruby` release can only be made after the upstream UnitsDB data
130
+ has been released (tagged) in its own repository.
131
+ * To release a new gem with updated data: tag the new data in
132
+ `unitsml/unitsdb`, update the submodule's `branch` in `.gitmodules` to point
133
+ to `refs/tags/new-data-tag`, update `UNITS_DATA_VERSION`, then bump the gem
134
+ version and release.
135
+
41
136
  === UnitsDB 2.0.0 features
42
137
 
43
138
  ==== General
@@ -225,6 +320,29 @@ Options:
225
320
 
226
321
  `--database`, `-d`:: Path to UnitsDB database (required)
227
322
 
323
+ ===== QUDT references validation
324
+
325
+ Validates that each QUDT reference is unique per entity type:
326
+
327
+ [source,sh]
328
+ ----
329
+ $ unitsdb validate qudt_references --database=/path/to/unitsdb/data
330
+ ----
331
+
332
+ This command checks that each QUDT (Quantities, Units, Dimensions and Types) URI is referenced by at most
333
+ one entity of each type. Multiple entities of the same type referencing the same
334
+ QUDT URI could cause issues with mapping and conversion processes.
335
+
336
+ The command reports:
337
+
338
+ * Any duplicate QUDT references within each entity type
339
+ * The entities that share the same QUDT reference
340
+ * Their position in the database for easy location
341
+
342
+ Options:
343
+
344
+ `--database`, `-d`:: Path to UnitsDB database (required)
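The uniqueness rule described above (each QUDT URI referenced by at most one entity of a given type) can be sketched as a group-and-filter pass. This is an illustration under stated assumptions, not the command's implementation: entities are plain hashes, and the field names `id`, `references`, `authority`, and `uri`, as well as the sample data, are made up for the example.

```ruby
# Return a hash of QUDT URI => entities sharing it, for URIs that are
# referenced by more than one entity in the given collection.
# Field names are assumptions for this sketch.
def duplicate_qudt_references(entities)
  by_uri = Hash.new { |hash, uri| hash[uri] = [] }

  entities.each_with_index do |entity, index|
    (entity[:references] || []).each do |ref|
      next unless ref[:authority] == "qudt"

      # Record the position so a report can point at the offending entry.
      by_uri[ref[:uri]] << { id: entity[:id], index: index }
    end
  end

  by_uri.select { |_uri, holders| holders.size > 1 }
end

# Sample data, invented for the example.
units = [
  { id: "u:a", references: [{ authority: "qudt", uri: "http://example.org/qudt/unit/M" }] },
  { id: "u:b", references: [{ authority: "qudt", uri: "http://example.org/qudt/unit/M" }] },
  { id: "u:c", references: [{ authority: "si", uri: "http://example.org/si/second" }] },
  { id: "u:d" }
]

duplicate_qudt_references(units).each do |uri, holders|
  listing = holders.map { |h| "#{h[:id]} (index #{h[:index]})" }.join(", ")
  puts "#{uri}: #{listing}"
end
```

Running the same pass once per entity type (units, quantities, and so on) keeps the check "per entity type", so a unit and a quantity may still share a URI without being flagged.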
345
+
228
346
  === Examples of validation commands
229
347
 
230
348
  * Check identifiers for uniqueness:
@@ -341,27 +459,27 @@ entities
341
459
  [source,sh]
342
460
  ----
343
461
  # Check all entity types and generate a report
344
- $ unitsdb check_si --database=spec/fixtures/unitsdb --ttl-dir=spec/fixtures/bipm-si-ttl
462
+ $ unitsdb check_si --database=data --ttl-dir=spec/fixtures/bipm-si-ttl
345
463
 
346
464
  # Check a specific entity type (units, quantities, or prefixes)
347
465
  $ unitsdb check_si --entity-type=units \
348
- --database=spec/fixtures/unitsdb \
466
+ --database=data \
349
467
  --ttl-dir=spec/fixtures/bipm-si-ttl
350
468
 
351
469
  # Check in a specific direction only
352
470
  $ unitsdb check_si --direction=from_si \
353
- --database=spec/fixtures/unitsdb \
471
+ --database=data \
354
472
  --ttl-dir=spec/fixtures/bipm-si-ttl
355
473
 
356
474
  # Update references and write to output directory
357
475
  $ unitsdb check_si --output-updated-database=new_unitsdb \
358
- --database=spec/fixtures/unitsdb \
476
+ --database=data \
359
477
  --ttl-dir=spec/fixtures/bipm-si-ttl
360
478
 
361
479
  # Include potential matches when updating references (default: false)
362
480
  $ unitsdb check_si --include-potential-matches \
363
481
  --output-updated-database=new_unitsdb \
364
- --database=spec/fixtures/unitsdb \
482
+ --database=data \
365
483
  --ttl-dir=spec/fixtures/bipm-si-ttl
366
484
  ----
367
485
 
@@ -417,16 +535,16 @@ There are two commands:
417
535
  [source,sh]
418
536
  ----
419
537
  # Check all entity types and generate a report
420
- $ unitsdb ucum check --database=spec/fixtures/unitsdb --ucum-file=spec/fixtures/ucum/ucum-essence.xml
538
+ $ unitsdb ucum check --database=data --ucum-file=spec/fixtures/ucum/ucum-essence.xml
421
539
 
422
540
  # Check a specific entity type (units or prefixes)
423
541
  $ unitsdb ucum check --entity-type=units \
424
- --database=spec/fixtures/unitsdb \
542
+ --database=data \
425
543
  --ucum-file=spec/fixtures/ucum/ucum-essence.xml
426
544
 
427
545
  # Check in a specific direction only
428
546
  $ unitsdb ucum check --direction=from_ucum \
429
- --database=spec/fixtures/unitsdb \
547
+ --database=data \
430
548
  --ucum-file=spec/fixtures/ucum/ucum-essence.xml
431
549
  ----
432
550
 
@@ -454,19 +572,19 @@ references (default: false)
454
572
  [source,sh]
455
573
  ----
456
574
  # Update all entity types with UCUM references
457
- $ unitsdb ucum update --database=spec/fixtures/unitsdb \
575
+ $ unitsdb ucum update --database=data \
458
576
  --ucum-file=spec/fixtures/ucum/ucum-essence.xml \
459
577
  --output-dir=new_unitsdb
460
578
 
461
579
  # Update a specific entity type (units or prefixes)
462
580
  $ unitsdb ucum update --entity-type=units \
463
- --database=spec/fixtures/unitsdb \
581
+ --database=data \
464
582
  --ucum-file=spec/fixtures/ucum/ucum-essence.xml \
465
583
  --output-dir=new_unitsdb
466
584
 
467
585
  # Include potential matches when updating references (default: false)
468
586
  $ unitsdb ucum update --include-potential-matches \
469
- --database=spec/fixtures/unitsdb \
587
+ --database=data \
470
588
  --ucum-file=spec/fixtures/ucum/ucum-essence.xml \
471
589
  --output-dir=new_unitsdb
472
590
  ----
@@ -486,6 +604,138 @@ If not specified, all types are updated
486
604
  `--include-potential-matches`, `-p`:: Include potential matches when updating
487
605
  references (default: false)
488
606
 
607
+
608
+ ==== Check references to QUDT
609
+
610
+ Performs a comprehensive check of entities in the
611
+ https://qudt.org/3.1.2/vocab/unit[QUDT] (Quantities, Units, Dimensions and
612
+ Types) vocabularies against UnitsDB database entities and updates UnitsDB with
613
+ QUDT references.
614
+
615
+ The support of QUDT mappings in the UnitsDB is purely informative.
616
+
617
+ QUDT supports the following entity types, and they are mapped to
618
+ UnitsDB as follows:
619
+
620
+ * Units: mapped to Units in UnitsDB
621
+ * Quantity Kinds: mapped to Quantities in UnitsDB
622
+ * Dimension Vectors: mapped to Dimensions in UnitsDB
623
+ * Systems of Units: mapped to Unit Systems in UnitsDB
624
+ * Prefixes: mapped to Prefixes in UnitsDB
625
+ * (Physical Constants: not supported in UnitsDB)
626
+ * (Systems of Quantity Kinds: not supported in UnitsDB)
627
+
628
+ The QUDT Vocabulary is very extensive and includes many entities that are not
629
+ reflected in the UnitsDB database, with the following categories:
630
+
631
+ * Many composed units in QUDT are omitted from UnitsDB for separation of
632
+ concerns;
633
+
634
+ * Some quantities are not included in UnitsDB for being less commonly used;
635
+ (e.g. "Deaths per million")
636
+
637
+
638
+ This combined command checks in both directions to ensure UnitsDB supports
639
+ every QUDT entity.
640
+
641
+ * From QUDT to UnitsDB: Ensures every QUDT entity is referenced by at least one
642
+ UnitsDB entity
643
+
644
+ * From UnitsDB to QUDT: Identifies UnitsDB entities that should reference QUDT
645
+ entities
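The two directions listed above reduce to two set comparisons. The sketch below is a simplification and not the `qudt check` implementation: it treats a UnitsDB entity with no QUDT URI as one that "should reference QUDT", whereas the real command matches entities by name and symbol. `check_references` and the `qudt_uris` field are hypothetical; the direction values mirror the documented `--direction` option.

```ruby
require "set"

# direction is :to_qudt, :from_qudt, or :both.
def check_references(db_entities, qudt_uris, direction: :both)
  referenced = db_entities.flat_map { |e| e[:qudt_uris] }.to_set
  report = {}

  # From QUDT to UnitsDB: QUDT entities that no UnitsDB entity points at.
  if %i[from_qudt both].include?(direction)
    report[:unreferenced_qudt] = qudt_uris.reject { |uri| referenced.include?(uri) }
  end

  # From UnitsDB to QUDT: UnitsDB entities that carry no QUDT reference.
  if %i[to_qudt both].include?(direction)
    report[:missing_reference] =
      db_entities.select { |e| e[:qudt_uris].empty? }.map { |e| e[:id] }
  end

  report
end

# Sample data, invented for the example.
db = [
  { id: "u:metre", qudt_uris: ["urn:qudt:M"] },
  { id: "u:custom", qudt_uris: [] }
]
qudt = ["urn:qudt:M", "urn:qudt:SEC"]

p check_references(db, qudt)
p check_references(db, qudt, direction: :from_qudt)
```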
646
+
647
+ There are two commands:
648
+
649
+ * `qudt check`: Checks for matches between UnitsDB and QUDT entities and reports results
650
+
651
+ * `qudt update`: Updates UnitsDB entities with references to matching QUDT entities
652
+
653
+ [source,sh]
654
+ ----
655
+ # Check all entity types and generate a report
656
+ $ unitsdb qudt check --database=data
657
+
658
+ # Check a specific entity type (units, quantities, dimensions, or unit_systems)
659
+ $ unitsdb qudt check --entity-type=units \
660
+ --database=data
661
+
662
+ # Use local TTL files instead of downloading from QUDT.org
663
+ $ unitsdb qudt check --ttl-dir=/path/to/qudt/ttl/files \
664
+ --database=data
665
+
666
+ # Check in a specific direction only
667
+ $ unitsdb qudt check --direction=from_qudt \
668
+ --database=data
669
+
670
+ # Include potential matches in the output
671
+ $ unitsdb qudt check --include-potential-matches \
672
+ --database=data
673
+
674
+ # Output updated database files
675
+ $ unitsdb qudt check --output-dir=/path/to/output \
676
+ --database=data
677
+ ----
678
+
679
+ Options:
680
+
681
+ `--database`, `-d`:: Path to UnitsDB database (required)
682
+
683
+ `--ttl-dir`, `-t`:: Path to the directory containing QUDT TTL files (optional,
684
+ downloads from QUDT.org if not specified)
685
+
686
+ `--entity-type`, `-e`:: Entity type to check (units, quantities, dimensions, or
687
+ unit_systems). If not specified, all types are checked.
688
+
689
+ `--direction`, `-r`:: Direction to check: `to_qudt` (UnitsDB→QUDT), `from_qudt`
690
+ (QUDT→UnitsDB), or `both` (default)
691
+
692
+ `--output-dir`, `-o`:: Directory path to write updated YAML files with added
693
+ QUDT references
694
+
695
+ `--include-potential-matches`, `-p`:: Include potential matches when updating
696
+ references (default: false)
697
+
698
+
699
+ ==== Update QUDT references
700
+
701
+ [source,sh]
702
+ ----
703
+ # Update all entity types with QUDT references
704
+ $ unitsdb qudt update --database=data \
705
+ --output-dir=new_unitsdb
706
+
707
+ # Update a specific entity type (units, quantities, dimensions, or unit_systems)
708
+ $ unitsdb qudt update --entity-type=units \
709
+ --database=data \
710
+ --output-dir=new_unitsdb
711
+
712
+ # Use local TTL files instead of downloading
713
+ $ unitsdb qudt update --ttl-dir=/path/to/qudt/ttl/files \
714
+ --database=data \
715
+ --output-dir=new_unitsdb
716
+
717
+ # Include potential matches when updating references (default: false)
718
+ $ unitsdb qudt update --include-potential-matches \
719
+ --database=data \
720
+ --output-dir=new_unitsdb
721
+ ----
722
+
723
+ Options:
724
+
725
+ `--database`, `-d`:: Path to UnitsDB database (required)
726
+
727
+ `--ttl-dir`, `-t`:: Path to the directory containing QUDT TTL files (optional,
728
+ downloads from QUDT.org if not specified)
729
+
730
+ `--entity-type`, `-e`:: Entity type to update (units, quantities, dimensions, or
731
+ unit_systems). If not specified, all types are updated
732
+
733
+ `--output-dir`, `-o`:: Directory path to write updated YAML files (defaults to
734
+ database path)
735
+
736
+ `--include-potential-matches`, `-p`:: Include potential matches when updating
737
+ references (default: false)
738
+
489
739
  ==== Release
490
740
 
491
741
  Creates release files for UnitsDB in unified formats:
@@ -643,15 +893,15 @@ symbol-to-symbol and partial matches always classified as potential
643
893
 
644
894
  === Loading the database
645
895
 
646
- The primary way to load the UnitsDB data is through the `Database.from_db`
647
- method, which reads data from YAML files:
896
+ The UnitsDB gem ships with the UnitsDB data bundled inside the gem. You can load
897
+ the database using the convenience method:
648
898
 
649
899
  [source,ruby]
650
900
  ----
651
901
  require 'unitsdb'
652
902
 
653
- # Load from the UnitsDB data directory
654
- db = Unitsdb::Database.from_db('/path/to/unitsdb/data')
903
+ # Load the bundled UnitsDB data (all entity types pre-loaded)
904
+ db = Unitsdb.database
655
905
 
656
906
  # Access different collections
657
907
  units = db.units
@@ -661,6 +911,19 @@ quantities = db.quantities
661
911
  unit_systems = db.unit_systems
662
912
  ----
663
913
 
914
+ Alternatively, you can load from a specific path using `Database.from_db`:
915
+
916
+ [source,ruby]
917
+ ----
918
+ require 'unitsdb'
919
+
920
+ # Load from the bundled data directory
921
+ db = Unitsdb::Database.from_db(Unitsdb.data_dir)
922
+
923
+ # Load from an external directory
924
+ external_db = Unitsdb::Database.from_db('/path/to/custom/unitsdb/data')
925
+ ----
926
+
664
927
  === Database search methods
665
928
 
666
929
  The UnitsDB Ruby gem provides several methods for searching and retrieving
@@ -821,13 +1084,17 @@ The `Quantity` class represents physical quantities that can be measured:
821
1084
 
822
1085
  === Database files
823
1086
 
824
- The `Database.from_db` method reads the following YAML files:
1087
+ The UnitsDB gem bundles the following YAML files from the
1088
+ https://github.com/unitsml/unitsdb[UnitsDB repository]. They are included in the
1089
+ gem under the `data/` directory and are available immediately after installation
1090
+ without any additional setup.
825
1091
 
826
1092
  * `prefixes.yaml` - Contains prefix definitions (e.g., kilo-, mega-)
827
1093
  * `dimensions.yaml` - Contains dimension definitions (e.g., length, mass)
828
1094
  * `units.yaml` - Contains unit definitions (e.g., meter, kilogram)
829
1095
  * `quantities.yaml` - Contains quantity definitions (e.g., length, mass)
830
1096
  * `unit_systems.yaml` - Contains unit system definitions (e.g., SI, Imperial)
1097
+ * `scales.yaml` - Contains scale definitions
831
1098
 
832
1099
 
833
1100