iev 0.4.4 → 0.4.6
- checksums.yaml +4 -4
- data/.github/workflows/rake.yml +7 -4
- data/.github/workflows/release.yml +2 -0
- data/.gitignore +2 -0
- data/.rubocop.yml +4 -1
- data/.rubocop_todo.yml +98 -21
- data/CLAUDE.md +17 -5
- data/Gemfile +8 -4
- data/README.adoc +395 -10
- data/exe/iev +1 -1
- data/iev.gemspec +3 -2
- data/lib/iev/cli/command.rb +3 -2
- data/lib/iev/cli/command_helper.rb +1 -2
- data/lib/iev/cli/ui.rb +5 -5
- data/lib/iev/config.rb +1 -15
- data/lib/iev/data_source.rb +4 -2
- data/lib/iev/db_writer.rb +1 -0
- data/lib/iev/exporter.rb +182 -10
- data/lib/iev/iev_code.rb +80 -0
- data/lib/iev/iso_639_code.rb +2 -1
- data/lib/iev/relaton_db.rb +1 -1
- data/lib/iev/scraper/browser.rb +90 -88
- data/lib/iev/scraper.rb +5 -4
- data/lib/iev/section.rb +37 -0
- data/lib/iev/source_parser.rb +57 -11
- data/lib/iev/subject_area.rb +46 -0
- data/lib/iev/subject_area_concepts.rb +60 -35
- data/lib/iev/subject_areas.rb +72 -33
- data/lib/iev/supersession_parser.rb +1 -2
- data/lib/iev/term_attrs_parser.rb +1 -1
- data/lib/iev/term_builder.rb +14 -9
- data/lib/iev/utilities.rb +29 -1
- data/lib/iev/version.rb +1 -1
- data/lib/iev.rb +43 -11
- metadata +26 -22
data/README.adoc
CHANGED
@@ -228,7 +228,7 @@ There are these data types inside the term attribute field. Make sure you split
 We need to parse out all NOTEs and EXAMPLEs and normalize them.
 
 For all `This links to <a href=IEV112-01-01>quantity</a>`, we parse them and replace with:
-`This links to {{
+`This links to {{IEV:112-01-01, quantity}}`.
 
 e.g.
 
@@ -300,9 +300,9 @@ notes:
 
 [source,yaml]
 ----
-definition: {{
+definition: {{IEV:112-01-01, quantity}} which keeps the same value under particular circumstances, or which results from theoretical considerations
 examples:
-- {{
+- {{IEV:103-05-26, time constant}}, equilibrium constant for a chemical reaction, {{IEV:112-03-09, fundamental physical constant}}.
 ----
 
 
@@ -360,6 +360,375 @@ authoritative_source:
 ----
 
 
+== Excel-to-Glossarist Column Mapping
+
+This section provides a complete mapping from every IEV Excel export column
+to the corresponding Glossarist concept model field. The IEV Excel export has
+19 columns (see <<_structure_of_the_iev_excel_export>>). Each row represents
+one *localized term entry* (one language variant of one concept).
+
+=== Glossarist Model Layers
+
+The Glossarist model organizes concept data into two layers:
+
+* *ManagedConcept* — the concept entry itself (identity, domain classification,
+cross-concept relationships, lifecycle)
+* *LocalizedConcept* — a language-specific variant of a concept (designations,
+definition, notes, examples, sources)
+
+One IEV Excel row produces one `LocalizedConcept`, which is attached to its
+`ManagedConcept` (identified by `IEVREF`).
+
+=== Column-by-Column Mapping
+
+The table below maps each of the 19 Excel columns to the Glossarist model.
+
+[cols="15h,25h,15h,45h",options="header"]
+|===
+| Excel Column | Glossarist Path | Data Type | Notes
+
+| `IEVREF`
+| `ManagedConceptData#id`
+| `String`
+| The concept identifier (e.g. `103-01-02`). Also set as `LocalizedConcept#id` and `ConceptData#id`. Used to group multiple language rows into one `ManagedConcept`. The IEVREF pattern `AAA-BB-CC` is also used to derive domain references (see <<_derived-fields>>).
+
+| `LANGUAGE`
+| `ConceptData#language_code`
+| `String` (ISO 639-2/3)
+| Two-character code (e.g. `en`, `fr`) converted to three-character ISO 639 code (e.g. `eng`, `fra`) via `Iev::Iso639Code`. This determines which language slot the localized concept fills.
+
+| `TERM`
+| `Designation::Expression#designation`
+| `String`
+| Primary term designation. Creates a `Designation::Expression` with `normative_status: "preferred"`. If the value is `.....` (5 dots, meaning "not available"), it is replaced with `"NA"`. The term text undergoes MathML-to-AsciiMath conversion and cross-reference expansion.
+
+| `TERMATTRIBUTE`
+| (multiple designation fields)
+| Composite string
+| Parsed by `TermAttrsParser` into multiple designation attributes. See <<_termattribute-breakdown>> for the full sub-mapping.
+
+| `SYNONYM1`
+| `Designation::Expression#designation`
+| `String`
+| Additional designation. Creates a `Designation::Expression`. Some synonyms contain multiple entries separated by `<p>`, `<b>`, `<br>` tags — each is split into a separate designation. `normative_status` comes from `SYNONYM1STATUS`.
+
+| `SYNONYM1ATTRIBUTE`
+| (multiple designation fields)
+| Composite string
+| Same parsing as `TERMATTRIBUTE`, applied to the `SYNONYM1` designation. See <<_termattribute-breakdown>>.
+
+| `SYNONYM1STATUS`
+| `Designation::Expression#normative_status`
+| `String` or nil
+| Maps to the synonym's normative status. The value is lowercased. Known localized values are mapped: e.g. `"obsoleto"` to `"deprecated"`, Cyrillic variants similarly. When nil, the synonym has no explicit status. Also used to derive `LocalizedConcept#classification` (see <<_derived-fields>>).
+
+| `SYNONYM2`
+| `Designation::Expression#designation`
+| `String`
+| Same pattern as `SYNONYM1`.
+
+| `SYNONYM2ATTRIBUTE`
+| (multiple designation fields)
+| Composite string
+| Same as `SYNONYM1ATTRIBUTE`.
+
+| `SYNONYM2STATUS`
+| `Designation::Expression#normative_status`
+| `String` or nil
+| Same as `SYNONYM1STATUS`.
+
+| `SYNONYM3`
+| `Designation::Expression#designation`
+| `String`
+| Same pattern as `SYNONYM1`.
+
+| `SYNONYM3ATTRIBUTE`
+| (multiple designation fields)
+| Composite string
+| Same as `SYNONYM1ATTRIBUTE`.
+
+| `SYNONYM3STATUS`
+| `Designation::Expression#normative_status`
+| `String` or nil
+| Same as `SYNONYM1STATUS`.
+
+| `SYMBOLE`
+| `Designation::Symbol#designation`
+| `String`
+| International math symbol. Creates a `Designation::Symbol` with `international: true`. If this column is empty, no symbol designation is created.
+
+| `DEFINITION`
+| `ConceptData#definition`, `ConceptData#examples`, `ConceptData#notes`
+| HTML string
+| The unified definition text is split by `TermBuilder#split_definition` which uses regex to detect EXAMPLE, EXEMPLE, Note N to entry, Note N a l'article, NOTE markers. Each part becomes a `DetailedDefinition` object in the corresponding collection. The content undergoes MathML-to-AsciiMath conversion and cross-reference expansion.
+
+| `SOURCE`
+| `ConceptData#sources` (via `ConceptSource`)
+| HTML string
+| Parsed by `SourceParser` into one or more `ConceptSource` objects, each with `type: "authoritative"`. The source string is split after normalization. Each source has: `status` (identical/modified/similar/related/not_equal), `origin` (a `Citation` with `ref`, `locality`, `link`, `original`), and optionally `modification` text. See <<_source-parsing>>.
+
+| `PUBLICATIONDATE`
+| `ConceptData#dates` (via `ConceptDate`)
+| `String` (YYYY-MM or YYYY-MM-DD)
+| Converted to a full ISO 8601 datetime. Creates two `ConceptDate` entries: `{type: "accepted", date: ...}` and `{type: "amended", date: ...}`. Also sets `ConceptData#review_date` and `ConceptData#review_decision_date` to the same value.
+
+| `STATUS`
+| `LocalizedConcept#entry_status`
+| `String`
+| Only `Standard` is known; it maps to `"valid"`. Lowercased and matched.
+
+| `REPLACES`
+| `ConceptData#related` (via `RelatedConcept`)
+| `String`
+| Parsed by `SupersessionParser`. Expected format: `IEVREF:VERSION` (e.g. `881-01-23:1983-01`). Creates a `RelatedConcept` with `type: "supersedes"` and a `Citation` containing `{source: "IEV", id: "...", version: "..."}`.
+
+|===
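
Two of the simpler rules in the column mapping table can be sketched in a few lines of Ruby. This is only an illustration of the documented behaviour for `TERM` and `REPLACES`, not the gem's implementation: `normalize_term` and `parse_replaces` are hypothetical helper names, and plain hashes stand in for the Glossarist `RelatedConcept` and `Citation` objects.

```ruby
# Hypothetical helpers illustrating two rows of the mapping table.
# The gem's real logic lives in TermBuilder and SupersessionParser.

# Five dots in the TERM column mean "not available".
NOT_AVAILABLE_MARKER = "....."

# Returns "NA" for the not-available marker, otherwise the trimmed term.
def normalize_term(raw)
  term = raw.strip
  term == NOT_AVAILABLE_MARKER ? "NA" : term
end

# Splits a REPLACES value of the form "IEVREF:VERSION" into a plain hash
# shaped like the documented supersession relation.
def parse_replaces(raw)
  return nil if raw.nil? || raw.strip.empty?

  id, version = raw.strip.split(":", 2)
  { type: "supersedes", ref: { source: "IEV", id: id, version: version } }
end

p normalize_term(".....")
p parse_replaces("881-01-23:1983-01")
```

Returning `nil` for an empty `REPLACES` cell is an assumption made here so that callers can skip rows without a supersession entry.
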
+
+
+[[_termattribute-breakdown]]
+=== TERMATTRIBUTE Sub-Field Mapping
+
+The `TERMATTRIBUTE` column is a composite string parsed by `TermAttrsParser`.
+It may contain multiple attributes separated by semicolons. The parser extracts
+them in order: gender, plurality, geographical area, part of speech, usage
+info, prefix.
+
+[cols="15h,30h,55h",options="header"]
+|===
+| Parsed Value | Glossarist Path | Notes
+
+| `m`, `f`, `n`
+| `GrammarInfo#gender` (via `Designation::Expression#grammar_info`)
+| Grammatical gender. May appear inside brackets: `(m)`, `[f]`.
+
+| `pl`
+| `GrammarInfo#number` (via `Designation::Expression#grammar_info`)
+| Plurality. `pl` maps to `"plural"`. If gender was found but not `pl`, defaults to `"singular"`.
+
+| `adj`, `noun`, `verb`
+| `GrammarInfo#part_of_speech`
+| Part of speech. Localized variants are mapped: German `Adjektiv` to `adj`, Japanese and Korean variants similarly.
+
+| Angle bracket text (ASCII or full-width)
+| `Designation::Expression#usage_info`
+| Usage info / domain indicator extracted from angle brackets. Full-width brackets used in some CJK terms.
+
+| Prefix keywords in multiple languages
+| `Designation::Expression#prefix`
+| Marks the designation as a prefix. Keywords include German, French, Japanese, Korean, Chinese, Portuguese variants.
+
+| Two-letter uppercase (e.g. `CA`, `US`)
+| `Designation::Base#geographical_area`
+| ISO 3166-1 alpha-2 country code.
+
+|===
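
The gender and plurality rows of the table above can be illustrated with a small sketch. The regular expressions below are assumptions chosen to match the documented examples (`m`, `(m)`, `[f]`, `pl`); they are not taken from `TermAttrsParser`, and `parse_gender_and_number` is a hypothetical name.

```ruby
# Hypothetical sketch of the gender/number rule from the table above.
# A gender marker is a standalone m, f or n (possibly inside brackets);
# "pl" marks plural; gender without "pl" defaults to singular.
STANDALONE_GENDER = /(?<![[:alpha:]])([mfn])(?![[:alpha:]])/
STANDALONE_PLURAL = /(?<![[:alpha:]])pl(?![[:alpha:]])/

def parse_gender_and_number(attribute)
  gender = attribute[STANDALONE_GENDER, 1]
  number =
    if attribute.match?(STANDALONE_PLURAL)
      "plural"
    elsif gender
      "singular"
    end

  { gender: gender, number: number }
end

p parse_gender_and_number("(m)")
p parse_gender_and_number("f pl")
p parse_gender_and_number("noun")
```

The lookarounds keep letters inside longer tokens such as `noun` or `adj` from being misread as gender markers.
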
+
+
+[[_source-parsing]]
+=== SOURCE Column Parsing
+
+The `SOURCE` column is the most complex field. It is parsed by `SourceParser`
+into one or more `ConceptSource` objects.
+
+==== Relationship Status Detection
+
+The parser detects the source relationship type from textual markers:
+
+[cols="20h,20h,60h",options="header"]
+|===
+| Marker | Status | Notes
+
+| Not-equal sign
+| `not_equal`
+| Definition differs from source.
+
+| Approximately-equal sign
+| `similar`
+| Definition is similar to source.
+
+| `see`, `voir`
+| `related`
+| Cross-reference to another definition.
+
+| `MOD`, `modified`, `modifie` (with accent)
+| `modified`
+| Definition modified from source. Modification text is captured in `ConceptSource#modification`.
+
+| (default)
+| `identical`
+| No special marker found.
+
+|===
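
The marker table above amounts to a short decision chain, sketched below under stated assumptions. The exact characters (U+2260 for not-equal, U+2248 for approximately-equal), the order in which markers are checked, and the name `detect_source_status` are all assumptions; the README does not spell them out and `SourceParser` may differ.

```ruby
# Hypothetical sketch of relationship status detection.
# Assumed code points: U+2260 (not-equal), U+2248 (approximately-equal).
# The precedence of the checks is also an assumption.
def detect_source_status(source)
  case source
  when /\u2260/ then "not_equal"
  when /\u2248/ then "similar"
  when /\b(?:see|voir)\b/i then "related"
  when /\b(?:MOD\b|modifi)/i then "modified" # MOD, modified, modifié
  else "identical"
  end
end

p detect_source_status("IEC 60050-121, MOD")
p detect_source_status("voir CEI 60050-151")
p detect_source_status("IEC 60050-121:1998, 121-11-01")
```

Matching on the stem `modifi` covers both the English and the accented French spelling without depending on how word boundaries treat non-ASCII letters.
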
+
+==== Source Reference Extraction
+
+The parser normalizes and extracts the source reference (e.g. `IEC 60050-121`),
+the clause locality (e.g. `151-12-05`), and optionally resolves a URL via
+Relaton. Reference normalization handles many localized forms: `CEI` to `IEC`,
+`UIT` to `ITU`, `VEI` to `IEV`, etc.
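
As a rough illustration of the normalization step, the three localized abbreviations named above can be rewritten with a lookup table. Only these three pairs come from the README; the constant and method names are hypothetical, and the real `SourceParser` handles many more forms.

```ruby
# Hypothetical sketch: map localized organisation abbreviations to their
# English forms. Only the three pairs named in the README are included.
LOCALIZED_ABBREVIATIONS = {
  "CEI" => "IEC",
  "UIT" => "ITU",
  "VEI" => "IEV",
}.freeze

# Replaces whole-word occurrences only, so longer tokens are left alone.
def normalize_reference(reference)
  reference.gsub(/\b(?:CEI|UIT|VEI)\b/, LOCALIZED_ABBREVIATIONS)
end

p normalize_reference("CEI 60050-121")
p normalize_reference("UIT-T G.701")
```

`String#gsub` with a hash argument looks up each matched substring as a key, which keeps the mapping in one place.
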
+
+
+[[_derived-fields]]
+=== Derived Fields (Not Directly From Excel Columns)
+
+Some Glossarist model fields are *derived* from IEVREF or from combinations
+of columns during export:
+
+[cols="25h,25h,50h",options="header"]
+|===
+| Glossarist Path | Source | Notes
+
+| `ManagedConceptData#domains`
+| Derived from `IEVREF`
+| The IEVREF pattern `AAA-BB-CC` is split. Creates two `ConceptReference` objects with `ref_type: "domain"` and `source: "urn:iec:std:iec:60050"` (IEC URN per IEC URN specification): `area-AAA` and `section-AAA-BB`. For example, `103-01-02` produces `area-103` + `section-103-01`.
+
+| `ManagedConceptData#tags`
+| Derived from `IEVREF`
+| Plain string tags for grouping and filtering. Derived from the IEV subject area hierarchy: includes the area title (e.g. `"Mathematics - Functions"`) and section title (e.g. `"General concepts"`).
+
+| `LocalizedConcept#classification`
+| `SYNONYM1STATUS`
+| Maps localized classification values: Chinese/Russian/Spanish `"admitido"` to `"admitted"`, various forms of `"preferred"` similarly; other values lowercased as-is.
+
+| `ConceptData#domain`
+| Derived from `IEVREF`
+| The section-level domain URI (e.g. `section-103-01`), resolved from the `SubjectAreas` data. Falls back to area-level if section not found.
+
+| `ConceptData#review_decision_event`
+| Hard-coded
+| Always set to `"published"`.
+
+| `ConceptDate {type: "amended"}`
+| `PUBLICATIONDATE`
+| A second date entry with type `"amended"` is created alongside the `"accepted"` date, using the same publication date value.
+
+| `ManagedConcept#related`
+| Derived from `IEVREF`
+| Hierarchy relations using `broader`/`narrower`. Regular IEV concepts have `broader → section-AAA-BB`. Section concepts have `broader → area-AAA` (from SubjectAreaConcepts) and `narrower → child concepts` (from Exporter). Area concepts have `narrower → section-AAA-BB`. Each `RelatedConcept` has both `content` (string) and `ref` (Citation with source `"IEV"` and `id`) set, so the glossarist RDF transform emits `skos:broader`/`skos:narrower` triples.
+
+|===
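
The `ManagedConceptData#domains` derivation described above is mechanical enough to sketch directly. The version below uses plain hashes in place of `ConceptReference` objects, and `domain_references` is a hypothetical helper name; the URN string and the `area-AAA` / `section-AAA-BB` identifiers are taken from the table.

```ruby
# Hypothetical sketch of deriving domain references from an IEVREF.
# Plain hashes stand in for Glossarist ConceptReference objects.
IEC_60050_URN = "urn:iec:std:iec:60050"

def domain_references(ievref)
  area, section, _concept = ievref.split("-", 3)
  raise ArgumentError, "expected AAA-BB-CC, got #{ievref.inspect}" unless area && section

  [
    { concept_id: "area-#{area}", source: IEC_60050_URN, ref_type: "domain" },
    { concept_id: "section-#{area}-#{section}", source: IEC_60050_URN, ref_type: "domain" },
  ]
end

p domain_references("103-01-02")
```

For `103-01-02` this yields the two references the table names, `area-103` and `section-103-01`.
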
+
+
+=== Glossarist Model Fields NOT Populated From IEV Excel
+
+The following Glossarist model fields exist in the data model but are *not*
+populated from any IEV Excel column. They remain at their defaults:
+
+[cols="25h,60h,15h",options="header"]
+|===
+| Glossarist Field | Description | Default
+
+| `ManagedConceptData#uri`
+| External URI for the concept
+| nil
+
+| `ManagedConceptData#sources`
+| Managed-concept-level sources (distinct from localized sources)
+| empty
+
+| `ManagedConcept#dates`
+| Managed-concept-level dates (distinct from localized dates)
+| empty
+
+| `ManagedConcept#status`
+| Concept lifecycle status (draft/valid/retired etc.)
+| nil
+
+| `ConceptData#release`
+| Release version tag
+| nil
+
+| `ConceptData#lineage_source_similarity`
+| Lineage source similarity percentage
+| nil
+
+| `ConceptData#script`
+| ISO 15924 script code
+| nil
+
+| `ConceptData#system`
+| ISO 24229 conversion system code
+| nil
+
+| `ConceptData#references`
+| ConceptReference collection on localized concept
+| empty
+
+| `ConceptData#entry_status`
+| Entry status on ConceptData (duplicate of LocalizedConcept#entry_status)
+| nil
+
+| `Concept#non_verb_rep`
+| Non-verbal representations (images, tables, formulas)
+| empty
+
+| `Designation::Base#language`
+| Per-designation language override
+| nil
+
+| `Designation::Base#script`
+| Per-designation ISO 15924 script
+| nil
+
+| `Designation::Base#system`
+| Per-designation ISO 24229 system
+| nil
+
+| `Designation::Base#international`
+| International validity flag (set `true` only for SYMBOLE)
+| false
+
+| `Designation::Base#absent`
+| Explicitly absent designation flag
+| false
+
+| `Designation::Base#pronunciation`
+| Pronunciation entries (IPA, romanization, etc.)
+| empty
+
+| `Designation::Base#sources`
+| Per-designation bibliographic sources
+| empty
+
+| `Designation::Base#term_type`
+| ISO 12620 term type classification (24 values)
+| nil
+
+| `Designation::Base#related`
+| Designation-level relationships (abbreviated_form_for, short_form_for)
+| empty
+
+| `Designation::Expression#field_of_application`
+| Subject field / specific use
+| nil
+
+| `Designation::Abbreviation#acronym`
+| Acronym type flag
+| false
+
+| `Designation::Abbreviation#initialism`
+| Initialism type flag
+| false
+
+| `Designation::Abbreviation#truncation`
+| Truncation type flag
+| false
+
+| `Designation::LetterSymbol`
+| Letter symbol designation type (subclass of Symbol with `text`)
+| (not used)
+
+| `Designation::GraphicalSymbol`
+| Graphical symbol designation type (subclass of Symbol with `text`, `image`)
+| (not used)
+
+| `LocalizedConcept#review_type`
+| Review type
+| nil
+
+|===
+
+
 == Copyright and license
 
 Data copyright IEC. All others copyright Ribose.
@@ -379,9 +748,14 @@ data:
   identifier: "103-01-01"
   domains:
     - concept_id: area-103
+      source: urn:iec:std:iec:60050
       ref_type: domain
     - concept_id: section-103-01
+      source: urn:iec:std:iec:60050
       ref_type: domain
+  tags:
+    - "Mathematics - Functions"
+    - "General concepts"
 ----
 
 The `ref_type: domain` distinguishes domain references from other
@@ -390,10 +764,21 @@ The `ref_type: domain` distinguishes domain references from other
 === Subject Area Hierarchy
 
 The `SubjectAreaConcepts` module creates area and section concepts that
-form a two-level hierarchy
-
-
-
-
-
-and `
+form a two-level hierarchy with symmetric `broader`/`narrower` linkages
+at the `ManagedConcept#related` level:
+
+* **Area concepts** (e.g. `area-103`) — domain reference to themselves,
+`narrower` relations to their sections
+* **Section concepts** (e.g. `section-103-01`) — domain references to
+both parent area and themselves, `broader` relation to parent area,
+`narrower` relations to child IEV concepts (added by `Exporter`)
+* **Regular IEV concepts** (e.g. `103-01-02`) — `broader` relation to
+their section concept (added by `Exporter`)
+
+All hierarchy `RelatedConcept` entries set both `content` (string, for
+YAML serialization) and `ref` (`Citation` with `source: "IEV"` and `id`,
+for RDF transformation via glossarist's gloss ontology).
+
+Separately, `domains` (classification via `ConceptReference.domain(...)`)
+and `ConceptData#domain` (per-localization string) remain for
+classification/filtering — distinct from hierarchy.
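
The symmetric `broader`/`narrower` linkage described in the Subject Area Hierarchy hunk can be sketched with plain data structures. This is an illustration only: `relation` and `hierarchy_relations` are hypothetical names, hashes stand in for `RelatedConcept` and `Citation`, and using the target identifier as the `content` string is an assumption.

```ruby
# Hypothetical sketch of the area/section/concept hierarchy relations.
# Each relation carries both a content string and a ref, as the README
# describes; here the target identifier is reused for content.
def relation(type, target)
  { type: type, content: target, ref: { source: "IEV", id: target } }
end

# Returns a hash mapping each concept identifier to its relations.
def hierarchy_relations(ievrefs)
  relations = Hash.new { |hash, key| hash[key] = [] }

  ievrefs.each do |ievref|
    area, section, _concept = ievref.split("-", 3)
    area_id = "area-#{area}"
    section_id = "section-#{area}-#{section}"

    # Array#| keeps existing entries and skips duplicates.
    relations[area_id] |= [relation("narrower", section_id)]
    relations[section_id] |= [relation("broader", area_id), relation("narrower", ievref)]
    relations[ievref] |= [relation("broader", section_id)]
  end

  relations
end

hierarchy_relations(%w[103-01-01 103-01-02]).each do |id, rels|
  puts "#{id}: #{rels.map { |r| "#{r[:type]} #{r[:content]}" }.join(', ')}"
end
```

Every `broader` link from a child has a matching `narrower` link on its parent, which is the symmetry the README text calls out.
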
data/exe/iev
CHANGED
data/iev.gemspec
CHANGED
@@ -22,14 +22,15 @@ Gem::Specification.new do |spec|
   spec.required_ruby_version = Gem::Requirement.new(">= 3.2.0")
 
   spec.add_dependency "creek", "~> 2.6"
-  spec.add_dependency "glossarist", ">= 2.3.0"
   spec.add_dependency "ferrum", "~> 0.15"
+  spec.add_dependency "glossarist", ">= 2.8.2"
+  spec.add_dependency "lutaml-model", "~> 0.8.0"
   spec.add_dependency "nokogiri", "~> 1.19"
   spec.add_dependency "plurimath"
-  spec.add_dependency "lutaml-model", "~> 0.8.0"
   spec.add_dependency "relaton", ">= 2.0.0", "< 3"
   spec.add_dependency "sequel", "~> 5.40"
   spec.add_dependency "sqlite3", "~> 1.7"
   spec.add_dependency "thor", "~> 1.0"
   spec.add_dependency "unitsml"
+  spec.metadata["rubygems_mfa_required"] = "true"
 end
data/lib/iev/cli/command.rb
CHANGED
@@ -142,14 +142,15 @@ module Iev
         summary
       end
 
-      desc "subject_areas",
+      desc "subject_areas",
+           "Fetch IEV subject areas and sections from Electropedia."
       option :output, desc: "Output YAML file (default: stdout)", aliases: :o
       option :refresh, type: :boolean, default: false,
                        desc: "Force re-fetch even if cached"
       def subject_areas
         if options[:refresh]
           cache_path = File.join(Iev.config.cache_dir, "subject_areas.yaml")
-          FileUtils.rm_f(cache_path)
+          FileUtils.rm_f(cache_path)
         end
 
         result = Iev::SubjectAreas.fetch
data/lib/iev/cli/command_helper.rb
CHANGED
@@ -111,8 +111,7 @@ module Iev
 
         definition = entry["definition"]
         if definition
-
-          cd.definition = [Glossarist::DetailedDefinition.new(content: content)]
+          cd.definition = [Glossarist::DetailedDefinition.new(content: definition)]
         end
 
         l10n = Glossarist::LocalizedConcept.new
data/lib/iev/cli/ui.rb
CHANGED
@@ -12,12 +12,12 @@ module Iev
     module Ui
       module_function
 
-      def debug(*
-        Helper.cli_out(:debug, *
+      def debug(*)
+        Helper.cli_out(:debug, *)
       end
 
-      def warn(*
-        Helper.cli_out(:warn, *
+      def warn(*)
+        Helper.cli_out(:warn, *)
       end
 
       # Prints progress message which will be replaced on next call.
@@ -52,7 +52,7 @@ module Iev
 
       def cli_out(level, *args)
         topic = args[0].is_a?(Symbol) ? args.shift : nil
-        message = args.
+        message = args.join(" ").chomp
         ui_tag = Thread.current[:iev_ui_tag]
 
         return unless should_out?(level, topic)
data/lib/iev/config.rb
CHANGED
@@ -9,23 +9,9 @@ module Iev
     attr_accessor :data_path, :cache_dir, :remote_base_url
 
     def initialize
-      @data_path = ENV
+      @data_path = ENV.fetch("IEV_DATA_PATH", nil)
       @cache_dir = ENV["IEV_CACHE_DIR"] || File.join(Dir.tmpdir, "iev-cache")
       @remote_base_url = DEFAULT_REMOTE_BASE_URL
     end
   end
-
-  class << self
-    def config
-      @config ||= Config.new
-    end
-
-    def configure
-      yield(config) if block_given?
-    end
-
-    def reset_config!
-      @config = nil
-    end
-  end
 end
data/lib/iev/data_source.rb
CHANGED
@@ -63,7 +63,8 @@ module Iev
       path = File.join(data_path, "concept-#{code}.yaml")
       return nil unless File.exist?(path)
 
-      YAML.safe_load(File.read(path, encoding: "utf-8"),
+      YAML.safe_load(File.read(path, encoding: "utf-8"),
+                     permitted_classes: [Date, Time])
     end
 
     def from_remote(code)
@@ -101,7 +102,8 @@ module Iev
       cache_path = cache_file_path(filename)
       return nil unless File.exist?(cache_path)
 
-      YAML.safe_load(File.read(cache_path, encoding: "utf-8"),
+      YAML.safe_load(File.read(cache_path, encoding: "utf-8"),
+                     permitted_classes: [Date, Time])
     end
 
     def write_cache(filename, data)