relaton-w3c 1.20.1 → 2.0.0.pre.alpha.1
- checksums.yaml +4 -4
- data/.gitignore +1 -0
- data/.rubocop.yml +1 -1
- data/CLAUDE.md +97 -0
- data/README.adoc +33 -40
- data/bin/console +1 -1
- data/grammars/basicdoc.rng +1559 -671
- data/grammars/biblio-standoc.rng +107 -46
- data/grammars/biblio.rng +1010 -375
- data/lib/relaton/w3c/bibdata.rb +7 -0
- data/lib/relaton/w3c/bibitem.rb +7 -0
- data/lib/relaton/w3c/bibliography.rb +54 -0
- data/lib/relaton/w3c/data_fetcher.rb +102 -0
- data/lib/relaton/w3c/data_parser.rb +346 -0
- data/lib/relaton/w3c/doctype.rb +7 -0
- data/lib/relaton/w3c/ext.rb +14 -0
- data/lib/relaton/w3c/item.rb +12 -0
- data/lib/relaton/w3c/item_data.rb +6 -0
- data/lib/relaton/w3c/processor.rb +49 -0
- data/lib/relaton/w3c/pubid.rb +59 -0
- data/lib/relaton/w3c/rate_limit_handler.rb +49 -0
- data/lib/relaton/w3c/util.rb +8 -0
- data/lib/relaton/w3c/version.rb +5 -0
- data/lib/relaton/w3c.rb +25 -0
- data/relaton_w3c.gemspec +15 -6
- metadata +150 -28
- data/lib/relaton_w3c/bibxml_parser.rb +0 -24
- data/lib/relaton_w3c/data_fetcher.rb +0 -123
- data/lib/relaton_w3c/data_index.rb +0 -149
- data/lib/relaton_w3c/data_parser.rb +0 -327
- data/lib/relaton_w3c/document_type.rb +0 -16
- data/lib/relaton_w3c/hash_converter.rb +0 -16
- data/lib/relaton_w3c/hit.rb +0 -7
- data/lib/relaton_w3c/hit_collection.rb +0 -7
- data/lib/relaton_w3c/processor.rb +0 -61
- data/lib/relaton_w3c/pubid.rb +0 -57
- data/lib/relaton_w3c/rate_limit_handler.rb +0 -32
- data/lib/relaton_w3c/util.rb +0 -6
- data/lib/relaton_w3c/version.rb +0 -3
- data/lib/relaton_w3c/w3c_bibliographic_item.rb +0 -16
- data/lib/relaton_w3c/w3c_bibliography.rb +0 -55
- data/lib/relaton_w3c/xml_parser.rb +0 -29
- data/lib/relaton_w3c.rb +0 -26
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 52af233974b9c9185360e9ada6422f123a72afcd6b5115e8de0607b5c47b1443
+data.tar.gz: 84fba2eb714b7fbd72a8bcb7f9321e910631b0243dc79c72c61caf4299eb296e
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 77b68bb94eabb98522c008426be7896adf324418d1990e0a26d201eee58562557cc60e7cc50d0f6ca567db3a0222f9e82ea7230d78345b5d73324a2db7128401
+data.tar.gz: a9d22b86422dfcf55e761a7c44bd9ceec3ba5ad5ce48c076bfa820133c040bf2ba04dfe27624ffc68e6e57b0c7ec916c9645ecb6c1843f8c8ef333f507e8617f
data/.gitignore
CHANGED
data/.rubocop.yml
CHANGED
data/CLAUDE.md
ADDED
@@ -0,0 +1,97 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+relaton-w3c is a Ruby gem for retrieving and representing W3C Standards bibliographic data using the Relaton model. It is part of the larger Relaton ecosystem of gems. Uses a LutaML-based model architecture under the `Relaton::W3c` namespace.
+
+## Common Commands
+
+```bash
+# Install dependencies
+bundle install
+
+# Run all tests
+bundle exec rake spec
+
+# Run a specific test file
+bundle exec rspec spec/relaton/w3c/item_spec.rb
+
+# Run a specific test by line number
+bundle exec rspec spec/relaton/w3c/item_spec.rb:7
+
+# Lint
+bundle exec rubocop
+
+# Interactive console
+bin/console
+```
+
+## Architecture
+
+### Class Hierarchy
+
+All classes live under `lib/relaton/w3c/` in the `Relaton::W3c` namespace:
+
+**Model classes:**
+- **`Item`** (`item.rb`) — extends `Bib::Item`, adds W3C `ext` attribute. Base class for both Bibitem and Bibdata.
+- **`ItemData`** (`item_data.rb`) — LutaML data model for `Item`
+- **`Bibitem`** (`bibitem.rb`) — extends `Item`, includes `Bib::BibitemShared` (XML serialization without `<bibdata>` wrapper)
+- **`Bibdata`** (`bibdata.rb`) — extends `Item`, includes `Bib::BibdataShared` (XML serialization with `<bibdata>` wrapper)
+- **`Ext`** (`ext.rb`) — extends `Bib::Ext`, adds W3C-specific `doctype` attribute
+- **`Doctype`** (`doctype.rb`) — extends `Bib::Doctype`, restricts content to `groupNote` or `technicalReport`
+
+**Public API:**
+- **`Bibliography`** (`bibliography.rb`) — search and retrieve W3C standards from the Relaton index
+- **`Processor`** (`processor.rb`) — extends `Relaton::Core::Processor`, registers the W3C flavor (prefix `W3C`, dataset `w3c-api`)
+
+**Data fetching:**
+- **`DataFetcher`** (`data_fetcher.rb`) — extends `Core::DataFetcher`, fetches all W3C specs via the W3C API
+- **`DataParser`** (`data_parser.rb`) — converts W3C API spec objects into `Relaton::W3c::Item` instances
+- **`RateLimitHandler`** (`rate_limit_handler.rb`) — mixin for retry logic and caching of fetched API objects
+- **`PubId`** (`pubid.rb`) — parses and compares W3C document identifiers (stage, code, date parts)
+
+**Utilities:**
+- **`Util`** (`util.rb`) — extends `Relaton::Bib::Util`, sets `PROGNAME` for logging
+
+The entry module is defined in `lib/relaton/w3c.rb` and exposes `grammar_hash`.
+
+### Key Dependencies
+
+- **relaton-bib** (~> 2.0.0-alpha) — provides base `Bib::Item`, `Bib::Ext`, `Bib::Doctype` and serialization mixins (LutaML model layer)
+- **relaton-core** — provides base `Core::Processor` and `Core::DataFetcher`
+- **relaton-index** — index-based search for bibliographic references
+- **w3c_api** — W3C API client used by `DataFetcher` to retrieve specifications
+- **linkeddata/rdf/sparql** — legacy RDF dependencies (still in gemspec)
+
+### Schema Validation
+
+XML output is validated against RelaxNG grammars in `grammars/`:
+- `relaton-w3c-compile.rng` — top-level compiled grammar (includes all others)
+- `relaton-w3c.rng` — W3C-specific overrides (DocumentType restrictions)
+- `basicdoc.rng`, `biblio.rng`, `biblio-standoc.rng` — shared base schemas
+
+Tests use [Jing](https://github.com/jing-trang/jing-trang) for RelaxNG validation.
+
+### Test Structure
+
+Tests use RSpec with:
+- **Round-trip tests** — YAML/XML → object → YAML/XML, verifying lossless serialization
+- **Schema validation** — XML output validated against `grammars/relaton-w3c-compile.rng`
+- **VCR** — recorded HTTP cassettes in `spec/vcr_cassettes/` (7-day re-record interval)
+- **WebMock** — disables external HTTP in tests
+
+Test fixtures live in `spec/fixtures/` (YAML, XML, RDF files).
+
+## Style
+
+- Follows [Ribose OSS Ruby style guide](https://github.com/riboseinc/oss-guides) via RuboCop
+- Target Ruby version: 3.1
+- RuboCop config inherits from remote Ribose guide; Rails cops disabled
+
+## CI
+
+GitHub Actions workflows (auto-generated by Cimas) delegate to shared workflows in `relaton/support`:
+- `rake.yml` — runs tests on push to main and PRs
+- `release.yml` — gem versioning and publishing to RubyGems
data/README.adoc
CHANGED
@@ -1,8 +1,8 @@
-=
+= Relaton::W3c
 
 RelatonW3c is a Ruby gem that implements the https://github.com/metanorma/metanorma-model-iso#iso-bibliographic-item[IsoBibliographicItem model].
 
-You can use it to retrieve metadata of W3C Standards from https://w3.org, and access such metadata through the `
+You can use it to retrieve metadata of W3C Standards from https://w3.org, and access such metadata through the `Relaton::W3c::Item` object.
 
 == Installation
 
@@ -27,13 +27,13 @@ Or install it yourself as:
 
 [source,ruby]
 ----
-require '
+require 'relaton/w3c'
 => true
 
-item =
-[relaton-w3c] (W3C REC-json-ld11-20200716) Fetching from Relaton repository ...
-[relaton-w3c] (W3C REC-json-ld11-20200716) Found: `REC-json-ld11-20200716`
-=> #<
+item = Relaton::W3c::Bibliography.get "W3C REC-json-ld11-20200716"
+[relaton-w3c] INFO: (W3C REC-json-ld11-20200716) Fetching from Relaton repository ...
+[relaton-w3c] INFO: (W3C REC-json-ld11-20200716) Found: `W3C REC-json-ld11-20200716`
+=> #<Relaton::W3c::ItemData:0x0000000129b05b38
 ...
 ----
 
@@ -42,10 +42,12 @@ item = RelatonW3c::W3cBibliography.get "W3C REC-json-ld11-20200716"
 [source,ruby]
 ----
 item.to_xml
-=> "<bibitem id="
-<fetched>
-<
+=> "<bibitem id="W3CRECjsonld1120200716" type="standard" schema-version="v1.4.1">
+<fetched>2026-03-04</fetched>
+<formattedref>W3C REC-json-ld11-20200716</formattedref>
+<title language="en" script="Latn">JSON-LD 1.1</title>
 <uri type="src">https://www.w3.org/TR/2020/REC-json-ld11-20200716/</uri>
+<docidentifier type="W3C" primary="true">W3C REC-json-ld11-20200716</docidentifier>
 ..
 </bibitem>"
 ----
@@ -55,74 +57,65 @@ With argument `bibdata: true` it outputs XML wrapped by `bibdata` element and ad
 [source,ruby]
 ----
 item.to_xml bibdata: true
-=> "<bibdata type="standard" schema-version="v1.
-<fetched>
-<
+=> "<bibdata type="standard" schema-version="v1.4.1">
+<fetched>2026-03-04</fetched>
+<formattedref>W3C REC-json-ld11-20200716</formattedref>
+<title language="en" script="Latn">JSON-LD 1.1</title>
 <uri type="src">https://www.w3.org/TR/2020/REC-json-ld11-20200716/</uri>
+<docidentifier type="W3C" primary="true">W3C REC-json-ld11-20200716</docidentifier>
 ...
 <ext schema-version="v1.0.0">
 <doctype>technicalReport</doctype>
-<
-<technical-committee>JSON-LD Working Group</technical-committee>
-</editorialgroup>
+<flavor>w3c</flavor>
 </ext>
 </bibdata>"
 ----
 
-=== Typed links
+=== Typed source links
 
-Each W3C document has `src` type link.
+Each W3C document has `src` type source link.
 
 [source,ruby]
 ----
-item.
+item.source.first.type
 => "src"
 
-item.
-=>
+item.source.first.content
+=> "https://www.w3.org/TR/2020/REC-json-ld11-20200716/"
 ----
 
 === Create bibliographic item from XML
 [source,ruby]
 ----
-
-=> #<
+Relaton::W3c::Bibdata.from_xml File.read('spec/fixtures/cr_json_ld11.xml')
+=> #<Relaton::Bib::ItemData:0x0000000100f50018
 ...
 ----
 
 === Create bibliographic item from YAML
 [source,ruby]
 ----
-
-=>
-...
-
-bib_hash = RelatonW3c::HashConverter.hash_to_bib hash
-=> {:"schema-version"=>"v1.2.1",
-...
-
-RelatonW3c::W3cBibliographicItem.new **bib_hash
-=> #<RelatonW3c::W3cBibliographicItem:0x007f9381ec6a00
+Relaton::W3c::Item.from_yaml File.read('spec/fixtures/item.yaml')
+=> #<Relaton::W3c::ItemData:0x000000012b917450
 ...
 ----
 
 === Fetch data
 
-The method `
+The method `Relaton::W3c::DataFetcher.fetch(output: "data", format: "yaml")` converts all the documents from the W3C API and saves them to the `./data` folder in YAML format.
 Arguments:
 
 - `output` - folder to save documents (default './data').
 - `format` - the format in which the documents are saved. Possible formats are: `yaml`, `xml`, `bibxml` (default `yaml`).
 
-The
+The available dataset is:
+- `w3c-api` - The dataset is fetched from the W3C API.
 
 [source,ruby]
 ----
-
-
-
-Done in: 155 sec.
-=> nil
+require 'relaton/w3c/data_fetcher'
+
+Relaton::W3c::DataFetcher.fetch
 ----
 
 === Logging
data/bin/console
CHANGED