relaton-doi 1.20.1 → 2.0.0.pre.alpha.1
- checksums.yaml +4 -4
- data/.gitignore +1 -0
- data/.rubocop.yml +1 -1
- data/CLAUDE.md +45 -0
- data/README.adoc +24 -25
- data/lib/relaton/doi/crossref.rb +64 -0
- data/lib/relaton/doi/parser.rb +827 -0
- data/lib/relaton/doi/processor.rb +64 -0
- data/lib/relaton/doi/util.rb +8 -0
- data/lib/relaton/doi/version.rb +7 -0
- data/lib/relaton/doi.rb +24 -0
- data/relaton-doi.gemspec +8 -8
- metadata +21 -23
- data/lib/relaton_doi/crossref.rb +0 -62
- data/lib/relaton_doi/parser.rb +0 -806
- data/lib/relaton_doi/processor.rb +0 -57
- data/lib/relaton_doi/util.rb +0 -6
- data/lib/relaton_doi/version.rb +0 -5
- data/lib/relaton_doi.rb +0 -27
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 0f7e17ade3de5350425a977bc98d1138372e8bc488b1d2167f01faf9005ef25a
|
|
4
|
+
data.tar.gz: 6073344cefc01c1d6c4927e6e8d1244479e78412fe2a6161aadf5eedb150f7a6
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 828e003d8e8aee68503d099d3935cebc5c8d5efcd7fb6662734dec73e95f6b4c21011dbee3822130169c914bcdefa6a4be167d8b9d00b15fe75c21ead14530c7
|
|
7
|
+
data.tar.gz: 5c36820805071f3b77af5158ed542aaab498ba03724df8f912e6cc75ec4f8a7d9406124cc2fd26dacf03dfb5fc0d9c2f9f1fb27c7e685863a49965d64fb92496
|
data/.gitignore
CHANGED
data/.rubocop.yml
CHANGED
data/CLAUDE.md
ADDED
|
@@ -0,0 +1,45 @@
|
|
|
1
|
+
# CLAUDE.md
|
|
2
|
+
|
|
3
|
+
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
|
|
4
|
+
|
|
5
|
+
## Project
|
|
6
|
+
|
|
7
|
+
relaton-doi is a Ruby gem that fetches bibliographic metadata via DOI identifiers from the Crossref API and converts them into Relaton bibliographic objects. It detects DOI patterns to produce flavor-specific items (NIST, IETF, BIPM, IEEE) or generic `Bib::ItemData`.
|
|
8
|
+
|
|
9
|
+
## Common Commands
|
|
10
|
+
|
|
11
|
+
```bash
|
|
12
|
+
bundle exec rake spec # Run all tests (default rake task)
|
|
13
|
+
bundle exec rspec spec/relaton/doi/parser_spec.rb # Run a single spec file
|
|
14
|
+
bundle exec rspec spec/relaton/doi/parser_spec.rb:224 # Run a single example by line
|
|
15
|
+
rubocop # Lint
|
|
16
|
+
rubocop -a # Lint with auto-correct
|
|
17
|
+
```
|
|
18
|
+
|
|
19
|
+
## Architecture
|
|
20
|
+
|
|
21
|
+
**Namespace:** `Relaton::Doi` (migrated from legacy `RelatonDoi`).
|
|
22
|
+
|
|
23
|
+
**Core flow:** `Crossref.get(doi)` → HTTP fetch from api.crossref.org → `Parser.parse(json_hash)` → flavor-specific `ItemData`
|
|
24
|
+
|
|
25
|
+
Key classes in `lib/relaton/doi/`:
|
|
26
|
+
|
|
27
|
+
- **`Crossref`** — module with `get(doi)` and `get_by_id(id)`. Uses Faraday with retry logic (3 retries). Handles rate limiting via `x-rate-limit-interval` header.
|
|
28
|
+
- **`Parser`** — largest file (~827 lines). Converts Crossref JSON hashes to Relaton objects. Factory method `parse(src)` delegates to `create_bibitem` which picks the right ItemData class based on DOI pattern (`/nist/` → `Nist::ItemData`, `/rfc\d+/` → `Ietf::ItemData`, etc.). Contains ~30 `parse_*` helper methods for individual bibliographic fields.
|
|
29
|
+
- **`Processor`** — `Relaton::Processor` subclass for the Relaton registry system. Entry point for `get`, `from_xml`, `hash_to_bib`.
|
|
30
|
+
- **`Util`** — logging utility, extends `Relaton::Bib::Util` with `PROGNAME = "relaton-doi"`.
|
|
31
|
+
|
|
32
|
+
## Test Setup
|
|
33
|
+
|
|
34
|
+
- **RSpec** with `expect` syntax only (monkey patching disabled)
|
|
35
|
+
- **VCR** cassettes in `spec/vcr_cassettes/` record Crossref HTTP responses (re-recorded every 7 days)
|
|
36
|
+
- **XML fixtures** in `spec/fixtures/` — expected output XML files. The `read_fixture` helper auto-substitutes today's date into `<fetched>` tags.
|
|
37
|
+
- **equivalent-xml** gem for XML comparison in integration tests
|
|
38
|
+
- Integration tests in `spec/relaton/doi_spec.rb` cover 40+ document types via VCR cassettes
|
|
39
|
+
- Unit tests in `spec/relaton/doi/parser_spec.rb` test Parser methods directly with hash inputs
|
|
40
|
+
|
|
41
|
+
## Key Constants in Parser
|
|
42
|
+
|
|
43
|
+
- `TYPES` — maps 23 Crossref document types to Relaton types (e.g., `"book-chapter"` → `"inbook"`)
|
|
44
|
+
- `REALATION_TYPES` — maps 37 Crossref relation types to Relaton relation types
|
|
45
|
+
- `COUNTRIES` — `%w[USA]`, used by `parse_place` to distinguish country vs region
|
data/README.adoc
CHANGED
|
@@ -1,4 +1,4 @@
|
|
|
1
|
-
= Relaton
|
|
1
|
+
= Relaton::Doi retrieves bibliographic items using DOI
|
|
2
2
|
|
|
3
3
|
image:https://img.shields.io/gem/v/relaton-doi.svg["Gem Version", link="https://rubygems.org/gems/relaton-doi"]
|
|
4
4
|
image:https://github.com/relaton/relaton-doi/workflows/macos/badge.svg["Build Status (macOS)", link="https://github.com/relaton/relaton-doi/actions?workflow=macos"]
|
|
@@ -8,7 +8,7 @@ image:https://codeclimate.com/github/relaton/relaton-doi/badges/gpa.svg["Code Cl
|
|
|
8
8
|
image:https://img.shields.io/github/issues-pr-raw/relaton/relaton-doi.svg["Pull Requests", link="https://github.com/relaton/relaton-doi/pulls"]
|
|
9
9
|
image:https://img.shields.io/github/commits-since/relaton/relaton-doi/latest.svg["Commits since latest",link="https://github.com/relaton/relaton-doi/releases"]
|
|
10
10
|
|
|
11
|
-
|
|
11
|
+
Relaton::Doi is a Ruby gem that implements the
|
|
12
12
|
https://github.com/metanorma/metanorma-model-iso#iso-bibliographic-item[IsoBibliographicItem model].
|
|
13
13
|
|
|
14
14
|
You can use it to retrieve metadata of Standards from https://crossref.org, and
|
|
@@ -49,36 +49,35 @@ will be returned via the call.
|
|
|
49
49
|
|
|
50
50
|
[source,ruby]
|
|
51
51
|
----
|
|
52
|
-
require '
|
|
52
|
+
require 'relaton/doi'
|
|
53
53
|
=> true
|
|
54
54
|
|
|
55
55
|
# get NIST standard
|
|
56
|
-
|
|
57
|
-
[relaton-doi] (doi:10.6028/nist.ir.8245) Fetching from search.crossref.org ...
|
|
58
|
-
[relaton-doi] (doi:10.6028/nist.ir.8245) Found: `10.6028/nist.ir.8245`
|
|
59
|
-
=> #<
|
|
56
|
+
Relaton::Doi::Crossref.get "doi:10.6028/nist.ir.8245"
|
|
57
|
+
[relaton-doi] INFO: (doi:10.6028/nist.ir.8245) Fetching from search.crossref.org ...
|
|
58
|
+
[relaton-doi] INFO: (doi:10.6028/nist.ir.8245) Found: `10.6028/nist.ir.8245`
|
|
59
|
+
=> #<Relaton::Nist::ItemData:0x000000011040a338
|
|
60
60
|
...
|
|
61
61
|
|
|
62
62
|
# get RFC standard
|
|
63
|
-
|
|
64
|
-
[relaton-doi] (doi:10.17487/RFC0001) Fetching from search.crossref.org ...
|
|
65
|
-
[relaton-doi] (doi:10.17487/RFC0001) Found: `10.17487/rfc0001`
|
|
66
|
-
|
|
67
|
-
=> #<RelatonIetf::IetfBibliographicItem:0x00007ff2241be6d0
|
|
63
|
+
Relaton::Doi::Crossref.get "doi:10.17487/RFC0001"
|
|
64
|
+
[relaton-doi] INFO: (doi:10.17487/RFC0001) Fetching from search.crossref.org ...
|
|
65
|
+
[relaton-doi] INFO: (doi:10.17487/RFC0001) Found: `10.17487/rfc0001`
|
|
66
|
+
=> #<Relaton::Ietf::ItemData:0x0000000110be8410
|
|
68
67
|
...
|
|
69
68
|
|
|
70
69
|
# get BIPM standard
|
|
71
|
-
|
|
72
|
-
[relaton-doi] (doi:10.1088/0026-1394/29/6/001) Fetching from search.crossref.org ...
|
|
73
|
-
[relaton-doi] (doi:10.1088/0026-1394/29/6/001) Found: `10.1088/0026-1394/29/6/001`
|
|
74
|
-
=> #<
|
|
70
|
+
Relaton::Doi::Crossref.get "doi:10.1088/0026-1394/29/6/001"
|
|
71
|
+
[relaton-doi] INFO: (doi:10.1088/0026-1394/29/6/001) Fetching from search.crossref.org ...
|
|
72
|
+
[relaton-doi] INFO: (doi:10.1088/0026-1394/29/6/001) Found: `10.1088/0026-1394/29/6/001`
|
|
73
|
+
=> #<Relaton::Bipm::ItemData:0x00000001118dcbf8
|
|
75
74
|
...
|
|
76
75
|
|
|
77
76
|
# get IEEE standard
|
|
78
|
-
|
|
79
|
-
[relaton-doi] (doi:10.1109/ieeestd.2014.6835311) Fetching from search.crossref.org ...
|
|
80
|
-
[relaton-doi] (doi:10.1109/ieeestd.2014.6835311) Found: `10.1109/ieeestd.2014.6835311`
|
|
81
|
-
=> #<
|
|
77
|
+
Relaton::Doi::Crossref.get "doi:10.1109/ieeestd.2014.6835311"
|
|
78
|
+
[relaton-doi] INFO: (doi:10.1109/ieeestd.2014.6835311) Fetching from search.crossref.org ...
|
|
79
|
+
[relaton-doi] INFO: (doi:10.1109/ieeestd.2014.6835311) Found: `10.1109/ieeestd.2014.6835311`
|
|
80
|
+
=> #<Relaton::Ieee::ItemData:0x0000000110ec5c00
|
|
82
81
|
...
|
|
83
82
|
----
|
|
84
83
|
|
|
@@ -89,16 +88,16 @@ to Relaton, an instance of RelatonBib::BibliographicItem will be returned.
|
|
|
89
88
|
|
|
90
89
|
[source,ruby]
|
|
91
90
|
----
|
|
92
|
-
|
|
93
|
-
[relaton-doi] (doi:10.1109/ACCESS.2017.2739804) Fetching from search.crossref.org ...
|
|
94
|
-
[relaton-doi] (doi:10.1109/ACCESS.2017.2739804) Found: `10.1109/access.2017.2739804`
|
|
95
|
-
=> #<
|
|
91
|
+
Relaton::Doi::Crossref.get "doi:10.1109/ACCESS.2017.2739804"
|
|
92
|
+
[relaton-doi] INFO: (doi:10.1109/ACCESS.2017.2739804) Fetching from search.crossref.org ...
|
|
93
|
+
[relaton-doi] INFO: (doi:10.1109/ACCESS.2017.2739804) Found: `10.1109/access.2017.2739804`
|
|
94
|
+
=> #<Relaton::Bib::ItemData:0x0000000110f0bde0
|
|
96
95
|
...
|
|
97
96
|
----
|
|
98
97
|
|
|
99
98
|
=== Logging
|
|
100
99
|
|
|
101
|
-
|
|
100
|
+
Relaton::Doi uses the relaton-logger gem for logging. By default, it logs to STDOUT. To change the log levels and add other loggers, read the https://github.com/relaton/relaton-logger#usage[relaton-logger] documentation.
|
|
102
101
|
|
|
103
102
|
== Development
|
|
104
103
|
|
data/lib/relaton/doi/crossref.rb
ADDED
|
@@ -0,0 +1,64 @@
|
|
|
1
|
+
require "faraday"
|
|
2
|
+
|
|
3
|
+
module Relaton
|
|
4
|
+
module Doi
|
|
5
|
+
module Crossref
|
|
6
|
+
extend self
|
|
7
|
+
|
|
8
|
+
HEADER = {
|
|
9
|
+
"User-Agent" => "Relaton::Doi (https://www.relaton.org/guides/doi/; mailto:open.source@ribose.com)"
|
|
10
|
+
}.freeze
|
|
11
|
+
|
|
12
|
+
#
|
|
13
|
+
# Get a document by DOI from the CrossRef API.
|
|
14
|
+
#
|
|
15
|
+
# @param [String] doi The DOI.
|
|
16
|
+
#
|
|
17
|
+
# @return [RelatonBib::BibliographicItem, RelatonIetf::IetfBibliographicItem,
|
|
18
|
+
# RelatonBipm::BipmBibliographicItem, RelatonIeee::IeeeBibliographicItem,
|
|
19
|
+
# RelatonNist::NistBibliographicItem] The bibitem.
|
|
20
|
+
#
|
|
21
|
+
def get(doi)
|
|
22
|
+
Util.info "Fetching from search.crossref.org ...", key: doi
|
|
23
|
+
id = doi.sub(%r{^doi:}, "")
|
|
24
|
+
message = get_by_id id
|
|
25
|
+
if message
|
|
26
|
+
Util.info "Found: `#{message['DOI']}`", key: doi
|
|
27
|
+
Parser.parse message
|
|
28
|
+
else
|
|
29
|
+
Util.info "Not found.", key: doi
|
|
30
|
+
nil
|
|
31
|
+
end
|
|
32
|
+
end
|
|
33
|
+
|
|
34
|
+
#
|
|
35
|
+
# Get a document by DOI from the CrossRef API.
|
|
36
|
+
#
|
|
37
|
+
# @param [String] id The DOI.
|
|
38
|
+
#
|
|
39
|
+
# @return [Hash] The document.
|
|
40
|
+
#
|
|
41
|
+
def get_by_id(id) # rubocop:disable Metrics/AbcSize,Metrics/MethodLength
|
|
42
|
+
# resp = Serrano.works ids: id
|
|
43
|
+
n = 0
|
|
44
|
+
url = "https://api.crossref.org/works/#{CGI.escape(id)}"
|
|
45
|
+
loop do
|
|
46
|
+
resp = Faraday.get url, nil, HEADER
|
|
47
|
+
case resp.status
|
|
48
|
+
when 200
|
|
49
|
+
work = JSON.parse resp.body
|
|
50
|
+
return work["message"] if work["status"] == "ok"
|
|
51
|
+
when 404 then return nil
|
|
52
|
+
end
|
|
53
|
+
|
|
54
|
+
if n > 1
|
|
55
|
+
raise Relaton::RequestError, "Crossref error: #{resp.body}"
|
|
56
|
+
end
|
|
57
|
+
|
|
58
|
+
n += 1
|
|
59
|
+
sleep resp.headers["x-rate-limit-interval"].to_i * n
|
|
60
|
+
end
|
|
61
|
+
end
|
|
62
|
+
end
|
|
63
|
+
end
|
|
64
|
+
end
|
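The retry loop in `get_by_id` can be exercised without network access. The following is a minimal sketch, not the gem's code: `Response`, `RequestError`, `fetch_with_backoff`, and the `sleeper` parameter are hypothetical names introduced here for illustration, and the Faraday call is replaced by an injected callable. It assumes the `x-rate-limit-interval` header value parses with `to_i` (for example, `"1s"` yields 1). Read this way, the loop issues at most three requests and waits `interval * n` seconds after the n-th failed one.

```ruby
require "json"

# Hypothetical stand-in for an HTTP response; only the fields the loop reads.
Response = Struct.new(:status, :body, :headers)

# Hypothetical error class standing in for Relaton::RequestError.
class RequestError < StandardError; end

# fetch:   callable returning a Response-like object (assumption: injected
#          so that no real HTTP request is made)
# sleeper: callable receiving the number of seconds to wait
def fetch_with_backoff(fetch, sleeper: ->(seconds) { sleep seconds })
  attempt = 0
  loop do
    resp = fetch.call
    case resp.status
    when 200
      work = JSON.parse(resp.body)
      # A 200 response whose status is not "ok" falls through to the retry path.
      return work["message"] if work["status"] == "ok"
    when 404 then return nil
    end

    # Give up once two waits have already happened, i.e. on the third failure.
    raise RequestError, "Crossref error: #{resp.body}" if attempt > 1

    attempt += 1
    # A missing header gives nil.to_i == 0, so the wait is zero seconds.
    sleeper.call(resp.headers["x-rate-limit-interval"].to_i * attempt)
  end
end
```

Passing a recording lambda as `sleeper` lets a spec assert on the wait times instead of actually sleeping; the default still calls `Kernel#sleep`.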