relaton-nist 1.19.0 → 1.19.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: cd31209270fb62d1b7dabbddf7f379eb7b33a21832f3e290dd9ed49c94d715c4
4
- data.tar.gz: 8f15ad43f7d1edfd0a9135a6b750bf1414c3fd24216fd30c730586b9cd342f5c
3
+ metadata.gz: 7a4b4d939bc25c7c8968747b391784b5e0238a6961b5544d6f48cd77848addf9
4
+ data.tar.gz: f5684a7aa32f04b30fed89c1d43dc0c5247562a94bf97b1d65f6b0c80de2f054
5
5
  SHA512:
6
- metadata.gz: ea6daf8959d56b98ab2e02e63315c2ecf7bd483824ec2ad0832fc9239e32d57e556942d86edbaa4c88655904d2e04e9a05c50cb4bbf38486773ab3cb0986c851
7
- data.tar.gz: 2ca580e13556eb72455cc61f9a783d30894675d3f98630dd2de589a3a2a894380bc96a26453eb345e727eec37a7ecb8535b35aa011b34cab9580b858fc0e70ae
6
+ metadata.gz: aa2d613dc42ac689b422b73d0b930459bbfdad99ca8e0e4302a8f586931f89f90ae0189d6d052e8e003659b75df4f5a07941835d4f03f4e98ead50ebcfef738d
7
+ data.tar.gz: 19b7411e9f15891b7513dd934a678702fcfaccc80512385e2354d69ca66f2cf3617b2f8e88ce47b7141ee885dc933e59931731dec14e1b90e550964184f4ff92
data/README.adoc CHANGED
@@ -10,7 +10,8 @@ image:https://img.shields.io/github/commits-since/relaton/relaton-nist/latest.sv
10
10
 
11
11
  == Purpose
12
12
 
13
- `relaton-nist` provides bibliographic information of NIST publications using the
13
+ Relaton for NIST provides bibliographic information of NIST publications using
14
+ the
14
15
  https://github.com/metanorma/metanorma-model-nist#nist-bibliographic-item-model[NistBibliographicItem model].
15
16
 
16
17
  Relaton for NIST has been developed in cooperation with the NIST Cybersecurity
@@ -18,18 +19,25 @@ Resource Center (CSRC) and the Computer Security Division (ITL/CSD).
18
19
 
19
20
  == Data sources
20
21
 
21
- Relaton for NIST retrieves bibliographic information from two sources:
22
+ === General
22
23
 
23
- * bibliographic feed from the NIST Cybersecurity Resource Center (CSRC) of
24
- all CSRC publications (in Relaton JSON)
25
- * bibliographic dataset from the NIST Library through the Information Services
26
- Office (ISO) that contains information about all NIST Technical Publications
27
- (https://github.com/usnistgov/NIST-Tech-Pubs[GitHub])
24
+ Relaton for NIST retrieves bibliographic information from two sources,
25
+ in the following priority:
26
+
27
+ . NIST Cybersecurity Resource Center (CSRC)
28
+
29
+ . NIST Library of the NIST Information Services Office (ISO)
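Reviewer note: the priority order in this list amounts to a first-hit-wins lookup. The sketch below illustrates the idea only. The two in-memory tables and the `lookup` helper are hypothetical stand-ins, not the gem's fetchers, which this hunk does not show.

```ruby
# Hypothetical stand-ins for the two data sources; each maps a reference
# string to a record and yields nil when the source has no such entry.
CSRC_FEED = {
  "NIST SP 800-205" => { id: "NIST SP 800-205", source: :csrc },
}.freeze

LIBRARY_DATASET = {
  "NIST SP 800-205" => { id: "NIST SP 800-205", source: :library },
  "NBS CIRC 24e7sup" => { id: "NBS CIRC 24e7sup", source: :library },
}.freeze

# Sources in priority order: CSRC first, then the NIST Library.
SOURCES = [
  ->(ref) { CSRC_FEED[ref] },
  ->(ref) { LIBRARY_DATASET[ref] },
].freeze

# Return the first non-nil result, trying sources in priority order.
def lookup(ref, sources = SOURCES)
  sources.each do |source|
    hit = source.call(ref)
    return hit if hit
  end
  nil
end
```

With these tables, a reference present in both sources resolves to the CSRC record, and one present only in the library table falls through to the second source.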
30
+
31
+
32
+ === NIST Cybersecurity Resource Center (CSRC)
33
+
34
+ The NIST CSRC bibliographic feed is used as the primary source for all CSRC
35
+ publications.
28
36
 
29
37
  Bibliographic information offered through CSRC is provided with enhanced
30
- metadata that is not available in the NIST Library dataset, including:
38
+ metadata unavailable from the NIST Library dataset, including:
31
39
 
32
- * public drafts (the NIST Library dataset only contains final publications)
40
+ * versioned drafts (the NIST Library dataset only contains the latest publication: latest draft or final publication)
33
41
  * revision information: revision number, iteration
34
42
  * document stage information: retired, withdrawn, etc.
35
43
  * bibliographic dates, including issued date, updated date, published date,
@@ -37,20 +45,38 @@ metadata that is not available in the NIST Library dataset, including:
37
45
  * document relationships: supersession and replacements
38
46
  * contacts: enhanced name parts and affiliation information
39
47
 
40
- Relaton for NIST, therefore, uses the following order of priority for the data
41
- sources:
42
-
43
- . bibliographic feed from NIST CSRC
44
- . NIST Library dataset
45
-
46
- The NIST CSRC provides:
48
+ The NIST CSRC provides the following documents:
47
49
 
48
50
  * NIST SP 800-*
49
51
  * NIST SP 500-*
50
52
  * NIST SP 1800-*
51
53
  * NIST FIPS {31, 39, 41, 46, 46-$$*$$, 48, 65, 73, 74, 81, 83, 87, 102, 112, 113, 139, 140-$$*$$, 141, 171, 180, 180-$$*$$, 181, 186, 186-$$*$$, 188, 190, 191, 196, 197, 198, 198-1, 199, 200, 201, 201-*, 202}
52
54
 
53
- The NIST Library dataset provides documents listed in the https://github.com/relaton/relaton-data-nist/blob/main/index-v1.yaml[index].
55
+ As this feed is provided by CSRC in the native Relaton JSON format, no data
56
+ transform is necessary.
57
+
58
+
59
+ === NIST Library
60
+
61
+ The NIST Library offers the full NIST Technical Publications dataset at
62
+ https://github.com/usnistgov/NIST-Tech-Pubs[GitHub].
63
+
64
+ The library data is managed internally using MARC21, and the GitHub repository
65
+ offers this information via MARC21 XML and MODS (in XML).
66
+
67
+ Relaton for NIST uses the MODS dataset obtained from the following location:
68
+
69
+ * https://github.com/usnistgov/NIST-Tech-Pubs/releases[NIST-Tech-Pubs/releases],
70
+ the `allrecords-MODS.xml` file
71
+
72
+ The MODS XML file is parsed using the
73
+ https://github.com/relaton/loc_mods[`loc_mods`] library and its data mapped to
74
+ the Relaton model.
75
+
76
+ The NIST Library dataset provides documents listed in the
77
+ https://github.com/relaton/relaton-data-nist/blob/main/index-v1.yaml[index].
78
+
79
+
54
80
 
55
81
  == Installation
56
82
 
@@ -63,104 +89,85 @@ gem 'relaton-nist'
63
89
 
64
90
  And then execute:
65
91
 
66
- $ bundle
92
+ [source,sh]
93
+ ----
94
+ $ bundle
95
+ ----
67
96
 
68
97
  Or install it yourself as:
69
98
 
70
- $ gem install relaton_nist
99
+ [source,sh]
100
+ ----
101
+ $ gem install relaton_nist
102
+ ----
71
103
 
72
104
  == Usage
73
105
 
74
- === Search for a standard using keywords
106
+ === Retrieving an entry
75
107
 
108
+ ==== By NIST PubID
109
+
110
+ Relaton supports using the
111
+ https://www.nist.gov/system/files/documents/2022/04/01/PubID_Syntax_NIST_TechPubs.pdf[NIST PubID]
112
+ as the retrieval key for bibliographic entries.
113
+
114
+ NOTE: Relaton for NIST is the first public implementation of the
115
+ https://www.nist.gov/news-events/news/2021/08/nist-technical-series-publications-proposed-publication-identifier-syntax[NIST PubID].
116
+
117
+
118
+ .Retrieving a bibliographic entry using NIST PubID
119
+ [example]
120
+ ====
76
121
  [source,ruby]
77
122
  ----
78
- require 'relaton_nist'
79
- => true
123
+ > require 'relaton_nist'
124
+ =>
125
+ true
80
126
 
81
- hit_collection = RelatonNist::NistBibliography.search("NISTIR 8200")
127
+ > hit_collection = RelatonNist::NistBibliography.search("NISTIR 8200")
82
128
  [relaton-nist] (NIST IR 8200) Fetching from csrc.nist.gov ...
83
129
  [relaton-nist] (NIST IR 8200) Fetching from Relaton repository ...
84
- => <RelatonNist::HitCollection:0x00000000004b28 @ref=NIST IR 8200 @fetched=false>
130
+ =>
131
+ <RelatonNist::HitCollection:0x00000000004b28 @ref=NIST IR 8200 @fetched=false>
85
132
 
86
- item = hit_collection[0].fetch
87
- => #<RelatonNist::NistBibliographicItem:0x007fc049aa6778
88
- ...
133
+ > item = hit_collection[0].fetch
134
+ =>
135
+ #<RelatonNist::NistBibliographicItem:0x007fc049aa6778
136
+ # ...
89
137
  ----
138
+ ====
90
139
 
91
- === XML serialization
92
- [source,ruby]
93
- ----
94
- item.to_xml
95
- => "<bibitem id="NISTIR8200" type="standard" schema-version="v1.2.9">
96
- <fetched>2023-10-16</fetched>
97
- <title format="text/plain" language="en" script="Latn">Interagency report on the status of international cybersecurity standardization for the internet of things (IoT)</title>
98
- ...
99
- <bibitem>"
100
- ----
101
- With argument `bibdata: true` it outputs XML wrapped by `bibdata` element and adds flavor `ext` element.
102
- [source,ruby]
103
- ----
104
- item.to_xml bibdata: true
105
- => "<bibdata type="standard" schema-version="v1.2.9">
106
- <fetched>2023-10-16</fetched>
107
- <title format="text/plain" language="en" script="Latn">Interagency report on the status of international cybersecurity standardization for the internet of things (IoT)</title>
108
- ...
109
- <ext schema-version="v1.0.0">
110
- <doctype>standard</doctype>
111
- </ext>
112
- </bibdata>"
113
- ----
114
140
 
115
- === Get code, and year
141
+ The publication year can also be given if the publication has seen multiple
142
+ revisions or editions.
143
+
144
+ .Retrieving a bibliographic entry using NIST PubID with year
145
+ [example]
146
+ ====
116
147
  [source,ruby]
117
148
  ----
118
- RelatonNist::NistBibliography.get("NIST IR 8200", "2018")
149
+ > RelatonNist::NistBibliography.get("NIST IR 8200", "2018")
119
150
  [relaton-nist] (NIST SP 8200:2018) Fetching from csrc.nist.gov ...
120
151
  [relaton-nist] (NIST IR 8200:2018) Fetching from Relaton repository...
121
152
  [relaton-nist] (NIST IR 8200:2018) Found: `NIST IR 8200`
122
- => #<RelatonNist::NistBibliographicItem:0x00007fab74a572c0
123
- ...
153
+ =>
154
+ #<RelatonNist::NistBibliographicItem:0x00007fab74a572c0
155
+ # ...
124
156
  ----
157
+ ====
125
158
 
126
- === Get short citation
127
- A short citation is a convention about a citation's format. The format for NIST publications is:
128
- ----
129
- NIST {abbrev(series)} {docnumber} {(edition), optional} {(stage), optional}
130
- # or
131
- {abbrev(series)} {docnumber} {(edition), optional} {(stage), optional}
132
- ----
133
- - `(stage)` is empty if the state is "final" (published). In case the state is "draft" it should be:
134
- * PD for public draft
135
- * IPD for initial iteration public draft or 2PD, 3PD and so one for following iterations
136
- * FPD for final public draft
137
- - `(edition)` is the date of publication or update
138
- - `docnumber` is the full NIST number, including revision, e.g., 800-52
139
-
140
- The format for FIPS publications is:
141
- ----
142
- FIPS {docnumber}
143
- # or
144
- NIST FIPS {docnumber}
145
- ----
146
- [source,ruby]
147
- ----
148
- RelatonNist::NistBibliography.get("NIST SP 800-205 (February 2019) (IPD)")
149
- [relaton-nist] (NIST SP 800-205) Fetching from csrc.nist.gov ...
150
- [relaton-nist] (NIST SP 800-205) Found: `NIST SP 800-205 (Draft)`
151
- => #<RelatonNist::NistBibliographicItem:0x00000001105afdc8
152
- ...
153
- ----
154
159
 
155
- === Get specific part, volume, version, revision, and addendum
160
+ A NIST PubID can contain these optional parameters `{ptN}{vN}{verN}{rN}{/Add}`:
156
161
 
157
- Referehces can contain optional parameters `{ptN}{vN}{verN}{rN}{/Add}`:
158
- - Part is specified as `ptN` (SP 800-57pt1)
159
- - Volume is specified as `vN` (SP 800-60v1)
160
- - Version is specified as `verN` (SP 800-45ver2)
161
- - Revision is specified as `rN` (SP 800-40r3)
162
- - Addendum is specified as `/Add` (SP 800-38A/Add)
162
+ * Part is specified as `ptN` (SP 800-57pt1)
163
+ * Volume is specified as `vN` (SP 800-60v1)
164
+ * Version is specified as `verN` (SP 800-45ver2)
165
+ * Revision is specified as `rN` (SP 800-40r3) or `-N`
166
+ * Addendum is specified as `/Add` (SP 800-38A/Add)
163
167
 
168
+ .Retrieving a bibliographic entry with a PubID that contains revision
169
+ [example]
170
+ ====
164
171
  [source,ruby]
165
172
  ----
166
173
  item = RelatonNist::NistBibliography.get 'NIST SP 800-67r1'
@@ -171,7 +178,14 @@ item = RelatonNist::NistBibliography.get 'NIST SP 800-67r1'
171
178
 
172
179
  item.docidentifier.first.id
173
180
  => "NIST SP 800-67 Rev. 1"
181
+ ----
182
+ ====
174
183
 
184
+ .Retrieving a bibliographic entry with a PubID that specifies an addendum
185
+ [example]
186
+ ====
187
+ [source,ruby]
188
+ ----
175
189
  item = RelatonNist::NistBibliography.get 'NIST SP 800-38A/Add'
176
190
  [relaton-nist] (NIST SP 800-38A/Add) Fetching from csrc.nist.gov ...
177
191
  [relaton-nist] (NIST SP 800-38A/Add) Found: `NIST SP 800-38A-Add`
@@ -181,11 +195,132 @@ item = RelatonNist::NistBibliography.get 'NIST SP 800-38A/Add'
181
195
  item.docidentifier.first.id
182
196
  => "NIST SP 800-38A-Add"
183
197
  ----
198
+ ====
199
+
200
+ ==== By NIST PubID abbreviated form
201
+
202
+ The NIST PubID "abbreviated form" is the form of PubID printed on the inner
203
+ cover of a NIST publication.
204
+
205
+ The NIST PubID abbreviated format has the following syntax:
206
+
207
+ * `NIST {abbrev(series)} {docnumber} {(edition), optional} {(stage), optional}`, or
208
+ * `{abbrev(series)} {docnumber} {(edition), optional} {(stage), optional}`
209
+
210
+ Where,
184
211
 
185
- === Typed links
212
+ `(stage)`::
213
+ empty if the state is "final" (published).
214
+ In case the state is "draft" it should be:
215
+ `PD`::: for the public draft
216
+ `IPD`::: for the initial public draft
217
+ `{n}PD`::: for subsequent public drafts, such as 2PD, 3PD and so on.
218
+ `FPD`::: for the final public draft
186
219
 
187
- NIST documents may have `src` and `doi` link types.
220
+ `(edition)`:: the date of publication or update
221
+ `docnumber`:: the full NIST document number, e.g., 800-52.
188
222
 
223
+ .Retrieving an entry using the NIST PubID abbreviated form
224
+ [example]
225
+ ====
226
+ [source,ruby]
227
+ ----
228
+ > RelatonNist::NistBibliography.get("NIST SP 800-205 (February 2019) (IPD)")
229
+ [relaton-nist] (NIST SP 800-205) Fetching from csrc.nist.gov ...
230
+ [relaton-nist] (NIST SP 800-205) Found: `NIST SP 800-205 (Draft)`
231
+ =>
232
+ #<RelatonNist::NistBibliographicItem:0x00000001105afdc8
233
+ # ...
234
+ ----
235
+ ====
236
+
237
+ ==== By FIPS PubID
238
+
239
+ The format for FIPS publications is:
240
+
241
+ * `FIPS {docnumber}`, or
242
+ * `NIST FIPS {docnumber}`
243
+
244
+
245
+ .Retrieving an entry using the FIPS PubID
246
+ [example]
247
+ ====
248
+ [source,ruby]
249
+ ----
250
+ > RelatonNist::NistBibliography.get("NIST FIPS 140-3")
251
+ [relaton-nist] INFO: (NIST FIPS 140-3) Fetching from csrc.nist.gov ...
252
+ [relaton-nist] INFO: (NIST FIPS 140-3) Found: `NIST FIPS 140-3`
253
+ =>
254
+ # #<RelatonNist::NistBibliographicItem:0x000000013b68fb50
255
+ # ...
256
+ ----
257
+ ====
258
+
259
+
260
+ === Serializing an entry
261
+
262
+ ==== To Relaton XML
263
+
264
+ .Serializing an entry to Relaton XML
265
+ [example]
266
+ ====
267
+ [source,ruby]
268
+ ----
269
+ > item.to_xml
270
+ #=> "<bibitem id="NISTIR8200" type="standard" schema-version="v1.2.9">
271
+ # <fetched>2023-10-16</fetched>
272
+ # <title format="text/plain" language="en" script="Latn">Interagency report on the status of international cybersecurity standardization for the internet of things (IoT)</title>
273
+ # ...
274
+ # </bibitem>"
275
+ ----
276
+ ====
277
+
278
+ When the option `bibdata: true` is given, the method outputs XML wrapped by
279
+ the `bibdata` element and adds a flavor-specific `ext` element.
280
+
281
+ .Serializing an entry to Relaton XML with `bibdata: true`
282
+ [example]
283
+ ====
284
+ [source,ruby]
285
+ ----
286
+ > item.to_xml bibdata: true
287
+ =>
288
+ "<bibdata type="standard" schema-version="v1.2.9">
289
+ # <fetched>2023-10-16</fetched>
290
+ # <title format="text/plain" language="en" script="Latn">Interagency report on the status of international cybersecurity standardization for the internet of things (IoT)</title>
291
+ # ...
292
+ # <ext schema-version="v1.0.0">
293
+ # <doctype>standard</doctype>
294
+ # </ext>
295
+ # </bibdata>"
296
+ ----
297
+ ====
298
+
299
+
300
+
301
+ === Accessing attributes
302
+
303
+ ==== Identifier components
304
+
305
+ A NIST PubID can contain these optional parameters `{ptN}{vN}{verN}{rN}{/Add}`.
306
+
307
+ They can be accessed through the following methods:
308
+
309
+ * `part`
310
+ * `volume`
311
+ * `version`
312
+ * `revision`
313
+ * `addendum`
314
+
315
+
316
+ ==== Typed links
317
+
318
+ NIST publications may have `src` and `doi` links, obtained using the `#link`
319
+ method.
320
+
321
+ .Accessing links of a NIST bibliographic item
322
+ [example]
323
+ ====
189
324
  [source,ruby]
190
325
  ----
191
326
  item.link
@@ -196,8 +331,20 @@ item.link
196
331
  @type="src">,
197
332
  #<RelatonBib::TypedUri:0x00000001110879a8 @content=#<Addressable::URI:0x8d4 URI:https://doi.org/10.6028/NIST.SP.800-38A-Add>, @language=nil, @script=nil, @type="doi">]
198
333
  ----
334
+ ====
335
+
336
+
337
+ === Loading entries
199
338
 
200
- === Create bibliographic item from YAML
339
+ // ==== From XML
340
+
341
+ ==== From YAML
342
+
343
+ Offline Relaton YAML files can be loaded directly as bibliographic items.
344
+
345
+ .Loading a Relaton YAML file
346
+ [example]
347
+ ====
201
348
  [source,ruby]
202
349
  ----
203
350
  hash = YAML.load_file 'spec/examples/nist_bib_item.yml'
@@ -208,40 +355,65 @@ RelatonNist::NistBibliographicItem.from_hash hash
208
355
  => #<RelatonNist::NistBibliographicItem:0x007f8b708505b8
209
356
  ...
210
357
  ----
358
+ ====
359
+
211
360
 
212
- === Fetch data
361
+ === Fetching data
213
362
 
214
- This gem uses the https://raw.githubusercontent.com/usnistgov/NIST-Tech-Pubs/nist-pages/xml/allrecords.xml dataset as one of data sources.
363
+ On a search, the CSRC data sources are fetched and stored in the user's cache.
215
364
 
216
- The method `RelatonNist::DataFetcher.fetch(output: "data", format: "yaml")` fetches all the documents from the datast and save them to the `./data` folder in YAML format.
217
- Arguments:
365
+ This gem uses the https://github.com/usnistgov/NIST-Tech-Pubs/releases/download/May2024/allrecords-MODS.xml dataset as the NIST-Tech-Pubs data source.
218
366
 
219
- - `output` - folder to save documents (default './data').
220
- - `format` - the format in which the documents are saved. Possible formats are: `yaml`, `xml`, `bibxxml` (default `yaml`).
367
+ The following method fetches all the documents from the NIST-Tech-Pubs dataset, then saves them to the `./data` folder in YAML format.
221
368
 
369
+ .Fetching all data with default configuration
370
+ [example]
371
+ ====
222
372
  [source,ruby]
223
373
  ----
224
- RelatonNist::DataFetcher.fetch
374
+ > RelatonNist::DataFetcher.fetch
225
375
  Started at: 2021-09-01 18:01:01 +0200
226
376
  Stopped at: 2021-09-01 18:01:43 +0200
227
377
  Done in: 42 sec.
228
378
  => nil
229
379
  ----
380
+ ====
230
381
 
231
- === Logging
382
+ .Fetching all data to the `data` folder
383
+ [example]
384
+ ====
385
+ [source,ruby]
386
+ ----
387
+ > RelatonNist::DataFetcher.fetch(output: "data", format: "yaml")
388
+ ----
389
+ ====
390
+
391
+ Options:
392
+
393
+ `output`:: folder to save documents (default `./data`).
394
+ `format`:: format in which the documents are saved. Possible formats are:
395
+ `yaml`::: (default) save in Relaton YAML format
396
+ `xml`::: save in Relaton XML format
397
+ `bibxml`::: save in the IETF BibXML format
232
398
 
233
- RelatonNist uses the relaton-logger gem for logging. By default, it logs to STDOUT. To change the log levels and add other loggers, read the https://github.com/relaton/relaton-logger#usage[relaton-logger] documentation.
234
399
 
235
- == Development
400
+ === Logging
401
+
402
+ Relaton for NIST uses https://github.com/relaton/relaton-logger[relaton-logger]
403
+ for logging.
236
404
 
237
- After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
405
+ By default, it logs to `STDOUT`. To change log levels and add other loggers,
406
+ refer to the https://github.com/relaton/relaton-logger#usage[relaton-logger]
407
+ documentation.
238
408
 
239
- To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
240
409
 
241
410
  == Contributing
242
411
 
243
- Bug reports and pull requests are welcome on GitHub at https://github.com/metanorma/relaton-nist.
412
+ Bug reports and pull requests are welcome.
244
413
 
245
414
  == License
246
415
 
247
- The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
416
+ Copyright Ribose.
417
+
418
+ The gem is available as open source under the terms of the
419
+ https://opensource.org/licenses/MIT[MIT License].
@@ -1,10 +1,12 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  require "yaml"
4
+ require "loc_mods"
5
+ require_relative "mods_parser"
4
6
 
5
7
  module RelatonNist
6
8
  class DataFetcher
7
- URL = "https://raw.githubusercontent.com/usnistgov/NIST-Tech-Pubs/nist-pages/xml/allrecords.xml"
9
+ URL = "https://github.com/usnistgov/NIST-Tech-Pubs/releases/download/May2024/allrecords-MODS.xml"
8
10
 
9
11
  def initialize(output, format)
10
12
  @output = output
@@ -63,21 +65,23 @@ module RelatonNist
63
65
  t2 = Time.now
64
66
  puts "Stopped at: #{t2}"
65
67
  puts "Done in: #{(t2 - t1).round} sec."
66
- rescue StandardError => e
67
- Util.error "#{e.message}\n#{e.backtrace[0..5].join("\n")}"
68
+ # rescue StandardError => e
69
+ # Util.error "#{e.message}\n#{e.backtrace[0..5].join("\n")}"
68
70
  end
69
71
 
70
72
  def fetch_tech_pubs
71
- docs = Nokogiri::XML OpenURI.open_uri URL
72
- docs.xpath(
73
- "/body/query/doi_record/report-paper/report-paper_metadata",
74
- ).each { |doc| write_file TechPubsParser.parse(doc, series) }
73
+ docs = LocMods::Collection.from_xml OpenURI.open_uri(URL)
74
+ # docs.xpath(
75
+ # "/body/query/doi_record/report-paper/report-paper_metadata",
76
+ # )
77
+ docs.mods.each { |doc| write_file ModsParser.new(doc, series).parse }
75
78
  end
76
79
 
77
80
  def add_static_files
78
81
  Dir["./static/*.yaml"].each do |file|
79
82
  hash = YAML.load_file file
80
- write_file RelatonNist::NistBibliographicItem.from_hash(hash)
83
+ bib = RelatonNist::NistBibliographicItem.from_hash(hash)
84
+ index.add_or_update bib.docidentifier[0].id, file
81
85
  end
82
86
  end
83
87
 
@@ -0,0 +1,209 @@
1
+ module RelatonNist
2
+ class ModsParser
3
+ RELATION_TYPES = {
4
+ "otherVersion" => "editionOf",
5
+ "preceding" => "updates",
6
+ "succeeding" => "updatedBy",
7
+ }.freeze
8
+
9
+ ATTRS = %i[type docid title link abstract date doctype contributor relation place series].freeze
10
+
11
+ def initialize(doc, series)
12
+ @doc = doc
13
+ @series = series
14
+ end
15
+
16
+ # @return [RelatonNist::NistBibliographicItem]
17
+ def parse
18
+ args = ATTRS.each_with_object({}) do |attr, hash|
19
+ hash[attr] = send("parse_#{attr}")
20
+ end
21
+ NistBibliographicItem.new(**args)
22
+ end
23
+
24
+ def parse_type
25
+ "standard"
26
+ end
27
+
28
+ # @return [Array<RelatonBib::DocumentIdentifier>]
29
+ def parse_docid
30
+ [
31
+ { type: "NIST", id: pub_id, primary: true },
32
+ { type: "DOI", id: doi },
33
+ ].map { |id| RelatonBib::DocumentIdentifier.new(**id) }
34
+ end
35
+
36
+ # @return [String]
37
+ def pub_id
38
+ get_id_from_str doi
39
+ end
40
+
41
+ def get_id_from_str(str)
42
+ str.match(/\/((?:NBS|NIST).+)/)[1].gsub(".", " ")
43
+ end
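Reviewer note: `get_id_from_str` derives the NIST identifier from the DOI by capturing everything after the slash that begins with `NBS` or `NIST`, then replacing dots with spaces. The sketch below reuses the same pattern. `pub_id_from_doi` is a hypothetical name, and its nil-safe return is an assumption rather than the gem's behavior: the method in the diff calls `[1]` on the match directly, so it would raise `NoMethodError` on a string without such an identifier.

```ruby
# Same pattern as get_id_from_str in the diff.
DOI_ID_PATTERN = %r{/((?:NBS|NIST).+)}

# Nil-safe variant (assumption): returns nil when the string carries no
# NBS/NIST identifier instead of raising.
def pub_id_from_doi(doi)
  match = DOI_ID_PATTERN.match(doi)
  match && match[1].tr(".", " ")
end
```

For example, `pub_id_from_doi("10.6028/NIST.SP.800-38A-Add")` returns `"NIST SP 800-38A-Add"`, matching the identifier shown in the README example.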
44
+
45
+ # @return [String]
46
+ def doi
47
+ url = @doc.location.reduce(nil) { |m, l| m || l.url.detect { |u| u.usage == "primary display" } }
48
+ id = url.content.match(/10\.6028\/.+/)[0]
49
+ case id
50
+ when "10.6028/NBS.CIRC.sup" then "10.6028/NBS.CIRC.24e7sup"
51
+ when "10.6028/NBS.CIRC.supJun1925-Jun1926" then "10.6028/NBS.CIRC.24e7sup2"
52
+ when "10.6028/NBS.CIRC.supJun1925-Jun1927" then "10.6028/NBS.CIRC.24e7sup3"
53
+ when "10.6028/NBS.CIRC.24supJuly1922" then "10.6028/NBS.CIRC.24e6sup"
54
+ when "10.6028/NBS.CIRC.24supJan1924" then "10.6028/NBS.CIRC.24e6sup2"
55
+ else id
56
+ end
57
+ end
58
+
59
+ # @return [Array<RelatonBib::TypedTitleString>]
60
+ def parse_title
61
+ title = @doc.title_info.reduce([]) do |a, ti|
62
+ next a if ti.type == "alternative"
63
+
64
+ a += ti.title.map { |t| create_title(t, "title-main", ti.non_sort[0]) }
65
+ a + ti.sub_title.map { |t| create_title(t, "title-part") }
66
+ end
67
+ if title.size > 1
68
+ content = title.map { |t| t.title.content }.join(" - ")
69
+ title << create_title(content, "main")
70
+ elsif title.size == 1
71
+ title[0].instance_variable_set :@type, "main"
72
+ end
73
+ title
74
+ end
75
+
76
+ def create_title(title, type, non_sort = nil)
77
+ content = title.gsub("\n", " ").squeeze(" ").strip
78
+ content = "#{non_sort.content}#{content}".squeeze(" ") if non_sort
79
+ RelatonBib::TypedTitleString.new content: content, type: type, language: "en", script: "Latn"
80
+ end
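Reviewer note: `parse_title` and `create_title` together normalize whitespace, type each piece as `title-main` or `title-part`, and derive a combined `main` title. The sketch below mirrors that flow with a plain struct. `Title`, `normalize`, and `build_titles` are hypothetical stand-ins for `RelatonBib::TypedTitleString` and the MODS `title_info` traversal, and the `non_sort` prefix handling is omitted.

```ruby
# Plain struct standing in for RelatonBib::TypedTitleString (assumption).
Title = Struct.new(:content, :type)

# Collapse newlines and runs of spaces, as create_title does in the diff.
def normalize(text)
  text.gsub("\n", " ").squeeze(" ").strip
end

# Build typed titles from a main title and optional subtitles, then derive
# the combined "main" title the way parse_title does.
def build_titles(main, subtitles = [])
  titles = [Title.new(normalize(main), "title-main")]
  titles += subtitles.map { |s| Title.new(normalize(s), "title-part") }
  if titles.size > 1
    # Several pieces: append a joined "main" title and keep the parts.
    titles << Title.new(titles.map(&:content).join(" - "), "main")
  else
    # A single piece is simply retyped as "main".
    titles[0].type = "main"
  end
  titles
end
```

A record with one subtitle therefore yields three titles (`title-main`, `title-part`, `main`), while a record with no subtitle yields a single `main` title.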
81
+
82
+ def parse_link
83
+ @doc.location.map do |location|
84
+ url = location.url.first
85
+ type = url.usage == "primary display" ? "doi" : "src"
86
+ RelatonBib::TypedUri.new content: url.content, type: type
87
+ end
88
+ end
89
+
90
+ def parse_abstract
91
+ @doc.abstract.map do |a|
92
+ content = a.content.gsub("\n", " ").squeeze(" ").strip
93
+ RelatonBib::FormattedString.new content: content, language: "en", script: "Latn"
94
+ end
95
+ end
96
+
97
+ def parse_date
98
+ date = @doc.origin_info[0].date_issued.map do |di|
99
+ create_date(di, "issued")
100
+ # end + @doc.record_info[0].record_creation_date.map do |rcd|
101
+ # create_date(rcd, "created")
102
+ # end + @doc.record_info[0].record_change_date.map do |rcd|
103
+ # create_date(rcd, "updated")
104
+ end
105
+ date.compact
106
+ end
107
+
108
+ def create_date(date, type)
109
+ RelatonBib::BibliographicDate.new type: type, on: decode_date(date)
110
+ rescue Date::Error
111
+ end
112
+
113
+ def decode_date(date)
114
+ if date.encoding == "marc" && date.content.size == 6
115
+ Date.strptime(date.content, "%y%m%d").to_s
116
+ elsif date.encoding == "iso8601"
117
+ Date.strptime(date.content, "%Y%m%d").to_s
118
+ else date.content
119
+ end
120
+ end
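Reviewer note: `decode_date` handles two encodings, six-digit MARC dates (`%y%m%d`) and ISO 8601 basic dates (`%Y%m%d`), and `create_date` swallows `Date::Error` so that unparsable values drop out via `compact`. The sketch below reproduces that behavior in isolation. `ModsDate` and `safe_decode` are hypothetical names standing in for the `loc_mods` date element and the rescue in `create_date`.

```ruby
require "date"

# Stand-in for a loc_mods date element (assumption): only the two
# attributes that decode_date reads.
ModsDate = Struct.new(:content, :encoding)

# Mirror of decode_date from the diff: returns an ISO 8601 date string,
# or the raw content when no parsing rule applies.
def decode_date(date)
  if date.encoding == "marc" && date.content.size == 6
    Date.strptime(date.content, "%y%m%d").to_s
  elsif date.encoding == "iso8601"
    Date.strptime(date.content, "%Y%m%d").to_s
  else
    date.content
  end
end

# Mirror of create_date's rescue: an unparsable value becomes nil so the
# caller can drop it with compact.
def safe_decode(date)
  decode_date(date)
rescue Date::Error
  nil
end
```

Note that `%y` resolves two-digit years with Ruby's pivot (69 to 99 map to the 1900s, 00 to 68 to the 2000s), so a MARC value such as `880101` decodes to 1988. A MARC value that is not six characters long, such as a bare year, is passed through unchanged.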
121
+
122
+ def parse_doctype
123
+ RelatonBib::DocumentType.new(type: "standard")
124
+ end
125
+
126
+ def parse_contributor
127
+ # exclude primary contributors to avoid duplication
128
+ @doc.name.reject { |n| n.usage == "primary" }.map do |name|
129
+ entity, default_role = create_entity(name)
130
+ next unless entity
131
+
132
+ role = name.role.reduce([]) do |a, r|
133
+ a + r.role_term.map { |rt| { type: rt.content } }
134
+ end
135
+ role << { type: default_role } if role.empty?
136
+ RelatonBib::ContributionInfo.new entity: entity, role: role
137
+ end.compact
138
+ end
139
+
140
+ def create_entity(name)
141
+ case name.type
142
+ when "personal" then [create_person(name), "author"]
143
+ when "corporate" then [create_org(name), "publisher"]
144
+ end
145
+ end
146
+
147
+ def create_person(name)
148
+ # exclude typed name parts because they are not actual name parts
149
+ cname = name.name_part.reject(&:type).map(&:content).join(" ")
150
+ complatename = RelatonBib::LocalizedString.new cname, "en"
151
+ fname = RelatonBib::FullName.new completename: complatename
152
+ name_id = name.name_identifier[0]
153
+ identifier = RelatonBib::PersonIdentifier.new "uri", name_id.content if name_id
154
+ RelatonBib::Person.new name: fname, identifier: [identifier]
155
+ end
156
+
157
+ def create_org(name)
158
+ names = name.name_part.reject(&:type).map { |n| n.content.gsub("\n", " ").squeeze(" ").strip }
159
+ url = name.name_identifier[0]&.content
160
+ id = RelatonBib::OrgIdentifier.new "uri", url if url
161
+ RelatonBib::Organization.new name: names, identifier: [id]
162
+ end
163
+
164
+ def parse_relation
165
+ @doc.related_item.reject { |ri| ri.type == "series" }.map do |ri|
166
+ type = RELATION_TYPES[ri.type]
167
+ RelatonBib::DocumentRelation.new(type: type, bibitem: create_related_item(ri))
168
+ end
169
+ end
170
+
171
+ def create_related_item(item)
172
+ item_id = get_id_from_str related_item_id(item)
173
+ docid = RelatonBib::DocumentIdentifier.new type: "NIST", id: item_id
174
+ fref = RelatonBib::FormattedRef.new content: item_id
175
+ NistBibliographicItem.new(docid: [docid], formattedref: fref)
176
+ end
177
+
178
+ def related_item_id(item)
179
+ if item.other_type && item.other_type[0..6] == "10.6028"
180
+ item.other_type
181
+ else
182
+ item.name[0].name_part[0].content
183
+ end
184
+ end
185
+
186
+ def parse_place
187
+ @doc.origin_info.select { |p| p.event_type == "publisher"}.map do |p|
188
+ place = p.place[0].place_term[0].content
189
+ /(?<city>\w+), (?<state>\w+)/ =~ place
190
+ RelatonBib::Place.new city: city, region: create_region(state)
191
+ end
192
+ end
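Reviewer note: `parse_place` relies on Ruby assigning named capture groups to local variables when a regexp literal is on the left of `=~`. The sketch below isolates that pattern so its behavior on a few inputs is easy to check. `split_place` is a hypothetical helper, and the sample place strings are illustrative, not taken from the dataset.

```ruby
# Same pattern as parse_place in the diff. With a regexp literal on the
# left of =~, Ruby assigns the named groups to local variables.
def split_place(place)
  if /(?<city>\w+), (?<state>\w+)/ =~ place
    { city: city, state: state }
  else
    # No "City, ST" shape found: both parts stay nil.
    { city: nil, state: nil }
  end
end
```

Since `\w+` does not span spaces, a multi-word city such as "Research Triangle Park, NC" would yield only its last word as the city. Whether such values occur in the MODS data is not something this diff shows, so it may be worth checking against the dataset.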
193
+
194
+ def create_region(state)
195
+ [RelatonBib::Place::RegionType.new(iso: state)]
196
+ rescue ArgumentError
197
+ []
198
+ end
199
+
200
+ def parse_series
201
+ @doc.related_item.select { |ri| ri.type == "series" }.map do |ri|
202
+ tinfo = ri.title_info[0]
203
+ tcontent = tinfo.title[0].strip
204
+ title = RelatonBib::TypedTitleString.new content: tcontent
205
+ RelatonBib::Series.new title: title, number: tinfo.part_number[0]
206
+ end
207
+ end
208
+ end
209
+ end
@@ -123,7 +123,7 @@ module RelatonNist
123
123
  next if iter && r.status.iteration != iteration
124
124
  return { ret: r } if !year
125
125
 
126
- r.date.select { |d| d.type == "published" }.each do |d|
126
+ r.date.select { |d| d.type == "published" || d.type == "issued" }.each do |d|
127
127
  return { ret: r } if year.to_i == d.on(:year)
128
128
 
129
129
  missed_years << d.on(:year)
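Reviewer note: this hunk widens the year match from `published` dates to `published` or `issued` dates, which lines up with the new MODS parser emitting only `issued` dates. The sketch below shows the effect of the wider filter. `BibDate` and `year_matches?` are hypothetical stand-ins for the bibliographic date objects and the surrounding loop.

```ruby
# Stand-in for a bibliographic date (assumption): a type and a year.
BibDate = Struct.new(:type, :year)

# Date types accepted when matching a requested year; "issued" is the type
# the MODS parser in this diff emits.
YEAR_DATE_TYPES = %w[published issued].freeze

# True when any published or issued date falls in the requested year.
# The year may be given as a string, as in the README examples.
def year_matches?(dates, year)
  dates.any? { |d| YEAR_DATE_TYPES.include?(d.type) && d.year == year.to_i }
end
```

With the previous filter, a record carrying only an `issued` date could never satisfy a request that included a year.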
@@ -1,3 +1,3 @@
1
1
  module RelatonNist
2
- VERSION = "1.19.0".freeze
2
+ VERSION = "1.19.3".freeze
3
3
  end
data/relaton_nist.gemspec CHANGED
@@ -24,7 +24,8 @@ Gem::Specification.new do |spec|
24
24
  spec.required_ruby_version = Gem::Requirement.new(">= 2.7.0")
25
25
 
26
26
  spec.add_dependency "base64"
27
- spec.add_dependency "relaton-bib", "~> 1.19.0"
27
+ spec.add_dependency "loc_mods", "~> 0.2.0"
28
+ spec.add_dependency "relaton-bib", "~> 1.19.2"
28
29
  spec.add_dependency "relaton-index", "~> 0.2.0"
29
30
  spec.add_dependency "rubyzip"
30
31
  end
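Reviewer note: both changed dependencies use the pessimistic operator, so `~> 0.2.0` admits any 0.2.x release and `~> 1.19.2` raises the floor for `relaton-bib` within the 1.19 series. The sketch below checks this with RubyGems' own requirement classes. The `allowed?` helper and the candidate version numbers are illustrative.

```ruby
require "rubygems"

# True when the given version string satisfies the constraint string.
def allowed?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# loc_mods "~> 0.2.0": patch releases are allowed, the next minor is not.
loc_mods_patch = allowed?("~> 0.2.0", "0.2.5")
loc_mods_minor = allowed?("~> 0.2.0", "0.3.0")

# relaton-bib "~> 1.19.2": the previously allowed 1.19.0 no longer qualifies.
bib_old = allowed?("~> 1.19.2", "1.19.0")
bib_new = allowed?("~> 1.19.2", "1.19.2")

puts [loc_mods_patch, loc_mods_minor, bib_old, bib_new].inspect
```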
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: relaton-nist
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.19.0
4
+ version: 1.19.3
5
5
  platform: ruby
6
6
  authors:
7
7
  - Ribose Inc.
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2024-07-03 00:00:00.000000000 Z
11
+ date: 2024-09-06 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: base64
@@ -24,20 +24,34 @@ dependencies:
24
24
  - - ">="
25
25
  - !ruby/object:Gem::Version
26
26
  version: '0'
27
+ - !ruby/object:Gem::Dependency
28
+ name: loc_mods
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: 0.2.0
34
+ type: :runtime
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: 0.2.0
27
41
  - !ruby/object:Gem::Dependency
28
42
  name: relaton-bib
29
43
  requirement: !ruby/object:Gem::Requirement
30
44
  requirements:
31
45
  - - "~>"
32
46
  - !ruby/object:Gem::Version
33
- version: 1.19.0
47
+ version: 1.19.2
34
48
  type: :runtime
35
49
  prerelease: false
36
50
  version_requirements: !ruby/object:Gem::Requirement
37
51
  requirements:
38
52
  - - "~>"
39
53
  - !ruby/object:Gem::Version
40
- version: 1.19.0
54
+ version: 1.19.2
41
55
  - !ruby/object:Gem::Dependency
42
56
  name: relaton-index
43
57
  requirement: !ruby/object:Gem::Requirement
@@ -98,6 +112,7 @@ files:
98
112
  - lib/relaton_nist/hash_converter.rb
99
113
  - lib/relaton_nist/hit.rb
100
114
  - lib/relaton_nist/hit_collection.rb
115
+ - lib/relaton_nist/mods_parser.rb
101
116
  - lib/relaton_nist/nist_bibliographic_item.rb
102
117
  - lib/relaton_nist/nist_bibliography.rb
103
118
  - lib/relaton_nist/processor.rb