relaton-bipm 1.14.1 → 1.14.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 516c29a4d2e4ce02f6e2a1800f248f26ffa9b15c848d604ba5c100e05104acc8
4
- data.tar.gz: 32455c13d2414e272d3ae9d9fe9b948116987439a81133fe235370eedac98fdc
3
+ metadata.gz: 34d720b316dbd942e2c5d630d2ae0f07b74331e4ef07f68715e84304bad0fb13
4
+ data.tar.gz: 38d36e34b998db6e4fa9e9f1a6e5306fadacea45b9a878cd14629eaaca2ef50d
5
5
  SHA512:
6
- metadata.gz: dab2ecf54b507fad29bedd65870525f8c063d2299e7df03fd731f5382e43ecbd900931860b58c8c346ed98c3f1214242f97f795c549f20da9d0a65bb321191e3
7
- data.tar.gz: eba8ab69d65e03ee43cfb3bd8529eb56c6836244897f71deba1799ebf33c513a375bb1fb4527920d721b11bd9cd6604d253a831d9d62c5703cd480d39ed9d8b4
6
+ metadata.gz: a22261617d5c3de8aad7ed410091331698630f958332c5feb0b215b9fafe9167015530e9e6ecb71046307a6775aa7a2a0a71dd4c12e4f982cb0bd259e021a267
7
+ data.tar.gz: 376bb090dd4d273b8039357d78280c9bc4f1555918920a55b22ec72cb8be87c4ce0a8469254ffc000256fa95069a271b65b978dc767d735acd4b3e141c5dea24
data/.gitignore CHANGED
@@ -12,3 +12,4 @@
12
12
  .rubocop-https---raw-githubusercontent-com-riboseinc-oss-guides-master-ci-rubocop-yml
13
13
  Gemfile.lock
14
14
  .vscode/
15
+ rawdata-bipm-metrologia
data/Gemfile CHANGED
@@ -3,5 +3,11 @@ source "https://rubygems.org"
3
3
  # Specify your gem's dependencies in relaton_bipm.gemspec
4
4
  gemspec
5
5
 
6
+ gem "byebug"
7
+ gem "pry-byebug"
6
8
  gem "rake", "~> 13.0"
7
9
  gem "rspec", "~> 3.0"
10
+ gem "ruby-jing"
11
+ gem "simplecov"
12
+ gem "vcr"
13
+ gem "webmock"
data/README.adoc CHANGED
@@ -70,22 +70,35 @@ Allowed document names are:
70
70
 
71
71
  ==== Reference structure for Metrologia documents
72
72
 
73
- `BIPM Metrologia {JOURNAL} {VOLUME} {ISSUE} {PAGE}`
73
+ `BIPM Metrologia {JOURNAL} {VOLUME} {ISSUE}`
74
74
 
75
- - `{JOURNAL}` - number of journal, required
76
- - `{VOLUME}` - number of volume, optional
77
- - `{ISSUE}` - number of issue, optional
78
- - `{PAGE}` - number of page, optional
75
+ - `{JOURNAL}` - journal number, required
76
+ - `{VOLUME}` - volume number, optional
77
+ - `{ISSUE}` - issue number, optional
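The new three-part Metrologia reference structure can be illustrated with a minimal Ruby sketch. The helper `parse_metrologia_ref` and the `MetrologiaRef` struct are hypothetical names introduced here for illustration only; they are not part of relaton-bipm, and the gem's own lookup may work differently.

```ruby
# Hypothetical helper, not part of relaton-bipm. It splits a reference of the
# form "BIPM Metrologia {JOURNAL} {VOLUME} {ISSUE}" into named parts.
MetrologiaRef = Struct.new(:journal, :volume, :issue)

def parse_metrologia_ref(text)
  # {JOURNAL} is required; {VOLUME} and {ISSUE} are optional.
  pattern = /\A(?:BIPM\s+)?Metrologia\s+(\S+)(?:\s+(\S+))?(?:\s+(\S+))?\z/i
  m = pattern.match(text.strip)
  return nil unless m

  MetrologiaRef.new(m[1], m[2], m[3])
end

ref = parse_metrologia_ref("BIPM Metrologia 29 6 001")
p ref.to_a
```

A reference with only the required part, such as `"BIPM Metrologia 29"`, leaves `volume` and `issue` as `nil` in this sketch.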
79
78
 
80
79
  ==== Reference structures for CCTF (CCDS), CGPM, CIPM documents
81
80
 
82
81
  ===== Basic pattern
83
82
 
84
83
  ----
85
- Long: {group name} -- {type} {number} ({year})
86
- Short: {group name} -- {type-abbrev} {number} ({year}, {lang})
84
+ Long:
85
+ {group name} -- {type} {number} ({year})
86
+ {group name} {type} {number} ({year})
87
+ {group name} {type} {year}-{zero_leading_number}
88
+
89
+ Short:
90
+ {group name} -- {type-abbrev} {number} ({year}, {lang})
91
+ {group name} {type-abbrev} {number} ({year}, {lang})
87
92
  ----
88
93
 
94
+ `group name` - a name of the group, required. A full list of group names is available https://github.com/metanorma/bipm-editor-guides/blob/main/sources/bipm-outcomes-en.adoc#appendix-a-bipm-groups-and-codes[here].
95
+ `type` - a type of document, required. A list of types is: Resolution (Résolution), Recommendation (Recommandation), Decision (Décision), Meeting (Réunion), Declaration (Déclaration).
96
+ `type-abbrev` - an abbreviation of the type, required. A list of abbreviations: RES (Resolution), REC (Recommendation), DECN (Decision).
97
+ `number` - a number of the document, optional. Can be with part, e.g. `1-2`.
98
+ `zero_leading_number` - a number of the document with a leading zero, required. Can be used when a document has a 1 or 2 digits number. It's `00` for documents without a number.
99
+ `year` - a year of the document, optional.
100
+ `lang` - a language of the document, optional. Can be `EN` or `FR`.
101
+
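The long and short patterns listed above could be recognised roughly as follows. This is a sketch under stated assumptions, not the gem's parser: `parse_bipm_ref`, `TYPE`, `NUMBER_YEAR` and `YEAR_NUMBER` are hypothetical names, and the group name is assumed to be a single token such as `CGPM` or `CIPM`.

```ruby
# Hypothetical sketch, not the parser used by relaton-bipm.
# Assumption: the group name is a single token (e.g. "CGPM", "CIPM").
TYPE = /Resolution|Résolution|Recommendation|Recommandation|Decision|Décision|
        Meeting|Réunion|Declaration|Déclaration|RES|REC|DECN/x

# {group} [--] {type} {number} ({year}[, {lang}])
NUMBER_YEAR = /\A(?<group>\S+)\s+(?:--\s+)?(?<type>#{TYPE})
               (?:\s+(?<number>\d+(?:-\d+)?))?
               (?:\s+\((?<year>\d{4})(?:,\s*(?<lang>EN|FR))?\))?\z/x

# {group} {type} {year}-{zero_leading_number}
YEAR_NUMBER = /\A(?<group>\S+)\s+(?<type>#{TYPE})\s+(?<year>\d{4})-(?<number>\d{2,})\z/

def parse_bipm_ref(text)
  ref = text.sub(/\ABIPM\s+/, "")
  # Try the year-first form before the number-first form, because "2012-01"
  # would otherwise be read as a number with a part ("1-2" style).
  m = YEAR_NUMBER.match(ref) || NUMBER_YEAR.match(ref)
  m&.named_captures
end

p parse_bipm_ref("CGPM -- Resolution 1 (1889)")
p parse_bipm_ref("CIPM REC 1 (2005, EN)")
p parse_bipm_ref("CIPM Decision 2012-01")
```

The function returns a hash of the captured parts, or `nil` when the text matches neither form.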
89
102
  ===== Special case pattern
90
103
 
91
104
  The basic pattern works fine for all, except for these 2 cases:
@@ -189,9 +202,9 @@ item = RelatonBipm::BipmBibliography.get "BIPM SI Brochure"
189
202
  ...
190
203
 
191
204
  # get BIPM Metrologia page
192
- bib = RelatonBipm::BipmBibliography.get "BIPM Metrologia 29 6 373"
193
- [relaton-bipm] ("BIPM Metrologia 29 6 373") fetching...
194
- [relaton-bipm] ("BIPM Metrologia 29 6 373") found Metrologia 29 6 373
205
+ bib = RelatonBipm::BipmBibliography.get "BIPM Metrologia 29 6 001"
206
+ [relaton-bipm] ("BIPM Metrologia 29 6 001") fetching...
207
+ [relaton-bipm] ("BIPM Metrologia 29 6 001") found Metrologia 29 6 001
195
208
  => #<RelatonBipm::BipmBibliographicItem:0x007f8857f94d40
196
209
  ...
197
210
 
@@ -295,7 +308,7 @@ bib.link
295
308
  #<RelatonBib::TypedUri:0x00007fa6d6a29250 @content=#<Addressable::URI:0xc2b0 URI:https://doi.org/10.1088/0026-1394/29/6/001>, @type="doi">]
296
309
  ----
297
310
 
298
- === Create bibliographic item from XML
311
+ === Create a bibliographic item from XML
299
312
 
300
313
  [source,ruby]
301
314
  ----
@@ -304,7 +317,7 @@ RelatonBipm::XMLParser.from_xml File.read('spec/fixtures/bipm_item.xml')
304
317
  ...
305
318
  ----
306
319
 
307
- === Create bibliographic item from YAML
320
+ === Create a bibliographic item from YAML
308
321
  [source,ruby]
309
322
  ----
310
323
  hash = YAML.load_file 'spec/fixtures/bipm_item.yml'
@@ -321,6 +334,7 @@ RelatonBipm::BipmBibliographicItem.from_hash hash
321
334
  This gem uses the following datasets as data sources:
322
335
  - `bipm-data-outcomes` - looking for a local directory with the repository https://github.com/metanorma/bipm-data-outcomes
323
336
  - `bipm-si-brochure` - looking for a local directory with the repository https://github.com/metanorma/bipm-si-brochure
337
+ - `rawdata-bipm-metrologia` - looking for a local directory with the repository https://github.com/relaton/rawdata-bipm-metrologia
324
338
 
325
339
  The method `RelatonBipm::DataFetcher.fetch(source, output: "data", format: "yaml")` fetches all the documents from the dataset and saves them to the `./data` folder in YAML format.
326
340
  Arguments:
@@ -342,6 +356,12 @@ Started at: 2022-06-23 09:37:12 +0200
342
356
  Stopped at: 2022-06-23 09:37:12 +0200
343
357
  Done in: 0 sec.
344
358
  => nil
359
+
360
+ RelatonBipm::DataFetcher.fetch "rawdata-bipm-metrologia"
361
+ Started at: 2022-06-23 09:39:12 +0200
362
+ Stopped at: 2022-06-23 09:40:34 +0200
363
+ Done in: 82 sec.
364
+ => nil
345
365
  ----
346
366
 
347
367
  == Development
@@ -522,7 +522,6 @@
522
522
  <value>tip</value>
523
523
  <value>important</value>
524
524
  <value>caution</value>
525
- <value>statement</value>
526
525
  </choice>
527
526
  </define>
528
527
  <define name="figure">
data/grammars/biblio.rng CHANGED
@@ -216,6 +216,9 @@
216
216
  <optional>
217
217
  <ref name="fullname"/>
218
218
  </optional>
219
+ <zeroOrMore>
220
+ <ref name="credential"/>
221
+ </zeroOrMore>
219
222
  <zeroOrMore>
220
223
  <ref name="affiliation"/>
221
224
  </zeroOrMore>
@@ -232,6 +235,11 @@
232
235
  <ref name="FullNameType"/>
233
236
  </element>
234
237
  </define>
238
+ <define name="credential">
239
+ <element name="credential">
240
+ <text/>
241
+ </element>
242
+ </define>
235
243
  <define name="FullNameType">
236
244
  <choice>
237
245
  <group>
@@ -305,7 +313,9 @@
305
313
  <zeroOrMore>
306
314
  <ref name="affiliationdescription"/>
307
315
  </zeroOrMore>
308
- <ref name="organization"/>
316
+ <optional>
317
+ <ref name="organization"/>
318
+ </optional>
309
319
  </element>
310
320
  </define>
311
321
  <define name="affiliationname">
@@ -1316,7 +1326,7 @@
1316
1326
  <value>commentaryOf</value>
1317
1327
  <value>hasCommentary</value>
1318
1328
  <value>related</value>
1319
- <value>complements</value>
1329
+ <value>hasComplement</value>
1320
1330
  <value>complementOf</value>
1321
1331
  <value>obsoletes</value>
1322
1332
  <value>obsoletedBy</value>
@@ -3,14 +3,6 @@ require "mechanize"
3
3
  module RelatonBipm
4
4
  class BipmBibliography
5
5
  GH_ENDPOINT = "https://raw.githubusercontent.com/relaton/relaton-data-bipm/master/".freeze
6
- IOP_DOMAIN = "https://iopscience.iop.org".freeze
7
- TRANSLATIONS = {
8
- "Déclaration" => "Declaration",
9
- "Réunion" => "Meeting",
10
- "Recommandation" => "Recommendation",
11
- "Résolution" => "Resolution",
12
- "Décision" => "Decision",
13
- }.freeze
14
6
 
15
7
  class << self
16
8
  # @param text [String]
@@ -18,14 +10,13 @@ module RelatonBipm
18
10
  def search(text, _year = nil, _opts = {}) # rubocop:disable Metrics/AbcSize, Metrics/MethodLength
19
11
  warn "[relaton-bipm] (\"#{text}\") fetching..."
20
12
  ref = text.sub(/^BIPM\s/, "")
21
- item = ref.match?(/^Metrologia/i) ? get_metrologia(ref, magent) : get_bipm(ref, magent)
13
+ item = get_bipm(ref, magent)
22
14
  unless item
23
15
  warn "[relaton-bipm] (\"#{text}\") not found."
24
16
  return
25
17
  end
26
18
 
27
19
  warn("[relaton-bipm] (\"#{text}\") found #{item.docidentifier[0].id}")
28
- item.fetched = Date.today.to_s
29
20
  item
30
21
  rescue Mechanize::ResponseCodeError => e
31
22
  raise RelatonBib::RequestError, e.message unless e.response_code == "404"
@@ -48,295 +39,28 @@ module RelatonBipm
48
39
  a
49
40
  end
50
41
 
51
- # @param ref [String]
42
+ # @param reference [String]
52
43
  # @param agent [Mechanize]
53
44
  # @return [RelatonBipm::BipmBibliographicItem]
54
- def get_bipm(ref, agent) # rubocop:disable Metrics/AbcSize, Metrics/MethodLength
55
- # rf = ref.sub(/(?:(\d{1,2})\s)?\(?(\d{4})(?!-)\)?/) do
56
- # "#{$2}-#{$1.to_s.rjust(2, '0')}"
57
- # end
58
- ref.sub!("CCDS", "CCTF")
59
- # TRANSLATIONS.each { |fr, en| rf.sub! fr, en }
60
- path = Index.new.search ref
61
- return unless path
45
+ def get_bipm(reference, agent) # rubocop:disable Metrics/AbcSize, Metrics/MethodLength
46
+ ref_id = Id.new reference
47
+ index = Relaton::Index.find_or_create :BIPM, url: "#{GH_ENDPOINT}index2.zip"
48
+ rows = index.search { |r| ref_id == r[:id] }
49
+ return unless rows.any?
62
50
 
63
- url = "#{GH_ENDPOINT}#{path}"
51
+ url = "#{GH_ENDPOINT}#{rows.first[:file]}"
64
52
  resp = agent.get url
65
- check_response resp
66
53
  return unless resp.code == "200"
67
54
 
68
55
  yaml = RelatonBib.parse_yaml resp.body, [Date]
69
- # yaml["fetched"] = Date.today.to_s
56
+ yaml["fetched"] = Date.today.to_s
70
57
  bib_hash = HashConverter.hash_to_bib yaml
71
58
  BipmBibliographicItem.new(**bib_hash)
72
59
  end
73
60
 
74
- # @param ref [String]
75
- # @param agent [Mechanize]
76
- # @return [RelatonBipm::BipmBibliographicItem]
77
- def get_metrologia(ref, agent)
78
- agent.redirect_ok = false
79
- ref_arr = ref.split
80
- case ref_arr.size
81
- when 1 then get_journal agent
82
- when 2 then get_volume ref_arr[1], agent
83
- when 3 then get_issue(*ref_arr[1..2], agent)
84
- when 4 then get_article_from_issue(*ref_arr[1..3], agent)
85
- end
86
- end
87
-
88
- # @param agent [Mechanize]
89
- # @return [RelatonBipm::BipmBibliographicItem]
90
- def get_journal(agent)
91
- url = "#{IOP_DOMAIN}/journal/0026-1394"
92
- rsp = agent.get url
93
- check_response rsp
94
- rel = rsp.xpath('//select[@id="allVolumesSelector"]/option').map do |v|
95
- { type: "partOf", bibitem: journal_rel(v) }
96
- end
97
- did = doc_id []
98
- bibitem(formattedref: fref(did.id), docid: [did], link: blink(url), relation: rel)
99
- end
100
-
101
- # @param elm [Nokogiri::XML::Element]
102
- def journal_rel(elm)
103
- vol = elm[:value].split("/").last
104
- did = doc_id [vol]
105
- url = IOP_DOMAIN + elm[:value]
106
- BipmBibliographicItem.new(formattedref: fref(did.id), docid: [did], link: blink(url))
107
- end
108
-
109
- # @param vol [String]
110
- # @param agent [Mechanize]
111
- # @return [RelatonBipm::BipmBibliographicItem]
112
- def get_volume(vol, agent)
113
- url = "#{IOP_DOMAIN}/volume/0026-1394/#{vol}"
114
- rsp = agent.get url
115
- check_response rsp
116
- rel = rsp.xpath('//li[@itemprop="hasPart"]').map do |i|
117
- { type: "partOf", bibitem: volume_rel(i, vol) }
118
- end
119
- did = doc_id [vol]
120
- bibitem(formattedref: fref(did.id), docid: [did], link: blink(url), date: bdate(rsp), relation: rel,
121
- extent: btextent(vol), series: series)
122
- end
123
-
124
- def volume_rel(elm, vol) # rubocop:disable Metrics/AbcSize
125
- a = elm.at 'a[@itemprop="issueNumber"]'
126
- ish = a[:href].split("/").last
127
- url = IOP_DOMAIN + a[:href]
128
- docid = doc_id [vol, ish]
129
- t = elm.at "p"
130
- title_fref = t ? { title: titles(t.text) } : { formattedref: fref(docid.id) }
131
- BipmBibliographicItem.new(**title_fref, docid: [docid], link: blink(url))
132
- end
133
-
134
- # @param title [String]
135
- # @return [RelatonBib::TypedTitleStringCollection]
136
- def titles(title)
137
- RelatonBib::TypedTitleString.from_string title, "en", "Latn", "text/html"
138
- end
139
-
140
- # @param vol [String]
141
- # @param ish [String]
142
- # @param agent [Mechanize]
143
- # @return [RelatonBipm::BipmBibliographicItem]
144
- def get_issue(vol, ish, agent) # rubocop:disable Metrics/AbcSize, Metrics/MethodLength
145
- url = issue_url vol, ish
146
- rsp = agent.get url
147
- check_response rsp
148
- rel = rsp.xpath('//div[@class="art-list-item-body"]').map do |a|
149
- { type: "partOf", bibitem: issue_rel(a, vol, ish) }
150
- end
151
- did = doc_id [vol, ish]
152
- title_fref = { title: issue_title(rsp) }
153
- title_fref[:formattedref] = fref did.id unless title_fref[:title].any?
154
- bibitem(**title_fref, link: blink(url), relation: rel, docid: [did],
155
- date: bdate(rsp), extent: btextent(vol, ish), series: series)
156
- end
157
-
158
- # @param ref [String]
159
- # @return [RelatonBib::FormattedRef]
160
- def fref(ref)
161
- RelatonBib::FormattedRef.new content: ref, language: "en", script: "Latn"
162
- end
163
-
164
- # @param rsp [Mechanize::Page]
165
- # @return [RelatonBib::TypedTitleStringCollection]
166
- def issue_title(rsp)
167
- t = rsp.at('//div[@id="wd-jnl-issue-title"]/h4')
168
- return RelatonBib::TypedTitleStringCollection.new [] unless t
169
-
170
- titles(t.text)
171
- end
172
-
173
- # @oaran vol [String]
174
- # @param ish [String]
175
- # @return [String]
176
- def issue_url(vol, ish)
177
- "#{IOP_DOMAIN}/issue/0026-1394/#{vol}/#{ish}"
178
- end
179
-
180
- # @param elm [Nokogiri::XML::Element]
181
- # @param vol [String]
182
- # @param ish [String]
183
- # @return [RelatonBipm::BipmBibliographicItem]
184
- def issue_rel(elm, vol, ish)
185
- art = elm.at('div[@class="indexer"]').text
186
- ref = elm.at('div/a[@class="art-list-item-title"]')
187
- title = titles ref.text.strip
188
- docid = doc_id [vol, ish, art]
189
- link = blink IOP_DOMAIN + ref[:href]
190
- BipmBibliographicItem.new(title: title, docid: [docid], link: link)
191
- end
192
-
193
- # @param content [RelatonBib::TypedTitleString]
194
- # @return [RelatonBib::TypedTitleString]
195
- def btitle(content)
196
- RelatonBib::TypedTitleString.new type: "main", content: content, language: "en", script: "Latn"
197
- end
198
-
199
- # @param url [String]
200
- # @return [String]
201
- def blink(url)
202
- [RelatonBib::TypedUri.new(type: "src", content: url)]
203
- end
204
-
205
- # @param rsp [Mechanize::Page]
206
- # @return [Array<RelatonBib::BibliographicDate>]
207
- def bdate(rsp)
208
- date = rsp.at('//p[@itemprop="issueNumber"]|//h2[@itemprop="volumeNumber"]').text.split(", ").last
209
- on = date.match?(/^\d{4}$/) ? date : Date.parse(date).strftime("%Y-%m")
210
- [RelatonBib::BibliographicDate.new(type: "published", on: on)]
211
- end
212
-
213
- # @param args [Array<String>]
214
- # @return [RelatonBib::DocumentIdentifier]
215
- def doc_id(args)
216
- id = args.clone.unshift "Metrologia"
217
- RelatonBib::DocumentIdentifier.new(type: "BIPM", id: id.join(" "), primary: true)
218
- end
219
-
220
- # @param vol [String]
221
- # @param ish [String]
222
- # @param art [String]
223
- # @param agent [Mechanize]
224
- # @return [RelatonBipm::BipmBibliographicItem]
225
- def get_article_from_issue(vol, ish, art, agent) # rubocop:disable Metrics/MethodLength
226
- url = issue_url vol, ish
227
- rsp = agent.get url
228
- check_response rsp
229
- link = rsp.at("//div[@class='indexer'][.='#{art}']/../div/a")
230
- unless link
231
- arts = rsp.xpath("//div[@class='indexer']").map(&:text)
232
- warn "[relaton-bipm] No article is available at the specified start page \"#{art}\" in issue \"BIPM Metrologia #{vol} #{ish}\"."
233
- warn "[relaton-bipm] Available articles in the issue start at the following pages: (#{arts.join(', ')})"
234
- return
235
- end
236
-
237
- get_article link[:href], vol, ish, agent
238
- end
239
-
240
- # @param path [String]
241
- # @param vol [String]
242
- # @param ish [String]
243
- # @param agent [Mechanize]
244
- # @return [RelatonBipm::BipmBibliographicItem]
245
- def get_article(path, vol, ish, agent) # rubocop:disable Metrics/AbcSize, Metrics/MethodLength
246
- agent.agent.allowed_error_codes = [403]
247
- rsp = agent.get path
248
- check_response rsp
249
- title = rsp.at("//h1[@itemprop='headline']").children.to_xml
250
- url = rsp.uri
251
- bib = rsp.link_with(text: "BibTeX").href
252
- rsp = agent.get bib
253
- check_response rsp
254
- bt = BibTeX.parse(rsp.body).first
255
- bibitem(
256
- docid: btdocid(bt), title: titles(title), date: btdate(bt),
257
- abstract: btabstract(bt), doctype: bt.type.to_s, series: series,
258
- link: btlink(bt, url), contributor: btcontrib(bt),
259
- extent: btextent(vol, ish, bt.pages.to_s)
260
- )
261
- end
262
-
263
- # @param args [Hash]
264
- # @return [RelatonBipm::BipmBibliographicItem]
265
- def bibitem(**args)
266
- BipmBibliographicItem.new(
267
- type: "article", language: ["en"], script: ["Latn"], **args,
268
- )
269
- end
270
-
271
- # @return [Array<RelatonBib::Series>]
272
- def series
273
- [RelatonBib::Series.new(title: btitle("Metrologia"))]
274
- end
275
-
276
- # @param bibtex [BibTeX::Entry]
277
- # @return [Array<RelatonBib::DocumentIdentifier>]
278
- def btdocid(bibtex)
279
- id = "#{bibtex.journal} #{bibtex.volume} #{bibtex.number} #{bibtex.pages.match(/^\d+/)}"
280
- [
281
- RelatonBib::DocumentIdentifier.new(type: "BIPM", id: id, primary: true),
282
- RelatonBib::DocumentIdentifier.new(type: "DOI", id: bibtex.doi),
283
- ]
284
- end
285
-
286
- # @param bibtex [BibTeX::Entry]
287
- # @return [Array<RelatonBib::FormattedString>]
288
- def btabstract(bibtex)
289
- [RelatonBib::FormattedString.new(content: bibtex.abstract.to_s, language: "en", script: "Latn")]
290
- end
291
-
292
- # @param bibtex [BibTeX::Entry]
293
- # @param ref [URI]
294
- # @return [Array<RelatonBib::TypedUri>]
295
- def btlink(bibtex, ref)
296
- [
297
- RelatonBib::TypedUri.new(type: "src", content: ref.to_s),
298
- RelatonBib::TypedUri.new(type: "doi", content: bibtex.url.to_s),
299
- ]
300
- end
301
-
302
- # @param bibtex [BibTeX::Entry]
303
- # @return [Array<RelatonBib::BibliographicDate>]
304
- def btdate(bibtex)
305
- on = Date.new(bibtex.year.to_i, bibtex.month_numeric)
306
- [RelatonBib::BibliographicDate.new(type: "published", on: on)]
307
- end
308
-
309
- # @param bibtex [BibTeX::Entry]
310
- # @return [Array<Hash>]
311
- def btcontrib(bibtex) # rubocop:disable Metrics/MethodLength, Metrics/AbcSize
312
- contribs = []
313
- if bibtex.publisher && !bibtex.publisher.empty?
314
- org = RelatonBib::Organization.new name: bibtex.publisher.to_s
315
- contribs << { entity: org, role: [{ type: "publisher" }] }
316
- end
317
- return contribs unless bibtex.author && !bibtex.author.empty?
318
-
319
- bibtex.author.split(" and ").inject(contribs) do |mem, name|
320
- cname = RelatonBib::LocalizedString.new name, "en", "Latn"
321
- name = RelatonBib::FullName.new completename: cname
322
- author = RelatonBib::Person.new name: name
323
- mem << { entity: author, role: [{ type: "author" }] }
324
- end
325
- end
326
-
327
- #
328
- # @param vol [String] volume
329
- # @param ish [String] issue
330
- # @param pgs [String] pages
331
- #
332
- # @return [Array<RelatonBib::BibItemLocality>]
333
- #
334
- def btextent(vol, ish = nil, pgs = nil)
335
- ext = [RelatonBib::Locality.new("volume", vol)]
336
- ext << RelatonBib::Locality.new("issue", ish) if ish
337
- ext << RelatonBib::Locality.new("page", *pgs.split("--")) if pgs
338
- ext
339
- end
61
+ # def match_item(ids, ref_id)
62
+ # ids.find { |id| Id.new(id) == ref_id }
63
+ # end
340
64
 
341
65
  # @param ref [String] the BIPM standard Code to look up (e.g. "BIPM B-11")
342
66
  # @param year [String] not used
@@ -345,28 +69,6 @@ module RelatonBipm
345
69
  def get(ref, year = nil, opts = {})
346
70
  search(ref, year, opts)
347
71
  end
348
-
349
- private
350
-
351
- #
352
- # Check HTTP response. Warn and rise error if response is not 200
353
- # or redirect to CAPTCHA.
354
- #
355
- # @param [Mechanize] rsp response
356
- #
357
- # @raise [RelatonBib::RequestError] if response is not 200
358
- #
359
- def check_response(rsp) # rubocop:disable Metrics/AbcSize
360
- if rsp.code == "302"
361
- warn "[relaton-bipm] This source employs anti-DDoS measures that unfortunately affects automated requests."
362
- warn "[relaton-bipm] Please visit this link in your browser to resolve the CAPTCHA, then retry: #{rsp.uri}"
363
- # warn "[relaton-bipm] #{rsp.uri} is redirected to #{rsp.header['location']}"
364
- raise RelatonBib::RequestError, "cannot access #{rsp.uri}"
365
- elsif rsp.code != "200" && rsp.code != "403"
366
- warn "[read_bipm] can't acces #{rsp.uri} #{rsp.code}"
367
- raise RelatonBib::RequestError, "cannot acces #{rsp.uri} #{rsp.code}"
368
- end
369
- end
370
72
  end
371
73
  end
372
74
  end
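The new `get_bipm` replaces page scraping with an index lookup: search the index rows with a block, take the first match, and build the file URL from `GH_ENDPOINT`. The sketch below mimics that flow with an in-memory stand-in. `TinyIndex`, `file_url`, the row ids and the file paths are all made up for illustration; the real index object comes from `Relaton::Index`, and the real comparison goes through the gem's `Id` class rather than plain string equality.

```ruby
# Stand-in for the index object; the real one comes from Relaton::Index.
# The rows and file paths used below are made-up sample data.
class TinyIndex
  def initialize(rows)
    @rows = rows
  end

  # Returns every row for which the block is truthy.
  def search(&block)
    @rows.select(&block)
  end
end

GH_ENDPOINT = "https://raw.githubusercontent.com/relaton/relaton-data-bipm/master/".freeze

# Simplified lookup: plain string equality instead of the gem's Id comparison.
def file_url(index, ref_id)
  rows = index.search { |r| r[:id] == ref_id }
  return unless rows.any?

  "#{GH_ENDPOINT}#{rows.first[:file]}"
end

index = TinyIndex.new(
  [
    { id: "CIPM Decision 2012-01", file: "data/sample-a.yaml" },
    { id: "CGPM Resolution 1889-00", file: "data/sample-b.yaml" },
  ],
)
puts file_url(index, "CIPM Decision 2012-01")
```

When no row matches, `file_url` returns `nil`, mirroring the early `return unless rows.any?` in the diff.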
@@ -6,7 +6,7 @@ module RelatonBipm
6
6
  # @param [RelatonBipm::DataFetcher] data_fetcher data fetcher
7
7
  #
8
8
  def initialize(data_fetcher)
9
- @data_fetcher = data_fetcher
9
+ @data_fetcher = WeakRef.new data_fetcher
10
10
  end
11
11
 
12
12
  #
@@ -27,14 +27,18 @@ module RelatonBipm
27
27
  # puts "Ls #{Dir['bipm-si-brochure/*']}"
28
28
  # puts "Ls #{Dir['bipm-si-brochure/site/*']}"
29
29
  # puts "Ls #{Dir['bipm-si-brochure/site/documents/*']}"
30
- Dir["bipm-si-brochure/site/documents/*.rxl"].each do |f|
30
+ Dir["bipm-si-brochure/_site/documents/*.rxl"].each do |f|
31
31
  puts "Parsing #{f}"
32
32
  docstd = Nokogiri::XML File.read f
33
33
  doc = docstd.at "/bibdata"
34
34
  hash1 = RelatonBipm::XMLParser.from_xml(doc.to_xml).to_hash
35
35
  fix_si_brochure_id hash1
36
- outfile = File.join @data_fetcher.output, File.basename(f).sub(/(?:-(?:en|fr))?\.rxl$/, ".yaml")
37
- @data_fetcher.index[[hash1["docnumber"] || File.basename(outfile, ".yaml")]] = outfile
36
+ basename = File.join @data_fetcher.output, File.basename(f).sub(/(?:-(?:en|fr))?\.rxl$/, "")
37
+ outfile = "#{basename}.#{@data_fetcher.ext}"
38
+ key = hash1["docnumber"] || basename
39
+ @data_fetcher.index[[key]] = outfile
40
+ @data_fetcher.index_new.add_or_update [key], outfile
41
+ @data_fetcher.index2.add_or_update Id.new(key).normalized_hash, outfile
38
42
  hash = if File.exist? outfile
39
43
  warn_duplicate = false
40
44
  hash2 = YAML.load_file outfile
@@ -15,7 +15,7 @@ module RelatonBipm
15
15
 
16
16
  # @param builder [Nokogiri::XML::builder]
17
17
  def to_xml(builder)
18
- builder.send "comment-period" do
18
+ builder.send :"comment-period" do
19
19
  builder.from from.to_s
20
20
  builder.to to.to_s if to
21
21
  end
@@ -1,6 +1,6 @@
1
1
  module RelatonBipm
2
2
  class DataFetcher
3
- attr_reader :output, :format, :ext, :files, :index
3
+ attr_reader :output, :format, :ext, :files, :index, :index_new, :index2
4
4
 
5
5
  #
6
6
  # Initialize fetcher
@@ -15,6 +15,8 @@ module RelatonBipm
15
15
  @files = []
16
16
  @index_path = "index.yaml"
17
17
  @index = File.exist?(@index_path) ? YAML.load_file(@index_path) : {}
18
+ @index_new = Relaton::Index.find_or_create :BIPM, file: "index-bipm.yaml"
19
+ @index2 = Relaton::Index.find_or_create :BIPM, file: "index2.yaml"
18
20
  end
19
21
 
20
22
  #
@@ -43,8 +45,11 @@ module RelatonBipm
43
45
  case source
44
46
  when "bipm-data-outcomes" then DataOutcomesParser.parse(self)
45
47
  when "bipm-si-brochure" then BipmSiBrochureParser.parse(self)
48
+ when "rawdata-bipm-metrologia" then RawdataBipmMetrologia::Fetcher.fetch(self)
46
49
  end
47
- File.write @index_path, @index.to_yaml, encoding: "UTF-8"
50
+ File.write @index_path, index.to_yaml, encoding: "UTF-8"
51
+ index_new.save
52
+ index2.save
48
53
  end
49
54
 
50
55
  #
@@ -54,15 +59,22 @@ module RelatonBipm
54
59
  # @param [RelatonBipm::BipmBibliographicItem] item document to save
55
60
  # @param [Boolean, nil] warn_duplicate Warn if document already exists
56
61
  #
57
- # @return [<Type>] <description>
58
- #
59
62
  def write_file(path, item, warn_duplicate: true)
63
+ content = serialize item
60
64
  if @files.include?(path)
61
65
  warn "File #{path} already exists" if warn_duplicate
62
66
  else
63
67
  @files << path
64
68
  end
65
- File.write path, item.to_hash.to_yaml, encoding: "UTF-8"
69
+ File.write path, content, encoding: "UTF-8"
70
+ end
71
+
72
+ def serialize(item)
73
+ case @format
74
+ when "xml" then item.to_xml bibdata: true
75
+ when "yaml" then item.to_hash.to_yaml
76
+ when "bibxml" then item.to_bibxml
77
+ end
66
78
  end
67
79
  end
68
80
  end
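The `serialize` method added above dispatches on the output format and has no `else` branch, so an unrecognised format yields `nil`. The following sketch shows one possible variation that raises instead. It is an assumption-laden illustration, not the gem's code: `StubItem` is a hypothetical stand-in for a bibliographic item, the format is passed as an argument rather than read from `@format`, and the `"xml"` and `"bibxml"` branches are omitted because they need the real item class.

```ruby
require "yaml"

# Stub standing in for a bibliographic item; only #to_hash is modelled here.
class StubItem
  def initialize(id)
    @id = id
  end

  def to_hash
    { "id" => @id }
  end
end

# Variation on the dispatch in the diff: unknown formats raise instead of
# silently returning nil.
def serialize(item, format)
  case format
  when "yaml" then item.to_hash.to_yaml
  else raise ArgumentError, "unsupported format: #{format}"
  end
end

puts serialize(StubItem.new("BIPM Metrologia 29 6 001"), "yaml")
```

Failing fast on an unknown format is a design choice of this sketch; the diff itself does not show how the gem treats that case beyond the three listed formats.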