relaton-bipm 1.14.1 → 1.14.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 516c29a4d2e4ce02f6e2a1800f248f26ffa9b15c848d604ba5c100e05104acc8
4
- data.tar.gz: 32455c13d2414e272d3ae9d9fe9b948116987439a81133fe235370eedac98fdc
3
+ metadata.gz: 34d720b316dbd942e2c5d630d2ae0f07b74331e4ef07f68715e84304bad0fb13
4
+ data.tar.gz: 38d36e34b998db6e4fa9e9f1a6e5306fadacea45b9a878cd14629eaaca2ef50d
5
5
  SHA512:
6
- metadata.gz: dab2ecf54b507fad29bedd65870525f8c063d2299e7df03fd731f5382e43ecbd900931860b58c8c346ed98c3f1214242f97f795c549f20da9d0a65bb321191e3
7
- data.tar.gz: eba8ab69d65e03ee43cfb3bd8529eb56c6836244897f71deba1799ebf33c513a375bb1fb4527920d721b11bd9cd6604d253a831d9d62c5703cd480d39ed9d8b4
6
+ metadata.gz: a22261617d5c3de8aad7ed410091331698630f958332c5feb0b215b9fafe9167015530e9e6ecb71046307a6775aa7a2a0a71dd4c12e4f982cb0bd259e021a267
7
+ data.tar.gz: 376bb090dd4d273b8039357d78280c9bc4f1555918920a55b22ec72cb8be87c4ce0a8469254ffc000256fa95069a271b65b978dc767d735acd4b3e141c5dea24
data/.gitignore CHANGED
@@ -12,3 +12,4 @@
12
12
  .rubocop-https---raw-githubusercontent-com-riboseinc-oss-guides-master-ci-rubocop-yml
13
13
  Gemfile.lock
14
14
  .vscode/
15
+ rawdata-bipm-metrologia
data/Gemfile CHANGED
@@ -3,5 +3,11 @@ source "https://rubygems.org"
3
3
  # Specify your gem's dependencies in relaton_bipm.gemspec
4
4
  gemspec
5
5
 
6
+ gem "byebug"
7
+ gem "pry-byebug"
6
8
  gem "rake", "~> 13.0"
7
9
  gem "rspec", "~> 3.0"
10
+ gem "ruby-jing"
11
+ gem "simplecov"
12
+ gem "vcr"
13
+ gem "webmock"
data/README.adoc CHANGED
@@ -70,22 +70,35 @@ Allowed document names are:
70
70
 
71
71
  ==== Reference structure for Metrologia documents
72
72
 
73
- `BIPM Metrologia {JOURNAL} {VOLUME} {ISSUE} {PAGE}`
73
+ `BIPM Metrologia {JOURNAL} {VOLUME} {ISSUE}`
74
74
 
75
- - `{JOURNAL}` - number of journal, required
76
- - `{VOLUME}` - number of volume, optional
77
- - `{ISSUE}` - number of issue, optional
78
- - `{PAGE}` - number of page, optional
75
+ - `{JOURNAL}` - journal number, required
76
+ - `{VOLUME}` - volume number, optional
77
+ - `{ISSUE}` - issue number, optional
79
78
 
80
79
  ==== Reference structures for CCTF (CCDS), CGPM, CIPM documents
81
80
 
82
81
  ===== Basic pattern
83
82
 
84
83
  ----
85
- Long: {group name} -- {type} {number} ({year})
86
- Short: {group name} -- {type-abbrev} {number} ({year}, {lang})
84
+ Long:
85
+ {group name} -- {type} {number} ({year})
86
+ {group name} {type} {number} ({year})
87
+ {group name} {type} {year}-{zero_leading_number}
88
+
89
+ Short:
90
+ {group name} -- {type-abbrev} {number} ({year}, {lang})
91
+ {group name} {type-abbrev} {number} ({year}, {lang})
87
92
  ----
88
93
 
94
+ `group name` - a name of the group, required. A full list of group names is available https://github.com/metanorma/bipm-editor-guides/blob/main/sources/bipm-outcomes-en.adoc#appendix-a-bipm-groups-and-codes[here].
95
+ `type` - a type of document, required. A list of types is: Resolution (Résolution), Recommendation (Recommandation), Decision (Décision), Meeting (Réunion), Declaration (Déclaration).
96
+ `type-abbrev` - an abbreviation of the type, required. A list of abbreviations: RES (Resolution), REC (Recommendation), DECN (Decision).
97
+ `number` - a number of the document, optional. Can be with part, e.g. `1-2`.
98
+ `zero_leading_number` - a number of the document with a leading zero, required. Can be used when a document has a 1 or 2 digits number. It's `00` for documents without a number.
99
+ `year` - a year of the document, optional.
100
+ `lang` - a language of the document, optional. Can be `EN` or `FR`.
101
+
89
102
  ===== Special case pattern
90
103
 
91
104
  The basic pattern works fine for all, except for these 2 cases:
@@ -189,9 +202,9 @@ item = RelatonBipm::BipmBibliography.get "BIPM SI Brochure"
189
202
  ...
190
203
 
191
204
  # get BIPM Metrologia page
192
- bib = RelatonBipm::BipmBibliography.get "BIPM Metrologia 29 6 373"
193
- [relaton-bipm] ("BIPM Metrologia 29 6 373") fetching...
194
- [relaton-bipm] ("BIPM Metrologia 29 6 373") found Metrologia 29 6 373
205
+ bib = RelatonBipm::BipmBibliography.get "BIPM Metrologia 29 6 001"
206
+ [relaton-bipm] ("BIPM Metrologia 29 6 001") fetching...
207
+ [relaton-bipm] ("BIPM Metrologia 29 6 001") found Metrologia 29 6 001
195
208
  => #<RelatonBipm::BipmBibliographicItem:0x007f8857f94d40
196
209
  ...
197
210
 
@@ -295,7 +308,7 @@ bib.link
295
308
  #<RelatonBib::TypedUri:0x00007fa6d6a29250 @content=#<Addressable::URI:0xc2b0 URI:https://doi.org/10.1088/0026-1394/29/6/001>, @type="doi">]
296
309
  ----
297
310
 
298
- === Create bibliographic item from XML
311
+ === Create a bibliographic item from XML
299
312
 
300
313
  [source,ruby]
301
314
  ----
@@ -304,7 +317,7 @@ RelatonBipm::XMLParser.from_xml File.read('spec/fixtures/bipm_item.xml')
304
317
  ...
305
318
  ----
306
319
 
307
- === Create bibliographic item from YAML
320
+ === Create a bibliographic item from YAML
308
321
  [source,ruby]
309
322
  ----
310
323
  hash = YAML.load_file 'spec/fixtures/bipm_item.yml'
@@ -321,6 +334,7 @@ RelatonBipm::BipmBibliographicItem.from_hash hash
321
334
  This gem uses the following datasets as data sources:
322
335
  - `bipm-data-outcomes` - looking for a local directory with the repository https://github.com/metanorma/bipm-data-outcomes
323
336
  - `bipm-si-brochute` - looking for a local directory with the repository https://github.com/metanorma/bipm-si-brochure
337
+ - `rawdata-bipm-metrologia` - looking for a local directory with the repository https://github.com/relaton/rawdata-bipm-metrologia
324
338
 
325
339
  The method `RelatonBipm::DataFetcher.fetch(source, output: "data", format: "yaml")` fetches all the documents from the dataset and saves them to the `./data` folder in YAML format.
326
340
  Arguments:
@@ -342,6 +356,12 @@ Started at: 2022-06-23 09:37:12 +0200
342
356
  Stopped at: 2022-06-23 09:37:12 +0200
343
357
  Done in: 0 sec.
344
358
  => nil
359
+
360
+ RelatonBipm::DataFetcher.fetch "rawdata-bipm-metrologia"
361
+ Started at: 2022-06-23 09:39:12 +0200
362
+ Stopped at: 2022-06-23 09:40:34 +0200
363
+ Done in: 82 sec.
364
+ => nil
345
365
  ----
346
366
 
347
367
  == Development
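The README hunk above adds a `{group name} {type} {year}-{zero_leading_number}` form next to the older `{group name} -- {type} {number} ({year})` form, with `00` standing in for a missing number. The following is a minimal sketch of that conversion, not the gem's own code: `zero_leading_form` is a hypothetical helper, and it assumes a single-word group name and an unabbreviated type.

```ruby
# Hypothetical helper illustrating the README patterns; it is not part of
# relaton-bipm. Assumes the group name is a single token such as "CIPM".
def zero_leading_form(ref)
  pattern = /\A(?<group>\S+)\s+(?:--\s+)?(?<type>\p{L}+)(?:\s+(?<number>\d{1,2}))?\s+\((?<year>\d{4})\)\z/
  m = ref.match(pattern)
  return nil unless m

  # A missing number becomes "00"; one- and two-digit numbers are left-padded.
  number = m[:number].to_s.rjust(2, "0")
  "#{m[:group]} #{m[:type]} #{m[:year]}-#{number}"
end

puts zero_leading_form("CIPM Resolution 1 (2018)")
puts zero_leading_form("CGPM -- Resolution 12 (1995)")
puts zero_leading_form("CIPM Decision (2012)")
```

References that do not fit either long form, such as `BIPM SI Brochure`, make the helper return `nil` rather than guess.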
@@ -522,7 +522,6 @@
522
522
  <value>tip</value>
523
523
  <value>important</value>
524
524
  <value>caution</value>
525
- <value>statement</value>
526
525
  </choice>
527
526
  </define>
528
527
  <define name="figure">
data/grammars/biblio.rng CHANGED
@@ -216,6 +216,9 @@
216
216
  <optional>
217
217
  <ref name="fullname"/>
218
218
  </optional>
219
+ <zeroOrMore>
220
+ <ref name="credential"/>
221
+ </zeroOrMore>
219
222
  <zeroOrMore>
220
223
  <ref name="affiliation"/>
221
224
  </zeroOrMore>
@@ -232,6 +235,11 @@
232
235
  <ref name="FullNameType"/>
233
236
  </element>
234
237
  </define>
238
+ <define name="credential">
239
+ <element name="credential">
240
+ <text/>
241
+ </element>
242
+ </define>
235
243
  <define name="FullNameType">
236
244
  <choice>
237
245
  <group>
@@ -305,7 +313,9 @@
305
313
  <zeroOrMore>
306
314
  <ref name="affiliationdescription"/>
307
315
  </zeroOrMore>
308
- <ref name="organization"/>
316
+ <optional>
317
+ <ref name="organization"/>
318
+ </optional>
309
319
  </element>
310
320
  </define>
311
321
  <define name="affiliationname">
@@ -1316,7 +1326,7 @@
1316
1326
  <value>commentaryOf</value>
1317
1327
  <value>hasCommentary</value>
1318
1328
  <value>related</value>
1319
- <value>complements</value>
1329
+ <value>hasComplement</value>
1320
1330
  <value>complementOf</value>
1321
1331
  <value>obsoletes</value>
1322
1332
  <value>obsoletedBy</value>
@@ -3,14 +3,6 @@ require "mechanize"
3
3
  module RelatonBipm
4
4
  class BipmBibliography
5
5
  GH_ENDPOINT = "https://raw.githubusercontent.com/relaton/relaton-data-bipm/master/".freeze
6
- IOP_DOMAIN = "https://iopscience.iop.org".freeze
7
- TRANSLATIONS = {
8
- "Déclaration" => "Declaration",
9
- "Réunion" => "Meeting",
10
- "Recommandation" => "Recommendation",
11
- "Résolution" => "Resolution",
12
- "Décision" => "Decision",
13
- }.freeze
14
6
 
15
7
  class << self
16
8
  # @param text [String]
@@ -18,14 +10,13 @@ module RelatonBipm
18
10
  def search(text, _year = nil, _opts = {}) # rubocop:disable Metrics/AbcSize, Metrics/MethodLength
19
11
  warn "[relaton-bipm] (\"#{text}\") fetching..."
20
12
  ref = text.sub(/^BIPM\s/, "")
21
- item = ref.match?(/^Metrologia/i) ? get_metrologia(ref, magent) : get_bipm(ref, magent)
13
+ item = get_bipm(ref, magent)
22
14
  unless item
23
15
  warn "[relaton-bipm] (\"#{text}\") not found."
24
16
  return
25
17
  end
26
18
 
27
19
  warn("[relaton-bipm] (\"#{text}\") found #{item.docidentifier[0].id}")
28
- item.fetched = Date.today.to_s
29
20
  item
30
21
  rescue Mechanize::ResponseCodeError => e
31
22
  raise RelatonBib::RequestError, e.message unless e.response_code == "404"
@@ -48,295 +39,28 @@ module RelatonBipm
48
39
  a
49
40
  end
50
41
 
51
- # @param ref [String]
42
+ # @param reference [String]
52
43
  # @param agent [Mechanize]
53
44
  # @return [RelatonBipm::BipmBibliographicItem]
54
- def get_bipm(ref, agent) # rubocop:disable Metrics/AbcSize, Metrics/MethodLength
55
- # rf = ref.sub(/(?:(\d{1,2})\s)?\(?(\d{4})(?!-)\)?/) do
56
- # "#{$2}-#{$1.to_s.rjust(2, '0')}"
57
- # end
58
- ref.sub!("CCDS", "CCTF")
59
- # TRANSLATIONS.each { |fr, en| rf.sub! fr, en }
60
- path = Index.new.search ref
61
- return unless path
45
+ def get_bipm(reference, agent) # rubocop:disable Metrics/AbcSize, Metrics/MethodLength
46
+ ref_id = Id.new reference
47
+ index = Relaton::Index.find_or_create :BIPM, url: "#{GH_ENDPOINT}index2.zip"
48
+ rows = index.search { |r| ref_id == r[:id] }
49
+ return unless rows.any?
62
50
 
63
- url = "#{GH_ENDPOINT}#{path}"
51
+ url = "#{GH_ENDPOINT}#{rows.first[:file]}"
64
52
  resp = agent.get url
65
- check_response resp
66
53
  return unless resp.code == "200"
67
54
 
68
55
  yaml = RelatonBib.parse_yaml resp.body, [Date]
69
- # yaml["fetched"] = Date.today.to_s
56
+ yaml["fetched"] = Date.today.to_s
70
57
  bib_hash = HashConverter.hash_to_bib yaml
71
58
  BipmBibliographicItem.new(**bib_hash)
72
59
  end
73
60
 
74
- # @param ref [String]
75
- # @param agent [Mechanize]
76
- # @return [RelatonBipm::BipmBibliographicItem]
77
- def get_metrologia(ref, agent)
78
- agent.redirect_ok = false
79
- ref_arr = ref.split
80
- case ref_arr.size
81
- when 1 then get_journal agent
82
- when 2 then get_volume ref_arr[1], agent
83
- when 3 then get_issue(*ref_arr[1..2], agent)
84
- when 4 then get_article_from_issue(*ref_arr[1..3], agent)
85
- end
86
- end
87
-
88
- # @param agent [Mechanize]
89
- # @return [RelatonBipm::BipmBibliographicItem]
90
- def get_journal(agent)
91
- url = "#{IOP_DOMAIN}/journal/0026-1394"
92
- rsp = agent.get url
93
- check_response rsp
94
- rel = rsp.xpath('//select[@id="allVolumesSelector"]/option').map do |v|
95
- { type: "partOf", bibitem: journal_rel(v) }
96
- end
97
- did = doc_id []
98
- bibitem(formattedref: fref(did.id), docid: [did], link: blink(url), relation: rel)
99
- end
100
-
101
- # @param elm [Nokogiri::XML::Element]
102
- def journal_rel(elm)
103
- vol = elm[:value].split("/").last
104
- did = doc_id [vol]
105
- url = IOP_DOMAIN + elm[:value]
106
- BipmBibliographicItem.new(formattedref: fref(did.id), docid: [did], link: blink(url))
107
- end
108
-
109
- # @param vol [String]
110
- # @param agent [Mechanize]
111
- # @return [RelatonBipm::BipmBibliographicItem]
112
- def get_volume(vol, agent)
113
- url = "#{IOP_DOMAIN}/volume/0026-1394/#{vol}"
114
- rsp = agent.get url
115
- check_response rsp
116
- rel = rsp.xpath('//li[@itemprop="hasPart"]').map do |i|
117
- { type: "partOf", bibitem: volume_rel(i, vol) }
118
- end
119
- did = doc_id [vol]
120
- bibitem(formattedref: fref(did.id), docid: [did], link: blink(url), date: bdate(rsp), relation: rel,
121
- extent: btextent(vol), series: series)
122
- end
123
-
124
- def volume_rel(elm, vol) # rubocop:disable Metrics/AbcSize
125
- a = elm.at 'a[@itemprop="issueNumber"]'
126
- ish = a[:href].split("/").last
127
- url = IOP_DOMAIN + a[:href]
128
- docid = doc_id [vol, ish]
129
- t = elm.at "p"
130
- title_fref = t ? { title: titles(t.text) } : { formattedref: fref(docid.id) }
131
- BipmBibliographicItem.new(**title_fref, docid: [docid], link: blink(url))
132
- end
133
-
134
- # @param title [String]
135
- # @return [RelatonBib::TypedTitleStringCollection]
136
- def titles(title)
137
- RelatonBib::TypedTitleString.from_string title, "en", "Latn", "text/html"
138
- end
139
-
140
- # @param vol [String]
141
- # @param ish [String]
142
- # @param agent [Mechanize]
143
- # @return [RelatonBipm::BipmBibliographicItem]
144
- def get_issue(vol, ish, agent) # rubocop:disable Metrics/AbcSize, Metrics/MethodLength
145
- url = issue_url vol, ish
146
- rsp = agent.get url
147
- check_response rsp
148
- rel = rsp.xpath('//div[@class="art-list-item-body"]').map do |a|
149
- { type: "partOf", bibitem: issue_rel(a, vol, ish) }
150
- end
151
- did = doc_id [vol, ish]
152
- title_fref = { title: issue_title(rsp) }
153
- title_fref[:formattedref] = fref did.id unless title_fref[:title].any?
154
- bibitem(**title_fref, link: blink(url), relation: rel, docid: [did],
155
- date: bdate(rsp), extent: btextent(vol, ish), series: series)
156
- end
157
-
158
- # @param ref [String]
159
- # @return [RelatonBib::FormattedRef]
160
- def fref(ref)
161
- RelatonBib::FormattedRef.new content: ref, language: "en", script: "Latn"
162
- end
163
-
164
- # @param rsp [Mechanize::Page]
165
- # @return [RelatonBib::TypedTitleStringCollection]
166
- def issue_title(rsp)
167
- t = rsp.at('//div[@id="wd-jnl-issue-title"]/h4')
168
- return RelatonBib::TypedTitleStringCollection.new [] unless t
169
-
170
- titles(t.text)
171
- end
172
-
173
- # @oaran vol [String]
174
- # @param ish [String]
175
- # @return [String]
176
- def issue_url(vol, ish)
177
- "#{IOP_DOMAIN}/issue/0026-1394/#{vol}/#{ish}"
178
- end
179
-
180
- # @param elm [Nokogiri::XML::Element]
181
- # @param vol [String]
182
- # @param ish [String]
183
- # @return [RelatonBipm::BipmBibliographicItem]
184
- def issue_rel(elm, vol, ish)
185
- art = elm.at('div[@class="indexer"]').text
186
- ref = elm.at('div/a[@class="art-list-item-title"]')
187
- title = titles ref.text.strip
188
- docid = doc_id [vol, ish, art]
189
- link = blink IOP_DOMAIN + ref[:href]
190
- BipmBibliographicItem.new(title: title, docid: [docid], link: link)
191
- end
192
-
193
- # @param content [RelatonBib::TypedTitleString]
194
- # @return [RelatonBib::TypedTitleString]
195
- def btitle(content)
196
- RelatonBib::TypedTitleString.new type: "main", content: content, language: "en", script: "Latn"
197
- end
198
-
199
- # @param url [String]
200
- # @return [String]
201
- def blink(url)
202
- [RelatonBib::TypedUri.new(type: "src", content: url)]
203
- end
204
-
205
- # @param rsp [Mechanize::Page]
206
- # @return [Array<RelatonBib::BibliographicDate>]
207
- def bdate(rsp)
208
- date = rsp.at('//p[@itemprop="issueNumber"]|//h2[@itemprop="volumeNumber"]').text.split(", ").last
209
- on = date.match?(/^\d{4}$/) ? date : Date.parse(date).strftime("%Y-%m")
210
- [RelatonBib::BibliographicDate.new(type: "published", on: on)]
211
- end
212
-
213
- # @param args [Array<String>]
214
- # @return [RelatonBib::DocumentIdentifier]
215
- def doc_id(args)
216
- id = args.clone.unshift "Metrologia"
217
- RelatonBib::DocumentIdentifier.new(type: "BIPM", id: id.join(" "), primary: true)
218
- end
219
-
220
- # @param vol [String]
221
- # @param ish [String]
222
- # @param art [String]
223
- # @param agent [Mechanize]
224
- # @return [RelatonBipm::BipmBibliographicItem]
225
- def get_article_from_issue(vol, ish, art, agent) # rubocop:disable Metrics/MethodLength
226
- url = issue_url vol, ish
227
- rsp = agent.get url
228
- check_response rsp
229
- link = rsp.at("//div[@class='indexer'][.='#{art}']/../div/a")
230
- unless link
231
- arts = rsp.xpath("//div[@class='indexer']").map(&:text)
232
- warn "[relaton-bipm] No article is available at the specified start page \"#{art}\" in issue \"BIPM Metrologia #{vol} #{ish}\"."
233
- warn "[relaton-bipm] Available articles in the issue start at the following pages: (#{arts.join(', ')})"
234
- return
235
- end
236
-
237
- get_article link[:href], vol, ish, agent
238
- end
239
-
240
- # @param path [String]
241
- # @param vol [String]
242
- # @param ish [String]
243
- # @param agent [Mechanize]
244
- # @return [RelatonBipm::BipmBibliographicItem]
245
- def get_article(path, vol, ish, agent) # rubocop:disable Metrics/AbcSize, Metrics/MethodLength
246
- agent.agent.allowed_error_codes = [403]
247
- rsp = agent.get path
248
- check_response rsp
249
- title = rsp.at("//h1[@itemprop='headline']").children.to_xml
250
- url = rsp.uri
251
- bib = rsp.link_with(text: "BibTeX").href
252
- rsp = agent.get bib
253
- check_response rsp
254
- bt = BibTeX.parse(rsp.body).first
255
- bibitem(
256
- docid: btdocid(bt), title: titles(title), date: btdate(bt),
257
- abstract: btabstract(bt), doctype: bt.type.to_s, series: series,
258
- link: btlink(bt, url), contributor: btcontrib(bt),
259
- extent: btextent(vol, ish, bt.pages.to_s)
260
- )
261
- end
262
-
263
- # @param args [Hash]
264
- # @return [RelatonBipm::BipmBibliographicItem]
265
- def bibitem(**args)
266
- BipmBibliographicItem.new(
267
- type: "article", language: ["en"], script: ["Latn"], **args,
268
- )
269
- end
270
-
271
- # @return [Array<RelatonBib::Series>]
272
- def series
273
- [RelatonBib::Series.new(title: btitle("Metrologia"))]
274
- end
275
-
276
- # @param bibtex [BibTeX::Entry]
277
- # @return [Array<RelatonBib::DocumentIdentifier>]
278
- def btdocid(bibtex)
279
- id = "#{bibtex.journal} #{bibtex.volume} #{bibtex.number} #{bibtex.pages.match(/^\d+/)}"
280
- [
281
- RelatonBib::DocumentIdentifier.new(type: "BIPM", id: id, primary: true),
282
- RelatonBib::DocumentIdentifier.new(type: "DOI", id: bibtex.doi),
283
- ]
284
- end
285
-
286
- # @param bibtex [BibTeX::Entry]
287
- # @return [Array<RelatonBib::FormattedString>]
288
- def btabstract(bibtex)
289
- [RelatonBib::FormattedString.new(content: bibtex.abstract.to_s, language: "en", script: "Latn")]
290
- end
291
-
292
- # @param bibtex [BibTeX::Entry]
293
- # @param ref [URI]
294
- # @return [Array<RelatonBib::TypedUri>]
295
- def btlink(bibtex, ref)
296
- [
297
- RelatonBib::TypedUri.new(type: "src", content: ref.to_s),
298
- RelatonBib::TypedUri.new(type: "doi", content: bibtex.url.to_s),
299
- ]
300
- end
301
-
302
- # @param bibtex [BibTeX::Entry]
303
- # @return [Array<RelatonBib::BibliographicDate>]
304
- def btdate(bibtex)
305
- on = Date.new(bibtex.year.to_i, bibtex.month_numeric)
306
- [RelatonBib::BibliographicDate.new(type: "published", on: on)]
307
- end
308
-
309
- # @param bibtex [BibTeX::Entry]
310
- # @return [Array<Hash>]
311
- def btcontrib(bibtex) # rubocop:disable Metrics/MethodLength, Metrics/AbcSize
312
- contribs = []
313
- if bibtex.publisher && !bibtex.publisher.empty?
314
- org = RelatonBib::Organization.new name: bibtex.publisher.to_s
315
- contribs << { entity: org, role: [{ type: "publisher" }] }
316
- end
317
- return contribs unless bibtex.author && !bibtex.author.empty?
318
-
319
- bibtex.author.split(" and ").inject(contribs) do |mem, name|
320
- cname = RelatonBib::LocalizedString.new name, "en", "Latn"
321
- name = RelatonBib::FullName.new completename: cname
322
- author = RelatonBib::Person.new name: name
323
- mem << { entity: author, role: [{ type: "author" }] }
324
- end
325
- end
326
-
327
- #
328
- # @param vol [String] volume
329
- # @param ish [String] issue
330
- # @param pgs [String] pages
331
- #
332
- # @return [Array<RelatonBib::BibItemLocality>]
333
- #
334
- def btextent(vol, ish = nil, pgs = nil)
335
- ext = [RelatonBib::Locality.new("volume", vol)]
336
- ext << RelatonBib::Locality.new("issue", ish) if ish
337
- ext << RelatonBib::Locality.new("page", *pgs.split("--")) if pgs
338
- ext
339
- end
61
+ # def match_item(ids, ref_id)
62
+ # ids.find { |id| Id.new(id) == ref_id }
63
+ # end
340
64
 
341
65
  # @param ref [String] the BIPM standard Code to look up (e..g "BIPM B-11")
342
66
  # @param year [String] not used
@@ -345,28 +69,6 @@ module RelatonBipm
345
69
  def get(ref, year = nil, opts = {})
346
70
  search(ref, year, opts)
347
71
  end
348
-
349
- private
350
-
351
- #
352
- # Check HTTP response. Warn and rise error if response is not 200
353
- # or redirect to CAPTCHA.
354
- #
355
- # @param [Mechanize] rsp response
356
- #
357
- # @raise [RelatonBib::RequestError] if response is not 200
358
- #
359
- def check_response(rsp) # rubocop:disable Metrics/AbcSize
360
- if rsp.code == "302"
361
- warn "[relaton-bipm] This source employs anti-DDoS measures that unfortunately affects automated requests."
362
- warn "[relaton-bipm] Please visit this link in your browser to resolve the CAPTCHA, then retry: #{rsp.uri}"
363
- # warn "[relaton-bipm] #{rsp.uri} is redirected to #{rsp.header['location']}"
364
- raise RelatonBib::RequestError, "cannot access #{rsp.uri}"
365
- elsif rsp.code != "200" && rsp.code != "403"
366
- warn "[read_bipm] can't acces #{rsp.uri} #{rsp.code}"
367
- raise RelatonBib::RequestError, "cannot acces #{rsp.uri} #{rsp.code}"
368
- end
369
- end
370
72
  end
371
73
  end
372
74
  end
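The rewritten `get_bipm` above replaces page scraping with an index lookup: search the rows for a matching id, stop if nothing matches, and build the download URL from the first row's `:file`. The sketch below mirrors that flow under loud assumptions: `InMemoryIndex` is a stand-in for the real index object, plain string equality replaces the `Id` comparison, and the row contents are made up for illustration.

```ruby
GH_ENDPOINT = "https://raw.githubusercontent.com/relaton/relaton-data-bipm/master/".freeze

# Stand-in for the index used in the diff (an assumption, not the real class):
# it only needs a block-based #search that returns the matching rows.
class InMemoryIndex
  def initialize(rows)
    @rows = rows
  end

  def search(&block)
    @rows.select(&block)
  end
end

# Returns the URL of the first matching document, or nil when nothing matches.
def document_url(index, wanted_id)
  rows = index.search { |r| r[:id] == wanted_id }
  return nil unless rows.any?

  "#{GH_ENDPOINT}#{rows.first[:file]}"
end

# Hypothetical index rows; the real index is downloaded by the gem.
index = InMemoryIndex.new(
  [
    { id: "Metrologia 29 6 001", file: "data/metrologia-29-6-001.yaml" },
    { id: "SI Brochure", file: "data/si-brochure.yaml" },
  ],
)

puts document_url(index, "Metrologia 29 6 001")
puts document_url(index, "Metrologia 1 1 1").inspect
```

Returning `nil` on a miss matches the early `return unless rows.any?` in the diff, which lets `search` print its "not found" warning.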
@@ -6,7 +6,7 @@ module RelatonBipm
6
6
  # @param [RelatonBipm::DataFetcher] data_fetcher data fetcher
7
7
  #
8
8
  def initialize(data_fetcher)
9
- @data_fetcher = data_fetcher
9
+ @data_fetcher = WeakRef.new data_fetcher
10
10
  end
11
11
 
12
12
  #
@@ -27,14 +27,18 @@ module RelatonBipm
27
27
  # puts "Ls #{Dir['bipm-si-brochure/*']}"
28
28
  # puts "Ls #{Dir['bipm-si-brochure/site/*']}"
29
29
  # puts "Ls #{Dir['bipm-si-brochure/site/documents/*']}"
30
- Dir["bipm-si-brochure/site/documents/*.rxl"].each do |f|
30
+ Dir["bipm-si-brochure/_site/documents/*.rxl"].each do |f|
31
31
  puts "Parsing #{f}"
32
32
  docstd = Nokogiri::XML File.read f
33
33
  doc = docstd.at "/bibdata"
34
34
  hash1 = RelatonBipm::XMLParser.from_xml(doc.to_xml).to_hash
35
35
  fix_si_brochure_id hash1
36
- outfile = File.join @data_fetcher.output, File.basename(f).sub(/(?:-(?:en|fr))?\.rxl$/, ".yaml")
37
- @data_fetcher.index[[hash1["docnumber"] || File.basename(outfile, ".yaml")]] = outfile
36
+ basename = File.join @data_fetcher.output, File.basename(f).sub(/(?:-(?:en|fr))?\.rxl$/, "")
37
+ outfile = "#{basename}.#{@data_fetcher.ext}"
38
+ key = hash1["docnumber"] || basename
39
+ @data_fetcher.index[[key]] = outfile
40
+ @data_fetcher.index_new.add_or_update [key], outfile
41
+ @data_fetcher.index2.add_or_update Id.new(key).normalized_hash, outfile
38
42
  hash = if File.exist? outfile
39
43
  warn_duplicate = false
40
44
  hash2 = YAML.load_file outfile
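The parser hunk above wraps the fetcher in `WeakRef.new`, so the parser no longer holds a strong reference to it. A minimal sketch of how such a wrapper behaves follows; `Fetcher` and `Parser` here are simplified stand-ins rather than the gem's classes.

```ruby
require "weakref"

# Simplified stand-ins for the fetcher and parser in the diff (assumptions).
Fetcher = Struct.new(:output, :ext)

class Parser
  def initialize(data_fetcher)
    # WeakRef delegates method calls to the wrapped object but does not
    # prevent it from being garbage collected.
    @data_fetcher = WeakRef.new(data_fetcher)
  end

  def outfile(basename)
    File.join(@data_fetcher.output, "#{basename}.#{@data_fetcher.ext}")
  end
end

fetcher = Fetcher.new("data", "yaml")
parser = Parser.new(fetcher)
puts parser.outfile("si-brochure")
```

The calls go through as long as something else keeps the fetcher alive. Once it has been collected, delegated calls raise `WeakRef::RefError`, so this pattern suits objects whose owner outlives the wrapper.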
@@ -15,7 +15,7 @@ module RelatonBipm
15
15
 
16
16
  # @param builder [Nokogiri::XML::builder]
17
17
  def to_xml(builder)
18
- builder.send "comment-period" do
18
+ builder.send :"comment-period" do
19
19
  builder.from from.to_s
20
20
  builder.to to.to_s if to
21
21
  end
@@ -1,6 +1,6 @@
1
1
  module RelatonBipm
2
2
  class DataFetcher
3
- attr_reader :output, :format, :ext, :files, :index
3
+ attr_reader :output, :format, :ext, :files, :index, :index_new, :index2
4
4
 
5
5
  #
6
6
  # Initialize fetcher
@@ -15,6 +15,8 @@ module RelatonBipm
15
15
  @files = []
16
16
  @index_path = "index.yaml"
17
17
  @index = File.exist?(@index_path) ? YAML.load_file(@index_path) : {}
18
+ @index_new = Relaton::Index.find_or_create :BIPM, file: "index-bipm.yaml"
19
+ @index2 = Relaton::Index.find_or_create :BIPM, file: "index2.yaml"
18
20
  end
19
21
 
20
22
  #
@@ -43,8 +45,11 @@ module RelatonBipm
43
45
  case source
44
46
  when "bipm-data-outcomes" then DataOutcomesParser.parse(self)
45
47
  when "bipm-si-brochure" then BipmSiBrochureParser.parse(self)
48
+ when "rawdata-bipm-metrologia" then RawdataBipmMetrologia::Fetcher.fetch(self)
46
49
  end
47
- File.write @index_path, @index.to_yaml, encoding: "UTF-8"
50
+ File.write @index_path, index.to_yaml, encoding: "UTF-8"
51
+ index_new.save
52
+ index2.save
48
53
  end
49
54
 
50
55
  #
@@ -54,15 +59,22 @@ module RelatonBipm
54
59
  # @param [RelatonBipm::BipmBibliographicItem] item document to save
55
60
  # @param [Boolean, nil] warn_duplicate Warn if document already exists
56
61
  #
57
- # @return [<Type>] <description>
58
- #
59
62
  def write_file(path, item, warn_duplicate: true)
63
+ content = serialize item
60
64
  if @files.include?(path)
61
65
  warn "File #{path} already exists" if warn_duplicate
62
66
  else
63
67
  @files << path
64
68
  end
65
- File.write path, item.to_hash.to_yaml, encoding: "UTF-8"
69
+ File.write path, content, encoding: "UTF-8"
70
+ end
71
+
72
+ def serialize(item)
73
+ case @format
74
+ when "xml" then item.to_xml bibdata: true
75
+ when "yaml" then item.to_hash.to_yaml
76
+ when "bibxml" then item.to_bibxml
77
+ end
66
78
  end
67
79
  end
68
80
  end
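The new `serialize` method above dispatches on `@format` with no fallback branch, so an unrecognized format yields `nil` and `write_file` would write an empty file. One possible hardening, shown as a sketch only, is to raise on unknown formats. `Item` is a hypothetical stand-in, and only the YAML branch is reproduced because the XML and BibXML serializers live in the gem.

```ruby
require "yaml"

# Hypothetical stand-in for a bibliographic item; only #to_hash is modelled.
Item = Struct.new(:id, :title) do
  def to_hash
    { "id" => id, "title" => title }
  end
end

# Serializes an item, failing loudly instead of returning nil when the
# format is not recognized.
def serialize(item, format)
  case format
  when "yaml" then item.to_hash.to_yaml
  else raise ArgumentError, "unsupported format: #{format.inspect}"
  end
end

item = Item.new("BIPM Metrologia 29 6 001", "Example title")
puts serialize(item, "yaml")
```

Whether a hard failure is preferable to a silent empty file depends on how the fetcher is driven; the diff itself leaves that open.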