commonmeta-ruby 3.3.3 → 3.3.4

Files changed (631)
  1. checksums.yaml +4 -4
  2. data/Gemfile.lock +1 -1
  3. data/bin/commonmeta +1 -1
  4. data/lib/commonmeta/cli.rb +7 -3
  5. data/lib/commonmeta/readers/json_feed_reader.rb +1 -1
  6. data/lib/commonmeta/utils.rb +34 -0
  7. data/lib/commonmeta/version.rb +1 -1
  8. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/change_metadata_as_datacite_xml/with_data_citation.yml +14 -14
  9. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/doi_registration_agency/crossref.yml +5 -5
  10. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/doi_registration_agency/datacite.yml +5 -5
  11. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/doi_registration_agency/jalc.yml +5 -5
  12. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/doi_registration_agency/kisti.yml +5 -5
  13. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/doi_registration_agency/medra.yml +5 -5
  14. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/doi_registration_agency/not_found.yml +5 -5
  15. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/doi_registration_agency/op.yml +5 -5
  16. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/find_from_format_by_ID/crossref.yml +5 -5
  17. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/find_from_format_by_ID/crossref_doi_not_url.yml +5 -5
  18. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/find_from_format_by_ID/datacite.yml +5 -5
  19. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/find_from_format_by_ID/datacite_doi_http.yml +5 -5
  20. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/find_from_format_by_ID/unknown_DOI_registration_agency.yml +5 -5
  21. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_cff_metadata/cff-converter-python.yml +9 -7
  22. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_cff_metadata/ruby-cff.yml +16 -14
  23. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_cff_metadata/ruby-cff_repository_url.yml +14 -12
  24. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_codemeta_metadata/maremma.yml +10 -8
  25. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_codemeta_metadata/metadata_reports.yml +9 -7
  26. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/DOI_with_ORCID_ID.yml +74 -74
  27. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/DOI_with_SICI_DOI.yml +73 -73
  28. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/DOI_with_data_citation.yml +70 -70
  29. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/JaLC.yml +159 -159
  30. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/KISTI.yml +128 -128
  31. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/OP.yml +72 -72
  32. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/affiliation_is_space.yml +73 -73
  33. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/another_book.yml +109 -109
  34. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/another_book_chapter.yml +71 -71
  35. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/article_id_as_page_number.yml +74 -74
  36. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/author_literal.yml +82 -82
  37. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/book.yml +71 -71
  38. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/book_chapter.yml +72 -72
  39. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/book_chapter_with_RDF_for_container.yml +70 -70
  40. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/book_oup.yml +69 -69
  41. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/component.yml +91 -91
  42. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/dataset.yml +101 -102
  43. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/dataset_usda.yml +133 -133
  44. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/date_in_future.yml +78 -78
  45. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/dissertation.yml +100 -100
  46. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/empty_given_name.yml +72 -72
  47. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/invalid_date.yml +73 -73
  48. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/journal_article.yml +72 -72
  49. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/journal_article_original_language_title.yml +70 -70
  50. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/journal_article_with.yml +76 -514
  51. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/journal_article_with_RDF_for_container.yml +70 -70
  52. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/journal_article_with_funding.yml +73 -73
  53. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/journal_issue.yml +69 -69
  54. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/mEDRA.yml +69 -69
  55. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/markup.yml +78 -78
  56. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/missing_creator.yml +73 -73
  57. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/multiple_issn.yml +72 -72
  58. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/multiple_titles.yml +71 -70
  59. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/multiple_titles_with_missing.yml +716 -716
  60. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/not_found_error.yml +63 -63
  61. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/peer_review.yml +74 -74
  62. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/posted_content.yml +71 -71
  63. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/posted_content_copernicus.yml +73 -73
  64. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/report_osti.yml +117 -117
  65. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/vor_with_url.yml +75 -75
  66. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/yet_another_book.yml +69 -69
  67. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/yet_another_book_chapter.yml +70 -70
  68. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_raw/journal_article.yml +10 -10
  69. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_datacite_metadata/dissertation.yml +9 -9
  70. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_datacite_metadata/funding_references.yml +11 -11
  71. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_datacite_metadata/subject_scheme.yml +20 -20
  72. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_doi_prefix_for_blog/by_blog_id.yml +6 -419
  73. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_doi_prefix_for_blog/by_blog_post_uuid.yml +7 -260
  74. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_doi_prefix_for_blog/by_blog_post_uuid_specific_prefix.yml +3 -136
  75. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/by_blog_id.yml +225 -1432
  76. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/not_indexed_posts.yml +1380 -2112
  77. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/unregistered_posts.yml +6 -172
  78. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/blog_post_with_non-url_id.yml +7 -7
  79. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/blogger_post.yml +12 -12
  80. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/ghost_post_with_author_name_suffix.yml +8 -8
  81. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/ghost_post_with_doi.yml +7 -7
  82. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/ghost_post_with_organizational_author.yml +3 -3
  83. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/ghost_post_without_doi.yml +8 -8
  84. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/jekyll_post.yml +8 -8
  85. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/substack_post_with_broken_reference.yml +90 -176
  86. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/syldavia_gazette_post_with_references.yml +25 -25
  87. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/upstream_post_with_references.yml +61 -61
  88. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/wordpress_post.yml +8 -8
  89. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/wordpress_post_with_references.yml +20 -20
  90. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_one_author/has_familyName.yml +9 -9
  91. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_one_author/has_name_in_display-order_with_ORCID.yml +9 -9
  92. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_one_author/name_with_affiliation_crossref.yml +14 -14
  93. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_one_author/only_familyName_and_givenName.yml +43 -36
  94. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_schema_org_metadata/BlogPosting.yml +158 -158
  95. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_schema_org_metadata/BlogPosting_with_new_DOI.yml +162 -162
  96. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_schema_org_metadata/get_schema_org_metadata_front_matter/BlogPosting.yml +178 -180
  97. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_schema_org_metadata/harvard_dataverse.yml +226 -230
  98. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_schema_org_metadata/pangaea.yml +43 -36
  99. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_schema_org_metadata/upstream_blog.yml +94 -94
  100. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_schema_org_metadata/zenodo.yml +14 -14
  101. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/handle_input/DOI_RA_not_Crossref_or_DataCite.yml +5 -5
  102. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/handle_input/unknown_DOI_prefix.yml +5 -5
  103. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/json_schema_errors/is_valid.yml +13 -13
  104. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/BlogPosting.yml +4 -4
  105. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/Dataset.yml +6 -6
  106. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/authors_with_affiliations.yml +14 -14
  107. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/climate_data.yml +6 -6
  108. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/from_schema_org.yml +159 -159
  109. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/keywords_subject_scheme.yml +6 -6
  110. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/maremma.yml +12 -10
  111. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/text.yml +4 -4
  112. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/with_data_citation.yml +14 -14
  113. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/with_pages.yml +12 -12
  114. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_cff/Collection_of_Jupyter_notebooks.yml +9 -9
  115. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_cff/SoftwareSourceCode_Zenodo.yml +17 -17
  116. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_cff/SoftwareSourceCode_also_Zenodo.yml +12 -12
  117. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_cff/ruby-cff.yml +16 -14
  118. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_citation/Dataset.yml +6 -6
  119. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_citation/Journal_article.yml +14 -14
  120. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_citation/Journal_article_vancouver_style.yml +19 -19
  121. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_citation/Missing_author.yml +12 -12
  122. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_citation/interactive_resource_without_dates.yml +4 -4
  123. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_citation/software_w/version.yml +6 -6
  124. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_codemeta/SoftwareSourceCode_DataCite.yml +4 -4
  125. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_codemeta/SoftwareSourceCode_DataCite_check_codemeta_v2.yml +4 -4
  126. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/another_schema_org_from_front-matter.yml +27 -27
  127. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/journal_article.yml +4 -4
  128. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/journal_article_from_datacite.yml +4 -4
  129. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/json_feed_item_from_rogue_scholar_with_doi.yml +8 -54
  130. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/json_feed_item_from_rogue_scholar_with_organizational_author.yml +3 -3
  131. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/json_feed_item_from_upstream_blog.yml +10 -53
  132. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/json_feed_item_with_references.yml +62 -62
  133. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/posted_content.yml +15 -15
  134. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/schema_org_from_another_science_blog.yml +6 -6
  135. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/schema_org_from_front_matter.yml +29 -29
  136. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/schema_org_from_upstream_blog.yml +4 -4
  137. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/Another_dataset.yml +28 -28
  138. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/BlogPosting.yml +4 -4
  139. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/BlogPosting_schema_org.yml +158 -158
  140. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/Dataset.yml +6 -6
  141. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/container_title.yml +11 -11
  142. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/interactive_resource_without_dates.yml +4 -4
  143. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/journal_article.yml +14 -14
  144. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/keywords_subject_scheme.yml +6 -6
  145. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/maremma.yml +9 -7
  146. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/missing_creator.yml +12 -12
  147. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/multiple_abstracts.yml +6 -6
  148. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/organization_author.yml +19 -19
  149. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/software.yml +4 -4
  150. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/software_w/version.yml +6 -6
  151. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/with_only_first_page.yml +13 -13
  152. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/with_pages.yml +12 -12
  153. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csv/climate_data.yml +6 -6
  154. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csv/maremma.yml +10 -8
  155. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csv/text.yml +4 -4
  156. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csv/with_data_citation.yml +14 -14
  157. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csv/with_pages.yml +12 -12
  158. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_datacite/dissertation.yml +17 -17
  159. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_datacite/from_schema_org.yml +158 -158
  160. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_datacite/journal_article.yml +18 -18
  161. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_datacite/maremma.yml +10 -8
  162. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_datacite/with_ORCID_ID.yml +12 -12
  163. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_datacite/with_data_citation.yml +14 -14
  164. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_jats_xml/Dataset_in_schema_4_0.yml +6 -6
  165. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_jats_xml/Text_pass-thru.yml +4 -4
  166. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_jats_xml/book_chapter.yml +15 -13
  167. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_jats_xml/from_schema_org.yml +158 -158
  168. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_jats_xml/interactive_resource_without_dates.yml +4 -4
  169. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_jats_xml/maremma.yml +12 -10
  170. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_jats_xml/with_ORCID_ID.yml +12 -12
  171. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_jats_xml/with_data_citation.yml +14 -14
  172. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_jats_xml/with_editor.yml +13 -13
  173. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_ris/BlogPosting.yml +4 -4
  174. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_ris/BlogPosting_schema_org.yml +159 -159
  175. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_ris/Dataset.yml +6 -6
  176. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_ris/alternate_name.yml +4 -4
  177. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_ris/journal_article.yml +8 -8
  178. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_ris/keywords_with_subject_scheme.yml +6 -6
  179. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_ris/maremma.yml +9 -7
  180. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_ris/with_pages.yml +7 -7
  181. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/Another_Schema_org_JSON.yml +6 -6
  182. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/Funding.yml +9 -9
  183. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/Funding_OpenAIRE.yml +9 -9
  184. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/Schema_org_JSON.yml +17 -17
  185. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/Schema_org_JSON_Cyark.yml +33 -33
  186. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/alternate_identifiers.yml +9 -9
  187. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/data_catalog.yml +9 -9
  188. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/geo_location_box.yml +12 -12
  189. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/interactive_resource_without_dates.yml +9 -9
  190. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/journal_article.yml +14 -14
  191. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/maremma_schema_org_JSON.yml +10 -8
  192. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/series_information.yml +9 -9
  193. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/subject_scheme.yml +11 -11
  194. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/subject_scheme_multiple_keywords.yml +11 -11
  195. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_turtle/BlogPosting.yml +4 -4
  196. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_turtle/Dataset.yml +6 -6
  197. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_turtle/journal_article.yml +14 -14
  198. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_turtle/with_pages.yml +12 -12
  199. data/spec/readers/cff_reader_spec.rb +6 -6
  200. data/spec/readers/crossref_reader_spec.rb +3 -3
  201. data/spec/readers/crossref_xml_reader_spec.rb +7 -7
  202. data/spec/readers/json_feed_reader_spec.rb +13 -13
  203. data/spec/readers/schema_org_reader_spec.rb +2 -3
  204. data/spec/spec_helper.rb +1 -0
  205. data/spec/utils_spec.rb +1 -1
  206. data/spec/writers/cff_writer_spec.rb +3 -3
  207. data/spec/writers/ris_writer_spec.rb +2 -2
  208. data/spec/writers/schema_org_writer_spec.rb +1 -1
  209. metadata +1 -423
  210. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref/default.yml +0 -110
  211. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref/to_bibtex.yml +0 -110
  212. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref/to_crossref_xml.yml +0 -110
  213. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref/to_datacite.yml +0 -110
  214. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref/to_schema_org.yml +0 -110
  215. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref_xml/default.yml +0 -55
  216. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref_xml/to_bibtex.yml +0 -55
  217. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref_xml/to_crossref_xml.yml +0 -55
  218. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref_xml/to_datacite.yml +0 -55
  219. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref_xml/to_schema_org.yml +0 -55
  220. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/crossref/default.yml +0 -299
  221. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/crossref/to_bibtex.yml +0 -299
  222. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/crossref/to_citation.yml +0 -299
  223. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/crossref/to_crossref_xml.yml +0 -299
  224. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/crossref/to_datacite.yml +0 -299
  225. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/crossref/to_jats.yml +0 -299
  226. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/crossref/to_schema_org.yml +0 -299
  227. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/datacite/default.yml +0 -172
  228. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/datacite/to_bibtex.yml +0 -172
  229. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/datacite/to_citation.yml +0 -172
  230. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/datacite/to_datacite.yml +0 -172
  231. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/datacite/to_jats.yml +0 -172
  232. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/datacite/to_schema_org.yml +0 -172
  233. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/schema_org/default.yml +0 -1098
  234. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/schema_org/to_datacite.yml +0 -1098
  235. data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/schema_org/to_schema_org.yml +0 -1100
  236. data/spec/fixtures/vcr_cassettes/Briard_CLI/find_from_format_by_id/crossref.yml +0 -55
  237. data/spec/fixtures/vcr_cassettes/Briard_CLI/find_from_format_by_id/datacite.yml +0 -55
  238. data/spec/fixtures/vcr_cassettes/Briard_CLI/find_from_format_by_id/jalc.yml +0 -55
  239. data/spec/fixtures/vcr_cassettes/Briard_CLI/find_from_format_by_id/kisti.yml +0 -55
  240. data/spec/fixtures/vcr_cassettes/Briard_CLI/find_from_format_by_id/medra.yml +0 -55
  241. data/spec/fixtures/vcr_cassettes/Briard_CLI/find_from_format_by_id/op.yml +0 -55
  242. data/spec/fixtures/vcr_cassettes/Briard_Metadata/authors_as_string/author.yml +0 -164
  243. data/spec/fixtures/vcr_cassettes/Briard_Metadata/authors_as_string/no_author.yml +0 -164
  244. data/spec/fixtures/vcr_cassettes/Briard_Metadata/authors_as_string/single_author.yml +0 -164
  245. data/spec/fixtures/vcr_cassettes/Briard_Metadata/authors_as_string/with_organization.yml +0 -164
  246. data/spec/fixtures/vcr_cassettes/Briard_Metadata/change_metadata_as_datacite_xml/with_data_citation.yml +0 -247
  247. data/spec/fixtures/vcr_cassettes/Briard_Metadata/doi_registration_agency/crossref.yml +0 -55
  248. data/spec/fixtures/vcr_cassettes/Briard_Metadata/doi_registration_agency/datacite.yml +0 -55
  249. data/spec/fixtures/vcr_cassettes/Briard_Metadata/doi_registration_agency/jalc.yml +0 -55
  250. data/spec/fixtures/vcr_cassettes/Briard_Metadata/doi_registration_agency/kisti.yml +0 -55
  251. data/spec/fixtures/vcr_cassettes/Briard_Metadata/doi_registration_agency/medra.yml +0 -55
  252. data/spec/fixtures/vcr_cassettes/Briard_Metadata/doi_registration_agency/not_found.yml +0 -55
  253. data/spec/fixtures/vcr_cassettes/Briard_Metadata/doi_registration_agency/op.yml +0 -55
  254. data/spec/fixtures/vcr_cassettes/Briard_Metadata/find_from_format_by_ID/crossref.yml +0 -55
  255. data/spec/fixtures/vcr_cassettes/Briard_Metadata/find_from_format_by_ID/crossref_doi_not_url.yml +0 -55
  256. data/spec/fixtures/vcr_cassettes/Briard_Metadata/find_from_format_by_ID/datacite.yml +0 -55
  257. data/spec/fixtures/vcr_cassettes/Briard_Metadata/find_from_format_by_ID/datacite_doi_http.yml +0 -55
  258. data/spec/fixtures/vcr_cassettes/Briard_Metadata/find_from_format_by_ID/unknown_DOI_registration_agency.yml +0 -55
  259. data/spec/fixtures/vcr_cassettes/Briard_Metadata/fos/hsh_to_fos_for_match.yml +0 -221
  260. data/spec/fixtures/vcr_cassettes/Briard_Metadata/fos/hsh_to_fos_for_with_schemeUri_in_hash.yml +0 -221
  261. data/spec/fixtures/vcr_cassettes/Briard_Metadata/fos/hsh_to_fos_match.yml +0 -221
  262. data/spec/fixtures/vcr_cassettes/Briard_Metadata/fos/hsh_to_fos_no_match.yml +0 -221
  263. data/spec/fixtures/vcr_cassettes/Briard_Metadata/fos/name_to_fos_for_match.yml +0 -221
  264. data/spec/fixtures/vcr_cassettes/Briard_Metadata/fos/name_to_fos_match.yml +0 -221
  265. data/spec/fixtures/vcr_cassettes/Briard_Metadata/fos/name_to_fos_no_match.yml +0 -221
  266. data/spec/fixtures/vcr_cassettes/Briard_Metadata/from_schema_org/with_id.yml +0 -221
  267. data/spec/fixtures/vcr_cassettes/Briard_Metadata/from_schema_org_creators/with_affiliation.yml +0 -221
  268. data/spec/fixtures/vcr_cassettes/Briard_Metadata/from_schema_org_creators/without_affiliation.yml +0 -221
  269. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_cff_metadata/cff-converter-python.yml +0 -200
  270. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_cff_metadata/ruby-cff.yml +0 -154
  271. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_cff_metadata/ruby-cff_repository_url.yml +0 -154
  272. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_codemeta_metadata/maremma.yml +0 -86
  273. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_codemeta_metadata/metadata_reports.yml +0 -93
  274. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/DOI_with_ORCID_ID.yml +0 -337
  275. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/DOI_with_SICI_DOI.yml +0 -347
  276. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/DOI_with_data_citation.yml +0 -359
  277. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/JaLC.yml +0 -384
  278. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/KISTI.yml +0 -330
  279. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/OP.yml +0 -969
  280. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/affiliation_is_space.yml +0 -358
  281. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/another_book.yml +0 -312
  282. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/another_book_chapter.yml +0 -465
  283. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/article_id_as_page_number.yml +0 -276
  284. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/author_literal.yml +0 -492
  285. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/book.yml +0 -523
  286. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/book_chapter.yml +0 -377
  287. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/book_chapter_with_RDF_for_container.yml +0 -336
  288. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/book_oup.yml +0 -289
  289. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/component.yml +0 -289
  290. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/dataset.yml +0 -299
  291. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/dataset_usda.yml +0 -341
  292. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/date_in_future.yml +0 -570
  293. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/dissertation.yml +0 -301
  294. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/empty_given_name.yml +0 -303
  295. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/invalid_date.yml +0 -307
  296. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/journal_article.yml +0 -461
  297. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/journal_article_original_language_title.yml +0 -276
  298. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/journal_article_with.yml +0 -470
  299. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/journal_article_with_RDF_for_container.yml +0 -519
  300. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/journal_article_with_funding.yml +0 -456
  301. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/journal_issue.yml +0 -270
  302. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/mEDRA.yml +0 -310
  303. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/markup.yml +0 -329
  304. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/missing_creator.yml +0 -307
  305. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/multiple_issn.yml +0 -393
  306. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/multiple_titles.yml +0 -265
  307. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/multiple_titles_with_missing.yml +0 -860
  308. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/not_found_error.yml +0 -209
  309. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/peer_review.yml +0 -287
  310. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/posted_content.yml +0 -326
  311. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/posted_content_copernicus.yml +0 -297
  312. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/report_osti.yml +0 -315
  313. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/vor_with_url.yml +0 -451
  314. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/yet_another_book.yml +0 -816
  315. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/yet_another_book_chapter.yml +0 -324
  316. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_raw/journal_article.yml +0 -110
  317. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_datacite_metadata/dissertation.yml +0 -152
  318. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_datacite_metadata/funding_references.yml +0 -175
  319. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_datacite_metadata/subject_scheme.yml +0 -328
  320. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date/publication_date.yml +0 -221
  321. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date_from_date_parts/date.yml +0 -221
  322. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date_from_date_parts/year-month.yml +0 -221
  323. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date_from_date_parts/year.yml +0 -221
  324. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date_from_parts/date.yml +0 -221
  325. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date_from_parts/year-month.yml +0 -221
  326. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date_from_parts/year.yml +0 -221
  327. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date_parts/date.yml +0 -221
  328. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date_parts/year-month.yml +0 -221
  329. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date_parts/year.yml +0 -221
  330. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_datetime_from_time/future.yml +0 -221
  331. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_datetime_from_time/invalid.yml +0 -221
  332. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_datetime_from_time/nil.yml +0 -221
  333. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_datetime_from_time/past.yml +0 -221
  334. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_datetime_from_time/present.yml +0 -221
  335. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_one_author/has_familyName.yml +0 -133
  336. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_one_author/has_name_in_display-order_with_ORCID.yml +0 -153
  337. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_one_author/is_organization.yml +0 -164
  338. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_one_author/name_with_affiliation_crossref.yml +0 -247
  339. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_one_author/only_familyName_and_givenName.yml +0 -468
  340. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_schema_org_metadata/BlogPosting.yml +0 -530
  341. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_schema_org_metadata/BlogPosting_with_new_DOI.yml +0 -530
  342. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_schema_org_metadata/get_schema_org_metadata_front_matter/BlogPosting.yml +0 -534
  343. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_schema_org_metadata/harvard_dataverse.yml +0 -1838
  344. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_schema_org_metadata/pangaea.yml +0 -468
  345. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_schema_org_metadata/upstream_blog.yml +0 -885
  346. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_schema_org_metadata/zenodo.yml +0 -583
  347. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_series_information/only_title.yml +0 -221
  348. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_series_information/title_and_pages.yml +0 -221
  349. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_series_information/title_volume_and_pages.yml +0 -221
  350. data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_series_information/title_volume_issue_and_pages.yml +0 -221
  351. data/spec/fixtures/vcr_cassettes/Briard_Metadata/github/github_as_cff_url.yml +0 -221
  352. data/spec/fixtures/vcr_cassettes/Briard_Metadata/github/github_as_codemeta_url.yml +0 -221
  353. data/spec/fixtures/vcr_cassettes/Briard_Metadata/github/github_from_url.yml +0 -221
  354. data/spec/fixtures/vcr_cassettes/Briard_Metadata/github/github_from_url_cff_file.yml +0 -221
  355. data/spec/fixtures/vcr_cassettes/Briard_Metadata/github/github_from_url_file.yml +0 -221
  356. data/spec/fixtures/vcr_cassettes/Briard_Metadata/handle_input/DOI_RA_not_Crossref_or_DataCite.yml +0 -55
  357. data/spec/fixtures/vcr_cassettes/Briard_Metadata/handle_input/unknown_DOI_prefix.yml +0 -55
  358. data/spec/fixtures/vcr_cassettes/Briard_Metadata/is_personal_name_/has_comma.yml +0 -164
  359. data/spec/fixtures/vcr_cassettes/Briard_Metadata/is_personal_name_/has_family_name.yml +0 -164
  360. data/spec/fixtures/vcr_cassettes/Briard_Metadata/is_personal_name_/has_id.yml +0 -164
  361. data/spec/fixtures/vcr_cassettes/Briard_Metadata/is_personal_name_/has_known_given_name.yml +0 -164
  362. data/spec/fixtures/vcr_cassettes/Briard_Metadata/is_personal_name_/has_no_info.yml +0 -164
  363. data/spec/fixtures/vcr_cassettes/Briard_Metadata/is_personal_name_/has_orcid_id.yml +0 -164
  364. data/spec/fixtures/vcr_cassettes/Briard_Metadata/is_personal_name_/has_type_organization.yml +0 -164
  365. data/spec/fixtures/vcr_cassettes/Briard_Metadata/json_schema_errors/is_valid.yml +0 -221
  366. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_cc_url/not_found.yml +0 -221
  367. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_cc_url/with_trailing_slash.yml +0 -221
  368. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_cc_url/with_trailing_slash_and_to_https.yml +0 -221
  369. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/doi.yml +0 -221
  370. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/doi_as_url.yml +0 -221
  371. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/filename.yml +0 -221
  372. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/ftp.yml +0 -221
  373. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/invalid_url.yml +0 -221
  374. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/sandbox_via_options.yml +0 -221
  375. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/sandbox_via_url.yml +0 -221
  376. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/string.yml +0 -221
  377. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/url.yml +0 -221
  378. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/url_with_utf-8.yml +0 -221
  379. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_ids/doi.yml +0 -221
  380. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_ids/url.yml +0 -221
  381. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_issn/from_array.yml +0 -221
  382. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_issn/from_empty_array.yml +0 -221
  383. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_issn/from_hash.yml +0 -221
  384. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_issn/from_string.yml +0 -221
  385. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_url/uri.yml +0 -221
  386. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_url/with_trailing_slash.yml +0 -221
  387. data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_url/with_trailing_slash_and_to_https.yml +0 -221
  388. data/spec/fixtures/vcr_cassettes/Briard_Metadata/parse_attributes/array.yml +0 -221
  389. data/spec/fixtures/vcr_cassettes/Briard_Metadata/parse_attributes/array_of_strings.yml +0 -221
  390. data/spec/fixtures/vcr_cassettes/Briard_Metadata/parse_attributes/first.yml +0 -221
  391. data/spec/fixtures/vcr_cassettes/Briard_Metadata/parse_attributes/hash.yml +0 -221
  392. data/spec/fixtures/vcr_cassettes/Briard_Metadata/parse_attributes/hash_with_array_value.yml +0 -221
  393. data/spec/fixtures/vcr_cassettes/Briard_Metadata/parse_attributes/nil.yml +0 -221
  394. data/spec/fixtures/vcr_cassettes/Briard_Metadata/parse_attributes/string.yml +0 -221
  395. data/spec/fixtures/vcr_cassettes/Briard_Metadata/random_doi/decode_anothe_doi.yml +0 -221
  396. data/spec/fixtures/vcr_cassettes/Briard_Metadata/random_doi/decode_doi.yml +0 -221
  397. data/spec/fixtures/vcr_cassettes/Briard_Metadata/random_doi/encode_doi.yml +0 -221
  398. data/spec/fixtures/vcr_cassettes/Briard_Metadata/sanitize/onlies_keep_specific_tags.yml +0 -221
  399. data/spec/fixtures/vcr_cassettes/Briard_Metadata/sanitize/removes_a_tags.yml +0 -221
  400. data/spec/fixtures/vcr_cassettes/Briard_Metadata/spdx/hsh_to_spdx_id.yml +0 -221
  401. data/spec/fixtures/vcr_cassettes/Briard_Metadata/spdx/hsh_to_spdx_not_found.yml +0 -221
  402. data/spec/fixtures/vcr_cassettes/Briard_Metadata/spdx/hsh_to_spdx_url.yml +0 -221
  403. data/spec/fixtures/vcr_cassettes/Briard_Metadata/spdx/name_to_spdx_exists.yml +0 -221
  404. data/spec/fixtures/vcr_cassettes/Briard_Metadata/spdx/name_to_spdx_id.yml +0 -221
  405. data/spec/fixtures/vcr_cassettes/Briard_Metadata/to_schema_org/with_id.yml +0 -221
  406. data/spec/fixtures/vcr_cassettes/Briard_Metadata/to_schema_org_identifiers/with_identifiers.yml +0 -221
  407. data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid/validate_orcid.yml +0 -221
  408. data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid/validate_orcid_https.yml +0 -221
  409. data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid/validate_orcid_id.yml +0 -221
  410. data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid/validate_orcid_sandbox.yml +0 -221
  411. data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid/validate_orcid_sandbox_https.yml +0 -221
  412. data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid/validate_orcid_with_spaces.yml +0 -221
  413. data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid/validate_orcid_wrong_id.yml +0 -221
  414. data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid/validate_orcid_www.yml +0 -221
  415. data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid_scheme/validate_orcid_scheme.yml +0 -221
  416. data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid_scheme/validate_orcid_scheme_https.yml +0 -221
  417. data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid_scheme/validate_orcid_scheme_trailing_slash.yml +0 -221
  418. data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid_scheme/validate_orcid_scheme_www.yml +0 -221
  419. data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_url/DOI.yml +0 -221
  420. data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_url/ISSN.yml +0 -221
  421. data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_url/URL.yml +0 -221
  422. data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_url/string.yml +0 -221
  423. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/BlogPosting.yml +0 -81
  424. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/Dataset.yml +0 -120
  425. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/authors_with_affiliations.yml +0 -186
  426. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/climate_data.yml +0 -74
  427. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/from_schema_org.yml +0 -530
  428. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/keywords_subject_scheme.yml +0 -149
  429. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/maremma.yml +0 -86
  430. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/text.yml +0 -100
  431. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/with_data_citation.yml +0 -247
  432. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/with_pages.yml +0 -228
  433. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_cff/Collection_of_Jupyter_notebooks.yml +0 -143
  434. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_cff/SoftwareSourceCode_Zenodo.yml +0 -150
  435. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_cff/SoftwareSourceCode_also_Zenodo.yml +0 -93
  436. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_cff/ruby-cff.yml +0 -154
  437. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_citation/Dataset.yml +0 -120
  438. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_citation/Journal_article.yml +0 -247
  439. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_citation/Journal_article_vancouver_style.yml +0 -299
  440. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_citation/Missing_author.yml +0 -199
  441. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_citation/interactive_resource_without_dates.yml +0 -75
  442. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_citation/software_w/version.yml +0 -86
  443. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_codemeta/SoftwareSourceCode_DataCite.yml +0 -76
  444. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_codemeta/SoftwareSourceCode_DataCite_check_codemeta_v2.yml +0 -76
  445. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_crossref/another_schema_org_from_front-matter.yml +0 -541
  446. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_crossref/journal_article.yml +0 -55
  447. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_crossref/journal_article_from_datacite.yml +0 -85
  448. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_crossref/posted_content.yml +0 -283
  449. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_crossref/schema_org_from_another_science_blog.yml +0 -123
  450. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_crossref/schema_org_from_front_matter.yml +0 -477
  451. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_crossref/schema_org_from_upstream_blog.yml +0 -1025
  452. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/Another_dataset.yml +0 -110
  453. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/BlogPosting.yml +0 -81
  454. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/BlogPosting_schema_org.yml +0 -530
  455. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/Dataset.yml +0 -120
  456. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/container_title.yml +0 -153
  457. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/interactive_resource_without_dates.yml +0 -75
  458. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/journal_article.yml +0 -247
  459. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/keywords_subject_scheme.yml +0 -149
  460. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/maremma.yml +0 -86
  461. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/missing_creator.yml +0 -199
  462. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/multiple_abstracts.yml +0 -101
  463. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/organization_author.yml +0 -314
  464. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/software.yml +0 -90
  465. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/software_w/version.yml +0 -86
  466. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/with_only_first_page.yml +0 -333
  467. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/with_pages.yml +0 -228
  468. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csv/climate_data.yml +0 -74
  469. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csv/maremma.yml +0 -86
  470. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csv/text.yml +0 -100
  471. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csv/with_data_citation.yml +0 -247
  472. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csv/with_pages.yml +0 -228
  473. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_datacite/from_schema_org.yml +0 -530
  474. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_datacite/maremma.yml +0 -86
  475. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_datacite/with_ORCID_ID.yml +0 -228
  476. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_datacite/with_data_citation.yml +0 -247
  477. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_datacite_json/from_schema_org.yml +0 -530
  478. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_datacite_json/maremma.yml +0 -86
  479. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_datacite_json/with_ORCID_ID.yml +0 -228
  480. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_datacite_json/with_data_citation.yml +0 -247
  481. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_jats_xml/Dataset_in_schema_4_0.yml +0 -120
  482. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_jats_xml/Text_pass-thru.yml +0 -106
  483. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_jats_xml/book_chapter.yml +0 -163
  484. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_jats_xml/from_schema_org.yml +0 -530
  485. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_jats_xml/interactive_resource_without_dates.yml +0 -75
  486. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_jats_xml/maremma.yml +0 -86
  487. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_jats_xml/with_ORCID_ID.yml +0 -228
  488. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_jats_xml/with_data_citation.yml +0 -247
  489. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_jats_xml/with_editor.yml +0 -355
  490. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_rdf_xml/BlogPosting.yml +0 -81
  491. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_rdf_xml/BlogPosting_schema_org.yml +0 -530
  492. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_rdf_xml/journal_article.yml +0 -247
  493. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_rdf_xml/maremma.yml +0 -86
  494. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_rdf_xml/with_pages.yml +0 -228
  495. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_ris/BlogPosting.yml +0 -81
  496. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_ris/BlogPosting_schema_org.yml +0 -530
  497. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_ris/Dataset.yml +0 -120
  498. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_ris/alternate_name.yml +0 -138
  499. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_ris/journal_article.yml +0 -115
  500. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_ris/keywords_with_subject_scheme.yml +0 -149
  501. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_ris/maremma.yml +0 -86
  502. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_ris/with_pages.yml +0 -112
  503. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/Another_Schema_org_JSON.yml +0 -120
  504. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/Funding.yml +0 -192
  505. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/Funding_OpenAIRE.yml +0 -150
  506. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/Schema_org_JSON.yml +0 -98
  507. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/Schema_org_JSON_Cyark.yml +0 -160
  508. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/Schema_org_JSON_IsSupplementTo.yml +0 -153
  509. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/alternate_identifiers.yml +0 -131
  510. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/data_catalog.yml +0 -136
  511. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/geo_location_box.yml +0 -181
  512. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/interactive_resource_without_dates.yml +0 -127
  513. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/journal_article.yml +0 -247
  514. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/maremma_schema_org_JSON.yml +0 -86
  515. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/series_information.yml +0 -174
  516. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/subject_scheme.yml +0 -199
  517. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/subject_scheme_multiple_keywords.yml +0 -201
  518. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_turtle/BlogPosting.yml +0 -81
  519. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_turtle/BlogPosting_schema_org.yml +0 -530
  520. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_turtle/Dataset.yml +0 -120
  521. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_turtle/journal_article.yml +0 -247
  522. data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_turtle/with_pages.yml +0 -228
  523. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/fos/hsh_to_fos_for_match.yml +0 -221
  524. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/fos/hsh_to_fos_for_with_schemeUri_in_hash.yml +0 -221
  525. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/fos/hsh_to_fos_match.yml +0 -221
  526. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/fos/hsh_to_fos_no_match.yml +0 -221
  527. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/fos/name_to_fos_for_match.yml +0 -221
  528. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/fos/name_to_fos_match.yml +0 -221
  529. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/fos/name_to_fos_no_match.yml +0 -221
  530. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/from_schema_org/with_id.yml +0 -221
  531. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date/publication_date.yml +0 -221
  532. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date_from_date_parts/date.yml +0 -221
  533. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date_from_date_parts/year-month.yml +0 -221
  534. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date_from_date_parts/year.yml +0 -221
  535. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date_from_parts/date.yml +0 -221
  536. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date_from_parts/year-month.yml +0 -221
  537. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date_from_parts/year.yml +0 -221
  538. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date_parts/date.yml +0 -221
  539. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date_parts/year-month.yml +0 -221
  540. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date_parts/year.yml +0 -221
  541. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_datetime_from_time/future.yml +0 -221
  542. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_datetime_from_time/invalid.yml +0 -221
  543. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_datetime_from_time/nil.yml +0 -221
  544. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_datetime_from_time/past.yml +0 -221
  545. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_datetime_from_time/present.yml +0 -221
  546. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/all_posts.yml +0 -3602
  547. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/behind_the_science.yml +0 -1176
  548. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/citation_style_language.yml +0 -360
  549. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/citation_style_language_blog.yml +0 -360
  550. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/front-matter_blog.yml +0 -1034
  551. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/upstream.yml +0 -2438
  552. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/upstream_blog.yml +0 -2438
  553. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item/by_uuid.yml +0 -136
  554. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_link/license.yml +0 -221
  555. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_link/url.yml +0 -221
  556. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_series_information/only_title.yml +0 -221
  557. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_series_information/title_and_pages.yml +0 -221
  558. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_series_information/title_volume_and_pages.yml +0 -221
  559. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_series_information/title_volume_issue_and_pages.yml +0 -221
  560. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/github/github_as_cff_url.yml +0 -221
  561. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/github/github_as_codemeta_url.yml +0 -221
  562. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/github/github_from_url.yml +0 -221
  563. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/github/github_from_url_cff_file.yml +0 -221
  564. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/github/github_from_url_file.yml +0 -221
  565. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/json_feed_unregistered_url/all_posts.yml +0 -221
  566. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_cc_url/not_found.yml +0 -221
  567. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_cc_url/with_trailing_slash.yml +0 -221
  568. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_cc_url/with_trailing_slash_and_to_https.yml +0 -221
  569. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/doi.yml +0 -221
  570. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/doi_as_url.yml +0 -221
  571. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/filename.yml +0 -221
  572. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/ftp.yml +0 -221
  573. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/invalid_url.yml +0 -221
  574. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/sandbox_via_options.yml +0 -221
  575. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/sandbox_via_url.yml +0 -221
  576. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/string.yml +0 -221
  577. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/url.yml +0 -221
  578. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/url_with_utf-8.yml +0 -221
  579. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_issn/from_array.yml +0 -221
  580. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_issn/from_empty_array.yml +0 -221
  581. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_issn/from_hash.yml +0 -221
  582. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_issn/from_string.yml +0 -221
  583. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_url/uri.yml +0 -221
  584. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_url/with_trailing_slash.yml +0 -221
  585. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_url/with_trailing_slash_and_to_https.yml +0 -221
  586. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/parse_attributes/array.yml +0 -221
  587. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/parse_attributes/array_of_strings.yml +0 -221
  588. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/parse_attributes/first.yml +0 -221
  589. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/parse_attributes/hash.yml +0 -221
  590. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/parse_attributes/hash_with_array_value.yml +0 -221
  591. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/parse_attributes/nil.yml +0 -221
  592. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/parse_attributes/string.yml +0 -221
  593. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/random_doi/decode_anothe_doi.yml +0 -221
  594. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/random_doi/decode_another_doi.yml +0 -221
  595. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/random_doi/decode_doi.yml +0 -221
  596. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/random_doi/encode_doi.yml +0 -221
  597. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/random_id/decode_another_id.yml +0 -221
  598. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/random_id/decode_id.yml +0 -221
  599. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/random_id/encode_id.yml +0 -221
  600. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/sanitize/onlies_keep_specific_tags.yml +0 -221
  601. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/sanitize/removes_a_tags.yml +0 -221
  602. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/spdx/hsh_to_spdx_id.yml +0 -221
  603. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/spdx/hsh_to_spdx_not_found.yml +0 -221
  604. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/spdx/hsh_to_spdx_url.yml +0 -221
  605. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/spdx/name_to_spdx_exists.yml +0 -221
  606. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/spdx/name_to_spdx_id.yml +0 -221
  607. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/to_schema_org/with_id.yml +0 -221
  608. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/to_schema_org_identifiers/with_identifiers.yml +0 -221
  609. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid/validate_orcid.yml +0 -221
  610. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid/validate_orcid_https.yml +0 -221
  611. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid/validate_orcid_id.yml +0 -221
  612. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid/validate_orcid_sandbox.yml +0 -221
  613. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid/validate_orcid_sandbox_https.yml +0 -221
  614. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid/validate_orcid_with_spaces.yml +0 -221
  615. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid/validate_orcid_wrong_id.yml +0 -221
  616. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid/validate_orcid_www.yml +0 -221
  617. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid_scheme/validate_orcid_scheme.yml +0 -221
  618. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid_scheme/validate_orcid_scheme_https.yml +0 -221
  619. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid_scheme/validate_orcid_scheme_trailing_slash.yml +0 -221
  620. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid_scheme/validate_orcid_scheme_www.yml +0 -221
  621. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_url/DOI.yml +0 -221
  622. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_url/ISSN.yml +0 -221
  623. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_url/URL.yml +0 -221
  624. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_url/string.yml +0 -221
  625. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_rdf_xml/BlogPosting.yml +0 -81
  626. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_rdf_xml/BlogPosting_schema_org.yml +0 -530
  627. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_rdf_xml/journal_article.yml +0 -247
  628. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_rdf_xml/maremma.yml +0 -86
  629. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_rdf_xml/with_pages.yml +0 -228
  630. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/Schema_org_JSON_IsSupplementTo.yml +0 -153
  631. data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_turtle/BlogPosting_schema_org.yml +0 -530
@@ -23,13 +23,13 @@ http_interactions:
  Cache-Control:
  - public, max-age=0, must-revalidate
  Content-Length:
- - '88356'
+ - '49530'
  Content-Type:
  - application/json; charset=utf-8
  Date:
- - Sun, 18 Jun 2023 05:50:13 GMT
+ - Sun, 18 Jun 2023 15:24:19 GMT
  Etag:
- - '"xxw11pz9vb1w14"'
+ - '"xv42bhvvc21253"'
  Server:
  - Vercel
  Strict-Transport-Security:
@@ -39,7 +39,7 @@ http_interactions:
  X-Vercel-Cache:
  - MISS
  X-Vercel-Id:
- - fra1::iad1::t4nqn-1687067412582-eedf46c90adb
+ - fra1::iad1::6gh26-1687101859043-0548b5ea306e
  Connection:
  - close
  body:
@@ -50,420 +50,7 @@ http_interactions:
50
50
  feed</a>.<br>ISSN 2051-8188. Written content on this site is licensed under
51
51
  a <a href=\"https://creativecommons.org/licenses/by/4.0/\">Creative Commons
52
52
  Attribution 4.0 International license</a>.","language":"en","favicon":null,"feed_url":"https://iphylo.blogspot.com/feeds/posts/default","feed_format":"application/atom+xml","home_page_url":"https://iphylo.blogspot.com/","indexed_at":"2023-02-06","modified_at":"2023-05-31T17:26:00+00:00","license":"https://creativecommons.org/licenses/by/4.0/legalcode","generator":"Blogger
53
- 7.00","category":"Natural Sciences","backlog":true,"prefix":"10.59350","items":[{"id":"https://doi.org/10.59350/en7e9-5s882","uuid":"20b9d31e-513f-496b-b399-4215306e1588","url":"https://iphylo.blogspot.com/2022/04/obsidian-markdown-and-taxonomic-trees.html","title":"Obsidian,
54
- markdown, and taxonomic trees","summary":"Returning to the subject of personal
55
- knowledge graphs Kyle Scheer has an interesting repository of Markdown files
56
- that describe academic disciplines at https://github.com/kyletscheer/academic-disciplines
57
- (see his blog post for more background). If you add these files to Obsidian
58
- you get a nice visualisation of a taxonomy of academic disciplines. The applications
59
- of this to biological taxonomy seem obvious, especially as a tool like Obsidian
60
- enables all sorts of interesting links to be added...","date_published":"2022-04-07T21:07:00Z","date_modified":"2022-04-07T21:15:34Z","date_indexed":"1909-06-16T09:41:45+00:00","authors":[{"url":null,"name":"Roderic
61
- Page"}],"image":null,"content_html":"<p>Returning to the subject of <a href=\"https://iphylo.blogspot.com/2020/08/personal-knowledge-graphs-obsidian-roam.html\">personal
62
- knowledge graphs</a> Kyle Scheer has an interesting repository of Markdown
63
- files that describe academic disciplines at <a href=\"https://github.com/kyletscheer/academic-disciplines\">https://github.com/kyletscheer/academic-disciplines</a>
64
- (see <a href=\"https://kyletscheer.medium.com/on-creating-a-tree-of-knowledge-f099c1028bf6\">his
65
- blog post</a> for more background).</p>\n\n<p>If you add these files to <a
66
- href=\"https://obsidian.md/\">Obsidian</a> you get a nice visualisation of
67
- a taxonomy of academic disciplines. The applications of this to biological
68
- taxonomy seem obvious, especially as a tool like Obsidian enables all sorts
69
- of interesting links to be added (e.g., we could add links to the taxonomic
70
- research behind each node in the taxonomic tree, the people doing that research,
71
- etc. - although that would mean we''d no longer have a simple tree).</p>\n\n<p>The
72
- more I look at these sort of simple Markdown-based tools the more I wonder
73
- whether we could make more use of them to create simple but persistent databases.
74
- Text files seem the most stable, long-lived digital format around, maybe this
75
- would be a way to minimise the inevitable obsolescence of database and server
76
- software. Time for some experiments I feel... can we take a taxonomic group,
77
- such as mammals, and create a richly connected database purely in Markdown?</p>\n\n<div
78
- class=\"separator\" style=\"clear: both; text-align: center;\"><iframe allowfullscreen=''allowfullscreen''
79
- webkitallowfullscreen=''webkitallowfullscreen'' mozallowfullscreen=''mozallowfullscreen''
80
- width=''400'' height=''322'' src=''https://www.blogger.com/video.g?token=AD6v5dxZtweOTJTdg6aqvICq_tKF0la1QZuDAEpwPPCVQKtG5vjB-DzuQv-ApL8JnpyZ1FffYtWo6ymizNQ''
81
- class=''b-hbp-video b-uploaded'' frameborder=''0''></iframe></div>","tags":["markdown","obsidian"],"language":"en","references":[]},{"id":"https://doi.org/10.59350/m48f7-c2128","uuid":"8aea47e4-f227-45f4-b37b-0454a8a7a3ff","url":"https://iphylo.blogspot.com/2023/04/chatgpt-semantic-search-and-knowledge.html","title":"ChatGPT,
82
- semantic search, and knowledge graphs","summary":"One thing about ChatGPT
83
- is it has opened my eyes to some concepts I was dimly aware of but am only
84
- now beginning to fully appreciate. ChatGPT enables you ask it questions, but
85
- the answers depend on what ChatGPT “knows”. As several people have noted,
86
- what would be even better is to be able to run ChatGPT on your own content.
87
- Indeed, ChatGPT itself now supports this using plugins. Paul Graham GPT However,
88
- it’s still useful to see how to add ChatGPT functionality to your own content
89
- from...","date_published":"2023-04-03T15:30:00Z","date_modified":"2023-04-03T15:32:04Z","date_indexed":"1909-06-16T09:02:34+00:00","authors":[{"url":null,"name":"Roderic
90
- Page"}],"image":null,"content_html":"<p>One thing about ChatGPT is it has
91
- opened my eyes to some concepts I was dimly aware of but am only now beginning
92
- to fully appreciate. ChatGPT enables you ask it questions, but the answers
93
- depend on what ChatGPT “knows”. As several people have noted, what would be
94
- even better is to be able to run ChatGPT on your own content. Indeed, ChatGPT
95
- itself now supports this using <a href=\"https://openai.com/blog/chatgpt-plugins\">plugins</a>.</p>\n<h4
96
- id=\"paul-graham-gpt\">Paul Graham GPT</h4>\n<p>However, it’s still useful
97
- to see how to add ChatGPT functionality to your own content from scratch.
98
- A nice example of this is <a href=\"https://paul-graham-gpt.vercel.app/\">Paul
99
- Graham GPT</a> by <a href=\"https://twitter.com/mckaywrigley\">Mckay Wrigley</a>.
100
- Mckay Wrigley took essays by Paul Graham (a well known venture capitalist)
101
- and built a question and answer tool very like ChatGPT.</p>\n<iframe width=\"560\"
102
- height=\"315\" src=\"https://www.youtube.com/embed/ii1jcLg-eIQ\" title=\"YouTube
103
- video player\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write;
104
- encrypted-media; gyroscope; picture-in-picture; web-share\" allowfullscreen></iframe>\n<p>Because
105
- you can send a block of text to ChatGPT (as part of the prompt) you can get
106
- ChatGPT to summarise or transform that information, or answer questions based
107
- on that information. But there is a limit to how much information you can
108
- pack into a prompt. You can’t put all of Paul Graham’s essays into a prompt
109
- for example. So a solution is to do some preprocessing. For example, given
110
- a question such as “How do I start a startup?” we could first find the essays
111
- that are most relevant to this question, then use them to create a prompt
112
- for ChatGPT. A quick and dirty way to do this is simply do a text search over
113
- the essays and take the top hits. But we aren’t searching for words, we are
114
- searching for answers to a question. The essay with the best answer might
115
- not include the phrase “How do I start a startup?”.</p>\n<h4 id=\"semantic-search\">Semantic
116
- search</h4>\n<p>Enter <a href=\"https://en.wikipedia.org/wiki/Semantic_search\">Semantic
117
- search</a>. The key concept behind semantic search is that we are looking
118
- for documents with similar meaning, not just similarity of text. One approach
119
- to this is to represent documents by “embeddings”, that is, a vector of numbers
120
- that encapsulate features of the document. Documents with similar vectors
121
- are potentially related. In semantic search we take the query (e.g., “How
122
- do I start a startup?”), compute its embedding, then search among the documents
123
- for those with similar embeddings.</p>\n<p>To create Paul Graham GPT Mckay
124
- Wrigley did the following. First he sent each essay to the OpenAI API underlying
125
- ChatGPT, and in return he got the embedding for that essay (a vector of 1536
126
- numbers). Each embedding was stored in a database (Mckay uses Postgres with
127
- <a href=\"https://github.com/pgvector/pgvector\">pgvector</a>). When a user
128
- enters a query such as “How do I start a startup?” that query is also sent
129
- to the OpenAI API to retrieve its embedding vector. Then we query the database
130
- of embeddings for Paul Graham’s essays and take the top five hits. These hits
131
- are, one hopes, the most likely to contain relevant answers. The original
132
- question and the most similar essays are then bundled up and sent to ChatGPT
133
- which then synthesises an answer. See his <a href=\"https://github.com/mckaywrigley/paul-graham-gpt\">GitHub
134
- repo</a> for more details. Note that we are still using ChatGPT, but on a
135
- set of documents it doesn’t already have.</p>\n<h4 id=\"knowledge-graphs\">Knowledge
136
- graphs</h4>\n<p>I’m a fan of knowledge graphs, but they are not terribly easy
137
- to use. For example, I built a knowledge graph of Australian animals <a href=\"https://ozymandias-demo.herokuapp.com\">Ozymandias</a>
138
- that contains a wealth of information on taxa, publications, and people, wrapped
139
- up in a web site. If you want to learn more you need to figure out how to
140
- write queries in SPARQL, which is not fun. Maybe we could use ChatGPT to write
141
- the SPARQL queries for us, but it would be much more fun to be simply ask
142
- natural language queries (e.g., “who are the experts on Australian ants?”).
143
- I made some naïve notes on these ideas <a href=\"https://iphylo.blogspot.com/2015/09/possible-project-natural-language.html\">Possible
144
- project: natural language queries, or answering “how many species are there?”</a>
145
- and <a href=\"https://iphylo.blogspot.com/2019/05/ozymandias-meets-wikipedia-with-notes.html\">Ozymandias
146
- meets Wikipedia, with notes on natural language generation</a>.</p>\n<p>Of
147
- course, this is a well known problem. Tools such as <a href=\"http://rdf2vec.org\">RDF2vec</a>
148
- can take RDF from a knowledge graph and create embeddings which could in tern
149
- be used to support semantic search. But it seems to me that we could simply
150
- this process a bit by making use of ChatGPT.</p>\n<p>Firstly we would generate
151
- natural language statements from the knowledge graph (e.g., “species x belongs
152
- to genus y and was described in z”, “this paper on ants was authored by x”,
153
- etc.) that cover the basic questions we expect people to ask. We then get
154
- embeddings for these (e.g., using OpenAI). We then have an interface where
155
- people can ask a question (“is species x a valid species?”, “who has published
156
- on ants”, etc.), we get the embedding for that question, retrieve natural
157
- language statements that the closest in embedding “space”, package everything
158
- up and ask ChatGPT to summarise the answer.</p>\n<p>The trick, of course,
159
- is to figure out how t generate natural language statements from the knowledge
160
- graph (which amounts to deciding what paths to traverse in the knowledge graph,
161
- and how to write those paths is something approximating English). We also
162
- want to know something about the sorts of questions people are likely to ask
163
- so that we have a reasonable chance of having the answers (for example, are
164
- people going to ask about individual species, or questions about summary statistics
165
- such as numbers of species in a genus, etc.).</p>\n<p>What makes this attractive
166
- is that it seems a straightforward way to go from a largely academic exercise
167
- (build a knowledge graph) to something potentially useful (a question and
168
- answer machine). Imagine if something like the defunct BBC wildlife site (see
169
- <a href=\"https://iphylo.blogspot.com/2017/12/blue-planet-ii-bbc-and-semantic-web.html\">Blue
170
- Planet II, the BBC, and the Semantic Web: a tale of lessons forgotten and
171
- opportunities lost</a>) revived <a href=\"https://aspiring-look.glitch.me\">here</a>
172
- had a question and answer interface where we could ask questions rather than
173
- passively browse.</p>\n<h4 id=\"summary\">Summary</h4>\n<p>I have so much
174
- more to learn, and need to think about ways to incorporate semantic search
175
- and ChatGPT-like tools into knowledge graphs.</p>\n<blockquote>\n<p>Written
176
- with <a href=\"https://stackedit.io/\">StackEdit</a>.</p>\n</blockquote>","tags":[],"language":"en","references":[]},{"id":"https://doi.org/10.59350/zc4qc-77616","uuid":"30c78d9d-2e50-49db-9f4f-b3baa060387b","url":"https://iphylo.blogspot.com/2022/09/does-anyone-cite-taxonomic-treatments.html","title":"Does
177
- anyone cite taxonomic treatments?","summary":"Taxonomic treatments have come
178
- up in various discussions I''m involved in, and I''m curious as to whether
179
- they are actually being used, in particular, whether they are actually being
180
- cited. Consider the following quote: The taxa are described in taxonomic treatments,
181
- well defined sections of scientific publications (Catapano 2019). They include
182
- a nomenclatural section and one or more sections including descriptions, material
183
- citations referring to studied specimens, or notes ecology and...","date_published":"2022-09-01T16:49:00Z","date_modified":"2022-09-01T16:49:51Z","date_indexed":"1909-06-16T09:31:50+00:00","authors":[{"url":null,"name":"Roderic
184
- Page"}],"image":null,"content_html":"<div class=\"separator\" style=\"clear:
185
- both;\"><a href=\"https://zenodo.org/record/5731100/thumb100\" style=\"display:
186
- block; padding: 1em 0; text-align: center; clear: right; float: right;\"><img
187
- alt=\"\" border=\"0\" height=\"128\" data-original-height=\"106\" data-original-width=\"100\"
188
- src=\"https://zenodo.org/record/5731100/thumb250\"/></a></div>\nTaxonomic
189
- treatments have come up in various discussions I''m involved in, and I''m
190
- curious as to whether they are actually being used, in particular, whether
191
- they are actually being cited. Consider the following quote:\n\n<blockquote>\nThe
192
- taxa are described in taxonomic treatments, well defined sections of scientific
193
- publications (Catapano 2019). They include a nomenclatural section and one
194
- or more sections including descriptions, material citations referring to studied
195
- specimens, or notes ecology and behavior. In case the treatment does not describe
196
- a new discovered taxon, previous treatments are cited in the form of treatment
197
- citations. This citation can refer to a previous treatment and add additional
198
- data, or it can be a statement synonymizing the taxon with another taxon.
199
- This allows building a citation network, and ultimately is a constituent part
200
- of the catalogue of life. - Taxonomic Treatments as Open FAIR Digital Objects
201
- <a href=\"https://doi.org/10.3897/rio.8.e93709\">https://doi.org/10.3897/rio.8.e93709</a>\n</blockquote>\n\n<p>\n
202
- \"Traditional\" academic citation is from article to article. For example,
203
- consider these two papers:\n\n<blockquote>\nLi Y, Li S, Lin Y (2021) Taxonomic
204
- study on fourteen symphytognathid species from Asia (Araneae, Symphytognathidae).
205
- ZooKeys 1072: 1-47. https://doi.org/10.3897/zookeys.1072.67935\n</blockquote>\n\n<blockquote>\nMiller
206
- J, Griswold C, Yin C (2009) The symphytognathoid spiders of the Gaoligongshan,
207
- Yunnan, China (Araneae: Araneoidea): Systematics and diversity of micro-orbweavers.
208
- ZooKeys 11: 9-195. https://doi.org/10.3897/zookeys.11.160\n</blockquote>\n</p>\n\n<p>Li
209
- et al. 2021 cites Miller et al. 2009 (although Pensoft seems to have broken
210
- the citation such that it does appear correctly either on their web page or
211
- in CrossRef).</p>\n\n<p>So, we have this link: [article]10.3897/zookeys.1072.67935
212
- --cites--> [article]10.3897/zookeys.11.160. One article cites another.</p>\n\n<p>In
213
- their 2021 paper Li et al. discuss <i>Patu jidanweishi</i> Miller, Griswold
214
- & Yin, 2009:\n\n<div class=\"separator\" style=\"clear: both;\"><a href=\"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPavMHXqQNX1ls_zXo9kIcMLHPxc7ZpV9NCSof5wLumrg3ovPoi6nYKzZsINuqFtYoEvW1QrerePD-MEf2DJaYUXlT37d81x3L6ILls7u229rg0_Nc0uUmgW-ICzr6MI_QCZfgQbYGTxuofu-fuPVoygbCnm3vQVYOhLDLtp1EtQ9jRZHDvw/s1040/Screenshot%202022-09-01%20at%2017.12.27.png\"
215
- style=\"display: block; padding: 1em 0; text-align: center; \"><img alt=\"\"
216
- border=\"0\" width=\"400\" data-original-height=\"314\" data-original-width=\"1040\"
217
- src=\"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPavMHXqQNX1ls_zXo9kIcMLHPxc7ZpV9NCSof5wLumrg3ovPoi6nYKzZsINuqFtYoEvW1QrerePD-MEf2DJaYUXlT37d81x3L6ILls7u229rg0_Nc0uUmgW-ICzr6MI_QCZfgQbYGTxuofu-fuPVoygbCnm3vQVYOhLDLtp1EtQ9jRZHDvw/s400/Screenshot%202022-09-01%20at%2017.12.27.png\"/></a></div>\n\n<p>There
218
- is a treatment for the original description of <i>Patu jidanweishi</i> at
219
- <a href=\"https://doi.org/10.5281/zenodo.3792232\">https://doi.org/10.5281/zenodo.3792232</a>,
220
- which was created by Plazi with a time stamp \"2020-05-06T04:59:53.278684+00:00\".
221
- The original publication date was 2009, the treatments are being added retrospectively.</p>\n\n<p>In
222
- an ideal world my expectation would be that Li et al. 2021 would have cited
223
- the treatment, instead of just providing the text string \"Patu jidanweishi
224
- Miller, Griswold & Yin, 2009: 64, figs 65A–E, 66A, B, 67A–D, 68A–F, 69A–F,
225
- 70A–F and 71A–F (♂♀).\" Isn''t the expectation under the treatment model that
226
- we would have seen this relationship:</p>\n\n<p>[article]10.3897/zookeys.1072.67935
227
- --cites--> [treatment]https://doi.org/10.5281/zenodo.3792232</p>\n\n<p>Furthermore,
228
- if it is the case that \"[i]n case the treatment does not describe a new discovered
229
- taxon, previous treatments are cited in the form of treatment citations\"
230
- then we should also see a citation between treatments, in other words Li et
231
- al.''s 2021 treatment of <i>Patu jidanweishi</i> (which doesn''t seem to have
232
- a DOI but is available on Plazi'' web site as <a href=\"https://tb.plazi.org/GgServer/html/1CD9FEC313A35240938EC58ABB858E74\">https://tb.plazi.org/GgServer/html/1CD9FEC313A35240938EC58ABB858E74</a>)
233
- should also cite the original treatment? It doesn''t - but it does cite the
234
- Miller et al. paper.</p>\n\n<p>So in this example we don''t see articles citing
235
- treatments, nor do we see treatments citing treatments. Playing Devil''s advocate,
236
- why then do we have treatments? Does''t the lack of citations suggest that
237
- - despite some taxonomists saying this is the unit that matters - they actually
238
- don''t. If we pay attention to what people do rather than what they say they
239
- do, they cite articles.</p>\n\n<p>Now, there are all sorts of reasons why
240
- we don''t see [article] -> [treatment] citations, or [treatment] -> [treatment]
241
- citations. Treatments are being added after the fact by Plazi, not by the
242
- authors of the original work. And in many cases the treatments that could
243
- be cited haven''t appeared until after that potentially citing work was published.
244
- In the example above the Miller et al. paper dates from 2009, but the treatment
245
- extracted only went online in 2020. And while there is a long standing culture
246
- of citing publications (ideally using DOIs) there isn''t an equivalent culture
247
- of citing treatments (beyond the simple text strings).</p>\n\n<p>Obviously
248
- this is but one example. I''d need to do some exploration of the citation
249
- graph to get a better sense of citations patterns, perhaps using <a href=\"https://www.crossref.org/documentation/event-data/\">CrossRef''s
250
- event data</a>. But my sense is that taxonomists don''t cite treatments.</p>\n\n<p>I''m
251
- guessing Plazi would respond by saying treatments are cited, for example (indirectly)
252
- in GBIF downloads. This is true, although arguably people aren''t citing the
253
- treatment, they''re citing specimen data in those treatments, and that specimen
254
- data could be extracted at the level of articles rather than treatments. In
255
- other words, it''s not the treatments themselves that people are citing.</p>\n\n<p>To
256
- be clear, I think there is value in being able to identify those \"well defined
257
- sections\" of a publication that deal with a given taxon (i.e., treatments),
258
- but it''s not clear to me that these are actually the citable units people
259
- might hope them to be. Likewise, journals such as <i>ZooKeys</i> have DOIs
260
- for individual figures. Does anyone actually cite those?</p>","tags":[],"language":"en","references":[]},{"id":"https://doi.org/10.59350/pmhat-5ky65","uuid":"5891c709-d139-440f-bacb-06244424587a","url":"https://iphylo.blogspot.com/2021/10/problems-with-plazi-parsing-how.html","title":"Problems
261
- with Plazi parsing: how reliable are automated methods for extracting specimens
262
- from the literature?","summary":"The Plazi project has become one of the major
263
- contributors to GBIF with some 36,000 datasets yielding some 500,000 occurrences
264
- (see Plazi''s GBIF page for details). These occurrences are extracted from
265
- taxonomic publication using automated methods. New data is published almost
266
- daily (see latest treatments). The map below shows the geographic distribution
267
- of material citations provided to GBIF by Plazi, which gives you a sense of
268
- the size of the dataset. By any metric Plazi represents a...","date_published":"2021-10-25T11:10:00Z","date_modified":"2021-10-28T16:08:18Z","date_indexed":"1970-01-01T00:00:00+00:00","authors":[{"url":null,"name":"Roderic
269
- Page"}],"image":null,"content_html":"<div class=\"separator\" style=\"clear:
270
- both;\"><a href=\"https://1.bp.blogspot.com/-oiqSkA53FI4/YXaRAUpRgFI/AAAAAAAAgzY/PbPEOh_VlhIE0aaqqhVQPLAOE5pRULwJACLcBGAsYHQ/s240/Rf7UoXTw_400x400.jpg\"
271
- style=\"display: block; padding: 1em 0; text-align: center; clear: right;
272
- float: right;\"><img alt=\"\" border=\"0\" width=\"128\" data-original-height=\"240\"
273
- data-original-width=\"240\" src=\"https://1.bp.blogspot.com/-oiqSkA53FI4/YXaRAUpRgFI/AAAAAAAAgzY/PbPEOh_VlhIE0aaqqhVQPLAOE5pRULwJACLcBGAsYHQ/s200/Rf7UoXTw_400x400.jpg\"/></a></div><p>The
274
- <a href=\"http://plazi.org\">Plazi</a> project has become one of the major
275
- contributors to GBIF with some 36,000 datasets yielding some 500,000 occurrences
276
- (see <a href=\"https://www.gbif.org/publisher/7ce8aef0-9e92-11dc-8738-b8a03c50a862\">Plazi''s
277
- GBIF page</a> for details). These occurrences are extracted from taxonomic
278
- publication using automated methods. New data is published almost daily (see
279
- <a href=\"https://tb.plazi.org/GgServer/static/newToday.html\">latest treatments</a>).
280
- The map below shows the geographic distribution of material citations provided
281
- to GBIF by Plazi, which gives you a sense of the size of the dataset.</p>\n\n<div
282
- class=\"separator\" style=\"clear: both;\"><a href=\"https://1.bp.blogspot.com/-DCJ4HR8eej8/YXaRQnz22bI/AAAAAAAAgz4/AgRcree6jVgjtQL2ch7IXgtb_Xtx7fkngCPcBGAYYCw/s1030/Screenshot%2B2021-10-24%2Bat%2B20.43.23.png\"
283
- style=\"display: block; padding: 1em 0; text-align: center; \"><img alt=\"\"
284
- border=\"0\" width=\"400\" data-original-height=\"514\" data-original-width=\"1030\"
285
- src=\"https://1.bp.blogspot.com/-DCJ4HR8eej8/YXaRQnz22bI/AAAAAAAAgz4/AgRcree6jVgjtQL2ch7IXgtb_Xtx7fkngCPcBGAYYCw/s400/Screenshot%2B2021-10-24%2Bat%2B20.43.23.png\"/></a></div>\n\n<p>By
286
- any metric Plazi represents a considerable achievement. But often when I browse
287
- individual records on Plazi I find records that seem clearly incorrect. Text
288
- mining the literature is a challenging problem, but at the moment Plazi seems
289
- something of a \"black box\". PDFs go in, the content is mined, and data comes
290
- up to be displayed on the Plazi web site and uploaded to GBIF. Nowhere does
291
- there seem to be an evaluation of how accurate this text mining actually is.
292
- Anecdotally it seems to work well in some cases, but in others it produces
293
- what can only be described as bogus records.</p>\n\n<h2>Finding errors</h2>\n\n<p>A
294
- treatment in Plazi is a block of text (and sometimes illustrations) that refers
295
- to a single taxon. Often that text will include a description of the taxon,
296
- and list one or more specimens that have been examined. These lists of specimens
297
- (\"material citations\") are one of the key bits of information that Plaza
298
- extracts from a treatment as these citations get fed into GBIF as occurrences.</p>\n\n<p>To
299
- help explore treatments I''ve constructed a simple web site that takes the
300
- Plazi identifier for a treatment and displays that treatment with the material
301
- citations highlighted. For example, for the Plazi treatment <a href=\"https://tb.plazi.org/GgServer/html/03B5A943FFBB6F02FE27EC94FABEEAE7\">03B5A943FFBB6F02FE27EC94FABEEAE7</a>
302
- you can view the marked up version at <a href=\"https://plazi-tester.herokuapp.com/?uri=622F7788-F0A4-449D-814A-5B49CD20B228\">https://plazi-tester.herokuapp.com/?uri=622F7788-F0A4-449D-814A-5B49CD20B228</a>.
303
- Below is an example of a material citation with its component parts tagged:</p>\n\n<div
304
- class=\"separator\" style=\"clear: both;\"><a href=\"https://1.bp.blogspot.com/-NIGuo9BggdA/YXaRQQrv0QI/AAAAAAAAgz4/SZDcA1jZSN47JMRTWDwSMRpHUShrCeOdgCPcBGAYYCw/s693/Screenshot%2B2021-10-24%2Bat%2B20.59.56.png\"
305
- style=\"display: block; padding: 1em 0; text-align: center; \"><img alt=\"\"
306
- border=\"0\" width=\"400\" data-original-height=\"94\" data-original-width=\"693\"
307
- src=\"https://1.bp.blogspot.com/-NIGuo9BggdA/YXaRQQrv0QI/AAAAAAAAgz4/SZDcA1jZSN47JMRTWDwSMRpHUShrCeOdgCPcBGAYYCw/s400/Screenshot%2B2021-10-24%2Bat%2B20.59.56.png\"/></a></div>\n\n<p>This
308
- is an example where Plazi has successfully parsed the specimen. But I keep
309
- coming across cases where specimens have not been parsed correctly, resulting
310
- in issues such as single specimens being split into multiple records (e.g., <a
311
- href=\"https://plazi-tester.herokuapp.com/?uri=5244B05EFFC8E20F7BC32056C178F496\">https://plazi-tester.herokuapp.com/?uri=5244B05EFFC8E20F7BC32056C178F496</a>),
312
- geographical coordinates being misinterpreted (e.g., <a href=\"https://plazi-tester.herokuapp.com/?uri=0D228E6AFFC2FFEFFF4DE8118C4EE6B9\">https://plazi-tester.herokuapp.com/?uri=0D228E6AFFC2FFEFFF4DE8118C4EE6B9</a>),
313
- or collector''s initials being confused with codes for natural history collections
314
- (e.g., <a href=\"https://plazi-tester.herokuapp.com/?uri=252C87918B362C05FF20F8C5BFCB3D4E\">https://plazi-tester.herokuapp.com/?uri=252C87918B362C05FF20F8C5BFCB3D4E</a>).</p>\n\n<p>Parsing
315
- specimens is a hard problem so it''s not unexpected to find errors. But they
316
- do seem common enough to be easily found, which raises the question of just
317
- what percentage of these material citations are correct? How much of the
318
- data Plazi feeds to GBIF is correct? How would we know?</p>\n\n<h2>Systemic
319
- problems</h2>\n\n<p>Some of the errors I''ve found concern the interpretation
320
- of the parsed data. For example, it is striking that despite including marine
321
- taxa <b>no</b> Plazi record has a value for depth below sea level (see <a
322
- href=\"https://www.gbif.org/occurrence/map?depth=0,9999&publishing_org=7ce8aef0-9e92-11dc-8738-b8a03c50a862&advanced=1\">GBIF
323
- search on depth range 0-9999 for Plazi</a>). But <a href=\"https://www.gbif.org/occurrence/map?elevation=0,9999&publishing_org=7ce8aef0-9e92-11dc-8738-b8a03c50a862&advanced=1\">many
324
- records do have an elevation</a>, including records from marine environments.
325
- Any record that has a depth value is interpreted by Plazi as being elevation,
326
- so we have aerial crustacea and fish.</p>\n\n<h3>Map of Plazi records with
327
- depth 0-9999m</h3>\n<div class=\"separator\" style=\"clear: both;\"><a href=\"https://1.bp.blogspot.com/-GD4pPtPCxVc/YXaRQ9bdn1I/AAAAAAAAgz8/A9YsypSvHfwWKAjDxSdeFVUkou88LGItACPcBGAYYCw/s673/Screenshot%2B2021-10-25%2Bat%2B12.03.51.png\"
328
- style=\"display: block; padding: 1em 0; text-align: center; \"><img alt=\"\"
329
- border=\"0\" width=\"400\" data-original-height=\"258\" data-original-width=\"673\"
330
- src=\"https://1.bp.blogspot.com/-GD4pPtPCxVc/YXaRQ9bdn1I/AAAAAAAAgz8/A9YsypSvHfwWKAjDxSdeFVUkou88LGItACPcBGAYYCw/s320/Screenshot%2B2021-10-25%2Bat%2B12.03.51.png\"/></a></div>\n\n<h3>Map
331
- of Plazi records with elevation 0-9999m </h3>\n<div class=\"separator\" style=\"clear:
332
- both;\"><a href=\"https://1.bp.blogspot.com/-BReSHtXTOkA/YXaRRKFW7dI/AAAAAAAAg0A/-FcBkMwyswIp0siWGVX3MNMANs7UkZFtwCPcBGAYYCw/s675/Screenshot%2B2021-10-25%2Bat%2B12.04.06.png\"
333
- style=\"display: block; padding: 1em 0; text-align: center; \"><img alt=\"\"
334
- border=\"0\" width=\"400\" data-original-height=\"256\" data-original-width=\"675\"
335
- src=\"https://1.bp.blogspot.com/-BReSHtXTOkA/YXaRRKFW7dI/AAAAAAAAg0A/-FcBkMwyswIp0siWGVX3MNMANs7UkZFtwCPcBGAYYCw/s320/Screenshot%2B2021-10-25%2Bat%2B12.04.06.png\"/></a></div>\n\n<p>Anecdotally
336
- I''ve also noticed that Plazi seems to do well on zoological data, especially
337
- journals like <i>Zootaxa</i>, but it often struggles with botanical specimens.
338
- Botanists tend to cite specimens rather differently to zoologists (botanists
339
- emphasise collector numbers rather than specimen codes). Hence data quality
340
- in Plazi is likely to taxonomic biased.</p>\n\n<p>Plazi is <a href=\"https://github.com/plazi/community/issues\">using
341
- GitHub to track issues with treatments</a> so feedback on erroneous records
342
- is possible, but this seems inadequate to the task. There are tens of thousands
343
- of data sets, with more being released daily, and hundreds of thousands of
344
- occurrences, and relying on GitHub issues devolves the responsibility for
345
- error checking onto the data users. I don''t have a measure of how many records
346
- in Plazi have problems, but because I suspect it is a significant fraction
347
- because for any given day''s output I can typically find errors.</p>\n\n<h2>What
348
- to do?</h2>\n\n<p>Faced with a process that generates noisy data there are
349
- several of things we could do:</p>\n\n<ol>\n<li>Have tools to detect and flag
350
- errors made in generating the data.</li>\n<li>Have the data generator give
351
- estimates the confidence of its results.</li>\n<li>Improve the data generator.</li>\n</ol>\n\n<p>I
352
- think a comparison with the problem of parsing bibliographic references might
353
- be instructive here. There is a long history of people developing tools to
354
- parse references (<a href=\"https://iphylo.blogspot.com/2021/05/finding-citations-of-specimens.html\">I''ve
355
- even had a go</a>). State-of-the art tools such as <a href=\"https://anystyle.io\">AnyStyle</a>
356
- feature machine learning, and are tested against <a href=\"https://github.com/inukshuk/anystyle/blob/master/res/parser/core.xml\">human
357
- curated datasets</a> of tagged bibliographic records. This means we can evaluate
358
- the performance of a method (how well does it retrieve the same results as
359
- human experts?) and also improve the method by expanding the corpus of training
360
- data. Some of these tools can provide a measures of how confident they are
361
- when classifying a string as, say, a person''s name, which means we could
362
- flag potential issues for anyone wanting to use that record.</p>\n\n<p>We
363
- don''t have equivalent tools for parsing specimens in the literature, and
364
- hence have no easy way to quantify how good existing methods are, nor do we
365
- have a public corpus of material citations that we can use as training data.
366
- I <a href=\"https://iphylo.blogspot.com/2021/05/finding-citations-of-specimens.html\">blogged
367
- about this</a> a few months ago and was considering using Plazi as a source
368
- of marked up specimen data to use for training. However based on what I''ve
369
- looked at so far Plazi''s data would need to be carefully scrutinised before
370
- it could be used as training data.</p>\n\n<p>Going forward, I think it would
371
- be desirable to have a set of records that can be used to benchmark specimen
372
- parsers, and ideally have the parsers themselves available as web services
373
- so that anyone can evaluate them. Even better would be a way to contribute
374
- to the training data so that these tools improve over time.</p>\n\n<p>Plazi''s
375
- data extraction tools are mostly desktop-based, that is, you need to download
376
- software to use their methods. However, there are experimental web services
377
- available as well. I''ve created a simple wrapper around the material citation
378
- parser, you can try it at <a href=\"https://plazi-tester.herokuapp.com/parser.php\">https://plazi-tester.herokuapp.com/parser.php</a>.
379
- It takes a single material citation and returns a version with elements such
380
- as specimen code and collector name tagged in different colours.</p>\n\n<h2>Summary</h2>\n\n<p>Text
381
- mining the taxonomic literature is clearly a gold mine of data, but at the
382
- same time it is potentially fraught as we try and extract structured data
383
- from semi-structured text. Plazi has demonstrated that it is possible to extract
384
- a lot of data from the literature, but at the same time the quality of that
385
- data seems highly variable. Even minor issues in parsing text can have big
386
- implications for data quality (e.g., marine organisms apparently living above
387
- sea level). Historically in biodiversity informatics we have favoured data
388
- quantity over data quality. Quantity has an obvious metric, and has milestones
389
- we can celebrate (e.g., <a href=\"GBIF at 1 billion - what''s next?\">one
390
- billion specimens</a>). There aren''t really any equivalent metrics for data
391
- quality.</p>\n\n<p>Adding new types of data can sometimes initially result
392
- in a new set of quality issues (e.g., <a href=\"https://iphylo.blogspot.com/2019/12/gbif-metagenomics-and-metacrap.html\">GBIF
393
- metagenomics and metacrap</a>) that take time to resolve. In the case of Plazi,
394
- I think it would be worthwhile to quantify just how many records have errors,
395
- and develop benchmarks that we can use to test methods for extracting specimen
396
- data from text. If we don''t do this then there will remain uncertainty as
397
- to how much trust we can place in data mined from the taxonomic literature.</p>\n\n<h2>Update</h2>\n\nPlazi
398
- has responded, see <a href=\"http://plazi.org/posts/2021/10/liberation-first-step-toward-quality/\">Liberating
399
- material citations as a first step to more better data</a>. My reading of
400
- their repsonse is that it essentially just reiterates Plazi''s approach and
401
- doesn''t tackle the underlying issue: their method for extracting material
402
- citations is error prone, and many of those errors end up in GBIF.","tags":["data
403
- quality","parsing","Plazi","specimen","text mining"],"language":"en","references":[]},{"id":"https://doi.org/10.59350/j77nc-e8x98","uuid":"c6b101f4-bfbc-4d01-921d-805c43c85757","url":"https://iphylo.blogspot.com/2022/08/linking-taxonomic-names-to-literature.html","title":"Linking
404
- taxonomic names to the literature","summary":"Just some thoughts as I work
405
- through some datasets linking taxonomic names to the literature. In the diagram
406
- above I''ve tried to capture the different situatios I encounter. Much of
407
- the work I''ve done on this has focussed on case 1 in the diagram: I want
408
- to link a taxonomic name to an identifier for the work in which that name
409
- was published. In practise this means linking names to DOIs. This has the
410
- advantage of linking to a citable indentifier, raising questions such as whether
411
- citations...","date_published":"2022-08-22T17:19:00Z","date_modified":"2022-08-22T17:19:08Z","date_indexed":"1909-06-16T08:21:41+00:00","authors":[{"url":null,"name":"Roderic
412
- Page"}],"image":null,"content_html":"Just some thoughts as I work through
413
- some datasets linking taxonomic names to the literature.\n\n<div class=\"separator\"
414
- style=\"clear: both;\"><a href=\"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5ZkbnumEWKZpf0isei_hUJucFlOOK08-SKnJknD8B4qhAX36u-vMQRnZdhRJK5rPb7HNYcnqB7qE4agbeStqyzMkWHrzUj14gPkz2ohmbOVg8P_nHo0hM94PD1wH3SPJsaLAumN8vih3ch9pjH2RaVWZBLwwhGhNu0FS1m5z6j5xt2NeZ4w/s2140/linking%20to%20names144.jpg\"
415
- style=\"display: block; padding: 1em 0; text-align: center; \"><img alt=\"\"
416
- border=\"0\" height=\"600\" data-original-height=\"2140\" data-original-width=\"1604\"
417
- src=\"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5ZkbnumEWKZpf0isei_hUJucFlOOK08-SKnJknD8B4qhAX36u-vMQRnZdhRJK5rPb7HNYcnqB7qE4agbeStqyzMkWHrzUj14gPkz2ohmbOVg8P_nHo0hM94PD1wH3SPJsaLAumN8vih3ch9pjH2RaVWZBLwwhGhNu0FS1m5z6j5xt2NeZ4w/s600/linking%20to%20names144.jpg\"/></a></div>\n\n<p>In
418
- the diagram above I''ve tried to capture the different situatios I encounter.
419
- Much of the work I''ve done on this has focussed on case 1 in the diagram:
420
- I want to link a taxonomic name to an identifier for the work in which that
421
- name was published. In practise this means linking names to DOIs. This has
422
- the advantage of linking to a citable indentifier, raising questions such
423
- as whether citations of taxonmic papers by taxonomic databases could become
424
- part of a <a href=\"https://iphylo.blogspot.com/2022/08/papers-citing-data-that-cite-papers.html\">taxonomist''s
425
- Google Scholar profile</a>.</p>\n\n<p>In many taxonomic databases full work-level
426
- citations are not the norm, instead taxonomists cite one or more pages within
427
- a work that are relevant to a taxonomic name. These \"microcitations\" (what
428
- the U.S. legal profession refer to as \"point citations\" or \"pincites\", see
429
- <a href=\"https://rasmussen.libanswers.com/faq/283203\">What are pincites,
430
- pinpoints, or jump legal references?</a>) require some work to map to the
431
- work itself (which is typically the thing that has a citatble identifier such
432
- as a DOI).</p>\n\n<p>Microcitations (case 2 in the diagram above) can be quite
433
- complex. Some might simply mention a single page, but others might list a
434
- series of (not necessarily contiguous) pages, as well as figures, plates etc.
435
- Converting these to citable identifiers can be tricky, especially as in most
436
- cases we don''t have page-level identifiers. The Biodiversity Heritage Library
437
- (BHL) does have URLs for each scanned page, and we have a standard for referring
438
- to pages in a PDF (<code>page=&lt;pageNum&gt;</code>, see <a href=\"https://datatracker.ietf.org/doc/html/rfc8118\">RFC
439
- 8118</a>). But how do we refer to a set of pages? Do we pick the first page?
440
- Do we try and represent a set of pages, and if so, how?</p>\n\n<p>Another
441
- issue with page-level identifiers is that not everything on a given page may
442
- be relevant to the taxonomic name. In case 2 above I''ve shaded in the parts
443
- of the pages and figure that refer to the taxonomic name. An example where
444
- this can be problematic is the recent test case I created for BHL where a
445
- page image was included for the taxonomic name <a href=\"https://www.gbif.org/species/195763322\"><i>Aphrophora
446
- impressa</i></a>. The image includes the species description and a illustration,
447
- as well as text that relates to other species.</p>\n\n<div class=\"separator\"
448
- style=\"clear: both;\"><a href=\"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1SAv1XiVBPHXeMLPfdsnTh4Sj4m0AQyqjXM0faXyvbNhBXFDl7SawjFIkMo3qz4RQhDEyZXkKAh5nF4gtfHbVXA6cK3NJ46CuIWECC6HmJKtTjoZ0M3QQXvLY_X-2-RecjBqEy68M0cEr0ph3l6KY51kA9BvGt9d4id314P71PBitpWMATg/s3467/https---www.biodiversitylibrary.org-pageimage-29138463.jpeg\"
449
- style=\"display: block; padding: 1em 0; text-align: center; \"><img alt=\"\"
450
- border=\"0\" height=\"400\" data-original-height=\"3467\" data-original-width=\"2106\"
451
- src=\"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1SAv1XiVBPHXeMLPfdsnTh4Sj4m0AQyqjXM0faXyvbNhBXFDl7SawjFIkMo3qz4RQhDEyZXkKAh5nF4gtfHbVXA6cK3NJ46CuIWECC6HmJKtTjoZ0M3QQXvLY_X-2-RecjBqEy68M0cEr0ph3l6KY51kA9BvGt9d4id314P71PBitpWMATg/s400/https---www.biodiversitylibrary.org-pageimage-29138463.jpeg\"/></a></div>\n\n<p>Given
452
- that not everything on a page need be relevant, we could extract just the
453
- relevant blocks of text and illustrations (e.g., paragraphs of text, panels
454
- within a figure, etc.) and treat that set of elements as the thing to cite.
455
- This is, of course, what <a href=\"http://plazi.org\">Plazi</a> are doing.
456
- The set of extracted blocks is glued together as a \"treatment\", assigned
457
- an identifier (often a DOI), and treated as a citable unit. It would be interesting
458
- to see to what extent these treatments are actually cited, for example, do
459
- subsequent revisions that cite work that include treatments cite those treatments,
460
- or just the work itself? Put another way, are we creating <a href=\"https://iphylo.blogspot.com/2012/09/decoding-nature-encode-ipad-app-omg-it.html\">\"threads\"</a>
461
- between taxonomic revisions?</p>\n\n<p>One reason for these notes is that
462
- I''m exploring uploading taxonomic name - literature links to <a href=\"https://www.checklistbank.org\">ChecklistBank</a>
463
- and case 1 above is easy, as is case 3 (if we have treatment-level identifiers).
464
- But case 2 is problematic because we are linking to a set of things that may
465
- not have an identifier, which means a decision has to be made about which
466
- page to link to, and how to refer to that page.</p>","tags":[],"language":"en","references":[]},{"id":"https://doi.org/10.59350/ymc6x-rx659","uuid":"0807f515-f31d-4e2c-9e6f-78c3a9668b9d","url":"https://iphylo.blogspot.com/2022/09/dna-barcoding-as-intergenerational.html","title":"DNA
53
+ 7.00","category":"Natural Sciences","backlog":true,"prefix":"10.59350","items":[{"id":"https://doi.org/10.59350/ymc6x-rx659","uuid":"0807f515-f31d-4e2c-9e6f-78c3a9668b9d","url":"https://iphylo.blogspot.com/2022/09/dna-barcoding-as-intergenerational.html","title":"DNA
467
54
  barcoding as intergenerational transfer of taxonomic knowledge","summary":"I
468
55
  tweeted about this but want to bookmark it for later as well. The paper “A
469
56
  molecular-based identification resource for the arthropods of Finland” doi:10.1111/1755-0998.13510
@@ -993,5 +580,5 @@ http_interactions:
993
580
  simpler overlay on top of SPARQL so that we can retrieve the data we want
994
581
  without having to learn the intricacies of SPARQL, nor how Wikidata models
995
582
  publications and people.</p>","tags":["GraphQL","SPARQL","WikiCite","Wikidata"],"language":"en","references":[]}]}'
996
- recorded_at: Sun, 18 Jun 2023 05:50:13 GMT
583
+ recorded_at: Sun, 18 Jun 2023 15:24:19 GMT
997
584
  recorded_with: VCR 6.1.0