commonmeta-ruby 3.3.3 → 3.3.5
- checksums.yaml +4 -4
- data/Gemfile.lock +2 -2
- data/bin/commonmeta +2 -2
- data/commonmeta.gemspec +1 -1
- data/lib/commonmeta/cli.rb +7 -3
- data/lib/commonmeta/readers/json_feed_reader.rb +1 -1
- data/lib/commonmeta/utils.rb +34 -0
- data/lib/commonmeta/version.rb +1 -1
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/change_metadata_as_datacite_xml/with_data_citation.yml +14 -14
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/doi_registration_agency/crossref.yml +5 -5
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/doi_registration_agency/datacite.yml +5 -5
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/doi_registration_agency/jalc.yml +5 -5
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/doi_registration_agency/kisti.yml +5 -5
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/doi_registration_agency/medra.yml +5 -5
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/doi_registration_agency/not_found.yml +5 -5
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/doi_registration_agency/op.yml +5 -5
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/find_from_format_by_ID/crossref.yml +5 -5
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/find_from_format_by_ID/crossref_doi_not_url.yml +5 -5
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/find_from_format_by_ID/datacite.yml +5 -5
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/find_from_format_by_ID/datacite_doi_http.yml +5 -5
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/find_from_format_by_ID/unknown_DOI_registration_agency.yml +5 -5
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_cff_metadata/cff-converter-python.yml +9 -7
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_cff_metadata/ruby-cff.yml +16 -14
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_cff_metadata/ruby-cff_repository_url.yml +14 -12
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_codemeta_metadata/maremma.yml +10 -8
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_codemeta_metadata/metadata_reports.yml +9 -7
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/DOI_with_ORCID_ID.yml +74 -74
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/DOI_with_SICI_DOI.yml +73 -73
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/DOI_with_data_citation.yml +70 -70
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/JaLC.yml +159 -159
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/KISTI.yml +128 -128
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/OP.yml +72 -72
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/affiliation_is_space.yml +73 -73
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/another_book.yml +109 -109
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/another_book_chapter.yml +71 -71
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/article_id_as_page_number.yml +74 -74
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/author_literal.yml +82 -82
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/book.yml +71 -71
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/book_chapter.yml +72 -72
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/book_chapter_with_RDF_for_container.yml +70 -70
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/book_oup.yml +69 -69
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/component.yml +91 -91
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/dataset.yml +101 -102
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/dataset_usda.yml +133 -133
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/date_in_future.yml +78 -78
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/dissertation.yml +100 -100
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/empty_given_name.yml +72 -72
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/invalid_date.yml +73 -73
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/journal_article.yml +72 -72
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/journal_article_original_language_title.yml +70 -70
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/journal_article_with.yml +76 -514
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/journal_article_with_RDF_for_container.yml +70 -70
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/journal_article_with_funding.yml +73 -73
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/journal_issue.yml +69 -69
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/mEDRA.yml +69 -69
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/markup.yml +78 -78
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/missing_creator.yml +73 -73
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/multiple_issn.yml +72 -72
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/multiple_titles.yml +71 -70
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/multiple_titles_with_missing.yml +716 -716
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/not_found_error.yml +63 -63
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/peer_review.yml +74 -74
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/posted_content.yml +71 -71
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/posted_content_copernicus.yml +73 -73
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/report_osti.yml +117 -117
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/vor_with_url.yml +75 -75
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/yet_another_book.yml +69 -69
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_metadata/yet_another_book_chapter.yml +70 -70
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_crossref_raw/journal_article.yml +10 -10
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_datacite_metadata/dissertation.yml +9 -9
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_datacite_metadata/funding_references.yml +11 -11
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_datacite_metadata/subject_scheme.yml +20 -20
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_doi_prefix_for_blog/by_blog_id.yml +6 -419
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_doi_prefix_for_blog/by_blog_post_uuid.yml +7 -260
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_doi_prefix_for_blog/by_blog_post_uuid_specific_prefix.yml +3 -136
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/by_blog_id.yml +225 -1432
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/not_indexed_posts.yml +1380 -2112
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/unregistered_posts.yml +6 -172
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/blog_post_with_non-url_id.yml +7 -7
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/blogger_post.yml +12 -12
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/ghost_post_with_author_name_suffix.yml +8 -8
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/ghost_post_with_doi.yml +7 -7
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/ghost_post_with_organizational_author.yml +3 -3
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/ghost_post_without_doi.yml +8 -8
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/jekyll_post.yml +8 -8
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/substack_post_with_broken_reference.yml +90 -176
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/syldavia_gazette_post_with_references.yml +25 -25
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/upstream_post_with_references.yml +61 -61
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/wordpress_post.yml +8 -8
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item_metadata/wordpress_post_with_references.yml +20 -20
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_one_author/has_familyName.yml +9 -9
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_one_author/has_name_in_display-order_with_ORCID.yml +9 -9
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_one_author/name_with_affiliation_crossref.yml +14 -14
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_one_author/only_familyName_and_givenName.yml +43 -36
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_schema_org_metadata/BlogPosting.yml +158 -158
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_schema_org_metadata/BlogPosting_with_new_DOI.yml +162 -162
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_schema_org_metadata/get_schema_org_metadata_front_matter/BlogPosting.yml +178 -180
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_schema_org_metadata/harvard_dataverse.yml +226 -230
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_schema_org_metadata/pangaea.yml +43 -36
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_schema_org_metadata/upstream_blog.yml +94 -94
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_schema_org_metadata/zenodo.yml +14 -14
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/handle_input/DOI_RA_not_Crossref_or_DataCite.yml +5 -5
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/handle_input/unknown_DOI_prefix.yml +5 -5
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/json_schema_errors/is_valid.yml +13 -13
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/BlogPosting.yml +4 -4
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/Dataset.yml +6 -6
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/authors_with_affiliations.yml +14 -14
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/climate_data.yml +6 -6
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/from_schema_org.yml +159 -159
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/keywords_subject_scheme.yml +6 -6
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/maremma.yml +12 -10
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/text.yml +4 -4
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/with_data_citation.yml +14 -14
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_bibtex/with_pages.yml +12 -12
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_cff/Collection_of_Jupyter_notebooks.yml +9 -9
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_cff/SoftwareSourceCode_Zenodo.yml +17 -17
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_cff/SoftwareSourceCode_also_Zenodo.yml +12 -12
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_cff/ruby-cff.yml +16 -14
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_citation/Dataset.yml +6 -6
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_citation/Journal_article.yml +14 -14
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_citation/Journal_article_vancouver_style.yml +19 -19
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_citation/Missing_author.yml +12 -12
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_citation/interactive_resource_without_dates.yml +4 -4
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_citation/software_w/version.yml +6 -6
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_codemeta/SoftwareSourceCode_DataCite.yml +4 -4
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_codemeta/SoftwareSourceCode_DataCite_check_codemeta_v2.yml +4 -4
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/another_schema_org_from_front-matter.yml +27 -27
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/journal_article.yml +4 -4
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/journal_article_from_datacite.yml +4 -4
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/json_feed_item_from_rogue_scholar_with_doi.yml +8 -54
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/json_feed_item_from_rogue_scholar_with_organizational_author.yml +3 -3
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/json_feed_item_from_upstream_blog.yml +10 -53
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/json_feed_item_with_references.yml +62 -62
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/posted_content.yml +15 -15
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/schema_org_from_another_science_blog.yml +6 -6
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/schema_org_from_front_matter.yml +29 -29
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_crossref/schema_org_from_upstream_blog.yml +4 -4
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/Another_dataset.yml +28 -28
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/BlogPosting.yml +4 -4
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/BlogPosting_schema_org.yml +158 -158
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/Dataset.yml +6 -6
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/container_title.yml +11 -11
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/interactive_resource_without_dates.yml +4 -4
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/journal_article.yml +14 -14
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/keywords_subject_scheme.yml +6 -6
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/maremma.yml +9 -7
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/missing_creator.yml +12 -12
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/multiple_abstracts.yml +6 -6
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/organization_author.yml +19 -19
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/software.yml +4 -4
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/software_w/version.yml +6 -6
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/with_only_first_page.yml +13 -13
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csl/with_pages.yml +12 -12
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csv/climate_data.yml +6 -6
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csv/maremma.yml +10 -8
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csv/text.yml +4 -4
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csv/with_data_citation.yml +14 -14
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_csv/with_pages.yml +12 -12
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_datacite/dissertation.yml +17 -17
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_datacite/from_schema_org.yml +158 -158
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_datacite/journal_article.yml +18 -18
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_datacite/maremma.yml +10 -8
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_datacite/with_ORCID_ID.yml +12 -12
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_datacite/with_data_citation.yml +14 -14
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_jats_xml/Dataset_in_schema_4_0.yml +6 -6
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_jats_xml/Text_pass-thru.yml +4 -4
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_jats_xml/book_chapter.yml +15 -13
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_jats_xml/from_schema_org.yml +158 -158
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_jats_xml/interactive_resource_without_dates.yml +4 -4
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_jats_xml/maremma.yml +12 -10
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_jats_xml/with_ORCID_ID.yml +12 -12
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_jats_xml/with_data_citation.yml +14 -14
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_jats_xml/with_editor.yml +13 -13
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_ris/BlogPosting.yml +4 -4
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_ris/BlogPosting_schema_org.yml +159 -159
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_ris/Dataset.yml +6 -6
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_ris/alternate_name.yml +4 -4
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_ris/journal_article.yml +8 -8
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_ris/keywords_with_subject_scheme.yml +6 -6
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_ris/maremma.yml +9 -7
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_ris/with_pages.yml +7 -7
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/Another_Schema_org_JSON.yml +6 -6
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/Funding.yml +9 -9
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/Funding_OpenAIRE.yml +9 -9
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/Schema_org_JSON.yml +17 -17
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/Schema_org_JSON_Cyark.yml +33 -33
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/alternate_identifiers.yml +9 -9
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/data_catalog.yml +9 -9
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/geo_location_box.yml +12 -12
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/interactive_resource_without_dates.yml +9 -9
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/journal_article.yml +14 -14
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/maremma_schema_org_JSON.yml +10 -8
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/series_information.yml +9 -9
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/subject_scheme.yml +11 -11
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/subject_scheme_multiple_keywords.yml +11 -11
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_turtle/BlogPosting.yml +4 -4
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_turtle/Dataset.yml +6 -6
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_turtle/journal_article.yml +14 -14
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_turtle/with_pages.yml +12 -12
- data/spec/readers/cff_reader_spec.rb +6 -6
- data/spec/readers/crossref_reader_spec.rb +3 -3
- data/spec/readers/crossref_xml_reader_spec.rb +7 -7
- data/spec/readers/json_feed_reader_spec.rb +13 -13
- data/spec/readers/schema_org_reader_spec.rb +2 -3
- data/spec/spec_helper.rb +1 -0
- data/spec/utils_spec.rb +1 -1
- data/spec/writers/cff_writer_spec.rb +3 -3
- data/spec/writers/ris_writer_spec.rb +2 -2
- data/spec/writers/schema_org_writer_spec.rb +1 -1
- metadata +5 -427
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref/default.yml +0 -110
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref/to_bibtex.yml +0 -110
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref/to_crossref_xml.yml +0 -110
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref/to_datacite.yml +0 -110
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref/to_schema_org.yml +0 -110
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref_xml/default.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref_xml/to_bibtex.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref_xml/to_crossref_xml.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref_xml/to_datacite.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_file/crossref_xml/to_schema_org.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/crossref/default.yml +0 -299
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/crossref/to_bibtex.yml +0 -299
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/crossref/to_citation.yml +0 -299
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/crossref/to_crossref_xml.yml +0 -299
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/crossref/to_datacite.yml +0 -299
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/crossref/to_jats.yml +0 -299
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/crossref/to_schema_org.yml +0 -299
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/datacite/default.yml +0 -172
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/datacite/to_bibtex.yml +0 -172
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/datacite/to_citation.yml +0 -172
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/datacite/to_datacite.yml +0 -172
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/datacite/to_jats.yml +0 -172
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/datacite/to_schema_org.yml +0 -172
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/schema_org/default.yml +0 -1098
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/schema_org/to_datacite.yml +0 -1098
- data/spec/fixtures/vcr_cassettes/Briard_CLI/convert_from_id/schema_org/to_schema_org.yml +0 -1100
- data/spec/fixtures/vcr_cassettes/Briard_CLI/find_from_format_by_id/crossref.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_CLI/find_from_format_by_id/datacite.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_CLI/find_from_format_by_id/jalc.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_CLI/find_from_format_by_id/kisti.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_CLI/find_from_format_by_id/medra.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_CLI/find_from_format_by_id/op.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/authors_as_string/author.yml +0 -164
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/authors_as_string/no_author.yml +0 -164
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/authors_as_string/single_author.yml +0 -164
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/authors_as_string/with_organization.yml +0 -164
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/change_metadata_as_datacite_xml/with_data_citation.yml +0 -247
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/doi_registration_agency/crossref.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/doi_registration_agency/datacite.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/doi_registration_agency/jalc.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/doi_registration_agency/kisti.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/doi_registration_agency/medra.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/doi_registration_agency/not_found.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/doi_registration_agency/op.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/find_from_format_by_ID/crossref.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/find_from_format_by_ID/crossref_doi_not_url.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/find_from_format_by_ID/datacite.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/find_from_format_by_ID/datacite_doi_http.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/find_from_format_by_ID/unknown_DOI_registration_agency.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/fos/hsh_to_fos_for_match.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/fos/hsh_to_fos_for_with_schemeUri_in_hash.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/fos/hsh_to_fos_match.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/fos/hsh_to_fos_no_match.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/fos/name_to_fos_for_match.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/fos/name_to_fos_match.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/fos/name_to_fos_no_match.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/from_schema_org/with_id.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/from_schema_org_creators/with_affiliation.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/from_schema_org_creators/without_affiliation.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_cff_metadata/cff-converter-python.yml +0 -200
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_cff_metadata/ruby-cff.yml +0 -154
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_cff_metadata/ruby-cff_repository_url.yml +0 -154
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_codemeta_metadata/maremma.yml +0 -86
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_codemeta_metadata/metadata_reports.yml +0 -93
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/DOI_with_ORCID_ID.yml +0 -337
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/DOI_with_SICI_DOI.yml +0 -347
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/DOI_with_data_citation.yml +0 -359
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/JaLC.yml +0 -384
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/KISTI.yml +0 -330
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/OP.yml +0 -969
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/affiliation_is_space.yml +0 -358
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/another_book.yml +0 -312
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/another_book_chapter.yml +0 -465
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/article_id_as_page_number.yml +0 -276
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/author_literal.yml +0 -492
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/book.yml +0 -523
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/book_chapter.yml +0 -377
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/book_chapter_with_RDF_for_container.yml +0 -336
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/book_oup.yml +0 -289
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/component.yml +0 -289
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/dataset.yml +0 -299
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/dataset_usda.yml +0 -341
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/date_in_future.yml +0 -570
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/dissertation.yml +0 -301
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/empty_given_name.yml +0 -303
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/invalid_date.yml +0 -307
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/journal_article.yml +0 -461
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/journal_article_original_language_title.yml +0 -276
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/journal_article_with.yml +0 -470
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/journal_article_with_RDF_for_container.yml +0 -519
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/journal_article_with_funding.yml +0 -456
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/journal_issue.yml +0 -270
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/mEDRA.yml +0 -310
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/markup.yml +0 -329
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/missing_creator.yml +0 -307
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/multiple_issn.yml +0 -393
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/multiple_titles.yml +0 -265
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/multiple_titles_with_missing.yml +0 -860
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/not_found_error.yml +0 -209
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/peer_review.yml +0 -287
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/posted_content.yml +0 -326
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/posted_content_copernicus.yml +0 -297
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/report_osti.yml +0 -315
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/vor_with_url.yml +0 -451
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/yet_another_book.yml +0 -816
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_metadata/yet_another_book_chapter.yml +0 -324
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_crossref_raw/journal_article.yml +0 -110
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_datacite_metadata/dissertation.yml +0 -152
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_datacite_metadata/funding_references.yml +0 -175
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_datacite_metadata/subject_scheme.yml +0 -328
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date/publication_date.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date_from_date_parts/date.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date_from_date_parts/year-month.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date_from_date_parts/year.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date_from_parts/date.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date_from_parts/year-month.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date_from_parts/year.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date_parts/date.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date_parts/year-month.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_date_parts/year.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_datetime_from_time/future.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_datetime_from_time/invalid.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_datetime_from_time/nil.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_datetime_from_time/past.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_datetime_from_time/present.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_one_author/has_familyName.yml +0 -133
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_one_author/has_name_in_display-order_with_ORCID.yml +0 -153
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_one_author/is_organization.yml +0 -164
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_one_author/name_with_affiliation_crossref.yml +0 -247
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_one_author/only_familyName_and_givenName.yml +0 -468
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_schema_org_metadata/BlogPosting.yml +0 -530
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_schema_org_metadata/BlogPosting_with_new_DOI.yml +0 -530
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_schema_org_metadata/get_schema_org_metadata_front_matter/BlogPosting.yml +0 -534
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_schema_org_metadata/harvard_dataverse.yml +0 -1838
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_schema_org_metadata/pangaea.yml +0 -468
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_schema_org_metadata/upstream_blog.yml +0 -885
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_schema_org_metadata/zenodo.yml +0 -583
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_series_information/only_title.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_series_information/title_and_pages.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_series_information/title_volume_and_pages.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/get_series_information/title_volume_issue_and_pages.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/github/github_as_cff_url.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/github/github_as_codemeta_url.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/github/github_from_url.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/github/github_from_url_cff_file.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/github/github_from_url_file.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/handle_input/DOI_RA_not_Crossref_or_DataCite.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/handle_input/unknown_DOI_prefix.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/is_personal_name_/has_comma.yml +0 -164
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/is_personal_name_/has_family_name.yml +0 -164
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/is_personal_name_/has_id.yml +0 -164
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/is_personal_name_/has_known_given_name.yml +0 -164
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/is_personal_name_/has_no_info.yml +0 -164
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/is_personal_name_/has_orcid_id.yml +0 -164
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/is_personal_name_/has_type_organization.yml +0 -164
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/json_schema_errors/is_valid.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_cc_url/not_found.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_cc_url/with_trailing_slash.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_cc_url/with_trailing_slash_and_to_https.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/doi.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/doi_as_url.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/filename.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/ftp.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/invalid_url.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/sandbox_via_options.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/sandbox_via_url.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/string.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/url.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_id/url_with_utf-8.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_ids/doi.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_ids/url.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_issn/from_array.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_issn/from_empty_array.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_issn/from_hash.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_issn/from_string.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_url/uri.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_url/with_trailing_slash.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/normalize_url/with_trailing_slash_and_to_https.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/parse_attributes/array.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/parse_attributes/array_of_strings.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/parse_attributes/first.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/parse_attributes/hash.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/parse_attributes/hash_with_array_value.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/parse_attributes/nil.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/parse_attributes/string.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/random_doi/decode_anothe_doi.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/random_doi/decode_doi.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/random_doi/encode_doi.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/sanitize/onlies_keep_specific_tags.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/sanitize/removes_a_tags.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/spdx/hsh_to_spdx_id.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/spdx/hsh_to_spdx_not_found.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/spdx/hsh_to_spdx_url.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/spdx/name_to_spdx_exists.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/spdx/name_to_spdx_id.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/to_schema_org/with_id.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/to_schema_org_identifiers/with_identifiers.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid/validate_orcid.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid/validate_orcid_https.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid/validate_orcid_id.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid/validate_orcid_sandbox.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid/validate_orcid_sandbox_https.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid/validate_orcid_with_spaces.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid/validate_orcid_wrong_id.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid/validate_orcid_www.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid_scheme/validate_orcid_scheme.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid_scheme/validate_orcid_scheme_https.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid_scheme/validate_orcid_scheme_trailing_slash.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_orcid_scheme/validate_orcid_scheme_www.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_url/DOI.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_url/ISSN.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_url/URL.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/validate_url/string.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/BlogPosting.yml +0 -81
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/Dataset.yml +0 -120
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/authors_with_affiliations.yml +0 -186
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/climate_data.yml +0 -74
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/from_schema_org.yml +0 -530
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/keywords_subject_scheme.yml +0 -149
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/maremma.yml +0 -86
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/text.yml +0 -100
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/with_data_citation.yml +0 -247
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_bibtex/with_pages.yml +0 -228
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_cff/Collection_of_Jupyter_notebooks.yml +0 -143
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_cff/SoftwareSourceCode_Zenodo.yml +0 -150
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_cff/SoftwareSourceCode_also_Zenodo.yml +0 -93
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_cff/ruby-cff.yml +0 -154
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_citation/Dataset.yml +0 -120
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_citation/Journal_article.yml +0 -247
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_citation/Journal_article_vancouver_style.yml +0 -299
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_citation/Missing_author.yml +0 -199
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_citation/interactive_resource_without_dates.yml +0 -75
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_citation/software_w/version.yml +0 -86
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_codemeta/SoftwareSourceCode_DataCite.yml +0 -76
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_codemeta/SoftwareSourceCode_DataCite_check_codemeta_v2.yml +0 -76
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_crossref/another_schema_org_from_front-matter.yml +0 -541
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_crossref/journal_article.yml +0 -55
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_crossref/journal_article_from_datacite.yml +0 -85
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_crossref/posted_content.yml +0 -283
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_crossref/schema_org_from_another_science_blog.yml +0 -123
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_crossref/schema_org_from_front_matter.yml +0 -477
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_crossref/schema_org_from_upstream_blog.yml +0 -1025
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/Another_dataset.yml +0 -110
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/BlogPosting.yml +0 -81
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/BlogPosting_schema_org.yml +0 -530
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/Dataset.yml +0 -120
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/container_title.yml +0 -153
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/interactive_resource_without_dates.yml +0 -75
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/journal_article.yml +0 -247
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/keywords_subject_scheme.yml +0 -149
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/maremma.yml +0 -86
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/missing_creator.yml +0 -199
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/multiple_abstracts.yml +0 -101
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/organization_author.yml +0 -314
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/software.yml +0 -90
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/software_w/version.yml +0 -86
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/with_only_first_page.yml +0 -333
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csl/with_pages.yml +0 -228
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csv/climate_data.yml +0 -74
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csv/maremma.yml +0 -86
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csv/text.yml +0 -100
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csv/with_data_citation.yml +0 -247
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_csv/with_pages.yml +0 -228
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_datacite/from_schema_org.yml +0 -530
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_datacite/maremma.yml +0 -86
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_datacite/with_ORCID_ID.yml +0 -228
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_datacite/with_data_citation.yml +0 -247
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_datacite_json/from_schema_org.yml +0 -530
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_datacite_json/maremma.yml +0 -86
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_datacite_json/with_ORCID_ID.yml +0 -228
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_datacite_json/with_data_citation.yml +0 -247
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_jats_xml/Dataset_in_schema_4_0.yml +0 -120
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_jats_xml/Text_pass-thru.yml +0 -106
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_jats_xml/book_chapter.yml +0 -163
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_jats_xml/from_schema_org.yml +0 -530
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_jats_xml/interactive_resource_without_dates.yml +0 -75
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_jats_xml/maremma.yml +0 -86
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_jats_xml/with_ORCID_ID.yml +0 -228
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_jats_xml/with_data_citation.yml +0 -247
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_jats_xml/with_editor.yml +0 -355
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_rdf_xml/BlogPosting.yml +0 -81
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_rdf_xml/BlogPosting_schema_org.yml +0 -530
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_rdf_xml/journal_article.yml +0 -247
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_rdf_xml/maremma.yml +0 -86
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_rdf_xml/with_pages.yml +0 -228
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_ris/BlogPosting.yml +0 -81
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_ris/BlogPosting_schema_org.yml +0 -530
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_ris/Dataset.yml +0 -120
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_ris/alternate_name.yml +0 -138
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_ris/journal_article.yml +0 -115
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_ris/keywords_with_subject_scheme.yml +0 -149
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_ris/maremma.yml +0 -86
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_ris/with_pages.yml +0 -112
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/Another_Schema_org_JSON.yml +0 -120
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/Funding.yml +0 -192
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/Funding_OpenAIRE.yml +0 -150
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/Schema_org_JSON.yml +0 -98
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/Schema_org_JSON_Cyark.yml +0 -160
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/Schema_org_JSON_IsSupplementTo.yml +0 -153
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/alternate_identifiers.yml +0 -131
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/data_catalog.yml +0 -136
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/geo_location_box.yml +0 -181
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/interactive_resource_without_dates.yml +0 -127
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/journal_article.yml +0 -247
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/maremma_schema_org_JSON.yml +0 -86
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/series_information.yml +0 -174
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/subject_scheme.yml +0 -199
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_schema_org/subject_scheme_multiple_keywords.yml +0 -201
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_turtle/BlogPosting.yml +0 -81
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_turtle/BlogPosting_schema_org.yml +0 -530
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_turtle/Dataset.yml +0 -120
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_turtle/journal_article.yml +0 -247
- data/spec/fixtures/vcr_cassettes/Briard_Metadata/write_metadata_as_turtle/with_pages.yml +0 -228
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/fos/hsh_to_fos_for_match.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/fos/hsh_to_fos_for_with_schemeUri_in_hash.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/fos/hsh_to_fos_match.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/fos/hsh_to_fos_no_match.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/fos/name_to_fos_for_match.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/fos/name_to_fos_match.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/fos/name_to_fos_no_match.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/from_schema_org/with_id.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date/publication_date.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date_from_date_parts/date.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date_from_date_parts/year-month.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date_from_date_parts/year.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date_from_parts/date.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date_from_parts/year-month.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date_from_parts/year.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date_parts/date.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date_parts/year-month.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_date_parts/year.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_datetime_from_time/future.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_datetime_from_time/invalid.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_datetime_from_time/nil.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_datetime_from_time/past.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_datetime_from_time/present.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/all_posts.yml +0 -3602
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/behind_the_science.yml +0 -1176
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/citation_style_language.yml +0 -360
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/citation_style_language_blog.yml +0 -360
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/front-matter_blog.yml +0 -1034
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/upstream.yml +0 -2438
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed/upstream_blog.yml +0 -2438
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_json_feed_item/by_uuid.yml +0 -136
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_link/license.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_link/url.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_series_information/only_title.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_series_information/title_and_pages.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_series_information/title_volume_and_pages.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/get_series_information/title_volume_issue_and_pages.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/github/github_as_cff_url.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/github/github_as_codemeta_url.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/github/github_from_url.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/github/github_from_url_cff_file.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/github/github_from_url_file.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/json_feed_unregistered_url/all_posts.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_cc_url/not_found.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_cc_url/with_trailing_slash.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_cc_url/with_trailing_slash_and_to_https.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/doi.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/doi_as_url.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/filename.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/ftp.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/invalid_url.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/sandbox_via_options.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/sandbox_via_url.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/string.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/url.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_id/url_with_utf-8.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_issn/from_array.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_issn/from_empty_array.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_issn/from_hash.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_issn/from_string.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_url/uri.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_url/with_trailing_slash.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/normalize_url/with_trailing_slash_and_to_https.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/parse_attributes/array.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/parse_attributes/array_of_strings.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/parse_attributes/first.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/parse_attributes/hash.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/parse_attributes/hash_with_array_value.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/parse_attributes/nil.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/parse_attributes/string.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/random_doi/decode_anothe_doi.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/random_doi/decode_another_doi.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/random_doi/decode_doi.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/random_doi/encode_doi.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/random_id/decode_another_id.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/random_id/decode_id.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/random_id/encode_id.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/sanitize/onlies_keep_specific_tags.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/sanitize/removes_a_tags.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/spdx/hsh_to_spdx_id.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/spdx/hsh_to_spdx_not_found.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/spdx/hsh_to_spdx_url.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/spdx/name_to_spdx_exists.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/spdx/name_to_spdx_id.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/to_schema_org/with_id.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/to_schema_org_identifiers/with_identifiers.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid/validate_orcid.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid/validate_orcid_https.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid/validate_orcid_id.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid/validate_orcid_sandbox.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid/validate_orcid_sandbox_https.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid/validate_orcid_with_spaces.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid/validate_orcid_wrong_id.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid/validate_orcid_www.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid_scheme/validate_orcid_scheme.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid_scheme/validate_orcid_scheme_https.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid_scheme/validate_orcid_scheme_trailing_slash.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_orcid_scheme/validate_orcid_scheme_www.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_url/DOI.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_url/ISSN.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_url/URL.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/validate_url/string.yml +0 -221
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_rdf_xml/BlogPosting.yml +0 -81
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_rdf_xml/BlogPosting_schema_org.yml +0 -530
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_rdf_xml/journal_article.yml +0 -247
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_rdf_xml/maremma.yml +0 -86
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_rdf_xml/with_pages.yml +0 -228
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_schema_org/Schema_org_JSON_IsSupplementTo.yml +0 -153
- data/spec/fixtures/vcr_cassettes/Commonmeta_Metadata/write_metadata_as_turtle/BlogPosting_schema_org.yml +0 -530
@@ -23,13 +23,13 @@ http_interactions:
  Cache-Control:
  - public, max-age=0, must-revalidate
  Content-Length:
- - '
+ - '49530'
  Content-Type:
  - application/json; charset=utf-8
  Date:
- - Sun, 18 Jun 2023
+ - Sun, 18 Jun 2023 15:24:19 GMT
  Etag:
- - '"
+ - '"xv42bhvvc21253"'
  Server:
  - Vercel
  Strict-Transport-Security:
@@ -39,7 +39,7 @@ http_interactions:
  X-Vercel-Cache:
  - MISS
  X-Vercel-Id:
- - fra1::iad1::
+ - fra1::iad1::6gh26-1687101859043-0548b5ea306e
  Connection:
  - close
  body:
@@ -50,420 +50,7 @@ http_interactions:
  feed</a>.<br>ISSN 2051-8188. Written content on this site is licensed under
  a <a href=\"https://creativecommons.org/licenses/by/4.0/\">Creative Commons
  Attribution 4.0 International license</a>.","language":"en","favicon":null,"feed_url":"https://iphylo.blogspot.com/feeds/posts/default","feed_format":"application/atom+xml","home_page_url":"https://iphylo.blogspot.com/","indexed_at":"2023-02-06","modified_at":"2023-05-31T17:26:00+00:00","license":"https://creativecommons.org/licenses/by/4.0/legalcode","generator":"Blogger
- 7.00","category":"Natural Sciences","backlog":true,"prefix":"10.59350","items":[{"id":"https://doi.org/10.59350/
- markdown, and taxonomic trees","summary":"Returning to the subject of personal
- knowledge graphs Kyle Scheer has an interesting repository of Markdown files
- that describe academic disciplines at https://github.com/kyletscheer/academic-disciplines
- (see his blog post for more background). If you add these files to Obsidian
- you get a nice visualisation of a taxonomy of academic disciplines. The applications
- of this to biological taxonomy seem obvious, especially as a tool like Obsidian
- enables all sorts of interesting links to be added...","date_published":"2022-04-07T21:07:00Z","date_modified":"2022-04-07T21:15:34Z","date_indexed":"1909-06-16T09:41:45+00:00","authors":[{"url":null,"name":"Roderic
- Page"}],"image":null,"content_html":"<p>Returning to the subject of <a href=\"https://iphylo.blogspot.com/2020/08/personal-knowledge-graphs-obsidian-roam.html\">personal
- knowledge graphs</a> Kyle Scheer has an interesting repository of Markdown
- files that describe academic disciplines at <a href=\"https://github.com/kyletscheer/academic-disciplines\">https://github.com/kyletscheer/academic-disciplines</a>
- (see <a href=\"https://kyletscheer.medium.com/on-creating-a-tree-of-knowledge-f099c1028bf6\">his
- blog post</a> for more background).</p>\n\n<p>If you add these files to <a
- href=\"https://obsidian.md/\">Obsidian</a> you get a nice visualisation of
|
67
|
-
a taxonomy of academic disciplines. The applications of this to biological
|
68
|
-
taxonomy seem obvious, especially as a tool like Obsidian enables all sorts
|
69
|
-
of interesting links to be added (e.g., we could add links to the taxonomic
|
70
|
-
research behind each node in the taxonomic tree, the people doing that research,
|
71
|
-
etc. - although that would mean we''d no longer have a simple tree).</p>\n\n<p>The
|
72
|
-
more I look at these sort of simple Markdown-based tools the more I wonder
|
73
|
-
whether we could make more use of them to create simple but persistent databases.
|
74
|
-
Text files seem the most stable, long-lived digital format around, maybe this
|
75
|
-
would be a way to minimise the inevitable obsolescence of database and server
|
76
|
-
software. Time for some experiments I feel... can we take a taxonomic group,
|
77
|
-
such as mammals, and create a richly connected database purely in Markdown?</p>\n\n<div
|
78
|
-
class=\"separator\" style=\"clear: both; text-align: center;\"><iframe allowfullscreen=''allowfullscreen''
|
79
|
-
webkitallowfullscreen=''webkitallowfullscreen'' mozallowfullscreen=''mozallowfullscreen''
|
80
|
-
width=''400'' height=''322'' src=''https://www.blogger.com/video.g?token=AD6v5dxZtweOTJTdg6aqvICq_tKF0la1QZuDAEpwPPCVQKtG5vjB-DzuQv-ApL8JnpyZ1FffYtWo6ymizNQ''
|
81
|
-
class=''b-hbp-video b-uploaded'' frameborder=''0''></iframe></div>","tags":["markdown","obsidian"],"language":"en","references":[]},{"id":"https://doi.org/10.59350/m48f7-c2128","uuid":"8aea47e4-f227-45f4-b37b-0454a8a7a3ff","url":"https://iphylo.blogspot.com/2023/04/chatgpt-semantic-search-and-knowledge.html","title":"ChatGPT,
|
82
|
-
semantic search, and knowledge graphs","summary":"One thing about ChatGPT
|
83
|
-
is it has opened my eyes to some concepts I was dimly aware of but am only
|
84
|
-
now beginning to fully appreciate. ChatGPT enables you ask it questions, but
|
85
|
-
the answers depend on what ChatGPT “knows”. As several people have noted,
|
86
|
-
what would be even better is to be able to run ChatGPT on your own content.
|
87
|
-
Indeed, ChatGPT itself now supports this using plugins. Paul Graham GPT However,
|
88
|
-
it’s still useful to see how to add ChatGPT functionality to your own content
|
89
|
-
from...","date_published":"2023-04-03T15:30:00Z","date_modified":"2023-04-03T15:32:04Z","date_indexed":"1909-06-16T09:02:34+00:00","authors":[{"url":null,"name":"Roderic
|
90
|
-
Page"}],"image":null,"content_html":"<p>One thing about ChatGPT is it has
|
91
|
-
opened my eyes to some concepts I was dimly aware of but am only now beginning
|
92
|
-
to fully appreciate. ChatGPT enables you ask it questions, but the answers
|
93
|
-
depend on what ChatGPT “knows”. As several people have noted, what would be
|
94
|
-
even better is to be able to run ChatGPT on your own content. Indeed, ChatGPT
|
95
|
-
itself now supports this using <a href=\"https://openai.com/blog/chatgpt-plugins\">plugins</a>.</p>\n<h4
|
96
|
-
id=\"paul-graham-gpt\">Paul Graham GPT</h4>\n<p>However, it’s still useful
|
97
|
-
to see how to add ChatGPT functionality to your own content from scratch.
|
98
|
-
A nice example of this is <a href=\"https://paul-graham-gpt.vercel.app/\">Paul
|
99
|
-
Graham GPT</a> by <a href=\"https://twitter.com/mckaywrigley\">Mckay Wrigley</a>.
|
100
|
-
Mckay Wrigley took essays by Paul Graham (a well known venture capitalist)
|
101
|
-
and built a question and answer tool very like ChatGPT.</p>\n<iframe width=\"560\"
|
102
|
-
height=\"315\" src=\"https://www.youtube.com/embed/ii1jcLg-eIQ\" title=\"YouTube
|
103
|
-
video player\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write;
|
104
|
-
encrypted-media; gyroscope; picture-in-picture; web-share\" allowfullscreen></iframe>\n<p>Because
|
105
|
-
you can send a block of text to ChatGPT (as part of the prompt) you can get
|
106
|
-
ChatGPT to summarise or transform that information, or answer questions based
|
107
|
-
on that information. But there is a limit to how much information you can
|
108
|
-
pack into a prompt. You can’t put all of Paul Graham’s essays into a prompt
|
109
|
-
for example. So a solution is to do some preprocessing. For example, given
|
110
|
-
a question such as “How do I start a startup?” we could first find the essays
|
111
|
-
that are most relevant to this question, then use them to create a prompt
|
112
|
-
for ChatGPT. A quick and dirty way to do this is simply do a text search over
|
113
|
-
the essays and take the top hits. But we aren’t searching for words, we are
|
114
|
-
searching for answers to a question. The essay with the best answer might
|
115
|
-
not include the phrase “How do I start a startup?”.</p>\n<h4 id=\"semantic-search\">Semantic
|
116
|
-
search</h4>\n<p>Enter <a href=\"https://en.wikipedia.org/wiki/Semantic_search\">Semantic
|
117
|
-
search</a>. The key concept behind semantic search is that we are looking
|
118
|
-
for documents with similar meaning, not just similarity of text. One approach
|
119
|
-
to this is to represent documents by “embeddings”, that is, a vector of numbers
|
120
|
-
that encapsulate features of the document. Documents with similar vectors
|
121
|
-
are potentially related. In semantic search we take the query (e.g., “How
|
122
|
-
do I start a startup?”), compute its embedding, then search among the documents
|
123
|
-
for those with similar embeddings.</p>\n<p>To create Paul Graham GPT Mckay
|
124
|
-
Wrigley did the following. First he sent each essay to the OpenAI API underlying
|
125
|
-
ChatGPT, and in return he got the embedding for that essay (a vector of 1536
|
126
|
-
numbers). Each embedding was stored in a database (Mckay uses Postgres with
|
127
|
-
<a href=\"https://github.com/pgvector/pgvector\">pgvector</a>). When a user
|
128
|
-
enters a query such as “How do I start a startup?” that query is also sent
|
129
|
-
to the OpenAI API to retrieve its embedding vector. Then we query the database
|
130
|
-
of embeddings for Paul Graham’s essays and take the top five hits. These hits
|
131
|
-
are, one hopes, the most likely to contain relevant answers. The original
|
132
|
-
question and the most similar essays are then bundled up and sent to ChatGPT
|
133
|
-
which then synthesises an answer. See his <a href=\"https://github.com/mckaywrigley/paul-graham-gpt\">GitHub
|
134
|
-
repo</a> for more details. Note that we are still using ChatGPT, but on a
|
135
|
-
set of documents it doesn’t already have.</p>\n<h4 id=\"knowledge-graphs\">Knowledge
|
136
|
-
graphs</h4>\n<p>I’m a fan of knowledge graphs, but they are not terribly easy
|
137
|
-
to use. For example, I built a knowledge graph of Australian animals <a href=\"https://ozymandias-demo.herokuapp.com\">Ozymandias</a>
|
138
|
-
that contains a wealth of information on taxa, publications, and people, wrapped
|
139
|
-
up in a web site. If you want to learn more you need to figure out how to
|
140
|
-
write queries in SPARQL, which is not fun. Maybe we could use ChatGPT to write
|
141
|
-
the SPARQL queries for us, but it would be much more fun to be simply ask
|
142
|
-
natural language queries (e.g., “who are the experts on Australian ants?”).
|
143
|
-
I made some naïve notes on these ideas <a href=\"https://iphylo.blogspot.com/2015/09/possible-project-natural-language.html\">Possible
|
144
|
-
project: natural language queries, or answering “how many species are there?”</a>
|
145
|
-
and <a href=\"https://iphylo.blogspot.com/2019/05/ozymandias-meets-wikipedia-with-notes.html\">Ozymandias
|
146
|
-
meets Wikipedia, with notes on natural language generation</a>.</p>\n<p>Of
|
147
|
-
course, this is a well known problem. Tools such as <a href=\"http://rdf2vec.org\">RDF2vec</a>
|
148
|
-
can take RDF from a knowledge graph and create embeddings which could in tern
|
149
|
-
be used to support semantic search. But it seems to me that we could simply
|
150
|
-
this process a bit by making use of ChatGPT.</p>\n<p>Firstly we would generate
|
151
|
-
natural language statements from the knowledge graph (e.g., “species x belongs
|
152
|
-
to genus y and was described in z”, “this paper on ants was authored by x”,
|
153
|
-
etc.) that cover the basic questions we expect people to ask. We then get
|
154
|
-
embeddings for these (e.g., using OpenAI). We then have an interface where
|
155
|
-
people can ask a question (“is species x a valid species?”, “who has published
|
156
|
-
on ants”, etc.), we get the embedding for that question, retrieve natural
|
157
|
-
language statements that the closest in embedding “space”, package everything
|
158
|
-
up and ask ChatGPT to summarise the answer.</p>\n<p>The trick, of course,
|
159
|
-
is to figure out how t generate natural language statements from the knowledge
|
160
|
-
graph (which amounts to deciding what paths to traverse in the knowledge graph,
|
161
|
-
and how to write those paths is something approximating English). We also
|
162
|
-
want to know something about the sorts of questions people are likely to ask
|
163
|
-
so that we have a reasonable chance of having the answers (for example, are
|
164
|
-
people going to ask about individual species, or questions about summary statistics
|
165
|
-
such as numbers of species in a genus, etc.).</p>\n<p>What makes this attractive
|
166
|
-
is that it seems a straightforward way to go from a largely academic exercise
|
167
|
-
(build a knowledge graph) to something potentially useful (a question and
|
168
|
-
answer machine). Imagine if something like the defunct BBC wildlife site (see
|
169
|
-
<a href=\"https://iphylo.blogspot.com/2017/12/blue-planet-ii-bbc-and-semantic-web.html\">Blue
|
170
|
-
Planet II, the BBC, and the Semantic Web: a tale of lessons forgotten and
|
171
|
-
opportunities lost</a>) revived <a href=\"https://aspiring-look.glitch.me\">here</a>
|
172
|
-
had a question and answer interface where we could ask questions rather than
|
173
|
-
passively browse.</p>\n<h4 id=\"summary\">Summary</h4>\n<p>I have so much
|
174
|
-
more to learn, and need to think about ways to incorporate semantic search
|
175
|
-
and ChatGPT-like tools into knowledge graphs.</p>\n<blockquote>\n<p>Written
|
176
|
-
with <a href=\"https://stackedit.io/\">StackEdit</a>.</p>\n</blockquote>","tags":[],"language":"en","references":[]},{"id":"https://doi.org/10.59350/zc4qc-77616","uuid":"30c78d9d-2e50-49db-9f4f-b3baa060387b","url":"https://iphylo.blogspot.com/2022/09/does-anyone-cite-taxonomic-treatments.html","title":"Does
|
177
|
-
anyone cite taxonomic treatments?","summary":"Taxonomic treatments have come
|
178
|
-
up in various discussions I''m involved in, and I''m curious as to whether
|
179
|
-
they are actually being used, in particular, whether they are actually being
|
180
|
-
cited. Consider the following quote: The taxa are described in taxonomic treatments,
|
181
|
-
well defined sections of scientific publications (Catapano 2019). They include
|
182
|
-
a nomenclatural section and one or more sections including descriptions, material
|
183
|
-
citations referring to studied specimens, or notes ecology and...","date_published":"2022-09-01T16:49:00Z","date_modified":"2022-09-01T16:49:51Z","date_indexed":"1909-06-16T09:31:50+00:00","authors":[{"url":null,"name":"Roderic
|
184
|
-
Page"}],"image":null,"content_html":"<div class=\"separator\" style=\"clear:
|
185
|
-
both;\"><a href=\"https://zenodo.org/record/5731100/thumb100\" style=\"display:
|
186
|
-
block; padding: 1em 0; text-align: center; clear: right; float: right;\"><img
|
187
|
-
alt=\"\" border=\"0\" height=\"128\" data-original-height=\"106\" data-original-width=\"100\"
|
188
|
-
src=\"https://zenodo.org/record/5731100/thumb250\"/></a></div>\nTaxonomic
|
189
|
-
treatments have come up in various discussions I''m involved in, and I''m
|
190
|
-
curious as to whether they are actually being used, in particular, whether
|
191
|
-
they are actually being cited. Consider the following quote:\n\n<blockquote>\nThe
|
192
|
-
taxa are described in taxonomic treatments, well defined sections of scientific
|
193
|
-
publications (Catapano 2019). They include a nomenclatural section and one
|
194
|
-
or more sections including descriptions, material citations referring to studied
|
195
|
-
specimens, or notes ecology and behavior. In case the treatment does not describe
|
196
|
-
a new discovered taxon, previous treatments are cited in the form of treatment
|
197
|
-
citations. This citation can refer to a previous treatment and add additional
|
198
|
-
data, or it can be a statement synonymizing the taxon with another taxon.
|
199
|
-
This allows building a citation network, and ultimately is a constituent part
|
200
|
-
of the catalogue of life. - Taxonomic Treatments as Open FAIR Digital Objects
|
201
|
-
<a href=\"https://doi.org/10.3897/rio.8.e93709\">https://doi.org/10.3897/rio.8.e93709</a>\n</blockquote>\n\n<p>\n
|
202
|
-
\"Traditional\" academic citation is from article to article. For example,
|
203
|
-
consider these two papers:\n\n<blockquote>\nLi Y, Li S, Lin Y (2021) Taxonomic
|
204
|
-
study on fourteen symphytognathid species from Asia (Araneae, Symphytognathidae).
|
205
|
-
ZooKeys 1072: 1-47. https://doi.org/10.3897/zookeys.1072.67935\n</blockquote>\n\n<blockquote>\nMiller
|
206
|
-
J, Griswold C, Yin C (2009) The symphytognathoid spiders of the Gaoligongshan,
|
207
|
-
Yunnan, China (Araneae: Araneoidea): Systematics and diversity of micro-orbweavers.
|
208
|
-
ZooKeys 11: 9-195. https://doi.org/10.3897/zookeys.11.160\n</blockquote>\n</p>\n\n<p>Li
|
209
|
-
et al. 2021 cites Miller et al. 2009 (although Pensoft seems to have broken
|
210
|
-
the citation such that it does appear correctly either on their web page or
|
211
|
-
in CrossRef).</p>\n\n<p>So, we have this link: [article]10.3897/zookeys.1072.67935
|
212
|
-
--cites--> [article]10.3897/zookeys.11.160. One article cites another.</p>\n\n<p>In
|
213
|
-
their 2021 paper Li et al. discuss <i>Patu jidanweishi</i> Miller, Griswold
|
214
|
-
& Yin, 2009:\n\n<div class=\"separator\" style=\"clear: both;\"><a href=\"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPavMHXqQNX1ls_zXo9kIcMLHPxc7ZpV9NCSof5wLumrg3ovPoi6nYKzZsINuqFtYoEvW1QrerePD-MEf2DJaYUXlT37d81x3L6ILls7u229rg0_Nc0uUmgW-ICzr6MI_QCZfgQbYGTxuofu-fuPVoygbCnm3vQVYOhLDLtp1EtQ9jRZHDvw/s1040/Screenshot%202022-09-01%20at%2017.12.27.png\"
|
215
|
-
style=\"display: block; padding: 1em 0; text-align: center; \"><img alt=\"\"
|
216
|
-
border=\"0\" width=\"400\" data-original-height=\"314\" data-original-width=\"1040\"
|
217
|
-
src=\"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPavMHXqQNX1ls_zXo9kIcMLHPxc7ZpV9NCSof5wLumrg3ovPoi6nYKzZsINuqFtYoEvW1QrerePD-MEf2DJaYUXlT37d81x3L6ILls7u229rg0_Nc0uUmgW-ICzr6MI_QCZfgQbYGTxuofu-fuPVoygbCnm3vQVYOhLDLtp1EtQ9jRZHDvw/s400/Screenshot%202022-09-01%20at%2017.12.27.png\"/></a></div>\n\n<p>There
|
218
|
-
is a treatment for the original description of <i>Patu jidanweishi</i> at
|
219
|
-
<a href=\"https://doi.org/10.5281/zenodo.3792232\">https://doi.org/10.5281/zenodo.3792232</a>,
|
220
|
-
which was created by Plazi with a time stamp \"2020-05-06T04:59:53.278684+00:00\".
|
221
|
-
The original publication date was 2009, the treatments are being added retrospectively.</p>\n\n<p>In
|
222
|
-
an ideal world my expectation would be that Li et al. 2021 would have cited
|
223
|
-
the treatment, instead of just providing the text string \"Patu jidanweishi
|
224
|
-
Miller, Griswold & Yin, 2009: 64, figs 65A–E, 66A, B, 67A–D, 68A–F, 69A–F,
|
225
|
-
70A–F and 71A–F (♂♀).\" Isn''t the expectation under the treatment model that
|
226
|
-
we would have seen this relationship:</p>\n\n<p>[article]10.3897/zookeys.1072.67935
|
227
|
-
--cites--> [treatment]https://doi.org/10.5281/zenodo.3792232</p>\n\n<p>Furthermore,
|
228
|
-
if it is the case that \"[i]n case the treatment does not describe a new discovered
|
229
|
-
taxon, previous treatments are cited in the form of treatment citations\"
|
230
|
-
then we should also see a citation between treatments, in other words Li et
|
231
|
-
al.''s 2021 treatment of <i>Patu jidanweishi</i> (which doesn''t seem to have
|
232
|
-
a DOI but is available on Plazi'' web site as <a href=\"https://tb.plazi.org/GgServer/html/1CD9FEC313A35240938EC58ABB858E74\">https://tb.plazi.org/GgServer/html/1CD9FEC313A35240938EC58ABB858E74</a>)
|
233
|
-
should also cite the original treatment? It doesn''t - but it does cite the
|
234
|
-
Miller et al. paper.</p>\n\n<p>So in this example we don''t see articles citing
|
235
|
-
treatments, nor do we see treatments citing treatments. Playing Devil''s advocate,
|
236
|
-
why then do we have treatments? Does''t the lack of citations suggest that
|
237
|
-
- despite some taxonomists saying this is the unit that matters - they actually
|
238
|
-
don''t. If we pay attention to what people do rather than what they say they
|
239
|
-
do, they cite articles.</p>\n\n<p>Now, there are all sorts of reasons why
|
240
|
-
we don''t see [article] -> [treatment] citations, or [treatment] -> [treatment]
|
241
|
-
citations. Treatments are being added after the fact by Plazi, not by the
|
242
|
-
authors of the original work. And in many cases the treatments that could
|
243
|
-
be cited haven''t appeared until after that potentially citing work was published.
|
244
|
-
In the example above the Miller et al. paper dates from 2009, but the treatment
|
245
|
-
extracted only went online in 2020. And while there is a long standing culture
|
246
|
-
of citing publications (ideally using DOIs) there isn''t an equivalent culture
|
247
|
-
of citing treatments (beyond the simple text strings).</p>\n\n<p>Obviously
|
248
|
-
this is but one example. I''d need to do some exploration of the citation
|
249
|
-
graph to get a better sense of citations patterns, perhaps using <a href=\"https://www.crossref.org/documentation/event-data/\">CrossRef''s
|
250
|
-
event data</a>. But my sense is that taxonomists don''t cite treatments.</p>\n\n<p>I''m
|
251
|
-
guessing Plazi would respond by saying treatments are cited, for example (indirectly)
|
252
|
-
in GBIF downloads. This is true, although arguably people aren''t citing the
|
253
|
-
treatment, they''re citing specimen data in those treatments, and that specimen
|
254
|
-
data could be extracted at the level of articles rather than treatments. In
|
255
|
-
other words, it''s not the treatments themselves that people are citing.</p>\n\n<p>To
|
256
|
-
be clear, I think there is value in being able to identify those \"well defined
|
257
|
-
sections\" of a publication that deal with a given taxon (i.e., treatments),
|
258
|
-
but it''s not clear to me that these are actually the citable units people
|
259
|
-
might hope them to be. Likewise, journals such as <i>ZooKeys</i> have DOIs
|
260
|
-
for individual figures. Does anyone actually cite those?</p>","tags":[],"language":"en","references":[]},{"id":"https://doi.org/10.59350/pmhat-5ky65","uuid":"5891c709-d139-440f-bacb-06244424587a","url":"https://iphylo.blogspot.com/2021/10/problems-with-plazi-parsing-how.html","title":"Problems
|
261
|
-
with Plazi parsing: how reliable are automated methods for extracting specimens
|
262
|
-
from the literature?","summary":"The Plazi project has become one of the major
|
263
|
-
contributors to GBIF with some 36,000 datasets yielding some 500,000 occurrences
|
264
|
-
(see Plazi''s GBIF page for details). These occurrences are extracted from
|
265
|
-
taxonomic publication using automated methods. New data is published almost
|
266
|
-
daily (see latest treatments). The map below shows the geographic distribution
|
267
|
-
of material citations provided to GBIF by Plazi, which gives you a sense of
|
268
|
-
the size of the dataset. By any metric Plazi represents a...","date_published":"2021-10-25T11:10:00Z","date_modified":"2021-10-28T16:08:18Z","date_indexed":"1970-01-01T00:00:00+00:00","authors":[{"url":null,"name":"Roderic
|
269
|
-
Page"}],"image":null,"content_html":"<div class=\"separator\" style=\"clear:
|
270
|
-
both;\"><a href=\"https://1.bp.blogspot.com/-oiqSkA53FI4/YXaRAUpRgFI/AAAAAAAAgzY/PbPEOh_VlhIE0aaqqhVQPLAOE5pRULwJACLcBGAsYHQ/s240/Rf7UoXTw_400x400.jpg\"
|
271
|
-
style=\"display: block; padding: 1em 0; text-align: center; clear: right;
|
272
|
-
float: right;\"><img alt=\"\" border=\"0\" width=\"128\" data-original-height=\"240\"
|
273
|
-
data-original-width=\"240\" src=\"https://1.bp.blogspot.com/-oiqSkA53FI4/YXaRAUpRgFI/AAAAAAAAgzY/PbPEOh_VlhIE0aaqqhVQPLAOE5pRULwJACLcBGAsYHQ/s200/Rf7UoXTw_400x400.jpg\"/></a></div><p>The
|
274
|
-
<a href=\"http://plazi.org\">Plazi</a> project has become one of the major
|
275
|
-
contributors to GBIF with some 36,000 datasets yielding some 500,000 occurrences
|
276
|
-
(see <a href=\"https://www.gbif.org/publisher/7ce8aef0-9e92-11dc-8738-b8a03c50a862\">Plazi''s
|
277
|
-
GBIF page</a> for details). These occurrences are extracted from taxonomic
|
278
|
-
publication using automated methods. New data is published almost daily (see
|
279
|
-
<a href=\"https://tb.plazi.org/GgServer/static/newToday.html\">latest treatments</a>).
|
280
|
-
The map below shows the geographic distribution of material citations provided
|
281
|
-
to GBIF by Plazi, which gives you a sense of the size of the dataset.</p>\n\n<div
|
282
|
-
class=\"separator\" style=\"clear: both;\"><a href=\"https://1.bp.blogspot.com/-DCJ4HR8eej8/YXaRQnz22bI/AAAAAAAAgz4/AgRcree6jVgjtQL2ch7IXgtb_Xtx7fkngCPcBGAYYCw/s1030/Screenshot%2B2021-10-24%2Bat%2B20.43.23.png\"
|
283
|
-
style=\"display: block; padding: 1em 0; text-align: center; \"><img alt=\"\"
|
284
|
-
border=\"0\" width=\"400\" data-original-height=\"514\" data-original-width=\"1030\"
|
285
|
-
src=\"https://1.bp.blogspot.com/-DCJ4HR8eej8/YXaRQnz22bI/AAAAAAAAgz4/AgRcree6jVgjtQL2ch7IXgtb_Xtx7fkngCPcBGAYYCw/s400/Screenshot%2B2021-10-24%2Bat%2B20.43.23.png\"/></a></div>\n\n<p>By
|
286
|
-
any metric Plazi represents a considerable achievement. But often when I browse
|
287
|
-
individual records on Plazi I find records that seem clearly incorrect. Text
|
288
|
-
mining the literature is a challenging problem, but at the moment Plazi seems
|
289
|
-
something of a \"black box\". PDFs go in, the content is mined, and data comes
|
290
|
-
up to be displayed on the Plazi web site and uploaded to GBIF. Nowhere does
|
291
|
-
there seem to be an evaluation of how accurate this text mining actually is.
|
292
|
-
Anecdotally it seems to work well in some cases, but in others it produces
|
293
|
-
what can only be described as bogus records.</p>\n\n<h2>Finding errors</h2>\n\n<p>A
|
294
|
-
treatment in Plazi is a block of text (and sometimes illustrations) that refers
|
295
|
-
to a single taxon. Often that text will include a description of the taxon,
|
296
|
-
and list one or more specimens that have been examined. These lists of specimens
|
297
|
-
(\"material citations\") are one of the key bits of information that Plaza
|
298
|
-
extracts from a treatment as these citations get fed into GBIF as occurrences.</p>\n\n<p>To
|
299
|
-
help explore treatments I''ve constructed a simple web site that takes the
|
300
|
-
Plazi identifier for a treatment and displays that treatment with the material
|
301
|
-
citations highlighted. For example, for the Plazi treatment <a href=\"https://tb.plazi.org/GgServer/html/03B5A943FFBB6F02FE27EC94FABEEAE7\">03B5A943FFBB6F02FE27EC94FABEEAE7</a>
|
302
|
-
you can view the marked up version at <a href=\"https://plazi-tester.herokuapp.com/?uri=622F7788-F0A4-449D-814A-5B49CD20B228\">https://plazi-tester.herokuapp.com/?uri=622F7788-F0A4-449D-814A-5B49CD20B228</a>.
|
303
|
-
Below is an example of a material citation with its component parts tagged:</p>\n\n<div
|
304
|
-
class=\"separator\" style=\"clear: both;\"><a href=\"https://1.bp.blogspot.com/-NIGuo9BggdA/YXaRQQrv0QI/AAAAAAAAgz4/SZDcA1jZSN47JMRTWDwSMRpHUShrCeOdgCPcBGAYYCw/s693/Screenshot%2B2021-10-24%2Bat%2B20.59.56.png\"
|
305
|
-
style=\"display: block; padding: 1em 0; text-align: center; \"><img alt=\"\"
|
306
|
-
border=\"0\" width=\"400\" data-original-height=\"94\" data-original-width=\"693\"
|
307
|
-
src=\"https://1.bp.blogspot.com/-NIGuo9BggdA/YXaRQQrv0QI/AAAAAAAAgz4/SZDcA1jZSN47JMRTWDwSMRpHUShrCeOdgCPcBGAYYCw/s400/Screenshot%2B2021-10-24%2Bat%2B20.59.56.png\"/></a></div>\n\n<p>This
|
308
|
-
is an example where Plazi has successfully parsed the specimen. But I keep
|
309
|
-
coming across cases where specimens have not been parsed correctly, resulting
|
310
|
-
in issues such as single specimens being split into multiple records (e.g., <a
|
311
|
-
href=\"https://plazi-tester.herokuapp.com/?uri=5244B05EFFC8E20F7BC32056C178F496\">https://plazi-tester.herokuapp.com/?uri=5244B05EFFC8E20F7BC32056C178F496</a>),
|
312
|
-
geographical coordinates being misinterpreted (e.g., <a href=\"https://plazi-tester.herokuapp.com/?uri=0D228E6AFFC2FFEFFF4DE8118C4EE6B9\">https://plazi-tester.herokuapp.com/?uri=0D228E6AFFC2FFEFFF4DE8118C4EE6B9</a>),
|
313
|
-
or collector''s initials being confused with codes for natural history collections
|
314
|
-
(e.g., <a href=\"https://plazi-tester.herokuapp.com/?uri=252C87918B362C05FF20F8C5BFCB3D4E\">https://plazi-tester.herokuapp.com/?uri=252C87918B362C05FF20F8C5BFCB3D4E</a>).</p>\n\n<p>Parsing
|
315
|
-
specimens is a hard problem so it''s not unexpected to find errors. But they
|
316
|
-
do seem common enough to be easily found, which raises the question of just
|
317
|
-
what percentage of these material citations are correct? How much of the
|
318
|
-
data Plazi feeds to GBIF is correct? How would we know?</p>\n\n<h2>Systemic
|
319
|
-
problems</h2>\n\n<p>Some of the errors I''ve found concern the interpretation
|
320
|
-
of the parsed data. For example, it is striking that despite including marine
|
321
|
-
taxa <b>no</b> Plazi record has a value for depth below sea level (see <a
|
322
|
-
href=\"https://www.gbif.org/occurrence/map?depth=0,9999&publishing_org=7ce8aef0-9e92-11dc-8738-b8a03c50a862&advanced=1\">GBIF
|
323
|
-
search on depth range 0-9999 for Plazi</a>). But <a href=\"https://www.gbif.org/occurrence/map?elevation=0,9999&publishing_org=7ce8aef0-9e92-11dc-8738-b8a03c50a862&advanced=1\">many
|
324
|
-
records do have an elevation</a>, including records from marine environments.
|
325
|
-
Any record that has a depth value is interpreted by Plazi as being elevation,
|
326
|
-
so we have aerial crustacea and fish.</p>\n\n<h3>Map of Plazi records with
|
327
|
-
depth 0-9999m</h3>\n<div class=\"separator\" style=\"clear: both;\"><a href=\"https://1.bp.blogspot.com/-GD4pPtPCxVc/YXaRQ9bdn1I/AAAAAAAAgz8/A9YsypSvHfwWKAjDxSdeFVUkou88LGItACPcBGAYYCw/s673/Screenshot%2B2021-10-25%2Bat%2B12.03.51.png\"
|
328
|
-
style=\"display: block; padding: 1em 0; text-align: center; \"><img alt=\"\"
|
329
|
-
border=\"0\" width=\"400\" data-original-height=\"258\" data-original-width=\"673\"
|
330
|
-
src=\"https://1.bp.blogspot.com/-GD4pPtPCxVc/YXaRQ9bdn1I/AAAAAAAAgz8/A9YsypSvHfwWKAjDxSdeFVUkou88LGItACPcBGAYYCw/s320/Screenshot%2B2021-10-25%2Bat%2B12.03.51.png\"/></a></div>\n\n<h3>Map
|
331
|
-
of Plazi records with elevation 0-9999m </h3>\n<div class=\"separator\" style=\"clear:
|
332
|
-
both;\"><a href=\"https://1.bp.blogspot.com/-BReSHtXTOkA/YXaRRKFW7dI/AAAAAAAAg0A/-FcBkMwyswIp0siWGVX3MNMANs7UkZFtwCPcBGAYYCw/s675/Screenshot%2B2021-10-25%2Bat%2B12.04.06.png\"
|
333
|
-
style=\"display: block; padding: 1em 0; text-align: center; \"><img alt=\"\"
|
334
|
-
border=\"0\" width=\"400\" data-original-height=\"256\" data-original-width=\"675\"
|
335
|
-
src=\"https://1.bp.blogspot.com/-BReSHtXTOkA/YXaRRKFW7dI/AAAAAAAAg0A/-FcBkMwyswIp0siWGVX3MNMANs7UkZFtwCPcBGAYYCw/s320/Screenshot%2B2021-10-25%2Bat%2B12.04.06.png\"/></a></div>\n\n<p>Anecdotally
|
336
|
-
I''ve also noticed that Plazi seems to do well on zoological data, especially
|
337
|
-
journals like <i>Zootaxa</i>, but it often struggles with botanical specimens.
|
338
|
-
Botanists tend to cite specimens rather differently to zoologists (botanists
|
339
|
-
emphasise collector numbers rather than specimen codes). Hence data quality
|
340
|
-
in Plazi is likely to taxonomic biased.</p>\n\n<p>Plazi is <a href=\"https://github.com/plazi/community/issues\">using
|
341
|
-
GitHub to track issues with treatments</a> so feedback on erroneous records
|
342
|
-
is possible, but this seems inadequate to the task. There are tens of thousands
|
343
|
-
of data sets, with more being released daily, and hundreds of thousands of
|
344
|
-
occurrences, and relying on GitHub issues devolves the responsibility for
|
345
|
-
error checking onto the data users. I don''t have a measure of how many records
|
346
|
-
in Plazi have problems, but because I suspect it is a significant fraction
|
347
|
-
because for any given day''s output I can typically find errors.</p>\n\n<h2>What
|
348
|
-
to do?</h2>\n\n<p>Faced with a process that generates noisy data there are
|
349
|
-
several of things we could do:</p>\n\n<ol>\n<li>Have tools to detect and flag
|
350
|
-
errors made in generating the data.</li>\n<li>Have the data generator give
|
351
|
-
estimates the confidence of its results.</li>\n<li>Improve the data generator.</li>\n</ol>\n\n<p>I
|
352
|
-
think a comparison with the problem of parsing bibliographic references might
|
353
|
-
be instructive here. There is a long history of people developing tools to
|
354
|
-
parse references (<a href=\"https://iphylo.blogspot.com/2021/05/finding-citations-of-specimens.html\">I''ve
|
355
|
-
even had a go</a>). State-of-the art tools such as <a href=\"https://anystyle.io\">AnyStyle</a>
|
356
|
-
feature machine learning, and are tested against <a href=\"https://github.com/inukshuk/anystyle/blob/master/res/parser/core.xml\">human
|
357
|
-
curated datasets</a> of tagged bibliographic records. This means we can evaluate
|
358
|
-
the performance of a method (how well does it retrieve the same results as
|
359
|
-
human experts?) and also improve the method by expanding the corpus of training
|
360
|
-
data. Some of these tools can provide a measures of how confident they are
|
361
|
-
when classifying a string as, say, a person''s name, which means we could
|
362
|
-
flag potential issues for anyone wanting to use that record.</p>\n\n<p>We
|
363
|
-
don''t have equivalent tools for parsing specimens in the literature, and
|
364
|
-
hence have no easy way to quantify how good existing methods are, nor do we
|
365
|
-
have a public corpus of material citations that we can use as training data.
|
366
|
-
I <a href=\"https://iphylo.blogspot.com/2021/05/finding-citations-of-specimens.html\">blogged
|
367
|
-
about this</a> a few months ago and was considering using Plazi as a source
|
368
|
-
of marked up specimen data to use for training. However based on what I''ve
|
369
|
-
looked at so far Plazi''s data would need to be carefully scrutinised before
|
370
|
-
it could be used as training data.</p>\n\n<p>Going forward, I think it would
|
371
|
-
be desirable to have a set of records that can be used to benchmark specimen
|
372
|
-
parsers, and ideally have the parsers themselves available as web services
|
373
|
-
so that anyone can evaluate them. Even better would be a way to contribute
|
374
|
-
to the training data so that these tools improve over time.</p>\n\n<p>Plazi''s
|
375
|
-
data extraction tools are mostly desktop-based, that is, you need to download
|
376
|
-
software to use their methods. However, there are experimental web services
|
377
|
-
available as well. I''ve created a simple wrapper around the material citation
|
378
|
-
parser, you can try it at <a href=\"https://plazi-tester.herokuapp.com/parser.php\">https://plazi-tester.herokuapp.com/parser.php</a>.
|
379
|
-
It takes a single material citation and returns a version with elements such
|
380
|
-
as specimen code and collector name tagged in different colours.</p>\n\n<h2>Summary</h2>\n\n<p>Text
|
381
|
-
mining the taxonomic literature is clearly a gold mine of data, but at the
|
382
|
-
same time it is potentially fraught as we try and extract structured data
|
383
|
-
from semi-structured text. Plazi has demonstrated that it is possible to extract
|
384
|
-
a lot of data from the literature, but at the same time the quality of that
|
385
|
-
data seems highly variable. Even minor issues in parsing text can have big
|
386
|
-
implications for data quality (e.g., marine organisms apparently living above
|
387
|
-
sea level). Historically in biodiversity informatics we have favoured data
|
388
|
-
quantity over data quality. Quantity has an obvious metric, and has milestones
|
389
|
-
we can celebrate (e.g., <a href=\"GBIF at 1 billion - what''s next?\">one
|
390
|
-
billion specimens</a>). There aren''t really any equivalent metrics for data
|
391
|
-
quality.</p>\n\n<p>Adding new types of data can sometimes initially result
|
392
|
-
in a new set of quality issues (e.g., <a href=\"https://iphylo.blogspot.com/2019/12/gbif-metagenomics-and-metacrap.html\">GBIF
|
393
|
-
metagenomics and metacrap</a>) that take time to resolve. In the case of Plazi,
|
394
|
-
I think it would be worthwhile to quantify just how many records have errors,
|
395
|
-
and develop benchmarks that we can use to test methods for extracting specimen
|
396
|
-
data from text. If we don''t do this then there will remain uncertainty as
|
397
|
-
to how much trust we can place in data mined from the taxonomic literature.</p>\n\n<h2>Update</h2>\n\nPlazi
|
398
|
-
has responded, see <a href=\"http://plazi.org/posts/2021/10/liberation-first-step-toward-quality/\">Liberating
|
399
|
-
material citations as a first step to more better data</a>. My reading of
|
400
|
-
their repsonse is that it essentially just reiterates Plazi''s approach and
|
401
|
-
doesn''t tackle the underlying issue: their method for extracting material
|
402
|
-
citations is error prone, and many of those errors end up in GBIF.","tags":["data
|
403
|
-
quality","parsing","Plazi","specimen","text mining"],"language":"en","references":[]},{"id":"https://doi.org/10.59350/j77nc-e8x98","uuid":"c6b101f4-bfbc-4d01-921d-805c43c85757","url":"https://iphylo.blogspot.com/2022/08/linking-taxonomic-names-to-literature.html","title":"Linking
|
404
|
-
taxonomic names to the literature","summary":"Just some thoughts as I work
|
405
|
-
through some datasets linking taxonomic names to the literature. In the diagram
|
406
|
-
above I''ve tried to capture the different situatios I encounter. Much of
|
407
|
-
the work I''ve done on this has focussed on case 1 in the diagram: I want
|
408
|
-
to link a taxonomic name to an identifier for the work in which that name
|
409
|
-
was published. In practise this means linking names to DOIs. This has the
|
410
|
-
advantage of linking to a citable indentifier, raising questions such as whether
|
411
|
-
citations...","date_published":"2022-08-22T17:19:00Z","date_modified":"2022-08-22T17:19:08Z","date_indexed":"1909-06-16T08:21:41+00:00","authors":[{"url":null,"name":"Roderic
|
412
|
-
Page"}],"image":null,"content_html":"Just some thoughts as I work through
|
413
|
-
some datasets linking taxonomic names to the literature.\n\n<div class=\"separator\"
|
414
|
-
style=\"clear: both;\"><a href=\"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5ZkbnumEWKZpf0isei_hUJucFlOOK08-SKnJknD8B4qhAX36u-vMQRnZdhRJK5rPb7HNYcnqB7qE4agbeStqyzMkWHrzUj14gPkz2ohmbOVg8P_nHo0hM94PD1wH3SPJsaLAumN8vih3ch9pjH2RaVWZBLwwhGhNu0FS1m5z6j5xt2NeZ4w/s2140/linking%20to%20names144.jpg\"
|
415
|
-
style=\"display: block; padding: 1em 0; text-align: center; \"><img alt=\"\"
|
416
|
-
border=\"0\" height=\"600\" data-original-height=\"2140\" data-original-width=\"1604\"
|
417
|
-
src=\"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5ZkbnumEWKZpf0isei_hUJucFlOOK08-SKnJknD8B4qhAX36u-vMQRnZdhRJK5rPb7HNYcnqB7qE4agbeStqyzMkWHrzUj14gPkz2ohmbOVg8P_nHo0hM94PD1wH3SPJsaLAumN8vih3ch9pjH2RaVWZBLwwhGhNu0FS1m5z6j5xt2NeZ4w/s600/linking%20to%20names144.jpg\"/></a></div>\n\n<p>In
|
418
|
-
the diagram above I''ve tried to capture the different situatios I encounter.
|
419
|
-
Much of the work I''ve done on this has focussed on case 1 in the diagram:
|
420
|
-
I want to link a taxonomic name to an identifier for the work in which that
|
421
|
-
name was published. In practise this means linking names to DOIs. This has
|
422
|
-
the advantage of linking to a citable indentifier, raising questions such
|
423
|
-
as whether citations of taxonmic papers by taxonomic databases could become
|
424
|
-
part of a <a href=\"https://iphylo.blogspot.com/2022/08/papers-citing-data-that-cite-papers.html\">taxonomist''s
|
425
|
-
Google Scholar profile</a>.</p>\n\n<p>In many taxonomic databases full work-level
|
426
|
-
citations are not the norm, instead taxonomists cite one or more pages within
|
427
|
-
a work that are relevant to a taxonomic name. These \"microcitations\" (what
|
428
|
-
the U.S. legal profession refer to as \"point citations\" or \"pincites\", see
|
429
|
-
<a href=\"https://rasmussen.libanswers.com/faq/283203\">What are pincites,
|
430
|
-
pinpoints, or jump legal references?</a>) require some work to map to the
|
431
|
-
work itself (which is typically the thing that has a citatble identifier such
|
432
|
-
as a DOI).</p>\n\n<p>Microcitations (case 2 in the diagram above) can be quite
|
433
|
-
complex. Some might simply mention a single page, but others might list a
|
434
|
-
series of (not necessarily contiguous) pages, as well as figures, plates etc.
|
435
|
-
Converting these to citable identifiers can be tricky, especially as in most
|
436
|
-
cases we don''t have page-level identifiers. The Biodiversity Heritage Library
|
437
|
-
(BHL) does have URLs for each scanned page, and we have a standard for referring
|
438
|
-
to pages in a PDF (<code>page=<pageNum></code>, see <a href=\"https://datatracker.ietf.org/doc/html/rfc8118\">RFC
|
439
|
-
8118</a>). But how do we refer to a set of pages? Do we pick the first page?
|
440
|
-
Do we try and represent a set of pages, and if so, how?</p>\n\n<p>Another
|
441
|
-
issue with page-level identifiers is that not everything on a given page may
|
442
|
-
be relevant to the taxonomic name. In case 2 above I''ve shaded in the parts
|
443
|
-
of the pages and figure that refer to the taxonomic name. An example where
|
444
|
-
this can be problematic is the recent test case I created for BHL where a
|
445
|
-
page image was included for the taxonomic name <a href=\"https://www.gbif.org/species/195763322\"><i>Aphrophora
|
446
|
-
impressa</i></a>. The image includes the species description and a illustration,
|
447
|
-
as well as text that relates to other species.</p>\n\n<div class=\"separator\"
|
448
|
-
style=\"clear: both;\"><a href=\"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1SAv1XiVBPHXeMLPfdsnTh4Sj4m0AQyqjXM0faXyvbNhBXFDl7SawjFIkMo3qz4RQhDEyZXkKAh5nF4gtfHbVXA6cK3NJ46CuIWECC6HmJKtTjoZ0M3QQXvLY_X-2-RecjBqEy68M0cEr0ph3l6KY51kA9BvGt9d4id314P71PBitpWMATg/s3467/https---www.biodiversitylibrary.org-pageimage-29138463.jpeg\"
|
449
|
-
style=\"display: block; padding: 1em 0; text-align: center; \"><img alt=\"\"
|
450
|
-
border=\"0\" height=\"400\" data-original-height=\"3467\" data-original-width=\"2106\"
|
451
|
-
src=\"https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg1SAv1XiVBPHXeMLPfdsnTh4Sj4m0AQyqjXM0faXyvbNhBXFDl7SawjFIkMo3qz4RQhDEyZXkKAh5nF4gtfHbVXA6cK3NJ46CuIWECC6HmJKtTjoZ0M3QQXvLY_X-2-RecjBqEy68M0cEr0ph3l6KY51kA9BvGt9d4id314P71PBitpWMATg/s400/https---www.biodiversitylibrary.org-pageimage-29138463.jpeg\"/></a></div>\n\n<p>Given
|
452
|
-
that not everything on a page need be relevant, we could extract just the
|
453
|
-
relevant blocks of text and illustrations (e.g., paragraphs of text, panels
|
454
|
-
within a figure, etc.) and treat that set of elements as the thing to cite.
|
455
|
-
This is, of course, what <a href=\"http://plazi.org\">Plazi</a> are doing.
|
456
|
-
The set of extracted blocks is glued together as a \"treatment\", assigned
|
457
|
-
an identifier (often a DOI), and treated as a citable unit. It would be interesting
|
458
|
-
to see to what extent these treatments are actually cited, for example, do
|
459
|
-
subsequent revisions that cite work that include treatments cite those treatments,
|
460
|
-
or just the work itself? Put another way, are we creating <a href=\"https://iphylo.blogspot.com/2012/09/decoding-nature-encode-ipad-app-omg-it.html\">\"threads\"</a>
|
461
|
-
between taxonomic revisions?</p>\n\n<p>One reason for these notes is that
|
462
|
-
I''m exploring uploading taxonomic name - literature links to <a href=\"https://www.checklistbank.org\">ChecklistBank</a>
|
463
|
-
and case 1 above is easy, as is case 3 (if we have treatment-level identifiers).
|
464
|
-
But case 2 is problematic because we are linking to a set of things that may
|
465
|
-
not have an identifier, which means a decision has to be made about which
|
466
|
-
page to link to, and how to refer to that page.</p>","tags":[],"language":"en","references":[]},{"id":"https://doi.org/10.59350/ymc6x-rx659","uuid":"0807f515-f31d-4e2c-9e6f-78c3a9668b9d","url":"https://iphylo.blogspot.com/2022/09/dna-barcoding-as-intergenerational.html","title":"DNA
|
53
|
+
7.00","category":"Natural Sciences","backlog":true,"prefix":"10.59350","items":[{"id":"https://doi.org/10.59350/ymc6x-rx659","uuid":"0807f515-f31d-4e2c-9e6f-78c3a9668b9d","url":"https://iphylo.blogspot.com/2022/09/dna-barcoding-as-intergenerational.html","title":"DNA
|
467
54
|
barcoding as intergenerational transfer of taxonomic knowledge","summary":"I
|
468
55
|
tweeted about this but want to bookmark it for later as well. The paper “A
|
469
56
|
molecular-based identification resource for the arthropods of Finland” doi:10.1111/1755-0998.13510
|
@@ -993,5 +580,5 @@ http_interactions:
|
|
993
580
|
simpler overlay on top of SPARQL so that we can retrieve the data we want
|
994
581
|
without having to learn the intricacies of SPARQL, nor how Wikidata models
|
995
582
|
publications and people.</p>","tags":["GraphQL","SPARQL","WikiCite","Wikidata"],"language":"en","references":[]}]}'
|
996
|
-
recorded_at: Sun, 18 Jun 2023
|
583
|
+
recorded_at: Sun, 18 Jun 2023 15:24:19 GMT
|
997
584
|
recorded_with: VCR 6.1.0
|